herley-shaori commented on PR #15824:
URL: https://github.com/apache/lucene/pull/15824#issuecomment-4067886406
Thanks for the thorough feedback; you're right on all counts, and I
apologize for the incorrect assumptions.
Corrections:
- FST in Lucene 9.x has been off-heap (mmap) since LUCENE-9257; it was
not heap-resident as I stated. The claim that this "restores lucene90
FST behavior" was wrong.
- The ByteBuffersDataInput approach performs the same bounds checks, so
it doesn't actually solve the underlying problem; it just shifts where
the bounds check happens.
What I take away from the discussion:
1. The real question is why HotSpot fails to elide the Panama
checkValidStateRaw() bounds check specifically in the TrieReader code
path, when it does optimize it away for other MemorySegmentIndexInput
usages.
2. As @gf2121 suggested, testing TrieReader with ByteBuffer-based mmap
(instead of Panama) would help isolate whether Panama is the root cause.
3. As @rmuir pointed out, the per-byte trie traversal (up to 36
lookupChild() calls for a UUID string) is itself excessive and should be
addressed independently.
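
To make point 3 concrete, here is a minimal sketch of where the figure
of 36 comes from. It is a toy in-memory trie, not Lucene's TrieReader:
the class PerByteTrieSketch, its Node type, and this lookupChild() are
hypothetical stand-ins written only for illustration. A canonical UUID
string is 36 ASCII characters, so an exact-match seek that descends one
byte at a time performs one child lookup per character.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy trie for illustration only; this is NOT Lucene's TrieReader. */
public class PerByteTrieSketch {

    /** One trie node: children keyed by the next byte of the term. */
    static final class Node {
        final Map<Byte, Node> children = new HashMap<>();
        boolean isTerm;
    }

    private final Node root = new Node();
    private int lookupCalls; // number of child lookups performed by seek()

    /** Adds a term, creating one node per byte as needed. */
    public void insert(byte[] term) {
        Node node = root;
        for (byte b : term) {
            node = node.children.computeIfAbsent(b, k -> new Node());
        }
        node.isTerm = true;
    }

    /** Hypothetical stand-in for a per-byte child lookup. */
    private Node lookupChild(Node parent, byte label) {
        lookupCalls++;
        return parent.children.get(label);
    }

    /** Walks the trie one byte at a time; true if the exact term exists. */
    public boolean seek(byte[] term) {
        Node node = root;
        for (byte b : term) {
            node = lookupChild(node, b);
            if (node == null) {
                return false;
            }
        }
        return node.isTerm;
    }

    public int lookupCalls() {
        return lookupCalls;
    }

    public static void main(String[] args) {
        PerByteTrieSketch trie = new PerByteTrieSketch();
        // A fixed UUID in canonical form: 32 hex digits plus 4 hyphens.
        byte[] term = "123e4567-e89b-12d3-a456-426614174000"
                .getBytes(StandardCharsets.US_ASCII);
        trie.insert(term);
        boolean found = trie.seek(term);
        System.out.println("found=" + found + ", lookups=" + trie.lookupCalls());
        // prints "found=true, lookups=36"
    }
}
```

In the real reader each of those lookups would also read from the
underlying input, which is why the per-lookup overhead discussed in
point 1 gets multiplied by the term length.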
I'll close this PR. If I can help investigate the JIT optimization angle
or the trie traversal efficiency, I'm happy to do so.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.