dungba88 commented on code in PR #12738: URL: https://github.com/apache/lucene/pull/12738#discussion_r1378285803
##########
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##########
@@ -269,36 +283,58 @@ private boolean nodesEqual(FSTCompiler.UnCompiledNode<T> node, long address) thr
     return false;
   }
 
+  record OffsetAndLength(long offset, int length) {}
+
   /** Inner class because it needs access to hash function and FST bytes. */
-  private class PagedGrowableHash {
+  class PagedGrowableHash {
     private PagedGrowableWriter entries;
-    private long count;
+    // nocommit: use PagedGrowableWriter? there was some size overflow issue with
+    // PagedGrowableWriter

Review Comment:
   It turns out the overflow comes from how `copiedOffsets` is indexed. `copiedOffsets` is a mapping from the global node address to the local node address. The key (the global address) can grow larger than the node hash, which is masked by the `mask` attribute before it is used. The key therefore needs to be masked and rehashed separately. I did neither, which caused the size to overflow.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
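
The masking and rehashing described in the review comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `AddressMap` and all of its members are hypothetical names, the table is backed by plain `long[]` arrays rather than `PagedGrowableWriter`, and addresses are assumed to be greater than 0 so that 0 can mark an empty slot.

```java
/**
 * Hypothetical open-addressing map from a global node address to a local one.
 * Assumption: addresses are > 0, so 0 can be used to mark an empty slot.
 */
final class AddressMap {
  private long[] keys;   // global addresses; 0 means "empty slot"
  private long[] values; // local addresses
  private long mask;     // capacity - 1; capacity is always a power of two
  private int count;

  AddressMap(int capacity) {
    if (capacity < 2 || Integer.bitCount(capacity) != 1) {
      throw new IllegalArgumentException("capacity must be a power of two >= 2");
    }
    keys = new long[capacity];
    values = new long[capacity];
    mask = capacity - 1;
  }

  /** Masks the key into the table range; an unmasked key would index past the end. */
  private int slot(long globalAddress) {
    return (int) (Long.hashCode(globalAddress) & mask);
  }

  void put(long globalAddress, long localAddress) {
    if (globalAddress <= 0) {
      throw new IllegalArgumentException("address must be > 0");
    }
    int pos = slot(globalAddress);
    // Linear probing; the mask wraps the probe position around the table.
    while (keys[pos] != 0 && keys[pos] != globalAddress) {
      pos = (int) ((pos + 1) & mask);
    }
    if (keys[pos] == 0) {
      count++;
    }
    keys[pos] = globalAddress;
    values[pos] = localAddress;
    if (count > keys.length / 2) {
      rehash();
    }
  }

  /** Returns the local address, or -1 if the global address is not present. */
  long get(long globalAddress) {
    int pos = slot(globalAddress);
    while (keys[pos] != 0) {
      if (keys[pos] == globalAddress) {
        return values[pos];
      }
      pos = (int) ((pos + 1) & mask);
    }
    return -1;
  }

  /** Doubles the table and re-inserts every key using the new mask. */
  private void rehash() {
    long[] oldKeys = keys;
    long[] oldValues = values;
    keys = new long[oldKeys.length * 2];
    values = new long[oldValues.length * 2];
    mask = keys.length - 1;
    for (int i = 0; i < oldKeys.length; i++) {
      if (oldKeys[i] != 0) {
        int pos = slot(oldKeys[i]);
        while (keys[pos] != 0) {
          pos = (int) ((pos + 1) & mask);
        }
        keys[pos] = oldKeys[i];
        values[pos] = oldValues[i];
      }
    }
  }

  int size() {
    return count;
  }

  int capacity() {
    return keys.length;
  }
}
```

The point of the sketch is that the key goes through `slot()` on every access and is re-slotted in `rehash()` whenever the mask changes. Using the raw global address as the index would skip both steps, which is the mistake the comment describes.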