mikemccand commented on issue #12542:
URL: https://github.com/apache/lucene/issues/12542#issuecomment-1712472912

   > We seem to create a PagedGrowableWriter with [page size 128 MB 
here](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java#L34),
 meaning even when building a small FST, we are allocating at least 128 MB 
pages?
   
   OK this was really freaking me out overnight (allocating a 128 MB array even 
for building the tiniest of FSTs), so I dug deeper, and it is a false alarm!
   
   It turns out that 
[`PagedGrowableWriter`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/packed/PagedGrowableWriter.java),
 via its [parent class `AbstractPagedMutable`, allocates a "just big 
enough" final 
page](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/packed/AbstractPagedMutable.java#L57)
 instead of a full 128 MB page.  It reallocates whenever the 
`NodeHash` resizes to a larger array.  There is also some sneaky power-of-2 mod 
trickery ensuring that the final page, no matter how many times we rehash, is 
always sized to exactly a power of 2, plus a [real `if` statement to enforce 
it](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/packed/PackedInts.java#L867-L869).
  Phew!
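
To make the sizing argument concrete, here is a minimal standalone sketch of the arithmetic, not the actual Lucene code. The class and method names (`PageSizingSketch`, `numPages`, `lastPageSize`) are made up for illustration, and it assumes the page size is `1 << 27` entries and that the hash table starts at 16 entries and doubles on every rehash.

```java
public class PageSizingSketch {

    // Number of pages needed to hold `size` entries (ceiling division).
    static int numPages(long size, int pageSize) {
        return (int) ((size + pageSize - 1) / pageSize);
    }

    // Entries to allocate for the final page. Assumes pageSize is a power
    // of 2, so the remainder can be taken with a mask instead of `%`.
    static int lastPageSize(long size, int pageSize) {
        int rem = (int) (size & (pageSize - 1));
        // A remainder of 0 means the last page is completely full.
        return rem == 0 ? pageSize : rem;
    }

    static boolean isPowerOfTwo(long v) {
        return v > 0 && (v & (v - 1)) == 0;
    }

    public static void main(String[] args) {
        int pageSize = 1 << 27; // assumed page size, in entries

        // Simulate repeated rehashing: the table doubles each time.
        for (long size = 16; size <= (1L << 30); size <<= 1) {
            int last = lastPageSize(size, pageSize);
            if (!isPowerOfTwo(last)) {
                throw new AssertionError("last page not a power of 2: " + last);
            }
            System.out.println(size + " entries -> "
                + numPages(size, pageSize) + " page(s), last page holds " + last);
        }
    }
}
```

Under those assumptions, a table smaller than one page gets a single page of exactly its own size, and once the table outgrows a page its size is a multiple of the page size, so the final page is a full page. Either way the final page is a power of 2.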
   
   I'll open a separate tiny PR to address the wrong `bitsRequired` during 
rehash -- that's just a smallish performance bug when building biggish FSTs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


