Re: [PR] Defer Sorter.DocMap packing until after flush [lucene]

via GitHub Mon, 11 May 2026 15:06:29 -0700


Tim-Brooks commented on PR #16048:
URL: https://github.com/apache/lucene/pull/16048#issuecomment-4425551183


   Reading packed values during a flush is a major performance hit, 
particularly when flushing points.
   
   <img width="1664" height="619" alt="image" 
src="https://github.com/user-attachments/assets/203cc443-9919-4e34-908a-8b5dd20d2b4f"
 />
   
   Keeping the unpacked version for the life of the flush (which is still short) 
resolves this, and Lucene has already allocated the memory. The memory is still 
bounded (directly) by `maxBufferedDocs` and `ramBufferSizeMB`. I assumed it was 
unnecessary to add switches or configuration options to tune the proposed 
behavior, but I can add them if desired.
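
   To illustrate the idea, here is a minimal, self-contained sketch. It is not the PR's code and does not use Lucene's `Sorter.DocMap` or its packed structures: `DeferredDocMap`, `seal()`, and the simple bit-packing layout are hypothetical names and choices made for illustration. It assumes the map is a permutation of `0..size-1`, serves lookups from the plain `int[]` while the flush runs, and only packs once the segment is sealed.

```java
/** Hypothetical sketch (not Lucene's Sorter.DocMap): plain int[] during flush, packed after seal. */
final class DeferredDocMap {
    private int[] unpacked;          // fast random access while the flush is running
    private long[] packed;           // compact form kept once the segment is sealed
    private final int size;
    private final int bitsPerValue;  // just enough bits for the largest doc ID (size - 1)

    /** Assumes oldToNew is a permutation of 0..length-1 (all values non-negative). */
    DeferredDocMap(int[] oldToNew) {
        this.unpacked = oldToNew;
        this.size = oldToNew.length;
        this.bitsPerValue =
            Math.max(1, 32 - Integer.numberOfLeadingZeros(Math.max(1, size - 1)));
    }

    /** Direct array read during the flush; packed read after seal(). */
    int oldToNew(int docID) {
        return unpacked != null ? unpacked[docID] : readPacked(docID);
    }

    boolean isPacked() {
        return unpacked == null;
    }

    /** Called once the segment is sealed: pack the values and release the int[]. */
    void seal() {
        if (unpacked == null) {
            return; // already packed
        }
        long totalBits = (long) size * bitsPerValue;
        long[] out = new long[(int) ((totalBits + 63) >>> 6)];
        for (int i = 0; i < size; i++) {
            long bitPos = (long) i * bitsPerValue;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = unpacked[i];
            out[word] |= v << shift;
            if (shift + bitsPerValue > 64) {
                // Value straddles two words: store the high bits in the next word.
                out[word + 1] |= v >>> (64 - shift);
            }
        }
        packed = out;
        unpacked = null;
    }

    private int readPacked(int docID) {
        long bitPos = (long) docID * bitsPerValue;
        int word = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long v = packed[word] >>> shift;
        if (shift + bitsPerValue > 64) {
            v |= packed[word + 1] << (64 - shift);
        }
        return (int) (v & ((1L << bitsPerValue) - 1));
    }
}
```

   In this sketch every lookup during the flush is a single array read, and the shift-and-mask cost is only paid after `seal()`, when lookups are presumably much less frequent.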
   
   For context, a segment with 128K documents will take ~512KB for this int[]. 
Lucene already creates this array for the initial sorter; it just releases it within 
the same method scope. I am proposing that we keep it around for the lifetime of the 
flush and then convert it to a packed version once we have a sealed segment.
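
   The ~512KB figure follows from 4 bytes per `int` entry. A small sketch of the arithmetic is below; `DocMapMemory` and its methods are hypothetical helpers, array headers are ignored, and the packed estimate assumes a naive fixed-width layout with just enough bits per doc ID, which is only a rough stand-in for whatever packed encoding Lucene actually uses.

```java
/** Hypothetical helper for back-of-the-envelope doc map sizes (object headers ignored). */
final class DocMapMemory {
    /** Bytes for a plain int[] with one entry per document. */
    static long unpackedBytes(int numDocs) {
        return (long) numDocs * Integer.BYTES;
    }

    /** Rough bytes for a fixed-width bit-packed form; an assumption, not Lucene's real encoding. */
    static long packedBytes(int numDocs) {
        int bits = Math.max(1, 32 - Integer.numberOfLeadingZeros(Math.max(1, numDocs - 1)));
        return ((long) numDocs * bits + 7) / 8;
    }

    public static void main(String[] args) {
        int numDocs = 128 * 1024;
        System.out.println(unpackedBytes(numDocs) / 1024 + " KB unpacked"); // 512 KB unpacked
        System.out.println(packedBytes(numDocs) / 1024 + " KB packed (estimate)");
    }
}
```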


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


