Re: [I] Make the perThreadHardLimitMB to be configurable above 2GB [lucene]

via GitHub Wed, 15 Oct 2025 08:28:03 -0700


punAhuja commented on issue #15296:
URL: https://github.com/apache/lucene/issues/15296#issuecomment-3407027871


   > [#15152](https://github.com/apache/lucene/issues/15152) - For reference, 
the ByteBlockPool can hold only up to 2 GB. Will that not be a problem if the 
hard limit exceeds 2 GB?
   
   The ByteBlockPool is only relevant to the postings layer. I ran an 
experiment in which I generated many unique tokens and indexed them on a single 
thread, which caused an `ArithmeticException`. But that is an unrealistic 
scenario, especially for our vector-heavy workload. Vectors don't use the 
ByteBlockPool. Postings data (from text fields) does use the ByteBlockPool, but 
it stays small.
   In vector-heavy payloads (such as high-dimensional vectors), most of the 
in-memory data comes from vector graphs, not text postings, so the 
ByteBlockPool usage remains minimal.
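
   For reference, here is a minimal standalone sketch of how a pool that 
tracks its current block offset in an `int` runs into an `ArithmeticException` 
at 2 GB. It does not use Lucene classes. The class name, the method name, and 
the 32 KB block size are assumptions for illustration, as is the premise that 
the offset is advanced with `Math.addExact`.

```java
public class BlockPoolOverflowSketch {

    // Assumed block size for illustration: 32 KB.
    static final int BLOCK_SIZE = 1 << 15;

    /**
     * Counts how many blocks can be allocated before an int offset overflows.
     * The offset starts one block below zero so the first block begins at 0.
     */
    static int blocksBeforeOverflow(int blockSize) {
        int byteOffset = -blockSize;
        int blocks = 0;
        try {
            while (true) {
                // addExact throws ArithmeticException instead of wrapping around.
                byteOffset = Math.addExact(byteOffset, blockSize);
                blocks++;
            }
        } catch (ArithmeticException e) {
            return blocks;
        }
    }

    public static void main(String[] args) {
        int blocks = blocksBeforeOverflow(BLOCK_SIZE);
        long totalBytes = (long) blocks * BLOCK_SIZE;
        System.out.println(blocks + " blocks, " + totalBytes + " bytes");
    }
}
```

   Under these assumptions the offset overflows once the pool would need to 
address more than 2 GB, which only happens when a single thread buffers that 
much postings data.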
   
   Maybe we can add a note saying that raising the per-thread RAM limit is 
beneficial for vector-heavy data but could be unsafe for extremely text-heavy 
payloads.



