punAhuja commented on issue #15296: URL: https://github.com/apache/lucene/issues/15296#issuecomment-3407027871
> [#15152](https://github.com/apache/lucene/issues/15152) - For reference, ByteBlockPool can hold only up to 2 GB; will that not be a problem if the hard limit exceeds 2 GB?

The ByteBlockPool is only relevant to the postings layer. I ran an experiment that indexed many unique tokens on a single thread, and it caused an `ArithmeticException`. That is an unrealistic scenario, though, especially for our vector-heavy workload. Vectors don't use the ByteBlockPool. Only postings data (from text fields) uses it, and in vector-heavy payloads (such as high-dimensional vectors) most of the in-memory data comes from vector graphs rather than text postings, so ByteBlockPool usage stays minimal.

Maybe we can add a note saying that raising the per-thread RAM limit is beneficial for vector-heavy data but could be unsafe for extremely text-heavy payloads.
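To make the 2 GB ceiling concrete, here is a minimal sketch of how a pool that tracks its position in a signed 32-bit offset runs out of addressable space. It is not Lucene's actual code: the class name, the 32 KB block size, and the starting offset are assumptions for illustration, and no memory is allocated. Only `Math.addExact` is a real JDK call; it throws `ArithmeticException` on `int` overflow, the same exception type seen in the experiment.

```java
// Hypothetical sketch, not Lucene's ByteBlockPool: only the offset arithmetic is modeled.
final class BlockOffsetOverflowSketch {
    // Assumed block size of 32 KB (1 << 15); an illustrative value.
    static final int BLOCK_SIZE = 1 << 15;

    /** Counts how many blocks can be addressed before the int offset overflows. */
    static long blocksBeforeOverflow() {
        int offset = -BLOCK_SIZE; // assumed start, so the first block lands at offset 0
        long blocks = 0;
        try {
            while (true) {
                // Math.addExact throws ArithmeticException instead of wrapping around.
                offset = Math.addExact(offset, BLOCK_SIZE);
                blocks++;
            }
        } catch (ArithmeticException e) {
            return blocks;
        }
    }

    public static void main(String[] args) {
        long blocks = blocksBeforeOverflow();
        System.out.println(blocks + " blocks, " + (blocks * BLOCK_SIZE) + " bytes addressable");
    }
}
```

Under these assumptions the loop stops after 65,536 blocks, which is exactly 2^31 bytes (2 GB). A single thread would have to buffer that much postings data before hitting the exception, which is why it took an artificial flood of unique tokens to trigger it.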
