punAhuja opened a new issue, #15296:
URL: https://github.com/apache/lucene/issues/15296

   ### Description
   
   The `perThreadHardLimitMB` parameter of 
`IndexWriterConfig#setRAMPerThreadHardLimitMB(int)` must be less than 2 GB 
(2048 MB), which means a single indexing thread cannot buffer, and therefore 
flush, a segment larger than 2 GB.
   See: 
https://lucene.apache.org/core/9_9_0/core/org/apache/lucene/index/IndexWriterConfig.html#setRAMPerThreadHardLimitMB(int)
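
   For context, here is a minimal sketch of where the 2 GB figure comes from. 
The linked Javadoc describes the cap as a safety limit tied to signed 32-bit 
memory addressing inside `DocumentsWriterPerThread`. The class and method names 
below are hypothetical and are not Lucene code; `checkHardLimit` only mirrors 
the documented "less than 2048 MB" rule, and the "greater than 0" part is my 
assumption.

```java
public class PerThreadLimitSketch {
    // Documented ceiling: the configured value must be less than 2048 MB.
    static final int MAX_HARD_LIMIT_MB = 2048;

    /** Converts megabytes to bytes with 64-bit arithmetic so the result cannot overflow. */
    static long toBytes(int megabytes) {
        return megabytes * 1024L * 1024L;
    }

    /** True if the byte count of a buffer this large still fits in a signed 32-bit int. */
    static boolean byteCountFitsInInt(int megabytes) {
        return toBytes(megabytes) <= Integer.MAX_VALUE;
    }

    /** Hypothetical validation mirroring the documented rule (not Lucene's implementation). */
    static void checkHardLimit(int megabytes) {
        if (megabytes <= 0 || megabytes >= MAX_HARD_LIMIT_MB) {
            throw new IllegalArgumentException(
                "hard limit must be greater than 0 and less than " + MAX_HARD_LIMIT_MB + " MB");
        }
    }

    public static void main(String[] args) {
        // 2047 MB is the largest value that passes the check; 2048 MB does not fit in an int.
        for (int mb : new int[] {2047, 2048, 4096}) {
            System.out.println(mb + " MB = " + toBytes(mb) + " bytes, fits in int: "
                + byteCountFitsInInt(mb));
        }
    }
}
```

   Raising the limit would therefore likely require more than relaxing the 
argument check, since anything that addresses the per-thread buffer with an 
`int` would also need to be revisited.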
   
   This issue proposes allowing this parameter to be set above the 2 GB limit 
so that each thread can write a larger segment.
   
   When indexing high-dimensional vector data, each segment has its own HNSW 
graph, so more segments mean more graphs to search per shard and more graph 
rebuild work during merges. With this change, a single indexing thread can 
flush fewer, larger segments, which is generally more resource-efficient 
for vector-heavy workloads.
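
   As a rough illustration of the "fewer, larger segments" point, the sketch 
below counts how many segments one thread would flush for a given amount of 
buffered data. It assumes a flush happens only when the per-thread hard limit 
is reached and ignores `ramBufferSizeMB`, other flush triggers, and the 
difference between in-memory and on-disk size. The 16 GB workload and the 
limits above 2047 MB are hypothetical.

```java
public class FlushCountSketch {
    /**
     * Ceiling division: number of flushes needed to drain totalMB of buffered
     * data when each flushed segment holds at most limitMB.
     */
    static long segmentsFlushed(long totalMB, long limitMB) {
        if (totalMB < 0 || limitMB <= 0) {
            throw new IllegalArgumentException("totalMB must be >= 0 and limitMB must be > 0");
        }
        return (totalMB + limitMB - 1) / limitMB;
    }

    public static void main(String[] args) {
        long totalMB = 16L * 1024; // hypothetical: 16 GB of vector data indexed by one thread
        // 2047 MB is the largest value accepted today; the larger limits are hypothetical.
        for (long limitMB : new long[] {2047, 4096, 8192}) {
            System.out.println("limit " + limitMB + " MB -> "
                + segmentsFlushed(totalMB, limitMB) + " segments");
        }
    }
}
```

   Under these assumptions, every flushed segment is one more HNSW graph to 
search until merges combine it with others, so a higher limit reduces both the 
per-query graph count and the merge work.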


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

