punAhuja opened a new issue, #15296: URL: https://github.com/apache/lucene/issues/15296
### Description

The parameter `perThreadHardLimitMB` must be below 2 GB (2048 MB), so a single indexing thread cannot buffer, and therefore cannot flush, a segment larger than about 2 GB. See: https://lucene.apache.org/core/9_9_0/core/org/apache/lucene/index/IndexWriterConfig.html#setRAMPerThreadHardLimitMB(int)

This issue proposes allowing this parameter to be set above the 2 GB limit, so that each thread can write a larger segment.

When indexing high-dimensional vector data, each segment has its own HNSW graph, so more segments mean more graphs to search per shard and more graph-rebuild work during merges. With this change, a single indexing thread could flush fewer, larger segments, which is generally more resource-efficient for vector-heavy workloads.
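To make the limit discussed above concrete, here is a minimal, self-contained sketch in plain Java. It does not call Lucene. The class `PerThreadLimitSketch` and its methods are hypothetical names introduced for illustration, and `isAccepted` only mirrors the bound documented for `setRAMPerThreadHardLimitMB(int)` (positive and below 2048 MB). The `fitsInInt` check reflects one plausible reason for a bound at this value, which the issue itself does not state: 2048 MB is the first size whose byte count no longer fits in a signed 32-bit `int`. The `estimatedFlushes` helper is a rough back-of-the-envelope estimate that assumes the per-thread hard limit is the only flush trigger, which is a simplification.

```java
// Hypothetical sketch, not Lucene code: illustrates the documented bound on
// the per-thread hard limit and a rough segment-count estimate.
class PerThreadLimitSketch {
    // Assumption: mirrors the documented rule that the value must be < 2048 MB.
    static final int MAX_EXCLUSIVE_MB = 2048;

    // True if a value would pass the documented range check (0 < mb < 2048).
    static boolean isAccepted(int mb) {
        return mb > 0 && mb < MAX_EXCLUSIVE_MB;
    }

    // Convert megabytes to bytes using long arithmetic to avoid overflow.
    static long toBytes(int mb) {
        return mb * 1024L * 1024L;
    }

    // True if the byte count is representable as a signed 32-bit int.
    static boolean fitsInInt(int mb) {
        return toBytes(mb) <= Integer.MAX_VALUE;
    }

    // Rough estimate of how many segments one thread flushes for a given
    // amount of buffered data, assuming only the hard limit triggers flushes.
    static long estimatedFlushes(long totalMb, int limitMb) {
        return (totalMb + limitMb - 1) / limitMb; // ceiling division
    }

    public static void main(String[] args) {
        for (int mb : new int[] {1024, 2047, 2048, 4096}) {
            System.out.printf("%d MB = %d bytes, accepted=%b, fitsInInt=%b%n",
                    mb, toBytes(mb), isAccepted(mb), fitsInInt(mb));
        }
        long totalMb = 100L * 1024; // hypothetical 100 GB of buffered data
        for (int limitMb : new int[] {2047, 8192}) {
            System.out.printf("limit=%d MB -> about %d flushed segments%n",
                    limitMb, estimatedFlushes(totalMb, limitMb));
        }
    }
}
```

Under these assumptions, raising the limit reduces the number of flushed segments roughly in proportion, which is the effect the proposal is after: fewer HNSW graphs to search and to rebuild during merges.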
