weizijun opened a new issue, #14554: URL: https://github.com/apache/lucene/issues/14554
### Description

When there are many shards to merge, merging vector data can easily lead to memory overflow and high CPU cost. The `index.merge.scheduler.max_thread_count` setting cannot actually cap the number of merge threads: when the number of merge threads exceeds `max_thread_count`, it only pauses `writeBytes` calls through `MergeRateLimiter`. The `OnHeapHnswGraph` is still built during the pause phase, however, and the accumulated graphs take up so much memory that the Java heap is not enough. This problem is easy to trigger on a data node with a 32 GB heap holding 2-3 TB of vector documents (with BBQ, a node can hold that much data).

The PR https://github.com/apache/lucene/pull/14527 can reduce the heap footprint, but it does not solve the problem completely. Is there any solution to this problem?
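To make the failure mode concrete, here is a minimal toy sketch (plain `java.util.concurrent`, not Lucene's actual API; the class and method names are hypothetical). It models the difference between the two policies the issue contrasts: rate-limiting a merge's writes pauses it only *after* its in-heap graph exists, whereas gating merge *admission* with a semaphore blocks before the heap-hungry work starts, bounding the number of graphs alive at once.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model (NOT Lucene code): each "merge" stands in for a vector merge
// whose graph build is heap-hungry. Admission is capped by a semaphore,
// so the peak number of concurrent graph builds never exceeds the cap.
public class MergeAdmissionSketch {

    /** Runs `tasks` simulated merges with at most `cap` admitted at once; returns the peak concurrency observed. */
    static int runSimulation(int tasks, int cap) throws InterruptedException {
        Semaphore admission = new Semaphore(cap);
        AtomicInteger live = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                try {
                    admission.acquire();          // block BEFORE allocating the "graph"
                    try {
                        int now = live.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(50);         // stand-in for graph construction
                        live.decrementAndGet();
                    } finally {
                        admission.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // With 8 pending merges but a cap of 2, at most 2 graphs are ever live.
        System.out.println("peak concurrent merges = " + runSimulation(8, 2));
    }
}
```

A write-side rate limiter in this model would sit inside the `Thread.sleep(50)` region, after the allocation already happened, which is why it cannot bound heap usage the way admission capping does.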