zhaih commented on PR #12844: URL: https://github.com/apache/lucene/pull/12844#issuecomment-1832571805

@benwtrent Thanks for running the benchmark. I looked at the profile, and I think we call `ramBytesUsed` after every document is indexed to control the flush [here](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java#L206).

@stefanvodita I can think of two options here:
1. Just use `maxSize` as before for the estimation. It is not very accurate, but at least it is a good upper bound.
2. Carefully account for the extra memory used when we add new nodes to the graph, and accumulate it in an `AtomicLong`. This includes the cost of the node itself as well as the cost of resizing the existing neighbors' `NeighborArray`s. I think this is doable (in both the single-threaded and multi-threaded situations) but a bit tricky.
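Option 2 could be sketched roughly as follows. This is a minimal illustration only, not Lucene's actual API: the class name `GraphRamAccounting`, its methods, and the per-entry cost model (an `int` neighbor id plus a `float` score per slot) are all hypothetical assumptions, and real accounting would also need object headers and array overhead (e.g. via `RamUsageEstimator`). The point is that an `AtomicLong` lets concurrent indexing threads publish incremental deltas cheaply, so the frequent `ramBytesUsed` calls from flush control become a single atomic read.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of option 2: thread-safe incremental RAM accounting
// for a growing HNSW-style graph. Costs are simplified for illustration.
class GraphRamAccounting {
    private final AtomicLong ramBytesUsed = new AtomicLong();

    // Called when a new node is added to the graph: count the node id
    // plus its initially allocated neighbor slots (id + score per slot).
    void onNodeAdded(int maxConn) {
        long nodeCost = Integer.BYTES
                + (long) maxConn * (Integer.BYTES + Float.BYTES);
        ramBytesUsed.addAndGet(nodeCost);
    }

    // Called when an existing neighbor's NeighborArray grows: only the
    // delta between the old and new capacities is accumulated.
    void onNeighborArrayResized(int oldCapacity, int newCapacity) {
        long delta = (long) (newCapacity - oldCapacity)
                * (Integer.BYTES + Float.BYTES);
        ramBytesUsed.addAndGet(delta);
    }

    // Cheap O(1) read, safe to call after every indexed document.
    long ramBytesUsed() {
        return ramBytesUsed.get();
    }
}
```

The tricky part the comment alludes to is making sure every mutation path (node insertion, diversity-based pruning, concurrent resizes) reports its delta exactly once; double-counting or missing a resize would make the estimate drift over a long indexing session.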
@benwtrent Thanks for running the benchmark. I looked at the profile and I think we call `ramBytesUsed` after every document is indexed to control the flush [here](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java#L206). @stefanvodita I can think of two options here: 1. just use maxSize as before for estimation, it's not too accurate but at least is a good upper bound 2. Carefully account the extra memory used when we add new nodes to the graph and accumulate it by an `AtomicLong`, including the cost for the node itself as well as the cost of resizing the existing neighbor's NeighborArray. Which I think is doable (in both single thread and multi thread situation) but a bit tricky. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org