Jackyrie2 opened a new pull request, #12480: URL: https://github.com/apache/lucene/pull/12480
### Description

This is an update to the previous PR. While benchmarking potential improvements to `HNSWGraphBuilder.initializeFromGraph`, a few issues were found:

* The ordinal of `newNode` was used as the index into the `scoringContext` map, instead of the size.
* The entire `scoringContext` was evaluated during the call to `sortInternal`, instead of only checking whether the new index being sorted has a pre-computed score.

Two new unit tests were added to demonstrate the bugs, and this PR fixes both issues.

### Benchmarking Result

To measure any meaningful latency improvement, we first have to create a big index and other small indexes. Once we force an index merge, the index writer invokes `HNSWGraphBuilder.initializeFromGraph`. [KnnGraphTester](https://github.com/mikemccand/luceneutil/blob/master/src/main/KnnGraphTester.java#L690) was modified as follows:

1. First add 90% of the documents.
2. `iw.commit()`
3. `forceMerge` into 1 segment.
4. Set the merge policy to `NoMergePolicy` and add the rest of the documents.
5. Set the merge policy to `LogDocMergePolicy`.
6. `forceMerge` into 1 segment again. This is the step captured in the benchmark, and I have verified in the logs that `initializeFromGraph` is called exactly once during it.

The benchmark results show that we calculate significantly fewer scores with the lazy eval enhancement. However, `indexMergeTime` did not decrease as expected.

<img width="1162" alt="benchmark" src="https://github.com/apache/lucene/assets/45954779/c8201249-85d0-4192-90b3-543ae72623b5">
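For readers unfamiliar with the two bugs in the description, here is a minimal, self-contained sketch of the intended behavior. It is not the code in this PR or Lucene's actual neighbor array: `LazyNeighborList`, `addWithScore`, `scoreAt`, and `scorerCalls` are hypothetical names made up for illustration. It only shows the two ideas: a pre-computed score is keyed by the slot index (the size at insertion time) rather than by the node ordinal, and a lookup consults only the single index being asked about instead of evaluating every entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

/** Hypothetical stand-in for a neighbor array; NOT Lucene's actual class. */
class LazyNeighborList {
  private final int[] nodes;
  // Pre-computed scores, keyed by slot index (not by node ordinal).
  private final Map<Integer, Float> precomputed = new HashMap<>();
  // Stand-in for the expensive vector similarity computation.
  private final IntToDoubleFunction scorer;
  private int size = 0;
  int scorerCalls = 0; // counts how often the expensive scorer ran

  LazyNeighborList(int capacity, IntToDoubleFunction scorer) {
    this.nodes = new int[capacity];
    this.scorer = scorer;
  }

  /** Adds a node whose score is not yet known. */
  void add(int node) {
    nodes[size++] = node;
  }

  /** Adds a node together with a score that was already computed elsewhere. */
  void addWithScore(int node, float score) {
    // Key by the slot the node will occupy (current size), not by `node`.
    precomputed.put(size, score);
    nodes[size++] = node;
  }

  /** Returns the score for one slot, computing it only if none was stored. */
  float scoreAt(int index) {
    Float cached = precomputed.get(index); // check only this one index
    if (cached != null) {
      return cached;
    }
    scorerCalls++;
    return (float) scorer.applyAsDouble(nodes[index]);
  }
}
```

With this shape, asking for the score of a slot that was added through `addWithScore` never invokes the scorer, which is the kind of saving the "fewer scores calculated" result in the benchmark refers to.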
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org