ChrisHegarty commented on PR #12703: URL: https://github.com/apache/lucene/pull/12703#issuecomment-1783135109
I've not been able to spend much time on this PR this week, but here's my current thinking. The abstractions in the PR are currently not great (as discussed above), but putting that aside for now, we can still get a sense of the potential real-world performance impact of the approach as it stands. So I ran some performance experiments beyond the JMH micro-benchmarks.

It seems that this change improves the merge performance of vector data in segments by about 10%. That's not great; I was hoping for better. Is it worth proceeding with, or should we maybe look elsewhere? I'm not sure.

Here's how I determined the 10%: I hacked on KnnGraphTester from luceneUtil so that it creates an index with more than one segment when ingesting 100k+ vectors with 768 dimensions, then timed the forceMerge. This is a very rough experiment, but it shows that the potential gain is much less than I expected.

Caution: I could have goofed up several things, from the actual implementation to the merge experiment.