Re: [PR] [DRAFT] Load vector data directly from the memory segment [lucene]

via GitHub Fri, 27 Oct 2023 08:46:21 -0700


ChrisHegarty commented on PR #12703:
URL: https://github.com/apache/lucene/pull/12703#issuecomment-1783135109


   I've not been able to spend much time on this PR this week, but here's my 
current thinking.
   
   The abstractions in the PR are currently not great (as discussed above), but 
I'm putting that aside for now, since we can get a sense of the potential real 
performance impact of this approach as it is. So I ran some performance 
experiments beyond JMH microbenchmarks.
   
   It seems that this change improves the merge performance of vector data in 
segments by about 10%. That's not great; I was hoping for better. Is it worth 
proceeding with, or should we look elsewhere? I'm not sure. Here's how I 
determined the 10%: I hacked on KnnGraphTester from luceneUtil so that it 
creates an index with more than one segment when ingesting 100k+ vectors with 
768 dimensions, then timed the forceMerge. This is a very rough experiment, 
but it shows that the potential gain is much less than I expected. Caution: I 
could have goofed up several things, from the actual implementation to the 
merge experiment.
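
   For readers who want to reproduce a figure like the 10% above, here is a 
minimal sketch of how a relative improvement could be derived from two timed 
runs. It is not the code used in the experiment: `MergeTiming`, `timeNanos`, 
`percentImprovement`, and `busyWork` are hypothetical names, and the 
placeholder workloads merely stand in for a forceMerge over the same 
multi-segment index with and without the change.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of Lucene or luceneUtil. */
public final class MergeTiming {

  // Written to by busyWork so the placeholder loop is not optimized away.
  private static volatile long sink;

  private MergeTiming() {}

  /** Runs the task once and returns the elapsed wall-clock time in nanoseconds. */
  public static long timeNanos(Runnable task) {
    long start = System.nanoTime();
    task.run();
    return System.nanoTime() - start;
  }

  /** Relative reduction of the candidate time versus the baseline, in percent. */
  public static double percentImprovement(double baselineNanos, double candidateNanos) {
    if (baselineNanos <= 0) {
      throw new IllegalArgumentException("baseline must be positive");
    }
    return 100.0 * (baselineNanos - candidateNanos) / baselineNanos;
  }

  /** Placeholder workload (assumption): stands in for the timed forceMerge call. */
  private static void busyWork(int iterations) {
    long acc = 0;
    for (int i = 0; i < iterations; i++) {
      acc += i % 7;
    }
    sink = acc;
  }

  public static void main(String[] args) {
    // In a real run, each lambda would merge an identical index,
    // once on the baseline build and once on the patched build.
    long baseline = timeNanos(() -> busyWork(20_000_000));
    long candidate = timeNanos(() -> busyWork(18_000_000));
    System.out.printf(
        "baseline=%d ms, candidate=%d ms, improvement=%.1f%%%n",
        TimeUnit.NANOSECONDS.toMillis(baseline),
        TimeUnit.NANOSECONDS.toMillis(candidate),
        percentImprovement(baseline, candidate));
  }
}
```

   A single timed run like this is noisy; repeating each measurement several 
times and comparing medians would likely give a more trustworthy number.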


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


