Re: [PR] Re-use information from graph traversal during exact search [lucene]

via GitHub Mon, 20 Nov 2023 14:50:09 -0800


kaivalnp commented on PR #12820:
URL: https://github.com/apache/lucene/pull/12820#issuecomment-1819930817


   Yes, the restrictive filter will cause more fallbacks to `#exactSearch`, and 
the high `topK` means more nodes are visited during graph traversal, so more 
duplicate work is saved.
   
   > So we see a 5-10% improvement in latency b/w baseline and candidate?
   
   The benchmark is something of a happy case (high `topK`), but yes, we see a 
5-10% improvement in latency there. I'm not sure which `topK` / `selectivity` 
combinations users typically see, so it would be helpful if someone else could 
replicate the benchmark with values close to their use case.
   
   > like working in ordinal v/s docId space that @jpountz pointed out
   
   I took a shot at moving the tracking of `visited` nodes to the docId space 
(delegated to `KnnCollector`), and the benchmark results are similar to the 
ones posted earlier (and the flaw mentioned above is now addressed).
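   
   As a rough illustration of the idea (not the PR's actual code), the sketch 
below shows an exact search that skips re-scoring docs already visited during 
graph traversal. Everything here is a hypothetical stand-in: the class 
`ReuseVisitedSketch`, the `cachedScores` array, the dot-product similarity, and 
the use of `java.util.BitSet` keyed by docId are all assumptions, and no Lucene 
APIs are used.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch: none of these names come from Lucene or the PR.
class ReuseVisitedSketch {
  // Counts similarity computations so the saving is observable.
  static int comparisons = 0;

  // Assumed similarity function: plain dot product.
  static float dot(float[] a, float[] b) {
    comparisons++;
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  // Exact search over the accepted docs. Docs marked in `visited` (docId space)
  // were already scored during graph traversal, so their cached score is reused.
  static List<Integer> exactSearch(
      float[][] vectors, float[] query, BitSet acceptDocs,
      BitSet visited, float[] cachedScores, int topK) {
    float[] scores = new float[vectors.length];
    List<Integer> candidates = new ArrayList<>();
    for (int doc = acceptDocs.nextSetBit(0); doc >= 0; doc = acceptDocs.nextSetBit(doc + 1)) {
      scores[doc] = visited.get(doc) ? cachedScores[doc] : dot(vectors[doc], query);
      candidates.add(doc);
    }
    // Highest score first; the sort is stable, so ties keep docId order.
    candidates.sort((a, b) -> Float.compare(scores[b], scores[a]));
    return new ArrayList<>(candidates.subList(0, Math.min(topK, candidates.size())));
  }

  public static void main(String[] args) {
    float[][] vectors = {{1f, 0f}, {0f, 1f}, {0.5f, 0.5f}, {2f, 0f}};
    float[] query = {1f, 0f};

    BitSet acceptDocs = new BitSet();
    acceptDocs.set(0, 4); // the filter accepts docs 0..3

    // Pretend graph traversal already scored docs 0 and 3.
    BitSet visited = new BitSet();
    visited.set(0);
    visited.set(3);
    float[] cachedScores = {1f, 0f, 0f, 2f};

    List<Integer> top = exactSearch(vectors, query, acceptDocs, visited, cachedScores, 2);
    // prints "[3, 0] with 2 similarity computations"
    System.out.println(top + " with " + comparisons + " similarity computations");
  }
}
```

   In this toy setup only the two unvisited docs are scored, which is why a 
higher `topK` (more nodes visited before the fallback) would save more work.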


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org