[GitHub] [lucene] agorlenko commented on pull request #11946: add similarity threshold for hnsw

GitBox Fri, 18 Nov 2022 08:04:14 -0800


agorlenko commented on PR #11946:
URL: https://github.com/apache/lucene/pull/11946#issuecomment-1320221923


   If we use only a post-filter in KnnVectorQuery, then we have to set k = 
Integer.MAX_VALUE (or another very large value) and calculate the similarity 
with all vectors. So the complexity would be O(n). 
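
   To make the O(n) cost concrete, here is a minimal sketch of what a pure 
post-filter amounts to. It is not Lucene code: the class `BruteForceThreshold`, 
the method `matches`, and the use of a plain dot product as the similarity are 
all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: scores every vector against the query.
class BruteForceThreshold {

  // Dot product, used here as a stand-in similarity function (an assumption).
  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  // Returns the indices of all vectors whose similarity to the query is at
  // least the threshold. Every vector is scored, so the cost is O(n).
  static List<Integer> matches(float[][] vectors, float[] query, float threshold) {
    List<Integer> result = new ArrayList<>();
    for (int i = 0; i < vectors.length; i++) {
      if (dot(vectors[i], query) >= threshold) {
        result.add(i);
      }
    }
    return result;
  }
}
```

   The loop touches every vector no matter how few of them pass the threshold, 
which is the cost described above.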
   
   I had another idea: we can check the similarity while we are traversing the 
graph. If the similarity is less than the threshold, we can discard this node 
and stop exploring this path. In that case we set k = Integer.MAX_VALUE and set 
the similarityThreshold value, but the time complexity would be between 
O(log(n)) and O(n) (it depends on the number of vectors with similarity greater 
than the threshold). I hope that this would allow us to solve tasks like the 
ones I described above 
(https://github.com/apache/lucene/pull/11946#issuecomment-1318924833) more 
efficiently.
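
   The pruning idea can be sketched on a toy single-layer graph. This is only 
an illustration under stated assumptions, not the PR's implementation and not 
Lucene's HNSW searcher: the class `ThresholdGraphSearch`, the adjacency-array 
graph, and the dot-product similarity are hypothetical, and the entry node is 
assumed to be already close to the query.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy example, not Lucene code.
class ThresholdGraphSearch {

  // Dot product, used here as a stand-in similarity function (an assumption).
  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  // Breadth-first traversal from the entry node. A node below the threshold is
  // dropped and its neighbors are not expanded through it.
  static List<Integer> search(
      float[][] vectors, int[][] neighbors, int entry, float[] query, float threshold) {
    List<Integer> results = new ArrayList<>();
    boolean[] visited = new boolean[vectors.length];
    Deque<Integer> frontier = new ArrayDeque<>();
    visited[entry] = true;
    frontier.add(entry);
    while (!frontier.isEmpty()) {
      int node = frontier.poll();
      if (dot(vectors[node], query) < threshold) {
        continue; // prune: stop exploring this path
      }
      results.add(node);
      for (int next : neighbors[node]) {
        if (!visited[next]) {
          visited[next] = true;
          frontier.add(next);
        }
      }
    }
    return results;
  }
}
```

   In this sketch the work is proportional to the number of nodes that pass the 
threshold plus their neighbors, rather than to n. One caveat of the sketch: a 
node above the threshold that is reachable only through pruned nodes is never 
visited, so the result can be incomplete.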


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


