agorlenko commented on PR #11946: URL: https://github.com/apache/lucene/pull/11946#issuecomment-1320221923
If we use only a post-filter in KnnVectorQuery, we have to set k = Integer.MAX_VALUE (or another very large value) and calculate the similarity with all vectors, so the complexity would be O(n). I had another idea: we can check the similarity while we are traversing the graph. If the similarity is less than the threshold, we can discard that node and stop exploring that path. In that case we still set k = Integer.MAX_VALUE and also set a similarityThreshold value, but the time complexity would be between O(log(n)) and O(n), depending on the number of vectors with similarity greater than the threshold. I hope this will allow us to solve tasks like the ones I described above (https://github.com/apache/lucene/pull/11946#issuecomment-1318924833) more efficiently.
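To make the idea concrete, here is a minimal sketch of threshold-pruned traversal on a toy single-layer neighbor graph. It is not Lucene's HNSW code and does not use any Lucene API: the class ThresholdGraphSearch, the methods search and similarity, the adjacency-array layout, and the choice of dot product as the similarity are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; names and data layout are not from Lucene.
final class ThresholdGraphSearch {

    // Dot product, standing in for whatever similarity function the index uses.
    static float similarity(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Returns the ids of all nodes that are reachable from entryPoint through
    // nodes whose similarity to the query is at least the threshold.
    static List<Integer> search(float[][] vectors, int[][] neighbors,
                                float[] query, int entryPoint, float threshold) {
        List<Integer> results = new ArrayList<>();
        boolean[] visited = new boolean[vectors.length];
        Deque<Integer> toExplore = new ArrayDeque<>();
        visited[entryPoint] = true;
        toExplore.push(entryPoint);

        while (!toExplore.isEmpty()) {
            int node = toExplore.pop();
            if (similarity(query, vectors[node]) < threshold) {
                // Below the threshold: drop the node and do not expand its neighbors.
                continue;
            }
            results.add(node);
            for (int next : neighbors[node]) {
                if (!visited[next]) {
                    visited[next] = true;
                    toExplore.push(next);
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        float[][] vectors = {{1f, 0f}, {0.9f, 0.1f}, {0.1f, 0.9f}, {0.8f, 0.2f}, {0f, 1f}};
        int[][] neighbors = {{1, 2}, {0, 3}, {0, 4}, {1}, {2}};
        float[] query = {1f, 0f};
        // Node 2 is scored and pruned; node 4 is never scored at all.
        System.out.println(search(vectors, neighbors, query, 0, 0.5f)); // prints [0, 1, 3]
    }
}
```

In this sketch only four of the five vectors are scored, which is where the saving over a full O(n) scan comes from. The trade-off is that a vector above the threshold can be missed if every path to it passes through nodes below the threshold, so the result is approximate in the same way the graph search itself is.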