[GitHub] [lucene] agorlenko commented on pull request #11946: add similarity threshold for hnsw

GitBox Tue, 06 Dec 2022 15:25:47 -0800


agorlenko commented on PR #11946:
URL: https://github.com/apache/lucene/pull/11946#issuecomment-1340151458


   I've run some experiments with real data, and the approach doesn't work 
as I expected. If the number of docs that exceed the threshold is significant 
(for example, 20% or more of the previously accepted docs), the query is slow 
and an exact search performs better. Unfortunately, this happens quite often. 
   
   So I agree with @msokolov, and I think I should rewrite this PR using a 
post-filtering approach. It lets us preserve predictable performance without 
modifying LeafReader/IndexReader (we just filter the TopDocs in KnnVectorQuery).
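
   A minimal sketch of the post-filtering idea, not the actual patch: run the 
usual top-k search first, then drop every hit whose score falls below the 
threshold. To stay self-contained it does not use Lucene types; `Hit` is a 
hypothetical stand-in for Lucene's `ScoreDoc`, and `PostFilterSketch`, 
`filterByThreshold`, and `minScore` are illustrative names that do not exist 
in Lucene.

```java
import java.util.ArrayList;
import java.util.List;

final class PostFilterSketch {

    // Hypothetical stand-in for Lucene's ScoreDoc: a doc id and its similarity score.
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Keeps only the hits whose score is at least minScore.
    // The input is assumed to be the result of an ordinary top-k search,
    // so the cost is bounded by k and the input order is preserved.
    static List<Hit> filterByThreshold(List<Hit> topHits, float minScore) {
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : topHits) {
            if (hit.score >= minScore) {
                kept.add(hit);
            }
        }
        return kept;
    }
}
```

   Because the filter only ever sees the k hits that the graph search already 
returned, the query cost stays the same as an unfiltered top-k search. The 
trade-off is that fewer than k results may come back.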


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


