kaivalnp commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r888849991
##########
lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java:
##########
@@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) {
       return new BitSetIterator(bitSets[contextOrd], cost[contextOrd]);
     }
 
+    public void setBitSet(BitSet bitSet, int cost) {
+      bitSets[ord] = bitSet;

Review Comment:
Here are some related numbers. This is the baseline (with bulk collection):

selectivity | effective topK | post-filter recall | post-filter time | pre-filter recall | pre-filter time
-- | -- | -- | -- | -- | --
0.8 | 125 | 0.964 | 1.58 | 0.975 | 1.60
0.6 | 166 | 0.962 | 1.94 | 0.981 | 1.97
0.4 | 250 | 0.960 | 2.70 | 0.986 | 2.64
0.2 | 500 | 0.963 | 4.76 | 0.991 | 4.51
0.1 | 1000 | 0.957 | 8.53 | 0.995 | 7.78
0.01 | 10000 | 0.961 | 58.28 | 1.000 | 9.58

I removed the overloaded `BulkScorer` and made `#scorer` return a `ConstantScoreScorer` wrapping the `BitSetIterator` of our query, much like the `BitSetQuery` that you mentioned. This removes the bulk collection optimization and switches to doc-by-doc collection. Here are the numbers:

selectivity | effective topK | post-filter recall | post-filter time | pre-filter recall | pre-filter time
-- | -- | -- | -- | -- | --
0.8 | 125 | 0.967 | 1.55 | 0.976 | 19.65
0.6 | 166 | 0.964 | 1.94 | 0.981 | 17.79
0.4 | 250 | 0.961 | 2.69 | 0.986 | 14.71
0.2 | 500 | 0.958 | 4.78 | 0.992 | 11.19
0.1 | 1000 | 0.959 | 8.53 | 0.994 | 11.50
0.01 | 10000 | 0.937 | 58.32 | 1.000 | 10.34

Without bulk collection, the pre-filter time is high when more docs pass the filter and are collected one by one: 19.65 at selectivity 0.8 versus 1.60 in the baseline. At selectivity 0.01 the gap mostly disappears (10.34 versus 9.58).
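For anyone reading the tables: the "effective topK" column looks consistent with a base topK of 100 divided by the selectivity, truncated to an integer. The base value of 100 and the truncation are inferred from the numbers above, not stated in this thread, and the class and method names below are hypothetical. A minimal sketch of that arithmetic, using integer percentages to sidestep floating-point rounding:

```java
public class EffectiveTopK {

    // Hypothetical helper (not part of Lucene): number of candidates to request
    // so that roughly topK remain after a filter that passes
    // selectivityPercent% of the documents. Integer division truncates,
    // which matches the 166 listed for selectivity 0.6.
    static int effectiveTopK(int topK, int selectivityPercent) {
        if (topK <= 0 || selectivityPercent <= 0 || selectivityPercent > 100) {
            throw new IllegalArgumentException("topK must be > 0 and selectivity in (0, 100]");
        }
        return (int) ((long) topK * 100 / selectivityPercent);
    }

    public static void main(String[] args) {
        int baseTopK = 100; // assumption inferred from the tables
        int[] selectivityPercents = {80, 60, 40, 20, 10, 1};
        for (int percent : selectivityPercents) {
            System.out.println("selectivity " + percent + "% -> effective topK "
                + effectiveTopK(baseTopK, percent));
        }
    }
}
```

If that reading is right, it would also explain why the post-filter time grows so quickly as selectivity drops, since the requested candidate count grows from 125 to 10000 over the same range.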
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org