kaivalnp commented on code in PR #932:
URL: https://github.com/apache/lucene/pull/932#discussion_r933076894
##########
lucene/core/src/test/org/apache/lucene/util/hnsw/KnnGraphTester.java:
##########
@@ -730,4 +794,61 @@ protected int comparePivot(int j) {
return Float.compare(score[pivot], score[j]);
}
}
+
+ private static class SelectiveQuery extends Query {
+
+ public float selectivity = 1f;
+ private FixedBitSet selectedBits;
+ private long cost;
+
+ @SuppressForbidden(reason = "Uses Math.random()")
Review Comment:
We currently pass the configuration parameters as command line arguments,
so we cannot use `RandomizedRunner` (because `main` has a static context).
Using `RandomizedRunner` would be beneficial: the random setting of bits
would be reproducible across runs, so we could save on the true KNN compute
step by reusing cached results.
The current code does not cache the true KNN results for selective filters,
since the selected bits are random and not deterministic across runs.
This true KNN step takes up the bulk of the search time for selective searches.
We can:
- Manually add a parameter for the seed, and enable caching of the true KNN
results
- Extend `LuceneTestCase` and shift to individual tests (indexing,
searching, etc.). This might clean up the code and make it more readable in
general, but it needs a major refactor in my opinion
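
A rough sketch of the first option, to make the idea concrete. Everything
here is hypothetical and not taken from the PR: the `SeededSelection` class,
the `select` and `cacheKey` helpers, and the idea of a seed argument are all
assumptions. It uses `java.util.BitSet` in place of `FixedBitSet` only so the
snippet stays self-contained, and `java.util.Random` with an explicit seed in
place of `Math.random()`.

```java
import java.util.BitSet;
import java.util.Random;

// Hypothetical helper, not part of KnnGraphTester: picks a reproducible
// random subset of doc ids for a given seed.
final class SeededSelection {

  // Sets each of the maxDoc bits independently with probability `selectivity`.
  // The same (maxDoc, selectivity, seed) triple always yields the same bits,
  // because java.util.Random is deterministic for a fixed seed.
  static BitSet select(int maxDoc, float selectivity, long seed) {
    if (selectivity < 0f || selectivity > 1f) {
      throw new IllegalArgumentException("selectivity must be in [0, 1]");
    }
    Random random = new Random(seed);
    BitSet bits = new BitSet(maxDoc);
    for (int doc = 0; doc < maxDoc; doc++) {
      // nextFloat() is in [0, 1), so selectivity 1 selects every doc
      // and selectivity 0 selects none.
      if (random.nextFloat() < selectivity) {
        bits.set(doc);
      }
    }
    return bits;
  }

  // Assumed naming scheme for a cached true-KNN result: any change to the
  // inputs that affect the selected bits produces a different key.
  static String cacheKey(int maxDoc, float selectivity, long seed) {
    return maxDoc + "-" + selectivity + "-" + seed;
  }
}
```

With something like this, two runs given the same seed select the same
documents, so the true KNN results for that filter could be written once
under `cacheKey(...)` and read back on later runs.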
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]