benwtrent commented on code in PR #13910:
URL: https://github.com/apache/lucene/pull/13910#discussion_r1803132931


##########
lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseKnnVectorsFormatTestCase.java:
##########
@@ -1906,4 +1916,122 @@ public void testMismatchedFields() throws Exception {
 
     IOUtils.close(reader, w2, dir1, dir2);
   }
+
+  /**
+   * Test that the query is a viable approximation to exact search. This test is designed to uncover
+   * gross failures only, not to represent the true expected recall.
+   */
+  public void testRecall() throws IOException {
+    VectorSimilarityFunction vectorSimilarityFunction = VectorSimilarityFunction.EUCLIDEAN;
+    int dim = 16;
+    try (Directory indexStore = getKnownIndexStore("field", dim, vectorSimilarityFunction);
+        IndexReader reader = DirectoryReader.open(indexStore)) {
+      IndexSearcher searcher = newSearcher(reader);
+      float[] queryEmbedding = new float[dim];
+      String queryString = "Apache License";
+      computeLineEmbedding(queryString, queryEmbedding);
+      // computeLineEmbedding("   END OF TERMS AND CONDITIONS", queryEmbedding);
+      // pass match-all "filter" to force full traversal, bypassing graph
+      KnnFloatVectorQuery exactQuery =
+          new KnnFloatVectorQuery("field", queryEmbedding, 1000, new MatchAllDocsQuery());

Review Comment:
   Also, for more consistent runs, I think we may want to test with multiple query embeddings and gather the `min`, `max`, and `avg` recall across them. But this can be a further refinement on this work.
   
   I just think a test that relies on a single query might be very flaky in the long run.
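
   To make the suggestion concrete, here is a minimal sketch of the aggregation step, assuming the test has already collected the exact and approximate top-k doc IDs for each query. `RecallSummary`, `recall`, and `summarize` are hypothetical names for illustration only; they are not part of Lucene or of this PR, and the sketch deliberately avoids any Lucene API.

```java
import java.util.DoubleSummaryStatistics;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Lucene or this PR. */
final class RecallSummary {

  private RecallSummary() {}

  /** Fraction of the exact top-k doc IDs that the approximate search also returned. */
  static double recall(int[] exactDocs, int[] approxDocs) {
    if (exactDocs.length == 0) {
      throw new IllegalArgumentException("exact result set must not be empty");
    }
    Set<Integer> approx = new HashSet<>();
    for (int doc : approxDocs) {
      approx.add(doc);
    }
    int hits = 0;
    for (int doc : exactDocs) {
      if (approx.contains(doc)) {
        hits++;
      }
    }
    return (double) hits / exactDocs.length;
  }

  /** Collects per-query recall so the caller can read min, max, and average. */
  static DoubleSummaryStatistics summarize(
      List<int[]> exactPerQuery, List<int[]> approxPerQuery) {
    if (exactPerQuery.size() != approxPerQuery.size()) {
      throw new IllegalArgumentException("need one approximate result per exact result");
    }
    DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
    for (int i = 0; i < exactPerQuery.size(); i++) {
      stats.accept(recall(exactPerQuery.get(i), approxPerQuery.get(i)));
    }
    return stats;
  }
}
```

   The test could then assert on `getAverage()` with a loose lower bound and use `getMin()` only to catch gross failures, which should be less sensitive to one unlucky query than a single-query check. The thresholds themselves would still need to be chosen empirically.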



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

