jtibshirani commented on a change in pull request #262:
URL: https://github.com/apache/lucene/pull/262#discussion_r696842465



##########
File path: 
lucene/codecs/src/java/org/apache/lucene/codecs/simpletext/SimpleTextKnnVectorsReader.java
##########
@@ -140,7 +147,38 @@ public VectorValues getVectorValues(String field) throws 
IOException {
 
   @Override
   public TopDocs search(String field, float[] target, int k, Bits acceptDocs) 
throws IOException {
-    throw new UnsupportedOperationException();
+    VectorValues values = getVectorValues(field);
+    if (values == null) {
+      return null;
+    }
+    if (target.length != values.dimension()) {
+      throw new IllegalArgumentException(
+          "incorrect dimension for field "
+              + field
+              + "; expected "
+              + values.dimension()
+              + " but target has "
+              + target.length);
+    }
+    FieldInfo info = readState.fieldInfos.fieldInfo(field);
+    VectorSimilarityFunction vectorSimilarity = 
info.getVectorSimilarityFunction();
+    HitQueue topK = new HitQueue(k, false);
+    int doc;
+    while ((doc = values.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
+      float[] vector = values.vectorValue();
+      float score = vectorSimilarity.compare(vector, target);
+      if (vectorSimilarity.reversed) {
+        score = 1 / (score + 1);
+      }
+      topK.insertWithOverflow(new ScoreDoc(doc, score));
+    }
+    ScoreDoc[] topScoreDocs = new ScoreDoc[topK.size()];
+    int i = 0;
+    for (ScoreDoc scoreDoc : topK) {
+      topScoreDocs[i++] = scoreDoc;
+    }
+    Arrays.sort(topScoreDocs, Comparator.comparingInt(x -> x.doc));

Review comment:
       I was apparently confused too :) It'd also be great to document what the 
TotalHits part of the result means, this is a bit subtle. I think it's the 
number of docs that were visited during the search.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to