shubhamvishu commented on code in PR #12679:
URL: https://github.com/apache/lucene/pull/12679#discussion_r1359606449


##########
lucene/core/src/java/org/apache/lucene/search/AbstractRnnVectorQuery.java:
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Objects;
+import org.apache.lucene.index.FieldInfo;
+import org.apache.lucene.index.LeafReaderContext;
+
+/**
+ * Search for all (approximate) vectors within a radius using the {@link RnnCollector}.
+ *
+ * @lucene.experimental
+ */
+abstract class AbstractRnnVectorQuery extends AbstractKnnVectorQuery {
+  private static final TopDocs NO_RESULTS = TopDocsCollector.EMPTY_TOPDOCS;
+
+  protected final float traversalThreshold, resultThreshold;
+
+  public AbstractRnnVectorQuery(
+      String field, float traversalThreshold, float resultThreshold, Query filter) {
+    super(field, Integer.MAX_VALUE, filter);
+    assert traversalThreshold <= resultThreshold;

Review Comment:
   Let's just throw an IAE here? Same for the `RnnCollector`.
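
For illustration, a minimal sketch of what replacing the `assert` with an explicit check could look like. `RadiusThresholds` is a hypothetical stand-in class, not part of the PR or of Lucene, and the exception message is an assumption. The check is written as the negation of `<=` so that a NaN threshold is rejected, as it would be by the original `assert`.

```java
/** Hypothetical stand-in for the threshold pair; not a Lucene class. */
final class RadiusThresholds {
  final float traversalThreshold;
  final float resultThreshold;

  RadiusThresholds(float traversalThreshold, float resultThreshold) {
    // Unlike `assert`, this check also runs when assertions are disabled (the JVM default).
    // Negating `<=` (instead of testing `>`) also rejects NaN inputs.
    if (!(traversalThreshold <= resultThreshold)) {
      throw new IllegalArgumentException(
          "traversalThreshold must be <= resultThreshold; got traversalThreshold="
              + traversalThreshold
              + ", resultThreshold="
              + resultThreshold);
    }
    this.traversalThreshold = traversalThreshold;
    this.resultThreshold = resultThreshold;
  }
}
```

With this shape, callers that pass inverted thresholds fail fast in production as well as in tests.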



##########
lucene/core/src/java/org/apache/lucene/search/AbstractRnnVectorQuery.java:
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Objects;
+import org.apache.lucene.index.FieldInfo;
+import org.apache.lucene.index.LeafReaderContext;
+
+/**
+ * Search for all (approximate) vectors within a radius using the {@link RnnCollector}.
+ *
+ * @lucene.experimental
+ */
+abstract class AbstractRnnVectorQuery extends AbstractKnnVectorQuery {
+  private static final TopDocs NO_RESULTS = TopDocsCollector.EMPTY_TOPDOCS;
+
+  protected final float traversalThreshold, resultThreshold;

Review Comment:
   Maybe add some javadocs (just copy from Rnn[Byte/Float]VectorQuery cx)
describing what each of these is for in this new query?



##########
lucene/core/src/java/org/apache/lucene/search/AbstractRnnVectorQuery.java:
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Objects;
+import org.apache.lucene.index.FieldInfo;
+import org.apache.lucene.index.LeafReaderContext;
+
+/**
+ * Search for all (approximate) vectors within a radius using the {@link RnnCollector}.
+ *
+ * @lucene.experimental
+ */
+abstract class AbstractRnnVectorQuery extends AbstractKnnVectorQuery {
+  private static final TopDocs NO_RESULTS = TopDocsCollector.EMPTY_TOPDOCS;

Review Comment:
   I see it's the same as in `AbstractKnnVectorQuery`, but we could just statically
import `TopDocsCollector.EMPTY_TOPDOCS` and use that instead?



##########
lucene/core/src/java/org/apache/lucene/search/AbstractRnnVectorQuery.java:
##########


Review Comment:
   Let's add some tests for these going forward.



##########
lucene/core/src/java/org/apache/lucene/search/RnnCollector.java:
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * A collector that performs radius-based vector searches. All vectors within an outer radius are
+ * traversed, and those within an inner radius are collected.
+ *
+ * @lucene.experimental
+ */
+public class RnnCollector extends AbstractKnnCollector {
+  private final float traversalThreshold, resultThreshold;
+  private final List<ScoreDoc> scoreDocList;
+
+  /**
+   * Performs radius-based vector searches.
+   *
+   * @param traversalThreshold similarity score corresponding to outer radius of graph traversal.
+   * @param resultThreshold similarity score corresponding to inner radius of result collection.
+   * @param visitLimit limit of graph nodes to visit.
+   */
+  public RnnCollector(float traversalThreshold, float resultThreshold, long visitLimit) {
+    super(Integer.MAX_VALUE, visitLimit);
+    assert traversalThreshold <= resultThreshold;
+    this.traversalThreshold = traversalThreshold;
+    this.resultThreshold = resultThreshold;
+    this.scoreDocList = new ArrayList<>();
+  }
+
+  @Override
+  public boolean collect(int docId, float similarity) {
+    if (similarity >= resultThreshold) {
+      return scoreDocList.add(new ScoreDoc(docId, similarity));
+    }
+    return false;
+  }
+
+  @Override
+  public float minCompetitiveSimilarity() {
+    return traversalThreshold;
+  }
+
+  @Override
+  // This does not return results in a sorted order to prevent unnecessary calculations (because we
+  // do not want to maintain the topK)

Review Comment:
   Nit: maybe move the javadoc above the annotation here and in the other
occurrences. Visually, something feels off this way.
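
As a rough sketch of the suggested ordering, with the documentation comment placed before `@Override` rather than between the annotation and the method. `ResultSource` and `UnsortedResults` are hypothetical names used only for this illustration; they are not from the PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface, used only to give the override something to implement. */
interface ResultSource {
  List<Integer> results();
}

/** Hypothetical collector-like class that keeps doc ids in insertion order. */
final class UnsortedResults implements ResultSource {
  private final List<Integer> docs = new ArrayList<>();

  void add(int doc) {
    docs.add(doc);
  }

  /**
   * Returns the collected doc ids in insertion order. They are deliberately not sorted, to avoid
   * unnecessary work.
   */
  @Override
  public List<Integer> results() {
    return List.copyOf(docs);
  }
}
```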



##########
lucene/core/src/java/org/apache/lucene/search/AbstractRnnVectorQuery.java:
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Objects;
+import org.apache.lucene.index.FieldInfo;
+import org.apache.lucene.index.LeafReaderContext;
+
+/**
+ * Search for all (approximate) vectors within a radius using the {@link RnnCollector}.
+ *
+ * @lucene.experimental
+ */
+abstract class AbstractRnnVectorQuery extends AbstractKnnVectorQuery {
+  private static final TopDocs NO_RESULTS = TopDocsCollector.EMPTY_TOPDOCS;
+
+  protected final float traversalThreshold, resultThreshold;
+
+  public AbstractRnnVectorQuery(
+      String field, float traversalThreshold, float resultThreshold, Query filter) {
+    super(field, Integer.MAX_VALUE, filter);
+    assert traversalThreshold <= resultThreshold;
+    this.traversalThreshold = traversalThreshold;
+    this.resultThreshold = resultThreshold;
+  }
+
+  @Override
+  protected TopDocs exactSearch(LeafReaderContext context, DocIdSetIterator acceptIterator)
+      throws IOException {
+    @SuppressWarnings("resource")
+    FieldInfo fi = context.reader().getFieldInfos().fieldInfo(field);
+    if (fi == null || fi.getVectorDimension() == 0) {
+      // The field does not exist or does not index vectors
+      return NO_RESULTS;
+    }
+
+    VectorScorer vectorScorer = createVectorScorer(context, fi);
+    List<ScoreDoc> scoreDocList = new ArrayList<>();
+
+    int doc;
+    while ((doc = acceptIterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
+      boolean advanced = vectorScorer.advanceExact(doc);
+      assert advanced;
+
+      float score = vectorScorer.score();
+      if (score >= resultThreshold) {
+        scoreDocList.add(new ScoreDoc(doc, score));
+      }
+    }
+
+    TotalHits totalHits = new TotalHits(acceptIterator.cost(), TotalHits.Relation.EQUAL_TO);
+    return new TopDocs(totalHits, scoreDocList.toArray(ScoreDoc[]::new));
+  }
+
+  @Override
+  // Segment-level results are not sorted (because we do not want to maintain the topK), just
+  // concatenate them
+  protected TopDocs mergeLeafResults(TopDocs[] perLeafResults) {
+    long value = 0;
+    TotalHits.Relation relation = TotalHits.Relation.EQUAL_TO;
+    List<ScoreDoc> scoreDocList = new ArrayList<>();
+
+    for (TopDocs topDocs : perLeafResults) {
+      value += topDocs.totalHits.value;
+      if (topDocs.totalHits.relation == TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO) {
+        relation = TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO;
+      }
+      scoreDocList.addAll(List.of(topDocs.scoreDocs));
+    }
+
+    return new TopDocs(new TotalHits(value, relation), scoreDocList.toArray(ScoreDoc[]::new));
+  }
+
+  @Override
+  public boolean equals(Object o) {
+    if (this == o) return true;
+    if (o == null || getClass() != o.getClass()) return false;
+    if (!super.equals(o)) return false;

Review Comment:
   Let's merge these two if statements? Same for the occurrences below.
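
One possible reading of this suggestion, sketched below under the assumption that the two statements meant are the class check and the `super.equals(o)` check. `BaseQuery` and `RadiusQuery` are hypothetical stand-ins for the real query classes, and the field comparison at the end is an assumption about what follows in the actual method.

```java
import java.util.Objects;

/** Hypothetical base class standing in for the parent query. */
class BaseQuery {
  final String field;

  BaseQuery(String field) {
    this.field = field;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    return Objects.equals(field, ((BaseQuery) o).field);
  }

  @Override
  public int hashCode() {
    return Objects.hash(getClass(), field);
  }
}

/** Hypothetical subclass carrying the two thresholds. */
final class RadiusQuery extends BaseQuery {
  final float traversalThreshold;
  final float resultThreshold;

  RadiusQuery(String field, float traversalThreshold, float resultThreshold) {
    super(field);
    this.traversalThreshold = traversalThreshold;
    this.resultThreshold = resultThreshold;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    // Merged guard: null check, class check, and super.equals in a single condition.
    if (o == null || getClass() != o.getClass() || !super.equals(o)) return false;
    RadiusQuery that = (RadiusQuery) o;
    return Float.compare(that.traversalThreshold, traversalThreshold) == 0
        && Float.compare(that.resultThreshold, resultThreshold) == 0;
  }

  @Override
  public int hashCode() {
    return Objects.hash(super.hashCode(), traversalThreshold, resultThreshold);
  }
}
```

Short-circuit evaluation keeps the behavior identical to the two separate statements: `super.equals(o)` is only reached once `o` is known to be non-null and of the same class.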



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

