[ https://issues.apache.org/jira/browse/LUCENE-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450277#comment-17450277 ]
Ignacio Vera commented on LUCENE-9619: -------------------------------------- In LUCENE-9820 we have done the first step to move the API but still the methods #visitDocsIds and #visitDocValues are using the IntersectVisitor as an input. Here I am proposing to introduce two functional interfaces {{DocIdsVisitor}} and {{DocValuesVisitor}} to use them as the input for those methods so the API would look like: {code:java} /** * Basic operations to read the KD-tree. * * @lucene.experimental */ public interface PointTree extends Cloneable { /** Clone, the current node becomes the root of the new tree. */ PointTree clone(); /** * Move to the first child node and return {@code true} upon success. Returns {@code false} for * leaf nodes and {@code true} otherwise. */ boolean moveToChild() throws IOException; /** * Move to the next sibling node and return {@code true} upon success. Returns {@code false} if * the current node has no more siblings. */ boolean moveToSibling() throws IOException; /** * Move to the parent node and return {@code true} upon success. Returns {@code false} for the * root node and {@code true} otherwise. */ boolean moveToParent() throws IOException; /** Return the minimum packed value of the current node. */ byte[] getMinPackedValue(); /** Return the maximum packed value of the current node. */ byte[] getMaxPackedValue(); /** Return the number of points below the current node. */ long size(); /** Visit all the docs below the current node. */ void visitDocIDs(DocIdsVisitor docIdsVisitor) throws IOException; /** Visit all the docs and values below the current node. */ default void visitDocValues(DocValuesVisitor docValuesVisitor) throws IOException { visitDocValues((min, max) -> Relation.CELL_CROSSES_QUERY, docID -> {}, docValuesVisitor); } /** * Similar to {@link #visitDocValues(DocValuesVisitor)} but in this case it allows adding a * filter that works like {@link IntersectVisitor#compare(byte[], byte[])}. */ void visitDocValues( BiFunction<byte[], byte[], Relation> compare, DocIdsVisitor docIdsVisitor, DocValuesVisitor docValuesVisitor) throws IOException; } /** * Collects all documents below a tree node by calling {@link * PointTree#visitDocIDs(DocIdsVisitor)} */ @FunctionalInterface public interface DocIdsVisitor { /** Called for all documents below a tree node. */ void visit(int docID) throws IOException; } /** * Collects all documents and values below a tree node by calling {@link * PointTree#visitDocValues(DocValuesVisitor)} (DocIdsVisitor)} */ @FunctionalInterface public interface DocValuesVisitor { /** Called for all documents and values below a tree node. */ void visit(int docID, byte[] packedValue) throws IOException; /** * Similar to {@link DocValuesVisitor#visit(int, byte[])} but in this case the packedValue can * have more than one docID associated to it. The provided iterator should not escape the scope * of this method so that implementations of PointValues are free to reuse it. */ default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException { int docID; while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) { visit(docID, packedValue); } } } /** * We recurse the {@link PointTree}, using a provided instance of this to guide the recursion. * * @lucene.experimental */ public interface IntersectVisitor extends DocValuesVisitor, DocIdsVisitor { /** * Called for non-leaf cells to test how the cell relates to the query, to determine how to * further recurse down the tree. * * <ul> * <li>{@link Relation#CELL_OUTSIDE_QUERY}: Stop recursing down the current branch of the * tree. * <li>{@link Relation#CELL_INSIDE_QUERY}: All nodes below the current node are visited using * the underlying {@link DocIdsVisitor}. he consumer should generally blindly accept the * docID. * <li>{@link Relation#CELL_CROSSES_QUERY}: Keep recursing down the current branch of the * tree. If the current node is a leaf, visit all docs and values usinng the underlying * {@link DocValuesVisitor}. The consumer should scrutinize the packedValue to decide * whether to accept it. * </ul> */ Relation compare(byte[] minPackedValue, byte[] maxPackedValue); /** Notifies the caller that this many documents are about to be visited */ default void grow(int count) {} } {code} Any thoughts? > Move Points from a visitor API to a cursor-style API? > ----------------------------------------------------- > > Key: LUCENE-9619 > URL: https://issues.apache.org/jira/browse/LUCENE-9619 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Adrien Grand > Priority: Minor > > Points' visitor API work well but there are a couple things we could make > better if we moved to a cursor API, e.g. > - Term queries could return a DocIdSetIterator without having to materialize > a BitSet. > - Nearest-neighbor search could work on top of the regular API instead of > casting to BKDReader > https://github.com/apache/lucene-solr/blob/6a7131ee246d700c2436a85ddc537575de2aeacf/lucene/sandbox/src/java/org/apache/lucene/sandbox/document/FloatPointNearestNeighbor.java#L296 > - We could optimize counting the number of matches of a query by adding the > number of points in a leaf without visiting documents where there are no > deleted documents and a leaf fully matches the query. -- This message was sent by Atlassian Jira (v8.20.1#820001) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org