jpountz commented on a change in pull request #486: URL: https://github.com/apache/lucene/pull/486#discussion_r761233070
########## File path: lucene/core/src/java/org/apache/lucene/index/ExitableDirectoryReader.java ########## @@ -505,80 +504,29 @@ public long size() { } @Override - public void visitDocIDs(PointValues.IntersectVisitor visitor) throws IOException { + public void visitDocIDs(PointValues.DocIdsVisitor docIdsVisitor) throws IOException { checkAndThrow(); - in.visitDocIDs(visitor); + in.visitDocIDs( + docID -> { + checkAndThrowWithSampling(); Review comment: Can we find a better way and not check on every doc/value pair but once per leaf node or something like that? ########## File path: lucene/core/src/java/org/apache/lucene/index/PointValues.java ########## @@ -361,14 +405,29 @@ private void intersect(IntersectVisitor visitor, PointTree pointTree) throws IOE // TODO: we can assert that the first value here in fact matches what the pointTree // claimed? // Leaf node; scan and filter all points in this block: - pointTree.visitDocValues(visitor); + visitor.grow((int) pointTree.size()); Review comment: is this cast safe? If yes, can you leave a comment about it and use `Math.toIntExact`? ########## File path: lucene/core/src/java/org/apache/lucene/index/PointValues.java ########## @@ -361,14 +405,29 @@ private void intersect(IntersectVisitor visitor, PointTree pointTree) throws IOE // TODO: we can assert that the first value here in fact matches what the pointTree // claimed? // Leaf node; scan and filter all points in this block: - pointTree.visitDocValues(visitor); + visitor.grow((int) pointTree.size()); + pointTree.visitDocValues(visitor::compare, visitor, visitor); } break; default: throw new IllegalArgumentException("Unreachable code"); } } + private void visitDocIds(IntersectVisitor visitor, PointTree pointTree) throws IOException { + final long size = pointTree.size(); + if (size <= Integer.MAX_VALUE) { + visitor.grow((int) size); + pointTree.visitDocIDs(visitor::visit); + } else { + if (pointTree.moveToChild()) { + do { + visitDocIds(visitor, pointTree); Review comment: nit: I wonder if we should try to make it iterative instead of recursive so that it would also work with (weird) implementations of PointValues that could have a high depth ########## File path: lucene/core/src/java/org/apache/lucene/index/PointValues.java ########## @@ -323,10 +355,18 @@ default void grow(int count) {} */ public final void intersect(IntersectVisitor visitor) throws IOException { final PointTree pointTree = getPointTree(); - intersect(visitor, pointTree); + intersect(wrapIntersectVisitor(visitor), pointTree); assert pointTree.moveToParent() == false; } + /** + * Adds the possibility of wrapping a provided {@link IntersectVisitor} in {@link + * #intersect(IntersectVisitor)}. + */ + protected IntersectVisitor wrapIntersectVisitor(IntersectVisitor visitor) throws IOException { + return visitor; + } Review comment: Actually I have a preference for not making intersect final, which has the benefit of not increasing the API surface area (protected methods show up in javadocs). ########## File path: lucene/core/src/java/org/apache/lucene/index/PointValues.java ########## @@ -269,22 +270,33 @@ protected PointValues() {} long size(); /** Visit all the docs below the current node. */ - void visitDocIDs(IntersectVisitor visitor) throws IOException; + void visitDocIDs(DocIdsVisitor docIdsVisitor) throws IOException; /** Visit all the docs and values below the current node. */ - void visitDocValues(IntersectVisitor visitor) throws IOException; + default void visitDocValues(DocValuesVisitor docValuesVisitor) throws IOException { + visitDocValues((min, max) -> Relation.CELL_CROSSES_QUERY, docID -> {}, docValuesVisitor); + } + + /** + * Similar to {@link #visitDocValues(DocValuesVisitor)} but in this case it allows adding a + * filter that works like {@link IntersectVisitor#compare(byte[], byte[])}. + */ + void visitDocValues( + BiFunction<byte[], byte[], Relation> compare, Review comment: I'm considuring adding a new functional interface instead of relying on a java.util function. This would help put javadocs about the contract that we expect from this comparison function. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org