jpountz commented on a change in pull request #658: URL: https://github.com/apache/lucene/pull/658#discussion_r802539169
##########
File path: lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java
##########
@@ -369,6 +376,45 @@ public Scorer scorer(LeafReaderContext context) throws IOException {
         return scorerSupplier.get(Long.MAX_VALUE);
       }
 
+      @Override
+      public int count(LeafReaderContext context) throws IOException {
+        LeafReader reader = context.reader();
+
+        PointValues values = reader.getPointValues(field);
+        if (checkValidPointValues(values) == false) {
+          return 0;
+        }
+
+        if (reader.hasDeletions() == false
+            && numDims == 1
+            && values.getDocCount() == values.size()) {
+          // if all documents have at-most one point
+          final int[] intersectingLeafNodeCount = {0};
+          // create a custom IntersectVisitor that records the number of leafNodes that matched
+          final IntersectVisitor visitor =
+              new IntersectVisitor() {
+                @Override
+                public void visit(int docID) {
+                  intersectingLeafNodeCount[0]++;

Review comment:
Let's throw an UnsupportedOperationException here and move the increment to `visit(int,byte[])`? It would be a bug if this method ever got called, since the point is to skip nodes that are contained by the query.

##########
File path: lucene/core/src/java/org/apache/lucene/index/PointValues.java
##########
@@ -369,6 +369,52 @@ private void intersect(IntersectVisitor visitor, PointTree pointTree) throws IOE
     }
   }
 
+  /**
+   * Finds the number of points matching the provided range conditions. Using this method is faster
+   * than calling {@link #intersect(IntersectVisitor)} to get the count of intersecting points. This
+   * method does not enforce live documents, therefore it should only be used when there are no
+   * deleted documents.
+   */
+  public final long countPoints(IntersectVisitor visitor) throws IOException {
+    final PointTree pointTree = getPointTree();
+    long countPoints = countPoints(visitor, pointTree);
+    assert pointTree.moveToParent()
+        == false; // just checking to make sure we ended the tree search at the root node
+    return countPoints;
+  }
+
+  private long countPoints(IntersectVisitor visitor, PointTree pointTree) throws IOException {
+    Relation r = visitor.compare(pointTree.getMinPackedValue(), pointTree.getMaxPackedValue());
+    switch (r) {
+      case CELL_OUTSIDE_QUERY:
+        // This cell is fully outside the query shape: return 0 as the count of its nodes
+        return 0;
+      case CELL_INSIDE_QUERY:
+        // This cell is fully inside the query shape: return the size of the entire node as the
+        // count
+        return pointTree.size();
+      case CELL_CROSSES_QUERY:
+        /*
+        The cell crosses the shape boundary, or the cell fully contains the query, so we fall
+        through and do full counting.
+        */
+        if (pointTree.moveToChild()) {
+          int cellCount = 0;
+          do {
+            cellCount += countPoints(visitor, pointTree);
+          } while (pointTree.moveToSibling());
+          pointTree.moveToParent();
+          return cellCount;
+        } else {
+          // we have reached a leaf node here.
+          pointTree.visitDocValues(visitor);
+          return 0; // the visitor has safely recorded the number of leaf nodes that matched
+        }
+      default:
+        throw new IllegalArgumentException("Unreachable code");
+    }
+  }
+

Review comment:
I think I'd keep these two methods as implementation details of PointRangeQuery? The contract is a bit weird as the `IntersectVisitor` only collects documents that are on leaves that cross the query.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
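To make the two review comments concrete, here is a minimal, self-contained sketch of the counting scheme under discussion. It is not the PR's code and does not use Lucene's API: `Node`, `Visitor`, `count`, and `countCells` are hypothetical stand-ins for a one-dimensional tree, and the doc ID passed to the visitor is just a placeholder index. Cells fully inside the range are counted by their size, the visitor only ever sees values on leaves that cross the range, and the value-less `visit(int)` throws because reaching it would indicate a bug.

```java
import java.util.Arrays;
import java.util.List;

final class RangeCountSketch {

  /** Hypothetical two-callback visitor, loosely modeled on the callbacks named in the review. */
  interface Visitor {
    void visit(int docID);

    void visit(int docID, int value);
  }

  /** Hypothetical 1-D cell: a leaf holding values, or an inner node holding children. */
  static final class Node {
    final int min;
    final int max;
    final long size;
    final List<Node> children;
    final int[] values;

    private Node(int min, int max, long size, List<Node> children, int[] values) {
      this.min = min;
      this.max = max;
      this.size = size;
      this.children = children;
      this.values = values;
    }

    /** Builds a leaf; assumes at least one value. */
    static Node leaf(int... values) {
      int min = Arrays.stream(values).min().getAsInt();
      int max = Arrays.stream(values).max().getAsInt();
      return new Node(min, max, values.length, List.of(), values);
    }

    /** Builds an inner node whose bounds and size are derived from its children. */
    static Node inner(Node... children) {
      int min = Integer.MAX_VALUE;
      int max = Integer.MIN_VALUE;
      long size = 0;
      for (Node child : children) {
        min = Math.min(min, child.min);
        max = Math.max(max, child.max);
        size += child.size;
      }
      return new Node(min, max, size, List.of(children), new int[0]);
    }
  }

  /** Counts values in [lower, upper] without visiting cells that are fully inside the range. */
  static long count(Node root, int lower, int upper) {
    final long[] leafMatches = {0};
    Visitor visitor =
        new Visitor() {
          @Override
          public void visit(int docID) {
            // Fully contained cells are counted by size, so this must never be reached.
            throw new UnsupportedOperationException("contained cells are counted by size");
          }

          @Override
          public void visit(int docID, int value) {
            if (value >= lower && value <= upper) {
              leafMatches[0]++;
            }
          }
        };
    return countCells(root, lower, upper, visitor) + leafMatches[0];
  }

  private static long countCells(Node node, int lower, int upper, Visitor visitor) {
    if (node.max < lower || node.min > upper) {
      return 0; // cell is outside the range
    }
    if (node.min >= lower && node.max <= upper) {
      return node.size; // cell is inside the range: take its size, visit nothing
    }
    if (node.children.isEmpty() == false) {
      long total = 0;
      for (Node child : node.children) {
        total += countCells(child, lower, upper, visitor);
      }
      return total;
    }
    // Crossing leaf: the visitor records matches; the index is a placeholder doc ID.
    for (int i = 0; i < node.values.length; i++) {
      visitor.visit(i, node.values[i]);
    }
    return 0;
  }

  public static void main(String[] args) {
    Node root = Node.inner(Node.leaf(1, 2, 3), Node.leaf(4, 5, 6), Node.leaf(7, 8, 9));
    System.out.println(count(root, 3, 7)); // prints 5
  }
}
```

The sketch also shows why the contract looks odd from the outside: the total is split between the recursion's return value and the visitor's side counter, which only makes sense when both are owned by the same caller.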