iverase commented on a change in pull request #7:
URL: https://github.com/apache/lucene/pull/7#discussion_r727868879

##########
File path: lucene/test-framework/src/java/org/apache/lucene/index/AssertingLeafReader.java
##########
@@ -1090,6 +1090,13 @@ private void assertStats(int maxDoc) {
       assert in.getDocCount() <= maxDoc;
     }

+    @Override
+    public IndexTree getIndexTree() throws IOException {
+      // TODO: assert that there are no illegal calls when navigating the tree?
+      assertThread("Points", creationThread);

Review comment:
Added an `AssertingIndexTree` that validates that we don't call moveToChild() or clone() after having called moveToParent().

##########
File path: lucene/core/src/java/org/apache/lucene/index/ExitableDirectoryReader.java
##########
@@ -372,6 +372,12 @@ private void checkAndThrow() {
       }
     }

+    @Override
+    public IndexTree getIndexTree() throws IOException {
+      checkAndThrow();
+      return in.getIndexTree();
+    }

Review comment:
Added a TODO. One of the ideas I had was to make the intersect method final in the PointValues API, in which case this would be a must. That was another reason for opening #371, as it is pretty hard to do without that change.

##########
File path: lucene/core/src/java/org/apache/lucene/index/PointValues.java
##########
@@ -227,8 +228,56 @@ protected PointValues() {}
     CELL_CROSSES_QUERY
   };

+  /** Create a new {@link IndexTree} to navigate the index */
+  public abstract IndexTree getIndexTree() throws IOException;
+
+  /**
+   * Basic operations to read the KD-tree.
+   *
+   * @lucene.experimental
+   */
+  public interface IndexTree extends Cloneable {
+
+    /** Clone, the current node becomes the root of the new tree. */
+    IndexTree clone();
+
+    /**
+     * Move to the first child node and return {@code true} upon success. Returns {@code false} for
+     * leaf nodes and {@code true} otherwise. Should not be called if the current node has already
+     * called this method.

Review comment:
Have a look at what I did in AssertingIndexTree. Could we detect it and throw an error in that case? It should only cost a small performance penalty.
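
For readers following the thread, the kind of check described in the comment above (rejecting moveToChild() or clone() once moveToParent() has been called) could be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the PR: `Cursor`, `HeapCursor`, `AssertingCursor` and `moveToSibling()` are hypothetical names invented here, the toy tree is a complete binary tree with heap numbering, and the rule that a successful sibling move clears the flag is an assumption.

```java
// Hypothetical cursor API for illustration only; it is NOT Lucene's IndexTree.
interface Cursor {
  boolean moveToChild();
  boolean moveToSibling();
  boolean moveToParent();
  Cursor clone();
}

// Toy complete binary tree using heap numbering: root is 1, children of n are 2n and 2n + 1.
final class HeapCursor implements Cursor {
  private final int rootId;
  private final int firstLeafId;
  private int nodeId;

  HeapCursor(int levels) {
    this(1, 1 << (levels - 1));
  }

  private HeapCursor(int rootId, int firstLeafId) {
    this.rootId = rootId;
    this.firstLeafId = firstLeafId;
    this.nodeId = rootId;
  }

  @Override
  public boolean moveToChild() {
    if (nodeId >= firstLeafId) return false; // leaf node
    nodeId *= 2; // left child
    return true;
  }

  @Override
  public boolean moveToSibling() {
    // The root has no sibling, and odd ids are right children.
    if (nodeId == rootId || (nodeId & 1) == 1) return false;
    nodeId++;
    return true;
  }

  @Override
  public boolean moveToParent() {
    if (nodeId == rootId) return false;
    nodeId /= 2;
    return true;
  }

  @Override
  public Cursor clone() {
    // The current node becomes the root of the new cursor.
    return new HeapCursor(nodeId, firstLeafId);
  }
}

// Wrapper that rejects moveToChild() and clone() after a successful moveToParent().
final class AssertingCursor implements Cursor {
  private final Cursor in;
  private boolean movedToParent;

  AssertingCursor(Cursor in) {
    this.in = in;
  }

  @Override
  public boolean moveToChild() {
    if (movedToParent) {
      throw new IllegalStateException("moveToChild() called after moveToParent()");
    }
    return in.moveToChild();
  }

  @Override
  public boolean moveToSibling() {
    boolean moved = in.moveToSibling();
    if (moved) movedToParent = false; // assumption: a sibling is a fresh, unvisited node
    return moved;
  }

  @Override
  public boolean moveToParent() {
    boolean moved = in.moveToParent();
    if (moved) movedToParent = true;
    return moved;
  }

  @Override
  public Cursor clone() {
    if (movedToParent) {
      throw new IllegalStateException("clone() called after moveToParent()");
    }
    return new AssertingCursor(in.clone());
  }
}

final class NavigationDemo {
  public static void main(String[] args) {
    Cursor cursor = new AssertingCursor(new HeapCursor(3));
    cursor.moveToChild();
    cursor.moveToParent();
    try {
      cursor.moveToChild(); // illegal: this node's children were already visited
    } catch (IllegalStateException e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```

The wrapper only adds one boolean check per call, which is consistent with the remark that the penalty should be small; whether such a check belongs in the production implementation or only in the test framework is the open question in the comment.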

##########
File path: lucene/core/src/java/org/apache/lucene/index/PointValues.java
##########
@@ -331,6 +450,9 @@ public long estimateDocCount(IntersectVisitor visitor) {
   /** Returns the number of bytes per dimension */
   public abstract int getBytesPerDimension() throws IOException;

+  /** Returns the maximum number of points per leaf node */
+  public abstract int getMaxPointsPerLeafNode() throws IOException;

Review comment:
This is currently used for one-dimensional merges. In that case we read the leaves of the tree one at a time. To do so, we build buffers beforehand that are sized using this value. If we don't expose it, we will need more complex logic, such as checking the size of each leaf and sizing our buffers accordingly. I guess a more advanced solution would be to move from a visitor pattern to returning iterators, so there is no need to copy the data locally.

##########
File path: lucene/core/src/java/org/apache/lucene/index/PointValues.java
##########
@@ -331,6 +450,9 @@ public long estimateDocCount(IntersectVisitor visitor) {
   /** Returns the number of bytes per dimension */
   public abstract int getBytesPerDimension() throws IOException;

+  /** Returns the maximum number of points per leaf node */
+  public abstract int getMaxPointsPerLeafNode() throws IOException;

Review comment:
I hadn't thought about the grow method, nice. I removed this method from the API.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org