Jackie-Jiang commented on code in PR #8498:
URL: https://github.com/apache/pinot/pull/8498#discussion_r859097338


##########
pinot-core/src/main/java/org/apache/pinot/core/plan/FilterPlanNode.java:
##########
@@ -56,6 +59,10 @@
 
 
 public class FilterPlanNode implements PlanNode {
+
+  private static final Set<String> CAN_APPLY_H3_INCLUSION_INDEX_FUNCTION_NAMES =
+      ImmutableSet.of("st_within", "stwithin", "st_contains", "stcontains");

Review Comment:
   (minor) No need to check `st_within` and `st_contains`, because the function 
name is already canonicalized by this point (we keep 2 function names for 
`st_distance` only for backward compatibility). We can do a direct equals check 
in `canApplyH3IndexForInclusionCheck()` instead of using a set for 2 values.
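
   A minimal sketch of what the direct check could look like. It assumes the 
canonical form of a function name is lowercase with underscores removed 
(`stwithin`, `stcontains`). The class and method names below are hypothetical 
stand-ins for illustration, not code from this PR.

   ```java
   public class InclusionFunctionCheck {
     // Assumed canonical names: lowercase, underscores already stripped.
     static final String ST_WITHIN = "stwithin";
     static final String ST_CONTAINS = "stcontains";

     // Hypothetical stand-in for the function-name part of
     // canApplyH3IndexForInclusionCheck(): two equals checks, no set lookup.
     static boolean isInclusionFunction(String canonicalName) {
       return ST_WITHIN.equals(canonicalName) || ST_CONTAINS.equals(canonicalName);
     }

     public static void main(String[] args) {
       System.out.println(isInclusionFunction("stwithin"));
       System.out.println(isInclusionFunction("stcontains"));
       // A non-canonical spelling is rejected by this check.
       System.out.println(isInclusionFunction("st_within"));
     }
   }
   ```

   With only two names, the equals checks also make it obvious which function 
is being handled in each branch further down.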



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/H3InclusionIndexFilterOperator.java:
##########
@@ -0,0 +1,145 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator.filter;
+
+import it.unimi.dsi.fastutil.longs.LongSet;
+import java.util.Collections;
+import java.util.List;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.pinot.common.request.context.ExpressionContext;
+import org.apache.pinot.common.request.context.predicate.EqPredicate;
+import org.apache.pinot.common.request.context.predicate.Predicate;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.FilterBlock;
+import org.apache.pinot.core.operator.dociditerators.ScanBasedDocIdIterator;
+import org.apache.pinot.core.operator.docidsets.BitmapDocIdSet;
+import org.apache.pinot.segment.local.utils.GeometrySerializer;
+import org.apache.pinot.segment.local.utils.H3Utils;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.index.reader.H3IndexReader;
+import org.apache.pinot.spi.utils.BooleanUtils;
+import org.apache.pinot.spi.utils.BytesUtils;
+import org.locationtech.jts.geom.Geometry;
+import org.roaringbitmap.buffer.BufferFastAggregation;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+
+/**
+ * A filter operator that uses H3 index for geospatial data inclusion
+ */
+public class H3InclusionIndexFilterOperator extends BaseFilterOperator {
+
+  private static final String EXPLAIN_NAME = "INCLUSION_FILTER_H3_INDEX";
+
+  private final IndexSegment _segment;
+  private final Predicate _predicate;
+  private final int _numDocs;
+  private final H3IndexReader _h3IndexReader;
+  private final Geometry _geometry;
+  private final boolean _isPositiveCheck;
+
+  public H3InclusionIndexFilterOperator(IndexSegment segment, Predicate predicate, int numDocs) {
+    _segment = segment;
+    _predicate = predicate;
+    _numDocs = numDocs;
+
+    List<ExpressionContext> arguments = predicate.getLhs().getFunction().getArguments();
+    EqPredicate eqPredicate = (EqPredicate) predicate;
+    _isPositiveCheck = BooleanUtils.toBoolean(eqPredicate.getValue());
+
+    if (arguments.get(0).getType() == ExpressionContext.Type.IDENTIFIER) {
+      _h3IndexReader = segment.getDataSource(arguments.get(0).getIdentifier()).getH3Index();
+      _geometry = GeometrySerializer.deserialize(BytesUtils.toBytes(arguments.get(1).getLiteral()));
+    } else {
+      _h3IndexReader = segment.getDataSource(arguments.get(1).getIdentifier()).getH3Index();
+      _geometry = GeometrySerializer.deserialize(BytesUtils.toBytes(arguments.get(0).getLiteral()));
+    }
+    // must be some h3 index
+    assert _h3IndexReader != null : "the column must have H3 index setup.";
+  }
+
+  @Override
+  protected FilterBlock getNextBlock() {
+    // get the set of H3 cells at the specified resolution which completely cover the input shape and potential cover.
+    final Pair<LongSet, LongSet> fullCoverAndPotentialCoverCells =
+        H3Utils.coverGeometryInH3(_geometry, _h3IndexReader.getH3IndexResolution().getLowestResolution());
+    final LongSet fullyCoverH3Cells = fullCoverAndPotentialCoverCells.getLeft();
+    final LongSet potentialCoverH3Cells = fullCoverAndPotentialCoverCells.getRight();
+
+    // have list of h3 cell ids for polygon provided
+    // return filtered num_docs
+    ImmutableRoaringBitmap[] potentialMatchDocIds = new ImmutableRoaringBitmap[potentialCoverH3Cells.size()];
+    int i = 0;
+    for (long h3IndexId : potentialCoverH3Cells) {

Review Comment:
   Use the iterator (`LongIterator.nextLong()`) instead to avoid unnecessary 
boxing/unboxing. The same applies to the other places that iterate over a 
`LongSet`.
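
   To illustrate the difference, here is a small self-contained sketch. It uses 
the JDK's `PrimitiveIterator.OfLong` as a stand-in for fastutil's 
`LongIterator` so that it runs without extra dependencies; the class and method 
names are hypothetical. In the operator itself the loop would presumably take 
the shape `LongIterator it = potentialCoverH3Cells.iterator(); while 
(it.hasNext()) { long h3IndexId = it.nextLong(); ... }`.

   ```java
   import java.util.PrimitiveIterator;
   import java.util.stream.LongStream;

   public class PrimitiveLongIteration {
     // Reads each element through nextLong(), which returns a primitive long.
     static long sumPrimitive(PrimitiveIterator.OfLong it) {
       long sum = 0;
       while (it.hasNext()) {
         sum += it.nextLong();
       }
       return sum;
     }

     // Reads each element through next(), which returns a boxed Long that is
     // unboxed again on use. A for-each loop goes through this path.
     static long sumBoxed(PrimitiveIterator.OfLong it) {
       long sum = 0;
       while (it.hasNext()) {
         Long boxed = it.next();
         sum += boxed;
       }
       return sum;
     }

     public static void main(String[] args) {
       System.out.println(sumPrimitive(LongStream.of(1L, 2L, 3L).iterator()));
       System.out.println(sumBoxed(LongStream.of(1L, 2L, 3L).iterator()));
     }
   }
   ```

   Both loops compute the same result; only the primitive variant avoids the 
intermediate `Long` objects.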



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/H3InclusionIndexFilterOperator.java:
##########
@@ -0,0 +1,145 @@
+  @Override
+  protected FilterBlock getNextBlock() {
+    // get the set of H3 cells at the specified resolution which completely cover the input shape and potential cover.
+    final Pair<LongSet, LongSet> fullCoverAndPotentialCoverCells =

Review Comment:
   (minor) We don't usually put `final` on local variables



##########
pinot-core/src/main/java/org/apache/pinot/core/plan/FilterPlanNode.java:
##########
@@ -139,6 +146,46 @@ private boolean canApplyH3Index(Predicate predicate, FunctionContext function) {
     return columnName != null && _indexSegment.getDataSource(columnName).getH3Index() != null && findLiteral;
   }
 
+  /**
+   * H3 index can be applied for inclusion check iff:
+   * <ul>
+   *   <li>Predicate is of type EQ</li>
+   *   <li>Left-hand-side of the predicate is an ST_Within or ST_Contains function</li>
+   *   <li>For ST_Within, the first argument is an identifier, the second argument is literal</li>
+   *   <li>For ST_Contains function the first argument is literal, the second argument is an identifier</li>
+   *   <li>The identifier column has H3 index</li>
+   * </ul>
+   */
+  private boolean canApplyH3IndexForInclusionCheck(Predicate predicate, FunctionContext function) {
+    if (predicate.getType() != Predicate.Type.EQ) {
+      return false;
+    }
+    String functionName = function.getFunctionName();
+    if (!CAN_APPLY_H3_INCLUSION_INDEX_FUNCTION_NAMES.contains(functionName)) {
+      return false;
+    }
+    List<ExpressionContext> arguments = function.getArguments();
+    if (arguments.size() != 2) {
+      throw new BadQueryRequestException("Expect 2 arguments for function: " + functionName);
+    }
+    // TODO: handle nested geography/geometry conversion functions
+    if (functionName.equals("st_within") || functionName.equals("stwithin")) {
+      if (arguments.get(0).getType() == ExpressionContext.Type.IDENTIFIER
+          && arguments.get(1).getType() == ExpressionContext.Type.LITERAL) {
+        String columnName = arguments.get(0).getIdentifier();
+        return columnName != null && _indexSegment.getDataSource(columnName).getH3Index() != null;

Review Comment:
   (minor) `columnName` will never be `null`
   ```suggestion
           return _indexSegment.getDataSource(columnName).getH3Index() != null;
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/H3Utils.java:
##########
@@ -35,4 +52,66 @@ private H3Utils() {
       throw new RuntimeException("Failed to instantiate H3 instance", e);
     }
   }
+
+  private static LongSet coverLineInH3(LineString lineString, int resolution) {
+    LongSet coveringH3Cells = new LongOpenHashSet();
+    LongList endpointH3Cells = new LongArrayList();
+    for (Coordinate endpoint : lineString.getCoordinates()) {
+      endpointH3Cells.add(H3_CORE.geoToH3(endpoint.y, endpoint.x, resolution));
+    }
+    for (int i = 0; i < endpointH3Cells.size() - 1; i++) {
+      try {
+        coveringH3Cells.addAll(H3_CORE.h3Line(endpointH3Cells.getLong(i), endpointH3Cells.getLong(i + 1)));
+      } catch (LineUndefinedException e) {
+        throw new RuntimeException(e);
+      }
+    }
+    return coveringH3Cells;
+  }
+
+  private static Pair<LongSet, LongSet> coverPolygonInH3(Polygon polygon, int resolution) {
+    LongSet potentialH3Cells = coverLineInH3(polygon.getExteriorRing(), resolution);
+
+    // TODO: this can be further optimized to use native H3 implementation. They have plan to support natively.
+    // https://github.com/apache/pinot/issues/8547
+    LongSet polyfilledSet = new LongOpenHashSet(H3_CORE.polyfill(
+        Arrays.stream(polygon.getExteriorRing().getCoordinates())
+            .map(coordinate -> new GeoCoord(coordinate.y, coordinate.x))
+            .collect(Collectors.toList()), ImmutableList.of(), resolution));

Review Comment:
   (minor)
   ```suggestion
               .collect(Collectors.toList()), Collections.emptyList(), resolution));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

