Kontinuation commented on code in PR #14101:
URL: https://github.com/apache/iceberg/pull/14101#discussion_r2622371223
##########
api/src/main/java/org/apache/iceberg/expressions/Literals.java:
##########
@@ -719,4 +722,45 @@ public String toString() {
return "X'" + BaseEncoding.base16().encode(bytes) + "'";
}
}
+
+ static class BoundingBoxLiteral extends BaseLiteral<ByteBuffer> {
+ private static final Comparator<ByteBuffer> CMP =
+
Comparators.<ByteBuffer>nullsFirst().thenComparing(Comparators.unsignedBytes());
+
+ BoundingBoxLiteral(BoundingBox value) {
+ super(value.toByteBuffer());
+ }
+
+ BoundingBoxLiteral(ByteBuffer value) {
+ super(value);
+ }
+
+ @Override
+ protected Type.TypeID typeId() {
+ return Type.TypeID.GEOMETRY;
Review Comment:
I decided to switch to approach 1 and remove the typeId() method from
`BoundingBoxLiteral`. Here are the reasons:
1. If `BoundingBox` carries typeId, it should also carry the full fledged
type information including CRS and algorithm. Otherwise it would be strange
that only the geospatial type ID is explicitly specified while other type
attributes inherit from the type of the field.
2. Iceberg needs to check that the type of the field and the bounding box
matches with each other when binding and evaluating spatial predicates. The CRS
and algorithm also need to match.
3. Once the query engine checked that the geospatial field is supported by
the engine and decided to generate a bounding box for pushing down a spatial
predicate, it needs extract the geospatial type attributes from the Iceberg
schema to construct the `BoundingBox`. The CRS attribute is tightly coupled
with the iceberg table since it is the name of the table property defining the
full CRS in PROJJSON, so there's a high chance that the engine will extract
field type in iceberg table schema directly use it to construct the bounding
box. Inferring the type of BoundingBox according to the context would eliminate
the need for this extra step.
4. The untyped BoundingBox looks concise in the REST Catalog SPEC:
https://github.com/apache/iceberg/pull/14856. Adding type information to the
bounding box would make the data model more cluttered.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]