Kontinuation commented on code in PR #10981:
URL: https://github.com/apache/iceberg/pull/10981#discussion_r1793717159


##########
format/spec.md:
##########
@@ -1286,6 +1291,7 @@ This serialization scheme is for storing single values as individual binary valu
 | **`struct`**                 | Not supported                                 |
 | **`list`**                   | Not supported                                 |
 | **`map`**                    | Not supported                                 |
+| **`geometry`**               | A single point, encoded as a {x, y, optional z, optional m} concatenation of its 8-byte IEEE 754 values, little-endian. |

Review Comment:
   Is it always a concatenation of 4 floating-point values? If not, we'll have 
a hard time figuring out whether the point is XYZ or XYM when there are 3 
encoded dimensions. I suggest we use the WKB encoding of points here as well.
   
   Requiring all 4 components to be present, and allowing NaN for the optional 
components, also works, and it is closer to the `BoundingBox` struct defined 
in the Parquet spec.
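
To make the ambiguity concrete, here is a minimal sketch in Python. The helper names (`encode_variable`, `encode_fixed`, `decode_fixed`) are hypothetical and not part of any Iceberg or Parquet API. It assumes the little-endian 8-byte IEEE 754 layout from the proposed spec text, and it assumes that an absent `z` or `m` is written as NaN in the fixed-width variant.

```python
import math
import struct

NAN = float("nan")


def encode_variable(x, y, z=None, m=None):
    """Variable-length layout: only the components that are present are written."""
    values = [x, y] + [v for v in (z, m) if v is not None]
    return struct.pack(f"<{len(values)}d", *values)


def encode_fixed(x, y, z=None, m=None):
    """Fixed layout (assumed): always 4 little-endian doubles, NaN marks an absent z or m."""
    return struct.pack(
        "<4d", x, y, NAN if z is None else z, NAN if m is None else m
    )


def decode_fixed(buf):
    """Inverse of encode_fixed: NaN in the z or m slot is read back as None."""
    x, y, z, m = struct.unpack("<4d", buf)
    return (x, y, None if math.isnan(z) else z, None if math.isnan(m) else m)


# With the variable-length layout, an XYZ point and an XYM point with the
# same third value produce identical bytes, so a reader cannot tell them apart.
ambiguous = encode_variable(1.0, 2.0, z=3.0) == encode_variable(1.0, 2.0, m=3.0)

# With the fixed layout, the position of the NaN disambiguates the two.
xyz = decode_fixed(encode_fixed(1.0, 2.0, z=3.0))
xym = decode_fixed(encode_fixed(1.0, 2.0, m=3.0))
```

The fixed layout always costs 32 bytes per point, but the reader never needs out-of-band information about which dimensions are present.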



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

