jiayuasu opened a new issue, #2886: URL: https://github.com/apache/sedona/issues/2886
Sub-task of #2877.

## Scope

When a DataFrame written to GeoParquet contains a `Box2D` column, emit it as a native GeoParquet 1.1 bbox covering column rather than as a generic struct.

Per the [GeoParquet 1.1 spec](https://geoparquet.org/releases/v1.1.0/), bbox covering columns are `struct<xmin: float, ymin: float, xmax: float, ymax: float>`: Float32, not Float64. Use `Math.nextUp` / `Math.nextDown` for conservative outward rounding so the Float32 bounds always contain the Float64 truth (bit-compatible with `apache/sedona-db`'s `next_after` approach in `rust/sedona-geoparquet/src/writer.rs`).

## Implementation

- In `spark/common/.../execution/datasources/geoparquet/GeoParquetWriteSupport.scala` (and supporting metadata), detect `Box2DUDT` columns and:
  - Write them at the Parquet level as `struct<xmin/ymin/xmax/ymax: float>` (downcast from Float64 with conservative rounding).
  - Surface them in the GeoParquet metadata as the bbox covering column for the associated geometry column (or as a standalone bbox if not associated).
- Tests: round-trip a DataFrame with a Box2D column through GeoParquet and confirm the on-disk schema is Float32 + the metadata declares it as a covering column.

## Out of scope

- **Reader-side auto-materialization**: Reading existing GeoParquet bbox covering columns back as `Box2D` is a separate child issue (legacy files, missing metadata, conflicting schemas warrant their own design).
- 3D bbox covering (waits for `Box3D`).
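
The conservative Float64 to Float32 downcast described under Scope could look roughly like the sketch below. It is only an illustration: the class and method names (`BboxRounding`, `floorToFloat`, `ceilToFloat`) are hypothetical and do not come from Sedona or `sedona-db`. It also steps outward only when the narrowing cast moved the value inward, which may differ from what the `next_after` code in `sedona-db` actually does.

```java
// Hypothetical helper, not part of Sedona: names are illustrative only.
final class BboxRounding {
    private BboxRounding() {}

    /** Returns a float that is <= d; intended for xmin / ymin. */
    public static float floorToFloat(double d) {
        float f = (float) d; // narrowing cast rounds to the nearest float
        // If rounding moved the value above d, step one float downward.
        return ((double) f > d) ? Math.nextDown(f) : f;
    }

    /** Returns a float that is >= d; intended for xmax / ymax. */
    public static float ceilToFloat(double d) {
        float f = (float) d; // narrowing cast rounds to the nearest float
        // If rounding moved the value below d, step one float upward.
        return ((double) f < d) ? Math.nextUp(f) : f;
    }

    public static void main(String[] args) {
        // Example Float64 bounds; the values are arbitrary.
        double xmin = -122.4194155, ymin = 37.7749295;
        double xmax = -122.4194154, ymax = 37.7749296;

        float fxmin = floorToFloat(xmin);
        float fymin = floorToFloat(ymin);
        float fxmax = ceilToFloat(xmax);
        float fymax = ceilToFloat(ymax);

        // The Float32 box must contain the Float64 box.
        boolean contains = fxmin <= xmin && fymin <= ymin
                && fxmax >= xmax && fymax >= ymax;
        System.out.println("float bbox contains double bbox: " + contains);
    }
}
```

Comparing the widened float against the original double keeps values that are exactly representable in Float32 (such as `0.5`) unchanged, so the box grows by at most one float step per side. NaN passes through unchanged because both comparisons are false.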
