jiayuasu opened a new issue, #2886:
URL: https://github.com/apache/sedona/issues/2886

   Sub-task of #2877.
   
   ## Scope
   
   When a DataFrame written to GeoParquet contains a `Box2D` column, emit it as 
a native GeoParquet 1.1 bbox covering column rather than as a generic struct.
   
   Per the [GeoParquet 1.1 spec](https://geoparquet.org/releases/v1.1.0/), bbox covering columns are `struct<xmin: float, ymin: float, xmax: float, ymax: float>`, i.e. Float32, not Float64. Use `Math.nextUp` / `Math.nextDown` for conservative outward rounding (`xmin`/`ymin` rounded down, `xmax`/`ymax` rounded up) so the Float32 bounds always contain the Float64 truth (bit-compatible with `apache/sedona-db`'s `next_after` approach in `rust/sedona-geoparquet/src/writer.rs`).
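
The outward rounding described above could be sketched roughly as follows. This is a minimal illustration, not the Sedona implementation: the class name `ConservativeFloat` and the helpers `roundDown` / `roundUp` are hypothetical. It also assumes the step is applied only when the round-to-nearest cast lands on the wrong side of the Float64 value. Whether the actual writer (or `sedona-db`) steps conditionally or unconditionally is not stated in this issue.

```java
// Hypothetical helper, for illustration only (not part of Sedona).
final class ConservativeFloat {
    private ConservativeFloat() {}

    /** Returns a float that is less than or equal to the given double (for xmin / ymin). */
    public static float roundDown(double value) {
        float f = (float) value; // narrowing cast rounds to nearest
        // If the cast landed above the true value, step one ULP toward -infinity.
        return ((double) f > value) ? Math.nextDown(f) : f;
    }

    /** Returns a float that is greater than or equal to the given double (for xmax / ymax). */
    public static float roundUp(double value) {
        float f = (float) value; // narrowing cast rounds to nearest
        // If the cast landed below the true value, step one ULP toward +infinity.
        return ((double) f < value) ? Math.nextUp(f) : f;
    }
}
```

Values that are exactly representable as Float32 pass through unchanged, so a box whose bounds are already Float32-exact is not widened. NaN inputs stay NaN because both comparisons are false.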
   
   ## Implementation
   
   - In `spark/common/.../execution/datasources/geoparquet/GeoParquetWriteSupport.scala` (and supporting metadata), detect `Box2DUDT` columns and:
     - Write them at the Parquet level as `struct<xmin/ymin/xmax/ymax: float>` (downcast from Float64 with conservative rounding).
     - Surface them in the GeoParquet metadata as the bbox covering column for the associated geometry column (or as a standalone bbox if not associated).
   - Tests: round-trip a DataFrame with a Box2D column through GeoParquet and confirm that the on-disk schema is Float32 and that the metadata declares it as a covering column.
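
For reference, the metadata side of the steps above might look like the following sketch. It builds the `covering` fragment that GeoParquet 1.1 attaches to a geometry column's metadata entry, where each bound is a path of the form `[<bbox column>, <field>]`. The class `CoveringMetadata` and the method `coveringJson` are hypothetical names for illustration. How Sedona actually assembles its metadata JSON is not described in this issue.

```java
// Hypothetical helper, for illustration only (not part of Sedona).
final class CoveringMetadata {
    private CoveringMetadata() {}

    /**
     * Builds the "covering" JSON value for a geometry column whose bounds live in
     * the struct column named bboxColumn. The column name is not JSON-escaped here,
     * so this sketch assumes a plain identifier.
     */
    public static String coveringJson(String bboxColumn) {
        String[] fields = {"xmin", "ymin", "xmax", "ymax"};
        StringBuilder sb = new StringBuilder("{\"bbox\":{");
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            // Each bound is addressed as a path: [<struct column>, <field name>].
            sb.append('"').append(fields[i]).append("\":[\"")
              .append(bboxColumn).append("\",\"")
              .append(fields[i]).append("\"]");
        }
        return sb.append("}}").toString();
    }
}
```

A round-trip test could read back the file-level `geo` metadata and compare the geometry column's `covering` entry against this fragment, alongside a check that the four struct fields have the Parquet `FLOAT` physical type.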
   
   ## Out of scope
   
   - **Reader-side auto-materialization**: reading existing GeoParquet bbox covering columns back as `Box2D` is a separate child issue (legacy files, missing metadata, and conflicting schemas warrant their own design).
   - 3D bbox covering (waits for `Box3D`).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
