szehon-ho opened a new issue, #10260: URL: https://github.com/apache/iceberg/issues/10260
### Proposed Change

(This is an abridged version of the proposal document)

Big data open source projects have been used to store and analyze geospatial data for a long time, and a flourishing ecosystem has evolved. Examples are GeoParquet for Parquet, Apache Sedona for Spark, GeoMesa for HBase and Cassandra, and native support that has been added or is in development in Hive and Trino.

Given the central position of Apache Iceberg, it would be great for it to support geospatial data natively as well. There have been additions of geospatial support to Apache Iceberg (Geolake and Havasu) that show promising results. Unfortunately, because Iceberg lacks extension points, these have taken the form of forks of the project. It would be great to bring the efforts and findings of these projects into Iceberg.

This will add the following to the Iceberg project:

- Geospatial types (e.g., point, linestring, polygon)
- Geospatial expressions (st_covers, st_covered_by, st_intersects)
- Geospatial partition transforms (XZ2)
- Geospatial sort (hilbert)
- Spark integration support

This will allow the following use cases:

- Create a table with a geospatial type
  ```
  CREATE TABLE geom_table (geom GEOMETRY);
  ```
- Insert geospatial data
  ```
  INSERT INTO geom_table VALUES ('POINT(1 2)'), ('LINESTRING(1 2, 3 4)')
  ```
- Query using geospatial predicates
  ```
  SELECT * FROM geom_table WHERE ST_COVERS(geom, ST_POINT(0.5, 0.5))
  ```
- Define a geospatial partition transform to allow partition filtering for geospatial queries
  ```
  ALTER TABLE geom_table ADD PARTITION FIELD (xz2(geom))
  ```
- Rewrite using a geospatial sort order to allow file and row-group filtering for geospatial queries
  ```
  CALL rewrite_data_files(table => 'geom_table', sort_order => 'hilbert(geom)')
  ```

### Proposal document

https://docs.google.com/document/d/1iVFbrRNEzZl8tDcZC81GFt01QJkLJsI9E2NBOt21IRI

### Specifications

- [X] Table
- [ ] View
- [ ] REST
- [ ] Puffin
- [ ] Encryption
- [ ] Other

--
This is an automated message from the Apache Git Service.