szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1909915990
########## format/spec.md: ########## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** | Universally unique identifiers | Should use 16-byte fixed | | | **`fixed(L)`** | Fixed-length byte array of length L | | | | **`binary`** | Arbitrary-length byte array | | +| [v3](#version-3) | **`geometry(C)`** | Geometry features from [OGC – Simple feature access](https://portal.ogc.org/files/?artifact_id=25355). Edges interpolation is always linear/planar. See [Appendix G](#appendix-g-geospatial-notes). Parameterized by crs C [4]. If not specified, C is `OGC:CRS84`. | | +| [v3](#version-3) | **`geography(C, A)`** | Geometry features from [OGC – Simple feature access](https://portal.ogc.org/files/?artifact_id=25355). See [Appendix G](#appendix-g-geospatial-notes). Parameterized by crs C[5] and edge-interpolation algoritm A [6]. If not specified, C is `OGC:CRS84`. | | + Notes: 1. Timestamp values _without time zone_ represent a date and time of day regardless of zone: the time value is independent of zone adjustments (`2017-11-16 17:10:34` is always retrieved as `2017-11-16 17:10:34`). 2. Timestamp values _with time zone_ represent a point in time: values are stored as UTC and do not retain a source time zone (`2017-11-16 17:10:34 PST` is stored/retrieved as `2017-11-17 01:10:34 UTC` and these values are considered identical). 3. Character strings must be stored as UTF-8 encoded byte arrays. - +4. CRS (coordinate reference system) is a mapping of how coordinates refer to locations on Earth. See [Appendix G](#appendix-g-geospatial-notes) for specifying custom CRS. If this field is null (no custom crs provided), CRS defaults to `OGC:CRS84`, which means the data must be stored in longitude, latitude based on the WGS84 datum. Fixed and cannot be changed by schema evolution. Review Comment: Got it, removed. Took a look and didn't modify the schema evolution section, as it already lists supported type promotions, and so lack of mentioning `geography` and `geometry` implies its not supported. ########## format/spec.md: ########## @@ -603,8 +608,9 @@ Notes: 4. Position delete metadata can use `referenced_data_file` when all deletes tracked by the entry are in a single data file. Setting the referenced file is required for deletion vectors. 5. The `content_offset` and `content_size_in_bytes` fields are used to reference a specific blob for direct access to a deletion vector. For deletion vectors, these values are required and must exactly match the `offset` and `length` stored in the Puffin footer for the deletion vector blob. 6. The following field ids are reserved on `data_file`: 141. - -The `partition` struct stores the tuple of partition values for each file. Its type is derived from the partition fields of the partition spec used to write the manifest file. In v2, the partition struct's field ids must match the ids from the partition spec. +7. `geometry` and `geography`: this is a point: X, Y, Z, and M are the lower / upper bound of all objects in the file. For the X and Y values only, the lower_bound's values (xmin/ymin) may be greater than the upper_bound's value (xmax/ymax). In this X case, an object in the file may match if it contains an X such that `x >= xmin` OR `x <= xmax`, and in this Y case if `y >= ymin` OR `y <= ymax`. In geographic terminology, the concepts of `xmin`, `xmax`, `ymin`, and `ymax` are also known as `westernmost`, `easternmost`, `northernmost` and `southernmost`. +8. `geography` further restricts these points to the canonical ranges of [-180 180] for X and [-90 90] for Y. Review Comment: Added a separate paragraph. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org