desruisseaux commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1806309872
########## format/spec.md: ##########
@@ -483,6 +485,8 @@ Notes:
 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the IEEE 754 `totalOrder` predicate. NaNs are not permitted as lower or upper bounds.
 3. If sort order ID is missing or unknown, then the order is assumed to be unsorted. Only data files and equality delete files should be written with a non-null order id. [Position deletes](#position-delete-files) are required to be sorted by file and position, not a table order, and should set sort order id to null. Readers must ignore sort order id for position delete files.
 4. The following field ids are reserved on `data_file`: 141.
+5. For `geometry`, this is a point. X = min value of all component points of all geometries in file when edges = PLANAR, westernmost bound of all geometries in file if edges = SPHERICAL. Y = max value of all component points of all geometries in file when edges = PLANAR, northernmost bound of all geometries if edges = SPHERICAL. Z is min value for all component points of all geometries in the file. M is min value of all component points of all geometries in the file. See Appendix D for encoding.

Review Comment:

On the "geodesics" versus "pseudo-geodesics" names, I do not have a strong opinion at this time. The rationale for "pseudo-geodesic" is that current implementations are cheating: they do not compute the real geodesic. Instead, they pretend that the ellipsoid is a sphere (e.g., they ignore WGS84 flattening) and apply the formulas for a sphere.

When we pretend that the ellipsoid is a sphere, the results are of course a little bit wrong, but there are some ways to make them "less wrong". For example, some map projections convert the _geodetic latitudes_ to _authalic latitudes_. An _authalic sphere_ is a sphere with the same surface area as the ellipsoid. An _authalic latitude_ is the latitude that a point would have if we deformed the ellipsoid until it took the shape of the authalic sphere.
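To make the geodetic-to-authalic conversion concrete, here is a minimal Python sketch based on the standard closed-form expression (the `q` function found in map projection references such as Snyder). The function names and the default WGS84 flattening are my own choices for illustration; nothing here is taken from Iceberg, S2, or any existing library API.

```python
import math

# WGS84 flattening, assumed here as the default ellipsoid for illustration.
WGS84_FLATTENING = 1.0 / 298.257223563


def _q(sin_phi: float, e: float) -> float:
    """Area-related q function for an ellipsoid of eccentricity e (must be > 0)."""
    es = e * sin_phi
    return (1.0 - e * e) * (
        sin_phi / (1.0 - es * es)
        - math.log((1.0 - es) / (1.0 + es)) / (2.0 * e)
    )


def authalic_latitude_deg(geodetic_lat_deg: float,
                          flattening: float = WGS84_FLATTENING) -> float:
    """Convert a geodetic latitude (degrees) to an authalic latitude (degrees).

    Hypothetical helper: assumes flattening > 0 (a true sphere would need
    a special case because of the division by the eccentricity).
    """
    e = math.sqrt(flattening * (2.0 - flattening))  # first eccentricity
    q = _q(math.sin(math.radians(geodetic_lat_deg)), e)
    q_pole = _q(1.0, e)  # value of q at the pole (latitude 90 degrees)
    ratio = max(-1.0, min(1.0, q / q_pole))  # guard against rounding
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    for lat in (0.0, 30.0, 45.0, 60.0, 90.0):
        print(lat, authalic_latitude_deg(lat))
```

With these assumptions, the equator and the poles map to themselves, and the largest shift is near 45° of latitude, where the authalic latitude is smaller than the geodetic one by roughly 0.13°.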
Therefore, for each point in a geometry, the numerical value of the _authalic latitude_ is slightly different from that of the _geodetic latitude_. If an application converts all geodetic latitudes to authalic latitudes before performing computations with libraries that assume a sphere, such as S2, then even if the shapes are a little bit wrong, the surface areas are closer to reality. Conversely, if someone is more interested in shapes than in surface areas, they may use _conformal latitudes_ instead. See for example the list of [auxiliary latitudes on MathWorld](https://mathworld.wolfram.com/AuxiliaryLatitude.html).

In summary, even if Iceberg doesn't dive into all this complexity for now, future versions may want to add something like `AUTHALIC_GEODESIC` (I don't know if putting the terms "authalic" and "geodesic" together is orthodox; we would need to verify usage) for geodesics computed on a sphere but with geodetic latitudes converted to authalic latitudes. The same applies to other kinds of latitude, such as conformal.

The current implementations just ignore all this complexity: they use the latitudes _as-is_ and plug the values directly into the spherical formulas, without bothering about the fact that this is not quite correct. This is exactly what the "Google Mercator" projection does. EPSG calls it the "Popular Visualisation Pseudo Mercator" projection. Hence the proposal for "pseudo" in "pseudo-geodesic", by analogy with "pseudo-Mercator".

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@iceberg.apache.org