szehon-ho commented on PR #10981:
URL: https://github.com/apache/iceberg/pull/10981#issuecomment-2533103710

   Update: there was a sync with @jiayuasu @flyrain @dmitrykoval @paleolimbot 
@rdblue and Menelaos, and the following was decided (meeting notes):
   
   My summary is that we decided to have two types:
   
   * geometry(crs_id) always uses linear edges but can have a geographic CRS
   * geography(crs_id, algorithm) always uses geodesic edges defined by a 
geographic CRS
   * A geography’s algorithm approximates the edges and must be used 
consistently. A spherical approximation is considered a different algorithm.
   * The crs_id is opaque, but could be srid:<srid> to select a specific SRID, 
or projjson:<property-name> to select a JSON CRS in a table property
   * Neither Parquet nor Iceberg is responsible for providing CRS definitions, 
but may include them for convenience (if they can considering copyright or 
other legal considerations)
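
   As a rough illustration of the summary above, the two types and the two 
crs_id conventions could be modeled as below. This is only a sketch: the class 
and function names (CrsRef, parse_crs_id, GeometryType, GeographyType) are 
hypothetical and are not taken from the PR or from the Iceberg or Parquet 
specs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrsRef:
    """Hypothetical parsed view of an otherwise opaque crs_id string."""
    kind: str   # "srid", "projjson", or "opaque"
    value: str  # SRID digits, table property name, or the raw crs_id


def parse_crs_id(crs_id: str) -> CrsRef:
    """Recognize the srid:<srid> and projjson:<property-name> conventions.

    Anything else is kept as an opaque identifier, since the format does not
    interpret the crs_id itself.
    """
    prefix, sep, rest = crs_id.partition(":")
    if sep and rest:
        if prefix == "srid" and rest.isdigit():
            return CrsRef("srid", rest)
        if prefix == "projjson":
            return CrsRef("projjson", rest)
    return CrsRef("opaque", crs_id)


@dataclass(frozen=True)
class GeometryType:
    """geometry(crs_id): linear edges, so no edge or algorithm parameter."""
    crs_id: str


@dataclass(frozen=True)
class GeographyType:
    """geography(crs_id, algorithm): geodesic edges computed by `algorithm`."""
    crs_id: str
    algorithm: str  # e.g. "spherical" or "vincenty"
```

   Because the algorithm is part of the type, two geography types with the 
same crs_id but different algorithms compare as different types in this sketch.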
   
   Here are the specific points I think we decided on:
   
   * Planar/linear edges are always associated with the geometry type. Geometry 
should always use linear edges.
   * Parquet and Iceberg should have a geometry type because users already 
expect the linear behavior
   * Geometry needs to support geographic CRS
   * Geometry needs a CRS parameter, but not an edge parameter
   * Geography never uses linear edges
   * Geography edges are always interpreted as edges on the spheroid defined by 
the geographic CRS (geodesics)
     * One exception: if the specified algorithm is spherical, the edges are 
geodesics (great-circle arcs) on a sphere rather than on the spheroid.
     * I think it is important to note (and specify/require) that if the 
algorithm is spherical, the radius of the underlying sphere is expected to be 
the mean radius of the spheroid specified by the CRS, where the mean radius is 
always defined as (2 * major_axis_length + minor_axis_length) / 3.
   * Geography bounding boxes must include the northmost/southmost points on 
edges
   * Geography edge calculations use a particular algorithm, which may 
introduce approximation errors (for instance, Vincenty) or may simplify the 
problem and introduce representation errors (e.g. spherical)
   * The edge calculation algorithm must be a parameter of the geography type 
(e.g. spherical, andoyer, vincenty)
   * The algorithm is set by what the writer creating the table can produce (vs 
having a default in the format)
   * Writers must not write if they cannot produce bounding boxes using the 
correct algorithm
   * Engines should reject non-geographic CRS for geography columns
   * Coordinates for geography should be limited to [-180, 180] for longitude 
and [-90, 90] for latitude.
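
   To make the mean-radius and coordinate-range points concrete, here is a 
minimal sketch. The function names are hypothetical. It assumes the axis 
lengths in the formula above are the semi-major and semi-minor axes of the 
spheroid, and that the first coordinate range applies to longitude and the 
second to latitude. The WGS 84 values are used only as an example.

```python
def mean_radius(semi_major: float, semi_minor: float) -> float:
    """Mean radius of a spheroid: (2 * a + b) / 3.

    Assumption: a and b are the semi-axis lengths, in the same unit.
    """
    return (2.0 * semi_major + semi_minor) / 3.0


def is_valid_geography_coordinate(lon: float, lat: float) -> bool:
    """True if lon is within [-180, 180] and lat is within [-90, 90]."""
    return -180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0


# Example with the WGS 84 semi-axes, in meters.
WGS84_A = 6378137.0
WGS84_B = 6356752.314245
wgs84_mean_radius = mean_radius(WGS84_A, WGS84_B)  # roughly 6371008.77 m
```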
   
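   The requirement that geography bounding boxes include the 
northmost/southmost points on edges can be sketched for the spherical 
algorithm only, where an edge is the shorter great-circle arc between its 
endpoints. The helper below is an illustration under that assumption, with a 
hypothetical name (edge_lat_range). It does not cover spheroidal algorithms 
such as Vincenty or Andoyer.

```python
import math


def _to_xyz(lon_deg, lat_deg):
    """Unit vector for a longitude/latitude pair given in degrees."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def edge_lat_range(lon1, lat1, lon2, lat2):
    """(min_lat, max_lat) in degrees along a great-circle edge on a sphere."""
    a, b = _to_xyz(lon1, lat1), _to_xyz(lon2, lat2)
    lo, hi = min(lat1, lat2), max(lat1, lat2)
    n = _cross(a, b)          # normal of the plane containing the edge
    n_len2 = _dot(n, n)
    if n_len2 < 1e-24:
        # Coincident or antipodal endpoints: the arc is not uniquely defined.
        return lo, hi
    # Project the north pole onto the plane of the great circle. The result
    # points at the circle's northmost point; its negation is the southmost.
    nz = n[2]
    p = (-nz * n[0] / n_len2, -nz * n[1] / n_len2, 1.0 - nz * nz / n_len2)
    p_len = math.sqrt(_dot(p, p))
    if p_len < 1e-12:
        return lo, hi         # edge runs along the equator
    p = tuple(c / p_len for c in p)
    for sign in (1.0, -1.0):
        q = tuple(sign * c for c in p)
        # q lies on the arc from a to b if it is "between" them in the
        # direction of rotation defined by n.
        if _dot(_cross(a, q), n) >= 0 and _dot(_cross(q, b), n) >= 0:
            lat = math.degrees(math.asin(max(-1.0, min(1.0, q[2]))))
            lo, hi = min(lo, lat), max(hi, lat)
    return lo, hi
```

   For example, an edge from (-60, 45) to (60, 45) bulges north of both 
endpoints, so a box built only from the endpoint latitudes would miss part of 
the edge.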
   
   I am updating the PR based on this.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org