c-thiel commented on issue #591:
URL: https://github.com/apache/iceberg-rust/issues/591#issuecomment-2474101258

   This is not so much about this specific use case, which I don't care much 
about either, but about having two different representations for the same 
entity. It took me a while to formulate it this clearly.
   There are many lines of code in different files just to work around this 
problem.
   
   Let's assume, for example, that we want to add a field to a schema. The 
dot-separated representation in the `PartitionSpec` is then not compatible with 
the one in the `Schema`. Java would just throw an exception in this case. Because 
we have two distinct representations for the same entity, and those 
representations are not bijective, we need extra code to handle conversion.
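
   To make the "not bijective" point concrete, here is a minimal sketch. Both 
helpers are hypothetical and not part of iceberg-rust; the only assumption is 
that nested field names are joined with `.` to form the string representation.

```rust
/// Hypothetical helper: joins the components of a nested field path
/// into the dot-separated string form.
fn to_dotted(path: &[&str]) -> String {
    path.join(".")
}

/// Hypothetical inverse: splitting on every dot is the best we can do,
/// and it cannot recover a component that itself contains a dot.
fn from_dotted(name: &str) -> Vec<String> {
    name.split('.').map(str::to_string).collect()
}
```

   With these helpers, the nested path `["a", "b"]` and the single top-level 
name `["a.b"]` both map to the same string, and `from_dotted` always returns 
the two-component path, so a round trip loses the original structure.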
   
   
https://github.com/apache/iceberg/blob/e06b069529be3d3d389b156646e751de3753feb0/core/src/main/java/org/apache/iceberg/SchemaUpdate.java#L97-L103
   also 
   
https://github.com/apache/iceberg/blob/e06b069529be3d3d389b156646e751de3753feb0/core/src/main/java/org/apache/iceberg/SchemaUpdate.java#L112
   several lines of docs here:
   
https://github.com/apache/iceberg/blob/e06b069529be3d3d389b156646e751de3753feb0/api/src/main/java/org/apache/iceberg/UpdateSchema.java#L54-L58
   
   We even feel the need to document in [some 
places](https://github.com/apache/iceberg/blob/e06b069529be3d3d389b156646e751de3753feb0/api/src/main/java/org/apache/iceberg/UpdateSchema.java#L95)
 that a dot is nothing special here:
   ```
    * <p>The given name is used to name the new column and names containing "." are not handled
    * differently.
   ```
    
   All of this could be solved by using a globally unique identifier for a 
field, which for me is very clearly `Vec<String>`, and converting to a 
string-only representation only at the last moment.
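
   A rough sketch of what I have in mind follows. `FieldPath` and `build_index` 
are hypothetical names used only for illustration, not existing iceberg-rust 
types, and the field ids are made up. The path stays structured for lookups 
and equality; the dotted string is produced only by `Display`.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical identifier: one entry per nesting level.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct FieldPath(Vec<String>);

impl FieldPath {
    fn new<I, S>(parts: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        FieldPath(parts.into_iter().map(Into::into).collect())
    }
}

impl fmt::Display for FieldPath {
    /// Lossy conversion to the dotted form, meant for output only.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0.join("."))
    }
}

/// Hypothetical index from structured path to field id (ids are invented).
fn build_index() -> HashMap<FieldPath, i32> {
    let mut index = HashMap::new();
    // Nested field `b` inside struct `a`.
    index.insert(FieldPath::new(["a", "b"]), 1);
    // Top-level field whose name contains a dot.
    index.insert(FieldPath::new(["a.b"]), 2);
    index
}
```

   Both entries print as `a.b`, but they remain distinct keys in the index, so 
no extra conversion or validation code is needed until a string is actually 
required.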
    
   I stumbled across this again when implementing `SchemaUpdate`. So let's 
forget about the use case, but maybe still consider harmonizing identifiers for 
a field.
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

