syun64 commented on issue #281:
URL: https://github.com/apache/iceberg-python/issues/281#issuecomment-1901252873

   Makes sense @Fokko. Thank you very much for taking the time to lay all 
these options out for the cases where a user may have to handle schema updates.
   
   As you suggested, I think it would make sense to port over the Java code to 
support the replace operation.
   
   For most cases where there are compatible schema changes, I think we would 
want to rely on the **union_by_name_with** function to evolve the schema. 
   
   Things definitely get a little hairy when we are handling 
backwards-incompatible type changes, and I think the 
update_schema(allow_incompatible_changes=True) case and REPLACE TABLE have 
similar, but slightly different, outcomes:
   1. update_schema(allow_incompatible_changes=True), as the name suggests, 
allows schema updates with incompatible changes. This means that updates to 
specific fields with incompatible changes are allowed. The resulting list of 
fields that represents the table is still a union of the existing schema and 
the new schema.
   2. buildReplacement from the Java code assigns fresh IDs to the new schema, 
but also ensures that fields whose names were used before keep their old field 
IDs. This is also an incompatible schema update, and it is more destructive in 
the sense that it can also delete existing columns (if they are not in the new 
schema) within the same transaction.
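
   To make the difference between the two outcomes concrete, here is a rough 
sketch in plain Python. It is a toy model, not the PyIceberg or Java API: 
schemas are flat dicts of name -> (field_id, type), and the helper names 
`union_with_incompatible_changes` and `build_replacement` are made up for 
illustration. Nested fields, type promotion rules, and required/optional 
handling are all ignored.

```python
from itertools import count


def _fresh_ids(existing):
    # Hypothetical helper: new IDs start after the highest ID already in use.
    return count(max((fid for fid, _ in existing.values()), default=0) + 1)


def union_with_incompatible_changes(existing, requested):
    """Case 1 (toy model): result is a union of the old and new fields.

    Types of existing fields may be overwritten, even incompatibly,
    but no existing column is dropped.
    """
    fresh = _fresh_ids(existing)
    result = dict(existing)
    for name, new_type in requested.items():
        if name in result:
            field_id, _ = result[name]
            result[name] = (field_id, new_type)  # keep ID, change type
        else:
            result[name] = (next(fresh), new_type)
    return result


def build_replacement(existing, requested):
    """Case 2 (toy model): only the requested fields survive.

    Names seen before keep their old field IDs, new names get fresh IDs,
    and columns missing from the requested schema are dropped.
    """
    fresh = _fresh_ids(existing)
    result = {}
    for name, new_type in requested.items():
        if name in existing:
            result[name] = (existing[name][0], new_type)  # reuse old ID
        else:
            result[name] = (next(fresh), new_type)
    return result


existing = {"id": (1, "int"), "name": (2, "string")}
requested = {"id": "string", "email": "string"}  # int -> string is incompatible

unioned = union_with_incompatible_changes(existing, requested)
replaced = build_replacement(existing, requested)

print(unioned)   # "name" is still present
print(replaced)  # "name" has been dropped
```

   In both cases `id` keeps field ID 1 and `email` gets a fresh ID, but only 
the replacement drops `name`, which is what makes it the more destructive of 
the two.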
   
   I think depending on the agreement that the data provider has with their 
consumers, both types of incompatible schema changes are valid use cases 
(albeit awful to apply).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

