syun64 commented on issue #281: URL: https://github.com/apache/iceberg-python/issues/281#issuecomment-1901252873
Makes sense @Fokko. Thank you very much for taking the time to lay out all these options for the cases where a user may have to handle schema updates. As you suggested, I think it would make sense to port over the Java code to support the replace operation.

For most cases where there are compatible schema changes, I think we would want to rely on the **union_by_name_with** function to evolve the schema.

Things definitely get a little hairy when we are handling backwards-incompatible type changes, and I think the update_schema(allow_incompatible_changes=True) case and REPLACE TABLE have similar, but slightly different, outcomes:

1. update_schema(allow_incompatible_changes=True), as the name suggests, allows schema updates with incompatible changes. This means that updates to specific fields with incompatible changes are allowed. The resulting list of fields that represents the table is still a union of the existing schema and the new schema.
2. buildReplacement from the Java code assigns fresh IDs on the new schema, but also ensures that field names that were used before keep their old field IDs. This is also an incompatible schema update, and it is more destructive in the sense that it could also delete existing columns (if they are not in the new schema) within the same transaction.

I think, depending on the agreement that the data provider has with their consumers, both types of incompatible schema changes are valid use cases (albeit awful to apply).

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
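To make the difference between the two outcomes in the comment above concrete, here is a toy sketch in plain Python. It is not PyIceberg or the Java implementation: the names `union_update` and `replace_schema`, and the representation of a schema as a dict of `name -> (field_id, type)`, are hypothetical and chosen only for illustration. It also assumes that fresh IDs continue from the highest existing field ID and that only the current schema is consulted when reusing IDs.

```python
# Toy model only: a schema is a dict mapping field name -> (field_id, type_name).
# Function names and the data layout are assumptions made for this illustration.

def _next_free_id(existing):
    """Return the first field ID above every ID already in use."""
    return max((field_id for field_id, _ in existing.values()), default=0) + 1


def union_update(existing, incoming):
    """Outcome 1: incompatible type changes are applied, but the result is
    still the union of the existing fields and the incoming fields."""
    next_id = _next_free_id(existing)
    merged = dict(existing)
    for name, new_type in incoming.items():
        if name in merged:
            field_id, _ = merged[name]
            merged[name] = (field_id, new_type)  # keep the ID, take the new type
        else:
            merged[name] = (next_id, new_type)   # brand-new field gets a fresh ID
            next_id += 1
    return merged


def replace_schema(existing, incoming):
    """Outcome 2: only incoming fields survive; names that were used before
    keep their old IDs, and fields missing from the new schema are dropped."""
    next_id = _next_free_id(existing)
    replaced = {}
    for name, new_type in incoming.items():
        if name in existing:
            replaced[name] = (existing[name][0], new_type)  # reuse the old ID
        else:
            replaced[name] = (next_id, new_type)
            next_id += 1
    return replaced


EXISTING = {"id": (1, "int"), "name": (2, "string"), "age": (3, "int")}
INCOMING = {"id": "string", "name": "string", "email": "string"}

print(union_update(EXISTING, INCOMING))
print(replace_schema(EXISTING, INCOMING))
```

In this toy example both calls change `id` from `int` to `string` while keeping field ID 1, and both give `email` a fresh ID. The difference is that `union_update` keeps `age`, whereas `replace_schema` drops it because it is absent from the new schema.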