Fokko commented on issue #281:
URL: https://github.com/apache/iceberg-python/issues/281#issuecomment-1901165986

   > Evolving schema would be nice, also would it be possible to evolve 
partitions? e.g. after a specific overwrite I want to have schema evolution and 
possibility to replace the table adding partitions to specific columns.
   
   Schema evolution is [already possible 
today](https://py.iceberg.apache.org/api/#schema-evolution). What's still 
missing is a `unionByName`-style method that promotes the table schema to a 
new schema:
   
   ```python
   with table.update_schema() as update:
       update.union_by_name_with(new_schema)
   ```
   
   This will promote the schema to the new schema if compatible. If you also 
want to allow incompatible changes, you can do:
   
   ```python
   with table.update_schema(allow_incompatible_changes=True) as update:
       update.union_by_name_with(new_schema)
   ```
   
   Making this explicit ensures that you don't accidentally break the table 
for downstream consumers. If you also want to add partition fields in the same 
transaction, you can do:
   
   ```python
   from pyiceberg.transforms import DayTransform

   with table.transaction() as tx:
       with tx.update_schema() as update:
           update.union_by_name_with(new_schema)
       with tx.update_spec() as update:
           update.add_field("dt_transaction", "transaction", DayTransform())
   ```
   
   Note: Support for updating partitions is still in progress 
(https://github.com/apache/iceberg-python/pull/245/), but it is expected to be 
included in 0.6.0.
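
   To make the "promote if compatible" behavior described above more concrete, 
here is a rough pure-Python sketch of union-by-name semantics. It is a toy 
model, not PyIceberg's implementation: schemas are plain dicts of column name 
to type name, the `union_by_name` helper and the `ALLOWED_PROMOTIONS` table are 
hypothetical, and the flag here simply skips the compatibility check, which may 
differ from what the real `allow_incompatible_changes` permits.

```python
# Illustrative only: a toy model of union-by-name, NOT PyIceberg's implementation.
# Schemas are modeled as plain dicts mapping column name -> type name.

# Two widening promotions for primitive types (assumed sufficient for this sketch).
ALLOWED_PROMOTIONS = {("int", "long"), ("float", "double")}


def union_by_name(current, new, allow_incompatible_changes=False):
    """Return a copy of `current` extended and promoted so that it covers `new`."""
    merged = dict(current)  # existing columns keep their position
    for name, new_type in new.items():
        old_type = merged.get(name)
        if old_type is None:
            # Column exists only in `new`: append it.
            merged[name] = new_type
        elif old_type == new_type:
            # Same name and type: nothing to do.
            continue
        elif (old_type, new_type) in ALLOWED_PROMOTIONS or allow_incompatible_changes:
            # Safe widening, or the caller explicitly opted in to a breaking change.
            merged[name] = new_type
        else:
            raise ValueError(f"Cannot change {name}: {old_type} -> {new_type}")
    return merged


current = {"id": "int", "amount": "float"}
new = {"id": "long", "amount": "float", "dt_transaction": "date"}
evolved = union_by_name(current, new)
```

   In this sketch `id` is widened from `int` to `long` and `dt_transaction` is 
appended, while a change such as `long` to `string` raises unless the caller 
passes `allow_incompatible_changes=True`.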
   
   > I'm wondering if it would be better to have a separate function that 
achieves these goals in a single transaction?
   
   We could do something similar to `CREATE OR REPLACE TABLE` in Spark. It 
isn't just a DROP/CREATE, since it will create a new schema with new field-ids 
and such. Please refer to the [Java 
code](https://github.com/apache/iceberg/blob/e32df0ce08086758c44e9174c582638068244073/core/src/main/java/org/apache/iceberg/TableMetadata.java#L672).
 This might be more applicable if you just want to replace everything instead 
of evolving the existing table.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

