syun64 commented on issue #281:
URL: https://github.com/apache/iceberg-python/issues/281#issuecomment-1900741649

   Thank you for the great points, @Fokko and @nicor88.
   
   As @nicor88 mentioned, I think RTAS (REPLACE TABLE AS SELECT) will differ 
slightly from overwrite in that the schema, the partitioning scheme, the sort 
order, and any of the table properties can also be updated atomically as part 
of the same operation. In short, the function needs to support updating any of 
the arguments currently supported by the 
[create_table](https://github.com/apache/iceberg-python/blob/94d7821cbc6b31b791e18d4f91c0991684616076/pyiceberg/catalog/__init__.py#L286)
 function, in addition to overwriting the Iceberg table data with the input 
pyarrow table.
   
   I'm wondering if it would be better to have a separate function that 
achieves these goals in a single transaction?
   
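   ```
   from typing import Optional

   import pyarrow as pa
   from pyiceberg.partitioning import UNPARTITIONED_PARTITION_SPEC, PartitionSpec
   from pyiceberg.schema import Schema
   from pyiceberg.table.sorting import UNSORTED_SORT_ORDER, SortOrder
   from pyiceberg.typedef import EMPTY_DICT, Properties

   class Table:
       ...
       def replace(
           self,
           schema: Schema,
           df: pa.Table,
           location: Optional[str] = None,
           partition_spec: PartitionSpec = UNPARTITIONED_PARTITION_SPEC,
           sort_order: SortOrder = UNSORTED_SORT_ORDER,
           properties: Properties = EMPTY_DICT,
       ) -> None:
           # Proposed signature only; the body is not implemented yet.
           # 1. update table properties, partition spec, sort_order and schema
           # 2. overwrite all data in the table with new data from df
           # 3. commit transaction in single metadata update
           ...
   ```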
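
As a rough illustration of the "single metadata update" idea, here is a toy sketch in plain Python. It does not use pyiceberg at all: `ToyTable`, `TableMetadata`, and the tuple-based schema are hypothetical stand-ins, and the single attribute assignment only stands in for an atomic catalog commit. The point is that the schema, partition spec, sort order, properties, and data are staged together and become visible in one step, or not at all.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class TableMetadata:
    """Hypothetical immutable snapshot of everything a reader can observe."""
    schema: Tuple[str, ...]
    partition_spec: Tuple[str, ...] = ()
    sort_order: Tuple[str, ...] = ()
    properties: Tuple[Tuple[str, str], ...] = ()
    rows: Tuple[Dict[str, Any], ...] = ()
    version: int = 0


class ToyTable:
    """Toy in-memory table; not related to pyiceberg's Table class."""

    def __init__(self, metadata: TableMetadata) -> None:
        self.metadata = metadata

    def replace(
        self,
        schema: Iterable[str],
        rows: Iterable[Dict[str, Any]],
        partition_spec: Iterable[str] = (),
        sort_order: Iterable[str] = (),
        properties: Optional[Dict[str, str]] = None,
    ) -> None:
        schema = tuple(schema)
        rows = tuple(dict(row) for row in rows)

        # Validate everything up front, before any visible state changes.
        for row in rows:
            if set(row) != set(schema):
                raise ValueError(f"row {row!r} does not match schema {schema!r}")

        # Stage the complete new state as one metadata object.
        staged = TableMetadata(
            schema=schema,
            partition_spec=tuple(partition_spec),
            sort_order=tuple(sort_order),
            properties=tuple(sorted((properties or {}).items())),
            rows=rows,
            version=self.metadata.version + 1,
        )

        # One reference swap stands in for the single atomic commit.
        self.metadata = staged


table = ToyTable(TableMetadata(schema=("id",), rows=({"id": 1},)))
table.replace(
    schema=("id", "name"),
    rows=[{"id": 2, "name": "a"}],
    partition_spec=("id",),
    properties={"owner": "me"},
)
```

If validation fails, `replace` raises before the swap, so readers keep seeing the previous metadata in full. That all-or-nothing behavior is what separates the proposed `replace` from running a schema update and an overwrite as two separate commits.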


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

