syun64 commented on PR #498:
URL: https://github.com/apache/iceberg-python/pull/498#issuecomment-2033344103

   > Shall we move "append", "overwrite", and "add_files" to `Transaction` 
class? This change would enable us to seamlessly chain these operations with 
other table updates in a single commit. This adjustment could be particularly 
beneficial in the context of `CreateTableTransaction`, as it would enable users 
to not only create a table but also populate it with initial data in one go.
   
   I think this is a great question.
   
   I think we have two options here:
   1. We move these actions into the Transaction class, and remove them from 
the Table class
   2. We move them into the Transaction class, and also keep an implementation 
in the Table class
   
   I'm not sure which of the two options is better, but I keep asking myself 
whether there's a 'good' reason to have two separate APIs that achieve 
similar results.
   
   For example, we have **update_spec** and **update_schema**, which can be 
created from either the **Transaction** or the **Table**, and I feel like we 
might be creating work for ourselves by duplicating the feature in both 
classes. What if we consolidated all of our actions into the Transaction 
class, and removed them from the Table class?
   
   I think the upside is that the API would convey a very clear message to 
the developer: a _transaction is committed to a table_, and a series of 
_actions_ can be chained onto the _same transaction_ as a single commit.
   
   In addition, we can avoid [issues like 
this](https://github.com/apache/iceberg-python/pull/508) where we roll out a 
feature to one API implementation, but not the other.
   
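   ```
   from pyiceberg.types import IntegerType

   # Shortcut on the Table: the context manager is the schema update itself
   with given_table.update_schema() as update:
       update.add_column(path="new_column1", field_type=IntegerType())
   ```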
   
   ```
   # Explicit transaction: the schema update is one action within it
   with given_table.transaction() as tx:
       with tx.update_schema() as update:
           update.add_column(path="new_column1", field_type=IntegerType())
   ```
   
   To me, the second pattern (going through `transaction()`) feels more 
explicit than the first, and I'm curious to hear others' opinions on this 
topic.
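   
   To make the consolidation idea more concrete, here is a rough toy sketch of 
option 2. It is not PyIceberg code: `ToyTable`, `ToyTransaction`, and every 
method on them are hypothetical stand-ins, and the "commit" just records a list 
of strings. The only point is that the logic lives in the transaction, and the 
table-level method is a thin wrapper that delegates to it, so a feature cannot 
land in one API and be missed in the other.

```python
from dataclasses import dataclass, field
from typing import List


class ToyTransaction:
    """Hypothetical transaction: collects actions, commits them together."""

    def __init__(self, table: "ToyTable") -> None:
        self._table = table
        self._pending: List[str] = []

    def append(self, data: str) -> "ToyTransaction":
        # The real logic would live here, and only here.
        self._pending.append(f"append:{data}")
        return self

    def add_column(self, name: str) -> "ToyTransaction":
        self._pending.append(f"add_column:{name}")
        return self

    def commit(self) -> None:
        # All pending actions are recorded as one commit on the table.
        if self._pending:
            self._table.commits.append(list(self._pending))
            self._pending.clear()

    def __enter__(self) -> "ToyTransaction":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Commit only if the block finished without an exception.
        if exc_type is None:
            self.commit()


@dataclass
class ToyTable:
    """Hypothetical table; each entry in `commits` is one commit."""

    name: str
    commits: List[List[str]] = field(default_factory=list)

    def transaction(self) -> ToyTransaction:
        return ToyTransaction(self)

    def append(self, data: str) -> None:
        # Thin convenience wrapper: delegate instead of re-implementing.
        with self.transaction() as tx:
            tx.append(data)


table = ToyTable("demo")

# Several actions chained onto the same transaction become a single commit.
with table.transaction() as tx:
    tx.add_column("new_column1").append("batch-1")

# The table-level shortcut produces its own one-action commit.
table.append("batch-2")

print(table.commits)
# [['add_column:new_column1', 'append:batch-1'], ['append:batch-2']]
```

   With option 1, `ToyTable.append` would simply be dropped and callers would 
always go through `table.transaction()`.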


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

