liurenjie1024 commented on issue #75:
URL: https://github.com/apache/iceberg-rust/issues/75#issuecomment-1751595418

   > > But in this case the data is already in memory through the `load_table` operation.
   > 
   > You're correct. By reusing the `metadata_location` and `TableMetadata`, we can eliminate 2 unnecessary network requests in this scenario.
   > 
   > However, I have the following concerns about this change:
   > 
   > First of all, not all catalogs are implemented in memory, which means they cannot benefit from this change. Adding `metadata_location` and `TableMetadata` to `CommitTable` could result in additional costs for every update operation.
   > 
   > Secondly, the `Catalog` should not rely on client input. Users may inadvertently fill the `CommitTable` with incorrect metadata location or metadata.
   > 
   > Finally, the location of users' metadata or metadata itself could be outdated. For example, the iceberg table is undergoing concurrent writes and the client's metadata location may have already been changed.
   > 
   > I believe we should exclude `metadata_location` and `TableMetadata` from `CommitTable`. Instead, let's rely on the `Catalog` to maintain the state accurately, which will also enhance the usability of our catalog API.
   > 
   > What are your thoughts?
   
   +1, I think the catalog should not depend on client input for correctness, since the table may be modified concurrently.
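
To make the concern concrete, here is a minimal sketch of a commit path in which the catalog resolves the current table state itself instead of trusting a client-supplied `metadata_location` or `TableMetadata`. Everything in it is an assumption for illustration: `InMemoryCatalog`, the simplified `TableMetadata` and `CommitTable` structs, the `expected_version` requirement, and `CommitError` are hypothetical names, not the iceberg-rust API.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified stand-in for table metadata.
#[derive(Debug, Clone, PartialEq)]
struct TableMetadata {
    version: u64,
    properties: HashMap<String, String>,
}

/// Hypothetical commit request. It deliberately carries no
/// `metadata_location` and no `TableMetadata`: only a requirement
/// (the version the client started from) and the updates to apply.
struct CommitTable {
    table: String,
    expected_version: u64,
    set_properties: Vec<(String, String)>,
}

#[derive(Debug, PartialEq)]
enum CommitError {
    TableNotFound,
    Conflict { expected: u64, actual: u64 },
}

/// Hypothetical catalog that owns the authoritative table state.
#[derive(Default)]
struct InMemoryCatalog {
    tables: HashMap<String, TableMetadata>,
}

impl InMemoryCatalog {
    fn create_table(&mut self, name: &str) {
        let metadata = TableMetadata { version: 0, properties: HashMap::new() };
        self.tables.insert(name.to_string(), metadata);
    }

    /// The catalog looks up the current state itself, validates the
    /// requirement against it, and only then applies the updates.
    fn commit_table(&mut self, commit: CommitTable) -> Result<u64, CommitError> {
        let current = self
            .tables
            .get_mut(&commit.table)
            .ok_or(CommitError::TableNotFound)?;

        // A stale client is rejected here, no matter what it believes
        // the current metadata to be.
        if current.version != commit.expected_version {
            return Err(CommitError::Conflict {
                expected: commit.expected_version,
                actual: current.version,
            });
        }

        for (key, value) in commit.set_properties {
            current.properties.insert(key, value);
        }
        current.version += 1;
        Ok(current.version)
    }
}

/// Two writers both start from version 0; the second commit is stale.
fn concurrent_writers_example() -> (Result<u64, CommitError>, Result<u64, CommitError>) {
    let mut catalog = InMemoryCatalog::default();
    catalog.create_table("db.events");

    let first = catalog.commit_table(CommitTable {
        table: "db.events".to_string(),
        expected_version: 0,
        set_properties: vec![("owner".to_string(), "writer-a".to_string())],
    });
    let second = catalog.commit_table(CommitTable {
        table: "db.events".to_string(),
        expected_version: 0,
        set_properties: vec![("owner".to_string(), "writer-b".to_string())],
    });
    (first, second)
}
```

In this sketch the second writer gets a `Conflict` error and would need to reload the table and retry. A real catalog would presumably do the same check against its backing store rather than an in-process map.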


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

