syun64 commented on issue #406:
URL: https://github.com/apache/iceberg-python/issues/406#issuecomment-1941783224

   I think the `table_exists` function that @djouallah proposed and the PR that 
@hussein-awala is working on to support `CREATE TABLE IF NOT EXISTS` serve 
different purposes, and I think we should support both in PyIceberg:
   
   **table_exists:**
   - Useful when we only want to check whether a table exists in a namespace. I'd 
argue this is equivalent to calling `list_tables` and checking whether the table 
is in the returned list, so it isn't as critical to implement as `CREATE TABLE IF 
NOT EXISTS`.
   - It is, however, very simple to implement, so we could just support it.
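
As a rough sketch of the `list_tables` equivalence described above: the helper below assumes a catalog object exposing `list_tables(namespace)` that returns identifiers as tuples of strings. `FakeCatalog` and the standalone `table_exists` helper are hypothetical stand-ins for illustration, not PyIceberg's implementation.

```python
from typing import Iterable, List, Protocol, Tuple

Identifier = Tuple[str, ...]


class SupportsListTables(Protocol):
    def list_tables(self, namespace: str) -> List[Identifier]: ...


def table_exists(catalog: SupportsListTables, namespace: str, table_name: str) -> bool:
    """Return True if `table_name` appears in the catalog's listing for `namespace`."""
    expected = tuple(namespace.split(".")) + (table_name,)
    return expected in catalog.list_tables(namespace)


class FakeCatalog:
    """Hypothetical in-memory stand-in for a catalog, used only for this sketch."""

    def __init__(self, tables: Iterable[Identifier]) -> None:
        self._tables = list(tables)

    def list_tables(self, namespace: str) -> List[Identifier]:
        ns = tuple(namespace.split("."))
        # Keep only identifiers whose namespace part matches exactly.
        return [t for t in self._tables if t[:-1] == ns]


catalog = FakeCatalog([("db", "events"), ("db", "users")])
found = table_exists(catalog, "db", "events")
missing = table_exists(catalog, "db", "orders")
```

Listing every table just to test for one is wasteful on large namespaces, which is one reason a dedicated existence check may still be worth having.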
   
   **CREATE TABLE IF NOT EXISTS:**
   - Allows users to deploy an idempotent table creation statement to 
production: the same code creates the table on its first run and skips 
creation on every later run, without requiring a code change.
   - These semantics differ from invoking `table_exists` and then 
`create_table`, because `CREATE TABLE IF NOT EXISTS` is a single 
call to the catalog. With `table_exists` + `create_table`, the two calls are made 
**separately and sequentially**, so a **concurrent** process can create the 
table between them, causing `create_table` to fail even though `table_exists` 
returned `False`.
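
To make the race concrete, here is a minimal sketch using a toy in-memory catalog. Everything in it (`ToyCatalog`, `TableAlreadyExistsError`, `check_then_create`, and the `between_calls` hook that simulates a concurrent writer) is hypothetical and only illustrates the semantics; it is not PyIceberg's API.

```python
import threading
from typing import Callable, Dict, Optional

Schema = Dict[str, str]


class TableAlreadyExistsError(Exception):
    """Raised by the toy catalog when a table name is already taken."""


class ToyCatalog:
    """Hypothetical in-memory catalog; each method is one 'call to the catalog'."""

    def __init__(self) -> None:
        self._tables: Dict[str, Schema] = {}
        self._lock = threading.Lock()

    def table_exists(self, name: str) -> bool:
        with self._lock:
            return name in self._tables

    def create_table(self, name: str, schema: Schema) -> Schema:
        with self._lock:
            if name in self._tables:
                raise TableAlreadyExistsError(name)
            self._tables[name] = schema
            return schema

    def create_table_if_not_exists(self, name: str, schema: Schema) -> Schema:
        # Check and create happen under one lock, i.e. in a single catalog call.
        with self._lock:
            return self._tables.setdefault(name, schema)


def check_then_create(
    catalog: ToyCatalog,
    name: str,
    schema: Schema,
    between_calls: Callable[[], None] = lambda: None,
) -> Optional[Schema]:
    """Two separate calls; `between_calls` stands in for a concurrent process."""
    if not catalog.table_exists(name):
        between_calls()
        return catalog.create_table(name, schema)
    return None


catalog = ToyCatalog()
schema = {"id": "long"}

# Another "process" creates the table after our existence check but before our create.
try:
    check_then_create(
        catalog,
        "db.events",
        schema,
        between_calls=lambda: catalog.create_table("db.events", schema),
    )
    raced = False
except TableAlreadyExistsError:
    raced = True

# The single-call variant returns the existing table instead of failing.
existing = catalog.create_table_if_not_exists("db.events", {"id": "int"})
```

In this sketch `raced` ends up `True` because the existence check is stale by the time `create_table` runs, whereas `create_table_if_not_exists` can be called any number of times without raising.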


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

