syun64 commented on issue #406: URL: https://github.com/apache/iceberg-python/issues/406#issuecomment-1941783224
I think the `table_exists` function that @djouallah proposed and the PR that @hussein-awala is working on to support `CREATE TABLE IF NOT EXISTS` serve different purposes, and I think we should support both in PyIceberg:

**table_exists:**
- Important if we just want to check that a table exists in a namespace. I'd argue this is the same as calling `list_tables` and checking whether the table is in the returned list, and hence isn't as critical to implement as `CREATE TABLE IF NOT EXISTS`.
- It is, however, very simple to implement, so we could just support it.

**CREATE TABLE IF NOT EXISTS:**
- Allows users to deploy an idempotent table creation statement into Production, so that the same code can be run to first create a table and then skip the creation on every later run, without requiring a code change.
- This semantic is different from running `table_exists` and then invoking `create_table`, because `CREATE TABLE IF NOT EXISTS` is a single call to the catalog. With `table_exists` + `create_table`, the two calls are made **separately and sequentially**, meaning there is a chance that a **concurrent** process creates the table between them, causing `create_table` to fail even though `table_exists` returned False for the given process.
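To make the race between the two separate calls concrete, here is a minimal sketch using a toy in-memory catalog. `ToyCatalog`, `check_then_create`, the `between_calls` hook, and the locally defined `TableAlreadyExistsError` are hypothetical stand-ins for illustration only, not PyIceberg's API; a real catalog would enforce atomicity on the server side rather than with a local lock.

```python
import threading


class TableAlreadyExistsError(Exception):
    """Raised when creating a table whose identifier is already taken."""


class ToyCatalog:
    """Hypothetical in-memory catalog; not PyIceberg's Catalog class."""

    def __init__(self):
        self._tables = {}
        self._lock = threading.Lock()

    def table_exists(self, identifier):
        with self._lock:
            return identifier in self._tables

    def create_table(self, identifier, schema):
        with self._lock:
            if identifier in self._tables:
                raise TableAlreadyExistsError(identifier)
            self._tables[identifier] = schema
            return schema

    def create_table_if_not_exists(self, identifier, schema):
        # Check and create under one lock: a single atomic call to the catalog.
        with self._lock:
            return self._tables.setdefault(identifier, schema)


def check_then_create(catalog, identifier, schema, between_calls=None):
    """Two separate calls; `between_calls` simulates a concurrent writer."""
    if catalog.table_exists(identifier):
        return "skipped"
    if between_calls is not None:
        between_calls()  # another process acts in the gap between the calls
    try:
        catalog.create_table(identifier, schema)
        return "created"
    except TableAlreadyExistsError:
        return "failed"


catalog = ToyCatalog()

# A concurrent process creates the table after our existence check returned False.
racy = check_then_create(
    catalog,
    "db.events",
    {"id": "long"},
    between_calls=lambda: catalog.create_table("db.events", {"id": "long"}),
)

# The single-call variant is idempotent: repeated runs never raise.
first = catalog.create_table_if_not_exists("db.metrics", {"value": "double"})
second = catalog.create_table_if_not_exists("db.metrics", {"value": "double"})
```

In this sketch `racy` ends up as `"failed"` even though the existence check passed, while the two `create_table_if_not_exists` calls both succeed and return the same stored schema.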