sungwy commented on issue #1148: URL: https://github.com/apache/iceberg-python/issues/1148#issuecomment-2352949020
> Create_all() relies on the default method in SQLAlchemy, https://docs.sqlalchemy.org/en/20/core/internals.html#sqlalchemy.engine.Dialect.has_table, which uses the DB metadata. This is also what the Java implementation is doing: https://github.com/mrcnc/iceberg/blob/928e52ae888f601e1d8d81ae2405d677db10f470/core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java#L155 I see. My concern arising from the review comments was not about the correctness of this approach, but with the possibility of introducing a disruptive behavior change to the existing users. But given that SqlAlchemy relies on `has_table` function to on `checkfirst`, I think the proposed change to remove `ensure_tables_exist` will make the check more permissive, than restrictive. On that note, I'm convinced based on your argument that we should remove `ensure_tables_exist`! 🙂 Thinking forward to how we could introduce this change - I think it would be important to add tests to validate that the change would be safe for table read-only permissioned users in PostgreSQL and sqlite (which are the two formally supported DB backends with the SQLCatalog). Unfortunately, I'm realizing that we don't actually have an integration test suite for PostgreSQL backed SQLCatalog. I've raised [this issue to track this](https://github.com/apache/iceberg-python/issues/1178), and I think it would be easier to introduce a change of this nature once we have a test suite for each of our officially supported DB backends. So I'm thinking we could work on your suggestions in the following order: 1. introduce a flag that ignores table creation if enabled (backward compatible, and safe to introduce) 2. add test suite for postgresql #1178 3. remove `_ensure_tables_exist` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org