sungwy commented on issue #1148: URL: https://github.com/apache/iceberg-python/issues/1148#issuecomment-2353497016
Hi @isc-patrick - thank you very much for taking the time to bring these points to discussiong. I think these are important questions for us to answer and take a stance on as the project grows, and it's great to have all your fantastic considerations documented here for future reference. > I would skip the integration tests with Postgres and make some statements about assumptions for SQL databases working with pyiceberg via SQLAlchemy. SQLAlchemy is a well established abstraction layer and the SQLCatalog is extremely simple. I think this is an interesting point. Rather than expressing support for specific DB and driver as 'supported' backends for the SqlCatalog, I think listing out the requirements of the dialect is a much more scalable way of supporting the SqlCatalog. This would make sense in the long run, especially since the Iceberg Community is thinking of the REST Catalog as the strategic catalog implementation. Given that context, maybe it won't make sense for us to expend so much effort in maintaining SqlCatalog's first class support against multiple DB backends. However, given that our documentation already expresses formal support for postgresql+psycopg2, I think we should raise a discussion to the mailing list if we want to remove this lingo of formal support and move in this direction. > Postgres has dozens of releases over the past few years, https://www.postgresql.org/docs/release/, and SQLAlchemy supports 6 different Postgres drivers, https://docs.sqlalchemy.org/en/13/dialects/postgresql.html. Which combination of the 100's of possible ones are the integration tests going to cover? This is a valid point. I think my stance on this is that we don't have to cover all combinations of the driver in our tests, but just the ones we choose to more 'formally' support. For now, postgresql+psycopg2 and sqlite are already two types of connections we formally support according to our [documentation](https://github.com/apache/iceberg-python/blob/0dc54080aa287dc8e920da128d7f4b335965f1df/mkdocs/docs/configuration.md?plain=1#L234) and [dependency management](https://github.com/apache/iceberg-python/blob/0dc54080aa287dc8e920da128d7f4b335965f1df/pyproject.toml#L79-L80). I think it is fair to restrict our integration tests two these backends, and only test on the latest released versions for simplicity. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org