Fokko commented on code in PR #7921:
URL: https://github.com/apache/iceberg/pull/7921#discussion_r1254692883
##########
python/pyproject.toml:
##########
@@ -61,6 +61,7 @@ thrift = { version = ">=0.13.0,<1.0.0", optional = true }
boto3 = { version = ">=1.17.106", optional = true }
s3fs = { version = ">=2021.08.0,<2024.1.0", optional = true } # Upper bound
set arbitrarily, to be reassessed in early 2024.
adlfs = { version = ">=2021.07.0,<2024.1.0", optional = true } # Upper bound
set arbitrarily, to be reassessed in early 2024.
+psycopg2-binary = { version = ">=2.9.6", optional = true }
Review Comment:
Yes, so there are multiple things here. The `DB-API` spec should be able to
abstract the different databases, while SQLAlchemy is a full ORM layer.
Personally, I'm not a huge fan of ORM layers in general because it takes the
common denominator, and you won't be able to optimize (SQLAlchemy allows you to
do this to some extent). Also, with Airflow we experienced some memory
pressure, but I won't expect PyIceberg to have many open and concurrent
connections.
I think @cccs-eric also noticed that while libraries implement the `DB_API`
specification, there are still subtle differences:
- [Postgres prepared statements](https://www.psycopg.org/docs/usage.html):
`INSERT INTO movie VALUES (%s, %s, %s);`.
- [SQLite prepared
statements](https://docs.python.org/3/library/sqlite3.html): `INSERT INTO movie
VALUES(?, ?, ?)`.
I don't think not-using prepared statements is an option due to security
considerations. Therefore I would lean to switching to `SQLAlchemy`. Another
nice feature is that it also supports [generating the `CREATE TABLE`
statements](https://www.tutorialspoint.com/sqlalchemy/sqlalchemy_core_creating_table.htm).
This also makes it easier to add engines, and run tests against them.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]