b4sus opened a new issue, #14317:
URL: https://github.com/apache/iceberg/issues/14317
### Apache Iceberg version
1.9.2
### Query engine
Trino
### Please describe the bug 🐞
Hello Iceberg Community,
We have recently switched jdbc catalog from postgresql to mysql. This meant
for us to ensure that trino and kafka connector have the mysql driver on
classpath, to be able to connect to it.
This initially worked good. But after a while we noticed that we are having
exceptions from mysql driver, like this:
```
Caused by: org.apache.iceberg.jdbc.UncheckedSQLException: Failed to execute
exists query: SELECT table_namespace FROM iceberg_tables WHERE catalog_name = ?
AND (table_namespace = ? OR table_namespace LIKE ? ESCAPE '!') LIMIT 1
at org.apache.iceberg.jdbc.JdbcUtil.exists(JdbcUtil.java:828)
at org.apache.iceberg.jdbc.JdbcUtil.namespaceExists(JdbcUtil.java:796)
at
org.apache.iceberg.jdbc.JdbcCatalog.namespaceExists(JdbcCatalog.java:606)
at
io.trino.plugin.iceberg.catalog.jdbc.TrinoJdbcCatalog.namespaceExists(TrinoJdbcCatalog.java:127)
at
io.trino.plugin.iceberg.catalog.jdbc.TrinoJdbcCatalog.listNamespaces(TrinoJdbcCatalog.java:273)
at
io.trino.plugin.iceberg.catalog.jdbc.TrinoJdbcCatalog.listTables(TrinoJdbcCatalog.java:183)
at
io.trino.plugin.iceberg.IcebergMetadata.getRelationTypes(IcebergMetadata.java:953)
at
io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.getRelationTypes(ClassLoaderSafeConnectorMetadata.java:297)
at
io.trino.tracing.TracingConnectorMetadata.getRelationTypes(TracingConnectorMetadata.java:269)
at
io.trino.metadata.MetadataManager.getRelationTypes(MetadataManager.java:618)
at
io.trino.tracing.TracingMetadata.getRelationTypes(TracingMetadata.java:331)
at
io.trino.metadata.MetadataListing.doGetRelationTypes(MetadataListing.java:150)
at
io.trino.metadata.MetadataListing.getRelationTypes(MetadataListing.java:141)
... 46 more
Caused by: com.mysql.cj.jdbc.exceptions.CommunicationsException:
Communications link failure
The last packet successfully received from the server was 8,105,461
milliseconds ago. The last packet sent successfully to the server was 8,105,461
milliseconds ago.
```
After some digging, it turned out, that mysql server is killing idle
connections after some time (usually hours (see
[here](https://dev.mysql.com/doc/refman/8.4/en/server-system-variables.html#sysvar_wait_timeout)),
default 8 hours) and than when client tries to reuse the connection, the
exceptions is thrown. Postgres, by default, doesn't do this (see
[here](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-IDLE-SESSION-TIMEOUT)).
Now, I've checked how iceberg handels jdbc connections and I've found out
[here](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/ClientPoolImpl.java),
that each time connection is needed, first one from pool is taken, used and
returned back to first position. Default pool size is 2
([here](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/CatalogProperties.java#L124)).
In practice (at least in our setup) this means that one connection is being
reused (almost) all the time (borrowed -> returned, borrowed -> returned and so
on), while the second connection is sitting there without being used - but
eventually it can get picked up and exception is thrown.
With kafka, we could set the pool size to 1 (using
[this](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/CatalogProperties.java#L123)
property) and the problem was solved.
With trino, however, it is not possible to pass this property down to
iceberg - they had a [ticket](https://github.com/trinodb/trino/issues/23095) to
do that, but they added only passing of property `retryable_status_codes`, so I
cannot change the pool size.
My question is - why is pooling done in this way - that (almost) always the
one connection is being reused? Why are they not rotated? That would keep them
both active in the eyes of mysql server (or any server caring about active/idle
connections).
Does something speak against rotating them?
Regards,
b4sus
### Willingness to contribute
- [x] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]