rahulsmahadev opened a new pull request, #3418:
URL: https://github.com/apache/iceberg-python/pull/3418
<!-- Closes #2772 -->
# Rationale for this change
The REST Catalog uses `requests` with no retries and no timeout configured,
so transient 5xx / network failures bubble up immediately and slow servers can
hang the client indefinitely. The reporting user on #2772 hit infinite-timeout
hangs against a Polaris instance returning `504` from a proxy.
This change adds an optional `connection:` block on the catalog properties
for opting in to a per-request timeout and a retry policy, as proposed in the
issue body:
```yaml
catalog:
default:
uri: http://rest-catalog/ws/
connection:
timeout: 60 # seconds, applied to every HTTP call
retry:
total: 5
backoff_factor: 1.0
status_forcelist: [429, 500, 502, 503, 504]
allowed_methods: [GET, HEAD, OPTIONS]
```
Changes:
- Added `connection`, `connection.timeout`, `connection.retry` catalog
properties.
- Added a private `_RetryTimeoutHTTPAdapter` that subclasses
`requests.adapters.HTTPAdapter` and applies a default timeout to every call
(since `requests` does not expose a session-wide timeout).
- The `connection.retry` sub-mapping is passed verbatim to
`urllib3.util.retry.Retry`, so any kwarg the upstream class accepts can be
configured.
- Both keys are optional and opt-in. When neither is set, the default
`requests` behavior is preserved — no behavior change for existing users.
- Validation: negative or zero `timeout` raises `ValueError`; unknown
`Retry` kwargs are caught and re-raised as `ValueError` with a clearer message.
The SigV4 adapter mounted at the catalog URI (see #3063) is a longer-prefix
match and still wins for that host, so SigV4 users continue to get their
existing retry behavior unchanged — this change deliberately stays out of that
path.
## Are these changes tested?
Yes. Added five tests in `tests/catalog/test_rest.py`:
- `test_session_without_connection_config_uses_default_adapter` — no
`connection:` block → no `_RetryTimeoutHTTPAdapter` is mounted (default
`requests` behavior preserved).
- `test_session_with_connection_timeout_and_retry` — full nested
`connection:` block applies timeout and all `Retry` kwargs (`total`,
`backoff_factor`, `status_forcelist`, `allowed_methods`).
- `test_session_with_connection_timeout_only` — `timeout` alone works
without a `retry` block.
- `test_session_with_invalid_connection_timeout_raises` — negative `timeout`
raises `ValueError`.
- `test_session_with_invalid_connection_retry_kwarg_raises` — unknown
`Retry` kwarg raises `ValueError`.
## Are there any user-facing changes?
Yes — two new optional REST catalog properties (`connection.timeout`,
`connection.retry`) documented in `mkdocs/docs/configuration.md` under "Retry
and timeout". No behavior change for users who do not set them.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]