rahulsmahadev opened a new pull request, #3418:
URL: https://github.com/apache/iceberg-python/pull/3418

   <!-- Closes #2772 -->
   
   # Rationale for this change
   
   The REST Catalog uses `requests` with no retries and no timeout configured, 
so transient 5xx / network failures bubble up immediately and slow servers can 
hang the client indefinitely. The reporting user on #2772 hit infinite-timeout 
hangs against a Polaris instance returning `504` from a proxy.
   
   This change adds an optional `connection:` block on the catalog properties 
for opting in to a per-request timeout and a retry policy, as proposed in the 
issue body:
   
   ```yaml
   catalog:
     default:
       uri: http://rest-catalog/ws/
       connection:
         timeout: 60                     # seconds, applied to every HTTP call
         retry:
           total: 5
           backoff_factor: 1.0
           status_forcelist: [429, 500, 502, 503, 504]
           allowed_methods: [GET, HEAD, OPTIONS]
   ```
   
   Changes:
   - Added `connection`, `connection.timeout`, `connection.retry` catalog 
properties.
   - Added a private `_RetryTimeoutHTTPAdapter` that subclasses 
`requests.adapters.HTTPAdapter` and applies a default timeout to every call 
(since `requests` does not expose a session-wide timeout).
   - The `connection.retry` sub-mapping is passed verbatim to 
`urllib3.util.retry.Retry`, so any kwarg the upstream class accepts can be 
configured.
   - Both keys are optional and opt-in. When neither is set, the default 
`requests` behavior is preserved — no behavior change for existing users.
   - Validation: negative or zero `timeout` raises `ValueError`; unknown 
`Retry` kwargs are caught and re-raised as `ValueError` with a clearer message.
   
   The SigV4 adapter mounted at the catalog URI (see #3063) is a longer-prefix 
match and still wins for that host, so SigV4 users continue to get their 
existing retry behavior unchanged — this change deliberately stays out of that 
path.
   
   ## Are these changes tested?
   
   Yes. Added five tests in `tests/catalog/test_rest.py`:
   - `test_session_without_connection_config_uses_default_adapter` — no 
`connection:` block → no `_RetryTimeoutHTTPAdapter` is mounted (default 
`requests` behavior preserved).
   - `test_session_with_connection_timeout_and_retry` — full nested 
`connection:` block applies timeout and all `Retry` kwargs (`total`, 
`backoff_factor`, `status_forcelist`, `allowed_methods`).
   - `test_session_with_connection_timeout_only` — `timeout` alone works 
without a `retry` block.
   - `test_session_with_invalid_connection_timeout_raises` — negative `timeout` 
raises `ValueError`.
   - `test_session_with_invalid_connection_retry_kwarg_raises` — unknown 
`Retry` kwarg raises `ValueError`.
   
   ## Are there any user-facing changes?
   
   Yes — two new optional REST catalog properties (`connection.timeout`, 
`connection.retry`) documented in `mkdocs/docs/configuration.md` under "Retry 
and timeout". No behavior change for users who do not set them.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to