Tommo56700 opened a new issue, #3008:
URL: https://github.com/apache/iceberg-python/issues/3008

   ### Feature Request / Improvement
   
   Hi team,
   
   I’ve recently migrated to AWS S3 Tables and switched from using the 
GlueCatalog to the REST catalog. After updating the catalog configuration, 
everything works correctly in local, single‑process scenarios. However, I’m 
encountering intermittent failures when scaling out to multiple Dask workers 
making parallel requests.
   Specifically, I’m seeing occasional `ThrottlingException` errors coming from 
AWS SigV4‑signed requests. Once throttling occurs, subsequent requests 
sometimes fail with:
   `requests.exceptions.HTTPError: 403 Client Error`
   
   My understanding is that when a SigV4‑signed request is throttled, follow‑on 
requests can fail as well and surface as unauthorized (403) S3 operations. 
AWS’s guidance for handling throttling is to configure retries through 
botocore: 
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html
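   
   To make the retry idea concrete, here is a minimal, library‑independent sketch of retrying a throttled call with capped exponential backoff and jitter. It is not botocore’s implementation; `ThrottledError` and `call_with_backoff` are hypothetical names used only for illustration, and the delay values are arbitrary.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling failure such as ThrottlingException."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Call fn(), retrying on ThrottledError with capped exponential backoff.

    The wait before each retry is drawn uniformly from [0, cap], where cap
    doubles per attempt up to max_delay ("full jitter").
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ThrottledError:
            if attempt == max_attempts:
                # Out of attempts: surface the throttling error to the caller.
                raise
            # Randomizing the delay keeps many parallel workers from
            # retrying in lockstep and being throttled again together.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
```

   With many Dask workers hitting the same endpoint, the jitter matters as much as the backoff, since otherwise all workers retry at the same moments.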
   
   While reviewing the PyIceberg implementation, I noticed:
   
   - The GlueCatalog sets reasonable default retry settings on the underlying 
boto session: 
https://github.com/apache/iceberg-python/blob/main/pyiceberg/catalog/glue.py#L331-L348
   - The REST catalog, and specifically the SigV4Adapter, does not appear to 
configure any retry behavior by default: 
https://github.com/apache/iceberg-python/blob/main/pyiceberg/catalog/rest/__init__.py#L684-L694
   
   This creates an inconsistency where switching from Glue to REST results in 
weaker retry behavior, which becomes visible under parallel load.
   
   ### Question / Proposal
   Should the REST catalog align its default retry behavior with what 
GlueCatalog already applies?
   At present, users can manually configure retry settings by supplying a 
custom botocore session via catalog properties, but it seems reasonable and 
more consistent for the REST catalog to provide safe defaults, especially since 
SigV4Adapter is now a common path for AWS S3 Tables.
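   
   As a rough illustration of what a retrying default could look like at the HTTP layer, the sketch below mounts a `requests` adapter configured with urllib3’s `Retry`. It assumes the REST catalog’s traffic goes through a `requests.Session`; `build_retrying_session` is a hypothetical helper, not PyIceberg API, and the numbers are illustrative rather than GlueCatalog’s actual defaults. It also does not address whether a retried request needs to be re‑signed.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_retrying_session(total=5, backoff_factor=0.5):
    """Return a requests.Session that retries throttled and transient failures.

    Hypothetical helper for illustration only.
    """
    retry = Retry(
        total=total,                      # upper bound on retries per request
        backoff_factor=backoff_factor,    # exponential backoff between attempts
        status_forcelist=(429, 500, 502, 503, 504),  # retry on these statuses
        # urllib3's default allowed methods are idempotent ones only,
        # so POST requests are not retried unless configured explicitly.
    )
    session = requests.Session()
    # Any HTTPAdapter subclass accepts the same max_retries argument.
    session.mount("https://", HTTPAdapter(max_retries=retry))
    return session
```

   Since `max_retries` is an `HTTPAdapter` constructor argument, an adapter subclass could in principle accept and forward the same policy.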
   
   Matching (or at least approaching) the GlueCatalog’s retry policy would:
   
   - Avoid intermittent throttling‑triggered failures in distributed workloads
   - Improve parity between Glue and REST behavior
   - Reduce the configuration burden on users switching to REST for AWS‑backed 
tables
   
   Happy to discuss or test any proposed changes. Thanks for your work on the 
project!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

