JannicCutura opened a new pull request, #2994:
URL: https://github.com/apache/iceberg-python/pull/2994
# Rationale for this change
Add support for accessing Iceberg tables via S3 access points, enabling
cross-account access scenarios where organizations enforce access point usage
instead of direct bucket access. Closes #2991
Changes:
- Add S3_ACCESS_POINT_PREFIX config constant (s3.access-point.<bucket>)
- Implement _resolve_s3_access_point() in PyArrowFileIO
- Implement _resolve_s3_access_point() in FsspecFileIO
- Add 12 unit tests (6 per FileIO implementation)
Configuration:
```
s3.access-point.<bucket-name> = <access-point-alias>
```
For example:
```
from pyiceberg.catalog import load_catalog
from pyiceberg.io import S3_ACCESS_POINT_PREFIX

AWS_REGION = "us-east-1"  # placeholder; use your own region

catalog = load_catalog(
    "glue",
    **{
        "type": "glue",
        "client.region": AWS_REGION,
        # Multiple buckets, each with its own access point.
        # The key can be written out literally or built from the constant.
        "s3.access-point.bucket-a": "alias-a-123456-s3alias",
        f"{S3_ACCESS_POINT_PREFIX}bucket-b": "alias-b-789012-s3alias",
    }
)
```
## What
The access point alias (format: <name>-<account-id>-s3alias) is used
transparently in place of the bucket name when accessing S3 objects.
## Why
Organizations increasingly enforce S3 access point usage for cross-account
data access instead of direct bucket access. This is common in enterprise
environments with strict security policies.
## How
Introduces `s3.access-point.<bucket>` configuration that maps bucket names
to access point aliases. Both PyArrowFileIO and FsspecFileIO resolve these
at runtime, rewriting paths transparently.
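
To illustrate the idea, here is a minimal sketch of the bucket-to-alias rewrite. It is not the code from this PR: `resolve_access_point` is a hypothetical standalone helper, and the prefix value `"s3.access-point."` is inferred from the configuration key shown above rather than copied from `pyiceberg.io`.

```python
# Hypothetical helper for illustration only; the PR's actual logic lives in
# _resolve_s3_access_point() on PyArrowFileIO and FsspecFileIO.

# Assumed prefix value, inferred from the `s3.access-point.<bucket>` key format.
ACCESS_POINT_PREFIX = "s3.access-point."

S3_SCHEMES = {"s3", "s3a", "s3n"}


def resolve_access_point(location: str, properties: dict[str, str]) -> str:
    """Return `location` with its bucket replaced by a configured alias, if any."""
    scheme, sep, rest = location.partition("://")
    if not sep or scheme not in S3_SCHEMES:
        return location  # Not an S3 URI: leave it untouched.

    # Split "bucket/key/parts" into the bucket and the remaining object key.
    bucket, slash, key = rest.partition("/")
    alias = properties.get(f"{ACCESS_POINT_PREFIX}{bucket}")
    if alias is None:
        return location  # No mapping for this bucket: existing behaviour.

    return f"{scheme}://{alias}{slash}{key}"


props = {"s3.access-point.bucket-a": "alias-a-123456-s3alias"}
print(resolve_access_point("s3://bucket-a/warehouse/db/tbl/metadata.json", props))
print(resolve_access_point("s3://bucket-c/warehouse/data.parquet", props))
```

Buckets without a mapping pass through unchanged, which is why existing configurations would be unaffected under this scheme.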
## Are these changes tested?
Yes.
- 12 unit tests added (6 for PyArrowFileIO, 6 for FsspecFileIO)
- Manually tested with a real cross-account S3 access point across two AWS
accounts
## Are there any user-facing changes?
No breaking changes. Existing configurations continue to work unchanged.
The only addition is a new, optional configuration option,
`s3.access-point.<bucket-name>`.