huan233usc opened a new pull request, #729:
URL: https://github.com/apache/iceberg-cpp/pull/729
## What & why
The REST catalog already declares the
`.../namespaces/{ns}/tables/{tbl}/credentials`
endpoint path, but `LoadTableResult` discards the `storage-credentials` the
server
returns. As a result, catalogs that vend per-table storage credentials
cannot be used
end to end — the client has no credentials to read/write the table's data
files.
This wires up vended storage credentials and, in doing so, resolves the
existing
`FIXME: support per-table FileIO creation` in `RestCatalog::LoadTable`.
## What changed
- **`StorageCredential` type** (`prefix` + `config`) and a parsed
`LoadTableResult.storage_credentials` list, with JSON round-trip support in
`catalog/rest/json_serde.cc`.
- **`MakeTableFileIO`** (`catalog/rest/rest_file_io.cc`): builds a
table-scoped FileIO by
layering, in increasing precedence, the catalog configuration → the
table's `config` →
the most-specific (longest-prefix) vended credential, following the REST
spec guidance
to "choose the most specific prefix". When the table supplies no
overrides, the shared
catalog FileIO instance is reused. The existing `MakeCatalogFileIO` is
untouched, so
there is no behavior change for non-vending servers.
- **`MatchStorageCredential`**: longest-prefix selection helper.
- Wired through `LoadTable`, `CreateTable`, `RegisterTable`, and
`StageCreateTable`.
(`UpdateTable` is intentionally left alone — its `CommitTableResponse`
does not carry
storage credentials in the spec.)
## Known limitations / follow-ups
Surfaced during self-review; all are pre-existing structural gaps that this
PR narrows
but does not fully close. Happy to file issues / follow up on these:
1. **Commit path**: `UpdateTable` still returns a `Table` bound to the
shared catalog
FileIO (the spec's `CommitTableResponse` carries no credentials), so a
table obtained
from a commit loses vended credentials that the equivalent `LoadTable`
would have.
2. **Multi-prefix tables**: one credential is selected against the table's
base location
and baked into a single FileIO. Tables whose data/metadata span multiple
credential
prefixes (e.g. split `write.data.path` / `write.metadata.path`) would
need per-path
credential routing (Java routes per-prefix inside `S3FileIO`).
3. **Credential expiry/refresh**: vended credentials are typically
short-lived STS
tokens; there is no refresh hook yet. The `.../credentials` endpoint
(`Endpoint::TableCredentials()`) exists and is the natural follow-up for
refresh.
4. **No FileIO caching**: a new FileIO is constructed per load when the
response carries
table config or credentials; a cache keyed by merged properties would
avoid repeated
S3 client construction.
5. The client does not yet send `X-Iceberg-Access-Delegation:
vended-credentials` by
default (Java does); servers that gate vending on that header require it
to be set
via the `header.` catalog property for now.
## Testing
- Full REST unit suite passing (`rest_catalog_test`, 284 tests).
- New cases:
- serde round-trip and deserialize of `storage-credentials` (incl. multiple
credentials);
- `MatchStorageCredential` longest-prefix / no-match;
- `MakeTableFileIO` reuses the catalog FileIO when there are no overrides,
and applies
the most-specific vended credential over both the catalog and table
config.
- Manually verified against a credential-vending REST catalog: `LoadTable`
parses the
vended credentials and `MakeTableFileIO` selects and applies them when
constructing
the per-table FileIO.
This pull request and its description were written by Isaac.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]