huan233usc opened a new pull request, #729:
URL: https://github.com/apache/iceberg-cpp/pull/729

   ## What & why
   
   The REST catalog already declares the 
`.../namespaces/{ns}/tables/{tbl}/credentials`
   endpoint path, but `LoadTableResult` discards the `storage-credentials` the 
server
   returns. As a result, catalogs that vend per-table storage credentials 
cannot be used
   end to end — the client has no credentials to read/write the table's data 
files.
   
   This wires up vended storage credentials and, in doing so, resolves the 
existing
   `FIXME: support per-table FileIO creation` in `RestCatalog::LoadTable`.
   
   ## What changed
   
   - **`StorageCredential` type** (`prefix` + `config`) and a parsed
     `LoadTableResult.storage_credentials` list, with JSON round-trip support in
     `catalog/rest/json_serde.cc`.
   - **`MakeTableFileIO`** (`catalog/rest/rest_file_io.cc`): builds a 
table-scoped FileIO by
     layering, in increasing precedence, the catalog configuration → the 
table's `config` →
     the most-specific (longest-prefix) vended credential, following the REST 
spec guidance
     to "choose the most specific prefix". When the table supplies no 
overrides, the shared
     catalog FileIO instance is reused. The existing `MakeCatalogFileIO` is 
untouched, so
     there is no behavior change for non-vending servers.
   - **`MatchStorageCredential`**: longest-prefix selection helper.
   - Wired through `LoadTable`, `CreateTable`, `RegisterTable`, and 
`StageCreateTable`.
     (`UpdateTable` is intentionally left alone — its `CommitTableResponse` 
does not carry
     storage credentials in the spec.)
   
   ## Known limitations / follow-ups
   
   Surfaced during self-review; all are pre-existing structural gaps that this 
PR narrows
   but does not fully close. Happy to file issues / follow up on these:
   
   1. **Commit path**: `UpdateTable` still returns a `Table` bound to the 
shared catalog
      FileIO (the spec's `CommitTableResponse` carries no credentials), so a 
table obtained
      from a commit loses vended credentials that the equivalent `LoadTable` 
would have.
   2. **Multi-prefix tables**: one credential is selected against the table's 
base location
      and baked into a single FileIO. Tables whose data/metadata span multiple 
credential
      prefixes (e.g. split `write.data.path` / `write.metadata.path`) would 
need per-path
      credential routing (Java routes per-prefix inside `S3FileIO`).
   3. **Credential expiry/refresh**: vended credentials are typically 
short-lived STS
      tokens; there is no refresh hook yet. The `.../credentials` endpoint
      (`Endpoint::TableCredentials()`) exists and is the natural follow-up for 
refresh.
   4. **No FileIO caching**: a new FileIO is constructed per load when the 
response carries
      table config or credentials; a cache keyed by merged properties would 
avoid repeated
      S3 client construction.
   5. The client does not yet send `X-Iceberg-Access-Delegation: 
vended-credentials` by
      default (Java does); servers that gate vending on that header require it 
to be set
      via the `header.` catalog property for now.
   
   ## Testing
   
   - Full REST unit suite passing (`rest_catalog_test`, 284 tests).
   - New cases:
     - serde round-trip and deserialize of `storage-credentials` (incl. multiple
       credentials);
     - `MatchStorageCredential` longest-prefix / no-match;
     - `MakeTableFileIO` reuses the catalog FileIO when there are no overrides, 
and applies
       the most-specific vended credential over both the catalog and table 
config.
   - Manually verified against a credential-vending REST catalog: `LoadTable` 
parses the
     vended credentials and `MakeTableFileIO` selects and applies them when 
constructing
     the per-table FileIO.
   
   This pull request and its description were written by Isaac.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to