Re: [PR] Add KMS support for S3 [polaris]

via GitHub Wed, 07 May 2025 09:58:42 -0700


collado-mike commented on PR #1424:
URL: https://github.com/apache/polaris/pull/1424#issuecomment-2859349730

> > I know there is some churn in the encryption configuration in Iceberg,
but I also know that s3.sse.key is a property that's currently used to specify
the KMS key associated with a table.
>
> The way I read
[this](https://iceberg.apache.org/docs/1.6.0/aws/#s3-server-side-encryption), I
believe the `s3.sse.key` property is on the catalog rather than the table - and
I don't see this key being persisted in the [table metadata
either](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/TableMetadata.java).
I'm not an Iceberg expert, but I'm guessing that won't be right to do since
the user can change this input in their session and we would not know if they
can have access to this KMS key from the session itself?

Because of the way we construct the `FileIO` in Polaris, we can actually
pass in whatever properties we want:
https://github.com/apache/polaris/blob/b93e97b9c5c97cb813696df50aee7550f21b155c/service/common/src/main/java/org/apache/polaris/service/catalog/io/DefaultFileIOFactory.java#L88-L109
. We also return the `config` field of the `LoadTableResponse` with the same
properties, so the client would be able to read/write the files with the
correct encryption key.

Notably, this PR handles the permissions side of the KMS d/encryption, but
doesn't specify how the table is configured to ensure files are written with
encryption. Enabling setting the key in the table properties and exposing them
via the `config` field of the `LoadTableResponse` would ensure that both the
service and the client are using the same key to write data and metadata files.
As of the current revision, the only way to ensure the files are written with
the same key is by relying on the S3 SSE bucket configuration. That means that
cases such as object-store layout (where a random prefix is inserted before the
table path) couldn't ensure that different tables are encrypted with different
keys.

To be very clear, I _really, really, really_ want to see this KMS support
merged. But I really want it because it means that we don't have to rely solely
on path prefixes to guarantee security of the credentials we vend. Currently,
we have to prevent table path overlap because otherwise a single set of
credentials could be used to read data from tables the user doesn't have access
to. By supporting SSE and credentials that specify table-level encryption keys,
we can allow for table location overlap _and_ still guarantee that a given set
of credentials can only be used for the table they were generated for. But that
requires a very targeted IAM policy - even the encryption context alone isn't
enough to support object-store layout without the ability to decrypt files for
every single table under the same prefix. In my mind, both the encryption
context and the specific KMS key are required to support that. For that,
Approach 3 is the only viable approach.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Add KMS support for S3 [polaris]

Reply via email to