collado-mike commented on PR #1424:
URL: https://github.com/apache/polaris/pull/1424#issuecomment-2859349730

   > > I know there is some churn in the encryption configuration in Iceberg, 
but I also know that s3.sse.key is a property that's currently used to specify 
the KMS key associated with a table.
   > 
   > The way I read 
[this](https://iceberg.apache.org/docs/1.6.0/aws/#s3-server-side-encryption), I 
believe the `s3.sse.key` property is on the catalog rather than the table - and 
I don't see this key being persisted in the [table metadata 
either](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/TableMetadata.java).
 I'm not an Iceberg expert, but I'm guessing that won't be right to do since 
the user can change this input in their session and we would not know if they 
can have access to this KMS key from the session itself?
   
   Because of the way we construct the `FileIO` in Polaris, we can actually 
pass in whatever properties we want: 
https://github.com/apache/polaris/blob/b93e97b9c5c97cb813696df50aee7550f21b155c/service/common/src/main/java/org/apache/polaris/service/catalog/io/DefaultFileIOFactory.java#L88-L109
 . We also return the `config` field of the `LoadTableResponse` with the same 
properties, so the client would be able to read/write the files with the 
correct encryption key. 
   
   Notably, this PR handles the permissions side of the KMS d/encryption, but 
doesn't specify how the table is configured to ensure files are written with 
encryption. Enabling setting the key in the table properties and exposing them 
via the `config` field of the `LoadTableResponse` would ensure that both the 
service and the client are using the same key to write data and metadata files. 
As of the current revision, the only way to ensure the files are written with 
the same key is by relying on the S3 SSE bucket configuration. That means that 
cases such as object-store layout (where a random prefix is inserted before the 
table path) couldn't ensure that different tables are encrypted with different 
keys.
   
   To be very clear, I _really, really, really_ want to see this KMS support 
merged. But I really want it because it means that we don't have to rely solely 
on path prefixes to guarantee security of the credentials we vend. Currently, 
we have to prevent table path overlap because otherwise a single set of 
credentials could be used to read data from tables the user doesn't have access 
to. By supporting SSE and credentials that specify table-level encryption keys, 
we can allow for table location overlap _and_ still guarantee that a given set 
of credentials can only be used for the table they were generated for. But that 
requires a very targeted IAM policy - even the encryption context alone isn't 
enough to support object-store layout without the ability to decrypt files for 
every single table under the same prefix. In my mind, both the encryption 
context and the specific KMS key are required to support that. For that, 
Approach 3 is the only viable approach.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to