ggershinsky commented on code in PR #13066: URL: https://github.com/apache/iceberg/pull/13066#discussion_r2378179070
########## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ########## @@ -88,6 +91,7 @@ public class HiveCatalog extends BaseMetastoreViewCatalog private String name; private Configuration conf; private FileIO fileIO; + private KeyManagementClient keyManagementClient; Review Comment: Agreed wrt relevance, but there are a couple of issues: - Security. If table properties are fetched by a catalog from the storage backend (eg the metadata file), then a "key removal attack" becomes possible. In this attack, the key id is removed from the table properties by tampering with the metadata file. This makes the writers produce unencrypted data files.. Therefore, it is important to make sure each catalog client gets the table key id directly from its catalog service, and not from the storage. Also, one of the classes that extend BaseMetastoreCatalog is the HadoopCatalog, which is not safe for encryption because it does not have a catalog service independent of the storage backend. - Code re-use is thin. Most of the PR logic is in the "table operations". But the "newTableOps" is an abstract method in BaseMetastoreCatalog; which makes sense since many things (inc table key retrieval) are catalog-specific. Maybe we should start with the Hive catalog? (Plus the REST catalog in another PR). Then, as more catalogs are added, we can check if/how we can safely move a common encryption code into a parent or util? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
