ggershinsky commented on code in PR #7770:
URL: https://github.com/apache/iceberg/pull/7770#discussion_r1546070409


##########
core/src/main/java/org/apache/iceberg/SnapshotParser.java:
##########
@@ -93,6 +95,16 @@ static void toJson(Snapshot snapshot, JsonGenerator 
generator) throws IOExceptio
       generator.writeNumberField(SCHEMA_ID, snapshot.schemaId());
     }
 
+    if (snapshot.manifestListKeyMetadata() != null) {
+      generator.writeStringField(MANIFEST_LIST_KEY_METADATA, 
snapshot.manifestListKeyMetadata());
+    }
+
+    // TODO discuss: do we need to sign the size value? Or sign the whole 
snapshot?

Review Comment:
   This is the question. Some thoughts on the scenarios and protection options:
   - currently, we don't have a metadata.json key. We have only a key for 
snapshot's manifest list file. Besides using it for encrypting the manifest 
list file, we can also use this key for signing snapshot's sensitive parts like 
the manifest list size field. Or for signing the whole metadata.json file 
(should be possible with some effort) - then we also protect the integrity of 
e.g. the table properties (like the table key id).
   - snapshot (metadata.json file) doesn't keep secret values, so encrypting it 
might not be required. The signatures, mentioned above, would be kept in added 
snapshot fields - sufficient for detecting the file modification attacks.
   - these protection techniques are not required with the REST catalog - 
because we trust the catalog service (we don't trust the storage service). 
Since the whole snapshot is stored in the REST catalog, we don't need to sign 
anything.
   - the manifest list key is not stored in the catalog. Instead, it is wrapped 
in a KMS with the table master key, and stored in the snapshot 
MANIFEST_LIST_KEY_METADATA field. Only the KMS-authorized (for the table key) 
users/processes will be able to get the manifest list key.
   - In catalogs other than the REST, the signatures provide a partial 
protection - because the metadata.json is kept in the untrusted storage. With 
the signatures, it can't be modified. But the whole folder can be replaced 
(e.g. a replay attack - where all table files are removed, and replaced with 
files of an older version of the table). To prevent this attack in non-REST 
catalogs, we will have to update the catalog per each table snapshot (setting 
eg the latest table version/sequence number, or a random AAD prefix)



##########
api/src/main/java/org/apache/iceberg/Snapshot.java:
##########
@@ -162,6 +162,25 @@ default Iterable<DeleteFile> removedDeleteFiles(FileIO io) 
{
    */
   String manifestListLocation();
 
+  /**
+   * Return the size of this snapshot's manifest list file. Must be a verified 
value, taken from a

Review Comment:
   We can define this field to be required only for encrypted tables. It will 
be not set in the snapshot file for unencrypted tables - where this method can 
return 0 (or -1, I'll make it consistent across all implementation classes).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to