mbutrovich commented on code in PR #2584:
URL: https://github.com/apache/iceberg-rust/pull/2584#discussion_r3414713299


##########
crates/iceberg/src/arrow/reader/pipeline.rs:
##########
@@ -431,14 +437,44 @@ impl ArrowReader {
         )
         .with_parquet_read_options(parquet_read_options);
 
-        let arrow_metadata = ArrowReaderMetadata::load_async(&mut reader, Default::default())
+        let arrow_reader_options = Self::build_arrow_reader_options(key_metadata)?;
+
+        let arrow_metadata = ArrowReaderMetadata::load_async(&mut reader, arrow_reader_options)
             .await
             .map_err(|e| {
                 Error::new(ErrorKind::Unexpected, "Failed to load Parquet metadata").with_source(e)
             })?;
 
         Ok((reader, arrow_metadata))
     }
+
+    /// Builds `ArrowReaderOptions`, adding `FileDecryptionProperties` when
+    /// key metadata is present for Parquet Modular Encryption.
+    fn build_arrow_reader_options(key_metadata: Option<&[u8]>) -> Result<ArrowReaderOptions> {
+        match key_metadata {
+            Some(km) => {
+                let standard_key_metadata = StandardKeyMetadata::decode(km)?;
+                let mut builder = FileDecryptionProperties::builder(
+                    standard_key_metadata.encryption_key().as_bytes().to_vec(),

Review Comment:
   The decoded DEK is passed straight to 
`FileDecryptionProperties::builder(key)`, so a malformed key currently surfaces 
as arrow-rs's generic build/decrypt error. Would it be worth adding an explicit 
check that the key is a valid AES length (16, 24, or 32 bytes) and returning a 
clear `iceberg::Error`? Key length is a real invariant, and checking it here 
gives a better message than the downstream failure.
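
   A minimal sketch of what such a check could look like, assuming a 
standalone helper. `KeyLengthError` and `validate_aes_key_len` are hypothetical 
names used only for illustration; in the PR the error would presumably be an 
`iceberg::Error` with a suitable `ErrorKind`, and the check would run before 
the key is handed to `FileDecryptionProperties::builder`.

```rust
use std::fmt;

/// Hypothetical stand-in for `iceberg::Error`, so the sketch compiles on its own.
#[derive(Debug, PartialEq)]
struct KeyLengthError {
    /// Length in bytes of the rejected key.
    actual: usize,
}

impl fmt::Display for KeyLengthError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "invalid data encryption key length: {} bytes (expected 16, 24, or 32)",
            self.actual
        )
    }
}

impl std::error::Error for KeyLengthError {}

/// Returns the key unchanged when its length is a valid AES key size
/// (AES-128, AES-192, or AES-256), and a descriptive error otherwise.
fn validate_aes_key_len(key: &[u8]) -> Result<&[u8], KeyLengthError> {
    match key.len() {
        16 | 24 | 32 => Ok(key),
        actual => Err(KeyLengthError { actual }),
    }
}
```

   Reporting only the length keeps the key material itself out of error 
messages and logs.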



##########
crates/iceberg/src/arrow/reader/pipeline.rs:
##########
@@ -431,14 +437,44 @@ impl ArrowReader {
         )
         .with_parquet_read_options(parquet_read_options);
 
-        let arrow_metadata = ArrowReaderMetadata::load_async(&mut reader, Default::default())
+        let arrow_reader_options = Self::build_arrow_reader_options(key_metadata)?;
+
+        let arrow_metadata = ArrowReaderMetadata::load_async(&mut reader, arrow_reader_options)
             .await
             .map_err(|e| {
                 Error::new(ErrorKind::Unexpected, "Failed to load Parquet metadata").with_source(e)
             })?;
 
         Ok((reader, arrow_metadata))
     }
+
+    /// Builds `ArrowReaderOptions`, adding `FileDecryptionProperties` when
+    /// key metadata is present for Parquet Modular Encryption.
+    fn build_arrow_reader_options(key_metadata: Option<&[u8]>) -> Result<ArrowReaderOptions> {
+        match key_metadata {
+            Some(km) => {
+                let standard_key_metadata = StandardKeyMetadata::decode(km)?;

Review Comment:
   `StandardKeyMetadata.file_length` is parsed but unused; the size used for 
reading comes from `task.file_size_in_bytes`. That matches Java, where 
`fileLength()` has no consumer on the native-encryption path (it is for AGS1 
stream decryption, not PME data files), so this looks correct. Not worth 
asserting `file_length == file_size_in_bytes`: `file_length` is optional and 
the spec does not guarantee the two are equal.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

