rdblue commented on code in PR #9592:
URL: https://github.com/apache/iceberg/pull/9592#discussion_r1493844182


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseReader.java:
##########
@@ -184,25 +181,15 @@ protected InputFile getInputFile(String location) {
 
   private Map<String, InputFile> inputFiles() {
     if (lazyInputFiles == null) {
-      Stream<EncryptedInputFile> encryptedFiles =
-          taskGroup.tasks().stream().flatMap(this::referencedFiles).map(this::toEncryptedInputFile);
-
-      // decrypt with the batch call to avoid multiple RPCs to a key server, if possible
-      Iterable<InputFile> decryptedFiles = table.encryption().decrypt(encryptedFiles::iterator);
-
-      Map<String, InputFile> files = Maps.newHashMapWithExpectedSize(taskGroup.tasks().size());
-      decryptedFiles.forEach(decrypted -> files.putIfAbsent(decrypted.location(), decrypted));
-      this.lazyInputFiles = ImmutableMap.copyOf(files);
+      this.lazyInputFiles =
+          EncryptingFileIO.create(table().io(), table().encryption())
+              .bulkDecrypt(
+                  () -> taskGroup.tasks().stream().flatMap(this::referencedFiles).iterator());
Review Comment:
   These changes demonstrate how `EncryptingFileIO` is cleaner for bulk decryption. This section has been broken several times by contributors attempting to simplify the logic, only to end up breaking the bulk decrypt behavior.
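
   To illustrate why the bulk decrypt path is worth protecting, here is a minimal, library-free sketch of the RPC-count concern. None of these names (`KeyServer`, `unwrapAll`, `decryptOneByOne`, `decryptInBulk`) are Iceberg APIs; they only model the difference between unwrapping one key per file and unwrapping all keys in a single batch call to a key server, which is the behavior past refactors accidentally lost.

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical sketch, not Iceberg code: models why a single bulk decrypt
   // call beats per-file decryption when keys live on a remote key server.
   public class BulkDecryptSketch {
     // Stand-in for a key server; counts round trips it receives.
     static class KeyServer {
       int rpcCount = 0;

       // One RPC that unwraps a whole batch of wrapped keys at once.
       List<String> unwrapAll(List<String> wrappedKeys) {
         rpcCount++;
         List<String> keys = new ArrayList<>();
         for (String wrapped : wrappedKeys) {
           keys.add("key-for-" + wrapped);
         }
         return keys;
       }
     }

     // Naive approach: one RPC per file — the regression that simplifying
     // refactors tend to reintroduce.
     static int decryptOneByOne(KeyServer server, List<String> files) {
       for (String file : files) {
         server.unwrapAll(List.of(file)); // separate RPC for each file
       }
       return server.rpcCount;
     }

     // Bulk approach: gather every wrapped key first, then make one RPC
     // for the whole task group.
     static int decryptInBulk(KeyServer server, List<String> files) {
       server.unwrapAll(files); // single batch RPC
       return server.rpcCount;
     }

     public static void main(String[] args) {
       List<String> files = List.of("a.parquet", "b.parquet", "c.parquet");
       System.out.println("one-by-one RPCs: " + decryptOneByOne(new KeyServer(), files));
       System.out.println("bulk RPCs: " + decryptInBulk(new KeyServer(), files));
     }
   }
   ```

   With N files, the naive path costs N round trips while the bulk path costs one, which is why the rewritten code routes everything through a single `bulkDecrypt` call.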



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
