pvary commented on code in PR #10926:
URL: https://github.com/apache/iceberg/pull/10926#discussion_r1739186116


##########
core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java:
##########
@@ -63,7 +63,11 @@ public class HadoopFileIO implements HadoopConfigurable, DelegateFileIO {
    * <p>{@link Configuration Hadoop configuration} must be set through {@link
    * HadoopFileIO#setConf(Configuration)}
    */
-  public HadoopFileIO() {}
+  public HadoopFileIO() {
+    // Create a default hadoopConf as it is required for the object to be valid.
+    // E.g. newInputFile would throw NPE with hadoopConf.get() otherwise.
+    this.hadoopConf = new SerializableConfiguration(new Configuration())::get;

Review Comment:
   > but it still has the implication that it depends on the runtime env. if the other side (deserialization) has a different env (default config), this can be different.
   
   Yes, but I think (not tested) we suffer from the same issue if the configuration provided through the Catalog is created as `Configuration(false)`.
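
A minimal sketch of the runtime-env concern quoted above, using only `java.util.Properties`. `DefaultConfSketch`, `newConf`, and the `fs.defaultFS` values are hypothetical stand-ins chosen for illustration; this is not Hadoop or Iceberg code, and it only assumes that a "load defaults" flag seeds the configuration from whatever the local environment provides.

```java
import java.util.Properties;

// Hypothetical stand-in, not Hadoop or Iceberg code.
class DefaultConfSketch {

  // Mimics a config constructor with a "load defaults" flag: when true, the
  // result is seeded from whatever the local environment provides.
  static Properties newConf(boolean loadDefaults, Properties localEnvDefaults) {
    Properties conf = new Properties();
    if (loadDefaults) {
      conf.putAll(localEnvDefaults);
    }
    return conf;
  }

  public static void main(String[] args) {
    // Assumed environments for the serializing and deserializing sides.
    Properties writerEnv = new Properties();
    writerEnv.setProperty("fs.defaultFS", "hdfs://writer");
    Properties readerEnv = new Properties();
    readerEnv.setProperty("fs.defaultFS", "hdfs://reader");

    // Implicit defaults: each side picks up its own environment.
    System.out.println(newConf(true, writerEnv).getProperty("fs.defaultFS"));
    System.out.println(newConf(true, readerEnv).getProperty("fs.defaultFS"));

    // No defaults: the config starts empty and only sees explicitly set keys.
    System.out.println(newConf(false, writerEnv).isEmpty());
  }
}
```

In this sketch the two sides disagree on the default value whenever it is resolved locally instead of being carried along with the serialized object.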
   
   > FileIO is used to read from manifest file. ManifestFiles.read is a widely used API.
   
   Yes, but in this case the reader provides the FileIO independently of the Task.
   
   > only original FileIO and EncryptionManager need to be serialized.
   
   Yes. Encryption should be handled 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

