nastra commented on code in PR #10926:
URL: https://github.com/apache/iceberg/pull/10926#discussion_r1718548173


##########
core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java:
##########
@@ -63,7 +63,11 @@ public class HadoopFileIO implements HadoopConfigurable, DelegateFileIO {
    * <p>{@link Configuration Hadoop configuration} must be set through {@link
    * HadoopFileIO#setConf(Configuration)}
    */
-  public HadoopFileIO() {}
+  public HadoopFileIO() {
+    // Create a default hadoopConf as it is required for the object to be valid.
+    // E.g. newInputFile would throw NPE with hadoopConf.get() otherwise.
+    this.hadoopConf = new SerializableConfiguration(new Configuration())::get;

Review Comment:
   FYI, there's a similar NPE issue in `ResolvingFileIO` that I looked at 
(https://github.com/apache/iceberg/pull/10872). In `ResolvingFileIO` we 
should probably default to a `null` config, while here it seems to make more 
sense to create a new config if the configured one is null.
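
One way to read "create a new config if the configured one is null" is a lazy fallback at access time instead of in the constructor. The following is only a minimal sketch under that assumption: `DefaultingConfSketch`, `defaultFactory`, and `conf()` are hypothetical names, and a generic type `C` stands in for Hadoop's `Configuration` so the snippet compiles without Hadoop on the classpath. It is not the code in this PR.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical stand-in for a FileIO-like class that holds a config supplier.
// Not thread-safe; a real implementation would need to consider that.
final class DefaultingConfSketch<C> {
  // Creates the fallback config, e.g. Configuration::new in the Hadoop case (assumption).
  private final Supplier<C> defaultFactory;

  // Stays null until setConf is called, like hadoopConf after the no-arg constructor.
  private Supplier<C> conf;

  DefaultingConfSketch(Supplier<C> defaultFactory) {
    this.defaultFactory = Objects.requireNonNull(defaultFactory, "defaultFactory");
  }

  void setConf(C value) {
    this.conf = () -> value;
  }

  C conf() {
    if (conf == null) {
      // Calling conf.get() here would throw an NPE, so create the default
      // once and keep it, so later calls return the same instance.
      C created = defaultFactory.get();
      this.conf = () -> created;
    }
    return conf.get();
  }
}
```

With this shape, an explicitly configured value always wins, and the default is only created when something actually asks for the config.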
   
   > I am wondering if we should change FileIOParser to serialize and 
deserialize Hadoop Configuration when the FileIO is HadoopConfigurable. We can 
probably only serialize the key-value string pairs from the Configuration as a 
JSON object (kind of a read only copy).
   
   Yes, I had similar thoughts on this topic. Let me think about this for a bit, 
and then we can discuss how we'd like to approach/improve this situation.
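
To make the quoted idea concrete, here is a rough sketch of writing only the key-value string pairs as a JSON object. `ConfJsonSketch` and `toJson` are hypothetical names, not part of `FileIOParser`. The method takes an `Iterable<Map.Entry<String, String>>`, which Hadoop's `Configuration` implements, so the snippet runs without Hadoop on the classpath. The hand-rolled escaping is for illustration only; real code would presumably use the JSON library the project already depends on.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: a read-only JSON copy of configuration key-value pairs.
final class ConfJsonSketch {
  private ConfJsonSketch() {}

  static String toJson(Iterable<Map.Entry<String, String>> conf) {
    // Copy into a sorted map so the output order is deterministic.
    Map<String, String> sorted = new TreeMap<>();
    for (Map.Entry<String, String> entry : conf) {
      sorted.put(entry.getKey(), entry.getValue());
    }

    StringBuilder json = new StringBuilder("{");
    Iterator<Map.Entry<String, String>> it = sorted.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, String> entry = it.next();
      json.append(quote(entry.getKey())).append(':').append(quote(entry.getValue()));
      if (it.hasNext()) {
        json.append(',');
      }
    }
    return json.append('}').toString();
  }

  // Minimal JSON string escaping: quotes, backslashes, and control characters.
  private static String quote(String value) {
    StringBuilder out = new StringBuilder("\"");
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (c == '"') {
        out.append("\\\"");
      } else if (c == '\\') {
        out.append("\\\\");
      } else if (c < 0x20) {
        out.append(String.format("\\u%04x", (int) c));
      } else {
        out.append(c);
      }
    }
    return out.append('"').toString();
  }

  public static void main(String[] args) {
    // A plain map stands in for a Hadoop Configuration in this usage example.
    Map<String, String> conf = new TreeMap<>();
    conf.put("fs.defaultFS", "hdfs://namenode:8020");
    System.out.println(toJson(conf.entrySet()));
  }
}
```

Deserialization would be the mirror image: read the JSON object and call `set(key, value)` on a fresh `Configuration` for each pair. Whether every property should be copied or only a subset is one of the open questions for that discussion.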



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

