stevenzwu commented on code in PR #10926:
URL: https://github.com/apache/iceberg/pull/10926#discussion_r1733692980


##########
core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java:
##########
@@ -63,7 +63,11 @@ public class HadoopFileIO implements HadoopConfigurable, DelegateFileIO {
    * <p>{@link Configuration Hadoop configuration} must be set through {@link
    * HadoopFileIO#setConf(Configuration)}
    */
-  public HadoopFileIO() {}
+  public HadoopFileIO() {
+    // Create a default hadoopConf as it is required for the object to be valid.
+    // E.g. newInputFile would throw NPE with hadoopConf.get() otherwise.
+    this.hadoopConf = new SerializableConfiguration(new Configuration())::get;

Review Comment:
   I am leaning toward implementing JSON serialization of the Hadoop configuration
   as string key-value pairs. The only downside is the dependency on a Hadoop class
   in `FileIOParser`.
   
   An alternative is to merge the Hadoop configuration key-value pairs into
   the `Map<String, String>` properties. `setConf` would load the Hadoop
   config key-value pairs into the properties map at lower priority, and the
   map properties would be passed along to the Hadoop configuration as
   higher-priority overrides. This would avoid the dependency on a Hadoop class
   in `FileIOParser`.
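
   A minimal sketch of the merge alternative, to make the priority order concrete. It is not taken from the PR: the class and method names (`ConfMergeSketch`, `mergeIntoProperties`, `applyOverrides`) are hypothetical, and plain `java.util` types stand in for Hadoop's `Configuration` (which is iterable as `Map.Entry<String, String>` and exposes `set(String, String)`), so the example compiles without a Hadoop dependency.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper for illustration only; not part of Iceberg or of this PR.
final class ConfMergeSketch {
  private ConfMergeSketch() {}

  // Copies the config entries first, then overlays the explicit properties,
  // so a key present in both keeps the value from the properties map.
  static Map<String, String> mergeIntoProperties(
      Iterable<Map.Entry<String, String>> hadoopConfEntries,
      Map<String, String> properties) {
    Map<String, String> merged = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : hadoopConfEntries) {
      merged.put(entry.getKey(), entry.getValue());
    }
    merged.putAll(properties);
    return merged;
  }

  // Pushes the properties back onto a configuration-like setter.
  // With Hadoop on the classpath, the setter could be conf::set (assumption).
  static void applyOverrides(
      Map<String, String> properties, BiConsumer<String, String> confSetter) {
    properties.forEach(confSetter);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("fs.defaultFS", "hdfs://namenode:8020");
    conf.put("io.file.buffer.size", "4096");

    Map<String, String> props = new LinkedHashMap<>();
    props.put("io.file.buffer.size", "65536");

    // The value from props wins for io.file.buffer.size.
    System.out.println(mergeIntoProperties(conf.entrySet(), props));
  }
}
```

   With this shape, `FileIOParser` would only ever see a `Map<String, String>`, which is the point of the alternative; the trade-off is that every Hadoop setting gets copied into the serialized properties.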
   
    @rdblue @Fokko any feedback too?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

