stevenzwu commented on code in PR #10926:
URL: https://github.com/apache/iceberg/pull/10926#discussion_r1720058500

##########
core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java:
##########
@@ -63,7 +63,11 @@ public class HadoopFileIO implements HadoopConfigurable, DelegateFileIO {
    * <p>{@link Configuration Hadoop configuration} must be set through {@link
    * HadoopFileIO#setConf(Configuration)}
    */
-  public HadoopFileIO() {}
+  public HadoopFileIO() {
+    // Create a default hadoopConf as it is required for the object to be valid.
+    // E.g. newInputFile would throw NPE with hadoopConf.get() otherwise.
+    this.hadoopConf = new SerializableConfiguration(new Configuration())::get;

Review Comment:
   I have two concerns with the above approach:
   1. It is not a JSON serialization of the Hadoop config; it is Java serialization.
   2. We are serializing the whole `FileIO` and storing the bytes as `hadoopConf`, which is conceptually incorrect.
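
As a rough illustration of the distinction behind the two concerns above (this is not code from the PR), the sketch below contrasts Java serialization of a whole object with a JSON rendering of only its configuration entries. `FakeFileIO`, `ConfSerializationSketch`, the `TreeMap` standing in for a Hadoop `Configuration`, and the `hdfs://namenode:8020` value are all hypothetical names chosen for the example; only the JDK is used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

public class ConfSerializationSketch {

  // Hypothetical stand-in for a FileIO: it holds configuration plus unrelated state.
  static class FakeFileIO implements Serializable {
    private static final long serialVersionUID = 1L;
    final TreeMap<String, String> conf = new TreeMap<>();
    final String unrelatedState = "not part of the configuration";
  }

  // Java serialization of the whole object: an opaque, JVM-specific byte stream.
  static byte[] javaSerialize(Serializable obj) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(obj);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // JSON rendering of only the key/value pairs, in sorted key order.
  static String confToJson(Map<String, String> conf) {
    StringBuilder sb = new StringBuilder("{");
    String sep = "";
    for (Map.Entry<String, String> e : new TreeMap<>(conf).entrySet()) {
      sb.append(sep).append(quote(e.getKey())).append(':').append(quote(e.getValue()));
      sep = ",";
    }
    return sb.append('}').toString();
  }

  // Minimal escaping (backslash and double quote only); enough for this sketch.
  static String quote(String s) {
    return '"' + s.replace("\\", "\\\\").replace("\"", "\\\"") + '"';
  }

  public static void main(String[] args) {
    FakeFileIO io = new FakeFileIO();
    io.conf.put("fs.defaultFS", "hdfs://namenode:8020");

    byte[] blob = javaSerialize(io);
    System.out.println("Java-serialized whole object: " + blob.length + " bytes");
    System.out.println("JSON of the config only: " + confToJson(io.conf));
  }
}
```

In this sketch the first form captures everything reachable from the object, including `unrelatedState`, and can only be read back by a JVM with compatible classes. The second form contains just the configuration entries in a language-neutral text format.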