nastra commented on code in PR #10926: URL: https://github.com/apache/iceberg/pull/10926#discussion_r1716349267
########## core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java: ########## @@ -63,7 +63,11 @@ public class HadoopFileIO implements HadoopConfigurable, DelegateFileIO { * <p>{@link Configuration Hadoop configuration} must be set through {@link * HadoopFileIO#setConf(Configuration)} */ - public HadoopFileIO() {} + public HadoopFileIO() { + // Create a default hadoopConf as it is required for the object to be valid. + // E.g. newInputFile would throw NPE with hadoopConf.get() otherwise. + this.hadoopConf = new SerializableConfiguration(new Configuration())::get; Review Comment: I thought about this problem a bit more. The one downside I see with this change is that `HadoopFileIO` will typically be instantiated via `CatalogUtil.loadFileIO`, which will then set the `Conf`. Also the javadoc of the constructor mentions that the conf must be set by calling `setConf()`. Instead of serializing the conf in the Parser, maybe the conf should be initialized in `FileIOParser.fromJson(String)` instead of passing a null conf in https://github.com/apache/iceberg/blob/7fe4037e778583489ccdf6e06a78645384d4b9f3/core/src/main/java/org/apache/iceberg/io/FileIOParser.java#L68? I think that would solve the issue while also not instantiating a new Conf every time a new instance of `HadoopFileIO` is created? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org