singhpk234 commented on code in PR #13459:
URL: https://github.com/apache/iceberg/pull/13459#discussion_r2183779315
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteTablePathSparkAction.java:
##########
@@ -312,22 +316,24 @@ private String rebuildMetadata() {
   }
 
   private String saveFileList(Set<Pair<String, String>> filesToMove) {
-    List<Tuple2<String, String>> fileList =
-        filesToMove.stream()
-            .map(p -> Tuple2.apply(p.first(), p.second()))
-            .collect(Collectors.toList());
-    Dataset<Tuple2<String, String>> fileListDataset =
-        spark().createDataset(fileList, Encoders.tuple(Encoders.STRING(), Encoders.STRING()));
     String fileListPath = stagingDir + RESULT_LOCATION;
-    fileListDataset
-        .repartition(1)
-        .write()
-        .mode(SaveMode.Overwrite)
-        .format("csv")
-        .save(fileListPath);
+    OutputFile fileList = table.io().newOutputFile(fileListPath);

Review Comment:
   [doubt] The staging location defaults to the table's metadata path, but it can be set to anything, right? If that's the case:
   1. What if the table's FileIO doesn't have credentials to write to the staging directory, but Spark does? Would this now cause failures?
   2. What if the staging directory is on local disk while the table's FileIO points to an object store? Would those workloads now fail?
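   For context on the question, here is a minimal sketch of what the FileIO-based write could look like. Only the method signature, `stagingDir + RESULT_LOCATION`, and `table.io().newOutputFile(fileListPath)` come from the diff above; the stream handling, the "source,target" line format, and the error handling are assumptions, not the PR's actual code:

   import java.io.IOException;
   import java.io.OutputStreamWriter;
   import java.io.UncheckedIOException;
   import java.io.Writer;
   import java.nio.charset.StandardCharsets;
   import java.util.Set;
   import org.apache.iceberg.io.OutputFile;
   import org.apache.iceberg.io.PositionOutputStream;
   import org.apache.iceberg.util.Pair;

   // Sketch of a method that would live in RewriteTablePathSparkAction; the PR's
   // actual body is truncated in the diff above, so details here are illustrative.
   private String saveFileList(Set<Pair<String, String>> filesToMove) {
     String fileListPath = stagingDir + RESULT_LOCATION;
     // The write goes through the table's FileIO, so that FileIO must be able to
     // reach stagingDir (credentials, filesystem scheme), independent of what
     // Spark itself can write to.
     OutputFile fileList = table.io().newOutputFile(fileListPath);
     try (PositionOutputStream out = fileList.createOrOverwrite();
         Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
       for (Pair<String, String> pair : filesToMove) {
         // one "sourcePath,targetPath" line per file to move (format assumed)
         writer.write(pair.first() + "," + pair.second() + "\n");
       }
     } catch (IOException e) {
       throw new UncheckedIOException("Failed to write file list to " + fileListPath, e);
     }
     return fileListPath;
   }

   The trade-off behind the two questions: the previous Spark Dataset CSV write used Spark's own Hadoop configuration and credentials, while a FileIO-based write depends entirely on what the table's FileIO is configured to reach, so a staging directory outside that scope could behave differently.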