vaultah commented on code in PR #13720:
URL: https://github.com/apache/iceberg/pull/13720#discussion_r2252819385


##########
core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java:
##########
@@ -357,6 +438,55 @@ public static RewriteResult<DataFile> rewriteDataManifest(
     }
   }
 
+
+  /**
+   * Rewrite a data manifest, replacing path references.
+   *
+   * @param manifestFile source manifest file to rewrite
+   * @param snapshotIds snapshot ids for filtering returned data manifest entries
+   * @param outputFile output file to rewrite manifest file to
+   * @param io file io
+   * @param format format of the manifest file
+   * @param specsById map of partition specs by id
+   * @param sourcePrefix source prefix that will be replaced
+   * @param targetPrefix target prefix that will replace it
+   * @return rewritten manifest file and a copy plan for the referenced content files
+   */
+  public static Pair<ManifestFile, RewriteResult<DataFile>> rewriteDataManifestWithResult(
+      ManifestFile manifestFile,
+      Set<Long> snapshotIds,
+      OutputFile outputFile,
+      FileIO io,
+      int format,
+      Map<Integer, PartitionSpec> specsById,
+      String sourcePrefix,
+      String targetPrefix)
+      throws IOException {
+    PartitionSpec spec = specsById.get(manifestFile.partitionSpecId());
+    ManifestWriter<DataFile> writer =
+            ManifestFiles.write(format, spec, outputFile, manifestFile.snapshotId());
+    RewriteResult<DataFile> rewriteResult = null;
+
+    try (ManifestWriter<DataFile> dataManifestWriter = writer;
+         ManifestReader<DataFile> reader =
+            ManifestFiles.read(manifestFile, io, specsById)
+                .select(Arrays.asList("*"))) {
+       rewriteResult =
+          StreamSupport.stream(reader.entries().spliterator(), false)
+            .map(
+                entry ->
+                    writeDataFileEntry(
+                        entry,
+                        snapshotIds,
+                        spec,
+                        sourcePrefix,
+                        targetPrefix,
+                        writer))
+            .reduce(new RewriteResult<>(), RewriteResult::append);
+    }
+    return Pair.of(writer.toManifestFile(), rewriteResult);

Review Comment:
   I only made it return the entire `ManifestFile` to keep this function
general-purpose, since it's part of the public API (however specialized and
niche it may be in practice). We can return `RewrittenFileInfo` from it
instead, though.
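
   To make the trade-off concrete, here is a minimal sketch. It assumes
`RewrittenFileInfo` only needs the rewritten manifest's path and length; that
shape is hypothetical and the fields in the PR may differ. `ManifestFileLike`
and `SimpleManifest` are stand-ins invented for this example so that it
compiles without Iceberg on the classpath.

```java
public class ReturnShapeSketch {
  // Stand-in for the subset of org.apache.iceberg.ManifestFile used here.
  interface ManifestFileLike {
    String path();

    long length();

    int partitionSpecId();
  }

  // Hypothetical shape: the real RewrittenFileInfo in the PR may carry other fields.
  record RewrittenFileInfo(String path, long length) {
    // Narrow a full manifest description down to the two assumed fields.
    static RewrittenFileInfo from(ManifestFileLike manifest) {
      return new RewrittenFileInfo(manifest.path(), manifest.length());
    }
  }

  // Simple in-memory implementation of the stand-in interface, for the demo only.
  record SimpleManifest(String path, long length, int partitionSpecId)
      implements ManifestFileLike {}

  public static void main(String[] args) {
    // Option A: the method returns the whole manifest; callers pick what they need.
    ManifestFileLike rewritten = new SimpleManifest("target/metadata/m0.avro", 4096L, 0);

    // Option B: the method returns only the narrowed info; partitionSpecId is gone.
    RewrittenFileInfo info = RewrittenFileInfo.from(rewritten);
    System.out.println(info.path() + " " + info.length());
  }
}
```

   Returning the full manifest lets a caller derive the narrower type whenever
it wants, while returning only the narrower type keeps the public signature
smaller but drops the remaining manifest metadata.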



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

