szehon-ho commented on code in PR #12885:
URL: https://github.com/apache/iceberg/pull/12885#discussion_r2143115914


##########
core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java:
##########
@@ -312,7 +314,87 @@ public static RewriteResult<DataFile> rewriteDataManifest(
         ManifestReader<DataFile> reader =
             ManifestFiles.read(manifestFile, io, specsById).select(Arrays.asList("*"))) {
       return StreamSupport.stream(reader.entries().spliterator(), false)
-          .map(entry -> writeDataFileEntry(entry, spec, sourcePrefix, targetPrefix, writer))
+          .map(
+              entry ->
+                  writeDataFileEntry(entry, Set.of(), spec, sourcePrefix, targetPrefix, writer))
+          .reduce(new RewriteResult<>(), RewriteResult::append);
+    }
+  }
+
+  /**
+   * Rewrite a data manifest, replacing path references.
+   *
+   * @param manifestFile source manifest file to rewrite
+   * @param deltaSnapshotIds snapshot ids to filter manifest entry
+   * @param outputFile output file to rewrite manifest file to
+   * @param io file io
+   * @param format format of the manifest file
+   * @param specsById map of partition specs by id
+   * @param sourcePrefix source prefix that will be replaced
+   * @param targetPrefix target prefix that will replace it
+   * @return a copy plan of content files in the manifest that was rewritten
+   */
+  public static RewriteResult<DataFile> rewriteDataManifest(
+      ManifestFile manifestFile,
+      Set<Long> deltaSnapshotIds,

Review Comment:
   Sorry for the last-minute comment. I thought about it a little and want to
   fix the javadoc/variable name.
   
   1. `deltaSnapshotIds` doesn't make much sense in this method (though it
   probably does in the caller's context).
   2. The comment is also not specific/correct: it's not really filtering what
   we rewrite (we rewrite all of these files), but what we return.
   
   Maybe 'snapshotIds': a list of snapshot ids for filtering returned data
   manifest entries. Only manifest entries that refer to one of these
   snapshot ids will be returned.
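
To illustrate the distinction in point 2, here is a minimal sketch of "rewrite everything, filter only what is returned". It is not the PR's code: `SnapshotFilterSketch`, `Entry`, `Result`, and `rewrite` are hypothetical stand-ins, not Iceberg classes, and the sketch does not model how the real method treats an empty set or non-live entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; none of these types are Iceberg classes.
final class SnapshotFilterSketch {

  /** Stand-in for a manifest entry: the snapshot id it refers to and a file path. */
  static final class Entry {
    final long snapshotId;
    final String path;

    Entry(long snapshotId, String path) {
      this.snapshotId = snapshotId;
      this.path = path;
    }
  }

  /** Stand-in for a rewrite result: everything written vs. what is reported to the caller. */
  static final class Result {
    final List<Entry> written = new ArrayList<>();
    final List<Entry> returned = new ArrayList<>();
  }

  static Result rewrite(
      List<Entry> entries, Set<Long> snapshotIds, String sourcePrefix, String targetPrefix) {
    Result result = new Result();
    for (Entry entry : entries) {
      // Every entry is rewritten, regardless of its snapshot id.
      String newPath =
          entry.path.startsWith(sourcePrefix)
              ? targetPrefix + entry.path.substring(sourcePrefix.length())
              : entry.path;
      Entry rewritten = new Entry(entry.snapshotId, newPath);
      result.written.add(rewritten);

      // snapshotIds only narrows what is returned to the caller.
      if (snapshotIds.contains(entry.snapshotId)) {
        result.returned.add(rewritten);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Entry> entries =
        List.of(
            new Entry(1L, "s3://old/data/a.parquet"),
            new Entry(2L, "s3://old/data/b.parquet"),
            new Entry(3L, "s3://old/data/c.parquet"));
    Result result = rewrite(entries, Set.of(2L, 3L), "s3://old", "s3://new");
    System.out.println("written:  " + result.written.size());
    System.out.println("returned: " + result.returned.size());
  }
}
```

With the sample input in `main`, all three entries are written with the new prefix, while only the two entries for snapshots 2 and 3 are returned, which is the behavior the proposed javadoc wording is meant to describe.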



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

