aokolnychyi commented on code in PR #9724:
URL: https://github.com/apache/iceberg/pull/9724#discussion_r1498618397


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java:
##########
@@ -507,4 +645,54 @@ public int totalGroupCount() {
       return totalGroupCount;
     }
   }
+
+  private static class MakeDeleteFile implements MapFunction<Row, DeleteFile> {
+
+    private final boolean posDeletes;
+    private final Types.StructType partitionType;
+    private final Map<Integer, PartitionSpec> specsById;
+
+    /**
+     * Map function that transforms entries table rows into {@link DeleteFile}
+     *
+     * @param posDeletes true for position deletes, false for equality deletes
+     * @param partitionType partition type of table
+     * @param specsById table's partition specs
+     */
+    MakeDeleteFile(
+        boolean posDeletes, Types.StructType partitionType, Map<Integer, PartitionSpec> specsById) {
+      this.posDeletes = posDeletes;
+      this.partitionType = partitionType;
+      this.specsById = specsById;
+    }
+
+    @Override
+    public DeleteFile call(Row row) throws Exception {
+      PartitionData partition = new PartitionData(partitionType);
+      GenericRowWithSchema partitionRow = row.getAs(0);
+
+      for (int i = 0; i < partitionRow.length(); i++) {
+        partition.set(i, partitionRow.get(i));
+      }
+
+      int specId = row.getAs(1);
+      String path = row.getAs(2);
+      long fileSize = row.getAs(3);
+      long recordCount = row.getAs(4);
+
+      FileMetadata.Builder builder = FileMetadata.deleteFileBuilder(specsById.get(specId));

Review Comment:
   Deleting based on path is not a good idea, as Iceberg won't be able to prune 
manifests using partition info. The action for rewriting manifests already 
handles this; we can use a similar approach.
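
   To illustrate the concern, here is a toy model in plain Java. It is not Iceberg code and does not reflect how Iceberg is implemented; `Manifest`, `manifestsOpenedByPath`, and `manifestsOpenedByPartition` are hypothetical names, and partitions are simplified to a single integer range per manifest. It only shows why knowing a file's partition lets a delete skip manifests that a path-only delete would have to read.

```java
import java.util.List;

public class ManifestPruningSketch {

  /** Hypothetical stand-in for a manifest: a partition value range plus the paths it tracks. */
  static final class Manifest {
    final int minPartition;
    final int maxPartition;
    final List<String> paths;

    Manifest(int minPartition, int maxPartition, List<String> paths) {
      this.minPartition = minPartition;
      this.maxPartition = maxPartition;
      this.paths = paths;
    }
  }

  /** Path-only delete: the partition is unknown, so every manifest is a candidate. */
  static int manifestsOpenedByPath(List<Manifest> manifests, String path) {
    int opened = 0;
    boolean found = false;
    for (Manifest manifest : manifests) {
      opened++; // the manifest must be read just to check for the path
      found |= manifest.paths.contains(path);
    }
    if (!found) {
      throw new IllegalArgumentException("Unknown path: " + path);
    }
    return opened;
  }

  /** Delete with partition info: manifests whose range excludes the partition are skipped. */
  static int manifestsOpenedByPartition(List<Manifest> manifests, String path, int partition) {
    int opened = 0;
    boolean found = false;
    for (Manifest manifest : manifests) {
      if (partition < manifest.minPartition || partition > manifest.maxPartition) {
        continue; // pruned using the range alone, without reading the manifest
      }
      opened++;
      found |= manifest.paths.contains(path);
    }
    if (!found) {
      throw new IllegalArgumentException("Unknown path: " + path);
    }
    return opened;
  }

  /** Three made-up manifests covering disjoint partition ranges. */
  static List<Manifest> sampleManifests() {
    return List.of(
        new Manifest(0, 9, List.of("p=3/delete-a.parquet")),
        new Manifest(10, 19, List.of("p=15/delete-b.parquet")),
        new Manifest(20, 29, List.of("p=27/delete-c.parquet")));
  }

  public static void main(String[] args) {
    List<Manifest> manifests = sampleManifests();
    String path = "p=15/delete-b.parquet";
    System.out.println("by path: " + manifestsOpenedByPath(manifests, path)); // by path: 3
    System.out.println("by partition: " + manifestsOpenedByPartition(manifests, path, 15)); // by partition: 1
  }
}
```

   In this model the path-only variant reads all three manifests, while the partition-aware variant reads one. Real tables can have many more manifests, so the gap would grow with table size.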



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org