szehon-ho commented on code in PR #10020:
URL: https://github.com/apache/iceberg/pull/10020#discussion_r1540213729


##########
core/src/main/java/org/apache/iceberg/PositionDeletesTable.java:
##########
@@ -141,25 +147,88 @@ private Schema calculateSchema() {
     }
   }
 
+  /**
+   * Handle collisions between table and partition field ids, as both need to 
be part of position
+   * deletes table
+   *
+   * @param tableSchema original table schema
+   * @param partitionType original table's partition type
+   * @return partition type with reassigned field ids
+   */
+  public static Types.StructType partitionType(Schema tableSchema, 
Types.StructType partitionType) {

Review Comment:
   These and following are just helper methods to reasisgn partition field ids 
to prevent collision with schema's field ids.



##########
core/src/main/java/org/apache/iceberg/PositionDeletesTable.java:
##########
@@ -196,15 +265,14 @@ protected List<String> scanColumns() {
      */
     public BatchScan baseTableFilter(Expression expr) {
       return new PositionDeletesBatchScan(
-          table(), schema(), context(), Expressions.and(baseTableFilter, 
expr));
+          table(), schema(), context(), Expressions.and(baseTableFilter, 
expr), fieldMap);
     }
 
     @Override
     protected CloseableIterable<ScanTask> doPlanFiles() {
       String schemaString = SchemaParser.toJson(tableSchema());
-
-      // prepare transformed partition specs and caches
-      Map<Integer, PartitionSpec> transformedSpecs = 
transformSpecs(tableSchema(), table().specs());
+      Map<Integer, PartitionSpec> transformedSpecs =

Review Comment:
   We use the transformed spec to evaluate the filters, so we need to bind 
against the new (reassigned) field ids,  hence passing the field id map here.



##########
core/src/main/java/org/apache/iceberg/PositionDeletesTable.java:
##########
@@ -278,28 +344,38 @@ public void close() throws IOException {
 
         @Override
         public CloseableIterator<ScanTask> iterator() {
+          // Partition filter by base table filter
           Expression partitionFilter =
               Projections.inclusive(spec, 
isCaseSensitive()).project(baseTableFilter);
 
-          // Filter partitions
+          // Read manifests (use original table's partition ids to 
de-serialize partition values)
           CloseableIterable<ManifestEntry<DeleteFile>> deleteFileEntries =
-              ManifestFiles.readDeleteManifest(manifest, table().io(), 
transformedSpecs)
+              ManifestFiles.readDeleteManifest(manifest, table().io(), 
table().specs())

Review Comment:
   Here is a bit tricky.  We need to read manifests using the original field 
ids, so go back to use the original table's partition spec which preserves the 
original field ids.  
   
   We can no longer use the spec to bind the user-provided partition filter 
though, because now those field ids are the old ones and do not match the 
reasisgned ones, so we need to split that out and evaluate it separately 
outside the ManifestReader.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to