szehon-ho commented on code in PR #11555:
URL: https://github.com/apache/iceberg/pull/11555#discussion_r1903293655
##########
core/src/main/java/org/apache/iceberg/io/DeleteSchemaUtil.java:
##########
@@ -43,4 +43,15 @@ public static Schema pathPosSchema() {
   public static Schema posDeleteSchema(Schema rowSchema) {
     return rowSchema == null ? pathPosSchema() : pathPosSchema(rowSchema);
   }
+
+  public static Schema posDeleteReadSchema(Schema rowSchema) {

Review Comment:
   After the rebase, this is needed for the position delete rewrite; there must be some intervening change related to delete readers. Previously this used the method above, `pathPosSchema(rowSchema)`, for the read schema, which has 'row' as required. That fails with an error saying 'row' is required but not found in the delete file, because 'row' is usually not set.

   Note that Spark and all readers don't actually include the 'row' field in the read schema: https://github.com/apache/iceberg/blob/main/data/src/main/java/org/apache/iceberg/data/BaseDeleteLoader.java#L70. Here, though, I do want to read the 'row' field and preserve it if some engine has set it. So I am taking the strategy of RewritePositionDelete and actually reading this field, but as optional, to avoid the assert error when it is not found: https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/PositionDeletesTable.java#L118 (the reader there is derived from the schema of the metadata table PositionDeletesTable).
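
   To illustrate why the read schema marks 'row' as optional, here is a minimal, self-contained sketch of the required-versus-optional behavior described above. It does not use the Iceberg API: `Field`, `project`, `requiredRowReadSchema`, and `optionalRowReadSchema` are hypothetical names invented for this illustration, and the error message is an assumption rather than the text Iceberg emits.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowFieldProjectionSketch {

  /** Hypothetical stand-in for a schema field; not Iceberg's Types.NestedField. */
  public static final class Field {
    final String name;
    final boolean required;

    Field(String name, boolean required) {
      this.name = name;
      this.required = required;
    }
  }

  /** Read schema with 'row' required, mirroring the old behavior described in the comment. */
  public static List<Field> requiredRowReadSchema() {
    List<Field> fields = new ArrayList<>();
    fields.add(new Field("file_path", true));
    fields.add(new Field("pos", true));
    fields.add(new Field("row", true));
    return fields;
  }

  /** Read schema with 'row' optional, mirroring the approach taken for the rewrite. */
  public static List<Field> optionalRowReadSchema() {
    List<Field> fields = new ArrayList<>();
    fields.add(new Field("file_path", true));
    fields.add(new Field("pos", true));
    fields.add(new Field("row", false));
    return fields;
  }

  /**
   * Projects a stored record onto the read schema. A missing required field is an error;
   * a missing optional field is read as null; a present field is preserved as-is.
   */
  public static Map<String, Object> project(List<Field> readSchema, Map<String, Object> stored) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (Field field : readSchema) {
      if (stored.containsKey(field.name)) {
        result.put(field.name, stored.get(field.name));
      } else if (field.required) {
        throw new IllegalArgumentException("Missing required field: " + field.name);
      } else {
        result.put(field.name, null);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // A delete record written without the 'row' field, which is the usual case.
    Map<String, Object> withoutRow = Map.of("file_path", "data-1.parquet", "pos", 7L);

    System.out.println(project(optionalRowReadSchema(), withoutRow));
    try {
      project(requiredRowReadSchema(), withoutRow);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

   With the optional variant, a delete file that does carry 'row' still has it read and preserved, while a file without it is read with a null 'row' instead of failing.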