huaxingao commented on PR #11551: URL: https://github.com/apache/iceberg/pull/11551#issuecomment-2512581027
@flyrain I thought this over. The `missingIds` could come from [`ROW_POSITION.fieldId()`](https://github.com/apache/iceberg/blob/main/data/src/main/java/org/apache/iceberg/data/DeleteFilter.java#L273C39-L273C61) or from [`eqDelete.equalityFieldIds()`](https://github.com/apache/iceberg/blob/main/data/src/main/java/org/apache/iceberg/data/DeleteFilter.java#L277C26-L277C53). In the batch case, [`needRowPosCol`](https://github.com/apache/iceberg/blob/main/data/src/main/java/org/apache/iceberg/data/DeleteFilter.java#L272C9-L272C22) is false, so we don't have `ROW_POSITION`. Is it possible to have `IS_DELETED.fieldId()` in `eqDelete.equalityFieldIds()`, that is, can an equality delete be based on the `IS_DELETED` column? If not, then `missingIds` should never contain `IS_DELETED.fieldId()`.

If we explicitly select `_delete`, then `_delete` is already in the `expectedSchema`. For example, if the table has `c1` and `c2`, the equality delete is based on `c1`, and we run `SELECT c2, _delete FROM table`, then the `expectedSchema` is `c2, _delete` and the missing column is `c1`. The `requiredSchema` would then be `c2, _delete, c1`, so we can still remove the extra column from the end of the columns.
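To make the `c1`/`c2` example concrete, here is a minimal sketch of the ordering argument. It is not the `DeleteFilter` code: it assumes columns can be modeled as plain names, and the class `RequiredSchemaSketch` and its methods `requiredColumns` and `pruneToExpected` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: columns are plain names, not Iceberg Schema or NestedField objects.
final class RequiredSchemaSketch {

  // Keeps the expected columns in their original order, then appends any
  // equality-delete columns that the projection does not already contain.
  static List<String> requiredColumns(List<String> expected, List<String> equalityDeleteColumns) {
    Set<String> required = new LinkedHashSet<>(expected);
    required.addAll(equalityDeleteColumns);
    return new ArrayList<>(required);
  }

  // After deletes are applied, the extra columns sit at the end of the list,
  // so keeping the first expectedSize entries restores the expected projection.
  static List<String> pruneToExpected(List<String> required, int expectedSize) {
    return new ArrayList<>(required.subList(0, expectedSize));
  }

  public static void main(String[] args) {
    // SELECT c2, _delete FROM table, with an equality delete on c1.
    List<String> expected = List.of("c2", "_delete");
    List<String> required = requiredColumns(expected, List.of("c1"));
    System.out.println(required);                                   // [c2, _delete, c1]
    System.out.println(pruneToExpected(required, expected.size())); // [c2, _delete]
  }
}
```

Because the missing columns are only ever appended, `_delete` keeps its position from the `expectedSchema` and the trailing `c1` can be dropped without reordering anything.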