huaxingao commented on PR #11551:
URL: https://github.com/apache/iceberg/pull/11551#issuecomment-2512581027

   @flyrain I thought this over. The `missingIds` could come from 
[`ROW_POSITION.fieldId()`](https://github.com/apache/iceberg/blob/main/data/src/main/java/org/apache/iceberg/data/DeleteFilter.java#L273C39-L273C61)
 or from 
[`eqDelete.equalityFieldIds()`](https://github.com/apache/iceberg/blob/main/data/src/main/java/org/apache/iceberg/data/DeleteFilter.java#L277C26-L277C53).
 In the batch case, 
[`needRowPosCol`](https://github.com/apache/iceberg/blob/main/data/src/main/java/org/apache/iceberg/data/DeleteFilter.java#L272C9-L272C22)
 is false, so `ROW_POSITION` is not added. Could 
`eqDelete.equalityFieldIds()` ever contain `IS_DELETED.fieldId()`? That is, is it 
possible to do an equality delete on the `IS_DELETED` column? If not, then 
`missingIds` should never contain `IS_DELETED.fieldId()`.
   
   If we explicitly select `_deleted`, then `_deleted` is in the 
`expectedSchema`. For example, if the table has columns c1 and c2, the equality 
delete is based on c1, and we run `SELECT c2, _deleted FROM table`, then the 
`expectedSchema` is (c2, _deleted) and the missing column is c1. The 
`requiredSchema` would then be (c2, _deleted, c1), so the extra column still 
sits at the end of the column list and can be removed from there.
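
   To make the ordering argument concrete, here is a minimal sketch in plain Java. It does not use the Iceberg API: `RequiredSchemaSketch`, `requiredColumns`, and `trimToExpected` are hypothetical names, and column names stand in for field IDs. It assumes the required columns are the expected columns followed by any missing ones, as in the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: models schemas as ordered lists of column names.
class RequiredSchemaSketch {

  // Hypothetical helper: expected columns first, then any missing
  // columns (e.g. equality-delete columns) that are not already present.
  static List<String> requiredColumns(List<String> expected, List<String> missing) {
    List<String> required = new ArrayList<>(expected);
    for (String column : missing) {
      if (!required.contains(column)) {
        required.add(column);
      }
    }
    return required;
  }

  // Hypothetical helper: because the extras were appended, dropping them
  // is just keeping the first expectedSize columns.
  static List<String> trimToExpected(List<String> required, int expectedSize) {
    return new ArrayList<>(required.subList(0, expectedSize));
  }

  public static void main(String[] args) {
    List<String> expected = List.of("c2", "_deleted"); // SELECT c2, _deleted
    List<String> missing = List.of("c1");              // equality delete on c1

    List<String> required = requiredColumns(expected, missing);
    System.out.println(required);                                  // [c2, _deleted, c1]
    System.out.println(trimToExpected(required, expected.size())); // [c2, _deleted]
  }
}
```

   The point of the sketch is only that the `_deleted` column never ends up after the appended columns, so trimming from the end stays safe when `_deleted` is selected explicitly.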


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

