pvary commented on issue #10431:
URL: https://github.com/apache/iceberg/issues/10431#issuecomment-2157551451

   Here is how the different deletes work:
   - EQ-DELETE - removes all occurrences of the record with the given id that
were written BEFORE the snapshot containing the delete
   - POS-DELETE - removes a given row from the given data file, for data
written BEFORE OR IN the given snapshot.
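
   The two rules above can be sketched as follows. This is only an
illustration, not Iceberg code: `DataFile`, `seq`, `eq_delete_applies` and
`pos_delete_applies` are hypothetical names, and `seq` is an assumed stand-in
for the ordering of the snapshot that committed each file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    path: str
    seq: int  # assumed: ordering number of the snapshot that added this file


def eq_delete_applies(delete_seq: int, data_file: DataFile) -> bool:
    # EQ-DELETE: only reaches rows written strictly BEFORE the delete's snapshot.
    return data_file.seq < delete_seq


def pos_delete_applies(delete_seq: int, target_path: str, data_file: DataFile) -> bool:
    # POS-DELETE: targets one data file, written BEFORE OR IN the delete's snapshot.
    return data_file.path == target_path and data_file.seq <= delete_seq


older = DataFile("data-old.parquet", seq=1)
same = DataFile("data-new.parquet", seq=2)

print(eq_delete_applies(2, older))                      # older file is covered
print(eq_delete_applies(2, same))                       # same-snapshot file is not
print(pos_delete_applies(2, "data-new.parquet", same))  # same-snapshot file is covered
```

   The consequence is that an EQ-DELETE alone cannot remove a row that was
written in the same snapshot as the delete; only a POS-DELETE can.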
   
   Based on the data you have shown, I would guess that there were 2 updates
for the given record in the same snapshot.
   Flink does the following:
   - In upsert mode, it always writes the EQ-DELETE for the id, and writes the
new record. It also stores the filename and the id for all written records in
memory.
   - If a new update arrives for an id that was already written in the same
snapshot, then it also writes a positional delete file.
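
   A minimal simulation of the behaviour described above, assuming a single
data file per snapshot. It is not the Flink writer itself: `write_upserts`,
`read_same_snapshot` and the file name are hypothetical, and tracking the row
position next to the filename is an assumption made so that the sketch can
emit a positional delete.

```python
def write_upserts(updates, data_file="data-1.parquet"):
    """Simulate one snapshot's worth of upserts, given as (id, value) pairs."""
    rows = []         # rows appended to the data file; list index = row position
    eq_deletes = []   # ids written to the equality delete file
    pos_deletes = []  # (filename, position) entries for the positional delete file
    written = {}      # in-memory map: id -> (filename, position) within this snapshot

    for record_id, value in updates:
        # Upsert mode: always write the EQ-DELETE for the id.
        eq_deletes.append(record_id)
        # Same id seen again in this snapshot: the EQ-DELETE cannot remove the
        # earlier row, so a positional delete is written for it as well.
        if record_id in written:
            pos_deletes.append(written[record_id])
        written[record_id] = (data_file, len(rows))
        rows.append((record_id, value))
    return rows, eq_deletes, pos_deletes


def read_same_snapshot(rows, pos_deletes, data_file="data-1.parquet"):
    """Apply only the positional deletes, as is the case for same-snapshot rows."""
    deleted = {pos for name, pos in pos_deletes if name == data_file}
    return [row for pos, row in enumerate(rows) if pos not in deleted]


rows, eq_deletes, pos_deletes = write_upserts([(1, "a"), (1, "b"), (2, "c")])
print(pos_deletes)
print(read_same_snapshot(rows, pos_deletes))
```

   With two updates for id 1 in the same snapshot, the positional delete is
what hides the first version of the row; without it, both versions would be
visible to readers.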
   
   The commit message you have shown says that 3 delete files were written. I
expect that one of them is the positional delete file, and that it contains
the delete record for the given row.
   
   Could you please check if this is the case?
   Thanks,
   Peter
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


