wypoon commented on PR #10784:
URL: https://github.com/apache/iceberg/pull/10784#issuecomment-2623572958

   Another question I have about duplicate entries (in the manifests) is: does 
their presence make the table unreadable? Or is the table still readable and in 
a valid, although undesirable, state (as is the case with dangling deletes, 
for example)?
   
   One observation is that the existing subinterfaces of 
`org.apache.iceberg.actions.SnapshotUpdate` are all actions that do not change 
the state of the table (the data remains the same). That is not the case when 
data (or delete) files are removed from the metadata because they are missing 
from storage. In that case, the state of the table changes: data is deleted, 
added (through "undeletion" when delete files are removed), or both. For this 
reason, I think that removing files from metadata is logically different from 
"repair" operations such as deduplicating entries or correcting their 
statistics.
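
   To make the distinction concrete, here is a minimal toy sketch in Python. It 
does not use any Iceberg API: the entry dictionaries, the `visible_rows` helper, 
and the file names are all hypothetical, and delete files are simplified to 
plain lists of row ids. It only illustrates the reasoning above, assuming that 
the table state is the set of rows a reader would see.

```python
# Toy model only: nothing here is Iceberg API. Each metadata entry is a dict
# with a hypothetical path, the row ids the file covers, and a statistic.

def visible_rows(data_entries, delete_entries):
    """Rows a reader would see: rows in data files minus rows in delete files."""
    rows = set()
    for entry in data_entries:
        rows |= set(entry["rows"])
    deleted = set()
    for entry in delete_entries:
        deleted |= set(entry["rows"])
    return rows - deleted


data_entries = [
    {"path": "data-a.parquet", "rows": (1, 2, 3), "record_count": 99},  # wrong stat
    {"path": "data-b.parquet", "rows": (4, 5), "record_count": 2},
]
delete_entries = [
    {"path": "delete-1.parquet", "rows": (2,), "record_count": 1},
]

before = visible_rows(data_entries, delete_entries)

# "Repair": correcting a statistic touches metadata only; rows are unchanged.
repaired = [dict(e, record_count=len(e["rows"])) for e in data_entries]
after_stats_fix = visible_rows(repaired, delete_entries)

# Dropping a data file entry (file missing from storage) deletes rows.
without_data_b = [e for e in data_entries if e["path"] != "data-b.parquet"]
after_data_removal = visible_rows(without_data_b, delete_entries)

# Dropping a delete file entry "undeletes" the rows it had masked.
after_delete_removal = visible_rows(data_entries, [])

print(sorted(before))
print(sorted(after_stats_fix))
print(sorted(after_data_removal))
print(sorted(after_delete_removal))
```

   In this model only the last two operations change what `visible_rows` 
returns, which is the sense in which they differ from a metadata-only repair. 
The sketch says nothing about how duplicate entries behave in a real table.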


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

