amogh-jahagirdar commented on PR #10784: URL: https://github.com/apache/iceberg/pull/10784#issuecomment-2316733589
> Makes sense @amogh-jahagirdar !
>
> > @szehon-ho I actually had a question on the snapshot repair, based on the description the goal of that is to repair snapshot summary stats which may have been corrupted. Doesn't that necessarily mean we must mutate the existing snapshot (and subsequent snapshots) to correct it?
>
> I was just thinking we make a new snapshot with correct summary stats (only the totals). But yes, you are right, it is open to interpretation. In this case you can't go back and fix wrong stats, so this particular feature probably does need more thought.

Really sorry for the delayed response @szehon-ho, I forgot I had this open. I was discussing this with @rdblue and I am more convinced of your point that we may as well have a unified `RepairTable` action with different configuration methods.

The compelling argument, for me at least, is that it follows the existing API patterns. For example, `ManageSnapshots` has different options but serves as a useful entry point. `RepairTable` would likewise be a useful entry point for a user: there can be sane defaults on what to repair, and power users can specify which operations they want to run if they know what's broken.

Another example of this came up in https://github.com/apache/iceberg/pull/10755/files#r1696094932 for removing unused specs, where we were thinking of having an entry point maintenance API for both removing unused specs and schemas. So from an API consistency perspective, I think it's good to have the same pattern for this action.

Furthermore, the procedure traverses the entire metadata tree, so in practice there is probably overlap across the different repair operations.

I can update the PR based on this.
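As a rough illustration of the "single entry point with configuration methods" pattern, here is a minimal, self-contained sketch. It is not the API proposed in this PR and does not use any Iceberg classes: the class `RepairTableSketch`, the enum values, and the methods `repairManifests()`, `repairSnapshotSummaries()`, and `plannedOperations()` are all hypothetical names chosen for the example. It also assumes that "sane defaults" means running every known repair, which is only one possible choice.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch only: none of these names come from the Iceberg API or from this PR.
public class RepairTableSketch {

  // Kinds of repair a unified action might cover (illustrative names).
  public enum RepairOperation {
    MANIFESTS,
    SNAPSHOT_SUMMARIES
  }

  // Fluent entry point: each configuration method opts in to one repair.
  public static class RepairTable {
    private final Set<RepairOperation> selected = EnumSet.noneOf(RepairOperation.class);

    public RepairTable repairManifests() {
      selected.add(RepairOperation.MANIFESTS);
      return this;
    }

    public RepairTable repairSnapshotSummaries() {
      selected.add(RepairOperation.SNAPSHOT_SUMMARIES);
      return this;
    }

    // Assumption: when nothing is selected explicitly, fall back to all repairs.
    public Set<RepairOperation> plannedOperations() {
      if (selected.isEmpty()) {
        return EnumSet.allOf(RepairOperation.class);
      }
      return EnumSet.copyOf(selected);
    }
  }

  public static void main(String[] args) {
    // Default usage: the user does not know what is broken.
    System.out.println(new RepairTable().plannedOperations());

    // Power-user usage: only one specific repair is requested.
    System.out.println(new RepairTable().repairManifests().plannedOperations());
  }
}
```

Because every selected operation is known before anything runs, an implementation along these lines could in principle walk the metadata tree once and apply all requested repairs during that pass, which is where the overlap between repair operations would pay off.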