jackye1995 commented on PR #6642: URL: https://github.com/apache/iceberg/pull/6642#issuecomment-1423178313
I took a brief look. Overall I agree with where the community discussion landed: replaying the timeline is cool, but Hudi's concurrent transactions behave awkwardly and we cannot guarantee that a replay is always correct. Instead, we can ask users to always compact before migration, so that we only need to support migrating the latest compacted state of the Hudi table. When new data arrives, the user can rerun compaction and then rerun this migration action, which will fully replace the previous version of the migrated Iceberg table. This approach works for both CoW and MoR tables, and we can add timeline replay as a follow-up feature if necessary. What do you think?
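To make the proposed workflow concrete (compact, migrate, then repeat with a full replacement), here is a rough sketch using a toy in-memory model. Everything in it is hypothetical: `ToyHudiTable`, `ToyIcebergTable`, and `migrate` are illustrative names, not Hudi or Iceberg APIs, and it does not describe how the migration action in this PR is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHudiTable:
    """Hypothetical stand-in for a Hudi table (not a real Hudi API)."""
    base_files: dict = field(default_factory=dict)   # record key -> value, already compacted
    log_records: list = field(default_factory=list)  # pending (key, value) updates, as in MoR logs

    def write(self, key, value):
        # New data lands in the log until the next compaction.
        self.log_records.append((key, value))

    def compact(self):
        # Fold pending log records into the base files; later writes win.
        for key, value in self.log_records:
            self.base_files[key] = value
        self.log_records.clear()


@dataclass
class ToyIcebergTable:
    """Hypothetical stand-in for the migrated Iceberg table."""
    rows: dict = field(default_factory=dict)
    version: int = 0


def migrate(source, target):
    # Refuse to migrate an uncompacted table, so no timeline replay is needed.
    if source.log_records:
        raise ValueError("compact the Hudi table before migrating")
    # Fully replace the previously migrated version instead of applying a delta.
    target.rows = dict(source.base_files)
    target.version += 1
    return target


hudi = ToyHudiTable()
iceberg = ToyIcebergTable()

# First round: write, compact, migrate.
hudi.write("a", 1)
hudi.compact()
migrate(hudi, iceberg)

# New data arrives: rerun compaction, then rerun the migration.
hudi.write("a", 2)
hudi.write("b", 3)
hudi.compact()
migrate(hudi, iceberg)
```

In this model a CoW table is simply one whose log is always empty, so `compact()` is a no-op and the same `migrate` path applies to both table types.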