dramaticlly commented on PR #13720: URL: https://github.com/apache/iceberg/pull/13720#issuecomment-3235287484
> @dramaticlly sorry, I'm not sure I follow.
>
> Re. **1**: My point is that even in incremental mode, even when we rewrite a single manifest list (snapshot 2), we need to update lengths for all manifests that it references (A, B, C, D). If we don't do that, per your suggestion, the rewritten manifest list will only have correct lengths for the manifests that were rewritten in the same run (D), leaving lengths of other manifests (A, B, C) as they were in the source table (i.e., _as if_ they were never rewritten) and therefore incorrect. Then, after this manifest list is added to the target table, the target table will have the problem described in #13719.
>
> Re. **2**: fixing existing manifest references is not in scope of this PR. This PR fixes `rewrite_table_path` so it produces correct manifest lists going forward. In a correct manifest list, `manifest_length` matches the size of the file at `manifest_path` for all manifest references. To guarantee that, we need to rewrite all referenced manifests.

Re 1: I see what you mean now. Even though manifests A, B, and C were copied earlier with correct lengths in the target table, the incremental rewrite for snapshot 2 cannot obtain those lengths from the source table, so they unfortunately need to be rewritten to get the correct lengths. This essentially means that incremental rewrite is no longer limited to the changed snapshots. Let me think a bit more about this.
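For illustration, the stale `manifest_length` problem can be sketched with a toy model. This is not Iceberg code: real manifests are Avro files, and the prefixes, the `ManifestRef` class, and all helper names here are hypothetical. The only assumption carried over from the discussion is that rewriting paths inside a manifest changes its size, so any manifest list entry that keeps the source length no longer matches the file in the target table.

```python
from dataclasses import dataclass

# Hypothetical prefixes; they only need to differ in length.
SOURCE_PREFIX = "s3://source-bucket/tbl"
TARGET_PREFIX = "s3://target-bucket-longer/tbl"


@dataclass(frozen=True)
class ManifestRef:
    """Toy stand-in for one manifest list entry."""
    manifest_path: str
    manifest_length: int


def rewrite_manifest(content: bytes) -> bytes:
    # Toy rewrite: swap the path prefix embedded in the manifest body.
    return content.replace(SOURCE_PREFIX.encode(), TARGET_PREFIX.encode())


def rewrite_manifest_list(refs, source_files, refreshed_names):
    """Rewrite every path; refresh lengths only for names in refreshed_names."""
    out = []
    for ref in refs:
        name = ref.manifest_path.rsplit("/", 1)[-1]
        new_path = ref.manifest_path.replace(SOURCE_PREFIX, TARGET_PREFIX)
        if name in refreshed_names:
            length = len(rewrite_manifest(source_files[ref.manifest_path]))
        else:
            length = ref.manifest_length  # stale: still the source file size
        out.append(ManifestRef(new_path, length))
    return out


def stale_entries(refs, target_files):
    """Names of entries whose recorded length differs from the target file size."""
    return [
        r.manifest_path.rsplit("/", 1)[-1]
        for r in refs
        if r.manifest_length != len(target_files[r.manifest_path])
    ]


# Source table: manifests A-D, each embedding one data file path.
source_files = {
    f"{SOURCE_PREFIX}/metadata/{n}.avro": f"{SOURCE_PREFIX}/data/{n}.parquet".encode()
    for n in "ABCD"
}
source_refs = [ManifestRef(p, len(c)) for p, c in source_files.items()]

# Target table: every manifest has been rewritten at some point.
target_files = {
    p.replace(SOURCE_PREFIX, TARGET_PREFIX): rewrite_manifest(c)
    for p, c in source_files.items()
}

# Incremental run that refreshes only the new manifest D.
only_new = rewrite_manifest_list(source_refs, source_files, {"D.avro"})
# Run that refreshes lengths for every referenced manifest.
all_refs = rewrite_manifest_list(
    source_refs, source_files, {"A.avro", "B.avro", "C.avro", "D.avro"}
)

print(stale_entries(only_new, target_files))
print(stale_entries(all_refs, target_files))
```

In this toy model, refreshing only D leaves A, B, and C with their source lengths, while refreshing every referenced manifest leaves no mismatches. Whether the correct lengths are obtained by rewriting A, B, and C again or by some other means is exactly the open question in this thread.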
