dramaticlly commented on PR #12172:
URL: https://github.com/apache/iceberg/pull/12172#issuecomment-2641532788
I took a look at this. I think right now only the strict delta of
manifest-list/manifests/data files is rewritten if `start_version` is
provided, so an incremental rewrite and copy does not result in the cumulative
set of all deltas on read. I collected some details from the provided unit test
examples to help visualize the problem:
1st snapshot: 4355148708777346214
2nd snapshot: 936971667881185972
- source table
```
+---------------------------------------------------------------------------------+----------------------------------------------------------+-------------------+
|manifest-list                                                                    |path                                                      |added_snapshot_id  |
+---------------------------------------------------------------------------------+----------------------------------------------------------+-------------------+
|1224/metadata/snap-936971667881185972-1-7a1b9514-44b9-418e-a6b3-dd724a5daa01.avro|1224/metadata/7a1b9514-44b9-418e-a6b3-dd724a5daa01-m0.avro|936971667881185972 |
|1224/metadata/snap-936971667881185972-1-7a1b9514-44b9-418e-a6b3-dd724a5daa01.avro|1224/metadata/a3362153-e2aa-4e7c-a04e-7ea9f525d0ce-m0.avro|4355148708777346214|
+---------------------------------------------------------------------------------+----------------------------------------------------------+-------------------+
+----------------------------------------------------------+------+-------------------+-----------------------------------------------------------------------+
|manifest-file                                             |status|snapshot_id        |file_path                                                              |
+----------------------------------------------------------+------+-------------------+-----------------------------------------------------------------------+
|1224/metadata/7a1b9514-44b9-418e-a6b3-dd724a5daa01-m0.avro|1     |936971667881185972 |1224/data/00000-16-2c43c06a-e56e-4c74-a498-393398ac66df-0-00001.parquet|
|1224/metadata/a3362153-e2aa-4e7c-a04e-7ea9f525d0ce-m0.avro|1     |4355148708777346214|1224/data/00000-2-f37fff9f-1075-4e06-85e1-fa7c3f22ac48-0-00001.parquet |
+----------------------------------------------------------+------+-------------------+-----------------------------------------------------------------------+
```
- target table
```
+---------------------------------------------------------------------------------+----------------------------------------------------------+------------------+
|manifest-list                                                                    |path                                                      |added_snapshot_id |
+---------------------------------------------------------------------------------+----------------------------------------------------------+------------------+
|4614/metadata/snap-936971667881185972-1-7a1b9514-44b9-418e-a6b3-dd724a5daa01.avro|4614/metadata/7a1b9514-44b9-418e-a6b3-dd724a5daa01-m0.avro|936971667881185972|
+---------------------------------------------------------------------------------+----------------------------------------------------------+------------------+
+----------------------------------------------------------+------+-------------------+-----------------------------------------------------------------------+
|manifest-file                                             |status|snapshot_id        |file_path                                                              |
+----------------------------------------------------------+------+-------------------+-----------------------------------------------------------------------+
|4614/metadata/7a1b9514-44b9-418e-a6b3-dd724a5daa01-m0.avro|1     |936971667881185972 |4614/data/00000-16-2c43c06a-e56e-4c74-a498-393398ac66df-0-00001.parquet|
+----------------------------------------------------------+------+-------------------+-----------------------------------------------------------------------+
```
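So after an incremental copy that starts after the 1st snapshot, a read of the
target table ends up with only the delta, i.e. only the Iceberg metadata related
to snapshot `936971667881185972`. The manifest and data file added by the 1st
snapshot (`4355148708777346214`) are missing from the target.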
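
To make the gap concrete, here is a toy Python model of the manifest-list listings above. It is only a sketch under stated assumptions, not the actual rewrite code: `strict_delta` and `retarget` are hypothetical helpers that mimic what the observed output suggests (drop entries added by the start snapshot, then swap the `1224/` prefix for `4614/`).

```python
# Toy model only: names and logic are illustrative assumptions,
# not Iceberg's implementation.

FIRST_SNAPSHOT = 4355148708777346214
SECOND_SNAPSHOT = 936971667881185972

# Manifest list of the source table's 2nd snapshot:
# (manifest path, added_snapshot_id), copied from the source listing.
SOURCE_MANIFEST_LIST = [
    ("1224/metadata/7a1b9514-44b9-418e-a6b3-dd724a5daa01-m0.avro", SECOND_SNAPSHOT),
    ("1224/metadata/a3362153-e2aa-4e7c-a04e-7ea9f525d0ce-m0.avro", FIRST_SNAPSHOT),
]


def strict_delta(manifest_list, start_snapshot_id):
    """Hypothetical: keep only manifests not added by the start snapshot."""
    return [(path, sid) for path, sid in manifest_list if sid != start_snapshot_id]


def retarget(manifest_list, source_prefix, target_prefix):
    """Hypothetical: replace the source location prefix with the target one."""
    return [
        (target_prefix + path[len(source_prefix):], sid)
        for path, sid in manifest_list
    ]


# What the incremental rewrite appears to produce today (strict delta only).
target = retarget(strict_delta(SOURCE_MANIFEST_LIST, FIRST_SNAPSHOT), "1224/", "4614/")

# What a reader of the target table would need (cumulative manifest list).
expected = retarget(SOURCE_MANIFEST_LIST, "1224/", "4614/")

missing = [entry for entry in expected if entry not in target]

print("target manifest list:", target)
print("missing on read:", missing)
```

In this model the only entry in `missing` is the `a3362153-...-m0.avro` manifest added by the 1st snapshot, which matches what the target table listing shows.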