Fokko opened a new pull request, #12161: URL: https://github.com/apache/iceberg/pull/12161
I was looking at adding support for `source-ids` in PyIceberg, but noticed that it was also still lacking for Java. I also noticed that `source-ids` has been backported to V1 and V2 tables, which surprised me, since this might break existing V2 implementations that are unaware of `source-ids`.

This PR reconsiders https://github.com/apache/iceberg/pull/9661, and more specifically: https://lists.apache.org/thread/9opgkrpqhzp3nl8hdohgnk1m1zxnxmq0

It would be good to only allow multi-arg transforms from V3 onwards, and to avoid having some implementations support this by setting a flag. Other implementations might not be aware of `source-ids` and would drop the second argument onward:

```json
{
  "source-id": 19,
  "source-ids": [19, 25],
  "field-id": 1000,
  "name": "ts_bucket",
  "transform": "bucket"
}
```

A V2 implementation that is unaware of `source-ids` (PyIceberg, Iceberg-Rust, and others) would produce:

```json
{
  "source-id": 19,
  "field-id": 1000,
  "name": "ts_bucket",
  "transform": "bucket"
}
```

This silently breaks the partitioning 😱

cc @rdblue @szehon-ho @advancedxy @jbonofre
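To make the failure mode concrete, here is a minimal Python sketch of the round trip described above. It is not taken from PyIceberg or any other implementation: `KNOWN_V2_KEYS`, `legacy_round_trip`, and `check_partition_field` are hypothetical names, and the V3 check only illustrates the kind of guard this PR argues for, assuming a reader that keeps just the keys it knows.

```python
import json

# Hypothetical: the partition-field keys known to a V2 reader that predates `source-ids`.
KNOWN_V2_KEYS = {"source-id", "field-id", "name", "transform"}


def legacy_round_trip(field_json: str) -> str:
    """Mimic a reader that parses into a fixed model and ignores unknown keys."""
    field = json.loads(field_json)
    kept = {key: value for key, value in field.items() if key in KNOWN_V2_KEYS}
    return json.dumps(kept)


def check_partition_field(field: dict, format_version: int) -> None:
    """Hypothetical guard: reject multi-arg transforms on tables older than V3."""
    source_ids = field.get("source-ids", [])
    if len(source_ids) > 1 and format_version < 3:
        raise ValueError(
            f"Multi-arg transform on field '{field.get('name')}' requires "
            f"format version 3, got {format_version}"
        )


original = json.dumps(
    {
        "source-id": 19,
        "source-ids": [19, 25],
        "field-id": 1000,
        "name": "ts_bucket",
        "transform": "bucket",
    }
)

# The second source column (25) disappears without any error being raised.
rewritten = json.loads(legacy_round_trip(original))
print(rewritten)

# With the guard, the same field is accepted on V3 and rejected on V2.
check_partition_field(json.loads(original), format_version=3)
try:
    check_partition_field(json.loads(original), format_version=2)
except ValueError as err:
    print(err)
```

Restricting multi-arg transforms to V3 means an unaware V2 reader never sees such a field in the first place, rather than depending on every implementation to preserve keys it does not understand.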