Fokko commented on issue #1284: URL: https://github.com/apache/iceberg-python/issues/1284#issuecomment-2503087394
Thanks for the context yesterday, I was still noodling on it overnight. If I understand correctly (and please also share the video of you and @adrianqin; I must have missed it during the paternity leave), you're looking for a shallow clone of a table. In this situation, the metadata and manifests are recreated, but the existing data files are re-used to avoid unnecessary heavy lifting. As I mentioned yesterday, this is still a bit tricky since you might do a delete operation where a data file is dropped that's still referenced in the other table, but that's inherent to a shallow clone.

Instead of mangling the `create_table` operation, wouldn't it be better to have a clone operation?

> The case where some Iceberg Rest Catalog Servers do not respect the given field IDs and assigns fresh IDs actually feels more like an edge case to me, that we want to opine on and course correct.

The REST catalog follows the philosophy that clients shouldn't have to worry about Field-IDs. The register table operation is designed to take an existing file and re-use all the metadata, rather than re-assigning it. But it looks like there is interest in more flavors than just `create` and `register`. A similar question was also raised on Slack: https://apache-iceberg.slack.com/archives/C029EE6HQ5D/p1732441495465079

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org