Vino1016 commented on issue #12514: URL: https://github.com/apache/iceberg/issues/12514#issuecomment-2795894546
> **Library Upgrade -**
>
> We don't have a specific upgrade guide here because the library is attempting to be as independent from the format as possible. This means that if your code compiles and runs with the new version of the library, we expect it to work identically with your table.
>
> So your main checks will be around configuration changes and other performance sorts of things that have changed, but under the hood your table won't change.
>
> **File Format Switch -**
>
> You can have files in multiple formats in the same table, there shouldn't be anything you need to do.
>
> **Catalog Migration**
>
> This is much more complicated. Doing this will require changing all of your clients to point to glue instead of using path based table access. This is most important for writers, when you have multiple catalogs working with the same metadata.json you essentially enter into a split-brain situation. I would probably make the glue table "read only" and use the "registerTable" api to set the current hadoop based metadata.json as the current table path using another process. Eventually when I was ready to switch over writers, I would stop the sync and point the writers at Glue.

Thank you for your response.

Through our testing, we found that while Glue Crawler could potentially address metadata management, we have concerns about the significant risks it might pose to our existing data. We've tentatively decided to upgrade to version 1.4.0 while retaining the original Hadoop Catalog instead of adopting Glue.

Regarding your point about Iceberg tables supporting both ORC and Parquet formats without additional adaptation, we observed that the file format is determined by the table properties set during table creation. However, we noticed that the Iceberg API does not provide a way to modify these properties after table creation.
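To make the scenario concrete, here is a toy model of the behavior we are hoping for. It is only a sketch: `ToyTable`, `DataFile`, `set_property`, and `append` are hypothetical names for illustration and are not the Iceberg API. It assumes the write format is read from a table property (Iceberg's name for it is `write.format.default`) at the time each file is written, and that every data file records its own format.

```python
from dataclasses import dataclass, field

# Toy model only: these classes are hypothetical and are NOT the Iceberg API.
# "write.format.default" is the Iceberg table property for the default write format.
WRITE_FORMAT_KEY = "write.format.default"


@dataclass
class DataFile:
    path: str
    file_format: str  # each file remembers the format it was written in


@dataclass
class ToyTable:
    properties: dict = field(default_factory=dict)
    files: list = field(default_factory=list)

    def set_property(self, key, value):
        # Hypothetical stand-in for a post-creation property update.
        self.properties[key] = value

    def append(self, name):
        # The format is looked up at write time, so only new files are affected.
        fmt = self.properties.get(WRITE_FORMAT_KEY, "parquet")
        data_file = DataFile(f"{name}.{fmt}", fmt)
        self.files.append(data_file)
        return data_file


# Table created with ORC as the write format.
table = ToyTable(properties={WRITE_FORMAT_KEY: "orc"})
table.append("data-00001")

# Switch the property; the existing ORC file is left untouched.
table.set_property(WRITE_FORMAT_KEY, "parquet")
table.append("data-00002")

formats = [f.file_format for f in table.files]
print(formats)
```

In this model the first file stays ORC and only the second is written as Parquet, which is the mixed-format outcome we would like to reach on the real table.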
Could you clarify whether there is a recommended approach to dynamically adjust the write format configuration (e.g., switching new data writes to Parquet while retaining existing ORC files), or whether this would require creating a new table with the desired properties? Any insights would be greatly appreciated!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org