lirui-apache commented on issue #13294: URL: https://github.com/apache/iceberg/issues/13294#issuecomment-2961540716
Hey @pvary, I went through the thread, but it doesn't seem related to the issue here. I don't intend to change whether or how the config is exposed, or to turn it on by default. What I want to discuss is: when the config is explicitly turned on but the table has no snapshot, should ExpireSnapshots clean up unused schemas/specs? I don't think this changes default behavior, because users who explicitly turn on this config are not looking for the default behavior.

Our production use case: a user creates a table with over 2k columns and then adds another 1k columns. However, the extra 1k columns are not added in a single commit but one by one, in 1k commits. This creates 1k schemas, each with over 2k columns. The metadata JSON file grows to 1.4GB, which makes the table barely usable. But since there is no snapshot so far, our maintenance service is unable to clean up the schemas with ExpireSnapshots.
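
To make the question concrete, here is a minimal Python sketch of the reachability check being discussed. It is not Iceberg's implementation. It assumes table metadata is a plain dict using the spec's `schemas`, `current-schema-id`, and `snapshots[].schema-id` keys; the `field-count` key, the function names, and the exact figure of 2000 base columns are hypothetical simplifications for illustration.

```python
def unused_schema_ids(metadata):
    """Return ids of schemas referenced by neither the current schema nor any snapshot."""
    reachable = {metadata["current-schema-id"]}
    for snapshot in metadata.get("snapshots") or []:
        schema_id = snapshot.get("schema-id")
        if schema_id is not None:
            reachable.add(schema_id)
    return sorted(
        s["schema-id"] for s in metadata["schemas"] if s["schema-id"] not in reachable
    )


def stale_field_entries(metadata):
    """Count field entries held by unreferenced schemas (a rough proxy for wasted JSON)."""
    stale = set(unused_schema_ids(metadata))
    return sum(s["field-count"] for s in metadata["schemas"] if s["schema-id"] in stale)


def build_example_metadata(base_columns=2000, added_columns=1000):
    """Model the reported case: one column added per commit, and no snapshot written.

    "field-count" is a hypothetical stand-in for the real per-schema "fields" list.
    """
    schemas = [
        {"schema-id": i, "field-count": base_columns + i}
        for i in range(added_columns + 1)
    ]
    return {
        "schemas": schemas,
        "current-schema-id": added_columns,
        "snapshots": [],  # the table has never been written to
    }


if __name__ == "__main__":
    meta = build_example_metadata()
    print(len(unused_schema_ids(meta)), "unreferenced schemas")
    print(stale_field_entries(meta), "field entries in unreferenced schemas")
```

With no snapshots, only the current schema is reachable in this model, so every earlier schema is a cleanup candidate. That is the behavior proposed here for the case where the config is explicitly turned on.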