fqaiser94 commented on issue #6514: URL: https://github.com/apache/iceberg/issues/6514#issuecomment-2258818765
Apologies @mahdibh for not responding, I suspect I was busy at the time dealing with other things.

---

@bryanck I did consider this idea in the past and I would be satisfied if we supported such a config option. (Frankly, I would have preferred if Iceberg never did refreshes internally and left that as a higher-level, explicit, user-facing concern, but that boat has long since sailed.)

I think there are two main challenges with this approach.

The first is where/how do you expose this `only_allow_explicit_refresh` (obviously a bad name) config option?

1. Exposing it on the table doesn't seem right, as this might only be necessary for specific processes, e.g. Kafka Connect, and in general the Iceberg philosophy tends to prefer an optimistic concurrency model.
2. The only other place I can think of to expose such a config option is when you initialize a new Catalog implementation. This seems more appropriate but does mean each `Catalog` implementation needs to change to take advantage of this config option (we would have to default to `false` to preserve backwards compatibility). That should be fine IMO.

The other challenge is that we need to be able to distinguish between:

1. "user-refresh calls", i.e. `table.refresh` (which we do want to allow regardless of the `only_allow_explicit_refresh` setting), and
2. "iceberg-refresh calls", i.e. the internal `apply()` and `retry` APIs (which we don't want to allow if `only_allow_explicit_refresh == true`).

This is an implementation detail, and while I'm sure it's doable, I remember I tried prototyping this ~1.5 years ago and was dissatisfied with the resulting code. Let me take another stab at it, as my knowledge of Iceberg internals has improved a lot since then XD

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
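To make the distinction in the comment above concrete, here is a minimal, self-contained sketch of how a "user-refresh" could stay allowed while an internal refresh is gated by the option. Everything in it is an assumption for illustration only: `SketchTableOps`, `internalRefresh`, and the version counter are hypothetical and are not Iceberg's actual `TableOperations` or `Catalog` API, and `onlyAllowExplicitRefresh` simply mirrors the placeholder name from the comment.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a table's metadata operations (NOT Iceberg's real API).
class SketchTableOps {
    // Hypothetical catalog-level option; false would preserve today's behaviour.
    private final boolean onlyAllowExplicitRefresh;
    // Simulates the current metadata version held by the catalog.
    private final AtomicInteger remoteVersion;
    // The version this client last saw.
    private int localVersion;

    SketchTableOps(boolean onlyAllowExplicitRefresh, AtomicInteger remoteVersion) {
        this.onlyAllowExplicitRefresh = onlyAllowExplicitRefresh;
        this.remoteVersion = remoteVersion;
        this.localVersion = remoteVersion.get();
    }

    // "User-refresh": always allowed, regardless of the option.
    void refresh() {
        localVersion = remoteVersion.get();
    }

    // "Iceberg-refresh": what internal apply/retry paths would call instead.
    // It becomes a no-op when only explicit refreshes are allowed.
    void internalRefresh() {
        if (!onlyAllowExplicitRefresh) {
            refresh();
        }
    }

    // Simulated optimistic commit: succeeds only if the catalog is still at
    // the version this client believes it is at.
    boolean commit() {
        internalRefresh();
        int base = localVersion;
        if (remoteVersion.compareAndSet(base, base + 1)) {
            localVersion = base + 1;
            return true;
        }
        return false; // stale view: the caller must refresh explicitly
    }

    int localVersion() {
        return localVersion;
    }
}
```

With the option enabled, a commit against a stale view fails until the caller invokes `refresh()` itself; with it disabled, the commit path refreshes on its own first. Routing internal callers through a separate method is just one possible way to tell the two kinds of call apart.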
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org