fqaiser94 commented on issue #6514:
URL: https://github.com/apache/iceberg/issues/6514#issuecomment-2258818765

   Apologies @mahdibh for not responding; I suspect I was busy at the time 
dealing with other things. 
   
   ---
   
   @bryanck I did consider this idea in the past, and I would be satisfied if we 
supported such a config option. (Frankly, I would have preferred it if Iceberg 
never did refreshes internally and left that as a higher-level, explicit, 
user-facing concern, but that ship has long since sailed.) 
   
   I think there are two main challenges with this approach. 
   
   The first is where and how to expose this `only_allow_explicit_refresh` 
(obviously a bad name) config option. 
   1. Exposing it on the table doesn't seem right, as this might only be 
necessary for specific processes, e.g. Kafka Connect, and in general the 
Iceberg philosophy tends to prefer an optimistic concurrency model. 
   2. The only other place I can think of to expose such a config option is 
when you initialize a new Catalog implementation. This seems more appropriate 
but does mean each `Catalog` implementation needs to change to take advantage 
of this config option (we would have to default to `false` to preserve 
backwards compatibility). That should be fine IMO. 
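
   To make option 2 concrete, here is a rough sketch of reading such a flag from the properties map a catalog receives at initialization. Everything here is hypothetical: the `RefreshConfig` class and the `only-allow-explicit-refresh` key do not exist in Iceberg, and the key name is just a placeholder. The only point is that a missing key falls back to `false`, so existing catalogs keep their current behaviour.

```java
import java.util.Map;

// Hypothetical sketch: neither this class nor the property key exists in Iceberg.
final class RefreshConfig {
  // Placeholder key name; a real option would need a better one.
  static final String ONLY_ALLOW_EXPLICIT_REFRESH = "only-allow-explicit-refresh";
  // Defaulting to false preserves backwards compatibility.
  static final boolean ONLY_ALLOW_EXPLICIT_REFRESH_DEFAULT = false;

  private final boolean onlyAllowExplicitRefresh;

  private RefreshConfig(boolean onlyAllowExplicitRefresh) {
    this.onlyAllowExplicitRefresh = onlyAllowExplicitRefresh;
  }

  // Reads the flag from catalog properties; a missing key yields the default.
  static RefreshConfig fromCatalogProperties(Map<String, String> properties) {
    String raw = properties.get(ONLY_ALLOW_EXPLICIT_REFRESH);
    boolean value =
        raw == null ? ONLY_ALLOW_EXPLICIT_REFRESH_DEFAULT : Boolean.parseBoolean(raw.trim());
    return new RefreshConfig(value);
  }

  boolean onlyAllowExplicitRefresh() {
    return onlyAllowExplicitRefresh;
  }
}
```

   Each `Catalog` implementation would still have to call something like this itself and pass the result down to whatever performs refreshes, which is the per-implementation change mentioned above.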
   
   The other challenge is that we need to be able to distinguish between:
   1. user refresh calls, i.e. `table.refresh` (which we do want to allow 
regardless of the `only_allow_explicit_refresh` setting), and 
   2. Iceberg-internal refresh calls, i.e. the internal `apply()` and `retry` 
APIs (which we don't want to allow if `only_allow_explicit_refresh == true`).
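
   A toy model of that distinction, assuming the two call paths can be given separate entry points. This is not Iceberg code: `MetadataLoader` and `GuardedRefresher` are made-up names, and an integer "version" stands in for table metadata. The hard part in practice is routing the internal call sites to the guarded method, which this sketch does not attempt.

```java
// Hypothetical stand-in for "ask the catalog for the latest table metadata".
interface MetadataLoader {
  int loadLatestVersion();
}

// Hypothetical sketch: separates explicit refreshes from internal ones.
final class GuardedRefresher {
  private final MetadataLoader loader;
  private final boolean onlyAllowExplicitRefresh;
  private int currentVersion;

  GuardedRefresher(MetadataLoader loader, boolean onlyAllowExplicitRefresh) {
    this.loader = loader;
    this.onlyAllowExplicitRefresh = onlyAllowExplicitRefresh;
    this.currentVersion = loader.loadLatestVersion();
  }

  // User-facing path (the analogue of table.refresh): always reloads.
  int refresh() {
    currentVersion = loader.loadLatestVersion();
    return currentVersion;
  }

  // Internal path (the analogue of refreshes triggered by apply()/retry):
  // skipped entirely when only explicit refreshes are allowed.
  int internalRefresh() {
    if (!onlyAllowExplicitRefresh) {
      currentVersion = loader.loadLatestVersion();
    }
    return currentVersion;
  }

  int currentVersion() {
    return currentVersion;
  }
}
```

   With the flag set to `true`, a concurrent change stays invisible to `internalRefresh()` until the caller invokes `refresh()` explicitly; with `false`, both paths reload as they do today.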
   
   This is an implementation detail, and while I'm sure it's doable, I remember 
prototyping this ~1.5 years ago and being dissatisfied with the resulting 
code. Let me take another stab at it, as my knowledge of Iceberg internals has 
improved a lot since then XD


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

