tdcmeehan commented on PR #10118:
URL: https://github.com/apache/iceberg/pull/10118#issuecomment-2049941295

   Thanks for your question @singhpk234.  The motivation is to try to make both 
improvements listed in #9991, and I would like to solicit feedback on this 
approach.
   
   tl;dr, there are two problems I'd like to solve: I would like for manifest 
caching to expose metrics to our project's existing metrics reporting 
infrastructure, and I would also like for manifest caching to be able to have a 
customizable cache key to account for custom implementations of FileIO which 
don't make the same presumptions as the default manifest cache, namely that 
there is a single long-lived instance of each type of FileIO.
   
   While there are other approaches to solve both of these problems, it seemed 
most straightforward to allow the manifest cache to be pluggable--that way, I 
can easily integrate it with my existing infrastructure, as I'm in control of 
the code which supplies the cache.  And by allowing me to write a custom cache, 
I am in complete control of my project's caching needs, without having to make 
a lot of changes in the core library.
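
   To make the idea concrete, here is a minimal sketch of what a pluggable 
cache could look like.  Every name in it (`PluggableManifestCache`, 
`CountingManifestCache`, `keyFor`) is hypothetical and is not part of Iceberg 
or of this PR; it only illustrates the two goals above, a caller-chosen cache 
key that does not depend on FileIO instance identity, and hit/miss counters 
that a metrics reporter could read.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical plug-in point: the embedding project supplies the implementation.
interface PluggableManifestCache<K> {
  // Returns the cached content for the key, loading it on a miss.
  byte[] get(K key, Function<K, byte[]> loader);

  long hits();

  long misses();
}

// Illustrative in-memory implementation that counts hits and misses.
final class CountingManifestCache<K> implements PluggableManifestCache<K> {
  private final Map<K, byte[]> entries = new ConcurrentHashMap<>();
  private final AtomicLong hits = new AtomicLong();
  private final AtomicLong misses = new AtomicLong();

  // Assumed key scheme: a logical catalog name plus the manifest location,
  // so short-lived FileIO instances can still share cached entries.
  static String keyFor(String catalogName, String manifestLocation) {
    return catalogName + "|" + manifestLocation;
  }

  @Override
  public byte[] get(K key, Function<K, byte[]> loader) {
    byte[] cached = entries.get(key);
    if (cached != null) {
      hits.incrementAndGet();
      return cached;
    }
    misses.incrementAndGet();
    // computeIfAbsent keeps a single value per key if two threads race here.
    return entries.computeIfAbsent(key, loader);
  }

  @Override
  public long hits() {
    return hits.get();
  }

  @Override
  public long misses() {
    return misses.get();
  }
}
```

   A real implementation would also need eviction and size limits, which this 
sketch leaves out; the counters are the part a project could forward to its 
own metrics reporting.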
   
   The feedback I would request is: does the Iceberg core library want these 
things as well?  i.e., does the core library want the ability to handle custom 
FileIO caching which does not use long-lived FileIO references, and does the 
core project want to expose metrics for manifest caching?  In order to support 
such things cleanly, we may require some refactoring to FileIO and 
ManifestCaches.  I have thoughts around that, but wanted to get feedback from the 
community before I proceed either way.  (My personal thinking is, caching is a 
bit of an involved infrastructure investment, and it may make sense to simply 
provide a basic implementation (as done here), but allow arbitrary cache 
customization for those that require such sophistication, rather than build it 
into the core library.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


