tdcmeehan commented on PR #10118: URL: https://github.com/apache/iceberg/pull/10118#issuecomment-2049941295
Thanks for your question @singhpk234. The motivation is to make both improvements listed in #9991, and I would like to solicit feedback on this approach.

tl;dr, there are two problems I'd like to solve. First, I would like manifest caching to expose metrics to our project's existing metrics reporting infrastructure. Second, I would like manifest caching to support a customizable cache key, to account for custom implementations of FileIO which don't make the same assumption as the default manifest cache, namely that there is a single long-lived instance of each type of FileIO.

While there are other approaches to solve both of these problems, it seemed most straightforward to allow the manifest cache to be pluggable. That way, I can easily integrate it with my existing infrastructure, as I'm in control of the code which supplies the cache. And by writing a custom cache, I am in complete control of my project's caching needs, without having to make a lot of changes in the core library.

The feedback I would request is: does the Iceberg core library want these things as well? That is, does the core library want the ability to handle custom FileIO caching which does not rely on long-lived FileIO references, and does the core project want to expose metrics for manifest caching? Supporting these cleanly may require some refactoring of FileIO and ManifestCaches. I have thoughts on that, but wanted to get feedback from the community before I proceed either way.

(My personal thinking is that caching is a fairly involved infrastructure investment, and it may make sense to provide only a basic implementation (as done here) while allowing arbitrary cache customization for those who require such sophistication, rather than building it into the core library.)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org