[I] Iceberg Cache [iceberg-rust]

via GitHub Thu, 17 Apr 2025 00:48:21 -0700


Xuanwo opened a new issue, #1226:
URL: https://github.com/apache/iceberg-rust/issues/1226


   ### What's the feature you are trying to implement?
   
   Caching is an essential part of working with Iceberg tables, and different 
types of cache are needed at various levels.
   
   For example, for our table metadata, we will need a `Manifest` cache so that 
we don't have to read and deserialize the same manifest files repeatedly. For 
our Parquet files, we will need a `FileMetadata` cache to avoid parsing the 
metadata from the Parquet files each time. We could even implement a raw data 
cache to store portions of data files, eliminating the need to download them 
from S3 again.
   
   As the foundation for various query engines, iceberg-rust should be designed 
to simplify integration while still allowing each engine to fully optimize 
performance. This applies whether they are using iceberg-rust on a single 
machine or within a distributed cluster.
   
   I plan to add a set of cache APIs to meet all those needs. My current plan 
is:
   
   - `ObjectCache`: an object cache trait that can hold objects like `Manifest` 
or `FileMetadata`
   - `BytesCache`: a bytes cache that can hold the raw content of files, like 
`table_metadata.json` files.
   - An in-FileIO cache, similar to OpenDAL's CacheLayer; the API is not decided yet.
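
   To make the `ObjectCache` idea concrete, here is a minimal sketch of what such 
a trait could look like. It is not the API from the linked PRs: the trait 
signature, `MemoryObjectCache`, `get_or_load`, and the stand-in `Manifest` struct 
are all hypothetical names chosen for illustration, and the in-memory 
implementation has no eviction.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, Mutex};

/// Hypothetical trait: caches already-deserialized objects behind a key.
/// Values are shared via `Arc` so a hit never copies the object.
pub trait ObjectCache<K, V>: Send + Sync {
    fn get(&self, key: &K) -> Option<Arc<V>>;
    fn set(&self, key: K, value: Arc<V>);
}

/// Illustrative unbounded in-memory cache (assumption: no eviction policy).
pub struct MemoryObjectCache<K, V> {
    entries: Mutex<HashMap<K, Arc<V>>>,
}

impl<K, V> MemoryObjectCache<K, V> {
    pub fn new() -> Self {
        Self {
            entries: Mutex::new(HashMap::new()),
        }
    }
}

impl<K, V> ObjectCache<K, V> for MemoryObjectCache<K, V>
where
    K: Eq + Hash + Send,
    V: Send + Sync,
{
    fn get(&self, key: &K) -> Option<Arc<V>> {
        self.entries.lock().unwrap().get(key).cloned()
    }

    fn set(&self, key: K, value: Arc<V>) {
        self.entries.lock().unwrap().insert(key, value);
    }
}

/// Stand-in for a parsed manifest; NOT the real iceberg-rust type.
pub struct Manifest {
    pub path: String,
}

/// Returns the cached manifest for `path`, "loading" it only on a miss.
/// `loads` counts how often the (simulated) read-and-deserialize step ran.
pub fn get_or_load<C>(cache: &C, path: &str, loads: &mut usize) -> Arc<Manifest>
where
    C: ObjectCache<String, Manifest>,
{
    let key = path.to_string();
    if let Some(hit) = cache.get(&key) {
        return hit;
    }
    // Cache miss: a real engine would read and deserialize the file here.
    *loads += 1;
    let manifest = Arc::new(Manifest { path: key.clone() });
    cache.set(key, Arc::clone(&manifest));
    manifest
}
```

   Keeping the cache behind a trait would let each engine plug in its own 
implementation (for example a bounded or distributed one) while iceberg-rust 
only depends on `get` and `set`.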
   
   ## Tasks
   
   - ObjectCache
     - [ ] https://github.com/apache/iceberg-rust/pull/1222
     - [ ] https://github.com/apache/iceberg-rust/pull/1225
   - BytesCache
   - OpenDAL CacheLayer (TBD)
   
   ### Willingness to contribute
   
   I can contribute to this feature independently


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

