[PR] Object Cache: caches parsed Manifests and ManifestLists for performance [iceberg-rust]

via GitHub Tue, 30 Jul 2024 18:30:32 -0700


sdd opened a new pull request, #512:
URL: https://github.com/apache/iceberg-rust/pull/512


   This builds on top of the [concurrent scans PR](https://github.com/apache/iceberg-rust/pull/373) 
and so needs to be merged after that.
   
   It caches parsed instances of `Manifest` and `ManifestList` objects so that 
they are not re-fetched and re-parsed when the same object is required in a 
subsequent scan. Experiments on the test data in my perf testing branch show 
that this can reduce the time taken for a second execution of `plan_files` 
from 650ms down to 5ms, even when the second scan uses a different filter 
predicate.
   
   The cache is an LRU cache implemented using the great 
[moka](https://github.com/moka-rs/moka) crate. By default the cache size is 
32MB, but it can be configured to use any size or be disabled entirely.
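
   To illustrate the idea only, here is a minimal std-only sketch of a 
size-bounded LRU cache for parsed objects. It is not the code in this PR and 
it does not use moka's API; `ParsedManifest`, `ObjectCache`, `get_or_load` and 
`loads_for_two_scans` are hypothetical names, and the per-object `size_bytes` 
is an assumed stand-in for however the real cache weighs its entries.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Hypothetical stand-in for a parsed manifest; not the iceberg-rust type.
#[derive(Debug)]
struct ParsedManifest {
    path: String,
    size_bytes: usize, // assumed weight of the parsed object
}

/// Size-bounded LRU cache keyed by manifest path (illustrative only).
struct ObjectCache {
    max_bytes: usize, // 0 disables caching entirely
    used_bytes: usize,
    entries: HashMap<String, Arc<ParsedManifest>>,
    recency: VecDeque<String>, // front = least recently used
}

impl ObjectCache {
    fn new(max_bytes: usize) -> Self {
        Self { max_bytes, used_bytes: 0, entries: HashMap::new(), recency: VecDeque::new() }
    }

    /// Returns the cached object, or calls `load` (fetch + parse) on a miss.
    fn get_or_load<F>(&mut self, path: &str, load: F) -> Arc<ParsedManifest>
    where
        F: FnOnce(&str) -> ParsedManifest,
    {
        if let Some(hit) = self.entries.get(path).cloned() {
            self.touch(path);
            return hit;
        }
        let parsed = Arc::new(load(path));
        // Skip caching when disabled or when the object alone exceeds the budget.
        if self.max_bytes > 0 && parsed.size_bytes <= self.max_bytes {
            // Evict least recently used entries until the new one fits.
            while self.used_bytes + parsed.size_bytes > self.max_bytes {
                match self.recency.pop_front() {
                    Some(old) => {
                        if let Some(evicted) = self.entries.remove(&old) {
                            self.used_bytes -= evicted.size_bytes;
                        }
                    }
                    None => break,
                }
            }
            self.used_bytes += parsed.size_bytes;
            self.entries.insert(path.to_string(), Arc::clone(&parsed));
            self.recency.push_back(path.to_string());
        }
        parsed
    }

    /// Marks `path` as the most recently used entry.
    fn touch(&mut self, path: &str) {
        if let Some(pos) = self.recency.iter().position(|p| p == path) {
            if let Some(key) = self.recency.remove(pos) {
                self.recency.push_back(key);
            }
        }
    }
}

/// Simulates two scans over the same two manifests and returns how many
/// fetch-and-parse calls each scan triggered.
fn loads_for_two_scans(cache: &mut ObjectCache) -> (usize, usize) {
    let paths = ["m1.avro", "m2.avro"];
    let mut counts = [0usize; 2];
    for scan in 0..2 {
        for p in paths {
            cache.get_or_load(p, |path| {
                counts[scan] += 1;
                ParsedManifest { path: path.to_string(), size_bytes: 1024 }
            });
        }
    }
    (counts[0], counts[1])
}
```

   With a 32MB budget the second simulated scan triggers no loads at all, 
while a budget of 0 (cache disabled) makes every scan load every manifest 
again. The cache key here is the object's path, so a different filter 
predicate on the second scan would still hit the same entries.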
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


