dentiny opened a new pull request, #1719:
URL: https://github.com/apache/iceberg-rust/pull/1719

   ## Which issue does this PR close?
   
   - Closes https://github.com/apache/iceberg-rust/pull/1698
   
   ## What changes are included in this PR?
   
   Context: I see significant CPU time spent on manifest list loading, especially 
Avro deserialization (see the linked PR for details). I want to leverage the 
object cache to avoid unnecessary IO and deserialization.
   
   I discussed this briefly online with @liurenjie1024; see:
   - https://github.com/apache/iceberg-rust/pull/1698#issuecomment-3332906860
   - https://github.com/apache/iceberg-rust/pull/512#discussion_r2365940293
   
   We lean towards the following approach:
   - Make the object cache a read-through and write-through cache for manifests 
and manifest lists
   - Later loads try the object cache first; this could be either a 
read-through cache or, for easier implementation, a look-aside cache
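
A minimal sketch of the read-through and write-through behavior described above. Every name here (`ObjectCache`, `ManifestList`, `put`, `get_or_load`, the `memory://` location) is a hypothetical stand-in for illustration and not the actual iceberg-rust API; the closure passed to `get_or_load` stands in for the IO and Avro deserialization that the cache is meant to avoid.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for a parsed manifest list (not the iceberg-rust type).
#[derive(Debug, PartialEq)]
struct ManifestList {
    entries: Vec<String>,
}

// Hypothetical cache keyed by manifest list location.
struct ObjectCache {
    enabled: bool,
    manifest_lists: Mutex<HashMap<String, Arc<ManifestList>>>,
}

impl ObjectCache {
    fn new(enabled: bool) -> Self {
        Self {
            enabled,
            manifest_lists: Mutex::new(HashMap::new()),
        }
    }

    // Write-through: store the in-memory object when it is written out,
    // but only if the cache is enabled.
    fn put(&self, location: &str, list: Arc<ManifestList>) {
        if self.enabled {
            self.manifest_lists
                .lock()
                .unwrap()
                .insert(location.to_string(), list);
        }
    }

    // Read-through: return the cached object if present; otherwise run
    // `load` (IO plus deserialization) and cache the result.
    fn get_or_load<F>(&self, location: &str, load: F) -> Arc<ManifestList>
    where
        F: FnOnce() -> ManifestList,
    {
        if !self.enabled {
            return Arc::new(load());
        }
        let cached = self.manifest_lists.lock().unwrap().get(location).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        let loaded = Arc::new(load());
        self.manifest_lists
            .lock()
            .unwrap()
            .insert(location.to_string(), Arc::clone(&loaded));
        loaded
    }
}

fn main() {
    let cache = ObjectCache::new(true);
    let location = "memory://snap-1.avro"; // hypothetical location
    let mut loads = 0;

    // First read misses and runs the loader; the second is served from cache.
    for _ in 0..2 {
        let list = cache.get_or_load(location, || {
            loads += 1;
            ManifestList { entries: vec!["manifest-a.avro".to_string()] }
        });
        println!("entries = {:?}", list.entries);
    }
    println!("loader ran {} time(s)", loads);
}
```

With a look-aside design, the caller would instead check the cache, load on a miss, and call `put` itself; the read-through variant keeps that logic in one place at the cost of threading a loader through the cache.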
   
   I plan to structure and split the series of PRs as follows:
   - [ ] Store the manifest list in the object cache, if the cache is enabled
   - [ ] Load the manifest list through the object cache, which makes object 
store a part of file IO
   - [ ] Apply the same procedure to manifest files
   
   ## Are these changes tested?
   
   Yes, a unit test is added.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

