Shekharrajak commented on PR #16224:
URL: https://github.com/apache/iceberg/pull/16224#issuecomment-4628172057

   > Hi @Shekharrajak, thanks for the PR. What is the context? Are there any 
callers?
   
   @szehon-ho Thanks for checking. Here is the context and the intended callers:
   
   - CherryPickOperation 
(core/src/main/java/org/apache/iceberg/CherryPickOperation.java) currently has 
its own manifest-reading helpers, which duplicate logic that belongs in 
SnapshotChanges. With a streaming SnapshotChanges, validateReplacedPartitions 
can walk the ancestors between snapshots and iterate readAddedDataFiles() in a 
try-with-resources block.
   - Maintenance monitoring code that reads back several snapshots can 
aggregate counts while streaming files instead of building lists first, e.g. 
Flink's TableChange.java.
   - In follow-up PRs we will add closeable streaming APIs to 
SnapshotChanges:
   
        CloseableIterable<DataFile> readAddedDataFiles()
        CloseableIterable<DataFile> readRemovedDataFiles()
        CloseableIterable<DeleteFile> readAddedDeleteFiles()
        CloseableIterable<DeleteFile> readRemovedDeleteFiles()
   
   Then keep current APIs:
   
        Iterable<DataFile> addedDataFiles()
        Iterable<DataFile> removedDataFiles()
   
   This lets callers that process many changed files stream them, instead of 
holding O(all changed files) in memory at once.
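
   To illustrate the streaming pattern described above, here is a minimal, 
   self-contained sketch. It does not use the real Iceberg classes: 
   ClosableFiles, FileEntry, TrackedFiles and sumRecords are hypothetical 
   stand-ins invented for this example (ClosableFiles plays the role of 
   CloseableIterable, FileEntry the role of DataFile), and the proposed 
   readAddedDataFiles() may end up looking different. The point is only that 
   a caller can aggregate while iterating and rely on try-with-resources to 
   release the underlying readers.

```java
import java.io.Closeable;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class StreamingCountSketch {

  // Hypothetical stand-in for Iceberg's CloseableIterable<T>.
  // close() is narrowed to throw no checked exception to keep the sketch short.
  interface ClosableFiles<T> extends Iterable<T>, Closeable {
    @Override
    void close();
  }

  // Hypothetical stand-in for a DataFile: just a path and a record count.
  static final class FileEntry {
    final String path;
    final long recordCount;

    FileEntry(String path, long recordCount) {
      this.path = path;
      this.recordCount = recordCount;
    }
  }

  // In-memory implementation that records whether close() was called.
  // A real implementation would read manifests lazily instead of wrapping a list.
  static final class TrackedFiles implements ClosableFiles<FileEntry> {
    private final List<FileEntry> files;
    boolean closed = false;

    TrackedFiles(List<FileEntry> files) {
      this.files = files;
    }

    @Override
    public Iterator<FileEntry> iterator() {
      return files.iterator();
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Aggregates while iterating: only the running total is kept, never a list of files.
  static long sumRecords(ClosableFiles<FileEntry> added) {
    long total = 0;
    try (ClosableFiles<FileEntry> files = added) {
      for (FileEntry file : files) {
        total += file.recordCount;
      }
    }
    return total;
  }

  public static void main(String[] args) {
    TrackedFiles added = new TrackedFiles(Arrays.asList(
        new FileEntry("a.parquet", 10),
        new FileEntry("b.parquet", 32)));
    long total = sumRecords(added);
    System.out.println(total + " records, closed=" + added.closed);
  }
}
```

   With a lazily backed iterable, the caller's memory use would depend on the 
   aggregate it keeps rather than on the number of changed files, which is the 
   motivation for adding the read* variants alongside the existing 
   list-returning APIs.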
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

