Shekharrajak commented on PR #16224:
URL: https://github.com/apache/iceberg/pull/16224#issuecomment-4628172057
> Hi @Shekharrajak , thanks for the pr. what is the context? is there any
> callers?
@szehon-ho Thanks for checking. Here is the context:
- CherryPickOperation currently has local manifest-reading helpers
(core/src/main/java/org/apache/iceberg/CherryPickOperation.java) that
duplicate logic which belongs in SnapshotChanges. With a streaming
SnapshotChanges, validateReplacedPartitions can use the ancestors between
snapshots and iterate readAddedDataFiles() in a try-with-resources block.
- Where maintenance monitoring reads back several snapshots, it can
aggregate counts while streaming files instead of building lists first,
e.g. Flink's TableChange.java.
- In follow-up PRs we will add closeable streaming APIs to SnapshotChanges:
    CloseableIterable<DataFile> readAddedDataFiles()
    CloseableIterable<DataFile> readRemovedDataFiles()
    CloseableIterable<DeleteFile> readAddedDeleteFiles()
    CloseableIterable<DeleteFile> readRemovedDeleteFiles()
  The current APIs stay as they are:
    Iterable<DataFile> addedDataFiles()
    Iterable<DataFile> removedDataFiles()
  Large callers can then use the streaming variants and avoid
  O(all changed files) memory use.
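
To illustrate the aggregate-while-streaming pattern described above, here is a rough, self-contained sketch. It does not use the Iceberg classes: the nested CloseableIterable interface is a simplified stand-in for Iceberg's type of the same name, and FileStub, StubSource and countFilesAndRecords are hypothetical names made up for this example. The proposed readAddedDataFiles() is not called anywhere; the stub source only plays its role.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.List;

public class StreamingCountSketch {

  // Simplified stand-in for Iceberg's CloseableIterable, so the sketch compiles alone.
  interface CloseableIterable<T> extends Iterable<T>, Closeable {}

  // Hypothetical stand-in for DataFile: only a path and a record count.
  static final class FileStub {
    final String path;
    final long recordCount;

    FileStub(String path, long recordCount) {
      this.path = path;
      this.recordCount = recordCount;
    }
  }

  // Hypothetical source. It is backed by a list only to keep the sketch small;
  // a real implementation would read manifests lazily. It records close() calls.
  static final class StubSource implements CloseableIterable<FileStub> {
    private final List<FileStub> files;
    boolean closed = false;

    StubSource(List<FileStub> files) {
      this.files = files;
    }

    @Override
    public Iterator<FileStub> iterator() {
      return files.iterator();
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Aggregates while iterating and never copies the files into a collection.
  // Returns {fileCount, recordCount}. The source is closed by try-with-resources.
  static long[] countFilesAndRecords(CloseableIterable<FileStub> added) {
    long fileCount = 0;
    long recordCount = 0;
    try (CloseableIterable<FileStub> files = added) {
      for (FileStub file : files) {
        fileCount++;
        recordCount += file.recordCount;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new long[] {fileCount, recordCount};
  }

  public static void main(String[] args) {
    StubSource source =
        new StubSource(List.of(new FileStub("a.parquet", 10), new FileStub("b.parquet", 32)));
    long[] totals = countFilesAndRecords(source);
    System.out.println(
        totals[0] + " files, " + totals[1] + " records, closed=" + source.closed);
  }
}
```

The caller only ever holds the running totals, so memory use does not grow with the number of changed files, and the try-with-resources block closes the underlying reader even if iteration throws.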
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]