ajantha-bhat commented on code in PR #6661:
URL: https://github.com/apache/iceberg/pull/6661#discussion_r1135420154
##########
core/src/main/java/org/apache/iceberg/PartitionsTable.java:
##########
@@ -220,21 +257,53 @@ Iterable<Partition> all() {
static class Partition {
private final StructLike key;
- private long recordCount;
- private int fileCount;
private int specId;
+ private long dataRecordCount;
+ private int dataFileCount;
+
+ private final Set<DeleteFile> equalityDeleteFiles;
+ private final Set<DeleteFile> positionDeleteFiles;
Partition(StructLike key) {
this.key = key;
- this.recordCount = 0;
- this.fileCount = 0;
this.specId = 0;
+ this.dataRecordCount = 0;
+ this.dataFileCount = 0;
+ this.positionDeleteFiles = Sets.newHashSet();
+ this.equalityDeleteFiles = Sets.newHashSet();
+ }
+
+ private void update(FileScanTask task) {
Review Comment:
> Probably the only way to effectively do it, until this whole table is
> migrated over to some kind of view of 'files' table, is to rewrite the
> PartitionsTableScan to directly use the underlying code:
> ManifestReader.readDeleteManifest() / ManifestReader.read(), and then go
> through those iterators, instead of using the ManifestGroup.planFiles() /
> FileScanTask way.

@szehon-ho: I have spent some time on this and realized that I am not very
familiar with this part of the code. Would you like to contribute a PR that
does this for data files? I can then extend it to delete files and handle
these stats updates.
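
To illustrate the approach quoted above, here is a minimal, self-contained sketch of aggregating per-partition stats in a single pass over manifest entries instead of over FileScanTasks. All the types here (`PartitionStatsSketch`, `Entry`, `Content`, `Stats`) are hypothetical stand-ins and not Iceberg classes, and the sketch assumes each file shows up exactly once across the manifests that are read.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; none of these are Iceberg classes.
class PartitionStatsSketch {
  enum Content { DATA, POSITION_DELETES, EQUALITY_DELETES }

  // One entry as it might come out of a data or delete manifest iterator.
  static final class Entry {
    final String partitionKey;
    final Content content;
    final long recordCount;

    Entry(String partitionKey, Content content, long recordCount) {
      this.partitionKey = partitionKey;
      this.content = content;
      this.recordCount = recordCount;
    }
  }

  // Per-partition counters, loosely mirroring the fields in the diff above.
  static final class Stats {
    long dataRecordCount;
    int dataFileCount;
    int positionDeleteFileCount;
    int equalityDeleteFileCount;
  }

  // Single pass over entries. Assumption: each file appears exactly once,
  // so plain counters are enough and no de-duplicating Set is needed.
  static Map<String, Stats> aggregate(Iterable<Entry> entries) {
    Map<String, Stats> byPartition = new LinkedHashMap<>();
    for (Entry e : entries) {
      Stats s = byPartition.computeIfAbsent(e.partitionKey, k -> new Stats());
      switch (e.content) {
        case DATA:
          s.dataRecordCount += e.recordCount;
          s.dataFileCount++;
          break;
        case POSITION_DELETES:
          s.positionDeleteFileCount++;
          break;
        case EQUALITY_DELETES:
          s.equalityDeleteFileCount++;
          break;
      }
    }
    return byPartition;
  }
}
```

In a real change the entries would come from the manifest readers mentioned in the quote, and the partition key would be a StructLike plus spec ID rather than a String.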
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]