szehon-ho commented on code in PR #7630:
URL: https://github.com/apache/iceberg/pull/7630#discussion_r1199170227
##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java:
##########
@@ -159,26 +161,25 @@ public RewriteDataFiles.Result execute() {
     validateAndInitOptions();
-    Map<StructLike, List<List<FileScanTask>>> fileGroupsByPartition =
+    StructLikeMap<List<List<FileScanTask>>> fileGroupsByPartition =
         planFileGroups(startingSnapshotId);
     RewriteExecutionContext ctx = new RewriteExecutionContext(fileGroupsByPartition);
     if (ctx.totalGroupCount() == 0) {
       LOG.info("Nothing found to rewrite in {}", table.name());
-      return ImmutableRewriteDataFiles.Result.builder().rewriteResults(ImmutableList.of()).build();
+      return EMPTY_RESULT;
     }
     Stream<RewriteFileGroup> groupStream = toGroupStream(ctx, fileGroupsByPartition);
-    RewriteDataFilesCommitManager commitManager = commitManager(startingSnapshotId);
     if (partialProgressEnabled) {
-      return doExecuteWithPartialProgress(ctx, groupStream, commitManager);
+      return doExecuteWithPartialProgress(ctx, groupStream, commitManager(startingSnapshotId));
     } else {
-      return doExecute(ctx, groupStream, commitManager);
+      return doExecute(ctx, groupStream, commitManager(startingSnapshotId));
     }
   }
-  Map<StructLike, List<List<FileScanTask>>> planFileGroups(long startingSnapshotId) {
+  StructLikeMap<List<List<FileScanTask>>> planFileGroups(long startingSnapshotId) {
Review Comment:
> I think I tried it before. Have you checked this comment?
> https://github.com/apache/iceberg/pull/7630#discussion_r1197398322
Yeah, I saw that, which is why my suggestion uses the table's latest
partitionType rather than coerce. It should then behave exactly the same as
the existing code, right?
> If I use transformValues, it cannot check the existing fileGroups.size() > 0
> check due to generics. Are we ok with that? That's why I didn't do
Yeah, in RewritePositionDeleteFilesSparkAction we moved that check into
toGroupStream, which I think you also copied here.
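
To illustrate the point about where the size check lives, here is a minimal
sketch using only the JDK. It is not the actual Iceberg code: the class
GroupStreamSketch, the String partition keys and file names, the planGroups
rule, and this transformValues helper are all hypothetical stand-ins for
StructLikeMap, FileScanTask, and the real planner. It only shows that a
value-transforming step can leave empty group lists in the map, and that the
stream-building step can drop them afterwards.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Stream;

public class GroupStreamSketch {

  // Hypothetical stand-in for a transformValues-style helper: every key is kept,
  // so a partition whose planner returns no groups still maps to an empty list.
  public static <K, V, R> Map<K, R> transformValues(Map<K, V> input, Function<V, R> fn) {
    Map<K, R> result = new LinkedHashMap<>();
    input.forEach((key, value) -> result.put(key, fn.apply(value)));
    return result;
  }

  // Hypothetical planning rule: a partition with a single file yields no groups.
  public static List<List<String>> planGroups(List<String> files) {
    List<List<String>> groups = new ArrayList<>();
    if (files.size() > 1) {
      groups.add(files);
    }
    return groups;
  }

  // Hypothetical stand-in for building a rewrite group: pairs a partition with its files.
  public static Map.Entry<String, List<String>> newGroup(String partition, List<String> files) {
    return new SimpleImmutableEntry<>(partition, files);
  }

  // The emptiness check lives here instead of in the planning step.
  // flatMap alone would also skip empty lists; the filter makes the intent explicit.
  public static Stream<Map.Entry<String, List<String>>> toGroupStream(
      Map<String, List<List<String>>> groupsByPartition) {
    return groupsByPartition.entrySet().stream()
        .filter(e -> !e.getValue().isEmpty())
        .flatMap(e -> e.getValue().stream().map(files -> newGroup(e.getKey(), files)));
  }
}
```

With this shape, the planning step needs no knowledge of the value type's
size, which sidesteps the generics concern raised in the quoted comment.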
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]