97harsh commented on code in PR #14964:
URL: https://github.com/apache/iceberg/pull/14964#discussion_r2662381012
##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java:
##########
@@ -157,13 +158,19 @@ public RewriteDataFilesSparkAction filter(Expression expression) {
return this;
}
+ public RewriteDataFilesSparkAction toBranch(String targetBranch) {
+ this.branch = targetBranch;
+ return this;
+ }
+
@Override
public RewriteDataFiles.Result execute() {
if (table.currentSnapshot() == null) {
return EMPTY_RESULT;
}
- long startingSnapshotId = table.currentSnapshot().snapshotId();
+ long startingSnapshotId =
+ branch != null ? table.snapshot(branch).snapshotId() : table.currentSnapshot().snapshotId();
Review Comment:
Thank you, that's a valid suggestion.
I considered adding the check in the procedure, but I think it makes more
sense to add it in the action class because:
- The action can be used directly via
SparkActions.get().rewriteDataFiles(table).toBranch("branch").execute() without
going through the procedure. Validating in the action ensures all callers are
protected.
- It stays consistent with other validations in this class (e.g.,
maxConcurrentFileGroupRewrites >= 1, partialProgressEnabled checks), which are
done in the action, not the procedure.
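
For illustration, here is a minimal sketch of what validating in the action could look like. It assumes the check under discussion is that the target branch actually exists before its snapshot ID is read. `BranchValidationSketch`, `startingSnapshotId`, and the `branchHeads` map are hypothetical stand-ins so the example runs without Iceberg on the classpath; this is not the code in the PR.

```java
import java.util.Map;

// Hypothetical stand-in for the action's branch handling; not Iceberg code.
final class BranchValidationSketch {
  private BranchValidationSketch() {}

  /**
   * Resolves the snapshot ID a rewrite should start from.
   *
   * @param branchHeads branch name to head snapshot ID (stand-in for the table's refs)
   * @param currentSnapshotId head snapshot ID of the main branch
   * @param branch target branch, or null to fall back to the main branch
   */
  static long startingSnapshotId(
      Map<String, Long> branchHeads, long currentSnapshotId, String branch) {
    if (branch == null) {
      // No target branch set: keep the pre-existing behavior.
      return currentSnapshotId;
    }

    Long head = branchHeads.get(branch);
    if (head == null) {
      // Fail with a clear message instead of a NullPointerException
      // from dereferencing a missing snapshot.
      throw new IllegalArgumentException(
          "Cannot rewrite data files: branch does not exist: " + branch);
    }

    return head;
  }
}
```

Because the check sits next to the lookup, a caller that builds the action directly and a caller that goes through the procedure would both hit the same error path.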
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]