pvary commented on PR #11513: URL: https://github.com/apache/iceberg/pull/11513#issuecomment-2475910745
> For easier review it would be great if you could highlight the changes made.
>
> I see the note on RewritePlanResult but i'm not sure where this class came from. It is unclear what has been extracted and renamed and what is new code.

Let me amend this now, and thanks for taking a look anyway!

There was no significant new code in the refactor. I created a single new class (`RewriteFileGroupPlanner`) for the functionality, and a new class (`RewritePlan`) for storing the result.

The responsibility of `RewriteFileGroupPlanner` is to group the rewrite tasks together and return the calculated plan. Originally, `RewriteDataFilesSparkAction.toGroupStream` did the grouping. It used several other methods (`planFileGroups`, `groupByPartition`, `fileGroupsByPartition`, `newRewriteGroup`) and the `RewriteExecutionContext` inner class to do so. Those methods and the inner class were moved to `RewriteFileGroupPlanner`.

The result got its own class so that the planner can return not only the stream of groups but also the total counts, which the current Spark implementation needs.
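To illustrate the split described above, here is a minimal, self-contained sketch of a planner that groups tasks by partition and returns a result object carrying both the groups and the total counts. It is not the code from the PR: `RewritePlanSketch`, `FileTask`, `FileGroup`, `Plan`, and the size-based splitting rule are all hypothetical stand-ins chosen for illustration, and the sketch assumes Java 16 or newer for records.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical, simplified stand-ins; these are NOT the Iceberg classes from the PR.
final class RewritePlanSketch {
  private RewritePlanSketch() {}

  /** A file to rewrite, reduced to the two attributes this sketch needs. */
  record FileTask(String partition, long sizeBytes) {}

  /** One unit of rewrite work: files that all belong to a single partition. */
  record FileGroup(String partition, List<FileTask> files) {}

  /** Planner result: the groups plus the counts a caller needs for reporting. */
  record Plan(List<FileGroup> groups, int totalGroupCount, Map<String, Integer> groupsPerPartition) {
    Stream<FileGroup> groupStream() {
      return groups.stream();
    }
  }

  /** Groups tasks by partition, then splits each partition into size-bounded groups. */
  static Plan plan(List<FileTask> tasks, long maxGroupSizeBytes) {
    // Step 1: bucket tasks by partition, keeping first-seen partition order.
    Map<String, List<FileTask>> byPartition = new LinkedHashMap<>();
    for (FileTask task : tasks) {
      byPartition.computeIfAbsent(task.partition(), p -> new ArrayList<>()).add(task);
    }

    // Step 2: within each partition, start a new group whenever adding the next
    // file would exceed the size limit (an assumed rule, for illustration only).
    List<FileGroup> groups = new ArrayList<>();
    Map<String, Integer> groupsPerPartition = new LinkedHashMap<>();
    for (Map.Entry<String, List<FileTask>> entry : byPartition.entrySet()) {
      List<FileTask> current = new ArrayList<>();
      long currentSize = 0;
      int count = 0;
      for (FileTask task : entry.getValue()) {
        if (!current.isEmpty() && currentSize + task.sizeBytes() > maxGroupSizeBytes) {
          groups.add(new FileGroup(entry.getKey(), List.copyOf(current)));
          count++;
          current.clear();
          currentSize = 0;
        }
        current.add(task);
        currentSize += task.sizeBytes();
      }
      if (!current.isEmpty()) {
        groups.add(new FileGroup(entry.getKey(), List.copyOf(current)));
        count++;
      }
      groupsPerPartition.put(entry.getKey(), count);
    }

    // Step 3: return the groups together with the totals, so the caller does
    // not have to consume the stream just to learn how many groups exist.
    return new Plan(List.copyOf(groups), groups.size(), groupsPerPartition);
  }
}
```

A caller would invoke `RewritePlanSketch.plan(tasks, limit)`, read `totalGroupCount()` and `groupsPerPartition()` up front for progress reporting, and then iterate `groupStream()` to execute the rewrites.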