aokolnychyi commented on code in PR #11273:
URL: https://github.com/apache/iceberg/pull/11273#discussion_r1799857526
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java:
##########
@@ -169,7 +174,13 @@ public DeltaWriterFactory createBatchWriterFactory(PhysicalWriteInfo info) {
     // broadcast the table metadata as the writer factory will be sent to executors
     Broadcast<Table> tableBroadcast =
         sparkContext.broadcast(SerializableTableWithSize.copyOf(table));
-    return new PositionDeltaWriteFactory(tableBroadcast, command, context, writeProperties);
+    Broadcast<Map<String, DeleteFileSet>> rewritableDeletes = null;
+    if (context.deleteGranularity() == DeleteGranularity.FILE && scan != null) {
+      rewritableDeletes = sparkContext.broadcast(scan.rewritableDeletes());
Review Comment:
We should avoid the broadcast if the map of rewritable deletes is null or
empty. I'd also move this into a helper method and update the comment and
invocation above for consistency.
```
@Override
public DeltaWriterFactory createBatchWriterFactory(PhysicalWriteInfo info) {
  // broadcast large objects as the writer factory will be sent to executors
  return new PositionDeltaWriteFactory(
      sparkContext.broadcast(SerializableTableWithSize.copyOf(table)),
      broadcastRewritableDeletes(),
      ...
}

private Broadcast<Map<String, DeleteFileSet>> broadcastRewritableDeletes() {
  if (context.deleteGranularity() == DeleteGranularity.FILE && scan != null) {
    Map<String, DeleteFileSet> rewritableDeletes = scan.rewritableDeletes();
    if (rewritableDeletes != null && !rewritableDeletes.isEmpty()) {
      return sparkContext.broadcast(rewritableDeletes);
    }
  }
  return null;
}
```
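One consequence of the helper returning `null` is that whatever consumes the broadcast on the executor side has to treat `null` as "nothing to rewrite". The guard and a null-tolerant lookup could be sketched in isolation as below. This is only an illustration under stated assumptions: `FakeBroadcast`, `broadcastIfNonEmpty`, and `deletesFor` are hypothetical names, `FakeBroadcast` stands in for Spark's `Broadcast`, and `Set<String>` stands in for `DeleteFileSet`; none of this is the PR's actual code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class RewritableDeletesSketch {

  // Hypothetical stand-in for Spark's Broadcast<T>; only value() is modeled.
  static final class FakeBroadcast<T> {
    private final T value;

    FakeBroadcast(T value) {
      this.value = value;
    }

    T value() {
      return value;
    }
  }

  // Counts created broadcasts so the effect of the guard is observable.
  static int broadcastCount = 0;

  static <T> FakeBroadcast<T> broadcast(T value) {
    broadcastCount++;
    return new FakeBroadcast<>(value);
  }

  // Same guard as the suggested helper: broadcast only a non-null, non-empty map.
  static FakeBroadcast<Map<String, Set<String>>> broadcastIfNonEmpty(
      Map<String, Set<String>> rewritableDeletes) {
    if (rewritableDeletes != null && !rewritableDeletes.isEmpty()) {
      return broadcast(rewritableDeletes);
    }
    return null;
  }

  // Consumer side (assumed shape): a null broadcast means there is nothing to rewrite.
  static Set<String> deletesFor(
      FakeBroadcast<Map<String, Set<String>>> deletesBroadcast, String dataFilePath) {
    if (deletesBroadcast == null) {
      return Collections.emptySet();
    }
    return deletesBroadcast.value().getOrDefault(dataFilePath, Collections.emptySet());
  }

  public static void main(String[] args) {
    Map<String, Set<String>> deletes = new HashMap<>();
    deletes.put("data-1.parquet", Collections.singleton("delete-1.parquet"));

    FakeBroadcast<Map<String, Set<String>>> skipped = broadcastIfNonEmpty(new HashMap<>());
    FakeBroadcast<Map<String, Set<String>>> sent = broadcastIfNonEmpty(deletes);

    System.out.println(deletesFor(skipped, "data-1.parquet"));
    System.out.println(deletesFor(sent, "data-1.parquet"));
    System.out.println(broadcastCount);
  }
}
```

Returning an empty set from the lookup keeps the calling code free of null checks even when the broadcast was skipped.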
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]