This is an automated email from the ASF dual-hosted git repository.

asf-gitbox-commits pushed a commit to branch branch-4.x
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-4.x by this push:
     new 4eacbfe2bdc4 [SPARK-56661] Fixing MapPartitionsExternalUDF to generate output attributes only once
4eacbfe2bdc4 is described below

commit 4eacbfe2bdc4dd10212372dc1a3192aa91516059
Author: Sven Weber <[email protected]>
AuthorDate: Sat May 30 10:27:34 2026 -0400

    [SPARK-56661] Fixing MapPartitionsExternalUDF to generate output attributes only once
    
    ### What changes were proposed in this pull request?
    
    This is a follow-up PR to the recently merged [55768](https://github.com/apache/spark/pull/55768). I noticed that `MapPartitionsExternalUDF` regenerates its output attributes on every call to `output`. Instead, we should compute the output attributes once and store them in a member `val` (`nodeOutputAttributes`) that `output` returns. This PR makes that change.
    
    ### Why are the changes needed?
    
    The current implementation recomputes the output attributes on every call to `output`, even though they are not expected to change.
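    
    The difference between the two forms can be illustrated with a minimal, self-contained Scala sketch. It does not use Spark: `deriveAttributes`, `RecomputingNode`, `CachingNode`, and the `derivations` counter are hypothetical stand-ins for `toAttributes` and the operator, chosen only to show that a `def` body runs on every access while a `val` initializer runs once at construction.

```scala
// Illustration only: these names are hypothetical and are not part of Spark.
object OutputCachingSketch {
  // Counts how often the stand-in attribute derivation runs.
  var derivations = 0

  // Stand-in for toAttributes(...): builds a fresh Seq on every call.
  def deriveAttributes(fields: Seq[String]): Seq[String] = {
    derivations += 1
    fields.map(name => s"attr:$name")
  }

  // Shape of the code before the patch: `def` re-runs the derivation per access.
  class RecomputingNode(fields: Seq[String]) {
    def output: Seq[String] = deriveAttributes(fields)
  }

  // Shape of the code after the patch: the `val` is initialized once,
  // when the node is constructed, and `output` just returns it.
  class CachingNode(fields: Seq[String]) {
    val nodeOutputAttributes: Seq[String] = deriveAttributes(fields)
    def output: Seq[String] = nodeOutputAttributes
  }

  def main(args: Array[String]): Unit = {
    val fields = Seq("id", "value")

    val recomputing = new RecomputingNode(fields)
    (1 to 3).foreach(_ => recomputing.output)
    println(s"def: $derivations derivations") // prints "def: 3 derivations"

    derivations = 0
    val caching = new CachingNode(fields)
    (1 to 3).foreach(_ => caching.output)
    println(s"val: $derivations derivations") // prints "val: 1 derivations"
  }
}
```

    In this sketch the `val` is evaluated eagerly at construction, so every copy of the node derives its attributes once, whether or not `output` is ever read.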
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Existing unit tests for this class.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No
    
    Closes #56206 from sven-weber-db/sven-weber_data/fix-map-op.
    
    Authored-by: Sven Weber <[email protected]>
    Signed-off-by: Herman van Hövell <[email protected]>
    (cherry picked from commit d004e0d8567c3c4ccba9311bbc69dcf442555fa1)
    Signed-off-by: Herman van Hövell <[email protected]>
---
 .../sql/catalyst/plans/logical/logicalExternalUDFOperators.scala    | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/logicalExternalUDFOperators.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/logicalExternalUDFOperators.scala
index 49c3e938d5b6..84163e7350f3 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/logicalExternalUDFOperators.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/logicalExternalUDFOperators.scala
@@ -56,11 +56,13 @@ case class MapPartitionsExternalUDF(
     child: LogicalPlan)
   extends ExternalUDF {
 
-  // Map partitions always operate on StructTypes
-  override def output: Seq[Attribute] = toAttributes(
+  val nodeOutputAttributes = toAttributes(
     function.dataType.asInstanceOf[StructType]
   )
 
+  // Map partitions always operate on StructTypes
+  override def output: Seq[Attribute] = nodeOutputAttributes
+
   override protected def withNewChildInternal(
       newChild: LogicalPlan): MapPartitionsExternalUDF =
     copy(child = newChild)


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
