This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new d148e9be24f4 [SPARK-52877][PYTHON][FOLLOW-UP] Use columns instead of itercolumns in RecordBatch
d148e9be24f4 is described below

commit d148e9be24f49b79bf85aa46e48ae7e71bda2f13
Author: Hyukjin Kwon <[email protected]>
AuthorDate: Fri Jul 25 14:37:52 2025 +0900

    [SPARK-52877][PYTHON][FOLLOW-UP] Use columns instead of itercolumns in RecordBatch
    
    ### What changes were proposed in this pull request?
    
    This PR proposes to use `columns` instead of `itercolumns` in `RecordBatch`, as `itercolumns` does not exist on `RecordBatch` in older versions of PyArrow.
    
    ### Why are the changes needed?
    
    To recover the build: https://github.com/apache/spark/actions/runs/16507806777/job/46682838114
    This is just a temporary workaround.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No, test-only.
    
    ### How was this patch tested?
    
    Manually.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #51661 from HyukjinKwon/SPARK-52877.
    
    Authored-by: Hyukjin Kwon <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 python/pyspark/sql/pandas/serializers.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/pyspark/sql/pandas/serializers.py b/python/pyspark/sql/pandas/serializers.py
index 8367dca0228c..b6f9282b3a2f 100644
--- a/python/pyspark/sql/pandas/serializers.py
+++ b/python/pyspark/sql/pandas/serializers.py
@@ -793,7 +793,7 @@ class ArrowBatchUDFSerializer(ArrowStreamArrowUDFSerializer):
         for batch in super().load_stream(stream):
             columns = [
                 [conv(v) for v in column.to_pylist()] if conv is not None else column.to_pylist()
-                for column, conv in zip(batch.itercolumns(), converters)
+                for column, conv in zip(batch.columns, converters)
             ]
             if len(columns) == 0:
                 yield [[pyspark._NoValue] * batch.num_rows]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
