(spark) branch master updated: [SPARK-54787][PS] Use list comprehension in pandas _bool_column_labels

gurwls223 Sat, 20 Dec 2025 21:51:34 -0800

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 20af8bdfb907 [SPARK-54787][PS] Use list comprehension in pandas _bool_column_labels
20af8bdfb907 is described below

commit 20af8bdfb9073be5711cea5df0c5ce0ff168e92c
Author: Devin Petersohn <[email protected]>
AuthorDate: Sun Dec 21 14:50:40 2025 +0900

    [SPARK-54787][PS] Use list comprehension in pandas _bool_column_labels
    
    ### What changes were proposed in this pull request?
    
    Use a list comprehension in the pandas-on-Spark DataFrame method _bool_column_labels. This modestly improves memory use and performance, and reduces the method body to a single return statement.
    
    ### Why are the changes needed?
    
    For maintainability and performance.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No
    
    ### How was this patch tested?
    
    CI
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No
    
    Closes #53550 from devin-petersohn/devin/pandas_maintain_01.
    
    Authored-by: Devin Petersohn <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 python/pyspark/pandas/frame.py | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/python/pyspark/pandas/frame.py b/python/pyspark/pandas/frame.py
index 0ec7ee60bb5b..7f0a516d5963 100644
--- a/python/pyspark/pandas/frame.py
+++ b/python/pyspark/pandas/frame.py
@@ -11268,15 +11268,9 @@ defaultdict(<class 'list'>, {'col..., 'col...})]
         """
         Filter column labels of boolean columns (without None).
         """
-        bool_column_labels = []
-        for label in column_labels:
-            psser = self._psser_for(label)
-            if is_bool_dtype(psser):
-                # Rely on dtype rather than spark type because
-                # columns that consist of bools and Nones should be excluded
-                # if bool_only is True
-                bool_column_labels.append(label)
-        return bool_column_labels
+        # Rely on dtype rather than spark type because columns that consist of bools and
+        # Nones should be excluded if bool_only is True
+        return [label for label in column_labels if is_bool_dtype(self._psser_for(label))]
 
     def _result_aggregated(
         self, column_labels: List[Label], scols: Sequence[PySparkColumn]
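
For readers skimming the diff, the equivalence between the removed loop and the new comprehension can be sketched in plain Python. This is only an illustration under stated assumptions: `COLUMNS`, `looks_bool`, `bool_labels_loop`, and `bool_labels_comprehension` are hypothetical stand-ins, not part of pyspark.pandas, and `looks_bool` merely imitates the role that `is_bool_dtype(self._psser_for(label))` plays in the patch.

```python
from typing import Dict, List, Tuple

Label = Tuple[str, ...]

# Hypothetical stand-in data: column label -> column values.
# These are plain lists, not pandas-on-Spark Series.
COLUMNS: Dict[Label, List[object]] = {
    ("a",): [True, False, True],   # pure bools
    ("b",): [True, None, False],   # bools mixed with None
    ("c",): [1, 0, 1],             # ints, not bools
}


def looks_bool(values: List[object]) -> bool:
    """Hypothetical stand-in predicate: True only if every value is a real bool."""
    return all(isinstance(v, bool) for v in values)


def bool_labels_loop(column_labels: List[Label]) -> List[Label]:
    """Shape of the removed code: build the list with an explicit loop."""
    result = []
    for label in column_labels:
        if looks_bool(COLUMNS[label]):
            result.append(label)
    return result


def bool_labels_comprehension(column_labels: List[Label]) -> List[Label]:
    """Shape of the new code: the same filter as a list comprehension."""
    return [label for label in column_labels if looks_bool(COLUMNS[label])]


labels = list(COLUMNS)
print(bool_labels_loop(labels))
print(bool_labels_comprehension(labels))
```

Both functions keep the input order and apply the same predicate once per label, so they return the same list. In this sketch the column mixing bools with None is excluded, mirroring the intent described in the code comment of the patch.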


