(spark) branch master updated: [SPARK-53943][PYTHON][DOCS] Add examples for function unwrap_udt

ruifengz Fri, 17 Oct 2025 03:04:21 -0700

This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new c0416713e8e1 [SPARK-53943][PYTHON][DOCS] Add examples for function unwrap_udt
c0416713e8e1 is described below

commit c0416713e8e1e5fb81f218feb339cfbeacb61b72
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Fri Oct 17 18:03:47 2025 +0800

    [SPARK-53943][PYTHON][DOCS] Add examples for function unwrap_udt
    
    ### What changes were proposed in this pull request?
    Add docstring examples for the function `unwrap_udt`, along with `Parameters` and `Returns` sections.
    
    ### Why are the changes needed?
    To improve the documentation.
    
    ### Does this PR introduce _any_ user-facing change?
    yes, doc-only
    
    ### How was this patch tested?
    doctests
    
    ### Was this patch authored or co-authored using generative AI tooling?
    no
    
    Closes #52647 from zhengruifeng/doc_unwrap_udt.
    
    Authored-by: Ruifeng Zheng <[email protected]>
    Signed-off-by: Ruifeng Zheng <[email protected]>
---
 python/pyspark/sql/functions/builtin.py | 51 +++++++++++++++++++++++++++++++--
 1 file changed, 48 insertions(+), 3 deletions(-)

diff --git a/python/pyspark/sql/functions/builtin.py b/python/pyspark/sql/functions/builtin.py
index 85115f132d44..99fb7ee1ec5e 100644
--- a/python/pyspark/sql/functions/builtin.py
+++ b/python/pyspark/sql/functions/builtin.py
@@ -25765,9 +25765,54 @@ def unwrap_udt(col: "ColumnOrName") -> Column:
 
     .. versionadded:: 3.4.0
 
-    Notes
-    -----
-    Supports Spark Connect.
+    Parameters
+    ----------
+    col : :class:`~pyspark.sql.Column` or column name
+
+    Returns
+    -------
+    :class:`~pyspark.sql.Column`
+        The underlying representation.
+
+    Examples
+    --------
+    Example 1: Unwrap ML-specific UDT - VectorUDT
+
+    >>> from pyspark.sql import functions as sf
+    >>> from pyspark.ml.linalg import Vectors
+    >>> vec1 = Vectors.dense(1, 2, 3)
+    >>> vec2 = Vectors.sparse(4, {1: 1.0, 3: 5.5})
+    >>> df = spark.createDataFrame([(vec1,), (vec2,)], ["vec"])
+    >>> df.select(sf.unwrap_udt("vec")).printSchema()
+    root
+     |-- unwrap_udt(vec): struct (nullable = true)
+     |    |-- type: byte (nullable = false)
+     |    |-- size: integer (nullable = true)
+     |    |-- indices: array (nullable = true)
+     |    |    |-- element: integer (containsNull = false)
+     |    |-- values: array (nullable = true)
+     |    |    |-- element: double (containsNull = false)
+
+    Example 2: Unwrap ML-specific UDT - MatrixUDT
+
+    >>> from pyspark.sql import functions as sf
+    >>> from pyspark.ml.linalg import Matrices
+    >>> mat1 = Matrices.dense(2, 2, range(4))
+    >>> mat2 = Matrices.sparse(2, 2, [0, 2, 3], [0, 1, 1], [2, 3, 4])
+    >>> df = spark.createDataFrame([(mat1,), (mat2,)], ["mat"])
+    >>> df.select(sf.unwrap_udt("mat")).printSchema()
+    root
+     |-- unwrap_udt(mat): struct (nullable = true)
+     |    |-- type: byte (nullable = false)
+     |    |-- numRows: integer (nullable = false)
+     |    |-- numCols: integer (nullable = false)
+     |    |-- colPtrs: array (nullable = true)
+     |    |    |-- element: integer (containsNull = false)
+     |    |-- rowIndices: array (nullable = true)
+     |    |    |-- element: integer (containsNull = false)
+     |    |-- values: array (nullable = true)
+     |    |    |-- element: double (containsNull = false)
+     |    |-- isTransposed: boolean (nullable = false)
     """
     from pyspark.sql.classic.column import _to_java_column
 


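The `VectorUDT` schema printed in Example 1 has a non-nullable `type` field but nullable `size` and `indices` fields. The following is a minimal pure-Python sketch of why that is, assuming the commonly documented encoding in which type `0` marks a sparse vector and type `1` marks a dense one. The helper names `dense_to_struct` and `sparse_to_struct` are hypothetical and are not part of PySpark; the sketch only mirrors the field order `(type, size, indices, values)` shown in the schema.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# (type, size, indices, values): same field order as the unwrapped struct.
VectorStruct = Tuple[int, Optional[int], Optional[List[int]], List[float]]

# Assumed type tags: 0 for sparse, 1 for dense.
SPARSE, DENSE = 0, 1


def dense_to_struct(values: Sequence[float]) -> VectorStruct:
    """Hypothetical helper: a dense vector only needs its values,
    so size and indices are left as None (hence nullable)."""
    return (DENSE, None, None, [float(v) for v in values])


def sparse_to_struct(size: int, entries: Dict[int, float]) -> VectorStruct:
    """Hypothetical helper: a sparse vector stores its logical size,
    the sorted indices of the non-zero entries, and their values."""
    indices = sorted(entries)
    return (SPARSE, size, indices, [float(entries[i]) for i in indices])


# Same inputs as vec1 and vec2 in Example 1 of the docstring.
print(dense_to_struct([1, 2, 3]))
print(sparse_to_struct(4, {1: 1.0, 3: 5.5}))
```

This is an illustration of the struct layout only; the actual conversion is done by Spark when `unwrap_udt` is applied to a UDT column.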