This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 036f591448ec [SPARK-54050][PYTHON][DOCS] Update the documentation of arrow-batching related configs
036f591448ec is described below
commit 036f591448ecd466a018794c2e6cc7a049486200
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Tue Oct 28 12:58:04 2025 +0800
[SPARK-54050][PYTHON][DOCS] Update the documentation of arrow-batching related configs
### What changes were proposed in this pull request?
Update the documentation of the arrow-batching related configs.
### Why are the changes needed?
Remove the following note
```
This configuration is not effective for the grouping API such as
DataFrame(.cogroup).groupby.applyInPandas because each group becomes each
ArrowRecordBatch.
```
to reflect recent changes in arrow batching: a group is no longer forced into a single ArrowRecordBatch, so the batch-size limits now apply to the grouping APIs as well.
### Does this PR introduce _any_ user-facing change?
Yes, doc-only changes.
### How was this patch tested?
CI
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #52753 from zhengruifeng/update_doc_max_records.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
.../main/scala/org/apache/spark/sql/internal/SQLConf.scala | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 71a01d4c0700..46629aaca776 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -3905,9 +3905,7 @@ object SQLConf {
   val ARROW_EXECUTION_MAX_RECORDS_PER_BATCH =
     buildConf("spark.sql.execution.arrow.maxRecordsPerBatch")
       .doc("When using Apache Arrow, limit the maximum number of records that can be written " +
-        "to a single ArrowRecordBatch in memory. This configuration is not effective for the " +
-        "grouping API such as DataFrame(.cogroup).groupby.applyInPandas because each group " +
-        "becomes each ArrowRecordBatch. If set to zero or negative there is no limit. " +
+        "to a single ArrowRecordBatch in memory. If set to zero or negative there is no limit. " +
         "See also spark.sql.execution.arrow.maxBytesPerBatch. If both are set, each batch " +
         "is created when any condition of both is met.")
       .version("2.3.0")
@@ -3950,11 +3948,9 @@
     buildConf("spark.sql.execution.arrow.maxBytesPerBatch")
       .internal()
       .doc("When using Apache Arrow, limit the maximum bytes in each batch that can be written " +
-        "to a single ArrowRecordBatch in memory. This configuration is not effective for the " +
-        "grouping API such as DataFrame(.cogroup).groupby.applyInPandas because each group " +
-        "becomes each ArrowRecordBatch. Unlike 'spark.sql.execution.arrow.maxRecordsPerBatch', " +
-        "this configuration does not work for createDataFrame/toPandas with Arrow/pandas " +
-        "instances. " +
+        "to a single ArrowRecordBatch in memory. " +
+        "Unlike 'spark.sql.execution.arrow.maxRecordsPerBatch', this configuration does not " +
+        "work for createDataFrame/toPandas with Arrow/pandas instances. " +
         "See also spark.sql.execution.arrow.maxRecordsPerBatch. If both are set, each batch " +
         "is created when any condition of both is met.")
       .version("4.0.0")
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]