This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new f1f856d5463 [SPARK-45526][PYTHON][DOCS] Improve the example of
DataFrameReader/Writer.options to take a dictionary
f1f856d5463 is described below
commit f1f856d546360d34ca1f7ee1ddc163381586b180
Author: Hyukjin Kwon <[email protected]>
AuthorDate: Fri Oct 13 14:23:09 2023 +0900
[SPARK-45526][PYTHON][DOCS] Improve the example of
DataFrameReader/Writer.options to take a dictionary
### What changes were proposed in this pull request?
This PR proposes to add an example of DataFrameReader/Writer.options taking a
dictionary.
### Why are the changes needed?
For users to know how to set options with a dictionary in PySpark.
### Does this PR introduce _any_ user-facing change?
Yes, it adds a user-facing documentation example of setting options with a dictionary.
### How was this patch tested?
Existing doctests in this PR's CI.
### Was this patch authored or co-authored using generative AI tooling?
No.
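The pattern the new doctests document is plain Python keyword-argument unpacking: `options(**some_dict)` expands the dictionary into the keyword arguments that `options()` collects. A minimal sketch with a hypothetical stand-in class (not Spark's actual `DataFrameReader` implementation) to illustrate the mechanics:

```python
# Illustrative stand-in for DataFrameReader.options; NOT Spark's implementation.
class FakeReader:
    def __init__(self):
        self._options = {}

    def options(self, **options):
        # Each keyword argument becomes a reader option, as in PySpark.
        self._options.update(options)
        return self  # chainable, like the real API

reader = FakeReader()
# Options as direct keywords...
reader.options(key="value")
# ...or as a dictionary unpacked with ** -- the form the new doctests show.
reader.options(**{"k1": "v1", "k2": "v2"})
print(reader._options)  # {'key': 'value', 'k1': 'v1', 'k2': 'v2'}
```

Because `options()` returns `self`, both forms compose with the usual builder-style chaining.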
Closes #43357
Closes #43358 from HyukjinKwon/SPARK-45528.
Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/pyspark/sql/readwriter.py | 14 ++++++++++++--
python/pyspark/sql/streaming/readwriter.py | 10 ++++++++++
2 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/python/pyspark/sql/readwriter.py b/python/pyspark/sql/readwriter.py
index ea429a75e15..81977c9e8cc 100644
--- a/python/pyspark/sql/readwriter.py
+++ b/python/pyspark/sql/readwriter.py
@@ -220,7 +220,12 @@ class DataFrameReader(OptionUtils):
Examples
--------
- >>> spark.read.option("key", "value")
+ >>> spark.read.options(key="value")
+ <...readwriter.DataFrameReader object ...>
+
+ Specify options in a dictionary.
+
+ >>> spark.read.options(**{"k1": "v1", "k2": "v2"})
<...readwriter.DataFrameReader object ...>
Specify the option 'nullValue' and 'header' with reading a CSV file.
@@ -1172,7 +1177,12 @@ class DataFrameWriter(OptionUtils):
Examples
--------
- >>> spark.range(1).write.option("key", "value")
+ >>> spark.range(1).write.options(key="value")
+ <...readwriter.DataFrameWriter object ...>
+
+ Specify options in a dictionary.
+
+ >>> spark.range(1).write.options(**{"k1": "v1", "k2": "v2"})
<...readwriter.DataFrameWriter object ...>
Specify the option 'nullValue' and 'header' with writing a CSV file.
diff --git a/python/pyspark/sql/streaming/readwriter.py b/python/pyspark/sql/streaming/readwriter.py
index 2026651ce12..b0f01c06b2e 100644
--- a/python/pyspark/sql/streaming/readwriter.py
+++ b/python/pyspark/sql/streaming/readwriter.py
@@ -224,6 +224,11 @@ class DataStreamReader(OptionUtils):
>>> spark.readStream.options(x="1", y=2)
<...streaming.readwriter.DataStreamReader object ...>
+ Specify options in a dictionary.
+
+ >>> spark.readStream.options(**{"k1": "v1", "k2": "v2"})
+ <...streaming.readwriter.DataStreamReader object ...>
+
The example below specifies 'rowsPerSecond' and 'numPartitions' options to
Rate source in order to generate 10 rows with 10 partitions every second.
@@ -943,6 +948,11 @@ class DataStreamWriter:
>>> df.writeStream.option("x", 1)
<...streaming.readwriter.DataStreamWriter object ...>
+ Specify options in a dictionary.
+
+ >>> df.writeStream.options(**{"k1": "v1", "k2": "v2"})
+ <...streaming.readwriter.DataStreamWriter object ...>
+
The example below specifies 'numRows' and 'truncate' options to Console source
in order to print 3 rows for every batch without truncating the results.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]