This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 21f46e8 [SPARK-31319][SQL][FOLLOW-UP] Add a SQL example for UDAF
21f46e8 is described below
commit 21f46e828b3c4c57548d2e9862e2f6dba4082b5b
Author: Huaxin Gao <[email protected]>
AuthorDate: Tue Apr 14 13:29:44 2020 +0900
[SPARK-31319][SQL][FOLLOW-UP] Add a SQL example for UDAF
### What changes were proposed in this pull request?
Add a SQL example for UDAF
### Why are the changes needed?
To make the SQL Reference complete.
### Does this PR introduce any user-facing change?
Yes.
Adds the following page, and changes the example tab label from ```Sql``` to
```SQL``` for all the SQL examples.
<img width="1110" alt="Screen Shot 2020-04-13 at 6 09 24 PM"
src="https://user-images.githubusercontent.com/13592258/79175240-06cd7400-7db2-11ea-8f3e-af71a591a64b.png">
### How was this patch tested?
Built manually and checked.
Closes #28209 from huaxingao/udf_followup.
Authored-by: Huaxin Gao <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 46be1e01e977788f00f1f5aa8d64bc5f191bc578)
Signed-off-by: HyukjinKwon <[email protected]>
---
docs/sql-data-sources-jdbc.md | 2 +-
docs/sql-data-sources-json.md | 2 +-
docs/sql-data-sources-load-save-functions.md | 8 +++---
docs/sql-data-sources-parquet.md | 4 +--
docs/sql-getting-started.md | 2 +-
docs/sql-performance-tuning.md | 2 +-
docs/sql-ref-datatypes.md | 2 +-
docs/sql-ref-functions-udf-aggregate.md | 37 ++++++++++++++++++++++++++++
8 files changed, 48 insertions(+), 11 deletions(-)
diff --git a/docs/sql-data-sources-jdbc.md b/docs/sql-data-sources-jdbc.md
index 3cdff42..eaee670 100644
--- a/docs/sql-data-sources-jdbc.md
+++ b/docs/sql-data-sources-jdbc.md
@@ -217,7 +217,7 @@ the following case-insensitive options:
{% include_example jdbc_dataset r/RSparkSQLExample.R %}
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
diff --git a/docs/sql-data-sources-json.md b/docs/sql-data-sources-json.md
index 588f6cd..0f1ca43 100644
--- a/docs/sql-data-sources-json.md
+++ b/docs/sql-data-sources-json.md
@@ -77,7 +77,7 @@ For a regular multi-line JSON file, set a named parameter `multiLine` to `TRUE`.
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
diff --git a/docs/sql-data-sources-load-save-functions.md b/docs/sql-data-sources-load-save-functions.md
index a7efb93..0866f37 100644
--- a/docs/sql-data-sources-load-save-functions.md
+++ b/docs/sql-data-sources-load-save-functions.md
@@ -127,7 +127,7 @@ visit the official Apache ORC/Parquet websites.
{% include_example manual_save_options_orc r/RSparkSQLExample.R %}
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
CREATE TABLE users_with_options (
@@ -257,7 +257,7 @@ Bucketing and sorting are applicable only to persistent tables:
{% include_example write_sorting_and_bucketing python/sql/datasource.py %}
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
@@ -291,7 +291,7 @@ while partitioning can be used with both `save` and `saveAsTable` when using the
{% include_example write_partitioning python/sql/datasource.py %}
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
@@ -323,7 +323,7 @@ It is possible to use both partitioning and bucketing for a single table:
{% include_example write_partition_and_bucket python/sql/datasource.py %}
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
diff --git a/docs/sql-data-sources-parquet.md b/docs/sql-data-sources-parquet.md
index 6e52446..7875b10 100644
--- a/docs/sql-data-sources-parquet.md
+++ b/docs/sql-data-sources-parquet.md
@@ -52,7 +52,7 @@ Using the data from the above example:
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
@@ -242,7 +242,7 @@ refreshTable("my_table")
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
REFRESH TABLE my_table;
diff --git a/docs/sql-getting-started.md b/docs/sql-getting-started.md
index fc0a5d0..dab34af 100644
--- a/docs/sql-getting-started.md
+++ b/docs/sql-getting-started.md
@@ -205,7 +205,7 @@ refer it, e.g. `SELECT * FROM global_temp.view1`.
{% include_example global_temp_view python/sql/basic.py %}
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
diff --git a/docs/sql-performance-tuning.md b/docs/sql-performance-tuning.md
index 279aad6..5b784a5 100644
--- a/docs/sql-performance-tuning.md
+++ b/docs/sql-performance-tuning.md
@@ -169,7 +169,7 @@ head(join(src, hint(records, "broadcast"), src$key == records$key))
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
{% highlight sql %}
-- We accept BROADCAST, BROADCASTJOIN and MAPJOIN for broadcast hint
diff --git a/docs/sql-ref-datatypes.md b/docs/sql-ref-datatypes.md
index 1e0d051..150e194 100644
--- a/docs/sql-ref-datatypes.md
+++ b/docs/sql-ref-datatypes.md
@@ -631,7 +631,7 @@ from pyspark.sql.types import *
</table>
</div>
-<div data-lang="sql" markdown="1">
+<div data-lang="SQL" markdown="1">
The following table shows the type names as well as aliases used in Spark SQL
parser for each data type.
diff --git a/docs/sql-ref-functions-udf-aggregate.md b/docs/sql-ref-functions-udf-aggregate.md
index 3d8a64e..3fde94d 100644
--- a/docs/sql-ref-functions-udf-aggregate.md
+++ b/docs/sql-ref-functions-udf-aggregate.md
@@ -94,8 +94,45 @@ For example, a user-defined average for untyped DataFrames can look like:
<div data-lang="java" markdown="1">
{% include_example untyped_custom_aggregation java/org/apache/spark/examples/sql/JavaUserDefinedUntypedAggregation.java%}
</div>
+<div data-lang="SQL" markdown="1">
+{% highlight sql %}
+-- Compile and place UDAF MyAverage in a JAR file called `MyAverage.jar` in /tmp.
+CREATE FUNCTION myAverage AS 'MyAverage' USING JAR '/tmp/MyAverage.jar';
+
+SHOW USER FUNCTIONS;
+-- +------------------+
+-- | function|
+-- +------------------+
+-- | default.myAverage|
+-- +------------------+
+
+CREATE TEMPORARY VIEW employees
+USING org.apache.spark.sql.json
+OPTIONS (
+ path "examples/src/main/resources/employees.json"
+);
+
+SELECT * FROM employees;
+-- +-------+------+
+-- | name|salary|
+-- +-------+------+
+-- |Michael| 3000|
+-- | Andy| 4500|
+-- | Justin| 3500|
+-- | Berta| 4000|
+-- +-------+------+
+
+SELECT myAverage(salary) as average_salary FROM employees;
+-- +--------------+
+-- |average_salary|
+-- +--------------+
+-- | 3750.0|
+-- +--------------+
+{% endhighlight %}
+</div>
</div>
### Related Statements
+
* [Scalar User Defined Functions (UDFs)](sql-ref-functions-udf-scalar.html)
* [Integration with Hive UDFs/UDAFs/UDTFs](sql-ref-functions-udf-hive.html)