[spark] branch branch-3.0 updated: [SPARK-31319][SQL][FOLLOW-UP] Add a SQL example for UDAF

gurwls223 Mon, 13 Apr 2020 21:33:03 -0700

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 21f46e8  [SPARK-31319][SQL][FOLLOW-UP] Add a SQL example for UDAF
21f46e8 is described below

commit 21f46e828b3c4c57548d2e9862e2f6dba4082b5b
Author: Huaxin Gao <[email protected]>
AuthorDate: Tue Apr 14 13:29:44 2020 +0900

    [SPARK-31319][SQL][FOLLOW-UP] Add a SQL example for UDAF
    
    ### What changes were proposed in this pull request?
    Add a SQL example for UDAF
    
    ### Why are the changes needed?
    To make the SQL Reference complete.
    
    ### Does this PR introduce any user-facing change?
    Yes.
    Adds the following page and changes ```Sql``` to ```SQL``` in the example tab for all SQL examples.
    <img width="1110" alt="Screen Shot 2020-04-13 at 6 09 24 PM" src="https://user-images.githubusercontent.com/13592258/79175240-06cd7400-7db2-11ea-8f3e-af71a591a64b.png">
    
    ### How was this patch tested?
    Manually built and checked.
    
    Closes #28209 from huaxingao/udf_followup.
    
    Authored-by: Huaxin Gao <[email protected]>
    Signed-off-by: HyukjinKwon <[email protected]>
    (cherry picked from commit 46be1e01e977788f00f1f5aa8d64bc5f191bc578)
    Signed-off-by: HyukjinKwon <[email protected]>
---
 docs/sql-data-sources-jdbc.md                |  2 +-
 docs/sql-data-sources-json.md                |  2 +-
 docs/sql-data-sources-load-save-functions.md |  8 +++---
 docs/sql-data-sources-parquet.md             |  4 +--
 docs/sql-getting-started.md                  |  2 +-
 docs/sql-performance-tuning.md               |  2 +-
 docs/sql-ref-datatypes.md                    |  2 +-
 docs/sql-ref-functions-udf-aggregate.md      | 37 ++++++++++++++++++++++++++++
 8 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/docs/sql-data-sources-jdbc.md b/docs/sql-data-sources-jdbc.md
index 3cdff42..eaee670 100644
--- a/docs/sql-data-sources-jdbc.md
+++ b/docs/sql-data-sources-jdbc.md
@@ -217,7 +217,7 @@ the following case-insensitive options:
 {% include_example jdbc_dataset r/RSparkSQLExample.R %}
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 
diff --git a/docs/sql-data-sources-json.md b/docs/sql-data-sources-json.md
index 588f6cd..0f1ca43 100644
--- a/docs/sql-data-sources-json.md
+++ b/docs/sql-data-sources-json.md
@@ -77,7 +77,7 @@ For a regular multi-line JSON file, set a named parameter `multiLine` to `TRUE`.
 
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 
diff --git a/docs/sql-data-sources-load-save-functions.md b/docs/sql-data-sources-load-save-functions.md
index a7efb93..0866f37 100644
--- a/docs/sql-data-sources-load-save-functions.md
+++ b/docs/sql-data-sources-load-save-functions.md
@@ -127,7 +127,7 @@ visit the official Apache ORC/Parquet websites.
 {% include_example manual_save_options_orc r/RSparkSQLExample.R %}
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 CREATE TABLE users_with_options (
@@ -257,7 +257,7 @@ Bucketing and sorting are applicable only to persistent tables:
 {% include_example write_sorting_and_bucketing python/sql/datasource.py %}
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 
@@ -291,7 +291,7 @@ while partitioning can be used with both `save` and `saveAsTable` when using the
 {% include_example write_partitioning python/sql/datasource.py %}
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 
@@ -323,7 +323,7 @@ It is possible to use both partitioning and bucketing for a single table:
 {% include_example write_partition_and_bucket python/sql/datasource.py %}
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 
diff --git a/docs/sql-data-sources-parquet.md b/docs/sql-data-sources-parquet.md
index 6e52446..7875b10 100644
--- a/docs/sql-data-sources-parquet.md
+++ b/docs/sql-data-sources-parquet.md
@@ -52,7 +52,7 @@ Using the data from the above example:
 
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 
@@ -242,7 +242,7 @@ refreshTable("my_table")
 
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 REFRESH TABLE my_table;
diff --git a/docs/sql-getting-started.md b/docs/sql-getting-started.md
index fc0a5d0..dab34af 100644
--- a/docs/sql-getting-started.md
+++ b/docs/sql-getting-started.md
@@ -205,7 +205,7 @@ refer it, e.g. `SELECT * FROM global_temp.view1`.
 {% include_example global_temp_view python/sql/basic.py %}
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 
diff --git a/docs/sql-performance-tuning.md b/docs/sql-performance-tuning.md
index 279aad6..5b784a5 100644
--- a/docs/sql-performance-tuning.md
+++ b/docs/sql-performance-tuning.md
@@ -169,7 +169,7 @@ head(join(src, hint(records, "broadcast"), src$key == records$key))
 
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 {% highlight sql %}
 -- We accept BROADCAST, BROADCASTJOIN and MAPJOIN for broadcast hint
diff --git a/docs/sql-ref-datatypes.md b/docs/sql-ref-datatypes.md
index 1e0d051..150e194 100644
--- a/docs/sql-ref-datatypes.md
+++ b/docs/sql-ref-datatypes.md
@@ -631,7 +631,7 @@ from pyspark.sql.types import *
 </table>
 </div>
 
-<div data-lang="sql"  markdown="1">
+<div data-lang="SQL"  markdown="1">
 
 The following table shows the type names as well as aliases used in Spark SQL parser for each data type.
 
diff --git a/docs/sql-ref-functions-udf-aggregate.md b/docs/sql-ref-functions-udf-aggregate.md
index 3d8a64e..3fde94d 100644
--- a/docs/sql-ref-functions-udf-aggregate.md
+++ b/docs/sql-ref-functions-udf-aggregate.md
@@ -94,8 +94,45 @@ For example, a user-defined average for untyped DataFrames can look like:
 <div data-lang="java"  markdown="1">
   {% include_example untyped_custom_aggregation java/org/apache/spark/examples/sql/JavaUserDefinedUntypedAggregation.java%}
 </div>
+<div data-lang="SQL"  markdown="1">
+{% highlight sql %}
+-- Compile and place UDAF MyAverage in a JAR file called `MyAverage.jar` in /tmp.
+CREATE FUNCTION myAverage AS 'MyAverage' USING JAR '/tmp/MyAverage.jar';
+
+SHOW USER FUNCTIONS;
+-- +------------------+
+-- |          function|
+-- +------------------+
+-- | default.myAverage|
+-- +------------------+
+
+CREATE TEMPORARY VIEW employees
+USING org.apache.spark.sql.json
+OPTIONS (
+    path "examples/src/main/resources/employees.json"
+);
+
+SELECT * FROM employees;
+-- +-------+------+
+-- |   name|salary|
+-- +-------+------+
+-- |Michael|  3000|
+-- |   Andy|  4500|
+-- | Justin|  3500|
+-- |  Berta|  4000|
+-- +-------+------+
+
+SELECT myAverage(salary) as average_salary FROM employees;
+-- +--------------+
+-- |average_salary|
+-- +--------------+
+-- |        3750.0|
+-- +--------------+
+{% endhighlight %}
+</div>
 </div>
 
 ### Related Statements
+
  * [Scalar User Defined Functions (UDFs)](sql-ref-functions-udf-scalar.html)
  * [Integration with Hive UDFs/UDAFs/UDTFs](sql-ref-functions-udf-hive.html)

