This is an automated email from the ASF dual-hosted git repository. shaofengshi pushed a commit to branch document in repository https://gitbox.apache.org/repos/asf/kylin.git
The following commit(s) were added to refs/heads/document by this push: new dac8bcd KYLIN-3766 Merge docs for Query SQL Grammar, Function and Data Type dac8bcd is described below commit dac8bcd1e98adc3579015ff2f7fb83f1237cd43a Author: GinaZhai <na.z...@kyligence.io> AuthorDate: Wed Jan 30 15:26:51 2019 +0800 KYLIN-3766 Merge docs for Query SQL Grammar, Function and Data Type --- website/_data/docs-cn.yml | 4 +- website/_data/docs.yml | 4 +- website/_docs/index.cn.md | 46 ++++- website/_docs/index.md | 2 +- website/_docs/tutorial/kylin_datatype.cn.md | 16 -- website/_docs/tutorial/kylin_datatype.md | 15 -- website/_docs/tutorial/kylin_function.cn.md | 211 -------------------- website/_docs/tutorial/kylin_function.md | 210 -------------------- .../{kylin_grammar.cn.md => sql_reference.cn.md} | 218 ++++++++++++++++++-- .../{kylin_grammar.md => sql_reference.md} | 220 +++++++++++++++++++-- 10 files changed, 463 insertions(+), 483 deletions(-) diff --git a/website/_data/docs-cn.yml b/website/_data/docs-cn.yml index cdc4698..564d23e 100644 --- a/website/_data/docs-cn.yml +++ b/website/_data/docs-cn.yml @@ -27,12 +27,10 @@ - title: 教程 docs: - tutorial/kylin_sample - - tutorial/kylin_grammar - - tutorial/kylin_function - - tutorial/kylin_datatype - tutorial/web - tutorial/create_cube - tutorial/cube_build_job + - tutorial/sql_reference - tutorial/project_level_acl - tutorial/cube_spark - tutorial/cube_streaming diff --git a/website/_data/docs.yml b/website/_data/docs.yml index 924a8c3..cd5139b 100644 --- a/website/_data/docs.yml +++ b/website/_data/docs.yml @@ -36,12 +36,10 @@ - title: Tutorial docs: - tutorial/kylin_sample - - tutorial/kylin_grammar - - tutorial/kylin_function - - tutorial/kylin_datatype - tutorial/web - tutorial/create_cube - tutorial/cube_build_job + - tutorial/sql_reference - tutorial/project_level_acl - tutorial/cube_spark - tutorial/cube_streaming diff --git a/website/_docs/index.cn.md b/website/_docs/index.cn.md index f30068f..02dcefb 100644 --- a/website/_docs/index.cn.md +++ b/website/_docs/index.cn.md @@ -18,9 +18,51 @@ Apache Kylin™是一个开源的分布式分析引擎,提供Hadoop之上的SQ * [v2.0 document](/cn/docs20/) * [归档](/archive/) -安装 +安装 ------------ -请参考安装文档以安装Apache Kylin: [安装向导](/cn/docs/install/) +1. [安装指南](install/index.html) +2. [Kylin 配置](install/configuration.html) +3. [集群模式部署](install/kylin_cluster.html) +4. [高级配置](install/advance_settings.html) +5. [用 Docker 运行 Kylin](install/kylin_docker.html) +6. [在 AWS EMR 上安装 Kylin](install/kylin_aws_emr.html) + +教程 +------------ +1. [样例 Cube 快速入门](tutorial/kylin_sample.html) +2. [Web 界面](tutorial/web.html) +3. [Cube 创建](tutorial/create_cube.html) +4. [Cube 构建和 Job 监控](tutorial/cube_build_job.html) +5. [SQL 快速参考](tutorial/sql_reference.html) +6. [用 Kafka 流构建 Cube](tutorial/cube_streaming.html) +7. [用 Spark 构建 Cube](tutorial/cube_spark.html) +8. [优化 Cube 构建](tutorial/cube_build_performance.html) +9. [查询下压](tutorial/query_pushdown.html) +10. [建立 System Cube](tutorial/setup_systemcube.html) +11. [使用 Cube Planner](tutorial/use_cube_planner.html) +12. [使用 Dashboard](tutorial/use_dashboard.html) +13. [建立 JDBC 数据源](tutorial/setup_jdbc_datasource.html) + + +工具集成 +------------ +1. [ODBC 驱动](tutorial/odbc.html) +2. [JDBC 驱动](howto/howto_jdbc.html) +3. [RESTful API 列表](howto/howto_use_restapi.html) +4. [用 API 构建 Cube](howto/howto_build_cube_with_restapi.html) +5. [MS Excel 及 PowerBI 教程](tutorial/powerbi.html) +6. [Tableau 8](tutorial/tableau.html) +7. [Tableau 9](tutorial/tableau_91.html) +8. [SQuirreL](tutorial/squirrel.html) +9. [Qlik Sense 集成](tutorial/Qlik.html) +10. [Apache Superset](tutorial/superset.html) +11. [Redash](/blog/2018/05/08/redash-kylin-plugin-strikingly/) + + +帮助 +------------ +1. [备份 Kylin 元数据](howto/howto_backup_metadata.html) +2. [清理存储](howto/howto_cleanup_storage.html) diff --git a/website/_docs/index.md b/website/_docs/index.md index 0c8c60b..4762913 100644 --- a/website/_docs/index.md +++ b/website/_docs/index.md @@ -33,7 +33,7 @@ Tutorial 2. [Web Interface](tutorial/web.html) 3. [Cube Wizard](tutorial/create_cube.html) 4. [Cube Build and Job Monitoring](tutorial/cube_build_job.html) -5. [SQL reference: by Apache Calcite](http://calcite.apache.org/docs/reference.html) +5. [SQL reference: by Apache Calcite](tutorial/sql_reference.html) 6. [Build Cube with Streaming Data](tutorial/cube_streaming.html) 7. [Build Cube with Spark Engine](tutorial/cube_spark.html) 8. [Cube Build Tuning](tutorial/cube_build_performance.html) diff --git a/website/_docs/tutorial/kylin_datatype.cn.md b/website/_docs/tutorial/kylin_datatype.cn.md deleted file mode 100644 index 27fb24b..0000000 --- a/website/_docs/tutorial/kylin_datatype.cn.md +++ /dev/null @@ -1,16 +0,0 @@ ---- -layout: docs-cn -title: "数据类型" -categories: tutorial -permalink: /cn/docs/tutorial/kylin_datatype.html ---- - -## 数据类型 - -| :--------- | :--------- | :--------- | :--------- | :------------------- | -| ANY | CHAR | VARCHAR | STRING | BOOLEAN | -| BYTE | BINARY | INT | SHORT | LONG | -| INTEGER | TINYINT | SMALLINT | BIGINT | | -| FLOAT | REAL | DOUBLE | DECIMAL | | -| NUMERIC | DATE | TIME | DATETIME | TIMESTAMP | - diff --git a/website/_docs/tutorial/kylin_datatype.md b/website/_docs/tutorial/kylin_datatype.md deleted file mode 100644 index c2bede2..0000000 --- a/website/_docs/tutorial/kylin_datatype.md +++ /dev/null @@ -1,15 +0,0 @@ ---- -layout: docs -title: Data Type -categories: tutorial -permalink: /docs/tutorial/kylin_datatype.html ---- - -## Data Type - -| :--------- | :--------- | :--------- | :--------- | :------------------- | -| ANY | CHAR | VARCHAR | STRING | BOOLEAN | -| BYTE | BINARY | INT | SHORT | LONG | -| INTEGER | TINYINT | SMALLINT | BIGINT | | -| FLOAT | REAL | DOUBLE | DECIMAL | | -| NUMERIC | DATE | TIME | DATETIME | TIMESTAMP | \ No newline at end of file diff --git a/website/_docs/tutorial/kylin_function.cn.md b/website/_docs/tutorial/kylin_function.cn.md deleted file mode 100644 index 93b239c..0000000 --- a/website/_docs/tutorial/kylin_function.cn.md +++ /dev/null @@ -1,211 +0,0 @@ ---- -layout: docs-cn -title: "SQL 函数" -categories: tutorial -permalink: /cn/docs/tutorial/kylin_function.html ---- - -## 函数 - -[COUNT](#COUNT) - [COUNT(COLUMN)](#COUNT_COLUMN) - [COUNT(*)](#COUNT_) -[COUNT_DISTINCT](#COUNT_DISTINCT) -[MAX](#MAX) -[MIN](#MIN) -[PERCENTILE](#PERCENTILE) -[SUM](#SUM) -[TOP_N](#TOP_N) - -[WINDOW](#WINDOW) - [ROW_NUMBER](#ROW_NUMBER) - [AVG](#AVG) - [RANK](#RANK) - [DENSE_RANK](#DENSE_RANK) - [FIRST_VALUE](#FIRST_VALUE) - [LAST_VALUE](#LAST_VALUE) - [LAG](#LAG) - [LEAD](#LEAD) - [NTILE](#NTILE) - [CASE WHEN](#CASEWHEN) - [CAST](#CAST) - -[SUSTRING](#SUBSTRING) -[COALESCE](#COALESCE) - - -## COUNT {#COUNT} -用于返回与指定条件匹配的行数。 - -### COUNT(COLUMN) {#COUNT_COLUMN} - -例子: -{% highlight Groff markup %} -SELECT COUNT(seller_id) FROM kylin_sales; -{% endhighlight %} - -### COUNT(*) {#COUNT_} - -例子: -{% highlight Groff markup %} -SELECT COUNT(*) FROM kylin_sales; -{% endhighlight %} - - -## COUNT_DISTINCT {#COUNT_DISTINCT} - -例子: -{% highlight Groff markup %} -SELECT COUNT(DISTINCT seller_id) AS DIST_SELLER FROM kylin_sales; -{% endhighlight %} - -## MAX {#MAX} -返回一列中的最大值。NULL 值不包括在计算中。 -例子: -{% highlight Groff markup %} -SELECT MAX(lstg_site_id) FROM kylin_sales; -{% endhighlight %} - - -## MIN {#MIN} -返回一列中的最小值。NULL 值不包括在计算中。 -例子: -{% highlight Groff markup %} -SELECT MIN(lstg_site_id) FROM kylin_sales; -{% endhighlight %} - - -## PERCENTILE {#PERCENTILE} - -例子: -{% highlight Groff markup %} -SELECT seller_id, PERCENTILE(price, 0.5) FROM kylin_sales GROUP BY seller_id; - -SELECT seller_id, PERCENTILE_APPROX(price, 0.5) FROM kylin_sales GROUP BY seller_id; -{% endhighlight %} - - -## SUM {#SUM} -返回数值列的总数。 -例子: -{% highlight Groff markup %} -SELECT SUM(price) FROM kylin_sales; -{% endhighlight %} - -## TOP_N {#TOP_N} - -例子: -{% highlight Groff markup %} -SELECT SUM(price) AS gmv - FROM kylin_sales -INNER JOIN kylin_cal_dt AS kylin_cal_dt - ON kylin_sales.part_dt = kylin_cal_dt.cal_dt - INNER JOIN kylin_category_groupings - ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id - WHERE kylin_cal_dt.cal_dt between DATE '2013-09-01' AND DATE '2013-10-01' AND (lstg_format_name='FP-GTC' OR 'a' = 'b') - GROUP BY kylin_cal_dt.cal_dt; - -SELECT kylin_sales.part_dt, seller_id -FROM kylin_sales -INNER JOIN kylin_cal_dt AS kylin_cal_dt -ON kylin_sales.part_dt = kylin_cal_dt.cal_dt -INNER JOIN kylin_category_groupings -ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id -AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id -GROUP BY -kylin_sales.part_dt, kylin_sales.seller_id ORDER BY SUM(kylin_sales.price) DESC LIMIT 20; -{% endhighlight %} - -## WINDOW {#WINDOW} -```WINDOW``` 函数在和当前行相关的一组表行上执行计算。 -*注意*:```WINDOW``` 函数中必须有 ```OVER``` 子句 - -### ROW_NUMBER {#ROW_NUMBER} - -例子: -{% highlight Groff markup %} -SELECT lstg_format_name, SUM(price) AS gmv, ROW_NUMBER() OVER() FROM kylin_sales GROUP BY lstg_format_name; -{% endhighlight %} - -### AVG {#AVG} -返回数值列的平均值。NULL 值不包括在计算中。 -例子: -{% highlight Groff markup %} -SELECT lstg_format_name, AVG(SUM(price)) OVER(PARTITION BY lstg_format_name) FROM kylin_sales GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### RANK {#RANK} - -例子: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price), RANK() OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "rank" FROM kylin_sales GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### DENSE_RANK {#DENSE_RANK} - -例子: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price), DENSE_RANK() OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "dense_rank" FROM kylin_sales GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### FIRST_VALUE {#FIRST_VALUE} - -例子: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, FIRST_VALUE(SUM(price)) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "first" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### LAST_VALUE {#LAST_VALUE} - -例子: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LAST_VALUE(SUM(price)) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "current" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### LAG {#LAG} - -例子: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "prev" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### LEAD {#LEAD} - -例子: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LEAD(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "next" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### NTILE {#NTILE} - -例子: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, NTILE(4) OVER (PARTITION BY lstg_format_name ORDER BY part_dt) AS "quarter" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### CASE WHEN {#CASEWHEN} - -例子: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, (CASE LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) WHEN 0.0 THEN 0 ELSE SUM(price)/LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) END) AS "prev" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### CAST {#CAST} -```RANGE```,```INTERVAL``` 关键字指明了范围。```PRECEDING``` 表示前几天(秒/分/时/月/年)。```FOLLOWING``` 表示后几天(秒/分/时/月/年)。 -例子: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, FIRST_VALUE(SUM(price)) OVER (PARTITION BY lstg_format_name ORDER BY CAST(part_dt AS timestamp) RANGE INTERVAL '3' DAY PRECEDING) AS "prev 3 days", LAST_VALUE(SUM(price)) OVER (PARTITION BY lstg_format_name ORDER BY CAST(part_dt AS timestamp) RANGE INTERVAL '3' DAY FOLLOWING) AS "next 3 days" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -## SUBSTRING {#SUBSTRING} -例子: -{% highlight Groff markup %} -SELECT SUBSTRING(lstg_format_name, 1) FROM kylin_sales; -{% endhighlight %} - -## COALESCE {#COALESCE} -例子: -{% highlight Groff markup %} -SELECT COALESCE(lstg_format_name, '888888888888') FROM kylin_sales; -{% endhighlight %} - diff --git a/website/_docs/tutorial/kylin_function.md b/website/_docs/tutorial/kylin_function.md deleted file mode 100644 index 60255da..0000000 --- a/website/_docs/tutorial/kylin_function.md +++ /dev/null @@ -1,210 +0,0 @@ ---- -layout: docs -title: SQL Function -categories: tutorial -permalink: /docs/tutorial/kylin_function.html ---- - -## Function - -[COUNT](#COUNT) - [COUNT(COLUMN)](#COUNT_COLUMN) - [COUNT(*)](#COUNT_) -[COUNT_DISTINCT](#COUNT_DISTINCT) -[MAX](#MAX) -[MIN](#MIN) -[PERCENTILE](#PERCENTILE) -[SUM](#SUM) -[TOP_N](#TOP_N) - -[WINDOW](#WINDOW) - [ROW_NUMBER](#ROW_NUMBER) - [AVG](#AVG) - [RANK](#RANK) - [DENSE_RANK](#DENSE_RANK) - [FIRST_VALUE](#FIRST_VALUE) - [LAST_VALUE](#LAST_VALUE) - [LAG](#LAG) - [LEAD](#LEAD) - [NTILE](#NTILE) - [CASE WHEN](#CASEWHEN) - [CAST](#CAST) - -[SUSTRING](#SUBSTRING) -[COALESCE](#COALESCE) - -## COUNT {#COUNT} -Returns the number of rows matching the specified criteria. - -### COUNT(COLUMN) {#COUNT_COLUMN} - -Example: -{% highlight Groff markup %} -SELECT COUNT(seller_id) FROM kylin_sales; -{% endhighlight %} - -### COUNT(*) {#COUNT_} - -Example: -{% highlight Groff markup %} -SELECT COUNT(*) FROM kylin_sales; -{% endhighlight %} - - -## COUNT_DISTINCT {#COUNT_DISTINCT} - -Example: -{% highlight Groff markup %} -SELECT COUNT(DISTINCT seller_id) AS DIST_SELLER FROM kylin_sales; -{% endhighlight %} - -## MAX {#MAX} -Returns the maximum value in a column. NULL values are not included in the calculation. -Example: -{% highlight Groff markup %} -SELECT MAX(lstg_site_id) FROM kylin_sales; -{% endhighlight %} - - -## MIN {#MIN} -Returns the minimum value in a column. NULL values are not included in the calculation. -Example: -{% highlight Groff markup %} -SELECT MIN(lstg_site_id) FROM kylin_sales; -{% endhighlight %} - - -## PERCENTILE {#PERCENTILE} - -Example: -{% highlight Groff markup %} -SELECT seller_id, PERCENTILE(price, 0.5) FROM kylin_sales GROUP BY seller_id; - -SELECT seller_id, PERCENTILE_APPROX(price, 0.5) FROM kylin_sales GROUP BY seller_id; -{% endhighlight %} - - -## SUM {#SUM} -Returns the total number of numeric columns. -Example: -{% highlight Groff markup %} -SELECT SUM(price) FROM kylin_sales; -{% endhighlight %} - -## TOP_N {#TOP_N} - -Example: -{% highlight Groff markup %} -SELECT SUM(price) AS gmv - FROM kylin_sales -INNER JOIN kylin_cal_dt AS kylin_cal_dt - ON kylin_sales.part_dt = kylin_cal_dt.cal_dt - INNER JOIN kylin_category_groupings - ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id - WHERE kylin_cal_dt.cal_dt between DATE '2013-09-01' AND DATE '2013-10-01' AND (lstg_format_name='FP-GTC' OR 'a' = 'b') - GROUP BY kylin_cal_dt.cal_dt; - -SELECT kylin_sales.part_dt, seller_id -FROM kylin_sales -INNER JOIN kylin_cal_dt AS kylin_cal_dt -ON kylin_sales.part_dt = kylin_cal_dt.cal_dt -INNER JOIN kylin_category_groupings -ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id -AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id -GROUP BY -kylin_sales.part_dt, kylin_sales.seller_id ORDER BY SUM(kylin_sales.price) DESC LIMIT 20; -{% endhighlight %} - -## WINDOW {#WINDOW} - -The ```WINDOW``` function performs the calculation on a set of table rows associated with the current row. -*NOTE*: ```OVER``` clause is necessary for window functions. - -### ROW_NUMBER {#ROW_NUMBER} - -Example: -{% highlight Groff markup %} -SELECT lstg_format_name, SUM(price) AS gmv, ROW_NUMBER() OVER() FROM kylin_sales GROUP BY lstg_format_name; -{% endhighlight %} - -### AVG {#AVG} -Returns the average of the numeric columns. NULL values are not included in the calculation. -Example: -{% highlight Groff markup %} -SELECT lstg_format_name, AVG(SUM(price)) OVER(PARTITION BY lstg_format_name) FROM kylin_sales GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### RANK {#RANK} - -Example: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price), RANK() OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "rank" FROM kylin_sales GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### DENSE_RANK {#DENSE_RANK} - -Example: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price), DENSE_RANK() OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "dense_rank" FROM kylin_sales GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### FIRST_VALUE {#FIRST_VALUE} - -Example: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, FIRST_VALUE(SUM(price)) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "first" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### LAST_VALUE {#LAST_VALUE} - -Example: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LAST_VALUE(SUM(price)) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "current" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### LAG {#LAG} - -Example: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "prev" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### LEAD {#LEAD} - -Example: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LEAD(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "next" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### NTILE {#NTILE} - -Example: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, NTILE(4) OVER (PARTITION BY lstg_format_name ORDER BY part_dt) AS "quarter" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### CASE WHEN {#CASEWHEN} - -Example: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, (CASE LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) WHEN 0.0 THEN 0 ELSE SUM(price)/LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) END) AS "prev" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -### CAST {#CAST} -The keyword ```RANGE```, ```INTERVAL``` to specify a range. ```PRECEDING``` means the first few days (second/minute/hour/month/year). ```FOLLOWING``` means the next few days (second/minute/hour/month/year). -Example: -{% highlight Groff markup %} -SELECT part_dt, lstg_format_name, SUM(price) AS gmv, FIRST_VALUE(SUM(price)) OVER (PARTITION BY lstg_format_name ORDER BY CAST(part_dt AS timestamp) RANGE INTERVAL '3' DAY PRECEDING) AS "prev 3 days", LAST_VALUE(SUM(price)) OVER (PARTITION BY lstg_format_name ORDER BY CAST(part_dt AS timestamp) RANGE INTERVAL '3' DAY FOLLOWING) AS "next 3 days" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; -{% endhighlight %} - -## SUBSTRING {#SUBSTRING} -Example: -{% highlight Groff markup %} -SELECT SUBSTRING(lstg_format_name, 1) FROM kylin_sales; -{% endhighlight %} - -## COALESCE {#COALESCE} -Example: -{% highlight Groff markup %} -SELECT COALESCE(lstg_format_name, '888888888888') FROM kylin_sales; -{% endhighlight %} diff --git a/website/_docs/tutorial/kylin_grammar.cn.md b/website/_docs/tutorial/sql_reference.cn.md similarity index 50% rename from website/_docs/tutorial/kylin_grammar.cn.md rename to website/_docs/tutorial/sql_reference.cn.md index 087d465..263c740 100644 --- a/website/_docs/tutorial/kylin_grammar.cn.md +++ b/website/_docs/tutorial/sql_reference.cn.md @@ -1,12 +1,14 @@ --- layout: docs-cn -title: "SQL 语法" +title: "SQL 快速参考" categories: tutorial -permalink: /cn/docs/tutorial/kylin_grammar.html +permalink: /cn/docs/tutorial/sql_reference.html --- -## 语法 +Kylin 是使用 Calcite 进行查询的,原则上 Kylin 支持所有 Calcite 能够识别的语句。 +## 语法 +[QUERY SYNTAX](#QUERYSYNTAX) [SELECT](#SELECT) [STATEMENT](#STATEMENT) [EXPRESSION](#EXPRESSION) @@ -14,10 +16,41 @@ permalink: /cn/docs/tutorial/kylin_grammar.html [JOIN](#JOIN) [INNER JOIN](#INNERJOIN) [LEFT JOIN](#LEFTJOIN) - [CROSS JOIN](#CROSSJOIN) [UNION](#UNION) [UNION ALL](#UNIONALL) +## 函数 + +[COUNT](#COUNT) + [COUNT(COLUMN)](#COUNT_COLUMN) + [COUNT(*)](#COUNT_) +[COUNT_DISTINCT](#COUNT_DISTINCT) +[MAX](#MAX) +[MIN](#MIN) +[PERCENTILE](#PERCENTILE) +[SUM](#SUM) +[TOP_N](#TOP_N) + +[WINDOW](#WINDOW) + [ROW_NUMBER](#ROW_NUMBER) + [AVG](#AVG) + [RANK](#RANK) + [DENSE_RANK](#DENSE_RANK) + [FIRST_VALUE](#FIRST_VALUE) + [LAST_VALUE](#LAST_VALUE) + [LAG](#LAG) + [LEAD](#LEAD) + [NTILE](#NTILE) + [CASE WHEN](#CASEWHEN) + [CAST](#CAST) + +[SUSTRING](#SUBSTRING) +[COALESCE](#COALESCE) + +## 数据类型 +[数据类型](#datatype) + +## 查询语法 {#QUERYSYNTAX} {% highlight Groff markup %} statement: | query @@ -141,7 +174,6 @@ SELECT lstg_format_name, SUM(price) AS gmv, COUNT(1) AS trans_cnt FROM kylin_sal 3. 值 + 1 ## SUBQUERY {#SUBQUERY} - 例子: {% highlight Groff markup %} SELECT cal_dt ,sum(price) AS sum_price FROM (SELECT kylin_cal_dt.cal_dt, kylin_sales.price FROM kylin_sales INNER JOIN kylin_cal_dt AS kylin_cal_dt ON kylin_sales.part_dt = kylin_cal_dt.cal_dt INNER JOIN kylin_category_groupings ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id) t GROUP BY cal_dt; @@ -163,12 +195,6 @@ SELECT kylin_cal_dt.cal_dt, kylin_sales.price FROM kylin_sales INNER JOIN kylin_ SELECT seller_id FROM kylin_sales LEFT JOIN kylin_category_groupings AS kylin_category_groupings ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id WHERE lstg_format_name='FP-GTC' GROUP BY seller_id LIMIT 20; {% endhighlight %} -### CROSS JOIN {#CROSSJOIN} -例子: -{% highlight Groff markup %} -SELECT * FROM (SELECT leaf_categ_id, SUM(price) AS sum_price FROM kylin_sales GROUP BY leaf_categ_id UNION ALL SELECT leaf_categ_id, SUM(price) AS sum_price FROM kylin_sales GROUP BY leaf_categ_id) CROSS JOIN (SELECT SUM(price) AS sum_price_2 FROM kylin_sales GROUP BY leaf_categ_id) UNION ALL SELECT CAST(1999 AS bigint) AS leaf_categ_id, 11.2 AS sum_price, 21.2 AS sum_price2 UNION ALL SELECT * FROM (SELECT leaf_categ_id, SUM(price) AS sum_price FROM kylin_sales GROUP BY leaf_categ_id UNI [...] -{% endhighlight %} - ## UNION {#UNION} ```UNION``` 操作符用于合并两个或多个 ```SELECT``` 语句的结果集。 *注意* ```UNION``` 内部的 ```SELECT``` 语句必须拥有相同数量的列。列也必须拥有相似的数据类型。同时,每条 ```SELECT``` 语句中的列的顺序必须相同。 @@ -186,3 +212,173 @@ SELECT SUM(price) AS x, seller_id, COUNT(*) AS y FROM kylin_sales WHERE part_dt {% highlight Groff markup %} SELECT COUNT(trans_id) AS trans_id FROM kylin_sales AS test_a WHERE trans_id <> 1 UNION ALL SELECT COUNT(trans_id) AS trans_id FROM kylin_sales AS test_b; {% endhighlight %} + +## COUNT {#COUNT} +用于返回与指定条件匹配的行数。 + +### COUNT(COLUMN) {#COUNT_COLUMN} +例子: +{% highlight Groff markup %} +SELECT COUNT(seller_id) FROM kylin_sales; +{% endhighlight %} + +### COUNT(*) {#COUNT_} +例子: +{% highlight Groff markup %} +SELECT COUNT(*) FROM kylin_sales; +{% endhighlight %} + + +## COUNT_DISTINCT {#COUNT_DISTINCT} +例子: +{% highlight Groff markup %} +SELECT COUNT(DISTINCT seller_id) AS DIST_SELLER FROM kylin_sales; +{% endhighlight %} + +## MAX {#MAX} +返回一列中的最大值。NULL 值不包括在计算中。 +例子: +{% highlight Groff markup %} +SELECT MAX(lstg_site_id) FROM kylin_sales; +{% endhighlight %} + + +## MIN {#MIN} +返回一列中的最小值。NULL 值不包括在计算中。 +例子: +{% highlight Groff markup %} +SELECT MIN(lstg_site_id) FROM kylin_sales; +{% endhighlight %} + + +## PERCENTILE {#PERCENTILE} +例子: +{% highlight Groff markup %} +SELECT seller_id, PERCENTILE(price, 0.5) FROM kylin_sales GROUP BY seller_id; + +SELECT seller_id, PERCENTILE_APPROX(price, 0.5) FROM kylin_sales GROUP BY seller_id; +{% endhighlight %} + + +## SUM {#SUM} +返回数值列的总数。 +例子: +{% highlight Groff markup %} +SELECT SUM(price) FROM kylin_sales; +{% endhighlight %} + +## TOP_N {#TOP_N} +例子: +{% highlight Groff markup %} +SELECT SUM(price) AS gmv + FROM kylin_sales +INNER JOIN kylin_cal_dt AS kylin_cal_dt + ON kylin_sales.part_dt = kylin_cal_dt.cal_dt + INNER JOIN kylin_category_groupings + ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id + WHERE kylin_cal_dt.cal_dt between DATE '2013-09-01' AND DATE '2013-10-01' AND (lstg_format_name='FP-GTC' OR 'a' = 'b') + GROUP BY kylin_cal_dt.cal_dt; + +SELECT kylin_sales.part_dt, seller_id +FROM kylin_sales +INNER JOIN kylin_cal_dt AS kylin_cal_dt +ON kylin_sales.part_dt = kylin_cal_dt.cal_dt +INNER JOIN kylin_category_groupings +ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id +AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id +GROUP BY +kylin_sales.part_dt, kylin_sales.seller_id ORDER BY SUM(kylin_sales.price) DESC LIMIT 20; +{% endhighlight %} + +## WINDOW {#WINDOW} +```WINDOW``` 函数在和当前行相关的一组表行上执行计算。 +*注意*:```WINDOW``` 函数中必须有 ```OVER``` 子句 + +### ROW_NUMBER {#ROW_NUMBER} +例子: +{% highlight Groff markup %} +SELECT lstg_format_name, SUM(price) AS gmv, ROW_NUMBER() OVER() FROM kylin_sales GROUP BY lstg_format_name; +{% endhighlight %} + +### AVG {#AVG} +返回数值列的平均值。NULL 值不包括在计算中。 +例子: +{% highlight Groff markup %} +SELECT lstg_format_name, AVG(SUM(price)) OVER(PARTITION BY lstg_format_name) FROM kylin_sales GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### RANK {#RANK} +例子: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price), RANK() OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "rank" FROM kylin_sales GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### DENSE_RANK {#DENSE_RANK} +例子: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price), DENSE_RANK() OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "dense_rank" FROM kylin_sales GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### FIRST_VALUE {#FIRST_VALUE} +例子: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, FIRST_VALUE(SUM(price)) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "first" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### LAST_VALUE {#LAST_VALUE} +例子: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LAST_VALUE(SUM(price)) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "current" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### LAG {#LAG} +例子: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "prev" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### LEAD {#LEAD} +例子: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LEAD(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "next" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### NTILE {#NTILE} +例子: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, NTILE(4) OVER (PARTITION BY lstg_format_name ORDER BY part_dt) AS "quarter" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### CASE WHEN {#CASEWHEN} +例子: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, (CASE LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) WHEN 0.0 THEN 0 ELSE SUM(price)/LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) END) AS "prev" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### CAST {#CAST} +```RANGE```,```INTERVAL``` 关键字指明了范围。```PRECEDING``` 表示前几天(秒/分/时/月/年)。```FOLLOWING``` 表示后几天(秒/分/时/月/年)。 +例子: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, FIRST_VALUE(SUM(price)) OVER (PARTITION BY lstg_format_name ORDER BY CAST(part_dt AS timestamp) RANGE INTERVAL '3' DAY PRECEDING) AS "prev 3 days", LAST_VALUE(SUM(price)) OVER (PARTITION BY lstg_format_name ORDER BY CAST(part_dt AS timestamp) RANGE INTERVAL '3' DAY FOLLOWING) AS "next 3 days" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +## SUBSTRING {#SUBSTRING} +例子: +{% highlight Groff markup %} +SELECT SUBSTRING(lstg_format_name, 1) FROM kylin_sales; +{% endhighlight %} + +## COALESCE {#COALESCE} +例子: +{% highlight Groff markup %} +SELECT COALESCE(lstg_format_name, '888888888888') FROM kylin_sales; +{% endhighlight %} + +## 数据类型 {#datatype} + +| ---------- | ---------- | ---------- | ---------- | -------------------- | +| ANY | CHAR | VARCHAR | STRING | BOOLEAN | +| BYTE | BINARY | INT | SHORT | LONG | +| INTEGER | TINYINT | SMALLINT | BIGINT | TIMESTAMP | +| FLOAT | REAL | DOUBLE | DECIMAL | DATETIME | +| NUMERIC | DATE | TIME | | | \ No newline at end of file diff --git a/website/_docs/tutorial/kylin_grammar.md b/website/_docs/tutorial/sql_reference.md similarity index 50% rename from website/_docs/tutorial/kylin_grammar.md rename to website/_docs/tutorial/sql_reference.md index 4efec46..4af4709 100644 --- a/website/_docs/tutorial/kylin_grammar.md +++ b/website/_docs/tutorial/sql_reference.md @@ -1,12 +1,15 @@ --- layout: docs -title: SQL Grammar +title: SQL Reference categories: tutorial -permalink: /docs/tutorial/kylin_grammar.html +permalink: /docs/tutorial/sql_reference.html --- +Kylin uses Calcite for the query. In principle, Kylin supports all statements that Calcite recognizes. + ## Grammar +[QUERY SYNTAX](#QUERYSYNTAX) [SELECT](#SELECT) [STATEMENT](#STATEMENT) [EXPRESSION](#EXPRESSION) @@ -14,10 +17,41 @@ permalink: /docs/tutorial/kylin_grammar.html [JOIN](#JOIN) [INNER JOIN](#INNERJOIN) [LEFT JOIN](#LEFTJOIN) - [CROSS JOIN](#CROSSJOIN) [UNION](#UNION) [UNION ALL](#UNIONALL) +## Function + +[COUNT](#COUNT) + [COUNT(COLUMN)](#COUNT_COLUMN) + [COUNT(*)](#COUNT_) +[COUNT_DISTINCT](#COUNT_DISTINCT) +[MAX](#MAX) +[MIN](#MIN) +[PERCENTILE](#PERCENTILE) +[SUM](#SUM) +[TOP_N](#TOP_N) +[WINDOW](#WINDOW) + [ROW_NUMBER](#ROW_NUMBER) + [AVG](#AVG) + [RANK](#RANK) + [DENSE_RANK](#DENSE_RANK) + [FIRST_VALUE](#FIRST_VALUE) + [LAST_VALUE](#LAST_VALUE) + [LAG](#LAG) + [LEAD](#LEAD) + [NTILE](#NTILE) + [CASE WHEN](#CASEWHEN) + [CAST](#CAST) +[SUSTRING](#SUBSTRING) +[COALESCE](#COALESCE) + +## Data Type + +[DATA TYPE](#DATATYPE) + + +## QUERY SYNTAX {#QUERYSYNTAX} {% highlight Groff markup %} statement: | query @@ -141,7 +175,6 @@ Example: 3. VALUE + 1 ## SUBQUERY {#SUBQUERY} - Example: {% highlight Groff markup %} SELECT cal_dt ,sum(price) AS sum_price FROM (SELECT kylin_cal_dt.cal_dt, kylin_sales.price FROM kylin_sales INNER JOIN kylin_cal_dt AS kylin_cal_dt ON kylin_sales.part_dt = kylin_cal_dt.cal_dt INNER JOIN kylin_category_groupings ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id) t GROUP BY cal_dt; @@ -165,12 +198,6 @@ Example: SELECT seller_id FROM kylin_sales LEFT JOIN kylin_category_groupings AS kylin_category_groupings ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id WHERE lstg_format_name='FP-GTC' GROUP BY seller_id LIMIT 20; {% endhighlight %} -### CROSS JOIN {#CROSSJOIN} -Example: -{% highlight Groff markup %} -SELECT * FROM (SELECT leaf_categ_id, SUM(price) AS sum_price FROM kylin_sales GROUP BY leaf_categ_id UNION ALL SELECT leaf_categ_id, SUM(price) AS sum_price FROM kylin_sales GROUP BY leaf_categ_id) CROSS JOIN (SELECT SUM(price) AS sum_price_2 FROM kylin_sales GROUP BY leaf_categ_id) UNION ALL SELECT CAST(1999 AS bigint) AS leaf_categ_id, 11.2 AS sum_price, 21.2 AS sum_price2 UNION ALL SELECT * FROM (SELECT leaf_categ_id, SUM(price) AS sum_price FROM kylin_sales GROUP BY leaf_categ_id UNI [...] -{% endhighlight %} - ## UNION {#UNION} The ```UNION``` operator is used to combine the result sets of two or more ```SELECT``` statements. *Note that* the ```SELECT``` statement inside ```UNION``` must have the same number of columns. Columns must also have similar data types. At the same time, the order of the columns in each ```SELECT``` statement must be the same. @@ -187,4 +214,175 @@ The ```UNION ALL``` command is almost equivalent to the ```UNION``` command, but Example: {% highlight Groff markup %} SELECT COUNT(trans_id) AS trans_id FROM kylin_sales AS test_a WHERE trans_id <> 1 UNION ALL SELECT COUNT(trans_id) AS trans_id FROM kylin_sales AS test_b; -{% endhighlight %} \ No newline at end of file +{% endhighlight %} + +## COUNT {#COUNT} +Returns the number of rows matching the specified criteria. + +### COUNT(COLUMN) {#COUNT_COLUMN} +Example: +{% highlight Groff markup %} +SELECT COUNT(seller_id) FROM kylin_sales; +{% endhighlight %} + +### COUNT(*) {#COUNT_} +Example: +{% highlight Groff markup %} +SELECT COUNT(*) FROM kylin_sales; +{% endhighlight %} + + +## COUNT_DISTINCT {#COUNT_DISTINCT} +Example: +{% highlight Groff markup %} +SELECT COUNT(DISTINCT seller_id) AS DIST_SELLER FROM kylin_sales; +{% endhighlight %} + +## MAX {#MAX} +Returns the maximum value in a column. NULL values are not included in the calculation. +Example: +{% highlight Groff markup %} +SELECT MAX(lstg_site_id) FROM kylin_sales; +{% endhighlight %} + + +## MIN {#MIN} +Returns the minimum value in a column. NULL values are not included in the calculation. +Example: +{% highlight Groff markup %} +SELECT MIN(lstg_site_id) FROM kylin_sales; +{% endhighlight %} + + +## PERCENTILE {#PERCENTILE} +Example: +{% highlight Groff markup %} +SELECT seller_id, PERCENTILE(price, 0.5) FROM kylin_sales GROUP BY seller_id; + +SELECT seller_id, PERCENTILE_APPROX(price, 0.5) FROM kylin_sales GROUP BY seller_id; +{% endhighlight %} + + +## SUM {#SUM} +Returns the total number of numeric columns. +Example: +{% highlight Groff markup %} +SELECT SUM(price) FROM kylin_sales; +{% endhighlight %} + +## TOP_N {#TOP_N} +Example: +{% highlight Groff markup %} +SELECT SUM(price) AS gmv + FROM kylin_sales +INNER JOIN kylin_cal_dt AS kylin_cal_dt + ON kylin_sales.part_dt = kylin_cal_dt.cal_dt + INNER JOIN kylin_category_groupings + ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id + WHERE kylin_cal_dt.cal_dt between DATE '2013-09-01' AND DATE '2013-10-01' AND (lstg_format_name='FP-GTC' OR 'a' = 'b') + GROUP BY kylin_cal_dt.cal_dt; + +SELECT kylin_sales.part_dt, seller_id +FROM kylin_sales +INNER JOIN kylin_cal_dt AS kylin_cal_dt +ON kylin_sales.part_dt = kylin_cal_dt.cal_dt +INNER JOIN kylin_category_groupings +ON kylin_sales.leaf_categ_id = kylin_category_groupings.leaf_categ_id +AND kylin_sales.lstg_site_id = kylin_category_groupings.site_id +GROUP BY +kylin_sales.part_dt, kylin_sales.seller_id ORDER BY SUM(kylin_sales.price) DESC LIMIT 20; +{% endhighlight %} + +## WINDOW {#WINDOW} +The ```WINDOW``` function performs the calculation on a set of table rows associated with the current row. +*NOTE*: ```OVER``` clause is necessary for window functions. + +### ROW_NUMBER {#ROW_NUMBER} +Example: +{% highlight Groff markup %} +SELECT lstg_format_name, SUM(price) AS gmv, ROW_NUMBER() OVER() FROM kylin_sales GROUP BY lstg_format_name; +{% endhighlight %} + +### AVG {#AVG} +Returns the average of the numeric columns. NULL values are not included in the calculation. +Example: +{% highlight Groff markup %} +SELECT lstg_format_name, AVG(SUM(price)) OVER(PARTITION BY lstg_format_name) FROM kylin_sales GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### RANK {#RANK} +Example: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price), RANK() OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "rank" FROM kylin_sales GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### DENSE_RANK {#DENSE_RANK} +Example: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price), DENSE_RANK() OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "dense_rank" FROM kylin_sales GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### FIRST_VALUE {#FIRST_VALUE} +Example: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, FIRST_VALUE(SUM(price)) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "first" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### LAST_VALUE {#LAST_VALUE} +Example: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LAST_VALUE(SUM(price)) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "current" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### LAG {#LAG} +Example: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "prev" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### LEAD {#LEAD} +Example: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, LEAD(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) AS "next" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### NTILE {#NTILE} +Example: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, NTILE(4) OVER (PARTITION BY lstg_format_name ORDER BY part_dt) AS "quarter" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### CASE WHEN {#CASEWHEN} +Example: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, (CASE LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) WHEN 0.0 THEN 0 ELSE SUM(price)/LAG(SUM(price), 1, 0.0) OVER(PARTITION BY lstg_format_name ORDER BY part_dt) END) AS "prev" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +### CAST {#CAST} +The keyword ```RANGE```, ```INTERVAL``` to specify a range. ```PRECEDING``` means the first few days (second/minute/hour/month/year). ```FOLLOWING``` means the next few days (second/minute/hour/month/year). +Example: +{% highlight Groff markup %} +SELECT part_dt, lstg_format_name, SUM(price) AS gmv, FIRST_VALUE(SUM(price)) OVER (PARTITION BY lstg_format_name ORDER BY CAST(part_dt AS timestamp) RANGE INTERVAL '3' DAY PRECEDING) AS "prev 3 days", LAST_VALUE(SUM(price)) OVER (PARTITION BY lstg_format_name ORDER BY CAST(part_dt AS timestamp) RANGE INTERVAL '3' DAY FOLLOWING) AS "next 3 days" FROM kylin_sales WHERE part_dt < '2012-02-01' GROUP BY part_dt, lstg_format_name; +{% endhighlight %} + +## SUBSTRING {#SUBSTRING} +Example: +{% highlight Groff markup %} +SELECT SUBSTRING(lstg_format_name, 1) FROM kylin_sales; +{% endhighlight %} + +## COALESCE {#COALESCE} +Example: +{% highlight Groff markup %} +SELECT COALESCE(lstg_format_name, '888888888888') FROM kylin_sales; +{% endhighlight %} + + +## DATA TYPE {#DATATYPE} + +| ---------- | ---------- | ---------- | ---------- | -------------------- | +| ANY | CHAR | VARCHAR | STRING | BOOLEAN | +| BYTE | BINARY | INT | SHORT | LONG | +| INTEGER | TINYINT | SMALLINT | BIGINT | TIMESTAMP | +| FLOAT | REAL | DOUBLE | DECIMAL | DATETIME | +| NUMERIC | DATE | TIME | | | \ No newline at end of file