This is an automated email from the ASF dual-hosted git repository.

shaofengshi pushed a commit to branch document
in repository https://gitbox.apache.org/repos/asf/kylin.git


The following commit(s) were added to refs/heads/document by this push:
     new ed5d565  KYLIN-3552 add doc and blog
ed5d565 is described below

commit ed5d565f02f40c27197e603124a402ce597fd9d9
Author: edouardzyc <edouard...@hotmail.com>
AuthorDate: Thu Jan 17 09:52:00 2019 +0800

    KYLIN-3552 add doc and blog
---
 website/_data/development-cn.yml                   |   1 +
 website/_data/development.yml                      |   1 +
 website/_dev/datasource_sdk.cn.md                  | 154 ++++++++++++++++++++
 website/_dev/datasource_sdk.md                     | 156 +++++++++++++++++++++
 .../2019-01-16-introduce-data-source-sdk-v2.6.0.md |  37 +++++
 website/images/blog/data-source-sdk.png            | Bin 0 -> 21267 bytes
 6 files changed, 349 insertions(+)

diff --git a/website/_data/development-cn.yml b/website/_data/development-cn.yml
index 83379b5..a42f8b7 100644
--- a/website/_data/development-cn.yml
+++ b/website/_data/development-cn.yml
@@ -33,3 +33,4 @@
     - new_metadata
     - web_tech
     - about_temp_files
+    - datasource_sdk
diff --git a/website/_data/development.yml b/website/_data/development.yml
index fa68c1b..a3d1d03 100644
--- a/website/_data/development.yml
+++ b/website/_data/development.yml
@@ -32,3 +32,4 @@
   - new_metadata
   - web_tech
   - about_temp_files
+  - datasource_sdk
diff --git a/website/_dev/datasource_sdk.cn.md b/website/_dev/datasource_sdk.cn.md
new file mode 100644
index 0000000..0a50652
--- /dev/null
+++ b/website/_dev/datasource_sdk.cn.md
@@ -0,0 +1,154 @@
+---
+layout: dev
+title:  Develop JDBC Data Source
+categories: development
+permalink: /cn/development/datasource_sdk.html
+---
+
+> Available since Apache Kylin v2.6.0
+
+## Data source SDK
+Since Apache Kylin v2.6.0, we provide a new data source framework, *Data source SDK*. With the APIs it provides, developers can easily implement a new data source and adapt to its SQL dialect.
+
+## How to develop
+
+### Configuration to implement a new data source
+
+*Data source SDK* provides a conversion mechanism; the framework pre-defines a configuration file *default.xml* for the ANSI SQL dialect.
+
+Developers do not need to write code; they only need to create a new configuration file *{dialect}.xml* for the new data source.
+
+Structure of the configuration file:
+* Root node:  
+&lt;DATASOURCE_DEF NAME="kylin" ID="default"&gt;, where the value of ID is the name of the dialect.
+* Property node:  
+Define the properties of the dialect.
+
+<table>
+  <tbody align="left">  
+  <tr>
+    <td align="center">Property</td>
+    <td align="center">Description</td>
+  </tr>
+  <tr>
+    <td> sql.default-converted-enabled </td>
+    <td> whether conversion is needed </td>
+  </tr>
+  <tr>
+    <td> sql.allow-no-offset </td>
+    <td> whether a missing OFFSET clause is allowed </td>
+  </tr>
+  <tr>
+    <td> sql.allow-fetch-no-rows </td>
+    <td> whether FETCH 0 rows is allowed </td>
+  </tr>
+  <tr>
+    <td> sql.allow-no-orderby-with-fetch </td>
+    <td> whether FETCH must be accompanied by ORDER BY </td>
+  </tr>
+  <tr>
+    <td> sql.keyword-default-escape  </td>
+    <td> whether &lt;default&gt; is a keyword </td>
+  </tr>
+  <tr>
+     <td> sql.keyword-default-uppercase </td>
+     <td> whether &lt;default&gt; should be converted to uppercase </td>
+  </tr>
+  <tr>
+    <td> sql.paging-type </td>
+    <td> paging type, such as LIMIT_OFFSET, FETCH_NEXT, ROWNUM </td>
+  </tr>
+  <tr>
+    <td> sql.case-sensitive </td>
+    <td> whether identifiers are case sensitive </td>
+  </tr>
+  <tr>
+    <td> metadata.enable-cache </td>
+    <td> whether to enable the metadata cache (when case sensitivity is enabled) </td>
+  </tr>
+  <tr>
+    <td> sql.enable-quote-all-identifiers </td>
+    <td> whether to quote all identifiers </td>
+  </tr>
+  <tr>
+    <td> transaction.isolation-level </td>
+    <td> transaction isolation level (for sqoop) </td>
+  </tr>
+  </tbody>
+</table>
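Putting the root node and property nodes together, a *{dialect}.xml* might look like the sketch below. Only `<DATASOURCE_DEF>`, `<FUNCTION_DEF>` and `<TYPE_DEF>` are confirmed by this document; the `<PROPERTY>` element name and attributes are an assumption for illustration, so copy the exact syntax from the *default.xml* shipped with Kylin.

```xml
<!-- Hypothetical skeleton of a {dialect}.xml. The PROPERTY element
     name/attributes are assumed for illustration; copy the exact
     syntax from the default.xml shipped with Kylin. -->
<DATASOURCE_DEF NAME="kylin" ID="greenplum">
    <PROPERTY NAME="sql.paging-type" VALUE="LIMIT_OFFSET"/>
    <PROPERTY NAME="sql.case-sensitive" VALUE="true"/>
</DATASOURCE_DEF>
```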
+
+
+* Function node:  
+Developers can define function implementations according to the data source dialect.
+For example, if we want to support Greenplum as a data source, but Greenplum does not support the *TIMESTAMPDIFF* function, we can define in *greenplum.xml*
+
+```
+<FUNCTION_DEF ID="64" EXPRESSION="(CAST($1 AS DATE) - CAST($0 AS DATE))"/>
+```
+
+Contrast this with the definition in *default.xml*
+
+```
+<FUNCTION_DEF ID="64" EXPRESSION="TIMESTAMPDIFF(day, $0, $1)"/>
+```
+
+*Data source SDK* rewrites a function defined in *default.xml* into the target dialect's definition with the same function ID.
+
+* Type node:  
+Developers can define data types according to the data source dialect.
+Taking Greenplum as an example again, Greenplum supports *BIGINT* instead of *LONG*, so we can define in *greenplum.xml*
+
+```
+<TYPE_DEF ID="Long" EXPRESSION="BIGINT"/>
+```
+
+Contrast this with the definition in *default.xml*
+
+```
+<TYPE_DEF ID="Long" EXPRESSION="LONG"/>
+```
+
+*Data source SDK* rewrites a type defined in *default.xml* into the target dialect's definition with the same type ID.
+
+
+### Adaptor
+
+Adaptor provides a set of APIs, such as fetching metadata and data from the data source.  
+*Data source SDK* provides a default implementation; developers can create a class that extends it and supply their own implementation.
+{% highlight Groff markup %}
+org.apache.kylin.sdk.datasource.adaptor.DefaultAdaptor
+{% endhighlight %}
+Adaptor also reserves a method *fixSql(String sql)*.  
+If the SQL converted via the configuration file still has compatibility issues with the target dialect, developers can implement this method to make final fixes to the SQL.
+
+
+## Deployment
+Some new configurations:  
+{% highlight Groff markup %}
+kylin.query.pushdown.runner-class-name=org.apache.kylin.query.pushdown.PushdownRunnerSDKImpl
+kylin.source.default=16
+kylin.source.jdbc.dialect={JDBC dialect}
+kylin.source.jdbc.adaptor={class name of the adaptor for the JDBC data source}
+kylin.source.jdbc.user={JDBC connection username}
+kylin.source.jdbc.pass={JDBC connection password}
+kylin.source.jdbc.connection-url={JDBC connection string}
+kylin.source.jdbc.driver={JDBC driver class name}
+{% endhighlight %}  
+
+Take mysql as an example:
+{% highlight Groff markup %}
+kylin.query.pushdown.runner-class-name=org.apache.kylin.query.pushdown.PushdownRunnerSDKImpl
+kylin.source.default=16
+kylin.source.jdbc.dialect=mysql
+kylin.source.jdbc.adaptor=org.apache.kylin.sdk.datasource.adaptor.MysqlAdaptor
+kylin.source.jdbc.user={mysql username}
+kylin.source.jdbc.pass={mysql password}
+kylin.source.jdbc.connection-url=jdbc:mysql://{HOST_URL}:3306/{Database name}
+kylin.source.jdbc.driver=com.mysql.jdbc.Driver
+{% endhighlight %}
+
+Put the new *{dialect}.xml* under the $KYLIN_HOME/conf/datasource directory.
+Package the newly developed Adaptor into a jar and put it under the $KYLIN_HOME/ext directory.
+
+The remaining configurations are identical to the earlier JDBC connection method; please refer to [setup_jdbc_datasource](/cn/docs/tutorial/setup_jdbc_datasource.html)
+
diff --git a/website/_dev/datasource_sdk.md b/website/_dev/datasource_sdk.md
new file mode 100644
index 0000000..9d8258d
--- /dev/null
+++ b/website/_dev/datasource_sdk.md
@@ -0,0 +1,156 @@
+---
+layout: dev
+title:  Develop JDBC Data Source
+categories: development
+permalink: /development/datasource_sdk.html
+---
+
+> Available since Apache Kylin v2.6.0
+
+## Data source SDK
+
+Since v2.6.0 Apache Kylin provides a new data source framework, *Data source SDK*, which offers APIs to help developers handle dialect differences and easily implement a new data source.
+
+## How to develop
+
+### Configuration to implement a new data source
+
+*Data source SDK* provides a conversion framework and pre-defines a configuration file *default.xml* for the ANSI SQL dialect.
+
+Developers do not need to write code; they only need to create a new configuration file *{dialect}.xml* for the new data source dialect.
+
+Structure of the configuration:
+
+* Root node:  
+&lt;DATASOURCE_DEF NAME="kylin" ID="default"&gt;, the value of ID should be 
name of dialect.
+* Property node:  
+Define the properties of the dialect.
+
+<table>
+  <tbody align="left">  
+  <tr>
+    <td align="center">Property</td>
+    <td align="center">Description</td>
+  </tr>
+  <tr>
+    <td> sql.default-converted-enabled </td>
+    <td> whether conversion is enabled </td>
+  </tr>
+  <tr>
+    <td> sql.allow-no-offset </td>
+    <td> whether a missing OFFSET clause is allowed </td>
+  </tr>
+  <tr>
+    <td> sql.allow-fetch-no-rows </td>
+    <td> whether FETCH 0 rows is allowed </td>
+  </tr>
+  <tr>
+    <td> sql.allow-no-orderby-with-fetch </td>
+    <td> whether FETCH is allowed without ORDER BY </td>
+  </tr>
+  <tr>
+    <td> sql.keyword-default-escape  </td>
+    <td> whether &lt;default&gt; is a keyword </td>
+  </tr>
+  <tr>
+     <td> sql.keyword-default-uppercase </td>
+     <td> whether &lt;default&gt; should be transformed to uppercase </td>
+  </tr>
+  <tr>
+    <td> sql.paging-type </td>
+    <td> paging type, such as LIMIT_OFFSET, FETCH_NEXT, ROWNUM </td>
+  </tr>
+  <tr>
+    <td> sql.case-sensitive </td>
+    <td> whether identifiers are case sensitive </td>
+  </tr>
+  <tr>
+    <td> metadata.enable-cache </td>
+    <td> whether to enable the metadata cache when `sql.case-sensitive` is true </td>
+  </tr>
+  <tr>
+    <td> sql.enable-quote-all-identifiers </td>
+    <td> whether to quote all identifiers </td>
+  </tr>
+  <tr>
+    <td> transaction.isolation-level </td>
+    <td> transaction isolation level for sqoop </td>
+  </tr>
+  </tbody>
+</table>
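Putting the root node and property nodes together, a *{dialect}.xml* might look like the sketch below. Only `<DATASOURCE_DEF>`, `<FUNCTION_DEF>` and `<TYPE_DEF>` are confirmed by this document; the `<PROPERTY>` element name and attributes are an assumption for illustration, so copy the exact syntax from the *default.xml* shipped with Kylin.

```xml
<!-- Hypothetical skeleton of a {dialect}.xml. The PROPERTY element
     name/attributes are assumed for illustration; copy the exact
     syntax from the default.xml shipped with Kylin. -->
<DATASOURCE_DEF NAME="kylin" ID="greenplum">
    <PROPERTY NAME="sql.paging-type" VALUE="LIMIT_OFFSET"/>
    <PROPERTY NAME="sql.case-sensitive" VALUE="true"/>
</DATASOURCE_DEF>
```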
+
+
+* Function node:  
+Developers can define function implementations for the target data source dialect.  
+For example, if we want to support Greenplum as a data source, but Greenplum does not support functions such as *TIMESTAMPDIFF*, we can define in *greenplum.xml*
+
+``` 
+<FUNCTION_DEF ID="64" EXPRESSION="(CAST($1 AS DATE) - CAST($0 AS DATE))"/>
+```
+
+Contrast this with the configuration in *default.xml*
+
+``` 
+<FUNCTION_DEF ID="64" EXPRESSION="TIMESTAMPDIFF(day, $0, $1)"/>
+```
+
+*Data source SDK* converts a function defined in default into the target dialect's definition with the same function ID.
+
+* Type node:  
+Developers can define type implementations for the target data source dialect.
+Also taking Greenplum as an example, Greenplum supports *BIGINT* instead of *LONG*, so we can define in *greenplum.xml*
+
+``` 
+<TYPE_DEF ID="Long" EXPRESSION="BIGINT"/>
+```
+
+Contrast this with the configuration in *default.xml*
+
+``` 
+<TYPE_DEF ID="Long" EXPRESSION="LONG"/>
+```
+*Data source SDK* converts a type defined in default into the target dialect's definition with the same type ID.
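The function and type mappings above amount to a positional template rewrite: arguments captured from the default-dialect form (`$0`, `$1`, ...) are substituted into the target dialect's EXPRESSION. The standalone Java sketch below illustrates the idea; the class and method names are assumptions, not Kylin's actual internals.

```java
// Illustrative sketch only: class and method names are assumptions,
// not Kylin's actual internals. A FUNCTION_DEF/TYPE_DEF pair sharing
// an ID is applied by substituting the captured arguments $0, $1, ...
// into the target dialect's EXPRESSION template.
public class FunctionTemplateSketch {

    /** Substitute captured arguments into a target-dialect template. */
    static String render(String template, String... args) {
        String out = template;
        for (int i = 0; i < args.length; i++) {
            out = out.replace("$" + i, args[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        // default.xml:   TIMESTAMPDIFF(day, $0, $1)
        // greenplum.xml: (CAST($1 AS DATE) - CAST($0 AS DATE))
        String target = "(CAST($1 AS DATE) - CAST($0 AS DATE))";
        System.out.println(render(target, "o.order_date", "o.ship_date"));
        // (CAST(o.ship_date AS DATE) - CAST(o.order_date AS DATE))
    }
}
```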
+
+
+### Adaptor
+
+Adaptor provides a set of APIs, such as fetching metadata and data from the data source. 
+*Data source SDK* provides a default implementation; developers can create a new class that extends it and supply their own implementation.
+{% highlight Groff markup %}
+org.apache.kylin.sdk.datasource.adaptor.DefaultAdaptor
+{% endhighlight %}
+
+Adaptor also reserves a function *fixSql(String sql)*.  
+If, after conversion by the framework, the SQL still has compatibility issues with the target dialect, developers can implement this function to make final fixes to the SQL.
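As a rough illustration of the kind of last-mile rewrite *fixSql* can perform: in a real adaptor you would extend `org.apache.kylin.sdk.datasource.adaptor.DefaultAdaptor` and override *fixSql*; here the body is shown as a standalone method so it runs without Kylin on the classpath, and the backtick-to-double-quote rewrite is a made-up example of a dialect fix.

```java
// Illustrative sketch only: in a real adaptor you would extend
// org.apache.kylin.sdk.datasource.adaptor.DefaultAdaptor and override
// fixSql(String). The backtick rewrite below is a made-up example of a
// final touch-up a target dialect might need.
public class FixSqlSketch {

    /** Last-chance rewrite applied after the template-based conversion. */
    static String fixSql(String sql) {
        // Suppose the target dialect quotes identifiers with double
        // quotes rather than backticks.
        return sql.replace('`', '"');
    }

    public static void main(String[] args) {
        System.out.println(fixSql("SELECT `id` FROM `t`"));
        // SELECT "id" FROM "t"
    }
}
```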
+
+
+## How to enable data source for Kylin
+Some new configurations:  
+{% highlight Groff markup %}
+kylin.query.pushdown.runner-class-name=org.apache.kylin.query.pushdown.PushdownRunnerSDKImpl
+kylin.source.default=16
+kylin.source.jdbc.dialect={Dialect}
+kylin.source.jdbc.adaptor={Class name of Adaptor}
+kylin.source.jdbc.user={JDBC Connection Username}
+kylin.source.jdbc.pass={JDBC Connection Password}
+kylin.source.jdbc.connection-url={JDBC Connection String}
+kylin.source.jdbc.driver={JDBC Driver Class Name}
+{% endhighlight %}
+
+Take mysql as an example:
+{% highlight Groff markup %}
+kylin.query.pushdown.runner-class-name=org.apache.kylin.query.pushdown.PushdownRunnerSDKImpl
+kylin.source.default=16
+kylin.source.jdbc.dialect=mysql
+kylin.source.jdbc.adaptor=org.apache.kylin.sdk.datasource.adaptor.MysqlAdaptor
+kylin.source.jdbc.user={mysql username}
+kylin.source.jdbc.pass={mysql password}
+kylin.source.jdbc.connection-url=jdbc:mysql://{HOST_URL}:3306/{Database name}
+kylin.source.jdbc.driver=com.mysql.jdbc.Driver
+{% endhighlight %}
+
+Put the configuration file *{dialect}.xml* under the $KYLIN_HOME/conf/datasource directory.
+Package the new Adaptor into a jar file and put it under the $KYLIN_HOME/ext directory.
+
+Other configurations are identical to the former JDBC connection method; please refer to [setup_jdbc_datasource](/docs/tutorial/setup_jdbc_datasource.html)
+
diff --git a/website/_posts/blog/2019-01-16-introduce-data-source-sdk-v2.6.0.md b/website/_posts/blog/2019-01-16-introduce-data-source-sdk-v2.6.0.md
new file mode 100644
index 0000000..962d7c9
--- /dev/null
+++ b/website/_posts/blog/2019-01-16-introduce-data-source-sdk-v2.6.0.md
@@ -0,0 +1,37 @@
+---
+layout: post-blog
+title:  Introduce data source SDK
+date:   2019-01-16 20:00:00
+author: Youcheng Zhang
+categories: blog
+---
+
+## Data source SDK
+
+Apache Kylin already supports several data sources, such as Amazon Redshift and SQL Server, through JDBC. But we found that it takes a lot of effort to develop an implementation for a new source engine, covering metadata sync, cube build, and query pushdown. This is mainly because the SQL dialects and JDBC implementations of source engines differ considerably.
+  
+So since v2.6.0, Kylin provides a new data source SDK, which offers APIs to help developers handle these dialect differences and easily implement a new data source through JDBC.
+  
+With this SDK, users can achieve the following with a JDBC source:
+
+* Synchronize metadata and data from a JDBC source
+* Build cubes from a JDBC source
+* Push queries down to the JDBC source engine when no cube matches
+
+
+## Structure
+
+{:.center}
+![](/images/blog/data-source-sdk.png)
+ 
+When users want to synchronize metadata or fetch data from the data source, the request passes through the framework, and the framework finds the adaptor, which provides APIs for metadata and data.  
+ 
+To keep adaptors simple, for a push-down query the framework converts the SQL from ANSI SQL to the target data source dialect (including SQL functions and SQL types), and the adaptor only needs to provide a function *fixSql* to fix the SQL after conversion.  
+
+
+## How to develop  
+  
+Please follow this [doc](/development/datasource_sdk.html)
+
+
+
diff --git a/website/images/blog/data-source-sdk.png b/website/images/blog/data-source-sdk.png
new file mode 100644
index 0000000..b415bea
Binary files /dev/null and b/website/images/blog/data-source-sdk.png differ
