This is an automated email from the ASF dual-hosted git repository.

xxyu pushed a commit to branch document
in repository https://gitbox.apache.org/repos/asf/kylin.git
The following commit(s) were added to refs/heads/document by this push:
     new 2b6807a  Release Kylin 4.0.0
2b6807a is described below

commit 2b6807aded4f55adcedae745e7ed4f802783924a
Author: XiaoxiangYu <x...@apache.org>
AuthorDate: Mon Aug 30 17:29:53 2021 +0800

    Release Kylin 4.0.0
---
 website/_docs40/howto/howto_cleanup_storage.cn.md | 21 +--------
 website/_docs40/howto/howto_cleanup_storage.md    | 22 +--------
 website/_docs40/release_notes.md                  | 55 ++++++++++++++++++++++-
 website/download/index.cn.md                      | 11 ++---
 website/download/index.md                         | 13 +++---
 5 files changed, 69 insertions(+), 53 deletions(-)

diff --git a/website/_docs40/howto/howto_cleanup_storage.cn.md b/website/_docs40/howto/howto_cleanup_storage.cn.md
index ad23bbc..893abe6 100644
--- a/website/_docs40/howto/howto_cleanup_storage.cn.md
+++ b/website/_docs40/howto/howto_cleanup_storage.cn.md
@@ -5,23 +5,4 @@ categories: 帮助
 permalink: /cn/docs40/howto/howto_cleanup_storage.html
 ---
 
-Kylin 在构建 cube 期间会在 HDFS 上生成临时文件;除此之外,当清理/删除/合并 cube 时,一些 parquet 文件可能被遗留但是以后再也不会被查询;虽然 Kylin 已经开始做自动化的垃圾回收,但不一定能覆盖到所有的情况;你可以定期做离线的存储清理:
-
-可以被删除的文件包括:
-- 临时的任务文件
-`hdfs:///kylin/${metadata_url}/${project}/job_tmp`
-- 不会再被用到的segment的cuboid文件
-`hdfs:///kylin/${metadata_url}/${project}/${cube_name}/${non_used_segment} `
-
-步骤:
-1. 检查哪些资源可以清理,这一步不会删除任何东西:
-{% highlight Groff markup %}
-export KYLIN_HOME=/path/to/kylin_home
-${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false
-{% endhighlight %}
-请将这里的 (version) 替换为你安装的 Kylin jar 版本。
-2. 你可以抽查一两个资源来检查它们是否已经没有被引用了;然后加上“--delete true”选项进行清理。
-{% highlight Groff markup %}
-${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true
-{% endhighlight %}
-
+Wiki : https://cwiki.apache.org/confluence/display/KYLIN/How+to+clean+up+storage+in+Kylin+4
diff --git a/website/_docs40/howto/howto_cleanup_storage.md b/website/_docs40/howto/howto_cleanup_storage.md
index 675c17e..d8a947e 100644
--- a/website/_docs40/howto/howto_cleanup_storage.md
+++ b/website/_docs40/howto/howto_cleanup_storage.md
@@ -5,24 +5,4 @@ categories: howto
 permalink: /docs40/howto/howto_cleanup_storage.html
 ---
 
-Kylin will generate intermediate files in HDFS during the cube building; Besides, when purge/drop/merge cubes, some Parquet file may be left and will no longer be queried; Although Kylin has started to do some 
-automated garbage collection, it might not cover all cases; You can do an offline storage cleanup periodically:
-Which can be deleted:
-- temp job files
-`hdfs:///kylin/${metadata_url}/${project}/job_tmp`
-- none used segment cuboid files
-`hdfs:///kylin/${metadata_url}/${project}/${cube_name}/${non_used_segment}`
-
-Steps:
-1. Check which resources can be cleanup, this will not remove anything:
-{% highlight Groff markup %}
-export KYLIN_HOME=/path/to/kylin_home
-${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false
-{% endhighlight %}
-Here please replace (version) with the specific Kylin jar version in your installation;
-2. You can pickup 1 or 2 resources to check whether they're no longer be referred; Then add the "--delete true" option to start the cleanup:
-{% highlight Groff markup %}
-${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true
-{% endhighlight %}
-On finish, the temp job files and none used segment cuboid files should be dropped;
-
+Wiki : https://cwiki.apache.org/confluence/display/KYLIN/How+to+clean+up+storage+in+Kylin+4
diff --git a/website/_docs40/release_notes.md b/website/_docs40/release_notes.md
index 1586ca7..a618907 100644
--- a/website/_docs40/release_notes.md
+++ b/website/_docs40/release_notes.md
@@ -5,7 +5,7 @@ categories: gettingstarted
 permalink: /docs40/release_notes.html
 ---
 
-To download latest release, please visit: [http://kylin.apache.org/download/](http://kylin.apache.org/download/), 
+To download the latest release, please visit: [http://kylin.apache.org/download/](http://kylin.apache.org/download/), 
 there are source code package, binary package and installation guide avaliable.
 
 Any problem or issue, please report to Apache Kylin JIRA project: [https://issues.apache.org/jira/browse/KYLIN](https://issues.apache.org/jira/browse/KYLIN)
@@ -15,6 +15,59 @@ or send to Apache Kylin mailing list:
 * User relative: [u...@kylin.apache.org](mailto:u...@kylin.apache.org)
 * Development relative: [d...@kylin.apache.org](mailto:d...@kylin.apache.org)
 
+## v4.0.0 - 2021-08-30
+
+__New Feature__
+
+* [KYLIN-4498] - CubePlaner for Kylin on Parquet
+* [KYLIN-4895] - change spark deploy mode of kylin4.0 engine from local to cluster
+* [KYLIN-4905] - Support limit .. offset ... in spark query engine
+* [KYLIN-4925] - Use Spark 3 as build and query engine for Kylin 4
+* [KYLIN-4948] - Provide an API to allow users to adjust cuboids manually
+* [KYLIN-4966] - Refresh the existing segment according to the new cuboid list in kylin4
+* [KYLIN-5011] - Detect and scatter skewed data in dict encoding step
+* [KYLIN-5019] - Avoid building global dictionary from all data of fact table each time
+* [KYLIN-5059] - Fix error when using different HDFS cluster in cube building
+
+__Bug Fix__
+
+* [KYLIN-4729] - The hive table will be overwrited when add csv table with the same name
+* [KYLIN-4879] - The function of sql to remove comments is not perfect. In some cases, the sql query conditions used will be modified
+* [KYLIN-4887] - Segment pruner support string type partition col in spark query engine
+* [KYLIN-4889] - Query error when spark engine in local mode
+* [KYLIN-4935] - Existing JobOutput's extend info will be lost when it is updated
+* [KYLIN-4967] - Forbid to set 'spark.sql.adaptive.enabled' to true when building cube with Spark 2.X
+* [KYLIN-5013] - Write table_snapshot to wrong cluster in Kylin4.0
+* [KYLIN-5014] - Spark driver log is abnormal in yarn cluster mode
+* [KYLIN-5021] - FilePruner in Spark3 throws NPE when no partition columns in cube
+* [KYLIN-5040] - Cuboid should not be exactly matched when there is no group by time partition column and there are multiple segments in the query
+
+__Improvement__
+
+* [KYLIN-4554] - Validate "filter condition" on model saving
+* [KYLIN-4888] - Performance optimization of union query with spark engine
+* [KYLIN-4890] - Use numSlices = 1 to reduce task num when executing sparder canary
+* [KYLIN-4892] - Reduce the times of fetching files status from HDFS in FilePruner
+* [KYLIN-4893] - Optimize query performance when using shard by column
+* [KYLIN-4894] - Upgrade Apache Spark version to 2.4.7
+* [KYLIN-4897] - Add table snapshot and global dictionary cleaning in StorageCleanupJob
+* [KYLIN-4898] - Add automated test cases
+* [KYLIN-4903] - cache parent datasource to accelerate next layer's cuboid building
+* [KYLIN-4906] - support query/job server dynamic register and discovery in kylin4
+* [KYLIN-4908] - Segment pruner support integer partition col in spark query engine
+* [KYLIN-4910] - Return hostname as Sparder URL address when spark master is set to local
+* [KYLIN-4917] - Fix some problem of logger system in kylin4
+* [KYLIN-4923] - CubeMigration Tools support migrate meta from 2.x/3.x cluster to 4.0 cluster
+* [KYLIN-4926] - Optimize Global Dict building: replace operation 'mapPartitions.count()' with 'foreachPartitions'
+* [KYLIN-4927] - Forbid to use AE when building Global Dict
+* [KYLIN-4936] - Exactly aggregation can't transform to project
+* [KYLIN-4937] - Verify the uniqueness of the global dictionary after building global dictionary
+* [KYLIN-4944] - Upgrade CentOS version, Hadoop version and Spark version for Kylin Docker image
+* [KYLIN-4945] - Repartition encoded dataset to avoid data skew caused by a single column
+* [KYLIN-4980] - Support prunning segments from complex filter conditions
+* [KYLIN-5027] - Add the config of whether to build base cuboid in kylin4
+* [KYLIN-5031] - The last_build_job_id of segment is null when the semgent status is RUNNING or ERROR.
+
 ## v3.1.2 - 2021-04-26
 
 __New Feature__
diff --git a/website/download/index.cn.md b/website/download/index.cn.md
index cc0f771..5ba2d8c 100644
--- a/website/download/index.cn.md
+++ b/website/download/index.cn.md
@@ -5,12 +5,13 @@ title: 下载
 
 您可以按照这些[步骤](https://www.apache.org/info/verification.html) 并使用这些[KEYS](https://www.apache.org/dist/kylin/KEYS)来验证下载文件的有效性.
 
-#### v4.0.0-beta
-- 这是 4.0.0-alpha 版本后的一个主要版本,包含25个新功能以及改进和14个问题的修复。关于具体内容请查看发布说明。
+#### v4.0.0
+- 这是 Kylin 4 的第一个正式版本,包含32个新功能以及改进和10个问题的修复。关于具体内容请查看发布说明。
 - [发布说明](/docs/release_notes.html), [安装指南](https://cwiki.apache.org/confluence/display/KYLIN/Installation+Guide) and [升级指南](/docs40/howto/howto_upgrade.html)
-- 源码下载: [apache-kylin-4.0.0-beta-source-release.zip](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-source-release.zip) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-source-release.zip.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-source-release.zip.sha256)\]
-- Hadoop 2 和 Hadoop 3 二进制包 (请为 Kylin 4.x 使用指定版本的 Spark,版本为 Apache Spark 2.4.6, 而不是环境自带的 Spark):
-    - [apache-kylin-4.0.0-beta-bin.tar.gz](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-bin.tar.gz) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-bin.tar.gz.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-bin.tar.gz.sha256)\] (已经在 CDH 5.7, CDH 6.2, AWS EMR 5.31, AWS EMR 6.0.0, HDP 2.4 环境下验证, Hadoop3 和 EMR 环境需做额外配置,请[查看安装指南](https://cwiki.apache.org/ [...]
+- 源码下载: [apache-kylin-4.0.0-source-release.zip](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-source-release.zip) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-source-release.zip.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-source-release.zip.sha256)\]
+- 二进制包 (选择二进制包前请检查文档 [Hadoop 支持矩阵](https://cwiki.apache.org/confluence/display/KYLIN/Support+Hadoop+Version+Matrix+of+Kylin+4)):
+    - for Apache Spark 2.4.7 [apache-kylin-4.0.0-bin-spark2.tar.gz](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark2.tar.gz) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark2.tar.gz.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark2.tar.gz.sha256)\]
+    - for Apache Spark 3.1.1 [apache-kylin-4.0.0-bin-spark3.tar.gz](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark3.tar.gz) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark3.tar.gz.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark3.tar.gz.sha256)\]
 
 #### v3.1.2
 - 这是 3.1.2 版本后的一个bug-fix版本,包含40个问题的修复以及各种改进。关于具体内容请查看发布说明。
diff --git a/website/download/index.md b/website/download/index.md
index e55dab0..efa8bdc 100644
--- a/website/download/index.md
+++ b/website/download/index.md
@@ -6,13 +6,14 @@ permalink: /download/index.html
 
 You can verify the download by following these [procedures](https://www.apache.org/info/verification.html) and using these [KEYS](https://www.apache.org/dist/kylin/KEYS).
 
-#### v4.0.0-beta
-- This is a major release after 4.0.0-alpha, with 25 new features/improvements and 14 bug fixes. Check the release notes.
+#### v4.0.0
+- This is the first GA release for Kylin 4, with 32 new features/improvements and 10 bug fixes. Check the release notes.
 - [Release notes](/docs/release_notes.html), [installation guide](https://cwiki.apache.org/confluence/display/KYLIN/Installation+Guide) and [upgrade guide](https://cwiki.apache.org/confluence/display/KYLIN/How+to+upgrade)
-- Source download: [apache-kylin-4.0.0-beta-source-release.zip](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-source-release.zip) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-source-release.zip.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-source-release.zip.sha256)\]
-- Binary for Apache Hadoop 2 and Hadoop 3 download (Please use the specified version Spark for Kylin 4.X, the version should be Apache Spark 2.4.6, not the Spark that provided by the environment):
-    - [apache-kylin-4.0.0-beta-bin.tar.gz](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-bin.tar.gz) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-bin.tar.gz.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0-beta/apache-kylin-4.0.0-beta-bin.tar.gz.sha256)\] (Verified on CDH 5.7, CDH 6.2, AWS EMR 5.31, AWS EMR 6.0.0, HDP 2.4, Hadoop 3 and EMR environments require additional configu [...]
-
+- Source download: [apache-kylin-4.0.0-source-release.zip](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-source-release.zip) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-source-release.zip.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-source-release.zip.sha256)\]
+- Binary for the download (check this to see which binary you should choose [Hadoop Matrix supported](https://cwiki.apache.org/confluence/display/KYLIN/Support+Hadoop+Version+Matrix+of+Kylin+4)) :
+    - for Apache Spark 2.4.7 [apache-kylin-4.0.0-bin-spark2.tar.gz](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark2.tar.gz) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark2.tar.gz.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark2.tar.gz.sha256)\]
+    - for Apache Spark 3.1.1 [apache-kylin-4.0.0-bin-spark3.tar.gz](https://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark3.tar.gz) \[[asc](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark3.tar.gz.asc)\] \[[sha256](https://www.apache.org/dist/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin-spark3.tar.gz.sha256)\]
+
 #### v3.1.2
 - This is a bug-fix release after 3.1.0, with 40 bug fixes and enhancement. Check the release notes.
 - [Release notes](/docs/install/index.html) and [upgrade guide](/docs/howto/howto_upgrade.html)
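
Note on the download pages changed above: each artifact is published with an `asc` signature and a `sha256` checksum, and readers are asked to verify their download. The checksum half of that check could look roughly like the Python sketch below. It is an illustration only and is not part of this commit: the helper names `file_sha256` and `sha256_matches` are hypothetical, and it assumes the `.sha256` file holds the digest as one contiguous run of 64 hex characters (other layouts would need extra parsing). It does not check the `asc` signature, which requires GPG and the project KEYS file.

```python
import hashlib
import re
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare an artifact's digest with the one stored in a checksum file.

    Assumption: the checksum file contains the digest as a contiguous
    64-character hex string somewhere in its text.
    """
    text = checksum_file.read_text(encoding="utf-8", errors="replace")
    match = re.search(r"\b[0-9a-fA-F]{64}\b", text)
    if match is None:
        raise ValueError(f"no SHA-256 digest found in {checksum_file}")
    return match.group(0).lower() == file_sha256(artifact)
```

With both files downloaded locally, `sha256_matches(Path("apache-kylin-4.0.0-bin-spark3.tar.gz"), Path("apache-kylin-4.0.0-bin-spark3.tar.gz.sha256"))` returns `True` only when the computed and published digests agree.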