Repository: kylin
Updated Branches:
  refs/heads/1.x-staging e16ffb7ae -> 75b6479fe


KYLIN-1295 Documents on some common technical concepts


Project: http://git-wip-us.apache.org/repos/asf/kylin/repo
Commit: http://git-wip-us.apache.org/repos/asf/kylin/commit/75b6479f
Tree: http://git-wip-us.apache.org/repos/asf/kylin/tree/75b6479f
Diff: http://git-wip-us.apache.org/repos/asf/kylin/diff/75b6479f

Branch: refs/heads/1.x-staging
Commit: 75b6479fecbbf8872351cbd2dcfa5dc60f5a0e3e
Parents: e16ffb7
Author: lidongsjtu <don...@ebay.com>
Authored: Tue Jan 12 10:34:02 2016 +0800
Committer: lidongsjtu <don...@ebay.com>
Committed: Tue Jan 12 10:34:16 2016 +0800

----------------------------------------------------------------------
 website/_docs/gettingstarted/concepts.md        |  65 +++++++++++++++++++
 website/_docs/gettingstarted/terminology.md     |   2 +-
 .../images/docs/concepts/AggregationGroup.png   | Bin 0 -> 105363 bytes
 website/images/docs/concepts/CubeAction.png     | Bin 0 -> 110592 bytes
 website/images/docs/concepts/CubeDesc.png       | Bin 0 -> 190025 bytes
 website/images/docs/concepts/CubeInstance.png   | Bin 0 -> 285222 bytes
 website/images/docs/concepts/CubeSegment.png    | Bin 0 -> 96393 bytes
 website/images/docs/concepts/DataModel.png      | Bin 0 -> 193661 bytes
 website/images/docs/concepts/DataSource.png     | Bin 0 -> 180295 bytes
 website/images/docs/concepts/Dimension.png      | Bin 0 -> 190495 bytes
 website/images/docs/concepts/Job.png            | Bin 0 -> 299095 bytes
 website/images/docs/concepts/JobAction.png      | Bin 0 -> 53369 bytes
 website/images/docs/concepts/Measure.png        | Bin 0 -> 160824 bytes
 website/images/docs/concepts/Partition.png      | Bin 0 -> 148494 bytes
 14 files changed, 66 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/_docs/gettingstarted/concepts.md
----------------------------------------------------------------------
diff --git a/website/_docs/gettingstarted/concepts.md b/website/_docs/gettingstarted/concepts.md
new file mode 100644
index 0000000..081ec54
--- /dev/null
+++ b/website/_docs/gettingstarted/concepts.md
@@ -0,0 +1,65 @@
+---
+layout: docs
+title:  "Technical Concepts"
+categories: gettingstarted
+permalink: /docs/gettingstarted/concepts.html
+version: v1.2
+since: v1.2
+---
+ 
+Here are some basic technical concepts used in Apache Kylin; please check them for your reference.
+For domain terminology, please refer to [Terminology](terminology.md).
+
+## CUBE
+* __Table__ - This is the definition of a Hive table used as a source of cubes; it must be synced before cubes are built.
+![](/images/docs/concepts/DataSource.png)
+
+* __Data Model__ - This describes a [STAR SCHEMA](https://en.wikipedia.org/wiki/Star_schema) data model, which defines the fact and lookup tables and the filter condition.
+![](/images/docs/concepts/DataModel.png)
+
+* __Cube Descriptor__ - This describes the definition and settings of a cube instance: which data model to use, which dimensions and measures to include, how to partition into segments, how to handle auto-merge, and so on.
+![](/images/docs/concepts/CubeDesc.png)
+
+* __Cube Instance__ - This is an instance of a cube, built from one cube descriptor and consisting of one or more cube segments according to the partition settings.
+![](/images/docs/concepts/CubeInstance.png)
+
+* __Partition__ - Users can define a DATE/STRING column as the partition column in the cube descriptor, which separates one cube into several segments covering different date periods.
+![](/images/docs/concepts/Partition.png)
+
+* __Cube Segment__ - This is the actual carrier of cube data, and it maps to an HTable in HBase. One build job creates one new segment for the cube instance. When data changes within a specific period, the related segments can be refreshed to avoid rebuilding the whole cube.
+![](/images/docs/concepts/CubeSegment.png)
+
+* __Aggregation Group__ - Each aggregation group is a subset of dimensions, and cuboids are built from the combinations within it. It aims to prune cuboids for optimization.
+![](/images/docs/concepts/AggregationGroup.png)
+
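As a rough illustration of why partitioning into segments avoids rebuilding the whole cube, the sketch below picks out only the segments whose date range overlaps a changed period. The `Segment` class, the half-open `[start, end)` ranges and the `segments_to_refresh` helper are hypothetical names used for illustration; they are not Kylin classes or APIs.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a cube segment covering [start, end)."""
    start: date
    end: date


def segments_to_refresh(segments, changed_start, changed_end):
    """Return the segments whose range overlaps [changed_start, changed_end)."""
    return [s for s in segments if s.start < changed_end and changed_start < s.end]


# Three monthly segments of one cube instance (assumed layout).
segments = [
    Segment(date(2016, 1, 1), date(2016, 2, 1)),
    Segment(date(2016, 2, 1), date(2016, 3, 1)),
    Segment(date(2016, 3, 1), date(2016, 4, 1)),
]

# Source data changed for 10-12 February, so only the February segment is affected.
affected = segments_to_refresh(segments, date(2016, 2, 10), date(2016, 2, 13))
print(affected)
```

Under these assumptions, a change confined to one date period touches one segment, and the other segments stay as they are.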
+## DIMENSION & MEASURE
+* __Mandatory__ - This dimension type is used for cuboid pruning: if a dimension is specified as “mandatory”, the combinations that do not contain it are pruned.
+* __Hierarchy__ - This dimension type is used for cuboid pruning: if dimensions A, B and C form a “hierarchy” relation, only combinations with A, AB or ABC are kept.
+* __Derived__ - On lookup tables, some dimensions can be derived from the table's PK, so there is a specific mapping between them and the FK on the fact table. Such dimensions are DERIVED and do not participate in cuboid generation.
+![](/images/docs/concepts/Dimension.png)
+
+* __Count Distinct(HyperLogLog)__ - Immediate COUNT DISTINCT is hard to calculate, so an approximate algorithm, [HyperLogLog](https://en.wikipedia.org/wiki/HyperLogLog), is introduced to keep the error rate at a low level.
+* __Count Distinct(Precise)__ - Precise COUNT DISTINCT is pre-calculated based on RoaringBitmap; currently only int and bigint are supported.
+* __Top N__ - (To be released in 2.x) With this measure type, users can easily get a specified number of top sellers, buyers, etc.
+![](/images/docs/concepts/Measure.png)
+
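The mandatory and hierarchy rules above can be sketched as a filter over all dimension combinations. This is a minimal illustration, not Kylin's cuboid scheduler: the function names are hypothetical, and it assumes that a combination containing none of the hierarchy dimensions is still allowed.

```python
from itertools import combinations


def all_cuboids(dimensions):
    """Every non-empty combination of dimensions (2^n - 1 of them)."""
    return [
        frozenset(combo)
        for size in range(1, len(dimensions) + 1)
        for combo in combinations(dimensions, size)
    ]


def keep_cuboid(cuboid, mandatory=(), hierarchy=()):
    """Apply the mandatory and hierarchy rules to one cuboid."""
    # Mandatory: every mandatory dimension must be present.
    if not set(mandatory) <= cuboid:
        return False
    # Hierarchy: the members present must form a prefix (A, AB, ABC),
    # so once a level is missing, no deeper level may appear.
    present = [dim in cuboid for dim in hierarchy]
    return all(earlier or not later for earlier, later in zip(present, present[1:]))


dims = ["A", "B", "C", "D"]
full = all_cuboids(dims)
with_hierarchy = [c for c in full if keep_cuboid(c, hierarchy=("A", "B", "C"))]
with_both = [c for c in full if keep_cuboid(c, mandatory=("D",), hierarchy=("A", "B", "C"))]
print(len(full), len(with_hierarchy), len(with_both))
```

With four dimensions there are 15 candidate combinations; the hierarchy on A, B, C leaves 7, and additionally marking D as mandatory leaves 4.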
+## CUBE ACTIONS
+* __BUILD__ - Given an interval of the partition column, this action builds a new cube segment.
+* __REFRESH__ - This action rebuilds the cube segment for a given partition period; it is used when the source table data has increased.
+* __MERGE__ - This action merges multiple continuous cube segments into a single one. It can be automated with the auto-merge settings in the cube descriptor.
+* __PURGE__ - This action clears the segments under a cube instance. It only updates metadata and does not delete cube data from HBase.
+![](/images/docs/concepts/CubeAction.png)
+
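The "continuous" requirement of MERGE can be illustrated with a small sketch that joins adjacent date ranges and rejects a set with gaps. The `(start, end)` tuples and the `merge_segments` helper are hypothetical and only model the idea; they do not reflect how Kylin stores or merges segments internally.

```python
from datetime import date


def merge_segments(ranges):
    """Merge (start, end) date ranges into one; raise if they are not continuous."""
    if not ranges:
        raise ValueError("nothing to merge")
    ordered = sorted(ranges)
    # Each range must begin exactly where the previous one ends.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if prev_end != next_start:
            raise ValueError("segments are not continuous")
    return (ordered[0][0], ordered[-1][1])


january = (date(2016, 1, 1), date(2016, 2, 1))
february = (date(2016, 2, 1), date(2016, 3, 1))
april = (date(2016, 4, 1), date(2016, 5, 1))

merged = merge_segments([february, january])
print(merged)
```

Passing `january` and `april` together raises `ValueError` in this sketch, because the March period between them is missing.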
+## JOB STATUS
+* __NEW__ - This denotes that a job has just been created.
+* __PENDING__ - This denotes that a job has been paused by the job scheduler and is waiting for resources.
+* __RUNNING__ - This denotes that a job is in progress.
+* __FINISHED__ - This denotes that a job has finished successfully.
+* __ERROR__ - This denotes that a job has been aborted with errors.
+* __DISCARDED__ - This denotes that a job has been cancelled by the end user.
+![](/images/docs/concepts/Job.png)
+
+## JOB ACTION
+* __RESUME__ - Once a job is in ERROR status, this action tries to resume it from the latest successful point.
+* __DISCARD__ - Whatever the status of a job, the user can end it and release its resources with the DISCARD action.
+![](/images/docs/concepts/JobAction.png)
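
The two job actions can be read as transitions over the job statuses listed earlier. The sketch below is a simplified model, not Kylin's job engine: `apply_action` is a hypothetical helper, and the status a resumed job moves to (RUNNING here) is an assumption, since the text only says the job is restored from its latest successful point.

```python
from enum import Enum


class JobStatus(Enum):
    NEW = "NEW"
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    ERROR = "ERROR"
    DISCARDED = "DISCARDED"


def apply_action(status, action):
    """Return the status a job would have after a user action (simplified model)."""
    if action == "RESUME":
        # RESUME only applies to jobs in ERROR status.
        if status is not JobStatus.ERROR:
            raise ValueError("RESUME requires a job in ERROR status")
        return JobStatus.RUNNING  # assumed target status
    if action == "DISCARD":
        # DISCARD is accepted whatever the current status is.
        return JobStatus.DISCARDED
    raise ValueError(f"unknown action: {action}")


print(apply_action(JobStatus.ERROR, "RESUME"))
print(apply_action(JobStatus.PENDING, "DISCARD"))
```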
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/_docs/gettingstarted/terminology.md
----------------------------------------------------------------------
diff --git a/website/_docs/gettingstarted/terminology.md b/website/_docs/gettingstarted/terminology.md
index ac4a19d..0f9e669 100644
--- a/website/_docs/gettingstarted/terminology.md
+++ b/website/_docs/gettingstarted/terminology.md
@@ -8,7 +8,7 @@ since: v0.5.x
 ---
  
 
-Here are some terms we are using in Apache Kylin, please check them for your 
reference.   
+Here are some domain terms we are using in Apache Kylin, please check them for 
your reference.   
 They are basic knowledge of Apache Kylin which also will help to well understand such concept, term, knowledge, theory and others about Data Warehouse, Business Intelligence for analytics. 
 
 * __Data Warehouse__: a data warehouse (DW or DWH), also known as an 
enterprise data warehouse (EDW), is a system used for reporting and data 
analysis, [wikipedia](https://en.wikipedia.org/wiki/Data_warehouse)

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/AggregationGroup.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/AggregationGroup.png b/website/images/docs/concepts/AggregationGroup.png
new file mode 100644
index 0000000..0e563fc
Binary files /dev/null and b/website/images/docs/concepts/AggregationGroup.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/CubeAction.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/CubeAction.png b/website/images/docs/concepts/CubeAction.png
new file mode 100644
index 0000000..bbc6ef5
Binary files /dev/null and b/website/images/docs/concepts/CubeAction.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/CubeDesc.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/CubeDesc.png b/website/images/docs/concepts/CubeDesc.png
new file mode 100644
index 0000000..a0736a4
Binary files /dev/null and b/website/images/docs/concepts/CubeDesc.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/CubeInstance.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/CubeInstance.png b/website/images/docs/concepts/CubeInstance.png
new file mode 100644
index 0000000..7748afa
Binary files /dev/null and b/website/images/docs/concepts/CubeInstance.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/CubeSegment.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/CubeSegment.png b/website/images/docs/concepts/CubeSegment.png
new file mode 100644
index 0000000..6f57720
Binary files /dev/null and b/website/images/docs/concepts/CubeSegment.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/DataModel.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/DataModel.png b/website/images/docs/concepts/DataModel.png
new file mode 100644
index 0000000..dd959f5
Binary files /dev/null and b/website/images/docs/concepts/DataModel.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/DataSource.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/DataSource.png b/website/images/docs/concepts/DataSource.png
new file mode 100644
index 0000000..1933fa3
Binary files /dev/null and b/website/images/docs/concepts/DataSource.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/Dimension.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/Dimension.png b/website/images/docs/concepts/Dimension.png
new file mode 100644
index 0000000..65e5810
Binary files /dev/null and b/website/images/docs/concepts/Dimension.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/Job.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/Job.png b/website/images/docs/concepts/Job.png
new file mode 100644
index 0000000..a790239
Binary files /dev/null and b/website/images/docs/concepts/Job.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/JobAction.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/JobAction.png b/website/images/docs/concepts/JobAction.png
new file mode 100644
index 0000000..1ec370b
Binary files /dev/null and b/website/images/docs/concepts/JobAction.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/Measure.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/Measure.png b/website/images/docs/concepts/Measure.png
new file mode 100644
index 0000000..34542f6
Binary files /dev/null and b/website/images/docs/concepts/Measure.png differ

http://git-wip-us.apache.org/repos/asf/kylin/blob/75b6479f/website/images/docs/concepts/Partition.png
----------------------------------------------------------------------
diff --git a/website/images/docs/concepts/Partition.png b/website/images/docs/concepts/Partition.png
new file mode 100644
index 0000000..636eaed
Binary files /dev/null and b/website/images/docs/concepts/Partition.png differ
