http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/howto/howto_use_restapi.md ---------------------------------------------------------------------- diff --git a/website/_docs20/howto/howto_use_restapi.md b/website/_docs20/howto/howto_use_restapi.md new file mode 100644 index 0000000..58ec55b --- /dev/null +++ b/website/_docs20/howto/howto_use_restapi.md @@ -0,0 +1,1113 @@ +--- +layout: docs20 +title: Use RESTful API +categories: howto +permalink: /docs20/howto/howto_use_restapi.html +since: v0.7.1 +--- + +This page lists the major RESTful APIs provided by Kylin. + +* Query + * [Authentication](#authentication) + * [Query](#query) + * [List queryable tables](#list-queryable-tables) +* CUBE + * [List cubes](#list-cubes) + * [Get cube](#get-cube) + * [Get cube descriptor (dimension, measure info, etc)](#get-cube-descriptor) + * [Get data model (fact and lookup table info)](#get-data-model) + * [Build cube](#build-cube) + * [Disable cube](#disable-cube) + * [Purge cube](#purge-cube) + * [Enable cube](#enable-cube) +* JOB + * [Resume job](#resume-job) + * [Pause job](#pause-job) + * [Discard job](#discard-job) + * [Get job status](#get-job-status) + * [Get job step output](#get-job-step-output) +* Metadata + * [Get Hive Table](#get-hive-table) + * [Get Hive Table (Extend Info)](#get-hive-table-extend-info) + * [Get Hive Tables](#get-hive-tables) + * [Load Hive Tables](#load-hive-tables) +* Cache + * [Wipe cache](#wipe-cache) +* Streaming + * [Initiate cube start position](#initiate-cube-start-position) + * [Build stream cube](#build-stream-cube) + * [Check segment holes](#check-segment-holes) + * [Fill segment holes](#fill-segment-holes) + +## Authentication +`POST /kylin/api/user/authentication` + +#### Request Header +Authorization data encoded by basic auth is needed in the header, such as: +Authorization:Basic {data} + +#### Response Body +* userDetails - Defined authorities and status of current user. 
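The `{data}` part of the `Authorization` header is the Base64 encoding of `username:password`, as in standard HTTP basic authentication. The following is a minimal sketch for Node.js, not part of Kylin itself: `basicAuthHeader` is a hypothetical helper name, and `ADMIN`/`KYLIN` are only the sample credentials that appear in the curl examples on this page.

```javascript
// Hypothetical helper: builds the value of the Authorization header
// for HTTP basic authentication (Base64 of "username:password").
function basicAuthHeader(username, password) {
  const token = Buffer.from(username + ":" + password, "utf8").toString("base64");
  return "Basic " + token;
}

// Headers to send with POST /kylin/api/user/authentication.
const authHeaders = {
  "Authorization": basicAuthHeader("ADMIN", "KYLIN"),
  "Content-Type": "application/json"
};
console.log(authHeaders.Authorization);
```

The same header value can be reused for any other endpoint that accepts basic authentication instead of the session cookie.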
+ +#### Response Sample + +```sh +{ + "userDetails":{ + "password":null, + "username":"sample", + "authorities":[ + { + "authority":"ROLE_ANALYST" + }, + { + "authority":"ROLE_MODELER" + } + ], + "accountNonExpired":true, + "accountNonLocked":true, + "credentialsNonExpired":true, + "enabled":true + } +} +``` + +#### Curl Example + +``` +curl -c /path/to/cookiefile.txt -X POST -H "Authorization: Basic XXXXXXXXX" -H 'Content-Type: application/json' http://<host>:<port>/kylin/api/user/authentication +``` + +If the login succeeds, the JSESSIONID is saved into the cookie file. Attach the cookie to subsequent HTTP requests, for example: + +``` +curl -b /path/to/cookiefile.txt -X PUT -H 'Content-Type: application/json' -d '{"startTime":1423526400000, "endTime":1423526400, "buildType":"BUILD"}' http://<host>:<port>/kylin/api/cubes/your_cube/build +``` + +Alternatively, you can provide the username and password with the `--user` option in each curl call. Note that this risks leaking the password through the shell history: + + +``` +curl -X PUT --user ADMIN:KYLIN -H "Content-Type: application/json;charset=utf-8" -d '{ "startTime": 820454400000, "endTime": 821318400000, "buildType": "BUILD"}' http://localhost:7070/kylin/api/cubes/kylin_sales/build +``` + +*** + +## Query +`POST /kylin/api/query` + +#### Request Body +* sql - `required` `string` The text of the SQL statement. +* offset - `optional` `int` Query offset. If an offset is set in the SQL, this value is ignored. +* limit - `optional` `int` Query limit. If a limit is set in the SQL, this value is ignored. +* acceptPartial - `optional` `bool` Whether to accept a partial result. Defaults to "false". Set to "false" for production use. +* project - `optional` `string` Project in which to run the query. Default value is 'DEFAULT'. 
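The request body is a JSON object made of the fields listed above. As a rough sketch for Node.js, `buildQueryRequest` below is a hypothetical helper that only assembles the request; the fallback values for `offset` and `limit` are illustrative (taken from the request sample on this page), and sending the request and adding the `Authorization` header are left to your HTTP client.

```javascript
// Hypothetical helper: assembles the pieces of a POST /kylin/api/query call.
// Only `sql` is required; the other fields fall back to illustrative values.
function buildQueryRequest(baseUrl, sql, options) {
  const opts = options || {};
  const body = {
    sql: sql,
    offset: opts.offset === undefined ? 0 : opts.offset,
    limit: opts.limit === undefined ? 50000 : opts.limit,
    acceptPartial: opts.acceptPartial === true,
    project: opts.project || "DEFAULT"
  };
  return {
    url: baseUrl + "/kylin/api/query",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body)
  };
}

const queryRequest = buildQueryRequest(
  "http://localhost:7070",
  "select count(*) from TEST_KYLIN_FACT",
  { project: "learn_kylin" }
);
console.log(queryRequest.body);
```

Keeping `acceptPartial` at `false` unless explicitly requested follows the recommendation above for production use.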
+ +#### Request Sample + +```sh +{ + "sql":"select * from TEST_KYLIN_FACT", + "offset":0, + "limit":50000, + "acceptPartial":false, + "project":"DEFAULT" +} +``` + +#### Curl Example + +``` +curl -X POST -H "Authorization: Basic XXXXXXXXX" -H "Content-Type: application/json" -d '{ "sql":"select count(*) from TEST_KYLIN_FACT", "project":"learn_kylin" }' http://localhost:7070/kylin/api/query +``` + +#### Response Body +* columnMetas - Column metadata information of result set. +* results - Data set of result. +* cube - Cube used for this query. +* affectedRowCount - Count of affected row by this sql statement. +* isException - Whether this response is an exception. +* ExceptionMessage - Message content of the exception. +* Duration - Time cost of this query +* Partial - Whether the response is a partial result or not. Decided by `acceptPartial` of request. + +#### Response Sample + +```sh +{ + "columnMetas":[ + { + "isNullable":1, + "displaySize":0, + "label":"CAL_DT", + "name":"CAL_DT", + "schemaName":null, + "catelogName":null, + "tableName":null, + "precision":0, + "scale":0, + "columnType":91, + "columnTypeName":"DATE", + "readOnly":true, + "writable":false, + "caseSensitive":true, + "searchable":false, + "currency":false, + "signed":true, + "autoIncrement":false, + "definitelyWritable":false + }, + { + "isNullable":1, + "displaySize":10, + "label":"LEAF_CATEG_ID", + "name":"LEAF_CATEG_ID", + "schemaName":null, + "catelogName":null, + "tableName":null, + "precision":10, + "scale":0, + "columnType":4, + "columnTypeName":"INTEGER", + "readOnly":true, + "writable":false, + "caseSensitive":true, + "searchable":false, + "currency":false, + "signed":true, + "autoIncrement":false, + "definitelyWritable":false + } + ], + "results":[ + [ + "2013-08-07", + "32996", + "15", + "15", + "Auction", + "10000000", + "49.048952730908745", + "49.048952730908745", + "49.048952730908745", + "1" + ], + [ + "2013-08-07", + "43398", + "0", + "14", + "ABIN", + "10000633", + 
"85.78317064220418", + "85.78317064220418", + "85.78317064220418", + "1" + ] + ], + "cube":"test_kylin_cube_with_slr_desc", + "affectedRowCount":0, + "isException":false, + "exceptionMessage":null, + "duration":3451, + "partial":false +} +``` + + +## List queryable tables +`GET /kylin/api/tables_and_columns` + +#### Request Parameters +* project - `required` `string` The project to load tables + +#### Response Sample +```sh +[ + { + "columns":[ + { + "table_NAME":"TEST_CAL_DT", + "table_SCHEM":"EDW", + "column_NAME":"CAL_DT", + "data_TYPE":91, + "nullable":1, + "column_SIZE":-1, + "buffer_LENGTH":-1, + "decimal_DIGITS":0, + "num_PREC_RADIX":10, + "column_DEF":null, + "sql_DATA_TYPE":-1, + "sql_DATETIME_SUB":-1, + "char_OCTET_LENGTH":-1, + "ordinal_POSITION":1, + "is_NULLABLE":"YES", + "scope_CATLOG":null, + "scope_SCHEMA":null, + "scope_TABLE":null, + "source_DATA_TYPE":-1, + "iS_AUTOINCREMENT":null, + "table_CAT":"defaultCatalog", + "remarks":null, + "type_NAME":"DATE" + }, + { + "table_NAME":"TEST_CAL_DT", + "table_SCHEM":"EDW", + "column_NAME":"WEEK_BEG_DT", + "data_TYPE":91, + "nullable":1, + "column_SIZE":-1, + "buffer_LENGTH":-1, + "decimal_DIGITS":0, + "num_PREC_RADIX":10, + "column_DEF":null, + "sql_DATA_TYPE":-1, + "sql_DATETIME_SUB":-1, + "char_OCTET_LENGTH":-1, + "ordinal_POSITION":2, + "is_NULLABLE":"YES", + "scope_CATLOG":null, + "scope_SCHEMA":null, + "scope_TABLE":null, + "source_DATA_TYPE":-1, + "iS_AUTOINCREMENT":null, + "table_CAT":"defaultCatalog", + "remarks":null, + "type_NAME":"DATE" + } + ], + "table_NAME":"TEST_CAL_DT", + "table_SCHEM":"EDW", + "ref_GENERATION":null, + "self_REFERENCING_COL_NAME":null, + "type_SCHEM":null, + "table_TYPE":"TABLE", + "table_CAT":"defaultCatalog", + "remarks":null, + "type_CAT":null, + "type_NAME":null + } +] +``` + +*** + +## List cubes +`GET /kylin/api/cubes` + +#### Request Parameters +* offset - `required` `int` Offset used by pagination +* limit - `required` `int ` Cubes per page. 
+* cubeName - `optional` `string` Keyword for cube names. To find cubes whose name contains this keyword. +* projectName - `optional` `string` Project name. + +#### Response Sample +```sh +[ + { + "uuid":"1eaca32a-a33e-4b69-83dd-0bb8b1f8c53b", + "last_modified":1407831634847, + "name":"test_kylin_cube_with_slr_empty", + "owner":null, + "version":null, + "descriptor":"test_kylin_cube_with_slr_desc", + "cost":50, + "status":"DISABLED", + "segments":[ + ], + "create_time":null, + "source_records_count":0, + "source_records_size":0, + "size_kb":0 + } +] +``` + +## Get cube +`GET /kylin/api/cubes/{cubeName}` + +#### Path Variable +* cubeName - `required` `string` Cube name to find. + +## Get cube descriptor +`GET /kylin/api/cube_desc/{cubeName}` +Get descriptor for specified cube instance. + +#### Path Variable +* cubeName - `required` `string` Cube name. + +#### Response Sample +```sh +[ + { + "uuid": "a24ca905-1fc6-4f67-985c-38fa5aeafd92", + "name": "test_kylin_cube_with_slr_desc", + "description": null, + "dimensions": [ + { + "id": 0, + "name": "CAL_DT", + "table": "EDW.TEST_CAL_DT", + "column": null, + "derived": [ + "WEEK_BEG_DT" + ], + "hierarchy": false + }, + { + "id": 1, + "name": "CATEGORY", + "table": "DEFAULT.TEST_CATEGORY_GROUPINGS", + "column": null, + "derived": [ + "USER_DEFINED_FIELD1", + "USER_DEFINED_FIELD3", + "UPD_DATE", + "UPD_USER" + ], + "hierarchy": false + }, + { + "id": 2, + "name": "CATEGORY_HIERARCHY", + "table": "DEFAULT.TEST_CATEGORY_GROUPINGS", + "column": [ + "META_CATEG_NAME", + "CATEG_LVL2_NAME", + "CATEG_LVL3_NAME" + ], + "derived": null, + "hierarchy": true + }, + { + "id": 3, + "name": "LSTG_FORMAT_NAME", + "table": "DEFAULT.TEST_KYLIN_FACT", + "column": [ + "LSTG_FORMAT_NAME" + ], + "derived": null, + "hierarchy": false + }, + { + "id": 4, + "name": "SITE_ID", + "table": "EDW.TEST_SITES", + "column": null, + "derived": [ + "SITE_NAME", + "CRE_USER" + ], + "hierarchy": false + }, + { + "id": 5, + "name": "SELLER_TYPE_CD", + 
"table": "EDW.TEST_SELLER_TYPE_DIM", + "column": null, + "derived": [ + "SELLER_TYPE_DESC" + ], + "hierarchy": false + }, + { + "id": 6, + "name": "SELLER_ID", + "table": "DEFAULT.TEST_KYLIN_FACT", + "column": [ + "SELLER_ID" + ], + "derived": null, + "hierarchy": false + } + ], + "measures": [ + { + "id": 1, + "name": "GMV_SUM", + "function": { + "expression": "SUM", + "parameter": { + "type": "column", + "value": "PRICE", + "next_parameter": null + }, + "returntype": "decimal(19,4)" + }, + "dependent_measure_ref": null + }, + { + "id": 2, + "name": "GMV_MIN", + "function": { + "expression": "MIN", + "parameter": { + "type": "column", + "value": "PRICE", + "next_parameter": null + }, + "returntype": "decimal(19,4)" + }, + "dependent_measure_ref": null + }, + { + "id": 3, + "name": "GMV_MAX", + "function": { + "expression": "MAX", + "parameter": { + "type": "column", + "value": "PRICE", + "next_parameter": null + }, + "returntype": "decimal(19,4)" + }, + "dependent_measure_ref": null + }, + { + "id": 4, + "name": "TRANS_CNT", + "function": { + "expression": "COUNT", + "parameter": { + "type": "constant", + "value": "1", + "next_parameter": null + }, + "returntype": "bigint" + }, + "dependent_measure_ref": null + }, + { + "id": 5, + "name": "ITEM_COUNT_SUM", + "function": { + "expression": "SUM", + "parameter": { + "type": "column", + "value": "ITEM_COUNT", + "next_parameter": null + }, + "returntype": "bigint" + }, + "dependent_measure_ref": null + } + ], + "rowkey": { + "rowkey_columns": [ + { + "column": "SELLER_ID", + "length": 18, + "dictionary": null, + "mandatory": true + }, + { + "column": "CAL_DT", + "length": 0, + "dictionary": "true", + "mandatory": false + }, + { + "column": "LEAF_CATEG_ID", + "length": 0, + "dictionary": "true", + "mandatory": false + }, + { + "column": "META_CATEG_NAME", + "length": 0, + "dictionary": "true", + "mandatory": false + }, + { + "column": "CATEG_LVL2_NAME", + "length": 0, + "dictionary": "true", + "mandatory": false + }, + 
{ + "column": "CATEG_LVL3_NAME", + "length": 0, + "dictionary": "true", + "mandatory": false + }, + { + "column": "LSTG_FORMAT_NAME", + "length": 12, + "dictionary": null, + "mandatory": false + }, + { + "column": "LSTG_SITE_ID", + "length": 0, + "dictionary": "true", + "mandatory": false + }, + { + "column": "SLR_SEGMENT_CD", + "length": 0, + "dictionary": "true", + "mandatory": false + } + ], + "aggregation_groups": [ + [ + "LEAF_CATEG_ID", + "META_CATEG_NAME", + "CATEG_LVL2_NAME", + "CATEG_LVL3_NAME", + "CAL_DT" + ] + ] + }, + "signature": "lsLAl2jL62ZApmOLZqWU3g==", + "last_modified": 1445850327000, + "model_name": "test_kylin_with_slr_model_desc", + "null_string": null, + "hbase_mapping": { + "column_family": [ + { + "name": "F1", + "columns": [ + { + "qualifier": "M", + "measure_refs": [ + "GMV_SUM", + "GMV_MIN", + "GMV_MAX", + "TRANS_CNT", + "ITEM_COUNT_SUM" + ] + } + ] + } + ] + }, + "notify_list": null, + "auto_merge_time_ranges": null, + "retention_range": 0 + } +] +``` + +## Get data model +`GET /kylin/api/model/{modelName}` + +#### Path Variable +* modelName - `required` `string` Data model name, by default it should be the same with cube name. 
+ +#### Response Sample +```sh +{ + "uuid": "ff527b94-f860-44c3-8452-93b17774c647", + "name": "test_kylin_with_slr_model_desc", + "lookups": [ + { + "table": "EDW.TEST_CAL_DT", + "join": { + "type": "inner", + "primary_key": [ + "CAL_DT" + ], + "foreign_key": [ + "CAL_DT" + ] + } + }, + { + "table": "DEFAULT.TEST_CATEGORY_GROUPINGS", + "join": { + "type": "inner", + "primary_key": [ + "LEAF_CATEG_ID", + "SITE_ID" + ], + "foreign_key": [ + "LEAF_CATEG_ID", + "LSTG_SITE_ID" + ] + } + } + ], + "capacity": "MEDIUM", + "last_modified": 1442372116000, + "fact_table": "DEFAULT.TEST_KYLIN_FACT", + "filter_condition": null, + "partition_desc": { + "partition_date_column": "DEFAULT.TEST_KYLIN_FACT.CAL_DT", + "partition_date_start": 0, + "partition_date_format": "yyyy-MM-dd", + "partition_type": "APPEND", + "partition_condition_builder": "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder" + } +} +``` + +## Build cube +`PUT /kylin/api/cubes/{cubeName}/build` + +#### Path Variable +* cubeName - `required` `string` Cube name. + +#### Request Body +* startTime - `required` `long` Start timestamp of data to build, e.g. 
1388563200000 for 2014-01-01 +* endTime - `required` `long` End timestamp of data to build +* buildType - `required` `string` Supported build types: 'BUILD', 'MERGE', 'REFRESH' + +#### Curl Example +``` +curl -X PUT -H "Authorization: Basic XXXXXXXXX" -H 'Content-Type: application/json' -d '{"startTime":1423526400000, "endTime":1423526400, "buildType":"BUILD"}' http://<host>:<port>/kylin/api/cubes/{cubeName}/build +``` + +#### Response Sample +``` +{ + "uuid":"c143e0e4-ac5f-434d-acf3-46b0d15e3dc6", + "last_modified":1407908916705, + "name":"test_kylin_cube_with_slr_empty - 19700101000000_20140731160000 - BUILD - PDT 2014-08-12 22:48:36", + "type":"BUILD", + "duration":0, + "related_cube":"test_kylin_cube_with_slr_empty", + "related_segment":"19700101000000_20140731160000", + "exec_start_time":0, + "exec_end_time":0, + "mr_waiting":0, + "steps":[ + { + "interruptCmd":null, + "name":"Create Intermediate Flat Hive Table", + "sequence_id":0, + "exec_cmd":"hive -e \"DROP TABLE IF EXISTS kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6;\nCREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6\n(\nCAL_DT date\n,LEAF_CATEG_ID int\n,LSTG_SITE_ID int\n,META_CATEG_NAME string\n,CATEG_LVL2_NAME string\n,CATEG_LVL3_NAME string\n,LSTG_FORMAT_NAME string\n,SLR_SEGMENT_CD smallint\n,SELLER_ID bigint\n,PRICE decimal\n)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '\\177'\nSTORED AS SEQUENCEFILE\nLOCATION '/tmp/kylin-c143e0e4-ac5f-434d-acf3-46b0d15e3dc6/kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6';\nSET mapreduce.job.split.metainfo.maxsize=-1;\nSET mapred.compress.map.output=true;\nSET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;\nSET mapred.output.compress=true;\nSET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;\nSET mapred.output.compression.type=BLOCK;\nSET mapreduce.job.max.split.locations=2000;\nSET hive.exec.compress.output=true;\nSET hive.auto.convert.join.noconditionaltask = true;\nSET hive.auto.convert.join.noconditionaltask.size = 300000000;\nINSERT OVERWRITE TABLE kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6\nSELECT\nTEST_KYLIN_FACT.CAL_DT\n,TEST_KYLIN_FACT.LEAF_CATEG_ID\n,TEST_KYLIN_FACT.LSTG_SITE_ID\n,TEST_CATEGORY_GROUPINGS.META_CATEG_NAME\n,TEST_CATEGORY_GROUPINGS.CATEG_LVL2_NAME\n,TEST_CATEGORY_GROUPINGS.CATEG_LVL3_NAME\n,TEST_KYLIN_FACT.LSTG_FORMAT_NAME\n,TEST_KYLIN_FACT.SLR_SEGMENT_CD\n,TEST_KYLIN_FACT.SELLER_ID\n,TEST_KYLIN_FACT.PRICE\nFROM TEST_KYLIN_FACT\nINNER JOIN TEST_CAL_DT\nON TEST_KYLIN_FACT.CAL_DT = TEST_CAL_DT.CAL_DT\nINNER JOIN TEST_CATEGORY_GROUPINGS\nON TEST_KYLIN_FACT.LEAF_CATEG_ID = TEST_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND TEST_KYLIN_FACT.LSTG_SITE_ID = TEST_CATEGORY_GROUPINGS.SITE_ID\nINNER JOIN TEST_SITES\nON TEST_KYLIN_FACT.LSTG_SITE_ID = TEST_SITES.SITE_ID\nINNER JOIN TEST_SELLER_TYPE_DIM\nON TEST_KYLIN_FACT.SLR_SEGMENT_CD = TEST_SELLER_TYPE_DIM.SELLER_TYPE_CD\nWHERE (test_kylin_fact.cal_dt < '2014-07-31 16:00:00')\n;\n\"", + "interrupt_cmd":null, + "exec_start_time":0, + "exec_end_time":0, + "exec_wait_time":0, + "step_status":"PENDING", + "cmd_type":"SHELL_CMD_HADOOP", + "info":null, + "run_async":false + }, + { + "interruptCmd":null, + "name":"Extract Fact Table Distinct Columns", + "sequence_id":1, + "exec_cmd":" -conf C:/kylin/Kylin/server/src/main/resources/hadoop_job_conf_medium.xml -cubename test_kylin_cube_with_slr_empty -input /tmp/kylin-c143e0e4-ac5f-434d-acf3-46b0d15e3dc6/kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6 -output 
/tmp/kylin-c143e0e4-ac5f-434d-acf3-46b0d15e3dc6/test_kylin_cube_with_slr_empty/fact_distinct_columns -jobname Kylin_Fact_Distinct_Columns_test_kylin_cube_with_slr_empty_Step_1", + "interrupt_cmd":null, + "exec_start_time":0, + "exec_end_time":0, + "exec_wait_time":0, + "step_status":"PENDING", + "cmd_type":"JAVA_CMD_HADOOP_FACTDISTINCT", + "info":null, + "run_async":true + }, + { + "interruptCmd":null, + "name":"Load HFile to HBase Table", + "sequence_id":12, + "exec_cmd":" -input /tmp/kylin-c143e0e4-ac5f-434d-acf3-46b0d15e3dc6/test_kylin_cube_with_slr_empty/hfile/ -htablename KYLIN-CUBE-TEST_KYLIN_CUBE_WITH_SLR_EMPTY-19700101000000_20140731160000_11BB4326-5975-4358-804C-70D53642E03A -cubename test_kylin_cube_with_slr_empty", + "interrupt_cmd":null, + "exec_start_time":0, + "exec_end_time":0, + "exec_wait_time":0, + "step_status":"PENDING", + "cmd_type":"JAVA_CMD_HADOOP_NO_MR_BULKLOAD", + "info":null, + "run_async":false + } + ], + "job_status":"PENDING", + "progress":0.0 +} +``` + +## Enable Cube +`PUT /kylin/api/cubes/{cubeName}/enable` + +#### Path variable +* cubeName - `required` `string` Cube name. 
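The cube-level actions documented on this page (enable, disable, purge) differ only in the last path segment, and this page lists no request body for them. A small sketch under those assumptions, for Node.js; `buildCubeActionRequest` and `CUBE_ACTIONS` are hypothetical names, and the host and cube name are placeholders:

```javascript
// Cube actions documented on this page that take only a path variable.
const CUBE_ACTIONS = ["enable", "disable", "purge"];

// Hypothetical helper: builds a PUT request descriptor for one cube action.
function buildCubeActionRequest(baseUrl, cubeName, action) {
  if (CUBE_ACTIONS.indexOf(action) === -1) {
    throw new Error("Unsupported cube action: " + action);
  }
  return {
    method: "PUT",
    // encodeURIComponent guards against characters that are not URL-safe
    url: baseUrl + "/kylin/api/cubes/" + encodeURIComponent(cubeName) + "/" + action
  };
}

const enableRequest = buildCubeActionRequest("http://localhost:7070", "kylin_sales", "enable");
console.log(enableRequest.method, enableRequest.url);
```

Rejecting unknown actions up front avoids sending a request to a path the server does not document.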
+ +#### Response Sample +```sh +{ + "uuid":"1eaca32a-a33e-4b69-83dd-0bb8b1f8c53b", + "last_modified":1407909046305, + "name":"test_kylin_cube_with_slr_ready", + "owner":null, + "version":null, + "descriptor":"test_kylin_cube_with_slr_desc", + "cost":50, + "status":"ACTIVE", + "segments":[ + { + "name":"19700101000000_20140531160000", + "storage_location_identifier":"KYLIN-CUBE-TEST_KYLIN_CUBE_WITH_SLR_READY-19700101000000_20140531160000_BF043D2D-9A4A-45E9-AA59-5A17D3F34A50", + "date_range_start":0, + "date_range_end":1401552000000, + "status":"READY", + "size_kb":4758, + "source_records":6000, + "source_records_size":620356, + "last_build_time":1407832663227, + "last_build_job_id":"2c7a2b63-b052-4a51-8b09-0c24b5792cda", + "binary_signature":null, + "dictionaries":{ + "TEST_CATEGORY_GROUPINGS/CATEG_LVL2_NAME":"/dict/TEST_CATEGORY_GROUPINGS/CATEG_LVL2_NAME/16d8185c-ee6b-4f8c-a919-756d9809f937.dict", + "TEST_KYLIN_FACT/LSTG_SITE_ID":"/dict/TEST_SITES/SITE_ID/0bec6bb3-1b0d-469c-8289-b8c4ca5d5001.dict", + "TEST_KYLIN_FACT/SLR_SEGMENT_CD":"/dict/TEST_SELLER_TYPE_DIM/SELLER_TYPE_CD/0c5d77ec-316b-47e0-ba9a-0616be890ad6.dict", + "TEST_KYLIN_FACT/CAL_DT":"/dict/PREDEFINED/date(yyyy-mm-dd)/64ac4f82-f2af-476e-85b9-f0805001014e.dict", + "TEST_CATEGORY_GROUPINGS/CATEG_LVL3_NAME":"/dict/TEST_CATEGORY_GROUPINGS/CATEG_LVL3_NAME/270fbfb0-281c-4602-8413-2970a7439c47.dict", + "TEST_KYLIN_FACT/LEAF_CATEG_ID":"/dict/TEST_CATEGORY_GROUPINGS/LEAF_CATEG_ID/2602386c-debb-4968-8d2f-b52b8215e385.dict", + "TEST_CATEGORY_GROUPINGS/META_CATEG_NAME":"/dict/TEST_CATEGORY_GROUPINGS/META_CATEG_NAME/0410d2c4-4686-40bc-ba14-170042a2de94.dict" + }, + "snapshots":{ + "TEST_CAL_DT":"/table_snapshot/TEST_CAL_DT.csv/8f7cfc8a-020d-4019-b419-3c6deb0ffaa0.snapshot", + "TEST_SELLER_TYPE_DIM":"/table_snapshot/TEST_SELLER_TYPE_DIM.csv/c60fd05e-ac94-4016-9255-96521b273b81.snapshot", + "TEST_CATEGORY_GROUPINGS":"/table_snapshot/TEST_CATEGORY_GROUPINGS.csv/363f4a59-b725-4459-826d-3188bde6a971.snapshot", + 
"TEST_SITES":"/table_snapshot/TEST_SITES.csv/78e0aecc-3ec6-4406-b86e-bac4b10ea63b.snapshot" + } + } + ], + "create_time":null, + "source_records_count":6000, + "source_records_size":0, + "size_kb":4758 +} +``` + +## Disable Cube +`PUT /kylin/api/cubes/{cubeName}/disable` + +#### Path variable +* cubeName - `required` `string` Cube name. + +#### Response Sample +(Same as "Enable Cube") + +## Purge Cube +`PUT /kylin/api/cubes/{cubeName}/purge` + +#### Path variable +* cubeName - `required` `string` Cube name. + +#### Response Sample +(Same as "Enable Cube") + +*** + +## Resume Job +`PUT /kylin/api/jobs/{jobId}/resume` + +#### Path variable +* jobId - `required` `string` Job id. + +#### Response Sample +``` +{ + "uuid":"c143e0e4-ac5f-434d-acf3-46b0d15e3dc6", + "last_modified":1407908916705, + "name":"test_kylin_cube_with_slr_empty - 19700101000000_20140731160000 - BUILD - PDT 2014-08-12 22:48:36", + "type":"BUILD", + "duration":0, + "related_cube":"test_kylin_cube_with_slr_empty", + "related_segment":"19700101000000_20140731160000", + "exec_start_time":0, + "exec_end_time":0, + "mr_waiting":0, + "steps":[ + { + "interruptCmd":null, + "name":"Create Intermediate Flat Hive Table", + "sequence_id":0, + "exec_cmd":"hive -e \"DROP TABLE IF EXISTS kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6;\nCREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6\n(\nCAL_DT date\n,LEAF_CATEG_ID int\n,LSTG_SITE_ID int\n,META_CATEG_NAME string\n,CATEG_LVL2_NAME string\n,CATEG_LVL3_NAME string\n,LSTG_FORMAT_NAME string\n,SLR_SEGMENT_CD smallint\n,SELLER_ID bigint\n,PRICE decimal\n)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '\\177'\nSTORED AS SEQUENCEFILE\nLOCATION 
'/tmp/kylin-c143e0e4-ac5f-434d-acf3-46b0d15e3dc6/kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6';\nSET mapreduce.job.split.metainfo.maxsize=-1;\nSET mapred.compress.map.output=true;\nSET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;\nSET mapred.output.compress=true;\nSET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;\nSET mapred.output.compression.type=BLOCK;\nSET mapreduce.job.max.split.locations=2000;\nSET hive.exec.compress.output=true;\nSET hive.auto.convert.join.noconditionaltask = true;\nSET hive.auto.convert.join.noconditionaltask.size = 300000000;\nINSERT OVERWRITE TABLE kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6\nSELECT\nTEST_KYLIN_FACT.CAL_DT\n,TEST_KYLIN_FACT.LEAF_CATEG_ID\n,TEST_KYLIN_FACT.LSTG_SITE_ID\n,TEST_CATEGORY_GROUPINGS.META_CATEG_NAME\n,TEST_CATEGORY_GROUPINGS.CATEG_LVL2_NAME\n,TEST_CATEGORY_GROUPINGS.CATEG_LVL3_NAME\n,TEST_KYLIN_FACT.LSTG_FORMAT_NAME\n,TEST_KYLIN_FACT.SLR_SEGMENT_CD\n,TEST_KYLIN_FACT.SELLER_ID\n,TEST_KYLIN_FACT.PRICE\nFROM TEST_KYLIN_FACT\nINNER JOIN TEST_CAL_DT\nON TEST_KYLIN_FACT.CAL_DT = TEST_CAL_DT.CAL_DT\nINNER JOIN TEST_CATEGORY_GROUPINGS\nON TEST_KYLIN_FACT.LEAF_CATEG_ID = TEST_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND TEST_KYLIN_FACT.LSTG_SITE_ID = TEST_CATEGORY_GROUPINGS.SITE_ID\nINNER JOIN TEST_SITES\nON TEST_KYLIN_FACT.LSTG_SITE_ID = TEST_SITES.SITE_ID\nINNER JOIN TEST_SELLER_TYPE_DIM\nON TEST_KYLIN_FACT.SLR_SEGMENT_CD = TEST_SELLER_TYPE_DIM.SELLER_TYPE_CD\nWHERE (test_kylin_fact.cal_dt < '2014-07-31 16:00:00')\n;\n\"", + "interrupt_cmd":null, + "exec_start_time":0, + "exec_end_time":0, + "exec_wait_time":0, + "step_status":"PENDING", + "cmd_type":"SHELL_CMD_HADOOP", + "info":null, + "run_async":false + }, + { + "interruptCmd":null, + "name":"Extract Fact Table Distinct Columns", + "sequence_id":1, + "exec_cmd":" -conf 
C:/kylin/Kylin/server/src/main/resources/hadoop_job_conf_medium.xml -cubename test_kylin_cube_with_slr_empty -input /tmp/kylin-c143e0e4-ac5f-434d-acf3-46b0d15e3dc6/kylin_intermediate_test_kylin_cube_with_slr_desc_19700101000000_20140731160000_c143e0e4_ac5f_434d_acf3_46b0d15e3dc6 -output /tmp/kylin-c143e0e4-ac5f-434d-acf3-46b0d15e3dc6/test_kylin_cube_with_slr_empty/fact_distinct_columns -jobname Kylin_Fact_Distinct_Columns_test_kylin_cube_with_slr_empty_Step_1", + "interrupt_cmd":null, + "exec_start_time":0, + "exec_end_time":0, + "exec_wait_time":0, + "step_status":"PENDING", + "cmd_type":"JAVA_CMD_HADOOP_FACTDISTINCT", + "info":null, + "run_async":true + }, + { + "interruptCmd":null, + "name":"Load HFile to HBase Table", + "sequence_id":12, + "exec_cmd":" -input /tmp/kylin-c143e0e4-ac5f-434d-acf3-46b0d15e3dc6/test_kylin_cube_with_slr_empty/hfile/ -htablename KYLIN-CUBE-TEST_KYLIN_CUBE_WITH_SLR_EMPTY-19700101000000_20140731160000_11BB4326-5975-4358-804C-70D53642E03A -cubename test_kylin_cube_with_slr_empty", + "interrupt_cmd":null, + "exec_start_time":0, + "exec_end_time":0, + "exec_wait_time":0, + "step_status":"PENDING", + "cmd_type":"JAVA_CMD_HADOOP_NO_MR_BULKLOAD", + "info":null, + "run_async":false + } + ], + "job_status":"PENDING", + "progress":0.0 +} +``` +## Pause Job +`PUT /kylin/api/jobs/{jobId}/pause` + +#### Path variable +* jobId - `required` `string` Job id. + +## Discard Job +`PUT /kylin/api/jobs/{jobId}/cancel` + +#### Path variable +* jobId - `required` `string` Job id. + +## Get Job Status +`GET /kylin/api/jobs/{jobId}` + +#### Path variable +* jobId - `required` `string` Job id. + +#### Response Sample +(Same as "Resume Job") + +## Get job step output +`GET /kylin/api/jobs/{jobId}/steps/{stepId}/output` + +#### Path Variable +* jobId - `required` `string` Job id. 
+* stepId - `required` `string` Step id, composed of the job id and the step sequence id. For example, if the job id is "fb479e54-837f-49a2-b457-651fc50be110", its 3rd step id is "fb479e54-837f-49a2-b457-651fc50be110-3". + +#### Response Sample +``` +{ + "cmd_output":"log string" +} +``` + +*** + +## Get Hive Table +`GET /kylin/api/tables/{tableName}` + +#### Request Parameters +* tableName - `required` `string` Table name to find. + +#### Response Sample +```sh +{ + uuid: "69cc92c0-fc42-4bb9-893f-bd1141c91dbe", + name: "SAMPLE_07", + columns: [{ + id: "1", + name: "CODE", + datatype: "string" + }, { + id: "2", + name: "DESCRIPTION", + datatype: "string" + }, { + id: "3", + name: "TOTAL_EMP", + datatype: "int" + }, { + id: "4", + name: "SALARY", + datatype: "int" + }], + database: "DEFAULT", + last_modified: 1419330476755 +} +``` + +## Get Hive Table (Extend Info) +`GET /kylin/api/tables/{tableName}/exd-map` + +#### Request Parameters +* tableName - `optional` `string` Table name to find. + +#### Response Sample +``` +{ + "minFileSize": "46055", + "totalNumberFiles": "1", + "location": "hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/sample_07", + "lastAccessTime": "1418374103365", + "lastUpdateTime": "1398176493340", + "columns": "struct columns { string code, string description, i32 total_emp, i32 salary}", + "partitionColumns": "", + "EXD_STATUS": "true", + "maxFileSize": "46055", + "inputformat": "org.apache.hadoop.mapred.TextInputFormat", + "partitioned": "false", + "tableName": "sample_07", + "owner": "hue", + "totalFileSize": "46055", + "outputformat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" +} +``` + +## Get Hive Tables +`GET /kylin/api/tables` + +#### Request Parameters +* project - `required` `string` The project whose tables will be listed. +* ext - `optional` `boolean` Set to true to get the extended info of each table. 
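Because this is a `GET` endpoint, the sketch below assumes that `project` and `ext` are passed in the query string; the page does not state this explicitly. `buildTablesUrl` is a hypothetical helper name and the host is a placeholder (Node.js).

```javascript
// Hypothetical helper: builds the URL for GET /kylin/api/tables.
// `project` is required; `ext` is optional and only added when requested.
function buildTablesUrl(baseUrl, project, ext) {
  if (!project) {
    throw new Error("project is required");
  }
  // URLSearchParams takes care of percent-encoding the values
  const params = new URLSearchParams({ project: project });
  if (ext === true) {
    params.set("ext", "true");
  }
  return baseUrl + "/kylin/api/tables?" + params.toString();
}

console.log(buildTablesUrl("http://localhost:7070", "learn_kylin", true));
```

Leaving `ext` out by default keeps the response small; the extended info adds a block of file-level details per table.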
+ +#### Response Sample +```sh +[ + { + uuid: "53856c96-fe4d-459e-a9dc-c339b1bc3310", + name: "SAMPLE_08", + columns: [{ + id: "1", + name: "CODE", + datatype: "string" + }, { + id: "2", + name: "DESCRIPTION", + datatype: "string" + }, { + id: "3", + name: "TOTAL_EMP", + datatype: "int" + }, { + id: "4", + name: "SALARY", + datatype: "int" + }], + database: "DEFAULT", + cardinality: {}, + last_modified: 0, + exd: { + minFileSize: "46069", + totalNumberFiles: "1", + location: "hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/sample_08", + lastAccessTime: "1398176495945", + lastUpdateTime: "1398176495981", + columns: "struct columns { string code, string description, i32 total_emp, i32 salary}", + partitionColumns: "", + EXD_STATUS: "true", + maxFileSize: "46069", + inputformat: "org.apache.hadoop.mapred.TextInputFormat", + partitioned: "false", + tableName: "sample_08", + owner: "hue", + totalFileSize: "46069", + outputformat: "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" + } + } +] +``` + +## Load Hive Tables +`POST /kylin/api/tables/{tables}/{project}` + +#### Request Parameters +* tables - `required` `string` Names of the tables to load from Hive, separated by commas. +* project - `required` `string` The project into which the tables will be loaded. + +#### Response Sample +``` +{ + "result.loaded": ["DEFAULT.SAMPLE_07"], + "result.unloaded": ["sample_08"] +} +``` + +*** + +## Wipe cache +`PUT /kylin/api/cache/{type}/{name}/{action}` + +#### Path variable +* type - `required` `string` 'METADATA' or 'CUBE' +* name - `required` `string` Cache key, e.g. the cube name. 
+* action - `required` `string` 'create', 'update' or 'drop' + +*** + +## Initiate cube start position +Set the stream cube's start position to the current latest offsets. This avoids building from the earliest position of the Kafka topic, which is useful if you have set a long retention time. + +`PUT /kylin/api/cubes/{cubeName}/init_start_offsets` + +#### Path variable +* cubeName - `required` `string` Cube name + +#### Response Sample +```sh +{ + "result": "success", + "offsets": "{0=246059529, 1=253547684, 2=253023895, 3=172996803, 4=165503476, 5=173513896, 6=19200473, 7=26691891, 8=26699895, 9=26694021, 10=19204164, 11=26694597}" +} +``` + +## Build stream cube +`PUT /kylin/api/cubes/{cubeName}/build2` + +This API is specific to building stream cubes. + +#### Path variable +* cubeName - `required` `string` Cube name + +#### Request Body + +* sourceOffsetStart - `required` `long` The start offset; 0 means continue from the previous position. +* sourceOffsetEnd - `required` `long` The end offset; 9223372036854775807 means up to the end position of the current stream data. +* buildType - `required` Build type, "BUILD", "MERGE" or "REFRESH" + +#### Request Sample + +```sh +{ + "sourceOffsetStart": 0, + "sourceOffsetEnd": 9223372036854775807, + "buildType": "BUILD" +} +``` + +#### Response Sample +```sh +{ + "uuid": "3afd6e75-f921-41e1-8c68-cb60bc72a601", + "last_modified": 1480402541240, + "version": "1.6.0", + "name": "embedded_cube_clone - 1409830324_1409849348 - BUILD - PST 2016-11-28 22:55:41", + "type": "BUILD", + "duration": 0, + "related_cube": "embedded_cube_clone", + "related_segment": "42ebcdea-cbe9-4905-84db-31cb25f11515", + "exec_start_time": 0, + "exec_end_time": 0, + "mr_waiting": 0, + ... +} +``` + +## Check segment holes +`GET /kylin/api/cubes/{cubeName}/holes` + +#### Path variable +* cubeName - `required` `string` Cube name + +## Fill segment holes +`PUT /kylin/api/cubes/{cubeName}/holes` + +#### Path variable +* cubeName - `required` `string` Cube name
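One practical note on the `build2` request body under "Build stream cube": the end offset 9223372036854775807 is the largest signed 64-bit integer, which a JavaScript `Number` cannot represent exactly, so serializing it with `JSON.stringify` from a plain number would send a rounded value. The sketch below (Node.js) works around this with `BigInt`; `buildStreamBuildBody` and `MAX_OFFSET` are hypothetical names, not part of any Kylin client.

```javascript
// Largest signed 64-bit value, i.e. "up to the end of the current stream data".
const MAX_OFFSET = 9223372036854775807n;

// Hypothetical helper: builds the JSON body for PUT /kylin/api/cubes/{cubeName}/build2.
// Offsets are handled as BigInt; JSON.stringify cannot serialize BigInt values,
// so the numeric fields are written into the JSON text by hand.
function buildStreamBuildBody(sourceOffsetStart, sourceOffsetEnd, buildType) {
  return '{"sourceOffsetStart":' + BigInt(sourceOffsetStart).toString() +
    ',"sourceOffsetEnd":' + BigInt(sourceOffsetEnd).toString() +
    ',"buildType":' + JSON.stringify(buildType) + '}';
}

const streamBuildBody = buildStreamBuildBody(0n, MAX_OFFSET, "BUILD");
console.log(streamBuildBody);

// For comparison: going through a plain Number loses the exact value.
const roundedBody = JSON.stringify({ sourceOffsetEnd: 9223372036854775807 });
console.log(roundedBody);
```

Passing the resulting string as the raw request body keeps every digit of the offset intact.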
http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/howto/howto_use_restapi_in_js.md ---------------------------------------------------------------------- diff --git a/website/_docs20/howto/howto_use_restapi_in_js.md b/website/_docs20/howto/howto_use_restapi_in_js.md new file mode 100644 index 0000000..6bdfae4 --- /dev/null +++ b/website/_docs20/howto/howto_use_restapi_in_js.md @@ -0,0 +1,46 @@ +--- +layout: docs20 +title: Use RESTful API in Javascript +categories: howto +permalink: /docs20/howto/howto_use_restapi_in_js.html +--- +Kylin security is based on basic access authentication. To call the API from JavaScript, you need to add the authorization info to the HTTP headers. + +## Example on Query API +``` +$.ajaxSetup({ + headers: { 'Authorization': "Basic eWFu**********X***ZA==", 'Content-Type': 'application/json;charset=utf-8' } // use your own authorization code here + }); + var request = $.ajax({ + url: "http://hostname/kylin/api/query", + type: "POST", + data: '{"sql":"select count(*) from SUMMARY;","offset":0,"limit":50000,"acceptPartial":true,"project":"test"}', + dataType: "json" + }); + request.done(function( msg ) { + alert(msg); + }); + request.fail(function( jqXHR, textStatus ) { + alert( "Request failed: " + textStatus ); + }); + +``` + +## Key points +1. Add the basic access authorization info to the HTTP headers. +2. Use the correct ajax type and data syntax. + +## Basic access authorization +For background on basic access authentication, refer to the [Wikipedia Page](http://en.wikipedia.org/wiki/Basic_access_authentication). +To generate your authorization code, download and import "jquery.base64.js" from [https://github.com/yckart/jquery.base64.js](https://github.com/yckart/jquery.base64.js), then use it as follows:
+ +``` +var authorizationCode = $.base64('encode', 'NT_USERNAME' + ":" + 'NT_PASSWORD'); + +$.ajaxSetup({ + headers: { + 'Authorization': "Basic " + authorizationCode, + 'Content-Type': 'application/json;charset=utf-8' + } +}); +``` http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/index.cn.md ---------------------------------------------------------------------- diff --git a/website/_docs20/index.cn.md b/website/_docs20/index.cn.md new file mode 100644 index 0000000..83b6f55 --- /dev/null +++ b/website/_docs20/index.cn.md @@ -0,0 +1,26 @@ +--- +layout: docs20-cn +title: Overview +categories: docs +permalink: /cn/docs20/index.html +--- + +Welcome to Apache Kylin™ +------------ +> Extreme OLAP Engine for Big Data + +Apache Kylin™ is an open source distributed analytics engine that provides a SQL query interface and multi-dimensional analysis (OLAP) on top of Hadoop to support extremely large datasets. It was originally developed by eBay Inc. and contributed to the open source community. + +Documentation for previous versions: +* [v1.5](/cn/docs15/) +* [v1.3](/cn/docs/) + +Installation +------------ +Please refer to the installation documentation to install Apache Kylin: [Installation Guide](/cn/docs20/install/) + + + + + + http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/index.md ---------------------------------------------------------------------- diff --git a/website/_docs20/index.md b/website/_docs20/index.md new file mode 100644 index 0000000..f34112c --- /dev/null +++ b/website/_docs20/index.md @@ -0,0 +1,59 @@ +--- +layout: docs20 +title: Overview +categories: docs +permalink: /docs20/index.html +--- + +Welcome to Apache Kylin™: Extreme OLAP Engine for Big Data +------------ + +Apache Kylin™ is an open source Distributed Analytics Engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets. + +Documentation for prior versions: + +* [v1.6.x document](/docs16/) +* [v1.5.x document](/docs15/) +* [v1.3.x document](/docs/) + +Installation & Setup +------------ +1. [Hadoop Env](install/hadoop_env.html) +2. [Installation Guide](install/index.html) +3. 
[Advanced settings](install/advance_settings.html) +4. [Deploy in cluster mode](install/kylin_cluster.html) +5. [Run Kylin with Docker](install/kylin_docker.html) + + +Tutorial +------------ +1. [Quick Start with Sample Cube](tutorial/kylin_sample.html) +2. [Cube Creation](tutorial/create_cube.html) +3. [Cube Build and Job Monitoring](tutorial/cube_build_job.html) +4. [Web Interface](tutorial/web.html) +5. [SQL reference: by Apache Calcite](http://calcite.apache.org/docs/reference.html) +6. [Build Cube with Streaming Data](tutorial/cube_streaming.html) +7. [Build Cube with Spark Engine (beta)](tutorial/cube_spark.html) + + +Connectivity and APIs +------------ +1. [ODBC driver](tutorial/odbc.html) +2. [JDBC driver](howto/howto_jdbc.html) +3. [RESTful API list](howto/howto_use_restapi.html) +4. [Build cube with RESTful API](howto/howto_build_cube_with_restapi.html) +5. [Call RESTful API in Javascript](howto/howto_use_restapi_in_js.html) +6. [Connect from MS Excel and PowerBI](tutorial/powerbi.html) +7. [Connect from Tableau 8](tutorial/tableau.html) +8. [Connect from Tableau 9](tutorial/tableau_91.html) +9. [Connect from SQuirreL](tutorial/squirrel.html) +10. [Connect from Apache Flink](tutorial/flink.html) + +Operations +------------ +1. [Backup/restore Kylin metadata](howto/howto_backup_metadata.html) +2. [Cleanup storage (HDFS & HBase)](howto/howto_cleanup_storage.html) +3. 
[Upgrade from old version](howto/howto_upgrade.html) + + + http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/install/advance_settings.md ---------------------------------------------------------------------- diff --git a/website/_docs20/install/advance_settings.md b/website/_docs20/install/advance_settings.md new file mode 100644 index 0000000..f76d39a --- /dev/null +++ b/website/_docs20/install/advance_settings.md @@ -0,0 +1,98 @@ +--- +layout: docs20 +title: "Advanced Settings" +categories: install +permalink: /docs20/install/advance_settings.html +--- + +## Overwrite default kylin.properties at Cube level +`conf/kylin.properties` contains many parameters that control Kylin's behavior. Most of them are global configs, such as security or job related settings, while some are Cube related. The Cube related parameters can be customized for each Cube, so you can control the behavior more flexibly. The GUI for this is the "Configuration Overwrites" step of the Cube wizard, as in the screenshot below. + + + +Here are two examples: + + * `kylin.cube.algorithm`: defines the cubing algorithm that the job engine selects. The default value is "auto", which means the engine dynamically picks an algorithm ("layer" or "inmem") by sampling the data. If you know Kylin and your data/cluster well, you can set your preferred algorithm directly ("inmem" usually performs better but requires more memory). + + * `kylin.hbase.region.cut`: defines how big a region is when creating the HBase table. The default value is "5" (GB) per region. This may be too big for a small or medium cube, so you can set a smaller value to get more regions created, which can improve query performance. + +## Overwrite default Hadoop job conf at Cube level +The `conf/kylin_job_conf.xml` and `conf/kylin_job_conf_inmem.xml` files manage the default configurations for Hadoop jobs. 
If you need to customize these configs per cube, you can do so in a similar way as above, but you need to add the prefix `kylin.job.mr.config.override.`. These configs are parsed out and applied when jobs are submitted. See the two examples below: + + * If you want a cube's job to get more memory from Yarn, you can define: `kylin.job.mr.config.override.mapreduce.map.java.opts=-Xmx7g` and `kylin.job.mr.config.override.mapreduce.map.memory.mb=8192` + * If you want a cube's job to go to a different Yarn resource queue, you can define: `kylin.job.mr.config.override.mapreduce.job.queuename=myQueue` (note: "myQueue" is just a sample) + +## Overwrite default Hive job conf at Cube level +The `conf/kylin_hive_conf.xml` file manages the default configurations used when running Hive jobs (such as creating the intermediate flat Hive table). If you need to customize these configs per cube, you can do so in a similar way as above, but with another prefix, `kylin.hive.config.override.`. These configs are parsed out and applied when running "hive -e" or "beeline" commands. See the example below: + + * If you want Hive to use a different Yarn resource queue, you can define: `kylin.hive.config.override.mapreduce.job.queuename=myQueue` (note: "myQueue" is just a sample) + + +## Enable compression + +By default, Kylin does not enable compression. This is not the recommended setting for a production environment, but a tradeoff for new Kylin users. A suitable compression algorithm reduces the storage overhead, but an unsupported algorithm will break the Kylin job build. Kylin uses three kinds of compression: HBase table compression, Hive output compression and MR job output compression. + +* HBase table compression +The compression setting is defined in `kylin.properties` by `kylin.hbase.default.compression.codec`; the default value is *none*. Valid values are *none*, *snappy*, *lzo*, *gzip* and *lz4*. 
Before changing the compression algorithm, please make sure the selected algorithm is supported on your HBase cluster. In particular, not all Hadoop distributions include snappy, lzo and lz4. + +* Hive output compression +The compression settings are defined in `kylin_hive_conf.xml`. The default setting is empty, which falls back to the Hive default configuration. To override the settings, add (or replace) the following properties in `kylin_hive_conf.xml`. Taking snappy compression as an example: +{% highlight Groff markup %} + <property> + <name>mapreduce.map.output.compress.codec</name> + <value>org.apache.hadoop.io.compress.SnappyCodec</value> + <description></description> + </property> + <property> + <name>mapreduce.output.fileoutputformat.compress.codec</name> + <value>org.apache.hadoop.io.compress.SnappyCodec</value> + <description></description> + </property> +{% endhighlight %} + +* MR jobs output compression +The compression settings are defined in `kylin_job_conf.xml` and `kylin_job_conf_inmem.xml`. The default setting is empty, which falls back to the MR default configuration. To override the settings, add (or replace) the following properties in `kylin_job_conf.xml` and `kylin_job_conf_inmem.xml`. Taking snappy compression as an example: +{% highlight Groff markup %} + <property> + <name>mapreduce.map.output.compress.codec</name> + <value>org.apache.hadoop.io.compress.SnappyCodec</value> + <description></description> + </property> + <property> + <name>mapreduce.output.fileoutputformat.compress.codec</name> + <value>org.apache.hadoop.io.compress.SnappyCodec</value> + <description></description> + </property> +{% endhighlight %} + +Compression settings only take effect after the Kylin server instance is restarted. 
+ +## Allocate more memory to Kylin instance + +Open `bin/setenv.sh`, which has two sample settings for the `KYLIN_JVM_SETTINGS` environment variable. The default setting is small (4GB at most); you can comment it out and un-comment the next line to allocate 16GB: + +{% highlight Groff markup %} +export KYLIN_JVM_SETTINGS="-Xms1024M -Xmx4096M -Xss1024K -XX:MaxPermSize=128M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M" +# export KYLIN_JVM_SETTINGS="-Xms16g -Xmx16g -XX:MaxPermSize=512m -XX:NewSize=3g -XX:MaxNewSize=3g -XX:SurvivorRatio=4 -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=70 -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError" +{% endhighlight %} + +## Enable LDAP or SSO authentication + +See [How to Enable Security with LDAP and SSO](../howto/howto_ldap_and_sso.html) + + +## Enable email notification + +Kylin can send email notifications when a job completes or fails. To enable this, edit `conf/kylin.properties` and set the following parameters: +{% highlight Groff markup %} +mail.enabled=true +mail.host=your-smtp-server +mail.username=your-smtp-account +mail.password=your-smtp-pwd +mail.sender=your-sender-address +kylin.job.admin.dls=administrator-address +{% endhighlight %} + +Restart the Kylin server for the change to take effect. To disable, set `mail.enabled` back to `false`. + +Administrators get notifications for all jobs. Modelers and Analysts need to enter their email address into the "Notification List" on the first page of the cube wizard; they will then be notified for that cube. 
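The cube-level override prefixes described earlier in this file (`kylin.job.mr.config.override.` for Hadoop jobs and `kylin.hive.config.override.` for Hive jobs) amount to prepending a fixed string to an ordinary Hadoop or Hive property name. A minimal JavaScript sketch of that mapping, using the sample values from the text, might look like this; the helper name `toOverrideLines` is purely illustrative and is not part of Kylin.

```javascript
// Prefixes are taken from the text above; the helper itself is a hypothetical illustration.
const MR_OVERRIDE_PREFIX = "kylin.job.mr.config.override.";
const HIVE_OVERRIDE_PREFIX = "kylin.hive.config.override.";

// Turn { "hadoop.property": "value" } into "prefix + hadoop.property=value" lines,
// i.e. the key/value pairs one would enter as cube-level configuration overwrites.
function toOverrideLines(prefix, settings) {
  return Object.keys(settings).map((key) => prefix + key + "=" + settings[key]);
}

const mrLines = toOverrideLines(MR_OVERRIDE_PREFIX, {
  "mapreduce.map.java.opts": "-Xmx7g",
  "mapreduce.map.memory.mb": "8192",
  "mapreduce.job.queuename": "myQueue", // "myQueue" is just a sample queue name
});

const hiveLines = toOverrideLines(HIVE_OVERRIDE_PREFIX, {
  "mapreduce.job.queuename": "myQueue",
});

console.log(mrLines.concat(hiveLines).join("\n"));
```

The same underlying property (`mapreduce.job.queuename` here) can appear under both prefixes, since the Hadoop job settings and the Hive settings are applied separately.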
http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/install/hadoop_evn.md ---------------------------------------------------------------------- diff --git a/website/_docs20/install/hadoop_evn.md b/website/_docs20/install/hadoop_evn.md new file mode 100644 index 0000000..2c300df --- /dev/null +++ b/website/_docs20/install/hadoop_evn.md @@ -0,0 +1,40 @@ +--- +layout: docs20 +title: "Hadoop Environment" +categories: install +permalink: /docs20/install/hadoop_env.html +--- + +Kylin needs to run on a Hadoop node. For better stability, we suggest deploying it on a pure Hadoop client machine on which command-line tools such as `hive`, `hbase`, `hadoop` and `hdfs` are already installed and configured. The Linux account that runs Kylin must have permissions on the Hadoop cluster, including creating/writing HDFS files, Hive tables and HBase tables, and submitting MR jobs. + +## Recommended Hadoop Versions + +* Hadoop: 2.6 - 2.7 +* Hive: 0.13 - 1.2.1 +* HBase: 0.98 - 0.99, 1.x +* JDK: 1.7+ + +_Tested with Hortonworks HDP 2.2 and Cloudera Quickstart VM 5.1. Windows and MacOS have known issues._ + +To make things easier, we strongly recommend that you try Kylin with an all-in-one sandbox VM, like [HDP sandbox](http://hortonworks.com/products/hortonworks-sandbox/), and give it 10 GB of memory. In the following tutorial we'll go with **Hortonworks Sandbox 2.1** and **Cloudera QuickStart VM 5.1**. + +To avoid permission issues in the sandbox, you can use its `root` account. The password for **Hortonworks Sandbox 2.1** is `hadoop`; for **Cloudera QuickStart VM 5.1** it is `cloudera`. + +We also suggest using bridged mode instead of NAT mode in the Virtual Box settings. Bridged mode assigns your sandbox an independent IP address so that you can avoid issues like [this](https://github.com/KylinOLAP/Kylin/issues/12). 
+ +### Start Hadoop +Use Ambari to launch Hadoop: + +``` +ambari-agent start +ambari-server start +``` + +Once both commands have run successfully, you can go to the Ambari homepage at <http://your_sandbox_ip:8080> (user: admin, password: admin) to check the status of all services. **By default Hortonworks Ambari disables HBase; you need to start the `Hbase` service manually from the Ambari homepage.** + + + +**Additional Info for setting up Hortonworks Sandbox on Virtual Box** + + Please make sure the HBase Master port [Default 60000] and the Zookeeper port [Default 2181] are forwarded to the Host OS. + http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/install/index.cn.md ---------------------------------------------------------------------- diff --git a/website/_docs20/install/index.cn.md b/website/_docs20/install/index.cn.md new file mode 100644 index 0000000..68b5aec --- /dev/null +++ b/website/_docs20/install/index.cn.md @@ -0,0 +1,46 @@ +--- +layout: docs20 +title: "Installation Guide" +categories: install +permalink: /cn/docs20/install/index.html +version: v0.7.2 +since: v0.7.1 +--- + +### Environment + +Kylin requires a properly set up Hadoop environment to run. The following are the minimal requirements to run Kylin; for more detail, please check this reference: [Hadoop Environment](hadoop_env.html). + +## Prerequisites on Hadoop + +* Hadoop: 2.4+ +* Hive: 0.13+ +* HBase: 0.98+, 1.x +* JDK: 1.7+ +_Tested with Hortonworks HDP 2.2 and Cloudera Quickstart VM 5.1_ + + +It is most common to install Kylin on a Hadoop client machine. It can be used for demo use, or for those who want to host their own web site to provide Kylin service. The scenario is depicted as: + + + +For normal use cases, the application in the above picture means Kylin Web, which contains a web interface for cube building, querying and all sorts of management. Kylin Web launches a query engine for querying and a cube build engine for building cubes. 
These two engines interact with the Hadoop components, like hive and hbase. + +Apart from some prerequisite software installations, the core of the Kylin installation is accomplished by running a single script. After running the script, you will be able to build the sample cube and query the tables behind the cubes via a unified web interface. + +### Install Kylin + +1. Download the latest Kylin binaries at [http://kylin.apache.org/download](http://kylin.apache.org/download) +2. Export KYLIN_HOME pointing to the extracted Kylin folder +3. Make sure the user has the privilege to run hadoop, hive and hbase commands in the shell. If you are not sure, run **bin/check-env.sh**; it prints detailed information about any environment issues. +4. To start Kylin, simply run **bin/kylin.sh start** +5. To stop Kylin, simply run **bin/kylin.sh stop** + +> If you want to have multiple Kylin nodes please refer to [this](kylin_cluster.html) + +After Kylin has started, you can visit <http://your_hostname:7070/kylin>. The username/password is ADMIN/KYLIN. It's a clean Kylin homepage with nothing in there. To start with you can: + +1. [Quick play with a sample cube](../tutorial/kylin_sample.html) +2. [Create and Build your own cube](../tutorial/create_cube.html) +3. [Kylin Web Tutorial](../tutorial/web.html) + http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/install/index.md ---------------------------------------------------------------------- diff --git a/website/_docs20/install/index.md b/website/_docs20/install/index.md new file mode 100644 index 0000000..77794e1 --- /dev/null +++ b/website/_docs20/install/index.md @@ -0,0 +1,35 @@ +--- +layout: docs20 +title: "Installation Guide" +categories: install +permalink: /docs20/install/index.html +--- + +### Environment + +Kylin requires a properly set up Hadoop environment to run. The following are the minimal requirements to run Kylin; for more detail, please check [Hadoop Environment](hadoop_env.html). 
+ +It is most common to install Kylin on a Hadoop client machine, from which Kylin can talk with the Hadoop cluster via command lines including `hive`, `hbase`, `hadoop`, etc. The scenario is depicted as: + + + +For normal use cases, the application in the above picture means Kylin Web, which contains a web interface for cube building, querying and all sorts of management. Kylin Web launches a query engine for querying and a cube build engine for building cubes. These two engines interact with the Hadoop components, like hive and hbase. + +Apart from some prerequisite software installations, the core of the Kylin installation is accomplished by running a single script. After running the script, you will be able to build the sample cube and query the tables behind the cubes via a unified web interface. + +### Install Kylin + +1. Download the latest Kylin binaries at [http://kylin.apache.org/download](http://kylin.apache.org/download) +2. Export KYLIN_HOME pointing to the extracted Kylin folder +3. Make sure the user has the privilege to run hadoop, hive and hbase commands in the shell. If you are not sure, run **bin/check-env.sh**; it prints detailed information about any environment issues. +4. To start Kylin, run **bin/kylin.sh start**. After the server starts, you can watch logs/kylin.log for runtime logs. +5. To stop Kylin, run **bin/kylin.sh stop** + +> If you want to have multiple Kylin nodes running to provide high availability, please refer to [this](kylin_cluster.html) + +After Kylin has started, you can visit <http://hostname:7070/kylin>. The default username/password is ADMIN/KYLIN. It's a clean Kylin homepage with nothing in there. To start with you can: + +1. [Quick play with a sample cube](../tutorial/kylin_sample.html) +2. [Create and Build a cube](../tutorial/create_cube.html) +3. 
[Kylin Web Tutorial](../tutorial/web.html) + http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/install/kylin_cluster.md ---------------------------------------------------------------------- diff --git a/website/_docs20/install/kylin_cluster.md b/website/_docs20/install/kylin_cluster.md new file mode 100644 index 0000000..d7fec7e --- /dev/null +++ b/website/_docs20/install/kylin_cluster.md @@ -0,0 +1,32 @@ +--- +layout: docs20 +title: "Deploy in Cluster Mode" +categories: install +permalink: /docs20/install/kylin_cluster.html +--- + + +### Kylin Server modes + +Kylin instances are stateless, the runtime state is saved in its "Metadata Store" in hbase (kylin.metadata.url config in conf/kylin.properties). For load balance considerations it is possible to start multiple Kylin instances sharing the same metadata store (thus sharing the same state on table schemas, job status, cube status, etc.) + +Each of the kylin instances has a kylin.server.mode entry in conf/kylin.properties specifying the runtime mode, it has three options: 1. "job" for running job engine only 2. "query" for running query engine only and 3 "all" for running both. Notice that only one server can run the job engine("all" mode or "job" mode), the others must all be "query" mode. + +A typical scenario is depicted in the following chart: + + + +### Setting up Multiple Kylin REST servers + +If you are running Kylin in a cluster where you have multiple Kylin REST server instances, please make sure you have the following property correctly configured in ${KYLIN_HOME}/conf/kylin.properties for EVERY server instance. + +1. kylin.rest.servers + List of web servers in use, this enables one web server instance to sync up with other servers. For example: kylin.rest.servers=sandbox1:7070,sandbox2:7070 + +2. 
kylin.server.mode + Make sure there is only one instance whose "kylin.server.mode" is set to "all" (or "job"); the others should be "query" + +## Setup load balancer + +To enable Kylin high availability, you need to set up a load balancer in front of these servers and let it route incoming requests to the cluster. Clients send all requests to the load balancer instead of talking to a specific instance. + http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/install/kylin_docker.md ---------------------------------------------------------------------- diff --git a/website/_docs20/install/kylin_docker.md b/website/_docs20/install/kylin_docker.md new file mode 100644 index 0000000..a0a09eb --- /dev/null +++ b/website/_docs20/install/kylin_docker.md @@ -0,0 +1,10 @@ +--- +layout: docs20 +title: "Run Kylin with Docker" +categories: install +permalink: /docs20/install/kylin_docker.html +version: v1.5.3 +since: v1.5.2 +--- + +Apache Kylin runs as a client of a Hadoop cluster, so it is reasonable to run it within a Docker container; please check [this project](https://github.com/Kyligence/kylin-docker/) on GitHub. 
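The `kylin.server.mode` rule from the cluster-mode page above (at most one instance running the job engine via "all" or "job", every other instance set to "query") is easy to get wrong when several `kylin.properties` files are maintained by hand. The following JavaScript sketch shows one way such a check could be expressed. The function name and the input shape (a map from `host:port` to mode) are assumptions made for illustration; Kylin itself does not ship this helper.

```javascript
// Hypothetical pre-deployment check; not part of Kylin.
// Input: an object mapping each REST server ("host:port") to its kylin.server.mode value.
function checkServerModes(modesByServer) {
  const validModes = ["all", "job", "query"];
  const problems = [];
  const jobEngineServers = [];

  for (const [server, mode] of Object.entries(modesByServer)) {
    if (!validModes.includes(mode)) {
      problems.push(server + ": unknown kylin.server.mode '" + mode + "'");
    } else if (mode !== "query") {
      // "all" and "job" both start the job engine.
      jobEngineServers.push(server);
    }
  }

  if (jobEngineServers.length > 1) {
    problems.push("more than one job engine: " + jobEngineServers.join(", "));
  }
  return problems;
}

// Server names follow the kylin.rest.servers sample (sandbox1:7070,sandbox2:7070).
console.log(checkServerModes({ "sandbox1:7070": "all", "sandbox2:7070": "query" }));
console.log(checkServerModes({ "sandbox1:7070": "all", "sandbox2:7070": "job" }));
```

An empty result means the listed instances satisfy the rule as stated; the sketch deliberately does not flag a cluster with no job engine at all, since the text only forbids having more than one.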
http://git-wip-us.apache.org/repos/asf/kylin/blob/7ea64f38/website/_docs20/install/manual_install_guide.cn.md ---------------------------------------------------------------------- diff --git a/website/_docs20/install/manual_install_guide.cn.md b/website/_docs20/install/manual_install_guide.cn.md new file mode 100644 index 0000000..b369568 --- /dev/null +++ b/website/_docs20/install/manual_install_guide.cn.md @@ -0,0 +1,48 @@ +--- +layout: docs20-cn +title: "Manual Installation Guide" +categories: install +permalink: /cn/docs20/install/manual_install_guide.html +version: v0.7.2 +since: v0.7.1 +--- + +## Introduction + +In most cases, our automated script, described in the [Installation Guide](./index.html), can help you launch Kylin in your Hadoop sandbox or even your Hadoop cluster. However, in case the deployment script fails, we wrote this article as a reference guide to help you solve the problem. + +Basically, this article explains every step in the automated script. We assume that you are already very familiar with operating Hadoop on Linux. + +## Prerequisites +* Tomcat is installed, with CATALINA_HOME exported. +* The Kylin binary package is copied to the local machine and extracted; it is referred to below as $KYLIN_HOME + +## Steps + +### Prepare Jars + +Kylin needs two jar files, both of which are configured in the default kylin.properties: + +``` +kylin.job.jar=/tmp/kylin/kylin-job-latest.jar + +``` + +This is the job jar that Kylin uses for MR jobs. You need to copy $KYLIN_HOME/job/target/kylin-job-latest.jar to /tmp/kylin/ + +``` +kylin.coprocessor.local.jar=/tmp/kylin/kylin-coprocessor-latest.jar + +``` + +This is the HBase coprocessor jar that Kylin deploys to HBase. It is used to improve performance. You need to copy $KYLIN_HOME/storage/target/kylin-coprocessor-latest.jar to /tmp/kylin/ + +### Start Kylin + +Start Kylin with `./kylin.sh start` + +and stop Kylin with `./kylin.sh stop`