This is an automated email from the ASF dual-hosted git repository. xxyu pushed a commit to branch doc5.0 in repository https://gitbox.apache.org/repos/asf/kylin.git
commit 1ed6d2c7aa877077db7122de5f29474778865820
Author: Mukvin <boyboys...@163.com>
AuthorDate: Tue Aug 30 16:00:52 2022 +0800

    KYLIN-5242 remove useless contents
---
 website/docs/configuration/configuration.md        |  14 -
 website/docs/modeling/model_concepts_operations.md |   2 +-
 website/docs/monitor/job_diagnosis.md              |   4 +-
 .../operations/system-operation/junk_file_clean.md |   1 -
 website/docs/quickstart/quick_start.md             |   2 +-
 website/docs/restapi/streaming_job_api.md          | 282 ---------------------
 website/sidebars.js                                |   4 -
 7 files changed, 4 insertions(+), 305 deletions(-)

diff --git a/website/docs/configuration/configuration.md b/website/docs/configuration/configuration.md
index 3046f38233..edd2b53567 100644
--- a/website/docs/configuration/configuration.md
+++ b/website/docs/configuration/configuration.md
@@ -93,20 +93,6 @@ The file **kylin.properties** occupies some of the most important configurations
 | kylin.query.realization.chooser.thread-max-num | The maximum number of threads in the model matching thread pool in the query engine, the default is 50. It should be noted that when the maximum number of threads is set to be less than or equal to 0 or less than the number of core threads, this thread pool will be unavailable, which will cause the entire query engine to be unavailable |
 | kylin.query.memory-limit-during-collect-mb | Limit the memory usage when getting query result in Kylin,the unit is megabytes, defaults to 5400mb |
 | kylin.query.auto-model-view-enabled | Automatically generate views for model. When the config is on, a view will be generated for each model and user can query on that view. The view will be named with {project_name}.{model_name} and contains all the tables defined in the model and all the columns referenced by the dimension and measure of the table. |
-| kylin.streaming.job.max-concurrent-jobs | Only for Kylin Realtime. Max tasks numbers used to ingesting realtime data and merging segments. |
-| kylin.streaming.kafka-conf.maxOffsetsPerTrigger | Only for Kylin Realtime. Max records numbers of ingesting data at one time. -1 stands for no limitation. |
-| kylin.streaming.job-status-watch-enabled | Only for Kylin Realtime. Whether enabling tasks monitor, "true" stands for enabled and "false" stands for disabled. |
-| kylin.streaming.job-retry-enabled | Only for Kylin Realtime. Whether retrying after tasks failed, "true" stands for enabled and "false" stands for disabled. |
-| kylin.streaming.job-retry-interval | Only for Kylin Realtime. How many minutes the tasks will retry after failed. |
-| kylin.streaming.job-retry-max-interval | Only for Kylin Realtime. How many minutes the interval is when the tasks retry. |
-| kylin.engine.streaming-metrics-enabled | Only for Kylin Realtime. Whether enabling tasks metrics monitor, "true" stands for enabled and "false" stands for disabled. |
-| kylin.engine.streaming-segment-merge-interval | Only for Kylin Realtime. How many seconds the interval is when merging segments. |
-| kylin.engine.streaming-segment-clean-interval | Only for Kylin Realtime. How many hours the time is before which the segments will be cleaned after being merged. |
-| kylin.engine.streaming-segment-merge-ratio | Only for Kylin Realtime. The ratio, which the summary of the segments reach, will trigger merging segments. |
-| kylin.streaming.jobstats.survival-time-threshold | Only for Kylin Realtime. How many days the realtime data statistics keeps. The default value is 7. |
-| kylin.streaming.spark-conf.spark.yarn.queue | Only for Kylin Realtime. The name of the yarn queue which realtime tasks exclusively use. |
-| kylin.streaming.spark-conf.spark.port.maxRetries | Only for Kylin Realtime. The number to retry when the port is occupied. |
-| kylin.streaming.kafka.starting-offsets | Only for Kylin Realtime. The offset from where to consume Kafka message. The default value is 'earliest'. |
 | kylin.storage.columnar.spark-conf.spark.sql.view-truncate-enabled | Allow spark view to lose precision when loading tables and queries, the default value is false |
 | kylin.engine.spark-conf.spark.sql.view-truncate-enabled=true | Allow spark view to lose precision during construction, the default value is false |
 | kylin.source.hive.databases | Configure the database list loaded by the data source. There is no default value. Both the system level and the project level can be configured. The priority of the project level is greater than the system level. |
diff --git a/website/docs/modeling/model_concepts_operations.md b/website/docs/modeling/model_concepts_operations.md
index ed016b0110..334491eed4 100644
--- a/website/docs/modeling/model_concepts_operations.md
+++ b/website/docs/modeling/model_concepts_operations.md
@@ -51,7 +51,7 @@ Model design refers to build the star model or snowflake model based on data tab
  - **Fact Table**: The fact table of this model.
- - **Types**: Model types, which include *Batch Model*, *Streaming Model*, *Fusion Model*
+ - **Types**: Model types, which include *Batch Model*
  - **Usage**: Hit count by SQL statements in the last 30 days. Update every 30 minutes.
diff --git a/website/docs/monitor/job_diagnosis.md b/website/docs/monitor/job_diagnosis.md
index b09b289928..b0d04d4551 100644
--- a/website/docs/monitor/job_diagnosis.md
+++ b/website/docs/monitor/job_diagnosis.md
@@ -21,7 +21,7 @@ last_update:
 ### View Job Execution Log On Web UI
-You can view the job execution log in the ` Monitor -> Batch Job/Streaming Job` interface. As shown below, you can click the **Open Job Steps** button at the position 1, and then click the **Log Output** button in the job details to view the first and the last 100 lines of job execution log in a popup window. You can download the full log by clicking the link **download the full log** at the position 3.
+You can view the job execution log in the ` Monitor -> Batch Job` interface. As shown below, you can click the **Open Job Steps** button at the position 1, and then click the **Log Output** button in the job details to view the first and the last 100 lines of job execution log in a popup window. You can download the full log by clicking the link **download the full log** at the position 3.
 > **Tip**: If there are multiple steps in a job, you can view the execution > log for each step.
@@ -31,7 +31,7 @@ You can view the job execution log in the ` Monitor -> Batch Job/Streaming Job`
 In FusionInsight, you need to execute the command `source /opt/hadoopclient/bigdata_env` first. The `hadoopclient` is a variable.
-You can execute ` $KYLIN_HOME/bin/diag.sh -job <jobid> ` to generate the job diagnostic package and \<jobid\> need to be replaced with the actual job ID number. You can view the job ID number in the ` Monitor -> Batch Job/Streaming Job` interface. You can also click the icon in the position 1 as shown picture below to expand the specified job details that is in the position 2 on the right.
+You can execute ` $KYLIN_HOME/bin/diag.sh -job <jobid> ` to generate the job diagnostic package and \<jobid\> need to be replaced with the actual job ID number. You can view the job ID number in the ` Monitor -> Batch Job` interface. You can also click the icon in the position 1 as shown picture below to expand the specified job details that is in the position 2 on the right.
diff --git a/website/docs/operations/system-operation/junk_file_clean.md b/website/docs/operations/system-operation/junk_file_clean.md
index 8fcfbcdb1b..f25653e8b5 100644
--- a/website/docs/operations/system-operation/junk_file_clean.md
+++ b/website/docs/operations/system-operation/junk_file_clean.md
@@ -44,7 +44,6 @@ The scope of junk file cleanup includes:
  - The total number of query history for all projects. The query history that exceeds this threshold number `kylin.query.queryhistory.project-max-size=10000000` (default) will be cleared.
  - The query history of a single project exceeds this threshold `kylin.query.queryhistory.project-max-size=1000000` (default) The query history will be cleared.
  - The query history time of all projects. The query history that exceeds this threshold `kylin.query.queryhistory.survival-time-threshold=30d` (default 30 days) will be cleared. This configuration also supports units: milliseconds ms, microseconds us, minutes m or min, hours h.
- - Real-time job status/record table. Realtime jobs that exceed this threshold `kylin.streaming.jobstats.survival-time-threshold=7d` (default 7 days) will be cleaned up.
  - Invalid optimization suggestion table data.
  - Expired capacity billing metadata. Capacity billing information that exceeds this threshold `kylin.garbage.storage.sourceusage-survival-time-threshold=90d` (default 90 days) will be cleaned up.
  - Invalid or out-of-date item-related metadata.
diff --git a/website/docs/quickstart/quick_start.md b/website/docs/quickstart/quick_start.md
index 67e137036c..61bdc96bf8 100644
--- a/website/docs/quickstart/quick_start.md
+++ b/website/docs/quickstart/quick_start.md
@@ -209,7 +209,7 @@ On the **Data Asset -> Model** page, you should see an example model with some s
-On the **Monitor** page, you can see all jobs have been completed successfully in **Batch Job** and **Streaming Job** pages.
+On the **Monitor** page, you can see all jobs have been completed successfully in **Batch Job** pages.
diff --git a/website/docs/restapi/streaming_job_api.md b/website/docs/restapi/streaming_job_api.md
deleted file mode 100644
index 3cd23d6686..0000000000
--- a/website/docs/restapi/streaming_job_api.md
+++ /dev/null
@@ -1,282 +0,0 @@
----
-title: Streaming Job API
-language: en
-sidebar_label: Streaming Job API
-pagination_label: Streaming Job API
-toc_min_heading_level: 2
-toc_max_heading_level: 6
-pagination_prev: null
-pagination_next: null
-keywords:
-  - streaming job api
-draft: true
-last_update:
-  date: 08/12/2022
----
-
-> Reminders:
->
-> 1. Please read [Access and Authentication REST API](authentication.md) and understand how authentication works.
-> 2. On Curl command line, don't forget to quote the URL if it contains the special char `&`.
-
-
-### Get Job List
-
-- `GET http://host:port/kylin/api/streaming_jobs`
-
-- Introduced in: 4.5.8.2
-
-- Scenarios
-
-  Obtain detailed information of streaming jobs, and provide operation and maintenance decisions based on information such as task status.
-
-- URL Parameters
-  - `model_name` - `optional` `string`, fuzzy match model name
-
-  - `model_names` - `optional` `array<string>`, exact model name list
-
-  - `job_types` - `optional` `array<string>`, job types, Optional values:`STREAMING_MERGE`,`STREAMING_BUILD`
-
-  - `statuses` - `optional` `array<string>`, job status, Optional values:`STARTING`,`RUNNING`,`STOPPING`,`ERROR` ,`STOPPED`
-
-  - `project` - `optional` `string`, project name
-
-  - `page_offset` - `optional` `int`, offset of returned result, 0 by default
-
-  - `page_size` - `optional` `int`, quantity of returned result per page, 10 by default
-
-  - `sort_by` - `optional` `string`, sort field, optional values: `last_modified` by default,Optional values:`model_alias`、`data_latency`、`last_status_duration`、`last_modified`
-
-  - `reverse` - `optional` `boolean`, whether sort reverse, "true" by default
-
-  - `job_ids` - `optional` `array<string>`, job id list, the parameter `project` is required when `job_ids` is not empty
-
-- HTTP Header
-  - `Accept: application/vnd.apache.kylin-v4-public+json`
-  - `Accept-Language: en`
-  - `Content-Type: application/json;charset=utf-8`
-
-- Curl Request Example
-
-  ```sh
-  curl -X GET \
-    'http://host:port/kylin/api/streaming_jobs' \
-    -H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-    -H 'Accept-Language: en' \
-    -H 'Authorization: Basic QURNSU46S1lMSU4=' \
-    -H 'Content-Type: application/json;charset=utf-8'
-  ```
-
-- Response Details
-
-  - `Code`: `String`, response code. **Returned value**: `000` (request processing success), `999 ` (request processing failed)
-  - `data`: `JSON`, response data.
-    - `value`: details of the response data, which consists of:
-      - `uuid`: `String`, job ID
-      - `last_modified`: `Long`, last modified time of the job.
-      - `version`: `String`, system metadata version.
-      - `mvcc`: `Long`, version number with metadata modified.
-      - `model_alias`: `String`, model alias.
-      - `owner`: `String`, job owner
-      - `model_id`: `String`, Model ID.
-      - `last_start_time`: `Long`, last start time of the job.
-      - `last_end_time`: `Long`, last end time of the job.
-      - `last_update_time`: `Long`, last update time of the job.
-      - `last_batch_count`: `Long`, last number of messages processed of the job.
-      - `subscribe`: `String`, topic name.
-      - `fact_table`: `String`, fact table.
-      - `job_status`: `String`, job status, Possible values:`STARTING`,`RUNNING`,`STOPPING`,`ERROR` ,`STOPPED`.
-      - `job_type`: `String`, job types, Possible values:`STREAMING_MERGE`,`STREAMING_BUILD`.
-      - `process_id`: `String`, the process ID of the job.
-      - `node_info`: `String`, the information of the machine that run the job.
-      - `job_execution_id`: `String`, the execution ID of the job.
-      - `yarn_app_id`: `String`, the ApplicationId on yarn.
-      - `yarn_app_url`: `String`, the Application url on yarn.
-      - `project`: `String`, project name.
-      - `skip_listener`: `Boolean`,whether skip to use listener。
-      - `action`: `String`, the action that job is located.
-      - `model_broken`: `Boolean`, whether the model is broken or not.
-      - `data_latency`: `Long`, the minimum latency of data(ms).
-      - `last_status_duration`: `Long`, the last status change time(ms).
-      - `model_indexes`: `Int`, total number of indexes.
-      - `launching_error`: `Boolean`, whether launching error.
-      - `params`: `JSON`, the building parameters of the job.
-      - `partition_desc`: `JSON`, partition description.
-    - `offset`: page number
-    - `limit`: jobs listed in each page
-    - `total_size`: total number of jobs
-  - `msg`: `String`: error message
-
-- Response Example
-
-  ```json
-  {
-    "code": "000",
-    "data": {
-      "value": [
-        {
-          "uuid": "7bccf62d-535c-70b8-8271-eaef3985aa96_merge",
-          "last_modified": 1645186414513,
-          "create_time": 1644823384864,
-          "version": "4.0.0.0",
-          "mvcc": -1,
-          "model_alias": "hy_model",
-          "owner": "ADMIN",
-          "model_id": "7bccf62d-535c-70b8-8271-eaef3985aa96",
-          "last_start_time": null,
-          "last_end_time": null,
-          "last_update_time": "2022-02-18 19:21:27",
-          "last_batch_count": null,
-          "subscribe": null,
-          "fact_table": "BASE.HY_LINEORDER",
-          "job_status": "ERROR",
-          "job_type": "STREAMING_MERGE",
-          "process_id": null,
-          "node_info": null,
-          "job_execution_id": null,
-          "yarn_app_id": "",
-          "yarn_app_url": "",
-          "params": {
-            "spark.executor.memory": "1g",
-            "kylin.streaming.segment-max-size": "32m",
-            "spark.master": "yarn",
-            "spark.driver.memory": "512m",
-            "spark.executor.cores": "2",
-            "kylin.streaming.job-retry-enabled": "false",
-            "spark.executor.instances": "2",
-            "kylin.streaming.segment-merge-threshold": "3",
-            "spark.sql.shuffle.partitions": "8"
-          },
-          "project": "p1",
-          "skip_listener": false,
-          "action": null,
-          "model_broken": false,
-          "data_latency": null,
-          "last_status_duration": 240562073,
-          "model_indexes": 3,
-          "launching_error": false,
-          "partition_desc": {
-            "partition_date_column": "HY_LINEORDER.LO_PARTITIONCOLUMN",
-            "partition_date_start": 0,
-            "partition_date_format": "yyyy-MM-dd HH:mm:ss",
-            "partition_type": "APPEND",
-            "partition_condition_builder": "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
-          }
-        },
-        {
-          "uuid": "7bccf62d-535c-70b8-8271-eaef3985aa96_build",
-          "last_modified": 1645186414089,
-          "create_time": 1644823384857,
-          "version": "4.0.0.0",
-          "mvcc": -1,
-          "model_alias": "hy_model",
-          "owner": "ADMIN",
-          "model_id": "7bccf62d-535c-70b8-8271-eaef3985aa96",
-          "last_start_time": null,
-          "last_end_time": null,
-          "last_update_time": "2022-02-18 17:43:27",
-          "last_batch_count": null,
-          "subscribe": null,
-          "fact_table": "BASE.HY_LINEORDER",
-          "job_status": "ERROR",
-          "job_type": "STREAMING_BUILD",
-          "process_id": null,
-          "node_info": null,
-          "job_execution_id": null,
-          "yarn_app_id": "application_1643095564973_0592",
-          "yarn_app_url": "http://10.1.2.210:8088/proxy/application_1643095564973_0592/",
-          "params": {
-            "spark.executor.memory": "1g",
-            "spark.master": "yarn",
-            "spark.driver.memory": "512m",
-            "kylin.streaming.kafka-conf.maxOffsetsPerTrigger": "0",
-            "kylin.streaming.duration": "30",
-            "spark.executor.cores": "2",
-            "kylin.streaming.job-retry-enabled": "false",
-            "spark.executor.instances": "2",
-            "spark.sql.shuffle.partitions": "8"
-          },
-          "project": "p1",
-          "skip_listener": false,
-          "action": null,
-          "model_broken": false,
-          "data_latency": 0,
-          "last_status_duration": 246442660,
-          "model_indexes": 3,
-          "launching_error": false,
-          "partition_desc": {
-            "partition_date_column": "HY_LINEORDER.LO_PARTITIONCOLUMN",
-            "partition_date_start": 0,
-            "partition_date_format": "yyyy-MM-dd HH:mm:ss",
-            "partition_type": "APPEND",
-            "partition_condition_builder": "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
-          }
-        }
-      ],
-      "offset": 0,
-      "limit": 10,
-      "total_size": 2
-    },
-    "msg": ""
-  }
-  ```
-
-
-
-### Operate Job
-
-- `PUT http://host:port/kylin/api/streaming_jobs/status`
-
-- Introduced in: 4.5.8.2
-
-- Scenarios
-
-  Modify the status of the jobs. For example, you can restart job task after finding that the job is abnormal.
-
-- URL Parameters
-  - `action` - `required` `string`, action types for jobs. Optional values are below:
-    - `START`,start selected jobs
-    - `STOP`, stop selected jobs
-    - `FORCE_STOP`, force to stop the selected jobs
-    - `RESTART`, restart selected jobs
-
-  - `project` - `optional` `string`, project name.
-
-  - `job_ids` - `required` `array<string>`, job id.
-
-- HTTP Header
-  - `Accept: application/vnd.apache.kylin-v4-public+json`
-  - `Accept-Language: en`
-  - `Content-Type: application/json;charset=utf-8`
-
-- Curl Request Example
-
-  ```sh
-  curl --location --request PUT 'http://host:port/kylin/api/streaming_jobs/status' \
-  -H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-  -H 'Accept-Language: en' \
-  -H 'Authorization: Basic QURNSU46S1lMSU4=' \
-  -H 'Content-Type: application/json;charset=utf-8' \
-  -d '{
-      "project": "p1",
-      "action": "RESTART",
-      "job_ids": [
-          "7bccf62d-535c-70b8-8271-eaef3985aa96_merge"
-      ]
-  }'
-  ```
-
-- Response Details
-  - `Code`: `String`, response code. **Returned value**: `000` (request processing success), `999 ` (request processing failed)
-  - `data`: `String`, response data, always be empty.
-  - `msg`:`String`: error message
-
-- Response Example
-
-  ```json
-  {
-    "code":"000",
-    "data":"",
-    "msg":""
-  }
-  ```
diff --git a/website/sidebars.js b/website/sidebars.js
index 6c699efdbc..c0e380101d 100644
--- a/website/sidebars.js
+++ b/website/sidebars.js
@@ -806,10 +806,6 @@ const sidebars = {
                 },
             ],
         },
-        {
-            type: 'doc',
-            id: 'restapi/streaming_job_api'
-        },
         {
             type: 'doc',
             id: 'restapi/callback_api'