wutiangan commented on a change in pull request #3867:
URL: https://github.com/apache/incubator-doris/pull/3867#discussion_r440105076
##########
File path: docs/en/extending-doris/doris-on-es.md
##########
@@ -26,62 +26,314 @@ under the License.

 # Doris On ES

-Doris-On-ES combines Doris's distributed query planning capability with ES (Elastic search)'s full-text search capability to provide a more complete OLAP scenario solution:
+Doris-On-ES not only take advantage of Doris's distributed query planning capability but also ES (Elastic search)'s full-text search capability, provide a more complete OLAP scenario solution:

 1. Multi-index Distributed Join Query in ES
 2. Joint Query of Tables in Doris and ES, More Complex Full-Text Retrieval and Filtering
-3. Aggregated queries for fields of ES keyword type: suitable for frequent changes in index, tens of millions or more of single fragmented documents, and the cardinality of the field is very large

 This document mainly introduces the realization principle and usage of this function.

-## Noun Interpretation
+## Glossary
+
+### Noun in Doris

 * FE: Frontend, the front-end node of Doris. Responsible for metadata management and request access.
 * BE: Backend, Doris's back-end node. Responsible for query execution and data storage.
-* Elastic search (ES): The most popular open source distributed search engine.
+
+### Noun in ES
+
 * DataNode: The data storage and computing node of ES.
 * MasterNode: The Master node of ES, which manages metadata, nodes, data distribution, etc.
 * scroll: The built-in data set cursor feature of ES for streaming scanning and filtering of data.
+* _source: contains the original JSON document body that was passed at index time
+* doc_values: store the same values as the _source but in a column-oriented fashion
+* keyword: string datatype in ES, but the content not analyzed by analyzer
+* text: string datatype in ES, the content analyzed by analyzer

-## How to use it
+## How To Use

-### Create appearance
+### Create ES Index

 ```
-CREATE EXTERNAL TABLE `es_table` (
-  `id` bigint(20) COMMENT "",
+PUT test
+{
+   "settings": {
+      "index": {
+         "number_of_shards": "1",
+         "number_of_replicas": "0"
+      }
+   },
+   "mappings": {
+      "doc": { // ES 7.x版本之后创建索引时不需要指定type,会有一个默认且唯一的`_doc` type
+         "properties": {
+            "k1": {
+               "type": "long"
+            },
+            "k2": {
+               "type": "date"
+            },
+            "k3": {
+               "type": "keyword"
+            },
+            "k4": {
+               "type": "text",
+               "analyzer": "standard"
+            },
+            "k5": {
+               "type": "float"
+            }
+         }
+      }
+   }
+}
+```
+
+### Add JSON documents to ES index
+
+```
+POST /_bulk
+{"index":{"_index":"test","_type":"doc"}}
+{ "k1" : 100, "k2": "2020-01-01", "k3": "Trying out Elasticsearch", "k4": "Trying out Elasticsearch", "k5": 10.0}
+{"index":{"_index":"test","_type":"doc"}}
+{ "k1" : 100, "k2": "2020-01-01", "k3": "Trying out Doris", "k4": "Trying out Doris", "k5": 10.0}
+{"index":{"_index":"test","_type":"doc"}}
+{ "k1" : 100, "k2": "2020-01-01", "k3": "Doris On ES", "k4": "Doris On ES", "k5": 10.0}
+{"index":{"_index":"test","_type":"doc"}}
+{ "k1" : 100, "k2": "2020-01-01", "k3": "Doris", "k4": "Doris", "k5": 10.0}
+{"index":{"_index":"test","_type":"doc"}}
+{ "k1" : 100, "k2": "2020-01-01", "k3": "ES", "k4": "ES", "k5": 10.0}
+```
+
+### Create external ES table
+
+```
+CREATE EXTERNAL TABLE `test` (
+  `k1` bigint(20) COMMENT "",
+  `k2` datetime COMMENT "",
+  `k3` varchar(20) COMMENT "",
+  `k4` varchar(100) COMMENT "",
+  `k5` float COMMENT ""
+) ENGINE=ELASTICSEARCH // ENGINE必须是Elasticsearch
+PROPERTIES (
+"hosts" = "http://192.168.0.1:8200,http://192.168.0.2:8200",
+"index" = "test”,
+"type" = "doc",
+
+"user" = "root",
+"password" = "root"
+);
+```
+
+The following parameters are accepted by ES table:
+
+参数 | 说明

Review comment:
Please change this table header ("参数 | 说明", i.e. "Parameter | Description") to English.