worker24h opened a new issue #3124: Load json data into Doris by json-path
URL: https://github.com/apache/incubator-doris/issues/3124
 
 
   **Requirement description**
   I would like Doris to support JSON data, that is, loading JSON data into a table via routine load or stream load.
   
   **My proposal**
   Routine load:
   1. Create the `books` table
       CREATE TABLE `books`(
       `category` varchar(32), 
       `author` varchar(32), 
       `title` varchar(32), 
       `dt` int COMMENT 'Daily partition, format YYYYMMDD',
       `price` int
       ) ENGINE=OLAP
       AGGREGATE KEY(`category`,`author`,`title`)
       PARTITION BY RANGE(`dt`) (
           PARTITION p0 VALUES less than ("20190101"),
           PARTITION p20190101 VALUES less than ("20190102"),
           PARTITION p20190102 VALUES less than ("20190103")
       )
       DISTRIBUTED BY HASH(`category`, `author`, `title`) BUCKETS 32
       PROPERTIES ("storage_type"="column");
   
   2. Create the routine load job
       CREATE ROUTINE LOAD example_db.books_label1 ON books
       COLUMNS(category, author, title, dt, price),
       PROPERTIES
       (
           **"format" = "json",**
           "desired_concurrent_number"="3",
           "max_batch_interval" = "20",
           "max_batch_rows" = "300000",
           "max_batch_size" = "209715200",
           "strict_mode" = "false",
           "timezone" = "Africa/Abidjan"
       )
       FROM KAFKA
       (
           "kafka_broker_list" = "broker1:9092,broker2:9092,broker3:9092",
           "kafka_topic" = "my_topic",
           "property.security.protocol" = "ssl",
           "property.ssl.ca.location" = "FILE:ca.pem",
           "property.ssl.certificate.location" = "FILE:client.pem",
           "property.ssl.key.location" = "FILE:client.key",
           "property.ssl.key.password" = "abcdefg",
           "property.client.id" = "my_client_id"
       );
   
   3. Kafka JSON data
   {
       **"jsonpath": [
           {"key": "author",   "type": "string",   "value": "$.store.book.author"},
           {"key": "category", "type": "string",   "value": "$.store.book.category"},
           {"key": "price",    "type": "float",    "value": "$.store.book.price"},
           {"key": "title",    "type": "string",   "value": "$.store.book.title"},
           {"key": "dt",       "type": "integer",  "value": "$.date"}
       ],**
       "userdata": [
           {
               "store": {
                   "book": [
                       {"category": "reference", "author": "NigelRees", "title": "SayingsoftheCentury", "price": 8.95},
                       {"category": "fiction", "author": "EvelynWaugh", "title": "SwordofHonour", "price": 12.99}
                   ],
                   "bicycle": {"color": "red", "price": 19.95}
               },
               "expensive": 10,
               "date": 20190202
           }
       ]
   }
   
   PS:
   1) The format must be specified explicitly as "json".
   2) Each JSON message in Kafka must contain a `jsonpath` object.
   3) `userdata` is an array that can hold many application data objects.
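
   To make the proposed message layout concrete, here is a minimal Python sketch of how a loader might apply the `jsonpath` entries to each object in `userdata`. It is not Doris code. The helper names `resolve` and `extract_rows`, the rule that arrays along a path fan out into one row per element, and the rule that a single-valued path such as `$.date` is repeated on every row are all assumptions made for illustration.

```python
import json

# Hypothetical mapping from the proposed "type" names to Python casts.
CASTS = {"string": str, "float": float, "integer": int}


def resolve(node, path):
    """Resolve a simple dotted path such as "$.store.book.author".

    Only plain key steps are supported (no filters, no indexes). Lists met
    along the way are fanned out, so the result is always a list of matches.
    """
    keys = path.lstrip("$").strip(".").split(".")
    nodes = [node]
    for key in keys:
        matches = []
        for current in nodes:
            items = current if isinstance(current, list) else [current]
            for item in items:
                if isinstance(item, dict) and key in item:
                    matches.append(item[key])
        nodes = matches
    return nodes


def extract_rows(message):
    """Turn one message into a list of row dicts keyed by column name."""
    rows = []
    for doc in message["userdata"]:
        columns = {}
        for spec in message["jsonpath"]:
            cast = CASTS[spec["type"]]
            columns[spec["key"]] = [cast(v) for v in resolve(doc, spec["value"])]
        row_count = max((len(v) for v in columns.values()), default=0)
        for i in range(row_count):
            row = {}
            for key, values in columns.items():
                if len(values) == 1:
                    row[key] = values[0]      # assumption: broadcast scalars
                elif i < len(values):
                    row[key] = values[i]
                else:
                    row[key] = None           # assumption: missing value is NULL
            rows.append(row)
    return rows


message = json.loads("""
{
  "jsonpath": [
    {"key": "author",   "type": "string",  "value": "$.store.book.author"},
    {"key": "category", "type": "string",  "value": "$.store.book.category"},
    {"key": "price",    "type": "float",   "value": "$.store.book.price"},
    {"key": "title",    "type": "string",  "value": "$.store.book.title"},
    {"key": "dt",       "type": "integer", "value": "$.date"}
  ],
  "userdata": [
    {
      "store": {
        "book": [
          {"category": "reference", "author": "NigelRees",
           "title": "SayingsoftheCentury", "price": 8.95},
          {"category": "fiction", "author": "EvelynWaugh",
           "title": "SwordofHonour", "price": 12.99}
        ],
        "bicycle": {"color": "red", "price": 19.95}
      },
      "expensive": 10,
      "date": 20190202
    }
  ]
}
""")

rows = extract_rows(message)
for row in rows:
    print(row)
```

   Under these assumptions the sample message yields two rows, one per book, and both carry `dt` = 20190202 taken from `$.date`.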
   
   **Do you have any other suggestions?**
