wyb commented on issue #3010: Spark load etl interface
URL: 
https://github.com/apache/incubator-doris/issues/3010#issuecomment-592402743
 
 
   @wangbo 
   
   1. General configs can be set when adding a Spark cluster, and the user can set different configs in the load statement, which override the general ones when the load job is submitted (see the sketch after the SQL example below).
   
   ```sql
   LOAD LABEL db_name.label_name 
   (
     DATA INFILE "/tmp/file1" INTO TABLE t1 ...,
     -- DATA FROM TABLE hive_table ...
   )
   WITH CLUSTER spark.cluster_name
   (spark_param_key=spark_param_value, ...)
   [PROPERTIES ("bitmap_data" = "hive_db.table", key1=value1, ... )]
   ```
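   
   Point 1 implies a simple merge of the two config sources. The following is a minimal sketch of that override behaviour, assuming nothing beyond what is stated above; the class and method names are hypothetical and not part of the proposal.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   
   // Sketch of the override semantics from point 1 (illustrative names only).
   public class SparkLoadConfig {
       // clusterDefaults: general configs saved when the Spark cluster was added
       // loadStmtConfigs: spark_param_key/value pairs from the load statement
       public static Map<String, String> effectiveConfigs(Map<String, String> clusterDefaults,
                                                          Map<String, String> loadStmtConfigs) {
           Map<String, String> merged = new HashMap<>(clusterDefaults);
           merged.putAll(loadStmtConfigs); // keys from the load statement win
           return merged;
       }
   }
   ```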
   
   2. In the initial implementation, the user needs to resubmit the whole job if it fails.
   
   3. After the ETL stage completes, the FE will get the ETL file paths and schedule push tasks to the relevant BE tablets (a rough usage sketch follows the signature below).
   ```java
   // Returns a map keyed by ETL output file path; the Long value is assumed to be the file size in bytes.
   public Map<String, Long> getEtlFilePaths(String outputPath)
   ```
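   
   As a rough illustration of how the FE could consume this map, the sketch below iterates over the entries and dispatches one push task per file. The Map<String, Long> shape comes from the signature above; the value is assumed to be the file size, and all helper names are hypothetical rather than actual Doris classes.
   
   ```java
   import java.util.Map;
   
   // Hypothetical sketch: schedule one push task per ETL output file.
   public class EtlPushDispatcher {
       public void schedulePushTasks(Map<String, Long> etlFilePaths) {
           for (Map.Entry<String, Long> entry : etlFilePaths.entrySet()) {
               String filePath = entry.getKey();
               long fileSize = entry.getValue();
               // Assumed: the tablet id can be derived from the ETL output path layout.
               long tabletId = resolveTabletId(filePath);
               // Assumed: a push task tells the BE hosting this tablet to fetch the file.
               sendPushTask(tabletId, filePath, fileSize);
           }
       }
   
       private long resolveTabletId(String filePath) {
           return 0L; // hypothetical: parse the tablet id out of the path
       }
   
       private void sendPushTask(long tabletId, String filePath, long fileSize) {
           // hypothetical: enqueue a push task for the relevant BE
       }
   }
   ```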
