mxli441 opened a new issue, #26024: URL: https://github.com/apache/doris/issues/26024
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

- doris: apache-doris-2.0-beta-bin-x64-noavx2
- spark: spark-3.5.0-bin-hadoop3
- hadoop: hadoop-3.3.5

### What's Wrong?

```sql
CREATE EXTERNAL RESOURCE "spark_yarn"
PROPERTIES
(
    "type" = "spark",
    "spark.master" = "yarn",
    "spark.submit.deployMode" = "cluster",
    "spark.executor.memory" = "3g",
    "spark.hadoop.yarn.resourcemanager.address" = "hdfs://hostname:8032",
    "spark.hadoop.fs.defaultFS" = "hdfs://hostname:9000",
    "working_dir" = "hdfs://hostname:9000/spark_load",
    "broker" = "broker_name"
);

LOAD LABEL god.label2023102719
(
    DATA INFILE("hdfs://hostname:9000/spark_load/customer.csv")
    INTO TABLE customer
    COLUMNS TERMINATED BY "|"
    (c_custkey, c_name, c_address, c_city, c_nation, c_region, c_phone, c_mktsegment)
)
WITH RESOURCE 'spark_yarn'
(
    "spark.executor.memory" = "2g",
    "spark.shuffle.compress" = "true"
)
PROPERTIES
(
    "timeout" = "3600"
);
```
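For reference, before resubmitting the load it can help to confirm how the resource was actually registered. A minimal sketch (only `SHOW RESOURCES` is strictly needed here since the job runs as root; `some_user` is a placeholder for a non-admin account that would additionally need usage privileges on the resource):

```sql
-- Inspect the registered Spark resource and the properties it was created with
SHOW RESOURCES WHERE NAME = "spark_yarn";

-- Only required when the load is submitted by a non-admin user
GRANT USAGE_PRIV ON RESOURCE "spark_yarn" TO "some_user"@"%";
```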
```
mysql> show load order by createtime desc limit 1\G
*************************** 1. row ***************************
         JobId: 179014
         Label: label23
         State: CANCELLED
      Progress: Unknown id: 179014
          Type: SPARK
       EtlInfo: NULL
      TaskInfo: cluster:spark_yarn; timeout(s):3600; max_filter_ratio:0.0
      ErrorMsg: type:ETL_RUN_FAIL; msg:errCode = 2, detailMessage = spark etl job failed. msg: spark app state: FAILED
    CreateTime: 2023-10-27 16:26:01
  EtlStartTime: 2023-10-27 16:26:17
 EtlFinishTime: NULL
 LoadStartTime: NULL
LoadFinishTime: 2023-10-27 16:27:56
           URL: http://hostname:8088/proxy/application_1698388341294_0013/
    JobDetails: {"Unfinished backends":{},"ScannedRows":0,"TaskNumber":0,"LoadBytes":0,"All backends":{},"FileNumber":0,"FileSize":0}
 TransactionId: 159014
  ErrorTablets: {}
          User: root
       Comment:
1 row in set (0.01 sec)
```

fe.warn.log:

```
2023-10-27 16:09:52,115 WARN (Load etl checker|45) [LoadJob.unprotectedExecuteCancel():659] LOAD_JOB=179012, transaction_id={159013}, error_msg={Failed to execute load with error: errCode = 2, detailMessage = spark etl job failed. msg: spark app state: FAILED}
2023-10-27 16:27:56,841 WARN (Load etl checker|45) [LoadManager.lambda$processEtlStateJobs$10():442] update load job etl status failed. job id: 179014
org.apache.doris.common.LoadException: errCode = 2, detailMessage = spark etl job failed. msg: spark app state: FAILED
	at org.apache.doris.load.loadv2.SparkLoadJob.updateEtlStatus(SparkLoadJob.java:309) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.load.loadv2.LoadManager.lambda$processEtlStateJobs$10(LoadManager.java:436) ~[doris-fe.jar:1.2-SNAPSHOT]
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) ~[?:1.8.0_131]
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) ~[?:1.8.0_131]
	at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566) ~[?:1.8.0_131]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[?:1.8.0_131]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[?:1.8.0_131]
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151) ~[?:1.8.0_131]
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174) ~[?:1.8.0_131]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_131]
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) ~[?:1.8.0_131]
	at org.apache.doris.load.loadv2.LoadManager.processEtlStateJobs(LoadManager.java:434) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.load.loadv2.LoadEtlChecker.runAfterCatalogReady(LoadEtlChecker.java:43) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.common.util.Daemon.run(Daemon.java:116) ~[doris-fe.jar:1.2-SNAPSHOT]
2023-10-27 16:27:56,842 WARN (Load etl checker|45) [LoadJob.unprotectedExecuteCancel():659] LOAD_JOB=179014, transaction_id={159014}, error_msg={Failed to execute load with error: errCode = 2, detailMessage = spark etl job failed. msg: spark app state: FAILED}
```

### What You Expected?

The task should run successfully and the data should be loaded into my table.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
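As a closing note for anyone re-checking the job after a retry: the load can also be looked up directly by the label used in the LOAD statement instead of ordering by create time. A minimal sketch (assuming the database `god` and label `label2023102719` from the statement above):

```sql
-- Look up the Spark load job by its label in the `god` database
SHOW LOAD FROM god WHERE LABEL = "label2023102719"\G
```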