We have a Hive job whose sink table is in ORC format, and it always throws the exception below during the reduce phase.
The same job runs successfully when we change the sink table from ORC format to text format.
Error: java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
processing row
{"_col0":42553624,"_col1":"v42553624","_col2":"","_col3":"","_col4":"","_col5":"","_col6":"xxx","_col7":"223.166.151.236","_col8":"xx#xx#xxx","_col9":"223.166.151.236","_col10":"user/avatar/17/7a/177a1c20630391292ec4a0cad6a9a93d.jpg","_col11":1,"_col12":0.0,"_col13":0,"_col14":"","_col15":"","_col16":"","_col17":"","_col18":"","_col19":0,"_col20":"","_col21":1,"_col22":3,"_col23":1,"_col24":null,"_col25":"","_col26":0,"_col27":0,"_col28":0,"_col29":"","_col30":"","_col31":0,"_col32":0,"_col33":"2020-01-17
19:58:29.0","_col34":"2020-01-17
19:58:31.0","_col35":null,"_col36":0,"_col37":0,"_col38":0,"_col39":0,"_col40":"","_col41":"","_col42":0,"_col43":1,"_col44":0,"_col45":0,"_col46":0,"_col47":"","_col48":41281955}
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
03-07-2020 02:37:17 CST dwd_user INFO - at
java.security.AccessController.doPrivileged(Native Method)
03-07-2020 02:37:17 CST dwd_user INFO - at
javax.security.auth.Subject.doAs(Subject.java:422)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
03-07-2020 02:37:17 CST dwd_user INFO - Caused by:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
processing row
{"_col0":42553624,"_col1":"v42553624","_col2":"","_col3":"","_col4":"","_col5":"","_col6":"xx","_col7":"223.166.151.236","_col8":"xx#xxxx","_col9":"223.166.151.236","_col10":"user/avatar/17/7a/177a1c20630391292ec4a0cad6a9a93d.jpg","_col11":1,"_col12":0.0,"_col13":0,"_col14":"","_col15":"","_col16":"","_col17":"","_col18":"","_col19":0,"_col20":"","_col21":1,"_col22":3,"_col23":1,"_col24":null,"_col25":"","_col26":0,"_col27":0,"_col28":0,"_col29":"","_col30":"","_col31":0,"_col32":0,"_col33":"2020-01-17
19:58:29.0","_col34":"2020-01-17
19:58:31.0","_col35":null,"_col36":0,"_col37":0,"_col38":0,"_col39":0,"_col40":"","_col41":"","_col42":0,"_col43":1,"_col44":0,"_col45":0,"_col46":0,"_col47":"","_col48":41281955}
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
03-07-2020 02:37:17 CST dwd_user INFO - ... 8 more
03-07-2020 02:37:17 CST dwd_user INFO - Caused by:
org.apache.hadoop.hive.ql.metadata.HiveException:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/tmp/staging/.hive-staging_hive_2020-07-03_02-31-36_242_6951178611473682186-1/_task_tmp.-ext-10002/_tmp.000013_3
could only be replicated to 0 nodes instead of minReplication (=1).
There are 9 datanode(s) running and no node(s) are excluded in this operation.
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1620)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3135)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3059)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:493)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
03-07-2020 02:37:17 CST dwd_user INFO - at
java.security.AccessController.doPrivileged(Native Method)
03-07-2020 02:37:17 CST dwd_user INFO - at
javax.security.auth.Subject.doAs(Subject.java:422)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)
03-07-2020 02:37:17 CST dwd_user INFO -
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:787)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.UnionOperator.process(UnionOperator.java:137)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
03-07-2020 02:37:17 CST dwd_user INFO - ... 9 more
03-07-2020 02:37:17 CST dwd_user INFO - Caused by:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/tmp/staging/.hive-staging_hive_2020-07-03_02-31-36_242_6951178611473682186-1/_task_tmp.-ext-10002/_tmp.000013_3
could only be replicated to 0 nodes instead of minReplication (=1).
There are 9 datanode(s) running and no node(s) are excluded in this operation.
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1620)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3135)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3059)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:493)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
03-07-2020 02:37:17 CST dwd_user INFO - at
java.security.AccessController.doPrivileged(Native Method)
03-07-2020 02:37:17 CST dwd_user INFO - at
javax.security.auth.Subject.doAs(Subject.java:422)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)
03-07-2020 02:37:17 CST dwd_user INFO -
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.Client.call(Client.java:1476)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.Client.call(Client.java:1413)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
03-07-2020 02:37:17 CST dwd_user INFO - at
com.sun.proxy.$Proxy14.addBlock(Unknown Source)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
03-07-2020 02:37:17 CST dwd_user INFO - at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
03-07-2020 02:37:17 CST dwd_user INFO - at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
03-07-2020 02:37:17 CST dwd_user INFO - at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
03-07-2020 02:37:17 CST dwd_user INFO - at
java.lang.reflect.Method.invoke(Method.java:498)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
03-07-2020 02:37:17 CST dwd_user INFO - at
com.sun.proxy.$Proxy15.addBlock(Unknown Source)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1603)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1388)
03-07-2020 02:37:17 CST dwd_user INFO - at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:554)
03-07-2020 02:37:17 CST dwd_user INFO -
03-07-2020 02:37:17 CST dwd_user INFO -
03-07-2020 02:37:17 CST dwd_user INFO - FAILED: Execution Error, return code 2
from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
03-07-2020 02:37:17 CST dwd_user INFO - MapReduce Jobs Launched:
03-07-2020 02:37:17 CST dwd_user INFO - Stage-Stage-9: Map: 14 Reduce:
9 Cumulative CPU: 2748.28 sec HDFS Read: 2220278689
HDFS Write: 10765048344 SUCCESS
03-07-2020 02:37:17 CST dwd_user INFO - Stage-Stage-2: Map: 45
Cumulative CPU: 1220.73 sec HDFS Read: 5947612504 HDFS Write:
1200295370 FAIL
03-07-2020 02:37:17 CST dwd_user INFO - Total MapReduce CPU Time Spent: 0 days
1 hours 6 minutes 9 seconds 10 msec
03-07-2020 02:37:18 CST dwd_user INFO - ??????????2
03-07-2020 02:37:18 CST dwd_user INFO - Process completed unsuccessfully in 347
seconds.
03-07-2020 02:37:18 CST dwd_user ERROR - Job run failed!
java.lang.RuntimeException:
azkaban.jobExecutor.utils.process.ProcessFailureException
at azkaban.jobExecutor.ProcessJob.run(ProcessJob.java:167)
at azkaban.execapp.JobRunner.runJob(JobRunner.java:693)
at azkaban.execapp.JobRunner.run(JobRunner.java:545)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: azkaban.jobExecutor.utils.process.ProcessFailureException
at
azkaban.jobExecutor.utils.process.AzkabanProcess.run(AzkabanProcess.java:131)
at azkaban.jobExecutor.ProcessJob.run(ProcessJob.java:161)
... 7 more
03-07-2020 02:37:18 CST dwd_user ERROR -
azkaban.jobExecutor.utils.process.ProcessFailureException cause:
azkaban.jobExecutor.utils.process.ProcessFailureException
03-07-2020 02:37:18 CST dwd_user INFO - Finishing job dwd_user attempt: 2 at
1593715038086 with status FAILED
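Given the "could only be replicated to 0 nodes instead of minReplication (=1)" error while 9 datanodes are reported live, we gathered basic cluster state with the standard HDFS CLI. These are the commands we ran (a sketch; the data-directory path is the one from our datanode log below):

```shell
# Live datanodes, total/used/remaining capacity per node
hdfs dfsadmin -report

# Free disk under the datanode data directory (run on each datanode host)
df -h /var/hadoop/datanode/data

# Current DataXceiver / transfer-thread cap configured on the datanodes
hdfs getconf -confKey dfs.datanode.max.transfer.threads
```

The report showed no dead or excluded nodes, which is why the error is confusing to us.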
In addition, the datanodes are now logging many exceptions:
2020-07-03 12:40:40,788 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
DataNode{data=FSDataset{dirpath='[/var/hadoop/datanode/data/current]'},
localName='live-hadoop11-online-ali-bjf.vhouhn.com:50010',
datanodeUuid='48d7541d-610a-4a94-8ea9-32a41079a3c8',
xmitsInProgress=0}:Exception transfering block
BP-832853851-192.168.90.217-1556178711773:blk_4285397204_3211656451 to mirror
192.168.90.212:50010: java.io.EOFException: Premature EOF: no length prefix
available
2020-07-03 12:40:40,788 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
live-hadoop11-online-ali-bjf.vhouhn.com:50010:DataXceiver error processing
WRITE_BLOCK operation src: /192.168.90.43:46780 dst: /192.168.90.43:50010
java.io.EOFException: Premature EOF: no length prefix available
at
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2294)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:749)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
at
java.lang.Thread.run(Thread.java:745)
2020-07-03 12:40:42,313 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
Slow flushOrSync took 400ms (threshold=300ms), isSync:false,
flushTotalNanos=399253831ns
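The "Premature EOF: no length prefix available" errors during WRITE_BLOCK, together with the slow flushOrSync warnings, make us suspect the datanodes are running out of transfer threads (DataXceivers) under the ORC write load, rather than a problem in the query itself. This is the hdfs-site.xml change we are considering on each datanode (the value is just an example we picked, not a confirmed fix; the Hadoop 2.x default is 4096):

```xml
<!-- hdfs-site.xml on each datanode: raise the DataXceiver thread cap.
     Example value only; requires a datanode restart to take effect. -->
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>8192</value>
</property>
```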
Can anyone tell me how to solve this? Thanks, all.
Best regards,
Eason