[ https://issues.apache.org/jira/browse/HBASE-28951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17894937#comment-17894937 ]
Duo Zhang commented on HBASE-28951:
-----------------------------------

In general the problem is that the 'dead' RS is not really dead and can
still do something. On the normal write path we have a fencing mechanism:
we rename the WAL directory and call recoverLease on all the WAL files, so
the dead RS can no longer write to the WAL and therefore cannot accept
writes. Here we should also find a way to fence the dead RS so that it
cannot continue the split, and then we can safely dispatch the split task
to another RS.

> WAL Split Delays Due to Concurrent WAL Splitting During worker RS Abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-28951
>                 URL: https://issues.apache.org/jira/browse/HBASE-28951
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.5.8
>            Reporter: Umesh Kumar Kumawat
>            Priority: Major
>
> When a worker RS gets aborted after the SplitWALRemoteProcedure got
> dispatched, RegionServerTracker takes care of it and [aborts the pending
> Operation|https://github.com/apache/hbase/blob/rel/2.5.8/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RSProcedureDispatcher.java#L160]
> on the aborting region server as part of
> [expireServer|https://github.com/apache/hbase/blob/rel/2.5.8/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionServerTracker.java#L172].
>
> This does help the parent procedure, SplitWALProcedure, choose another
> worker RS, but the aborting RS is still splitting the WAL. While creating
> the recovered edits, both RSs then try to write the same file. Whichever RS
> starts on the file later deletes the file created earlier, which causes
> failures on the other RS.
> h4. Logs -
> Region server tracker marking the remote procedure failed
> {code:java}
> 2024-10-01 23:02:32,274 WARN [RegionServerTracker-0] procedure.SplitWALRemoteProcedure - Sent hdfs://hbase1a/hbase/WALs/regionserver-33.regionserver.hbase.<cluster>,XXXXX,1727362162836-splitting/regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172 to wrong server regionserver-283.regionserver.hbase.<cluster>,XXXXX,1727420096936, try another
> org.apache.hadoop.hbase.DoNotRetryIOException: server not online regionserver-283.regionserver.hbase.<cluster>,XXXXX,1727420096936
>     at org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher.abortPendingOperations(RSProcedureDispatcher.java:163)
>     at org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher.abortPendingOperations(RSProcedureDispatcher.java:61)
>     at org.apache.hadoop.hbase.procedure2.RemoteProcedureDispatcher$BufferNode.abortOperationsInQueue(RemoteProcedureDispatcher.java:417)
>     at org.apache.hadoop.hbase.procedure2.RemoteProcedureDispatcher.removeNode(RemoteProcedureDispatcher.java:201)
>     at org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher.serverRemoved(RSProcedureDispatcher.java:176)
>     at org.apache.hadoop.hbase.master.ServerManager.lambda$expireServer$2(ServerManager.java:576)
>     at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
>     at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
>     at org.apache.hadoop.hbase.master.ServerManager.expireServer(ServerManager.java:576)
>     at org.apache.hadoop.hbase.master.ServerManager.expireServer(ServerManager.java:530)
>     at org.apache.hadoop.hbase.master.RegionServerTracker.processAsActiveMaster(RegionServerTracker.java:172)
>     at org.apache.hadoop.hbase.master.RegionServerTracker.refresh(RegionServerTracker.java:206)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750){code}
> {code:java}
> 2024-10-01 23:02:32,340 INFO [PEWorker-21] procedure2.ProcedureExecutor - Finished pid=122448609, ppid=122448595, state=SUCCESS; SplitWALRemoteProcedure regionserver-33.regionserver.hbase.<cluster>,XXXXX%2C1727362162836.1727822221172, worker=regionserver-283.regionserver.hbase.<cluster>,XXXXX,1727420096936 in 54.0500 sec{code}
> The parent SplitWALProcedure will create another remote procedure for this
> {code:java}
> 2024-10-01 23:02:32,726 WARN [PEWorker-17] procedure.SplitWALProcedure - Failed to split wal hdfs://hbase1a/hbase/WALs/regionserver-33.regionserver.hbase.<cluster>,XXXXX,1727362162836-splitting/regionserver-33.regionserver.hbase.<cluster>,XXXXX%2C1727362162836.1727822221172 by server regionserver-283.regionserver.hbase.<cluster>,XXXXX,1727420096936, retry...{code}
> {code:java}
> 2024-10-01 23:02:39,414 INFO [PEWorker-28] procedure2.ProcedureExecutor - Initialized subprocedures=[{pid=122452821, ppid=122448595, state=RUNNABLE; SplitWALRemoteProcedure regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172, worker=regionserver-323.regionserver.hbase.<cluster>,XXXXX,1727308912906}]{code}
> Splitting is still in progress on the dying RS
> {code:java}
> 2024-10-01 23:02:45,652 INFO [G_REPLAY_OPS-regionserver/regionserver-283:XXXXX-0] wal.WALSplitter - Splitting hdfs://hbase1a/hbase/WALs/regionserver-33.regionserver.hbase.<cluster>,XXXXX,1727362162836-splitting/regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172, size=128.1 M (134313407bytes){code}
> rs-323 creating recovered edits
> {code:java}
> 2024-10-01 23:02:42,876 INFO [OPS-regionserver/regionserver-323:XXXXX-5-Writer-2] monitor.StreamSlowMonitor - New stream slow monitor 0000000000007468971-regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172.temp{code}
> {code:java}
> 2024-10-01 23:02:43,171 INFO [OPS-regionserver/regionserver-323:XXXXX-5-Writer-2] wal.RecoveredEditsOutputSink - Creating recovered edits writer path=hdfs://hbase1a/hbase/data/default/SEARCH.REPLAY_ID_BATCH_INDEX_START_INDEX/d3be13a8187ff35746fff1def4f4dba4/recovered.edits/0000000000007468971-regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172.temp{code}
> rs-283 deletes the above file and creates it again
> {code:java}
> 2024-10-01 23:02:50,520 WARN [OPS-regionserver/regionserver-283:XXXXX-0-Writer-2] wal.RecoveredEditsOutputSink - Found old edits file. It could be the result of a previous failed split attempt. Deleting hdfs://hbase1a/hbase/data/default/SEARCH.REPLAY_ID_BATCH_INDEX_START_INDEX/d3be13a8187ff35746fff1def4f4dba4/recovered.edits/0000000000007468971-regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172.temp, length=0{code}
> {code:java}
> 2024-10-01 23:02:50,794 INFO [OPS-regionserver/regionserver-283:XXXXX-0-Writer-2] monitor.StreamSlowMonitor - New stream slow monitor 0000000000007468971-regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172.temp{code}
> {code:java}
> 2024-10-01 23:02:51,135 INFO [OPS-regionserver/regionserver-283:XXXXX-0-Writer-2] wal.RecoveredEditsOutputSink - Creating recovered edits writer path=hdfs://hbase1a/hbase/data/default/SEARCH.REPLAY_ID_BATCH_INDEX_START_INDEX/d3be13a8187ff35746fff1def4f4dba4/recovered.edits/0000000000007468971-regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172.temp{code}
> Now rs-323 starts failing
> {code:java}
> 2024-10-01 23:03:02,137 WARN [Thread-1081409] hdfs.DataStreamer - DataStreamer Exception
> java.io.FileNotFoundException: File does not exist: /hbase/data/default/SEARCH.REPLAY_ID_BATCH_INDEX_START_INDEX/d3be13a8187ff35746fff1def4f4dba4/recovered.edits/0000000000007468971-regionserver-33.regionserver.hbase.hbase1a.hbase.core2.aws-prod5-uswest2.aws.sfdc.is%2C60020%2C1727362162836.1727822221172.temp (inode 1440741238) [Lease. Holder: DFSClient_NONMAPREDUCE_-2039838105_1, pending creates: 21]
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3103)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:610)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2977)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:912)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:595)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:618)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1105)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1028)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3060)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
>     at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
>     at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1091)
>     at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1939)
>     at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForCreate(DataStreamer.java:1734)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /hbase/data/default/SEARCH.REPLAY_ID_BATCH_INDEX_START_INDEX/d3be13a8187ff35746fff1def4f4dba4/recovered.edits/0000000000007468971-regionserver-33.regionserver.hbase.hbase1a.hbase.core2.aws-prod5-uswest2.aws.sfdc.is%2C60020%2C1727362162836.1727822221172.temp (inode 1440741238) [Lease. Holder: DFSClient_NONMAPREDUCE_-2039838105_1, pending creates: 21]
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3103)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:610)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2977)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:912)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:595)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:618)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1105)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1028)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3060)
>     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1567)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1513)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
>     at com.sun.proxy.$Proxy18.addBlock(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.lambda$addBlock$11(ClientNamenodeProtocolTranslatorPB.java:495)
>     at org.apache.hadoop.ipc.internal.ShadedProtobufHelper.ipc(ShadedProtobufHelper.java:160)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:495)
>     at sun.reflect.GeneratedMethodAccessor247.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>     at com.sun.proxy.$Proxy19.addBlock(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor247.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:361)
>     at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor247.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:361)
>     at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1088)
>     ... 3 more
> {code}
> {code:java}
> 2024-10-01 23:03:02,143 ERROR [split-log-closeStream-pool-1] wal.RecoveredEditsOutputSink - Could not close recovered edits at hdfs://hbase1a/hbase/data/default/SEARCH.REPLAY_ID_BATCH_INDEX_START_INDEX/d3be13a8187ff35746fff1def4f4dba4/recovered.edits/0000000000007468971-regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172.temp
> java.io.FileNotFoundException: File does not exist: /hbase/data/default/SEARCH.REPLAY_ID_BATCH_INDEX_START_INDEX/d3be13a8187ff35746fff1def4f4dba4/recovered.edits/0000000000007468971-regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172.temp (inode 1440741238) [Lease. Holder: DFSClient_NONMAPREDUCE_-2039838105_1, pending creates: 21]
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3103)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:610)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2977)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:912)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:595)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:618)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1105)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1028)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3060)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
>     at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
>     at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1091)
>     at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1939)
>     at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForCreate(DataStreamer.java:1734)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /hbase/data/default/SEARCH.REPLAY_ID_BATCH_INDEX_START_INDEX/d3be13a8187ff35746fff1def4f4dba4/recovered.edits/0000000000007468971-regionserver-33.regionserver.hbase.<cluster>%2CXXXXX%2C1727362162836.1727822221172.temp (inode 1440741238) [Lease. Holder: DFSClient_NONMAPREDUCE_-2039838105_1, pending creates: 21]
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3103)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:610)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2977)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:912)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:595)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:618)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1105)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1028)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3060)
>     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1567)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1513)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
>     at com.sun.proxy.$Proxy18.addBlock(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.lambda$addBlock$11(ClientNamenodeProtocolTranslatorPB.java:495)
>     at org.apache.hadoop.ipc.internal.ShadedProtobufHelper.ipc(ShadedProtobufHelper.java:160)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:495)
>     at sun.reflect.GeneratedMethodAccessor247.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>     at com.sun.proxy.$Proxy19.addBlock(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor247.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:361)
>     at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor247.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:361)
>     at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1088)
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
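
For readers following the fencing discussion in the comment, here is a minimal sketch of one generic way to fence a stale worker: an attempt number that is bumped on every re-dispatch. It is not HBase code and not a proposed patch. `SplitFencingSketch`, `SplitTaskFence`, and `tryWriteRecoveredEdits` are hypothetical names, and the in-memory check only stands in for a guard that a real system would have to enforce atomically at the storage layer (much as recoverLease does for WAL files).

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of attempt-based fencing for one WAL split task. */
public class SplitFencingSketch {

    /** Tracks the newest attempt number handed out for a split task. */
    static final class SplitTaskFence {
        private final AtomicLong latestAttempt = new AtomicLong(0);

        /** Called on every (re)dispatch; all older attempts become stale. */
        long newAttempt() {
            return latestAttempt.incrementAndGet();
        }

        /** True only for the most recently dispatched attempt. */
        boolean isCurrent(long attempt) {
            return latestAttempt.get() == attempt;
        }
    }

    /**
     * Stand-in for "create or delete the recovered edits file".
     * Assumption: a real guard must be checked atomically with the write itself.
     */
    static boolean tryWriteRecoveredEdits(SplitTaskFence fence, long attempt) {
        return fence.isCurrent(attempt);
    }

    public static void main(String[] args) {
        SplitTaskFence fence = new SplitTaskFence();
        long attemptOnRs283 = fence.newAttempt(); // first dispatch
        long attemptOnRs323 = fence.newAttempt(); // re-dispatch after rs-283 expired

        System.out.println("rs-283 may write: " + tryWriteRecoveredEdits(fence, attemptOnRs283));
        System.out.println("rs-323 may write: " + tryWriteRecoveredEdits(fence, attemptOnRs323));
    }
}
```

With such a guard the late writer from the logs (rs-283) would be refused instead of deleting the `.temp` file that rs-323 had just created. Where the attempt number would live and how it would be checked are left open here.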