Hello guys.

Still investigating with tcpdump. I don't see many packets with the SYN flag when the listenQueue is full. What I do see is a lot of packets with the "PSH, ACK" flags, carrying data like this:

getListing.org.apache.hadoop.hdfs.protocol.ClientProtocol /apps/hive/warehouse/<mydb>.db/<mytable>/<mypartition>
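To put numbers on this, I could count the TCP flag combinations in the capture. A minimal sketch in Python (assuming tcpdump's default text output format; the sample lines below are illustrative, not taken from my actual capture):

```python
import re
from collections import Counter

# tcpdump's shorthand for TCP flag combinations -> readable names.
FLAG_NAMES = {
    "S": "SYN",
    "S.": "SYN-ACK",
    ".": "ACK",
    "P.": "PSH-ACK",
    "F.": "FIN-ACK",
    "R": "RST",
}

def count_flags(lines):
    """Count TCP flag combinations in tcpdump text-output lines."""
    counts = Counter()
    for line in lines:
        # Default tcpdump output prints flags as e.g. "Flags [P.]".
        m = re.search(r"Flags \[([^\]]+)\]", line)
        if m:
            counts[FLAG_NAMES.get(m.group(1), m.group(1))] += 1
    return counts

# Hypothetical lines in tcpdump's default format (hosts are placeholders).
sample = [
    "10:00:00.000001 IP client.51000 > namenode.8020: Flags [S], seq 1, win 64240, length 0",
    "10:00:00.000002 IP namenode.8020 > client.51000: Flags [S.], seq 2, ack 2, win 65160, length 0",
    "10:00:00.000003 IP client.51000 > namenode.8020: Flags [P.], seq 1:101, ack 2, length 100",
    "10:00:00.000004 IP client.51000 > namenode.8020: Flags [P.], seq 101:201, ack 2, length 100",
]

print(count_flags(sample))
```

Run over a real capture (e.g. piping `tcpdump -r capture.pcap port 8020` into it), this would show whether SYNs grow with the number of RPC calls or stay flat, i.e. whether connections are reused.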
It makes me wonder: when a client performs an hdfs dfs -ls -R <HDFS_PATH>, how many SYN packets does it send to the namenode? One in total, or one per subfolder? Let's say I have "n" subfolders inside <HDFS_PATH>. Will we have this situation:
- Client sends one SYN packet to the namenode
- Namenode sends one SYN-ACK packet to the client
- Client sends n ACK or (PSH, ACK) packets to the namenode
Or this one:
- Client sends n SYN packets to the namenode
- Namenode sends n SYN-ACK packets to the client
- Client sends n ACK or (PSH, ACK) packets
The second case would mean that a recursive HDFS listing on a path with many subfolders could harm other clients by sending too many packets to the namenode?

About the jstack: I tried it on the namenode JVM, but it triggered a failover, as the namenode stopped answering entirely (in particular, no answer to the ZKFC), and the jstack never finished; I had to kill it. I don't know whether a kill -3 or a jstack -F would help, though jstack -F at least contains less valuable information.

T@le

On Tue, Feb 22, 2022 at 10:29 AM, Amith sha <[email protected]> wrote:

> If a TCP error occurs, then you need to check the network metrics. Yes, a
> TCP dump can help you.
>
> Thanks & Regards
> Amithsha
>
> On Tue, Feb 22, 2022 at 1:29 PM Tale Hive <[email protected]> wrote:
>
>> Hello!
>>
>> @Amith sha <[email protected]>
>> I also checked the system metrics; nothing wrong in CPU, RAM, or IO.
>> The only thing I found was these TCP errors (ListenDrops).
>>
>> @HK
>> I'm monitoring a lot of JVM metrics, such as "UnderReplicatedBlocks" in
>> the bean "Hadoop:service=NameNode,name=FSNamesystem", and unfortunately
>> I found no under-replicated blocks when the timeout problem occurs.
>> Thanks for your advice; in addition to the tcpdump, I'll perform some
>> jstacks to see if I can find out what the IPC handlers are doing.
>>
>> Best regards.
>>
>> T@le
>>
>> On Tue, Feb 22, 2022 at 4:30 AM, HK <[email protected]> wrote:
>>
>>> Hi Tale,
>>> Could you please take a thread dump of the namenode process and check
>>> what the IPC handlers are doing?
>>>
>>> We faced a similar issue when under-replication was high in the
>>> cluster, due to the filesystem writeLock.
>>>
>>> On Tue, 22 Feb 2022, 8:37 am Amith sha, <[email protected]> wrote:
>>>
>>>> Check your system metrics too.
>>>>
>>>> On Mon, Feb 21, 2022, 10:52 PM Tale Hive <[email protected]> wrote:
>>>>
>>>>> Yeah, the next step for me is to perform a tcpdump right when the
>>>>> problem occurs. I want to know whether my namenode stops accepting
>>>>> connections because it freezes for some reason, or because there are
>>>>> too many connections at a time.
>>>>>
>>>>> My delay is far worse than 2 s; sometimes an hdfs dfs -ls -d
>>>>> /user/<my-user> takes 20 s or 43 s, and occasionally it is even
>>>>> longer than 1 minute. And during this time the CallQueue is OK, the
>>>>> heap is OK; I can't find any metric that would show me a problem
>>>>> inside the namenode JVM.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> T@le
>>>>>
>>>>> On Mon, Feb 21, 2022 at 4:32 PM, Amith sha <[email protected]> wrote:
>>>>>
>>>>>> If you are still concerned about the delay of > 2 s, then you need
>>>>>> to benchmark with and without load. It will help find the root
>>>>>> cause of the problem.
>>>>>>
>>>>>> On Mon, Feb 21, 2022, 1:52 PM Tale Hive <[email protected]> wrote:
>>>>>>
>>>>>>> Hello Amith.
>>>>>>>
>>>>>>> Hm, not a bad idea. If I increase the size of the listenQueue and
>>>>>>> also increase the timeout, the combination of both may mitigate
>>>>>>> the problem better than just increasing the listenQueue size.
>>>>>>> It won't solve the problem of acceptance speed, but it could help.
>>>>>>>
>>>>>>> Thanks for the suggestion!
>>>>>>>
>>>>>>> T@le
>>>>>>>
>>>>>>> On Mon, Feb 21, 2022 at 2:33 AM, Amith sha <[email protected]> wrote:
>>>>>>>
>>>>>>>> org.apache.hadoop.net.ConnectTimeoutException: 20000 millis
>>>>>>>> timeout while waiting for channel to be ready for connect.
>>>>>>>> The connection timed out after 20000 ms; I suspect this value is
>>>>>>>> very low for a namenode with 75 GB of heap usage. Can you
>>>>>>>> increase the value to 5 sec and check the connection? To increase
>>>>>>>> the value, modify this property:
>>>>>>>> ipc.client.rpc-timeout.ms - core-site.xml (if not present, then
>>>>>>>> add it to core-site.xml)
>>>>>>>>
>>>>>>>> Thanks & Regards
>>>>>>>> Amithsha
>>>>>>>>
>>>>>>>> On Fri, Feb 18, 2022 at 9:17 PM Tale Hive <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello Tom.
>>>>>>>>>
>>>>>>>>> Sorry for my lack of answers; I don't know why Gmail puts your
>>>>>>>>> mail into spam -_-.
>>>>>>>>>
>>>>>>>>> To answer you:
>>>>>>>>>
>>>>>>>>> - The metrics callQueueLength, avgQueueTime, avgProcessingTime,
>>>>>>>>> and the GC metrics are all OK
>>>>>>>>> - Threads are plenty sufficient (I can see the metrics for them
>>>>>>>>> too, and I am below 200, the number I have for the 8020 RPC
>>>>>>>>> server)
>>>>>>>>>
>>>>>>>>> Did you see my other answers about this problem?
>>>>>>>>> I would be interested to have your opinion on them!
>>>>>>>>>
>>>>>>>>> Best regards.
>>>>>>>>>
>>>>>>>>> T@le
>>>>>>>>>
>>>>>>>>> On Tue, Feb 15, 2022 at 2:16 AM, tom lee <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> It might be helpful to analyze the namenode metrics and logs.
>>>>>>>>>>
>>>>>>>>>> What about some key metrics? Examples are callQueueLength,
>>>>>>>>>> avgQueueTime, avgProcessingTime, and the GC metrics.
>>>>>>>>>>
>>>>>>>>>> In addition, is the number of threads
>>>>>>>>>> (dfs.namenode.service.handler.count) in the namenode
>>>>>>>>>> sufficient?
>>>>>>>>>>
>>>>>>>>>> Hopefully this will help.
>>>>>>>>>>
>>>>>>>>>> Best regards.
>>>>>>>>>> Tom
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 14, 2022 at 23:57, Tale Hive <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello.
>>>>>>>>>>>
>>>>>>>>>>> I encounter a strange problem with my namenode. I have the
>>>>>>>>>>> following architecture:
>>>>>>>>>>> - Two namenodes in HA
>>>>>>>>>>> - 600 datanodes
>>>>>>>>>>> - HDP 3.1.4
>>>>>>>>>>> - 150 million files and folders
>>>>>>>>>>>
>>>>>>>>>>> Sometimes, when I query the namenode with the HDFS client, I
>>>>>>>>>>> get a timeout error like this:
>>>>>>>>>>> hdfs dfs -ls -d /user/myuser
>>>>>>>>>>>
>>>>>>>>>>> 22/02/14 15:07:44 INFO retry.RetryInvocationHandler:
>>>>>>>>>>> org.apache.hadoop.net.ConnectTimeoutException: Call From
>>>>>>>>>>> <my-client-hostname>/<my-client-ip> to
>>>>>>>>>>> <active-namenode-hostname>:8020 failed on socket timeout
>>>>>>>>>>> exception: org.apache.hadoop.net.ConnectTimeoutException:
>>>>>>>>>>> 20000 millis timeout while waiting for channel to be ready for
>>>>>>>>>>> connect. ch : java.nio.channels.SocketChannel[connection-pending
>>>>>>>>>>> remote=<active-namenode-hostname>/<active-namenode-ip>:8020];
>>>>>>>>>>> For more details see:
>>>>>>>>>>> http://wiki.apache.org/hadoop/SocketTimeout,
>>>>>>>>>>> while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo
>>>>>>>>>>> over <active-namenode-hostname>/<active-namenode-ip>:8020
>>>>>>>>>>> after 2 failover attempts. Trying to failover after sleeping
>>>>>>>>>>> for 2694ms.
>>>>>>>>>>>
>>>>>>>>>>> I checked the heap of the namenode and there is no problem (I
>>>>>>>>>>> have 75 GB of max heap, and I'm around 50 GB used).
>>>>>>>>>>> I checked the client RPC threads of the namenode, and I'm at
>>>>>>>>>>> 200, which respects the recommendations from the Hadoop
>>>>>>>>>>> Operations book.
>>>>>>>>>>> I have service RPC enabled to prevent any problem that could
>>>>>>>>>>> come from the datanodes or the ZKFC.
>>>>>>>>>>> General resources seem OK: CPU usage is fine, and the same
>>>>>>>>>>> goes for memory, network, and IO.
>>>>>>>>>>> No firewall is enabled on my namenodes or my client.
>>>>>>>>>>>
>>>>>>>>>>> I was wondering what could cause this problem, please?
>>>>>>>>>>>
>>>>>>>>>>> Thank you in advance for your help!
>>>>>>>>>>>
>>>>>>>>>>> Best regards.
>>>>>>>>>>>
>>>>>>>>>>> T@le
