Hello guys.

Still investigating with tcpdump. I don't see many packets with the SYN flag
when the listenQueue is full.
What I see instead is a lot of packets with the "PSH, ACK" flags carrying
data like this :
getListing.org.apache.hadoop.hdfs.protocol.ClientProtocol
/apps/hive/warehouse/<mydb>.db/<mytable>/<mypartition>

It makes me wonder: when a client performs an hdfs dfs -ls -R <HDFS_PATH>,
how many SYN packets does it send to the namenode ? One in total, or one per
subfolder ?
Let's say I have "n" subfolders inside <HDFS_PATH>. Will we have this
situation :
- Client sends one SYN packet to Namenode
- Namenode sends one SYN-ACK packet to client
- Client sends n ACK or (PSH, ACK) packets to Namenode

Or this situation :
- Client sends n SYN packets to Namenode
- Namenode sends n SYN-ACK packets to client
- Client sends n ACK or (PSH, ACK) packets to Namenode

If the latter, it would mean an HDFS recursive listing on a path with a lot
of subfolders could harm other clients by sending too many packets to the
namenode ?
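From what I understand of the Hadoop IPC client (to be confirmed by the capture), it caches one TCP connection per namenode/protocol/user and multiplexes all calls over it, so the first situation (one SYN handshake, then n getListing calls as "PSH, ACK" packets) should be the expected one. Here is a sketch of how I plan to verify it on the client side by counting fresh SYNs; port 8020 and the pcap file name are assumptions from my setup, to be adjusted:

```shell
#!/bin/sh
# Count fresh connection attempts (SYN set, ACK clear) towards the namenode
# RPC port while the recursive listing runs. Port 8020 and the pcap file
# name are assumptions from my setup; adjust them to yours.
NN_PORT=8020
FILTER="tcp[tcpflags] & tcp-syn != 0 and tcp[tcpflags] & tcp-ack == 0 and dst port ${NN_PORT}"
echo "capture filter: ${FILTER}"

# As root on the client, while 'hdfs dfs -ls -R <HDFS_PATH>' is running:
#   tcpdump -nn -i any -w ls-syn.pcap "${FILTER}"
# Afterwards, one line per handshake attempt:
#   tcpdump -nn -r ls-syn.pcap | wc -l
```

If the count stays at 1 while the listing runs, connection reuse is confirmed and the flood of "PSH, ACK" packets is just the RPC traffic of one connection.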

About the jstack: I tried it on the namenode JVM, but it triggered a
failover, as the namenode stopped answering entirely (in particular, no
answer to the ZKFC), and the jstack never finished; I had to kill it.
I don't know whether a kill -3 or a jstack -F could help, but jstack -F
output contains less valuable information.
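For the record, here is the order in which I understand the thread-dump options, roughly from least to most intrusive (the pid file path is hypothetical and depends on the distribution; safepoint behaviour is my understanding, to be double-checked):

```shell
#!/bin/sh
# Thread-dump fallbacks for a hung JVM, least to most intrusive.
# NN_PID is a hypothetical pid lookup; adjust the path to your layout.
# NN_PID=$(cat /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid)

# 1) SIGQUIT: handled inside the JVM itself, the dump goes to the process
#    stdout (the namenode's .out file). It still needs the VM to reach a
#    safepoint, so it can stay silent if the VM is truly stuck.
#   kill -3 "${NN_PID}"

# 2) jcmd goes through the same attach mechanism as plain jstack, so it
#    may hang for the same reason jstack did:
#   jcmd "${NN_PID}" Thread.print

# 3) jstack -F reads the process memory from outside (no safepoint needed),
#    but it pauses the JVM while it runs and its output is sparser.
#   jstack -F "${NN_PID}"

ORDER="kill -3, jcmd Thread.print, jstack -F"
echo "order tried: ${ORDER}"
```

Since kill -3 is in-process and asynchronous, it seems worth trying before the external tools, even if it may hit the same safepoint stall.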

T@le

On Tue, Feb 22, 2022 at 10:29 AM, Amith sha <[email protected]> wrote:

> If TCP error occurs then you need to check the network metrics. Yes, TCP
> DUMP can help you.
>
>
> Thanks & Regards
> Amithsha
>
>
> On Tue, Feb 22, 2022 at 1:29 PM Tale Hive <[email protected]> wrote:
>
>> Hello !
>>
>> @Amith sha <[email protected]>
>> I also checked the system metrics; nothing wrong in CPU, RAM or IO.
>> The only thing I found was these TCP errors (ListenDrop).
>>
>> @HK
>> I'm monitoring a lot of JVM metrics like this one :
>> "UnderReplicatedBlocks" in the bean
>> "Hadoop:service=NameNode,name=FSNamesystem".
>> And I found no under-replicated blocks when the timeout problem
>> occurs, unfortunately.
>> Thanks for your advice; in addition to the tcpdump, I'll perform some
>> jstacks to see what the IPC handlers are doing.
>>
>> Best regards.
>>
>> T@le
>>
>>
>>
>>
>>
>>
>> On Tue, Feb 22, 2022 at 4:30 AM, HK <[email protected]> wrote:
>>
>>> Hi Tale,
>>> Could you please take a thread dump of the namenode process, and check
>>> what the IPC handlers are doing?
>>>
>>> We faced a similar issue when under-replication was high in the cluster,
>>> due to the filesystem writeLock.
>>>
>>> On Tue, 22 Feb 2022, 8:37 am Amith sha, <[email protected]> wrote:
>>>
>>>> Check your system metrics too.
>>>>
>>>> On Mon, Feb 21, 2022, 10:52 PM Tale Hive <[email protected]> wrote:
>>>>
>>>>> Yeah, next step is for me to perform a tcpdump just when the problem
>>>>> occurs.
>>>>> I want to know if my namenode does not accept connections because it
>>>>> freezes for some reason, or because there are too many connections at a
>>>>> time.
>>>>>
>>>>> My delay is far worse than 2 s; sometimes an hdfs dfs -ls -d
>>>>> /user/<my-user> takes 20 s or 43 s, and rarely it is even longer than 1 minute.
>>>>> And during this time, CallQueue is OK, Heap is OK, I don't find any
>>>>> metrics which could show me a problem inside the namenode JVM.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> T@le
>>>>>
>>>>> On Mon, Feb 21, 2022 at 4:32 PM, Amith sha <[email protected]> wrote:
>>>>>
>>>>>> If you are still concerned about the delay of > 2 s, then you need to
>>>>>> benchmark with and without load. It will help to find the root cause of
>>>>>> the problem.
>>>>>>
>>>>>> On Mon, Feb 21, 2022, 1:52 PM Tale Hive <[email protected]> wrote:
>>>>>>
>>>>>>> Hello Amith.
>>>>>>>
>>>>>>> Hm, not a bad idea. If I increase the size of the listenQueue and
>>>>>>> also increase the timeout, the combination of both may mitigate the
>>>>>>> problem more than just increasing the listenQueue size.
>>>>>>> It won't solve the problem of acceptance speed, but it could help.
>>>>>>>
>>>>>>> Thanks for the suggestion !
>>>>>>>
>>>>>>> T@le
>>>>>>>
>>>>>>> On Mon, Feb 21, 2022 at 2:33 AM, Amith sha <[email protected]> wrote:
>>>>>>>
>>>>>>>> org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout
>>>>>>>> while waiting for channel to be ready for connect.
>>>>>>>> The connection timed out after 20000 ms; I suspect this value is very
>>>>>>>> low for a namenode with 75 GB of heap usage. Can you increase the
>>>>>>>> value to 5 s and check the connection? To increase the value, modify
>>>>>>>> the property ipc.client.rpc-timeout.ms in core-site.xml (if not
>>>>>>>> present, add it to core-site.xml).
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks & Regards
>>>>>>>> Amithsha
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Feb 18, 2022 at 9:17 PM Tale Hive <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello Tom.
>>>>>>>>>
>>>>>>>>> Sorry for my lack of answers; I don't know why Gmail puts your
>>>>>>>>> mail into spam -_-.
>>>>>>>>>
>>>>>>>>> To answer you :
>>>>>>>>>
>>>>>>>>>    - The metrics callQueueLength, avgQueueTime, avgProcessingTime
>>>>>>>>>    and the GC metrics are all OK
>>>>>>>>>    - Threads are plenty sufficient (I can see the metrics for them
>>>>>>>>>    too, and I am below 200, the number I have for the 8020 RPC
>>>>>>>>>    server)
>>>>>>>>>
>>>>>>>>> Did you see my other answers about this problem ?
>>>>>>>>> I would be interested to have your opinion about that !
>>>>>>>>>
>>>>>>>>> Best regards.
>>>>>>>>>
>>>>>>>>> T@le
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Feb 15, 2022 at 2:16 AM, tom lee <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> It might be helpful to analyze namenode metrics and logs.
>>>>>>>>>>
>>>>>>>>>> What about some key metrics? Examples are callQueueLength,
>>>>>>>>>> avgQueueTime, avgProcessingTime and GC metrics.
>>>>>>>>>>
>>>>>>>>>> In addition, is the number of
>>>>>>>>>> threads(dfs.namenode.service.handler.count) in the namenode 
>>>>>>>>>> sufficient?
>>>>>>>>>>
>>>>>>>>>> Hopefully this will help.
>>>>>>>>>>
>>>>>>>>>> Best regards.
>>>>>>>>>> Tom
>>>>>>>>>>
>>>>>>>>>> Tale Hive <[email protected]> wrote on Mon, Feb 14, 2022 at 23:57:
>>>>>>>>>>
>>>>>>>>>>> Hello.
>>>>>>>>>>>
>>>>>>>>>>> I encounter a strange problem with my namenode. I have the
>>>>>>>>>>> following architecture :
>>>>>>>>>>> - Two namenodes in HA
>>>>>>>>>>> - 600 datanodes
>>>>>>>>>>> - HDP 3.1.4
>>>>>>>>>>> - 150 million files and folders
>>>>>>>>>>>
>>>>>>>>>>> Sometimes, when I query the namenode with the hdfs client, I got
>>>>>>>>>>> a timeout error like this :
>>>>>>>>>>> hdfs dfs -ls -d /user/myuser
>>>>>>>>>>>
>>>>>>>>>>> 22/02/14 15:07:44 INFO retry.RetryInvocationHandler:
>>>>>>>>>>> org.apache.hadoop.net.ConnectTimeoutException: Call From
>>>>>>>>>>> <my-client-hostname>/<my-client-ip> to 
>>>>>>>>>>> <active-namenode-hostname>:8020
>>>>>>>>>>> failed on socket timeout exception:
>>>>>>>>>>>   org.apache.hadoop.net.ConnectTimeoutException: 20000 millis
>>>>>>>>>>> timeout while waiting for channel to be ready for connect. ch :
>>>>>>>>>>> java.nio.channels.SocketChannel[connection-pending
>>>>>>>>>>> remote=<active-namenode-hostname>/<active-namenode-ip>:8020];
>>>>>>>>>>>   For more details see:
>>>>>>>>>>> http://wiki.apache.org/hadoop/SocketTimeout,
>>>>>>>>>>> while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo
>>>>>>>>>>> over <active-namenode-hostname>/<active-namenode-ip>:8020 after 2 
>>>>>>>>>>> failover
>>>>>>>>>>> attempts. Trying to failover after sleeping for 2694ms.
>>>>>>>>>>>
>>>>>>>>>>> I checked the heap of the namenode and there is no problem (I
>>>>>>>>>>> have 75 GB of max heap, with around 50 GB used).
>>>>>>>>>>> I checked the client RPC threads of the namenode and I'm at 200,
>>>>>>>>>>> which respects the recommendations from the Hadoop Operations
>>>>>>>>>>> book.
>>>>>>>>>>> I have serviceRPC enabled to prevent any problem which could be
>>>>>>>>>>> coming from datanodes or ZKFC.
>>>>>>>>>>> General resources seem OK: CPU usage is fine, same for memory,
>>>>>>>>>>> network and IO.
>>>>>>>>>>> No firewall is enabled on my namenodes nor my client.
>>>>>>>>>>>
>>>>>>>>>>> I was wondering what could cause this problem, please ?
>>>>>>>>>>>
>>>>>>>>>>> Thank you in advance for your help !
>>>>>>>>>>>
>>>>>>>>>>> Best regards.
>>>>>>>>>>>
>>>>>>>>>>> T@le
>>>>>>>>>>>
>>>>>>>>>>
