If you are still concerned about the delay of > 2 s, then you should run a
benchmark with and without load. It will help you find the root cause of
the problem.
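A minimal sketch of such a micro-benchmark, assuming bash with GNU `date` on the client host; the `hdfs dfs -ls -d /user/myuser` command is the one from the thread below, and the run count is arbitrary:

```shell
# Time N invocations of a command and print the per-run latency in ms,
# so idle-cluster and loaded-cluster numbers can be compared side by side.
bench() {
  local n=$1; shift
  local i start end
  for i in $(seq "$n"); do
    start=$(date +%s%N)            # nanoseconds since epoch (GNU date)
    "$@" > /dev/null 2>&1
    end=$(date +%s%N)
    echo "run $i: $(( (end - start) / 1000000 )) ms"
  done
}

# Example: run once during idle hours and once under load, then compare:
# bench 10 hdfs dfs -ls -d /user/myuser
```

Running it once on an idle cluster and once under load, then comparing the per-run latencies, should show whether the > 2 s delay correlates with load.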

On Mon, Feb 21, 2022, 1:52 PM Tale Hive <[email protected]> wrote:

> Hello Amith.
>
> Hm, not a bad idea. If I increase both the listenQueue size and the
> timeout, the combination may mitigate the problem more than increasing the
> listenQueue size alone.
> It won't solve the problem of acceptance speed, but it could help.
>
> Thanks for the suggestion!
>
> T@le
>
> On Mon, Feb 21, 2022 at 02:33, Amith sha <[email protected]> wrote:
>
>> org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while
>> waiting for channel to be ready for connect.
>> The connection timed out after 20000 ms; I suspect this value is very low
>> for a namenode with 75 GB of heap usage. Can you increase the value to
>> 5 sec and check the connection? To increase it, modify the property
>> ipc.client.rpc-timeout.ms in core-site.xml (if it is not present, add it).
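>> A sketch of how this could look in core-site.xml; the 60000 ms value
>> below is only an illustration (something above the 20000 ms seen in the
>> stack trace), not a recommendation from this thread:
>>
>> ```xml
>> <!-- Illustrative value only: client RPC timeout in milliseconds. -->
>> <property>
>>   <name>ipc.client.rpc-timeout.ms</name>
>>   <value>60000</value>
>> </property>
>> ```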
>>
>>
>> Thanks & Regards
>> Amithsha
>>
>>
>> On Fri, Feb 18, 2022 at 9:17 PM Tale Hive <[email protected]> wrote:
>>
>>> Hello Tom.
>>>
>>> Sorry for the lack of answers; I don't know why Gmail puts your mail
>>> into spam -_-.
>>>
>>> To answer you:
>>>
>>>    - The metrics callQueueLength, avgQueueTime, avgProcessingTime and the
>>>    GC metrics are all OK
>>>    - Threads are plenty sufficient (I can see their metrics too, and I am
>>>    below 200, the number I have for the 8020 RPC server)
>>>
>>> Did you see my other answers about this problem?
>>> I would be interested in your opinion!
>>>
>>> Best regards.
>>>
>>> T@le
>>>
>>>
>>> On Tue, Feb 15, 2022 at 02:16, tom lee <[email protected]> wrote:
>>>
>>>> It might be helpful to analyze namenode metrics and logs.
>>>>
>>>> What about some key metrics? Examples are callQueueLength,
>>>> avgQueueTime, avgProcessingTime and GC metrics.
>>>>
>>>> In addition, is the number of namenode threads
>>>> (dfs.namenode.service.handler.count) sufficient?
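>>>> As a sketch, that property is set in hdfs-site.xml; the value 200 below
>>>> is purely illustrative:
>>>>
>>>> ```xml
>>>> <!-- Illustrative value only: handler threads for the NameNode service
>>>>      RPC server; tune to cluster size and load. -->
>>>> <property>
>>>>   <name>dfs.namenode.service.handler.count</name>
>>>>   <value>200</value>
>>>> </property>
>>>> ```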
>>>>
>>>> Hopefully this will help.
>>>>
>>>> Best regards.
>>>> Tom
>>>>
>>>> On Mon, Feb 14, 2022 at 23:57, Tale Hive <[email protected]> wrote:
>>>>
>>>>> Hello.
>>>>>
>>>>> I am encountering a strange problem with my namenode. I have the
>>>>> following architecture:
>>>>> - Two namenodes in HA
>>>>> - 600 datanodes
>>>>> - HDP 3.1.4
>>>>> - 150 million files and folders
>>>>>
>>>>> Sometimes, when I query the namenode with the hdfs client, I get a
>>>>> timeout error like this:
>>>>> hdfs dfs -ls -d /user/myuser
>>>>>
>>>>> 22/02/14 15:07:44 INFO retry.RetryInvocationHandler:
>>>>> org.apache.hadoop.net.ConnectTimeoutException: Call From
>>>>> <my-client-hostname>/<my-client-ip> to <active-namenode-hostname>:8020
>>>>> failed on socket timeout exception:
>>>>>   org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout
>>>>> while waiting for channel to be ready for connect. ch :
>>>>> java.nio.channels.SocketChannel[connection-pending
>>>>> remote=<active-namenode-hostname>/<active-namenode-ip>:8020];
>>>>>   For more details see:  http://wiki.apache.org/hadoop/SocketTimeout,
>>>>> while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over
>>>>> <active-namenode-hostname>/<active-namenode-ip>:8020 after 2 failover
>>>>> attempts. Trying to failover after sleeping for 2694ms.
>>>>>
>>>>> I checked the heap of the namenode and there is no problem (I have 75
>>>>> GB of max heap and I am at around 50 GB used).
>>>>> I checked the client RPC threads of the namenode and I am at 200, which
>>>>> follows the recommendations of the Hadoop Operations book.
>>>>> I have the service RPC enabled to prevent any problem that could come
>>>>> from the datanodes or ZKFC.
>>>>> General resources seem OK: CPU usage is fine, and the same goes for
>>>>> memory, network and I/O.
>>>>> No firewall is enabled on my namenodes or on my client.
>>>>>
>>>>> I was wondering: what could be causing this problem?
>>>>>
>>>>> Thank you in advance for your help!
>>>>>
>>>>> Best regards.
>>>>>
>>>>> T@le
>>>>>
>>>>
