Hello.

On 04/05/2016 07:20 PM, Jon Maloy wrote:

When a link is down, it will continuously try to re-establish contact
with the peer by sending out a RESET or and ACTIVATE message at each

   And/or?

timeout interval. The default value for this interval is currently
375 ms. This is wasteful, and may become a problem in very large
clusters with dozens or hundereds of nodes being down simultaneously.

   Hundreds.

We now introduce a simple backoff algorithm for these cases. The
first five messages are sent at default rate; thereafter a message
is sent only each 16't timer interval.

   16th?

This will cover the vast majority of link recyling cases, since the

   Recycling.

endpoint starting last will transmit at the higher speed, and the link
should normally be established well before the rate needs to be
reduced.

The only case where we will see a degradation of link re-establishment
is when the endpoins remain intact, and a glitch in the transmission

   Endpoints.

media is causing the link reset. We will then experience a worst-case
re-establishing time of 6 seconds, something we deem acceptable.
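
For illustration only, the backoff scheme described above can be sketched
in C roughly as follows. This is a minimal sketch under stated assumptions,
not the code from the patch: struct link_backoff and the backoff_* helpers
are hypothetical names, and only the constants (375 ms, five messages,
every 16th interval) come from the description. The last helper shows
where the 6 second worst case comes from: 16 * 375 ms = 6000 ms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values taken from the patch description. */
#define TIMER_INTERVAL_MS 375u /* default timeout interval */
#define FAST_ATTEMPTS     5u   /* messages sent at the default rate */
#define SLOW_DIVISOR      16u  /* afterwards: one message per 16 intervals */

/* Hypothetical per-link state; not the actual TIPC link structure. */
struct link_backoff {
	unsigned int sent;  /* messages sent so far, saturates at FAST_ATTEMPTS */
	unsigned int ticks; /* timer expirations since the last message */
};

/* Call when the link goes down (or comes back up) to restart the scheme. */
static void backoff_reset(struct link_backoff *b)
{
	assert(b != NULL);
	b->sent = 0;
	b->ticks = 0;
}

/*
 * Call on every timer expiration while the link is down.
 * Returns true if a RESET/ACTIVATE message should be sent on this tick.
 */
static bool backoff_should_send(struct link_backoff *b)
{
	assert(b != NULL);
	b->ticks++;

	/* Slow phase: stay silent until 16 intervals have elapsed. */
	if (b->sent >= FAST_ATTEMPTS && b->ticks < SLOW_DIVISOR)
		return false;

	b->ticks = 0;
	if (b->sent < FAST_ATTEMPTS)
		b->sent++; /* saturating count, so it can never wrap around */
	return true;
}

/* Longest gap between two messages in the slow phase, in milliseconds. */
static unsigned int backoff_worst_case_ms(void)
{
	return SLOW_DIVISOR * TIMER_INTERVAL_MS;
}
```

In this sketch the caller would invoke backoff_should_send() from the link
timer and backoff_reset() whenever the link changes state; how the real
patch tracks this state is not shown in the description.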

Acked-by: Ying Xue <ying....@windriver.com>
Signed-off-by: Jon Maloy <jon.ma...@ericsson.com>
[...]

MBR, Sergei
