From: Jon Maloy <jon.ma...@ericsson.com>
Date: Fri, 29 Apr 2016 10:40:24 -0400

> From: Hamish Martin <hamish.mar...@alliedtelesis.co.nz>
> 
> We have observed complete lock-up of broadcast-link transmission due to
> unacknowledged packets never being removed from the 'transmq' queue. This
> is traced to nodes having their ack field set beyond the sequence number
> of packets that have actually been transmitted to them.
> Consider an example where node 1 has sent 10 packets to node 2 on a
> link and node 3 has sent 20 packets to node 2 on another link. We
> see examples of an ack from node 2 destined for node 3 being treated as
> an ack from node 2 at node 1. This leads to the ack on the node 1 to node
> 2 link being increased to 20 even though we have only sent 10 packets.
> When node 1 does get around to sending further packets, none of the
> packets with sequence numbers less than 21 are actually removed from the
> transmq.
> To resolve this we reinstate some code lost in commit d999297c3dbb ("tipc:
> reduce locking scope during packet reception") which ensures that only
> messages destined for the receiving node are processed by that node. This
> prevents the sequence numbers from getting out of sync and resolves the
> packet leakage, thereby resolving the broadcast-link transmission
> lock-ups we observed.
> 
> While we are aware that this change only patches over a root problem that
> we still haven't identified, the check it adds is a sanity test that is
> always legitimate to perform. It will remain in the code even after we
> identify and fix the real problem.
> 
> Reviewed-by: Chris Packham <chris.pack...@alliedtelesis.co.nz>
> Reviewed-by: John Thompson <john.thomp...@alliedtelesis.co.nz>
> Signed-off-by: Hamish Martin <hamish.mar...@alliedtelesis.co.nz>
> Signed-off-by: Jon Maloy <jon.ma...@ericsson.com>

Applied.
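
For readers following the thread, the failure mode and the destination check described in the quoted commit message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: the names (toy_link, toy_msg, toy_rcv, toy_scenario) are hypothetical and are not the kernel's TIPC structures or functions, sequence-number wraparound is ignored, and the real patch may differ in every detail.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PKTS 64

/* Hypothetical, simplified sender-side state; NOT the kernel's tipc_link. */
struct toy_link {
    uint32_t own_addr;      /* address of the node owning this link */
    uint16_t next_seqno;    /* sequence number of the next packet to send */
    uint16_t acked;         /* highest sequence number the peer has acked */
    bool pending[MAX_PKTS]; /* pending[s]: packet s is still in the transmit queue */
};

/* Hypothetical message carrying a destination address and an ack value. */
struct toy_msg {
    uint32_t dest_addr;
    uint16_t ack;
};

/* Queue 'count' new packets for transmission. */
static void toy_send(struct toy_link *l, int count)
{
    for (int i = 0; i < count && l->next_seqno < MAX_PKTS; i++)
        l->pending[l->next_seqno++] = true;
}

/* Release every packet up to 'ack'; acks at or below the stored value are ignored. */
static void toy_ack_rcv(struct toy_link *l, uint16_t ack)
{
    if (ack <= l->acked)
        return;
    for (unsigned int s = l->acked + 1u; s <= ack && s < MAX_PKTS; s++)
        l->pending[s] = false;
    l->acked = ack;
}

/* Process a received message; with check_dest set, drop messages meant for another node. */
static bool toy_rcv(struct toy_link *l, const struct toy_msg *m, bool check_dest)
{
    if (check_dest && m->dest_addr != l->own_addr)
        return false; /* not for us: the ack state stays untouched */
    toy_ack_rcv(l, m->ack);
    return true;
}

/* Count packets still waiting in the transmit queue. */
static int toy_pending(const struct toy_link *l)
{
    int n = 0;
    for (int s = 0; s < MAX_PKTS; s++)
        n += l->pending[s] ? 1 : 0;
    return n;
}

/* Replay the example from the commit message as seen by node 1. */
int toy_scenario(bool check_dest)
{
    struct toy_link l = { .own_addr = 1, .next_seqno = 1, .acked = 0 };
    struct toy_msg stray = { .dest_addr = 3, .ack = 20 }; /* ack meant for node 3 */
    struct toy_msg real = { .dest_addr = 1, .ack = 15 };  /* genuine ack for node 1 */

    toy_send(&l, 10);              /* packets 1..10 */
    toy_rcv(&l, &stray, check_dest);
    toy_send(&l, 10);              /* packets 11..20 */
    toy_rcv(&l, &real, check_dest);
    return toy_pending(&l);
}
```

In this model, toy_scenario(false) leaves all ten later packets (11..20) stuck, because the stray ack already pushed the stored ack value to 20 and the genuine ack of 15 is treated as stale. With the destination check enabled, toy_scenario(true) leaves only packets 16..20 pending, which is the expected state after an ack of 15.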
