From: Hoang Huu Le <hoang.h...@dektech.com.au>
Date: Wed, 17 Jun 2020 13:56:05 +0700

> Currently, binding table updates (adding a service binding to the
> name table or withdrawing one) are sent over replicast. However, if
> we scale clusters up to more than 100 nodes/containers, this method
> becomes less efficient because it loops through the nodes in the
> cluster one by one.
>
> It is worthwhile to use broadcast to update a service binding. This
> way, the binding table can be updated on all peer nodes in one shot.
>
> Broadcast is used when all peer nodes, as indicated by a new
> capability flag TIPC_NAMED_BCAST, support reception of this message
> type.
>
> Four problems need to be considered when introducing this feature.
> 1) When establishing a link to a new peer node we still update it by
>    a unicast 'bulk' update. This may lead to race conditions, where a
>    later broadcast publication/withdrawal bypasses the 'bulk',
>    resulting in disordered publications, or even a withdrawal
>    arriving before the corresponding publication. We solve this by
>    adding an 'is_last_bulk' bit to the last bulk message so that it
>    can be distinguished from all other messages. Only when this
>    message has arrived do we open up for reception of broadcast
>    publications/withdrawals.
>
> 2) When the first legacy node is added to the cluster, all
>    distribution switches over to the legacy 'replicast' method, while
>    the opposite happens when the last legacy node leaves the cluster.
>    This entails another risk of message disordering that has to be
>    handled. We solve this by adding a sequence number to the
>    broadcast/replicast messages, so that disordering can be
>    discovered and corrected. Note however that we don't need to
>    consider potential message loss or duplication at this protocol
>    level.
>
> 3) Bulk messages don't contain any sequence numbers, and will always
>    arrive in order. Hence we must exempt those from the sequence
>    number control and deliver them unconditionally. We solve this by
>    adding a new 'is_bulk' bit to those messages so that they can be
>    recognized.
>
> 4) Legacy messages, which don't contain any new bits or sequence
>    numbers, but cannot arrive out of order either, also need to be
>    exempt from the initial synchronization and sequence number check,
>    and delivered unconditionally. Therefore, we add another
>    'is_not_legacy' bit to all new messages so that those can be
>    distinguished from legacy messages and the latter delivered
>    directly.
>
> v1->v2:
>  - fix warning issue reported by kbuild test robot <l...@intel.com>
>  - add sanity check to drop the publication message with a sequence
>    number that is lower than the agreed synch point
>
> Signed-off-by: kernel test robot <l...@intel.com>
> Signed-off-by: Hoang Huu Le <hoang.h...@dektech.com.au>
> Acked-by: Jon Maloy <jma...@redhat.com>

Applied, thank you.
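
For readers following the four rules in the quoted commit message, here is
a minimal user-space sketch of how a receiver might classify an incoming
name table message. It is not the code from the patch. The struct layouts,
the names named_rcv_check(), seq_less() and named_rcv_demo(), the 16-bit
sequence number width, and the idea that rcv_nxt is already set to the
agreed synch point are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the new header bits; not the kernel's layout. */
struct named_msg {
	bool is_not_legacy;	/* set on all new-style messages */
	bool is_bulk;		/* set on unicast bulk messages */
	bool is_last_bulk;	/* set on the final bulk message only */
	uint16_t seqno;		/* meaningful for non-bulk, new-style messages */
};

struct peer_state {
	bool open;		/* true once the last bulk message has arrived */
	uint16_t rcv_nxt;	/* next expected seqno (assumed preset to synch point) */
};

enum verdict { DELIVER, HOLD, DROP };

/* Wrap-safe "a comes before b" for 16-bit sequence numbers. */
static bool seq_less(uint16_t a, uint16_t b)
{
	return (uint16_t)(a - b) >= 0x8000u;
}

/* Classify one message for one peer, following rules 1) to 4). */
enum verdict named_rcv_check(struct peer_state *st, const struct named_msg *m)
{
	if (!m->is_not_legacy)		/* rule 4: legacy, deliver directly */
		return DELIVER;
	if (m->is_bulk) {		/* rule 3: bulk, no seqno control */
		if (m->is_last_bulk)	/* rule 1: open up for broadcast */
			st->open = true;
		return DELIVER;
	}
	if (seq_less(m->seqno, st->rcv_nxt))	/* v2 sanity check: stale */
		return DROP;
	if (st->open && m->seqno == st->rcv_nxt) {	/* rule 2: in order */
		st->rcv_nxt++;
		return DELIVER;
	}
	return HOLD;			/* too early: keep queued for later */
}

/* Small walk-through of the cases above. */
void named_rcv_demo(void)
{
	struct peer_state st = { .open = false, .rcv_nxt = 10 };
	struct named_msg legacy = { false, false, false, 0 };
	struct named_msg bulk = { true, true, false, 0 };
	struct named_msg last_bulk = { true, true, true, 0 };
	struct named_msg bc10 = { true, false, false, 10 };
	struct named_msg bc12 = { true, false, false, 12 };
	struct named_msg bc9 = { true, false, false, 9 };

	assert(named_rcv_check(&st, &legacy) == DELIVER);
	assert(named_rcv_check(&st, &bc10) == HOLD);	/* bulk not finished */
	assert(named_rcv_check(&st, &bulk) == DELIVER);
	assert(named_rcv_check(&st, &last_bulk) == DELIVER);
	assert(named_rcv_check(&st, &bc10) == DELIVER);	/* now in order */
	assert(named_rcv_check(&st, &bc12) == HOLD);	/* gap: 11 missing */
	assert(named_rcv_check(&st, &bc9) == DROP);	/* below rcv_nxt */
}
```

In this sketch a held message stays on a queue and is re-checked after each
delivery, which is one way the disordering described in 1) and 2) could be
corrected without handling loss or duplication at this level.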