Sounds good to me.
On Mon, Jan 28, 2019 at 5:47 PM Michal Soltys <sol...@ziu.info> wrote: > > On 19/01/18 07:58, Maciej Żenczykowski wrote: > > I'm not sure there's a truly good answer. > > > > Also note, that there's subtle differences between af_packet sockets > > tied to ALL protocols vs tied to specific protocols (vs none/0 of > > course). > > However ETH_P_ALL protocol raw sockets need to be avoided if at all > > possible due to perf impact. > > Certainly we don't want any of these created outside of debugging. > > > > I believe we only cared about the utterly unbound to any device case > > working (ie. delivering all LLDP packets all nics are receiving). > > So that a single simple daemon could collect and export all the link > > information. > > [btw. now that I know about the PACKET_ORIGDEV option, I do - > > unsurprisingly - see our daemon sets it] > > We really didn't want the complexity of having to bind to individual > > interfaces and having to try to dynamically adjust the set of raw > > sockets. > > (not to mention that less raw sockets is always good, also it's never > > actually entirely clear which interfaces would need to be monitored > > due to the preponderance of tunnels/macvlan/ipvlan/veth/dummy and > > other virtual interface types) > > > > I think from a logic perspective: > > > > - if you bind to slave (physical interface) you should see *all* > > packets on that interface, > > regardless of whether the slave is active or not, and whether the > > packet is link-local or not. > > ie. this should show you exactly what nic is receiving (& sending: > > PACKET_OUTGOING) > > I should see *exactly* one copy of any packet. > > This mode has to be useful for debugging. > > This includes seeing packets which aren't even destined for me if nic > > receives them. > > (ie. destined to unicast/multicast mac we'd filter out: PACKET_OTHERHOST) > > Although by itself this socket's existence shouldn't affect nic mac > > filters and promiscuous mode > > (I believe there are all sorts of various additional socket options to > > change those). > > > > - if you don't bind to an interface then I think I'd expect to see > > packets delivered to stack + link local packets > > - ie. should not see non-link-local packets discarded due to being > > on inactive slaves > > similarly to how I should not see packets filtered out by virtue > > of mac not matching interface mac filters > > - IMHO you should see *all* link local packets arriving at system > > (ie. the original change's purpose) > > [and I think, but am not certain, I shouldn't need to use any > > socket options to register the link local macs, > > although glancing at code I do think the daemon uses > > PACKET_ADD_MEMBERSHIP to register lldp mac, > > so perhaps filtering should apply as normal?] > > - with PACKET_ORIGDEV they should always show up as from the > > physical interfaces (ie. slaves) > > - without PACKET_ORIGDEV: > > - non link local packets should show up as from bonding/team master > > > - link local packets on active slaves could show up on master or > > slave (probably master is required ifirc some earlier fix???) > > As far as I can see, they (LLCs coming on active slave) will show via > master as from master (w/o the socket option) or as from slave (with). > Think Vincent's tests confirmed it. > > > - link local packets on inactive slaves should show up on slave - > > and definitely not on master > > - I don't think I should ever see any PACKET_OTHERHOST packets > > > > - if you bind to master you should see packets from active slaves (ie. > > those that will get delivered to stack) > > - clearly you should not see non-link-local inactive slave packets > > (they'll be dropped) > > > - behaviour wrt. link local packets is more dicey... (I believe > > somewhere in these threads patches there was some description of what > > standard requires, but I don't off the top of my head remember) > > They must be seen, otherwise bonds attached to a linux bridge (I assume > enslaving an interface to a bridge essentially counts as bind) will be > blind to them (among those - e.g. to spanning tree information - this > was what originally caused issues for me last year). Even the bridge > code goes to extra length to not carelessly consume stp packets, if it's > not actively participating in stp (of any kind). Aside that, certain > [recent] bridge features like group_fwd_mask would be non-functional on > bond ports as well. > > As for standard (the one I quoted in the oldest thread) expected the > link-local packet to be both readable via master and slaves depending on > need (though it didn't go into exact gory details). Your current patch I > think cover all possible cases quite greacefully. > > > - for an active-backup bond it would seem logical to see only active > > links link local > > - for a multiple-active aggregate bond I'd be fine with seeing none, > > or all active slaves link local packets - I guess even though the > > later is confusing it makes sense > > - it might be okay to see link-local inactive slave packets, but I > > think I'd prefer not to (could be configurable though) - this would > > seem confusing/wrong to me. > > - and again I don't think I should see any PACKET_OTHERHOST packets > > here... > > - PACKET_ORIGDEV as above... > > > > I gave the above a fair amount of thought... but I'm not guaranteeing > > I didn't make typos, or write something utterly stupid or > > unimplementable or not how stuff should work for other reasons... > > Comments welcome. > > > > I think this continues to be in line with my proposal from earlier in > > the thread? > > (sorry for late reply) > > Yes, pretty much spot on. With some confirmation comments above. > > I did bridging tests yesterday (2 LACP bonds attached via separate > switches treating linux bridge as a shared segment in mstp scenario) and > all is working fine on my side as well. > > If everyone agrees with the proposed code, I will submit v2 patch with > added comment explaining basic logic (or you could submit if you prefer). > > > > > On Thu, Jan 17, 2019 at 4:27 PM Michal Soltys <sol...@ziu.info> wrote: > >> > >> On 19/01/14 03:01, Maciej Żenczykowski wrote: > >> > So I don't remember the specifics... > >> > > >> > (note I'm writing this all from memory without looking it up/testing > >> > it - I may be utterly wrong or dreaming) > >> > > >> > But I seem to recall that the core problem we were trying to solve was > >> > that a daemon listening > >> > on an AF_PACKET ethertype 88CC [LLDP] socket not bound to any device > >> > would not receive LLDP packets > >> > arriving on inactive bond slaves (either active-backup or lag). > >> > > >> > [inactive = link/carrier up, but not part of active aggregator] > >> > > >> > This made monitoring for miscabling harder (IFIRC the only non kernel > >> > fix was to get the daemon to create > >> > a separate AF_PACKET/88CC socket bound to every physical interface in > >> > the system, or monitor for > >> > inactive slaves and add extra packet sockets as needed). > >> > > >> > They would get re-parented to the master and then since the slave was > >> > inactive they would be considered RX_HANDLER_EXACT match only and not > >> > match the * interface. > >> > > >> > Honestly I wasn't aware of PACKET_ORIGDEV, although I don't think it > >> > helps in this case - AFAICR the packets never made it to the packet > >> > socket. > >> > > >> > Perhaps going from: > >> > /* don't change skb->dev for link-local packets */ > >> > if (is_link_local_ether_addr(eth_hdr(skb)->h_dest)) return > >> > RX_HANDLER_PASS; > >> > if (bond_should_deliver_exact_match(skb, slave, bond)) return > >> > RX_HANDLER_EXACT; > >> > > >> > to something more like: > >> > if (bond_should_deliver_exact_match(skb, slave, bond)) { > >> > /* don't change skb->dev for link-local packets on inactive slaves > >> > */ > >> > if (is_link_local_ether_addr(eth_hdr(skb)->h_dest)) return > >> > RX_HANDLER_PASS; > >> > return RX_HANDLER_EXACT; > >> > } > >> > >> Having checked the code (if I get the flow correctly), one > >> thing/question - currently with Mahesh's fixes, not bound LLDP listener > >> will receive all packets - both from active and inactive slaves directly > >> (as the check for suppression is done after the link-local check). > >> > >> The version above will do the suppression check first - so all inactive > >> slaves - excluding non-multi/non-broad ALB - will pass it and return > >> RX_HANDLER_PASS if the packet is link-local. So those will be available > >> w/o binding, but active slaves' packets will be available via master > >> device (but with working PACKET_ORIGDEV now - so slave device can be > >> retrieved easily). This is fine in your scenario I presume ? > >> > > >