On Sat, Nov 14, 2020 at 03:39:28PM +0000, Tj (Elloe Linux) wrote:
> MV88E6085 switch not passing IPv6 multicast packets to CPU.
>
> Seems to be related to interface not being in promiscuous mode.
>
> This issue has been ongoing since at least July 2020. Latest v5.10-rc3
> still suffers the issue on a Turris Mox with mv88e6085. We've not been
> able to reproduce it on the Turris v4.14 stable kernel series so it
> appears to be a regression.
>
> Mox is using Debian 10 Buster.
>
> First identified due to DHCPv6 leases not being renewed on clients being
> served by isc-dhcp-server on the Mox.
>
> Analysis showed the client IPv6 multicast solicit packets were being
> received by the Mox hardware (proved via a mirror port on a managed LAN
> switch) but the CPU was not receiving them (observed using tcpdump).
>
> Further investigation has identified this also affects IPv6 neighbour
> discovery for clients when not using frequent RAs from the Mox.
>
> Currently we've found two reproducible scenarios:
>
> 1) with isc-dhcp-server configured with very short lease times (180
> seconds). After mox reboot (or systemctl restart systemd-networkd)
> clients successfully obtain a lease and a couple of RENEWs (requested
> after 90 seconds) but then all goes silent, Mox OS no longer sees the
> IPv6 multicast RENEW packets and client leases expire.
>
> 2) Immediately after reboot when DHCPv6 renewals are still possible if
> on the Mox we do "tcpdump -ni eth1 ip6" and immediately terminate,
> tcpdump takes the interface out of promiscuous mode and IPv6 multicast
> packets immediately cease to be received by the CPU. If we use "tcpdump
> --no-promiscuous-mode ..." so on termination it doesn't try to take the
> interface out of promiscuous mode IPv6 multicast packets continue to be
> seen by the CPU.
>
> We've been pointed to the mv88e6xxx_dump tool and can capture data but
> not sure what specifically to look for.
>
> We've also added some pr_info() debugging into mvneta to analyse when
> promiscuous mode is enabled or disabled since this seems to be strongly
> related to the issue.
>
> We believe there's a big clue in being able to reset the issue by
> restarting systemd-networkd on the Mox. We've looked for but not found
> any clues or indications of services on the Mox causing this but aren't
> ruling this out.

Is there a simple step-by-step reproducer for the issue, that doesn't require a lot of configuration? I've got a Mox with the 6190 switch running net-next and Buildroot that I could try on.
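
Since the report ties the failure to the interface leaving promiscuous mode, it may help to record that state at each step of a reproducer, without relying on tcpdump (which changes it). The following is only a rough sketch: it assumes the interface is eth1 as in the report, and that /sys/class/net/<iface>/flags exposes the raw interface flags as a hex string. The helper names decode_flags and read_flags are made up for illustration.

```python
#!/usr/bin/env python3
"""Print the promiscuous/allmulti state of a network interface from sysfs."""
import sys

# Flag values as defined in the Linux UAPI header <linux/if.h>.
IFF_PROMISC = 0x100
IFF_ALLMULTI = 0x200


def decode_flags(raw):
    """Decode the hex string read from /sys/class/net/<iface>/flags."""
    flags = int(raw.strip(), 16)
    return {
        "promisc": bool(flags & IFF_PROMISC),
        "allmulti": bool(flags & IFF_ALLMULTI),
    }


def read_flags(iface):
    """Read and decode the flags of one interface (hypothetical helper)."""
    with open(f"/sys/class/net/{iface}/flags") as f:
        return decode_flags(f.read())


if __name__ == "__main__":
    # "eth1" is taken from the report; pass another name as the first argument.
    iface = sys.argv[1] if len(sys.argv) > 1 else "eth1"
    print(iface, read_flags(iface))
```

Running it right after boot, while a tcpdump is active, and again after tcpdump exits should show whether the promisc bit really changes at the point where the multicast packets stop reaching the CPU.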