On Sat, 11 Jul 2020 at 19:23, Russell King - ARM Linux admin
<li...@armlinux.org.uk> wrote:
> On Sat, Jul 11, 2020 at 06:23:49PM +0200, Andrew Lunn wrote:
> > So i'm guessing it is the connection between the CPU and the switch.
> > Could you confirm this? Create a bridge, add two ports of the switch
> > to the bridge, and then see if packets can pass between switch ports.
> >
> > If it is the connection between the CPU and the switch, i would then
> > be thinking about the comphy and the firmware. We have seen issues
> > where the firmware is too old. That is not something i've debugged
> > myself, so i don't know where the version information is, or what
> > version is required.
>
> However, in the report, Martin said that reverting the problem commit
> from April 14th on a kernel from July 6th caused everything to work
> again.  That is quite conclusive that 34b5e6a33c1a is the cause of
> the breakage.

I tried it anyway and couldn't get any traffic to flow between the
ports, but I could have configured it wrongly. I gave each port a
static IP, bridged them (with and without br0 having an IP assigned),
and tried pinging from one port to the other. I tried with the
assigned IPs in the same and different subnets, and made sure the
routes were updated between tests. Tx only, no responses, exactly like
pinging a remote host.
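
For reference, here is a minimal sketch of the bridge setup described above, so the configuration can be checked. The names br0, lan1 and lan2 are assumptions (the real port names depend on the board's device tree), and the script only prints the iproute2 commands rather than running them. The suggested test then amounts to checking whether packets pass between the two bridged ports.

```python
import shlex


def bridge_commands(bridge="br0", ports=("lan1", "lan2")):
    """Return the iproute2 commands that put the given ports into one bridge.

    The bridge and port names are placeholders, not taken from the report.
    """
    cmds = [["ip", "link", "add", "name", bridge, "type", "bridge"]]
    for port in ports:
        # Drop any address on the port itself; a bridged port needs none.
        cmds.append(["ip", "addr", "flush", "dev", port])
        cmds.append(["ip", "link", "set", "dev", port, "master", bridge])
        cmds.append(["ip", "link", "set", "dev", port, "up"])
    cmds.append(["ip", "link", "set", "dev", bridge, "up"])
    return cmds


if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before use.
    for cmd in bridge_commands():
        print(shlex.join(cmd))
```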

I'm now less confident about my git bisect, though, because it appears
my criterion for marking a commit "good" was not sufficient. I
was just checking to see if the port could get assigned a DHCP address
and ping something else, but it appears that (at least on 5.8-rc4 with
the one revert) the interface "dies" after working for about 30-60
seconds. Basically the symptoms I described originally, just preceded
by 30-60 seconds of it working perfectly. I will re-run the bisect to
figure out what makes it go from "working perfectly" to "working
perfectly for less than a minute", which will take a few days.
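
A stricter "good" check for the re-run could look roughly like the sketch below: rather than pinging once, it keeps probing for well over the 30-60 seconds after which the interface dies, and fails as soon as a probe gets no reply. The 180 second window, the 5 second interval and the target address 192.0.2.1 are placeholder assumptions, not values from my setup, and the script is meant to be run by hand on the board after booting each candidate kernel.

```python
import subprocess
import sys
import time


def ping_once(host, timeout_s=2):
    """Return True if a single ICMP echo request to host gets a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def stays_up(probe, duration_s=180, interval_s=5,
             clock=time.monotonic, sleep=time.sleep):
    """Return True only if probe() succeeds on every attempt for duration_s.

    A single lost reply counts as failure, so this is deliberately strict.
    """
    deadline = clock() + duration_s
    while True:
        if not probe():
            return False
        if clock() >= deadline:
            return True
        sleep(interval_s)


if __name__ == "__main__":
    # Placeholder target; pass the real peer address as the first argument.
    host = sys.argv[1] if len(sys.argv) > 1 else "192.0.2.1"
    # Exit status 0 means the link survived the whole window, 1 means it died.
    sys.exit(0 if stays_up(lambda: ping_once(host)) else 1)
```

The probe is passed in as a function so the timing logic can be exercised without a network, and so a different check (a DHCP renew, for instance) could be swapped in.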
