FYI: The hardware fix described earlier in this thread gives 100% success, first time, every time.
On 27 April 2017 at 15:42, <[email protected]> wrote:
> If you have this problem and only care about solutions, jump to
> "workarounds" below.
>
> ### RECAP
>
> For unlucky souls who come fresh upon this problem and don't want to read
> through the better part of a decade's worth of conflicting reports....
>
> 1. Due to a design issue, the BeagleBone Black and descendants have a
> problem where they intermittently come up with various bad state set in the
> physical network connection chip (PHY) that makes the wired Ethernet port
> inaccessible, and there is no way to get it to recover using only software -
> a power cycle or hardware reset is required.
>
> 2. One of the ways that the PHY can have bad state is that its address can
> be assigned a different value than expected. The latest versions of the
> kernel will scan all possible addresses and find the PHY no matter what
> address it happens to get, so this failure mode is no longer part of the
> issue as long as you use one of these new kernels. (BTW, I have an elegant
> solution to reassign the PHY back to the expected address which will work
> with any kernel version if you need it. It also avoids the current kluge
> that hacks up the device tree to match the newly found PHY address.)
>
> 3. There are still some bad states that the PHY chip can come up in that
> are not addressed by the new kernel. As far as I know there is no
> software-only workaround for these - a power cycle or hardware reset is
> required.
>
> 4. In my personal experience, the bad state seems to be significantly less
> likely when the board is powered through the barrel connector (or USB on
> BeagleBone Green) than when it is powered via the pin on the P9 header. I've
> also noticed that most people in this thread are powering their boards via
> a cape or header-connected power supply, which makes sense since these
> people tend to see the problem more often.
> Note that the non-recoverable bad state can still happen even on a board
> powered via the barrel - it is just less likely.
>
> 5. In my personal experience, the bad state seems to be more likely on
> certain individual boards than others. I have a board that comes up in the
> bad state about 50% of the time, while other boards only come up in the bad
> state 1 in 100 times.
>
> 6. In my experience, the bad state seems to be significantly less likely if
> *nothing* is connected to the Ethernet port at power up. I really mean not
> connected - even if there is an unpowered device connected to the other end
> of the network cable, the bad state occurs more often. The cable must
> be unplugged at one end or the other.
>
> 7. Bit 13 in register 18 seems to be a 100% indication that you are in the
> bad state. I have never seen a board with that bit set recover, and I have
> never seen a non-recoverable board without that bit set (except for a
> couple of seconds if you manually clear it before it sets itself on again).
> This bit is "reserved" in the datasheet, and so far there are no hints from
> Microchip as to what it might mean that might lead to a better understanding
> of the issue.
>
> 8. In the bad state, it is possible to get the PHY to link by manually
> configuring it to 10 Mb/s half duplex (no auto negotiation). While the link
> light comes on and the "link active" bit is set, it does not appear to be
> decoding incoming packets, so this is not a useful workaround.
>
> ### WORKAROUNDS
>
> In order of effectiveness/desirability.
>
> 1. Use a different board. All the commercially available BeagleBone Black
> and descendants share this design issue, so look at maybe the Raspberry Pi
> or one of the other ARM-based SBCs.
>
> 2. Spin your own version of the board. This problem could be completely
> resolved by adding a connection between the reset pin of the PHY and a gpio
> on the ARM.
> This way the ARM would be able to carefully control the required timing
> sequence for bringing up the PHY chip - and also be able to hardware reset
> the chip in case there are any problems.
>
> 3. Use a USB Ethernet adapter rather than the on-board eth0 port.
> Compatible adapters can be found for less than $10.
>
> 4. Connect a gpio pin to the reset pin on header P9. That reset pin is
> tied to the hardware reset pin of the PHY chip, so you can reset it under
> software control. gpio 60 happens to be very close physically, making for a
> very easy jumper connection. Then you need a script to test for the bad
> state, and activate the gpio to reset if it is found. Note that the reset
> pin will also reset the ARM, so the BB will reboot every time you do this
> but should eventually come up (and stay up) with the PHY in the good
> state.
>
> 5. Unplug the Ethernet cable during power up, check for bad state after
> the board comes up, and keep power cycling it until it comes up in a good
> state, then reconnect the network cable.
>
> 6. Power the board through the barrel or USB rather than through the headers.
>
> Through a combination of 5 & 6, I was able to get my bank of boards to come
> up with a better than 80% good state rate on the first try. Yona Applegate
> (of LEDscape fame) reports being able to get his large collection of BBs to
> all come up with good networking 100% of the time using #4, although the
> amount of time it takes for all boards to get to the good state is
> indeterminate.
>
> ### FUTURE DIRECTIONS
>
> There are likely other workarounds possible if someone wants to invest more
> time working on this issue.
>
> Here is a tool that lets you easily inspect and modify registers in the
> PHY....
> https://github.com/bigjosh/phyreg
>
> Here are all my notes from debugging this issue...
> https://www.evernote.com/pub/bigjosh2/bbbphyproblem
>
> I am happy to try and help anyone who wants to dig in deeper.
> I personally would love to not have to unplug/replug 72 ethernet cables
> every time I have to power cycle my bank of BBBs!
>
> -josh
>
> On Tuesday, November 26, 2013 at 5:22:42 PM UTC-5, AndrewTaneGlen wrote:
>>
>> Hello,
>>
>> I have noticed very rare cases (~1/50) of the ethernet phy on the
>> Beaglebone Black not being detected on boot, and requiring a hard reset (as
>> opposed to calling 'reset' from the command line) to get it to work/be
>> detected again.
>>
>> This problem has been mentioned in a couple of other threads (below)
>> concerning different topics (i.e. problems getting the BBB to boot, and the
>> ethernet phy 'dying' some time after initially working fine), with no
>> solution/workaround for this specific problem being suggested - so I
>> thought I'd start a thread specifically for it.
>> https://groups.google.com/forum/#!msg/beagleboard/Vp4pxwHm8BU/Iaw3p5xm0MoJ
>> https://groups.google.com/forum/#!topic/beagleboard/aXv6An1xfqI
>>
>> In the first thread mlc/Mike discussed his response to the problem as
>> follows:
>>
>> *"I had issues with the network not coming up on boot, and it was
>> traced down to problems with the SYS_RESETn line. I had a level translator
>> connected to SYS_RESETn, to drive a 5V chip. It was powered by a 5V rail.
>> If the 5V rail powered up "differently" than the 3.3V rail (not sure of the
>> exact relationship), I guess it pulled the SYS_RESETn line to weird levels
>> that affected the network chip but not the main processor. I'm now using a
>> GPIO to drive the external 5V chip now, instead of the SYS_RESETn
>> line.
>> Anyway, the moral is be very, very careful with SYS_RESETn, because
>> it can cause hard-to-trace problems with networking."*
>>
>> I see that the A6 Revision of the Beaglebone Black has some changes to
>> the SYS_RESETn line:
>>
>> *"Based on notification from TI, in random instances there could be a
>> glitch in the SYS_RESETn signal from the processor where the SYS_RESETn
>> signal was taken high for a momentary amount of time before it was supposed
>> to. To prevent this, the signal was ORed with the PORZn (Power On reset)."*
>> (http://elinux.org/Beagleboard:BeagleBoneBlack#Revision_A6_.28Production_Version.29)
>>
>> Is it likely that this modification will improve/resolve the issue I am
>> seeing with the ethernet phy not resetting/powering up correctly, seeing as
>> the SYS_RESETn signal also feeds into the nRST pin on the ethernet phy? (The
>> SYS_RESETn line is left untouched in my application.)
>>
>> Some additional observations from dmesg concerning this issue:
>>
>> On a good phy boot I see the following:
>> [ 2.810749] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6
>> [ 2.817206] davinci_mdio 4a101000.mdio: detected phy mask fffffffe
>> [ 2.833517] libphy: 4a101000.mdio: probed
>> [ 2.837871] davinci_mdio 4a101000.mdio: phy[0]: device 4a101000.mdio:00, driver unknown
>>
>> Followed later by:
>> [ 21.286920] net eth0: initializing cpsw version 1.12 (0)
>> [ 21.301166] net eth0: phy found : id is : 0x7c0f1
>>
>> On a 'bad phy' boot I see the following (differences highlighted):
>> [ 2.806763] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6
>> [ 2.813213] davinci_mdio 4a101000.mdio: detected phy mask *fffffffb*
>> [ 2.829512] libphy: 4a101000.mdio: probed
>> [ 2.833875] davinci_mdio 4a101000.mdio: phy[2]: device 4a101000.mdio:02, driver unknown
>>
>> Followed later by:
>> [ 21.346861] net eth0: initializing cpsw version 1.12 (0)
>> [ 21.354379] *libphy: PHY 4a101000.mdio:00 not found*
>> [ 21.359469] *net eth0: phy 4a101000.mdio:00 not found on slave 0*
>>
>> So it looks like the 'davinci_mdio_reset' function sees the phy in both
>> instances, but reports differently on the bad boot. I am not sure what to
>> make of this.
>>
>> I am using the Debian 7.2 Rootfs and the 'RobertCNelson' kernel
>> '3.12.0-bone8'.
>>
>> Regards,
>> Andrew.
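
A note on the two dmesg excerpts quoted above. The sketch below assumes that the printed "detected phy mask" has a cleared bit for each MDIO address at which a PHY answered, which is what the matching phy[0] and phy[2] lines suggest. Under that assumption the mask can be decoded with a few lines of Python. The function name live_phy_addresses is made up for illustration and is not part of the kernel or of any tool mentioned in this thread.

```python
def live_phy_addresses(mask_hex: str) -> list[int]:
    """Decode a 'detected phy mask' value as printed in dmesg.

    Assumption: a cleared bit N means a PHY responded at MDIO address N.
    """
    mask = int(mask_hex, 16)
    # MDIO addresses are 5 bits wide, so there are 32 candidates (0..31).
    return [addr for addr in range(32) if not (mask >> addr) & 1]


# Values taken from the good and bad boot logs quoted in this thread.
print(live_phy_addresses("fffffffe"))  # good boot
print(live_phy_addresses("fffffffb"))  # bad boot
```

Read this way, the good boot shows the PHY at address 0 and the bad boot shows it at address 2, while the later cpsw lines still look for 4a101000.mdio:00. That matches the recap's description of the PHY coming up at an unexpected address.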
