Thanks Robert, I'll check that out. So when you say "remove the sleeps", do I just delete "sleep" from pinctrl-names = "default", "sleep"; or do I also need to remove pinctrl-1 = <&cpsw_sleep>; ?
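To be concrete, here is a rough sketch of the edit I have in mind, dropping both the "sleep" state name and the matching pinctrl-1 line in each node. The davinci_mdio label names (davinci_mdio_default, davinci_mdio_sleep) are my assumption of what the linked lines of am335x-bone-common.dtsi contain, so please correct me if the file differs:

```dts
/* Sketch only: node contents assumed from am335x-bone-common.dtsi, not verified. */
&mac {
	/* other properties left unchanged */
	pinctrl-names = "default";		/* was: "default", "sleep" */
	pinctrl-0 = <&cpsw_default>;
	/* removed: pinctrl-1 = <&cpsw_sleep>; */
};

&davinci_mdio {
	/* other properties left unchanged */
	pinctrl-names = "default";		/* was: "default", "sleep" */
	pinctrl-0 = <&davinci_mdio_default>;	/* label name assumed */
	/* removed: pinctrl-1 = <&davinci_mdio_sleep>; (label name assumed) */
};
```

My reasoning is that each pinctrl-N property pairs with the Nth entry of pinctrl-names, so removing only the name would leave pinctrl-1 behind without a "sleep" label.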
On Wed, Oct 19, 2016 at 11:04 AM, Robert Nelson <[email protected]> wrote:

> On Wed, Oct 19, 2016 at 12:54 PM, William Hermans <[email protected]> wrote:
> >
> > On Wed, Oct 19, 2016 at 3:24 AM, Graham <[email protected]> wrote:
> >>
> >> I have two BBG units that I use as headless servers, with only access
> >> through Ethernet. Both have been running without reboot for multiple
> >> months without any issues. I think that I mentioned that I did have a
> >> BBB do exactly what you describe, while running as a headless server
> >> last year, but at the time there was a thunderstorm in the area, and
> >> lightning strikes in the neighborhood. It recovered on reboot, and has
> >> never repeated the symptom.
> >>
> >> So, my conclusion is that it is possible to happen, but rare, and in my
> >> case was probably caused by an electrical transient coming in the
> >> Ethernet connection, which is routed from a cable modem to the outside
> >> world.
> >>
> >> For high reliability applications, perhaps some extra transient
> >> protection on the Ethernet connection, and some kind of "ping monitor"
> >> that can auto-reboot the BBG.
> >>
> >> --- Graham
> >
> > I haven't had a BBG until the last 2-3 months to play with. Now, I've had
> > ~30 over the course of the last 2 months to observe this behavior on,
> > which again has only happened once. So I attributed what happened to me
> > accidentally knocking the board around a little. Until I talked with
> > another person I know who has experienced this issue with multiple
> > kernels, and multiple times over the last I don't know . . . maybe 6
> > months.
> >
> > So what I did was first install the same Debian image he was using, then
> > change kernels to the *bone* LTS kernel. I removed g_ether by removing
> > Robert's custom boot script for the 335x evm board. After that I got the
> > project files from this person I know and duplicated his software setup,
> > which is an mqtt application with a custom cape.
> >
> > Anyway, I was running this software last night, and then I downloaded and
> > ran nload from an ssh session. But I keep getting ssh Broken pipe errors,
> > which is not necessarily a concern. I've seen that before. I intend to
> > hook up a serial debug cable and run nload from that, but just have not
> > gotten around to it.
> >
> > One thing on my mind is that perhaps the software this person I know
> > wrote is somehow failing to deal with a "busy network" properly. Meaning,
> > if the internet connection is bandwidth saturated, and the application is
> > for some reason unable to deal with a "stale connection", how will it
> > act? However, I would not think this should cause the hardware to fail,
> > because that's what I'm seeing when the ethernet traffic indication LEDs
> > stop functioning, while also rendering the ethernet connection non
> > functional. What I was able to observe so far, however, was that this
> > application sends around 8-9kBit/s of data, and gets 2-3kBit/s back.
> >
> > Another concern: knowing that mqtt by default is an inherently insecure
> > protocol, and this app does currently run as root . . . However, there
> > are several caveats to this statement / concern. First, the application
> > is a peer to peer design in that only the mqtt broker can communicate
> > with the board, whether it sends commands or collects data back from the
> > board. Second, mqtt is able to use certificates, however I do not think
> > that is currently the case with this software *YET*. I've given this
> > person I know the standard security lecture on running as root, and
> > locking things down, etc. We just have not acted on it yet.
> >
> > With all of the above mentioned: when I ran into this issue myself, I was
> > not running anything other than a stock image, and the stock software
> > that comes with it. The board was also just idling for 5-6 days, maybe a
> > little longer. I ran uptime from an ssh session where it reported back
> > "5 days . . .", after which this happened. So I'm more inclined to think
> > this is most likely not a userspace application issue.
> >
> > I'm not even sure where to go from here, as far as tracking this issue
> > down. All I can really do is throw everything I know / have at the board,
> > and hope I get an error trapped from the live kernel log through serial.
>
> I think it's related to suspend/cpuidle.. I know another user was
> having issues, where they had to ping it twice, as the first would
> never respond..
>
> one thing that might help: remove the sleep pinmux's from:
> mac/davinci_mdio:
>
> https://github.com/RobertCNelson/dtb-rebuilder/blob/4.4-ti/src/arm/am335x-bone-common.dtsi#L370-L383
>
> Regards,
>
> --
> Robert Nelson
> https://rcn-ee.com/
>
> --
> For more options, visit http://beagleboard.org/discuss
> ---
> You received this message because you are subscribed to the Google Groups
> "BeagleBoard" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/beagleboard/CAOCHtYiMw40NSswGzXJGas3xMkjAqwL79T8%3DyOinDmcfYFg4Kw%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.
