On Sat, Oct 19, 2013 at 4:49 PM, Rich Freeman <ri...@gentoo.org> wrote:
> On Sat, Oct 19, 2013 at 6:01 PM, Mark Knecht <markkne...@gmail.com> wrote:
>> No magic sys request keys, keyboard and
>> mouse are dead, cannot shell in or even ping from another machine on
>> the network.
>
> These types of situations are really annoying to debug.  Do you get
> anything on the console?  Try leaving at a text console with no screen
> saver so that you have a chance to see any panic message/etc that
> might be left there.  If you have something set to put your monitor to
> sleep then after the panic your system will not wake up.
>
OK, it's a good idea just to have a Konsole terminal open. That might
catch something. The only issue is I'm running KDE with 6 desktops and
2 monitors, so I need to make sure it's always visible and always on
top.

> Serial console is another option, albeit not exactly convenient.
>

OK, so I remember years ago debugging something for Ingo Molnar using
the serial console, but in those days it was a real serial console on
a real serial port. None of my machines have those ports anymore.
There must be a more modern way of doing that. I'll go look for info.
Ethernet? USB? We've recently moved and the only other machine I've
got here at the apartment is a Gentoo laptop.

> I have on my blog somewhere instructions for setting up kdump, but to
> be honest with recent kernel versions it hasn't been working (that
> could have changed).  You can configure your kernel to auto-reboot to
> a panic kernel which you can then use to dump core to disk, then you
> can reboot back into your normal system to examine it at your leisure.
> That should tell you what was going on when it crashed, but only if
> the kernel actually detected a panic (usually it does).
>

There's a wiki.gentoo.org page here:

http://wiki.gentoo.org/wiki/Kernel_Crash_Dumps

The setup looks reasonably straightforward, so I've reconfigured
3.10.17 following those instructions.

One question for now. In the Kernel Hacking section there's an option
for "Detect Hard and Soft Lockups" which on the surface looks like a
good thing to turn on, but it's not mentioned in these instructions.
When turned on it has options for Panic (Reboot) for both types. Seems
like I probably want that all turned on? Comments?

> Note that logs are useless in a panic (unless you're using kdump) as
> the kernel will not write anything to disk following a panic.  If you
> get an oops/bug you might or might not get anything in your logs
> depending on whether it affected the filesystem/disk/etc subsystems.
> If the kernel knows its internals are scrambled the last thing you
> want it doing is trying to write to your filesystems.  With kdump it
> does a reboot into a new kernel which fully re-initializes everything
> and then dumps ram safely to disk.
>
> Rich
>

As I expected about the logs. If the machine's dead then I don't want
stuff getting written to disk anyway. kdump sounds like the best
solution going right now. I'll try and see if I can get it working.

Thanks very much Rich! Great ideas.

Cheers,
Mark
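
For reference, a minimal sketch of how the kernel options discussed in this thread could be checked against a saved .config. The symbol names below are assumptions and are not taken from the thread or confirmed against the wiki page: they are what the kdump-related options and the "Detect Hard and Soft Lockups" entries are believed to be called in 3.10-era kernels, so verify them in menuconfig. The helper names (enabled_options, missing_options) are hypothetical.

```python
import re

# Assumed symbol names for kdump support; verify against your kernel tree.
KDUMP_OPTIONS = [
    "CONFIG_KEXEC",
    "CONFIG_CRASH_DUMP",
    "CONFIG_PROC_VMCORE",
    "CONFIG_DEBUG_INFO",
]

# Assumed symbol names behind "Detect Hard and Soft Lockups" and its two
# "Panic (Reboot)" sub-options.
LOCKUP_OPTIONS = [
    "CONFIG_LOCKUP_DETECTOR",
    "CONFIG_BOOTPARAM_HARDLOCKUP_PANIC",
    "CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC",
]

# A line such as "CONFIG_KEXEC=y" or "CONFIG_FOO=m" counts as enabled.
_ENABLED_RE = re.compile(r"^(CONFIG_\w+)=(?:y|m)$")


def enabled_options(config_text):
    """Return the set of option names set to y or m in .config text."""
    enabled = set()
    for line in config_text.splitlines():
        match = _ENABLED_RE.match(line.strip())
        if match:
            enabled.add(match.group(1))
    return enabled


def missing_options(config_text, wanted):
    """Return the options from `wanted` that are not enabled, in order."""
    enabled = enabled_options(config_text)
    return [opt for opt in wanted if opt not in enabled]


def report(config_text):
    """Build a short human-readable summary of what is still missing."""
    lines = []
    for label, wanted in (("kdump", KDUMP_OPTIONS), ("lockup", LOCKUP_OPTIONS)):
        missing = missing_options(config_text, wanted)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        lines.append(label + ": " + status)
    return "\n".join(lines)
```

To use it, read the kernel's .config into a string (for example with open(path).read(), where the path depends on where your kernel sources live) and print report(text). Lines of the form "# CONFIG_FOO is not set" are treated as disabled.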