On Mon, 2013-04-08 at 11:15 -0400, Olivier Diotte wrote:
> Hi Ben,
>
> I just arrived at the machine this morning and it seems the
> system is responsive: gdm, my gnome-session,
> even gnome-screensaver were still running
> (the other two times, I had been kicked back
> to tty1 and the system was totally unresponsive).
>
> On the other hand, the doxygen process
> terminated (with a segfault though).
> The logs seem to indicate "page allocation failure"
> crashes

A page allocation failure is not a crash.

> and there are also errors that may
> indicate my USB dongle is at fault (at some point
> during the weekend, exim was unable to connect
> to localhost and there are a lot of errors related
> to wifi/networking).
>
> It also seems weird to me that /var/log/debug
> skips from April 6th 19h02 to April 8th 09h41

I don't think that's so weird.

> Crashes also seems to say my kernel is tainted
> which I am not sure why as the closest thing
> I have to a taint would be
> firmware-realtek (for the USB dongle).

No, the wireless driver taints the kernel because it comes from the
staging area of the kernel source.  The staging drivers have not been
thoroughly reviewed and are assumed (usually correctly) to be quite
buggy.

> I am unsure what to try next except another
> doxygen run friday, except this time I will
> deactivate/remove the USB dongle and reboot
> beforehand. Let me know if you have
> a better idea for tests, would like other infos, etc.
>
> Attached are all logs that seemed
> relevant (as a .tar.bz2 archive), edited to remove my
> MAC address (replaced with MY:MA:Ca:dr:es:s0,
> yeah, I forgot a d, but I am too lazy
> to edit now). I also removed all entries predating
> April 5th at around 17h00.
> I also attached /proc/meminfo from this morning, in case that is relevant.

OK, there's nothing weird in meminfo.

I think the basic problem is that doxygen is allocating more memory than
can be provided on this computer.  If the working set (the set of data
that's regularly accessed) for all running programs adds up to more than
the size of physical memory then the kernel will be continuously
swapping data to and from the swap partition, and the larger programs
will become unresponsive.  The wireless networking failure is just a
symptom of the shortage of free memory.

Changing the I/O class, as you originally attempted, doesn't affect
swapping, so far as I know.

Are you running doxygen over a particularly large set of sources?  I ask
because I want to know whether this could be a bug in doxygen (use of
excessive memory).

Ben.

-- 
Ben Hutchings
The first rule of tautology club is the first rule of tautology club.
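
For anyone wanting to watch for the memory shortage described above while
doxygen runs, here is a minimal sketch that reads the usual "Key: value kB"
layout of /proc/meminfo and reports how much of the swap space is in use.
The function names are hypothetical, and this is only a snapshot: a high
figure suggests memory pressure but does not by itself prove the system
is thrashing.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers.

    Values are in kB where the file gives a unit; a few entries
    (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_used_fraction(info):
    """Return the fraction of swap in use (0.0 if no swap is configured)."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - info.get("SwapFree", 0)) / total


# Usage on a live system (hypothetical invocation):
#     with open("/proc/meminfo") as f:
#         print(swap_used_fraction(parse_meminfo(f.read())))
```

Sampling this every few seconds during the run would show whether swap
usage climbs steadily before the machine becomes unresponsive.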