Michael:

Apologies if it looks like I just 'guessed' at the cause, but I assure you I 
did the basic 'diff' checks I do for all kernel issues. The git-log seemed 
pretty clear - the only change was the 'b43' patch and when I saw it was a 
bluetooth/WiFi interaction patch I didn't think I needed to dig much further.
At the time it happened, knowing I was being a guinea-pig for hardy-proposed, I 
was more interested in simply reporting the experience to prevent it getting 
promoted to hardy-updates if there was a problem.

I'm in the Kernel ACPI Team and I've spent most of my time recently
building a semi-automated testing system (using DKMS instead of
SystemTap) for easier (remote) investigation of this kind of issue (ACPI
is plagued by them). Experience there says to look to the basics first:
all recent changes are prime suspects unless proved otherwise!

Yes, b43 apparently not even being used is perplexing me. I was fully
aware the module wasn't in use, but unless the ABI bump is somehow
causing an issue with something in user-space, I couldn't see how it
could be anything other than some weird side effect.

I've now scheduled some time to do a git-bisect, just to try without the
ABI bumps since they are now suspect.

The PC checks out in every other way: memory passes tests, and there are
no other kernel module or configuration differences between the two
kernels (I checked the initrd image just in case).

Testing is slightly hampered by the fact that the OOPS is relatively
rare and random, and when it happens the only solution is a reboot.
Fortunately, SysRq+S will sometimes sync the disks, so the log should
be saved.

I've had the PC running with v2.6.24-20-generic for a while now and no
sign of the OOPS or any other instability.

The only thing I've done relating in any way to the on-board radios was
(about 1 week before the hardy-proposed v2.6.24-21-generic update)
install Alexander Sack's NetworkManager v0.7 PPA builds to test the 3G
and OpenVPN connections functionality. There is/was no problem with that
with kernel 2.6.24-20-generic.

The common thing in all the OOPS back-traces is the Bluetooth module
trying to add or del[ete] a connection (for the mouse). It seems to be
trying to double-add or double-free too, which would explain the "sysfs:
duplicate filename" error (on the second add) or the "NULL pointer
dereference" (on the second delete, if the object was already destroyed
by the first).
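
To make the suspected failure mode concrete, here is a minimal user-space
toy model of it. This is only a sketch, not kernel code: toy_dev, toy_add()
and toy_del() are hypothetical names invented for illustration, and the
table stands in very loosely for the sysfs/device registration. It shows
why a second add of the same name has to be rejected as a duplicate, and
why a second delete finds nothing left to free (in unguarded code that
would be the NULL pointer being dereferenced).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 8

/* Hypothetical stand-in for a registered device; not a kernel structure. */
struct toy_dev {
    char name[32];
};

static struct toy_dev *toy_table[TOY_MAX];

/* Return the slot holding `name`, or -1 if it is not registered. */
int toy_find(const char *name)
{
    for (int i = 0; i < TOY_MAX; i++)
        if (toy_table[i] && strcmp(toy_table[i]->name, name) == 0)
            return i;
    return -1;
}

/* Register `name`: 0 on success, -1 on duplicate, -2 on any other failure. */
int toy_add(const char *name)
{
    assert(name != NULL);
    if (strlen(name) >= sizeof(((struct toy_dev *)0)->name))
        return -2;                      /* name would not fit */
    if (toy_find(name) >= 0) {
        /* The second of two adds lands here (the "duplicate filename" case). */
        fprintf(stderr, "toy: duplicate filename '%s'\n", name);
        return -1;
    }
    for (int i = 0; i < TOY_MAX; i++) {
        if (!toy_table[i]) {
            struct toy_dev *d = calloc(1, sizeof(*d));
            if (!d)
                return -2;              /* out of memory */
            strcpy(d->name, name);      /* length checked above */
            toy_table[i] = d;
            return 0;
        }
    }
    return -2;                          /* table full */
}

/* Unregister `name`: 0 on success, -1 if it was already removed. */
int toy_del(const char *name)
{
    assert(name != NULL);
    int slot = toy_find(name);
    if (slot < 0) {
        /* The second of two deletes lands here; without this check the
         * code would go on to use a pointer that is already gone. */
        fprintf(stderr, "toy: '%s' already removed\n", name);
        return -1;
    }
    free(toy_table[slot]);
    toy_table[slot] = NULL;
    return 0;
}
```

Calling toy_add("conn0") twice returns 0 then -1, and calling
toy_del("conn0") twice afterwards returns 0 then -1. Whether the real
add/del paths can actually be entered twice for one connection is exactly
what still has to be established.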

This functionality was introduced with

commit 47b66fe95afa8400cefaea06263ab8948d8465ba
Author: Dave Young <[EMAIL PROTECTED]>
Date:   Fri Feb 15 01:34:03 2008 -0800

    BLUETOOTH: Add conn add/del workqueues to avoid connection fail.

$ git-describe --contains 47b66fe
v2.6.24.3~17

I'm dropping some debug code into a build of v2.6.24-21 around the calls
to hci_conn_add_conn() and hci_conn_del_conn() in
net/bluetooth/hci_sysfs.c, and to device_add() and device_del() in
drivers/base/core.c. These functions schedule and execute the workqueue
activity that is seen in the back-traces.

I'm also going to monitor the user-space HAL responses to the uevents.


** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => TJ (intuitivenipple)

-- 
OOPS: "Unable to handle kernel NULL pointer dereference"
https://bugs.launchpad.net/bugs/258450
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
