Package: linux-2.6
Version: 2.6.32-15~bpo50+1
Severity: important

Hi,
I'm using the latest lenny-backports kernel on a new Dell PowerEdge R710.
Every few days I get a kernel panic similar to: "Kernel panic - not syncing:
CRED: put_cred_rcu() sees f640ad80 with usage -163535872"
I set up an automatic vmcore dump with kexec in order to collect useful
information:

# uname -a
Linux hostname 2.6.32-bpo.5-686-bigmem #1 SMP Fri Jun 11 22:59:12 UTC 2010 i686 GNU/Linux
# dpkg -l linux-image-2.6.32-bpo.5-686-bigmem
ii  linux-image-2.6.32-bpo.5-686-bigmem  2.6.32-15~bpo50+1  Linux 2.6.32 for PCs with 4GB+ RAM
# dpkg -l firmware-bnx2
ii  firmware-bnx2  0.24~bpo50+1  Binary firmware for Broadcom NetXtremeII
# lsmod   
Module                  Size  Used by
xt_multiport            1775  1 
iptable_filter          1790  1 
ip_tables               7694  1 iptable_filter
x_tables                8327  2 xt_multiport,ip_tables
ipmi_devintf            4029  2 
ipmi_si                26744  1 
ipmi_msghandler        22871  2 ipmi_devintf,ipmi_si
ext2                   46337  1 
loop                    9765  0 
snd_pcm                47346  0 
snd_timer              12238  1 snd_pcm
snd                    34383  2 snd_pcm,snd_timer
evdev                   5609  0 
soundcore               3450  1 snd
psmouse                44665  0 
snd_page_alloc          5133  1 snd_pcm
dcdbas                  3948  0 
tpm_tis                 5572  0 
tpm                     8145  1 tpm_tis
tpm_bios                3573  1 tpm
serio_raw               2920  0 
pcspkr                  1207  0 
power_meter             6894  0 
button                  3598  0 
processor              26623  24 
ext3                   94308  3 
jbd                    32213  1 ext3
mbcache                 3766  2 ext2,ext3
dm_mirror               9683  0 
dm_region_hash          5652  1 dm_mirror
dm_log                  6425  2 dm_mirror,dm_region_hash
dm_snapshot            18033  0 
dm_mod                 46150  15 dm_mirror,dm_log,dm_snapshot
sg                     15980  0 
sr_mod                 10770  0 
cdrom                  26487  1 sr_mod
ata_generic             2019  0 
sd_mod                 25889  7 
crc_t10dif              1012  1 sd_mod
ses                     4516  0 
enclosure               4027  1 ses
ata_piix               17672  0 
ehci_hcd               28251  0 
uhci_hcd               16153  0 
libata                115989  2 ata_generic,ata_piix
usbcore                98810  3 ehci_hcd,uhci_hcd
nls_base                4541  1 usbcore
megaraid_sas           21953  6 
scsi_mod              101457  6 sg,sr_mod,sd_mod,ses,libata,megaraid_sas
bnx2                   52121  0 
thermal                 9198  0 
fan                     2590  0 
thermal_sys             9378  3 processor,thermal,fan
# lspci
00:00.0 Host bridge: Intel Corporation QuickPath Architecture I/O Hub to ESI Port (rev 13)
00:01.0 PCI bridge: Intel Corporation QuickPath Architecture I/O Hub PCI Express Root Port 1 (rev 13)
00:03.0 PCI bridge: Intel Corporation QuickPath Architecture I/O Hub PCI Express Root Port 3 (rev 13)
00:04.0 PCI bridge: Intel Corporation QuickPath Architecture I/O Hub PCI Express Root Port 4 (rev 13)
00:05.0 PCI bridge: Intel Corporation QuickPath Architecture I/O Hub PCI Express Root Port 5 (rev 13)
00:06.0 PCI bridge: Intel Corporation QuickPath Architecture I/O Hub PCI Express Root Port 6 (rev 13)
00:07.0 PCI bridge: Intel Corporation QuickPath Architecture I/O Hub PCI Express Root Port 7 (rev 13)
00:09.0 PCI bridge: Intel Corporation QuickPath Architecture I/O Hub PCI Express Root Port 9 (rev 13)
00:14.0 PIC: Intel Corporation QuickPath Architecture I/O Hub System Management Registers (rev 13)
00:14.1 PIC: Intel Corporation QuickPath Architecture I/O Hub GPIO and Scratch Pad Registers (rev 13)
00:14.2 PIC: Intel Corporation QuickPath Architecture I/O Hub Control Status and RAS Registers (rev 13)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller (rev 02)
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
03:00.0 RAID bus controller: LSI Logic / Symbios Logic Device 0079 (rev 04)
08:03.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 [Hermon] - Winbond/Nuvoton (rev 0a)



Then I analyzed the dump with crash from http://people.redhat.com/~anderson/
# ./crash /usr/lib/debug/boot/vmlinux-2.6.32-bpo.5-686-bigmem /root/vmcore_20100807_084759

crash 5.0.6
[...]
     KERNEL: /usr/lib/debug/boot/vmlinux-2.6.32-bpo.5-686-bigmem
   DUMPFILE: /root/vmcore_20100807_084759
       CPUS: 24
       DATE: Sat Aug  7 08:47:48 2010
     UPTIME: 1 days, 19:44:25
LOAD AVERAGE: 0.14, 0.43, 0.54
      TASKS: 583
   NODENAME: hostname
    RELEASE: 2.6.32-bpo.5-686-bigmem
    VERSION: #1 SMP Fri Jun 11 22:59:12 UTC 2010
    MACHINE: i686  (2925 Mhz)
     MEMORY: 24 GB
      PANIC: "[157461.063951] Kernel panic - not syncing: CRED: put_cred_rcu() sees f640ad80 with usage -163535872"
        PID: 0
    COMMAND: "swapper"
       TASK: f7151100  (1 of 24)  [THREAD_INFO: f715c000]
        CPU: 15
      STATE: TASK_RUNNING (PANIC)
crash>  log
[...]
[157461.063951] Kernel panic - not syncing: CRED: put_cred_rcu() sees f640ad80 with usage -163535872
[157461.063953] 
[157461.092227] Pid: 0, comm: swapper Not tainted 2.6.32-bpo.5-686-bigmem #1
[157461.092228] Call Trace:
[157461.092234]  [<c127d2b8>] ? panic+0x38/0xe4
[157461.092238]  [<c104f409>] ? put_cred_rcu+0x1c/0x7b
[157461.092241]  [<c1075bc5>] ? __rcu_process_callbacks+0x164/0x227
[157461.092243]  [<c1075ca4>] ? rcu_process_callbacks+0x1c/0x39
[157461.092246]  [<c103be20>] ? __do_softirq+0xaa/0x151
[157461.092254]  [<c103bef8>] ? do_softirq+0x31/0x3c
[157461.092260]  [<c103bfce>] ? irq_exit+0x26/0x58
[157461.092268]  [<c1019b20>] ? smp_apic_timer_interrupt+0x6c/0x76
[157461.092277]  [<c10089d5>] ? apic_timer_interrupt+0x31/0x38
[157461.092286]  [<c114007b>] ? radix_tree_node_alloc+0x2c/0x53
[157461.092299]  [<f89520e7>] ? acpi_idle_enter_bm+0x253/0x28e [processor]
[157461.092308]  [<c11d6269>] ? cpuidle_idle_call+0x68/0xbb
[157461.092316]  [<c1007229>] ? cpu_idle+0x89/0xa4
crash> bt
PID: 0      TASK: f7151100  CPU: 15  COMMAND: "swapper"
#0 [f715dddc] crash_kexec at c1064078
#1 [f715ddf8] machine_kexec at c101c7cd
#2 [f715de48] crash_kexec at c106408d
#3 [f715de9c] panic at c127d2ba
#4 [f715deb4] put_cred_rcu at c104f404
#5 [f715dee8] rcu_process_callbacks at c1075c9f
#6 [f715deec] __do_softirq at c103be1a
#7 [f715df14] do_softirq at c103bef3
#8 [f715df20] irq_exit at c103bfc9
#9 [f715df24] smp_apic_timer_interrupt at c1019b1b
#10 [f715df30] apic_timer_interrupt at c10089d0
#11 [f715df94] cpuidle_idle_call at c11d6265
#12 [f715dfa0] cpu_idle at c1007223
crash> struct cred f640ad80
struct cred {
 usage = {
   counter = -163535872
 }, 
 uid = 3238771162, 
 gid = 0, 
 suid = 0, 
 sgid = 3240690332, 
 euid = 0, 
 egid = 0, 
 fsuid = 0, 
 fsgid = 13, 
 securebits = 749, 
 cap_inheritable = {
   cap = {0, 16777216}
 }, 
 cap_permitted = {
   cap = {0, 0}
 }, 
 cap_effective = {
   cap = {0, 0}
 }, 
 cap_bset = {
   cap = {0, 4083503872}
 }, 
 jit_keyring = 0 '\000', 
 thread_keyring = 0x0, 
 request_key_auth = 0x0, 
 tgcred = 0x20, 
 security = 0x0, 
 user = 0xffffffff, 
 group_info = 0xffffffff, 
 rcu = {
   next = 0x0, 
   func = 0
 }
}
crash> mod
MODULE   NAME               SIZE  OBJECT FILE
f804feb4  ipmi_msghandler   22871  (not loaded)  [CONFIG_KALLSYMS]
f8054e3c  thermal_sys        9378  (not loaded)  [CONFIG_KALLSYMS]
f8059964  x_tables           8327  (not loaded)  [CONFIG_KALLSYMS]
f805dd00  ipmi_devintf       4029  (not loaded)  [CONFIG_KALLSYMS]
f8060718  fan                2590  (not loaded)  [CONFIG_KALLSYMS]
f8068b4c  ipmi_si           26744  (not loaded)  [CONFIG_KALLSYMS]
f806fd30  thermal            9198  (not loaded)  [CONFIG_KALLSYMS]
f838aa44  ip_tables          7694  (not loaded)  [CONFIG_KALLSYMS]
f839149c  iptable_filter     1790  (not loaded)  [CONFIG_KALLSYMS]
f83a2a70  bnx2              52121  (not loaded)  [CONFIG_KALLSYMS]
f83bc2d0  scsi_mod         101457  (not loaded)  [CONFIG_KALLSYMS]
f83eb46c  xt_multiport       1775  (not loaded)  [CONFIG_KALLSYMS]
f8421afc  megaraid_sas      21953  (not loaded)  [CONFIG_KALLSYMS]
f8431ef0  nls_base           4541  (not loaded)  [CONFIG_KALLSYMS]
f8476acc  usbcore           98810  (not loaded)  [CONFIG_KALLSYMS]
f84d5720  libata           115989  (not loaded)  [CONFIG_KALLSYMS]
f84f9860  uhci_hcd          16153  (not loaded)  [CONFIG_KALLSYMS]
f85134ec  ehci_hcd          28251  (not loaded)  [CONFIG_KALLSYMS]
f852c16c  ata_piix          17672  (not loaded)  [CONFIG_KALLSYMS]
f855db1c  enclosure          4027  (not loaded)  [CONFIG_KALLSYMS]
f859cdcc  ses                4516  (not loaded)  [CONFIG_KALLSYMS]
f85d5268  crc_t10dif         1012  (not loaded)  [CONFIG_KALLSYMS]
f8622b4c  sd_mod            25889  (not loaded)  [CONFIG_KALLSYMS]
f862e5e8  ata_generic        2019  (not loaded)  [CONFIG_KALLSYMS]
f8645ba8  cdrom             26487  (not loaded)  [CONFIG_KALLSYMS]
f8657490  sr_mod            10770  (not loaded)  [CONFIG_KALLSYMS]
f866b95c  sg                15980  (not loaded)  [CONFIG_KALLSYMS]
f869b598  dm_mod            46150  (not loaded)  [CONFIG_KALLSYMS]
f86b4c7c  dm_snapshot       18033  (not loaded)  [CONFIG_KALLSYMS]
f86c33e0  dm_log             6425  (not loaded)  [CONFIG_KALLSYMS]
f86d01a4  dm_region_hash     5652  (not loaded)  [CONFIG_KALLSYMS]
f86e115c  dm_mirror          9683  (not loaded)  [CONFIG_KALLSYMS]
f88c6b20  mbcache            3766  (not loaded)  [CONFIG_KALLSYMS]
f88e3d64  jbd               32213  (not loaded)  [CONFIG_KALLSYMS]
f8928000  ext3              94308  (not loaded)  [CONFIG_KALLSYMS]
f893aa58  button             3598  (not loaded)  [CONFIG_KALLSYMS]
f8955b50  processor         26623  (not loaded)  [CONFIG_KALLSYMS]
f8985648  power_meter        6894  (not loaded)  [CONFIG_KALLSYMS]
f898f274  pcspkr             1207  (not loaded)  [CONFIG_KALLSYMS]
f899a82c  serio_raw          2920  (not loaded)  [CONFIG_KALLSYMS]
f89aaa94  tpm_bios           3573  (not loaded)  [CONFIG_KALLSYMS]
f89b9a38  tpm                8145  (not loaded)  [CONFIG_KALLSYMS]
f89c81b4  tpm_tis            5572  (not loaded)  [CONFIG_KALLSYMS]
f89f0acc  psmouse           44665  (not loaded)  [CONFIG_KALLSYMS]
f8a07acc  dcdbas             3948  (not loaded)  [CONFIG_KALLSYMS]
f8a1407c  snd_page_alloc     5133  (not loaded)  [CONFIG_KALLSYMS]
f8a211d4  evdev              5609  (not loaded)  [CONFIG_KALLSYMS]
f8a34980  soundcore          3450  (not loaded)  [CONFIG_KALLSYMS]
f8a54690  snd               34383  (not loaded)  [CONFIG_KALLSYMS]
f8a6c85c  snd_timer         12238  (not loaded)  [CONFIG_KALLSYMS]
f8a93b18  snd_pcm           47346  (not loaded)  [CONFIG_KALLSYMS]
f8ab8168  loop               9765  (not loaded)  [CONFIG_KALLSYMS]
f8cee24c  ext2              46337  (not loaded)  [CONFIG_KALLSYMS]
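
A note on reading the struct cred dump above: crash prints the fields as
decimal integers, which hides what they look like as raw 32-bit words. The
Python sketch below is only an aid for reading the dump, not a diagnosis. It
reinterprets a few of the printed values as unsigned 32-bit numbers and checks
whether they fall in the kernel half of the address space. PAGE_OFFSET =
0xC0000000 is an assumption (the usual 3G/1G split on i386; the c1xxxxxx text
addresses in the backtrace are consistent with it), and the helper names are
made up for this example.

```python
# Reinterpret decimal fields printed by crash as unsigned 32-bit words.
# All values are copied from the `struct cred f640ad80` output in this report.
PAGE_OFFSET = 0xC0000000  # assumption: default 3G/1G split on 32-bit x86
CRED_ADDR = 0xF640AD80    # address of the cred reported by the panic message

fields = {
    "usage.counter": -163535872,
    "uid": 3238771162,
    "sgid": 3240690332,
    "cap_bset.cap[1]": 4083503872,
}


def as_u32(value):
    """Return value reinterpreted as an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF


def looks_like_kernel_address(value):
    """True if the 32-bit word lies at or above PAGE_OFFSET."""
    return as_u32(value) >= PAGE_OFFSET


def same_4k_page(a, b):
    """True if both addresses fall in the same 4 KiB page."""
    return (a >> 12) == (b >> 12)


for name, value in fields.items():
    word = as_u32(value)
    print(f"{name:16} {word:#010x} "
          f"kernel-range={looks_like_kernel_address(value)}")

usage_word = as_u32(fields["usage.counter"])
print("usage word in same page as cred:", same_4k_page(usage_word, CRED_ADDR))
```

If several fields that should hold small integers (a reference count, a uid,
a gid) turn out to look like kernel addresses, one possible reading is that
the memory at f640ad80 no longer held a valid cred when put_cred_rcu() ran,
for example because it had already been freed and reused. The dump alone does
not prove this.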

We noticed something: when the server froze after a panic (before I set it up
to automatically switch to a crash kernel), the other servers connected to the
same Ethernet switch were unreachable over the network. It looks as if the
Ethernet card goes crazy and starts sending random data, but I can't say for
sure. Restarting the faulty server gets everything back in order.

I hope there is enough data here to identify the cause of this bug. I will keep
the vmcore dump for some time in case someone wants more information.

Regards,
Joseph.



