[EMAIL PROTECTED] wrote:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=6412
> 
>            Summary: Kernel crashes randomly -- Unable to handle kernel NULL
>                     pointer dereference ...
>     Kernel Version: 2.6.16.5 - mainline, neither out of tree modules loaded
>                     nor comp
>             Status: NEW
>           Severity: normal
>              Owner: [EMAIL PROTECTED]
>          Submitter: [EMAIL PROTECTED]
> 
> 
> Most recent kernel where this bug did not occur:
> Unknown - My SMP machine has kept crashing since kernel version 2.6.10 or so,
> up to 7 times a day. Most of the time I am unable to tell you the cause of the
> crashes, because my syslogs do not contain any data about them.
> 
> Distribution: Debian
> 
> Hardware Environment: SMP, i386,
> 
> Software Environment: root-nfs; SMP disabled in the kernel in the hope of
> reducing the number of kernel crashes; I was running X, KDE 3.5 and Mozilla
> 
> Problem Description:
> Suddenly, my PC restarted X, started xdm, and my screen, mouse and keyboard
> were frozen. I was able to log into the crashed machine from my nfs server via
> ssh and to produce this dmesg:
> 
> IN=wan0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:80:77:48:f6:fa:08:00 SRC=192.168.2.2
> DST=192.168.2.255 LEN=229 TOS=0x00 PREC=0x00 TTL=60 ID=966 PROTO=UDP SPT=138
> DPT=138 LEN=209
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
>  printing eip:
> 00000000
> *pde = 5386a067
> Oops: 0000 [#1]
> PREEMPT
> Modules linked in: esp6 ah6 ipcomp esp4 ah4 xfrm_user arc4 af_packet lp autofs4
> tun ipx p8022 psnap llc p8023 bridge iptable_mangle ipt_TCPMSS xt_state
> ipt_REJECT ipt_LOG ipt_multiport iptable_filter ipt_MASQUERADE ipt_REDIRECT
> xt_tcpudp iptable_nat ip_tables ip6table_raw ip6table_mangle ip6t_hl xt_limit
> ip6t_multiport ip6t_LOG ip6table_filter ip6_tables x_tables ipv6 deflate
> zlib_deflate zlib_inflate sha1 crypto_null af_key binfmt_misc nfsd exportfs
> eeprom i2c_viapro ppdev ip_nat_ftp ip_nat ip_conntrack_ftp ip_conntrack
> nfnetlink ide_floppy ide_disk ide_cd snd_seq_dummy snd_seq_oss snd_seq_midi
> snd_seq_midi_event snd_seq snd_via82xx snd_ens1371 snd_pcm_oss snd_mixer_oss
> gameport snd_via82xx_modem snd_ac97_codec snd_ac97_bus snd_mpu401_uart snd_pcm
> snd_rawmidi snd_seq_device snd_timer snd via82cxxx generic psmouse
> snd_page_alloc ide_core serio_raw soundcore dl2k 8139too uhci_hcd via686a hwmon
> i2c_isa usbcore parport_pc parport unix
> CPU:    0
> EIP:    0060:[<00000000>]    Not tainted VLI
> EFLAGS: 00213246   (2.6.16.5-d6vaa-1CPU #4)
> EIP is at run_init_process+0x3feffde0/0x29
> eax: f6bff340   ebx: f6bff340   ecx: 00000003   edx: 00000003
> esi: 00000000   edi: f748e000   ebp: 00000003   esp: f748ff70
> ds: 007b   es: 007b   ss: 0068
> Process Xorg (pid: 6506, threadinfo=f748e000 task=f7a02030)
> Stack: <0>f88161f5 f5c65bc0 00000022 086c2690 f748e000 c0266151 00000000 0000000c
>        c0266b67 00000022 00000002 00000022 00000002 00000000 00000000 00000000
>        00000022 0000000d c0102a2d 0000000d bfd5c0e0 08700a38 00000022 086c2690
> Call Trace:
>  [<f88161f5>] unix_shutdown+0x54/0x125 [unix]
>  [<c0266151>] sys_shutdown+0x24/0x35
>  [<c0266b67>] sys_socketcall+0x128/0x181
>  [<c0102a2d>] syscall_call+0x7/0xb
> Code:  Bad EIP value.
>  <6>agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
> agpgart: Putting AGP V2 device at 0000:00:00.0 into 1x mode
> agpgart: Putting AGP V2 device at 0000:01:00.0 into 1x mode
> 
> Finally I was able to reboot the system via ssh. ps -e |grep xdm told me
> that no xdm was running, but xdm was showing on my frozen screen. top showed
> that the crashed machine was not under load. => no livelock
> 
> Steps to reproduce: unknown, because my machine crashes randomly
> 

The CPU has started execution at address 0x00000000, called from
unix_shutdown().  I'd assume that sk->sk_state_change is NULL there.

Could you add this patch?  If my theory is correct, it will prevent the
crashes and will give us the same info.


diff -puN net/unix/af_unix.c~a net/unix/af_unix.c
--- 25/net/unix/af_unix.c~a     Wed Apr 19 13:45:05 2006
+++ 25-akpm/net/unix/af_unix.c  Wed Apr 19 13:48:13 2006
@@ -1780,6 +1780,19 @@ out:
        return copied ? : err;
 }
 
+static void do_sk_state_change(struct sock *sk)
+{
+       void (*sk_state_change)(struct sock *sk);
+
+       sk_state_change = sk->sk_state_change;
+       if (!sk_state_change) {
+               printk(KERN_ERR "%s: sk_state_change=NULL\n", __FUNCTION__);
+               dump_stack();
+       } else {
+               sk_state_change(sk);
+       }
+}
+
 static int unix_shutdown(struct socket *sock, int mode)
 {
        struct sock *sk = sock->sk;
@@ -1794,7 +1807,7 @@ static int unix_shutdown(struct socket *
                if (other)
                        sock_hold(other);
                unix_state_wunlock(sk);
-               sk->sk_state_change(sk);
+               do_sk_state_change(sk);
 
                if (other &&
                        (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET)) {
@@ -1808,7 +1821,7 @@ static int unix_shutdown(struct socket *
                        unix_state_wlock(other);
                        other->sk_shutdown |= peer_mode;
                        unix_state_wunlock(other);
-                       other->sk_state_change(other);
+                       do_sk_state_change(other);
                        read_lock(&other->sk_callback_lock);
                        if (peer_mode == SHUTDOWN_MASK)
                                sk_wake_async(other,1,POLL_HUP);
_
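
For anyone who wants to see the shape of the guard outside the kernel, here is a
minimal userspace sketch of the same idea. It is not the kernel code: struct
fake_sock, mark_notified() and the integer return value are assumptions made up
for illustration, and fprintf() stands in for printk()/dump_stack(). The point
is only that the callback pointer is read once and checked before the indirect
call, so a NULL pointer produces a report rather than a jump to address 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct sock; only the callback matters. */
struct fake_sock {
	void (*sk_state_change)(struct fake_sock *sk);
	int notified;	/* set by the example callback so a caller can observe it */
};

/* Example callback (illustrative only): records that it ran. */
static void mark_notified(struct fake_sock *sk)
{
	sk->notified = 1;
}

/*
 * Call sk->sk_state_change if it is set. If it is NULL, report the fact and
 * return -1 without making the indirect call. Returns 0 when the callback ran.
 */
static int do_sk_state_change(struct fake_sock *sk)
{
	void (*cb)(struct fake_sock *sk);

	assert(sk != NULL);
	cb = sk->sk_state_change;	/* read the pointer once, then test it */
	if (!cb) {
		fprintf(stderr, "%s: sk_state_change=NULL\n", __func__);
		return -1;
	}
	cb(sk);
	return 0;
}
```

Calling do_sk_state_change() on a struct fake_sock whose callback is
mark_notified returns 0 and sets notified; calling it on one whose callback is
NULL prints the message and returns -1 without crashing.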

