Re: [IPV6]: Audit all ip6_dst_lookup/ip6_dst_store calls

Herbert Xu Mon, 31 Jul 2006 02:06:16 -0700

On Sun, Jul 30, 2006 at 10:32:10PM -0500, Matt Domsch wrote:
> 
> I applied this on 2.6.18-rc3, and it panics immediately as the first
> IPv6 TCP (ssh) session is initiated to the system.


Executive summary:

1) We resolved one lockdep warning only to stumble onto another lockdep
validator bug.  

2) There is something broken in the x86_64 unwind code which is causing
it to panic just about everytime somebody calls dump_stack().

Andi, this is the second time I've seen a report where an otherwise
harmless dump_stack call (the other one was caused by a WARN_ON) gets
turned into a panic by the stack unwind code on x86_64.  This particular
report is with 2.6.18-rc3 so it looks like whatever bug is causing it
hasn't been fixed yet.

Could you please have a look at it? Thanks.

> =============================================
> [ INFO: possible recursive locking detected ]
> ---------------------------------------------
> swapper/0 is trying to acquire lock:
>  (slock-AF_INET6){-+..}, at: [<ffffffff80414fda>] sk_clone+0xd2/0x3a8
> 
> but task is already holding lock:
>  (slock-AF_INET6){-+..}, at: [<ffffffff883d71a8>] tcp_v6_rcv+0x30e/0x76e 
> [ipv6]
> 
> other info that might help us debug this:
> 1 lock held by swapper/0:
>  #0:  (slock-AF_INET6){-+..}, at: [<ffffffff883d71a8>] tcp_v6_rcv+0x30e/0x76e 
> [ipv6]
> 
> stack backtrace:
> 
> Call Trace:
>  [<ffffffff8026f861>] show_trace+0xae/0x30e
>  [<ffffffff8026fad6>] dump_stack+0x15/0x17
>  [<ffffffff802a73d4>] __lock_acquire+0x12e/0xa18
>  [<ffffffff802a8232>] lock_acquire+0x4b/0x69
>  [<ffffffff8026883b>] _spin_lock+0x25/0x31
>  [<ffffffff80414fda>] sk_clone+0xd2/0x3a8
>  [<ffffffff8043c8a7>] inet_csk_clone+0x11/0x6f
>  [<ffffffff80445615>] tcp_create_openreq_child+0x24/0x49c
>  [<ffffffff883d5d85>] :ipv6:tcp_v6_syn_recv_sock+0x2c5/0x6be
>  [<ffffffff80445c5e>] tcp_check_req+0x1d1/0x326
>  [<ffffffff883d4f0e>] :ipv6:tcp_v6_do_rcv+0x15d/0x372
>  [<ffffffff883d75b9>] :ipv6:tcp_v6_rcv+0x71f/0x76e
>  [<ffffffff883ba49f>] :ipv6:ip6_input+0x223/0x315
>  [<ffffffff883bab4d>] :ipv6:ipv6_rcv+0x254/0x2af
>  [<ffffffff80221883>] netif_receive_skb+0x260/0x2dd
>  [<ffffffff88101292>] :e1000:e1000_clean_rx_irq+0x423/0x4c2
>  [<ffffffff880ff752>] :e1000:e1000_clean+0x88/0x17d
>  [<ffffffff8020caed>] net_rx_action+0xac/0x1d1
>  [<ffffffff80212809>] __do_softirq+0x68/0xf5
>  [<ffffffff802626fa>] call_softirq+0x1e/0x28

Now let's move onto the lockdep validator bug :)

Ingo/Arjen, I thought we've already fixed this before but somehow I
can't find anything in the email archives so perhaps I'm mixing it up
with another recursive lock false-positive.

The problem here is really quite simple: when we accept a TCP connection
there are two sockets involved.  The listening socket which existed before
the connection came in and the socket we construct for the newly arrived
connection.

The code works something like this:

* Take slock on listening socket.
* Construct child socket.
* Take slock on child socket.

As we never do the locking in the opposite direction (child followed by
listening socket) this is safe.

So perhaps we need to add some extra annotation in sk_clone?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [IPV6]: Audit all ip6_dst_lookup/ip6_dst_store calls

Reply via email to