> > The second one seems to be trickier. It looks like a race wrt. PADT
> > message reception. Reproducing the bug will probably require to
> > generate some PADT flooding to a host that creates and releases PPPoE
> > connections.
OK, I think I can see the potential race here: the PADT frame is
received while the PPPoE interface is being deleted. (I will have a go
at inducing this with msleep() in the code tomorrow.)
1. pppoe_flush_dev() - sk->sk_state = PPPOX_DEAD, po->pppoe_dev = NULL
2. pppoe_connect() - sk->sk_state = PPPOX_NONE, po->pppoe_dev = NULL
3. pppoe_disc_rcv() - sk->sk_state = PPPOX_ZOMBIE, po->pppoe_dev = NULL
4. pppoe_release() - dev_put(po->pppoe_dev) ----> Oops
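To make the sequence concrete, here is a small user-space model of the four steps. It is only a sketch, not kernel code: struct model_sock and the model_*() helpers are hypothetical names, and the flag values are what I believe include/linux/if_pppox.h defines, so treat them as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* State flags as I read them in include/linux/if_pppox.h (assumption). */
enum {
	PPPOX_NONE      = 0,
	PPPOX_CONNECTED = 1,
	PPPOX_BOUND     = 2,
	PPPOX_RELAY     = 4,
	PPPOX_ZOMBIE    = 8,
	PPPOX_DEAD      = 16,
};

/* Hypothetical stand-in for the two socket fields that matter here. */
struct model_sock {
	int sk_state;
	void *pppoe_dev;	/* stands in for struct net_device * */
};

/* Step 1: the device goes away and the socket drops its reference. */
static void model_flush_dev(struct model_sock *s)
{
	s->sk_state = PPPOX_DEAD;
	s->pppoe_dev = NULL;
}

/* Step 2: connect() resets the state; no device is attached yet. */
static void model_connect_reset(struct model_sock *s)
{
	s->sk_state = PPPOX_NONE;
}

/* Step 3: the PADT handler marks the socket zombie unconditionally. */
static void model_disc_rcv(struct model_sock *s)
{
	s->sk_state = PPPOX_ZOMBIE;
}

/* Step 4: the state test that guards dev_put() on release. */
static int model_release_would_put(const struct model_sock *s)
{
	return (s->sk_state &
		(PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) != 0;
}

/* Replays steps 1-4; returns 1 if release would dev_put() a NULL pointer. */
static int model_replay_race(void)
{
	int dev;	/* dummy object, only its address is used */
	struct model_sock s = { PPPOX_CONNECTED, &dev };

	model_flush_dev(&s);
	model_connect_reset(&s);
	model_disc_rcv(&s);
	assert(s.sk_state == PPPOX_ZOMBIE);

	return model_release_would_put(&s) && s.pppoe_dev == NULL;
}
```

model_replay_race() returning non-zero corresponds to the Oops in step 4: the state says "zombie, still holding a device" while the pointer is already NULL.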
Either we add this condition in pppoe_disc_rcv():
@@ -496,7 +499,8 @@ static int pppoe_disc_rcv(struct sk_buff *skb, struct net_device *dev,
 		/* We're no longer connect at the PPPOE layer,
 		 * and must wait for ppp channel to disconnect us.
 		 */
-		sk->sk_state = PPPOX_ZOMBIE;
+		if (sk->sk_state & PPPOX_CONNECTED)
+			sk->sk_state = PPPOX_ZOMBIE;
 	}
Or perhaps we remove the assumption that a socket in state PPPOX_ZOMBIE
always has a non-NULL pppoe_dev.
I don't know why the check in pppoe_release() isn't like the following
anyway.
-	if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) {
+	if (po->pppoe_dev) {
 		dev_put(po->pppoe_dev);
 		po->pppoe_dev = NULL;
 	}
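As a rough comparison of the two options, the user-space sketch below models the guarded PADT handler and the two possible release checks. Again this is not kernel code: struct model_sock and the model_*() names are hypothetical, and the flag values are assumed to match include/linux/if_pppox.h.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of the state flags, values assumed from include/linux/if_pppox.h. */
enum {
	PPPOX_NONE      = 0,
	PPPOX_CONNECTED = 1,
	PPPOX_BOUND     = 2,
	PPPOX_ZOMBIE    = 8,
};

/* Hypothetical stand-in for the two socket fields that matter here. */
struct model_sock {
	int sk_state;
	void *pppoe_dev;	/* stands in for struct net_device * */
};

/* Option A: a PADT only zombifies a socket that is actually connected. */
static void model_disc_rcv_guarded(struct model_sock *s)
{
	if (s->sk_state & PPPOX_CONNECTED)
		s->sk_state = PPPOX_ZOMBIE;
}

/* Existing release check: decides from the state flags. */
static int model_put_by_state(const struct model_sock *s)
{
	return (s->sk_state &
		(PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) != 0;
}

/* Option B: release decides from the pointer itself. */
static int model_put_by_pointer(const struct model_sock *s)
{
	return s->pppoe_dev != NULL;
}

/* Checks both options against the socket left behind by steps 1-2. */
static void model_check_fixes(void)
{
	struct model_sock a = { PPPOX_NONE, NULL };
	struct model_sock b = { PPPOX_ZOMBIE, NULL };

	/* Option A: the late PADT no longer changes the state. */
	model_disc_rcv_guarded(&a);
	assert(a.sk_state == PPPOX_NONE);
	assert(!model_put_by_state(&a));

	/* Option B: even a zombie with no device is left alone. */
	assert(model_put_by_state(&b));
	assert(!model_put_by_pointer(&b));
}
```

Option A keeps the state and the pointer consistent, while option B tolerates them disagreeing. Neither sketch says anything about locking, which the real fix would still have to get right.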