Michael Buesch wrote:
The real question is: Why does this patch help? Let me explain. We don't stop networking there just for fun. While executing long, preemptible periodic work, we must ensure that the TX path into the driver is not entered. It's the same reason why we disable IRQs in the first place. We can't take the mutex in the TX path or in the IRQ handler. (Those are the only places where we can't take the mutex.) In short: we must stop netif here. The question is: Why does stopping the netif queue cause a watchdog trigger here? The periodic work inside the critical section takes about 0.2 sec at most, so the queue is stopped for about 0.2 sec max. Why does the watchdog trigger? Any idea from some networking guru? Could synchronize_net() take over 5 sec in some worst case? Why? Questions over questions :D
To check whether it takes more than 5 seconds, I restored the original network-disabling code and increased the watchdog timeout to 30 seconds. If this runs without error, I'll try to narrow down how long it actually takes. I'm still running that branch of the periodic work every second.
Larry

Index: wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
===================================================================
--- wireless-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -4147,6 +4147,7 @@ static int __devinit bcm43xx_init_one(st
 	SET_MODULE_OWNER(net_dev);
 	SET_NETDEV_DEV(net_dev, &pdev->dev);
+	net_dev->watchdog_timeo = 30 * HZ;
 	net_dev->open = bcm43xx_net_open;
 	net_dev->stop = bcm43xx_net_stop;
 	net_dev->get_stats = bcm43xx_net_get_stats;

Index: wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
===================================================================
--- wireless-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -3207,7 +3207,7 @@ static void do_periodic_work(struct bcm4
 	bcm43xx_periodic_every15sec(bcm);
 	bcm->periodic_state = state + 1;
-	schedule_delayed_work(&bcm->periodic_work, HZ * 15);
+	schedule_delayed_work(&bcm->periodic_work, HZ * 1);
 }
 /* Estimate a "Badness" value based on the periodic work
@@ -3227,7 +3227,7 @@ static int estimate_periodic_work_badnes
 	if (state % 1 == 0) /* every 15 sec */
 		badness += 1;
-#define BADNESS_LIMIT 4
+#define BADNESS_LIMIT 0
 	return badness;
 }
@@ -4147,6 +4147,7 @@ static int __devinit bcm43xx_init_one(st
 	SET_MODULE_OWNER(net_dev);
 	SET_NETDEV_DEV(net_dev, &pdev->dev);
+	net_dev->watchdog_timeo = 30 * HZ;
 	net_dev->open = bcm43xx_net_open;
 	net_dev->stop = bcm43xx_net_stop;
 	net_dev->get_stats = bcm43xx_net_get_stats;
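
For readers following the timing argument, here is a rough user-space sketch of the comparison the watchdog_timeo change is meant to probe. It is not kernel code. struct model_dev, watchdog_would_fire() and the HZ value are made up for illustration, and the condition is only my understanding of the generic check (queue stopped, and more than watchdog_timeo jiffies elapsed since the last transmit).

```c
#include <stdbool.h>

/* Assumed tick rate for this model only; the real HZ is a kernel config choice. */
#define HZ 250UL

/* Hypothetical stand-in for the few net_device fields involved. */
struct model_dev {
	bool queue_stopped;            /* models netif_queue_stopped() */
	unsigned long last_tx;         /* models trans_start, in jiffies */
	unsigned long watchdog_timeo;  /* timeout, in jiffies */
};

/*
 * Returns true if, at time `now` (in jiffies), the modeled watchdog
 * would report a transmit timeout: the queue is stopped and more than
 * watchdog_timeo jiffies have elapsed since the last transmit.
 * Unsigned subtraction keeps the comparison valid across wraparound.
 */
static bool watchdog_would_fire(const struct model_dev *dev, unsigned long now)
{
	return dev->queue_stopped &&
	       (now - dev->last_tx) > dev->watchdog_timeo;
}
```

With a 5 * HZ timeout, a queue stopped for HZ / 5 jiffies (0.2 sec) right after a transmit stays well under the limit in this model. Note that the model measures from the last transmit, not from the moment the queue was stopped. Raising the timeout to 30 * HZ, as in the patch above, only widens the window so that the real stop time can be measured.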