On Mon, Jun 05, 2006 at 04:11:25PM -0700, Kok, Auke wrote:
>
> Netpoll was broken due to the earlier addition of multiqueue.
>
> Signed-off-by: Mitch Williams <[EMAIL PROTECTED]>
> Signed-off-by: Auke Kok <[EMAIL PROTECTED]>
> ---
>
> drivers/net/e1000/e1000_main.c | 9 ++++++++-
> 1 files changed, 8 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
> index ed15fca..7103a0e 100644
> --- a/drivers/net/e1000/e1000_main.c
> +++ b/drivers/net/e1000/e1000_main.c
> @@ -4629,10 +4629,17 @@ static void
> e1000_netpoll(struct net_device *netdev)
> {
> struct e1000_adapter *adapter = netdev_priv(netdev);
> +#ifdef CONFIG_E1000_NAPI
> + int budget = 0;
> +
> + disable_irq(adapter->pdev->irq);
> + e1000_clean_tx_irq(adapter, adapter->tx_ring);
> + adapter->clean_rx(adapter, adapter->rx_ring, &budget, netdev->weight);
> +#else
> +
I've been speaking about this fix with a Jeff Moyer, and we've come up with some
concerns regarding its implementation. Specifically the call to
adapter->clean_rx in the case of the e1000 driver is rather a layering
violation in the netpoll code, in the sense that the function pointed to by
clean_rx
is functionality that is nominally used by the dev->poll method. In fact in
this case, it would appear possible since dev->poll is called under the
poll_lock, but dev->poll_controller is not, that is is possible to have cpus in
a system executing in e1000_clean_rx_irq_[ps] at the same time leading to data
corruption:
CPU0:
netpoll_poll_dev
dev->poll_controller (e1000_netpoll)
adapter->clean_rx (e1000_clean_rx_irq)
CPU1:
napi_poll
dev->poll (e1000_clean)
e1000_clean_rx_irq
I'm not sure what the right fix is here. A spinlock in e1000_clean_rx_irq[_ps]
would seem to be an easy and direct fix, but I don't know that thats the best
solution.
Something I don't understand is why the call to adapter->clean_rx is
required in the first place when the napi_poll routine calls it itself directly.
All other drivers schedule a napi poll and receive frames via that path. My
guess is that it has to do with the fact that we schedule polls on the device in
the polling_netdev array, rather than the actual registered netdev. Looking at
the driver code I note that while an entire array is allocated for polling
netdevs, we only ever use entry 0. Would it make sense to just remove the the
polling_netdev array and use the registered device like all the other drivers.
I expect that would eliminate the need for this patch as well.
Regards
Neil
> disable_irq(adapter->pdev->irq);
> e1000_intr(adapter->pdev->irq, netdev, NULL);
> e1000_clean_tx_irq(adapter, adapter->tx_ring);
> -#ifndef CONFIG_E1000_NAPI
> adapter->clean_rx(adapter, adapter->rx_ring);
> #endif
> enable_irq(adapter->pdev->irq);
>
>
>
> --
> Auke Kok <[EMAIL PROTECTED]>
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
/***************************************************
*Neil Horman
*Software Engineer
*gpg keyid: 1024D / 0x92A74FA1 - http://pgp.mit.edu
***************************************************/
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html