On Tue, Sep 06, 2005 at 03:36:27PM -0700, David S. Miller wrote:
> From: Eugene Surovegin <[EMAIL PROTECTED]>
> Date: Tue, 6 Sep 2005 15:04:17 -0700
> 
> > David, correct me if I'm wrong, but I think there is a major problem
> > with current netconsole/netpoll approach.
> 
> You're preaching to the choir.  I think the whole netpoll
> implementation is fundamentally flawed, and the locking problems we
> keep bumping into are merely a symptom.
> 
> People want this thing so badly, that I keep letting them continue to
> patch this thing into quasi-working, even though its foundations are
> what are so problematic.

Well I agree in some ways and disagree in others.

> It's never going to work 100% reliably, I think. Here's why:
> 
> The core issue, and conflict, is that the desire is to have the
> responses be immediate and come at the moment the event occurs.
> Because the situation may be so dire that deferring into a more
> appropriate software IRQ context may not be possible, and thus we'd
> lose the log message or event.

In the case of kgdb-over-ethernet or netdump or several others,
deferring to an IRQ context doesn't even make sense.
 
> So we try to spit out netconsole messages in hw IRQ context and stuff
> like that, as you stated.  The tg3 driver is susceptible to the
> problem you mention, as is bnx2, because they use purely software
> interrupt spinlocking, and thus their timers will deadlock if any hw
> IRQ context netpoll operations occur.

I'm not aware of the tg3 problem, please describe it in more detail.
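
For readers following the thread, the general shape of the deadlock
described in the quoted paragraph can be sketched in userspace. This is
only a toy model, not the tg3 or bnx2 code: SimulatedDriver,
netpoll_send and timer_tick are hypothetical names, and a non-reentrant
threading.Lock stands in for a spinlock that is taken without masking
hardware interrupts. A real spinlock would spin forever in this
situation; the sketch uses a non-blocking acquire so that it reports the
problem instead of hanging.

```python
import threading


class SimulatedDriver:
    """Toy model of a driver whose lock is only safe against softirqs."""

    def __init__(self):
        # Stands in for a lock taken without disabling hardware IRQs.
        self.lock = threading.Lock()
        self.log = []

    def netpoll_send(self, msg):
        # Simulated hw IRQ context transmit path: it needs the same lock.
        # A real spinlock would spin here forever; we only try once.
        if not self.lock.acquire(blocking=False):
            self.log.append(f"would deadlock sending {msg!r}")
            return False
        try:
            self.log.append(f"sent {msg!r}")
            return True
        finally:
            self.lock.release()

    def timer_tick(self, hw_irq=None):
        # Simulated driver timer: holds the lock with hw IRQs still enabled.
        with self.lock:
            self.log.append("timer: lock held")
            if hw_irq is not None:
                # An interrupt arriving on the same CPU while the lock is held.
                hw_irq()
```

Calling timer_tick() with a hw_irq callback that invokes netpoll_send()
models a console message being emitted from hardware interrupt context
while the timer already holds the lock; the send fails and the log
records the would-be deadlock.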

> There is a way to fix all of this, deferring all netpoll operations to
> software IRQ context, but you sacrifice reliability when the system is
> in such a bad state that software IRQs are not occurring any more
> or are deadlocked.

At that point, why bother? Just use syslogd. Or more likely, use a
serial cable, which will actually work reliably.

Where I disagree is this: what netpoll is trying to do is not
fundamentally unreasonable. There's nothing magical about interrupts
that should make it impossible to drive network hardware in polled
mode with interrupts disabled. 

Instead, the problem is that the network stack has evolved in a
direction that made a bunch of fairly reasonable assumptions that
netpoll has now broken.

-- 
Mathematics is the supreme nostalgia of our time.