Re: Van Jacobson net channels

Andi Kleen Wed, 01 Feb 2006 11:53:51 -0800

On Wednesday 01 February 2006 20:37, Jeff Garzik wrote:
 
> To have a fully async, zero copy network receive, POSIX read(2) is 
> inadequate.


Agreed, but POSIX aio is adequate.

> One needs a ring buffer, similar in API to the mmap'd   
> packet socket, where you can queue a whole bunch of reads.  Van's design 
> seems similar to this.

See lio_listio() et al.
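
For illustration, a minimal sketch of queueing a batch of reads in one
call with lio_listio(). The helper name batch_read and the NREADS/CHUNK
constants are made up for this example. It reads from a regular file at
fixed offsets rather than from a socket, and it blocks with LIO_WAIT
only to keep the sketch short.

```c
#include <aio.h>
#include <string.h>
#include <sys/types.h>

#define NREADS 4   /* hypothetical: number of reads queued per batch */
#define CHUNK  4   /* hypothetical: bytes requested per read */

/* Queue NREADS reads of CHUNK bytes each on fd with a single
 * lio_listio() call, wait for all of them to finish, and return the
 * total number of bytes read, or -1 if any request failed. */
static ssize_t batch_read(int fd, char bufs[NREADS][CHUNK])
{
    struct aiocb cbs[NREADS];
    struct aiocb *list[NREADS];
    ssize_t total = 0;

    memset(cbs, 0, sizeof(cbs));
    for (int i = 0; i < NREADS; i++) {
        cbs[i].aio_fildes = fd;
        cbs[i].aio_buf = bufs[i];
        cbs[i].aio_nbytes = CHUNK;
        cbs[i].aio_offset = (off_t)i * CHUNK;   /* ignored for sockets */
        cbs[i].aio_lio_opcode = LIO_READ;
        list[i] = &cbs[i];
    }

    /* LIO_WAIT: return only after every queued request has completed. */
    if (lio_listio(LIO_WAIT, list, NREADS, NULL) != 0)
        return -1;

    for (int i = 0; i < NREADS; i++) {
        if (aio_error(&cbs[i]) != 0)
            return -1;
        ssize_t n = aio_return(&cbs[i]);
        if (n < 0)
            return -1;
        total += n;
    }
    return total;
}
```

Passing LIO_NOWAIT together with a struct sigevent would instead return
immediately and deliver a notification once the whole list is done.
On older glibc versions this needs to be linked with -lrt.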

But I don't think Van's design is supposed to be exposed to user space.
It's just a better way to implement BSD sockets.

> Key point 2:
> Once the kernel gets enough info to determine which channel should 
> receive a packet, it's done.  Van pushes TCP/IP receive processing into 
> the userland app, which is quite an idea.

We have been doing this since 2.3 (Alexey's work).

The only difference in his scheme seems to be that the demultiplex
to different sockets is somehow (he doesn't explain how) pushed into
the driver. It's also unclear how this will simplify the drivers
as the slides claim.

Also I should add that the added ACK latency is a problem for a few
workloads.

> This pushes work out of the  
> kernel and into the app,

It's still in the kernel, just in process context.

> which in turn, increases the amount of work  
> that can be performed in parallel on multiple cpus/cores. 

Well, the current demultiplex already runs on all CPUs
(assuming you have enough devices to give each CPU its own affine
interrupt; in the future, with MSI-X, this can hopefully be done better).

> The overall  
> bottleneck in the kernel is reduced. 

What bottleneck exactly?

-Andi
