On Wed, Apr 13, 2016 at 05:50:17AM -0700, Eric Dumazet wrote:
> On Wed, 2016-04-13 at 14:08 +0300, Michael S. Tsirkin wrote:
> > On Wed, Apr 13, 2016 at 11:04:45AM +0200, Paolo Abeni wrote:
> > > This patch series tries to remove the need for any lock in the tun device
> > > xmit path, significantly improving the forwarding performance when multiple
> > > processes are accessing the tun device (i.e. in a nic->bridge->tun->vm
> > > scenario).
> > > 
> > > The lockless xmit is obtained by explicitly setting the NETIF_F_LLTX
> > > feature bit and removing the default qdisc.
> > > 
> > > Unlike most virtual devices, the tun driver featured a default qdisc
> > > for a long time, but it already lost that feature in Linux 4.3.
> > 
> > Thanks -  I think it's a good idea to reduce the
> > lock contention there.
> > 
> > But I think it's unfortunate that it requires
> > bypassing the qdisc completely: this means
> > that anyone trying to do traffic shaping will
> > get back the contention.
> > 
> > Can we solve the lock contention for the qdisc?
> > E.g. add a small lockless queue in front of it;
> > whoever has the qdisc lock would be responsible
> > for moving things from there to the qdisc proper.
> > 
> > Thoughts? Is there a chance this might work reasonably well?
> 
> Adding any new queue in front of the qdisc is problematic:
> - Adds a new buffer, with extra latencies.

Only where lock contention would previously occur, right?

> - If you want to implement priorities properly for X COS, you need X
> queues.

This definitely needs thought.

> - Who is going to service this extra buffer and feed the qdisc?

The way I see it - whoever holds the lock services it, at unlock time.
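
Roughly what I have in mind - a userspace sketch only, not actual kernel
code, and all the names (side_queue, drain_side_queue, xmit, backlog, ...)
are made up for illustration: senders that fail a trylock park the packet
on a lock-free list, and whoever does hold the qdisc lock sweeps that list
into the qdisc proper before unlocking.

/*
 * Userspace model of the scheme above (hypothetical names throughout,
 * this is not kernel code): the "qdisc" is just a counter behind a
 * mutex, the side queue is a lock-free stack, and whoever owns the
 * lock drains the side queue into the qdisc proper before unlocking.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct pkt {
	int id;
	struct pkt *next;
};

static pthread_mutex_t qdisc_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(struct pkt *) side_queue;	/* lock-free MPSC stack */
static long backlog;		/* stand-in for the qdisc, lock protected */

/* Contended path: park the packet without taking the qdisc lock. */
static void side_queue_push(struct pkt *p)
{
	p->next = atomic_load(&side_queue);
	while (!atomic_compare_exchange_weak(&side_queue, &p->next, p))
		;
}

/*
 * Called only with qdisc_lock held: move everything parked on the side
 * queue into the qdisc proper.  A stack reverses order; real code would
 * have to preserve FIFO order, e.g. by reversing the chain first.
 */
static void drain_side_queue(void)
{
	struct pkt *p = atomic_exchange(&side_queue, (struct pkt *)NULL);

	while (p) {
		struct pkt *next = p->next;

		backlog++;		/* stand-in for the real qdisc enqueue */
		free(p);
		p = next;
	}
}

static void xmit(struct pkt *p)
{
	if (pthread_mutex_trylock(&qdisc_lock) != 0) {
		/* Lock is busy: park the packet instead of spinning. */
		side_queue_push(p);
		return;
	}
	backlog++;			/* enqueue our own packet */
	free(p);
	drain_side_queue();		/* service whatever piled up meanwhile */
	pthread_mutex_unlock(&qdisc_lock);
}

static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		struct pkt *p = malloc(sizeof(*p));

		p->id = i;
		xmit(p);
	}
	return NULL;
}

int main(void)
{
	pthread_t th[4];

	for (int i = 0; i < 4; i++)
		pthread_create(&th[i], NULL, worker, NULL);
	for (int i = 0; i < 4; i++)
		pthread_join(th[i], NULL);

	/* The last pushes may still be parked, so do one final locked drain. */
	pthread_mutex_lock(&qdisc_lock);
	drain_side_queue();
	pthread_mutex_unlock(&qdisc_lock);
	printf("backlog %ld\n", backlog);
	return 0;
}

The obvious catch - your point above about who services the buffer - is
that whatever gets pushed after the drain but before the unlock sits there
until the next sender takes the lock.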

> - If the innocent guy is an RT thread, maybe the extra latency will hurt.

Again - would that hurt more than waiting on the lock does?

> - Adding another set of atomic ops.

That's likely true. Use some per-cpu trick instead?
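
By per-cpu trick I mean something along these lines - again hypothetical
names, and meant as a drop-in for side_queue_push()/drain_side_queue() in
the sketch above (it reuses struct pkt and backlog from there): one list
head per cpu, so the cmpxchg on enqueue only ever races with a drain of
that one slot rather than with every other sender.

#define NR_SLOTS 8	/* stand-in for the number of CPUs */

static _Atomic(struct pkt *) slot[NR_SLOTS];

/*
 * Producer: push onto this cpu's own slot; the cmpxchg only ever races
 * with a concurrent drain of the same slot, not with other senders.
 */
static void slot_push(unsigned int cpu, struct pkt *p)
{
	_Atomic(struct pkt *) *head = &slot[cpu % NR_SLOTS];

	p->next = atomic_load(head);
	while (!atomic_compare_exchange_weak(head, &p->next, p))
		;
}

/* Lock holder: sweep every slot into the qdisc proper before unlocking. */
static void drain_all_slots(void)
{
	for (unsigned int cpu = 0; cpu < NR_SLOTS; cpu++) {
		struct pkt *p = atomic_exchange(&slot[cpu], (struct pkt *)NULL);

		while (p) {
			struct pkt *next = p->next;

			backlog++;	/* stand-in for the real qdisc enqueue */
			free(p);
			p = next;
		}
	}
}

It does not get rid of the atomics, it only makes them mostly uncontended,
so it may still not be cheap enough.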

> We have such a scheme here at Google (called holdq), but it was a
> nightmare to tune.
> 
> We never tried to upstream this beast, it is kind of ugly, and we were
> expecting something better. Problem is: if you use HTB on a bonding
> device, you still want to properly use MQ on the slaves.
> 
> HTB qdisc, 20 netperf instances generating UDP packets:
> lpaa23:~# ./super_netperf 20 -H lpaa24 -t UDP_STREAM -l 3000 -- -m 100 &
> [1] 181993
> 
> 
> With the holdq feature turned on: about 1 Mpps
> 
> lpaa23:~# sar -n DEV 1 10|grep eth0|grep Average
> Average:         eth0     28.50 999071.60      3.07 138542.64      0.00      0.00      0.60
> 
> holdq turned off: about 620 Kpps
> 
> lpaa23:~# sar -n DEV 1 10|grep eth0|grep Average
> Average:         eth0     39.00 617765.40      4.73  85667.42      0.00      0.00      0.90
> 
