2016-04-20 15:34 GMT-07:00 Eric Dumazet <eric.duma...@gmail.com>:
> On Wed, 2016-04-20 at 14:24 -0700, Michael Ma wrote:
>> 2016-04-08 7:19 GMT-07:00 Eric Dumazet <eric.duma...@gmail.com>:
>> > On Thu, 2016-03-31 at 16:48 -0700, Michael Ma wrote:
>> >> I didn't really know that multiple qdiscs can be isolated using MQ so
>> >> that each txq can be associated with a particular qdisc. Also we don't
>> >> really have multiple interfaces...
>> >>
>> >> With this MQ solution we'll still need to assign transmit queues to
>> >> different classes by doing some math on the bandwidth limit if I
>> >> understand correctly, which seems to be less convenient compared with
>> >> a solution purely within HTB.
>> >>
>> >> I assume that with this solution I can still share qdisc among
>> >> multiple transmit queues - please let me know if this is not the case.
>> >
>> > Note that this MQ + HTB thing works well, unless you use a bonding
>> > device. (Or you need the MQ+HTB on the slaves, with no way of sharing
>> > tokens between the slaves)
>>
>> Actually MQ+HTB works well for small packets - like flow of 512 byte
>> packets can be throttled by HTB using one txq without being affected
>> by other flows with small packets. However I found using this solution
>> large packets (10k for example) will only achieve very limited
>> bandwidth. In my test I used MQ to assign one txq to a HTB which sets
>> rate at 1Gbit/s, 512 byte packets can achieve the ceiling rate by
>> using 30 threads. But sending 10k packets using 10 threads has only 10
>> Mbit/s with the same TC configuration. If I increase burst and cburst
>> of HTB to some extreme large value (like 50MB) the ceiling rate can be
>> hit.
>>
>> The strange thing is that I don't see this problem when using HTB as
>> the root. So txq number seems to be a factor here - however it's
>> really hard to understand why would it only affect larger packets. Is
>> this a known issue? Any suggestion on how to investigate the issue
>> further?
>> Profiling shows that the cpu utilization is pretty low.
>
> You could try
>
> perf record -a -g -e skb:kfree_skb sleep 5
> perf report
>
> So that you see where the packets are dropped.
>
> Chances are that your UDP sockets SO_SNDBUF is too big, and packets are
> dropped at qdisc enqueue time, instead of having backpressure.
>
Thanks for the hint - how should I read the perf report? Also we're
using TCP socket in this testing - TCP window size is set to 70kB.

-  35.88%  init     [kernel.kallsyms]  [k] intel_idle
     intel_idle
-  15.83%  strings  libc-2.5.so        [.] __GI___connect_internal
   - __GI___connect_internal
      - 50.00% get_mapping
           __nscd_get_map_ref
        50.00% __nscd_open_socket
-  13.19%  strings  libc-2.5.so        [.] __GI___libc_recvmsg
   - __GI___libc_recvmsg
      + 64.52% getifaddrs
      + 35.48% __check_pf
-  10.55%  strings  libc-2.5.so        [.] __sendto_nocancel
   - __sendto_nocancel
        100.00% 0
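
For anyone trying to reproduce the setup quoted above, here is a rough sketch of how one txq might be given its own HTB under MQ. The exact TC configuration was not posted in this thread, so the interface name (eth0), the handle numbers (1:, 10:) and the choice of mq class 1:1 are assumptions for illustration only. The 1gbit rate and the 50mb burst/cburst are the values mentioned in the quoted text. The script only prints the tc commands and does not change any configuration.

```shell
#!/bin/sh
# Sketch only: prints tc commands instead of running them.
# The device name, handles and mq class minor are hypothetical placeholders,
# not the configuration actually used in the thread.

mq_htb_cmds() {
    dev=$1      # network interface (assumed name, e.g. eth0)
    minor=$2    # mq class minor (hex); mq numbers its per-txq classes from 1
    rate=$3     # used for both HTB rate and ceil, e.g. 1gbit
    burst=$4    # used for both HTB burst and cburst, e.g. 50mb

    # Root mq qdisc: exposes one class per hardware transmit queue.
    echo "tc qdisc add dev $dev root handle 1: mq"
    # Attach an HTB qdisc to the chosen transmit queue only.
    echo "tc qdisc add dev $dev parent 1:$minor handle 10: htb default 1"
    # One HTB class carrying the rate limit and the (large) burst values.
    echo "tc class add dev $dev parent 10: classid 10:1 htb rate $rate ceil $rate burst $burst cburst $burst"
}

mq_htb_cmds eth0 1 1gbit 50mb
```

To apply the result, the printed commands would have to be run as root on a multiqueue device. Leaving out the burst and cburst arguments in the class command would fall back to the defaults tc computes, which is presumably the case where the 10k packets were limited.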