On Wed, 16 Nov 2016 16:34:09 -0800 Eric Dumazet <eric.duma...@gmail.com> wrote:
> On Wed, 2016-11-16 at 23:40 +0100, Jesper Dangaard Brouer wrote:
>
> > Using -R 1 does not seem to help remove __ip_select_ident()
> >
> > Samples: 56K of event 'cycles', Event count (approx.): 78628132661
> >   Overhead  Command  Shared Object     Symbol
> > +    9.11%  netperf  [kernel.vmlinux]  [k] __ip_select_ident
> > +    6.98%  netperf  [kernel.vmlinux]  [k] _raw_spin_lock
> > +    6.21%  swapper  [mlx5_core]       [k] mlx5e_poll_tx_cq
> > +    5.03%  netperf  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
> > +    4.69%  netperf  [kernel.vmlinux]  [k] __ip_make_skb
> > +    4.63%  netperf  [kernel.vmlinux]  [k] skb_set_owner_w
> > +    4.15%  swapper  [kernel.vmlinux]  [k] __slab_free
> > +    3.80%  netperf  [mlx5_core]       [k] mlx5e_sq_xmit
> > +    2.00%  swapper  [kernel.vmlinux]  [k] sock_wfree
> > +    1.94%  netperf  netperf           [.] send_data
> > +    1.92%  netperf  netperf           [.] send_omni_inner
>
> Check "ss -nu" ?
>
> You will see if sockets are connected (present in ss output or not)

Tested different versions of netperf (the commands used are below the
signature):

 netperf-2.6.0: connected "broken"
 netperf-2.7.0: connected works
 SVN-r709     : connected works

I noticed there is a Send-Q, and number two in perf top is
_raw_spin_lock, which looks like it comes from __dev_queue_xmit(), but
we know from experience that this stall is actually caused by writing
the tailptr/doorbell in the HW.  Thus, this could benefit a lot from
bulk/xmit_more into the qdisc layer.

> UDP being connected does not prevent __ip_select_ident() being used.
>
>     if ((iph->frag_off & htons(IP_DF)) && !skb->ignore_df) {
>
> So you need IP_DF being set, and skb->ignore_df being 0

Thanks for explaining that! :-)

 http://lxr.free-electrons.com/source/include/net/ip.h?v=4.8#L332
 http://lxr.free-electrons.com/source/net/ipv4/ip_output.c?v=4.8#L449

Netperf UDP_STREAM by default sends 64K packets that get fragmented,
which is very unfortunate because people end up testing a code path in
the kernel they didn't expect.  That is why I use the option
"-- -m 1472".
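For reference, 1472 is a 1500-byte Ethernet MTU minus the 20-byte IPv4
header (no options) and the 8-byte UDP header.  A minimal sketch of that
arithmetic, and of how many IPv4 fragments a larger datagram turns into.
The helper names and the 65507-byte example size are illustrative
assumptions, not netperf or kernel code:

```c
#include <assert.h>
#include <stddef.h>

#define IPV4_HDR_LEN 20u   /* IPv4 header without options */
#define UDP_HDR_LEN   8u

/* Largest UDP payload that fits in one unfragmented IPv4 packet. */
size_t udp_max_payload(size_t mtu)
{
    assert(mtu > IPV4_HDR_LEN + UDP_HDR_LEN);
    return mtu - IPV4_HDR_LEN - UDP_HDR_LEN;
}

/* Number of IPv4 fragments needed for a UDP payload of the given size.
 * Every fragment except the last carries a multiple of 8 bytes. */
size_t ipv4_fragment_count(size_t udp_payload, size_t mtu)
{
    assert(mtu > IPV4_HDR_LEN + UDP_HDR_LEN);
    size_t l4_len   = udp_payload + UDP_HDR_LEN;       /* UDP header + data */
    size_t per_frag = ((mtu - IPV4_HDR_LEN) / 8) * 8;  /* data per fragment */
    return (l4_len + per_frag - 1) / per_frag;         /* round up */
}
```

With a 1500-byte MTU, udp_max_payload() gives 1472, a 1472-byte payload
goes out as a single packet, and a 65507-byte payload is split into 45
fragments.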
> time to try IP_MTU_DISCOVER ;)

To Rick, maybe you can find a good solution or option with Eric's hint,
to send appropriate sized UDP packets with Don't Fragment (DF).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

Testing with ss -nua

$ /usr/local/stow/netperf-2.6.0-demo/bin/netperf -H 198.18.50.1 -t UDP_STREAM -l 3 -- -m 1472 -n -N > /dev/null & sleep 1; ss -una
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
UNCONN  0       11520   *:54589              *:*

$ /usr/local/stow/netperf-2.7.0-demo/bin/netperf -H 198.18.50.1 -t UDP_STREAM -l 3 -- -m 1472 -n -N > /dev/null & sleep 1; ss -una
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
ESTAB   0       18432   198.18.50.3:46803    198.18.50.1:51851

$ ~/tools/netperf2-svn/src/netperf -H 198.18.50.1 -t UDP_STREAM -l 3 -- -m 1472 -n -N > /dev/null & sleep 1; ss -una
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
ESTAB   0       43776   198.18.50.3:42965    198.18.50.1:51948
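
For anyone wanting to try Eric's IP_MTU_DISCOVER hint from a userspace
sender, a minimal sketch follows.  It assumes Linux and glibc headers;
open_udp_df_socket() is a hypothetical helper name, not something
netperf provides.  Setting IP_PMTUDISC_DO makes the kernel always set DF
on the socket's packets, which, going by the condition Eric quoted,
should keep the send path out of __ip_select_ident():

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a connected UDP socket that always sets
 * DF on outgoing packets, or -1 on error. */
int open_udp_df_socket(const char *dst_ip, uint16_t dst_port)
{
    struct sockaddr_in dst;
    int val = IP_PMTUDISC_DO;   /* always set DF, never fragment locally */
    int fd;

    assert(dst_ip != NULL);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val)) < 0)
        goto fail;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(dst_port);
    if (inet_pton(AF_INET, dst_ip, &dst.sin_addr) != 1)
        goto fail;

    /* connect() so the socket shows up as ESTAB in "ss -una" */
    if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0)
        goto fail;

    return fd;

fail:
    close(fd);
    return -1;
}
```

With this option set, a send() larger than the path MTU fails with
EMSGSIZE instead of being fragmented, so the sender has to keep the
payload at or below 1472 bytes on a 1500-byte MTU path.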