On Mon, 26 Mar 2018 14:58:02 -0700 William Tu <u9012...@gmail.com> wrote:
> > Again high count for NMI ?!?
> >
> > Maybe you just forgot to tell perf that you want it to decode the
> > bpf_prog correctly?
> >
> > https://prototype-kernel.readthedocs.io/en/latest/bpf/troubleshooting.html#perf-tool-symbols
> >
> > Enable via:
> >   $ sysctl net/core/bpf_jit_kallsyms=1
> >
> > And use perf report (while BPF is STILL LOADED):
> >
> >   $ perf report --kallsyms=/proc/kallsyms
> >
> > E.g. for emailing this you can use this command:
> >
> >   $ perf report --sort cpu,comm,dso,symbol --kallsyms=/proc/kallsyms \
> >       --no-children --stdio -g none | head -n 40
> >
>
> Thanks, I followed the steps, the result of l2fwd
>
> # Total Lost Samples: 119
> #
> # Samples: 2K of event 'cycles:ppp'
> # Event count (approx.): 25675705627
> #
> # Overhead  CPU  Command  Shared Object     Symbol
> # ........  ...  .......  ................  ..................................
> #
>   10.48%  013  xdpsock  xdpsock           [.] main
>    9.77%  013  xdpsock  [kernel.vmlinux]  [k] clflush_cache_range
>    8.45%  013  xdpsock  [kernel.vmlinux]  [k] nmi
>    8.07%  013  xdpsock  [kernel.vmlinux]  [k] xsk_sendmsg
>    7.81%  013  xdpsock  [kernel.vmlinux]  [k] __domain_mapping
>    4.95%  013  xdpsock  [kernel.vmlinux]  [k] ixgbe_xmit_frame_ring
>    4.66%  013  xdpsock  [kernel.vmlinux]  [k] skb_store_bits
>    4.39%  013  xdpsock  [kernel.vmlinux]  [k] syscall_return_via_sysret
>    3.93%  013  xdpsock  [kernel.vmlinux]  [k] pfn_to_dma_pte
>    2.62%  013  xdpsock  [kernel.vmlinux]  [k] __intel_map_single
>    2.53%  013  xdpsock  [kernel.vmlinux]  [k] __alloc_skb
>    2.36%  013  xdpsock  [kernel.vmlinux]  [k] iommu_no_mapping
>    2.21%  013  xdpsock  [kernel.vmlinux]  [k] alloc_skb_with_frags
>    2.07%  013  xdpsock  [kernel.vmlinux]  [k] skb_set_owner_w
>    1.98%  013  xdpsock  [kernel.vmlinux]  [k] __kmalloc_node_track_caller
>    1.94%  013  xdpsock  [kernel.vmlinux]  [k] ksize
>    1.84%  013  xdpsock  [kernel.vmlinux]  [k] validate_xmit_skb_list
>    1.62%  013  xdpsock  [kernel.vmlinux]  [k] kmem_cache_alloc_node
>    1.48%  013  xdpsock  [kernel.vmlinux]  [k] __kmalloc_reserve.isra.37
>    1.21%  013  xdpsock  xdpsock           [.] xq_enq
>    1.08%  013  xdpsock  [kernel.vmlinux]  [k] intel_alloc_iova

You did use net/core/bpf_jit_kallsyms=1 and the correct perf commands for
decoding bpf_prog symbols, so the 'nmi' entry at #3 in the perf report is
likely a real NMI call... which looks wrong.

> And l2fwd under "perf stat" looks OK to me. There is little context
> switches, cpu is fully utilized, 1.17 insn per cycle seems ok.
>
>  Performance counter stats for 'CPU(s) 6':
>
>     10000.787420      cpu-clock (msec)          #    1.000 CPUs utilized
>               24      context-switches          #    0.002 K/sec
>                0      cpu-migrations            #    0.000 K/sec
>                0      page-faults               #    0.000 K/sec
>   22,361,333,647      cycles                    #    2.236 GHz
>   13,458,442,838      stalled-cycles-frontend   #   60.19% frontend cycles idle
>   26,251,003,067      instructions              #    1.17  insn per cycle
>                                                 #    0.51  stalled cycles per insn
>    4,938,921,868      branches                  #  493.853 M/sec
>        7,591,739      branch-misses             #    0.15% of all branches
>
>     10.000835769 seconds time elapsed

This perf stat output also indicates something is wrong. The 1.17 insn per
cycle is NOT okay; it is too low (compared to what I usually see, e.g. 2.36
insn per cycle). It clearly shows 'stalled-cycles-frontend' at '60.19%
frontend cycles idle' (13,458,442,838 stalled cycles out of 22,361,333,647
cycles total), i.e. for roughly 6 out of every 10 cycles the frontend is not
delivering instructions to the backend. This means your CPU has a bottleneck
fetching instructions. Explained by Andi Kleen here [1].

[1] https://github.com/andikleen/pmu-tools/wiki/toplev-manual
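If you want to break those frontend stalls down further, Andi's toplev tool
from the pmu-tools repo in [1] can attribute the cycles to the TopDown
categories (Frontend Bound / Bad Speculation / Backend Bound / Retiring).
Roughly along these lines (a rough, untested sketch: the exact toplev options
and the xdpsock command placeholder are mine, check the toplev manual):

  $ git clone https://github.com/andikleen/pmu-tools
  $ cd pmu-tools
  $ ./toplev.py -l2 <your xdpsock command>

Level 2 (-l2) should already indicate whether the frontend problem is fetch
latency or fetch bandwidth.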
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer