On 08.11.2018 at 01:59, Paweł Staszewski wrote:


On 05.11.2018 at 21:17, Jesper Dangaard Brouer wrote:
On Sun, 4 Nov 2018 01:24:03 +0100 Paweł Staszewski <pstaszew...@itcare.pl> wrote:

And today again, after applying the patch for the page allocator, I
reached 64/64 Gbit/s again

with only 50-60% cpu load
Great.

today no slowpath hit for networking :)

But again dropped packets at 64 Gbit RX and 64 Gbit TX ....
And as it should not be a PCI Express limit - I think something more is
Well, this does sound like a PCIe bandwidth limit to me.

See the PCIe BW here: https://en.wikipedia.org/wiki/PCI_Express

You likely have PCIe v3, where 1 lane has 984.6 MBytes/s or 7.87 Gbit/s.
Thus, x16 lanes have 15.75 GBytes/s or 126 Gbit/s.  It does say "in each
direction", but you are also forwarding this RX->TX across both ports of
a dual-port NIC that shares the same PCIe slot.
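The per-lane and x16 figures quoted above can be reproduced with a few lines of Python. This is only a sketch of the raw link arithmetic, assuming the PCIe 3.0 signalling rate of 8 GT/s per lane and 128b/130b encoding; the function names are made up for illustration, and TLP headers, descriptor fetches and completion traffic are not modelled, so the real usable rate is lower.

```python
# PCIe 3.0 raw link bandwidth per direction (protocol overhead ignored).
GT_PER_S = 8.0          # PCIe 3.0 signalling rate per lane, in GT/s
ENCODING = 128 / 130    # 128b/130b line-encoding efficiency


def lane_gbit_per_s() -> float:
    """Raw bit rate of one PCIe 3.0 lane, in Gbit/s."""
    return GT_PER_S * ENCODING


def link_gbit_per_s(lanes: int) -> float:
    """Raw bit rate of a PCIe 3.0 link with the given number of lanes."""
    return lanes * lane_gbit_per_s()


if __name__ == "__main__":
    lane = lane_gbit_per_s()
    x16 = link_gbit_per_s(16)
    print(f"x1 : {lane:.2f} Gbit/s = {lane / 8 * 1000:.1f} MBytes/s")
    print(f"x16: {x16:.1f} Gbit/s = {x16 / 8:.2f} GBytes/s")
    # Forwarding rate reported in this thread: 64 Gbit/s RX and 64 Gbit/s TX.
    observed = 64.0
    print(f"{observed:.0f} Gbit/s is {observed / x16:.0%} of the raw x16 rate")
```

The comparison in the last line is against the raw rate in one direction only; how much of the remaining headroom is consumed by per-packet PCIe overhead depends on packet size and the NIC, which this sketch does not estimate.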
Network controller changed from a 2-port 100G ConnectX-4 to 2 separate 100G ConnectX-5 cards


   PerfTop:   92239 irqs/sec  kernel:99.4%  exact:  0.0% [4000Hz cycles],  (all, 56 CPUs)
------------------------------------------------------------------------------

     6.65%  [kernel]       [k] irq_entries_start
     5.57%  [kernel]       [k] tasklet_action_common.isra.21
     4.60%  [kernel]       [k] mlx5_eq_int
     4.04%  [kernel]       [k] mlx5e_skb_from_cqe_mpwrq_linear
     3.66%  [kernel]       [k] _raw_spin_lock_irqsave
     3.58%  [kernel]       [k] mlx5e_sq_xmit
     2.66%  [kernel]       [k] fib_table_lookup
     2.52%  [kernel]       [k] _raw_spin_lock
     2.51%  [kernel]       [k] build_skb
     2.50%  [kernel]       [k] _raw_spin_lock_irq
     2.04%  [kernel]       [k] try_to_wake_up
     1.83%  [kernel]       [k] queued_spin_lock_slowpath
     1.81%  [kernel]       [k] mlx5e_poll_tx_cq
     1.65%  [kernel]       [k] do_idle
     1.50%  [kernel]       [k] mlx5e_poll_rx_cq
     1.34%  [kernel]       [k] __sched_text_start
     1.32%  [kernel]       [k] cmd_exec
     1.30%  [kernel]       [k] cmd_work_handler
     1.16%  [kernel]       [k] vlan_do_receive
     1.15%  [kernel]       [k] memcpy_erms
     1.15%  [kernel]       [k] __dev_queue_xmit
     1.07%  [kernel]       [k] mlx5_cmd_comp_handler
     1.06%  [kernel]       [k] sched_ttwu_pending
     1.00%  [kernel]       [k] ipt_do_table
     0.98%  [kernel]       [k] ip_finish_output2
     0.92%  [kernel]       [k] pfifo_fast_dequeue
     0.88%  [kernel]       [k] mlx5e_handle_rx_cqe_mpwrq
     0.78%  [kernel]       [k] dev_gro_receive
     0.78%  [kernel]       [k] mlx5e_napi_poll
     0.76%  [kernel]       [k] mlx5e_post_rx_mpwqes
     0.70%  [kernel]       [k] process_one_work
     0.67%  [kernel]       [k] __netif_receive_skb_core
     0.65%  [kernel]       [k] __build_skb
     0.63%  [kernel]       [k] llist_add_batch
     0.62%  [kernel]       [k] tcp_gro_receive
     0.60%  [kernel]       [k] inet_gro_receive
     0.59%  [kernel]       [k] ip_route_input_rcu
     0.59%  [kernel]       [k] rcu_irq_exit
     0.56%  [kernel]       [k] napi_complete_done
     0.52%  [kernel]       [k] kmem_cache_alloc
     0.48%  [kernel]       [k] __softirqentry_text_start
     0.48%  [kernel]       [k] mlx5e_xmit
     0.47%  [kernel]       [k] __queue_work
     0.46%  [kernel]       [k] memset_erms
     0.46%  [kernel]       [k] dev_hard_start_xmit
     0.45%  [kernel]       [k] insert_work
     0.45%  [kernel]       [k] enqueue_task_fair
     0.44%  [kernel]       [k] __wake_up_common
     0.43%  [kernel]       [k] finish_task_switch
     0.43%  [kernel]       [k] kmem_cache_free_bulk
     0.42%  [kernel]       [k] ip_forward
     0.42%  [kernel]       [k] worker_thread
     0.41%  [kernel]       [k] schedule
     0.41%  [kernel]       [k] _raw_spin_unlock_irqrestore
     0.40%  [kernel]       [k] netif_skb_features
     0.40%  [kernel]       [k] queue_work_on
     0.40%  [kernel]       [k] pfifo_fast_enqueue
     0.39%  [kernel]       [k] vlan_dev_hard_start_xmit
     0.39%  [kernel]       [k] page_frag_free
     0.36%  [kernel]       [k] swiotlb_map_page
     0.36%  [kernel]       [k] update_cfs_rq_h_load
     0.35%  [kernel]       [k] validate_xmit_skb.isra.142
     0.35%  [kernel]       [k] dev_ifconf
     0.35%  [kernel]       [k] check_preempt_curr
     0.34%  [kernel]       [k] _raw_spin_trylock
     0.34%  [kernel]       [k] rcu_idle_exit
     0.33%  [kernel]       [k] ip_rcv_core.isra.20.constprop.25
     0.33%  [kernel]       [k] __qdisc_run
     0.33%  [kernel]       [k] skb_release_data
     0.32%  [kernel]       [k] native_sched_clock
     0.30%  [kernel]       [k] add_interrupt_randomness
     0.29%  [kernel]       [k] interrupt_entry
     0.28%  [kernel]       [k] skb_gro_receive
     0.26%  [kernel]       [k] read_tsc
     0.26%  [kernel]       [k] __get_xps_queue_idx
     0.26%  [kernel]       [k] inet_gifconf
     0.26%  [kernel]       [k] skb_segment
     0.25%  [kernel]       [k] __tasklet_schedule_common
     0.25%  [kernel]       [k] smpboot_thread_fn
     0.23%  [kernel]       [k] __update_load_avg_se
     0.22%  [kernel]       [k] tcp4_gro_receive


Not much traffic now:
  bwm-ng v0.6.1 (probing every 0.500s), press 'h' for help
  input: /proc/net/dev type: rate
  |         iface                   Rx                   Tx                Total
  ==============================================================================
           enp175s0:           6.95 Gb/s            4.20 Gb/s           11.15 Gb/s
           enp216s0:           4.23 Gb/s            6.98 Gb/s           11.21 Gb/s
  ------------------------------------------------------------------------------
              total:          11.18 Gb/s           11.18 Gb/s           22.37 Gb/s

  bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
  input: /proc/net/dev type: rate
  |         iface                   Rx                   Tx                Total
  ==============================================================================
           enp175s0:       700264.50 P/s        923890.25 P/s       1624154.75 P/s
           enp216s0:       932598.81 P/s        708771.50 P/s       1641370.25 P/s
  ------------------------------------------------------------------------------
              total:      1632863.38 P/s       1632661.75 P/s       3265525.00 P/s
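As a rough cross-check of the two bwm-ng samples above, the mean packet size can be estimated from the bit rate and the packet rate. This is only a sketch: it assumes "Gb/s" means 10^9 bits per second, and the two samples were taken at different moments (0.5 s and 1.0 s probes), so the result is approximate. The helper name `avg_packet_bytes` is made up for illustration.

```python
def avg_packet_bytes(gbit_per_s: float, packets_per_s: float) -> float:
    """Mean bytes per packet implied by a bit rate and a packet rate."""
    return gbit_per_s * 1e9 / 8 / packets_per_s


if __name__ == "__main__":
    # Total Rx values taken from the two bwm-ng samples above.
    size = avg_packet_bytes(11.18, 1632863.38)
    print(f"approx. {size:.0f} bytes per packet")
```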




Also, is it normal that some kworker processes take 10%+ of CPU?
top output below:

 2913 root      20   0       0      0      0 I  10.3  0.0   6:58.29 kworker/u112:1-
    7 root      20   0       0      0      0 I   8.6  0.0   6:17.18 kworker/u112:0-
10289 root      20   0       0      0      0 I   6.6  0.0   6:33.90 kworker/u112:4-
 2939 root      20   0       0      0      0 R   3.6  0.0   7:37.68 kworker/u112:2-
 4557 root      20   0       0      0      0 I   1.3  0.0   0:08.82 kworker/45:4-ev
 6775 root      20   0       0      0      0 I   1.3  0.0   0:26.30 kworker/50:4-ev
 6833 root      20   0       0      0      0 D   1.3  0.0   0:04.96 kworker/15:0+ev
 6840 root      20   0       0      0      0 I   1.3  0.0   0:09.32 kworker/55:2-ev
 6874 root      20   0       0      0      0 D   1.3  0.0   0:08.51 kworker/53:0+ev
 7710 root      20   0       0      0      0 I   1.3  0.0   0:07.78 kworker/14:1-ev
12075 root      20   0       0      0      0 I   1.3  0.0   1:19.22 kworker/23:3-ev
31209 root      20   0       0      0      0 I   1.3  0.0   0:07.02 kworker/20:1-ev
32351 root      20   0       0      0      0 R   1.3  0.0   0:06.99 kworker/51:2+ev
39869 root      20   0       0      0      0 D   1.3  0.0   0:06.15 kworker/42:0+ev
39959 root      20   0       0      0      0 I   1.3  0.0   0:16.23 kworker/51:1-ev
42858 root      20   0       0      0      0 I   1.3  0.0   0:47.72 kworker/27:2-ev
43281 root      20   0       0      0      0 I   1.3  0.0   0:14.99 kworker/14:4-ev
43282 root      20   0       0      0      0 I   1.3  0.0   0:13.38 kworker/16:1-ev
43389 root      20   0       0      0      0 D   1.3  0.0   0:08.92 kworker/54:2+ev
45214 root      20   0       0      0      0 I   1.3  0.0   0:05.82 kworker/55:0-ev
46894 root      20   0       0      0      0 I   1.3  0.0   0:04.11 kworker/46:1-ev
47027 root      20   0       0      0      0 D   1.3  0.0   0:03.79 kworker/47:1+ev
47129 root      20   0       0      0      0 D   1.3  0.0   0:03.15 kworker/52:0+ev
47133 root      20   0       0      0      0 I   1.3  0.0   0:03.19 kworker/49:1-ev
47179 root      20   0       0      0      0 I   1.3  0.0   0:02.83 kworker/17:3-ev
48062 root      20   0       0      0      0 I   1.3  0.0   0:02.54 kworker/44:1-ev
48158 root      20   0       0      0      0 I   1.3  0.0   0:02.17 kworker/16:2-ev
48168 root      20   0       0      0      0 I   1.3  0.0   0:02.13 kworker/27:3-ev
48247 root      20   0       0      0      0 I   1.3  0.0   0:01.83 kworker/22:0-ev
48337 root      20   0       0      0      0 I   1.3  0.0   0:01.57 kworker/15:1-ev
48345 root      20   0       0      0      0 I   1.3  0.0   0:01.49 kworker/24:3-ev
49302 root      20   0       0      0      0 I   1.3  0.0   0:00.71 kworker/54:1-ev
49366 root      20   0       0      0      0 I   1.3  0.0   0:00.38 kworker/20:3-ev
49400 root      20   0       0      0      0 I   1.3  0.0   0:00.31 kworker/26:2-ev
49430 root      20   0       0      0      0 I   1.3  0.0   0:00.21 kworker/42:2-ev
49463 root      20   0       0      0      0 D   1.3  0.0   0:00.08 kworker/50:2+ev
51698 root      20   0       0      0      0 D   1.3  0.0   0:14.85 kworker/46:2+ev
54238 root      20   0       0      0      0 I   1.3  0.0   0:23.73 kworker/52:1-ev
 2507 root      20   0       0      0      0 I   1.0  0.0   0:09.60 kworker/44:2-ev
 4525 root      20   0       0      0      0 I   1.0  0.0   0:08.07 kworker/26:1-ev
 4556 root      20   0       0      0      0 I   1.0  0.0   0:05.15 kworker/48:0-ev
 4604 root      20   0       0      0      0 I   1.0  0.0   0:10.90 kworker/19:0-ev
 5789 root      20   0       0      0      0 I   1.0  0.0   0:08.24 kworker/18:0-ev
 6868 root      20   0       0      0      0 I   1.0  0.0   0:09.68 kworker/47:0-ev
 6900 root      20   0       0      0      0 I   1.0  0.0   0:28.83 kworker/18:1-ev
 7764 root      20   0       0      0      0 I   1.0  0.0   0:03.00 kworker/49:2-ev
12045 root      20   0       0      0      0 I   1.0  0.0   1:16.98 kworker/24:2-ev
32218 root      20   0       0      0      0 I   1.0  0.0   0:04.13 kworker/45:2-ev
34082 root      20   0       0      0      0 I   1.0  0.0   0:06.29 kworker/17:1-ev
39791 root      20   0       0      0      0 I   1.0  0.0   0:19.51 kworker/21:4-ev
39973 root      20   0       0      0      0 I   1.0  0.0   0:17.12 kworker/53:2-ev
43223 root      20   0       0      0      0 I   1.0  0.0   0:07.88 kworker/25:0-ev
43295 root      20   0       0      0      0 I   1.0  0.0   0:10.89 kworker/22:4-ev
46055 root      20   0       0      0      0 I   1.0  0.0   0:04.00 kworker/21:2-ev
46077 root      20   0       0      0      0 I   1.0  0.0   0:04.62 kworker/19:1-ev
47204 root      20   0       0      0      0 I   1.0  0.0   0:03.03 kworker/25:2-ev
47989 root      20   0       0      0      0 I   1.0  0.0   0:02.65 kworker/43:1-ev
49127 root      20   0       0      0      0 I   1.0  0.0   0:01.10 kworker/48:2-ev
49317 root      20   0       0      0      0 I   1.0  0.0   0:00.56 kworker/23:1-ev
54191 root      20   0       0      0      0 R   1.0  0.0   0:30.27 kworker/43:2+ev
   81 root      20   0       0      0      0 S   0.7  0.0   0:50.27 ksoftirqd/14
   87 root      20   0       0      0      0 S   0.7  0.0   1:02.92 ksoftirqd/15
  102 root      20   0       0      0      0 S   0.7  0.0   0:29.78 ksoftirqd/18
  117 root      20   0       0      0      0 S   0.7  0.0   0:30.73 ksoftirqd/21
  127 root      20   0       0      0      0 S   0.7  0.0   0:24.45 ksoftirqd/23
  137 root      20   0       0      0      0 S   0.7  0.0   0:24.94 ksoftirqd/25
  142 root      20   0       0      0      0 S   0.7  0.0   0:21.74 ksoftirqd/26
  222 root      20   0       0      0      0 S   0.7  0.0   0:27.83 ksoftirqd/42
  227 root      20   0       0      0      0 S   0.7  0.0   0:25.35 ksoftirqd/43
  242 root      20   0       0      0      0 S   0.7  0.0   0:21.40 ksoftirqd/46
  267 root      20   0       0      0      0 S   0.7  0.0   0:08.62 ksoftirqd/51
 5174 root      20   0       0      0      0 I   0.7  0.0   5:57.10 kworker/u112:3-





going on there - and it is hard to catch - because perf top has not changed,
besides there being no queued slowpath hit now

I have now also ordered Intel cards to compare - but 3 weeks ETA
Faster - in 3 days - I will have Mellanox ConnectX-5 - so I can
separate traffic to two different x16 PCIe buses
I do think you need to separate traffic to two different x16 PCIe
slots.  I have found that the ConnectX-5 has significantly better
packet-per-sec performance than ConnectX-4, but that is not your
use-case (max BW). I've not tested these NICs for maximum
_bidirectional_ bandwidth limits, I've only made sure I can do 100G
unidirectional, which can hit some funny motherboard memory limits
(remember to equip motherboard with 4 RAM blocks for full memory BW).

Yes memory channels are separated and there are 4 modules per cpu :)


