------- Comment From danijel.so...@de.ibm.com 2020-03-20 06:01 EDT-------
Hi,
first let me get to your questions regarding the kernel. As you already guessed, this was an upstream (Fedora) kernel and the latest version at the time we pursued the issue. That is also why we did not use the stable 5.4. However, the problem already appeared with older kernel versions, since the commit named above was introduced with 4.18.

Now to the problem. Before the code change stated above, striding RQ was used with ConnectX-4 devices. With that commit, the default RQ for ConnectX-4 devices became "legacy RQ", as the commit message itself indicates: "this implies that ConnectX-4 LX now uses legacy RQ by default". Our performance tests on z14 and z15 showed that it is beneficial to use striding RQ with ConnectX-4 (RoCE Express 2(.1)). That is why we ask to switch the default for ConnectX-4 back to striding RQ when running on a z14/z15. This can be done with the following command:

  ethtool --set-priv-flags DEVNAME rx_striding_rq on

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1868113

Title:
  [Ubuntu 20.04] Striding RQ as Default for ConnectX-4

Status in Ubuntu on IBM z Systems:
  Incomplete
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Hello,

  Within our network performance runs in the RoCE Express 2(.1) area, we noticed a performance regression with streaming workloads, which could be mitigated by using an ethtool setting.

  The commit which switched the default value from "Striding RQ" to "Legacy RQ" for ConnectX-4 devices (RoCE Express 2(.1)) is attached here:

  commit 5ffd81943d7a57423f204cd5844bf430b5634472 (refs/bisect/bad)
  Author: Tariq Toukan <tar...@mellanox.com>
  Date:   Tue Feb 20 15:17:54 2018 +0200

      net/mlx5e: RX, Always prefer Linear SKB configuration

      Prefer the linear SKB configuration of Legacy RQ over the non-linear one of Striding RQ.
      This implies that ConnectX-4 LX now uses legacy RQ by default, as it does not support the linear configuration of Striding RQ.

      Signed-off-by: Tariq Toukan <tar...@mellanox.com>
      Signed-off-by: Saeed Mahameed <sae...@mellanox.com>

  diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
  index 2c634e50d051..333d4ed52b94 100644
  --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
  +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
  @@ -4405,9 +4405,16 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
          MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_CQE_COMPRESS, params->rx_cqe_compress_def);
          /* RQ */
  -       if (mlx5e_striding_rq_possible(mdev, params))
  -               MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_STRIDING_RQ,
  -                               !slow_pci_heuristic(mdev));
  +       /* Prefer Striding RQ, unless any of the following holds:
  +        * - Striding RQ configuration is not possible/supported.
  +        * - Slow PCI heuristic.
  +        * - Legacy RQ would use linear SKB while Striding RQ would use non-linear.
  +        */
  +       if (!slow_pci_heuristic(mdev) &&
  +           mlx5e_striding_rq_possible(mdev, params) &&
  +           (mlx5e_rx_mpwqe_is_linear_skb(mdev, params) ||
  +            !mlx5e_rx_is_linear_skb(mdev, params)))
  +               MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_STRIDING_RQ, true);
          mlx5e_set_rq_type(mdev, params);
          mlx5e_init_rq_type_params(mdev, params);

  We modified the upstream kernel so that we could run measurements and compare Legacy RQ with Striding RQ. Here is an example:

  Kernel used: 5.4.0-rc7

  The measurements ran on a dedicated machine (z14) using uperf with streaming profiles (MTU size 1500).

  Example throughput drop (traffic via a shared card, i.e.
  client and server using VFs from the same ConnectX-4):

  ------------------------------------------------------------
  |                              | Legacy RQ   | Striding RQ |
  ------------------------------------------------------------
  | str-writex30k (1 connection) | 24.62Gb/s   | 33.47Gb/s   |
  ------------------------------------------------------------

  Additionally, two tests with a transactional workload, using the proposed ethtool switch:

  ------------------------------------------------------------
  |                              | Legacy RQ   | Striding RQ |
  ------------------------------------------------------------
  | rr1c-200x30k---1             | 4.12Gb/s    | 5.66Gb/s    |
  ------------------------------------------------------------
  | rr1c-200x30k--10             | 15.10Gb/s   | 20.77Gb/s   |
  ------------------------------------------------------------

  As concluded in the communication with Mellanox, a simple ethtool command can switch between the queuing methods, which lets us avoid kernel code changes:

  ethtool --set-priv-flags DEVNAME rx_striding_rq on

  (To list the available settings you may use: ethtool --show-priv-flags DEVNAME)
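
  For reference, a minimal sketch of how the two ethtool commands above could be combined into a small script that only flips the flag when it is currently off. The helper names (priv_flag_state, ensure_striding_rq) are made up for illustration, and the parsing assumes that "ethtool --show-priv-flags" prints one "name : on|off" line per flag, which should be checked on the target system:

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the expected output
# format of "ethtool --show-priv-flags" is an assumption.

# Read "ethtool --show-priv-flags" output on stdin and print the state
# ("on" or "off") of the flag named in $1. Prints nothing if not found.
priv_flag_state() {
    awk -v flag="$1" -F':' '
        {
            name = $1
            gsub(/[ \t]/, "", name)
        }
        name == flag {
            state = $2
            gsub(/[ \t]/, "", state)
            print state
            exit
        }
    '
}

# Enable rx_striding_rq on device $1 unless it is already on.
ensure_striding_rq() {
    dev="$1"
    state=$(ethtool --show-priv-flags "$dev" | priv_flag_state rx_striding_rq)
    case "$state" in
        on)
            echo "$dev: rx_striding_rq already on"
            ;;
        off)
            ethtool --set-priv-flags "$dev" rx_striding_rq on
            ;;
        *)
            echo "$dev: rx_striding_rq flag not found" >&2
            return 1
            ;;
    esac
}
```

  It would be called as "ensure_striding_rq DEVNAME" with sufficient privileges. Since the flag is set at runtime, it presumably has to be reapplied after a reboot or driver reload.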