[Kernel-packages] [Bug 1801574] Re: [cosmic] ipoib ping with large message size failed
Hi,

After trying the steps mentioned in comment #4 on 18.10, the issue no longer reproduces with the suggested setting.

Yours,
Talat

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1801574

Title:
  [cosmic] ipoib ping with large message size failed

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Ping over an ipoib interface gets stuck with large packets. This is a new regression; the test passes on Ubuntu 18.04. After investigating the issue, we see that commit [1] introduces it. It is not an upstream commit, it is a Canonical commit. Could you please check with the Canonical kernel team why they reverted that commit?

  To reproduce the bug, use ConnectX-3 devices with an ipoib connection with the default MTU of 2044 and run command [2]. Is there an open Launchpad bug for it?

  [1]
  commit 77a24c313d21e3765b04d90521e9228a9bb6e332
  Author: Tyler Hicks
  Date: Fri Aug 3 21:53:15 2018 +

  Revert "net: increase fragment memory usage limits"

  This reverts commit c2a936600f78aea00d3312ea4b66a79a4619f9b4. It made denial of service attacks on the IP fragment handling easier to carry out.

  CVE-2018-5391

  Signed-off-by: Tyler Hicks
  Signed-off-by: Stefan Bader

  [2] ping 13.194.22.1 -I 13.194.23.1 -s 65507

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1801574/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp
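For context on why the reproducer in [2] exercises IP fragment reassembly: a 65507-byte ICMP payload cannot fit into one datagram at an MTU of 2044, so the receiver has to queue every fragment until the last one arrives. The C sketch below only counts those fragments. It assumes a 20-byte IPv4 header without options and an 8-byte ICMP header, and the function name icmp_fragment_count is made up for illustration. It does not model how much memory each queued fragment costs on the receiver, so on its own it cannot show whether the limits restored by the revert are exceeded.

```c
#include <assert.h>

enum { IPV4_HDR_LEN = 20, ICMP_HDR_LEN = 8 };

/*
 * Number of IPv4 fragments needed for one ICMP echo request that carries
 * icmp_payload data bytes over a link with the given IP MTU.
 * Assumption: no IP options, so every fragment has a 20-byte header.
 */
static unsigned int icmp_fragment_count(unsigned int icmp_payload,
					unsigned int mtu)
{
	/* Each fragment must carry at least one 8-byte unit. */
	assert(mtu >= IPV4_HDR_LEN + 8);

	/* Bytes that follow the IP header: ICMP header plus data. */
	unsigned int total = icmp_payload + ICMP_HDR_LEN;

	/* Every fragment except the last carries a multiple of 8 bytes. */
	unsigned int per_frag = ((mtu - IPV4_HDR_LEN) / 8) * 8;

	/* Round up: the last fragment may be shorter. */
	return (total + per_frag - 1) / per_frag;
}
```

With the values from command [2], icmp_fragment_count(65507, 2044) evaluates to 33: each fragment carries 2024 bytes and 65515 bytes have to be delivered.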
[Kernel-packages] [Bug 1801574] Re: [cosmic] ipoib ping with large message size failed
Hi Tyler,

We bisected the kernel and found that the mentioned commit is the root cause of this bug. If bionic contains this commit, then the bug is also present in the bionic release.

Yours,
Talat
[Kernel-packages] [Bug 1801574] Re: [cosmic] ipoib ping with large message size failed
Hi,

Sure, I will try it and update.
[Kernel-packages] [Bug 1851446] Re: Backport MPLS patches from 5.3 to 4.15
Hi Jeff,

Unfortunately, patch [1], which exposes the support in the mlx5 driver, was missing. Could you please add it?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.5-rc2&id=5dc9520bf04a6b95660a307d7654460d1463d91a

Yours,
Talat

--
https://bugs.launchpad.net/bugs/1851446

Title:
  Backport MPLS patches from 5.3 to 4.15

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete
Status in linux source package in Disco:
  New

Bug description:
  Mellanox is requesting a backport of the following commit IDs from 5.3 back to 4.15.

  Netdevice HW MPLS features are not passed from the device driver's netdevice to upper netdevices, specifically the VLAN and bonding netdevices which are created by the kernel when needed. This prevents enablement and usage of HW offloads, such as TSO and checksumming, for MPLS tagged traffic when running via a VLAN or bonding interface.

  The patches introduce changes to the initialization steps of the VLAN and bonding netdevices to inherit the MPLS features from lower netdevices to allow the HW offloads.

  Ariel Levkovich (2):
    net: bonding: Inherit MPLS features from slave devices
    net: vlan: Inherit MPLS features from parent device

   drivers/net/bonding/bond_main.c | 11 +++
   net/8021q/vlan_dev.c            |  1 +
   2 files changed, 12 insertions(+)

  https://www.mail-archive.com/netdev@vger.kernel.org/msg299084.html

  Commit IDs (all landed in 5.3):
  600bb0318c18e9616d97ad123caaa7c5f7bf222c
  8b6912a5019356d7adb1b8a146c9eef5e679bf98
  2e770b507ccde8eedc129946e4b78ceed0a22df2
[Kernel-packages] [Bug 1758662] Re: [Bionic] mlx4 ETH - mlnx_qos failed when set some TC to vendor
After testing the build, the issue did not reproduce. Thanks.

--
https://bugs.launchpad.net/bugs/1758662

Title:
  [Bionic] mlx4 ETH - mlnx_qos failed when set some TC to vendor

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  In Progress

Bug description:
  reproduce:
  [root@reg-l-vrt-41018-010 ~]# /usr/bin/mlnx_qos -i ens8 -s vendor,ets,vendor,ets,ets,strict,ets,vendor -t 0,36,0,55,4,0,5,0
  Netlink error: Bad value. see dmesg.
  [root@reg-l-vrt-41018-010 ~]# dmesg
  [69718.992299] mlx4_en: ens8: TC[0]: Not supported TSA: 255

  There is an upstream commit that fixes the issue; please add it to bionic:

  commit a42b63c1ac1986f17f71bc91a6b0aaa14d4dae71
  Author: Moni Shoua
  Date: Thu Dec 28 16:26:11 2017 +0200

  net/mlx4_en: Change default QoS settings

  Change the default mapping between TC and TCG as follows:

  Prio | TC/TCG
       | from          to
       | (set by FW)   (set by SW)
  -----+---------------------------
   0   | 0/0           0/7
   1   | 1/0           0/6
   2   | 2/0           0/5
   3   | 3/0           0/4
   4   | 4/0           0/3
   5   | 5/0           0/2
   6   | 6/0           0/1
   7   | 7/0           0/0

  These new settings cause that a pause frame for any prio stops traffic for all prios.

  Fixes: 564c274c3df0 ("net/mlx4_en: DCB QoS support")
  Signed-off-by: Moni Shoua
  Signed-off-by: Maor Gottlieb
  Signed-off-by: Tariq Toukan
  Signed-off-by: David S. Miller

  diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
  index 5f41dc9..1a0c3bf8 100644
  --- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
  +++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
  @@ -310,6 +310,7 @@ static int mlx4_en_ets_validate(struct mlx4_en_priv *priv, struct ieee_ets *ets)
   		}
   
   		switch (ets->tc_tsa[i]) {
  +		case IEEE_8021QAZ_TSA_VENDOR:
   		case IEEE_8021QAZ_TSA_STRICT:
   			break;
   		case IEEE_8021QAZ_TSA_ETS:
  @@ -347,6 +348,10 @@ static int mlx4_en_config_port_scheduler(struct mlx4_en_priv *priv,
   	/* higher TC means higher priority => lower pg */
   	for (i = IEEE_8021QAZ_MAX_TCS - 1; i >= 0; i--) {
   		switch (ets->tc_tsa[i]) {
  +		case IEEE_8021QAZ_TSA_VENDOR:
  +			pg[i] = MLX4_EN_TC_VENDOR;
  +			tc_tx_bw[i] = MLX4_EN_BW_MAX;
  +			break;
   		case IEEE_8021QAZ_TSA_STRICT:
   			pg[i] = num_strict++;
   			tc_tx_bw[i] = MLX4_EN_BW_MAX;
  diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
  index 99051a2..21bc17f 100644
  --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
  +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
  @@ -3336,6 +3336,13 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
   	priv->msg_enable = MLX4_EN_MSG_LEVEL;
   #ifdef CONFIG_MLX4_EN_DCB
   	if (!mlx4_is_slave(priv->mdev->dev)) {
  +		u8 prio;
  +
  +		for (prio = 0; prio < IEEE_8021QAZ_MAX_TCS; ++prio) {
  +			priv->ets.prio_tc[prio] = prio;
  +			priv->ets.tc_tsa[prio] = IEEE_8021QAZ_TSA_VENDOR;
  +		}
  +
   		priv->dcbx_cap = DCB_CAP_DCBX_VER_CEE | DCB_CAP_DCBX_HOST |
   			DCB_CAP_DCBX_VER_IEEE;
   		priv->flags |= MLX4_EN_DCB_ENABLED;
  diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
  index 2b72677..7db3d0d 100644
  --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
  +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
  @@ -479,6 +479,7 @@ struct mlx4_en_frag_info {
   #define MLX4_EN_BW_MIN 1
   #define MLX4_EN_BW_MAX 100 /* Utilize 100% of t
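The effect of the first hunk on the reproducer can be illustrated with a small standalone C sketch. This is not the driver code: ets_validate is a hypothetical, simplified stand-in for mlx4_en_ets_validate, and the constant names are shortened. The TSA values are the ones from include/uapi/linux/dcbnl.h (vendor is 255, which matches the "Not supported TSA: 255" message in dmesg). The check that the ETS bandwidth shares add up to 100 is an assumption about the part of the function the quoted hunk does not show, and the values passed with -t are taken here to be per-TC bandwidth shares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* TSA values as in include/uapi/linux/dcbnl.h; names shortened for the sketch. */
enum { TSA_STRICT = 0, TSA_ETS = 2, TSA_VENDOR = 255 };
enum { MAX_TCS = 8, BW_MAX = 100 };

/*
 * Returns 0 if the per-TC settings are acceptable, -1 if a TC requests an
 * unsupported TSA, -2 if the ETS bandwidth shares do not add up to 100
 * (assumed check, not visible in the quoted hunk).
 * accept_vendor selects the behaviour before (false) or after (true) the patch.
 */
static int ets_validate(const unsigned char *tc_tsa,
			const unsigned char *tc_tx_bw, bool accept_vendor)
{
	unsigned int total_ets_bw = 0;
	bool has_ets_tc = false;

	assert(tc_tsa != NULL && tc_tx_bw != NULL);

	for (int i = 0; i < MAX_TCS; i++) {
		switch (tc_tsa[i]) {
		case TSA_VENDOR:
			/* Before the patch this value fell through to "default". */
			if (!accept_vendor)
				return -1;
			break;
		case TSA_STRICT:
			break;
		case TSA_ETS:
			has_ets_tc = true;
			total_ets_bw += tc_tx_bw[i];
			break;
		default:
			return -1;
		}
	}

	if (has_ets_tc && total_ets_bw != BW_MAX)
		return -2;
	return 0;
}
```

Fed with the settings from the mlnx_qos command above (vendor,ets,vendor,ets,ets,strict,ets,vendor and 0,36,0,55,4,0,5,0), the sketch rejects the request at TC 0 when accept_vendor is false and accepts it when accept_vendor is true, since the ETS shares 36 + 55 + 4 + 5 add up to 100.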
[Kernel-packages] [Bug 1763269] [NEW] Mellanox [mlx5] [bionic] UBSAN: Undefined behaviour in ./include/linux/net_dim.h
Public bug reported:

We see "UBSAN: Undefined behaviour in ./include/linux/net_dim.h:243:6". The following trace appeared during traffic in the regression run:

[12885.292500] UBSAN: Undefined behaviour in ./include/linux/net_dim.h:243:6
[12885.296358] signed integer overflow:
[12885.300100] 358869104 * 100 cannot be represented in type 'int'
[12885.304001] CPU: 2 PID: 19630 Comm: sock_stream_tes Tainted: G OE 4.15.0-rc8-for-upstream-dbg-2018-01-25_19-31-23-61 #1
[12885.311856] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
[12885.316091] Call Trace:
[12885.320234]
[12885.324366] dump_stack+0xd1/0x159
[12885.328586] ? dma_virt_map_sg+0x147/0x147
[12885.332804] ? val_to_string.constprop.4+0x88/0xd1
[12885.337055] ubsan_epilogue+0x9/0x49
[12885.341345] handle_overflow+0x15e/0x189
[12885.345636] ? __ubsan_handle_negate_overflow+0x108/0x108
[12885.349891] ? kvm_clock_read+0x1f/0x30
[12885.354230] ? ktime_get+0x18d/0x280
[12885.358654] ? getrawmonotonic64+0x320/0x320
[12885.363116] ? mark_lock+0x1cf/0xc50
[12885.367624] ? inet_recvmsg+0x121/0x4a0
[12885.372114] mlx5e_napi_poll+0x1199/0x15c0 [mlx5_core]
[12885.376774] ? mlx5e_rx_dim_work+0x160/0x160 [mlx5_core]
[12885.381406] ? print_irqtrace_events+0x120/0x120
[12885.385907] ? mark_held_locks+0x93/0x100
[12885.392099] ? print_irqtrace_events+0x120/0x120
[12885.396589] ? trace_hardirqs_on_caller+0x206/0x390
[12885.401278] ? kasan_slab_free+0x87/0xc0
[12885.406000] ? pvclock_clocksource_read+0x146/0x280
[12885.410608] ? mark_held_locks+0x71/0x100
[12885.415251] net_rx_action+0x58c/0x10a0
[12885.419873] ? napi_complete_done+0x3d0/0x3d0
[12885.424385] ? check_chain_key+0x150/0x260
[12885.428784] ? debug_check_no_locks_freed+0x200/0x200
[12885.433041] ? match_held_lock+0x8a/0x4f0
[12885.437215] ? match_held_lock+0x8a/0x4f0
[12885.441249] ? lock_downgrade+0x3e0/0x3e0
[12885.445151] ? do_raw_spin_unlock+0x14d/0x230
[12885.448970] ? save_trace+0x1f0/0x1f0
[12885.452664] ? save_trace+0x1f0/0x1f0
[12885.456224] ? match_held_lock+0xa2/0x4f0
[12885.459668] ? pvclock_clocksource_read+0x146/0x280
[12885.463085] ? save_trace+0x1f0/0x1f0
[12885.466361] ? preempt_count_sub+0x14/0xd0
[12885.469566] ? __lock_is_held+0x5d/0x110
[12885.472665] ? preempt_count_sub+0x14/0xd0
[12885.475653] ? __lock_is_held+0x5d/0x110
[12885.478529] ? mark_lock+0x1cf/0xc50
[12885.481276] ? match_held_lock+0xa2/0x4f0
[12885.483984] ? print_irqtrace_events+0x120/0x120
[12885.486679] ? save_trace+0x1f0/0x1f0
[12885.490891] ? irq_exit+0x150/0x150
[12885.493454] ? __napi_schedule+0x1ae/0x220
[12885.495936] ? netdev_master_upper_dev_link+0x20/0x20
[12885.498402] ? check_chain_key+0x150/0x260
[12885.500774] ? __tasklet_schedule+0x22/0xf0
[12885.503086] ? match_held_lock+0xa2/0x4f0
[12885.505431] ? mlx5_eq_int+0x821/0xb50 [mlx5_core]
[12885.507775] ? save_trace+0x1f0/0x1f0
[12885.510082] ? pvclock_clocksource_read+0x146/0x280
[12885.512416] ? pvclock_read_flags+0x80/0x80
[12885.514705] ? save_trace+0x1f0/0x1f0
[12885.516995] ? __handle_irq_event_percpu+0x1b0/0x800
[12885.519305] ? __lock_is_held+0x5d/0x110
[12885.521630] __do_softirq+0x248/0xba9
[12885.523913] ? __irqentry_text_end+0x1f8a70/0x1f8a70
[12885.526234] ? pvclock_clocksource_read+0x146/0x280
[12885.528563] ? pvclock_read_flags+0x80/0x80
[12885.530843] ? do_raw_spin_trylock+0x120/0x120
[12885.533178] ? kvm_clock_read+0x1f/0x30
[12885.535432] ? kvm_sched_clock_read+0x5/0x10
[12885.537702] ? sched_clock_cpu+0x14/0x1f0
[12885.539968] irq_exit+0xf4/0x150
[12885.542186] do_IRQ+0xe8/0x1e0
[12885.544390] common_interrupt+0xa2/0xa2
[12885.546607]

There is an int overflow in include/linux/net_dim.h:

#define IS_SIGNIFICANT_DIFF(val, ref) \
	(((100 * abs((val) - (ref))) / (ref)) > 10) /* more than 10% difference */

The include/linux/net_dim.h library is new in kernel 4.16; in the 4.15 kernel this code was in drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c.

The upstream commit that fixes this issue is:

commit f97c3dc3c0e8d23a5c4357d182afeef4c67f5c33
Author: Tal Gilboa
Date: Thu Mar 29 13:53:52 2018 +0300

net/dim: Fix int overflow

When calculating difference between samples, the values are multiplied by 100. Large values may cause int overflow when multiplied (usually on first iteration). Fixed by forcing 100 to be of type unsigned long.

Fixes: 4c4dbb4a7363 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
Signed-off-by: Tal Gilboa
Reviewed-by: Andy Gospodarek
Signed-off-by: David S. Miller

diff --git a/include/linux/net_dim.h b/include/linux/net_dim.h
index bebeaad..29ed8fd 100644
--- a/include/linux/net_dim.h
+++ b/include/linux/net_dim.h
@@ -231,7 +231,7 @@ static inline void net_dim_exit_parking(struct net_dim *dim)
 }
 
 #define IS_SIGNIFICANT_DIFF(val, ref) \
-	(((100 * abs((val) - (ref))) / (ref)) > 10) /* more than 10% difference */
+	(((100UL * abs((val) - (ref))) / (re
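The arithmetic behind the UBSAN report can be checked outside the kernel with a short C sketch. Two things here are assumptions: the helper scaled_diff_overflows_int is made up for illustration, and the patched macro body is reconstructed from the removed line with 100 replaced by 100UL, as the commit message describes, because the quoted diff is cut off. The sketch deliberately never evaluates the unpatched expression, since signed overflow is undefined behaviour in C.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/*
 * True when 100 * diff cannot be represented in an int, i.e. when the
 * unpatched macro would overflow for this absolute difference.
 */
static int scaled_diff_overflows_int(int diff)
{
	assert(diff >= 0);
	return diff > INT_MAX / 100;
}

/*
 * Patched form (reconstructed, see the note above): 100UL makes the
 * multiplication happen in unsigned long instead of int.
 */
#define IS_SIGNIFICANT_DIFF(val, ref) \
	(((100UL * abs((val) - (ref))) / (ref)) > 10) /* more than 10% difference */
```

With a 32-bit int, INT_MAX / 100 is 21474836, so the difference of 358869104 from the trace is well past the point where the int multiplication overflows. The patched macro still reports 120 against a reference of 100 as significant and 105 against 100 as not significant.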
[Kernel-packages] [Bug 1763325] Re: [bionic] ConnectX5 Large message size throughput degradation in TCP
I tested this patch on bionic and it is working properly.

Before the patch:
# ethtool --show-priv-flags enp6s0f0
Private flags for enp6s0f0:
rx_cqe_moder   : on
tx_cqe_moder   : on
rx_cqe_compress: off

After applying the patch:
# ethtool --show-priv-flags enp6s0f0
Private flags for enp6s0f0:
rx_cqe_moder   : on
tx_cqe_moder   : off
rx_cqe_compress: off
[Kernel-packages] [Bug 1763325] [NEW] [bionic] ConnectX5 Large message size throughput degradation in TCP
Public bug reported:

We see a degradation of ~20% on ConnectX-5/4 in the following case: TCP, 1 QP, 1 stream, unidirectional, single port. Message sizes of 1M and up show this degradation. After changing the default TX moderation mode to off we see up to 40% packet rate and up to 23% bandwidth degradations.

There is an upstream commit that fixes this issue. I will backport it and send it to kernel-t...@lists.ubuntu.com.

commit 48bfc39791b8b4a25f165e711f18b9c1617cefbc
Author: Tal Gilboa
Date: Fri Mar 30 15:50:08 2018 -0700

net/mlx5e: Set EQE based as default TX interrupt moderation mode

The default TX moderation mode was mistakenly set to CQE based. The intention was to add a control ability in order to improve some specific use-cases. In general, we prefer to use EQE based moderation as it gives much better numbers for the common cases.

CQE based causes a degradation in the common case since it resets the moderation timer on CQE generation. This causes an issue when TSO is well utilized (large TSO sessions). The timer is set to 16us so traffic of ~64KB TSO sessions per second would mean timer reset (CQE per TSO session -> long time between CQEs). In this case we quickly reach the tcp_limit_output_bytes (256KB by default) and cause a halt in TX traffic.

By setting EQE based moderation we make sure timer would expire after 16us regardless of the packet rate. This fixes an up to 40% packet rate and up to 23% bandwidth degradations.

Fixes: 0088cbbc4b66 ("net/mlx5e: Enable CQE based moderation on TX CQ")
Signed-off-by: Tal Gilboa
Signed-off-by: Saeed Mahameed
Signed-off-by: David S. Miller

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index c71f4f10283b..0aab3afc6885 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4137,7 +4137,7 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
 			    struct mlx5e_params *params,
 			    u16 max_channels, u16 mtu)
 {
-	u8 cq_period_mode = 0;
+	u8 rx_cq_period_mode;
 
 	params->sw_mtu = mtu;
 	params->hard_mtu = MLX5E_ETH_HARD_MTU;
@@ -4173,12 +4173,12 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
 	params->lro_timeout = mlx5e_choose_lro_timeout(mdev, MLX5E_DEFAULT_LRO_TIMEOUT);
 
 	/* CQ moderation params */
-	cq_period_mode = MLX5_CAP_GEN(mdev, cq_period_start_from_cqe) ?
+	rx_cq_period_mode = MLX5_CAP_GEN(mdev, cq_period_start_from_cqe) ?
 			MLX5_CQ_PERIOD_MODE_START_FROM_CQE :
 			MLX5_CQ_PERIOD_MODE_START_FROM_EQE;
 	params->rx_dim_enabled = MLX5_CAP_GEN(mdev, cq_moderation);
-	mlx5e_set_rx_cq_mode_params(params, cq_period_mode);
-	mlx5e_set_tx_cq_mode_params(params, cq_period_mode);
+	mlx5e_set_rx_cq_mode_params(params, rx_cq_period_mode);
+	mlx5e_set_tx_cq_mode_params(params, MLX5_CQ_PERIOD_MODE_START_FROM_EQE);
 
 	/* TX inline */
 	params->tx_min_inline_mode = mlx5e_params_calculate_tx_min_inline(mdev);

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: New
[Kernel-packages] [Bug 1764982] [NEW] [bionic] machine stuck and bonding not working well when nvmet_rdma module is loaded
Public bug reported:

Hi,

The machine gets stuck after unregistering the bonding interface while the nvmet_rdma module is loaded.

Scenario:
# modprobe nvmet_rdma
# modprobe -r bonding
# modprobe bonding -v mode=1 miimon=100 fail_over_mac=0
# ifdown eth4
# ifdown eth5
# ip addr add 15.209.12.173/8 dev bond0
# ip link set bond0 up
# echo +eth5 > /sys/class/net/bond0/bonding/slaves
# echo +eth4 > /sys/class/net/bond0/bonding/slaves
# echo -eth4 > /sys/class/net/bond0/bonding/slaves
# echo -eth5 > /sys/class/net/bond0/bonding/slaves
# echo -bond0 > /sys/class/net/bonding_masters

dmesg:
kernel: [78348.225556] bond0 (unregistering): Released all slaves
kernel: [78358.339631] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78368.419621] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78378.499615] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78388.579625] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78398.659613] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78408.739655] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78418.819634] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78428.899642] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78438.979614] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78449.059619] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78459.139626] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78469.219623] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78479.299619] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78489.379620] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78499.459623] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78509.539631] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78519.619629] unregister_netdevice: waiting for bond0 to become free. Usage count = 2

The following upstream commits fix this issue:

commit a3dd7d0022c347207ae931c753a6dc3e6e8fcbc1
Author: Max Gurtovoy
Date: Wed Feb 28 13:12:38 2018 +0200

nvmet-rdma: Don't flush system_wq by default during remove_one

The .remove_one function is called for any ib_device removal. In case the removed device has no reference in our driver, there is no need to flush the system work queue.

Reviewed-by: Israel Rukshin
Signed-off-by: Max Gurtovoy
Reviewed-by: Sagi Grimberg
Signed-off-by: Keith Busch
Signed-off-by: Jens Axboe

diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index aa8068f..a59263d 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -1469,8 +1469,25 @@ static struct nvmet_fabrics_ops nvmet_rdma_ops = {
 static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 {
 	struct nvmet_rdma_queue *queue, *tmp;
+	struct nvmet_rdma_device *ndev;
+	bool found = false;
+
+	mutex_lock(&device_list_mutex);
+	list_for_each_entry(ndev, &device_list, entry) {
+		if (ndev->device == ib_device) {
+			found = true;
+			break;
+		}
+	}
+	mutex_unlock(&device_list_mutex);
+
+	if (!found)
+		return;
 
-	/* Device is being removed, delete all queues using this device */
+	/*
+	 * IB Device that is used by nvmet controllers is being removed,
+	 * delete all queues using this device.
+	 */
 	mutex_lock(&nvmet_rdma_queue_mutex);
 	list_for_each_entry_safe(queue, tmp, &nvmet_rdma_queue_list,
 				 queue_list) {

commit 9bad0404ecd7594265cef04e176adeaa4ffbca4a
Author: Max Gurtovoy
Date: Wed Feb 28 13:12:39 2018 +0200

nvme-rdma: Don't flush delete_wq by default during remove_one

The .remove_one function is called for any ib_device removal. In case the removed device has no reference in our driver, there is no need to flush the work queue.

Reviewed-by: Israel Rukshin
Signed-off-by: Max Gurtovoy
Reviewed-by: Sagi Grimberg
Signed-off-by: Keith Busch
Signed-off-by: Jens Axboe

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index f5f460b..250b277 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2024,6 +2024,20 @@ static struct nvmf_transport_ops nvme_rdma_transport = {
 static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 {
 	struct nvme_rdma_ctrl *ctrl;
+	struct nvme_rdma_device *ndev;
+	bool found = false;
+
+	mutex_lock(&device_list_mutex);
+	list_for_each_entry(ndev, &device_list, entry) {
+		if (ndev->dev == ib_device) {
+
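The idea shared by both commits, returning early from remove_one when the removed ib_device was never used by the driver, can be sketched in plain C without any kernel headers. Everything in the block is hypothetical: struct tracked_dev stands in for an entry on the driver's device list, locking is left out, and the flushes counter stands in for the work queue flush, which the quoted hunks do not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry on the driver's device list. */
struct tracked_dev {
	const void *ib_device;
	struct tracked_dev *next;
};

/* Walk the list and report whether ib_device is one the driver uses. */
static bool device_is_tracked(const struct tracked_dev *head,
			      const void *ib_device)
{
	for (; head != NULL; head = head->next)
		if (head->ib_device == ib_device)
			return true;
	return false;
}

/*
 * Returns true if cleanup ran. *flushes counts how often the flush would
 * have been issued; for an unrelated device it stays untouched.
 */
static bool remove_one(const struct tracked_dev *head, const void *ib_device,
		       unsigned int *flushes)
{
	assert(flushes != NULL);

	if (!device_is_tracked(head, ib_device))
		return false;	/* not our device: skip cleanup and flush */

	/* Queue deletion for this device would happen here. */
	++*flushes;
	return true;
}
```

In this sketch, removing a device that is not on the list leaves the flush counter at zero, which mirrors the early return added by the patches.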
[Kernel-packages] [Bug 1764982] Re: [bionic] machine stuck and bonding not working well when nvmet_rdma module is loaded
Hi,

Sorry, I missed that. I will do it today.

Yours,
Talat

--
https://bugs.launchpad.net/bugs/1764982

Title:
  [bionic] machine stuck and bonding not working well when nvmet_rdma module is loaded

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  Fix Committed

Bug description:
  == SRU Justification ==
  This bug causes the machine to get stuck and bonding to not work when the nvmet_rdma module is loaded.

  Both of these commits are in mainline as of v4.17-rc1.

  == Fixes ==
  a3dd7d0022c3 ("nvmet-rdma: Don't flush system_wq by default during remove_one")
  9bad0404ecd7 ("nvme-rdma: Don't flush delete_wq by default during remove_one")

  == Regression Potential ==
  Low. Limited to nvme driver and tested by Mellanox.

  == Test Case ==
  A test kernel was built with these patches and tested by the original bug reporter. The bug reporter states the test kernel resolved the bug.
[Kernel-packages] [Bug 1764982] Re: [bionic] machine stuck and bonding not working well when nvmet_rdma module is loaded
** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1764982

Title:
  [bionic] machine stuck and bonding not working well when nvmet_rdma module is loaded

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  Fix Committed

Bug description:
  == SRU Justification ==
  This bug causes the machine to get stuck and bonding to not work when
  the nvmet_rdma module is loaded.

  Both of these commits are in mainline as of v4.17-rc1.

  == Fixes ==
  a3dd7d0022c3 ("nvmet-rdma: Don't flush system_wq by default during remove_one")
  9bad0404ecd7 ("nvme-rdma: Don't flush delete_wq by default during remove_one")

  == Regression Potential ==
  Low. Limited to nvme driver and tested by Mellanox.

  == Test Case ==
  A test kernel was built with these patches and tested by the original
  bug reporter. The bug reporter states the test kernel resolved the bug.

  == Original Bug Description ==
  Hi

  The machine gets stuck after unregistering the bonding interface when
  the nvmet_rdma module is loaded.

  scenario:
  # modprobe nvmet_rdma
  # modprobe -r bonding
  # modprobe bonding -v mode=1 miimon=100 fail_over_mac=0
  # ifdown eth4
  # ifdown eth5
  # ip addr add 15.209.12.173/8 dev bond0
  # ip link set bond0 up
  # echo +eth5 > /sys/class/net/bond0/bonding/slaves
  # echo +eth4 > /sys/class/net/bond0/bonding/slaves
  # echo -eth4 > /sys/class/net/bond0/bonding/slaves
  # echo -eth5 > /sys/class/net/bond0/bonding/slaves
  # echo -bond0 > /sys/class/net/bonding_masters

  dmesg:
  kernel: [78348.225556] bond0 (unregistering): Released all slaves
  kernel: [78358.339631] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78368.419621] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78378.499615] unregister_netdevice: waiting for bond0 to become free.
[Kernel-packages] [Bug 1764982] Re: [bionic] machine stuck and bonding not working well when nvmet_rdma module is loaded
Thank you for the build. I tested with those patches and it works.

Thanks,
Talat

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1764982

Title:
  [bionic] machine stuck and bonding not working well when nvmet_rdma module is loaded

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  In Progress

Bug description:
  Hi

  The machine gets stuck after unregistering the bonding interface when
  the nvmet_rdma module is loaded.

  scenario:
  # modprobe nvmet_rdma
  # modprobe -r bonding
  # modprobe bonding -v mode=1 miimon=100 fail_over_mac=0
  # ifdown eth4
  # ifdown eth5
  # ip addr add 15.209.12.173/8 dev bond0
  # ip link set bond0 up
  # echo +eth5 > /sys/class/net/bond0/bonding/slaves
  # echo +eth4 > /sys/class/net/bond0/bonding/slaves
  # echo -eth4 > /sys/class/net/bond0/bonding/slaves
  # echo -eth5 > /sys/class/net/bond0/bonding/slaves
  # echo -bond0 > /sys/class/net/bonding_masters

  dmesg:
  kernel: [78348.225556] bond0 (unregistering): Released all slaves
  kernel: [78358.339631] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78368.419621] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78378.499615] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78388.579625] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78398.659613] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78408.739655] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78418.819634] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78428.899642] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78438.979614] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78449.059619] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78459.139626] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78469.219623] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78479.299619] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78489.379620] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78499.459623] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78509.539631] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78519.619629] unregister_netdevice: waiting for bond0 to become free. Usage count = 2

  The following upstream commits fix this issue:

  commit a3dd7d0022c347207ae931c753a6dc3e6e8fcbc1
  Author: Max Gurtovoy
  Date:   Wed Feb 28 13:12:38 2018 +0200

      nvmet-rdma: Don't flush system_wq by default during remove_one

      The .remove_one function is called for any ib_device removal.
      In case the removed device has no reference in our driver, there
      is no need to flush the system work queue.

      Reviewed-by: Israel Rukshin
      Signed-off-by: Max Gurtovoy
      Reviewed-by: Sagi Grimberg
      Signed-off-by: Keith Busch
      Signed-off-by: Jens Axboe

  diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
  index aa8068f..a59263d 100644
  --- a/drivers/nvme/target/rdma.c
  +++ b/drivers/nvme/target/rdma.c
  @@ -1469,8 +1469,25 @@ static struct nvmet_fabrics_ops nvmet_rdma_ops = {
   static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data)
   {
   	struct nvmet_rdma_queue *queue, *tmp;
  +	struct nvmet_rdma_device *ndev;
  +	bool found = false;
  +
  +	mutex_lock(&device_list_mutex);
  +	list_for_each_entry(ndev, &device_list, entry) {
  +		if (ndev->device == ib_device) {
  +			found = true;
  +			break;
  +		}
  +	}
  +	mutex_unlock(&device_list_mutex);
  +
  +	if (!found)
  +		return;
   
  -	/* Device is being removed, delete all queues using this device */
  +	/*
  +	 * IB Device that is used by nvmet controllers is being removed,
  +	 * delete all queues using this device.
  +	 */
   	mutex_lock(&nvmet_rdma_queue_mutex);
   	list_for_each_entry_safe(queue, tmp, &nvmet_rdma_queue_list,
   				 queue_list) {

  commit 9bad0404ecd7594265cef04e176adeaa4ffbca4a
  Author: Max Gurtovoy
  Date:   Wed Feb 28 13:12:39 2018 +0200

      nvme-rdma: Don't flush delete_wq by default during remove_one

      The .remove_one function is called for any ib_device removal.
      In case the removed device has no reference in our driver, there
      is no need to flush the work queue.

      Reviewed-by: Israel Rukshin
      Signed-off-by: Max Gurtovoy
      Reviewed-by: Sa
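Both commit messages describe the same guard: the remove callback runs for every ib_device removal, so the driver should first check whether it ever used the device and return early if not. The following is a toy userspace model of that control flow only, not kernel code; the class RdmaTargetModel, its fields, and the device names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RdmaTargetModel:
    """Toy stand-in for the driver state (hypothetical, not the kernel structures)."""
    device_list: list = field(default_factory=list)  # devices the driver references
    queues: dict = field(default_factory=dict)       # queue name -> device it runs on
    flush_count: int = 0                              # number of work-queue flushes

    def remove_one(self, ib_device: str) -> bool:
        """Model of the patched callback: skip all work for unknown devices."""
        if ib_device not in self.device_list:
            return False  # device was never used here, so nothing to flush
        # Delete every queue bound to the device being removed, then flush once.
        self.queues = {q: d for q, d in self.queues.items() if d != ib_device}
        self.device_list.remove(ib_device)
        self.flush_count += 1
        return True


model = RdmaTargetModel(device_list=["mlx5_0"], queues={"q0": "mlx5_0"})
unrelated = model.remove_one("mlx4_0")  # removal of a device the driver never used
used = model.remove_one("mlx5_0")       # removal of a device with active queues
print(unrelated, used, model.flush_count)
```

In this model the unrelated removal never reaches the flush, which corresponds to the early return added by the hunk quoted above.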
[Kernel-packages] [Bug 1799049] [NEW] [bionic]mlx5: reading SW stats through ifstat cause kernel crash
Public bug reported:

Description of problem:
Attempting to read SW stats (ifstat -x cpu_hit) on a system with probed VFs will crash the system.

How reproducible:
Always

Steps to Reproduce:
1. Create a VF
   echo 1 > /sys/bus/pci/devices/\:82\:00.0/sriov_numvfs
2. Read SW stats:
   ifstat -x cpu_hit

Actual results:
System will crash

Expected results:
No system crash

Additional info:
The crash is caused by an insufficient check in the helper that determines whether a netdev is a VF representor.

will crash:
$ ifstat -x cpu_hit

will not crash:
$ ifstat -x cpu_hit $VF_REP
$ ifstat -x cpu_hit $UPLINK_REP

We already have a fix for this issue, and it has been accepted upstream.
https://patchwork.ozlabs.org/patch/935193/

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799049

Title:
  [bionic]mlx5: reading SW stats through ifstat cause kernel crash

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1799049/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1799049] Re: [bionic]mlx5: reading SW stats through ifstat cause kernel crash
Thank you for your concern.

I have already sent the patch to the Canonical kernel mailing list and am waiting for them to review it.

Thanks,
Talat

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799049

Title:
  [bionic]mlx5: reading SW stats through ifstat cause kernel crash

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
[Kernel-packages] [Bug 1799049] Re: [bionic]mlx5: reading SW stats through ifstat cause kernel crash
We tested the patch and it fixes the issue.

Thanks,
Talat

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799049

Title:
  [bionic]mlx5: reading SW stats through ifstat cause kernel crash

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
[Kernel-packages] [Bug 1801574] [NEW] [cosmic] ipoib ping with large message size failed
Public bug reported:

Ping over an IPoIB interface hangs with large packets. This is a new regression; the same test passes on Ubuntu 18.04.

After investigating, we found that commit [1] introduces the issue. It is not an upstream commit; it is a Canonical commit.

Could you please check with the Canonical kernel team why they reverted that commit?

To reproduce the bug, use ConnectX-3 devices with an IPoIB connection with 2044 MTU (default) and run command [2].

Is there an open Launchpad bug for it?

[1]
commit 77a24c313d21e3765b04d90521e9228a9bb6e332
Author: Tyler Hicks
Date:   Fri Aug 3 21:53:15 2018 +

    Revert "net: increase fragment memory usage limits"

    This reverts commit c2a936600f78aea00d3312ea4b66a79a4619f9b4. It made
    denial of service attacks on the IP fragment handling easier to carry
    out.

    CVE-2018-5391

    Signed-off-by: Tyler Hicks
    Signed-off-by: Stefan Bader

[2] ping 13.194.22.1 -I 13.194.23.1 -s 65507

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete

** Tags: cosmic

** Summary changed:

- [bionic] ipoib ping with large message size failed
+ [cosmic] ipoib ping with large message size failed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1801574

Title:
  [cosmic] ipoib ping with large message size failed

Status in linux package in Ubuntu:
  Incomplete
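A rough sketch of why the command in [2] leans on IP fragment reassembly: a 65507-byte ICMP payload cannot fit in one 2044-byte MTU packet, so the receiver has to queue many fragments before it can reassemble the datagram. The arithmetic below assumes a 20-byte IPv4 header without options and an 8-byte ICMP echo header; the function name fragment_count is made up for illustration. Whether the queued fragments exceed the limits restored by the revert depends on per-fragment memory accounting, which the report does not state.

```python
import math

IP_HEADER = 20    # IPv4 header without options (assumption)
ICMP_HEADER = 8   # ICMP echo header


def fragment_count(icmp_payload: int, mtu: int) -> int:
    """Number of IPv4 packets needed to carry one ICMP echo (hypothetical helper)."""
    ip_payload = icmp_payload + ICMP_HEADER
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(ip_payload / per_fragment)


frags = fragment_count(65507, 2044)
print(frags)
```

With these assumptions each fragment carries at most 2024 bytes of the 65515-byte IP payload, so a single echo request is split into 33 fragments, all of which must be held in the reassembly queue at the same time.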
[Kernel-packages] [Bug 1687877] [NEW] bonding - mlx5 - speed changed to 0 after changing ring size
Public bug reported:

The problem happens when changing the ring size of a mlx5_core interface that is part of a LACP bond.

[268312.721076] mlx5_core :42:00.1 enp66s0f1: mlx5e_update_carrier:143: Link up
[268312.721940] mlx5_core :42:00.1 enp66s0f1: speed changed to 0 for port enp66s0f1
[268312.732089] bond0: link status up again after 0 ms for interface enp66s0f1

The upstream commit "bonding: allow notifications for bond_set_slave_link_state" fixes the issue. I will cherry-pick it and send it for the xenial git tree.

Thanks,
Talat

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1687877

Title:
  bonding - mlx5 - speed changed to 0 after changing ring size

Status in linux package in Ubuntu:
  New
[Kernel-packages] [Bug 1687877] Re: bonding - mlx5 - speed changed to 0 after changing ring size
Thank you. I tested it and it fixes the issue.

Could you please add this fix to the next SRU?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1687877

Title:
  bonding - mlx5 - speed changed to 0 after changing ring size

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress
[Kernel-packages] [Bug 1682418] [NEW] [zesty] mlx5 OVS vxlan ipv6 LNST test cause Oops
Public bug reported:

Running the offload-enabled LNST IPv6 VXLAN OVS test (recipes/ovs_offload/1_virt_ovs_vxlan_ipv6.xml), together with the setup it creates, multiple times eventually crashes the machine. The test itself and other LNST tests pass; it is the shutdown phase that causes this. There are different stack traces that usually relate to some kind of allocation (or ext4, inode), see one below.

scenario:
1. Install lnst tests
   git clone https://github.com/jpirko/lnst.git && cd lnst && ./setup.py install
2. Prepare an OVS offload enabled setup (2 machines) connected back to back
3. Enable 2 VMs on the mlx5 Physical Function on each machine
4. Set up lnst on the VM and HV (run lnst-slave)
5. Run the IPv6 VXLAN lnst test in a loop, for example:
   #lnst-ctl -d --pools=talat run recipes/ovs_offload/1_virt_ovs_vxlan_ipv6.xml

Call trace
kernel: [76406.381439] Oops: [#1] SMP
kernel: [76406.419297] Modules linked in: act_mirred act_gact act_tunnel_key cls_flower sch_ingress vport_vxlan vxlan ip6_udp_tunnel udp_tunnel vfio_pci vfio_iommu_type1 vfio_virqfd vfio mlx5_ib ib_core nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge stp llc openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd ipmi_ssif intel_cstate ipmi_si input_leds joydev ipmi_devintf
kernel: [76406.981750] mei_me dcdbas intel_rapl_perf shpchp mei ipmi_msghandler lpc_ich mac_hid acpi_power_meter configfs nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mlx4_en hid_generic tg3 mlx5_core usbhid mlx4_core ahci ptp mxm_wmi hid libahci megaraid_sas devlink pps_core fjes wmi
kernel: [76407.335099] CPU: 25 PID: 5253 Comm: ip Not tainted 4.10.0-19-generic #21-Ubuntu
kernel: [76407.446475] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
kernel: [76407.558645] task: 9a2b76f89680 task.stack: bda6c76a8000
kernel: [76407.618666] RIP: 0010:rb_erase+0x194/0x350
kernel: [76407.676596] RSP: 0018:bda6c76ab4f0 EFLAGS: 00010046
kernel: [76407.735460] RAX: 9a2c2cc30bc0 RBX: 9a2c53372d18 RCX:
kernel: [76407.797100] RDX: RSI: 9a2c53372d20 RDI: 9a2c2cc30a40
kernel: [76407.858831] RBP: bda6c76ab4f0 R08: R09: 00018040002e
kernel: [76407.921323] R10: 9a2c2cc30b40 R11: 000f9e00 R12: 9a2c2cc30a40
kernel: [76407.984793] R13: 9a2c53372d18 R14: 0046 R15: 9a2c5536b800
kernel: [76408.048453] FS: 7f3d96082d80() GS:9a2c5f30() knlGS:
kernel: [76408.166912] CS: 0010 DS: ES: CR0: 80050033
ke
[Kernel-packages] [Bug 1678585] Re: [Zesty] TSO doesn't work properly
Hi,

This test works properly in previous releases.

Thanks,
Talat

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1678585

Title:
  [Zesty] TSO doesn't work properly

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  TSO (TCP segmentation offload) is shown as ON by default, but
  aggregation does not actually happen. After turning it on again,
  aggregation is observed.

  Same behavior for mlx4_en, mlx5_core, igb.

  Steps to repro:
  1. root:~# uname -r
     4.10.0-14-generic
  2. root:~# ethtool -i eno2
     driver: igb
     version: 5.4.0-k
     firmware-version: 1.63, 0x89fa
     expansion-rom-version:
     bus-info: :06:00.0
     supports-statistics: yes
     supports-test: yes
     supports-eeprom-access: yes
     supports-register-dump: yes
     supports-priv-flags: no
  3. root:~# ethtool -k eno2 | grep offload
     tcp-segmentation-offload: on
     udp-fragmentation-offload: off [fixed]
     generic-segmentation-offload: on
     generic-receive-offload: on
     large-receive-offload: off [fixed]
     rx-vlan-offload: on
     tx-vlan-offload: on
     l2-fwd-offload: off [fixed]
     hw-tc-offload: off [fixed]
  4. root@:~# ethtool -K eno2 gso off
  5. root@:~# ethtool -k eno2 | grep offload
     tcp-segmentation-offload: on
     udp-fragmentation-offload: off [fixed]
     generic-segmentation-offload: off
     generic-receive-offload: on
     large-receive-offload: off [fixed]
     rx-vlan-offload: on
     tx-vlan-offload: on
     l2-fwd-offload: off [fixed]
     hw-tc-offload: off [fixed]
  6. netperf -H 10.195.43.1 -l 4 -t TCP_STREAM -c -C -- -m 15000
     Traffic size less than or equal to 1514 is observed, despite the
     fact that TSO is on.
  7. ethtool -K eno2 tso on
  8. netperf -H 10.195.43.1 -l 4 -t TCP_STREAM -c -C -- -m 15000
     Traffic size of 64K is observed.
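The report hinges on the difference between what `ethtool -k` prints and what the NIC actually does, so when scripting the reproduction it helps to capture the reported state in a structured form before and after each step. The following is a minimal sketch, assuming the `name: on|off [fixed]` line format shown in steps 3 and 5; the function name parse_offloads is made up for illustration.

```python
def parse_offloads(text: str) -> dict:
    """Map each feature name to (enabled, fixed) from `ethtool -k` style output."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        tokens = value.split()
        if not sep or not tokens:
            continue  # skip lines that are not "name: state" pairs
        features[name.strip()] = (tokens[0] == "on", "[fixed]" in tokens)
    return features


# Excerpt of the output quoted in step 5 of the report.
SAMPLE = """\
tcp-segmentation-offload: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off
generic-receive-offload: on
"""

features = parse_offloads(SAMPLE)
tso_on, tso_fixed = features["tcp-segmentation-offload"]
print(tso_on, tso_fixed)
```

A script could compare this reported state with the packet sizes seen on the wire during the netperf run; the bug is precisely that the two disagree until TSO is toggled on again.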
[Kernel-packages] [Bug 1678585] Re: [Zesty] TSO doesn't work properly
It works with Ubuntu 16.10.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1678585

Title:
  [Zesty] TSO doesn't work properly

Status in linux package in Ubuntu:
  Incomplete
[Kernel-packages] [Bug 1682418] Re: [zesty] mlx5 OVS vxlan ipv6 LNST test cause Oops
Additional trace with this test is below

[80921.628988] general protection fault: [#1] SMP
[80921.630449] Modules linked in: act_mirred act_tunnel_key cls_flower sch_ingress vport_vxlan vxlan ip6_udp_tunnel udp_tunnel vfio_pci vfio_iommu_type1 vfio_virqfd vfio mlx5_ib ib_core openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_defrag_ipv6 nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat libcrc32c nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge stp llc binfmt_misc ipmi_ssif intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate ipmi_si mei_me ipmi_devintf
[80921.652556] dcdbas intel_rapl_perf mei shpchp lpc_ich ipmi_msghandler acpi_power_meter mac_hid nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 mlx4_en mxm_wmi mlx5_core mlx4_core tg3 ahci megaraid_sas ptp libahci devlink pps_core fjes wmi
[80921.659813] CPU: 7 PID: 5743 Comm: kworker/u82:0 Not tainted 4.10.0-19-generic #21-Ubuntu
[80921.662346] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
[80921.664686] Workqueue: events_freezable_power_ disk_events_workfn
[80921.666586] task: 96f69a285a00 task.stack: acc627394000
[80921.668437] RIP: 0010:kmem_cache_alloc+0x77/0x1a0
[80921.669902] RSP: 0018:acc6273978a8 EFLAGS: 00010086
[80921.671531] RAX: RBX: RCX: 00011c41
[80921.673758] RDX: 00011c40 RSI: 01080020 RDI: 0001c5c0
[80921.675985] RBP: acc6273978d8 R08: 96f6ff0dc5c0 R09: 0064
[80921.729078] R10: 96f6af006630 R11: R12: 01080020
[80921.782687] R13: 957acd99 R14: 96eedf407980 R15: 96eedf407980
[80921.836867] FS: () GS:96f6ff0c() knlGS:
[80921.944601] CS: 0010 DS: ES: CR0: 80050033
[80921.998508] CR2: 7fd1e8a75a08 CR3: 001062c09000 CR4: 003426e0
[80922.052471] DR0: DR1: DR2:
[80922.105264] DR3: DR6: fffe0ff0 DR7: 0400
[80922.156346] Call Trace:
[80922.205695] ? kmem_cache_alloc+0xd3/0x1a0
[80922.254432] alloc_iova+0x49/0x240
[80922.301753] alloc_iova_fast+0x55/0x200
[80922.347916] intel_alloc_iova+0xac/0xe0
[80922.392736] intel_map_sg+0xc2/0x220
[80922.436095] ? lock_timer_base+0x81/0xa0
[80922.478325] ata_qc_issue+0x204/0x320

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1682418

Title:
  [zesty] mlx5 OVS vxlan ipv6 LNST test cause Oops

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Zesty:
  Incomplete
[Kernel-packages] [Bug 1758662] [NEW] [Bionic] mlx4 ETH - mlnx_qos failed when set some TC to vendor
Public bug reported:

To reproduce:

[root@reg-l-vrt-41018-010 ~]# /usr/bin/mlnx_qos -i ens8 -s vendor,ets,vendor,ets,ets,strict,ets,vendor -t 0,36,0,55,4,0,5,0
Netlink error: Bad value. see dmesg.
[root@reg-l-vrt-41018-010 ~]# dmesg
[69718.992299] mlx4_en: ens8: TC[0]: Not supported TSA: 255

There is an upstream commit that fixes the issue; please add it to Bionic:

commit a42b63c1ac1986f17f71bc91a6b0aaa14d4dae71
Author: Moni Shoua
Date: Thu Dec 28 16:26:11 2017 +0200

    net/mlx4_en: Change default QoS settings

    Change the default mapping between TC and TCG as follows:

    Prio |        TC/TCG
         |   from          to
         | (set by FW)   (set by SW)
    -----+---------------------------
      0  |   0/0           0/7
      1  |   1/0           0/6
      2  |   2/0           0/5
      3  |   3/0           0/4
      4  |   4/0           0/3
      5  |   5/0           0/2
      6  |   6/0           0/1
      7  |   7/0           0/0

    These new settings cause that a pause frame for any prio stops traffic
    for all prios.

    Fixes: 564c274c3df0 ("net/mlx4_en: DCB QoS support")
    Signed-off-by: Moni Shoua
    Signed-off-by: Maor Gottlieb
    Signed-off-by: Tariq Toukan
    Signed-off-by: David S. Miller

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
index 5f41dc9..1a0c3bf8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
@@ -310,6 +310,7 @@ static int mlx4_en_ets_validate(struct mlx4_en_priv *priv, struct ieee_ets *ets)
 		}
 
 		switch (ets->tc_tsa[i]) {
+		case IEEE_8021QAZ_TSA_VENDOR:
 		case IEEE_8021QAZ_TSA_STRICT:
 			break;
 		case IEEE_8021QAZ_TSA_ETS:
@@ -347,6 +348,10 @@ static int mlx4_en_config_port_scheduler(struct mlx4_en_priv *priv,
 	/* higher TC means higher priority => lower pg */
 	for (i = IEEE_8021QAZ_MAX_TCS - 1; i >= 0; i--) {
 		switch (ets->tc_tsa[i]) {
+		case IEEE_8021QAZ_TSA_VENDOR:
+			pg[i] = MLX4_EN_TC_VENDOR;
+			tc_tx_bw[i] = MLX4_EN_BW_MAX;
+			break;
 		case IEEE_8021QAZ_TSA_STRICT:
 			pg[i] = num_strict++;
 			tc_tx_bw[i] = MLX4_EN_BW_MAX;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 99051a2..21bc17f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -3336,6 +3336,13 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 	priv->msg_enable = MLX4_EN_MSG_LEVEL;
 #ifdef CONFIG_MLX4_EN_DCB
 	if (!mlx4_is_slave(priv->mdev->dev)) {
+		u8 prio;
+
+		for (prio = 0; prio < IEEE_8021QAZ_MAX_TCS; ++prio) {
+			priv->ets.prio_tc[prio] = prio;
+			priv->ets.tc_tsa[prio] = IEEE_8021QAZ_TSA_VENDOR;
+		}
+
 		priv->dcbx_cap = DCB_CAP_DCBX_VER_CEE | DCB_CAP_DCBX_HOST |
 			DCB_CAP_DCBX_VER_IEEE;
 		priv->flags |= MLX4_EN_DCB_ENABLED;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 2b72677..7db3d0d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -479,6 +479,7 @@ struct mlx4_en_frag_info {
 #define MLX4_EN_BW_MIN 1
 #define MLX4_EN_BW_MAX 100 /* Utilize 100% of the line */
 
+#define MLX4_EN_TC_VENDOR 0
 #define MLX4_EN_TC_ETS 7
 
 enum dcb_pfc_type {

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1758662
Title: [Bionic] mlx4 ETH - mlnx_qos failed when set some TC to vendor
Status in linux package in Ubuntu: New
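The kind of check that mlx4_en_ets_validate performs can be approximated in user space before calling mlnx_qos. The sketch below is only an illustration: `check_ets` is a hypothetical helper, not part of mlnx_qos or the driver, and the rule that the ETS shares must total 100 is an assumption drawn from MLX4_EN_BW_MAX being 100, not something this report states.

```shell
# Hypothetical pre-check, not part of mlnx_qos or the mlx4 driver.
# $1: comma-separated TSA list, $2: comma-separated bandwidth list.
# Succeeds when every TSA is vendor, strict or ets and, if any TC uses
# ets, the ets bandwidth shares add up to 100 (assumed rule).
check_ets() (
    # The body runs in a subshell, so changing IFS does not leak out.
    tsa_list=$1
    bw_list=$2
    total=0
    has_ets=0
    IFS=,
    set -- $bw_list                # positional parameters = bandwidths
    for tsa in $tsa_list; do
        bw=${1:-0}                 # bandwidth paired with this TC
        [ $# -gt 0 ] && shift
        case $tsa in
            ets) has_ets=1; total=$((total + bw)) ;;
            vendor|strict) ;;      # bandwidth is ignored for these
            *) exit 1 ;;           # unknown TSA name
        esac
    done
    [ "$has_ets" -eq 0 ] || [ "$total" -eq 100 ]
)

# The arguments from the report: ets shares are 36 + 55 + 4 + 5.
if check_ets vendor,ets,vendor,ets,ets,strict,ets,vendor 0,36,0,55,4,0,5,0; then
    echo "arguments look consistent"
fi
```

With the arguments from the report the check passes, which is consistent with the failure coming from the kernel rejecting the vendor TSA rather than from a bad bandwidth split.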
[Kernel-packages] [Bug 1758662] Re: [Bionic] mlx4 ETH - mlnx_qos failed when set some TC to vendor
Thank you, will test and update.

https://bugs.launchpad.net/bugs/1758662
Title: [Bionic] mlx4 ETH - mlnx_qos failed when set some TC to vendor
Status in linux package in Ubuntu: In Progress
Status in linux source package in Bionic: In Progress
[Kernel-packages] [Bug 1674087] [NEW] [zesty] net sched actions - Adding support for user cookies
Public bug reported:

Adding optional 128-bit action cookie. The idea is to save user state that, when retrieved, serves as a correlator. The kernel _should not_ interpret it. The user can store whatever they wish in the 128 bits, like persistent data, http or existing kernel fib protocol field, etc.

Sample exercise (showing variable length use of cookie)

.. create an accept action with cookie a1b2c3d4
sudo $TC actions add action ok index 1 cookie a1b2c3d4

.. dump all gact actions..
sudo $TC -s actions ls action gact

    action order 0: gact action pass
     random type none pass val 0
     index 1 ref 1 bind 0 installed 5 sec used 5 sec
    Action statistics:
    Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0
    cookie a1b2c3d4

.. bind the accept action to a filter..
sudo $TC filter add dev lo parent : protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 1

... send some traffic..
$ ping 127.0.0.1 -c 3
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.020 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.027 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.038 ms

Upstream commits:
1045ba7 net sched actions: Add support for user cookies
37f1c63 net sched actions: do not overwrite status of action creation.

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1674087
Title: [zesty] net sched actions - Adding support for user cookies
Status in linux package in Ubuntu: New
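Since the cookie is limited to 128 bits, a script that generates cookies may want to check them before calling tc. The following is a minimal sketch under stated assumptions: `valid_cookie` is a hypothetical helper (tc does its own checking), and requiring an even number of hex digits, so that the value maps to whole bytes, is an assumption rather than something the report says.

```shell
# Hypothetical helper, not part of tc: accept a cookie only if it is a
# non-empty hex string of whole bytes that fits in 128 bits
# (16 bytes, i.e. at most 32 hex digits).
valid_cookie() {
    cookie=$1
    case $cookie in
        ''|*[!0-9a-fA-F]*) return 1 ;;   # empty, or contains a non-hex character
    esac
    len=${#cookie}
    # Whole bytes only (assumption) and no more than 32 hex digits.
    [ $((len % 2)) -eq 0 ] && [ "$len" -le 32 ]
}

# The cookie from the sample exercise is 4 bytes long, so it passes.
if valid_cookie a1b2c3d4; then
    echo "cookie a1b2c3d4 fits in the action cookie"
fi
```

A caller could run the check first and only then invoke the `$TC actions add ... cookie` command shown in the report.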
[Kernel-packages] [Bug 1674087] Re: [zesty] net sched actions - Adding support for user cookies
The patches have been sent to kernel-t...@lists.canonical.com.

Thanks,
Talat

https://bugs.launchpad.net/bugs/1674087
Title: [zesty] net sched actions - Adding support for user cookies
Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1672144] Re: ifup service of network device stay active after driver stop
Thank you, I will give it a try and update.

https://bugs.launchpad.net/bugs/1672144
Title: ifup service of network device stay active after driver stop
Status in linux package in Ubuntu: Incomplete

Bug description:
The systemd ifup service of a network device stays active after the module for that device is unloaded, which calls close port (ndo_stop). When we load the NIC driver again, the system tries to start the ifup service for its NICs, but because the service is already active this fails, and the interface does not come up with its static configuration. Below is a simple reproduction with the Mellanox ConnectX4 device (driver name mlx5_core).

We also see this issue on Azure, with an Ubuntu 17.04 guest over Hyper-V: the VF fails to start after SR-IOV is re-enabled on the VM's vNIC.

For now we have a workaround, which is to add a udev rule:

echo DRIVERS==\"*mlx*\", SUBSYSTEM==\"net\", ACTION==\"add\",RUN+=\"/sbin/ifup --force $env{INTERFACE}\" > /lib/udev/rules.d/100-up.rules

Example:
#:/lib/udev/rules.d# cat 100-up.rules
DRIVERS=="*mlx*", SUBSYSTEM=="net", ACTION=="add",RUN+="/sbin/ifup --force $env{INTERFACE}"

***
* More info and reproduce *
***

# ifdown ens1f0
RTNETLINK answers: Cannot assign requested address
# ifup ens1f0
# ifconfig ens1f0
ens1f0: flags=4163 mtu 1500
        inet 123.12.23.1 netmask 255.255.0.0 broadcast 123.12.255.255
        inet6 fe80::268a:7ff:fea1:fbdc prefixlen 64 scopeid 0x20
        ether 24:8a:07:a1:fb:dc txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 17 bytes 1392 (1.3 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

# ethtool -i ens1f0 |grep driv
driver: mlx5_core
# systemctl status ifup@ens1f
ifup@ens1f0.service ifup@ens1f1.service
# systemctl status ifup@ens1f0.service
* ifup@ens1f0.service - ifup for ens1f0
   Loaded: loaded (/lib/systemd/system/ifup@.service; static; vendor preset: enabled)
   Active: active (exited) since Sun 2017-03-12 09:40:04 IST; 2h 26min ago
 Main PID: 1608 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/ifup@ens1f0.service

Mar 12 09:40:04 qa-h-vrt-039 systemd[1]: Started ifup for ens1f0.
Mar 12 09:40:04 qa-h-vrt-039 sh[1608]: ifup: interface ens1f0 already configured

root@qa-h-vrt-039:/tmp# modprobe -rv mlx5_ib
rmmod mlx5_ib
rmmod mlx5_core
# modprobe -rv mlx5_core
# ifconfig -a |grep ens1f0
# lsmod |grep mlx5
# systemctl status ifup@ens1f0.service
* ifup@ens1f0.service - ifup for ens1f0
   Loaded: loaded (/lib/systemd/system/ifup@.service; static; vendor preset: enabled)
   Active: active (exited) since Sun 2017-03-12 09:40:04 IST; 2h 27min ago
 Main PID: 1608 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/ifup@ens1f0.service

Mar 12 09:40:04 qa-h-vrt-039 systemd[1]: Started ifup for ens1f0.
Mar 12 09:40:04 qa-h-vrt-039 sh[1608]: ifup: interface ens1f0 already configured

# modprobe mlx5_core
# ifconfig ens1f0
ens1f0: flags=4098 mtu 1500
        ether 24:8a:07:a1:fb:dc txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

# cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eno1
iface eno1 inet dhcp

#ens1f0
auto ens1f0
iface ens1f0 inet static
address 123.12.23.1
netmask 255.255.0.0
mtu 1500

*
* Another repro and investigation *
*

Once an interface is created, the system starts a service that is responsible for activating it (it basically runs ifup), so the first time everything works. On the second driver reload:

Good flow (on good setup 4.9.0-rc5+):

1. driver is unloaded and the interface’s “ifup” service is shutdown:
Feb 23 00:54:09 reg-l-vrt-206-006 kernel: [6.790189] mlx4_en: enP43508p0s2: Close port called
Feb 23 00:54:09 reg-l-vrt-206-006 kernel: [6.868484] hv_netvsc a2be13bb-7244-44ff-a31a-dea8d58a79da eth1: VF down: enP43508p0s2
Feb 23 00:54:09 reg-l-vrt-206-006 kern
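The udev workaround quoted in the report can also be written with a quoted here-document, which keeps `$env{INTERFACE}` literal without backslash escaping. This is a sketch, not the reporter's script: `write_ifup_rule` is a hypothetical helper name, and the default path is simply the one used in the report.

```shell
# Hypothetical helper: write the workaround rule from the report to a file.
# $1 is the target path; it defaults to the path used in the report.
write_ifup_rule() {
    rule_file=${1:-/lib/udev/rules.d/100-up.rules}
    # 'EOF' is quoted so the shell does not expand $env{INTERFACE};
    # udev substitutes it when the rule fires.
    cat > "$rule_file" <<'EOF'
DRIVERS=="*mlx*", SUBSYSTEM=="net", ACTION=="add", RUN+="/sbin/ifup --force $env{INTERFACE}"
EOF
}

# Example (needs root for the default path):
#   write_ifup_rule
# or try it against a scratch file first:
#   write_ifup_rule /tmp/100-up.rules && cat /tmp/100-up.rules
```

After installing the rule, reloading the udev rules (for example with `udevadm control --reload-rules`) or rebooting is likely needed before the next driver load picks it up.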
[Kernel-packages] [Bug 1672144] Re: ifup service of network device stay active after driver stop
The issue doesn't reproduce with the 4.10.5 mainline kernel. Could you please cherry-pick this commit to the zesty kernel?

Thanks,
Talat

https://bugs.launchpad.net/bugs/1672144
Title: ifup service of network device stay active after driver stop
Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1676388] [NEW] [zesty] mlx5e OVS fixes
Public bug reported:

We see the following issues with Ubuntu Zesty while testing OVS offloading. I have already tested the patches that fix these issues and I am going to send them to kernel-t...@lists.canonical.com.

Issues

1. FW error of groups overlapping when scaling up OVS

tc qdisc del dev ens5f0 ingress
tc qdisc add dev ens5f0 ingress
tc filter add dev ens5f0 parent : protocol ip pref 8 handle 0x1 flower dst_mac e4:1d:2d:5d:25:35 ip_proto udp src_port 2009 action mirred egress redirect dev eth0
tc filter add dev ens5f0 parent : protocol arp pref 1 handle 0x1 flower dst_mac e4:1d:2d:5d:25:35 src_mac e4:1d:2d:5d:25:34 action mirred egress redirect dev eth0
tc filter del dev ens5f0 parent : pref 8 handle 0x1 flower
tc filter add dev ens5f0 parent : protocol ip pref 4 handle 0x1 flower dst_mac e4:1d:2d:5d:25:35 src_mac e4:1d:2d:5d:25:34 ip_proto udp src_port 2229 action mirred egress redirect dev eth0
tc filter add dev ens5f0 parent : protocol ip pref 8 handle 0x1 flower dst_mac e4:1d:2d:5d:25:35 ip_proto udp src_port 2009 action mirred egress redirect dev eth0

2. DELETE_VXLAN_UDP_DPORT failed when restarting OVS in VXLAN with a non-default port

- configure ovs - vxlan with non-default port (1236)
- service openvswitch restart

The following commits should fix these issues:

Commit ID: 1ad9a00ae0efc2e9337148d6c382fad3d27bf99a
Title: net/mlx5e: Avoid supporting udp tunnel port ndo for VF reps
Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=1ad9a00ae0efc2e9337148d6c382fad3d27bf99a

Commit ID: af36370569eb37420e1e78a2e60c277b781fcd00
Title: net/mlx5: Fix create autogroup prev initializer
Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=af36370569eb37420e1e78a2e60c277b781fcd00

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1676388
Title: [zesty] mlx5e OVS fixes
Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1676786] [NEW] [Zesty] mlx5_core Kernel oops with bonding mode 1 and 6
Public bug reported:

We get a kernel panic when we set up a bond interface over two Mellanox mlx5 NICs and then try to unload the bonding module.

scenario:

1. network interfaces configuration

# cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eno1
iface eno1 inet dhcp

#ens1f0
auto ens1f0
iface ens1f0 inet manual
bond-master bond1

auto ens1f1
iface ens1f1 inet manual
bond-master bond1

auto bond1
iface bond1 inet static
address 27.65.194.1
netmask 255.255.255.0
bond-slaves ens1f0 ens1f1
bond-mode 1
bond-primary ens1f0
bond-miimon 100

iface bond1 inet6 static
address 907c:c828:4d05:5bf8::::0002/127

# cat /etc/modprobe.d/bonding.conf
options bonding mode=1

2. ifup bond1
3. modprobe -r bonding
4. OOPS

Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.443796] Oops: [#1] SMP
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.444686] Modules linked in: mlx5_ib mlx5_core bonding mlx4_ib ib_core mlx4_en mlx4_core nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat libcrc32c nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ipmi_ssif intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate joydev input_leds intel_rapl_perf serio_raw lpc_ich hpilo ipmi_si ioatdma ipmi_devintf dca ipmi_msghandler shpchp mac_hid acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.469445] x_tables autofs4 hid_generic psmouse usbhid hid pata_acpi tg3 hpsa ptp scsi_transport_sas devlink pps_core wmi fjes [last unloaded: mlx5_core]
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.473672] CPU: 23 PID: 4846 Comm: ifenslave Not tainted 4.10.0-9-generic #11-Ubuntu
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.475894] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 07/01/2015
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.478038] task: 9b8394e31680 task.stack: b2ed054f4000
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.533408] RIP: 0010:mlx5_lag_netdev_event+0x1e6/0x230 [mlx5_core]
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.590069] RSP: 0018:b2ed054f7bd0 EFLAGS: 00010202
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.646302] RAX: 0002 RBX: 9b7f825f6000 RCX:
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.701966] RDX: RSI: 00040400 RDI: 9b7f840a00b0
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.756395] RBP: b2ed054f7c18 R08: c02fb000 R09: 9b7fa3117ea8
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.810250] R10: R11: 0051a84e R12: 0001
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.863569] R13: 0004 R14: 9b7fa3117ea8 R15: 8992b108
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.916725] FS: 7fc6cca0e700() GS:9b83af0c() knlGS:
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.020509] CS: 0010 DS: ES: CR0: 80050033
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.072342] CR2: 0002 CR3: 000817013000 CR4: 001406e0
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.127206] Call Trace:
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.180602] notifier_call_chain+0x4a/0x70
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.235310] raw_notifier_call_chain+0x16/0x20
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.287923] call_netdevice_notifiers_info+0x35/0x60
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.342951] netdev_upper_dev_unlink+0x72/0xb0
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.395322] bond_upper_dev_unlink.isra.42+0x18/0x40 [bonding]
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.446520] __bond_release_one+0x170/0x550 [bonding]
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 541.499303] ? netdev_info+0x6c/
[Kernel-packages] [Bug 1676388] Re: [zesty] mlx5e OVS fixes
Additional patches are needed; I will send them to the mailing list as well:

375f51e net/mlx5: E-Switch, Don't allow changing inline mode when flows are configured
d85cdcc net/mlx5e: Change the TC offload rule add/del code path to be per NIC or E-Switch
4456f61 devlink: allow to fillup eswitch attrs even if mode_get op does not exist
1a6aa36 devlink: use nla_put_failure goto label instead of out
21e3d2d devlink: rename devlink_eswitch_fill to devlink_nl_eswitch_fill
adf200f devlink: fix the name of eswitch commands
65ba8fb net/mlx5e: Avoid wrong identification of rules on deletion

https://bugs.launchpad.net/bugs/1676388
Title: [zesty] mlx5e OVS fixes
Status in linux package in Ubuntu: Fix Committed
Status in linux source package in Zesty: Fix Committed
[Kernel-packages] [Bug 1676857] [NEW] [zesty] mlx4_core OOM with 32 bit arch
Public bug reported:

We get an OOM when loading the mlx4_core driver on a 32-bit system with many CPUs.

The following upstream patches will fix this issue:
acd7628 mlx4: reduce rx ring page_cache size
3608b13 mlx4: reduce OOM risk on arches with large pages

** Affects: linux (Ubuntu)
Importance: Undecided
Status: New

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1676857

Title: [zesty] mlx4_core OOM with 32 bit arch

Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1676858] [NEW] [zesty] mlx4_core OOM with 32 bit arch
Public bug reported:

We get an OOM when loading the mlx4_core driver on a 32-bit system with many CPUs.

The following upstream patches will fix this issue:
3608b13 mlx4: reduce OOM risk on arches with large pages

** Affects: linux (Ubuntu)
Importance: Undecided
Status: New

** Description changed:

  we get OOM when we load the mlx4_core driver with 32-bit system with rich cpus machines
  The following upstream patches will fix this issue
-
- acd7628 mlx4: reduce rx ring page_cache size
+ 3608b13 mlx4: reduce OOM risk on arches with large pages

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1676858

Title: [zesty] mlx4_core OOM with 32 bit arch

Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1676786] Re: [Zesty] mlx5_core Kernel oops with bonding mode 1 and 6
This commit fixes the issue:

commit e497ec680c4cd51e76bfcdd49363d9ab8d32a757
Author: Talat Batheesh
Date: Tue Mar 28 16:13:41 2017 +0300

net/mlx5: Avoid dereferencing uninitialized pointer

In NETDEV_CHANGEUPPER event the upper_info field is valid only when linking is true. Otherwise it should be ignored.

Fixes: 7907f23adc18 (net/mlx5: Implement RoCE LAG feature)
Signed-off-by: Talat Batheesh
Reviewed-by: Aviv Heller
Reviewed-by: Moni Shoua
Signed-off-by: Saeed Mahameed
Signed-off-by: David S. Miller

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1676786

Title: [Zesty] mlx5_core Kernel oops with bonding mode 1 and 6

Status in linux package in Ubuntu: Incomplete

Bug description:
We get a kernel panic when we set up a bond interface with two Mellanox mlx5 NICs and try to unload the bonding module.

Scenario:

1. Network interfaces configuration

# cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eno1
iface eno1 inet dhcp

#ens1f0
auto ens1f0
iface ens1f0 inet manual
bond-master bond1

auto ens1f1
iface ens1f1 inet manual
bond-master bond1

auto bond1
iface bond1 inet static
address 27.65.194.1
netmask 255.255.255.0
bond-slaves ens1f0 ens1f1
bond-mode 1
bond-primary ens1f0
bond-miimon 100

iface bond1 inet6 static
address 907c:c828:4d05:5bf8::::0002/127

# cat /etc/modprobe.d/bonding.conf
options bonding mode=1

2. ifup bond1
3. modprobe -r bonding
4. OOPS

Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.443796] Oops: [#1] SMP
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.444686] Modules linked in: mlx5_ib mlx5_core bonding mlx4_ib ib_core mlx4_en mlx4_core nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat libcrc32c nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ipmi_ssif intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate joydev input_leds intel_rapl_perf serio_raw lpc_ich hpilo ipmi_si ioatdma ipmi_devintf dca ipmi_msghandler shpchp mac_hid acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.469445] x_tables autofs4 hid_generic psmouse usbhid hid pata_acpi tg3 hpsa ptp scsi_transport_sas devlink pps_core wmi fjes [last unloaded: mlx5_core]
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.473672] CPU: 23 PID: 4846 Comm: ifenslave Not tainted 4.10.0-9-generic #11-Ubuntu
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.475894] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 07/01/2015
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.478038] task: 9b8394e31680 task.stack: b2ed054f4000
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.533408] RIP: 0010:mlx5_lag_netdev_event+0x1e6/0x230 [mlx5_core]
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.590069] RSP: 0018:b2ed054f7bd0 EFLAGS: 00010202
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.646302] RAX: 0002 RBX: 9b7f825f6000 RCX:
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.701966] RDX: RSI: 00040400 RDI: 9b7f840a00b0
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.756395] RBP: b2ed054f7c18 R08: c02fb000 R09: 9b7fa3117ea8
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.810250] R10: R11: 0051a84e R12: 0001
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.863569] R13: 0004 R14: 9b7fa3117ea8 R15: 8992b108
Mar 12 16:44:32 qa-h-vrt-038 kernel: [ 540.916725] FS: 7fc6cca0e700() GS:9b83af0c()
[Kernel-packages] [Bug 1668042] Re: [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr
Thanks, tested and the fix is working properly.

Talat

** Tags removed: verification-needed-xenial

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1668042

Title: [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr

Status in linux package in Ubuntu: In Progress
Status in linux source package in Trusty: In Progress
Status in linux source package in Xenial: Fix Committed
[Kernel-packages] [Bug 1678585] [NEW] [Zesty] TSO doesn't work properly
Public bug reported:

TSO (TCP segmentation offload) is shown as ON by default, but aggregation does not actually happen. After turning it on again, aggregation is observed.
The same behavior is seen with mlx4_en, mlx5_core and igb.

Steps to reproduce:

1. root:~# uname -r
   4.10.0-14-generic

2. root:~# ethtool -i eno2
   driver: igb
   version: 5.4.0-k
   firmware-version: 1.63, 0x89fa
   expansion-rom-version:
   bus-info: :06:00.0
   supports-statistics: yes
   supports-test: yes
   supports-eeprom-access: yes
   supports-register-dump: yes
   supports-priv-flags: no

3. root:~# ethtool -k eno2 | grep offload
   tcp-segmentation-offload: on
   udp-fragmentation-offload: off [fixed]
   generic-segmentation-offload: on
   generic-receive-offload: on
   large-receive-offload: off [fixed]
   rx-vlan-offload: on
   tx-vlan-offload: on
   l2-fwd-offload: off [fixed]
   hw-tc-offload: off [fixed]

4. root@:~# ethtool -K eno2 gso off

5. root@:~# ethtool -k eno2 | grep offload
   tcp-segmentation-offload: on
   udp-fragmentation-offload: off [fixed]
   generic-segmentation-offload: off
   generic-receive-offload: on
   large-receive-offload: off [fixed]
   rx-vlan-offload: on
   tx-vlan-offload: on
   l2-fwd-offload: off [fixed]
   hw-tc-offload: off [fixed]

6. netperf -H 10.195.43.1 -l 4 -t TCP_STREAM -c -C -- -m 15000
   Traffic size less than or equal to 1514 is observed, even though TSO is on.

7. ethtool -K eno2 tso on

8. netperf -H 10.195.43.1 -l 4 -t TCP_STREAM -c -C -- -m 15000
   Traffic size of 64K is observed.

** Affects: linux (Ubuntu)
Importance: Undecided
Status: New

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1678585

Title: [Zesty] TSO doesn't work properly

Status in linux package in Ubuntu: New
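The `ethtool -k` checks in steps 3 and 5 of the report can be scripted when the test is repeated across drivers. The following is only a minimal sketch, assuming the "name: on|off [fixed]" output format shown in the report; the helper name parse_offloads is hypothetical, and the sample text is copied from step 5. Note that it only reads what ethtool reports; as the report says, the reported TSO state did not match the traffic that was observed.

```python
def parse_offloads(text):
    """Parse `ethtool -k` style lines into {feature: (enabled, fixed)}.

    Hypothetical helper: assumes each feature line looks like
    'name: on', 'name: off' or 'name: off [fixed]'.
    """
    features = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep:
            continue  # no colon, not a feature line
        fields = value.split()
        if not fields or fields[0] not in ("on", "off"):
            continue  # e.g. a header line such as "Features for eno2:"
        features[name.strip()] = (fields[0] == "on", "[fixed]" in fields)
    return features


# Output copied from step 5 of the report (after `ethtool -K eno2 gso off`).
STEP5 = """\
tcp-segmentation-offload: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
"""

features = parse_offloads(STEP5)
tso_on, tso_fixed = features["tcp-segmentation-offload"]
gso_on, gso_fixed = features["generic-segmentation-offload"]
print("TSO reported on:", tso_on, "| GSO reported on:", gso_on)
```

On a live system the same function could be fed the captured output of `ethtool -k <interface>`; the interface name and how the output is captured are left to the caller.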
[Kernel-packages] [Bug 1678585] Re: [Zesty] TSO doesn't work properly
The bug also exists in the kernel mentioned in comment #2: 4.11.0-041100rc5-generic

** Tags added: kernel-bug-exists-upstream

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1678585

Title: [Zesty] TSO doesn't work properly

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1678585] Re: [Zesty] TSO doesn't work properly
Unfortunately, no.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1678585

Title: [Zesty] TSO doesn't work properly

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1701892] [NEW] [Artful] mlx5e-Introduce RX Page-Reuse
Public bug reported:

This feature is a Page-Reuse mechanism in non-Striding RQ RX datapath.

A WQE (RX descriptor) buffer is a page, that in most cases was fully wasted on a packet that is much smaller, requiring a new page for the next round.

In this feature, we implement a page-reuse mechanism, that resembles a `SW Striding RQ`. We allow the WQE to reuse its allocated page as much as it could, until the page is fully consumed. In each round, the WQE is capable of receiving packet of maximal size (MTU). Yet, upon the reception of a packet, the WQE knows the actual packet size, and consumes the exact amount of memory needed to build a linear SKB. Then, it updates the buffer pointer within the page accordingly, for the next round.

Feature is mutually exclusive with XDP (packet-per-page) and LRO (session size is a power of two, needs unused page).

Upstream acceptance

Commits:
accd58833237 net/mlx5e: Introduce RX Page-Reuse
bce2b2bf6682 net/mlx5e: Enhance RX SKB headroom logic
78aedd327982 net/mlx5e: Build SKB with exact frag_size

** Affects: linux (Ubuntu)
Importance: Undecided
Status: Incomplete

** Tags: artful

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1701892

Title: [Artful] mlx5e-Introduce RX Page-Reuse

Status in linux package in Ubuntu: Incomplete
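The page-reuse idea described in this report can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the mlx5e implementation: the class RxPage, the function pages_needed and the 4096-byte page size are hypothetical, and the headroom, alignment and SKB overhead that the real driver accounts for are ignored. A page stays in use as long as the room left in it can still hold a maximal-size packet, and each received packet consumes only its actual size.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class RxPage:
    """One RX buffer page with a moving write offset (hypothetical model)."""

    def __init__(self, max_frame, page_size=PAGE_SIZE):
        self.page_size = page_size
        self.max_frame = max_frame  # worst-case packet size (MTU-sized frame)
        self.offset = 0             # next free byte within the page

    def can_post(self):
        # The page may be posted again only if a maximal-size packet
        # would still fit in the remaining space.
        return self.page_size - self.offset >= self.max_frame

    def receive(self, pkt_len):
        if pkt_len > self.max_frame:
            raise ValueError("packet larger than max_frame")
        if not self.can_post():
            raise RuntimeError("page is consumed, a new page is needed")
        start = self.offset
        self.offset += pkt_len      # consume only the actual packet size
        return start


def pages_needed(pkt_sizes, max_frame, page_size=PAGE_SIZE, reuse=True):
    """Count the pages allocated to receive the given packet sizes."""
    pages = 0
    page = None
    for size in pkt_sizes:
        if page is None or not reuse or not page.can_post():
            page = RxPage(max_frame, page_size)
            pages += 1
        page.receive(size)
    return pages


small = [64] * 10
print("without reuse:", pages_needed(small, 1500, reuse=False))
print("with reuse:   ", pages_needed(small, 1500, reuse=True))
```

With small packets the difference is large because one page serves many receives, while with MTU-sized packets the saving shrinks to the number of worst-case frames that fit in a page.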
[Kernel-packages] [Bug 1668019] [NEW] [UBUNTU Zesty] mlx5 - Improve OVS offload driver
Public bug reported:

The patches in this series should improve the OVS offload driver and include some mlx5 driver fixes, by reflecting the HW offload status so that it is possible to query whether a filter is offloaded to HW or not.
These patches are already accepted upstream in net-next (4.11).

The following patches should improve the OVS offload driver:

PICK: 5cecb6c net/sched: cls_bpf: Reflect HW offload status
PICK: 24d3dc6 net/sched: cls_u32: Reflect HW offload status
PICK: c7d2b2f net/sched: cls_matchall: Reflect HW offloading status
PICK: 5559396 net/sched: cls_flower: Reflect HW offload status
PICK: e696028 net/sched: Reflect HW offload status
PICK: 7a335ad net/sched: cls_matchall: Dump the classifier flags
PICK: 749e672 net/sched: cls_flower: Properly handle classifier flags dumping
PICK: a61d5ce net/mlx5: Fix static checker warnings
PICK: 264d7bf net/mlx5: E-Switch, Enlarge the FDB size for the switchdev mode
PICK: ce99f6b net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels
PICK: 9a94111 net/mlx5e: Maximize ip tunnel key usage on the TC offloading path
PICK: 76f7444 net/mlx5e: Use the full tunnel key info for encapsulation offload house-keeping
PICK: 75c33da net/mlx5e: TC ipv4 tunnel encap offload cosmetic changes
PICK: 19f4440 net/mlx5e: Add TC offloads matching on IPv6 encapsulation headers
PICK: 073ff3c net/mlx5: Use exact encap header size for the FW input buffer
PICK: 7898489 IB/mlx5: Enable Eth VFs to query their min-inline value for user-space
PICK: 8c7245a6 net/mlx5: Push min-inline mode resolution helper into the core
PICK: a3308d8 net/sched: cls_flower: Disallow duplicate internal elements

** Affects: linux (Ubuntu)
Importance: Undecided
Status: New

** Tags: zesty

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1668019

Title: [UBUNTU Zesty] mlx5 - Improve OVS offload driver

Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1668042] [NEW] [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr
Public bug reported:

In Ubuntu Xenial with kernel 4.4.0-65, we get a kernel panic after scenario [1]. Patch [2] should fix the issue.

When using an IPoIB bond currently only active-backup mode is a valid use case and this commit strengthens it.

Since commit 2ab82852a270 ("net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()") was introduced till 4.7-rc1, IPoIB didn't support the set_mac_address ndo, and hence the fail over mac policy always applied to IPoIB bonds.

With the introduction of commit 492a7e67ff83 ("IB/IPoIB: Allow setting the device address"), that doesn't hold and practically IPoIB bonds are broken as of that.

To fix it, lets go to fail over mac if the device doesn't support the ndo OR this is IPoIB device.

As a by-product, this commit also prevents a stack corruption which occurred when trying to copy 20 bytes (IPoIB) device address to a sockaddr struct that has only 16 bytes of storage.

[1] Get panic after create bond with down/updelay and restart NIC driver

Configure bond with down/updelay

cat /etc/network/interfaces
auto bond1
iface bond1 inet static
address 31.136.42.17
netmask 255.255.0.0
bond-slaves ib0 ib1
bond-miimon 100
bond-updelay 5000
bond-mode active-backup
bond-primary ib1
bond-downdelay 5000

auto ib0
iface ib0 inet manual
bond-master bond1

auto ib1
iface ib1 inet manual
bond-master bond1

modprobe -r

[2] 1533e77315220dc1d5ec3bd6d9fe32e2aa0a74c0 net/bonding: Enforce active-backup policy for IPoIB bonds

** Affects: linux (Ubuntu)
Importance: Undecided
Status: New

** Tags: xenial

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1668042

Title: [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr

Status in linux package in Ubuntu: New
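The size mismatch behind the stack corruption mentioned in this report can be checked with a few lines. This is a sketch for illustration, not the bonding driver code: the SockAddr class mirrors the classic `struct sockaddr` layout (a 2-byte family followed by 14 bytes of address data), the 20-byte IPoIB address length is taken from the report, and the helper name fits_in_sockaddr is hypothetical.

```python
import ctypes


class SockAddr(ctypes.Structure):
    # Mirrors the classic `struct sockaddr` layout:
    # a 2-byte address family followed by 14 bytes of address data.
    _fields_ = [
        ("sa_family", ctypes.c_ushort),
        ("sa_data", ctypes.c_char * 14),
    ]


SA_DATA_LEN = 14      # bytes available for the hardware address
ETH_ADDR_LEN = 6      # Ethernet MAC address length
IPOIB_ADDR_LEN = 20   # IPoIB device address length, as stated in the report


def fits_in_sockaddr(addr_len):
    """Return True if an address of addr_len bytes fits in sa_data."""
    return addr_len <= SA_DATA_LEN


print("sizeof(struct sockaddr):", ctypes.sizeof(SockAddr))
print("Ethernet address fits:", fits_in_sockaddr(ETH_ADDR_LEN))
print("IPoIB address fits:   ", fits_in_sockaddr(IPOIB_ADDR_LEN))
```

An Ethernet address fits with room to spare, while a 20-byte IPoIB address exceeds both the 14-byte data field and the 16-byte structure, which is why an unchecked copy writes past the end of the structure.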
[Kernel-packages] [Bug 1668042] Re: [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr
Thank you, I will test and update.

By the way, I sent the patch to kernel-t...@lists.canonical.com.

Thank you,
Talat

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1668042

Title: [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr

Status in linux package in Ubuntu: In Progress
Status in linux source package in Xenial: In Progress
[Kernel-packages] [Bug 1668019] Re: [UBUNTU Zesty] mlx5 - Improve OVS offload driver
The patches have already been sent to kernel-t...@lists.canonical.com.

Thanks,
Talat

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1668019

Title: [UBUNTU Zesty] mlx5 - Improve OVS offload driver

Status in linux package in Ubuntu: Triaged
Status in linux source package in Zesty: Triaged
[Kernel-packages] [Bug 1668042] Re: [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr
Could you please also add this fix to Trusty?

** Also affects: linux-lts-trusty (Ubuntu)
   Importance: Undecided
       Status: New

** No longer affects: linux-lts-trusty (Ubuntu)

** No longer affects: linux-lts-trusty (Ubuntu Xenial)

--
https://bugs.launchpad.net/bugs/1668042

Title:
  [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress

Bug description:
  In Ubuntu Xenial with kernel 4.4.0-65, we get a kernel panic after scenario [1]. Patch [2] should fix the issue.

  When using an IPoIB bond, currently only active-backup mode is a valid use case, and this commit strengthens it.

  Since commit 2ab82852a270 ("net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()") was introduced, and until 4.7-rc1, IPoIB didn't support the set_mac_address ndo, and hence the fail over mac policy always applied to IPoIB bonds.

  With the introduction of commit 492a7e67ff83 ("IB/IPoIB: Allow setting the device address"), that no longer holds, and in practice IPoIB bonds have been broken since then.

  To fix it, let's go to fail over mac if the device doesn't support the ndo OR this is an IPoIB device.

  As a by-product, this commit also prevents a stack corruption which occurred when trying to copy a 20-byte (IPoIB) device address to a sockaddr struct that has only 16 bytes of storage.

  [1] Panic after creating a bond with down/updelay and restarting the NIC driver

  Configure the bond with down/updelay:

  cat /etc/network/interfaces

  auto bond1
  iface bond1 inet static
      address 31.136.42.17
      netmask 255.255.0.0
      bond-slaves ib0 ib1
      bond-miimon 100
      bond-updelay 5000
      bond-mode active-backup
      bond-primary ib1
      bond-downdelay 5000

  auto ib0
  iface ib0 inet manual
      bond-master bond1

  auto ib1
  iface ib1 inet manual
      bond-master bond1

  modprobe -r

  [2] 1533e77315220dc1d5ec3bd6d9fe32e2aa0a74c0 net/bonding: Enforce active-backup policy for IPoIB bonds
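The size mismatch behind the stack corruption mentioned in the bug description can be illustrated with a short sketch. This is not the kernel code: the helper `copy_hw_addr` and the constant names are hypothetical, and the layout assumed is a generic `struct sockaddr` (a 2-byte family followed by 14 bytes of data, 16 bytes in total) against the 20-byte IPoIB device address named in the report.

```python
import struct

# Assumed layout of a generic struct sockaddr: 2-byte family + 14 bytes of data.
SOCKADDR_FMT = "=H14s"
SOCKADDR_SIZE = struct.calcsize(SOCKADDR_FMT)  # 16 bytes of storage
SA_DATA_LEN = 14
IPOIB_ADDR_LEN = 20  # IPoIB device address length, per the bug description


def copy_hw_addr(hw_addr: bytes) -> bytes:
    """Hypothetical bounded copy: reject addresses that do not fit in sa_data."""
    if len(hw_addr) > SA_DATA_LEN:
        raise ValueError(
            f"{len(hw_addr)}-byte address does not fit in {SA_DATA_LEN}-byte sa_data"
        )
    # The family value 1 is only a placeholder for this illustration.
    return struct.pack(SOCKADDR_FMT, 1, hw_addr)


packed = copy_hw_addr(bytes(6))  # a 6-byte Ethernet MAC fits
try:
    copy_hw_addr(bytes(IPOIB_ADDR_LEN))  # a 20-byte IPoIB address does not
    overflow_detected = False
except ValueError:
    overflow_detected = True
```

An unchecked copy of 20 bytes into such a structure would write past its end, which is the kind of overrun the referenced commit avoids as a by-product.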
[Kernel-packages] [Bug 1668042] Re: [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr
Thank you, the fix works as expected in Xenial.

** Tags added: verification-done

--
https://bugs.launchpad.net/bugs/1668042

Title:
  [Xenial - 16.04 ]Bonding driver - stack corruption when trying to copy 20 bytes to a sockaddr

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress
[Kernel-packages] [Bug 1552627] [NEW] mlx4_en didn't choose time-stamping shift value according to HW frequency
Public bug reported:

Hi,

Previously, the shift value used for time-stamping was constant and did not depend on the HW chip frequency. Change that to take the frequency into account and calculate the maximal value in cycles per wraparound of ten seconds. This time slot was chosen since it gives good accuracy in time synchronization.

Algorithm for shift value calculation:
 * Round up the maximal value in cycles to the nearest power of two
 * Calculate the maximal multiplier by dividing a value with all 64 bits set by the above result
 * Then invert the function clocksource_khz2mult() to get the shift from the maximal mult value

Below is the upstream commit that should fix the issue:

commit 31c128b66e5b28f468076e4f3ca3025c35342041

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: trusty wily xenial

--
https://bugs.launchpad.net/bugs/1552627

Title:
  mlx4_en didn't choose time-stamping shift value according to HW frequency

Status in linux package in Ubuntu:
  New
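The three steps of the shift calculation in this report can be sketched in Python as follows. This is an illustration under stated assumptions, not the driver code: the helper name `freq_to_shift` and the MHz input are chosen here, and the sketch assumes that clocksource_khz2mult() computes roughly mult = (10^6 << shift) / khz, so that the inverse is shift = log2(mult * khz / 10^6).

```python
WRAP_AROUND_SEC = 10     # wraparound period named in the report
U64_MAX = (1 << 64) - 1  # "all 64 bits set"


def freq_to_shift(freq_mhz: int) -> int:
    """Return a shift value for a clock running at freq_mhz (hypothetical helper)."""
    freq_khz = freq_mhz * 1000
    # Maximal number of cycles within one wraparound period.
    max_cycles = freq_khz * 1000 * WRAP_AROUND_SEC
    # Step 1: round up to the nearest power of two.
    max_cycles_rounded = 1 << (max_cycles - 1).bit_length()
    # Step 2: largest multiplier that keeps cycles * mult within 64 bits.
    max_mult = U64_MAX // max_cycles_rounded
    # Step 3: invert mult = (10^6 << shift) / khz; floor(log2(x)) for x > 0.
    return (max_mult * freq_khz // 1_000_000).bit_length() - 1
```

With these assumptions, a 427 MHz clock (an arbitrary example value) gives 4.27e9 cycles per ten seconds, which rounds up to 2^32 and results in a shift of 30.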
[Kernel-packages] [Bug 1552632] [NEW] mlx4_core Set UAR page size to 4KB regardless of system page size
Public bug reported:

Problem description:
The current code sets the UAR page size equal to the system page size. The ConnectX-3 and ConnectX-3 Pro HWs require a minimum of 128 UAR pages. The mlx4 kernel drivers are not loaded if there are fewer than 128 UAR pages.

Solution:
Always set the UAR page size to 4KB. This allows more UAR pages if the OS has a PAGE_SIZE larger than 4KB. For example, the PowerPC kernel uses a 64KB system page size; with a 4MB uar region, there are 4MB/2/64KB = 32 uars (half for uar, half for blueflame). This does not meet the minimum requirement of 128 UAR pages. With a 4KB UAR page, there are 4MB/2/4KB = 512 uars, which meets the minimum requirement.

Note that only the code in mlx4_core that deals with firmware knows that the uar page size is 4KB. Code that deals with the usr page in cq and qp context (mlx4_ib, mlx4_en and part of mlx4_core) still assumes that the uar page size equals the system page size.

Note that with this implementation, on a kernel with a 64KB system page size, there are 16 uars per system page but only one uar is used. The other 15 uars are ignored because of the above assumption.

Regarding SR-IOV, mlx4_core in the hypervisor will set the uar page size to 4KB, and mlx4_core code in the virtual OS will obtain the uar page size from firmware.

Regarding backward compatibility in SR-IOV: if the hypervisor has this new code, the virtual OS must be updated. If the hypervisor has old code and the virtual OS has this new code, the new code will be backward compatible with the old code. If the uar size is big enough, this new code in the VF continues to work with a 64KB uar page size (on a PowerPC kernel). If the uar size does not meet the 128 uars requirement, this new code is not loaded in the VF and prints the same error message as the old code in the hypervisor.

Below is the upstream commit id:
85743f1eb34548ba4b056d2f184a3d107a3b8917

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: trusty wily xenial

--
https://bugs.launchpad.net/bugs/1552632

Title:
  mlx4_core Set UAR page size to 4KB regardless of system page size

Status in linux package in Ubuntu:
  New
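The UAR arithmetic in this report can be checked with a few lines of Python. The helper name `usable_uars` is hypothetical; the region size, page sizes and the 128-page minimum are the figures quoted in the report.

```python
KB = 1024
MB = 1024 * KB
MIN_UARS = 128  # minimum number of UAR pages required, per the report


def usable_uars(uar_region_bytes: int, uar_page_bytes: int) -> int:
    """Half of the region is used for UARs, the other half for BlueFlame."""
    return uar_region_bytes // 2 // uar_page_bytes


uars_64k = usable_uars(4 * MB, 64 * KB)  # UAR page size == 64KB system page size
uars_4k = usable_uars(4 * MB, 4 * KB)    # UAR page size fixed at 4KB
uars_per_system_page = (64 * KB) // (4 * KB)

print(uars_64k, uars_64k >= MIN_UARS)  # 32 False
print(uars_4k, uars_4k >= MIN_UARS)    # 512 True
print(uars_per_system_page)            # 16
```

The last value matches the remark that a 64KB system page holds 16 UARs of 4KB each, of which only one is used.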
[Kernel-packages] [Bug 1557950] Re: mlx5_core kernel trace after "ethtool -C eth1 adaptive-rx on" flow
** Patch added: "0002-net-mlx5e-Don-t-modify-CQ-before-it-was-created.patch"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1557950/+attachment/4600895/+files/0002-net-mlx5e-Don-t-modify-CQ-before-it-was-created.patch

--
https://bugs.launchpad.net/bugs/1557950

Title:
  mlx5_core kernel trace after "ethtool -C eth1 adaptive-rx on" flow

Status in linux package in Ubuntu:
  New
[Kernel-packages] [Bug 1557950] [NEW] mlx5_core kernel trace after "ethtool -C eth1 adaptive-rx on" flow
Public bug reported:

Reproduce steps:

# ethtool -c eth1
Coalesce parameters for eth1:
Adaptive RX: off  TX: off

# ethtool -C eth1 adaptive-rx on

# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04 (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
UBUNTU_CODENAME=xenial

# uname -a
Linux dev-h-vrt-006 4.4.0-11-generic #26-Ubuntu SMP Sat Mar 5 14:25:21 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

# dmesg
[174430.803529] mst_pci: module verification failed: signature and/or required key missing - tainting kernel
[174453.001485] BUG: unable to handle kernel NULL pointer dereference at (null)
[174453.001509] IP: [] mlx5e_set_coalesce+0x6e/0x100 [mlx5_core]
[174453.001535] PGD 81a5c7067 PUD 81aa93067 PMD 0
[174453.001556] Oops: [#1] SMP
[174453.001571] Modules linked in: mst_pciconf(OE) mst_pci(OE) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache mlx5_ib ib_core ib_addr vfio_pci vfio_virqfd vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul aesni_intel kvm_intel aes_x86_64 ipmi_ssif lrw gf128mul glue_helper ablk_helper cryptd serio_raw kvm sb_edac edac_core irqbypass hpilo ipmi_si 8250_fintek ioatdma ipmi_msghandler acpi_power_meter mac_hid lpc_ich shpchp sunrpc autofs4 mlx4_en
[174453.001928] psmouse ixgbe dca vxlan pata_acpi ip6_udp_tunnel udp_tunnel mdio hpsa mlx5_core tg3 scsi_transport_sas mlx4_core ptp pps_core wmi fjes
[174453.002011] CPU: 2 PID: 40824 Comm: ethtool Tainted: G OE 4.4.0-11-generic #26-Ubuntu
[174453.002026] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 12/20/2013
[174453.002037] task: 8800bd919b80 ti: 880814d74000 task.ti: 880814d74000
[174453.002072] RIP: 0010:[] [] mlx5e_set_coalesce+0x6e/0x100 [mlx5_core]
[174453.002119] RSP: 0018:880814d77c30 EFLAGS: 00010246
[174453.002141] RAX: RBX: 880816e0 RCX:
[174453.002176] RDX: RSI: 880814d77c74 RDI: 880814e6
[174453.002210] RBP: 880814d77c60 R08: 81e42520 R09: ff00
[174453.002245] R10: 0533 R11: 0246 R12: 880814d77c74
[174453.002280] R13: 880814e6 R14: R15: 880814e6
[174453.002316] FS: 7f7e927f7700() GS:88081f68() knlGS:
[174453.002352] CS: 0010 DS: ES: CR0: 80050033
[174453.002374] CR2: CR3: 000818365000 CR4: 001406e0
[174453.002409] Stack:
[174453.002427] 17a4b290 880814e6 000600114bb3 000f
[174453.002476] 8946 880814e6 880814d77ce0 8171784a
[174453.002525] 000f0010 00200010 00200010
[174453.002574] Call Trace:
[174453.002597] [] ethtool_set_coalesce+0x5a/0x80
[174453.002621] [] dev_ethtool+0xe78/0x1d70
[174453.002645] [] ? page_cache_async_readahead+0x6b/0x70
[174453.002670] [] ? page_add_file_rmap+0x25/0x60
[174453.002694] [] ? __rtnl_unlock+0x15/0x20
[174453.002717] [] ? netdev_run_todo+0x61/0x320
[174453.002741] [] dev_ioctl+0x182/0x580
[174453.002765] [] sock_do_ioctl+0x42/0x50
[174453.002788] [] sock_ioctl+0x1d2/0x290
[174453.002811] [] do_vfs_ioctl+0x29f/0x490
[174453.003136] [] ? __do_page_fault+0x1b4/0x400
[174453.003161] [] ? fd_install+0x25/0x30
[174453.003183] [] SyS_ioctl+0x79/0x90
[174453.003208] [] entry_SYSCALL_64_fastpath+0x16/0x71
[174453.003231] Code: 66 89 87 36 66 00 00 8b 46 08 66 89 87 38 66 00 00 0f 84 91 00 00 00 49 89 fd 49 89 f4 48 63 45 d4 49 8b 95 00 13 00 00 45 31 f6 <4c> 8b 3c c2 41 80 bf dc 18 00 00 00 74 3d 49 63 c6 41 0f b7 4c
[174453.003564] RIP [] mlx5e_set_coalesce+0x6e/0x100 [mlx5_core]
[174453.003607] RSP
[174453.003626] CR2:
[174453.004055] ---[ end trace 8466dfbb422a27d8 ]---

Fix upstream commits
7524a5d88b94afef8397a79f1e664af5b7052c22 net/mlx5e: Don't modify CQ before it was created
2fcb92fbd04eef26dfe7e67839da6262d83d6b65 net/mlx5e: Don't try to modify CQ moderation if
[Kernel-packages] [Bug 1557950] Re: mlx5_core kernel trace after "ethtool -C eth1 adaptive-rx on" flow
** Patch added: "0001-net-mlx5e-Don-t-try-to-modify-CQ-moderation-if-it-is.patch"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1557950/+attachment/4600894/+files/0001-net-mlx5e-Don-t-try-to-modify-CQ-moderation-if-it-is.patch

--
https://bugs.launchpad.net/bugs/1557950

Title:
  mlx5_core kernel trace after "ethtool -C eth1 adaptive-rx on" flow

Status in linux package in Ubuntu:
  New
[Kernel-packages] [Bug 1540435] Re: Introducing ConnectX-4 Ethernet SRIOV
Hi,

Could you please tell me when we will have this feature in the Xenial kernel? I didn't see the patches in the master branch yet.

Thanks,
Talat

--
https://bugs.launchpad.net/bugs/1540435

Title:
  Introducing ConnectX-4 Ethernet SRIOV

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed

Bug description:
  Hi,

  This patchset introduces support for Ethernet SRIOV in the ConnectX-4 family of 100G Ethernet NICs.

  Basic Introduction:

  The ConnectX-4 HW architecture provides two kinds of underlying HW switches.

  MPFS (Multi Physical Function Switch), or L2 Table in software terms:
  The HCA has one MPFS switch per physical port. This switch is responsible for forwarding Unicast traffic to the various overlying Physical Functions (PFs). Multicast traffic is flooded amongst all the PFs. Each PF can request to forward a unicast MAC to its E-Switch Uplink vport (which we will cover later) through the SET_L2_TABLE_ENTRY HW command. MPFS has five ports: four are connected to PFs (one for each) and one is connected directly to the Physical Port (Physical Link).

  E-Switch (Ethernet Switch):
  The HCA has one per physical function. The main responsibility of this component is to forward Unicast/Multicast and vlan tagged/untagged traffic to the various Virtual Functions (VFs) allocated by the PF. Unlike MPFS, the PF needs to explicitly create the E-Switch FDB table, which is a HW flow table managed by the PF driver whenever the vport_group_manager capability bit is set for this PF.

  E-Switch has Virtual Port (vport) entities as its ports. vport0 and the uplink vport are special kinds of vports: vport0 represents the PF vport, and the uplink vport is connected to the MPFS switch (if it exists) as the PF external link. vport1..vportN represent VF0..VF(N-1) egress/ingress ports.

  E-Switch FDB contains forwarding rules such as:
  UC MAC0 -> vport0(PF).
  UC MAC1 -> vport1.
  UC MAC2 -> vport2.
  MC MACX -> vport0, vport2, Uplink.
  MC MACY -> vport1, Uplink.

  For unmatched traffic FDB has the following default rules:
  Unmatched Traffic (src vport != Uplink) -> Uplink.
  Unmatched Traffic (src vport == Uplink) -> vport0(PF).

  NIC VPort context:
  Each NIC (VF/PF) has its own vport context, which is used to store the current NIC vport context (UC/MC and vlan lists) and other NIC properties such as MTU, promisc mode, etc. The NIC (VF/PF) driver is responsible for constantly updating this context.

  FDB rules population:
  Each NIC vport (VF/PF) will notify the E-Switch manager of its UC/MC vport context changes via the modify vport context command. This is translated to an event that is handled by the E-Switch manager (PF), which updates the FDB table accordingly.

  Both PF and VF use the same driver and submit commands directly to the firmware. The PF sees the vport_group_manager capability bit and as such runs the code to populate the embedded switches as explained above.

  The patchset is organized as follows:

  Patches 1-2 introduce the basic PCI SRIOV functionality and the support of ConnectX-4 for enabling specific VFs via the enable/disable HCA commands. These two patches will also be used later for the IB SRIOV flow.

  Patches 3-8 introduce the basic E-Switch capabilities and commands, to be used later by the VF to modify and update its NIC vport context, and by the PF (E-Switch Manager) driver to query the VF NIC context and act accordingly.

  Patches 9-10 provide the functionality a NIC driver (VF/PF) needs to support SRIOV, mainly vport context update support.

  Patch 11 ("net/mlx5: Introducing E-Switch and l2 table") introduces the basic E-Switch support and the infrastructure to read vport context events and to update the MPFS L2 Table with the UC mac addresses requested by the PF.

  Patches 12-18 introduce SRIOV enablement and E-Switch FDB table management. They add the basic E-Switch public API to set and get sriov properties, to be used in the PF netdev sriov ndos.

  The patchset was applied on top of commit 3f8c0f7 "gianfar: use of_property_read_bool()"

  Saeed, Eli and Or.

  Eli Cohen (2):
    net/mlx5_core: Modify enable/disable hca functions
    net/mlx5_core: Add base sriov support

  Saeed Mahameed (18):
    net/mlx5: Add HW capabilities and structs for SR-IOV E-Switch.
    net/mlx5: Update access functions to Query/Modify vport MAC address
    net/mlx5: Introduce access functions to modify/query vport mac lists
    net/mlx5: Introduce access functions to modify/query vport state
    net/mlx5: Introduce access functions to modify/query vport promisc mode
    net/mlx5: Introduce access functions to modify/query vport vlans
    net/mlx5e: Write UC/MC list and promisc mode into vport context
    net/mlx5e: Write vla
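The E-Switch FDB forwarding rules and default rules listed in the bug description above can be modelled with a small lookup table. This is only a toy model for illustration: the `forward` function, the dictionary layout and the MAC placeholder strings are hypothetical, and the real FDB is a HW flow table managed by the PF driver.

```python
UPLINK = "uplink"
PF_VPORT = "vport0"

# Example FDB contents mirroring the rules listed in the description.
# The MAC strings are placeholders, not real addresses.
fdb = {
    "UC MAC0": {PF_VPORT},
    "UC MAC1": {"vport1"},
    "UC MAC2": {"vport2"},
    "MC MACX": {PF_VPORT, "vport2", UPLINK},
    "MC MACY": {"vport1", UPLINK},
}


def forward(dst_mac: str, src_vport: str) -> set:
    """Return the destination vports for a frame (hypothetical model)."""
    if dst_mac in fdb:
        return set(fdb[dst_mac])
    # Default rules for unmatched traffic.
    if src_vport != UPLINK:
        return {UPLINK}   # unmatched traffic from a VF/PF goes to the uplink
    return {PF_VPORT}     # unmatched traffic from the uplink goes to the PF
```

For example, `forward("UC MAC1", UPLINK)` selects vport1, while an unknown destination arriving from vport1 falls through to the uplink.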
[Kernel-packages] [Bug 1528466] [NEW] Mellanox ConnectX4 MTU limits: max and min
Public bug reported:

The max MTU limit should be 9978.

Tested on the Ubuntu 14.04.4 daily build; this issue also exists in 15.10.

Reproduce:

# ifconfig p2p1 mtu 900
SIOCSIFMTU: Invalid argument

# dmesg
[62898.559808] mlx5_core :20:00.0 p2p1: mlx5e_change_mtu: Bad MTU (900) > (1) Max
[63058.512668] command failed, status bad parameter(0x3), syndrome 0x648afc

Upstream commit that fixes the issue:

commit 60825c35bf023553f8524f6695f176236e54df97
Author: Doron Tsur
Date: Thu Nov 12 19:35:27 2015 +0200

    net/mlx5e: Max mtu comparison fix

    On change mtu the driver compares between hardware queried mtu and software requested mtu. We need to compare between software representation of the queried mtu and the requested mtu.

    Fixes: facc9699f0fe ('net/mlx5e: Fix HW MTU settings')
    Signed-off-by: Doron Tsur
    Signed-off-by: Saeed Mahameed
    Signed-off-by: Or Gerlitz
    Signed-off-by: David S. Miller

# cat /etc/os-release
NAME="Ubuntu"
VERSION="14.04.3 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.3 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

# uname -r
4.2.0-22-generic

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "net/mlx5e: Max mtu comparison fix"
   https://bugs.launchpad.net/bugs/1528466/+attachment/4538790/+files/0001-net-mlx5e-Max-mtu-comparison-fix.patch

--
https://bugs.launchpad.net/bugs/1528466

Title:
  Mellanox ConnectX4 MTU limits: max and min

Status in linux package in Ubuntu:
  New
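The comparison described in the commit message of this report, checking the requested MTU against the software representation of the queried hardware MTU, can be sketched as follows. The sketch rests on assumptions that are not stated in the report: that the hardware MTU counts 22 bytes of L2 overhead (Ethernet header, one VLAN tag and FCS) on top of the software MTU, and that the queried hardware maximum is 10000, which is chosen here only because it yields the 9978 limit named above. All function names are hypothetical.

```python
ETH_HLEN = 14    # Ethernet header length
VLAN_HLEN = 4    # one VLAN tag
ETH_FCS_LEN = 4  # frame check sequence
L2_OVERHEAD = ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN  # assumed HW/SW MTU difference


def hw_to_sw_mtu(hw_mtu: int) -> int:
    """Software representation of a hardware MTU (assumed conversion)."""
    return hw_mtu - L2_OVERHEAD


def is_valid_mtu(requested_sw_mtu: int, max_hw_mtu: int) -> bool:
    """Compare like with like: software MTU against software maximum."""
    return requested_sw_mtu <= hw_to_sw_mtu(max_hw_mtu)


ASSUMED_MAX_HW_MTU = 10000  # assumption, see the paragraph above
max_sw_mtu = hw_to_sw_mtu(ASSUMED_MAX_HW_MTU)
```

Comparing the requested value directly with the hardware maximum would mix the two representations, which is the mismatch the commit message describes.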
[Kernel-packages] [Bug 1528466] Re: Mellanox ConnectX4 MTU limits: max and min
** Tags added: trusty
** Tags added: vivid wily

--
https://bugs.launchpad.net/bugs/1528466
Title: Mellanox ConnectX4 MTU limits: max and min
Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1507482] Re: ethtool self-test failed and creates a FW reset.
Hi,

Any update on this bug? Could you please add the fix to 14.04.4?

Thanks,
Talat

--
https://bugs.launchpad.net/bugs/1507482
Title: ethtool self-test failed and creates a FW reset.
Status in linux package in Ubuntu: In Progress

Bug description:

On a ConnectX3 device we tried to run the self-test and it failed; we also saw that it causes a FW reset.

Steps to reproduce:

~# ethtool -t ens2
The test result is FAIL
The test extra info:
Interrupt Test   -16
Link Test        0
Speed Test       0
Register Test    0
Loopback Test    0

~# dmesg
[341548.619799] mlx4_core :07:00.0: command 0x31 timed out (go bit not cleared)
[341548.803801] mlx4_core :07:00.0: command 0x49 timed out (go bit not cleared)
[341548.803808] mlx4_core :07:00.0: device is going to be reset
[341549.816587] mlx4_core :07:00.0: device was reset successfully
[341549.823425] mlx4_en :07:00.0: Internal error detected, restarting device

~# cat /etc/os-release
NAME="Ubuntu"
VERSION="15.10 (Wily Werewolf)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu Wily Werewolf (development branch)"
VERSION_ID="15.10"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

~# uname -r
4.2.0-7-generic

~# ethtool -i ens2
driver: mlx4_en
version: 2.2-1 (Feb 2014)
firmware-version: 2.34.5000
bus-info: :07:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[Kernel-packages] [Bug 1528466] Re: Mellanox ConnectX4 MTU limits: max and min
Thank you,

Tested on 15.10 and it's working properly. You are right, this needs to be fixed in Wily and in the HWE kernels in Trusty.

Yours,
Talat

--
https://bugs.launchpad.net/bugs/1528466
Title: Mellanox ConnectX4 MTU limits: max and min
Status in linux package in Ubuntu: In Progress
Status in linux-lts-vivid package in Ubuntu: Incomplete
Status in linux-lts-wily package in Ubuntu: In Progress
Status in linux source package in Vivid: Incomplete
Status in linux source package in Wily: In Progress
[Kernel-packages] [Bug 1531132] [NEW] Mellanox ConnectX3 PTP does not work with linux ptp and timekeeper after reboot
Public bug reported:

After a machine reboot, syncing two clocks with ptp4l and timekeeper is impossible.

server: ptp4l -i eth4 -m
client: ptp4l -i eth4 -m -s

It works only after a driver restart.

Attached are two patches that should fix this issue:

0001-net-mlx4_en-Remove-dependency-between-timestamping-c.patch
0002-net-mlx4_en-Fix-HW-timestamp-init-issue-upon-system-.patch

These two commits were installed on Ubuntu 15.10 and they fix the issue. Please add the fix to 14.04.4 as well.

Thanks,
Talat

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

** Tags: trusty wily

** Attachment added: "Remove-dependency-between-timestamping"
   https://bugs.launchpad.net/bugs/1531132/+attachment/4544283/+files/0001-net-mlx4_en-Remove-dependency-between-timestamping-c.patch

--
https://bugs.launchpad.net/bugs/1531132
Title: Mellanox ConnectX3 PTP does not work with linux ptp and timekeeper after reboot
Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1531132] Re: Mellanox ConnectX3 PTP does not work with linux ptp and timekeeper after reboot
** Patch added: "Fix-HW-timestamp-init-issue-upon-system"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1531132/+attachment/4544284/+files/0002-net-mlx4_en-Fix-HW-timestamp-init-issue-upon-system-.patch

--
https://bugs.launchpad.net/bugs/1531132
Title: Mellanox ConnectX3 PTP does not work with linux ptp and timekeeper after reboot
Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1531203] [NEW] ConnectX-5 is displayed as ConnectX-4 in 'lspci'
Public bug reported:

When testing ConnectX-5 in Ubuntu we noticed the following in lspci:

[root@localhost network-scripts]# lspci | grep -i mel
00:09.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
00:0a.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-4]

ConnectX-5 is printed as ConnectX-4, but it is fully functioning and working.

The reason is a typo in the pci.ids file, which is fixed in the following commit from https://github.com/pciutils/pciids.git:

commit 8084d2f3e776d53ffdb41feb13b75cd9a96fc1be
Author: The PCI ID Mail Robot
Date: Thu Dec 10 03:15:13 2015 +0100

    New snapshot generated

  . . .
  1014 MT27700 Family [ConnectX-4 Virtual Function]
  1015 MT27710 Family [ConnectX-4 Lx]
  1016 MT27710 Family [ConnectX-4 Lx Virtual Function]
- 1017 MT28800 Family [ConnectX-4]
+ 1017 MT28800 Family [ConnectX-5]
  1018 MT28800 Family [ConnectX-5 Virtual Function]
  5274 MT21108 InfiniBridge
  5a44 MT23108 InfiniHost
  . . .

Please update the pciutils package to the latest version available.

More info about the bug in the file /usr/share/misc/pci.ids:

line 17310--> 020d MT28800 Family [ConnectX-5 Flash Recovery]
line 17342--> 1017 MT28800 Family [ConnectX-4]

dpkg --list | grep pciutils
ii pciutils 1:3.3.1-1ubuntu1 amd64 Linux PCI Utilities

dpkg -L pciutils
/.
/usr
/usr/share
/usr/share/misc
/usr/share/misc/pci.ids
/usr/share/doc
/usr/share/doc/pciutils
/usr/share/doc/pciutils/examples
/usr/share/doc/pciutils/examples/example.c
/usr/share/doc/pciutils/TODO.Debian
/usr/share/doc/pciutils/copyright
/usr/share/doc/pciutils/README.gz
/usr/share/man
/usr/share/man/man8
/usr/share/man/man8/lspci.8.gz
/usr/share/man/man8/pcimodules.8.gz
/usr/share/man/man8/setpci.8.gz
/usr/share/man/man8/update-pciids.8.gz
/usr/share/man/man7
/usr/share/man/man7/pcilib.7.gz
/usr/bin
/usr/bin/pcimodules
/usr/bin/lspci
/usr/bin/setpci
/usr/sbin
/usr/sbin/update-pciids
/usr/share/doc/pciutils/changelog.Debian.gz

Thanks,
Talat

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

--
https://bugs.launchpad.net/bugs/1531203
Title: ConnectX-5 is displayed as ConnectX-4 in 'lspci'
Status in linux package in Ubuntu: New
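The name shown by lspci in the report above comes from a plain-text lookup in pci.ids, so a wrong entry there changes only the displayed name and not how the device works. A minimal sketch of that lookup follows. It assumes the usual pci.ids layout (vendor lines start in column 0, device lines are indented with one tab, subsystem lines with two) and uses a small inline sample instead of the real file; 15b3 is the Mellanox vendor ID, and the function name is hypothetical.

```python
# Sketch only: SAMPLE is a hand-written excerpt in pci.ids layout, and
# lookup_device is a hypothetical helper, not part of pciutils.
SAMPLE = (
    "# comment lines start with a hash\n"
    "15b3  Mellanox Technologies\n"
    "\t1015  MT27710 Family [ConnectX-4 Lx]\n"
    "\t1017  MT28800 Family [ConnectX-5]\n"
    "\t1018  MT28800 Family [ConnectX-5 Virtual Function]\n"
)


def lookup_device(pci_ids_text, vendor_id, device_id):
    """Return the device name for vendor_id:device_id, or None."""
    current_vendor = None
    for line in pci_ids_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("\t\t"):
            continue  # subsystem entries are not needed here
        parts = line.split(None, 1)
        if line.startswith("\t"):
            # Device line belonging to the most recent vendor line.
            if current_vendor == vendor_id and parts[0].lower() == device_id:
                return parts[1].strip() if len(parts) > 1 else ""
        else:
            current_vendor = parts[0].lower()
    return None


if __name__ == "__main__":
    print(lookup_device(SAMPLE, "15b3", "1017"))
```

To check an installed system, the same function could be given the contents of /usr/share/misc/pci.ids, the path listed in the dpkg output above.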
[Kernel-packages] [Bug 1540435] [NEW] Introducing ConnectX-4 Ethernet SRIOV
Public bug reported:

Hi,

This patchset introduces support for Ethernet SRIOV in the ConnectX-4 family of 100G Ethernet NICs.

Basic Introduction:

The ConnectX-4 HW architecture provides two kinds of underlying HW switches.

MPFS (Multi Physical Function Switch), or L2 Table in software terms: The HCA has one MPFS switch per physical port. This switch is responsible for forwarding unicast traffic to the various overlying Physical Functions (PFs). Multicast traffic is flooded amongst all the PFs. Each PF can request to forward a unicast MAC to its E-Switch Uplink vport (which we will cover later) through the SET_L2_TABLE_ENTRY HW command. MPFS has five ports: four are connected to PFs (one for each) and one is connected directly to the Physical Port (Physical Link).

E-Switch (Ethernet Switch): The HCA has one E-Switch per physical function. The main responsibility of this component is to forward unicast/multicast and VLAN tagged/untagged traffic to the various Virtual Functions (VFs) allocated by the PF. Unlike MPFS, the PF needs to explicitly create the E-Switch FDB table, which is a HW flow table managed by the PF driver whenever the vport_group_manager capability bit is set for this PF.

E-Switch has Virtual Port (vport) entities as its ports. vport0 and the uplink vport are special kinds of vports: vport0 represents the PF vport, and the uplink vport is connected to the MPFS switch (if it exists) as the PF external link. vport1..vportN represent the VF0..VF(N-1) egress/ingress ports.

E-Switch FDB contains forwarding rules such as:

UC MAC0 -> vport0(PF).
UC MAC1 -> vport1.
UC MAC2 -> vport2.
MC MACX -> vport0, vport2, Uplink.
MC MACY -> vport1, Uplink.

For unmatched traffic FDB has the following default rules:

Unmatched Traffic (src vport != Uplink) -> Uplink.
Unmatched Traffic (src vport == Uplink) -> vport0(PF).

NIC VPort context: Each NIC (VF/PF) has its own vport context, which is used to store the current NIC vport context (UC/MC and VLAN lists) and other NIC properties such as MTU, promisc mode, etc. The NIC (VF/PF) driver is responsible for constantly updating this context.

FDB rules population: Each NIC vport (VF/PF) notifies the E-Switch manager of its UC/MC vport context changes via the modify vport context command. This is translated to an event that is handled by the E-Switch manager (PF), which updates the FDB table accordingly.

Both PF and VF use the same driver and submit commands directly to the firmware. The PF sees the vport_group_manager capability bit and as such runs the code to populate the embedded switches as explained above.

The patchset goes as follows:

Patches 1-2 introduce the basic PCI SRIOV functionality and the ConnectX-4 support for enabling specific VFs via the enable/disable HCA commands. These two patches will also be used later for the IB SRIOV flow.

Patches 3-8 introduce the basic E-Switch capabilities and commands, to be used later by the VF to modify and update its NIC vport context, and by the PF (E-Switch Manager) driver to query the VF NIC context and act accordingly.

Patches 9-10 provide the functionality a NIC driver (VF/PF) needs to support SRIOV, mainly vport context update support.

Patch 11 ("net/mlx5: Introducing E-Switch and l2 table") introduces the basic E-Switch support and the infrastructure to read vport context events and to update the MPFS L2 Table with the UC MAC addresses requested by the PF.

Patches 12-18 introduce SRIOV enablement and E-Switch FDB table management. They add the basic E-Switch public API to set and get SRIOV properties, to be used in the PF netdev SRIOV ndos.

The patchset was applied on top of commit 3f8c0f7 "gianfar: use of_property_read_bool()".

Saeed, Eli and Or.

Eli Cohen (2):
  net/mlx5_core: Modify enable/disable hca functions
  net/mlx5_core: Add base sriov support

Saeed Mahameed (18):
  net/mlx5: Add HW capabilities and structs for SR-IOV E-Switch.
  net/mlx5: Update access functions to Query/Modify vport MAC address
  net/mlx5: Introduce access functions to modify/query vport mac lists
  net/mlx5: Introduce access functions to modify/query vport state
  net/mlx5: Introduce access functions to modify/query vport promisc mode
  net/mlx5: Introduce access functions to modify/query vport vlans
  net/mlx5e: Write UC/MC list and promisc mode into vport context
  net/mlx5e: Write vlan list into vport context
  net/mlx5: Introducing E-Switch and l2 table
  net/mlx5: E-Switch, Introduce FDB hardware capabilities
  net/mlx5: E-Switch, Add SR-IOV (FDB) support
  net/mlx5: E-Switch, Introduce Vport administration functions
  net/mlx5: E-Switch, Introduce HCA cap and E-Switch vport context
  net/mlx5: E-Switch, Introduce set vport vlan (VST mode)
  net/mlx5: E-Switch, Introduce get vf statistics
  net/mlx5e: Add support for SR-IOV ndos
  net/mlx5e: Assign random MAC address if needed
  net/mlx5: Fix query E-Switch capabilities

** Affects: linux (Ubuntu)
   Importance: Undecided
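The E-Switch FDB rules listed in the report above, including the two defaults for unmatched traffic, can be modelled as a small lookup table. This is a sketch of the forwarding decision only, not of the driver or firmware; the table contents reuse the placeholder MAC names from the report, and the function and constant names are hypothetical.

```python
# Sketch only: models the forwarding decision described in the report.
# forward(), UPLINK and PF_VPORT are hypothetical names for illustration.
UPLINK = "uplink"
PF_VPORT = "vport0"

# Destination MAC -> list of egress vports, mirroring the example rules.
FDB = {
    "MAC0": ["vport0"],                    # UC MAC0 -> vport0 (PF)
    "MAC1": ["vport1"],                    # UC MAC1 -> vport1
    "MAC2": ["vport2"],                    # UC MAC2 -> vport2
    "MACX": ["vport0", "vport2", UPLINK],  # MC MACX
    "MACY": ["vport1", UPLINK],            # MC MACY
}


def forward(fdb, dst_mac, src_vport):
    """Return the egress vports for a frame entering on src_vport."""
    ports = fdb.get(dst_mac)
    if ports is not None:
        return list(ports)
    # Default rules for unmatched traffic.
    if src_vport != UPLINK:
        return [UPLINK]    # unknown destination from a VF/PF goes out
    return [PF_VPORT]      # unknown destination from the wire goes to the PF


if __name__ == "__main__":
    print(forward(FDB, "MAC1", UPLINK))      # matched unicast rule
    print(forward(FDB, "UNKNOWN", "vport1")) # default: to the uplink
    print(forward(FDB, "UNKNOWN", UPLINK))   # default: to the PF
```

In the design described by the report, the PF driver would add or remove entries in such a table whenever a vport context change event arrives.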
[Kernel-packages] [Bug 1540435] Re: Introducing ConnectX-4 Ethernet SRIOV
Thank you Tim,

All of these patches have been accepted upstream. Could you please tell me how I can access the Launchpad GitHub? Can I backport them on my setup and attach the patches?

Thanks,
Talat

--
https://bugs.launchpad.net/bugs/1540435
Title: Introducing ConnectX-4 Ethernet SRIOV
Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1540435] Re: Introducing ConnectX-4 Ethernet SRIOV
Thank you,

You can take the patches from my branch. I've cherry-picked them and tested them on a machine with the Xenial release.

lp:~talat-b87/+junk/mlx5-sriov

Thanks,
Talat

--
https://bugs.launchpad.net/bugs/1540435
Title: Introducing ConnectX-4 Ethernet SRIOV
Status in linux package in Ubuntu: In Progress
Status in linux source package in Xenial: In Progress
[Kernel-packages] [Bug 1540435] Re: Introducing ConnectX-4 Ethernet SRIOV
I am sorry about that, could you please tell me how can i do this pull request. also attached a tar file that contain all of the commits. i do cherry-pick them on machine with Xenial and kernel 16.04, it's work properly. Thanks, Talat ** Attachment added: "mlx5_sriov.tar.gz" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1540435/+attachment/4565841/+files/mlx5_sriov.tar.gz -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1540435 Title: Introducing ConnectX-4 Ethernet SRIOV Status in linux package in Ubuntu: In Progress Status in linux source package in Xenial: In Progress Bug description: Hi, This patchset introduces the support of Ethernet SRIOV in ConnectX-4 family of 100G Ethernet NICs. Basic Introduction: ConnectX-4 HW architecture provides two kinds of underlying HW switches. MPFS (Multi Physical Function Switch) or L2 Table in Software terms: The HCA has one MPFS switch per physical port, this switch is responsible of forwarding Unicast traffic to the various overlying Physical Functions (PFs). Multicast traffic is flooded amongst all the PFs, Each PF can request to forward a unicast MAC to its E-Switch Uplink vport (which we will cover later) through SET_L2_TABLE_ENTRY HW command. MPFS has five ports, four are connected to PFs (one for each) and one is connected directly to the Physical Port (Physical Link). E-Switch (Ethernet Switch): The HCA has one per physical function. The main responsibility of this component is to forward Unicast/Multicast and vlan tagged/untagged traffic to the various Virtual Functions (VFs) allocated by the PF. Unlike MPFS, the PF needs to explicitly create the E-Switch FDB table, Which is a HW flow table managed by the PF driver whenever vport_group_manager capability bit is set for this PF. 
E-Switch has Virtual Ports (vports) entities as its ports, vport0 and uplink vport are special kind of vports that represents PF vport (vport0) and uplink vport which is connected to the MPFS switch (if exists) as the PF external link. vport1..vportN represent VF0..VF(N-1) egress/ingress ports. E-Switch FDB contains forwarding rules such as: UC MAC0 -> vport0(PF). UC MAC1 -> vport1. UC MAC2 -> vport2. MC MACX -> vport0, vport2, Uplink. MC MACY -> vport1, Uplink. For unmatched traffic FDB has the following default rules: Unmatched Traffic (src vport != Uplink) -> Uplink. Unmatched Traffic (src vport == Uplink) -> vport0(PF). NIC VPort context: Each NIC (VF/PF) has its own vport context which will be used to store the current NIC vport context (UC/MC and vlan lists) and other NIC properties such as MTU, promisc mode, etc.. NIC (VF/PF) driver is responsible of constantly updating this context. FDB rules population: Each NIC vport (VF/PF) will notify E-Switch manager of its UC/MC vport context changes via modify vport context command, which will be translated to an event that will be handled by E-Switch manager (PF) which will update FDB table accordingly. Both PF and VF use the same driver and submit commands directly to the firmware. The PF sees the vport_group_manager capability bit and as such runs the code to populate the embedded switches as explained above. The patch goes as follows: Patches 1-2 introduces the basic PCI SRIOV functionalities and the support of Connectx4 to enable specific VFs via enable/disable HCA commands. These two patches will be also in use later for the IB SRIOV flow. Patches 3-8 Introduces the basic E-Switch capabilities and commands to be used later by VF to modify and update its NIC vport context, and by PF (E-Switch Manager) driver to Query the VF NIC context and acts accordingly. Patches 9-10 Provide the needed functionality of a NIC driver VF/PF to support SRIOV, mainly vport context update support. 
Patch 11 ("net/mlx5: Introducing E-Switch and l2 table") introduces the basic E-Switch support and the infrastructure to read vport context events and to update the MPFS L2 Table with the UC MAC addresses requested by the PF.

Patches 12-18 introduce SRIOV enablement and E-Switch FDB table management. They add the basic E-Switch public API to set and get sriov properties, to be used in the PF netdev sriov ndos.

The patchset was applied on top of commit 3f8c0f7 "gianfar: use of_property_read_bool()".

Saeed, Eli and Or.

Eli Cohen (2):
  net/mlx5_core: Modify enable/disable hca functions
  net/mlx5_core: Add base sriov support

Saeed Mahameed (18):
  net/mlx5: Add HW capabilities and structs for SR-IOV E-Switch.
  net/mlx5: Update access functions to Query/Modify vport MAC address
  net/mlx5: Introduce access functions to modify/query vport mac lists
  net/mlx5: Introduce access functions to modify/query v
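
The E-Switch FDB forwarding rules quoted in this bug description can be modelled as a small lookup table with two default rules for unmatched traffic. What follows is only an illustrative sketch in Python, not driver code: the function name fdb_lookup, the vport name strings, and the placeholder MAC keys are all assumptions made for the example.

```python
# Illustrative model of the E-Switch FDB rules quoted in the description.
# Not mlx5 driver code; all names and MAC placeholders are hypothetical.

UPLINK = "uplink"
PF = "vport0"

# Example rules from the description: destination MAC -> destination vports.
FDB = {
    "UC_MAC0": [PF],
    "UC_MAC1": ["vport1"],
    "UC_MAC2": ["vport2"],
    "MC_MACX": [PF, "vport2", UPLINK],
    "MC_MACY": ["vport1", UPLINK],
}


def fdb_lookup(dst_mac, src_vport, fdb=FDB):
    """Return the list of destination vports for a frame."""
    if dst_mac in fdb:
        return list(fdb[dst_mac])
    # Default rules for unmatched traffic:
    # anything not coming from the uplink goes out of the uplink,
    # anything coming from the uplink goes to the PF.
    if src_vport != UPLINK:
        return [UPLINK]
    return [PF]


if __name__ == "__main__":
    print(fdb_lookup("UC_MAC1", "vport2"))
    print(fdb_lookup("UNKNOWN_MAC", "vport1"))
    print(fdb_lookup("UNKNOWN_MAC", UPLINK))
```

The sketch keeps the table and the defaults separate, which mirrors how the description lists explicit rules first and the unmatched-traffic rules afterwards.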
[Kernel-packages] [Bug 1544978] Re: PCI Call Traces hw csum failure in dmesg with 4.4.0-2-generic
Hi,

This upstream commit should fix the bug:

commit 82d69203df634b4dfa765c94f60ce9482bcc44d6
Author: Daniel Jurgens
Date: Wed May 4 15:00:33 2016 +0300

    net/mlx4_en: Fix endianness bug in IPV6 csum calculation

    Use htons instead of unconditionally byte swapping nexthdr. On a little endian systems shifting the byte is correct behavior, but it results in incorrect csums on big endian architectures.

https://bugs.launchpad.net/bugs/1544978

Title: PCI Call Traces hw csum failure in dmesg with 4.4.0-2-generic

Status in Ubuntu on IBM z Systems: Incomplete
Status in linux package in Ubuntu: Incomplete

Bug description:
== Comment: #0 - Helmut Grauer - 2016-02-12 03:00:03 ==
Hi, I am getting the following call traces when PCI interfaces are configured:

[ 246.051566] enp0s0: hw csum failure
[ 246.051571] CPU: 2 PID: 0 Comm: swapper/2 Tainted: GE 4.4.0-2-generic #16-Ubuntu
[ 246.051573] f9793778 f9793808 0002 f97938a8 f9793820 f9793820 00114182 0166 0091e9ca 000a 000a f9793868 f9793808 f9d38000 00114182 f9793808 f9793868
[ 246.051581] Call Trace:
[ 246.051589] ([<001140b8>] show_trace+0x140/0x148)
[ 246.051590] [<00114136>] show_stack+0x76/0xe8
[ 246.051595] [<005172d6>] dump_stack+0x6e/0x90
[ 246.051599] [<00673500>] __skb_checksum_complete+0xd0/0xd8
[ 246.051605] [<0076ae24>] icmpv6_rcv+0x124/0x500
[ 246.051608] [<00746e60>] ip6_input_finish+0x170/0x4e0
[ 246.051610] [<0074775c>] ip6_input+0x4c/0xd0
[ 246.051611] [<007478ee>] ip6_mc_input+0x10e/0x280
[ 246.051612] [<00747538>] ipv6_rcv+0x368/0x540
[ 246.051616] [<0067e5d4>] __netif_receive_skb_core+0x6fc/0xaf8
[ 246.051618] [<00681a56>] netif_receive_skb_internal+0x3e/0xd8
[ 246.051619] [<00682314>] napi_gro_frags+0x17c/0x208
[ 246.051627] [<03ff805f3a2c>] mlx4_en_process_rx_cq+0x8b4/0xbd0 [mlx4_en]
[ 246.051630] [<03ff805f3e62>] mlx4_en_poll_rx_cq+0xc2/0x1a0 [mlx4_en]
[ 246.051631] [<006839e2>] net_rx_action+0x2a2/0x418
[ 246.051635] [<00162726>] __do_softirq+0x156/0x300
[ 246.051637] [<00162ace>] irq_exit+0xd6/0xf8
[ 246.051641] [<0010cc5a>] do_IRQ+0x6a/0x88
[ 246.051644] [<007a99c2>] io_int_handler+0x112/0x220
[ 246.051646] [<00104856>] enabled_wait+0x56/0xa8
[ 246.051649] ([<00ccb888>] cpu_dead_idle+0x0/0x8)
[ 246.051651] [<00104b5a>] arch_cpu_idle+0x32/0x48
[ 246.051669] [<001a8198>] cpu_startup_entry+0x200/0x278
[ 246.051674] [<001156ba>] smp_start_secondary+0xea/0xf8
[ 246.051679] [<007a9f42>] restart_int_handler+0x62/0x78
[ 246.051680] [<>] (null)
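
The commit message quoted in this mail says that shifting the nexthdr byte is only correct on little endian hosts. A minimal Python sketch of that point follows. It simulates a 16-bit host-to-network conversion for a chosen host byte order so both cases can be compared on any machine. The helper names htons_for and unconditional_swap are hypothetical, and this is not the mlx4_en code.

```python
# Simulation of the endianness issue described in the commit message.
# Helper names are hypothetical; this is not taken from the driver.

IPPROTO_ICMPV6 = 58  # IPv6 next-header value for ICMPv6


def htons_for(value, host_byteorder):
    """Return the host integer whose in-memory bytes hold `value`
    in network (big endian) order, for the given host byte order."""
    return int.from_bytes(value.to_bytes(2, "big"), host_byteorder)


def unconditional_swap(value):
    """Shift a one-byte value into the high byte, regardless of the
    host byte order (nexthdr is a single byte, so this is its swap)."""
    return (value << 8) & 0xFFFF


if __name__ == "__main__":
    shifted = unconditional_swap(IPPROTO_ICMPV6)
    print(hex(shifted), hex(htons_for(IPPROTO_ICMPV6, "little")))
    print(hex(shifted), hex(htons_for(IPPROTO_ICMPV6, "big")))
```

On the simulated little endian host the shifted value and the htons-style value agree; on the simulated big endian host they differ, which is consistent with the checksum failure being reported on IBM z Systems.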
[Kernel-packages] [Bug 1533249] [NEW] Re-enable vlan TX acceleration
Public bug reported:

Unable to send TCP/UDP traffic between VLANs on ConnectX4LX, because vlan TX acceleration is off.

Scenario:
Configure a static VLAN on CX4LX and try to run iperf TCP/UDP. No TCP/UDP traffic passes, even though ICMP traffic does (using ping between the VLANs).

#server
# iperf -s
#Client:
# iperf -c

Setup info:
OS: Ubuntu14.04.4
Kernel 4.2.0-22

Upstream commits that fix this issue:
c44d84d net/mlx5e: Fix inline header size calculation
59fb571 net/mlx5e: Fix LSO vlan insertion
13c5224 net/mlx5e: Re-eanble client vlan TX acceleration

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

** Patch added: "Patch"
   https://bugs.launchpad.net/bugs/1533249/+attachment/4548627/+files/0001-Fix-Vlan-UDP-and-TCP-traffic.patch

https://bugs.launchpad.net/bugs/1533249

Title: Re-enable vlan TX acceleration

Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1533249] Re: Re-enable vlan TX acceleration
Hi,

Please add this fix to 14.04.4.

Thanks,
Talat

** Tags added: trusty

https://bugs.launchpad.net/bugs/1533249

Title: Re-enable vlan TX acceleration

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Wily: In Progress
Status in linux source package in Xenial: Fix Released
[Kernel-packages] [Bug 1533249] Re: Re-enable vlan TX acceleration
I have added a tarball that contains the 3 upstream commits:

0001-net-mlx5e-Re-eanble-client-vlan-TX-acceleration.patch
0002-net-mlx5e-Fix-LSO-vlan-insertion.patch
0003-net-mlx5e-Fix-inline-header-size-calculation.patch

These patches are upstream, and the fix was verified on a machine with 14.04.4 and kernel 4.2.8-ckt1+.

Thanks,
Talat

** Attachment added: "Vlan_fix_patches"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1533249/+attachment/4549222/+files/Vlan_fix_patches.tar.gz

https://bugs.launchpad.net/bugs/1533249

Title: Re-enable vlan TX acceleration

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Wily: In Progress
Status in linux source package in Xenial: Fix Released
[Kernel-packages] [Bug 1533721] [NEW] Ethtool didn't support link speed 25Gb 56Gb 100Gb 200Gb
Public bug reported:

Please add the missing advertised speeds to ethtool.

Ethtool 3.13 does not support link speeds 25Gb, 56Gb, 100Gb and 200Gb.
Ethtool 4.2 does not support link speeds 25Gb, 100Gb and 200Gb.

This is the list from ethtool 4.2:

#define ALL_ADVERTISED_MODES \
    (ADVERTISED_10baseT_Half | \
     ADVERTISED_10baseT_Full | \
     ADVERTISED_100baseT_Half | \
     ADVERTISED_100baseT_Full | \
     ADVERTISED_1000baseT_Half | \
     ADVERTISED_1000baseT_Full | \
     ADVERTISED_1000baseKX_Full | \
     ADVERTISED_2500baseX_Full | \
     ADVERTISED_10000baseT_Full | \
     ADVERTISED_10000baseKX4_Full | \
     ADVERTISED_10000baseKR_Full | \
     ADVERTISED_10000baseR_FEC | \
     ADVERTISED_20000baseMLD2_Full | \
     ADVERTISED_20000baseKR2_Full | \
     ADVERTISED_40000baseKR4_Full | \
     ADVERTISED_40000baseCR4_Full | \
     ADVERTISED_40000baseSR4_Full | \
     ADVERTISED_40000baseLR4_Full | \
     ADVERTISED_56000baseKR4_Full | \
     ADVERTISED_56000baseCR4_Full | \
     ADVERTISED_56000baseSR4_Full | \
     ADVERTISED_56000baseLR4_Full)

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: Incomplete

https://bugs.launchpad.net/bugs/1533721

Title: Ethtool didn't support link speed 25Gb 56Gb 100Gb 200Gb

Status in linux package in Ubuntu: Incomplete
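
The claim that ethtool 4.2 lacks 25Gb, 100Gb and 200Gb can be checked mechanically against the mode list in this report. The sketch below is a hedged illustration in Python: the mode names are spelled as in the Linux ethtool header, which is an assumption here, and the helper names advertised_speeds_mbps and missing_speeds are hypothetical.

```python
import re

# ADVERTISED_* names from the ethtool 4.2 list in this report, spelled as
# in the Linux ethtool header (assumption). Speeds are in Mb/s.
ETHTOOL_4_2_MODES = [
    "ADVERTISED_10baseT_Half", "ADVERTISED_10baseT_Full",
    "ADVERTISED_100baseT_Half", "ADVERTISED_100baseT_Full",
    "ADVERTISED_1000baseT_Half", "ADVERTISED_1000baseT_Full",
    "ADVERTISED_1000baseKX_Full", "ADVERTISED_2500baseX_Full",
    "ADVERTISED_10000baseT_Full", "ADVERTISED_10000baseKX4_Full",
    "ADVERTISED_10000baseKR_Full", "ADVERTISED_10000baseR_FEC",
    "ADVERTISED_20000baseMLD2_Full", "ADVERTISED_20000baseKR2_Full",
    "ADVERTISED_40000baseKR4_Full", "ADVERTISED_40000baseCR4_Full",
    "ADVERTISED_40000baseSR4_Full", "ADVERTISED_40000baseLR4_Full",
    "ADVERTISED_56000baseKR4_Full", "ADVERTISED_56000baseCR4_Full",
    "ADVERTISED_56000baseSR4_Full", "ADVERTISED_56000baseLR4_Full",
]


def advertised_speeds_mbps(mode_names):
    """Extract the set of link speeds (Mb/s) encoded in the mode names."""
    speeds = set()
    for name in mode_names:
        match = re.match(r"ADVERTISED_(\d+)base", name)
        if match:
            speeds.add(int(match.group(1)))
    return speeds


def missing_speeds(mode_names, wanted_mbps):
    """Return the wanted speeds that no mode name covers, in input order."""
    present = advertised_speeds_mbps(mode_names)
    return [speed for speed in wanted_mbps if speed not in present]


if __name__ == "__main__":
    print(missing_speeds(ETHTOOL_4_2_MODES, [25000, 56000, 100000, 200000]))
```

With the list above, 56000 is covered and the other three requested speeds are reported as missing, which matches the statement about ethtool 4.2 in the report.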
[Kernel-packages] [Bug 1531132] Re: Mellanox ConnectX3 PTP does not work with linux ptp and timekeeper after reboot
Hi Luis,

Thank you. I verified it and it is working properly.

Yours,
Talat

** Tags removed: patch trusty verification-needed-wily wily
** Tags added: verification-done-wily

https://bugs.launchpad.net/bugs/1531132

Title: Mellanox ConnectX3 PTP does not work with linux ptp and timekeeper after reboot

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Wily: Fix Committed

Bug description:
After a machine reboot, it is impossible to sync 2 clocks with ptp4l and timekeeper.

server : ptp4l -i eth4 -m
client : ptp4l -i eth4 -m -s

It works only after a driver restart.

Two patches that should fix this issue are attached:
0001-net-mlx4_en-Remove-dependency-between-timestamping-c.patch
0002-net-mlx4_en-Fix-HW-timestamp-init-issue-upon-system-.patch

These two commits were installed on Ubuntu 15.10 and they fix the issue. Please also add the fix to 14.04.4.

Thanks,
Talat
[Kernel-packages] [Bug 1535045] [NEW] Biosdevname does not provide interface naming information for ConnecX4 Devices
Public bug reported:

Hi,

Biosdevname is enabled by default on Ubuntu 14.04.4. We see that ConnectX3 interfaces are named pXpX while ConnectX4 interfaces are named ethX. When biosdevname is enabled, we should see all the interfaces named like pXpX.

According to the biosdevname user manual (http://linux.die.net/man/1/biosdevname), running #biosdevname -i should return the biosdevname name of the interface, and the command should exit with code 0 on success. When we run biosdevname on a ConnectX4 interface we get return status 2.

Could you please add support for ConnectX4 devices?

Exit Codes
Returns 0 on success, with BIOS-suggested name printed to stdout.
Returns 1 on provided device name lookup failure.
Returns 2 if system BIOS does not provide naming information. biosdevname requires system BIOS to provide naming information, either via SMBIOS or sysfs files.
Returns 3 if not run as root but requires root privileges.
Returns 4 if running in a virtual machine.

# dpkg --list |grep biosdevname
ii biosdevname 0.4.1-0ubuntu6.3 amd64 apply BIOS-given names to network devices
# uname -r
4.2.0-24-generic
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1535045

Title: Biosdevname does not provide interface naming information for ConnecX4 Devices

Status in linux package in Ubuntu: New
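
The exit codes listed in this report can be turned into a small helper that explains what a given return status means. This is a minimal sketch in Python under stated assumptions: the function names explain_exit_code and query_bios_name are hypothetical, the meanings are copied from the man page excerpt in the report, and the interface name passed to `biosdevname -i` is whatever the caller supplies.

```python
import subprocess

# Exit codes as listed in the biosdevname man page excerpt in this report.
EXIT_MEANINGS = {
    0: "success, BIOS-suggested name printed to stdout",
    1: "provided device name lookup failure",
    2: "system BIOS does not provide naming information",
    3: "not run as root but requires root privileges",
    4: "running in a virtual machine",
}


def explain_exit_code(code):
    """Map a biosdevname exit code to the meaning given in the man page."""
    return EXIT_MEANINGS.get(code, "unknown exit code %d" % code)


def query_bios_name(interface):
    """Run `biosdevname -i <interface>` and return (name or None, meaning).

    Raises FileNotFoundError if biosdevname is not installed.
    """
    result = subprocess.run(
        ["biosdevname", "-i", interface],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    name = result.stdout.strip() if result.returncode == 0 else None
    return name, explain_exit_code(result.returncode)


if __name__ == "__main__":
    # Status 2 is what the report sees on ConnectX4 interfaces.
    print(explain_exit_code(2))
```

Only explain_exit_code is exercised in the example run, since query_bios_name needs the biosdevname binary and usually root privileges.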
[Kernel-packages] [Bug 1533249] Re: Re-enable vlan TX acceleration
Thank you Tim,

This bug affects only the Mellanox drivers, and the fix improves VLAN processing on the newest Mellanox devices. Could you please add this bug fix to the Ubuntu 14.04.4 kernel?

Thanks in advance,
Talat

https://bugs.launchpad.net/bugs/1533249

Title: Re-enable vlan TX acceleration

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Wily: In Progress
Status in linux source package in Xenial: Fix Released
[Kernel-packages] [Bug 1535045] Re: Biosdevname does not provide interface naming information for ConnecX4 Devices
** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

https://bugs.launchpad.net/bugs/1535045

Title: Biosdevname does not provide interface naming information for ConnecX4 Devices

Status in linux package in Ubuntu: Confirmed
[Kernel-packages] [Bug 1528466] Re: Mellanox ConnectX4 MTU limits: max and min
** Tags added: verification-done-wily

https://bugs.launchpad.net/bugs/1528466

Title: Mellanox ConnectX4 MTU limits: max and min

Status in linux package in Ubuntu: In Progress
Status in linux-lts-wily package in Ubuntu: In Progress
Status in linux source package in Wily: Fix Committed

Bug description:
The max MTU limit should be 9978. Tested on an Ubuntu 14.04.4 daily build; this issue also exists in 15.10.

Reproduce:
# ifconfig p2p1 mtu 900
SIOCSIFMTU: Invalid argument
# dmesg
[62898.559808] mlx5_core :20:00.0 p2p1: mlx5e_change_mtu: Bad MTU (900) > (1) Max
[63058.512668] command failed, status bad parameter(0x3), syndrome 0x648afc

Upstream commit that fixes the issue:

commit 60825c35bf023553f8524f6695f176236e54df97
Author: Doron Tsur
Date: Thu Nov 12 19:35:27 2015 +0200

    net/mlx5e: Max mtu comparison fix

    On change mtu the driver compares between hardware queried mtu and software requested mtu. We need to compare between software representation of the queried mtu and the requested mtu.

    Fixes: facc9699f0fe ('net/mlx5e: Fix HW MTU settings')
    Signed-off-by: Doron Tsur
    Signed-off-by: Saeed Mahameed
    Signed-off-by: Or Gerlitz
    Signed-off-by: David S. Miller

# cat /etc/os-release
NAME="Ubuntu"
VERSION="14.04.3 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.3 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
# uname -r
4.2.0-22-generic
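
The commit message quoted in this report asks for the requested MTU to be compared with the software representation of the hardware-queried MTU. A minimal sketch of that idea in Python follows. It assumes that the hardware MTU counts the Ethernet header, one VLAN tag and the FCS (22 bytes) on top of the software MTU, and it assumes a hardware maximum of 10000 bytes so that the software maximum comes out at the 9978 named in the report. The constants and function names are hypothetical and are not copied from the driver.

```python
# Sketch of comparing MTUs in the same representation.
# Overhead values and the 10000-byte hardware maximum are assumptions.

ETH_HLEN = 14     # Ethernet header
VLAN_HLEN = 4     # one 802.1Q tag
ETH_FCS_LEN = 4   # frame check sequence
HW_OVERHEAD = ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN


def sw_to_hw_mtu(sw_mtu):
    """Software (netdev) MTU to the hardware MTU under the assumption above."""
    return sw_mtu + HW_OVERHEAD


def hw_to_sw_mtu(hw_mtu):
    """Hardware MTU back to its software representation."""
    return hw_mtu - HW_OVERHEAD


def mtu_is_valid(requested_sw_mtu, max_hw_mtu):
    """Compare the request with the software view of the hardware limit."""
    return requested_sw_mtu <= hw_to_sw_mtu(max_hw_mtu)


if __name__ == "__main__":
    assumed_max_hw_mtu = 10000
    print(hw_to_sw_mtu(assumed_max_hw_mtu))
    print(mtu_is_valid(9978, assumed_max_hw_mtu))
    print(mtu_is_valid(9979, assumed_max_hw_mtu))
```

Converting the limit once and comparing in software units keeps both sides of the check in the representation the user actually types into ifconfig.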
[Kernel-packages] [Bug 1535045] Re: Biosdevname does not provide interface naming information for ConnecX4 Devices
** Package changed: linux (Ubuntu) => biosdevname (Ubuntu)

https://bugs.launchpad.net/bugs/1535045

Title: Biosdevname does not provide interface naming information for ConnecX4 Devices

Status in biosdevname package in Ubuntu: Confirmed
[Kernel-packages] [Bug 1533249] Re: Re-enable vlan TX acceleration
** Tags added: verification-done-wily

https://bugs.launchpad.net/bugs/1533249

Title: Re-enable vlan TX acceleration

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Wily: Fix Committed
Status in linux source package in Xenial: Fix Released
[Kernel-packages] [Bug 1507482] Re: ethtool self-test failed and creates a FW reset.
Hi Rafael,

Is there any update on this bug?

Thanks,
Talat

** Tags added: wily

https://bugs.launchpad.net/bugs/1507482

Title: ethtool self-test failed and creates a FW reset.

Status in linux package in Ubuntu: In Progress

Bug description:
On a ConnectX3 device we ran the self-test and it failed. We also saw that it causes a FW reset.

Steps to reproduce:
~# ethtool -t ens2
The test result is FAIL
The test extra info:
Interrupt Test -16
Link Test 0
Speed Test 0
Register Test 0
Loopback Test 0

~# dmesg
[341548.619799] mlx4_core :07:00.0: command 0x31 timed out (go bit not cleared)
[341548.803801] mlx4_core :07:00.0: command 0x49 timed out (go bit not cleared)
[341548.803808] mlx4_core :07:00.0: device is going to be reset
[341549.816587] mlx4_core :07:00.0: device was reset successfully
[341549.823425] mlx4_en :07:00.0: Internal error detected, restarting device

~# cat /etc/os-release
NAME="Ubuntu"
VERSION="15.10 (Wily Werewolf)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu Wily Werewolf (development branch)"
VERSION_ID="15.10"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

~# uname -r
4.2.0-7-generic

~# ethtool -i ens2
driver: mlx4_en
version: 2.2-1 (Feb 2014)
firmware-version: 2.34.5000
bus-info: :07:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[Kernel-packages] [Bug 1528466] Re: Mellanox ConnectX4 MTU limits: max and min
** Tags removed: verification-needed-xenial
** Tags added: verification-done-xenial

https://bugs.launchpad.net/bugs/1528466

Title: Mellanox ConnectX4 MTU limits: max and min

Status in linux package in Ubuntu: Fix Released
Status in linux-lts-wily package in Ubuntu: Fix Released
Status in linux source package in Wily: Fix Released
Status in linux source package in Xenial: Fix Committed
[Kernel-packages] [Bug 1585978] Re: mlx5_core kexec fail
** Tags removed: verification-needed-xenial
** Tags added: verification-done-xenial

https://bugs.launchpad.net/bugs/1585978

Title: mlx5_core kexec fail

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Xenial: Fix Committed

Bug description:
On a machine with a ConnectX4 device, kexec fails to load a new kernel. The pci shutdown callback is missing.

Operating system: Ubuntu 16.04
Kernel: 4.4.0-22-generic

Scenario:
# kexec -l /boot/vmlinuz-4.3.0-rc6-gemini-perf-2015-10-22_19-44-01 --initrd=/boot/initramfs-4.3.0-rc6-gemini-perf-2015-10-22_19-44-01.img --command-line="root=UUID=7ec1922f-8631-46bf-abe4-afb7affab4fe console=tty0 console=ttyS0,115200n8 rhgb SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us"
# kexec -e

Upstream commit that fixes this issue:

commit 5fc7197d3a256d9c5de3134870304b24892a4908
Author: Majd Dibbiny
Date: Fri Apr 22 00:33:07 2016 +0300

    net/mlx5: Add pci shutdown callback
[Kernel-packages] [Bug 1477466] Re: Low performance when using vlan over VxLan
Hi Dragan,

This is an old bug. As far as I remember, we tested it and the performance was still not as expected. We can test it again, but have these patches been back-ported?

Thanks,
Talat

https://bugs.launchpad.net/bugs/1477466

Title: Low performance when using vlan over VxLan

Status in linux package in Ubuntu: In Progress
Status in linux source package in Vivid: In Progress

Bug description:
We see a performance issue when running traffic over a VLAN interface that is created over a VxLAN interface. We reach 24 Gbps over the VxLAN interface but only 4 Gbps over the VLAN interface. It turned out that GRO isn't supported for VLAN over VxLAN.

The following upstream commits fix this issue.

commit 66e5133f19e901a044fa5eaeeb6ecff4545839e5
Author: Toshiaki Makita
Date: Mon Jun 1 21:55:06 2015 +0900

    vlan: Add GRO support for non hardware accelerated vlan

    Currently packets with non-hardware-accelerated vlan cannot be handled by GRO. This causes low performance for 802.1ad and stacked vlan, as their vlan tags are currently not stripped by hardware. This patch adds GRO support for non-hardware-accelerated vlan and improves receive performance of them.

commit 9b174d88c257150562b0101fcc6cb6c3cb74275c
Author: Jesse Gross
Date: Tue Dec 30 19:10:15 2014 -0800

    net: Add Transparent Ethernet Bridging GRO support.

    Currently the only tunnel protocol that supports GRO with encapsulated Ethernet is VXLAN. This pulls out the Ethernet code into a proper layer so that it can be used by other tunnel protocols such as GRE and Geneve.
[Kernel-packages] [Bug 1477466] Re: Low performance when using vlan over VxLan
Hi,

We will test this scenario on the latest Ubuntu (Yakkety Yak). This may take some time; we will update later.

--
https://bugs.launchpad.net/bugs/1477466

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Vivid:
  Incomplete
[Kernel-packages] [Bug 1634862] Re: PPC + kernel 4.8 swap_dup: Bad swap file entry 2c0000000016ed28
Hi,

We didn't see the issue on kernel 4.8.0-27-generic. Are there fixes related to this area in kernel 4.8.0-27-generic?

Thanks,
Talat

--
https://bugs.launchpad.net/bugs/1634862

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Yakkety:
  In Progress
[Kernel-packages] [Bug 1649207] [NEW] mlx5_core failed to increase rx ring to 4096 - SIOCSIFFLAGS: Cannot allocate memory
Public bug reported:

Failed to increase rx ring to 4096 - SIOCSIFFLAGS: Cannot allocate memory

Scenario

ubuntu@cto-netsim3:~$ sudo ethtool -g ens6f0
[sudo] password for ubuntu:
Ring parameters for ens6f0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 8192
Current hardware settings:
RX: 1024
RX Mini: 0
RX Jumbo: 0
TX: 1024

ubuntu@cto-netsim3:~$ sudo ethtool -G ens6f0 rx 4096
Cannot set device ring parameters: Cannot allocate memory

After bringing the interface down with:
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 down

I cannot bring it back up!

ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$
ubuntu@cto-netsim3:~$ sudo ifconfig ens6f0 up
SIOCSIFFLAGS: Cannot allocate memory
ubuntu@cto-netsim3:~$

dmesg:
[ 774.935067] mlx5_core :81:00.0: swiotlb buffer is full (sz: 8388608 bytes)
[ 774.935070] swiotlb: coherent allocation failed for device :81:00.0 size=8388608
[ 774.935074] CPU: 38 PID: 6042 Comm: ethtool Not tainted 4.8.0-22-generic #24
[ 774.935075] Hardware name: Quanta Computer Inc D51B-1U (dual 1G LoM)/S2B-MB (dual 1G LoM), BIOS S2B_3A19 05/15/2015
[ 774.935078] 0286 65a68699 8d946b1db9a0 8502f5d2
[ 774.935083] 007f 8d946b1db9e8 8505a280
[ 774.935087] 8d946b1dba80 8d94000b 024082c0 8db471b0e0a0
[ 774.935091] Call Trace:
[ 774.935104] [] dump_stack+0x63/0x81
[ 774.935108] [] swiotlb_alloc_coherent+0x140/0x160
[ 774.935115] [] x86_swiotlb_alloc_coherent+0x43/0x50
[ 774.935150] [] mlx5_dma_zalloc_coherent_node+0xa4/0x100 [mlx5_core]
[ 774.935164] [] mlx5_buf_alloc_node+0x4d/0xc0 [mlx5_core]
[ 774.935181] [] mlx5_cqwq_create+0x7e/0x160 [mlx5_core]
[ 774.935199] [] mlx5e_open_cq+0x9e/0x1f0 [mlx5_core]
[ 774.935214] [] mlx5e_open_channels+0x715/0xf30 [mlx5_core]
[ 774.935229] [] mlx5e_open_locked+0xda/0x1e0 [mlx5_core]
[ 774.935245] [] mlx5e_set_ringparam+0x21e/0x350 [mlx5_core]
[ 774.935252] [] dev_ethtool+0x59f/0x1fc0
[ 774.935255] [] ? new_slab+0x300/0x6e0
[ 774.935259] [] ? __rtnl_unlock+0x2a/0x50
[ 774.935262] [] ? netdev_run_todo+0x60/0x330
[ 774.935266] [] ? alloc_set_pte+0x4ec/0x610
[ 774.935268] [] ? dev_get_by_name_rcu+0x61/0x80
[ 774.935272] [] dev_ioctl+0x180/0x5a0
[ 774.935277] [] sock_do_ioctl+0x42/0x50
[ 774.935280] [] sock_ioctl+0x1d2/0x290
[ 774.935283] [] do_vfs_ioctl+0xa3/0x610
[ 774.935287] [] ? __do_page_fault+0x203/0x4d0
[ 774.935289] [] SyS_ioctl+0x79/0x90
[ 774.935307] [] entry_SYSCALL_64_fastpath+0x1e/0xa8
[ 774.935312] mlx5_core :81:00.0: :81:00.0:mlx5_cqwq_create:121:(pid 6042): mlx5_buf_alloc_node() failed, -12
[ 774.935537] mlx5_core :81:00.0 ens6f0: mlx5e_open_locked: mlx5e_open_channels failed, -12

These are the upstream patches that fix this issue:

ec8b9981ad3f net/mlx5e: Create UMR MKey per RQ
3608ae77c098 net/mlx5e: Move function mlx5e_create_umr_mkey
1c1b522808a1 net/mlx5e: Implement Fragmented Work Queue (WQ)

Thanks,
Talat

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: patch

--
https://bugs.launchpad.net/bugs/1649207
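The numbers in this report can be cross-checked with a short script. What follows is only a minimal sketch in Python, assuming the `ethtool -g` text has the layout quoted in the report; `parse_ring_params` is a hypothetical helper written for this illustration and is not part of ethtool or the mlx5 driver. It reads the pre-set maximum and the current RX ring size, and converts the failed swiotlb allocation size from the dmesg line into MiB.

```python
# Sample text copied from the `ethtool -g ens6f0` output in the report.
ETHTOOL_G_OUTPUT = """\
Ring parameters for ens6f0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 8192
Current hardware settings:
RX: 1024
RX Mini: 0
RX Jumbo: 0
TX: 1024
"""


def parse_ring_params(text):
    """Split `ethtool -g` text into {section: {field: value}}.

    Hypothetical helper: it assumes section headers end with ':' and
    carry no value, and it skips fields whose value is not a plain integer.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Ring parameters"):
            continue
        name, _, value = line.partition(":")
        value = value.strip()
        if not value:
            # A line such as "Pre-set maximums:" opens a new section.
            current = sections.setdefault(name.strip(), {})
        elif current is not None and value.isdigit():
            current[name.strip()] = int(value)
    return sections


params = parse_ring_params(ETHTOOL_G_OUTPUT)
rx_max = params["Pre-set maximums"]["RX"]
rx_now = params["Current hardware settings"]["RX"]
print(f"RX ring: current {rx_now}, maximum {rx_max}")

# Size of the coherent allocation that failed, taken from the dmesg line
# "swiotlb buffer is full (sz: 8388608 bytes)".
failed_bytes = 8388608
print(f"failed allocation: {failed_bytes // (1024 * 1024)} MiB")
```

The requested RX size of 4096 is within the maximum the device advertises, so the failure comes from the memory allocation rather than from an out-of-range request.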
[Kernel-packages] [Bug 1649207] Re: mlx5_core failed to increase rx ring to 4096 - SIOCSIFFLAGS: Cannot allocate memory
** Tags added: yakkety

--
https://bugs.launchpad.net/bugs/1649207

Status in linux package in Ubuntu:
  Incomplete
[Kernel-packages] [Bug 1585978] [NEW] mlx5_core kexec fail
Public bug reported:

On a machine with a ConnectX4 device, kexec fails to load a new kernel. The PCI shutdown callback is missing.

Operating system - Ubuntu 16.04, kernel 4.4.0-22-generic

Scenario

# kexec -l /boot/vmlinuz-4.3.0-rc6-gemini-perf-2015-10-22_19-44-01 --initrd=/boot/initramfs-4.3.0-rc6-gemini-perf-2015-10-22_19-44-01.img --command-line="root=UUID=7ec1922f-8631-46bf-abe4-afb7affab4fe console=tty0 console=ttyS0,115200n8 rhgb SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us"
# kexec -e

Upstream commit that fixes this issue:

commit 5fc7197d3a256d9c5de3134870304b24892a4908
Author: Majd Dibbiny
Date: Fri Apr 22 00:33:07 2016 +0300

    net/mlx5: Add pci shutdown callback

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete

** Tags: xenial

--
https://bugs.launchpad.net/bugs/1585978
[Kernel-packages] [Bug 1634862] [NEW] PPC + kernel 4.8 swap_dup: Bad swap file entry 2c0000000016ed28
Public bug reported:

Hi,

On ppc machines with Ubuntu 16.10 (Yakkety Yak), after upgrading to the latest Ubuntu kernel (4.8) and running the installation of the MLNX_OFED driver package, which runs a number of compilation jobs equal to the number of CPUs, the system hangs and we get a bad swap file message, with pages and pages of the following output. This is just a sample from the system log.

I had been running with kernel 4.4 and everything worked OK on the same machine; after we upgraded to kernel 4.8 we started to see the following error.

The dmesg log gets lots of this message:

[10101.785548] swap_dup: Bad swap file entry 2c16ed28
[10101.785595] swap_dup: Bad swap file entry 2c16ed29
[10101.785606] swap_dup: Bad swap file entry 2c16ed2a
[10101.785613] swap_dup: Bad swap file entry 2c16ed2b
[10101.785622] swap_dup: Bad swap file entry 2c16ed2c
[10101.785629] swap_dup: Bad swap file entry 2c16ed2d
[10101.785637] swap_dup: Bad swap file entry 2c16ed2e
[10101.785648] swap_dup: Bad swap file entry 2c16ed2f
[10101.785659] swap_dup: Bad swap file entry 2c16ed2e

and call trace:

Call Trace:
[10101.785773] [c002fb923ac0] [c0b9db00] dump_stack+0xb0/0xf0 (unreliable)
[10101.785789] [c002fb923b00] [c0b9b330] dump_header+0x84/0x224
[10101.785801] [c002fb923bd0] [c025d724] oom_kill_process+0x3b4/0x5c0
[10101.785811] [c002fb923c80] [c025dc80] out_of_memory+0x290/0x580
[10101.785822] [c002fb923d30] [c025dff4] pagefault_out_of_memory+0x84/0xb0
[10101.785836] [c002fb923d80] [c0b94d44] do_page_fault+0x7c4/0x7d0
[10101.785849] [c002fb923e30] [c0008948] handle_page_fault+0x10/0x30

# cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.10
DISTRIB_CODENAME=yakkety
DISTRIB_DESCRIPTION="Ubuntu 16.10"
NAME="Ubuntu"
VERSION="16.10 (Yakkety Yak)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.10"
VERSION_ID="16.10"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="http://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=yakkety
UBUNTU_CODENAME=yakkety

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "full dmesg"
   https://bugs.launchpad.net/bugs/1634862/+attachment/4763753/+files/full_dmesg.txt

--
https://bugs.launchpad.net/bugs/1634862
[Kernel-packages] [Bug 1634862] Re: PPC + kernel 4.8 swap_dup: Bad swap file entry 2c0000000016ed28
Hi,

There is no ppc image under http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.9-rc1/; the ppc page is not found. The page there says:

"
Build for ppc64el failed (see BUILD.LOG.ppc64el):
linux-headers-4.9.0-040900rc1_4.9.0-040900rc1.201610151630_all.deb
*_ppc64el.deb
"

Thanks,
Talat

--
https://bugs.launchpad.net/bugs/1634862

Status in linux package in Ubuntu:
  Incomplete
[Kernel-packages] [Bug 1634862] Re: PPC + kernel 4.8 swap_dup: Bad swap file entry 2c0000000016ed28
Hi Joseph,

We have already tested the following kernels; here is the status:

v4.8-rc8  http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8-rc8/  working
v4.8-rc1  http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8-rc1/  working
v4.8      http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8/      not working
v4.9-rc2  http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.9-rc2/  not working

Thanks,
Talat

--
https://bugs.launchpad.net/bugs/1634862

Status in linux package in Ubuntu:
  Incomplete
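The four results reported in this message bracket the regression. As a rough sketch (Python, with the results typed in by hand from the message; `version_key` and `regression_window` are hypothetical helpers, not part of any kernel tooling), the tags can be sorted in release order and the last working and first failing kernel picked out:

```python
import re

# Results as reported in the message (tag -> did the kernel work?).
RESULTS = {
    "v4.8-rc8": True,
    "v4.8-rc1": True,
    "v4.8": False,
    "v4.9-rc2": False,
}


def version_key(tag):
    """Sort key for mainline tags such as 'v4.8-rc1' or 'v4.8'.

    A final release sorts after all of its release candidates.
    """
    match = re.fullmatch(r"v(\d+)\.(\d+)(?:-rc(\d+))?", tag)
    if match is None:
        raise ValueError(f"unrecognised tag: {tag}")
    major, minor, rc = match.groups()
    # Infinity stands for "no -rc suffix", so v4.8 sorts after v4.8-rc8.
    return (int(major), int(minor), float("inf") if rc is None else int(rc))


def regression_window(results):
    """Return (last_good, first_bad), or None if the results are not a
    clean run of working kernels followed by failing ones."""
    ordered = sorted(results, key=version_key)
    states = [results[tag] for tag in ordered]
    if True not in states or False not in states:
        return None
    first_bad = states.index(False)
    if any(states[first_bad:]):
        # A working kernel after a failing one: no single window.
        return None
    return ordered[first_bad - 1], ordered[first_bad]


print(regression_window(RESULTS))
```

With these inputs the window is v4.8-rc8 to v4.8, which is the range a `git bisect` between those two tags would have to cover.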
[Kernel-packages] [Bug 1634862] Re: PPC + kernel 4.8 swap_dup: Bad swap file entry 2c0000000016ed28
Hi,

Is there any update on this bug?

Thanks,
Talat

--
https://bugs.launchpad.net/bugs/1634862

Status in linux package in Ubuntu:
  Incomplete
[Kernel-packages] [Bug 1507482] [NEW] ethtool self-test failed and creates a FW reset.
Public bug reported:

On a ConnectX3 device we ran the ethtool self-test and it failed; we also saw that it causes a FW reset.

Steps to reproduce:

~# ethtool -t ens2
The test result is FAIL
The test extra info:
Interrupt Test -16
Link Test 0
Speed Test 0
Register Test 0
Loopback Test 0

~# dmesg
[341548.619799] mlx4_core :07:00.0: command 0x31 timed out (go bit not cleared)
[341548.803801] mlx4_core :07:00.0: command 0x49 timed out (go bit not cleared)
[341548.803808] mlx4_core :07:00.0: device is going to be reset
[341549.816587] mlx4_core :07:00.0: device was reset successfully
[341549.823425] mlx4_en :07:00.0: Internal error detected, restarting device

~# cat /etc/os-release
NAME="Ubuntu"
VERSION="15.10 (Wily Werewolf)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu Wily Werewolf (development branch)"
VERSION_ID="15.10"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

~# uname -r
4.2.0-7-generic

~# ethtool -i ens2
driver: mlx4_en
version: 2.2-1 (Feb 2014)
firmware-version: 2.34.5000
bus-info: :07:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

--
https://bugs.launchpad.net/bugs/1507482
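The "Interrupt Test -16" line in the self-test output can be read as an error code. The sketch below (Python; `failed_tests` is a hypothetical helper, and the sample text is retyped from the report) assumes that a negative result is a negated Linux errno value, which is a common kernel convention but is not confirmed by the report itself. On Linux, errno 16 is EBUSY.

```python
import errno
import re

# Sample text retyped from the `ethtool -t ens2` output in the report.
SELF_TEST_OUTPUT = """\
The test result is FAIL
The test extra info:
Interrupt Test -16
Link Test 0
Speed Test 0
Register Test 0
Loopback Test 0
"""


def failed_tests(text):
    """Map each test with a negative result to a symbolic errno name.

    Assumption: negative results are negated errno values. Lines that do
    not end in an integer (the two header lines) are ignored.
    """
    failures = {}
    for line in text.splitlines():
        match = re.fullmatch(r"\s*(.+?)\s+(-?\d+)\s*", line)
        if match is None:
            continue
        name, value = match.group(1), int(match.group(2))
        if value < 0:
            failures[name] = errno.errorcode.get(-value, "unknown")
    return failures


print(failed_tests(SELF_TEST_OUTPUT))
```

Only the interrupt test is flagged; the link, speed, register and loopback tests all report 0.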
[Kernel-packages] [Bug 1507482] Re: ethtool self-test failed and creates a FW reset.
Hi,

This commit [1] fixes the issue. Could you please cherry-pick it?

[1] Commit that fixes this issue:
From 820d39f3c497df6c8e040b8dcc7c19eeaa312701 Mon Sep 17 00:00:00 2001
From: Carol L Soto
Date: Thu, 8 Oct 2015 15:26:15 +0300
Subject: net/mlx4_core: Avoid failing the interrupts test

Yours,
Talat

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1507482

Title:
  ethtool self-test failed and creates a FW reset.

Status in linux package in Ubuntu:
  Incomplete
[Kernel-packages] [Bug 1507482] Re: ethtool self-test failed and creates a FW reset.
Hi,

Thank you for your effort. Please make sure that the interface link is up: in the machine log above we see that the link test failed ("Link Test 1"). Could you please retest and update?

Yours,
Talat

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1507482

Title:
  ethtool self-test failed and creates a FW reset.

Status in linux package in Ubuntu:
  In Progress
[Kernel-packages] [Bug 1507482] Re: ethtool self-test failed and creates a FW reset.
Hi,

This bug is for an Ethernet device and your setup is InfiniBand: we see "Link layer: InfiniBand" in the ibstat output. Please validate the fix on a ConnectX3 Ethernet interface.

Thanks,
Talat

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1507482

Title:
  ethtool self-test failed and creates a FW reset.

Status in linux package in Ubuntu:
  In Progress
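To make the link-layer check above repeatable before retesting, something like the following could be used. It is a rough sketch, assuming ibstat-style text in which each "Port N:" heading is followed by a "Link layer:" line; the SAMPLE text is abridged and invented for illustration (it is not the tester's actual output), and port_link_layers is a hypothetical helper name.

```python
def port_link_layers(ibstat_output: str) -> dict:
    """Map each 'Port N' heading in ibstat-style text to its link layer."""
    layers = {}
    current_port = None
    for raw in ibstat_output.splitlines():
        line = raw.strip()
        # Port headings look like "Port 1:"; lines such as "Port GUID: ..."
        # do not end with a colon and are therefore skipped.
        if line.startswith("Port ") and line.endswith(":"):
            current_port = line[:-1]
        elif line.startswith("Link layer:") and current_port is not None:
            layers[current_port] = line.split(":", 1)[1].strip()
    return layers


def ethernet_ports(layers: dict) -> list:
    """Ports whose link layer is Ethernet, i.e. the ones this bug applies to."""
    return [port for port, layer in layers.items() if layer == "Ethernet"]


# Abridged, made-up sample; only the "Link layer:" wording comes from the thread.
SAMPLE = """\
CA 'mlx4_0'
    Port 1:
        State: Active
        Link layer: InfiniBand
    Port 2:
        State: Down
        Link layer: Ethernet
"""

if __name__ == "__main__":
    print(ethernet_ports(port_link_layers(SAMPLE)))
```

Only ports reported as Ethernet would be meaningful targets for validating this particular fix.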
[Kernel-packages] [Bug 1514861] [NEW] mlx5 EN driver wrongly enables sets VLAN filtering under promiscuous mode
Public bug reported:

Description of problem:
Under promiscuous mode, the mlx5 Ethernet driver accepts only packets whose VLAN tag is already configured in the NIC VLAN filter, instead of packets with any VLAN tag. This is wrong and prevents OpenStack from functioning properly in a para-virtual configuration.

How reproducible:
Put the NIC into promiscuous mode and send a packet from another node, tagged with any VLAN that was not previously configured in the NIC VLAN filter. The packet will not be accepted.

Actual results:
ARP packets sent on VLAN 52 are dropped.

Expected results:
The packets should be received.

Host info:
# uname -a
Linux dev-h-vrt-006 4.2.0-16-generic #19-Ubuntu SMP Thu Oct 8 15:35:06 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 15.10
Release:        15.10
Codename:       wily

The following upstream commit fixes it:

commit c07543431e9f3d126d083808efa0e76461d8833b
Author: Achiad Shochat
Date: Thu Oct 8 15:26:18 2015 +0300

    net/mlx5e: Disable VLAN filter in promiscuous mode

    When the device was set to promiscuous mode, we didn't disable VLAN filtering, which is wrong behaviour, fix that.

    Now when the device is set to promiscuous mode RX packets sent over any VLAN (or no VLAN tag at all) will be accepted.

    Signed-off-by: Achiad Shochat
    Signed-off-by: Or Gerlitz
    Signed-off-by: David S. Miller

I backported it to Ubuntu 15.10 (please see the attached patch). This issue needs to be fixed in Ubuntu 14.04.4 as well, not only in 15.10.

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

** Tags: trusty wily

** Patch added: "Patch"
   https://bugs.launchpad.net/bugs/1514861/+attachment/4516282/+files/0001-net-mlx5e-Disable-VLAN-filter-in-promiscuous-mode.patch

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1514861

Title:
  mlx5 EN driver wrongly enables sets VLAN filtering under promiscuous mode

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1514861/+subscriptions
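The behaviour change described in the commit message can be illustrated with a small toy model. This is not the mlx5 driver code; it is a sketch under stated assumptions: ToyNic and its fields are hypothetical names, MAC filtering is ignored entirely, and untagged frames are assumed to always pass the VLAN filter.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNic:
    """Toy receive-side VLAN filter; all names here are illustrative."""
    promiscuous: bool = False
    vlan_filter: set = field(default_factory=set)
    # True models the fixed behaviour, False models the reported bug.
    disable_vlan_filter_in_promisc: bool = True

    def accepts(self, vlan_id) -> bool:
        """vlan_id is an int for a tagged frame, None for an untagged one."""
        if self.promiscuous and self.disable_vlan_filter_in_promisc:
            # Fixed behaviour: promiscuous mode bypasses the VLAN filter.
            return True
        if vlan_id is None:
            # Assumption of this sketch: untagged frames are never filtered.
            return True
        return vlan_id in self.vlan_filter


if __name__ == "__main__":
    buggy = ToyNic(promiscuous=True, vlan_filter={10},
                   disable_vlan_filter_in_promisc=False)
    fixed = ToyNic(promiscuous=True, vlan_filter={10})
    # VLAN 52 is the tag from the report; it is not in the filter.
    print(buggy.accepts(52), fixed.accepts(52))
```

In this model the "buggy" NIC drops the VLAN 52 frame even though it is promiscuous, while the "fixed" NIC accepts it, which mirrors the actual and expected results in the report.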
[Kernel-packages] [Bug 1514861] Re: mlx5 EN driver wrongly enables sets VLAN filtering under promiscuous mode
Hi Brad,

I verified this fix and it resolves the issue. Please don't drop it, and move the bug to verification-done-wily. If a Canonical verification of this bug is needed, I'll ask someone to do it.

Thanks,
Talat

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1514861

Title:
  mlx5 EN driver wrongly enables sets VLAN filtering under promiscuous mode

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Wily:
  Fix Committed
Status in linux source package in Xenial:
  Fix Released
[Kernel-packages] [Bug 1514861] Re: mlx5 EN driver wrongly enables sets VLAN filtering under promiscuous mode
Hi Tim,

You are right. I tested this patch; it works and fixes the issue.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1514861

Title:
  mlx5 EN driver wrongly enables sets VLAN filtering under promiscuous mode

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Wily:
  In Progress
Status in linux source package in Xenial:
  Fix Released
[Kernel-packages] [Bug 1514861] Re: mlx5 EN driver wrongly enables sets VLAN filtering under promiscuous mode
Hi,

Could you please add this fix to Ubuntu 14.04.4?

Thanks,
Talat

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1514861

Title:
  mlx5 EN driver wrongly enables sets VLAN filtering under promiscuous mode

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Wily:
  In Progress
Status in linux source package in Xenial:
  Fix Released
[Kernel-packages] [Bug 1517919] [NEW] Mlx5 EN - Ethtool link speed setting fixes
Public bug reported:

Setting the link speed for ConnectX4 is not working.

To reproduce:

ethtool -s ens1 speed 1
ethtool -s ens1 |grep Speed
Speed: 4

The commit below fixes this issue:

commit 6fa1bcab6be6e9bd93f80e345c7e9a4ec7861df9
Author: Achiad Shochat
Date: Sun Aug 16 16:04:50 2015 +0300

    net/mlx5e: Ethtool link speed setting fixes

    - Port speed settings are applied by the device only upon port admin status transition from DOWN to UP. So we enforce this transition regardless of the port's current operation state (which may be occasionally DOWN if for example the network cable is disconnected).
    - Fix the PORT_UP/DOWN device interface enum
    - Set the local_port bit in the device PAOS register
    - EXPORT the PAOS (Port Administrative and Operational Status) register set/query access functions.

    Signed-off-by: Achiad Shochat
    Signed-off-by: David S. Miller

I cherry-picked it to Ubuntu 15.10 (please see the attached patch) and tested it. Could you please add the fix to Ubuntu 14.04.4 as well, not only 15.10?

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 15.10
Release:        15.10
Codename:       wily

# uname -a
Linux dev-h-vrt-006 4.2.0-16-generic #19-Ubuntu SMP Thu Oct 8 15:35:06 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: Incomplete

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1517919

Title:
  Mlx5 EN - Ethtool link speed setting fixes

Status in linux package in Ubuntu:
  Incomplete

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1517919/+subscriptions
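The first point of the commit message (speed settings take effect only on an admin DOWN to UP transition) can be shown with a toy state model. This is a sketch, not the driver's implementation: ToyPort and its methods are hypothetical names, and the speed values are illustrative numbers that are not taken from the report.

```python
class ToyPort:
    """Toy port whose requested speed is latched only on admin DOWN -> UP."""

    def __init__(self, speed: int):
        self.admin_up = True
        self.requested_speed = speed
        self.active_speed = speed

    def set_admin(self, up: bool) -> None:
        # The device applies the requested speed only when the admin
        # status goes from DOWN to UP.
        if up and not self.admin_up:
            self.active_speed = self.requested_speed
        self.admin_up = up

    def set_speed_without_toggle(self, speed: int) -> None:
        """Models the reported behaviour: the request is stored but not applied."""
        self.requested_speed = speed

    def set_speed_with_toggle(self, speed: int) -> None:
        """Models the fix: force a DOWN -> UP transition after the request."""
        self.requested_speed = speed
        self.set_admin(False)
        self.set_admin(True)


if __name__ == "__main__":
    before = ToyPort(40000)  # illustrative speed value
    before.set_speed_without_toggle(10000)
    after = ToyPort(40000)
    after.set_speed_with_toggle(10000)
    print(before.active_speed, after.active_speed)
```

In the model, the port that never toggles its admin state keeps reporting the old speed, which is the symptom in the reproduction steps, while the port that is forced through DOWN and UP picks up the new value.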