Re: [net][PATCH 5/5] rds: avoid version downgrade to legitimate newer peer connections

2019-07-10 Thread Yanjun Zhu
h can result in no connection. For example, a peer-initiated connection with, say, tos 8 can get downgraded to tos 0 during the usual connection racing, which is not desirable. The patch fixes the above issue, introduced by commit d021fabf525f ("rds: rdma: add consumer reject") Reported-by: Yanjun

Re: [net][PATCH 4/5] rds: Return proper "tos" value to user-space

2019-07-10 Thread Yanjun Zhu
On 2019/7/10 13:32, Santosh Shilimkar wrote: From: Gerd Rausch The proper "tos" value needs to be returned to user-space (sockopt RDS_INFO_CONNECTIONS). Fixes: 3eb450367d08 ("rds: add type of service(tos) infrastructure") Signed-off-by: Gerd Rausch Reviewed-by: Zhu Yanjun Thanks. I am OK

Re: [net][PATCH 3/5] rds: Accept peer connection reject messages due to incompatible version

2019-07-10 Thread Yanjun Zhu
On 2019/7/10 13:32, Santosh Shilimkar wrote: From: Gerd Rausch Prior to commit d021fabf525ff ("rds: rdma: add consumer reject") function "rds_rdma_cm_event_handler_cmn" would always honor a rejected connection attempt by issuing a "rds_conn_drop". The commit mentioned above added a "break",

Re: [net][PATCH 1/5] rds: fix reordering with composite message notification

2019-07-10 Thread Yanjun Zhu
On 2019/7/10 13:32, Santosh Shilimkar wrote: RDS composite message(rdma + control) user notification needs to be triggered once the full message is delivered and such a fix was added as part of commit 941f8d55f6d61 ("RDS: RDMA: Fix the composite message user notification"). But rds_send_remove_

Re: [PATCH 0/2] forcedeth: recv cache support

2019-07-09 Thread Yanjun Zhu
On 2019/7/9 6:23, David Miller wrote: From: Zhu Yanjun Date: Fri, 5 Jul 2019 02:19:26 -0400 This recv cache is to make NIC work steadily when the system memory is not enough. The system is supposed to hold onto enough atomic memory to absorb all reasonable situations like this. If anythin

Re: [PATCH 1/2] forcedeth: add recv cache make nic work steadily

2019-07-09 Thread Yanjun Zhu
On 2019/7/9 4:52, Jakub Kicinski wrote: On Fri, 5 Jul 2019 02:19:27 -0400, Zhu Yanjun wrote: A recv cache is added. The size of recv cache is 1000Mb / skb_length. When the system memory is not enough, this recv cache can make nic work steadily. When nic is up, this recv cache and work queue a

Re: [PATCH 1/1] net: rds: fix memory leak when unload rds_rdma

2019-06-03 Thread Yanjun Zhu
87.926043] Unregistered RDS/infiniband transport " So IMO, this commit fixes this problem. The root cause is in the commit log. Zhu Yanjun On 2019/6/3 20:43, Yanjun Zhu wrote: Sorry. Add Håkon Bugge He told me to notice the memory leak when caches are freed. Zhu Yanjun On 2019/6/3 20

Re: [PATCH 1/1] net: rds: fix memory leak when unload rds_rdma

2019-06-03 Thread Yanjun Zhu
Sorry. Add Håkon Bugge He told me to notice the memory leak when caches are freed. Zhu Yanjun On 2019/6/3 20:48, Zhu Yanjun wrote: When KASAN is enabled, after several rds connections are created, then "rmmod rds_rdma" is run. The following will appear. " BUG rds_ib_incoming (Not tainted): O

Re: [PATCH 1/1] net: rds: add per rds connection cache statistics

2019-06-02 Thread Yanjun Zhu
On 2019/6/3 11:03, santosh.shilim...@oracle.com wrote: On 6/1/19 12:54 AM, Zhu Yanjun wrote: The variable cache_allocs is to indicate how many frags (KiB) are in one rds connection frag cache. The command "rds-info -Iv" will output the rds connection cache statistics as below: " RDS IB Connect

Re: [net-next][PATCH 1/2] rds: handle unsupported rdma request to fs dax memory

2019-04-25 Thread Yanjun Zhu
On 2019/4/26 8:44, Santosh Shilimkar wrote: From: Hans Westgaard Ry RDS doesn't support RDMA on memory apertures that require On Demand Paging (ODP), such as FS DAX memory. User applications can try to use RDS to perform RDMA over such memories and since it doesn't report any failure, it can

Re: [net-next][PATCH 2/2] rds: add sysctl for rds support of On-Demand-Paging

2019-04-25 Thread Yanjun Zhu
On 2019/4/26 8:44, Santosh Shilimkar wrote: RDS doesn't support RDMA on memory apertures that require On Demand Paging (ODP), such as FS DAX memory. A sysctl is added to indicate whether RDMA requiring ODP is supported. Reviewed-by: Håkon Bugge Reviewed-tested-by: Zhu Yanjun Thanks, Santos

Re: [net-next PATCH] net/rds: Return proper "tos" value to user-space

2019-03-07 Thread Yanjun Zhu
On 2019/3/8 6:01, Gerd Rausch wrote: The proper "tos" value needs to be returned to user-space (sockopt RDS_INFO_CONNECTIONS). Fixes: 3eb450367d08 ("rds: add type of service(tos) infrastructure") Signed-off-by: Gerd Rausch In RDS/IB, tos is set in this function. Do you still use RoCE device

Re: [net-next PATCH v2] net/rds: Accept peer connection reject messages due to incompatible version

2019-03-06 Thread Yanjun Zhu
On 2019/3/7 10:09, Yanjun Zhu wrote: On 2019/3/7 9:55, Santosh Shilimkar wrote: On 3/6/2019 5:49 PM, Gerd Rausch wrote: Prior to commit d021fabf525ff ("rds: rdma: add consumer reject") function "rds_rdma_cm_event_handler_cmn" would always honor a rejected connection

Re: [net-next PATCH v2] net/rds: Accept peer connection reject messages due to incompatible version

2019-03-06 Thread Yanjun Zhu
On 2019/3/7 9:55, Santosh Shilimkar wrote: On 3/6/2019 5:49 PM, Gerd Rausch wrote: Prior to commit d021fabf525ff ("rds: rdma: add consumer reject") function "rds_rdma_cm_event_handler_cmn" would always honor a rejected connection attempt by issuing a "rds_conn_drop". The commit mentioned a

Re: [PATCH] net/rds: Accept peer connection reject messages due to incompatible version

2019-03-06 Thread Yanjun Zhu
On 2019/3/6 15:04, Gerd Rausch wrote: Prior to commit d021fabf525ff ("rds: rdma: add consumer reject") function "rds_rdma_cm_event_handler_cmn" would always honor a rejected connection attempt by issuing a "rds_conn_drop". The commit mentioned above added a "break", eliminating the "fallthrou

Re: [net-next][PATCH 5/5] rds: rdma: update rdma transport for tos

2019-03-05 Thread Yanjun Zhu
On 2019/3/6 0:48, Gerd Rausch wrote: Hi Santosh, On 05/03/2019 08.41, Santosh Shilimkar wrote: On 3/5/2019 8:33 AM, Gerd Rausch wrote: If there's a mechanism that ensures compatibility with older (pre-4.1) versions of RDS I am not seeing it. Thats handled as part of the connection reject ha

Re: [PATCH net] net: sun: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

2019-02-06 Thread Yanjun Zhu
On 2019/2/6 0:19, Yang Wei wrote: From: Yang Wei dev_consume_skb_irq() should be called when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Thanks a lot. I am OK. Zhu Yanjun Signed-off-by: Yang Wei --- drivers/net/ethernet/sun/cassini.c | 2 +- drivers/net/eth

Re: [PATCH] net: sun: cassini: Cleanup license conflict

2019-01-21 Thread Yanjun Zhu
On 2019/1/19 0:30, Shannon Nelson wrote: On Fri, Jan 18, 2019 at 2:51 AM Thomas Gleixner wrote: The recent addition of SPDX license identifiers to the files in drivers/net/ethernet/sun created a licensing conflict. The cassini driver files contain a proper license notice: * This program

Re: [PATCH] net: nvidia: forcedeth: Fix two possible concurrency use-after-free bugs

2019-01-08 Thread Yanjun Zhu
On 2019/1/9 11:20, Jia-Ju Bai wrote: On 2019/1/9 10:35, Yanjun Zhu wrote: On 2019/1/9 10:03, Jia-Ju Bai wrote: On 2019/1/9 9:24, Yanjun Zhu wrote: On 2019/1/8 20:57, Jia-Ju Bai wrote: On 2019/1/8 20:54, Zhu Yanjun wrote: On 2019/1/8 20:45, Jia-Ju Bai wrote: In drivers/net/ethernet

Re: [PATCH] net: nvidia: forcedeth: Fix two possible concurrency use-after-free bugs

2019-01-08 Thread Yanjun Zhu
On 2019/1/9 10:03, Jia-Ju Bai wrote: On 2019/1/9 9:24, Yanjun Zhu wrote: On 2019/1/8 20:57, Jia-Ju Bai wrote: On 2019/1/8 20:54, Zhu Yanjun wrote: On 2019/1/8 20:45, Jia-Ju Bai wrote: In drivers/net/ethernet/nvidia/forcedeth.c, the functions nv_start_xmit() and nv_start_xmit_optimized

Re: [PATCH] net: nvidia: forcedeth: Fix two possible concurrency use-after-free bugs

2019-01-08 Thread Yanjun Zhu
On 2019/1/8 20:57, Jia-Ju Bai wrote: On 2019/1/8 20:54, Zhu Yanjun wrote: On 2019/1/8 20:45, Jia-Ju Bai wrote: In drivers/net/ethernet/nvidia/forcedeth.c, the functions nv_start_xmit() and nv_start_xmit_optimized() can be concurrently executed with nv_poll_controller(). nv_start_xmit    line

Re: KASAN: null-ptr-deref Read in rds_ib_get_mr

2018-05-11 Thread Yanjun Zhu
On 2018/5/12 0:58, Santosh Shilimkar wrote: On 5/11/2018 12:48 AM, Yanjun Zhu wrote: On 2018/5/11 13:20, DaeRyong Jeong wrote: We report the crash: KASAN: null-ptr-deref Read in rds_ib_get_mr Note that this bug is previously reported by syzkaller. https://syzkaller.appspot.com/bug?id

Re: [rds-devel] KASAN: null-ptr-deref Read in rds_ib_get_mr

2018-05-11 Thread Yanjun Zhu
On 2018/5/11 18:46, Sowmini Varadhan wrote: On (05/11/18 15:48), Yanjun Zhu wrote: diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c index e678699..2228b50 100644 --- a/net/rds/ib_rdma.c +++ b/net/rds/ib_rdma.c @@ -539,11 +539,17 @@ void rds_ib_flush_mrs(void) void *rds_ib_get_mr(struct

Re: KASAN: null-ptr-deref Read in rds_ib_get_mr

2018-05-11 Thread Yanjun Zhu
On 2018/5/11 13:20, DaeRyong Jeong wrote: We report the crash: KASAN: null-ptr-deref Read in rds_ib_get_mr Note that this bug was previously reported by syzkaller. https://syzkaller.appspot.com/bug?id=0bb56a5a48b000b52aa2b0d8dd20b1f545214d91 Nonetheless, this bug has not been fixed yet, and we hope

Re: [PATCH] mlx4_core: allocate 4KB ICM chunks

2018-05-10 Thread Yanjun Zhu
On 2018/5/11 7:31, Qing Huang wrote: When a system is under memory pressure (high usage with fragments), the original 256KB ICM chunk allocations will likely trigger kernel memory management to enter the slow path, doing memory compaction/migration ops in order to complete high-order memory allocations.

Re: [PATCHv2 1/1] IB/rxe: avoid double kfree_skb

2018-04-25 Thread Yanjun Zhu
Add netdev@vger.kernel.org On 2018/4/26 12:41, Zhu Yanjun wrote: When skb is sent, it will pass the following functions in soft roce. rxe_send [rdma_rxe] ip_local_out __ip_local_out ip_output ip_finish_output ip_finish_output2

Re: [PATCH 1/1] IB/rxe: avoid double kfree_skb

2018-04-24 Thread Yanjun Zhu
kfree_skb in soft roce module again. If I am wrong, please correct me. Zhu Yanjun On 2018/4/24 16:34, Yanjun Zhu wrote: Hi, all rxe_send     ip_local_out         __ip_local_out             nf_hook_slow In the above call process, nf_hook_slow drops and frees skb, then -EPERM is returned when

Re: [PATCH 1/1] IB/rxe: avoid double kfree_skb

2018-04-24 Thread Yanjun Zhu
oce, kfree_skb should not be called in this module. I will make further investigations about other error handler after ip_local_out. If I am wrong, please correct me. Any reply is appreciated. Zhu Yanjun On 2018/4/20 13:46, Yanjun Zhu wrote: On 2018/4/20 10:19, Doug Ledford wrote: On Thu, 2018-04

Re: [PATCH 1/1] IB/rxe: avoid double kfree_skb

2018-04-19 Thread Yanjun Zhu
On 2018/4/20 10:19, Doug Ledford wrote: On Thu, 2018-04-19 at 10:01 -0400, Zhu Yanjun wrote: When skb is dropped by iptables rules, the skb is freed at the same time -EPERM is returned. So in softroce, it is not necessary to free skb again. Or else, crash will occur. The steps to reproduce:

Re: [PATCH 1/1] net/mlx4_core: avoid resetting HCA when accessing an offline device

2018-04-17 Thread Yanjun Zhu
On 2018/4/17 23:37, Tariq Toukan wrote: On 16/04/2018 4:02 AM, Zhu Yanjun wrote: While a faulty cable is used or HCA firmware error, HCA device will be offline. When the driver is accessing this offline device, the following call trace will pop out. " ...    [] dump_stack+0x63/0x81    [] pa

Re: [PATCH next-queue 2/2] ixgbe: add unlikely notes to tx fastpath expressions

2018-01-18 Thread Yanjun Zhu
On 2018/1/9 6:47, Shannon Nelson wrote: Add unlikely() to a few error checking expressions in the Tx offload handling. Suggested-by: Yanjun Zhu Hi, I am fine with this patch. I have a question. The ipsec feature is supported in ixgbevf? Thanks a lot. Zhu Yanjun Signed-off-by: Shannon

Re: [PATCH v3 next-queue 08/10] ixgbe: process the Tx ipsec offload

2017-12-22 Thread Yanjun Zhu
On 2017/12/20 8:00, Shannon Nelson wrote: If the skb has a security association referenced in the skb, then set up the Tx descriptor with the ipsec offload bits. While we're here, we fix an oddly named field in the context descriptor struct. v3: added ifdef CONFIG_XFRM_OFFLOAD check around ca

Re: [PATCH v3 next-queue 00/10] ixgbe: Add ipsec offload

2017-12-20 Thread Yanjun Zhu
On 2017/12/21 14:39, Yanjun Zhu wrote: On 2017/12/20 7:59, Shannon Nelson wrote: This is an implementation of the ipsec hardware offload feature for the ixgbe driver and Intel's 10Gbe series NICs: x540, x550, 82599. Hi, Nelson I notice that the ipsec feature is based on x540, x550,

Re: [PATCH v3 next-queue 00/10] ixgbe: Add ipsec offload

2017-12-20 Thread Yanjun Zhu
On 2017/12/20 7:59, Shannon Nelson wrote: This is an implementation of the ipsec hardware offload feature for the ixgbe driver and Intel's 10Gbe series NICs: x540, x550, 82599. Hi, Nelson I notice that the ipsec feature is based on x540, x550, 82599.  But this ixgbe driver will also work wi

Re: [PATCH net-next 1/1] forcedeth: remove unnecessary variable

2017-12-07 Thread Yanjun Zhu
On 2017/12/8 3:07, David Miller wrote: From: Zhu Yanjun Date: Wed, 6 Dec 2017 23:15:15 -0500 Since both tx_ring and first_tx are the head of tx ring, it is not necessary to use two variables. So first_tx is removed. These are not variables, they are structure members. Sure. These 2 structure

Re: [PATCH 1/1] bnx2x: fix slowpath null crash

2017-11-07 Thread Yanjun Zhu
On 2017/11/8 11:27, Elior, Ariel wrote: When "NETDEV WATCHDOG: em4 (bnx2x): transmit queue 2 timed out" occurs, BNX2X_SP_RTNL_TX_TIMEOUT is set. In the function bnx2x_sp_rtnl_task, bnx2x_nic_unload and bnx2x_nic_load are executed to shutdown and open NIC. In the function bnx2x_nic_load, bnx2x_a

Re: [PATCH 1/1] bnx2x: fix slowpath null crash

2017-11-07 Thread Yanjun Zhu
Please ignore this mail. Zhu Yanjun On 2017/11/8 9:58, root wrote: From: Zhu Yanjun When "NETDEV WATCHDOG: em4 (bnx2x): transmit queue 2 timed out" occurs, BNX2X_SP_RTNL_TX_TIMEOUT is set. In the function bnx2x_sp_rtnl_task, bnx2x_nic_unload and bnx2x_nic_load are executed to shutdown and op

Re: [PATCH 1/1] forcedeth: remove tx_stop variable

2017-09-14 Thread Yanjun Zhu
Hi, all After this patch is applied, the TCP && UDP tests are made. The TCP bandwidth is 939 Mbits/sec. The UDP bandwidth is 806 Mbits/sec. So I think this patch can work well. host1 <-> host2 host1: forcedeth NIC IP: 1.1.1.107 iperf -s host2: forcedeth NIC IP:1.1.1.105 iperf -c 1.1.1.10

Re: [PATCH 1/5] rds: tcp: release the created connection

2017-03-27 Thread Yanjun Zhu
On 2017/3/27 15:37, Sowmini Varadhan wrote: On (03/27/17 03:06), Zhu Yanjun wrote: Date: Mon, 27 Mar 2017 03:06:26 -0400 From: Zhu Yanjun To: yanjun@oracle.com, santosh.shilim...@oracle.com, netdev@vger.kernel.org, linux-r...@vger.kernel.org, rds-de...@oss.oracle.com, junxiao...@oracl

Re: [PATCHv3 1/4] rds: ib: drop unnecessary rdma_reject

2017-03-12 Thread Yanjun Zhu
On 2017/3/13 14:32, Leon Romanovsky wrote: On Mon, Mar 13, 2017 at 01:43:45AM -0400, Zhu Yanjun wrote: When rdma_accept fails, rdma_reject is called in it. As such, it is not necessary to execute rdma_reject again. It is not always correct, according to the code, rdma_accept can fail and will

Re: [PATCHv2 1/4] rds: ib: drop unnecessary rdma_reject

2017-03-12 Thread Yanjun Zhu
On 2017/3/13 3:43, santosh.shilim...@oracle.com wrote: On 3/12/17 12:33 PM, Leon Romanovsky wrote: On Sun, Mar 12, 2017 at 04:07:55AM -0400, Zhu Yanjun wrote: When rdma_accept fails, rdma_reject is called in it. As such, it is not necessary to execute rdma_reject again. Cc: Joe Jin Cc: Junx

Re: [PATCH 2/5] rds: ib: replace spin_lock_irq with spin_lock_irqsave

2017-03-11 Thread Yanjun Zhu
Sorry. I have no test case that shows an issue. But according to Linux Kernel Development, Second Edition, by Robert Love, using spin_lock_irq is dangerous since spin_unlock_irq unconditionally enables interrupts. We can assume the following scenario: --->the interrupt is disabled. spin_lock_irq(

Re: [PATCH 1/1] ixgbe: add the external ixgbe fiber transceiver status

2017-02-09 Thread Yanjun Zhu
On 2017/2/10 3:08, Tantilov, Emil S wrote: -Original Message- From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On Behalf Of Zhu Yanjun Sent: Wednesday, February 08, 2017 7:03 PM To: Kirsher, Jeffrey T ; broo...@kernel.org; da...@davemloft.net; intel-wired-...@lis

[PATCHv2 1/1] r8169: fix the typo in the comment

2017-01-05 Thread yanjun . zhu
From: Zhu Yanjun From the realtek data sheet, the PID0 should be bit 0. Signed-off-by: Zhu Yanjun --- Change from v1 to v2: change the commit header. drivers/net/ethernet/realtek/r8169.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/realtek/r8169.

Re: [PATCH 1/1] r8169: fix the typo

2016-12-29 Thread Yanjun Zhu
Hi, Please comment on this patch. Zhu Yanjun On 2016/12/29 11:11, Zhu Yanjun wrote: From the realtek data sheet, the PID0 should be bit 0. Signed-off-by: Zhu Yanjun --- drivers/net/ethernet/realtek/r8169.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/et