On 17/02/2021 19:00, David Ahern wrote:
> On 2/17/21 7:01 AM, Or Gerlitz wrote:
@@ -1136,6 +1265,10 @@ static int nvme_tcp_try_send_cmd_pdu(struct
nvme_tcp_request *req)
else
flags |= MSG_EOR;
+ if (test_bit(NVME_TCP_Q_OFF_DDP, &queue->flags) &
On 17/02/2021 15:55, Or Gerlitz wrote:
> On Sun, Feb 14, 2021 at 8:20 PM David Ahern wrote:
>> On 2/11/21 2:10 PM, Boris Pismenny wrote:
>>> @@ -223,6 +229,164 @@ static inline size_t nvme_tcp_pdu_last_send(struct
>>> nvme_tcp_request *req,
>>> retur
On 11/02/2021 23:32, Randy Dunlap wrote:
>
> Hi,
> Did vger.kernel.org eat patch 21/21?
>
> and does that patch contain the Documentation updates?
>
> thanks.
>
It seems the error was on my end, thanks for raising this, I've resent
that patch now.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
Documentation/networking/index.rst | 1 +
Documentation/networking/tcp-ddp-offload.rst | 296 +++
2 files changed, 297 insertions(+)
create mode
invalidate ddp requests.
- Query nvmeotcp capabilities
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net/ethernet/mellanox/mlx5/core/Kconfig | 10 +
.../net/ethernet/mellanox/mlx5/core/Makefile | 2 +
drivers/net/ethernet
From: Ben Ben-Ishay
NVMEoTCP offload statistics include both control and data path
statistics: counters for ndo, offloaded packets/bytes, dropped packets
and resync operations.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
y with skb_condense's policy, but filling this hole is
counter-productive as the data there already resides in its
destination buffer.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net/ethernet/mellanox/mlx5/core/Make
From: Ben Ben-Ishay
NVMEoTCP offload uses buffer registration for every NVME request to
perform direct data placement. The registration is done via KLM UMR
WQEs. The driver resync handler advertises the software resync response
via a static params WQE.
Signed-off-by: Boris Pismenny
Signe
registration
- Maintain static and progress HW contexts by posting the proper
WQEs at creation time, or upon resync
Queue teardown will free the corresponding contexts.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net
From: Ben Ben-Ishay
NVMEoTCP offload uses buffer registration for DDP operation.
Every request comprises an SG list that might have elements larger than 4K,
so the appropriate way to perform buffer registration is with KLM UMRs.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Is
From: Ben Ben-Ishay
Add helper macros for posting KLM UMR WQE.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 18 ++
1 file changed, 18 insertions(+)
diff
From: Ben Ben-Ishay
Teardown ddp contexts asynchronously by posting a WQE, and calling back
to nvme-tcp when the corresponding CQE is received.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net/ethernet/mellanox/mlx5
Both nvme-tcp and tls require tcp flow steering. Compile it for both of
them. Additionally, use reference counting to allocate/free TCP flow
steering.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/net/ethernet
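The reference-counted allocate/free described in the commit message above
could look roughly like the sketch below. This is an illustration only, not
the mlx5 code: the struct, its fields and the get/put helper names are
hypothetical, the table setup is reduced to a flag, and no locking is shown
(a real driver would need to serialize these calls).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-device TCP flow steering state. */
struct fs_tcp_sketch {
	int refcnt;        /* number of active users (e.g. tls, nvme-tcp) */
	bool tables_ready; /* true while the steering tables are allocated */
};

/* Take a reference; allocate the tables only for the first user. */
static int fs_tcp_get(struct fs_tcp_sketch *fs)
{
	if (fs->refcnt++ == 0)
		fs->tables_ready = true; /* placeholder for real table creation */
	return 0;
}

/* Drop a reference; free the tables when the last user goes away. */
static void fs_tcp_put(struct fs_tcp_sketch *fs)
{
	if (fs->refcnt > 0 && --fs->refcnt == 0)
		fs->tables_ready = false; /* placeholder for real table teardown */
}
```

With this shape, whichever of the two users comes up first creates the
tables, and they stay in place until both users have released them.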
information to SW
- Add new capability to HCA_CAP that represents the NVMEoTCP offload ability
Signed-off-by: Ben Ben-ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/mlx5/device.h | 8 +++
include/linux/mlx5/mlx5_ifc.h | 101
From: Ben Ben-ishay
Add the NVMEoTCP offload definition and access functions for 128B CQEs.
Signed-off-by: Ben Ben-ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/mlx5/device.h | 36 +++-
1 file
non-offload. This change simplifies the code, but it may
degrade performance for non-offload crc calculation.
Signed-off-by: Yoray Zack
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
---
drivers/nvme/host/tcp.c | 86
in the UP state, and down is always
there between up to unregister.
Signed-off-by: Or Gerlitz
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Yoray Zack
---
drivers/nvme/host/tcp.c | 39 +++
1 file changed, 39 insertions(+)
diff
), which will update the HW,
and resume offload when all is successful.
Furthermore, we let the offloading driver advertise what is the max hw
sectors/segments via tcp_ddp_limits.
A follow-up patch introduces the data-path changes required for this
offload.
Signed-off-by: Boris Pismenny
Signed-off-by
follows:
NVMe-TCP performs the specific completion, while NIC driver performs the
generic mq_blk completion.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/nvme/host/tcp.c | 158
This commit introduces new functions to support direct data placement
(DDP) NIC offloads that avoid copying data from SKBs.
Later patches will use this for nvme-tcp DDP offload.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
get_netdev_for_sock is a utility that is used to obtain
the net_device structure from a connected socket.
Later patches will use this for nvme-tcp DDP and DDP CRC offloads.
Signed-off-by: Boris Pismenny
Reviewed-by: Sagi Grimberg
---
include/net/sock.h | 17 +
net/tls
c
net/mlx5e: NVMEoTCP, data-path for DDP+CRC offload
net/mlx5e: NVMEoTCP statistics
Ben Ben-ishay (2):
net/mlx5: Header file changes for nvme-tcp offload
net/mlx5: Add 128B CQE for NVMEoTCP offload
Boris Pismenny (9):
net: Introduce direct data placement tcp offload
net: Introduce cr
subsequent series, will add NVMe-TCP transmit side CRC support.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Reviewed-by: Sagi Grimberg
---
include/linux/netdev_features.h | 2 ++
include/linux/netdevice.h | 2 +-
include/linux
(src == dst), and skip the copy when that's true.
As the current user of these routines is in the block layer (nvme-tcp),
we apply the change only for bio_vec. Other routines use the normal
methods for copying.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-b
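The (src == dst) check described above can be illustrated with a minimal
userspace sketch. The helper name is hypothetical and this is not the
iov_iter code from the patch; it only shows the idea that a copy is skipped
when the payload already sits in its destination buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the kernel routine: copy len bytes unless the
 * source and destination are the same address, in which case the data is
 * already in place and the copy is skipped.
 */
static void *copy_unless_in_place(void *dst, const void *src, size_t len)
{
	if (dst == src)
		return dst; /* payload was already placed here */
	return memcpy(dst, src, len);
}
```

The check costs one pointer comparison per call, and it also avoids calling
memcpy with overlapping (identical) buffers.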
avoid needless copies, such as when using
skb_condense, we mark the skb->ddp_crc bit. This bit will be
used to indicate both ddp and crc offload (next patch in series).
A follow-up patch will use this interface for DDP in NVMe-TCP.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Sig
On 03/02/2021 18:56, Christoph Hellwig wrote:
> On Tue, Feb 02, 2021 at 08:00:51PM +0200, Or Gerlitz wrote:
>> will look into this, any idea for a more suitable location?
>
> Maybe just a new file under lib/ for now?
>
That doesn't work unless we copy quite a lot of code. There are macros
here (
On 03/02/2021 21:34, Ira Weiny wrote:
> On Wed, Feb 03, 2021 at 05:56:21PM +0100, Christoph Hellwig wrote:
>> On Tue, Feb 02, 2021 at 08:00:51PM +0200, Or Gerlitz wrote:
>>> will look into this, any idea for a more suitable location?
>>
>> Maybe just a new file under lib/ for now?
>>
Overly lo
From: Ben Ben-ishay
Add the NVMEoTCP offload definition and access functions for 128B CQEs.
Signed-off-by: Ben Ben-ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/mlx5/device.h | 36 +++-
1 file
Both nvme-tcp and tls require tcp flow steering. Compile it for both of
them. Additionally, use reference counting to allocate/free TCP flow
steering.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/net/ethernet
get_netdev_for_sock is a utility that is used to obtain
the net_device structure from a connected socket.
Later patches will use this for nvme-tcp DDP and DDP CRC offloads.
Signed-off-by: Boris Pismenny
Reviewed-by: Sagi Grimberg
---
include/net/sock.h | 17 +
net/tls
in the UP state, and down is always
there between up to unregister.
Signed-off-by: Or Gerlitz
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Yoray Zack
---
drivers/nvme/host/tcp.c | 36
1 file changed, 36 insertions(+)
diff
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
Documentation/networking/index.rst | 1 +
Documentation/networking/tcp-ddp-offload.rst | 296 +++
2 files changed, 297 insertions(+)
create mode
subsequent series, will add NVMe-TCP transmit side CRC support.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Reviewed-by: Sagi Grimberg
---
include/linux/netdev_features.h | 2 ++
include/linux/netdevice.h | 2 +-
include/linux
From: Ben Ben-Ishay
NVMEoTCP offload uses buffer registration for every NVME request to
perform direct data placement. The registration is done via KLM UMR
WQEs. The driver resync handler advertises the software resync response
via a static params WQE.
Signed-off-by: Boris Pismenny
Signe
avoid needless copies, such as when using
skb_condense, we mark the skb->ddp_crc bit. This bit will be
used to indicate both ddp and crc offload (next patch in series).
A follow-up patch will use this interface for DDP in NVMe-TCP.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Sig
follows:
NVMe-TCP performs the specific completion, while NIC driver performs the
generic mq_blk completion.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/nvme/host/tcp.c | 141
information to SW
- Add new capability to HCA_CAP that represents the NVMEoTCP offload ability
Signed-off-by: Ben Ben-ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/mlx5/device.h | 8 +++
include/linux/mlx5/mlx5_ifc.h | 101
net/mlx5e: NVMEoTCP statistics
Ben Ben-ishay (4):
net: SKB copy(+hash) iterators for DDP offloads
nvme-tcp : Recalculate crc in the end of the capsule
net/mlx5: Header file changes for nvme-tcp offload
net/mlx5: Add 128B CQE for NVMEoTCP offload
Boris Pismenny (8):
iov_iter: Introduce n
From: Ben Ben-Ishay
NVMEoTCP offload uses buffer registration for DDP operation.
Every request comprises an SG list that might have elements larger than 4K,
so the appropriate way to perform buffer registration is with KLM UMRs.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Is
From: Ben Ben-Ishay
Add helper macros for posting KLM UMR WQE.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 18 ++
1 file changed, 18 insertions(+)
diff
iter/pages in case that the
source of the copy operation might be identical to the destination.
In such cases the copy is skipped only for bio_vec. Later commits
use those functions to introduce new skb copy(+hash) functions.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by
From: Ben Ben-Ishay
Teardown ddp contexts asynchronously by posting a WQE, and calling back
to nvme-tcp when the corresponding CQE is received.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net/ethernet/mellanox/mlx5
From: Ben Ben-Ishay
NVMEoTCP offload statistics include both control and data path
statistics: counters for ndo, offloaded packets/bytes, dropped packets
and resync operations.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
y with skb_condense's policy, but filling this hole is
counter-productive as the data there already resides in its
destination buffer.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net/ethernet/mellanox/mlx5/core/Make
), which will update the HW,
and resume offload when all is successful.
Furthermore, we let the offloading driver advertise what is the max hw
sectors/segments via tcp_ddp_limits.
A follow-up patch introduces the data-path changes required for this
offload.
Signed-off-by: Boris Pismenny
Signed-off-by
that
the destination buffer is represented by bio_vec.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/skbuff.h | 5 +
net/core/datagram.c | 44 ++
2 files changed, 49
registration
- Maintain static and progress HW contexts by posting the proper
WQEs at creation time, or upon resync
Queue teardown will free the corresponding contexts.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net
invalidate ddp requests.
- Query nvmeotcp capabilities
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net/ethernet/mellanox/mlx5/core/Kconfig | 10 +
.../net/ethernet/mellanox/mlx5/core/Makefile | 2 +
drivers/net/ethernet
simplifies the code, but it may degrade
performance for non-offload crc calculation.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/nvme/host/tcp.c | 118
1 file changed, 95
On 14/01/2021 22:43, Eric Dumazet wrote:
> On Thu, Jan 14, 2021 at 9:19 PM Boris Pismenny
> wrote:
>>
>>
>>
>> On 14/01/2021 17:57, Eric Dumazet wrote:
>>> On Thu, Jan 14, 2021 at 4:10 PM Boris Pismenny wrote:
>>>>
>>>> This com
On 19/01/2021 6:36, David Ahern wrote:
> On 1/17/21 1:42 AM, Boris Pismenny wrote:
>> This is needed for a few reasons that are explained in detail
>> in the tcp-ddp offload documentation. See patch 21 overview
>> and rx-data-path sections. Our reasons are as follows:
>
>
On 19/01/2021 6:18, David Ahern wrote:
> On 1/14/21 8:10 AM, Boris Pismenny wrote:
>> @@ -664,8 +753,15 @@ static int nvme_tcp_process_nvme_cqe(struct
>> nvme_tcp_queue *queue,
>> return -EINVAL;
>> }
>>
>> -if (!nvme_try_co
On 19/01/2021 5:47, David Ahern wrote:
> On 1/14/21 8:10 AM, Boris Pismenny wrote:
>> +static
>> +int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue)
>> +{
>> +struct net_device *netdev = get_netdev_for_sock(queue->sock->sk, true);
>>
On 16/01/2021 6:57, David Ahern wrote:
> I have not had time to review this version of the patches, but this
> patch seems very similar to 13 of 15 from v1 and you did not respond to
> my question on it ...
>
> On 1/14/21 8:10 AM, Boris Pismenny wrote:
>> diff --git
>
On 14/01/2021 17:57, Eric Dumazet wrote:
> On Thu, Jan 14, 2021 at 4:10 PM Boris Pismenny wrote:
>>
>> This commit introduces direct data placement offload for TCP.
>> This capability is accompanied by new net_device operations that
>> configure hardware contexts. Th
On 14/01/2021 6:47, David Ahern wrote:
> On 1/13/21 6:27 PM, Sagi Grimberg wrote:
>>> Changes since RFC v1:
>>> =
>>> * Split mlx5 driver patches to several commits
>>> * Fix nvme-tcp handling of recovery flows. In particular, move queue
>>> offload
>>>
On 14/01/2021 3:27, Sagi Grimberg wrote:
> Hey Boris, sorry for some delays on my end...
>
> I saw some long discussions on this set with David, what is
> the status here?
>
The main purpose of this series is to address these.
> I'll take some more look into the patches, but if you
> addressed
From: Ben Ben-Ishay
NVMEoTCP direct data placement constructs an SKB from each CQE, while
pointing at NVME buffers.
This enables the offload, as the NVMe-TCP layer will skip the copy when
src == dst.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed
simplifies the code, but it may degrade
performance for non-offload crc calculation.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/nvme/host/tcp.c | 116
1 file changed, 94
information to SW
- Add new capability to HCA_CAP that represents the NVMEoTCP offload ability
Signed-off-by: Ben Ben-ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/mlx5/device.h | 8 +++
include/linux/mlx5/mlx5_ifc.h | 101
subsequent series, will add NVMe-TCP transmit side CRC support.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Reviewed-by: Sagi Grimberg
---
include/linux/netdev_features.h | 2 ++
include/linux/skbuff.h | 5 +
net/Kconfig
From: Ben Ben-Ishay
Add helper macros for posting KLM UMR WQE.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 18 ++
1 file changed, 18 insertions(+)
diff
follows:
NVMe-TCP performs the specific completion, while NIC driver performs the
generic mq_blk completion.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/nvme/host/tcp.c | 141
From: Ben Ben-Ishay
NVMEoTCP offload uses buffer registration for DDP operation.
Every request comprises an SG list that might have elements larger than 4K,
so the appropriate way to perform buffer registration is with KLM UMRs.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Is
that
the destination buffer is represented by bio_vec.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/skbuff.h | 5 +
net/core/datagram.c | 44 ++
2 files changed, 49
get_netdev_for_sock is a utility that is used to obtain
the net_device structure from a connected socket.
Later patches will use this for nvme-tcp DDP and DDP CRC offloads.
Signed-off-by: Boris Pismenny
Reviewed-by: Sagi Grimberg
---
include/net/sock.h | 17 +
net/tls
iter/pages in case that the
source of the copy operation might be identical to the destination.
In such cases the copy is skipped only for bio_vec. Later commits
use those functions to introduce new skb copy(+hash) functions.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by
From: Ben Ben-ishay
Add the NVMEoTCP offload definition and access functions for 128B CQEs.
Signed-off-by: Ben Ben-ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/mlx5/device.h | 36 +++-
1 file
), which will update the HW,
and resume offload when all is successful.
Furthermore, we let the offloading driver advertise what is the max hw
sectors/segments via tcp_ddp_limits.
A follow-up patch introduces the data-path changes required for this
offload.
Signed-off-by: Boris Pismenny
Signed-off-by
in the UP state, and down is always
there between up to unregister.
Signed-off-by: Or Gerlitz
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Yoray Zack
---
drivers/nvme/host/tcp.c | 36
1 file changed, 36 insertions(+)
diff
registration
- Maintain static and progress HW contexts by posting the proper
WQEs at creation time, or upon resync
Queue teardown will free the corresponding contexts.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net
From: Ben Ben-Ishay
NVMEoTCP offload statistics include both control and data path
statistics: counters for ndo, offloaded packets/bytes, dropped packets
and resync operations.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
already very memory efficient, we modify
skb_condense to avoid copying data from fragments to the linear
part of SKBs that belong to a socket that uses DDP offload.
A follow-up patch will use this interface for DDP in NVMe-TCP.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by
invalidate ddp requests.
- Query nvmeotcp capabilities
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net/ethernet/mellanox/mlx5/core/Kconfig | 11 +
.../net/ethernet/mellanox/mlx5/core/Makefile | 2 +
drivers/net/ethernet
From: Ben Ben-Ishay
NVMEoTCP offload uses buffer registration for every NVME request to perform
direct data placement.
The registration is done via KLM UMR WQEs.
The driver resync handler advertises the software resync response via a static
params WQE.
Signed-off-by: Boris Pismenny
Signe
Add documentation to describe NIC direct data placement offload
for protocol layered over TCP. Use NVMe-TCP as an example.
---
Documentation/networking/index.rst | 1 +
Documentation/networking/tcp-ddp-offload.rst | 296 +++
2 files changed, 297 insertions(+)
create mo
From: Ben Ben-Ishay
Teardown ddp contexts asynchronously by posting a WQE, and calling back
to nvme-tcp when the corresponding CQE is received.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
.../net/ethernet/mellanox/mlx5
Both nvme-tcp and tls require tcp flow steering. Compile it for both of
them. Additionally, use reference counting to allocate/free TCP flow
steering.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/net/ethernet
-path for DDP offload
net/mlx5e: NVMEoTCP statistics
Ben Ben-ishay (4):
net: SKB copy(+hash) iterators for DDP offloads
nvme-tcp : Recalculate crc in the end of the capsule
net/mlx5: Header file changes for nvme-tcp offload
net/mlx5: Add 128B CQE for NVMEoTCP offload
Boris Pismen
On 15/12/2020 7:19, David Ahern wrote:
> On 12/13/20 11:21 AM, Boris Pismenny wrote:
>>>> as zerocopy for the following reasons:
>>>> (1) The former places buffers *exactly* where the user requests
>>>> regardless of the order of response arrivals, while
On 15/12/2020 15:33, Shai Malin wrote:
> On 12/14/2020 08:38, Boris Pismenny wrote:
>> On 10/12/2020 19:15, Shai Malin wrote:
>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index
>>> c0c33320fe65..ef96e4a02bbd 100644
>>> --- a/drivers/nv
On 10/12/2020 19:15, Shai Malin wrote:
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index
> c0c33320fe65..ef96e4a02bbd 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -14,6 +14,7 @@
> #include
> #include
> #include
> +#include
>
> #includ
On 11/12/2020 20:45, Jakub Kicinski wrote:
> On Thu, 10 Dec 2020 19:43:57 -0700 David Ahern wrote:
>> On 12/10/20 7:01 PM, Jakub Kicinski wrote:
>>> On Wed, 9 Dec 2020 21:26:05 -0700 David Ahern wrote:
Yes, TCP is a byte stream, so the packets could very well show up like
this:
On 10/12/2020 6:26, David Ahern wrote:
> On 12/9/20 1:15 AM, Boris Pismenny wrote:
>> On 09/12/2020 2:38, David Ahern wrote:
[...]
>>
>> There is more to this than TCP zerocopy that exists in userspace or
>> inside the kernel. First, please note that the patches include
On 10/12/2020 5:39, David Ahern wrote:
> On 12/9/20 12:41 AM, Boris Pismenny wrote:
>
>> is applied there is relevant here. More generally, this offload is
>> very similar in concept to TLS offload (tls_device).
>>
>
> I disagree with the TLS comparison. As an
On 09/12/2020 3:11, David Ahern wrote:
> On 12/8/20 5:57 PM, David Ahern wrote:
>>> diff --git a/include/net/inet_connection_sock.h
>>> b/include/net/inet_connection_sock.h
>>> index 7338b3865a2a..a08b85b53aa8 100644
>>> --- a/include/net/inet_connection_sock.h
>>> +++ b/include/net/inet_conne
On 09/12/2020 2:57, David Ahern wrote:
> On 12/7/20 2:06 PM, Boris Pismenny wrote:
>> diff --git a/include/linux/netdev_features.h
>> b/include/linux/netdev_features.h
>> index 934de56644e7..fb35dcac03d2 100644
>> --- a/include/linux/netdev_features.h
>> +++
On 09/12/2020 2:38, David Ahern wrote:
>
> The AF_XDP reference was to differentiate one zerocopy use case (all
> packets go to userspace) from another (kernel managed TCP socket with
> zerocopy payload). You are focusing on a very narrow use case - kernel
> based NVMe over TCP - of a more general
On 09/12/2020 3:06, David Ahern wrote:
> On 12/7/20 2:06 PM, Boris Pismenny wrote:
>> get_netdev_for_sock is a utility that is used to obtain
>> the net_device structure from a connected socket.
>>
>> Later patches will use this for nvme-tcp DDP and DDP CRC offloads.
On 08/12/2020 2:42, David Ahern wrote:
> On 12/7/20 2:06 PM, Boris Pismenny wrote:
>> This commit introduces direct data placement offload for TCP.
>> This capability is accompanied by new net_device operations that
>> configure
>> hardware contexts. There is a contex
On 08/12/2020 2:39, David Ahern wrote:
> On 12/7/20 2:06 PM, Boris Pismenny wrote:
>> When using direct data placement the NIC writes some of the payload
>> directly to the destination buffer, and constructs the SKB such that it
>> points to this data. As a result, the s
(tcp_ddp_resync), which will update the HW,
and resume offload when all is successful.
Furthermore, we let the offloading driver advertise what is the max hw
sectors/segments via tcp_ddp_limits.
A follow-up patch introduces the data-path changes required for this
offload.
Signed-off-by: Boris
From: Ben Ben-ishay
NVMEoTCP direct data placement constructs an SKB from each CQE, while
pointing at NVME buffers.
This enables the offload, as the NVMe-TCP layer will skip the copy when
src == dst.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed
information to SW
- Add new capability to HCA_CAP that represents the NVMEoTCP offload ability
Signed-off-by: Ben Ben-ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/mlx5/device.h | 8 +++
include/linux/mlx5/mlx5_ifc.h | 104
Both nvme-tcp and tls require tcp flow steering. Compile it for both of
them. Additionally, use reference counting to allocate/free TCP flow
steering.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
drivers/net/ethernet
subsequent series, will add NVMe-TCP transmit side CRC support.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Reviewed-by: Sagi Grimberg
---
include/linux/netdev_features.h | 2 ++
include/linux/skbuff.h | 5 +
net/Kconfig
From: Yoray Zack
The nvme-tcp crc computed over the first packet after resync may provide
the wrong signal when the packet contains multiple PDUs. We work around
that by ignoring the cqe->nvmeotcp_crc signal for the first packet after
resync.
Signed-off-by: Yoray Zack
Signed-off-by: Bo
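The workaround described above, ignoring the hardware CRC indication for the
first packet after a resync, might be sketched as follows. The struct, field
and function names are hypothetical and do not come from the patch; only the
policy (do not trust the signal once, then resume) is taken from the commit
message.

```c
#include <stdbool.h>

/* Hypothetical per-queue state; the field name is illustrative only. */
struct ddp_queue_sketch {
	bool first_pkt_after_resync;
};

/*
 * Return true if the hardware CRC indication can be trusted for this
 * packet. The first packet after a resync is never trusted; the flag is
 * cleared so that later packets use the hardware result again.
 */
static bool hw_crc_trusted(struct ddp_queue_sketch *q, bool hw_crc_ok)
{
	if (q->first_pkt_after_resync) {
		q->first_pkt_after_resync = false; /* applies to one packet only */
		return false;
	}
	return hw_crc_ok;
}
```

When this returns false the caller would presumably fall back to verifying
the digest in software; that fallback is an assumption and is not shown here.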
in the UP state, and down is always
there between up to unregister.
Signed-off-by: Or Gerlitz
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Yoray Zack
---
drivers/nvme/host/tcp.c | 36
1 file changed, 36 insertions(+)
diff
From: Ben Ben-ishay
Add the NVMEoTCP offload definition and access functions for 128B CQEs.
Signed-off-by: Ben Ben-ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/mlx5/device.h | 35 ++-
1 file
copy, and a static_key to enable
it when TCP direct data placement is possible.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
---
include/linux/uio.h | 2 ++
lib/iov_iter.c | 11 ++-
2 files changed, 12 insertions