On 1/14/21 8:10 AM, Boris Pismenny wrote:
> @@ -664,8 +753,15 @@ static int nvme_tcp_process_nvme_cqe(struct
> nvme_tcp_queue *queue,
>  		return -EINVAL;
>  	}
>
> -	if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
> -		nvme_complete_rq(rq);
> +	req = blk_mq_rq_to_pdu(rq);
> +	if (req->offloaded) {
> +		req->status = cqe->status;
> +		req->result = cqe->result;
> +		nvme_tcp_teardown_ddp(queue, cqe->command_id, rq);
> +	} else {
> +		if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
> +			nvme_complete_rq(rq);
> +	}
>  	queue->nr_cqe++;
>
>  	return 0;
> @@ -859,9 +955,18 @@ static int nvme_tcp_recv_pdu(struct nvme_tcp_queue
> *queue, struct sk_buff *skb,
>  static inline void nvme_tcp_end_request(struct request *rq, u16 status)
>  {
>  	union nvme_result res = {};
> +	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
> +	struct nvme_tcp_queue *queue = req->queue;
> +	struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu;
>
> -	if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
> -		nvme_complete_rq(rq);
> +	if (req->offloaded) {
> +		req->status = cpu_to_le16(status << 1);
> +		req->result = res;
> +		nvme_tcp_teardown_ddp(queue, pdu->command_id, rq);
> +	} else {
> +		if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
> +			nvme_complete_rq(rq);
> +	}
>  }
>
>  static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff
> *skb,

The req->offloaded checks assume the data was offloaded by the expected offload_netdev, but nothing verifies that the data actually arrived that way. You might get lucky if both netdevs belong to the same PCI device (assuming the h/w handles it a certain way), but it will not work if the netdevs belong to different devices.

Consider a system with 2 network cards -- even if both are mlx5 based devices. One setup can have the system using a bond with 1 port from each PCI device. The tx path picks a leg based on the hash of the ntuple, and that (with Tariq's bond patches) becomes the expected offload device. A similar example holds for a pure routing setup with ECMP. Both have full redundancy in the network: separate NIC cards connected to separate TORs to provide independent network paths.

A packet arrives on the *other* netdevice -- you have *no* control over the Rx path. Your current checks will think the packet arrived with DDP, but it did not.
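
One way to picture the missing check is the rough user-space model below. It is only a sketch under stated assumptions, not kernel code and not a proposed patch: struct model_netdev, struct model_skb, struct model_request, the ddp_placed flag and model_needs_sw_copy() are all hypothetical names invented for illustration, and the model assumes the NIC can mark each received packet whose payload it placed directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; none of these are real kernel types. */
struct model_netdev {
	int ifindex;
};

struct model_skb {
	const struct model_netdev *dev;	/* device the packet actually arrived on */
	bool ddp_placed;		/* assumed per-packet mark set by the NIC that placed the data */
};

struct model_request {
	const struct model_netdev *offload_netdev;	/* device the buffers were registered with */
	bool offloaded;					/* set at submit time, as in the patch */
};

/*
 * Return true when the payload of this packet still has to be copied in
 * software. The submit-time flag alone is not trusted: the packet must also
 * have arrived on the expected device and carry the per-packet mark.
 */
bool model_needs_sw_copy(const struct model_request *req,
			 const struct model_skb *skb)
{
	assert(req != NULL && skb != NULL);

	if (!req->offloaded)
		return true;	/* offload was never set up for this request */
	if (skb->dev != req->offload_netdev)
		return true;	/* arrived on the other bond leg or ECMP path */
	if (!skb->ddp_placed)
		return true;	/* right device, but the h/w did not place the data */
	return false;
}
```

With a request registered on one device and a packet received on a second device, model_needs_sw_copy() returns true even though req->offloaded is set, which is exactly the case the submit-time flag cannot distinguish.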