On 2026/4/7 1:40, Jason Gunthorpe wrote:
> Sashiko points out that once the srq memory is stored into the xarray by
> alloc_srqc() it can immediately be looked up by:
> 
>       xa_lock(&srq_table->xa);
>       srq = xa_load(&srq_table->xa, srqn & (hr_dev->caps.num_srqs - 1));
>       if (srq)
>               refcount_inc(&srq->refcount);
>       xa_unlock(&srq_table->xa);
> 
> Which will fail refcount debug because the refcount is 0 and then crash:
> 
>       srq->event(srq, event_type);
> 
> Because event is NULL.

I don't think this will actually happen because HW won't report an SRQ
event before the SRQ is fully ready and actually used.

>From the perspective of coding, I'm fine with this change, but since
there is similar logic for QP event, could you also apply this change
to QP?

Junxian

> 
> Use refcount_inc_not_zero() instead to ensure a partially prepared srq is
> never retrieved from the event handler and fix the ordering of the
> initialization so refcount becomes 1 only after it is fully ready.
> 
> Link: 
> https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=3
> Signed-off-by: Jason Gunthorpe <[email protected]>
> ---
>  drivers/infiniband/hw/hns/hns_roce_srq.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c 
> b/drivers/infiniband/hw/hns/hns_roce_srq.c
> index cb848e8e6bbd76..d6201ddde0292a 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_srq.c
> +++ b/drivers/infiniband/hw/hns/hns_roce_srq.c
> @@ -16,8 +16,8 @@ void hns_roce_srq_event(struct hns_roce_dev *hr_dev, u32 
> srqn, int event_type)
>  
>       xa_lock(&srq_table->xa);
>       srq = xa_load(&srq_table->xa, srqn & (hr_dev->caps.num_srqs - 1));
> -     if (srq)
> -             refcount_inc(&srq->refcount);
> +     if (srq && !refcount_inc_not_zero(&srq->refcount))
> +             srq = NULL;
>       xa_unlock(&srq_table->xa);
>  
>       if (!srq) {
> @@ -481,8 +481,8 @@ int hns_roce_create_srq(struct ib_srq *ib_srq,
>       }
>  
>       srq->event = hns_roce_ib_srq_event;
> -     refcount_set(&srq->refcount, 1);
>       init_completion(&srq->free);
> +     refcount_set(&srq->refcount, 1);
>  
>       return 0;
>  

Reply via email to