Hi Jack,

Thanks for your reply. Could you send me your patch? I can test it in
our environment.
Please see my comments inline.

>
> We have seen this problem, and have developed a fix for it.  We will
> be submitting the fix within the next few days (after final testing).
> The fix utilizes rcu locking in the interrupt handlers (to avoid
> deadlocks).
>
> The fix you propose below has the potential to cause a deadlock.
> (We originally tried irq spinlocks here, because this lock is also
> grabbed in the process context in mlx4_cq_alloc/free).
>
> The problem is that under VPI, the ETH interface uses multiple msix irq's,
> which can result in one cq completion event interrupting another,
> in-progress cq completion event. A deadlock results when the handler
> for the first cq completion grabs the spinlock, and is interrupted by
> the second completion before it has a chance to release the spinlock.
> The handler for the second completion will deadlock waiting for the
> spinlock to be released.
>
> -Jack
>
So spin_lock_irqsave is more appropriate for this fix, but I guess it
will cost more performance.
In OFED, RCU locking is used, but the synchronize_rcu() call is missing.
I guess it may also need to be combined with cq->refcount?
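
To make the ordering I have in mind concrete, here is a rough userspace
model. It is not the mlx4 code: every model_* name and the table are made
up for illustration, and the rcu_* functions are single-threaded stand-ins
for the kernel primitives, so please read it only as a sketch of the
lookup/free ordering.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel primitives, for illustration only. */
static int rcu_readers;	/* read-side sections currently open */
static int rcu_syncs;	/* number of synchronize_rcu() calls */

static void rcu_read_lock(void)   { rcu_readers++; }
static void rcu_read_unlock(void) { rcu_readers--; }

/* The real call blocks until all pre-existing readers have finished.
 * This single-threaded model can only check that none are open. */
static void synchronize_rcu(void)
{
	assert(rcu_readers == 0);
	rcu_syncs++;
}

/* Hypothetical, simplified CQ; the real struct has many more fields. */
struct model_cq {
	int cqn;
	int refcount;	/* would be an atomic counter in real code */
	int events;	/* completions delivered so far */
};

#define MODEL_TABLE_SIZE 16
static struct model_cq *model_table[MODEL_TABLE_SIZE];

/* Process context: create a CQ and publish it in the lookup table. */
static struct model_cq *model_cq_alloc(int cqn)
{
	struct model_cq *cq = calloc(1, sizeof(*cq));

	if (!cq)
		return NULL;
	cq->cqn = cqn;
	cq->refcount = 1;
	model_table[cqn % MODEL_TABLE_SIZE] = cq;
	return cq;
}

/* Interrupt side: look the CQ up without taking a spinlock.
 * Returns 0 if the event was delivered, -1 if no such CQ is published. */
static int model_cq_completion(int cqn)
{
	struct model_cq *cq;
	int ret = -1;

	rcu_read_lock();
	cq = model_table[cqn % MODEL_TABLE_SIZE];
	if (cq && cq->cqn == cqn) {
		cq->events++;
		ret = 0;
	}
	rcu_read_unlock();
	return ret;
}

/* Process context: unpublish, wait for readers, then drop the reference.
 * Returns 1 if the CQ memory was released. */
static int model_cq_free(struct model_cq *cq)
{
	model_table[cq->cqn % MODEL_TABLE_SIZE] = NULL;
	synchronize_rcu();	/* the step I think is missing */
	if (--cq->refcount == 0) {
		free(cq);
		return 1;
	}
	return 0;
}
```

The point is only the order in the free path: remove the entry first, wait
for a grace period, and only then drop the last reference and free.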


-- 
Best regards,

Jack Wang

Linux Kernel Developer Storage
ProfitBricks GmbH  The IaaS-Company.

ProfitBricks GmbH
Greifswalder Str. 207
D - 10405 Berlin
Tel: +49 30 5770083-42
Fax: +49 30 5770085-98
Email: jinpu.w...@profitbricks.com
URL: http://www.profitbricks.de

Registered office: Berlin.
Commercial register: Amtsgericht Charlottenburg, HRB 125506 B.
Managing directors: Andreas Gauger, Achim Weiss.