Hi Jack,

Thanks for your reply. Could you send me your patch? I can test it in our environment. Please see my comments inline.
> We have seen this problem, and have developed a fix for it. We will
> be submitting the fix within the next few days (after final testing).
> The fix utilizes rcu locking in the interrupt handlers (to avoid
> deadlocks).
>
> The fix you propose below has the potential to cause a deadlock.
> (We originally tried irq spinlocks here, because this lock is also
> grabbed in the process context in mlx4_cq_alloc/free).
>
> The problem is that under VPI, the ETH interface uses multiple msix irq's,
> which can result in one cq completion event interrupting another,
> in-progress cq completion event. A deadlock results when the handler
> for the first cq completion grabs the spinlock, and is interrupted by
> the second completion before it has a chance to release the spinlock.
> The handler for the second completion will deadlock waiting for the
> spinlock to be released.
>
> -Jack

So spin_lock_irqsave is more appropriate for this fix, but I guess it will cost more performance.

In OFED, an RCU lock is used, but synchronize_rcu() is missing. I guess it may also need to be combined with cq->refcount?

--
Best Regards,
Jack Wang

Linux Kernel Developer Storage
ProfitBricks GmbH
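
P.S. To make the scenario in the quoted text concrete, here is a rough single-CPU toy model in userspace C. It is not mlx4 or OFED code: toy_cpu, deliver_irq, cq_completion and simulate are made-up names. It assumes, as the quoted message describes, that a second completion interrupt can arrive on the same CPU while the first handler still holds the lock. A handler that finds the lock already held is counted as a deadlock, because a real spin_lock() would spin forever at that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model, NOT kernel code; every name here is hypothetical. */
struct toy_cpu {
    bool lock_held;     /* stands in for the CQ table spinlock */
    bool irqs_enabled;  /* local interrupt state of this CPU */
    int  deadlocks;     /* nested handlers that found the lock taken */
    int  handled;       /* completions fully processed */
};

static void cq_completion(struct toy_cpu *cpu, bool use_irqsave, int pending);

/* Deliver one pending interrupt if the CPU currently accepts interrupts. */
static bool deliver_irq(struct toy_cpu *cpu, bool use_irqsave, int pending)
{
    if (pending <= 0 || !cpu->irqs_enabled)
        return false;
    cq_completion(cpu, use_irqsave, pending - 1);
    return true;
}

/* Model of a completion handler that takes the lock around its lookup. */
static void cq_completion(struct toy_cpu *cpu, bool use_irqsave, int pending)
{
    if (cpu->lock_held) {
        /* A real spin_lock() would spin forever here: the lock holder is
         * the interrupted context on this same CPU and cannot run. */
        cpu->deadlocks++;
        return;
    }

    bool saved = cpu->irqs_enabled;
    if (use_irqsave)
        cpu->irqs_enabled = false;   /* models spin_lock_irqsave() */
    cpu->lock_held = true;

    /* A second MSI-X completion fires inside the critical section. */
    bool delivered = deliver_irq(cpu, use_irqsave, pending);

    cpu->handled++;
    cpu->lock_held = false;
    if (use_irqsave)
        cpu->irqs_enabled = saved;   /* models spin_unlock_irqrestore() */

    /* An interrupt that was masked above is delivered after the unlock. */
    if (!delivered)
        deliver_irq(cpu, use_irqsave, pending);
}

/* Run one handler with one further completion pending behind it. */
struct toy_cpu simulate(bool use_irqsave)
{
    struct toy_cpu cpu = { false, true, 0, 0 };
    cq_completion(&cpu, use_irqsave, 1);
    assert(!cpu.lock_held);  /* the outermost handler always unlocks */
    return cpu;
}
```

Under these assumptions, simulate(false) (plain spin_lock) ends with one deadlock and only one completion handled, while simulate(true) (spin_lock_irqsave) handles both completions with no deadlock, at the price of masking interrupts for the length of the critical section.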