On Wed, Feb 15, 2017 at 5:26 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
> On Wed, 2017-02-15 at 16:52 +0200, Matan Barak (External) wrote:
>
>> So, in case of RDMA CQs, we add some per-CQE overhead of comparing the
>> list pointers and condition upon that. Maybe we could add an
>> invoke_tasklet boolean field on mlx4_cq and return its value from
>> mlx4_cq_completion.
>> That way we could do invoke_tasklet |= mlx4_cq_completion(....);
>>
>> Outside the while loop we could just
>> if (invoke_tasklet)
>>      tasklet_schedule
>>
>> Anyway, I guess that even with per-CQE overhead, the performance impact
>> here is pretty negligible - so I guess that's fine too :)
>
>
> Real question or suggestion would be to use/fire a tasklet only under
> stress.
>
> Firing a tasklet adds a lot of latencies for user-space CQ completion,
> since softirqs might have to be handled by a kernel thread (ksoftirqd)
>
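
For reference, the batching suggested above could look roughly like the
user-space sketch below. It is only an illustration under stated
assumptions: the mock_* names and the needs_tasklet field are
hypothetical stand-ins, not the real mlx4 structures or functions, and
the real EQ loop does considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CQ; not the real struct mlx4_cq. */
struct mock_cq {
	bool needs_tasklet;    /* true if completions are handled in a tasklet */
	unsigned completions;  /* number of completion events seen so far */
};

/* Counts how many times the mock tasklet was scheduled. */
static unsigned tasklet_schedule_calls;

/* Stand-in for tasklet_schedule(); it only bumps a counter. */
static void mock_tasklet_schedule(void)
{
	tasklet_schedule_calls++;
}

/*
 * Stand-in for the completion handler. It reports whether the caller
 * should schedule the tasklet instead of scheduling it itself.
 */
static bool mock_cq_completion(struct mock_cq *cq)
{
	assert(cq != NULL);
	cq->completions++;
	return cq->needs_tasklet;
}

/*
 * Stand-in for the event loop: accumulate the flag per event and
 * schedule the tasklet at most once, after the loop.
 */
static void mock_eq_poll(struct mock_cq **cqs, size_t n)
{
	bool invoke_tasklet = false;
	size_t i;

	for (i = 0; i < n; i++)
		invoke_tasklet |= mock_cq_completion(cqs[i]);

	if (invoke_tasklet)
		mock_tasklet_schedule();
}
```

With this shape, a batch containing only CQs that do not need the
tasklet never schedules it, and a mixed batch schedules it once
regardless of how many events were seen.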

At least for the mlx4_en driver we don't need this tasklet; it only
adds overhead, since we already have NAPI.

We should consider removing it for mlx4_en CQs and moving the tasklet
handling to mlx4_ib.

I will ack the patch.
