Hi David & Folks,

I have a virtual device driver that does some fancy processing of
packets in ndo_start_xmit before forwarding them onward out of a
tunnel elsewhere. In order to make that fancy processing fast, I have
AVX and AVX2 implementations. This means I need to use the FPU.

So, I do the usual pattern found throughout the kernel:

        if (!irq_fpu_usable()) {
                generic_c(...);
        } else {
                kernel_fpu_begin();
                optimized_avx(...);
                kernel_fpu_end();
        }

This works fine with, say, iperf3 in TCP mode. The AVX performance is
great. However, when using iperf3 in UDP mode, irq_fpu_usable() is
mostly false! I added a dump_stack() call to see why, except nothing
looks strange; the initial call in the stack trace is
entry_SYSCALL_64_fastpath. Why would irq_fpu_usable() return false
when we're in a syscall? Doesn't that mean this is in process context?
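
One thing that might help narrow this down is logging the raw
preempt_count() value next to the dump_stack() and decoding it by hand.
Below is a small userspace sketch of that decoding. It assumes that
irq_fpu_usable() depends on in_interrupt(), and that the bit layout is
the one I remember from include/linux/preempt.h; both should be checked
against the tree being run. The model_* helpers are hypothetical names
for this sketch, not kernel functions.

```c
#include <stdbool.h>

/*
 * Assumed preempt_count() layout (verify against include/linux/preempt.h):
 * 8 bits of preempt depth, 8 bits of softirq count, 4 bits of hardirq
 * count, 1 NMI bit.
 */
#define PREEMPT_BITS    8
#define SOFTIRQ_BITS    8
#define HARDIRQ_BITS    4
#define NMI_BITS        1

#define SOFTIRQ_SHIFT   (PREEMPT_BITS)
#define HARDIRQ_SHIFT   (SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT       (HARDIRQ_SHIFT + HARDIRQ_BITS)

#define FIELD_MASK(bits, shift) (((1u << (bits)) - 1u) << (shift))
#define SOFTIRQ_MASK    FIELD_MASK(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
#define HARDIRQ_MASK    FIELD_MASK(HARDIRQ_BITS, HARDIRQ_SHIFT)
#define NMI_MASK        FIELD_MASK(NMI_BITS, NMI_SHIFT)

/* Serving a softirq adds SOFTIRQ_OFFSET; disabling BHs adds twice that. */
#define SOFTIRQ_OFFSET          (1u << SOFTIRQ_SHIFT)
#define SOFTIRQ_DISABLE_OFFSET  (2u * SOFTIRQ_OFFSET)

/* Hypothetical model of in_interrupt(): any hardirq, softirq or NMI bit set. */
static bool model_in_interrupt(unsigned int pc)
{
        return (pc & (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK)) != 0;
}

/* Hypothetical model of in_serving_softirq(): really running a softirq. */
static bool model_in_serving_softirq(unsigned int pc)
{
        return (pc & SOFTIRQ_MASK & SOFTIRQ_OFFSET) != 0;
}

/* True when softirq bits are set only because bottom halves are disabled. */
static bool model_bh_disabled_only(unsigned int pc)
{
        return (pc & SOFTIRQ_MASK) != 0 && !model_in_serving_softirq(pc);
}
```

If that layout holds, a logged value for which model_bh_disabled_only()
is true would point at a caller that disabled bottom halves in process
context, rather than at a real interrupt.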

I find this a bit disturbing. If anybody has an explanation and a
way to work around it, I'd be quite happy. Or, if there is simply a
debugging technique you'd recommend, I'd be happy to try it and
report back.

Thanks,
Jason
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
