On Sat, Oct 31, 2015 at 1:45 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
>     13.84%       opensock  [kernel.kallsyms]    [k] queued_spin_lock_slowpath
>                  |
>                  --- queued_spin_lock_slowpath
>                     |
>                     |--99.97%-- _raw_spin_lock
>                     |          |
>                     |          |--53.03%-- __close_fd
>                     |          |
>                     |          |--46.83%-- __alloc_fd

Interesting. "__close_fd" actually looks more expensive than
allocation. They presumably get called equally often, so it's probably
some cache effect.

__close_fd() doesn't do anything even remotely interesting as far as I
can tell, but it strikes me that we probably take a *lot* of cache
misses on the stupid "close-on-exec" flags, which are probably always
zero anyway.

Mind testing something really stupid, and making the __clear_bit() in
__clear_close_on_exec() conditional, something like this:

     static inline void __clear_close_on_exec(int fd, struct fdtable *fdt)
     {
    -       __clear_bit(fd, fdt->close_on_exec);
    +       if (test_bit(fd, fdt->close_on_exec))
    +               __clear_bit(fd, fdt->close_on_exec);
     }

and see if it makes a difference.

This is the kind of thing that a single-threaded (or even
single-socket) test will never actually show, because it caches well
enough. But for two sockets, I could imagine the unnecessary dirtying
of cachelines and ping-pong being noticeable.

The other stuff we probably can't do all that much about. Unless we
decide to go for some complicated lockless optimistic file descriptor
allocation scheme with retry-on-failure instead of locks. Which I'm
sure is possible, but I'm equally sure is painful.
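
Just to illustrate the idea (this is a hypothetical sketch, not a proposed
patch, and alloc_fd_optimistic() is a made-up name): scan fdt->open_fds
without taking files->file_lock, and use the atomic test_and_set_bit() as
the commit point, retrying from the next bit if somebody else wins the race.
Table expansion would still have to fall back to the locked path.

     /* Hypothetical sketch only -- assumes kernel context where
      * find_next_zero_bit()/test_and_set_bit() and -EMFILE are available.
      * Optimistically find a free slot, then atomically claim it; a lost
      * race just means we keep scanning. */
     static int alloc_fd_optimistic(struct fdtable *fdt, unsigned int start)
     {
             unsigned int fd = start;

             for (;;) {
                     fd = find_next_zero_bit(fdt->open_fds, fdt->max_fds, fd);
                     if (fd >= fdt->max_fds)
                             return -EMFILE;  /* full; caller expands under the lock */
                     if (!test_and_set_bit(fd, fdt->open_fds))
                             return fd;       /* we won the race for this slot */
                     fd++;                    /* lost the race; try the next free bit */
             }
     }

The allocation loop itself is the easy part; keeping it coherent with
close() clearing bits and with fdtable resizing is where the retry-on-failure
scheme presumably gets painful.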

                    Linus