From: Sean Tranchetti <stran...@codeaurora.org>
Date: Fri, 26 Jun 2020 18:31:03 -0600

> @@ -328,6 +325,10 @@ int genl_register_family(struct genl_family *family)
>       if (err)
>               return err;
>  
> +     /* Acquire netlink table lock before any GENL-specific locks to ensure
> +      * sync with any netlink operations making calls into the GENL code.
> +      */
> +     netlink_table_grab();
>       genl_lock_all();

This locking sequence is illegal. If you had tested this change with
the proper lock debugging options enabled, you would not have been able
to even boot a machine without it OOPS'ing.

This code was essentially not tested as far as I am concerned.

netlink_table_grab() takes an atomic lock (write_lock_irq), so it
creates an atomic section.  But then we immediately call
genl_lock_all() which takes multiple sleepable locks (a semaphore and
a mutex).
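
To make the ordering problem concrete, here is a minimal userspace
sketch, not kernel code.  Every model_* name, atomic_depth, and
sleep_violations are hypothetical stand-ins invented for illustration;
the real primitives are far more involved.  It only models the rule
that a sleepable lock must not be taken while an atomic lock is held:

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model only: none of this is the kernel's implementation. */
static int atomic_depth;     /* > 0 while an atomic (spinning) lock is held */
static int sleep_violations; /* sleepable locks taken while atomic */

/* Stand-ins for write_lock_irq() / write_unlock_irq(). */
static void model_write_lock_irq(void)   { atomic_depth++; }
static void model_write_unlock_irq(void) { atomic_depth--; }

/* Stand-in for the "may sleep" check a sleepable lock performs. */
static void model_might_sleep(const char *what)
{
	if (atomic_depth > 0) {
		sleep_violations++;
		fprintf(stderr, "BUG: %s may sleep in atomic context\n", what);
	}
}

/* Stand-ins for the semaphore and the mutex taken by genl_lock_all(). */
static void model_down_write(void) { model_might_sleep("semaphore"); }
static void model_mutex_lock(void) { model_might_sleep("mutex"); }

/* Ordering from the quoted hunk: atomic lock first, sleepable locks after. */
static int patched_order(void)
{
	sleep_violations = 0;
	model_write_lock_irq();   /* models netlink_table_grab() */
	model_down_write();       /* models genl_lock_all(): semaphore */
	model_mutex_lock();       /* models genl_lock_all(): mutex */
	model_write_unlock_irq();
	return sleep_violations;
}

/* Sleepable locks taken with no atomic lock held. */
static int sleepable_only_order(void)
{
	sleep_violations = 0;
	model_down_write();
	model_mutex_lock();
	return sleep_violations;
}
```

In this model patched_order() flags both sleepable locks, while
sleepable_only_order() flags none.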

You'll have to find another way to fix this bug and I would like to ask
that you do so in a way that keeps all of these code paths sleepable
and does not do any GFP_ATOMIC conversions.

Thank you.
