On Tue, 21 May 2019 19:02:00 -0700, Jakub Kicinski wrote:
> When netdev with active kTLS sockets in unregistered
> notifier callback walks the offloaded sockets and
> cleans up offload state.  RX data may still be processed,
> however, and if resync was requested prior to device
> removal we would hit a NULL pointer dereference on
> ctx->netdev use.
> 
> Make sure resync is under the device offload lock
> and NULL-check the netdev pointer.
> 
> This should be safe, because the pointer is set to
> NULL either in the netdev notifier (under said lock)
> or when socket is completely dead and no resync can
> happen.
> 
> The other access to ctx->netdev in tls_validate_xmit_skb()
> does not dereference the pointer, it just checks it against
> other device pointer, so it should be pretty safe (perhaps
> we can add a READ_ONCE/WRITE_ONCE there, if paranoid).
> 
> Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload")
> Signed-off-by: Jakub Kicinski <[email protected]>
> Reviewed-by: Dirk van der Merwe <[email protected]>
> ---
>  net/tls/tls_device.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> index ca54a7c7ec81..aa33e4accc32 100644
> --- a/net/tls/tls_device.c
> +++ b/net/tls/tls_device.c
> @@ -553,8 +553,8 @@ void tls_device_write_space(struct sock *sk, struct 
> tls_context *ctx)
>  void handle_device_resync(struct sock *sk, u32 seq, u64 rcd_sn)
>  {
>       struct tls_context *tls_ctx = tls_get_ctx(sk);
> -     struct net_device *netdev = tls_ctx->netdev;
>       struct tls_offload_context_rx *rx_ctx;
> +     struct net_device *netdev;
>       u32 is_req_pending;
>       s64 resync_req;
>       u32 req_seq;
> @@ -568,10 +568,15 @@ void handle_device_resync(struct sock *sk, u32 seq, u64 
> rcd_sn)
>       is_req_pending = resync_req;
>  
>       if (unlikely(is_req_pending) && req_seq == seq &&
> -         atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0))
> -             netdev->tlsdev_ops->tls_dev_resync_rx(netdev, sk,
> -                                                   seq + TLS_HEADER_SIZE - 1,
> -                                                   rcd_sn);
> +         atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0)) {
> +             seq += TLS_HEADER_SIZE - 1;
> +             down_read(&device_offload_lock);

Sorry this may actually cause a sleep in atomic, turns out resync may
get called directly from softirq under certain conditions.

Would it be possible to drop this from stable?  I can post a revert +
new fix (probably on a refcount..) or should I post an incremental fix?

> +             netdev = tls_ctx->netdev;
> +             if (netdev)
> +                     netdev->tlsdev_ops->tls_dev_resync_rx(netdev, sk, seq,
> +                                                           rcd_sn);
> +             up_read(&device_offload_lock);
> +     }
>  }
>  
>  static int tls_device_reencrypt(struct sock *sk, struct sk_buff *skb)

Reply via email to