On 01/29/2018 09:27 PM, John Fastabend wrote:
> The selftests test_maps program was leaving dangling BPF sockmap
> programs around because not all psock elements were removed from
> the map. The elements in turn hold a reference on the BPF program
> they are attached to causing BPF programs to stay open even after
> test_maps has completed.
> 
> The original intent was that sk_state_change() would be called
> when TCP socks went through TCP_CLOSE state. However, because
> socks may be in SOCK_DEAD state or the sock may be a listening
> socket the event is not always triggered.
> 
> To resolve this use the ULP infrastructure and register our own
> proto close() handler. This fixes the above case.

Great that we can get rid of smap_state_change() and piggy-back on ULP
infra instead.

> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
> Reported-by: Prashant Bhole <bhole_prashant...@lab.ntt.co.jp>
> Signed-off-by: John Fastabend <john.fastab...@gmail.com>
[...]
> +void bpf_tcp_close(struct sock *sk, long timeout)
> +{
> +     void (*close_fun)(struct sock *sk, long timeout);
> +     struct smap_psock_map_entry *e, *tmp;
> +     struct smap_psock *psock;
> +     struct sock *osk;
> +
> +     rcu_read_lock();
> +     psock = smap_psock_sk(sk);
> +     if (unlikely(!psock))
> +             return sk->sk_prot->close(sk, timeout);

Here, we return with RCU read lock held, rcu_read_unlock() is missing.
lockdep should probably notice this imbalance as well.

> +     /* The psock may be destroyed anytime after exiting the RCU critial
> +      * section so by the time we use close_fun the psock may no longer
> +      * be valid. However, bpf_tcp_close is called with the sock lock
> +      * held so the close hook and sk are still valid.
> +      */
> +     close_fun = psock->save_close;
> +
> +     write_lock_bh(&sk->sk_callback_lock);
> +     list_for_each_entry_safe(e, tmp, &psock->maps, list) {
> +             osk = cmpxchg(e->entry, sk, NULL);
> +             if (osk == sk) {
> +                     list_del(&e->list);
> +                     smap_release_sock(psock, sk);
> +             }
> +     }
> +     write_unlock_bh(&sk->sk_callback_lock);
> +     rcu_read_unlock();
> +     close_fun(sk, timeout);
> +}

Reply via email to