On 01/29/2018 09:27 PM, John Fastabend wrote: > The selftests test_maps program was leaving dangling BPF sockmap > programs around because not all psock elements were removed from > the map. The elements in turn hold a reference on the BPF program > they are attached to causing BPF programs to stay open even after > test_maps has completed. > > The original intent was that sk_state_change() would be called > when TCP socks went through TCP_CLOSE state. However, because > socks may be in SOCK_DEAD state or the sock may be a listening > socket the event is not always triggered. > > To resolve this use the ULP infrastructure and register our own > proto close() handler. This fixes the above case.
Great that we can get rid of smap_state_change() and piggy-back on ULP infra instead. > Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") > Reported-by: Prashant Bhole <bhole_prashant...@lab.ntt.co.jp> > Signed-off-by: John Fastabend <john.fastab...@gmail.com> [...] > +void bpf_tcp_close(struct sock *sk, long timeout) > +{ > + void (*close_fun)(struct sock *sk, long timeout); > + struct smap_psock_map_entry *e, *tmp; > + struct smap_psock *psock; > + struct sock *osk; > + > + rcu_read_lock(); > + psock = smap_psock_sk(sk); > + if (unlikely(!psock)) > + return sk->sk_prot->close(sk, timeout); Here, we return with RCU read lock held, rcu_read_unlock() is missing. lockdep should probably notice this imbalance as well. > + /* The psock may be destroyed anytime after exiting the RCU critial > + * section so by the time we use close_fun the psock may no longer > + * be valid. However, bpf_tcp_close is called with the sock lock > + * held so the close hook and sk are still valid. > + */ > + close_fun = psock->save_close; > + > + write_lock_bh(&sk->sk_callback_lock); > + list_for_each_entry_safe(e, tmp, &psock->maps, list) { > + osk = cmpxchg(e->entry, sk, NULL); > + if (osk == sk) { > + list_del(&e->list); > + smap_release_sock(psock, sk); > + } > + } > + write_unlock_bh(&sk->sk_callback_lock); > + rcu_read_unlock(); > + close_fun(sk, timeout); > +}