On 06/30/2018 03:17 PM, John Fastabend wrote: > This addresses two syzbot issues that lead to identifying (by Eric and > Wei) a class of bugs where we don't correctly check for IPv4/v6 > sockets and their associated state. The second issue was a locking > omission in sockhash. > > The first patch addresses IPv6 socks and fixing an error where > sockhash would overwrite the prot pointer with IPv4 prot. To fix > this build similar solution to TLS ULP. Although we continue to > allow socks in all states not just ESTABLISH in this patch set > because as Martin points out there should be no issue with this > on the sockmap ULP because we don't use the ctx in this code. Once > multiple ULPs coexist we may need to revisit this. However we > can do this in *next trees. > > The other issue syzbot found that the tcp_close() handler missed > locking the hash bucket lock which could result in corrupting the > sockhash bucket list if delete and close ran at the same time. > And also the smap_list_remove() routine was not working correctly > at all. This was not caught in my testing because in general my > tests (to date at least lets add some more robust selftest in > bpf-next) do things in the "expected" order, create map, add socks, > delete socks, then tear down maps. The tests we have that do the > ops out of this order where only working on single maps not multi- > maps so we never saw the issue. Thanks syzbot. The fix is to > restructure the tcp_close() lock handling. And fix the obvious > bug in smap_list_remove(). > > Finally, during review I noticed the release handler was omitted > from the upstream code (patch 4) due to an incorrect merge conflict > fix when I ported the code to latest bpf-next before submitting. > This would leave references to the map around if the user never > closes the map. > > v3: rework patches, dropping ESTABLISH check and adding rcu > annotation along with the smap_list_remove fix > > v4: missed one more case where maps was being accessed without > the sk_callback_lock, spoted by Martin as well. > > v5: changed to use a specific lock for maps and reduced callback > lock so that it is only used to gaurd sk callbacks. I think > this makes the logic a bit cleaner and avoids confusion > ovoer what each lock is doing. > > Also big thanks to Martin for thorough review he caught at least > one case where I missed a rcu_call().
Applied it to bpf, thanks John!