On 05/30/2018 02:29 PM, Wei Wang wrote:
> From: Wei Wang <wei...@google.com>
>
> Sock hash only supports IPv4 socket proto right now.
> If a non-IPv4 socket gets stored in the BPF map, sk->sk_prot gets
> overwritten with the v4 tcp prot.
>
> Syzkaller reported the following related issue on an IPv6 socket:
> BUG: KASAN: slab-out-of-bounds in ip6_dst_idev include/net/ip6_fib.h:203
> [inline]
> BUG: KASAN: slab-out-of-bounds in ip6_xmit+0x2002/0x23f0
> net/ipv6/ip6_output.c:264
> Read of size 8 at addr ffff8801b300edb0 by task syz-executor888/4522
>
> CPU: 0 PID: 4522 Comm: syz-executor888 Not tainted 4.17.0-rc4+ #17
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>  print_address_description+0x6c/0x20b mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>  ip6_dst_idev include/net/ip6_fib.h:203 [inline]
>  ip6_xmit+0x2002/0x23f0 net/ipv6/ip6_output.c:264
>  inet6_csk_xmit+0x377/0x630 net/ipv6/inet6_connection_sock.c:139
>  tcp_transmit_skb+0x1be0/0x3e40 net/ipv4/tcp_output.c:1159
>  tcp_send_syn_data net/ipv4/tcp_output.c:3441 [inline]
>  tcp_connect+0x2207/0x45a0 net/ipv4/tcp_output.c:3480
>  tcp_v4_connect+0x1934/0x1d50 net/ipv4/tcp_ipv4.c:272
>  __inet_stream_connect+0x943/0x1120 net/ipv4/af_inet.c:655
>  tcp_sendmsg_fastopen net/ipv4/tcp.c:1162 [inline]
>  tcp_sendmsg_locked+0x2859/0x3ee0 net/ipv4/tcp.c:1209
>  tcp_sendmsg+0x2f/0x50 net/ipv4/tcp.c:1447
>  inet_sendmsg+0x19f/0x690 net/ipv4/af_inet.c:798
>  sock_sendmsg_nosec net/socket.c:629 [inline]
>  sock_sendmsg+0xd5/0x120 net/socket.c:639
>  ___sys_sendmsg+0x805/0x940 net/socket.c:2117
>  __sys_sendmsg+0x115/0x270 net/socket.c:2155
>  __do_sys_sendmsg net/socket.c:2164 [inline]
>  __se_sys_sendmsg net/socket.c:2162 [inline]
>  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x43ff99
> RSP: 002b:00007ffc00bd1cf8 EFLAGS: 00000217 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043ff99
> RDX: 0000000020000000 RSI: 0000000020000580 RDI: 0000000000000003
> RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
> R10: 00000000004002c8 R11: 0000000000000217 R12: 00000000004018c0
> R13: 0000000000401950 R14: 0000000000000000 R15: 0000000000000000
>
> Fixes: 81110384441a ("bpf: sockmap, add hash map support")
> Reported-by: syzbot+5c063698bdbfac19f...@syzkaller.appspotmail.com
> Signed-off-by: Wei Wang <wei...@google.com>
> Acked-by: Eric Dumazet <eduma...@google.com>
> Acked-by: Willem de Bruijn <will...@google.com>
> ---
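
For anyone following along, the failure mode in the quoted report can be
modelled in userspace. The sketch below is only an illustration under
stated assumptions: every model_* name and MODEL_* constant is
hypothetical, the structs merely imitate the shape of the kernel's
struct proto and struct sock, and hooks are represented as strings
rather than function pointers. It shows why handing an AF_INET6 socket a
proto table derived from the v4 tcp prot leaves it with tcp_v4_connect,
which matches the tcp_v4_connect -> inet6_csk_xmit path in the trace,
and how choosing the table by family avoids that.

```c
/* Userspace model only; not kernel code. All model_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_AF_INET  2   /* stand-in for AF_INET  */
#define MODEL_AF_INET6 10  /* stand-in for AF_INET6 */

struct model_proto {
	const char *name;    /* which base table this copy came from */
	const char *sendmsg; /* hook the BPF layer overrides */
	const char *connect; /* family-specific hook left untouched */
};

struct model_sock {
	int family;
	const struct model_proto *prot;
};

/* Base tables: the connect hook differs between the two families. */
static const struct model_proto model_tcp_prot =
	{ "TCP", "tcp_sendmsg", "tcp_v4_connect" };
static const struct model_proto model_tcpv6_prot =
	{ "TCPv6", "tcp_sendmsg", "tcp_v6_connect" };

static struct model_proto model_tcp_bpf_proto;
static struct model_proto model_tcpv6_bpf_proto;

/* Copy each base table, then override only the hooks BPF cares about. */
void model_register(void)
{
	model_tcp_bpf_proto = model_tcp_prot;
	model_tcp_bpf_proto.sendmsg = "bpf_tcp_sendmsg";
	model_tcpv6_bpf_proto = model_tcpv6_prot;
	model_tcpv6_bpf_proto.sendmsg = "bpf_tcp_sendmsg";
}

/* Problematic behaviour: every socket gets the v4-derived table. */
void model_init_v4_only(struct model_sock *sk)
{
	assert(sk != NULL);
	sk->prot = &model_tcp_bpf_proto;
}

/* Per-family selection: pick the table that matches the socket. */
void model_init_per_family(struct model_sock *sk)
{
	assert(sk != NULL);
	if (sk->family == MODEL_AF_INET6)
		sk->prot = &model_tcpv6_bpf_proto;
	else
		sk->prot = &model_tcp_bpf_proto;
}

/* Returns 1 when the socket's connect hook matches its address family. */
int model_connect_matches_family(const struct model_sock *sk)
{
	const char *want = sk->family == MODEL_AF_INET6 ?
		"tcp_v6_connect" : "tcp_v4_connect";

	return strcmp(sk->prot->connect, want) == 0;
}
```

Calling model_register() and then model_init_v4_only() on a socket whose
family is MODEL_AF_INET6 leaves model_connect_matches_family() returning
0, while model_init_per_family() makes it return 1 and still keeps the
overridden sendmsg hook.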

Hi Wei,

Thanks for the report and fix. It would be better to fix the root cause
so that IPv6 works as intended.

I'm testing the following now,

Author: John Fastabend <john.fastab...@gmail.com>
Date:   Thu May 31 14:38:59 2018 -0700

    sockmap: fix crash when ipv6 sock is added by adding support for IPv6

    Apparently we had a testing escape and missed IPv6. This fixes a crash
    where we assign tcp_prot to IPv6 sockets instead of tcpv6_prot.

    Signed-off-by: John Fastabend <john.fastab...@gmail.com>

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 52a91d8..e191122 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -41,6 +41,7 @@
 #include <linux/mm.h>
 #include <net/strparser.h>
 #include <net/tcp.h>
+#include <net/transp_v6.h>
 #include <linux/ptr_ring.h>
 #include <net/inet_common.h>
 #include <linux/sched/signal.h>
@@ -162,6 +163,8 @@ static bool bpf_tcp_stream_read(const struct sock *sk)
 }
 
 static struct proto tcp_bpf_proto;
+static struct proto tcpv6_bpf_proto;
+
 static int bpf_tcp_init(struct sock *sk)
 {
 	struct smap_psock *psock;
@@ -182,13 +185,21 @@ static int bpf_tcp_init(struct sock *sk)
 	psock->sk_proto = sk->sk_prot;
 
 	if (psock->bpf_tx_msg) {
+		tcpv6_bpf_proto.sendmsg = bpf_tcp_sendmsg;
+		tcpv6_bpf_proto.sendpage = bpf_tcp_sendpage;
+		tcpv6_bpf_proto.recvmsg = bpf_tcp_recvmsg;
+		tcpv6_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
 		tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
 		tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
 		tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
 		tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
 	}
 
-	sk->sk_prot = &tcp_bpf_proto;
+	if (sk->sk_family == AF_INET6)
+		sk->sk_prot = &tcpv6_bpf_proto;
+	else
+		sk->sk_prot = &tcp_bpf_proto;
+
 	rcu_read_unlock();
 	return 0;
 }
@@ -1113,6 +1124,8 @@ static int bpf_tcp_ulp_register(void)
 {
 	tcp_bpf_proto = tcp_prot;
 	tcp_bpf_proto.close = bpf_tcp_close;
+	tcpv6_bpf_proto = tcpv6_prot;
+	tcpv6_bpf_proto.close = bpf_tcp_close;
 	/* Once BPF TX ULP is registered it is never unregistered. It
 	 * will be in the ULP list for the lifetime of the system. Doing
 	 * duplicate registers is not a problem.