On Thu, May 23, 2019 at 05:58 PM CEST, John Fastabend wrote:
> [...]
>
>>
>> Thanks for taking a look at it. Setting MSG_DONTWAIT works great for
>> me. No more crashes in sk_stream_wait_memory. I've tested it on top of
>> current bpf-next (f49aa1de9836). Here's my:
>>
>> Tested-by: Jakub Sitnicki <ja...@cloudflare.com>
>>
>> The actual I've tested is below, for completeness.
>>
>> BTW. I've ran into another crash which I haven't seen before while
>> testing sockmap-echo, but it looks unrelated:
>>
>> https://lore.kernel.org/netdev/20190522100142.28925-1-ja...@cloudflare.com/
>>
>> -Jakub
>>
>> --- 8< ---
>>
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index e89be6282693..4a7c656b195b 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -2337,6 +2337,7 @@ int skb_send_sock_locked(struct sock *sk, struct
>> sk_buff *skb, int offset,
>>  		kv.iov_base = skb->data + offset;
>>  		kv.iov_len = slen;
>>  		memset(&msg, 0, sizeof(msg));
>> +		msg.msg_flags = MSG_DONTWAIT;
>>
>>  		ret = kernel_sendmsg_locked(sk, &msg, &kv, 1, slen);
>>  		if (ret <= 0)
>
> I went ahead and submitted this feel free to add your signed-off-by.

Thanks! The fix was all your idea :-)

Now that those pesky crashes are gone, we plan to look into drops when
doing echo with sockmap. Marek tried running echo-sockmap [1] with the
latest bpf-next (plus the mentioned crash fixes) and reports that not
all data bounces back:

  $ yes| head -c $[1024*1024] | nc -q2 192.168.1.33 1234 |wc -c
  971832
  $ yes| head -c $[1024*1024] | nc -q2 192.168.1.33 1234 |wc -c
  867352
  $ yes| head -c $[1024*1024] | nc -q2 192.168.1.33 1234 |wc -c
  952648

I'm trying to turn echo-sockmap into a selftest but, as you can probably
guess, over loopback everything works fine.

-Jakub

[1] https://github.com/cloudflare/cloudflare-blog/blob/master/2019-02-tcp-splice/echo-sockmap.c
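
To put numbers on the drops, here is a minimal sketch that turns the byte
counts reported above into a shortfall per run. The shortfall() helper is
hypothetical (it is not part of echo-sockmap or the kernel tree), and it
assumes each run sent exactly 1024*1024 bytes, as in the commands above.
It uses POSIX $((...)) arithmetic rather than the older $[...] form.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: given bytes sent and bytes
# echoed back, print how many bytes went missing and the loss in per mille.
shortfall() {
    sent=$1
    received=$2
    missing=$((sent - received))
    # Integer per mille (rounded down) avoids floating point in plain sh.
    permille=$((missing * 1000 / sent))
    echo "missing=$missing permille=$permille"
}

# Byte counts taken from the three runs reported in this mail.
sent=$((1024 * 1024))
for received in 971832 867352 952648; do
    shortfall "$sent" "$received"
done
```

The same helper could be fed the output of `wc -c` directly when
re-running the test against a patched kernel.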