On 3/1/18 11:51 AM, William Tu wrote:
> On Thu, Mar 1, 2018 at 10:36 AM, David Ahern <dsah...@gmail.com> wrote:
>> On 3/1/18 10:29 AM, William Tu wrote:
>>> Hi,
>>>
>>> We're running commands below on kernel 4.15.0:
>>> 1) ip netns add at_ns0
>>> 2) ip link add p0 type veth peer name ovs-p0
>>> 3) ip link set p0 netns at_ns0
>>> 4) ip link set dev ovs-p0 up
>>
>> # uname -a
>> Linux kenny-jessie3 4.16.0-rc2+ #162 SMP Thu Mar 1 08:48:58 PST 2018
>> x86_64 GNU/Linux
>>
>> # bash -x /tmp/2
>> + ip netns add at_ns0
>> + ip link add p0 type veth peer name ovs-p0
>> + ip link set p0 netns at_ns0
>> + ip link set dev ovs-p0 up
>>
>> Works fine for me on top of tree.
>>
>> What is the output of 'cat /proc/<pid>/stack' when it hangs?
>>
> root@osb:~/iproute2# ps aux | grep ip
> root       3652  0.0  0.0  11532   884 pts/24   S+   10:43   0:00 ip
> link add p0 type veth peer name ovs-p0
> 
> root@osb:~/iproute2# cat /proc/3652/stack
> [<0>] __skb_wait_for_more_packets+0x103/0x160
> [<0>] __skb_recv_datagram+0x69/0xc0
> [<0>] skb_recv_datagram+0x3f/0x60
> [<0>] netlink_recvmsg+0x59/0x420
> [<0>] ___sys_recvmsg+0xee/0x230
> [<0>] __sys_recvmsg+0x4e/0x90
> [<0>] entry_SYSCALL_64_fastpath+0x24/0x87
> [<0>] 0xffffffffffffffff
> 
> if I run strace on "ip link add p0 type veth peer name ovs-p0"
> open("/usr/lib/ip/link_veth.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No
> such file or directory)
> sendmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0,
> groups=00000000},
> msg_iov(1)=[{"X\0\0\0\20\0\5\6\315J\230Z\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> 88}], msg_controllen=0, msg_flags=0}, 0) = 88
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0,
> groups=00000000}, msg_iov(1)=[{NULL, 0}], msg_controllen=0,
> msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 36
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0,
> groups=00000000},
> msg_iov(1)=[{"$\0\0\0\2\0\0\1\315J\230Z1\24\0\0\0\0\0\0X\0\0\0\20\0\5\6\315J\230Z"...,
> 36}], msg_controllen=0, msg_flags=0}, 0) = 36
> 
> Thanks a lot
> William
> 


I still cannot reproduce the hang, but try this patch and see if it
fixes your problem (whitespace was damaged on paste):

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index 7ca47b22581a..9d692afbc740 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -670,8 +672,9 @@ static int __rtnl_talk_iov(struct rtnl_handle *rtnl, struct iovec *iov,
                                                free(buf);
                                        if (h->nlmsg_seq == seq)
                                                return 0;
-                                       else
+                                       else if (i < iovlen)
                                                goto next;
+                                       return 0;
                                }

                                if (rtnl->proto != NETLINK_SOCK_DIAG &&
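
To make the intent of the hunk concrete, here is a toy model of the
ACK-wait loop. It is only a sketch and not iproute2 code: wait_for_acks(),
its parameters, and the acks[] array (standing in for ACKs queued on the
netlink socket) are hypothetical. It also assumes that i counts the replies
received so far, which the hunk itself does not show.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the loop touched by the patch.
 * acks[] holds the sequence numbers of the queued ACKs in arrival order.
 * Returns the number of receives performed, or -1 at the point where a
 * real recvmsg() would block because nothing is left to read.
 */
int wait_for_acks(const unsigned *acks, size_t n_acks,
                  unsigned last_seq, size_t iovlen, int bounded)
{
    size_t i = 0;

    assert(iovlen > 0);             /* at least one request was sent */

    for (;;) {
        if (i == n_acks)
            return -1;              /* queue empty: recvmsg() would block */

        unsigned seq = acks[i++];   /* "receive" one ACK */

        if (seq == last_seq)
            return (int)i;          /* ACK for the last request: done */

        if (bounded && i >= iovlen)
            return (int)i;          /* patched path: one ACK per request seen */

        /* unpatched path, or more ACKs still expected: "goto next" */
    }
}
```

With a single queued ACK whose sequence number does not compare equal to
the expected one, for example acks = { 41 }, last_seq = 42 and iovlen = 1,
the unbounded variant (bounded = 0) returns -1, which corresponds to
blocking in recvmsg(), while the bounded variant (bounded = 1) returns
after one receive.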
