On 03/12 07:59, Eric Dumazet wrote: > > > On 12/03/2018 07:20 AM, Nicolas Belouin wrote: > > Hi, > > I ran into a panic while adding an interface to a bridge with a vxlan > > interface already attached to it, as it seems related mtu=9000. > > > > I get the following panic info : > > > > [ 2482.419893] br100: port 2(vif1.1) entered blocking state > > [ 2482.425427] br100: port 2(vif1.1) entered forwarding state > > [ 2482.431797] skbuff: skb_over_panic: text:ffffffff816e4f78 len:40 put:40 > > head:ffff880146449000 data:ffff880146458fd0 tail:0xfff8 end:0xec0 dev:vif1.1 > > [ 2482.442891] ------------[ cut here ]------------ > > [ 2482.448254] kernel BUG at > > /srv/jenkins/workspace/workspace/hosting-xen-dom0-kernel/build/src/linux-4.9/net/core/skbuff.c:105! > > [ 2482.459009] invalid opcode: 0000 [#1] PREEMPT SMP > > [ 2482.464371] Modules linked in: > > [ 2482.469682] CPU: 19 PID: 1317 Comm: kworker/19:1 Not tainted > > 4.9.135-dom0-e9d15b2-x86_64-iaas #2 > > [ 2482.480362] Hardware name: Dell Inc. PowerEdge C8220/09N44V, BIOS 2.7.1 > > 03/04/2015 > > [ 2482.491008] Workqueue: ipv6_addrconf addrconf_dad_work > > [ 2482.496380] task: ffff88017eef1a00 task.stack: ffffc90001fcc000 > > [ 2482.501785] RIP: e030:[<ffffffff815ed71f>] [<ffffffff815ed71f>] > > skb_panic+0x5f/0x70 > > [ 2482.512450] RSP: e02b:ffffc90001fcfba8 EFLAGS: 00010296 > > [ 2482.517817] RAX: 0000000000000088 RBX: ffff880117fb0800 RCX: > > 0000000000000000 > > [ 2482.528447] RDX: 0000000000000088 RSI: ffff880184cd03c8 RDI: > > ffff880184cd03c8 > > [ 2482.539085] RBP: ffffc90001fcfc00 R08: 00000000000006a8 R09: > > ffffffff81ea7359 > > [ 2482.549717] R10: ffff880180406f80 R11: 00000000000006a8 R12: > > ffff880147258cc0 > > [ 2482.560350] R13: ffffc90001fcfc20 R14: ffffffff81d13440 R15: > > 0000000000000000 > > [ 2482.570993] FS: 0000000000000000(0000) GS:ffff880184cc0000(0000) > > knlGS:0000000000000000 > > [ 2482.581646] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 2482.587039] CR2: 00007f5b17f032b0 CR3: 0000000001c08000 CR4: > > 0000000000042660 > > [ 2482.597675] Stack: > > [ 2482.602958] ffff880146458fd0 000000000000fff8 0000000000000ec0 > > ffff88017f3f0000 > > [ 2482.613619] ffffffff815efa62 ffffffff816e4f78 ffff880117fb0800 > > ffffc90001fcfc20 > > [ 2482.624288] ffff880147258cc0 ffff88017f3f0000 ffff880146502000 > > ffffc90001fcfc68 > > [ 2482.634955] Call Trace: > > [ 2482.640254] [<ffffffff815efa62>] ? skb_put+0x42/0x50 > > [ 2482.645633] [<ffffffff816e4f78>] ? ip6_mc_hdr.constprop.36+0x58/0xd0 > > [ 2482.651045] [<ffffffff816e511a>] ? mld_newpack+0x12a/0x1e0 > > [ 2482.656421] [<ffffffff816e5257>] ? add_grhead.isra.28+0x87/0xa0 > > [ 2482.661825] [<ffffffff816e60d6>] ? add_grec+0x446/0x4c0 > > [ 2482.667198] [<ffffffff8108b06b>] ? __local_bh_enable_ip+0x1b/0xb0 > > [ 2482.672609] [<ffffffff816e6328>] ? mld_send_initial_cr.part.29+0x58/0xa0 > > [ 2482.678022] [<ffffffff816e83d6>] ? ipv6_mc_dad_complete+0x26/0x60 > > [ 2482.683441] [<ffffffff816cc2cf>] ? addrconf_dad_completed+0x29f/0x2c0 > > [ 2482.688850] [<ffffffff816e6b84>] ? ipv6_dev_mc_inc+0x194/0x2c0 > > [ 2482.694249] [<ffffffff816cc3ee>] ? addrconf_dad_work+0xfe/0x3d0 > > [ 2482.699650] [<ffffffff817484ed>] ? _raw_spin_unlock_irq+0xd/0x20 > > [ 2482.705052] [<ffffffff8109de12>] ? process_one_work+0x142/0x3e0 > > [ 2482.710453] [<ffffffff8109e112>] ? worker_thread+0x62/0x480 > > [ 2482.715848] [<ffffffff8109e0b0>] ? process_one_work+0x3e0/0x3e0 > > [ 2482.721256] [<ffffffff810a3472>] ? kthread+0xe2/0x100 > > [ 2482.726621] [<ffffffff81028701>] ? __switch_to+0x261/0x6b0 > > [ 2482.732006] [<ffffffff810a3390>] ? kthread_park+0x60/0x60 > > [ 2482.737379] [<ffffffff81748c37>] ? ret_from_fork+0x57/0x70 > > [ 2482.742761] Code: 00 00 48 89 44 24 10 8b 87 b0 00 00 00 48 89 44 24 08 > > 48 8b 87 c0 00 00 00 48 c7 c7 50 8e a2 81 48 89 04 24 31 c0 e8 b5 07 b6 ff > > <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 > > [ 2482.759199] RIP [<ffffffff815ed71f>] skb_panic+0x5f/0x70 > > [ 2482.764672] RSP <ffffc90001fcfba8> > > [ 2482.771186] ---[ end trace 6d0fe52ed049d841 ]--- > > [ 2482.776641] Kernel panic - not syncing: Fatal exception in interrupt > > [ 2482.861621] Kernel Offset: disabled > > > > I circumvented the bug by applying this patch: > > diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c > > index 21f6deb2aec9..2762c3dcc883 100644 > > --- a/net/ipv6/mcast.c > > +++ b/net/ipv6/mcast.c > > @@ -1605,8 +1605,6 @@ static struct sk_buff *mld_newpack(struct inet6_dev > > *idev, unsigned int mtu) > > IPV6_TLV_PADN, 0 }; > > > > /* we assume size > sizeof(ra) here */ > > - /* limit our allocations to order-0 page */ > > - size = min_t(int, size, SKB_MAX_ORDER(0, 0)); > > skb = sock_alloc_send_skb(sk, size, 1, &err); > > > > if (!skb) > > > > The lines are introduced by commit 72e09ad107e78d69ff4d3b97a69f0aad2b77280f > > stating that "order-2 GRP_ATOMIC allocations are very unreliable" > > I then wonder if this statement is still relevant, or if such a patch > > would be acceptable to you. > > Thanks for the report, but this patch is not correct. > > I rather suspect commit 1837b2e2bcd23137766555a63867e649c0b637f0 ("mld, igmp: > Fix reserved tailroom calculation") > is the problem. >
If I follow the code here, skb_over_panic is triggered if skb->tail > skb->end, with the order-0 page limitation, and mtu 9000, we can assume (in the case of PAGE_SIZE=4k) skb->end to be 4k. Tail is set to 0 by the alloc, and then to LL_RESERVED_SPACE of the net device by the skb_reserve. We then only change it during the skb_put leading to the skb_over_panic by setting it to LL_RESERVED_SPACE(dev) + sizeof(struct ipv6hdr) So I don't see what is the role of the commit you suspect here, the only cause I can see here is that the needed space is just larger than a 4k page. If you can explain to me why you think the reserved tailroom calculation has an impact on this skb_put, I might be able to look at it more in depth The patch I posted is of course a bit harsh and I do not really expect it to be the right solution, but unless I set a lesser MTU (<PAGE_SIZE), I don't really see any way to get it to work with the order 0 page limitation Regards, Nicolas