Hi again,

> -----Original Message-----
> From: Wei Wang [mailto:wei...@google.com]
> Sent: Monday, February 1, 2021 19:53
> To: David Ahern <dsah...@gmail.com>
> Cc: Schmid, Carsten <carsten_sch...@mentor.com>;
> da...@davemloft.net; kuz...@ms2.inr.ac.ru; yoshf...@linux-ipv6.org;
> netdev@vger.kernel.org
> Subject: Re: Possible race in ipv4 routing
>
> On Mon, Feb 1, 2021 at 7:35 AM David Ahern <dsah...@gmail.com> wrote:
> >
> > On 2/1/21 2:20 AM, Schmid, Carsten wrote:
> > > Hi,
> > >
> > > on kernel 4.14(.147) i have seen something weird. The stack trace:
> > >
> > > [65064.457920] BUG: unable to handle kernel NULL pointer dereference at 0000000000000604
> > > [65064.466677] IP: ip_route_output_key_hash_rcu+0x755/0x850
-------8<-------------
...
-------8<-------------
> > > Fortunately i have a core dump, and analyzed that.

I again have a core dump and ...
...
the pointer is not NULL this time:
(see the RCX content, which holds the pointer value that is accessed)

[44087.587354] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
[44087.594172] Modules linked in: bcmdhd(O) ebt_ip6 ebt_ip ebtable_filter ebtables squashfs zlib_inflate xz_dec veth lzo lzo_compress lzo_decompress esp4 ah4 xfrm4_mode_transport xfrm_user xfrm_algo cls_u32 sch_htb cdc_acm intel_tfm_governor intel_ipu4_psys intel_ipu4_psys_csslib ecryptfs snd_soc_apl_mgu_hu intel_xhci_usb_role_switch roles dwc3 adv728x udc_core intel_ipu4_isys videobuf2_dma_contig videobuf2_memops ipu4_acpi intel_ipu4_isys_csslib coretemp snd_soc_skl videobuf2_v4l2 videobuf2_core sdw_cnl snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core sbi_apl snd_compress i2c_i801 snd_soc_skl_ipc sdw_bus crc8 intel_ipu4_mmu snd_soc_sst_ipc snd_soc_sst_dsp ahci snd_hda_ext_core libahci xhci_pci snd_hda_core mei_me xhci_hcd snd_pcm libata cfg80211 mei snd_timer usbcore dwc3_pci snd scsi_mod usb_common
[44087.607643] skl_ipc_process_reply: 65 callbacks suppressed
[44087.679547] soundcore rfkill intel_ipu4 iova nfsd auth_rpcgss lockd grace sunrpc zram zsmalloc loop fuse 8021q bridge stp llc inap560t(O) i915 video backlight intel_gtt i2c_algo_bit drm_kms_helper drm firmware_class igb_avb(O) ptp hwmon spi_pxa2xx_platform pps_core [last unloaded: bcmdhd]
[44087.708253] CPU: 1 PID: 248 Comm: 6310_io03 Tainted: G U O 4.14.198-apl #1
[44087.717004] task: ffffa0d273b75780 task.stack: ffffa28cc0714000
[44087.723623] RIP: 0010:ip_route_output_key_hash_rcu+0x763/0x860
[44087.730138] RSP: 0018:ffffa28cc0717940 EFLAGS: 00010206
[44087.735976] RAX: ffffa0d0d1f40b00 RBX: ffffa0d26e6d6038 RCX: 000499f24000000a
[44087.743961] RDX: 0000000000000001 RSI: 0000000060c730a0 RDI: 0000000000000000
[44087.751942] RBP: ffffa28cc0717990 R08: 0000000000000000 R09: ffffa0d272e29200
[44087.759919] R10: 0000000000000000 R11: ffffa0d0c60ec300 R12: ffffa0d0d1f40b00
[44087.767897] R13: ffffa28cc07179a0 R14: ffffa0d272e31000 R15: 0000000000000000
[44087.775870] FS: 00007f85affff700(0000) GS:ffffa0d27fc80000(0000) knlGS:0000000000000000
[44087.784917] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[44087.791338] CR2: 00007f3837ec1000 CR3: 00000001f70d4000 CR4: 00000000003406a0
[44087.799314] Call Trace:
[44087.802048] ip_route_output_key_hash+0x82/0xb0
[44087.807119] ip_route_output_flow+0x19/0x50
[44087.811798] ip_queue_xmit+0x389/0x3c0
[44087.815986] __tcp_transmit_skb+0x598/0x9f0
[44087.820658] tcp_write_xmit+0x1b7/0xf50
[44087.824942] __tcp_push_pending_frames+0x30/0xd0
[44087.830109] tcp_push+0xe7/0x110
[44087.833712] tcp_sendmsg_locked+0x9ac/0xe50
[44087.838386] ? __switch_to_asm+0x41/0x70
[44087.842770] tcp_sendmsg+0x27/0x40
[44087.846574] inet_sendmsg+0x2f/0xf0
[44087.850468] sock_sendmsg+0x31/0x40
[44087.854363] ___sys_sendmsg+0x28d/0x2a0
[44087.858646] ? wake_up_q+0x54/0x80
[44087.862444] ? futex_wake+0x8a/0x180
[44087.866437] ? do_futex+0xc0/0xc60
[44087.870234] ? tick_program_event+0x3f/0x70
[44087.874912] ? __fget+0x71/0xa0
[44087.878422] __sys_sendmsg+0x4f/0x90
[44087.882412] ? __sys_sendmsg+0x4f/0x90
[44087.886591] SyS_sendmsg+0x9/0x10
[44087.890290] do_syscall_64+0x79/0x350
[44087.894380] ? schedule+0x2e/0x90
[44087.898079] ? exit_to_usermode_loop+0x5a/0x90
[44087.903040] entry_SYSCALL_64_after_hwframe+0x41/0xa6
[44087.908680] RIP: 0033:0x7f85bb79e807
[44087.912667] RSP: 002b:00007f85afffceb0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[44087.921131] RAX: ffffffffffffffda RBX: 0000000000000132 RCX: 00007f85bb79e807
[44087.929116] RDX: 0000000000004000 RSI: 00007f85afffcf20 RDI: 0000000000000132
[44087.937083] RBP: 00007f85afffcf20 R08: 0000000000000000 R09: 0000000000000000
[44087.945057] R10: 000000000536591c R11: 0000000000000293 R12: 0000000000004000
[44087.953022] R13: 0000560a8a4d5208 R14: 00007f85affff5c0 R15: 0000560a8a8a7040
[44087.960994] Code: 4d c8 4c 89 45 d0 e8 dd d2 bd ff 4c 8b 45 d0 4c 8b 4d c8 e9 83 fa ff ff 48 8b 08 48 8b 89 98 04 00 00 48 85 c9 0f 84 a5 fe ff ff <8b> 89 04 06 00 00 39 88 a0 00 00 00 0f 85 93 fe ff ff 8b 80 80
[44087.982134] RIP: ip_route_output_key_hash_rcu+0x763/0x860 RSP: ffffa28cc0717940

> > > https://www.spinics.net/lists/stable-commits/msg133055.html
> > > But this patch didn't make it into 4.14.
> > >
> > > Can someone check this race condition?
> > >
> >
> > dst->dev is NULL. Adding author of the patch for thoughts.
>
> It definitely looks like the race described in
> https://www.spinics.net/lists/stable-commits/msg133055.html, from all
> the evidence above.
> However, I am not very sure, how rt_is_expired() could crash with NULL
> ptr. I don't think dst->dev could be NULL even after calling
> dst_dev_put(), cause we assign loopback_dev to dst->dev there.
> Also, Carsten mentioned the memdump shows 'dst.obsolete=0x0002'. In
> rt_is_expired(), if 'dst.obsolete=0x0002', we should not call
> rt_is_expired(). So there might be some memory barrier that is
> required in dst_dev_put()?

Is this part locked, between checking dst.obsolete=0x0002 and calling
rt_is_expired()?

> But anyway, since the fix above replaced dst_dev_put() with
> rt_add_uncached_list(), I believe this crash should also be fixed.
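
For reference, the ordering asked about above (check dst.obsolete first,
then call rt_is_expired()) can be modeled in user space. This is only a
sketch: the model_* names and the deref_count counter are hypothetical
stand-ins, and the shape of rt_cache_valid() and the DST_OBSOLETE_* values
are written from memory of net/ipv4/route.c and include/net/dst.h, so
please check them against the 4.14 tree. It shows that, without a
concurrent writer, an entry marked DST_OBSOLETE_DEAD (2) never reaches the
dev dereference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match include/net/dst.h; verify against the tree in use. */
#define DST_OBSOLETE_DEAD       2
#define DST_OBSOLETE_FORCE_CHK  (-1)

/* Hypothetical stand-ins for struct net, struct net_device, struct rtable. */
struct model_net { int rt_genid; };
struct model_dev { struct model_net *nd_net; };
struct model_rt  { struct model_dev *dev; short obsolete; int rt_genid; };

/* Counts how often the dev -> net chain is actually dereferenced. */
static int deref_count;

/* Models rt_is_expired(): dereferences rt->dev and then dev->nd_net. */
static bool model_rt_is_expired(const struct model_rt *rt)
{
    deref_count++;
    return rt->rt_genid != rt->dev->nd_net->rt_genid;
}

/*
 * Models rt_cache_valid(): short-circuit evaluation means the expiry
 * check only runs when obsolete == DST_OBSOLETE_FORCE_CHK.
 */
static bool model_rt_cache_valid(const struct model_rt *rt)
{
    return rt &&
           rt->obsolete == DST_OBSOLETE_FORCE_CHK &&
           !model_rt_is_expired(rt);
}

/* Small self-check: a dead entry with a NULL dev is rejected untouched. */
static void model_demo(void)
{
    struct model_rt dead = { NULL, DST_OBSOLETE_DEAD, 1 };

    deref_count = 0;
    assert(!model_rt_cache_valid(&dead));
    assert(deref_count == 0);
}
```

If this matches the real code, reaching rt_is_expired() on an entry whose
dump shows obsolete == 2 suggests the entry changed (or was freed and
reused) after the check, which a single-threaded model like this cannot
reproduce.
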

The crash seen here occurred with that fix applied; adapting it to 4.14 was
a trivial "move the code a bit" (and it has meanwhile been merged into 4.14
stable).

Additionally, in my kernel I added a NULL pointer check, which obviously
doesn't really help here (this is not upstream):

 static inline bool rt_is_expired(const struct rtable *rth)
 {
-	return rth->rt_genid != rt_genid_ipv4(dev_net(rth->dst.dev));
+	return (dev_net(rth->dst.dev) == NULL) ||
+		(rth->rt_genid != rt_genid_ipv4(dev_net(rth->dst.dev)));
 }

Any thoughts on what could cause this?
To me it looks like a UAF and/or a race.

What I have also found is:
https://syzkaller.appspot.com/bug?id=1157f1b5ae3cf62f60d725868ae9fc53e9050aa0

Best regards
Carsten
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München
Register court: München, HRB 106955; Managing Directors: Thomas Heurung, Frank Thürauf