FYI, I picked this up again with Ubuntu 20.10 and still see the same
issue.
# lsb_release -d ; uname -srmo
Description: Ubuntu 20.10
Linux 5.8.0-43-generic x86_64 GNU/Linux
Dmesg output:
[Sun Feb 28 11:44:52 2021] NOHZ: local_softirq_pending 08
[Sun Feb 28 11:44:52 2021] NOHZ: local_softirq_pending 08
[Sun Feb 28 12:00:56 2021] INFO: task NetworkManager:1421 blocked for more than
120 seconds.
[Sun Feb 28 12:00:56 2021] Tainted: P W OE 5.8.0-43-generic
#49-Ubuntu
[Sun Feb 28 12:00:56 2021] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[Sun Feb 28 12:00:56 2021] NetworkManager D 0 1421 1 0x00000000
[Sun Feb 28 12:00:56 2021] Call Trace:
[Sun Feb 28 12:00:56 2021] __schedule+0x212/0x5d0
[Sun Feb 28 12:00:56 2021] schedule+0x55/0xc0
[Sun Feb 28 12:00:56 2021] schedule_preempt_disabled+0xe/0x10
[Sun Feb 28 12:00:56 2021] __mutex_lock.constprop.0+0x14a/0x490
[Sun Feb 28 12:00:56 2021] ? netlink_dump+0xc3/0x400
[Sun Feb 28 12:00:56 2021] __mutex_lock_slowpath+0x13/0x20
[Sun Feb 28 12:00:56 2021] mutex_lock+0x34/0x40
[Sun Feb 28 12:00:56 2021] rtnl_lock+0x15/0x20
[Sun Feb 28 12:00:56 2021] nl80211_dump_scan+0x34/0x160 [cfg80211]
[Sun Feb 28 12:00:56 2021] ? __alloc_skb+0xa4/0x200
[Sun Feb 28 12:00:56 2021] netlink_dump+0x18e/0x400
[Sun Feb 28 12:00:56 2021] __netlink_dump_start+0x20e/0x2f0
[Sun Feb 28 12:00:56 2021] genl_family_rcv_msg_dumpit+0x8f/0x110
[Sun Feb 28 12:00:56 2021] ? genl_rcv_msg+0xa0/0xa0
[Sun Feb 28 12:00:56 2021] ? nl80211_dump_survey+0x2e0/0x2e0 [cfg80211]
[Sun Feb 28 12:00:56 2021] ? genl_family_rcv_msg_dumpit+0x110/0x110
[Sun Feb 28 12:00:56 2021] genl_family_rcv_msg+0x1ef/0x290
[Sun Feb 28 12:00:56 2021] ? do_poll.constprop.0+0x287/0x3a0
[Sun Feb 28 12:00:56 2021] ? free_pcp_prepare+0x59/0x110
[Sun Feb 28 12:00:56 2021] ? genl_family_rcv_msg+0x290/0x290
[Sun Feb 28 12:00:56 2021] genl_rcv_msg+0x4c/0xa0
[Sun Feb 28 12:00:56 2021] ? genl_family_rcv_msg+0x290/0x290
[Sun Feb 28 12:00:56 2021] netlink_rcv_skb+0x4e/0x110
[Sun Feb 28 12:00:56 2021] genl_rcv+0x29/0x40
[Sun Feb 28 12:00:56 2021] netlink_unicast+0x218/0x330
[Sun Feb 28 12:00:56 2021] netlink_sendmsg+0x23b/0x460
[Sun Feb 28 12:00:56 2021] ? aa_sk_perm+0x43/0x1b0
[Sun Feb 28 12:00:56 2021] sock_sendmsg+0x65/0x70
[Sun Feb 28 12:00:56 2021] ____sys_sendmsg+0x257/0x2a0
[Sun Feb 28 12:00:56 2021] ? sendmsg_copy_msghdr+0x7e/0xa0
[Sun Feb 28 12:00:56 2021] ? get_order+0x20/0x20
[Sun Feb 28 12:00:56 2021] ___sys_sendmsg+0x82/0xc0
[Sun Feb 28 12:00:56 2021] ? get_order+0x20/0x20
[Sun Feb 28 12:00:56 2021] ? get_order+0x20/0x20
[Sun Feb 28 12:00:56 2021] ? get_order+0x20/0x20
[Sun Feb 28 12:00:56 2021] ? ep_poll+0x2ec/0x480
[Sun Feb 28 12:00:56 2021] ? __fget_light+0x32/0x80
[Sun Feb 28 12:00:56 2021] __sys_sendmsg+0x62/0xb0
[Sun Feb 28 12:00:56 2021] __x64_sys_sendmsg+0x1f/0x30
[Sun Feb 28 12:00:56 2021] do_syscall_64+0x49/0xc0
[Sun Feb 28 12:00:56 2021] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Sun Feb 28 12:00:56 2021] RIP: 0033:0x7fca1587b91d
[Sun Feb 28 12:00:56 2021] Code: Unable to access opcode bytes at RIP
0x7fca1587b8f3.
[Sun Feb 28 12:00:56 2021] RSP: 002b:00007ffcd45b2210 EFLAGS: 00000293
ORIG_RAX: 000000000000002e
[Sun Feb 28 12:00:56 2021] RAX: ffffffffffffffda RBX: 000055abfea2bc40 RCX:
00007fca1587b91d
[Sun Feb 28 12:00:56 2021] RDX: 0000000000000000 RSI: 00007ffcd45b2260 RDI:
000000000000000b
[Sun Feb 28 12:00:56 2021] RBP: 00007ffcd45b2260 R08: 0000000000000000 R09:
000055abfec2d3a0
[Sun Feb 28 12:00:56 2021] R10: 000055abfebc8ba0 R11: 0000000000000293 R12:
000055abfea2bc40
[Sun Feb 28 12:00:56 2021] R13: 000055abfea2bdc0 R14: 00007fca158e2f80 R15:
000055abfea2c340
If I try to run any other networking commands (like ip), they hang.
Here's the NetworkManager callstack:
# cat /proc/1421/stack
[<0>] rtnl_lock+0x15/0x20
[<0>] nl80211_dump_scan+0x34/0x160 [cfg80211]
[<0>] netlink_dump+0x18e/0x400
[<0>] __netlink_dump_start+0x20e/0x2f0
[<0>] genl_family_rcv_msg_dumpit+0x8f/0x110
[<0>] genl_family_rcv_msg+0x1ef/0x290
[<0>] genl_rcv_msg+0x4c/0xa0
[<0>] netlink_rcv_skb+0x4e/0x110
[<0>] genl_rcv+0x29/0x40
[<0>] netlink_unicast+0x218/0x330
[<0>] netlink_sendmsg+0x23b/0x460
[<0>] sock_sendmsg+0x65/0x70
[<0>] ____sys_sendmsg+0x257/0x2a0
[<0>] ___sys_sendmsg+0x82/0xc0
[<0>] __sys_sendmsg+0x62/0xb0
[<0>] __x64_sys_sendmsg+0x1f/0x30
[<0>] do_syscall_64+0x49/0xc0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
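To see which other tasks are stuck behind the same lock, the manual "cat /proc/1421/stack" check above can be repeated for every task in uninterruptible sleep (state D). The following is a minimal sketch, not a tested diagnostic tool: the function names are my own, and it assumes a Linux /proc layout and root privileges (reading /proc/<pid>/stack normally requires them).

```python
import os
import re

# One frame of /proc/<pid>/stack, e.g. "[<0>] rtnl_lock+0x15/0x20"
FRAME_RE = re.compile(r"^\[<[0-9a-fx]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_stack(text):
    """Return the function names from a stack dump, innermost frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1))
    return frames


def waits_on_rtnl(text):
    """True if the stack dump contains an rtnl_lock frame."""
    return "rtnl_lock" in parse_stack(text)


def task_state(pid):
    """Return the single-letter scheduler state from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name may contain spaces or ')', so split on the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def blocked_tasks():
    """Yield (pid, stack_text) for every task currently in state D."""
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            if task_state(entry) != "D":
                continue
            with open(f"/proc/{entry}/stack") as f:
                stack = f.read()
        except OSError:
            # Task exited in the meantime, or we lack permission.
            continue
        yield int(entry), stack
```

Run as root, something like "for pid, stack in blocked_tasks(): print(pid, waits_on_rtnl(stack))" should list the hung tasks and flag those waiting in rtnl_lock. It does not identify which task actually holds the lock.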
** Changed in: linux (Ubuntu)
Status: Expired => Incomplete
--
https://bugs.launchpad.net/bugs/1878321
Title:
TP-Link UE300 [2357:0601]: Kernel workqueue rtl_work_func_t [r8152]
gets stuck, preventing network connectivity: Bad RIP value
Status in linux package in Ubuntu:
Incomplete
Bug description:
$ lsb_release -rd
Description: Ubuntu 19.10
Release: 19.10
Package: linux-image-5.3.0-51-generic
What happened: After plugging my TP-Link (Realtek 8152 based) USB GbE
ethernet device into my laptop and using it, network connectivity is
lost several minutes later. Once this has occurred, various
processes on the system hang when I run them (e.g. "ip a"; "sudo"). It
appears that the rtnetlink lock is held and never released. The
device disappears from the list of available devices.
What I expected: I can plug the ethernet device in and use it to
successfully observe cats on the internet for hours at a time.
Kernel logs:
May 11 22:15:58 allosaurus kernel: usb 2-2: new SuperSpeed Gen 1 USB device
number 2 using xhci_hcd
May 11 22:15:58 allosaurus kernel: usb 2-2: New USB device found,
idVendor=2357, idProduct=0601, bcdDevice=30.00
May 11 22:15:58 allosaurus kernel: usb 2-2: New USB device strings: Mfr=1,
Product=2, SerialNumber=6
May 11 22:15:58 allosaurus kernel: usb 2-2: Product: USB 10/100/1000 LAN
May 11 22:15:58 allosaurus kernel: usb 2-2: Manufacturer: TP-LINK
May 11 22:15:58 allosaurus kernel: usb 2-2: SerialNumber: 000001000000
May 11 22:15:59 allosaurus kernel: usbcore: registered new interface driver
r8152
May 11 22:15:59 allosaurus kernel: usbcore: registered new interface driver
cdc_ether
May 11 22:15:59 allosaurus kernel: usb 2-2: reset SuperSpeed Gen 1 USB device
number 2 using xhci_hcd
May 11 22:15:59 allosaurus kernel: r8152 2-2:1.0 eth0: v1.09.11
May 11 22:15:59 allosaurus kernel: r8152 2-2:1.0 enxd03745081b4b: renamed
from eth0
May 11 22:16:02 allosaurus kernel: IPv6: ADDRCONF(NETDEV_CHANGE):
enxd03745081b4b: link becomes ready
May 11 22:16:02 allosaurus kernel: r8152 2-2:1.0 enxd03745081b4b: carrier on
May 11 22:16:02 allosaurus kernel: r8152 2-2:1.0 enxd03745081b4b: carrier off
May 11 22:16:05 allosaurus kernel: r8152 2-2:1.0 enxd03745081b4b: carrier on
May 11 22:20:17 allosaurus kernel: NOHZ: local_softirq_pending 08
May 11 22:24:33 allosaurus kernel: NOHZ: local_softirq_pending 08
May 11 22:24:39 allosaurus kernel: NOHZ: local_softirq_pending 08
May 11 22:25:14 allosaurus kernel: NOHZ: local_softirq_pending 08
May 11 22:30:31 allosaurus kernel: INFO: task kworker/1:2:10776 blocked for
more than 120 seconds.
May 11 22:30:31 allosaurus kernel: Tainted: P OE
5.3.0-51-generic #44-Ubuntu
May 11 22:30:31 allosaurus kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 11 22:30:31 allosaurus kernel: kworker/1:2 D 0 10776 2
0x80004000
May 11 22:30:31 allosaurus kernel: Workqueue: events rtl_work_func_t [r8152]
May 11 22:30:31 allosaurus kernel: Call Trace:
May 11 22:30:31 allosaurus kernel: __schedule+0x2b9/0x6c0
May 11 22:30:31 allosaurus kernel: schedule+0x42/0xb0
May 11 22:30:31 allosaurus kernel: rpm_resume+0x174/0x780
May 11 22:30:31 allosaurus kernel: ? wait_woken+0x80/0x80
May 11 22:30:31 allosaurus kernel: rpm_resume+0x31d/0x780
May 11 22:30:31 allosaurus kernel: ? __switch_to_asm+0x34/0x70
May 11 22:30:31 allosaurus kernel: ? __switch_to_xtra+0x1c5/0x5c0
May 11 22:30:31 allosaurus kernel: ? __switch_to_asm+0x34/0x70
May 11 22:30:31 allosaurus kernel: ? __switch_to_asm+0x40/0x70
May 11 22:30:31 allosaurus kernel: ? __switch_to_asm+0x34/0x70
May 11 22:30:31 allosaurus kernel: __pm_runtime_resume+0x52/0x80
May 11 22:30:31 allosaurus kernel: usb_autopm_get_interface+0x1d/0x50
May 11 22:30:31 allosaurus kernel: rtl_work_func_t+0x70/0x285 [r8152]
May 11 22:30:31 allosaurus kernel: ? __schedule+0x2c1/0x6c0
May 11 22:30:31 allosaurus kernel: process_one_work+0x1db/0x380
May 11 22:30:31 allosaurus kernel: worker_thread+0x4d/0x400
May 11 22:30:31 allosaurus kernel: kthread+0x104/0x140
May 11 22:30:31 allosaurus kernel: ? process_one_work+0x380/0x380
May 11 22:30:31 allosaurus kernel: ? kthread_park+0x80/0x80
May 11 22:30:31 allosaurus kernel: ret_from_fork+0x35/0x40
May 11 22:31:34 allosaurus kernel: usb 2-2: USB disconnect, device number 2
May 11 22:32:32 allosaurus kernel: INFO: task NetworkManager:1470 blocked for
more than 120 seconds.
May 11 22:32:32 allosaurus kernel: Tainted: P OE
5.3.0-51-generic #44-Ubuntu
May 11 22:32:32 allosaurus kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 11 22:32:32 allosaurus kernel: NetworkManager D 0 1470 1
0x00000000
May 11 22:32:32 allosaurus kernel: Call Trace:
May 11 22:32:32 allosaurus kernel: __schedule+0x2b9/0x6c0
May 11 22:32:32 allosaurus kernel: schedule+0x42/0xb0
May 11 22:32:32 allosaurus kernel: schedule_preempt_disabled+0xe/0x10
May 11 22:32:32 allosaurus kernel: __mutex_lock.isra.0+0x182/0x4f0
May 11 22:32:32 allosaurus kernel: __mutex_lock_slowpath+0x13/0x20
May 11 22:32:32 allosaurus kernel: mutex_lock+0x2e/0x40
May 11 22:32:32 allosaurus kernel: rtnl_lock+0x15/0x20
May 11 22:32:32 allosaurus kernel: nl80211_dump_scan+0x34/0x6d0 [cfg80211]
May 11 22:32:32 allosaurus kernel: ? __kmalloc_reserve.isra.0+0x31/0x90
May 11 22:32:32 allosaurus kernel: genl_lock_dumpit+0x33/0x50
May 11 22:32:32 allosaurus kernel: netlink_dump+0x18b/0x380
May 11 22:32:32 allosaurus kernel: __netlink_dump_start+0x191/0x200
May 11 22:32:32 allosaurus kernel: genl_family_rcv_msg+0x2f3/0x470
May 11 22:32:32 allosaurus kernel: ? genl_lock_dumpit+0x50/0x50
May 11 22:32:32 allosaurus kernel: ? genl_lock_done+0x50/0x50
May 11 22:32:32 allosaurus kernel: ? genl_unlock+0x20/0x20
May 11 22:32:32 allosaurus kernel: ? __alloc_skb+0x84/0x1d0
May 11 22:32:32 allosaurus kernel: ? do_sys_poll+0x415/0x530
May 11 22:32:32 allosaurus kernel: genl_rcv_msg+0x4c/0xa0
May 11 22:32:32 allosaurus kernel: ? genl_family_rcv_msg+0x470/0x470
May 11 22:32:32 allosaurus kernel: netlink_rcv_skb+0x50/0x120
May 11 22:32:32 allosaurus kernel: genl_rcv+0x29/0x40
May 11 22:32:32 allosaurus kernel: netlink_unicast+0x187/0x220
May 11 22:32:32 allosaurus kernel: netlink_sendmsg+0x222/0x3e0
May 11 22:32:32 allosaurus kernel: sock_sendmsg+0x65/0x70
May 11 22:32:32 allosaurus kernel: ____sys_sendmsg+0x212/0x280
May 11 22:32:32 allosaurus kernel: ___sys_sendmsg+0x88/0xd0
May 11 22:32:32 allosaurus kernel: ? set_fd_set.part.0+0x50/0x50
May 11 22:32:32 allosaurus kernel: ? set_fd_set.part.0+0x50/0x50
May 11 22:32:32 allosaurus kernel: ? set_fd_set.part.0+0x50/0x50
May 11 22:32:32 allosaurus kernel: ? ep_poll+0x294/0x420
May 11 22:32:32 allosaurus kernel: ? __fget_light+0x57/0x70
May 11 22:32:32 allosaurus kernel: __sys_sendmsg+0x5c/0xa0
May 11 22:32:32 allosaurus kernel: __x64_sys_sendmsg+0x1f/0x30
May 11 22:32:32 allosaurus kernel: do_syscall_64+0x5a/0x130
May 11 22:32:32 allosaurus kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 11 22:32:32 allosaurus kernel: RIP: 0033:0x7fae0d9e52ad
May 11 22:32:32 allosaurus kernel: Code: Bad RIP value.
May 11 22:32:32 allosaurus kernel: RSP: 002b:00007fff53453b70 EFLAGS:
00000293 ORIG_RAX: 000000000000002e
May 11 22:32:32 allosaurus kernel: RAX: ffffffffffffffda RBX:
0000561c49219380 RCX: 00007fae0d9e52ad
May 11 22:32:32 allosaurus kernel: RDX: 0000000000000000 RSI:
00007fff53453bc0 RDI: 000000000000000b
May 11 22:32:32 allosaurus kernel: RBP: 00007fff53453bc0 R08:
0000000000000000 R09: 0000000000001000
May 11 22:32:32 allosaurus kernel: R10: 0000561c491ec010 R11:
0000000000000293 R12: 0000561c49219380
May 11 22:32:32 allosaurus kernel: R13: 0000561c49219540 R14:
00007fae0db1f280 R15: 0000561c49421370
May 11 22:32:32 allosaurus kernel: INFO: task Qt bearer threa:2482 blocked
for more than 120 seconds.
May 11 22:32:32 allosaurus kernel: Tainted: P OE
5.3.0-51-generic #44-Ubuntu
May 11 22:32:32 allosaurus kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 11 22:32:32 allosaurus kernel: Qt bearer threa D 0 2482 1
0x00000000
May 11 22:32:32 allosaurus kernel: Call Trace:
May 11 22:32:32 allosaurus kernel: __schedule+0x2b9/0x6c0
May 11 22:32:32 allosaurus kernel: schedule+0x42/0xb0
May 11 22:32:32 allosaurus kernel: schedule_preempt_disabled+0xe/0x10
May 11 22:32:32 allosaurus kernel: __mutex_lock.isra.0+0x182/0x4f0
May 11 22:32:32 allosaurus kernel: __mutex_lock_slowpath+0x13/0x20
May 11 22:32:32 allosaurus kernel: mutex_lock+0x2e/0x40
May 11 22:32:32 allosaurus kernel: __netlink_dump_start+0x59/0x200
May 11 22:32:32 allosaurus kernel: rtnetlink_rcv_msg+0x23a/0x380
May 11 22:32:32 allosaurus kernel: ? rtnl_fill_ifinfo+0xe80/0xe80
May 11 22:32:32 allosaurus kernel: ? rtnl_fill_ifinfo+0xe80/0xe80
May 11 22:32:32 allosaurus kernel: ? rtnl_calcit.isra.0+0x100/0x100
May 11 22:32:32 allosaurus kernel: netlink_rcv_skb+0x50/0x120
May 11 22:32:32 allosaurus kernel: rtnetlink_rcv+0x15/0x20
May 11 22:32:32 allosaurus kernel: netlink_unicast+0x187/0x220
May 11 22:32:32 allosaurus kernel: netlink_sendmsg+0x222/0x3e0
May 11 22:32:32 allosaurus kernel: sock_sendmsg+0x65/0x70
May 11 22:32:32 allosaurus kernel: __sys_sendto+0x113/0x190
May 11 22:32:32 allosaurus kernel: ? fd_install+0x27/0x30
May 11 22:32:32 allosaurus kernel: ? __sys_socket+0x9e/0xf0
May 11 22:32:32 allosaurus kernel: __x64_sys_sendto+0x29/0x30
May 11 22:32:32 allosaurus kernel: do_syscall_64+0x5a/0x130
May 11 22:32:32 allosaurus kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 11 22:32:32 allosaurus kernel: RIP: 0033:0x7f07df539dfa
May 11 22:32:32 allosaurus kernel: Code: Bad RIP value.
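The hung-task reports in the log above can be summarised mechanically, which helps when comparing kernels. This is a minimal sketch under the assumption that the log is available as text in the format shown (the function name hung_tasks is hypothetical). It tolerates the line wrapping introduced by mail clients and task names containing spaces or colons.

```python
import re

# "INFO: task <name>:<pid> blocked for more than <N> seconds."
# \s+ between words tolerates lines wrapped by the mailer; the greedy
# (.+) keeps colons inside the task name (e.g. "kworker/1:2").
HUNG_RE = re.compile(
    r"INFO: task (.+):(\d+)\s+blocked\s+for\s+more\s+than\s+(\d+)\s+seconds"
)


def hung_tasks(log_text):
    """Return (task_name, pid, timeout_seconds) for each hung-task report."""
    return [
        (name, int(pid), int(secs))
        for name, pid, secs in HUNG_RE.findall(log_text)
    ]
```

Applied to the excerpt above, this should report kworker/1:2 (pid 10776) first, followed by NetworkManager (1470) and "Qt bearer threa" (2482). That order matches the rtl_work_func_t worker blocking before the tasks that wait on the rtnetlink lock.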
After observing this issue, I upgraded to the mainline image
"linux-image-unsigned-5.6.11-050611-generic" from
https://kernel.ubuntu.com/~kernel-ppa/mainline/ and this fixed the hang
(although the device now continually reconnects). See the corresponding kernel
logs below for comparison against the above:
May 12 19:12:00 allosaurus kernel: usb 4-1.2: new SuperSpeed Gen 1 USB device
number 16 using xhci_hcd
May 12 19:12:00 allosaurus kernel: usb 4-1.2: New USB device found,
idVendor=2357, idProduct=0601, bcdDevice=30.00
May 12 19:12:00 allosaurus kernel: usb 4-1.2: New USB device strings: Mfr=1,
Product=2, SerialNumber=6
May 12 19:12:00 allosaurus kernel: usb 4-1.2: Product: USB 10/100/1000 LAN
May 12 19:12:00 allosaurus kernel: usb 4-1.2: Manufacturer: TP-LINK
May 12 19:12:00 allosaurus kernel: usb 4-1.2: SerialNumber: 000001000000
May 12 19:12:00 allosaurus kernel: usb 4-1.2: reset SuperSpeed Gen 1 USB
device number 16 using xhci_hcd
May 12 19:12:00 allosaurus kernel: r8152 4-1.2:1.0: Direct firmware load for
rtl_nic/rtl8153a-3.fw failed with error -2
May 12 19:12:00 allosaurus kernel: r8152 4-1.2:1.0: unable to load firmware
patch rtl_nic/rtl8153a-3.fw (-2)
May 12 19:12:00 allosaurus kernel: r8152 4-1.2:1.0 eth0: v1.11.11
May 12 19:12:00 allosaurus kernel: r8152 4-1.2:1.0 enxd03745081b4b: renamed
from eth0
May 12 19:12:03 allosaurus kernel: IPv6: ADDRCONF(NETDEV_CHANGE):
enxd03745081b4b: link becomes ready
May 12 19:12:03 allosaurus kernel: r8152 4-1.2:1.0 enxd03745081b4b: carrier on
May 12 19:12:03 allosaurus kernel: r8152 4-1.2:1.0 enxd03745081b4b: carrier
off
May 12 19:12:06 allosaurus kernel: r8152 4-1.2:1.0 enxd03745081b4b: carrier on
May 12 19:12:42 allosaurus kernel: usb 4-1.2: USB disconnect, device number 16
(I note that these mainline logs recur at least once per minute, but I
suspect that once I am able to fetch an up-to-date firmware package,
the latest package will address this; not yet confirmed.)
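The "failed with error -2" in the mainline log is -ENOENT, i.e. the kernel could not find rtl_nic/rtl8153a-3.fw. The sketch below pulls the failed firmware names out of a log and checks whether the files exist. It assumes the conventional /lib/firmware search root and that compressed ".xz" or ".zst" variants may be shipped instead; both are assumptions about the installed system, and the helper names are hypothetical.

```python
import errno
import os
import re

# "Direct firmware load for <name> failed with error <N>";
# \s+ tolerates the line wrapping seen in mailed logs.
FW_RE = re.compile(r"Direct firmware load for\s+(\S+)\s+failed with error\s+(-?\d+)")


def failed_firmware(log_text):
    """Return (firmware_name, errno_name) for each failed firmware load."""
    results = []
    for name, err in FW_RE.findall(log_text):
        code = abs(int(err))
        results.append((name, errno.errorcode.get(code, str(code))))
    return results


def firmware_present(name, root="/lib/firmware"):
    """True if the firmware file, or a compressed variant, exists under root."""
    base = os.path.join(root, name)
    # Assumption: some distributions ship firmware compressed as .xz or .zst.
    return any(os.path.exists(base + ext) for ext in ("", ".xz", ".zst"))
```

If firmware_present("rtl_nic/rtl8153a-3.fw") still returns False after updating the firmware package, the file is genuinely missing. That alone would not confirm the firmware is what causes the reconnects.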