On Mon, 3 Aug 2020 09:55:28 +0200 Cédric Dufour <cedric.duf...@ced-network.net> wrote:

> Package: linux-source-4.19
> Version: 4.19.132-1
> Severity: important
>
> Hello,
>
> Since linux-image-4.19.0-10-amd64, I'm facing regular Kernel panics - "RIP: 0010:__cgroup_bpf_run_filter_skb+0x26d/0x3d0" - resulting in full (file) *server freeze*.
>
> The issue is pretty well described and summarized in https://forum.proxmox.com/threads/kernel-5-4-44-causes-system-freeze-on-hp-microserver-gen8.72050/page-2#post-323498
>
> The "culprit" commit - "netprio_cgroup: Fix unlimited memory leak of v2 cgroups" - is indeed included in Debian kernel (4.19) since changelog entry 4.19.131-1
>
> It *seems* there is already a patch proposed upstream (although here for kernel 4.9): https://lkml.org/lkml/2020/7/20/883
>
> Best regards,
>
> Cédric
>
> --
> Cédric Dufour

FWIW, I am seeing a very similar issue. Some Debian 10 AWS instances that are used to run Guacamole via Docker recently started randomly freezing up on me. I enabled kernel dumps and finally caught one of the machines misbehaving. Looking at the kdump I see this:

      KERNEL: /usr/lib/debug/vmlinux-4.19.0-10-cloud-amd64
    DUMPFILE: dump.202008101612  [PARTIAL DUMP]
        CPUS: 2
        DATE: Mon Aug 10 16:11:47 2020
      UPTIME: 00:05:44
LOAD AVERAGE: 0.21, 0.11, 0.04
       TASKS: 261
    NODENAME: guac.env0.staging.cool.cyber.dhs.gov
     RELEASE: 4.19.0-10-cloud-amd64
     VERSION: #1 SMP Debian 4.19.132-1 (2020-07-24)
     MACHINE: x86_64  (2499 Mhz)
      MEMORY: 4 GB
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000010"
         PID: 1453
     COMMAND: "sshd"
        TASK: ffff8a3f695115c0  [THREAD_INFO: ffff8a3f695115c0]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 1453   TASK: ffff8a3f695115c0  CPU: 0   COMMAND: "sshd"
 #0 [ffffb37740c77800] machine_kexec at ffffffff97a4b297
 #1 [ffffb37740c77858] __crash_kexec at ffffffff97b0e7dd
 #2 [ffffb37740c77920] crash_kexec at ffffffff97b0f62d
 #3 [ffffb37740c77938] oops_end at ffffffff97a2907d
 #4 [ffffb37740c77958] no_context at ffffffff97a5858e
 #5 [ffffb37740c779b0] __do_page_fault at ffffffff97a58c42
 #6 [ffffb37740c77a20] async_page_fault at ffffffff982010be
    [exception RIP: __cgroup_bpf_run_filter_skb+189]
    RIP: ffffffff97b94ffd  RSP: ffffb37740c77ad0  RFLAGS: 00010286
    RAX: 0000000000000000  RBX: ffff8a3ff55e5ee8  RCX: 0000000000000000
    RDX: 0000000000000001  RSI: ffff8a3ff3d49800  RDI: ffff8a3ff52fd500
    RBP: ffff8a3ff52fd500   R8: ffff8a3ff55e5ee8   R9: 0000000000010000
    R10: 0000000000000001  R11: ffff8a3ef6dd7500  R12: 0000000000000000
    R13: 0000000000000000  R14: ffff8a3ff52fd840  R15: ffff8a3ff55e5ee8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb37740c77b30] ip_finish_output at ffffffff97f65988
 #8 [ffffb37740c77b68] ip_output at ffffffff97f6640c
 #9 [ffffb37740c77bc0] __ip_queue_xmit at ffffffff97f65e6d
#10 [ffffb37740c77c18] __tcp_transmit_skb at ffffffff97f80557
#11 [ffffb37740c77c88] tcp_write_xmit at ffffffff97f81e34
#12 [ffffb37740c77cf0] __tcp_push_pending_frames at ffffffff97f82ae1
#13 [ffffb37740c77d00] tcp_sendmsg_locked at ffffffff97f733ac
#14 [ffffb37740c77da8] tcp_sendmsg at ffffffff97f73507
#15 [ffffb37740c77dc8] sock_sendmsg at ffffffff97ee8aa6
#16 [ffffb37740c77de0] sock_write_iter at ffffffff97ee8b47
#17 [ffffb37740c77e50] new_sync_write at ffffffff97c49bfb
#18 [ffffb37740c77ed0] vfs_write at ffffffff97c4c7d5
#19 [ffffb37740c77f00] ksys_write at ffffffff97c4ca77
#20 [ffffb37740c77f38] do_syscall_64 at ffffffff97a04140
#21 [ffffb37740c77f50] entry_SYSCALL_64_after_hwframe at ffffffff98200088
    RIP: 00007fd74beba504  RSP: 00007ffc1d456638  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000000084  RCX: 00007fd74beba504
    RDX: 0000000000000084  RSI: 000055785f33bb90  RDI: 0000000000000003
    RBP: 000055785f31d630   R8: 0000000000000000   R9: 0000000000001000
    R10: 0000000000000008  R11: 0000000000000246  R12: 00000000000001dd
    R13: 000055785ddc9b00  R14: 0000000000000003  R15: 00007ffc1d4566e0
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

crash> sym ffffffff97b94ffd
ffffffff97b94ffd (T) __cgroup_bpf_run_filter_skb+189 ./debian/build/build_amd64_none_cloud-amd64/./kernel/bpf/cgroup.c: 539

crash> log
[    0.000000] Linux version 4.19.0-10-cloud-amd64 (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.132-1 (2020-07-24)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.19.0-10-cloud-amd64 root=UUID=9ac8f5bd-5b64-48cd-9efd-2b2d35a30500 ro console=tty0 console=ttyS0,115200 earlyprintk=ttyS0,115200 nmi_watchdog=1 elevator=noop scsi_mod.use_blk_mq=Y crashkernel=384M-:128M
<SNIP>
[  478.686368] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[  478.693551] PGD 0 P4D 0
[  478.696291] Oops: 0000 [#1] SMP PTI
[  478.699431] CPU: 0 PID: 1453 Comm: sshd Kdump: loaded Not tainted 4.19.0-10-cloud-amd64 #1 Debian 4.19.132-1
[  478.706782] Hardware name: Amazon EC2 t3.medium/, BIOS 1.0 10/16/2017
[  478.711129] RIP: 0010:__cgroup_bpf_run_filter_skb+0xbd/0x1e0
[  478.715172] Code: 00 00 00 49 89 7f 18 48 89 0c 24 44 89 e1 48 29 c8 48 89 4c 24 08 49 89 87 d8 00 00 00 89 d2 48 8d 84 d6 b0 03 00 00 48 8b 00 <48> 8b 58 10 4c 8d 70 10 48 85 db 0f 84 01 01 00 00 4d 8d 6f 30 bd
[  478.727711] RSP: 0018:ffffb37740c77ad0 EFLAGS: 00010286
[  478.731595] RAX: 0000000000000000 RBX: ffff8a3ff55e5ee8 RCX: 0000000000000000
[  478.736351] RDX: 0000000000000001 RSI: ffff8a3ff3d49800 RDI: ffff8a3ff52fd500
[  478.741042] RBP: ffff8a3ff52fd500 R08: ffff8a3ff55e5ee8 R09: 0000000000010000
[  478.745697] R10: 0000000000000001 R11: ffff8a3ef6dd7500 R12: 0000000000000000
[  478.750446] R13: 0000000000000000 R14: ffff8a3ff52fd840 R15: ffff8a3ff55e5ee8
[  478.755161] FS:  00007fd74bb17e40(0000) GS:ffff8a3ff7e00000(0000) knlGS:0000000000000000
[  478.761724] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  478.765853] CR2: 0000000000000010 CR3: 00000000a94e6005 CR4: 00000000007606b0
[  478.770524] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  478.775273] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  478.779984] PKRU: 55555554
[  478.782901] Call Trace:
[  478.785756]  ip_finish_output+0x228/0x270
[  478.789204]  ? nf_hook_slow+0x44/0xc0
[  478.792490]  ip_output+0x6c/0xe0
[  478.795685]  ? ip_append_data.part.49+0xd0/0xd0
[  478.799403]  __ip_queue_xmit+0x15d/0x410
[  478.802945]  ? set_fd_set.part.7+0x40/0x40
[  478.806411]  __tcp_transmit_skb+0x527/0xb10
[  478.810032]  tcp_write_xmit+0x384/0x1000
[  478.813636]  ? _copy_from_iter_full+0x94/0x240
[  478.817438]  __tcp_push_pending_frames+0x31/0xd0
[  478.821170]  tcp_sendmsg_locked+0xc1c/0xd50
[  478.824714]  tcp_sendmsg+0x27/0x40
[  478.827921]  sock_sendmsg+0x36/0x40
[  478.831280]  sock_write_iter+0x97/0x100
[  478.834714]  new_sync_write+0xfb/0x160
[  478.838010]  vfs_write+0xa5/0x1a0
[  478.841129]  ksys_write+0x57/0xd0
[  478.844250]  do_syscall_64+0x50/0xf0
[  478.847526]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  478.851385] RIP: 0033:0x7fd74beba504
[  478.854598] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
[  478.867315] RSP: 002b:00007ffc1d456638 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  478.873758] RAX: ffffffffffffffda RBX: 0000000000000084 RCX: 00007fd74beba504
[  478.878456] RDX: 0000000000000084 RSI: 000055785f33bb90 RDI: 0000000000000003
[  478.883176] RBP: 000055785f31d630 R08: 0000000000000000 R09: 0000000000001000
[  478.887885] R10: 0000000000000008 R11: 0000000000000246 R12: 00000000000001dd
[  478.892646] R13: 000055785ddc9b00 R14: 0000000000000003 R15: 00007ffc1d4566e0
[  478.897480] Modules linked in: xt_nat xt_tcpudp veth xt_conntrack ipt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nft_chain_nat_ipv4 nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nf_tables nfnetlink br_netfilter bridge stp llc binfmt_misc overlay crct10dif_pclmul crc32_pclmul ghash_clmulni_intel nls_ascii nls_cp437 vfat fat intel_rapl_perf evdev serio_raw button ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb crc32c_intel aesni_intel nvme aes_x86_64 crypto_simd ena nvme_core cryptd glue_helper
[  478.931979] CR2: 0000000000000010

Let me know if I can provide any other information that may be of use.

Shane Frasier
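
For anyone comparing their own dumps against this one: the `bt` output gives the faulting location as `__cgroup_bpf_run_filter_skb+189`, while the kernel log gives it as `__cgroup_bpf_run_filter_skb+0xbd/0x1e0`. These appear to be the same instruction, written once with a decimal offset and once with a hex offset (0xbd is 189). The following is a minimal Python sketch, not part of the original report, that checks this mechanically. The function names and regular expressions are hypothetical helpers written for illustration, and they assume the two line formats look like the ones quoted in this message.

```python
import re

# Kernel oops form, e.g. "RIP: 0010:__cgroup_bpf_run_filter_skb+0xbd/0x1e0"
# (segment selector, symbol, hex offset, hex function size).
OOPS_RIP = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)

# crash "bt" form, e.g. "[exception RIP: __cgroup_bpf_run_filter_skb+189]"
# (symbol, offset assumed to be decimal as in the output quoted above).
CRASH_RIP = re.compile(r"exception RIP: (?P<sym>[\w.]+)\+(?P<off>\d+)")


def parse_oops_rip(line):
    """Return (symbol, offset, size) from a kernel-log RIP line, or None."""
    m = OOPS_RIP.search(line)
    if m is None:
        return None
    return m["sym"], int(m["off"], 16), int(m["size"], 16)


def parse_crash_rip(line):
    """Return (symbol, offset) from a crash 'exception RIP' line, or None."""
    m = CRASH_RIP.search(line)
    if m is None:
        return None
    return m["sym"], int(m["off"])


def same_fault_site(oops_line, crash_line):
    """True if both lines name the same symbol at the same byte offset."""
    oops = parse_oops_rip(oops_line)
    crash = parse_crash_rip(crash_line)
    if oops is None or crash is None:
        return False
    return oops[0] == crash[0] and oops[1] == crash[1]


if __name__ == "__main__":
    oops_line = "[  478.711129] RIP: 0010:__cgroup_bpf_run_filter_skb+0xbd/0x1e0"
    crash_line = "    [exception RIP: __cgroup_bpf_run_filter_skb+189]"
    print(parse_oops_rip(oops_line))
    print(parse_crash_rip(crash_line))
    print(same_fault_site(oops_line, crash_line))
```

Note that the offset in the original report (`+0x26d/0x3d0`) differs from the one here (`+0xbd/0x1e0`). The function sizes differ as well, which suggests the two kernel builds compiled the function differently, so only the symbol name can be compared across the two reports, not the raw offsets.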