I'm yet to find a way to reliably reproduce this. Including apport report. Not much more information in there, as none was collected during the crash.
** Attachment added: "apport" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1805062/+attachment/5216640/+files/apport.linux.ut6j6uk2.apport -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1805062 Title: FSCache kernel panic on presumed CIFS share disconnect Status in linux package in Ubuntu: Incomplete Bug description: What Happened ------------- I have a setup with my home NAS server mounted via CIFS to an Ubuntu 18.04 VPS over OpenVPN. The CIFS share is served by Nginx as an open file directory, with fscache/cachefilesd being used to cache the files locally, to reduce the load on my home connection. I was using this setup for web development by editing files directly on the CIFS share from another machine (mounted via LAN, independently of the OpenVPN server). After making changes to an HTML file, I noticed that the Nginx server was no longer responding. I connected to the server via SSH to diagnose the issue and found that trying to perform a file listing (using ls) on the fileshare mount froze my shell, so I opened another SSH connection. I ran dmesg, expecting to find a CIFS timeout error, but instead there was an fscache error trace. Thinking it would help resolve the issue, I ran `systemd restart cachefilesd` to restart cachefilesd. My SSH connection froze again, but this time the server was no longer responding to SSH connections and pings. I was forced to perform a hard reset to restore operation. Upon further inspection of the logs, I noticed that the NAS reconnected to the OpenVPN server between my first and second SSH connection, so it's possible that a network issue caused the initial fscache error. I have no way of confirming this, that's why I put "presumed" in the title. I suspect the issue here might be a race condition when invalidating a cache object (the updated HTML file) during a network issue. What I expected to happen ------------------------- Accessing an updated file shouldn't cause an fscache error, even when a network error occurs. Restarting cachefilesd shouldn't cause a fatal panic, even when a network error occurs. Machine information ------------------- /proc/version_signature: Ubuntu 4.15.0-20.21-generic 4.15.17 lsb_release: Description: Ubuntu 18.04.1 LTS Release: 18.04 apt-cache policy linux-image-4.15.0-20-generic: linux-image-4.15.0-20-generic: Installed: 4.15.0-20.21 Candidate: 4.15.0-20.21 Version table: *** 4.15.0-20.21 500 500 http://gb.archive.ubuntu.com/ubuntu bionic/main amd64 Packages 100 /var/lib/dpkg/status apt-cache policy cachefilesd cachefilesd: cachefilesd: Installed: 0.10.10-0.1 Candidate: 0.10.10-0.1 Version table: *** 0.10.10-0.1 500 500 http://gb.archive.ubuntu.com/ubuntu bionic/universe amd64 Packages 100 /var/lib/dpkg/status apt-cache policy libsmbclient: libsmbclient: Installed: (none) Candidate: 2:4.7.6+dfsg~ubuntu-0ubuntu2.4 Version table: 2:4.7.6+dfsg~ubuntu-0ubuntu2.4 500 500 http://gb.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages 2:4.7.6+dfsg~ubuntu-0ubuntu2.2 500 500 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages 2:4.7.6+dfsg~ubuntu-0ubuntu2 500 500 http://gb.archive.ubuntu.com/ubuntu bionic/main amd64 Packages apt-cache policy cifs-utils: cifs-utils: Installed: 2:6.8-1 Candidate: 2:6.8-1 Version table: *** 2:6.8-1 500 500 http://gb.archive.ubuntu.com/ubuntu bionic/main amd64 Packages 100 /var/lib/dpkg/status The VPS provider is UK2, I believe they use KVM. journalctl output ----------------- Private and irrelevant information removed. Nov 26 00:30:31 hostname kernel: [kworke] preemptive burial: OBJ66 [DROP_OBJECT] 000000004b2987a1 --- [snip] (CacheFiles error + kernel trace) --- Nov 26 00:34:01 hostname kernel: CacheFiles: Error: Object already active Nov 26 00:34:01 hostname kernel: CacheFiles: object: OBJ6d Nov 26 00:34:01 hostname kernel: CacheFiles: objstate=LOOK_UP_OBJECT fl=18 wbusy=2 ev=0[0] Nov 26 00:34:01 hostname kernel: CacheFiles: ops=0 inp=0 exc=0 Nov 26 00:34:01 hostname kernel: CacheFiles: parent=00000000316fbe48 Nov 26 00:34:01 hostname kernel: CacheFiles: cookie=000000001fa898cd [pr=00000000f0382671 nd=00000000ba69f109 fl=22] Nov 26 00:34:01 hostname kernel: CacheFiles: key=[8] 'dd05000000000000' Nov 26 00:34:01 hostname kernel: ------------[ cut here ]------------ Nov 26 00:34:01 hostname kernel: kernel BUG at /build/linux-5s7Xkn/linux-4.15.0/fs/cachefiles/namei.c:164! Nov 26 00:34:01 hostname kernel: invalid opcode: 0000 [#1] SMP PTI Nov 26 00:34:01 hostname kernel: Modules linked in: binfmt_misc cmac arc4 md4 nls_utf8 cifs ccm cachefiles fscache ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_addrtype ipt_MASQUERADE nf_nat_masquerade_ip Nov 26 00:34:01 hostname kernel: async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid cirrus ttm psmouse drm_kms_helper syscopyarea sysfillrect sysimgblt virtio_net fb_sys_fops virtio_blk drm i2c_piix4 pata_acpi floppy Nov 26 00:34:01 hostname kernel: CPU: 0 PID: 22679 Comm: kworker/u64:2 Not tainted 4.15.0-20-generic #21-Ubuntu Nov 26 00:34:01 hostname kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Nov 26 00:34:01 hostname kernel: Workqueue: fscache_object fscache_object_work_func [fscache] Nov 26 00:34:01 hostname kernel: RIP: 0010:cachefiles_walk_to_object+0x820/0x8b0 [cachefiles] Nov 26 00:34:01 hostname kernel: RSP: 0018:ffffa36bc5a87d40 EFLAGS: 00010202 Nov 26 00:34:01 hostname kernel: RAX: ffff96b739444801 RBX: ffff96b732f5e3a8 RCX: 0000000000847bda Nov 26 00:34:01 hostname kernel: RDX: 0000000000847bd9 RSI: ffff96b73be27140 RDI: ffff96b73b807000 Nov 26 00:34:01 hostname kernel: RBP: ffffa36bc5a87de8 R08: 0000000000027140 R09: ffffffffc0709f4b Nov 26 00:34:01 hostname kernel: R10: ffffd9cc40e51100 R11: 0000000000010101 R12: ffff96b732f5d200 Nov 26 00:34:01 hostname kernel: R13: ffff96b732f5d310 R14: ffff96b70730ec00 R15: ffff96b732f5e3a8 Nov 26 00:34:01 hostname kernel: FS: 0000000000000000(0000) GS:ffff96b73be00000(0000) knlGS:0000000000000000 Nov 26 00:34:01 hostname kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Nov 26 00:34:01 hostname kernel: CR2: 00007ff8ad9d9f60 CR3: 000000002d522000 CR4: 00000000000006f0 Nov 26 00:34:01 hostname kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Nov 26 00:34:01 hostname kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Nov 26 00:34:01 hostname kernel: Call Trace: Nov 26 00:34:01 hostname kernel: ? wait_woken+0x80/0x80 Nov 26 00:34:01 hostname kernel: cachefiles_lookup_object+0x4d/0xe0 [cachefiles] Nov 26 00:34:01 hostname kernel: fscache_look_up_object+0xbd/0x1c0 [fscache] Nov 26 00:34:01 hostname kernel: fscache_object_work_func+0x63/0x1f0 [fscache] Nov 26 00:34:01 hostname kernel: process_one_work+0x1de/0x410 Nov 26 00:34:01 hostname kernel: worker_thread+0x32/0x410 Nov 26 00:34:01 hostname kernel: kthread+0x121/0x140 Nov 26 00:34:01 hostname kernel: ? process_one_work+0x410/0x410 Nov 26 00:34:01 hostname kernel: ? kthread_create_worker_on_cpu+0x70/0x70 Nov 26 00:34:01 hostname kernel: ? do_syscall_64+0x73/0x130 Nov 26 00:34:01 hostname kernel: ? kthread_create_worker_on_cpu+0x70/0x70 Nov 26 00:34:01 hostname kernel: ret_from_fork+0x35/0x40 Nov 26 00:34:01 hostname kernel: Code: 1e ca 48 c7 c7 70 b1 70 c0 e8 ed 58 1e ca 0f 0b e8 d6 3a 18 ca 48 c7 c7 30 b2 70 c0 e8 da 58 1e ca 31 f6 4c 89 e7 e8 e8 2a 00 00 <0f> 0b 80 bb 1c 01 00 00 00 0f 85 6c f9 ff ff 3d 00 00 20 00 0f Nov 26 00:34:01 hostname kernel: RIP: cachefiles_walk_to_object+0x820/0x8b0 [cachefiles] RSP: ffffa36bc5a87d40 Nov 26 00:34:01 hostname kernel: ---[ end trace 788b2ad4e37538d5 ]--- --- [snip] (fileshare server reconnects to OpenVPN) --- Nov 26 00:34:43 hostname openvpn[15768]: [snip]:[snip] TLS: tls_process: killed expiring key Nov 26 00:34:46 hostname openvpn[15768]: [snip]:[snip] TLS: soft reset sec=0 Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] VERIFY OK: depth=1, CN=[snip] Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] VERIFY KU OK Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] Validating certificate extended key usage Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] ++ Certificate has EKU (str) TLS Web Client Authentication, expects TLS Web Client Authentication Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] VERIFY EKU OK Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] VERIFY OK: depth=0, CN=[snip] Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] peer info: IV_VER=2.3.11 Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] peer info: IV_PLAT=linux Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] peer info: IV_PROTO=2 Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] Outgoing Data Channel: Cipher 'AES-256-CBC' initialized with 256 bit key Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] Outgoing Data Channel: Using 256 bit message hash 'SHA256' for HMAC authentication Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] Incoming Data Channel: Cipher 'AES-256-CBC' initialized with 256 bit key Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] Incoming Data Channel: Using 256 bit message hash 'SHA256' for HMAC authentication Nov 26 00:34:48 hostname openvpn[15768]: [snip]:[snip] Control Channel: TLSv1.2, cipher TLSv1.2 DHE-RSA-AES256-GCM-SHA384, 2048 bit RSA --- [snip] (i connect via ssh) --- Nov 26 00:35:25 hostname sshd[23061]: Connection from [snip] port [snip] on [snip] port [snip] Nov 26 00:35:25 hostname sshd[23061]: Postponed publickey for tjg from [snip] port [snip] ssh2 [preauth] Nov 26 00:35:26 hostname sshd[23061]: Accepted publickey for tjg from [snip] port [snip] ssh2: RSA SHA256:/[snip] Nov 26 00:35:26 hostname sshd[23061]: pam_unix(sshd:session): session opened for user tjg by (uid=0) Nov 26 00:35:26 hostname systemd-logind[604]: New session 214 of user tjg. Nov 26 00:35:26 hostname systemd[1]: Started Session 214 of user tjg. Nov 26 00:35:27 hostname sshd[23061]: User child is on pid 23141 Nov 26 00:35:27 hostname sshd[23141]: Starting session: shell on pts/1 for tjg from [snip] port [snip] id 0 -- [snip] (i try to restart cachefilesd) -- Nov 26 00:35:51 hostname sudo[23179]: tjg : TTY=pts/1 ; PWD=/home/tjg ; USER=root ; COMMAND=/bin/systemctl restart cachefilesd.service Nov 26 00:35:51 hostname sudo[23179]: pam_unix(sudo:session): session opened for user root by tjg(uid=0) Nov 26 00:35:51 hostname systemd[1]: Stopping LSB: CacheFiles daemon... Nov 26 00:35:51 hostname cachefilesd[23183]: * Stopping FilesCache daemon cachefilesd Nov 26 00:35:51 hostname cachefilesd[903]: Daemon Terminated Nov 26 00:35:51 hostname kernel: CacheFiles: File cache on vda1 unregistering Nov 26 00:35:51 hostname kernel: FS-Cache: Withdrawing cache "mycache" --- [crash] (no more logs after this point until hard reset 5 minutes later) --- To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1805062/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp