Package: nfs-kernel-server
Version: 1:2.8.2-1+b1
Severity: grave
Justification: causes non-serious data loss
X-Debbugs-Cc: invernom...@paranoici.org

Dear maintainers,

I encountered a big issue, while upgrading package 'nfs-kernel-server'
on the box where the NFS server runs (the clients run on the compute
nodes of an HPC cluster).

The upgrade:

  [UPGRADE] nfs-kernel-server:amd64 1:2.8.2-1 -> 1:2.8.2-1+b1

got stuck at

  [...]
  Setting up nfs-kernel-server (1:2.8.2-1+b1) ...

It looks like it was stuck at the restart of the systemd service:

  # systemctl status nfs-kernel-server.service
  ● nfs-server.service - NFS server and services
       Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled; prese>
      Drop-In: /run/systemd/generator/nfs-server.service.d
               └─order-with-mounts.conf
       Active: activating (start-pre) since Tue 2025-01-21 12:40:52 CET; 10min ago
          Job: 97667
   Invocation: ced460d410fe4059b9e8781b35340d70
         Docs: man:rpc.nfsd(8)
               man:exportfs(8)
    Cntrl PID: 249039 (exportfs)
        Tasks: 3 (limit: 154102)
       Memory: 680K (peak: 2.5M)
          CPU: 10ms
       CGroup: /system.slice/nfs-server.service
               ├─239857 /usr/sbin/nfsdctl threads 0
               ├─239918 /usr/sbin/exportfs -au
               └─249039 /usr/sbin/exportfs -r

There was a 'nfsdctl' process in uninterruptible sleep (D):

  $ ps -eldaf | grep nf[s]
  4 D root      239857       1  0  80   0 -   847 -      12:07 ?        00:00:00 /usr/sbin/nfsdctl threads 0
  5 S root      247511       1  0  80   0 -  1375 -      12:35 ?        00:00:00 /usr/sbin/nfsdcld

After about 30 min, since trying to kill PID 239857 obviously had
no effect, and I could not find any other strategy to restart
nfs-kernel-server.service, I had to reboot the box, thus causing
many problems to all the NFS clients.

After reboot, I could issue:

  # aptitude --purge-unused safe-upgrade

which finally completed the upgrade (fixing the nfs-kernel-server
package, which was left in a partially configured state).

I have never seen anything like this before, and I have upgraded
nfs-kernel-server and related packages on Debian machines for quite
a long time.
Anyway, this should *not* happen during a system upgrade with aptitude
or apt!

I don't know whether bug [#992661] is related or not.

[#992661]: <https://bugs.debian.org/992661>

By looking at /var/log/kern.log , I see that a kernel BUG was traced
at the time when the 'nfsdctl' process got stuck in D state.
See the attached kern.log snippet.

Please investigate and fix the issue as soon as possible.
I really hope we can prevent this from happening again!

Thanks for your time and dedication.


-- Package-specific info:
-- rpcinfo --
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100011    1   udp  64737  rquotad
    100011    2   udp  64737  rquotad
    100011    1   tcp  55614  rquotad
    100011    2   tcp  55614  rquotad
    100024    1   udp  41792  status
    100024    1   tcp  50467  status
    100005    1   udp  46127  mountd
    100005    1   tcp  39579  mountd
    100005    2   udp  49119  mountd
    100005    2   tcp  40039  mountd
    100005    3   udp  33530  mountd
    100005    3   tcp  55283  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100021    1   udp  38915  nlockmgr
    100021    3   udp  38915  nlockmgr
    100021    4   udp  38915  nlockmgr
    100021    1   tcp  33105  nlockmgr
    100021    3   tcp  33105  nlockmgr
    100021    4   tcp  33105  nlockmgr
-- /etc/default/nfs-kernel-server --
RPCNFSDPRIORITY=0
NEED_SVCGSSD=""
-- /etc/nfs.conf --
[general]
pipefs-directory=/run/rpc_pipefs
[nfsrahead]
[exports]
[exportfs]
[gssd]
[lockd]
[exportd]
[mountd]
manage-gids=y
[nfsdcld]
[nfsdcltrack]
[nfsd]
rdma=y
rdma-port=20049
[statd]
[sm-notify]
[svcgssd]
-- /etc/nfs.conf.d/*.conf --

-- System Information:
Debian Release: trixie/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.12.9-amd64 (SMP w/16 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages nfs-kernel-server depends on:
ii  keyutils                1.6.3-4
ii  libblkid1               2.40.4-1
ii  libc6                   2.40-5
ii  libcap2                 1:2.66-5+b1
ii  libevent-core-2.1-7t64  2.1.12-stable-10+b1
ii  libnl-3-200             3.7.0-0.3+b1
ii  libnl-genl-3-200        3.7.0-0.3+b1
ii  libreadline8t64         8.2-6
ii  libsqlite3-0            3.46.1-1
ii  libtirpc3t64            1.3.4+ds-1.3+b1
ii  libuuid1                2.40.4-1
ii  libwrap0                7.6.q-35
ii  libxml2                 2.12.7+dfsg+really2.9.14-0.2+b1
ii  netbase                 6.4
ii  nfs-common              1:2.8.2-1+b1
ii  ucf                     3.0048

Versions of packages nfs-kernel-server recommends:
ii  python3       3.12.8-1
ii  python3-yaml  6.0.2-1+b1

Versions of packages nfs-kernel-server suggests:
ii  procps  2:4.0.4-6

-- no debconf information

kern_log_snippet.log.gz
Description: application/gzip
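
For anyone trying to confirm the same symptom on another box, the check for processes stuck in uninterruptible sleep (as done above with 'ps -eldaf | grep nf[s]') could be generalized roughly as follows. This is only a sketch: the function names filter_d_state and list_d_state are made up for illustration and are not part of nfs-utils or procps, and the filter assumes that the process state is in the second column of the ps output.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any package).

# Keep the header line plus every row whose STAT column (column 2)
# starts with "D", i.e. uninterruptible sleep.
# Reads ps-style output on stdin.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# List D-state processes together with the kernel symbol they are
# waiting in (WCHAN), using procps "ps" with an explicit column list.
list_d_state() {
    ps -eo pid,stat,wchan:32,args | filter_d_state
}
```

Running list_d_state while the upgrade hangs should show the stuck 'nfsdctl threads 0' process if the situation is the same as in this report. On kernels that expose it, reading /proc/<PID>/stack as root may additionally show where in the kernel the process is blocked.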