latter 1-2 weeks of this cycle
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1832384
Title:
Unable to unmount apparently unused filesystem
Status in linux package in Ubuntu:
Incomplete
Bug description:
We periodically see an issue where unmounting a ZFS filesystem fails
with EBUSY, even though there appears to be no one using it.
# cat /proc/self/mounts | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive zfs rw,nosuid,nodev,noexec,relatime,xattr,noacl 0 0
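As an aside, grepping for the path also matches any line that merely contains it as a substring, such as a mount nested below it. A minimal sketch of an exact-match check on the mount point field follows. The function name 'is_mounted' and its optional second argument are illustrative assumptions, not something used in this report, and the sketch ignores the octal escaping that /proc/self/mounts applies to unusual characters such as spaces.

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds (exit status 0) only if MOUNTPOINT is exactly the second field
# of some line in a mounts-format file. Defaults to /proc/self/mounts.
# Hypothetical helper for illustration; not part of the original report.
is_mounted() {
    mountpoint_path=$1
    mounts_file=${2:-/proc/self/mounts}
    awk -v mp="$mountpoint_path" '
        $2 == mp { found = 1 }
        END { exit !found }
    ' "$mounts_file"
}
```

The exit status can be used directly in a condition, for example 'if is_mounted "$dir"; then echo "still mounted"; fi'.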
'lsof' and 'fuser' show no processes using any of the files in the
problematic filesystem:
# ls -l /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/
total 221
-rw-r----- 1 500 500 52736 May 22 11:01 1_19_1008904362.dbf
-rw-r----- 1 500 500 541696 May 22 11:03 1_20_1008904362.dbf
# fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/1_20_1008904362.dbf
# fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/1_19_1008904362.dbf
# fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/
# lsof | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
#
The filesystem was shared over NFS, but has since been unshared:
# showmount -e | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
#
Since no one appears to be using the filesystem, our expectation is
that it should be possible to unmount the filesystem. However,
attempts to unmount the filesystem fail with EBUSY:
# zfs destroy domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
umount: /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive: target is busy.
cannot unmount '/domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive': umount failed
# umount /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
umount: /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive: target is busy.
Using bpftrace, we can see that the unmount is failing in
'propagate_mount_busy()' in the kernel. Using a live kernel debugger, we can
look at the 'mount' struct for this particular mount and see that the
'mnt_count' refcount summed across all CPUs is 2. For filesystems that are
eligible for unmounting, the refcount is 1.
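The refcount comparison above can be illustrated with a toy model. 'mnt_count' is kept as per-CPU counters whose individual values can be negative (a reference may be taken on one CPU and dropped on another), so only the sum is meaningful. The sketch below simply sums a list of per-CPU values and compares the total against the idle value of 1 described above. The function names and sample values are made up for illustration; nothing here is read from a real kernel.

```shell
# sum_mnt_count VALUE...
# Add up per-CPU counter values passed as arguments and print the total.
# Toy model only: the values are supplied by hand, not read from the kernel.
sum_mnt_count() {
    total=0
    for per_cpu_value in "$@"; do
        total=$((total + per_cpu_value))
    done
    echo "$total"
}

# has_extra_reference VALUE...
# Succeeds if the summed count exceeds 1, the value this report observed
# for mounts that could be unmounted.
has_extra_reference() {
    [ "$(sum_mnt_count "$@")" -gt 1 ]
}
```

For example, 'sum_mnt_count 3 -2 0 1' prints 2, which in this model corresponds to the stuck mount, while per-CPU values of 2 -1 0 0 sum to 1 and correspond to a mount that can be unmounted.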
The only way to work around this issue that we have found is to
reboot, at which point the filesystem can be unmounted and destroyed.
So far, we have only been able to reproduce this using a workload driven by
our application. The application manages ZFS filesystems in groups, and the
lifecycle of each group looks something like this:
- Create and mount a group of filesystems, 1 parent and 4 children:
/domain0/group-38/oracle_db_container-202/oracle_timeflow-16370
/domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/datafile
/domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/external
/domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
/domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/temp
- Share all 5 filesystems over NFS
- A client mounts all 5 shares using NFSv3
- For a few hours, the client does NFS operations on the filesystems and
the server occasionally takes ZFS snapshots of them
- Unshare filesystems
- Unmount filesystems
- Delete filesystems
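The server-side steps of this lifecycle can be sketched roughly as follows. This is a dry-run outline under stated assumptions, not our application's actual code: the 'run' and 'group_lifecycle' helpers are hypothetical, the snapshot name is a placeholder, the hours of NFSv3 client activity are reduced to a single recursive snapshot, and dataset names are assumed to contain no whitespace. By default the commands are only printed; they are executed only if DRY_RUN=0 is set.

```shell
# run COMMAND...: print the command, or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# group_lifecycle PARENT: walk one group (1 parent, 4 children) through
# create, share, snapshot, unshare, unmount and destroy.
group_lifecycle() {
    parent=$1
    datasets="$parent $parent/datafile $parent/external $parent/archive $parent/temp"

    # Children have to be torn down before the parent, so reverse the order.
    teardown_order=""
    for dataset in $datasets; do
        teardown_order="$dataset $teardown_order"
    done

    # Create (which also mounts, given default properties) and share over NFS.
    for dataset in $datasets; do run zfs create "$dataset"; done
    for dataset in $datasets; do run zfs set sharenfs=on "$dataset"; done

    # Stand-in for hours of client I/O with occasional snapshots.
    run zfs snapshot -r "$parent@example-snapshot"

    for dataset in $teardown_order; do run zfs unshare "$dataset"; done
    for dataset in $teardown_order; do run zfs unmount "$dataset"; done
    # -r also removes the snapshots taken above.
    for dataset in $teardown_order; do run zfs destroy -r "$dataset"; done
}
```

Calling 'group_lifecycle tank/demo' (a made-up dataset name) prints the 26 zfs commands for one group without touching any pool.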
These groups of filesystems are constantly being created and
destroyed. At any given time, we have ~30k filesystems on the system,
about 5k of which are shared. On average, one out of every ~200-300k
unmounts fails with this EBUSY error. Creating and destroying that
many filesystems takes us about a week.
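Because the failure is this rare, a run of a given length may not hit it at all. As a rough sketch, assuming each unmount fails independently with the same probability p (an assumption made only for illustration; nothing in this report establishes it), the chance of at least one failure in N unmounts is 1 - (1 - p)^N. The function name below is hypothetical.

```shell
# reproduce_probability UNMOUNTS ONE_IN
# Print the probability of seeing at least one failure in UNMOUNTS attempts
# when each attempt fails independently with probability 1/ONE_IN.
reproduce_probability() {
    awk -v n="$1" -v one_in="$2" 'BEGIN {
        p = 1 / one_in
        # (1 - p)^n computed as exp(n * log(1 - p)) to stay accurate for tiny p
        printf "%.3f\n", 1 - exp(n * log(1 - p))
    }'
}
```

For example, 'reproduce_probability 250000 250000' prints 0.632: under the independence assumption, and taking one in 250k as a midpoint of the range above, a run of 250k unmounts would miss the failure roughly one time in three.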
Note that we are using ZFS built from https://github.com/delphix/zfs,
which is essentially master ZFS on Linux.
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-50-generic 4.15.0-50.54
ProcVersionSignature: Ubuntu 4.15.0-50.54-generic 4.15.18
Uname: Linux 4.15.0-50-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl icp
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 May 20 19:10 seq
crw-rw---- 1 root audio 116, 33 May 20 19:10 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Tue Jun 11 05:28:21 2019
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb: Error: [Errno 2] No such file or directory: 'lsusb': 'lsusb'
MachineType: VMware, Inc. VMware Virtual Platform
PciMultimedia:
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 svgadrmfb
ProcKernelCmdLine:
BOOT_IMAGE=/ROOT/username.QbVhgpM/root@/boot/vmlinuz-4.15.0-50-generic root=ZFS=rpool/ROOT/username.QbVhgpM/root ro console=tty0 console=ttyS0,38400n8 ipv6.disable=1 crashkernel=1024M-:512M
RelatedPackageVersions:
linux-restricted-modules-4.15.0-50-generic N/A
linux-backports-modules-4.15.0-50-generic N/A
linux-firmware 1.173.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
WifiSyslog:
dmi.bios.date: 09/21/2015
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: 6.00
dmi.board.name: 440BX Desktop Reference Platform
dmi.board.vendor: Intel Corporation
dmi.board.version: None
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 1
dmi.chassis.vendor: No Enclosure
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd09/21/2015:svnVMware,Inc.:pnVMwareVirtualPlatform:pvrNone:rvnIntelCorporation:rn440BXDesktopReferencePlatform:rvrNone:cvnNoEnclosure:ct1:cvrN/A:
dmi.product.name: VMware Virtual Platform
dmi.product.version: None
dmi.sys.vendor: VMware, Inc.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832384/+subscriptions