I haven't found any obvious source in aufs for the extra fput() calls
which I suspect are causing this problem. If you could either give me
more information so I can run the same tests myself (I'm guessing the
problem isn't arch-specific), or else reproduce the problem with the
kernel patched with something like the following, perhaps we can catch
it in the act. However, the warning might also fire on a legitimate
fput(), since the erroneous ones could occur while the refcount is
still positive ...
diff --git a/fs/file_table.c b/fs/file_table.c
index df66450fb443..d4911a6e8331 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -264,7 +264,9 @@ static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
 void fput(struct file *file)
 {
-	if (atomic_long_dec_and_test(&file->f_count)) {
+	long cnt = atomic_long_dec_return(&file->f_count);
+	WARN_ON(cnt < 0);
+	if (cnt == 0) {
 		struct task_struct *task = current;
 		if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
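
To illustrate the caveat about catching legitimate fputs, here is a
minimal userspace sketch of the patched decrement logic. It is not
kernel code: struct fake_file, fake_file_init() and fake_fput() are
hypothetical names, and C11 atomic_fetch_sub() stands in for
atomic_long_dec_return().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct file; only the refcount is modelled. */
struct fake_file {
    atomic_long f_count;
    bool released;   /* set when the count reaches exactly zero */
    int underflows;  /* how many times the WARN_ON-style check fired */
};

/* Start with a positive reference count, as a freshly opened file would. */
static void fake_file_init(struct fake_file *f, long count)
{
    assert(count > 0);
    atomic_init(&f->f_count, count);
    f->released = false;
    f->underflows = 0;
}

/* Same shape as the patched fput(): decrement, warn below zero, release at zero. */
static long fake_fput(struct fake_file *f)
{
    /* atomic_fetch_sub() returns the old value, so subtract 1 to get the new one. */
    long cnt = atomic_fetch_sub(&f->f_count, 1) - 1;

    if (cnt < 0) {
        f->underflows++;
        fprintf(stderr, "WARN: f_count went negative (%ld)\n", cnt);
    }
    if (cnt == 0)
        f->released = true;
    return cnt;
}
```

With two legitimate references and one erroneous extra put issued
first, the three calls return 1, 0 and -1. The erroneous put is silent,
the file is released while one holder still uses it, and the warning
only fires on that holder's legitimate final put.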
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1650062
Title:
Ubuntu16.04.01VM:Docker-Powervm aufs bad file panic while running
tests in a docker container
Status in linux package in Ubuntu:
New
Bug description:
== Comment: #0 - Vinutha GS - 2016-12-13 02:47:35 ==
When some of the base and IO tests were run inside a docker container, the
LPAR crashed. The stack trace and other details are below.
Steps to re-create -
1. Installed 16.04.02 on a PowerVM LPAR.
2. Ran setup general.
3. Ran docker scripts [home-grown scripts] which do the docker package
installation and other setup required to run STAF cases inside a docker
container.
4. We have a docker image from which we launch containers and start tests
inside the containers.
5. STAF Base and IO tests were started inside containers successfully; after
some time, I saw that the partition was in XMON.
If complete details are required on how to execute the scripts, please let me
know.
Docker info -
docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 1.12.1
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 0
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: null host bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 4.4.0-53-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: ppc64le
CPUs: 24
Total Memory: 49.89 GiB
Name: bamlp3
ID: I7VI:G4RJ:RHTQ:WNGV:52FK:K7AZ:YDJQ:KFUM:P3UA:MZ3I:5XUY:WV3N
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
docker ps -a
CONTAINER ID  IMAGE         COMMAND                 CREATED         STATUS         PORTS  NAMES
61f2b8ab0a86  32d545c3ea01  "/bin/sh -c ./staf_io"  24 minutes ago  Up 24 minutes         bamlp3-io
151da0322172  590e44f15214  "/bin/sh -c ./staf_ba"  30 minutes ago  Up 30 minutes         bamlp3-base
Stack trace -
8:mon> t
[c000000a5e147d10] d00000000a04ca98 aufs_flush_nondir+0x38/0x50 [aufs]
[c000000a5e147d40] c0000000002e0428 filp_close+0x68/0xe0
[c000000a5e147dc0] c00000000030f71c __close_fd+0xcc/0x150
[c000000a5e147e00] c0000000002e04d4 SyS_close+0x34/0x90
[c000000a5e147e30] c000000000009204 system_call+0x38/0xb4
--- Exception: c00 (System Call) at 00003fff8bc217d8
SP (3fffd85203b0) is in userspace
8:mon> e
cpu 0x8: Vector: 300 (Data Access) at [c000000a5e147a40]
pc: d00000000a04bdd4: au_do_flush+0x44/0x220 [aufs]
lr: d00000000a04ca98: aufs_flush_nondir+0x38/0x50 [aufs]
sp: c000000a5e147cc0
msr: 8000000000009033
dar: 28
dsisr: 40000000
current = 0xc000000a8b7fc8e0
paca = 0xc00000000fb44c00 softe: 0 irq_happened: 0x01
pid = 11936, comm = remap_file_page
8:mon>
Release details -
uname -r
4.4.0-53-generic
uname -a
Linux bamlp4 4.4.0-53-generic #74-Ubuntu SMP Fri Dec 2 15:59:36 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
== Comment: #6 - Vinutha GS - 2016-12-14 03:16:18 ==
Please find the attached sosreport.
Also, I have followed the steps for kdump; it is now enabled.
I'm going to start the tests once again.
== Comment: #12 - Kevin W. Rudd - 2016-12-14 16:06:46 ==
The basic reason for the panic is that close was called on a file
that was no longer valid. The f_count value was -8 for some reason,
which is non-zero, so it passed the following check in filp_close():
	if (!file_count(filp)) {
		printk(KERN_ERR "VFS: Close: file count is 0\n");
		return 0;
	}
It then blew up in au_do_flush() because f_inode was NULL.
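
As a rough illustration of why a count of -8 gets through, the sketch
below models the guard in userspace. struct file_model and
guard_rejects() are hypothetical names used only for this example; the
real check reads f_count through file_count().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the two fields discussed above. */
struct file_model {
    long f_count;
    const void *f_inode;
};

/* Same shape as the quoted guard: only an exact zero is rejected. */
static bool guard_rejects(const struct file_model *filp)
{
    if (!filp->f_count) {
        fprintf(stderr, "VFS: Close: file count is 0\n");
        return true;
    }
    return false;
}
```

A model with f_count == -8 and f_inode == NULL is not rejected, so the
close path would continue on to the flush with a NULL inode. Only
f_count == 0 is caught.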
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1650062/+subscriptions