The root cause of this failure is that a wrong mount ID is reported for
file mappings.

Steps to reproduce:

root@ubuntu-s-4vcpu-8gb-nyc1-01:~# uname -a
Linux ubuntu-s-4vcpu-8gb-nyc1-01 5.3.0-26-generic #28-Ubuntu SMP Wed Dec 18 05:37:46 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

root@ubuntu-s-4vcpu-8gb-nyc1-01:~# docker run -it --rm --privileged busybox 
/ # ls -l /proc/1/map_files/
total 0
lr--------    1 root     root            64 Jan  7 18:59 400000-401000 -> /bin/sh
lr--------    1 root     root            64 Jan  7 19:00 401000-4dd000 -> /bin/sh
lr--------    1 root     root            64 Jan  7 19:00 4dd000-514000 -> /bin/sh
lr--------    1 root     root            64 Jan  7 19:00 514000-516000 -> /bin/sh

/ # exec 50</proc/1/map_files/400000-401000
/ # cat /proc/self/fdinfo/50 
pos:    0
flags:  0100000
mnt_id: 551
/ # cat /proc/self/mountinfo | grep 551

We can see that the mount 551 isn't listed in container mounts.
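
The manual check above (read mnt_id from fdinfo, then look for it in
mountinfo) can be sketched in Python as follows. This is only an
illustration, not how CRIU does it; the function names are made up for
this example, and the functions work on text that has already been read
from /proc/self/fdinfo/<fd> and /proc/self/mountinfo.

```python
def parse_mnt_id(fdinfo_text):
    """Return the integer after 'mnt_id:' in fdinfo text, or None if absent."""
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "mnt_id":
            return int(value.strip())
    return None


def mount_ids(mountinfo_text):
    """Collect the first field (the mount ID) of every mountinfo line."""
    ids = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if fields:
            ids.add(int(fields[0]))
    return ids


def fd_mount_is_visible(fdinfo_text, mountinfo_text):
    """True if the fd's mnt_id appears among the mounts listed in mountinfo."""
    mnt_id = parse_mnt_id(fdinfo_text)
    return mnt_id is not None and mnt_id in mount_ids(mountinfo_text)
```

With the fdinfo output shown above (mnt_id 551) and a mountinfo that does
not contain mount 551, fd_mountis_visible-style checking as sketched here
returns False from fd_mount_is_visible(), which matches the empty grep
result.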

If we open /bin/sh directly, we see the mount ID of the container root
mount.

/ # exec 50</bin/sh
/ # cat /proc/self/fdinfo/50 
pos:    0
flags:  0100000
mnt_id: 607
/ # cat /proc/self/mountinfo | grep '^607'
607 567 0:51 / / rw,relatime master:308 - overlay overlay rw,lowerdir=/var/lib/docker/overlay2/l/DCAEKRDYRDTVUIECWWPMTFAKAO:/var/lib/docker/overlay2/l/BEYAU2IKCGHGS5UYC7C6Q6HIHG,upperdir=/var/lib/docker/overlay2/1c92eec684804fbc8642a9a4698a0099c9ff5c39915289e1fcd1b39493558c65/diff,workdir=/var/lib/docker/overlay2/1c92eec684804fbc8642a9a4698a0099c9ff5c39915289e1fcd1b39493558c65/work,xino=off
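
For reference, a mountinfo line like the one above can be split into its
fields as documented in proc(5): mount ID, parent ID, major:minor, root,
mount point, mount options, zero or more optional fields, a "-"
separator, then filesystem type, mount source and super options. The
sketch below is a minimal illustration; MountEntry and
parse_mountinfo_line are hypothetical names, and it assumes paths without
embedded whitespace (the kernel escapes spaces as \040).

```python
from typing import NamedTuple, Tuple


class MountEntry(NamedTuple):
    mount_id: int
    parent_id: int
    device: str                       # "major:minor"
    root: str
    mount_point: str
    mount_options: str
    optional_fields: Tuple[str, ...]  # e.g. ("master:308",)
    fs_type: str
    source: str
    super_options: str


def parse_mountinfo_line(line):
    """Split one /proc/<pid>/mountinfo line into a MountEntry."""
    fields = line.split()
    # The "-" separator ends the variable-length optional fields, which
    # start at index 6, so search for it from there.
    sep = fields.index("-", 6)
    super_options = fields[sep + 3] if len(fields) > sep + 3 else ""
    return MountEntry(
        mount_id=int(fields[0]),
        parent_id=int(fields[1]),
        device=fields[2],
        root=fields[3],
        mount_point=fields[4],
        mount_options=fields[5],
        optional_fields=tuple(fields[6:sep]),
        fs_type=fields[sep + 1],
        source=fields[sep + 2],
        super_options=super_options,
    )
```

Applied to the line above, this gives mount_id 607, parent_id 567,
mount_point "/" and fs_type "overlay".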

-- 
https://bugs.launchpad.net/bugs/1857257

Title:
  linux-image-5.0.0-35-generic breaks checkpointing of container

Status in linux package in Ubuntu:
  New

Bug description:
  Trying to checkpoint a container (docker/podman) on 18.04 fails
  starting with linux-image-5.0.0-35-generic. We (CRIU upstream) have
  been seeing this in Travis for a few weeks. Manual local testing
  shows that linux-image-5.0.0-32-generic still works and
  linux-image-5.0.0-35-generic no longer works. It seems to be
  overlayfs related, at least that is what we believe. The CRIU error
  message we see is:

  (00.170944) Error (criu/files-reg.c:1277): Can't lookup mount=410 for fd=-3 path=/bin/busybox
  (00.170987) Error (criu/cr-dump.c:1246): Collect mappings (pid: 1637) failed with -1

  
  We have seen this not only in Travis; multiple CRIU users have also
  reported this bug already. Currently we have to tell them to downgrade
  the kernel.

  I am also able to reproduce it with linux-image-5.3.0-24-generic. The
  4.18.0 kernel series does not show this error; 4.18.0-25-generic
  works without problems.

  See also https://github.com/checkpoint-restore/criu/issues/860

  One possible explanation from our side:

  "Looks like we have the same as for st_dev now with mnt_id, that is
  bad, because we can't find on which mount to open the file if kernel
  hides these information from us."

  Running on the upstream 5.5.0-rc1 kernel does not show this error.
