Quoting Eric W. Biederman ([email protected]):
> Adam Richter <[email protected]> writes:
>
> > On Linux 4.8-rc1 through 4.8-rc6 (latest rc), lxc fails to start
> > Ubuntu 16.04 and CentOS 7 containers [1], unless I first run
> > "cgmanager -m name=systemd &" on the host, which, unlike the
> > containers, was not running systemd or cgmanager.
>
> Yes, that appears correct. Given the current flat namespace of
> hierarchies you fundamentally must coordinate with the host if you want
> to use a new hierarchy. So running cgmanager on the host seems like
> the minimum way to do that.
>
> If we truly need something more (which does not appear to be the case
> here) the names of hierarchies need to be moved into a namespace.
>
> > Git bisect revealed that this behavior began with a commit entitled
> > "cgroupns: Only allow creation of hierarchies in the initial cgroup
> > namespace" [2], which appears to be an attempt to protect against a
> > possible denial of service attack. Reversing the commit also restores
> > successful startup without the need to run that cgmanager process.
> > [Eric and Tejun, I have bcc'ed you so you can be aware of this
> > discussion thread, as you apparently respectively wrote and approved
> > the commit.]
>
> As far as I can tell you were getting lucky and not having problems
> before.
>
> > Running that cgmanager invocation is pretty simple, and seems to me to
> > be well worth closing a denial of service vulnerability, much as I
> > dislike adding something systemd-specific to a non-systemd environment
> > and adding a new dependency (lxc requires cgmanager on the host to
> > run, I guess, any container that runs systemd). However, I am posting
> > this message because I don't fully understand the problem, and, most
> > importantly, I am wondering if I have stumbled on an unintended
> > consequence of this commit that might indicate other potential
> > breakage.
>
> I am surprised that your case worked but I don't think it amounts to an
> unintended consequence.
>
> > If this new lxc behavior is completely acceptable, then I apologize
> > for consuming people's time with it and hope that this message will
> > allow others experiencing the same problem to find an answer for it
> > when they search the web.
>
> I will let the lxc-developers judge.
>
> I don't think you hit a case that was expected to work. Furthermore

fwiw indeed this was never expected to work.

> either your containers were overprivileged or they would not have been
> able to create subdirectories in the cgroup hierarchy. So I expect this
> change transformed a subtle breakage (aka one you had not noticed yet)
> into an explicit breakage.
>
> I am not subscribed to lxc-users so I don't know if anyone else has
> replied to your post. Cc's would have been better than Bcc's for
> getting feedback in a situation like this.
>
> Eric
>
> >
> > Adam Richter
> >
> >
> > [1] Here is an example of failing to start one of these containers.
> > $ sudo lxc-start --name ubuntu16.04_amd64 --foreground
> > Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
> > [!!!!!!] Failed to mount API filesystems, freezing.
> > Freezing execution.
> >
> >
> > [2] Here is the commit diff that triggers the new misbehavior.
> > commit 726a4994b05ff5b6f83d64b5b43c3251217366ce
> > Author: Eric W. Biederman <[email protected]>
> > Date:   Fri Jul 15 06:36:44 2016 -0500
> >
> >     cgroupns: Only allow creation of hierarchies in the initial cgroup namespace
> >
> >     Unprivileged users can't use hierarchies if they create them as they do not
> >     have privilieges to the root directory.
> >
> >     Which means the only thing a hiearchy created by an unprivileged user
> >     is good for is expanding the number of cgroup links in every css_set,
> >     which is a DOS attack.
> >
> >     We could allow hierarchies to be created in namespaces in the initial
> >     user namespace. Unfortunately there is only a single namespace for
> >     the names of heirarchies, so that is likely to create more confusion
> >     than not.
> >
> >     So do the simple thing and restrict hiearchy creation to the initial
> >     cgroup namespace.
> >
> >     Cc: [email protected]
> >     Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
> >     Signed-off-by: "Eric W. Biederman" <[email protected]>
> >     Signed-off-by: Tejun Heo <[email protected]>
> >
> > diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> > index e75efa8..e0be49f 100644
> > --- a/kernel/cgroup.c
> > +++ b/kernel/cgroup.c
> > @@ -2215,12 +2215,8 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
> >  		goto out_unlock;
> >  	}
> >
> > -	/*
> > -	 * We know this subsystem has not yet been bound. Users in a non-init
> > -	 * user namespace may only mount hierarchies with no bound subsystems,
> > -	 * i.e. 'none,name=user1'
> > -	 */
> > -	if (!opts.none && !capable(CAP_SYS_ADMIN)) {
> > +	/* Hierarchies may only be created in the initial cgroup namespace. */
> > +	if (ns != &init_cgroup_ns) {
> >  		ret = -EPERM;
> >  		goto out_unlock;
> >  	}
_______________________________________________
lxc-users mailing list
[email protected]
http://lists.linuxcontainers.org/listinfo/lxc-users
