On Wed, 10.02.16 15:58, Daniel J Walsh ([email protected]) wrote:

> >>>> sed -i 's/^enable/disable/g' /lib/systemd/system-preset/*
> >>> Why would this matter?
> >> We don't want excess services running inside of a docker container. I
> >> only want systemd/journald and any services
> >> that I enable in the container. Not something pulled in because the
> >> installer thinks this is a VM or a Host OS.
> > Well, the default preset policy in Fedora is to disable everything by
> > default, modulo a few exceptions. Hence it should be unnecessary to
> > change anything with the default preset policy, unless you actually
> > want to *enable* rather than disable more by default...
>
> Here is what I see enabled in the base container. I don't think we
> want any of this stuff running by default in a docker container.
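
For readers following along, here is a minimal sketch of what the quoted sed command does to preset files. It runs against a scratch directory rather than /lib/systemd/system-preset, the file name and the three preset lines are made up for illustration, and it assumes GNU sed (for -i without a suffix argument).

```shell
# Scratch directory standing in for /lib/systemd/system-preset.
# The file name and its contents are hypothetical examples.
preset_dir=$(mktemp -d)
cat > "$preset_dir/90-default.preset" <<'EOF'
enable sshd.service
enable cups.*
disable *
EOF

# Same substitution as in the quoted command, applied to the scratch copy only.
sed -i 's/^enable/disable/g' "$preset_dir"/*

# Show the rewritten preset file.
result=$(cat "$preset_dir/90-default.preset")
printf '%s\n' "$result"
```

After the substitution no "enable" lines remain, so preset-driven enablement would turn everything off, including units that a user installs explicitly later.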

[…]

Well, but pretty much all the units you listed here are units from
RPMs you wouldn't install in a container anyway, aren't they? Thus,
they shouldn't matter anyway, and I'd argue they should be enabled by
default in a container too, if they are installed explicitly by the
user, through RPM.

Hence, I think patching the preset stuff is not necessary at all.

> > I don't see why one would want to mask systemd-logind.service. If you
> > permit logins and PAM at all, you really need that.
>
> If I wanted to add a login program I could enable/unmask these.
> No one runs docker containers as login services, that would require
> getty.

Well, "machinectl shell", "cron" and all those things do PAM... In
fact, "machinectl shell" going through PAM and registering with logind
through that is one of the major benefits over naked "nsenter".

I can see that you don't want to run it by default, but maybe we can
rearrange things so that logind is started on first use (i.e. on the
first PAM conversation). That way logind would normally not run in a
container, until it is actually requested by a PAM conversation. We
could even add exit-on-idle so that it goes away after a while when the
user logs out again. That way logind could stay available but would
normally not appear in "ps" unless it is actually used.

I added this to the TODO list now.

> > And masking the getty stuff appears to be entirely unnecessary...
> Again the goal is just to get rid of the getty failure message at
> bootup.

But there should really be none with current systemd, as you don't
have /dev/tty0 and the getty unit has
ConditionPathExists=/dev/tty0. What precisely does the getty message
that you get look like?

> > Which leaves the /dev/hugepages and /sys/fs/fuse/connections
> > mounts. Not sure about those. Are you running the container with
> > CAP_SYS_ADMIN? If so, then there's no reason to mask those units.
> > If not, then I figure we could add checks that these are conditioned
> > out if CAP_SYS_ADMIN is missing.
>
> No, docker containers do not enable SYS_ADMIN or NET_ADMIN by
> default.

I'll add a ConditionCapability=CAP_SYS_ADMIN line to the fuse
mount. The hugepages mount already has one (since 218). With that
addition there should really be no reason to mask out either of the
units explicitly; systemd should already silently skip them in a docker
setup where CAP_SYS_ADMIN is missing.

> > On nspawn these two aren't seen since nspawn actually doesn't mount
> > the real sysfs to /sys, but just a tmpfs with a select number of
> > subdirectories from the real sysfs for security reasons. One of the
> > subdirs that are suppressed is /sys/fs. Now,
> > sys-fs-fuse-connections.mount is conditionalized on
> > /sys/fs/fuse/connections existing, hence if it is not there, then it
> > won't be mounted. And /dev/hugepages we simply allow to be mounted in
> > the container.
>
> Interesting idea. Maybe we should just mount over /sys/fs also.

Well, note that we over-mount /sys with a tmpfs, and then mount some
parts of the real /sys into that. /sys/fs hence is just a subdir of our
private tmpfs. The tmpfs is marked r/o after everything is set up.

> Do you just mount hugepages then during container setup?

No. In nspawn, when we pass CAP_SYS_ADMIN to the container, the
container will just mount /dev/hugepages correctly on its own. And if
we do drop CAP_SYS_ADMIN, then the ConditionCapability=CAP_SYS_ADMIN in
the unit file mentioned above will result in the mount being skipped
silently already.

Lennart

-- 
Lennart Poettering, Red Hat
_______________________________________________
systemd-devel mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
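
As an illustration of the ConditionCapability= approach discussed above, the following sketch writes a local drop-in for sys-fs-fuse-connections.mount instead of masking the unit. This is an assumption about how one might approximate the change on a systemd version that does not carry it yet, not the upstream patch itself. The staging directory and the file name 10-capability.conf are hypothetical; nothing on the live system is touched.

```shell
# Staging prefix standing in for / (hypothetical, so the sketch is harmless to run).
root=$(mktemp -d)
dropin_dir="$root/etc/systemd/system/sys-fs-fuse-connections.mount.d"
mkdir -p "$dropin_dir"

# 10-capability.conf is an arbitrary file name chosen for this sketch.
# The condition makes systemd skip the mount when CAP_SYS_ADMIN is missing.
cat > "$dropin_dir/10-capability.conf" <<'EOF'
[Unit]
ConditionCapability=CAP_SYS_ADMIN
EOF

# Show the resulting drop-in.
cat "$dropin_dir/10-capability.conf"
```

On a real image the drop-in would live under /etc/systemd/system/ and take effect after "systemctl daemon-reload" or at the next boot. A unit skipped because of a failed condition is not treated as a failure, which matches the "silently skip" behaviour described above.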
