On 02/10/2016 01:14 PM, Lennart Poettering wrote:
> On Wed, 10.02.16 11:36, Daniel J Walsh ([email protected]) wrote:
>
>>>> systemctl mask systemd-firstboot initrd-udevadm-cleanup-db.service
>>>> systemd-udev-settle.service systemd-udev-trigger.service
>>>> systemd-udevd.service systemd-udevd-control.socket
>>>> systemd-udevd-kernel.socket; \
>>> The systemd-firstboot service should have no effect unless you
>>> actually boot with an empty /etc (or more accurately: unless you
>>> actually boot with an /etc that lacks /etc/machine-id). Note that it
>>> carries a condition ConditionFirstBoot=yes which makes sure that it
>>> isn't even executed in normal cases.
>> I see in the logs systemd complaining about no systemd-firstboot
>> command.
> Well, what have you installed in the container? Is the
> systemd-firstboot binary there? If not, why not? If this has been
> split out of the core package, then the service unit for it should
> have been split out too, hence there shouldn't be any error about this.
>
>>> Masking all the udev stuff is pretty pointless too. These services are
>>> conditioned out in containers too anyway. There's really no need to
>>> mask them out. More specifically, they contain
>>> ConditionPathIsReadWrite=/sys, i.e. are skipped if /sys is read-only,
>>> which is how container managers should set up /sys (it's a big
>>> security hole to allow containers write access to /sys). My
>>> recommendation would be to make sure your container manager implements
>>> these recommendations:
>> I am just seeing mentions of udev inside of the container. What I don't
>> want is messages
>> inside of the journal or bootup that look like systemd is trying to run
>> firstboot, udev etc.
> Sure, that's precisely what the ConditionXYZ= constructs are for: to
> skip stuff silently that is not necessary in some cases.
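For anyone who wants to check from inside a container which of these conditions would apply, here is a rough shell sketch. It only approximates the first-boot case (via the absence of /etc/machine-id, as described above) and ConditionPathIsReadWrite=/sys (via a plain writability test). The function names and the root-directory argument are made up for illustration, and systemd's own checks are more involved than this.

```shell
#!/bin/sh
# Illustrative only: rough approximations of two unit conditions,
# not systemd's actual implementation. All function names are hypothetical.

# Approximates the first-boot case: true when <root>/etc/machine-id is absent.
looks_like_first_boot() {
    [ ! -e "$1/etc/machine-id" ]
}

# Approximates ConditionPathIsReadWrite=<path>: true when the path exists
# and the current user may write to it. This also depends on file
# permissions, not only on whether the mount is read-only.
path_is_read_write() {
    [ -e "$1" ] && [ -w "$1" ]
}

# Print what the two checks suggest for a given root directory
# (an empty string means the running system).
report_conditions() {
    root=$1
    if looks_like_first_boot "$root"; then
        echo "firstboot: may run (no $root/etc/machine-id)"
    else
        echo "firstboot: skipped (machine-id present)"
    fi
    if path_is_read_write "$root/sys"; then
        echo "udev: may run ($root/sys is writable)"
    else
        echo "udev: skipped ($root/sys is missing or not writable)"
    fi
}

report_conditions ""
```

Passing a directory instead of the empty string (for example an unpacked image root) runs the same checks against that tree rather than the running system.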
>
> And by default systemd comes with all the conditions in place so
> that a vanilla systemd image should work fine in any container
> manager that implements the container interface.
>
>>> https://wiki.freedesktop.org/www/Software/systemd/ContainerInterface/
>>>
>>> If your container manager follows these guidelines (of which the /sys
>>> being read-only thing is one), then there should be no special hacks
>>> necessary in systemd, as it should just work, and detect the slight
>>> semantic changes of containers correctly and avoid them cleanly.
>>>
>>>> rm -f /lib/systemd/system/multi-user.target.wants/systemd*
>>>> /lib/systemd/system/multi-user.target.wants/getty*;\
>>> What's the rationale for this? First of all, the getty stuff appears
>>> entirely unnecessary as [email protected] (which is the only thing
>>> generally linked from getty.target these days) contains
>>> ConditionPathExists=/dev/tty0 which means it's already skipped when
>>> run on systems lacking a VC (such as containers).
>> Again, I am seeing getty@ failures inside of the container.
> That would suggest that there's a /dev/tty0 in the container? That
> looks really wrong... A container has no virtual console hence there
> should be no /dev/tty0.
> On Linux /dev/tty0 is a special device node that is part of the
> kernel's VC subsystem, and points to the VC currently in the
> foreground. It has no place in virtualized systems such as containers.
> What is docker mounting as /dev into the container? Does it just bind
> mount the host /dev? That's really nasty, as that will expose host
> devices and device node ownership to the containers. They really
> shouldn't do that and instead mount their own tmpfs to /dev and just
> create the device nodes for /dev/null, /dev/random and so on, but
> nothing else.

They don't; they create their own /dev inside the container with locked-down devices.
ls -lZ /dev/
total 0
crw-------. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 136, 3 Feb 10 20:42 console
lrwxrwxrwx. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 13 Feb 10 20:42 fd -> /proc/self/fd
crw-rw-rw-. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 1, 7 Feb 10 20:42 full
c---------. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 10, 229 Feb 10 20:42 fuse
lrwxrwxrwx. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 11 Feb 10 20:42 kcore -> /proc/kcore
drwxrwxrwt. 2 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 40 Feb 10 20:42 mqueue
crw-rw-rw-. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 1, 3 Feb 10 20:42 null
lrwxrwxrwx. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 8 Feb 10 20:42 ptmx -> pts/ptmx
drwxr-xr-x. 2 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 0 Feb 10 20:42 pts
crw-rw-rw-. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 1, 8 Feb 10 20:42 random
drwxrwxrwt. 2 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 40 Feb 10 20:42 shm
lrwxrwxrwx. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 15 Feb 10 20:42 stderr -> /proc/self/fd/2
lrwxrwxrwx. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 15 Feb 10 20:42 stdin -> /proc/self/fd/0
lrwxrwxrwx. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 15 Feb 10 20:42 stdout -> /proc/self/fd/1
crw-rw-rw-. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 5, 0 Feb 10 20:42 tty
crw-rw-rw-. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 1, 9 Feb 10 20:42 urandom
crw-rw-rw-. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c15,c706 1, 5 Feb 10 20:42 zero

This is what a standard /dev looks like in a container.

>>> And the other services you are removing here: what's the point? They
>>> aren't really optional, that's why they are linked from /usr/lib,
>>> rather than subject to systemctl enable/disable...
>>>
>>>> sed -i 's/^enable/disable/g' /lib/systemd/system-preset/*
>>> Why would this matter?
>> We don't want excess services running inside of a docker container. I
>> only want systemd/journald and any services
>> that I enable in the container. Not something pulled in because the
>> installer thinks this is a VM or a Host OS.
> Well, the default preset policy in Fedora is to disable everything by
> default, modulo a few exceptions. Hence it should be unnecessary to
> change anything with the default preset policy, unless you actually
> want to *enable* rather than disable more by default...

Here is what I see enabled in the base container. I don't think we want any of this stuff running by default in a docker container.

grep ^enable /lib/systemd/system-preset/*
/lib/systemd/system-preset/85-display-manager.preset:enable gdm.service
/lib/systemd/system-preset/85-display-manager.preset:enable lightdm.service
/lib/systemd/system-preset/85-display-manager.preset:enable slim.service
/lib/systemd/system-preset/85-display-manager.preset:enable lxdm.service
/lib/systemd/system-preset/85-display-manager.preset:enable sddm.service
/lib/systemd/system-preset/85-display-manager.preset:enable kdm.service
/lib/systemd/system-preset/85-display-manager.preset:enable xdm.service
/lib/systemd/system-preset/90-default.preset:enable sshd.service
/lib/systemd/system-preset/90-default.preset:enable atd.*
/lib/systemd/system-preset/90-default.preset:enable crond.*
/lib/systemd/system-preset/90-default.preset:enable chronyd.service
/lib/systemd/system-preset/90-default.preset:enable NetworkManager.service
/lib/systemd/system-preset/90-default.preset:enable NetworkManager-dispatcher.service
/lib/systemd/system-preset/90-default.preset:enable ModemManager.service
/lib/systemd/system-preset/90-default.preset:enable auditd.service
/lib/systemd/system-preset/90-default.preset:enable restorecond.service
/lib/systemd/system-preset/90-default.preset:enable bluetooth.*
/lib/systemd/system-preset/90-default.preset:enable avahi-daemon.*
/lib/systemd/system-preset/90-default.preset:enable cups.*
/lib/systemd/system-preset/90-default.preset:enable rsyslog.*
/lib/systemd/system-preset/90-default.preset:enable syslog-ng.*
/lib/systemd/system-preset/90-default.preset:enable sysklogd.*
/lib/systemd/system-preset/90-default.preset:enable firewalld.service
/lib/systemd/system-preset/90-default.preset:enable libvirtd.service
/lib/systemd/system-preset/90-default.preset:enable xinetd.service
/lib/systemd/system-preset/90-default.preset:enable multipathd.service
/lib/systemd/system-preset/90-default.preset:enable libstoragemgmt.service
/lib/systemd/system-preset/90-default.preset:enable lvm2-monitor.*
/lib/systemd/system-preset/90-default.preset:enable lvm2-lvmetad.*
/lib/systemd/system-preset/90-default.preset:enable dm-event.*
/lib/systemd/system-preset/90-default.preset:enable dmraid-activation.service
/lib/systemd/system-preset/90-default.preset:enable mdmonitor.service
/lib/systemd/system-preset/90-default.preset:enable mdmonitor-takeover.service
/lib/systemd/system-preset/90-default.preset:enable spice-vdagentd.service
/lib/systemd/system-preset/90-default.preset:enable qemu-guest-agent.service
/lib/systemd/system-preset/90-default.preset:enable dnf-makecache.timer
/lib/systemd/system-preset/90-default.preset:enable vmtoolsd.service
/lib/systemd/system-preset/90-default.preset:enable dkms.service
/lib/systemd/system-preset/90-default.preset:enable ipmi.service
/lib/systemd/system-preset/90-default.preset:enable ipmievd.service
/lib/systemd/system-preset/90-default.preset:enable x509watch.timer
/lib/systemd/system-preset/90-default.preset:enable dnssec-triggerd.service
/lib/systemd/system-preset/90-default.preset:enable uuidd.socket
/lib/systemd/system-preset/90-default.preset:enable gpm.*
/lib/systemd/system-preset/90-default.preset:enable gpsd.socket
/lib/systemd/system-preset/90-default.preset:enable x2gocleansessions.service
/lib/systemd/system-preset/90-default.preset:enable unbound-anchor.timer
/lib/systemd/system-preset/90-default.preset:enable lvm2-lvmpolld.*
/lib/systemd/system-preset/90-default.preset:enable dbxtool.service
/lib/systemd/system-preset/90-default.preset:enable irqbalance.service
/lib/systemd/system-preset/90-default.preset:enable lm_sensors.service
/lib/systemd/system-preset/90-default.preset:enable mcelog.*
/lib/systemd/system-preset/90-default.preset:enable smartd.service
/lib/systemd/system-preset/90-default.preset:enable pcscd.socket
/lib/systemd/system-preset/90-default.preset:enable rngd.service
/lib/systemd/system-preset/90-default.preset:enable abrtd.service
/lib/systemd/system-preset/90-default.preset:enable abrt-ccpp.service
/lib/systemd/system-preset/90-default.preset:enable abrt-oops.service
/lib/systemd/system-preset/90-default.preset:enable abrt-xorg.service
/lib/systemd/system-preset/90-default.preset:enable abrt-vmcore.service
/lib/systemd/system-preset/90-default.preset:enable ksm.service
/lib/systemd/system-preset/90-default.preset:enable ksmtuned.service
/lib/systemd/system-preset/90-default.preset:enable rootfs-resize.service
/lib/systemd/system-preset/90-default.preset:enable sysstat.service
/lib/systemd/system-preset/90-default.preset:enable sysstat-collect.timer
/lib/systemd/system-preset/90-default.preset:enable sysstat-summary.timer
/lib/systemd/system-preset/90-default.preset:enable uuidd.service
/lib/systemd/system-preset/90-default.preset:enable xendomains.service
/lib/systemd/system-preset/90-default.preset:enable xenstored.service
/lib/systemd/system-preset/90-default.preset:enable xenconsoled.service
/lib/systemd/system-preset/90-default.preset:enable accounts-daemon.service
/lib/systemd/system-preset/90-default.preset:enable rtkit-daemon.service
/lib/systemd/system-preset/90-default.preset:enable upower.service
/lib/systemd/system-preset/90-default.preset:enable udisks2.service
/lib/systemd/system-preset/90-default.preset:enable polkit.service
/lib/systemd/system-preset/90-default.preset:enable timedatex.service
/lib/systemd/system-preset/90-default.preset:enable mlocate-updatedb.timer
/lib/systemd/system-preset/90-default.preset:enable sa-update.timer
/lib/systemd/system-preset/90-systemd.preset:enable remote-fs.target
/lib/systemd/system-preset/90-systemd.preset:enable machines.target
/lib/systemd/system-preset/90-systemd.preset:enable [email protected]
/lib/systemd/system-preset/90-systemd.preset:enable systemd-timesyncd.service
/lib/systemd/system-preset/90-systemd.preset:enable systemd-networkd.service
/lib/systemd/system-preset/90-systemd.preset:enable systemd-resolved.service

>> Set hostname to <ba64338e2b1a>.
>> Running in a container, ignoring fstab device entry for /dev/disk/by-uuid/2cd63037-e967-4e87-b29b-044190721e80.
>> sys-fs-fuse-connections.mount: Cannot add dependency job, ignoring: Unit sys-fs-fuse-connections.mount is masked.
>> dev-hugepages.mount: Cannot add dependency job, ignoring: Unit dev-hugepages.mount is masked.
>> systemd-remount-fs.service: Cannot add dependency job, ignoring: Unit systemd-remount-fs.service is masked.
>> systemd-logind.service: Cannot add dependency job, ignoring: Unit systemd-logind.service is masked.
>> getty.target: Cannot add dependency job, ignoring: Unit getty.target is masked.
>> [  OK  ] Reached target Encrypted Volumes.
>> [  OK  ] Created slice Root Slice.
>> [  OK  ] Listening on Journal Socket.
>> [  OK  ] Listening on Journal Socket (/dev/log).
>> [  OK  ] Reached target Remote File Systems.
>> [  OK  ] Reached target Paths.
>> [  OK  ] Created slice System Slice.
>> ...
>>
>> I want to get rid of these mount messages, getty messages, systemd-logind
>> messages...
> The remount-fs.service is a nop anyway, unless you actually ship stuff
> in /etc/fstab, which you shouldn't. Also, you reference a physical
> hard disk from /etc/fstab, which makes no sense either in a
> container. I'd really recommend to remove /etc/fstab entirely.

I will try to remove /etc/fstab to see if this makes it shut up.

> I don't see why one would want to mask systemd-logind.service. If you
> permit logins and PAM at all, you really need that.

If I wanted to add a login program I could enable/unmask these. No one runs docker containers as login services; that would require getty.

> And masking the getty stuff appears to be entirely unnecessary...

Again, the goal is just to get rid of the getty failure message at bootup.

> Which leaves the /dev/hugepages and /sys/fs/fuse/connections
> mounts. Not sure about those. Are you running the container with
> CAP_SYS_ADMIN? If so, then there's no reason to mask those units. If
> not, then I figure we could add checks that these are conditioned out
> if CAP_SYS_ADMIN is missing.

No, docker containers do not enable SYS_ADMIN or NET_ADMIN by default.

> On nspawn these two aren't seen since nspawn actually doesn't mount
> the real sysfs to /sys, but just a tmpfs with a select number of
> subdirectories from the real sysfs for security reasons. One of the
> subdirs that are suppressed is /sys/fs. Now,
> sys-fs-fuse-connections.mount is conditionalized on
> /sys/fs/fuse/connections existing, hence if it is not there, then it
> won't be mounted. And /dev/hugepages we simply allow to be mounted in
> the container.

Interesting idea. Maybe we should just mount over /sys/fs also. Do you just mount hugepages then during container setup?

> Lennart
>
_______________________________________________
systemd-devel mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
