Control: reassign -1 autopkgtest
Control: retitle -1 autopkgtest-virt-podman: document how to give systemd CAP_SYS_ADMIN
Control: severity -1 wishlist
Control: forwarded -1 https://salsa.debian.org/ci-team/autopkgtest/-/merge_requests/396
Control: tags -1 + help

On Sat, 10 Aug 2024 at 11:24:51 -0400, Reinhard Tartler wrote:
> I personally find that wording a bit too strong. How about something like
> this:
> ...
> > However, this also introduces an additional attack surface in the
> > kernel if malicious code tried to escape the container sandbox.

Are you thinking here of malicious code in a systemd service inside the
container trying to escape from systemd's sandboxing to be privileged
within the podman container, or are you thinking about malicious code
inside the podman container (possibly as unconfined root) escaping from
the podman container to harm the host system?

In the autopkgtest use-case, I think in general we trust (or distrust!)
everything inside the podman container equally: they're all coming from
the same apt source(s), and can execute arbitrary code as container root
via their maintainer scripts. The only reason that systemd's sandboxing
of system services matters to us is that one of the things we ideally
want to test is that the maintainer of the package under test didn't
configure overly-strict systemd sandboxing that breaks their service's
intended functionality.

> > (It might also be appropriate to add a shorthand form for that, to avoid
> > needing to use the "pass arbitrary options to podman-run" mechanism,
> > but that would need some more design to choose a suitable name for
> > that option. --trust-root-in-testbed, perhaps, if my understanding of
> > the impact of CAP_SYS_ADMIN is correct.)
>
> I'd love to see such a shortcut, but it is not obvious to me how to name it.
> Your suggestion seems too strong to me, because there are typically still
> other security features in play, such as seccomp, selinux or apparmor.

Yes, hence my question about how dangerous it is to allow CAP_SYS_ADMIN.

One way to phrase it is: are the other security mechanisms that podman
uses (seccomp and LSMs) meant to be sufficiently strong that, if container
root can escape to the host (even with CAP_SYS_ADMIN), that would justify
a CVE in either podman or the kernel? If yes, then denying CAP_SYS_ADMIN
is just hardening, rather than being security-critical in its own right.

> Re-reading through https://github.com/systemd/systemd/issues/29860 clarifies
> that systemd has a number of additional security hardening features, such as
> DynamicUsers, but also things like `PrivateDevices=`, `ProtectHome=`,
> `ProtectSystem=`, `MountFlags=`, `PrivateTmp=`, `ReadWriteDirectories=`,
> `ReadOnlyDirectories=`, `InaccessibleDirectories=`, and `MountFlags=`.

Yes. Some of these are orthogonal to CAP_SYS_ADMIN; some of them need
CAP_SYS_ADMIN to be effective, but are automatically disabled (with a
warning) in its absence; and some need CAP_SYS_ADMIN, and service startup
fails in its absence. It's the inconsistency between those last two
categories that initially led me to think that this could be a systemd bug.

> > If nothing is going to be done about this in systemd, and nothing can be
> > done about it in podman, then it'll probably have to end up as a
> > documentation improvement in autopkgtest-virt-podman(1).
>
> I tend to agree.

Reassigning to autopkgtest(-virt-podman) for that.

> I personally would be comfortable running containers
> that have systemd inside with CAP_SYS_ADMIN because that is closer to
> how systemd runs on a real system. Also, podman provides other additional
> security features, such as seccomp and apparmor/selinux.

Thanks, that's a useful data point.

    smcv
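
P.S. For anyone trying to reproduce this, here is a minimal sketch (not
part of autopkgtest, and only an illustration) of one way to check whether
a process inside the testbed actually holds CAP_SYS_ADMIN, by decoding the
CapEff mask from /proc/self/status. The function names are hypothetical;
the only hard fact relied on is that CAP_SYS_ADMIN is capability number 21
in <linux/capability.h>.

```python
# Illustrative helper names; nothing here is an autopkgtest or podman API.
CAP_SYS_ADMIN = 21  # capability number from <linux/capability.h>


def has_capability(cap_mask_hex, cap=CAP_SYS_ADMIN):
    """Return True if bit `cap` is set in a hex mask such as CapEff."""
    return bool((int(cap_mask_hex, 16) >> cap) & 1)


def read_effective_caps(status_path="/proc/self/status"):
    """Return the CapEff hex mask of the current process (Linux only)."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("no CapEff line in " + status_path)


# Typical use inside the container:
#   has_capability(read_effective_caps())
```

Running this as container root with and without the extra capability
should show whether the option passed to podman-run took effect.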