On Thu, 01.03.12 15:29, Daniel P. Berrange ([email protected]) wrote:

Heya,
> The libvirt-sandbox project[1] is providing an API and command line tools for constructing application sandboxes. It uses either LXC or KVM virtualization via libvirt, to confine execution of an application binary, giving it a read-only view of the host root filesystem, with custom writable areas grafted onto selected paths. E.g. if running httpd inside a sandbox, we give it a private /etc/httpd and /var/www, etc.
>
> The idea is to get the security isolation benefits of virtualization technology, without the administrative burden of extra OS installs that it normally entails. As such the only processes running inside each sandbox are the application being confined, and a minimal custom "init" binary provided by libvirt-sandbox itself.
>
> As we expand our use cases though, particularly to cover the "secure containers" feature[2] in Fedora 17, it is clear that if we're not careful, our minimal "libvirt-sandbox-init-common" binary is going to turn into a poor man's copy of systemd. We want to avoid that, and instead actually make use of systemd directly.
>
> Since the sandbox shares the same root filesystem as the host, we can't simply exec 'systemd' as is. We'll need to set up a few custom writable mounts, where we write out custom units / targets, and let systemd keep any state.
>
> So I'm trying to figure out just what is the absolute minimal setup we can configure for systemd. Our primary target for development is to sandbox apache. So I'd like to figure out what minimal config / directory structure I need to create to run systemd and have it only run apache, and a login shell (for debug inside the sandbox).
>
> I'm guessing that I can perhaps get away with setting up an override of the host's /etc/systemd, and writing out custom basic.target and default.target unit files, which merely run httpd.unit and a shell?

It is our intention to make systemd run sensibly without any configuration files at all (i.e. empty /etc).
And unless there is a bug somewhere, this should work already.

So one option you have is to take advantage of the fact that systemd looks for unit files in /run, /etc and /lib, where the former override the latter. Making use of that you could trivially override the default.target symlink, and pull in from there only what you really need. There will be a couple of caveats however: normal service units implicitly pull in basic.target (though you can turn that off per unit with DefaultDependencies=no), and basic.target will still pull in some system-provided systemd units.

We actually test systemd regularly in container environments (mostly nspawn at this point, since LXC is kinda borked on Fedora), and make sure everything boots up cleanly. And this works fine (though you do see a couple of error messages one can safely ignore). Without too much work we should be able to make this entirely clean, by sprinkling a couple of ConditionVirtualization= settings here and there, so that certain things (console setup, for example) are not even attempted. Some things you probably do want to keep in the container however, the tmpfiles stuff being one example.

So there are a number of ways to go here. We have been working towards the "make an unmodified systemd work in containers" goal. If you want to go more minimal and not even include the unit files you don't need in your container, then things become a bit more complex, since you need to whitelist what you still want. In the latter case, the units you really really need are none.
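To make the override approach above a bit more concrete, here is a rough Python sketch of how a sandbox setup tool might write out such a tree before starting systemd. It is only an illustration under assumptions: the target name sandbox.target, the helper write_override_tree() and the staging directory are made up for this example, httpd.service is assumed to be the name of the distribution's Apache unit, and the resulting tree is assumed to end up mounted over the sandbox's /etc/systemd/system.

```python
import tempfile
from pathlib import Path

# Hypothetical target that pulls in only Apache and a debug shell.
# httpd.service is an assumed unit name; console-shell.service is one of
# the units systemd ships.
SANDBOX_TARGET = """\
[Unit]
Description=Sandbox default target (hypothetical example)
Wants=httpd.service console-shell.service
"""


def write_override_tree(root: Path) -> Path:
    """Create <root>/etc/systemd/system containing sandbox.target and a
    default.target symlink pointing at it. Returns the unit directory."""
    unit_dir = root / "etc" / "systemd" / "system"
    unit_dir.mkdir(parents=True, exist_ok=True)

    (unit_dir / "sandbox.target").write_text(SANDBOX_TARGET)

    # default.target is just a symlink; replacing it here shadows the copy
    # shipped in /lib once this directory is visible inside the sandbox.
    link = unit_dir / "default.target"
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to("sandbox.target")
    return unit_dir


if __name__ == "__main__":
    # Demo only: build the tree in a throwaway directory and list it.
    with tempfile.TemporaryDirectory() as staging:
        for entry in sorted(write_override_tree(Path(staging)).iterdir()):
            print(entry.name)
```

Note that httpd.service keeps its default dependencies in this sketch, so basic.target and the system-provided units behind it would still get pulled in, as described above.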
However, if you go the whitelist route, you might want to keep:

  systemd-tmpfiles-clean.service
  systemd-tmpfiles-clean.timer
  systemd-tmpfiles-setup.service
  console-shell.service
  halt.service
  reboot.service
  poweroff.service
  basic.target
  emergency.service
  emergency.target
  final.target
  getty.target
  halt.target
  multi-user.target
  poweroff.target
  reboot.target
  rescue.service
  rescue.target
  shutdown.target
  sockets.target
  sysinit.target

But I think it would be a better idea, and more future-proof, to leave all unit files in, but have them take no effect when run in a container (for example, by using ConditionVirtualization= as mentioned above).

Kay and I discussed introducing a new switch to systemd, called --container, which would be available in addition to --system (when run on a normal machine) and --user (when run for a specific user to manage user daemons), and which would alter the way we look for units. But we couldn't really nail down the semantics for this.

We are quite open to adding new container-related features to systemd, so that systemd's behaviour has to be altered as little as possible in containers. Our story is not entirely complete there yet, but we are very open to ideas to make containers work beautifully with systemd.

systemd is quite happy if /sys, /dev, /run and so on are pre-mounted when it is first executed. In fact, initrds tend to mount these directories for us already, and so does nspawn.

Lennart

-- 
Lennart Poettering - Red Hat, Inc.
_______________________________________________
systemd-devel mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
