Hi Philipp,

Let me go into some detail that is tangential to the larger discussion.

On Mon, Jul 01, 2024 at 09:18:19AM +0200, Philipp Kern wrote:
> How well does this setup nest? I had a lot of trouble trying to run the
> unshare backend within an unprivileged container as setup by systemd-nspawn
> - mostly with device nodes. In the end I had to give up and replaced the
> container with a full-blown VM. I understand that some of the things compose
> a little if the submaps are set up correctly, with less IDs allocated to the
> nested child. Is there a way to make this work properly, or would you always
> run into setup issues with device nodes at this point?

Technically speaking, nesting is possible. The individual container
implementation may limit you, but that's an implementation limit and not
a fundamental one. I'm assuming that you want to nest a rootless
container in a rootless container, as that tends to be the most
difficult case. Roughly speaking, an unprivileged container wants access
to your user id and a 64k allocation of subuids, and this applies to the
nested container as well. If your outer container maps two 64k ranges
(one to ids 0 to 65535 and the other to whatever your user has in its
contained /etc/subuid), your contained user should actually be able to
spawn a podman container, unless I am missing something important.
Devices usually are not a problem (for rootless containers): you cannot
create them anyway, so you end up bind mounting them, and the bind
mounting technique nests well.
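
To make the two-range layout a bit more concrete, here is a rough
Python sketch of the arithmetic. All numbers and names are made up for
illustration (outside ranges starting at 100000 and 165536, a contained
/etc/subuid entry starting at 100000), it ignores the mapping of your
own user id, and it does not call any container tool. The triples use
the same order as /proc/<pid>/uid_map: inside id, outside id, length.

```python
from typing import List, NamedTuple, Optional

RANGE = 65536  # the "64k" allocation discussed above


class MapEntry(NamedTuple):
    inside: int   # first id as seen inside the outer container
    outside: int  # first id as seen by the parent namespace
    count: int    # number of consecutive ids


def plan_outer_map(outside_first: int, outside_second: int,
                   inner_subuid_start: int) -> List[MapEntry]:
    """Lay out two 64k ranges for the outer container.

    The first range becomes ids 0..65535, the second one lands where
    the contained /etc/subuid says the nested user's subuids start.
    """
    if inner_subuid_start < RANGE:
        raise ValueError("nested subuid range would overlap ids 0..65535")
    return [
        MapEntry(0, outside_first, RANGE),
        MapEntry(inner_subuid_start, outside_second, RANGE),
    ]


def to_outside(entries: List[MapEntry], inside_id: int) -> Optional[int]:
    """Translate an id inside the container to the parent namespace."""
    for e in entries:
        if e.inside <= inside_id < e.inside + e.count:
            return e.outside + (inside_id - e.inside)
    return None  # unmapped id


# Hypothetical values, not taken from any real system.
outer = plan_outer_map(100000, 165536, 100000)
```

With this layout, ids 0 to 65535 and 100000 to 165535 exist inside the
outer container and everything in between is unmapped, which is what
the nested container's own 64k allocation needs.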

A typical Debian installation only allocates a single 64k range to each
user. Your first step here is growing that range or adding another one.
(Yes, you may have multiple lines for your user in /etc/subuid.) Then
the podman-run documentation hints at --uidmap and it says that you can
specify it multiple times to map multiple ranges. This is how you
construct your outer container. Then inside, nesting should just work.
Admittedly, I've not tried this.
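
As a quick sanity check for the first step, something like the
following sketch could tell you whether a user has enough subuids for
two 64k ranges once multiple lines are summed up. It only relies on the
name:start:count format of /etc/subuid. The user names and numbers in
the sample are hypothetical, and the threshold of two ranges is my
assumption from the reasoning above.

```python
from typing import List, Tuple

NEEDED = 2 * 65536  # two 64k ranges, as discussed above


def subuid_ranges(text: str, user: str) -> List[Tuple[int, int]]:
    """Return all (start, count) ranges listed for user."""
    ranges = []
    for line in text.splitlines():
        fields = line.strip().split(":")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name, start, count = fields
        if name == user:
            ranges.append((int(start), int(count)))
    return ranges


def enough_for_nesting(text: str, user: str) -> bool:
    """True if the user's ranges add up to at least two 64k ranges."""
    return sum(count for _, count in subuid_ranges(text, user)) >= NEEDED


# Hypothetical file content; on a real system read /etc/subuid instead.
SAMPLE = """\
alice:100000:65536
alice:200000:65536
bob:300000:65536
"""

print(enough_for_nesting(SAMPLE, "alice"))
print(enough_for_nesting(SAMPLE, "bob"))
```

Here alice has two lines and passes the check, while bob has the
single 64k range of a typical installation and does not.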

The takeaway should be that if your outer container is constructed in
the right way, you should be able to nest other containers (e.g. podman,
mmdebstrap, sbuild unshare, ...) without issues. It's not like this just
works out of the box, but it should be feasible.

Helmut
