On Tue, Dec 10, 2019 at 8:25 AM Thomas Deutschmann <whi...@gentoo.org> wrote:
>
> On 2019-12-10 13:44, Rich Freeman wrote:
> > I'm not talking about container-host mapping.  I'm talking about
> > building the same container 100 times and having the container end up
> > with the same UIDs inside each time.
> >
> > Build order in portage isn't really deterministic, especially over
> > long periods of time, so you can't rely on stuff getting installed in
> > the same order.
>
> Assume you are building a container for dev-db/mysql. I can only think
> of one scenario where you would end up with different UIDs: That's when
> dev-db/mysql (or a dependency) would suddenly create its own user and
> be merged before mysql's user is created.
>
> But this is very theoretical. Especially in a container world, you
> will create one container per service, so it's *very* unlikely that
> something like that will ever happen. No?

There are cases where you might have more than one service in a
container, and there is still the issue of dependencies.  It makes
sense to limit a container to a single function, but internally that
function could involve multiple processes.

> Aside from the benefits of reproducible builds in general (which Gentoo
> doesn't provide), please share reasons why one would care about the
> UIDs/GIDs used in containers...

Well, that is simple.  In the mysql example /var/lib/mysql would be
bind-mounted from outside the container, since it needs to persist
whenever the container is updated.  If you update the container and it
ends up with a different UID for mysqld, then mysqld wouldn't be able
to read anything in /var/lib/mysql, as those files would still be
owned by the previous UID.  You'd need to keep the two in sync somehow.
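
A rough sketch of how one might catch that mismatch before starting the
service, assuming GNU coreutils stat is available in the container.  The
helper name dir_owned_by_uid is made up for illustration and is not part
of any existing tool:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if DIR is owned by the numeric UID given.
# Assumes GNU stat, where "-c %u" prints the owner's numeric UID.
dir_owned_by_uid() {
    dir=$1
    want_uid=$2
    # Return 2 if the directory cannot be inspected at all.
    dir_uid=$(stat -c %u "$dir") || return 2
    [ "$dir_uid" = "$want_uid" ]
}
```

Inside the container this could be used as
dir_owned_by_uid /var/lib/mysql "$(id -u mysql)" || echo "UID drift" >&2
so that a rebuilt image with a different mysql UID is noticed before
mysqld fails on its own data directory.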

In fact, that is the very example you go on to offer...

> > Uh, the container processes shouldn't even see the host
> > processes/files whether they have the same UIDs or not...
>
> Especially when you put mysql or any other service using data into a
> container, the service running in that container must be able to access
> this data. And one common way to do that is allowing the container to
> access data stored on the host, i.e.
>
> > $ docker run \
> >     --name some-mysql \
> >     -v /my/own/datadir:/var/lib/mysql \
> >     -e MYSQL_ROOT_PASSWORD=my-secret-pw \
> >     -d mysql:tag
>
> which will make /my/own/datadir from the host available in the
> container as /var/lib/mysql.
>

This is obviously exactly how you'd do it if you were using docker,
but you don't need to keep the UID in the container in sync with
anything else on the host.  If security is a concern you'd probably
want to make sure that no non-root user on the host can reach the
directory, since the UID that owns it might be in use for something
else there.

In any case, this is why consistent UIDs in scratch installs are
useful.  When you're building a new version of a container you don't
want all its UIDs to change.  And of course this isn't just limited to
containers, but anything where you have persistent storage.  It is
just that the idea of creating new instances from scratch instead of
updating them in-place has become more fashionable around the same
time that containers have become more fashionable.  You could do the
same thing with a bare-metal host, though it would be a bit more
painful (perhaps using A/B partition booting, or less painful if
you're booting from a SAN or something like that).
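
As an illustration only, a check along these lines could compare the
passwd file of the previous build with that of a fresh build and flag a
user whose UID moved.  The helper names passwd_uid and uid_unchanged are
hypothetical, as are whatever paths you point them at:

```shell
#!/bin/sh
# Hypothetical helper: print the UID recorded for USER in a passwd-format FILE.
# Exits non-zero if the user is not listed.
passwd_uid() {
    awk -F: -v u="$2" '$1 == u { print $3; found = 1; exit }
                       END { exit !found }' "$1"
}

# Hypothetical helper: succeed only if USER has the same UID in both files.
uid_unchanged() {
    old=$(passwd_uid "$1" "$3") || return 2
    new=$(passwd_uid "$2" "$3") || return 2
    [ "$old" = "$new" ]
}
```

For example, uid_unchanged old-root/etc/passwd new-root/etc/passwd mysql
would fail if the new scratch install gave mysql a different UID than
the one that owns the persistent data.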

-- 
Rich
