Hi Sam,

I'm very glad about your mail.

On Wed, Jan 19, 2022 at 05:57:15PM -0700, Sam Hartman wrote:
> [Feel free to quote in public with attribution.
> I'm sending privately to allow you to control distribution.]

Thank you for your trust in me. I think that your mail is mostly
technical and contains a few personal aspects. While I'm glad for those
words on a personal level, I don't think they help in the public
discussion. Therefore, I've left them out in what mostly is a
full-quote. Please let me know if you disagree with the selection.

Thus, this reply goes to the bug report to continue in a public spot.

> Hi.  I've owed you an explanation on this for a while now, and I regret
> that I allowed other less enjoyable (and less technical) aspects of
> Debian to dominate my thoughts.

Thank you!

> We had a call where we talked about  DPKG_ROOT.
> There were two use cases.
> 
> 1) architectures without qemu.
> You sold me on the importance of this use case.

Good.

> 2) Being able to build for embedded systems without root.
> Putting it mildly I remain unconvinced on this use case.

I think it was actually the other way round. You sold us on this not
being a relevant problem to solve. Since bullseye, user namespaces are
widely available and effectively that is the far easier tool for this
problem.

However, there is one nuance to this that may come back in future. When
building embedded systems, we want their rootfs to be small. Kicking out
everything that we do not need is key. Often that includes the
dpkg database and the maintainer scripts. Thus it would - in theory - be
feasible to remove anything that is only required by maintainer scripts.
At present, we have no way of telling apart which dependencies are
required for maintainer scripts and which are required for using the
package. Those two concepts are not separated in any way. However, we
have been talking about separating them for at least three years. Think
of this as an early draft for a proposal, not something set in stone.
The takeaway roughly is that if we manage to split package dependencies
into two categories for installation and for runtime, then DPKG_ROOT
poses another gain as those installation dependencies can be removed
from the system that is to be bootstrapped. Please consider this a weak
argument at least for now. You can easily construct counter arguments:
We might install those dependencies inside and remove them at the end.
The reason I'm including it here is to avoid an unpleasant surprise
later on.

> To understand my thoughts, I'd like to describe a bit how I see Debian
> and the maintainer scripts.

I find your way of approaching this very good. I keep learning from you
in terms of how to constructively discuss things. As we'll see, this
bottom-up approach makes it easy to figure out where exactly we
disagree rather than build on inconsistent assumptions.

> We adopted a fairly simple model for maintainer scripts: they run as
> root on the installed system.
> This has huge advantages: it's allowed us to package a lot of software.
> It's easy to reason about.

People have radically differing opinions on that last point. For
instance, Ralf Treinen is involved with CoLiS, a framework for reasoning
about maintainer scripts, and if you ask him, "easy" is likely not a word
he would use for them. Often, we only reason about them by trying
whether they work in practice. For that reason, a number
of people are looking for ways to eliminate them in places where we can
do without. You can see dpkg triggers as a way of making them more
declarative. Guillem Jover and Niels Thykier have invested quite some
time into maintainer script removal. Examples include:
 * https://wiki.debian.org/Teams/Dpkg/Spec/MetadataTracking
 * https://wiki.debian.org/Teams/Dpkg/Spec/DeclarativePackaging
A big chunk of the work Johannes and myself invested into DPKG_ROOT was
eliminating unnecessary maintainer scripts.

Can we agree on "easy to reason about" being very subjective? That
really depends on how you look at it.

When it comes to consensus, my perception is that the majority of people
I talk to are in favour of declarative approaches. This applies to
metadata (i.e. what fakeroot solves) in particular.

> We adopted a similar model for builds.
> Many other packaging systems adopted complex models for setting file
> permissions and the like.
> We decided we'd use fakeroot--we captured and modified the OS in order
> to extract the information we needed from the install process and to
> allow people to use familiar tools.
> It worked so much better than other similar systems of the mid 1990's.

Again, this does have two sides. Most of the time, it just works.
fakeroot breaks every so often. It plays whack-a-mole with an ever
evolving glibc and linux. Johannes even had to NMU it once. Worse, we
also have pseudo, which provides fakeroot. I've recently lost more than
an hour figuring out why tar would segfault during a python2.7 build and
it happened to be a bug in pseudo. Really, the fakeroot approach is a
time-sink in many ways.

> We've found the performance suffers, and for a large set of packages we
> don't need it, thus we have rules-requires-root: no.

I think performance was the least of concerns here, but that may be
subjective as well. In any case, the sum of problems made people invest
into r-r-r.

> But by the time we adopted that, we had significant tooling like
> dh_fixperms and dh_install, such that for a significant number of build
> systems  no  changes are required rather than declaring support for
> r-r-r.

That is very true. I do see where you are aiming. ;)

> The approach for maintainer scripts and for build environment focused on
> making packaging easy and on making existing tooling work as expected.
> 
> The DPKG_ROOT approach uses a different design aesthetic.
> It appears to focus on making the dpkg implementation easy at the
> expense of making everything else hard.

I can see how you arrive at this conclusion even though I disagree with
it. Let me draw a different picture.

The key requirement for the cross-architecture use case is not running
any ELF binaries from the chroot. Whatever script you run, it ends up
using some ELF binary as interpreter. On the surface, that directly
results in the need to avoid chroot. If I missed some obvious
alternative, please tell.

So much of the complexity you see is due to the problem we try to solve.
Whenever we had a choice, we aimed at centralizing the cost. Any calls
to update-alternatives or dpkg-divert will automatically honour
DPKG_ROOT. Making those tools work with DPKG_ROOT certainly was not easy
to achieve. I tried doing it, but I was not successful. Others (at least
David Kalnischkies comes to my mind) made them work. DPKG_ROOT is
supported by more people than Johannes and myself. Where it was
technically feasible, we did centralize the complexity.

You view the picture through the lens of pam. Unfortunately, pam has
quite extensive maintainer scripts calling into tools from pam. Quite
naturally, we see bigger changes here. Some other packages - such as
util-linux - merely needed a rebuild with an updated debhelper.

If you see us pushing unnecessary complexity into pam or you see
opportunities to solve these in a more generic way, please tell.

So I'm trying to sell you the proposed complexity as being required by
the nature of the problem. Let us see how successful that is. ;)

> Even on our call I was skeptical.  I think there are probably better
> approaches using containers, LD_PRELOAD, complex namespaces, and
> injecting  host system binaries somehow that would make things easier
> for maintainer script authors.
> I think that if you give up on being able to run maintainer scripts as
> non-root and focus on Linux as a kernel, there are probably better
> developer experiences to be had.
> The LD_PRELOAD world probably has better answers even if portability and
> running as non-root is important.

This has always remained vague to me. I do remember you talking about
those technologies, but I wasn't able to sketch a workable approach from
that. LD_PRELOAD doesn't look like it can solve anything here as it
cannot make foreign binaries run. A mount namespace makes a little more
sense. In principle, one could unpack all the packages, set up a
namespace that bind mounts the outer /bin/sh into the tree and chroot
into it. Now you're there without the DPKG_ROOT mess, but with a
different mess: You now need to figure out which binaries (and
libraries) need to be bind mounted. In some cases (such as the dynamic
loader), the paths to bind mount to may be absent. And during package
installation, that set constantly changes. It also is highly dependent
on the concrete binary packages and will likewise need support from
individual packages. To me that sounds like a similar amount of
complexity to the DPKG_ROOT approach. Maybe you can further that sketch?

I actually looked into this practically and I managed to get to a point
where some things work, but not a full essential bootstrap. I prefer
keeping this to myself for now to give you a chance to reply without me
biasing you. I'll send it after your next reply.

> But anyway, this brings us to the first pam patch.
> That had the property that small changes in a couple places made  the
> DPKG_ROOT functionality work for pam-auth-update.
> In addition, because of the way pam-auth-update code is structured, it's
> fairly likely that future updates to the script will not disturb the
> DPKG_ROOT support.
> So I accepted the patch.

Thank you.

> Then the next patch came along.
> To me that really points out the deficiency of the approach.

It means a different thing to me. To me, this is a process failure. We
should have presented the whole thing to you at once. I'm sorry that you
had to digest it piecemeal.

My goal for submitting DPKG_ROOT patches was to first split them into
generic cleanup and improvements unrelated to DPKG_ROOT and then a final
DPKG_ROOT patch. For instance, bash and dash received quite some patches
for cleaning up the diversion mess around /bin/sh. Those served as
incremental improvements on their own merits while also benefiting
DPKG_ROOT.

You can also observe that our pam submission was quite late. I was
hesitant about pam, because it was non-obvious in a few aspects. I
didn't want to unnecessarily bother you. It happened anyway.

> Every time a maintainer script touches a file on the target system,
> DPKG_ROOT needs to be involved.

While that is true in principle, we can slightly weaken it. If you touch
that file using common maintenance tools (such as dpkg-divert,
update-alternatives, ...) as opposed to general system utilities (such
as mkdir, chmod, cp, ...) you get to benefit from automatic DPKG_ROOT
support.
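
To illustrate the distinction, here is a minimal sketch of a postinst
fragment. The helper name and the /etc/example paths are hypothetical
and not taken from pam or any other package. It only relies on dpkg
exporting DPKG_ROOT to maintainer scripts, empty for an ordinary
installation and set to the target tree for a chrootless one.

```shell
#!/bin/sh
set -eu

# Hypothetical helper for illustration only: create a configuration
# directory and a default file on the target system.
setup_example_config() {
    # Empty during a normal installation, so the paths below are
    # unchanged in that case.
    root="${DPKG_ROOT:-}"

    # mkdir, printf and chmod are general system utilities that run from
    # the outer system, so every path they operate on needs the prefix.
    mkdir -p "$root/etc/example"
    if [ ! -e "$root/etc/example/example.conf" ]; then
        printf 'enabled=yes\n' > "$root/etc/example/example.conf"
        chmod 0644 "$root/etc/example/example.conf"
    fi

    # Calls to dpkg-divert or update-alternatives would use plain target
    # paths here, without the prefix, as those tools honour DPKG_ROOT on
    # their own.
}
```

The prefix is the only place where the script has to say which side a
path belongs to. Everything else reads the same as it would without
DPKG_ROOT.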

> But that's the *common case* for what maintainer scripts do.
> The design makes the common case harder for the benefit of new
> functionality.

I accept that critique. It is inherent in the problem we're trying to
solve. One way or another, you must tell apart which files are used as
tools and which files are being operated on. With the sketched bind
mount approach, you just move that complexity elsewhere.

> Were pam my package, and the patch from anyone other than Helmut, I'd
> back out the existing DPKG_ROOT support saying that the design appears
> bad to me.

Taking patches due to attribution sounds bad to me. We should judge
patches on technical matters, not on people. I also think that both
patches are due to Johannes, but neither of us is entirely sure at this
time. The key is: Attribution should not matter for inclusion. Where
attribution plays a role is your willingness to look into patches. You
are more likely to spend time on a patch from someone you trust than one
from someone you don't know. However, if the patch is bad, attribution
should not influence your decision.

On the bright side, this conversation is mostly focused on the technical
aspects. Can we try ignoring the attribution part of the story? Even if
that means that you revert our earlier patch for now.

> I'd admit willingness to accept patches (and accompanying tests) if the
> project gets a consensus behind this approach, because I don't want to
> stand for ever in the way of progress.

How do you imagine forming such consensus?

> (I'm not sure getting a bunch of the essential set to accept your
> patches really counts as consensus here.
> It is equally possible that a bunch of people are not interested in
> standing in your way and a couple of people pushed this forward.  We
> could talk about that.)

By the nature of the problem, the number of affected maintainers is small.
Do we need to bother people outside the essential set, who are otherwise
unaffected by it?

[ personal parts skipped ]

> I don't feel comfortable adopting what I believe is a bad technical
> approach for Steve's package because of my personal reaction to Helmut.

I think there is little way around getting the approach right. If we
make a compromise here, the cost is huge in the long run. I caution
though that "bad" is subjective here. It depends on how you look at it.
I also suggest that we may be pursuing the "least bad" approach.

What I've been missing thus far is that other, good technical approach.
I've tried to sell DPKG_ROOT to a fair number of people and even though
quite a few took issue, I haven't seen any competitive (<- possibly
subjective) approach proposed. The ones I've seen include:
 * Somehow bind mount native binaries over foreign ones.
 * Have a separate set of bootstrap scripts and limit DPKG_ROOT to those
   bootstrap scripts.

And none of them looked more maintainable (from a distribution pov) than
DPKG_ROOT to me.

I'm not trying to dump the work of implementing it on you, but yeah if
we find that other approach, maybe Johannes and myself need to start
over?

[ personal parts skipped ]
> There are a number of decisions I'm deferring to [Steve Langasek].
> Eventually, if he continues to have no time that queue will get long
> enough and I'll either work to find someone else or step more forward
> and start taking these decisions myself.

At this point, it seems that we are conflating two separate subissues
here. One is whose decision inclusion for pam is and your recent mails
seem to indicate that you don't want to be the one to decide. At the
same time, you think the approach is technically bad. The recent mails
on the bug were focused on the former, but it seems to me that the real
issue is the latter. Previously, I had hoped that we had resolved the
technical disagreement. Looks like that impression was wrong.

Let us move back to that step and resolve the technical aspect first.
It doesn't make much sense to move forward otherwise. Having you
participate in the discussion is really helpful here. Thank you.

> ----------------------------------------
> 
> None of the following directly helps the situation with pam today.
> But I'm going to guess that you're not willing to go redesign based on
> namespaces and the like...

Actually, can we sketch that out in more detail? My picture of it
results in a similar amount of churn as DPKG_ROOT requires. That is due
to the need for mixing the native and foreign system into one thing
(using bind mounts?). I'm convinced that following up here will improve
our understanding.

> So I'm going to try and provide some hopefully actionable ways to
> incrementally improve what you have.

Appreciated.

> 1) Some sort of function library/more debhelper support to accomplish
> common maintainer scripts.

Perhaps I'm too unimaginative here. To the extent that seemed reasonable
to me, this has happened. Examples include dpkg-divert and
update-alternatives.

> If pam's maintainer scripts were written against something higher level,
> perhaps the common case of modifying files on the target system could be
> expressed in such a way that  the translation from higher-level tooling
> to postinst could handle DPKG_ROOT

Keep in mind that we need to add information here. With DPKG_ROOT, we
need to tell apart which paths refer to the "inner" and which paths
refer to the "outer" system. Sometimes that can happen implicitly, but
not always. Regardless of the level at which we do this, this
information has to be added somehow.

> 2) It sounds like there is CI testing.
> Is there any way this could be set up as an autopkgtest to actually
> block migration of essential packages where the DPKG_ROOT support has
> bitrotted?

Johannes indicated that such a test could be added to mmdebstrap easily
once support for all essential packages is merged. Until then, we have a
salsa job and can file bugs manually.

In any case, as long as we are actively discussing the technical means,
we have no intention to push for a rapid inclusion. To the contrary:
Please hold off on applying the proposed changes while the discussion
progresses.

Helmut
