On Mon, Jun 27, 2022 at 01:43:19 +0200, Zoltan Puskas wrote:
> Hi,
> 
> I was working on adding a Go-based ebuild to Gentoo yesterday and I
> got a warning from Portage saying that EGO_SUM is deprecated and
> should be avoided. Since I remember there was an intense discussion
> about this on the ML, I went back and re-read the threads before
> writing this piece. I'd like to provide my perspective as a user, a
> proxied maintainer, and overlay owner. I also run a private mirror on my 
> LAN to serve my hosts in order to reduce load on external mirrors.
> 
> Before diving in I think it's worth reading mgorny's blog post "The 
> modern packager’s security nightmare"[1] as it's relevant to the 
> discussion, and something I deeply agree with.
> 
> With all that being said, I feel that the tarball idea is a bad one
> for many reasons.
> 
> From a security point of view, I understand that we still have to
> trust maintainers not to do funky stuff, but I think this issue goes
> beyond that.
> 
> First of all, one of the advantages of Gentoo is that it gets its source
> code from upstream (yes, I'm aware of mirrors acting as a cache layer),
> which means that poisoning source code needs to be done at the upstream
> level (effectively, that means hacking GitHub, PyPI, or some standalone
> project's Gitea/cgit/GitLab/etc. instance), sources which either get
> more scrutiny or have a limited blast radius.
> 
> Additionally if an upstream dependency has a security issue it's easier 
> to scan all EGO_SUM content and find packages that potentially depend on 
> a broken dependency and force a re-pinning and rebuild. The tarball 
> magic hides this completely and makes searching very expensive.
> 
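
For what it's worth, the kind of scan described above could be sketched
roughly like this. It's only a sketch under assumptions: the helper names
(ego_sum_entries, find_ego_sum_users) are made up for illustration, and it
assumes each EGO_SUM entry sits on its own line as a double-quoted
"module version" string between "EGO_SUM=(" and a closing ")".

```python
import re
from pathlib import Path

# Matches one quoted EGO_SUM entry such as "github.com/foo/bar v1.2.3/go.mod".
# Assumption: entries are double-quoted "module version" pairs, one per line.
ENTRY_RE = re.compile(r'^\s*"(\S+)\s+(\S+)"\s*$')


def ego_sum_entries(ebuild_text):
    """Yield (module, version) pairs listed inside an EGO_SUM=( ... ) array."""
    in_block = False
    for line in ebuild_text.splitlines():
        stripped = line.strip()
        if not in_block:
            # Only start collecting once the array opens.
            if stripped.startswith("EGO_SUM=("):
                in_block = True
            continue
        if stripped.startswith(")"):
            in_block = False
            continue
        match = ENTRY_RE.match(line)
        if match:
            yield match.group(1), match.group(2)


def find_ego_sum_users(repo_root, module):
    """Return sorted ebuild paths under repo_root whose EGO_SUM pins `module`."""
    hits = []
    for ebuild in Path(repo_root).rglob("*.ebuild"):
        text = ebuild.read_text(encoding="utf-8", errors="replace")
        if any(name == module for name, _ in ego_sum_entries(text)):
            hits.append(str(ebuild))
    return sorted(hits)
```

Calling find_ego_sum_users() on a checkout of the tree with the module path
of the affected dependency would list the ebuilds that pin it. With a vendor
tarball there is nothing comparable to grep in the ebuild itself.
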
> In fact using these vendor tarballs is the equivalent of "static 
> linking" in the packaging space. Why are we introducing the same issue 
> in the repository space? This kills the reusability of already 
> downloaded dependencies and bloats storage requirements. This is 
> especially bad on laptops, where SSD free space might be limited, in 
> case the user does not nuke their distfiles after each upgrade.
> 
> Considering that Btrfs (and possibly other filesystems) supports
> on-the-fly compression, the physical cost of a few inflated ebuilds and
> Manifests is actually way smaller than the logical size would indicate.
> Compare that to the huge incompressible tarballs that we now need to
> store.
> 
> As a proxied maintainer or overlay owner, hosting these huge tarballs
> also becomes a problem (i.e. we need some public space with potentially
> gigabytes of free space and enough bandwidth to push that to users). 
> Pushing toward vendor tarballs creates an extra expense on every level 
> (Gentoo infra, mirrors, proxy maintainers, overlay owners, users).
> 
> If bloating Portage is a big issue and we frown upon Go stuff anyway (or
> only a few users need these packages), why not consider moving all Go
> packages into an officially supported, Go-packages-only overlay? I
> understand that this would not solve the kernel buffer issue where we 
> run out of environment variable space, but it would debloat the main 
> portage tree.
> 

Rephrasing this just to ensure I'm understanding it correctly: you're
suggesting to move _everything_ that uses Go into its own overlay. Let's
call it gentoo-go for the sake of the example.

If the above is accurate, then I hard disagree.

The biggest package that I have that uses Go is docker (and accompanying
tools). Personal distaste of docker aside, it's a very popular piece of
software, and I don't think it's fair to require all the people who want
to use it to first enable and sync gentoo-go before they can install it.

And what about transitive dependencies? Suppose app-misc/cool-package is
written in some language that isn't Go, but it has a dependency on
sys-apps/cool-util which has a dependency on something written in Go.
Should a user wanting to install cool-package have to enable the
gentoo-go overlay now too? Even though app-misc/cool-package would look
like it doesn't need the overlay unless you dig into the deps.
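
To make that concrete, here's a rough sketch of the digging a user (or a
tool) would have to do. Everything in it is made up for illustration: the
dependency table is hand-written rather than read from real ebuilds,
needs_overlay is a hypothetical helper, and dev-util/some-go-tool is a
placeholder for whatever Go package ends up in gentoo-go.

```python
def needs_overlay(package, deps, overlay_packages):
    """Return the dependency chain that pulls in an overlay package, or None."""
    stack = [(package, [package])]
    seen = set()
    while stack:
        current, path = stack.pop()
        if current in overlay_packages:
            return path
        if current in seen:
            continue
        seen.add(current)
        # Follow every direct dependency, remembering how we got there.
        for dep in deps.get(current, ()):
            stack.append((dep, path + [dep]))
    return None


# Hand-written example graph; not taken from any real ebuilds.
deps = {
    "app-misc/cool-package": ["sys-apps/cool-util"],
    "sys-apps/cool-util": ["dev-util/some-go-tool"],  # hypothetical Go package
}
in_gentoo_go = {"dev-util/some-go-tool"}

print(needs_overlay("app-misc/cool-package", deps, in_gentoo_go))
# ['app-misc/cool-package', 'sys-apps/cool-util', 'dev-util/some-go-tool']
```

Nothing in app-misc/cool-package itself mentions Go, yet the chain still
ends in the overlay.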

Not a dev, just a user who really likes Gentoo :)

- Oskari

> It also breaks reproducibility. With EGO_SUM I can check out an older
> version of the Portage tree (well, to some extent) and rebuild packages,
> since the dependencies' upstreams are very likely to still host old
> versions of their source. With the tarballs this breaks, since as soon
> as an ebuild is dropped from mainline Portage its vendor tarball goes
> with it. Unlike with EGO_SUM, there is no way for the user to roll a
> package back a few weeks (e.g. if the new version has bugs).
> 
> In fact, I feel this goes against the spirit of Portage too, since
> instead of "just describing" how to obtain sources and build them, it
> now depends on essentially ephemeral blobs, which happen to be
> externalized from the Portage tree itself. I'm aware that we have
> ebuilds that pull in patches and other stuff from dev space already, but
> we shouldn't make this even worse.
> 
> Finally, with EGO_SUM we had a nice tool, get-ego-vendor, which produced
> the EGO_SUM for maintainers and made maintenance easier. However, I
> haven't found any new guidance yet on how to maintain Go packages with
> the new tarball method (e.g. what needs to go into the vendor tarball,
> what changes are needed in ebuilds). Overall, this further complicates
> ebuild development and verification of PRs.
> 
> In summary, IMHO the EGO_SUM way of handling Go packages has more
> benefits than drawbacks compared to the vendor tarballs.
> 
> Cheers,
> Zoltan
> 
> [1] 
> https://blogs.gentoo.org/mgorny/2021/02/19/the-modern-packagers-security-nightmare/
> 

