Hi All,

This doesn't directly affect me. Nor am I familiar with the mechanisms.

Perhaps it's worthwhile to suggest that EGO_SUM itself may be
externalized.  I don't know what goes into it, and this will likely
require help from portage itself, so may not be directly viable.

What if portage had a feature whereby a SRC_URI list could be downloaded
as a SRC_URI itself?  In other words:

SRC_URI_INDIRECT="https://wherever/lists_for_some_go_package.txt"

Where that file itself contains lines for entries that would normally go
into SRC_URI (directly or indirectly via EGO_SUM from what I can
deduce).  Something like:

https://www.upstream.com/downloads/package-version.tar.gz => fneh.tar.gz|manifest portion goes here

Here the manifest portion would imply DIST and fneh.tar.gz, so it would
start with the file size in bytes, followed by checksum/value pairs as
in current Manifest files.
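
A minimal sketch of how such a list could be parsed, purely to make the
proposed format concrete.  Nothing like this exists in portage today.
The names IndirectEntry, parse_indirect_line and parse_indirect_list are
hypothetical, and I'm assuming one entry per line in the form
"URL => FILENAME|SIZE CHECKSUM value CHECKSUM value":

```python
from dataclasses import dataclass


@dataclass
class IndirectEntry:
    """One line of the hypothetical indirect SRC_URI list."""
    url: str
    filename: str
    size: int
    checksums: dict


def parse_indirect_line(line: str) -> IndirectEntry:
    """Parse 'URL => FILENAME|SIZE HASH VALUE [HASH VALUE ...]'."""
    url, sep, rest = line.partition("=>")
    if not sep:
        raise ValueError(f"missing '=>' in: {line!r}")
    filename, sep, manifest = rest.partition("|")
    if not sep:
        raise ValueError(f"missing '|' in: {line!r}")
    fields = manifest.split()
    # Expect the size followed by complete (hash name, value) pairs.
    if len(fields) % 2 != 1:
        raise ValueError(f"malformed manifest portion in: {line!r}")
    size = int(fields[0])
    checksums = dict(zip(fields[1::2], fields[2::2]))
    return IndirectEntry(url.strip(), filename.strip(), size, checksums)


def parse_indirect_list(text: str) -> list:
    """Parse a whole list file, skipping blank lines."""
    return [parse_indirect_line(l) for l in text.splitlines() if l.strip()]
```

Each parsed entry carries the same information as a SRC_URI rename
("url -> filename") plus a DIST line from a Manifest, so the package
manager could fetch and verify it the same way it does today.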

Since users may want to know how big the downloads for a specific ebuild
are, some process to generate these external manifests may be in order,
along with a way to store the total size of the indirect downloads
in the local Manifest, something like:

IDIST lists_for_some_go_package.txt direct_size indirect_size CHECKSUM value CHECKSUM value
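
As a rough sketch of the generation step, assuming the list file holds
one "URL => FILENAME|SIZE CHECKSUM value ..." entry per line: direct_size
is the size of the list file itself, and indirect_size is the sum of the
sizes it declares.  The function name idist_line is hypothetical, and
the choice of BLAKE2B and SHA512 simply mirrors the hashes used in
current Manifest files:

```python
import hashlib


def idist_line(name: str, data: bytes) -> str:
    """Build a hypothetical IDIST Manifest line for an indirect list file."""
    direct_size = len(data)  # bytes in the list file itself
    indirect_size = 0        # total bytes of everything the list points at
    for line in data.decode("utf-8").splitlines():
        if not line.strip():
            continue
        # The manifest portion follows '|' and starts with the file size.
        manifest = line.partition("|")[2]
        indirect_size += int(manifest.split()[0])
    blake2b = hashlib.blake2b(data).hexdigest()
    sha512 = hashlib.sha512(data).hexdigest()
    return (f"IDIST {name} {direct_size} {indirect_size} "
            f"BLAKE2B {blake2b} SHA512 {sha512}")
```

With both sizes on the one line, a pretend run could report the total
download size without first having to fetch the list.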

I realise this idea isn't immediately feasible, and perhaps not at all;
I present it here in case it sparks an idea for someone else.
It sounds like this is the problem that the vendor tarball tries to
solve, but the tarball introduces a trust issue.  I'm not sure that
issue goes away entirely, but at a minimum we would again be verifying
download locations (as per EGO_SUM or just SRC_URI in general) rather
than code tarballs containing many times more code than download
locations.

Given:

jkroon@plastiekpoot ~ $ du -sh /var/db/repos/gentoo/
644M    /var/db/repos/gentoo/

I'm not against exploding this by another 200 or even 300 MB personally,
but I do agree that pointless bloat is bad, and ideally we want to
shrink the size requirements of the portage tree rather than enlarge
them.

Kind Regards,
Jaco

On 2022/09/30 15:57, Florian Schmaus wrote:

> On 28/09/2022 23.23, John Helmert III wrote:
>> On Wed, Sep 28, 2022 at 05:28:00PM +0200, Florian Schmaus wrote:
>>> I would like to continue discussing whether we should entirely
>>> deprecate EGO_SUM without the desire to offend anyone.
>>>
>>> We now have a pending GitHub PR that bumps restic to 0.14 [1]. Restic
>>> is a very popular backup software written in Go. The PR drops EGO_SUM
>>> in favor of a vendor tarball created by the proxied maintainer.
>>> However, I am unaware of any tool that lets you practically audit the
>>> 35 MiB source contained in the tarball. And even if such a tool
>>> exists, this would mean another manual step is required, which is,
>>> potentially, skipped most of the time, weakening our users' security.
>>> This is because I believe neither our tooling, e.g., go-mod.eclass,
>>> nor any Golang tooling, authenticates the contents of the vendor
>>> tarball against upstream's go.sum. But please correct me if I am
>>> wrong.
>>>
>>> I wonder if we can reach consensus around un-deprecating EGO_SUM, but
>>> discouraging its usage in certain situations. That is, provide EGO_SUM
>>> as an option but disallow its use if
>>> 1.) *upstream* provides a vendor tarball
>>> 2.) the number of EGO_SUM entries exceeds 1000 and a Gentoo developer
>>> maintains the package
>>> 3.) the number of EGO_SUM entries exceeds 1500 and a proxied maintainer
>>> maintains the package
>>
>> I'm not sure I agree on these limits, given the authenticity problem
>> exists regardless of how many dependencies there are.
>
> It's not really about authentication, you always have to trust
> upstream to some degree (unless you audit every line of code). But I
> believe that code distributed via official channels is viewed by more
> eyes and significantly more secure.
>
> EGO_SUM entries are directly fetched from the official distribution
> channels of Golang. Hence, there is a higher chance that malicious
> code in one of those is detected faster, simply because they are
> consumed by more entities. Compared to the dependency tarball that is
> just used by Gentoo. In contrast to the official sources, "nobody" is
> looking at the code inside the tarball.
>
> For proxied packages, where the dependency tarball is published by the
> proxied maintainer, the tarball also allows another entity to inject
> code into the final result of the package. And compared to a few small
> patches in FILESDIR, such a dependency tarball requires more effort to
> review. This further weakens security in comparison to EGO_SUM.
>
> - Flow
