Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format [gen...@jonesmz.com]
On 18/11/18 22:40, Zac Medico wrote: > On 11/18/18 1:55 PM, Rich Freeman wrote: >> On Sun, Nov 18, 2018 at 4:10 PM Roy Bamford wrote: >>> Replying off list because I am not on the whitelist. >> That seems odd. >> >>> 1) append a uuid to each filename. Generated when the bin package file is >>> generated. >>> 2) encode the hostname of the machine that generated the file >>> 3) encode the use flags in the filename. >> So, I brought up this same issue in the earlier discussion and it was >> considered out of scope, and I think this is fair. The GLEP does not >> specify filename, and IMO the standard for what goes INSIDE the file >> will work just fine with any future enhancements that address exactly >> this use case. >> >> Besides your case of building for a cluster, another use case is >> having a central binary repo that portage could check and utilize when >> a user's preferences happen to match what is pre-built. >> >> I suggest we start a different thread for any additional discussion of >> this use case. I was thinking and it probably wouldn't be super-hard >> to actually start building something like this. But, I don't want to >> derail this GLEP as I don't see any reason designing something like >> this needs to hold up the binary package format. Both the existing >> and proposed binary package formats will encode any metadata needed by >> the package manager inside the file, and the only extension we need is >> to encode identifying info in the filename. >> >> My idea is to basically have portage generate a tag with all the info >> needed to identify the "right" package, take a hash of it, and then >> stick that in the filename. Then when portage is looking for a binary >> package to use at install time it generates the same tag using the >> same algorithm and looks for a matching hash. If a hit is found then >> it reads the complete metadata in the file and applies all the sanity >> checks it already does. 
Generating binary packages with the hash >> could be made optional, and portage could also be configured to first >> look for the matching hash, then fall back to the existing naming >> convention, so that it would be compatible with existing generic >> names. So, users would get a choice as to whether they want to build >> up a library of these packages, or just have each build overwrite the >> last. >> >> Then the next step would be to allow these files to be fetched from a >> binary repo optionally, and then finally we'd need tools to create the >> repo. But, this step isn't needed for your use case. With the proper >> optional switches you could utilize as much of this scheme as you >> like. >> >> Also, you could optionally choose how much you want portage to encode >> in the tag and look for. Are you very fussy and only want a binary >> package with matching CFLAGS/USE/whatever? Or is just matching >> USE/arch/etc enough? Some of the existing portage options could >> potentially be re-used here. > We've already had this handled for a couple years now, via > FEATURES=binpkg-multi-instance. Working fine for me for catalyst ARM runs ...
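For illustration only, the tag-and-hash scheme Rich describes in the quoted text could be sketched roughly as below. Nothing here is Portage code: the function names, the JSON serialization of the tag, the choice of SHA-256, and the 16-character truncation are all assumptions made for the sketch. The only points taken from the thread are that the tag is hashed into the filename, that the lookup falls back to the existing generic name, and that the user decides how much (for example CFLAGS) goes into the tag.

```python
import hashlib
import json


def binpkg_tag(cpv, use_flags, arch, cflags=None):
    """Build a canonical tag from the metadata we choose to match on.

    Hypothetical helper: the field names and JSON encoding are assumptions.
    """
    tag = {"cpv": cpv, "use": sorted(use_flags), "arch": arch}
    if cflags is not None:
        # Only a "fussy" policy includes CFLAGS in the tag.
        tag["cflags"] = cflags.split()
    # sort_keys plus fixed separators give a byte-stable serialization.
    return json.dumps(tag, sort_keys=True, separators=(",", ":"))


def binpkg_filename(cpv, use_flags, arch, cflags=None, suffix=".tbz2"):
    """Return '<PF>-<hash><suffix>' for the given configuration."""
    tag = binpkg_tag(cpv, use_flags, arch, cflags)
    digest = hashlib.sha256(tag.encode("utf-8")).hexdigest()[:16]
    pf = cpv.split("/", 1)[1]  # drop the category, keep name-version
    return f"{pf}-{digest}{suffix}"


def find_binpkg(available, cpv, use_flags, arch, cflags=None, suffix=".tbz2"):
    """Prefer the hashed name, then fall back to the generic name."""
    hashed = binpkg_filename(cpv, use_flags, arch, cflags, suffix)
    if hashed in available:
        return hashed
    generic = cpv.split("/", 1)[1] + suffix
    return generic if generic in available else None
```

A hit on the hashed name would only be a candidate: as the quoted text says, the package manager would still read the full metadata inside the file and apply its usual sanity checks.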
Re: [gentoo-dev] [pre-GLEP r1] Gentoo binary package container format
Hi,

On Sat, 2018-11-17 at 12:21 +0100, Michał Górny wrote:
> Here's a pre-GLEP draft based on the earlier discussion on
> gentoo-portage-dev mailing list. The specification uses GLEP form as it
> provides for cleanly specifying the motivation and rationale.

Changes in -r1: took into account the feedback and restructured the motivation into pointing out advantages of the existing format, and focusing on the two real issues of non-transparency and OpenPGP implementation deficiencies. Also added a section on why there's no explicit version number.

> Also available via HTTPS:
>
> rst: https://dev.gentoo.org/~mgorny/tmp/glep-0078.rst
> html: https://dev.gentoo.org/~mgorny/tmp/glep-0078.html
>

---
GLEP:
Title: Gentoo binary package container format
Author: Michał Górny
Type: Standards Track
Status: Draft
Version: 1
Created: 2018-11-15
Last-Modified: 2018-11-16
Post-History: 2018-11-17
Content-Type: text/x-rst
---

Abstract
========

This GLEP proposes a new binary package container format for Gentoo. The current tbz2/XPAK format is briefly described, and its deficiencies are explained. Accordingly, the requirements for a new format are set and a gpkg format satisfying them is proposed. The rationale for the design decisions is provided.

Motivation
==========

The current Portage binary package format
-----------------------------------------

The historical ``.tbz2`` binary package format used by Portage is a concatenation of two distinct formats: header-oriented compressed .tar format (used to hold package files) and trailer-oriented custom XPAK format (used to hold metadata) [#MAN-XPAK]_.

The format has already been extended incompatibly twice.

The first time, support for storing multiple successive builds of a binary package for a single ebuild version has been added. This feature relies on appending an additional hyphen followed by an integer to the package filename. It is disabled by default (preserving backwards compatibility) and controlled by the ``binpkg-multi-instance`` feature.

The second time, support for additional compression formats has been added. When a format other than bzip2 is used, the ``.tbz2`` suffix is replaced by ``.xpak`` and Portage relies on magic bytes to detect the compression used. For backwards compatibility, Portage still defaults to using bzip2; the compression program can be switched using the ``BINPKG_COMPRESS`` configuration variable.

Additionally, there have been minor changes to the stored metadata and file storage policies. In particular, behavior regarding ``INSTALL_MASK``, controllable file compression and stripping has changed over time.

The advantages of tbz2/XPAK format
----------------------------------

The tbz2/XPAK format used by Portage has three interesting features:

1. **Each binary package is fully contained within a single file.**
   While this might seem unnecessary, it makes it easier for the user to transfer binary packages without having to be concerned about finding all the necessary files to transfer.

2. **The binary packages are compatible with regular compressed tarballs, most of the time.**
   With notable exceptions of historical versions of pbzip2 and the recent zstd compressor, tbz2/XPAK packages can be extracted using regular tar utility with a compressor implementation that discards trailing garbage.

3. **The metadata is uncompressed, and can be efficiently accessed without decompressing package contents.**
   This includes the possibility of rewriting it (e.g. as a result of package moves) without the necessity of repacking the files.

Transparency problem with the current binary package format
-----------------------------------------------------------

Notwithstanding its advantages, the tbz2/XPAK format has a significant design fault that consists of two issues:

1. **The XPAK format is a custom binary format with explicit use of binary-encoded file offsets and field lengths.**
   As such, it is non-trivial to read or edit without specialized tools. Such tools are currently implemented separately from the package manager, as part of the portage-utils toolkit, written in C [#PORTAGE-UTILS]_.

2. **The tarball compatibility feature relies on the obscure feature of ignoring trailing garbage in compressed files**.
   While this is implemented consistently in most of the compressors, this feature is not really a part of the specification but rather traditional behavior. Given that the original reasons for this no longer apply, new compressor implementations are likely to miss support for this.

Both of the issues make the format hard to use without dedicated tools, or when the tools misbehave. This impacts the following scenarios:

A. **Using binary packages for system recovery.**
   In case of serious breakage, it is really preferable that the format depends on as few tools as possible, and especially not on Gentoo-specific tools.

B. **Inspecting binary packages in detail exceeding stand
Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format [gen...@jonesmz.com]
On 11/18/18 6:51 PM, Rich Freeman wrote: > On Sun, Nov 18, 2018 at 5:40 PM Zac Medico wrote: >> >> On 11/18/18 1:55 PM, Rich Freeman wrote: >>> >>> My idea is to basically have portage generate a tag with all the info >>> needed to identify the "right" package, take a hash of it, and then >>> stick that in the filename. Then when portage is looking for a binary >>> package to use at install time it generates the same tag using the >>> same algorithm and looks for a matching hash. >> >> We've already had this handled for a couple years now, via >> FEATURES=binpkg-multi-instance. > > According to the make.conf manpage this simply numbers builds. So, if > you build something twice with the same config you end up with two > duplicate files (wasteful). Presumably if you had a large collection > of these packages portage would have to read the metadata within each > one to figure out which one is appropriate to install. That would be > expensive if IO is slow, such as when fetching packages online > on-demand. > > But, it obviously is somewhat of an improvement for Roy's use case. > > IMO using a content-hash of certain metadata would eliminate > duplication, and based on filename alone it would be clear whether the > sought-after binary package exists or not. As with the build numbers > you couldn't tell from filename inspection what packages you have, but > if you know what you want you could immediately find it. IMO trying > to cram all that metadata into a filename to make them more > transparent isn't a good idea, and using hashes lets the user set > their own policy regarding flexibility. Heck, you could auto-gen > symlinks for subsets of metadata (ie, the same file could be linked > from a file that specifies its USE flags but not its CFLAGS, so it > would be found if either an exact hit on CFLAGS was sought or if > CFLAGS were considered unimportant). > > But, I'm certainly not suggesting that you're not allowed to go to bed > until you've built it. 
:) The existing ${PKGDIR}/Packages file optimizes metadata access for both local and remote access, and performs well for reasonable numbers of packages. If you insist on mixing binary packages in the same ${PKGDIR} for a large number of alternative configurations, then it will not scale unless you create a way to send your local configuration to the server so that it can select the relevant package list for you. However, bear in mind that mixing alternative configurations in the same ${PKGDIR} might lead to undesirable results if there is anything relevant that is unaccounted for in the package metadata. Possible unaccounted things may include: 1) glibc version the package was built against 2) symbols and/or sonames not accounted for by slot operator dependencies 3) soname dependencies (--usepkgonly + --ignore-soname-deps=n handles this) -- Thanks, Zac
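To make the contrast with opening every package concrete, here is a rough sketch of selecting a binary package from a single index file instead. It assumes, for the purpose of the example, that the index consists of blank-line-separated stanzas of "KEY: value" lines; the sample data, the key names shown, and both function names are hypothetical and are not taken from Portage.

```python
# Hypothetical sample index: one header stanza, then one stanza per package.
SAMPLE_INDEX = """\
ARCH: amd64

CPV: app-misc/foo-1.0
USE: ipv6 ssl
PATH: app-misc/foo/foo-1.0-1.xpak

CPV: app-misc/foo-1.0
USE: ssl
PATH: app-misc/foo/foo-1.0-2.xpak
"""


def parse_index(text):
    """Split the index into a list of dicts, one per stanza."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(": ")
            if sep:  # skip lines that are not 'KEY: value'
                entry[key] = value
        if entry:
            stanzas.append(entry)
    return stanzas


def candidates(stanzas, cpv, wanted_use):
    """Return stanzas for `cpv` whose USE set equals the wanted set."""
    wanted = set(wanted_use)
    return [
        s for s in stanzas
        if s.get("CPV") == cpv and set(s.get("USE", "").split()) == wanted
    ]
```

With many alternative configurations in one directory the index itself grows, which is the scaling concern raised above; a filter like `candidates` would then have to run on the server side, using the configuration sent by the client.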
Re: [gentoo-dev] [pre-GLEP r1] Gentoo binary package container format
On 2018.11.19 18:35, Michał Górny wrote: > Hi, > > On Sat, 2018-11-17 at 12:21 +0100, Michał Górny wrote: > > Here's a pre-GLEP draft based on the earlier discussion on gentoo- > > portage-dev mailing list. The specification uses GLEP form as it > > provides for cleanly specifying the motivation and rationale. > > Changes in -r1: took into account the feedback and restructured > the motivation into pointing out advantages of the existing format, > and focusing on the two real issues of non-transparency and OpenPGP > implementations deficiencies. Also added a section on why there's no > explicit version number. > > > Also available via HTTPS: > > > > rst: https://dev.gentoo.org/~mgorny/tmp/glep-0078.rst > > html: https://dev.gentoo.org/~mgorny/tmp/glep-0078.html > > > [snip] Team, Looks good to me. I can manually unpick the binpackage with tar, choose whether or not to check the signatures, then spray files all over my broken Gentoo with tar in the same way as I do now. Implementation detail question. It appears that all members must be signed, or none of them, since "The archive members support optional OpenPGP signatures. The implementations must allow the user to specify whether OpenPGP signatures are to be expected in remotely fetched packages." Or can the user specify that only some elements need to be signed? Is it a problem if not all elements are signed with the same key? That could happen if one person makes a binpackage and someone else updates the metadata. > -- > Best regards, > Michał Górny > -- Regards, Roy Bamford (Neddyseagoon) a member of elections gentoo-ops forum-mods
Re: [gentoo-dev] [pre-GLEP r1] Gentoo binary package container format
On Mon, Nov 19, 2018 at 2:21 PM Roy Bamford wrote: > > "The archive members support optional OpenPGP signatures. > The implementations must allow the user to specify whether OpenPGP > signatures are to be expected in remotely fetched packages." > > Or can the user specify that only some elements need to be signed? > > Is it a problem if not all elements are signed with the same key? > That could happen if one person makes a binpackage and someone > else updates the metadata. > IMO this is going a bit into PM details for a GLEP that is about container formats. Presumably any package manager is going to need to figure out what keys are/aren't valid and allow the user to configure this behavior. Users who want to go editing package innards will presumably adjust their package manager settings to accept their modifications, whether it means accepting their own sigs or disabling them. -- Rich
Re: [gentoo-dev] [pre-GLEP r1] Gentoo binary package container format
On 11/19/18 11:33 AM, Rich Freeman wrote: > On Mon, Nov 19, 2018 at 2:21 PM Roy Bamford wrote: >> >> "The archive members support optional OpenPGP signatures. >> The implementations must allow the user to specify whether OpenPGP >> signatures are to be expected in remotely fetched packages." >> >> Or can the user specify that only some elements need to be signed? >> >> Is it a problem if not all elements are signed with the same key? >> That could happen if one person makes a binpackage and someone >> else updates the metadata. >> > > IMO this is going a bit into PM details for a GLEP that is about > container formats. > > Presumably any package manager is going to need to figure out what > keys are/aren't valid and allow the user to configure this behavior. > Users who want to go editing package innards will presumably adjust > their package manager settings to accept their modifications, whether > it means accepting their own sigs or disabling them. With the GLEP as it is, the user *must* use a local signing key to sign installed packages during the installation process if they want to be able to verify signatures for installed packages at some point in the future, since the binary package format does not provide a way to use binary package signatures for this purpose. -- Thanks, Zac
Re: [gentoo-dev] [pre-GLEP r1] Gentoo binary package container format
On Mon, Nov 19, 2018 at 2:40 PM Zac Medico wrote: > > On 11/19/18 11:33 AM, Rich Freeman wrote: > > On Mon, Nov 19, 2018 at 2:21 PM Roy Bamford wrote: > >> > >> "The archive members support optional OpenPGP signatures. > >> The implementations must allow the user to specify whether OpenPGP > >> signatures are to be expected in remotely fetched packages." > >> > >> Or can the user specify that only some elements need to be signed? > >> > >> Is it a problem if not all elements are signed with the same key? > >> That could happen if one person makes a binpackage and someone > >> else updates the metadata. > >> > > > > IMO this is going a bit into PM details for a GLEP that is about > > container formats. > > > > Presumably any package manager is going to need to figure out what > > keys are/aren't valid and allow the user to configure this behavior. > > Users who want to go editing package innards will presumably adjust > > their package manager settings to accept their modifications, whether > > it means accepting their own sigs or disabling them. > > With the GLEP as it is, the user *must* use a local signing key to sign > installed packages during the installation process if they want to be > able to verify signatures for installed packages at some point in the > future, since the binary package format does not provide a way to use > binary package signatures for this purpose. I think we might be talking about different signatures? I think you're referring to signatures of the package files after they are installed on the local filesystem, while I'm talking about verifying the integrity of the package files themselves. If these signatures are applied to different data then obviously you couldn't just have the one signature serve double duty (unless you hung onto the binary package, verified the signature on it, then verified the package contents against the live filesystem). 
The simplest solution would be to do as you seem to be suggesting - verify the signature on the package before installing it, and then during installation capture whatever metadata is already supported by portage and sign that using a user's trusted key. This seems like the most practical solution in any case since we aren't likely to ever go down the route of using a single signed squashfs for /usr like a release-based binary distro might. -- Rich
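As a rough illustration of that second step (capture metadata at install time, sign it locally, verify later), the sketch below records a SHA-256 digest per installed file and seals the resulting manifest. Two loud assumptions: the HMAC with a local secret is only a stand-in for signing with the user's OpenPGP key, and the files are passed in as an in-memory mapping rather than read from the live filesystem. All function names are hypothetical.

```python
import hashlib
import hmac
import json


def build_manifest(files):
    """Map each installed path to the SHA-256 of its contents.

    `files` is a mapping of path -> bytes (a stand-in for reading the
    live filesystem).
    """
    return {
        path: hashlib.sha256(data).hexdigest()
        for path, data in sorted(files.items())
    }


def seal(manifest, local_key):
    """Authenticate the manifest with a local secret.

    Stand-in only: a real setup would produce an OpenPGP signature with
    the user's trusted key instead of an HMAC.
    """
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hmac.new(local_key, payload, hashlib.sha256).hexdigest()


def verify_installed(current_files, manifest, seal_hex, local_key):
    """Check the seal first, then compare the files against the manifest."""
    if not hmac.compare_digest(seal(manifest, local_key), seal_hex):
        return False  # manifest was altered or sealed with another key
    return build_manifest(current_files) == manifest
```

The signature on the binary package itself would be verified before installation and plays no part in this later check, which matches the distinction drawn above between the two kinds of signature.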
Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format
On Sun, 18 Nov 2018 12:00:48 +0100 Fabian Groffen wrote: > Your point is that the format is broken (== relies on obscure compressor > feature). My point is that the format simply requires a special tool. > The fact that we prefer to use existing tools doesn't imply in any way > that the format is broken to me. > I think you should rewrite your point to mention that you don't want to > use a tool that doesn't exist in @system (?) to unpack a binpkg. My > guess is that you could use some head/tail magic in a script if the > trailing block is upsetting the decompressor. To the best of my understanding, the existing design poses problems when it comes to adding new features: the dependency on a "special tool" becomes the bottleneck, because the tool has to be adjusted to handle each new feature, potentially introducing serious incompatible changes. The alternative proposal stated in this pre-GLEP seems infinitely more extensible, which means more room for 3rd-parties to add their own features, while retaining basic portage interop. For instance, I think a "nice" feature that could be added one day would be the ability for the automated package builder to bundle: - The ebuild that was used to build it - All the eclasses that were used by the ebuild - All the sources and patches that were used This would create a fat bin/src hybrid, potentially allowing the exact same package to be rebuilt with minor changes, independently of portage repository changes. This may be useful for people who don't want the options set in the binary build, but otherwise want the exact same material in a different configuration. In terms of user-friendliness, this could empower Gentoo in new ways, in ways that compete with existing binary distributions wherein upstreams publish .deb files for people to "just install". 
Presently, the amount of additional hand-holding required (namely: install this overlay, make sure you sync it right, etc, etc, etc) makes it a little too "hands on" for some. Now, I'm not saying Gentoo *should* do exactly this, but I like that this approach gives us the *potential* to do this, and as a result, some downstream derivatives of Gentoo may be motivated to do something like this, providing usable stand-alone bin-packages which interop nicely with standard Gentoo installations, while also working nicely with the downstream's customizations. Achieving this as it is requires downstream to develop their own format, which is likely not going to work with standard Gentoo installs.
Re: [gentoo-dev] [pre-GLEP r1] Gentoo binary package container format
On 2018.11.19 19:33, Rich Freeman wrote: > On Mon, Nov 19, 2018 at 2:21 PM Roy Bamford > wrote: > > > > "The archive members support optional OpenPGP signatures. > > The implementations must allow the user to specify whether OpenPGP > > signatures are to be expected in remotely fetched packages." > > > > Or can the user specify that only some elements need to be signed? > > > > Is it a problem if not all elements are signed with the same key? > > That could happen if one person makes a binpackage and someone > > else updates the metadata. > > > > IMO this is going a bit into PM details for a GLEP that is about > container formats. > Rich, Not really. The GLEP needs to be clear about the signing. Is it every element or none? The GLEP hints that a mix is possible with: "If the implementation needs to manipulate archive members, it must either create a new signature or discard the existing signature." An individual binpackage could start life with all elements signed by the same key. Some element could be updated and the key for the signature of that element changed. Later still, another element can be changed and have its signature dropped. Should some combinations have no practical value, they should not be permitted by the GLEP. > -- > Rich > > > -- Regards, Roy Bamford (Neddyseagoon) a member of elections gentoo-ops forum-mods
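The combinations Roy lists can be enumerated mechanically. The sketch below classifies a package by which members carry a signature and by which key, then applies one possible acceptance policy. The policy shown (when signatures are expected, every member must be signed by a trusted key) is an assumption chosen for the example and is not something the GLEP draft states; the member names in the usage note and all identifiers are hypothetical.

```python
from enum import Enum
from typing import Dict, Optional, Set


class SigState(Enum):
    NONE = "no member is signed"
    PARTIAL = "some members are signed, some are not"
    ALL_SAME_KEY = "every member is signed by the same key"
    ALL_MIXED_KEYS = "every member is signed, by different keys"


def classify(members: Dict[str, Optional[str]]) -> SigState:
    """`members` maps member name -> signing key ID, or None if unsigned."""
    signers = list(members.values())
    signed = [key for key in signers if key is not None]
    if not signed:
        return SigState.NONE
    if len(signed) < len(signers):
        return SigState.PARTIAL
    if len(set(signed)) == 1:
        return SigState.ALL_SAME_KEY
    return SigState.ALL_MIXED_KEYS


def acceptable(members: Dict[str, Optional[str]],
               expect_signatures: bool,
               trusted_keys: Set[str]) -> bool:
    """One possible policy (an assumption, not taken from the GLEP)."""
    if not expect_signatures:
        return True
    state = classify(members)
    if state in (SigState.NONE, SigState.PARTIAL):
        return False
    # Mixed keys are fine as long as each one is individually trusted.
    return all(key in trusted_keys for key in members.values())
```

Under this policy the case Roy describes, where one person builds the package and someone else later re-signs the metadata, is classified as `ALL_MIXED_KEYS` and is accepted only if both keys are trusted, while a package whose updated element had its signature dropped is `PARTIAL` and is rejected whenever signatures are expected.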