Re: [gentoo-dev] Bazel Build eclass
On Sun, 2018-11-18 at 15:37 +0800, Jason Zaman wrote:
> On Sat, Nov 17, 2018 at 11:54:24PM +0100, Michał Górny wrote:
> > On Sun, 2018-11-18 at 03:37 +0800, Jason Zaman wrote:
> > > Hey all,
> > >
> > > I've been using Bazel (https://bazel.build/) to build TensorFlow for a
> > > while now. Here is a bazel.eclass I'd like to commit to make it easier
> > > for packages that use it to build. It's basically bits that I've
> > > refactored out of the TensorFlow ebuild that would be useful to other
> > > packages as well. I have a bump to sci-libs/tensorflow-1.12.0 prepared
> > > that uses this eclass and have tested a full install.
> > >
> > > -- Jason
> > >
> > > # Copyright 1999-2018 Jason Zaman
> > > # Distributed under the terms of the GNU General Public License v2
> > >
> > > # @ECLASS: bazel.eclass
> > > # @MAINTAINER:
> > > # Jason Zaman
> > > # @AUTHOR:
> > > # Jason Zaman
> > > # @BLURB: Utility functions for packages using Bazel Build
> > > # @DESCRIPTION:
> > > # A utility eclass providing functions to run the Bazel Build system.
> > > #
> > > # This eclass does not export any phase functions.
> > >
> > > case "${EAPI:-0}" in
> > >     0|1|2|3|4|5|6)
> > >         die "Unsupported EAPI=${EAPI:-0} (too old) for ${ECLASS}"
> > >         ;;
> > >     7)
> > >         ;;
> > >     *)
> > >         die "Unsupported EAPI=${EAPI} (unknown) for ${ECLASS}"
> > >         ;;
> > > esac
> > >
> > > if [[ ! ${_BAZEL_ECLASS} ]]; then
> > >
> > > inherit multiprocessing toolchain-funcs
> > >
> > > BDEPEND=">=dev-util/bazel-0.19"
> > >
> > > # @FUNCTION: bazel_get_flags
> > > # @DESCRIPTION:
> > > # Obtain and print the bazel flags for target and host *FLAGS.
> > > #
> > > # To add more flags to this, append the flags to the
> > > # appropriate variable before calling this function
> > > bazel_get_flags() {
> > >     local i fs=()
> > >     for i in ${CFLAGS}; do
> > >         fs+=( "--conlyopt=${i}" )
> > >     done
> > >     for i in ${BUILD_CFLAGS}; do
> > >         fs+=( "--host_conlyopt=${i}" )
> > >     done
> > >     for i in ${CXXFLAGS}; do
> > >         fs+=( "--cxxopt=${i}" )
> > >     done
> > >     for i in ${BUILD_CXXFLAGS}; do
> > >         fs+=( "--host_cxxopt=${i}" )
> > >     done
> > >     for i in ${CPPFLAGS}; do
> > >         fs+=( "--conlyopt=${i}" "--cxxopt=${i}" )
> > >     done
> > >     for i in ${BUILD_CPPFLAGS}; do
> > >         fs+=( "--host_conlyopt=${i}" "--host_cxxopt=${i}" )
> > >     done
> > >     for i in ${LDFLAGS}; do
> > >         fs+=( "--linkopt=${i}" )
> > >     done
> > >     for i in ${BUILD_LDFLAGS}; do
> > >         fs+=( "--host_linkopt=${i}" )
> > >     done
> > >     echo "${fs[*]}"
> > > }
> > >
> > > # @FUNCTION: bazel_setup_bazelrc
> > > # @DESCRIPTION:
> > > # Creates the bazelrc with common options that will be passed
> > > # to bazel. This will be called by ebazel automatically so
> > > # does not need to be called from the ebuild.
> > > bazel_setup_bazelrc() {
> > >     if [[ -f "${T}/bazelrc" ]]; then
> > >         return
> > >     fi
> > >
> > >     # F: fopen_wr
> > >     # P: /proc/self/setgroups
> > >     # Even with standalone enabled, the Bazel sandbox binary is run for feature test:
> > >     # https://github.com/bazelbuild/bazel/blob/7b091c1397a82258e26ab5336df6c8dae1d97384/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java#L61
> > >     # https://github.com/bazelbuild/bazel/blob/76555482873ffcf1d32fb40106f89231b37f850a/src/main/tools/linux-sandbox-pid1.cc#L113
> > >     addpredict /proc
> > >
> > >     mkdir -p "${T}/bazel-cache" || die
> > >     mkdir -p "${T}/bazel-distdir" || die
> > >
> > >     cat > "${T}/bazelrc" <<-EOF || die
> > > startup --batch
> >
> > Maybe indent this stuff to make it stand out from ebuild code.
> >
> > >
> > > # dont strip HOME, portage sets a temp per-package dir
> > > build --action_env HOME
> > >
> > > # make bazel respect MAKEOPTS
> > > build --jobs=$(makeopts_jobs)
> > > build --compilation_mode=opt --host_compilation_mode=opt
> > >
> > > # FLAGS
> > > build $(bazel_get_flags)
> > >
> > > # Use standalone strategy to deactivate the bazel sandbox, since it
> > > # conflicts with FEATURES=sandbox.
> > > build --spawn_strategy=standalone --genrule_strategy=standalone
> > > test --spawn_strategy=standalone --genrule_strategy=standalone
> > >
> > > build --strip=never
> > > build --verbose_failures --noshow_loading_progress
> > > test --verbose_test_summary --verbose_failures --noshow_loading_progress
> > >
> > > # make bazel only fetch distfiles from the cache
> > > fetch --repository_cache="${T}/bazel-cache/" --distdir="${T}/bazel-distdir/"
> > > build --repository_cache="${T}/bazel-cache/" --distdir="${T}/bazel-distdir/"
> > >
> > > build --define=PREFIX=${EPREFIX%/}/usr
> > > build --define=LIBDIR=\$(PREFIX)/$(get_libdir)
> > >
> > > EOF
> > >
> > >     tc-is-cross-compiler || \
> > >         echo "build --nodistinct_host
Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format
On 17-11-2018 12:21:40 +0100, Michał Górny wrote: > Problems with the current binary package format > --- > > The following problems were identified with the package format currently > in use: > > 1. **The packages rely on custom binary archive format to store >metadata.** It is entirely Gentoo invented, and requires dedicated >tooling to work with it. In fact, the reference implementation >in Portage does not even include a CLI tool to work with tbz2 >packages; an unofficial implementation is provided as part >of portage-utils toolkit [#PORTAGE-UTILS]_. I think you should rewrite this section to the argument that the metadata is hard to edit, and that there is only one tool to do so (except a python interface from Portage?). On a separate note, I don't think portage-utils can be considered "unofficial", it is a Gentoo official project as far as I am aware. > 2. **The format relies on obscure compressor feature of ignoring >trailing garbage**. While this behavior is traditionally implemented >by many compressors, the original reasons for it have become long >irrelevant and it is not surprising that new compressors do not >support it. In particular, Portage already hit this problem twice: >once when users replaced bzip2 with parallel-capable pbzip2 >implementation [#PBZIP2]_, and the second time when support for zstd >compressor was added [#ZSTD]_. I think this is actually the result of a rather opportunistic implementation. The fault is that we chose to use an extension that suggests the file is a regular compressed tarball. When one detects that a file is xpak padded, it is trivial to feed the decompressor just the relevant part of the datastream. The format itself isn't bad, and doesn't rely on obscure behaviour. > 3. **Placing metadata at the end of file makes partial fetches >complex.** While it is technically possible to obtain package >metadata remotely without fetching the whole package, it usually >requires e.g. 2-3 HTTP requests with rather complex driver. 
For >comparison, if metadata was placed at the beginning of the file, >early-terminated pipeline with a single fetch request would suffice. I think this point needs to be quantified somewhat why it is so important. I may be wrong, but the average binpkg is small, <1MiB, bigger packages are <50MiB. So what is the gain to be saved here? A "few" MiBs for what operation exactly? I say "few" because I know for some users this is actually not just a blib before it's downloaded. So if this is possible to achieve, in what scenarios is this going to be used (and is this often?). > 4. **Extending the format with OpenPGP signatures is non-trivial.** >Depending on the implementation details, it either requires fetching >additional detached signature, breaking backwards compatibility or >introducing more custom logic to reassemble OpenPGP packets. I think one could add an extra key to the xpak that holds a gpg sig or something. Perhaps this point is better phrased as that current binpkgs don't have any validation options defined. > 5. **Metadata is not compressed.** This is not a significant problem, >it is just listed for completeness. > > > Goals for a new container format > > > The following goals have been set for a replacement format: > > 1. **The packages must remain contained in a single file.** As a matter >of user convenience, it should be possible to transfer binary >packages without having to use multiple files, and to install them >from any location. > > 2. **The file format must be entirely based on common file formats, >respecting best practices, with as little customization as necessary >to satisfy the requirements.** In particular, it is unacceptable >to create new binary formats. I take this as your personal opinion. I don't quite get why it is unacceptable to create a new binary format though. In particular when you're looking for efficiency, such format could serve your purposes. 
As long as it's clearly defined, I don't see the problem with a binary format either. Could you add why it is you think binary formats are unacceptable here? > 3. **The file format should provide for partial fetching of binary >packages.** It should be possible to easily fetch and read >the package metadata without having to download the whole package. Like above, what is the use-case here? Why would you want this? I think I'm missing something here. > 4. **The file format must provide support for OpenPGP signatures.** >Preferably, it should use standard OpenPGP message formats. > > 5. **The file format must allow for efficient metadata updates.** >In particular, it should be possible to update the metadata without >having to recompress package files. > > 6. **The file format should account for easy recognition both throug
Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format
On Sun, 2018-11-18 at 10:16 +0100, Fabian Groffen wrote: > On 17-11-2018 12:21:40 +0100, Michał Górny wrote: > > Problems with the current binary package format > > --- > > > > The following problems were identified with the package format currently > > in use: > > > > 1. **The packages rely on custom binary archive format to store > >metadata.** It is entirely Gentoo invented, and requires dedicated > >tooling to work with it. In fact, the reference implementation > >in Portage does not even include a CLI tool to work with tbz2 > >packages; an unofficial implementation is provided as part > >of portage-utils toolkit [#PORTAGE-UTILS]_. > > I think you should rewrite this section to the argument that the > metadata is hard to edit, and that there is only one tool to do so > (except a python interface from Portage?). > On a separate note, I don't think portage-utils can be considered > "unofficial", it is a Gentoo official project as far as I am aware. In this context, Portage is 'official'. Portage-utils is a project that's developed entirely separately from Portage and doesn't use Portage APIs but instead reinvents everything. As such, it is easy for the two to go out of sync. Or for one of them to have bugs that the other one doesn't have (say, with endianness). > > 2. **The format relies on obscure compressor feature of ignoring > >trailing garbage**. While this behavior is traditionally implemented > >by many compressors, the original reasons for it have become long > >irrelevant and it is not surprising that new compressors do not > >support it. In particular, Portage already hit this problem twice: > >once when users replaced bzip2 with parallel-capable pbzip2 > >implementation [#PBZIP2]_, and the second time when support for zstd > >compressor was added [#ZSTD]_. > > I think this is actually the result of a rather opportunistic > implementation. The fault is that we chose to use an extension that > suggests the file is a regular compressed tarball. 
> When one detects that a file is xpak padded, it is trivial to feed the > decompressor just the relevant part of the datastream. The format > itself isn't bad, and doesn't rely on obscure behaviour. Except if you don't have the proper tools installed. In which case the 'opportunistic' behavior made it possible to extract the contents without special tools... except when it actually happens not to work anymore. Roy's reply indicates that there is actually interest in this design feature. > > > 3. **Placing metadata at the end of file makes partial fetches > >complex.** While it is technically possible to obtain package > >metadata remotely without fetching the whole package, it usually > >requires e.g. 2-3 HTTP requests with rather complex driver. For > >comparison, if metadata was placed at the beginning of the file, > >early-terminated pipeline with a single fetch request would suffice. > > I think this point needs to be quantified somewhat why it is so > important. > I may be wrong, but the average binpkg is small, <1MiB, bigger packages > are <50MiB. > So what is the gain to be saved here? A "few" MiBs for what operation > exactly? I say "few" because I know for some users this is actually not > just a blib before it's downloaded. So if this is possible to achieve, > in what scenarios is this going to be used (and is this often?). Last I checked, Gentoo aimed to support more users than the 'majority' of people with high-throughput Internet access. If there's no cost in doing things better, why not do them better? > > > 4. **Extending the format with OpenPGP signatures is non-trivial.** > >Depending on the implementation details, it either requires fetching > >additional detached signature, breaking backwards compatibility or > >introducing more custom logic to reassemble OpenPGP packets. > > I think one could add an extra key to the xpak that holds a gpg sig or > something. 
Perhaps this point is better phrased as that current binpkgs > don't have any validation options defined. ...which extra key would mean that the two disjoint implementations in use would need more custom code that extracts the signature, reconstructs signed data for verification and verifies it. Or, in other words, that user needs even more custom tooling to manually verify the package he just fetched. > > > 5. **Metadata is not compressed.** This is not a significant problem, > >it is just listed for completeness. > > > > > > Goals for a new container format > > > > > > The following goals have been set for a replacement format: > > > > 1. **The packages must remain contained in a single file.** As a matter > >of user convenience, it should be possible to transfer binary > >packages without having to use multiple files, and to install them > >from any location. > > > >
Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format
On 18-11-2018 10:38:51 +0100, Michał Górny wrote: > On Sun, 2018-11-18 at 10:16 +0100, Fabian Groffen wrote: > > On 17-11-2018 12:21:40 +0100, Michał Górny wrote: > > > Problems with the current binary package format > > > --- > > > > > > The following problems were identified with the package format currently > > > in use: > > > > > > 1. **The packages rely on custom binary archive format to store > > >metadata.** It is entirely Gentoo invented, and requires dedicated > > >tooling to work with it. In fact, the reference implementation > > >in Portage does not even include a CLI tool to work with tbz2 > > >packages; an unofficial implementation is provided as part > > >of portage-utils toolkit [#PORTAGE-UTILS]_. > > > > I think you should rewrite this section to the argument that the > > metadata is hard to edit, and that there is only one tool to do so > > (except a python interface from Portage?). > > On a separate note, I don't think portage-utils can be considered > > "unofficial", it is a Gentoo official project as far as I am aware. > > In this context, Portage is 'official'. Portage-utils is a project > that's developed entirely separately from Portage and doesn't use > Portage APIs but instead reinvents everything. As such, it is easy for > the two to go out of sync. Or for one of them to have bugs that > the other one doesn't have (say, with endianness). I'm not sure if it's actually true, I was under the impression the same author(s) worked on the Portage as well as portage-utils code. Anyway, aren't quickpkg and emerge enough from a user's perspective? > > > 2. **The format relies on obscure compressor feature of ignoring > > >trailing garbage**. While this behavior is traditionally implemented > > >by many compressors, the original reasons for it have become long > > >irrelevant and it is not surprising that new compressors do not > > >support it. 
In particular, Portage already hit this problem twice: > > >once when users replaced bzip2 with parallel-capable pbzip2 > > >implementation [#PBZIP2]_, and the second time when support for zstd > > >compressor was added [#ZSTD]_. > > > > I think this is actually the result of a rather opportunistic > > implementation. The fault is that we chose to use an extension that > > suggests the file is a regular compressed tarball. > > When one detects that a file is xpak padded, it is trivial to feed the > > decompressor just the relevant part of the datastream. The format > > itself isn't bad, and doesn't rely on obscure behaviour. > > Except if you don't have the proper tools installed. In which case > the 'opportunistic' behavior made it possible to extract the contents > without special tools... except when it actually happens not to work > anymore. Roy's reply indicates that there is actually interest in this > design feature. Your point is that the format is broken (== relies on obscure compressor feature). My point is that the format simply requires a special tool. The fact that we prefer to use existing tools doesn't imply in any way that the format is broken to me. I think you should rewrite your point to mention that you don't want to use a tool that doesn't exist in @system (?) to unpack a binpkg. My guess is that you could use some head/tail magic in a script if the trailing block is upsetting the decompressor. I'm not saying this may look ugly, I'm just saying that your point seems biased. > > > 3. **Placing metadata at the end of file makes partial fetches > > >complex.** While it is technically possible to obtain package > > >metadata remotely without fetching the whole package, it usually > > >requires e.g. 2-3 HTTP requests with rather complex driver. For > > >comparison, if metadata was placed at the beginning of the file, > > >early-terminated pipeline with a single fetch request would suffice. 
> > > > I think this point needs to be quantified somewhat why it is so > > important. > > I may be wrong, but the average binpkg is small, <1MiB, bigger packages > > are <50MiB. > > So what is the gain to be saved here? A "few" MiBs for what operation > > exactly? I say "few" because I know for some users this is actually not > > just a blib before it's downloaded. So if this is possible to achieve, > > in what scenarios is this going to be used (and is this often?). > > Last I checked, Gentoo aimed to support more users than the 'majority' > of people with high-throughput Internet access. If there's no cost > in doing things better, why not do them better? You didn't address the critical question, but instead just repeated what I said. So again, why do you need to read just the metadata? > > > 4. **Extending the format with OpenPGP signatures is non-trivial.** > > >Depending on the implementation details, it either requires fetching > > >additional detached signature, breaking backwards compatibi
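For readers following point 3 of the quoted draft, the "2-3 HTTP requests" can be sketched as below. This is only an illustration under stated assumptions: the trailer layout is the one described in Portage's xpak module as far as I understand it (the xpak segment ends in "XPAKSTOP", followed by a 4-byte big-endian segment length and "STOP"), and the server is assumed to honour suffix byte ranges such as "Range: bytes=-16". The names fetch_suffix, fetch_xpak and make_fake_tbz2 are hypothetical and are not Portage APIs.

```python
import struct
from typing import Callable

# Hypothetical callback: returns the last `n` bytes of the remote file,
# for example by issuing an HTTP request with "Range: bytes=-n".
FetchSuffix = Callable[[int], bytes]

TRAILER_LEN = 16  # "XPAKSTOP" + 4-byte big-endian xpak length + "STOP"


def fetch_xpak(fetch_suffix: FetchSuffix) -> bytes:
    """Return the raw xpak segment using two suffix-range fetches."""
    # Request 1: the fixed-size trailer says how long the xpak segment is.
    trailer = fetch_suffix(TRAILER_LEN)
    if (len(trailer) != TRAILER_LEN
            or trailer[:8] != b"XPAKSTOP"
            or trailer[12:] != b"STOP"):
        raise ValueError("no xpak trailer found")
    (xpak_len,) = struct.unpack(">I", trailer[8:12])

    # Request 2: the xpak segment plus the 8 bytes (length + "STOP") after it.
    tail = fetch_suffix(xpak_len + 8)
    xpak = tail[:xpak_len]
    if not xpak.startswith(b"XPAKPACK"):
        raise ValueError("xpak segment does not start with XPAKPACK")
    return xpak


def make_fake_tbz2(payload: bytes, index: bytes, data: bytes) -> bytes:
    """Build an in-memory stand-in for an xpak-padded package (demo only)."""
    xpak = (b"XPAKPACK" + struct.pack(">II", len(index), len(data))
            + index + data + b"XPAKSTOP")
    return payload + xpak + struct.pack(">I", len(xpak)) + b"STOP"
```

With metadata at the front of the file, the same information could be read from a single request that is terminated early, which is the comparison the draft makes; the sketch does not try to quantify the saving.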
Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format
On 2018.11.18 09:38, Michał Górny wrote:
> On Sun, 2018-11-18 at 10:16 +0100, Fabian Groffen wrote:
> > On 17-11-2018 12:21:40 +0100, Michał Górny wrote:
> > > Problems with the current binary package format
[snip]
> > > 2. **The format relies on obscure compressor feature of ignoring
> > >    trailing garbage**. While this behavior is traditionally implemented
> > >    by many compressors, the original reasons for it have become long
> > >    irrelevant and it is not surprising that new compressors do not
> > >    support it. In particular, Portage already hit this problem twice:
> > >    once when users replaced bzip2 with parallel-capable pbzip2
> > >    implementation [#PBZIP2]_, and the second time when support for zstd
> > >    compressor was added [#ZSTD]_.
> >
> > I think this is actually the result of a rather opportunistic
> > implementation. The fault is that we chose to use an extension that
> > suggests the file is a regular compressed tarball.
> > When one detects that a file is xpak padded, it is trivial to feed the
> > decompressor just the relevant part of the datastream. The format
> > itself isn't bad, and doesn't rely on obscure behaviour.
>
> Except if you don't have the proper tools installed. In which case
> the 'opportunistic' behavior made it possible to extract the contents
> without special tools... except when it actually happens not to work
> anymore. Roy's reply indicates that there is actually interest in this
> design feature.
>
[snip]

Team,

I used to post something like https://wiki.gentoo.org/wiki/Fix_My_Gentoo
with a link to Patrick's binhost on the forums every three or four months.
It made it worth writing that wiki page anyway.

We still get users removing elements of their toolchain or glibc from time
to time. The requirement that I didn't express very well is that it shall
be possible to install binary packages without the use of any Gentoo
specific tooling.

The current tarball of tarballs proposal would satisfy that requirement.

It's unlikely that a custom binary format would. Of course, this being
Gentoo, someone would write a run-anywhere script that did the
unpicking. We already have deb2targz and rpm2targz. We have the
opportunity to design out binpkg2targz before it exists.

--
Regards,

Roy Bamford
(Neddyseagoon) a member of
elections
gentoo-ops
forum-mods
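As an illustration of the "unpicking" discussed above, and of Fabian's remark that one can feed the decompressor just the relevant part of the stream, here is a minimal sketch that splits an xpak-padded package into its tarball and metadata using nothing but the Python standard library. It assumes the layout described in Portage's xpak module as I understand it: tarball, then "XPAKPACK" + index length + data length + index + data + "XPAKSTOP", then a 4-byte big-endian segment length and "STOP"; each index entry is name length, name, data offset, data length. The function names split_tbz2 and parse_xpak are made up for this example.

```python
import struct
from typing import Dict, Tuple


def split_tbz2(blob: bytes) -> Tuple[bytes, bytes]:
    """Split an xpak-padded package into (compressed tarball, xpak segment)."""
    if len(blob) < 16 or blob[-4:] != b"STOP" or blob[-16:-8] != b"XPAKSTOP":
        raise ValueError("not an xpak-padded archive")
    (xpak_len,) = struct.unpack(">I", blob[-8:-4])
    xpak_start = len(blob) - 8 - xpak_len
    if xpak_start < 0 or blob[xpak_start:xpak_start + 8] != b"XPAKPACK":
        raise ValueError("xpak length does not point at XPAKPACK")
    # Everything before the xpak segment is the plain compressed tarball.
    return blob[:xpak_start], blob[xpak_start:len(blob) - 8]


def parse_xpak(xpak: bytes) -> Dict[str, bytes]:
    """Decode an xpak segment into a {name: value} mapping."""
    index_len, data_len = struct.unpack(">II", xpak[8:16])
    index = xpak[16:16 + index_len]
    data = xpak[16 + index_len:16 + index_len + data_len]
    entries: Dict[str, bytes] = {}
    pos = 0
    while pos < index_len:
        (name_len,) = struct.unpack(">I", index[pos:pos + 4])
        pos += 4
        name = index[pos:pos + name_len].decode("utf-8")
        pos += name_len
        offset, length = struct.unpack(">II", index[pos:pos + 8])
        pos += 8
        entries[name] = data[offset:offset + length]
    return entries
```

The first element returned by split_tbz2 can be handed to any decompressor, including ones that reject trailing data; whether a script like this counts as "Gentoo specific tooling" is exactly the point under debate.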
Fwd: Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format [gen...@jonesmz.com]
See attached. Replying off list because I am not on the whitelist ... -- Regards, Roy Bamford (Neddyseagoon) a member of elections gentoo-ops forum-mods --- Begin Message --- On Sun, Nov 18, 2018 at 5:04 AM Roy Bamford wrote: > On 2018.11.18 09:38, Michał Górny wrote: > > On Sun, 2018-11-18 at 10:16 +0100, Fabian Groffen wrote: > > > On 17-11-2018 12:21:40 +0100, Michał Górny wrote: > > > > Problems with the current binary package format > > [snip] > > > > > 2. **The format relies on obscure compressor feature of ignoring > > > >trailing garbage**. While this behavior is traditionally > > implemented > > > >by many compressors, the original reasons for it have become > > long > > > >irrelevant and it is not surprising that new compressors do not > > > >support it. In particular, Portage already hit this problem > > twice: > > > >once when users replaced bzip2 with parallel-capable pbzip2 > > > >implementation [#PBZIP2]_, and the second time when support for > > zstd > > > >compressor was added [#ZSTD]_. > > > > > > I think this is actually the result of a rather opportunistic > > > implementation. The fault is that we chose to use an extension that > > > suggests the file is a regular compressed tarball. > > > When one detects that a file is xpak padded, it is trivial to feed > > the > > > decompressor just the relevant part of the datastream. The format > > > itself isn't bad, and doesn't rely on obscure behaviour. > > > > Except if you don't have the proper tools installed. In which case > > the 'opportunistic' behavior made it possible to extract the contents > > without special tools... except when it actually happens not to work > > anymore. Roy's reply indicates that there is actually interest in > > this > > design feature. > > > [snip] > > Team, > > I use to post something like https://wiki.gentoo.org/wiki/Fix_My_Gentoo > with a link to Patricks binhost on the forums every three or four months. > It made it worth writing that wiki page anyway. 
>
> We still get users removing elements of their toolchain or glibc from time
> to time. The requirement that I didn't express very well is that it
> shall be possible to install binary packages without the use of any Gentoo
> specific tooling.
>
> The current tarball of tarballs proposal would satisfy that requirement.
>
> It's unlikely that a custom binary format would. Of course, this being
> Gentoo, someone would write a run-anywhere script that did the
> unpicking. We already have deb2targz and rpm2targz. We have the
> opportunity to design out binpkg2targz before it exists.
>
> --
> Regards,
>
> Roy Bamford
> (Neddyseagoon) a member of
> elections
> gentoo-ops
> forum-mods
>

Replying off list because I am not on the whitelist.

Please also consider my use case:

I have a cluster file system, cephfs, which all of my Gentoo machines
mount for access to various shared file resources. I want to have all of
them mount a cephfs path to the folder in which Portage is configured to
look for binary packages.

This works great if all of the machines have identical Portage
configurations, but breaks down as soon as one machine uses a different
USE flag. The reason for this is that the package file names do not encode
anything other than the package name and version number. So if a binpkg
already exists in my binpkg repository, and another machine builds with
different USE flags, the binpkg gets overwritten, potentially while a
third machine is reading the binpkg file. The filename also does not
represent compile-time dependencies, or any number of other possible
points of differentiation.

This issue could be (at least partially) solved in at least 3 ways:

1) Append a UUID to each filename, generated when the binary package file
   is generated.
2) Encode the hostname of the machine that generated the file.
3) Encode the USE flags in the filename.

Perhaps a fuller solution is to respect an environment variable
"BINARY_PKG_FILENAME_FORMAT" that accepts a series of variable
substitutions to append after the package name and version number? This
variable would be used only when generating the binary package. Portage
would still use any binary package that it found that matched its needs,
regardless of suffix.

Thanks for your time.

--- End Message ---
Re: Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format [gen...@jonesmz.com]
On Sun, Nov 18, 2018 at 4:10 PM Roy Bamford wrote:
>
> Replying off list because I am not on the whitelist.

That seems odd.

> 1) append a uuid to each filename. Generated when the bin package file is
> generated.
> 2) encode the hostname of the machine that generated the file
> 3) encode the use flags in the filename.

So, I brought up this same issue in the earlier discussion and it was
considered out of scope, and I think this is fair. The GLEP does not
specify the filename, and IMO the standard for what goes INSIDE the file
will work just fine with any future enhancements that address exactly
this use case.

Besides your case of building for a cluster, another use case is
having a central binary repo that portage could check and utilize when
a user's preferences happen to match what is pre-built.

I suggest we start a different thread for any additional discussion of
this use case. I was thinking about it, and it probably wouldn't be
super hard to actually start building something like this. But I don't
want to derail this GLEP, as I don't see any reason designing something
like this needs to hold up the binary package format. Both the existing
and proposed binary package formats will encode any metadata needed by
the package manager inside the file, and the only extension we need is
to encode identifying info in the filename.

My idea is to basically have portage generate a tag with all the info
needed to identify the "right" package, take a hash of it, and then
stick that in the filename. Then when portage is looking for a binary
package to use at install time it generates the same tag using the
same algorithm and looks for a matching hash. If a hit is found then
it reads the complete metadata in the file and applies all the sanity
checks it already does. Generating binary packages with the hash
could be made optional, and portage could also be configured to first
look for the matching hash, then fall back to the existing naming
convention, so that it would be compatible with existing generic
names. So, users would get a choice as to whether they want to build
up a library of these packages, or just have each build overwrite the
last.

Then the next step would be to allow these files to be fetched from a
binary repo optionally, and then finally we'd need tools to create the
repo. But this step isn't needed for your use case. With the proper
optional switches you could utilize as much of this scheme as you
like.

Also, you could optionally choose how much you want portage to encode
in the tag and look for. Are you very fussy and only want a binary
package with matching CFLAGS/USE/whatever? Or is just matching
USE/arch/etc enough? Some of the existing portage options could
potentially be re-used here.

Please make any replies in a new thread.

--
Rich
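The tag-and-hash idea above could look roughly like the following. This is a sketch only, not anything Portage implements: the choice of SHA-256, the 16-character truncation, the ".tbz2" suffix, the key names, and the functions metadata_tag, tag_hash, candidate_filenames and find_binpkg are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, Iterable, List, Optional, Set

# Keys whose values are treated as unordered token sets (assumption).
UNORDERED_KEYS = {"USE"}


def metadata_tag(metadata: Dict[str, str], keys: Iterable[str]) -> str:
    """Build a canonical text tag from the selected metadata keys."""
    lines = []
    for key in sorted(keys):
        tokens = metadata.get(key, "").split()
        if key in UNORDERED_KEYS:
            tokens = sorted(tokens)
        lines.append(f"{key}={' '.join(tokens)}")
    return "\n".join(lines)


def tag_hash(tag: str, length: int = 16) -> str:
    """Hash the tag and keep a short hex prefix for use in a filename."""
    return hashlib.sha256(tag.encode("utf-8")).hexdigest()[:length]


def candidate_filenames(pf: str, metadata: Dict[str, str],
                        keys: Iterable[str], ext: str = ".tbz2") -> List[str]:
    """Hashed name first, then the existing generic name as a fallback."""
    digest = tag_hash(metadata_tag(metadata, keys))
    return [f"{pf}-{digest}{ext}", f"{pf}{ext}"]


def find_binpkg(available: Set[str], pf: str, metadata: Dict[str, str],
                keys: Iterable[str]) -> Optional[str]:
    """Return the first candidate present in `available`, or None."""
    for name in candidate_filenames(pf, metadata, keys):
        if name in available:
            return name
    return None
```

Passing a shorter list of keys makes the match less strict, which is one way the "how fussy are you" policy could be expressed; a hit would still need the full metadata check described above before the package is used.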
Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format [gen...@jonesmz.com]
On 11/18/18 1:55 PM, Rich Freeman wrote:
> On Sun, Nov 18, 2018 at 4:10 PM Roy Bamford wrote:
>>
>> Replying off list because I am not on the whitelist.
>
> That seems odd.
>
>> 1) append a uuid to each filename. Generated when the bin package file is
>> generated.
>> 2) encode the hostname of the machine that generated the file
>> 3) encode the use flags in the filename.
>
> So, I brought up this same issue in the earlier discussion and it was
> considered out of scope, and I think this is fair. The GLEP does not
> specify filename, and IMO the standard for what goes INSIDE the file
> will work just fine with any future enhancements that address exactly
> this use case.
>
> Besides your case of building for a cluster, another use case is
> having a central binary repo that portage could check and utilize when
> a user's preferences happen to match what is pre-built.
>
> I suggest we start a different thread for any additional discussion of
> this use case. I was thinking and it probably wouldn't be super-hard
> to actually start building something like this. But, I don't want to
> derail this GLEP as I don't see any reason designing something like
> this needs to hold up the binary package format. Both the existing
> and proposed binary package formats will encode any metadata needed by
> the package manager inside the file, and the only extension we need is
> to encode identifying info in the filename.
>
> My idea is to basically have portage generate a tag with all the info
> needed to identify the "right" package, take a hash of it, and then
> stick that in the filename. Then when portage is looking for a binary
> package to use at install time it generates the same tag using the
> same algorithm and looks for a matching hash. If a hit is found then
> it reads the complete metadata in the file and applies all the sanity
> checks it already does. Generating of binary packages with the hash
> could be made optional, and portage could also be configured to first
> look for the matching hash, then fall back to the existing naming
> convention, so that it would be compatible with existing generic
> names. So, users would get a choice as to whether they want to build
> up a library of these packages, or just have each build overwrite the
> last.
>
> Then the next step would be to allow these files to be fetched from a
> binary repo optionally, and then finally we'd need tools to create the
> repo. But, this step isn't needed for your use case. With the proper
> optional switches you could utilize as much of this scheme as you
> like.
>
> Also, you could optionally choose how much you want portage to encode
> in the tag and look for. Are you very fussy and only want a binary
> package with matching CFLAGS/USE/whatever? Or is just matching
> USE/arch/etc enough? Some of the existing portage options could
> potentially be re-used here.

We've already had this handled for a couple years now, via
FEATURES=binpkg-multi-instance.

--
Thanks,
Zac
[gentoo-dev] Automated Package Removal and Addition Tracker, for the week ending 2018-11-18 23:59 UTC
The attached list notes all of the packages that were added or removed
from the tree, for the week ending 2018-11-18 23:59 UTC.

Removals:
dev-python/django-extensions    20181115-00:59  vdupras     2e65e4efa6a
dev-python/fexpect              20181115-01:00  vdupras     d3d276cd8a1
dev-python/shortuuid            20181115-00:59  vdupras     c45b1ed2513
media-libs/libomxil-bellagio    20181115-16:31  mattst88    923714fbaa0

Additions:
app-office/moneydance           20181101-23:02  monsieurp   e766b9a94f3
dev-python/filelock             20181118-19:36  mgorny      8227eaa7218
dev-util/netsurf-buildsystem    20181113-16:09  vdupras     05d5e712566
net-misc/dmr_utils              20181116-18:40  zerochaos   9c685ea2d0a
net-misc/hblink                 20181116-18:41  zerochaos   e34020154b6

--
Robin Hugh Johnson
Gentoo Linux Developer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85

Removed Packages:
media-libs/libomxil-bellagio,removed,mattst88,20181115-16:31,923714fbaa0
dev-python/fexpect,removed,vdupras,20181115-01:00,d3d276cd8a1
dev-python/shortuuid,removed,vdupras,20181115-00:59,c45b1ed2513
dev-python/django-extensions,removed,vdupras,20181115-00:59,2e65e4efa6a

Added Packages:
dev-python/filelock,added,mgorny,20181118-19:36,8227eaa7218
app-office/moneydance,added,monsieurp,20181101-23:02,e766b9a94f3
net-misc/hblink,added,zerochaos,20181116-18:41,e34020154b6
net-misc/dmr_utils,added,zerochaos,20181116-18:40,9c685ea2d0a
dev-util/netsurf-buildsystem,added,vdupras,20181113-16:09,05d5e712566

Done.
Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format [gen...@jonesmz.com]
On Sun, Nov 18, 2018 at 5:40 PM Zac Medico wrote: > > On 11/18/18 1:55 PM, Rich Freeman wrote: > > > > My idea is to basically have portage generate a tag with all the info > > needed to identify the "right" package, take a hash of it, and then > > stick that in the filename. Then when portage is looking for a binary > > package to use at install time it generates the same tag using the > > same algorithm and looks for a matching hash. > > We've already had this handled for a couple years now, via > FEATURES=binpkg-multi-instance. According to the make.conf manpage this simply numbers builds. So, if you build something twice with the same config you end up with two duplicate files (wasteful). Presumably if you had a large collection of these packages portage would have to read the metadata within each one to figure out which one is appropriate to install. That would be expensive if IO is slow, such as when fetching packages online on-demand. But, it obviously is somewhat of an improvement for Roy's use case. IMO using a content-hash of certain metadata would eliminate duplication, and based on filename alone it would be clear whether the sought-after binary package exists or not. As with the build numbers you couldn't tell from filename inspection what packages you have, but if you know what you want you could immediately find it. IMO trying to cram all that metadata into a filename to make them more transparent isn't a good idea, and using hashes lets the user set their own policy regarding flexibility. Heck, you could auto-gen symlinks for subsets of metadata (ie, the same file could be linked from a file that specifies its USE flags but not its CFLAGS, so it would be found if either an exact hit on CFLAGS was sought or if CFLAGS were considered unimportant). But, I'm certainly not suggesting that you're not allowed to go to bed until you've built it. :) -- Rich
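The symlink variant mentioned at the end could be sketched as follows. Again this is purely illustrative and assumes a POSIX filesystem: the strictest key set names the real file and each looser key set gets a symlink pointing at it. The functions subset_hash and link_aliases, the hash length, and the ".tbz2" suffix are hypothetical choices, not existing Portage behaviour.

```python
import hashlib
import os
from typing import Dict, Iterable, List


def subset_hash(metadata: Dict[str, str], keys: Iterable[str]) -> str:
    """Short content hash over the selected metadata keys only."""
    canon = "\n".join(f"{k}={metadata.get(k, '')}" for k in sorted(keys))
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()[:16]


def link_aliases(pkgdir: str, pf: str, metadata: Dict[str, str],
                 policies: List[List[str]]) -> List[str]:
    """Create symlinks for looser key sets; return all names, real file first.

    policies[0] is the strictest key set and names the real package file;
    every later entry yields an alias that points at it.
    """
    real = f"{pf}-{subset_hash(metadata, policies[0])}.tbz2"
    names = [real]
    for keys in policies[1:]:
        alias = f"{pf}-{subset_hash(metadata, keys)}.tbz2"
        path = os.path.join(pkgdir, alias)
        # Keep an existing alias: the first build for a looser key set wins.
        if not os.path.lexists(path):
            os.symlink(real, path)
        names.append(alias)
    return names
```

Two builds that share USE but differ in CFLAGS would map to the same USE-only alias; here the first one keeps it, but which build should win is a policy question the sketch does not settle.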