Re: Including build metadata in packages

2022-02-16 Thread Simon McVittie
On Sun, 13 Feb 2022 at 14:13:10 -0800, Vagrant Cascadian wrote:
> Obviously, this would interfere with any meaningful reproducible builds
> testing for any package that did something like this. Ideally metadata
> like this about a build should *not* be included in the .deb files
> themselves.

Relatedly, I would like to be able to capture some information about
builds even if (perhaps especially if) the build fails. It might make
sense to combine that with what you're looking at. It doesn't seem
ideal that for a successful build, the maintainer can recover detailed
results via a .deb (at a significant reproducibility cost), but for a
failing build - perhaps one that fails as a result of test regressions
- they get no information other than what's in the log. If anything,
these artifacts seem *more* important for failing builds.

Some prior art for this:

In any autopkgtest test-case, in addition to the machine-readable result
(exit status, and depending on test flags, maybe also whether stderr is 0
bytes or >= 1 byte), and the generic human-readable log (stdout/stderr),
tests can drop arbitrary files into $AUTOPKGTEST_ARTIFACTS and they will
be saved in a directory or tarball by the test infrastructure.
ci.debian.net keeps test artifacts for a while and then discards them
(they are not kept indefinitely).
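
As a rough illustration of the autopkgtest mechanism described above, a test
written in Python could save a file like this. This is only a sketch: the
helper name save_artifact and the temporary-directory fallback are
assumptions made for the example, and only the AUTOPKGTEST_ARTIFACTS
environment variable comes from autopkgtest itself.

```python
import os
import shutil
import tempfile
from pathlib import Path


def save_artifact(source, artifacts_dir=None):
    """Copy one file into the artifacts directory and return the new path."""
    # autopkgtest exports AUTOPKGTEST_ARTIFACTS to the test; the temporary
    # directory fallback is an assumption so the sketch runs elsewhere too.
    target = Path(
        artifacts_dir
        or os.environ.get("AUTOPKGTEST_ARTIFACTS")
        or tempfile.mkdtemp(prefix="artifacts-")
    )
    target.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, target / Path(source).name))
```

A test would call save_artifact("reftest-actual.png") after a failure and
leave the rest to the test infrastructure.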

Lots of upstream build systems output detailed test results in some
way: for example Autotools test-suite.log, *.log and *.trs, Meson
meson-logs/**, and various packages like librsvg and gtk4 that drop logs,
images etc. into somewhere under the build directory and/or /tmp for later
analysis if a test fails. At the moment, anything written to these places
and not recorded in the build's stdout/stderr just gets thrown away.

In at least librsvg and gtk4, there is Debian-specific code to grab the
results of failing "reftests" (drawing the same thing in two different
ways that should end up equivalent, and comparing the resulting PNG
images for equality), uuencode them and output them into the log for later
inspection: this is particularly important when a maintainer is assessing
whether a reftest result is "close enough" (e.g. font hinting is off
by a few pixels due to different rounding) or unacceptable (e.g. text
is unreadable or in the wrong place). Getting this out via uuencode is
practically annoying, but it's better than nothing...
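
To show why this is annoying in practice, here is a minimal sketch of the
round trip: encoding a binary file into log lines and later scraping it back
out of the log text. It is not the code used in the librsvg or gtk4
packaging; the function names are hypothetical, and it assumes the log lines
carry no extra prefix.

```python
import binascii


def uuencode_lines(name, data, mode=0o644):
    """Return the uuencoded form of data as a list of text lines."""
    lines = [f"begin {mode:o} {name}"]
    for start in range(0, len(data), 45):  # uuencode packs 45 bytes per line
        chunk = binascii.b2a_uu(data[start:start + 45], backtick=True)
        lines.append(chunk.decode("ascii").rstrip("\n"))
    lines += ["`", "end"]  # zero-length line, then the terminator
    return lines


def uudecode_from_log(log_text, name):
    """Find the 'begin <mode> <name>' block in a log and decode it."""
    data = bytearray()
    inside = False
    for line in log_text.splitlines():
        if not inside:
            parts = line.split(" ", 2)
            inside = len(parts) == 3 and parts[0] == "begin" and parts[2] == name
        elif line == "end":
            return bytes(data)
        else:
            data += binascii.a2b_uu(line.encode("ascii"))
    raise ValueError(f"no complete uuencoded block for {name!r}")
```

Anything that rewraps or prefixes the log lines breaks the decoding step,
which is part of why a proper artifact mechanism would be preferable.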

In Gitlab-CI, there's a simple, declarative way to ask Gitlab to save
certain files from the CI job's container and store them in a zip file
for later inspection. For example, for a Meson build this could look like:

artifacts:
  when: always
  paths:
    - _build/meson-logs
    - _build/tests/reftests/*.expected.png
    - _build/tests/reftests/*.actual.png

> * output plaintext data to the build log
> 
> Some of these log files are large (>13MB? per architecture, per package
> build) and would greatly benefit from compression...
> 
> How large is too large for this approach to work?
> 
> Relatively simple to implement (at least for plain text logs), but
> potentially stores a lot of data on the buildd infrastructure...

This has the advantage that it can work equally well for failing and
successful builds, and doesn't need any special support in either the
buildd infrastructure or dak.

For packages like gtk4 and librsvg that are quite visual, it would be
very useful to be able to record images (that is, potentially quite
large binary files) and not just text: uuencoding them is a workaround,
but screen-scraping the logs to get the uuencoded binary PNGs out is
not a great start to a debugging session. So far, I've been lucky and
all the failing reftests have had relatively small output...

> * Selectively filter out known unreproducible files
> 
> This adds complexity to the process of verification; you can't beat the
> simplicity of comparing checksums on two .deb files.
> 
> With increased complexity comes increased opportunity for errors, as
> well as maintenance overhead.
> 
> RPM packages, for example, embed signatures in the packages, and these
> need to be excluded for comparison.
> 
> I vaguely recall at least one case in the past where attempting something
> like this resulted in packages incorrectly being reported as
> reproducible when the filter was overly broad...
> 
> Some nasty corner cases probably lurk down this approach...

A significant disadvantage of this approach is that it will only work
for successful builds: you can't use it to record more information about
a FTBFS caused by build-time test failures, unless you are willing to
let packages with build-time test failures into the archive (at which
point people will start using them or build-depending on them, which
we don't really want for packages that have failed the QA checks that
were meant to stop them from being shipped if they're broken/unusable,
either on a particular architecture or in general).
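
For reference, the filtered comparison discussed in the quoted text can be
sketched roughly as follows. This is an illustration only, not how any
existing reproducibility tool works: each build is modelled as a
hypothetical mapping from member path to file content, and the ignore
patterns are plain fnmatch globs.

```python
import hashlib
from fnmatch import fnmatch


def differing_paths(build_a, build_b, ignore=()):
    """Return sorted paths whose content differs between two builds.

    build_a and build_b map member paths to bytes; paths matching any
    glob in ignore are skipped, as a filtered comparison would do.
    """
    def digest(tree, path):
        data = tree.get(path)
        return None if data is None else hashlib.sha256(data).hexdigest()

    paths = set(build_a) | set(build_b)
    return sorted(
        p for p in paths
        if not any(fnmatch(p, pattern) for pattern in ignore)
        and digest(build_a, p) != digest(build_b, p)
    )
```

With a narrow pattern such as "usr/share/doc/*/buildlog.gz" a real
difference elsewhere is still reported, but a broad pattern such as "usr/*"
hides everything, which is the failure mode the quoted text warns about.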

I personally don't like this: as you say, it's difficult to beat the

Bug#1005875: ITP: python-headerparser -- Python module to parse key-value pairs in the style of RFC 822 headers

2022-02-16 Thread Stephan Lachnit
Package: wnpp
Severity: wishlist
Owner: Stephan Lachnit 
X-Debbugs-Cc: debian-devel@lists.debian.org, stephanlach...@debian.org

* Package name: python-headerparser
  Version : 0.4.0
  Upstream Author : John T. Wodder II
* URL : https://github.com/jwodder/headerparser
* License : MIT
  Programming Lang: Python
  Description : Python module to parse key-value pairs in the style of RFC
822 headers


I intend to maintain this in the Debian Python Team. I will use this library
for my ongoing work to convert SPDX documents to DEP5 documents [1]. The reason
I won't use python-debian is that it is apparently a bit buggy on non-Debian
systems.

Regards,
Stephan

[1] https://lists.debian.org/debian-devel/2022/02/msg00207.html


The long description from the readme:

headerparser parses key-value pairs in the style of RFC 822 (e-mail) headers
and converts them into case-insensitive dictionaries with the trailing message
body (if any) attached. Fields can be converted to other types, marked
required, or given default values using an API based on the standard library’s
argparse module. (Everyone loves argparse, right?) Low-level functions for just
scanning header fields (breaking them into sequences of key-value pairs without
any further processing) are also included.

RFC 822-style headers are header fields that follow the general format of
e-mail headers as specified by RFC 822 and friends: each field is a line of the
form “Name: Value”, with long values continued onto multiple lines (“folded”)
by indenting the extra lines. A blank line marks the end of the header section
and the beginning of the message body.

This basic grammar has been used by numerous textual formats besides e-mail,
including but not limited to:

 * HTTP request & response headers
 * Usenet messages
 * most Python packaging metadata files
 * Debian packaging control files
 * META-INF/MANIFEST.MF files in Java JARs
 * a subset of the YAML serialization format

- all of which this package can parse.
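
To illustrate the grammar described above (this does not show
headerparser's own API), here is a minimal sketch that uses the standard
library's email package instead. The function name parse_rfc822 is
hypothetical; it lower-cases field names to mimic case-insensitive lookup
and unfolds continued lines.

```python
from email import message_from_string


def parse_rfc822(text):
    """Return (fields, body): lower-cased field names, unfolded values."""
    message = message_from_string(text)
    fields = {
        # A folded value keeps its line breaks, so join the pieces again.
        name.lower(): " ".join(part.strip() for part in value.splitlines())
        for name, value in message.items()
    }
    # Everything after the first blank line is the message body.
    return fields, message.get_payload()
```

Unlike headerparser, this sketch does no type conversion, has no notion of
required fields or defaults, and keeps only the last value of a repeated
field.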


Bug#1005877: ITP: jangouts -- videoconferencing web service using Janus Gateway

2022-02-16 Thread Jonas Smedegaard
Package: wnpp
Severity: wishlist
Owner: Jonas Smedegaard 
X-Debbugs-Cc: debian-devel@lists.debian.org, Debian VoIP Team 


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

* Package name: jangouts
  Version : 0.5.0
  Upstream Author : SUSE Linux
* URL : https://github.com/jangouts/jangouts
* License : Expat
  Programming Lang: JavaScript
  Description : videoconferencing web service using Janus Gateway

 Jangouts (for "Janus Hangouts") is a solution
 for videoconferencing based on WebRTC
 and the excellent Janus Gateway,
 with a user interface loosely inspired by Google Hangouts.
 It aims to provide a completely self-hosted open source alternative
 to Google Hangouts and similar solutions.
 Currently Jangouts supports conferences with video, audio,
 screen sharing and textual chat
 organized into an unlimited amount of conference rooms
 with a configurable limit of participants per room.
 .
 Janus is a general purpose WebRTC server/gateway
 with a minimal footprint.

Package will be maintained by the VoIP team using Salsa
at .

 - Jonas

-BEGIN PGP SIGNATURE-

iQIzBAEBCgAdFiEEn+Ppw2aRpp/1PMaELHwxRsGgASEFAmINDjYACgkQLHwxRsGg
ASEWTQ/9GQIJjsSlilz6ORTy0e8TMLSqz1b0D8ZWBpmWlGdMnF8MSIbld6ugcn8Y
AjMUlSwqzKpVIqc+Iz3xTe7vPUaqp4+/UBz+SHtuh7Wau2TL6fAv51hhjEcjrbWi
8DgNPlURTdzSKp+s60u8e8xDwimW/oekSImCEWf5efKQFkrQx/FAAY4NNfNvKGsR
vdL89K97Ry2L4Y93g/crXdCzjuIbHrQBRi2dRmKgpbR5PlQCXiOPLcOvcQSZbuvA
2dBPlOrOnmTxJAArYKOYXjeYpHG9moylCOUDETWBVfd+u8X8gB51DrYYkJp++5QE
uvLf3+GrFgBV/rX4aCs/9iZdT6c/zF7attvQ7WHke+5SURt5izSpMaqoaGCfTysJ
JMDwas1xBhC4Hx9JuEJ1HccSm/fO2xdrgPpqSnPdtGn35vhXuS3fiDuE820qSYZe
KBYRvOJnd0zJd/mnYIwxvWdbSSFwVw0EQqAemKjrw/iz7MIo5iQQYBhhA9H0R+Im
tXIyy9GoaHYKYLQtVkLBgQAlprdLVrOw5pDXlWuDoSHNT8H7M2fs8qm1wk03BtmZ
1qPeXGcqgSvpxYe9fAc6SsNJ9WgszWuRcD6kOvub5vGBHh8MrUvHWfPnF4+e/gUE
HSUWLK9GiVEt02EuGuLIHOatXkL7pOisScbs+twDhfw7oO3PHYU=
=RtCK
-END PGP SIGNATURE-



Re: Including build metadata in packages

2022-02-16 Thread Paul Wise
Simon McVittie wrote:

> Relatedly, I would like to be able to capture some information about
> builds even if (perhaps especially if) the build fails.

That is a good point that I hadn't considered.

> so that failing builds can also produce artifacts, to help the
> maintainer and/or porters to figure out why the build failed.

Agreed that this is useful.

> handling build logs is not dak's job (and I don't think handling
> things like the binutils test results should be dak's job either).

It has always felt weird to me that build logs are entirely separate to
the archive off in a side service rather than first-class artefacts
that people occasionally need to look at. Also that the maintainer
build logs don't end up anywhere and are probably just deleted. I think
the same applies to the buildinfo files and also these test results
and other artefacts that are mentioned in this thread.

> Here's a straw-man spec, which I have already prototyped in
> :

This seems better than my proposal, modulo the above and also the
reproducible builds effort's need for a way to distribute buildinfo files.

IIRC last time the build artefact discussion came up I was cycling
between having the artefact handling in the sbuild configs on the
buildds for quick implementation vs having it in debian/ dirs for
distributed maintenance by maintainers.

I think there is a fundamental question here that needs answering
definitively: who is the audience for the artefact feature?

 * Is it individual package maintainers who want test result details?
 * Is it build tool maintainers who want data on tool use/failures?
 * Is it porters who want more detailed logs in case of failure?
 * Is it buildd maintainers for some reason?
 * Is it RC bug fixers?
 * Is it all of the above?

Once that is answered, we can think about how and where the list(s?) of
files should be maintained:

 * in debian/
 * in build tools (meson, gcc etc)
 * in debhelper extensions
 * in debhelper
 * in wanna-build
 * in sbuild
 * in sbuild.conf in dsa-puppet
 * in sbuild overrides on buildds

Some of the above will be faster to implement and some will be slower.
The faster parts can possibly even make up for the slower parts, by for
example doing the sbuild proposal in hooks until it is done in stable.

Then there is the question of how the files get off the systems where
builds happen (buildds, maintainer systems). Again, the faster/slower
implementation implications exist here too.

Then there is the question of how the files are further distributed
from there and the question of how people access them.

Then there is the question of whether any of the above will be
implemented in a way that is useful solely to Debian, or in a more
general way to all Debian or apt repository based distributions. Being
able to publish build logs/artifacts seems like something other distros
would be interested in. It sounds like at least the GCC maintainers
want that for Ubuntu too, at minimum.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise




Re: Including build metadata in packages

2022-02-16 Thread Simon McVittie
On Wed, 16 Feb 2022 at 23:25:46 +0800, Paul Wise wrote:
> Simon McVittie wrote:
> > handling build logs is not dak's job (and I don't think handling
> > things like the binutils test results should be dak's job either).
> 
> It has always felt weird to me that build logs are entirely separate to
> the archive off in a side service rather than first-class artefacts
> that people occasionally need to look at. Also that the maintainer
> build logs don't end up anywhere and are probably just deleted. I think
> the same applies to the buildinfo files and also these test results
> and other artefacts that are mentioned in this thread.

If the maintainers of dak (our eternally overworked ftp team) want to
pick up build logs as first-class artifacts produced by both failed
and successful builds, they're welcome to do so (and then handling my
prototype of test artifacts would be a matter of adding another glob
pattern to be stored, for the tarball of artifacts that accompanies the
log); but I don't want to block on them doing that, because that seems
like a recipe for it never happening.

I am also not sure that it would be appropriate for dak to be doing
any processing on *failed* builds, which currently fail and get diverted
off into other code paths long before they get to dak.

If you are trying to solve the problem "we cannot see into the logs of
maintainer-built binaries that exist in the archive", I think a better
answer to that would be to stop letting maintainer-built binaries into the
archive, as the release team are already pushing us towards. That way,
we don't have to worry about whether maintainers' build logs and/or test
artifacts would be leaking personal or sensitive information that they
would prefer not to have shared.

> IIRC last time the build artefact discussion came up I was cycling
> between having the artefact handling in the sbuild configs on the
> buildds for quick implementation vs having it in debian/ dirs for
> distributed maintenance by maintainers.

I'm reasonably sure that the sbuild configuration is the wrong place
to specify what the artifacts are, because the interesting artifacts
depend on the build system (Autotools vs Meson vs etc.) and on how the
package uses it (in-tree vs. out-of-tree build, single vs multiple builds,
and so on), as well as on the package itself (for example GTK's ad-hoc
mechanism to store reftest results as PNG files is entirely GTK-specific).
This is something that the package maintainer already needs to know, so
that they can debug failing builds locally.

I tested my prototype with a Meson package, which has the advantage that
it's very consistent: whatever your build directory is, it will have
a meson-logs subdirectory and that's where all the logs are. However,
even Meson is not always done identically: the most obvious example
is that most Meson-built packages use the dh default build directory
./obj-${multiarch}, but if you do two builds (perhaps one for the .deb
and one for the .udeb, like GLib does), you have to find somewhere else
to put the second build.
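
To make the declarative idea concrete, here is a rough sketch of collecting
files that match a list of glob patterns into a tarball. It is not the
prototype itself: the function name collect_artifacts and the example
pattern are assumptions, and a real implementation would also have to run
when the build fails.

```python
import tarfile
from pathlib import Path


def collect_artifacts(source_root, patterns, tarball):
    """Store every file under source_root matching a glob in a tarball."""
    root = Path(source_root)
    # Resolve all patterns first, dropping duplicates and directories.
    matches = sorted(
        {p for pattern in patterns for p in root.glob(pattern) if p.is_file()}
    )
    with tarfile.open(tarball, "w:gz") as archive:
        for path in matches:
            archive.add(path, arcname=str(path.relative_to(root)))
    return [str(p.relative_to(root)) for p in matches]
```

A pattern such as "obj-*/meson-logs/**/*" would cover the default dh build
directory, while a package doing a second build elsewhere would simply list
a second pattern.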

> I think there is a fundamental question here that needs answering
> definitively: who is the audience for the artefact feature?
> 
>  * Is it individual package maintainers who want test result details?
>  * Is it build tool maintainers who want data on tool use/failures?
>  * Is it porters who want more detailed logs in case of failure?
>  * Is it buildd maintainers for some reason?
>  * Is it RC bug fixers?
>  * Is it all of the above?

As an individual package maintainer, I certainly want this feature.
The exact artifacts that I want vary between packages, which is why
I prototyped it as a new field in d/control.

When toolchain packages like binutils and gcc collect their test
results, I think that's also their maintainer acting as an individual
package maintainer. Obviously they're very important core packages,
but collecting their test results doesn't seem like it fundamentally
differs from me wanting to collect GTK test results.

If the other groups get a benefit from this too, then that's a welcome
bonus, but I think solving it for individual package maintainers and
ignoring everyone else would be a net improvement.

Porters and RC bug fixers can benefit from this information in the
same way package maintainers do; if they're looking at fixing a bug,
they are going to have to change the package *anyway* (to apply the
bug fix), so changing it to collect artifacts (if it doesn't already)
doesn't seem like a huge cost.

I am not aware of buildd maintainers having asked for more detailed
logs. Indeed, buildd maintainers are in the unique position that they
can run arbitrary privileged code on buildds, so they are in a better
position to collect information from a half-built package than mere DDs,
and presumably have less need for this feature.

Build tool maintainers seem like the only one of the groups you've named
that isn't necessarily well-served by my prototype: they don't want to

Debian Med video conference tomorrow, Thursday 2022-02-17 18:00 UTC

2022-02-16 Thread Andreas Tille
Hi,

this is the call for the next video conference of the Debian Med team.
These conferences are an established means of organising the tasks inside
our team.  We hold them twice per month, on the

   2nd  and  17th

of each month.  They usually take only 15-20 minutes, depending on what we
are talking about and how many people are joining.  The next meeting is
tomorrow at 18:00 UTC
   
 
https://www.timeanddate.com/worldclock/fixedtime.html?msg=Debian+CoViD-19+Biohackathon+Video+Conference&iso=20220217T18

The meeting is on the Debian Social channel

 https://jitsi.debian.social/DebianMedCovid19

These video meetings were started in the Debian Med Biohackathon.
The topics are what contributors have done in the past period and how to
coordinate the work until the next meeting.

For those who are interested in hot topics we want to tackle, here
are some items:

  - Preparing the sprint starting on Friday

Newcomers are always welcome.

Let's keep up the great work and see you tomorrow
 
   Andreas.


-- 
http://fam-tille.de



Re: Getting in contact with the i386 porters

2022-02-16 Thread Marc Haber
On Tue, 15 Feb 2022 13:41:22 +0100, Ansgar  wrote:
>The architecture requalification page for bookworm currently lists a
>single porter for i386[1].  Maybe contact him directly?
>
>  [1]: 
> https://salsa.debian.org/release-team/release.debian.org/-/blob/bb0660c80401eeacbe7063044a9a1b711dcc2303/www/bookworm/arch_spec.yaml#L108

That was a very helpful idea (and a very helpful porter). Things are
going forward now. Thanks for helping.

Greetings
Marc
-- 
-- !! No courtesy copies, please !! -
Marc Haber |   " Questions are the | Mailadresse im Header
Mannheim, Germany  | Beginning of Wisdom " | 
Nordisch by Nature | Lt. Worf, TNG "Rightful Heir" | Fon: *49 621 72739834



Re: Including build metadata in packages

2022-02-16 Thread Paul Wise
On Wed, 2022-02-16 at 16:51 +, Simon McVittie wrote:

> If the maintainers of dak (our eternally overworked ftp team) want to
> pick up build logs as first-class artifacts produced by both failed
> and successful builds, they're welcome to do so (and then handling my
> prototype of test artifacts would be a matter of adding another glob
> pattern to be stored, for the tarball of artifacts that accompanies the
> log); but I don't want to block on them doing that, because that seems
> like a recipe for it never happening.

I have heard that they accept patches :)

> If you are trying to solve the problem "we cannot see into the logs of
> maintainer-built binaries that exist in the archive", I think a better
> answer to that would be to stop letting maintainer-built binaries into the
> archive, as the release team are already pushing us towards. That way,
> we don't have to worry about whether maintainers' build logs and/or test
> artifacts would be leaking personal or sensitive information that they
> would prefer not to have shared.

There are always going to be non-buildd binaries in the archive, since
Debian doesn't support autobuilding with non-default build profiles and
even if we had that there will likely always be the need for packages
to be manually bootstrapped.

There is already a (merged?) dak patch for dropping maintainer built
binaries after NEW processing, so we are close to completing this.

ISTR dropping all (not just NEW) maintainer built binaries by default
was decided to be unwanted and the NEW-only approach was preferred.
Personally I wanted to drop all maintainer built binaries by default,
with perhaps a .changes field for enabling accepting binaries.

> I'm reasonably sure that the sbuild configuration is the wrong place
> to specify what the artifacts are

I think that completely depends on the audiences and which artefacts
each of the audiences wants to look at. For some it will be.

> If the other groups get a benefit from this too, then that's a welcome
> bonus, but I think solving it for individual package maintainers and
> ignoring everyone else would be a net improvement.

I think that package maintainers are indeed the primary audience for this
feature but that the other audiences shouldn't be ignored.

> Perhaps it would make sense to have a hybrid of what I prototyped, and
> something more like substvars:
...
> What I definitely want to avoid is a system that requires collecting
> the artifacts imperatively rather than declaratively, e.g. converting

Sounds good.

> I think those are a non-starter: as a maintainer of an individual package,
> I do not want to have to ask the Debian sysadmins' permission to collect
> test results (or, worse, ask the sbuild maintainer's permission and then
> wait 2 years for the change to be in a stable release).

I'm saying we want all of the options, not just one of the options.

> I think part of being a do-ocracy is that if there isn't an important
> reason for a small and usually overworked group to be in a position to
> block other people's work, then we should avoid putting extra load on them.

I think that working around groups like this often leads to suboptimal
designs and a better approach is to get the design right and help those
groups do the implementation work, leaving only the deployment to them.

Anyway, I'm not in any of the audiences for this feature and I won't be
doing any work on it, so I'll leave it up to others to determine the
final design and implementation of the flow of build info/artifacts.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise

