On Wed, 1 Dec 2010 12:13:03 -0800
Alec Warner <anta...@gentoo.org> wrote:

> On Tue, Nov 30, 2010 at 8:02 PM, Jorge Manuel B. S. Vicetto
> <jmbsvice...@gentoo.org> wrote:
> > On 29-11-2010 10:34, Sebastian Pipping wrote:
> >> On 11/29/10 09:35, Arfrever Frehtes Taifersar Arahesis wrote:
> >>> There will probably be no active version of Python set.
> >>
> >> You had two weeks to come up with this.
> >>
> >> Please find me on IRC to team up on an agreed fix.
> >
> > As Arfrever noted, this is likely the cause of the broken
> > automated weekly stages for this past week. By not having a
> > python symlink / wrapper, stages generation failed on stage2 run.
> > I'd like to take this chance to recall that this is the 2nd time in
> > the last few months that stage generation was broken by python
> > changes. Also, we've been unable to create hardened stages for
> > over 8 weeks because of a sandbox issue.
> > The weekly stages generation depends on the quality and stability
> > of the "stable" tree. Therefore, the RelEng team kindly asks all
> > maintainers to pay attention to the stable ebuilds in the system
> > set and to please fix any failures asap as they may / can prevent
> > stage generation. Be sure to think carefully about changes that
> > can impact the stage generation, in particular when they involve
> > python.
> 
> Two issues:
> 
> proj/en/releng is old as hell and doesn't even mention stage
> generation.
> 
> How does a developer know when the stage generation is broken?  Is
> there a dashboard?  At work we have a guy who is basically a build
> cop and checks our build dashboard once a day or so and if it is
> broken he goes and finds the guy who broke it and punches him in
> the face until he fixes it.  I imagine we do not have staff for
> this (and no one has invented punching over the internet.)

Catalyst sends automated emails to rel...@gentoo.org from the
various build boxes: dolphin, poseidon, other dev.g.o machines.

> I am curious how often stage builds fail (how long can they be
> broken until we actually care?)

Fairly often, especially in the last couple of months or so. There
were some arches that, last I checked, hadn't had any new media in
several months. Python is the usual cause.
Remember the last huge Python debacle that resulted in suspension?
Yeah, that was one of the reasons for continually broken media.

Python issues are pretty much the only reason why general stage builds
fail (hardened is its own set of problems.)

Here's part of a typical message from one of the boxes, minus a whole
bunch of "bad interpreter" errors:

---------------------------------------------------------------------
  [[ (1/3) Configuring environment ]]
/usr/portage/scripts/bootstrap.sh: line 307: python: command not found
---------------------------------------------------------------------
  [[ (2/3) Updating portage ]]
env: emerge: No such file or directory

!!! catalyst: run script failed.

Traceback (most recent call last):
  File "modules/generic_stage_target.py", line 1207, in run_local
    "run script failed.",env=self.env)
  File "/usr/lib64/catalyst/modules/catalyst_support.py", line 542, in cmd
    raise CatalystError,myexc
CatalystError
None

I see messages like this pretty much every day. Releng is
understaffed on a few arches, which is why no one has time to track
down the errors, fix them, and get the builds completed.
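
For anyone who wants to triage these mails without reading each one, here
is a rough sketch in Python. It assumes the message bodies are already
available as plain strings. The function names and labels are made up for
illustration, and the patterns are just the error lines quoted above, not
anything catalyst itself provides.

```python
import re

# Hypothetical labels; the patterns come from the error lines quoted in
# this message and are not an exhaustive list of catalyst failures.
FAILURE_PATTERNS = {
    "python missing": re.compile(r"python: command not found"),
    "bad interpreter": re.compile(r"bad interpreter"),
    "emerge missing": re.compile(r"env: emerge: No such file or directory"),
    "catalyst run script failed": re.compile(r"catalyst: run script failed"),
}


def classify_failure(log_text):
    """Return the sorted labels whose pattern appears in log_text."""
    return sorted(
        label
        for label, pattern in FAILURE_PATTERNS.items()
        if pattern.search(log_text)
    )


def summarize(messages):
    """Count how many messages show each failure label."""
    counts = {}
    for text in messages:
        for label in classify_failure(text):
            counts[label] = counts.get(label, 0) + 1
    return counts
```

Feeding a week of build-box mails through summarize() would give a quick
count of how many failures trace back to a missing python versus something
else.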
