Bug#1106083: scilab: FTBFS: autobuilder hangs when using the kernel of trixie

Santiago Vila Wed, 09 Jul 2025 09:59:14 -0700

On Wed, Jul 09, 2025 at 06:28:44PM +0200, Étienne Mollier wrote:
> For what it's worth, I have run the builds on a beefy laptop,
> and despite having most of the build process being already
> serialized, I noticed that my memory consumption was pretty
> high.  I wonder if there could also be some resource exhaustion
> at play.


That would be a possibility in theory, but my autobuilders do nothing
other than build packages, always one package at a time, and they
always have enough memory.

To achieve that, I monitor Committed_AS in /proc/meminfo and collect
statistics about all packages. Then, when the central server receives
a request from one of the autobuilders to "build something", it always
assigns a package which is buildable on the requesting autobuilder,
according to its available memory.

For the particular case of scilab, it needs around 981 MB to build
on single-cpu systems and 1321 MB to build on systems with 2 CPUs.
My smallest machine these days has 4GB of RAM, so I don't think
memory is the problem here.
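
As a rough illustration of the bookkeeping described above (a sketch
only, not the actual autobuilder code), the Python below reads
Committed_AS from /proc/meminfo and picks a package whose recorded
memory need fits the requesting machine. The function names, the
(package, cpus) table layout and the selection rule are assumptions
made up for this example; only the scilab figures come from this
message.

```python
import re


def committed_as_kb(meminfo_text):
    """Return the Committed_AS value in kB from /proc/meminfo-style text."""
    match = re.search(r"^Committed_AS:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("Committed_AS not found")
    return int(match.group(1))


def read_committed_as_kb(path="/proc/meminfo"):
    """Read Committed_AS from the running Linux system."""
    with open(path) as meminfo:
        return committed_as_kb(meminfo.read())


def pick_package(pending, needs_mb, cpus, available_mb):
    """Return the first pending package known to fit, or None.

    needs_mb is a hypothetical table mapping (package, cpus) to the
    peak memory in MB recorded during earlier builds.
    """
    for name in pending:
        need = needs_mb.get((name, cpus))
        if need is not None and need <= available_mb:
            return name
    return None


# Figures quoted in this message for scilab.
NEEDS_MB = {("scilab", 1): 981, ("scilab", 2): 1321}
```

With that table, pick_package(["scilab"], NEEDS_MB, cpus=2,
available_mb=4096) returns "scilab", which matches the point that a
4GB machine has ample room for this build.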

> I also notice that the kernel run is the cloud variant.  Maybe
> having a look at differences with the plain kernel could reveal
> some clues?  (assuming that the problem has not disappeared from
> 6.12.32 to 6.12.33…)

That would also be a possibility in theory, but I think it's unlikely.

Lucas Nussbaum has been using the cloud variant for ages for his
archive rebuilds, and so far we have never found a package which
failed for such a reason.

So, it is not impossible, but I would prefer to leave that for the
case that we have no other choice.

(If you think about it, that would pose some interesting challenges
and technical issues: What legitimate reason could a package have to
FTBFS if you don't use the standard kernel, and how are we supposed to
express such a dependency in debian/control?)

In case it matters, the failure rates that I got recently were:

10% (5 out of 50) on systems with 1 CPU
78% (39 out of 50) on systems with 2 CPUs.
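
For anyone wanting to collect comparable numbers, repeated builds can
be counted with something like the Python sketch below. It is only an
illustration: the function names are invented, the build command and
the timeout are left to the caller, and a hang is simply treated as a
failure once the timeout expires. Note that subprocess.run only kills
the direct child on timeout, so a real build tool that spawns its own
children would need extra cleanup.

```python
import subprocess


def count_failures(command, runs, timeout_s):
    """Run command `runs` times; count non-zero exits and timeouts."""
    failures = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                command,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            if result.returncode != 0:
                failures += 1
        except subprocess.TimeoutExpired:
            # The build did not finish in time: treat it as a hang.
            failures += 1
    return failures


def failure_percent(failures, runs):
    """Failure rate as a percentage of all runs."""
    return 100.0 * failures / runs
```

Applied to the counts above, failure_percent(5, 50) gives 10.0 and
failure_percent(39, 50) gives 78.0.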

I suspect a race condition of some kind, so if you are still
willing to try different things (as opposed to directly trying
in my VM after I finish my last test build), I would try
building the package on a self-hosted qemu/kvm machine
with exactly 2 CPUs. You can probably achieve the same
effect by using GRUB_CMDLINE_LINUX="nr_cpus=2" (i.e.
modify /etc/default/grub, run update-grub and reboot).
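
After rebooting, it may be worth confirming that the setting took
effect before starting a build. The Python sketch below is one
possible check: it looks for nr_cpus= on the kernel command line and
compares it with the CPU count the system reports. The helper names
are made up for this example.

```python
import os


def nr_cpus_from_cmdline(cmdline_text):
    """Return the integer value of nr_cpus= on the command line, or None."""
    for token in cmdline_text.split():
        if token.startswith("nr_cpus="):
            value = token[len("nr_cpus="):]
            if value.isdigit():
                return int(value)
    return None


def cpu_limit_in_effect(cmdline_path="/proc/cmdline"):
    """True if nr_cpus= is set and matches the CPU count seen by Python."""
    with open(cmdline_path) as cmdline:
        requested = nr_cpus_from_cmdline(cmdline.read())
    return requested is not None and requested == os.cpu_count()
```

On a machine booted with nr_cpus=2, cpu_limit_in_effect() would be
expected to return True.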

(BTW: I am building scilab in unstable 100 times as we speak; I believe
that by tonight I will be able to tell the outcome.)

Thanks.
