Bug#1102062: zookeeper: FTBFS: expected: <1> but was: <0>

tony mancill Fri, 04 Jul 2025 11:15:15 -0700

On Sun, Jun 22, 2025 at 11:37:29PM +0200, Santiago Vila wrote:
> Note for the record: I've just given Tony access to an AWS machine
> of type m7a.large, which has 2 CPUs and 8 GB of RAM (I know this
> is enough because I monitor /proc/meminfo during the builds).
> 
> In this type of machine the failure rate is around 30%
> so several tries might be necessary to get build failures.


Thank you for pushing this issue, Santiago.  I suppose I have just been
lucky, but using the 2-core amd64-based VM you shared, I encountered
similar failure rates running the tests.  I want to point out that the
failing test class is not consistent across builds.  I have experienced
at least 5 distinct test class failures.

But because I haven't seen these with local builds, I initially pursued
the hypothesis that this was due to limited resources, specifically
cores.  So I performed 30 builds on an 8-core / 32GB instance (arm64,
m7g.2xlarge) and encountered 4 failures for 3 distinct test classes.

And since then, I have experienced the occasional test failure on bare
metal (non-hypervisor) systems, both 4-core and 8-core amd64 and arm64,
although the failures are (much) less frequent.

Because the failures occur for multiple different tests, I don't think
we should attempt to disable tests one by one; I expect that would become
a game of whack-a-mole.  As you suggested, we should engage with upstream
regarding the Heisentests.  I will work on that.

For the trixie release, either we can ask the Release Managers to ignore
the bug, or I can upload a packaging change that skips tests during the
build by default and then request a freeze exception.

If anyone has a strong preference, please speak up.

Thank you,
tony
