Hi Paul,

> 1) why does it now suddenly start to (nearly always) fail across the
> board on arm64 (in Debian, Ubuntu still seems fine), without changes to
> the infrastructure that I know of?

I'm afraid I'm not sure why shasta is eating up more memory on the
arm64 hosts of the CI infrastructure.  What I can see from my end is
that the test needs to map roughly 8GiB of anonymous memory to do its
job.  That said, this is already the case for shasta in bookworm
running on a bookworm kernel, so it doesn't look like a regression
per se.

> 2) do you also believe this is related to memory consumption?

The problem you mentioned, where shasta explicitly gives up when
running into memory limits, is reproducible when I disable swap on an
8GiB machine that I have at hand.  I attempted to play with the
/proc/sys/vm/overcommit_* settings, but my swap at the time was too
big (10GiB) to give me the granularity necessary to check whether I
could get somewhere with improper overcommit memory tuning.  In any
case, the "Killed" status suggests overcommit is active (or heuristic)
on your end for at least some of the hosts.

By any chance, could you double check the memory settings on the CI
hosts, just to make sure that the swap didn't drop off the machines?
Or maybe check for inconsistencies in the memory overcommit settings?
The currently readable test logs suggest that:

  * ci-worker-arm64-10 met memory requirements in November,
  * ci-worker-arm64-07 did not meet requirements in October,
  * ci-worker-arm64-08 did not meet requirements in October,
  * ci-worker-arm64-03 did not meet requirements in October.

So this may already be resolved, in case you changed something in
between.

> 3) If 2 == yes, what are the memory requirements for the test? The test
> *could* test for that before it starts and bail out (restriction:
> skippable with exit code 77 [2]) if the amount of memory available is
> too small.

It shouldn't hurt I guess.  I think I can bolt on something that reads
the memory commit capacity and usage from /proc/meminfo at the
beginning of the test, and skips the run if the testbed can't meet the
memory requirement for whatever reason.
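
To make the idea concrete, here is a rough sketch of what such a
pre-flight check might look like, written in Python purely for
illustration.  It is not the actual test code.  The function names are
hypothetical, the 8GiB threshold is an assumption taken from the rough
figure above, and comparing CommitLimit against Committed_AS is only
one possible reading of "commit capacity and usage".  Note that the
kernel only enforces CommitLimit in strict overcommit mode, so the
result is a heuristic at best.

```python
#!/usr/bin/env python3
"""Sketch: skip a memory-hungry test when the testbed looks too small."""
import sys

SKIP_EXIT_CODE = 77             # exit code honoured by the "skippable" restriction
REQUIRED_KIB = 8 * 1024 * 1024  # assumption: roughly 8GiB, expressed in KiB


def parse_meminfo(text):
    """Turn /proc/meminfo style text into a {field name: integer value} dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def commit_headroom_kib(fields):
    """Memory that can still be committed: CommitLimit minus Committed_AS."""
    return fields["CommitLimit"] - fields["Committed_AS"]


def should_skip(fields, required_kib=REQUIRED_KIB):
    """True when the remaining commit headroom is below the requirement."""
    return commit_headroom_kib(fields) < required_kib


def main(path="/proc/meminfo"):
    """Return 77 when the test should be skipped, 0 otherwise."""
    with open(path) as handle:
        fields = parse_meminfo(handle.read())
    if should_skip(fields):
        print("SKIP: not enough committable memory on this testbed",
              file=sys.stderr)
        return SKIP_EXIT_CODE
    return 0
```

A wrapper could call main() before launching the real test and exit
with its return value when it is non-zero.  The threshold would likely
need adjusting after a few runs on the actual workers.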
Note this may involve some trial and error.

Have a nice Sunday,  :)
-- 
  .''`.  Étienne Mollier <emoll...@debian.org>
 : :' :  gpg: 8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
 `. `'   sent from /dev/pts/1, please excuse my verbosity
   `-    on air: Ghost - Avalanche