Hi,

On Mon, 9 Mar 2020 at 17:09, Vitaly Zuevsky <vitaly.zuev...@gmail.com> wrote:
> [VZ] I use a shell script to supervise processes in a docker/kubernetes
> container. I noticed steady growth in the cgroup's CPU utilization from
> 15 to 35 millicores within 17 days in absence of any external stimuli
> (dry run). If container got restarted, the pattern reappeared from the
> base level of 15 millicores.
> [VZ] I reproduced the issue in local docker, sampled cgroup's threads
> with perf, and ran perf-diff on reports. It emerged that the CPU cost
> grew at dash's freejob() and kernel's copy_page() at page_fault().
> The findings were consistent with dash growing a data structure,
> possibly the one indexed by nprocs at freejob(), on every iteration of
> the loop of the shell script. That data structure would have to be
> cloned at fork() happening along the loop, hence kernel page_faults.

Thanks for your report; this is most likely a bug in the upstream
package. While I doubt that any of our patches introduced this undesired
behaviour, it would be great if you could verify whether it still
happens with the latest upstream Git version of dash and, if it does,
follow up with upstream at d...@vger.kernel.org.

If that’s too much work for you right now (it’s okay if it is), I can do
that for you, but, unfortunately, I cannot provide a timeline at the
moment.

--
Cheers,
  Andrej