On Thu, Dec 26, 2024 at 09:01:30AM +0100, Helmut Grohne wrote:
> What other place would be suitable for including this functionality?
As I suggested: you need two tools or one new tool because what you're
looking for is the min of ncpus and (available_mem / process_size). The
result of that calculation is not the "number of cpus", it is the number
of processes you want to run.
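The calculation described above can be sketched in a few lines. This is only an illustration: the function name `job_count`, the floor of one job, and the example figures (8 GiB available, 2 GiB per process) are assumptions, not part of any existing tool.

```python
import os


def job_count(ncpus: int, available_bytes: int, process_bytes: int) -> int:
    """Return min(ncpus, available_bytes // process_bytes), but never below 1."""
    if process_bytes <= 0:
        raise ValueError("process_bytes must be positive")
    by_memory = available_bytes // process_bytes
    # Clamp to 1 so a build can still make progress on a tight machine
    # (an assumed policy choice for this sketch).
    return max(1, min(ncpus, by_memory))


if __name__ == "__main__":
    GiB = 1024 ** 3
    ncpus = os.cpu_count() or 1
    # 8 GiB available and 2 GiB per process are made-up illustration values.
    print(job_count(ncpus, 8 * GiB, 2 * GiB))
```

Note that the result is a process count, not a CPU count, and that the caller still has to supply `available_bytes` from somewhere.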
> have a pattern of packages coming up with code chewing /proc/meminfo
> using various means (refer to my initial mail referenced from the bug
> submission) and reducing parallelism based on it
Yes, I think that's basically what you need to do.
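For illustration, the pattern being discussed (read /proc/meminfo, then reduce parallelism) might look roughly like the sketch below. It is Linux-specific and assumes the `MemAvailable` field is the right figure to use, which is itself a policy decision. The helper names `mem_available_bytes` and `limit_jobs` and the 2 GiB per-process figure are hypothetical.

```python
from typing import Optional


def mem_available_bytes(meminfo_text: str) -> Optional[int]:
    """Return MemAvailable in bytes from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip() == "MemAvailable":
            fields = rest.split()
            if fields and fields[0].isdigit():
                # The kernel reports this value in kB (1024-byte units).
                return int(fields[0]) * 1024
    return None


def limit_jobs(requested_jobs: int, meminfo_text: str, process_bytes: int) -> int:
    """Reduce requested_jobs so that the jobs fit into MemAvailable."""
    available = mem_available_bytes(meminfo_text)
    if available is None:
        # No usable figure: leave the parallelism alone (assumed fallback).
        return requested_jobs
    return max(1, min(requested_jobs, available // process_bytes))


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is not mounted
    print(limit_jobs(8, text, 2 * 1024 ** 3))
```

On a system without /proc/meminfo the sketch simply returns the requested job count unchanged, which shows how little of this carries over to other platforms.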
> Do you see the computation of allocatable RAM as something we can
> accommodate in coreutils? Michael suggested adding "nmem" between the
> lines. Did you mean that in an ironic way or are you open to adding such
> a tool? It would solve a quite platform-dependent part of the problem
> and significantly reduce the boiler plate in real use cases.
Here's the problem: the definition of "available memory" is very vague.
`free -hwv` output from a random machine:
       total   used    free    shared  buffers  cache   available
Mem:   30Gi    6.7Gi   2.4Gi   560Mi   594Mi    21Gi    23Gi
Swap:  11Gi    2.5Mi   11Gi
Comm:  27Gi    22Gi    4.3Gi
Is the amount of available memory 2.4Gi, 23Gi, maybe 23+11Gi? Or 4.3Gi?
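To make the ambiguity concrete, the sketch below plugs each of those four candidate figures into min(ncpus, available / process_size). The memory figures come from the output above; the 16 CPUs and 2 GiB per process are invented for the example, and the names `CANDIDATES_GIB` and `jobs_per_definition` are hypothetical.

```python
GiB = 1024 ** 3

# Candidate readings of "available memory", in GiB, taken from the free output.
CANDIDATES_GIB = {
    "free": 2.4,
    "available": 23,
    "available + swap": 23 + 11,
    "Comm row, free column": 4.3,
}


def job_count(ncpus: int, available_bytes: int, process_bytes: int) -> int:
    """Return min(ncpus, available_bytes // process_bytes), but never below 1."""
    return max(1, min(ncpus, available_bytes // process_bytes))


def jobs_per_definition(ncpus: int, process_gib: float) -> dict:
    """Compute the job count that each candidate definition would produce."""
    process_bytes = int(process_gib * GiB)
    return {
        name: job_count(ncpus, int(gib * GiB), process_bytes)
        for name, gib in CANDIDATES_GIB.items()
    }


if __name__ == "__main__":
    # 16 CPUs and 2 GiB per process are assumptions for illustration only.
    for name, jobs in jobs_per_definition(16, 2).items():
        print(f"{name}: {jobs}")
```

Under these assumptions the same machine yields 1, 11, 16, or 2 jobs depending on which figure is treated as "available".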
IMO, there is no good answer to that question. It's going to vary based
on how/whether virtual memory is implemented, the purpose of the system
(e.g., is it dedicated to building this one thing or does it have other
roles that shouldn't be impacted), the particulars of the build process
(is reducing disk cache better or worse than reducing parallelism?), etc.,
and we haven't even gotten to cgroups or other esoteric factors yet.

Long before asking where nmem should go, you'd need to figure out how nmem
would work. You're implicitly looking for this tool to be portable (or
else, what's wrong with using /proc/meminfo directly?) but I don't have
any idea how that would work. You'd need to somehow get people to define
policies; what would that look like?

I'd suggest starting by writing a proof of concept and shopping it around
to get buy-in and/or see if it's useful. The answers you get from someone
doing HPC on Linux may be different from those of the administrator of an
OpenBSD server or a developer on a macOS laptop or Windows desktop. I'm
personally skeptical that this is a problem that can be solved, but maybe
you'll be able to demonstrate otherwise. At any rate, looking for a
project to host & distribute the tool would seem to be just about the
last step. Actually naming the thing won't be easy either, but showing
how it works is probably a better place to start.