Hi,

While deploying Slurm and having some trouble getting slurmd to start on the nodes, I found a useful command to check the memory size seen by Slurm on a compute node:

sudo slurmd -C

This could be helpful. I then set the node's memory size a little lower than the reported value to avoid running out of memory, especially when a user allocates the full node.
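
For example, on one of my nodes it looked roughly like this (the numbers here are made up, and the exact fields printed may vary between Slurm versions):

sudo slurmd -C
NodeName=node01 CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=128500

and in slurm.conf I then declare the node with a slightly lower value, for example:

NodeName=node01 CPUs=32 RealMemory=126000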

Patrick


On 12/05/2025 at 14:55, Xaver Stiensmeier via slurm-users wrote:

Josh,

thank you for your thorough answer. I, too, considered switching to CR_Core_Memory after reading up on this. Thank you for confirming my suspicion that without Memory as a consumable resource, we cannot handle high memory requests adequately.

If I may ask: *How do you come up with the specific MemSpecLimit?* Do you handpick a value for each node, use a constant value for all nodes, or take a capped percentage of the maximum memory available?

Best regards, Xaver

On 5/12/25 14:43, Joshua Randall wrote:
Xaver,

Yes, it is my understanding that if we want stable systems that don't run out of memory, we do need to account for the memory used by everything not running within a slurm job.

In our cluster, we are using `CR_Core_Memory` (so we do constrain job memory), and we set `RealMemory` to the actual full amount of memory available on the machine. I believe these values really are given in megabytes (MB), not mebibytes (MiB), as the documentation states, although the example value there ("2048") could arguably be read either way.

We set `MemSpecLimit` for each node to set memory aside for everything on the system that is not running within a slurm job: the slurm daemon itself, the kernel, filesystem drivers, metrics collection agents, and anything else we run outside the control of slurm jobs. `MemSpecLimit` simply reserves the specified amount, so the maximum memory jobs can use on the node is (RealMemory - MemSpecLimit). When using cgroups to limit memory, slurmd itself is also confined to the specified limit so that the daemon cannot encroach on job memory.

However, note that `MemSpecLimit` is documented not to work unless your `SelectTypeParameters` includes Memory as a consumable resource.
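
To make that concrete, a minimal sketch of the relevant pieces might look like this (node names and sizes are invented for illustration, not our actual config):

    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory
    # hypothetical node: set aside 16384 for the OS, slurmd, metrics agents, etc.
    # jobs can then use at most RealMemory - MemSpecLimit = 262144 - 16384 = 245760
    NodeName=node[01-10] CPUs=64 RealMemory=262144 MemSpecLimit=16384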

Since you are using `CR_Core` (which does not configure Memory as a consumable resource), I believe your system is not constraining job memory at all. Jobs can oversubscribe memory as many times over as there are cores, and any single job can run the machine out of memory by using more than is available. With this setting, I guess you could say you don't have to manage reserving memory for the OS and slurmd, but only in the sense that any job could consume all the memory and cause the system OOM killer to kill a random process (including slurmd or something else system-critical).
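
By "constraining job memory" I mean the combination of a memory-aware SelectTypeParameters and cgroup enforcement, roughly something like this (a sketch, not our exact config):

    # slurm.conf
    SelectTypeParameters=CR_Core_Memory
    TaskPlugin=task/cgroup

    # cgroup.conf
    ConstrainRAMSpace=yes

As far as I understand, the cgroup memory limits are derived from each job's memory allocation, so they only really take effect when Memory is a consumable resource.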

Cheers,

Josh.


--
Dr. Joshua C. Randall
Director of Software Engineering, HPC
Altos Labs
email: jrand...@altoslabs.com



On Mon, May 12, 2025 at 10:27 AM Xaver Stiensmeier via slurm-users <slurm-users@lists.schedmd.com> wrote:

    Dear Slurm-User List,

    Currently, in our slurm.conf, we are setting:

        SelectType=select/cons_tres
        SelectTypeParameters=CR_Core

    and in our node configuration RealMemory was reduced by some amount to make sure the node always has enough RAM to run the OS. However, this is apparently not how it is supposed to be done:

        Lowering RealMemory with the goal of setting aside some
        amount for the OS and not available for job allocations will
        not work as intended if Memory is not set as a consumable
        resource in *SelectTypeParameters*. So one of the *_Memory
        options need to be enabled for that goal to be accomplished.
        (https://slurm.schedmd.com/slurm.conf.html#OPT_RealMemory)
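
    For concreteness, what we had been doing in the node configuration was roughly the following (hostnames and sizes are made up for illustration):

        # node actually has about 128000 MB; we declared less to leave headroom for the OS
        NodeName=worker[001-020] CPUs=32 RealMemory=122000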

    This leads to four questions regarding holding back RAM for
    worker nodes. Answers/help with any of those questions would be
    appreciated.

        *1.* Is reserving enough RAM for the worker node's OS and slurmd actually something you have to manage?
        *2.* If so, how can we reserve enough RAM for the worker node's OS and slurmd when using CR_Core?
        *3.* Is that maybe a strong argument against using CR_Core that we overlooked?

    And semi-related:
    https://slurm.schedmd.com/slurm.conf.html#OPT_RealMemory talks
    about taking a value in megabytes.

        *4.* Is RealMemory really expecting megabytes or is it mebibytes?

    Best regards, Xaver




-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
