On Tue, 2024-04-09 at 11:07:32 -0700, Slurm users wrote:
> Hi everyone, I'm conducting some tests. I've just set up SLURM on the head
> node and haven't added any compute nodes yet. I'm trying to test it to
> ensure it's working, but I'm encountering an error: 'Nodes required for the
> job are DOWN
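For a head-node-only test like this, the job has nowhere to run until at least one node is defined and up. A minimal sketch of a slurm.conf that lets the head node double as a compute node (all hostnames and counts here are placeholders, not from the original post):

```
# slurm.conf - single-node test setup; head01 is a placeholder hostname
SlurmctldHost=head01
NodeName=head01 CPUs=4 State=UNKNOWN
PartitionName=debug Nodes=head01 Default=YES MaxTime=INFINITE State=UP
```

With slurmd also started on the head node, a quick `srun hostname` should confirm the setup works.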
On Mon, 2024-05-06 at 11:38:30 +0100, Slurm users wrote:
> Hello,
>
> I instructed the port to use binutils from ports (version 2.40, native) instead
> of base:
>
> `/usr/local/bin/ld: unrecognised emulation mode: elf_aarch64`
>
> ```
> /usr/local/bin/ld -V | grep aarch64
> aarch64cloudabi
> aar
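GNU ld's -V lists the emulations a given binary was built with, so comparing the ports linker against another installed linker shows which one actually knows elf_aarch64. A sketch, assuming GNU ld semantics and that the build honours the LD variable:

```sh
# emulations known to the ports linker (truncated output quoted above)
/usr/local/bin/ld -V

# does another installed linker support elf_aarch64?
/usr/bin/ld -V | grep -i aarch64

# if so, point the build at that linker instead (assumption: LD is honoured)
make LD=/usr/bin/ld
```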
On Mon, 2024-06-24 at 13:54:43 +0200, Slurm users wrote:
> Dear Slurm users,
>
> in our project we exclude the master from computing before starting
> Slurmctld. We used to exclude the master from computing by simply not
> mentioning it in the configuration, i.e. just not having:
>
> Partition
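For comparison, the "not mentioning it" approach looks roughly like this in slurm.conf: the master simply never appears in a NodeName or partition line, so it can never be selected for jobs. A sketch with placeholder names:

```
# slurm.conf - head01 (the master) is deliberately absent from these lines
NodeName=compute[01-04] CPUs=8 State=UNKNOWN
PartitionName=main Nodes=compute[01-04] Default=YES State=UP
```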
Good morning,
yesterday I came across a Slurm (sbatch) script that, after doing some stuff
in the foreground, runs another executable in the background - and doesn't
"wait" for it to finish - literally the last line of the script is
executable &
(and that executable is supposed to take several 1
On Fri, 2024-07-26 at 10:42:45 +0300, Slurm users wrote:
> Good Morning;
>
> This is not a Slurm issue; it is default shell behaviour. If you want the
> script to wait until all background processes have finished, use the wait
> command at the end.
Thank you - I already knew this in principle, an
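For reference, a minimal sketch of the pattern under discussion: without the final wait, the batch step exits as soon as the foreground commands return, and Slurm may clean up the job while the background process is still running. Executable names are placeholders:

```sh
#!/bin/bash
#SBATCH --job-name=bg-example     # placeholder name

./prepare_input                   # foreground work
./long_running_tool &             # runs in the background

wait                              # block until all background children exit
```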
On Mon, 2024-07-29 at 11:23:12 +0300, Slurm users wrote:
> Hi there all,
>
> We have a Dell server with 2 x Nvidia H100 GPUs and run Slurm on it. After a
> server restart, Slurm fails unless we first run the nvidia-smi command. When
> we run nvidia-smi && systemctl restart slurmd && systemctl restart slurmctld
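One common workaround for GPUs that are not yet initialised when slurmd starts is to run nvidia-smi from the unit itself, so the device nodes exist before slurmd probes them. A sketch using a systemd drop-in (paths are assumptions, not an official SchedMD recipe):

```sh
# run nvidia-smi before slurmd starts, creating the /dev/nvidia* nodes
sudo mkdir -p /etc/systemd/system/slurmd.service.d
sudo tee /etc/systemd/system/slurmd.service.d/10-nvidia.conf <<'EOF'
[Service]
ExecStartPre=/usr/bin/nvidia-smi
EOF
sudo systemctl daemon-reload
sudo systemctl restart slurmd
```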
Hello everyone,
I've grepped the manual pages and crawled the 'net, but couldn't find any
answer to the following problem:
I can see that the ctld keeps a record of it below /var/spool/slurm - as
long as the job is running or waiting (and shown by "squeue") - and that
this stores the environment
On Wed, 2024-08-07 at 08:55:21 -0400, Slurm users wrote:
> Warning on that one, it can eat up a ton of database space (depending on
> size of environment, uniqueness of environment between jobs, and number of
> jobs). We had it on and it nearly ran us out of space on our database host.
> That said
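The feature being discussed is presumably the job script/environment capture added in Slurm 21.08, which stores both in the accounting database and is exactly what can consume the space mentioned above. A sketch of enabling and querying it:

```sh
# slurm.conf (both flags are optional; job_env is the space-hungry one):
#   AccountingStoreFlags=job_script,job_env

# retrieve them later for a finished job (job id is a placeholder)
sacct -j 12345 --batch-script
sacct -j 12345 --env-vars
```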
Hi Daniel,
> error: Unable to contact slurm controller (connect failure)
>
> I appreciate any insight on what could be the cause.
Can you check that slurmctld is up and running, and that the commands in
question work on the controller machine itself?
If the slurmctld cannot be started as a service
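A few concrete checks along those lines, assuming a typical systemd install with the config under /etc/slurm:

```sh
systemctl status slurmctld                  # is the controller running at all?
journalctl -u slurmctld -e                  # recent controller log messages
scontrol ping                               # can this host reach the controller?
grep SlurmctldHost /etc/slurm/slurm.conf    # do clients point at the right host?
```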
Hi,
On Sun, 2024-12-08 at 21:57:11, Slurm users wrote:
> I have just rebuilt all my nodes and I see
Did they ever work before with Slurm? (Which version?)
> Only 1 & 2 seem available?
> While 3~6 are not
Either you didn't wait long enough (5 minutes should be sufficient),
or the "down*"
On Sat, 2024-12-28 at 22:59:45, Slurm users wrote:
> ls -ls /usr/local/slurm/etc/slurmdbd.conf
> 4 -rw------- 1 slurm slurm 497 Dec 28 16:34 /usr/local/slurm/etc/slurmdbd.conf
>
> sudo -u slurm /usr/local/slurm/sbin/slurmdbd -Dvvv
>
> slurmdbd: error: s_p_parse_file: unable to read
> "/
On Mon, 2025-01-06 at 12:55:12 -0700, Slurm users wrote:
> Hi all,
> I remember seeing on this list a slurm command to change a slurm-friendly
> list such as
>
> gpu[01-02],node[03-04,12-22,27-32,36]
>
> into a bash friendly list such as
>
> gpu01
> gpu02
> node03
> node04
> node12
> etc
I alwa
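The command the truncated reply is most likely pointing at is scontrol's built-in hostlist expansion, which also works in the other direction:

```sh
# expand a Slurm hostlist into one hostname per line
scontrol show hostnames 'gpu[01-02],node[03-04,12-22,27-32,36]'

# and compress a plain list back into Slurm form
scontrol show hostlist gpu01,gpu02,node03
```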
On Sat, 2025-01-04 at 08:11:21 -, Slurm users wrote:
>  JOBID PARTITION     NAME   USER ST  TIME NODES NODELIST(REASON)
>     26       cpu myscript  user1 PD  0:00     4 (Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions)
>
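To see which of the requested nodes are actually unavailable, a couple of standard checks (the partition name cpu and job id 26 are taken from the output above):

```sh
sinfo -p cpu          # node states in the partition
sinfo -R              # reasons for down/drained nodes
scontrol show job 26  # full detail on the pending job
```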
On Thu, 2025-01-09 at 07:51:40 -0500, Slurm users wrote:
> Hello there and good morning from Baltimore.
>
> I have a small cluster with 100 nodes. When the cluster is completely empty
> of all jobs, the first job gets allocated to node 41. In other clusters,
> the first job gets allocated to node
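If the ordering matters, one knob that influences it is the per-node Weight in slurm.conf: among otherwise suitable nodes, Slurm allocates lower-weight nodes first. A sketch with placeholder names, in case the cluster in question has uneven weights or an unusual node definition order:

```
# slurm.conf - lower Weight is allocated first; names are placeholders
NodeName=node[01-40]  CPUs=32 Weight=10
NodeName=node[41-100] CPUs=32 Weight=1
```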
On Tue, 2025-03-04 at 01:03:00 +, Slurm users wrote:
> I am trying to add slurmdbd to my first attempt of slurmctld.
>
> I have mariadb 10.11 running and permissions set.
>
> MariaDB [(none)]> CREATE DATABASE slurm_acct_db;
> Query OK, 1 row affected (0.000 sec)
>
> MariaDB [(none)]> show databases;
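After the database exists, slurmdbd also needs a database account with full rights on it, matching StorageUser/StoragePass in slurmdbd.conf. A sketch following the usual accounting setup (user, host, and password are placeholders):

```sql
CREATE USER IF NOT EXISTS 'slurm'@'localhost' IDENTIFIED BY 'change_me';
GRANT ALL PRIVILEGES ON slurm_acct_db.* TO 'slurm'@'localhost';
FLUSH PRIVILEGES;
```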