> allow anyone to log in.
>
> Sean
> --------------
> *From:* Ratnasamy, Fritz via slurm-users
> *Sent:* Tuesday, 17 June 2025 14:55
> *To:* Kevin Buckley
> *Cc:* slurm-users@lists.schedmd.com
> *Subject:* [EXT] [slurm-users] Re: slurm_pam_adopt module not working
>
> On 2025/06/11 12:46, Ratnasamy, Fritz via slurm-users wrote:
> >
> > We wanted to block users from ssh to a node where there are no jobs
> > running; however, it looks like users are able to do so. We have installed
> > the slurm_pam_adopt module and
Hi,
We wanted to block users from ssh to a node where there are no jobs
running; however, it looks like users are able to do so. We have installed
the slurm_pam_adopt module and set up the slurm.conf accordingly (the same
way we did on our first cluster, where the pam module denies ssh access
correctly
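For anyone comparing setups: a minimal configuration that should enforce this, assuming a RHEL-style PAM layout (file paths and stack order vary by distro), looks roughly like this:

  # /etc/pam.d/sshd -- add near the end of the account stack
  account    required     pam_slurm_adopt.so

  # slurm.conf -- needed so spawned tasks can be adopted into the job's cgroup
  PrologFlags=contain

Note that pam_systemd on the same stack can interfere with adoption and is often disabled for sshd when pam_slurm_adopt is in use.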
Hi,
We installed a new slurm version and it returns "command not found" for
seff. I do not remember doing any manual installation for the previous
versions; I thought it was coming with sacct, sbatch, etc. Any idea how I
would need to set it up? I read online that seff is actually a perl script.
Best
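seff does indeed ship as a Perl script in the contribs/ directory of the Slurm source tree rather than with the core commands, so it is easy to miss. A sketch of installing it from source, assuming configure has already been run at the top level and the Slurm Perl API is available:

  # from the top of the Slurm source tree
  cd contribs/seff
  make && make install
  # on RPM-based systems, the slurm-contribs package provides seff instead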
Hi,
We were told by our hardware provider that copying large datasets from an
NFS location to GPFS could be done via slurm so the transfer can be
monitored. I am not sure if this works as I could not find much online.
Otherwise, apart from globus and rsync, what would you suggest as a
tool/command to copy
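One approach that needs no extra tooling is to wrap the copy in a batch job, so the transfer runs under Slurm and can be watched with squeue/sacct like any other job. A sketch, with made-up paths and limits:

  #!/bin/bash
  #SBATCH --job-name=nfs2gpfs
  #SBATCH --ntasks=1
  #SBATCH --time=24:00:00
  # hypothetical mount points -- adjust to your site
  rsync -aH --info=progress2 /nfs/projects/dataset/ /gpfs/projects/dataset/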
Hi,
I was wondering whether there might be built-in support for managing
slurm accounts, users, and associations in ansible. It would be nice to be
able to organize accounts in a yaml-style file and modify account settings
via gitlab CI/CD. For example, in a format such as:
accounts:
  - name: "Boss
Hi,
The slurm db duplicates all our account associations: one with
cluster=cluster and another with cluster=venus (which is our actual
cluster). Is that intended? Or should I make any changes?
*Fritz Ratnasamy*
Data Scientist
Information Technology
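If the cluster=cluster entries are left over from an earlier ClusterName (the sample slurm.conf literally uses "cluster"), they are probably stale and can be removed. Verify before deleting, since this drops all of that cluster's associations:

  sacctmgr list cluster
  sacctmgr delete cluster cluster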
Hi,
We are working on a test cluster with slurm 24.11.3 and I am getting this
error message from the login or compute nodes (note that the error does not
show when run from the controller node):
sacctmgr list associations tree format=cluster,account,user,maxnodes
sacctmgr: error: _open_persist_conn
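That error generally means the node cannot open a persistent connection to slurmdbd. Since it works from the controller, a plausible first check (dbd-host is a placeholder for your AccountingStorageHost; 6819 is slurmdbd's default port) is whether the login/compute nodes can actually reach the dbd host:

  scontrol show config | grep -i AccountingStorage
  nc -vz dbd-host 6819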
Hi,
I am trying to install the new version of slurm. Do you know if there is
a way to find out what support is compiled into the executables? For
example, apache has httpd -L, which shows all the loaded modules. See the
result below:
[image: image.png]
*Fritz Ratnasamy*
Data Scientist
Information Technology
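There is no exact slurm equivalent of httpd -L that I know of, but a few indirect checks can hint at what the binaries were built with:

  slurmctld -V                                          # version string
  ldd $(which slurmctld) | grep -Ei 'munge|mysql|pam'   # linked libraries hint at built-in support
  scontrol show config                                  # settings the running daemons expose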
Hi,
I am using an old slurm version 20.11.8 and we had to reboot our cluster
today for maintenance. I suspended all the jobs on it with the command
scontrol suspend list_job_ids and all the jobs paused and were suspended.
However, when I tried to resume them after the reboot, scontrol resume did
*Fritz Ratnasamy*
Data Scientist
Information Technology
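For what it's worth, suspended jobs keep their processes resident in memory on the compute nodes, so a reboot kills them and there is nothing left for scontrol resume to wake up. For maintenance windows, holding and requeueing tends to be safer (job id lists are placeholders):

  scontrol hold <jobid_list>       # keep pending jobs from starting
  scontrol requeue <jobid_list>    # requeue running jobs that can restart
  scontrol release <jobid_list>    # after the reboot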
On Thu, Jun 6, 2024 at 2:11 PM Ratnasamy, Fritz via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> As admin on the cluster, we do not observe any issue on our newly added
> gpu nodes.
> However, for regular users, they
As admin on the cluster, we do not observe any issue on our newly added gpu
nodes.
However, regular users are not seeing their jobs running on these
gpu nodes when running squeue -u (the jobs do, however, show with running
status in sacct), and they are not able to ssh to these newly
added
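One thing worth ruling out (just a guess from the symptoms) is a stale slurm.conf on the login nodes that does not yet list the new gpu nodes; squeue would then fail to resolve the jobs while sacct, which queries slurmdbd, still sees them:

  md5sum /etc/slurm/slurm.conf     # compare between controller and login nodes
  scontrol show node <gpu-node>    # <gpu-node> is a placeholder for a new node's name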
Hi,
What is the "official" process to remove nodes safely? I have drained the
nodes so jobs are completed, and put them in a down state after they were
completely drained.
I edited the slurm.conf file to remove the nodes. After some time, I can
see that the nodes were removed from the partition with
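For comparison, the sequence that has generally worked elsewhere (node names below are placeholders) is drain, wait, mark down, edit slurm.conf on every machine, then restart the daemons, since historically scontrol reconfigure alone did not handle node removal:

  scontrol update nodename=node[01-04] state=drain reason="decommission"
  # once the nodes are idle/drained:
  scontrol update nodename=node[01-04] state=down reason="decommission"
  # remove the node and partition entries from slurm.conf everywhere, then:
  systemctl restart slurmctld     # and restart slurmd on the remaining nodes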