make sense?
>
> I also missed that setting in slurm.conf so good to know it is possible to
> change the default behaviour.
>
> Tom
>
> From: Patryk Bełzak via slurm-users
> Date: Friday, 17 May 2024 at 10:15
> To: Dj Merrill
> Cc: slurm-users@lists.schedmd.com
> Subject: [slurm-users] Re: srun weirdness
Hi,
I wonder where these problems come from; perhaps I am missing something,
but we have never had such issues with limits, since we set them on the worker
nodes in /etc/security/limits.d/99-cluster.conf:
```
* soft memlock 4086160  #Allow more Memory Locks for MPI
* hard memlock
```
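For anyone who wants to confirm that such a limit actually reaches their jobs, a minimal check (the hostname below is a placeholder) is to compare the locked-memory limit reported by a plain ssh login and by an srun-launched shell on the same node:
```
# limit seen by a shell launched through Slurm
srun -N1 -n1 bash -c 'ulimit -l'

# limit seen by a plain ssh login on the same node ("node01" is a placeholder)
ssh node01 'ulimit -l'
```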
I completely missed that, thank you!
-Dj
Laura Hild via slurm-users wrote:
PropagateResourceLimitsExcept won't do it?
Sarlo, Jeffrey S wrote:
You might look at the PropagateResourceLimits and PropagateResourceLimitsExcept
settings in slurm.conf
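As a rough sketch (not taken from Dj's actual configuration), the slurm.conf line in question could look like this, telling Slurm to propagate all of the submit host's resource limits except the locked-memory limit, so the node's own limits.conf value wins:
```
# slurm.conf (illustrative): do not propagate the submit host's MEMLOCK limit
PropagateResourceLimitsExcept=MEMLOCK
```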
PropagateResourceLimitsExcept won't do it?
From: Dj Merrill via slurm-users
Sent: Wednesday, 15 May 2024 09:43
To: slurm-users@lists.schedmd.com
Subject: [EXTERNAL] [slurm-users] Re: srun weirdness
Thank you Hermann and Tom! That was it.
The new cl
-----Original Message-----
From: Hermann Schwärzler via slurm-users
Sent: Wednesday, May 15, 2024 9:45 AM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Re: srun weirdness
Hi Dj,
This could be a memory-limits-related problem. What is the output of
ulimit -l -m -v -s
in both interactive job-shells?
You are using cgroups-v1 now, right?
In that case what is the respective content of
/sys/fs/cgroup/memory/slurm_*/uid_$(id -u)/job_*/memory.limit_in_bytes
in both shells?
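For convenience, a small sketch that captures both checks at once (cgroup v1 layout assumed, path as given above); run it in each shell and compare the output:
```
# resource limits seen by the current shell
ulimit -l -m -v -s

# memory limit imposed by the job's cgroup (cgroup v1)
cat /sys/fs/cgroup/memory/slurm_*/uid_$(id -u)/job_*/memory.limit_in_bytes
```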
Do you have a containers setting?
On Tue, May 14, 2024 at 3:57 PM Feng Zhang wrote:
Not sure, very strange, though the two linux-vdso.so.1 entries look different:
[deej@moose66 ~]$ ldd /mnt/local/ollama/ollama
linux-vdso.so.1 (0x7ffde81ee000)
[deej@moose66 ~]$ ldd /mnt/local/ollama/ollama
linux-vdso.so.1 (0x7fffa66ff000)
Best,
Feng
On Tue, May 14, 2024 at 3:43 PM Dj Merrill via slurm-users wrote:
Hi Feng,
Thank you for replying.
It is the same binary on the same machine that fails.
If I ssh to a compute node on the second cluster, it works fine.
It fails when running in an interactive shell obtained with srun on that
same compute node.
I agree that it seems like a runtime environment issue.
Looks more like a runtime environment issue.
Check the binaries: running
ldd /mnt/local/ollama/ollama
on both clusters and comparing the output may give some hints.
Best,
Feng
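One hedged way to do that comparison (the hostnames are placeholders for a node on each cluster):
```
# capture the dynamic-linker resolution on one node of each cluster
ssh cluster1-node 'ldd /mnt/local/ollama/ollama' > ldd-cluster1.txt
ssh cluster2-node 'ldd /mnt/local/ollama/ollama' > ldd-cluster2.txt

# differing library paths or versions are worth chasing; a differing
# linux-vdso.so.1 address is not, since the vDSO load address is
# randomized for every process
diff ldd-cluster1.txt ldd-cluster2.txt
```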
On Tue, May 14, 2024 at 2:41 PM Dj Merrill via slurm-users wrote:
>
> I'm running into a strange issue and I'm hoping anoth