Is it being written to NFS? You say your local dev cluster is a
single node. Is that node also the login node as well as the compute
node? In that case I guess there is no NFS. A larger cluster will be
using some sort of shared storage, so whichever shared file system you
are using likely has caching.
If you a
A similar problem exists in the cluster I look after. I have a job_submit
script which adds certain nodes to the job's excluded-nodes list based on
each node's number of CPUs per GPU. This basically solved the
fragmentation problem entirely. The problem is that cons_tres seems to
think (for example) tha
I look after a very heterogeneous GPU Slurm setup, and some nodes have
rather few cores. We use a job_submit lua script which calculates the
number of requested CPU cores per GPU. This is then used to scan through
a table of 'weak nodes' based on a 'max cores per GPU' property. The
node names are app
> Hi,
>
> I guess you could use a lua script to filter out flags you don't
> want. I haven't tried it with mail flags, but I'm using a script like
> the one referenced to enforce accounts/time limits, etc.
>
> https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
>
> Cheers
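A minimal sketch of that kind of job_submit.lua, here clamping mail flags and requiring an account, might look like the following. The mail_type bit values are the ones in slurm.h at the time of writing and the field names are assumptions, so verify both against your release rather than treating this as a drop-in script.

-- Sketch: filter unwanted mail flags and enforce an account at submit time.
-- Assumes job_desc.mail_type and job_desc.account are exposed by the lua
-- plugin on your Slurm version.

-- MAIL_JOB_END (0x0002) + MAIL_JOB_FAIL (0x0004) from slurm.h; verify locally
local ALLOWED_MAIL = 0x0006

function slurm_job_submit(job_desc, part_list, submit_uid)
  -- require an explicit account instead of falling back to the default
  if job_desc.account == nil or job_desc.account == "" then
    slurm.log_user("Please specify an account with -A/--account")
    return slurm.ERROR
  end

  -- if the user asked for any mail at all, restrict it to end/fail only
  if job_desc.mail_type ~= nil and job_desc.mail_type ~= 0 then
    job_desc.mail_type = ALLOWED_MAIL
  end

  return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
  return slurm.SUCCESS
end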
Janne Blomqvist writes:
> On 14/11/2019 20.41, Prentice Bisbal wrote:
>> Is there any way to see how much a job used the GPU(s) on a cluster
>> using sacct or any other slurm command?
>>
>
> We have created
> https://github.com/AaltoScienceIT/ansible-role-sacct_gpu/ as a quick
> hack to put GPU uti
> Name=gpu Type=v100 File=/dev/nvidia1 CPUs=0-17,36-53
> Name=gpu Type=v100 File=/dev/nvidia2 CPUs=18-35,54-71
> Name=gpu Type=v100 File=/dev/nvidia3 CPUs=18-35,54-71
>
> Any help appreciated.
>
> Thanks, Daniel Vecerka CTU Prague
Do jobs actually end up on the same GPU though?
> the funding for 25-100G
> networks and/or all-flash commercial data storage appliances (NetApp, Pure,
> etc.)
>
> Any good patterns that I might be able to learn about implementing here? We
> have a few ideas floating about, but I figured this already may be a solved
> problem in this community...
>
> Thanks!
> Will
--
Aaron Jackson - M6PIU
http://aaronsplace.co.uk/
t, and command line options will override any environment variables"
> If the --nodelist call is in the sbatch script, this may solve the problem.
>
> Beyond all that I would just contact those users and tell them not to use
> nodelist.
>
> Andreas
>> On 27.11.2018 at 18:05, Aa
Hi all,
I am wondering if it is possible to disable the use of the --nodelist
argument to srun/sbatch/salloc, etc.? In the worst case, could I just
edit the argument-parsing code?
Having only recently moved over to Slurm, we find that some users have a
preference for particular nodes with no justifiable reas
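Rather than patching the argument parsing, one option might be a job_submit.lua that rejects a user-supplied node list. A sketch, assuming job_desc.req_nodes is exposed on your Slurm version and that rejecting the job outright is acceptable:

-- Sketch: refuse jobs that carry an explicit node list (-w/--nodelist).
-- Assumes job_desc.req_nodes is exposed by the lua plugin; whether the field
-- can instead be cleared in place varies between Slurm versions.

function slurm_job_submit(job_desc, part_list, submit_uid)
  if job_desc.req_nodes ~= nil and job_desc.req_nodes ~= "" then
    slurm.log_user("--nodelist is disabled on this cluster; please drop it")
    return slurm.ERROR
  end
  return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
  return slurm.SUCCESS
end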