Re: [slurm-users] [External] What is an easy way to prevent users from running programs on the master/login node.

2021-06-11 Thread Juergen Salk
Hi, I can't speak specifically for arbiter, but to the best of my knowledge this is just how cgroup memory limits work in general, i.e. both anonymous memory and page cache always count against the cgroup memory limit. This also applies to memory constraints imposed on compute jobs if Constrain
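
For context, the Slurm-side knob being referred to lives in cgroup.conf; a minimal sketch, assuming the truncated word above is ConstrainRAMSpace and treating the values as purely illustrative:

    # slurm.conf (excerpt) -- hand task containment to cgroups
    TaskPlugin=task/cgroup

    # cgroup.conf (excerpt) -- illustrative values, not a recommendation
    ConstrainRAMSpace=yes     # job memory (RSS and page cache alike) is capped
    ConstrainSwapSpace=yes    # also cap swap usage
    AllowedRAMSpace=100       # percent of the allocated memory a job may use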

Re: [slurm-users] [EXT] Re: Slurm Scheduler Help

2021-06-11 Thread Dana, Jason T.
Thank you for the response! I have given those parameters a shot and will monitor the queue. These parameters would really only impact backfill with respect to job time limits, correct? Based on what I have read, I was under the impression that the main scheduler and the backfill scheduler were

Re: [slurm-users] Slurm Scheduler Help

2021-06-11 Thread Renfro, Michael
Not sure it would work out to 60k queued jobs, but we're using: SchedulerParameters=bf_window=43200,bf_resolution=2160,bf_max_job_user=80,bf_continue,default_queue_depth=200 in our setup. bf_window is driven by our 30-day max job time, bf_resolution is at 5% of that time, and the other values ar
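
Spelled out with the meaning of each option (semantics as documented in slurm.conf(5); the values are simply this site's, not a general recommendation):

    # slurm.conf (excerpt) -- annotated version of the line above
    #   bf_window=43200          backfill planning window, in minutes (30 days, matching the max job time)
    #   bf_resolution=2160       granularity of the backfill time map, in seconds
    #   bf_max_job_user=80       consider at most 80 jobs per user in each backfill cycle
    #   bf_continue              let the backfill pass resume where it left off after releasing locks
    #   default_queue_depth=200  jobs the main scheduler examines on each event-triggered pass
    SchedulerParameters=bf_window=43200,bf_resolution=2160,bf_max_job_user=80,bf_continue,default_queue_depth=200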

[slurm-users] Slurm Scheduler Help

2021-06-11 Thread Dana, Jason T.
Hello, I currently manage a small cluster separated into 4 partitions. I am experiencing unexpected behavior with the scheduler when the queue has been flooded with a large number of jobs by a single user (around 60,000) to a single partition. We have each user bound to a global GrpTRES CPU limi
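
A minimal sketch of the kind of per-user CPU cap described here, assuming slurmdbd accounting is in place; the user name and the 512-CPU value are placeholders:

    # Cap the CPUs a single user's jobs may hold at once (placeholder values)
    sacctmgr modify user jdoe set GrpTRES=cpu=512

    # Check what the association now enforces
    sacctmgr show assoc where user=jdoe format=User,Account,GrpTRES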

Re: [slurm-users] [External] What is an easy way to prevent users from running programs on the master/login node.

2021-06-11 Thread Stefan Staeglich
Hi Prentice, thanks for the hint. I'm evaluating this too. It seems that arbiter doesn't distinguish between RAM that's really in use and RAM that's used as cache only. Or is my impression wrong? Best, Stefan On Tuesday, 27 April 2021, 17:35:35 CEST, Prentice Bisbal wrote: > I think someone ask
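
One way to see what the kernel itself reports: inside a job's (or arbiter's) memory cgroup, memory.stat splits the charged memory into RSS and page cache, even though both count against the limit. A rough sketch for cgroup v1; the path is only an example and will differ per site:

    # Example cgroup v1 path; adjust to your uid/job or arbiter's cgroup name
    CG=/sys/fs/cgroup/memory/slurm/uid_1000/job_12345

    grep -E 'total_rss|total_cache' $CG/memory.stat   # anonymous memory vs page cache
    cat $CG/memory.usage_in_bytes                     # charged total (includes cache)
    cat $CG/memory.limit_in_bytes                     # the enforced limit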

Re: [slurm-users] Job requesting two different GPUs on two

2021-06-11 Thread Gestió Servidors
Hi, I have tried with > > #!/bin/bash > # > #SBATCH --job-name=N2n4 > #SBATCH --partition=cuda.q > #SBATCH --output=N2n4-CUDA.txt > #SBATCH -N 1 # number of nodes with the first GPU > #SBATCH -n 2 # number of cores > #SBATCH --gres=gpu:GeForceRTX3080:1 > #SBATCH hetjob > #SBATCH -N 1 # number of
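
For reference, a complete version of the kind of heterogeneous-job script being attempted might look as follows; this is a sketch only, and the second GPU type name (GeForceGTX1080Ti) and the ./my_app binary are placeholders, not values from the original post:

    #!/bin/bash
    #SBATCH --job-name=N2n4
    #SBATCH --partition=cuda.q
    #SBATCH --output=N2n4-CUDA.txt
    #SBATCH -N 1                              # first component: one node
    #SBATCH -n 2                              # two cores
    #SBATCH --gres=gpu:GeForceRTX3080:1       # first GPU type
    #SBATCH hetjob
    #SBATCH -N 1                              # second component: one node
    #SBATCH -n 2                              # two cores
    #SBATCH --gres=gpu:GeForceGTX1080Ti:1     # second, different GPU type (placeholder)

    # Launch one step per heterogeneous component and wait for both.
    srun --het-group=0 ./my_app &
    srun --het-group=1 ./my_app &
    wait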