Hi,
I can't speak specifically for arbiter, but to the best of my knowledge
this is just how cgroup memory limits work in general, i.e. both
anonymous memory and page cache always count against the cgroup
memory limit.
This also applies to the memory constraints imposed on compute jobs if
Constrain
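
If you want to see that accounting for yourself, here is a rough
illustration; the path below assumes cgroup v1 with Slurm's usual
memory/slurm/uid_.../job_... hierarchy and will differ on other setups:

# Hypothetical cgroup path; adjust for your cgroup version and Slurm configuration.
CG=/sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}
grep -E '^(rss|cache) ' "$CG/memory.stat"   # anonymous memory vs. page cache, in bytes
cat "$CG/memory.limit_in_bytes"             # the limit both of them are charged against

On cgroup v2 the corresponding counters are the anon and file lines in
memory.stat.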
Thank you for the response!
I have given those parameters a shot and will monitor the queue.
These parameters would really only impact backfill with respect to job time
limits, correct? Based on what I have read, I was under the impression that the
main scheduler and the backfill scheduler were
Not sure it would scale to 60k queued jobs, but we're using:
SchedulerParameters=bf_window=43200,bf_resolution=2160,bf_max_job_user=80,bf_continue,default_queue_depth=200
in our setup. bf_window is driven by our 30-day max job time, bf_resolution is
at 5% of that time, and the other values ar
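
A quick sanity check after changing these: scontrol shows what the
running slurmctld actually picked up, and sdiag reports the backfill
cycle statistics, which is a reasonable way to watch the queue
behaviour mentioned above (standard commands, nothing site-specific):

scontrol show config | grep -i SchedulerParameters   # confirm the live settings
sdiag | grep -i -A 15 'backfilling'                  # backfill depth, cycle times, jobs started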
Hello,
I currently manage a small cluster separated into 4 partitions. I am
experiencing unexpected behavior with the scheduler when the queue has been
flooded with a large number of jobs by a single user (around 6) to a single
partition. We have each user bound to a global GrpTRES CPU limi
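
For what it's worth, that kind of limit lives in the accounting
database; a sketch of how it can be inspected or set with sacctmgr
(the user name and cpu value below are placeholders, not our actual
settings):

# List the per-association limits currently in effect
sacctmgr show assoc format=cluster,account,user,grptres%30
# Illustrative example of capping a user's total CPUs across the cluster
sacctmgr modify user where user=someuser set GrpTRES=cpu=256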
Hi Prentice,
thanks for the hint. I'm evaluating this too.
It seems that arbiter doesn't distinguish between RAM that's really in
use and RAM that's used as cache only. Or is my impression wrong?
Best,
Stefan
On Tuesday, 27 April 2021 at 17:35:35 CEST, Prentice Bisbal wrote:
> I think someone ask
Hi,
I have tried with
>
> #!/bin/bash
> #
> #SBATCH --job-name=N2n4
> #SBATCH --partition=cuda.q
> #SBATCH --output=N2n4-CUDA.txt
> #SBATCH -N 1 # number of nodes with the first GPU
> #SBATCH -n 2 # number of cores
> #SBATCH --gres=gpu:GeForceRTX3080:1
> #SBATCH hetjob
> #SBATCH -N 1 # number of
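
For reference, a minimal sketch of how a two-component heterogeneous
batch script is normally laid out; the GPU request, task counts and
the ./my_app steps below are placeholders rather than anything taken
from the script above:

#!/bin/bash
#SBATCH --job-name=hetjob-example
#SBATCH --partition=cuda.q
#SBATCH --output=hetjob-example.txt
#SBATCH -N 1                      # first component: one node
#SBATCH -n 2                      # first component: two tasks
#SBATCH --gres=gpu:1              # first component: one GPU
#SBATCH hetjob
#SBATCH -N 1                      # second component: one node
#SBATCH -n 2                      # second component: two tasks
#SBATCH --gres=gpu:1              # second component: one GPU

# Each srun step targets one component via --het-group.
srun --het-group=0 ./my_app &
srun --het-group=1 ./my_app &
wait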