Re: [slurm-users] [External] Autoset job TimeLimit to fit in a reservation

2021-03-29 Thread Prentice Bisbal
Florian, Not the OP, but I have a use case where we do this. We have a code that uses checkpoint/restart, so for the users, being able to calculate how much time they have until our maintenance reservation starts allows them to submit restartable jobs right up to the reservation start time to

Re: [slurm-users] [External] Re: R jobs crashing when run in parallel

2021-03-29 Thread Prentice Bisbal
It sounds to me like configuration drift on your cluster. I would check that libpcre is actually (still?) installed on all your cluster nodes. I'll bet if you check the node(s) where the jobs are failing, it's probably a particular subset of nodes, or even only a single node, and libpcre has some h

Re: [slurm-users] [External] Autoset job TimeLimit to fit in a reservation

2021-03-29 Thread Florian Zillner
Hi, well, I think you're putting the cart before the horse, but anyway, you could write a script that extracts the next reservation and does some simple math to display the remaining time, in hours or some other unit, to the user. It's the user's job to set the time their job needs to finish. Auto-squeezing a job that
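
The script suggested in this message could look roughly like the Python sketch below. It is only an illustration, assuming reservation start times appear as `StartTime=YYYY-MM-DDTHH:MM:SS` fields in text such as the output of `scontrol show reservation`; the sample text, the function names, and the five-minute safety margin are hypothetical and not taken from the thread.

```python
import re
from datetime import datetime, timedelta

# Assumed field layout: StartTime=YYYY-MM-DDTHH:MM:SS
START_RE = re.compile(r"StartTime=(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")

# Hypothetical sample text standing in for real reservation output.
SAMPLE = (
    "ReservationName=maint StartTime=2021-03-30T08:00:00 EndTime=2021-03-30T18:00:00\n"
    "ReservationName=later StartTime=2021-04-05T08:00:00 EndTime=2021-04-05T18:00:00\n"
)


def next_reservation_start(text, now):
    """Return the earliest reservation start strictly after `now`, or None."""
    starts = [datetime.strptime(s, "%Y-%m-%dT%H:%M:%S") for s in START_RE.findall(text)]
    future = [s for s in starts if s > now]
    return min(future) if future else None


def time_limit_until(start, now, margin_minutes=5):
    """Format the time left before `start`, minus a margin, as D-HH:MM:SS.

    Returns None when less than the margin remains.
    """
    remaining = start - now - timedelta(minutes=margin_minutes)
    total = int(remaining.total_seconds())
    if total <= 0:
        return None
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"
```

A user could pass the resulting string to the `--time` option of `sbatch`, which accepts the `days-hours:minutes:seconds` form, so the choice of time limit stays with the user rather than being set automatically.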

Re: [slurm-users] R jobs crashing when run in parallel

2021-03-29 Thread William Brown
Maybe you have run out of file handles. William On Mon, 29 Mar 2021, 17:36 Patrick Goetz, wrote: > Could this be a function of the R script you're trying to run, or are > you saying you get this error running the same script which works at > other times? > > On 3/29/21 7:47 AM, Simon Andrews wr

Re: [slurm-users] R jobs crashing when run in parallel

2021-03-29 Thread Patrick Goetz
Could this be a function of the R script you're trying to run, or are you saying you get this error running the same script which works at other times? On 3/29/21 7:47 AM, Simon Andrews wrote: I've got a weird problem on our slurm cluster.  If I submit lots of R jobs to the queue then as soon

Re: [slurm-users] how to print all the key-values of "job_desc" in job_submit.lua?

2021-03-29 Thread Bas van der Vlies
On 29/03/2021 13:03, Diego Zuccato wrote: On 29/03/21 09:35, taleinterve...@sjtu.edu.cn wrote: Why can't the loop code get the content of job_desc? And what is the correct way to print all its content without manually specifying each key? I already reported it quite some time ago. Seems

[slurm-users] R jobs crashing when run in parallel

2021-03-29 Thread Simon Andrews
I've got a weird problem on our slurm cluster. If I submit lots of R jobs to the queue then as soon as I've got more than about 7 of them running at the same time I start to get failures, saying: /bi/apps/R/4.0.4/lib64/R/bin/exec/R: error while loading shared libraries: libpcre2-8.so.0: cannot

Re: [slurm-users] how to print all the key-values of "job_desc" in job_submit.lua?

2021-03-29 Thread Chrysovalantis Paschoulas
Hi all! I had the same problem, so I can feel your pain.. You can find all available fields of job_desc (and job_rec) in this file: ``` src/plugins/job_submit/lua/job_submit_lua.c ``` and then as a workaround, I was just printing specific fields for debugging.. Best Regards, Valantis On 29

Re: [slurm-users] how to print all the key-values of "job_desc" in job_submit.lua?

2021-03-29 Thread Diego Zuccato
On 29/03/21 09:35, taleinterve...@sjtu.edu.cn wrote: > Why can't the loop code get the content of job_desc? And what is the > correct way to print all its content without manually specifying each key? I already reported it quite some time ago. Seems pairs() is not working. -- Diego Zuccato DI

[slurm-users] Autoset job TimeLimit to fit in a reservation

2021-03-29 Thread Jeremy Fix
Hi, I'm wondering if there is any built-in option to autoset a job TimeLimit to fit within a defined reservation. For now, it seems to me that the timelimit must be explicitly provided, in agreement with the deadline of the reservation, by a user when invoking the srun or sbatch command while

[slurm-users] how to print all the key-values of "job_desc" in job_submit.lua?

2021-03-29 Thread taleintervenor
Hello, Because I'm not sure about the relations between the fields of the job_desc structure and the sbatch parameters, I want to print all the fields and their values in job_desc when testing job_submit.lua. But the following code added to job_submit.lua failed to iterate through job_desc, the for loop print