Re: [slurm-users] Segmentation fault when launching mpi jobs using Intel MPI

2019-02-06 Thread Christopher Samuel
On 2/6/19 9:06 AM, Bob Smith wrote:
> Any ideas on what is going on?
Any reason you're not using "srun" to launch your code? https://slurm.schedmd.com/mpi_guide.html
All the best, Chris

Re: [slurm-users] Error in job_submit.lua conditional?

2019-02-06 Thread Prentice Bisbal
Thanks!
Prentice
On 2/6/19 11:00 AM, Marcus Wagner wrote:
> Hi Prentice, there, I might help. I've created a table, e.g.:
> local userflags = {
>    --  "" = {
>    -- "bypass"  = 1, # optional, if you want to bypass the submit_plugin
>    -- "debug"   = 1, # optional, if you want to

[slurm-users] Segmentation fault when launching mpi jobs using Intel MPI

2019-02-06 Thread Bob Smith
Hello all, I am having an issue submitting MPI jobs via sbatch using Intel MPI 2019 Update 1. The job ends with a segmentation fault immediately:
[user@head mpihello]$ cat mpihello-intelmpi.out.62
srun: error: node003: task 0: Segmentation fault
srun: error: node004: task 0: Segmentation fault
[m

Re: [slurm-users] Error in job_submit.lua conditional?

2019-02-06 Thread Marcus Wagner
Hi Prentice, there, I might help. I've created a table, e.g.:
local userflags = {
   --  "" = {
   -- "bypass"  = 1, # optional, if you want to bypass the submit_plugin
   -- "debug"   = 1, # optional, if you want to get debug messages
   -- "param"   = 1, # optional,

[slurm-users] "We have more time than is possible" in slurmdbd.log with no runaway jobs

2019-02-06 Thread Antony Cleave
Hi all, I'm seeing this after some hours of MySQL downtime yesterday to correct something else, but I didn't notice these errors until after I had performed the Slurm update to 18.08, which went through fine in spite of these errors. Firstly, when restarting slurmdbd before I started the update: [201

Re: [slurm-users] Error in job_submit.lua conditional?

2019-02-06 Thread Prentice Bisbal
"Dirty debugging" I like that. I'm going to use that from now on. I have tried that method in the past while debugging other issues. I try not to use it too much, since I don't want these "dirty debugging" messages being seen by users (I don't have a test environment, so I have to test debug in

Re: [slurm-users] Error in job_submit.lua conditional?

2019-02-06 Thread Prentice Bisbal
Whew! I have used 'user_id' in a dozen other conditionals that I tested exhaustively. After reading your first e-mail, I thought I was going crazy. I suspect the issue is some sort of subtle typo or syntax error. I use similar conditionals throughout my job_submit.lua script, and they all work

[slurm-users] MaxSubmitJobsPerUser does not work as expected

2019-02-06 Thread Aravindh Sampathkumar
Hi. I'm trying to set MaxSubmitJobsPerUser on a QOS in the expectation that it will limit a user from submitting more than a certain number of jobs at a time. However, it seems to limit the user at a much smaller number of jobs. I ran the following command to set the limit:
sacctmgr modify qos nor