Re: [slurm-users] Two lines are printed by sacct

2018-04-11 Thread Christopher Samuel
On 12/04/18 14:56, Mahmood Naderan wrote: May I ask what is the purpose of viewing intermediate job steps? Because for MPI jobs that's where all the information is. The first step is just the batch script. Also, today when I run the sacct command, I only see the last two jobs [...] Is that norm

Re: [slurm-users] Two lines are printed by sacct

2018-04-11 Thread Mahmood Naderan
May I ask what is the purpose of viewing intermediate job steps? Also, today when I run the sacct command, I only see the last two jobs [mahmood@rocks7 g]$ sacct -a JobID JobName Partition Account AllocCPUS State ExitCode -- -- -- -- --

Re: [slurm-users] job_submit.lua script

2018-04-11 Thread Christopher Samuel
On 12/04/18 01:47, Bjørn-Helge Mevik wrote: "sysadmin.caos" writes: srun: error: slurm_job_submit: parameter error 65534 4294967294 1 4294967294 is the special value slurm.NO_VAL, meaning the parameter was not specified. It is for 32-bit parameters. I guess (but haven't double checked) th
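The sentinel values discussed in this message can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the constants match the definitions in Slurm's slurm.h (NO_VAL = 0xfffffffe, NO_VAL16 = 0xfffe) for your version; the helper name `describe` is made up for illustration and is not part of Slurm.

```python
# Sentinel values as defined in Slurm's slurm.h (assumed unchanged in your version).
NO_VAL16 = 0xFFFE        # 16-bit "parameter not specified"
NO_VAL = 0xFFFFFFFE      # 32-bit "parameter not specified"


def describe(value):
    """Hypothetical helper: label a raw number taken from the error output."""
    if value == NO_VAL:
        return "NO_VAL (32-bit field not specified)"
    if value == NO_VAL16:
        return "NO_VAL16 (16-bit field not specified)"
    return "explicit value %d" % value


# The three distinct numbers that appear in the srun error message above.
for raw in (65534, 4294967294, 1):
    print(raw, "->", describe(raw))
```

In a job_submit.lua script the same idea applies: compare a field against slurm.NO_VAL (or the 16-bit variant) before treating it as a user-supplied number.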

Re: [slurm-users] Two lines are printed by sacct

2018-04-11 Thread Christopher Samuel
On 12/04/18 04:00, Mahmood Naderan wrote: Hi, Hi Mahmood, I would like to know why the sacct command, which I am using to get some reports, shows two lines for each job. sacct reports one line per job step by default, not per job. If you add the 'JobName' field to your sacct command
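To make the job versus job-step distinction concrete, here is a minimal Python sketch that collapses sacct-style rows to one row per job by dropping step rows, whose JobID contains a dot (for example 101.batch or 101.0). The sample rows and the function name `job_rows` are invented for illustration and are not taken from this thread; the pipe-delimited layout mimics what `sacct --parsable2` prints.

```python
# Invented sample in the pipe-delimited style of `sacct --parsable2`.
SAMPLE = """JobID|JobName|State
101|mpi_test|COMPLETED
101.batch|batch|COMPLETED
101.0|orted|COMPLETED
102|serial|FAILED
102.batch|batch|FAILED
"""


def job_rows(text):
    """Return only the per-job rows, skipping job steps such as 101.batch."""
    lines = text.strip().splitlines()
    header = lines[0].split("|")
    rows = [dict(zip(header, line.split("|"))) for line in lines[1:]]
    # Step IDs have the form <jobid>.<step>; plain job IDs contain no dot.
    return [row for row in rows if "." not in row["JobID"]]


for row in job_rows(SAMPLE):
    print(row["JobID"], row["JobName"], row["State"])
```

sacct's own -X (--allocations) option gives the same one-line-per-job view without any post-processing.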

Re: [slurm-users] Slurm setup question

2018-04-11 Thread Lachlan Musicman
On 12 April 2018 at 01:22, Matt Hohmeister wrote: > > Thanks; I just set StateSaveLocation=/var/spool/slurm.state, and that > went away. Of course, another error popped up: > > > > Apr 11 11:19:24 psy-slurm slurmctld[1772]: fatal: Invalid node names in > partition slurm > > > > Here’s the relevan

Re: [slurm-users] FSU & Slurm

2018-04-11 Thread Matt Hohmeister
Actually, I’ve gotten it all working; I just had overlooked some things in slurm.conf. I had previously been trying to get Son of Grid Engine working, but after ripping out half my hair, I went to Slurm, since it’s what our research computing center uses. 😊 Matt Hohmeister Systems and Network Admin

Re: [slurm-users] FSU & Slurm

2018-04-11 Thread Sean Caron
Hi Matt, As a protest to asking questions on this list and getting solicitations for pay-for support, let me give you some advice for free :) If you look at your slurm.conf you'll see there are two directories that your slurm user and group need to have write access to. One is whatever you confi

[slurm-users] (no subject)

2018-04-11 Thread Mike Renfro
Hey, folks. I have a relatively simple queueing setup on Slurm 17.02 with a 1000 CPU-day AssocGrpCPURunMinutesLimit set. When the cluster is less busy than typical, I may still have users run up against the 1000 CPU-day limit, even though some nodes are idle. What’s the easiest way to force a job

[slurm-users] Two lines are printed by sacct

2018-04-11 Thread Mahmood Naderan
Hi, I would like to know why the sacct command, which I am using to get some reports, shows two lines for each job. For example [mahmood@rocks7 g]$ sacct --format=start,end,elapsed,ncpus,cputime,user Start End Elapsed NCPUS CPUTime User ---

Re: [slurm-users] FSU & Slurm

2018-04-11 Thread Jess Arrington
Hi Matt, I hope your day is treating you well. Thank you for your posts on the Slurm user list. By chance, do you work with Paul Van Der Mark? Would there be interest on your side to see a Slurm support contract for your systems at FSU? Sites running Slurm with support give us feedback that

Re: [slurm-users] job_submit.lua script

2018-04-11 Thread Bjørn-Helge Mevik
"sysadmin.caos" writes: > srun: error: slurm_job_submit: parameter error 65534 4294967294 1 4294967294 is the special value slurm.NO_VAL, meaning the parameter was not specified. It is for 32-bit parameters. I guess (but haven't double checked) that 65534 is slurm.NO_VAL16, meaning the sam

Re: [slurm-users] Slurm setup question

2018-04-11 Thread Matt Hohmeister
Thanks; I just set StateSaveLocation=/var/spool/slurm.state, and that went away. Of course, another error popped up: Apr 11 11:19:24 psy-slurm slurmctld[1772]: fatal: Invalid node names in partition slurm Here’s the relevant section from slurm.conf; IP address changed to protect the innocent.

Re: [slurm-users] Slurm setup question

2018-04-11 Thread Douglas Jacobsen
It looks like your slurm.conf is specifying /var/spool as your state save directory, and `fatal: Incorrect permissions on state save loc: /var/spool` indicates that SlurmUser (another setting in slurm.conf) does not have access to write to it. It might be good to make a directory dedicated
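The permission problem described here can be checked before starting slurmctld. This is a minimal sketch, not how Slurm itself validates the directory: it only tests that a path is a directory, is owned by a given numeric user ID, and has the owner-write bit set. The function name `owned_and_writable` is hypothetical.

```python
import os
import stat


def owned_and_writable(path, uid):
    """Return True if `path` is a directory owned by `uid` with owner-write set."""
    info = os.stat(path)
    return (
        stat.S_ISDIR(info.st_mode)
        and info.st_uid == uid
        and bool(info.st_mode & stat.S_IWUSR)
    )

# Example call (assumes an account named "slurm" exists; adjust to your SlurmUser):
#   import pwd
#   owned_and_writable("/var/spool/slurm.state", pwd.getpwnam("slurm").pw_uid)
```

Passing a numeric UID keeps the check independent of how the SlurmUser account is named on a given system.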

[slurm-users] job_submit.lua script

2018-04-11 Thread sysadmin.caos
Hello, I'm writing my own "job_submit.lua" for controlling in what partition a user can run a "srun" and how many CPUs and nodes are allowed. I want to allow only "srun" in partition "interactive" with only one core and one node. I have written this script but I'm gett

Re: [slurm-users] Slurm setup question

2018-04-11 Thread Ole Holm Nielsen
Hi Matt, You might want to take a look at my Slurm Wiki, which focuses on CentOS/RHEL 7: https://wiki.fysik.dtu.dk/niflheim/SLURM. Complete instructions for Slurm installation, configuration, etc. are in the Wiki. /Ole On 04/11/2018 02:26 PM, Matt Hohmeister wrote: I’m brand-new to Slurm, an

[slurm-users] Slurm setup question

2018-04-11 Thread Matt Hohmeister
I'm brand-new to Slurm, and setting it up on a single RHEL 7.4 VM as a proof of concept before I deploy it. After following the instructions on https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/ (sorry, site not working now), I can get slurmd to start perfectly, but slurmctl

Re: [slurm-users] Job Preemption Suspend/Resume

2018-04-11 Thread Nicolò Parmiggiani
Has nobody else had the same problem? 2018-04-09 13:05 GMT+02:00 Nicolò Parmiggiani : > Hi, > > I have a problem with preemption. > > When a high-priority job suspends a lower-priority job, all is OK. But when > the high-priority job ends and Slurm resumes the lower-priority one, sometimes it > happens that the resumed

[slurm-users] slurm-17.11.5 usage of X11

2018-04-11 Thread Philippe Grevet
Hello, We have a problem using the native X11 support of the new Slurm version (17.11.5), compiled from source on Debian stretch. Previously we used the SPANK plugin with no problem on an older version of Slurm. I first get this error: "error: X11 forwarding not built in, cannot enable." After reading d