[slurm-users] Slurm versions 18.08.2 and 17.11.11 are now available, as well as 19.05.0pre1

2018-10-18 Thread Tim Wickberg
We are pleased to announce the availability of Slurm versions 18.08.2 and 17.11.11, as well as the first 19.05 release preview version 19.05.0pre1. These versions include a fix for a regression introduced in 18.08.1 and 17.11.10 that prevented the --get-user-env option from working correctly,

Re: [slurm-users] User permissions on submitted jobs

2018-10-18 Thread Aravindh Sampathkumar
Hi Ole, Thanks for the response. It was indeed a matter of UIDs and GIDs. Even though I had FreeIPA configured to serve identities to all nodes, the login node I was trying to submit the job from had an additional entry in its /etc/passwd file that overrode the FreeIPA source. Which caused the c

Re: [slurm-users] Socket timed out on send/recv operation

2018-10-18 Thread John Hearns
Kirk, MailProg=/usr/bin/sendmail MailProg should be the program used to SEND mail, i.e. /bin/mail, not sendmail. If I am not wrong, in the jargon MailProg is an MUA, not an MTA (sendmail is an MTA). On Thu, 18 Oct 2018 at 19:01, Kirk Main wrote: > Hi all, > > I'm a new administrator to Slurm a

[slurm-users] Socket timed out on send/recv operation

2018-10-18 Thread Kirk Main
Hi all, I'm a new administrator to Slurm and I've just got my new cluster up and running. We started getting a lot of "Socket timed out on send/recv operation" errors when submitting jobs, and also if you try to "squeue" while others are submitting jobs. The job does eventually run after about a m

Re: [slurm-users] Can frequent hold-release adversely affect slurm?

2018-10-18 Thread Eli V
On Thu, Oct 18, 2018 at 1:03 PM Daniel Letai wrote: > > > Hello all, > > > To solve a requirement where a large number of job arrays (~10k arrays, each > with at most 8M elements) with the same priority should be executed with minimal > starvation of any array - we don't want to wait for each array

[slurm-users] Can frequent hold-release adversely affect slurm?

2018-10-18 Thread Daniel Letai
Hello all, To solve a requirement where a large number of job arrays (~10k arrays, each with at most 8M elements) with the same priority should be executed with minimal starvation of any array - we don't want to wait for each array to complete before

[slurm-users] Resource sharing between different clusters

2018-10-18 Thread Cao, Lei
Hi, I am pretty new to Slurm so please bear with me. I have the following scenario and I wonder if Slurm currently supports this in some way. Let's say I have 3 clusters. Cluster1 and cluster2 run their own slurmctld and slurmds (this is a hard requirement), but both of them need to shar

Re: [slurm-users] sprio/sacct priority question

2018-10-18 Thread Marcin Stolarek
As far as I remember, sprio does the calculation on its own when executed, and the priority in the job structure stored by slurmctld is updated periodically... maybe this is the answer? cheers, Marcin On Wed, 17 Oct 2018 at 00:42, Glen MacLachlan wrote: > > Hi all, > > I'm using slurm 17.02.8 and when I

[slurm-users] Problem with monitoring cpu time for extern slurm step

2018-10-18 Thread Christian Becker
Dear all, we use pam_slurm_adopt to give users the possibility to make an ssh connection to a node with a running session. But when I try to monitor such a session for cpu usage with sstat I get no results: [root@puppeteer ~]# sstat -aj 3685.extern -o "MinCPU,AveCPU" MinCPU AveCPU -

Re: [slurm-users] Job walltime

2018-10-18 Thread Chris Samuel
On Wednesday, 17 October 2018 10:10:07 PM AEDT Andy Georges wrote: > I am wondering if there is a way to set the job walltime in the job > environment (to set $PBS_WALLTIME). It’s unclear to me how this information > can be retrieved on the worker node, e.g., in the SPANK environment > (prolog, or