Re: [slurm-users] swap size

2018-09-23 Thread Raymond Wan
Hi Chris, On Mon, Sep 24, 2018 at 7:36 AM Christopher Samuel wrote: > On 24/09/18 00:46, Raymond Wan wrote: > > > Hmm, I'm way out of my comfort zone but I am curious about what > > happens. Unfortunately, I don't think I'm able to read kernel code, but > > someone here > > (https://stackov

Re: [slurm-users] swap size

2018-09-23 Thread Christopher Samuel
On 24/09/18 00:46, Raymond Wan wrote: Hmm, I'm way out of my comfort zone but I am curious about what happens.  Unfortunately, I don't think I'm able to read kernel code, but someone here (https://stackoverflow.com/questions/31946854/how-does-sigstop-work-in-linux-kernel) seems to suggest

Re: [slurm-users] swap size

2018-09-23 Thread A
Ray, I'm also on Ubuntu. I'll try the same test, but do it with and without swap on (e.g. by running the swapoff and swapon commands first). To complicate things, I also don't know if the swappiness level makes a difference. Thanks, Ashton On Sun, Sep 23, 2018, 7:48 AM Raymond Wan wrote: > > Hi Ch
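For the with/without-swap comparison described above, it may help to record the swap state and the swappiness value before each run. The following is only a minimal sketch in Python, assuming a Linux system that exposes /proc/meminfo and /proc/sys/vm/swappiness; the helper names parse_meminfo and swap_state are made up for illustration and are not part of Slurm.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_state(meminfo_text, swappiness_text):
    """Summarise whether swap is on, how much is in use, and the swappiness."""
    info = parse_meminfo(meminfo_text)
    total = info.get("SwapTotal", 0)  # reported in kB
    free = info.get("SwapFree", 0)    # reported in kB
    return {
        "swap_enabled": total > 0,
        "swap_total_kb": total,
        "swap_used_kb": total - free,
        "swappiness": int(swappiness_text.strip()),
    }


if __name__ == "__main__":
    meminfo_path = Path("/proc/meminfo")
    swappiness_path = Path("/proc/sys/vm/swappiness")
    # Only meaningful on Linux; skip quietly elsewhere.
    if meminfo_path.exists() and swappiness_path.exists():
        print(swap_state(meminfo_path.read_text(), swappiness_path.read_text()))
```

Running it once after swapoff and once after swapon, before suspending a job, would give a record of the conditions each test ran under.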

Re: [slurm-users] swap size

2018-09-23 Thread Raymond Wan
Hi Chris, On Sunday, September 23, 2018 09:34 AM, Chris Samuel wrote: On Saturday, 22 September 2018 4:19:09 PM AEST Raymond Wan wrote: SLURM's ability to suspend jobs must be storing the state in a location outside of this 512 GB. So, you're not helping this by allocating more swap. I d

Re: [slurm-users] swap size

2018-09-22 Thread Chris Samuel
On Saturday, 22 September 2018 4:19:09 PM AEST Raymond Wan wrote: > SLURM's ability to suspend jobs must be storing the state in a > location outside of this 512 GB. So, you're not helping this by > allocating more swap. I don't believe that's the case. My understanding is that in this mode it'

Re: [slurm-users] swap size

2018-09-22 Thread Renfro, Michael
If your workflows are primarily CPU-bound rather than memory-bound, and since you’re the only user, you could ensure all your Slurm scripts ‘nice’ their Python commands, or use the -n flag for slurmd and the PropagatePrioProcess configuration parameter. Both of these are in the thread at https:
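In a batch script, the 'nice' suggestion above would typically mean prefixing the command, e.g. nice -n 10 python script.py (script.py being a placeholder). The same effect can be had from inside a Python launcher; the sketch below is one possible way, assuming a POSIX system, and the helper name run_niced is hypothetical rather than anything Slurm provides.

```python
import os
import subprocess
import sys


def run_niced(cmd, increment=10):
    """Run cmd in a child process whose niceness is raised by `increment`."""
    return subprocess.run(
        cmd,
        # preexec_fn runs in the child just before exec (POSIX only),
        # so only the child is deprioritised, not the launcher itself.
        preexec_fn=lambda: os.nice(increment),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Ask a child interpreter to report its own niceness.
    result = run_niced([sys.executable, "-c", "import os; print(os.nice(0))"])
    print("child niceness:", result.stdout.strip())
```

This only lowers CPU scheduling priority; it does not change how much memory a job holds, so it bears on the CPU-bound case rather than on swap sizing.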

Re: [slurm-users] swap size

2018-09-22 Thread John Hearns
I would say that, yes, you have a good workflow here with Slurm. As another aside - is anyone working with suspending and resuming containers? I see on the Singularity site that suspend/resume is on the roadmap (I am not talking about checkpointing here). Also it is worth saying that these days on

Re: [slurm-users] swap size

2018-09-21 Thread Raymond Wan
Hi Ashton, On Sat, Sep 22, 2018 at 5:34 AM A wrote: > So I'm wondering if 20% is enough, or whether it should scale by the number > of single jobs I might be running at any one time. E.g. if I'm running 10 > jobs that all use 20 GB of RAM, and I suspend, should I need 200 GB of swap? Perhaps
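The two sizing ideas in the quoted question can be put side by side with a little arithmetic. This is only a sketch: it assumes the 20% figure is a fraction of installed RAM (256 GB on the workstation in the original post), and it treats the per-job total as an upper bound in which every suspended job's memory would have to be paged out in full, which the thread itself leaves open. The function names are made up for illustration.

```python
def worst_case_swap_gb(n_jobs, mem_per_job_gb):
    """Upper bound: every suspended job is paged out in full."""
    return n_jobs * mem_per_job_gb


def fraction_of_ram_gb(ram_gb, fraction):
    """Swap sized as a fixed fraction of installed RAM."""
    return ram_gb * fraction


if __name__ == "__main__":
    per_job_total = worst_case_swap_gb(10, 20)     # 10 jobs x 20 GB each
    rule_of_thumb = fraction_of_ram_gb(256, 0.20)  # assumed: 20% of 256 GB RAM
    print(f"per-job upper bound: {per_job_total} GB")
    print(f"20% of RAM:          {rule_of_thumb:.1f} GB")
```

Under these assumptions the per-job total (200 GB) is roughly four times the 20% figure (about 51 GB), which is the gap the question is asking about.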

Re: [slurm-users] swap size

2018-09-21 Thread A
Hi John! Thanks for the reply, lots to think about. In terms of suspending/resuming, my situation might be a bit different from other people's. As I mentioned, this is an install on a single-node workstation. This is my daily office machine. I run a lot of Python processing scripts that have low CPU n

Re: [slurm-users] swap size

2018-09-21 Thread John Hearns
Ashton, on a compute node with 256 GB of RAM I would not configure any swap at all. None. I managed an SGI UV1 machine at an F1 team which had 1 TB of RAM - and no swap. Also our ICE clusters were diskless - SGI very smartly configured swap over iSCSI - but we disabled this, the reason being

[slurm-users] swap size

2018-09-21 Thread A
I have a single-node Slurm config on my workstation (18 cores, 256 GB RAM, 40 TB disk space). I recently extended the array size to its current config and am reconfiguring my LVM logical volumes. I'm curious about people's thoughts on swap sizes for a node. Red Hat these days recommends up to 20%