On 7/17/19 4:05 AM, Andy Georges wrote:
Can you show what your /etc/pam.d/sshd looks like?
For us it's actually here:
---
# cat /etc/pam.d/common-account
#%PAM-1.0
#
# This file is autogenerated by pam-config. All changes
# will be overwritten.
Hi Andy,
We have RHEL 7, and pam_slurm_adopt is working for us as well, with memory
constraints being enforced.
pam.d/sshd:
#%PAM-1.0
auth       required     pam_sepermit.so
auth       substack     password-auth
auth       include      postlogin
# Used with polkit to reauthorize users in remote sessions
-
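In general terms, the documented pam_slurm_adopt setup puts the adopt rule in
the account stack, with cgroup process tracking and PrologFlags=contain in
slurm.conf so memory limits follow the adopted ssh processes. A minimal sketch
of the pieces involved (illustrative only, not anyone's exact files):

account    required     pam_slurm_adopt.so

# slurm.conf - needed so ssh sessions can be adopted into the job's extern step:
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
PrologFlags=contain

# cgroup.conf - enforce the memory limit on everything in the job's cgroup:
ConstrainRAMSpace=yes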
OK, as it turns out, it was a problem like this bug:
https://bugs.schedmd.com/show_bug.cgi?id=3819 (cf.
https://bugs.schedmd.com/show_bug.cgi?id=2741 as well).
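If it's the same underlying cause, the commonly recommended fix is the InnoDB
tuning from the Slurm accounting docs, so slurmdbd's commits stop backing up.
The values below are the documented starting points, not something tuned for
this particular server:

# /etc/my.cnf (or a file under /etc/my.cnf.d/); restart mariadb/mysqld and slurmdbd afterwards
[mysqld]
innodb_buffer_pool_size=1024M
innodb_log_file_size=64M
innodb_lock_wait_timeout=900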
Back in May, I posted the following thread:
https://lists.schedmd.com/pipermail/slurm-users/2019-May/003372.html - to which
I never got a reply.
I don't think the server (which runs both the Slurm controller daemon and
the DB) is the issue... It's a Dell PowerEdge R430 platform with dual Intel
Xeon E5-2640 v3 CPUs, 256GB of memory, and a RAID-1 array of 1TB SATA disks.
top - 09:29:26 up 101 days, 14:57, 3 users, load average: 0.06,
Unfortunately, I think you're stuck with setting it at the account level with
sacctmgr. You could also set that limit as part of a QoS and then attach
the QoS to the partition. But I think that's as granular as you can get for
limiting TRES.
HTH!
David
On Wed, Jul 17, 2019 at 10:11 AM Mike Harvey wrote:
On 7/17/19 12:26 AM, Chris Samuel wrote:
On 16/7/19 11:43 am, Will Dennis wrote:
[2019-07-16T09:36:51.464] error: slurmdbd: agent queue is full (20140), discarding DBD_STEP_START:1442 request
So it looks like your slurmdbd cannot keep up with the rate of these incoming
steps and is having to discard some of them.
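One way to see how far behind it gets (assuming your version's sdiag reports
it) is to check the DBD agent queue size on the slurmctld host; a persistently
large value means records are queuing faster than slurmdbd can commit them:

# sdiag | grep -i 'DBD Agent'
DBD Agent queue size: 20140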
Is it possible to set a cluster-level limit of GPUs per user? We'd like
to implement a limit on how many GPUs a user may use across multiple
partitions at one time.
I tried this, but it obviously isn't correct:
# sacctmgr modify cluster slurm_cluster set MaxTRESPerUser=gres/gpu=2
Unknown o
Our site has been going through the process of upgrading Slurm on our primary
cluster, which was delivered to us with Slurm 16.05 via Bright Computing.
We're currently at 17.02.13-2 and working to get to 17.11 and then 18.08.
We've run into an issue with 17.11 and switching effective GID on a
Hi Mark, Chris,
On Mon, Jul 15, 2019 at 01:23:20PM -0400, Mark Hahn wrote:
> > Could it be a RHEL7 specific issue?
>
> no - centos7 systems here, and pam_adopt works.
Can you show what your /etc/pam.d/sshd looks like?
Kind regards,
-- Andy