[slurm-users] cgroups/v2 plugin rpmbuild issue
Trying to use rpmbuild on Rocky9 Linux, Slurm 21.08. I want to build with cgroups/v2 support and have these installed:

libbpf-devel.x86_64     2:1.3.0-2.el9
dbus-devel.x86_64       1:1.12.20-8.el9
kernel-headers.x86_64   5.14.0-427.22.1.el9_4
hwloc-devel.x86_64      2.4.1-5.el9

In https://slurm.schedmd.com/cgroup_v2.html it says:

    Requirements
    For building cgroup/v2 there are two required libraries checked at configure time... Look at your config.log when configuring to see if they were correctly detected on your system.

I don't see any mention of 'ebpf', 'bpf', or 'dbus' when I grep through ~/rpmbuild/BUILD/slurm-21.08.8-2/config.log.

Before I noticed that, I installed the RPMs on my cluster, and I get cgroup errors when I try to run a job or get an allocation. The slurmctld and slurmd processes start up fine on the servers, and I have a generic /etc/slurm/cgroup.conf.

It seems like the cgroups plugin isn't getting built right with rpmbuild. How do I troubleshoot?

Thanks
Chris

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
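The config.log check described in this message can also be scripted. The following is a minimal sketch, assuming only that config.log is plain text; the function name `find_mentions` is hypothetical, and the keywords are the ones the message greps for.

```python
from typing import Iterable, List, Tuple


def find_mentions(lines: Iterable[str], keywords: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines containing any keyword.

    Matching is a case-insensitive substring test, similar to `grep -i`.
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(k in lowered for k in wanted):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, open the config.log from the build tree (for example the path quoted in the message) and pass the file object together with `["bpf", "dbus"]`. An empty result suggests that configure never probed for those libraries at all.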
[slurm-users] Re: cgroups/v2 plugin rpmbuild issue
Also it looks like only the v1 plugin gets built and added to the RPM:

# ls -l /usr/lib64/slurm/cgroup_v1.so
-rwxr-xr-x 1 root root 385872 Jul 1 18:47 /usr/lib64/slurm/cgroup_v1.so
# ls -l /usr/lib64/slurm/cgroup_v2.so
ls: cannot access '/usr/lib64/slurm/cgroup_v2.so': No such file or directory

Chris
[slurm-users] Re: cgroups/v2 plugin rpmbuild issue
Ah, thanks :) I just realized this when I saw v1 was the only plugin included in the source. Yes, I'm cycling through 21.08 so I can eventually get on the latest Slurm.

Chris

> On 07/06/2024 9:38 AM PDT Adam Tygart via slurm-users wrote:
>
> The version of Slurm you're building doesn't have support for cgroup
> v2. 22.05 does, but I'd recommend going through the upgrade cycle and
> moving to something even moderately recent.
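The version cutoff from the quoted reply (cgroup v2 support arrives in 22.05) can be expressed as a small check. This is a sketch, not part of Slurm; the function name `supports_cgroup_v2` is hypothetical, and it compares only the first two components of the version string.

```python
import re

# First release with cgroup/v2 support, per the reply quoted in this thread.
MIN_CGROUP_V2 = (22, 5)


def supports_cgroup_v2(version: str) -> bool:
    """Return True if a Slurm version string such as '21.08.8-2' is at least 22.05."""
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Slurm version: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= MIN_CGROUP_V2
```

For the build in this thread, `supports_cgroup_v2("21.08.8-2")` is false, which matches the missing cgroup_v2.so.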
[slurm-users] Re: Temporarily bypassing pam_slurm_adopt.so
On my Rocky9 cluster I got this to work fine also. Added at the end of /etc/pam.d/sshd:

account    sufficient   pam_listfile.so item=user sense=allow onerr=fail file=/etc/slurm/allowed_users_file
account    required     pam_slurm_adopt.so

I added a couple of usernames to /etc/slurm/allowed_users_file and they can SSH to the node without a job or allocation there.

Chris

> On 07/08/2024 2:07 PM PDT David Schanzenbach via slurm-users wrote:
>
> Hi Daniel,
>
> Utilizing pam_access with pam_slurm_adopt might be what you are looking for?
> https://slurm.schedmd.com/pam_slurm_adopt.html#admin_access
>
> Thanks,
> David
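Since pam_listfile reads one entry per line, a quick way to sanity-check the allow list before testing SSH is to look a username up in the file contents. This is a hypothetical helper, not a reimplementation of PAM: it strips surrounding whitespace, which is a choice made for this sketch and not a claim about how pam_listfile parses the file.

```python
from typing import Iterable


def user_allowed(username: str, listfile_lines: Iterable[str]) -> bool:
    """Return True if username appears as a whole line in the allow list."""
    entries = {line.strip() for line in listfile_lines}
    entries.discard("")  # ignore blank lines
    return username in entries
```

Pass it the lines of /etc/slurm/allowed_users_file (for example an open file object). A user who is not listed falls through to the pam_slurm_adopt.so line in the stack above.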
[slurm-users] "error: slurm_auth_get_host: Lookup failed: Unknown host" in slurmctld.log
I keep getting this logged on my Slurm control host:

[2024-08-06T20:31:40.196] error: slurm_auth_get_host: Lookup failed: Unknown host

I don't see an identifiable pattern and I'm not sure how to troubleshoot. The jobs being submitted at or around that time seem fine and nobody's complained about their jobs not running, so I can't track down a bad or nonexistent DNS record.

Chris
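One way to look for a pattern is to bucket the errors by hour and compare the busy hours against cron jobs, monitoring probes, or submit-host activity. This is a minimal sketch that assumes every occurrence has the same shape as the log line quoted in this message; the function name `lookup_failures_by_hour` is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the log line shown in the message and captures the date and the hour.
LOOKUP_ERROR = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2})T(\d{2}):\d{2}:\d{2}\.\d+\] "
    r"error: slurm_auth_get_host: Lookup failed: Unknown host"
)


def lookup_failures_by_hour(lines: Iterable[str]) -> Counter:
    """Count 'Lookup failed' errors per (date, hour) bucket."""
    counts = Counter()
    for line in lines:
        match = LOOKUP_ERROR.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Feed it the lines of slurmctld.log and inspect `most_common()` on the result. Errors that cluster at a fixed minute or hour would point to a scheduled client and not to user job submissions.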
[slurm-users] AllocNode:Sid in scontrol but not sacct?
Does the accounting database keep this? Maybe I'm missing something but I don't see a way to query for it in sacct.

Chris
[slurm-users] Re: AllocNode:Sid in scontrol but not sacct? - query for what submit host submitted a job
Is there a historical record in the database of where a job got submitted from? I've seen some older posts mentioning setting up a prolog or epilog to record it, but since AllocNode:Sid shows up in scontrol it seems like it should be in the historical record somewhere.

Chris

> On 12/09/2024 1:03 PM PST Chris Taylor wrote:
>
> Does the accounting database keep this? Maybe I'm missing something but I
> don't see a way to query for it in sacct.
> Chris
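For the prolog or epilog approach mentioned in the older posts, the submit host would have to be pulled out of the scontrol output while the job is still known to the controller. This is a sketch of the parsing step only, assuming the field is printed as `AllocNode:Sid=<host>:<sid>`; the function name `parse_alloc_node` and the sample values in the note below are hypothetical. It does not answer whether the accounting database stores the value.

```python
import re
from typing import Optional, Tuple

# Host is everything up to the last separator before the numeric session id.
ALLOC_NODE = re.compile(r"\bAllocNode:Sid=([^\s:]+):(\d+)")


def parse_alloc_node(scontrol_output: str) -> Optional[Tuple[str, int]]:
    """Return (submit_host, session_id) from scontrol job output, or None if absent."""
    match = ALLOC_NODE.search(scontrol_output)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

Given text containing `AllocNode:Sid=login1:12345`, the function returns the host name and session id as a pair, which a site script could then append to its own log keyed by job id.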