[slurm-users] cgroups/v2 plugin rpmbuild issue

2024-07-05 Thread Chris Taylor via slurm-users
I'm trying to build Slurm 21.08 with rpmbuild on Rocky Linux 9. I want to build
with cgroup/v2 support and have these packages installed:

libbpf-devel.x86_64    2:1.3.0-2.el9
dbus-devel.x86_64      1:1.12.20-8.el9
kernel-headers.x86_64  5.14.0-427.22.1.el9_4
hwloc-devel.x86_64     2.4.1-5.el9

In https://slurm.schedmd.com/cgroup_v2.html it says: 

Requirements 
For building cgroup/v2 there are two required libraries checked at configure 
time... Look at your config.log when configuring to see if they were correctly 
detected on your system.

I don't see any mention of 'ebpf', 'bpf', or 'dbus' when I grep through
~/rpmbuild/BUILD/slurm-21.08.8-2/config.log. Before I noticed that, I installed
the RPMs on my cluster, and I get cgroup errors when I try to run a job or get
an allocation. The slurmctld and slurmd processes start up fine on the servers,
and I have a generic /etc/slurm/cgroup.conf.

It seems like the cgroup/v2 plugin isn't getting built by rpmbuild. How do I
troubleshoot? Thanks
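
The config.log check described above can be repeated with a small helper. This is only a sketch: the function name check_cgroup_v2_deps is hypothetical, and the search terms are the ones mentioned in this message ('bpf', which also matches 'ebpf', and 'dbus'). It makes no assumption about the exact wording configure writes to config.log.

```shell
# Hypothetical helper: show config.log lines that mention the
# cgroup/v2 build dependencies (bpf and dbus).
check_cgroup_v2_deps() {
    log="$1"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # -i: ignore case, -n: print line numbers, -E: extended regex
    if ! grep -inE 'bpf|dbus' "$log"; then
        echo "no bpf or dbus checks found in $log"
        return 1
    fi
}
```

Usage: check_cgroup_v2_deps ~/rpmbuild/BUILD/slurm-21.08.8-2/config.log. A return status of 1 only says that configure never logged those checks; it does not say why.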

Chris

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: cgroups/v2 plugin rpmbuild issue

2024-07-06 Thread Chris Taylor via slurm-users
Also, it looks like only the v1 plugin gets built and added to the RPM:

# ls -l /usr/lib64/slurm/cgroup_v1.so
-rwxr-xr-x 1 root root 385872 Jul  1 18:47 /usr/lib64/slurm/cgroup_v1.so
# ls -l /usr/lib64/slurm/cgroup_v2.so
ls: cannot access '/usr/lib64/slurm/cgroup_v2.so': No such file or directory
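
The same check can be done for every cgroup plugin at once. The sketch below assumes the plugin directory shown above (/usr/lib64/slurm) and the cgroup_v*.so naming seen in the ls output; the function name list_cgroup_plugins is hypothetical.

```shell
# Hypothetical helper: print the cgroup plugin versions found in a
# Slurm plugin directory (defaults to the path used above).
list_cgroup_plugins() {
    dir="${1:-/usr/lib64/slurm}"
    for so in "$dir"/cgroup_v*.so; do
        # If nothing matches, the glob stays literal and is skipped here.
        [ -e "$so" ] && basename "$so" .so
    done
    return 0
}
```

On the system above this would print only cgroup_v1.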

Chris



[slurm-users] Re: cgroups/v2 plugin rpmbuild issue

2024-07-06 Thread Chris Taylor via slurm-users
Ah, thanks :) I just realized this when I saw v1 was the only plugin included
in the source. Yes, I'm cycling through 21.08 so I can eventually get on the
latest Slurm.
Chris

> On 07/06/2024 9:38 AM PDT Adam Tygart via slurm-users wrote:
> 
>  
> The version of Slurm you're building doesn't have support for cgroup
> v2. 22.05 does, but I'd recommend going through the upgrade cycle and
> moving to something even moderately recent.
>



[slurm-users] Re: Temporarily bypassing pam_slurm_adopt.so

2024-07-08 Thread Chris Taylor via slurm-users
I got this to work on my Rocky 9 cluster as well.

I added these lines at the end of /etc/pam.d/sshd:

account    sufficient    pam_listfile.so item=user sense=allow onerr=fail file=/etc/slurm/allowed_users_file
account    required      pam_slurm_adopt.so

I added a couple of usernames to /etc/slurm/allowed_users_file and they can SSH 
to the node without a job or allocation there.
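
To check quickly whether a given username is in that file, something like the following sketch could be used. It assumes the file holds one username per line with no surrounding whitespace; the function name is_allowed_user is hypothetical and is not part of PAM or Slurm.

```shell
# Hypothetical helper: succeed if the user appears as a whole line
# in the allow-list file (defaults to the path used above).
is_allowed_user() {
    user="$1"
    file="${2:-/etc/slurm/allowed_users_file}"
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -Fxq -- "$user" "$file"
}
```

Usage: is_allowed_user alice && echo "alice can SSH in without a job". The username alice is only an example.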

Chris

> On 07/08/2024 2:07 PM PDT David Schanzenbach via slurm-users wrote:
> 
> 
> Hi Daniel,
>  
>  Utilizing pam_access with pam_slurm_adopt might be what you are looking for?
>  https://slurm.schedmd.com/pam_slurm_adopt.html#admin_access
>  
>  Thanks,
>  David
>  
>



[slurm-users] "error: slurm_auth_get_host: Lookup failed: Unknown host" in slurmctld.log

2024-08-06 Thread Chris Taylor via slurm-users
I keep getting this logged on my Slurm control host:
[2024-08-06T20:31:40.196] error: slurm_auth_get_host: Lookup failed: Unknown host

I don't see an identifiable pattern and I'm not sure how to troubleshoot. The 
jobs being submitted at or around that time seem fine and nobody's complained 
about their jobs not running, so I can't track down a bad or nonexistent DNS 
record.
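
One way to look for a pattern is to count these errors per hour. The sketch below assumes every log line starts with a timestamp in the bracketed format shown above; the function name lookup_failures_per_hour is hypothetical, and the log path has to be supplied (it is whatever SlurmctldLogFile points to in slurm.conf).

```shell
# Hypothetical helper: count "Lookup failed" errors per hour.
# Prints one "YYYY-MM-DDTHH count" line per hour that has errors.
lookup_failures_per_hour() {
    grep -F 'slurm_auth_get_host: Lookup failed' "$1" |
        cut -c2-14 |               # keep "YYYY-MM-DDTHH" from the timestamp
        sort | uniq -c |           # count occurrences of each hour
        awk '{print $2, $1}'       # normalize to "hour count"
}
```

Hours with a burst of errors can then be compared against job submissions or node activity around the same time.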

Chris



[slurm-users] AllocNode:Sid in scontrol but not sacct?

2024-12-09 Thread Chris Taylor via slurm-users
Does the accounting database keep this? Maybe I'm missing something but I don't 
see a way to query for it in sacct.
Chris



[slurm-users] Re: AllocNode:Sid in scontrol but not sacct? - query for what submit host submitted a job

2024-12-16 Thread Chris Taylor via slurm-users
Is there a historical record in the database of where a job was submitted from?
I've seen some older posts that mention setting up a prolog or epilog to record
it, but since AllocNode:Sid shows up in scontrol, it seems like it should be in
the historical record somewhere.
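
For reference, the prolog/epilog idea from those older posts could start from something like the sketch below, which pulls the submit host out of scontrol output. It says nothing about whether the accounting database stores the value. It assumes the field is printed as AllocNode:Sid=<host>:<sid>; the function name alloc_node_from_scontrol is hypothetical.

```shell
# Hypothetical helper: read "scontrol show job" output on stdin and
# print the host part of the AllocNode:Sid=<host>:<sid> field.
alloc_node_from_scontrol() {
    sed -n 's/.*AllocNode:Sid=\([^: ]*\):.*/\1/p'
}
```

Usage, while the job is still known to slurmctld: scontrol show job "$SLURM_JOB_ID" | alloc_node_from_scontrol. Where the result gets recorded is left open.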
Chris

> On 12/09/2024 1:03 PM PST Chris Taylor wrote:
> 
>  
> Does the accounting database keep this? Maybe I'm missing something but I 
> don't see a way to query for it in sacct.
> Chris
