Sorry, I had indeed run the raw script for that mail.
Running the one installed by `make install`, which sets the path on
line 11 correctly:
use lib qw(/usr/local/slurm-23.02.2/lib/x86_64-linux-gnu/perl/5.30.0);
I get:
perl: error: slurm_persist_conn_open: Something happened with the
receiving/processing of the persistent connection init message to
localhost:6819: Failed to unpack SLURM_PERSIST_INIT message
perl: error: Sending PersistInit msg: Message receive failure
Use of uninitialized value in subroutine entry at
/usr/local/slurm/bin/seff line 57, <DATA> line 564.
perl: error: g_slurm_auth_pack: protocol_version 6500 not supported
perl: error: slurm_send_node_msg: g_slurm_auth_pack:
REQUEST_PERSIST_INIT has authentication error: No error
perl: error: slurm_persist_conn_open: failed to send persistent
connection init message to localhost:6819
perl: error: Sending PersistInit msg: Protocol authentication error
perl: error: DBD_GET_JOBS_COND failure: Unspecified error
Job not found.
Slurm is otherwise running well after an update from 20.11 -> 21.08 ->
23.02.
# sinfo -V
slurm 23.02.2
# sinfo -O nodehost,Version
HOSTNAMES VERSION
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
On 5/25/23 18:33, Mike Robbert wrote:
How did you install seff? I don’t know exactly where this happens, but
it looks like line 11 in the source file for seff is supposed to get
transformed to include an actual path. I am running on CentOS and
install Slurm by building the RPMs using the included spec files. Here
is a diff between the file in the source tree and the file that got
installed to /usr/bin/seff:
$ diff contribs/seff/seff /usr/bin/seff
11c11
< use lib "${FindBin::Bin}/../lib/perl";
---
> use lib qw(/usr/lib64/perl5);
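A quick way to tell the raw script from an installed one is to look at its `use lib` line. The sketch below is only illustrative: the function names `show_use_lib` and `is_raw_seff` are made up here, and it assumes the raw script is the one that still mentions FindBin on that line, as in the diff above.

```shell
# Print every "use lib" line of a Perl script, prefixed with its line number.
show_use_lib() {
    grep -n '^[[:space:]]*use lib' "$1"
}

# Succeed (exit status 0) if the script still carries the untransformed
# FindBin-based "use lib" line from the source tree.
# Assumption: an installed copy has that line replaced by a fixed path.
is_raw_seff() {
    grep -q '^[[:space:]]*use lib .*FindBin' "$1"
}
```

For example, `show_use_lib "$(command -v seff)"` prints the line for whichever seff is first in the PATH.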
*Mike Robbert*
*Cyberinfrastructure Specialist, Cyberinfrastructure and Advanced
Research Computing*
Information and Technology Solutions (ITS)
303-273-3786 | mrobb...@mines.edu <mailto:mrobb...@mines.edu>
*From: *slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of
David Gauchard <gauch...@laas.fr>
*Date: *Thursday, May 25, 2023 at 10:02
*To: *slurm-us...@schedmd.com <slurm-us...@schedmd.com>
*Subject: *[EXTERNAL] [slurm-users] seff in slurm-23.02
Hello,
With slurm-23.02 on ubuntu-20.04, seff is not working anymore:
```
# ./seff 4911385
Use of uninitialized value $FindBin::Bin in concatenation (.) or string
at ./seff line 11.
Name "FindBin::Bin" used only once: possible typo at ./seff line 11,
<DATA> line 602.
perl: error: slurm_persist_conn_open: Something happened with the
receiving/processing of the persistent connection init message to
localhost:6819: Failed to unpack
SLURM_PERSIST_INIT message
perl: error: Sending PersistInit msg: Message receive failure
Use of uninitialized value in subroutine entry at ./seff line 58, <DATA>
line 602.
perl: error: [...]
```
while using
https://github.com/SchedMD/slurm/blob/ce7d569807c495516ebfa6fcef25ad36ccc76827/contribs/seff/seff#LL19C3-L19C124 :
```
# sacct -P -n -a --format JobID,User,Group,State,Cluster,AllocCPUS,REQMEM,TotalCPU,Elapsed,MaxRSS,ExitCode,NNodes,NTasks -j 4911385
4911385|user|part|FAILED|hpc|1|2000M|00:23.041|00:00:31||0:9|1|
4911385.batch|||CANCELLED by 0|hpc|1||00:23.041|00:00:31|5936692K|0:9|1|1
```
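Until seff works again, the CPU efficiency can be approximated from that sacct output. This is a rough sketch, not seff's actual code: it assumes efficiency is TotalCPU divided by (AllocCPUS times Elapsed), that the fields arrive in the order requested above, and that times look like [DD-][HH:]MM:SS[.mmm]. The function name `cpu_efficiency` is hypothetical.

```shell
# Read pipe-separated sacct lines on stdin and print "JobID efficiency%".
# Assumed field order: $6 = AllocCPUS, $8 = TotalCPU, $9 = Elapsed.
cpu_efficiency() {
    awk -F'|' '
    # Convert a time string such as 1-02:03:04 or 00:23.041 to seconds.
    function to_seconds(t,    d, p, n, i, days, secs) {
        days = 0
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        return days * 86400 + secs
    }
    {
        cpus = $6
        total = to_seconds($8)
        elapsed = to_seconds($9)
        # Skip lines without usable CPU count or elapsed time.
        if (cpus > 0 && elapsed > 0)
            printf "%s %.2f%%\n", $1, 100 * total / (cpus * elapsed)
    }'
}
```

The sacct command above can be piped straight into it, e.g. `sacct -P -n -a --format ... -j 4911385 | cpu_efficiency`; for the job shown this gives roughly 74%.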
I wonder whether this is an installation error, and whether
contribs/seff is working for other 23.02 users.
Thanks