Hi all:
We encountered a strange bug when querying job history with sacct. As shown below, we list the jobs of user hpczbzt, and sacct does select the right jobs belonging to this user. But the username is displayed as phywht:

> sacct -X --user=hpczbzt --format=jobid%16,jobidraw,user,uid,partition,start,end,AllocCPUS,state%20
           JobID     JobIDRaw      User    UID  Partition               Start                 End  AllocCPUS                State
---------------- ------------ --------- ------ ---------- ------------------- ------------------- ---------- --------------------
         9882328      9882328    phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9882330      9882330    phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9882332      9882332    phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9882335      9882335    phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9882337      9882337    phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9884211      9884211    phywht   6270       a100 2022-03-12T23:56:02 2022-03-13T00:13:43          8    CANCELLED by 6270
         9884265      9884265    phywht   6270       a100 2022-03-13T00:14:22             Unknown          8              RUNNING
         9884308      9884308    phywht   6270    64c512g 2022-03-13T01:18:44 2022-03-13T01:37:04          4    CANCELLED by 6270
         9884413      9884413    phywht   6270    64c512g 2022-03-13T04:52:06 2022-03-13T05:59:49         40            COMPLETED
         9884431      9884431    phywht   6270       a100 2022-03-13T06:09:02 2022-03-13T09:32:45          8            COMPLETED
         9887011      9887011    phywht   6270 debug64c5+ 2022-03-13T11:06:44 2022-03-13T11:07:41          1    CANCELLED by 6270

The UID shown by sacct (6270) is correct: it is the UID of hpczbzt. The actual UID of phywht is 6272, as shown below:

> id phywht
uid=6272(phywht) gid=6272(phywht) groups=6272(phywht)
> id hpczbzt
uid=6270(hpczbzt) gid=6270(hpczbzt) groups=6270(hpczbzt)

Both system accounts are stored in LDAP. We have also checked that they are consistent on both the slurmctld and slurmdbd nodes.
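Note that "id <name>" only tests the name-to-UID direction. As an extra sanity check, the reverse (UID-to-name) lookup could be run on each node. The sketch below is only an illustration and does not by itself explain the sacct output: the helper names name_from_passwd and name_for_uid are made up for this post, and it assumes getent is available and that NSS (including LDAP) is set up on the node as it is for the Slurm daemons.

```shell
#!/bin/sh
# Hypothetical helpers for this post; not part of Slurm.

# Print the user name whose UID field equals $1, reading passwd-format
# lines (name:passwd:uid:gid:...) from stdin. Prints nothing if no match.
name_from_passwd() {
    awk -F: -v uid="$1" '$3 == uid { print $1; exit }'
}

# Reverse lookup through NSS (local files, LDAP, ...) on the current node.
name_for_uid() {
    getent passwd "$1" | name_from_passwd "$1"
}

# Check both UIDs mentioned above; run this on each node.
for uid in 6270 6272; do
    name=$(name_for_uid "$uid")
    echo "$(hostname): uid $uid -> ${name:-<unresolved>}"
done
```

If the accounts are consistent, every node should print hpczbzt for 6270 and phywht for 6272.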
What's more, scontrol and squeue both show the correct username, hpczbzt:

> scontrol show job 9884265
JobId=9884265 JobName=af_test_session
   UserId=hpczbzt(6270) GroupId=hpczbzt(6270) MCS_label=N/A
   Priority=519 Nice=0 Account=acct-phywht QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   ..

> squeue --user=hpczbzt
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           9884265      a100 af_test_  hpczbzt  R   11:43:46      1 gpu04
           9882328      dgx2 repeat_V  hpczbzt  R    7:07:56      1 vol05
..

Does anyone have an idea why only sacct displays the wrong username?