But I also find inconsistencies within sreport itself.
I run:
sreport -T Gres/gpu cluster Utilization Start=01/01/25 End=04/30/25
Allocated     Down  PLND Down      Idle  Planned   Reported
--------- -------- ---------- --------- -------- ----------
 15310868   451198          0   8607344        0   24369410
Doing the same for each of the first four months in the range above
individually gives
Month  Allocated     Down  PLND Down      Idle  Planned   Reported
----- ---------- -------- ---------- --------- -------- ----------
Jan      3398309   324071          0   2430336        0    6152716
Feb      7712527   448009          0   3717620        0   11878156
Mar      2995147      745          0   3129989        0    6125880
Apr      4371832     2444          0   1582138        0    5956414
and adding those four Allocated numbers together gives
18477815 > 15310868
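For reference, the per-month numbers came from runs like this (a rough bash
sketch, with the month boundaries written out by hand and the same Start/End
convention as the full-range query above):

for m in "01/01/25 01/31/25" "02/01/25 02/28/25" \
         "03/01/25 03/31/25" "04/01/25 04/30/25"; do
    set -- $m                     # split into start and end dates
    # one GPU-TRES Utilization report per month
    sreport -T Gres/gpu cluster Utilization Start=$1 End=$2
done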
So I assume there is NO truncation going on here: the monthly numbers include
the full runtime of any job that ran at any point during that month, even the
time that falls in the previous or next month.
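One way to sanity-check that hypothesis would be to sum sacct job time for one
of those months with and without --truncate; if the monthly sreport numbers
really count full runtimes, the untruncated sum should be the one that lines
up. A rough sketch (using CPUTimeRAW as a stand-in, since GPU seconds would
need AllocTRES parsing):

# full runtime of every job active in February
sacct -n -X -a -S 2025-02-01 -E 2025-03-01 -o cputimeraw | awk '{s+=$1} END {print s}'
# same jobs, but clipped to the February window
sacct -n -X -a --truncate -S 2025-02-01 -E 2025-03-01 -o cputimeraw | awk '{s+=$1} END {print s}'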
-- Paul Raines (http://help.nmr.mgh.harvard.edu)
On Fri, 23 May 2025 8:32am, Passant Hafez via slurm-users wrote:
Hi Steen,
Thanks a lot, that certainly sorted out most of the discrepancies!
I'm still seeing some differences between the sreport and sacct output for
certain accounts, though, so I was wondering if there's anything else I'm
missing in how sreport calculates it (for sacct I take CPUTimeRAW, sum it, and
convert to hours).
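Concretely, my sacct sum is roughly this (a sketch; account name as in the
example below):

sacct -n -X --allusers --accounts=project1234 \
      --start=2025-04-01 --end=2025-04-05 -o cputimeraw \
    | awk '{s += $1} END {printf "%.1f hours\n", s/3600}'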
All the best,
Passant
________________________________
From: Steen Lysgaard <s...@dtu.dk>
Sent: Thursday, May 22, 2025 9:15 AM
To: 'slurm-us...@schedmd.com' <slurm-us...@schedmd.com>; Passant Hafez
<passant.ha...@glasgow.ac.uk>
Subject: Re: Slurm Reporting Difference between sreport and sacct
Hi Passant,
I've found that when using sacct to track resource usage over specific time
periods, it's helpful to include the --truncate option. Without it, jobs that
started before the specified start time will have their entire runtime counted,
including time outside the specified range. The --truncate option ensures that
only the time within the defined period is included. Maybe this can explain
some of the discrepancy you experience.
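For example, adapting your command (an untested sketch; with --truncate the
Elapsed fields should only count time inside the window):

sacct -n -X --allusers --accounts=project1234 --truncate \
      --start=2025-04-01 --end=2025-04-05 \
      -o elapsedraw,AllocTRES%80,user,partition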
Best regards,
Steen
________________________________
From: Passant Hafez via slurm-users <slurm-users@lists.schedmd.com>
Sent: Wednesday, May 21, 2025 18:48
To: 'slurm-us...@schedmd.com' <slurm-us...@schedmd.com>
Subject: [slurm-users] Slurm Reporting Difference between sreport and sacct
Hi all,
I was wondering if someone can help explain this discrepancy.
I get different values for a project's GPU consumption using sreport vs. sacct
(plus some calculations). Here is an example that shows this:
sreport -t hours -T gres/gpu cluster AccountUtilizationByUser start=2025-04-01
end=2025-04-05 | grep project1234
gives 178
while
sacct -n -X --allusers --accounts=project1234 --start=2025-04-01
--end=2025-04-05 -o elapsedraw,AllocTRES%80,user,partition
gives
213480  billing=128,cpu=128,gres/gpu=8,mem=1000G,node=2  gpuplus
249507  billing=128,cpu=128,gres/gpu=8,mem=1000G,node=2  gpuplus
 13908  billing=64,cpu=64,gres/gpu=4,mem=500G,node=1     gpuplus
  9552  billing=64,cpu=64,gres/gpu=4,mem=500G,node=1     gpuplus
     4  billing=16,cpu=16,gres/gpu=1,mem=200G,node=1     gpu
    11  billing=16,cpu=16,gres/gpu=1,mem=200G,node=1     gpu
...
I won't bore you with the full output and its calculation, but the first job
alone consumed 213480 seconds / 3600 * 8 GPUs = 474.4 GPU-hours, which is
already far more than the 178 hours reported by sreport.
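(If it helps, this is roughly how I total the GPU-hours mechanically; an awk
sketch over parsable sacct output, matching only the plain gres/gpu=N form:)

sacct -n -X --allusers --accounts=project1234 \
      --start=2025-04-01 --end=2025-04-05 -P -o elapsedraw,alloctres \
    | awk -F'|' '{ if (match($2, /gres\/gpu=[0-9]+/))
                       s += $1 * substr($2, RSTART+9, RLENGTH-9) }
                 END { printf "%.1f GPU-hours\n", s/3600 }'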
Any clue why these are inconsistent, or how sreport calculated the 178 value?
All the best,
Passant