Hi all,

We run a few Slurm clusters here, all using SlurmDBD to store job history.
I also use Open XDMoD (http://open.xdmod.org/) to run statistics on the
jobs. However, XDMoD does not seem to provide per-node utilization
statistics, unless my installation is simply not configured for it. What I’m
looking for is the number of jobs that landed on each node over a given
period, plus things like the number of completed jobs, failed jobs, etc. per
node. I’m trying to get a sense of how heavily loaded (or, most probably in
my case, how unused) the individual nodes in a cluster are.

I have run the command:
sacct -X -p -o jobid,jobname,start,end,user,partition%-30,nodelist,alloccpus,reqmem,cputime,qos,state,exitcode,AllocTRES%-50 -S 01/01/19 > sacct-parsable-2019.txt
to dump the list of jobs for the year, imported it into Excel, and used a
PivotTable to get some stats, but that is the long way around. I would like
something more dynamic and easier. Does anyone have any suggestions?
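
For reference, the PivotTable step amounts to roughly the following. This is
only a minimal Python sketch, not a finished tool. It assumes the
pipe-delimited file produced by the sacct command above, with a header row
containing NodeList and State columns. The function names (expand_nodelist,
jobs_per_node) are made up for illustration, and the node-list expansion only
handles one bracket group per name; anything more complex is better handed to
"scontrol show hostnames". A job that spans several nodes is counted once on
each of them.

```python
import csv
import re
from collections import Counter, defaultdict


def expand_nodelist(nodelist):
    """Expand a hostlist like 'n[01-03,07],gpu1' into individual node names.

    Simplified: supports at most one bracket group per comma-separated name.
    """
    # Split on commas that are not inside a bracket group.
    parts, depth, current = [], 0, ""
    for ch in nodelist:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    if current:
        parts.append(current)

    nodes = []
    for part in parts:
        m = re.fullmatch(r"([^\[]*)\[([^\]]+)\](.*)", part)
        if not m:
            nodes.append(part)
            continue
        prefix, body, suffix = m.groups()
        for item in body.split(","):
            if "-" in item:
                lo, hi = item.split("-")
                width = len(lo)  # keep zero padding, e.g. 01..03
                for i in range(int(lo), int(hi) + 1):
                    nodes.append(f"{prefix}{i:0{width}d}{suffix}")
            else:
                nodes.append(f"{prefix}{item}{suffix}")
    return nodes


def jobs_per_node(lines):
    """Return {node: Counter({state: job_count})} from 'sacct -p' output."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(lines, delimiter="|"):
        # Header names are matched case-insensitively; the trailing '|'
        # produces an empty column name, which is dropped here.
        row = {k.lower(): v for k, v in row.items() if k}
        nodelist = row.get("nodelist") or ""
        # 'CANCELLED by <uid>' is reduced to 'CANCELLED'.
        state = (row.get("state") or "").split(" ")[0]
        if not state or nodelist in ("", "None assigned"):
            continue  # job never got nodes
        for node in expand_nodelist(nodelist):
            counts[node][state] += 1
    return counts


if __name__ == "__main__":
    # File name taken from the sacct command above.
    with open("sacct-parsable-2019.txt", newline="") as fh:
        result = jobs_per_node(fh)
    for node in sorted(result):
        summary = ", ".join(f"{s}={n}" for s, n in sorted(result[node].items()))
        print(f"{node}: total={sum(result[node].values())} {summary}")
```

Note that a job name containing a "|" character would shift the columns in
that row, so the counts are only as clean as the dump itself.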

Thanks,
Will


