Hi,
My cluster has 2 nodes: the first has 2 GPUs and the second has 1 GPU.
Both nodes are in the "drained" state, with the reason "gres/gpu count reported
lower than configured". Any idea why this happens? Thanks.
My .conf files are:
slurm.conf
AccountingStorageTRES=gres/gpu
GresTypes=gpu
NodeName=t
Files generated by the slurmdbd archive are read back into the live
database by sacctmgr. See:
archive load
Load in to the database previously archived data. The archive file will
not be loaded if the records already exist in the database - therefore,
trying to load an archive file mor
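
A minimal sketch of the load step described above, assuming the file= option
of "sacctmgr archive load". The archive path is a hypothetical placeholder,
and the DRY_RUN switch and load_archive helper are only for illustration, not
part of Slurm:

```shell
#!/bin/sh
# Hypothetical path: substitute an archive file actually written by slurmdbd.
ARCHIVE_FILE="${ARCHIVE_FILE:-/path/to/slurm_archive_file}"
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
DRY_RUN="${DRY_RUN:-1}"

load_archive() {
    if [ "$DRY_RUN" = "1" ]; then
        # Show what would be run without touching the database.
        echo "sacctmgr archive load file=$1"
    else
        # Load the archived records back into the slurmdbd database.
        sacctmgr archive load file="$1"
    fi
}

load_archive "$ARCHIVE_FILE"
```

Run it with DRY_RUN=0 on a host that can reach slurmdbd to perform the load.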
Hello
Our slurmdbd database is getting rather large and is affecting performance, but
we want to keep usage data around for a few years for metrics, so that we can
figure out how our users work. I read a suggestion to have a backup DB which
has all the usage data synced to it for metric pur