Still stuck with this; maybe this gives someone an idea. I tried resetting RawUsage by forcing slurm to regenerate the assoc_usage state file, and although the file was regenerated, RawUsage is now stuck at 0 for all users. This makes me think there is a communication problem with slurmdbd (which, by the way, still reports sensible numbers through sreport)? I also tried changing the IP address as suggested in this related thread (https://lists.schedmd.com/pipermail/slurm-users/2020-March/005051.html), but that changed nothing. Both slurmctld and slurmdbd have been restarted. Any ideas?
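For reference, these are roughly the steps I followed; the state file lives under whatever StateSaveLocation points to in slurm.conf (assumed here to be /var/spool/slurmctld), so adjust the path for your setup:

    # stop slurmctld so it does not rewrite the state file while we move it
    systemctl stop slurmctld
    # move the association usage state aside so slurmctld rebuilds it from slurmdbd
    mv /var/spool/slurmctld/assoc_usage /var/spool/slurmctld/assoc_usage.bak
    systemctl start slurmctld
    systemctl restart slurmdbd
    # compare what the scheduler thinks (RawUsage) with what the database reports
    sshare -a -l
    sreport cluster AccountUtilizationByUser start=2020-06-01 -t hours

After this, sshare shows RawUsage 0 for every association, while the sreport numbers still look plausible.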
On Thu, Aug 20, 2020 at 10:36, Stephan Schott (<schot...@hhu.de>) wrote:

> Hi fellow Slurm users,
> We are facing the following issue with Slurm 18.08 on Ubuntu Bionic (a
> Qlustar cluster). For the last 2+ months, one of our users has been using
> the queue intensively, using array jobs to handle his work. The problem
> is that for some reason his RawUsage hasn't increased (it is in fact
> close to 0), and hence his Fairshare factor is mistakenly high. The
> curious thing is that the usage reported in sreport fits quite well with
> what we have seen in the last weeks.
> What can cause this kind of discrepancy? All users are configured in the
> same way, and are using more or less the same partitions. The only
> difference I saw was the use of array jobs instead of normal batch jobs,
> but I have no idea why that would cause differences; we are now running
> some tests to check if that is actually the case.
> Any ideas are welcome,
>
> --
> Stephan Schott Verdugo
> Biochemist
>
> Heinrich-Heine-Universitaet Duesseldorf
> Institut fuer Pharm. und Med. Chemie
> Universitaetsstr. 1
> 40225 Duesseldorf
> Germany

--
Stephan Schott Verdugo
Biochemist

Heinrich-Heine-Universitaet Duesseldorf
Institut fuer Pharm. und Med. Chemie
Universitaetsstr. 1
40225 Duesseldorf
Germany