After some more digging, this turns out to be the same issue as Bug 4153, which 
was fixed on September 27th, 2017.
If you upgraded to 17.02/17.11 before that date, be sure to check your 
reqmem data.

> On 26.01.2018 at 11:59, Lech Nieroda <lech.nier...@uni-koeln.de> wrote:
> 
> Dear slurm users,
> 
> we ran into a problem after upgrading from slurm 15.08.12 to 17.02.6 back in 
> August 2017: all old jobs that had requested memory with the 'mem-per-cpu' 
> option showed absurd values in the 'reqmem' attribute when queried with 
> sacct.
> The values were somewhere in the petabyte range, whereas they should have 
> been in the gigabyte range.
> 
> An analysis of the issue has shown the following:
> The attribute corresponding to 'reqmem' in the database is 'mem_req' in the 
> 'cheops_job_table' table. It stores both 'mem' and 'mem-per-cpu' values: the 
> 'mem' value is stored directly and the 'mem-per-cpu' value is stored with a 
> flag bit set.
> In slurm 15.08.12 the 'mem_req' attribute is a simple int (32-bit) and the 
> flag is the 32nd bit (2^31).
> In slurm 17.02.6 the 'mem_req' attribute is a bigint (64-bit) and the flag is 
> the 64th bit (2^63).
> The old 'mem-per-cpu' values still carried the flag at 2^31, which 17.02.6 
> reads as part of the number rather than as a flag, so they appeared as 
> petabytes.
> 
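To make the arithmetic concrete, here is a minimal Python sketch of how one raw value reads under the two layouts. It assumes mem_req is counted in megabytes. The job value (4096 MB per CPU) and the helper name decode_mem_req are made up for illustration; only the bit positions come from the description above.

```python
OLD_FLAG = 1 << 31  # per-CPU flag in the 32-bit column (15.08.12)
NEW_FLAG = 1 << 63  # per-CPU flag in the 64-bit column (17.02.6)


def decode_mem_req(raw, flag):
    """Split a raw mem_req value into (amount, is_per_cpu) for a given flag bit."""
    return raw & ~flag, bool(raw & flag)


# Hypothetical old job that asked for 4096 MB per CPU under 15.08.12.
raw = OLD_FLAG | 4096

old_view = decode_mem_req(raw, OLD_FLAG)  # flag recognised, amount is 4096
new_view = decode_mem_req(raw, NEW_FLAG)  # flag missed, 2^31 counted as memory

# Assumption: the unit is MB, so dividing by 1024^3 gives (binary) petabytes.
petabytes = new_view[0] / 1024**3
print(old_view, new_view, round(petabytes, 2))
```

Read with the new flag position, the same number is no longer marked as per-CPU and comes out at roughly 2 PB instead of 4 GB.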
> The uint32_t -> uint64_t change took place with the commit of 2016-06-27, 
> which notes that it requires further "table conversion logic to MySQL, as 
> mem_req column needs to change type to 'bigint unsigned' from 'int 
> unsigned'.".
> I don’t know whether this work was done, but when we upgraded slurm and the 
> database was converted automatically, the values were not corrected and no 
> error was reported concerning this issue. 
> 
> In case you have run into something similar, the fix is simple: we 
> converted the values 'manually', i.e. ran a query that selected all entries 
> with 2^31 <= mem_req < 2^63, made a backup, cleared the 2^31 bit, set the 
> 2^63 bit, then stored and checked the values.
> 
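The per-value part of that conversion can be sketched in Python as follows. This is only an illustration of the bit manipulation described above, not the query we actually ran; the function names needs_conversion and convert_mem_req are hypothetical, and reading and writing the database rows is left out.

```python
OLD_FLAG = 1 << 31  # flag position used by 15.08.12
NEW_FLAG = 1 << 63  # flag position used by 17.02.6


def needs_conversion(mem_req):
    """True for values in the affected range 2^31 <= mem_req < 2^63."""
    return OLD_FLAG <= mem_req < NEW_FLAG


def convert_mem_req(mem_req):
    """Clear the 2^31 bit and set the 2^63 bit; leave other values untouched."""
    if not needs_conversion(mem_req):
        return mem_req
    return (mem_req & ~OLD_FLAG) | NEW_FLAG


# Hypothetical sample values: an affected per-CPU entry, a plain 'mem' entry,
# and an entry that already carries the new flag.
samples = [OLD_FLAG | 4096, 8192, NEW_FLAG | 4096]
converted = [convert_mem_req(v) for v in samples]
```

Applying the function a second time changes nothing, since converted values fall outside the selected range. Keeping the backup mentioned above is still advisable before touching the table.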
> 
> Regards,
> Lech
> 
> --
> Dipl.-Wirt.-Inf. Lech Nieroda
> Regionales Rechenzentrum der Universität zu Köln (RRZK)
> 


Regards,
Lech

--
Dipl.-Wirt.-Inf. Lech Nieroda
Regionales Rechenzentrum der Universität zu Köln (RRZK)





