On 20/2/20 2:16 pm, Nathan R Crawford wrote:
> I interpret this as, in general, changing SelectType will nuke existing jobs, but that since cons_tres uses the same state format as cons_res, it should work.
We got caught by exactly this on our GPU nodes (though it was fixed before I got to see what was going on). It seems that the format of the RPCs changes when you go from cons_res to cons_tres, and we had issues until we restarted slurmd on the compute nodes as well.
My memory is that this stopped new jobs from starting at all. I'm not sure what the consequences were for running jobs (though I suspect it would not have been great for them).
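
For reference, the change being discussed would look roughly like the fragment below in slurm.conf. This is only a sketch and not our actual config: the SelectTypeParameters value and the systemd unit names mentioned in the comments are assumptions, so adjust them for your site.

```
# slurm.conf (sketch; values other than cons_tres are assumed)
# Before: SelectType=select/cons_res
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory   # assumed; keep whatever you already use

# After distributing the file, restart the daemons rather than
# relying on a reconfigure, assuming systemd units with these names:
#   on the controller:        systemctl restart slurmctld
#   on every compute node:    systemctl restart slurmd
# Then check what the controller reports:
#   scontrol show config | grep -i SelectType
```

The point from our experience is the slurmd restart on the compute nodes: restarting only slurmctld was not enough for us.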
If Doug sees this he may remember the details (he caught and fixed it).

All the best,
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA