Hi Chris,

If it just requires restarting slurmctld and the slurmd processes on the nodes, I will be happy! Can you confirm that no running or pending jobs were lost in the transition?
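For what it's worth, one way to check for lost jobs would be to save the list of job IDs before the change and compare it with the list afterwards. Below is a minimal sketch, assuming each snapshot is plain text with one job ID per line (for example, captured with something like `squeue -h -o %i`). The function names `parse_job_ids` and `missing_jobs` and the sample IDs are hypothetical, made up for illustration.

```python
from __future__ import annotations


def parse_job_ids(snapshot: str) -> set[str]:
    """Collect job IDs from text that holds one job ID per line."""
    return {line.strip() for line in snapshot.splitlines() if line.strip()}


def missing_jobs(before: str, after: str) -> set[str]:
    """Return IDs present in the 'before' snapshot but absent from 'after'."""
    return parse_job_ids(before) - parse_job_ids(after)


# Hypothetical snapshots, standing in for saved queue listings.
before_snapshot = "101\n102\n103\n"
after_snapshot = "101\n103\n"

lost = missing_jobs(before_snapshot, after_snapshot)
print(sorted(lost))
```

A job that finished normally between the two snapshots would also show up as missing, so any ID reported here would still need to be checked against the accounting records before calling it lost.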
Thanks,
Nate

On Thu, Feb 20, 2020 at 6:54 PM Chris Samuel <ch...@csamuel.org> wrote:

> On 20/2/20 2:16 pm, Nathan R Crawford wrote:
>
> > I interpret this as, in general, changing SelectType will nuke
> > existing jobs, but that since cons_tres uses the same state format as
> > cons_res, it should work.
>
> We got caught with just this on our GPU nodes (though it was fixed
> before I got to see what was going on) - it seems that the format of the
> RPCs changes when you go from cons_res to cons_tres and we were having
> issues until we restarted slurmd on the compute nodes as well.
>
> My memory is that this was causing issues for starting new jobs (in a
> failing completely type of manner), I'm not sure what the consequences
> were for running jobs (though I suspect it would not have been great for
> them).
>
> If Doug sees this he may remember this (he caught and fixed it).
>
> All the best,
> Chris
> --
> Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
>

--
Dr. Nathan Crawford
nathan.crawf...@uci.edu
Director of Scientific Computing
School of Physical Sciences
164 Rowland Hall
Office: 2101 Natural Sciences II
University of California, Irvine
Phone: 949-824-4508
Irvine, CA 92697-2025, USA