No other errors in the logs. slurm.conf is identical on all nodes and the controller. Only the node with GPUs has a gres.conf (with the single line Autodetect=nvml).
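For anyone wanting to repeat the "identical files everywhere" check, one option is to compare checksums. This is only a minimal sketch, assuming a copy of each host's slurm.conf has already been collected into a local directory. The function names and the example paths are hypothetical and are not part of Slurm.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def group_by_digest(paths):
    """Map each digest to the list of paths whose contents share it."""
    groups = {}
    for p in paths:
        groups.setdefault(file_digest(p), []).append(str(p))
    return groups


def all_identical(paths):
    """True when every file in paths has byte-identical contents."""
    return len(group_by_digest(paths)) <= 1


if __name__ == "__main__":
    # Hypothetical layout: one collected copy of slurm.conf per host.
    copies = sorted(Path("collected").glob("*/slurm.conf"))
    for digest, members in group_by_digest(copies).items():
        print(digest[:12], members)
```

More than one digest in the output means at least one host has a different file. The comparison is byte-for-byte, so even whitespace or comment differences will show up as a mismatch.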
I got this error to stop by removing Gres=gpu:gp100:2 from the NodeName line in slurm.conf on both the controller and the node, and by removing the gres.conf from the node.

On Mon, Feb 10, 2020 at 11:41 PM Chris Samuel <ch...@csamuel.org> wrote:

> On Monday, 10 February 2020 12:11:30 PM PST Dean Schulze wrote:
>
> > With this configuration I get this message every second in my slurmctld.log
> > file:
> >
> > error: _slurm_rpc_node_registration node=slurmnode1: Invalid argument
>
> What other errors are in the logs?
>
> Could you check that you've got identical slurm.conf and gres.conf files
> everywhere?
>
> All the best,
> Chris
> --
> Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA