Sorry, I meant to say: our Slurm node-health script pushed the node into the failed state. Slurm itself wasn't doing this.
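For anyone curious, the idea is roughly the following (a minimal sketch, not our actual script; the threshold, filename, and Reason string are illustrative). Slurm can run such a script via HealthCheckProgram in slurm.conf, and draining the node explicitly with scontrol makes it show up in sinfo -R instead of just spamming the slurmctld log:

```shell
#!/bin/sh
# Hypothetical node health check (illustrative only).
# Could be wired up in slurm.conf as, e.g.:
#   HealthCheckProgram=/etc/slurm/nodehealth.sh
#   HealthCheckInterval=300

# Compare an expected memory size against what the kernel reports,
# and print OK or DRAIN accordingly.
check_mem() {  # args: expected_mb actual_mb
    if [ "$2" -lt "$1" ]; then
        echo "DRAIN: memory ${2}MB < expected ${1}MB"
    else
        echo "OK: memory ${2}MB"
    fi
}

EXPECTED_MEM_MB=${EXPECTED_MEM_MB:-1}   # assumed per-cluster value, set via env here
ACTUAL_MEM_MB=$(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo)

result=$(check_mem "$EXPECTED_MEM_MB" "$ACTUAL_MEM_MB")
echo "$result"

case "$result" in
  DRAIN*)
    # Drain the node explicitly so bad hardware is visible in sinfo -R,
    # rather than relying on slurmd registration to fail.
    command -v scontrol >/dev/null 2>&1 &&
        scontrol update NodeName="$(hostname -s)" State=DRAIN \
            Reason="nodehealth: low memory" || :
    ;;
esac
```

The same pattern extends to GPU counts or any other hardware check: the script decides, and scontrol records the reason where operators actually look.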
On Wed, Aug 26, 2020 at 11:02 AM Michael Di Domenico <mdidomeni...@gmail.com> wrote:
>
> i just upgraded from v18 to v20. Did something change in the node
> config validation? it used to be that if i started slurm on a compute
> node that had lower than expected memory or was missing gpu's, slurm
> would push a node into a failed state that i could see in sinfo -R.
> now it seems to be logging "slurm_rpc_node_registration invalid
> argument" every second in the slurmctld log file for each node
> that's broken
>
> Is there some function that got disabled/changed? i use slurm to
> ferret out bad hardware, but logging to the log file every second
> seems silly, and since i don't routinely watch the log files things
> will go unnoticed