In addition, you can check why the nodes were set to drain with `scontrol show node <your node name> | grep Reason`. The same information should also appear in the Slurm controller logs (e.g. /var/log/slurm/slurmctld.log).
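
As a rough sketch of that check, something like the following prints only the Reason field for one node. The helper name `drain_reason` and the node name `pi1` are made up for illustration; substitute your own node names.

```shell
# Hypothetical helper: print only the "Reason=..." field for a single node.
# $1 is the node name as it appears in slurm.conf (e.g. "pi1", a placeholder).
drain_reason() {
    scontrol show node "$1" | grep -o 'Reason=.*'
}

# Usage (placeholder node name):
#   drain_reason pi1
#
# To list the reason for every drained or down node at once:
#   sinfo -R
```

`sinfo -R` can be quicker when several nodes are drained, since it shows the reason, the user who set it, and the timestamp for each affected node.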

Colas

On 2019-04-15 18:03, Andy Riebs wrote:
The "invalid user id" message suggests that you need to be running as root (or possibly as the slurm user?) to update the node state.

Run "slurmd -Dvv" as root on one of the compute nodes and it will show you what it thinks is the socket/core/thread configuration.
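
A minimal sketch of how those two steps might fit together, assuming the node is called `pi1` (a placeholder) and that you are root on the relevant machines. The helper names are invented for illustration. `slurmd -C` prints the hardware that slurmd detects in slurm.conf syntax, which is handy for comparing against the node definition you already have.

```shell
# Hypothetical helper: print the node definition slurmd detects on this
# machine, in slurm.conf syntax. Run it on the compute node itself.
detected_node_line() {
    slurmd -C | grep '^NodeName='
}

# Hypothetical helper: return a drained node to service.
# Run as root (or possibly as the Slurm user) once slurm.conf agrees
# with the detected hardware and the daemons have been restarted.
resume_node() {
    scontrol update NodeName="$1" State=RESUME
}

# Usage (placeholder node name):
#   detected_node_line     # on the compute node
#   resume_node pi1        # as root, after fixing slurm.conf
```

If the node drains again right after being resumed, the Reason field should say which resource still does not match the configuration.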

------------------------------------------------------------------------
*From:* Shihanjian Wang <swan...@ucsc.edu>
*Sent:* Monday, April 15, 2019 5:30PM
*To:* Slurm-users <slurm-users@lists.schedmd.com>
*Cc:*
*Subject:* [slurm-users] Scontrol update: invalid user id
Hi,
We are doing a senior project involving the creation of a Pi Cluster.  We are using 7 Raspberry Pi B+'s in this cluster.

When we use sinfo to look at the status of the nodes, they appear as drained.  We also encountered a problem while trying to update the state of the nodes.  When trying to use scontrol to update the nodes, we get an error message: scontrol update: invalid user id.  We think another reason that the nodes are drained is low "resources". This has to do with the low socket*core*thread count, which is the number of CPUs.  We have tried changing this number in the configuration file, but the same reason still shows.

We are unsure what the problem is regarding this issue. The authentication method used is munge, and we think that slurm is indeed using munge as the authentication type.

If more information is needed, please let us know and we will provide the required information.

Thanks.

