The fact that sinfo is responding shows that at least slurmctld is
running.  slurmd, on the other hand, is not.  Please also send the
output of the slurmd log, or of running "slurmd -Dvvv"
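For reference, on a Debian slurm-llnl setup those checks would look roughly like this (the log path is the packaged default mentioned later in this thread; adjust it if your SlurmdLogFile setting differs):

```shell
# Run slurmd in the foreground with verbose debugging
# (-D = don't daemonize; each extra v raises the log level; Ctrl-C to stop):
slurmd -Dvvv

# Or inspect the last entries of the compute node's log file
# (default location for the Debian slurm-llnl packages):
tail -n 50 /var/log/slurm-llnl/slurmd.log
```

Run this on one of the affected compute nodes (node01-node08), not on the controller.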

On Jan 15, 2018 06:42, "Elisabetta Falivene" <e.faliv...@ilabroma.com>
wrote:

> > Anyway I suggest to update the operating system to stretch and fix your
> > configuration under a more recent version of slurm.
>
> I think I'll get to that soon :)
> b
>
> 2018-01-15 14:08 GMT+01:00 Gennaro Oliva <oliv...@na.icar.cnr.it>:
>
>> Hi Elisabetta,
>>
>> On Mon, Jan 15, 2018 at 01:13:27PM +0100, Elisabetta Falivene wrote:
>> > The error messages are not helping me much in figuring out what is
>> > going on. What should I check to find out what is failing?
>>
>> check slurmctld.log and slurmd.log, you can find them under
>> /var/log/slurm-llnl
>>
>> > PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
>> > batch        up   infinite      8   unk* node[01-08]
>> >
>> >
>> > Running
>> > systemctl status slurmctld.service
>> >
>> > returns
>> >
>> > slurmctld.service - Slurm controller daemon
>> >    Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled)
>> >    Active: failed (Result: timeout) since Mon 2018-01-15 13:03:39 CET; 41s ago
>> >   Process: 2098 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
>> >
>> > slurmctld[2100]: cons_res: select_p_reconfigure
>> > slurmctld[2100]: cons_res: select_p_node_init
>> > slurmctld[2100]: cons_res: preparing for 1 partitions
>> > slurmctld[2100]: Running as primary controller
>> > slurmctld[2100]: SchedulerParameters=default_queue_depth=100,max_rpc_cnt=0,max_sched_time=4,partition_job_depth=0
>> > slurmctld.service start operation timed out. Terminating.
>> > Terminate signal (SIGINT or SIGTERM) received
>> > slurmctld[2100]: Saving all slurm state
>> > Failed to start Slurm controller daemon.
>> > Unit slurmctld.service entered failed state.
>>
>> Do you have a backup controller?
>> Check your slurm.conf under:
>> /etc/slurm-llnl
>>
>> Anyway I suggest to update the operating system to stretch and fix your
>> configuration under a more recent version of slurm.
>> Best regards
>> --
>> Gennaro Oliva
>>
>>
>
