Re: [slurm-users] Slurm Node Unresponsive

2020-09-08 Thread Doug Meyer
Hi, does scontrol ping from the node show the slurm server up? If so, munge is fine. I'm betting it is not this, but it is such an easy check. Ensure you have the same slurm.conf on master and client. The fact that you can restart slurmd and all is well is really odd. Suggests slurm is coming up too so

[slurm-users] Slurm Node Unresponsive

2020-09-08 Thread Grant Campbell
Hey, I am running a Slurm cluster that I inherited from an employee who left, so you will have to forgive any ignorance on my part; I am still coming up to speed on some core concepts. I have a vexing issue where one Slurm node consistently becomes unresponsive. Network and DNS seem to be working

Re: [slurm-users] Slurmctld and log file

2020-09-08 Thread Brian Andrus
This seems to imply you had some changes in your slurm.conf. I'm presuming you are running CentOS 7 or such. Do you see anything when you do 'journalctl -u slurmctld'? I'm wondering if you were only logging to the journal and then added the bits to also/instead log to a separate file. I do bot

Re: [slurm-users] Slurmctld and log file

2020-09-08 Thread Gestió Servidors
Hello, My slurm logrotate file looks like this: > /var/log/slurm/*.log { > weekly > compress > missingok > nocopytruncate > nocreate > nodelaycompress > nomail > notifempty > noolddir > rotate 5 > sharedscripts > size=5M > create

Re: [slurm-users] Slurmctld and log file

2020-09-08 Thread Steffen Grunewald
On Tue, 2020-09-08 at 09:39:09 +, Gestió Servidors wrote: > Hello, > > I don't know why, but my SLURM server (that is running fine) has its > slurmdctl.log file with size 0 bytes... so... where is it writing logs? It > seems that log file has 0 bytes from logrotate process during today's early

Re: [slurm-users] Slurmctld and log file

2020-09-08 Thread Timo Rothenpieler
My slurm logrotate file looks like this: /var/log/slurm/*.log { weekly compress missingok nocopytruncate nocreate nodelaycompress nomail notifempty noolddir rotate 5 sharedscripts size=5M create 640 slurm slurm postrotate systemctl

[slurm-users] Slurmctld and log file

2020-09-08 Thread Gestió Servidors
Hello, I don't know why, but my SLURM server (that is running fine) has its slurmdctl.log file with size 0 bytes... so... where is it writing logs? It seems that log file has 0 bytes from logrotate process during today's early morning. My logrotate SLURM conf is this: [root@server logrotate.d]# c
