Hi,
As an update, I was able to clear out the orphaned/cancelled jobs by
rebooting the compute nodes that had cancelled jobs. The error messages
have ceased.
Regards,
Jeff
On Wed, Dec 6, 2023 at 8:26 AM Jeffrey McDonald wrote:
Hi,
Yesterday, an upgrade to slurm from 22.05.4 to 23.11.0 went sideways and I
ended up losing a number of jobs on the compute nodes. Ultimately, the
installation seems to be successful, but it appears I now have some issues
with job remnants. About once per minute (per job), the slurmctld
daem