Hi list,

I have built a small cluster and attached a few clients to it. The clients can submit jobs, so I am confident that the service is set up properly.
What I would like to do is deploy the Slurm client in a Docker container. Inside the container I have set up munge and can run 'sinfo' successfully. However, 'scontrol ping' reports that the master node is down, and any attempt to srun an interactive shell (srun --pty bash -i) eventually fails. When I run srun, the master node does register the job in the queue and even allocates (and launches) a new machine for it to run on.

Has anyone had any success running Slurm clients in a Dockerised environment? I ran some tests and I think the problem is with the Docker firewall. Do I need to configure Docker to forward certain ports?

My plan is to deploy a graphical environment in each container so that each user has their own desktop, from which they can schedule jobs and so on. If I do have to forward certain ports, it is not clear to me how I could achieve this with multiple users.

Thanks in advance,
Jake
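
P.S. For anyone who wants to reproduce the connectivity side of this, here is a minimal sketch of a reachability check that can be run from inside the container. The hostname "slurm-master" is a placeholder, not my real setup, and 6817 is only Slurm's default SlurmctldPort, so adjust both to match slurm.conf. It tests the container-to-controller direction only and says nothing about connections coming back into the container.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (hypothetical hostname; 6817 is the default SlurmctldPort):
#   print(port_open("slurm-master", 6817))
```

If this returns True while srun still hangs, the outbound path is probably fine and the traffic being blocked is more likely inbound to the container.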