slurmd runs as root. I did not modify the installations, and I used a SLURM config file created on the SLURM website. The same setup works for me on Debian and (as I tried recently) on Ubuntu Intrepid Ibex alpha as well.
Here are some details; this bug is specific to Ubuntu Hardy Heron.

# scontrol show config | grep SlurmdLog
SlurmdLogFile = (null)

even though the logfile exists:

# ls /var/run/slurm-llnl/
slurmd slurmd.log slurmd.pid

Here is the SlurmdLog of a failed job:

[Oct 23 11:43:11] setup for a batch_job
[Oct 23 11:43:11] entering batch_job_create
[Oct 23 11:43:11] [412] Message thread started pid = 21069
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] debug3: _rpc_batch_job: return from _forkexec_slurmstepd
[Oct 23 11:43:11] [412] Entered job_manager for 412.4294967294 pid=21069
[Oct 23 11:43:11] [412] alloc LLLP
[Oct 23 11:43:11] [412] task affinity plugin loaded
[Oct 23 11:43:11] [412] mpi type = (null)
[Oct 23 11:43:11] [412] Entering _setup_normal_io
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] [412] Uncached user/gid: gs/100
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] [412] stdin file name = /dev/null
[Oct 23 11:43:11] [412] stdout file name = /home/gs/calc/tubes/cnt-4.0/tt1/slurm-412.out
[Oct 23 11:43:11] [412] stderr file name = /home/gs/calc/tubes/cnt-4.0/tt1/slurm-412.out
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] [412] Leaving _setup_normal_io
[Oct 23 11:43:11] [412] debug level = 2
[Oct 23 11:43:11] [412] Before call to spank_init()
[Oct 23 11:43:11] [412] spank: opening plugin stack /etc/slurm-llnl/plugstack.conf
[Oct 23 11:43:11] [412] After call to spank_init()
[Oct 23 11:43:11] [412] num tasks on this node = 1
[Oct 23 11:43:11] [412] New fdpair[0] = 12, fdpair[1] = 13
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] [412] Uncached user/gid: gs/100
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_CPU in environment
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_FSIZE in environment
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_DATA in environment
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_STACK in environment
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_CORE in environment
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_RSS in environment
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_NPROC in environment
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_NOFILE in environment
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_MEMLOCK in environment
[Oct 23 11:43:11] [412] Couldn't find SLURM_RLIMIT_AS in environment
[Oct 23 11:43:11] [412] task 0 (21074) started Oct 23 11:43:11
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] [412] Unblocking 412.4294967294 task 0, writefd = 13
[Oct 23 11:43:11] [412] affinity task_pre_launch: 412.4294967294, task 0
[Oct 23 11:43:11] [412] Using sched_affinity for tasks
[Oct 23 11:43:11] [412] execve(): /var/run/slurm-llnl/slurmd/job00412/script: Permission denied
[Oct 23 11:43:11] [412] task 0 (21074) exited status 0x0d00 Oct 23 11:43:11
[Oct 23 11:43:11] [412] affinity task_post_term: 412.4294967294, task 0
[Oct 23 11:43:11] [412] Aggregated 1 task exit messages
[Oct 23 11:43:11] [412] sending task exit msg for 1 tasks
[Oct 23 11:43:11] [412] Before call to spank_fini()
[Oct 23 11:43:11] [412] After call to spank_fini()
[Oct 23 11:43:11] [412] job 412 completed with slurm_rc = 0, job_rc = 3328
[Oct 23 11:43:11] [412] sending REQUEST_COMPLETE_BATCH_SCRIPT
[Oct 23 11:43:11] [412] auth plugin for Munge (Chris Dunlap, LLNL) loaded
[Oct 23 11:43:11] [412] eio: handling events for 1 objects
[Oct 23 11:43:11] [412] Called _msg_socket_readable
[Oct 23 11:43:11] [412] false, shutdown
[Oct 23 11:43:11] [412] Message thread exited
[Oct 23 11:43:11] [412] done with job
[Oct 23 11:43:11] debug3: in the service_connection
[Oct 23 11:43:11] debug2: got this type of message 6010
[Oct 23 11:43:11] debug2: Processing RPC: REQUEST_TERMINATE_JOB
[Oct 23 11:43:11] debug: _rpc_terminate_job, uid = 64030
[Oct 23 11:43:11] debug: task_slurmd_release_resources: 412
[Oct 23 11:43:11] debug3: release LLLP job [412.*]
[Oct 23 11:43:11] debug3: job state 412: ctime:081023114311 expires:691231160000
[Oct 23 11:43:11] debug: credential for job 412 revoked
[Oct 23 11:43:11] debug2: No steps in jobid 412 to send signal 18
[Oct 23 11:43:11] debug2: No steps in jobid 412 to send signal 15
[Oct 23 11:43:11] debug4: sent ALREADY_COMPLETE
[Oct 23 11:43:11] debug3: job state 412: ctime:081023114311 revoked:081023114311 expires:081023114311
[Oct 23 11:43:11] debug2: set revoke expiration for jobid 412 to 081023115311
[Oct 23 11:45:06] debug3: in the service_connection
[Oct 23 11:45:06] debug: _slurm_recv_timeout at 0 of 4, recv zero bytes
[Oct 23 11:45:06] error: slurm_receive_msg_and_forward: Zero Bytes were transmitted or received
[Oct 23 11:45:06] error: service_connection: slurm_receive_msg: Zero Bytes were transmitted or received
[Oct 23 11:45:06] debug2: _slurm_send_timeout: Socket no longer there.
[Oct 23 11:45:06] error: slurm_msg_sendto: Transport endpoint is not connected
[Oct 23 11:46:18] debug3: in the service_connection
[Oct 23 11:46:18] debug2: got this type of message 1008

--
slurm sbatch command fails
https://bugs.launchpad.net/bugs/271518
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
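
A note on reading the log above: the task exit status 0x0d00 and job_rc = 3328 are the same number, and decoded as a wait()-style status it gives exit code 13, which on Linux is the errno value for "Permission denied" (EACCES), matching the execve() failure. The short Python sketch below only does this arithmetic. The helper name decode_wait_status is made up for illustration, and the idea that the exit code carries the errno from the failed execve() is my inference from the log, not something confirmed from the slurm sources.

```python
import errno
import os

# Values copied from the slurmd log above.
raw_status = 0x0d00  # "task 0 (21074) exited status 0x0d00"
job_rc = 3328        # "job 412 completed with slurm_rc = 0, job_rc = 3328"


def decode_wait_status(status):
    """Split a wait()-style status word into (exit_code, signal_number).

    Hypothetical helper for illustration: the high byte holds the exit
    code, the low 7 bits hold the terminating signal (0 if none).
    """
    exit_code = (status >> 8) & 0xFF
    signal_number = status & 0x7F
    return exit_code, signal_number


exit_code, signal_number = decode_wait_status(raw_status)

# Both numbers in the log are the same status word.
print(raw_status == job_rc)

# Exit code and signal extracted from the status word.
print(exit_code, signal_number)

# Symbolic errno name and message for that exit code on this platform.
print(errno.errorcode.get(exit_code), os.strerror(exit_code))
```

So the batch script itself never started; the failure happens at the execve() of /var/run/slurm-llnl/slurmd/job00412/script.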