I will admit that I have not used sbcast, but from reading the man page I don't 
think it does what you hope.

 

The sbcast command will indeed run on the first allocated node, so the source 
file must be accessible from there.  The man page does say that shared file 
systems are a better solution than sbcast, and I think that is the clue: sbcast 
is designed for clusters where there is no shared file system available across 
all the nodes.  If there were such a shared file system, the main part of the 
script (which executes on every node, including the first) could simply copy 
any file from shared storage to local storage.  That pattern is common where 
the shared storage is e.g. NFS, which may not be very fast, so jobs copy files 
onto fast local storage on each node instead.
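To illustrate the sbcast case, here is a minimal batch script sketch; the 
binary name, node count and local path are placeholders, and I have not tested 
this myself:

```shell
#!/bin/bash
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1

# This batch script runs on the first allocated node, where the source
# binary must be readable.  sbcast copies it from there to node-local
# storage on every node allocated to this job.
sbcast my_binary /tmp/my_binary

# Every node now has a local copy, so launch it with srun.
srun /tmp/my_binary
```

The key point is that sbcast is called once, from the batch script, and itself 
fans the file out to all nodes in the allocation; there is no need to run it 
under srun.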

 

It sounds like sbcast also has some useful options for copying different files 
to different subsets of allocated nodes; there is an example for heterogeneous 
jobs: https://slurm.schedmd.com/heterogeneous_jobs.html

 

Hopefully someone else has actually used it and can give you an idea of how to 
do what you need.

 

William

 

From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Hector 
Yuen
Sent: 18 April 2020 09:06
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Correct way to do sbcast with sbatch

 

Hello, I have a very basic question about using sbcast.

 

I start from the master node (the one running slurmctld). And have my binary 
there.

 

When I submit a job, I do this through sbatch. The first command I want to run 
is an sbcast. But by then I am already inside one of the nodes allocated to 
the job, which doesn't have the binary in the first place.

 

What is the correct way to use sbcast inside sbatch?

 

Thanks


 

-- 

-h
