Re: [slurm-users] Slurm and shared file systems

2020-06-19 Thread Alex Chekholko
Hi David, There are several approaches to having a shared filesystem namespace without an actual shared filesystem. One issue you will have to contend with is how to handle any kind of filesystem caching (how much room to allocate for local cache, how to handle cache inconsistencies). Examples: gcs

Re: [slurm-users] Slurm and shared file systems

2020-06-19 Thread Brian Andrus
It sounds like you are asking if there should be a shared /home, which you do not need. You do need to ensure a user can access the environment for the node (a home directory, ssh keys, etc.). If you are asking about the job binary and the data it will be processing, again, you do not. You cou

Re: [slurm-users] Slurm and shared file systems

2020-06-19 Thread Steven Dick
Condor's original premise was to have long running compute jobs on distributed nodes with no shared filesystem. Of course, they played all kinds of dirty tricks to make this work, including intercepting libc and system calls. I see no reason cleverly wrapped slurm jobs couldn't do the same, either p

Re: [slurm-users] Slurm and shared file systems

2020-06-19 Thread Riebs, Andy
David, I've been using Slurm for nearly 20 years, and while I can imagine some clever work-arounds, like staging your job in /var/tmp on all of the nodes before trying to run it, it's hard to imagine a cluster serving a useful purpose without a shared user file system, whether or not Slurm is i
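
As a rough illustration of the staging work-around mentioned above (copying a job's files to node-local storage before running), here is a minimal Python sketch. It is not taken from the thread: the function name `stage_to_local`, the per-job directory layout, and the file names are hypothetical, and the demo uses a temporary directory in place of /var/tmp. It only shows the copy step on one node; each node in the allocation would have to run something like it before the real work starts.

```python
import os
import shutil
import tempfile
from pathlib import Path


def stage_to_local(src: Path, scratch_root: Path, job_id: str) -> Path:
    """Copy src into a per-job directory under scratch_root; return the copy's path.

    Hypothetical helper: the "job_<id>" layout is an assumption, not a Slurm convention.
    """
    job_dir = scratch_root / f"job_{job_id}"
    job_dir.mkdir(parents=True, exist_ok=True)
    dest = job_dir / src.name
    # copy2 also copies permission bits, so an executable stays executable.
    shutil.copy2(src, dest)
    return dest


if __name__ == "__main__":
    # Slurm sets SLURM_JOB_ID inside a job; fall back to a placeholder elsewhere.
    job_id = os.environ.get("SLURM_JOB_ID", "manual")
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "input.dat"
        src.write_text("example payload\n")
        staged = stage_to_local(src, Path(tmp) / "scratch", job_id)
        print(f"staged copy at {staged}")
```

Keeping each job's files in their own directory makes cleanup after the job a single directory removal, and avoids collisions when two jobs land on the same node.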