Hi David,
There are several approaches to providing a shared filesystem namespace
without an actual shared filesystem. One issue you will have to contend with
is filesystem caching: how much room to allocate for the local cache and how
to handle cache inconsistencies.
examples:
gcs
It sounds like you are asking if there should be a shared /home, which
you do not need. You do need to ensure a user can access the environment
for the node (a home directory, SSH keys, etc.).
If you are asking about the job binary and the data it will be
processing, again, you do not. You cou
Condor's original premise was to have long running compute jobs on
distributed nodes with no shared filesystem.
Of course, they played all kinds of dirty tricks to make this work,
including intercepting libc and system calls.
I see no reason cleverly wrapped Slurm jobs couldn't do the same,
either p
David,
I've been using Slurm for nearly 20 years, and while I can imagine some clever
workarounds, like staging your job in /var/tmp on all of the nodes before
trying to run it, it's hard to imagine a cluster serving a useful purpose
without a shared user file system, whether or not Slurm is i