Bogdan, Igor, thank you very much for your thoughtful answers. I do not have much time today to do your replies the justice of a proper answer.

Regarding the ssh filesystem, the scenario was that I was working for a well-known company. We were running CFD simulations on remote academic HPC setups. There was more than one site! The corporate firewall allowed us an outgoing ssh connection. I found it a lot easier to configure an sshfs mount so that engineers could transfer programs and scripts between their local system and the remote system, rather than using a graphical or a command-line ssh client. The actual large data files were transferred by yours truly, via a USB disk drive.
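For anyone curious, the recipe was roughly the following (the host name, user and paths here are invented for illustration, not the real site):

    # Create a local mount point and mount the remote HPC home directory
    # over the single outgoing ssh connection the firewall allowed.
    mkdir -p ~/remote_hpc
    sshfs engineer@hpc.example.ac.uk:/home/engineer ~/remote_hpc \
        -o reconnect -o ServerAliveInterval=15

    # Engineers could then move programs and scripts with ordinary file tools:
    cp ~/cases/run_case.sh ~/remote_hpc/cases/

    # Unmount when finished:
    fusermount -u ~/remote_hpc

The nice part is that nothing on the engineers' side needs to know about ssh at all; the remote system just looks like another directory.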
I did not know about gitfs (my bad). That sounds interesting.

On Wed, 28 Nov 2018 at 13:09, INKozin <i.n.ko...@googlemail.com> wrote:
>
> On Wed, 28 Nov 2018 at 11:33, Bogdan Costescu <bcoste...@gmail.com> wrote:
>
>> On Mon, Nov 26, 2018 at 4:27 PM John Hearns via Beowulf <
>> beowulf@beowulf.org> wrote:
>>
>>> I have come across this question in a few locations. Being specific, I
>>> am a fan of the Julia language. On the Julia forum a respected developer
>>> recently asked what the options were for keeping code developed on a
>>> laptop in sync with code being deployed on an HPC system.
>>>
>>> I think out loud that many HPC codes depend crucially on a $HOME
>>> directory being present on the compute nodes as the codes look for dot
>>> files etc. in $HOME. I guess this can be dealt with by fake $HOMEs which
>>> again sync back to the Repo.
>>
>> I don't follow you here... $HOME, dot files, repo, syncing back? And why
>> "Repo" with a capital letter, is it supposed to be a name or something
>> special?
>
> I think John is talking here about doing version control on whole HOME
> directories while trying to be mindful of dot files such as .bashrc and
> others which can be application or system specific. The first thing which
> comes to mind is to use branches for different cluster systems. However,
> this also taps into backup (which is another important topic, since HOME
> dirs are not necessarily backed up). There could be a working solution
> which makes use of recursive repos and git lfs support, but pruning old
> history could still be desirable. Git would minimize the amount of
> storage because it's hash based. While this could make it possible to
> replicate your environment "wherever you go", a) you would drag a lot of
> history around and b) a significantly different mindset is required to
> manage the whole thing. A typical HPC user may know git clone but
> generally is not a git adept. Developers are different and, who knows
> John, maybe someone will pick up your idea.
>
> Is gitfs at all popular?
>
>> In my HPC universe, people actually not only need code, but also data -
>> usually LOTS of data. Replicating the code (for scripting languages) or
>> the binaries (for compiled stuff) would be trivial, replicating the data
>> would not. Also pulling the data in or pushing it out (e.g. to/from AWS)
>> on the fly whenever the instance is brought up would be slow and costly.
>> And by the way, this is in no way a new idea - queueing systems have for
>> a long time had the concept of "pre" and "post" job stages, which could
>> be used to pull in code and/or data to the node(s) on which the job
>> would be running and clean up afterwards.
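Igor, to make your branch-per-cluster idea concrete, the kind of thing I imagine is roughly this. A rough sketch only: the remote URL, branch name, ignore patterns and the application are all invented, and the .gitignore would need real care to keep data and caches out of the repo:

    # One-off setup: put the dot files in $HOME under version control,
    # ignoring everything bulky (patterns here are just examples).
    cd ~
    git init
    printf 'scratch/\n.cache/\n*.dat\n' > .gitignore
    git add .gitignore .bashrc
    git commit -m "Baseline dot files"
    git remote add origin git@git.example.org:me/homedir.git
    git push -u origin master

    # One branch per cluster for system-specific settings:
    git checkout -b cluster-a
    # ... edit .bashrc module loads for cluster A, then:
    git commit -am "Environment for cluster A"
    git push -u origin cluster-a

    # On cluster A, populate a fake $HOME from that branch
    # (the fake-$HOME idea from my original post above):
    git clone -b cluster-a git@git.example.org:me/homedir.git ~/fake_home
    HOME=~/fake_home ./my_application

As you say, the hard part is not the mechanics but persuading a typical HPC user to live inside this workflow.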
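And Bogdan, on the pre/post job stages: agreed, that is exactly the shape of it. Something along these lines, a Slurm-flavoured sketch where the bucket, paths and solver are placeholders, and which assumes the awscli is installed and $TMPDIR points at node-local scratch:

    #!/bin/bash
    #SBATCH --job-name=cfd_case
    #SBATCH --nodes=4

    SCRATCH="${TMPDIR:-/tmp}/case42"   # node-local scratch; path illustrative

    # "Pre" stage: pull code and input data onto the node before the run.
    mkdir -p "$SCRATCH"
    aws s3 sync s3://example-bucket/case42/input "$SCRATCH/input"

    # The job proper runs against the staged data.
    srun ./solver "$SCRATCH/input" "$SCRATCH/output"

    # "Post" stage: push results out and clean up afterwards.
    aws s3 sync "$SCRATCH/output" s3://example-bucket/case42/output
    rm -rf "$SCRATCH"

The staging steps are trivial for code; as you point out, it is the volume of data that makes them slow and costly in practice.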