Dear all, dear Chris,

thanks for the detailed explanation. We are currently looking into cloud bursting, so your email was very timely for me, as I am supposed to look into it.

One of the issues I can see with our workload is simply getting data into the cloud and back out again. We are not talking about a few gigabytes here; we are talking about up to 1 TB or more. For reference: we have 9 PB of storage (GPFS), of which we are currently using 7 PB, and there are over 1,000 users connected to the system. So cloud bursting would only be possible in some cases. Do you happen to have a feeling for how to handle these file sizes sensibly?
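Just to illustrate what I mean, here is the rough back-of-the-envelope arithmetic I have been doing. The link speeds and the 60% sustained-efficiency factor below are assumptions for illustration only, not measurements from our site:

    # Rough estimate: how long does it take to stage a data set into
    # (or out of) the cloud at a given sustained throughput?
    # Link speeds and the efficiency factor are assumed, not measured.

    def transfer_hours(data_tb, link_gbps, efficiency=0.6):
        """Hours to move data_tb terabytes over a link_gbps link,
        assuming only `efficiency` of the nominal rate is sustained."""
        data_bits = data_tb * 1e12 * 8                 # TB -> bits (decimal units)
        seconds = data_bits / (link_gbps * 1e9 * efficiency)
        return seconds / 3600.0

    for gbps in (1, 10, 100):
        print(f"1 TB over {gbps:>3} Gbit/s: ~{transfer_hours(1, gbps) * 60:.0f} minutes")

    # ~222 min at 1 Gbit/s, ~22 min at 10 Gbit/s, ~2 min at 100 Gbit/s --
    # so the wide-area link, not the storage, decides whether bursting pays off.

On those numbers, the occasional 1 TB staging job looks tolerable over a dedicated 10 Gbit/s path, whereas doing it over a shared 1 Gbit/s link would quickly eat into any gain from bursting.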
Sorry for hijacking the thread here a bit.

All the best from a hot London
Jörg

On Monday, 22 July 2019 at 14:14:13 BST, Chris Dagdigian wrote:
> A lot of production HPC runs on cloud systems.
>
> AWS is big for this via their AWS ParallelCluster stack, which does
> include Lustre support via the FSx for Lustre service, although they are
> careful to caveat it as staging/scratch space not suitable for
> persistent storage. AWS has some cool node types now with 25-, 50- and
> 100-gigabit network support.
>
> Microsoft Azure is doing amazing things now that they have the
> Cycle Computing folks on board, integrated and able to call shots within
> the product space. They actually offer bare-metal HPC and InfiniBand
> SKUs now and have some interesting parallel filesystem offerings as well.
>
> Can't comment on Google as I've not touched or used it professionally,
> but AWS and Azure for sure are real players now to consider if you have
> an HPC requirement.
>
>
> That said, however, a sober cost accounting still shows on-prem or
> "owned" HPC is best from a financial perspective if your workload is
> 24x7x365 constant. Cloud-based HPC is best for capability, bursty
> workloads, temporary workloads, auto-scaling, computing against
> cloud-resident data sets, or the neat new model where, instead of on-prem
> multi-user shared HPC, you go out and decide to deliver individual
> bespoke HPC clusters to each user or team on the cloud.
>
> The big paradigm shift for cloud HPC is that it does not make a lot of
> sense to build a monolithic stack shared by multiple competing users and
> groups. The automated provisioning and elasticity of the cloud make it
> more sensible to build many clusters, so that you can tune each cluster
> specifically for the user or workload and then blow it up when the
> work is done.
>
> My $.02 of course!
>
> Chris
>
>
> Jonathan Aquilina <mailto:jaquil...@eagleeyet.net>
> July 22, 2019 at 1:48 PM
>
> > Hi Guys,
> >
> > I am looking at
> > https://cloud.google.com/blog/products/storage-data-transfer/introducing-lustre-file-system-cloud-deployment-manager-scripts
> >
> > This basically allows you to deploy a Lustre cluster on Google Cloud.
> > In your HPC setups, have you considered moving towards cloud-based
> > clusters?
> >
> > Regards,
> >
> > Jonathan

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf