On 7/25/19 8:26 PM, Jörg Saßmannshausen wrote:
Dear all, dear Chris,

thanks for the detailed explanation. We are currently looking into cloud
bursting, so your email was very timely for me, as I am supposed to look into it.

One of the issues I can see with our workload is simply getting data into the
cloud and back out again. We are not talking about a few gigabytes here; we are
talking about 1 TB or more. For reference: we have 9 PB of storage (GPFS),
of which we are currently using 7 PB, and there are around 1000+ users
connected to the system. So cloud bursting would only be possible in some
cases.
Do you happen to have a feeling for how to handle the issue with the file
sizes sensibly?

The issue is bursting with large data sets.  You might be able to pre-stage some portion of the data set in a public cloud and then burst jobs from there.  Data motion between sites is going to be the hard problem in the mix: not technically hard, but hard from a cost/time perspective.
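
As a rough back-of-envelope sketch of that cost/time trade-off (the link
speed, link efficiency, and per-GB egress price below are my own assumed
numbers, not figures from this thread), something like this Python snippet
gives a first-order estimate:

    #!/usr/bin/env python3
    # Back-of-envelope transfer time/cost estimate for cloud bursting.
    # Assumptions (hypothetical; plug in your own site's numbers):
    # a 10 Gbit/s WAN link at ~70% effective utilization, and roughly
    # $0.09/GB cloud egress pricing on the way back out.

    def transfer_time_hours(data_tb, link_gbps=10.0, efficiency=0.7):
        """Hours to move data_tb terabytes over the given link."""
        bits = data_tb * 1e12 * 8                  # TB -> bits (decimal)
        seconds = bits / (link_gbps * 1e9 * efficiency)
        return seconds / 3600.0

    def egress_cost_usd(data_tb, usd_per_gb=0.09):
        """Cost to pull data_tb terabytes back out of the cloud."""
        return data_tb * 1000.0 * usd_per_gb       # TB -> GB (decimal)

    if __name__ == "__main__":
        for tb in (1, 10, 100):
            print(f"{tb:>4} TB: ~{transfer_time_hours(tb):6.1f} h on the wire, "
                  f"~${egress_cost_usd(tb):8.2f} egress coming back")

At those assumed rates a 1 TB stage-in is well under an hour, but both the
wall-clock time and the egress bill scale linearly with the data set, so at
the multi-TB scale you describe the data motion, not the compute, is the
number to budget for.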


--
Joe Landman
e: joe.land...@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman

