On 24 Nov 2020, at 18:31, Alex Chekholko via Beowulf
<beowulf@beowulf.org> wrote:

If you can run your task on just one computer, you should always do that rather 
than having to build a cluster of some kind and all the associated headaches.


If you take on the cloud message, that of course isn’t necessarily the case.
If you use very high-level cloud services like AWS Lambda, you don’t have to
build that infrastructure.  It’s very unlikely to be anywhere near as
efficient, of course, but throughput efficiency is not what your average
scientist cares about.  What they care about is getting their answer quickly
(and, to a lesser extent, cheaply).

I saw a recent example where someone took a fairly simple sequencing read
alignment process, which normally runs on a single 16-core node in about 6
hours, and split the input files small enough that the alignment code’s
execution time and memory use would fit within AWS Lambda’s envelope.  The
result ran in a couple of minutes of elapsed time, but used about four times
as many core-hours as the optimised single-node version.  Of course, this is
an embarrassingly parallel problem, so it is a relatively easy analysis to
move to this sort of design.
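
As a rough back-of-envelope sketch of those numbers: the 16 cores, 6 hours and
4x core-hours come from the example above, while the 5-minute elapsed figure
and the function names are my own assumptions for illustration.  It deliberately
stops at core-hours, since I don't have a cost figure.

```python
# Back-of-envelope comparison of the single-node run and the Lambda run.
# From the example: 16 cores, ~6 hours, ~4x the core-hours on Lambda.
# Assumption: the Lambda run is taken as 5 minutes of elapsed time.

def core_hours(cores: float, hours: float) -> float:
    """Core-hours consumed by a job using `cores` for `hours`."""
    return cores * hours


def compare(cores: float = 16, node_hours: float = 6.0,
            overhead: float = 4.0, lambda_minutes: float = 5.0) -> dict:
    """Return core-hours, speedup and implied average concurrency."""
    node_ch = core_hours(cores, node_hours)
    lambda_ch = node_ch * overhead  # "about four times as many core-hours"
    return {
        "node_core_hours": node_ch,
        "lambda_core_hours": lambda_ch,
        # Ratio of elapsed times, computed in minutes.
        "speedup": node_hours * 60.0 / lambda_minutes,
        # Core-equivalents that must be busy on average to spend
        # lambda_ch core-hours within lambda_minutes.
        "avg_concurrent_cores": lambda_ch * 60.0 / lambda_minutes,
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.0f}")
```

With those assumptions, the single node uses 96 core-hours and the split
version 384, for a 72x reduction in elapsed time and an average of roughly
4,600 core-equivalents busy at once.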

From the scientist’s point of view, which is better?  Getting their answer in 5
minutes or 6 hours?  Especially if they’ve also reduced their development time
because they don’t have to worry so much about infrastructure and
optimisation.

The total value is hard to work out, because many of these considerations are
hard to put a dollar value on.  When I saw that article, I did ask the author
how much the analysis actually cost, and she didn’t have a number.  But I
don’t think we can dogmatically say that we should always run a task on a
single machine if we can.

Tim



-- 
 The Wellcome Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE.
