On the AWS EC2 side, we've been running a range of tests, including full genome sequencing pipelines, across varying numbers of nodes and amounts of storage. The biggest challenge to date has been IO, particularly if the smaller image systems are used. Where jobs are highly CPU bound and only lightly network (or, heaven forbid, disk) bound, things go reasonably well and have the potential to scale. Once IO becomes a factor, the scaling drops off rapidly...
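
To put a rough shape on the scaling point above, here is a minimal sketch using an Amdahl-style model. It is not taken from the tests described here: the function name `speedup` and the IO shares of 0%, 5% and 25% are purely illustrative assumptions, and the model simply treats the IO share of a job as something that does not get faster when nodes are added.

```python
def speedup(nodes: int, io_fraction: float) -> float:
    """Amdahl-style estimate: the IO share of the runtime stays fixed,
    while the CPU-bound share is divided evenly across the nodes.

    Both the name and the model are illustrative assumptions, not
    measurements from any real EC2 or Xen run.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    cpu_fraction = 1.0 - io_fraction
    return 1.0 / (io_fraction + cpu_fraction / nodes)


if __name__ == "__main__":
    # Hypothetical IO shares, chosen only to show the trend.
    for io_fraction in (0.0, 0.05, 0.25):
        row = ", ".join(
            f"{n} nodes: {speedup(n, io_fraction):.1f}x" for n in (1, 8, 64)
        )
        print(f"IO share {io_fraction:.0%}: {row}")
```

Under this simplified model a purely CPU-bound job scales linearly, while even a modest non-scaling IO share caps the achievable speedup at 1 / io_fraction no matter how many nodes are added. Real shared-storage contention can of course behave worse than this.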

We've also had a run around with Xen. It requires more network fiddling to automate rollouts (at least in our environment), but it works OK, especially when paired with something like openQRM. It's a ways off being as polished as VMware, and some of the interesting memory handling doesn't appear to be all there. As a result, performance degrades fairly severely as the number of hosts and the IO-hungry application load increase. Regrettably, I don't have enough useful data to present at the moment and, as always, YMMV.

Pete
I've been using Amazon EC2 for clustering for months now; from a software
perspective it's very similar to running real hardware.  For my needs
(development) it's perfectly adequate, though I've not benchmarked it against
running the same code on the raw hardware.

Ashley,

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
