On 03/21/2013 08:06 AM, Chris Dagdigian wrote:
> Jonathan Aquilina wrote:
>> It’s not that I need to cluster these VPSes, I was just wondering if it
>> was possible. What puts me off about Amazon is pricing. It seems a bit
>> pricey, so to speak.
>>
> MIT StarCluster (the open source stack that builds Grid Engine clusters
> on Amazon, mentioned elsewhere in this thread) is able to leverage the
> AWS Spot Market, and the potential savings off the hourly EC2 rate are
> pretty enormous. Via Spot you can run servers for pennies an hour that
> traditionally sell for dollars per hour on the "normal" EC2 on-demand
> service. It would be very hard to beat that price on an internal
> infrastructure if one were honest about the fully loaded facility,
> energy, and staffing costs.
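For the curious: StarCluster drives EC2 through the boto library, and the Spot savings Chris mentions come down to a bid-priced request. A minimal sketch of that underlying call; the AMI ID, bid, count, and instance type below are hypothetical placeholders, not numbers from this thread:

    import boto.ec2

    # Connect using AWS credentials from the environment / boto config.
    conn = boto.ec2.connect_to_region('us-east-1')

    # Bid-priced request: instances run while the Spot price stays under the bid.
    requests = conn.request_spot_instances(
        price='0.05',              # max bid in $/hr (often a fraction of on-demand)
        image_id='ami-12345678',   # hypothetical AMI
        count=4,
        instance_type='c1.xlarge',
    )
    print('%d spot requests submitted' % len(requests))

The flip side is that a Spot instance can be terminated when the market price rises above your bid, which is why it suits restartable batch work on a Grid Engine cluster far better than stateful services.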
[slight segue into the cloud side]

Caveat emptor ... the analysis comes out in favor of ephemeral machines and clusters when the (expected or actual) utilization of the resources is low. In that case the capital costs and TCO of owned gear are more expensive than the public cloud (at average pricing). Not tremendously so, but noticeably. Spot pricing moves the crossover point, but it's still not a game changer such that remote is always "cheaper" (a back-of-the-envelope sketch at the end of this segue puts rough numbers on this).

This said, there are other, very good reasons why the public cloud is not appropriate for everyone. Most of the customers we've been working with have tried it for their work and found it lacking in one aspect or another. Granted, they are highly specialized, with specific needs for which a paravirtualized infrastructure makes little sense (hypervirtualized makes a great deal more sense for them, but bare metal/silicon is optimal in performance, and their needs are driven more by that than by other factors).

On the costing side, there is no grand conspiracy. There are costs to acquire and run machines, but in many cases, thanks to serious devops work and sanely scoped, extremely dense, very performant, well-engineered systems, those costs are dropping rapidly (as are the revenues in this market, which is part of why the tier-1 vendors are showing the quarterly results they are). The cost per processor core (and eventually per processor cycle) is rapidly approaching an asymptote, where the overall cost is not the major factor in decision making. To wit, look at the OCP (Open Compute Project) designs from Facebook. That effort is all about completely commoditizing their hardware buys. Google has been doing something similar for a while.

The real questions are: to get a value of X, what investment Y is required, and what are the constraints Z we must work within?

For a large segment of the corporate world where we play, clouds are fine, as long as they are private, completely securely controlled, and engineered to handle the workloads they need. Queuing up in line with 10,000 of your closest friends and neighbors for a chance to bid on infrastructure that you need at a particular time, when you also have to move a huge bolus of data over (think fractions of a PB to multiple PB), is simply not going to fly until 10GbE to the demarc is a common scenario. Even then, data motion is the killer in many cases, and system/network/storage latency and low-level performance are at least complicit in the murder of the external-cloud concept for these folks.

This said, I do encourage folks to try out AWS, Joyent, and many others for the cloudy bits, and see if the constraints can be worked around. Our friends at Sabalcore and Penguin are doing cool things with clusters on demand. And fundamentally, if you have a part-time need for a cluster, the cluster-rental or cloudy versions are likely to be a better deal for you, if the constraints work out.

[cliché warning] As I try to tell folks: your mileage may vary, there are no silver bullets, and if all you have is a hammer, every problem looks like a nail. Public (or private) clouds and infrastructure are not a panacea, and there are cases where the other makes more sense. A realistic view of the data around this (cost, utilization, need, etc.) and a correct assessment of the principal decision issues (performance/latency vs. TCO vs. data transport vs. availability vs. ...) is highly recommended.
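Here is that back-of-the-envelope sketch, covering the utilization crossover and the PB-over-the-wire problem. Every figure in it is an illustrative assumption, not a measurement:

    HOURS_PER_YEAR = 8760

    def breakeven_utilization(owned_annual_tco, cloud_rate_per_node_hour, nodes):
        # Utilization above which an owned cluster is cheaper than renting
        # the same node count by the hour, all year.
        full_time_rent = cloud_rate_per_node_hour * nodes * HOURS_PER_YEAR
        return owned_annual_tco / full_time_rent

    def transfer_days(data_bytes, link_gbps, efficiency=0.7):
        # Wall-clock days to push data over one link at a given efficiency.
        seconds = (data_bytes * 8.0) / (link_gbps * 1e9 * efficiency)
        return seconds / 86400.0

    # Hypothetical 64-node cluster: $250k/yr fully loaded vs $0.50/node-hour rented.
    u = breakeven_utilization(250000.0, 0.50, 64)
    print('own beats rent above ~%.0f%% utilization' % (u * 100))

    # 1 PB over a single 10GbE link at 70% efficiency.
    print('1 PB over 10GbE: ~%.1f days' % transfer_days(1e15, 10))

With those made-up inputs, owning only wins above roughly 90% sustained utilization, and a petabyte occupies a single 10GbE link for nearly two weeks. That wire time is the "data motion is the killer" point in concrete form.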
[back to clustering VPS]

Honestly, this never made a great deal of sense to me. Rather than clustering VPSes, why not cluster bare-metal, JEOS-style KVM hypervisor machines? Not quite the VMware stuff.

We are using lots of KVM (obscene amounts of it) in our projects, across several OSes: Linux, Illumos/SmartOS. We've been tweaking tiburon to handle such KVM boots, so that turning on a very large cluster of virtual machines and having them ready takes seconds to minutes at worst, not half an hour to several hours of provisioning. A VPS doesn't quite have the isolation of KVM, which is part of why I'd like to see that. But KVM doesn't have great PCIe passthrough (yet; it's getting better). A VPS might be able to make better use of the resource.
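Tiburon itself is not public, but the shape of a fast, parallel KVM boot is easy to sketch with the python-libvirt bindings. This is a generic stand-in, not our tooling; the image path, guest sizing, node count, and the 'default' libvirt network are all assumptions for illustration:

    import libvirt
    from concurrent.futures import ThreadPoolExecutor

    DOMAIN_XML = """
    <domain type='kvm'>
      <name>%(name)s</name>
      <memory unit='MiB'>2048</memory>
      <vcpu>2</vcpu>
      <os><type arch='x86_64'>hvm</type></os>
      <devices>
        <disk type='file' device='disk'>
          <driver name='qemu' type='qcow2'/>
          <!-- hypothetical per-guest image path -->
          <source file='/var/lib/libvirt/images/%(name)s.qcow2'/>
          <target dev='vda' bus='virtio'/>
        </disk>
        <interface type='network'>
          <source network='default'/>
          <model type='virtio'/>
        </interface>
      </devices>
    </domain>
    """

    def boot(conn, name):
        # createXML() starts a transient guest immediately; no define step needed.
        return conn.createXML(DOMAIN_XML % {'name': name}, 0)

    conn = libvirt.open('qemu:///system')
    names = ['node%03d' % i for i in range(64)]
    with ThreadPoolExecutor(max_workers=16) as pool:
        doms = list(pool.map(lambda n: boot(conn, n), names))
    print('%d guests started' % len(doms))

Transient guests via createXML() fit the ephemeral-cluster case (persistent ones would use defineXML() plus create()). The parallel map is where the seconds-to-minutes figure comes from: boot time is bounded by the slowest guest, not the sum of all of them.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/siflash
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615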