On Thu, 8 Nov 2007, Peter St. John wrote:
Recently, as you probably noticed, Walmart began selling a $200 linux PC. (Apparently the OS is just Ubuntu 7.10 with a small window manager instead of Gnome or KDE). Now Slashdot points to http://www.linuxdevices.com/news/NS5305482907.html, the MB being sold separately for $60 ("development board"). It has 1.5GHz CPU, unpopulated memory (slots for 2GB), one 10/100 connection. Does this look to y'all like fair FLOPS/$ for a kitchen project? I'm thinking 6 of them as compute nodes per 8 port router, with a bigger head node for fileserving. (actually I'll use a spare room but you know what I mean). An arrangement like this might give faster RAM access per core, compared to multicore, since each core has no competition for its own memory, right?
Well by now you surely have heard the YMMV litany enough times not to hear it again from me, but YMMV quite a bit here so let me indicate a few potential difficulties.

a) For this money, I'm guessing the CPU is a 32 bit Celery, which has a very small L2. For some code this won't matter, but if you're worrying about multiple cores and a memory bottleneck, let me assure you the L2 bottleneck on a single 32-bit channel will likely be much worse.

b) Amdahl's law rewards higher clock and fewer CPUs over lower clock and more CPUs almost (but not quite) without exception. I doubt that you are an exception.

c) A 64-bit CPU has some superlinear speedup compared to a 32-bit CPU at constant clock, for memory bound code especially. 64-bit CPUs have much larger caches as well. This CAN work against you for very cache unfriendly code, but again in 99% of all applications it will work for you -- it is what a cache "does".

d) A perfectly fair question is to what extent the memory bus is oversubscribed on a 64-bit dual core, say, a very cheap AMD-64 at roughly twice the clock, with more than twice the total memory bandwidth, and with two cores. This is the question that depends in detail on YOUR APPLICATION. Many applications are de facto CPU bound and you get clock speed scaling within a CPU family all the way down to small cache Celerons. Others are vehemently not.

"YMMV", so you have to analyze YOUR application to figure out which it is, where the easiest way by far to find out is to just try it. Sounds like it will cost you somewhere between $100 and $200 to set up a minimal system -- cheap case/power, motherboard, memory, a borrowed video card. You can probably beg, borrow, or buy a dual core AMD at some middling low (but much higher!) clock for no more than $400. Run your presumably EP application on the one, and on the other two at a time.
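The "just try it" test above can be scripted in a few lines. What follows is only a sketch, not a tuned benchmark: it assumes the job is a standalone command that runs to completion, and the names time_copies and jobs_per_hour, along with the placeholder workload, are made up for illustration.

```python
import subprocess
import sys
import time


def time_copies(cmd, copies):
    """Start `copies` instances of cmd together; return wall-clock seconds until all exit."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(copies)]
    codes = [p.wait() for p in procs]
    elapsed = time.perf_counter() - start
    if any(codes):
        raise RuntimeError(f"{cmd!r} failed with exit codes {codes}")
    return elapsed


def jobs_per_hour(cmd, copies):
    """Aggregate throughput when `copies` independent jobs share one box."""
    return copies * 3600.0 / time_copies(cmd, copies)


if __name__ == "__main__":
    # Placeholder workload (an assumption): a short CPU-bound loop.
    # Replace it with the command line of the real application.
    job = [sys.executable, "-c", "sum(i * i for i in range(2_000_000))"]
    for n in (1, 2):
        print(f"{n} copies: {jobs_per_hour(job, n):.0f} jobs/hour")
```

Run it with one copy on the single core box and with one and then two copies on the dual core. If two copies give close to twice the jobs/hour of one, the shared memory bus is not hurting that application much; if they give much less than twice, it is.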
Buy lots of the winner, use the loser as a desktop or head node (even the Celery should be fine for that, especially on a 100 Mbps network).

Now, I'm a gambling man (as you may not know) and I will bet you one bottle, can, or glass of ice-cold or cellar cool clean and refreshing or thick and chewy beer as the winner prefers, to be delivered at a mutually convenient time (such as both of us sitting side by side in a venue that purveys said beverages), that the medium-low end AMD-64 kicks the ass of the maximally cheap Celery in price-performance on your application (where I have an unfair advantage in that I know something about your application, but I'd make the same bet if I didn't).

To go into detail, I expect that at constant cost you'll end up with somewhere in the ballpark of 2-3x aggregate bogomips/$ from the AMD, that memory bottlenecks will eat up no more than a small part of it (I actually expect the AMDs to win here TOO because of the probably at least doubled total memory bandwidth and larger cache), that when you factor in a roughly 4x increase in required system volume and 3x increase in total power consumption required to run the number of Celeries that will match the AMD, at a marginal cost of roughly $200/year in increased power costs and some increased investment of your "free" time to install and manage the extra systems... well, let's just say that I think that the Celeries will look ugly.

And I'd expect similar savings from the lowball dual core Xeons, honestly -- system price around $350-500 stripped to match, where you vary in this range to find the sweet spot in terms of total memory, processor clock, and other configuration details.

Before you turn me down, note that this is a win-win bet for both of us, since the winner gets to buy the next round...;-)

   rgb
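For what it's worth, the price-performance comparison in the bet is simple enough to write down. The sketch below is only a way of organizing the arithmetic: the function names are invented here, and every number in the example (throughput, price, watts, electricity rate, lifetime) is a placeholder to be replaced with your own measurements and local prices, not data from either machine.

```python
def power_cost(watts, years, dollars_per_kwh):
    """Electricity cost of running one box flat out for `years`."""
    hours = years * 365 * 24
    return watts / 1000.0 * hours * dollars_per_kwh


def throughput_per_dollar(jobs_per_hour, price, watts, years=3, dollars_per_kwh=0.10):
    """Measured throughput of one box divided by its purchase-plus-power cost."""
    total = price + power_cost(watts, years, dollars_per_kwh)
    return jobs_per_hour / total


if __name__ == "__main__":
    # Every number below is a placeholder (an assumption), not a measurement.
    boxes = {
        "cheap single core": dict(jobs_per_hour=10.0, price=150.0, watts=60.0),
        "low-end dual core": dict(jobs_per_hour=45.0, price=400.0, watts=90.0),
    }
    for name, spec in boxes.items():
        print(f"{name}: {throughput_per_dollar(**spec):.4f} jobs/hour per dollar")
```

This leaves out floor space and the time spent installing and managing more boxes, both of which only make a larger node count look worse.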
Thanks, Peter _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
-- Robert G. Brown Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone(cell): 1-919-280-8443 Web: http://www.phy.duke.edu/~rgb Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977