Joe Landman wrote:
Eric Thibodeau wrote:
Joe Landman wrote:
Gus Correa wrote:

Otherwise, your "newbie scientist" can put in his/her earbuds and pump up the volume on his/her iPod,
while he/she navigates through Vista's colorful 3D menus.

Owie .... I can just imagine the folks squawking about this at SC08 "Yes folks, you need a Cray supercomputer to make Vista run at acceptable performance ..."
Maybe they have a "tune options for performance" option ;)

The machine seems to run w2k8. My own experience with w2k8 is that, frankly, it doesn't suck. This is the first time I have seen a windows release that I can say that about.
A few questions (not necessarily expecting a response):

POSIX?
VERBS?
Kernel latency and scheduler control?

Don't mistake me for a w2k8 apologist. I reamed them pretty hard on the lack of a real POSIX infrastructure (they claim SUA, but frankly it doesn't build most of what we throw at it, so it really is a non-starter and not worth considering IMO). They need to pull Cygwin in close and tight to get a good POSIX infrastructure. It is in their best interests. Sadly, I suspect the ego-driven nature of this will pretty much prevent them from doing so. Can't touch the "toxic" OSS now, can they ...
Cygwin...yerk, that emulation layer is bloated, slow as hell, and can barely run some of my basic scripts. A simple find would peg the CPU at 100%. Now, I don't blame Cygwin for this per se, but rather the way that windows (probably) runs it as a DOS(ish) app in a somewhat polled mode. My lack of interest in that OS stopped me from digging deeper into why Cygwin is so slow. Given that it's a more than mature project, I'd have expected such poor performance to have been addressed by now.

IB Verbs? Well through OFED, yes. Through the windows stack? Who knows. We were playing with it on JackRabbit for a customer test/benchmark.
...and...the results? ;)

Kernel latency? Much better/more responsive than w2k3. Scheduler control? Not sure how much you have. I don't like deep diving into the registry ... that is a pretty sure way to kill a windows machine.
Well, let's just say these are mechanisms I expect an HPC machine to have when "squeezing the last drop of performance" is mentioned.

These are the real barriers, IMHO. Without minimally supporting POSIX (threads), there is very little incentive to use the machine for development unless you're willing to accept that your code will _only_ run on your "desktop".

The low end economics probably won't work out for this machine though, unless it is N times faster than some other agglomeration of Intel-like products. Adding windows will add cost, not performance in any noticeable way.

The question that Cray (and every other vendor building non-commodity units) has to answer is: how much better is this than a small cluster someone can build/buy on their own? Better as in faster, able to leap more tall buildings in a single bound, ... (Superman TV show reference for those not in the know). The hard part will be justifying the additional cost: if the machine isn't 2x the performance, can it justify 2x the price? Since it appears to be a somewhat well-branded cluster, I am not sure that argument will be easy to make.
I just rebuilt a 32-core cluster for ~5k$ (CAD) (8*Q6600, 1 Gig RAM/node + GigE networking). Bang for the buck? I can't wait to see the CX1's performance specs under _both_ windows and Linux.

Desktop CPUs/MBs will get you the best bang per buck, as long as you don't mind no ECC and an 8GB RAM limit per node. For your applications, this might be fine. For others, with large memory footprints and long run times, I see people needing/requiring ECC (as memory density increases, ECC becomes important .... darned cosmic rays/natural decays/noisy power supplies/...)
Well, the nodes I built have MBs that claim to go as high as 8 Gigs, but anyone with a little experience with the 8-Gig mix + 800+MHz RAM knows that very little hardware (MBs) can actually do it in a stable fashion. My recent experience with "fast" RAM (800-1066MHz) is that it ends up costing you time ($$$): most MBs claim to support it, but they all seem to have impedance problems of some sort (totally unstable). And if you read the fine print and the QVLs (Qualified Vendor Lists), you notice these really high throughputs apply only at low memory density per bank (a _total_ of 2-4 Gigs max). I'd personally say this is more of an issue than the ECC-ness of RAM.

I work on "Clustering Algorithms", and, not to confuse people, I mean the k-means type, which we could call "data aggregation/mining" algorithms. They are long-running and applied to sizable databases (1.2 Gigs) which need to be loaded onto each node. This is where having multi-core nodes comes in quite handy, as there is far less time lost propagating and loading the data(bases).

...which brings me to wonder how I/O is managed on the CX1: is it as basic as one I/O node and GFS, or do all nodes have their own I/O paths? I mention this since I've too often seen people ignore the I/O (load times) in their performance assessments ;)

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
