Re: [Beowulf] MS Cray

Eric Thibodeau Wed, 17 Sep 2008 10:27:54 -0700

Joe Landman wrote:

Eric Thibodeau wrote:
Joe Landman wrote:
Gus Correa wrote:
Otherwise, your "newbie scientist" can put in his/her earbuds and pump up the volume on his/her iPod,
while he/she navigates through Vista's colorful 3D menus.
Owie .... I can just imagine the folks squawking about this at SC08: "Yes folks, you need a Cray supercomputer to make Vista run at acceptable performance ..."
Maybe they have a "tune options for performance" option ;)
The machine seems to run w2k8. My own experience with w2k8 is that, frankly, it doesn't suck. This is the first time I have seen a windows release that I can say that about.
A few questions (not necessarily expecting a response):

POSIX?
VERBS?
Kernel latency and scheduler control?
Don't mistake me for a w2k8 apologist. I reamed them pretty hard on the lack of a real posix infrastructure (they claim SUA, but frankly it doesn't build most of what we throw at it, so it really is a non-starter and not worth considering IMO). They need to pull Cygwin in close and tight to get a good POSIX infrastructure. It is in their best interests. Sadly, I suspect the ego driven nature of this will pretty much prevent them from doing this. Can't touch the "toxic" OSS now, can they ...

Cygwin...yerk, that emulation bloat is slow as hell and can barely run some of my basic scripts. A simple find would put the CPU in a 100% usage state. Now I don't blame Cygwin for this per se, but rather the way that windows (probably) runs this as a DOS(ish) app in a somewhat polled mode. My lack of interest in that OS stopped me from digging deeper into why Cygwin is so slow. Given it's a more than mature project, I'd have expected such poor performance to have been addressed by now.

IB Verbs? Well through OFED, yes. Through the windows stack? Who knows. We were playing with it on JackRabbit for a customer test/benchmark.

...and...the results? ;)

Kernel latency? Much better/more responsive than w2k3. Scheduler control? Not sure how much you have. I don't like deep diving into registries ... that is a pretty sure way to kill a windows machine.

Well, let's just say these are mechanisms I expect an HPC machine to have when "squeezing the last drop of performance" is mentioned.

These are the real barriers IMHO. Without minimally supporting POSIX (threads), there is very little incentive to use the machine for development unless you're willing to accept the code will _only_ run on your "desktop".
The low end economics probably won't work out for this machine though, unless it is N times faster than some other agglomeration of Intel-like products. Adding windows will add cost, not performance in any noticeable way.
The question that Cray (and every other vendor building non-commodity units) has to answer is how much better is this than a small cluster someone can build/buy on their own? Better as in faster, able to leap more tall buildings in a single bound, ... (Superman TV show reference for those not in the know). And the hard part will be justifying the additional cost. If the machine isn't 2x the performance, would it be able to justify 2x the price? Since it appears to be a somewhat well branded cluster, I am not sure that argument will be easy to make.
I just rebuilt a 32 core cluster for ~5k$ (CAD) (8*Q6600, 1Gig RAM/node + gige networking). Bang for the buck? I can't wait to see the CX1's performance specs under _both_ windows and Linux.
The desktop CPUs/MBs will get you best bang per buck, as long as you don't mind no ECC, and 8GB ram limits per node. For your applications, this might be fine. For others, with large memory footprint and long run times, I see people need/require ECC (as memory density increases, ECC becomes important .... darned cosmic rays/natural decays/noisy power supplies/...)

Well, the nodes I built have MBs that state that they can go as high as 8Gigs, but anyone with a little experience with the 8Gig mix + 800+MHz RAM knows that very little hardware (MB) can actually do it in a stable fashion. My recent experience with "fast" RAM (800-1066MHz) is that it ends up costing you time ($$$), since most MBs claim to support it but they all seem to have impedance problems of some sort (totally unstable). And if one reads the fine print and the QVLs (Qualified Vendor Lists), one notices these really high throughputs are for low memory density of the banks (a _total_ of 2-4 Gigs max). I'd personally say this is more of an issue than the ECCism of RAM.

I work on "Clustering Algorithms", and not to confuse people, I mean the k-means type which we could call "data aggregation/mining" algorithms. They are long running and applied to sizable databases (1.2Gigs) which need to be loaded onto each node. This is where having multi-core nodes comes in quite handy, as there is way less time lost in propagating and loading the databases.

...which brings me to wonder how the I/O is managed under the CX1... is it as basic as one I/O node and GFS, or do all nodes have their own I/O paths? I mention this since I've too often seen people ignore the I/O (load times) in their performance assessments ;)
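
To put a rough number on why load times matter, here is a minimal sketch in Python. The timings are made up for illustration and are not measurements from any machine in this thread, and the function name and parameters are hypothetical. It assumes the compute part scales perfectly across workers while the load time is paid in full on every run.

```python
def speedup(compute_serial_s, load_s, workers, include_load=True):
    """Speedup over one worker for a job with a fixed data-load phase.

    Assumption: compute time divides evenly among workers (ideal scaling),
    while load_s does not shrink as workers are added.
    """
    compute_parallel_s = compute_serial_s / workers
    if include_load:
        # End-to-end view: the load phase is counted on both sides.
        return (load_s + compute_serial_s) / (load_s + compute_parallel_s)
    # Compute-only view: the load phase is left out of the assessment.
    return compute_serial_s / compute_parallel_s


# Hypothetical job: 3200 s of serial compute, 100 s to load the database,
# run on 32 cores.
compute_only = speedup(3200.0, 100.0, 32, include_load=False)
end_to_end = speedup(3200.0, 100.0, 32, include_load=True)

print(f"compute-only speedup: {compute_only:.1f}x")
print(f"end-to-end speedup:   {end_to_end:.1f}x")
```

With these made-up numbers the compute-only figure is 32x, but counting the load phase cuts it to 16.5x, which is the gap that disappears from an assessment when the I/O is left out.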


_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
