>>> Virtually any recent card can run CUDA code. If you Google you can get a
>>> list of compatible cards.

>> not that many NVidia cards support DP yet though, which is probably
>> important to anyone coming from the normal HPC world...  there's some
>> speculation that NV will try to keep DP as a market segmentation
>> feature to drive HPC towards high-cost Tesla cards, much as vendors
>> have traditionally tried to herd high-end vis into 10x priced cards.

> That's simply not true. Every newer card from NVidia (that is, every

which part is not true?  the speculation?  OK - speculation is always
just speculation.  it _is_ true that DP is supported only by the very
latest NV generation, essentially three bins of one card.

> and nothing indicates that NV will remove support in future cards, quite
> the contrary.

hard to say.  NV is a very competitively driven company - that is, it
makes decisions for competitive reasons.  it's a very standard policy
to try to segment your market, to develop higher-margin segments that
depend on restricted features.  certainly NV has done that before
(hence the existence of Quadro and Tesla), though it's not clear to me
whether they will have any meaningful success given the other players
in the market.
segmentation is a play for a dominant incumbent, and I don't think NV
is or believes itself so.  AMD obviously seeks to avoid giving NV any
advantage, and ATI has changed its outlook somewhat since AMDification.
and Larrabee threatens to eat both their lunches.

> The distinction between Tesla and GeForce cards is that the former have
> no display output, they usually have more ram, and (but I'm not sure
> about this one) they are clocked a little lower.

both NV and ATI have always tried to segment "professional graphics"
into a higher-margin market.  this involves tying the pro drivers to
features found only in the pro cards.  it's obvious that NV _could_ do
this with Cuda, though I agree they probably won't.

the original question was whether there is a strong movement towards
gp-gpu clusters.  I think there is not, because neither the hardware
nor the software is mature.  Cuda is the main software right now; it
is NV-proprietary and unlikely to target ATI and Intel gp-gpu
hardware.

finally, it needs to be said again: current gp-gpus deliver around
1 SP Tflop for around 200W.  a current cpu (3.4 GHz Core2) delivers
about 1/10 as many flops for something like 1/2 the power (I'm
approximating cpu+nb+ram).  cost for the cpu approach is higher than
for the gpu (let's guess 2x, but again it's hard to isolate parts of
a system.)

so we're left with a peak/theoretical difference of around 1 order of
magnitude.  that's great!  more than enough to justify use of a unique
(read: proprietary, nonportable) development tool for some places
where GPUs work especially well (and/or CPUs work poorly).  and yes,
adding gp-gpu cards to a cluster carries a fairly modest price/power
premium if you expect to use them.

Joe's hmmer example sounds like an excellent case, since it shows
good speedup, and the application seems to be well-suited to gp-gpu
strengths (and it has a fairly small kernel that needs to be ported
to Cuda.)  but comparing all the cores of a July 2008 GPU card to a
single core on a 90-nm, n-3 generation chip really doesn't seem
appropriate to me.