On Dec 17, 2012, at 11:23 PM, Mark Hahn wrote:

>> "todays 1 gflop/watt" ?
>
> press releases always put the new shiny thing in the best light.
> they're probably thinking of a conventional compute node,
> (say, 32 cores, 2.3 GHz, 4 flops/cycle, or 16c and 8 f/c -
> either way totalling 294 Gflops for 300W or less.)

For a fair comparison you have to add the motherboard power losses to the CPU figure, since those are already included in the number for the GPU cards.

As for the Gflops it delivers, let's do a more realistic calculation. Sandy Bridge can pair an AVX multiply with an AVX add each clock (on separate ports; it has no fused multiply-add), yet I doubt you can sustain issuing both every clock on average, if we compare it with Nehalem. Note that CPUs tend to have just one execution unit that can issue multiplications, and historically they always had big problems issuing another one every clock. That is another reason why the manycores hammer the CPUs so badly: in the end it doesn't matter whether you do matrix multiplications or run FFTs for prime numbers; it's the multiplication speed the chip can deliver that determines how fast your code can run on that chip.

http://ark.intel.com/products/64596/Intel-Xeon-Processor-E5-2690-20M-Cache-2_90-GHz-8_00-GTs-Intel-QPI

That's the fastest I could find. It's a 2.9 GHz CPU, so in terms of Gflops it delivers:

  2.9 GHz * 1 multiplication a clock * 4 doubles a vector * 8 cores = 92.8 Gflops

This for $2057 tray price at introduction.

So I wonder where you got that 294 Gflops from.

Now in terms of Gflops/watt that's 92.8 / 135 watt TDP = 0.69 Gflops/watt for the $2k Xeon. Nearly an order of magnitude less than the K20. That's why Intel created the Xeon Phi, of course.

>> The K20X delivers 1.4 Tflop nearly.
>> If i google it's 235 watt TDP.
>>
>> 1.4 Tflop / 235 = 6 gflops/watt
>
> debatable whether we can honestly claim that's shipping.
> K10 is .78 Gflops DP/W or 17.2 SP. I wonder of the 75 goal
> is merely a 4.4x improvement....

"The PERFECT program will leverage anticipated industry fabrication geometry advances to 7 nm."

7 nm gives a factor 16 boost over 28 nm, in theory, since (28/7)^2 = 16 in transistor density. So the derived truth from the article points me to double precision.

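As a rough sketch of the peak-rate arithmetic in this message, the same numbers can be reproduced in a few lines of Python. The function and variable names are illustrative only and do not come from the thread; the inputs are the clock speeds, core counts, and TDP values quoted above, and the results are theoretical peaks, not measured performance.

```python
def peak_gflops(clock_ghz, flops_per_cycle, cores):
    """Theoretical peak in Gflops: clock (GHz) * flops per cycle per core * cores."""
    return clock_ghz * flops_per_cycle * cores


def gflops_per_watt(gflops, watts):
    """Efficiency against a power figure such as TDP."""
    return gflops / watts


# E5-2690, counting only one 4-wide double-precision multiply per clock
# (the assumption made in this message, not a vendor-stated peak).
xeon_gflops = peak_gflops(2.9, 1 * 4, 8)
xeon_eff = gflops_per_watt(xeon_gflops, 135)

# The quoted node estimate: 16 cores at 8 flops/cycle, or 32 cores at 4.
node_a = peak_gflops(2.3, 8, 16)
node_b = peak_gflops(2.3, 4, 32)

# K20X, using the quoted "1.4 Tflop nearly" and 235 W TDP.
k20x_eff = gflops_per_watt(1400.0, 235)

print(f"Xeon E5-2690 (multiply only): {xeon_gflops:.1f} Gflops, {xeon_eff:.2f} Gflops/W")
print(f"Quoted node estimate: {node_a:.1f} Gflops (16c x 8 f/c), {node_b:.1f} Gflops (32c x 4 f/c)")
print(f"K20X: {k20x_eff:.2f} Gflops/W, ratio to Xeon: {k20x_eff / xeon_eff:.1f}x")
```

The gap between 92.8 and 294 Gflops comes entirely from the assumed flops per cycle and core count, so changing those two arguments is enough to compare the two ways of counting.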
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf