Re: [Beowulf] quad-core SPECfp2006: where are 4 FP results/cycle?

Mark Hahn Fri, 12 Oct 2007 13:24:39 -0700

This means that 2 additional FP results per cycle in the microarchitecture give only about a 7% performance increase :-(

the 4 flops/cycle figure is a peak that really only applies to linpack-like code: it assumes you are executing packed double-precision SIMD.
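to make "linpack-like" concrete, here is a minimal sketch of the kind of loop meant, a plain-C daxpy. the function name and signature are illustrative only (this is not the BLAS interface). each iteration does one multiply and one add and is independent of its neighbours, so a vectorizing compiler can, in principle, handle two doubles per packed instruction; whether it actually does depends on the compiler, flags and alignment.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i]
 * One multiply and one add per element, with no dependence between
 * iterations. `restrict` tells the compiler x and y do not overlap,
 * which is what lets it consider packed (2 doubles at a time) code. */
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

with 128-bit registers holding two doubles, a packed add and a packed multiply issued in the same cycle is where the 4 results/cycle peak comes from.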

The question is: should we wait for better results from new versions of optimizing compilers? Or is this the reality, that 2 additional FP results per cycle give (on average) a relatively small performance increase?

just that not all FP is SIMD-friendly, I think. if your code spends a lot of time in blas/lapack functions, I would expect it to see good speedup.
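a rough Amdahl-style estimate shows how this squares with the 7% figure quoted above. the sketch below assumes (my assumption, not a measurement) that the SIMD-friendly part of the runtime speeds up by exactly 2x (4 results/cycle instead of 2) and that everything else is unchanged; memory bandwidth and other effects are ignored. both function names are hypothetical.

```c
/* Overall speedup when a fraction f of the original runtime is
 * accelerated by a factor g and the rest is unchanged:
 *     S = 1 / ((1 - f) + f / g)                                  */
double overall_speedup(double f, double g)
{
    return 1.0 / ((1.0 - f) + f / g);
}

/* Inverse of the above: the fraction f of runtime that must have been
 * accelerated by g to produce an observed overall speedup S:
 *     f = (1 - 1/S) / (1 - 1/g)                                  */
double simd_fraction(double S, double g)
{
    return (1.0 - 1.0 / S) / (1.0 - 1.0 / g);
}
```

under those assumptions, simd_fraction(1.07, 2.0) comes out to roughly 0.13, i.e. a 7% overall gain is about what you would see if only some 13% of the runtime were in code that actually benefits from the wider FP units.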


regards, mark hahn.
