Hello,
That looks like a very good worked example for GPGPU. As I said before, 
GPGPU is not yet ready for DP calculations; if you can live with SP 
computation, the picture is much more attractive.
NVIDIA just sent me a special offer on their Tesla platforms: a workstation 
equipped with two GTX280-class professional cards for about $5000, which is 
not bad. But my intention is still to lower the core frequency of a gaming 
card and use it for computation.
Regards,
Li, Bo
----- Original Message ----- 
From: "Mikhail Kuzminsky" <[EMAIL PROTECTED]>
To: "Kozin, I (Igor)" <[EMAIL PROTECTED]>
Cc: <beowulf@beowulf.org>
Sent: Tuesday, September 02, 2008 1:34 AM
Subject: Re: [Beowulf] gpgpu


> I did a simple estimate of the possible performance improvement from 
> running dgemm on a FireStream 9250.
> It is an extremely favourable example for GPGPU.
> 
> The source data for the 9250: peak DP performance 200 GFLOPS, 1 Gbyte 
> of GDDR3 RAM.
> 
> 1 Gbyte can hold three DP (64-bit) n x n matrices with n = 6000; they 
> require 864 Mbytes.
> Let me suppose that the real performance of the FireStream will be 90% of 
> the peak value (I'm afraid reality will be worse), i.e. 180 GFLOPS.
>  
> dgemm requires 2*n^3 FP operations (I neglect the n^2 operations for 
> matrix addition and scaling), i.e. 432 GFLOP.
> The calculation time will be 432/180 = 2.4 sec.
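> 
> In Python the same back-of-envelope arithmetic looks like this (a sketch 
> only; the values are the assumptions above, not measurements):
> 
>     n = 6000
>     mem_mb = 3 * n * n * 8 / 1e6   # three DP (8-byte) matrices -> 864 MB
>     flops = 2 * n**3               # dgemm, neglecting the n^2 terms
>     t_gpu = flops / (0.9 * 200e9)  # assumed 90% of 200 GFLOPS peak -> 2.4 s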
> 
> The dgemm calculation also needs 4 matrix transfers: 3 to the GPGPU and 
> 1 from the GPGPU back to the server's main memory.
> That is 1152 Mbytes of data.
> 
> For PCIe x16 v.2 the peak throughput is 8 GB/s, so the transfer time 
> will be about 0.144 sec (I don't know what the real PCIe throughput 
> will be).
> 
> The total calc. time is therefore about 2.54 sec.
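> 
> A sketch of the transfer-time arithmetic in Python (assuming the 8 GB/s 
> peak figure; real PCIe throughput will be lower):
> 
>     n = 6000
>     xfer_mb = 4 * n * n * 8 / 1e6  # 3 matrices in + 1 out -> 1152 MB
>     t_xfer = xfer_mb * 1e6 / 8e9   # at 8 GB/s peak -> 0.144 s
>     t_total = 2.4 + t_xfer         # -> about 2.54 s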
> 
> On a dual-socket quad-core Xeon server with 3 GHz E5472 CPUs (8 cores) 
> the peak performance is 96 GFLOPS. Parallelized dgemm will give, I 
> believe, about 80% of peak, i.e. 77 GFLOPS; the calculation time is 
> therefore 432/77 = 5.6 sec.
> 
> The speedup is 2.2 times. The price increase I don't know exactly; say 
> from $4500 to $6500 (if the FireStream costs $2000, though it may be 
> $1000 as Igor Kozin wrote here) - that is about 1.4 times.
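> 
> And the CPU-side estimate and the resulting ratios, again as a sketch 
> under the same assumptions (4 DP flops/cycle per core, 80% dgemm 
> efficiency, and the prices quoted above):
> 
>     n = 6000
>     flops = 2 * n**3
>     cpu_peak = 8 * 3e9 * 4            # 8 cores x 3 GHz x 4 DP flops/cycle
>     t_cpu = flops / (0.8 * cpu_peak)  # -> about 5.6 s
>     speedup = t_cpu / 2.54            # -> about 2.2x
>     price_ratio = 6500 / 4500         # -> about 1.4x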
> 
> But I think there will not be many jobs that require multiplication of 
> *dense* matrices of such large (6000 x 6000) size; for sparse matrices 
> the dimensions, I believe, will be smaller.
> 
> Mikhail     

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
