https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #28 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
(In reply to Janne Blomqvist from comment #25)

> 
> But, that is not particularly impressive, is it? I don't know about current
> low end graphics adapters, but at least the high end GPU cards (Tesla) are
> capable of several Tflops. Of course, there is a non-trivial threshold size
> to amortize the data movement to/from the GPU.

Not even a graphics card, just the on-chip graphics of a low-end laptop. I am
not trying to impress, just pointing out that hardware acceleration is fairly
ubiquitous these days, so why not use it? It may not matter for serious
computing, where users already have things like your 20-core machine.
> 
> With the test program from #12, with OpenBLAS (which BTW should be available
> in Fedora 22 as well) I get 337 Gflops/s, or 25 Gflops/s if I restrict it to
> a single core with the OMP_NUM_THREADS=1 environment variable. This on a
> machine with 20 2.8 GHz Ivy bridge cores.
> 
> I'm not per se against using GPU's, but I think there's a lot of low hanging
> fruit to be had just by making it easier for users to use a high performance
> BLAS implementation.

I agree. If an available external BLAS does what is needed, very good. What I
am exploring is one of those external BLAS libraries that uses the GPU. Maybe
the answer to this PR is "use an external BLAS" and close this PR.
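
For anyone comparing numbers like the ones quoted above, here is a minimal
sketch of how a Gflop/s figure for an n x n matrix multiply can be derived,
assuming the usual count of 2*n^3 floating-point operations. This is not the
test program from #12. The names matmul_flops, gflops, matmul_naive and
time_matmul are hypothetical and exist only for this illustration, and the
naive triple loop stands in for whatever MATMUL or BLAS call is being measured.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Operation count for an n x n times n x n multiply:
 * n^3 multiplications plus n^3 additions (assumed convention). */
double matmul_flops(int n)
{
    assert(n > 0);
    return 2.0 * n * n * n;
}

/* Convert an operation count and a wall/CPU time into Gflop/s. */
double gflops(int n, double seconds)
{
    assert(seconds > 0.0);
    return matmul_flops(n) / seconds / 1e9;
}

/* Naive row-major C = A * B, used here only as a stand-in kernel. */
void matmul_naive(int n, const double *a, const double *b, double *c)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Time one multiply and return Gflop/s, or -1.0 if allocation fails or
 * the run is too short to measure. clock() reports CPU time, which is
 * adequate for this single-threaded loop but would overstate elapsed
 * time for a multi-threaded BLAS. */
double time_matmul(int n)
{
    size_t count = (size_t)n * (size_t)n;
    double *a = malloc(count * sizeof *a);
    double *b = malloc(count * sizeof *b);
    double *c = malloc(count * sizeof *c);
    double result = -1.0;

    if (a && b && c) {
        for (size_t i = 0; i < count; i++) {
            a[i] = 1.0;
            b[i] = 2.0;
        }
        clock_t t0 = clock();
        matmul_naive(n, a, b, c);
        clock_t t1 = clock();
        double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
        if (secs > 0.0)
            result = gflops(n, secs);
    }
    free(a);
    free(b);
    free(c);
    return result;
}
```

Calling time_matmul(1000) from a small main and printing the result gives a
single-core baseline for the naive loop. Swapping matmul_naive for a call into
an external BLAS would give the corresponding library figure.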
