On Fri, Apr 11, 2014 at 6:05 PM, Sturla Molden <[email protected]> wrote:
> Sturla Molden <[email protected]> wrote:
>
>> Making a totally new BLAS might seem like a crazy idea, but it might be the
>> best solution in the long run.
>
> To see if this can be done, I'll try to re-implement cblas_dgemm and then
> benchmark against MKL, Accelerate and OpenBLAS. If I can get the
> performance better than 75% of their speed, without any assembly or dark
> magic, just plain C99 compiled with Intel icc, that would be sufficient for
> binary wheels on Windows I think.

Sounds like a worthwhile experiment!

My suspicion is that we'll be better off starting with something
that is almost good enough (OpenBLAS) and then incrementally improving
it to meet our needs, rather than starting from scratch -- there's a
*long* way to get from dgemm to a fully supported BLAS project -- but
no matter what it'll generate useful data, and possibly some useful
code that could either be the basis of something new or integrated
into whatever we do end up doing.

Also, while Windows is maybe in the worst shape, all platforms would
seriously benefit from the existence of a reliable speed-competitive
binary-distribution-compatible BLAS that doesn't break fork().

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
