https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79930

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
If dot_product (matmul (...), ..) can be implemented more efficiently (is there
a blas/lapack primitive for it?) then the best course of action is to pattern
match that inside the frontend and emit a library call to an optimized routine
(which eventually means adding one to libfortran or using/extending
-fexternal-blas).
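
To make concrete what such a combined routine could look like, here is a
minimal C sketch, not anything that exists in libfortran. It assumes a
matrix-vector matmul on a row-major n-by-m array of doubles. The function names
dot_matmul_two_step and dot_matmul_fused are made up for illustration. The
first computes the matmul into a temporary and then reduces it; the second
folds the reduction into the outer loop, so no temporary is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Two-step form: tmp = A*x (the matmul), then sum(tmp*y) (the dot_product).
   A is n-by-m, row-major (an assumption of this sketch); tmp must have room
   for n doubles. */
double dot_matmul_two_step(size_t n, size_t m, const double *a,
                           const double *x, const double *y, double *tmp)
{
    assert(a && x && y && tmp);
    for (size_t i = 0; i < n; i++) {
        double s = 0.0;
        for (size_t j = 0; j < m; j++)
            s += a[i * m + j] * x[j];
        tmp[i] = s;
    }
    double r = 0.0;
    for (size_t i = 0; i < n; i++)
        r += tmp[i] * y[i];
    return r;
}

/* Fused form: each row result is consumed immediately by the reduction,
   so the temporary vector and the second pass over it disappear. */
double dot_matmul_fused(size_t n, size_t m, const double *a,
                        const double *x, const double *y)
{
    assert(a && x && y);
    double r = 0.0;
    for (size_t i = 0; i < n; i++) {
        double s = 0.0;
        for (size_t j = 0; j < m; j++)
            s += a[i * m + j] * x[j];
        r += s * y[i];
    }
    return r;
}
```

Both functions add the terms in the same order, so for the same inputs they
return the same value. The fused one is the shape a pattern-matched library
routine could take.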

Recovering from this in the middle-end is only possible if both primitives
are inlined and even then I expect it to be quite difficult to get optimal
code out of it (though it's certainly interesting to see if we're at least
getting a useful idea of data dependence).

Long-term, exposing the semantics of important primitives to the middle-end,
even when they are implemented as library calls, would be interesting (i.e.,
adding __builtin_dot_product etc., which would also make it possible to delay
inline expansion).
