https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #22 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
(In reply to Thomas Koenig from comment #21)
> I assume that for small matrices bordering on the silly
> (say, a matrix multiplication with dimensions of (1,2) and (2,1))
> the inline code will be faster if the code is compiled with the
> right options, due to function call overhead.  I also assume that
> libxsmm will become faster quite soon for bigger sizes.
> 
> Do you have an idea where the crossover is?

I agree that inline code should be faster, provided the compiler is reasonably
smart and the matrix dimensions are known at compile time (i.e. it should then
be able to generate the same kernel). I haven't checked yet, so I can't say
where the crossover is.
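
To illustrate what "dimensions known at compile time" means here, a minimal
sketch in C (not the Fortran MATMUL code this bug is about). The names
matmul_fixed, matmul_generic, ROWS_A, INNER and COLS_B are hypothetical, and
the sizes simply mirror the (1,2) x (2,1) case quoted above. With constant
trip counts a compiler is free to unroll the loops completely, whereas the
generic routine has to keep them; whether a given compiler actually does so,
and at which sizes, is exactly what has not been measured.

```c
#include <stddef.h>

/* Hypothetical fixed sizes, echoing the (1,2) x (2,1) example. */
enum { ROWS_A = 1, INNER = 2, COLS_B = 1 };

/* All loop bounds are compile-time constants, so an optimizing compiler
 * may unroll these loops fully and inline the result at the call site. */
static void matmul_fixed(double a[ROWS_A][INNER],
                         double b[INNER][COLS_B],
                         double c[ROWS_A][COLS_B])
{
    for (size_t i = 0; i < ROWS_A; ++i) {
        for (size_t j = 0; j < COLS_B; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < INNER; ++k)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
    }
}

/* Dimensions arrive at run time (row-major flat arrays), which is the
 * situation a general library routine has to handle. */
static void matmul_generic(size_t m, size_t k, size_t n,
                           const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

Both routines compute the same product; only the amount of information
available to the optimizer differs.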
