On Tuesday May 09 2017 23:54:41 Allan Sandfeld Jensen wrote:

>It is nowhere near hand-written SIMD code, but then neither are generic 
>libraries ;)

Not if you don't use them right, no. But I'd like to think there's still a gray 
area where humans can outsmart the compiler, for instance because 
the auto-vectoriser doesn't use the full instruction set.
The kind of library I'm thinking of does assume that its users know what 
they're doing. But even if not,

>(but at 4x vectorization that is still twice as fast as not doing anything).

if that's what you get when you can multiply two QVectors "as is" instead of 
writing out the loop in plain C, there's still an advantage, no?
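
To make the "as is" idea concrete, here is a rough sketch of what such an operator could look like. FloatVec is a hypothetical stand-in for QVector<float> (so the snippet builds without Qt), and the operator is my own illustration, not something Qt or MacSTL provides in this form. The loop is kept simple on purpose so that an auto-vectoriser has a fair chance with it; whether it actually vectorises depends on the compiler and flags.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for QVector<float>, kept Qt-free so it builds anywhere.
struct FloatVec {
    std::vector<float> data;
};

// Element-wise product of two vectors. The result has the length of the
// shorter operand. The loop works on plain pointers with a simple body,
// which is the shape auto-vectorisers tend to handle best.
inline FloatVec operator*(const FloatVec &a, const FloatVec &b)
{
    const std::size_t n =
        a.data.size() < b.data.size() ? a.data.size() : b.data.size();
    FloatVec out;
    out.data.resize(n);
    const float *pa = a.data.data();
    const float *pb = b.data.data();
    float *po = out.data.data();
    for (std::size_t i = 0; i < n; ++i)
        po[i] = pa[i] * pb[i];
    return out;
}
```

With that in place, user code can simply write c = a * b and leave the loop, and any SIMD it may get, to the library.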

FWIW, this is my forked and updated copy of the MacSTL library I mentioned. WIP 
that I started neglecting after 2012 when I no longer had a need for it and my 
hardware started falling behind (I only have a 2011 i7 and a 2016 N3150 
nowadays).
https://github.com/RJVB/MacSTL

>I doubt that would work. At least for intrinsic it would produce very poor 
>binary output since it would generate intermediate code the compiler then 
>can't map to the optimal instructions they were meant for.

That depends on how the intrinsic functions are defined and whether compiler 
switches like -mavx do anything other than defining the preprocessor token. I've 
never really looked at what gcc does precisely in this department.
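
For reference, the preprocessor token side of this can be sketched as follows. As far as I know, GCC and Clang define __AVX__ when AVX code generation is enabled (for instance with -mavx), and the _mm256_* intrinsics come from <immintrin.h>. The names built_with_avx and mul_floats are made up for this example. The sketch only shows how code can select an intrinsic path at compile time; it says nothing about what else the switch changes inside the compiler.

```cpp
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Reports whether this translation unit was compiled with AVX enabled,
// i.e. whether the compiler defined __AVX__.
inline bool built_with_avx()
{
#if defined(__AVX__)
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: out[i] = a[i] * b[i] for n floats.
inline void mul_floats(const float *a, const float *b, float *out,
                       std::size_t n)
{
    std::size_t i = 0;
#if defined(__AVX__)
    // AVX path: 8 floats per iteration, unaligned loads and stores.
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_mul_ps(va, vb));
    }
#endif
    // Scalar path: handles the tail, or everything when AVX is not enabled.
    for (; i < n; ++i)
        out[i] = a[i] * b[i];
}
```

Built without -mavx this falls back to the plain loop, so the same source compiles either way and gives the same results.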

>Maybe it works for inline assembler?

If memory serves me well, the last time I looked at the intrinsic header files 
shipped with clang on Mac they just defined macros expanding to inline 
assembly. So yeah, that should work.

Cheers,
René
_______________________________________________
Interest mailing list
Interest@qt-project.org
http://lists.qt-project.org/mailman/listinfo/interest
