https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63791

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Marcus Kool from comment #2)
>
> To summarize, gcc 4.8.4 and gcc 4.9.2 produce code that can be optimised
> further, and gcc 5.1.0 produces even slower code, which means that the
> implementation of *_set1_epi8() is slower, or much slower, than it could be.

That is done on purpose.  Adding -mtune=intel will get you what you want.
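
For anyone who wants to compare the generated code, here is a minimal sketch
of a test case.  The helper name splat16 and the file name splat.c are
hypothetical and not taken from this report; the only assumption is an x86
target with SSE2 and a GCC recent enough to accept -mtune=intel.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_set1_epi8, _mm_storeu_si128 */

/* Hypothetical helper, not from the bug report: broadcast one byte to all
   16 lanes of an XMM register and store the result to out (16 bytes). */
void splat16(unsigned char *out, char c)
{
    __m128i v = _mm_set1_epi8(c);          /* the intrinsic under discussion */
    _mm_storeu_si128((__m128i *)out, v);   /* unaligned store, out may be anywhere */
}
```

Compiling the file twice and diffing the assembly shows what the tuning
option changes for the broadcast sequence:

    gcc -O2 -msse2 -S -o generic.s splat.c
    gcc -O2 -msse2 -mtune=intel -S -o intel.s splat.c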
