On Fri, Jul 10, 2015 at 11:37:18AM +0200, Uros Bizjak wrote:
> Hello!
>
> > As I wrote at
> >
> > [PATCH, libcpp]: Use asm flag outputs in search_line_sse42 main loop
> >
> > https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg113610.html
> >
> > I won't repeat myself with reasons; summary is that current sse4.2 code is
> > redundant as it has same performance as sse2 one.
> > This improves sse2 performance by around 10% vs sse4.2 code by
> > using better header.
>
> Have you tried new SSE4.2 implementation (the one with asm flags) with
> unrolled loop?

Also, the SSE4.2 implementation looks shorter, and therefore more I-cache
friendly, so I wouldn't really say it is redundant if the two are roughly
the same speed.

	Jakub