http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54201
--- Comment #8 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-08-14 14:37:42 UTC ---
Does not help the 2nd testcase btw, because we do not CSE the loads:

        movdqa  .LC0, %xmm3
        movdqa  .LC0, %xmm2
        pand    %xmm3, %xmm0
        pcmpeqb %xmm2, %xmm0
        pand    %xmm0, %xmm1
        pand    %xmm3, %xmm1
        movdqa  %xmm1, %xmm0
        pcmpeqb %xmm2, %xmm0
        ret

so it's only half of the issue.