[Bug target/94866] Failure to optimize pinsrq of 0 with index 1 into movq

crazylht at gmail dot com via Gcc-bugs Wed, 23 Aug 2023 06:39:10 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94866


--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Uroš Bizjak from comment #4)
> (In reply to Hongtao.liu from comment #3)
> > In the x86 backend's expand_vec_perm_1, we always try vec_merge first for
> > !one_operand_p; expand_vselect_vconcat is only tried when vec_merge fails,
> > which means we'd better use vec_merge instead of vec_select:vec_concat
> > when available in our backend pattern match.
> 
> In fact, I tried to convert existing sse2_movq128 patterns to vec_merge, but
> the patch regressed:
> 
> -FAIL: gcc.target/i386/sse2-pr94680-2.c scan-assembler movq
> -FAIL: gcc.target/i386/sse2-pr94680-2.c scan-assembler-not pxor
> -FAIL: gcc.target/i386/sse2-pr94680.c scan-assembler-not pxor
> -FAIL: gcc.target/i386/sse2-pr94680.c scan-assembler-times
> (?n)(?:mov|psrldq).*%xmm[0-9] 12
> 
> So, the compiler still expects vec_concat/vec_select patterns to be present.


v2df foo_v2df (v2df x)
{
  return __builtin_shuffle (x, (v2df) { 0, 0 }, (v2di) { 0, 2 });
}

The testcase is not a typical vec_merge case: for a true vec_merge, the
shuffle index would be {0, 3}. Here it only happens to be equivalent to a
vec_merge because the second vector is all zeros. And yes, for this case we
still need the vec_concat:vec_select pattern.
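
To make the {0, 2} vs. {0, 3} point concrete, here is a minimal sketch using
GCC's generic vector extensions. It assumes GCC (for __builtin_shuffle); the
typedefs and the names shuffle_0_2, shuffle_0_3 and check_equivalence are
illustrative and not taken from the testsuite. With selector {0, 3} each
result lane comes from the same lane position of one of the two inputs, which
is the vec_merge shape. With {0, 2} lane 1 takes element 0 of the second
vector, so it is only equivalent when all elements of that vector are equal,
as with the zero vector here.

```c
#include <assert.h>

/* Illustrative typedefs; the testcase uses its own definitions.  */
typedef double v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Selector {0, 2}: lane 0 = x[0], lane 1 = element 0 of the zero vector.
   Lane 1 crosses lane positions, so this is not vec_merge-shaped.  */
static v2df
shuffle_0_2 (v2df x)
{
  return __builtin_shuffle (x, (v2df) { 0, 0 }, (v2di) { 0, 2 });
}

/* Selector {0, 3}: lane 0 = x[0], lane 1 = element 1 of the zero vector.
   Every lane keeps its position, which is the vec_merge shape.  */
static v2df
shuffle_0_3 (v2df x)
{
  return __builtin_shuffle (x, (v2df) { 0, 0 }, (v2di) { 0, 3 });
}

/* Hypothetical helper: both selectors give { x[0], 0 } because the
   second shuffle operand is all zeros.  */
static void
check_equivalence (v2df x)
{
  v2df a = shuffle_0_2 (x);
  v2df b = shuffle_0_3 (x);
  assert (a[0] == x[0] && b[0] == x[0]);
  assert (a[1] == 0.0 && b[1] == 0.0);
}
```

Calling check_equivalence ((v2df) { 1.5, 2.5 }) passes for both forms. This
only shows that the results agree at the source level; which RTL pattern each
form is matched against depends on the backend, as discussed above.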
