https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
--- Comment #10 from Jakub Jelinek ---
For bar, the problem is that while vpmovdw is AVX512F, we actually recognize it
only at combine time as vpermw (with selected exact permutation) combined with
low part extraction. And vpermw is only AVX512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
--- Comment #9 from Hongtao.liu ---
(In reply to Andrew Pinski from comment #7)
> With just -mavx512f we produce a bunch of instructions (looking like we went
> to scalar mode) while LLVM is able to produce:
> foo(short __vector(16)):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
--- Comment #8 from Andrew Pinski ---
foo is like a zero extend even.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
Andrew Pinski changed:
What|Removed |Added
Severity|normal |enhancement
--- Comment #7 from Andrew
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
--- Comment #6 from CVS Commits ---
The master branch has been updated by hongtao Liu:
https://gcc.gnu.org/g:faf2b6bc527dff31725dde55381c92688047
commit r12-2919-gfaf2b6bc527dff31725dde55381c92688047
Author: liuhongt
Date: Mon Aug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
--- Comment #5 from CVS Commits ---
The master branch has been updated by hongtao Liu:
https://gcc.gnu.org/g:95e1eca43d106d821720744ac6ff1f5df41a1e78
commit r12-2869-g95e1eca43d106d821720744ac6ff1f5df41a1e78
Author: liuhongt
Date: Wed Aug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
--- Comment #4 from Hongtao.liu ---
But when the upper bits are not used, the vpmovdw version seems better.
v4hi
bar_dw_128 (v8hi x)
{
  return __builtin_shufflevector (x, x, 0, 2, 4, 6); // 4, 5, 6, 7);
}
- vpshufb .LC2(%rip), %xmm0, %xmm0
+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
--- Comment #3 from Hongtao.liu ---
expand_vec_perm_1 is supposed to generate 1 instruction, but it doesn't
consider load of const_vector, if we handle (In reply to Hongtao.liu from
comment #2)
> For foo, vmovdqa is avx_vec_concatv16si/2, and we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
--- Comment #2 from Hongtao.liu ---
For foo, vmovdqa is avx_vec_concatv16si/2, and we can add define_insn_and_split
to combine avx_vec_concatv16si/2 and avx512f_zero_extendv16hiv16si2_1, similar
for other modes in pmovzx{bw,wd,dq}.
For bar, we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846
Richard Biener changed:
What|Removed |Added
Last reconfirmed||2021-08-10
Status|UNCONFIR