https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286
Bug ID: 64286
Summary: Redundant extend removal ignores vector element type
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: sergos.gnu at gmail dot com
Created attachment 34266
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34266&action=edit
reproducer, taken from public sources
The problem is reproducible starting with 4.9 and also on trunk.
Line 29 contains a load into a V16QI vector
29: p2 = _mm_loadu_si128((__m128i *) (s - 3 * p));
later used at
60: work = _mm_or_si128(_mm_subs_epu8(p2, p1), _mm_subs_epu8(p1, p2));
and later zero-extended into a V16HI vector
151: p256_2 = _mm256_cvtepu8_epi16(p2);
At pass 217 (split2) we have:
(insn 207 204 209 2 (set (reg:V16QI 21 xmm0 [447])
(mem:V16QI (plus:DI (reg/f:DI 6 bp)
(const_int -114 [0xffffffffffffff8e])) [0 S16 A16]))
GCC_Bug.p.c:2609 1136 {*movv16qi_internal}
(expr_list:REG_EQUIV (mem:V16QI (plus:DI (reg/f:DI 20 frame)
(const_int -66 [0xffffffffffffffbe])) [0 S16 A16])
(nil)))
...
(insn 236 235 238 2 (set (reg:V16QI 22 xmm1 [462])
(us_minus:V16QI (reg:V16QI 23 xmm2 [450])
(reg:V16QI 21 xmm0 [447]))) GCC_Bug.p.c:2925 2096
{*sse2_ussubv16qi3}
(nil))
... (and a number of other operations using xmm0 as V16QI)
(insn 871 869 873 2 (set (reg:V16HI 21 xmm0 [orig:573 D.17673 ] [573])
(zero_extend:V16HI (reg:V16QI 21 xmm0 [447]))) GCC_Bug.p.c:5280 2521
{avx2_zero_extendv16qiv16hi2}
(nil))
After that REE reports:
-------
Trying to eliminate extension:
(insn 871 869 873 2 (set (reg:V16HI 21 xmm0 [orig:573 D.17673 ] [573])
(zero_extend:V16HI (reg:V16QI 21 xmm0 [447]))) GCC_Bug.p.c:5280 2521
{avx2_zero_extendv16qiv16hi2}
(nil))
Tentatively merged extension with definition :
(insn 207 204 209 2 (set (reg:V16HI 21 xmm0)
(zero_extend:V16HI (mem:V16QI (plus:DI (reg/f:DI 6 bp)
(const_int -114 [0xffffffffffffff8e])) [0 S16 A16])))
GCC_Bug.p.c:2609 -1
(nil))
deferring rescan insn with uid = 207.
All merges were successful.
Eliminated the extension.
-------------
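That renders all V16QI insns that read xmm0 between insn 207 and insn 871
invalid, because after the merge the register holds zero-extended V16HI data
instead of the original bytes.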
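To illustrate the effect on the intermediate V16QI uses, here is a minimal scalar sketch. It assumes a toy byte-level model of the register with little-endian lanes. The type `vreg` and all function names are hypothetical and are not GCC internals or the reproducer's code. It only shows that a byte-wise consumer no longer sees the loaded bytes once the load itself is turned into a zero-extending load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a 256-bit vector register as raw bytes (little-endian lanes).
   Assumption: illustration only, this is not how REE is implemented. */
typedef struct { uint8_t bytes[32]; } vreg;

/* Models the original insn 207: a plain V16QI load into the low 16 bytes. */
static void load_v16qi(vreg *r, const uint8_t src[16])
{
    memset(r->bytes, 0, sizeof r->bytes);
    memcpy(r->bytes, src, 16);
}

/* Models insn 207 after the merge: each source byte is widened
   to a 16-bit lane, so the bytes are spread over 32 bytes. */
static void load_zero_extend_v16hi(vreg *r, const uint8_t src[16])
{
    for (int i = 0; i < 16; i++) {
        r->bytes[2 * i] = src[i];   /* low byte of 16-bit lane i */
        r->bytes[2 * i + 1] = 0;    /* high byte of 16-bit lane i */
    }
}

/* Models a V16QI consumer such as insn 236: it reads the low 16 bytes
   as 16 byte elements. Returns how many differ from the memory contents. */
static int v16qi_mismatches(const vreg *r, const uint8_t src[16])
{
    int n = 0;
    for (int i = 0; i < 16; i++)
        n += (r->bytes[i] != src[i]);
    return n;
}

/* Hypothetical driver: number of wrong byte elements a V16QI use sees
   after the plain load (merged == 0) or the merged load (merged == 1). */
static int mismatches_after_load(int merged)
{
    uint8_t src[16];
    vreg r;

    assert(merged == 0 || merged == 1);
    for (int i = 0; i < 16; i++)
        src[i] = (uint8_t)(i + 1);  /* 1..16: distinct, non-zero bytes */

    if (merged)
        load_zero_extend_v16hi(&r, src);
    else
        load_v16qi(&r, src);
    return v16qi_mismatches(&r, src);
}
```

Calling `mismatches_after_load(0)` models the code before REE, and `mismatches_after_load(1)` models it after the merge. In this model only element 0 survives the merged load, which matches the observation that the V16QI operations on xmm0 compute on wrong data.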
The test should be compiled with
gcc -O2 GCC_Bug_min.c -mavx2
and run on an AVX2-enabled platform.
Correct output:
Is valid: 1
Incorrect output:
Is valid: 0