https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679
--- Comment #3 from Tejas Belagod <belagod at gcc dot gnu.org> ---
When I try 5.0 with -fno-tree-vectorize, I get:
;; basic block 2, loop depth 0
;; pred: ENTRY
# .MEM_4 = VDEF <.MEM_3(D)>
aD.2496 = *.LC0D.2503;
# VUSE <.MEM_4>
_10 = aD.2496[0];
# VUSE <.MEM_4>
_22 = aD.2496[1];
sum_23 = _10 + _22;
# VUSE <.MEM_4>
_29 = aD.2496[2];
sum_30 = sum_23 + _29;
# VUSE <.MEM_4>
_36 = aD.2496[3];
sum_37 = sum_30 + _36;
# VUSE <.MEM_4>
_43 = aD.2496[4];
sum_44 = sum_37 + _43;
# VUSE <.MEM_4>
_50 = aD.2496[5];
sum_51 = sum_44 + _50;
# VUSE <.MEM_4>
_57 = aD.2496[6];
sum_58 = sum_51 + _57;
# VUSE <.MEM_4>
_6 = aD.2496[7];
sum_7 = _6 + sum_58;
# .MEM_9 = VDEF <.MEM_4>
aD.2496 ={v} {CLOBBER};
# VUSE <.MEM_9>
return sum_7;
;; succ: EXIT
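For orientation, a source function of roughly the following shape would give a dump like the one above: a constant-initialized local array whose summing loop is fully unrolled. This is an assumed reconstruction for illustration only, not necessarily the testcase attached to the bug. The names foo, a and sum are taken from the dumps, while the element values and the exact loop form are hypothetical.

```c
/* Assumed shape of the testcase; the element values are made up. */
int
foo (void)
{
  /* Local array initialized from a constant in .rodata; this copy is
     the "aD.2496 = *.LC0D.2503" statement in the GIMPLE dump.  */
  const int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  int i, sum;

  sum = 0;
  /* Eight iterations; after complete unrolling this becomes the
     chain of sum_N = sum_M + a[k] statements.  */
  for (i = 0; i < (int) (sizeof (a) / sizeof (*a)); i++)
    sum += a[i];
  return sum;
}
```

Since every load reads a known constant, the whole function could ideally fold to a single constant return, which is why the surviving stack copy matters.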
This:
# .MEM_4 = VDEF <.MEM_3(D)>
aD.2496 = *.LC0D.2503;
is the main difference from 4.9. 5.0 seems to use TImode loads and stores to
initialize the stack array from the const array:
(insn 10 9 11 (set (mem/c:TI (reg:DI 91) [1 aD.2496+0 S16 A128])
(reg:TI 93)) foo.c:4 -1
(nil))
(insn 11 10 12 (set (reg:TI 94)
(mem/u/c:TI (plus:DI (reg:DI 92)
(const_int 16 [0x10])) [0 S16 A32])) foo.c:4 -1
(nil))
(insn 12 11 0 (set (mem/c:TI (plus:DI (reg:DI 91)
(const_int 16 [0x10])) [1 aD.2496+16 S16 A128])
(reg:TI 94)) foo.c:4 -1
(nil))
;; sum_23 = _10 + _22;
(insn 13 12 14 (set (reg:SI 95)
(mem/c:SI (plus:DI (reg/f:DI 68 virtual-stack-vars)
(const_int -32 [0xffffffffffffffe0])) [1 aD.2496+0 S4 A128]))
foo.c:9 -1
(nil))
When DSE tries to replace the SImode loads with the stored values (so that the
stack copy can be optimized away), it fails to extract SI values from the
TImode stores:
**scanning insn=14
cselib lookup (reg/f:DI 64 sfp) => 3:3
cselib value 6:4299 0x2f6de50 (plus:DI (reg/f:DI 64 sfp)
(const_int -28 [0xffffffffffffffe4]))
cselib lookup (plus:DI (reg/f:DI 64 sfp)
(const_int -28 [0xffffffffffffffe4])) => 6:4299
mem: (plus:DI (reg/f:DI 64 sfp)
(const_int -28 [0xffffffffffffffe4]))
after canon_rtx address: (plus:DI (reg/f:DI 64 sfp)
(const_int -28 [0xffffffffffffffe4]))
gid=0 offset=-28
processing const load gid=0[-28..-24)
trying to replace SImode load in insn 14 from TImode store in insn 10
(lshiftrt:DI (reg:DI 105)
(const_int 32 [0x20]))
Hot cost: 8 (final)
-- could not extract bits of stored value
removing from active insn=10 has store
mems_found = 0, cannot_delete = true
cselib lookup (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
(const_int -28 [0xffffffffffffffe4])) [1 aD.2496+4 S4 A32]) => 0:0
**scanning insn=15
....
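To make the failing step concrete, here is a minimal sketch of the extraction DSE is attempting: pulling a 4-byte value at a given byte offset out of a 16-byte stored value. It is not GCC's DSE code. It assumes a little-endian layout and models the TImode value as two 64-bit halves. The struct ti_value and the helper extract_si are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical model of a 16-byte (TImode-sized) stored value,
   split into two 64-bit halves.  */
struct ti_value
{
  uint64_t lo;   /* bytes 0..7 of the store (little-endian assumption) */
  uint64_t hi;   /* bytes 8..15 of the store */
};

/* Return the 32-bit value located at BYTE_OFFSET (0, 4, 8 or 12)
   within the 16-byte value V, assuming little-endian byte order.  */
static uint32_t
extract_si (struct ti_value v, unsigned byte_offset)
{
  /* Pick the 64-bit half that contains the requested bytes.  */
  uint64_t half = (byte_offset < 8) ? v.lo : v.hi;
  /* Shift the wanted 32 bits down to bit 0, then truncate.  */
  unsigned shift = (byte_offset % 8) * 8;
  return (uint32_t) (half >> shift);
}
```

For the load in insn 14 (offset -28 against an array based at -32, i.e. byte offset 4), this amounts to a right shift of the low half by 32 bits, which matches the (lshiftrt:DI (reg:DI 105) (const_int 32)) expression in the dump.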