https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679
--- Comment #3 from Tejas Belagod <belagod at gcc dot gnu.org> ---
When I try 5.0 with -fno-tree-vectorize, I get:

  ;; basic block 2, loop depth 0
  ;; pred: ENTRY
  # .MEM_4 = VDEF <.MEM_3(D)>
  aD.2496 = *.LC0D.2503;
  # VUSE <.MEM_4>
  _10 = aD.2496[0];
  # VUSE <.MEM_4>
  _22 = aD.2496[1];
  sum_23 = _10 + _22;
  # VUSE <.MEM_4>
  _29 = aD.2496[2];
  sum_30 = sum_23 + _29;
  # VUSE <.MEM_4>
  _36 = aD.2496[3];
  sum_37 = sum_30 + _36;
  # VUSE <.MEM_4>
  _43 = aD.2496[4];
  sum_44 = sum_37 + _43;
  # VUSE <.MEM_4>
  _50 = aD.2496[5];
  sum_51 = sum_44 + _50;
  # VUSE <.MEM_4>
  _57 = aD.2496[6];
  sum_58 = sum_51 + _57;
  # VUSE <.MEM_4>
  _6 = aD.2496[7];
  sum_7 = _6 + sum_58;
  # .MEM_9 = VDEF <.MEM_4>
  aD.2496 ={v} {CLOBBER};
  # VUSE <.MEM_9>
  return sum_7;
  ;; succ: EXIT

This:

  # .MEM_4 = VDEF <.MEM_3(D)>
  aD.2496 = *.LC0D.2503;

is what's mainly different from 4.9. 5.0 seems to use a TImode load to
initialize the stack with the const array.

  (insn 10 9 11 (set (mem/c:TI (reg:DI 91) [1 aD.2496+0 S16 A128])
          (reg:TI 93)) foo.c:4 -1
       (nil))

  (insn 11 10 12 (set (reg:TI 94)
          (mem/u/c:TI (plus:DI (reg:DI 92)
                  (const_int 16 [0x10])) [0 S16 A32])) foo.c:4 -1
       (nil))

  (insn 12 11 0 (set (mem/c:TI (plus:DI (reg:DI 91)
                  (const_int 16 [0x10])) [1 aD.2496+16 S16 A128])
          (reg:TI 94)) foo.c:4 -1
       (nil))

  ;; sum_23 = _10 + _22;

  (insn 13 12 14 (set (reg:SI 95)
          (mem/c:SI (plus:DI (reg/f:DI 68 virtual-stack-vars)
                  (const_int -32 [0xffffffffffffffe0])) [1 aD.2496+0 S4 A128])) foo.c:9 -1
       (nil))

When DSE wants to optimize it away, it fails to extract SI values from the
TImode stores:

  **scanning insn=14
  cselib lookup (reg/f:DI 64 sfp) => 3:3
  cselib value 6:4299 0x2f6de50 (plus:DI (reg/f:DI 64 sfp)
      (const_int -28 [0xffffffffffffffe4]))
  cselib lookup (plus:DI (reg/f:DI 64 sfp)
      (const_int -28 [0xffffffffffffffe4])) => 6:4299
    mem: (plus:DI (reg/f:DI 64 sfp)
      (const_int -28 [0xffffffffffffffe4]))
    after canon_rtx address: (plus:DI (reg/f:DI 64 sfp)
      (const_int -28 [0xffffffffffffffe4]))
    gid=0 offset=-28
   processing const load gid=0[-28..-24)
  trying to replace SImode load in insn 14 from TImode store in insn 10
  (lshiftrt:DI (reg:DI 105)
      (const_int 32 [0x20]))
  Hot cost: 8 (final)
   -- could not extract bits of stored value
  removing from active insn=10 has store
  mems_found = 0, cannot_delete = true
  cselib lookup (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
          (const_int -28 [0xffffffffffffffe4])) [1 aD.2496+4 S4 A32]) => 0:0
  **scanning insn=15
  ....
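
For readers without foo.c at hand, here is a rough sketch of what seems to be going on. It is not the original test case: sum_const_array() is a guess at the shape of the source, based on the dump (eight SImode elements copied from a constant and summed), and the initializer values are made up. The two helpers model the replacement DSE attempts for insn 14, assuming a little-endian layout: the load of a[1] at byte offset 4 corresponds to a logical right shift by 32 of the low DImode half of the stored TImode value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for foo.c; the real source is not shown in this
   comment and the initializer values are assumptions.  */
int sum_const_array(void)
{
    int a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += a[i];
    return sum;
}

/* Model the low 64 bits of the 128-bit store as they would appear in a
   register on a little-endian target: a[0] in bits 0..31, a[1] in bits
   32..63.  */
uint64_t pack_low_half(uint32_t a0, uint32_t a1)
{
    return ((uint64_t)a1 << 32) | a0;
}

/* Recover the 32-bit element at BYTE_OFFSET (0 or 4) from the stored
   value without going through memory.  For offset 4 this is the
   (lshiftrt:DI ... (const_int 32)) seen in the DSE dump.  */
uint32_t extract_si(uint64_t stored, unsigned byte_offset)
{
    assert(byte_offset == 0 || byte_offset == 4);
    return (uint32_t)(stored >> (8 * byte_offset));
}
```

If the extraction succeeded for every element, all the SImode loads could be fed from the stored registers and the stack copy of the array would be dead. In the log above it is rejected ("could not extract bits of stored value"), so the store in insn 10 is marked cannot_delete.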