https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891
Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-02-21
                 CC|                            |ubizjak at gmail dot com
          Component|target                      |rtl-optimization
     Ever confirmed|0                           |1

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Zdenek Sojka from comment #0)
> Reproduces with x86_64 compiler -m32 as well.

(-mno-sse has to be added in case of x86_64 compiler with -m32).

This is an RTL aliasing issue. We start with the following _optimized tree dump:

  <bb 2>:
  _2 = BIT_FIELD_REF <v32u32_1, 32, 0>;
  ...
  _9 = _2 | 7;
  BIT_FIELD_REF <v32u32_1, 32, 0> = _9;
  ...
  v32u32_1 = { 0, 0, 0, 0, 0, 0, 0, 0 };
  ...
  _19 = BIT_FIELD_REF <v32u32_1, 32, 0>;
  ...
  _27 = _19 + _22;
  ...

which gets expanded to:

;; BIT_FIELD_REF <v32u32_1, 32, 0> = _9;

(insn 7 6 8 (parallel [
            (set (reg:SI 121)
                (ior:SI (reg:SI 87 [ _2 ])
                    (const_int 7 [0x7])))
            (clobber (reg:CC 17 flags))
        ]) pr69891.c:19 -1
     (nil))

(insn 8 7 0 (set (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                (const_int 64 [0x40])) [2 v32u32_1+0 S4 A256])
        (reg:SI 121)) pr69891.c:19 -1
     (nil))

...
(insn 13 12 0 (set (reg:SI 119 [ _117 ])
        (reg:SI 125)) pr69891.c:25 -1
     (nil))

;; v32u32_1 = { 0, 0, 0, 0, 0, 0, 0, 0 };

(insn 14 13 15 (parallel [
            (set (reg:SI 127)
                (plus:SI (reg/f:SI 81 virtual-incoming-args)
                    (const_int 64 [0x40])))
            (clobber (reg:CC 17 flags))
        ]) pr69891.c:31 -1
     (nil))

(insn 15 14 16 (set (reg:SI 128)
        (const_int 32 [0x20])) pr69891.c:31 -1
     (nil))

(insn 16 15 17 (parallel [
            (set (reg/f:SI 7 sp)
                (plus:SI (reg/f:SI 7 sp)
                    (const_int -20 [0xffffffffffffffec])))
            (clobber (reg:CC 17 flags))
        ]) pr69891.c:31 -1
     (expr_list:REG_ARGS_SIZE (const_int 20 [0x14])
        (nil)))

(insn 17 16 18 (set (mem:SI (pre_dec:SI (reg/f:SI 7 sp)) [2 S4 A32])
        (reg:SI 128)) pr69891.c:31 -1
     (expr_list:REG_ARGS_SIZE (const_int 24 [0x18])
        (nil)))

(insn 18 17 19 (set (mem:SI (pre_dec:SI (reg/f:SI 7 sp)) [2 S4 A32])
        (const_int 0 [0])) pr69891.c:31 -1
     (expr_list:REG_ARGS_SIZE (const_int 28 [0x1c])
        (nil)))

(insn 19 18 20 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [4 S4 A32])
        (reg:SI 127)) pr69891.c:31 -1
     (expr_list:REG_ARGS_SIZE (const_int 32 [0x20])
        (nil)))

(call_insn 20 19 21 (set (reg:SI 0 ax)
        (call (mem:QI (symbol_ref:SI ("memset") [flags 0x41] <function_decl 0x7f5734764e00 memset>) [0 memset S1 A8])
            (const_int 32 [0x20]))) pr69891.c:31 -1
     (expr_list:REG_EH_REGION (const_int 0 [0])
        (nil))
    (nil))

...
(insn 170 169 171 (set (reg:SI 202)
        (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                (const_int 64 [0x40])) [2 v32u32_1+0 S4 A256])) pr69891.c:37 -1
     (nil))

(insn 171 170 172 (set (reg:SI 203)
        (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                (const_int 120 [0x78])) [3 v32u64_1+24 S4 A64])) pr69891.c:37 -1
     (nil))

(insn 172 171 173 (parallel [
            (set (reg:SI 201)
                (plus:SI (reg:SI 202)
                    (reg:SI 203)))
            (clobber (reg:CC 17 flags))
        ]) pr69891.c:37 -1
     (expr_list:REG_EQUAL (plus:SI (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                    (const_int 64 [0x40])) [2 v32u32_1+0 S4 A256])
            (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                    (const_int 120 [0x78])) [3 v32u64_1+24 S4 A64]))
        (nil)))

However, the DSE1 pass propagates r121 (aka r207) from (insn 7) all the way to
(insn 170), without considering the aliasing memset in (insn 20).

    5: r87:SI=[argp:SI+0x40]
    6: {r89:HI=-r87:SI#0;clobber flags:CC;}
      REG_UNUSED flags:CC
    7: {r121:SI=r87:SI|0x7;clobber flags:CC;}
      REG_DEAD r87:SI
      REG_UNUSED flags:CC
  186: r207:SI=r121:SI
    8: [argp:SI+0x40]=r121:SI
...
   14: {r127:SI=argp:SI+0x40;clobber flags:CC;}
      REG_UNUSED flags:CC
   16: {sp:SI=sp:SI-0x14;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_ARGS_SIZE 0x14
   17: [--sp:SI]=0x20
      REG_ARGS_SIZE 0x18
   18: [--sp:SI]=0
      REG_ARGS_SIZE 0x1c
   19: [--sp:SI]=r127:SI
      REG_DEAD r127:SI
      REG_ARGS_SIZE 0x20
   20: ax:SI=call [`memset'] argc:0x20
      REG_UNUSED ax:SI
      REG_EH_REGION 0
...
  170: r202:SI=r207:SI
      REG_DEAD r207:SI
  171: r203:SI=[argp:SI+0x78]
  172: {r201:SI=r202:SI+r203:SI;clobber flags:CC;}
      REG_DEAD r203:SI
      REG_DEAD r202:SI
      REG_UNUSED flags:CC
      REG_EQUAL [argp:SI+0x40]+[argp:SI+0x78]

Confirmed as an RTL optimization issue.
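
For readers who want the shape of the problem at source level, the store / memset / reload sequence described above can be sketched in C roughly as follows. This is a simplified illustration and NOT the testcase attached to the PR: the type vec8u32 and the function store_clear_reload are hypothetical names, and a plain struct passed by value stands in for the vector argument that lives in the incoming-args area. It is not claimed that this sketch reproduces the miscompile.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 256-bit vector argument (v32u32_1). */
typedef struct { uint32_t e[8]; } vec8u32;

/* Hypothetical function: store to element 0, clear the whole object
   through memset, then read element 0 back.  */
uint32_t store_clear_reload(vec8u32 v, uint32_t addend)
{
    v.e[0] |= 7;              /* corresponds to insns 7 and 8: store to v+0   */
    memset(&v, 0, sizeof v);  /* corresponds to call_insn 20: overwrites v    */
    return v.e[0] + addend;   /* corresponds to insns 170-172: must load 0,
                                 not the value stored before the memset      */
}
```

With a correct compiler the function returns addend unchanged, because the memset overwrites the earlier store. Forwarding the pre-memset value (r121 in the dump) to the final load would instead yield (old element | 7) + addend.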