https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-02-21
                 CC|                            |ubizjak at gmail dot com
          Component|target                      |rtl-optimization
     Ever confirmed|0                           |1

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Zdenek Sojka from comment #0)

> Reproduces with x86_64 compiler -m32 as well.
(-mno-sse has to be added when using an x86_64 compiler with -m32).

This is an RTL aliasing issue.

We start with the following _optimized tree dump:

  <bb 2>:
  _2 = BIT_FIELD_REF <v32u32_1, 32, 0>;
  ...
  _9 = _2 | 7;
  BIT_FIELD_REF <v32u32_1, 32, 0> = _9;
  ...
  v32u32_1 = { 0, 0, 0, 0, 0, 0, 0, 0 };
  ...
  _19 = BIT_FIELD_REF <v32u32_1, 32, 0>;
  ...
  _27 = _19 + _22;
  ...

which gets expanded to:

;; BIT_FIELD_REF <v32u32_1, 32, 0> = _9;

(insn 7 6 8 (parallel [
            (set (reg:SI 121)
                (ior:SI (reg:SI 87 [ _2 ])
                    (const_int 7 [0x7])))
            (clobber (reg:CC 17 flags))
        ]) pr69891.c:19 -1
     (nil))

(insn 8 7 0 (set (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                (const_int 64 [0x40])) [2 v32u32_1+0 S4 A256])
        (reg:SI 121)) pr69891.c:19 -1
     (nil))

...

(insn 13 12 0 (set (reg:SI 119 [ _117 ])
        (reg:SI 125)) pr69891.c:25 -1
     (nil))

;; v32u32_1 = { 0, 0, 0, 0, 0, 0, 0, 0 };

(insn 14 13 15 (parallel [
            (set (reg:SI 127)
                (plus:SI (reg/f:SI 81 virtual-incoming-args)
                    (const_int 64 [0x40])))
            (clobber (reg:CC 17 flags))
        ]) pr69891.c:31 -1
     (nil))

(insn 15 14 16 (set (reg:SI 128)
        (const_int 32 [0x20])) pr69891.c:31 -1
     (nil))

(insn 16 15 17 (parallel [
            (set (reg/f:SI 7 sp)
                (plus:SI (reg/f:SI 7 sp)
                    (const_int -20 [0xffffffffffffffec])))
            (clobber (reg:CC 17 flags))
        ]) pr69891.c:31 -1
     (expr_list:REG_ARGS_SIZE (const_int 20 [0x14])
        (nil)))

(insn 17 16 18 (set (mem:SI (pre_dec:SI (reg/f:SI 7 sp)) [2  S4 A32])
        (reg:SI 128)) pr69891.c:31 -1
     (expr_list:REG_ARGS_SIZE (const_int 24 [0x18])
        (nil)))

(insn 18 17 19 (set (mem:SI (pre_dec:SI (reg/f:SI 7 sp)) [2  S4 A32])
        (const_int 0 [0])) pr69891.c:31 -1
     (expr_list:REG_ARGS_SIZE (const_int 28 [0x1c])
        (nil)))

(insn 19 18 20 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [4  S4 A32])
        (reg:SI 127)) pr69891.c:31 -1
     (expr_list:REG_ARGS_SIZE (const_int 32 [0x20])
        (nil)))

(call_insn 20 19 21 (set (reg:SI 0 ax)
        (call (mem:QI (symbol_ref:SI ("memset") [flags 0x41]  <function_decl 0x7f5734764e00 memset>) [0 memset S1 A8])
            (const_int 32 [0x20]))) pr69891.c:31 -1
     (expr_list:REG_EH_REGION (const_int 0 [0])
        (nil))
    (nil))

...

(insn 170 169 171 (set (reg:SI 202)
        (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                (const_int 64 [0x40])) [2 v32u32_1+0 S4 A256])) pr69891.c:37 -1
     (nil))

(insn 171 170 172 (set (reg:SI 203)
        (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                (const_int 120 [0x78])) [3 v32u64_1+24 S4 A64])) pr69891.c:37 -1
     (nil))

(insn 172 171 173 (parallel [
            (set (reg:SI 201)
                (plus:SI (reg:SI 202)
                    (reg:SI 203)))
            (clobber (reg:CC 17 flags))
        ]) pr69891.c:37 -1
     (expr_list:REG_EQUAL (plus:SI (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                    (const_int 64 [0x40])) [2 v32u32_1+0 S4 A256])
            (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args)
                    (const_int 120 [0x78])) [3 v32u64_1+24 S4 A64]))
        (nil)))

However, the DSE1 pass propagates r121 (aka r207) from (insn 7) all the way to
(insn 170), without considering the aliasing memset call in (call_insn 20).

    5: r87:SI=[argp:SI+0x40]
    6: {r89:HI=-r87:SI#0;clobber flags:CC;}
      REG_UNUSED flags:CC
    7: {r121:SI=r87:SI|0x7;clobber flags:CC;}
      REG_DEAD r87:SI
      REG_UNUSED flags:CC
  186: r207:SI=r121:SI
    8: [argp:SI+0x40]=r121:SI

  ...

   14: {r127:SI=argp:SI+0x40;clobber flags:CC;}
      REG_UNUSED flags:CC
   16: {sp:SI=sp:SI-0x14;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_ARGS_SIZE 0x14
   17: [--sp:SI]=0x20
      REG_ARGS_SIZE 0x18
   18: [--sp:SI]=0
      REG_ARGS_SIZE 0x1c
   19: [--sp:SI]=r127:SI
      REG_DEAD r127:SI
      REG_ARGS_SIZE 0x20
   20: ax:SI=call [`memset'] argc:0x20
      REG_UNUSED ax:SI
      REG_EH_REGION 0

  ...

  170: r202:SI=r207:SI
      REG_DEAD r207:SI
  171: r203:SI=[argp:SI+0x78]
  172: {r201:SI=r202:SI+r203:SI;clobber flags:CC;}
      REG_DEAD r203:SI
      REG_DEAD r202:SI
      REG_UNUSED flags:CC
      REG_EQUAL [argp:SI+0x40]+[argp:SI+0x78]

Confirmed as an RTL optimization issue.
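
For illustration only, the store / memset / reload sequence from the dumps can
be sketched in C as below. This is not the testcase from comment #0: the type
vec8u32 and both function names are hypothetical, and the vector is passed by
pointer here rather than in the incoming argument area. The first function
shows what the source semantics require; the second models, by hand, the value
the load in (insn 170) would receive if the register from (insn 7) were
forwarded across the memset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for v32u32_1: eight 32-bit lanes, 32 bytes total. */
typedef struct { uint32_t lane[8]; } vec8u32;

/* Required semantics: the memset overwrites the earlier store,
   so the reload of lane 0 must observe 0. */
uint32_t reload_after_memset(vec8u32 *v)
{
    uint32_t t = v->lane[0] | 7;   /* _9 = _2 | 7              (insn 7)   */
    v->lane[0] = t;                /* store to v32u32_1+0      (insn 8)   */
    memset(v, 0, sizeof *v);       /* v32u32_1 = { 0, ... }    (insn 20)  */
    return v->lane[0];             /* _19 = load of lane 0     (insn 170) */
}

/* Hand-written model of the wrong code: the stored value is kept in a
   register copy and reused after the memset instead of reloading. */
uint32_t forwarded_past_memset(vec8u32 *v)
{
    uint32_t t = v->lane[0] | 7;   /* r121 = r87 | 7           (insn 7)   */
    uint32_t saved = t;            /* r207 = r121              (insn 186) */
    v->lane[0] = t;                /* [argp+0x40] = r121       (insn 8)   */
    memset(v, 0, sizeof *v);       /* call memset              (insn 20)  */
    return saved;                  /* r202 = r207              (insn 170) */
}
```

With lane 0 initially 5, the first function returns 0, while the second
returns 7 even though the memory it claims to have read holds 0.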
