https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123010

--- Comment #8 from Jeffrey A. Law <law at gcc dot gnu.org> ---
What hasn't been explained is why 2^n is special here.

For both the multiply-by-2 and multiply-by-3 cases we get the expected code out
of gimple, as can be seen in the .optimized dump:

  _2 = a_1(D) * 2;
  return _2;


The initial expansion looks pretty much like you'd expect:

(insn 6 3 7 2 (set (reg:DI 137)
        (sign_extend:DI (ashift:SI (subreg/s/u:SI (reg/v:DI 135 [ a ]) 0)
                (const_int 1 [0x1])))) "j.c":10:14 -1
     (nil))
(insn 7 6 8 2 (set (reg:SI 136 [ _2 ])
        (subreg/s/u:SI (reg:DI 137) 0)) "j.c":10:14 -1
     (expr_list:REG_EQUAL (ashift:SI (subreg/s/u:SI (reg/v:DI 135 [ a ]) 0)
            (const_int 1 [0x1]))
        (nil)))
(insn 8 7 9 2 (set (reg:DI 138)
        (sign_extend:DI (reg:SI 136 [ _2 ]))) "j.c":10:14 discrim 1 -1
     (nil))
(insn 9 8 13 2 (set (reg:DI 134 [ <retval> ])
        (reg:DI 138)) "j.c":10:14 discrim 1 -1
     (nil))

Shift the object in DI, extract the low 32 bits.  It's expansion, so there's a
lot of redundancy in there.  Eventually it gets cleaned up in the expected ways,
resulting in:

(insn 6 3 13 2 (set (reg:DI 137)
        (sign_extend:DI (ashift:SI (subreg/s/u:SI (reg/v:DI 135 [ a ]) 0)
                (const_int 1 [0x1])))) "j.c":10:14 312 {ashlsi3_extend}
     (expr_list:REG_DEAD (reg/v:DI 135 [ a ])
        (nil)))

Where (reg 137) gets copied into the return register.   So far, so good.  

Then in combine we generate:

(insn 6 3 13 2 (parallel [
            (set (reg:DI 137)
                (sign_extract:DI (reg:DI 139 [ a ])
                    (const_int 31 [0x1f])
                    (const_int 0 [0])))
            (clobber (scratch:DI))
        ]) "j.c":10:14 333 {*extractdi3}
     (expr_list:REG_DEAD (reg:DI 139 [ a ])
        (nil)))
(insn 13 6 14 2 (set (reg/i:DI 10 a0)
        (ashift:DI (reg:DI 137)
            (const_int 1 [0x1]))) "j.c":11:1 297 {ashldi3}
     (expr_list:REG_DEAD (reg:DI 137)
        (nil)))  

That's the WTH moment.


Trying 2, 6 -> 13:
    2: r135:DI=r139:DI
      REG_DEAD r139:DI
    6: r137:DI=sign_extend(r135:DI#0<<0x1)
      REG_DEAD r135:DI
   13: a0:DI=r137:DI
      REG_DEAD r137:DI
Failed to match this instruction:
(set (reg/i:DI 10 a0)
    (ashift:DI (sign_extract:DI (reg:DI 139 [ a ])
            (const_int 31 [0x1f])
            (const_int 0 [0]))
        (const_int 1 [0x1])))
Successfully matched this instruction:
(set (reg:DI 137)
    (sign_extract:DI (reg:DI 139 [ a ])
        (const_int 31 [0x1f])
        (const_int 0 [0])))
Successfully matched this instruction:
(set (reg/i:DI 10 a0)
    (ashift:DI (reg:DI 137)
        (const_int 1 [0x1])))
allowing combination of insns 2, 6 and 13
original costs 4 + 8 + 4 = 16
replacement costs 8 + 4 = 12
deferring deletion of insn with uid = 2.
modifying insn i2     6: {r137:DI=sign_extract(r139:DI,0x1f,0);clobber scratch;}
      REG_DEAD r139:DI
deferring rescan insn with uid = 6.
modifying insn i3    13: a0:DI=r137:DI<<0x1
      REG_DEAD r137:DI
deferring rescan insn with uid = 13.


The multiply-by-3 case won't trigger that split.  There's a costing problem in
there I suspect: the dump above prices the two-insn replacement at 12 against
16 for the original sequence.  It's also a really odd "simplification",
presumably from generic parts of combine.  It's unclear at this time if we want
to support that kind of extract+shift pattern as another form of sllw when the
constants are convenient.
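
For reference, the two forms do compute the same value. Below is a minimal C
sketch of that equivalence, not anything taken from GCC. The helper names
(sign_extend_low, sllw_form, extract_shift_form, check_equivalence) are made up
for illustration, and the shift count of 1 and field width of 31 are simply
read off the RTL above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits of `value`
   to 64 bits.  Valid for 1 <= bits <= 63. */
static int64_t sign_extend_low(uint64_t value, unsigned bits)
{
    uint64_t mask = (UINT64_C(1) << bits) - 1;
    uint64_t sign = UINT64_C(1) << (bits - 1);
    value &= mask;
    return (int64_t)(value ^ sign) - (int64_t)sign;
}

/* Models (sign_extend:DI (ashift:SI (subreg:SI a) 1)), i.e. sllw by 1. */
static int64_t sllw_form(int64_t a)
{
    return sign_extend_low((uint64_t)a << 1, 32);
}

/* Models (ashift:DI (sign_extract:DI a 31 0) 1), the pair combine emitted. */
static int64_t extract_shift_form(int64_t a)
{
    int64_t field = sign_extend_low((uint64_t)a, 31);
    return field * 2;  /* field fits in 31 bits, so this cannot overflow */
}

/* Spot-check a few boundary values; the upper 32 bits of the input
   must not matter to either form. */
static void check_equivalence(void)
{
    static const int64_t samples[] = {
        0, 1, -1, 0x3fffffff, 0x40000000, 0x7fffffff,
        -0x40000000LL, -0x80000000LL, 0x123456789abcdef0LL
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(sllw_form(samples[i]) == extract_shift_form(samples[i]));
}
```

Calling check_equivalence() should pass silently. The sketch only says the
transformation is value-preserving. It says nothing about whether two insns
should ever be costed as cheaper than the single sllw.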
