https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123010
--- Comment #8 from Jeffrey A. Law <law at gcc dot gnu.org> ---
What hasn't been explained is why 2^n is special here.
For both the multiply-by-2 and multiply-by-3 cases we get the expected code out
of gimple, as can be seen in the .optimized dump:
_2 = a_1(D) * 2;
return _2;
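For anyone following along without the test case, here is a minimal sketch of
the kind of source that produces that gimple. The real j.c isn't shown in this
comment, so the function names and the use of plain int are assumptions on my
part.

```c
#include <assert.h>

/* Hypothetical stand-in for j.c; the real test case is not shown here. */
int mul2(int a) { return a * 2; }   /* the 2^n case under discussion */
int mul3(int a) { return a * 3; }   /* the comparison case */

/* Sanity check on small inputs only, so no overflow is involved. */
void check_mul(void)
{
    assert(mul2(21) == 42);
    assert(mul3(7) == 21);
}
```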
The initial expansion looks pretty much like you'd expect:
(insn 6 3 7 2 (set (reg:DI 137)
(sign_extend:DI (ashift:SI (subreg/s/u:SI (reg/v:DI 135 [ a ]) 0)
(const_int 1 [0x1])))) "j.c":10:14 -1
(nil))
(insn 7 6 8 2 (set (reg:SI 136 [ _2 ])
(subreg/s/u:SI (reg:DI 137) 0)) "j.c":10:14 -1
(expr_list:REG_EQUAL (ashift:SI (subreg/s/u:SI (reg/v:DI 135 [ a ]) 0)
(const_int 1 [0x1]))
(nil)))
(insn 8 7 9 2 (set (reg:DI 138)
(sign_extend:DI (reg:SI 136 [ _2 ]))) "j.c":10:14 discrim 1 -1
(nil))
(insn 9 8 13 2 (set (reg:DI 134 [ <retval> ])
(reg:DI 138)) "j.c":10:14 discrim 1 -1
(nil))
Shift the object in DI, extract the low 32 bits. It's expansion, so there's a
lot of redundancy in there. Eventually it gets cleaned up in the expected ways,
resulting in:
(insn 6 3 13 2 (set (reg:DI 137)
(sign_extend:DI (ashift:SI (subreg/s/u:SI (reg/v:DI 135 [ a ]) 0)
(const_int 1 [0x1])))) "j.c":10:14 312 {ashlsi3_extend}
(expr_list:REG_DEAD (reg/v:DI 135 [ a ])
(nil)))
Where (reg 137) gets copied into the return register. So far, so good.
Then in combine we generate:
(insn 6 3 13 2 (parallel [
(set (reg:DI 137)
(sign_extract:DI (reg:DI 139 [ a ])
(const_int 31 [0x1f])
(const_int 0 [0])))
(clobber (scratch:DI))
]) "j.c":10:14 333 {*extractdi3}
(expr_list:REG_DEAD (reg:DI 139 [ a ])
(nil)))
(insn 13 6 14 2 (set (reg/i:DI 10 a0)
(ashift:DI (reg:DI 137)
(const_int 1 [0x1]))) "j.c":11:1 297 {ashldi3}
(expr_list:REG_DEAD (reg:DI 137)
(nil)))
That's the WTH moment.
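Odd as it looks, the two-insn form does compute the same value. A rough C model
of both sequences, as a sketch only: the function names are made up for
illustration, and it relies on GCC's behavior for narrowing conversions to
signed types and for right shifts of negative values, both of which are
implementation-defined in ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Before combine: sign_extend:DI (ashift:SI (subreg:SI a) 1). */
int64_t before_combine(uint64_t a)
{
    /* Shift in 32 bits, then sign-extend the 32-bit result to 64 bits. */
    return (int64_t)(int32_t)((uint32_t)a << 1);
}

/* After combine: sign_extract:DI of 31 bits at position 0, then ashift:DI by 1. */
int64_t after_combine(uint64_t a)
{
    /* Keep the low 31 bits and sign-extend from bit 30 (arithmetic shift). */
    int64_t field = (int64_t)(a << 33) >> 33;
    return (int64_t)((uint64_t)field << 1);
}

/* Compare the two models on a few boundary values. */
void check_equivalence(void)
{
    const uint64_t samples[] = {
        0, 1, UINT64_C(0x3fffffff), UINT64_C(0x40000000),
        UINT64_C(0x7fffffff), UINT64_C(0x80000000),
        UINT64_C(0xffffffff), UINT64_C(0xffffffff12345678),
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(before_combine(samples[i]) == after_combine(samples[i]));
}
```

Bit 30 of the input ends up as bit 31 of the result in both forms, which is why
the sign extension agrees.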
Trying 2, 6 -> 13:
2: r135:DI=r139:DI
REG_DEAD r139:DI
6: r137:DI=sign_extend(r135:DI#0<<0x1)
REG_DEAD r135:DI
13: a0:DI=r137:DI
REG_DEAD r137:DI
Failed to match this instruction:
(set (reg/i:DI 10 a0)
(ashift:DI (sign_extract:DI (reg:DI 139 [ a ])
(const_int 31 [0x1f])
(const_int 0 [0]))
(const_int 1 [0x1])))
Successfully matched this instruction:
(set (reg:DI 137)
(sign_extract:DI (reg:DI 139 [ a ])
(const_int 31 [0x1f])
(const_int 0 [0])))
Successfully matched this instruction:
(set (reg/i:DI 10 a0)
(ashift:DI (reg:DI 137)
(const_int 1 [0x1])))
allowing combination of insns 2, 6 and 13
original costs 4 + 8 + 4 = 16
replacement costs 8 + 4 = 12
deferring deletion of insn with uid = 2.
modifying insn i2 6: {r137:DI=sign_extract(r139:DI,0x1f,0);clobber scratch;}
REG_DEAD r139:DI
deferring rescan insn with uid = 6.
modifying insn i3 13: a0:DI=r137:DI<<0x1
REG_DEAD r137:DI
deferring rescan insn with uid = 13.
The multiply-by-3 case won't trigger that split. I suspect there's a costing
problem in there. It's also a really odd "simplification", presumably from
generic parts of combine. It's unclear at this time if we want to support that
kind of extract+shift pattern as another form of sllw when the constants are
convenient.
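One reading of "convenient" here, and this is my assumption rather than
anything the dumps state: the extract starts at position 0 and its width plus
the shift count equals 32. Under that assumption the extract+shift pair matches
a 32-bit shift followed by sign extension for every shift count from 1 to 31.
A sketch in C, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a 32-bit left shift whose result is sign-extended to 64 bits.
   The conversion to int32_t wraps on GCC (implementation-defined in ISO C). */
int64_t sllw_model(uint64_t a, unsigned n)
{
    return (int64_t)(int32_t)((uint32_t)a << n);
}

/* Model of sign_extract (width 32-n, position 0) followed by a 64-bit
   left shift by n.  Valid for 1 <= n <= 31. */
int64_t extract_shift_model(uint64_t a, unsigned n)
{
    unsigned width = 32 - n;
    uint64_t field = a & ((UINT64_C(1) << width) - 1);
    uint64_t sign = UINT64_C(1) << (width - 1);
    uint64_t extended = (field ^ sign) - sign;   /* sign-extend the field */
    return (int64_t)(extended << n);
}

/* Check that both models agree for every shift count on some samples. */
void check_all_shift_counts(void)
{
    const uint64_t samples[] = {
        0, 1, UINT64_C(0x00010000), UINT64_C(0x7fffffff),
        UINT64_C(0x80000000), UINT64_C(0xdeadbeef),
        UINT64_C(0xffffffffffffffff), UINT64_C(0x123456789abcdef0),
    };
    for (unsigned n = 1; n <= 31; n++)
        for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
            assert(sllw_model(samples[i], n)
                   == extract_shift_model(samples[i], n));
}
```

That only shows the two forms are equivalent, not that recognizing the second
one is worthwhile.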