Hi all,
I can see why copying from one pseudo-register to another would not be a
reason *not* to decompose a register, but I don't understand why this is
a reason to say it *should* be decomposed.
This is causing me trouble, and I can't tell how to fix it without
figuring out why it is this way in the first place.
My testcase is from pr43137. This was an ARM missed-optimization bug
that was fixed some time ago, but has recurred (in my tree) because I'm
trying to implement SI->DImode extend into 64-bit NEON registers.
Here are the problem insns:
  (insn 7 6 8 2 (set (reg:DI 137)
          (sign_extend:DI (reg/v:SI 134 [ resultD.4946 ]))) pr43137.c:8
       158 {extendsidi2}
       (nil))
  (insn 8 7 12 2 (set (reg:DI 136 [ <retval> ])
          (reg:DI 137)) pr43137.c:8 641 {*movdi_vfp}
       (nil))
  (insn 12 8 15 2 (set (reg/i:DI 0 r0)
          (reg:DI 136 [ <retval> ])) pr43137.c:9 641 {*movdi_vfp}
       (nil))
Lower-subreg thinks it should decompose pseudo 136 because there is a
pseudo-to-pseudo copy (137->136), even though there is no use of subregs
here.
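
To make that concrete, here is a rough standalone sketch of how I understand
the marking to play out on these three insns. It is a toy model, not GCC code:
the toy_* names, the array-based "bitmaps", FIRST_PSEUDO and WORD_BYTES are all
made up for illustration, and I am assuming that insn 7 is classified as
NOT_SIMPLE_MOVE, that DImode is tieable with word_mode here, and that the final
set is "marked decomposable and not marked non-decomposable".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: the enum mirrors lower-subreg.c, everything else is
   hypothetical scaffolding for illustration.  */
enum classify_move_insn { NOT_SIMPLE_MOVE, SIMPLE_PSEUDO_REG_MOVE, SIMPLE_MOVE };

enum { MAX_REGNO = 256, FIRST_PSEUDO = 32, WORD_BYTES = 4 };

struct toy_context
{
  bool decomposable[MAX_REGNO];      /* stands in for decomposable_context */
  bool non_decomposable[MAX_REGNO];  /* stands in for non_decomposable_context */
};

/* Record one plain register reference of SIZE bytes seen in an insn
   that was classified as CMI.  */
static void
toy_note_ref (struct toy_context *ctx, unsigned regno, unsigned size,
              enum classify_move_insn cmi)
{
  assert (regno < MAX_REGNO);

  /* Only multi-word pseudos are candidates for decomposition.  */
  if (regno < FIRST_PSEUDO || size <= WORD_BYTES)
    return;

  switch (cmi)
    {
    case NOT_SIMPLE_MOVE:
      ctx->non_decomposable[regno] = true;
      break;
    case SIMPLE_PSEUDO_REG_MOVE:
      /* Assumption: the mode is tieable with word_mode.  */
      ctx->decomposable[regno] = true;
      break;
    case SIMPLE_MOVE:
      /* Says nothing either way about the register.  */
      break;
    }
}

/* Replay insns 7, 8 and 12, then report whether REGNO ends up decomposed.  */
static bool
toy_would_decompose (unsigned regno)
{
  struct toy_context ctx = { { false }, { false } };

  assert (regno < MAX_REGNO);

  /* insn 7: (set (reg:DI 137) (sign_extend:DI (reg:SI 134)))  */
  toy_note_ref (&ctx, 137, 8, NOT_SIMPLE_MOVE);
  toy_note_ref (&ctx, 134, 4, NOT_SIMPLE_MOVE);

  /* insn 8: (set (reg:DI 136) (reg:DI 137)), pseudo-to-pseudo copy  */
  toy_note_ref (&ctx, 136, 8, SIMPLE_PSEUDO_REG_MOVE);
  toy_note_ref (&ctx, 137, 8, SIMPLE_PSEUDO_REG_MOVE);

  /* insn 12: (set (reg:DI 0 r0) (reg:DI 136)), copy to a hard register  */
  toy_note_ref (&ctx, 0, 8, SIMPLE_MOVE);
  toy_note_ref (&ctx, 136, 8, SIMPLE_MOVE);

  return ctx.decomposable[regno] && !ctx.non_decomposable[regno];
}
```

In this model pseudo 137 is vetoed by the sign_extend in insn 7, but nothing
vetoes pseudo 136, so the copy in insn 8 alone is enough to get 136 decomposed.
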
The decomposition ends up preventing register allocation from allocating
r0 to pseudo 137, so an unnecessary move is emitted and pr43137
regresses.
So, why do we have this code?
[lower-subreg.c, find_decomposable_subregs]
    case SIMPLE_PSEUDO_REG_MOVE:
      if (MODES_TIEABLE_P (GET_MODE (x), word_mode))
        bitmap_set_bit (decomposable_context, regno);
      break;
If I remove these lines my problems go away.
Any clues would be appreciated.
Thanks
Andrew