Hi all,

I can see why copying from one pseudo-register to another would not be a reason *not* to decompose a register, but I don't understand why this is a reason to say it *should* be decomposed.

This is causing me trouble, and I can't tell how to fix it without figuring out why it is this way in the first place.

My testcase is from pr43137. This was an ARM missed-optimization bug that was fixed some time ago, but has recurred (in my tree) because I'm trying to implement SI->DImode extend into 64-bit NEON registers.

Here are the problem insns:

(insn 7 6 8 2 (set (reg:DI 137)
        (sign_extend:DI (reg/v:SI 134 [ resultD.4946 ]))) pr43137.c:8 158 {extendsidi2}
     (nil))

(insn 8 7 12 2 (set (reg:DI 136 [ <retval> ])
        (reg:DI 137)) pr43137.c:8 641 {*movdi_vfp}
     (nil))

(insn 12 8 15 2 (set (reg/i:DI 0 r0)
        (reg:DI 136 [ <retval> ])) pr43137.c:9 641 {*movdi_vfp}
     (nil))

Lower-subreg thinks it should decompose pseudo 136 because there is a pseudo-to-pseudo copy (137->136), even though there is no use of subregs here.

The decomposition ends up preventing the register allocator from assigning r0 to pseudo 137, so an unnecessary move is emitted and pr43137 regresses.


So, why do we have this code?

[lower-subreg.c, find_decomposable_subregs]

            case SIMPLE_PSEUDO_REG_MOVE:
              if (MODES_TIEABLE_P (GET_MODE (x), word_mode))
                bitmap_set_bit (decomposable_context, regno);
              break;

If I remove these lines my problems go away.
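
To make the effect concrete, here is a toy standalone model of that marking step. It is not GCC code. Every name in it is hypothetical except SIMPLE_PSEUDO_REG_MOVE, which mirrors the case label quoted above. It assumes that registers numbered below 32 are hard registers, and that any use of a pseudo outside a plain register copy vetoes decomposition (which would explain why 137 is left alone while 136 is not). The MODES_TIEABLE_P check is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only, not GCC code.  Assumption: regnos below FIRST_PSEUDO
   are hard registers, everything else is a pseudo.  */
enum { FIRST_PSEUDO = 32, MAX_REGNO = 256 };

enum move_kind { NOT_SIMPLE_MOVE, SIMPLE_MOVE, SIMPLE_PSEUDO_REG_MOVE };

struct toy_insn {
  int dest;         /* destination register number */
  int src;          /* source register number */
  bool plain_copy;  /* true for a bare (set (reg) (reg)) */
};

static bool is_pseudo(int regno) { return regno >= FIRST_PSEUDO; }

static enum move_kind classify(const struct toy_insn *insn)
{
  if (!insn->plain_copy)
    return NOT_SIMPLE_MOVE;
  if (is_pseudo(insn->dest) && is_pseudo(insn->src))
    return SIMPLE_PSEUDO_REG_MOVE;
  return SIMPLE_MOVE;
}

/* Fill decompose[0..MAX_REGNO-1].  Passing pseudo_move_rule = false
   models deleting the SIMPLE_PSEUDO_REG_MOVE case quoted above.  */
static void toy_find_decomposable(const struct toy_insn *insns, int n,
                                  bool pseudo_move_rule, bool *decompose)
{
  bool wanted[MAX_REGNO] = { false };
  bool vetoed[MAX_REGNO] = { false };

  for (int i = 0; i < n; i++) {
    enum move_kind kind = classify(&insns[i]);
    int regs[2] = { insns[i].dest, insns[i].src };

    for (int k = 0; k < 2; k++) {
      int r = regs[k];
      assert(r >= 0 && r < MAX_REGNO);
      if (!is_pseudo(r))
        continue;
      if (kind == NOT_SIMPLE_MOVE)
        vetoed[r] = true;   /* assumed veto for non-copy uses */
      else if (kind == SIMPLE_PSEUDO_REG_MOVE && pseudo_move_rule)
        wanted[r] = true;   /* the rule in question */
    }
  }

  for (int r = 0; r < MAX_REGNO; r++)
    decompose[r] = wanted[r] && !vetoed[r];
}

/* The three insns from the dump above, reduced to the toy form.  */
static const struct toy_insn pr43137_insns[] = {
  { 137, 134, false },  /* insn 7: sign_extend, not a plain copy */
  { 136, 137, true  },  /* insn 8: pseudo-to-pseudo copy */
  {   0, 136, true  },  /* insn 12: copy into hard register r0 */
};
```

Running toy_find_decomposable over pr43137_insns with the rule enabled selects pseudo 136 only, on the strength of insn 8 alone. With the rule disabled nothing is selected, which is consistent with what I see when the lines are removed.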

Any clues would be appreciated.

Thanks

Andrew
