On 10/14/23 16:14, Roger Sayle wrote:

This patch is my proposed solution to PR rtl-optimization/91865.
Normally RTX simplification canonicalizes a ZERO_EXTEND of a ZERO_EXTEND
to a single ZERO_EXTEND, but as shown in this PR it is possible for
combine's make_compound_operation to unintentionally generate a
non-canonical ZERO_EXTEND of a ZERO_EXTEND, which is unlikely to be
matched by the backend.

For the new test case:

const int table[2] = {1, 2};
int foo (char i) { return table[i]; }

compiling with -O2 -mlarge on msp430 we currently see:

Trying 2 -> 7:
     2: r25:HI=zero_extend(R12:QI)
       REG_DEAD R12:QI
     7: r28:PSI=sign_extend(r25:HI)#0
       REG_DEAD r25:HI
Failed to match this instruction:
(set (reg:PSI 28 [ iD.1772 ])
     (zero_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ iD.1772 ]))))

which results in the following code:

foo:    AND     #0xff, R12
         RLAM.A #4, R12 { RRAM.A #4, R12
         RLAM.A  #1, R12
         MOVX.W  table(R12), R12
         RETA

With this patch, we now see:

Trying 2 -> 7:
     2: r25:HI=zero_extend(R12:QI)
       REG_DEAD R12:QI
     7: r28:PSI=sign_extend(r25:HI)#0
       REG_DEAD r25:HI
Successfully matched this instruction:
(set (reg:PSI 28 [ iD.1772 ])
     (zero_extend:PSI (reg:QI 12 R12 [ iD.1772 ])))
allowing combination of insns 2 and 7
original costs 4 + 8 = 12
replacement cost 8

foo:    MOV.B   R12, R12
         RLAM.A  #1, R12
         MOVX.W  table(R12), R12
         RETA


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?

2023-10-14  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
         PR rtl-optimization/91865
         * combine.cc (make_compound_operation): Avoid creating a
         ZERO_EXTEND of a ZERO_EXTEND.
Final question. Is there a reasonable expectation that we could get a similar situation with sign extensions? If so we probably ought to try and handle both.

OK with the obvious change to handle nested sign extensions if you think it's useful to do so. And OK as-is if you don't think handling nested sign extensions is useful.

jeff

Reply via email to