On 10/14/23 16:14, Roger Sayle wrote:
This patch is my proposed solution to PR rtl-optimization/91865.
Normally RTX simplification canonicalizes a ZERO_EXTEND of a ZERO_EXTEND
to a single ZERO_EXTEND, but as shown in this PR it is possible for
combine's make_compound_operation to unintentionally generate a
non-canonical ZERO_EXTEND of a ZERO_EXTEND, which is unlikely to be
matched by the backend.
For the new test case:
const int table[2] = {1, 2};
int foo (char i) { return table[i]; }
compiling with -O2 -mlarge on msp430 we currently see:
Trying 2 -> 7:
2: r25:HI=zero_extend(R12:QI)
REG_DEAD R12:QI
7: r28:PSI=sign_extend(r25:HI)#0
REG_DEAD r25:HI
Failed to match this instruction:
(set (reg:PSI 28 [ iD.1772 ])
(zero_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ iD.1772 ]))))
which results in the following code:
foo: AND #0xff, R12
RLAM.A #4, R12 { RRAM.A #4, R12
RLAM.A #1, R12
MOVX.W table(R12), R12
RETA
With this patch, we now see:
Trying 2 -> 7:
2: r25:HI=zero_extend(R12:QI)
REG_DEAD R12:QI
7: r28:PSI=sign_extend(r25:HI)#0
REG_DEAD r25:HI
Successfully matched this instruction:
(set (reg:PSI 28 [ iD.1772 ])
(zero_extend:PSI (reg:QI 12 R12 [ iD.1772 ])))
allowing combination of insns 2 and 7
original costs 4 + 8 = 12
replacement cost 8
foo: MOV.B R12, R12
RLAM.A #1, R12
MOVX.W table(R12), R12
RETA
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures. Ok for mainline?
2023-10-14 Roger Sayle <ro...@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/91865
* combine.cc (make_compound_operation): Avoid creating a
ZERO_EXTEND of a ZERO_EXTEND.
Final question. Is there a reasonable expectation that we could get a
similar situation with sign extensions? If so we probably ought to try
and handle both.
OK with the obvious change to handle nested sign extensions if you think
it's useful to do so. And OK as-is if you don't think handling nested
sign extensions is useful.
jeff