On 8/31/2021 7:24 AM, Roger Sayle wrote:
Hi Christophe,
I'm testing the attached patch, but without an aarch64, it'll take a while to 
figure
out the toolchain to reproduce the failure.  Neither of the platforms I tested 
were
affected, but I can see it's unsafe to reuse the subreg_promoted_reg idiom from
just a few lines earlier.  Any help testing the attached patch on an affected 
target
would be much appreciated.

Sorry for the inconvenience.
Roger
--

-----Original Message-----
From: Christophe LYON <christophe.l...@foss.st.com>
Sent: 31 August 2021 13:32
To: Roger Sayle <ro...@nextmovesoftware.com>; 'GCC Patches' 
<gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI 
(reg:SI)))


On 29/08/2021 09:46, Roger Sayle wrote:
SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial
subreg is correctly zero-extended or sign-extended in the parent
register.  For example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0)
indicates that the byte x is zero extended in reg:SI 23, which is useful for 
optimization.
An example is that zero extending the above QImode value to HImode can
simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0).

This patch addresses the oversight/missed optimization opportunity
that the new HImode subreg above should retain its
SUBREG_PROMOTED_VAR_P annotation as its value is guaranteed to be
correctly extended in the SImode parent.  The code below to preserve
SUBREG_PROMOTED_VAR_P is already present in the middle-end (e.g.
simplify-rtx.c:7232-7242) but missing from one or two (precisely three) places 
that (accidentally) strip it.

Whilst there I also added another optimization.  If we need to extend
the above QImode value beyond the SImode register holding it, say to
DImode, we can eliminate the SUBREG and simply extend from the SImode
register to DImode.

This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures, and on a cross-compiler to
nvptx-none, where the function "long foo(char x) { return x; }" now
requires one less instruction.

OK for mainline?

Hi,

This patch causes an ICE when building an aarch64 toolchain:

during RTL pass: expand
In file included from
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/soft-fp.h:318,
                   from
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/floatditf.c:32:
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/floatditf.c:
In function '__floatditf':
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/op-2.h:249:37:
internal compiler error: in subreg_promoted_mode, at rtl.h:3132
    249 |       _FP_PACK_RAW_2_flo.bits.exp   = X##_e;    \
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/quad.h:229:33:
note: in expansion of macro '_FP_PACK_RAW_2'
    229 | # define FP_PACK_RAW_Q(val, X)  _FP_PACK_RAW_2 (Q, (val), X)
        |                                 ^~~~~~~~~~~~~~
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/floatditf.c:42:3:
note: in expansion of macro 'FP_PACK_RAW_Q'
     42 |   FP_PACK_RAW_Q (a, A);
        |   ^~~~~~~~~~~~~
0xa0b53a subreg_promoted_mode

/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/rtl.h:3132
0xa0b53a convert_modes(machine_mode, machine_mode, rtx_def*, int)
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:699
0xa003bc expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:9091
0xa0765c expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10497
0x9fcef6 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:9798
0xa0765c expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10497
0xa1099e expand_expr
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.h:301
0xa1099e expand_operands(tree_node*, tree_node*, rtx_def*, rtx_def**, 
rtx_def**, expand_modifier)
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:8308
0x9fcdff expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10288


Can you check?


Thanks,

Christophe


2021-08-29  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        * expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when
        creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
        subreg.
        * simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]:
        Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider)
        partial subreg from a SUBREG_PROMOTED_VAR_P subreg.  Generate
        SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical.
        [ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when
        creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
        subreg.  Generate ZERO_EXTEND of the SUBREG_REG when a subreg
        would be paradoxical.

Roger
--


patch.txt

diff --git a/gcc/expr.c b/gcc/expr.c
index 5dd98a9..17f2c2f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
This seems to be fixing the failures spotted by my tester.  iq2000, fr30, cris, epiphany so far.  I suspect it'll fix the others as well.

jeff

Reply via email to