On 29/07/14 20:53, Charles Baylis wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61948 > > The lshrdi3_neon,ashrdi3_neon,ashldi3_neon patterns can call > gen_arm_<shift>di3_1bit without checking that the register allocation > constraints of the resulting insn are satisfied. This results in an > ICE: > > $ arm-unknown-linux-gnueabihf-gcc -O2 -mfpu=neon -mthumb -c t.c > t.c: In function âfâ: > t.c:8:1: error: insn does not satisfy its constraints: > } > ^ > (insn 18 7 14 2 (parallel [ > (set (reg/i:DI 0 r0) > (ashiftrt:DI (reg/v:DI 1 r1 [orig:110 t ] [110]) > (const_int 1 [0x1]))) > (clobber (reg:CC 100 cc)) > ]) t.c:8 131 {arm_ashrdi3_1bit} > (expr_list:REG_DEAD (reg:SI 2 r2) > (expr_list:REG_UNUSED (reg:CC 100 cc) > (nil)))) > t.c:8:1: internal compiler error: in copyprop_hardreg_forward_1, at > regcprop.c:775 > > Here, the destination registers are [r0,r1], partially overlapping > with the source registers are [r1,r2]. > > The arm_ashrdi3_1bit pattern has constraints: > [(set (match_operand:DI 0 "s_register_operand" "=r,&r") > (ashiftrt:DI (match_operand:DI 1 "s_register_operand" "0,r") > > This permits the input and output registers to overlap fully, or to > have no overlap. > > The invalid insn with partial overlap is caused when ashrdi3_neon is split: > if (CONST_INT_P (operands[2]) && INTVAL (operands[2]) == 1) > /* This clobbers CC. */ > emit_insn (gen_arm_ashldi3_1bit (operands[0], operands[1])); > else ... > > The proposed fix is to add an additional check for the partial overlap > of the input and output operands, and to fall back to the general > shift by constant pattern if they do overlap. > > It would also be possible to relax the constraints on the > arm_<shift>di3_1bit patterns and allow them to emit an extra move > where needed to support overlapped input/output registers. This would > give slightly better code in the overlapped case here, but would be > more complex to validate to ensure that it didn't regress code size in > non-Neon builds (as the relaxed constraints would enable a longer code > sequence to be used), and would be more complex to validate for > correctness as it would need to be tested both for both little/big > endian. > > Tested using qemu with make check for arm-unknown-linux-gnueabihf > --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16 > --with-float=hard --with-mode=thumb. No regressions > > > gcc/ChangeLog: > 2014-07-29 Charles Baylis <charles.bay...@linaro.org> > > PR target/61948 > * config/arm/neon.md (ashldi3_neon): Don't emit arm_ashldi3_1bit unless > constraints are satisfied. > (<shift>di3_neon): Likewise. >
This is OK, but... > > gcc/testsuite/ChangeLog: > 2014-07-29 Charles Baylis <charles.bay...@linaro.org> > > PR target/61948 > * gcc.target/arm/pr61948.c: New test case. > I think this should be using something like: /* { dg-do compile } */ /* { dg-require-effective-target arm_neon_ok } */ /* { dg-require-effective-target arm_thumb2_ok } */ /* { dg-options "-O2 -mthumb" } */ /* { dg-add-options arm_neon } */ rather than explicitly adding -mfpu=neon. R.