https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61948

The lshrdi3_neon,ashrdi3_neon,ashldi3_neon patterns can call
gen_arm_<shift>di3_1bit without checking that the register allocation
constraints of the resulting insn are satisfied. This results in an
ICE:

$ arm-unknown-linux-gnueabihf-gcc -O2 -mfpu=neon -mthumb -c t.c
t.c: In function ‘f’:
t.c:8:1: error: insn does not satisfy its constraints:
 }
 ^
(insn 18 7 14 2 (parallel [
            (set (reg/i:DI 0 r0)
                (ashiftrt:DI (reg/v:DI 1 r1 [orig:110 t ] [110])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 100 cc))
        ]) t.c:8 131 {arm_ashrdi3_1bit}
     (expr_list:REG_DEAD (reg:SI 2 r2)
        (expr_list:REG_UNUSED (reg:CC 100 cc)
            (nil))))
t.c:8:1: internal compiler error: in copyprop_hardreg_forward_1, at
regcprop.c:775

Here, the destination registers are [r0,r1], partially overlapping
with the source registers are [r1,r2].

The arm_ashrdi3_1bit pattern has constraints:
  [(set (match_operand:DI              0 "s_register_operand" "=r,&r")
        (ashiftrt:DI (match_operand:DI 1 "s_register_operand" "0,r")

This permits the input and output registers to overlap fully, or to
have no overlap.

The invalid insn with partial overlap is caused when ashrdi3_neon is split:
        if (CONST_INT_P (operands[2]) && INTVAL (operands[2]) == 1)
          /* This clobbers CC.  */
          emit_insn (gen_arm_ashldi3_1bit (operands[0], operands[1]));
        else ...

The proposed fix is to add an additional check for the partial overlap
of the input and output operands, and to fall back to the general
shift by constant pattern if they do overlap.

It would also be possible to relax the constraints on the
arm_<shift>di3_1bit patterns and allow them to emit an extra move
where needed to support overlapped input/output registers. This would
give slightly better code in the overlapped case here, but would be
more complex to validate to ensure that it didn't regress code size in
non-Neon builds (as the relaxed constraints would enable a longer code
sequence to be used), and would be more complex to validate for
correctness as it would need to be tested both for both little/big
endian.

Tested using qemu with make check for arm-unknown-linux-gnueabihf
--with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16
--with-float=hard --with-mode=thumb. No regressions


gcc/ChangeLog:
2014-07-29  Charles Baylis  <charles.bay...@linaro.org>

    PR target/61948
    * config/arm/neon.md (ashldi3_neon): Don't emit arm_ashldi3_1bit unless
    constraints are satisfied.
    (<shift>di3_neon): Likewise.


gcc/testsuite/ChangeLog:
2014-07-29  Charles Baylis  <charles.bay...@linaro.org>

    PR target/61948
    * gcc.target/arm/pr61948.c: New test case.

Attachment: 0001-Check-that-DImode-_1bit-shifts-obey-constraints-in-N.patch
Description: application/download

Attachment: 0002-testcase-for-PR61948.patch
Description: application/download

Reply via email to