> -----Original Message-----
> From: Przemyslaw Wirkus <przemyslaw.wir...@arm.com>
> Sent: 13 October 2020 10:56
> To: gcc-patches@gcc.gnu.org
> Cc: ni...@redhat.com; Richard Earnshaw <richard.earns...@arm.com>;
> Kyrylo Tkachov <kyrylo.tkac...@arm.com>; Ramana Radhakrishnan
> <ramana.radhakrish...@arm.com>
> Subject: [PATCH][GCC-10 backport] arm: Fix fp16 move patterns for base
> MVE
> 
> Backport of commit 6abd428605e3a279e533fde1cecbc9735ce03b66
> from master branch.
> 
> OK for gcc-10 ?

Ok.
Thanks,
Kyrill

> 
> This patch fixes ICEs in gcc.dg/torture/float16-basic.c for
> -march=armv8.1-m.main+mve -mfloat-abi=hard.  The problem was
> that an fp16 argument was (rightly) being passed in FPRs,
> but the fp16 move patterns only handled GPRs.  LRA then cycled
> trying to look for a way of handling the FPR.
> 
> It looks like there are three related problems here:
> 
> (1) We're using the wrong fp16 move pattern for base MVE.
>     *mov<mode>_vfp_<mode>16 (the pattern we use for +mve.fp)
>     works for base MVE too.
> 
> (2) The fp16 MVE load and store patterns are separate from the
>     main move patterns.  The loads and stores should instead be
>     alternatives of the main move patterns, so that LRA knows
>     what to do with pseudo registers that become stack slots.
> 
> (3) The range restrictions for the loads and stores were wrong
>     for fp16: we were enforcing a multiple of 4 in [-255*4, 255*4]
>     instead of a multiple of 2 in [-255*2, 255*2].
> 
> (2) came from a patch to prevent writeback being used for MVE.
> That patch also added a Uj constraint to enforce the correct
> memory types for MVE.  I think the simplest fix is therefore to merge
> the loads and stores back into the main pattern and extend the Uj
> constraint so that it acts like Um for non-MVE.
> 
> The testcase for that patch was mve-vldstr16-no-writeback.c, whose
> main function is:
> 
> void
> fn1 (__fp16 *pSrc)
> {
>   __fp16 high;
>   __fp16 *pDst = 0;
>   unsigned i;
>   for (i = 0;; i++)
>     if (pSrc[i])
>       pDst[i] = high;
> }
> 
> Fixing (2) causes the store part to fail, not because we're using
> writeback, but because we decide to use GPRs to store high (which is
> uninitialised, and so gets replaced with zero).  This patch therefore
> adds some scan-assembler-nots instead.  (I wondered about changing the
> testcase to initialise high, but that seemed like a bad idea for
> a regression test.)
> 
> For (3): MVE seems to be the only thing to use
> arm_coproc_mem_operand_wb
> (and its various interfaces) for 16-bit scalars: the Neon patterns only
> use it for 32-bit scalars.
> 
> I've added new tests to try the various FPR alternatives of the
> move patterns.  The range of offsets that GCC uses for FPR loads
> and stores is the intersection of the range allowed for GPRs and
> FPRs, so the tests include GPR<->memory tests as well.
> 
> The fp32 and fp64 tests already pass, they're just there for
> completeness.
> 
> gcc/
>       * config/arm/arm-protos.h
> (arm_mve_mode_and_operands_type_check):
>       Delete.
>       * config/arm/arm.c (arm_coproc_mem_operand_wb): Use a scale
> factor
>       of 2 rather than 4 for 16-bit modes.
>       (arm_mve_mode_and_operands_type_check): Delete.
>       * config/arm/constraints.md (Uj): Allow writeback for Neon,
>       but continue to disallow it for MVE.
>       * config/arm/arm.md (*arm32_mov<HFBF:mode>):
> Add !TARGET_HAVE_MVE.
>       * config/arm/vfp.md (*mov_load_vfp_hf16, *mov_store_vfp_hf16):
> Fold
>       back into...
>       (*mov<mode>_vfp_<mode>16): ...here but use Uj for the FPR
> memory
>       constraints.  Use for base MVE too.
> 
> gcc/testsuite/
>       * gcc.target/arm/mve/intrinsics/mve-vldstr16-no-writeback.c: Allow
>       the store to use GPRs instead of FPRs.  Add scan-assembler-nots
>       for writeback.
>       * gcc.target/arm/armv8_1m-fp16-move-1.c: New test.
>       * gcc.target/arm/armv8_1m-fp32-move-1.c: Likewise.
>       * gcc.target/arm/armv8_1m-fp64-move-1.c: Likewise.

Reply via email to