On Tue, 14 Jan 2025 at 14:11, Christophe Lyon
<[email protected]> wrote:
>
> If the target does not support floating-point, we register FP vector
> types as 'void' (see register_vector_type).
>
> The leads to warnings about 'pure attribute on function returning
> void' when we declare the various load intrinsics because their
> call_properties say CP_READ_MEMORY (thus giving them the 'pure'
> attribute), but their return type is void.
>
> To avoid such warnings, declare floating-point scalar and vector types
> even if the target does not have an FPU.
>
> Note that since an FPU can be activated via #pragma GCC target
> ("arch=armv8.1-m.main+mve.fp") for instance, it means that such types
> must cannot appear and disappear withing a single TU, they have to be
> available in both contexts. This implies a noteworthy change for
> __fp16: it not longer depends on using -mfp16-format=ieee or
> alternative. Also note that if the target ISA has the fp16 bit set,
> we already silently activate -mfp16-format=ieee (with an error if
> -mfp16-format=alternative was supplied).
>
> In arm-mve-builtins.cc (register_builtin_types, register_vector_type,
> register_builtin_tuple_types), this means simply removing the early
> exits. However, for this to work, we need to update
> arm_vector_mode_supported_p, so that vector floating-point types are
> always defined, and __fp16 must always be registered by
> arm_init_fp16_builtins (as it is the base type for vectors of
> float16_t. Another side effect is that the declaration of float16_t
> and float32_t typedefs is now unconditional.
>
> The new tests verify that:
> - we emit an error if the code tries to use floating-point intrinsics
> and the target does not have the floating-point extension
> - we emit the expected code when activating the floating-point
> expected via a pragma
> - we emit the expected code when the target supports floating-point
> (no pragma needed)
> - we ignore -mfp16-format=none
>
> An update is needed in g++.target/arm/mve/general-c++/nomve_fp_1.c,
> because the error message now correctly uses float16x8_t instead of
> void as return type.
>
> gcc/ChangeLog:
>
> PR target/117814
> * config/arm/arm-builtins.cc (arm_init_fp16_builtins): Always
> register __fp16 type.
> * config/arm/arm-mve-builtins.cc (register_builtin_tuple_types):
> Remove special handling when TARGET_HAVE_MVE_FLOAT is false.
> (register_vector_type): Likewise.
> (register_builtin_tuple_types): Likewise.
> * config/arm/arm.cc (arm_vector_mode_supported_p): Accept
> floating-point vector modes even if TARGET_HAVE_MVE_FLOAT is
> false.
> * config/arm/arm_mve_types.h (float16_t, float32_t): Define
> unconditionally.
> * doc/extend.texi (Half-precision Floating-point): __fp16 is now
> always available on arm. More x86 paragraph closer to the rest of
> the x86 information.
>
> gcc/testsuite/ChangeLog:
>
> PR target/117814
> * gcc.target/arm/mve/intrinsics/pr117814-f16.c: New test.
> * gcc.target/arm/mve/intrinsics/pr117814-2-f16.c: New test.
> * gcc.target/arm/mve/intrinsics/pr117814-3-f16.c: New test.
> * gcc.target/arm/mve/intrinsics/pr117814-4-f16.c: New test.
> * gcc.target/arm/mve/intrinsics/pr117814-f32.c: New test.
> * gcc.target/arm/mve/intrinsics/pr117814-2-f32.c: New test.
> * gcc.target/arm/mve/intrinsics/pr117814-3-f32.c: New test.
> * g++.target/arm/mve/general-c++/nomve_fp_1.c: Fix expected error
> message.
CI complains because I forgot to remove fp16-compile-none-1.c (we no
longer emit an error when using __fp16 with -mfp16-format=none).
fp16-compile-none-2.c still passes because we do not enable HF mode in
arm_scalar_mode_supported_p.
That still seems a bit inconsistent...
Maybe we should handle arm_fp16_format a bit differently, such that we
can make a difference between no option and explicit
-mfp16-format=none?
Thanks,
Christophe
> ---
> gcc/config/arm/arm-builtins.cc | 4 +--
> gcc/config/arm/arm-mve-builtins.cc | 22 +-----------
> gcc/config/arm/arm.cc | 6 +---
> gcc/config/arm/arm_mve_types.h | 2 --
> gcc/doc/extend.texi | 29 +++++++++------
> .../arm/mve/general-c++/nomve_fp_1.c | 2 +-
> .../arm/mve/intrinsics/pr117814-2-f16.c | 36 +++++++++++++++++++
> .../arm/mve/intrinsics/pr117814-2-f32.c | 36 +++++++++++++++++++
> .../arm/mve/intrinsics/pr117814-3-f16.c | 21 +++++++++++
> .../arm/mve/intrinsics/pr117814-3-f32.c | 21 +++++++++++
> .../arm/mve/intrinsics/pr117814-4-f16.c | 21 +++++++++++
> .../arm/mve/intrinsics/pr117814-f16.c | 28 +++++++++++++++
> .../arm/mve/intrinsics/pr117814-f32.c | 28 +++++++++++++++
> 13 files changed, 213 insertions(+), 43 deletions(-)
> create mode 100644
> gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-2-f16.c
> create mode 100644
> gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-2-f32.c
> create mode 100644
> gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-3-f16.c
> create mode 100644
> gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-3-f32.c
> create mode 100644
> gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-4-f16.c
> create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-f16.c
> create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-f32.c
>
> diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc
> index e860607686c..8f0aae4cff1 100644
> --- a/gcc/config/arm/arm-builtins.cc
> +++ b/gcc/config/arm/arm-builtins.cc
> @@ -2443,9 +2443,7 @@ arm_init_fp16_builtins (void)
> arm_fp16_type_node = make_node (REAL_TYPE);
> TYPE_PRECISION (arm_fp16_type_node) = GET_MODE_PRECISION (HFmode);
> layout_type (arm_fp16_type_node);
> - if (arm_fp16_format)
> - (*lang_hooks.types.register_builtin_type) (arm_fp16_type_node,
> - "__fp16");
> + (*lang_hooks.types.register_builtin_type) (arm_fp16_type_node, "__fp16");
> }
>
> void
> diff --git a/gcc/config/arm/arm-mve-builtins.cc
> b/gcc/config/arm/arm-mve-builtins.cc
> index 42b53cc05e7..b37c91c541b 100644
> --- a/gcc/config/arm/arm-mve-builtins.cc
> +++ b/gcc/config/arm/arm-mve-builtins.cc
> @@ -410,8 +410,6 @@ register_builtin_types ()
> #include "arm-mve-builtins.def"
> for (unsigned int i = 0; i < NUM_VECTOR_TYPES; ++i)
> {
> - if (vector_types[i].requires_float && !TARGET_HAVE_MVE_FLOAT)
> - continue;
> tree eltype = scalar_types[i];
> tree vectype;
> if (eltype == boolean_type_node)
> @@ -433,18 +431,6 @@ register_builtin_types ()
> static void
> register_vector_type (vector_type_index type)
> {
> -
> - /* If the target does not have the mve.fp extension, but the type requires
> - it, then it needs to be assigned a non-dummy type so that functions
> - with those types in their signature can be registered. This allows for
> - diagnostics about the missing extension, rather than about a missing
> - function definition. */
> - if (vector_types[type].requires_float && !TARGET_HAVE_MVE_FLOAT)
> - {
> - acle_vector_types[0][type] = void_type_node;
> - return;
> - }
> -
> tree vectype = abi_vector_types[type];
> tree id = get_identifier (vector_types[type].acle_name);
> tree decl = build_decl (input_location, TYPE_DECL, id, vectype);
> @@ -470,13 +456,7 @@ register_builtin_tuple_types (vector_type_index type)
> {
> const vector_type_info* info = &vector_types[type];
>
> - /* If the target does not have the mve.fp extension, but the type requires
> - it, then it needs to be assigned a non-dummy type so that functions
> - with those types in their signature can be registered. This allows for
> - diagnostics about the missing extension, rather than about a missing
> - function definition. */
> - if (scalar_types[type] == boolean_type_node
> - || (info->requires_float && !TARGET_HAVE_MVE_FLOAT))
> + if (scalar_types[type] == boolean_type_node)
> {
> for (unsigned int num_vectors = 2; num_vectors <= 4; num_vectors += 2)
> acle_vector_types[num_vectors >> 1][type] = void_type_node;
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 5649986868b..2a0c6b2e1d8 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -29804,11 +29804,7 @@ arm_vector_mode_supported_p (machine_mode mode)
> return true;
>
> if (TARGET_HAVE_MVE
> - && (VALID_MVE_SI_MODE (mode) || VALID_MVE_PRED_MODE (mode)))
> - return true;
> -
> - if (TARGET_HAVE_MVE_FLOAT
> - && (mode == V2DFmode || mode == V4SFmode || mode == V8HFmode))
> + && (VALID_MVE_MODE (mode) || VALID_MVE_PRED_MODE (mode)))
> return true;
>
> return false;
> diff --git a/gcc/config/arm/arm_mve_types.h b/gcc/config/arm/arm_mve_types.h
> index 42e74666e80..d1889c68ac5 100644
> --- a/gcc/config/arm/arm_mve_types.h
> +++ b/gcc/config/arm/arm_mve_types.h
> @@ -26,10 +26,8 @@
> #ifndef _GCC_ARM_MVE_TYPES_H
> #define _GCC_ARM_MVE_TYPES_H
>
> -#if (__ARM_FEATURE_MVE & 2) /* MVE Floating point. */
> typedef __fp16 float16_t;
> typedef float float32_t;
> -#endif
>
> #pragma GCC arm "arm_mve_types.h"
>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1e1b4cc837d..2a542233c70 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -1170,17 +1170,18 @@ typedef _Complex float __attribute__((mode(IC)))
> _Complex_ibm128;
> @cindex @code{__fp16} data type
> @cindex @code{__Float16} data type
>
> -On ARM and AArch64 targets, GCC supports half-precision (16-bit) floating
> -point via the @code{__fp16} type defined in the ARM C Language Extensions.
> -On ARM systems, you must enable this type explicitly with the
> -@option{-mfp16-format} command-line option in order to use it.
> -On x86 targets with SSE2 enabled, GCC supports half-precision (16-bit)
> -floating point via the @code{_Float16} type. For C++, x86 provides a builtin
> -type named @code{_Float16} which contains same data format as C.
> -
> -ARM targets support two incompatible representations for half-precision
> -floating-point values. You must choose one of the representations and
> -use it consistently in your program.
> +On ARM and AArch64 targets, GCC supports half-precision (16-bit)
> +floating point via the @code{__fp16} type defined in the ARM C
> +Language Extensions. On ARM systems, the @option{-mfp16-format}
> +command-line option selects which format to use when the target
> +supports several of them.
> +
> +Most ARM targets support two incompatible representations for
> +half-precision floating-point values. You must choose one of the
> +representations and use it consistently in your program, unless your
> +target only supports IEEE 754-2008 format (for instance
> +@code{armv8.2-a+fp16} and @code{armv8.1-m.main+mve.fp}) in which case
> +it is the default and only acceptable setting.
>
> Specifying @option{-mfp16-format=ieee} selects the IEEE 754-2008 format.
> This format can represent normalized values in the range of @math{2^{-14}}
> to 65504.
> @@ -1220,6 +1221,12 @@ calls.
> It is recommended that portable code use the @code{_Float16} type defined
> by ISO/IEC TS 18661-3:2015. @xref{Floating Types}.
>
> +
> +On x86 targets with SSE2 enabled, GCC supports half-precision (16-bit)
> +floating point via the @code{_Float16} type. For C++, x86 provides a
> +builtin type named @code{_Float16} which contains same data format as
> +C.
> +
> On x86 targets with SSE2 enabled, without @option{-mavx512fp16},
> all operations will be emulated by software emulation and the @code{float}
> instructions. The default behavior for @code{FLT_EVAL_METHOD} is to keep the
> diff --git a/gcc/testsuite/g++.target/arm/mve/general-c++/nomve_fp_1.c
> b/gcc/testsuite/g++.target/arm/mve/general-c++/nomve_fp_1.c
> index fd8c05b0eed..4b91e0c6327 100644
> --- a/gcc/testsuite/g++.target/arm/mve/general-c++/nomve_fp_1.c
> +++ b/gcc/testsuite/g++.target/arm/mve/general-c++/nomve_fp_1.c
> @@ -12,6 +12,6 @@
> void
> f1 (uint8x16_t v)
> {
> - vreinterpretq_f16 (v); /* { dg-error {ACLE function 'void
> vreinterpretq_f16\(uint8x16_t\)' requires ISA extension 'mve.fp'} } */
> + vreinterpretq_f16 (v); /* { dg-error {ACLE function 'float16x8_t
> vreinterpretq_f16\(uint8x16_t\)' requires ISA extension 'mve.fp'} } */
> /* { dg-message {note: you can enable mve.fp by using the command-line
> option '-march', or by using the 'target' attribute or pragma} "" {target
> *-*-*} .-1 } */
> }
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-2-f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-2-f16.c
> new file mode 100644
> index 00000000000..d16c04e2ee4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-2-f16.c
> @@ -0,0 +1,36 @@
> +/* Check that we can compile if the target does not support floating-point,
> but
> + we use a pragma to enable FP support locally. */
> +
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
> +/* Do not use dg-add-options arm_v8_1m_mve, because this might expand to "",
> + which could imply mve+fp depending on the user settings. We want to make
> + sure the '+fp' extension is not enabled. */
> +/* { dg-options "-mfpu=auto -march=armv8.1-m.main+mve" } */
> +/* We need arm_fp to get the proper -mfloat-abi=XXX, if needed. */
> +/* { dg-add-options arm_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#pragma GCC target ("arch=armv8.1-m.main+mve.fp")
> +
> +/*
> +**foo:
> +** ...
> +** vldrh.16 q[0-9]+, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
> +** ...
> +*/
> +float16x8_t
> +foo (float16_t const *base)
> +{
> + return vld1q_f16 (base);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-2-f32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-2-f32.c
> new file mode 100644
> index 00000000000..2b2f6f2ab13
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-2-f32.c
> @@ -0,0 +1,36 @@
> +/* Check that we can compile if the target does not support floating-point,
> but
> + we use a pragma to enable FP support locally. */
> +
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
> +/* Do not use dg-add-options arm_v8_1m_mve, because this might expand to "",
> + which could imply mve+fp depending on the user settings. We want to make
> + sure the '+fp' extension is not enabled. */
> +/* { dg-options "-mfpu=auto -march=armv8.1-m.main+mve" } */
> +/* We need arm_fp to get the proper -mfloat-abi=XXX, if needed. */
> +/* { dg-add-options arm_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#pragma GCC target ("arch=armv8.1-m.main+mve.fp")
> +
> +/*
> +**foo:
> +** ...
> +** vldrw.32 q[0-9]+, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
> +** ...
> +*/
> +float32x4_t
> +foo (float32_t const *base)
> +{
> + return vld1q_f32 (base);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-3-f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-3-f16.c
> new file mode 100644
> index 00000000000..bcb0dd65416
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-3-f16.c
> @@ -0,0 +1,21 @@
> +/* Check that we can compile if the target supports floating-point. */
> +
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +float16x8_t
> +foo (float16_t const *base)
> +{
> + return vld1q_f16 (base);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-3-f32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-3-f32.c
> new file mode 100644
> index 00000000000..7e02816d505
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-3-f32.c
> @@ -0,0 +1,21 @@
> +/* Check that we can compile if the target supports floating-point. */
> +
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +float32x4_t
> +foo (float32_t const *base)
> +{
> + return vld1q_f32 (base);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-4-f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-4-f16.c
> new file mode 100644
> index 00000000000..0757d5a549b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-4-f16.c
> @@ -0,0 +1,21 @@
> +/* Check that -mfp16-format=none is silently ignored. */
> +
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O2 -mfp16-format=none" } */
> +
> +#include "arm_mve.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +float16x8_t
> +foo (float16_t const *base)
> +{
> + return vld1q_f16 (base);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-f16.c
> new file mode 100644
> index 00000000000..c25506bd034
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-f16.c
> @@ -0,0 +1,28 @@
> +/* Check that we get an error if the target does not support floating-point:
> we
> + force +mve to cancel a possible implicit +mve.fp. */
> +
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
> +/* Do not use dg-add-options arm_v8_1m_mve, because this might expand to "",
> + which could imply mve+fp depending on the user settings. We want to make
> + sure the '+fp' extension is not enabled. */
> +/* { dg-options "-mfpu=auto -march=armv8.1-m.main+mve" } */
> +/* We need arm_fp to get the proper -mfloat-abi=XXX, if needed. */
> +/* { dg-add-options arm_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +float16x8_t
> +foo (float16_t const *base)
> +{
> + return vld1q_f16 (base); /* { dg-error {ACLE function '.*vld1q_f16.*'
> requires ISA extension 'mve.fp'} } */
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-f32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-f32.c
> new file mode 100644
> index 00000000000..91a5e192986
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-f32.c
> @@ -0,0 +1,28 @@
> +/* Check that we get an error if the target does not support floating-point:
> we
> + force +mve to cancel a possible implicit +mve.fp. */
> +
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
> +/* Do not use dg-add-options arm_v8_1m_mve, because this might expand to "",
> + which could imply mve+fp depending on the user settings. We want to make
> + sure the '+fp' extension is not enabled. */
> +/* { dg-options "-mfpu=auto -march=armv8.1-m.main+mve" } */
> +/* We need arm_fp to get the proper -mfloat-abi=XXX, if needed. */
> +/* { dg-add-options arm_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +float32x4_t
> +foo (float32_t const *base)
> +{
> + return vld1q_f32 (base); /* { dg-error {ACLE function '.*vld1q_f32.*'
> requires ISA extension 'mve.fp'} } */
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> --
> 2.34.1
>