Re: [PATCH v1 1/1] aarch64: fix fp8 cpuinfo feature names

2024-12-03 Thread Claudio Bantaloukas
On 12/3/2024 10:24 AM, Kyrylo Tkachov wrote: Hi Claudio, On 2 Dec 2024, at 19:14, Claudio Bantaloukas wrote: The previous version of the patch was based on the mistaken assumption that features in /proc/cpuinfo had matching names to the feature names that gcc and gas accept. This patch

[PATCH v1] MAINTAINERS: add myself to write after approval

2024-12-02 Thread Claudio Bantaloukas
simonb Richard Ball ricbal02 Scott Bambrough - Wolfgang Bangerth - +Claudio Bantaloukas rdfm Gergö Barany - Thiago Jung Bauermann

[PATCH v1 1/1] aarch64: fix fp8 cpuinfo feature names

2024-12-02 Thread Claudio Bantaloukas
The previous version of the patch was based on the mistaken assumption that features in /proc/cpuinfo had matching names to the feature names that gcc and gas accept. This patch enables the fp8 feature when the f8cvt feature is enabled, under the assumption that fpmr is always enabled when f8cvt i
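
To illustrate the mismatch described above: the Features line in /proc/cpuinfo uses the kernel's names (the patch text mentions f8cvt), while gcc and gas accept fp8. The C sketch below is hypothetical and is not GCC's driver code; the helper names cpuinfo_has_feature and fp8_enabled_from_cpuinfo are invented for this example. It matches whole space-separated tokens and treats the presence of f8cvt as enabling fp8, following the assumption stated in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if NAME appears as a whole space-separated token in
   FEATURES (assumed to be the text after "Features :" in
   /proc/cpuinfo), else 0.  */
static int
cpuinfo_has_feature (const char *features, const char *name)
{
  size_t len = strlen (name);
  const char *p = features;

  if (len == 0)
    return 0;

  while ((p = strstr (p, name)) != NULL)
    {
      /* Accept the match only if it is delimited on both sides.  */
      int starts_token = (p == features || p[-1] == ' ');
      int ends_token = (p[len] == '\0' || p[len] == ' ' || p[len] == '\n');
      if (starts_token && ends_token)
        return 1;
      p++;
    }
  return 0;
}

/* Hypothetical mapping: the compiler-side "fp8" feature is considered
   present when the kernel reports "f8cvt".  */
static int
fp8_enabled_from_cpuinfo (const char *features)
{
  return cpuinfo_has_feature (features, "f8cvt");
}
```

Whole-token matching matters here because a plain substring search for f8cvt would also accept any longer feature name that merely contains it.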

[PATCH v1 0/1] aarch64: fix fp8 cpuinfo feature names

2024-12-02 Thread Claudio Bantaloukas
is. OK for trunk? Changelog: gcc/ config/aarch64/aarch64-option-extensions.def: (fp8): fix FEATURE_STRING. (fp8fma, ssve-fp8fma): Likewise. (fp8dot4, ssve-fp8dot4, fp8dot2, ssve-fp8dot2): Likewise. Claudio Bantaloukas (1): aarch64: fix fp8 cpuinfo feature names gcc

Re: [PATCH v5 5/5] aarch64: add SVE2 FP8DOT2 and FP8DOT4 intrinsics

2024-11-29 Thread Claudio Bantaloukas
On 11/29/2024 2:15 PM, Kyrylo Tkachov wrote: On 29 Nov 2024, at 13:00, Richard Sandiford wrote: Thanks for the update! Claudio Bantaloukas writes: diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 2a4f016e2df..f7440113570 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc

Re: [PATCH v5 5/5] aarch64: add SVE2 FP8DOT2 and FP8DOT4 intrinsics

2024-11-29 Thread Claudio Bantaloukas
On 11/29/2024 1:00 PM, Richard Sandiford wrote: Thanks for the update! Claudio Bantaloukas writes: diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 2a4f016e2df..f7440113570 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -21957,6 +21957,18 @@ Enable the fp8 (8-bit

[PATCH v5 2/5] aarch64: specify fpm mode in function instances and groups

2024-11-28 Thread Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the specific asm opcode required. In order to simplify review, this patch: - adds the fpm_mode_index attribute to function_group_info and function_instance objects - updates existing initialisations and call sites. - updates equalit

[PATCH v5 5/5] aarch64: add SVE2 FP8DOT2 and FP8DOT4 intrinsics

2024-11-28 Thread Claudio Bantaloukas
This patch adds support for the following intrinsics: - svdot[_f32_mf8]_fpm - svdot_lane[_f32_mf8]_fpm - svdot[_f16_mf8]_fpm - svdot_lane[_f16_mf8]_fpm The first two are available under a combination of the FP8DOT4 and SVE2 features. Alternatively under the SSVE_FP8DOT4 feature under streaming m

[PATCH v5 3/5] aarch64: add svcvt* FP8 intrinsics

2024-11-28 Thread Claudio Bantaloukas
This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm - svcvtlt2_f16[_mf8]_fpm - svcvtn_mf8[_f16_x2]_fpm (unpredicated) - svcvtnb_mf8[_f32_

[PATCH v5 0/5] aarch64: Add fp8 sve foundation

2024-11-28 Thread Claudio Bantaloukas
omeone commit it on my behalf? Regression tested on aarch64-unknown-linux-gnu. Thanks, Claudio Bantaloukas Claudio Bantaloukas (5): aarch64: Add basic svmfloat8_t support to arm_sve.h aarch64: specify fpm mode in function instances and groups aarch64: add svcvt* FP8 intrinsics aarch64: ad

[PATCH v5 4/5] aarch64: add SVE2 FP8 multiply accumulate intrinsics

2024-11-28 Thread Claudio Bantaloukas
This patch adds support for the following intrinsics: - svmlalb[_f16_mf8]_fpm - svmlalb[_n_f16_mf8]_fpm - svmlalt[_f16_mf8]_fpm - svmlalt[_n_f16_mf8]_fpm - svmlalb_lane[_f16_mf8]_fpm - svmlalt_lane[_f16_mf8]_fpm - svmlallbb[_f32_mf8]_fpm - svmlallbb[_n_f32_mf8]_fpm - svmlallbt[_f32_mf8]_fpm - svml

Re: [PATCH v4 5/5] aarch64: add SVE2 FP8DOT2 and FP8DOT4 intrinsics

2024-11-28 Thread Claudio Bantaloukas
On 21/11/2024 15:41, Richard Sandiford wrote: Claudio Bantaloukas writes: This patch adds support for the following intrinsics: - svdot[_f32_mf8]_fpm - svdot_lane[_f32_mf8]_fpm - svdot[_f16_mf8]_fpm - svdot_lane[_f16_mf8]_fpm The first two are available under a combination of the FP8DOT4

Re: [PATCH v4 4/5] aarch64: add SVE2 FP8 multiply accumulate intrinsics

2024-11-28 Thread Claudio Bantaloukas
On 21/11/2024 14:33, Richard Sandiford wrote: Claudio Bantaloukas writes: [...] @@ -4004,6 +4008,44 @@ SHAPE (ternary_bfloat_lane) typedef ternary_bfloat_lane_base<2> ternary_bfloat_lanex2_def; SHAPE (ternary_bfloat_lanex2) +/* sv_t svfoo[_t0](sv_t, svmfloat8_t, svmfloat8_t, ui

Re: [PATCH v4 3/5] aarch64: add svcvt* FP8 intrinsics

2024-11-21 Thread Claudio Bantaloukas
On 21/11/2024 13:09, Richard Sandiford wrote: Thanks for the updated series and sorry for the slow reply. Claudio Bantaloukas writes: diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c new file mode 100644 index

Re: [PATCH v2 1/4] aarch64: return scalar fp8 values in fp registers

2024-11-20 Thread Claudio Bantaloukas
On 19/11/2024 17:01, Andrew Pinski wrote: On Fri, Nov 8, 2024 at 8:11 AM Claudio Bantaloukas wrote: According to the aapcs64: If the argument is an 8-bit (...) precision Floating-point or short vector type and the NSRN is less than 8, then the argument is allocated to the least significant

[PATCH v4 2/5] aarch64: specify fpm mode in function instances and groups

2024-11-14 Thread Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the specific asm opcode required. In order to simplify review, this patch: - adds the fpm_mode_index attribute to function_group_info and function_instance objects - updates existing initialisations and call sites. - updates equalit

[PATCH v4 5/5] aarch64: add SVE2 FP8DOT2 and FP8DOT4 intrinsics

2024-11-14 Thread Claudio Bantaloukas
This patch adds support for the following intrinsics: - svdot[_f32_mf8]_fpm - svdot_lane[_f32_mf8]_fpm - svdot[_f16_mf8]_fpm - svdot_lane[_f16_mf8]_fpm The first two are available under a combination of the FP8DOT4 and SVE2 features. Alternatively under the SSVE_FP8DOT4 feature under streaming m

[PATCH v4 3/5] aarch64: add svcvt* FP8 intrinsics

2024-11-14 Thread Claudio Bantaloukas
This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm - svcvtlt2_f16[_mf8]_fpm - svcvtn_mf8[_f16_x2]_fpm (unpredicated) - svcvtnb_mf8[_f32_

[PATCH v4 4/5] aarch64: add SVE2 FP8 multiply accumulate intrinsics

2024-11-14 Thread Claudio Bantaloukas
This patch adds support for the following intrinsics: - svmlalb[_f16_mf8]_fpm - svmlalb[_n_f16_mf8]_fpm - svmlalt[_f16_mf8]_fpm - svmlalt[_n_f16_mf8]_fpm - svmlalb_lane[_f16_mf8]_fpm - svmlalt_lane[_f16_mf8]_fpm - svmlallbb[_f32_mf8]_fpm - svmlallbb[_n_f32_mf8]_fpm - svmlallbt[_f32_mf8]_fpm - svml

[PATCH v4 0/5] aarch64: Add fp8 sve foundation

2024-11-14 Thread Claudio Bantaloukas
Thanks, Claudio Bantaloukas Claudio Bantaloukas (5): aarch64: Add basic svmfloat8_t support to arm_sve.h aarch64: specify fpm mode in function instances and groups aarch64: add svcvt* FP8 intrinsics aarch64: add SVE2 FP8 multiply accumulate intrinsics aarch64: add SVE2 FP8DOT2 and F

[PATCH v2 4/4] aarch64: add SVE2 FP8 multiply accumulate intrinsics

2024-11-13 Thread Claudio Bantaloukas
This patch adds support for the following intrinsics: - svmlalb[_f16_mf8]_fpm - svmlalb[_n_f16_mf8]_fpm - svmlalt[_f16_mf8]_fpm - svmlalt[_n_f16_mf8]_fpm - svmlalb_lane[_f16_mf8]_fpm - svmlalt_lane[_f16_mf8]_fpm - svmlallbb[_f32_mf8]_fpm - svmlallbb[_n_f32_mf8]_fpm - svmlallbt[_f32_mf8]_fpm - svml

[PATCH v3 2/4] aarch64: specify fpm mode in function instances and groups

2024-11-13 Thread Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the specific asm opcode required. In order to simplify review, this patch: - adds the fpm_mode_index attribute to function_group_info and function_instance objects - updates existing initialisations and call sites. - updates equalit

[PATCH v2 0/4] aarch64: Add fp8 sve foundation

2024-11-13 Thread Claudio Bantaloukas
ok for master? I do not have commit rights yet, if ok, can someone commit it on my behalf? Regression tested on aarch64-unknown-linux-gnu. Thanks, Claudio Bantaloukas Claudio Bantaloukas (4): aarch64: Add basic svmfloat8_t support to arm_sve.h aarch64: specify fpm mode in function instances and

[PATCH v2 3/4] aarch64: add svcvt* FP8 intrinsics

2024-11-13 Thread Claudio Bantaloukas
This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm - svcvtlt2_f16[_mf8]_fpm - svcvtn_mf8[_f16_x2]_fpm (unpredicated) - svcvtnb_mf8[_f32_

[PATCH v3 4/4] aarch64: add SVE2 FP8 multiply accumulate intrinsics

2024-11-13 Thread Claudio Bantaloukas
This patch adds support for the following intrinsics: - svmlalb[_f16_mf8]_fpm - svmlalb[_n_f16_mf8]_fpm - svmlalt[_f16_mf8]_fpm - svmlalt[_n_f16_mf8]_fpm - svmlalb_lane[_f16_mf8]_fpm - svmlalt_lane[_f16_mf8]_fpm - svmlallbb[_f32_mf8]_fpm - svmlallbb[_n_f32_mf8]_fpm - svmlallbt[_f32_mf8]_fpm - svml

[PATCH v3 3/4] aarch64: add svcvt* FP8 intrinsics

2024-11-13 Thread Claudio Bantaloukas
This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm - svcvtlt2_f16[_mf8]_fpm - svcvtn_mf8[_f16_x2]_fpm (unpredicated) - svcvtnb_mf8[_f32_

[PATCH v3 0/4] aarch64: Add fp8 sve foundation

2024-11-13 Thread Claudio Bantaloukas
ok for master? I do not have commit rights yet, if ok, can someone commit it on my behalf? Regression tested on aarch64-unknown-linux-gnu. Thanks, Claudio Bantaloukas Claudio Bantaloukas (4): aarch64: Add basic svmfloat8_t support to arm_sve.h aarch64: specify fpm mode in function instances and

Re: [PATCH v2 0/4] aarch64: Add fp8 sve foundation

2024-11-13 Thread Claudio Bantaloukas
Please disregard this series, posted as v2 by mistake. Cheers, Claudio On 11/13/2024 4:34 PM, Claudio Bantaloukas wrote: The ACLE defines a new set of fp8 vector types and intrinsics that operate on these, some of them operating on the vectors as if they were bags of bits and some requiring an

[PATCH v2 2/4] aarch64: specify fpm mode in function instances and groups

2024-11-13 Thread Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the specific asm opcode required. In order to simplify review, this patch: - adds the fpm_mode_index attribute to function_group_info and function_instance objects - updates existing initialisations and call sites. - updates equalit

Re: [PATCH v2 4/4] aarch64: add svcvt* FP8 intrinsics

2024-11-12 Thread Claudio Bantaloukas
On 12/11/2024 12:05, Richard Sandiford wrote: Claudio Bantaloukas writes: This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm

Re: [PATCH v2 3/4] aarch64: specify fpm mode in function instances and groups

2024-11-12 Thread Claudio Bantaloukas
On 11/11/2024 16:12, Richard Sandiford wrote: Claudio Bantaloukas writes: diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index e4021559f36..8d25bb33dad 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc

Re: [PATCH v2 2/4] aarch64: Add basic svmfloat8_t support to arm_sve.h

2024-11-12 Thread Claudio Bantaloukas
On 11/11/2024 16:03, Richard Sandiford wrote: Claudio Bantaloukas writes: [...] @@ -231,12 +231,12 @@ CONSTEXPR const group_suffix_info group_suffixes[] = { #define TYPES_all_arith(S, D) \ TYPES_all_float (S, D), TYPES_all_integer (S, D) -/* _bf16 +/* _mf8 _bf16 _f16

[PATCH v2 0/4] aarch64: Add fp8 sve foundation

2024-11-08 Thread Claudio Bantaloukas
svcvt* intrinsics along with supporting shapes Is this ok for master? I do not have commit rights yet, if ok, can someone commit it on my behalf? Regression tested on aarch64-unknown-linux-gnu. Thanks, Claudio Bantaloukas Claudio Bantaloukas (4): aarch64: return scalar fp8 values in fp reg

[PATCH v2 4/4] aarch64: add svcvt* FP8 intrinsics

2024-11-08 Thread Claudio Bantaloukas
This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm - svcvtlt2_f16[_mf8]_fpm - svcvtn_mf8[_f16_x2]_fpm (unpredicated) - svcvtnb_mf8[_f32_

[PATCH v2 3/4] aarch64: specify fpm mode in function instances and groups

2024-11-08 Thread Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the specific asm opcode required. In order to simplify review, this patch: - adds the fpm_mode_index attribute to function_group_info and function_instance objects - updates existing initialisations and call sites. - updates equalit

[PATCH v2 1/4] aarch64: return scalar fp8 values in fp registers

2024-11-08 Thread Claudio Bantaloukas
According to the aapcs64: If the argument is an 8-bit (...) precision Floating-point or short vector type and the NSRN is less than 8, then the argument is allocated to the least significant bits of register v[NSRN]. gcc/ * config/aarch64/aarch64.cc (aarch64_vfp_is_call_or_return_
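
The quoted rule can be pictured with a toy allocator in C. This is only a sketch and not the code in aarch64.cc: struct arg_state and alloc_fp_arg are hypothetical names, and the increment of NSRN after a successful allocation plus the fall-back to the stack come from the surrounding AAPCS64 rules rather than from the quoted sentence, so they should be checked against the specification.

```c
#define NUM_FP_ARG_REGS 8   /* Assumed: v0..v7 carry such arguments.  */

/* Toy model of the "next SIMD and floating-point register number".  */
struct arg_state
{
  int nsrn;
};

/* Allocate one floating-point or short-vector argument, which under
   the quoted rule includes an 8-bit floating-point value.  Returns the
   index N of the register v[N] whose least significant bits receive
   the value, or -1 if the argument has to go on the stack.  */
static int
alloc_fp_arg (struct arg_state *st)
{
  if (st->nsrn < NUM_FP_ARG_REGS)
    return st->nsrn++;

  /* No register left: pin NSRN at 8 and report a stack allocation.  */
  st->nsrn = NUM_FP_ARG_REGS;
  return -1;
}
```

Starting from nsrn = 0, the first eight calls return 0 through 7 and every later call returns -1.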

[PATCH v1 0/2] aarch64: Add fp8 sve foundation

2024-10-30 Thread Claudio Bantaloukas
fpm_t type - foundational changes that will be used to implement intrinsics requiring an fpm_t argument at the end. Is this ok for master? I do not have commit rights yet, if ok, can someone commit it on my behalf? Regression tested on aarch64-unknown-linux-gnu. Thanks, Claudio Bantaloukas

[PATCH v1 2/2] aarch64: specify fpm mode in function instances and groups

2024-10-30 Thread Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the specific asm opcode required. In order to simplify review, this patch: - adds the fpm_mode_index attribute to function_group_info and function_instance objects - updates existing initialisations and call sites. - updates equalit

[PATCH v3] aarch64: Add fp8 scalar types

2024-09-20 Thread Claudio Bantaloukas
defs for mfloat8_t are defined in arm_private_fp8.h rather than arm_neon.h and arm_sve.h Thanks, Claudio Bantaloukas gcc/config/aarch64/aarch64-builtins.cc| 20 + gcc/config/aarch64/aarch64.cc | 54 ++- gcc/config/aarch64/aarch64.h | 5 + gcc/config/aar

Re: [PATCH v2] aarch64: Add fp8 scalar types

2024-09-19 Thread Claudio Bantaloukas
On 9/19/2024 2:18 PM, Kyrylo Tkachov wrote: Hi Claudio, On 19 Sep 2024, at 15:09, Claudio Bantaloukas wrote: The ACLE defines a new scalar type, __mfp8. This is an opaque 8-bit type that can only be used by fp8 intrinsics

[PATCH v2] aarch64: Add fp8 scalar types

2024-09-19 Thread Claudio Bantaloukas
ters Thanks, Claudio Bantaloukas gcc/config/aarch64/aarch64-builtins.cc| 20 + gcc/config/aarch64/aarch64.cc | 54 ++- gcc/config/aarch64/aarch64.h | 5 + gcc/config/aarch64/arm_neon.h | 2 + gcc/config/aarch64/arm_sve.h |

Re: [PATCH v1] aarch64: Add fp8 scalar types

2024-08-02 Thread Claudio Bantaloukas
On 02/08/2024 12:17, Richard Sandiford wrote: > Claudio Bantaloukas writes: >> The ACLE defines a new scalar type, __mfp8. This is an opaque 8-bit type that >> can only be used by fp8 intrinsics. Additionally, the mfloat8_t type is made >> available in arm_neon.h and arm

[PATCH v1] aarch64: Add fp8 scalar types

2024-08-02 Thread Claudio Bantaloukas
ch64/fp8_scalar_typecheck_1.c: Likewise. * gcc.target/aarch64/fp8_scalar_typecheck_2.C: New tests in C++. --- Hi, Is this ok for master? I do not have commit rights yet, if ok, can someone commit it on my behalf? Regression tested with aarch64-unknown-linux-gnu. Thanks, Claudio Bantaloukas gcc/co

Re: [PATCH v4 2/3] aarch64: Add support for moving fpm system register

2024-07-31 Thread Claudio Bantaloukas
On 31/07/2024 08:57, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 31 Jul 2024, at 08:29, Claudio Bantaloukas >> wrote: >> >> Unlike most system registers, fpmr can be heav

[PATCH v4 2/3] aarch64: Add support for moving fpm system register

2024-07-30 Thread Claudio Bantaloukas
Unlike most system registers, fpmr can be heavily written to in code that exercises the fp8 functionality. That is because every fp8 intrinsic call can potentially change the value of fpmr. Rather than just use an unspec, we treat the fpmr system register like all other registers and use a move o

[PATCH v4 3/3] aarch64: Add fpm register helper functions.

2024-07-30 Thread Claudio Bantaloukas
The ACLE declares several helper types and functions to facilitate construction of `fpm` arguments. These are available when one of the arm_neon.h, arm_sve.h, or arm_sme.h headers is included. These helpers don't map to specific FP8 instructions and there's no expectation that they will produce a
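
Since these helpers only build a value and do not map to specific instructions, they can be modelled portably as bit-field updates on a 64-bit integer. The C sketch below is a stand-in, not the ACLE or GCC definitions: every model_* name is invented here, and the field positions (source 1 format in bits [2:0], source 2 format in bits [5:3], destination format in bits [8:6]) and the format codes are assumptions to be checked against the fpmr description in the architecture manual.

```c
#include <stdint.h>

/* Portable stand-in for an fpm_t-like value: a 64-bit image of fpmr.  */
typedef uint64_t model_fpm_t;

/* Hypothetical format codes; assumed encoding 0 = E5M2, 1 = E4M3.  */
enum model_fpm_format
{
  MODEL_FPM_E5M2 = 0,
  MODEL_FPM_E4M3 = 1
};

/* Replace the WIDTH-bit field at bit position SHIFT with VALUE,
   leaving all other bits of FPM unchanged.  */
static model_fpm_t
model_set_field (model_fpm_t fpm, unsigned shift, unsigned width,
                 uint64_t value)
{
  uint64_t mask = ((UINT64_C (1) << width) - 1) << shift;
  return (fpm & ~mask) | ((value << shift) & mask);
}

/* Assumed layout: source 1 format in bits [2:0].  */
static model_fpm_t
model_set_src1_format (model_fpm_t fpm, enum model_fpm_format f)
{
  return model_set_field (fpm, 0, 3, (uint64_t) f);
}

/* Assumed layout: source 2 format in bits [5:3].  */
static model_fpm_t
model_set_src2_format (model_fpm_t fpm, enum model_fpm_format f)
{
  return model_set_field (fpm, 3, 3, (uint64_t) f);
}

/* Assumed layout: destination format in bits [8:6].  */
static model_fpm_t
model_set_dst_format (model_fpm_t fpm, enum model_fpm_format f)
{
  return model_set_field (fpm, 6, 3, (uint64_t) f);
}
```

Each setter returns a new value instead of writing a register, which is consistent with the snippet's remark that the helpers are not expected to correspond to particular FP8 instructions.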

[PATCH v4 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-30 Thread Claudio Bantaloukas
This introduces the relevant flags to enable access to the fpmr register and fp8 intrinsics, which will be added subsequently. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (fp8): New. * config/aarch64/aarch64.h (TARGET_FP8): Likewise. * doc/invoke.texi (

[PATCH v4 0/3] aarch64: Add initial support for +fp8 arch extensions

2024-07-30 Thread Claudio Bantaloukas
ted error message in arm_private_fp8.h Is this ok for master? I do not have merge permissions. Can someone merge this for me please? Thanks, Claudio Bantaloukas Claudio Bantaloukas (3): aarch64: Add march flags for +fp8 arch extensions aarch64: Add support for moving fpm system register aarch

Re: [PATCH v3 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-30 Thread Claudio Bantaloukas
On 29/07/2024 08:30, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 26 Jul 2024, at 18:32, Claudio Bantaloukas >> wrote: >> >> This introduces the relevant flags to enable access to

Re: [PATCH v3 2/3] aarch64: Add support for moving fpm system register

2024-07-30 Thread Claudio Bantaloukas
On 29/07/2024 13:13, Richard Sandiford wrote: > Claudio Bantaloukas writes: >> Unlike most system registers, fpmr can be heavily written to in code that >> exercises the fp8 functionality. That is because every fp8 intrinsic call >> can potentially change the value of fpmr

[PATCH v3 3/3] aarch64: Add fpm register helper functions.

2024-07-26 Thread Claudio Bantaloukas
The ACLE declares several helper types and functions to facilitate construction of `fpm` arguments. These are available when one of the arm_neon.h, arm_sve.h, or arm_sme.h headers is included. These helpers don't map to specific FP8 instructions and there's no expectation that they will produce a

[PATCH v3 0/3] aarch64: Add initial support for +fp8 arch extensions

2024-07-26 Thread Claudio Bantaloukas
lable when including arm_neon.h arm_sve.h or arm_sme.h Is this ok for master? I do not have merge permissions. Can someone merge this for me please? Thanks, Claudio Bantaloukas Claudio Bantaloukas (3): aarch64: Add march flags for +fp8 arch extensions aarch64: Add support for moving

[PATCH v3 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-26 Thread Claudio Bantaloukas
This introduces the relevant flags to enable access to the fpmr register and fp8 intrinsics, which will be added subsequently. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (fp8): New. * config/aarch64/aarch64.h (TARGET_FP8): Likewise. * doc/invoke.texi (

[PATCH v3 2/3] aarch64: Add support for moving fpm system register

2024-07-26 Thread Claudio Bantaloukas
Unlike most system registers, fpmr can be heavily written to in code that exercises the fp8 functionality. That is because every fp8 intrinsic call can potentially change the value of fpmr. Rather than just use an unspec, we treat the fpmr system register like all other registers and use a move

Re: [PATCH v2 3/3] aarch64: Add fpm register helper functions.

2024-07-26 Thread Claudio Bantaloukas
On 26/07/2024 09:13, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 25 Jul 2024, at 16:25, Claudio Bantaloukas >> wrote: >> >> The ACLE declares several helper types and functions

Re: [PATCH v2 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-26 Thread Claudio Bantaloukas
On 26/07/2024 08:15, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 25 Jul 2024, at 16:25, Claudio Bantaloukas >> wrote: >> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index e0a641213ae..f293d49c61a 100644 >> --- a/gcc/doc/invoke.texi &

[PATCH v2 2/3] aarch64: Add support for moving fpm system register

2024-07-25 Thread Claudio Bantaloukas
Unlike most system registers, fpmr can be heavily written to in code that exercises the fp8 functionality. That is because every fp8 intrinsic call can potentially change the value of fpmr. Rather than just use an unspec, we treat the fpmr system register like all other registers and use a move

[PATCH v2 3/3] aarch64: Add fpm register helper functions.

2024-07-25 Thread Claudio Bantaloukas
The ACLE declares several helper types and functions to facilitate construction of `fpm` arguments. gcc/ChangeLog: * config/aarch64/arm_acle.h (fpm_t): New type representing fpmr values. (enum __ARM_FPM_FORMAT): New enum representing valid fp8 formats. (enum __ARM_FPM_OVE

[PATCH v2 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-25 Thread Claudio Bantaloukas
This introduces the relevant flags to enable access to the fpmr register and fp8 intrinsics, which will be added subsequently. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (fp8): New. * config/aarch64/aarch64.h (TARGET_FP8): Likewise. * doc/invoke.texi (

[PATCH v2 0/3] aarch64: Add initial support for +fp8 arch extensions

2024-07-25 Thread Claudio Bantaloukas
sary modifier in .md aarch64: Add fpm register helper functions. - Helper functions and fpm_t types are available unconditionally when including arm_acle.h Is this ok for master? I do not have merge permissions. Can someone merge this for me please? Thanks, Claudio Bantaloukas Claudio Bantal

Re: [PATCH 2/3] aarch64: Add support for moving fpm system register

2024-07-24 Thread Claudio Bantaloukas
On 22/07/2024 11:07, Alex Coplan wrote: > Hi Claudio, > > I've left a couple of small comments below. > > On 22/07/2024 09:30, Claudio Bantaloukas wrote: --8<- >> >> @@ -1505,6 +1513,8 @@ (define_insn_and_split "*movdi_aarch64" >>

Re: [PATCH 2/3] aarch64: Add support for moving fpm system register

2024-07-23 Thread Claudio Bantaloukas
On 22/07/2024 11:07, Alex Coplan wrote: > Hi Claudio, > > I've left a couple of small comments below. > > On 22/07/2024 09:30, Claudio Bantaloukas wrote: >> >> Unlike most system registers, fpmr can be heavily written to in code that >> exercises the fp

Re: [PATCH 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-23 Thread Claudio Bantaloukas
On 22/07/2024 10:45, Kyrylo Tkachov wrote: > Hi Claudio, > Thanks for working on this. > >> On 22 Jul 2024, at 11:30, Claudio Bantaloukas >> wrote: >> >> This introduces the r

Re: [PATCH 2/3] aarch64: Add support for moving fpm system register

2024-07-23 Thread Claudio Bantaloukas
On 22/07/2024 10:51, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 22 Jul 2024, at 11:30, Claudio Bantaloukas >> wrote: >> >> Unlike most system registers, fpmr can be heav

[PATCH 3/3] aarch64: add fpm register helper functions.

2024-07-22 Thread Claudio Bantaloukas
The ACLE declares several helper types and functions to facilitate construction of `fpm` arguments. gcc/ChangeLog: * config/aarch64/arm_acle.h (fpm_t): New type representing fpmr values. (enum __ARM_FPM_FORMAT): New enum representing valid fp8 formats. (enum __ARM_FPM_OVE

[PATCH 2/3] aarch64: Add support for moving fpm system register

2024-07-22 Thread Claudio Bantaloukas
Unlike most system registers, fpmr can be heavily written to in code that exercises the fp8 functionality. That is because every fp8 intrinsic call can potentially change the value of fpmr. Rather than just use an unspec, we treat the fpmr system register like all other registers and use a move

[PATCH 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-22 Thread Claudio Bantaloukas
This introduces the relevant flags to enable access to the fpmr register and fp8 intrinsics, which will be added subsequently. The `+fp8' -march modifier defines the __ARM_FEATURE_FP8 macro to 1. gcc/ChangeLog: * config/aarch64/aarch64-c.cc (__ARM_FEATURE_FP8): New. * config/aa
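
A feature macro of this kind is normally consumed with a preprocessor test. The C sketch below assumes only what this version of the patch states, namely that __ARM_FEATURE_FP8 is defined to 1 when +fp8 is part of -march; the function name compiled_with_fp8 is made up for the example, and later revisions of the series may handle the macro differently.

```c
/* Returns 1 when this translation unit was compiled with the fp8
   feature macro defined to a non-zero value, 0 otherwise.  */
static int
compiled_with_fp8 (void)
{
#if defined (__ARM_FEATURE_FP8) && __ARM_FEATURE_FP8
  return 1;
#else
  return 0;
#endif
}
```

Built without the +fp8 modifier, or for a target other than aarch64, the function simply returns 0, so fp8-specific code paths can be guarded by the same test.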

[PATCH 0/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-22 Thread Claudio Bantaloukas
eature_flags to 128 bits" be applied. Is this ok for master? I do not have merge permissions. Can someone merge this for me please? Thanks, Claudio Bantaloukas Claudio Bantaloukas (3): aarch64: Add march flags for +fp8 arch extensions aarch64: Add support for moving fpm system registe