On 12/3/2024 10:24 AM, Kyrylo Tkachov wrote:
Hi Claudio,
On 2 Dec 2024, at 19:14, Claudio Bantaloukas
wrote:
The previous version of the patch was based on the mistaken assumption that
features in /proc/cpuinfo had matching names to the feature names that gcc and
gas accept.
This patch
simonb
Richard Ball ricbal02
Scott Bambrough -
Wolfgang Bangerth -
+Claudio Bantaloukas rdfm
Gergö Barany -
Thiago Jung Bauermann
The previous version of the patch was based on the mistaken assumption that
features in /proc/cpuinfo had matching names to the feature names that gcc and
gas accept.
This patch enables the fp8 feature when the f8cvt feature is enabled, under the
assumption that fpmr is always enabled when f8cvt is.
OK for trunk?
Changelog:
gcc/
config/aarch64/aarch64-option-extensions.def: (fp8): fix FEATURE_STRING.
(fp8fma, ssve-fp8fma): Likewise.
(fp8dot4, ssve-fp8dot4, fp8dot2, ssve-fp8dot2): Likewise.
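As a rough illustration of the name mismatch described above, the following C
sketch looks up a name as a whole token in a /proc/cpuinfo-style Features line.
The helper names cpuinfo_has_feature and implies_gcc_fp8 and the sample
strings are hypothetical and are not taken from the patch. The only mapping
assumed is the one stated above: f8cvt in cpuinfo implies fp8 for gcc.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if NAME appears as a whole
   whitespace-separated token in FEATURES, 0 otherwise.  */
int cpuinfo_has_feature(const char *features, const char *name)
{
    size_t len = strlen(name);
    const char *p = features;

    while (*p) {
        /* Skip separators before the next token.  */
        while (*p == ' ' || *p == '\t')
            p++;
        const char *start = p;
        /* Advance to the end of the token.  */
        while (*p && *p != ' ' && *p != '\t' && *p != '\n')
            p++;
        if (len > 0 && (size_t)(p - start) == len
            && strncmp(start, name, len) == 0)
            return 1;
        if (*p == '\n')
            p++;
    }
    return 0;
}

/* Assumption from the patch description: the cpuinfo name to look for
   is "f8cvt", even though the gcc/gas feature name is "fp8".  */
int implies_gcc_fp8(const char *features)
{
    return cpuinfo_has_feature(features, "f8cvt");
}
```

With a sample line such as "fp asimd f8cvt", a lookup for "fp8" finds nothing
while a lookup for "f8cvt" succeeds, which is the gap the corrected
FEATURE_STRING entries are meant to close.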
Claudio Bantaloukas (1):
aarch64: fix fp8 cpuinfo feature names
gcc
On 11/29/2024 2:15 PM, Kyrylo Tkachov wrote:
On 29 Nov 2024, at 13:00, Richard Sandiford wrote:
Thanks for the update!
Claudio Bantaloukas writes:
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2a4f016e2df..f7440113570 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc
On 11/29/2024 1:00 PM, Richard Sandiford wrote:
Thanks for the update!
Claudio Bantaloukas writes:
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2a4f016e2df..f7440113570 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -21957,6 +21957,18 @@ Enable the fp8 (8-bit
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
This patch adds support for the following intrinsics:
- svdot[_f32_mf8]_fpm
- svdot_lane[_f32_mf8]_fpm
- svdot[_f16_mf8]_fpm
- svdot_lane[_f16_mf8]_fpm
The first two are available under a combination of the FP8DOT4 and SVE2
features.
Alternatively under the SSVE_FP8DOT4 feature under streaming m
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
- svcvtlt2_f16[_mf8]_fpm
- svcvtn_mf8[_f16_x2]_fpm (unpredicated)
- svcvtnb_mf8[_f32_
omeone
commit it on my behalf?
Regression tested on aarch64-unknown-linux-gnu.
Thanks,
Claudio Bantaloukas
Claudio Bantaloukas (5):
aarch64: Add basic svmfloat8_t support to arm_sve.h
aarch64: specify fpm mode in function instances and groups
aarch64: add svcvt* FP8 intrinsics
aarch64: ad
This patch adds support for the following intrinsics:
- svmlalb[_f16_mf8]_fpm
- svmlalb[_n_f16_mf8]_fpm
- svmlalt[_f16_mf8]_fpm
- svmlalt[_n_f16_mf8]_fpm
- svmlalb_lane[_f16_mf8]_fpm
- svmlalt_lane[_f16_mf8]_fpm
- svmlallbb[_f32_mf8]_fpm
- svmlallbb[_n_f32_mf8]_fpm
- svmlallbt[_f32_mf8]_fpm
- svml
On 21/11/2024 15:41, Richard Sandiford wrote:
Claudio Bantaloukas writes:
This patch adds support for the following intrinsics:
- svdot[_f32_mf8]_fpm
- svdot_lane[_f32_mf8]_fpm
- svdot[_f16_mf8]_fpm
- svdot_lane[_f16_mf8]_fpm
The first two are available under a combination of the FP8DOT4
On 21/11/2024 14:33, Richard Sandiford wrote:
Claudio Bantaloukas writes:
[...]
@@ -4004,6 +4008,44 @@ SHAPE (ternary_bfloat_lane)
typedef ternary_bfloat_lane_base<2> ternary_bfloat_lanex2_def;
SHAPE (ternary_bfloat_lanex2)
+/* sv_t svfoo[_t0](sv_t, svmfloat8_t, svmfloat8_t, ui
On 21/11/2024 13:09, Richard Sandiford wrote:
Thanks for the updated series and sorry for the slow reply.
Claudio Bantaloukas writes:
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c
b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c
new file mode 100644
index
On 19/11/2024 17:01, Andrew Pinski wrote:
On Fri, Nov 8, 2024 at 8:11 AM Claudio Bantaloukas
wrote:
According to the aapcs64: If the argument is an 8-bit (...) precision
Floating-point or short vector type and the NSRN is less than 8, then the
argument is allocated to the least significant
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
This patch adds support for the following intrinsics:
- svdot[_f32_mf8]_fpm
- svdot_lane[_f32_mf8]_fpm
- svdot[_f16_mf8]_fpm
- svdot_lane[_f16_mf8]_fpm
The first two are available under a combination of the FP8DOT4 and SVE2
features.
Alternatively under the SSVE_FP8DOT4 feature under streaming m
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
- svcvtlt2_f16[_mf8]_fpm
- svcvtn_mf8[_f16_x2]_fpm (unpredicated)
- svcvtnb_mf8[_f32_
This patch adds support for the following intrinsics:
- svmlalb[_f16_mf8]_fpm
- svmlalb[_n_f16_mf8]_fpm
- svmlalt[_f16_mf8]_fpm
- svmlalt[_n_f16_mf8]_fpm
- svmlalb_lane[_f16_mf8]_fpm
- svmlalt_lane[_f16_mf8]_fpm
- svmlallbb[_f32_mf8]_fpm
- svmlallbb[_n_f32_mf8]_fpm
- svmlallbt[_f32_mf8]_fpm
- svml
Thanks,
Claudio Bantaloukas
Claudio Bantaloukas (5):
aarch64: Add basic svmfloat8_t support to arm_sve.h
aarch64: specify fpm mode in function instances and groups
aarch64: add svcvt* FP8 intrinsics
aarch64: add SVE2 FP8 multiply accumulate intrinsics
aarch64: add SVE2 FP8DOT2 and F
This patch adds support for the following intrinsics:
- svmlalb[_f16_mf8]_fpm
- svmlalb[_n_f16_mf8]_fpm
- svmlalt[_f16_mf8]_fpm
- svmlalt[_n_f16_mf8]_fpm
- svmlalb_lane[_f16_mf8]_fpm
- svmlalt_lane[_f16_mf8]_fpm
- svmlallbb[_f32_mf8]_fpm
- svmlallbb[_n_f32_mf8]_fpm
- svmlallbt[_f32_mf8]_fpm
- svml
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
ok for master? I do not have commit rights yet, if ok, can someone
commit it on my behalf?
Regression tested on aarch64-unknown-linux-gnu.
Thanks,
Claudio Bantaloukas
Claudio Bantaloukas (4):
aarch64: Add basic svmfloat8_t support to arm_sve.h
aarch64: specify fpm mode in function instances and
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
- svcvtlt2_f16[_mf8]_fpm
- svcvtn_mf8[_f16_x2]_fpm (unpredicated)
- svcvtnb_mf8[_f32_
This patch adds support for the following intrinsics:
- svmlalb[_f16_mf8]_fpm
- svmlalb[_n_f16_mf8]_fpm
- svmlalt[_f16_mf8]_fpm
- svmlalt[_n_f16_mf8]_fpm
- svmlalb_lane[_f16_mf8]_fpm
- svmlalt_lane[_f16_mf8]_fpm
- svmlallbb[_f32_mf8]_fpm
- svmlallbb[_n_f32_mf8]_fpm
- svmlallbt[_f32_mf8]_fpm
- svml
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
- svcvtlt2_f16[_mf8]_fpm
- svcvtn_mf8[_f16_x2]_fpm (unpredicated)
- svcvtnb_mf8[_f32_
ok for master? I do not have commit rights yet, if ok, can someone
commit it on my behalf?
Regression tested on aarch64-unknown-linux-gnu.
Thanks,
Claudio Bantaloukas
Claudio Bantaloukas (4):
aarch64: Add basic svmfloat8_t support to arm_sve.h
aarch64: specify fpm mode in function instances and
Please disregard this series, posted as v2 by mistake.
Cheers,
Claudio
On 11/13/2024 4:34 PM, Claudio Bantaloukas wrote:
The ACLE defines a new set of fp8 vector types and intrinsics that operate on
these, some of them operating on the vectors as if they were bags of bits and
some requiring an
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
On 12/11/2024 12:05, Richard Sandiford wrote:
Claudio Bantaloukas writes:
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
On 11/11/2024 16:12, Richard Sandiford wrote:
Claudio Bantaloukas writes:
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
index e4021559f36..8d25bb33dad 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
+++ b/gcc
On 11/11/2024 16:03, Richard Sandiford wrote:
Claudio Bantaloukas writes:
[...]
@@ -231,12 +231,12 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
#define TYPES_all_arith(S, D) \
TYPES_all_float (S, D), TYPES_all_integer (S, D)
-/* _bf16
+/* _mf8 _bf16
_f16
svcvt* intrinsics along with supporting shapes
Is this ok for master? I do not have commit rights yet, if ok, can someone
commit it on my behalf?
Regression tested on aarch64-unknown-linux-gnu.
Thanks,
Claudio Bantaloukas
Claudio Bantaloukas (4):
aarch64: return scalar fp8 values in fp reg
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
- svcvtlt2_f16[_mf8]_fpm
- svcvtn_mf8[_f16_x2]_fpm (unpredicated)
- svcvtnb_mf8[_f32_
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
According to the aapcs64: If the argument is an 8-bit (...) precision
Floating-point or short vector type and the NSRN is less than 8, then the
argument is allocated to the least significant bits of register v[NSRN].
gcc/
* config/aarch64/aarch64.cc
(aarch64_vfp_is_call_or_return_
fpm_t type
- foundational changes that will be used to implement intrinsics requiring an
fpm_t argument at the end.
Is this ok for master? I do not have commit rights yet, if ok, can someone
commit it on my behalf?
Regression tested on aarch64-unknown-linux-gnu.
Thanks,
Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
defs for mfloat8_t are defined in arm_private_fp8.h rather than
arm_neon.h and arm_sve.h
Thanks,
Claudio Bantaloukas
gcc/config/aarch64/aarch64-builtins.cc| 20 +
gcc/config/aarch64/aarch64.cc | 54 ++-
gcc/config/aarch64/aarch64.h | 5 +
gcc/config/aar
On 9/19/2024 2:18 PM, Kyrylo Tkachov wrote:
Hi Claudio,
On 19 Sep 2024, at 15:09, Claudio Bantaloukas
wrote:
The ACLE defines a new scalar type, __mfp8. This is an opaque 8-bit type that
can only be used by fp8 intrinsics
ters
Thanks,
Claudio Bantaloukas
gcc/config/aarch64/aarch64-builtins.cc| 20 +
gcc/config/aarch64/aarch64.cc | 54 ++-
gcc/config/aarch64/aarch64.h | 5 +
gcc/config/aarch64/arm_neon.h | 2 +
gcc/config/aarch64/arm_sve.h |
On 02/08/2024 12:17, Richard Sandiford wrote:
> Claudio Bantaloukas writes:
>> The ACLE defines a new scalar type, __mfp8. This is an opaque 8-bit type that
>> can only be used by fp8 intrinsics. Additionally, the mfloat8_t type is made
>> available in arm_neon.h and arm
ch64/fp8_scalar_typecheck_1.c: Likewise.
* gcc.target/aarch64/fp8_scalar_typecheck_2.C: New tests in C++.
---
Hi,
Is this ok for master? I do not have commit rights yet, if ok, can someone
commit it on my behalf?
Regression tested with aarch64-unknown-linux-gnu.
Thanks,
Claudio Bantaloukas
gcc/co
On 31/07/2024 08:57, Kyrylo Tkachov wrote:
> Hi Claudio,
>
>> On 31 Jul 2024, at 08:29, Claudio Bantaloukas
>> wrote:
>>
>> Unlike most system registers, fpmr can be heav
Unlike most system registers, fpmr can be heavily written to in code that
exercises the fp8 functionality. That is because every fp8 intrinsic call
can potentially change the value of fpmr.
Rather than just use an unspec, we treat the fpmr system register like
all other registers and use a move o
The ACLE declares several helper types and functions to facilitate construction
of `fpm` arguments. These are available when one of the arm_neon.h, arm_sve.h,
or arm_sme.h headers is included. These helpers don't map to specific FP8
instructions and there's no expectation that they will produce a
This introduces the relevant flags to enable access to the fpmr register and
fp8 intrinsics, which will be added subsequently.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def (fp8): New.
* config/aarch64/aarch64.h (TARGET_FP8): Likewise.
* doc/invoke.texi (
ted error message in arm_private_fp8.h
Is this ok for master? I do not have merge permissions. Can someone merge this
for me please?
Thanks,
Claudio Bantaloukas
Claudio Bantaloukas (3):
aarch64: Add march flags for +fp8 arch extensions
aarch64: Add support for moving fpm system register
aarch
On 29/07/2024 08:30, Kyrylo Tkachov wrote:
> Hi Claudio,
>
>> On 26 Jul 2024, at 18:32, Claudio Bantaloukas
>> wrote:
>>
>> This introduces the relevant flags to enable access to
On 29/07/2024 13:13, Richard Sandiford wrote:
> Claudio Bantaloukas writes:
>> Unlike most system registers, fpmr can be heavily written to in code that
>> exercises the fp8 functionality. That is because every fp8 intrinsic call
>> can potentially change the value of fpmr
The ACLE declares several helper types and functions to facilitate construction
of `fpm` arguments. These are available when one of the arm_neon.h, arm_sve.h,
or arm_sme.h headers is included. These helpers don't map to specific FP8
instructions and there's no expectation that they will produce a
lable when including arm_neon.h
arm_sve.h or arm_sme.h
Is this ok for master? I do not have merge permissions. Can someone merge this
for me please?
Thanks,
Claudio Bantaloukas
Claudio Bantaloukas (3):
aarch64: Add march flags for +fp8 arch extensions
aarch64: Add support for moving
This introduces the relevant flags to enable access to the fpmr register and
fp8 intrinsics, which will be added subsequently.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def (fp8): New.
* config/aarch64/aarch64.h (TARGET_FP8): Likewise.
* doc/invoke.texi (
Unlike most system registers, fpmr can be heavily written to in code that
exercises the fp8 functionality. That is because every fp8 intrinsic call
can potentially change the value of fpmr.
Rather than just use an unspec, we treat the fpmr system register like
all other registers and use a move
On 26/07/2024 09:13, Kyrylo Tkachov wrote:
> Hi Claudio,
>
>> On 25 Jul 2024, at 16:25, Claudio Bantaloukas
>> wrote:
>>
>> The ACLE declares several helper types and functions
On 26/07/2024 08:15, Kyrylo Tkachov wrote:
> Hi Claudio,
>
>> On 25 Jul 2024, at 16:25, Claudio Bantaloukas
>> wrote:
>>
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index e0a641213ae..f293d49c61a 100644
>> --- a/gcc/doc/invoke.texi
Unlike most system registers, fpmr can be heavily written to in code that
exercises the fp8 functionality. That is because every fp8 intrinsic call
can potentially change the value of fpmr.
Rather than just use an unspec, we treat the fpmr system register like
all other registers and use a move
The ACLE declares several helper types and functions to
facilitate construction of `fpm` arguments.
gcc/ChangeLog:
* config/aarch64/arm_acle.h (fpm_t): New type representing fpmr values.
(enum __ARM_FPM_FORMAT): New enum representing valid fp8 formats.
(enum __ARM_FPM_OVE
This introduces the relevant flags to enable access to the fpmr register and
fp8 intrinsics, which will be added subsequently.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def (fp8): New.
* config/aarch64/aarch64.h (TARGET_FP8): Likewise.
* doc/invoke.texi (
sary modifier in .md
aarch64: Add fpm register helper functions.
- Helper functions and fpm_t types are available unconditionally when including
arm_acle.h
Is this ok for master? I do not have merge permissions. Can someone merge this
for me please?
Thanks,
Claudio Bantaloukas
Claudio Bantal
On 22/07/2024 11:07, Alex Coplan wrote:
> Hi Claudio,
>
> I've left a couple of small comments below.
>
> On 22/07/2024 09:30, Claudio Bantaloukas wrote:
--8<-
>>
>> @@ -1505,6 +1513,8 @@ (define_insn_and_split "*movdi_aarch64"
>>
On 22/07/2024 11:07, Alex Coplan wrote:
> Hi Claudio,
>
> I've left a couple of small comments below.
>
> On 22/07/2024 09:30, Claudio Bantaloukas wrote:
>>
>> Unlike most system registers, fpmr can be heavily written to in code that
>> exercises the fp
On 22/07/2024 10:45, Kyrylo Tkachov wrote:
> Hi Claudio,
> Thanks for working on this.
>
>> On 22 Jul 2024, at 11:30, Claudio Bantaloukas
>> wrote:
>>
>> This introduces the r
On 22/07/2024 10:51, Kyrylo Tkachov wrote:
> Hi Claudio,
>
>> On 22 Jul 2024, at 11:30, Claudio Bantaloukas
>> wrote:
>>
>> Unlike most system registers, fpmr can be heav
The ACLE declares several helper types and functions to
facilitate construction of `fpm` arguments.
gcc/ChangeLog:
* config/aarch64/arm_acle.h (fpm_t): New type representing fpmr values.
(enum __ARM_FPM_FORMAT): New enum representing valid fp8 formats.
(enum __ARM_FPM_OVE
Unlike most system registers, fpmr can be heavily written to in code that
exercises the fp8 functionality. That is because every fp8 intrinsic call
can potentially change the value of fpmr.
Rather than just use an unspec, we treat the fpmr system register like
all other registers and use a move
This introduces the relevant flags to enable access to the fpmr register and
fp8 intrinsics, which will be added subsequently.
The `+fp8' -march modifier defines the __ARM_FEATURE_FP8 macro to 1.
gcc/ChangeLog:
* config/aarch64/aarch64-c.cc (__ARM_FEATURE_FP8): New.
* config/aa
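As a small sketch of how code might test for the macro named in this version
of the patch, the C fragment below reports whether __ARM_FEATURE_FP8 is
defined to a non-zero value at compile time. The function name
fp8_macro_enabled is hypothetical. The fragment assumes only what the
description above states, namely that an -march string containing +fp8 defines
the macro to 1; on any other target or option set it simply returns 0.

```c
/* Hypothetical helper: 1 if this translation unit was compiled with
   __ARM_FEATURE_FP8 defined to a non-zero value, 0 otherwise.  */
int fp8_macro_enabled(void)
{
#if defined(__ARM_FEATURE_FP8) && __ARM_FEATURE_FP8
    return 1;
#else
    return 0;
#endif
}
```

Guarding on the macro rather than on the compiler version keeps code that
depends on fp8 support compilable when the extension is not enabled.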
eature_flags to 128 bits" be applied.
Is this ok for master? I do not have merge permissions. Can someone merge this
for me please?
Thanks,
Claudio Bantaloukas
Claudio Bantaloukas (3):
aarch64: Add march flags for +fp8 arch extensions
aarch64: Add support for moving fpm system registe