[PATCH] Fix command flags for SVE2 faminmax

2025-01-23 Thread saurabh.jha
Earlier, we were gating SVE2 faminmax behind sve+faminmax. This was incorrect and this patch changes it so that it is gated behind sve2+faminmax. gcc/ChangeLog: * config/aarch64/aarch64-sve2.md: (*aarch64_pred_faminmax_fused): Fix to use the correct flags. * config/aarch6
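To illustrate why the gating matters, here is a minimal sketch of a feature check done with a bitmask. The names `HYP_SVE`, `HYP_SVE2`, `HYP_FAMINMAX`, and `pattern_enabled` are hypothetical and are not GCC's actual flag definitions. A pattern gated on sve+faminmax would be enabled on a target that lacks sve2, which is the situation this patch corrects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits for illustration only; these are not
   GCC's real feature-flag definitions.  */
enum {
    HYP_SVE      = 1u << 0,
    HYP_SVE2     = 1u << 1,
    HYP_FAMINMAX = 1u << 2
};

/* Return true only when every required feature bit is enabled.  */
bool pattern_enabled(uint32_t enabled, uint32_t required)
{
    assert(required != 0);
    return (enabled & required) == required;
}
```

With `enabled = HYP_SVE | HYP_FAMINMAX`, a check against `HYP_SVE | HYP_FAMINMAX` passes, while a check against `HYP_SVE2 | HYP_FAMINMAX` fails, as intended for an SVE2-only pattern.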

[PATCH v6] AArch64: Add LUTI ACLE for SVE2

2025-01-21 Thread saurabh.jha
This patch introduces support for LUTI2/LUTI4 ACLE for SVE2. LUTI instructions are used for efficient table lookups with 2-bit or 4-bit indices. LUTI2 reads indexed 8-bit or 16-bit elements from the low 128 bits of the table vector using packed 2-bit indices, while LUTI4 can read from the low 128

[PATCH v5] AArch64: Add LUTI ACLE for SVE2

2025-01-20 Thread saurabh.jha
This patch introduces support for LUTI2/LUTI4 ACLE for SVE2. LUTI instructions are used for efficient table lookups with 2-bit or 4-bit indices. LUTI2 reads indexed 8-bit or 16-bit elements from the low 128 bits of the table vector using packed 2-bit indices, while LUTI4 can read from the low 128

[PATCH v4] AArch64: Add LUTI ACLE for SVE2

2025-01-15 Thread saurabh.jha
This patch introduces support for LUTI2/LUTI4 ACLE for SVE2. LUTI instructions are used for efficient table lookups with 2-bit or 4-bit indices. LUTI2 reads indexed 8-bit or 16-bit elements from the low 128 bits of the table vector using packed 2-bit indices, while LUTI4 can read from the low 128

[PATCH v3] AArch64: Add LUTI ACLE for SVE2

2025-01-08 Thread saurabh.jha
This patch introduces support for LUTI2/LUTI4 ACLE for SVE2. LUTI instructions are used for efficient table lookups with 2-bit or 4-bit indices. LUTI2 reads indexed 8-bit or 16-bit elements from the low 128 bits of the table vector using packed 2-bit indices, while LUTI4 can read from the low 128
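The 2-bit indexed lookup described in the LUTI patches can be sketched as a scalar reference model. This is a simplified illustration, not the ACLE intrinsic or the instruction's exact behaviour: the function name `luti2_u8_model` is hypothetical, the index bits are assumed to be packed least-significant first, and the segment selection and vector-length details of the real instruction are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a table lookup with packed 2-bit
   indices.  A 2-bit index can address four table entries.
   Assumption: indices are packed four per byte, least-significant
   bits first.  */
void luti2_u8_model(uint8_t *out, size_t n_out,
                    const uint8_t table[4],
                    const uint8_t *packed_idx)
{
    assert(out != NULL && table != NULL && packed_idx != NULL);
    for (size_t i = 0; i < n_out; i++) {
        /* Select the i-th 2-bit field from the packed index stream.  */
        unsigned shift = 2u * (unsigned)(i % 4);
        unsigned idx = (packed_idx[i / 4] >> shift) & 0x3u;
        out[i] = table[idx];
    }
}
```

For example, under the packing assumption above, the index byte `0xE4` encodes the indices 0, 1, 2, 3 in order, so the output is simply a copy of the four table entries.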

[PATCH v2 2/2] aarch64: Add support for AdvSIMD lut

2024-11-26 Thread saurabh.jha
The AArch64 FEAT_LUT extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for lookup table reads with bit indices. This patch adds support for AdvSIMD lut intrinsics. The intrinsics for this extension are implemented as the following builtin functions: * v

[PATCH v2 1/2] aarch64: Refactor AdvSIMD intrinsics

2024-11-26 Thread saurabh.jha
Refactor AdvSIMD intrinsics defined using the new pragma-based approach so that it is more extensible. Introduce a new struct, simd_type, which defines types using a mode and qualifiers, and use objects of this struct in the declaration of intrinsics in the aarch64-simd-pragma-builtins.def file.

[PATCH v2 0/2] aarch64: Add AdvSIMD lut

2024-11-26 Thread saurabh.jha
From: Saurabh Jha This patch series is a revised version of: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667692.html In the refactor patch, I redesigned how types are declared and how expand happens. I have taken some ideas from the reviews of the other patch series: https://gcc.gnu.

[PATCH v2 2/3] aarch64: Add support for fp8dot2 and fp8dot4

2024-11-14 Thread saurabh.jha
The AArch64 FEAT_FP8DOT2 and FEAT_FP8DOT4 extensions introduce instructions for the dot product of vectors. This patch introduces the following intrinsics: 1. vdot{q}_{fp16|fp32}_mf8_fpm. 2. vdot{q}_lane{q}_{fp16|fp32}_mf8_fpm. It introduces two flags: fp8dot2 and fp8dot4. We had to add space for a

[PATCH v2 1/3] aarch64: Add support for fp8 convert and scale

2024-11-14 Thread saurabh.jha
The AArch64 FEAT_FP8 extension introduces instructions for conversion and scaling. This patch introduces the following intrinsics: 1. vcvt{1|2}_{bf16|high_bf16|low_bf16}_mf8_fpm. 2. vcvt{q}_mf8_f16_fpm. 3. vcvt_{high}_mf8_f32_fpm. 4. vscale{q}_{f16|f32|f64}. We introduced two aarch64_builtin_sig

[PATCH v2 3/3] aarch64: Add support for fp8fma instructions

2024-11-14 Thread saurabh.jha
The AArch64 FEAT_FP8FMA extension introduces instructions for multiply-add of vectors. This patch introduces the following instructions: 1. {vmlalbq|vmlaltq}_f16_mf8_fpm. 2. {vmlalbq|vmlaltq}_lane{q}_f16_mf8_fpm. 3. {vmlallbbq|vmlallbtq|vmlalltbq|vmlallttq}_f32_mf8_fpm. 4. {vmlallbbq|vmlallbtq|vm

[PATCH v2 0/3] aarch64: Add fp8, fp8dot2, fp8dot4, and fp8fma acle

2024-11-14 Thread saurabh.jha
From: Saurabh Jha This patch series is a revised version of: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667723.html I have addressed comments around building a list of operands while declaring types and while expanding to RTL. I have also removed signatures with "_fpm" and "_lane" a

[PATCH 3/3] aarch64: Add support for fp8fma instructions

2024-11-06 Thread saurabh.jha
The AArch64 FEAT_FP8FMA extension introduces instructions for multiply-add of vectors. This patch introduces the following instructions: 1. {vmlalbq|vmlaltq}_f16_mf8_fpm. 2. {vmlalbq|vmlaltq}_lane{q}_f16_mf8_fpm. 3. {vmlallbbq|vmlallbtq|vmlalltbq|vmlallttq}_f32_mf8_fpm. 4. {vmlallbbq|vmlallbtq|vm

[PATCH 2/3] aarch64: Add support for fp8dot2 and fp8dot4

2024-11-06 Thread saurabh.jha
The AArch64 FEAT_FP8DOT2 and FEAT_FP8DOT4 extensions introduce instructions for the dot product of vectors. This patch introduces the following intrinsics: 1. vdot{q}_{fp16|fp32}_mf8_fpm. 2. vdot{q}_lane{q}_{fp16|fp32}_mf8_fpm. It introduces two flags: fp8dot2 and fp8dot4. We had to add space for a

[PATCH 1/3] aarch64: Add support for fp8 convert and scale

2024-11-06 Thread saurabh.jha
The AArch64 FEAT_FP8 extension introduces instructions for conversion and scaling. This patch introduces the following intrinsics: 1. vcvt{1|2}_{bf16|high_bf16|low_bf16}_mf8_fpm. 2. vcvt{q}_mf8_f16_fpm. 3. vcvt_{high}_mf8_f32_fpm. 4. vscale{q}_{f16|f32|f64}. We introduced three new aarch64_built

[PATCH 0/3] aarch64: Add fp8, fp8dot2, fp8dot4, and fp8fma acle

2024-11-06 Thread saurabh.jha
From: Saurabh Jha This patch series has three patches for adding support for fp8, fp8dot2, and fp8dot4 acle AdvSIMD intrinsics. The specific points I would like feedback on are written in each commit message after the "---", which will be omitted when committing. Regression tested on aarch64-unknow

[PATCH v6 1/2] aarch64: Add SVE2 faminmax intrinsics

2024-10-10 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension are implemented as the following builtin functions: * sva[max

[PATCH v6 2/2] aarch64: Add codegen support for SVE2 faminmax

2024-10-10 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_

[PATCH v6 0/2] Add support for SVE2 faminmax

2024-10-10 Thread saurabh.jha
From: Saurabh Jha This patch series is a revised version of: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664912.html In particular, the only changes are in the first patch, where, in the intrinsics test cases, we removed unnecessary regular-expression captures of operands. The sec

[PATCH v5 2/2] aarch64: Add codegen support for SVE2 faminmax

2024-10-09 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_

[PATCH v5 0/2] Add support for SVE2 faminmax

2024-10-09 Thread saurabh.jha
From: Saurabh Jha This patch series is a revised version of: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664391.html Previous review comments are in this thread: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664329.html The second patch of this is okay to merge. The changes

[PATCH v5 1/2] aarch64: Add SVE2 faminmax intrinsics

2024-10-09 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension are implemented as the following builtin functions: * sva[max

[PATCH] aarch64: Fix bug with max/min (PR116934)

2024-10-04 Thread saurabh.jha
In ac4cdf5cb43c0b09e81760e2a1902ceebcf1a135, I introduced a bug where I put the new unspecs, UNSPEC_COND_SMAX and UNSPEC_COND_SMIN, into the wrong iterator. I should have put the new unspecs in SVE_COND_FP_MAXMIN but I put them in SVE_COND_FP_BINARY_REG instead. That was incorrect because the SVE_COND_

[PATCH v4 1/2] aarch64: Add SVE2 faminmax intrinsics

2024-10-03 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension are implemented as the following builtin functions: * sva[max

[PATCH v4 2/2] aarch64: Add codegen support for SVE2 faminmax

2024-10-03 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_

[PATCH v4 0/2] Add support for SVE2 faminmax

2024-10-03 Thread saurabh.jha
From: Saurabh Jha This is a revised version of this patch series: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664329.html Unfortunately, I had test case failures which I missed but shouldn't have. Apologies for that. This version fixes the failing test cases in the second patch with

[PATCH v3 1/2] aarch64: Add SVE2 faminmax intrinsics

2024-10-02 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension are implemented as the following builtin functions: * sva[max

[PATCH v3 2/2] aarch64: Add codegen support for SVE2 faminmax

2024-10-02 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_

[PATCH v3 0/2] Add support for SVE2 faminmax

2024-10-02 Thread saurabh.jha
From: Saurabh Jha This patch series is a revised version of: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664209.html The second commit of the previous patch series was reviewed and has been committed separately. This series contains the first and third commits of the previous series. The cha

[PATCH v2 2/3] aarch64: Introduce new unspecs for smax/smin

2024-10-01 Thread saurabh.jha
Introduce two new unspecs, UNSPEC_COND_SMAX and UNSPEC_COND_SMIN, corresponding to rtl operators smax and smin. UNSPEC_COND_SMAX is used to generate fmaxnm instruction and UNSPEC_COND_SMIN is used to generate fminnm instruction. With these new unspecs, we can generate SVE2 max/min instructions us

[PATCH v2 3/3] aarch64: Add codegen support for SVE2 faminmax

2024-10-01 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_

[PATCH v2 1/3] aarch64: Add SVE2 faminmax intrinsics

2024-10-01 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension are implemented as the following builtin functions: * sva[max

[PATCH v2 0/3] Add support for SVE2 faminmax

2024-10-01 Thread saurabh.jha
From: Saurabh Jha This patch series is a revised version of an earlier patch series: https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662951.html. The main change in this patch series is the introduction of the two unspecs, UNSPEC_COND_SMAX and UNSPEC_COND_SMIN, and using them for existing

[PATCH] [MAINTAINERS] Fix myself in order and add username

2024-09-23 Thread saurabh.jha
ChangeLog: * MAINTAINERS: Fix sort order and add username. --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 0ea4db20f88..3b4cf9d20d8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -554,10 +554,10 @@ Sam James

[PATCH] [MAINTAINERS] Add myself to write after approval

2024-09-23 Thread saurabh.jha
ChangeLog: * MAINTAINERS: Add myself to write after approval. --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index e9fafaf45a7..0ea4db20f88 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -557,6 +557,7 @@ Andrew Jenner andrew

[PATCH v10 2/2] aarch64: Add codegen support for AdvSIMD faminmax

2024-09-18 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation support for famax and famin in terms of existing R

[PATCH v10 1/2] aarch64: Add AdvSIMD faminmax intrinsics

2024-09-18 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces AdvSIMD faminmax intrinsics. The intrinsics of this extensio

[PATCH v10 0/2] Add support for AdvSIMD faminmax

2024-09-18 Thread saurabh.jha
From: Saurabh Jha This is a revised version of this patch series: https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663204.html The only change in both patches of this series is a fix to the directives in the test cases: replace /* { dg-do assemble} */ with /* { dg-do compile } */. We need comp

[PATCH v9 2/2] aarch64: Add codegen support for AdvSIMD faminmax

2024-09-18 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation support for famax and famin in terms of existing R

[PATCH v9 1/2] aarch64: Add AdvSIMD faminmax intrinsics

2024-09-18 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces AdvSIMD faminmax intrinsics. The intrinsics of this extensio

[PATCH v9 0/2] Add support for AdvSIMD faminmax

2024-09-18 Thread saurabh.jha
From: Saurabh Jha This is a revised version of this patch series: https://gcc.gnu.org/pipermail/gcc-patches/2024-September/thread.html Thanks for the review comments. They are all addressed in this version. The changes are as follows. 1. [intrinsics patch] Using enum class for aarch64_builtin_si

[PATCH 2/2] aarch64: Add codegen support for SVE2 faminmax

2024-09-13 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs.

[PATCH 1/2] aarch64: Add SVE2 faminmax intrinsics

2024-09-13 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension a

[PATCH 0/2] aarch64: Add support for SVE2 faminmax

2024-09-13 Thread saurabh.jha
From: Saurabh Jha This patch series adds support for SVE2 faminmax. It should be merged only after AdvSIMD faminmax patch series is merged: https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662131.html The first patch adds intrinsics and the second patch adds support for combining FMAX/FM

[PATCH v8 2/2] aarch64: Add codegen support for AdvSIMD faminmax

2024-09-03 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation support for famax and famin in terms of existing R

[PATCH v8 1/2] aarch64: Add AdvSIMD faminmax intrinsics

2024-09-03 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces AdvSIMD faminmax intrinsics. The intrinsics of this extensio

[PATCH v8 0/2] aarch64: Add support for AdvSIMD faminmax.

2024-09-03 Thread saurabh.jha
From: Saurabh Jha This series is a revised version of: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661860.html. The first patch of the series is updated to address these comments: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661866.html All comments are addressed exactly as s

[PATCH v7 2/2] aarch64: Add codegen support for AdvSIMD faminmax

2024-08-30 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation support for famax and famin in terms of existing R

[PATCH v7 1/2] aarch64: Add AdvSIMD faminmax intrinsics

2024-08-30 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces AdvSIMD faminmax intrinsics. The intrinsics of this extensio

[PATCH v7 0/2] aarch64: Add support for AdvSIMD faminmax

2024-08-30 Thread saurabh.jha
From: Saurabh Jha This patch series is a respin of https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661757.html. The major refactorings suggested in the reviews of the previous version will be done separately to keep the scope of this series small. I'll create a new series for that refactoring.

[PATCH v6 2/2] aarch64: Add codegen support for AdvSIMD faminmax

2024-08-29 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation support for famax and famin in terms of existing R

[PATCH v6 1/2] aarch64: Add AdvSIMD faminmax intrinsics

2024-08-29 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces AdvSIMD faminmax intrinsics. The intrinsics of this extensio

[PATCH v6 0/2] aarch64: Add support for AdvSIMD faminmax

2024-08-29 Thread saurabh.jha
From: Saurabh Jha This patch series is a respin of https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661672.html. The new version addresses the comment about using AARCH64_PRAGMA_BUILTIN_START and AARCH64_PRAGMA_BUILTIN_END in the aarch64_builtins enum. Apart from the function expand_pragma_builtin

[PATCH v5 1/2] aarch64: Add AdvSIMD faminmax intrinsics

2024-08-28 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces AdvSIMD faminmax intrinsics. The intrinsics of this extensio

[PATCH v5 2/2] aarch64: Add codegen support for AdvSIMD faminmax

2024-08-28 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation support for famax and famin in terms of existing R

[PATCH v5 0/2] aarch64: Add support for AdvSIMD faminmax

2024-08-28 Thread saurabh.jha
From: Saurabh Jha This patch series is a respin of the previous patch here: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660917.html The new version addresses review comments on the previous patch series. It also introduces a new way of defining AArch64 AdvSIMD intrinsics. All of the new

[PATCH v4 2/2] aarch64: Add codegen support for AdvSIMD faminmax

2024-08-20 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation support for famax and famin in terms of existing R

[PATCH v4 1/2] aarch64: Add AdvSIMD faminmax intrinsics

2024-08-20 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces AdvSIMD faminmax intrinsics. The intrinsics of this extensio

[PATCH v4 0/2] Add support for AdvSIMD faminmax

2024-08-20 Thread saurabh.jha
From: Saurabh Jha This patch series is a respin of the previous patch here: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659749.html. This new version is rebased with latest master after the merging of this patch series: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660532.html.

[PATCH v3 1/2] aarch64: Add AdvSIMD faminmax intrinsics

2024-08-07 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch does two things: 1. Introduces AdvSIMD faminmax intrinsics. 2. Move rep

[PATCH v3 2/2] aarch64: Add codegen support for AdvSIMD faminmax

2024-08-07 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation support for famax and famin in terms of existing R

[PATCH v3 0/2] Add support for AdvSIMD faminmax

2024-08-07 Thread saurabh.jha
From: Saurabh Jha This patch series is a respin of a previous patch here: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/658984.html The AArch64 FEAT_FAMINMAX is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maxi

[PATCH v2] aarch64: Add support for AdvSIMD faminmax

2024-08-01 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch does three things: 1. Introduces AdvSIMD faminmax intrinsics. 2. Adds c

[PATCH] aarch64: Add support for AdvSIMD faminmax

2024-08-01 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch does three things: 1. Introduces AdvSIMD faminmax intrinsics. 2. Adds c

[PATCH] aarch64: Add ACLE intrinsics for AdvSIMD faminmax

2024-07-22 Thread saurabh.jha
The AArch64 FEAT_FAMINMAX extension is optional in Armv9.2 and mandatory in Armv9.5. It introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces intrinsics for AdvSIMD faminmax extension in the form of the followi
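The element-wise absolute maximum and minimum that the faminmax patches in this list describe can be sketched as a scalar reference model. This is only an illustration of the arithmetic: the names `famax_model` and `famin_model` are hypothetical and are not the ACLE intrinsics, and the handling of NaN and signed zero is deliberately ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without libm; signed zero and NaN are not treated
   specially in this sketch.  */
static float abs_model(float x)
{
    return x < 0.0f ? -x : x;
}

/* Hypothetical reference model: out[i] = max(|a[i]|, |b[i]|).  */
void famax_model(float *out, const float *a, const float *b, size_t n)
{
    assert(out != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        float x = abs_model(a[i]);
        float y = abs_model(b[i]);
        out[i] = x > y ? x : y;
    }
}

/* Hypothetical reference model: out[i] = min(|a[i]|, |b[i]|).  */
void famin_model(float *out, const float *a, const float *b, size_t n)
{
    assert(out != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        float x = abs_model(a[i]);
        float y = abs_model(b[i]);
        out[i] = x < y ? x : y;
    }
}
```

For inputs `a = {-3, 1, -0.5}` and `b = {2, -4, 0.25}`, the absolute maximum is `{3, 4, 0.5}` and the absolute minimum is `{2, 1, 0.25}`.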