Re: [PATCH] vect: Relax scan-tree-dump strict pattern matching [PR118597]

2025-04-03 Thread Victor Do Nascimento
Cheers, And given how the test also started failing for GCC14, are we also okay to go ahead with backporting the patch? Victor On 4/2/25 17:39, Jeff Law wrote: On 4/2/25 8:53 AM, Victor Do Nascimento wrote: Using specific SSA names in pattern matching in `dg-final' makes tests "

[PATCH] vect: Relax scan-tree-dump strict pattern matching [PR118597]

2025-04-02 Thread Victor Do Nascimento
Using specific SSA names in pattern matching in `dg-final' makes tests "unstable", in that changes in passes prior to the pass whose dump is analyzed in the particular test may change the numbering of the SSA variables, causing the test to start failing spuriously. We thus switch from specific SSA
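
To illustrate the fragility described in this snippet, here is a minimal sketch. It is not taken from the patch. The dump lines and both patterns are hypothetical, and POSIX extended regexes stand in for the Tcl regexes that `dg-final' actually uses. The strict pattern hard-codes SSA version numbers, while the relaxed one accepts any numbering.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical patterns, for illustration only.  The strict one pins the
   SSA version numbers; the relaxed one accepts any version numbers.  */
static const char strict_pattern[] = "_7 = _3 \\+ _5;";
static const char relaxed_pattern[] = "_[0-9]+ = _[0-9]+ \\+ _[0-9]+;";

/* Return true if the POSIX extended regex PATTERN matches anywhere in
   LINE.  A pattern that fails to compile is treated as "no match".  */
bool dump_line_matches(const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool matched = regexec(&re, line, 0, NULL, 0) == 0;
  regfree(&re);
  return matched;
}
```

If an earlier pass renumbers the statement from `_7 = _3 + _5;' to `_9 = _4 + _6;', the strict pattern stops matching even though the transformation under test is unchanged, whereas the relaxed pattern still matches.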

Re: [PATCH v2 2/4] vect: disable multiple calls of poly simdclones

2024-11-19 Thread Victor Do Nascimento
Richard Biener writes: > On Mon, 18 Nov 2024, Victor Do Nascimento wrote: > >> On 11/5/24 07:39, Richard Biener wrote: >> > On Tue, 5 Nov 2024, Victor Do Nascimento wrote: >> > >> >> The current codegen code to support VF's that are multiples of

Re: [PATCH v2 2/4] vect: disable multiple calls of poly simdclones

2024-11-18 Thread Victor Do Nascimento
On 11/5/24 07:39, Richard Biener wrote: On Tue, 5 Nov 2024, Victor Do Nascimento wrote: The current codegen code to support VF's that are multiples of a simdclone simdlen rely on BIT_FIELD_REF to create multiple input vectors. This does not work for non-constant simdclones, so we s

Re: [PATCH v2 2/4] vect: disable multiple calls of poly simdclones

2024-11-16 Thread Victor Do Nascimento
On 11/5/24 07:39, Richard Biener wrote: On Tue, 5 Nov 2024, Victor Do Nascimento wrote: The current codegen code to support VF's that are multiples of a simdclone simdlen rely on BIT_FIELD_REF to create multiple input vectors. This does not work for non-constant simdclones, so we s

Re: [PATCH v2 4/4] vect: Disable `omp declare variant' tests for aarch64

2024-11-04 Thread Victor Do Nascimento
cc'ing Jakub due to email address typo in original patch submission. Apologies, Victor Victor Do Nascimento writes: > gcc/testsuite/ChangeLog: > > * c-c++-common/gomp/declare-variant-14.c: Make i?86 and x86_64 target > only test. > * gfortran.dg/gomp/de

[PATCH v2 4/4] vect: Disable `omp declare variant' tests for aarch64

2024-11-04 Thread Victor Do Nascimento
gcc/testsuite/ChangeLog: * c-c++-common/gomp/declare-variant-14.c: Make i?86 and x86_64 target only test. * gfortran.dg/gomp/declare-variant-14.f90: Likewise. --- gcc/testsuite/c-c++-common/gomp/declare-variant-14.c | 12 +--- .../gfortran.dg/gomp/declare-variant-1

[PATCH v2 3/4] aarch64: Add SVE support for simd clones [PR 96342]

2024-11-04 Thread Victor Do Nascimento
This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum amount of elements in a length agnostic vector. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (add_sve_type_a

[PATCH v2 2/4] vect: disable multiple calls of poly simdclones

2024-11-04 Thread Victor Do Nascimento
The current codegen code to support VF's that are multiples of a simdclone simdlen rely on BIT_FIELD_REF to create multiple input vectors. This does not work for non-constant simdclones, so we should disable using such clones when the VF is a multiple of the non-constant simdlen until we change th

[PATCH v2 1/4] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-11-04 Thread Victor Do Nascimento
This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the target can reject a simd_clone based on the vector mode it is using. This is needed because for VLS SVE vectorization the vectorizer accepts Advanced SIMD simd clones when vectorizing using SVE types because the simdlens mig

[PATCH v2 1/4] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-11-04 Thread Victor Do Nascimento
This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the target can reject a simd_clone based on the vector mode it is using. This is needed because for VLS SVE vectorization the vectorizer accepts Advanced SIMD simd clones when vectorizing using SVE types because the simdlens mig

[PATCH v2 0/4] aarch64, vect: Extend simdclone support to vector-length agnostic SVE

2024-11-04 Thread Victor Do Nascimento
Following a few bugfixes in the if-convert pass which were previously found to degrade performance in the proposed SVE libmvec autovectorization of conditional calls to math functions, this patch-series carries on the work initially presented by Andre Vieira and last discussed at: - https://patch

[PATCH] aarch64: Extend support for the AE family of Cortex CPUs

2024-10-31 Thread Victor Do Nascimento
Implement -mcpu options for: - Cortex-A520AE - Cortex-A720AE - Cortex-R82AE These all implement the same feature sets as their non-AE counterparts, using the same scheduler and costs and differing only in their respective part numbers. gcc/ChangeLog: * config/aarch64/aarch64-cores

Re: [PATCH 3/3] aarch64: Add SVE support for simd clones [PR 96342]

2024-10-23 Thread Victor Do Nascimento
On 2/1/24 21:59, Richard Sandiford wrote: Andre Vieira writes: This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum amount of elements in a length agnostic vector. gcc/Ch

Re: [PATCH] AArch64: Remove redundant check in aarch64_simd_mov

2024-10-17 Thread Victor Do Nascimento
FWIW, I definitely agree about the spuriousness of the V2DI mode check. While I can't approve, I can confirm it looks good. Thanks, Victor. On 10/17/24 16:10, Wilco Dijkstra wrote: The split condition in aarch64_simd_mov uses aarch64_simd_special_constant_p. While doing the split, it checks

[PATCH v2] middle-end: [PR middle-end/116926] Allow widening optabs for vec-mode -> scalar-mode

2024-10-11 Thread Victor Do Nascimento
The recent refactoring of the dot_prod optab to convert-type exposed a limitation in how `find_widening_optab_handler_and_mode' is currently implemented, owing to the fact that, while the function expects the GET_MODE_CLASS (from_mode) == GET_MODE_CLASS (to_mode) condition to hold, the c6x back

Re: [PATCH] middle-end: [PR middle-end/116926] Allow widening optabs for vec-mode -> scalar-mode

2024-10-11 Thread Victor Do Nascimento
On 10/11/24 08:28, Richard Biener wrote: On Thu, Oct 10, 2024 at 5:25 PM Victor Do Nascimento wrote: The recent refactoring of the dot_prod optab to convert-type exposed a limitation in how `find_widening_optab_handler_and_mode' is currently implemented, owing to the fact that, whil

[PATCH] middle-end: [PR middle-end/116926] Allow widening optabs for vec-mode -> scalar-mode

2024-10-10 Thread Victor Do Nascimento
The recent refactoring of the dot_prod optab to convert-type exposed a limitation in how `find_widening_optab_handler_and_mode' is currently implemented, owing to the fact that, while the function expects the GET_MODE_CLASS (from_mode) == GET_MODE_CLASS (to_mode) condition to hold, the c6x back

Re: [PATCH] middle-end: reorder masking priority of math functions

2024-10-07 Thread Victor Do Nascimento
On 10/7/24 10:52, Richard Biener wrote: On Wed, Oct 2, 2024 at 6:26 PM Victor Do Nascimento wrote: Given the categorization of math built-in functions as `ECF_CONST', when if-converting their uses, their calls are not masked and are thus called with an all-true predicate. This, howeve

Re: [PATCH] middle-end: reorder masking priority of math functions

2024-10-04 Thread Victor Do Nascimento
On 10/4/24 09:32, Tamar Christina wrote: Hi Victor, -Original Message- From: Victor Do Nascimento Sent: Wednesday, October 2, 2024 5:26 PM To: gcc-patches@gcc.gnu.org Cc: Tamar Christina ; richard.guent...@gmail.com; Victor Do Nascimento Subject: [PATCH] middle-end: reorder masking

[PATCH] middle-end: reorder masking priority of math functions

2024-10-02 Thread Victor Do Nascimento
Given the categorization of math built-in functions as `ECF_CONST', when if-converting their uses, their calls are not masked and are thus called with an all-true predicate. This, however, is not appropriate where built-ins have library equivalents, wherein they may exhibit highly architecture-spe

Re: [PATCH] middle-end: Fix ifcvt predicate generation for masked function calls

2024-10-02 Thread Victor Do Nascimento
On 10/1/24 13:10, Richard Biener wrote: On Mon, Sep 30, 2024 at 8:40 PM Tamar Christina wrote: Hi Victor, Thanks! This looks good to me with one minor comment: -Original Message- From: Victor Do Nascimento Sent: Monday, September 30, 2024 2:34 PM To: gcc-patches@gcc.gnu.org Cc

[PATCH] middle-end: Fix ifcvt predicate generation for masked function calls

2024-09-30 Thread Victor Do Nascimento
Up until now, due to a latent bug in the code for the ifcvt pass, irrespective of the branch taken in a conditional statement, the original condition for the if statement was used in masking the function call. Thus, for code such as: if (a[i] > limit) b[i] = fixed_const; else b[i] = f
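
The example in this snippet is cut off, so the following is only a scalar sketch of the masking rule it describes, not the patch's code. The callee `hypothetical_callee' and the function name are assumptions introduced for illustration. The point is that a call in the else branch has to be guarded by the negated condition rather than by the original one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for whatever function the else branch calls.  */
static int hypothetical_callee(int x)
{
  return 2 * x;
}

/* Scalar model of an if-converted loop: each "lane" computes its
   predicate, and the call is executed only where its mask is set.  */
void if_converted_loop(const int *a, int *b, size_t n,
                       int limit, int fixed_const)
{
  for (size_t i = 0; i < n; i++)
    {
      bool cond = a[i] > limit;
      /* The call lives in the else branch, so its mask is !cond.
         Reusing cond here would be the kind of bug described above.  */
      bool call_mask = !cond;
      int call_result = call_mask ? hypothetical_callee(a[i]) : 0;
      b[i] = cond ? fixed_const : call_result;
    }
}
```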

[PING][PATCH V4 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-09-26 Thread Victor Do Nascimento
Hello, Gentle reminder for this simple renaming update in response to the feedback from the last iteration. 🙂 Thanks, Victor On 9/5/24 12:05, Victor Do Nascimento wrote: Changes from previous revision: Rename new `check_effective_target' and tests to make their intent clearer.

[PING][PATCH V4 04/10] arm: Fix arm backend-use of (u|s|us)dot_prod patterns

2024-09-26 Thread Victor Do Nascimento
Hello, Gentle reminder for this patch 🙂 Thanks, Victor On 9/5/24 11:59, Victor Do Nascimento wrote: Changes from previous revision: As was done for the equivalent aarch64 patch, we rework this patch to do away with mission creep, keeping changes as simple as possible. We thus remove the

[PATCH V4 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-09-05 Thread Victor Do Nascimento
Changes from previous revision: Rename new `check_effective_target' and tests to make their intent clearer. * lib/target-supports.exp: For new `check_effective_target', s/vect_dotprod_twoway/vect_dotprod_hisi/. * One test is renamed to `vect-dotprod-conv-optab.c' to emphasize aim of c

[PATCH V4 04/10] arm: Fix arm backend-use of (u|s|us)dot_prod patterns

2024-09-05 Thread Victor Do Nascimento
Changes from previous revision: As was done for the equivalent aarch64 patch, we rework this patch to do away with mission creep, keeping changes as simple as possible. We thus remove the `gimple_fold_builtin' changes that would have replaced the dot-product builtin calls with DOT_PROD_EXPRs a

[PING] [PATCH V3 09/10] c6x: Adjust dot-product backend patterns

2024-08-28 Thread Victor Do Nascimento
Hello, Gentle reminder for this simple renaming patch :) Thanks, Victor On 8/15/24 09:44, Victor Do Nascimento wrote: Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names

[PING] [PATCH V3 07/10] mips: Adjust dot-product backend patterns

2024-08-28 Thread Victor Do Nascimento
Hello, Gentle reminder for this simple renaming patch :) Thanks, Victor On 8/15/24 09:44, Victor Do Nascimento wrote: Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names

[PING] [PATCH V3 06/10] arc: Adjust dot-product backend patterns

2024-08-28 Thread Victor Do Nascimento
Hello, Gentle reminder for this simple renaming patch :) Thanks, Victor On 8/15/24 09:44, Victor Do Nascimento wrote: Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names

Re: [PATCH V2 03/10] aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns

2024-08-15 Thread Victor Do Nascimento
On 8/15/24 09:26, Richard Sandiford wrote: Victor Do Nascimento writes: Given recent changes to the dot_prod standard pattern name, this patch fixes the aarch64 back-end by implementing the following changes: 1. Add 2nd mode to all (u|s|us)dot_prod patterns in .md files. 2. Rewrite

[PATCH V3 08/10] rs6000: Adjust altivec dot-product backend patterns

2024-08-15 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/rs6000/altivec.md (udot_prod): Renamed to... (udot_prodv4si): ...this. (sdot

[PATCH V3 06/10] arc: Adjust dot-product backend patterns

2024-08-15 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/arc/simdext.md (sdot_prodv2hi): Renamed to... (sdot_prodsiv2hi): ...this. (u

[PATCH V3 07/10] mips: Adjust dot-product backend patterns

2024-08-15 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/mips/loongson-mmi.md (sdot_prodv4hi): Renamed to... (sdot_prodv2siv4hi): ...this. --

[PATCH V3 09/10] c6x: Adjust dot-product backend patterns

2024-08-15 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/c6x/c6x.md (sdot_prodv2hi): Renamed to... (sdot_prodsiv2hi): ...this. --- gcc/confi

[PATCH V3 03/10] aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns

2024-08-15 Thread Victor Do Nascimento
Given recent changes to the dot_prod standard pattern name, this patch fixes the aarch64 back-end by implementing the following changes: 1. Add 2nd mode to all (u|s|us)dot_prod patterns in .md files. 2. Rewrite initialization and function expansion mechanism for simd builtins. 3. Fix all direct ca

[PATCH V3 02/10] autovectorizer: Add basic support for convert optabs

2024-08-15 Thread Victor Do Nascimento
Given the shift from modeling dot products as direct optabs to treating them as conversion optabs, we make necessary changes to the autovectorizer code to ensure that given the relevant tree code, together with the input and output data modes, we can retrieve the relevant optab and subsequently the

[PATCH V3 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-08-15 Thread Victor Do Nascimento
From: Victor Do Nascimento Given the novel treatment of the dot product optab as a conversion, we are now able to target different relationships between output modes and input modes. This is made clearer by way of example. Previously, on AArch64, the following loop was vectorizable: uint32_t

[PATCH V3 04/10] arm: Fix arm backend-use of (u|s|us)dot_prod patterns

2024-08-15 Thread Victor Do Nascimento
gcc/ChangeLog: * config/arm/arm-builtins.cc (enum arm_builtins): Add new ARM_BUILTIN_* enum values: SDOTV8QI, SDOTV16QI, UDOTV8QI, UDOTV16QI, USDOTV8QI, USDOTV16QI. (arm_init_dotprod_builtins): New. (arm_init_builtins): Add call to `arm_init_dotprod_builtins

[PATCH V3 05/10] i386: Fix dot_prod backend patterns for mmx and sse targets

2024-08-15 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/i386/mmx.md (usdot_prodv8qi): Renamed to... (usdot_prodv2siv8qi): ...this. (

[PATCH V3 01/10] optabs: Make all `*dot_prod_optab's modeled as conversions

2024-08-15 Thread Victor Do Nascimento
Given the specification in the GCC internals manual defines the {u|s}dot_prod standard name as taking "two signed elements of the same mode, adding them to a third operand of wider mode", there is currently ambiguity in the relationship between the mode of the first two arguments and that of the th
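
As a rough illustration of the relationship between input and accumulator modes that this series makes explicit, here are two scalar dot-product loops. They are sketches only and are not the tests from the series; the function names are made up, and which mode pairs a given back end actually implements is not shown here. Both accumulate into 32 bits but read elements of different widths.

```c
#include <stddef.h>
#include <stdint.h>

/* 8-bit inputs widened into a 32-bit accumulator.  */
uint32_t udot_u8_to_u32(const uint8_t *a, const uint8_t *b, size_t n)
{
  uint32_t acc = 0;
  for (size_t i = 0; i < n; i++)
    acc += (uint32_t) a[i] * b[i];
  return acc;
}

/* 16-bit inputs widened into the same 32-bit accumulator type.  */
uint32_t udot_u16_to_u32(const uint16_t *a, const uint16_t *b, size_t n)
{
  uint32_t acc = 0;
  for (size_t i = 0; i < n; i++)
    acc += (uint32_t) a[i] * b[i];
  return acc;
}
```

With a pattern name that carries only one mode, these two cases cannot be told apart by name; naming both the output and the input mode removes that ambiguity.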

[PATCH V3 00/10] optabs: Make all `*dot_prod_optab's modeled as conversions

2024-08-15 Thread Victor Do Nascimento
nd armhf. I'd appreciate help running relevant tests on the remaining architectures, i.e. arc, mips, altivec and c6x to ensure I've not inadvertently broken anything for those back-ends. Victor Do Nascimento (10): optabs: Make all `*dot_prod_optab's modeled as conversions autov

Re: [PATCH V2 02/10] autovectorizer: Add basic support for convert optabs

2024-08-14 Thread Victor Do Nascimento
On 8/14/24 13:24, Tamar Christina wrote: It seems to me that this should take a code_helper, create the vector modes and call directly_supported_p, or am I missing something? Ok. Having done some digging around in the git history, I see that `vect_supportable_direct_optab_p', upon which I ba

Re: [PATCH V2 02/10] autovectorizer: Add basic support for convert optabs

2024-08-14 Thread Victor Do Nascimento
On 8/14/24 13:24, Tamar Christina wrote: Hi Victor, -Original Message- From: Victor Do Nascimento Sent: Tuesday, August 13, 2024 1:42 PM To: gcc-patches@gcc.gnu.org Cc: Tamar Christina ; claz...@gmail.com; hongtao@intel.com; s...@gcc.gnu.org; bernds_...@t-online.de; al

[PATCH V2 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-08-13 Thread Victor Do Nascimento
From: Victor Do Nascimento Given the novel treatment of the dot product optab as a conversion, we are now able to target different relationships between output modes and input modes. This is made clearer by way of example. Previously, on AArch64, the following loop was vectorizable: uint32_t

[PATCH V2 07/10] mips: Adjust dot-product backend patterns

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/mips/loongson-mmi.md (sdot_prodv4hi): Renamed to... (sdot_prodv2siv4hi): ...this. --

[PATCH V2 08/10] rs6000: Adjust altivec dot-product backend patterns

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/rs6000/altivec.md (udot_prod): Renamed to... (udot_prodv4si): ...this. (sdot

[PATCH V2 05/10] i386: Fix dot_prod backend patterns for mmx and sse targets

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/i386/mmx.md (usdot_prodv8qi): Renamed to... (usdot_prodv2siv8qi): ...this. (

[PATCH V2 00/10] optabs: Make all `*dot_prod_optab's modeled as conversions

2024-08-13 Thread Victor Do Nascimento
the same input mode but resulting in a different output mode. Regression-tested on x86_64, aarch64 and armhf. I'd appreciate help running relevant tests on the remaining architectures, i.e. arc, mips, altivec and c6x to ensure I've not inadvertently broken anything for those back-e

[PATCH V2 04/10] arm: Fix arm backend-use of (u|s|us)dot_prod patterns

2024-08-13 Thread Victor Do Nascimento
gcc/ChangeLog: * config/arm/arm-builtins.cc (enum arm_builtins): Add new ARM_BUILTIN_* enum values: SDOTV8QI, SDOTV16QI, UDOTV8QI, UDOTV16QI, USDOTV8QI, USDOTV16QI. (arm_init_dotprod_builtins): New. (arm_init_builtins): Add call to `arm_init_dotprod_builtins

[PATCH V2 06/10] arc: Adjust dot-product backend patterns

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/arc/simdext.md (sdot_prodv2hi): Renamed to... (sdot_prodsiv2hi): ...this. (u

[PATCH V2 09/10] c6x: Adjust dot-product backend patterns

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/c6x/c6x.md (sdot_prodv2hi): Renamed to... (sdot_prodsiv2hi): ...this. --- gcc/confi

[PATCH V2 03/10] aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns

2024-08-13 Thread Victor Do Nascimento
Given recent changes to the dot_prod standard pattern name, this patch fixes the aarch64 back-end by implementing the following changes: 1. Add 2nd mode to all (u|s|us)dot_prod patterns in .md files. 2. Rewrite initialization and function expansion mechanism for simd builtins. 3. Fix all direct ca

[PATCH V2 01/10] optabs: Make all `*dot_prod_optab's modeled as conversions

2024-08-13 Thread Victor Do Nascimento
Given the specification in the GCC internals manual defines the {u|s}dot_prod standard name as taking "two signed elements of the same mode, adding them to a third operand of wider mode", there is currently ambiguity in the relationship between the mode of the first two arguments and that of the th

[PATCH V2 02/10] autovectorizer: Add basic support for convert optabs

2024-08-13 Thread Victor Do Nascimento
Given the shift from modeling dot products as direct optabs to treating them as conversion optabs, we make necessary changes to the autovectorizer code to ensure that given the relevant tree code, together with the input and output data modes, we can retrieve the relevant optab and subsequently the

Re: [PATCH 05/10] i386: Fix dot_prod backend patterns for mmx and sse targets

2024-07-12 Thread Victor Do Nascimento
On 7/12/24 03:23, Jiang, Haochen wrote: -Original Message- From: Hongtao Liu Sent: Thursday, July 11, 2024 9:45 AM To: Victor Do Nascimento Cc: gcc-patches@gcc.gnu.org; richard.sandif...@arm.com; richard.earns...@arm.com Subject: Re: [PATCH 05/10] i386: Fix dot_prod backend patterns

[PATCH 03/10] aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns.

2024-07-10 Thread Victor Do Nascimento
Given recent changes to the dot_prod standard pattern name, this patch fixes the aarch64 back-end by implementing the following changes: 1. Add 2nd mode to all (u|s|us)dot_prod patterns in .md files. 2. Rewrite initialization and function expansion mechanism for simd builtins. 3. Fix all direct ca

[PATCH 00/10] Make `dot_prod' a convert-type optab

2024-07-10 Thread Victor Do Nascimento
x to ensure I've not inadvertently broken anything for those backends. Victor Do Nascimento (10): optabs: Make all `*dot_prod_optab's modeled as conversions autovectorizer: Add basic support for convert optabs aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns. arm: Fix arm b

[PATCH 05/10] i386: Fix dot_prod backend patterns for mmx and sse targets

2024-07-10 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/i386/mmx.md (usdot_prodv8qi): Deleted. (usdot_prodv2siv8qi): New. (sdot_prod

[PATCH 06/10] arc: Adjust dot-product backend patterns

2024-07-10 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/arc/simdext.md (sdot_prodv2hi): Deleted. (sdot_prodsiv2hi): New. (udot_prodv

[PATCH 02/10] autovectorizer: Add basic support for convert optabs

2024-07-10 Thread Victor Do Nascimento
Given the shift from modeling dot products as direct optabs to treating them as conversion optabs, we make necessary changes to the autovectorizer code to ensure that given the relevant tree code, together with the input and output data modes, we can retrieve the relevant optab and subsequently the

[PATCH 08/10] altivec: Adjust dot-product backend patterns

2024-07-10 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/rs6000/altivec.md (udot_prod): Deleted. (udot_prodv4si): New. (sdot_prodv8hi

[PATCH 09/10] c6x: Adjust dot-product backend patterns

2024-07-10 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/c6x/c6x.md (sdot_prodv2hi): Deleted. (sdot_prodsiv2hi): New. --- gcc/config/c6x/c6x

[PATCH 07/10] mips: Adjust dot-product backend patterns

2024-07-10 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/mips/loongson-mmi.md (sdot_prodv4hi): Deleted. (sdot_prodv2siv4hi): New. --- gcc/co

[PATCH 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-07-10 Thread Victor Do Nascimento
From: Victor Do Nascimento Given the novel treatment of the dot product optab as a conversion we are now able to target, for a given architecture, different relationships between output modes and input modes. This is made clearer by way of example. Previously, on AArch64, the following loop was

[PATCH 01/10] optabs: Make all `*dot_prod_optab's modeled as conversions

2024-07-10 Thread Victor Do Nascimento
Given the specification in the GCC internals manual defines the {u|s}dot_prod standard name as taking "two signed elements of the same mode, adding them to a third operand of wider mode", there is currently ambiguity in the relationship between the mode of the first two arguments and that of the th

[PATCH 04/10] arm: Fix arm backend-use of (u|s|us)dot_prod patterns.

2024-07-10 Thread Victor Do Nascimento
gcc/ChangeLog: * config/arm/arm-builtins.cc (enum arm_builtins): Add new ARM_BUILTIN_* enum values: SDOTV8QI, SDOTV16QI, UDOTV8QI, UDOTV16QI, USDOTV8QI, USDOTV16QI. (arm_init_dotprod_builtins): New. (arm_init_builtins): Add call to `arm_init_dotprod_builtins

[PATCH v2] libatomic: Add rcpc3 128-bit atomic operations for AArch64

2024-06-12 Thread Victor Do Nascimento
The introduction of the optional RCPC3 architectural extension for Armv8.2-A upwards provides additional support for the release consistency model, introducing the Load-Acquire RCpc Pair Ordered, and Store-Release Pair Ordered operations in the form of LDIAPP and STILP. These operations are single

[PATCH v2 2/4] Libatomic: Define per-file identifier macros

2024-06-11 Thread Victor Do Nascimento
In order to facilitate the fine-tuning of how `libatomic_i.h' and `host-config.h' headers are used by different atomic functions, we define distinct identifier macros for each file which, in implementing atomic operations, imports these headers. The idea is that different parts of these headers co

[PATCH v2 4/4] Libatomic: Clean up AArch64 `atomic_16.S' implementation file

2024-06-11 Thread Victor Do Nascimento
At present, `atomic_16.S' groups different implementations of the same functions together in the file. Therefore, as an example, the LSE2 implementation of `load_16' follows on immediately from its core implementation, as does the `store_16' LSE2 implementation. Such architectural extension-depen

[PATCH v2 3/4] Libatomic: Make ifunc selector behavior contingent on importing file

2024-06-11 Thread Victor Do Nascimento
By querying previously-defined file-identifier macros, `host-config.h' is able to get information about its environment and, based on this information, select more appropriate function-specific ifunc selectors. This reduces the number of unnecessary feature tests that need to be carried out in ord

[PATCH v2 1/4] Libatomic: AArch64: Convert all lse128 assembly to .insn directives

2024-06-11 Thread Victor Do Nascimento
Given the lack of support for the LSE128 instructions in all but the most up-to-date version of Binutils (2.42), having the build-time test for assembler support for these instructions often leads to the building of Libatomic without support for LSE128-dependent atomic function implementations.

[PATCH v2 0/4] Libatomic: Cleanup ifunc selector and aliasing

2024-06-11 Thread Victor Do Nascimento
and `--disable-gnu-indirect-function' configurations on armv9.4-a target with LRCPC3 and LSE128 support and without. Victor Do Nascimento (4): Libatomic: AArch64: Convert all lse128 assembly to .insn directives Libatomic: Define per-file identifier macros Libatomic: Make ifunc selector

[PATCH v2] middle-end: Drop __builtin_prefetch calls in autovectorization [PR114061]

2024-06-11 Thread Victor Do Nascimento
At present the autovectorizer fails to vectorize simple loops involving calls to `__builtin_prefetch'. A simple example of such loop is given below: void foo(double * restrict a, double * restrict b, int n){ int i; for(i=0; i *references) clobbers_memory = true; break;
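
The loop quoted in this snippet is garbled in the archive, so the following is not a reconstruction of it. It is a separate, minimal sketch of the kind of loop being discussed: a simple element-wise update that also issues a `__builtin_prefetch' hint. The function name and the prefetch distance are assumptions, and the caller is assumed to provide at least n + PREFETCH_AHEAD elements in `b' so that the address expression stays valid.

```c
#include <stddef.h>

/* Assumed prefetch distance, in elements; chosen only for illustration.  */
#define PREFETCH_AHEAD 8

/* Add b[i] to a[i] for i in [0, n).  Assumption: b points to at least
   n + PREFETCH_AHEAD elements, so &b[i + PREFETCH_AHEAD] is always a
   valid address expression.  */
void add_with_prefetch(double *restrict a, const double *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Performance hint only; it has no effect on the computed values.  */
      __builtin_prefetch(&b[i + PREFETCH_AHEAD]);
      a[i] += b[i];
    }
}
```

Since the prefetch does not change the result, dropping the call when vectorizing keeps the loop's observable behaviour intact.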

Re: [PATCH] middle-end: Expand {u|s}dot product support in autovectorizer

2024-05-17 Thread Victor Do Nascimento
6 AM Tamar Christina wrote: -Original Message- From: Richard Biener Sent: Friday, May 17, 2024 10:46 AM To: Tamar Christina Cc: Victor Do Nascimento ; gcc- patc...@gcc.gnu.org; Richard Sandiford ; Richard Earnshaw ; Victor Do Nascimento Subject: Re: [PATCH] middle-end: Expand {u|s}do

[PATCH] middle-end: Expand {u|s}dot product support in autovectorizer

2024-05-16 Thread Victor Do Nascimento
From: Victor Do Nascimento At present, the compiler offers the `{u|s|us}dot_prod_optab' direct optabs for dealing with vectorizable dot product code sequences. The consequence of using a direct optab for this is that backend-pattern selection is only ever able to match against one dat

Re: [PATCH] middle-end: Drop __builtin_pretech calls in autovectorization [PR114061]'

2024-05-16 Thread Victor Do Nascimento
On 5/16/24 15:16, Andrew Pinski wrote: On Thu, May 16, 2024, 3:58 PM Victor Do Nascimento wrote: At present the autovectorizer fails to vectorize simple loops involving calls to `__builtin_prefetch'. A simple example of such lo

[PATCH] middle-end: Drop __builtin_pretech calls in autovectorization [PR114061]'

2024-05-16 Thread Victor Do Nascimento
At present the autovectorizer fails to vectorize simple loops involving calls to `__builtin_prefetch'. A simple example of such loop is given below: void foo(double * restrict a, double * restrict b, int n){ int i; for(i=0; i *references) clobbers_memory = true; break;

[PATCH] libatomic: Add rcpc3 128-bit atomic operations for AArch64

2024-05-16 Thread Victor Do Nascimento
The introduction of the optional RCPC3 architectural extension for Armv8.2-A upwards provides additional support for the release consistency model, introducing the Load-Acquire RCpc Pair Ordered, and Store-Release Pair Ordered operations in the form of LDIAPP and STILP. These operations are single

[PATCH 1/4] Libatomic: Define per-file identifier macros

2024-05-16 Thread Victor Do Nascimento
In order to facilitate the fine-tuning of how `libatomic_i.h' and `host-config.h' headers are used by different atomic functions, we define distinct identifier macros for each file which, in implementing atomic operations, imports these headers. The idea is that different parts of these headers co

[PATCH 4/4] Libatomic: Clean up AArch64 `atomic_16.S' implementation file

2024-05-16 Thread Victor Do Nascimento
At present, `atomic_16.S' groups different implementations of the same functions together in the file. Therefore, as an example, the LSE128 implementation of `exchange_16' follows on immediately from its core implementation, as does the `fetch_or_16' LSE128 implementation. Such architectural exte

[PATCH 2/4] Libatomic: Make ifunc selector behavior contingent on importing file

2024-05-16 Thread Victor Do Nascimento
By querying previously-defined file-identifier macros, `host-config.h' is able to get information about its environment and, based on this information, select more appropriate function-specific ifunc selectors. This reduces the number of unnecessary feature tests that need to be carried out in ord

[PATCH 3/4] Libatomic: Clean up AArch64 ifunc aliasing

2024-05-16 Thread Victor Do Nascimento
Following improvements to the way ifuncs are selected based on detected architectural features, we are able to do away with many of the aliases that were previously needed for subsets of atomic functions that were not implemented in a given extension. This may be clarified by virtue of an example.

[PATCH 0/4] Libatomic: Cleanup ifunc selector and aliasing

2024-05-16 Thread Victor Do Nascimento
able-gnu-indirect-function' configurations on armv9.4-a target with LRCPC3 and LSE128 support and without. Victor Do Nascimento (4): Libatomic: Define per-file identifier macros Libatomic: Make ifunc selector behavior contingent on importing file Libatomic: Clean up AArch64 ifunc a

Re: [PATCH] aarch64: Add +lse128 architectural extension command-line flag

2024-03-27 Thread Victor Do Nascimento
On 3/26/24 12:26, Richard Sandiford wrote: Victor Do Nascimento writes: Given how, at present, the choice of using LSE128 atomic instructions by the toolchain is delegated to run-time selection in the form of Libatomic ifuncs, responsible for querying target support, the `+lse128' t

[PATCH] aarch64: Align lrcpc3 FEAT_STRING with /proc/cpuinfo 'Features' entry

2024-03-25 Thread Victor Do Nascimento
Due to the Linux kernel exposing the lrcpc3 architectural feature as "lrcpc3", this patch corrects the relevant FEATURE_STRING entry in the "rcpc3" AARCH64_OPT_FMV_EXTENSION macro, such that the feature can be correctly detected when doing native compilation on rcpc3-enabled targets. Regtested on

[PATCH] aarch64: Add +lse128 architectural extension command-line flag

2024-03-15 Thread Victor Do Nascimento
Given how, at present, the choice of using LSE128 atomic instructions by the toolchain is delegated to run-time selection in the form of Libatomic ifuncs, responsible for querying target support, the `+lse128' target architecture compile-time flag is absent from GCC. This, however, contrasts with

Re: [libatomic PATCH] PR other/113336: Fix libatomic testsuite regressions on ARM.

2024-02-14 Thread Victor Do Nascimento
arm-linux-gnueabihf with --with-arch=armv6 with make bootstrap and make -k check where it fixes all of the FAILs in libatomic. Ok for mainline? 2024-01-28 Roger Sayle Victor Do Nascimento libatomic/ChangeLog PR other/113336 * Makefile.am: Build tas

[PATCH] AArch64: Update system register database.

2024-02-06 Thread Victor Do Nascimento
With the release of Binutils 2.42, this brings the level of system-register support in GCC in line with the current state-of-the-art in Binutils, ensuring everything available in Binutils is plainly accessible from GCC. Where Binutils uses a more detailed description of which features are responsi

Re: [PATCH v2 2/2] libatomic: Add rcpc3 128-bit atomic operations for AArch64

2024-01-26 Thread Victor Do Nascimento
On 1/26/24 10:53, Richard Sandiford wrote: > Victor Do Nascimento writes: >> @@ -712,6 +760,27 @@ ENTRY (libat_test_and_set_16) >> END (libat_test_and_set_16) >> >> >> +/* Alias all LSE128_LRCPC3 ifuncs to their specific implementations, >> + that

Re: [libatomic PATCH] Fix testsuite regressions on ARM [raspberry pi].

2024-01-25 Thread Victor Do Nascimento
On 1/11/24 15:55, Roger Sayle wrote: Hi Richard, As you've recommended, this issue has now been filed in bugzilla as PR other/113336. As explained in the new PR, libatomic's testsuite used to pass on armv6 (raspberry pi) in previous GCC releases, but the code was incorrect/non-synchronous; t

[PATCH v2 2/2] libatomic: Add rcpc3 128-bit atomic operations for AArch64

2024-01-24 Thread Victor Do Nascimento
The introduction of the optional RCPC3 architectural extension for Armv8.2-A upwards provides additional support for the release consistency model, introducing the Load-Acquire RCpc Pair Ordered, and Store-Release Pair Ordered operations in the form of LDIAPP and STILP. These operations are single

[PATCH v2 1/2] libatomic: Increase max IFUNC_NCOND(N) from 3 to 4.

2024-01-24 Thread Victor Do Nascimento
libatomic/ChangeLog: * libatomic_i.h: Add GEN_SELECTOR implementation for IFUNC_NCOND(N) == 4. --- libatomic/libatomic_i.h | 18 ++ 1 file changed, 18 insertions(+) diff --git a/libatomic/libatomic_i.h b/libatomic/libatomic_i.h index 861a22da152..0a854fd908c 100644

[PATCH v2 0/2] libatomic: AArch64 rcpc3 128-bit atomic operation enablement

2024-01-24 Thread Victor Do Nascimento
/gcc-patches/2024-January/643841.html Victor Do Nascimento (2): libatomic: Increase max IFUNC_NCOND(N) from 3 to 4. libatomic: Add rcpc3 128-bit atomic operations for AArch64 libatomic/Makefile.am| 6 +- libatomic/Makefile.in| 22

[PATCH v4 1/4] libatomic: atomic_16.S: Improve ENTRY, END and ALIAS macro interface

2024-01-24 Thread Victor Do Nascimento
The introduction of further architectural-feature dependent ifuncs for AArch64 makes hard-coding ifunc `_i' suffixes to functions cumbersome to work with. It is awkward to remember which ifunc maps onto which arch feature and makes the code harder to maintain when new ifuncs are added and their su

[PATCH v4 4/4] aarch64: Add explicit checks for implicit LSE/LSE2 requirements.

2024-01-24 Thread Victor Do Nascimento
At present, evaluation of both `has_lse2(hwcap)' and `has_lse128(hwcap)' may require issuing an `mrs' instruction to query a system register. This instruction, when issued from user-space, results in a trap by the kernel, which then returns the value read in by the system register. Given the undesi

[PATCH v4 3/4] libatomic: Enable LSE128 128-bit atomics for armv9.4-a

2024-01-24 Thread Victor Do Nascimento
The armv9.4-a architectural revision adds three new atomic operations associated with the LSE128 feature: * LDCLRP - Atomic AND NOT (bitclear) of a location with 128-bit value held in a pair of registers, with original data loaded into the same 2 registers. * LDSETP - Atomic OR (bitset) of

[PATCH v4 2/4] libatomic: Add support for __ifunc_arg_t arg in ifunc resolver

2024-01-24 Thread Victor Do Nascimento
With support for new atomic features in Armv9.4-a being indicated by HWCAP2 bits, Libatomic's ifunc resolver must now query its second argument, of type __ifunc_arg_t*. We therefore make this argument known to libatomic, allowing us to query hwcap2 bits in the following manner: bool resolver

[PATCH v4 0/4] Libatomic: Add LSE128 atomics support for AArch64

2024-01-24 Thread Victor Do Nascimento
upport is present. Regression tested on aarch64-linux-gnu target with LSE128-support. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620529.html [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626358.html Victor Do Nascimento (4): libatomic: atomic_16.S: Improve ENTRY, END an

Re: [PATCH v3 1/3] libatomic: atomic_16.S: Improve ENTRY, END and ALIAS macro interface

2024-01-08 Thread Victor Do Nascimento
On 1/5/24 11:10, Richard Sandiford wrote: Victor Do Nascimento writes: The introduction of further architectural-feature dependent ifuncs for AArch64 makes hard-coding ifunc `_i' suffixes to functions cumbersome to work with. It is awkward to remember which ifunc maps onto which
