Cheers,
And given how the test also started failing for GCC14, are we also okay
to go ahead with backporting the patch?
Victor
On 4/2/25 17:39, Jeff Law wrote:
On 4/2/25 8:53 AM, Victor Do Nascimento wrote:
Using specific SSA names in pattern matching in `dg-final' makes tests
Using specific SSA names in pattern matching in `dg-final' makes tests
"unstable", in that changes in passes prior to the pass whose dump is
analyzed in the particular test may change the numbering of the SSA
variables, causing the test to start failing spuriously.
We thus switch from specific SSA
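The fragility described in this entry can be illustrated with a minimal sketch in C. It uses POSIX `regcomp'/`regexec' rather than DejaGnu's own matcher, and both the patterns and the dump lines they are tried against are made up for illustration; none of them are taken from the patch.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical pattern that hard-codes SSA version numbers.  */
static const char specific_pattern[] = "_5 = a_3 \\+ b_4;";

/* Hypothetical numbering-agnostic pattern: any SSA version is accepted.  */
static const char generic_pattern[] = "_[0-9]+ = a_[0-9]+ \\+ b_[0-9]+;";

/* Return 1 if the POSIX extended regex PATTERN matches somewhere in LINE,
   0 otherwise (including when the pattern fails to compile).  */
static int
dump_line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  int matched = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

If an earlier pass renumbers a statement from "_5 = a_3 + b_4;" to "_7 = a_3 + b_6;", the specific pattern stops matching while the generic one still does, even though the statement itself is unchanged.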
Richard Biener writes:
> On Mon, 18 Nov 2024, Victor Do Nascimento wrote:
>
>> On 11/5/24 07:39, Richard Biener wrote:
>> > On Tue, 5 Nov 2024, Victor Do Nascimento wrote:
>> >
>> >> The current codegen code to support VF's that are multiples of
On 11/5/24 07:39, Richard Biener wrote:
On Tue, 5 Nov 2024, Victor Do Nascimento wrote:
The current codegen code to support VF's that are multiples of a simdclone
simdlen relies on BIT_FIELD_REF to create multiple input vectors. This does not
work for non-constant simdclones, so we s
cc'ing Jakub due to email address typo in original patch submission.
Apologies,
Victor
Victor Do Nascimento writes:
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/gomp/declare-variant-14.c: Make i?86 and x86_64 target
> only test.
> * gfortran.dg/gomp/de
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/declare-variant-14.c: Make i?86 and x86_64 target
only test.
* gfortran.dg/gomp/declare-variant-14.f90: Likewise.
---
gcc/testsuite/c-c++-common/gomp/declare-variant-14.c | 12 +---
.../gfortran.dg/gomp/declare-variant-1
This patch finalizes adding support for the generation of SVE simd clones when
no simdlen is provided, following the ABI rules where the widest data type
determines the minimum number of elements in a length-agnostic vector.
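As a rough arithmetic sketch of that rule, assuming the architectural minimum SVE vector length of 128 bits, the lane count follows from the width of the widest data type. The helper name below is hypothetical and is not GCC code.

```c
/* Minimum SVE vector length in bits (assumed architectural lower bound).  */
#define SVE_MIN_VECTOR_BITS 128

/* Hypothetical helper: minimum number of lanes in a length-agnostic
   vector whose widest data type is WIDEST_TYPE_BITS wide.  Returns 0 for
   widths that do not evenly divide the minimum vector length.  */
static unsigned
min_sve_lanes (unsigned widest_type_bits)
{
  if (widest_type_bits == 0 || SVE_MIN_VECTOR_BITS % widest_type_bits != 0)
    return 0;
  return SVE_MIN_VECTOR_BITS / widest_type_bits;
}
```

With a 64-bit widest type this gives 2 lanes per 128 bits; the lane count at run time would scale with the actual vector length.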
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (add_sve_type_a
The current codegen code to support VF's that are multiples of a simdclone
simdlen relies on BIT_FIELD_REF to create multiple input vectors. This does not
work for non-constant simdclones, so we should disable using such clones when
the VF is a multiple of the non-constant simdlen until we change th
This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
target can reject a simd_clone based on the vector mode it is using.
This is needed because for VLS SVE vectorization the vectorizer accepts
Advanced SIMD simd clones when vectorizing using SVE types because the simdlens
mig
Following a few bugfixes in the if-convert pass which were previously
found to degrade performance in the proposed SVE libmvec
autovectorization of conditional calls to math functions, this
patch-series carries on the work initially presented by Andre Vieira
and last discussed at:
- https://patch
Implement -mcpu options for:
- Cortex-A520AE
- Cortex-A720AE
- Cortex-R82AE
These all implement the same feature sets as their non-AE
counterparts, using the same scheduler and costs and differing only in
their respective part numbers.
gcc/ChangeLog:
* config/aarch64/aarch64-cores
On 2/1/24 21:59, Richard Sandiford wrote:
Andre Vieira writes:
This patch finalizes adding support for the generation of SVE simd clones when
no simdlen is provided, following the ABI rules where the widest data type
determines the minimum number of elements in a length-agnostic vector.
gcc/Ch
FWIW, I definitely agree about the spuriousness of the V2DI mode check.
While I can't approve, I can confirm it looks good.
Thanks,
Victor.
On 10/17/24 16:10, Wilco Dijkstra wrote:
The split condition in aarch64_simd_mov uses aarch64_simd_special_constant_p.
While doing the split, it checks
The recent refactoring of the dot_prod optab to convert-type exposed a
limitation in how `find_widening_optab_handler_and_mode' is currently
implemented, owing to the fact that, while the function expects the
GET_MODE_CLASS (from_mode) == GET_MODE_CLASS (to_mode)
condition to hold, the c6x back
On 10/11/24 08:28, Richard Biener wrote:
On Thu, Oct 10, 2024 at 5:25 PM Victor Do Nascimento
wrote:
The recent refactoring of the dot_prod optab to convert-type exposed a
limitation in how `find_widening_optab_handler_and_mode' is currently
implemented, owing to the fact that, whil
The recent refactoring of the dot_prod optab to convert-type exposed a
limitation in how `find_widening_optab_handler_and_mode' is currently
implemented, owing to the fact that, while the function expects the
GET_MODE_CLASS (from_mode) == GET_MODE_CLASS (to_mode)
condition to hold, the c6x back
On 10/7/24 10:52, Richard Biener wrote:
On Wed, Oct 2, 2024 at 6:26 PM Victor Do Nascimento
wrote:
Given the categorization of math built-in functions as `ECF_CONST',
when if-converting their uses, their calls are not masked and are thus
called with an all-true predicate.
This, howeve
On 10/4/24 09:32, Tamar Christina wrote:
Hi Victor,
-Original Message-
From: Victor Do Nascimento
Sent: Wednesday, October 2, 2024 5:26 PM
To: gcc-patches@gcc.gnu.org
Cc: Tamar Christina ; richard.guent...@gmail.com;
Victor Do Nascimento
Subject: [PATCH] middle-end: reorder masking
Given the categorization of math built-in functions as `ECF_CONST',
when if-converting their uses, their calls are not masked and are thus
called with an all-true predicate.
This, however, is not appropriate where built-ins have library
equivalents, wherein they may exhibit highly architecture-spe
On 10/1/24 13:10, Richard Biener wrote:
On Mon, Sep 30, 2024 at 8:40 PM Tamar Christina wrote:
Hi Victor,
Thanks! This looks good to me with one minor comment:
-Original Message-
From: Victor Do Nascimento
Sent: Monday, September 30, 2024 2:34 PM
To: gcc-patches@gcc.gnu.org
Cc
Up until now, due to a latent bug in the code for the ifcvt pass,
irrespective of the branch taken in a conditional statement, the
original condition for the if statement was used in masking the
function call.
Thus, for code such as:
if (a[i] > limit)
b[i] = fixed_const;
else
b[i] = f
Hello,
Gentle reminder for this simple renaming update in response to the
feedback from the last iteration. 🙂
Thanks,
Victor
On 9/5/24 12:05, Victor Do Nascimento wrote:
Changes from previous revision:
Rename new `check_effective_target' and tests to make their intent
clearer.
Hello,
Gentle reminder for this patch 🙂
Thanks,
Victor
On 9/5/24 11:59, Victor Do Nascimento wrote:
Changes from previous revision:
As was done for the equivalent aarch64 patch, we rework this patch to do away
with mission creep, keeping changes as simple as possible.
We thus remove the
Changes from previous revision:
Rename new `check_effective_target' and tests to make their intent
clearer.
* lib/target-supports.exp: For new `check_effective_target',
s/vect_dotprod_twoway/vect_dotprod_hisi/.
* One test is renamed to `vect-dotprod-conv-optab.c' to emphasize
aim of c
Changes from previous revision:
As was done for the equivalent aarch64 patch, we rework this patch to do away
with mission creep, keeping changes as simple as possible.
We thus remove the `gimple_fold_builtin' changes that would have replaced the
dot-product builtin calls with DOT_PROD_EXPRs a
Hello,
Gentle reminder for this simple renaming patch :)
Thanks,
Victor
On 8/15/24 09:44, Victor Do Nascimento wrote:
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names
On 8/15/24 09:26, Richard Sandiford wrote:
Victor Do Nascimento writes:
Given recent changes to the dot_prod standard pattern name, this patch
fixes the aarch64 back-end by implementing the following changes:
1. Add 2nd mode to all (u|s|us)dot_prod patterns in .md files.
2. Rewrite
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
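The renaming scheme can be sketched as a small C helper. This is illustrative only: the function name is hypothetical, and the `<sign>dot_prod<output mode><input mode>' layout is inferred from the renames listed in the ChangeLog entries on this page (e.g. `sdot_prodv2hi' becoming `sdot_prodsiv2hi').

```c
#include <stdio.h>

/* Hypothetical helper: build a conversion-style dot-product pattern name
   of the form <sign>dot_prod<output mode><input mode>.  SIGN_PREFIX is
   "s", "u" or "us"; mode names are given in lower case.  Returns the
   value of snprintf, i.e. the length of the full name.  */
static int
dot_prod_pattern_name (char *buf, size_t len, const char *sign_prefix,
                       const char *out_mode, const char *in_mode)
{
  return snprintf (buf, len, "%sdot_prod%s%s", sign_prefix, out_mode,
                   in_mode);
}
```

For instance, a signed dot product from V2HI inputs to an SI result yields `sdot_prodsiv2hi', where the old single-mode name only mentioned `v2hi'.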
gcc/ChangeLog:
* config/rs6000/altivec.md (udot_prod): Renamed to...
(udot_prodv4si): ...this.
(sdot
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/arc/simdext.md (sdot_prodv2hi): Renamed to...
(sdot_prodsiv2hi): ...this.
(u
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/mips/loongson-mmi.md (sdot_prodv4hi): Renamed to...
(sdot_prodv2siv4hi): ...this.
--
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/c6x/c6x.md (sdot_prodv2hi): Renamed to...
(sdot_prodsiv2hi): ...this.
---
gcc/confi
Given recent changes to the dot_prod standard pattern name, this patch
fixes the aarch64 back-end by implementing the following changes:
1. Add 2nd mode to all (u|s|us)dot_prod patterns in .md files.
2. Rewrite initialization and function expansion mechanism for simd
builtins.
3. Fix all direct ca
Given the shift from modeling dot products as direct optabs to
treating them as conversion optabs, we make necessary changes to the
autovectorizer code to ensure that given the relevant tree code,
together with the input and output data modes, we can retrieve the
relevant optab and subsequently the
From: Victor Do Nascimento
Given the novel treatment of the dot product optab as a conversion, we
are now able to target different relationships between output modes and
input modes.
This is made clearer by way of example. Previously, on AArch64, the
following loop was vectorizable:
uint32_t
gcc/ChangeLog:
* config/arm/arm-builtins.cc (enum arm_builtins): Add new
ARM_BUILTIN_* enum values: SDOTV8QI, SDOTV16QI, UDOTV8QI,
UDOTV16QI, USDOTV8QI, USDOTV16QI.
(arm_init_dotprod_builtins): New.
(arm_init_builtins): Add call to `arm_init_dotprod_builtins
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/i386/mmx.md (usdot_prodv8qi): Renamed to...
(usdot_prodv2siv8qi): ...this.
(
Given that the specification in the GCC internals manual defines the
{u|s}dot_prod standard name as taking "two signed elements of the
same mode, adding them to a third operand of wider mode", there is
currently ambiguity in the relationship between the mode of the first
two arguments and that of the th
nd armhf. I'd appreciate help
running relevant tests on the remaining architectures, i.e. arc, mips,
altivec and c6x to ensure I've not inadvertently broken anything for
those back-ends.
Victor Do Nascimento (10):
optabs: Make all `*dot_prod_optab's modeled as conversions
autov
On 8/14/24 13:24, Tamar Christina wrote:
It seems to me that this should take a code_helper, create the vector modes and
call directly_supported_p, or am I missing something?
Ok. Having done some digging around in the git history, I see that
`vect_supportable_direct_optab_p', upon which I ba
On 8/14/24 13:24, Tamar Christina wrote:
Hi Victor,
-Original Message-
From: Victor Do Nascimento
Sent: Tuesday, August 13, 2024 1:42 PM
To: gcc-patches@gcc.gnu.org
Cc: Tamar Christina ; claz...@gmail.com;
hongtao@intel.com; s...@gcc.gnu.org; bernds_...@t-online.de;
al
From: Victor Do Nascimento
Given the novel treatment of the dot product optab as a conversion, we
are now able to target different relationships between output modes and
input modes.
This is made clearer by way of example. Previously, on AArch64, the
following loop was vectorizable:
uint32_t
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/mips/loongson-mmi.md (sdot_prodv4hi): Renamed to...
(sdot_prodv2siv4hi): ...this.
--
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/rs6000/altivec.md (udot_prod): Renamed to...
(udot_prodv4si): ...this.
(sdot
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/i386/mmx.md (usdot_prodv8qi): Renamed to...
(usdot_prodv2siv8qi): ...this.
(
the same input
mode but resulting in a different output mode.
Regression-tested on x86_64, aarch64 and armhf. I'd appreciate help
running relevant tests on the remaining architectures, i.e. arc, mips,
altivec and c6x to ensure I've not inadvertently broken anything for
those back-e
gcc/ChangeLog:
* config/arm/arm-builtins.cc (enum arm_builtins): Add new
ARM_BUILTIN_* enum values: SDOTV8QI, SDOTV16QI, UDOTV8QI,
UDOTV16QI, USDOTV8QI, USDOTV16QI.
(arm_init_dotprod_builtins): New.
(arm_init_builtins): Add call to `arm_init_dotprod_builtins
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/arc/simdext.md (sdot_prodv2hi): Renamed to...
(sdot_prodsiv2hi): ...this.
(u
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/c6x/c6x.md (sdot_prodv2hi): Renamed to...
(sdot_prodsiv2hi): ...this.
---
gcc/confi
Given recent changes to the dot_prod standard pattern name, this patch
fixes the aarch64 back-end by implementing the following changes:
1. Add 2nd mode to all (u|s|us)dot_prod patterns in .md files.
2. Rewrite initialization and function expansion mechanism for simd
builtins.
3. Fix all direct ca
Given that the specification in the GCC internals manual defines the
{u|s}dot_prod standard name as taking "two signed elements of the
same mode, adding them to a third operand of wider mode", there is
currently ambiguity in the relationship between the mode of the first
two arguments and that of the th
Given the shift from modeling dot products as direct optabs to
treating them as conversion optabs, we make necessary changes to the
autovectorizer code to ensure that given the relevant tree code,
together with the input and output data modes, we can retrieve the
relevant optab and subsequently the
On 7/12/24 03:23, Jiang, Haochen wrote:
-Original Message-
From: Hongtao Liu
Sent: Thursday, July 11, 2024 9:45 AM
To: Victor Do Nascimento
Cc: gcc-patches@gcc.gnu.org; richard.sandif...@arm.com;
richard.earns...@arm.com
Subject: Re: [PATCH 05/10] i386: Fix dot_prod backend patterns
Given recent changes to the dot_prod standard pattern name, this patch
fixes the aarch64 back-end by implementing the following changes:
1. Add 2nd mode to all (u|s|us)dot_prod patterns in .md files.
2. Rewrite initialization and function expansion mechanism for simd
builtins.
3. Fix all direct ca
x to ensure I've not inadvertently broken anything for
those backends.
Victor Do Nascimento (10):
optabs: Make all `*dot_prod_optab's modeled as conversions
autovectorizer: Add basic support for convert optabs
aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns.
arm: Fix arm b
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/i386/mmx.md (usdot_prodv8qi): Deleted.
(usdot_prodv2siv8qi): New.
(sdot_prod
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/arc/simdext.md (sdot_prodv2hi): Deleted.
(sdot_prodsiv2hi): New.
(udot_prodv
Given the shift from modeling dot products as direct optabs to
treating them as conversion optabs, we make necessary changes to the
autovectorizer code to ensure that given the relevant tree code,
together with the input and output data modes, we can retrieve the
relevant optab and subsequently the
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/rs6000/altivec.md (udot_prod): Deleted.
(udot_prodv4si): New.
(sdot_prodv8hi
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/c6x/c6x.md (sdot_prodv2hi): Deleted.
(sdot_prodsiv2hi): New.
---
gcc/config/c6x/c6x
Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.
gcc/ChangeLog:
* config/mips/loongson-mmi.md (sdot_prodv4hi): Deleted.
(sdot_prodv2siv4hi): New.
---
gcc/co
From: Victor Do Nascimento
Given the novel treatment of the dot product optab as a conversion we
are now able to target, for a given architecture, different
relationships between output modes and input modes.
This is made clearer by way of example. Previously, on AArch64, the
following loop was
Given that the specification in the GCC internals manual defines the
{u|s}dot_prod standard name as taking "two signed elements of the
same mode, adding them to a third operand of wider mode", there is
currently ambiguity in the relationship between the mode of the first
two arguments and that of the th
gcc/ChangeLog:
* config/arm/arm-builtins.cc (enum arm_builtins): Add new
ARM_BUILTIN_* enum values: SDOTV8QI, SDOTV16QI, UDOTV8QI,
UDOTV16QI, USDOTV8QI, USDOTV16QI.
(arm_init_dotprod_builtins): New.
(arm_init_builtins): Add call to `arm_init_dotprod_builtins
The introduction of the optional RCPC3 architectural extension for
Armv8.2-A upwards provides additional support for the release
consistency model, introducing the Load-Acquire RCpc Pair Ordered, and
Store-Release Pair Ordered operations in the form of LDIAPP and STILP.
These operations are single
In order to facilitate the fine-tuning of how `libatomic_i.h' and
`host-config.h' headers are used by different atomic functions, we
define distinct identifier macros for each file which, in implementing
atomic operations, imports these headers.
The idea is that different parts of these headers co
At present, `atomic_16.S' groups different implementations of the
same functions together in the file. Therefore, as an example,
the LSE2 implementation of `load_16' follows on immediately from its
core implementation, as does the `store_16' LSE2 implementation.
Such architectural extension-depen
By querying previously-defined file-identifier macros, `host-config.h'
is able to get information about its environment and, based on this
information, select more appropriate function-specific ifunc
selectors. This reduces the number of unnecessary feature tests that
need to be carried out in ord
Given the lack of support for the LSE128 instructions in all but the
most up-to-date version of Binutils (2.42), having the build-time
test for assembler support for these instructions often leads to the
building of Libatomic without support for LSE128-dependent atomic
function implementations.
and
`--disable-gnu-indirect-function' configurations on armv9.4-a target
with LRCPC3 and LSE128 support and without.
Victor Do Nascimento (4):
Libatomic: AArch64: Convert all lse128 assembly to .insn directives
Libatomic: Define per-file identifier macros
Libatomic: Make ifunc selector
At present the autovectorizer fails to vectorize simple loops
involving calls to `__builtin_prefetch'. A simple example of such a
loop is given below:
void foo(double * restrict a, double * restrict b, int n){
int i;
for(i=0; i *references)
clobbers_memory = true;
break;
6 AM Tamar Christina
wrote:
-Original Message-
From: Richard Biener
Sent: Friday, May 17, 2024 10:46 AM
To: Tamar Christina
Cc: Victor Do Nascimento ; gcc-
patc...@gcc.gnu.org; Richard Sandiford ; Richard
Earnshaw ; Victor Do Nascimento
Subject: Re: [PATCH] middle-end: Expand {u|s}do
From: Victor Do Nascimento
At present, the compiler offers the `{u|s|us}dot_prod_optab' direct
optabs for dealing with vectorizable dot product code sequences. The
consequence of using a direct optab for this is that backend-pattern
selection is only ever able to match against one dat
On 5/16/24 15:16, Andrew Pinski wrote:
On Thu, May 16, 2024, 3:58 PM Victor Do Nascimento
wrote:
At present the autovectorizer fails to vectorize simple loops
involving calls to `__builtin_prefetch'. A simple example of such
lo
At present the autovectorizer fails to vectorize simple loops
involving calls to `__builtin_prefetch'. A simple example of such a
loop is given below:
void foo(double * restrict a, double * restrict b, int n){
int i;
for(i=0; i *references)
clobbers_memory = true;
break;
The introduction of the optional RCPC3 architectural extension for
Armv8.2-A upwards provides additional support for the release
consistency model, introducing the Load-Acquire RCpc Pair Ordered, and
Store-Release Pair Ordered operations in the form of LDIAPP and STILP.
These operations are single
In order to facilitate the fine-tuning of how `libatomic_i.h' and
`host-config.h' headers are used by different atomic functions, we
define distinct identifier macros for each file which, in implementing
atomic operations, imports these headers.
The idea is that different parts of these headers co
At present, `atomic_16.S' groups different implementations of the
same functions together in the file. Therefore, as an example,
the LSE128 implementation of `exchange_16' follows on immediately
from its core implementation, as does the `fetch_or_16' LSE128
implementation.
Such architectural exte
By querying previously-defined file-identifier macros, `host-config.h'
is able to get information about its environment and, based on this
information, select more appropriate function-specific ifunc
selectors. This reduces the number of unnecessary feature tests that
need to be carried out in ord
Following improvements to the way ifuncs are selected based on
detected architectural features, we are able to do away with many of
the aliases that were previously needed for subsets of atomic
functions that were not implemented in a given extension.
This may be clarified by virtue of an example.
able-gnu-indirect-function' configurations on armv9.4-a target
with LRCPC3 and LSE128 support and without.
Victor Do Nascimento (4):
Libatomic: Define per-file identifier macros
Libatomic: Make ifunc selector behavior contingent on importing file
Libatomic: Clean up AArch64 ifunc a
On 3/26/24 12:26, Richard Sandiford wrote:
Victor Do Nascimento writes:
Given how, at present, the choice of using LSE128 atomic instructions
by the toolchain is delegated to run-time selection in the form of
Libatomic ifuncs, responsible for querying target support, the
`+lse128' t
Due to the Linux kernel exposing the lrcpc3 architectural feature as
"lrcpc3", this patch corrects the relevant FEATURE_STRING entry in the
"rcpc3" AARCH64_OPT_FMV_EXTENSION macro, such that the feature can be
correctly detected when doing native compilation on rcpc3-enabled
targets.
Regtested on
Given how, at present, the choice of using LSE128 atomic instructions
by the toolchain is delegated to run-time selection in the form of
Libatomic ifuncs, responsible for querying target support, the
`+lse128' target architecture compile-time flag is absent from GCC.
This, however, contrasts with
arm-linux-gnueabihf with --with-arch=armv6
with make bootstrap and make -k check where it fixes all of the FAILs in
libatomic. Ok for mainline?
2024-01-28 Roger Sayle
Victor Do Nascimento
libatomic/ChangeLog
PR other/113336
* Makefile.am: Build tas
Following the release of Binutils 2.42, this patch brings the level of
system-register support in GCC in line with the current
state-of-the-art in Binutils, ensuring everything available in
Binutils is plainly accessible from GCC.
Where Binutils uses a more detailed description of which features are
responsi
On 1/26/24 10:53, Richard Sandiford wrote:
> Victor Do Nascimento writes:
>> @@ -712,6 +760,27 @@ ENTRY (libat_test_and_set_16)
>> END (libat_test_and_set_16)
>>
>>
>> +/* Alias all LSE128_LRCPC3 ifuncs to their specific implementations,
>> + that
On 1/11/24 15:55, Roger Sayle wrote:
Hi Richard,
As you've recommended, this issue has now been filed in bugzilla
as PR other/113336. As explained in the new PR, libatomic's testsuite
used to pass on armv6 (raspberry pi) in previous GCC releases, but
the code was incorrect/non-synchronous; t
The introduction of the optional RCPC3 architectural extension for
Armv8.2-A upwards provides additional support for the release
consistency model, introducing the Load-Acquire RCpc Pair Ordered, and
Store-Release Pair Ordered operations in the form of LDIAPP and STILP.
These operations are single
libatomic/ChangeLog:
* libatomic_i.h: Add GEN_SELECTOR implementation for
IFUNC_NCOND(N) == 4.
---
libatomic/libatomic_i.h | 18 ++
1 file changed, 18 insertions(+)
diff --git a/libatomic/libatomic_i.h b/libatomic/libatomic_i.h
index 861a22da152..0a854fd908c 100644
/gcc-patches/2024-January/643841.html
Victor Do Nascimento (2):
libatomic: Increase max IFUNC_NCOND(N) from 3 to 4.
libatomic: Add rcpc3 128-bit atomic operations for AArch64
libatomic/Makefile.am| 6 +-
libatomic/Makefile.in| 22
The introduction of further architectural-feature dependent ifuncs
for AArch64 makes hard-coding ifunc `_i' suffixes to functions
cumbersome to work with. It is awkward to remember which ifunc maps
onto which arch feature, which makes the code harder to maintain when new
ifuncs are added and their su
At present, evaluation of both `has_lse2(hwcap)' and
`has_lse128(hwcap)' may require issuing an `mrs' instruction to query
a system register. This instruction, when issued from user space,
is trapped by the kernel, which then returns the value read from
the system register. Given the undesi
The armv9.4-a architectural revision adds three new atomic operations
associated with the LSE128 feature:
* LDCLRP - Atomic AND NOT (bitclear) of a location with 128-bit
value held in a pair of registers, with original data loaded into
the same 2 registers.
* LDSETP - Atomic OR (bitset) of
With support for new atomic features in Armv9.4-a being indicated by
HWCAP2 bits, Libatomic's ifunc resolver must now query its second
argument, of type __ifunc_arg_t*.
We therefore make this argument known to libatomic, allowing us to
query hwcap2 bits in the following manner:
bool
resolver
upport is present.
Regression tested on aarch64-linux-gnu target with LSE128-support.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620529.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626358.html
Victor Do Nascimento (4):
libatomic: atomic_16.S: Improve ENTRY, END an
On 1/5/24 11:10, Richard Sandiford wrote:
Victor Do Nascimento writes:
The introduction of further architectural-feature dependent ifuncs
for AArch64 makes hard-coding ifunc `_i' suffixes to functions
cumbersome to work with. It is awkward to remember which ifunc maps
onto which