This patch adds the field overlap_op_by_pieces to the struct
riscv_tune_param, which allows enabling the overlap_op_by_pieces
feature of the by-pieces infrastructure.
gcc/ChangeLog:
* config/riscv/riscv.c (struct riscv_tune_param): New field.
(riscv_overlap_op_by_pieces): New fun
This patch enables the overlap-by-pieces feature of the by-pieces
infrastructure for inlining builtins in case the target has set
riscv_slow_unaligned_access_p to false.
To demonstrate the effect for targets with fast unaligned access,
the following code sequences are generated for a 15-byte memse
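The overlapping idea can be sketched in plain C. This is only an illustration under stated assumptions, not the code sequence from the patch (which is cut off in this excerpt): a 15-byte memset is covered by two 8-byte stores, one at offset 0 and one at offset 7, so that one byte is written twice. The helper name memset15_overlap is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill exactly 15 bytes using two 8-byte stores.
   The second store starts at offset 7, so byte 7 is written twice.  */
static void
memset15_overlap (unsigned char *dst, unsigned char value)
{
  assert (dst != NULL);

  /* Replicate the byte into all eight lanes of a 64-bit word.  */
  uint64_t pattern = (uint64_t) value * UINT64_C (0x0101010101010101);

  /* memcpy expresses a possibly unaligned 8-byte store portably.  */
  memcpy (dst, &pattern, sizeof pattern);      /* bytes 0..7  */
  memcpy (dst + 7, &pattern, sizeof pattern);  /* bytes 7..14 */
}
```

Without overlapping, the same 15 bytes would need stores of 8, 4, 2 and 1 bytes; the overlap only pays off when unaligned accesses are cheap on the target.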
This patch enables the overlap-by-pieces feature of the by-pieces
infrastructure for inlining builtins in case the target has set
riscv_slow_unaligned_access_p to false.
An example to demonstrate the effect for targets with fast unaligned
access (targets that have slow_unaligned_access set to fal
The RISC-V cpymemsi expansion is called whenever the by-pieces
infrastructure does not take care of the builtin expansion.
Currently, that's the case for e.g. memcpy() with n <= 24 bytes.
The by-pieces infrastructure emits code that
performs unaligned accesses if the targ
[ree] PR rtl-optimization/100264: Handle more PARALLEL SET expressions
PR rtl-optimization/100264
* ree.c (get_sub_rtx): Ignore SET expressions without register
destinations.
(merge_def_and_ext): Eliminate destination check for register
as such SET expressio
This series provides a cleanup of the current atomics implementation
of RISC-V:
* PR100265: Use proper fences for atomic load/store
* PR100266: Provide programmatic implementation of CAS
As both are closely related, I merged the patches into one series
(to avoid merge issues if one overtakes the othe
The ratified A extension supports '.aq', '.rl' and '.aqrl' as
memory ordering suffixes. Let's emit them in case we get a '%A'
conversion specifier for riscv_print_operand().
As '%A' was already used for a similar, but restricted, purpose
(only '.aq' was emitted so far), this does not require any o
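As a minimal sketch of the suffix selection described above, assuming the usual mapping of C11 memory orders to AMO ordering bits: the enum and the function amo_ordering_suffix below are hypothetical stand-ins, not GCC's enum memmodel or the actual riscv_print_operand() code, and treating consume like acquire is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the C11 memory orders; not GCC's enum memmodel.  */
enum order
{
  ORDER_RELAXED,
  ORDER_CONSUME,
  ORDER_ACQUIRE,
  ORDER_RELEASE,
  ORDER_ACQ_REL,
  ORDER_SEQ_CST
};

/* Hypothetical helper: return the AMO ordering suffix for a memory order.  */
static const char *
amo_ordering_suffix (enum order o)
{
  switch (o)
    {
    case ORDER_RELAXED:
      return "";
    case ORDER_CONSUME:   /* Assumption: handled like acquire.  */
    case ORDER_ACQUIRE:
      return ".aq";
    case ORDER_RELEASE:
      return ".rl";
    case ORDER_ACQ_REL:
    case ORDER_SEQ_CST:
      return ".aqrl";
    }
  assert (!"unknown memory order");
  return "";
}
```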
A previous patch ensured that the proper memory ordering suffixes
for AMOs are emitted. Therefore there is no reason to keep the fence
generation mechanism for release operations.
gcc/
PR 100265
* config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
Remove fu
We don't have any special treatment of MEMMODEL_SYNC_* values,
so let's hide them behind the memmodel_base() function.
gcc/
PR 100265
* config/riscv/riscv.c (riscv_memmodel_needs_amo_acquire):
Ignore MEMMODEL_SYNC_* values.
* config/riscv/riscv.c (riscv_memmod
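To illustrate what hiding the MEMMODEL_SYNC_* values behind a base function means, here is a standalone sketch. The SK_* names, the flag value and sk_memmodel_base are assumptions modelled on the idea of a "sync" flag bit on top of the base memory model; they are not copied from GCC's headers.

```c
#include <assert.h>

/* Hypothetical encoding: base models in the low bits, plus a "sync" flag.  */
#define SK_SYNC_FLAG (1u << 15)
#define SK_BASE_MASK (SK_SYNC_FLAG - 1u)

enum sk_memmodel
{
  SK_RELAXED = 0,
  SK_CONSUME = 1,
  SK_ACQUIRE = 2,
  SK_RELEASE = 3,
  SK_ACQ_REL = 4,
  SK_SEQ_CST = 5,
  SK_SYNC_ACQUIRE = SK_ACQUIRE | SK_SYNC_FLAG,
  SK_SYNC_RELEASE = SK_RELEASE | SK_SYNC_FLAG,
  SK_SYNC_SEQ_CST = SK_SEQ_CST | SK_SYNC_FLAG
};

/* Strip the sync flag so callers only ever see the base model.  */
static enum sk_memmodel
sk_memmodel_base (unsigned int model)
{
  unsigned int base = model & SK_BASE_MASK;
  assert (base <= SK_SEQ_CST);
  return (enum sk_memmodel) base;
}
```

With such a helper, a backend that gives the sync variants no special treatment can switch on the base value alone.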
Using amoswap as atomic store is not an expected optimization
and most likely causes a performance penalty.
Neither SW nor HW have a benefit from this optimization,
so let's simply drop it.
gcc/
PR 100265
* config/riscv/sync.md (atomic_store):
Remove.
---
gcc/config/
mem_thread_fence gets the desired memory model as operand.
Let's emit fences according to this value (as defined in section
"Code Porting and Mapping Guidelines" of the unpriv spec).
gcc/
PR 100265
* config/riscv/sync.md (mem_thread_fence):
Emit fences according t
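The mapping referred to above can be sketched as a small lookup. The fence strings follow the C/C++ mapping table in the "Code Porting and Mapping Guidelines" section of the unprivileged spec as I understand it; whether the patch emits exactly these instructions (fence.tso in particular) is not visible in this excerpt, and the enum and the function thread_fence_for are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the C11 memory orders; not GCC's enum memmodel.  */
enum order
{
  ORDER_RELAXED,
  ORDER_ACQUIRE,
  ORDER_RELEASE,
  ORDER_ACQ_REL,
  ORDER_SEQ_CST
};

/* Hypothetical helper: return the fence for atomic_thread_fence (o),
   or NULL when no instruction is needed.  */
static const char *
thread_fence_for (enum order o)
{
  switch (o)
    {
    case ORDER_RELAXED:
      return NULL;          /* No fence required.  */
    case ORDER_ACQUIRE:
      return "fence r,rw";
    case ORDER_RELEASE:
      return "fence rw,w";
    case ORDER_ACQ_REL:
      return "fence.tso";
    case ORDER_SEQ_CST:
      return "fence rw,rw";
    }
  assert (!"unknown memory order");
  return NULL;
}
```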
A recent commit introduced a mechanism to emit proper fences
for RISC-V. Additionally, we already have emit_move_insn ().
Let's reuse this code and provide atomic_load and
atomic_store for RISC-V (as defined in section
"Code Porting and Mapping Guidelines" of the unpriv spec).
Note that this works
In order to emit LR/SC sequences, let's provide INSNs, which
take care of memory ordering constraints.
gcc/
PR 100266
* config/riscv/sync.md (UNSPEC_LOAD_RESERVED): New.
* config/riscv/sync.md (UNSPEC_STORE_CONDITIONAL): New.
* config/riscv/sync.md (riscv_load_r
The current model of the LR and SC INSNs requires a sign-extension
to use the generated SImode value for conditional branches, which
only operate on XLEN registers.
However, the sign-extension is actually not required in either case,
therefore this patch introduces additional INSNs that consume
the
On RISC-V, our conditional branches
require Pmode conditions. Currently, we generate them explicitly
with a check for Pmode and then call the proper generator
(i.e. gen_cbranchdi4 on RV64 and gen_cbranchsi4 on RV32).
Let's simplify this code by using gen_cbranch4
The existing CAS implementation uses an INSN definition that provides
the core LR/SC sequence. In addition, there is follow-up code
that evaluates the results and calculates the return values.
This has two drawbacks: a) an extension to sub-word CAS implementations
is not possible (eve
On RISC-V, our conditional branches
require Pmode conditions. Currently, we generate them explicitly
with a check for Pmode and then call the proper generator
(i.e. gen_cbranchdi4 on RV64 and gen_cbranchsi4 on RV32).
Let's simplify this code by generating the INSN hel
This series provides a cleanup of the current atomics implementation
of RISC-V:
* PR100265: Use proper fences for atomic load/store
* PR100266: Provide programmatic implementation of CAS
As both are closely related, I merged the patches into one series.
The first patch could be squashed into the fo
We don't have any special treatment of MEMMODEL_SYNC_* values,
so let's hide them behind the memmodel_base() function.
gcc/
PR 100265
* config/riscv/riscv.c (riscv_memmodel_needs_amo_acquire):
Ignore MEMMODEL_SYNC_* values.
* config/riscv/riscv.c (riscv_memmod
The ratified A extension supports '.aq', '.rl' and '.aqrl' as
memory ordering suffixes. Let's emit them in case we get a '%A'
conversion specifier for riscv_print_operand().
As '%A' was already used for a similar, but restricted, purpose
(only '.aq' was emitted so far), this does not require any o
A previous patch ensured that the proper memory ordering suffixes
for AMOs are emitted. Therefore there is no reason to keep the fence
generation mechanism for release operations.
gcc/
PR 100265
* config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
Remove fu
Using AMOSWAP as atomic store does not allow us to do sub-word accesses.
Further, it is not consistent with our atomic_load () implementation.
The benefit of AMOSWAP is that the resulting code sequence will be
smaller (compared to FENCE+STORE); however, this does not
make up for the lack of sub-
mem_thread_fence gets the desired memory model as operand.
Let's emit fences according to this value (as defined in section
"Code Porting and Mapping Guidelines" of the unpriv spec).
gcc/
PR 100265
* config/riscv/sync.md (mem_thread_fence):
Emit fences according t
A recent commit introduced a mechanism to emit proper fences
for RISC-V. Additionally, we already have emit_move_insn ().
Let's reuse this code and provide atomic_load and
atomic_store for RISC-V (as defined in section
"Code Porting and Mapping Guidelines" of the unpriv spec).
Note that this works
In order to emit LR/SC sequences, let's provide INSNs, which
take care of memory ordering constraints.
gcc/
PR 100266
* config/riscv/sync.md (UNSPEC_LOAD_RESERVED): New.
* config/riscv/sync.md (UNSPEC_STORE_CONDITIONAL): New.
* config/riscv/sync.md (riscv_load_r
The current model of the LR and SC INSNs requires a sign-extension
to use the generated SImode value for conditional branches, which
only operate on XLEN registers.
However, the sign-extension is actually not required in either case,
therefore this patch introduces additional INSNs that consume
the
The existing CAS implementation uses an INSN definition that provides
the core LR/SC sequence. In addition, there is follow-up code
that evaluates the results and calculates the return values.
This has two drawbacks: a) an extension to sub-word CAS implementations
is not possible (eve
Atomic instructions require zero-offset memory addresses.
If we allow all addresses, the nonzero-offset addresses will
be prepared in an extra register in an extra instruction before
the actual atomic instruction.
This patch introduces the predicate "riscv_sync_memory_operand",
which restricts the
Move the check for register targets (i.e. REG_P ()) into the function
get_sub_rtx () and change the restriction of REE to "only one child of
a PARALLEL expression is a SET register expression" (was "only one child of
a PARALLEL expression is a SET expression").
This allows handling more PARALLEL
This series provides a cleanup of the current atomics implementation
of RISC-V (PR100265: Use proper fences for atomic load/store).
The first patch could be squashed into the following patches,
but I found it easier to understand the changes with it in place.
The series has been tested as follows
We don't have any special treatment of MEMMODEL_SYNC_* values,
so let's hide them behind the memmodel_base() function.
gcc/
PR 100265
* config/riscv/riscv.c (riscv_memmodel_needs_amo_acquire):
Ignore MEMMODEL_SYNC_* values.
* config/riscv/riscv.c (riscv_memmod
The ratified A extension supports '.aq', '.rl' and '.aqrl' as
memory ordering suffixes. Let's emit them in case we get a '%A'
conversion specifier for riscv_print_operand().
As '%A' was already used for a similar, but restricted, purpose
(only '.aq' was emitted so far), this does not require any o
A previous patch ensured that the proper memory ordering suffixes
for AMOs are emitted. Therefore there is no reason to keep the fence
generation mechanism for release operations.
gcc/
PR 100265
* config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
Remove fu
Using AMOSWAP as atomic store does not allow us to do sub-word accesses.
Further, it is not consistent with our atomic_load () implementation.
The benefit of AMOSWAP is that the resulting code sequence will be
smaller (compared to FENCE+STORE); however, this does not
make up for the lack of sub-
mem_thread_fence gets the desired memory model as operand.
Let's emit fences according to this value (as defined in section
"Code Porting and Mapping Guidelines" of the unpriv spec).
gcc/
PR 100265
* config/riscv/sync.md (mem_thread_fence):
Emit fences according t
A recent commit introduced a mechanism to emit proper fences
for RISC-V. Additionally, we already have emit_move_insn ().
Let's reuse this code and provide atomic_load and
atomic_store for RISC-V (as defined in section
"Code Porting and Mapping Guidelines" of the unpriv spec).
Note that this works
In order to emit LR/SC sequences, let's provide INSNs, which
take care of memory ordering constraints.
gcc/
PR 100266
* config/riscv/sync.md (UNSPEC_LOAD_RESERVED): New.
* config/riscv/sync.md (UNSPEC_STORE_CONDITIONAL): New.
* config/riscv/sync.md (riscv_load_r
The current model of the LR and SC INSNs requires a sign-extension
to use the generated SImode value for conditional branches, which
only operate on XLEN registers.
However, the sign-extension is actually not required in either case,
therefore this patch introduces additional INSNs that consume
the
Atomic instructions require zero-offset memory addresses.
If we allow all addresses, the nonzero-offset addresses will
be prepared in an extra register in an extra instruction before
the actual atomic instruction.
This patch introduces the predicate "riscv_sync_memory_operand",
which restricts the
This patch adds support for the Zawrs ISA extension.
The patch depends on the corresponding Binutils patch
to be usable (see [1]).
The specification can be found here:
https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc
Note that the Zawrs extension is not frozen or ratified yet.
Therefore
40 matches