cated and
unpredicated insns
Andre Vieira (4):
doloop: Add support for predicated vectorized loops
arm: Annotate instructions with mve_safe_imp_xlane_pred
arm: Fix a wrong attribute use and remove unused unspecs and iterators
arm: Add support for MVE Tail-Predicated Low Overhead Loops
--
2.17.1
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
The doloop_condition_get function is used to validate that the 'transformed'
jump instruction is one of the c
This patch adds an attribute to the mve md patterns to be able to identify
predicable MVE instructions and what their predicated and unpredicated variants
are. This attribute is used to encode the icode of the unpredicated variant of
an instruction in its predicated variant.
This will make it po
This patch annotates some MVE across lane instructions with a new attribute.
We use this attribute to let the compiler know that these instructions can be
safely implicitly predicated when tail predicating if their operands are
guaranteed to have zeroed tail predicated lanes. These instructions w
This patch fixes the erroneous use of a mode attribute without a mode iterator
in the pattern and removes unused unspecs and iterators.
gcc/ChangeLog:
* config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
VMLALDAVAXQ_U cases.
(VMLALDAVXQ): Remove iterator
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
LR by 8 in each iteration.
Stam Markianos-Wright (1):
arm: Add define_attr to create a mapping between MVE predicated and
unpredicated insns
Andre Vieira (4):
doloop: Add support for predicated vectorized loops
arm: Annotate instructions with mve_safe_imp_xlane_pred
arm: Fix a wrong
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
gcc/ChangeLog:
* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_
This patch annotates some MVE across lane instructions with a new attribute.
We use this attribute to let the compiler know that these instructions can be
safely implicitly predicated when tail predicating if their operands are
guaranteed to have zeroed tail predicated lanes. These instructions w
This patch fixes the erroneous use of a mode attribute without a mode iterator
in the pattern and removes unused unspecs and iterators.
gcc/ChangeLog:
* config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
VMLALDAVAXQ_U cases.
(VMLALDAVXQ): Remove iterator
This patch adds an attribute to the mve md patterns to be able to identify
predicable MVE instructions and what their predicated and unpredicated variants
are. This attribute is used to encode the icode of the unpredicated variant of
an instruction in its predicated variant.
This will make it po
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
converted the VPT-predicated instructions into their
unpredicated equivalents (which also saves us from VPST insns).
The LE instruction here decrements LR by 8 in each iteration.
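As a rough illustration of the loop shape described above, the following scalar C model mimics a tail-predicated loop: a counter standing in for LR drops by 8 per iteration, and only the lanes still covered by the remaining element count are written. The function name, the 8-lane width (16-bit elements in a 128-bit vector) and the add operation are assumptions made for this sketch; none of them come from the patch itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed lane count: 16-bit elements in a 128-bit vector.  */
enum { LANES = 8 };

/* Hypothetical scalar model of a tail-predicated loop computing
   dst[i] = a[i] + b[i].  Returns the number of iterations executed.  */
static size_t
tail_predicated_add (int16_t *dst, const int16_t *a, const int16_t *b,
                     size_t n)
{
  size_t iterations = 0;
  size_t remaining = n;   /* Stands in for the element count held in LR.  */

  for (size_t base = 0; remaining > 0; base += LANES)
    {
      /* Lanes past the end of the data are inactive.  */
      size_t active = remaining < LANES ? remaining : LANES;
      for (size_t lane = 0; lane < active; lane++)
        dst[base + lane] = (int16_t) (a[base + lane] + b[base + lane]);

      /* The counter drops by a full vector per iteration, stopping at 0.  */
      remaining = remaining > LANES ? remaining - LANES : 0;
      iterations++;
    }
  return iterations;
}
```

In this model no separate scalar epilogue is needed for the leftover elements, which is the point of tail predication.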
Stam Markianos-Wright (1):
arm: Add define_attr to create a mapping between MVE predicated and
unpredicated in
This patch annotates some MVE across lane instructions with a new attribute.
We use this attribute to let the compiler know that these instructions can be
safely implicitly predicated when tail predicating if their operands are
guaranteed to have zeroed tail predicated lanes. These instructions w
This patch fixes the erroneous use of a mode attribute without a mode iterator
in the pattern and removes unused unspecs and iterators.
gcc/ChangeLog:
* config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
VMLALDAVAXQ_U cases.
(VMLALDAVXQ): Remove iterator
This patch adds an attribute to the mve md patterns to be able to identify
predicable MVE instructions and what their predicated and unpredicated variants
are. This attribute is used to encode the icode of the unpredicated variant of
an instruction in its predicated variant.
This will make it po
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
gcc/ChangeLog:
* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
Hi,
Reworked the patches according to Kyrill's comments, made some other
non-functional changes and rebased.
Reposting as v3 so patchworks picks them up and runs the necessary testing.
Andre Vieira (2):
arm: Add define_attr to create a mapping between MVE predicated and
unpredicated
Respin after comments from Kyrill and rebase. I also removed an if-then-else
construct in arm_mve_check_reg_origin_is_num_elems similar to the other
functions Kyrill pointed out.
After an earlier comment from Richard Sandiford I also added comments to the
two tail predication patterns added to e
Reposting for testing purposes, no changes from v2 (other than rebase).
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 2a2207c0ba1..449e6935b32 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2375,6 +2375,21 @@ extern int making_const_table;
else if (TARGET_THUM
Hi,
This patch series adds support for _BitInt for AArch64 when compiling for
Little Endian. The first patch in the series fixes an issue that arises with
support for AArch64, the second patch adds the backend support for it.
Andre Vieira (2):
bitint: Use TARGET_ARRAY_MODE for large bitints
This patch ensures we use TARGET_ARRAY_MODE to determine the storage mode of
large bitints that are represented as arrays in memory. This is required to
support such bitints for aarch64 and potential other targets with similar
bitint specifications. Existing tests like gcc.dg/torture/bitint-25.c
This patch adds support for C23's _BitInt for the AArch64 port when compiling
for little endianness. Big Endianness requires further target-agnostic
support and we therefore disable it for now.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (TARGET_C_BITINT_TYPE_INFO): Declare MACRO.
specify the mangling.
Given that the target agnostic changes are minimal, have been suggested before
and have no impact on other targets, the target specific parts have been
reviewed before, would this still be acceptable for Stage 4? I would really
like to make use of the work that was done to su
This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
target can reject a simd_clone based on the vector mode it is using.
This is needed because for VLS SVE vectorization the vectorizer accepts
Advanced SIMD simd clones when vectorizing using SVE types because the simdlens
mi
The current codegen code to support VFs that are multiples of a simdclone
simdlen relies on BIT_FIELD_REF to create multiple input vectors. This does not
work for non-constant simdclones, so we should disable using such clones when
the VF is a multiple of the non-constant simdlen until we change t
This patch finalizes adding support for the generation of SVE simd clones when
no simdlen is provided, following the ABI rules where the widest data type
determines the minimum amount of elements in a length agnostic vector.
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (add_sve_type_
Resending series to make use of the Linaro pre-commit CI in patchworks.
Andre Vieira (2):
arm: Add define_attr to create a mapping between MVE predicated and
unpredicated insns
arm: Add support for MVE Tail-Predicated Low Overhead Loops
--
2.17.1
Re-sending Stam's first patch, same as:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635301.html
Hopefully patchworks can pick this up :)
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index a9c2752c0ea..0b0e8620717 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.
Reworked Stam's patch after comments in:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640362.html
The original gcc ChangeLog remains unchanged, but I did split up some tests so
here is the testsuite ChangeLog.
gcc/testsuite/ChangeLog:
* gcc.target/arm/lob.h: Update framework
Hi,
Resending series version 2 addressing comments on first version, also moved
parts of the first patch to the second so it can be built without the second
patch.
Andre Vieira (2):
arm: Add define_attr to create a mapping between MVE predicated and
unpredicated insns
arm: Add support
Respin of first version to address comments and make it buildable on its own.
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index a9c2752c0ea..f0b01b7461f 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2375,6 +2375,21 @@ extern int making_const_table;
else if (TARGE
Respin after comments on first version.
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 2f5ca79ed8d..4f164c54740 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -65,8 +65,8 @@ extern void arm_emit_speculation_barrier_function (void);
exter
Hi,
We held these two patches back in stage 4 because they touched
target-agnostic code, though I am quite confident they will not affect other
targets. Given stage one has reopened, I am reposting them, I rebased them but
they seem to apply cleanly on trunk.
OK for trunk?
Andre Vieira
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
gcc/ChangeLog:
* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
reviewed patches.
OK for trunk?
Andre Vieira (2):
doloop: Add support for predicated vectorized loops
arm: Add support for MVE Tail-Predicated Low Overhead Loops
gcc/config/arm/arm-protos.h |4 +-
gcc/config/arm/arm.cc | 1249
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
gcc/ChangeLog:
* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
On 03/09/15 12:11, Andre Vieira wrote:
On 01/09/15 15:01, Richard Biener wrote:
On Tue, Sep 1, 2015 at 3:40 PM, Andre Vieira
wrote:
Hi Marc,
On 28/08/15 19:07, Marc Glisse wrote:
(not a review, I haven't even read the whole patch)
On Fri, 28 Aug 2015, Andre Vieira wrote:
2015-
On 17/09/15 10:46, Richard Biener wrote:
On Thu, Sep 3, 2015 at 1:11 PM, Andre Vieira
wrote:
On 01/09/15 15:01, Richard Biener wrote:
On Tue, Sep 1, 2015 at 3:40 PM, Andre Vieira
wrote:
Hi Marc,
On 28/08/15 19:07, Marc Glisse wrote:
(not a review, I haven't even read the whole
Ping.
On 11/09/15 18:15, Andre Vieira wrote:
Conditional branches have a maximum range of [-1048576, 1048572]. Any
destination further away cannot be reached by these.
To be able to have conditional branches in very large functions, we
invert the condition and change the destination to jump
On 25/09/15 12:42, Richard Biener wrote:
On Fri, Sep 25, 2015 at 1:30 PM, Andre Vieira
wrote:
On 17/09/15 10:46, Richard Biener wrote:
On Thu, Sep 3, 2015 at 1:11 PM, Andre Vieira
wrote:
On 01/09/15 15:01, Richard Biener wrote:
On Tue, Sep 1, 2015 at 3:40 PM, Andre Vieira
wrote:
Hi
check passed for armv6-m.
libgcc/ChangeLog:
2015-08-10 Hale Wang
Andre Vieira
* config/arm/lib1funcs.S: Add new wrapper.
From 832a3d6af6f06399f70b5a4ac3727d55960c93b7 Mon Sep 17 00:00:00 2001
From: Andre Simoes Dias Vieira
Date: Fri, 21 Aug 2015 14:23:28 +0100
Subject: [PATCH
Hi,
This patch addresses PR-67948 by changing the xor-and.c test, initially
written for a simplify-rtx pattern, to make it pass post r228661 (see
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00676.html). This test no
longer triggered the simplify-rtx pattern it was written for prior to
r2286
On 20/10/15 17:25, Ramana Radhakrishnan wrote:
On Tue, Oct 20, 2015 at 4:52 PM, Andre Vieira
wrote:
Hi,
This patch addresses PR-67948 by changing the xor-and.c test, initially
written for a simplify-rtx pattern, to make it pass post r228661 (see
https://gcc.gnu.org/ml/gcc-patches/2015-10
Ping.
BR,
Andre
On 13/10/15 18:01, Andre Vieira wrote:
This patch ports the aeabi_idiv routine from Linaro Cortex-Strings
(https://git.linaro.org/toolchain/cortex-strings.git), which was
contributed by ARM under Free BSD license.
The new aeabi_idiv routine is used to replace the one in
libgcc
as at link time it knows this global will never be read. By adding a
read of the global, LTO will no longer optimize it away.
Tested by running regressions for this testcase for various ARM targets.
Is this OK to commit?
Thanks,
Andre Vieira
gcc/testsuite/ChangeLog:
2015-11-06 Andre
the generation of FP instructions.
Tested by running regressions for this testcase for various ARM targets.
Is this OK to commit?
Thanks,
Andre Vieira
gcc/testsuite/ChangeLog:
2015-11-06 Andre Vieira
* gcc.target/arm/memset-inline-10.c: Added
dg-require-effective
On 12/11/15 15:08, Andre Vieira wrote:
Hi,
This patch changes the memset-inline-10.c testcase to make sure that
it is only compiled for ARM targets that support -mfloat-abi=hard using
the fact that all non-thumb1 targets do.
This is correct because all targets for which -mthumb causes
rofile that supports neon
and the current test is not sufficient to exclude armv7-m.
Tested by running regressions for this testcase for various ARM targets.
Is this OK to commit?
Thanks,
Andre Vieira
gcc/testsuite/ChangeLog:
2015-11-06 Andre Vieira
* gcc/testsuite/lib
On 13/11/15 10:34, Richard Biener wrote:
On Thu, Nov 12, 2015 at 4:07 PM, Andre Vieira
wrote:
Hi,
This patch changes this testcase to make sure LTO will not optimize away
the assignment of the local array to a global variable which was introduced
to make sure stack space was made available
On 16/11/15 12:07, James Greenhalgh wrote:
On Mon, Nov 16, 2015 at 10:49:11AM +, Andre Vieira wrote:
Hi,
This patch changes the target support mechanism to make it
recognize any ARM 'M' profile as a non-neon supporting target. The
current check only tests for armv6 archite
On 16/11/15 13:33, Richard Biener wrote:
On Mon, Nov 16, 2015 at 12:43 PM, Andre Vieira
wrote:
On 13/11/15 10:34, Richard Biener wrote:
On Thu, Nov 12, 2015 at 4:07 PM, Andre Vieira
wrote:
Hi,
This patch changes this testcase to make sure LTO will not optimize
away
the assignment of
On 16/11/15 15:34, Joern Wolfgang Rennecke wrote:
I just happened to stumble on this problem with another port.
The volatile & test solution doesn't work, though.
What does work, however, is:
__asm__ ("" : : "" (dummy));
I can confirm that Joern's solution works for me too.
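A minimal sketch of the idea discussed above, assuming a GCC-compatible compiler: an empty asm statement that takes the buffer as an input operand, so the optimizers must treat the buffer as used. The helper names, the "r" constraint and the "memory" clobber are choices made for this sketch; the constraint in the quoted snippet did not survive in this listing.

```c
#include <string.h>

/* Hypothetical helper: emits no instructions, but the compiler must
   assume the memory behind P may be read here.  The "r" constraint and
   the "memory" clobber are assumptions of this sketch.  */
static inline void
keep_alive (const void *p)
{
  __asm__ volatile ("" : : "r" (p) : "memory");
}

/* Fill a local array and keep it from being optimized away.  */
static int
fill_and_read (void)
{
  char dummy[64];
  memset (dummy, 42, sizeof dummy);
  keep_alive (dummy);
  return dummy[10];
}
```

Unlike assigning the array to a global, this does not depend on whether link-time optimization can prove that the global is never read.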
This series is aimed at backporting algorithmic optimizations and a
change to a test it affects from trunk to the embedded-5-branch.
Andre Vieira(2):
Backporting algorithmic optimization in match and simplify
Backporting fix for PR-67948.
15-08/msg01493.html (for the first
optimization and changes to second and third)
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00517.html (the addition
of the original second and third optimizations)
gcc/ChangeLog:
2015-11-27 Andre Vieira
Backport from mainline:
2015-10-09 Andre Vieira
This patch backports the fix for PR-67948 from trunk to the
embedded-5-branch.
The original patch is at:
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02193.html
Tested for Cortex-M3.
Is this OK to commit?
Thanks,
Andre
gcc/testsuite/ChangeLog
2015-10-27 Andre Vieira
Backport from
On 17/11/15 12:29, Bernd Schmidt wrote:
On 11/16/2015 04:48 PM, Andre Vieira wrote:
On 16/11/15 15:34, Joern Wolfgang Rennecke wrote:
I just happened to stumble on this problem with another port.
The volatile & test solution doesn't work, though.
What does work, however, is:
On 17/11/15 10:10, James Greenhalgh wrote:
On Mon, Nov 16, 2015 at 01:15:32PM +, Andre Vieira wrote:
On 16/11/15 12:07, James Greenhalgh wrote:
On Mon, Nov 16, 2015 at 10:49:11AM +, Andre Vieira wrote:
Hi,
This patch changes the target support mechanism to make it
recognize any
Hi Kyrill
On 20/11/15 11:51, Kyrill Tkachov wrote:
Hi Andre,
On 18/11/15 09:44, Andre Vieira wrote:
On 17/11/15 10:10, James Greenhalgh wrote:
On Mon, Nov 16, 2015 at 01:15:32PM +, Andre Vieira wrote:
On 16/11/15 12:07, James Greenhalgh wrote:
On Mon, Nov 16, 2015 at 10:49:11AM +
On 12/11/15 15:16, Andre Vieira wrote:
On 12/11/15 15:08, Andre Vieira wrote:
Hi,
This patch changes the memset-inline-10.c testcase to make sure that
it is only compiled for ARM targets that support -mfloat-abi=hard using
the fact that all non-thumb1 targets do.
This is correct because
On 17/11/15 16:30, Andre Vieira wrote:
On 17/11/15 12:29, Bernd Schmidt wrote:
On 11/16/2015 04:48 PM, Andre Vieira wrote:
On 16/11/15 15:34, Joern Wolfgang Rennecke wrote:
I just happened to stumble on this problem with another port.
The volatile & test solution doesn't work, thou
, far away, destination.
gcc/ChangeLog:
2015-08-07 Ramana Radhakrishnan
Andre Vieira
* config/aarch64/aarch64.md (*condjump): Handle functions > 1
Mib.
(*cb1): Idem.
(*tb1): Idem.
(*cb1): Idem.
* config/aarch64/iterators.md (inv
On 25/08/15 10:52, Andrew Pinski wrote:
On Tue, Aug 25, 2015 at 5:50 PM, Andrew Pinski wrote:
On Tue, Aug 25, 2015 at 5:37 PM, Andre Vieira
wrote:
Conditional branches have a maximum range of [-1048576, 1048572]. Any
destination further away cannot be reached by these.
To be able to have
o combine constants, reducing
run-time operations.
The two examples above would be transformed into (X << 24) ^ 0x
and (X >> 1) ^ 0xa001 respectively.
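The constant combining can be sketched in C as follows. Because XOR is associative, two XOR constants applied after a shift can be merged into one at compile time, saving one run-time operation. The constants C1 and C2 and the function names are hypothetical, since the constants in the original message are truncated here.

```c
#include <stdint.h>

/* Hypothetical constants, chosen only for illustration.  */
#define C1 UINT32_C(0x00ff0000)
#define C2 UINT32_C(0x0000a001)

/* Before the transformation: two XORs at run time.  */
static uint32_t
before_fold (uint32_t x)
{
  return ((x << 24) ^ C1) ^ C2;
}

/* After the transformation: C1 ^ C2 is a single compile-time constant.  */
static uint32_t
after_fold (uint32_t x)
{
  return (x << 24) ^ (C1 ^ C2);
}
```

Both functions return the same value for every input.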
gcc/ChangeLog:
2015-08-03 Andre Vieira
* match.pd: Added new patterns:
((X {&,<<,>>
Hi Marc,
On 28/08/15 19:07, Marc Glisse wrote:
(not a review, I haven't even read the whole patch)
On Fri, 28 Aug 2015, Andre Vieira wrote:
2015-08-03 Andre Vieira
* match.pd: Added new patterns:
((X {&,<<,>>} C0) {|,^} C1) {^,|} C2)
(X {|,^,&} C0) {&
On 01/09/15 17:54, Marek Polacek wrote:
On Tue, Sep 01, 2015 at 12:50:27PM -0400, David Malcolm wrote:
I can't comment on the patch itself, but I noticed that in the testsuite
addition, you've gathered all the "dg-final" clauses at the end.
I think that this is consistent with existing practi
On 01/09/15 15:01, Richard Biener wrote:
On Tue, Sep 1, 2015 at 3:40 PM, Andre Vieira
wrote:
Hi Marc,
On 28/08/15 19:07, Marc Glisse wrote:
(not a review, I haven't even read the whole patch)
On Fri, 28 Aug 2015, Andre Vieira wrote:
2015-08-03 Andre Vieira
* match.pd: Adde
, far away, destination.
This patch backports the fix from trunk to the gcc-5-branch.
The original patch is at:
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01493.html
gcc/ChangeLog:
2015-09-09 Andre Vieira
Backport from mainline:
2015-08-27 Ramana Radhakrishnan
Andre
On 31/12/15 20:54, Joseph Myers wrote:
On Sat, 26 Dec 2015, Thomas Preud'homme wrote:
+#define CMSE_TT_ASM(flags) \
+{ \
+ cmse_address_info_t result; \
+ __asm__ ("tt" # flags " %0,%1" \
+ : "=r"(result) \
+ : "r"(p) \
+ : "memory"); \
+ return result; \
Are th
On 27/11/15 14:28, Andre Vieira wrote:
On 12/11/15 15:16, Andre Vieira wrote:
On 12/11/15 15:08, Andre Vieira wrote:
Hi,
This patch changes the memset-inline-10.c testcase to make sure that
it is only compiled for ARM targets that support -mfloat-abi=hard using
the fact that all non-thumb1
This patch splits out FCMA as a feature from Armv8.3-A and adds it as a separate
feature bit which now controls 'TARGET_COMPLEX'.
gcc/ChangeLog:
* config/aarch64/aarch64-arches.def (FCMA): New feature bit, can not be
used as an extension in the command-line.
* config/aarc
As per the AArch64 ISA FEAT_SME does not require FEAT_SVE2, so we are removing
that false dependency in GCC. However, we chose for now to not support this
combination of features and will diagnose the combination of FEAT_SME without
FEAT_SVE2 as unsupported by GCC. We may choose to support this
port this combination we should investigate these.
The patch series also refactors the FCMA/COMPNUM/TARGET_COMPLEX feature to
separate it from Armv8.3-A feature set.
Andre Vieira (2)
aarch64: Split FCMA feature bit from Armv8.3-A
aarch64: remove SVE2 requirement from SME and diagnose i
After the changes to the vctp intrinsic, codegen changed slightly: we now
unfortunately seem to be generating unneeded moves and extends of the mask.
These are however not incorrect and we don't have a fix for the unneeded
codegen right now, so changing the testcase to accept them so we can c
had gone by
unnoticed until now.
Only tested on arm-none-eabi mve.exp=dlstp*. OK for trunk?
Andre Vieira (3):
arm, mve: Fix scan-assembler for test7 in dlstp-compile-asm-2.c
arm, mve: Pass -std=c99 to dlstp-loop-form.c to avoid new warning
arm, mve: Detect uses of vctp_vpr_generated inside
This fixes a testism introduced by the warning produced with the -std=c23
default. The testcase is a reduced piece of code meant to trigger an ICE, so
there's little value in trying to change the code itself.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/dlstp-loop-form.c: Add -std=c99
Address a problem where we were failing to detect uses of
vctp_vpr_generated in the analysis for 'arm_attempt_dlstp_transform' because
the use was inside a SUBREG and rtx_equal_p does not catch that. Using
reg_overlap_mentioned_p is much more robust.
gcc/ChangeLog:
* g
After the changes to the vctp intrinsic, codegen changed slightly: we now
unfortunately seem to be generating unneeded moves and extends of the mask.
These are however not incorrect and we don't have a fix for the unneeded
codegen right now, so changing the testcase to accept them so we can c
Address a problem where we were failing to detect uses of
vctp_vpr_generated in the analysis for 'arm_attempt_dlstp_transform' because
the use was inside a SUBREG and rtx_equal_p does not catch that. Using
reg_overlap_mentioned_p is much more robust.
gcc/ChangeLog:
PR
Hi,
This rejects any loops where any predicated instruction comes before the vctp
that generates the loop predicate. Even though this is not a requirement for
dlstp transformation we have found potential issues where you can end up with a
wrong transformation, so it is safer to reject such loops.
On 31/08/2023 07:39, Richard Biener wrote:
On Wed, Aug 30, 2023 at 5:02 PM Andre Vieira (lists)
wrote:
On 30/08/2023 14:01, Richard Biener wrote:
On Wed, Aug 30, 2023 at 11:15 AM Andre Vieira (lists) via Gcc-patches
wrote:
This patch adds a machine_mode parameter to the
Hi Honza,
My current patch set for AArch64 VLA omp codegen started failing on
gcc.dg/gomp/pr87898.c after this. I traced it back to
'move_sese_region_to_fn' in tree/cfg.cc not setting count for the bb
created.
I was able to 'fix' it locally by setting the count of the new bb to the
accumula
On 30/08/2023 14:04, Richard Biener wrote:
On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:
This patch adds a new target hook to enable us to adapt the types of return
and parameters of simd clones. We use this in two ways, the first one is to
make sure we can create valid SVE types
On 04/10/2023 11:41, Richard Biener wrote:
On Wed, 4 Oct 2023, Andre Vieira (lists) wrote:
On 30/08/2023 14:04, Richard Biener wrote:
On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:
This patch adds a new target hook to enable us to adapt the types of return
and parameters of simd
Hey,
Just a minor update to the patch, I had missed the libgomp testsuite, so
had to make some adjustments there too.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (lane_size): New function.
(aarch64_simd_clone_compute_vecsize_and_simdlen): Determine
simdlen according to NDS rul
So OK to commit this?
This patch makes sure the profile_count information is initialized for
the new
bb created in move_sese_region_to_fn.
gcc/ChangeLog:
* tree-cfg.cc (move_sese_region_to_fn): Initialize profile_count for
new basic block.
Bootstrapped and regression tested o
vect: Add
TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM with omp: Reorder call for
TARGET_SIMD_CLONE_ADJUST after comments.
Bootstrapped and regression tested the series on
aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu.
Andre Vieira (8):
omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS [NEW]
parloops: Copy
Just posting a rebase for completion.
On 30/08/2023 13:31, Richard Biener wrote:
On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:
SVE simd clones need to be compiled with an SVE target enabled or the
argument types will not be created properly. To achieve this we need to copy
Posting the changed patch for completion, already reviewed.
On 30/08/2023 13:32, Richard Biener wrote:
On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:
Teach parloops how to handle a poly nit and bound ahead of the changes to
enable non-constant simdlen.
Can you use poly_int_tree_p to
helper function.
On 30/08/2023 13:54, Richard Biener wrote:
On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:
The vect_get_smallest_scalar_type helper function was using any argument to a
simd clone call when trying to determine the smallest scalar type that would
be vectorized. This included
Rebased, needs review.
On 30/08/2023 10:13, Andre Vieira (lists) via Gcc-patches wrote:
This patch enables the compiler to use inbranch simdclones when
generating masked loops in autovectorization.
gcc/ChangeLog:
* omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function
Refactor simd clone handling code ahead of support for poly simdlen.
gcc/ChangeLog:
* omp-simd-clone.cc (simd_clone_subparts): Remove.
(simd_clone_init_simd_arrays): Replace simd_clone_subparts with
TYPE_VECTOR_SUBPARTS.
(ipa_simd_modify_function_body): Likewise.
Rebased on top of trunk, minor change to check if loop_vinfo since we
now do some slp vectorization for simd_clones.
I assume the previous OK still holds.
On 30/08/2023 13:54, Richard Biener wrote:
On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:
When analyzing a loop and choosing a
Rebased, no major changes, still needs review.
On 30/08/2023 10:19, Andre Vieira (lists) via Gcc-patches wrote:
This patch finalizes adding support for the generation of SVE simd
clones when no simdlen is provided, following the ABI rules where the
widest data type determines the minimum
ter return
and argument types have been vectorized.
On 04/10/2023 13:40, Andre Vieira (lists) wrote:
On 04/10/2023 11:41, Richard Biener wrote:
On Wed, 4 Oct 2023, Andre Vieira (lists) wrote:
On 30/08/2023 14:04, Richard Biener wrote:
On Wed, 30 Aug 2023, Andre Vieira (lists) wr
On 29/06/18 11:13, David Malcolm wrote:
> On Fri, 2018-06-29 at 10:15 +0200, Richard Biener wrote:
>> On Fri, 22 Jun 2018, Jan Hubicka wrote:
>>
>>> Hi,
>>> this patch adds dumpfile support for dumps that come in multiple
>>> parts. This
>>> is needed for WPA stream-out dump since we stream partit
On 03/07/18 15:15, David Malcolm wrote:
> On Tue, 2018-07-03 at 11:00 +0100, Andre Vieira (lists) wrote:
>> On 29/06/18 11:13, David Malcolm wrote:
>>> On Fri, 2018-06-29 at 10:15 +0200, Richard Biener wrote:
>>>> On Fri, 22 Jun 2018, Jan Hubicka wrote:
>>
1 - 100 of 765 matches