Here is the latest version of the patch. I wasn't sure whether Richard's
'LGTM with...' was meant as a conditional OK, and together with the
changes suggested by Andrew I thought I'd ask again: OK for trunk?
As per the AArch64 ISA FEAT_SME does not require FEAT_SVE2. However, we
don't support
Thanks for the suggestions.
On 14/03/2025 21:43, Andrew Carlotti wrote:
On Thu, Mar 13, 2025 at 05:10:07PM +, Andre Vieira (lists) wrote:
Apologies for the delay, had been waiting on some other relevant patches to
go in to make sure we didn't break any valid existing behaviours. It s
On 14/03/2025 09:59, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
diff --git a/gcc/testsuite/gcc.target/aarch64/no-sve-with-sme-3.c
b/gcc/testsuite/gcc.target/aarch64/no-sve-with-sme-3.c
new file mode 100644
index
00
lane_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c: Likewise.
On 04/10/2024 13:08, Kyrylo Tkachov wrote:
Hi Andre,
On 2 Oct 2024, at 19:13, Andre Vieira wrote:
As per the AArch64 ISA FEAT_SME does not require
Add missing Runtime Library Exception to mve header files to bring them
into line with other similar headers. Not adding it in the first place
was an oversight.
gcc/ChangeLog:
* config/arm/arm_mve.h: Add Runtime Library Exception.
* config/arm/arm_mve_types.h: Likewise.
diff --g
Address a problem where we were failing to detect uses of
vctp_vpr_generated in the analysis for 'arm_attempt_dlstp_transform' because
the use was inside a SUBREG and rtx_equal_p does not catch that. Using
reg_overlap_mentioned_p is much more robust.
gcc/ChangeLog:
PR
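To illustrate the point about SUBREG, here is a minimal toy sketch in C. It does not use GCC's real rtx types: `toy_rtx`, `toy_equal_p` and `toy_reg_mentioned_p` are hypothetical stand-ins invented for this example, and the real reg_overlap_mentioned_p covers more cases than this model. It only shows why a structural equality test misses a register wrapped in a SUBREG while a "mentioned" test that looks through the wrapper finds it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy expression codes; not GCC's rtx codes.  */
enum toy_code { TOY_REG, TOY_SUBREG };

struct toy_rtx
{
  enum toy_code code;
  unsigned regno;               /* Meaningful for TOY_REG only.  */
  const struct toy_rtx *inner;  /* Meaningful for TOY_SUBREG only.  */
};

/* Structural equality: a REG never equals a SUBREG wrapping that REG.  */
static bool
toy_equal_p (const struct toy_rtx *a, const struct toy_rtx *b)
{
  if (a->code != b->code)
    return false;
  if (a->code == TOY_REG)
    return a->regno == b->regno;
  return toy_equal_p (a->inner, b->inner);
}

/* Return true if X refers to register REG, looking through any SUBREG
   wrappers first.  */
static bool
toy_reg_mentioned_p (const struct toy_rtx *reg, const struct toy_rtx *x)
{
  assert (reg->code == TOY_REG);
  while (x->code == TOY_SUBREG)
    x = x->inner;
  return x->regno == reg->regno;
}
```

With a register and a SUBREG of that same register, the equality test says "different" while the mentioned test says "used", which is the kind of miss described above.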
After the changes to the vctp intrinsic, codegen changed slightly: we now
unfortunately seem to be generating unneeded moves and extends of the mask.
These are however not incorrect and we don't have a fix for the unneeded
codegen right now, so changing the testcase to accept them so we can c
Hi Christophe,
On 28/11/2024 17:00, Christophe Lyon wrote:
Hi Andre,
Thanks, the patch LGTM except a minor nit:
/* Using a VPR that gets re-generated within the loop. */
-void test10 (int32_t *a, int32_t *b, int32_t *c, int n)
+void test10a (int32_t *a, int32_t *b, int32_t *c, int n)
[...]
Address a problem where we were failing to detect uses of
vctp_vpr_generated in the analysis for 'arm_attempt_dlstp_transform' because
the use was inside a SUBREG and rtx_equal_p does not catch that. Using
reg_overlap_mentioned_p is much more robust.
gcc/ChangeLog:
* g
This fixes a testism introduced by the warning produced with the -std=c23
default. The testcase is a reduced piece of code meant to trigger an ICE, so
there's little value in trying to change the code itself.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/dlstp-loop-form.c: Add -std=c99
had gone by
unnoticed until now.
Only tested on arm-none-eabi mve.exp=dlstp*. OK for trunk?
Andre Vieira (3):
arm, mve: Fix scan-assembler for test7 in dlstp-compile-asm-2.c
arm, mve: Pass -std=c99 to dlstp-loop-form.c to avoid new warning
arm, mve: Detect uses of vctp_vpr_generated inside
After the changes to the vctp intrinsic, codegen changed slightly: we now
unfortunately seem to be generating unneeded moves and extends of the mask.
These are however not incorrect and we don't have a fix for the unneeded
codegen right now, so changing the testcase to accept them so we can c
Hi,
This rejects any loops where any predicated instruction comes before the vctp
that generates the loop predicate. Even though this is not a requirement for
dlstp transformation we have found potential issues where you can end up with a
wrong transformation, so it is safer to reject such loops.
Hi Christophe,
On 28/11/2024 10:22, Christophe Lyon wrote:
The VCTP instruction creates a Vector Tail Predicate in VPR.P0, based
on the input value, but also constrained by a VPT block (if present),
or if used within a DLSTP/LETP loop.
Therefore we need to inform the compiler that this intrinsi
Hi,
Looks like single_pred ICEs if the basic-block does not have a single
predecessor rather than return NULL, which was what this snippet of code
relied on.
This feels like borderline obvious to me as a fix, but I thought I'd get
it checked by one more person.
Call 'single_pred_p' before 's
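As a rough illustration of the guard described above, here is a self-contained toy model in C. The `toy_block` type and the `toy_*` functions are hypothetical stand-ins written for this sketch, not GCC's basic_block API; the only behaviour taken from the message is that the accessor asserts (rather than returning NULL) when there is not exactly one predecessor, so callers have to check the predicate first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified basic block: only the predecessor list is
   modelled.  */
struct toy_block
{
  struct toy_block **preds;
  size_t n_preds;
};

/* True if BB has exactly one predecessor.  */
static bool
toy_single_pred_p (const struct toy_block *bb)
{
  return bb->n_preds == 1;
}

/* Return the only predecessor of BB.  This asserts instead of returning
   NULL when BB does not have exactly one predecessor.  */
static struct toy_block *
toy_single_pred (const struct toy_block *bb)
{
  assert (toy_single_pred_p (bb));
  return bb->preds[0];
}

/* Guarded lookup: check the predicate first, so that blocks with zero or
   several predecessors yield NULL instead of tripping the assert.  */
static struct toy_block *
toy_single_pred_or_null (const struct toy_block *bb)
{
  return toy_single_pred_p (bb) ? toy_single_pred (bb) : NULL;
}
```

Code that previously relied on a NULL return would call the guarded form (or test the predicate inline) instead of calling the accessor directly.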
On 31/10/2024 08:23, Alexandre Oliva wrote:
On Oct 25, 2024, "Andre Vieira (lists)" wrote:
I have to admit I am not super familiar with long doubles, other than
knowing they are 128-bit FP representations... but bisect has pointed
me to this patch when investigating a reg
On 08/10/2024 17:18, Richard Sandiford wrote:
Andrew Carlotti writes:
This includes +fcma as a dependency of +sve, and means that we can
finally support fcma intrinsics on a64fx.
Also add fcma to the Features list in several cpunative testcases that
incorrectly included sve without fcma.
g
Hey,
I have to admit I am not super familiar with long doubles, other than
knowing they are 128-bit FP representations... but bisect has pointed me
to this patch when investigating a regression on aarch64_be-none-elf for
the libstdc++ testcase: 26_numerics/complex/13450.cc
After some reduct
Sorry for the delay, some other work popped up in between and this had
some latent issues. They should all be addressed now in this new patch.
When not dealing with the special armv8.1-m.main conditional
instructions case make sure it uses the
default_noce_conversion_profitable_p call to dete
Hi,
This looks like an acceptable workaround. We special case behavior that
I'm not sure we can express in ways GCC can understand or will make use
of, whilst at the same time we keep expressing behavior it does
understand and can optimize.
Nice idea!
LGTM, needs maintainer approval though
Hi,
This patch fixes constant folding of BIT_INSERT_EXPR for BYTES_BIG_ENDIAN
targets.
Regression tested on aarch64be-none-elf.
Almost committed this as obvious, but I wanted to double check the
testcase with a maintainer. I decided to not make the test be big-endian
specific, nor to add any
Committed attached patch as obvious.
This obvious patch fixes two warnings introduced with the implementation
of arm_noce_conversion_profitable_p hook.
gcc/ChangeLog:
* config/arm/arm.cc (arm_noce_conversion_profitable_p): Remove unused
argument name.
(arm_is_v81m_cond_
On 07/10/2024 10:15, Christophe Lyon wrote:
On Mon, 7 Oct 2024 at 11:04, Torbjorn SVENSSON wrote:
On 2024-10-07 10:53, Andre Vieira (lists) wrote:
Hi Torbjorn,
2. All the other test cases in the list above: These need to be
adapted to the change introduced in r15-3606-g7d6c6a0d15c to
Hi Torbjorn,
On 07/10/2024 09:08, Torbjorn SVENSSON wrote:
There are 3 test cases that are fixed with these 2 commits, but there is
also a bunch that is marked as new fails.
Looking at the test cases that fail, there are 2 different kinds of
failures.
1. gcc.target/arm/attr_thumb.c: This
Hi,
The patch for 'arm: Fix missed CE optimization for armv8.1-m.main [PR
116444]' introduced regressions with arm targets that used 'noce' before.
This is because it would approve all noce optimisations without using
the default cost check. Not sure why this didn't show up in my original
test
This patch splits out FCMA as a feature from Armv8.3-A and adds it as a separate
feature bit which now controls 'TARGET_COMPLEX'.
gcc/ChangeLog:
* config/aarch64/aarch64-arches.def (FCMA): New feature bit, can not be
used as an extension in the command-line.
* config/aarc
As per the AArch64 ISA FEAT_SME does not require FEAT_SVE2, so we are removing
that false dependency in GCC. However, we chose for now to not support this
combination of features and will diagnose the combination of FEAT_SME without
FEAT_SVE2 as unsupported by GCC. We may choose to support this
port this combination we should investigate these.
The patch series also refactors the FCMA/COMPNUM/TARGET_COMPLEX feature to
separate it from Armv8.3-A feature set.
Andre Vieira (2)
aarch64: Split FCMA feature bit from Armv8.3-A
aarch64: remove SVE2 requirement from SME and diagnose i
Resending as v2 so CI picks it up.
This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter
was making an assumption about the form of what a candidate for a dec_insn
should be.
This dec_insn is the instruction that decreases the loop counter inside a
decrementing loop and we expect it
On 26/09/2024 18:56, Ramana Radhakrishnan wrote:
+/* Helper function to determine whether SEQ represents a sequence of
+ instructions representing the Armv8.1-M Mainline conditional arithmetic
+ instructions: csinc, csneg and csinv. The cinc instruction is generated
+ using a diffe
Hi,
This patch restores missed optimizations for armv8.1-m.main targets that
were missed when the generation of csinc, csinv and csneg were enabled
or the same with patch series containing:
commit c2bb84be4a6e581bbf45891457ee632a07416982
Author: Sudi Das
Date: Fri Sep 18 15:47:46 2020 +010
Hi,
The 'code' part of a 'define_code_attr' refers to the type of the key,
in other words, it uses a code_iterator to pick the value from their
(key "value") pair list.
Though it seems rtx_alloc_for_name requires a code_attribute to be used
when the 'value' needs to be a type. In other words,
I'm not a maintainer but I'd argue the entire test is bogus.
The error reporting in this area seems to be somewhat fragile, if you
compile it with '-march=armv7-a -mfloat-abi=soft', you also don't get
the error this is testing for. I'd argue this kind of user friendly
error message should jus
On 11/07/2024 22:42, Christophe Lyon wrote:
+  bool
+  check (function_checker &c) const override
+  {
+    if (c.mode_suffix_id == MODE_none)
+      return true;
+
+    unsigned int bits = c.type_suffix (0).element_bits;
+    return c.require_immediate_range (1, 1, bits);
+  }
When trying t
Hi,
This looks great to me, only one small suggestion, but take it or leave
it I think it's a matter of preference.
On 11/07/2024 22:42, Christophe Lyon wrote:
+ /* No predicate, no suffix. */
if (e.type_suffix (0).integer_p)
if (e.type_suffix (0).unsigne
Hi Christophe,
Maybe this patch was based on an older source, but the comment now reads:
/* _t vfoo[t0](_t, _t)
_t vfoo[_n_t0](_t, _t)
Where the _n form only supports s16/s32/u16/u32 types as for vorrq.
Example: vorrq.
int16x8_t [__arm_]vorrq[_s16](int16x8_t a, int16x8_t b)
int1
Yeah true... committed.
On 01/08/2024 13:54, Christophe Lyon wrote:
On 8/1/24 12:02, Andre Vieira (lists) wrote:
On 01/08/2024 10:09, Christophe Lyon wrote:
It seems your attachment contains only the commit message but lacks
the actual patch?
I blame lack of coffee...
Thanks.
The
On 01/08/2024 10:09, Christophe Lyon wrote:
It seems your attachment contains only the commit message but lacks the
actual patch?
I blame lack of coffee...
Thanks.
diff --git a/gcc/testsuite/gcc.target/arm/mve/ivopts-3.c
b/gcc/testsuite/gcc.target/arm/mve/ivopts-3.c
index
19b2442ef12cbf
Hi,
This patch ensures this testcase is run for armv8.1-m.main+mve as this
is testing that doloops with function calls that aren't intrinsics get
rejected as potential doloop targets during ivopts. For other targets
this loop gets rejected for different reasons.
gcc/testsuite/ChangeLog:
This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter
was making an assumption about the form of what a candidate for a dec_insn
should be, which caused an ICE.
This dec_insn is the instruction that decreases the loop counter inside a
decrementing loop a
Hi Christophe,
Thanks for the comments, attached new version for testcase, see below
new cover letter:
This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter
was making an assumption about the form of what a candidate for a dec_insn
should be.
This dec_insn is the instruction th
This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter
was making an assumption about the form of what a candidate for a dec_insn
should be.
It also makes sure that if it does not initially encounter a 'set' in such a
form it tries to find another set that could be the right one.
Looks like I forgot to CC you Richard. But yeh ping :)
On 26/06/2024 13:20, Andre Vieira (lists) wrote:
This patch fixes the backend pattern that was printing the wrong input
scalar register pair when inserting into lane 1.
Added a new test to force float-abi=hard so we can use scan-assembler
This patch fixes the backend pattern that was printing the wrong input
scalar register pair when inserting into lane 1.
Added a new test to force float-abi=hard so we can use scan-assembler to
check correct codegen.
Regression tested arm-none-eabi with
-march=armv8.1-m.main+mve/-mfloat-abi=ha
Hi,
With the introduction of low overhead loops in
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=3dfc28dbbd21b1d708aa40064380ef4c42c994d7
we defined arm_predict_doloop_p, this is meant to be a low-weight check
to rule out loops we are not considering for doloop optimization and it
is used by
On 06/06/2024 12:53, Richard Earnshaw (lists) wrote:
On 05/06/2024 17:07, Andre Vieira (lists) wrote:
Hi,
This patch adds missing assembly directives to the CMSE library wrapper to call
functions with attribute cmse_nonsecure_call. Without the .type directive the
linker will fail to
On 11/06/2024 14:59, Richard Earnshaw (lists) wrote:
You effectively have an 'else if' split across a comment here, and the
indentation looks weird. Either write 'else if' on one line (and re-indent
accordingly) or put this entire block inside braces.
Apologies here, Torbjorn had this as
Hi,
So, you talk about gen_thumb1_extendhisi2, but there is also
gen_thumb1_extendqisi2. Will it actually be cleaner if the block is
indented one level?
The comment can be added in the "if (TARGET_THUMB1)" block regardless to
indicate that gen_rtx_SIGN_EXTEND can't be used.
gen_rtx_SIGN_EX
Hi Torbjorn,
Thanks for this, I have some comments below.
On 07/06/2024 09:56, Torbjörn SVENSSON wrote:
Properly handle zero and sign extension for Armv8-M.baseline as
Cortex-M23 can have the security extension active.
Currently, there is an internal compiler error on Cortex-M23 for the
epilog p
Hi,
This patch adds missing assembly directives to the CMSE library wrapper
to call functions with attribute cmse_nonsecure_call. Without the .type
directive the linker will fail to produce the correct veneer if a call
to this wrapper function is too far from the wrapper itself. The .size
wa
On 04/06/2024 12:50, Richard Biener wrote:
On Tue, 4 Jun 2024, Andre Vieira (lists) wrote:
Hi,
We got a question as to whether GCC had something similar to llvm's pragma
clang loop interleave_count(N), see
https://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop
Hi,
We got a question as to whether GCC had something similar to llvm's
pragma clang loop interleave_count(N), see
https://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations
I did a quick hack, using 'GCC interleaves N', just as a proof of
concept, to see wheth
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
gcc/ChangeLog:
* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_
reviewed patches.
OK for trunk?
Andre Vieira (2):
doloop: Add support for predicated vectorized loops
arm: Add support for MVE Tail-Predicated Low Overhead Loops
gcc/config/arm/arm-protos.h |4 +-
gcc/config/arm/arm.cc | 1249
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
gcc/ChangeLog:
* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_
Hi,
We held these two patches back in stage 4 because they touched
target-agnostic code, though I am quite confident they will not affect other
targets. Given stage one has reopened, I am reposting them, I rebased them but
they seem to apply cleanly on trunk.
OK for trunk?
Andre Vieira
Hey Jakub,
This what ya had in mind?
Kind regards,
Andre Vieira
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index
ca5174de991bb088f653468f77485c15a61526e6..924e045a15a78b5702a0d6997953f35c6b47efd1
100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
This patch fixes some testisms introduced by:
commit 5aa3fec38cc6f52285168b161bab1a869d864b44
Author: Andre Vieira
Date: Wed Apr 10 16:29:46 2024 +0100
aarch64: Add support for _BitInt
The testcases were relying on an unnecessary sign-extend that is no longer
generated.
The tested
Hi,
Patch to add AArch64 to the list of supported _BitInt(N) in
gcc-14/changes.html.
OK?
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index
a7ba957110183f906938d935bfa17aaed2ba20c8..55ab8c14c6d0b54e05a5f266f25c8ef1a4f959bf
100644
--- a/htdocs/gcc-14/changes.html
+++ b/
Added the target check, also had to change some of the assembly checking
due to changes upstream, the assembly is still valid, but we do extend
where not necessary, I do believe that's a general issue though.
The _BitInt(N > 64) codegen for non-powers of 2 did get worse, we see
similar codegen
regards,
Andre
On 28/03/2024 12:54, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
This patch makes sure we do not give ABI change diagnostics for the ABI
breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that
type did not exist before this GCC version.
This patch adds support for C23's _BitInt for the AArch64 port when
compiling for little endianness. Big Endianness requires further
target-agnostic support and we therefore disable it for now.
The tests expose some suboptimal codegen for which I'll create PRs for
optimizations after this goe
This patch makes sure we do not give ABI change diagnostics for the ABI
breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that
type did not exist before this GCC version.
ChangeLog:
* config/aarch64/aarch64.cc (bitint_or_aggr_of_bitint_p): New function.
(aarch
Hi,
Introduced a new patch to disable diagnostics for ABI breaks involving
_BitInt(N) given the type didn't exist, let me know what you think of that.
Also added further testing to replicate the ABI diagnostic tests to use
_BitInt(N).
Andre Vieira (2)
aarch64: Do not give ABI c
Hi,
After the backport of PR target/112787 a failure was reported against
x86_64, this would be fixed by backporting:
* tree-optimization/91838 - fix FAIL of g++.dg/opt/pr91838.C
(d1c072a1c3411a6fe29900750b38210af8451eeb)
* tree-optimization/110838 - less aggressively fold out-of-bound shifts
Hi Thiago,
Thanks for this, LGTM but I can't approve this, CC'ing Richard.
Do have a nitpick, in the gcc/testsuite/ChangeLog: remove
'gcc/testsuite' from bullet points 2-4.
Kind regards,
Andre
On 13/01/2024 00:55, Thiago Jung Bauermann wrote:
Since commits 2c3db94d9fd ("c: Turn int-conversi
Hi,
Bootstrapped and tested the gcc-13 backport of this on gcc-12 for
aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu and no regressions.
OK to push to gcc-12 branch?
Kind regards,
Andre Vieira
On 10/11/2023 13:16, Richard Biener wrote:
The following fixes the issue that when SLP stmts
On 27/02/2024 08:47, Richard Biener wrote:
On Mon, 26 Feb 2024, Andre Vieira (lists) wrote:
On 05/02/2024 09:56, Richard Biener wrote:
On Thu, 1 Feb 2024, Andre Vieira (lists) wrote:
On 01/02/2024 07:19, Richard Biener wrote:
On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
The
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
gcc/ChangeLog:
* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_
This patch adds an attribute to the mve md patterns to be able to identify
predicable MVE instructions and what their predicated and unpredicated variants
are. This attribute is used to encode the icode of the unpredicated variant of
an instruction in its predicated variant.
This will make it po
This patch annotates some MVE across lane instructions with a new attribute.
We use this attribute to let the compiler know that these instructions can be
safely implicitly predicated when tail predicating if their operands are
guaranteed to have zeroed tail predicated lanes. These instructions w
This patch fixes the erroneous use of a mode attribute without a mode iterator
in the pattern and removes unused unspecs and iterators.
gcc/ChangeLog:
* config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
VMLALDAVAXQ_U cases.
(VMLALDAVXQ): Remove iterator
converted the VPT-predicated instructions into their
unpredicated equivalents (which also saves us from VPST insns).
The LE instruction here decrements LR by 8 in each iteration.
Stam Markianos-Wright (1):
arm: Add define_attr to to create a mapping between MVE predicated and
unpredicated in
* gcc.target/aarch64/bitint-args.c: New test.
* gcc.target/aarch64/bitint-sizes.c: New test.
On 02/02/2024 14:46, Jakub Jelinek wrote:
On Thu, Jan 25, 2024 at 05:45:01PM +, Andre Vieira wrote:
This patch adds support for C23's _BitInt for the AArch64 port when compiling
f
On 05/02/2024 09:56, Richard Biener wrote:
On Thu, 1 Feb 2024, Andre Vieira (lists) wrote:
On 01/02/2024 07:19, Richard Biener wrote:
On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
The patch didn't come with a testcase so it's really hard to tell
what goes wrong now and
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
This patch adds an attribute to the mve md patterns to be able to identify
predicable MVE instructions and what their predicated and unpredicated variants
are. This attribute is used to encode the icode of the unpredicated variant of
an instruction in its predicated variant.
This will make it po
This patch fixes the erroneous use of a mode attribute without a mode iterator
in the pattern and removes unused unspecs and iterators.
gcc/ChangeLog:
* config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
VMLALDAVAXQ_U cases.
(VMLALDAVXQ): Remove iterator
This patch annotates some MVE across lane instructions with a new attribute.
We use this attribute to let the compiler know that these instructions can be
safely implicitly predicated when tail predicating if their operands are
guaranteed to have zeroed tail predicated lanes. These instructions w
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
gcc/ChangeLog:
* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_
LR by 8 in each iteration.
Stam Markianos-Wright (1):
arm: Add define_attr to to create a mapping between MVE predicated and
unpredicated insns
Andre Vieira (4):
doloop: Add support for predicated vectorized loops
arm: Annotate instructions with mve_safe_imp_xlane_pred
arm: Fix a wrong
This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_at
This patch fixes the erroneous use of a mode attribute without a mode iterator
in the pattern and removes unused unspecs and iterators.
gcc/ChangeLog:
* config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
VMLALDAVAXQ_U cases.
(VMLALDAVXQ): Remove iterator
This patch annotates some MVE across lane instructions with a new attribute.
We use this attribute to let the compiler know that these instructions can be
safely implicitly predicated when tail predicating if their operands are
guaranteed to have zeroed tail predicated lanes. These instructions w
This patch adds an attribute to the mve md patterns to be able to identify
predicable MVE instructions and what their predicated and unpredicated variants
are. This attribute is used to encode the icode of the unpredicated variant of
an instruction in its predicated variant.
This will make it po
This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops. Arm is currently the only target that
will make use of this feature.
The doloop_condition_get function is used to validate that the 'transformed'
jump instruction is one of the c
cated and
unpredicated insns
Andre Vieira (4):
doloop: Add support for predicated vectorized loops
arm: Annotate instructions with mve_safe_imp_xlane_pred
arm: Fix a wrong attribute use and remove unused unspecs and iterators
arm: Add support for MVE Tail-Predicated Low Overhead Loops
--
2.17.1
On 19/02/2024 16:17, Jakub Jelinek wrote:
On Mon, Feb 19, 2024 at 04:13:29PM +, Andre Vieira (lists) wrote:
Replaced uses of __seg_gs with the MACRO SEG defined in the testcase to pick
(if any) the right __seg_{gs,fs} keyword based on target.
gcc/testsuite/ChangeLog:
* gcc.dg
Replaced uses of __seg_gs with the MACRO SEG defined in the testcase to
pick (if any) the right __seg_{gs,fs} keyword based on target.
gcc/testsuite/ChangeLog:
* gcc.dg/bitint-86.c (__seg_gs): Replace with SEG MACRO.
diff --git a/gcc/testsuite/gcc.dg/bitint-86.c b/gcc/testsuite/gcc.dg/bi
Regards,
Andre
On 20/12/2023 14:30, Richard Biener wrote:
On Wed, 20 Dec 2023, Andre Vieira (lists) wrote:
Thanks, fully agree with all comments.
gcc/ChangeLog:
PR target/112787
* tree-vect-generic (type_for_widest_vector_mode): Change function
to use original vector
On 01/02/2024 07:19, Richard Biener wrote:
On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
The patch didn't come with a testcase so it's really hard to tell
what goes wrong now and how it is fixed ...
My bad! I had a testcase locally but never added it...
However... now I look
On 31/01/2024 14:35, Richard Biener wrote:
On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
On 31/01/2024 13:58, Richard Biener wrote:
On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
On 31/01/2024 12:13, Richard Biener wrote:
On Wed, 31 Jan 2024, Richard Biener wrote:
On Tue, 30
On 31/01/2024 14:03, Richard Biener wrote:
On Wed, 31 Jan 2024, Richard Biener wrote:
On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
On 31/01/2024 12:13, Richard Biener wrote:
On Wed, 31 Jan 2024, Richard Biener wrote:
On Tue, 30 Jan 2024, Andre Vieira wrote:
This patch adds
On 31/01/2024 13:58, Richard Biener wrote:
On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
On 31/01/2024 12:13, Richard Biener wrote:
On Wed, 31 Jan 2024, Richard Biener wrote:
On Tue, 30 Jan 2024, Andre Vieira wrote:
This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to
On 31/01/2024 12:13, Richard Biener wrote:
On Wed, 31 Jan 2024, Richard Biener wrote:
On Tue, 30 Jan 2024, Andre Vieira wrote:
This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
target can reject a simd_clone based on the vector mode it is using.
This is needed
This patch finalizes adding support for the generation of SVE simd clones when
no simdlen is provided, following the ABI rules where the widest data type
determines the minimum amount of elements in a length agnostic vector.
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (add_sve_type_
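As a back-of-the-envelope illustration of the rule mentioned above, the sketch below assumes that the minimum element count is the architectural minimum SVE vector length (128 bits) divided by the width of the widest data type. That reading is an assumption based on this description, not a quote from the ABI, and `min_lanes_for_widest_type` is a hypothetical name used only here.

```c
#include <assert.h>

/* Architectural minimum SVE vector length, in bits.  */
enum { SVE_MIN_VECTOR_BITS = 128 };

/* Assumed rule: the widest data type in the signature determines how many
   elements fit in the smallest possible length-agnostic vector.  */
static unsigned
min_lanes_for_widest_type (unsigned widest_type_bits)
{
  assert (widest_type_bits != 0
          && SVE_MIN_VECTOR_BITS % widest_type_bits == 0);
  return SVE_MIN_VECTOR_BITS / widest_type_bits;
}
```

Under that assumption, a clone whose widest type is a 64-bit double would have at least 2 elements per vector, and one whose widest type is a 32-bit float at least 4.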
The current codegen code to support VFs that are multiples of a simdclone
simdlen relies on BIT_FIELD_REF to create multiple input vectors. This does not
work for non-constant simdclones, so we should disable using such clones when
the VF is a multiple of the non-constant simdlen until we change t