from:"Andre Vieira"

[PATCH] arm: Fix operand check for __arm_{mrrc{2}, mcrr{2}} intrinsics [PR 121464]

2025-08-08 Thread Andre Vieira (lists)

Fix the bound checking for the opc1 operand of the following intrinsics: __arm_mcrr __arm_mcrr2 __arm_mrrc __arm_mrrc2 Built arm-none-linux-gnueabihf and ran full regression test, and built arm-none-eabi but only ran the changed tests on that config. OK for trunk and backport to all op
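As a rough illustration of the kind of operand bound check involved, here is a minimal sketch. It is not GCC's implementation: the enum and helper names are hypothetical, and the ranges are an assumption taken from the instruction encodings, where MCRR/MRRC carry a 4-bit opc1 field (0..15) while MCR/MRC carry a 3-bit one (0..7).

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not taken from GCC.  */
enum coproc_form
{
  COPROC_SINGLE, /* MCR/MRC style: assumed 3-bit opc1 field.  */
  COPROC_DOUBLE  /* MCRR/MRRC style: assumed 4-bit opc1 field.  */
};

/* Return true if OPC1 fits the opc1 field of the given instruction form.  */
static bool
opc1_in_range (enum coproc_form form, int opc1)
{
  int max = (form == COPROC_DOUBLE) ? 15 : 7;
  return opc1 >= 0 && opc1 <= max;
}
```

Under these assumptions, a value such as 8 would be rejected for the single-register form but accepted for the double-register form.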

[PATCHv2] modulo-sched: reject loop conditions when not decrementing with one [PR 116479]

2025-04-25 Thread Andre Vieira (lists)

/pr116479.c: New test. On 23/04/2025 16:51, Jakub Jelinek wrote: On Wed, Apr 23, 2025 at 04:46:04PM +0100, Andre Vieira (lists) wrote: On 23/04/2025 16:22, Jakub Jelinek wrote: On Wed, Apr 23, 2025 at 03:57:58PM +0100, Andre Vieira (lists) wrote: +++ b/gcc/testsuite/gcc.target/aarch64/pr116479.c

Re: [PATCH] modulo-sched: reject loop conditions when not decrementing with one [PR 116479]

2025-04-23 Thread Andre Vieira (lists)

On 23/04/2025 16:22, Jakub Jelinek wrote: On Wed, Apr 23, 2025 at 03:57:58PM +0100, Andre Vieira (lists) wrote: +++ b/gcc/testsuite/gcc.target/aarch64/pr116479.c @@ -0,0 +1,20 @@ +/* PR 116479 */ +/* { dg-do run } */ +/* { dg-additional-options "-O -funroll-loops -finline-stringops -fm

[PATCH] modulo-sched: reject loop conditions when not decrementing with one [PR 116479]

2025-04-23 Thread Andre Vieira (lists)

In the commit titled 'doloop: Add support for predicated vectorized loops' the doloop_condition_get function was changed to accept loops with decrements larger than 1. This patch rejects such loops for modulo-sched. I've put the test for this in the aarch64 testsuite, but I just realized eve

Re: [PATCH] aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

2025-03-21 Thread Andre Vieira (lists)

Here is the latest version of the patch, I wasn't sure whether Richard's 'LGTM with...' was meant as a conditional OK and together with the changes suggested by Andrew I thought I'd ask again, OK for trunk? As per the AArch64 ISA FEAT_SME does not require FEAT_SVE2. However, we don't support

Re: [PATCH] aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

2025-03-17 Thread Andre Vieira (lists)

Thanks for the suggestions. On 14/03/2025 21:43, Andrew Carlotti wrote: On Thu, Mar 13, 2025 at 05:10:07PM +, Andre Vieira (lists) wrote: Apologies for the delay, had been waiting on some other relevant patches to go in to make sure we didn't break any valid existing behaviours. It s

Re: [PATCH] aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

2025-03-14 Thread Andre Vieira (lists)

On 14/03/2025 09:59, Richard Sandiford wrote: "Andre Vieira (lists)" writes: diff --git a/gcc/testsuite/gcc.target/aarch64/no-sve-with-sme-3.c b/gcc/testsuite/gcc.target/aarch64/no-sve-with-sme-3.c new file mode 100644 index 00

[PATCH] aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

2025-03-13 Thread Andre Vieira (lists)

lane_mf8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c: Likewise. On 04/10/2024 13:08, Kyrylo Tkachov wrote: Hi Andre, On 2 Oct 2024, at 19:13, Andre Vieira wrote: External email: Use caution opening links or attachments As per the AArch64 ISA FEAT_SME does not require

[committed] arm, mve: Adding missing Runtime Library Exception to header files

2024-12-02 Thread Andre Vieira (lists)

Add missing Runtime Library Exception to mve header files to bring them into line with other similar headers. Not adding it in the first place was an oversight. gcc/ChangeLog: * config/arm/arm_mve.h: Add Runtime Library Exception. * config/arm/arm_mve_types.h: Likewise. diff --g

[PATCH v2 3/3] arm, mve: Detect uses of vctp_vpr_generated inside subregs

2024-11-29 Thread Andre Vieira

Address a problem where we were failing to detect uses of vctp_vpr_generated in the analysis for 'arm_attempt_dlstp_transform' because the use was inside a SUBREG and rtx_equal_p does not catch that. Using reg_overlap_mentioned_p is much more robust. gcc/ChangeLog: PR
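The difference between the two checks can be sketched with a toy expression tree. Everything below is a hypothetical mock (the `mock_*` names are not GCC's RTL API); it only illustrates why a structural equality test misses a register wrapped in a SUBREG while a "mentioned anywhere inside" test finds it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for RTL expressions; not GCC's rtx.  */
enum mock_code { MOCK_REG, MOCK_SUBREG };

struct mock_rtx
{
  enum mock_code code;
  int regno;                    /* Meaningful for MOCK_REG only.  */
  const struct mock_rtx *inner; /* Meaningful for MOCK_SUBREG only.  */
};

/* Structural equality: a SUBREG wrapping a REG is not equal to that REG.  */
static bool
mock_equal_p (const struct mock_rtx *x, const struct mock_rtx *y)
{
  if (x->code != y->code)
    return false;
  if (x->code == MOCK_REG)
    return x->regno == y->regno;
  return mock_equal_p (x->inner, y->inner);
}

/* Overlap-style test: true if REG appears anywhere inside X.  */
static bool
mock_reg_mentioned_p (const struct mock_rtx *reg, const struct mock_rtx *x)
{
  if (x->code == MOCK_REG)
    return x->regno == reg->regno;
  return mock_reg_mentioned_p (reg, x->inner);
}
```

In this model, a use of the form SUBREG(REG 5) compares unequal to REG 5 but still mentions it, which is the situation the patch description refers to.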

[PATCH v2 1/3] arm, mve: Fix scan-assembler for test7 in dlstp-compile-asm-2.c

2024-11-29 Thread Andre Vieira

After the changes to the vctp intrinsic, codegen changed slightly, and we now unfortunately seem to be generating unneeded moves and extends of the mask. These are, however, not incorrect, and we don't have a fix for the unneeded codegen right now, so this changes the testcase to accept them so we can c

Re: [PATCH] arm, mve: Do not DLSTP transform loops if VCTP is not first

2024-11-29 Thread Andre Vieira (lists)

Hi Christophe, On 28/11/2024 17:00, Christophe Lyon wrote: Hi Andre, Thanks, the patch LGTM except a minor nit: /* Using a VPR that gets re-generated within the loop. */ -void test10 (int32_t *a, int32_t *b, int32_t *c, int n) +void test10a (int32_t *a, int32_t *b, int32_t *c, int n) [...]

[PATCH 3/3] arm, mve: Detect uses of vctp_vpr_generated inside subregs

2024-11-29 Thread Andre Vieira

Address a problem where we were failing to detect uses of vctp_vpr_generated in the analysis for 'arm_attempt_dlstp_transform' because the use was inside a SUBREG and rtx_equal_p does not catch that. Using reg_overlap_mentioned_p is much more robust. gcc/ChangeLog: * g

[PATCH 2/3] arm, mve: Pass -std=c99 to dlstp-loop-form.c to avoid new warning

2024-11-29 Thread Andre Vieira

This fixes a testism introduced by the warning produced with the -std=c23 default. The testcase is a reduced piece of code meant to trigger an ICE, so there's little value in trying to change the code itself. gcc/testsuite/ChangeLog: * gcc.target/arm/mve/dlstp-loop-form.c: Add -std=c99

[PATCH 0/3] arm, mve: Fix DLSTP testism and issue after changes in codegen

2024-11-29 Thread Andre Vieira

had gone by unnoticed until now. Only tested on arm-none-eabi mve.exp=dlstp*. OK for trunk? Andre Vieira (3): arm, mve: Fix scan-assembler for test7 in dlstp-compile-asm-2.c arm, mve: Pass -std=c99 to dlstp-loop-form.c to avoid new warning arm, mve: Detect uses of vctp_vpr_generated inside

[PATCH 1/3] arm, mve: Fix scan-assembler for test7 in dlstp-compile-asm-2.c

2024-11-29 Thread Andre Vieira

After the changes to the vctp intrinsic, codegen changed slightly, and we now unfortunately seem to be generating unneeded moves and extends of the mask. These are, however, not incorrect, and we don't have a fix for the unneeded codegen right now, so this changes the testcase to accept them so we can c

[PATCH] arm, mve: Do not DLSTP transform loops if VCTP is not first

2024-11-28 Thread Andre Vieira

Hi, This rejects any loops where any predicated instruction comes before the vctp that generates the loop predicate. Even though this is not a requirement for dlstp transformation we have found potential issues where you can end up with a wrong transformation, so it is safer to reject such loops.

Re: [PATCH] arm: [MVE intrinsics] fix vctpq intrinsic implementation [PR target/117814]

2024-11-28 Thread Andre Vieira (lists)

Hi Christophe, On 28/11/2024 10:22, Christophe Lyon wrote: The VCTP instruction creates a Vector Tail Predicate in VPR.P0, based on the input value, but also constrained by a VPT block (if present), or if used within a DLSTP/LETP loop. Therefore we need to inform the compiler that this intrinsi

[PATCH] arm, mve: Fix arm_mve_dlstp_check_dec_counter's use of single_pred

2024-11-20 Thread Andre Vieira (lists)

Hi, Looks like single_pred ICEs if the basic-block does not have a single predecessor rather than return NULL, which was what this snippet of code relied on. This feels like borderline obvious to me as a fix, but I thought I'd get it checked by one more person. Call 'single_pred_p' before 's
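The guard described here can be sketched with a mock control-flow structure. The types and `mock_*` helpers below are hypothetical stand-ins, not GCC's basic_block API; the accessor aborts when its precondition is violated, modelling the ICE, and the wrapper tests the predicate first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy basic block: an array of predecessor pointers.  Not GCC's type.  */
struct mock_bb
{
  struct mock_bb **preds;
  size_t n_preds;
};

/* Predicate: does BB have exactly one predecessor?  */
static bool
mock_single_pred_p (const struct mock_bb *bb)
{
  return bb->n_preds == 1;
}

/* Accessor with a hard precondition: aborts (modelling an ICE) rather
   than returning NULL when BB does not have exactly one predecessor.  */
static struct mock_bb *
mock_single_pred (const struct mock_bb *bb)
{
  if (bb->n_preds != 1)
    abort ();
  return bb->preds[0];
}

/* Safe pattern: check the predicate before calling the accessor.  */
static struct mock_bb *
single_pred_or_null (const struct mock_bb *bb)
{
  return mock_single_pred_p (bb) ? mock_single_pred (bb) : NULL;
}
```

Code that previously relied on a NULL result can call the wrapper instead of the accessor directly.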

Re: libstdc++-v3: do not duplicate some math functions when using newlib

2024-10-31 Thread Andre Vieira (lists)

On 31/10/2024 08:23, Alexandre Oliva wrote: On Oct 25, 2024, "Andre Vieira (lists)" wrote: I have to admit I am not super familiar with long doubles, other than knowing they are 128-bit FP representations... but bisect has pointed me to this patch when investigating a reg

Re: [PATCH 2/8] aarch64: Add new +fcma flag

2024-10-25 Thread Andre Vieira (lists)

On 08/10/2024 17:18, Richard Sandiford wrote: Andrew Carlotti writes: This includes +fcma as a dependency of +sve, and means that we can finally support fcma intrinsics on a64fx. Also add fcma to the Features list in several cpunative testcases that incorrectly included sve without fcma. g

Re: libstdc++-v3: do not duplicate some math functions when using newlib

2024-10-25 Thread Andre Vieira (lists)

Hey, I have to admit I am not super familiar with long doubles, other than knowing they are 128-bit FP representations... but bisect has pointed me to this patch when investigating a regression on aarch64_be-none-elf for the libstdc++ testcase: 26_numerics/complex/13450.cc After some reduct

arm: Improvements to arm_noce_conversion_profitable_p call [PR 116444]

2024-10-18 Thread Andre Vieira (lists)

Sorry for the delay, some other work popped up in between and this had some latent issues. They should all be addressed now in this new patch. When not dealing with the special armv8.1-m.main conditional instructions case make sure it uses the default_noce_conversion_profitable_p call to dete

Re: [PATCH v2] arm: [MVE intrinsics] Fix support for predicate constants [PR target/114801]

2024-10-18 Thread Andre Vieira (lists)

Hi, This looks like an acceptable work around. We special case behavior that I'm not sure we can express in ways GCC can understand or will make use of, whilst at the same time we keep expressing behavior it does understand and can optimize. Nice idea! LGTM, needs maintainer approval though

fold-const: Fix BIT_INSERT_EXPR folding for BYTES_BIG_ENDIAN [PR116997]

2024-10-14 Thread Andre Vieira (lists)

Hi, This patch fixes constant folding of BIT_INSERT_EXPR for BYTES_BIG_ENDIAN targets. Regression tested on aarch64be-none-elf. Almost committed this as obvious, but I wanted to double check the testcase with a maintainer. I decided to not make the test be big-endian specific, nor to add any

arm: fix bootstrap issue with arm_noce_conversion_profitable_p patch [NFC]

2024-10-07 Thread Andre Vieira (lists)

Committed attached patch as obvious. This obvious patch fixes two warnings introduced with the implementation of arm_noce_conversion_profitable_p hook. gcc/ChangeLog: * config/arm/arm.cc (arm_noce_conversion_profitable_p): Remove unused argument name. (arm_is_v81m_cond_

Re: arm: Make arm_noce_conversion_profitable_p call default hook [PR 116444]

2024-10-07 Thread Andre Vieira (lists)

On 07/10/2024 10:15, Christophe Lyon wrote: On Mon, 7 Oct 2024 at 11:04, Torbjorn SVENSSON wrote: On 2024-10-07 10:53, Andre Vieira (lists) wrote: Hi Torbjorn, 2. All the other test cases in the list above: These need to be adapted to the change introduced in r15-3606-g7d6c6a0d15c to

Re: arm: Make arm_noce_conversion_profitable_p call default hook [PR 116444]

2024-10-07 Thread Andre Vieira (lists)

Hi Torbjorn, On 07/10/2024 09:08, Torbjorn SVENSSON wrote: There are 3 test cases that are fixed with these 2 commits, but there is also a bunch that is marked as new fails. Looking at the test cases that fail, there are 2 different kinds of failures. 1. gcc.target/arm/attr_thumb.c: This

arm: Make arm_noce_conversion_profitable_p call default hook [PR 116444]

2024-10-04 Thread Andre Vieira (lists)

Hi, The patch for 'arm: Fix missed CE optimization for armv8.1-m.main [PR 116444]' introduced regressions with arm targets that used 'noce' before. This is because it would approve all noce optimisations without using the default cost check. Not sure why this didn't show up in my original test

[PATCH 1/2] aarch64: Split FCMA feature bit from Armv8.3-A

2024-10-02 Thread Andre Vieira

This patch splits out FCMA as a feature from Armv8.3-A and adds it as a separate feature bit which now controls 'TARGET_COMPLEX'. gcc/ChangeLog: * config/aarch64/aarch64-arches.def (FCMA): New feature bit, can not be used as an extension in the command-line. * config/aarc

[PATCH 2/2] aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

2024-10-02 Thread Andre Vieira

As per the AArch64 ISA FEAT_SME does not require FEAT_SVE2, so we are removing that false dependency in GCC. However, we chose for now to not support this combination of features and will diagnose the combination of FEAT_SME without FEAT_SVE2 as unsupported by GCC. We may choose to support this

[PATCH 0/2] aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

2024-10-02 Thread Andre Vieira

port this combination we should investigate these. The patch series also refactors the FCMA/COMPNUM/TARGET_COMPLEX feature to separate it from Armv8.3-A feature set. Andre Vieira (2) aarch64: Split FCMA feature bit from Armv8.3-A aarch64: remove SVE2 requirement from SME and diagnose i

[PATCH v2] arm: Prevent ICE when doloop dec_set is not PLUS_EXPR

2024-09-27 Thread Andre Vieira (lists)

Resending as v2 so CI picks it up. This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter was making an assumption about the form of what a candidate for a dec_insn should be. This dec_insn is the instruction that decreases the loop counter inside a decrementing loop and we expect it

Re: [PATCH] arm: Fix missed CE optimization for armv8.1-m.main [PR 116444]

2024-09-27 Thread Andre Vieira (lists)

On 26/09/2024 18:56, Ramana Radhakrishnan wrote: +/* Helper function to determine whether SEQ represents a sequence of + instructions representing the Armv8.1-M Mainline conditional arithmetic + instructions: csinc, csneg and csinv. The cinc instruction is generated + using a diffe

[PATCH] arm: Fix missed CE optimization for armv8.1-m.main [PR 116444]

2024-09-25 Thread Andre Vieira (lists)

Hi, This patch restores missed optimizations for armv8.1-m.main targets that were missed when the generation of csinc, csinv and csneg were enabled or the same with patch series containing: commit c2bb84be4a6e581bbf45891457ee632a07416982 Author: Sudi Das Date: Fri Sep 18 15:47:46 2020 +010

rtl: Enable the use of rtx values with int and mode attributes

2024-08-13 Thread Andre Vieira (lists)

Hi, The 'code' part of a 'define_code_attr' refers to the type of the key, in other words, it uses a code_iterator to pick the value from their (key "value") pair list. Though it seems rtx_alloc_for_name requires a code_attribute to be used when the 'value' needs to be a type. In other words,

Re: [PATCH] testsuite: Avoid running neon test on Cortex-M55

2024-08-13 Thread Andre Vieira (lists)

I'm not a maintainer but I'd argue the entire test is bogus. The error reporting in this area seems to be somewhat fragile, if you compile it with '-march=armv7-a -mfloat-abi=soft', you also don't get the error this is testing for. I'd argue this kind of user friendly error message should jus

Re: [PATCH 05/15] arm: [MVE intrinsics] add vcvt shape

2024-08-05 Thread Andre Vieira (lists)

On 11/07/2024 22:42, Christophe Lyon wrote: + bool + check (function_checker &c) const override + { +if (c.mode_suffix_id == MODE_none) + return true; + +unsigned int bits = c.type_suffix (0).element_bits; +return c.require_immediate_range (1, 1, bits); + } When trying t

Re: [PATCH 03/15] arm: [MVE intrinsics] Cleanup arm-mve-builtins-functions.h

2024-08-02 Thread Andre Vieira (lists)

Hi, This looks great to me, only one small suggestion, but take it or leave it I think it's a matter of preference. On 11/07/2024 22:42, Christophe Lyon wrote: + /* No predicate, no suffix. */ if (e.type_suffix (0).integer_p) if (e.type_suffix (0).unsigne

Re: [PATCH 01/15] arm: [MVE intrinsics] improve comment for orrq shape

2024-08-02 Thread Andre Vieira (lists)

Hi Christophe, Maybe this patch was based on an older source, but the comment now reads: /* _t vfoo[t0](_t, _t) _t vfoo[_n_t0](_t, _t) Where the _n form only supports s16/s32/u16/u32 types as for vorrq. Example: vorrq. int16x8_t [__arm_]vorrq[_s16](int16x8_t a, int16x8_t b) int1

Re: [PATCH] arm: Fix testism with mve/ivopts-3.c testcase

2024-08-02 Thread Andre Vieira (lists)

Yeah true... committed. On 01/08/2024 13:54, Christophe Lyon wrote: On 8/1/24 12:02, Andre Vieira (lists) wrote: On 01/08/2024 10:09, Christophe Lyon wrote: It seems your attachment contains only the commit message but lacks the actual patch? I blame lack of coffee... Thanks. The

Re: [PATCH] arm: Fix testism with mve/ivopts-3.c testcase

2024-08-01 Thread Andre Vieira (lists)

On 01/08/2024 10:09, Christophe Lyon wrote: It seems your attachment contains only the commit message but lacks the actual patch? I blame lack of coffee... Thanks. diff --git a/gcc/testsuite/gcc.target/arm/mve/ivopts-3.c b/gcc/testsuite/gcc.target/arm/mve/ivopts-3.c index 19b2442ef12cbf

[PATCH] arm: Fix testism with mve/ivopts-3.c testcase

2024-08-01 Thread Andre Vieira (lists)

Hi, This patch ensures this testcase is run for armv8.1-m.main+mve as this is testing that doloops with function calls that aren't intrinsics get rejected as potential doloop targets during ivopts. For other targets this loop gets rejected for different reasons. gcc/testsuite/ChangeLog:

Re: arm: Prevent ICE when doloop dec_set is not PLUS_EXPR

2024-07-31 Thread Andre Vieira (lists)

This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter was making an assumption about the form of what a candidate for a dec_insn should be, which caused an ICE. This dec_insn is the instruction that decreases the loop counter inside a decrementing loop a

Re: arm: Prevent ICE when doloop dec_set is not PLUS_EXPR

2024-07-31 Thread Andre Vieira (lists)

Hi Christophe, Thanks for the comments, attached new version for testcase, see below new cover letter: This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter was making an assumption about the form of what a candidate for a dec_insn should be. This dec_insn is the instruction th

arm: Prevent ICE when doloop dec_set is not PLUS_EXPR

2024-07-26 Thread Andre Vieira (lists)

This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter was making an assumption about the form of what a candidate for a dec_insn should be. It also makes sure that if it does not initially encounter a 'set' in such a form it tries to find another set that could be the right one.

Re: mve: Fix vsetq_lane for 64-bit elements with lane 1 [PR 115611]

2024-07-09 Thread Andre Vieira (lists)

Looks like I forgot to CC you Richard. But yeh ping :) On 26/06/2024 13:20, Andre Vieira (lists) wrote: This patch fixes the backend pattern that was printing the wrong input scalar register pair when inserting into lane 1. Added a new test to force float-abi=hard so we can use scan-assembler

mve: Fix vsetq_lane for 64-bit elements with lane 1 [PR 115611]

2024-06-26 Thread Andre Vieira (lists)

This patch fixes the backend pattern that was printing the wrong input scalar register pair when inserting into lane 1. Added a new test to force float-abi=hard so we can use scan-assembler to check correct codegen. Regression tested arm-none-eabi with -march=armv8.1-m.main+mve/-mfloat-abi=ha

[PATCH] arm: make arm_predict_doloop_p reject loops with calls

2024-06-25 Thread Andre Vieira (lists)

Hi, With the introduction of low overhead loops in https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=3dfc28dbbd21b1d708aa40064380ef4c42c994d7 we defined arm_predict_doloop_p, this is meant to be a low-weight check to rule out loops we are not considering for doloop optimization and it is used by

Re: arm: Add .type and .size to __gnu_cmse_nonsecure_call [PR115360]

2024-06-12 Thread Andre Vieira (lists)

On 06/06/2024 12:53, Richard Earnshaw (lists) wrote: On 05/06/2024 17:07, Andre Vieira (lists) wrote: Hi, This patch adds missing assembly directives to the CMSE library wrapper to call functions with attribute cmse_nonsecure_call. Without the .type directive the linker will fail to

Re: [PATCH v3 1/2] arm: Zero/Sign extends for CMSE security on Armv8-M.baseline [PR115253]

2024-06-11 Thread Andre Vieira (lists)

On 11/06/2024 14:59, Richard Earnshaw (lists) wrote: You effectively have an 'else if' split across a comment here, and the indentation looks weird. Either write 'else if' on one line (and re-indent accordingly) or put this entire block inside braces. Apologies here, Torbjorn had this as

Re: [PATCH v2 1/2] arm: Zero/Sign extends for CMSE security on Armv8-M.baseline [PR115253]

2024-06-10 Thread Andre Vieira (lists)

Hi, So, you talk about gen_thumb1_extendhisi2, but there is also gen_thumb1_extendqisi2. Will it actually be cleaner if the block is indented one level? The comment can be added in the "if (TARGET_THUMB1)" block regardless to indicate that gen_rtx_SIGN_EXTEND can't be used. gen_rtx_SIGN_EX

Re: [PATCH v2 1/2] arm: Zero/Sign extends for CMSE security on Armv8-M.baseline [PR115253]

2024-06-10 Thread Andre Vieira (lists)

Hi Torbjorn, Thanks for this, I have some comments below. On 07/06/2024 09:56, Torbjörn SVENSSON wrote: Properly handle zero and sign extension for Armv8-M.baseline as Cortex-M23 can have the security extension active. Currently, there is an internal compiler error on Cortex-M23 for the epilog p

arm: Add .type and .size to __gnu_cmse_nonsecure_call [PR115360]

2024-06-05 Thread Andre Vieira (lists)

Hi, This patch adds missing assembly directives to the CMSE library wrapper to call functions with attribute cmse_nonsecure_call. Without the .type directive the linker will fail to produce the correct veneer if a call to this wrapper function is too far from the wrapper itself. The .size wa

Re: RFC: Support for pragma clang loop interleave_count(N)

2024-06-04 Thread Andre Vieira (lists)

On 04/06/2024 12:50, Richard Biener wrote: On Tue, 4 Jun 2024, Andre Vieira (lists) wrote: Hi, We got a question as to whether GCC had something similar to llvm's pragma clang loop interleave_count(N), see https://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop

RFC: Support for pragma clang loop interleave_count(N)

2024-06-04 Thread Andre Vieira (lists)

Hi, We got a question as to whether GCC had something similar to llvm's pragma clang loop interleave_count(N), see https://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations I did a quick hack, using 'GCC interleaves N', just as a proof of concept, to see wheth
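For reference, this is roughly how the clang loop hint mentioned above is written in source. A minimal sketch: the function name and the count of 4 are arbitrary choices for illustration, and the pragma is guarded so that compilers other than clang simply see a plain loop. The 'GCC interleaves N' spelling from the proof of concept is not used here because it is not an existing GCC pragma.

```c
#include <stddef.h>

/* Sum the first N elements of A.  The pragma is only a hint to clang's
   loop vectorizer/interleaver; it does not change the result.  */
static long
sum_array (const int *a, size_t n)
{
  long total = 0;
#ifdef __clang__
#pragma clang loop interleave_count(4)
#endif
  for (size_t i = 0; i < n; i++)
    total += a[i];
  return total;
}
```

Because the hint only affects how the loop is scheduled, the function returns the same value whether or not the compiler honours it.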

[PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-05-23 Thread Andre Vieira

This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the doloop functionality added to support predicated vectorized hardware loops. gcc/ChangeLog: * config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change declaration to pass basic_block. (arm_at
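To make the idea of tail predication concrete, here is a portable scalar emulation. It is only a sketch under stated assumptions: it models a vector of four 32-bit lanes, the function name is hypothetical, and the lane mask follows the VCTP-style rule that lanes with an index below the remaining element count are active. It uses no MVE intrinsics and is not what the compiler generates.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* Assumed: 128-bit vector of 32-bit elements.  */

/* Emulate a tail-predicated loop computing c[i] = a[i] + b[i] for N
   elements.  Each outer iteration stands for one vector iteration; the
   final one may have fewer than LANES active lanes.  */
static void
tp_add (int32_t *c, const int32_t *a, const int32_t *b, int n)
{
  for (int base = 0; base < n; base += LANES)
    {
      int remaining = n - base; /* Elements still to process.  */
      for (int lane = 0; lane < LANES; lane++)
        /* Predicate: only lanes below the remaining count are active,
           so inactive tail lanes never touch memory.  */
        if (lane < remaining)
          c[base + lane] = a[base + lane] + b[base + lane];
    }
}
```

With n = 6, the second vector iteration has only two active lanes, so no scalar epilogue loop is needed to handle the tail.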

[PATCH 1/2] doloop: Add support for predicated vectorized loops

2024-05-23 Thread Andre Vieira

This patch adds support in the target agnostic doloop pass for the detection of predicated vectorized hardware loops. Arm is currently the only target that will make use of this feature. gcc/ChangeLog: * df-core.cc (df_bb_regno_only_def_find): New helper function. * df.h (df_bb_

[PATCH 0/2] arm, doloop: Add support for MVE Tail-Predicated Low Overhead Loops

2024-05-23 Thread Andre Vieira

reviewed patches. OK for trunk? Andre Vieira (2): doloop: Add support for predicated vectorized loops arm: Add support for MVE Tail-Predicated Low Overhead Loops gcc/config/arm/arm-protos.h |4 +- gcc/config/arm/arm.cc | 1249

[PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-05-23 Thread Andre Vieira

This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the doloop functionality added to support predicated vectorized hardware loops. gcc/ChangeLog: * config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change declaration to pass basic_block. (arm_at

[PATCH 1/2] doloop: Add support for predicated vectorized loops

2024-05-23 Thread Andre Vieira

This patch adds support in the target agnostic doloop pass for the detection of predicated vectorized hardware loops. Arm is currently the only target that will make use of this feature. gcc/ChangeLog: * df-core.cc (df_bb_regno_only_def_find): New helper function. * df.h (df_bb_

[PATCH 0/2] arm, doloop: Add support for MVE Tail-Predicated Low Overhead Loops

2024-05-23 Thread Andre Vieira

Hi, We held these two patches back in stage 4 because they touched target-agnostic code, though I am quite confident they will not affect other targets. Given stage one has reopened, I am reposting them, I rebased them but they seem to apply cleanly on trunk. OK for trunk? Andre Vieira

[wwwdocs] Specify AArch64 BitInt support for little-endian only

2024-05-07 Thread Andre Vieira (lists)

Hey Jakub, This what ya had in mind? Kind regards, Andre Vieira diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index ca5174de991bb088f653468f77485c15a61526e6..924e045a15a78b5702a0d6997953f35c6b47efd1 100644 --- a/htdocs/gcc-14/changes.html +++ b/htdocs/gcc-14/changes.html

[PATCH] aarch64: Fix _BitInt testcases

2024-04-11 Thread Andre Vieira (lists)

This patch fixes some testisms introduced by: commit 5aa3fec38cc6f52285168b161bab1a869d864b44 Author: Andre Vieira Date: Wed Apr 10 16:29:46 2024 +0100 aarch64: Add support for _BitInt The testcases were relying on an unnecessary sign-extend that is no longer generated. The tested

[PATCH][wwwdocs] gcc-14/changes.html: Update _BitInt to include AArch64 (little-endian)

2024-04-10 Thread Andre Vieira (lists)

Hi, Patch to add AArch64 to the list of supported _BitInt(N) in gcc-14/changes.html. OK? diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index a7ba957110183f906938d935bfa17aaed2ba20c8..55ab8c14c6d0b54e05a5f266f25c8ef1a4f959bf 100644 --- a/htdocs/gcc-14/changes.html +++ b/

Re: [PATCHv3 2/2] aarch64: Add support for _BitInt

2024-04-10 Thread Andre Vieira (lists)

Added the target check, also had to change some of the assembly checking due to changes upstream, the assembly is still valid, but we do extend where not necessary, I do believe that's a general issue though. The _BitInt(N > 64) codegen for non-powers of 2 did get worse, we see similar codegen

Re: [PATCHv2 1/2] aarch64: Do not give ABI change diagnostics for _BitInt(N)

2024-04-10 Thread Andre Vieira (lists)

regards, Andre On 28/03/2024 12:54, Richard Sandiford wrote: "Andre Vieira (lists)" writes: This patch makes sure we do not give ABI change diagnostics for the ABI breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that type did not exist before this GCC version.

[PATCHv2 2/2] aarch64: Add support for _BitInt

2024-03-27 Thread Andre Vieira (lists)

This patch adds support for C23's _BitInt for the AArch64 port when compiling for little endianness. Big Endianness requires further target-agnostic support and we therefore disable it for now. The tests expose some suboptimal codegen for which I'll create PRs for optimizations after this goe

[PATCHv2 1/2] aarch64: Do not give ABI change diagnostics for _BitInt(N)

2024-03-27 Thread Andre Vieira (lists)

This patch makes sure we do not give ABI change diagnostics for the ABI breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that type did not exist before this GCC version. ChangeLog: * config/aarch64/aarch64.cc (bitint_or_aggr_of_bitint_p): New function. (aarch

[PATCHv2 0/2] aarch64, bitint: Add support for _BitInt for AArch64 Little Endian

2024-03-27 Thread Andre Vieira (lists)

Hi, Introduced a new patch to disable diagnostics for ABI breaks involving _BitInt(N) given the type didn't exist, let me know what you think of that. Also added further testing to replicate the ABI diagnostic tests to use _BitInt(N). Andre Vieira (2) aarch64: Do not give ABI c

Backport PR91838 and PR110838

2024-03-25 Thread Andre Vieira (lists)

Hi, After the backport of PR target/112787 a failure was reported against x86_64, this would be fixed by backporting: * tree-optimization/91838 - fix FAIL of g++.dg/opt/pr91838.C (d1c072a1c3411a6fe29900750b38210af8451eeb) * tree-optimization/110838 - less aggressively fold out-of-bound shifts

Re: [PATCH] testsuite: Fix fallout of turning warnings into errors on 32-bit Arm

2024-03-01 Thread Andre Vieira (lists)

Hi Thiago, Thanks for this, LGTM but I can't approve this, CC'ing Richard. Do have a nitpick, in the gcc/testsuite/ChangeLog: remove 'gcc/testsuite' from bullet points 2-4. Kind regards, Andre On 13/01/2024 00:55, Thiago Jung Bauermann wrote: Since commits 2c3db94d9fd ("c: Turn int-conversi

Re: [PATCH] tree-optimization/110221 - SLP and loop mask/len

2024-03-01 Thread Andre Vieira (lists)

Hi, Bootstrapped and tested the gcc-13 backport of this on gcc-12 for aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu and no regressions. OK to push to gcc-12 branch? Kind regards, Andre Vieira On 10/11/2023 13:16, Richard Biener wrote: The following fixes the issue that when SLP stmts

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-02-28 Thread Andre Vieira (lists)

On 27/02/2024 08:47, Richard Biener wrote: On Mon, 26 Feb 2024, Andre Vieira (lists) wrote: On 05/02/2024 09:56, Richard Biener wrote: On Thu, 1 Feb 2024, Andre Vieira (lists) wrote: On 01/02/2024 07:19, Richard Biener wrote: On Wed, 31 Jan 2024, Andre Vieira (lists) wrote: The

[PATCH v6 5/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-02-27 Thread Andre Vieira

This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the doloop functionality added to support predicated vectorized hardware loops. gcc/ChangeLog: * config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change declaration to pass basic_block. (arm_at

[PATCH v6 4/5] doloop: Add support for predicated vectorized loops

2024-02-27 Thread Andre Vieira

This patch adds support in the target agnostic doloop pass for the detection of predicated vectorized hardware loops. Arm is currently the only target that will make use of this feature. gcc/ChangeLog: * df-core.cc (df_bb_regno_only_def_find): New helper function. * df.h (df_bb_

[PATCH v6 1/5] arm: Add define_attr to create a mapping between MVE predicated and unpredicated insns

2024-02-27 Thread Andre Vieira

This patch adds an attribute to the mve md patterns to be able to identify predicable MVE instructions and what their predicated and unpredicated variants are. This attribute is used to encode the icode of the unpredicated variant of an instruction in its predicated variant. This will make it po

[PATCH v6 2/5] arm: Annotate instructions with mve_safe_imp_xlane_pred

2024-02-27 Thread Andre Vieira

This patch annotates some MVE across lane instructions with a new attribute. We use this attribute to let the compiler know that these instructions can be safely implicitly predicated when tail predicating if their operands are guaranteed to have zeroed tail predicated lanes. These instructions w

[PATCH v6 3/5] arm: Fix a wrong attribute use and remove unused unspecs and iterators

2024-02-27 Thread Andre Vieira

This patch fixes the erroneous use of a mode attribute without a mode iterator in the pattern and removes unused unspecs and iterators. gcc/ChangeLog: * config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U, VMLALDAVAXQ_U cases. (VMLALDAVXQ): Remove iterator

[PATCH v5 0/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-02-27 Thread Andre Vieira

converted the VPT-predicated instructions into their unpredicated equivalents (which also saves us from VPST insns). The LE instruction here decrements LR by 8 in each iteration. Stam Markianos-Wright (1): arm: Add define_attr to to create a mapping between MVE predicated and unpredicated in

Re: [PATCH 2/2] aarch64: Add support for _BitInt

2024-02-27 Thread Andre Vieira (lists)

* gcc.target/aarch64/bitint-args.c: New test. * gcc.target/aarch64/bitint-sizes.c: New test. On 02/02/2024 14:46, Jakub Jelinek wrote: On Thu, Jan 25, 2024 at 05:45:01PM +, Andre Vieira wrote: This patch adds support for C23's _BitInt for the AArch64 port when compiling f

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-02-26 Thread Andre Vieira (lists)

On 05/02/2024 09:56, Richard Biener wrote: On Thu, 1 Feb 2024, Andre Vieira (lists) wrote: On 01/02/2024 07:19, Richard Biener wrote: On Wed, 31 Jan 2024, Andre Vieira (lists) wrote: The patch didn't come with a testcase so it's really hard to tell what goes wrong now and

[PATCH v5 5/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-02-22 Thread Andre Vieira

This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the doloop functionality added to support predicated vectorized hardware loops. gcc/ChangeLog: * config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change declaration to pass basic_block. (arm_at

[PATCH v5 1/5] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns

2024-02-22 Thread Andre Vieira

This patch adds an attribute to the mve md patterns to be able to identify predicable MVE instructions and what their predicated and unpredicated variants are. This attribute is used to encode the icode of the unpredicated variant of an instruction in its predicated variant. This will make it po

[PATCH v5 4/5] arm: Fix a wrong attribute use and remove unused unspecs and iterators

2024-02-22 Thread Andre Vieira

This patch fixes the erroneous use of a mode attribute without a mode iterator in the pattern and removes unused unspecs and iterators. gcc/ChangeLog: * config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U, VMLALDAVAXQ_U cases. (VMLALDAVXQ): Remove iterator

[PATCH v5 3/5] arm: Annotate instructions with mve_safe_imp_xlane_pred

2024-02-22 Thread Andre Vieira

This patch annotates some MVE across lane instructions with a new attribute. We use this attribute to let the compiler know that these instructions can be safely implicitly predicated when tail predicating if their operands are guaranteed to have zeroed tail predicated lanes. These instructions w

[PATCH v5 2/5] doloop: Add support for predicated vectorized loops

2024-02-22 Thread Andre Vieira

This patch adds support in the target agnostic doloop pass for the detection of predicated vectorized hardware loops. Arm is currently the only target that will make use of this feature. gcc/ChangeLog: * df-core.cc (df_bb_regno_only_def_find): New helper function. * df.h (df_bb_

[PATCH v5 0/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-02-22 Thread Andre Vieira

LR by 8 in each iteration. Stam Markianos-Wright (1): arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns Andre Vieira (4): doloop: Add support for predicated vectorized loops arm: Annotate instructions with mve_safe_imp_xlane_pred arm: Fix a wrong

[PATCH v4 5/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-02-21 Thread Andre Vieira

This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the doloop functionality added to support predicated vectorized hardware loops. gcc/ChangeLog: * config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change declaration to pass basic_block. (arm_at

[PATCH v4 4/5] arm: Fix a wrong attribute use and remove unused unspecs and iterators

2024-02-21 Thread Andre Vieira

This patch fixes the erroneous use of a mode attribute without a mode iterator in the pattern and removes unused unspecs and iterators. gcc/ChangeLog: * config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U, VMLALDAVAXQ_U cases. (VMLALDAVXQ): Remove iterator

[PATCH v4 3/5] arm: Annotate instructions with mve_safe_imp_xlane_pred

2024-02-21 Thread Andre Vieira

This patch annotates some MVE across lane instructions with a new attribute. We use this attribute to let the compiler know that these instructions can be safely implicitly predicated when tail predicating if their operands are guaranteed to have zeroed tail predicated lanes. These instructions w

[PATCH v4 1/5] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns

2024-02-21 Thread Andre Vieira

This patch adds an attribute to the mve md patterns to be able to identify predicable MVE instructions and what their predicated and unpredicated variants are. This attribute is used to encode the icode of the unpredicated variant of an instruction in its predicated variant. This will make it po

[PATCH v4 2/5] doloop: Add support for predicated vectorized loops

2024-02-21 Thread Andre Vieira

This patch adds support in the target agnostic doloop pass for the detection of predicated vectorized hardware loops. Arm is currently the only target that will make use of this feature. The doloop_condition_get function is used to validate that the 'transformed' jump instruction is one of the c

[PATCH v4 0/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-02-21 Thread Andre Vieira

cated and unpredicated insns Andre Vieira (4): doloop: Add support for predicated vectorized loops arm: Annotate instructions with mve_safe_imp_xlane_pred arm: Fix a wrong attribute use and remove unused unspecs and iterators arm: Add support for MVE Tail-Predicated Low Overhead Loops -- 2.17.1

Re: [comitted] bitint: Fix testism where __seg_gs was being used for all targets

2024-02-19 Thread Andre Vieira (lists)

On 19/02/2024 16:17, Jakub Jelinek wrote: On Mon, Feb 19, 2024 at 04:13:29PM +0000, Andre Vieira (lists) wrote: Replaced uses of __seg_gs with the MACRO SEG defined in the testcase to pick (if any) the right __seg_{gs,fs} keyword based on target. gcc/testsuite/ChangeLog: * gcc.dg

[comitted] bitint: Fix testism where __seg_gs was being used for all targets

2024-02-19 Thread Andre Vieira (lists)

Replaced uses of __seg_gs with the MACRO SEG defined in the testcase to pick (if any) the right __seg_{gs,fs} keyword based on target. gcc/testsuite/ChangeLog: * gcc.dg/bitint-86.c (__seg_gs): Replace with SEG MACRO. diff --git a/gcc/testsuite/gcc.dg/bitint-86.c b/gcc/testsuite/gcc.dg/bi

Re: veclower: improve selection of vector mode when lowering [PR 112787]

2024-02-19 Thread Andre Vieira (lists)

Regards, Andre On 20/12/2023 14:30, Richard Biener wrote: On Wed, 20 Dec 2023, Andre Vieira (lists) wrote: Thanks, fully agree with all comments. gcc/ChangeLog: PR target/112787 * tree-vect-generic (type_for_widest_vector_mode): Change function to use original vector

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-02-01 Thread Andre Vieira (lists)

On 01/02/2024 07:19, Richard Biener wrote: On Wed, 31 Jan 2024, Andre Vieira (lists) wrote: The patch didn't come with a testcase so it's really hard to tell what goes wrong now and how it is fixed ... My bad! I had a testcase locally but never added it... However... now I look

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Andre Vieira (lists)

On 31/01/2024 14:35, Richard Biener wrote: On Wed, 31 Jan 2024, Andre Vieira (lists) wrote: On 31/01/2024 13:58, Richard Biener wrote: On Wed, 31 Jan 2024, Andre Vieira (lists) wrote: On 31/01/2024 12:13, Richard Biener wrote: On Wed, 31 Jan 2024, Richard Biener wrote: On Tue, 30

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Andre Vieira (lists)

On 31/01/2024 14:03, Richard Biener wrote: On Wed, 31 Jan 2024, Richard Biener wrote: On Wed, 31 Jan 2024, Andre Vieira (lists) wrote: On 31/01/2024 12:13, Richard Biener wrote: On Wed, 31 Jan 2024, Richard Biener wrote: On Tue, 30 Jan 2024, Andre Vieira wrote: This patch adds
