> On 4 Sep 2025, at 16:13, Karl Meakin wrote:
>
> Add intrinsic functions for the SME LUTv2 architecture extension
> (`svluti4_zt`, `svwrite_lane_zt` and `svwrite_zt`).
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-sme.md (@aarch64_sme_write_zt): New
> insn.
> (aarch64_sme_lut_zt): Likewis
immediate operand case is obviously better
as the SVE immediate form doesn't require a predicate operand.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/iterators.md (sve_di_suf): New mode attr
> On 3 Sep 2025, at 17:32, Kyrylo Tkachov wrote:
>
>
>
>> On 3 Sep 2025, at 17:26, Alex Coplan wrote:
>>
>> On 03/09/2025 09:40, Kyrylo Tkachov wrote:
>>> Hi all,
>>>
>>> With g:d20b2ad845876eec0ee80a3933ad49f9f6c4ee30 the narro
> On 3 Sep 2025, at 17:26, Alex Coplan wrote:
>
> On 03/09/2025 09:40, Kyrylo Tkachov wrote:
>> Hi all,
>>
>> With g:d20b2ad845876eec0ee80a3933ad49f9f6c4ee30 the narrowing shift
>> instructions
>> are now represented with standard RTL and more merging
Hi Karl,
> On 2 Sep 2025, at 16:16, Karl Meakin wrote:
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-sme.md (@aarch64_sme_write_zt): New
> insn.
> (aarch64_sme_lut_zt): Likewise.
> * config/aarch64/aarch64-sve-builtins-shapes.cc (parse_type): New type format
> "%T".
> (struct luti_lane_zt_b
trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
PR target/121749
* config/aarch64/aarch64-simd.md (aarch64_shrn_n):
Use aarch64_simd_shift_imm_offset_ instead of
aarch64_simd_shift_imm_offset_ predicate.
(aarch64_shrn_n VQN define_expand): Lik
> On 2 Sep 2025, at 12:57, Tamar Christina wrote:
>
> This implements the new vector optabs vec_addh_narrow
> adding support for in-vectorizer use for early break.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
> On 2 Sep 2025, at 11:46, Tamar Christina wrote:
>
> Given a sequence such as
>
> int foo ()
> {
> #pragma GCC unroll 4
>   for (int i = 0; i < N; i++)
>     if (a[i] == 124)
>       return 1;
>
>   return 0;
> }
>
> where a[i] is long long, we will unroll the loop and use an OR reduction for
>
Hi Tamar,
> On 2 Sep 2025, at 11:46, Tamar Christina wrote:
>
> This implements the new vector optabs vec_addh_narrow
> adding support for in-vectorizer use for early break.
The Advanced SIMD ADDHN instruction doesn’t perform a widening of the operands
for the addition as far as I know.
That i
Hi Jennifer,
> On 25 Aug 2025, at 12:56, Jennifer Schmitz wrote:
>
> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
> vector and predication is _z, an ICE in vregs occurs, e.g. for
>
> svuint8_t foo (svbool_t pg, svuint8_t op1)
> {
>   return svsqadd_u8_z (pg, op1, svd
aarch64 ldp/stp Alex Coplan
> aarch64 port            Richard Earnshaw
> -aarch64 port            Richard Sandiford
> aarch64 port            Tamar Christina
> aarch64 port            Kyrylo Tkachov
> alpha port Richard Henderson
> --
> 2.50.1
>
> On 13 Aug 2025, at 17:34, Richard Sandiford wrote:
>
> Wilco Dijkstra writes:
>> Add an expander for isinf using integer arithmetic. This is
>> typically faster and avoids generating spurious exceptions on
>> signaling NaNs.
>>
>> int isinf1 (float x) { return __builtin_isinf (x); }
>>
>
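The integer-arithmetic approach to isinf described in the message above can be sketched in plain C. This is only an illustration of the idea, not the expander from the patch: the helper name `isinf_bits` is hypothetical, and the code assumes `float` is IEEE 754 binary32. Because it never performs a floating-point operation on the value, it cannot raise an exception on a signaling NaN.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): returns nonzero when X is
   +Inf or -Inf, inspecting only the bit pattern.
   Assumes float is IEEE 754 binary32.  */
int isinf_bits (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);  /* Read the bits without aliasing issues.  */
  /* Shifting left by one discards the sign bit.  An infinity then has
     all eight exponent bits set and a zero mantissa.  */
  return (uint32_t) (bits << 1) == 0xff000000u;
}

/* Small self-check covering infinities, NaN and ordinary values.  */
void check_isinf_bits (void)
{
  assert (isinf_bits (INFINITY));
  assert (isinf_bits (-INFINITY));
  assert (!isinf_bits (NAN));
  assert (!isinf_bits (0.0f));
  assert (!isinf_bits (1.0f));
}
```

A NaN also has all exponent bits set, but its mantissa is nonzero, so the comparison rejects it.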
> On 8 Aug 2025, at 14:23, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Pengfei Li
>> Sent: Friday, August 8, 2025 11:00 AM
>> To: Kyrylo Tkachov
>> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford ;
>> Tamar Christina
>> Su
Hi Pengfei,
> On 7 Aug 2025, at 18:45, Pengfei Li wrote:
>
> This patch fixes incorrect constraints in RTL patterns for AArch64 SVE
> gather/scatter with type widening/narrowing and vector-plus-immediate
> addressing. The bug leads to below "immediate offset out of range"
> errors during assembl
> On 5 Aug 2025, at 11:00, Alex Coplan wrote:
>
> Hi Kyrill,
>
> Sorry for the slow reply, I was away on holiday until yesterday.
>
> On 15/07/2025 13:08, Kyrylo Tkachov wrote:
>> Hi Alex,
>>
>>> On 15 Jul 2025, at 14:59, Alex Coplan wrote:
>>
> On 31 Jul 2025, at 17:20, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Kyrylo Tkachov
>> Sent: Thursday, July 31, 2025 3:47 PM
>> To: Jennifer Schmitz
>> Cc: GCC Patches ; Andrew Pinski
>> ; Richard Earnshaw ; Richard
>&
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> After previous patches, we should always get a VNx16BI result
> for ACLE intrinsics that return svbool_t. This patch adds
> an assert that checks a more general condition than that.
>
Ok.
Thanks,
Kyrill
> gcc/
> * config/aarch64/aarc
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> This patch continues the work of making ACLE intrinsics use VNx16BI
> for svbool_t results. It deals with the predicate forms of svdupq.
>
> The general predicate expansion builds an equivalent integer vector
> and then compares it wit
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> This patch continues the work of making ACLE intrinsics use VNx16BI
> for svbool_t results. It deals with the predicate forms of svdup.
>
Ok.
Thanks,
Kyrill
> gcc/
> * config/aarch64/aarch64-protos.h
> (aarch64_emit_sve_pred_vec_dupl
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> This patch continues the work of making ACLE intrinsics use VNx16BI
> for svbool_t results. It deals with the svpnext* intrinsics.
>
I wonder if the new patterns need pred_clobber alternatives in this and the
other patches?
If they d
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> This patch continues the work of making ACLE intrinsics use VNx16BI
> for svbool_t results. It deals with the svmatch* and svnmatch*
> intrinsics.
>
Ok.
Thanks,
Kyrill
> gcc/
> * config/aarch64/aarch64-sve2.md (@aarch64_pred_):
> Spl
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> This patch continues the work of making ACLE intrinsics use VNx16BI
> for svbool_t results. It deals with the svac* intrinsics (floating-
> point compare absolute).
Ok.
Thanks,
Kyrill
>
> gcc/
> * config/aarch64/aarch64-sve.md (@aarc
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> This patch continues the work of making ACLE intrinsics use VNx16BI
> for svbool_t results. It deals with the svcmp*_wide intrinsics.
>
> Since the only uses of these patterns are for ACLE intrinsics,
> there didn't seem much point add
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> Patterns that fuse a predicate operation P with a PTEST use
> aarch64_sve_same_pred_for_ptest_p to test whether the governing
> predicates of P and the PTEST are compatible. Most patterns were also
> written as define_insn_and_rewrites,
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> The patterns for the svcmp_wide intrinsics used a VNx16BI
> input predicate for all modes, instead of the usual .
> That unnecessarily made some input bits significant, but more
> importantly, it triggered an ICE in aarch64_sve_same_pred
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> This patch continues the work of making ACLE intrinsics use VNx16BI
> for svbool_t results. It deals with the non-widening integer forms
> of svcmp*. The handling of the PTEST patterns is similar to that
> for the earlier svwhile* patc
> On 29 Jul 2025, at 17:14, Jennifer Schmitz wrote:
>
> This patch adds dispatch constraints for Neoverse V2 and illustrates the steps
> necessary to enable dispatch scheduling for an AArch64 core.
>
> The dispatch constraints are based on section 4.1 of the Neoverse V2 SWOG.
> Please note tha
> On 31 Jul 2025, at 14:34, Tejas Belagod wrote:
>
> The test was unsafe when tested on different vector lengths. This patch
> fixes it to work on all lengths.
>
Ok. I’ve seen this test fail on GCC 15 branch too, do we want this fix there as
well?
Thanks,
Kyrill
> gcc/testsuite/ChangeLog
>
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> This patch continues the work of making ACLE intrinsics use VNx16BI
> for svbool_t results. It deals with the svunpk* intrinsics.
>
LGTM.
Thanks,
Kyrill
> gcc/
> * config/aarch64/aarch64-sve.md (@aarch64_sve_punpk_acle)
> (*aarch64_s
> On 29 Jul 2025, at 18:41, Richard Sandiford wrote:
>
> This patch continues the work of making ACLE intrinsics use VNx16BI
> for svbool_t results. It deals with the floating-point forms of svcmp*.
>
> gcc/
> * config/aarch64/aarch64-sve.md (@aarch64_pred_fcm_acle)
> (*aarch64_pred_fcm_acle,
> On 29 Jul 2025, at 15:31, Richard Sandiford wrote:
>
> The 8-bit and 16-bit tests in cmpbr.c assumed an inverted operand
> order ("w1, w0"), but it's possible to use the uninverted operand
> order too. This patch generalises the tests to support both forms.
>
> This is a prerequisite for a
> On 29 Jul 2025, at 15:15, Remi Machet wrote:
>
>
> On 7/29/25 14:44, Richard Sandiford wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> function_expander::get_reg_target didn't actually check for a register,
>> meaning that it could return a memory target instea
> On 28 Jul 2025, at 17:10, Remi Machet wrote:
>
>
> On 7/28/25 17:02, Kyrylo Tkachov wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> Hi Spencer,
>>
>>> On 28 Jul 2025, at 16:25, Spencer Abson wrote:
>
Hi Spencer,
> On 28 Jul 2025, at 16:25, Spencer Abson wrote:
>
> Streaming-compatible functions can be compiled without SME enabled, but need
> to use "SMSTART SM" and "SMSTOP SM" to temporarily switch into the streaming
> mode of a callee. These switches are conditional on the current mode be
> On 27 Jul 2025, at 03:31, Andrew Pinski wrote:
>
> On Fri, Jul 25, 2025 at 5:14 AM Jennifer Schmitz wrote:
>>
>> This patch adds a new tuning model for the NVIDIA Olympus core.
>> The values used here are based on the Software Optimization Guide
>> that will be published imminently.
>>
>>
> On 21 Jul 2025, at 11:43, Kyrylo Tkachov wrote:
>
> Hi Tamar,
>
>> On 21 Jul 2025, at 11:12, Tamar Christina wrote:
>>
>> Hi Kyrill,
>>
>>> -----Original Message-----
>>> From: Kyrylo Tkachov
>>> Sent: Friday, July 18, 2025
Hi Tamar,
> On 21 Jul 2025, at 11:12, Tamar Christina wrote:
>
> Hi Kyrill,
>
>> -----Original Message-----
>> From: Kyrylo Tkachov
>> Sent: Friday, July 18, 2025 10:40 AM
>> To: GCC Patches
>> Cc: Tamar Christina ; Richard Sandiford
>> ; Andr
Hi Jennifer,
> On 18 Jul 2025, at 17:08, Jennifer Schmitz wrote:
>
>
>
>> On 18 Jul 2025, at 11:39, Kyrylo Tkachov wrote:
>>
>> External email: Use caution opening links or attachments
>>
>>
>> Hi all,
>>
>> For insertin
Hi Tamar,
> On 18 Jul 2025, at 18:25, Tamar Christina wrote:
>
> Hi Kyrill,
>
>> -----Original Message-----
>> From: Kyrylo Tkachov
>> Sent: Friday, July 18, 2025 10:40 AM
>> To: GCC Patches
>> Cc: Tamar Christina ; Richard Sandiford
>> ; Alex C
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/arm/aarch-common-protos.h (vector_cost_table): Add ins_gp
field. Add comments to other vector cost fields.
* config/aarch64/aarch64.cc (aarch64_rtx_costs): Handle VEC_MERGE case.
* config/aarch64/aarch6
-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64.cc (aarch64_rtx_costs): Add extra_cost values
only when speed is true for CONST_VECTOR, VEC_DUPLICATE, VEC_SELECT
cases.
* config/aarch64/aarch64-cost-tables.h (qdf24xx_extra_costs,
thunderx_extra_costs
> On 15 Jul 2025, at 15:50, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>> Hi all,
>>
>> SVE2 BSL2N (x, y, z) = (x & z) | (~y & ~z). When x == y this computes:
>> (x & z) | (~x & ~z) which is ~(x ^ z).
>> Thus, we can use it
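The identity quoted above, that BSL2N with x == y reduces to ~(x ^ z), can be checked with a scalar model. The sketch below is an assumption-laden illustration: `bsl2n_model` and `xnor_model` are hypothetical names that model the formula on 64-bit integers, not the SVE2 instruction or the pattern in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the formula quoted in the message:
   BSL2N (x, y, z) = (x & z) | (~y & ~z).  Hypothetical helper.  */
uint64_t bsl2n_model (uint64_t x, uint64_t y, uint64_t z)
{
  return (x & z) | (~y & ~z);
}

/* Bitwise XNOR: each result bit is 1 exactly when the inputs agree.  */
uint64_t xnor_model (uint64_t x, uint64_t z)
{
  return ~(x ^ z);
}

/* Check the identity on a few sample bit patterns.  */
void check_bsl2n_identity (void)
{
  const uint64_t samples[] = {
    0, ~UINT64_C (0), UINT64_C (0x0123456789abcdef),
    UINT64_C (0xf0f0f0f00f0f0f0f)
  };
  const size_t n = sizeof samples / sizeof samples[0];

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      assert (bsl2n_model (samples[i], samples[i], samples[j])
              == xnor_model (samples[i], samples[j]));
}
```

Per bit, (x & z) is 1 when both bits are 1 and (~x & ~z) is 1 when both are 0, so the OR is 1 exactly when the bits are equal.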
> On 15 Jul 2025, at 15:01, Alex Coplan wrote:
>
> Hi,
>
> This relaxes an overzealous assert that required the fpm_t argument to
> be in DImode when expanding FP8 intrinsics. Of course this fails to
> account for modeless const_ints.
>
> Bootstrapped/regtested on aarch64-linux-gnu, OK for
Hi Alex,
> On 15 Jul 2025, at 14:59, Alex Coplan wrote:
>
> Hi,
>
> The predication of the SVE2 FP8 dot product insns was relying on the
> architectural dependency:
>
> FEAT_FP8DOT2 => FEAT_FP8DOT4
>
> which was relaxed in GCC as of
> r15-7480-g299a8e2dc667e795991bc439d2cad5ea5bd379e2, thus l
not z0.d, p3/m, z0.d
ret
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl2n_eon):
New pattern.
(*aarch64_sve2_eon_bsl2n_unpred)
nerate the MOVPRFX
when the operands fall that way, but I guess having a 2-insn MOVPRFX form is
not worse than the current 2-insn codegen at least, and the MOVPRFX can be
fused by many cores.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tka
> On 8 Jul 2025, at 17:43, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>> Thanks for your comments, do you mean something like the following?
>
> Yeah, the patch LGTM, thanks.
So it turned out that doing this in the EOR3 pattern in patch 4/7 caused
wrong-co
I had pushed this patch on Friday but have reverted it on trunk now because it
seems to be causing miscomputes in 531.deepsjeng_r.
Thanks,
Kyrill
> On 8 Jul 2025, at 08:28, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Kyrylo Tkachov
>> Sent: Monda
+ arm maintainers.
Hi Pierre,
> On 14 Jul 2025, at 14:07, Pierre Ossman wrote:
>
> Suggested fix for this issue:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60428
>
> Did not get any response there, so seeing if this is a better forum for
> suggested changes.
>
> We've been using this
> On 11 Jul 2025, at 16:48, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>>> On 10 Jul 2025, at 11:12, Kyrylo Tkachov wrote:
>>>
>>>
>>>
>>>> On 10 Jul 2025, at 10:40, Richard Sandiford
>>>> wrote:
>>>
> On 10 Jul 2025, at 11:12, Kyrylo Tkachov wrote:
>
>
>
>> On 10 Jul 2025, at 10:40, Richard Sandiford
>> wrote:
>>
>> Kyrylo Tkachov writes:
>>> Hi all,
>>>
>>> While the SVE2 NBSL instruction accepts MOVPRFX to add more f
> On 10 Jul 2025, at 10:40, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>> Hi all,
>>
>> While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility
>> due to its tied operands, the destination of the movprfx cannot be also
>> a so
> On 18 Jun 2025, at 17:26, Kyrylo Tkachov wrote:
>
> Hi all,
>
> This adds support for -mcpu=gb10. This is a big.LITTLE configuration
> involving Cortex-X925 and Cortex-A725 cores. The appropriate MIDR numbers
> are added to detect them in -mcpu=native. We did not add a
nbsl z0.d, z0.d, z2.d, z0.d
ret
which generated a gas warning.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Do we want to backport it?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
PR target/120999
* config/aarch64/aarch64-sve2.md (*aa
> On 10 Jul 2025, at 08:09, Jakub Jelinek wrote:
>
> Hi!
>
> While I'm not a native English speaker, I believe all the uses
> of bellow (roar/bark/...) in comments in gcc are meant to be
> below (beneath/under/...).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
>
Hi Alfie,
> On 7 Jul 2025, at 10:46, Alfie Richards wrote:
>
> Hello all,
>
> This patch implements the couple of amin/amax instructions that are part of
> SME2 + faminmax.
>
> Regression tested and bootstrapped for AArch64.
>
> Thanks,
> Alfie
>
> -- >8 --
>
> Implements the sme2+faminmax
> On 8 Jul 2025, at 12:39, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Richard Sandiford
>> Sent: Tuesday, July 8, 2025 10:07 AM
>> To: Tamar Christina
>> Cc: Kyrylo Tkachov ; GCC Patches > patc...@gcc.gnu.org>; Richard
> On 7 Jul 2025, at 13:27, Richard Sandiford wrote:
>
> Tamar Christina writes:
>>> -----Original Message-----
>>> From: Kyrylo Tkachov
>>> Sent: Monday, July 7, 2025 10:38 AM
>>> To: GCC Patches
>>> Cc: Richard Sandiford ; Richard Earns
for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl1n_unpreddi): New
define_insn_and_split.
gcc/testsuite/
* gcc.target/aarch64/sve2/bsl1n_d.c: New test.
0006-aarch64-Use-SVE2-BSL1N-for-DImode-arguments.patch
x1_t a, uint64x1_t b, uint64x1_t c) { return EOR3 (a,
b, c); }
We generate the desired:
eor3_d_gp:
eor x1, x1, x2
eor x0, x1, x0
ret
eor3_d:
eor3 v0.16b, v0.16b, v1.16b, v2.16b
ret
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
ested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl2n_unpreddi): New
define_insn_and_split.
* config/aarch64/aarch64.cc (aarch64_bsl2n_rtx_form_p): Define.
(aarch64_rt
trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-sve.md (*aarch64_sve2_nbsl_unpreddi): New
define_insn_and_split.
gcc/testsuite/
* gcc.target/aarch64/sve2/nbsl_d.c: New test.
0005-aarch64-Use-SVE2-NBSL-for-DImode-arguments.patch
Description:
of:
bcax_s:
eor v1.8b, v1.8b, v2.8b
eor v0.8b, v1.8b, v0.8b
ret
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-simd.md (eor3q4): Use VDQ_I mode
iterator.
gcc/testsuite
b
ret
When the inputs are in SIMD regs we use BCAX and when they are in GP regs we
don't force them to SIMD with extra moves.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-simd
rovement
always.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-simd.md (bcaxq4): Use VDQ_I mode
iterator.
gcc/testsuite/
* gcc.target/aarch64/simd/bcax_d.c: New test.
0001-a
Resending due to difficulties with my email
> On 7 Jul 2025, at 11:56, Kyrylo Tkachov wrote:
>
> Hi all,
>
> This series improves code generation for 64-bit vector types as well as the
> scalar DImode types.
> It makes use of SHA3 and SVE2 instructions like BCAX, EOR3
cheap itself and can be scheduled away from the critical path or even CSE'd
with other PTRUE constants.
As this sequence is larger code size-wise it is avoided for -Os.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
> On 1 Jul 2025, at 18:37, Alex Coplan wrote:
>
> The "else operand" to maskload should always be a const_vector, never a
> const_int.
>
> This was just an issue I noticed while looking through the code, I don't
> have a testcase which shows a concrete problem due to this.
>
> Testing of tha
> On 1 Jul 2025, at 17:36, Richard Sandiford wrote:
>
> Soumya AR writes:
>> From 2a2c3e3683aaf3041524df166fc6f8cf20895a0b Mon Sep 17 00:00:00 2001
>> From: Soumya AR
>> Date: Mon, 30 Jun 2025 12:17:30 -0700
>> Subject: [PATCH] aarch64: Enable selective LDAPUR generation for cores with
>> RCP
> On 17 Jun 2025, at 12:19, Kyrylo Tkachov wrote:
>
>
>
>> On 4 Apr 2025, at 20:28, ezra.sito...@arm.com wrote:
>>
>> From: Ezra Sitorus
>>
>> This patch updates `aarch64-sys-regs.def', bringing it into sync with
>> the Binutil
trunk and GCC 15 when I’m back.
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-cores.def (gb10): New entry.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (AArch64 Options): Document the above.
0001-aarch64-Add-support-for
> On 16 Jun 2025, at 09:54, Richard Sandiford wrote:
>
> We generated inefficient code for bitfield references to Advanced
> SIMD structure modes. In RTL, these modes are just extra-long
> vectors, and so inserting and extracting an element is simply
> a vec_set or vec_extract operation.
>
>
> On 4 Apr 2025, at 20:28, ezra.sito...@arm.com wrote:
>
> From: Ezra Sitorus
>
> This patch updates `aarch64-sys-regs.def', bringing it into sync with
> the Binutils source after this change:
> https://sourceware.org/pipermail/binutils/2025-March/139894.html
Ok. I think these changes are co
Hi Spencer,
Thanks for the patch.
> On 13 Jun 2025, at 14:46, Spencer Abson wrote:
>
> Add the missing combiner patterns for folding NOT+PTEST to NOTS when
> they share the same GP.
>
I guess GP here means “governing predicate”?
GP usually means “General Purpose (register)” in aarch64 so it’d
> On 12 Jun 2025, at 18:20, Remi Machet wrote:
>
>
> On 6/12/25 12:02, Richard Sandiford wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> Remi Machet writes:
>>> Add an optimization to aarch64 SIMD converting mvn+shrn into mvni+subhn
>>> which
>>> allows for bett
> On 12 Jun 2025, at 18:02, Richard Sandiford wrote:
>
> Remi Machet writes:
>> Add an optimization to aarch64 SIMD converting mvn+shrn into mvni+subhn
>> which
>> allows for better optimization when the code is inside a loop by using a
>> constant.
>>
>> Bootstrapped and regtested on aarch6
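The reason mvn+shrn can be rewritten as mvni+subhn can be seen in a scalar model of a single lane. This is a sketch under stated assumptions rather than the transformation as implemented in the patch: it assumes 16-bit input lanes and a narrowing shift of 8 (half the element width, which is what SUBHN keeps), and the function names are hypothetical. The key fact is that ~x equals 0xffff - x in 16-bit arithmetic, so the inversion can be folded into a subtraction from an all-ones constant.

```c
#include <assert.h>
#include <stdint.h>

/* Model of one 16-bit lane: bitwise NOT, then narrowing shift right
   by 8 (MVN followed by SHRN #8).  Hypothetical helper.  */
uint8_t mvn_shrn_lane (uint16_t x)
{
  uint16_t inverted = (uint16_t) ~x;
  return (uint8_t) (inverted >> 8);
}

/* Model of SUBHN with an all-ones first operand, which is the kind of
   constant MVNI could materialise: keep the high half of (0xffff - x).  */
uint8_t mvni_subhn_lane (uint16_t x)
{
  uint16_t all_ones = 0xffffu;
  uint16_t diff = (uint16_t) (all_ones - x);
  return (uint8_t) (diff >> 8);
}

/* Exhaustively compare the two forms over every 16-bit input.  */
void check_mvn_shrn_equivalence (void)
{
  for (uint32_t x = 0; x <= 0xffffu; x++)
    assert (mvn_shrn_lane ((uint16_t) x) == mvni_subhn_lane ((uint16_t) x));
}
```

In a loop, the all-ones vector is invariant and can be hoisted, leaving a single instruction per iteration in place of two.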
> On 11 Jun 2025, at 16:22, Richard Sandiford wrote:
>
> The PCS defines a lazy save scheme for managing ZA across normal
> "private-ZA" functions. GCC currently uses this scheme for calls
> to all private-ZA functions (rather than using caller-save).
>
> Therefore, before a sequence of call
> On 3 Jun 2025, at 17:56, Richard Sandiford wrote:
>
> Tamar Christina writes:
>> As requested in my patch for -mmax-vectorization this promotes the parameter
>> --param aarch64-autovec-preference to a first class top target flag.
>>
>> If both the parameter and the flag are specified the par
> On 28 May 2025, at 13:36, Kyrylo Tkachov wrote:
>
> Hi Yuta-san
>
>> On 23 May 2025, at 07:49, Yuta Mukai (Fujitsu)
>> wrote:
>>
>> Hello,
>>
>> We would like to enable features for FUJITSU-MONAKA that were implemented in
>> GC
Hi Yuta-san
> On 23 May 2025, at 07:49, Yuta Mukai (Fujitsu) wrote:
>
> Hello,
>
> We would like to enable features for FUJITSU-MONAKA that were implemented in
> GCC after we added support for FUJITSU-MONAKA.
> As the features were implemented in GCC15, we also want to backport it to
> GCC15.
> On 16 May 2025, at 12:35, Richard Sandiford wrote:
>
> Jennifer Schmitz writes:
>> The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using
>> partial_subreg_p in the function copy_value during the RTL pass
>> regcprop, failing the assertion in
>>
>> inline bool
>> partial_su
> On 10 May 2025, at 06:17, Andrew Pinski wrote:
>
> Since the AARCH64_CORE defines in aarch64-cores.def all use -1 for
> the variant, it is just easier to add the cast to unsigned in the usage
> in driver-aarch64.cc.
>
> Built and tested on aarch64-linux-gnu.
Ok.
Thanks,
Kyrill
>
> gcc/Ch
> On 10 May 2025, at 05:59, Andrew Pinski wrote:
>
> There is a narrowing warning in aarch64_detect_vector_stmt_subtype
> about gather_load_x32_cost and gather_load_x64_cost converting from int to
> unsigned.
> These fields are always unsigned and even the constructor for sve_vec_cost
> take
> On 8 May 2025, at 21:10, Karl Meakin wrote:
>
> Add rules for lowering `cbranch4` to CBB/CBH/CB when
> CMPBR extension is enabled.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.md (cbranch4): Emit CMPBR
> instructions if possible.
> (BRANCH_LEN_P_1Kib): New constant.
> (BRANCH_LEN_N_1Kib)
Hi Richard,
> On 7 May 2025, at 18:15, Richard Earnshaw wrote:
>
>
> The header file for the Arm implementation of mmintrin.h was changed in GCC-15
> to disable access to the intrinsics. This patch removes the internal code
> as well.
>
> We still allow -mcpu/-march options for the wmmx cpus,
> On 7 May 2025, at 12:27, Karl Meakin wrote:
>
> Give the `define_insn` rules used in lowering `cbranch4` to RTL
> more descriptive and consistent names: from now on, each rule is named
> after the AArch64 instruction that it generates. Also add comments to
> document each rule.
>
> gcc/Chang
> On 7 May 2025, at 12:27, Karl Meakin wrote:
>
> The rules for conditional branches were spread throughout `aarch64.md`.
> Group them together so it is easier to understand how `cbranch4`
> is lowered to RTL.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.md (condjump): move.
> (*compare_co
Hi Karl,
> On 7 May 2025, at 12:27, Karl Meakin wrote:
>
> This patch series adds support for the CMPBR extension. It includes the
> new `+cmpbr` option and rules to generate the new instructions when
> lowering conditional branches.
Thanks for the series.
You didn’t state it explicitly, but ha
> On 7 May 2025, at 12:27, Karl Meakin wrote:
>
> Add rules for lowering `cbranch4` to CBB/CBH/CB when CMPBR
> extension is enabled.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.md (cbranch4): emit CMPBR
> instructions if possible.
> (cbranch4): new expand rule.
> (aarch64_cb): likewise.
>
> On 7 May 2025, at 12:27, Karl Meakin wrote:
>
> Commit the test file `cmpbr.c` before rules for generating the new
> instructions are added, so that the changes in codegen are more obvious
> in the next commit.
I guess that’s an LLVM best practice.
In GCC since we have the check-function-bod
> On 7 May 2025, at 12:27, Karl Meakin wrote:
>
> Add the `+cmpbr` option to enable the FEAT_CMPBR architectural
> extension.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-option-extensions.def (cmpbr): new
> option.
> * config/aarch64/aarch64.h (TARGET_CMPBR): new macro.
> * doc/invoke.tex
> On 7 May 2025, at 12:27, Karl Meakin wrote:
>
> The `far_branch` attribute only ever takes the values 0 or 1, so make it
> a `no/yes` valued string attribute instead.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.md (far_branch): replace 0/1 with
> no/yes.
> (aarch64_bcond): handle renam
> On 7 May 2025, at 12:27, Karl Meakin wrote:
>
> Extract the hardcoded values for the minimum PC-relative displacements
> into named constants and document them.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant.
> (BRANCH_LEN_N_128MiB): likewise.
> (BRA
> On 7 May 2025, at 12:27, Karl Meakin wrote:
>
> Make the formatting of the RTL templates in the rules for branch
> instructions more consistent with each other.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.md (cbranch4): reformat.
> (cbranchcc4): likewise.
> (condjump): likewise.
> (*co
> On 6 May 2025, at 10:30, Soumya AR wrote:
>
> From: Soumya AR
>
> This patch adds a get_map () method to the JSON object class to provide access
> to the underlying hash map that stores the JSON key-value pairs.
>
> It also reorganizes the private and public sections of the class to expos
Hi Richard,
> On 6 May 2025, at 12:34, Richard Sandiford wrote:
>
> writes:
>> From: Soumya AR
>>
>> Hi,
>>
>> This RFC and subsequent patch series introduces support for printing and
>> parsing
>> of aarch64 tuning parameters in the form of JSON.
>
> Thanks for doing this. It looks r
> On 4 May 2025, at 19:19, Yangyu Chen wrote:
>
> Hi everyone,
>
> This patch series introduces support for the target_clones profile
> option in GCC. This option enables users to specify target_clones
> attributes in a separate file, allowing GCC to generate multiple
> versions of the functio
Pushing as obvious.
Signed-off-by: Kyrylo Tkachov
0001-AArch64-changes.html-Fix-typo.patch
Description: 0001-AArch64-changes.html-Fix-typo.patch
> On 1 May 2025, at 14:02, Ayan Shafqat wrote:
>
> On Thu, May 01, 2025 at 08:09:18AM +0000, Kyrylo Tkachov wrote:
>>
>> I was going to ask why not use the standard __builtin_sqrt builtins but I
>> guess those don’t guarantee that we avoid a libcall in
> On 28 Apr 2025, at 21:29, Ayan Shafqat wrote:
>
> Rebased with gcc 15.1
>
> This patch introduces two new inline functions, __sqrt and __sqrtf, in
> arm_acle.h for AArch64 targets. These functions wrap the new builtins
> __builtin_aarch64_sqrtdf and __builtin_aarch64_sqrtsf, respectively,
>