Pushing as obvious.
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
* common.opt.urls: Regenerate.
0001-Regenerate-common.opt.urls.patch
Description: 0001-Regenerate-common.opt.urls.patch
> On 15 Apr 2025, at 15:42, Richard Biener wrote:
>
> On Mon, Apr 14, 2025 at 3:11 PM Kyrylo Tkachov wrote:
>>
>> Hi Honza,
>>
>>> On 13 Apr 2025, at 23:19, Jan Hubicka wrote:
>>>
>>>> +@opindex fipa-reorder-for-locality
>>>
Hi Tejas,
> On 14 Apr 2025, at 16:04, Tejas Belagod wrote:
>
> The operand order to gen_vcond_mask call in the vec_extract pattern is wrong.
> Fix the order where predicate is operand 3.
>
> Tested and bootstrapped on aarch64-linux-gnu. OK for trunk?
>
> gcc/ChangeLog
>
> * config/aarch64/aar
Hi Honza,
> On 13 Apr 2025, at 23:19, Jan Hubicka wrote:
>
>> +@opindex fipa-reorder-for-locality
>> +@item -fipa-reorder-for-locality
>> +Group call chains close together in the binary layout to improve code
>> +locality. This option is incompatible with an explicit
>> +@option{-flto-part
> On 26 Mar 2025, at 08:42, Kyrylo Tkachov wrote:
>
> Ping.
Ping.
https://gcc.gnu.org/pipermail/gcc-patches/2025-March/676958.html
I’ve run a profiled LTO bootstrap of GCC with the new bootstrap-lto-locality
bootstrap config
And compared it against a GCC produced by the exi
> On 7 Apr 2025, at 10:21, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Kyrylo Tkachov
>> Sent: Monday, March 31, 2025 1:43 PM
>> To: i...@sandoe.co.uk
>> Cc: Tamar Christina ; GCC Patches > patc...@gcc.gnu.org>; Alice Carlotti ;
> On 31 Mar 2025, at 09:43, Richard Biener wrote:
>
> On Mon, Mar 31, 2025 at 9:41 AM Richard Biener
> wrote:
>>
>> On Mon, Mar 31, 2025 at 9:36 AM Kyrylo Tkachov wrote:
>>>
>>> Ping.
>>
>> Can you reference the patch please? I'
Hi all,
As we're starting a new month, introduce a more appropriate -mapril=
to specify the compilation target instead.
This helps keep GCC more up to date with the passage of time.
Bootstrapped and tested on aarch64-none-linux-gnu.
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aa
Hi Iain,
> On 22 Mar 2025, at 15:31, Iain Sandoe wrote:
>
> 0. Sorry this has taken some time to close off; partly because of waiting
> for input, but mostly that I've been stretched with other work.
> 1. As per the commit message, the apparent non-conformance with 8.5/6
> because FEAT_SPECR
Ping.
Thanks,
Kyrill
> On 24 Mar 2025, at 14:28, Kyrylo Tkachov wrote:
>
> Hi all,
>
> In this testcase GCC tries to expand a VNx4BI vector:
> vector(4) _40;
> _39 = () _24;
> _40 = {_39, _39, _39, _39};
>
> This ends up in a scalarised sequence of bitfiel
Ping.
Thanks,
Kyrill
> On 6 Mar 2025, at 09:25, Kyrylo Tkachov wrote:
>
> Hi all,
>
> Implement partitioning and cloning in the callgraph to help locality.
> A new -fipa-reorder-for-locality flag is used to enable this.
> The majority of the logic is in the new IPA
bfis are gone.
Bootstrapped and tested on aarch64-none-linux-gnu.
Given this is a regression from GCC 13, is this ok for trunk now?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
PR middle-end/119442
* expr.cc (store_constructor): Also allow element modes explicitly
accepted by
Hi Dhruv,
> On 21 Mar 2025, at 11:11, Dhruv Chawla wrote:
>
> This adds support for the NVIDIA Olympus core to the AArch64 backend. The
> initial patch does not add any special tuning decisions, and those may come
> later.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
Thanks, given
g to trunk.
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-arches.def (...): Add SVE2p1.
* doc/invoke.texi (AArch64 Options): Document +sve2p1 in
-march=armv9.4-a.
0001-aarch64-Add-sve2p1-to-march-armv9.4-a-flags.patch
Description: 0001-a
> On 16 Mar 2025, at 20:15, Ayan Shafqat wrote:
>
> This patch introduces inline definitions for the __fma and __fmaf
> functions in arm_acle.h for AArch64 targets. These definitions rely on
> __builtin_fma and __builtin_fmaf to ensure proper inlining and to meet
> the ACLE requirements [1].
>
Hi Ayan,
> On 11 Mar 2025, at 14:53, Ayan Shafqat wrote:
>
> Hello Kyrylo,
>
> On Tue, Mar 11, 2025 at 08:55:46AM +, Kyrylo Tkachov wrote:
>> This looks ok to me.
>> GCC is currently in a regression fixing stage so normally such a change
>> would wait u
Hi Ayan,
> On 9 Mar 2025, at 21:46, Ayan Shafqat wrote:
>
> This patch introduces inline definitions for the __fma and __fmaf
> functions in arm_acle.h for AArch64 targets. These definitions rely on
> __builtin_fma and __builtin_fmaf to ensure proper inlining and to meet
> the ACLE requirements
ality, but we'd appreciate wider performance evaluation.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for mainline?
Thanks,
Kyrill
Signed-off-by: Prachi Godbole
Co-authored-by: Kyrylo Tkachov
config/ChangeLog:
* bootstrap-lto-locality.mk: New file.
gcc
both (normal LTO bootstrap and profiledbootstrap).
>>
>> With this optimization we are seeing good performance gains on some large
>> internal workloads that stress the parts of the processor that are sensitive
>> to code locality, but we'd appreciate wider performance eva
.
Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
PR rtl-optimization/119046
* config/aarch64/aarch64.cc (aarch64_evpc_dup): Use VOIDmode for
PARALLEL.
0001-PR-rtl-optimization-119046-aarch64-Fix-PARALLEL
> On 5 Mar 2025, at 11:14, Richard Biener wrote:
>
> On Tue, Mar 4, 2025 at 10:01 PM Richard Sandiford
> wrote:
>>
>> Kyrylo Tkachov writes:
>>> Hi all,
>>>
>>> In this testcase late-combine was failing to merge:
>>> dup v31.4s
> On 3 Mar 2025, at 19:52, Wilco Dijkstra wrote:
>
>
> Outline atomics is not designed to be used with -mcmodel=large, so disable
> it automatically if the large code model is used.
>
> Passes regress, OK for commit?
>
This restriction should be documented in invoke.texi IMO.
I also think i
> On 3 Mar 2025, at 19:58, Wilco Dijkstra wrote:
>
>
> Enable the early scheduler on AArch64 for O3/Ofast. This means GCC15 benefits
> from much faster build times with -O2, but avoids the regressions in lbm which
> is very sensitive to minor scheduling changes due to long FMA chains. We can
> On 3 Mar 2025, at 09:49, Andrew Pinski wrote:
>
> On Mon, Mar 3, 2025 at 12:43 AM Kyrylo Tkachov wrote:
>>
>>
>>
>>> On 28 Feb 2025, at 19:06, Andrew Pinski wrote:
>>>
>>> On Fri, Feb 28, 2025 at 5:25 AM Kyrylo Tkachov wrote:
>
> On 28 Feb 2025, at 19:06, Andrew Pinski wrote:
>
> On Fri, Feb 28, 2025 at 5:25 AM Kyrylo Tkachov wrote:
>>
>> Hi all,
>>
>> In this PR late-combine was failing to merge:
>> dup v31.4s, v31.s[3]
>> fmla v30.4s, v31.4s, v29.4s
>> in
d and tested on aarch64-none-linux-gnu.
Apparently this also fixes a regression in
gcc.target/aarch64/vmul_element_cost.c that I observed.
Signed-off-by: Kyrylo Tkachov
gcc/
PR rtl-optimization/119046
* rtlanal.cc (may_trap_p_1): Don't mark FP-mode PARALLELs as trapping.
gcc
igned-off-by: Kyrylo Tkachov
gcc/
PR rtl-optimization/119046
* rtlanal.cc (may_trap_p_1): Don't mark FP-mode PARALLELs as trapping.
gcc/testsuite/
PR rtl-optimization/119046
* g++.target/aarch64/pr119046.C: New test.
0001-PR-rtl-optimization-119046-
> On 18 Feb 2025, at 09:48, Kyrylo Tkachov wrote:
>
>
>
>> On 18 Feb 2025, at 09:41, Richard Sandiford
>> wrote:
>>
>> Kyrylo Tkachov writes:
>>> Hi Soumya
>>>
>>>> On 18 Feb 2025, at 09:12, Soumya AR wrote:
>>>
> On 18 Feb 2025, at 09:41, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>> Hi Soumya
>>
>>> On 18 Feb 2025, at 09:12, Soumya AR wrote:
>>>
>>> generic_armv8_a.h defines generic_armv8_a_prefetch_tune but still uses
>>> generi
Hi Soumya
> On 18 Feb 2025, at 09:12, Soumya AR wrote:
>
> generic_armv8_a.h defines generic_armv8_a_prefetch_tune but still uses
> generic_prefetch_tune in generic_armv8_a_tunings.
>
> This patch updates the pointer to generic_armv8_a_prefetch_tune.
>
> This patch was bootstrapped and regtest
Hi Spencer,
> On 17 Feb 2025, at 20:07, Spencer Abson wrote:
>
> Add a fold at gimple_fold_builtin to prefer the highpart variant of a builtin
> if the arguments are better suited to it. This helps us avoid copying data
> between lanes before operation.
>
> E.g. We prefer to use UMULL2 rather t
> On 7 Feb 2025, at 01:04, Andrew Pinski wrote:
>
> With release checking we get an uninitialization warning
> inside aarch64_split_move because of jump threading for the case of
> `npieces==0`
> but `npieces` is never 0 (but there is no way the compiler can know that).
> So this fixes the iss
Hi Richard,
> On 5 Feb 2025, at 09:57, Richard Sandiford wrote:
>
> gcc.target/aarch64/sve/acle/general/ldff1_8.c and
> gcc.target/aarch64/sve/ptest_1.c were failing because the
> aarch64 port was giving a zero (unknown) cost to instructions
> that compute two results in parallel. This was late
> On 22 Jan 2025, at 13:53, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>> Hi Richard,
>>
>>> On 22 Jan 2025, at 13:21, Richard Sandiford
>>> wrote:
>>>
>>> GCC 15 is the first release to support FP8 intrinsics.
>>>
Hi Richard,
> On 22 Jan 2025, at 13:21, Richard Sandiford wrote:
>
> GCC 15 is the first release to support FP8 intrinsics.
> The underlying instructions depend on the value of a new register,
> FPMR. Unlike FPCR, FPMR is a normal call-clobbered/caller-save
> register rather than a global regis
> On 20 Jan 2025, at 19:43, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Tamar Christina
>> Sent: Friday, January 17, 2025 5:07 PM
>> To: Kyrylo Tkachov ; Richard Sandiford
>>
>> Cc: GCC Patches ; nd ; Richard
>> Earnsh
> On 17 Jan 2025, at 15:01, Richard Sandiford wrote:
>
> Tamar Christina writes:
>>> -----Original Message-----
>>> From: Richard Sandiford
>>> Sent: Friday, January 10, 2025 4:50 PM
>>> To: Akram Ahmad
>>> Cc: ktkac...@nvidia.com; gcc-patches@gcc.gnu.org
>>> Subject: Re: [PATCH v3 1/2] aar
> On 17 Jan 2025, at 14:47, Richard Sandiford wrote:
>
> Tamar Christina writes:
>>> -----Original Message-----
>>> From: Kyrylo Tkachov
>>> Sent: Friday, January 17, 2025 1:22 PM
>>> To: Tamar Christina
>>> Cc: GCC Patches ; nd
> On 17 Jan 2025, at 14:06, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Kyrylo Tkachov
>> Sent: Friday, January 17, 2025 1:04 PM
>> To: Tamar Christina
>> Cc: GCC Patches ; nd ; Richard
>> Earnshaw ; ktkac...@gcc.gnu.org; Ri
> On 17 Jan 2025, at 13:56, Tamar Christina wrote:
>
> Hi All,
>
> Following the deprecation of ILP32, *-elf builds now fail due to -Werror on the
> deprecation warning. This is because on embedded builds ILP32 is part of the
> default multilib.
>
> This patch removed it from the default targ
> On 13 Jan 2025, at 18:51, Richard Sandiford wrote:
>
> Iain Sandoe writes:
>> Hi Folks,
>>
>>> On 10 Jan 2025, at 18:30, Wilco Dijkstra wrote:
>>>
>>> Hi Andrew,
>>>
Personally I would like this deprecated even for bare-metal. Yes the
iwatch ABI is an ILP32 ABI but I don't see
Hi Iain,
> On 11 Jan 2025, at 14:21, Iain Sandoe wrote:
>
> Hi,
>
> I originally made this patch for the Darwin Arm64 development branch,
> however in discussions on IRC, it seems that it is also relevant to
> Linux - since there are implementations running on Apple hardware with
> the M1..3 CP
> On 10 Jan 2025, at 15:54, Wilco Dijkstra wrote:
>
> ping
>
>
> Add AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS and
> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
> to the baseline tuning since all modern cores use it. Fix the neoverse512tvb
> tuning to be
> like Neoverse V1/V2.
For neovers
> On 10 Jan 2025, at 15:30, Richard Sandiford wrote:
>
> Wilco Dijkstra writes:
>> As a minor cleanup remove Cortex-A57 FMA steering pass. Since Cortex-A57 is
>> pretty old, there isn't any benefit of keeping this.
>>
>> Passes regress & bootstrap, OK for commit?
>>
>> gcc:
>> * config.gcc
Hi Wilco,
> On 10 Jan 2025, at 15:05, Wilco Dijkstra wrote:
>
>
> ILP32 was originally intended to make porting to AArch64 easier. Support was
> never merged in the Linux kernel or GLIBC, so it has been unsupported for many
> years. There isn't a benefit in keeping unsupported features foreve
> On 10 Jan 2025, at 11:22, Richard Sandiford wrote:
>
> writes:
>> This patch adds a warning when FMV is used for AArch64.
>>
>> The reasoning for this is the ACLE [1] spec for FMV has diverged
>> significantly from the current implementation and we want to prevent
>> potential future compat
> On 10 Jan 2025, at 00:07, Tamar Christina wrote:
>
> Hi All,
>
> The Parts Num field for the MIDR for Cortex-X4 is wrong. It's currently the
> parts number for a Cortex-A720 (which does have the right number).
>
> The correct number can be found in the Cortex-X4 Technical Reference Manual
Hi Akram
> On 8 Jan 2025, at 16:23, Akram Ahmad wrote:
>
> Hi Kyrill,
>
> Thanks for the feedback on V2. I found a pattern which works for
> the open-coded signed arithmetic, and I've implemented the other
> feedback you provided as well.
>
> I've sent the modified patch in this thread as the
Hi Alfie,
> On 9 Jan 2025, at 10:58, alfie.richa...@arm.com wrote:
>
> This patch adds a warning whenever FMV is used for AArch64.
>
> The reasoning for this is the ACLE [1] spec for FMV has diverged
> significantly from the current implementation and we want to prevent
> future compatibility is
Ping.
Thanks,
Kyrill
> On 13 Dec 2024, at 16:47, Kyrylo Tkachov wrote:
>
> Ping.
> Thanks,
> Kyrill
>
>> On 28 Nov 2024, at 11:22, Kyrylo Tkachov wrote:
>>
>> Ping.
>>
>>> On 15 Nov 2024, at 17:04, Kyrylo Tkachov wrote:
>>>
>>
Hi Akram,
> On 14 Nov 2024, at 16:53, Akram Ahmad wrote:
>
> This renames the existing {s,u}q{add,sub} instructions to use the
> standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and
> IFN_SAT_SUB.
>
> The NEON intrinsics for saturating arithmetic and their corresponding
> builtins
Ping.
Thanks,
Kyrill
> On 28 Nov 2024, at 11:22, Kyrylo Tkachov wrote:
>
> Ping.
>
>> On 15 Nov 2024, at 17:04, Kyrylo Tkachov wrote:
>>
>> Hi all,
>>
>> This is a patch submission following-up from the RFC at:
>> https://gcc.gnu.org/piperma
Thanks for doing this Tamar,
> On 11 Dec 2024, at 10:54, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Richard Sandiford
>> Sent: Wednesday, December 11, 2024 9:50 AM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; ktkac...@gcc.gnu.org
>> Sub
> On 3 Dec 2024, at 11:32, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Kyrylo Tkachov
>> Sent: Tuesday, December 3, 2024 10:19 AM
>> To: Tamar Christina
>> Cc: GCC Patches ; nd ; Richard
>> Earnshaw ; Marcus Shawcroft
>
> On 4 Dec 2024, at 19:02, Richard Sandiford wrote:
>
> The arm_neon.h intrinsic definitions use a bitmask of flags to
> indicate what side-effects the intrinsic might have. However,
> their names are a bit confusing:
>
> - FLAG_AUTO_FP was originally suggested as a way of saying
> "automati
next week if there are no objections.
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/aarch64-option-extensions.def (sve-b16b16,
f32mm, f64mm, sve2p1, sme-f64f64, sme-i16i64, sme-b16b16,
sme-f16f16, mops): Update FEATURE_STRING field.
0001-aarc
> On 3 Dec 2024, at 11:41, Claudio Bantaloukas
> wrote:
>
>
>
> On 12/3/2024 10:24 AM, Kyrylo Tkachov wrote:
>> Hi Claudio,
>>> On 2 Dec 2024, at 19:14, Claudio Bantaloukas
>>> wrote:
>>>
>>>
>>> The previous version o
Hi Claudio,
> On 2 Dec 2024, at 19:14, Claudio Bantaloukas
> wrote:
>
>
> The previous version of the patch was based on the mistaken assumption that
> features in /proc/cpuinfo had matching names to the feature names that gcc and
> gas accept.
> This patch enables the fp8 feature when the f8c
Hi Tamar,
Something I noticed when looking at the various tuning files….
> On 26 Jul 2024, at 11:20, Tamar Christina wrote:
>
>
> Hi All,
>
> This adds a cost model and core definition for Neoverse V3.
>
> It also makes Cortex-X4
Hi Akram,
> On 2 Dec 2024, at 15:54, Akram Ahmad wrote:
>
> GIMPLE code which performs a narrowing truncation on the result of a
> vector concatenation currently results in an unnecessary XTN being
> emitted following a UZP1 to concatenate the operands. In cases such as this,
> UZP1 should instead u
> On 29 Nov 2024, at 14:16, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>>> On 27 Nov 2024, at 09:34, Richard Sandiford
>>> wrote:
>>>
>>> Soumya AR writes:
>>>> NBSL, BSL1N, and BSL2N are bit-select instructions on SVE
> On 29 Nov 2024, at 14:49, Yury Khrustalev wrote:
>
> Including the "arm_acle.h" header in aarch64-unwind.h requires
> stdint.h to be present and it may not be available during the
> first stage of cross-compilation of GCC.
>
> When cross-building GCC for the aarch64-none-linux-gnu target
>
> On 29 Nov 2024, at 14:25, Yury Khrustalev wrote:
>
> Hi Kyrill,
>
> On Fri, Nov 29, 2024 at 02:06:17PM +, Kyrylo Tkachov wrote:
>> Hi Yury,
>>
>>> On 29 Nov 2024, at 13:57, Yury Khrustalev wrote:
>>>
>>> Inclusion of "arm_ac
> On 29 Nov 2024, at 13:00, Richard Sandiford wrote:
>
> Thanks for the update!
>
> Claudio Bantaloukas writes:
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 2a4f016e2df..f7440113570 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -21957,6 +21957,18
Hi Yury,
> On 29 Nov 2024, at 13:57, Yury Khrustalev wrote:
>
> Inclusion of "arm_acle.h" would require stdint.h that may
> not be available during first stage of cross-compilation.
Do you mean when trying to build a big-endian cross-compiler or something?
The change seems harmless to me but t
> On 29 Nov 2024, at 13:04, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>> Hi Richard
>>> On 6 Nov 2024, at 18:16, Richard Sandiford
>>> wrote:
>>>
>>> This series adds support for FEAT_SVE2p1 (-march=...+sve2p1).
>>>
g that one? I've noted the documentation comment you
> mentioned :)
Ah, I did review the latest one, but I had clicked reply on the wrong one in
the thread.
I’ve ok’ed that explicitly separately.
Kyrill
>
> Thanks,
> Tamar
>
>> -----Original Message-----
>> From: Kyrylo
> On 21 Nov 2024, at 10:13, Tamar Christina wrote:
>
>>> I tried writing automated testcases for these, however the testsuite doesn't
>>> want to scan the output of -### and it makes the excess error tests always
>>> fail
>>> unless you use dg-error, which also looks for "error:". So tested ma
Hi Tamar,
> On 15 Nov 2024, at 14:24, Tamar Christina wrote:
>
> Hi All,
>
> This patch makes it so that when you use any of the Cortex-A53 errata
> workarounds but have specified an -march or -mcpu we know is not affected by
> it
> that we suppress the errata workaround.
>
> This is a driver
Ping.
> On 15 Nov 2024, at 17:04, Kyrylo Tkachov wrote:
>
> Hi all,
>
> This is a patch submission following-up from the RFC at:
> https://gcc.gnu.org/pipermail/gcc/2024-November/245076.html
> The patch is rebased and retested against current trunk, some debugging cod
> On 27 Nov 2024, at 09:34, Richard Sandiford wrote:
>
> Soumya AR writes:
>> NBSL, BSL1N, and BSL2N are bit-select instructions on SVE2 with certain
>> operands
>> inverted. These can be extended to work with Neon modes.
>>
>> Since these instructions are unpredicated, duplicate patterns wer
Hi Richard
> On 6 Nov 2024, at 18:16, Richard Sandiford wrote:
>
> This series adds support for FEAT_SVE2p1 (-march=...+sve2p1).
> One thing that the extension does is make some SME and SME2 instructions
> available outside of streaming mode. It also adds quite a few new
> instructions. Some o
Ok for mainline?
Thanks,
Kyrill
Signed-off-by: Prachi Godbole
Co-authored-by: Kyrylo Tkachov
config/ChangeLog:
* bootstrap-lto-locality.mk: New file.
gcc/ChangeLog:
* Makefile.in (OBJS): Add ipa-locality-cloning.o
(GTFILES): Add ipa-localit
> On 14 Nov 2024, at 18:40, Wilco Dijkstra wrote:
>
>
> Clean up the extra tune defines by introducing AARCH64_EXTRA_TUNE_BASE as a
> common base supported by all modern cores. Initially set it to
> AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND. No change in generated code.
>
> Passes regress & boo
> On 15 Nov 2024, at 12:33, Wilco Dijkstra wrote:
>
> Hi Kyrill,
>
>> This would make USE_NEW_VECTOR_COSTS effectively the default.
>> Jennifer has been trying to do that as well and then to remove it (as it
>> would be always true) but there are some codegen regressions that still
>> need
Hi Wilco,
> On 14 Nov 2024, at 18:44, Wilco Dijkstra wrote:
>
>
> Add AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS and
> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
> to the baseline tuning since all modern cores use it. Fix the neoverse512tvb
> tuning to be
> like Neoverse V1/V2.
>
This would
> On 12 Nov 2024, at 18:55, Richard Sandiford wrote:
>
> Wilco Dijkstra writes:
>> Hi,
>>
> What do you think about disabling late scheduling as well?
I think this would definitely need separate consideration and evaluation
given the above.
Another thing to con
Hi Saurabh,
> On 6 Nov 2024, at 11:03, saurabh@arm.com wrote:
>
>
> The AArch64 FEAT_FP8 extension introduces instructions for conversion
> and scaling.
>
> This patch introduces the following intrinsics:
> 1. vcvt{1|2}_{bf16|high_bf16|low_bf16}_mf8_fpm.
> 2. vcvt{q}_mf8_f16_fpm.
> 3. vcvt_
Hi Victor,
> On 31 Oct 2024, at 22:40, Victor Do Nascimento
> wrote:
>
> Implement -mcpu options for:
>
> - Cortex-A520AE
> - Cortex-A720AE
> - Cortex-R82AE
>
> These all implement the same feature sets as their non-AE
> counterparts, using the same scheduler and costs and differing only i
Hi Vladimir,
Thanks for the patches!
> On 6 Nov 2024, at 08:50, vladimir.miloser...@arm.com wrote:
>
>
> The AArch64 FEAT_LUT extension is optional from Armv9.2-a and mandatory
> from Armv9.5-a. This extension introduces instructions for lookup table
> read with 2-bit indices.
>
> This patch ad
Forwarding to the correct ML...
> Begin forwarded message:
>
> From: Kyrylo Tkachov via Gcc
> Subject: [PATCH] PR target/117449: Restrict vector rotate match and split to
> pre-reload
> Date: 5 November 2024 at 17:57:40 GMT+1
> To: gcc mailing list
> Reply-To: Ky
> On 4 Nov 2024, at 16:03, Kyrylo Tkachov wrote:
>
>
>
>> On 4 Nov 2024, at 15:20, Jakub Jelinek wrote:
>>
>> On Mon, Nov 04, 2024 at 02:31:29PM +0100, Jakub Jelinek wrote:
>>> On Mon, Nov 04, 2024 at 01:07:33PM +, Kyrylo Tkachov wrote:
>>>>
> On 4 Nov 2024, at 15:20, Jakub Jelinek wrote:
>
> On Mon, Nov 04, 2024 at 02:31:29PM +0100, Jakub Jelinek wrote:
>> On Mon, Nov 04, 2024 at 01:07:33PM +, Kyrylo Tkachov wrote:
>>>> This seems to have broken bootstrap on multiple targets and is caus
> On 4 Nov 2024, at 13:55, Richard Biener wrote:
>
> On Thu, Oct 31, 2024 at 4:30 PM Jeff Law wrote:
>>
>>
>>
>> On 10/27/24 10:21 AM, Kyrylo Tkachov wrote:
>>> Hi all,
>>>
>>> simplify-rtx can transform (X << C1) | (X &
> On 31 Oct 2024, at 18:06, Richard Sandiford wrote:
>
> Wilco Dijkstra writes:
>> The early scheduler takes up ~33% of the total build time, however it doesn't
>> provide a meaningful performance gain. This is partly because modern OoO
>> cores
>> need far less scheduling, partly because th
Hi Jeff,
> On 31 Oct 2024, at 16:25, Jeff Law wrote:
>
>
>
> On 10/27/24 10:22 AM, Kyrylo Tkachov wrote:
>> Hi all,
>> Some vector rotate operations can be implemented in a single instruction
>> rather than using the fallback SHL+USRA sequence.
>> In par
> On 31 Oct 2024, at 14:23, Yury Khrustalev wrote:
>
> From: Szabolcs Nagy
>
> Builtin for chkfeat: the input argument is used to initialize x16 then
> execute chkfeat and return the updated x16.
>
> Note: ACLE __chkfeat(x) plans to flip the bits to be more intuitive
> (xor the input to outp
Hi Yury,
> On 31 Oct 2024, at 14:23, Yury Khrustalev wrote:
>
> From: Szabolcs Nagy
>
> Add new builtins for GCS:
>
> void *__builtin_aarch64_gcspr (void)
> uint64_t __builtin_aarch64_gcspopm (void)
> void *__builtin_aarch64_gcsss (void *)
>
> The builtins are always enabled, but should b
> On 31 Oct 2024, at 11:50, Richard Sandiford wrote:
>
> "Yuta Mukai (Fujitsu)" writes:
>> Hello,
>>
>> This patch adds initial support for FUJITSU-MONAKA CPU, which we are
>> developing.
>> These are the slides for the CPU:
>> https://www.fujitsu.com/downloads/SUPER/topics/isc24/next-arm-bas
> On 27 Oct 2024, at 20:42, Jeff Law wrote:
>
>
>
> On 10/24/24 12:24 AM, Kyrylo Tkachov wrote:
>>> On 24 Oct 2024, at 07:36, Jeff Law wrote:
>>>
>>>
>>>
>>> On 10/22/24 2:26 PM, Kyrylo Tkachov wrote:
>>>> Hi all,
>
Hi all,
Looks like this immediate variable was missed out when I last fixed the
namespace issues in arm_neon.h. Fixed in the obvious manner.
Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
* config/aarch64/arm_neon.h
> On 25 Oct 2024, at 15:25, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>>> On 25 Oct 2024, at 13:46, Richard Sandiford
>>> wrote:
>>>
>>> Kyrylo Tkachov writes:
>>>> Thank you for the suggestions! I’m trying them
This change is not enough to generate the
equivalent sequence in SVE, but that is something that should be tackled
separately.
Bootstrapped and tested on aarch64-none-linux-gnu.
Signed-off-by: Kyrylo Tkachov
gcc/
* simplify-rtx.cc (simplify_context::simplify_binary_operat
ensure the permute indices are not messed
up.
Bootstrapped and tested on aarch64-none-linux-gnu.
Richard had approved these changes in the previous iteration, but I’ll only push
this after the prerequisites in the series.
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* expmed.h
usra v31.4s, v0.4s, 23
mov v0.16b, v31.16b
ret
G2:
shl v31.8b, v0.8b, 3
usra v31.8b, v0.8b, 5
mov v0.8b, v31.8b
ret
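As a rough illustration of why the SHL+USRA pair in the G2 sequence above can stand in for a rotate, here is a minimal C sketch of a single 8-bit lane. It assumes the sequence is meant to rotate each lane left by 3; the function names are hypothetical and do not come from the patch. USRA is modelled as "shift right, then accumulate". Because the left shift clears the low 3 bits, the add behaves like an OR.

```c
#include <stdint.h>

/* Hypothetical helper: models one 8-bit lane of the G2 sequence.
   Assumption: the sequence implements a rotate left by 3.  */
static inline uint8_t
rotl8_by_3_shl_usra (uint8_t x)
{
  uint8_t acc = (uint8_t) (x << 3);      /* shl  v31.8b, v0.8b, 3 */
  acc = (uint8_t) (acc + (x >> 5));      /* usra v31.8b, v0.8b, 5 */
  return acc;
}

/* Reference rotate-left by 3 on an 8-bit value, for comparison.  */
static inline uint8_t
rotl8_by_3_ref (uint8_t x)
{
  return (uint8_t) ((x << 3) | (x >> 5));
}
```

The two shift amounts add up to the lane width (3 + 5 = 8), which is what makes the pair equivalent to a rotate. The same holds for the 32-bit lanes of the other sequence, where the visible right shift is 23.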
Bootstrapped and tested on aarch64-none-linux-gnu.
Signed-off-by: Kyrylo Tkachov
gcc/
-none-linux-gnu.
I’ll push this if the prerequisites are approved.
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
PR target/117048
* config/aarch64/aarch64-simd.md (*aarch64_simd_rotate_imm):
New define_insn_and_split.
gcc/testsuite/
PR target/117048
lf-tests in this
patch to validate the transformation.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for mainline?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
PR target/117048
* simplify-rtx.cc (extract_ashift_operands_p): Define.
(simplif
ed on aarch64-none-linux-gnu.
Ok for mainline?
Thanks,
Kyrill
Signed-off-by: Kyrylo Tkachov
gcc/
* config/aarch64/iterators.md (SVE_ASIMD_FULL_I): New mode iterator.
* config/aarch64/aarch64-sve2.md (@aarch64_sve2_xar):
Use SVE_ASIMD_FULL_I modes. Use ROTATE code for the r
> On 25 Oct 2024, at 13:46, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>> Thank you for the suggestions! I’m trying them out now.
>>
>>>> + if (rotamnt % BITS_PER_UNIT != 0)
>>>> +return NULL_RTX;
>>>> + machine_mo
Thank you for the suggestions! I’m trying them out now.
> On 24 Oct 2024, at 21:11, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>> Hi Richard,
>>
>>> On 23 Oct 2024, at 11:30, Richard Sandiford
>>> wrote:
>>>
>>> Kyrylo Tk
1 - 100 of 1141 matches