Richard Biener writes:
> The following avoids accessing out-of-bound vector elements when
> native encoding a boolean vector with sub-BITS_PER_UNIT precision
> elements. The error was basing the number of elements to extract
> on the rounded up total byte size involved and the patch bases
> every
Richard Biener writes:
> On Wed, 14 Feb 2024, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > The following avoids accessing out-of-bound vector elements when
>> > native encoding a boolean vector with sub-BITS_PER_UNIT precision
>> > elemen
Ajit Agarwal writes:
>>> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
>>> index 1856fa4884f..ffc47a6eaa0 100644
>>> --- a/gcc/emit-rtl.cc
>>> +++ b/gcc/emit-rtl.cc
>>> @@ -921,7 +921,7 @@ validate_subreg (machine_mode omode, machine_mode imode,
>>> return false;
>>>
>>>/* The subreg o
Ajit Agarwal writes:
>>> diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
>>> index 88ee0dd67fc..a8d0ee7c4db 100644
>>> --- a/gcc/df-problems.cc
>>> +++ b/gcc/df-problems.cc
>>> @@ -3360,7 +3360,7 @@ df_set_unused_notes_for_mw (rtx_insn *insn, struct
>>> df_mw_hardreg *mws,
>>>if (df_whol
Ajit Agarwal writes:
> On 14/02/24 10:56 pm, Richard Sandiford wrote:
>> Ajit Agarwal writes:
>>>>> diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
>>>>> index 88ee0dd67fc..a8d0ee7c4db 100644
>>>>> --- a/gcc/df-problems.cc
Ajit Agarwal writes:
> Hello Richard:
>
>
> On 14/02/24 10:45 pm, Richard Sandiford wrote:
>> Ajit Agarwal writes:
>>>>> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
>>>>> index 1856fa4884f..ffc47a6eaa0 100644
>>>>> --- a/gcc
Andrew Pinski writes:
> While working on PERM related stuff, I came across that aarch64_evpc_reencode
> was manually figuring out if we shrink the perm indices instead of
> using vec_perm_indices::new_shrunk_vector; shrunk was added after reencode
> was added.
>
> Built and tested for aarch64-linux
Andrew Pinski writes:
> The error message is not clear about which options are being talked about when it
> says the values
> need to match; plus there is a wrong quotation dealing with the diagnostic.
> So this changes the error message to be exactly talking about the param
> options that
> are being t
Richard Biener writes:
> On Wed, 14 Feb 2024, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > On Wed, 14 Feb 2024, Richard Sandiford wrote:
>> >
>> >> Richard Biener writes:
>> >> > The following avoids accessing out-of-bou
Andrew Pinski writes:
> The backend currently defines a whole vector shift left for 64bit vectors,
> adding the
> shift right can also improve code for some PERMs too. So this adds that
> pattern.
Is this reversed? It looks like we have the shift right and the patch is
adding the shift left (a
Richard Biener writes:
> On Wed, 14 Feb 2024, Richard Biener wrote:
>
>> For the testcase in PR113910 we spend a lot of time in PTA comparing
>> bitmaps for looking up equivalence class members. This points to
>> the very weak bitmap_hash function which effectively hashes set
>> and a subset of n
Andrew Pinski writes:
> The testcase gcc.target/aarch64/vect_ctz_1.c fails execution when running
> with -march=armv9-a because the testcase calls __builtin_ctz with a value of 0.
> The testcase should not depend on undefined behavior of __builtin_ctz. So this
> changes it to use the g form with th
Iain Sandoe writes:
>> On 5 Feb 2024, at 14:56, Iain Sandoe wrote:
>>
>> Tested on aarch64-linux,darwin and a cross from aarch64-darwin to linux,
>> OK for trunk, or some alternative is needed?
>
> Hmm.. apparently, this fails the linaro pre-commit CI for g++ with:
> error: invalid conversion fr
Iain Sandoe writes:
>> On 15 Feb 2024, at 18:05, Richard Sandiford
>> wrote:
>>
>> Iain Sandoe writes:
>>>> On 5 Feb 2024, at 14:56, Iain Sandoe wrote:
>>>>
>>>> Tested on aarch64-linux,darwin and a cross from aarch64-darwi
Richard Biener writes:
> The following tries to address the PHI insertion compile-time hog in
> RTL fwprop observed with the PR54052 testcase where the loop computing
> the "unfiltered" set of variables possibly needing PHI nodes for each
> block exhibits quadratic compile-time and memory-use.
>
>
Richard Biener writes:
> On Mon, 19 Feb 2024, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > The following tries to address the PHI insertion compile-time hog in
>> > RTL fwprop observed with the PR54052 testcase where the loop computing
>> > the
Richard Biener writes:
>> I suppose that's better than the first version when a block has a
>> large number of dominance frontiers. But I can't remember whether
>> that was the case in PR98863. I have a feeling that I tried the above
>> as part of the PR, since it's the obvious way of applying l
Victor Do Nascimento writes:
> [...]
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 157a0b9dfa5..45e901cda64 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -297,6 +297,26 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE =
> A
Tamar Christina writes:
> Hi, this is a new version of the patch updating some additional tests
> because some of the LTO tests required a newer binutils than my distro had.
>
> ---
>
> The Arm Architectural Reference Manual (Version J.a, section A2.9 on
> FEAT_LS64)
> shows that ls64 is an optio
Alex Coplan writes:
> On 14/02/2024 11:18, Richard Sandiford wrote:
>> Alex Coplan writes:
>> > This is a backport of the GCC 13 fix for PR111677 to the GCC 12 branch.
>> > The only part of the patch that isn't a straight cherry-pick is due to
>> > the
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Thursday, February 1, 2024 4:42 PM
>> To: Tamar Christina
>> Cc: Andrew Pinski ; gcc-patches@gcc.gnu.org; nd
>> ; Richard Earnshaw ; Marcus
>> Shawcroft ; Kyrylo
This patch makes -mtrack-speculation work on streaming-compatible
functions. There were two related issues. The first is that the
streaming-compatible code was using TB(N)Z unconditionally, whereas
those instructions are not allowed with speculation tracking.
That part can be fixed in a similar w
Iain Sandoe writes:
> Andrew Pinski pointed out on irc, that the current implementation of the
> heap trampoline code fragment would make the instruction byte order follow
> memory byte order for BE AArch64, which is not what is required.
>
> This patch revises the initializers so that instruction
In this PR, the SME mode-switching code needs to insert a stack-probe
loop for an alloca. This patch allows the target to do that.
There are two parts to it: allowing loops for insertions in blocks,
and allowing them for insertions on edges. The former can be handled
entirely within mode-switchi
This patch fixes an ICE for a combination of:
- -fstack-clash-protection
- a frame that has SVE save slots
- a frame that has no GPR save slots
- a frame that has a VG save slot
The allocation code was folding the SVE save slot allocation into
the initial frame allocation, so that we had one allo
The main purpose of the aarch64_commit_lazy_save pattern
was to defer insertion of a half-diamond until splitting,
since splitting knew how to create the associated basic blocks.
However, the fix for PR113220 means that mode-switching also
knows how to do that. This patch therefore removes the pa
ACLE guarantees that a function like:
__arm_new("zt0") foo() { ... }
will start with ZT0 equal to zero. I'd forgotten to enforce that
after committing a lazy save. After such a save, we should zero
ZA iff the function has ZA state and zero ZT0 iff the function
has ZT0 state.
Tested on aarch64
In:
void bar() __arm_inout("za");
void foo() __arm_inout("za", "zt0") { bar(); }
foo cannot tail-call bar because foo needs to restore ZT0 after
the call. I'd forgotten to update the ok_for_sibcall rules
to handle this when adding SME2.
Thanks to Sander de Smalen for the spot.
Tested on aa
The sequence to commit a lazy save includes a branch based on
whether TPIDR2_EL0 is zero. The code assumed that CBZ could
be used for this, but that instruction is forbidden when
-mtrack-speculation is being used.
Tested on aarch64-linux-gnu & pushed.
Richard
gcc/
* config/aarch64/aarc
I noticed while working on another patch that we had a duplicated
call to aarch64_process_target_attr.
Tested on aarch64-linux-gnu & pushed.
Richard
gcc/
* config/aarch64/aarch64.cc (aarch64_option_valid_attribute_p):
Remove duplicated call.
---
gcc/config/aarch64/aarch64.cc |
Ajit Agarwal writes:
> Hello Alex/Richard:
>
> I have placed target independent and target dependent code in
> aarch64-ldp-fusion for load store fusion.
>
> Common infrastructure of load store pair fusion is divided into
> target independent and target dependent code.
>
> Target independent code is
early-ra already had code to do regrename-style "broadening"
of the allocation, to promote scheduling freedom. However,
the pass divides the function into allocation regions
and this broadening only worked within a single region.
This meant that if a basic block contained one subblock
of FPR use,
Most code in early-ra used is_chain_candidate to check whether we
should chain two allocnos. This included both tests that matter
for correctness and tests for certain heuristics.
Once that test passes for one pair of allocnos, we test whether
it's safe to chain the containing groups (which might
"Richard Earnshaw (lists)" writes:
> On 21/02/2024 18:30, Evgeny Karpov wrote:
>>
> +/* X18 reserved for the TEB on Windows. */
> +#ifdef TARGET_ARM64_MS_ABI
> +# define FIXED_X18 1
> +# define CALL_USED_X18 0
> +#else
> +# define FIXED_X18 0
> +# define CALL_USED_X18 1
> +#endif
>
> I'm not ove
Evgeny Karpov writes:
> From 55fd2a63afa9abb3543d714b6f5925efd2682e08 Mon Sep 17 00:00:00 2001
> From: Zac Walker
> Date: Wed, 21 Feb 2024 12:20:46 +0100
> Subject: [PATCH v1 04/13] aarch64: Add aarch64-w64-mingw32 COFF
>
> Define ASM specific for COFF format on AArch64.
>
> gcc/ChangeLog:
>
>
Evgeny Karpov writes:
> From 1ea6efa6f88d131884ecef21c4b5d2ecbab14ea7 Mon Sep 17 00:00:00 2001
> From: Zac Walker
> Date: Tue, 20 Feb 2024 18:06:36 +0100
> Subject: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments for
> AArch64
>
> Define Cygwin and MinGW environment such as types, SE
Evgeny Karpov writes:
> The calling ABI enum definition has been done following a similar convention
> in
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/i386-opts.h;h=ef2825803b32001b9632769bdff196d1e43d27ba;hb=refs/heads/master#l41
>
> MS_ABI is used in gcc/config/i386/mingw32.h
"Richard Earnshaw (lists)" writes:
> On 21/02/2024 17:47, Evgeny Karpov wrote:
>> Hello,
>>
>> We would like to draw your attention to the review of changes for the
>> new GCC target, aarch64-w64-mingw32. The new target will be
>> supported, tested, added to CI, and maintained by Linaro. This mar
During its forward pass, the SLP layout code tries to calculate
the cost of a layout change on an incoming edge. This is taken
as the minimum of two costs: one in which the source partition
keeps its current layout (chosen earlier during the pass) and
one in which the source partition switches to
I keep forgetting that gcc.dg/rtl is the one testsuite where
tests in target-specific subdirectories aren't automatically
restricted to that target.
Pushed as obvious after testing on aarch64-linux-gnu & x86_64-linux-gnu.
Richard
gcc/testsuite/
* gcc.dg/rtl/aarch64/pr113295-1.c: Restric
Just a note that, following discussion on IRC, I'll pull this for
GCC 14 and resubmit for GCC 15.
There was also pushback on IRC about making the pass opt-in.
Enabling it for x86_64 would mean fixing RPAD to use a representation
that is more robust against recombination, but as you can imagine, it
Alex Coplan writes:
> As discussed on IRC, this makes the aarch64 ldp/stp pass off by default. This
> should stabilize the trunk and give some time to address the P1 regressions.
>
> Sorry for the breakage.
>
> Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk?
>
> Alex
>
> gcc/ChangeLog:
Wilco Dijkstra writes:
> GCC tends to optimistically create CONST of globals with an immediate offset.
> However it is almost always better to CSE addresses of globals and add
> immediate
> offsets separately (the offset could be merged later in single-use cases).
> Splitting CONST expressions wi
Wilco Dijkstra writes:
> Hi Richard,
>
>>> +#define MAX_SET_SIZE(speed) (speed ? 256 : 96)
>>
>> Since this isn't (AFAIK) a standard macro, there doesn't seem to be
>> any need to put it in the header file. It could just go at the head
>> of aarch64.cc instead.
>
> Sure, I've moved it in v4.
>
>>
Alex Coplan writes:
> This is a v3 which addresses shortcomings of the v2 patch. v2 was
> posted here:
> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642448.html
>
> The main issue in v2 is that we were using the final (transformed)
> patterns in combine_reg_notes instead of the initial
ype`, and `.size` pseudo-ops for
> `aarch64-w64-mingw32` target
> Cc: Andrew Pinski ,
> Richard Sandiford ,
> Jonathan Yong <10wa...@gmail.com>
>
> Recent change
> (https://gcc.gnu.org/pipermail/gcc-cvs/2023-December/394915.html) added a
> generic SME support us
/aarch64/ccmp_3.c: New test.
* gcc.target/aarch64/ccmp_4.c: New test.
* gcc.target/aarch64/ccmp_5.c: New test.
Signed-off-by: Andrew Pinski
Co-authored-by: Richard Sandiford
---
gcc/ccmp.cc | 12 +--
gcc/cfgexpand.cc | 31 ++-
The PR shows that we were registering the same overloaded SVE
builtins twice. This was supposed to be prevented by
function_builder::add_overloaded_function, which uses a map
to detect whether a function of the same name has already been
registered. add_overloaded_function then had some asserts t
As explained in the covering note to the previous patch,
the fact that aarch64-sve-* is now used for multiple header
files means that function_builder::add_overloaded_function
now needs to use a global map to detect duplicated overload
functions, instead of the member variable that it used previous
g:f26f92b534f9 implemented unsigned extensions using ZIPs rather than
UXTL{,2}, since the former has a higher throughput than the latter on
many cores. The optimisation worked by lowering directly to ZIP during
expand, so that the zero input could be hoisted and shared.
However, changing to ZIP m
Ping for the expr/cfgexpand bits
Richard Sandiford writes:
> Andrew Pinski writes:
>> Ccmp is not used if the result of the and/ior is used by both
>> a GIMPLE_COND and a GIMPLE_ASSIGN. This improves the code generation
>> here by using ccmp in this case.
>> Two ch
Alex Coplan writes:
> The next patch in this series exposes an interface for creating new uses
> in RTL-SSA. The intent is that new user-created uses can consume new
> user-created defs in the same change group. This is so that we can
> correctly update uses of memory when inserting a new store
Alex Coplan writes:
> This exposes an interface for users to create new uses in RTL-SSA.
> This is needed for updating uses after inserting a new store pair insn
> in the aarch64 load/store pair fusion pass.
>
> gcc/ChangeLog:
>
> PR target/113070
> * rtl-ssa/accesses.cc (function_info
Alex Coplan writes:
> In r14-5820-ga49befbd2c783e751dc2110b544fe540eb7e33eb I added support to
> RTL-SSA for inserting new insns, which included support for users
> creating new defs.
>
> However, I missed that apply_changes_to_insn needed updating to ensure
> that the new defs actually got insert
Alex Coplan writes:
> As the PR shows (specifically #c7) we are missing updating uses of mem
> when inserting an stp in the aarch64 load/store pair fusion pass. This
> patch fixes that.
>
> RTL-SSA has a simple view of memory and by default doesn't allow stores
> to be re-ordered w.r.t. other sto
Alex Coplan writes:
> Hi,
>
> For the testcase in the PR, we try to pair insns where the first has
> writeback and the second uses the updated base register. This causes us
> to record a hazard against the second insn, thus narrowing the move
> range away from the end of the BB.
>
> However, it i
Alex Coplan writes:
> This patch adds some accessors to set_info and use_info to make it
> easier to get at and iterate through uses in debug insns.
>
> It is used by the aarch64 load/store pair fusion pass in a subsequent
> patch to fix PR113089, i.e. to update debug uses in the pass.
>
> Bootstr
Alex Coplan writes:
> While working on PR113089, I realised we were missing code to re-parent
> trailing nondebug uses of the base register in the case of cancelling
> writeback in the load/store pair pass. This patch fixes that.
>
> Bootstrapped/regtested as a series on aarch64-linux-gnu (with/
Sorry for the earlier review comment about debug insns. I hadn't
looked far enough into the queue to see this patch.
Alex Coplan writes:
> As the PR shows, we were missing code to update debug uses in the
> load/store pair fusion pass. This patch fixes that.
>
> Note that this patch depends on
Alex Coplan writes:
> Hi,
>
> The PR shows two different cases where try_promote_writeback produces an
> RTL pattern which isn't recognized. Currently this leads to an ICE, as
> we assert recog success, but I think it's better just to back out of the
> changes gracefully if recog fails (as we do
In the original fix for this PR, I'd made sure that
including didn't reach the final return in
simulate_builtin_function_decl (which would indicate duplicate
function definitions). But it seems I forgot to do the same
thing for C++, which defines all of its overloads directly.
This patch fixes a
Alex Coplan writes:
> On 22/01/2024 13:49, Richard Sandiford wrote:
>> Alex Coplan writes:
>> > In r14-5820-ga49befbd2c783e751dc2110b544fe540eb7e33eb I added support to
>> > RTL-SSA for inserting new insns, which included support for users
>> > creating new de
Alex Coplan writes:
>> > + writeback_pats[i] = orig_rtl[i];
>> > +
>> > + // Now that we've characterized the defs involved, go through the
>> > + // debug uses and determine how to update them (if needed).
>> > + for (auto use : set->debug_insn_uses ())
>> > + {
>> > +if
Alexandre Oliva writes:
> Calling arm_neon.h functions that take lanes as arguments may fail to
> report malformed values if the intrinsic happens to be optimized away,
> e.g. because it is pure or const and the result is unused.
>
> Adding __AARCH64_LANE_CHECK calls to the always_inline functions
Radek Barton writes:
> Hello Richard.
>
> Thank you for your suggestion. I am sending a patch update according to it.
>
>> How about avoiding the clash by using the names HIDDEN, SYMBOL_TYPE and
>> SYMBOL_SIZE, with SYMBOL_TYPE taking the symbol type as argument?
>
> Yes, unless the symbol is expl
Matthias Kretz writes:
> On Sunday, 10 December 2023 14:29:45 CET Richard Sandiford wrote:
>> Thanks for the patch and sorry for the slow review.
>
> Sorry for my slow reaction. I needed a long vacation. For now I'll focus on
> the design question wrt. multi-arch compi
Tamar Christina writes:
> Hi All,
>
> The AArch64 vector PCS does not allow simd calls with simdlen 1,
> however due to a bug we currently do allow it for num == 0.
>
> This causes us to emit a symbol that doesn't exist and we fail to link.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and
Tamar Christina writes:
> Hi All,
>
> As suggested in the ticket this replaces the expansion by converting the
> Advanced SIMD types to SVE types by simply printing out an SVE register for
> these instructions.
>
> This fixes the subreg issues since there are no subregs involved anymore.
>
> Boots
Richard Biener writes:
> On Mon, 15 Jan 2024, Robin Dapp wrote:
>
>> I gave it another shot now by introducing a separate function as
>> Richard suggested. It's probably not at the location he intended.
>>
>> The way I read the discussion there hasn't been any consensus
>> on how (or rather wher
Tejas Belagod writes:
> The target hook aarch64_class_max_nregs returns the incorrect result for
> 64-bit
> structure modes like V31DImode or V41DFmode etc. The calculation of the nregs
> is based on the size of AdvSIMD vector register for 64-bit modes which ought
> to
> be UNITS_PER_VREG / 2.
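
A worked sketch of the register-count arithmetic described in the snippet above. It assumes a 16-byte AdvSIMD vector register (so UNITS_PER_VREG is 16) and a structure mode made of three 64-bit vectors (24 bytes in total). The helper names `ceil_div` and `nregs_for` are hypothetical and are not the GCC implementation; the point is only that dividing by the full register size undercounts:

```cpp
#include <cassert>

// Assumption for this sketch: a full AdvSIMD vector register holds 16 bytes.
constexpr unsigned UNITS_PER_VREG = 16;

// Round-up integer division.
constexpr unsigned ceil_div(unsigned a, unsigned b) { return (a + b - 1) / b; }

// Hypothetical helper: number of registers needed for a structure mode of
// MODE_SIZE bytes when each member vector occupies BYTES_PER_REG bytes.
constexpr unsigned nregs_for(unsigned mode_size, unsigned bytes_per_reg) {
  return ceil_div(mode_size, bytes_per_reg);
}

// Three 64-bit vectors occupy 24 bytes.
constexpr unsigned kModeSize = 3 * 8;

// Dividing by the full 128-bit register size gives 2, which is too few.
static_assert(nregs_for(kModeSize, UNITS_PER_VREG) == 2, "undercount");

// Dividing by UNITS_PER_VREG / 2 gives one register per 64-bit vector.
static_assert(nregs_for(kModeSize, UNITS_PER_VREG / 2) == 3, "expected count");
```

Each 64-bit member vector still occupies its own register, which is why the divisor has to be the 8-byte size rather than the 16-byte one.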
Manos Anagnostakis writes:
> The current ldp/stp policy framework implementation was missing cases, where
> the memory operands were reversed. Therefore the call to the framework
> function
> is moved after the lower mem check with the suitable parameters. Also removes
> the mode of aarch64_opera
Thanks for doing this. I'm not qualified to review the patch properly,
but was just curious...
Andi Kleen writes:
> This patch implements a clang compatible [[musttail]] attribute for
> returns.
>
> musttail is useful as an alternative to computed goto for interpreters.
> With computed goto the
Szabolcs Nagy writes:
> Recent commit introduced a conditional branch in eh_return epilogues
> that is not compatible with speculation tracking:
>
> commit 426fddcbdad6746fe70e031f707fb07f55dfb405
> Author: Szabolcs Nagy
> CommitDate: 2023-11-27 15:52:48 +
>
> aarch64: Use br inst
Andrew Pinski writes:
> The problem here is the builtin apply mechanism thinks the FP registers
> are to be used due to get_raw_arg_mode not returning VOIDmode. This
> fixes that oversight and the backend now returns VOIDmode for non-general-regs
> if TARGET_GENERAL_REGS_ONLY is true.
>
> Built an
Andrew Pinski writes:
> On aarch64, vectorization of `long` multiply can be done if SVE is enabled
> or if long is 32bit (ILP32). It can also be done for constants too but there
> is no effective target test for that just yet.
>
> Built and tested on aarch64-linux-gnu with no regressions (also tes
g:74e3e839ab2d36841320 handled the UXTL{,2}-ZIP[12] optimisation
in split1. The UXTL input is a 64-bit vector of N-bit elements
and the result is a 128-bit vector of 2N-bit elements. The
corresponding ZIP1 operates on 128-bit vectors of N-bit elements.
This meant that the ZIP1 input had to be a
Andrew Pinski writes:
> The split for movv8di is not ready to handle the case where the setting
> register overlaps with the address of the memory that is being loaded.
> Fixing the split than just making the output constraint as an early clobber
> for this alternative. The split would first need to
When generalising vector_cst_all_same, I'd forgotten to update
VECTOR_CST_ENCODED_ELT to VECTOR_CST_ELT. The check deliberately
looks at implicitly encoded elements in some cases.
Tested on aarch64-linux-gnu & pushed.
Richard
gcc/
PR target/113572
* config/aarch64/aarch64-sve-b
Szabolcs Nagy writes:
> Recent commit introduced a conditional branch in eh_return epilogues
> that is not compatible with speculation tracking:
>
> commit 426fddcbdad6746fe70e031f707fb07f55dfb405
> Author: Szabolcs Nagy
> CommitDate: 2023-11-27 15:52:48 +
>
> aarch64: Use br inst
Andrew Carlotti writes:
> It would be neater if the middle end for target_clones used a target
> hook for version name mangling, so we only do version name mangling
> once. However, that would require more intrusive refactoring that will
> have to wait till Stage 1.
>
>
> This patch builds upon t
Victor Do Nascimento writes:
> The introduction of further architectural-feature dependent ifuncs
> for AArch64 makes hard-coding ifunc `_i' suffixes to functions
> cumbersome to work with. It is awkward to remember which ifunc maps
> onto which arch feature and makes the code harder to maintain
Victor Do Nascimento writes:
> With support for new atomic features in Armv9.4-a being indicated by
> HWCAP2 bits, Libatomic's ifunc resolver must now query its second
> argument, of type __ifunc_arg_t*.
>
> We therefore make this argument known to libatomic, allowing us to
> query hwcap2 bits in
Victor Do Nascimento writes:
> The armv9.4-a architectural revision adds three new atomic operations
> associated with the LSE128 feature:
>
> * LDCLRP - Atomic AND NOT (bitclear) of a location with 128-bit
> value held in a pair of registers, with original data loaded into
> the same 2 regi
Victor Do Nascimento writes:
> At present, Evaluation of both `has_lse2(hwcap)' and
> `has_lse128(hwcap)' may require issuing an `mrs' instruction to query
> a system register. This instruction, when issued from user-space
> results in a trap by the kernel which then returns the value read in
> b
Andre Vieira writes:
> This patch adds support for C23's _BitInt for the AArch64 port when compiling
> for little endianness. Big Endianness requires further target-agnostic
> support and we therefore disable it for now.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.cc (TARGET_C_BITINT_TYP
Victor Do Nascimento writes:
> @@ -712,6 +760,27 @@ ENTRY (libat_test_and_set_16)
> END (libat_test_and_set_16)
>
>
> +/* Alias all LSE128_LRCPC3 ifuncs to their specific implementations,
> + that is, map it to LSE128, LRCPC or CORE as appropriate. */
> +
> +ALIAS (libat_exchange_16, LSE12
This was another PR caused by the way that
vect_determine_precisions_from_range handles shifts. We tried to
narrow 32768 >> x to a 16-bit shift based on range information for
the inputs and outputs, with vect_recog_over_widening_pattern
(after PR110828) adjusting the shift amount. But this doesn't
Prathamesh Kulkarni writes:
> Hi,
> The test passes -mlittle-endian option but doesn't have target check
> for aarch64_little_endian and thus fails to compile on
> aarch64_be-linux-gnu. The patch adds the missing aarch64_little_endian
> target check, which makes it unsupported on the target.
> OK
Alex Coplan writes:
> Hi,
>
> The fix for PR113089 introduced range-based for loops over the
> debug_insn_uses of an RTL-SSA set_info, but in the case that we reset a
> debug insn, the use would get removed from the use list, and thus we
> would end up using an invalidated iterator in the next ite
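
The hazard described in the snippet above (removing an element from a list while a range-based for loop is walking it) can be illustrated with a minimal sketch. This uses `std::list<int>` as a hypothetical stand-in for a use list; it is not the RTL-SSA data structure, and saving the successor before removal is one common way to avoid the problem, not necessarily what the patch does:

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical stand-in for "reset a debug insn": every even value is
// removed from the list while the list is being traversed.
// Returns the number of elements removed.
inline int remove_even_uses(std::list<int> &uses) {
  int removed = 0;
  for (auto it = uses.begin(); it != uses.end();) {
    // Capture the successor first: erasing *it invalidates only `it`,
    // so `next` remains valid for std::list.
    auto next = std::next(it);
    if (*it % 2 == 0) {
      uses.erase(it);
      ++removed;
    }
    it = next;
  }
  return removed;
}
```

A range-based for loop would instead increment the iterator that was just erased, which is undefined behaviour.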
Prathamesh Kulkarni writes:
> On Sat, 27 Jan 2024 at 21:19, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> > Hi,
>> > The test passes -mlittle-endian option but doesn't have target check
>> > for aarch64_little_endian and thus
Tamar Christina writes:
> Hi All,
>
> Recently something in the midend had started inverting the branches by
> inverting
> the condition and the branches.
>
> While this is fine, it makes it hard to actually test. In RTL I disable
> scheduling and BB reordering to prevent this. But in GIMPLE th
Alexandre Oliva writes:
> On Jan 23, 2024, Richard Sandiford wrote:
>
>> Performing the check in expand is itself wrong
>
> *nod*
>
>> So I think we should enforce the immediate range within the frontend
>> instead, via TARGET_CHECK_BUILTIN_CALL.
>
>
In this PR, we entered early-ra with quite a bit of dead code.
The code was duly removed (to avoid wasting registers), but there
was a dangling reference in debug instructions, which caused an
ICE later.
Fixed by resetting a debug instruction if it references a register
that is no longer needed by
For something like:
void
foo (void)
{
int *ptr;
asm volatile ("%0" : "=w" (ptr));
asm volatile ("%0" :: "m" (*ptr));
}
early-ra would allocate ptr to an FPR for the first asm, thus
leaving an FPR address in the second asm. The address was then
reloaded by LRA to make it valid.
But early-r
Richard Biener writes:
> On Mon, Jan 29, 2024 at 5:00 PM Richard Sandiford
> wrote:
>>
>> Tamar Christina writes:
>> > Hi All,
>> >
>> > Recently something in the midend had started inverting the branches by
>> > inverting
>> > t
Alex Coplan writes:
> Hi,
>
> The PR shows us ICEing due to an unrecognizable TFmode save emitted by
> aarch64_process_components. The problem is that for T{I,F,D}mode we
> conservatively require mems to be in range for x-register ldp/stp. That
> is because (at least for TImode) it can be alloca
Alex Coplan writes:
> Bootstrapped/regtested on aarch64-linux-gnu, OK for the 13 branch after
> a week of the trunk fix being in? OK for the other active branches if
> the same changes test cleanly there?
>
> GCC 14 patch for reference:
> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644
Robin Dapp writes:
> @@ -1758,16 +1759,19 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64
> bitsize, poly_uint64 bitnum,
>if (VECTOR_MODE_P (outermode) && !MEM_P (op0))
> {
>scalar_mode innermode = GET_MODE_INNER (outermode);
>enum insn_code icode
> = convert_optab
Richard Ball writes:
> ACLE has added intrinsics to bridge between SVE and Neon.
>
> The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and
> SVE vectors.
>
> This patch adds support to GCC for the following 3 intrinsics:
> svset_neonq, svget_neonq and svdup_neonq
>
> gcc/Chan
Alex Coplan writes:
> On 12/12/2023 15:58, Richard Sandiford wrote:
>> Alex Coplan writes:
>> > Hi,
>> >
>> > This is a v2 version which addresses feedback from Richard's review
>> > here:
>> >
>> > https://gcc.gnu.org/piperma