Jonathan Wright writes:
> Hi,
>
> As subject, this patch splits the aarch64_qshrn_n
> pattern into separate scalar and vector variants. It further splits the vector
> pattern into big/little endian variants that model the zero-high-half
> semantics of the underlying instruction - allowing for more
Jonathan Wright writes:
> Hi,
>
> As subject, this patch adds tests to confirm that a *2 (write to high-half)
> Neon instruction is generated from vcombine* of a narrowing intrinsic
> sequence.
>
> Ok for master?
OK, thanks.
Richard
> Thanks,
> Jonathan
>
> ---
>
> gcc/testsuite/ChangeLog:
>
>
Jonathan Wright writes:
> Hi,
>
> The existing vec_pack_trunc RTL pattern emits an opaque two-
> instruction assembly code sequence that prevents proper instruction
> scheduling. This commit changes the pattern to an expander that emits
> individual xtn and xtn2 instructions.
>
> This commit also
Jonathan Wright writes:
> Hi,
>
> As subject, this patch corrects the type attribute in RTL patterns that
> generate XTN/XTN2 instructions to be "neon_move_narrow_q".
>
> This makes a material difference because these instructions can be
> executed on both SIMD pipes in the Cortex-A57 core model,
Kyrylo Tkachov writes:
> Hi all,
>
> Besides the builtins in aarch64-simd-builtins.def there are a number of
> builtins defined in aarch64-builtins.c itself.
> They could also benefit from the attributes generated by
> aarch64_get_attributes.
> However aarch64_get_attributes and its helpers are
"Andre Vieira (lists)" writes:
> Hi,
>
> When vectorizing with --param vect-partial-vector-usage=1 the vectorizer
> uses an unpredicated (all-true predicate for SVE) main loop and a
> predicated tail loop. The way this was implemented seems to mean it
> re-uses the same vector-mode for both loo
Richard Biener writes:
> On Mon, May 17, 2021 at 3:18 PM Joern Wolfgang Rennecke
> wrote:
>>
>> Attached is the updated version of the patch.
>> Bootstrapped and regtested on x86_64-pc-linux-gnu.
>>
>> OK to apply?
>
> + machine_mode m = mode_for_size ((prec + 1) / 2, MODE_INT, 1).require
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Monday, May 10, 2021 5:49 PM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov
>> Subject: Re: [PATCH 2/4]AArch64: Add support for sign
Tamar Christina writes:
> Hi All,
>
> The current RTL for the vectorizer patterns for dot-product are incorrect.
> Operand3 isn't an output parameter so we can't write to it.
>
> This fixes this issue and reduces the number of RTL.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues
Wilco Dijkstra writes:
> @@ -23746,6 +23767,24 @@ aarch_macro_fusion_pair_p (rtx_insn *prev, rtx_insn
> *curr)
> }
> }
>
> + /* Always treat GOT accesses as a pair to ensure they can be easily
> + identified and optimized in linkers. */
Sorry, I guess I'd not given enough wei
Christophe Lyon writes:
> This patch adds support for the reduc_plus_scal optab with MVE, which
> maps to the vaddv instruction.
>
> It moves the reduc_plus_scal_ expander from neon.md to
> vec-common.md and adds support for MVE to it.
>
> Since vaddv uses a 32-bits accumulator, we have to truncat
Jakub Jelinek writes:
> On Thu, May 27, 2021 at 01:07:09PM +0800, Hongtao Liu via Gcc-patches wrote:
>> + /* Flag used for call_insn indicates it's a fake call. */
>> + RTX_FLAG (insn, used) = 1;
>
>> + /* CALL_INSN use "used" flag to indicate it's a fake call. */
>> + if (i == STACK
Thanks for doing this.
Martin Sebor via Gcc-patches writes:
> […]
> On 5/24/21 5:08 PM, David Malcolm wrote:
>> On Mon, 2021-05-24 at 16:02 -0600, Martin Sebor wrote:
>>> Subsequent patches then replace invocations of the TREE_NO_WARNING()
>>> macro and the gimple_no_warning_p() and gimple_set_no
Joern Rennecke writes:
> Bootstrapped on x86_64-pc-linux-gnu.
>
> 2020-12-10 Joern Rennecke
>
> Fix bug in the define_subst handling that made match_scratch unusable for
> multi-alternative patterns.
OK, and sorry for the slow response.
The changelog won't pass, but I'll leave you to
Joern Rennecke writes:
> At the moment, for a match_dup in a define_cond_exec, you'd have to
> give the number in the
> resulting pattern(s) rather than in the substitute pattern. That's
> not only wrong, but can also
> be impossible when the pattern should apply to multiple patterns with
> diffe
Sorry for the slow response.
"Kewen.Lin" writes:
> diff --git a/gcc/vec-perm-indices.c b/gcc/vec-perm-indices.c
> index ede590dc5c9..57dd11d723c 100644
> --- a/gcc/vec-perm-indices.c
> +++ b/gcc/vec-perm-indices.c
> @@ -101,6 +101,70 @@ vec_perm_indices::new_expanded_vector (const
> vec_perm_indi
"H.J. Lu via Gcc-patches" writes:
> On Mon, May 31, 2021 at 06:32:04AM -0700, H.J. Lu wrote:
>> On Mon, May 31, 2021 at 6:26 AM Richard Biener
>> wrote:
>> >
>> > On Mon, May 31, 2021 at 3:12 PM H.J. Lu wrote:
>> > >
>> > > On Mon, May 31, 2021 at 5:46 AM Richard Biener
>> > > wrote:
>> > > >
>
Kewen Lin writes:
> Hi all,
>
> define_insn_and_split should avoid to use empty split condition
> if the condition for define_insn isn't empty, otherwise it can
> sometimes result in unexpected consequence, since the split
> will always be done even if the insn condition doesn't hold.
>
> To avoid
"Kewen.Lin" writes:
> Hi Richard,
>
> on 2021/6/2 4:11, Richard Sandiford wrote:
>> Kewen Lin writes:
>>> Hi all,
>>>
>>> define_insn_and_split should avoid to use empty split condition
>>> if the condition for define_insn isn't empty, otherwise it can
>>> sometimes result in unexpected con
Segher Boessenkool writes:
> Since times immemorial there has been const_int_rtx for all values from
> -64 to 64, but only constm1_rtx..const2_rtx have been available for
> convenient use. Change this, so that we can use all values in
> {-64,...,64} in RTL easily. This matters, because then we w
"Kewen.Lin via Gcc-patches" writes:
> Hi,
>
> As PR100794 shows, in the current implementation PRE bypasses
> some optimization to avoid introducing loop carried dependence
> which stops loop vectorizer to vectorize the loop. At -O2,
> there is no downstream pass to re-catch this kind of opportun
Richard Biener writes:
> On Wed, Jun 2, 2021 at 12:01 PM Kewen.Lin wrote:
>>
>> on 2021/6/2 5:13 PM, Richard Sandiford wrote:
>> > "Kewen.Lin" writes:
>> >> Hi Richard,
>> >>
>> >> on 2021/6/2 4:11, Richard Sandiford wrote:
>> >>> Kewen Lin writes:
>> Hi all,
>>
>> define_in
Christophe Lyon writes:
> This patch adds support for auto-vectorization of average value
> computation using vhadd or vrhadd, for both MVE and Neon.
>
> The patch adds the needed [u]avg3_[floor|ceil] patterns to
> vec-common.md, I'm not sure how to factorize them without introducing
> an unspec i
Christophe Lyon writes:
> This patch adds support for auto-vectorization of absolute value
> computation using vabs.
>
> We use a similar pattern to what is used in neon.md and extend the
> existing neg2 expander to match both 'neg' and 'abs'. This
> implies renaming the existing abs2 define_insn
"Kewen.Lin" writes:
> Hi Richi/Richard/Jeff/Segher,
>
> Thanks for the comments!
>
> on 2021/6/3 7:52, Segher Boessenkool wrote:
>> On Wed, Jun 02, 2021 at 06:32:13PM +0100, Richard Sandiford wrote:
>>> Richard Biener writes:
So what Richard suggests would be to disallow split conditio
Segher Boessenkool writes:
> On Thu, Jun 03, 2021 at 09:05:02AM +0100, Richard Sandiford via Gcc-patches
> wrote:
>> Right. Plus it creates less make-work. If we didn't have it, someone
>> would need to split the define_insn_and_splits that don't currently
>>
Segher Boessenkool writes:
> On Wed, Jun 02, 2021 at 06:07:28PM +0100, Richard Sandiford wrote:
>> Segher Boessenkool writes:
>> > Since times immemorial there has been const_int_rtx for all values from
>> > -64 to 64, but only constm1_rtx..const2_rtx have been available for
>> > convenient use.
"H.J. Lu" writes:
> Update vec_duplicate to allow to fail so that backend can only allow
> broadcasting an integer constant to a vector when broadcast instruction
> is available.
I'm not sure why we need this to fail though. Once the optab is defined
for target X, the optab should handle all dup
Sorry for the slow response.
Tamar Christina writes:
> […]
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index
> 441d6cd28c4eaded7abd756164890dbcffd2f3b8..82123b96313e6783ea214b9259805d65c07d8858
> 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
>
Bill Schmidt via Gcc-patches writes:
> On 5/20/21 5:24 PM, Segher Boessenkool wrote:
>> On Tue, May 11, 2021 at 11:01:22AM -0500, Bill Schmidt wrote:
>>> Hi! I'd like to ping this specific patch from the series, which is the
>>> only one remaining that affects common code. I confess that I don't
Christophe Lyon writes:
> On Wed, 2 Jun 2021 at 20:19, Richard Sandiford
> wrote:
>>
>> Christophe Lyon writes:
>> > This patch adds support for auto-vectorization of average value
>> > computation using vhadd or vrhadd, for both MVE and Neon.
>> >
>> > The patch adds the needed [u]avg3_[floor|c
Christophe Lyon writes:
> This patch adds support for auto-vectorization of clz for MVE.
>
> It does so by removing the unspec from mve_vclzq_ and uses
> 'clz' instead. It moves to neon_vclz expander from neon.md to
> vec-common.md and renames it into the standard name clz2.
>
> 2021-06-03 Christ
Christophe Lyon writes:
> This patch adds vec_unpack_hi_, vec_unpack_lo_,
> vec_pack_trunc_ patterns for MVE.
>
> It does so by moving the unpack patterns from neon.md to
> vec-common.md, while adding them support for MVE. The pack expander is
> derived from the Neon one (which in turn is renamed
Segher Boessenkool writes:
> On Tue, Jun 08, 2021 at 02:48:11PM +0200, Richard Biener wrote:
>> > So yeah, patch withdrawn. This on one hand is proof we do want to make
>> > such a change, but on the other hand shows it needs more preparatory
>> > steps.
>>
>> I wonder if it makes sense to provi
Segher Boessenkool writes:
> On Tue, Jun 08, 2021 at 04:50:56PM +0100, Richard Sandiford wrote:
>> Segher Boessenkool writes:
>> > On Tue, Jun 08, 2021 at 02:48:11PM +0200, Richard Biener wrote:
>> >> > So yeah, patch withdrawn. This on one hand is proof we do want to make
>> >> > such a change,
Christophe Lyon writes:
> The problem in this PR is that we call VPSEL with a mask of vector
> type instead of HImode. This happens because operand 3 in vcond_mask
> is the pre-computed vector comparison and has vector type. The fix is
> to transfer this value to VPR.P0 by comparing operand 3 with
Christophe Lyon writes:
> This patch adds support for auto-vectorization of average value
> computation using vhadd or vrhadd, for both MVE and Neon.
>
> The patch adds the needed [u]avg3_[floor|ceil] patterns to
> vec-common.md, I'm not sure how to factorize them without introducing
> an unspec i
Christophe Lyon writes:
> On Tue, 8 Jun 2021 at 13:58, Richard Sandiford
> wrote:
>>
>> Christophe Lyon writes:
>> > This patch adds support for auto-vectorization of clz for MVE.
>> >
>> > It does so by removing the unspec from mve_vclzq_ and uses
>> > 'clz' instead. It moves to neon_vclz expan
Christophe Lyon writes:
> On Tue, 8 Jun 2021 at 14:10, Richard Sandiford
> wrote:
>>
>> Christophe Lyon writes:
>> > This patch adds vec_unpack_hi_, vec_unpack_lo_,
>> > vec_pack_trunc_ patterns for MVE.
>> >
>> > It does so by moving the unpack patterns from neon.md to
>> > vec-common.md, while
Martin Sebor via Gcc-patches writes:
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 7b37e1b602c..7cdc824730c 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -13242,13 +13242,8 @@ bounds_check (rtx operand, HOST_WIDE_INT low,
> HOST_WIDE_INT high,
>lan
Christophe Lyon writes:
> In the meantime, I tried to make some progress, and looked at how
> things work on aarch64.
>
> This led me to define (in mve.md):
>
> (define_insn "@mve_vec_pack_trunc_lo_"
> [(set (match_operand: 0 "register_operand" "=w")
>(truncate: (match_operand:MVE_5 1 "re
Christophe Lyon writes:
> Thanks for the feedback. How about v2 attached?
> Do you want me to merge neon_vec_unpack and
> mve_vec_unpack and only have different assembly?
> if (TARGET_HAVE_MVE)
> return "vmovlb.%# %q0, %q1";
> else
> return "vmovlb.%# %q0, %q1";
I think it'd be better to kee
Iain Sandoe writes:
> Hi,
>
> So let’s ignore the questions for now - OK for the non-Darwin parts of the
> patch ?
Looks OK to me.
Thanks,
Richard
>
>> On 24 Sep 2021, at 17:57, Iain Sandoe wrote:
>>
>
>> as noted below the non-Darwin parts of this are trivial (and a no-OP).
>> I’d like to ap
Richard Biener via Gcc-patches writes:
> On Thu, 30 Sep 2021, Andre Vieira (lists) wrote:
>
>> Hi,
>>
>>
>> >> That just forces trying the vector modes we've tried before. Though I
>> >> might
>> >> need to revisit this now I think about it. I'm afraid it might be possible
>> >> for
>> >> this
Richard Biener via Gcc-patches writes:
> On Mon, 4 Oct 2021, Qing Zhao wrote:
>
>>
>>
>> > On Oct 4, 2021, at 12:19 PM, Richard Biener wrote:
>> >
>> > On October 4, 2021 7:00:10 PM GMT+02:00, Qing Zhao
>> > wrote:
>> >> I have several questions on this fix:
>> >>
>> >> 1. This fix avoided
In g:62acc72a957b5614 I'd stopped the unroller from using
an epilogue loop in cases where the iteration count was
known to be a multiple of the unroll factor. The epilogue
and non-epilogue cases still shared this (preexisting) code
to update the edge frequencies:
basic_block exit_bb = single_pr
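The distinction drawn above between the epilogue and non-epilogue cases can be pictured with a minimal source-level sketch in C. This is not the unroller's own code, which works on GCC's internal loop representation; the function names and the unroll factor of 4 are assumptions chosen purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical unroll factor, chosen only for illustration.  */
#define UNROLL 4

/* General case: the main loop handles whole groups of UNROLL elements
   and an epilogue loop handles whatever is left over.  */
long
sum_with_epilogue (const int *a, size_t n)
{
  long sum = 0;
  size_t i = 0;
  for (; i + UNROLL <= n; i += UNROLL)
    sum += (long) a[i] + a[i + 1] + a[i + 2] + a[i + 3];
  for (; i < n; i++)   /* Epilogue for the remaining n % UNROLL items.  */
    sum += a[i];
  return sum;
}

/* Case where N is known to be a multiple of UNROLL: the epilogue would
   never execute, so it is omitted entirely.  */
long
sum_no_epilogue (const int *a, size_t n)
{
  assert (n % UNROLL == 0);
  long sum = 0;
  for (size_t i = 0; i < n; i += UNROLL)
    sum += (long) a[i] + a[i + 1] + a[i + 2] + a[i + 3];
  return sum;
}
```

Both functions compute the same sum whenever the element count is a multiple of the unroll factor; only the control-flow shape differs, which is why the two cases may need different bookkeeping.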
"Roger Sayle" writes:
> Hi Richard,
>
> All excellent suggestions. The revised patch below implements all of
> your (and Andreas') recommendations. I'm happy to restrict GCC's support
> for saturating arithmetic to integer types, even though I do know of one
> target (nvptx) that supports satura
(Nice optimisations!)
Kyrylo Tkachov writes:
> Hi Tamar,
>
>> -Original Message-
>> From: Tamar Christina
>> Sent: Wednesday, September 29, 2021 5:19 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: nd ; Richard Earnshaw ;
>> Marcus Shawcroft ; Kyrylo Tkachov
>> ; Richard Sandiford
>>
>> Subjec
Thanks for looking at this.
Prathamesh Kulkarni writes:
> Hi,
> As mentioned in PR, for the following test-case:
>
> typedef unsigned char uint8_t;
>
> static inline uint8_t
> x264_clip_uint8(uint8_t x)
> {
> uint8_t t = -x;
> uint8_t t1 = x & ~63;
> return (t1 != 0) ? t : x;
> }
>
> void
>
Catching up on backlog, sorry for the very late response:
Tamar Christina writes:
> Hi All,
>
> Consider the following case
>
> #include <arm_neon.h>
>
> uint64_t
> test4 (uint8x16_t input)
> {
> uint8x16_t bool_input = vshrq_n_u8(input, 7);
> poly64x2_t mask = vdupq_n_p64(0x0102040810204080UL);
>
Tamar Christina writes:
> Hi All,
>
> This is a respin of an older patch which never got upstream reviewed by a
> maintainer. It's been updated to fit the current GCC codegen.
>
> This patch adds a pattern to support the (F)ADDP (scalar) instruction.
>
> Before the patch, the C code
>
> typedef f
Sorry for the very long delay in reviewing this. Things have been
a bit hectic recently.
Christophe Lyon via Gcc-patches writes:
> VPR_REG is the only register in its class, so it should be handled by
> TARGET_CLASS_LIKELY_SPILLED_P. No test fails without this patch, but
> it seems it should be
Christophe Lyon via Gcc-patches writes:
> The vmvnq_n* intrinsics and have [u]int[16|32]_t arguments, so use
> iterator instead of HI in mve_vmvnq_n_.
>
> 2021-09-03 Christophe Lyon
>
> gcc/
> * config/arm/mve.md (mve_vmvnq_n_): Use V_elem mode
> for operand 1.
>
> diff --git
Christophe Lyon via Gcc-patches writes:
> This patch implements support for vectors of booleans to support MVE
> predicates, instead of HImode. Since the ABI mandates pred16_t (aka
> uint16_t) to represent predicates in intrinsics prototypes, we
> introduce a new "predicate" type qualifier so tha
Christophe Lyon via Gcc-patches writes:
> We make use of qualifier_predicate to describe MVE builtins
> prototypes, restricting to auto-vectorizable vcmp* and vpsel builtins,
> as they are exercised by the tests added earlier in the series.
>
> Special handling is needed for mve_vpselq because it
Christophe Lyon via Gcc-patches writes:
> From: Christophe Lyon
>
> The problem in this PR is that we call VPSEL with a mask of vector
> type instead of HImode. This happens because operand 3 in vcond_mask
> is the pre-computed vector comparison and has vector type.
>
> This patch fixes it by imp
Christophe Lyon via Gcc-patches writes:
> This is mostly a mechanical change, only tested by the intrinsics
> expansion tests.
>
> 2021-09-02 Christophe Lyon
>
> gcc/
> PR target/100757
> PR target/101325
> * config/arm/arm-builtins.c (BINOP_UNONE_NONE_NONE_QUALIFIERS):
Christophe Lyon via Gcc-patches writes:
> This patch covers a few builtins where we do not use the
> iterator and thus we cannot use .
>
> However this introduces a problem for the v2di instructions, because
> there is not predicate for this case. For instance, changing
> STRSBS_P_QUALIFIERS bre
Christophe Lyon via Gcc-patches writes:
> This patch covers a few non-load/store builtins where we do not use
> the iterator and thus we cannot use .
>
> We need to update the expected code in cde-mve-full-assembly.c because
> we now use mve_movv16qi instead of movhi to generate the vmsr
> instru
Richard Sandiford via Gcc-patches writes:
> Christophe Lyon via Gcc-patches writes:
>> This patch covers a few builtins where we do not use the
>> iterator and thus we cannot use .
>>
>> However this introduces a problem for the v2di instructions, because
>> the
Prathamesh Kulkarni writes:
> On Fri, 8 Oct 2021 at 21:19, Richard Sandiford
> wrote:
>>
>> Thanks for looking at this.
>>
>> Prathamesh Kulkarni writes:
>> > Hi,
>> > As mentioned in PR, for the following test-case:
>> >
>> > typedef unsigned char uint8_t;
>> >
>> > static inline uint8_t
>> > x
Diane Meirowitz via Gcc-patches writes:
> Please review my patch. It is tiny. Thank you.
Thanks for the patch and sorry for the very slow response.
I've now pushed this to master and all active branches.
Thanks,
Richard
> Diane
>
> On 9/15/21, 5:02 PM, "Diane Meirowitz" wrote:
>
>
> d
Tamar Christina writes:
> Hi,
>
> Sending a new version of the patch because I noticed the pattern was
> overriding the nor pattern.
>
> A second pattern is needed to capture the nor case as combine will match the
> longest sequence first. So without this pattern we end up de-optimizing nor
> an
Tamar Christina writes:
>> > Note: This patch series is working incrementally towards generating the
>> most
>> > efficient code for this and other loops in small steps.
>>
>> It looks like this could be done in the vectoriser via an extension of the
>> scalar_cond_masked_set mechanism. We
Prathamesh Kulkarni writes:
> On Mon, 11 Oct 2021 at 20:42, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> > On Fri, 8 Oct 2021 at 21:19, Richard Sandiford
>> > wrote:
>> >>
>> >> Thanks for looking at this.
>> >>
>> >> Prathamesh Kulkarni writes:
>> >> > Hi,
>> >> > As mentio
Richard Biener via Gcc-patches writes:
> The following makes sure to rewrite arithmetic with undefined behavior
> on overflow to a well-defined variant when moving them to be always
> executed as part of doing if-conversion for loop vectorization.
>
> Bootstrapped and tested on x86_64-unknown-linu
Richard Biener writes:
> On Wed, 13 Oct 2021, Richard Sandiford wrote:
>
>> Richard Biener via Gcc-patches writes:
>> > The following makes sure to rewrite arithmetic with undefined behavior
>> > on overflow to a well-defined variant when moving them to be always
>> > executed as part of doing if
Aldy Hernandez via Gcc-patches writes:
> Using debug() on an auto_bitmap from gdb doesn't work because the
> implicit conversion from auto_bitmap to bitmap_head doesn't work
> from within a debugging session. This patch adds the convenience
> functions for auto_bitmap.
>
> OK?
OK, thanks.
Richa
Hi Robin,
Thanks for the update and sorry for the late response.
Robin Dapp writes:
> Hi Richard,
>
>> Don't we still need this code (without the REG_DEAD handling) for the
>> case in which…
>>
>>> + /* As we are transforming
>>> +if (x > y)
>>> + {
>>> +a = b;
>>> +
Sorry for the slow review.
Robin Dapp via Gcc-patches writes:
> Hi,
>
> while evaluating another patch that introduces more lvalue paradoxical
> subregs I ran into an ICE in combine at
>
> wide_int o = wi::insert (rtx_mode_t (outer, temp_mode),
> rtx_mode
Sorry for the slow reply.
Tamar Christina writes:
> Hi All,
>
> The following example:
>
> void f11(double * restrict z, double * restrict w, double * restrict x,
>double * restrict y, int n)
> {
> for (int i = 0; i < n; i++) {
> z[i] = (w[i] > 0) ? w[i] : y[i];
> }
> }
>
Tamar Christina writes:
> Hi All,
>
> The following loop does a conditional reduction using an add:
>
> #include <stdint.h>
>
> int32_t f (int32_t *restrict array, int len, int min)
> {
> int32_t iSum = 0;
>
> for (int i=0; i<len; i++) {
> if (array[i] >= min)
>iSum += array[i];
> }
> return iSum;
> }
Ping
Richard Sandiford via Gcc-patches writes:
> This patch adds a lang hook for defining a struct/RECORD_TYPE
> “as if” it had appeared directly in the source code. It follows
> the similar existing hook for enums.
>
> It's the caller's responsibility to create the
The arm implementation of add_stmt_cost was added alongside
arm_builtin_vectorization_cost. At that time it was necessary
to override the latter when overriding the former, since
default_add_stmt_cost didn't indirect through the
builtin_vectorization_cost target hook:
int stmt_cost = defaul
The aarch64 version of add_stmt_cost has a redundant test
of flag_vect_cost_model. The current structure was based
on the contemporaneous definition of default_add_stmt_cost,
but g:d6d1127249564146429009e0682f25bd58d7a791 later removed
the flag_vect_cost_model test from the default version.
Teste
Jonathan Wright via Gcc-patches writes:
> The pointer parameter to load a vector of signed values should itself
> be a signed type. This patch fixes two instances of this unsigned-
> signed implicit conversion in arm_neon.h.
>
> Tested relevant intrinsics with -Wpointer-sign and warnings no longer
rs6000_density_test has an early exit test between a call
to get_loop_body and the corresponding free. This would
lead to a memory leak if the early exit is taken.
Tested on powerpc64le-linux-gnu. It's obvious that moving the
test avoids the leak, but there are multiple ways to write it,
so: OK
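The shape of the leak described above, and the fix of moving the early-exit test, can be sketched in plain C. This is only an illustration under stated assumptions: `get_body`, `free_body`, `density_leaky`, `density_fixed` and the `live_allocs` counter are hypothetical stand-ins, not the rs6000 code or GCC's `get_loop_body`.

```c
#include <stdlib.h>

/* Count of live allocations, so that a leak can be observed.  */
int live_allocs;

/* Hypothetical stand-in for a call that returns allocated memory.  */
int *
get_body (size_t n)
{
  int *p = calloc (n, sizeof *p);
  if (p)
    live_allocs++;
  return p;
}

/* Hypothetical stand-in for the matching free.  */
void
free_body (int *p)
{
  if (p)
    live_allocs--;
  free (p);
}

/* Leaky shape: the early exit sits between the allocation and the free.  */
int
density_leaky (size_t n, int bail_out)
{
  int *body = get_body (n);
  if (bail_out)
    return 0;           /* BODY is never freed on this path.  */
  int result = body ? (int) n : 0;
  free_body (body);
  return result;
}

/* One possible fix: test for the early exit before allocating.  */
int
density_fixed (size_t n, int bail_out)
{
  if (bail_out)
    return 0;
  int *body = get_body (n);
  int result = body ? (int) n : 0;
  free_body (body);
  return result;
}
```

Freeing on the early-exit path would also avoid the leak; moving the test simply avoids the allocation altogether.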
The current vector cost interface has a quite a bit of redundancy
built in. Each target that defines its own hooks has to replicate
the basic unsigned[3] management. Currently each target also
duplicates the cost adjustment for inner loops.
This patch instead defines a vector_costs class for hol
Robin Dapp writes:
> Hi Richard,
>
>> (2) Insert:
>>
>> if (SUBREG_P (src))
>>src = SUBREG_REG (src);
>>
>> here.
>>
>> OK with those changes if that works. Let me know if they don't —
>> I'll try to be quicker with the next review.
>
> thank you, this looks good in a first testsu
Tamar Christina writes:
> Hi All,
>
> This lowers shifts to GIMPLE when the C interpretations of the shift
> operations
> matches that of AArch64.
>
> In C shifting right by BITSIZE is undefined, but the behavior is defined in
> AArch64. Additionally negative shifts lefts are undefined in C but
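The C rule mentioned above can be made concrete with a small sketch. In C, shifting a 32-bit operand right by 32 is undefined, so code that wants a defined result for that shift amount has to guard it explicitly. The helper name `lsr32_defined` is hypothetical, and returning zero for the out-of-range case is an assumption of this sketch; the snippet above does not spell out what the AArch64 instructions produce.

```c
#include <stdint.h>

/* Logical right shift of a 32-bit value that stays well defined when
   N equals or exceeds the bit size.  A plain "x >> 32" on a 32-bit
   operand is undefined behaviour in C.  Returning 0 for that case is
   an assumption made for this sketch.  */
uint32_t
lsr32_defined (uint32_t x, unsigned int n)
{
  return n >= 32 ? 0 : x >> n;
}
```

For example, `lsr32_defined (0x80000000u, 31)` yields 1, while a shift amount of 32 takes the guarded path instead of invoking undefined behaviour.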
Christophe Lyon via Gcc-patches writes:
> These tests are derived from the one provided in the PR: there is a
> compile-only test because I did not have access to anything that could
> execute MVE code until recently.
> I have been able to add an executable test since QEMU supports MVE.
>
> Instea
Christophe Lyon via Gcc-patches writes:
> @@ -31086,36 +31087,20 @@ arm_expand_vector_compare (rtx target, rtx_code
> code, rtx op0, rtx op1,
> case NE:
>if (TARGET_HAVE_MVE)
> {
> - rtx vpr_p0;
> - if (vcond_mve)
> - vpr_p0 = target;
> - else
> -
Christophe Lyon via Gcc-patches writes:
> This patch covers a few builtins where we do not use the
> iterator and thus we cannot use .
>
> For v2di instructions, we use the V8BI mode for predicates.
Why V8BI though, when VPRED uses HI?
Would it make sense to define a V2BI? Or doesn't that work
Christophe Lyon via Gcc-patches writes:
> This is v2 of this patch series, addressing the comments I received.
> The changes v1 -> v2 are:
>
> - Patch 3: added an executable test, and updated
> check_effective_target_arm_mve_hw
> - Patch 4: split into patch 4 and patch 14 (to keep numbering the
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Friday, October 15, 2021 1:26 PM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov
>> Subject: Re: [PATCH]AArch64 Lower intrinsics shift
Christophe LYON writes:
> On 15/10/2021 17:08, Richard Sandiford wrote:
>> Christophe Lyon via Gcc-patches writes:
>>> This patch covers a few builtins where we do not use the
>>> iterator and thus we cannot use .
>>>
>>> For v2di instructions, we use the V8BI mode for predicates.
>> Why V8BI th
Prathamesh Kulkarni writes:
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c
> b/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c
> index 4604365fbef..cedc5b7c549 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/c
Michael Matz via Gcc-patches writes:
> Hello,
>
> On Thu, 14 Oct 2021, Richard Biener wrote:
>
>> > So, at _this_ write-through of the email I think I like the above idea
>> > best: make ao_ref be a tree (at least its storage, because it currently
>> > is a one-member-function class), make ao_re
Wilco Dijkstra writes:
> Enable the fast shift feature in Neoverse V1 and N2 tunings as well.
>
> ChangeLog:
> 2021-10-18 Wilco Dijkstra
>
> * config/aarch64/aarch64.c (neoversev1_tunings):
> Enable AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND.
> (neoversen2_tunings): Likewise.
Wilco Dijkstra writes:
> Tune the case-values-threshold setting for modern cores. A value of 11
> improves
> SPECINT2017 by 0.2% and reduces codesize by 0.04%. With -Os use value 8 which
> reduces codesize by 0.07%.
>
> Passes regress, OK for commit?
>
> ChangeLog:
>
> 2021-10-18 Wilco Dijkstr
Jason Merrill writes:
> On 9/24/21 13:53, Richard Sandiford wrote:
>> This patch adds a lang hook for defining a struct/RECORD_TYPE
>> “as if” it had appeared directly in the source code. It follows
>> the similar existing hook for enums.
>>
>> It's the caller's responsibility to create the fiel
Prathamesh Kulkarni writes:
> Hi,
> The attached patch removes "-mcpu=generic+sve" from dg-options,
> because it conflicts
> with -march=armv8.3-a+sve, and resulted in:
>
> cc1: warning: switch '-mcpu=generic+sve' conflicts with
> '-march=armv8.3-a+sve' switch^M
> FAIL: gcc.target/aarch64/sve/pr93
Martin Liška writes:
> Hello.
>
> The patch does the same as g:df592811f950301ed3b10a08e476dad0f2eff26a for
> aarch64.
>
> Tested locally with cross compiler.
>
> Ready for master?
>
> Thanks,
> Martin
>
> PR target/102375
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.c (aarch64_pro
Martin Liška writes:
> On 10/19/21 12:52, Richard Sandiford wrote:
>> It looks like this ought to happen after the alloca and copy, since it
>> modifies the string.
>
> Oh yeah, good point.
>
> Ready to be installed with the change?
> Thanks,
> Martin
>
> From 68df4cba3bccb714a14e3c795e6d9e4a44c54
Wilco Dijkstra writes:
> Hi Richard,
>
>> I'm just concerned that here we're using the same explanation but with
>> different numbers. Why are the new numbers more right than the old ones
>> (especially when it comes to code size, where the trade-off hasn't
>> really changed)?
>
> Like all tuning
Prathamesh Kulkarni writes:
> Hi,
> The attached patch emits a more verbose diagnostic for target attribute that
> is an architecture extension needing a leading '+'.
>
> For the following test,
> void calculate(void) __attribute__ ((__target__ ("sve")));
>
> With patch, the compiler now emits:
>
Robin Dapp via Gcc-patches writes:
> Hi,
>
> I have been playing around with making Kewen's partial vector changes
> workable with s390:
>
> We have a vll instruction that can be passed the highest byte to load.
> The rather unfortunate consequence of this is that a length of zero
> cannot be s
Prathamesh Kulkarni writes:
> On Tue, 19 Oct 2021 at 19:58, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> > Hi,
>> > The attached patch emits a more verbose diagnostic for target attribute
>> > that
>> > is an architecture extension needing a leading '+'.
>> >
>> > For the fol
Wilco Dijkstra writes:
> Hi Richard,
>
>> The problem is that you're effectively asking for these values to be
>> taken on faith without providing any analysis and without describing
>> how you arrived at the new numbers. Did you try other values too?
>> If so, how did they compare with the numbe
"Andre Vieira (lists)" writes:
> On 15/10/2021 09:48, Richard Biener wrote:
>> On Tue, 12 Oct 2021, Andre Vieira (lists) wrote:
>>
>>> Hi Richi,
>>>
>>> I think this is what you meant, I now hide all the unrolling cost
>>> calculations
>>> in the existing target hooks for costs. I did need to adj