Tamar Christina writes:
> Hi All,
>
> When determining issue rates we currently discount non-constant MLA
> accumulators for Advanced SIMD but don't do it for the latency.
>
> This means the costs for Advanced SIMD with a constant accumulator are wrong
> and results in us costing SVE and Adv
Tamar Christina writes:
> Hi All,
>
> boolean comparisons have different cost depending on the mode. e.g.
> a && b when predicated doesn't require an addition instruction, the AND is
> free
Nit (for the commit msg): additional
Maybe:
for SVE, a && b doesn't require an additional instruction
Tamar Christina writes:
>> Tamar Christina writes:
>> > Hi All,
>> >
>> > When determining issue rates we currently discount non-constant MLA
>> > accumulators for Advanced SIMD but don't do it for the latency.
>> >
>> > This means the costs for Advanced SIMD with a constant accumulator are
>> >
Richard Biener writes:
> On Tue, 1 Aug 2023, Richard Sandiford wrote:
>
>> Richard Sandiford writes:
>> > Richard Biener via Gcc-patches writes:
>> >> The following makes sure to limit the shift operand when vectorizing
>> >> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift
>>
Jakub Jelinek writes:
> Hi!
>
> As the following self-test testcase shows, wi::shifted_mask sometimes
> doesn't create canonicalized wide_ints, which then fail to compare equal
> to canonicalized wide_ints with the same value.
> In particular, wi::mask (128, false, 128) gives { -1 } with len 1 and
Richard Biener via Gcc-patches writes:
> The following makes sure to not use the original TBAA type for
> looking up a value across an aggregate copy when we had to offset
> the read.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk.
>
> 2022-06-30 Richard Biener
>
>
Richard Biener writes:
> On Fri, 1 Jul 2022, Richard Sandiford wrote:
>
>> Richard Biener via Gcc-patches writes:
>> > The following makes sure to not use the original TBAA type for
>> > looking up a value across an aggregate copy when we had to offset
>> > the read.
>> >
>> > Bootstrapped and te
"Andre Vieira (lists)" writes:
> On 29/06/2022 08:18, Richard Sandiford wrote:
>>> + break;
>>> +case AARCH64_RBIT:
>>> +case AARCH64_RBITL:
>>> +case AARCH64_RBITLL:
>>> + if (mode == SImode)
>>> + icode = CODE_FOR_aarch64_rbitsi;
>>> + else
>>> + icode = CODE_FOR_a
Richard Biener via Gcc-patches writes:
> This reverts the change as discussed.
Thanks!
> Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
>
> 2022-07-01 Richard Biener
>
> * tree-ssa-sccvn.cc (vn_reference_lookup_3): Revert
> back to using maybe_ne (off, -1).
> ---
>
Aldy Hernandez via Gcc-patches writes:
> Currently global ranges are stored in SSA_NAME_RANGE_INFO as a pair of
> wide_int-like objects along with the nonzero bits. We frequently lose
> precision when streaming out our higher resolution iranges. The plan
> was always to store the full irange bet
Xi Ruoyao via Gcc-patches writes:
> On Fri, 2022-07-01 at 12:40 +, Dimitrije Milosevic wrote:
>> Building the ASAN for the n32 MIPS ABI currently fails, due to a few reasons:
>> - defined(__mips64), which is set solely based on the architecture type
>> (32-bit/64-bit),
>> was still used in s
>> > > > -Original Message-
>> >> > > > From: Richard Sandiford
>> >> > > > Sent: Thursday, June 16, 2022 7:54 PM
>> >> > > > To: Tamar Christina
>> >> > > > Cc: gcc-patches@gcc.gnu.org; nd ; Richard
Sorry for the slow review.
Andrew Carlotti via Gcc-patches writes:
> Hi,
>
> This removes a significant number of intrinsic definitions from the arm_neon.h
> header file, and reduces the amount of code duplication. The new macros and
> data structures are intended to also facilitate moving other
Richard Biener writes:
> The final loop IV use after the loop has that not in LC SSA
> (and inserts not simplified _2 = _3 - 0 stmts). In particular
> since it splits the exit edge when there's a virtual PHI in the
> destination it breaks virtual LC SSA form (but likely also
> non-virtual).
>
> T
Tamar Christina writes:
>> > so that the multiple_p test is skipped if the structure is undefined.
>>
>> Actually, we should probably skip the constant_multiple_p test as well.
>> Keeping it would only be meaningful for little-endian.
>>
>> simplify_gen_subreg should already do the necessary chec
I know it'll seem like make-work, but could you put the combine flag
in a separate follow-on patch? Reorganising the existing flags
(very welcome!) and adding new ones seem like different things.
TBH I'm a bit suspicious of the combine flag. What fundamental
property holds true after combine tha
In g:76c3041b856cb0 I'd removed a "C ? optab_vector : optab_mixed_sign"
argument from a call to directly_supported_p, thinking that the argument
only existed because of the condition (which I was removing). But the
difference between the scalar and vector forms matters for shifts,
so we do still n
aarch64_builtin_vectorized_function handles some built-in functions
that already have equivalent internal functions. This seems to be
redundant now, since the target builtins that it chooses are mapped
to the same optab patterns as the internal functions.
Tested on aarch64-linux-gnu & pushed.
Ri
The PR is about the aarch64 port using an ACLE built-in function
to vectorise a scalar function call, even though the ECF_* flags for
the ACLE function didn't match the ECF_* flags for the scalar call.
To some extent that kind of difference is inevitable, since the
ACLE intrinsics are supposed to
Ping^2 for the configure bits.
Richard Sandiford via Gcc-patches writes:
> On aarch64, --with-arch, --with-cpu and --with-tune only have an
> effect on the driver, so “./xgcc -B./ -O3” can give significantly
> different results from “./cc1 -O3”. --with-arch did have a limited
> eff
Richard Biener via Gcc-patches writes:
> On Tue, Jul 12, 2022 at 4:38 PM Andrew Carlotti
> wrote:
>>
>> aarch64_general_gimple_fold_builtin doesn't check whether the LHS of a
>> function call is null before converting it to an assign statement. To avoid
>> returning an invalid GIMPLE statement i
This patch extends the fix for PR106253 to AArch32. As with AArch64,
we were using ACLE intrinsics to vectorise scalar built-ins, even
though the two sometimes have different ECF_* flags. (That in turn
is because the ACLE intrinsics should follow the instruction semantics
as closely as possible,
Andrew Carlotti writes:
> This lowers vcombine intrinsics to a GIMPLE vector constructor, which enables
> better optimisation during GIMPLE passes.
>
> gcc/
>
> * config/aarch64/aarch64-builtins.c
> (aarch64_general_gimple_fold_builtin): Add combine.
>
> gcc/testsuite/
>
> * gcc.
Andrew Carlotti writes:
> We already have a V1DF mode, so this makes the vector modes more consistent.
>
> Additionally, this allows us to recognise uint64x1_t and int64x1_t types given
> only the mode and type qualifiers (e.g. in aarch64_lookup_simd_builtin_type).
>
> gcc/ChangeLog:
>
> * c
Andrew Carlotti writes:
> This has been unused since 2014, so there's no reason to retain it.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-builtins.cc
> (enum aarch64_type_qualifiers): Remove qualifier_internal.
> (aarch64_init_simd_builtin_functions): Remove qualifier_interna
Andrew Carlotti writes:
> There were several similarly-named functions, which each built or looked up a
> type using a different subset of valid modes or qualifiers.
>
> This change combines these all into a single function, which can additionally
> handle const and pointer qualifiers.
I like the
Andrew Carlotti writes:
> This removes a significant number of intrinsic definitions from the arm_neon.h
> header file, and reduces the amount of code duplication. The new macros and
> data structures are intended to also facilitate moving other intrinsic
> definitions out of the header file in fu
Prathamesh Kulkarni writes:
> Hi,
> For following test case:
>
> svint32_t foo()
> {
> int32x4_t v = (int32x4_t) { 1, 2, 3, 4 };
> svint32_t v2 = svld1rq_s32 (svptrue_b8(), &v[0]);
> return v2;
> }
>
> After applying workaround in forwprop to not simplify VEC_PERM_EXPR in
> simplify_permutat
Richard Biener writes:
> On Thu, Jul 14, 2022 at 9:55 AM Prathamesh Kulkarni
> wrote:
>>
>> On Wed, 13 Jul 2022 at 12:22, Richard Biener
>> wrote:
>> >
>> > On Tue, Jul 12, 2022 at 9:12 PM Prathamesh Kulkarni via Gcc-patches
>> > wrote:
>> > >
>> > > Hi Richard,
>> > > For the following test:
Richard Ball writes:
> Replace manual swapping idiom with std::swap in aarch64.cc
>
> gcc/config/aarch64/aarch64.cc has a few manual swapping idioms of the form:
>
> x = in0, in0 = in1, in1 = x;
>
> The preferred way is using the standard:
>
> std::swap (in0, in1);
>
> We should just fix these to
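
The two idioms quoted above can be compared side by side. The following is a minimal sketch, not code from aarch64.cc; the function names `manual_swap`, `standard_swap` and `check_swaps` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <utility>

// Manual three-step swap through a temporary, the idiom being replaced.
inline void manual_swap (int &in0, int &in1)
{
  int x;
  x = in0, in0 = in1, in1 = x;
}

// The same exchange expressed with the standard library.
inline void standard_swap (int &in0, int &in1)
{
  std::swap (in0, in1);
}

// Both forms leave the two operands exchanged.
inline void check_swaps ()
{
  int a = 1, b = 2;
  manual_swap (a, b);
  assert (a == 2 && b == 1);
  standard_swap (a, b);
  assert (a == 1 && b == 2);
}
```

Besides being shorter, `std::swap` needs no separately declared temporary, so there is one less variable whose type has to match the operands.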
Andrew Carlotti writes:
> On Wed, Jul 13, 2022 at 05:36:04PM +0100, Richard Sandiford wrote:
>> I like the part about getting rid of:
>>
>> static tree
>> aarch64_simd_builtin_type (machine_mode mode,
>> bool unsigned_p, bool poly_p)
>>
>> and the flow of the new function
graphds_scc says that it uses Tarjan's algorithm, but it looks like
it uses Kosaraju's algorithm instead (dfs one way followed by dfs
the other way).
OK to install?
Richard
gcc/
* graphds.cc (graphds_scc): Fix algorithm attribution.
---
gcc/graphds.cc | 2 +-
1 file changed, 1 insertio
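
For readers unfamiliar with the distinction, the "dfs one way followed by dfs the other way" shape of Kosaraju's algorithm can be sketched as below. This is an illustration only and is not taken from graphds.cc; the `adjacency` type and the function names are assumptions made for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical adjacency-list graph; this is not GCC's struct graph.
using adjacency = std::vector<std::vector<int>>;

// First pass: record vertices in order of DFS completion.
static void dfs_order (const adjacency &g, int v, std::vector<bool> &seen,
                       std::vector<int> &order)
{
  seen[v] = true;
  for (int w : g[v])
    if (!seen[w])
      dfs_order (g, w, seen, order);
  order.push_back (v);
}

// Second pass: label everything reachable from V in the reversed graph.
static void dfs_label (const adjacency &rev, int v, int label,
                       std::vector<int> &comp)
{
  comp[v] = label;
  for (int w : rev[v])
    if (comp[w] < 0)
      dfs_label (rev, w, label, comp);
}

// Return a component id for each vertex of G.
std::vector<int> kosaraju_scc (const adjacency &g)
{
  int n = static_cast<int> (g.size ());
  adjacency rev (n);
  for (int v = 0; v < n; ++v)
    for (int w : g[v])
      {
        assert (w >= 0 && w < n);
        rev[w].push_back (v);
      }

  std::vector<bool> seen (n, false);
  std::vector<int> order;
  for (int v = 0; v < n; ++v)
    if (!seen[v])
      dfs_order (g, v, seen, order);

  // Visit vertices in decreasing completion order on the reversed graph.
  std::vector<int> comp (n, -1);
  int label = 0;
  for (int i = n - 1; i >= 0; --i)
    if (comp[order[i]] < 0)
      dfs_label (rev, order[i], label++, comp);
  return comp;
}
```

Tarjan's algorithm, by contrast, finds the components in a single depth-first pass using an explicit stack and low-link values, which is why the two-pass structure points to Kosaraju.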
Richard Biener writes:
> On Wed, 27 Jul 2022, juzhe.zh...@rivai.ai wrote:
>
>> From: zhongjuzhe
>>
>> gcc/ChangeLog:
>>
>> * expr.cc (expand_assignment): Change GET_MODE_PRECISION to
>> GET_MODE_BITSIZE
>>
>> ---
>> gcc/expr.cc | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
Dimitrije Milosevic writes:
>> Do you know someone very familiar with MIPS and GCC and capable as a
>> port maintainer? An active MIPS port maintainer will make the situation
>> better.
> Sadly, no. I agree it would make things easier.
Yeah, I agree that's what we need. I stepped down from bein
Seems this thread has become a bit heated, so I'll try to proceed
with caution :-)
In the below, I'll use "X-mode const_int" to mean "a const_int that
is known from context to represent an X-mode value". Of course,
the const_int itself always stores VOIDmode.
"Roger Sayle" writes:
> Hi Segher,
Currently SLP tries to force permute operations "down" the graph
from loads in the hope of reducing the total number of permutes
needed or (in the best case) removing the need for the permutes
entirely. This patch tries to extend it as follows:
- Allow loads to take a different permutation from t
Ping^3 for the configure bits.
Richard Sandiford via Gcc-patches writes:
> On aarch64, --with-arch, --with-cpu and --with-tune only have an
> effect on the driver, so “./xgcc -B./ -O3” can give significantly
> different results from “./cc1 -O3”. --with-arch did have a limited
> eff
Richard Biener writes:
> The following teaches VN to handle reads from .MASK_STORE and
> .LEN_STORE. For this push_partial_def is extended first for
> convenience so we don't have to handle the full def case in the
> caller (possibly other paths can be simplified then). Also
> the partial defini
"Roger Sayle" writes:
> This patch implements some additional zero-extension and sign-extension
> related optimizations in simplify-rtx.cc. The original motivation comes
> from PR rtl-optimization/71775, where in comment #2 Andrew Pinski sees:
>
> Failed to match this instruction:
> (set (reg:DI
"Roger Sayle" writes:
> Many thanks to Segher and Richard for pointing out that my removal
> of optimizations of ABS(ABS(x)) and ABS(FFS(x)) in the original version
> of this patch was incorrect, and my assumption that these would be
> subsumed by val_signbit_known_clear_p was mistaken. That the
Richard Sandiford via Gcc-patches writes:
> "Roger Sayle" writes:
>> Many thanks to Segher and Richard for pointing out that my removal
>> of optimizations of ABS(ABS(x)) and ABS(FFS(x)) in the original version
>> of this patch was incorrect, and my assumption that
Takayuki 'January June' Suwa via Gcc-patches writes:
> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
> data flow consistent, but it also increases register allocation pressure
> and thus often creates many unwanted register-to-register moves that
> cannot be optimized aw
Martin Jambor writes:
> Hi Richard,
>
> On Fri, Nov 13 2020, Richard Sandiford via Gcc-patches wrote:
>> A later patch wants to be able to pass around subarray views of an
>> existing array. The standard class to do that is std::span, but it's
>> a C++20 thing.
Richard Biener writes:
> On Tue, 2 Aug 2022, Richard Sandiford wrote:
>
>> Currently SLP tries to force permute operations "down" the graph
>> from loads in the hope of reducing the total number of permutes
>> needed or (in the best case) removing the need for the permutes
>> entirely. This patch
Jeff Law via Gcc-patches writes:
> On 8/3/2022 1:52 AM, Richard Sandiford via Gcc-patches wrote:
>> Takayuki 'January June' Suwa via Gcc-patches
>> writes:
>>> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
>
Takayuki 'January June' Suwa writes:
> Thanks for your response.
>
> On 2022/08/03 16:52, Richard Sandiford wrote:
>> Takayuki 'January June' Suwa via Gcc-patches
>> writes:
>>> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
>>> data flow consistent, but it also increas
Richard Biener writes:
>> +/* Create vector init for vectorized iv. */
>> +static tree
>> +vect_create_nonlinear_iv_init (gimple_seq* stmts, tree init_expr,
>> + tree step_expr, poly_uint64 nunits,
>> + tree vectype,
>> +
Prathamesh Kulkarni writes:
> Hi Richard,
> Following from off-list discussion, in the attached patch, I wrote pattern
> similar to vec_duplicate_reg, which seems to work for the svld1rq tests.
> Does it look OK ?
>
> Sorry, I didn't fully understand your suggestion on integrating with
> vec_dupli
"Roger Sayle" writes:
> This patch to the middle-end's RTL expansion reorders the code in
> emit_store_flag_1 so that the backend has more control over how best
> to expand/split double word equality/inequality comparisons against
> zero or minus one. With the current implementation, the middle-e
Richard Earnshaw writes:
> On 13/06/2022 15:33, Richard Sandiford via Gcc-patches wrote:
>> On aarch64, --with-arch, --with-cpu and --with-tune only have an
>> effect on the driver, so “./xgcc -B./ -O3” can give significantly
>> different results from “./cc1 -O3”. --with-ar
Tamar Christina writes:
> Hi All,
>
> In GCC 11 we implemented the vectorizer optab for widening left shifts,
> however this optab is only supported for uniform shift constants.
>
> At the moment GCC still has two loop vectorization strategies (classical loop
> and SLP based loop vec) and the opt
Tamar Christina writes:
> Hi All,
>
> Currently we segfault when len == 0 for an attribute list.
>
> essentially [cons: =0, 1, 2, 3; attrs: ] segfaults but should be equivalent to
> [cons: =0, 1, 2, 3] and [cons: =0, 1, 2, 3; attrs:]. This fixes it by just
> returning early and leaving it to the
Richard Biener writes:
> [...]
>> >> in vect_determine_precisions_from_range. Maybe we should drop
>> >> the shift handling from there and instead rely on
>> >> vect_determine_precisions_from_users, extending:
>> >>
>> >> if (TREE_CODE (shift) != INTEGER_CST
>> >> || !wi::ltu_p (wi::to_w
Hao Liu OS writes:
> Hi Richard,
>
> Update the patch with a simple case (see below case and comments). It shows
> a live stmt may not have reduction def, which introduces the ICE.
>
> Is it OK for trunk?
OK, thanks.
Richard
>
> Fix the assertion failure on empty reduction define in info_
can_div_trunc_p (a, b, &Q, &r) tries to compute a Q and r that
satisfy the usual conditions for truncating division:
(1) a = b * Q + r
(2) |b * Q| <= |a|
(3) |r| < |b|
We can compute Q using the constant component (the case when
all indeterminates are zero). Since |r| < |b| for th
Tamar Christina writes:
>> > +
>> > +(define_constraint "D3"
>> > + "@internal
>> > + A constraint that matches vector of immediates that is with 0 to
>> > +(bits(mode)/2)-1."
>> > + (and (match_code "const,const_vector")
>> > + (match_test "aarch64_const_vec_all_same_in_range_p (op, 0,
>> >
Prathamesh Kulkarni writes:
> On Tue, 25 Jul 2023 at 18:25, Richard Sandiford
> wrote:
>>
>> Hi,
>>
>> Thanks for the rework and sorry for the slow review.
> Hi Richard,
> Thanks for the suggestions! Please find my responses inline below.
>>
>> Prathamesh Kulkarni writes:
>> > Hi Richard,
>> >
Richard Sandiford writes:
> Prathamesh Kulkarni writes:
>> On Tue, 25 Jul 2023 at 18:25, Richard Sandiford
>> wrote:
>>>
>>> Hi,
>>>
>>> Thanks for the rework and sorry for the slow review.
>> Hi Richard,
>> Thanks for the suggestions! Please find my responses inline below.
>>>
>>> Prathamesh K
Tamar Christina writes:
>> >> Do you see vect_constant_defs in practice, or is this just for
>> >> completeness?
>> >> I would expect any constants to appear as direct operands. I don't
>> >> mind keeping it if it's just a belt-and-braces thing though.
>> >
>> > In the latency case where I had a
YunQiang Su writes:
> PR #104914
>
> On TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true platforms,
> zero_extract (SI, SI) can be sign-extended. So, if a zero_extract (DI,
> DI) following with an sign_extend(SI, DI) can be merged to a single
> zero_extract (SI, SI).
>
> gcc/ChangeLog:
>
Richard Biener writes:
> The following fixes a problem with my last attempt of avoiding
> out-of-bound shift values for vectorized right shifts of widened
> operands. Instead of truncating the shift amount with a bitwise
> and we actually need to saturate it to the target precision.
>
> The follo
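
The difference between truncating and saturating the shift amount can be seen on a 16-bit value that is widened to int before shifting. This is a scalar sketch under stated assumptions: a 32-bit `int`, arithmetic right shift of negative values (guaranteed from C++20, the usual behaviour before that), and hypothetical function names; it is not the vectorizer code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Reference: widen to int, shift, then narrow again.
inline int16_t wide_shift (int16_t x, int n)
{
  assert (n >= 0 && n < 32);
  return static_cast<int16_t> (static_cast<int> (x) >> n);
}

// Narrow shift with the amount saturated to the highest in-range
// amount for a 16-bit value (15).
inline int16_t saturated_shift (int16_t x, int n)
{
  assert (n >= 0 && n < 32);
  return static_cast<int16_t> (x >> std::min (n, 15));
}

// Narrow shift with the amount truncated by a bitwise AND instead.
inline int16_t masked_shift (int16_t x, int n)
{
  assert (n >= 0 && n < 32);
  return static_cast<int16_t> (x >> (n & 15));
}
```

With a shift amount of 16, the masked form shifts by 0 and returns the input unchanged, whereas the saturated form shifts by 15 and matches the widened reference.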
Full review this time, sorry for skipping the tests earlier.
Prathamesh Kulkarni writes:
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 7e5494dfd39..680d0e54fd4 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -85,6 +85,10 @@ along with GCC; see the file COPYING3.
Prathamesh Kulkarni writes:
> On Fri, 4 Aug 2023 at 20:36, Richard Sandiford
> wrote:
>>
>> Full review this time, sorry for skipping the tests earlier.
> Thanks for the detailed review! Please find my responses inline below.
>>
>> Prathamesh Kulkarni writes:
>> > diff --git a/gcc/fold-const
"Andre Vieira (lists)" writes:
> Hi,
>
> This patch enables the use of mixed-types for simd clones for AArch64
> and adds aarch64 as a target_vect_simd_clones.
>
> Bootstrapped and regression tested on aarch64-unknown-linux-gnu
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.cc (currentl
Richard Ball writes:
> This patch adds support for the Cortex-A520 CPU to GCC.
>
> No regressions on aarch64-none-elf.
>
> Ok for master?
>
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add
> Cortex-A520 CPU.
> * config/aarch64/aarch64-tune.md: Regene
Richard Ball writes:
> ACLE has added intrinsics to bridge between SVE and Neon.
>
> The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and
> SVE vectors.
>
> This patch adds support to GCC for the following 3 intrinsics:
> svset_neonq, svget_neonq and svdup_neonq
>
> gcc/Chan
"juzhe.zh...@rivai.ai" writes:
> Hi, Richi.
>
>>> that should be
>
>>> || (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
>>> && !LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
>
>>> I think. It seems to imply that SLP isn't supported with
>>> masking/lengthing.
>
> Oh, yes. At first glance, the
"Andre Vieira (lists)" writes:
> Here is my new version, see inline response to your comments.
>
> New cover letter:
>
> This patch enables the use of mixed-types for simd clones for AArch64,
> adds aarch64 as a target_vect_simd_clones and corrects the way the
> simdlen is chosen for non-specifi
Jakub Jelinek writes:
> On Wed, Aug 09, 2023 at 05:55:28PM +0100, Richard Sandiford wrote:
>> Jakub: do you remember what the reason was? I don't mind dropping
>> "function", but it feels weird to drop the quotes around "simd".
>> Seems like, if we do that, there'll one day be a patch to add
>> t
Jakub Jelinek writes:
> On Wed, Aug 09, 2023 at 06:27:20PM +0100, Richard Sandiford wrote:
>> Jakub Jelinek writes:
>> > On Wed, Aug 09, 2023 at 05:55:28PM +0100, Richard Sandiford wrote:
>> >> Jakub: do you remember what the reason was? I don't mind dropping
>> >> "function", but it feels weird
Richard Biener via Gcc-patches writes:
> On Wed, Aug 9, 2023 at 6:16 PM Andrew Pinski via Gcc-patches
> wrote:
>>
>> If `A` has a range of `[0,0][100,INF]` and the comparison
>> of `A < 50`. This should be optimized to `A <= 0` (which then
>> will be optimized to just `A == 0`).
>> This patch imp
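
The claim in the quoted description can be checked directly for values inside the stated range. The sketch below assumes `A` is an unsigned 32-bit value, so that INF is the maximum of that type; `in_example_range` and `comparisons_agree` are hypothetical helpers, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Membership test for the example range [0,0][100,INF] on an
// unsigned 32-bit value.
inline bool in_example_range (uint32_t a)
{
  return a == 0 || a >= 100;
}

// For a value in the range, A < 50, A <= 0 and A == 0 all agree.
inline bool comparisons_agree (uint32_t a)
{
  assert (in_example_range (a));
  bool lt_50 = a < 50;
  bool le_0 = a <= 0;
  bool eq_0 = a == 0;
  return lt_50 == le_0 && le_0 == eq_0;
}
```

No value in the range lies strictly between 0 and 100, so the threshold 50 can be lowered to the nearest range boundary without changing the result.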
Richard Biener writes:
> On Thu, Aug 10, 2023 at 3:44 PM Richard Sandiford
> wrote:
>>
>> Richard Biener via Gcc-patches writes:
>> > On Wed, Aug 9, 2023 at 6:16 PM Andrew Pinski via Gcc-patches
>> > wrote:
>> >>
>> >> If `A` has a range of `[0,0][100,INF]` and the comparison
>> >> of `A < 50`.
Prathamesh Kulkarni writes:
>> static bool
>> is_simple_vla_size (poly_uint64 size)
>> {
>> if (size.is_constant ())
>> return false;
>> for (int i = 1; i < ARRAY_SIZE (size.coeffs); ++i)
>> if (size[i] != (i <= 1 ? size[0] : 0))
> Just wondering if this should be (i == 1 ? size[0] : 0
Siddhesh Poyarekar writes:
> On 2023-08-08 10:30, Siddhesh Poyarekar wrote:
>>> Do you have a suggestion for the language to address libgcc,
>>> libstdc++, etc. and libiberty, libbacktrace, etc.?
>>
>> I'll work on this a bit and share a draft.
>
> Hi David,
>
> Here's what I came up with for di
Richard Biener writes:
> When we vectorize fold-left reductions with partial vectors but
> no target operation available we use a vector conditional to force
> excess elements to zero. But that doesn't correctly preserve
> the sign of zero. The following patch disables partial vector
> support i
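
The sign-of-zero problem described above can be reproduced with a scalar model of an in-order reduction. This is a sketch only, assuming default round-to-nearest arithmetic without -ffast-math; `fold_left_sum` and its `filler` parameter are hypothetical and stand in for the value that a vector conditional puts in the excess elements.

```cpp
#include <cassert>
#include <cmath>

// In-order (fold-left) sum over LANES elements, where every lane at
// or beyond ACTIVE is replaced by FILLER before being added.
inline double fold_left_sum (const double *v, int lanes, int active,
                             double init, double filler)
{
  assert (active >= 0 && active <= lanes);
  double acc = init;
  for (int i = 0; i < lanes; ++i)
    acc += (i < active ? v[i] : filler);
  return acc;
}
```

With an initial value of -0.0 and a single active element of -0.0, a filler of +0.0 turns the result into +0.0, because -0.0 + +0.0 is +0.0 under round-to-nearest. A filler of -0.0 leaves the sign intact in this model.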
Richard Biener writes:
> On Fri, 11 Aug 2023, juzhe.zh...@rivai.ai wrote:
>
>> Hi, Richi.
>>
>> > 1. Target is using loop MASK as the partial vector loop control.
>> >> I don't think it checks for this?
>>
>> I am not sure whether I understand EXTRACT_LAST correctly.
>> But if target doesn't use
Juzhe-Zhong writes:
> Hi, there is a genrecog issue in the RISC-V backend.
>
> This is the ICE info:
>
> 0xfa3ba4 poly_int_pod<2u, unsigned short>::to_constant() const
> ../../../riscv-gcc/gcc/poly-int.h:504
> 0x28eaa91 recog_5
> ../../../riscv-gcc/gcc/config/riscv/bitmanip.md:31
Thanks for the clean-ups. But...
"Kewen.Lin" writes:
> Hi,
>
> Following Richi's suggestion [1], this patch is to move the
> handlings on VMAT_GATHER_SCATTER in the final loop nest
> of function vectorizable_load to its own loop. Basically
> it duplicates the final loop nest, clean up some usel
Prathamesh Kulkarni writes:
> On Thu, 10 Aug 2023 at 21:27, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> >> static bool
>> >> is_simple_vla_size (poly_uint64 size)
>> >> {
>> >> if (size.is_constant ())
>> >> return false;
>> >> for (int i = 1; i < ARRAY_SIZE (size.coe
"Kewen.Lin" writes:
> Hi Richard,
>
> on 2023/8/14 20:20, Richard Sandiford wrote:
>> Thanks for the clean-ups. But...
>>
>> "Kewen.Lin" writes:
>>> Hi,
>>>
>>> Following Richi's suggestion [1], this patch is to move the
>>> handlings on VMAT_GATHER_SCATTER in the final loop nest
>>> of functio
I think it would help to clarify what the aim of the security policy is.
Specifically:
(1) What service do we want to provide to users by classifying one thing
as a security bug and another thing as not a security bug?
(2) What service do we want to provide to the GNU community by the same
Andrew Pinski via Gcc-patches writes:
> Like the support conditional neg (r12-4470-g20dcda98ed376cb61c74b2c71),
> this just adds conditional not too.
> Also we should be able to turn `(a ? -1 : 0) ^ b` into a conditional
> not.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-
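
The identity behind the second sentence of the quoted description is that XOR with an all-ones mask is a bitwise NOT, while XOR with zero leaves the value unchanged. A minimal scalar sketch, with hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// The form that appears in the source: XOR with an all-ones or
// all-zeros mask selected by A.
inline int32_t xor_form (bool a, int32_t b)
{
  return (a ? -1 : 0) ^ b;
}

// The equivalent conditional not.
inline int32_t cond_not (bool a, int32_t b)
{
  return a ? ~b : b;
}

// Spot-check that the two forms agree.
inline void check_cond_not ()
{
  assert (xor_form (true, 5) == cond_not (true, 5));
  assert (xor_form (false, 5) == cond_not (false, 5));
}
```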
Richard Biener writes:
> On Tue, Aug 15, 2023 at 4:44 AM Kewen.Lin wrote:
>>
>> on 2023/8/14 22:16, Richard Sandiford wrote:
>> > No, it was more that 219-142=77, so it seems like a lot of lines
>> > are being duplicated rather than simply being moved. (Unlike for
>> > VMAT_LOAD_STORE_LANES, whi
Richard Biener writes:
> On Tue, 15 Aug 2023, Kewen.Lin wrote:
>
>> Hi Stefan,
>>
>> on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote:
>> > Hi everyone,
>> >
>> > I have bootstrapped and regtested the patch below on s390. For the
>> > 64-bit target I do not see any changes regarding the te
Richard Biener writes:
> On Mon, 14 Aug 2023, Prathamesh Kulkarni wrote:
>> On Mon, 7 Aug 2023 at 13:19, Richard Biener
>> wrote:
>> > It doesn't seem to make a difference for x86. That said, the "fix" is
>> > probably sticking the correct target on the dump-check, it seems
>> > that vect_fold_
Richard Biener writes:
> On Tue, 15 Aug 2023, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > On Tue, 15 Aug 2023, Kewen.Lin wrote:
>> >
>> >> Hi Stefan,
>> >>
>> >> on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote:
>> >> > Hi everyone,
>> >> >
>> >> > I have bootstrapped and reg
Richard Biener writes:
>> OK, fair enough. So the idea is: see where we end up and then try to
>> improve/factor the APIs in a less peephole way?
>
> Yeah, I think that's the only good way forward.
OK, no objection from me. Sorry for holding the patch up.
Richard
Richard Biener writes:
> The following changes the gate to perform vectorization of BB reductions
> to use needs_fold_left_reduction_p which in turn requires handling
> TYPE_OVERFLOW_UNDEFINED types in the epilogue code generation by
> promoting any operations generated there to use unsigned arith
"juzhe.zh...@rivai.ai" writes:
> Hi, Robin, Richard and Richi.
>
> I am wondering whether we can just simply replace the VEC_EXTRACT expander
> with binary?
>
> Like this :?
>
> DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW,
> - vec_extract, vec_extract)
> + vec_ex
Prathamesh Kulkarni writes:
>> Unfortunately, the patch regressed following tests on ppc64le and
>> armhf respectively:
>> gcc.target/powerpc/vec-perm-ctor.c scan-tree-dump-not optimized
>> "VIEW_CONVERT_EXPR"
>> gcc.dg/tree-ssa/forwprop-20.c scan-tree-dump-not forwprop1 "VEC_PERM_EXPR"
>>
>> This
Robin Dapp writes:
>> However:
>>
>> | #define vec_extract_direct { 3, 3, false }
>>
>> This looks wrong. The numbers are argument numbers (or -1 for a return
>> value). vec_extract only takes 2 arguments, so 3 looks to be out-of-range.
>>
>> | #define direct_vec_extract_optab_supported_p dir
Richard Ball writes:
> v2: Add missing PROFILE feature flag.
>
> This patch adds support for the Cortex-A720 CPU to GCC.
>
> No regressions on aarch64-none-elf.
>
> Ok for master?
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Cortex-
> A720 CPU.
>
Joseph Myers writes:
> On Mon, 17 Jul 2023, Michael Matz via Gcc-patches wrote:
>
>> So, essentially you want unignorable attributes, right? Then implement
>> exactly that: add one new keyword "__known_attribute__" (invent a better
>> name, maybe :) ), semantics exactly as with __attribute__ (i
Alex Coplan writes:
> Hi,
>
> This patch fixes up the code examples in the RTL-SSA documentation (the
> sections on making insn changes) to reflect the current API.
>
> The main issues are as follows:
> - rtl_ssa::recog takes an obstack_watermark & as the first parameter.
>   Presumably this is
Joseph Myers writes:
> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote:
>
>> Would it be OK to add support for:
>>
>> [[__extension__ ...]]
>>
>> to suppress the pedwarn about using [[]] prior to C2X? Then we can
>
> That seems lik
Richard Biener writes:
>> Am 17.08.2023 um 13:25 schrieb Richard Sandiford via Gcc-patches
>> :
>>
>> Joseph Myers writes:
>>>> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote:
>>>>
>>>> Would it be OK to add support f
Richard Biener writes:
> The following avoids running into somehow flawed logic in fold_vec_perm
> for non-VLA vectors.
>
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> PR tree-optimization/111048
> * fold-const.cc (fold_vec_perm_cst): Check for non-VLA
>
Richard Sandiford writes:
> Joseph Myers writes:
>> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote:
>>
>>> Would it be OK to add support for:
>>>
>>> [[__extension__ ...]]
>>>
>>> to suppress the pedwarn about u
Prathamesh Kulkarni writes:
> On Mon, 21 Aug 2023 at 12:26, Richard Biener wrote:
>>
>> On Sat, 19 Aug 2023, Prathamesh Kulkarni wrote:
>>
>> > On Fri, 18 Aug 2023 at 14:52, Richard Biener wrote:
>> > >
>> > > On Fri, 18 Aug 2023, Richard Sandiford wrote:
>> > >
>> > > > Richard Biener writes:
Juzhe-Zhong writes:
> Hi, Richard and Richi.
>
> Currently, GCC supports COND_LEN_FMA for floating-point **NO** -ffast-math.
> It's supported in tree-ssa-math-opts.cc. However, GCC failed to support
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.
>
> Consider this following case:
> #define TEST_TYPE(T