My understanding was that we went into this knowing that the IVs
would defeat SCEV analysis. Apparently that wasn't a problem for RVV,
but it's not surprising that it is a problem in general.
This isn't just about SELECT_VL though. We use the same type of IV
for cases that aren't going to use SE
Richard Biener writes:
>> But how easy would it be to extend SCEV analysis, via a pattern match?
>> The evolution of the IV phi wrt the inner loop is still a normal SCEV.
>
> No, the IV isn't a normal SCEV, the final value is different.
Which part of the IV though? Won't all executions of the la
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong
>
> Following Richi's suggestion, I changed the current decrement IV flow from:
>
> do {
>   remain -= MIN (vf, remain);
> } while (remain != 0);
>
> into:
>
> do {
>   old_remain = remain;
>   len = MIN (vf, remain);
>   remain -= vf;
> } while (old_r
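
A minimal C sketch of the two loop shapes quoted above, not taken from the patch itself. The quoted snippet is cut off before the second loop's exit condition, so the `old_remain > vf` test below is an assumption, and the function names are hypothetical. Each helper returns the total number of elements processed and reports the iteration count, so the two forms can be compared.

```c
#include <assert.h>
#include <stddef.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* First form: the IV is decremented by the MIN result, so the IV
   update depends on the length computed in the same iteration.  */
size_t
process_min_decrement (size_t remain, size_t vf, size_t *iters)
{
  assert (remain != 0 && vf != 0);
  size_t total = 0;
  *iters = 0;
  do
    {
      size_t len = MIN (vf, remain);
      total += len;
      remain -= len;
      ++*iters;
    }
  while (remain != 0);
  return total;
}

/* Second form: the IV is decremented by the invariant vf.  The exit
   test (an assumption, see above) reads the pre-decrement value, so
   the unsigned wraparound of remain on the last iteration is never
   observed.  */
size_t
process_const_decrement (size_t remain, size_t vf, size_t *iters)
{
  assert (remain != 0 && vf != 0);
  size_t total = 0;
  size_t old_remain;
  *iters = 0;
  do
    {
      old_remain = remain;
      size_t len = MIN (vf, remain);
      total += len;
      remain -= vf;
      ++*iters;
    }
  while (old_remain > vf);
  return total;
}
```

Under that assumed exit condition, both helpers process the same elements in the same number of iterations; the second keeps the IV step loop-invariant, which is the property the surrounding discussion about SCEV analysis is concerned with.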
"juzhe.zh...@rivai.ai" writes:
> Before this patch:
> foo:
> ble a2,zero,.L5
> csrr a3,vlenb
> srli a4,a3,2
> .L3:
> minu a5,a2,a4
> vsetvli zero,a5,e32,m1,ta,ma
> vle32.v v2,0(a1)
> vle32.v v1,0(a0)
> vsetvli t1,zero,e32,m1,ta,ma
> vadd.vv v1,v1,v2
> vsetvli zero,a5,e32,m1,ta,ma
> vse32.v v1,0(a0
"juzhe.zhong" writes:
> Maybe we can include the rgroup number in the select vl pattern? That way, I
> can always use the select vl pattern. In my backend, if it is a single rgroup,
> we gen vsetvl; otherwise we gen min.
That just seems to be a way of hiding an “is the target RVV?” test though.
IMO targets shouldn
Prathamesh Kulkarni writes:
> Hi Richard,
> The s32 case for single constant patch doesn't regress now after the
> above commit.
> Bootstrapped+tested on aarch64-linux-gnu, and verified that the new
> tests pass for aarch64_be-linux-gnu.
> Is it OK to commit ?
>
> Thanks,
> Prathamesh
>
> [aarch64
"Roger Sayle" writes:
> This patch implements Richard Sandiford's suggestion from
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html
> that wi::bswap (and a new wi::bitreverse) should be functions,
> and ideally only accessors are member functions. This patch
> implements the first
Richard Biener writes:
> On Wed, 31 May 2023, juzhe.zh...@rivai.ai wrote:
>
>> Hi, all. I have posted several of my investigations:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620101.html
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620105.html
>> https://gcc.gnu.org/pipermail/gcc
Richard Biener via Gcc-patches writes:
> On Wed, 31 May 2023, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > On Wed, 31 May 2023, juzhe.zh...@rivai.ai wrote:
>> >
>> >> Hi, all. I have posted several of my investigations:
>> >> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620101.html
Christophe Lyon writes:
> After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a
> pattern to match the new GIMPLE form.
>
> With this patch, gcc.target/aarch64/rev16_2.c passes again.
>
> 2023-05-31 Christophe Lyon
>
> PR target/110039
> gcc/
> * config/aarch64/
Christophe Lyon writes:
> On Wed, 31 May 2023 at 11:49, Richard Sandiford
> wrote:
>
>> Christophe Lyon writes:
>> > After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a
>> > pattern to match the new GIMPLE form.
>> >
>> > With this patch, gcc.target/aarch64/rev16_2.c passes agai
Richard Biener via Gcc-patches writes:
> On Fri, May 26, 2023 at 4:35 PM Andrew Stubbs wrote:
>>
>> Hi all,
>>
>> I want to implement a vector DIVMOD libfunc for amdgcn, but I can't just
>> do it because the GCC middle-end models DIVMOD's return value as
>> "complex int" type, and there are no ve
Christophe Lyon writes:
> After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a
> pattern to match the new GIMPLE form.
>
> With this patch, gcc.target/aarch64/rev16_2.c passes again.
>
> 2023-05-31 Christophe Lyon
>
> PR target/110039
> gcc/
> * config/aarch64/
"juzhe.zh...@rivai.ai" writes:
> Thanks Richi. I am gonna merge it after Richard's final approval.
Thanks for checking, but no need to wait for a second ack from me!
Please go ahead and commit.
Richard
Just some very minor things.
"Andre Vieira (lists)" writes:
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index
> 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb
> 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -90,6 +90,71 @@ loo
Sorry for the slow review.
I don't know the IV-related parts well enough to review those properly,
but they looked reasonable to me. Hopefully Richi can comment.
I'm curious though. For:
> + tree step = vect_dr_behavior (vinfo, dr_info)->step;
> +
> + [...]
> + poly_uint64 bytesize = GET_MO
juzhe.zh...@rivai.ai writes:
> + /* If we're using decrement IV approach in loop control, we can use output
> of
> + SELECT_VL to adjust IV of loop control and data reference when it
> satisfies
> + the following checks:
> +
> + (a) SELECT_VL is supported by the target.
> + (b) L
"juzhe.zh...@rivai.ai" writes:
> Hi, Richard. Thanks for the comments.
>
>>> If we use SELECT_VL to refer only to the target-independent ifn, I don't
>>> see why this last bit is true.
> Could you give me more details and information about this since I am not sure
> whether I catch up with you.
Richard Sandiford writes:
> "juzhe.zh...@rivai.ai" writes:
>> Hi, Richard. Thanks for the comments.
>>
If we use SELECT_VL to refer only to the target-independent ifn, I don't
see why this last bit is true.
>> Could you give me more details and information about this since I am not
>>
"juzhe.zh...@rivai.ai" writes:
> Hi, Richard.
>
>>> No, I meant that the comment I quoted seemed to be saying that solution
>>> 3 wasn't possible. The comment seemed to say that we would need to do
>>> solution 1.
> I am so sorry that I didn't write the comments accurately.
> Could you help me wi
"Roger Sayle" writes:
> This patch provides a wide-int implementation of bitreverse, that
> implements both of Richard Sandiford's suggestions from the review at
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html of an
> improved API (as a stand-alone function matching the bswap refa
Looks good! Just some minor comments:
Tamar Christina writes:
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index
> 6a435eb44610960513e9739ac9ac1e8a27182c10..1437ab55b260ab5c876e92d59ba39d24bffc6276
> 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -27,6 +27,7 @@ See the next
Richard Sandiford writes:
>> diff --git a/gcc/gensupport.h b/gcc/gensupport.h
>> index
>> a1edfbd71908b6244b40f801c6c01074de56777e..7925e22ed418767576567cad583bddf83c0846b1
>> 100644
>> --- a/gcc/gensupport.h
>> +++ b/gcc/gensupport.h
>> @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.
Tamar Christina writes:
>> >int operand_number; /* Operand index in the big array. */
>> >int output_format; /* INSN_OUTPUT_FORMAT_*. */
>> > + bool compact_syntax_p;
>> >struct operand_data operand[MAX_MAX_OPERANDS]; };
>> >
>> > @@ -700,12 +702,57 @@ proc
Oluwatamilore Adebayo writes:
> From: oluade01
>
> This adds a recognition pattern for the non-widening
> absolute difference (ABD).
>
> gcc/ChangeLog:
>
> * doc/md.texi (sabd, uabd): Document them.
> * internal-fn.def (ABD): Use new optab.
> * optabs.def (sabd_optab, uabd_optab
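
As a rough scalar illustration of what "non-widening absolute difference" means, here is a minimal C sketch for 8-bit elements. The function names are hypothetical and are not the optab or internal-function names from the patch; the sketch assumes the operation is the larger operand minus the smaller one, with the result kept in the element's own width rather than widened.

```c
#include <stdint.h>

/* Unsigned absolute difference: subtract the smaller operand from the
   larger one, so the result always fits in 8 bits.  */
uint8_t
uabd8_sketch (uint8_t a, uint8_t b)
{
  return a > b ? a - b : b - a;
}

/* Signed absolute difference in the same element width.  The true
   difference can exceed INT8_MAX (for example 127 and -128); this
   sketch does not try to model what happens in that case.  */
int8_t
sabd8_sketch (int8_t a, int8_t b)
{
  return a > b ? a - b : b - a;
}
```

For instance, `uabd8_sketch (3, 10)` and `sabd8_sketch (-3, 4)` both evaluate to 7.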
Oluwatamilore Adebayo writes:
>> It would be good to mark all of these functions with __attribute__((noipa)),
>> since I think interprocedural optimisations might otherwise defeat the
>> runtime test in abd_run_1.c (in the sense that we might end up folding
>> things at compile time and not testin
Alex Coplan writes:
> Hi,
>
> This patch series fixes various defects with the FEAT_LS64 ACLE
> implementation in the AArch64 backend.
>
> The series is organised as follows:
>
> - Patch 1/3 fixes whitespace errors in the existing code.
> - Patch 2/3 fixes PR110100 where we generate wrong code f
"Andre Vieira (lists)" writes:
> Hi,
>
> This patch fixes an issue introduced by
> g:2f482a07365d9f4a94a56edd13b7f01b8f78b5a0, where a subtype was being
> passed to vect_widened_op_tree, when no subtype was to be used. This
> led to an erroneous use of IFN_VEC_WIDEN_MINUS.
>
> gcc/ChangeLog:
Andrew Stubbs writes:
> On 30/05/2023 07:26, Richard Biener wrote:
>> On Fri, May 26, 2023 at 4:35 PM Andrew Stubbs wrote:
>>>
>>> Hi all,
>>>
>>> I want to implement a vector DIVMOD libfunc for amdgcn, but I can't just
>>> do it because the GCC middle-end models DIVMOD's return value as
>>> "com
In addition to Andreas's and Richard's comments:
Tamar Christina writes:
> +@item
> +@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:}
> followed by
> +a list of @code{match_operand}/@code{match_scratch} comma operand numbers,
> then a
How about:
a comma-separated lis
guojiufu writes:
> Hi,
>
> On 2023-06-09 16:00, Richard Biener wrote:
>> On Fri, 9 Jun 2023, Jiufu Guo wrote:
>>
>>> Hi,
>>>
>>> As checking the code, there is a "gcc_assert (SCALAR_INT_MODE_P
>>> (mode))"
>>> in "try_const_anchors".
>>> This assert seems correct because the function try_const_
Andrew Stubbs writes:
> On 07/06/2023 20:42, Richard Sandiford wrote:
>> I don't know if this helps (probably not), but we have a similar
>> situation on AArch64: a 64-bit mode like V8QI can be doubled to a
>> 128-bit vector or to a pair of 64-bit vectors. We used V16QI for
>> the former and "V2x
Richard Biener writes:
> On Fri, Jun 9, 2023 at 11:45 AM Andrew Stubbs wrote:
>>
>> On 09/06/2023 10:02, Richard Sandiford wrote:
>> > Andrew Stubbs writes:
>> >> On 07/06/2023 20:42, Richard Sandiford wrote:
>> >>> I don't know if this helps (probably not), but we have a similar
>> >>> situatio
"juzhe.zh...@rivai.ai" writes:
> Thanks, Richi.
>
> Should I wait for Richard's ACK again?
> In the last email about this patch, he just asked me to adjust the comment,
> with no code changes.
> I am not sure whether he is OK with it.
Yeah, OK from my POV too, thanks.
Richard
Kyrylo Tkachov via Gcc-patches writes:
> Hi all,
>
> This patch implements RTL constant-folding for the SS_TRUNCATE and
> US_TRUNCATE codes.
> The semantics are a clamping operation on the argument with the min and max
> of the narrow mode,
> followed by a truncation. The signedness of the clamp
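
The clamp-then-truncate semantics described above can be sketched in scalar C for one concrete pair of widths. This is only an illustration under assumptions: the function names are hypothetical, the 32-bit to 8-bit widths are chosen arbitrarily, and the code is not the RTL folding routine from the patch. It assumes SS_TRUNCATE clamps as a signed value and US_TRUNCATE as an unsigned value.

```c
#include <stdint.h>

/* SS_TRUNCATE-style narrowing: clamp to the signed range of the
   narrow type, then truncate.  */
int8_t
ss_truncate_32_to_8 (int32_t x)
{
  if (x < INT8_MIN)
    return INT8_MIN;
  if (x > INT8_MAX)
    return INT8_MAX;
  return (int8_t) x;
}

/* US_TRUNCATE-style narrowing: clamp to the unsigned range of the
   narrow type, then truncate.  */
uint8_t
us_truncate_32_to_8 (uint32_t x)
{
  return x > UINT8_MAX ? UINT8_MAX : (uint8_t) x;
}
```

With these definitions, an input of 300 saturates to 127 in the signed case and to 255 in the unsigned case, while in-range inputs pass through unchanged.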
Tejas Belagod writes:
> From: Tejas Belagod
>
> This PR optimizes an SVE intrinsics sequence where
> svlasta (svptrue_pat_b8 (SV_VL1), x)
> a scalar is selected based on a constant predicate and a variable vector.
> This sequence is optimized to return the corresponding element of a NEON
Jeff Law via Gcc-patches writes:
> On 6/9/23 04:41, juzhe.zh...@rivai.ai wrote:
>> @@ -4342,135 +4510,81 @@ pass_vsetvl::cleanup_insns (void) const
>> }
>> }
>>
>> +/* Return true if the SET result is not used by any instructions. */
>> +static bool
>> +has_no_uses (basic_block cfg_bb,
Richard Biener writes:
> AFAIU this special instruction is only supposed to prevent
> code motion (of stack memory accesses?) across this instruction?
> I'd say a
>
> (may_clobber (mem:BLK (reg:DI 1 1)))
>
> might be more to the point? I've used "may_clobber" which doesn't
> exist since I'm not
Richard Biener writes:
> On Wed, 14 Jun 2023, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > AFAIU this special instruction is only supposed to prevent
>> > code motion (of stack memory accesses?) across this instruction?
>> > I'd say a
>> >
>> > (may_clobber (mem:BLK (reg:DI 1 1)))
Richard Biener writes:
> On Wed, 14 Jun 2023, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > On Wed, 14 Jun 2023, Richard Sandiford wrote:
>> >
>> >> Richard Biener writes:
>> >> > AFAIU this special instruction is only supposed to prevent
>> >> > code motion (of stack memory accesses
Oluwatamilore Adebayo writes:
> From: oluade01
>
> This adds a recognition pattern for the non-widening
> absolute difference (ABD).
>
> gcc/ChangeLog:
>
> * doc/md.texi (sabd, uabd): Document them.
> * internal-fn.def (ABD): Use new optab.
> * optabs.def (sabd_optab, uabd_optab
Richard Biener via Gcc-patches writes:
> Currently vect_determine_partial_vectors_and_peeling will decide
> to apply fully masking to the main loop despite
> --param vect-partial-vector-usage=1 when the currently analyzed
> vector mode results in a vectorization factor that's bigger
> than the num
Richard Biener via Gcc-patches writes:
> The function is only meaningful for LOOP_VINFO_MASKS processing so
> inline it into the single use.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> * tree-vect-loop.cc (vect_get_max_nscalars_per_iter): Inline
> into ...
>
Richard Biener via Gcc-patches writes:
> This implements fully masked vectorization or a masked epilog for
> AVX512 style masks which single themselves out by representing
> each lane with a single bit and by using integer modes for the mask
> (both is much like GCN).
>
> AVX512 is also special in
Tamar Christina writes:
> +The syntax rules are as follows:
> +@itemize @bullet
> +@item
> +Templates must start with @samp{@{@@} to use the new syntax.
> +
> +@item
> +@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:}
s/parentheses/square brackets/
> +followed by a comma-
Richard Biener writes:
> On Wed, 14 Jun 2023, Richard Sandiford wrote:
>
>> Richard Biener via Gcc-patches writes:
>> > The function is only meaningful for LOOP_VINFO_MASKS processing so
>> > inline it into the single use.
>> >
>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>> >
>>
Richard Sandiford writes:
>> +
>> + /* Skip any newlines or whitespaces needed. */
>> + while (ISSPACE(*templ))
>> + templ++;
>> + continue;
>> + }
>> + else if (templ[0] == '/' && templ[1] == '*')
>> + {
>> + templ += 2;
>> + /*
Oluwatamilore Adebayo writes:
> From: oluade01
>
> This adds a recognition pattern for the non-widening
> absolute difference (ABD).
>
> gcc/ChangeLog:
>
> * doc/md.texi (sabd, uabd): Document them.
> * internal-fn.def (ABD): Use new optab.
> * optabs.def (sabd_optab, uabd_optab
Andrew Stubbs writes:
> One
> comment: building a vector constant {0, 1, 2, 3, ..., 63} results in a
> very large entry in the constant pool and an unnecessary memory load (it
> literally has to use this sequence to generate the addresses to load the
> constant!) Generating the sequence via V
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong
>
> Hi, Richard and Richi.
> As we discussed before, COND_LEN_* patterns were added for multiple
> situations.
> This patch applies COND_LEN_* for the following situation:
>
> Support for the situation that in "vectorizable_operation":
> /* If ope
Richard Biener writes:
> On Wed, 12 Jul 2023, juzhe.zh...@rivai.ai wrote:
>
>> Thanks Richard.
>>
>> Is it correct that the better way is to add optabs
>> (len_strided_load/len_strided_store),
>> then expand LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE to
>> len_strided_load/len_strided_store op
Vladimir Makarov via Gcc-patches writes:
> The following patch solves
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110372
>
> The patch was successfully bootstrapped and tested on x86-64.
>
> commit 1f7e5a7b91862b999aab88ee0319052aaf00f0f1
> Author: Vladimir N. Makarov
> Date: Fri Jul 7 09:
Richard Biener via Gcc-patches writes:
> On Mon, Jul 10, 2023 at 1:01 PM Uros Bizjak wrote:
>>
>> On Mon, Jul 10, 2023 at 11:47 AM Richard Biener
>> wrote:
>> >
>> > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote:
>> > >
>> > > On Mon, Jul 10, 2023 at 11:17 AM Richard Biener
>> > > wrote:
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong
>
> Hi, Richard and Richi.
> As we discussed before, COND_LEN_* patterns were added for multiple
> situations.
> This patch applies COND_LEN_* for the following situation:
>
> Support for the situation that in "vectorizable_operation":
> /* If ope
Richard Biener writes:
> On Wed, Jul 12, 2023 at 1:05 PM Uros Bizjak wrote:
>>
>> On Wed, Jul 12, 2023 at 12:58 PM Uros Bizjak wrote:
>> >
>> > On Wed, Jul 12, 2023 at 12:23 PM Richard Sandiford
>> > wrote:
>> > >
>> > > Richard Biener via Gcc-patches writes:
>> > > > On Mon, Jul 10, 2023 at 1
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong
>
> Hi, Richard and Richi.
> As we discussed before, COND_LEN_* patterns were added for multiple
> situations.
> This patch applies COND_LEN_* for the following situation:
>
> Support for the situation that in "vectorizable_operation":
> /* If ope
Richard Biener writes:
> The PRs ask for optimizing of
>
> _1 = BIT_FIELD_REF ;
> result_4 = BIT_INSERT_EXPR ;
>
> to a vector permutation. The following implements this as
> match.pd pattern, improving code generation on x86_64.
>
> On the RTL level we face the issue that backend patterns in
Vladimir Makarov writes:
> On 7/12/23 06:07, Richard Sandiford wrote:
>> Vladimir Makarov via Gcc-patches writes:
>>> diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc
>>> index 73fbef29912..2f95121df06 100644
>>> --- a/gcc/lra-assigns.cc
>>> +++ b/gcc/lra-assigns.cc
>>> @@ -1443,10 +1443,11 @
Summary: We'd like to be able to specify some attributes using
keywords, rather than the traditional __attribute__ or [[...]]
syntax. Would that be OK?
In more detail:
We'd like to add some new target-specific attributes for Arm SME.
These attributes affect semantics and code generation and so t
Thanks for the feedback.
Nathan Sidwell writes:
> On 7/14/23 11:56, Richard Sandiford wrote:
>> Summary: We'd like to be able to specify some attributes using
>> keywords, rather than the traditional __attribute__ or [[...]]
>> syntax. Would that be OK?
>>
>> In more detail:
>>
>> We'd like to
Jakub Jelinek writes:
> On Fri, Jul 14, 2023 at 04:56:18PM +0100, Richard Sandiford via Gcc-patches
> wrote:
>> Summary: We'd like to be able to specify some attributes using
>> keywords, rather than the traditional __attribute__ or [[...]]
>> syntax. Would that
Richard Biener writes:
> On Fri, Jul 14, 2023 at 5:58 PM Richard Sandiford via Gcc-patches
> wrote:
>>
>> Summary: We'd like to be able to specify some attributes using
>> keywords, rather than the traditional __attribute__ or [[...]]
>> syntax. Wou
Jason Merrill writes:
> On Sun, Jul 16, 2023 at 6:50 AM Richard Sandiford
> wrote:
>
>> Jakub Jelinek writes:
>> > On Fri, Jul 14, 2023 at 04:56:18PM +0100, Richard Sandiford via
>> Gcc-patches wrote:
>> >> Summary: We'd like to be able to speci
Juzhe-Zhong writes:
> Hi, Richard.
>
> The RISC-V port needs to add a bunch of VLS modes (V16QI,V32QI,V64QI,...etc)
> They share the same REG_CLASS with VLA modes (VNx16QI,VNx32QI,...etc)
>
> When I am adding those VLS modes, the RTL_SSA initialization in VSETVL PASS
> (inserted after RA) ICE:
> rvv.
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong
>
> Hi, Richard.
>
> The RISC-V port needs to add a bunch of VLS modes (V16QI,V32QI,V64QI,...etc)
> They share the same REG_CLASS with VLA modes (VNx16QI,VNx32QI,...etc)
>
> When I am adding those VLS modes, the RTL_SSA initialization in VSETVL PASS
>
Manolis Tsamis writes:
> noce_convert_multiple_sets has been introduced and extended over time to
> handle
> if conversion for blocks with multiple sets. Currently this is focused on
> register moves and rejects any sort of arithmetic operations.
>
> This series is an extension to allow more sequ
Michael Matz via Gcc-patches writes:
> Hello,
>
> the ELF psABI for x86-64 doesn't have any callee-saved SSE
> registers (there were actual reasons for that, but those don't
> matter anymore). This starts to hurt some uses, as it means that
> as soon as you have a call (say to memmove/memcpy, eve
Manolis Tsamis writes:
> On Tue, Jul 18, 2023 at 1:12 AM Richard Sandiford
> wrote:
>>
>> Manolis Tsamis writes:
>> > noce_convert_multiple_sets has been introduced and extended over time to
>> > handle
>> > if conversion for blocks with multiple sets. Currently this is focused on
>> > register
Tamar Christina writes:
> Hi All,
>
> The resulting predicate register of a whilelo is not
> restricted to the lower half of the predicate register file.
>
> As such these tests started failing after recent changes
> because the whilelo outside the loop is getting assigned p15.
It's the whilelo i
Andrew Carlotti writes:
> Updated patch to fix the fp16 intrinsic pragmas, and pushed to master.
> OK to backport to GCC 13?
OK, thanks.
Richard
> Many intrinsics currently depend on both an architecture version and a
> feature, despite the corresponding instructions being available within
> GC
Jeff Law via Gcc-patches writes:
> On 7/19/23 04:25, Richard Biener wrote:
>> On Wed, 19 Jul 2023, YunQiang Su wrote:
>>
>>> Eric Botcazou wrote on Wed, 19 Jul 2023 at 17:45:
> I don't see that. That's definitely not what GCC expects here,
> the left-most word of the doubleword should be unchan
Richard Biener writes:
> On Thu, 20 Jul 2023, Richard Sandiford wrote:
>
>> Tamar Christina writes:
>> > Hi All,
>> >
>> > The resulting predicate register of a whilelo is not
>> > restricted to the lower half of the predicate register file.
>> >
>> > As such these tests started failing after rec
Jan Hubicka writes:
>> Tamar Christina writes:
>> > Hi All,
>> >
>> > The resulting predicate register of a whilelo is not
>> > restricted to the lower half of the predicate register file.
>> >
>> > As such these tests started failing after recent changes
>> > because the whilelo outside the loop
Richard Biener writes:
> On Thu, 20 Jul 2023, Richard Sandiford wrote:
>
>> Jeff Law via Gcc-patches writes:
>> > On 7/19/23 04:25, Richard Biener wrote:
>> >> On Wed, 19 Jul 2023, YunQiang Su wrote:
>> >>
>> >>> Eric Botcazou wrote on Wed, 19 Jul 2023 at 17:45:
>>
>> > I don't see that. That's d
"Kewen.Lin" writes:
> Hi,
>
> As PR110729 reported, there was one issue for .section
> __patchable_function_entries with -ffunction-sections, that
> is we put the same symbol as link_to section symbol for all
> functions wrongly. The commit r13-4294 for PR99889 has
> fixed this with the correspon
"Kewen.Lin" writes:
> Hi,
>
> Commit r14-2267-gb8806f6ffbe72e adjusts the arguments order
> of LEN_STORE from {len,vector,bias} to {len,bias,vector},
> in order to make them consistent with LEN_MASK_STORE and
> MASK_STORE. But it failed to update the related handling
> in tree-ssa-sccvn.cc, it c
Richard Biener via Gcc-patches writes:
> When we materialize a layout we push edge permutes to constant/external
> defs without checking we can actually do so. For externals defined
> by vector stmts rather than scalar components we can't.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
Richard Biener writes:
>> Am 20.07.2023 um 16:09 schrieb Richard Sandiford :
>>
>> Richard Biener via Gcc-patches writes:
>>> When we materialize a layout we push edge permutes to constant/external
>>> defs without checking we can actually do so. For externals defined
>>> by vector stmts rathe
Richard Biener writes:
>> Am 20.07.2023 um 18:59 schrieb Richard Sandiford :
>>
>> Richard Biener writes:
> Am 20.07.2023 um 16:09 schrieb Richard Sandiford
> :
Richard Biener via Gcc-patches writes:
> When we materialize a layout we push edge permutes to constant/exte
Jan Hubicka writes:
> Avoid scaling flat loop profiles of vectorized loops
>
> As discussed, when vectorizing a loop with a static profile, it is not always
> a good idea
> to divide the header frequency by vectorization factor because the profile may
> not realistically represent the expected number o
Richard Biener writes:
> The following unifies SLP_TREE_VEC_STMTS into SLP_TREE_VEC_DEFS
> which can handle all cases we need.
>
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
Nice! Just curious...
> @@ -149,6 +147,20 @@ _slp_tree::~_slp_tree ()
> free (failed);
> }
>
> +/*
Hao Liu OS writes:
> This only affects the new costs in aarch64 backend. Currently, the reduction
> latency of vector body is too large as it is multiplied by stmt count. As the
> scalar reduction latency is small, the new costs model may think "scalar code
> would issue more quickly" and increa
Andrew Pinski via Gcc-patches writes:
> The problem is that -fasynchronous-unwind-tables is on by default for aarch64.
> We need to turn it off for crt*.o because it would make __EH_FRAME_BEGIN__ point
> to .eh_frame data from crtbeginT.o instead of the user-defined object
> during static linking.
Could you
钟居哲 writes:
> Hi, Richi. Thank you so much for review.
>
>>> This function doesn't seem to care about conditional vectorization
>>> support, so why are you changing it?
>
> I debug and analyze the code here:
>
> Breakpoint 1, vectorizable_call (vinfo=0x3d358d0, stmt_info=0x3dcc820,
> gsi=0x0, vec
Hao Liu OS writes:
> Hi,
>
> Thanks for the suggestion. I tested it and found a gcc_assert failure:
> gcc.target/aarch64/sve/cost_model_13.c (internal compiler error: in
> info_for_reduction, at tree-vect-loop.cc:5473)
>
> It is caused by empty STMT_VINFO_REDUC_DEF.
When was STMT_VINFO_REDU
"juzhe.zh...@rivai.ai" writes:
> Thanks Richard.
>
> Do you suggest we should add a macro like this first:
>
> #ifndef DEF_INTERNAL_COND_FN
> #define DEF_INTERNAL_COND_FN(NAME, FLAGS, OPTAB, TYPE) \
> DEF_INTERNAL_OPTAB_FN (COND_##NAME, FLAGS, cond_##optab, cond_##TYPE)
> DEF_INTERNAL_OPTAB_FN
"juzhe.zh...@rivai.ai" writes:
> Hi, Richard.
>>> I think we should have an internal-fn helper that returns IFN_COND_LEN_*
>>> for a given IFN_COND_*. It could handle IFN_MASK_LOAD -> IFN_MASK_LEN_LOAD
>>> etc. too.
> Could you name this helper function for me? Does it call
> "get_conditional_le
Hi,
Thanks for the rework and sorry for the slow review.
Prathamesh Kulkarni writes:
> Hi Richard,
> This is a reworking of the patch to extend fold_vec_perm to handle VLA vectors.
> The attached patch unifies handling of VLS and VLA vector_csts, while
> using fallback code
> for ctors.
>
> For VLS ve
Was leaving a bit of time in case Richi had any comments, but:
Matthew Malcomson writes:
> Our checks for whether the vectorization of a given loop would make an
> out of bounds access miss the case when the vector we load is so large
> as to span multiple iterations worth of data (while only bei
Richard Biener writes:
> On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches
> wrote:
>>
>> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that we're
>> > not papering over an issue elsewhere.
>>
>> Yes, I also wonder if this is an issue in vectorizable_reduction. Below
Richard Biener writes:
> On Wed, Jul 26, 2023 at 11:14 AM Richard Sandiford
> wrote:
>>
>> Richard Biener writes:
>> > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches
>> > wrote:
>> >>
>> >> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that
>> >> > we're not pape
Sorry for the slow response.
Hao Liu OS writes:
>> Ah, thanks. In that case, Hao, I think we can avoid the ICE by changing:
>>
>> if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar)
>> && vect_is_reduction (stmt_info))
>>
>> to:
>>
>> if ((kind == scalar_stmt || k
Hao Liu OS writes:
>> Which test case do you see this for? The two tests in the patch still
>> seem to report correct latencies for me if I make the change above.
>
> Not the newly added tests. It is still the existing case causing the
> previous ICE (i.e. assertion problem): gcc.target/aarch64
Richard Ball writes:
> Add POLY_INT_CST support to code within
> fold_ctor_reference. This code previously
> only supported INTEGER_CST which caused a
> bug when using VEC_PERM_EXPR with SVE vectors.
Just to add for others: this is a prerequisite for a follow-on patch,
so the change will be teste
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong
>
> Hi, Richard and Richi.
>
> Based on previous discussions, we should make COND_* and COND_LEN_*
> consistent.
>
> So, this patch defines these internal functions together using these 2
> wrappers:
>
> #ifndef DEF_INTERNAL_COND_FN
> #define DEF_INTERN
Juzhe-Zhong writes:
> Hi, Richard and Richi.
>
> Based on the suggestions from Richard:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html
>
> This patch chooses approach (1) that Richard provided, meaning:
>
> RVV implements cond_* optabs as expanders. RVV therefore supports
> both
Richard Biener via Gcc-patches writes:
> The following makes sure to limit the shift operand when vectorizing
> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift
> operand otherwise invokes undefined behavior. When we determine
> whether we can demote the operand we know we at m
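
To illustrate why the shift amount has to be limited when the operation is narrowed, here is a small scalar C sketch. The function names are hypothetical, the clamp to 15 is the natural limit for a 16-bit element and is an assumption about what "limit the shift operand" amounts to here, and the code relies on `>>` of a negative value being an arithmetic shift (implementation-defined in C, but what GCC provides).

```c
#include <stdint.h>

/* Wide form: sign-extend to 32 bits, shift, then truncate.
   Requires shift < 32.  */
int16_t
shift_wide (int16_t x, unsigned shift)
{
  return (int16_t) ((int32_t) x >> shift);
}

/* Narrowed form: models a 16-bit shift.  A shift amount of 16 or more
   would be out of bounds for a 16-bit element, so it is clamped to 15,
   which still yields 0 or -1 depending on the sign of x.  */
int16_t
shift_narrow (int16_t x, unsigned shift)
{
  unsigned limited = shift > 15 ? 15 : shift;
  return (int16_t) (x >> limited);
}
```

Because `x` is sign-extended from 16 bits, every shift amount from 15 to 31 produces the same result in the wide form, which is why clamping to 15 preserves the value.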
Richard Sandiford writes:
> Richard Biener via Gcc-patches writes:
>> The following makes sure to limit the shift operand when vectorizing
>> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift
>> operand otherwise invokes undefined behavior. When we determine
>> whether we can d
Jeff Law via Gcc-patches writes:
> On 7/19/23 04:11, Xiao Zeng wrote:
>> This patch completes the recognition of the basic semantics
>> defined in the spec, namely:
>>
>> Conditional zero, if condition is equal to zero
>>    rd = (rs2 == 0) ? 0 : rs1
>> Conditional zero, if condition is non zero
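
A minimal C sketch of the two conditional-zero operations, with hypothetical function names. The first follows the formula quoted above; the quoted text is cut off before the second formula, so the "non zero" variant below is an assumption made by symmetry with the first. The last helper shows one common way such a pair can be combined into a general conditional select.

```c
#include <stdint.h>

/* Conditional zero, if condition is equal to zero:
   rd = (rs2 == 0) ? 0 : rs1  */
uint64_t
cond_zero_eqz (uint64_t rs1, uint64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* Conditional zero, if condition is non zero (assumed by symmetry):
   rd = (rs2 != 0) ? 0 : rs1  */
uint64_t
cond_zero_nez (uint64_t rs1, uint64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* cond ? a : b built from the two operations: exactly one of the two
   terms is zeroed, so OR-ing them selects the other.  */
uint64_t
cond_select (uint64_t cond, uint64_t a, uint64_t b)
{
  return cond_zero_eqz (a, cond) | cond_zero_nez (b, cond);
}
```

For example, `cond_select (1, 10, 20)` yields 10 and `cond_select (0, 10, 20)` yields 20.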
Jeff Law via Gcc-patches writes:
> On 8/1/23 05:18, Richard Sandiford wrote:
>>
>> Where were you seeing the requirement for pointer equality? genrecog.cc
>> at least uses rtx_equal_p, and I think it has to. E.g. some patterns
>> use (match_dup ...) to match output and input mems, and mem rtxes