Ok. LGTM as long as you change the patch as I suggested.
Thanks.
juzhe.zh...@rivai.ai
From: Kito Cheng
Date: 2023-05-30 14:51
To: juzhe.zh...@rivai.ai
CC: gcc-patches; palmer; kito.cheng; jeffreyalaw; Robin Dapp; pan2.li
Subject: Re: [PATCH] RISC-V: Basic VLS code gen for RISC-V
> >> /* Retur
On Mon, May 29, 2023 at 6:20 PM Martin Jambor wrote:
>
> Hi,
>
> there have been concerns that linear searches through DECL_ARGUMENTS
> that are often necessary to compute the index of a particular
> PARM_DECL which is the key to results of IPA-CP can happen often
> enough to be a compile time iss
On Tue, May 30, 2023 at 7:06 AM Ajit Agarwal wrote:
>
> Hello Richard:
>
> On 22/05/23 6:26 pm, Richard Biener wrote:
> > On Thu, May 18, 2023 at 9:14 AM Ajit Agarwal wrote:
> >>
> >> Hello All:
> >>
> >> This patch improves code sinking pass to sink statements before call to
> >> reduce
> >> re
On Tue, 30 May 2023, 05:35 Alexandre Oliva via Libstdc++, <
libstd...@gcc.gnu.org> wrote:
>
> When long double is wider than double, but from_chars is implemented
> in terms of double, tests that involve the full precision of long
> double are expected to fail. Mark them as such on x86_64-*-vxwor
On Tue, May 30, 2023 at 8:07 AM Kito Cheng via Gcc-patches
wrote:
>
> GNU vector extensions are widely used around the world, and this patch
> enables them with RISC-V vector extensions. This can help people
> leverage existing code bases with RVV and also write vector programs in a
> familiar w
From: Cedric Landet
The coding style rules require avoiding FIXME comments; ??? is
preferred.
gcc/ada/
* init.c: Replace FIXME by ???
Tested on x86_64-pc-linux-gnu, committed on master.
---
gcc/ada/init.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/g
From: Eric Botcazou
The compiler fails to capture global references during the analysis of the
aspect on the generic type because it analyzes a copy of the expression.
gcc/ada/
* exp_util.adb (Build_DIC_Procedure_Body.Add_Own_DIC): When inside
a generic unit, preanalyze the expr
From: Eric Botcazou
gcc/ada/
* libgnat/a-cidlli.adb (Put_Image): Simplify.
* libgnat/a-coinve.adb (Put_Image): Likewise.
Tested on x86_64-pc-linux-gnu, committed on master.
---
gcc/ada/libgnat/a-cidlli.adb | 13 +
gcc/ada/libgnat/a-coinve.adb | 13 +
2
From: Piotr Trojanek
For access-to-subprogram types with Pre/Post aspects we create a wrapper
routine that evaluates these aspects. Spec of this wrapper was created
always, while its body was only created when expansion was enabled.
Now we only create these wrappers when expansion is enabled. In
From: Johannes Kliemann
The Default_Stack_Size function does not check that the binder specified
default stack size is greater than the minimum stack size for the runtime.
This can result in tasks using default stack sizes less than the minimum
stack size because the Adjust_Storage_Size only adju
From: Eric Botcazou
This happens when the expression of the return statement is a call that does
not return on the same stack as the enclosing function.
gcc/ada/
* sem_res.adb (Resolve_Call): Restrict previous change to calls that
return on the same stack as the enclosing functi
From: Eric Botcazou
gcc/ada/
* gcc-interface/decl.cc (range_cannot_be_superflat): Return true
immediately if Cannot_Be_Superflat is set.
* gcc-interface/misc.cc (gnat_post_options): Do not override the
-Wstringop-overflow setting.
Tested on x86_64-pc-linux-gnu, c
From: Eric Botcazou
The original fix makes it possible to create transient scopes around return
statements in more cases, but it overlooks that transient scopes are reused
and, in particular, that they can be promoted to secondary stack management.
gcc/ada/
* exp_ch7.adb (Find_Enclosing
From: Ronan Desplanques
gcc/ada/
* doc/gnat_ugn/building_executable_programs_with_gnat.rst: Fix minor
issues.
* doc/gnat_ugn/the_gnat_compilation_model.rst: Fix minor issues.
* gnat_ugn.texi: Regenerate.
Tested on x86_64-pc-linux-gnu, committed on master.
---
...build
Don't require storage access for explicit dereferences used as
lvalue (e.g. Some_Access.all'Address) or for renamings.
gcc/ada/
* gcc-interface/trans.cc (get_storage_model_access): Don't require
storage model access for dereference used as lvalue or renamings.
Tested on x86_64-pc
From: Eric Botcazou
The previous fix to get_storage_model_access was incomplete and needs to be
extended to the node itself.
gcc/ada/
* gcc-interface/trans.cc (get_storage_model_access): Also strip any
type conversion in the node when unwinding the components.
Tested on x86_64-
From: Eric Botcazou
This happens when the peculiar check emitted by Check_Large_Modular_Array
is applied to an object whose actual subtype is an itype with dynamic size,
because the first reference to the itype in the expanded code may turn out
to be within the raise statement, which is problemat
From: Joel Brobecker
This commit changes the runtime on aarch64-linux to use the Linux
version of s-tsmona.adb, so as to add support for this functionality
on aarch64-linux.
gcc/ada/
* Makefile.rtl: Use libgnat/s-tsmona__linux.adb on
aarch64-linux. Link libgnat with -ldl, as th
From: Eric Botcazou
No functional changes.
gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_entity) : Replace
integer_zero_node with null_pointer_node for pointer types.
* gcc-interface/trans.cc (gnat_gimplify_expr) : Likewise.
* gcc-interface/utils.cc (maybe_pad_ty
From: Eric Botcazou
This extends an earlier fix done for the others choice of an array aggregate
to all the choices of the aggregate, since the same sharing issue may happen
when the choices are not contiguous.
gcc/ada/
* exp_aggr.adb (Build_Array_Aggr_Code.Get_Assoc_Expr): Duplicate th
From: Eric Botcazou
This streamlines the handling of qualified expressions in the expansion of
aggregates and plugs a couple of loopholes that may cause memory leaks.
gcc/ada/
* exp_aggr.adb (Build_Array_Aggr_Code): Move the declaration of Typ
to the beginning.
(Initiali
From: Eric Botcazou
It comes from a small oversight in get_storage_model_access.
gcc/ada/
* gcc-interface/trans.cc (node_is_component): Remove parentheses.
(node_is_type_conversion): New predicate.
(get_atomic_access): Use it.
(get_storage_model_access): Likewise
From: Eric Botcazou
As the additional temporaries required by the semantics of nonnative storage
models are now created by the front-end, in particular for actual parameters
and assignment statements, the corresponding code in gigi can be removed.
gcc/ada/
* gcc-interface/trans.cc (Call
From: Eric Botcazou
This also removes some obsolete stuff.
gcc/ada/
* gcc-interface/Make-lang.in (ADA_CFLAGS): Move up.
(ALL_ADAFLAGS): Add $(NO_PIE_CFLAGS).
(ada/mdll.o): Remove.
(ada/mdll-fil.o): Likewise.
(ada/mdll-utl.o): Likewise.
Tested on x86_64-p
From: Eric Botcazou
The code generator must now be prepared to translate assignment statements
to objects allocated with a storage model and that are not initialized yet.
gcc/ada/
* gcc-interface/trans.cc (Attribute_to_gnu) : Tweak.
(gnat_to_gnu) : Declare a local variable.
From: Eric Botcazou
gcc/ada/
* gcc-interface/misc.cc (internal_error_function): Be prepared for
an input_location set to UNKNOWN_LOCATION.
Tested on x86_64-pc-linux-gnu, committed on master.
---
gcc/ada/gcc-interface/misc.cc | 22 --
1 file changed, 16 inse
From: Eric Botcazou
gcc/ada/
* gcc-interface/trans.cc (gnat_to_gnu) : Test the
precision of the operation rather than that of the result type.
Tested on x86_64-pc-linux-gnu, committed on master.
---
gcc/ada/gcc-interface/trans.cc | 8
1 file changed, 4 insertions(+),
Hi Kito,
> GNU vector extensions are widely used around the world, and this patch
> enables them with RISC-V vector extensions. This can help people
> leverage existing code bases with RVV and also write vector programs in a
> familiar way.
>
> The idea of VLS code gen support is emulate VLS op
When using 'Address on an object with a size clause, gigi would end up
creating a copy and using its address instead of the one of the original
object, leading to incorrect behavior. Remove the conversion (that
triggers the copy) when 'Address is applied to a declaration.
gcc/ada/
* gcc-i
From: Eric Botcazou
This works around the limitations present for the support of arrays in the
middle-end by clearing the TREE_OVERFLOW flag for arrays with zero length.
gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_entity) : Use a
local variable for the GNAT index type.
From: Eric Botcazou
gcc/ada/
* gcc-interface/trans.cc (Attribute_to_gnu) : Check that
the storage model has Copy_From before instantiating loads for it.
: Likewise.
: Likewise.
(gnat_to_gnu) : Likewise.
: Likewise.
Tested on x86_64-pc-linux-gnu, c
Hi, this patch passes bootstrap.
Ok for trunk ?
Thanks.
juzhe.zh...@rivai.ai
From: juzhe.zhong
Date: 2023-05-25 23:26
To: gcc-patches
CC: richard.sandiford; rguenther; Ju-Zhe Zhong
Subject: [PATCH] VECT: Add SELECT_VL support
From: Ju-Zhe Zhong
This patch is adding SELECT_VL middle-end
On 5/25/23 9:37 AM, David Faust via Gcc-patches wrote:
Many BTF type kinds refer to other types via index to the final types
list. However, the order of the final types list is not guaranteed to
remain the same for the same source program between different runs of
the compiler, making it difficul
Hello Richard:
On 30/05/23 12:34 pm, Richard Biener wrote:
> On Tue, May 30, 2023 at 7:06 AM Ajit Agarwal wrote:
>>
>> Hello Richard:
>>
>> On 22/05/23 6:26 pm, Richard Biener wrote:
>>> On Thu, May 18, 2023 at 9:14 AM Ajit Agarwal wrote:
Hello All:
This patch improves code s
On Mon, May 29, 2023 at 8:17 PM Roger Sayle wrote:
>
>
> This is my proposed minimal fix for PR target/109973 (hopefully suitable
> for backporting) that follows Jakub Jelinek's suggestion that we introduce
> CCZmode and CCCmode variants of ptest and vptest, so that the i386
> backend treats [v]pt
>> why is the conversion after register allocation always
>> safe?
I do worry about this issue too.
I just noticed:
+ case MEM:
+ operands[i] = change_address (operands[i], vla_mode, NULL_RTX);
I am not sure whether it is safe.
>> Couldn't we "lower" the fixed-length vectors to VL
On 2023-05-30 13:26 Sinan wrote:
>
>>> +/* Return TRUE if Zcmp push and pop insns should be
>>> + avoided. FALSE otherwise.
>>> + Only use multi push & pop if all GPRs masked can be covered,
>>> + and stack access is SP based,
>>> + and GPRs are at top of the stack frame,
>>> + and no conflicts i
Hi, Richi.
>> but ideally the user would be able to specify -mrvv-size=32 for an
>> implementation with 32 byte vectors and then vector lowering would make use
>> of vectors up to 32 bytes?
Actually, we don't want to require -mrvv-size=32 to enable vectorization on
GNU vectors.
You can take a
Sorry I've missed the recent updates on trunk regarding handling FMA.
I'll measure again if something in this still helps.
Thanks,
Di Zhao
> -----Original Message-----
> From: Di Zhao OS
> Sent: Friday, May 26, 2023 3:15 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [RFC][PATCH] Improve generating
> We want to be able to treat such things as invariant somehow even if we
> can't do that for references to user data that might be changed by
> intervening code.
>
> That is, indicate that we know that the _REF actually refers to a const
> variable or is otherwise known to be unchanging.
>
> Per
On Mon, 29 May 2023, Jan-Benedict Glaw wrote:
> > Can you elaborate how you build GCC?
>
> My host compiler is Debian's "gcc-snapshot", by now some two months
> old. (As Eric wrote, it's probably just too old.) That compiler is
> given for CC/CXX. The new build is just (as I wrote in the initial
>
On Tue, May 30, 2023 at 10:03:05AM +0200, Eric Botcazou wrote:
> > We want to be able to treat such things as invariant somehow even if we
> > can't do that for references to user data that might be changed by
> > intervening code.
> >
> > That is, indicate that we know that the _REF actually refe
On Tue, May 30, 2023 at 9:39 AM Uros Bizjak wrote:
>
> On Mon, May 29, 2023 at 8:17 PM Roger Sayle
> wrote:
> >
> >
> > This is my proposed minimal fix for PR target/109973 (hopefully suitable
> > for backporting) that follows Jakub Jelinek's suggestion that we introduce
> > CCZmode and CCCmode
This fixes all asan tests, apart from
c-c++-common/asan/pointer-compare-1.c which needs a workaround for PR
sanitizer/82501.
PR target/110036
* config/riscv/riscv.cc (riscv_asan_shadow_offset): Update to
match libsanitizer.
---
gcc/config/riscv/riscv.cc | 7 +++
1 file
PR sanitizer/82501
* c-c++-common/asan/pointer-compare-1.c: Disable use of small data
on RISC-V.
---
gcc/testsuite/c-c++-common/asan/pointer-compare-1.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/gcc/testsuite/c-c++-common/asan/pointer-compare-1.c
b/gcc/testsuite/
LGTM. I remember Luís updated that [1], but apparently I forgot to sync it to GCC.
Just as a reminder, I plan to change that to a dynamic offset [2] so that
it works on Sv39, Sv48 and Sv57,
but we are still running tests and debugging to make sure LSAN works well...
[1] https://reviews.llvm.org/D97
LGTM, thanks :)
On Tue, May 30, 2023 at 4:43 PM Andreas Schwab via Gcc-patches
wrote:
>
> PR sanitizer/82501
> * c-c++-common/asan/pointer-compare-1.c: Disable use of small data
> on RISC-V.
> ---
> gcc/testsuite/c-c++-common/asan/pointer-compare-1.c | 1 +
> 1 file chang
>>> but ideally the user would be able to specify -mrvv-size=32 for an
>>> implementation with 32 byte vectors and then vector lowering would make use
>>> of vectors up to 32 bytes?
>
> Actually, we don't want to require -mrvv-size=32 to enable vectorization on
> GNU vectors.
> You can take a l
(I am still stuck in meeting hell and will not be released until much later;
apologies for the short and incomplete reply. I will reply in full later.)
One reason for adding VLS mode support is SLP, especially for
those SLP candidates that are not in a loop; in those cases using a VLS type
can be better. Of course, using l
In the future, we will definitely be mixing VLA and VLS-vlmin together in
codegen, and it will not cause any issues.
For VLS-vlmin, I would prefer that it be used in length-style auto-vectorization
(I am not sure, since my SELECT_VL patch is not
finished; I will check whether it can work when I am working on SELECT_VL
One more note: we found a real case in SPEC 2006 where SLP converts two 8-bit
values into int8x2_t, but the value is live across a function call. It
only needs to save and restore 16 bits, but it becomes a save-restore of VLEN bits
because it uses VLA mode in the backend. You could imagine that when VLEN is
larger, the performa
On Fri, 26 May 2023, juzhe.zh...@rivai.ai wrote:
> Hi, Richi. Thanks for your analysis and helps.
>
> >> We could simply retain the original
> >> incrementing IV for loop control and add the decrementing
> >> IV for computing LEN in addition to that and leave IVOPTs
> >> sorting out to eventually
The insn "*shlrd_reg" shifts two registers with a funnel shifter by the
third register to get a single word result:
reg0 = (reg1 SHIFT_OP0 reg3) BIT_JOIN_OP (reg2 SHIFT_OP1 (32 - reg3))
where the funnel left shift is SHIFT_OP0 := ASHIFT, SHIFT_OP1 := LSHIFTRT
and its right shift is SHIFT_OP0 :=
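
For readers unfamiliar with funnel shifts, the left-shift case of the formula above can be modeled in plain C. This is only a sketch under stated assumptions: the function name `funnel_shift_left` is hypothetical, BIT_JOIN_OP is assumed to be a bitwise OR, and the shift count is restricted to 1..31 so that neither C shift count reaches 32. It is not taken from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Model of reg0 = (reg1 ASHIFT reg3) BIT_JOIN_OP (reg2 LSHIFTRT (32 - reg3)).
   Assumption: BIT_JOIN_OP is modeled as bitwise OR.
   hi plays the role of reg1, lo of reg2, n of reg3.  */
uint32_t
funnel_shift_left (uint32_t hi, uint32_t lo, unsigned int n)
{
  /* Keep n in 1..31: a C shift by 32 on a 32-bit value is undefined.  */
  assert (n >= 1 && n <= 31);
  return (hi << n) | (lo >> (32 - n));
}
```

With n = 8, the result takes the low 24 bits of `hi` followed by the top 8 bits of `lo`.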
More optimized than the default RTL generation.
gcc/ChangeLog:
* config/xtensa/xtensa.md (adddi3, subdi3):
New RTL generation patterns implemented according to the instruc-
tion idioms described in the Xtensa ISA reference manual (p. 600).
---
gcc/config/xtensa/xtensa.md
This patch introduces more optimized implementations for the 6 cstoresi4
insn comparison methods (eq/ne/lt/le/gt/ge; eq, however, requires
TARGET_NSA).
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_expand_scc):
Add dedicated optimization code for cstoresi4 (eq/ne/gt/ge/lt/le
Ok.
It seems that for this condition:
+ /* If we're vectorizing a loop that uses length "controls" and
+ can iterate more than once, we apply decrementing IV approach
+ in loop control. */
+ if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
+ && !LOOP_VINFO_LENS (loop_vinfo).
On Tue, May 30, 2023 at 11:17 AM juzhe.zh...@rivai.ai
wrote:
>
> In the future, we will definitely be mixing VLA and VLS-vlmin together in
> codegen, and it will not cause any issues.
> For VLS-vlmin, I prefer it is used in length style auto-vectorization (I am
> not sure since my SELECT_VL patch
On Mon, May 8, 2023 at 12:21 AM Andrew Pinski via Gcc-patches
wrote:
>
> This moves the `a <= CST1 ? MAX : a` optimization
> from phiopt to match. It just adds a new pattern to match.pd.
>
> There is one more change needed before being able to remove
> minmax_replacement from phiopt.
>
> A few not
On Mon, May 8, 2023 at 7:27 AM Andrew Pinski via Gcc-patches
wrote:
>
> This patch adds the support for match that was implemented for PR 87913 in
> phiopt.
> It implements it by adding support to minmax_from_comparison for the check.
> It uses the range information if available which allows to p
Ok for 12 and 13 branch?
--
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
>> For the future it would be then good to have the vectorizer
>>re-vectorize loops with
>>VLS vector uses to VLA style?
Not really, this patch just uses a trick to convert VLS vectors into VLA style,
since
that avoids defining the RVV patterns with VLS modes and saves a lot of work.
There is
Hi all,
This patch reimplements the MD patterns for the
UHADD,SHADD,UHSUB,SHSUB,URHADD,SRHADD instructions using
standard RTL operations rather than unspecs. The correct RTL representation
involves widening
the inputs before adding them and halving, followed by a truncation back to the
origina
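
The widen, add, halve, truncate sequence described above can be illustrated per lane in scalar C. This is a minimal sketch for the unsigned 8-bit case only; the function names `uhadd8` and `urhadd8` are hypothetical, and the rounding variant assumes the usual "add 1 before halving" semantics of a rounding halving add. It does not reproduce the MD patterns in the patch.

```c
#include <stdint.h>

/* Unsigned halving add on one 8-bit lane:
   widen both inputs, add, halve, then truncate back to 8 bits.  */
uint8_t
uhadd8 (uint8_t a, uint8_t b)
{
  uint16_t wide = (uint16_t) a + (uint16_t) b;   /* cannot overflow 16 bits */
  return (uint8_t) (wide >> 1);
}

/* Rounding variant (assumption: rounding adds 1 before the halving).  */
uint8_t
urhadd8 (uint8_t a, uint8_t b)
{
  uint16_t wide = (uint16_t) a + (uint16_t) b + 1;
  return (uint8_t) (wide >> 1);
}
```

Doing the addition in the wider type is what keeps the carry bit that an 8-bit addition would lose.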
On Tue, May 9, 2023 at 9:06 AM liuhongt via Gcc-patches
wrote:
>
> The patch doesn't handle:
> 1. cast64_to_32,
> 2. memory source with rsize < range.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
OK and sorry for the delay.
Richard.
> gcc/ChangeLog:
>
>
Hi all,
This patch converts the patterns for the integer widen and pairwise-add
instructions
to standard RTL operations. The pairwise addition within a vector can be
represented
as an addition of two vec_selects, one selecting the even elements, and one
selecting the odd ones.
Thus for the intrinsic vp
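
The "addition of two vec_selects" view can be sketched in scalar C. This is only an illustration: the function name `pairwise_add`, the `uint32_t` lane type, and the array-based interface are assumptions made for the example, not anything from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Pairwise add within a vector of 2*n lanes, written as
   (select even lanes) + (select odd lanes).
   Unsigned lanes are used so that overflow wraps instead of being undefined.  */
void
pairwise_add (const uint32_t *v, size_t n, uint32_t *out)
{
  for (size_t i = 0; i < n; i++)
    {
      uint32_t even = v[2 * i];       /* lane picked by the "even" select */
      uint32_t odd = v[2 * i + 1];    /* lane picked by the "odd" select */
      out[i] = even + odd;
    }
}
```

For the input {1, 2, 3, 4} with n = 2, the even lanes are {1, 3} and the odd lanes are {2, 4}.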
I think I prefer doing VLS mode like these:
This is current VLA patterns:
(define_insn "@pred_"
[(set (match_operand:VI 0 "register_operand" "=vd, vd, vr, vr, vd,
vd, vr, vr, vd, vd, vr, vr")
(if_then_else:VI
(unspec:
[(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1, W
On Wed, 17 May 2023, Jiufu Guo wrote:
> Hi,
>
> This patch tries to optimize "(X - N * M) / N + M" to "X / N".
But if that's valid why not make the transform simpler and transform
(X - N * M) / N to X / N - M instead?
You use the same optimize_x_minus_NM_div_N_plus_M validator for
the division
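
As a side illustration of why such a transform needs a validity check at all, the following C sketch evaluates the left-hand side "(X - N * M) / N + M" directly. The function names are hypothetical and the chosen operand values are just examples; this is not the validator from the patch.

```c
#include <stdint.h>

/* Left-hand side on unsigned operands; x - n * m wraps if n * m > x.  */
uint32_t
lhs_unsigned (uint32_t x, uint32_t n, uint32_t m)
{
  return (x - n * m) / n + m;
}

/* Left-hand side on signed operands; C division truncates toward zero.
   Callers must pick values that do not overflow.  */
int32_t
lhs_signed (int32_t x, int32_t n, int32_t m)
{
  return (x - n * m) / n + m;
}
```

With x = 17, n = 4, m = 3 the unsigned form gives 4, the same as 17 / 4. With x = 1, n = 2, m = 1 the signed form gives (-1) / 2 + 1 = 1 while 1 / 2 is 0, and the unsigned form wraps, so the rewrite to "X / N" is only valid when the operand ranges rule such cases out.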
On Tue, 30 May 2023, juzhe.zh...@rivai.ai wrote:
> Ok.
>
> It seems that for this condition:
>
> + /* If we're vectorizing a loop that uses length "controls" and
> + can iterate more than once, we apply decrementing IV approach
> + in loop control. */
> + if (LOOP_VINFO_CAN_USE_PARTI
Resubmitting the correct one due to a mistake in merging order of fixes.
---
More optimized than the default RTL generation.
gcc/ChangeLog:
* config/xtensa/xtensa.md (adddi3, subdi3):
New RTL generation patterns implemented according to the instruc-
tion idioms described i
Resubmitting the correct one due to a mistake in merging order of fixes.
---
This patch introduces more optimized implementations for the 6 cstoresi4
insn comparison methods (eq/ne/lt/le/gt/ge; eq, however, requires
TARGET_NSA).
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_expand_s
on 2023/5/30 17:26, juzhe.zh...@rivai.ai wrote:
> Ok.
>
> It seems that for this condition:
>
> + /* If we're vectorizing a loop that uses length "controls" and
> + can iterate more than once, we apply decrementing IV approach
> + in loop control. */
> + if (LOOP_VINFO_CAN_USE_PARTIAL
>> No, since powerpc is fine with decrementing VL it should also use it.
>>Instead you should make sure to produce SCEV analyzable IVs when
>>possible (when SELECT_VL is not or cannot be used).
Ok. Would you mind giving me some guidance on how to rewrite the decrement IV?
Since I am not familiar with
On Tue, May 23, 2023 at 11:28 AM Sebastian Huber
wrote:
>
> On 10.01.23 16:38, Sebastian Huber wrote:
> > On 19/12/2022 17:02, Sebastian Huber wrote:
> >> Build libatomic for all targets. Use gthr.h to provide a default
> >> implementation. If the thread model is "single", then this
> >> impleme
On Tue, 30 May 2023, Kewen.Lin wrote:
> on 2023/5/30 17:26, juzhe.zh...@rivai.ai wrote:
> > Ok.
> >
> > It seems that for this condition:
> >
> > + /* If we're vectorizing a loop that uses length "controls" and
> > + can iterate more than once, we apply decrementing IV approach
> > + i
>> No, I said the current scheme does sth along
>> do {
>>remain -= MIN (vf, remain);
>> } while (remain != 0);
>> and I suggest to instead do
>> do {
>>old_remain = remain;
>>len = MIN (vf, remain);
>>remain -= vf;
>> } while (old_remain >= vf);
>> basically since only the last
My understanding was that we went into this knowing that the IVs
would defeat SCEV analysis. Apparently that wasn't a problem for RVV,
but it's not surprising that it is a problem in general.
This isn't just about SELECT_VL though. We use the same type of IV
for cases that aren't going to use SE
On 30.05.23 11:53, Richard Biener wrote:
On Tue, May 23, 2023 at 11:28 AM Sebastian Huber
wrote:
On 10.01.23 16:38, Sebastian Huber wrote:
On 19/12/2022 17:02, Sebastian Huber wrote:
Build libatomic for all targets. Use gthr.h to provide a default
implementation. If the thread model is "si
On Tue, 30 May 2023, Richard Sandiford wrote:
> My understanding was that we went into this knowing that the IVs
> would defeat SCEV analysis. Apparently that wasn't a problem for RVV,
> but it's not surprising that it is a problem in general.
>
> This isn't just about SELECT_VL though. We use
Andreas Schwab via Gcc-patches wrote on Tue, May 30, 2023 at
17:37:
> Ok for 12 and 13 branch?
>
Yes, thanks!
> --
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."
>
- Fix gen_autofdo_event: The download URL for the Intel Perfmon Event
list has changed, as well as the JSON format.
Also it now uses pattern matching to match CPUs. Update the script to support
all of this.
- Regenerate gcc-auto-profile with the latest published Intel model
numbers, so it wo
I stumbled over that error message the other day and found it a bit
confusing:
error: expected ‘#pragma omp’ clause before ‘uses_allocators’
The new wording is not the best, but I think at least better:
error: expected an OpenMP clause before ‘uses_allocators’
('uses_allocators' is a valid
On Tue, May 30, 2023 at 12:17 PM Sebastian Huber
wrote:
>
> On 30.05.23 11:53, Richard Biener wrote:
> > On Tue, May 23, 2023 at 11:28 AM Sebastian Huber
> > wrote:
> >> On 10.01.23 16:38, Sebastian Huber wrote:
> >>> On 19/12/2022 17:02, Sebastian Huber wrote:
> Build libatomic for all tar
On Mon, May 29, 2023 at 5:21 AM Hongtao Liu via Gcc-patches
wrote:
>
> ping.
>
> On Mon, May 8, 2023 at 9:59 AM liuhongt wrote:
> >
> > > > @@ -4799,7 +4800,8 @@ vect_create_vectorized_demotion_stmts (vec_info
> > > > *vinfo, vec *vec_oprnds,
> > > >stmt_v
On Tue, May 30, 2023 at 9:35 AM Ajit Agarwal wrote:
>
> Hello Richard:
>
> On 30/05/23 12:34 pm, Richard Biener wrote:
> > On Tue, May 30, 2023 at 7:06 AM Ajit Agarwal wrote:
> >>
> >> Hello Richard:
> >>
> >> On 22/05/23 6:26 pm, Richard Biener wrote:
> >>> On Thu, May 18, 2023 at 9:14 AM Ajit A
From: Ju-Zhe Zhong
Following Richi's suggestion, I changed the current decrement IV flow from:
do {
remain -= MIN (vf, remain);
} while (remain != 0);
into:
do {
old_remain = remain;
len = MIN (vf, remain);
remain -= vf;
} while (old_remain >= vf);
to enhance SCEV.
ALL tests (decrement I
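
The two loop shapes quoted above can be compared with a small scalar model. This is a sketch under stated assumptions: the function names are hypothetical, `total` merely stands in for "processing len elements", and a signed `long` is used so that `remain` may go negative in the second form. The IV types actually used by the vectorizer are not shown in the message.

```c
#include <assert.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Original flow: the step subtracted from remain is MIN (vf, remain).  */
long
total_len_old_flow (long n, long vf)
{
  assert (n >= 0 && vf > 0);
  long remain = n, total = 0;
  do
    {
      long len = MIN (vf, remain);
      total += len;             /* stands in for processing len elements */
      remain -= len;
    }
  while (remain != 0);
  return total;
}

/* Suggested flow: remain decreases by the constant vf each iteration,
   and the exit test compares the previous value against vf.  */
long
total_len_new_flow (long n, long vf)
{
  assert (n >= 0 && vf > 0);
  long remain = n, total = 0, old_remain;
  do
    {
      old_remain = remain;
      long len = MIN (vf, remain);
      total += len;
      remain -= vf;             /* constant step; may go negative here */
    }
  while (old_remain >= vf);
  return total;
}
```

Both forms process the same total number of elements in this model. Note that, exactly as quoted, the second form runs one extra iteration with len equal to 0 when n is an exact multiple of vf.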
Richard Biener writes:
>> But how easy would it be to extend SCEV analysis, via a pattern match?
>> The evolution of the IV phi wrt the inner loop is still a normal SCEV.
>
> No, the IV isn't a normal SCEV, the final value is different.
Which part of the IV though? Won't all executions of the la
Hi, Richi.
I have sent a patch that follows your suggestion and changes the decrement IV
flow:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620086.html
It works well in RVV.
Could you take a look at it?
If it's OK, I will send the SELECT_VL patch based on this.
Thanks.
juzhe.zh...@rivai.a
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong
>
> Following Richi's suggestion, I changed the current decrement IV flow from:
>
> do {
>remain -= MIN (vf, remain);
> } while (remain != 0);
>
> into:
>
> do {
>old_remain = remain;
>len = MIN (vf, remain);
>remain -= vf;
> } while (old_r
Before this patch:
foo:
ble a2,zero,.L5
csrr a3,vlenb
srli a4,a3,2
.L3:
minu a5,a2,a4
vsetvli zero,a5,e32,m1,ta,ma
vle32.v v2,0(a1)
vle32.v v1,0(a0)
vsetvli t1,zero,e32,m1,ta,ma
vadd.vv v1,v1,v2
vsetvli zero,a5,e32,m1,ta,ma
vse32.v v1,0(a0)
add a1,a1,a3
add a0,a0,a3
sub a2,a2,a5
bne a2,zero
On Tue, 30 May 2023, Richard Sandiford wrote:
> Richard Biener writes:
> >> But how easy would it be to extend SCEV analysis, via a pattern match?
> >> The evolution of the IV phi wrt the inner loop is still a normal SCEV.
> >
> > No, the IV isn't a normal SCEV, the final value is different.
>
>
>> How does it affect RVV code quality? I thought you specifically chose
>> the previous approach because code quality was better that way.
Yes, the previous way is better for RVV. But as I said, we will definitely use
SELECT_VL, and
with SELECT_VL we will be using remain - step (where step is produced by SELECT_VL).
"juzhe.zh...@rivai.ai" writes:
> Before this patch:
> foo:
> ble a2,zero,.L5
> csrr a3,vlenb
> srli a4,a3,2
> .L3:
> minu a5,a2,a4
> vsetvli zero,a5,e32,m1,ta,ma
> vle32.v v2,0(a1)
> vle32.v v1,0(a0)
> vsetvli t1,zero,e32,m1,ta,ma
> vadd.vv v1,v1,v2
> vsetvli zero,a5,e32,m1,ta,ma
> vse32.v v1,0(a0
On Wed, 24 May 2023, YunQiang Su wrote:
> > or even:
> >
> > if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
> > ...
> > else if (INTVAL (length) < 64 && optimize)
> > ...
> >
>
> I don't think this is a good option, since somebody may add some code,
> and may bre
"juzhe.zhong" writes:
> Maybe we can include the rgroup number in the select vl pattern? That way, I can
> always use the select vl pattern. In my backend, if it is a single rgroup, we
> generate vsetvl; otherwise we generate min.
That just seems to be a way of hiding an “is the target RVV?” test though.
IMO targets shouldn
On Tue, 30 May 2023, juzhe.zhong wrote:
> This patch will generate as many "mov" instructions inside the loop as there
> are rgroups. This is unacceptable. For example, if the number of rgroups is 3,
> there will be 3 more instructions in the loop. If this patch is necessary, I
> think I should find a way to fix it.
That's
>> That's odd, you only need to adjust the IV which is used in the exit test,
>> not all the others.
Sorry for my incorrect information. I checked the codegen of both single-rgroup
and multi-rgroup loops.
Their codegen behaves the same: after this patch, there will be 1 more neg
instruction in prehea
Ping #1 for:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618976.html
https://gcc.gnu.org/pipermail/gcc-patches/attachments/20230519/9536bf8c/attachment-0001.bin
Johann
On 19.05.23 at 10:49, Georg-Johann Lay wrote:
Here is a revised version of the patch. The difference to the
previou
data-intrinsics-assembly.c forces -march=armv6 using dg-add-options
arm_arch_v6, which implicitly adds -mfloat-abi=softfp.
However, for a toolchain configured for arm-linux-gnueabihf and
--with-arch=armv7-a, the testcase will fail when including arm_acle.h
(which includes stdint.h, which will fail
> -----Original Message-----
> From: Christophe Lyon
> Sent: Tuesday, May 30, 2023 3:00 PM
> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
> Chris Sidebottom
> Cc: Christophe Lyon
> Subject: [PATCH] [arm][testsuite]: Fix ACLE data-intrinsics testcases
>
> data-intrinsics-assembly.c forces -ma
Hi, all. After several investigations,
here are my experiments:
void
single_rgroup (int32_t *__restrict a, int32_t *__restrict b, int n)
{
for (int i = 0; i < n; i++)
a[i] = b[i] + a[i];
}
void
mutiple_rgroup (float *__restrict f, double *__restrict d, int n)
{
for (int i = 0; i < n; ++i)
On 5/23/23 06:27, Richard Sandiford wrote:
Jeff Law via Gcc-patches writes:
On 5/17/23 03:03, Jin Ma wrote:
For example:
(define_insn "mov_lowpart_sidi2"
[(set (match_operand:SI0 "register_operand" "=r")
(subreg:SI (match_operand:DI 1 "register_operand" " r") 0))]