The native RTL expression for vec_mrghw should be the same for BE and LE, as
it is register- and endian-independent. So both BE and LE need to
generate exactly the same RTL with index [0 4 1 5] when expanding vec_mrghw
with vec_select and vec_concat.
(set (reg:V4SI 141) (vec_select:V4SI (vec_concat:V8SI
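As a rough illustration of the [0 4 1 5] index vector mentioned above, the following C sketch models a vec_select over a vec_concat on plain arrays. The helper names select_from_concat and merge_high_v4si are hypothetical and not part of GCC; elements are numbered in RTL order, which is assumed here to be independent of target endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of (vec_select (vec_concat a b) (parallel [sel...])):
   conceptually concatenate two n-element vectors, then pick
   n elements from the 2n-element result by index.  */
static void
select_from_concat (const int32_t *a, const int32_t *b, size_t n,
                    const size_t *sel, int32_t *out)
{
  for (size_t i = 0; i < n; i++)
    {
      size_t idx = sel[i];
      assert (idx < 2 * n);
      out[i] = idx < n ? a[idx] : b[idx - n];
    }
}

/* Hypothetical helper: merge-high of two 4 x int32 vectors using
   the index vector [0 4 1 5] quoted in the message.  */
static void
merge_high_v4si (const int32_t a[4], const int32_t b[4], int32_t out[4])
{
  static const size_t sel[4] = { 0, 4, 1, 5 };
  select_from_concat (a, b, 4, sel, out);
}
```

With a = {10, 11, 12, 13} and b = {20, 21, 22, 23}, the model interleaves the first two elements of each input.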
On 2022/8/9 11:01, Kewen.Lin wrote:
Hi Xionghu,
Thanks for the fix.
on 2022/8/8 11:42, Xionghu Luo wrote:
The native RTL expression for vec_mrghw should be the same for BE and LE, as
it is register- and endian-independent. So both BE and LE need to
generate exactly the same RTL with index [0 4 1 5]
On 2022/8/11 01:07, Segher Boessenkool wrote:
On Wed, Aug 10, 2022 at 02:39:02PM +0800, Xionghu Luo wrote:
On 2022/8/9 11:01, Kewen.Lin wrote:
I have some concern on those changed "altivec_*_direct", IMHO the suffix
"_direct" is normally to indicate the define_insn is mapped to the
correspon
Hi Segher, Ping this for stage 4...
On 2023/2/10 10:59, Xionghu Luo via Gcc-patches wrote:
Resend this patch...
v4: Update per comments.
v3: rename altivec_vmrghb_direct_le to altivec_vmrglb_direct_le to match
the actual output ASM vmrglb. Likewise for all similar xxx_direct_le
patterns.
v2
When splitting an edge with a self loop, the split edge should be placed just next to
the edge_in->src, otherwise latch bbs may be generated at different positions for
two consecutive self loops. For details, please refer to:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93680#c4
Regression tested pass on x8
For a case like test.c below:
1:int foo(char c)
2:{
3: return ((c >= 'A' && c <= 'Z')
4: || (c >= 'a' && c <= 'z')
5: || (c >= '0' && c <='0'));}
the generated line number is incorrect for condition c>='A' of block 2:
Thus correct the condition op0 location.
gcno diff before and with
On 2023/3/2 16:16, Richard Biener wrote:
On Thu, Mar 2, 2023 at 3:31 AM Xionghu Luo via Gcc-patches
wrote:
For a case like test.c below:
1:int foo(char c)
2:{
3: return ((c >= 'A' && c <= 'Z')
4: || (c >= 'a' && c <=
On 2023/3/2 16:41, Richard Biener wrote:
On Thu, Mar 2, 2023 at 3:31 AM Xionghu Luo via Gcc-patches
wrote:
When splitting an edge with a self loop, the split edge should be placed just next to
the edge_in->src, otherwise latch bbs may be generated at different positions for
two consecutive self lo
On 2023/3/2 18:45, Richard Biener wrote:
small.gcno: 648: block 2:`small.c':1, 3, 4, 6
small.gcno: 688:0145: 36:LINES
small.gcno: 700: block 3:`small.c':8, 9
small.gcno: 732:0145: 32:LINES
small.gcno: 744:
On 2023/3/6 16:11, Richard Biener wrote:
On Mon, Mar 6, 2023 at 8:22 AM Xionghu Luo wrote:
On 2023/3/2 18:45, Richard Biener wrote:
small.gcno: 648: block 2:`small.c':1, 3, 4, 6
small.gcno: 688:0145: 36:LINES
small.gcno: 700: blo
On 2023/3/7 16:53, Richard Biener wrote:
On Tue, 7 Mar 2023, Xionghu Luo wrote:
Unfortunately this change (flag_test_coverage -> !optimize ) caused hundreds
of gfortran cases to fail at execution with O0. Take gfortran.dg/index.f90 for
example:
.gimple:
__attribute__((fn spec (". ")))
void p
On 2023/3/7 19:25, Richard Biener wrote:
It would be nice to avoid creating blocks / preserving labels we'll
immediately remove again. For that we do need some analysis
before creating basic-blocks that determines whether a label is
possibly reached by a non-fallthru edge.
:
p = 0;
switch
On 2023/3/9 20:02, Richard Biener wrote:
On Wed, 8 Mar 2023, Xionghu Luo wrote:
On 2023/3/7 19:25, Richard Biener wrote:
It would be nice to avoid creating blocks / preserving labels we'll
immediately remove again. For that we do need some analysis
before creating basic-blocks that determ
On 2023/3/9 20:02, Richard Biener wrote:
On Wed, 8 Mar 2023, Xionghu Luo wrote:
On 2023/3/7 19:25, Richard Biener wrote:
It would be nice to avoid creating blocks / preserving labels we'll
immediately remove again. For that we do need some analysis
before creating basic-blocks that determ
From: "luo...@cn.ibm.com"
UNSPEC_SI_FROM_SF is not supported when TARGET_DIRECT_MOVE_64BIT
is false for -m32, don't generate VIEW_CONVERT_EXPR(ARRAY_REF) for
variable vector insert. Remove rs6000_expand_vector_set_var helper
function, adjust the p8 and p9 definitions position and make them
stati
Hi,
On 2021/1/27 03:00, David Edelsohn wrote:
> On Tue, Jan 26, 2021 at 2:46 AM Xionghu Luo wrote:
>>
>> From: "luo...@cn.ibm.com"
>>
>> UNSPEC_SI_FROM_SF is not supported when TARGET_DIRECT_MOVE_64BIT
>> is false for -m32, don't generate VIEW_CONVERT_EXPR(ARRAY_REF) for
>> variable vector inser
Move common functions to header file for cleanup.
gcc/testsuite/ChangeLog:
2021-01-27 Xionghu Luo
* gcc.target/powerpc/pr79251.p8.c: Move definition to ...
* gcc.target/powerpc/pr79251.h: ...this.
* gcc.target/powerpc/pr79251.p9.c: Likewise.
* gcc.target/powerp
BE ilp32 Linux generates extra stack stwu instructions, which shouldn't
be counted. \m … \M is needed around each instruction, not just at the
beginning and end of the entire pattern. Pre-approved, committing.
gcc/testsuite/ChangeLog:
2021-02-01 Xionghu Luo
* gcc.target/powerpc/pr79251
v[k] will also be expanded to IFN VEC_SET if k is long type when built
with -Og. -O0 didn't expose the issue because v is TREE_ADDRESSABLE;
-O1 and above also didn't catch it because v[k] is not optimized to
VIEW_CONVERT_EXPR(v)[k_1].
vec_insert defines the element argument type to be signed
Gentle ping, thanks.
On 2021/2/3 17:01, Xionghu Luo wrote:
v[k] will also be expanded to IFN VEC_SET if k is long type when built
with -Og. -O0 didn't expose the issue because v is TREE_ADDRESSABLE;
-O1 and above also didn't catch it because v[k] is not optimized to
VIEW_CONVERT_EXPR(v)[k
vec_insert defines the element argument type to be signed int per the ELFv2
ABI. When expanding a vector with a variable rtx, convert the rtx type
to SImode.
gcc/ChangeLog:
2021-02-24 Xionghu Luo
PR target/98914
* config/rs6000/rs6000.c (rs6000_expand_vector_set): Convert
elt_
On 2021/2/25 00:57, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Feb 24, 2021 at 09:06:24AM +0800, Xionghu Luo wrote:
>> vec_insert defines the element argument type to be signed int per the ELFv2
>> ABI. When expanding a vector with a variable rtx, convert the rtx type
>> to SImode.
>
> But that is tr
On 2021/2/25 14:33, Xionghu Luo via Gcc-patches wrote:
>
>
> On 2021/2/25 00:57, Segher Boessenkool wrote:
>> Hi!
>>
>> On Wed, Feb 24, 2021 at 09:06:24AM +0800, Xionghu Luo wrote:
>>> vec_insert defines the element argument type to be signed int by ELFv2
The loop split condition is moved between loop1 and loop2, so the split bb's
count and probability should also be duplicated instead of (100% vs INV).
Secondly, the loop1 and loop2 counts need to be proportional to the
original loop's count.
Regression tested pass, OK for master?
diff base/loop-cond-split
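To make the "proportional" point in the message above concrete, here is a minimal C sketch of dividing an original loop count between the two split loops. It is only an assumption about the intent: the probability is modelled as an integer percentage, and scale_split_counts and struct split_counts are hypothetical names, not GCC's profile_count interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: execution counts of the two loops
   produced by splitting.  */
struct split_counts
{
  uint64_t loop1;
  uint64_t loop2;
};

/* Divide ORIG_COUNT between loop1 and loop2 according to the
   probability (in percent) that the split condition selects loop1.
   Assumes ORIG_COUNT * 100 does not overflow 64 bits.  */
static struct split_counts
scale_split_counts (uint64_t orig_count, unsigned prob_percent)
{
  struct split_counts r;
  assert (prob_percent <= 100);
  r.loop1 = orig_count * prob_percent / 100;
  /* Give the remainder to loop2 so the two counts sum to the original.  */
  r.loop2 = orig_count - r.loop1;
  return r;
}
```

Computing loop2 by subtraction keeps the sum of the two counts equal to the original count even when the division rounds down.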
I'd like to split this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576488.html
into two patches:
0001-Fix-loop-split-incorrect-count-and-probability.patch
0002-Don-t-move-cold-code-out-of-loop-by-checking-bb-coun.patch
since they are solving two different things, please help to r
Thanks,
On 2021/8/6 19:46, Richard Biener wrote:
> On Tue, 3 Aug 2021, Xionghu Luo wrote:
>
>> loop split condition is moved between loop1 and loop2, the split bb's
>> count and probability should also be duplicated instead of (100% vs INV),
>> secondly, the original loop1 and loop2 count need be
Hi,
On 2021/8/6 20:15, Richard Biener wrote:
> On Mon, Aug 2, 2021 at 7:05 AM Xiong Hu Luo wrote:
>>
>> There was a patch trying to avoid moving cold blocks out of loops:
>>
>> https://gcc.gnu.org/pipermail/gcc/2014-November/215551.html
>>
>> Richard suggested to "never hoist anything from a bb with
On 2021/8/10 22:47, Richard Biener wrote:
> On Mon, 9 Aug 2021, Xionghu Luo wrote:
>
>> Thanks,
>>
>> On 2021/8/6 19:46, Richard Biener wrote:
>>> On Tue, 3 Aug 2021, Xionghu Luo wrote:
>>>
loop split condition is moved between loop1 and loop2, the split bb's
count and probability sho
On 2021/8/11 17:16, Richard Biener wrote:
On Wed, 11 Aug 2021, Xionghu Luo wrote:
On 2021/8/10 22:47, Richard Biener wrote:
On Mon, 9 Aug 2021, Xionghu Luo wrote:
Thanks,
On 2021/8/6 19:46, Richard Biener wrote:
On Tue, 3 Aug 2021, Xionghu Luo wrote:
loop split condition is moved be
Hi,
On 2021/8/16 19:46, Richard Biener wrote:
On Mon, 16 Aug 2021, Xiong Hu Luo wrote:
It seems to me that ALWAYS_EXECUTED_IN is not computed correctly for
nested loops. inn_loop is updated to the inner loop, so it needs to be restored
when exiting from the innermost loop. With this patch, the store inst
On 2021/8/17 13:17, Xionghu Luo via Gcc-patches wrote:
Hi,
On 2021/8/16 19:46, Richard Biener wrote:
On Mon, 16 Aug 2021, Xiong Hu Luo wrote:
It seems to me that ALWAYS_EXECUTED_IN is not computed correctly for
nested loops. inn_loop is updated to the inner loop, so it needs to be restored
when
On 2021/8/17 15:12, Richard Biener wrote:
> On Tue, 17 Aug 2021, Xionghu Luo wrote:
>
>> Hi,
>>
>> On 2021/8/16 19:46, Richard Biener wrote:
>>> On Mon, 16 Aug 2021, Xiong Hu Luo wrote:
>>>
It seems to me that ALWAYS_EXECUTED_IN is not computed correctly for
nested loops. inn_loop is
On 2021/8/17 17:10, Xionghu Luo via Gcc-patches wrote:
>
>
> On 2021/8/17 15:12, Richard Biener wrote:
>> On Tue, 17 Aug 2021, Xionghu Luo wrote:
>>
>>> Hi,
>>>
>>> On 2021/8/16 19:46, Richard Biener wrote:
>>>> On Mon, 16
On 2021/8/10 12:25, Ulrich Drepper wrote:
> On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo via Gcc-patches
> wrote:
>> For this case, theoretically I think the master GCC will optimize it to:
>>
>>invariant;
>>for (;;)
>>
On 2021/8/19 20:11, Richard Biener wrote:
>> - class loop *inn_loop = loop;
>>
>> if (ALWAYS_EXECUTED_IN (loop->header) == NULL)
>> {
>> @@ -3232,19 +3231,6 @@ fill_always_executed_in_1 (class loop *loop, sbitmap
>> contains_call)
>> to disprove this if possible). */
>>
On 2021/8/24 16:20, Richard Biener wrote:
> On Tue, 24 Aug 2021, Xionghu Luo wrote:
>
>>
>>
>> On 2021/8/19 20:11, Richard Biener wrote:
- class loop *inn_loop = loop;
if (ALWAYS_EXECUTED_IN (loop->header) == NULL)
{
@@ -3232,19 +3231,6 @@ fill_always_e
On 2021/8/27 15:45, Richard Biener wrote:
On Thu, 26 Aug 2021, Xionghu Luo wrote:
On 2021/8/24 16:20, Richard Biener wrote:
On Tue, 24 Aug 2021, Xionghu Luo wrote:
On 2021/8/19 20:11, Richard Biener wrote:
- class loop *inn_loop = loop;
if (ALWAYS_EXECUTED_IN (loop->header
On 2021/8/30 17:19, Richard Biener wrote:
bitmap_set_bit (work_set, loop->header->index);
+ unsigned bb_index;
- for (i = 0; i < loop->num_nodes; i++)
- {
- edge_iterator ei;
- bb = bbs[i];
+ unsigned array_size = last_basic_block_for_fn (cfun) + 1;
Thanks for the review,
On 2020/9/21 16:31, Richard Biener wrote:
+
+static gimple *
+gimple_expand_vec_set_expr (gimple_stmt_iterator *gsi)
+{
+ enum tree_code code;
+ gcall *new_stmt = NULL;
+ gassign *ass_stmt = NULL;
+
+ /* Only consider code == GIMPLE_ASSIGN. */
+ gassign *stmt = dyn_
Hi,
On 2020/9/23 19:33, Richard Biener wrote:
>> The first loop processes rhs stmts; this loop processes lhs stmts.
>> I previously thought vec_extract also needed to generate an IFN, but that
>> seems unnecessary now? Also, the first loop needs to update the lhs stmt while
>> the second doesn't.
Hi Segher,
The attached two patches are updated and split from
"[PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple
[PR79251]"
per your comments.
[PATCH v3 2/3] rs6000: Fix lvsl&lvsr mode and change rs6000_expand_vector_set
param
This one is preparation work of fix lvsl&lvs
Hi,
On 2020/9/24 21:27, Richard Biener wrote:
> On Thu, Sep 24, 2020 at 10:21 AM xionghu luo wrote:
>
> I'll just comment that
>
> xxperm 34,34,33
> xxinsertw 34,0,12
> xxperm 34,34,32
>
> doesn't look like a variable-position insert instruction but
> this is a varia
Hi,
On 2020/9/24 20:39, Richard Sandiford wrote:
> xionghu luo writes:
>> @@ -2658,6 +2659,43 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall
>> *stmt, convert_optab optab)
>>
>> #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn
>>
>> +/* Expand VEC_SET internal
On 2020/9/25 21:28, Richard Sandiford wrote:
> xionghu luo writes:
>> @@ -2658,6 +2659,45 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall
>> *stmt, convert_optab optab)
>>
>> #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn
>>
>> +/* Expand VEC_SET internal fu
rs6000_expand_vector_set can insert at either a constant position
or a variable position, so change the operand to reg_or_cint_operand.
gcc/ChangeLog:
2020-10-10 Xionghu Luo
* config/rs6000/rs6000-call.c (altivec_expand_vec_set_builtin):
Change call param 2 from type int
gcc/testsuite/ChangeLog:
2020-10-10 Xionghu Luo
* gcc.target/powerpc/fold-vec-insert-char-p8.c: Adjust
instruction counts.
* gcc.target/powerpc/fold-vec-insert-char-p9.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-double.c: Likewise.
* gcc.target/pow
gcc/ChangeLog:
2020-10-10 Xionghu Luo
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Generate ARRAY_REF(VIEW_CONVERT_EXPR) for P8 and later
platforms.
* config/rs6000/rs6000.c (rs6000_expand_vector_set_var): Update
to call different pat
vec_insert accepts 3 arguments: arg0 is the input vector, arg1 is the value
to be inserted, and arg2 is the position in arg0 at which to insert arg1. The current
expander generates stxv+stwx+lxv if arg2 is variable instead of constant, which
causes a serious store-hit-load performance issue on Power. This patch tries
1) Bu
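For readers unfamiliar with the operation, a scalar C model of a variable-position insert might look like the sketch below. It follows the argument order described in the message (vector, value, position). The name vec_insert_model is hypothetical, and reducing the index modulo the element count is an assumption about the intended semantics rather than something stated in the snippet.

```c
#include <stdint.h>

#define V4SI_NELTS 4

/* Scalar model of inserting VAL into a 4 x int32 vector at a
   position that is only known at run time.  Assumption: the
   position is reduced modulo the number of elements, so
   out-of-range and negative values wrap around.  */
static void
vec_insert_model (int32_t vec[V4SI_NELTS], int32_t val, int pos)
{
  vec[(unsigned) pos % V4SI_NELTS] = val;
}
```

In this model the write goes through memory, which is roughly the store/store/load shape the message describes as slow; the patch series aims to do the same update in registers.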
Originated from
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html
with patch split and some refinement per review comments.
Patch of IFN VEC_SET for ARRAY_REF(VIEW_CONVERT_EXPR) is committed,
this patch set enables expanding IFN VEC_SET for Power9 and Power8
with specific instruc
r12-4526 cancelled jump thread paths that rotate loops. It exposes an issue in
profile-estimate: in predict_extra_loop_exits, the outer loop's exit edge
is marked as the inner loop's extra loop exit and given an incorrect
prediction, so a hot inner loop eventually becomes a cold loop through
optimizations, this
On 2021/11/23 13:51, Xionghu Luo wrote:
> r12-4526 cancelled jump thread paths that rotate loops. It exposes an issue in
> profile-estimate when predict_extra_loop_exits, outer loop's exit edge
> is marked as inner loop's extra loop exit and set with incorrect
> prediction, then a hot inner loop will b
On 2021/11/23 17:50, Jan Hubicka wrote:
>> On Tue, Nov 23, 2021 at 6:52 AM Xionghu Luo wrote:
>>>
>>> r12-4526 cancelled jump thread paths that rotate loops. It exposes an issue in
>>> profile-estimate when predict_extra_loop_exits, outer loop's exit edge
>>> is marked as inner loop's extra loop exit and
Gentle ping, thanks.
[PATCH v3] Fix loop split incorrect count and probability
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583626.html
On 2021/11/8 14:09, Xionghu Luo via Gcc-patches wrote:
>
>
> On 2021/10/27 15:44, Jan Hubicka wrote:
>>> On Wed, 27 Oct 2021,
Gentle ping and is this patch still suitable for stage 3? Thanks.
[PATCH v7 2/2] Don't move cold code out of loop by checking bb count
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583911.html
On 2021/11/10 11:08, Xionghu Luo via Gcc-patches wrote:
>
>
> On 2
On 2021/12/1 18:09, Richard Biener wrote:
> On Wed, Nov 10, 2021 at 4:08 AM Xionghu Luo wrote:
>>
>>
>>
>> On 2021/11/4 21:00, Richard Biener wrote:
>>> On Wed, Nov 3, 2021 at 2:29 PM Xionghu Luo wrote:
> + while (outmost_loop != loop)
> +{
> + if (bb_colder_tha
Hi Honza,
Gentle ping for this :), thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585289.html
On 2021/11/24 13:03, Xionghu Luo via Gcc-patches wrote:
> On 2021/11/23 17:50, Jan Hubicka wrote:
>>> On Tue, Nov 23, 2021 at 6:52 AM Xionghu Luo wrote:
>>>>
On 2021/12/6 13:09, Xionghu Luo via Gcc-patches wrote:
>
>
> On 2021/12/1 18:09, Richard Biener wrote:
>> On Wed, Nov 10, 2021 at 4:08 AM Xionghu Luo wrote:
>>>
>>>
>>>
>>> On 2021/11/4 21:00, Richard Biener wrote:
>
This patchset is a re-collection of previously sent patches. Thanks to
Richard, the "Don't move cold code out of loop by checking bb count" patch
is approved[1], but there are still 3 prerequisite patches to supplement it
or avoid regressions.
1) Patch [1/3] is the RTL part of not hoisting LIM code out of col
gcc/ChangeLog:
* loop-invariant.c (find_invariants_bb): Check profile count
before motion.
(find_invariants_body): Add argument.
---
gcc/loop-invariant.c | 10 +++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/gcc/loop-invariant.c b/gcc/loop-invarian
r12-4526 cancelled jump thread paths that rotate loops. It exposes an issue in
profile-estimate: in predict_extra_loop_exits, the outer loop's exit edge
is marked as the inner loop's extra loop exit and given an incorrect
prediction, so a hot inner loop eventually becomes a cold loop through
optimizations, this
In tree-ssa-loop-split.c, split_loop and split_loop_on_cond do two
kinds of split. split_loop only works for a single loop and inserts an edge at
the exit when splitting, while split_loop_on_cond is not limited to a single loop
and inserts an edge at the latch when splitting. Both split behaviors should consider
loop count a
On 2021/12/7 20:17, Richard Biener wrote:
>>> + class loop *coldest_loop = coldest_outermost_loop[loop->num];
>>> + if (loop_depth (coldest_loop) < loop_depth (outermost_loop))
>>> +{
>>> + class loop *hotter_loop = hotter_than_inner_loop[loop->num];
>>> + if (!hotter_loop
>>> +
Add specialized version to combine two instructions from
9: {r123:CC=cmp(r124:DI&0x6,0);clobber scratch;}
REG_DEAD r124:DI
10: pc={(r123:CC==0)?L15:pc}
REG_DEAD r123:CC
to:
10: {pc={(r123:DI&0x6==0)?L15:pc};clobber scratch;clobber %0:CC;}
then split2 will split i
On 2021/12/9 07:47, Jeff Law wrote:
>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
>> index 3f6ad046623..33128061aab 100644
>> --- a/gcc/tree-ssa-loop-split.c
>> +++ b/gcc/tree-ssa-loop-split.c
>>
>> @@ -607,6 +610,38 @@ split_loop (class loop *loop1)
>> tree guard_n
On 2022/8/16 14:53, Kewen.Lin wrote:
Hi Xionghu,
Thanks for the updated version of patch, some comments are inlined.
on 2022/8/11 14:15, Xionghu Luo wrote:
On 2022/8/11 01:07, Segher Boessenkool wrote:
On Wed, Aug 10, 2022 at 02:39:02PM +0800, Xionghu Luo wrote:
On 2022/8/9 11:01, Kewen
Hi Segher, I'd like to resend and ping for this patch. Thanks.
From 23bffdacdf0eb1140c7a3571e6158797f4818d57 Mon Sep 17 00:00:00 2001
From: Xionghu Luo
Date: Thu, 4 Aug 2022 03:44:58 +
Subject: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the
UNSPECS [PR106069]
v4: Update
Ping.
On 2020/10/10 16:08, Xionghu Luo wrote:
Originated from
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html
with patch split and some refinement per review comments.
Patch of IFN VEC_SET for ARRAY_REF(VIEW_CONVERT_EXPR) is committed,
this patch set enables expanding IFN V
Ping^2, thanks.
On 2020/11/5 09:34, Xionghu Luo via Gcc-patches wrote:
Ping.
On 2020/10/10 16:08, Xionghu Luo wrote:
Originated from
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html
with patch split and some refinement per review comments.
Patch of IFN VEC_SET for
Hi,
On 2020/10/27 05:10, Segher Boessenkool wrote:
> On Wed, Oct 21, 2020 at 03:25:29AM -0500, Xionghu Luo wrote:
>> Don't split code from add3 for SDI to allow a later pass to split.
>
> This is very problematic.
>
>> This allows later logic to hoist out constant load in add instructions.
>
>
On 2021/9/1 17:58, Richard Biener wrote:
This fixes the CFG walk order of fill_always_executed_in to use
RPO order rather than the dominator based order computed by
get_loop_body_in_dom_order. That fixes correctness issues with
unordered dominator children.
The RPO order computed by rev_post_
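As background for the walk order discussed above, the sketch below computes a reverse post-order over a toy CFG with a depth-first search. It is a generic textbook illustration, not the GCC routine the message refers to; struct cfg, dfs_post and reverse_post_order are hypothetical names, and the fixed MAX_BB / MAX_SUCC limits are assumptions made to keep it short.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BB 8
#define MAX_SUCC 2

/* Toy CFG: succ[b][k] is the k-th successor of block b, or -1.  */
struct cfg
{
  int n;
  int succ[MAX_BB][MAX_SUCC];
};

/* Depth-first walk that appends each block to POST after all of its
   unvisited successors have been appended.  */
static void
dfs_post (const struct cfg *g, int b, bool *seen, int *post, int *npost)
{
  seen[b] = true;
  for (int k = 0; k < MAX_SUCC; k++)
    {
      int s = g->succ[b][k];
      if (s >= 0 && !seen[s])
        dfs_post (g, s, seen, post, npost);
    }
  post[(*npost)++] = b;
}

/* Fill ORDER with the blocks reachable from ENTRY in reverse
   post-order and return how many blocks were written.  */
static int
reverse_post_order (const struct cfg *g, int entry, int *order)
{
  bool seen[MAX_BB] = { false };
  int post[MAX_BB];
  int npost = 0;
  assert (g->n <= MAX_BB && entry >= 0 && entry < g->n);
  dfs_post (g, entry, seen, post, &npost);
  for (int i = 0; i < npost; i++)
    order[i] = post[npost - 1 - i];
  return npost;
}
```

In an acyclic region, reverse post-order visits every block after all of its predecessors, which is the property that makes it attractive for a forward walk like the one described.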
On 2021/9/2 16:50, Richard Biener wrote:
> On Thu, 2 Sep 2021, Richard Biener wrote:
>
>> On Thu, 2 Sep 2021, Xionghu Luo wrote:
>>
>>>
>>>
>>> On 2021/9/1 17:58, Richard Biener wrote:
This fixes the CFG walk order of fill_always_executed_in to use
RPO order rather than the dominator b
Resend the patch that addressed Will's comments.
fmod/fmodf and remainder/remainderf can be expanded inline instead of calling
the library when building with fast-math, which is much faster.
fmodf:
fdivs f0,f1,f2
friz f0,f0
fnmsubs f1,f2,f0,f1
remainderf:
fdivs f0,f1,f2
frin f0,f
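The fmodf sequence above (divide, round toward zero, negated multiply-subtract) corresponds roughly to the C sketch below. The names fmodf_fast and trunc_toward_zero are hypothetical. The sketch assumes the quotient fits in a long, rounds with a cast rather than a hardware instruction, and uses an unfused multiply and subtract, so it only approximates what the expanded code would compute.

```c
#include <assert.h>

/* Round toward zero, modelling the friz step.  Assumption: the
   quotient is small enough to fit in a long.  */
static float
trunc_toward_zero (float q)
{
  return (float) (long) q;
}

/* Fast-math style fmodf: x - y * trunc (x / y).  No special handling
   of infinities, NaNs or very large quotients.  */
static float
fmodf_fast (float x, float y)
{
  assert (y != 0.0f);
  float q = trunc_toward_zero (x / y);  /* fdivs, then friz */
  return x - y * q;                     /* fnmsubs: -(y * q - x) */
}
```

The remainderf sequence in the message has the same shape; going by the instruction names, it rounds the quotient to the nearest integer (frin) instead of toward zero.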
Ping^2, thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
On 2021/6/30 09:42, Xionghu Luo via Gcc-patches wrote:
Gentle ping, thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
On 2021/5/14 14:57, Xionghu Luo via Gcc-patches wrote:
Hi,
On 2021/5/13
Ping^2, thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572330.html
On 2021/6/30 09:47, Xionghu Luo via Gcc-patches wrote:
Gentle ping, thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572330.html
On 2021/6/9 16:03, Xionghu Luo via Gcc-patches wrote:
Hi,
On 2021/6
On 2021/9/4 05:44, Segher Boessenkool wrote:
Hi!
On Fri, Sep 03, 2021 at 10:31:24AM +0800, Xionghu Luo wrote:
fmod/fmodf and remainder/remainderf can be expanded inline instead of calling
the library when building with fast-math, which is much faster.
Thank you very much for this patch.
Some trivial comments
On 2021/8/26 19:33, Richard Biener wrote:
On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo wrote:
Hi,
On 2021/8/6 20:15, Richard Biener wrote:
On Mon, Aug 2, 2021 at 7:05 AM Xiong Hu Luo wrote:
There was a patch trying to avoid moving cold blocks out of loops:
https://gcc.gnu.org/pipermail/gcc
On 2021/9/2 18:37, Richard Biener wrote:
On Thu, 2 Sep 2021, Xionghu Luo wrote:
On 2021/9/2 16:50, Richard Biener wrote:
On Thu, 2 Sep 2021, Richard Biener wrote:
On Thu, 2 Sep 2021, Xionghu Luo wrote:
On 2021/9/1 17:58, Richard Biener wrote:
This fixes the CFG walk order of fill_a
On 2021/9/9 18:55, Richard Biener wrote:
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 5d6845478e7..4b187c2cdaf 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -3074,15 +3074,13 @@ fill_always_executed_in_1 (class loop *loop, sbitmap
contains_call)
On 2021/9/10 21:54, Xionghu Luo via Gcc-patches wrote:
On 2021/9/9 18:55, Richard Biener wrote:
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 5d6845478e7..4b187c2cdaf 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -3074,15 +3074,13
On 2021/9/13 16:17, Richard Biener wrote:
On Mon, 13 Sep 2021, Xionghu Luo wrote:
On 2021/9/10 21:54, Xionghu Luo via Gcc-patches wrote:
On 2021/9/9 18:55, Richard Biener wrote:
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 5d6845478e7..4b187c2cdaf 100644
--- a
Ping^3, thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
On 2021/9/6 08:52, Xionghu Luo via Gcc-patches wrote:
Ping^2, thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
On 2021/6/30 09:42, Xionghu Luo via Gcc-patches wrote:
Gentle ping, thanks
Fold xxsel to vsel like xxperm/vperm to avoid duplicate code.
gcc/ChangeLog:
2021-09-17 Xionghu Luo
* config/rs6000/altivec.md: Add vsx register constraints.
* config/rs6000/vsx.md (vsx_xxsel): Delete.
(vsx_xxsel2): Likewise.
(vsx_xxsel3): Likewise.
(vs
These two patches are an updated version of:
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579490.html
Changes:
1. Fix alignment error in md files.
2. Replace rtx_equal_p with match_dup.
3. Use register_operand instead of gpc_reg_operand to align with
vperm/xxperm.
4. Regression teste
The vsel instruction is a bit-wise select instruction. Using an
IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
being generated in the combine pass. Per-element selection is a
subset of bit-wise selection; with the patch the pattern is
written using bit operations. But ther
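A small C sketch of the distinction drawn above: bit-wise select written with AND/OR/NOT, and per-element select expressed as the special case where the mask is all ones or all zeros. The function names are hypothetical and a single 32-bit word stands in for one vector element.

```c
#include <stdint.h>

/* Bit-wise select: each result bit is taken from B where the
   corresponding MASK bit is 1, and from A where it is 0.  */
static uint32_t
bitwise_select (uint32_t a, uint32_t b, uint32_t mask)
{
  return (a & ~mask) | (b & mask);
}

/* Per-element select is the special case in which the mask for the
   whole element is either all ones or all zeros.  */
static uint32_t
element_select (uint32_t a, uint32_t b, int take_b)
{
  uint32_t mask = take_b ? UINT32_MAX : 0;
  return bitwise_select (a, b, mask);
}
```

A mask such as 0x0000FFFF mixes bits of both inputs within one element, which an element-wise if-then-else cannot express.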
ns?
Other than that question / suggestion, this patch is okay. Please
coordinate with Bill and his builtin patches.
OK.
Thanks, David
On Wed, Sep 15, 2021 at 3:50 AM Xionghu Luo wrote:
Ping^3, thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
On 2021/9/6 08:52,
On 2021/8/11 17:16, Richard Biener wrote:
On Wed, 11 Aug 2021, Xionghu Luo wrote:
On 2021/8/10 22:47, Richard Biener wrote:
On Mon, 9 Aug 2021, Xionghu Luo wrote:
Thanks,
On 2021/8/6 19:46, Richard Biener wrote:
On Tue, 3 Aug 2021, Xionghu Luo wrote:
loop split condition is moved be
On 2021/9/22 17:14, Richard Biener wrote:
On Thu, Sep 9, 2021 at 3:56 AM Xionghu Luo wrote:
On 2021/8/26 19:33, Richard Biener wrote:
On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo wrote:
Hi,
On 2021/8/6 20:15, Richard Biener wrote:
On Mon, Aug 2, 2021 at 7:05 AM Xiong Hu Luo wrote:
On 2021/9/23 10:13, Xionghu Luo via Gcc-patches wrote:
On 2021/9/22 17:14, Richard Biener wrote:
On Thu, Sep 9, 2021 at 3:56 AM Xionghu Luo wrote:
On 2021/8/26 19:33, Richard Biener wrote:
On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo
wrote:
Hi,
On 2021/8/6 20:15, Richard Biener
Update the patch to v3, not sure whether you prefer the paste style
and continue to link the previous thread as Segher dislikes this...
[PATCH v3] Don't move cold code out of loop by checking bb count
Changes:
1. Handle max_loop in determine_max_movement instead of
outermost_invariant_loop.
2.
On 2021/10/29 19:52, Richard Biener wrote:
> On Wed, 27 Oct 2021, Xionghu Luo wrote:
>
>> loop_version currently does lv_adjust_loop_entry_edge
>> before it loopifys the copy inserted on the header. This patch moves
>> the condition generation later and thus we have four pieces to help
>> unde
On 2021/10/29 19:48, Richard Biener wrote:
> I'm talking about the can_sm_ref_p call, in that context 'loop' will
> be the outermost loop of
> interest, and we are calling this for all stores in a loop. We're doing
>
> +bool
> +ref_in_loop_hot_body::operator () (mem_ref_loc *loc)
> +{
> + bas
The clobber constraint should match the operand's constraint. fusion.md was
generated by genfusion.pl, but that is disabled now, so update both places with
the correct clobber constraint.
gcc/ChangeLog:
* config/rs6000/fusion.md: Fix incorrect clobber constraint.
* config/rs6000/genfusion.pl:
On 2021/10/29 19:48, Richard Biener wrote:
> I'm talking about the can_sm_ref_p call, in that context 'loop' will
> be the outermost loop of
> interest, and we are calling this for all stores in a loop. We're doing
>
> +bool
> +ref_in_loop_hot_body::operator () (mem_ref_loc *loc)
> +{
> + bas
On 2021/11/3 23:13, David Edelsohn wrote:
> Did you manually change fusion.md or did you regenerate it after
> fixing genfusion.pl?
>
> If you regenerated it, the ChangeLog entry should be "Regenerated" and
> the "Fix incorrect clobber constraint." should refer to the
> genfusion.pl change.
>
On 2021/11/4 09:59, David Edelsohn wrote:
> On Wed, Nov 3, 2021 at 9:46 PM Xionghu Luo wrote:
>>
>> On 2021/11/3 23:13, David Edelsohn wrote:
>>> Did you manually change fusion.md or did you regenerate it after
>>> fixing genfusion.pl?
>>>
>>> If you regenerated it, the ChangeLog entry should b
On 2021/11/5 08:58, David Edelsohn wrote:
> On Thu, Nov 4, 2021 at 8:50 PM Xionghu Luo wrote:
>
>> [PATCH] rs6000: Fix incorrect fusion constraint [PR102991]
>>
>> gcc/ChangeLog:
>>
>> * config/rs6000/fusion.md: Regenerate.
>> * config/rs6000/genfusion.pl: Fix incorrect clobber
On 2021/10/27 15:44, Jan Hubicka wrote:
>> On Wed, 27 Oct 2021, Jan Hubicka wrote:
>>
gcc/ChangeLog:
* tree-ssa-loop-split.c (split_loop): Fix incorrect probability.
(do_split_loop_on_cond): Likewise.
---
gcc/tree-ssa-loop-split.c | 25 --
On 2021/11/4 21:00, Richard Biener wrote:
> On Wed, Nov 3, 2021 at 2:29 PM Xionghu Luo wrote:
>>
>>
>>> + while (outmost_loop != loop)
>>> +{
>>> + if (bb_colder_than_loop_preheader (loop_preheader_edge
>>> (outmost_loop)->src,
>>> +loop_prehead
From: Xiong Hu Luo
vmrghb only accepts permute index {0, 16, 1, 17, 2, 18, 3, 19, 4, 20,
5, 21, 6, 22, 7, 23} in the ISA, regardless of BE or LE; similarly for vmrghlb.
Remove the UNSPEC_VMRGH_DIRECT/UNSPEC_VMRGL_DIRECT patterns and use vec_select
+ vec_concat as normal RTL.
Tested pass on P8LE, P9LE and P8BE{
On P8LE, extra rot64+rot64 load or store instructions are generated
in float128 to vector __int128 conversion.
This patch teaches the swaps pass to also handle such patterns to remove
extra swap instructions.
(insn 7 6 8 2 (set (subreg:V1TI (reg:KF 123) 0)
(rotate:V1TI (mem/u/c:V1TI (reg/f:DI