Michael,
> I've only noticed a couple typos, and one minor remark.
Typos corrected.
> I just wonder why you duplicated these three loops instead of integrating
> the real body into the existing LI_FROM_INNERMOST loop. I would have
> expected your "if (!optimize_loop_for_size_p && split_loop_on_
Thanks for your comment, I will update the case accordingly.
Feng
From: luoxhu
Sent: Wednesday, October 23, 2019 4:02 PM
To: Feng Xue OS; Martin Jambor; Jan Hubicka; gcc-patches@gcc.gnu.org
Subject: Re: Ping: [PATCH V4] Extend IPA-CP to support
Patch attached.
Feng
From: Richard Biener
Sent: Wednesday, October 23, 2019 5:04 PM
To: Feng Xue OS
Cc: Michael Matz; Philipp Tomsich; gcc-patches@gcc.gnu.org; Christoph Müllner;
erick.oc...@theobroma-systems.com
Subject: Re: [PATCH V3] Loop split upon
, October 24, 2019 1:44 PM
To: Feng Xue OS; gcc-patches@gcc.gnu.org; Jan Hubicka; Martin Jambor
Subject: Re: [PATCH] Support multi-versioning on self-recursive function
(ipa/92133)
Hi,
On 2019/10/17 16:23, Feng Xue OS wrote:
> IPA does not allow constant propagation on parameter that is used to cont
Richard,
Thanks for your comments.
>+ /* For PHI node that is not in loop header, its source operands should
>+be defined inside the loop, which are seen as loop variant. */
>+ if (def_bb != loop->header || !skip_head)
>+ return false;
> so if we have
>
> for (;;)
>
Hi, Richard
This is a new patch to support a more generalized semi-invariant condition,
which uses control dependence analysis.
Thanks,
Feng
From: Feng Xue OS
Sent: Friday, October 25, 2019 11:43 AM
To: Richard Biener
Cc: Michael Matz; Philipp Tomsich
Hi, Honza & Martin,
This is a new patch merged with the newest IPA changes. Would you please take
a look at the patch?
Together with the other patch on recursive function versioning, we see more
than a 30% performance boost on exchange2 in SPEC2017. So, it will be good if
the two patches can en
Hi Martin,
Thanks for your review. I updated the patch with your comments.
Feng
> Sorry that it took so long. Next time, please consider making the
> review a bit easier by writing a ChangeLog (yes, I usually read them and
> you'll have to write one anyway).
>> + class ipcp_param_l
> Uh. Note it's not exactly helpful to change algorithms between
> reviews, that makes it
> just harder :/
>
> Btw, I notice you use post-dominance info. Note that we generally do
> not keep that
> up-to-date with CFG manipulations (and for dominators fast queries are
> disabled).
> Probably the
Thanks.
And for this issue, we can add a new tracker as a follow-up task.
Feng
From: Jan Hubicka
Sent: Tuesday, November 12, 2019 8:34 PM
To: Feng Xue OS
Cc: Martin Jambor; gcc-patches@gcc.gnu.org
Subject: Re: Ping: [PATCH V6] Extend IPA-CP to support
Please check the attachment, and this patch is based on the previous extended
agg-jump-function patch.
Thanks,
Feng
From: Jan Hubicka
Sent: Tuesday, November 12, 2019 8:41 PM
To: Feng Xue OS
Subject: Re: Ping: [PATCH V6] Extend IPA-CP to support
Bootstrapped/regtested on x86_64-linux and aarch64-linux.
Feng
---
2020-01-19 Feng Xue
PR ipa/93166
* ipa-cp.c (get_info_about_necessary_edges): Remove value
check assertion.
From 02e4bea314a0ca0a8befb85c64efcfe422d35cb8 Mon Sep 17 00:00:00 2001
From: Feng Xue
Date: Sun
Besides a simple pass-through (aggregate) jump function, an arithmetic
(aggregate) jump function could also produce the same (aggregate) value as the
parameter passed in for a self-feeding recursive call. For example,

f1 (int i)  /* normal jump function */
{
  f1 (i & 1);
}
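As a scalar sanity check (hypothetical helper names, not the actual IPA-CP code), the key property of the `i & 1` jump function above is that it reaches a fixpoint after one application, which is why the self-feeding recursive call can carry a single known value:

```c
#include <assert.h>

/* Hypothetical helper mirroring the arithmetic jump function above:
   the value passed to the recursive call is op(i) = i & 1.  */
static int jf_arith (int i) { return i & 1; }

/* Once recursion is entered, the parameter is stable: applying the
   jump function again does not change the value, so a single constant
   can be propagated into the recursive clone.  */
static int fixpoint_reached (int v)
{
  return jf_arith (jf_arith (v)) == jf_arith (v);
}
```
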
S
Made some changes.
Feng
From: Feng Xue OS
Sent: Saturday, January 25, 2020 5:54 PM
To: mjam...@suse.cz; Jan Hubicka; gcc-patches@gcc.gnu.org
Subject: [PATCH] Generalized value pass-through for self-recursive function
(ipa/pr93203)
Besides simple pass
Current IPA does not propagate an aggregate constant for a by-ref argument
if it is a simple pass-through of a caller parameter. Here is an example:
f1 (int *p)
{
  ... = *p;
  ...
}

f2 (int *p)
{
  *p = 2;
  f1 (p);
}
It is easy to know that in f1(), *p should be 2 after
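A runnable scalar version of the f1/f2 shape above (the `observed` global is an illustration device, not part of the patch) confirms the value that IPA-CP would want to propagate:

```c
#include <assert.h>

static int observed;

/* f1 only reads *p; with the aggregate pass-through handled, IPA-CP
   would know *p == 2 at this point when f1 is reached from f2.  */
static void f1 (int *p) { observed = *p; }

static void f2 (int *p)
{
  *p = 2;
  f1 (p); /* simple pass-through of the caller parameter */
}
```
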
Thanks,
Feng
From: Feng Xue OS
Sent: Saturday, January 25, 2020 9:50 PM
To: mjam...@suse.cz; Jan Hubicka; gcc-patches@gcc.gnu.org
Subject: [PATCH V2] Generalized value pass-through for self-recursive function
(ipa/pr93203)
Made some changes.
Feng
>> - gcc_checking_assert (item->value);
> I've been staring at this for quite a while, trying to figure out how
> your patch can put NULL here before I realized it was just a clean-up
> :-) Sending such changes independently or pointing them out in the
> email/ChangeLog makes review eas
Christina
Sent: Tuesday, February 11, 2020 6:05 PM
To: Feng Xue OS; Martin Jambor; Jan Hubicka; gcc-patches@gcc.gnu.org
Cc: nd
Subject: RE: [PATCH V2] Generalized value pass-through for self-recursive
function (ipa/pr93203)
Hi Feng,
This patch (commit a0f6a8cb414b687f22c9011a894d5e8e398c4be0) is
self_recursive_pass_through_p and intersect_aggregates_with_edge calls.
(cgraph_edge_brings_all_agg_vals_for_node): Add "node" argument to
intersect_aggregates_with_edge call.
>
> From: gcc-patches-ow...@gcc.gnu.org
> o
If the argument of a self-recursive call is a simple pass-through, the call
edge is also considered as a source of any value originating from a
non-recursive call to the function. Scalar pass-through and full aggregate
pass-through due to pointer pass-through have also been handled.
But we missed another k
>> +static bool
>> +self_recursive_agg_pass_through_p (cgraph_edge *cs, ipa_agg_jf_item *jfunc,
>> +int i)
>> +{
>> + if (cs->caller == cs->callee->function_symbol ()
> I don't know if self-recursive calls can be interposed at all, if yes
> you need to add the av
tches@gcc.gnu.org; seg...@kernel.crashing.org;
wschm...@linux.ibm.com; guoji...@linux.ibm.com; li...@gcc.gnu.org; Feng Xue OS
Subject: [PATCH v2] ipa-cp: Fix PGO regression caused by r278808
v2 Changes:
1. Enable proportion orig_sum to the new nodes for self recursive node:
new_sum = (orig_sum + ne
esday, December 31, 2019 3:43 PM
To: Feng Xue OS; Jan Hubicka; Martin Jambor
Cc: Martin Liška; gcc-patches@gcc.gnu.org; seg...@kernel.crashing.org;
wschm...@linux.ibm.com; guoji...@linux.ibm.com; li...@gcc.gnu.org
Subject: Re: [PATCH v2] ipa-cp: Fix PGO regression caused by r278808
On 2019/12/31 14:43,
When checking a self-recursively generated value for an aggregate jump
function, a wrong aggregate lattice was used, which would cause infinite
constant propagation. This patch fixes the issue.
2020-01-03 Feng Xue
PR ipa/93084
* ipa-cp.c (self_recursively_generated_p):
For a lane-reducing operation (dot-prod/widen-sum/sad) in a loop reduction,
the current vectorizer can only handle the pattern if the reduction chain
contains no other operation, whether normal or lane-reducing.
Actually, to allow multiple arbitrary lane-reducing operations, we need t
Hi, Richard,
Would you please take a look at this patch?
Thanks,
Feng
From: Feng Xue OS
Sent: Friday, December 29, 2023 6:28 PM
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Do not count unused scalar use when marking STMT_VINFO_LIVE_P
[PR113091
mark_live_stmts (bb_vinfo, SLP_INSTANCE_TREE (instance),
- instance, &instance->cost_vec, svisited,
- visited);
- }
-}
+vect_bb_slp_mark_live_stmts (bb_vinfo);
return !vinfo->slp_instances.is_empty (
This patch is meant to fix over-estimation of the SLP vector-to-scalar cost
for a STMT_VINFO_LIVE_P statement. When pattern recognition is involved, a
statement whose definition is consumed in some pattern may not be included
in the final replacement pattern statements, and would be skipped
when buil
LP_TREE_LANES (slp_node) == 1))
scalar_shift_arg = false;
else if (dt[1] == vect_constant_def
|| dt[1] == vect_external_def
--
2.17.1
________
From: Richard Biener
Sent: Thursday, June 27, 2024 12:49 AM
To: Feng Xue OS
Cc: gcc-patches@gcc.gnu.org
S
This patch series is recomposed and split from
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655974.html.
As I will add a new field tightly coupled with "vec_stmts_size", if I follow
the original naming convention, the new macro would be very long. So it is
better to choose an equally meaningful but
The number of vector stmts of an operation is calculated based on the output
vectype. This is over-estimated for a lane-reducing operation. Sometimes, to
work around the issue, we have to rely on additional logic to deduce an exact
number by other means. To address this inconvenience, in this patch, we
For a lane-reducing operation (dot-prod/widen-sum/sad) in a loop reduction,
the current vectorizer can only handle the pattern if the reduction chain
contains no other operation, whether normal or lane-reducing.
This patch removes some constraints in reduction analysis to allow mult
When transforming multiple lane-reducing operations in a loop reduction chain,
originally, the corresponding vectorized statements are generated into def-use
cycles starting from index 0. The def-use cycle with a smaller index would
contain more statements, which means more instruction dependency. For example
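The idea can be sketched in scalar C (hypothetical names; the actual patch reorders vectorized statements, not source code): distributing partial sums round-robin over several accumulators shortens each def-use chain while leaving the reduction result unchanged.

```c
#include <assert.h>

/* Scalar sketch of the reordering idea: two accumulators take the
   partial products alternately, so each accumulator's def-use chain
   is half as long as a single-accumulator chain would be.  */
static int dot16 (const signed char *a, const signed char *b)
{
  int sum_v0 = 0, sum_v1 = 0;
  for (int i = 0; i < 16; i += 2)
    {
      sum_v0 += a[i] * b[i];         /* def-use cycle 0 */
      sum_v1 += a[i + 1] * b[i + 1]; /* def-use cycle 1 */
    }
  return sum_v0 + sum_v1; /* cycles merged at the end */
}
```
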
YPE? As said having wrong
> SLP_TREE_NUMBER_OF_VEC_STMTS is going to backfire.
Then the alternative is to limit special handling related to the vec_num only
inside vect_transform_reduction. Is that ok? Or any other suggestion?
Thanks,
Feng
From: Rich
gt; > when that's set instead of SLP_TREE_VECTYPE? As said having wrong
> > > SLP_TREE_NUMBER_OF_VEC_STMTS is going to backfire.
> >
> > Then the alternative is to limit special handling related to the vec_num
> > only
> > inside vect_transform_reduction. Is
Extend the original vect_get_num_copies (purely loop-based) to calculate the
number of vector stmts for an SLP node within a generic vect region.
Thanks,
Feng
---
gcc/
* tree-vectorizer.h (vect_get_num_copies): New overload function.
(vect_get_slp_num_vectors): New function.
* tree-v
The number of vector stmts of an operation is calculated based on the output
vectype. This is over-estimated for a lane-reducing operation, which would
cause a vector def/use mismatch when we want to support loop reductions that
mix lane-reducing and normal operations. One solution is to refit lane-reducing
t
For a lane-reducing operation (dot-prod/widen-sum/sad) in a loop reduction,
the current vectorizer can only handle the pattern if the reduction chain
contains no other operation, whether normal or lane-reducing.
This patch removes some constraints in reduction analysis to allow mult
When transforming multiple lane-reducing operations in a loop reduction chain,
originally, the corresponding vectorized statements are generated into def-use
cycles starting from index 0. The def-use cycle with a smaller index would
contain more statements, which means more instruction dependency. For example
ke the checking assert unconditional?
>
> OK with that change. vect_get_num_vectors will ICE anyway
> I guess, so at your choice remove the assert completely.
>
OK, I removed the assert.
Thanks,
Feng
From: Richard Biener
Sent: Monday, July 15,
Hi,
The patch was updated against the newest trunk, and also contains some minor
changes.
I am working on another new feature which is meant to support pattern
recognition of lane-reducing operations in an affine closure originating from
a loop reduction variable, like:
sum += cst1 * dot_prod_1 + c
Some utility functions (such as vect_look_through_possible_promotion) that
are meant to find a certain kind of direct or indirect SSA definition for a
value may return the original SSA, not its pattern representative SSA, even
when a pattern is involved. For example,
a = (T1) patt_b;
pa
Both derived classes (loop_vec_info/bb_vec_info) have their own "bbs" field,
which has exactly the same purpose of recording all basic blocks inside the
corresponding vect region, but the fields are of different data types: one is
a plain array, the other an auto_vec. This difference causes
Changed as per the comments.
Thanks,
Feng
From: Richard Biener
Sent: Tuesday, May 28, 2024 5:34 PM
To: Feng Xue OS
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] vect: Use vect representative statement instead of
original in patch recog [PR115060]
On Sat
_info_shared *);
~_bb_vec_info ();
- /* The region we are operating on. bbs[0] is the entry, excluding
- its PHI nodes. In the future we might want to track an explicit
- entry edge to cover bbs[0] PHI nodes and have a region entry
- insert location. */
- vec bbs;
-
vec roots;
}
OK. Then I will add a TODO comment on the "bbs" field to describe it.
Thanks,
Feng
From: Richard Biener
Sent: Wednesday, May 29, 2024 3:14 PM
To: Feng Xue OS
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] vect: Unify bbs in loop_vec_info and b
>> Hi,
>>
>> The patch was updated with the newest trunk, and also contained some minor
>> changes.
>>
>> I am working on another new feature which is meant to support pattern
>> recognition
>> of lane-reducing operations in affine closure originated from loop reduction
>> variable,
>> like:
>>
This is a patch that is split out from
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652626.html.
Checking if an operation is lane-reducing requires comparing its code against
three kinds (DOT_PROD_EXPR/WIDEN_SUM_EXPR/SAD_EXPR). Add a utility
function to make the check handy
This is a patch that is split out from
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652626.html.
Partial vectorization checking for vectorizable_reduction is a piece of
relatively isolated code, which may be reused elsewhere. Move the
code into a new function for sharing.
Thanks,
Fen
Normally, vectorizability checking on a statement in a loop reduction chain
does not use the reduction PHI information. But some special statements might
need it during vectorizable analysis, especially for the support of multiple
lane-reducing operations later.
Thanks,
Feng
---
gcc/
* tree-vect-loop.cc
The input vectype is an attribute of a lane-reducing operation, not of the
reduction PHI that it is associated with, since there might be more than one
lane-reducing operation with different types in a loop reduction chain. So
bind each lane-reducing operation to its own input vectype.
Thanks,
Feng
---
For a lane-reducing operation (dot-prod/widen-sum/sad) in a loop reduction,
the current vectorizer can only handle the pattern if the reduction chain
contains no other operation, whether normal or lane-reducing.
Actually, to allow multiple arbitrary lane-reducing operations, we need to
When transforming multiple lane-reducing operations in a loop reduction chain,
originally, the corresponding vectorized statements are generated into def-use
cycles starting from index 0. The def-use cycle with a smaller index would
contain more statements, which means more instruction dependency. For example
OK. Updated as per the comments.
Thanks,
Feng
From: Richard Biener
Sent: Friday, May 31, 2024 3:29 PM
To: Feng Xue OS
Cc: Tamar Christina; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 2/6] vect: Split out partial vect checking for reduction
into a function
Please see my comments below.
Thanks,
Feng
> On Thu, May 30, 2024 at 4:55 PM Feng Xue OS
> wrote:
>>
>> For lane-reducing operation(dot-prod/widen-sum/sad) in loop reduction,
>> current
>> vectorizer could only handle the pattern if the reduction chain does not
>>
>> >> 1. Background
>> >>
>> >> For loop reduction of accumulating result of a widening operation, the
>> >> preferred pattern is lane-reducing operation, if supported by target.
>> >> Because
>> >> this kind of operation need not preserve intermediate results of widening
>> >> operation, and o
gcc_assert (reduction_type != EXTRACT_LAST_REDUCTION
--
2.17.1
____________
From: Feng Xue OS
Sent: Thursday, May 30, 2024 10:51 PM
To: Richard Biener
Cc: Tamar Christina; gcc-patches@gcc.gnu.org
Subject: [PATCH 3/6] vect: Set STMT_VINFO_REDUC_DEF for non-live stmt i
able gives the initial
scalar values of those N reductions. */
--
2.17.1
________
From: Feng Xue OS
Sent: Thursday, May 30, 2024 10:56 PM
To: Richard Biener
Cc: Tamar Christina; gcc-patches@gcc.gnu.org
Subject: [PATCH 6/6] vect: Optimize order of lane-reducing
This series of patches is meant to support multiple lane-reducing reduction
statements. Since the original ones conflicted with the new single-lane slp
node patches, I have reworked most of the patches and split them as small as
possible, which may make code review easier.
In the 1st one, I ad
In vectorizable_reduction, one check on a reduction operand via index is
subsumed by another check via pointer, so remove the former.
Thanks,
Feng
---
gcc/
* tree-vect-loop.cc (vectorizable_reduction): Remove the duplicated
check.
---
gcc/tree-vect-loop.cc | 6 ++
Two local variables were defined to refer to the same STMT_VINFO_REDUC_TYPE;
it is better to keep only one.
Thanks,
Feng
---
gcc/
* tree-vect-loop.cc (vectorizable_reduction): Remove v_reduc_type, and
replace it with the other local variable reduction_type.
---
gcc/tree-vect-loop.cc | 8
The input vectype of a reduction PHI statement must be determined before
vect cost computation for the reduction. Since a lane-reducing operation has
a different input vectype from a normal one, we need to traverse all reduction
statements to find out the input vectype with the least lanes, and set tha
It is better to place the 3 related independent variables into an array,
since we need to access them via an index in the following patch. At the
same time, this change makes some duplicated code more compact.
Thanks,
Feng
---
gcc/
* tree-vect-loop.cc (vect_transform_reduction):
According to the logic of the code near the assertion, no lane-reducing
operation should appear, not just DOT_PROD_EXPR. Since "use_mask_by_cond_expr_p"
treats SAD_EXPR the same as DOT_PROD_EXPR, and WIDEN_SUM_EXPR should not be
allowed by the following assertion "gcc_assert (commutative_binary_op_p (.
For a lane-reducing operation (dot-prod/widen-sum/sad) in a loop reduction,
the current vectorizer can only handle the pattern if the reduction chain
contains no other operation, whether normal or lane-reducing.
Actually, to allow multiple arbitrary lane-reducing operations, we need t
When transforming multiple lane-reducing operations in a loop reduction chain,
originally, the corresponding vectorized statements are generated into def-use
cycles starting from index 0. The def-use cycle with a smaller index would
contain more statements, which means more instruction dependency. For example
{ 0, 0, 0, 0 };
loop () {
sum_v0 = dot_prod<16 * char>(char_a0, char_a1, sum_v0);
sum_v1 = dot_prod<16 * char>(char_b0, char_b1, sum_v1);
sum_v0 = dot_prod<8 * short>(short_c0_lo, short_c1_lo, sum_v0);
sum_v1 = dot_prod<8 * short>(short_
662a3..1b73ef01ade 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -13350,6 +13350,8 @@ vect_analyze_stmt (vec_info *vinfo,
NULL, NULL, node, cost_vec)
|| vectorizable_load (vinfo, stmt_info, NULL, NULL, node, cost_vec)
|| vectorizable_store (vinfo, stmt_inf
lar values of those N reductions. */
--
2.17.1
____________
From: Feng Xue OS
Sent: Sunday, June 16, 2024 3:32 PM
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH 8/8] vect: Optimize order of lane-reducing statements in loop
def-use cycles
When trans
s - 1 given you use one above
> and the other below? Or simply iterate till op.num_ops
> and skip i == reduc_index.
>
>> + for (unsigned i = 0; i < op.num_ops - 1; i++)
>> + {
>> + gcc_assert (vec_oprnds[i].length () == using_ncopies);
>> +
>>
>> >> - if (slp_node)
>> >> + if (slp_node && SLP_TREE_LANES (slp_node) > 1)
>> >
>> > Hmm, that looks wrong. It looks like SLP_TREE_NUMBER_OF_VEC_STMTS is off
>> > instead, which is bad.
>> >
>> >> nvectors = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
>> >>else
>> >>
PHI records the input vectype with least lanes. */
- if (lane_reducing)
-STMT_VINFO_REDUC_VECTYPE_IN (stmt_info) = vectype_in;
enum vect_reduction_type reduction_type = STMT_VINFO_REDUC_TYPE (phi_info);
STMT_VINFO_REDUC_TYPE (reduc_info) = reduction_type;
--
2.17.1
___
s.cc b/gcc/tree-vect-stmts.cc
index 840e162c7f0..845647b4399 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -13350,6 +13350,8 @@ vect_analyze_stmt (vec_info *vinfo,
NULL, NULL, node, cost_vec)
|| vectorizable_load (vinfo, stmt_info, NU
ctions. */
--
2.17.1
____________
From: Feng Xue OS
Sent: Thursday, June 20, 2024 2:02 PM
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 8/8] vect: Optimize order of lane-reducing statements in
loop def-use cycles
This patch was updated with some new chang
Allow shift-by-induction for an SLP node when it is single-lane, which is
aligned with the original loop-based handling.
Thanks,
Feng
---
gcc/tree-vect-stmts.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index ca6052662a3..8
Hi,
I composed some patches to generalize lane-reducing (dot-product is a typical
representative) pattern recognition, and prepared an RFC document to help
with review. The original intention was to make a complete solution for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114440. For sure, th
The work for the RFC
(https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657860.html)
involves quite a lot of code change, so I have to separate it into several
batches of patches. This and the following patches constitute the first batch.
Since pattern statement coexists with normal statements in a
This patch extends the original vect analysis and transform to support a new
kind of lane-reducing operation that participates in a loop reduction
indirectly. The operation itself is not a reduction statement, but its value
is eventually accumulated into the reduction result.
Thanks,
Feng
---
gcc/
For a sum-based loop reduction, its affine closure is composed of statements
whose results and derived computation only end up in the reduction, and are
not used in any non-linear transform operation. The concept underlies the
generalized lane-reducing pattern recognition in the coming patches. As
ma
Previously, only the simple lane-reducing case was supported, in which one
loop reduction statement forms one pattern match:

  char *d0, *d1, *s0, *s1, *w;
  for (i) {
    sum += d0[i] * d1[i];       // sum = DOT_PROD(d0, d1, sum);
    sum += abs(s0[i] - s1[i]);  // sum = SAD(s0, s1, sum);
    sum += w[i
This patch adds a pattern to fold a summation into the last operand of a
lane-reducing operation when appropriate, which supplements the
operation-specific patterns for dot-prod/sad/widen-sum.
sum = lane-reducing-op(..., 0) + value;
=>
sum = lane-reducing-op(..., value);
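A scalar model shows why the fold is sound (hypothetical `lrop` name; a lane-reducing op here is modeled as init + sum of products, so the accumulator operand is just an addend):

```c
#include <assert.h>

/* Scalar model of a lane-reducing operation with an accumulator:
   lrop(a, b, n, init) = init + sum(a[i] * b[i]).  The fold
     sum = lrop(..., 0) + value  =>  sum = lrop(..., value)
   holds because the accumulator operand is a plain addend.  */
static int lrop (const int *a, const int *b, int n, int init)
{
  int s = init;
  for (int i = 0; i < n; i++)
    s += a[i] * b[i];
  return s;
}
```
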
Thanks,
Feng
>> 1. Background
>>
>> For loop reduction of accumulating result of a widening operation, the
>> preferred pattern is lane-reducing operation, if supported by target. Because
>> this kind of operation need not preserve intermediate results of widening
>> operation, and only produces reduced amount
The function vect_look_through_possible_promotion() fails to figure out the
root definition if the casts involve more than two promotions with a sign
change, as in:

  long a = (long)b;              // promotion cast
   -> int b = (int)c;            // promotion cast, sign change
    -> unsigned short c = ...;

For this case, the
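The cast chain above can be reproduced in plain C as a standalone sketch (not the vectorizer code itself); both promotions are value-preserving, so looking through them should reach the unsigned short root definition:

```c
#include <assert.h>

/* Reproduces the cast chain: two promotions, with a sign change at
   the middle step.  The root definition is the unsigned short c.  */
static long promote_chain (unsigned short c)
{
  int b = (int) c;   /* promotion cast, sign change */
  long a = (long) b; /* promotion cast */
  return a;
}
```
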
Some opcodes are missed when determining the smallest scalar type for a
vectorizable statement. Currently, this bug does not cause any problem,
because vect_get_smallest_scalar_type is only used to compute the max-nunits
vectype, and even if a statement with a missed opcode is incorrectly bypassed,
the max nu
Currently, for a self-recursive call, we never use a value originating from a
non-pass-through jump function as a source, to avoid propagation explosion,
but the self-dependent value case is missed. This patch fixes the bug.
Bootstrapped/regtested on x86_64-linux and aarch64-linux.
Feng
---
2020-02-18 Fe
Thanks,
Feng
From: Tamar Christina
Sent: Monday, February 17, 2020 4:44 PM
To: Feng Xue OS; Martin Jambor; Jan Hubicka; gcc-patches@gcc.gnu.org
Cc: nd
Subject: RE: [PATCH] Fix bug in recursiveness check for function to be cloned
(ipa/pr93707)
Hi Feng
This is a simple and nice fix, but it could suppress some CP opportunities
for self-recursive calls. Using the test case as an example, the first should
be a for-all-context clone, and the call "recur_fn (i, 1, depth + 1)" is
replaced with a newly created recursive node. Thus, in the next round of CP it
It is a good solution.
Thanks,
Feng
From: Martin Jambor
Sent: Saturday, February 22, 2020 2:15 AM
To: Feng Xue OS; Tamar Christina; Jan Hubicka; gcc-patches@gcc.gnu.org
Cc: nd
Subject: Re: [PATCH] Fix bug in recursiveness check for function to be cloned
Hi, Honza & Martin,
Would you please take some time to review this updated patch? Thanks.
Feng
From: Feng Xue OS
Sent: Wednesday, September 18, 2019 8:41 PM
To: Jan Hubicka
Cc: Martin Jambor; gcc-patches@gcc.gnu.org
Subject: [PATCH V4] General
Hi Honza & Martin,
And also hope your comments on this patch. Thanks.
Feng
From: Feng Xue OS
Sent: Thursday, September 19, 2019 10:30 PM
To: Martin Jambor; Jan Hubicka; gcc-patches@gcc.gnu.org
Subject: [PATCH V4] Extend IPA-CP to sup
Hi, Michael,
Would you please take a look at this modified version?
Thanks,
Feng
From: Feng Xue OS
Sent: Thursday, September 12, 2019 6:21 PM
To: Michael Matz
Cc: Richard Biener; gcc-patches@gcc.gnu.org
Subject: Re: Ping agian: [PATCH V2] Loop split
Hi Philipp,
This is an updated patch based on comments from Michael; if he thinks it is
OK, we will merge it into trunk. Thanks,
Feng
From: Philipp Tomsich
Sent: Tuesday, October 15, 2019 11:49 PM
To: Feng Xue OS
Cc: Michael Matz; Richard
Hi Philipp,
This patch is still under code review and might need some more time. Thanks,
Feng
From: Philipp Tomsich
Sent: Wednesday, October 16, 2019 12:05 AM
To: Feng Xue OS
Cc: Martin Jambor; Jan Hubicka; gcc-patches@gcc.gnu.org; Christoph Müllner
IPA does not allow constant propagation on a parameter that is used to
control function recursion.

recur_fn (i)
{
  if (!terminate_recursion (i))
    {
      ...
      recur_fn (i + 1);
      ...
    }
  ...
}
This patch is composed to enable multi-versioning for self-recursive
functions, and ve
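A concrete instance of the recur_fn shape above (the depth-3 termination test is an illustrative assumption, not from the patch) shows the recursion-control parameter that multi-versioning would specialize per depth value:

```c
#include <assert.h>

/* Concrete recur_fn: parameter i controls recursion depth.  With
   multi-versioning, IPA-CP could clone recur_fn for each constant
   depth value (i, i + 1, ...), turning the recursion into a chain
   of specialized clones.  */
static int recur_fn (int i)
{
  if (i >= 3) /* plays the role of terminate_recursion (i) */
    return 0;
  return 1 + recur_fn (i + 1); /* candidate for per-depth cloning */
}
```
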
> I noticed similar issue when analyzing the SPEC, self-recursive function is
> not versioned and posted my observations in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92074.
> Generally, this could be implemented well by your patch, while I am
> wondering whether it is OK to convert the recur
Hi, Michael,
Since the GCC 10 release is coming, it would be good if we can add this patch
before that. Thanks,
Feng.
From: Michael Matz
Sent: Wednesday, October 16, 2019 12:01 AM
To: Philipp Tomsich
Cc: Feng Xue OS; Richard Biener; gcc-patches
Thanks for your review.
> In general the patch looks good to me, but I would like Martin Jambor to
> comment on the ipa-prop/cp interfaces. However...
> +@item ipa-cp-max-recursion-depth
> +Maximum depth of recursive cloning for self-recursive function.
> +
> ... I believe we will need more care
>> The cost model used by self-recursive cloning is mainly based on existing
>> machinery in ipa-cp cloning; size growth and time benefit are considered.
>> But since recursive cloning is a more aggressive cloning, we will actually
>> have another problem, which is the opposite of your concern. By default, c
aning on dst_ctx?
From: gcc-patches-ow...@gcc.gnu.org on behalf
of Jan Hubicka
Sent: Friday, November 15, 2019 4:09 PM
To: Feng Xue OS
Cc: Martin Jambor; gcc-patches@gcc.gnu.org
Subject: Re: Ping: [PATCH V6] Extend IPA-CP to support arithmetically-computed