You are right, we need discriminators for non-CALL stmts too. Patch updated:
Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c (revision 201858)
+++ gcc/tree-cfg.c (working copy)
@@ -781,9 +781,37 @@ assign_discriminators (void)
{
This patch has 3 changes:
1. Now that we have discriminator for inlined callsite, we don't need
special handling for callsite location any more.
2. If a source line is mapped to multiple BBs, only the first BB will
be annotated.
3. Before actual annotation, mark every BB/edge as not annotated.
nge it to something more
> specific.
>
> thanks,
>
> David
>
> On Thu, Aug 22, 2013 at 3:56 PM, Dehao Chen wrote:
>> This patch has 2 changes:
>>
>> 1. Now that we have discriminator for inlined callsite, we don't need
>> special handling for callsit
This patch makes AutoFDO honor system paths stored in the profile.
Bootstrapped and passed regression tests.
OK for google-4_8 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 202672)
+++ gcc/aut
This patch disables SLP for AutoFDO.
Bootstrapped and passed unittests.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/opts.c
===
--- gcc/opts.c (revision 202709)
+++ gcc/opts.c (working copy)
@@ -1661,9 +1661,6 @@ common_handle_optio
This patch fixes up the call graph edge targets during the AutoFDO pass, so
that when rebuilding call graph edges, it can find the correct callee.
Bootstrapped and passed regression tests. Benchmark tests are ongoing.
Ok for google-4_8 branch?
Thanks,
Dehao
Index: gcc/Makefile.in
=
pool_do_link ();
- cgraph_unify_type_alias_sets ();
-
return TODO_rebuild_cgraph_edges;
}
On Wed, Sep 18, 2013 at 5:16 PM, Xinliang David Li wrote:
> On Wed, Sep 18, 2013 at 4:51 PM, Dehao Chen wrote:
>> This patch fixup the call graph edge targets during AutoFDO pass, so
>> that when rebu
This patch sets cgraph_node count during AutoFDO annotation, otherwise
execute_fixup_cfg will clear all the BB counts.
Bootstrapped and passed regression tests.
OK for google-4_8 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
-
> David
>
> On Thu, Sep 19, 2013 at 10:10 AM, Dehao Chen wrote:
>> Thanks, patch updated:
>>
>> Index: gcc/Makefile.in
>> ===
>> --- gcc/Makefile.in (revision 202725)
>> +++ gcc/Makefi
This patch fixes the issue of indirect call promotion when using an
AutoFDO-optimized binary to collect a profile and then using the new profile
to re-optimize the binary. Before the patch, the indirect call is
promoted (and likely inlined) in the profiled binary, and will not be
promoted in the new iteratio
This patch disables aggressive loop peeling when profile is available.
This prevents extensive code bloat which leads to increased i-cache
misses.
Bootstrapped and passed regression tests.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/loop-unroll.c
This patch fixes the bug when setting max-lipo-group in AutoFDO:
Bootstrapped and passed regression test.
OK for google branches?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 202926)
+++ gcc/auto-profi
This fixes an ICE when lipo_cmp_type handles NULL_PTR_TYPE.
Bootstrapped; regression test is ongoing.
OK for google branches?
Thanks,
Dehao
Index: gcc/l-ipo.c
===
--- gcc/l-ipo.c (revision 202926)
+++ gcc/l-ipo.c (working copy)
@@
for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2013-10-01 Dehao Chen
* cp/mangle.c (write_special_name_constructor): Remove the
INTERNAL suffix.
Index: gcc/cp/mangle.c
===
--- gcc/cp/mangle.c (revision 202991)
+++ gcc/cp/mangle.c (wo
Sorry to reply late, missed this mail again... not sure why.
LGTM, okay for google branches.
Dehao
On Mon, Sep 24, 2012 at 1:20 PM, Teresa Johnson wrote:
> Revised patch to add a new dump flag that dumps PMU profile information using
> the -pmu dump option. (Was issue 6489092, creating new issu
Hi,
This patch fixes the bug when comparing location to UNKNOWN_LOC.
Bootstrapped and passed gcc regression test.
Okay for trunk?
Thanks,
Dehao
2012-09-30 Dehao Chen
PR middle-end/54759
* gcc/tree-vect-loop-manip.c (slpeel_make_loop_iterate_ntimes): Use
LOCATION_LOCUS to compare with
Hi,
Attached patch fixes PR54782. phi_arg_location is not correctly
updated in move_block_to_fn. This patch fixes the problem.
Bootstrapped and passed gcc regression tests on x86.
Okay for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2012-10-03 Dehao Chen
PR middle-end/54782
* tree-cfg.c
Thanks for the comments. The patch was updated as attached.
Dehao
On Wed, Oct 3, 2012 at 11:46 AM, Jakub Jelinek wrote:
> On Wed, Oct 03, 2012 at 11:26:09AM -0700, Dehao Chen wrote:
>> @@ -6340,6 +6341,20 @@ move_block_to_fn (struct function *dest_cfun, basi
>>
Hi,
This patch fixes PR54826. When lowering the gimple, the block for call
args also needs to be reset.
Bootstrapped and passed gcc regression test on x86.
Okay for trunk?
Thanks,
Dehao
2012-10-05 Dehao Chen
* gimple-low.c (lower_stmt): Set the block for call args.
Index: gcc/gimple
ping^2
Honza, do you think this patch can make it into 4.8 stage 1?
Thanks,
Dehao
On Wed, Sep 26, 2012 at 2:34 PM, Dehao Chen wrote:
> http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01975.html
>
> Thanks,
> Dehao
> source code/configuration changes.
Thanks for your feedback and interest. Yes, in AutoFDO the coupling
between the profiling build and the FDO build is much looser.
>
> Just few quick questions from first glance over the patch...
>>
>> Dehao
>>
>> The patch c
On Sat, Oct 6, 2012 at 6:13 PM, Andi Kleen wrote:
> Jan Hubicka writes:
>>
>> I think it is useful feature, yes (and was in my TODO list for quite some
>> time). Unlike edge profiles, these profiles should be also more independent
>> of
>> source code/configuration changes.
>
> It would be good
Hi,
R191338 did not completely fix the location for deallocator. This
patch covers more cases for deallocator.
Bootstrapped and passed gcc regression test on x86.
Okay for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2012-10-07 Dehao Chen
* tree-eh.c (lower_try_finally_onedest): Set correct
Attached is the updated patch. Yes, if we add a VRP pass before
profile pass, this patch would be unnecessary. Should we add a VRP
pass?
Thanks,
Dehao
On Sat, Oct 6, 2012 at 9:38 AM, Jan Hubicka wrote:
>> ping^2
>>
>> Honza, do you think this patch can make into 4.8 stage 1?
>
> + if (check
I have backported r192215 from trunk to google-4_7:
2012-10-08 Dehao Chen
* predict.c (predict_extra_loop_exits): Use
predict_paths_leading_to_edge to replace predict_edge_def.
Bootstrapped and passed crosstool test.
Dehao
I have backported the following patches from trunk to google-4_7:
191931, 192049, 192120, 192165
gcc:
2012-10-08 Dehao Chen
Backport 191931, 192049, 192120, 192165 from trunk.
* tree-vect-loop-manip.c (slpeel_make_loop_iterate_ntimes): Use
LOCATION_LOCUS to compare
Yes, you are right. I've changed to use EXPR_LOCATION (stmt) for the location.
New patch attached, testing is on-going.
Thanks,
Dehao
On Tue, Oct 9, 2012 at 12:35 PM, Jason Merrill wrote:
> On 10/07/2012 08:38 PM, Dehao Chen wrote:
>>
>> +*stmt_p = build2_loc (input_loc
The patch bootstrapped and passed gcc regression tests.
Thanks,
Dehao
On Tue, Oct 9, 2012 at 1:16 PM, Dehao Chen wrote:
> Yes, you are right. I've changed to use EXPR_LOCATION (stmt) for the location.
>
> New patch attached, testing is on-going.
>
> Thanks,
> Dehao
>
>
This patch was committed and ported to google-4_7 branch.
Thanks,
Dehao
gcc/ChangeLog:
2012-10-07 Dehao Chen
* tree-eh.c (lower_try_finally_onedest): Set correct location for
deallocator.
* gimplify.c (gimplify_expr): Set correct location for TRY stmt.
gcc/cp/ChangeLog:
2012-10-07 Dehao
Hi,
This patch fixes debug info for expr and jump stmt.
Bootstrapped and passed gcc regression tests.
Is it okay for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2012-10-25 Dehao Chen
* tree-eh.c (do_return_redirection): Set location for jump statement.
(do_goto_redirection): Likewise
On Thu, Oct 25, 2012 at 11:06 AM, Eric Botcazou wrote:
>> This patch fixes debug info for expr and jump stmt.
>
> It would be nice to have testcases...
Sure, I'll try to forge some testcases for this.
>
>> * cfgexpand.c (set_expr_location_r): New callback function.
>> (gimple_assign_rhs_to_tree)
On Thu, Oct 25, 2012 at 11:11 AM, Tom Tromey wrote:
>>>>>> "Dehao" == Dehao Chen writes:
>
> Dehao> This patch fixes debug info for expr and jump stmt.
> Dehao> Bootstrapped and passed gcc regression tests.
> Dehao> Is it okay for trunk?
>
r is fixed by this patch I sent. Thus after this patch,
the block_location improvement will not cause regression to gdb tests.
Thanks,
Dehao
On Thu, Oct 25, 2012 at 11:53 AM, Dehao Chen wrote:
> On Thu, Oct 25, 2012 at 11:11 AM, Tom Tromey wrote:
>>>>>>> "Dehao" ==
also there even without the patch. This patch
just reveals the problem by moving a decl into cache so that it will be
checked. As I'm not familiar with LTO, I'm not quite sure what the root
problem is. Can anyone help take a look?
Thanks,
Dehao
gcc/ChangeLog:
2012-10-25 Dehao Chen
*
On Sat, Oct 27, 2012 at 11:07 AM, Richard Biener
wrote:
> On Sat, Oct 27, 2012 at 12:53 AM, Dehao Chen wrote:
>> Hi,
>>
>> I've updated the patch:
>>
>> 1. abandon the changes in cfgexpand.c
>> 2. set the block for trees when lowering gimple stmt.
On Mon, Oct 29, 2012 at 7:17 AM, Michael Matz wrote:
> Hi,
>
> On Mon, 29 Oct 2012, Richard Biener wrote:
>
>> > Well, you merely moved the bogus code to gimple-low.c. It is bogus
>> > because you unconditionally overwrite TREE_BLOCK of all operands (and all
Emm, then in gimple-low.c, we should
On Mon, Oct 29, 2012 at 9:10 AM, Richard Biener
wrote:
> On Mon, Oct 29, 2012 at 4:25 PM, Dehao Chen wrote:
>> On Mon, Oct 29, 2012 at 7:17 AM, Michael Matz wrote:
>>> Hi,
>>>
>>> On Mon, 29 Oct 2012, Richard Biener wrote:
>>>
>>>> > W
widely after gimplification. Do you mean that in the long run, we'd
want to remove all these?
Thanks,
Dehao
On Mon, Oct 29, 2012 at 9:49 AM, Dehao Chen wrote:
> On Mon, Oct 29, 2012 at 9:10 AM, Richard Biener
> wrote:
>> On Mon, Oct 29, 2012 at 4:25 PM, Dehao Chen wrote:
>
Ok. I'll test Michael's patch, and send out the new patch soon.
Thanks,
Dehao
): Likewise.
* expr.c (store_expr): Use current insn location instead of expr
location.
(expand_expr_real): Likewise.
(expand_expr_real_1): Likewise.
gcc/testsuite/ChangeLog:
2012-10-25 Dehao Chen
* g++.dg/debug/dwarf2/block.C: New testcase.
Index: gcc
Yeah, I looked into the testcase. Indeed, the expr location should be
used instead of curr_insn_location(). Otherwise the source line for
all definitions will be attributed to the use point. But if we use
EXPR_LOCATION, then we need to make sure the block is also correct.
Any suggestions how we sh
> And tree expressions don't have TREE_BLOCK before gimple-low either.
> So IMNSHO it is gimple-low.c that should set TREE_BLOCK of all the gimple
> stmts as well as all expression in the operands. It is not overwriting
> anything, no frontend sets TREE_BLOCK for any expression, the way frontends
ed.
I agree, but this looks like too bold a move at this point. Shall we
do that in 4.8?
BTW, I updated the patch to ensure pr43479.c works fine. The testing
is still on-going.
Dehao
gcc/ChangeLog:
2012-10-25 Dehao Chen
* tree-eh.c (do_return_redirection):
BTW, one thing I found confusing is that in expr.c, some code is for
the frontend, while some is for RTL. Shall we separate them into two
files? And we don't expect to see EXPR_LOCATION on the RTL side.
Thanks,
Dehao
> The debugger isn't the only consumer of debug info, and other tools might need
> a finer granularity than a GIMPLE location, so clearing EXPR_LOCATION to work
> around a debug info size issue seems very short-sighted (to say the least).
Hi, Eric,
There might be some misunderstanding here. Clear
> gcc/ChangeLog:
> 2012-10-25 Dehao Chen
>
> * tree-eh.c (do_return_redirection): Set location for jump statement.
> (do_goto_redirection): Likewise.
> (frob_into_branch_around): Likewise.
> (lower_try_finally_
for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2012-10-30 Dehao Chen
* tree-ssa-pre.c (insert_into_pred_update_location): New function.
(insert_into_preds_of_block): Update source location for inserted stmts.
gcc/testsuite/ChangeLog:
2012-10-30 Dehao Chen
* gcc.dg/debug
Sorry, new patch attached...
On Tue, Oct 30, 2012 at 4:38 PM, Steven Bosscher wrote:
> On Wed, Oct 31, 2012 at 12:00 AM, Dehao Chen wrote:
>> This patch aims to improve debugging of optimized code. It ensures
>> that PRE inserted statements have the same source location as the
onents / operation.
>
> Thus I think inserted expressions should not have any debug information
> at all because they do not correspond to a source line.
>
> Richard.
>
>> David
>>
>> On Tue, Oct 30, 2012 at 4:38 PM, Steven Bosscher
>> wrote:
>>>
> Yeah. But please also check gdb testsuite for this kind of patches.
This patch also passed gdb testsuite.
Thanks,
Dehao
>
> Jakub
On Wed, Oct 2, 2013 at 10:50 AM, Cong Hou wrote:
> On Tue, Oct 1, 2013 at 11:35 PM, Jakub Jelinek wrote:
>> On Tue, Oct 01, 2013 at 07:12:54PM -0700, Cong Hou wrote:
>>> --- gcc/tree-vect-loop-manip.c (revision 202662)
>>> +++ gcc/tree-vect-loop-manip.c (working copy)
>>
>> Your mailer ate all th
I looked at this problem. Bug updated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58619
This is a bug when updating block during tree-inline. Basically, it is
legal for *n to be NULL. E.g. When gimple_block(id->gimple_call) is
NULL, remap_blocks_to_null will be called to set *n to NULL.
The probl
On Fri, Oct 4, 2013 at 11:54 AM, Jan Hubicka wrote:
>> I looked at this problem. Bug updated
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58619
>>
>> This is a bug when updating block during tree-inline. Basically, it is
>> legal for *n to be NULL. E.g. When gimple_block(id->gimple_call) is
>> N
M_EARLY_INLINER_MAX_ITERATIONS);
>> i++)
>>{
>> early_inliner ();
>> if (!flag_value_profile_transformations
>> || !autofdo::afdo_vpt_for_early_inline (&promoted_stmts))
>>break;
>>}'
>
> This needs heavy documenta
ping...
On Tue, Oct 1, 2013 at 1:28 PM, Dehao Chen wrote:
> Hi,
>
> This patch disables the C++ frontend to add " *INTERNAL* " suffix to
> maybe_in_charge_destructor/constructor. This is needed because these
> functions could be emitted in the debug info, and we would
Thanks for applying the patch. Backported to google-4_8
I still have some concerns about inlining a .part function into its
original function: basically, the gimple_block for that call may be
NULL, but it does not make sense to clear all block info for all stmts
in the .part function.
Dehao
On Tue,
This patch updates the AutoFDO profile propagation of equivalence
classes: instead of looking at just the immediate dominator, traverse all
dominators. This helps improve profile accuracy.
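
A toy sketch of the idea, not the code in auto-profile.c: the array-based
dominator trees and the names on_chain and equiv_class_head are made up for
illustration. It walks every dominator of a block rather than stopping at
the immediate one, and keeps the topmost dominator that the block
post-dominates.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_BLOCK (-1)

/* Hypothetical helper: true if block `target` appears on the chain that
   starts at `start` and follows `parent` links (start itself included). */
static bool on_chain(const int *parent, int start, int target)
{
  for (int cur = start; cur != NO_BLOCK; cur = parent[cur])
    if (cur == target)
      return true;
  return false;
}

/* Hypothetical helper: walk ALL dominators of `bb` via the immediate
   dominator array `idom`, and return the topmost dominator d such that
   `bb` post-dominates d (checked via the immediate post-dominator array
   `ipdom`).  Returns `bb` itself if no dominator qualifies. */
static int equiv_class_head(const int *idom, const int *ipdom, int bb)
{
  assert(bb >= 0);
  int head = bb;
  for (int d = idom[bb]; d != NO_BLOCK; d = idom[d])
    if (on_chain(ipdom, d, bb))
      head = d;
  return head;
}
```

For the CFG 0->1, 1->2, 1->3, 2->4, 3->4, block 4 has immediate dominator 1,
but walking the whole chain also reaches block 0, which block 4
post-dominates as well.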
Bootstrapped and passed regression test.
OK for google-4_8 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
=
In legacy AutoFDO, a callsite is represented as a (lineno, callee_name)
pair because there could be multiple calls in one line. However, as we
enhanced the debug info by assigning discriminators for each function
call in the same line, callee_name is not needed when indexing the
callsite.
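
A minimal sketch of the two indexing schemes, for illustration only. The
struct and function names here are hypothetical and do not come from the
AutoFDO sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical legacy key: the callee name disambiguates calls that
   share a source line. */
struct legacy_callsite {
  int lineno;
  const char *callee_name;
};

/* Hypothetical new key: the discriminator already distinguishes calls
   on the same line, so no name is needed. */
struct discr_callsite {
  int lineno;
  int discriminator;
};

static bool legacy_equal(const struct legacy_callsite *a,
                         const struct legacy_callsite *b)
{
  assert(a && b);
  return a->lineno == b->lineno
         && strcmp(a->callee_name, b->callee_name) == 0;
}

static bool discr_equal(const struct discr_callsite *a,
                        const struct discr_callsite *b)
{
  assert(a && b);
  return a->lineno == b->lineno && a->discriminator == b->discriminator;
}
```

With the second form, two calls on line 10 would be keyed as (10, 1) and
(10, 2), assuming each call got its own discriminator.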
This patch
Patch updated.
Thanks,
Dehao
On Wed, Oct 9, 2013 at 10:45 PM, Xinliang David Li wrote:
> > /* Program behavior changed, original promoted (and inlined) target is not
> > hot any more. Will avoid promote the original target. */
> > if (total >= info->first * 0.5)
> > return false;
>
> This
ping^2
Thanks,
Dehao
On Tue, Oct 1, 2013 at 1:28 PM, Dehao Chen wrote:
> Hi,
>
> This patch disables the C++ frontend to add " *INTERNAL* " suffix to
> maybe_in_charge_destructor/constructor. This is needed because these
> functions could be emitted in the debug i
It's hard to get a testcase without
http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=201856 because
none of these *INTERNAL* symbols will be emitted in debug info.
Thanks,
Dehao
On Fri, Oct 11, 2013 at 10:55 AM, Jason Merrill wrote:
> This needs a testcase (compile with -dA and use scan-ass
This patch forces the use of profile info to check whether an edge is hot
when a profile is available.
Bootstrapped and passed regression tests.
OK for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2013-10-14 Dehao Chen
* predict.c (cgraph_maybe_hot_edge_p): Decide edge's hotness from profile.
Index
cgraph_maybe_hot_edge_p (because callee is
UNLIKELY_EXECUTED), while the edge->count is actually hot.
Dehao
On Mon, Oct 14, 2013 at 9:13 AM, Xinliang David Li wrote:
> Looks like there is some inconsistency between edge hotness and callee
> frequency?
>
> David
>
> On Mon, Oct 14, 2013 a
On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka wrote:
>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>> could be a potential risk because some callee is marked unlikely
>> executed simply because they are inlined and eliminated in the O2
>> binary. But in ipa-inline it will n
ile
> annotate (using information from inline instances which are not
> inlined in early inline)?
>
> David
>
> On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen wrote:
>> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka wrote:
>>>> Not for instrumented FDO (not as I know o
On Mon, Oct 14, 2013 at 3:04 PM, Xinliang David Li wrote:
> On Mon, Oct 14, 2013 at 2:34 PM, Dehao Chen wrote:
>> For my test case, the entire inline instance is optimized away,
>
> do you mean there is no out of line instance for the target function
> in the profile binary?
node summary after profile
>> > annotate (using information from inline instances which are not
>> > inlined in early inline)?
>> >
>> > David
>> >
>> > On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen wrote:
>> >> On Mon, Oct 14, 2013 a
l_site_hash)
cgraph_add_edge_to_call_site_hash (edge);
+ if (count > 0 && edge->callee
+ && edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
+edge->callee->frequency = NODE_FREQUENCY_NORMAL;
+
return edge;
}
Thanks,
Dehao
On Mon, Oct 14, 2013
, Oct 14, 2013 at 3:26 PM, Dehao Chen wrote:
>> On Mon, Oct 14, 2013 at 2:50 PM, Jan Hubicka wrote:
>>>> For my test case, the entire inline instance is optimized away, so
>>>> there is no info about it in the profile. I can do some fixup in the
>>>> rebuild_
at location work?
>
> Teresa
>
> On Tue, Oct 15, 2013 at 8:40 AM, Dehao Chen wrote:
>> Thanks for the pointer to Honza's patch. The patch does exactly what I
>> need. But it only resides in the instrumentation based FDO path. Can
>> we move the code to more com
This patch adds a new flag that lets the user tell the compiler that the
AutoFDO profile is accurate, so the compiler will assume a function
without any samples is UNLIKELY_EXECUTED. This could save 10%~20% of the text
section size.
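
A rough sketch of the decision described above. The parameter name
profile_is_accurate and the function classify_node are hypothetical, and the
enum below is a stand-in with only the two values needed here; the actual
flag name is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in enum for this sketch only. */
enum node_frequency {
  NODE_FREQUENCY_UNLIKELY_EXECUTED,
  NODE_FREQUENCY_NORMAL
};

/* Hypothetical helper: a function with no samples is treated as
   unlikely executed only when the user says the profile is accurate. */
static enum node_frequency classify_node(long sample_count,
                                         bool profile_is_accurate)
{
  assert(sample_count >= 0);
  if (sample_count == 0 && profile_is_accurate)
    return NODE_FREQUENCY_UNLIKELY_EXECUTED;
  return NODE_FREQUENCY_NORMAL;
}
```

Without the flag, a missing sample is not taken as evidence that the
function is cold.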
Bootstrapped and passed regression test.
OK for google-4_8 branch?
Thanks,
Dehao
On Fri, Oct 18, 2013 at 10:39 AM, Jason Merrill wrote:
> On 10/11/2013 01:59 PM, Dehao Chen wrote:
>>
>> It's hard to get a testcase without
>> http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=201856 because
>> none of these *INTERNAL* symbols will be e
This fixes a LIPO bug when -fexceptions is on.
When compilation is finished, compile_file calls
dw2_output_indirect_constants, which may generate decls like
DW.ref.__gxx_personality_v0 (generated in
dw2_output_indirect_constant_1). This function is a global function,
but does not have ass
This test will fail if we don't have
http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=r201824
Bootstrapped and passed regression test.
OK for trunk?
gcc/testsuite/ChangeLog:
2013-10-24 Dehao Chen
* g++.dg/opt/devirt3.C: New test.
Index: gcc/testsuite/g++.dg/opt/
Thanks!
> If they fail on FSF 4.8, I can work on backporting the patch.
> it is quite self contained and safe.
>
> Honza
>>
>> gcc/testsuite/ChangeLog:
>> 2013-10-24 Dehao Chen
>>
If the propagation finds an infinite loop and the in-edge count is
non-zero, it will cause the compiler to go into an infinite loop when
building with AutoFDO.
Bootstrapped and regression test on-going.
OK for google-4_8 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
Most are within 10. The largest one I see is 17 across all benchmarks.
Dehao
On Fri, Oct 25, 2013 at 4:21 PM, Xinliang David Li wrote:
> What is the usual number of iterations?
>
> David
>
> On Fri, Oct 25, 2013 at 4:10 PM, Dehao Chen wrote:
>> If the propagation finds an
ping...
Thanks,
Dehao
On Fri, Oct 18, 2013 at 11:06 AM, Dehao Chen wrote:
> On Fri, Oct 18, 2013 at 10:39 AM, Jason Merrill wrote:
>> On 10/11/2013 01:59 PM, Dehao Chen wrote:
>>>
>>> It's hard to get a testcase without
>>> http://gcc.gnu.org/viewcvs/
This patch changes AutoFDO compilation to not update the callee count if
the caller is not a resolved node (in LIPO mode). This is
because AutoFDO will have the same edge counts for all unresolved
nodes. The original update method would lead to multiple updates of the callee.
Bootstrapped and testing o
100.
But bar()'s entry count is only 100 (assuming comdat_foo is the only
caller). Then if we update bar() twice when inlining these two edges,
the second update will be wrong.
Dehao
>
> David
>
> On Mon, Oct 28, 2013 at 3:51 PM, Dehao Chen wrote:
>> This patch changes to no u
On Tue, Oct 29, 2013 at 7:58 AM, Jason Merrill wrote:
> On 10/28/2013 06:12 PM, Dehao Chen wrote:
>>
>> ping...
>
>
> Sorry for the slow response.
>
> If we're actually emitting the name now, we need to give it a name different
> from the complete constructor
This patch fixes the bug so that AutoFDO honors max-lipo-group.
Bootstrapped and passed regression test.
OK for google-4_8 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 206135)
+++ gcc/auto-prof
This patch removes the mod_id_to_name map because the info is already
there in module_infos. Also, AutoFDO doesn't have access to update
this map because it's a file-static structure.
Bootstrapped and passed regression test.
OK for google branch?
Thanks,
Dehao
Index: gcc/coverage.c
===
This patch moves the LIPO linking before profile annotation so that
iterative-early-inline can cover functions from aux-module.
Bootstrapped and passed regression test and benchmark test.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
The following patch can fix an ICE when compiling with LIPO. OK for google-4_9?
Thanks,
Dehao
Index: gcc/l-ipo.c
===
--- gcc/l-ipo.c (revision 225685)
+++ gcc/l-ipo.c (working copy)
@@ -731,6 +731,7 @@ lipo_cmp_type (tree t1, tree t2
OK for google-4_8 and google-4_9. David and Teresa may have further comments.
Dehao
On Wed, Aug 6, 2014 at 3:36 PM, Yi Yang wrote:
> This currently puts split sections together again in the specified
> section and breaks DWARF output. This patch disables the partitioning
> for such functions.
>
AutoFDO sometimes has 0 profile in the loop's entry block because the
debug info is lost and unrecoverable.
E.g.
if (a)
  if (b)
    for () {}
This patch checks whether the scale factor is 0 and, if so, uses the normal scale.
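
As a minimal sketch of that check (pick_scale and its parameters are
hypothetical names, not the ones in the patch):

```c
#include <assert.h>

/* Hypothetical helper: choose the factor used to scale loop counts.
   A profile-derived scale of 0 is treated as "no usable profile here",
   so the normal scale is used instead. */
static long pick_scale(long profile_scale, long normal_scale)
{
  assert(normal_scale > 0);
  return profile_scale == 0 ? normal_scale : profile_scale;
}
```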
Bootstrapped and passed regression test and performance test.
OK for google-4_8
As discussed offline, I will commit this patch first, and think about other
smoothing algorithms to prevent profile insanity.
Dehao
On Fri, Nov 8, 2013 at 9:32 AM, Xinliang David Li wrote:
> On Fri, Nov 8, 2013 at 6:23 AM, Dehao Chen wrote:
>> AutoFDO sometimes has 0 profile in the loo
This patch removes the zero_edge heuristic during profile propagation.
The zero_edge heuristic does not seem to be effective in improving
performance.
Tested:
Bootstrapped and passed regression test and performance test.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
}
if ((bb->flags & BB_ANNOTATED) == 0)
{
bb->flags |= BB_ANNOTATED;
On Fri, Nov 22, 2013 at 1:17 PM, Xinliang David Li wrote:
> On Fri, Nov 22, 2013 at 12:27 PM, Dehao Chen wrote:
>> This patch removes the zero_edge heuristic during profile propagation.
>
afdo_propagate_multi_edge can do everything afdo_propagate_single_edge
does. So we refactor the code to keep only one afdo_propagate_edge
function.
Bootstrapped and passed all unittests and performance tests.
OK for google branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
of the block itself. Do you see
> any problems with that heuristic?
In this case, the propagate_edge function will keep increasing the BB
count. We set a threshold (PARAM_AUTOFDO_MAX_PROPAGATE_ITERATIONS) to
prevent it from making BB count too large.
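
A small sketch of such a bounded loop, for illustration. The function names
and the callback shape are hypothetical; only the idea of capping the number
of propagation rounds comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical step callback: performs one propagation round and
   returns true if any count changed. */
typedef bool (*propagate_step_fn)(long *bb_count);

/* Repeat the step until nothing changes or the cap is hit.
   Returns the number of rounds that reported a change. */
static int propagate_with_limit(long *bb_count, propagate_step_fn step,
                                int max_iterations)
{
  assert(max_iterations >= 0);
  int rounds = 0;
  while (rounds < max_iterations && step(bb_count))
    rounds++;
  return rounds;
}

/* Toy step that never converges: the count grows on every round. */
static bool always_grows(long *bb_count)
{
  *bb_count += 10;
  return true;
}
```

With a cap of 5, the never-converging step above stops after 5 rounds
instead of growing the count forever.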
Dehao
>
>
> T
On Mon, Nov 25, 2013 at 10:26 AM, Diego Novillo wrote:
> On Mon, Nov 25, 2013 at 1:22 PM, Xinliang David Li wrote:
>> In this case the backedge will be a critical edge, which will be split by
>> GCC.
>
> Right. So, if I split it, I will reach essentially the same
> conclusion, I think. The new b
fined section attributes. Don't call it if either case
arises. */
return (flag_reorder_blocks_and_partition
- && optimize
+ && optimize && !flag_auto_profile
/* See gate_handle_reorder_blocks. We should not partition if
w
>>
>> This will cause bzip2 performance to degrade 6%. I haven't had time to
>> triage the problem. Will investigate this later.
>
> Still I would prefer to make this by default
> flag_reorder_blocks_and_partition
> to false with auto_profile. We could do that incrementally, let's just drop
> thi
e another pass over the actual streaming logic that I find
> bit difficult
> to read, but I quite trust you it does the right thing ;)
>
> Honza
Index: gcc/debug.h
=======
--- gcc/debug.h (revision 215826)
+++ gcc/debug.h (working
The new patch is attached. I used clang-format to format auto-profile.{c|h}
Thanks,
Dehao
On Tue, Oct 14, 2014 at 2:05 PM, Dehao Chen wrote:
> On Tue, Oct 14, 2014 at 8:02 AM, Jan Hubicka wrote:
>>> Index: gcc/cg
This patch recalculates dominance info before the update_ssa call in
AutoFDO. This fixes a bug where dominance info is out of date and causes
segfaults during update_ssa.
Bootstrapped and regression test on-going.
OK for google-4_9 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
==
iang David Li wrote:
> Is it destroyed by value profile transformations? Can you move the
> dominance recomputing code closer to where it gets invalidated?
>
> David
>
> On Wed, Oct 15, 2014 at 10:37 AM, Dehao Chen wrote:
>> This patch recalculates dominance info before upda
--- gcc/Makefile.in (revision 215826)
+++ gcc/Makefile.in (working copy)
@@ -1153,6 +1153,7 @@ OBJS = \
alias.o \
alloc-pool.o \
auto-inc-dec.o \
+ auto-profile.o \
bb-reorder.o \
bitmap.o \
bt-load.o \
Index: gcc/common.opt
===