On Mon, Aug 19, 2013 at 5:01 PM, Teresa Johnson wrote:
> On Mon, Aug 19, 2013 at 11:44 AM, Jeff Law wrote:
>> On 08/19/2013 12:34 PM, Teresa Johnson wrote:
>>>
>>> Ping.
>>> Thanks,
>>> Teresa
>>>
>>> On Sun, Aug 11, 2013 at 9:35 P
e->max_bb_count))
> +{
> + if (PARAM_VALUE (PARAM_INLINE_USEFUL_COLD_CALLEE))
> +return useful_cold_callee (edge);
> + else
> +return true;
> +}
> +}
>return false;
> }
On Tue, Aug 20, 2013 at 12:26 PM, E
, Kaz Kojima wrote:
> Steve Ellcey wrote:
>> On Mon, 2013-08-19 at 22:21 -0700, Teresa Johnson wrote:
>>
>>> 2013-08-19 Teresa Johnson
>>>
>>> PR rtl-optimizations/57451
>>> * final.c (reemit_insn_block_notes): Prevent lexical b
-optimization/58220
+ PR regression/58221
+ * final.c (reemit_insn_block_notes): Use NEXT_INSN to
+ handle SEQUENCE insns properly.
+
2013-08-23 Gabriel Dos Reis
* pretty-print.h (pp_newline_and_flush): Declare. Remove macro
Teresa
--
Teresa Johnson | Software
On Fri, Aug 23, 2013 at 6:34 AM, Teresa Johnson wrote:
> Hi Steve and Kaz,
>
> Sorry about that. Kaz has a fix shown in rtl-optimization/58220:
>
> --- ORIG/trunk/gcc/final.c 2013-08-22 09:43:35.0 +0900
> +++ trunk/gcc/final.c 2013-08-22 14:36:51.0 +0900
&g
Ping #3.
Thanks,
Teresa
On Mon, Aug 19, 2013 at 11:33 AM, Teresa Johnson wrote:
> Ping.
> Thanks,
> Teresa
>
> On Mon, Aug 12, 2013 at 6:54 AM, Teresa Johnson wrote:
>> On Tue, Aug 6, 2013 at 10:23 PM, Teresa Johnson wrote:
>>> On Tue, Aug 6, 2013 at 9:29 AM,
On Mon, Aug 19, 2013 at 10:47 AM, Teresa Johnson wrote:
> On Mon, Aug 19, 2013 at 8:09 AM, Jan Hubicka wrote:
>>> Remember it isn't using dominance anymore. The latest patch was
>>> instead ensuring the most frequent path between hot blocks and the
>>> entry/
;> b.c:16:A::foo: note: Loop is peeled to enhance alignment for vectorization
>>
>> b.c:16:A::foo: note: Completely unroll loop 6 times
>>
>> b.c:12:A::foo: note: Completely unroll loop 6 times
>>
>>
>> Ok after testing?
>>
>> thanks,
>>
>> David
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
gt;e->frequency = e->frequency * (gcov_type) freq_scale /
>> >> CGRAPH_FREQ_BASE;
>> >> if (e->frequency > CGRAPH_FREQ_MAX)
>> >> e->frequency = CGRAPH_FREQ_MAX;
>> >> @@ -169,7 +171,13 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
>> >> }
>> >> duplicate = false;
>> >> e->callee->symbol.externally_visible = false;
>> >> - update_noncloned_frequencies (e->callee, e->frequency);
>> >> + // In the case of a COMDAT, the callee's count may be from
>> >> other
>> >> + // modules, and we need to scale it for the current module's
>> >> calls
>> >> + // (e.g. e->count may be 0 despite e->callee->count > 0).
>> >> + gcov_type count_scale = REG_BR_PROB_BASE;
>> >> + if (e->callee->count > e->count)
>> >> +count_scale = GCOV_COMPUTE_SCALE (e->count,
>> >> e->callee->count);
>> >> + update_noncloned_frequencies (e->callee, count_scale,
>> >> e->frequency);
>
> Please fix the comment to the usual style and go ahead with this change.
>
> Thanks,
> Honza
>> >> }
>> >>else
>> >> {
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
passes to new dump framework patch). The problem is
that the dump framework was not consistently emitting newlines, so
many of the messages were explicitly adding a newline, which often
wasn't needed. I'll respond more on that thread, but for cleanup such
as removing extra newlines and ad
On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
wrote:
> On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson wrote:
>> On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson wrote:
>>> On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor wrote:
>>>> Hi,
>>>>
>>>
On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson wrote:
> On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
> wrote:
>> On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson wrote:
>>> On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson wrote:
>>>> On Tue, Aug 6, 2013
or EH edges, setjmp edges and we can then walk bodies
> ignoring
> EH/setjmp flagging blocks that are probably never executed enabling the
> partitioning
> to do a job w/o profile.
agreed. although I would prefer to leave that for a follow-on patch so
it can be tuned a bit.
>
> Otherwise the patch look OK to me. Thanks for working on this. Do we have
> agreement on C++ way of mixing
> declarations and code?
According to http://gcc.gnu.org/wiki/CppConventions:
"In new code variables which are used in a small scope should be
defined at the point of first use, rather than at the top of the
function. Variables which are used throughout the function may be
defined at the top of the function, as in C."
I think I am following that, but let me know if you see something that
needs to be fixed.
Thanks,
Teresa
>
> Honza
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
On Wed, Aug 28, 2013 at 9:07 AM, Teresa Johnson wrote:
> On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson wrote:
>> On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
>> wrote:
>>> On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson wrote:
>>>> On Tue, Au
on, dump_loc should not blindly emit new line as it
>> has no context. I have tried to remove it, and push the newline to
>> higher level helpers -- it mostly works, but the vectorizer verbose
>> messages need serious clean up -- most of them assume that
>> dump_printf_l
that have been useful.
I really think that the two groups of people who will find -fopt-info
useful are gcc developers and savvy performance-hungry users. For the
former group the additional info is extremely useful. For the latter
group some of the extra information may not be required (although a
call count is useful for those using profile feedback), but IMO is not
unreasonable.
Teresa
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson wrote:
>> The inline stuff should be split and re-sent, it's non-obvious to me (extra
>> function parameters are not documented for example). I'd rather have
>> inline_and_report_call () for example instead of an extra
On Wed, Aug 28, 2013 at 11:20 AM, Teresa Johnson wrote:
> On Wed, Aug 28, 2013 at 9:58 AM, Jan Hubicka wrote:
>> Hi,
>> with Martin we did bit of progress on analyzing the problems. We now have
>> COMDAT profile merging for FDO
>
> Great! Is this the LTO merging y
Sorry, I should not have committed that new test along with this
portion of the patch. Removed as of r202106.
Teresa
On Fri, Aug 30, 2013 at 12:17 AM, Andreas Schwab wrote:
> Teresa Johnson writes:
>
>> Index: testsuite/gcc.dg/i
e
>> + probability. */
>> + if (highest_count)
>> +{
>> + if (e->count < highest_count)
>> +continue;
>> +}
>> + else if (highest_freq)
>
> The frequency condition needs to be done only when you walk predecestors -
> when
> you walk down the edge probabilities are just fine.
True. For simplicity I think it should be fine to leave as-is so there
isn't more special casing as the current approach works in both
directions.
>
> The patch seems OK to me now. I will make our FDO tester to use partitioning
> so we get
> this benchmarked a bit.
Ok thanks.
>
> Honza
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
patch.diff
Description: Binary data
edges.
Thanks,
Teresa
>
> I think we have code for 1-3. If the plan looks sane to you, I think we can
> start getting it in?
>
> Of course we want to eliminate most of the problems by getting runtime to
> do the merging... (but the merging will not be posible always due to
> cfg differences comming from different optimization flags, anyway).
>
> Honza
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
instead of just -fopt-info. Unless we can
add -fdump-all-all-optimized.
Teresa
>
> thanks,
>
> David
>
> On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
> wrote:
>> On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson wrote:
>>> On Thu, Aug 29, 2013 at 3:04 AM,
ld be fine to leave as-is so there
>> isn't more special casing as the current approach works in both
>> directions.
>
> Yep, you are right. Frequencies are safe both directions.
>
> I think this change belongs to profile feedback category, so the patch is OK.
ok,
On Fri, Aug 30, 2013 at 1:30 PM, Xinliang David Li wrote:
> On Fri, Aug 30, 2013 at 12:51 PM, Teresa Johnson wrote:
>> On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li
>> wrote:
>>> Except that in this form, the dump will be extremely large and not
>>> sui
close if it is
stderr/stdout. Confirmed that this fixes the problem.
(So the real ratio between the volume of -fdump-...=stderr and
-fopt-info is much higher than what I reported in an earlier email)
Is the following patch ok, pending regression tests?
2013-08-30 Teresa Johnson
* du
On Sat, Aug 31, 2013 at 12:26 AM, Bernhard Reutner-Fischer
wrote:
> On 30 August 2013 23:23:16 Teresa Johnson wrote:
>>
>> On Fri, Aug 30, 2013 at 1:30 PM, Xinliang David Li
>> wrote:
>> > On Fri, Aug 30, 2013 at 12:51 PM, Teresa Johnson
>> > wrote
>
>> Is the following patch ok, pending regression tests?
>>
>> 2013-08-30 Teresa Johnson
>>
>> * dumpfile.c (dump_finish): Don't close stderr/stdout.
>>
>> Index: dumpfile.c
>> =
ng tree
> decls.
Yes, this would have some advantages such as getting the function name emitted.
Teresa
>
> David
>
> On Mon, Sep 9, 2013 at 12:01 PM, Teresa Johnson wrote:
>> I've attached a patch that implements the cleanup of newline emission
>> by the n
if profile_data_prefix is still NULL, and if so just set via
getpwd.
Passes regression tests and failure I reproduced. Ok for google branches?
Thanks,
Teresa
2013-09-12 Teresa Johnson
* tree-profile.c (tree_init_instrumentation): Handle the case
where profile_data_prefix is NULL
On Thu, Sep 12, 2013 at 1:20 PM, Xinliang David Li wrote:
> On Thu, Sep 12, 2013 at 1:06 PM, Teresa Johnson wrote:
>> After porting r198033 from google/4_7 to google/4_8 a test case failed
>> with an assert when trying to take the strlen of profile_data_prefix.
>>
>> I
_string = build_string (prefix_len + 1, prefix);
TREE_TYPE (prefix_string) = build_array_type
(char_type_node, build_index_type
(build_int_cst (NULL_TREE, prefix_len)));
>
> David
>
> On Thu, Sep 12, 2013 at 2:07 PM, Teresa Johnson wrote:
>> On Thu, S
Testing passes, is the below patch ok for google/4_8?
Thanks, Teresa
On Thu, Sep 12, 2013 at 10:18 PM, Teresa Johnson wrote:
> On Thu, Sep 12, 2013 at 2:32 PM, Xinliang David Li wrote:
>> When absolute path is specified for the object file, no prefix will be
>> prepended to th
double-checked). Can you let me know
if you see anymore after you update to this revision?
Thanks,
Teresa
>
> David
>
> On Tue, Sep 10, 2013 at 6:32 AM, Teresa Johnson wrote:
>> On Mon, Sep 9, 2013 at 9:55 PM, Xinliang David Li wrote:
>>> looks fine to me.
>>>
>
Yep, looked too quickly every time and thought the newline after "be
zero" was applying. Here is the patch with the fix. Ok for trunk
pending regression testing?
2013-09-16 Teresa Johnson
* coverage.c (get_coverage_counts): Add missing newline.
Index:
tect the profile insanities
caused by integer truncation and handle them conservatively. That
being said, I see some sreal uses already in the profile.c code, so
presumably we could use this for the counts as well if it turns out to
be necessary?
BTW, Rong also implemented his runtime patch to do the COMDAT profile
merging. However, that ended up having some issues, that were solvable
but would have caused us to lose all context sensitivity from COMDATS
inlined during the profile-gen build. I am going to go back to solving
this in the profile-use phase as we discussed in the separate thread
on the COMDAT inlining patch I had been working on.
Thanks,
Teresa
>
> Honza
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
es in partitioning) that the bb frequency was non-zero.
Why not just have probably_never_executed_bb_p return simply return
false bb->frequency is non-zero (right now it does the opposite -
returns true when bb->frequency is 0)? Making this change removed a
bunch of other failures. With this change as well, there are only 3
cases that still fail with 1 train run that pass with 100. Need to
look at those.
>
> Will you look into logic of do_jump or shall I try to dive in?
I can take a look, but probably won't have a chance until late this
week. If you don't get to it before then I will see if I can figure
out why it is applying the branch probabilities this way.
Teresa
>
> Honza
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
;> this in the profile-use phase as we discussed in the separate thread
>> on the COMDAT inlining patch I had been working on.
>
> Yes, lets move ahead with this, too. I think i should dig out the change
> that made frequencies to be guessed again.
I think I have that and was buil
On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson wrote:
> On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka wrote:
>>>
>>> I looked at one that failed after 100 as well (20031204-1.c). In this
>>> case, it was due to expansion which was creating multiple branches
On Wed, Sep 25, 2013 at 2:33 PM, Teresa Johnson wrote:
> On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson wrote:
>> On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka wrote:
>>>>
>>>> I looked at one that failed after 100 as well (20031204-1.c). In this
>>>
hose.
>>
>> >
>> > Will you look into logic of do_jump or shall I try to dive in?
>>
>> I can take a look, but probably won't have a chance until late this
>> week. If you don't get to it before then I will see if I can figure
>> out why it
for google/4_8?
Teresa
2013-09-27 Teresa Johnson
* opts.c (finish_options): Suppress -g/-gmlt when we are
building for LIPO instrumention.
* common.opt (fripa-allow-debug): New option.
Index: opts.c
On Fri, Sep 27, 2013 at 12:01 PM, Xinliang David Li wrote:
> On Fri, Sep 27, 2013 at 11:50 AM, Teresa Johnson wrote:
>> David and Rong,
>>
>> The following patch will disable -g/-gmlt when instrumenting for LIPO
>> since they will affect the recorded ggc_memory use
use memory consumption.
> we still have -g/-gmlt in profile-use compilation. Will this change
> effectively under estimate the memory use in the use phrase?
>
> -Rong
>
> On Fri, Sep 27, 2013 at 11:50 AM, Teresa Johnson wrote:
>> David and Rong,
>>
>>
On Fri, Sep 27, 2013 at 7:15 AM, Teresa Johnson wrote:
> On Thu, Sep 26, 2013 at 3:02 PM, Jan Hubicka wrote:
>>>
>>> Why not just have probably_never_executed_bb_p return simply return
>>> false bb->frequency is non-zero (right now it does the opposite -
>>
x86_64-unknown-linux-gnu. Ok for trunk?
Thanks,
Teresa
2013-09-30 Teresa Johnson
* tree-ssa-threadupdate.c (ssa_fix_duplicate_block_edges):
Update redirected out edge count in joiner case.
(ssa_redirect_edges): Common the joiner and non-joiner cases
so that joiner
Oh, I can do that right now if you want - let me know if you haven't
hit the trigger yet. Thanks,
Teresa
On Mon, Sep 30, 2013 at 1:06 PM, Jeff Law wrote:
> On 09/30/13 13:07, Jeff Law wrote:
>>
>> On 09/30/13 12:52, Teresa Johnson wrote:
>>>
>>> The ju
Nevermind - see that you did this already. Thanks!
Teresa
On Mon, Sep 30, 2013 at 1:11 PM, Teresa Johnson wrote:
> Oh, I can do that right now if you want - let me know if you haven't
> hit the trigger yet. Thanks,
> Teresa
>
> On Mon, Sep 30, 2013 at 1:06 PM, Jeff Law wrot
On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson wrote:
> On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka wrote:
>>>
>>> I looked at one that failed after 100 as well (20031204-1.c). In this
>>> case, it was due to expansion which was creating multiple branches
condition that caused the entire expression to be true,
so that neither condition's true label ends up as a 0-count bb.
Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?
2013-10-01 Teresa Johnson
* dojump.c (do_jump_1): Divide probability between
code that maps from the attr_mode (MODE_SI)
to the machine_mode (SImode), so I am doing the mapping in ree.c before
the comparison with the mode from the second extend.
Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this ok for trunk?
Thanks,
Teresa
2012-10-10 Teresa Johnson
ok for trunk?
Thanks,
Teresa
2012-10-11 Teresa Johnson
* doc/tm.texi: Document TARGET_MACHINE_MODE_FROM_ATTR_MODE.
* doc/tm.texi.in: Regenerated.
* targhooks.c (default_machine_mode_from_attr_mode): New function.
* targhooks.h (default_machine_mode_from_attr_mode
On Fri, Oct 12, 2012 at 1:23 AM, Jakub Jelinek wrote:
> On Thu, Oct 11, 2012 at 02:44:12PM -0700, Teresa Johnson wrote:
>> Revised patch to address conservative behavior in redundant extend
>> elimination that was resulting in redundant extends not being
>> removed. Now u
redundant zero and sign extends.
Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this ok for trunk?
Thanks,
Teresa
2012-10-18 Teresa Johnson
* ree.c (add_removable_extension): Remove unnecessary
mode check with other extension.
2012-10-18 Teresa Johnson
ping.
Teresa
On Thu, Oct 18, 2012 at 8:21 AM, Teresa Johnson wrote:
>
> The attached patch implements avoids conservative behavior in REE by allowing
> removal of redundant extends when the def feeds another extend with a
> different
> mode. This works because in merge_def_and
This patch fixes three different failures I encountered while trying to use
-freorder-blocks-and-partition, including the failure reported in PR 53743.
Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?
Thanks,
Teresa
2012-10-29 Teresa Johnson
PR optimization/53743
"by a block in the cold partition", bb->index);
> + err = 1;
> + }
> + for (son = first_dom_son (CDI_DOMINATORS, bb);
> + son;
> + son = next_dom_son (CDI_DOMINATORS, son))
> + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, son);
> + }
> +
> + if (dom_calculated_here)
> + free_dominance_info (CDI_DOMINATORS);
> +}
> +
>/* Clean up. */
>return err;
> }
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
On Tue, Oct 30, 2012 at 12:33 PM, Xinliang David Li wrote:
> On Tue, Oct 30, 2012 at 12:25 PM, Steven Bosscher
> wrote:
>> On Tue, Oct 30, 2012 at 5:59 PM, Teresa Johnson wrote:
>>> I will try testing your patch on top of mine with our fdo benchmarks.
>>
>> Th
+1605,7 @@ init_optimization_passes (void)
>NEXT_PASS (pass_match_asm_constraints);
>NEXT_PASS (pass_sms);
>NEXT_PASS (pass_sched);
> + NEXT_PASS (pass_partition_blocks);
>NEXT_PASS (pass_ira);
>NEXT_PASS (pass_reload);
>NEXT_PASS (pass_postreload);
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
On Tue, Oct 30, 2012 at 2:31 PM, Teresa Johnson wrote:
> On Tue, Oct 30, 2012 at 12:33 PM, Xinliang David Li
> wrote:
>> On Tue, Oct 30, 2012 at 12:25 PM, Steven Bosscher
>> wrote:
>>> On Tue, Oct 30, 2012 at 5:59 PM, Teresa Johnson wrote:
>>>> I wil
work for me. I got an error because bbpart has
PROP_cfglayout listed in its required properties. I tried removing
that, but then verify_flow_info gives an error about a missing barrier
after a bb with no fall through edges.
Teresa
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
On Wed, Oct 2, 2013 at 9:03 AM, Jan Hubicka wrote:
>> 2013-10-01 Teresa Johnson
>>
>> * dojump.c (do_jump_1): Divide probability between
>> both conditions of a TRUTH_ORIF_EXPR.
>>
>> + {
>> +/* Spread the probabi
On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka wrote:
>> 2013-09-29 Teresa Johnson
>>
>> * bb-reorder.c
>> (find_rarely_executed_basic_blocks_and_crossing_edges):
>> Treat profile insanities conservatively.
>> * predict.c (probably_
On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka wrote:
>> 2013-09-29 Teresa Johnson
>>
>> * bb-reorder.c
>> (find_rarely_executed_basic_blocks_and_crossing_edges):
>> Treat profile insanities conservatively.
>> * predict.c (probably_
On Thu, Oct 3, 2013 at 6:41 AM, Teresa Johnson wrote:
> On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka wrote:
>>> 2013-09-29 Teresa Johnson
>>>
>>> * bb-reorder.c
>>> (find_rarely_executed_basic_blocks_and_crossing_edges):
>>>
,
Teresa
2013-10-03 Teresa Johnson
* params.def (PARAM_MIN_HOT_RUN_RATIO): New parameter.
* predict.c (probably_never_executed): Compare frequency-based
count to number of training runs.
(tree_estimate_probability): Update function call to new name
Passes regression tests. Ok for google/4_8?
Thanks,
Teresa
2013-10-07 Teresa Johnson
* cp/cp-objcp-common.c (cmp_templ_arg): Fix argument pack
accesses.
Index: cp/cp-objcp-common.c
===
--- cp/cp-objcp-common.c
On Sun, Oct 6, 2013 at 6:43 AM, Jan Hubicka wrote:
>> 2013-10-03 Teresa Johnson
>>
>> * params.def (PARAM_MIN_HOT_RUN_RATIO): New parameter.
>
> I would not mention HOT in the parameter name. Blocks are hot/normal/unlikely
> and we HOT_BB_FREQUEN
On Fri, Oct 4, 2013 at 8:54 AM, Xinliang David Li wrote:
> On Thu, Oct 3, 2013 at 10:15 PM, Teresa Johnson wrote:
>> This patch handles the fact that small profile count values sometimes
>> get truncated down to 0 during updates due to integer arithmetic. This
>> causes
_exit_edges ();
>>mark_irreducible_loops ();
>>connect_infinite_loops_to_exit ();
>> - estimate_bb_frequencies ();
>> + compute_bb_frequencies (true);
>
> Here please add the check about maximal count in functiona and commnet
> (especially for
The following patch fixes an inefficiency whereby type unification
was being attempted unnecessarily in LIPO even when there were no
aux modules.
Tested with regression tests and internal LIPO benchmark.
Ok for google/4_8?
Thanks,
Teresa
2013-10-10 Teresa Johnson
* l-ipo.c
this change.
>
> David
>
> On Thu, Oct 10, 2013 at 11:27 AM, Teresa Johnson wrote:
>> The following patch fixes an inefficiency whereby type unification
>> was being attempted unnecessarily in LIPO even when there were no
>> aux modules.
>>
>> Teste
esa
>>
>>>
>>> if (!L_IPO_COMP_MODE || (num_in_fnames == 1 && PARAM_VALUE
>>> (PARAM_LIPO_RANDOM_GROUP_SIZE) == 0))
>>>
>>> ok with this change.
>>>
>>> David
>>>
>>> On Thu, Oct 10, 2013 at 11:27 AM, Teresa J
On Mon, Oct 7, 2013 at 10:49 PM, Teresa Johnson wrote:
> On Sun, Oct 6, 2013 at 6:51 AM, Jan Hubicka wrote:
>>> The second part ensures that frequencies are correctly updated after
>>> inlining. The problem is that after inlining the frequencies were
>>> bein
On Mon, Oct 7, 2013 at 10:45 PM, Teresa Johnson wrote:
> On Sun, Oct 6, 2013 at 6:43 AM, Jan Hubicka wrote:
>>> 2013-10-03 Teresa Johnson
>>>
>>> * params.def (PARAM_MIN_HOT_RUN_RATIO): New parameter.
>>
>> I would not mention HOT in the
-10-15 Teresa Johnson
* tree-ssa-tail-merge.c (replace_block_by): Update edge
weights during merging.
Index: tree-ssa-tail-merge.c
===
--- tree-ssa-tail-merge.c (revision 203389)
+++ tree-ssa-tail-merge.c
size, and can be sure that profile is
>>> >> representative. In this mode, we will mark all functions without
>>> >> sample as "unlikely executed". However, because AutoFDO use debug info
>>> >> (of optimized code) to represent profile, it's possible that some hot
>>> >> functions (say foo) are inlined and fully eliminated into another hot
>>> >> function (say bar). So in the profile, bar is cold, and because the
>>> >> profile for foo::bar is eliminated, bar will not be inlined into foo
>>> >> before the profile annotation. However, after profile annotate, we can
>>> >> infer from the bb count that foo->bar is hot, thus it should be
>>> >> inlined in ipa-inline phase. However, because bar itself is marked
>>> >> UNLIKELY_EXECUTED, it will not be inlined.
>>> >>
>>> >> One possible workaround would be that during rebuild_cgraph_edges, if
>>> >> we find an edge's callee is unlikely executed, add the edge count to
>>> >> the callee's count and recalculate callee's frequency.
>>> >>
>>> >> Dehao
>>> >>
>>> >>>
>>> >>> Honza
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
y resides in the instrumentation based FDO path. Can
> we move the code to more common place (like rebuild_cgraph_edges)?
>
> Thanks,
> Dehao
>
> On Tue, Oct 15, 2013 at 7:59 AM, Teresa Johnson wrote:
>> On Mon, Oct 14, 2013 at 3:26 PM, Dehao Chen wrote:
>>> On Mon, Oct 14,
t;count or
cgraph_maybe_hot_edge_p(e), but AFAICT the call to
cgraph_maybe_hot_edge_p will never return true when e->count is zero.
When there is a profile it will return false via maybe_hot_count_p
since e->count == 0. When there is no profile it will return false
when the callee has NODE_FR
TED. So I think just
>> checking for e->count >0 is sufficient here.
>
> I think I was checking caller count here (that is read) and the code
> was supposed to make functoin with hot caller to be hot...
The code in your patch was just checking the edge count, not the
caller
Ping on this one since the new param would be useful in the other
profile related patch I'm working on (the one to handle the dropped
profile counts in tree-profile).
Thanks,
Teresa
On Thu, Oct 10, 2013 at 2:35 PM, Teresa Johnson wrote:
> On Mon, Oct 7, 2013 at 10:45 PM, Teresa Johnso
On Tue, Oct 15, 2013 at 3:39 PM, Teresa Johnson wrote:
> On Tue, Oct 15, 2013 at 2:57 PM, Jan Hubicka wrote:
>>> On Tue, Oct 15, 2013 at 2:03 PM, Jan Hubicka wrote:
>>> >> Yes, that would work. So let's discard this patch because the fix for
>>
.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?
>>
>> Thanks,
>> Teresa
>>
>> 2013-10-15 Teresa Johnson
>>
>> * tree-ssa-tail-merge.c (replace_block_by): Upd
Here is the patch updated to use the new parameter from r203830.
Passed bootstrap and regression tests.
2013-10-18 Jan Hubicka
Teresa Johnson
* predict.c (handle_missing_profiles): New function.
(counts_to_freqs): Don't overwrite estimated freque
occur.
Bootstrapped and tested on x86-64-unknown-linux-gnu. A
profiledbootstrap build also now succeeds. Ok for trunk?
Thanks,
Teresa
2013-10-29 Teresa Johnson
PR ipa/58862
* tree-ssa-tail-merge.c (replace_block_by): Tolerate profile
insanities when updating probabil
On Fri, Oct 18, 2013 at 2:17 PM, Jan Hubicka wrote:
>> Here is the patch updated to use the new parameter from r203830.
>> Passed bootstrap and regression tests.
>>
>> 2013-10-18 Jan Hubicka
>> Teresa Johnson
>>
>> * predict
fewer than before,
which is good news).
Martin, can you check that the binary used for
-fno-reorder-blocks-and-partition really doesn't have any splitting?
Thanks,
Teresa
>
> Honza
>>
>> Martin
>>
>> On 19 November 2013 23:18, Teresa Johnson wrote:
>> &
t;>> #ifdef L_gcov_merge_single
>>> void __gcov_merge_single (gcov_type *counters __attribute__ ((unused)),
>>> - unsigned n_counters __attribute__ ((unused))) {}
>>> + unsigned n_counters __attribute__ ((unused)),
>>> + unsigned gcov_type *src __attribute__ ((unused)),
>>> + unsigned w __attribute__ ((unused))) {}
>>> #endif
>>>
>>> #ifdef L_gcov_merge_delta
>>> void __gcov_merge_delta (gcov_type *counters __attribute__ ((unused)),
>>> - unsigned n_counters __attribute__ ((unused))) {}
>>> + unsigned n_counters __attribute__ ((unused)),
>>> + unsigned gcov_type *src __attribute__ ((unused)),
>>> + unsigned w __attribute__ ((unused))) {}
>>> #endif
>>>
>>> #else
>>>
>>> #ifdef L_gcov_merge_add
>>> /* The profile merging function that just adds the counters. It is given
>>> - an array COUNTERS of N_COUNTERS old counters and it reads the same
>>> number
>>> - of counters from the gcov file. */
>>> + an array COUNTERS of N_COUNTERS old counters.
>>> + When SRC==NULL, it reads the same number of counters from the gcov file.
>>> + Otherwise, it reads from SRC array. These read values will be multiplied
>>> + by weight W before adding to the old counters. */
>>> void
>>> -__gcov_merge_add (gcov_type *counters, unsigned n_counters)
>>> +__gcov_merge_add (gcov_type *counters, unsigned n_counters,
>>> + gcov_type *src, unsigned w)
>>> {
>>> + int in_mem = (src != 0);
>>> +
>>>for (; n_counters; counters++, n_counters--)
>>> -*counters += gcov_read_counter ();
>>> +{
>>> + gcov_type value;
>>> +
>>> + if (in_mem)
>>> +value = *(src++);
>>> + else
>>> +value = gcov_read_counter ();
>>> +
>>> + *counters += w * value;
>>> +}
>>> }
>>> #endif /* L_gcov_merge_add */
>>>
>>> #ifdef L_gcov_merge_ior
>>> /* The profile merging function that just adds the counters. It is given
>>> - an array COUNTERS of N_COUNTERS old counters and it reads the same
>>> number
>>> - of counters from the gcov file. */
>>> + an array COUNTERS of N_COUNTERS old counters.
>>> + When SRC==NULL, it reads the same number of counters from the gcov file.
>>> + Otherwise, it reads from SRC array. */
>>> void
>>> -__gcov_merge_ior (gcov_type *counters, unsigned n_counters)
>>> +__gcov_merge_ior (gcov_type *counters, unsigned n_counters,
>>> + gcov_type *src, unsigned w __attribute__ ((unused)))
>>
>> So the new in-memory variants are introduced for merging tool, while libgcc
>> use gcov_read_counter
>> interface?
>> Perhaps we can actually just duplicate the functions to avoid runtime to do
>> all the scalling
>> and in_mem tests it won't need?
>>
>> I would suggest going with libgcov.h changes and clenaups first, with
>> interface changes next
>> and the gcov-tool is probably quite obvious at the end?
>> Do you think you can split the patch this way?
>>
>> Thanks and sorry for taking long to review. I should have more time again
>> now.
>> Honza
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
tions?
Teresa
>
> Martin
>
> On 2 December 2013 17:52, Martin Liška wrote:
>> Dear Teresa,
>>I will today double check if the graphs are correct :)
>>
>> Martin
>>
>> On 2 December 2013 17:16, Jeff Law wrote:
>>> On 12/02/13 08:16,
umber
>> - of counters from the gcov file. */
>> + an array COUNTERS of N_COUNTERS old counters.
>> + When SRC==NULL, it reads the same number of counters from the gcov file.
>> + Otherwise, it reads from SRC array. These read values will be multiplied
>>
On Wed, Dec 11, 2013 at 10:05 PM, Teresa Johnson wrote:
> On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka wrote:
>>> Hi, all
>>>
>>> This is the new patch for gcov-tool (previously profile-tool).
>>>
>>> Honza: can you comment on the new merge interfa
e only defined under "#ifdef IN_LIBGCOV"
and only used in libgcov*c (or each other):
gcov_write_counter
gcov_write_tag_length
gcov_write_summary
gcov_seek
Should they all, plus gcov_rewrite, be moved to libgcov-driver.c?
Teresa
>
>
> David
>
> On Thu, Dec 12, 2013 at 12:11 P
t;> , n_opts=n_opts@entry=1290) at
>>> ../../gcc/diagnostic.c:135
>>> #2 0x0100050e in general_init (argv0=) at
>>> ../../gcc/toplev.c:1110
>>> #3 toplev_main(int, char**) () at ../../gcc/toplev.c:1922
>>> #4 0x7774cbe5 in __libc_start_main () from /lib64/libc.so.6
>>> #5 0x00f7898d in _start () at ../sysdeps/x86_64/start.S:122
>>>
>>> That is relatively early in startup process. The function seems inlined and
>>> it fails only on second invocation, did not have time to investigate
>>> further,
>>> yet while without -fprofile-use it starts...
>>>
>>> On our periodic testers I see off-noise improvement in crafty 2200->2300
>>> and regression on Vortex, 2900->2800, plus code size increase.
>>>
>>> Honza
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
ual stdio calls so one can have replacement
>> driver
>> for i.e. Linux kernel. For that reason things like gcov_seek and friends
>> probably
>> should be separated, but what is reason for splitting the file handling
>> itself?
>>
>> Honza
>>>
&g
uring
pro_and_epilogue. The fallthru was an empty block that appears to be
due to switch expansion with the case having a __builtin_unreachable().
Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?
2013-12-17 Teresa Johnson
PR gcov-profile/59527
* cfgrtl.c (fixup_reorder_
Computed goto duplication needs to fixup partition boundaries if it
performed any block duplication.
Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?
2013-12-19 Teresa Johnson
PR gcov-profile/59542
* bb-reorder.c (duplicate_computed_gotos): Invoke
On Thu, Dec 19, 2013 at 2:06 PM, Steven Bosscher wrote:
> On Thu, Dec 19, 2013 at 8:33 PM, Teresa Johnson wrote:
>> 2013-12-19 Teresa Johnson <>
>>
>> PR gcov-profile/59542
>> * bb-reorder.c (duplicate_computed_gotos): Invoke fixup_partitio
ion I came up with is to allow the profile counts to overwrite
the guessed probabilites in counts_to_freq. But then when we inline we
re-estimate the probabilities in the callee when the callsite count is
much hotter than the entry count, and then follow the same procedure
we were doing in the 0-count case (propagate the call count into the
callee bb counts via the guessed probabilities). Is there a better
solution?
Thanks,
Teresa
>
> Honza
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
On Tue, Dec 17, 2013 at 7:48 AM, Teresa Johnson wrote:
> On Mon, Dec 16, 2013 at 2:48 PM, Xinliang David Li wrote:
>> Ok -- gcov_write_counter and gcov_write_tag_length are qualified as
>> low level primitives for basic gcov format and probably should be kept
>> in gcov-io.
On Sun, Dec 22, 2013 at 10:27 AM, Jan Hubicka wrote:
>> On Tue, Dec 17, 2013 at 7:48 AM, Teresa Johnson wrote:
>> > On Mon, Dec 16, 2013 at 2:48 PM, Xinliang David Li
>> > wrote:
>> >> Ok -- gcov_write_counter and gcov_write_tag_length are qualified as
>&
On Fri, Jan 3, 2014 at 2:49 PM, Joseph S. Myers wrote:
> On Fri, 3 Jan 2014, Teresa Johnson wrote:
>
>> Index: libgcc/libgcov.h
>> ===
>> --- libgcc/libgcov.h(revision 0)
>> +++ libgcc/libgcov.h
1 - 100 of 503 matches
Mail list logo