Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-22 Thread Jan Hubicka
> 2020-01-22 Martin Liska > > PR tree-optimization/92924 > * libgcov-profiler.c (__gcov_topn_values_profiler_body): First > try to find an existing value, then find an empty slot > if not found. This looks good, one nit below. > --- > libgcc/libgcov-profiler.c | 46 +++

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-22 Thread Jan Hubicka
> On 1/22/20 12:09 PM, Martin Liška wrote: > > stats for indirect_call: > >   total: 9210 > >   invalid: 600 > >   tracked values: > > 0 values: 6280 times (68.19%) > > 1 values: 1856 times (20.15%) > > 2 values:  264 times (2.87%) > > 3 values:  157 times (1.

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-23 Thread Jan Hubicka
> So my bet is that before the patch we had bogus code. We detected the invalid > state by having the first counter value == -1, which could also be reached > by a decrement of all values (0 - 1 == -1). > > Maybe we would be interested in using a huge negative number to reflect > the invalid state? A
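The ambiguity described in the quoted text can be sketched in a few lines of C. This is only an illustration under stated assumptions: counter_t, INVALID_SENTINEL, looks_invalid and decrement_all are hypothetical names, not the libgcov ones, and the counter layout is simplified to a plain array.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the profile counter type.  */
typedef int64_t counter_t;

/* Assumed sentinel: the first counter equal to -1 marks the set invalid.  */
#define INVALID_SENTINEL ((counter_t) -1)

/* Check whether the counter set is flagged as invalid.  */
static bool
looks_invalid (const counter_t *counters)
{
  return counters[0] == INVALID_SENTINEL;
}

/* Decrement every counter, as an eviction step might do.  */
static void
decrement_all (counter_t *counters, int n)
{
  for (int i = 0; i < n; i++)
    counters[i]--;
}
```

A perfectly valid set whose first counter is 0 becomes indistinguishable from an invalid one after a single decrement, which is why a sentinel far away from the reachable range (a huge negative number) is suggested.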

Re: [PATCH] Proposal for IPA init() constant propagation

2020-01-23 Thread Jan Hubicka
> Hi. > > Thank you for the patch. I'm CCing the author of current IPA CP > and IPA maintainer. OK, happy to be in CC, but I wonder where I find the last version of the patch? https://gcc.gnu.org/bugzilla/attachment.cgi?id=47455 Honza > > Martin

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-23 Thread Jan Hubicka
commit it tomorrow. It bootstrapped x86_64-linux, regtesting is running. Still the number of invalidated counters outnumbers those successfully tracked which are not trivial (0 or 1 val). libgcc/ChangeLog: 2020-01-24 Jan Hubicka * libgcov-merge.c (merge_topn_values_set): Fix merging. diff

Re: [PATCH] Add __gcov_indirect_call_profiler_v4_atomic.

2020-01-27 Thread Jan Hubicka
> Hi. > > The patch is about missing atomic profiler function for indirect calls. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > 2020-01-27 Martin Liska > > PR gcov-profile/93403 >

ipa: fix handling of multiple speculations (PR93318)

2020-01-28 Thread Jan Hubicka
patch is mostly about getting rid of ICE and profile corruption which is a regression from GCC 9. lto-bootstrapped/regtested x86_64-linux and also tested with Firefox and hack for less conservative indirect call profile merging. Honza gcc/ChangeLog: 2020-01-28 Jan Hubicka PR lto/

profile: fix profile count dumping

2020-01-28 Thread Jan Hubicka
Hi, this patch fixes dump of quality. Committed as obvious. * profile-count.h (profile_quality_display_names): Fix ordering. diff --git a/gcc/profile-count.c b/gcc/profile-count.c index 0c792297826..c89914ff8a0 100644 --- a/gcc/profile-count.c +++ b/gcc/profile-count.c @@ -78,9 +78,9 @@ co

diagnostics: make error message lowercase.

2020-01-28 Thread Jan Hubicka
Hi, I committed this as obvious. I also wonder if the next error should be info or should be part of the first error message. * coverage.c (read_counts_file): Make error message lowercase. diff --git a/gcc/coverage.c b/gcc/coverage.c index 5e961b26f66..f29ff640c43 100644 --- a/gcc/covera

Verify sanity of indirect call/topn profiles

2020-01-28 Thread Jan Hubicka
Hi, I will try to remind about this in the next stage1 since it is not a regression fix. I found it useful to have a bit of sanity checking of the topn profiles to work out the bugs in merging and updating that were there. Honza gcc/ChangeLog: 2020-01-28 Jan Hubicka * profile.c

ipa: Fix removal of multi-target speculation

2020-01-29 Thread Jan Hubicka
Jan Hubicka * cgraph.c (cgraph_edge::resolve_speculation): Only lookup direct edge if called on indirect edge. (cgraph_edge::redirect_call_stmt_to_callee): Lookup indirect edge of speculative call if needed. gcc/testsuite/ChangeLog: 2020-01-29 Jan Hubicka

Re: ipa: Fix removal of multi-target speculation

2020-01-29 Thread Jan Hubicka
> > diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c > > b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c > > index 61612b5b628..bbba0521018 100644 > > --- a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c > > +++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c > > This

gcov: reduce code quality loss by reproducible topn merging [PR92924]

2020-01-30 Thread Jan Hubicka
-bootstrapped/regtested x86_64-linux. I am going to test this on Firefox and clang and gather updated logs. 2020-01-30 Jan Hubicka PR ipa/92924 * value-prof.c (dump_histogram_value): Update dumping. (stream_out_histogram_value): Do not check that values are positive for

Re: [PATCH] correct COUNT and PROB for unrolled loop

2020-02-03 Thread Jan Hubicka
> On Mon, 2020-02-03 at 10:04 -0600, Pat Haugen wrote: > > On 2/3/20 2:17 AM, Jiufu Guo wrote: > > > +/* { dg-final { scan-rtl-dump-times "REG_BR_PROB 937042044" 1 > > > "loop2_unroll"} } */ > > > > Sorry I didn't catch this addition to the original testcase earlier, but I > > wonder how stable

Re: [PATCH] Do not load body for alias symbols.

2020-02-04 Thread Jan Hubicka
> Hi. > > The patch is about not loading of LTO bodies for symbols > that are only aliases. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/lto/ChangeLog: > > 2020-02-04 Martin Liska > > PR lto/93489 >

Re: [PATCH RFA] cgraph: A COMDAT decl always has non-zero address.

2020-02-05 Thread Jan Hubicka
> We should be able to assume that a template instantiation or other COMDAT > has non-zero address even if MAKE_DECL_ONE_ONLY for the target sets > DECL_WEAK and we haven't yet decided to emit a definition in this > translation unit. > > Tested x86_64-pc-linux-gnu, OK for trunk? > > PR c++/

Re: [PATCH] Revert mangling of names with -fprofile-generate=.

2020-02-06 Thread Jan Hubicka
> Hi. > > The patch reverts mangling of filenames due to file > length limitation. Creation of a folder tree seems fine > in context of PGO. > > Ready for master? > Thanks, > Martin > > gcc/ChangeLog: > > 2020-02-06 Martin Liska > > PR gcov-profile/91971 > PR gcov-profile/93466

Re: [PATCH] Revert mangling of names with -fprofile-generate=.

2020-02-06 Thread Jan Hubicka
> On 2/6/20 2:26 PM, Jan Hubicka wrote: > > > Hi. > > > > > > The patch reverts mangling of filenames due to file > > > length limitation. Creation of a folder tree seems fine > > > in context of PGO. > > > > > > Ready

Re: gcov: reduce code quality loss by reproducible topn merging [PR92924]

2020-02-13 Thread Jan Hubicka
> On 1/30/20 5:13 PM, Jan Hubicka wrote: > > Martin, I welcome your opinion on the patch > > Ok, recently we've made quite some experiments with Honza about > TOP N counters. Based on the numbers, it shows that tracking a negative > value per-counter value does not help

Add -fpartial-profile-training

2019-12-04 Thread Jan Hubicka
Hi, with recent fixes to profile updating I noticed that we get more regressions compared to gcc 9 in Firefox testing. This is because the Firefox train run does not cover all the benchmarks and gcc 9, thanks to updating bugs, sometimes optimizes code for speed even if it was not trained. While in gener

Fix g++.dg/torture/pr59226.C

2019-12-05 Thread Jan Hubicka
Hi, this patch fixes ICE in g++.dg/torture/pr59226.C which was triggered by new comdat_local sanity check. What happens here is that function gets inlined into its own thunk which makes it !comdat_local_p but the updating code does not notice since thunk calls comdat local alias of the function it

Fix profile updatin in tree-ssa-threadupdate

2019-12-05 Thread Jan Hubicka
Hi, this patch makes tree-ssa-threadupdate to not leave basic blocks with undefined counts in the program. create_block_for_threading sets counts as follows: /* Zero out the profile, since the block is unreachable for now. */ rd->dup_blocks[count]->count = profile_count::uninitialized ();

Re: [PATCH] Come up with constructors of symtab_node, cgraph_node and varpool_node.

2019-12-05 Thread Jan Hubicka
> Hi. > > As mentioned in the PR, there are classes in cgraph.h that are > not PODs and are initialized with ggc_alloc_cleared. So that I'm suggesting > to use proper constructors. I added ggc_new function that can be used > at different locations as well. > > I'm attaching optimized dump file wi

Re: [PATCH] Come up with constructors of symtab_node, cgraph_node and varpool_node.

2019-12-05 Thread Jan Hubicka
> On December 5, 2019 2:03:58 PM GMT+01:00, "Martin Liška" > wrote: > >On 12/5/19 1:59 PM, Richard Biener wrote: > >> Isn't there ggc_alloc for this? Also ggc_alloc_no_dtor in > >case you > >> want to handle finalization yourself. > > > >No, if I see correctly it only calls Dtor: > > But its o

Fix flag_toplevel_reorder wrt optimize attribute

2019-12-08 Thread Jan Hubicka
Hi, this is (very likely partial) fix for PR92860 where optimize attribute incorrectly rewrites flag_toplevel_reorder which we handle as global flag. In fact we support per-symbol toplevel reordering for a while so we only need to add the annotation. However we need to inspect all other flags impl

Silence overactive sanity check with -fpartial-profile-training

2019-12-08 Thread Jan Hubicka
Hi, do_estimate_edge_time tests that cached and real values match. This test is not working precisely for global profiles because of roundoff issues when profile of clones is subtracted from profile of offline body. This is checked by presence of ipa counter. This breaks with partial profile tra

Fix tp_first_run update in split_function

2019-12-08 Thread Jan Hubicka
Hi, the value 0 in tp_first_run is special, meaning that the profile is unknown. We should not set it to 1. Bootstrapped/regtested x86_64-linux, committed. * ipa-split.c (split_function): Preserve 0 tp_first_run. Index: ipa-split.c ===

Fix overflows in -fprofile-reorder-functions

2019-12-08 Thread Jan Hubicka
Hi, this patch fixes three issues with -fprofile-reorder-functions: 1) First is that tp_first_run is stored as a 32bit integer while it can easily overflow (and does so during Firefox profiling). 2) Second problem is that flag_profile_functions cannot be tested w/o function context. The ch

Re: Fix overflows in -fprofile-reorder-functions

2019-12-08 Thread Jan Hubicka
> Hi, > this patch fixes three issues with -fprofile-reorder-functions: > 1) First is that tp_first_run is stored as a 32bit integer while it can easily > overflow (and does so during Firefox profiling). Actually the overflow problem is possible only with mismatched profiles (which does happen f

Re: Fix overflows in -fprofile-reorder-functions

2019-12-08 Thread Jan Hubicka
> On Sun, 8 Dec 2019, Jan Hubicka wrote: > > > Other explanation would be that our new qsort with broken comparator due to > > overflow can actually remove some entries in the array, but that sounds a bit > > crazy. > > gcc_qsort only reorders elements, making it po

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-09 Thread Jan Hubicka
> On 12/9/19 1:14 PM, Martin Liška wrote: > > Hello. > > > > Based on presentation that had Sriraman Tallam at a LLVM conference: > > https://www.youtube.com/watch?v=DySuXFGmB40 > > > > I made a heatmap based on executed instruction addresses. I used > > $ perf record -F max -- ./cc1plus -fprepro

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-09 Thread Jan Hubicka
> > On the first glance the difference between gcc9 and gcc10 is explained > > by the changes to profile updating. gcc9 makes very small cold > > partitions compared to gcc10. It is very nice that we have a way to > > measure it. I will also check if some of the more important profiling > > update

Re: [PATCH] Fix typos in 2 functions.

2019-12-09 Thread Jan Hubicka
> Hi. > > I'm sending fix for 2 locations where we have a typo. > Second hunk is pre-approved by Rich, first one needs to be approved > by Honza? > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > 2019

Re: [RFC] ipa-cp: Fix PGO regression caused by r278808

2019-12-10 Thread Jan Hubicka
Hi, I think the updating should treat self recursive edges as loops: that is, calculate SUM of counts of incoming edges which are not self recursive, calculate the probability of self recursing and then set count as SUM * (1/(1-probability_of_recursion)). We do the same thing when computing counts within loop
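The formula above is the usual geometric-series scaling for a loop: if each invocation recurses with probability p, every outside entry accounts for 1 + p + p^2 + ... = 1/(1-p) executions. A minimal sketch in C, assuming plain doubles rather than GCC's profile_count type; the function name count_with_self_recursion is hypothetical:

```c
#include <assert.h>

/* Scale the sum of counts coming from non-recursive callers by the
   expected number of executions per outside entry, 1 / (1 - p), where
   p is the probability that an invocation calls itself again.
   Precondition: 0 <= p < 1 (p == 1 would mean endless recursion).  */
static double
count_with_self_recursion (double sum_nonrecursive, double p_recursion)
{
  assert (p_recursion >= 0.0 && p_recursion < 1.0);
  return sum_nonrecursive * (1.0 / (1.0 - p_recursion));
}
```

For example, 100 outside calls with a recursion probability of 0.5 give a count of 200, and with probability 0.75 a count of 400.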

Re: [PATCH] Refactor IPA devirt a bit.

2019-12-10 Thread Jan Hubicka
> > > > 2019-12-09 Richard Sandiford > > > > gcc/ > > * ipa-utils.h (get_odr_name_for_type): Check for a TYPE_DECL. > > * ipa-devirt.c (warn_types_mismatch): Don't call xstrdup for the > > second demangled name. > > > > gcc/testsuite/ > > * gcc.dg/lto/tag-1_0.c, gcc.dg/lto/tag

Re: [RFC] ipa-cp: Fix PGO regression caused by r278808

2019-12-10 Thread Jan Hubicka
> Hi, > > On Tue, Dec 10 2019, Jan Hubicka wrote: > > Hi, > > I think the updating should treat self recursive edges as loops: that is > > calculate SUM of counts of incoming edges which are not self recursive, > > calculate probability of self recursing and

Re: Fix overflows in -fprofile-reorder-functions

2019-12-10 Thread Jan Hubicka
> On Sun, 8 Dec 2019, Jan Hubicka wrote: > > > Other explanation would be that our new qsort with broken comparator due to > > overflow can actually remove some entries in the array, but that sounds a bit > > crazy. > > gcc_qsort only reorders elements, making it po

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-10 Thread Jan Hubicka
> * Makefile.in: Add ipa-reorder.o. > * cgraph.c (cgraph_node::dump): Dump > text_sorted_order. > (cgraph_node_cmp_by_text_sorted): > New function that sorts functions based > on text_sorted_order. > * cgraph.h (cgraph_node): Add text_sorted_order. >

Fix hot/startup partitioning with LTO

2019-12-10 Thread Jan Hubicka
Hi, as noticed by Martin's new heatmaps default_function_sections disables text.unlikely and text.startup sections during LTO. This was meant to be only with tp_first_run profiling. The hot section is still useful since it does "poor man's clustering" so I kept only the test to disable startup sub

Re: Fix overflows in -fprofile-reorder-functions

2019-12-10 Thread Jan Hubicka
> On Tue, 10 Dec 2019, Jan Hubicka wrote: > > > > I would recommend to make these variables uint64_t, then you can simply do > > > > > > tp_first_run_a--; > > > tp_first_run_b--; > > > > > > making 0 wrap around to UINT64_MAX. Then
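The wrap-around trick from the quoted suggestion can be sketched as a comparison helper. This assumes, as stated elsewhere in this thread listing, that 0 in tp_first_run means "unknown"; the helper name cmp_first_run is hypothetical and the real comparator in GCC may differ.

```c
#include <stdint.h>

/* Compare two first-run timestamps where 0 means "unknown".
   Decrementing an unsigned 64-bit value makes 0 wrap around to
   UINT64_MAX, so unknown entries sort after every known one, and the
   comparison itself never overflows.  */
static int
cmp_first_run (uint64_t tp_first_run_a, uint64_t tp_first_run_b)
{
  tp_first_run_a--;   /* 0 becomes UINT64_MAX.  */
  tp_first_run_b--;
  return (tp_first_run_a > tp_first_run_b)
	 - (tp_first_run_a < tp_first_run_b);
}
```

Returning the difference of two comparisons instead of subtracting the values keeps the result a valid total order for a qsort-style comparator, which a truncating subtraction does not guarantee.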

Fix function profile computation

2019-12-10 Thread Jan Hubicka
Hi, this patch fixes compute_function_frequency to work well with profiles that were downgraded from IPA to local profiles. Bootstrapped/regtested x86_64-linux, committed. honza * predict.c (compute_function_frequency): Check for presence of IPA profile. Index: predict.c

Fix ipa-cp bit propagation streaming

2019-12-12 Thread Jan Hubicka
Hi, this rather nasty bug makes value and mask get exchanged during streaming. This makes us sometimes set bogus pointer alignments and causes misoptimization of Firefox when built with GCC 9. Committed as obvious. I will backport it to release branches soon - it is quite a dangerous bug. Hon

Re: [PATCH] Fix x86_64 va_arg (ap, __int128) handling (PR target/92904)

2019-12-12 Thread Jan Hubicka
> Hi! > > As the following testcase shows, for va_arg (ap, __int128) or va_arg (ap, X) > where X is __attribute__((aligned (16))) 16 byte struct containing just > integral fields we sometimes emit incorrect read. While on the stack (i.e. > the overflow area) both of these are 16-byte aligned, whe

Fix merging of common target infos

2019-12-13 Thread Jan Hubicka
Hi, while looking into Firefox regression compared to gcc9 I noticed that we often confuse common target infos when profiles get merged. This patch adds the missing update bits. Honza * ipa-utils.c (ipa_merge_profiles): Improve dumping; merge common targets. Index: ipa-utils.c =

Re: [PATCH] Fix symver attribute with LTO

2019-12-17 Thread Jan Hubicka
> Hi, > with Jan's patch committed in r278878 we can use symver attribute for functions > and variables. The symver attribute is designed for replacing toplevel asm > statements containing ".symver" which may be removed by LTO. Unfortunately, > a quick test showed GCC still generates buggy so file

Fix partitioning ICE with external comdats

2019-12-17 Thread Jan Hubicka
Hi, while hacking firefox to work around ABI compatibility issues with LLVM I ran into an ICE where comdat group was resolved externally but contains a static alias (for thunk). In this case the partitioner attempts to put that static definition into a partition which triggers an ICE. Bootstrapped/regt

Re: [PATCH] Fix symver attribute with LTO

2019-12-17 Thread Jan Hubicka
> Hi Jan, > > I'm using GNU ld 2.33.1. > > I'll attach a testcase simplified from fuse-3.9 code. "local: *;" in the > versioning script triggers the issue. Without it there would be no problem. Thanks. You are right that I did not play with local:. Now I wonder what is the intended behaviour h

Re: [PATCH] Fix symver attribute with LTO

2019-12-17 Thread Jan Hubicka
> Would it be equivalent to: > 1) output foo_v2 local > 2) producing static alias with local name (.L1) > 3) do .symver .L1,foo@@@VERS_2 > That is somewhat more systematic and would not lead to false > visibilities. I spent some time playing with this. And in order to 1) be able to handle foo_v2

Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Jan Hubicka
Hi, sorry I forgot to include cgraph and varpool changes in the patch. Index: varpool.c === --- varpool.c (revision 279467) +++ varpool.c (working copy) @@ -539,8 +539,7 @@ varpool_node::assemble_aliases (void) { varpo

Pass ipa-bit-cp info to tree-ssa-ccp

2019-12-18 Thread Jan Hubicka
Hi, while hunting the streaming bug of ipa-bit-cp which exchanged value and mask while streaming to ltrans I noticed that this bug had almost no effect because we almost always throw away the relevant info. This patch makes tree-ssa-ccp use results of ipa-bit-cp so the value is actually used.

Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Jan Hubicka
> ICE here. > > lto1: internal compiler error: tree check: expected identifier_node, have > function_decl in ultimate_transparent_alias_target, at varasm.c:1308 > 0x6f9cfe tree_check_failed(tree_node const*, char const*, int, char const*, > ...) > ../../gcc/gcc/tree.c:9685 > 0x714541 tree_c

Re: [PATCH] IPA-CP: Remove bogus static keyword (PR 92971)

2019-12-18 Thread Jan Hubicka
> > the leak is indeed a problem, thanks for spotting it. But apart from > that, I really wanted to pass vNULL to intersect_aggregates_with_edge, > and the patch below does it explicitly to make it clear, because while > the function can do also intersecting its actual task here is to carry > ov

Re: [PATCH] [RFC] ipa: duplicate ipa_size_summary for cloned nodes

2019-12-18 Thread Jan Hubicka
> The size_info of ipa_size_summary are created by r277424. It should be > duplicated for cloned nodes, otherwise self_size and estimated_self_stack_size > would be 0, causing param large-function-insns and large-function-growth > working > inaccurate when ipa-inline. > > gcc/ChangeLog: > >

Re: [PATCH] [RFC] ipa: duplicate ipa_size_summary for cloned nodes

2019-12-18 Thread Jan Hubicka
> The size_info of ipa_size_summary are created by r277424. It should be > duplicated for cloned nodes, otherwise self_size and estimated_self_stack_size > would be 0, causing param large-function-insns and large-function-growth > working > inaccurate when ipa-inline. > > gcc/ChangeLog: > >

Re: [PATCH] Handle aggregate pass-through for self-recursive call (PR ipa/92794)

2019-12-18 Thread Jan Hubicka
> Hi, > > On Tue, Dec 17 2019, Feng Xue OS wrote: > > If argument for a self-recursive call is a simple pass-through, the call > > edge is also considered as source of any value originated from > > non-recursive call to the function. Scalar pass-through and full aggregate > > pass-through due to p

Re: [PATCH] Fix symver attribute with LTO

2019-12-19 Thread Jan Hubicka
> On 2019-12-18 14:19 +0100, Jan Hubicka wrote: > > The problem here is that we lie to the compiler (by pretending that > > foo_v2 is exported from DSO while it is not) and force user to do the > > same. > > > > We support two ways to hide symbol - either

Re: [PATCH] Fix symver attribute with LTO

2019-12-19 Thread Jan Hubicka
> On 2019-12-19 19:12 +0800, Xi Ruoyao wrote: > > On 2019-12-19 11:06 +0100, Jan Hubicka wrote: > > > - /* See if we have linker information about symbol not being used or > > > - if we need to make guess based on the declaration. > > > + /* Limitation of

Re: [PATCH] [RFC] ipa: duplicate ipa_size_summary for cloned nodes

2019-12-19 Thread Jan Hubicka
> On 2019/12/18 23:48, Jan Hubicka wrote: > >> The size_info of ipa_size_summary are created by r277424. It should be > >> duplicated for cloned nodes, otherwise self_size and > >> estimated_self_stack_size > >> would be 0, causing param large-f

Re: *Ping* Re: [PATCH v6] Missed function specialization + partial devirtualization

2019-12-19 Thread Jan Hubicka
> > gcc/ChangeLog > > > > 2019-12-02 Xiong Hu Luo > > > > PR ipa/69678 > > * Makefile.in (GTFILES): Add ipa-profile.c. > > * cgraph.c (symbol_table::create_edge): Init speculative_id. > > (cgraph_edge::make_speculative): Add param for setting speculative_id. > > (cgraph

Re: Patch ping (was Re: [PATCH] Oprimize stack_protect_set_1_ followed by a move to the same register (PR target/92841))

2019-12-19 Thread Jan Hubicka
> Hi! > > I'd like to ping this patch (with the sizeof (c) -> sizeof (c) / sizeof (c[0]) > testsuite fix Andreas pointed out). > > Thanks! > > On Tue, Dec 10, 2019 at 10:57:35AM +0100, Jakub Jelinek wrote: > > 2019-12-10 Jakub Jelinek > > > > PR target/92841 > > * config/i386/i386.md

Re: Patch ping (was Re: [PATCH] Oprimize stack_protect_set_1_ followed by a move to the same register (PR target/92841))

2019-12-19 Thread Jan Hubicka
> On Thu, Dec 19, 2019 at 04:01:31PM +0100, Jan Hubicka wrote: > > I now get build failure of Firefox with LTO due to: > > > > movabsq $.LC12, %rax > > > > which is output by: > > > > (insn:TI 468 3 849 2 (parallel [

Re: [PATCH] Avoid segfault when doing IPA-VRP but not IPA-CP (PR 93015)

2019-12-21 Thread Jan Hubicka
> Hi, > > PR 93015 testcase - an empty main function compiled with -O0 -fipa-vrp > -flto - shows that IPA-VRP can segfault when trying to access results of > an analysis that has not been performed because of zero optimization > level, -fno-ipa-cp etc. > > Rather than adding another chain of opt_

[wwwdocs] Add GCC10 IPA/LTO changes

2019-12-30 Thread Jan Hubicka
Hi, here are some of changes of LTO/IPA done in GCC10. There is also recursive cloning and some other stuff I will add incrementally as well as some data on overall compile time/memory use improvements as we reported in past years. I am still running tests and fixing bugs in this area. Honza dif

Fix ICE while updating inliner summary

2020-01-01 Thread Jan Hubicka
Hi, this patch fixes an ICE seen when LTO optimizing clang. Bootstrapped/regtested x86_64-linux. * ipa.c (walk_polymorphic_call_targets): Fix updating of overall summary. Index: ipa.c === --- ipa.c (revision 279724)

Re: [PATCH] Fix up -Wsuggest-attribute=cold (PR ipa/93087)

2020-01-01 Thread Jan Hubicka
> Hi! > > While for other -Wsuggest-attribute= cases we only warn if the corresponding > attribute is not present on the current_function_decl, enforced in the > callers of warn_function_*, for the cold attribute warn_function_cold is > called in two places in compute_function_frequency, and in th

Re: [PATCH v2] ipa-cp: Fix PGO regression caused by r278808

2020-01-02 Thread Jan Hubicka
> v2 Changes: > 1. Enable proportion orig_sum to the new nodes for self recursive node: >new_sum = (orig_sum + new_sum) \ >* self_recursive_probability * (1 / param_ipa_cp_max_recursive_depth). > 2. Add value range for param_ipa_cp_max_recursive_depth. > > The performance of exchange2 buil

Re: [PATCH] Make cgraph_edge::resolve_speculation static

2020-01-06 Thread Jan Hubicka
> Hi, > > throughout this year a few of us got burnt by the fact that > cgraph_edge::resolve_speculation method sometimes removed and > deallocated its this pointer, sometimes making the this pointer of a few > other methods of the class also suddenly invalid. > > We postponed dealing with the is

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-06 Thread Jan Hubicka
> On Mon, 2020-01-06 at 15:08 +0100, Martin Liška wrote: > > Hi. > > > > As Honza noticed in the PR, we are quite strict about TOP N > > counter invalidation due to multiple values that can't > > fit in a counter. We do it in order to have reproducible > > builds. I guess we should do a comprom

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-06 Thread Jan Hubicka
> > > OK > > Actually I am not so sure about this patch - how do we ensure > > reproducibility in this case? > ISTM that anyone trying to have reproducible builds shouldn't be using > PGO based optimizations. OpenSUSE does that. Builds are supposed to be reproducible + PGO is used for number of co

Re: [PATCH] ipa-inline: Adjust condition for caller_growth_limits

2020-01-07 Thread Jan Hubicka
> On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote: > > Inline should return failure either (newsize > param_large_function_insns) > > OR (newsize > limit). Sometimes newsize is larger than > > param_large_function_insns, but smaller than limit, inline doesn't return > > failure even if the n

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-07 Thread Jan Hubicka
> On 1/6/20 8:03 PM, Jan Hubicka wrote: > > > > > OK > > > > Actually I am not so sure about this patch - how do we ensure > > > > reproducibility in this case? > > > ISTM that anyone trying to have reproducible builds shouldn't be using

Re: [PATCH] Make warn_inline Optimization option.

2020-01-07 Thread Jan Hubicka
> Err - Optimization also lists it in some -help section? It's a Warning > option and certainly we don't handle per-function Warnings in general > (with LTO) even though we have #pragma GCC diagnostic, no? > > I'm not sure why we force warn_inline to zero with -O0, it seems much > better to guard

Re: [PATCH] ipa-inline: Adjust condition for caller_growth_limits

2020-01-07 Thread Jan Hubicka
> > > On 2020/1/7 16:40, Jan Hubicka wrote: > >> On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote: > >>> Inline should return failure either (newsize > param_large_function_insns) > >>> OR (newsize > limit). Sometimes newsize is larger th

Re: [PATCH] Make warn_inline Optimization option.

2020-01-07 Thread Jan Hubicka
> On Tue, Jan 7, 2020 at 3:26 PM Jan Hubicka wrote: > > > > > Err - Optimization also lists it in some -help section? It's a Warning > > > option and certainly we don't handle per-function Warnings in general > > > (with LTO) even though we have #

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-07 Thread Jan Hubicka
> On 1/7/20 11:08 AM, Jan Hubicka wrote: > > > On 1/6/20 8:03 PM, Jan Hubicka wrote: > > > > > > > OK > > > > > > Actually I am not so sure about this patch - how do we ensure > > > > > > reproducibility in this case? > >

Re: [PATCH] ipa-inline: Adjust condition for caller_growth_limits

2020-01-08 Thread Jan Hubicka
> > Thanks. So caller could be {hot, cold} + {large, small}, same for callee. > It may > produce up to 4 * 4 = 16 combinations. Agree that hard to define useful, > and useful really doesn't reflect performance improvements certainly. :) > > My case is A1(1) calls A2(2), A2(2) calls A3(3). A1

Re: [PATCH] Use dump_asm_name for Callers/Calls in dump.

2020-01-08 Thread Jan Hubicka
> On 1/7/20 11:27 AM, Martin Liška wrote: > > Which is fine. Apparently there are just few usages of manual printing > > of a symtab node and order like: > > > >   fprintf (f, > >    "%*s%s/%i %s\n%*s  freq:%4.2f", > >    indent, "", callee->name (), callee->order, > > > >

Re: [PATCH] Make warn_inline Optimization option.

2020-01-08 Thread Jan Hubicka
> > > Given all warning options can be enabled/disabled via #pragma GCC > > > diagnostic > > > all Warning annotated options should be implicitly 'Optimization' for > > > the purpose > > > of LTO streaming then? > > > > Well, perhaps they can be marked but for late optimizations this does > > not

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-08 Thread Jan Hubicka
Hi, Just to explain better what I am worried about. The overall sum of counters in TOPN does not have a very good meaning if you have more than N targets. Let's for simplicity assume that we have TOPN for N=1 (i.e. old code). It guarantees that if target X is taken more than 50% of the time, it will win,
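The 50% guarantee for N=1 is the property of a majority-vote counter. A minimal sketch in C, assuming a single (value, count) pair; the names struct top1, top1_update and top1_winner are hypothetical and the layout of the real libgcov counters may differ:

```c
#include <stddef.h>
#include <stdint.h>

/* One tracked value and its running vote count.  */
struct top1
{
  int64_t value;
  int64_t count;
};

/* Record one observed value: a matching value votes up, a different
   value votes down, and an empty slot adopts the new value.  */
static void
top1_update (struct top1 *t, int64_t v)
{
  if (t->count == 0)
    {
      t->value = v;
      t->count = 1;
    }
  else if (t->value == v)
    t->count++;
  else
    t->count--;
}

/* Feed a sequence of values and return the value left in the slot.  */
static int64_t
top1_winner (const int64_t *values, size_t n)
{
  struct top1 t = { 0, 0 };
  for (size_t i = 0; i < n; i++)
    top1_update (&t, values[i]);
  return t.value;
}
```

A value seen in more than half of the updates always ends up in the slot, because each of its votes can be cancelled by at most one vote for something else. Without such a majority the surviving value depends on the order of updates, so the final count alone says little about how hot the target really is.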

Re: [PATCH] Make warn_inline Optimization option.

2020-01-08 Thread Jan Hubicka
> Hmm, indeed. Well, I believe we use the 'Optimization' flag for other purposes > than only triggering LTO streaming and option save/restore, so we need another > flag that only triggers save/restore then (and also allow us to avoid > dropping the > flag at lto-option streaming time where we curre

Re: [PATCH] Make warn_inline Optimization option.

2020-01-08 Thread Jan Hubicka
On unrelated note, looking what we print with --verbose -v The following options are specific to just the language LTO: -flinker-output= Set linker output type (used internally during LTO optimization). -fltrans Run the link-time optimizer in local transformatio

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-08 Thread Jan Hubicka
> On 1/8/20 11:35 AM, Jan Hubicka wrote: > > Hi, > > Just to explain better what I am worried about. The overall sum of > > counters in TOPN does not have a very good meaning if you have more than N > > targets. > > > > Let's for simplicity assume that w

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-08 Thread Jan Hubicka
> > > > > I would still prefer invalidation before streaming (which is fully > > deterministic) and possibly have option > > Do you mean __gcov_merge_topn? I suggest we do the following: - have non-deterministic and deterministic version of TOPN counter and a flag choosing between determi

Re: [PATCH] Use cgraph_node::dump_{asm_},name where possible.

2020-01-08 Thread Jan Hubicka
> Hi. > > The patch consistent usage of cgraph_node::dump_{asm_,}name where possible. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? OK, thanks! Not all dump_name/dump_asm_name choices are fully logical, but I see it is coming from name/as

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi, > > On Fri, Jan 03 2020, Martin Liška wrote: > > Hi. > > > > This is similar transformation for IPA passes. This time, > > one needs to use opt_for_fn in order to get the right > > parameter values. > > > > @Martin, Honza: > > There are last few remaining parameters which should use > > opt_

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi. > > This is similar transformation for IPA passes. This time, > one needs to use opt_for_fn in order to get the right > parameter values. > > @Martin, Honza: > There are last few remaining parameters which should use > opt_for_fn: > > param_ipa_max_agg_items > param_ipa_cp_unit_growth > pa

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi, > > On Fri, Jan 03 2020, Martin Liška wrote: > > Hi. > > > > This is similar transformation for IPA passes. This time, > > one needs to use opt_for_fn in order to get the right > > parameter values. > > > > @Martin, Honza: > > There are last few remaining parameters which should use > > opt_

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi, > > On Fri, Jan 03 2020, Martin Liška wrote: > > Hi. > > > > This is similar transformation for IPA passes. This time, > > one needs to use opt_for_fn in order to get the right > > parameter values. > > > > @Martin, Honza: > > There are last few remaining parameters which should use > > opt_

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi, > > On Wed, Jan 08 2020, Jan Hubicka wrote: > >> Hi, > >> > >> On Fri, Jan 03 2020, Martin Liška wrote: > >> > Hi. > >> > > >> > This is similar transformation for IPA passes. This time, > >>

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi, > > On Fri, Jan 03 2020, Martin Liška wrote: > > Hi. > > > > This is similar transformation for IPA passes. This time, > > one needs to use opt_for_fn in order to get the right > > parameter values. > > > > @Martin, Honza: > > There are last few remaining parameters which should use > > opt_

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-09 Thread Jan Hubicka
> On 1/8/20 3:05 PM, Jan Hubicka wrote: > > > > > > > > > > > I would still prefer invalidation before streaming (which is fully > > > > deterministic) and possibly have option > > > > > > Do you mean __gcov_merge_t

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-09 Thread Jan Hubicka
> 2020-01-03 Martin Liska > > * auto-profile.c (auto_profile): Use opt_for_fn > for a parameter. > * ipa-cp.c (ipcp_lattice::add_value): Likewise. > (propagate_vals_across_arith_jfunc): Likewise. > (hint_time_bonus): Likewise. > (incorporate_penalties): Likew

Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

2021-05-20 Thread Jan Hubicka
> On Thu, May 20, 2021 at 3:16 PM Richard Biener > wrote: > > > > On Thu, May 20, 2021 at 3:06 PM Martin Liška wrote: > > > > > > On 5/20/21 2:54 PM, Richard Biener wrote: > > > > So why did you go from applying this per-file to multiple files? > > > > > > When I did per-file for {gimple,generic}

Re: [PATCH 4/4] ipa-cp: Select saner profile count to base heuristics on

2021-10-06 Thread Jan Hubicka
> 2021-08-23 Martin Jambor > > * params.opt (param_ipa_cp_profile_count_base): New parameter. > * ipa-cp.c (max_count): Replace with base_count, replace all > occurrences too, unless otherwise stated. > (ipcp_cloning_candidate_p): identify mostly-directly called >

Re: [PATCH 2/4] ipa-cp: Propagation boost for recursion generated values

2021-10-06 Thread Jan Hubicka
> Recursive call graph edges, even when they are hot and important for > the compiled program, can never have frequency bigger than one, even > when the actual time savings in the next recursion call are not > realized just once but depend on the depth of recursion. The current > IPA-CP effect pro

Fix ipa-modref ICE

2021-10-07 Thread Jan Hubicka
Hi, this patch fixes an omitted case in contains_p which later triggers a sanity check since merging is not symmetric. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: 2021-10-07 Jan Hubicka PR ipa/102581 * ipa-modref-tree.h (modref_access_node::contains_p

Re: [PATCH 2/4] ipa-cp: Propagation boost for recursion generated values

2021-10-07 Thread Jan Hubicka
> Hi, > > > > If you boost every self fed value by factor of 6, I wonder how quickly > > we run into exponential explosion of the cost (since the frequency > > should be close to 1 and 6^9=10077696 > > The factor of six is applied once for an entire SCC, so we'd reach this > huge number only i
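A rough illustration of the arithmetic in the exchange above, not code from the patch. It contrasts a boost compounded once per recursion level (the scenario the question worries about) with a boost applied once for an entire SCC (what the reply describes). The helper names compounded_boost and single_boost are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the multiplier that would result if a boost FACTOR
   were applied again at each of DEPTH recursion levels.  */
static uint64_t
compounded_boost (uint64_t factor, unsigned depth)
{
  uint64_t result = 1;
  for (unsigned i = 0; i < depth; i++)
    {
      /* Refuse to overflow silently; this is only an illustration.  */
      assert (factor == 0 || result <= UINT64_MAX / factor);
      result *= factor;
    }
  return result;
}

/* Hypothetical helper: the multiplier when FACTOR is applied exactly once
   for a whole SCC, regardless of recursion depth.  */
static uint64_t
single_boost (uint64_t factor)
{
  return factor;
}
```

With a factor of 6 and a depth of 9, the compounded variant yields the 10077696 quoted in the question, while the once-per-SCC variant stays at 6.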

Rewrite PTA constraint generation for function calls

2021-10-08 Thread Jan Hubicka
Hi, this patch commonizes the three paths to produce constraints for function calls and makes it more flexible, so we can implement new features more easily. The main idea is to not special-case pure and const since we can now describe all of pure/const via their EAF flags (implicit_const_eaf_flags and

Re: [PATCH 3/4] ipa-cp: Fix updating of profile counts and self-gen value evaluation

2021-10-08 Thread Jan Hubicka
> For non-local nodes which can have unknown callers, the algorithm just > takes half of the counts - we may decide that taking just a third or > some other portion is more reasonable, but I do not think we can > attempt anything more clever. Can't you just sum the calling edges and subtract it fr

Re: [PATCH] ipa-sra: Fix thinko when overriding safe_to_import_accesses (PR 101066)

2021-07-08 Thread Jan Hubicka
Hi, > 2021-06-16 Martin Jambor > > PR ipa/101066 > * ipa-sra.c (class isra_call_summary): New member > m_before_any_store, initialize it in the constructor. > (isra_call_summary::dump): Dump the new field. > (ipa_sra_call_summaries::duplicate): Copy it. > (pr
