[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

2017-11-27 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616 --- Comment #15 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616 > > --- Comment #14 from Andrew Roberts --- > It would be nice if znver1 for -march and -mtune could be improved before the > gcc 8 release. At present -m

[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

2017-11-28 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616 --- Comment #21 from Jan Hubicka --- Hi, this is comparing SPEC2000 -Ofast -march=native -mprefer-vector-width=128 to -Ofast -march=native -mprefer-vector-width=256 on Ryzen. 168.wupwise 1600 28.2 5669* 1600 30.8 518

[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

2017-11-28 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616 --- Comment #22 from Jan Hubicka --- Hi, this is same base (so you can see there is some noise) compared to haswell tuning 164.gzip 1400 57.1 2452* 1400 58.7 2384* 175.vpr 1400 37.1 3776*

[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

2017-11-28 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616 --- Comment #25 from Jan Hubicka --- Hi, I agree that the matrix multiplication fma issue is important and hopefully it will be fixed for GCC 8. See https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00437.html The irregularity of tune/arch is proba

[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

2017-11-28 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616 --- Comment #26 from Jan Hubicka --- On your matrix benchmarks I get: Vector inside of loop cost: 44 Vector prologue cost: 12 Vector epilogue cost: 0 Scalar iteration cost: 40 Scalar outside cost: 0 Vector outside cost: 12 prologue

[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

2017-11-28 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616 --- Comment #27 from Jan Hubicka --- Hi, one of the problems here is the use of the vgather instruction. It is hardly a win on the Zen architecture. It is also on my TODO to adjust the code model to disable it for most loops. I only want to benchmark if it is a

[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

2017-11-29 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616 --- Comment #30 from Jan Hubicka --- Sorry, with -mno-avx2 I was speaking of the other mt benchmark. There is no need for gathers in matrix multiplication... Honza

[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

2017-11-29 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616 --- Comment #34 from Jan Hubicka --- > So gcc loses on mt19937ar.c without -mno-avx2 > But gcc wins big on matrix.c, especially with -mprefer-vector-width=none > -mno-fma It is because llvm does not use vgather at all unless avx512 is present.

[Bug lto/77472] __attribute__((flatten)) when used with -flto can lead to extreme number of inlined functions

2016-09-04 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77472 --- Comment #7 from Jan Hubicka --- > I'm not sure what is the best way forward. > Maybe gcc should ignore __attribute__((flatten)) when using LTO > unconditionally? Well, I am not sure - flatten can make compiler explode without LTO, too, and I

[Bug lto/77472] __attribute__((flatten)) when used with -flto can lead to extreme number of inlined functions

2016-09-05 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77472 --- Comment #10 from Jan Hubicka --- > So apart from the known algorithmic issue in key updating (which Honza > promises > to fix since a few years ...) this is a "doctor it hurts when I do this" > kind-of-issue. Hmm, I actually have patchset f

[Bug middle-end/77484] Static branch predictor causes ~6-8% regression of SPEC2000 GAP

2016-09-06 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77484 --- Comment #2 from Jan Hubicka --- > IIRC the measurements have been run on x86 only, they are done "statically", > that is, verify the prediction against real outcomes as computed by the edge > profile > which is target independent. Yes, the m

[Bug ipa/60315] [4.8/4.9 Regression] template constructor switch optimization

2014-03-26 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60315 --- Comment #16 from Jan Hubicka --- > forwprop would do that, but the enum is unsigned int while the > switch value is int and thus simplify_gimple_switch bails out > because the conversion is not value-preserving. > > So the frontend would need

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-03-30 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375 --- Comment #205 from Jan Hubicka --- I was looking into this recently, too. Curiously enough, for me clang+LTO was winning but comparing the symbols it seemed that the configure scripts picked a bit more features on the GCC side. I looked briefly on

[Bug tree-optimization/59967] [4.8/4.9 Regression] Performance regression from 4.7.x to 4.8.x (loop not unrolled)

2014-04-02 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59967 --- Comment #2 from Jan Hubicka --- > 193246 hubicka /* If there is call on a hot path through the loop, > then > 193246 hubicka there is most probably not much to optimize. */ > 193246 hubicka else if (size.num_non_pu

[Bug lto/60820] [4.9/4.10 Regression] ice in ctor_for_folding, at varpool.c:291

2014-04-15 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60820 --- Comment #7 from Jan Hubicka --- > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60820 > > --- Comment #6 from Martin Liška --- > Patch works for me for net-misc/nx package. Will you merge the patch to > gcc-4.9 > branch? Richard approved it f

[Bug tree-optimization/60899] undef reference generated with -fdevirtualize-speculatively

2014-04-19 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60899 --- Comment #2 from Jan Hubicka --- David, it seems a_m.C should be different from a.C. From the chain of events you describe I think we need to figure out why the last folding happens. Does the function pass can_refer_decl_in_current_unit_p and if

[Bug tree-optimization/60899] undef reference generated with -fdevirtualize-speculatively

2014-04-20 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60899 --- Comment #8 from Jan Hubicka --- > Verified that the proposed patch fixed the problem in b/1345242. Great, thanks! I still would prefer to see the DECL_EXTERNAL bit on a vtable that is not emitted in the current unit. But the C++ visibility code is a bit of

[Bug ipa/60911] [4.9/4.10 Regression] wrong code with -O2 -flto -fipa-pta

2014-04-23 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60911 --- Comment #9 from Jan Hubicka --- > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60911 > > --- Comment #5 from Richard Biener --- > Ah, and tree-ssa-structalias.c does > > /* Build the constraints. */ > FOR_EACH_DEFINED_FUNCTION (node) >

[Bug ipa/60911] [4.9/4.10 Regression] wrong code with -O2 -flto -fipa-pta

2014-04-23 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60911 --- Comment #10 from Jan Hubicka --- > > We also have, in execute_one_pass, > > /* SIPLE IPA passes do not handle callgraphs with IPA transforms in it. > Apply all trnasforms first. */ > if (pass->type == SIMPLE_IPA_PASS) > { >

[Bug ipa/60965] [4.10 Regression] IPA: Devirtualization versus placement new

2014-05-04 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60965 --- Comment #12 from Jan Hubicka --- > (In reply to Jan Hubicka from comment #4) > > Mine, probably 4.9 regression, too. > > It is, and Jonathan Wakely's earlier reduction exposes it on 4.9 too. > > (In reply to Jan Hubicka from comment #6) > >

[Bug ipa/60965] [4.10 Regression] IPA: Devirtualization versus placement new

2014-05-04 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60965 --- Comment #13 from Jan Hubicka --- > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60965 > > --- Comment #11 from Andrew Haley --- > (In reply to Jason Merrill from comment #9) > > (In reply to Andrew Haley from comment #8) > > > While it's true

[Bug target/61060] [4.9/4.10 Regression] ICE: in int_mode_for_mode, at stor-layout.c:400 with -free-ter

2014-05-05 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61060 --- Comment #3 from Jan Hubicka --- > > if (CONST_INT_P (count_exp)) > min_size = max_size = probable_max_size = count = expected_size > = INTVAL (count_exp); > ... > > if (!count) > count_exp = copy_to_mode_reg (GET_MODE (coun

[Bug target/61060] [4.9/4.10 Regression] ICE: in int_mode_for_mode, at stor-layout.c:400 with -free-ter

2014-05-06 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61060 --- Comment #5 from Jan Hubicka --- > I'd say the backend should better deal with this. Or we have to > double-check (or delay) the zero-length check until after > > len_rtx = expand_normal (len); > > sth like This looks good to me indeed. T

[Bug ipa/60973] Invalid propagation of a tail call in devirt pass

2014-05-09 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60973 --- Comment #5 from Jan Hubicka --- > Before thunks we never bothered to compute [tailcall] before inlining > completed, but now explicitly setting the flag for thunks (and not letting > it be computed - why wouldn't that work?) breaks this. > >

[Bug go/61232] [4.10 Regression] link errors building libgo

2014-05-19 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61232 --- Comment #9 from Jan Hubicka --- > The naming is confusing but the idea seems sound. It doesn't make sense > for a static symbol to be DECL_ONE_ONLY. But currently DECL_ONE_ONLY just > means that the symbol has a comdat group. So given

[Bug go/61232] [4.10 Regression] link errors building libgo

2014-05-19 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61232 --- Comment #11 from Jan Hubicka --- > Yes, I think that would be clearer. > > Your patch does seem to fix the problem building libgo. Thanks. Thanks for help! I am testing updated patch - it turns out that I needed to revisit about every use

[Bug bootstrap/60984] [4.9 Regression] AIX: gcc-4.9.0 bootstrap fails in stage-2

2014-05-20 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60984 --- Comment #24 from Jan Hubicka --- Hi, the problem turns out to be a quite ugly issue where inline_call removes a dead alias, but the alias is being walked by cgraph_for_node_and_aliases used by ipa-inline to inline a function into all callees. The a

[Bug bootstrap/60984] [4.9 Regression] AIX: gcc-4.9.0 bootstrap fails in stage-2

2014-05-20 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60984 --- Comment #26 from Jan Hubicka --- > Thanks for tracking this down. I will test it as well. AIX is a very good > canary for these types of bugs. Yeah, the difference here is that we produce a lot more local aliases on AIX than elsewhere. We d

[Bug regression/61436] [4.10 Regression]: g++.dg/tls/diag-1.C ICE (emutls)

2014-06-09 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61436 --- Comment #2 from Jan Hubicka --- Hi, it seems that get_emutls_init_templ_addr just creates the comdat too early and we trigger the sanity check that symbols are static and not automatic variables. I have a busy day tomorrow, so won't beat if anyone

[Bug tree-optimization/42890] [4.4 Regression] Crash in type_like_member_ptr_p in ipa-prop.c:382

2010-02-08 Thread hubicka at ucw dot cz
--- Comment #5 from hubicka at ucw dot cz 2010-02-08 14:59 --- Subject: Re: [4.4 Regression] Crash in type_like_member_ptr_p in ipa-prop.c:382 > The trunk has delete_unreachable_blocks_update_callgraph, perhaps we want > something like that for the branch as wel

[Bug middle-end/42973] [4.4/4.5 regression] IRA apparently systematically making reload too busy on 2 address instructions with 3 operands

2010-02-10 Thread hubicka at ucw dot cz
--- Comment #8 from hubicka at ucw dot cz 2010-02-10 09:22 --- Subject: Re: [4.4/4.5 regression] IRA apparently systematically making reload too busy on 2 address instructions with 3 operands Thanks, we should see if this solves the AMMP problem in a day or two. Are

[Bug middle-end/42961] [4.5 regression] IRA register preferencing bug

2010-02-10 Thread hubicka at ucw dot cz
--- Comment #2 from hubicka at ucw dot cz 2010-02-10 09:26 --- Subject: Re: New: [4.5 regression] IRA register preferencing bug Hi, note that it is related to the way we compute cost through alternatives. I had a very old patch for this http://www.x86-64.org/pipermail/patches/2001

[Bug middle-end/42973] [4.4/4.5 regression] IRA apparently systematically making reload too busy on 2 address instructions with 3 operands

2010-02-11 Thread hubicka at ucw dot cz
--- Comment #13 from hubicka at ucw dot cz 2010-02-11 16:56 --- Subject: Re: [4.4/4.5 regression] IRA apparently systematically making reload too busy on 2 address instructions with 3 operands Hi, it seems that the NAMD improved as expected tonight (by about 3%) http

[Bug tree-optimization/42906] [4.5 Regression] Empty loop not removed

2010-03-25 Thread hubicka at ucw dot cz
--- Comment #24 from hubicka at ucw dot cz 2010-03-25 17:22 --- Subject: Re: [4.5 Regression] Empty loop not removed Hi, this is an updated version of the patch that bootstraps/regtests. In the previous one there was a bug that BB was marked when processed even when only edge was

[Bug middle-end/43391] [4.5 Regression] make_decl_rtl failure for C++ on AIX and HPUX

2010-03-27 Thread hubicka at ucw dot cz
--- Comment #5 from hubicka at ucw dot cz 2010-03-27 10:53 --- Subject: Re: [4.5 Regression] make_decl_rtl failure for C++ on AIX and HPUX > Honza - ping. There is proposed patch sent to ML http://gcc.gnu.org/ml/gcc-patches/2010-03/msg00789.html I've missed that Stev

[Bug tree-optimization/40436] [4.5 regression] 0.5% code size regression caused by r147852

2010-03-28 Thread hubicka at ucw dot cz
--- Comment #17 from hubicka at ucw dot cz 2010-03-28 16:33 --- Subject: Re: [4.5 regression] 0.5% code size regression caused by r147852 > Indeed. > > There is also some miscounting of overall unit size, Micha has a patch for > that (but it completely chokes tra

[Bug tree-optimization/40436] [4.5 regression] 0.5% code size regression caused by r147852

2010-03-28 Thread hubicka at ucw dot cz
--- Comment #19 from hubicka at ucw dot cz 2010-03-28 16:56 --- Subject: Re: [4.5 regression] 0.5% code size regression caused by r147852 > > > There is also some miscounting of overall unit size, Micha has a patch for > > > that (but it completely chok

[Bug tree-optimization/40436] [4.5 regression] 0.5% code size regression caused by r147852

2010-03-28 Thread hubicka at ucw dot cz
--- Comment #21 from hubicka at ucw dot cz 2010-03-28 17:30 --- Subject: Re: [4.5 regression] 0.5% code size regression caused by r147852 > > I think I saw one but it was wrong. I would be interested to at least know > > what this patch is about :) > &

[Bug tree-optimization/43611] [4.5 Regression] ICE: SIGSEGV with -fipa-cp-clone -fkeep-inline-functions

2010-04-03 Thread hubicka at ucw dot cz
--- Comment #9 from hubicka at ucw dot cz 2010-04-03 20:52 --- Subject: Re: [4.5 Regression] ICE: SIGSEGV with -fipa-cp-clone -fkeep-inline-functions > The patch probably only papers over the problem. Honza, can you have a look > here? Hi, I am away for Easter till

[Bug tree-optimization/40436] [4.5 regression] 0.5% code size regression caused by r147852

2010-04-03 Thread hubicka at ucw dot cz
--- Comment #23 from hubicka at ucw dot cz 2010-04-03 21:02 --- Subject: Re: [4.5 regression] 0.5% code size regression caused by r147852 > 1) overall_size is reduced twice for the same function, once in >cgraph_clone_inlined_nodes, once in cgraph_mark_inline_edge

[Bug tree-optimization/40436] [4.5 regression] 0.5% code size regression caused by r147852

2010-04-03 Thread hubicka at ucw dot cz
--- Comment #25 from hubicka at ucw dot cz 2010-04-03 21:19 --- Subject: Re: [4.5 regression] 0.5% code size regression caused by r147852 > But the code as-is allows unlimited growth of a function (well, > by PARAM_LARGE_FUNCTION_GROWTH for each inlining; the li

[Bug tree-optimization/40436] [4.5 regression] 0.5% code size regression caused by r147852

2010-04-03 Thread hubicka at ucw dot cz
--- Comment #26 from hubicka at ucw dot cz 2010-04-03 21:20 --- Subject: Re: [4.5 regression] 0.5% code size regression caused by r147852 As for history, I originally had only the percentage limits in place, but then found that they behave really erratically on small units and

[Bug tree-optimization/40436] [4.5 regression] 0.5% code size regression caused by r147852

2010-04-03 Thread hubicka at ucw dot cz
--- Comment #27 from hubicka at ucw dot cz 2010-04-03 21:39 --- Subject: Re: [4.5 regression] 0.5% code size regression caused by r147852 And after checking the code, I think it is correct. I.e. limit is computed on size before inlining of caller or callee (this is to allow

[Bug tree-optimization/40436] [4.5 regression] 0.5% code size regression caused by r147852

2010-04-06 Thread hubicka at ucw dot cz
--- Comment #29 from hubicka at ucw dot cz 2010-04-06 10:46 --- Subject: Re: [4.5 regression] 0.5% code size regression caused by r147852 > I don't think we should fix the double-accounting bug for the 4.5 series, > when we tried it on SPEC it caused several

[Bug tree-optimization/40436] [4.5 regression] 0.5% code size regression caused by r147852

2010-04-06 Thread hubicka at ucw dot cz
--- Comment #32 from hubicka at ucw dot cz 2010-04-06 11:05 --- Subject: Re: [4.5 regression] 0.5% code size regression caused by r147852 > I think it is a really, really bad signal if a bug like this, where the > revision that introduced the issue was identified >9 m

[Bug tree-optimization/42906] [4.5 Regression] Empty loop not removed

2010-04-06 Thread hubicka at ucw dot cz
--- Comment #28 from hubicka at ucw dot cz 2010-04-06 11:37 --- Subject: Re: [4.5 Regression] Empty loop not removed I will apply the CD-DCE fix to pretty IPA tomorrow (after testing the inliner problems) so we get some extra testing for that patch too. -- http

[Bug middle-end/42973] [4.4 regression] IRA apparently systematically making reload too busy on 2 address instructions with 3 operands

2010-04-06 Thread hubicka at ucw dot cz
--- Comment #16 from hubicka at ucw dot cz 2010-04-06 11:38 --- Subject: Re: [4.4 regression] IRA apparently systematically making reload too busy on 2 address instructions with 3 operands I believe Vladimir fixed this bug (comment #13) Honza -- http

[Bug tree-optimization/43186] [4.4 Regression] A loop in tree_unroll_loops_completely never ends

2010-04-08 Thread hubicka at ucw dot cz
--- Comment #16 from hubicka at ucw dot cz 2010-04-08 13:26 --- Subject: Re: [4.4 Regression] A loop in tree_unroll_loops_completely never ends > The main issue is that we are doing a very poor job in limiting the work > we do during complete unrolling (as well as leav

[Bug tree-optimization/43186] [4.4 Regression] A loop in tree_unroll_loops_completely never ends

2010-04-08 Thread hubicka at ucw dot cz
--- Comment #18 from hubicka at ucw dot cz 2010-04-08 14:41 --- Subject: Re: [4.4 Regression] A loop in tree_unroll_loops_completely never ends > > Well, I guess in addition to number of instructions after optimizing we can > > also estimate number of instruction

[Bug target/43884] [4.4/4.5/4.6 Regression] Performance degradation for simple fibonacci numbers calculation due to extra stack alignment

2010-04-25 Thread hubicka at ucw dot cz
--- Comment #6 from hubicka at ucw dot cz 2010-04-25 23:42 --- Subject: Re: [4.4/4.5/4.6 Regression] Performance degradation for simple fibonacci numbers calculation due to extra stack alignment > where the only difference is different loop alignment and keeping

[Bug target/43884] [4.4/4.5/4.6 Regression] Performance degradation for simple fibonacci numbers calculation due to extra stack alignment

2010-04-25 Thread hubicka at ucw dot cz
--- Comment #7 from hubicka at ucw dot cz 2010-04-25 23:43 --- Subject: Re: [4.4/4.5/4.6 Regression] Performance degradation for simple fibonacci numbers calculation due to extra stack alignment > The slowdown also happens on x86-64. Stack alignment checks >

[Bug tree-optimization/41835] ICE with -flto -O3 (BB N can not throw but has an EH edge)

2010-04-25 Thread hubicka at ucw dot cz
--- Comment #4 from hubicka at ucw dot cz 2010-04-25 23:44 --- Subject: Re: ICE with -flto -O3 (BB N can not throw but has an EH edge) > Seems to work for me, even with the 4.5.0 release. Note that on mainline the code removing wpa fixup should help here too. There clea

[Bug target/43884] [4.4/4.5/4.6 Regression] Performance degradation for simple fibonacci numbers calculation due to extra stack alignment

2010-04-26 Thread hubicka at ucw dot cz
--- Comment #12 from hubicka at ucw dot cz 2010-04-26 14:27 --- Subject: Re: [4.4/4.5/4.6 Regression] Performance degradation for simple fibonacci numbers calculation due to extra stack alignment > That is true. For tail call, we only need to align outgoing stack

[Bug fortran/43928] [4.6 Regression] FAIL: gfortran.dg/array_constructor_11.f90

2010-04-28 Thread hubicka at ucw dot cz
--- Comment #3 from hubicka at ucw dot cz 2010-04-28 22:20 --- Subject: Re: [4.6 Regression] FAIL: gfortran.dg/array_constructor_11.f90 Looks like a latent problem where we fail when optimizing for size. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43928

[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions

2015-03-15 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563 --- Comment #34 from Jan Hubicka --- The problem is (as described earlier) the fact that we sum the size of all call statements in a function after every inline decision. Most of the time is spent in calling estimate_edge_size_and_time: 79.95% cc1 cc1

[Bug lto/65380] [5 Regression][ICF] LTO: ICE in add_symbol_to_partition_1, at lto/lto-partition.c:158

2015-03-16 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65380 --- Comment #7 from Jan Hubicka --- > Note that it compiles if I add "-fno-ipa-icf". Yeah, but it is partitioning bug; it should be able to deal with whatever aliases ICF creates. I will take a look tonight or tomorrow. Honza

[Bug ipa/65478] [5 regression] crafty performance regression

2015-03-20 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65478 --- Comment #4 from Jan Hubicka --- > Which options (LTO?)? I can't see the regression on our testers. -Ofast -flto -funroll-loops Honza

[Bug ipa/65483] bzip2 bsR/bsW should be auto-inlined

2015-03-20 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65483 --- Comment #4 from Jan Hubicka --- > Testcase? I suppose you are talking about the loops in the bsNEEDR/W macros? bzip2 is quite small by itself, but I will take a look later today. Yes, it is the bsNEEDR/W macros that get unrolled. Honza

[Bug tree-optimization/65492] Bad optimization in -O3 due to if-conversion and/or unrolling

2015-03-20 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 --- Comment #7 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 > > Richard Biener changed: > >What|Removed |Added > -

[Bug lto/65475] [5 Regression] ICE in odr_vtable_hasher::equal (Segmentation fault)

2015-03-20 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65475 --- Comment #5 from Jan Hubicka --- Hmm, yeah, in one unit the base is virtual and in the other it is not. Perhaps just dropping that sanity check or restricting it to non-odr-violation-reported Honza > https://gcc.gnu.org/bugzilla/show_bug.cgi?id

[Bug ipa/65478] [5 regression] crafty performance regression

2015-03-20 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65478 --- Comment #5 from Jan Hubicka --- The regression seems to be visible at http://gcc.opensuse.org/SPEC/CINT/sb-frescobaldi.suse.de-ai-64/186_crafty_big.png

[Bug ipa/65516] lto1: internal compiler error: in get_odr_type, at ipa-devirt.c:1809

2015-03-22 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65516 --- Comment #7 from Jan Hubicka --- I committed the change to mainline, so you only need to update the tree. Honza

[Bug ipa/65516] lto1: internal compiler error: in get_odr_type, at ipa-devirt.c:1809

2015-03-23 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65516 --- Comment #9 from Jan Hubicka --- This was also my bug. Sorry for that. It is fixed on current mainline.

[Bug ipa/62051] [4.9/5 Regression] Undefined reference to vtable with -O2 and -fdevirtualize-speculatively

2015-03-23 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62051 --- Comment #7 from Jan Hubicka --- > > Yes, though I think for such a class we probably want to consider all virtual > methods unreachable unless they have explicit default visibility; in the > testcase the main program isn't being compiled wit

[Bug ipa/65502] pure-const should play well with clobbers.

2015-03-23 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65502 --- Comment #5 from Jan Hubicka --- > I think we can safely ignore clobbers when scanning functions for > pure/constness. Yes (it is what the patch does), but doing so may cause worse code in the function calling these destructors. DCE will rem

[Bug middle-end/65534] tailcall not optimized away

2015-03-23 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65534 --- Comment #1 from Jan Hubicka --- > #ifndef OPTIMIZE_MANUALLY > void setutent(void) { > ((void)0); > __setutent_unlocked(); > ((void)0); > } > #else > extern __typeof (__setutent_unlocked) setutent > __attribute__ ((alias ("__se

[Bug ipa/65478] [5 regression] crafty performance regression

2015-03-24 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65478 --- Comment #7 from Jan Hubicka --- > > We also may consider adding bit of negative hints for cases where > > cloning would turn function called once (by noncold edge) to a > > function called twice. > > This would be much easier, although the p

[Bug lto/65536] LTO line number information garbled

2015-03-24 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 --- Comment #16 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 > > --- Comment #2 from Richard Biener --- > The main issue with LTO is that it re-creates a combined linemap but in (most > of the time) quite awkward or

[Bug lto/65536] LTO line number information garbled

2015-03-24 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 --- Comment #26 from Jan Hubicka --- This is a proof-of-concept patch that makes streamer-in collect locations into a "cache" and apply them in sorted order (looking up correct max_column hints) at the end of handling of a given section. It a

[Bug lto/65536] LTO line number information garbled

2015-03-24 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 --- Comment #27 from Jan Hubicka --- > I have in fact considered it already, since I think it should be fairly easy > and it can be done incrementally. However, as always in GCC, things are never > as trivial as they seem. I just tried following

[Bug ipa/65478] [5 regression] crafty performance regression

2015-03-25 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65478 --- Comment #9 from Jan Hubicka --- > > This suggests that cloning of function Search and not inlining > > NextMove is only part of the story. > > > > I'm attaching output of my script that compares inlining decisions. > "File 1" is wpa inlinin

[Bug ipa/65076] [5 Regression] 16% tramp3d-v4.cpp compile time regression

2015-03-25 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076 --- Comment #17 from Jan Hubicka --- > > Even though the inline decisions does not seem to be changed considerably > > (at least on tramp3d). > > Yeah, clobbers don't account for anything for size/inline estimates > (well, I hope so!). Yep, the

[Bug lto/65536] LTO line number information garbled

2015-03-25 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 --- Comment #35 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 > > --- Comment #31 from Martin Liška --- > Beginning of the difference of ODR warnings: > > 210,211c210,211 > < gen/blink/core/CSSPropertyNames.cpp:2330

[Bug lto/65536] LTO line number information garbled

2015-03-25 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 --- Comment #42 from Jan Hubicka --- Hi, I read linemap_line_start and I think I noticed few issues with respect to overflows and lines being added randomly. 1) line_delta is computed as to_line SOURCE_LINE (map, set->highest_line) I think th

[Bug lto/65536] LTO line number information garbled

2015-03-25 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 --- Comment #44 from Jan Hubicka --- > > Ah, ok, it seems to work now. It just takes ages to print the lto1 line and it > get printed way after the lto1 process is running already. Yep, really annoying property that the stderr output is all buff

[Bug middle-end/65595] [5 Regression] Linux kernel build failure: ICE: in as_a, at is-a.h:192

2015-03-27 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65595 --- Comment #3 from Jan Hubicka --- Hi, the ICE does not reproduce for me, but from backtrace it seems quite clear that the following fix should work: Index: cgraph.c === --- cgraph.

[Bug ipa/65600] [5 Regression] bost testsuite failure: ICE: Segmentation fault

2015-03-27 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65600 --- Comment #2 from Jan Hubicka --- Oops, really hope this is last one of this can of worms :( The problem here is that resolve_speculation assumes the cgraph node exists. I am testing the following: Index: ipa-inline-analysis.c ===

[Bug ipa/65588] [5 Regression] lto1: internal compiler error: Segmentation fault

2015-03-27 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65588 --- Comment #6 from Jan Hubicka --- Hi, this patch fixes the partitioner and also avoids assemble_undefined_decl to be called on hard registers and value exprs. I am not sure how the reduced testcase could work, since I think the bug needs parti

[Bug lto/65536] LTO line number information garbled

2015-03-27 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 --- Comment #50 from Jan Hubicka --- > > +/* Do not track column numbers higher than this one. As a result, the > + range of column_bits is [7, 18] (or 0 if column numbers are > + disabled). */ > +#define LINE_MAP_MAX_COLUMN_NUMBER (1U <<

[Bug lto/65536] LTO line number information garbled

2015-03-27 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536 --- Comment #51 from Jan Hubicka --- > > Contrary to what I said before, I think now that it really makes sense for > line-maps to return UNKNOWN_LOCATION rather than the location of something > else > when overflow occurs, but then LTO has to

[Bug tree-optimization/65610] [5 Regression] Compare debug failure with -g3 -fsanitize=undefined -fno-sanitize=vptr -O3

2015-03-28 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65610 --- Comment #4 from Jan Hubicka --- > Perhaps one possibility would be even for -g0 preserve those specific BLOCKs > (those satisfying Yep, we should do that. Who is removing them?

[Bug lto/61635] LTO partitioner does not handle &&label in statics

2015-03-29 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61635 --- Comment #9 from Jan Hubicka --- This is the patch I am testing

[Bug ipa/65478] [5 regression] crafty performance regression

2015-03-30 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65478 --- Comment #17 from Jan Hubicka --- > : > x.d = arg1_3(D); > _5 = x.i[3]; > if (_5 != 0) > goto ; > else > goto ; > ... > : > _12 = x.i[2]; > if (_12 != 0) > goto ; > else > goto ; > > to sth like > > : >

[Bug ipa/65076] [5 Regression] 16% tramp3d-v4.cpp compile time regression

2015-03-31 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076 --- Comment #39 from Jan Hubicka --- Hi, yep, -Os or flatten is unchanged. It seems something regressed with -O3 inline decisions, but it is somewhat hard to pinpoint. I am on my way to Victoria, so I can do more only tonight. https://gcc.gnu.org/

[Bug ipa/65076] [5 Regression] 16% tramp3d-v4.cpp compile time regression

2015-03-31 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076 --- Comment #45 from Jan Hubicka --- > Like Richard wrote in comment 38 it is "phase opt and generate" that regresses Yes, but is it a regression because of one specific pass shown later, or is it just a cumulative effect of many little slowdowns? >

[Bug ipa/65478] [5 regression] crafty performance regression

2015-04-01 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65478 --- Comment #23 from Jan Hubicka --- > Seems to be a regression with -flto only? I also see EON regressing without > -flto. Yes, the inlining is cross file. > > http://gcc.opensuse.org/SPEC/CINT/sb-megrez-head-64/index.html Saw that one too. I

[Bug target/65660] [5 Regression] 252.eon regression on bdver2 with -Ofast

2015-04-04 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65660 --- Comment #11 from Jan Hubicka --- Thanks, 32-bit eon runs improved today, though I am not 100% sure whether it is due to vectorization or the unit growth change http://gcc.opensuse.org/SPEC/CINT/sb-frescobaldi.suse.de-head-64-32o-32bit/252_eon_recent_

[Bug lto/65559] [5 Regression] lto1.exe: internal compiler error: in read_cgraph_and_symbols, at lto/lto.c:2947

2015-04-06 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65559 --- Comment #6 from Jan Hubicka --- Can you please compile with --verbose --save-temps and attach the output + temporary files produced? (in particular I wonder about resolution file that should be named *.res)

[Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto

2015-04-09 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701 --- Comment #8 from Jan Hubicka --- With spaces removed to be readable > > 1.11 ???3682: mov0x60(%rsp),%rdx > 9.32 ???3687:?vmovss (%rax,%r12,2),%xmm5 > 1.44 ??? ??? vmovss (%rax),%xmm6 > 4.46 ??? ??? inc%rdi >

[Bug tree-optimization/65797] [5.0 regression] IPA ICF causes function to be emitted with no debug line info

2015-04-17 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65797 --- Comment #3 from Jan Hubicka --- Hi, the ICF wrappers are created the same way as thunks (by expand_thunk), which probably suppresses debug info because we do not want to see it for thunks. I suppose it is: DECL_IGNORED_P (thunk_fndecl) = 1 I suppose

[Bug tree-optimization/65797] [5 regression] IPA ICF causes function to be emitted with no debug line info

2015-04-17 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65797 --- Comment #5 from Jan Hubicka --- Well, if you turn one function into an alias of another, there is no way to preserve it (like Gold's ICF doesn't). With dwarf extensions we can restore some of the info based on context where the function is called,

[Bug tree-optimization/65797] [5 regression] IPA ICF causes function to be emitted with no debug line info

2015-04-17 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65797 --- Comment #6 from Jan Hubicka --- The following untested patch could help. We may need to set location of the debug statement etc. I probably won't be able to do much more on this till Monday evening Honza

[Bug tree-optimization/65797] [5 regression] IPA ICF causes function to be emitted with no debug line info

2015-04-17 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65797 --- Comment #8 from Jan Hubicka --- > With gold's ICF, as I understand it, there is a function name and file/line > information for every function in the backtrace. It may not be the name or > the ICF does not do the wrappers as far as I know.

[Bug tree-optimization/66163] [6 Regression] Not working Firefox built with LTO

2015-05-18 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66163 --- Comment #7 from Jan Hubicka --- > According to -fsanitize=null, there are many places in Firefox that produce > undefined behavior in following way: > > https://bugzilla.mozilla.org/show_bug.cgi?id=1165904 > > One common example: > > st

[Bug lto/66273] [6 Regression] FAIL: gcc.dg/guality/pr43177.c

2015-05-24 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66273 --- Comment #1 from Jan Hubicka --- > On Linux/x86, r223608 caused: > > FAIL: gcc.dg/guality/pr43177.c -O2 -flto -fuse-linker-plugin > -fno-fat-lto-objects line 24 l == 10 > FAIL: gcc.dg/guality/pr43177.c -O2 -flto -fuse-linker-plugin > -fn

[Bug tree-optimization/59660] We fail to optimize common boolean checks pre-inlining

2014-01-07 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59660 --- Comment #2 from Jan Hubicka --- > I have noticed these, too (-Og is pessimized by them). The pattern is > generated > by gimplifying. I wondered why we can't simply update the gimplifier to not produce them? (this is what I wanted to look into t

[Bug middle-end/58585] [4.9 Regression] ICE in ipa with virtual inheritance

2014-01-07 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58585 --- Comment #15 from Jan Hubicka --- > markus@x4 gcc % cat test.ii > class A { > void m(); > }; > void A::m() {} > > markus@x4 gcc % /var/tmp/gcc_build_dir_/./prev-gcc/xg++ > -B/var/tmp/gcc_build_dir_/./prev-gcc/ -r -nostdlib -flto -O2 -pipe te

[Bug tree-optimization/59660] We fail to optimize common boolean checks pre-inlining

2014-01-07 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59660 --- Comment #4 from Jan Hubicka --- > Not all testcases can be handled at gimplification time IIRC. Which > means "testcases welcome" first, so we can look at them individually. The GCC one I saw was equivalent of: #include bool m_is_less_than_

[Bug tree-optimization/59660] We fail to optimize common boolean checks pre-inlining

2014-01-08 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59660 --- Comment #6 from Jan Hubicka --- > That's only optimizable after the 'mergephi' pass. Before the > temporary setting is shared by the n==m code. Thus maybe > 'mergephi' itself can handle this ... Yep, mergephi seems like a reasonable place (at

[Bug tree-optimization/59660] We fail to optimize common boolean checks pre-inlining

2014-01-08 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59660 --- Comment #8 from Jan Hubicka --- Fe produces: unit size align 32 symtab 0 alias set -1 canonical type 0x76ede690 precision 32 min max pointer_to_this > side-effects arg 0 unit size
