Re: [Bug libstdc++/87502] Poor code generation for std::string("c-style string")

2024-12-09 Thread Jan Hubicka via Gcc-bugs
> > So I think all we can hope for is merging memcpy with the extra write of 0. > > That's not actually clear. > > It would be reasonable to assume that foo isn't likely to change the string > and have the inlined destructor for a string that was initialized as a short > string like here do somet

Re: [Bug ipa/114531] Feature proposal for an `-finline-functions-aggressive` compiler option

2024-06-25 Thread Jan Hubicka via Gcc-bugs
> different issue from the one that is raised in the PR. (Unless we think that > -O2 and -O3 should always have the same inlining heuristics henceforward, but > that seems unlikely.) Yes, I think the point of -O3 is to let the compiler be more aggressive than what seems desirable for your average dist

Re: [Bug c++/110137] implement clang -fassume-sane-operator-new

2024-06-04 Thread Jan Hubicka via Gcc-bugs
> Is the option supposed to be only about the standard global scope operator > new/delete (_Znam etc.) or also user operator new/delete class methods? If > the > former, then I agree it is a global property (or at least a per shared > library/binary property, one can arrange stuff with symbol vis

Re: [Bug libstdc++/109442] Dead local copy of std::vector not removed from function

2024-05-14 Thread Jan Hubicka via Gcc-bugs
This patch attempts to add __builtin_operator_new/delete. So far they are not optimized, which will need to be done by extra flag of BUILT_IN_ code. also the decl.cc code can be refactored to be less of cut&paste and I guess has_builtin hack to return proper value needs to be moved to C++ FE. How

Re: [Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-04-09 Thread Jan Hubicka via Gcc-bugs
There is still a problem with loop bounds. I am testing a patch for that and then we should (finally) be safe.

Re: [Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function

2024-03-07 Thread Jan Hubicka via Gcc-bugs
> Note GCC has not retuned its -Os heuristics for a long time because it has been > decent enough for most folks and corner cases like this almost never come > up. There were quite a few changes to -Os heuristics :) One of the bigger challenges is that we do see more and more C++ code built with -Os wh

Re: [Bug target/114232] [14 regression] ICE when building rr-5.7.0 with LTO on x86

2024-03-05 Thread Jan Hubicka via Gcc-bugs
Looking at the prototype patch, why do the splitters also need to change? My original goal was to use splitters to expand to faster code sequences while having patterns necessary for both variants. This makes it possible to use optimize_insn_for_size/speed and make decisions using BB profile, since

Re: [Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-02-14 Thread Jan Hubicka via Gcc-bugs
> > I guess PTA gets around by tracking points-to set also for non-pointer > > types and consequently it also gives up on any such addition. > > It does. But note it does _not_ for POINTER_PLUS where it treats > the offset operand as non-pointer. > > > I think it is ipa-prop.c::unadjusted_ptr_an

Re: [Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread Jan Hubicka via Gcc-bugs
> Confirm. But option save/restore has been always implemented: > > .section.gnu.lto_.opts,"",@progbits > .ascii "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection" > .ascii "=none' '-mabi=lp64d' '-march=loongarch64' '-mfpu=64' '-m" > .ascii "simd=lasx' '-mcmodel=nor

Re: [Bug middle-end/111088] useless 'xor eax,eax' inserted when a value is not returned and icf

2023-08-21 Thread Jan Hubicka via Gcc-bugs
> But adds a return with a value. And then the inliner inlines foo into foo2 but > we still have the return with a value around ... I guess ICF can special-case an unused return value, but why is this not taken care of by ipa-sra?

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-01 Thread Jan Hubicka via Gcc-bugs
> > If I comment it out as above patch, then O3/PGO can get 16% and 12% > > performance > > improvement compared to O2 on x86. > > > > O2 O3 PGO > > cycles 2,497,674,824 2,104,993,224 2,199,753,593 > > instructions1

Re: [Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2023-07-28 Thread Jan Hubicka via Gcc-bugs
> This heuristic wants to catch > > > if (foo) abort (); > > > and avoid sinking "too far" across a path with "similar enough" > execution count (I think the original motivation was to fix some > spilling / register pressure issue). The loop depth test > should be !(bb_loop_depth (best_b

Re: [Bug target/110758] [14 Regression] 8% hmmer regression on zen1/3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56); g:d76d19c

2023-07-21 Thread Jan Hubicka via Gcc-bugs
> I suspect this is most likely the profile updates changes ... Quite possibly. The goal of this exercise is to figure out if there are some bugs in the profile estimate, or whether passes somehow prefer a broken profile, or if it is just bad luck. Looking at sphinx and fatigue it seems that LRA really

Re: [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming

2023-06-28 Thread Jan Hubicka via Gcc-bugs
> > why disallow caller->indirect_calls? See testcase in comment #9 > > > + return false; > > + for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee) > > I don't think this flys - it looks quadratic. Can we compute this > in the inline summary once instead? I guess I can

Re: [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming

2023-06-23 Thread Jan Hubicka via Gcc-bugs
Just so it is somewhere, here is a testcase that we can't inline leaf functions to always_inlines unless we do some tracking of what calls were formerly indirect calls. We really overloaded always_inline from the original semantics "drop inlining heuristics" into "be sure that result is inlined" w

Re: [Bug libstdc++/110287] _M_check_len is expensive

2023-06-19 Thread Jan Hubicka via Gcc-bugs
> > There is no guarantee that std::vector::max_size() is PTRDIFF_MAX. It > depends on the Allocator type, A. A user-defined allocator could have > max_size() == 100. If the inliner sees a path to the throw functions, it will not determine _M_check_len as early inlinable. Perhaps we can __builtin_con
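To make the throw path concrete, here is a minimal C sketch of the kind of growth computation under discussion, as I understand it from libstdc++. The function name `check_len`, the boolean return that stands in for the exception, and the explicit `max_size` parameter are all assumptions made for illustration; the real code is a C++ member function of `std::vector`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical C rendition of a vector growth check.  Returns false where
   the C++ code would throw length_error; otherwise stores the new capacity
   in *new_len.  Assumes size <= max_size.  */
bool check_len(size_t size, size_t n, size_t max_size, size_t *new_len)
{
    /* Not enough room left for n more elements: this is the "throw" path
       that the inliner has to account for.  */
    if (max_size - size < n)
        return false;

    /* Grow by at least doubling, or by n if that is larger.  */
    size_t len = size + (size > n ? size : n);

    /* Clamp on wrap-around or when exceeding the allocator's limit.  */
    *new_len = (len < size || len > max_size) ? max_size : len;
    return true;
}
```

With `max_size == 100`, as in the user-defined allocator example quoted above, a request at size 80 is clamped to 100 and a request at size 100 takes the failure path, so the limit cannot simply be assumed away.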

Re: [Bug c++/106943] GCC building clang/llvm with LTO flags causes ICE in clang

2023-05-12 Thread Jan Hubicka via Gcc-bugs
> > Indeed it is quite long time problem with clang not building with lifetime > > DSE and strict aliasing. I wonder why this is not fixed on clang side? > > Because the problems were not communicated? I knew that Firefox needed > -flifetime-dse=1, but it's the first time I hear that any such pro

Re: [Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread Jan Hubicka via Gcc-bugs
> > Do you mean we should fix modeling of divisions there as well? I don't have > latency/throughput measurements for those CPUs, nor access so I can run > experiments myself, unfortunately. > > I guess you mean just making a patch to model division units separately, > leaving latency/throughput

Re: [Bug middle-end/106078] Invalid loop invariant motion with non-call-exceptions

2022-06-25 Thread Jan Hubicka via Gcc-bugs
> > For this one it's PRE hoisting *b across the endless loop (PRE handles > > calls as possibly not returning but not loops as possibly not > > terminating...) > > So it's a different bug. > > Btw, C++ requiring forward progress makes the testcase undefined. In my understanding access to volatil

Re: [Bug lto/105727] __builtin_constant_p expansion in LTO

2022-05-25 Thread Jan Hubicka via Gcc-bugs
> > My guess is that the > > BUILD_BUG(); > > line is the sole thing that is wrong, it should be just break; > > as the memory_is_poisoned_n(addr, size); will handle all the sizes, > > regardless if they are constants or not. > > Sure, I'm going to suggest such a change. To me it looked like a pro

Re: [Bug c/105728] New: dead store to static var not optimized out

2022-05-25 Thread Jan Hubicka via Gcc-bugs
> To me, all of these do the same thing and should generate the same code. > As nobody else can see removeme, and we aren't leaking its address, shouldn't > the compiler be able to deduce that all accesses to removeme are > inconsequential and can be removed? > > My gcc 11.3 generates a condidion

Re: [Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

2022-01-27 Thread Jan Hubicka via Gcc-bugs
> I would say so. It saves code size and also uop space unless the two > can magically fuse to a immediate to %xmm move (I doubt that). I made simple benchmark double a=10; int main() { long int i; double sum,val1,val2,val3,val4; for (i=0;i<10;i++) { #if

Re: [Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

2022-01-27 Thread Jan Hubicka via Gcc-bugs
> > According to znver2_cost > > > > Cost of sse_to_integer is a little bit less than fp_store, maybe increase > > sse_to_integer cost(more than fp_store) can helps RA to choose memory > > instead of GPR. > > That sounds reasonable - GPR<->xmm is cheaper than GPR -> stack -> xmm > but GPR<->xmm s

Re: [Bug tree-optimization/104203] [12 Regressions] huge compile-time regression since r12-6606-g9d6a0f388eb048f8

2022-01-24 Thread Jan Hubicka via Gcc-bugs
> > bool > Since the pass issues a bunch other warnings (e.g., -Wstringop-overflow, > -Wuse-after-free, etc.) the gate doesn't seem right. But since #pragma GCC > diagnostic can re-enable warnings disabled by -w (or turn them into errors) > any > gate that considers the global option setting will

Re: [Bug ipa/104203] [12 Regressions] huge IPA compile-time regression since r12-6606-g9d6a0f388eb048f8

2022-01-24 Thread Jan Hubicka via Gcc-bugs
So I assume that this is due to new pass_waccess which was added into early optimizations. I think this is not really ipa component but tree-optimize.

Re: [Bug tree-optimization/103195] [12 Regression] tfft2 text grows by 70% with -Ofast since r12-5113-gd70ef65692fced7a

2022-01-18 Thread Jan Hubicka via Gcc-bugs
> So nothing to see? I guess our unit growth limit doesn't trigger because it's > a small (benchmark) unit? Yep, unit growth limits do not apply to very small units. The ipa-cp heuristics still IMO need work and should be based on relative speedups rather than absolute ones for the cutoffs.

Re: [Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread Jan Hubicka via Gcc-bugs
> > Sure - I just remember (falsely?) that we finally decided to do it :) I do not recall this, but I may have forgotten :)) > If we don't run IPA inline we don't figure we failed to inline the > always_inline either ;) And IPA inline can expose more indirect > alywas-inlines we only discover a

Re: [Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread Jan Hubicka via Gcc-bugs
> You can not disable an IPA pass because then we will mishandle > optimize attributes. I think you simply want to set > > flag_inline_small_functions = 0 > flag_inline_functions_called_once = 0 Actually I forgot, we have flag_no_inline which makes tree_inlinable_function_p return false for

Re: [Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread Jan Hubicka via Gcc-bugs
> --- Comment #6 from Richard Biener --- > Honza, -Og was supposed to not do so much work, I intended to disable IPA > inlining but there's no knob for that. I wonder where to best put such > guard? I set flag_inline_small_functions to zero for -Og but we still > run inline_small_functions ().

Re: [Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2022-01-11 Thread Jan Hubicka via Gcc-bugs
On zen2 and zen3 with -flto the speedup seems to be about 12% for both -O2 and -Ofast -march=native, which is very nice! Zen1 for some reason sees less improvement, about 6%. With PGO it is 3.8%. Overall it seems a win, but there are a few noteworthy issues. I also see a 6.69% regression on x64 with

Re: [Bug gcov-profile/103652] Producing profile with -O2 -flto and trying to consume it with -O3 -flto leads to ICEs on indirect call profiling

2021-12-13 Thread Jan Hubicka via Gcc-bugs
> > Well, I'm specifically speaking about: > error: the control flow of function ‘BZ2_compressBlock’ does not match its > profile data (counter ‘arcs’) > > this type of errors should not happen even in a multi-threaded programs. There are some cases where I see even those on clang build - I am

Re: [Bug tree-optimization/103168] Value numbering for PRE of pure functions can be improved

2021-11-22 Thread Jan Hubicka via Gcc-bugs
The patch passed testing on x86_64-linux.

Re: [Bug tree-optimization/103168] Value numbering for PRE of pure functions can be improved

2021-11-22 Thread Jan Hubicka via Gcc-bugs
This is a slightly modified patch I am testing. I added pre-computation of the number of accesses, enabled the path for const functions (in case they have a memory operand), initialized alias sets and clarified the logic around every_* and global_memory_accesses PR tree-optimization/103168

Re: [Bug driver/100937] configure: Add --enable-default-semantic-interposition

2021-11-22 Thread Jan Hubicka via Gcc-bugs
> (The -fno-semantic-interposition thing is probably the biggest performance gap > between gcc -fpic and clang -fpic.) Yep, it is often confusing to users (who do not understand what ELF interposition is) that clang and gcc disagree on default flags here. Recently -Ofast was extended to imply -fno-

Re: [Bug tree-optimization/103300] New: wrong code at -O3 on x86_64-linux-gnu

2021-11-17 Thread Jan Hubicka via Gcc-bugs
Needs -O2 -floop-unroll-and-jam --param early-inlining-insns=14 to fail, so I guess it may be an issue with unroll-and-jam.

Re: [Bug ipa/103267] Wrong code with ipa-sra

2021-11-16 Thread Jan Hubicka via Gcc-bugs
> @@ -1,4 +1,3 @@ > -static int > __attribute__ ((noinline,const)) > infinite (int p) > { Just for the record, it crashes with or without static int here for me :) I ran across it because the code tracking must access in ipa-sra is IMO conceptually wrong. I noticed that because ipa-modref solves

Re: [Bug ipa/103267] Wrong code with ipa-sra

2021-11-16 Thread Jan Hubicka via Gcc-bugs
Aha, but here is better example (reproduces same way). In the former one I forgot const attribute which makes it invalid. The testcase tests that ipa-sra is missing ECF_LOOPING_CONST_OR_PURE check static int __attribute__ ((noinline)) infinite (int p) { if (p) while (1); return p; } __attr
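The general shape of such a testcase can be sketched as follows. This is a reconstruction under assumptions, not the truncated testcase itself: the names `may_not_return` and `use_after_call` are hypothetical, and the body of the caller is guessed from the visible fragments only.

```c
/* Hypothetical names; illustrates why a call that may loop forever cannot
   be treated like an ordinary side-effect-free call.  */
__attribute__((noinline))
int may_not_return(int p)
{
    if (p)
        while (1)
            ;               /* never returns for nonzero p */
    return p;
}

__attribute__((noinline))
int use_after_call(int p, const int *a)
{
    int v = may_not_return(p);
    /* The load of *a is only reached if the call above returned.  */
    return *a && v;
}
```

As I read the thread, the concern is that a transformation which moves the load of `*a` into the caller (for example, passing the pointed-to value instead of the pointer) would dereference `a` even on a path where `may_not_return` never returns, so a caller passing a nonzero `p` with an invalid `a` would start to fault.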

Re: [Bug ipa/103267] Wrong code with ipa-sra

2021-11-16 Thread Jan Hubicka via Gcc-bugs
Works for me even with the 3 warnings. hubicka@lomikamen:/aux/hubicka/trunk/build-lto2/gcc$ cat >tt.c __attribute__ ((noinline,const)) infinite (int p) { if (p) while (1); return p; } __attribute__ ((noinline)) static void test(int p, int *a) { int v = infinite (p); if (*a && v) __

Re: [Bug tree-optimization/103231] New: ICE (nondeterministic) on valid code at -O1 on x86_64-linux-gnu: Segmentation fault

2021-11-14 Thread Jan Hubicka via Gcc-bugs
> [659] % > [659] % gcctk -O0 -w small.c > [660] % > [660] % gcctk -O1 -w small.c > [661] % gcctk -O1 -w small.c > [662] % gcctk -O1 -w small.c > gcctk: internal compiler error: Segmentation fault signal terminated program > cc1 > Please submit a full bug report, > with preprocessed source if app

Re: [Bug ipa/103230] ipa-modref-tree.h:550:33: runtime error: load of value 255, which is not a valid value for type 'bool'

2021-11-14 Thread Jan Hubicka via Gcc-bugs
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103230 > > --- Comment #2 from Martin Liška --- > > How do you build ubsan compiler? > > F="-O0 -g -fsanitize=undefined" ; make -j16 all-host -k CFLAGS="$F" > CXXFLAGS="$F" LDFLAGS="$F" > > is the fastest approach. Thanks, it is similar to what I

Re: [Bug ipa/103230] New: ipa-modref-tree.h:550:33: runtime error: load of value 255, which is not a valid value for type 'bool'

2021-11-14 Thread Jan Hubicka via Gcc-bugs
> Happens with UBSAN compiler for: > > $ gcc gcc/testsuite/gcc.c-torture/execute/pr71494.c -O1 -flto > ... > /home/marxin/Programming/gcc/gcc/ipa-modref-tree.h:550:33: runtime error: load > of value 255, which is not a valid value for type 'bool' > #0 0x18acc38 in modref_tree::merge(modref_tr

Re: [Bug ipa/103211] [12 Regression] 416.gamess crashes after r12-5177-g494bdadf28d0fb35

2021-11-12 Thread Jan Hubicka via Gcc-bugs
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103211 > > --- Comment #2 from Martin Liška --- > Optimized dump differs for couple of functions in the same way: > > diff -u good bad > --- good2021-11-12 17:42:36.995947103 +0100 > +++ bad 2021-11-12 17:41:56.728194961 +0100 > @@ -38,7 +38

Re: [Bug tree-optimization/103175] [12 Regression] internal compiler error: in handle_call_arg, at tree-ssa-structalias.c:4139

2021-11-11 Thread Jan Hubicka via Gcc-bugs
The sanity check verifies that a function accessing a parameter indirectly also reads the parameter (otherwise the indirect reference cannot happen). This patch moves the check earlier and removes some overactive flag cleaning on the function call boundary which introduces the nonsensical situation. I g

Re: [Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

2021-11-08 Thread Jan Hubicka via Gcc-bugs
Note that it still seems to me that the crossed_loop_header handling is overly conservative. We have: @ -2771,6 +2771,7 @@ jt_path_registry::cancel_invalid_paths (vec &path) bool seen_latch = false; int loops_crossed = 0; bool crossed_latch = false; + bool crossed_loop_header = false;

Re: [Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

2021-11-07 Thread Jan Hubicka via Gcc-bugs
> > This PR is still open, at least for slowdown in the threader with LTO. The > issue is ranger wide, so it may also cause slowdowns on non-LTO builds for > WRF, though I haven't checked. I just wanted to record the fact somewhere since I was looking up the revision range mostly to figure out i

Re: [Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

2021-11-04 Thread Jan Hubicka via Gcc-bugs
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943 > > Aldy Hernandez changed: > >What|Removed |Added > > Depends on||103058 > > --- Comment #

Re: [Bug d/103040] [12 Regression] gdc.dg/torture/pr101273.d FAILs

2021-11-02 Thread Jan Hubicka via Gcc-bugs
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103040 > > --- Comment #15 from Iain Buclaw --- > Got it. The difference between D and C++ is a matter of early inlining. > > The C++ example Jakub posted fails in the same way that D does if you compile > with: -O1 -fno-inline Great, I will take a

Re: [Bug d/103040] [12 Regression] gdc.dg/torture/pr101273.d FAILs

2021-11-02 Thread Jan Hubicka via Gcc-bugs
> See above comments from Iain, even if that pre-initialization is removed it is > still miscompiled. And, the testcase fails not because of the padding bits > not > being zero, but because the address of self stored into one of the fields > isn't > there or modref thinks it can't be changed or

Re: [Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast between ce4d1f632ff3f680550d3b186b60176022f41190 and 6fca1761a16c68740f875fc487b98b6bde8e9be7

2021-10-29 Thread Jan Hubicka via Gcc-bugs
> Not seen on Haswell (but w/o PGO). Is this PGO specific? There's another > large jump visible end of 2019. It is between 2019-11-15 and 18 but the revisions do not exist in git - perhaps they refer to the old git mirror. Martin will know better. In that range there are many of Richard's vec

Re: [Bug ipa/102982] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)

2021-10-28 Thread Jan Hubicka via Gcc-bugs
> > fixup_cfg already removes write-only stores so that seems fit for that > purpose. > > Btw, > > static int x = 1; > > int main() > { > x = 1; > } > > should ideally be handled as well as maybe the more common(?) > > static int x[128]; > > int main() > { > memset (x, 0, 128*4); > } >

Re: [Bug tree-optimization/102446] [9/10/11/12 Regression] wrong code at -O3 on x86_64-linux-gnu

2021-09-22 Thread Jan Hubicka
> Started with r5-6477-g3620b606822f80863488ca4883542d848d41f9f9 This only affects early inlining decisions, so it may be useful to bisect this with --param early-inlining-insns=14 Honza

Re: [Bug lto/99898] Possible LTO object incompatibility on gcc-10 branch

2021-04-06 Thread Jan Hubicka
> Any *.opt changes can break the streaming of optimization or target option > nodes. > And from experience with gcc plugins we have such changes ~ each month even on > release branches. It may make sense to add a simple test to our regular testers that either the new revision can consume old objec

Re: [Bug ipa/99835] missed optimization for dead code elimination at -O3 (vs. -O1)

2021-03-31 Thread Jan Hubicka
> At -O3 the unused 'c' remains. Likely different (recursive?) inlining makes > us > process a cgraph cycle in different order and thus fail to elide the output > of 'c' (it's output first at -O3). > > Fixing that would need processing cgraph SCCs with an extra IPA phase in main > optimization s

Re: [Bug bootstrap/98338] [10/11 Regression] profiledbootstrap failure on x86_64-linux

2021-02-26 Thread Jan Hubicka
> FYI, I have today bootstrapped it as well in rpm build on > {x86_64,i686,powerpc64le}-linux, both your patch and just trunk without the > workaround I've been using before. The latter failed to bootstrap on i686 > and passed it on x86_64 and powerpc64le, the former passed bootstrap on all > arch

Re: [Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread Jan Hubicka
> Ah, yeah, that will make a big difference. > So clang is using 'make check', running a test-suite for a PGO build, right? It uses make check-llvm make check-clang and then it rebuilds whole llvm with the instrumented compiler. Honza

Re: [Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105 > > --- Comment #8 from Martin Liška --- > This is what I see for GCC PGO in train stage. It's from perf top: > >4.33% cc1plus [.] > __gcov_indirect_call_profiler_v4 > ◆ >2.28

Re: [Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread Jan Hubicka
> A small improvement can be achieved by the removal of libgcov I/O buffering: > https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=5a17015c096012b9e43a8dd45768a8d5fb3a3aee So it effectively replaces gcov's own buffered I/O by stdio. First I am not sure how safe it is (as we had a lot of fun about usin

Re: [Bug middle-end/99097] profiledbootstrap fails with LTO and disabled plugin

2021-02-15 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99097 > > --- Comment #5 from Martin Liška --- > (In reply to Jan Hubicka from comment #3) > > > I've just tried to reproduce it: > > > ../configure --with-build-config=bootstrap-lto --enable-checking=release >

Re: [Bug middle-end/99097] profiledbootstrap fails with LTO and disabled plugin

2021-02-15 Thread Jan Hubicka
> I've just tried to reproduce it: > ../configure --with-build-config=bootstrap-lto --enable-checking=release > --disable-plugin > > But the build is fine for me. On our dhcp230 (zen III machine) it works if you make system linker ld, if system linker is gold (from tumbleweed) it fails GNU gold (

Re: [Bug c++/98330] [9/10/11 Regression] ICE in compute_parm_map, at ipa-modref.c:2900 since r9-2640-g3d78e00879b42574

2021-01-19 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98330 > > --- Comment #4 from Richard Biener --- > So modref allocates a fnspec_summary for an unknown indirect call (NULL > callee) > but then in compute_parm_map calls function_or_virtual_thunk_symbol on > that NULL callee unconditionally. We hav

Re: [Bug c++/91241] [8/9/10/11 Regression] internal compiler error: symtab_node::verify failed

2020-12-07 Thread Jan Hubicka
> @Marek: The callgraph checking error is correct. > If you disable it, you will likely see duplicate assembler names in GAS. And > that's the error that 2 symbol names clash. Indeed, there are two lambdas, but I think C++ FE should assign them different symbol names. Honza

Re: [Bug c/97172] [11 Regression] ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57

2020-12-01 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172 > > --- Comment #18 from Martin Sebor --- > Let me explain how this works. The VLA bounds in function parameters are used > in two ways: > 1) in the front end, to check function redeclarations involving arrays and > VLAs > for equivalence, >

Re: [Bug tree-optimization/97915] New: ICE in get_odr_type, at ipa-devirt.c:1930 in pre

2020-11-19 Thread Jan Hubicka
Hi, this ought to be fixed by g:0862d007b564eca8c9a48fca0e689dd3f90db828 sorry for the breakage. OBJ_TYPE_REF in obj-C frontend is odd.

Re: [Bug bootstrap/97857] [11 Regression] profiledbootstrap broken freeing speculative call summary since r11-4987-g602c6cfc79ce4ae61e277107e0a60079c1a93a97

2020-11-16 Thread Jan Hubicka
This patch fixes the issue by making the conflict with C type sticky via clearing the CXX bit. I checked that it recovers profiledbootstrap, however I want to look into the code a bit more tomorrow to be sure that it does not disable more than it should. Honza diff --git a/gcc/ipa-utils.h b/gcc/ip

Re: [Bug middle-end/97840] [11 regression] Bogus -Wmaybe-uninitialized

2020-11-16 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97840 > > --- Comment #14 from Martin Sebor --- > Created attachment 49572 > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49572&action=edit > Patch under test. > > The attached patch avoids the warning on aarch64. Let me finish testing it

Re: [Bug bootstrap/97857] [11 Regression] profiledbootstrap broken freeing speculative call summary since r11-4987-g602c6cfc79ce4ae61e277107e0a60079c1a93a97

2020-11-16 Thread Jan Hubicka
The checking enabled build ICEs for me at same spot as for you 0x01475505 <+165>: punpcklqdq %xmm2,%xmm3 0x01475509 <+169>: movaps %xmm3,0x30(%rsp) 0x0147550e <+174>: callq 0x10949d0 ::iterator::slide()> 0x01475513 <+179>: mov%r12,0x20(%rsp

Re: [Bug middle-end/97840] [11 regression] Bogus -Wmaybe-uninitialized

2020-11-16 Thread Jan Hubicka
> I agree we should just rename default_is_empty_type to is_empty_type, export > it, declare in tree.h and use it instead that complicated test. TYPE_EMPTY_P > isn't something tree-ssa-uninit.c should care about, that is just whether the > backend decided it will not be passed at all. OK, perhaps

Re: [Bug middle-end/97840] [11 regression] Bogus -Wmaybe-uninitialized

2020-11-16 Thread Jan Hubicka
> Note i686-linux bootstrap is still broken in r11-5062 - the PR97853 error. Yes, as discussed earlier (but perhaps lost in other comments) we need a fix for the targetm.calls.empty_record_p (type) divergence. It is not clear to me if simply calling the default implementation instead of the rather com

Re: [Bug bootstrap/97857] [11 Regression] profiledbootstrap broken freeing speculative call summary since r11-4987-g602c6cfc79ce4ae61e277107e0a60079c1a93a97

2020-11-16 Thread Jan Hubicka
It seems to crash in quite a few locations but always related to indirect calls. So perhaps there is some sort of weird relation to indirect call profiling or devirtualization... I am going to move my build to a faster machine. Honza

Re: [Bug bootstrap/97857] [11 Regression] profiledbootstrap broken freeing speculative call summary since r11-4987-g602c6cfc79ce4ae61e277107e0a60079c1a93a97

2020-11-16 Thread Jan Hubicka
> > Yep, I already worked out it is ipa-icf... > > Do you have easy way to bisect what merge is causing the failure? > > Working on that will send details soon. Great, thanks. In meantime I will check if I can isolate one of the paths (constant access merging, variable access merging on the two o

Re: [Bug bootstrap/97857] [11 Regression] profiledbootstrap broken freeing speculative call summary since r11-4987-g602c6cfc79ce4ae61e277107e0a60079c1a93a97

2020-11-16 Thread Jan Hubicka
> I see a similar bootstrap failure that's with: > > ../configure --enable-languages=c,c++,lto --prefix=/home/marxin/bin/gcc > --disable-multilib --without-isl --disable-libsanitizer > --with-build-config=bootstrap-lto-lean && make profiledbootstrap > 'STAGE1_CFLAGS=-g -O2' > > started with r11-4

Re: [Bug ipa/97695] [11 Regression] wrong code at -O3 on x86_64-pc-linux-gnu since r11-4587-gae7a23a3fab74.

2020-11-03 Thread Jan Hubicka
will clean it up incrementally. gcc/ChangeLog: 2020-11-03 Jan Hubicka * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Fix ICE with in dumping code. (cgraph_node::remove): Save clone info before releasing it and pass it to unregister. * cgraph.h

Re: [Bug c/97578] ice during IPA pass: inline

2020-11-03 Thread Jan Hubicka
> It needs to refer to the DW_TAG_formal_parameter DIEs, and only the PARM_DECLs > map to those. It has a problem with the partitioning (if we call a callee from a different partition) and also if the callee is compiled before the caller (as it should be) we will call cgraph_node::release_body and that will l

Re: [Bug c/97578] ice during IPA pass: inline

2020-11-01 Thread Jan Hubicka
Hi, this patch fixes the ICE, though I think we do have a design issue here while producing debug info across ltrans boundary. Martin, Jakub: as discussed on IRC it would be nice to add predicate when the body is really needed and avoid materializing if it is not. Can you add one? Something like

Re: [Bug ipa/97586] [11 Regression] "make check" failures in binutils with -flto since r11-3641-gc34db4b6f8a5d803

2020-10-27 Thread Jan Hubicka
> Hi, > this is patch that moves updates to WPA time. Does it work for you? Actually it won't help, since it updates only non-lto summary. I am testing better patch, sorry for that. Honza

Re: [Bug ipa/97586] [11 Regression] "make check" failures in binutils with -flto since r11-3641-gc34db4b6f8a5d803

2020-10-27 Thread Jan Hubicka
Hi, this is patch that moves updates to WPA time. Does it work for you? Honza 2020-10-27 Jan Hubicka * ipa-modref.c (modref_summaries_lto::duplicate): Check that no clones happens after modref. (modref_transform): Rename to ... (update_signature): ... this

Re: [Bug lto/97586] [11 Regression] "make check" failures in binutils with -flto since r11-3641-gc34db4b6f8a5d803

2020-10-27 Thread Jan Hubicka
> So the _bfd_safe_read_leb128.constprop removes the first unused argument: > ... > > But the analysis is bogus: > > ipa-modref: call to _bfd_safe_read_leb128.constprop/17919 does not clobber > ref: > bytes_read alias sets: 7->7 > > The &bytes_read is always modified in the function (if it's n

Re: [Bug c/97445] Some fonctions marked static inline in Linux kernel are not inlined

2020-10-20 Thread Jan Hubicka
still need the stronger hint though. gcc/ChangeLog: 2020-10-20 Jan Hubicka PR c/97445 * ipa-fnsummary.c (ipa_dump_hints): Handle INLINE_HINT_builtin_constant_p. (ipa_fn_summary::~ipa_fn_summary): Free builtin_constant_p_parms. (ipa_fn_summary_t

Re: [Bug c/97445] Some fonctions marked static inline in Linux kernel are not inlined

2020-10-20 Thread Jan Hubicka
> > Original asm is: > > __attribute__ ((noinline)) > int fls64(__u64 x) > { > int bitpos = -1; > asm("bsrq %1,%q0" > : "+r" (bitpos) > : "rm" (x)); > return bitpos + 1; > } > > There seems to be bug in bsr{q} pattern. I can make GCC produce same > code with: > > __attribute__ ((n
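
For comparison, the same result can be computed without inline asm by using GCC's `__builtin_clzll`. This is only a sketch, not the code proposed in the bug report: the name `fls64_builtin` is hypothetical, and it assumes `unsigned long long` is 64 bits wide.

```c
#include <stdint.h>

/* Hypothetical asm-free counterpart of the fls64 quoted above.
   Returns the 1-based index of the most significant set bit, or 0
   for x == 0, which is the value the quoted code is written to
   produce by starting from bitpos = -1.  */
static inline int
fls64_builtin (uint64_t x)
{
  /* __builtin_clzll is undefined for 0, so handle that case first.  */
  if (x == 0)
    return 0;
  return 64 - __builtin_clzll (x);
}
```

Because there is no asm statement, the compiler can see through the function when it estimates its size and when it folds constant arguments.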

Re: [Bug c/97445] Some fonctions marked static inline in Linux kernel are not inlined

2020-10-20 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445 > > --- Comment #33 from Jakub Jelinek --- > (In reply to Jan Hubicka from comment #32) > > get_order is a wrapper around ffs64. This can be implemented w/o asm > > statement as follows: > > int > > m

Re: [Bug c/97445] Some fonctions marked static inline in Linux kernel are not inlined

2020-10-20 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445 > > --- Comment #31 from Segher Boessenkool --- > (In reply to Jan Hubicka from comment #27) > > It is because --param inline-insns-single was reduced for -O2 from 200 > > to 70. GCC 10 has newly different set of pa

Re: [Bug c/97445] Some fonctions marked static inline in Linux kernel are not inlined

2020-10-19 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445 > > --- Comment #23 from Christophe Leroy --- > (In reply to Jan Hubicka from comment #19) > > > > It is always possible to always_inline functions that are intended to be > > always inlined. > > Honza >
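
As a minimal illustration of the `always_inline` suggestion quoted above, a function can be marked as follows. The helper names `add_one` and `call_add_one` are hypothetical and are not taken from the kernel or from the bug report.

```c
/* Hypothetical helper.  The always_inline attribute asks GCC to inline
   the function at every direct call, independent of the inliner's
   size heuristics.  */
static inline __attribute__ ((always_inline)) int
add_one (int x)
{
  return x + 1;
}

/* A caller; the body of add_one is expanded here.  */
int
call_add_one (int x)
{
  return add_one (x);
}
```

A plain `static inline` is only a hint that the heuristics may override, whereas the attribute states the intent explicitly.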

Re: [Bug gcov-profile/97461] [11 Regression] allocate_gcov_kvp() deadlocks in firefox LTO+PGO build (overridden malloc() recursion)

2020-10-19 Thread Jan Hubicka
> > They have the very same problem when I disable a statically pre-allocated > buffers with -mllvm -vp-static-alloc=0: > > Program received signal SIGILL, Illegal instruction. > 0x004014e6 in calloc (nmemb=1, size=8) at pr97461.c:103 > 103 if (malloc_depth != 0) __builtin_trap();

Re: [Bug gcov-profile/97461] [11 Regression] allocate_gcov_kvp() deadlocks in firefox LTO+PGO build (overridden malloc() recursion)

2020-10-19 Thread Jan Hubicka
> No. The only thing we support is a recursive malloc as seen in: > ./gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-malloc.c > > It was added in g:bc2b1a232b1825b421a1aaa21a0865b2d1e4e08c as we use a > statically allocated buffer when we recursively entry allocate_gcov_kvp. > > However this is d
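
The idea of falling back to a statically allocated buffer on re-entry can be sketched roughly as below. This is not the libgcov implementation; `struct kvp`, `POOL_SLOTS`, `alloc_depth` and `allocate_node` are hypothetical names, and the sketch is single-threaded.

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_SLOTS 8                    /* hypothetical pool size */

struct kvp { long key; long value; };   /* hypothetical node type */

static struct kvp static_pool[POOL_SLOTS];
static size_t static_pool_used;
static int alloc_depth;                 /* > 0 while inside allocate_node */

/* Normally returns a heap node.  When entered recursively (for example
   from an instrumented malloc), it hands out a slot from the static
   pool instead of calling malloc again.  Returns NULL if the pool is
   exhausted or malloc fails.  */
struct kvp *
allocate_node (void)
{
  struct kvp *node = NULL;

  if (alloc_depth > 0)
    {
      if (static_pool_used < POOL_SLOTS)
        node = &static_pool[static_pool_used++];
      return node;
    }

  alloc_depth++;
  node = malloc (sizeof *node);
  alloc_depth--;
  return node;
}
```

A guard like this only covers recursion within one thread; it does nothing for a lock that is already held when the allocator is re-entered.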

Re: [Bug ipa/97292] [11 Regression] dealII from SPECCPU 2016 no longer terminates after g:c34db4b6f8a5d80367c709309f9b00cb32630054

2020-10-08 Thread Jan Hubicka
Hi, the following patch should let us pinpoint the wrong disambiguation. With -fdump-tree-all-details we should also see the difference in the dump file. Honza diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def index cf8775b2b66..07946a85ecc 100644 --- a/gcc/dbgcnt.def +++ b/gcc/dbgcnt.def @@ -171,6 +17

Re: [Bug tree-optimization/97159] [11 Regression] segfault in modref_may_conflict

2020-09-22 Thread Jan Hubicka
Recursion is handled in normal compilation (we analyze the function and while hitting the recursive call we skip the summary). I suppose here the problem is missing LTO and offloading. With LTO lto summaries (that include types) are streamed out while they are turned into non-lto summaries at ltr

Re: [Bug bootstrap/96794] --with-build-config=bootstrap-lto-lean with --enable-link-mutex leads to poor LTRANS utilization

2020-08-26 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96794 > > --- Comment #4 from Martin Liška --- > > > For jobserver they are still running even though they sleep. > > Aha, so it is extra locking mechanizm we add without jobserver > > knowledge. > > It's unrelated to jobserver, one can enable it wi

Re: [Bug bootstrap/96794] --with-build-config=bootstrap-lto-lean with --enable-link-mutex leads to poor LTRANS utilization

2020-08-26 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96794 > > --- Comment #2 from Martin Liška --- > (In reply to Jan Hubicka from comment #1) > > > As seen > > > here:https://gist.githubusercontent.com/marxin/223890df4d8d8e490b6b2918b77dacad/raw/7e0363da60dcddbfde4ab6

Re: [Bug bootstrap/96794] New: --with-build-config=bootstrap-lto-lean with --enable-link-mutex leads to poor LTRANS utilization

2020-08-26 Thread Jan Hubicka
> As seen > here:https://gist.githubusercontent.com/marxin/223890df4d8d8e490b6b2918b77dacad/raw/7e0363da60dcddbfde4ab68fa3be755515166297/gcc-10-with-zstd.svg > > each blocking linking of a GCC front-end leads to a wasted jobserver worker. Hmm, I am not sure how to interpret the graph. I can see th

Re: [Bug ipa/96337] [10/11 Regression] GCC 10.2: twice as slow for -O2 -march=x86-64 vs. GCC 9.3/8.4

2020-08-01 Thread Jan Hubicka
> I think, this inliner change needs to be reverted. People expect -O2 to > produce > decently optimized binaries, and starting with gcc 10.x it doesn't deliver. > -O3 > traditionally enabled optimizations that may or may not improve performance > (and historically, sometimes even break code), so

Re: [Bug ipa/96337] [10/11 Regression] GCC 10.2: twice as slow for -O2 -march=x86-64 vs. GCC 9.3/8.4

2020-07-28 Thread Jan Hubicka
> > Maybe you want to use same GCC version as phoronix used (GCC 10.2)? OK, I will give it a try, but there are no inliner changes in gcc 10.2 compared to 10.1. Honza

Re: [Bug lto/95548] ice in tree_to_shwi, at tree.c:7321

2020-06-05 Thread Jan Hubicka
> I think Honza ran into this himself. Yep, I converted the code to use wide-ints. But it is nice to have a short testcase. Honza

Re: [Bug tree-optimization/91322] [10 regression] alias-4 test failure

2020-04-04 Thread Jan Hubicka
> Which ARM target has 16-bit int? > I don't see INT_TYPE_SIZE nor SHORT_TYPE_SIZE defined in config/arm/*, neither > BITS_PER_WORD, so all depends on UNITS_PER_WORD, which is 4 and thus short is > 16-bit and int is 32-bit. Hmm, you are right - I messed up target triplets. With arm-linux-gnueabi I

Re: [Bug ipa/93318] [10 regression] Firefox LTO+FDO ICEs in speculative_call_info

2020-01-19 Thread Jan Hubicka
Ok, I managed to reproduce the crash locally (it was not that easy). At the point of failure the node passes verification and I suppose the problem is that the call stmt hash contains an indirect call while it is supposed to contain a direct call. Edge removal code probably replaces direct edge by indirect

Re: [Bug tree-optimization/93084] [10 regression] Infinite loop in ipa-cp when building clang with LTO+PGO

2020-01-02 Thread Jan Hubicka
> xxx.localalias is gcc-generated as a noninterposable alias to xxx. But I guess > target node returned by xxx.localalias->function_symbol() is not xxx. That ought to return xxx unless the target of localalias is thunk that is not recursive. > A simple thing we can do is to write a simple case to f

Re: [Bug tree-optimization/93084] [10 regression] Infinite loop in ipa-cp when building clang with LTO+PGO

2019-12-30 Thread Jan Hubicka
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93084 > > --- Comment #6 from fxue at gcc dot gnu.org --- > Could you share how you build clang with PGO, and train workload? It needs a lot of patience. If you have a patch I can try it since I still have the train data and corresponding gcc tree. I

Re: [Bug rtl-optimization/68664] [6/7 Regression] Speculative sqrt in c-ray main loop causes large slow down

2017-02-06 Thread Jan Hubicka
> > I don't think so. But I don't know much about that bug, it is something > with AVX I think? If you are talking about PR79224. I see, we have separate PR for that, good ;) > > > Also with profile feedback perhaps you have enough info to tell that the > > speculative path is almost as likely

Re: [Bug rtl-optimization/68664] [6/7 Regression] Speculative sqrt in c-ray main loop causes large slow down

2017-02-06 Thread Jan Hubicka
> Scheduling should never move very expensive instructions to places they > are executed more frequently. This patch fixes that, reducing the > execution time of c-ray by over 40% (I tested on a BE Power7 system). > > This introduces a new target hook sched.can_speculate_insn which returns > whet

Re: [Bug lto/65559] [5 Regression] lto1.exe: internal compiler error: in read_cgraph_and_symbols, at lto/lto.c:2947

2015-04-06 Thread Jan Hubicka
Can you please compile with --verbose --save-temps and attach the output + temporary files produced? (in particular I wonder about the resolution file, which should be named *.res)

Re: [Bug target/65660] [5 Regression] 252.eon regression on bdver2 with -Ofast

2015-04-04 Thread Jan Hubicka
Thanks, 32-bit eon runs improved today, though I am not 100% sure whether it is due to vectorization or the unit growth change http://gcc.opensuse.org/SPEC/CINT/sb-frescobaldi.suse.de-head-64-32o-32bit/252_eon_recent_big.png Overall we had better scores on 32bit eon in the past however http://gcc.opensuse
