[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2010-04-05 Thread rguenth at gcc dot gnu dot org
--- Comment #50 from rguenth at gcc dot gnu dot org 2010-04-05 14:23 --- (In reply to comment #49) > At least the tree-vrp.c bit did not get applied (as of trunk r157950) > Yup, my fault. I looked at the wrong patch. Thus, the first comment applies - maybe stage1 with lots of cleanu

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2010-04-05 Thread steven at gcc dot gnu dot org
--- Comment #49 from steven at gcc dot gnu dot org 2010-04-05 13:01 --- At least the tree-vrp.c bit did not get applied (as of trunk r157950) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2010-04-05 Thread rguenther at suse dot de
--- Comment #48 from rguenther at suse dot de 2010-04-05 12:56 --- Subject: Re: [4.4/4.5 Regression] 50% performance regression On Mon, 5 Apr 2010, rguenther at suse dot de wrote: > --- Comment #47 from rguenther at suse dot de 2010-04-05 12:54 --- > Subject: Re: [4.4/4.5 R

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2010-04-05 Thread rguenther at suse dot de
--- Comment #47 from rguenther at suse dot de 2010-04-05 12:54 --- Subject: Re: [4.4/4.5 Regression] 50% performance regression On Mon, 5 Apr 2010, steven at gcc dot gnu dot org wrote: > --- Comment #46 from steven at gcc dot gnu dot org 2010-04-05 12:52 > --- > What happen

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2010-04-05 Thread steven at gcc dot gnu dot org
--- Comment #46 from steven at gcc dot gnu dot org 2010-04-05 12:52 --- What happened with the patch of comment #33? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2010-01-21 Thread jakub at gcc dot gnu dot org
-- jakub at gcc dot gnu dot org changed: What|Removed |Added Target Milestone|4.4.3 |4.4.4 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-19 Thread rguenth at gcc dot gnu dot org
--- Comment #45 from rguenth at gcc dot gnu dot org 2009-12-19 21:10 --- (In reply to comment #41) > Indeed. The PRE issue could be fixed by fixing PR38819 not in the way it is > done now but "properly" detect the invalid situations during ANTIC computation > and simply never mark trap

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-19 Thread rguenth at gcc dot gnu dot org
--- Comment #44 from rguenth at gcc dot gnu dot org 2009-12-19 19:41 --- PR42436 now tracks the possible VRP and middle-end improvement. Only the PRE fixing possibility would count as a regression fix IMHO. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-19 Thread rguenth at gcc dot gnu dot org
--- Comment #43 from rguenth at gcc dot gnu dot org 2009-12-19 19:29 --- Btw, with the patch from comment #33 LIM will now hoist the division properly and the performance regression would be fixed(?). The patch will though likely cause verification issues with -fnon-call-exceptions for

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-19 Thread rguenth at gcc dot gnu dot org
--- Comment #42 from rguenth at gcc dot gnu dot org 2009-12-19 11:25 --- Subject: Bug 42108 Author: rguenth Date: Sat Dec 19 11:24:49 2009 New Revision: 155360 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155360 Log: 2009-12-19 Richard Guenther PR tree-optimizatio

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-18 Thread rguenth at gcc dot gnu dot org
--- Comment #41 from rguenth at gcc dot gnu dot org 2009-12-18 23:44 --- Indeed. The PRE issue could be fixed by fixing PR38819 not in the way it is done now but "properly" detect the invalid situations during ANTIC computation and simply never mark trapping expressions so. At the cur

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-18 Thread matz at gcc dot gnu dot org
--- Comment #40 from matz at gcc dot gnu dot org 2009-12-18 21:40 --- That's expected. There are three problems and the patch in comment #38 hacks around only one of them. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-18 Thread dominiq at lps dot ens dot fr
--- Comment #39 from dominiq at lps dot ens dot fr 2009-12-18 21:04 --- The patch in comment #38 does not fix the speed issue: the code with the inner loop is still 4 times slower than the code with the loop manually unrolled. Note that the included test regtests successfully. --

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-18 Thread rguenth at gcc dot gnu dot org
--- Comment #38 from rguenth at gcc dot gnu dot org 2009-12-18 15:43 --- Created an attachment (id=19346) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19346&action=view) patch to fix SCCVN issue This patch fixes the SCCVN issue, I'm giving it more testing. -- http://gcc.gnu

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-15 Thread rguenth at gcc dot gnu dot org
--- Comment #37 from rguenth at gcc dot gnu dot org 2009-12-15 11:08 --- No, there isn't. I'd simply allow TREE_THIS_NOTRAP on all expression codes that in principle could. Now of course the middle-end would still need to make use of this (like transition it to a stmt flag on a tuple)

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-14 Thread tkoenig at gcc dot gnu dot org
--- Comment #36 from tkoenig at gcc dot gnu dot org 2009-12-15 07:09 --- If it is any help, code which traps for a do loop is illegal Fortran, so the compiler may do anything in this case anyway. Is there a function like __builtin_i_dont_care_if_this_traps_or_not_if_it_traps_its_the_us

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-14 Thread matz at gcc dot gnu dot org
--- Comment #35 from matz at gcc dot gnu dot org 2009-12-14 16:58 --- Exactly my thinking (growing SCCs -> slow, sorting SCCs -> difficult). What I thought about the trapping problem is that in this situation we could ignore the trap test. We start with this situation: bb1: goto bb2

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-14 Thread rguenth at gcc dot gnu dot org
--- Comment #34 from rguenth at gcc dot gnu dot org 2009-12-14 12:57 --- "Another possibility is to artificially grow SCCs and their dependencies by honoring dominating virtual operand uses, not only defs (ugh)." what I mean with this is that when finding SCCs we process all uses of a

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-14 Thread rguenth at gcc dot gnu dot org
--- Comment #33 from rguenth at gcc dot gnu dot org 2009-12-14 12:30 --- Created an attachment (id=19289) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19289&action=view) VRP hack Hack marking divisions non-trapping during VRP (re-using some stmt bit, not updating relevant places

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-14 Thread rguenther at suse dot de
--- Comment #32 from rguenther at suse dot de 2009-12-14 12:27 --- Subject: Re: [4.4/4.5 Regression] 50% performance regression On Mon, 14 Dec 2009, matz at gcc dot gnu dot org wrote: > --- Comment #26 from matz at gcc dot gnu dot org 2009-12-14 04:55 --- > And if I fix this

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-14 Thread rguenth at gcc dot gnu dot org
--- Comment #31 from rguenth at gcc dot gnu dot org 2009-12-14 11:49 --- Created an attachment (id=19288) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19288&action=view) another hack Sorts the SCCs after collecting them all. Breaks most of the PRE/FRE testcases because it sorts

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-14 Thread rguenth at gcc dot gnu dot org
--- Comment #30 from rguenth at gcc dot gnu dot org 2009-12-14 11:23 --- I fail to see why FRE does not remove the redundant load of *n_9(D). Oh, it is because we first value-number D.1537_58 = *n_9(D); and only after it we value-number D.1529_45 = *n_9(D); This is because while we vi

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-14 Thread dominiq at lps dot ens dot fr
--- Comment #29 from dominiq at lps dot ens dot fr 2009-12-14 11:21 --- On x86_64-apple-darwin10, I don't see any speedup with the patch in comment #27 (not a clean bootstrap, but just an incremental build). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-14 Thread dominiq at lps dot ens dot fr
--- Comment #28 from dominiq at lps dot ens dot fr 2009-12-14 10:51 --- (In reply to comment #27) > My current collection of patches and hacks for this problem. Obviously the > "if (0)" in tree-ssa-pre.c will break pr38819 again; apart from that untested, > hence probably miscompiles ev

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-13 Thread matz at gcc dot gnu dot org
--- Comment #27 from matz at gcc dot gnu dot org 2009-12-14 05:25 --- Created an attachment (id=19287) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19287&action=view) three hacks My current collection of patches and hacks for this problem. Obviously the "if (0)" in tree-ssa-pre.

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-13 Thread matz at gcc dot gnu dot org
--- Comment #26 from matz at gcc dot gnu dot org 2009-12-14 04:55 --- And if I fix this problem (so that only one reference to *n_9 remains) I hit the problem that the Fortran frontend emits the computation of countm1 after the loop bound test. No pass is moving code in front of that te

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-13 Thread matz at gcc dot gnu dot org
--- Comment #25 from matz at gcc dot gnu dot org 2009-12-13 23:48 --- The reason that the testcase still is slow (and that the inner loop isn't unrolled or vectorized) is still the calculation of countm1. The division therein stays in the second inner loop, whereas with GCC 4.3 it can b
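
For readers following the countm1 discussion, here is a minimal C sketch of the effect described in this comment. It is not the Fortran test case from the bug and not GCC's actual lowering. The function names, the parameters, and the exact form of the trip-count expression are assumptions made for illustration. The first function recomputes a loop-invariant division on every outer iteration; the second computes it once, outside the loop. A compiler can make that transformation only if it can show that the division cannot trap (here, that the divisor is nonzero), which is why the thread keeps returning to trapping expressions.

```c
#include <assert.h>

/* Hypothetical stand-in for the loop nest: the inner trip count
   (hi - lo) / step does not change across outer iterations. */
long sum_unhoisted(int outer, int lo, int hi, int step)
{
    assert(step > 0);                 /* divisor known to be nonzero */
    long total = 0;
    for (int i = 0; i < outer; i++) {
        if (lo <= hi) {               /* loop bound test comes first */
            /* division is repeated on every outer iteration */
            unsigned countm1 = (unsigned)(hi - lo) / (unsigned)step;
            for (unsigned k = 0; k <= countm1; k++)
                total += lo + (long)k * step;
        }
    }
    return total;
}

/* Same computation with the division moved out of the outer loop,
   which is what loop-invariant motion would ideally produce. */
long sum_hoisted(int outer, int lo, int hi, int step)
{
    assert(step > 0);
    long total = 0;
    if (lo <= hi) {
        unsigned countm1 = (unsigned)(hi - lo) / (unsigned)step;  /* once */
        for (int i = 0; i < outer; i++)
            for (unsigned k = 0; k <= countm1; k++)
                total += lo + (long)k * step;
    }
    return total;
}
```

Both functions return the same value for any valid input; only the number of divisions executed differs.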

[Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression

2009-12-04 Thread dominiq at lps dot ens dot fr
--- Comment #24 from dominiq at lps dot ens dot fr 2009-12-04 14:25 --- AFAICT fixing pr42131 does not help. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108