On Tue, 4 Dec 2012, Jan Hubicka wrote:
> > > here is updated patch. It should get the bounds safe enough to not have
> > > effect on codegen of complete unrolling.
> > >
> > > There is IMO no way to cut the walk of loop body w/o affecting codegen in
> > > unrolling for size mode. The condition
> > here is updated patch. It should get the bounds safe enough to not have
> > effect on codegen of complete unrolling.
> >
> > There is IMO no way to cut the walk of loop body w/o affecting codegen in
> > unrolling for size mode. The condition on unrolling to happen is
> >
> > unrolled_size
> here is updated patch. It should get the bounds safe enough to not have
> effect on codegen of complete unrolling.
>
> There is IMO no way to cut the walk of loop body w/o affecting codegen in
> unrolling for size mode. The condition on unrolling to happen is
>
> unrolled_size * 2 / 3 < orig
My mailer has eaten a line in my previous mail. One should read:
I have found another fall out: I have some avatars of the polyhedron tests
where the REAL(8) have been replaced with REAL(10). Some of them are now ~50%
slower with the new value of max-completely-peeled-insns.
Should I open a new PR
> ... I believe I posted a patch?
Yes: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01799.html
I have found another fall out: I have some avatars of the polyhedron tests
where the REAL(8) have been replaced with REAL(10). Some of them are now
Should I open a new PR for that?
Cheers,
Dominique
> On Sun, 18 Nov 2012, Jan Hubicka wrote:
> > > > > this patch reduces max-peeled-insns and max-completely-peeled-insns
> > > > > from 400
> > > > > to 100. The reason why I am doing this is that I want to reduce code
> > > > > bloat
> > > > > caused by my cunroll work that enabled a lot more un
On Sun, 18 Nov 2012, Jan Hubicka wrote:
> > > > this patch reduces max-peeled-insns and max-completely-peeled-insns
> > > > from 400
> > > > to 100. The reason why I am doing this is that I want to reduce code
> > > > bloat
> > > > caused by my cunroll work that enabled a lot more unrolling than
> > Did you notice that gcc.c-torture/compile/pr43186.c regressed? It now again
> > takes a while to compile, so times out on slow machines:
> > ...
>
> On a 2.5GHz Core2Duo, compiling the test with revision 192891 (2012-10-28)
> takes a small fraction of a second, while with revision 193270 (201
Hi,
here is updated patch. It should get the bounds safe enough to not have effect
on codegen of complete unrolling.
There is IMO no way to cut the walk of loop body w/o affecting codegen in
unrolling for size mode. The condition on unrolling to happen is
unrolled_size * 2 / 3 < original_size
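For readers following the thread, the size-mode condition quoted above can be written as a small predicate. This is only a minimal sketch: the function name and the unsigned long types are assumptions made for illustration and are not taken from tree-ssa-loop-ivcanon.c.

```c
#include <stdbool.h>

/* Hypothetical helper, not GCC's actual code: returns true when the
   condition quoted in the mail holds, i.e. when two thirds of the
   estimated unrolled size (integer arithmetic) is still smaller than
   the size of the original loop.  */
static bool
unroll_shrinks_code (unsigned long unrolled_size, unsigned long original_size)
{
  return unrolled_size * 2 / 3 < original_size;
}
```

Because the test needs the estimated size of the whole unrolled body, stopping the size walk early changes the value fed into this comparison, which is presumably why the walk cannot be cut without affecting codegen in this mode.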
> FAIL: gcc.dg/graphite/interchange-8.c scan-tree-dump-times graphite "will be
> interchanged" 2
> FAIL: gcc.dg/graphite/pr42530.c (internal compiler error)
> FAIL: gcc.dg/graphite/pr42530.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/cunroll-1.c scan-tree-dump cunrolli "Unrolled loop 1
> co
Hi Jan,
> this is patch I will try to test once I have chance :)
> It simply prevents unroller from analyzing loops when they are already too
> large.
> ...
This patch breaks bootstrap with
...
/opt/gcc/p_build/./prev-gcc/g++ -B/opt/gcc/p_build/./prev-gcc/
-B/opt/gcc/gcc4.8p-193652p3/x86_64-ap
Hi,
this is patch I will try to test once I have chance :)
It simply prevents the unroller from analyzing loops when they are already too large.
* tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Add UPPER_BOUND
parameter.
(try_unroll_loop_completely): Update.
Index: tree-ssa-l
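The idea described in this mail, giving up on the size walk once the body is already known to be too large, might look roughly like the sketch below. It is not the patch itself (the diff is cut off above): the function name, the flat array of per-statement sizes, and the return convention are all assumptions for illustration, and only the notion of an upper-bound parameter comes from the ChangeLog entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch: sum per-statement size estimates and stop as soon
   as the running total exceeds UPPER_BOUND.  Returns true when the walk
   was cut short, false when every statement was counted.  The size seen
   so far is stored in *SIZE_OUT in both cases.  */
static bool
estimate_size_bounded (const int *stmt_sizes, size_t n_stmts,
                       int upper_bound, int *size_out)
{
  int size = 0;

  for (size_t i = 0; i < n_stmts; i++)
    {
      size += stmt_sizes[i];
      if (size > upper_bound)
        {
          /* Already over the limit; the rest of the body is not walked.  */
          *size_out = size;
          return true;
        }
    }

  *size_out = size;
  return false;
}
```

A caller would treat a true return as "too large to unroll" and skip any further analysis of that loop.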
> OK, here are multiple issues.
> 1) recursive inlining makes huge loop nest (of 18 loops)
> 2) SCEV is very slow on answering simple_iv tests in this case because it
> walks the nest
> 3) unroller is computing loop body size even when it is clear the body is
> much larger than the limit (the outer
> > > this patch reduces max-peeled-insns and max-completely-peeled-insns from
> > > 400
> > > to 100. The reason why I am doing this is that I want to reduce code
> > > bloat
> > > caused by my cunroll work that enabled a lot more unrolling than
> > > previously
> > > causing considerable code
> > this patch reduces max-peeled-insns and max-completely-peeled-insns from 400
> > to 100. The reason why I am doing this is that I want to reduce code bloat
> > caused by my cunroll work that enabled a lot more unrolling than previously
> > causing considerable code size regression at -O3.
>
>
> Did you notice that gcc.c-torture/compile/pr43186.c regressed? It now again
> takes a while to compile, so times out on slow machines:
> ...
On a 2.5GHz Core2Duo, compiling the test with revision 192891 (2012-10-28)
takes a small fraction of a second, while with revision 193270 (2012-11-06)
it
> this patch reduces max-peeled-insns and max-completely-peeled-insns from 400
> to 100. The reason why I am doing this is that I want to reduce code bloat
> caused by my cunroll work that enabled a lot more unrolling than previously
> causing considerable code size regression at -O3.
Did you not
On Thu, Nov 15, 2012 at 12:34:07AM +0100, Jan Hubicka wrote:
> * params.def (max-peeled-insns, max-completely-peeled-insns): Reduce to
> 100.
Ok, thanks.
> --- params.def (revision 193505)
> +++ params.def (working copy)
> @@ -290,7 +290,7 @@ DEFPARAM(PARAM_MAX_UNROLL_TIMES,
Hi,
this patch reduces max-peeled-insns and max-completely-peeled-insns from 400 to
100. The reason why I am doing this is that I want to reduce code bloat caused
by my cunroll work that enabled a lot more unrolling than previously causing
considerable code size regression at -O3.
I do not think