------- Comment #56 from sebastian dot pop at cri dot ensmp dot fr  2005-11-03 
13:24 -------
Subject: Re:  [4.0/4.1 Regression] IV-OPTS is O(N^3)

> 
> So, I'd suggest that we add a --param here for max-loop-nest-depth, and then
> just not do this stuff on deeper nests, or ignore all the outer loops in that
> case.
> 
> Is that feasible?
> 

Mark, I think that this has already been fixed by the patch that added
PARAM_SCEV_MAX_EXPR_SIZE: we give up when the expressions that we
handle are longer than this size (if I remember correctly this
defaults to 20).  For the example in this bug,

{       int j0, j1, j2, j3, j4, j5, j6, j7, j8, j9, a;
        a = 0;
        for (j0 = 0; j0 < 2; j0 ++)
        for (j1 = j0; j1 < 2; j1 ++)
        for (j2 = j1; j2 < 2; j2 ++)
        for (j3 = j2; j3 < 2; j3 ++)
        for (j4 = j3; j4 < 2; j4 ++)
        for (j5 = j4; j5 < 2; j5 ++)
        for (j6 = j5; j6 < 2; j6 ++)
        for (j7 = j6; j7 < 2; j7 ++)
        for (j8 = j7; j8 < 2; j8 ++)
        for (j9 = j8; j9 < 2; j9 ++)
        a += j0 + j1 + j2 + j3 + j4 + j5 + j6 + j7 + j8 + j9;
        return a;
}

we would have a scev with 10 dimensions, i.e. an evolution in each
dimension, thus an expression with about 10 components.

Some numbers for mainline with -O2 (on amd64 with cpufreq):

time ./gcc/cc1 -O2 ~/ex/pr18595_10.c 
real    0m0.345s
user    0m0.089s
sys     0m0.017s

time ./gcc/cc1 -O2 ~/ex/pr18595_100.c 
 tree PRE              :   0.28 (21%) usr   0.02 (50%) sys   0.29 (20%) wall   
7742 kB (59%) ggc
 tree loop bounds      :   0.14 (10%) usr   0.00 ( 0%) sys   0.15 (10%) wall   
   0 kB ( 0%) ggc
real    0m1.776s
user    0m1.348s
sys     0m0.050s

time ./gcc/cc1 -O2 ~/ex/pr18595_200.c 
 tree PRE              :   1.53 (24%) usr   0.09 (60%) sys   1.63 (25%) wall  
31149 kB (74%) ggc
 tree loop bounds      :   1.21 (19%) usr   0.00 ( 0%) sys   1.21 (18%) wall   
   0 kB ( 0%) ggc
real    0m7.077s
user    0m6.363s
sys     0m0.167s

time ./gcc/cc1 -O2 ~/ex/pr18595_300.c
 tree PRE              :   4.69 (28%) usr   0.21 (68%) sys   5.03 (29%) wall  
69961 kB (80%) ggc
 tree loop bounds      :   4.03 (24%) usr   0.00 ( 0%) sys   4.06 (23%) wall   
   0 kB ( 0%) ggc
real    0m18.126s
user    0m17.022s
sys     0m0.333s

Now, if we remove PRE, we can even compile the case with 1000 nested
loops in a reasonable time:

time ./gcc/cc1 -O2 -fno-tree-pre ~/ex/pr18595_300.c 
 tree loop bounds      :   0.09 ( 4%) usr   0.00 ( 0%) sys   0.24 ( 9%) wall   
   0 kB ( 0%) ggc
real    0m3.184s
user    0m2.147s
sys     0m0.054s

time ./gcc/cc1 -O2 -fno-tree-pre ~/ex/pr18595_1000.c 
 tree loop bounds      :   2.00 ( 8%) usr   0.00 ( 0%) sys   2.00 ( 8%) wall   
   0 kB ( 0%) ggc
real    0m27.171s
user    0m26.298s
sys     0m0.169s

So, I think that we can safely close this PR.

Seb


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595

Reply via email to