https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84037

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
Oh, and if you don't disable inlining then you get down to sizes of 148
(SSE and SLP) and 91 and 75 (SSE and no SLP).  So you won't get rid
of two instances of vectorization regardless of the parameter
(for size 75 I don't apply PARAM_MAX_UNROLLED_INSNS because it's not at least
one full unroll copy when looking at the scalar body size of 43).

With the default param we inhibit use of SLP in:

capacita2.f90:226:0: note: estimated vector body size is 19, scalar body size 2
capacita2.f90:226:0: note: not vectorized: loop grows too much.
capacita2.f90:226:0: note: estimated vector body size is 11, scalar body size 2
capacita2.f90:226:0: note: loop vectorized

capacita2.f90:259:0: note: estimated vector body size is 19, scalar body size 2
capacita2.f90:259:0: note: not vectorized: loop grows too much.
capacita2.f90:259:0: note: estimated vector body size is 11, scalar body size 2
capacita2.f90:259:0: note: loop vectorized

in addition to the critical loop (copies) at 551:

capacita2.f90:551:0: note: estimated vector body size is 298, scalar body size 43
capacita2.f90:551:0: note: not vectorized: loop grows too much.
capacita2.f90:551:0: note: estimated vector body size is 259, scalar body size 43
capacita2.f90:551:0: note: not vectorized: loop grows too much.
capacita2.f90:551:0: note: estimated vector body size is 147, scalar body size 43
capacita2.f90:551:0: note: loop vectorized
capacita2.f90:551:0: note: estimated vector body size is 259, scalar body size 43
capacita2.f90:551:0: note: not vectorized: loop grows too much.
capacita2.f90:551:0: note: estimated vector body size is 147, scalar body size 43
capacita2.f90:551:0: note: loop vectorized
capacita2.f90:551:0: note: estimated vector body size is 258, scalar body size 43
capacita2.f90:551:0: note: not vectorized: loop grows too much.
capacita2.f90:551:0: note: estimated vector body size is 259, scalar body size 43
capacita2.f90:551:0: note: not vectorized: loop grows too much.
capacita2.f90:551:0: note: estimated vector body size is 168, scalar body size 43
capacita2.f90:551:0: note: loop vectorized


I do think that applying this sort of heuristic makes sense, even if it doesn't
help the polyhedron case.

Timings for different values of the parameter are:

300 (w/o patch)  20.91user 0.05system 0:20.97elapsed
200 (default)    20.98user 0.08system 0:21.07elapsed 99%CPU
147              19.62user 0.06system 0:19.70elapsed 99%CPU
146              17.27user 0.08system 0:17.36elapsed 99%CPU
140              17.19user 0.06system 0:17.26elapsed 99%CPU
91               17.41user 0.05system 0:17.48elapsed 99%CPU
90               16.98user 0.05system 0:17.04elapsed 99%CPU
75               17.01user 0.04system 0:17.06elapsed 99%CPU
74               16.93user 0.06system 0:16.99elapsed 99%CPU 
1                17.02user 0.06system 0:17.08elapsed 100%CPU

The sweet spot for this benchmark seems to be 146...

For reference, with -fno-tree-vectorize I get

                 18.36user 0.09system 0:18.45elapsed 99%CPU

and with --param vect-max-version-for-alias-checks=0

                 16.92user 0.06system 0:16.99elapsed 99%CPU
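
To put the user times above in relation to the -fno-tree-vectorize run, here is
a small Python sketch. It is not part of the patch; the dictionary only copies
the figures quoted in this comment, and the names user_seconds, baseline and
relative_change are made up for illustration.

```python
# User times in seconds, copied from the measurements in this comment.
# Keys are the parameter values, in the order they were listed.
user_seconds = {
    "300 (w/o patch)": 20.91,
    "200 (default)": 20.98,
    "147": 19.62,
    "146": 17.27,
    "140": 17.19,
    "91": 17.41,
    "90": 16.98,
    "75": 17.01,
    "74": 16.93,
    "1": 17.02,
}

# User time of the -fno-tree-vectorize run, used as the reference point.
baseline = 18.36


def relative_change(seconds, base=baseline):
    """Percentage change of a user time relative to the reference run."""
    return (seconds - base) / base * 100.0


for label, seconds in user_seconds.items():
    print(f"{label:<16} {seconds:6.2f}s  {relative_change(seconds):+6.1f}%")

# Find the pair of adjacent parameter values with the largest drop in user time.
labels = list(user_seconds)
drops = [
    (user_seconds[a] - user_seconds[b], a, b)
    for a, b in zip(labels, labels[1:])
]
largest_drop = max(drops)
print("largest drop between", largest_drop[1], "and", largest_drop[2])
```

Negative percentages mean the run was faster than with vectorization disabled.
These are single measurements, so small differences between neighbouring
parameter values may well be noise.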
