------- Comment #5 from rakdver at gcc dot gnu dot org  2006-09-28 11:34 -------
(In reply to comment #4)
> On x86_64 4.2 decides to unroll 9 times while on 4.1 it unrolls 8 times.
> This is a code-size regression, but other than that?  The 4.2 version runs
> slightly faster than the 4.1 version, though the difference may be in the
> noise.

Choosing 9 instead of 8 looks weird, though :-).  The reason is as follows:
jump threading in the vrp2 pass peels one iteration of the loop.  With this
change, unrolling by a factor of 9 creates smaller code (only one extra
iteration needs to be peeled to make the number of iterations divisible by 9,
while one would need to peel 7 more iterations to make it divisible by 8).
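
The arithmetic can be checked with a small sketch.  The actual trip count is
not stated in this comment, so the value 56 below is purely a hypothetical
choice that happens to reproduce the "1 versus 7" figures; the helper name
extra_peel is likewise invented for illustration and is not a GCC function.

```python
def extra_peel(remaining_iterations: int, unroll_factor: int) -> int:
    """Iterations that must be peeled so the rest divides evenly by the factor."""
    return remaining_iterations % unroll_factor


# Assumption: the loop originally ran 56 times (hypothetical, not from the PR).
# Jump threading in vrp2 peels one iteration, leaving 55 for the unroller.
original_iterations = 56
remaining = original_iterations - 1

for factor in (8, 9):
    print(f"unroll by {factor}: peel {extra_peel(remaining, factor)} more")
```

Under that assumption, 55 leaves a remainder of 7 when divided by 8 but only 1
when divided by 9, so the factor-9 version carries fewer peeled copies of the
loop body.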


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256
