------- Comment #16 from dominiq at lps dot ens dot fr 2008-09-08 09:00 -------

A few personal comments.
> 2) The problem doesn't occur on powerpc-apple-darwin9.

This is normal: REAL(8) operations are not vectorized on ppc since they are not part of AltiVec. IBM has preferred to add a second FPU. The gcc scheduler needs some improvements to make better use of it, but that is another issue.

Concerning the problem on Intel Core, I think there is very little that can be done for gfortran 4.3. The easiest "fix" would be to include a CAVEAT in the gfortran documentation stating that short dot-products should be unrolled by hand in order to allow vectorization. The unrolling could indeed be done at the level of the gfortran front-end, but I don't think it is worth the work. Note that this should be restricted to 4.3, otherwise it will give a performance regression on 4.4 (see below). A last possibility is to convince the middle-end people to tweak the vector cost in the middle-end (good luck!).

I think the early unrolling improved the benchmark results, but introduced a performance regression in one of my avatars of the induct test. I have tried to reduce the problem in pr36099, but this led in a wrong direction. I am planning to do a better reduction, but I have not found the time so far.

To conclude, I think this pr should be closed as WON'T FIX, with or without the caveat, since it is not a gfortran front-end problem.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36599
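
For readers unfamiliar with what "unrolled by hand" means for a short dot-product, here is a minimal sketch in C rather than Fortran. The function names dot3_loop and dot3_unrolled and the fixed length of 3 are assumptions made for illustration; they do not come from the induct source. In Fortran, the analogous change would be replacing dot_product(a, b) on 3-element arrays with the explicit sum a(1)*b(1) + a(2)*b(2) + a(3)*b(3). Whether this actually helps the vectorizer depends on the compiler version and flags, so treat it as a sketch of the transformation, not as a measured fix.

```c
#include <stddef.h>

/* Loop form: a short dot product with a fixed trip count of 3.
 * The names and the length 3 are illustrative assumptions. */
double dot3_loop(const double *a, const double *b)
{
    double sum = 0.0;
    for (size_t i = 0; i < 3; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Hand-unrolled form: the same three products, added in the same
 * order, written out explicitly so no inner loop remains. */
double dot3_unrolled(const double *a, const double *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

Both functions compute the same value. The point of the second form is only that a surrounding loop no longer contains a short inner loop that the vectorizer has to reason about.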