http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42118

Harald Anlauf <anlauf at gmx dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |anlauf at gmx dot de

--- Comment #5 from Harald Anlauf <anlauf at gmx dot de> 2012-03-01 19:54:08 UTC ---
(In reply to comment #4)
> Additionally, as written before (comment 2), a reasonably well written DO
> loop should always be as fast or faster than a FORALL. The definition of
> FORALL does not allow for a good optimization in the general case.

Do not forget that FORALL statements are subject to constraints that do not
apply to DO loops, which ensure that all assignments are independent of each
other.  This guarantees vectorizability.
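
To illustrate the semantic difference (this is not the test case of this PR,
just a made-up sketch): in the DO loop below each iteration sees the value
stored by the previous one, whereas the FORALL behaves as if all right-hand
sides were evaluated before any assignment, so its assignments are independent
by construction.

program forall_vs_do
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n)
  integer :: i

  a = 1.0
  b = 1.0

  ! DO loop: sequential semantics; each iteration uses the freshly updated
  ! a(i-1), so this computes a running (prefix) sum: 1 2 3 4 ...
  do i = 2, n
     a(i) = a(i-1) + a(i)
  end do

  ! FORALL: all right-hand sides are evaluated as if before any assignment,
  ! so every b(i) sees the original b(i-1); the assignments are independent
  ! and may be reordered or vectorized freely: 1 2 2 2 ...
  forall (i = 2:n)
     b(i) = b(i-1) + b(i)
  end forall

  print *, 'DO:     ', a
  print *, 'FORALL: ', b
end program forall_vs_do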

> I did a quick run with six compilers. Result: The FORALL construct was between
> 3.2 to 5.25 times slower than the DO loop. Thus, other compilers do not handle
> it better, either.
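
As a rough sketch of the kind of back-to-back timing such a comparison
implies (the program name, kernel, problem size and repetition count below
are placeholders of my own, not the actual test case attached to this PR):

program time_forall_vs_do
  implicit none
  integer, parameter :: n = 2000, nrep = 20
  real, allocatable :: a(:,:), b(:,:)
  real :: t0, t1
  integer :: i, j, k

  allocate (a(n,n), b(n,n))
  call random_number(b)

  ! DO-loop version of the (placeholder) kernel, repeated to get a
  ! measurable elapsed time.
  call cpu_time(t0)
  do k = 1, nrep
     do j = 1, n
        do i = 1, n
           a(i,j) = 2.0 * b(i,j) + real(k)
        end do
     end do
  end do
  call cpu_time(t1)
  print *, 'Time of operation was ', t1 - t0, ' seconds'

  ! Equivalent FORALL version of the same kernel.
  call cpu_time(t0)
  do k = 1, nrep
     forall (i = 1:n, j = 1:n)
        a(i,j) = 2.0 * b(i,j) + real(k)
     end forall
  end do
  call cpu_time(t1)
  print *, 'Time of operation was ', t1 - t0, ' seconds'

  ! Use the result so the compiler cannot discard the work entirely.
  print *, sum(a)
end program time_forall_vs_do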

I tried Sun Studio 12 on i686:

 Time of operation was  11.831321  seconds
 Time of operation was  12.235342  seconds

and on x86_64 (AMD Barcelona):

 Time of operation was  8.715117  seconds
 Time of operation was  10.525522  seconds

So a small slowdown.

Then I tried NEC's sxf90 rev.441 for SX-9 at -Chopt:

 Time of operation was   4.187261   seconds
 Time of operation was   1.259775   seconds

Whoops!  After looking into the transformation listing and instrumenting
the code, it appears that the DO loop is poorly optimized, producing lots
of so-called memory bank conflicts.
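
I don't know which access pattern the transformed loop ends up with, but the
classical way to provoke bank conflicts on a bank-interleaved vector machine
like the SX is a large power-of-two stride; a generic illustration (not the
loop from this test case):

program bank_conflict_stride
  implicit none
  ! A power-of-two leading dimension means that walking along a row
  ! (non-contiguous in Fortran's column-major layout) strides through
  ! memory in steps of 1024 elements, which tends to map successive
  ! accesses onto the same memory bank on bank-interleaved machines.
  ! The classical workaround is to pad the leading dimension to an odd
  ! value, e.g. real :: a(n+1,n).
  integer, parameter :: n = 1024
  real :: a(n,n)
  integer :: i, j

  a = 0.0
  do i = 1, n          ! row index fixed in the inner loop below
     do j = 1, n       ! inner loop strides by n = 1024 elements
        a(i,j) = a(i,j) + 1.0
     end do
  end do
  print *, sum(a)
end program bank_conflict_stride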

Reducing optimization to -Cvopt, I get:

 Time of operation was   1.185673   seconds
 Time of operation was   1.271729   seconds

Looks reasonable.

So yes, FORALL is in practice slightly slower (almost always... ;-)
