------- Comment #6 from burnus at gcc dot gnu dot org 2010-08-02 12:41 -------
(In reply to comment #5)
> > As pointed out by Dominique, one needs to be careful. I think one can
> > optimize:
> > c = matmul(b, transpose(c))
> > c = matmul(c,b)
> > where c is a rank-2 matrix. I think one cannot optimize it if c is just a
> > vector. (Add "conj" as you wish.) And check whether I have not confused
> > columns
> > and rows
>
> So far I have never seen a product matrix*matrix done "in place" and I
> seriously doubt it can be done.
> The best I can see is to have a temporary of rank 1
I fail to see why a scalar temporary is not enough:
do j = 1, N
do i = 1, N
tmp = 0.0
do k = 1, N
tmp = a(i,k)*b(k,j)
end do
a(i,j) = tmp
end do
end do
Note: it won't work with a scalar if one reverses the i and the j loop.
[It would work with a rank-1 "B" argument, but then "A = matmul(B,A)" won't
work as the result is a rank-1 array ...]
> Note also that the polyhedron test nf.f90 is compiled with unnecessary
> temporaries:
My feeling is that ONE, THREE (of comment 0), FIVE (of comment 1), and your
example (comment 5) point all to the same dependency-analysis problem.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45159