------- Comment #6 from burnus at gcc dot gnu dot org  2010-08-02 12:41 -------
(In reply to comment #5)
> > As pointed out by Dominique, one needs to be careful. I think one can 
> > optimize:
> >   c = matmul(b, transpose(c))
> >   c = matmul(c,b)
> > where c is a rank-2 matrix. I think one cannot optimize it if c is just a
> > vector. (Add "conj" as you wish.) And check whether I have not confused 
> > columns
> > and rows
> 
> So far I have never seen a product matrix*matrix done "in place" and I
> seriously doubt it can be done.
> The best I can see is to have a temporary of rank 1

I fail to see why a scalar temporary is not enough:

do j = 1, N
  do i = 1, N
    tmp = 0.0
    do k = 1, N
      tmp = a(i,k)*b(k,j)
    end do
    a(i,j) = tmp
  end do
end do

Note: it won't work with a scalar if one reverses the i and the j loop.
[It would work with a rank-1 "B" argument, but then "A = matmul(B,A)" won't
work as the result is a rank-1 array ...]

> Note also that the polyhedron test nf.f90 is compiled with unnecessary
> temporaries:

My feeling is that ONE, THREE (of comment 0), FIVE (of comment 1), and your
example (comment 5) point all to the same dependency-analysis problem.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45159

Reply via email to