---------- Forwarded message ----------
From: Nathan Moore <[EMAIL PROTECTED]>
Date: Fri, Nov 21, 2008 at 9:35 AM
Subject: Re: [Beowulf] OpenMP on AMD dual core processors
To: Bill Broadley <[EMAIL PROTECTED]>

You're right about the recursive definition,

    v(i,j) = 0.25*(v(i-1,j)+v(i+1,j)+v(i,j+1)+v(i,j-1))

It is an old serial programming trick that makes the computation go faster with little convergence penalty. I was thinking that two arrays would incur a memory latency penalty (reading from one while writing to the other), but I see what you mean about forcing the computation to be serial.

On Thu, Nov 20, 2008 at 11:47 PM, Bill Broadley <[EMAIL PROTECTED]> wrote:

> OpenMP only works on loops whose iterations are independent. So something like:
>
>     do j=1,Ny
>        v(j) = v(j) + 1
>
> So 100 CPUs could each run with a different value of j and not conflict.
>
> Your code, however:
>
>     do i=1,Nx
>        do j=1,Ny
>           if(boundary(i,j).eq.0) then
>              old_v = v(i,j)
>              v(i,j) = 0.25*(v(i-1,j)+v(i+1,j)+v(i,j+1)+v(i,j-1))
>
> Neither the i loop nor the j loop can be parallelized, because the values at
> i-1 and j-1 are referenced. Does that code even work? Is it intentional
> that the v(i-1) value is from the current iteration, but the v(i+1) value is
> from the previous iteration?
>
> Seems like a much better idea to have a new array that is built entirely
> from the previous timestep. That would allow it to converge faster, converge
> in more cases, and also parallelize.
>
> Make sense?

--
- - - - - - - - - - - - - - - - - - - - -
Nathan Moore
Assistant Professor, Physics
Winona State University
AIM: nmoorewsu
- - - - - - - - - - - - - - - - - - - - -

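For reference, the two update schemes discussed in this message can be sketched side by side. This is only a minimal illustration, written in Python rather than the thread's Fortran. The function names `sweep_in_place`, `sweep_two_array`, and `make_grid`, and the small test grid itself, are assumptions made for the example; they are not taken from the original program.

```python
def sweep_in_place(v, boundary):
    """One in-place sweep: v[i-1][j] and v[i][j-1] already hold values
    written earlier in this same sweep, so iterations depend on each other."""
    nx, ny = len(v), len(v[0])
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            if boundary[i][j] == 0:
                v[i][j] = 0.25 * (v[i-1][j] + v[i+1][j] + v[i][j+1] + v[i][j-1])
    return v


def sweep_two_array(v, boundary):
    """One two-array sweep: every new value is computed only from the
    previous sweep, so each (i, j) update is independent of the others."""
    nx, ny = len(v), len(v[0])
    new_v = [row[:] for row in v]  # copy keeps the fixed boundary values
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            if boundary[i][j] == 0:
                new_v[i][j] = 0.25 * (v[i-1][j] + v[i+1][j] + v[i][j+1] + v[i][j-1])
    return new_v


def make_grid(n=5):
    """Hypothetical test problem: top edge held at 1.0, other edges at 0.0."""
    v = [[0.0] * n for _ in range(n)]
    boundary = [[0] * n for _ in range(n)]
    for k in range(n):
        v[0][k] = 1.0
        # mark all four edges as fixed boundary cells
        for i, j in ((0, k), (n - 1, k), (k, 0), (k, n - 1)):
            boundary[i][j] = 1
    return v, boundary
```

In the two-array version each update reads only `v` and writes only `new_v`, which is the independence property that loop-level parallelization needs. The in-place version reads values written earlier in the same sweep, so its result depends on loop order. For this kind of averaging problem the in-place form is generally known to need fewer sweeps, so the choice trades iteration count against parallelism.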
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf