Re: [Beowulf] OpenMP on AMD dual core processors

Joe Landman Thu, 20 Nov 2008 21:05:26 -0800

Nathan Moore wrote:

Any suggestions? I figured that this would be a simple example to parallelize. Is there a better example for OpenMP parallelization? Also, is there something obvious I'm missing in the example below?


A few thoughts ...

Initialize your data in parallel as well. No reason not to. But optimize that code a bit. You don't need


        v_y = v_ground + (v_cloud-v_ground)*(j*dy/Ly)
        boundary(i,j)=0
        v(i,j) = v_y

when

        v(i,j)=  v_ground + (v_cloud-v_ground)*(j*dy/Ly)
        boundary(i,j)=0

will eliminate the explicit temporary variable. Also, the i.eq.0 test is guaranteed never to be hit in the if-then construct, and the same goes for the j.eq.0 test.
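A minimal sketch of the parallel initialization, assuming free-form source and a compiler with OpenMP enabled (for example gfortran -fopenmp). The grid size and the values of v_ground, v_cloud and Ly are placeholders I picked for illustration; they are not taken from the original program.

```fortran
program init_parallel
  implicit none
  ! Grid size and physical constants are placeholder values (assumptions).
  integer, parameter :: Nx = 200, Ny = 100
  real*8, parameter :: v_ground = 0.0d0, v_cloud = 1.0d0, Ly = 1.0d0
  real*8 :: v(Nx,Ny), dy
  integer :: boundary(Nx,Ny), i, j

  dy = Ly/Ny

  ! j is the parallel loop index, so it is private automatically.
  ! i is listed as private to be explicit about it.
!$omp parallel do private(i)
  do j = 1, Ny
    do i = 1, Nx
      ! no temporary v_y needed
      v(i,j) = v_ground + (v_cloud - v_ground)*(j*dy/Ly)
      boundary(i,j) = 0
    end do
  end do
!$omp end parallel do

  ! Mark the edges afterwards with plain loops, outside the main loop nest.
  do j = 1, Ny
    boundary(Nx,j) = 1
  end do
  do i = 1, Nx
    boundary(i,Ny) = 1
  end do

  print *, 'v(1,1) =', v(1,1), ' v(1,Ny) =', v(1,Ny)
  print *, 'number of boundary points:', sum(boundary)
end program init_parallel
```

The loop nest puts j on the outside and i on the inside because Fortran stores arrays column by column, so the inner loop walks through contiguous memory. Without the OpenMP flag the directives are just comments and the program runs serially.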

You can (and should) replace that if-then construct with a set of loops of the form


        do j=1,Ny
         boundary(Nx,j) = 1
        end do          
        do i=1,Nx
         boundary(i,Ny) = 1
        end do

Also, what sticks out to me is that old_v may be viewed as "shared" versus "private". OpenMP treats variables in a parallel region as shared by default unless told otherwise, so you might need to explicitly mark old_v as private. And dv for that matter.
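Something like the following shows what marking the scalars private looks like. It is a sketch under assumptions: I have not reproduced the original loop, so the array bounds (with a one-cell halo), the placeholder boundary values, and the helper array change are mine. Only the names old_v and dv come from the original code.

```fortran
program private_scalars
  implicit none
  integer, parameter :: Nx = 64, Ny = 64
  ! One-cell halo around the grid so the stencil never reads out of bounds
  ! (an assumption about the layout, not the original declaration).
  real*8 :: v(0:Nx+1,0:Ny+1), vnew(0:Nx+1,0:Ny+1), change(Nx,Ny)
  real*8 :: old_v, dv
  integer :: i, j

  v = 0.0d0
  v(:,Ny+1) = 1.0d0        ! placeholder boundary value
  vnew = v

  ! Without private(...), every thread would write to the same old_v and dv.
!$omp parallel do private(i, old_v, dv)
  do j = 1, Ny
    do i = 1, Nx
      old_v = v(i,j)
      vnew(i,j) = 0.25d0*(v(i-1,j) + v(i+1,j) + v(i,j+1) + v(i,j-1))
      dv = dabs(vnew(i,j) - old_v)
      change(i,j) = dv     ! each iteration writes its own element
    end do
  end do
!$omp end parallel do

  print *, 'largest change this sweep:', maxval(change)
end program private_scalars
```

Reading from v and writing to vnew keeps the iterations independent of each other, which is what makes the loop safe to split across threads.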

Note also that this inner loop is attempting to do a convergence test. You are looking to set a globally shared value from within an inner loop. This is not a good thing to do. This means accesses to that globally shared variable are going to be locked.

I would suggest a slightly different inner loop and convergence test (note ... this relies on something I haven't tried in Fortran, so adjustment may be needed):



real*8 vnew(Nx,Ny),dv(Nx,Ny)

do i=1,Nx
 do j=1,Ny
    ! notice that the if-then construct is gone ...
    ! vnew eq 0.0 for boundaries
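    ! merge() turns the logical test into 1.0 or 0.0; dabs() cannot
    ! take a logical argument in Fortran
    vnew(i,j) = 0.25*(v(i-1,j)+v(i+1,j)+v(i,j+1)+v(i,j-1))* &
                merge(1.0d0, 0.0d0, boundary(i,j).eq.0)

    dv(i,j) = (dabs(v(i,j)-vnew(i,j)) - convergence_v )* &
              merge(1.0d0, 0.0d0, boundary(i,j).eq.0)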

 end do
end do

! now all you need is a "linear scan" to find positive elements in
! dv.  You can approach these as sum reductions, and do them in
! parallel
do i=1,Nx
 sum=0.0
 do j=1,Ny
  ! add only the positive entries of dv
  sum = sum + max(dv(i,j), 0.0d0)
 end do
 if (sum .gt. 0.0) converged = 0
end do
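The scan above can be sketched in parallel with an OpenMP reduction clause. This is an untested-against-your-code illustration: the array size and the placeholder contents of dv are assumptions, and total is my name for the accumulator (chosen to avoid shadowing the sum intrinsic).

```fortran
program convergence_reduction
  implicit none
  integer, parameter :: Nx = 64, Ny = 64
  real*8 :: dv(Nx,Ny), total
  integer :: i, j, converged

  ! Placeholder data: every point is within tolerance (dv < 0) except one.
  dv = -1.0d-3
  dv(10,20) = 5.0d-4

  total = 0.0d0
  ! Each thread accumulates its own copy of total; OpenMP combines the
  ! copies at the end of the loop, so no shared flag is written inside it.
!$omp parallel do private(i) reduction(+:total)
  do j = 1, Ny
    do i = 1, Nx
      total = total + max(dv(i,j), 0.0d0)
    end do
  end do
!$omp end parallel do

  converged = 1
  if (total .gt. 0.0d0) converged = 0
  print *, 'converged =', converged
end program convergence_reduction
```

With the placeholder data one entry is positive, so the run reports converged = 0. The shared flag is now set once, after the parallel loop, rather than from inside it.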

The basic idea is to replace the inner loop conditionals and remove as many of the shared variables as possible.

Also cf. the examples here: http://www.linux-mag.com/id/4609, specifically the Riemann zeta function (fairly trivial).




--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
