Hi James:

James Roberson wrote:
Hello,

I'm new to parallel programming and MPI. I've developed a simulator in C++ for which I would like to decrease the running time by using a Beowulf cluster. I'm not interested in optimizing speed, I'm just looking for a quick and easy way to significantly improve the speed over running
the program on a single machine.

I am not sure I grasp "I'm not interested in optimizing speed, I'm just looking for a quick and easy way to significantly improve the speed".

Basically, I think you are saying you want to run it in parallel in order to speed up the program. Parallelization is an optimization.


Basically, I need to learn how to parallelize a C++ function. The following functions in particular take the longest to run in my simulator. The first implements LU decomposition on a large matrix and
the second implements the backsubstitution method to solve matrix division.

First, how large are your matrices? Second, how many times is this called?


void NR::ludcmp(Mat_IO_DP &a, Vec_O_INT &indx, DP &d)
{
    const DP TINY=1.0e-20;
    int i,imax,j,k;
    DP big,dum,sum,temp;

    int n=a.nrows();
    Vec_DP vv(n);
    d=1.0;
    for (i=0;i<n;i++) {
        big=0.0;
        for (j=0;j<n;j++)
            if ((temp=fabs(a[i][j])) > big) big=temp;

(warning: coffee impaired, may not be correct in all aspects, notation, typing, rambling ....)

fabs is a relatively slow function. Think about mathematical equivalents: fabs(x) maps x into the range [0, ∞), while multiplication is much faster. If your x is small enough that x*x cannot overflow,

        temp = a[i][j];
        if ( (temp*temp) > (big*big) ) big = fabs(temp);

might be a bit faster.
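
If you want to go a step further, here is a minimal sketch of the whole row scan, tracking the entry with the largest square and calling fabs only once per row instead of once per element (it assumes a[i][j]*a[i][j] cannot overflow; 'best' is just an illustrative name):

        // scan row i for the entry of largest magnitude,
        // comparing squares so fabs is called only once per row
        DP best = 0.0;
        for (j = 0; j < n; j++) {
            DP t = a[i][j];
            if (t*t > best*best) best = t;
        }
        big = fabs(best);   // single fabs at the end of the scan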

        if (big == 0.0) nrerror("Singular matrix in routine ludcmp");
        vv[i]=1.0/big;

Second, you might want to define delta = some_small_number and compare big against that rather than against exactly 0.0. You get a catastrophic loss of precision if you divide by a very small number, and you amplify error in the process. Renormalization/scaling can be your friend, as can row manipulation.
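
Something along these lines, as a sketch (delta here is an assumed tolerance, not a value from the NR code, and should be chosen relative to the scale of your data):

        const DP delta = 1.0e-12;   // assumed tolerance, tune for your data
        if (big < delta)            // big is already >= 0 at this point
            nrerror("Singular (or nearly singular) matrix in routine ludcmp");
        vv[i] = 1.0/big;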

That said, I would suggest avoiding reinventing the wheel. There are a number of excellent LU decomposition and back substitution codes out there ...
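
For example, LAPACK's dgesv does the LU factorization and the back substitution in a single call. A minimal sketch, assuming a Fortran LAPACK is linked in (e.g. -llapack); the trailing-underscore symbol and column-major storage are the usual Fortran conventions, and the lapack_solve wrapper name is just illustrative:

        #include <vector>

        // Fortran LAPACK routine: solves A*X = B by LU with partial pivoting
        extern "C" void dgesv_(const int* n, const int* nrhs, double* a,
                               const int* lda, int* ipiv, double* b,
                               const int* ldb, int* info);

        // 'a' is n*n in column-major order and is overwritten with its LU
        // factors; 'b' holds the right-hand side on entry and the solution
        // on exit. Returns LAPACK's info code (0 means success).
        int lapack_solve(std::vector<double> &a, std::vector<double> &b, int n)
        {
            int nrhs = 1, info = 0;
            std::vector<int> ipiv(n);
            dgesv_(&n, &nrhs, &a[0], &n, &ipiv[0], &b[0], &n, &info);
            return info;
        }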


(The functions are borrowed from the library provided by "Numerical Recipes in C++")
I'm currently calling these functions from the main loop with the lines:

ah.... that explains it...


NR::ludcmp(c,indx,d);

and

NR::lubksb(c,indx,xv);

where the variable 'c' is a large matrix (holding image pixel values) and 'xv' is a vector used in backsubstitution. All of the variables passed into these functions are of types defined in "nr.h":
'c' is a Mat_DP  (double-precision matrix)
'indx' is a Vec_INT  (integer vector)
'd' is a DP    (double precision)
'xv' is a Vec_DP  (double precision vector)

Is there a simple way to call these functions which will cause the cluster to distribute the load of operations? Currently, when I run the program with a 50x50 array on a single machine, it takes about 5 minutes to process a single iteration through the matrix division.

Huh?  What machine?  What compiler?  What options?

(more lack of coffee induced stupor here)

This should take well under 1 second. O(N**3) operations (50*50*50 = 0.125E+6 operations) on a 1 GFLOPS machine (1E+9 floating point operations per second) works out to a ballpark of 0.125E+6 / 1E+9 = 0.125 milliseconds, times a constant.

Maybe I am missing something. This *should* be fast. Unless you are running on a Palm pilot or something like that ...
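
A quick way to check where the time is really going is to time just those two calls, as a drop-in fragment around the existing lines in your main loop (std::clock is coarse, but plenty to tell milliseconds from minutes):

        #include <ctime>
        #include <iostream>

        // around the existing calls, using your c, indx, d, xv
        std::clock_t t0 = std::clock();
        NR::ludcmp(c, indx, d);
        NR::lubksb(c, indx, xv);
        std::clock_t t1 = std::clock();
        std::cout << "LU + back substitution took "
                  << double(t1 - t0) / CLOCKS_PER_SEC << " s" << std::endl;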

In Octave, using an interpreted LU decomposition on a Hilbert matrix:

        h=hilb(50);
        a=time ; [l,u,p] = lu(h); b=time; b-a
        ans = 0.00063705

This is ballpark what I had estimated, within a constant factor (cough). This is an Opteron 242 with 4 GB of RAM, so it isn't a speed demon...


--

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
