At Thu, 24 Mar 2005 05:00:27 +0900, Adam C Powell IV wrote:
> I have an MPI program which does a popen and fread, something like:
>
>   if (snprintf (filename, 999, "gunzip -c < %s.cpu%.4d.data",
>                 basename, rank) > 999)
>     return 1;
>   if (!(infile = popen (filename, "r")))
>     return 1;
>   if (ferror (infile))
>     {
>       printf ("[%d] Pipe open has error %d\n", rank, ferror (infile));
>       fflush (stdout);
>     }
>   ... some stuff ...
>   nmemb = fread (globalarray, sizeof (PetscScalar), gridpoints * dof,
>                  infile);
>   if (nmemb != gridpoints * dof)
>     {
>       printf ("[%d] ferror = %d\n", rank, ferror (infile));
>       fflush (stdout);
>     }
>
> So, there seems to be no error in the popen, but on between one and five
> CPUs out of about 20, the fread results in an EPERM error. On the other
> cluster, the error is less frequent but still there. They're both
> identically-configured Debian beowulfs using the diskless package and
> mpich, though the one with fewer errors is made of dual AthlonXP 1.53
> GHz boxes and the one with more errors of dual Opteron 240 boxes running
> Debian stock -k7-smp kernels and 32-bit userland.
>
> On the other hand, the same program earlier fopen()s a file whose path
> and name are identical to the popen redirected input except for the
> extension, and those work flawlessly.
I think this problem should be separated from MPI and the clusters. This kind of intermittent behavior is usually caused by an invalid memory access. I recommend checking your program with valgrind first, then isolating the problem from MPI.

Regards,
-- gotom