Hi folks

Dealing with an MPI problem that has me scratching my head. Quite beowulfish, as that's where this code runs.

Short version: The code starts and runs. Reads in its data. Starts its iterations. And then somewhere after that, it hangs. But not always at the same place. It doesn't write state data back out to disk, just logs. Rerunning it gets it to a different point, sometimes hanging sooner, sometimes later. This seems to be the case on multiple machines with different OSes. I'm comparing MPI distributions now, and it hangs over IB as well as over shared memory and TCP sockets.

Right now we are using OpenMPI 1.2.6, and this code does use allreduce. When it hangs, an strace of the master process shows lots of polling:


c1-1:~ # strace -p 8548
Process 8548 attached - interrupt to quit
rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
rt_sigaction(SIGCHLD, {0x2b061f65c9b2, [CHLD], SA_RESTORER|SA_RESTART, 0x2b062049b130}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=10, events=POLLIN}], 6, 0) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
rt_sigaction(SIGCHLD, {0x2b061f65c9b2, [CHLD], SA_RESTORER|SA_RESTART, 0x2b062049b130}, NULL, 8) = 0

[spin forever]
...

So it looks like the process is waiting for the appropriate posting on its internal scoreboard, and just spinning in a tight poll loop until that actually happens.
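For what it's worth, a mismatched collective produces exactly this picture. A minimal sketch (purely hypothetical, not our code) of the sort of thing that would do it: one rank takes a branch and never reaches the MPI_Allreduce, so every other rank sits in the progress engine polling forever, just like the strace above.

/* Hypothetical sketch, not the application code: a collective that one
 * rank never reaches.  MPI_Allreduce must be called by every rank of the
 * communicator, so the ranks that do call it spin in the progress loop
 * (the poll() in the strace) waiting for the missing one. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double local = 1.0, global = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Rank-dependent branch: rank 0 skips the collective ... */
    if (rank != 0)
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);   /* ... so everyone else hangs here */

    printf("rank %d: global = %f\n", rank, global);
    MPI_Finalize();
    return 0;
}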

But hangs caused by a logic error like that usually happen at the same place each time, and these don't.

This is what I have seen in the past from other MPI codes where the sends and receives all match up in number, but everyone posts their (blocking) send before their receive ... ordering is important of course; a sketch of that pattern is below.
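The textbook version of that trap looks something like this (again a hypothetical sketch, not our code): both ranks post a blocking MPI_Send before their MPI_Recv. Whether it deadlocks depends entirely on how much buffering the MPI library is willing to do (eager vs. rendezvous thresholds), and that can change from one MPI stack or version to the next even though the application code hasn't.

/* Hypothetical sketch of the send-before-receive ordering trap, for two
 * ranks.  If the message fits under the library's eager threshold it is
 * buffered and everything "works"; if not, both MPI_Send calls block
 * waiting for a matching receive and the job deadlocks. */
#include <mpi.h>

#define N (1 << 20)   /* large enough to exceed a typical eager threshold */

static double sendbuf[N], recvbuf[N];

int main(int argc, char **argv)
{
    int rank, other;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;                      /* assumes exactly two ranks */

    /* Both ranks send first, then receive: correct counts, wrong order. */
    MPI_Send(sendbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD);
    MPI_Recv(recvbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}

The usual cures for that one are MPI_Sendrecv or posting a non-blocking MPI_Irecv before the send, but I don't know yet whether that's actually what is going on here.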

But the odd thing about this code is that it worked fine 12-18 months ago, and we haven't touched it since. What has changed is that we are now running it against OpenMPI 1.2.6.

So the code hasn't changed, and the OS on which it runs hasn't changed, but the MPI stack has. Yeah, that's a clue.

Turning off openib and tcp doesn't make much of a difference. This is also a clue.

I am now looking at trying MVAPICH2 to see how that goes. We're using the Intel and gfortran compilers (mixed Fortran/C code).

Has anyone seen strange things like this with their MPI stacks? OpenMPI? MVAPICH2? I should try Intel MPI as well (a rebuilt MVAPICH2, as I remember).

I'll try all the usual things (reduce the optimization level, etc.). Sage words of advice (and clue sticks) welcome.

Joe

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
