David Simas wrote:

Except that it probably won't help with the problem, which I'm
guessing is caused by a given host attempting more than 1024
RSH connections to a given server in less than TCP TIME WAIT
seconds (minutes, whatever).  If the original correspondent

Actually, pdsh handles exactly these cases. The FANOUT variable lets you set the appropriate degree of parallelism for rsh. I believe pdsh is in use on the big clusters (more than 1024 nodes) at the national labs.
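
To make the fan-out idea concrete, here is a minimal Python sketch of bounded parallelism: run something on every host, but never on more than `fanout` hosts at once. This is not how pdsh is implemented; `run_with_fanout`, `fake_remote_command`, the node names, and the value 32 are all made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_fanout(hosts, run_on_host, fanout=32):
    """Call run_on_host(host) for every host, at most `fanout` at a time.

    Returns a dict mapping each host to whatever run_on_host returned.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    hosts = list(hosts)  # allow generators; we iterate twice below
    # The worker pool size is the cap on simultaneous "connections".
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        results = list(pool.map(run_on_host, hosts))
    return dict(zip(hosts, results))


if __name__ == "__main__":
    # Hypothetical stand-in for "open a remote shell and run a command".
    def fake_remote_command(host):
        return f"{host}: ok"

    nodes = [f"node{i:04d}" for i in range(2000)]
    output = run_with_fanout(nodes, fake_remote_command, fanout=32)
    print(len(output), "hosts contacted")
```

The point of the cap is that the number of connections open at any moment is bounded by the fan-out, not by the size of the cluster.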

doesn't want to use SSH for RSH, which would fix things

True, and you can use ssh with pdsh, or rsh, with no syntax change for the end user.

SSH isn't restricted to low-numbered ports, he could try to
re-implement his application in MPI.
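
The port-exhaustion guess above can be put into rough numbers. The following Python sketch is a back-of-the-envelope model only: it assumes a fixed pool of usable ports toward one server and that each port is unavailable for a fixed TIME_WAIT period after use. The pool size of 1024 is the figure from the quoted message; the 60-second TIME_WAIT is an assumed value, and real behavior depends on the operating system and on which side closes the connection.

```python
def max_sustained_rate(port_pool, time_wait_s):
    """Upper bound on new connections per second that never drains the pool."""
    if port_pool <= 0 or time_wait_s <= 0:
        raise ValueError("port_pool and time_wait_s must be positive")
    return port_pool / time_wait_s


def seconds_until_exhaustion(port_pool, time_wait_s, rate_per_s):
    """Time until no port is free at a constant connection rate.

    Returns None when the rate is low enough that ports come back
    out of TIME_WAIT as fast as they are consumed.
    """
    if rate_per_s <= max_sustained_rate(port_pool, time_wait_s):
        return None
    # Above the sustainable rate, the pool empties before the first
    # port has finished its TIME_WAIT period.
    return port_pool / rate_per_s


if __name__ == "__main__":
    pool, time_wait = 1024, 60.0  # assumed values, see text above
    print("sustainable rate:", max_sustained_rate(pool, time_wait), "conn/s")
    print("at 100 conn/s, exhausted after",
          seconds_until_exhaustion(pool, time_wait, 100.0), "s")
```

Under this simplified model, anything that lowers the connection rate (a smaller fan-out) or widens the usable port range (ssh instead of rsh) moves the workload back under the limit.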

The basic question a few of us have is exactly what Bruce and team are doing that is causing them to run out of ports. Once we see this, we can stop guessing and make better, more targeted suggestions.




--

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615
