David Simas wrote:
> Except that it probably won't help with the problem, which I'm guessing is caused by a given host attempting more than 1024 RSH connections to a given server in less than TCP TIME_WAIT seconds (minutes, whatever). If the original correspondent
Actually, it handles exactly these cases. The FANOUT variable lets you set the appropriate parallelism for rsh. I believe pdsh is in use on the big clusters (> 1024 nodes) at the national labs.
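To illustrate what a fanout limit buys you, here is a minimal Python sketch of the idea, not pdsh's actual implementation. The names `run_with_fanout` and `PeakCounter` are hypothetical, and the "remote command" is a short sleep standing in for a real rsh or ssh call. The point is only that at most `fanout` connections are ever open at once, however long the host list is.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_fanout(hosts, task, fanout=32):
    """Call task(host) for every host with at most `fanout` calls in flight."""
    hosts = list(hosts)
    # The pool size is the fanout: it caps the number of concurrent "connections".
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        results = list(pool.map(task, hosts))
    return dict(zip(hosts, results))


class PeakCounter:
    """Stand-in for a remote command; records the peak number of concurrent calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def __call__(self, host):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # placeholder for the real remote command
        with self._lock:
            self._active -= 1
        return "ok"
```

Running `run_with_fanout(hosts, PeakCounter(), fanout=4)` over a few dozen hypothetical host names should never show a peak above 4, which is the property that keeps a large job from opening every connection at the same moment.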
> doesn't want to use SSH for RSH, which would fix things
True, and you can use ssh with pdsh, or rsh, with no syntax change for the end user.
> (SSH isn't restricted to low-numbered ports), he could try to re-implement his application in MPI.
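A rough way to put numbers on the port-exhaustion guess quoted above, as a sketch only. The figures are assumptions, not measurements from the cluster in question: a pool of 512 reserved client ports (512 through 1023), a 60 second TIME_WAIT, and optionally a second port per rsh session for the stderr channel. The function name `max_sustained_rate` is hypothetical.

```python
def max_sustained_rate(ports_available, time_wait_seconds, ports_per_connection=1):
    """Upper bound on new connections per second between one client and one server.

    Assumes every port a connection used stays unavailable for the full
    TIME_WAIT period after the connection closes.
    """
    if ports_available <= 0 or time_wait_seconds <= 0 or ports_per_connection <= 0:
        raise ValueError("all arguments must be positive")
    return ports_available / (ports_per_connection * time_wait_seconds)


# Assumed values, for illustration only.
RESERVED_PORTS = 1023 - 512 + 1   # 512 low-numbered ports
TIME_WAIT_SECONDS = 60            # varies by operating system

one_port = max_sustained_rate(RESERVED_PORTS, TIME_WAIT_SECONDS)
two_ports = max_sustained_rate(RESERVED_PORTS, TIME_WAIT_SECONDS, ports_per_connection=2)
print(f"one port per session:  {one_port:.1f} connections/s")
print(f"two ports per session: {two_ports:.1f} connections/s")
```

Under these assumptions the sustainable rate is only a handful of connections per second to any single server, so a tight loop of rsh calls could plausibly hit the limit.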
The basic question a few of us have is exactly what Bruce and team are doing that causes them to run out of ports. Once we see this, we can stop guessing and make better, more targeted suggestions.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615