Hi all,

Have you tried the following: ssh master -> node1 -> node2, i.e. ssh from the 
master to node1 and from there to node2? 
Could it be that a remote host key is not yet in the local database on one of 
the nodes, so that ssh stops to ask whether the key should be added?
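
To make the "all permutations" check less tedious, here is a rough Python 
sketch that tries every two-hop ssh chain. The host names are made up, so 
replace them with your own. It assumes OpenSSH, where BatchMode=yes makes ssh 
fail instead of waiting at a password or host-key prompt, so a hop that would 
normally hang on a question shows up as a failure.

```python
import itertools
import subprocess

# Hypothetical host names; replace with the names used on your cluster.
HOSTS = ["master", "node1", "node2", "node3"]

# BatchMode=yes: never prompt (password or unknown host key), just fail.
# ConnectTimeout=5: give up on an unreachable host after 5 seconds.
SSH_OPTS = ["-o", "BatchMode=yes", "-o", "ConnectTimeout=5"]


def two_hop_command(first, second):
    """Build the argv for: ssh <first> ssh <second> true"""
    inner = ["ssh", *SSH_OPTS, second, "true"]
    return ["ssh", *SSH_OPTS, first, *inner]


def failed_pairs(hosts, run=subprocess.run):
    """Return the (first, second) pairs whose two-hop login did not succeed."""
    failures = []
    for first, second in itertools.permutations(hosts, 2):
        cmd = two_hop_command(first, second)
        try:
            result = run(cmd, capture_output=True, text=True, timeout=30)
            ok = result.returncode == 0
        except subprocess.TimeoutExpired:
            ok = False
        if not ok:
            failures.append((first, second))
    return failures


if __name__ == "__main__":
    for first, second in failed_pairs(HOSTS):
        print(f"FAILED: {first} -> {second}")
```

Run it on the master. Any pair it prints is a hop that needs a password, has 
an unknown host key, or cannot be reached at all.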

If that works for all permutations, another possibility is that your host 
list somehow gets messed up when you submit parallel jobs. Can you start the 
jobs manually by passing a host list to the MPI program you are using? Does 
that work, or do you have problems there as well?
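
As an illustration of a manual start, the sketch below only builds the command 
lines. It assumes Open MPI's mpirun, where --host takes a comma-separated 
host list and -np the number of processes; other MPI implementations use 
different options, so check the documentation of the one you have. The host 
names are again made up, and "hostname" is just a harmless test program.

```python
import shlex


def mpirun_command(hosts, program=("hostname",)):
    """Build an mpirun argv that starts one process on each listed host.

    Assumes Open MPI option names (-np, --host); adjust for other MPIs.
    """
    return ["mpirun", "-np", str(len(hosts)), "--host", ",".join(hosts), *program]


if __name__ == "__main__":
    # Hypothetical host combinations mirroring the failing cases in the thread.
    for hosts in (["node1", "node2"], ["master", "node1", "node2"]):
        print(shlex.join(mpirun_command(hosts)))
```

If the printed commands hang when run by hand in the same way as the submitted 
jobs, the host list is probably not the culprit.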

My two pennies

Jörg


On Thursday 20 September 2012 07:40:56 Antti Korhonen wrote:
> Passwordless SSH works between all nodes.
> Firewalls are disabled.
> 
> 
> From: g...@r-hpc.com [mailto:g...@r-hpc.com] On Behalf Of Greg Keller
> Sent: Wednesday, September 19, 2012 8:43 PM
> To: beowulf@beowulf.org; Antti Korhonen
> Subject: Re: [Beowulf] Cannot use more than two nodes on cluster
> 
> I am going to bet $0.25 that SSH or TCP/IP is configured to allow the
> master to get to the nodes without a password, but not from one Compute to
> the other Compute.
> 
> Test by sshing to Compute1, then from Compute1 to Compute2.  Depending on
> how you built the cluster, it's also possible there is iptables running on
> the compute nodes but, my money is on the ssh keys need reconfiguring. 
> Let us know what you find.
> 
> Cheers!
> Greg
> 
> Date: Wed, 19 Sep 2012 16:11:21 +0000
> From: Antti Korhonen <akorho...@theranos.com>
> Subject: [Beowulf] Cannot use more than two nodes on cluster
> To: "beowulf@beowulf.org" <beowulf@beowulf.org>
> Message-ID:
> <B9D51F953BEE5C42BC2B503D288542992DD935FE@SRW004PA.theranos.local>
> Content-Type: text/plain; charset="us-ascii"
> 
> Hello
> 
> I have a small Beowulf cluster (master and 3 slaves).
> I can run jobs on any single node.
> Running on two nodes sort of works: running jobs on master and 1 slave
> works (all combos: master+slave1, master+slave2 or master+slave3).
> Running jobs on two slaves hangs.
> Running jobs on master + any two slaves hangs.
> 
> Would anybody have any troubleshooting tips?

-- 
*************************************************************
Jörg Saßmannshausen
University College London
Department of Chemistry
Gordon Street
London
WC1H 0AJ 

email: j.sassmannshau...@ucl.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
