Re: [Beowulf] shutting down pbs server and maui for half an hour will affect running jobs?

2010-07-12 Thread Christopher Samuel
On 13/07/10 13:12, akshar bhosale wrote: > thanks..any other related info ? Not that comes to mind, I'm afraid! -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb

Re: [Beowulf] first cluster

2010-07-12 Thread Christopher Samuel
On 13/07/10 14:29, Rahul Nabar wrote: > Out of curiosity, is there the possibility of running > a "swapless" compute-node? Yes of course, it just means that the kernel no longer has the option of paging out infrequently accessed dirty pages to free s

Re: [Beowulf] Network problem: Why are ARP discovery requests sent to specific addresses instead of a broadcast domain

2010-07-12 Thread Tom Ammon
This is called a gratuitous ARP. Used to update the ARP caches of other nodes. On 07/12/2010 10:48 PM, Rahul Nabar wrote: On Mon, Jul 12, 2010 at 11:25 PM, Patrick Geoffray wrote: Rahul, On 7/13/2010 12:04 AM, Rahul Nabar wrote: I am puzzled by a bunch of ARP requests on my networ

Re: [Beowulf] Network problem: Why are ARP discovery requests sent to specific addresses instead of a broadcast domain

2010-07-12 Thread Rahul Nabar
On Mon, Jul 12, 2010 at 11:25 PM, Patrick Geoffray wrote: > Rahul, > > On 7/13/2010 12:04 AM, Rahul Nabar wrote: >> >> I am puzzled by a bunch of ARP requests on my network that I captured >> using tcpdump. Shouldn't ARP discovery requests always be sent to a >> broadcast address? > > No, the kern

Re: [Beowulf] first cluster

2010-07-12 Thread Rahul Nabar
On Mon, Jul 12, 2010 at 2:02 PM, Gus Correa wrote: > Consider disk for: > > A) swap space (say, if the user programs are large, > or you can't buy a lot of RAM, etc); Out of curiosity, is there the possibility of running a "swapless" compute-node? I mean most HPC nodes already have fairly generou

Re: [Beowulf] Network problem: Why are ARP discovery requests sent to specific addresses instead of a broadcast domain

2010-07-12 Thread Patrick Geoffray
Rahul, On 7/13/2010 12:04 AM, Rahul Nabar wrote: I am puzzled by a bunch of ARP requests on my network that I captured using tcpdump. Shouldn't ARP discovery requests always be sent to a broadcast address? No, the kernel regularly refreshes the entries in the ARP cache with unicast requests.

[Beowulf] Network problem: Why are ARP discovery requests sent to specific addresses instead of a broadcast domain

2010-07-12 Thread Rahul Nabar
I am puzzled by a bunch of ARP requests on my network that I captured using tcpdump. Shouldn't ARP discovery requests always be sent to a broadcast address? I have requests of the type below which seemingly are addressed to a specific MAC address. 00:26:b9:58:d7:2f > 00:26:b9:58:eb:b8, ARP, lengt

Re: [Beowulf] shutting down pbs server and maui for half an hour will affect running jobs?

2010-07-12 Thread Christopher Samuel
On 13/07/10 01:21, akshar bhosale wrote: > Thanks for your information, but do i need to change > anything for increasing timeout if i dont want to kill > running jobs.. If you have jobs that will hit their walltime whilst the server is down then the

Re: [Beowulf] first cluster

2010-07-12 Thread Christopher Samuel
On 13/07/10 05:02, Gus Correa wrote: > I wonder if swapping over NFS would be efficient for HPC. There are out of tree patches for swap over NFS (and I've seen assertions that SuSE SLES 11 includes it) which has been doing the rounds for a few years

[Beowulf] IB problem with openmpi 1.2.8

2010-07-12 Thread Bill Wichser
Machine is an older Intel Woodcrest cluster with a two tiered IB infrastructure with Topspin/Cisco 7000 switches. The core switch is a SFS-7008P with a single management module which runs the SM manager. The cluster runs RHEL4 and was upgraded last week to kernel 2.6.9-89.0.26.ELsmp. The ope

Re: [Beowulf] first cluster

2010-07-12 Thread Gus Correa
Hi Doug Consider disk for: A) swap space (say, if the user programs are large, or you can't buy a lot of RAM, etc); I wonder if swapping over NFS would be efficient for HPC. Disk may be a simple and cost effective solution. B) input/output data files that your application programs may require (

Re: [Beowulf] first cluster

2010-07-12 Thread Douglas Guptill
Ah Ha. I see the point of a non-diskful, or nfs root, install for the compute nodes. One image to update/change, instead of a whole bunch. Thanks, Douglas. On Fri, Jul 09, 2010 at 07:11:18PM -0400, Mark Hahn wrote: > well, the thing about nfs root is that there's almost no installation, > per