Re: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread Chris Samuel
- "Art Poon" wrote: > To be specific, after the initial boot with a > minimal Linux kernel, there is a "fatal error" > with "timeout waiting for getfile" when the > compute node attempts to download the provisioning > image from head. I've seen similar issues with Cisco switches in IBM Clus

Re: [Beowulf] Intel shows 48-core 'datacentre on a chip'

2009-12-02 Thread Mark Hahn
> SCC combines 24 dual-core processing elements, each with its own router, four DDR3 memory controllers capable of handling up to 8GB apiece, and a very fast sounds a fair amount like larrabee to me. > you needed your own datacentre. Now, you just need your own chip," said that has to be one of

RE: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-02 Thread Lux, Jim (337C)
> -Original Message- > From: beowulf-boun...@beowulf.org [mailto:beowulf-boun...@beowulf.org] On > Behalf Of Greg Lindahl > Sent: Wednesday, December 02, 2009 2:30 PM > To: beowulf@beowulf.org > Subject: Re: [Beowulf] New member, upgrading our existing Beowulf cluster > > On Wed, Dec 02

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-02 Thread Chris Samuel
- "Toon Knapen" wrote: > And why do you prefer xfs if I may ask. Performance? For us, yes, plus the fact that ext3 is (maybe was, but not from what I've heard) single threaded through the journal daemon so if you get a lot of writers (say NFS daemons for instance) you can get horribly backl

Re: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-02 Thread Greg Lindahl
On Wed, Dec 02, 2009 at 10:13:34PM +, John Hearns wrote: > You know fine well that such disclaimers are inserted by corporate > email servers. Actually, I had no idea, probably a lot of other people don't either. Can't you work for a company that doesn't have disclaimers? ;-) -- greg _

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-02 Thread Chris Samuel
- "Joshua Baker-LePain" wrote: > What Red Hat is *not* shipping by default are any > of the filesystem utilities, so you can't, e.g., > actually mkfs an XFS filesystem. But you can get > the xfsprogs RPM from the CentOS extras repo and > that should work just fine. Ah yes, that was what i

RE: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread Michael Will
I got another one for you from Penguin's support team: enable port forward Sent from Moxier Mail (http://www.moxier.com) - Original Message - From: "Rahul Nabar" To: "Michael Will" Cc: "Hearns, John", "beowulf@beowulf.org" Sent: 12/2/2009 1:42 PM Subject: Re: [Beowulf] Re: cluster fails t

Re: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-02 Thread John Hearns
2009/12/2 Peter Kjellstrom : > > Good job sending it to a public e-mail list then. > You know fine well that such disclaimers are inserted by corporate email servers. Keep your sarcasm to yourself. ___ Beowulf mailing list, Beowulf@beowulf.org sponsored b

Re: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread Rahul Nabar
On Wed, Dec 2, 2009 at 1:48 PM, Michael Will wrote: > I don't know anything about SMC switches, but for Cisco switches I had to > enable 'spanning-tree portfast default' to allow a > PXE-booting node > to stay up. Maybe the SMC switch has something similar that prevents the port > from be

Re: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread Bill Broadley
Art Poon wrote: > I've tried resetting the SMC switch to factory defaults (with > auto-negotiate on). I've checked the /etc/beowulf/modprobe.conf and it > doesn't seem to be demanding anything exotic. We've tried swapping out to > another SMC switch but that didn't change anything. I had a very
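
A quick way to narrow this kind of failure down is to watch the head node's cluster-facing interface while a node tries to boot. A minimal sketch, assuming tcpdump is installed; the interface name and MAC address are placeholders.

    # everything the booting node sends/receives, matched by its MAC address
    tcpdump -eni eth1 ether host 00:30:48:aa:bb:cc
    # or just the DHCP/PXE handshake on the cluster interface
    tcpdump -ni eth1 port 67 or port 68
    # if the DHCP exchange completes but the image download request never shows up,
    # the switch is probably still blocking the port (spanning-tree convergence)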

Re: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-02 Thread Peter Kjellstrom
On Wednesday 02 December 2009, Hearns, John wrote: > I'm a new member to this list, but the research group that I work for has > had a working cluster for many years. I am now looking at upgrading our > current configuration. ... > Mixing modern multi-core hardware with an older OS release which wo

RE: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-02 Thread Lux, Jim (337C)
> -Original Message- > From: beowulf-boun...@beowulf.org [mailto:beowulf-boun...@beowulf.org] On > Behalf Of Chris Dagdigian > Sent: Wednesday, December 02, 2009 11:12 AM > Cc: beowulf@beowulf.org; Ross Tucker > Subject: Re: [Beowulf] New member, upgrading our existing Beowulf cluster > >

Re: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread Joe Landman
David Mathog wrote: What's got me and the IT guys stumped is that while the compute nodes boot via PXE from the head node without trouble on the NetGear, they barf with the SMC. To be specific, after the initial boot with a minimal Linux kernel, there is a "fatal error" with "timeout waiting fo

RE: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread Michael Will
I don't know anything about SMC switches, but for Cisco switches I had to enable 'spanning-tree portfast default' to allow a PXE-booting node to stay up. Maybe the SMC switch has something similar that prevents the port from being fully usable until some spanning tree algorithm terminate
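
For reference, the Cisco IOS form of that change looks roughly like the sketch below; the interface name is a placeholder, and whether the SMC firmware has an equivalent (often labelled 'edge port' or 'fast link' by other vendors) is something to check in its manual.

    ! global form: portfast on every access port
    configure terminal
    spanning-tree portfast default
    end
    write memory

    ! per-port form, if you only want it on the compute-node ports
    configure terminal
    interface GigabitEthernet0/12
    spanning-tree portfast
    end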

[Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread David Mathog
> What's got me and the IT guys stumped is that while the compute nodes boot via PXE from the head node without trouble on the NetGear, they barf with the SMC. To be specific, after the initial boot with a minimal Linux kernel, there is a "fatal error" with "timeout waiting for getfile" when the c

Re: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-02 Thread Chris Dagdigian
Not sure if you are looking at DIY or commercial options, but this has been done well on a commercial scale by at least some integrators. I've never used them in a cluster but they make great virtualization platforms. This is just one example; the marketing term is "1U Twin Server" http://w

RE: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-02 Thread Lux, Jim (337C)
From: beowulf-boun...@beowulf.org [mailto:beowulf-boun...@beowulf.org] On Behalf Of Ross Tucker Sent: Wednesday, November 25, 2009 1:54 PM To: beowulf@beowulf.org Subject: [Beowulf] New member, upgrading our existing Beowulf cluster Greetings! I'm a new member to this list, but the research grou

RE: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread Hearns, John
> I've tried resetting the SMC switch to factory defaults (with auto-negotiate on). I've checked the /etc/beowulf/modprobe.conf and it doesn't seem to be demanding anything exotic. We've tried swapping out to another SMC switch but that didn't change anything. No idea really, as I don't use SMC

Re: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread Joe Landman
Art Poon wrote: Dear colleagues, [...] What's got me and the IT guys stumped is that while the compute nodes boot via PXE from the head node without trouble on the NetGear, they barf with the SMC. To be specific, after the initial boot with a minimal Linux kernel, there is a "fatal error" wi

RE: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-02 Thread Hearns, John
> I'm a new member to this list, but the research group that I work for has had a working cluster for many years. I am now looking at upgrading our current configuration. Go for a new cluster any time. Upgrading is fraught with pitfalls – keep that old cluster running till it dies, but look

[Beowulf] Intel shows 48-core 'datacentre on a chip'

2009-12-02 Thread Eugen Leitl
http://news.zdnet.co.uk/hardware/0,100091,39918721,00.htm Intel shows 48-core 'datacentre on a chip' Tags: Manycore, Cloud, Operating System, Processor. Rupert Goodwins, ZDNet UK. Published: 02 Dec 2009 18:00 GMT. Intel has announced the Single-chip Cloud Computer (SCC), an experime

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-02 Thread Joshua Baker-LePain
On Wed, 2 Dec 2009 at 10:42am, Chris Samuel wrote I believe xfs is now available in 5.4. I'd have to check. My meagre understanding based totally on rumours is that it's still a preview release and that you need a special support contract with Red Hat to get access. I'd love to know that I'm

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-02 Thread Toon Knapen
> > I believe the older kernels handle the extra cores rather poorly, and don't > even recognize the Intel CPUs as NUMA enabled. You didn't mention hardware > or software RAID. I'd recommend RAID scrubbing, and if software, that requires > (I think) >= 2.6.21, although (I think) Red Hat back po
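
For Linux software RAID (md), a scrub is requested through sysfs; a minimal sketch, with md0 as a placeholder array name (many distributions ship a weekly cron job that does exactly this).

    # kick off a full consistency check of the array
    echo check > /sys/block/md0/md/sync_action
    # watch progress
    cat /proc/mdstat
    # after it finishes, a non-zero mismatch count is worth investigating
    cat /sys/block/md0/md/mismatch_cnt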

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-02 Thread Toon Knapen
> > I've had friends tell me that I should never use long-lived berkeley > db databases without a good backup-and-recovery or recreate-from-scratch > plan. > > Berkeley db comes with a test suite for integrity, and last time I > used it under Linux, it didn't pass. > You mean that subsequent (min

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-02 Thread Toon Knapen
> > > I believe xfs is now available in 5.4. I'd have to check. We've found xfs > to be our preference (but we're revisiting gluster and lustre). I've not > played with gfs so far. > And why do you prefer xfs, if I may ask? Performance? Do you have many small files, or large files? ___

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-02 Thread Toon Knapen
> > > Maybe also some licensing breaks on large volume licensing. Red Hat is > primarily a sales and service organisation that also produces a Linux > by-product :) The HPC variant is targeted at areas which deal in large > clusters at cheaper than Red Hat Enterprise Linux for servers at > equival

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-02 Thread Toon Knapen
Any idea why it gives better performance? Was it on memory-bandwidth-intensive apps? Could it be due to changes in the kernel that take into account the NUMA architecture (affinity), or ... On Tue, Dec 1, 2009 at 8:57 PM, Tom Elken wrote: > > On Behalf Of Gerald Creager > > I've been quite happy with Ce
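
One quick way to see what the kernel thinks of the topology, and to pin a bandwidth-hungry job, is numactl; a hedged sketch, with ./a.out as a placeholder binary.

    # show the NUMA layout the kernel sees: node count, per-node memory, distances
    numactl --hardware
    # confine a job's CPUs and memory allocations to NUMA node 0
    numactl --cpunodebind=0 --membind=0 ./a.out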

[Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-02 Thread Art Poon
Dear colleagues, I am in charge of managing a cluster at our research centre and am stuck with a vexing (to me) problem! (Disclaimer: I am a biologist by training and a mostly self-taught programmer. I am still learning about networking and cluster management, so please bear with me!) This i

[Beowulf] Fwd: rhel hpc

2009-12-02 Thread Toon Knapen
Dear all, I've been working on hpux-itanium for the last 2 years (and even unsubscribed from the beowulf-ml during most of that time, my bad) but soon will turn back to a beowulf cluster (HP DL380G6's with Xeon X5560, amcc/3ware 9690SA-8i with 4 x 600GB Cheetah 15krpm). Now I have a few questions on th

Re: [Beowulf] ask about mpich

2009-12-02 Thread Dmitri Chubarov
Hello, Christian, you will probably need some advice off the list with your mpich setup. If you tell the list where you are located and studying, you might receive the help you need from someone who speaks your language or is located nearby, since the audience of the Beowulf list is indeed very widely

Re: [Beowulf] ask about mpich

2009-12-02 Thread Dmitry Zaletnev
MPICH2 condensed instructions: mpd daemon ring setup (assumes MPICH2 is already installed) 1. At all nodes: cd $HOME touch .mpd.conf chmod 600 .mpd.conf nano .mpd.conf (if there's no nano - aptitude install nano) Enter in the file: MPD_SECRETWORD=_secretword_ _secretword_ must be the same at all nodes 2. A
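
The rest of the mpd recipe, reconstructed in outline from the MPICH2 documentation rather than from the truncated message above, looks roughly like this; host names, counts and the secret word are placeholders, and note that later MPICH releases replaced mpd with the Hydra launcher.

    # on every node: a private ~/.mpd.conf with the same secret word
    echo "MPD_SECRETWORD=changeme" > ~/.mpd.conf
    chmod 600 ~/.mpd.conf

    # on the head node: list the hosts that will join the ring
    printf "node01\nnode02\n" > ~/mpd.hosts

    # start the ring, verify it, run a job, shut it down
    mpdboot -n 2 -f ~/mpd.hosts
    mpdtrace
    mpiexec -n 4 ./a.out
    mpdallexit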

[Beowulf] rhel hpc

2009-12-02 Thread Toon Knapen
Dear all, I've been working on hpux-itanium for the last 2 years (and even unsubscribed from the beowulf-ml during most of that time, my bad) but soon will turn back to a beowulf cluster (HP DL380G6's with Xeon X5560, amcc/3ware 9690SA-8i with 4 x 600GB Cheetah 15krpm). Now I have a few questions on the

[Beowulf] Cluster Users in Clusters Linux and Windows

2009-12-02 Thread Leonardo Machado Moreira
Hi! I am trying to create a cluster with only two machines. The server will be a Linux machine, an Arch Linux distribution to be more specific. The slave machine will be a Windows 7 machine. I have found that it is possible, but I have also found that each machine on the cluster must have

[Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-02 Thread Ross Tucker
Greetings! I'm a new member to this list, but the research group that I work for has had a working cluster for many years. I am now looking at upgrading our current configuration. I was wondering if anyone has actual experience with running more than one node from a single power supply. Even just