Re: [Beowulf] commercial clusters

2006-09-28 Thread Buccaneer for Hire.
--- Stuart Midgley <[EMAIL PROTECTED]> wrote: > I've spoken quite extensively to several organisations here in Western Australia who each have 1000+ CPU systems in Perth, 2000+ CPU systems in London and 4000+ systems in Houston (oil and gas seismic processing). They tell me that

Re: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread Mike Davis
[EMAIL PROTECTED] wrote: We looked at Apple, Excel Meridian, Nexsan and Partners Data SurfRAID. We ended up with the Partners Data SurfRAID and have been very happy with them. RAID6, true Web management interface (as opposed to having to go through an attached machine) and you can treat it as

Re: [Beowulf] commercial clusters

2006-09-28 Thread Vincent Diepeveen
- Original Message - From: "Angel Dimitrov" <[EMAIL PROTECTED]> Sent: Tuesday, September 26, 2006 8:10 PM Subject: [Beowulf] commercial clusters Hello, I have some experience of running numerical weather models on clusters. Are there many clients for processor time? As I sa

Re: [Beowulf] commercial clusters

2006-09-28 Thread Stuart Midgley
Actually, their IO requirements are not that great. Certainly the systems that I run have far greater IO capacity for a much smaller CPU count than the seismic machines. Hi Stu and friends, Seismic is a totally different beastie. He probably does not want to do that. You have very seri

Re: [Beowulf] Tyan S2882

2006-09-28 Thread Vincent Diepeveen
My dual Opteron dual core is extremely stable, except when I run one type of software, namely software that is doing non-stop multiplying. I do that under Ubuntu. That really seems like a worst-case path in the dual-core Opteron chips. After it is non-stop multiplying for a number of days, I get a

Re: [Beowulf] commercial clusters

2006-09-28 Thread Stuart Midgley
I've spoken quite extensively to several organisations here in Western Australia who each have 1000+ CPU systems in Perth, 2000+ CPU systems in London and 4000+ systems in Houston (oil and gas seismic processing). They tell me that they have tried to sell just the cycle time, to no success.

Re: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread cousins
> From: Joshua Baker-LePain <[EMAIL PROTECTED]> > Subject: [Beowulf] Looking for external RAID vendors > > I'm looking

[Beowulf] 8 ranks of DDR400 and Opterons (was: Tyan 2882)

2006-09-28 Thread serguei.patchkovskii
Mark Hahn wrote: > * Mem: 8*1GB PC3200 (DDR 400) ECC reg.; Corsair/Samsung CM72SD1024RLP-3200/SB (12 nodes have 8*2GB) > This DIMM is 2-rank, I believe; Corsair's datasheet is pretty lame. That means that each bank of memory is 4x2=8 ranks. That's definitely pushing the limit; I'm s

Re: [Beowulf] commercial clusters

2006-09-28 Thread Buccaneer for Hire.
--- Chris Samuel <[EMAIL PROTECTED]> wrote: > On Wednesday 27 September 2006 5:10 am, Angel Dimitrov wrote: > > Are there many clients for processor time? As I saw, the biggest supercomputers in the World are very busy! I'm wondering if it's worthwhile to set up a commercial clust

Re: [Beowulf] Tyan S2882

2006-09-28 Thread Bernd Schubert
On Wednesday 27 September 2006 11:20, Gebhardt Thomas wrote: > Hi, >> We are currently deploying Tyan S2882 Dual Opteron Boards, and we have found the system to be quite unstable. After BIOS updates and kernel changes we still get random kernel panics when under load. > Me too :-(

[Beowulf] Anyone with a Dell viz cluster?

2006-09-28 Thread Greg Lindahl
I'd like to compare notes with anyone who has > 4Gbyte nodes and either graphics cards, or a network card which requires write-combining MTRRs. Please send me personal email. Thanks. -- greg

[Beowulf] More cores/More processors/More nodes?

2006-09-28 Thread Peter Wainwright
Please enlighten a baffled newbie: Now there are motherboards with 8 sockets; quad-core processors; and clusters with as many nodes as you can shake a stick at. It seems there are at least 3 dimensions for expansion. What (in your opinion) is the right tradeoff between more cores, more processors

RE: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Clements, Brent M \(SAIC\)
Robert et al. I completely agree, the code was just a quick hack to test out functions. I am COMPLETELY ANAL when it comes to doing what you're stating below. Trust me... I am a perfectionist when it comes to writing code the correct way, documenting, doing proper error checking, etc. ...but this was

RE: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Robert G. Brown
On Thu, 28 Sep 2006, Michael Will wrote: That's weird. On my Scyld cluster it worked fine once I had created /tmp// on all compute nodes before running the job. Maybe we should ask something like "what compiler/kernel/distro are you using?" Although he's already begging for mercy from the

[Beowulf] RE: Stupid MPI programming question (Clements, Brent M (SAIC))

2006-09-28 Thread David Mathog
"Clements, Brent M \(SAIC\)" <[EMAIL PROTECTED]> wrote: > What is wierd is that(I haven't done error reporting yet): > > mkdir("NEWDIRNAME"); works(creates a directory in the current working directory) > > but mkdir("/tmp//NEWDIRNAME") doesn't work, I even tried a chdir(which came back succ

RE: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Michael Will
That's weird. On my Scyld cluster it worked fine once I had created /tmp// on all compute nodes before running the job. Michael -Original Message- From: Clements, Brent M (SAIC) [mailto:[EMAIL PROTECTED]] Sent: Thu Sep 2

Re: [Beowulf] commercial clusters

2006-09-28 Thread Mark Hahn
> Are there many clients for processor time? As I saw the biggest... Do you mean "how much real money will people pay for cycles on clusters"? I don't know, but rumor has it that Sun's $1/CPU-hour approach is not a resounding success. Of course, they're also not offering a real HPC cluster. sup

Re: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread Mike Davis
VERY INTERESTING!!! We had a similar episode with an IBM Shark (I believe) machine. It said that two drives failed simultaneously. The first time required a complete rebuild. The second (in 3 months) got IBM's attention. The problem was with the controller. We wound up needing to replace the

Re: [Beowulf] Tyan S2882

2006-09-28 Thread Florent . Calvayrac
Quoting Constantin Charissis <[EMAIL PROTECTED]>: Krugger wrote: To be solved: - random kernel panics that take out the logging even when all debug flags are set in the kernel, as it fails to sync the disc during the kernel panic. We had the same problems, even with BIOS upgrades, excep

Re: [Beowulf] Tyan S2882

2006-09-28 Thread Mark Hahn
* Dual AMD Opteron DP270 (2.0 GHz) which rev? * Mem: 8*1GB PC3200 (DDR 400) ECC reg.; Corsair/Samsung CM72SD1024RLP-3200/SB (12 nodes have 8*2GB) This DIMM is 2-rank, I believe; Corsair's datasheet is pretty lame. That means that each bank of memory is 4x2=8 ranks. That's definitely pus

RE: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Clements, Brent M \(SAIC\)
What I ended up doing was just stripping the program down to like 10 lines of code, and I have a simple sprintf to create the directory name. What is weird is that (I haven't done error reporting yet): mkdir("NEWDIRNAME"); works (creates a directory in the current working directory) but mkdir("

Re: [Beowulf] Tyan S2882

2006-09-28 Thread Eric W. Biederman
Gebhardt Thomas <[EMAIL PROTECTED]> writes: > Hi, >> We are currently deploying Tyan S2882 Dual Opteron Boards, and we have found the system to be quite unstable. After BIOS updates and kernel changes we still get random kernel panics when under load. > Me too :-( > We've got an 85-node

Re: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread Joe Landman
Warren Turkal wrote: > On Wednesday 27 September 2006 09:49, Mark Hahn wrote: >> approx size/config? > SR1520 is a 15-disk unit. It is 3U. It takes SATA disks. As a result, it can hold up to 11.25TB until SATA drives get bigger. That's 11.25TB raw. If you run with a hot spare (assuming al
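
For reference, the 11.25TB figure assumes 750GB drives, the largest SATA capacity available at the time: 15 x 750GB = 11,250GB = 11.25TB raw; with one disk reserved as a hot spare that leaves 14 x 750GB = 10.5TB before any RAID overhead.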

[Beowulf] Looking for external RAID vendors

2006-09-28 Thread Bryan L.
If you're looking for external storage using SATA disks, a really good RAID vendor is NexSAN http://www.nexsan.com/ The SATABlade, SATABoy and SATABeast are fibre attached. The SATABoySC is SCSI attached. Excellent performance and service. --- [EMAIL PROTECTED] wrote: >

Re: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread Gary Stiehr
Hi, We have been using Nexsan's "SATABeast" product recently. We purchased 40 of them with 42 500GB drives with one dual-port Fibre controller. We connect each SATABeast to two nodes. Those two nodes then have two channel-bonded GigE interfaces. We have been able to fully saturate the 2 G

RE: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread Mark McCardell
Joshua, We use Storcase on one of our FEM clusters. http://www.storcase.com/rm_raid/rm_raid_ovrvw.asp Serial management and environment monitoring; connected to the I/O server via U320 For the other cluster we went with an integrated I/O & storage array http://www.asaservers.com/system_dept.asp?d

Re: [Beowulf] Tyan S2882

2006-09-28 Thread Constantin Charissis
Krugger wrote: To be solved: - random kernel panics that take out the logging even when all debug flags are set in the kernel, as it fails to sync the disc during the kernel panic. Hi, We have a lot of clusters running with S2882 (centos3, centos4, debian sarge) without random kernel panic

Re: [Beowulf] Tyan S2882

2006-09-28 Thread Gebhardt Thomas
Hi, > We are currently deploying Tyan S2882 Dual Opteron Boards, and we have found the system to be quite unstable. After BIOS updates and kernel changes we still get random kernel panics when under load. Me too :-( We've got an 85-node Dual Opteron cluster. I've documented most of the crashe

Re: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Jakob Oestergaard
On Thu, Sep 28, 2006 at 08:57:28AM -0400, Robert G. Brown wrote: > On Thu, 28 Sep 2006, Jakob Oestergaard wrote: ... > Ah, that's it. I'd forgotten this one and missed the write to a static > string, although it has bitten me in the past (partly because back in > the remote K&R past one could near

Re: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Vincent Diepeveen
- Original Message - From: "Joe Landman" <[EMAIL PROTECTED]> To: "Clements, Brent M (SAIC)" <[EMAIL PROTECTED]> Sent: Thursday, September 28, 2006 2:34 AM Subject: Re: [Beowulf] Stupid MPI programming question Hi Brent Clements, Brent M (SAIC) wrote: Hey Guys, I've been sitti

Re: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Robert G. Brown
On Thu, 28 Sep 2006, Jakob Oestergaard wrote: On Thu, Sep 28, 2006 at 01:47:17AM +0100, Clements, Brent M (SAIC) wrote: Hey Guys, ... This is what I have in my code (some things have been removed) #define MASTER_RANK 0 char* mystring; So mystring is a pointer to character(s). mystring

Re: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Chris Samuel
On Thursday 28 September 2006 2:41 pm, Mark Hahn wrote: > also, "man 3 strerror" ;) and to make life even easier - "man 3 perror" :-)

Re: [Beowulf] commercial clusters

2006-09-28 Thread Chris Samuel
On Wednesday 27 September 2006 5:10 am, Angel Dimitrov wrote: > Are there many clients for processor time? As I saw, the biggest supercomputers in the World are very busy! I'm wondering if it's worthwhile to set up a commercial cluster. Intel are planning new processors - two CPUs each wit

Re: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Chris Samuel
On Thursday 28 September 2006 2:08 pm, Robert G. Brown wrote: > I don't think mkdir(2) does the equivalent of mkdir -p and create parent > directories as required. That's quite correct, you'll always have to create those yourself if they don't already exist (otherwise you'll get ENOENT).
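
For readers wanting the mkdir -p behaviour in C, a common idiom (a sketch, not code from this thread) is to create each path component in turn and treat "already exists" as success:

    #include <string.h>
    #include <errno.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    /* Sketch of a mkdir -p equivalent: create every component of
     * "path", ignoring EEXIST so existing parents are not an error. */
    static int mkdir_p(const char *path, mode_t mode)
    {
        char buf[1024];
        char *p;

        if (strlen(path) >= sizeof(buf))
            return -1;                 /* path too long for this sketch */
        strcpy(buf, path);

        for (p = buf + 1; *p; p++) {
            if (*p == '/') {
                *p = '\0';             /* temporarily terminate here */
                if (mkdir(buf, mode) == -1 && errno != EEXIST)
                    return -1;
                *p = '/';
            }
        }
        if (mkdir(buf, mode) == -1 && errno != EEXIST)
            return -1;
        return 0;
    }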

Re: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread Chris Samuel
On Wednesday 27 September 2006 6:10 am, Erik Paulson wrote: > We don't do any SAN, each one is attached to a 1U server and we have our own > "filesystem" to track where stuff is. Works for us. Snap - IBM e325's running FC4.

Re: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread Chris Samuel
On Wednesday 27 September 2006 5:54 am, Mike Davis wrote: > We've had good luck with Apple's arrays. So have we. On the other hand we've had an IBM FAStT & EXP enclosure (or whatever they're called today) which lost 2 SCSI drives (one in the main unit and one in the EXP) within a minute, bo

Re: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Jakob Oestergaard
On Thu, Sep 28, 2006 at 01:47:17AM +0100, Clements, Brent M (SAIC) wrote: > Hey Guys, ... > > This is what I have in my code (some things have been removed) > > #define MASTER_RANK 0 > > char* mystring; So mystring is a pointer to character(s). > > mystring = "some test"; Now mystring poi
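
Jakob's point is easiest to see in isolation. A minimal sketch (not Brent's actual code) of the bug:

    #include <string.h>

    int main(void)
    {
        char *mystring = "some test"; /* points at a string literal,
                                         which lives in read-only storage */
        /* mystring[0] = 'S';            undefined behavior: writing
                                         through a pointer to a literal
                                         may crash */

        char buffer[64];              /* writable storage the program owns */
        strcpy(buffer, "some test");
        buffer[0] = 'S';              /* fine: buffer owns its bytes */
        return 0;
    }

Receiving data through such a pointer (e.g. with MPI_Recv) has the same problem; the receive buffer must be writable storage.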

Re: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Fred L Youhanaie
(this is one of those me-too replies ;-) I've always found perror very handy, e.g. ... mkdir_return = mkdir(fullpath, 0777); perror("mkdir results"); ... Cheers f. Joe Landman wrote: Mark Hahn wrote: #include <errno.h> extern int errno; IMO, it's bad practice
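
One caveat on the fragment above: perror(3) prints something even when the call succeeded ("mkdir results: Success"), so it is usually guarded by the return value. A sketch (fullpath is hypothetical here):

    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    int main(void)
    {
        const char *fullpath = "/tmp/somedir";   /* hypothetical path */
        int mkdir_return = mkdir(fullpath, 0777);

        if (mkdir_return == -1)      /* only report actual failures */
            perror("mkdir results"); /* e.g. "mkdir results: No such
                                        file or directory" */
        return 0;
    }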