Re: [Beowulf] 96 cores in silent and small enclosure

2010-04-07 Thread Gerald Creager
Hearns, John wrote: Sorry for another post, but I just got an idea. I'm not sure if you have seen that YouTube video of a guy who put his whole setup in a fish tank and was using cooking oil, I believe, to cool everything. It would be interesting to see the 2nd cluster put into a big enough tank and cool

Re: [Beowulf] Any recommendations for a good JBOD?

2010-02-18 Thread Gerald Creager
for a good JBOD? On Thu, Feb 18, 2010 at 12:12 PM, Gerald Creager wrote: > For what you're describing, I'd consider CoRAID's AoE technology and system, > and use their RAID6 capability. Otherwise, get yourself a box with up to 8 > slots, preferably with ho

Re: [Beowulf] Any recommendations for a good JBOD?

2010-02-18 Thread Gerald Creager
For what you're describing, I'd consider CoRAID's AoE technology and system, and use their RAID6 capability. Otherwise, get yourself a box with up to 8 slots, preferably with hot-swap capability, and forge ahead. gerry Rahul Nabar wrote: Discussions that I read on this list in the last couple

Re: [Beowulf] Thinking about going used

2010-02-11 Thread Gerald Creager
I've been getting Dell 1425's for <$100 and adding $150 worth of memory to make 'em a usable small server. I think they're OK for low-end compute nodes, but watching the list there are other machines more suitable. gerry David Mathog wrote: There are a lot of rack mount servers showing up on

[Beowulf] UPS signaling scripts?

2010-02-08 Thread Gerald Creager
Looking for a usable script that will allow me to listen to an APC data center UPS and power down a cluster when we go to UPS power. Anyone got a solution? My last experience with PowerChute was pretty sad, I'm afraid. gerry -- Gerry Creager -- gerry.crea...@tamu.edu Texas Mesonet -- AATLT, Tex
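One workable pattern, as an alternative to PowerChute, is to let the open-source apcupsd daemon drive the shutdown via its onbattery event hook. This is a sketch only: the node names, SSH fan-out, and shutdown command are placeholder assumptions, not a tested production script.

```shell
#!/bin/sh
# Sketch of /etc/apcupsd/onbattery -- apcupsd invokes this script when
# the UPS switches to battery power. Node names are placeholders.
NODES="node01 node02 node03"

# Pure helper: does an `apcaccess status` dump say we are on battery?
on_battery() {
    echo "$1" | grep -q '^STATUS.*ONBATT'
}

status="$(apcaccess status 2>/dev/null)"
if on_battery "$status"; then
    for n in $NODES; do
        # Halt each compute node in parallel; head node goes last.
        ssh "$n" 'shutdown -h now "UPS on battery"' &
    done
    wait
    shutdown -h now "UPS on battery"
fi
```

Confirming the status via apcaccess guards against apcupsd firing the event spuriously; the same helper can be reused in a cron-driven poller if you would rather not rely on the event mechanism.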

Re: [Beowulf] hardware RAID versus mdadm versus LVM-striping

2010-01-17 Thread Gerald Creager
+1: Reality. Joe Landman wrote: Rahul Nabar wrote: On Sun, Jan 17, 2010 at 9:36 PM, Joe Landman wrote: Ohhh ... it depends. Some of the "vendors" hardware raid ... heck ... most of it ... is rebadged LSI gear. Usually their lower end stuff which is sometimes fake-raid. Use fake-raid only

Re: [Beowulf] hardware RAID versus mdadm versus LVM-striping

2010-01-17 Thread Gerald Creager
Hardware RAID, in my experience, works well with most LSI controllers (that haven't been modified to become Dell PERC controllers) and 3Ware controllers. I've had pretty grim results with most others. A colleague and I had great initial results with several ARECA controllers, but then they lo
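For comparison, the software-RAID side of this thread's debate takes only a few commands with the stock mdadm tool. This is a sketch with placeholder disk names, guarded so it is harmless on a machine without those disks; it assumes four dedicated whole disks given to a RAID6.

```shell
#!/bin/sh
# Sketch: build a four-disk RAID6 with mdadm (disk names are placeholders).
DISKS="/dev/sdb /dev/sdc /dev/sdd /dev/sde"
NDISKS=$(echo $DISKS | wc -w)
# RAID6 spends two disks on parity, so usable space is N-2 disks' worth.
USABLE=$((NDISKS - 2))

# Guarded: only act if mdadm exists and the first placeholder disk is real.
if command -v mdadm >/dev/null 2>&1 && [ -b /dev/sdb ]; then
    mdadm --create /dev/md0 --level=6 --raid-devices="$NDISKS" $DISKS
    mdadm --detail --scan >> /etc/mdadm.conf   # persist across reboots
fi
```

The usual trade-off raised in the thread applies: mdadm gives you controller-independent recovery (any Linux box can assemble the array), while a good hardware controller gives you battery-backed write cache.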

Re: [Beowulf] Geriatric computer does not stay up

2009-12-16 Thread Gerald Creager
David Mathog wrote: So we have a cluster of Tyan S2466 nodes and one of them has failed in an odd way. (Yes, these are very old, and they would be gone if we had a replacement.) On applying power the system boots normally and gets far into the boot sequence, sometimes to the login prompt, then it

Re: [Beowulf] scalability

2009-12-11 Thread Gerald Creager
Howdy! Gus Correa wrote: Hi Chris Chris Samuel wrote: - "Gus Correa" wrote: This is about the same number I've got for an atmospheric model in a dual-socket dual-core Xeon computer. Somehow the memory path/bus on these systems is not very efficient, and saturates when more than two proc

Re: [Beowulf] Sony PS3, random news

2009-12-10 Thread Gerald Creager
Grand Theft Auto? More'n'likely, YellowDog Linux, which is what the folks in Colorado adapted, under contract with Sony, to generate for the PS3. gerry Bernard Li wrote: Curious what software they use to provision these PS3s ;-) Cheers, Bernard On Wed, Dec 9, 2009 at 10:19 AM, Jeremy Bake

Re: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-03 Thread Gerald Creager
Prentice Bisbal wrote: Greg Lindahl wrote: On Thu, Dec 03, 2009 at 10:40:12AM -0500, Prentice Bisbal wrote: if a single node goes down, you need to take down all the nodes in the chassis before you can remove the dead node. Not very practical. Eh? What's so hard about marking the other nodes

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-03 Thread Gerald Creager
Toon Knapen wrote: I believe xfs is now available in 5.4. I'd have to check. We've found xfs to be our preference (but we're revisiting gluster and lustre). I've not played with gfs so far. And why do you prefer xfs, if I may ask. Performance? Do you have many small files or l

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-01 Thread Gerald Creager
Andrew M.A. Cater wrote: On Tue, Dec 01, 2009 at 03:04:32PM -0600, Gerald Creager wrote: A combination of mostly kernel improvements, and some useful middleware as RedHat and by extension, CentOS, seek to get farther into the cluster space. gerry Maybe also some licensing breaks on large

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-01 Thread Gerald Creager
kernel that take into account the NUMA architecture (affinity) or ... On Tue, Dec 1, 2009 at 8:57 PM, Tom Elken <mailto:tom.el...@qlogic.com>> wrote: > On Behalf Of Gerald Creager > I've been quite happy with CentOS 5.3 and we're experimenting with >

Re: [Beowulf] Forwarded from a long time reader having trouble posting

2009-12-01 Thread Gerald Creager
Toon, welcome back! I've been quite happy with CentOS 5.3 and we're experimenting with CentOS 5.4 now. I see good stability in 5.[34] and the incorporation of a couple of tools worth having in a distribution for 'Wulf use. I'd not recommend sticking with the old version, but of course, once

Re: [Beowulf] any creative ways to crash Linux?: does a shared NIC IPMI always remain responsive?

2009-10-27 Thread Gerald Creager
Ye GODS, I remember the NeXT Cube. They were so cute, and neat looking, compared to the Sun I had next to my desk. John Hearns wrote: 2009/10/26 John Hearns : 2009/10/26 Rahul Nabar : True. I just thought that if my BMC is running a webserver it cannot be all that stripped down. Maybe I am

Re: [Beowulf] any creative ways to crash Linux?: does a shared NIC IPMI always remain responsive?

2009-10-26 Thread Gerald Creager
Mark Hahn wrote: IPMI gets hung sometimes (like Gerry says in his reply). I guess I can just attribute that to bad firmware coding in the BMC. I think it's better to think of it as a piece of hw (the nic) trying to be managed by two different OSs: host and BMC. it's surprising that it works at

Re: [Beowulf] any creative ways to crash Linux?: does a shared NIC IPMI always remain responsive?

2009-10-26 Thread Gerald Creager
Mark Hahn wrote: the BMCs were Motorola single board computers running Linux. So ssh and http access were already there with whichever Linux distro they ran (you could look around in /proc for instance) Wow! I didn't realize that the BMC was again running a full blown Linux distro! sigh. th
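For anyone wanting to test whether their own shared-NIC BMC stays responsive, the standard probe is ipmitool over the LAN interface. A sketch follows; the BMC address and credentials are placeholders, and it is guarded so it only attempts the network call where ipmitool is installed.

```shell
#!/bin/sh
# Sketch: poll a BMC over its LAN interface and parse the reply.
# BMC address and credentials below are placeholders.
BMC=10.0.0.50

# Pure helper: interpret `ipmitool chassis power status` output.
power_is_on() {
    echo "$1" | grep -q 'Chassis Power is on'
}

if command -v ipmitool >/dev/null 2>&1; then
    # -N 2 -R 1: short timeout and single retry, so a wedged BMC fails fast.
    out=$(ipmitool -I lanplus -H "$BMC" -U admin -P secret \
                   -N 2 -R 1 chassis power status 2>/dev/null)
    if [ -z "$out" ]; then
        echo "BMC at $BMC not responding; shared NIC may be wedged"
    elif power_is_on "$out"; then
        echo "BMC at $BMC reachable; chassis power is on"
    fi
fi
```

Run from cron, a loop like this catches exactly the failure mode discussed above: the host OS looks fine on its NIC while the BMC sharing that NIC has stopped answering.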

Re: [Beowulf] Beowulf SysAdmin Job Description

2009-05-07 Thread Gerald Creager
Nicholas M Glykos wrote: Too many Ph.D persons (in fields other than Computer Science) assume that any sort of computer work is something they could do in their spare time, if they just took the time. Taking your point to its diametrically opposite extreme, I have repeatedly been told that