[Beowulf] Odd Infiniband scaling behaviour

2007-10-07 Thread Chris Samuel
Hi fellow Beowulfers.. We're currently building an Opteron based IB cluster, and are seeing some rather peculiar behaviour that has had us puzzled for a while. If I take a CPU bound application, like NAMD, I can run an 8 CPU job on a single node and it pegs the CPUs at 100% (this is built using

Re: [Beowulf] [AMD64] Gentoo or Fedora

2007-10-07 Thread Gerry Creager
Er, ah, Greg, don't hold back. How do you REALLY feel? Greg Lindahl wrote: On Sun, Oct 07, 2007 at 03:10:30PM -0700, Greg Lindahl wrote: Hm, elm doesn't compile anymore, I wonder if anyone will notice if I just delete it? Of course, my CEO noticed about 10 minutes later! I told him to use

Re: [Beowulf] [AMD64] Gentoo or Fedora

2007-10-07 Thread Greg Lindahl
On Sun, Oct 07, 2007 at 03:10:30PM -0700, Greg Lindahl wrote: > Hm, elm doesn't compile anymore, I wonder if anyone will notice if I just delete it? Of course, my CEO noticed about 10 minutes later! I told him to use a real mailer, like mutt. ;-) -- greg

Re: [Beowulf] [AMD64] Gentoo or Fedora

2007-10-07 Thread Greg Lindahl
Sorry that this is a "late hit" on this topic, but every time someone mentions Gentoo, I have to count to 100,000 before I say anything. From what I can tell, the dependency stuff in Gentoo mostly works. If you try to not update any packages unless they have a security issue, you will discover a

Re: [Beowulf] 32 nodes cluster price

2007-10-07 Thread Joe Landman
Bill Rankin wrote: Let me offer up a somewhat concrete example of a problem with hardware raid. A local group around here kept some Very Important Data on a hardware raid array. Due to several factors, a backup was not made of certain data. The device lost a drive and started an automagic

Re: [Beowulf] 32 nodes cluster price

2007-10-07 Thread Bill Broadley
Geoff Galitz wrote: Why do you automatically distrust hardware raid? Because they are low volume parts designed to handle failure modes in very complicated environments. If you buy a hardware RAID card you very well could have the only one on the planet with that exact config. Variables i

Re: [Beowulf] 32 nodes cluster price

2007-10-07 Thread Eugen Leitl
On Sun, Oct 07, 2007 at 12:54:42PM -0400, Mike Davis wrote: > Controllers can have problems, but so can software. The point is that you have to keep a hardware spare with hardware RAID. This can make things much more expensive. Also, software RAID is typically free, while a hardware RAID of simil

Re: [Beowulf] 32 nodes cluster price

2007-10-07 Thread Mike Davis
And what would happen if 2 drives died on a software RAID5? The problem with the example is that it could happen whether one uses software or hardware RAID. The real issue is that important data was stored and not backed up. Bad things happen when you have a bad storage strategy. I have run HW

Re: [Beowulf] Problem with Single RAID disk larger than 2TB and Linux

2007-10-07 Thread Joel Jaeggli
Guy Coates wrote: > Luns over 2TB are a bad idea. There are just too many reasons why they might not work, and trying to track down the right one is a pain. > Your workaround to use the LVM to stripe 3x1TB luns together is the way to go. (You really want to use LVM anyhow, as trying to d

Re: [Beowulf] 32 nodes cluster price

2007-10-07 Thread Bill Rankin
On Oct 5, 2007, at 4:17 PM, Leif Nixon wrote: "Geoff Galitz" <[EMAIL PROTECTED]> writes: Why do you automatically distrust hardware raid? To some extent I share Mark's sentiment. I certainly trust the Linux kernel more than the firmware in a cheap raid controller. Let me offer up a somewh