Re: [Beowulf] Barcelona hardware error: how to detect

2008-06-10 Thread Chris Samuel
- "Jason Clinton" <[EMAIL PROTECTED]> wrote: > The kernel patch is very extensive and, last I heard, under NDA. AMD post the patches publicly to the x86-64 discuss list. The most recent ones covered 2.6.24 and 2.6.25 and were sent out in April. https://www.x86-64.org/pipermail/discuss/2008

Re: [Beowulf] A couple of interesting comments

2008-06-10 Thread Chris Samuel
- [EMAIL PROTECTED] wrote: > All head nodes should have the BIOS set to local boot first. We set the interface on the internal cluster network to PXE and the external to not. Mind you, we control the external network too, so even if it did try it shouldn't do anything. cheers, Chris -- Chri

Re: [Beowulf] size of swap partition

2008-06-10 Thread Mikhail Kuzminsky
In message from Mark Hahn <[EMAIL PROTECTED]> (Tue, 10 Jun 2008 00:58:12 -0400 (EDT)): ... for instance, you can always avoid OOM with the vm.overcommit_memory=2 sysctl (you'll need to tune vm.overcommit_ratio and the amount of swap to get the desired limits.) in this mode, the kernel tracks ho
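The relationship between vm.overcommit_ratio, swap, and the resulting limit can be sketched in a few lines. This is only an illustration, assuming the usual strict-overcommit accounting in which the commit limit is swap plus overcommit_ratio percent of RAM (less any huge pages); the function names and the 8 GiB / 1 GiB figures are hypothetical and chosen for the example, not taken from the message.

```python
GIB_IN_KIB = 1024 * 1024  # one GiB expressed in KiB


def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio=50, hugetlb_kib=0):
    """Approximate commit limit (KiB) under vm.overcommit_memory=2.

    Assumes: limit = swap + (RAM - huge pages) * overcommit_ratio / 100.
    """
    return (ram_kib - hugetlb_kib) * overcommit_ratio // 100 + swap_kib


def swap_needed_kib(target_limit_kib, ram_kib, overcommit_ratio=50, hugetlb_kib=0):
    """Swap (KiB) required to reach a desired commit limit at a given ratio."""
    ram_part = (ram_kib - hugetlb_kib) * overcommit_ratio // 100
    return max(0, target_limit_kib - ram_part)


if __name__ == "__main__":
    # Hypothetical node: 8 GiB RAM, 1 GiB swap, ratio of 50.
    limit = commit_limit_kib(8 * GIB_IN_KIB, 1 * GIB_IN_KIB)
    print("commit limit (GiB):", limit // GIB_IN_KIB)
    # Swap needed so that the limit equals the full 8 GiB of RAM.
    need = swap_needed_kib(8 * GIB_IN_KIB, 8 * GIB_IN_KIB)
    print("swap needed (GiB):", need // GIB_IN_KIB)
```

On a running system the effective value can be compared against the CommitLimit line in /proc/meminfo.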

Re: [Beowulf] size of swap partition

2008-06-10 Thread Walid
Hi, For an 8GB dual-socket quad-core node, choosing --recommended in the kickstart file instead of specifying a size, RHEL5 allocates 1GB of swap. Our developers say that they should not swap, as this causes overhead, and they try to avoid it as much as possible. Regards, Walid On 10/06/2008

Re: [Beowulf] size of swap partition

2008-06-10 Thread Mark Kosmowski
> Message: 5 > Date: Tue, 10 Jun 2008 00:58:12 -0400 (EDT) > From: Mark Hahn <[EMAIL PROTECTED]> > Subject: Re: [Beowulf] size of swap partition > To: Gerry Creager <[EMAIL PROTECTED]> > Cc: Mikhail Kuzminsky <[EMAIL PROTECTED]>, beowulf@beowulf.org > Message-ID: ><[EMAIL PROTECTED]> > Cont

Re: [Beowulf] Barcelona hardware error: how to detect

2008-06-10 Thread Jason Clinton
On Thu, Jun 5, 2008 at 11:39 AM, Mikhail Kuzminsky <[EMAIL PROTECTED]> wrote: > In message from Mark Hahn <[EMAIL PROTECTED]> (Thu, 5 Jun 2008 11:57:28 > -0400 (EDT)): > >> To be more exact, Rev. B2 of Opteron 2350 - is it for CPU stepping w/error >>> or w/o error ? >>> >> >> AMD, like Intel, does

Re: [Beowulf] Barcelona hardware error: how to detect

2008-06-10 Thread Jason Clinton
On Thu, Jun 5, 2008 at 1:09 PM, Mikhail Kuzminsky <[EMAIL PROTECTED]> wrote: > In message from Mark Hahn <[EMAIL PROTECTED]> (Thu, 5 Jun 2008 13:55:01 > -0400 (EDT)): > >> I'm mystified by this: B2 was broken, so using it without the bios >> workaround is just a mistake or masochism. the workarou

Re: [Beowulf] A couple of interesting comments

2008-06-10 Thread Michael Brown
Perry E. Metzger wrote: Anyone have any cool tricks for how to consistently set the BIOS on large numbers of boxes without requiring steps that humans can screw up easily? Get a USB stick that boots into Linux. Set up one machine the way you want, then boot it up using the USB stick. Do: dd if=

Re: [Beowulf] OFED/IB for FC8

2008-06-10 Thread Jason Clinton
On Thu, Jun 5, 2008 at 4:38 AM, Rainer Finocchiaro < [EMAIL PROTECTED]> wrote: > Hi Michael, > > Greg Lindahl schrieb: > >> All the OFED rpm's for FC6 installed on FC8 without difficulty, except for >>> opensm-3.0.3-0.ppc64.rpm >>> >> >> This is the cause of most of your subsequent problems. Witho

Re: [Beowulf] A couple of interesting comments

2008-06-10 Thread johnh
> On Fri, 2008-06-06 at 10:39 -0500, Gerry Creager wrote: >> > > I can think of at least one cluster where the opposite has been true and > PXE boot has been the default. The problem with this is if the head > node PXE boots on the customers network and gets automatically > re-installed as a wi

Re: Re: [Beowulf] A couple of interesting comments

2008-06-10 Thread Maurice Hilarius
Chris Samuel <[EMAIL PROTECTED]> wrote: Our most recent vendor went to the motherboard manufacturer and said "please can you cut us a BIOS with these default settings" and they did so. cheers, Chris Some manufacturers do, some do not. Asus, for example, do, for their OEM customers. OTO

Re: [Beowulf] A couple of interesting comments

2008-06-10 Thread bari
Tim Cutts wrote: Nope. :-) This is, in my view, one of the major disadvantages of PC clusters. The crappy old BIOS that we're stuck with. Just out of curiosity, besides the clusters at LANL and Sandia, who here uses coreboot (LinuxBIOS) for BIOS? http://www.coreboot.org If not, why not? L

Re: [Beowulf] A couple of interesting comments

2008-06-10 Thread Matt Allen
> cool tricks to consistently set the BIOS We had a cluster of systems that supported configuring the BIOS from an image on a bootable floppy. I bought 96 3.5" floppy disks, put one in each node, and then used parallel scp to dd the desired imag

Re: [Beowulf] OFED/IB for FC8

2008-06-10 Thread Jason Clinton
On Thu, Jun 5, 2008 at 10:38 AM, Jason Clinton < [EMAIL PROTECTED]> wrote: > ls | grep -oP .+?\(\?=.x86_64\\.rpm\) | xargs rpm -e > Of course, replace "x86_64" with "ppc64" if indeed that is what you installed.
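For anyone who wants to check what that pipeline would hand to rpm -e before running it, the name-stripping step can be sketched in Python. This is a rough equivalent, not the original command: it uses the same lazy match with a lookahead, but escapes the dot before the architecture (the one-liner leaves it as "any character"), and it only returns the names rather than erasing anything. The function name and the sample file names are made up for illustration.

```python
import re


def package_names(filenames, arch="x86_64"):
    """Return each file name with its trailing '.<arch>.rpm' removed.

    Names that do not contain '.<arch>.rpm' are skipped, as grep -o
    would print nothing for them.
    """
    # Lazy match up to (but not including) ".<arch>.rpm".
    pattern = re.compile(r".+?(?=\." + re.escape(arch) + r"\.rpm)")
    names = []
    for filename in filenames:
        match = pattern.search(filename)
        if match:
            names.append(match.group(0))
    return names


if __name__ == "__main__":
    # Hypothetical directory listing, used only to exercise the function.
    listing = [
        "opensm-3.0.3-0.x86_64.rpm",
        "libibverbs-1.1.1-0.ppc64.rpm",
        "README.txt",
    ]
    for name in package_names(listing):
        print(name)
    for name in package_names(listing, arch="ppc64"):
        print(name)
```

Printing the list first and reviewing it by hand is a cheap safeguard before passing the same names to rpm -e.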