Re: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)

2008-08-01 Thread Joe Landman
Mark Hahn wrote: I'm not sure about the "most" part - HP's don't, and it looks like supermicro offers options both ways. all the recent tyan boards I've looked at had dedicated IPMI/OPMA onboard. all HP machines have dedicated ports. but to me this has all the hallmarks of a religious iss

Re: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)

2008-08-01 Thread John Hearns
On Fri, 2008-08-01 at 12:12 -0400, Mark Hahn wrote: > I'm not sure about the "most" part - HP's don't, and it looks like supermicro > offers options both ways. all the recent tyan boards I've looked at had > dedicated IPMI/OPMA onboard. all HP machines have dedicated ports. > > but to me this

Re: [Beowulf] Re: fftw2, mpi, from 32 bit to 64 and fortran

2008-08-01 Thread Mike Davis
My only issue with fftw is that some of our software will only work with fftw2 and not fftw3. That being said, running both is relatively trivial. Mike

Re: [Beowulf] fftw2, mpi, from 32 bit to 64 and fortran

2008-08-01 Thread Mark Kosmowski

[Beowulf] Re: fftw2, mpi, from 32 bit to 64 and fortran

2008-08-01 Thread Jason Riedy
And Ricardo Reis writes: > In a 64 bit machine the mpi version goes kaput. Any thoughts? I'd bet that you're calling MPI routines directly from your Fortran code somewhere, and fftw is a red herring... When calling MPI routines directly from your Fortran code, be very, very careful about the argument

Re: [Beowulf] fftw2, mpi, from 32 bit to 64 and fortran

2008-08-01 Thread Glen Beane
Mark Hahn wrote: I've scoured the net for answers to no avail and the fftw project seems to have ground to a halt. Maybe someone has had this problem and can shed some light. I don't know the status of the project, but fftw is definitely still widely used, and definitely works in 64b. I'

Re: [Beowulf] Seeing ECC errors since upgraded from Opteron 246 to 275

2008-08-01 Thread Mark Hahn
So I have 2 DL145-G2 nodes with 2 single-core 246 / 4GB each, and 2 DL145-G2 nodes with 2 dual-core 275 / 4GB each. it's worth making sure you have a current BIOS installed. 07/28/2008 | 17:52:23 | Memory #0x02 | Uncorrectable ECC | Asserted it may also be useful to run mcelog, which will tell

Re: [Beowulf] fftw2, mpi, from 32 bit to 64 and fortran

2008-08-01 Thread Mark Hahn
I've scoured the net for answers to no avail and the fftw project seems to have ground to a halt. Maybe someone has had this problem and can shed some light. I don't know the status of the project, but fftw is definitely still widely used, and definitely works in 64b. I've coded a small pro

[Beowulf] fftw2, mpi, from 32 bit to 64 and fortran

2008-08-01 Thread Ricardo Reis
which means... segfault. Hi all I've scoured the net for answers to no avail and the fftw project seems to have ground to a halt. Maybe someone has had this problem and can shed some light. I've coded a small program that reads a vorticity field, uses FFTW2 to send it from the physi

[Beowulf] Seeing ECC errors since upgraded from Opteron 246 to 275

2008-08-01 Thread Paulo Afonso Lopes
Dear all: Around 2/Apr I removed 2 Opterons 246 and "companion" 4x 512 MB DIMMs from two HPs DL145-G2, leaving them empty, to populate two other HPs (got 2 CPUs and 4GB per node). Then, I installed 2 dual-core Opterons per DL145-G2, together with 4 sticks of 1GB (2 sticks per CPU). So I have 2 DL

Re: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)

2008-08-01 Thread Maurice Hilarius
Mark Hahn wrote: BTW, where a lot of people are jumping on the "Get IPMI" bandwagon, I suggest getting PDUs with remote IP controlled ports is more useful. the thing I don't like about controlled PDUs is that they're pretty harsh - don't you expect a higher failure rate of node PSUs if you go

[Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)

2008-08-01 Thread Maurice Hilarius
Chris Samuel <[EMAIL PROTECTED]> wrote: .. > BTW, where a lot of people are jumping on the "Get IPMI" > bandwagon, I suggest getting PDUs with remote IP controlled > ports is more useful. Well, it depends on what you're trying to do, if it's get the system and CPU temperatures then a PDU

Re: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)

2008-08-01 Thread Mark Hahn
using lm_sensors is a poor substitute for IPMI. IMHO the only disadvantage of lm_sensors is the problem of building the right sensors.conf file. well, there's the little matter of being able to get data when the node is crashed, offline, busy, etc. I also very much like the ability to quer

Re: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)

2008-08-01 Thread Mike Davis
the thing I don't like about controlled PDUs is that they're pretty harsh - don't you expect a higher failure rate of node PSUs if you go yanking the power this way? I have only seen a handful of different IPMI interfaces, but they all were reasonably reliable. In using the ethernet interf

Re: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)

2008-08-01 Thread Mikhail Kuzminsky
In message from Mark Hahn <[EMAIL PROTECTED]> (Fri, 1 Aug 2008 10:06:17 -0400 (EDT)): ... Plus, with a lot of those PDUs you can add thermal sensors and trigger power off on high temperature conditions. IPMI normally provides all the motherboard's sensors as well. it seems like those are far

Re: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)

2008-08-01 Thread Mark Hahn
the thing I don't like about controlled PDUs is that they're pretty harsh - don't you expect a higher failure rate of node PSUs if you go yanking the power this way? Why? If nodes shut down on commands from the scheduler, that is good. And, if they do not, how is cutting power by the PDU socket

Re: [Beowulf] reboot without passing through BIOS?

2008-08-01 Thread David Mathog
Kilian CAVALOTTI <[EMAIL PROTECTED]> wrote: > I may be totally missing the point, but doesn't the memory need to be > physically (as in electrically) reset in order to clean out those bad > bits? And doesn't this require a hard reboot, for the machine to be > power cycled, so that memory cells a

Re: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)

2008-08-01 Thread Mark Hahn
BTW, where a lot of people are jumping on the "Get IPMI" bandwagon, I suggest getting PDUs with remote IP controlled ports is more useful. the thing I don't like about controlled PDUs is that they're pretty harsh - don't you expect a higher failure rate of node PSUs if you go yanking the power

Re: [Beowulf] Re: Linux cluster authenticating against multiple Active Directory domains

2008-08-01 Thread John Hearns
On Fri, 2008-08-01 at 15:37 +1000, Chris Samuel wrote: > We'd prefer to steer clear of Kerberos, it introduces > arbitrary job limitations through ticket lives that > are not tolerable for HPC work. > Kerberos is heavily used at CERN. They have a solution for that issue - the job can ask for an e