Re: [Beowulf] Re: MOSIX2

2008-09-30 Thread Marian Marinov
There are a few developers that continue to work on openMosix and to port it to 2.6 kernels. They forked a project called LinuxPMI - Linux Process Migration Infrastructure, http://linuxpmi.org. Currently the site is unavailable, but there was one Russian guy who commented big parts of the openM

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Nifty niftyompi Mitch
On Tue, Sep 30, 2008 at 11:37:12AM -0400, Robert G. Brown wrote: . snip. > ... >> Keeping it beowulf'y, if you want fine grained synchronization so that you >> don't lose performance when doing barriers, you're probably going to need >> some sort of common clock. The typical microprocessor

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Eric Thibodeau
Jon, I'm replying to Don's post since he outlines most of the reasons why I choose to use the NFS-mounted approach and let you choose whether or not you want a local disk(s) for scratch. Which brings up the _real_ questions: - how many nodes - are they all identical - how many users concur

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Jon Forrest
Greg Lindahl wrote: On Tue, Sep 30, 2008 at 09:05:16AM -0700, Jon Forrest wrote: Probably the most dangerous is modifying shared libraries and executables. Uh, this is the safest, if you do it correctly. How do you think people can use rpm/apt/whatever to update their systems with nothing goi

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Greg Lindahl
On Tue, Sep 30, 2008 at 09:05:16AM -0700, Jon Forrest wrote: > Probably the most dangerous is modifying > shared libraries and executables. Uh, this is the safest, if you do it correctly. How do you think people can use rpm/apt/whatever to update their systems with nothing going wrong? -- greg
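
Greg's message does not spell out what "correctly" means. One common reading, assumed here, is that a file is never overwritten in place: the new version is written under a temporary name in the same directory and then renamed over the old one, so processes that already have the old file open keep using it. A minimal sketch of that idea (the helper name atomic_replace is made up for illustration and is not taken from rpm or apt):

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace the file at `path` with `data` (bytes) via write-then-rename.

    Hypothetical helper for illustration only. File mode and ownership of
    the old file are not preserved in this sketch.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new contents to disk first
        # Readers see either the old file or the new one, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

On a POSIX filesystem, a process that opened the old file before the rename keeps reading the old contents until it closes the file, which is why a running program does not see its shared library change underneath it.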

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lux, James P
>> > ... >> >> Meanwhile, I'll try to find out where I can plug a serial cable into a >> modern server... > > Ah, come on ;) > > A serial-USB dongle will do just fine, as will a USB-based GPS receiver... > No it won't, because the USB connection introduces significant non-determinism in the timin

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lombard, David N
On Tue, Sep 30, 2008 at 12:01:57PM -0700, Donald Becker wrote: > On Tue, 30 Sep 2008, Lawrence Stewart wrote: > > > On Sep 30, 2008, at 12:07 PM, Lux, James P wrote: > > > On 9/30/08 8:37 AM, "Robert G. Brown" <[EMAIL PROTECTED]> wrote: > > > The grottiest grungy GPS receiver can probably do 100ns

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lombard, David N
On Tue, Sep 30, 2008 at 10:47:05AM -0700, Lawrence Stewart wrote: > > On Sep 30, 2008, at 12:07 PM, Lux, James P wrote: > > > > The grottiest grungy GPS receiver can probably do 100ns on its 1pps > > tick, > > and most are in the 20ns range. There ARE receivers that have > > systematic > > errors

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Lombard, David N
On Tue, Sep 30, 2008 at 09:05:16AM -0700, Jon Forrest wrote: > Bogdan Costescu wrote: > > > I use for a long time a different approach - the node "image" is copied > > via rsync at boot time; the long waiting time for installing the RPMs > > and running whatever configuration scripts happens only

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lawrence Stewart
On Sep 30, 2008, at 3:01 PM, Donald Becker wrote: Meanwhile, I'll try to find out where I can plug a serial cable into a modern server... I guess I am out of date on "modern". My Pentium 3 at home has a serial port, doesn't everything? The chip folks stuck a UART on our node chip, bu

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lux, James P
On 9/30/08 12:01 PM, "Donald Becker" <[EMAIL PROTECTED]> wrote: > On Tue, 30 Sep 2008, Lawrence Stewart wrote: > >> On Sep 30, 2008, at 12:07 PM, Lux, James P wrote: >>> On 9/30/08 8:37 AM, "Robert G. Brown" <[EMAIL PROTECTED]> wrote: >>> The grottiest grungy GPS receiver can probably do 100ns

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lux, James P
On 9/30/08 10:47 AM, "Lawrence Stewart" <[EMAIL PROTECTED]> wrote: > > > On Sep 30, 2008, at 12:07 PM, Lux, James P wrote: > >> >> >> >> On 9/30/08 8:37 AM, "Robert G. Brown" <[EMAIL PROTECTED]> wrote: >>> >> >> The grottiest grungy GPS receiver can probably do 100ns on its 1pps >> tick, >> and

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Bogdan Costescu
On Tue, 30 Sep 2008, Prentice Bisbal wrote: This brings up something else I was wondering about: If you truly strip down the OS running the nodes so that it's just a tiny kernel and only the essential libraries, the users would have to compile all their software (assuming they compile their own

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Donald Becker
On Tue, 30 Sep 2008, Lawrence Stewart wrote: > On Sep 30, 2008, at 12:07 PM, Lux, James P wrote: > > On 9/30/08 8:37 AM, "Robert G. Brown" <[EMAIL PROTECTED]> wrote: > > The grottiest grungy GPS receiver can probably do 100ns on its 1pps > > tick, > > and most are in the 20ns range. There ARE r

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Donald Becker
On Sun, 28 Sep 2008, Jon Forrest wrote: > There are two philosophies on where a compute node's > OS and basic utilities should be located: > 1) On a local harddrive > 2) On a RAM disk > I'd like to start a discussion on the positives > and negatives of each approach. I'll throw out > a few. > > B

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Bogdan Costescu
On Tue, 30 Sep 2008, Jon Forrest wrote: The trouble with rebooting nodes is that this takes human energy. When using a queueing system, rebooting nodes can be automated easily: - the node to be rebooted is switched to "offline" state so that the scheduler doesn't attempt to start new jobs on

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lux, James P
http://www.leapsecond.com/great2005/ On 9/30/08 10:11 AM, "Robert G. Brown" <[EMAIL PROTECTED]> wrote: > On Tue, 30 Sep 2008, Lux, James P wrote: > >> Rest assured that I am far from at the limit on homebrew timing fanaticism >> (check out the time-nuts list.. Haven't you and the kids always wan

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Bogdan Costescu
On Tue, 30 Sep 2008, Brian Oborn wrote: It works well except that the fileserver gets hammered if there's a time when many nodes are turned on at once. I don't know what you call "many"... When I boot 100+ nodes simultaneously, the avgload on the rsync server stays under 1; the server is a "

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Robert G. Brown
On Tue, 30 Sep 2008, Lawrence Stewart wrote: I don't think this is all that hard - once we got the cycle counters synchronized here at SiCortex, the rest of Linux behaves fairly well. The first trick is to get all the timer interrupts to happen at the same time. On our MIPS cores, this is

[Beowulf] Inappropriate topic.

2008-09-30 Thread Prentice Bisbal
Vincent Diepeveen wrote: > Hmm, > > The European Union waited a bit long introducing its own equivalent and > it should > be online within a few years, it is called Galileo and it is 1 meter > accurate for everyone. > > Also with the guarantee of it not getting switched off; which happens so > ea

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Donald Becker
On Mon, 29 Sep 2008, Lawrence Stewart wrote: > > The IEEE-1588 "Precision Time Protocol" can provide such levels of > > global clock > > synchronization. > That's the one I was trying to remember, but I didn't compose a good > query and couldn't find it. > > IIRC the NIC timestamps arriving pa

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lawrence Stewart
On Sep 30, 2008, at 11:37 AM, Robert G. Brown wrote: However, I would think it would require a lot of work to get the kernel(s) to respect a usec-synchronized clock, assuming that one could constrain the hardware so that it didn't generate too much random (e.g. interrupt) noise on its ow

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lawrence Stewart
On Sep 30, 2008, at 12:07 PM, Lux, James P wrote: On 9/30/08 8:37 AM, "Robert G. Brown" <[EMAIL PROTECTED]> wrote: The grottiest grungy GPS receiver can probably do 100ns on its 1pps tick, and most are in the 20ns range. There ARE receivers that have systematic errors (i.e. Some sor

Re: [Beowulf] Re: MOSIX2

2008-09-30 Thread Jürgen Knödlseder
Hi Tony, I'm in the same situation as you are: I'm running an openMosix cluster, but since it's more and more difficult to integrate new hardware with an old 2.4.26 kernel I think that I have to move to MOSIX2. I just got the latest version of MOSIX2 sent from Amnon Barak (I'm using the cl

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Robert G. Brown
On Tue, 30 Sep 2008, Lux, James P wrote: Rest assured that I am far from at the limit on homebrew timing fanaticism (check out the time-nuts list.. Haven't you and the kids always wanted to experimentally verify General Relativity over the weekend with stuff in your garage?) High on MY list of

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Brian Oborn
I don't agree with you here as you probably have in mind a kickstart-based install for approach #1 running upon each node boot. I use for a long time a different approach - the node "image" is copied via rsync at boot time; the long waiting time for installing the RPMs and running whatever

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Robert G. Brown
On Tue, 30 Sep 2008, Robert G. Brown wrote: On Tue, 30 Sep 2008, Lux, James P wrote: This is a very nice response, and I think you're on a very good track. IIRC from discussion a few years ago, GPS can yield what, microsecond or better timing (if used to adjust drift and resync all clocks)? In

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lux, James P
On 9/30/08 8:55 AM, "Vincent Diepeveen" <[EMAIL PROTECTED]> wrote: > Hmm, > > The European Union waited a bit long introducing its own equivalent > and it should > be online within a few years, it is called Galileo and it is 1 meter > accurate for everyone. 1 meter without differential correct

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lux, James P
On 9/30/08 8:37 AM, "Robert G. Brown" <[EMAIL PROTECTED]> wrote: > On Tue, 30 Sep 2008, Lux, James P wrote: > > This is a very nice response, and I think you're on a very good track. > IIRC from discussion a few years ago, GPS can yield what, microsecond or > better timing (if used to adjust dr

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Jon Forrest
Prentice Bisbal wrote: This brings up something else I was wondering about: If you truly strip down the OS running the nodes so that it's just a tiny kernel and only the essential libraries, the users would have to compile all their software (assuming they compile their own code, like they do her

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Jon Forrest
Bogdan Costescu wrote: On Sun, 28 Sep 2008, Jon Forrest wrote: There are two philosophies on where a compute node's OS and basic utilities should be located: You forget a NFS-root setup, this doesn't require memory for the RAM disk on which you later mount NFS dirs. You're right. I should

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Vincent Diepeveen
Hmm, The European Union waited a bit long introducing its own equivalent and it should be online within a few years, it is called Galileo and it is 1 meter accurate for everyone. Also with the guarantee of it not getting switched off; which happens so easily with GPS, even the smallest th

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Robert G. Brown
On Tue, 30 Sep 2008, Lux, James P wrote: This is a very nice response, and I think you're on a very good track. IIRC from discussion a few years ago, GPS can yield what, microsecond or better timing (if used to adjust drift and resync all clocks)? In principle sub-microsecond, since a microsecon

Re: [Beowulf] Re: MOSIX2

2008-09-30 Thread Vincent Diepeveen
I agree, Tony, that paying for such crap is not a very good idea. You might want to move to open-ssi in this case; the project is alive and there is, in theory, work being done on support for cards over InfiniBand as well. Most importantly, you are gonna get more replies. Additionall

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Lux, James P
On 9/30/08 2:53 AM, "Vincent Diepeveen" <[EMAIL PROTECTED]> wrote: > Hmm, > > 1 uS accuracy whereas the cpu has a hardware counter for all this. > > To be honest i find 1 microsecond very inaccurate now that cards have > latencies near that. > Doing that a couple of thousands of times, we sh

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Douglas Eadline
Good point about swap. I often try to make the distinction that diskless booting (provisioning) does not require diskless nodes. That is, it is perfectly reasonable to use a centralized provisioning method and yet have HDDs on compute nodes -- if you need them. In the case where swap and local sc

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Prentice Bisbal
Bogdan Costescu wrote: > On Sun, 28 Sep 2008, Jon Forrest wrote: > This also depends on how much of the distribution you keep as part of > the node "image" and how you place the application software. It's often > the case that the application software is distributed to the nodes from > a cluster-w

Re: [Beowulf] precise synchronization of system clocks

2008-09-30 Thread Vincent Diepeveen
Hmm, 1 uS accuracy whereas the cpu has a hardware counter for all this. To be honest i find 1 microsecond very inaccurate now that cards have latencies near that. Let's assume now a simple example of 2 nodes. node A and node B. Node A has time X Node A ships to B time X Then we do a loop.

[Beowulf] Re: MOSIX2

2008-09-30 Thread Tony Travis
Keith Hacke wrote: Dr Travis Below is a recent question you posted concerning usage of MOSIX2. I am also considering MOSIX2 for a test cluster of 64 nodes. I was not able to find any responses to your question and wondered if you received any communications on MOSIX2 or if you have more

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-09-30 Thread Bogdan Costescu
On Sun, 28 Sep 2008, Jon Forrest wrote: There are two philosophies on where a compute node's OS and basic utilities should be located: You forget a NFS-root setup, this doesn't require memory for the RAM disk on which you later mount NFS dirs. In both cases it's important to remember to mak