Re: [Beowulf] Torrents for HPC

2012-06-12 Thread Bill Broadley
On 06/12/2012 03:47 PM, Skylar Thompson wrote:
> We manage this by having users run this in the same Grid Engine
> parallel environment they run their job in. This means they're
> guaranteed to run the sync job on the same nodes their actual job runs
> on. The copied files change so slowly that eve

Re: [Beowulf] Torrents for HPC

2012-06-12 Thread Skylar Thompson
On 06/12/2012 03:42 PM, Bill Broadley wrote:
> Using MPI does make quite a bit of sense for clusters with high
> speed interconnects. Although I suspect that being network bound
> for IO is less of a problem. I'd consider it though, I do have
> sdr/d

Re: [Beowulf] Torrents for HPC

2012-06-12 Thread Bill Broadley
Many thanks for the online and offline feedback. I've been reviewing the mentioned alternatives. From what I can tell none of them allow nodes to join/leave at random. Our problem is that a user might submit 500-50,000 jobs that depend on a particular dataset and have a variable number of job

Re: [Beowulf] Torrents for HPC

2012-06-12 Thread Bernard Li
Hi Orion:
On Tue, Jun 12, 2012 at 2:06 PM, Orion Poplawski wrote:
> Hmm, the home page indicates it went into ganglia, but it's not there now.
> Anyone know what happened?
The code is here: http://ganglia.svn.sf.net/viewvc/ganglia/trunk/gexec/pcp/
Perhaps Brent could update the page with the d

Re: [Beowulf] Torrents for HPC

2012-06-12 Thread Orion Poplawski
On 06/11/2012 12:17 PM, Bernard Li wrote:
> Hi all:
>
> I'd also like to point you guys to pcp:
>
> http://www.theether.org/pcp/
>
> It's a bit old, but should still build on modern systems. It would be
> nice if somebody picks up development after all these years (hint
> hint) :-)
Hmm, the home

Re: [Beowulf] Torrents for HPC

2012-06-12 Thread Ellis H. Wilson III
On 06/08/12 20:06, Bill Broadley wrote:
> A new user on one of my GigE clusters submits batches of 500 jobs that
> need to randomly read a 30-60GB dataset. They aren't the only user of
> said cluster so each job will be waiting in the queue with a mix of others.
With a 160TB cluster and only a 30

Re: [Beowulf] Torrents for HPC

2012-06-12 Thread David N. Lombard
On Mon, Jun 11, 2012 at 11:17:53AM -0700, Bernard Li wrote:
> Hi all:
>
> I'd also like to point you guys to pcp:
>
> http://www.theether.org/pcp/
>
> It's a bit old, but should still build on modern systems. It would be
> nice if somebody picks up development after all these years (hint
> hint