Re: [Beowulf] HPC fault tolerance using virtualization

2009-06-15 Thread John Hearns
2009/6/15 Michael Di Domenico: > Course having said all that, if you've been watching the linux-kernel > mailing list you've probably noticed the Xen/KVM/Linux HV argument > that took place last week. Makes me a little afraid to push any Linux > HV solution into production, but it's a fun…

Re: [Beowulf] HPC fault tolerance using virtualization

2009-06-15 Thread Michael Di Domenico
On Mon, Jun 15, 2009 at 1:59 PM, John Hearns wrote: > Proactive Fault Tolerance for HPC using Xen virtualization > > It's something I've wanted to see working - doing a Xen live migration > of a 'dodgy' compute node, and the job just keeps on trucking. > Looks as if these guys have it working. Anyone…
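The scheme the thread is excited about - watch node health, and live-migrate the job's VM off a node that looks like it is about to fail - can be sketched roughly as below. This is a hypothetical illustration, not the paper's implementation: the node names, the temperature threshold, and the `job-vm` name are all made up, and the `xm migrate --live` call is only built as a string here (real Xen migration needs matching hypervisors and shared or mirrored storage).

```python
import subprocess

TEMP_THRESHOLD_C = 70.0  # hypothetical "node is getting dodgy" threshold


def pick_target(nodes, temps):
    """Pick the coolest node that is still under the threshold, or None."""
    healthy = [n for n in nodes if temps.get(n, float("inf")) < TEMP_THRESHOLD_C]
    return min(healthy, key=lambda n: temps[n]) if healthy else None


def migrate(vm, target, dry_run=True):
    """Build (and optionally issue) a Xen live migration command."""
    cmd = ["xm", "migrate", "--live", vm, target]
    if dry_run:
        return " ".join(cmd)          # show what would run
    subprocess.check_call(cmd)        # actually migrate (needs real Xen hosts)


# Example: n01 runs hot, so the job's VM would be moved to the coolest node.
temps = {"n01": 82.5, "n02": 55.0, "n03": 61.2}
target = pick_target(["n01", "n02", "n03"], temps)
print(migrate("job-vm", target))      # -> xm migrate --live job-vm n02
```

The job keeps running through the migration, which is the whole attraction over reactive checkpoint/restart.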

[Beowulf] HPC fault tolerance using virtualization

2009-06-15 Thread John Hearns
I was doing a search on ganglia + ipmi (I'm looking at doing such a thing for temperature measurement) when I came across this paper: http://www.csm.ornl.gov/~engelman/publications/nagarajan07proactive.ppt.pdf Proactive Fault Tolerance for HPC using Xen virtualization. It's something I've wanted to…
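The ganglia + ipmi combination John mentions usually amounts to scraping `ipmitool sensor` output and feeding the temperature rows to `gmetric`. A minimal sketch, with the caveat that the sample output below is hypothetical (real BMCs vary in field layout and sensor names), and `--name/--value/--type/--units` are standard gmetric options:

```python
# Hypothetical `ipmitool sensor` output; column layout varies by BMC.
SAMPLE = """\
Ambient Temp     | 24.000     | degrees C  | ok
CPU1 Temp        | 58.000     | degrees C  | ok
Fan1             | 5400.000   | RPM        | ok
"""


def parse_temps(text):
    """Return {sensor_name: celsius} for the temperature rows only."""
    temps = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 3 and fields[2] == "degrees C":
            temps[fields[0]] = float(fields[1])
    return temps


def gmetric_cmd(name, value):
    """Build the gmetric invocation that pushes one reading into Ganglia."""
    return ["gmetric", "--name", name.replace(" ", "_"),
            "--value", str(value), "--type", "float", "--units", "Celsius"]


for name, value in parse_temps(SAMPLE).items():
    print(" ".join(gmetric_cmd(name, value)))
```

Run from cron on each node (with the `SAMPLE` string replaced by a real `ipmitool sensor` call), this gives Ganglia per-node temperature graphs - exactly the health signal a proactive-migration scheme would watch.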

Re: [Beowulf] MPI + CUDA codes

2009-06-15 Thread Gerry Creager
Charlie Peck wrote: On Jun 12, 2009, at 7:54 PM, Brock Palen wrote: I think the NAMD folks had a paper and data from real running code at SC last year. Check with them. Their paper from SC08 is here: http://mc.stanford.edu/cgi-bin/images/8/8a/SC08_NAMD.pdf Michalakes et al: http://www.go…

[Beowulf] Data Center Overload

2009-06-15 Thread Eugen Leitl
http://www.nytimes.com/2009/06/14/magazine/14search-t.html?_r=1&ref=magazine&pagewanted=print Data Center Overload By TOM VANDERBILT It began with an Xbox game. On a recent rainy evening in Brooklyn, I was at a friend’s house playing (a bit sheepishly, given my incipient middle age) Call of Du…

Re: [Beowulf] MPI + CUDA codes

2009-06-15 Thread Charlie Peck
On Jun 12, 2009, at 7:54 PM, Brock Palen wrote: I think the NAMD folks had a paper and data from real running code at SC last year. Check with them. Their paper from SC08 is here: http://mc.stanford.edu/cgi-bin/images/8/8a/SC08_NAMD.pdf charlie

Re: [Beowulf] MPI + CUDA codes

2009-06-15 Thread Brock Palen
I think the NAMD folks had a paper and data from real running code at SC last year. Check with them. Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985 On Jun 12, 2009, at 6:53 PM, Rajeev Thakur wrote: Is anyone aware of codes out there that use…

Re: [Beowulf] HPMPI over uDAPL issue

2009-06-15 Thread gossips J
I updated my /etc/security/limits.conf with hard/soft "unlimited". Still the same issue. I am using the OFED-1.4.1-GA build with this HP-MPI RPM. I also tried the physical_mem env settings and pin_percentage as well; this did not help either. Also observed that this error happens during memory registra…
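For reference, the usual fix for uDAPL/InfiniBand failures at memory-registration time is raising the locked-memory limit. A typical /etc/security/limits.conf fragment (the wildcard scope is a site choice; some sites limit it to the MPI users' group):

```
* soft memlock unlimited
* hard memlock unlimited
```

Editing the file alone is not always enough: `ulimit -l` must actually report `unlimited` in the environment the MPI job runs in, so daemons that spawn the job (sshd, the batch system's node daemon) may need a restart before the new limit takes effect.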