On Mon, 15 Jun 2009 20:58:58 +0100
John Hearns wrote:
> 2009/6/15 Michael Di Domenico :
> >
> >
> > Course having said all that, if you've been watching the linux-kernel
> > mailing list you've probably noticed the Xen/KVM/Linux HV argument
> > that took place last week. Makes me a little afraid to push any Linux HV
On 16/06/2009, at 5:58 AM, John Hearns wrote:
2009/6/15 Michael Di Domenico :
Course having said all that, if you've been watching the linux-kernel
mailing list you've probably noticed the Xen/KVM/Linux HV argument
that took place last week. Makes me a little afraid to push any Linux HV
John Hearns wrote:
2009/6/16 Egan Ford <e...@sense.net>:
I have no idea the state of VMs on IB. That can be an issue with
MPI. Believe it or not, but most HPC sites do not use MPI. They
are all batch systems where storage I/O is the bottleneck.
Burn the Witch! Burn the Witch!
John Hearns wrote:
Any HPC installation, if you want to show it off to alumni, august
committees from grant awarding bodies, etc., and not get sand kicked in
your face from the big boys in the Top 500 NEEDS an expensive
infrastructure of various MPI libraries. Big, big switches with lots of
flashing lights.
Ha! :-)
I've put a few GigE systems in the Top100, and if the stars align you'll see
a Top20 GigE system in next week's list. That's ONE GigE to each node,
oversubscribed 4:1. Sadly no flashing lights, and since it's 100%
water-cooled with low-velocity fans, there is almost no noise.
On Tue, Jun
The good news...
We (IBM) demonstrated such a system at SC08 as a Cloud Computing demo. The
setup was a combination of Moab, xCAT, and Xen.
xCAT is an open source provisioning system that can control/monitor
hardware, discover nodes, and provision stateful/stateless physical nodes
and virtual machines.
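A minimal sketch of that kind of command-line control from a script, assuming
xCAT's rpower client command is on PATH and that "rpower <noderange> stat"
prints lines like "node01: on" (the output format can vary between releases);
the group name "compute" is illustrative:

import subprocess

def power_state(noderange):
    """Ask xCAT for power state: 'rpower <noderange> stat' -> {node: state}."""
    out = subprocess.run(["rpower", noderange, "stat"],
                         capture_output=True, text=True, check=True)
    states = {}
    for line in out.stdout.splitlines():
        # Assumed line format: "node01: on" (one node per line).
        if ":" in line:
            node, state = line.split(":", 1)
            states[node.strip()] = state.strip()
    return states

if __name__ == "__main__":
    # "compute" stands in for whatever xCAT node group the site defines.
    for node, state in sorted(power_state("compute").items()):
        print(node, state)

The same pattern works for the other xCAT verbs (provisioning, discovery);
only the command name and the parsing of its output change.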
2009/6/16 Egan Ford
> I have no idea the state of VMs on IB. That can be an issue with MPI.
> Believe it or not, but most HPC sites do not use MPI. They are all batch
> systems where storage I/O is the bottleneck.
Burn the Witch! Burn the Witch!
Any HPC installation, if you want to show it off
Date: Tue, 16 Jun 2009 10:38:55 +0200
From: Kilian CAVALOTTI
On Monday 15 June 2009 20:47:40 Michael Di Domenico wrote:
It would be nice to be able to just move bad hardware out from under a
running job without affecting the run of the job.
I may be missing something major here, but if
2009/6/16 Ashley Pittman
> >
> > elements (or slots) allocated for the job on the node - if the VM is
> > able to adapt itself to such a situation, e.g. by starting several MPI
> > ranks and using shared memory for MPI communication. Further, to
> > cleanly stop the job, the queueing system will
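As a minimal illustration of the point about co-located MPI ranks, the sketch
below (assuming mpi4py; the thread names no particular MPI binding) has each
rank discover which peers share its node and build a node-local
sub-communicator, which is the grouping a job would need before switching to
shared-memory communication among ranks on one host:

import socket
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Every rank learns every other rank's hostname.
hostnames = comm.allgather(socket.gethostname())

# Ranks on the same host get the same color, hence the same sub-communicator.
color = sorted(set(hostnames)).index(hostnames[rank])
node_comm = comm.Split(color=color, key=rank)

print("global rank %d is local rank %d of %d on %s"
      % (rank, node_comm.Get_rank(), node_comm.Get_size(), hostnames[rank]))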
On Tue, 2009-06-16 at 12:27 +0200, Bogdan Costescu wrote:
> You might be right, at least when talking about the short term. It has
> been my experience with several ISVs that they are very slow in
> adopting newer features related to system infrastructure in their
> software - by system infrastructure
On Mon, Jun 15, 2009 at 3:58 PM, John Hearns wrote:
> 2009/6/15 Michael Di Domenico :
>> Course having said all that, if you've been watching the linux-kernel
>> mailing list you've probably noticed the Xen/KVM/Linux HV argument
>> that took place last week. Makes me a little afraid to push any Linux HV
On Tue, 16 Jun 2009, John Hearns wrote:
I believe that if we can get features like live migration of failing
machines, plus specialized stripped-down virtual machines specific
to job types then we will see virtualization becoming mainstream in
HPC clustering.
You might be right, at least when talking about the short term.
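A minimal sketch of the live-migration idea using the libvirt Python bindings;
the guest name, the host URIs, and the decision of when a node counts as
"failing" are illustrative assumptions, and shared storage reachable from both
hosts is assumed:

import libvirt

SRC_URI = "qemu+ssh://failing-node/system"   # hypothetical source host
DST_URI = "qemu+ssh://spare-node/system"     # hypothetical destination host
GUEST   = "jobvm01"                          # hypothetical VM running the job

src = libvirt.open(SRC_URI)
dst = libvirt.open(DST_URI)

dom = src.lookupByName(GUEST)
# VIR_MIGRATE_LIVE keeps the guest (and the job inside it) running while
# its memory is copied to the destination host.
dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)

src.close()
dst.close()

The job inside the guest keeps running while memory pages are copied over;
only the final switch-over pauses it briefly.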
John Hearns writes:
> I was doing a search on ganglia + ipmi (I'm looking at doing such a
> thing for temperature measurement)
Like
<http://www.nw-grid.ac.uk/LivScripts?action=AttachFile&do=get&target=freeipmi-gmetric-temp>?
If you want to take action, though, go direct to Nagios or similar with
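A small sketch in the same spirit as the linked freeipmi-gmetric-temp script
(not that script itself): read the temperature sensors with ipmitool and push
each reading into Ganglia with gmetric. It assumes both tools are on PATH and
that "ipmitool sdr type Temperature" prints pipe-separated lines ending in
"... degrees C", which varies by BMC:

import subprocess

def read_temperatures():
    """Yield (sensor_name, celsius) pairs from 'ipmitool sdr type Temperature'."""
    out = subprocess.run(["ipmitool", "sdr", "type", "Temperature"],
                         capture_output=True, text=True, check=True)
    for line in out.stdout.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Assumed format: "Ambient Temp | 32h | ok | 7.1 | 25 degrees C"
        if len(fields) >= 5 and "degrees C" in fields[-1]:
            name = fields[0].replace(" ", "_")
            value = float(fields[-1].split()[0])
            yield name, value

def push_to_ganglia(name, value):
    """Publish one reading as a Ganglia metric via gmetric."""
    subprocess.run(["gmetric", "--name", "temp_" + name, "--value", str(value),
                    "--type", "float", "--units", "Celsius"], check=True)

if __name__ == "__main__":
    for name, value in read_temperatures():
        push_to_ganglia(name, value)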
2009/6/16 Kilian CAVALOTTI
> My take on this is that it's probably more efficient to develop
> checkpointing
> features and recovery in software (like MPI) rather than adding a
> virtualization layer, which is likely to decrease performance.
>
The performance hits measured by Panda et al. on InfiniBand
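As a sketch of what "checkpointing in software" can look like at the
application level (illustrative only, assuming mpi4py; no particular
checkpoint library is implied), each rank periodically writes its own state
to disk so a resubmitted job can resume from the last completed step:

import os
import pickle
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
ckpt = "ckpt_rank%04d.pkl" % rank

# Resume from an existing checkpoint if one is present.
if os.path.exists(ckpt):
    with open(ckpt, "rb") as f:
        state = pickle.load(f)
else:
    state = {"step": 0, "data": 0.0}

while state["step"] < 1000:
    state["data"] += rank          # stand-in for the real computation
    state["step"] += 1
    if state["step"] % 100 == 0:   # checkpoint every 100 steps
        with open(ckpt + ".tmp", "wb") as f:
            pickle.dump(state, f)
        os.rename(ckpt + ".tmp", ckpt)   # atomically replace the old checkpoint
        comm.Barrier()             # keep all ranks' checkpoints at the same step

Writing to a temporary file and renaming it keeps the previous checkpoint
intact if a node dies mid-write.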
2009/6/16 Kilian CAVALOTTI
>
>
> I may be missing something major here, but if there's bad hardware, chances
> are the job has already failed from it, right? Would it be a bad disk (and
> the OS would only notice a bad disk while trying to write on it, likely
> asked to do so by the job), or
On Monday 15 June 2009 20:47:40 Michael Di Domenico wrote:
> It would be nice to be able to just move bad hardware out from under a
> running job without affecting the run of the job.
I may be missing something major here, but if there's bad hardware, chances
are the job has already failed from it, right?