Regarding fault tolerance, this sounds interesting.
I haven't had a chance to do more than glance at the web page though (9:30 UK
time and I need my coffee)
Lecture Series
"A Perspective on Exploiting Heterogeneous Fault-Tolerant Parallelism for
HPC clusters and Supercomputers"
at the
Uni
On Mon, Oct 1, 2012 at 2:22 PM, Mark Hahn wrote:
>> My idea is to use a data-parallel API. This is nothing new. In theory,
>
> right, it's not new. so why would it succeed this time around?
This is because the transformation of the application architecture
from static to statistically multiplexed f
> My idea is to use a data-parallel API. This is nothing new. In theory,
right, it's not new. so why would it succeed this time around?
> can still be elegant looking. For example, you can have multiple
> Infiniband interfaces (some machines already have) to help counter the
> speed disparity betw
Something like that. But we don't want the app code to look too ugly.
My idea is to use a data-parallel API. This is nothing new. In theory,
every MPI program can be translated into data-parallel form. The magic is
the total transformation of the application architecture.
Traditionally computer, network
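To make the shape of that concrete, here is a minimal bag-of-tasks sketch in
plain MPI C. It only illustrates the decoupling being described (work is
pulled from a pool instead of being bound to a fixed rank); it is not any
particular data-parallel API, the task count is arbitrary, and the interesting
part, reissuing a chunk whose worker has disappeared, is deliberately left out.

/* bag_of_tasks.c -- illustrative only: work items are pulled from a pool,
 * so no result depends on one particular rank surviving the whole run.
 * Reissuing work lost with a failed rank is NOT shown here. */
#include <mpi.h>
#include <stdio.h>

#define NTASKS   100   /* arbitrary pool size */
#define TAG_WORK 1
#define TAG_STOP 2

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                        /* rank 0 owns the task pool */
        int sent = 0, received = 0, result;
        MPI_Status st;
        /* seed every worker with one task, or a stop if none are left */
        for (int w = 1; w < size; w++) {
            int tag = (sent < NTASKS) ? TAG_WORK : TAG_STOP;
            MPI_Send(&sent, 1, MPI_INT, w, tag, MPI_COMM_WORLD);
            if (tag == TAG_WORK) sent++;
        }
        /* whichever worker answers first gets the next chunk */
        while (received < sent) {
            MPI_Recv(&result, 1, MPI_INT, MPI_ANY_SOURCE, TAG_WORK,
                     MPI_COMM_WORLD, &st);
            received++;
            int tag = (sent < NTASKS) ? TAG_WORK : TAG_STOP;
            MPI_Send(&sent, 1, MPI_INT, st.MPI_SOURCE, tag, MPI_COMM_WORLD);
            if (tag == TAG_WORK) sent++;
        }
        printf("collected %d results\n", received);
    } else {                                /* workers just pull and compute */
        int task, result;
        MPI_Status st;
        for (;;) {
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP) break;
            result = task * task;           /* stand-in for the real kernel */
            MPI_Send(&result, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}
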
On 9/29/12 2:29 AM, "Justin YUAN SHI" wrote:
>I missed this thread. Got busy with classes. Sorry.
>
>Going back to Jim's comments on Infiniband and OSI and MPI. I see that
>exascale computing requires us to rethink MPI's insistence on sending
>messages directly. Even with the group communicators
I missed this thread. Got busy with classes. Sorry.
Going back to Jim's comments on Infiniband and OSI and MPI. I see that
exascale computing requires us to rethink MPI's insistence on sending
messages directly. Even with the group communicators, the implementation
insists on the same.
The problem
On 09/24/2012 12:57 PM, Andrew Holway wrote:
>> Haha, I doubt it -- probably the opposite in terms of development cost.
>>Which is why I question the original statement on the grounds that
>> "cost" isn't well defined. Maybe the costs just performance-wise, but
>> that's not even clear to me w
> Haha, I doubt it -- probably the opposite in terms of development cost.
> Which is why I question the original statement on the grounds that
> "cost" isn't well defined. Maybe the costs just performance-wise, but
> that's not even clear to me when we consider things at huge scales.
40 years a
Subject: Re: [Beowulf] Checkpointing using flash
>
>> Of course the physical modelers won't bat an eyelash, but the common
>> programmer who still tries to figure out this multithreading thing
>> will be out to lunch.
>
> Whenever you push a problem from hardware to software y
nodes, etc.
Jim Lux
On Mon, Sep 24, 2012 at 2:11 AM, Eugen Leitl wrote:
On Sat, Sep 22, 2012 at 09:29:25PM +0000, L
On Mon, Sep 24, 2012 at 3:59 AM, Andrew Holway wrote:
> Of course the physical modelers wo
Regardless of how low the MPI stack goes, it has never "punched" through the
packet retransmission layer. Therefore, the OSI model serves as a template to
illustrate the point of discussion.
Justin
On Sep 22, 2012, at 10:34 AM, "Lux, Jim (337C)" wrote:
> I see MPI as sitting much lower (network o
Andrew,
I think you are not too far off. If the global "Gluster-like" mechanism can
provide theoretically upper-bounded protection for its stored info, and can
scale as we grow the machine size, it would look like a reasonable exascale
machine.
Justin
On Sep 22, 2012, at 7:02 AM, Andrew Ho
> Of course the physical modelers won't bat an eyelash,
> but the common programmer who still tries to figure out
> this multithreading thing will be out to lunch.
Whenever you push a problem from hardware to software you
exponentially increase the cost of solving that problem.
On Sat, Sep 22, 2012 at 09:29:25PM +0000, Lux, Jim (337C) wrote:
> I think the future is in explicitly recognizing that you have to pass
> messages serially and designing algorithms that are tolerant of things
> like missing messages, variable (but bounded) latency (or heck, latency at
> all).
Co
On 9/23/12 6:57 AM, "Andrew Holway" wrote:
>2012/9/21 David N. Lombard :
>> Our primary approach today is recovery-based resilience, a.k.a.,
>> checkpoint-restart (C/R). I'm not convinced we can continue to rely on
>>that
>> at exascale.
>
>- Snapshotting seems to be an ugly and inelegant way of
2012/9/21 David N. Lombard :
> Our primary approach today is recovery-based resilience, a.k.a.,
> checkpoint-restart (C/R). I'm not convinced we can continue to rely on that
> at exascale.
- Snapshotting seems to be an ugly and inelegant way of solving the
problem. For me it is especially laughable
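For anyone who has not written one: application-level C/R at its crudest is
just "periodically serialize the state you cannot cheaply recompute, and look
for that file at startup". A minimal single-process sketch (the file name,
array size, and checkpoint interval are made up for illustration):

/* cr_sketch.c -- crude application-level checkpoint/restart: the whole
 * state worth saving here is one array plus the step counter. */
#include <stdio.h>

#define N     1000000          /* made-up problem size */
#define STEPS 10000
#define CKPT  "state.ckpt"     /* made-up checkpoint file name */

int main(void)
{
    static double u[N];
    long step = 0;

    FILE *f = fopen(CKPT, "rb");            /* restart if a checkpoint exists */
    if (f) {
        if (fread(&step, sizeof step, 1, f) != 1 ||
            fread(u, sizeof u[0], N, f) != N) {
            fprintf(stderr, "bad checkpoint, starting over\n");
            step = 0;
        }
        fclose(f);
    }

    for (; step < STEPS; step++) {
        if (step % 100 == 0) {              /* checkpoint every 100 steps */
            f = fopen(CKPT ".tmp", "wb");
            if (f) {
                fwrite(&step, sizeof step, 1, f);
                fwrite(u, sizeof u[0], N, f);
                fclose(f);
                rename(CKPT ".tmp", CKPT);  /* atomic-ish replace */
            }
        }
        for (long i = 0; i < N; i++)        /* stand-in for the real solver */
            u[i] += 1.0 / (double)(step + 1);
    }
    printf("done, u[0] = %f\n", u[0]);
    return 0;
}
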
Jim Lux wrote that one giant [distributed] memory has scalability
problems for physical-distance reasons. Yes indeed. Simply to clarify,
I was referring to a specific niche in parameter space (physical and
programmatic) associated with programs using file I/O. That is to say,
there is a realm fo
On 9/22/12 12:47 PM, "Alan Louis Scheinine" wrote:
>Andrew Holway wrote:
> > I've been playing around with GFS and Gluster a bit recently and this
> > has got me thinking... Given a fast enough, low enough latency network
> > might it be possible to have a Gluster-like or GFS-like memory sp
Andrew Holway wrote:
> I've been playing around with GFS and Gluster a bit recently and this
> has got me thinking... Given a fast enough, low enough latency network
> might it be possible to have a Gluster-like or GFS-like memory space?
For random access, hard disk access times are millise
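Rough orders of magnitude, just to frame the comparison (typical figures, not
measurements of any specific system): local DRAM access is ~0.1 us, a good
Infiniband/RDMA round trip is ~1-10 us, a NAND flash page read is ~25 us (per
the numbers quoted elsewhere in this thread), and a disk seek is ~5-10 ms. So
a Gluster/GFS-style "memory space" over the network costs one to two orders of
magnitude versus local RAM, but for random access it still beats rotating disk
by roughly three orders of magnitude.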
I see MPI as sitting much lower (network or transport, perhaps)
Maybe for this (as in many other cases) the OSI model is not an
appropriate one.
That is, most practical systems have more blending between layers, and
outright punching through. There are a variety of high level
protocols/algorithms
> To be exact, the OSI layers 1-4 can defend against packet data losses and
> corruption caused by transient hardware and network failures. Layers
> 5-7 provide no protection. MPI sits on top of layer 7. And it assumes
> that every transmission must be successful (this is why we have to use
> checkpoint in
Ellis:
If we go down to the nitty-gritty details, you will see that
transient faults are the ultimate enemy of exascale computing. The
problem stems from the mismatch between the MPI assumptions and what
the OSI model promises.
To be exact, the OS
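To spell out what "protection above layer 7" has to look like in practice:
the application (or the runtime under it) has to carry a sequence number, wait
for an acknowledgement, resend on timeout, and tolerate duplicates on the
receiving side. A toy single-process sketch with a simulated lossy link (the
loss rate and retry limit are made up, and this is not how any MPI
implementation actually works):

/* retry_sketch.c -- application-level retransmission over a lossy "link".
 * The link is simulated with rand(); real code would sit on UDP, verbs, etc. */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define LOSS_PCT  30           /* simulated loss rate, percent (made up) */
#define MAX_TRIES 10           /* made-up retry budget */

static long   receiver_last_seq = -1;   /* receiver-side duplicate filter */
static double receiver_sum      = 0.0;

/* one "transmission": returns true only if an ack made it back */
static bool lossy_deliver(long seq, double payload)
{
    if (rand() % 100 < LOSS_PCT) return false;   /* message dropped */
    if (seq != receiver_last_seq) {              /* apply once, ignore duplicates */
        receiver_sum += payload;
        receiver_last_seq = seq;
    }
    if (rand() % 100 < LOSS_PCT) return false;   /* ack dropped */
    return true;
}

/* resend the same sequence number until acked or the budget runs out */
static bool send_reliably(long seq, double payload)
{
    for (int attempt = 0; attempt < MAX_TRIES; attempt++)
        if (lossy_deliver(seq, payload))
            return true;
    return false;              /* caller's algorithm must tolerate the loss */
}

int main(void)
{
    srand(42);
    long gave_up = 0;
    for (long seq = 0; seq < 1000; seq++)
        if (!send_reliably(seq, 1.0))
            gave_up++;
    printf("receiver sum = %.0f, sender gave up on %ld of 1000 messages\n",
           receiver_sum, gave_up);
    return 0;
}
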
On Fri, Sep 21, 2012 at 02:49:32PM +0000, Hearns, John wrote:
> http://www.theregister.co.uk/2012/09/21/emc_abba/
>
> Frequent checkpointing will of course be vital for exascale, given the MTBF
> of individual nodes.
Individual nodes have very good MTBF. It's /system/ scale that causes
problems
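The arithmetic is brutal even with good parts. Assuming independent failures
and an exponential model (illustrative numbers, not vendor data): a node MTBF
of 5 years is about 43,800 hours, and MTBF_system ~= MTBF_node / N. With
N = 100,000 nodes that is 43,800 / 100,000 ~= 0.44 hours, i.e. a system-level
interrupt roughly every 26 minutes, even though any individual node is
expected to run for years.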
> I would suggest that some scheme of redundant computation might be more
> effective. Trying to store a single node's state on the node and then, if
> any node hiccups, restoring the state (perhaps to a spare) and restarting
> means stopping the entire cluster while you recover.
>
> Or, i
On Fri, Sep 21, 2012 at 01:09:41PM -0400, Ellis H. Wilson III wrote:
> On 09/21/12 12:58, Lux, Jim (337C) wrote:
> > Yes.. If that's the frequency of checkpoints. I was thinking more like 1
> > checkpoint per second or 10 seconds.
>
> While I suppose they might somehow exist that frequently in the
On Fri, 21 Sep 2012, Lux, Jim (337C) wrote:
On 9/21/12 9:21 AM, "Hearns, John" wrote:
Or, if you can factor your computation to make use of extra processing
nodes, you can just keep on moving. Think of this as a higher level
scheme than, say, Hamming codes for memory protection: use 11 b
On 09/21/12 12:29, Lux, Jim (337C) wrote:
> Flash is slow, though... SLC NAND flash (pretty fast, 8 Gbit part) is 250
> microseconds to write a 4kbyte (approx) page. Erasing is about 700
> microseconds (reading is 25 microseconds)
>
> MLC flash (say 512Gbit parts with 8 kbyte pages) takes 1.3mi
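A quick back-of-envelope from the numbers quoted above (the node memory size
and the degree of parallelism are my own illustrative assumptions): 4 kbytes
per 250 microseconds is roughly 16 Mbyte/s per plane. Dumping a 64 Gbyte node
image at that rate takes about 4,000 seconds; even with 64 planes or channels
writing in parallel (~1 Gbyte/s) it is still about a minute per checkpoint,
ignoring erase time entirely. Checkpointing every second or every ten seconds
to flash only works if you dump a small fraction of memory or spread the write
across an enormous amount of flash.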
On 09/21/12 12:58, Lux, Jim (337C) wrote:
> Yes.. If that's the frequency of checkpoints. I was thinking more like 1
> checkpoint per second or 10 seconds.
While I suppose they might somehow exist that frequently in the wild, I've
never heard of checkpoints at that short an interval. These hug
On 9/21/12 9:44 AM, "Ellis H. Wilson III" wrote:
>On 09/21/12 12:29, Lux, Jim (337C) wrote:
>> Flash is slow, though... SLC NAND flash (pretty fast, 8 Gbit part) is
>>250
>> microseconds to write a 4kbyte (approx) page. Erasing is about 700
>> microseconds (reading is 25 microseconds)
>>
>>
On 9/21/12 9:21 AM, "Hearns, John" wrote:
>
>Or, if you can factor your computation to make use of extra processing
>nodes, you can just keep on moving. Think of this as a higher level
>scheme than, say, Hamming codes for memory protection: use 11 bits to
>store 8, and you're still synchronou
On 9/21/12 8:41 AM, "Hearns, John" wrote:
>
>
>Are your concerns about the accuracy of this statement related to the
>fact that elReg is claiming that they must dump "the entire memory" or
>some concern about flash being used as a temporary checkpointing medium?
>
>The "entire memory" statement
On 09/21/12 12:13, Lux, Jim (337C) wrote:
> I would suggest that some scheme of redundant computation might be more
> effective. Trying to store a single node's state on the node and then, if
> any node hiccups, restoring the state (perhaps to a spare) and restarting
> means stopping the en
Or, if you can factor your computation to make use of extra processing
nodes, you can just keep on moving. Think of this as a higher level
scheme than, say, Hamming codes for memory protection: use 11 bits to
store 8, and you're still synchronous.
Jim, you are smarter than me!
I was going to ai
I would suggest that some scheme of redundant computation might be more
effective. Trying to store a single node's state on the node and then, if
any node hiccups, restoring the state (perhaps to a spare) and restarting
means stopping the entire cluster while you recover.
Or, if you can fa
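One concrete version of that "Hamming codes one level up" idea is
parity-based (diskless) checkpointing: each node's checkpoint is XORed into a
parity block held elsewhere, so any single lost node's state can be rebuilt
from the survivors without stopping and restarting everyone. A single-process
toy showing just the XOR arithmetic (node count and block size are made up; in
a real cluster the XOR runs across nodes, e.g. as a reduction):

/* parity_ckpt.c -- toy of XOR-parity checkpoint protection: N data blocks
 * plus one parity block survive the loss of any ONE block. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NODES 8                /* made-up node count */
#define BLOCK 4096             /* made-up bytes of checkpoint per node */

int main(void)
{
    unsigned char data[NODES][BLOCK], parity[BLOCK], rebuilt[BLOCK];

    /* each "node" produces some checkpoint bytes */
    for (int n = 0; n < NODES; n++)
        for (int i = 0; i < BLOCK; i++)
            data[n][i] = (unsigned char)(rand() & 0xff);

    /* parity = XOR of every node's block (recomputed at each checkpoint) */
    memset(parity, 0, BLOCK);
    for (int n = 0; n < NODES; n++)
        for (int i = 0; i < BLOCK; i++)
            parity[i] ^= data[n][i];

    /* pretend node 3 died: rebuild its block from parity plus survivors */
    int dead = 3;
    memcpy(rebuilt, parity, BLOCK);
    for (int n = 0; n < NODES; n++)
        if (n != dead)
            for (int i = 0; i < BLOCK; i++)
                rebuilt[i] ^= data[n][i];

    printf("recovered node %d: %s\n", dead,
           memcmp(rebuilt, data[dead], BLOCK) == 0 ? "OK" : "FAILED");
    return 0;
}
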
Are your concerns about the accuracy of this statement related to the
fact that elReg is claiming that they must dump "the entire memory" or
some concern about flash being used as a temporary checkpointing medium?
The "entire memory" statement puzzled me.
But using flash in this fashion does
On 09/21/12 10:49, Hearns, John wrote:
> http://www.theregister.co.uk/2012/09/21/emc_abba/
>
> Frequent checkpointing will of course be vital for exascale, given the
> MTBF of individual nodes.
>
> However, how accurate is this statement:
>
> HPC jobs involving half a million compute cores ... have
It looks fairly accurate.
This is because reconciling distributed checkpoints is theoretically
difficult. Therefore, frequent checkpointing is cost-prohibitive for
exascale apps.
Justin
On Fri, Sep 21, 2012 at 10:49 AM, Hearns, John wrote:
> http://www.theregister.co.uk/2012/09/21/emc_abba/
>
>
http://www.theregister.co.uk/2012/09/21/emc_abba/
Frequent checkpointing will of course be vital for exascale, given the MTBF of
individual nodes.
However, how accurate is this statement:
HPC jobs involving half a million compute cores ... have a series of
checkpoints set up in their code with