On 07/07/2011 12:31 PM, Lux, Jim (337C) wrote:
>> On 07/07/2011 10:13 AM, Eugen Leitl wrote:
>>>
>>> http://www.techeye.net/chips/one-million-arm-chips-challenge-intel-bumblebee
>>>
>>> One million ARM chips challenge Intel bumblebee
>>>
>>
>> Now say it like Dr. Evil: one MILLION processors.
>>
>>
>> How long is it going to take to wire them all up? And how fast are they
>> going to fail? If there's an MTBF of one million hours, that's still one
>> failure per hour.
> 
> 
> But this presents a very interesting design challenge. When you get to this 
> sort of scale, you have to assume that at any time, some of them are going to 
> be dead or dying, just like Google's massively parallel database engines.
> 
> It's all about ultimate scalability.  Anybody with moderate competence 
> (certainly anyone on this list) could devise a scheme to use 1000 perfect 
> processors that never fail to do 1000 quanta of work in unit time.  It's 
> substantially more challenging to devise a scheme to do 1000 quanta of work 
> in unit time on, say, 1500 processors with a 20% failure rate.  Or even in 
> 1.2*unit time.
> 
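To put a rough number on the 1500-processor example quoted above, here is a
minimal sketch. It assumes a deliberately simple static model that is not in
the original post: each processor fails independently with probability p_fail
before doing any work, and each survivor does exactly one quantum. The
function names are hypothetical, and failures in mid-run are ignored.

```python
import math

def prob_enough_survivors(n_procs, needed, p_fail):
    """P(at least `needed` of `n_procs` processors are alive), assuming
    each one fails independently with probability `p_fail`."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if needed > n_procs:
        return 0.0
    log_alive, log_dead = math.log(1.0 - p_fail), math.log(p_fail)
    total = 0.0
    for alive in range(max(needed, 0), n_procs + 1):
        # Binomial pmf computed in log space to avoid float underflow.
        log_pmf = (math.lgamma(n_procs + 1) - math.lgamma(alive + 1)
                   - math.lgamma(n_procs - alive + 1)
                   + alive * log_alive + (n_procs - alive) * log_dead)
        total += math.exp(log_pmf)
    return min(1.0, total)

def min_processors(needed, p_fail, confidence):
    """Smallest processor count that leaves `needed` survivors with
    probability >= `confidence` (hypothetical helper, linear search)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    n = needed
    while prob_enough_survivors(n, needed, p_fail) < confidence:
        n += 1
    return n

if __name__ == "__main__":
    # 1500 processors, 20% failure rate, 1000 quanta of work needed.
    print(prob_enough_survivors(1500, 1000, 0.2))
    # How few processors would still give 99.9% confidence?
    print(min_processors(1000, 0.2, 0.999))
```

Under these assumptions the expected number of survivors out of 1500 is
1200, so the head count is the easy part. The scheduling that moves work off
processors that die partway through is the hard part, and this sketch does
not model it.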

Just to be clear - I wasn't saying this was a bad idea. Scaling up to
this size seems inevitable. I was just imagining the team of admins who
would have to be working non-stop to replace dead processors!
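
For anyone who wants to redo the failure arithmetic with other numbers, here
is a minimal sketch. It assumes independent processors with a constant
failure rate (exponentially distributed lifetimes), which is a simplification
on my part. The function names are made up for illustration.

```python
import math

HOURS_PER_DAY = 24

def failures_per_hour(n_procs, mtbf_hours):
    """Expected failures per hour across the whole machine."""
    return n_procs / mtbf_hours

def prob_no_failure(n_procs, mtbf_hours, window_hours):
    """Probability that no processor fails during the window, assuming
    failures arrive as a Poisson process at the aggregate rate."""
    return math.exp(-failures_per_hour(n_procs, mtbf_hours) * window_hours)

if __name__ == "__main__":
    n_procs, mtbf_hours = 1_000_000, 1_000_000
    rate = failures_per_hour(n_procs, mtbf_hours)
    print("expected failures per hour:", rate)
    print("expected failures per day:", rate * HOURS_PER_DAY)
    print("P(no failure in a given hour):",
          prob_no_failure(n_procs, mtbf_hours, 1))
```

With one million processors at a one-million-hour MTBF, the aggregate rate
is one failure per hour, or about 24 replacements a day under this model.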

I wonder what the architecture for this system will be like. I imagine
it will be built around small multi-socket blades that are hot-swappable
to handle this.

Prentice
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
