On Nov 23, 2013, at 1:40PM, Joe Landman <land...@scalableinformatics.com> wrote:

> That is, we as a community have much to offer the growing big data 
> community.  

I think this is completely true, and somewhat urgent.  The two communities have 
a lot to teach each other.

The big data community remains incredibly naive about a lot of performance and 
scalability issues; of course they are, since they've only been at this for a 
few years.  Traditional HPC has a *lot* of hard-won knowledge and experience 
to offer.

But conversely, where we've been naive is in underestimating the importance of 
easily deployable, scalable, easy-to-develop-for software frameworks, even when 
they initially come at substantial cost in single-processor performance.  If we 
choose not to learn the lessons of the rapid growth of tools like Hadoop, we 
are in trouble as a community.

We've talked for years about how hardware is advancing more rapidly than 
software, but we haven't done much about it; now someone has, and it's not us.  
As a result, people are already trying to fit very HPCy sorts of problems into 
Hadoopy sorts of frameworks (cf. all the BSP stuff in Pregel or Hama) because 
it's so much easier to get things working, and so much easier to find 
developers to maintain the result.  When it comes to choosing a direction for 
a new project, a 100x larger pool of developers will always win over 
single-processor performance, or even scaling, because you can then direct 
enormous amounts of resources to fixing performance issues in the underlying 
frameworks.
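To make the "BSP stuff" concrete: below is a minimal, self-contained sketch of 
the vertex-centric, superstep-synchronized model that Pregel and Hama 
popularized, computing single-source shortest paths on a toy graph.  The 
Vertex class, the graph, and the superstep loop are my own illustration, not 
either system's actual API.

// Sketch of the Pregel/Hama-style BSP model: vertices exchange messages
// in synchronized supersteps.  Hypothetical illustration only, not the
// real Pregel or Hama API.  Computes single-source shortest paths.
import java.util.*;

public class BspSssp {
    static class Vertex {
        final int id;
        int dist = Integer.MAX_VALUE;       // current best distance
        final Map<Integer, Integer> edges;  // neighbor id -> edge weight
        Vertex(int id, Map<Integer, Integer> edges) {
            this.id = id; this.edges = edges;
        }
    }

    public static void main(String[] args) {
        // Toy graph: 0->1 (w=4), 0->2 (w=1), 2->1 (w=2), 1->3 (w=5)
        Map<Integer, Vertex> graph = new HashMap<>();
        graph.put(0, new Vertex(0, Map.of(1, 4, 2, 1)));
        graph.put(1, new Vertex(1, Map.of(3, 5)));
        graph.put(2, new Vertex(2, Map.of(1, 2)));
        graph.put(3, new Vertex(3, Map.of()));

        // Superstep 0: seed the source vertex with distance 0.
        Map<Integer, List<Integer>> inbox = new HashMap<>();
        inbox.put(0, List.of(0));

        // BSP loop: compute, exchange messages, barrier; halt when no
        // messages are in flight (every vertex has voted to halt).
        while (!inbox.isEmpty()) {
            Map<Integer, List<Integer>> outbox = new HashMap<>();
            for (Map.Entry<Integer, List<Integer>> e : inbox.entrySet()) {
                Vertex v = graph.get(e.getKey());
                int best = Collections.min(e.getValue());
                if (best < v.dist) {
                    v.dist = best;
                    // Relax: tell each neighbor about the shorter path.
                    v.edges.forEach((nbr, w) ->
                        outbox.computeIfAbsent(nbr, k -> new ArrayList<>())
                              .add(best + w));
                }
            }
            inbox = outbox;  // the barrier: next superstep sees these messages
        }
        graph.values().forEach(v ->
            System.out.println("vertex " + v.id + " dist " + v.dist));
    }
}

And the appeal is exactly the point above: the framework owns the barrier and 
the message routing, so an application developer writes only the per-vertex 
relaxation logic.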

    Jonathan

-- 
Jonathan Dursi, <ljdu...@scinet.utoronto.ca>
SciNet HPC Consortium, Compute Canada
http://www.SciNetHPC.ca
http://www.ComputeCanada.ca