Hi, I have a basic question. Suppose we're going to have a very large data set, large enough that we will certainly need many servers (tens or hundreds), and we will also need failover. The question is: should we use Hadoop, or would Solr Distributed Search with shards be enough?
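Just to be concrete, by "Solr Distributed Search with shards" I mean the shards request parameter, roughly like this (host names here are only placeholders):

    http://solr1:8983/solr/select?q=foo&shards=solr1:8983/solr,solr2:8983/solr

As I understand it, the node receiving the request fans the query out to each shard and merges the sorted results itself.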
I've read lots of articles, such as:

http://www.lucidimagination.com/content/scaling-lucene-and-solr
http://wiki.apache.org/solr/DistributedSearch

but I'm still confused. Solr's distributed search seems able to handle splitting a query across shards and merging the results, so what's the point of using Hadoop? I'm pretty sure I'm missing something here. Can anyone suggest some links on this topic?

Regards,

--
Alireza Salimi
Java EE Developer