Well, actually we haven't started the project yet. But it will probably have to handle data for millions of users, and a rough estimate would be around 5 MB per user — so the total is in the terabyte range (e.g., 2 million users x 5 MB is about 10 TB).
The other problem is that the data will change very often.

I hope I answered your question.

Thanks

On Tue, Dec 20, 2011 at 4:00 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> You didn't mention how big your data is or how you create it.
>
> Hadoop would mostly be used in the preparation of the data or the off-line
> creation of indexes.
>
> On Tue, Dec 20, 2011 at 12:28 PM, Alireza Salimi
> <alireza.sal...@gmail.com> wrote:
>
> > Hi,
> >
> > I have a basic question. Let's say we're going to have a very, very huge
> > set of data — so big that we will surely need many servers (tens or
> > hundreds of them). We will also need failover.
> > Now the question is: should we use Hadoop, or would Solr Distributed
> > Search with shards be enough?
> >
> > I've read lots of articles like:
> > http://www.lucidimagination.com/content/scaling-lucene-and-solr
> > http://wiki.apache.org/solr/DistributedSearch
> >
> > But I'm still confused. Solr's distributed search seems to be able to
> > handle splitting the queries and merging the results, so what's the
> > point of using Hadoop?
> >
> > I'm pretty sure I'm missing something here. Can anyone suggest
> > some links regarding this issue?
> >
> > Regards
> >
> > --
> > Alireza Salimi
> > Java EE Developer

--
Alireza Salimi
Java EE Developer
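
[Editor's note: for readers following the thread, below is a minimal sketch of the sharded query pattern being discussed, using the SolrJ client of that era. The host names, port, and field name are hypothetical placeholders; any node can act as the aggregator, fanning the query out to every shard listed in the "shards" parameter and merging the results.]

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class ShardedQueryExample {
        public static void main(String[] args) throws Exception {
            // The node we send the query to acts as the aggregator.
            // (host1/host2 are hypothetical shard hosts.)
            CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://host1:8983/solr");

            // "user_name" is a hypothetical field in the schema.
            SolrQuery query = new SolrQuery("user_name:alireza");

            // List every shard to search; Solr splits the query across
            // them and merges the partial results into one response.
            query.set("shards", "host1:8983/solr,host2:8983/solr");

            QueryResponse response = server.query(query);
            System.out.println("Total hits: "
                + response.getResults().getNumFound());
        }
    }

This only covers query-time distribution — it is the piece Solr handles on its own, which is why Hadoop's role in such a setup is usually limited to preparing the data or building the indexes off-line, as Ted describes above.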