Well, we haven't actually started the project yet.
But it will probably have to handle data for millions of users,
and a rough estimate for each user's data would be around
5 MB.
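
As a back-of-the-envelope check (the exact user count is an assumption here, since only "millions" was stated), that sizing already puts the data in the terabyte range:

```python
# Rough capacity estimate for the figures above.
# The user count is a placeholder assumption; only "millions" was stated.
users = 2_000_000            # hypothetical: "millions of users"
mb_per_user = 5              # ~5 MB of data per user, per the estimate above

total_mb = users * mb_per_user
total_tb = total_mb / 1_000_000  # 1 TB = 1,000,000 MB (decimal units)

print(f"{total_tb:.0f} TB before replication")  # prints "10 TB before replication"
```

Any replication for failover multiplies that figure again.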

The other problem is that this data will change very often.

I hope I answered your question.

Thanks

On Tue, Dec 20, 2011 at 4:00 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> You didn't mention how big your data is or how you create it.
>
> Hadoop would mostly be used in the preparation of the data or the off-line
> creation of indexes.
>
> On Tue, Dec 20, 2011 at 12:28 PM, Alireza Salimi
> <alireza.sal...@gmail.com> wrote:
>
> > Hi,
> >
> > I have a basic question: let's say we're going to have a very, very large
> > set
> > of data,
> > such that we will certainly need many servers (tens or hundreds of
> > servers).
> > We will also need failover.
> > Now the question is: should we use Hadoop, or would Solr Distributed
> > Search
> > with shards be enough?
> >
> > I've read lots of articles like:
> > http://www.lucidimagination.com/content/scaling-lucene-and-solr
> > http://wiki.apache.org/solr/DistributedSearch
> >
> > But I'm still confused. Solr's distributed search seems to be able to
> > handle
> > splitting the queries and merging the results, so what's the point of
> > using
> > Hadoop?
> >
> > I'm pretty sure I'm missing something here. Can anyone suggest
> > some links regarding this issue?
> >
> > Regards
> >
> > --
> > Alireza Salimi
> > Java EE Developer
> >
>
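
For what it's worth, the "splitting the queries and merging the results" behavior discussed above is driven by Solr's `shards` request parameter (described on the DistributedSearch wiki page linked in the quoted message). A minimal sketch of how such a request URL is assembled — the host names and field name are hypothetical, and no request is actually sent:

```python
from urllib.parse import urlencode

# Hypothetical shard hosts; in a real deployment each one is a Solr core
# holding a slice of the index.
shards = [
    "solr1.example.com:8983/solr",
    "solr2.example.com:8983/solr",
]

params = {
    "q": "user_id:12345",        # example query; the field name is made up
    "shards": ",".join(shards),  # Solr fans the query out to each shard
                                 # and merges the partial results itself
}

url = "http://solr1.example.com:8983/solr/select?" + urlencode(params)
print(url)
```

The querying application only talks to one node; the shard fan-out and result merging happen inside Solr.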



-- 
Alireza Salimi
Java EE Developer
