As Mike suggested, we use Hadoop to organize our data en route to Solr. Hadoop 
allows us to load-balance the indexing stage, and then we use the raw Lucene 
IndexWriter.addIndexes method to merge the resulting indexes, which are then 
hosted on Solr instances.
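
To illustrate the load-balancing idea, here is a minimal sketch of how
documents might be spread across indexing tasks. It assumes documents are
routed by hashing a document id, using the same arithmetic as Hadoop's default
HashPartitioner; the class and method names (ShardPartitioner, shardFor,
partition) are hypothetical and not taken from our actual job.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assigns document ids to index shards by hash.
class ShardPartitioner {

    // Mask off the sign bit so the result is non-negative, then take the
    // remainder, as Hadoop's default HashPartitioner does for reducer keys.
    static int shardFor(String docId, int numShards) {
        return (docId.hashCode() & Integer.MAX_VALUE) % numShards;
    }

    // Groups document ids into numShards buckets; each bucket would be
    // indexed by its own task before the per-shard indexes are merged.
    static List<List<String>> partition(List<String> docIds, int numShards) {
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            shards.add(new ArrayList<>());
        }
        for (String id : docIds) {
            shards.get(shardFor(id, numShards)).add(id);
        }
        return shards;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            ids.add("doc-" + i);
        }
        List<List<String>> shards = partition(ids, 4);
        for (int i = 0; i < shards.size(); i++) {
            System.out.println("shard " + i + ": " + shards.get(i).size() + " docs");
        }
    }
}
```

Each bucket stands in for the work one indexing task would do; how evenly the
load spreads depends on how well the ids hash.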

Thanks,
Stu



-----Original Message-----
From: Mike Klaas <[EMAIL PROTECTED]>
Sent: Friday, January 4, 2008 3:04pm
To: solr-user@lucene.apache.org
Subject: Re: solr with hadoop

On 4-Jan-08, at 11:37 AM, Evgeniy Strokin wrote:

> I have a huge index (about 110 million documents, 100 fields
> each), but its size is reasonable, about 70 GB. All I need is
> better performance, since some queries that match a large number
> of documents run slowly.
> So I was wondering whether there is any benefit to using Hadoop for
> this. If so, what direction should I take? Has anybody integrated
> Solr with Hadoop? Does it give any performance boost?
>
Hadoop might be useful for organizing your data en route to Solr, but  
I don't see how it could be used to boost performance over a huge  
Solr index.  To accomplish that, you need to split it up over two  
machines (for which you might find hadoop useful).
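
As a rough sketch of what querying a split index involves: each machine
returns its own top hits, and something has to merge them into one ranked
list. The names below (ShardResultMerger, Hit, mergeTopK) are hypothetical,
and the sketch assumes scores from different shards are directly comparable,
which is not always true in practice.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: merges per-shard hit lists into one top-k list.
class ShardResultMerger {

    // A single search hit: a document id and its relevance score.
    static final class Hit {
        final String docId;
        final float score;

        Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // Concatenates the hits from every shard, sorts by descending score,
    // and keeps at most k of them.
    static List<Hit> mergeTopK(List<List<Hit>> perShardHits, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perShardHits) {
            all.addAll(hits);
        }
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }
}
```

Each shard only searches its own part of the index, so the shards can work in
parallel and the merge step handles just a few hits per shard.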

-Mike

