Re: Solr & Nutch

Jorge Luis Betancourt Gonzalez Tue, 28 Jan 2014 09:25:55 -0800

Q1: Nutch does more than parse HTML files. It uses Hadoop to achieve 
large-scale crawling across multiple nodes: it fetches the content of each 
HTML page and, yes, it also parses that content.

Q2: In our case we crawl some websites and store the content in one “main” 
Solr core. We also have a web app with the typical “search box”, and we use a 
separate core to store the queries made by our users.
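
As a rough illustration of the two-core setup described above, here is a minimal Python sketch that builds a search request against one core and a logging request against another. The base URL, the core name "queries", and the field names "id" and "query_text" are assumptions made for the example, not our actual configuration; only the core name "main" comes from the description above. The sketch builds the requests without sending them.

```python
import json
import uuid
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL of a local Solr instance (hypothetical).
SOLR_BASE = "http://localhost:8983/solr"


def select_url(core, query, rows=10):
    """Build the URL of a search request against the given core."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{SOLR_BASE}/{core}/select?{params}"


def log_query_request(core, user_query):
    """Build (but do not send) a POST that stores a user query in a core.

    The fields "id" and "query_text" are hypothetical; they must match
    whatever schema the logging core really uses.
    """
    doc = {"id": str(uuid.uuid4()), "query_text": user_query}
    body = json.dumps([doc]).encode("utf-8")
    return Request(
        f"{SOLR_BASE}/{core}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Search the crawled content in one core, log the query in the other.
search = select_url("main", "nutch crawl")
log_req = log_query_request("queries", "nutch crawl")
```

Passing log_req to urllib.request.urlopen would actually send the document. Keeping the two kinds of data in separate cores means each one can have its own schema and be reindexed or cleared independently.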

Q3: I’m not currently using SolrCloud, so I’m going to leave this one to a 
more experienced fellow.

On Jan 28, 2014, at 11:36 AM, rashmi maheshwari <maheshwari.ras...@gmail.com> 
wrote:

> Hi,
> 
> Question 1: When Solr can parse HTML and documents like DOC, Excel, PDF,
> etc., why do we need Nutch to parse HTML files? What is different?
> 
> Question 2: When do we use multiple cores in Solr? Is there any practical
> business case where we need multiple cores?
> 
> Question 3: When do we go for cloud? What does implementing SolrCloud
> mean?
> 
> 
> -- 
> Rashmi
> Be the change that you want to see in this world!
> www.minnal.zor.org
> disha.resolve.at
> www.artofliving.org

