Q1: Nutch doesn’t only parse HTML files. It also uses Hadoop to achieve large-scale crawling across multiple nodes: it fetches the content of each HTML page and, yes, it also parses that content.
Q2: In our case we crawl some websites and store the content in one “main” Solr core. We also have a web app with the typical “search box”, and we use a separate core to store the queries made by our users.
Q3: I’m not currently using SolrCloud, so I’m going to leave this one to a more experienced fellow.

On Jan 28, 2014, at 11:36 AM, rashmi maheshwari <maheshwari.ras...@gmail.com> wrote:

> Hi,
>
> Question1 --> When Solr could parse html, documents like doc, excel pdf
> etc, why do we need nutch to parse html files? what is different?
>
> Questions 2: When do we use multiple core in solar? any practical business
> case when we need multiple cores?
>
> Question 3: When do we go for cloud? What is meaning of implementing solr
> cloud?
>
>
> --
> Rashmi
> Be the change that you want to see in this world!
> www.minnal.zor.org
> disha.resolve.at
> www.artofliving.org
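
A minimal sketch of the two-core arrangement described under Q2, in case it helps to see it concretely. Nothing here comes from the actual setup: the base URL, the core names `main` and `queries`, the `query_s` field, and the helper functions are all hypothetical. The code only builds the HTTP requests (Solr's per-core `select` and `update` handlers) and does not send them.

```python
import json
import uuid
from urllib.parse import urlencode

# All of these values are assumptions made for illustration only.
SOLR_BASE = "http://localhost:8983/solr"  # assumed local Solr instance
CONTENT_CORE = "main"                     # hypothetical core with crawled pages
QUERY_LOG_CORE = "queries"                # hypothetical core with user queries


def select_url(core, query, rows=10):
    """Build the URL for a search against one core's select handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{SOLR_BASE}/{core}/select?{params}"


def update_request(core, docs):
    """Build (url, body, headers) for posting JSON documents to one core."""
    url = f"{SOLR_BASE}/{core}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return url, body, headers


def plan_search(user_query):
    """Return the two requests one search-box submission would trigger:
    a search on the content core and a log write on the query core."""
    search = select_url(CONTENT_CORE, user_query)
    # "query_s" is an assumed string field; the real schema may differ.
    log_doc = {"id": str(uuid.uuid4()), "query_s": user_query}
    log = update_request(QUERY_LOG_CORE, [log_doc])
    return search, log


if __name__ == "__main__":
    search_url, (log_url, log_body, _headers) = plan_search("solr cloud")
    print(search_url)
    print(log_url)
    print(log_body.decode("utf-8"))
```

The point of keeping the two cores apart is that each one has its own schema and index, so the query log can be analyzed or cleared without touching the crawled content.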