RE: Pros and Cons of Using Deduplication of Solr at Huge Data Indexing

2013-05-02 Thread Markus Jelsma
ation. If SOLR-3473 is fixed you can get very decent deduplication.

-----Original message-----
> From: Furkan KAMACI
> Sent: Thu 02-May-2013 22:30
> To: solr-user@lucene.apache.org
> Subject: Pros and Cons of Using Deduplication of Solr at Huge Data Indexing
>
> I use Solr 4.2.

Pros and Cons of Using Deduplication of Solr at Huge Data Indexing

2013-05-02 Thread Furkan KAMACI
I run Solr 4.2.1 in SolrCloud mode. I crawl a large amount of data with Nutch and index it with SolrCloud. I am wondering about Solr's deduplication mechanism: what exactly does it do, and does it slow down indexing, or is it beneficial in my situation?
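
For context on what the mechanism does: Solr's deduplication (configured through SignatureUpdateProcessorFactory) computes a hash over a chosen set of fields at index time, stores it in a signature field, and, when overwriteDupes is enabled, replaces any earlier document that carries the same signature. The following is only a rough in-memory sketch of that idea in Python, not Solr code. The names `signature` and `DedupIndex`, the choice of MD5, and the `title`/`content` fields are all assumptions made for illustration.

```python
import hashlib


def signature(doc, fields):
    """Hash the selected fields of a document into a hex digest.

    MD5 is used here purely as an example of an exact-match signature.
    """
    h = hashlib.md5()
    for name in fields:
        h.update(str(doc.get(name, "")).encode("utf-8"))
    return h.hexdigest()


class DedupIndex:
    """Hypothetical toy index that overwrites documents with equal signatures."""

    def __init__(self, fields):
        self.fields = fields
        self.docs = {}    # document id -> stored document
        self.by_sig = {}  # signature -> id of the document holding it

    def add(self, doc):
        sig = signature(doc, self.fields)

        # Re-adding an existing id replaces it, so drop its old signature first.
        prev = self.docs.pop(doc["id"], None)
        if prev is not None:
            self.by_sig.pop(prev["signature"], None)

        # Any other document with the same signature is a duplicate: remove it.
        dup_id = self.by_sig.get(sig)
        if dup_id is not None:
            del self.docs[dup_id]

        self.docs[doc["id"]] = dict(doc, signature=sig)
        self.by_sig[sig] = doc["id"]
        return sig


idx = DedupIndex(["title", "content"])
idx.add({"id": "a", "title": "Hello", "content": "same body"})
idx.add({"id": "b", "title": "Hello", "content": "same body"})  # replaces "a"
idx.add({"id": "c", "title": "Other", "content": "same body"})  # kept
```

In this sketch each add costs one hash computation and one lookup by signature, which is roughly where any extra indexing overhead would come from. How that translates to a real SolrCloud cluster depends on the signature class chosen and on how the duplicates are distributed across shards.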