If you don't want "downtime", you could add a <field name="indextime" 
type="tdate" default="NOW" /> field to your schema, reload, do a full re-index 
on top of your existing index, and then delete all documents that were not 
updated, via a deleteByQuery, e.g.: indextime:[* TO NOW-1DAY]
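
As a rough illustration of that last step, here is a minimal Python sketch that builds the delete-by-query request for Solr's XML update handler. The host, port and collection name are taken from the query URL quoted below; the function name and the choice to commit in the same request are my own assumptions, and the cutoff of NOW-1DAY only makes sense if the full re-index finished within the last day. The sketch only builds the request; it does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request
from xml.sax.saxutils import escape

# Assumption: host, port and collection come from the query URL in this thread.
SOLR_UPDATE_URL = "http://localhost:5011/solr/mycollection/update"


def build_delete_request(field="indextime", cutoff="NOW-1DAY"):
    """Build (but do not send) a delete-by-query request for stale documents."""
    # Everything whose index timestamp is older than the cutoff was not
    # touched by the re-index and is therefore considered stale.
    query = f"{field}:[* TO {cutoff}]"
    body = f"<delete><query>{escape(query)}</query></delete>".encode("utf-8")
    # commit=true makes the deletion visible as soon as the request completes.
    url = SOLR_UPDATE_URL + "?" + urlencode({"commit": "true"})
    return Request(url, data=body,
                   headers={"Content-Type": "text/xml"}, method="POST")


if __name__ == "__main__":
    req = build_delete_request()
    print(req.full_url)
    print(req.data.decode("utf-8"))
    # To actually run the delete: urllib.request.urlopen(req)
```

Before sending it, it is worth running the same range as a normal select with rows=0 to see how many documents it would remove.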

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 12 Oct 2014, at 21:59, Shawn Heisey <apa...@elyograg.org> wrote:

> On 10/12/2014 12:26 PM, vidit.asthana wrote:
>> I have a strange problem where select q=*:* is returning a different number
>> of documents. Sometimes it returns numFound = 5866712 and sometimes it
>> returns numFound = 5852274.  *numFound is always one of these 2 values.*
>> 
>> Here is the query:
>> 
>> *http://localhost:5011/solr/mycollection/select?q=*:*&rows=0*
>> 
>> 
>> I am running Solr in cloud mode and this problem is occurring with both
>> solr-4.5.1 and solr-4.10.0. I have exactly the same data indexed in both
>> versions. 4.5.1 is running on an 8-node cluster (4x2 shards) and solr-4.10.0
>> is running on a 4-node (2x2 shards) cluster.
> 
> I really need to make a wiki page for this.  It would save so much
> typing!  I also need to boil it down to a small-scale real-world example
> and show how the numbers get calculated and what goes wrong, which means
> I need to have a complete understanding of the problem, and at this
> moment, I don't have that.
> 
> This is a problem that's unique to distributed indexes.  What causes it
> is having documents with the same value in the uniqueKey field indexed
> in more than one shard.
> 
> It is not a bug; it's a result of the way that results from multiple
> shards are combined into one result.  The only way to "fix" this problem
> would involve so much additional processing that it would make all
> queries extremely slow.
> 
> If you're using automatic document routing, then your routing algorithm
> may have changed at some point, and you didn't re-index.  If you're
> using manual document routing, then some documents were indexed on the
> wrong shard, and later indexed on another shard as well.
> 
> Preventing the problem is easy -- always index documents onto the
> correct shard.  Fixing the problem at this point might involve clearing
> your index and re-indexing from scratch, unless you can figure out which
> documents have been indexed on more than one shard and you can delete
> them from the incorrect shard(s).
> 
> Thanks,
> Shawn
> 
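
To make Shawn's last point concrete (figuring out which documents have been indexed on more than one shard), here is a small Python sketch. It assumes you have already collected the uniqueKey values held by each shard, for example by querying one core per shard with distrib=false; the function name and the sample data are purely illustrative and not taken from any real index.

```python
from collections import defaultdict


def ids_on_multiple_shards(ids_by_shard):
    """Return {doc_id: sorted shard names} for ids found on more than one shard."""
    shards_by_id = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            shards_by_id[doc_id].add(shard)
    # Keep only the ids that live on two or more shards.
    return {doc_id: sorted(shards)
            for doc_id, shards in shards_by_id.items()
            if len(shards) > 1}


if __name__ == "__main__":
    # Hypothetical sample: "c" was indexed on both shards.
    sample = {"shard1": ["a", "b", "c"], "shard2": ["c", "d"]}
    print(ids_on_multiple_shards(sample))
```

The ids it reports are the candidates for deletion from the incorrect shard(s); deciding which shard is the correct one still depends on your routing.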
