RE: Duplicates results when using a non optimized index

Tim Mahy Wed, 14 May 2008 01:04:57 -0700

Hi,

thanks for the answer,


- do duplicates go away after optimization is done?
--> no, even after the index is optimized we still get the duplicate 
results, including when we search on one of the slave servers, which have 
the same index through synchronization.
By the way, this is the first time we have noticed this. The only issue we 
had before was the known "too many open files" problem, which we fixed by 
raising the ulimit and restarting the Tomcat server.

- are the duplicate IDs that you are seeing IDs of previously deleted documents?
--> it is possible that these documents were uploaded earlier and have since 
been replaced.
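
The index statistics quoted further down (numDocs and maxDoc) give a rough 
idea of how many deleted or replaced documents are still physically present 
in the index. A small sketch of the arithmetic, assuming that maxDoc minus 
numDocs counts documents that are marked deleted but not yet merged away:

```python
# Values copied from the index statistics quoted in this thread.
num_docs = 9479963   # live (searchable) documents
max_doc = 12622942   # live documents plus deleted-but-not-merged ones

# Assumption: the difference is the number of deleted documents
# that an optimize would normally purge.
deleted_docs = max_doc - num_docs
print(deleted_docs)  # 3142979
```

So a large number of replaced documents were still in the index at the time 
of the original report, which is consistent with the duplicates being old 
versions of re-uploaded documents, although it does not prove it.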

- which Solr version are you using and can you try a recent nightly?
--> we use the 1.2 stable build
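
Until the cause is found, one stopgap for the original question (returning 
only unique documents) is to drop repeated ids on the client side. A minimal 
sketch, assuming the results have already been parsed into dictionaries with 
an `id` key; the function name `dedupe_by_id` is hypothetical and not part 
of Solr:

```python
def dedupe_by_id(docs):
    """Return docs with repeated ids removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for doc in docs:
        doc_id = doc["id"]
        if doc_id not in seen:
            seen.add(doc_id)
            unique.append(doc)
    return unique


# Ids taken from the result list in the original mail; the dict layout
# is an assumption about how the client parses the response.
docs = [
    {"id": "15257559"},
    {"id": "15257559"},
    {"id": "17177888"},
    {"id": "11825631"},
    {"id": "11825631"},
]
unique_docs = dedupe_by_id(docs)
print([d["id"] for d in unique_docs])
```

This only hides the symptom: hit counts and paging would still reflect the 
duplicates, so it is not a substitute for fixing the index.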

greetings,
Tim
________________________________________
From: Otis Gospodnetic [EMAIL PROTECTED]
Sent: Wednesday, 14 May 2008 6:11
To: solr-user@lucene.apache.org
Subject: Re: Duplicates results when using a non optimized index

Hm, not sure why that is happening, but here is some info regarding other stuff 
from your email

- there should be no duplicates even if you are searching an index that is 
being optimized
- why are you searching an index that is being optimized?  It's doable, but 
people typically perform index-modifying operations on a Solr master and 
read-only operations on Solr query slave(s)
- do duplicates go away after optimization is done?
- are the duplicate IDs that you are seeing IDs of previously deleted documents?
- which Solr version are you using and can you try a recent nightly?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Tim Mahy <[EMAIL PROTECTED]>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Tuesday, May 13, 2008 5:59:28 AM
> Subject: Duplicates results when using a non optimized index
>
> Hi all,
>
> is this expected behavior when having an index like this :
>
> numDocs : 9479963
> maxDoc : 12622942
> readerImpl : MultiReader
>
> which is in the process of being optimized, that a search of the index
> returns the following ids:
>
> 15257559
> 15257559
> 17177888
> 11825631
> 11825631
>
> The id field is declared like this :
>
>
> and is set as the unique key in the schema xml like this :
>   <uniqueKey>id</uniqueKey>
>
> so the question: is this expected behavior, and if so, is there a way to
> make Solr return only unique documents?
>
> greetings and thanx in advance,
> Tim
>
>
>
>
> Please see our disclaimer, http://www.infosupport.be/Pages/Disclaimer.aspx





