Re: Duplicates results when using a non optimized index

Otis Gospodnetic Wed, 14 May 2008 14:18:45 -0700

Tim,

Hm, not sure what caused this.  1.2 is now quite old (yes, I know it's the last 
stable release), so if I were you I would consider moving to 1.3-dev.  It 
sounds like the index is already "polluted" with duplicate documents, so you'll 
want to rebuild the index whether you decide to stay with 1.2 or move to 
1.3-dev.
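
Once the index is rebuilt, it may be worth checking a page of results for repeated IDs before trusting it again. Below is a minimal Python sketch of that check, not Solr code: the function names are made up for illustration, and fetching the IDs from Solr is left out. The second helper relies on maxDoc counting documents that are marked deleted but not yet merged away, so maxDoc minus numDocs gives the number of such leftover documents.

```python
from collections import Counter


def find_duplicate_ids(ids):
    """Return {id: count} for every id that occurs more than once (hypothetical helper)."""
    counts = Counter(ids)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


def pending_deletes(num_docs, max_doc):
    """Documents marked deleted but still held in the index (hypothetical helper)."""
    return max_doc - num_docs


# IDs and index statistics as reported in the original message
result_ids = ["15257559", "15257559", "17177888", "11825631", "11825631"]

print(find_duplicate_ids(result_ids))      # ids that came back more than once
print(pending_deletes(9479963, 12622942))  # deleted documents not yet merged out
```

With the numbers from the original message, roughly 3.1 million deleted documents were still in the index, which fits the theory that the duplicates are older versions of replaced documents.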



Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Tim Mahy <[EMAIL PROTECTED]>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Wednesday, May 14, 2008 3:59:23 AM
> Subject: RE: Duplicates results when using a non optimized index
> 
> Hi,
> 
> thanks for the answer,
> 
> - do duplicates go away after optimization is done?
> --> no, if we search the index even after it is optimized, we still get the 
> duplicate results, even if we search on one of the slave servers, which have 
> the same index through synchronization ...
> btw this is the first time we have noticed this; the only problem we have had 
> was the known "too many open files" issue, which we fixed by raising the 
> ulimit and rebooting the tomcat server ....
> 
> - are the duplicate IDs that you are seeing IDs of previously deleted documents?
> --> it is possible that these documents were uploaded earlier and have been 
> replaced...
> 
> - which Solr version are you using and can you try a recent nightly?
> --> we use the 1.2 stable build
> 
> greetings,
> Tim
> ________________________________________
> From: Otis Gospodnetic [EMAIL PROTECTED]
> Sent: Wednesday, 14 May 2008 6:11
> To: solr-user@lucene.apache.org
> Subject: Re: Duplicates results when using a non optimized index
> 
> Hm, not sure why that is happening, but here is some info regarding other 
> stuff from your email
> 
> - there should be no duplicates even if you are searching an index that is 
> being optimized
> - why are you searching an index that is being optimized?  It's doable, but 
> people typically perform index-modifying operations on a Solr master and 
> read-only operations on Solr query slave(s)
> - do duplicates go away after optimization is done?
> - are the duplicate IDs that you are seeing IDs of previously deleted documents?
> - which Solr version are you using and can you try a recent nightly?
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> ----- Original Message ----
> > From: Tim Mahy 
> > To: "solr-user@lucene.apache.org" 
> > Sent: Tuesday, May 13, 2008 5:59:28 AM
> > Subject: Duplicates results when using a non optimized index
> >
> > Hi all,
> >
> > is this expected behavior when having an index like this :
> >
> > numDocs : 9479963
> > maxDoc : 12622942
> > readerImpl : MultiReader
> >
> > which is in the process of optimizing that when we search through the index 
> > we get this :
> >
> > 15257559
> > 15257559
> > 17177888
> > 11825631
> > 11825631
> >
> > The id field is declared like this :
> >
> >
> > and is set as the unique identity like this in the schema xml :
> >   <uniqueKey>id</uniqueKey>
> >
> > so the question : is this expected behavior and if so is there a way to let 
> > Solr only return unique documents ?
> >
> > greetings and thanx in advance,
> > Tim
> >
> >
> >
> >
> > Please see our disclaimer, http://www.infosupport.be/Pages/Disclaimer.aspx
> 
