Pascal,

Look at the difference between numDocs and maxDoc (7725 vs. 28099 in your Luke output). That delta represents deleted docs. Maybe there is something deleting your docs after all!
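To make the arithmetic concrete, here is a minimal sketch in Python. It assumes the index section of the LukeRequestHandler XML response exposes `numDocs` and `maxDoc` as `<int name="...">` elements; that shape and the helper name `deleted_doc_count` are assumptions for illustration, and only the two numbers come from the output quoted in this thread.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the index section of a Luke XML response.
# The element and attribute names are an assumption; the two values
# are the ones reported in this thread.
SAMPLE_LUKE_XML = """
<response>
  <lst name="index">
    <int name="numDocs">7725</int>
    <int name="maxDoc">28099</int>
  </lst>
</response>
"""


def deleted_doc_count(luke_xml: str) -> int:
    """Return maxDoc - numDocs, i.e. docs marked deleted but not yet merged away."""
    root = ET.fromstring(luke_xml.strip())
    # Collect every <int name="...">value</int> into a dict keyed by name.
    values = {el.get("name"): int(el.text) for el in root.iter("int")}
    return values["maxDoc"] - values["numDocs"]


if __name__ == "__main__":
    print(deleted_doc_count(SAMPLE_LUKE_XML))  # prints 20374
```

A non-zero result only says that documents were marked deleted in the index; it does not say what deleted them.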
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/

----- Original Message ----
> From: Pascal Dimassimo <thesuper...@hotmail.com>
> To: solr-user@lucene.apache.org
> Sent: Fri, February 19, 2010 3:50:26 PM
> Subject: RE: Documents disappearing
>
>
> Using LukeRequestHandler, I see:
>
> <int name="numDocs">7725</int>
> <int name="maxDoc">28099</int>
> <int name="numTerms">758826</int>
> <long name="version">1266355690710</long>
> <bool name="optimized">false</bool>
> <bool name="current">true</bool>
> <bool name="hasDeletions">true</bool>
> <str name="directory">org.apache.lucene.store.NIOFSDirectory:org.apache.lucene.store.NIOFSDirectory@/opt/solr/myindex/data/index</str>
>
>
> I will copy the index to my local machine so I can open it with luke. Should
> I look for something specific?
>
> Thanks!
>
>
> ANKITBHATNAGAR wrote:
> >
> > Try inspecting your index with luke
> >
> >
> > Ankit
> >
> >
> > -----Original Message-----
> > From: Pascal Dimassimo [mailto:thesuper...@hotmail.com]
> > Sent: Friday, February 19, 2010 2:22 PM
> > To: solr-user@lucene.apache.org
> > Subject: Documents disappearing
> >
> >
> > Hi,
> >
> > I have encountered a situation that I can't explain. We are indexing
> > documents that are often duplicates, so we activated deduplication like
> > this:
> >
> > <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
> >   <bool name="enabled">true</bool>
> >   <bool name="overwriteDupes">true</bool>
> >   <str name="signatureField">signature</str>
> >   <str name="fields">title,text</str>
> >   <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
> > </processor>
> >
> > What I can't explain is that when I look at the document count in the
> > log, I see documents disappearing.
> >
> > 11:24:23 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=0 status=0 QTime=0
> > 14:04:24 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=4065 status=0 QTime=10
> > 14:17:07 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=6499 status=0 QTime=42
> > 14:25:42 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=7629 status=0 QTime=1
> > 14:47:12 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=10140 status=0 QTime=12
> > 15:17:22 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=10861 status=0 QTime=13
> > 15:47:31 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=9852 status=0 QTime=19
> > 16:17:42 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=8112 status=0 QTime=13
> > 16:38:17 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=10
> > 16:39:10 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=1
> > 16:47:40 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=46
> > 16:51:24 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=74
> > 17:02:13 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=102
> > 17:17:41 INFO - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=8
> >
> > 11:24 was the time at which Solr was started that day. Around 13:30, we
> > started indexing.
> >
> > At some point during indexing, I noticed that a batch of documents was
> > resent (i.e., documents with the same id field were sent again to the
> > index). According to the log, NO delete was sent to Solr.
> >
> > I understand that if I send duplicates (either documents with the same id
> > or with the same signature), the document count should stay the same. But
> > how can we explain that it is decreasing? What are the possible causes of
> > this behavior?
> >
> > Thanks!
> > --
> > View this message in context:
> > http://old.nabble.com/Documents-disappearing-tp27659047p27659047.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
> >
>
> --
> View this message in context:
> http://old.nabble.com/Documents-disappearing-tp27659047p27660077.html
> Sent from the Solr - User mailing list archive at Nabble.com.