Delays when deleting by query

2011-12-06 Thread Mike Gallan

Hello,
We're encountering delays of 10+ minutes when trying to delete from our Solr 
3.4 instance.  We have 335k documents indexed and interface using SolrJ.  Our 
schema basically consists of a parent object with multiple child objects.  
Every object is indexed as a separate document with the child documents 
referencing parents via a 'parentId' field.  When any part of a parent object 
is updated, solrServer.deleteByQuery() is called to delete the parent and all 
of its child documents, then solrServer.add() is called to reindex them.  We 
currently rely on autocommit, with maxDocs set to 100 and maxTime set to 30s.  
Deletes work fine on another Solr test instance with 22k documents.
Any thoughts?  Is this sort of delay common when deleting against this many 
documents?
Thanks,
Mike
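
For readers following along, here is a minimal sketch of the kind of query 
string that could be passed to solrServer.deleteByQuery() to match a parent 
and its children in one call.  It is not the poster's actual code: the 'id' 
field name, the class name, and both helper methods are assumptions made for 
illustration; only the 'parentId' field comes from the message above.  The 
hand-written escape() is a stand-in (SolrJ ships 
ClientUtils.escapeQueryChars() for the same purpose) so that the sketch runs 
without SolrJ on the classpath.

```java
public class DeleteQuerySketch {

    // Characters the Lucene query parser treats as syntax (illustrative list).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    // Backslash-escape query syntax characters and whitespace in a raw value.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Build a query matching the parent document (assumed unique key field
    // 'id') and every child that references it through 'parentId'.
    static String parentAndChildrenQuery(String parentId) {
        String escaped = escape(parentId);
        return "id:" + escaped + " OR parentId:" + escaped;
    }

    public static void main(String[] args) {
        // The resulting string is what would be handed to deleteByQuery().
        System.out.println(parentAndChildrenQuery("order-42"));
    }
}
```

Deleting parent and children with a single query keeps the delete to one 
request per updated parent, which matches the update pattern described above.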

  

RE: Delays when deleting by query

2011-12-07 Thread Mike Gallan

I ran some more tests.  I added an explicit commit after each deleteByQuery() 
call and removed the add/reindex step.  This hung up immediately and completed 
(or timed out?) after 20 minutes.  The hangs occur almost exactly 20 minutes 
apart.  Could this be a Tomcat issue?

I ran jconsole but didn't see any extraordinary memory or CPU usage.  The 
delays appear on the first delete attempt immediately after start up so I 
suspect it's not GC related.

I also tried adding documents without deleting.  This worked with no 
significant delays on the commit.  The delete/commit combo appears to be the 
source of the problem.

Any tips on how to debug this are appreciated!
Thanks,
Mike

  

RE: Delays when deleting by query

2011-12-08 Thread Mike Gallan

Thanks for the response, Erick.  I actually turned up logging yesterday and 
noticed that spellchecker builds were causing the delays.  Setting 
buildOnCommit to false solved the problem.  Our plan is to schedule a nightly 
timer task that sends a request with 'spellcheck.build=true' to trigger the 
build.

Mike
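
A minimal sketch of the nightly timer task described above, for anyone who 
lands on this thread with the same problem.  Only the 'spellcheck.build=true' 
parameter comes from the message; the base URL, the "/spell" handler path, the 
02:00 run time, and all class and method names are assumptions and will differ 
per installation.  It uses only the JDK (no SolrJ) and issues a plain HTTP GET.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class NightlySpellcheckBuild {

    // Compose the request URL. The handler path is an assumption: it must
    // point at a request handler that includes the spellcheck component.
    static String buildUrl(String solrBase, String handlerPath) {
        return solrBase + handlerPath
                + "?q=*:*&rows=0&spellcheck=true&spellcheck.build=true";
    }

    // Time from 'now' until the next occurrence of 'runAt' (today or tomorrow).
    static Duration delayUntil(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    // Send the build request and return the HTTP status code. No read timeout
    // is set because the build itself can run for a long time.
    static int triggerBuild(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(10_000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Run the build once a day at 'runAt'. A fixed 24-hour period is used, so
    // the wall-clock time shifts by an hour across daylight-saving changes.
    static ScheduledExecutorService schedule(String url, LocalTime runAt) {
        ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor();
        long initialMs = delayUntil(LocalDateTime.now(), runAt).toMillis();
        executor.scheduleAtFixedRate(() -> {
            try {
                System.out.println("spellcheck build: HTTP " + triggerBuild(url));
            } catch (IOException e) {
                System.err.println("spellcheck build failed: " + e);
            }
        }, initialMs, TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
        return executor;
    }

    public static void main(String[] args) {
        // Prints the plan only; call schedule(url, runAt) from application
        // startup to actually start the nightly task.
        String url = buildUrl("http://localhost:8983/solr", "/spell");
        System.out.println("would request: " + url);
        System.out.println("first run in:  "
                + delayUntil(LocalDateTime.now(), LocalTime.of(2, 0)));
    }
}
```

With buildOnCommit set to false, the spellcheck index only changes when this 
request runs, so suggestions can lag behind the main index by up to a day.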

> Date: Thu, 8 Dec 2011 08:25:52 -0500
> Subject: Re: Delays when deleting by query
> From: erickerick...@gmail.com
> To: solr-user@lucene.apache.org
> 
> Hmmm, this is unusual. Can we see the code you use to delete?
> And your solrconfig file? You're not doing something odd like
> optimizing on commit or anything, right?
> 
> You shouldn't have to commit after deletes. The fact that you're
> hanging is very odd (BTW, does "hanging" mean your system
> is locked up or just that you can't find your new documents?).
> 
> You could try using the default Jetty container just for yucks
> to see if Tomcat is somehow the culprit, although many people
> use Tomcat so it'd be something peculiar to your setup.
> 
> Best
> Erick
> 