If you are updating all the time, don't forceMerge at all, unless you want to put the overhead of big merges at a known time. Otherwise, leave it alone.
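If you do schedule it, a cron job sending a plain update request to the master is enough. A minimal sketch, assuming a Solr 3.x master at localhost:8983 with a single core (the host, port, and 3 a.m. schedule are placeholders, not anything from this thread):

    # crontab entry: run the big merge once a day at 3 a.m., off-peak
    # (host/port/schedule are assumptions; adjust for your setup)
    0 3 * * * curl -s 'http://localhost:8983/solr/update' -H 'Content-Type: text/xml' --data-binary '<optimize waitSearcher="false"/>'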
wunder

On Oct 12, 2012, at 3:56 PM, Erick Erickson wrote:

> Right. If I've multiplied right, you're essentially replacing your entire
> index every day given the rate you're adding documents.
>
> Have a look at MergePolicy; here are a couple of references:
> http://juanggrande.wordpress.com/2011/02/07/merge-policy-internals/
> https://lucene.apache.org/core/old_versioned_docs/versions/3_2_0/api/core/org/apache/lucene/index/MergePolicy.html
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
>
> But unless you're having problems with performance, I'd consider just
> optimizing once a day at off-peak hours.
>
> FWIW,
> Erick
>
> On Fri, Oct 12, 2012 at 5:35 PM, Petersen, Robert <rober...@buy.com> wrote:
>> Hi Erick,
>>
>> After reading the discussion you guys were having about renaming optimize
>> to forceMerge, I realized I was guilty of over-optimizing, just as you
>> were worried about! We have about 15 million docs indexed now, and we send
>> about 50-300 adds per second 24/7, most of them updates to existing
>> documents whose data has changed since the last time they were indexed
>> (which we keep track of in a DB table). There are some new documents and
>> some deletes in the mix as well.
>>
>> I understand now how the merge policy caps the number of segments; I used
>> to think they would grow unbounded and thus optimize was required. How
>> does the large number of updates of existing documents affect the need to
>> optimize? Each update causes a delete plus a re-add, so I suppose the
>> index size tends to grow with the deleted docs hanging around in the
>> background, as it were.
>>
>> So in our situation, what frequency of optimize would you recommend?
>> We're on 3.6.1, btw...
>>
>> Thanks,
>> Robi
>>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: Thursday, October 11, 2012 5:29 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: anyone have any clues about this exception
>>
>> Well, you'll actually still be able to optimize; it's just called
>> forceMerge.
>>
>> But the point is that optimize seems like something that _of course_ you
>> want to do, when in reality it's not something you usually should do at
>> all. Optimize does two things:
>> 1> merges all the segments into one (usually)
>> 2> removes all of the info associated with deleted documents.
>>
>> Of the two, point <2> is the one that really counts, and that's done
>> whenever segment merging happens anyway. So unless you have a very large
>> number of deletes (or updates of the same document), optimize buys you
>> very little. You can tell by the difference between numDocs and maxDoc on
>> the admin page.
>>
>> So what happens if you just don't bother to optimize? Take a look at the
>> merge policy to help control how merging happens, perhaps as an
>> alternative.
>>
>> Best,
>> Erick
>>
>> On Wed, Oct 10, 2012 at 3:04 PM, Petersen, Robert <rober...@buy.com> wrote:
>>> You could be right. Going back in the logs, I noticed it used to happen
>>> less frequently and always towards the end of an optimize operation. It
>>> is probably my indexer timing out waiting for updates to occur during
>>> optimizes. The errors grew recently after I upped the indexer thread
>>> count to 22, so there are a lot more timeouts occurring now. Also, our
>>> index has grown to double its old size, so the optimize operation has
>>> started taking a lot longer, also contributing to what I'm seeing. I
>>> have just changed my optimize frequency from three times a day to once
>>> a day after reading the following, where they talk about completely
>>> deprecating the optimize command in the next version of Solr:
>>> https://issues.apache.org/jira/browse/SOLR-3141
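A quick way to watch the numDocs/maxDoc gap Erick mentions, without opening the admin page, is the Luke request handler. A sketch, assuming it is enabled at the default /admin/luke path in solrconfig.xml (host and port are placeholders):

    # numDocs = live documents; maxDoc also counts deletes not yet merged away
    # (assumes the Luke handler is registered, as in the stock example config)
    curl -s 'http://localhost:8983/solr/admin/luke?numTerms=0&wt=json'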
>>>
>>> -----Original Message-----
>>> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
>>> Sent: Wednesday, October 10, 2012 11:10 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: anyone have any clues about this exception
>>>
>>> Something timed out and the other end closed the connection; this end
>>> tried to write to the closed pipe and died, and then something tried to
>>> catch that exception, write its own error, and died even worse? Just
>>> making it up really, but it sounds plausible (plus a 3-year Java
>>> tech-support hunch).
>>>
>>> If it happens often enough, see if you can run WireShark on that
>>> machine's network interface and catch the whole network conversation in
>>> action. Often there are enough clues there in the tcp packets and/or the
>>> stuff transmitted. WireShark is a power tool, so it takes a little while
>>> the first time, but the learning will pay for itself over and over again.
>>>
>>> Regards,
>>>   Alex.
>>>
>>> Personal blog: http://blog.outerthoughts.com/
>>> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
>>> - Time is the quality of nature that keeps events from happening all
>>> at once. Lately, it doesn't seem to be working. (Anonymous - via GTD
>>> book)
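For the capture itself, tcpdump can record the conversation to a file that Wireshark opens later. A sketch, assuming Tomcat listens on port 8080 and the interface is eth0 (both are guesses; adjust for the actual setup):

    # record full Solr/Tomcat HTTP packets for offline inspection in Wireshark
    # (-s 0 keeps whole packets; interface and port are assumptions)
    tcpdump -i eth0 -s 0 -w solr-capture.pcap 'tcp port 8080'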
>>>
>>> On Wed, Oct 10, 2012 at 11:31 PM, Petersen, Robert <rober...@buy.com> wrote:
>>>> Tomcat's localhost log (not the catalina log) for my Solr 3.6.1
>>>> (master) instance contains lots of these exceptions, but Solr itself
>>>> seems to be doing fine... any ideas? I'm not seeing these exceptions
>>>> logged on my slave servers, btw, just on the master, where we do our
>>>> indexing only.
>>>>
>>>> Oct 9, 2012 5:34:11 PM org.apache.catalina.core.StandardWrapperValve invoke
>>>> SEVERE: Servlet.service() for servlet default threw exception
>>>> java.lang.IllegalStateException
>>>>         at org.apache.catalina.connector.ResponseFacade.sendError(ResponseFacade.java:407)
>>>>         at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:389)
>>>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:291)
>>>>         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>>>>         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>>>         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>>>>         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>>>         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>>>>         at com.googlecode.psiprobe.Tomcat60AgentValve.invoke(Tomcat60AgentValve.java:30)
>>>>         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>>>>         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>>>>         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>>>>         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
>>>>         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>>>>         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
>>>>         at java.lang.Thread.run(Unknown Source)

--
Walter Underwood
wun...@wunderwood.org