Did you change the merge settings and max segments? If you did, try going back to the defaults.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Aug 8, 2016, at 8:56 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Callum:
>
> re: the optimize failing: Perhaps it's just timing out?
> That is, the command succeeds fine (which you
> are reporting), but it's taking long enough that the
> request times out so the client you're using reports an error.....
> Just a guess...
>
> My personal feeling is that (of course), you need to measure
> your perf before/after optimize to see if there's a measurable
> difference. Apart from that, Shawn's comments about the
> stats being different due to deleted docs are germane.
>
> Have you tried adding expungeDeletes=true to a commit
> message? See:
> https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
>
> A little-known option is to control how aggressively
> the % of deleted documents is factored in to the decision
> whether to merge a segment or not. It takes a little
> code-diving, and faith, but if you look at TieredMergePolicy,
> you'll see a double field: reclaimDeletesWeight.
>
> Now, in your solrconfig.xml file you can set this, there's a
> clever bit of reflection to allow these to be specified, going
> from memory it's just
> <double name="reclaimDeletesWeight">3.0</double>
> as a node in your tiered merge config. The default is 2.0.
> In terms of what that _does_, that's where code-diving
> comes in.....
>
> Best,
> Erick
>
> On Mon, Aug 8, 2016 at 7:59 AM, Callum Lamb <cl...@mintel.com> wrote:
>> Yeah I figured that was too many deleteddocs. It could just be that our max
>> segments is set too high though.
>>
>> The reason I asked is because our optimize requests have started failing.
>> Or at least, they are appearing to fail because the optimize request returns
>> a non 200. The optimize seems to go ahead successfully regardless though.
>> Before trying to find out if I can asynchronously request and poll for
>> success (doesn't appear to be possible yet) or a better way of determining
>> success, I thought I'd check if the whole thing was necessary to begin with.
>>
>> Hopefully it doesn't involve polling the core status until deleteddocs goes
>> below a certain level :/.
>>
>> Cheers for info.
>>
>> On Mon, Aug 8, 2016 at 2:58 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>
>>> On 8/8/2016 3:10 AM, Callum Lamb wrote:
>>>> How true is this claim? Is optimizing still a good idea for the
>>>> general case?
>>>
>>> For the general case, optimizing is not recommended. If there are a
>>> very large number of deleted documents, which does describe your
>>> situation, then there is definitely a benefit.
>>>
>>> In cases where there are a lot of deleted documents, scoring can be
>>> affected by the presence of the deleted documents, and the drop in index
>>> size after an optimize can result in a large performance boost. For the
>>> general case where there are not many deletes, there *is* a performance
>>> benefit to optimizing down to a single segment, but it is nowhere near
>>> as dramatic as it was in the 1.x/3.x days.
>>>
>>> The problem with optimizes in the general case is this: The performance
>>> hit that the optimize operation itself causes may not be worth the small
>>> performance improvement.
>>>
>>> If you have a time where your index is quiet enough that the optimize
>>> itself won't be disruptive, then you should certainly take advantage of
>>> that time and do the optimize, even if there aren't many deletes.
>>>
>>> There is another benefit to optimizes that doesn't get mentioned often:
>>> It can make subsequent normal merging operations during indexing faster,
>>> because there will not be as many large segments.
>>>
>>> Thanks,
>>> Shawn
>>>
>>>
>>
>> --
>>
>> Mintel Group Ltd | 11 Pilgrim Street | London | EC4V 6RN
>> Registered in England: Number 1475918.
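Two of Erick's suggestions in the thread above (committing with expungeDeletes=true, and the guess that the request only times out on the client side) can be sketched together in a few lines of Python. This is a rough illustration under assumptions, not a tested recipe: the base URL, the core name "mycore", and the one-hour timeout are hypothetical placeholders, and the helper names are made up for this example. It relies only on the update handler accepting commit and expungeDeletes as request parameters, as described on the wiki page Erick links.

```python
import urllib.parse
import urllib.request


def build_commit_url(base_url, core, expunge_deletes=True):
    """Build an update-handler URL that commits and optionally expunges deletes.

    base_url and core are placeholders for your own Solr location and core name.
    """
    params = {"commit": "true"}
    if expunge_deletes:
        params["expungeDeletes"] = "true"
    query = urllib.parse.urlencode(params)
    return f"{base_url.rstrip('/')}/{core}/update?{query}"


def commit_expunging_deletes(base_url, core, timeout_seconds=3600):
    """Send the commit and return the HTTP status code.

    The generous timeout is an assumption: if the merge work outlasts a short
    client timeout, the client reports an error even though Solr carries on.
    """
    url = build_commit_url(base_url, core)
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.status


# Example call (needs a running Solr instance, so it is left commented out):
# status = commit_expunging_deletes("http://localhost:8983/solr", "mycore")
```

If the call still fails with a long timeout, the cause is probably something other than the client giving up early, and the Solr log is the next place to look.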