Re: Index optimize runs in background.

Upayavira Tue, 26 May 2015 05:34:21 -0700

Modassar,

Are you saying that the reason you are optimising is because you have
been doing it for years? If this is the only reason, you should stop
doing it immediately.


The one scenario in which optimisation still makes some sense is when
you reindex every night and optimise straight after. This will leave you
with a single segment which will search faster.
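
For that nightly-reindex case, the optimise step can be sketched as a single request to Solr's update handler. This is only an illustration: the base URL and the collection name "nightly" are placeholders, and the helper optimize_url is hypothetical. It only builds the URL and sends nothing. The optimize, waitSearcher and maxSegments request parameters correspond roughly to the arguments of the SolrJ optimize call quoted further down.

```python
from urllib.parse import urlencode


def optimize_url(base_url, collection, max_segments=1, wait_searcher=True):
    """Build an update-handler URL that asks Solr to optimise.

    base_url and collection are placeholders for illustration only.
    """
    params = {
        "optimize": "true",
        # Ask Solr to block until the new searcher is open.
        "waitSearcher": "true" if wait_searcher else "false",
        # Target number of segments after the optimise.
        "maxSegments": str(max_segments),
    }
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and collection name.
    print(optimize_url("http://localhost:8983/solr", "nightly"))
```

A nightly job would issue this request once, after the reindex has finished and been committed.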

However, if you are doing a lot of indexing, especially with
deletes/updates, an optimise will have merged your content into a single
large segment which will later need to be merged again. That merge will be
costly as it will involve copying the entire content of your large
segment, which will impact performance.
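
A rough way to see the cost is the toy model below. It is not Lucene's merge logic: it simply assumes that a merge rewrites every segment it takes part in, so its cost is the sum of their sizes. The segment sizes are made-up numbers.

```python
def merge_cost(segment_sizes_gb):
    """Data rewritten by one merge: every participating segment is copied."""
    return sum(segment_sizes_gb)


# Made-up sizes, for illustration only.
optimised = [100.0]       # the single large segment left by an optimise
fresh = [1.0, 1.0, 1.0]   # small segments written by later indexing

# Merging only the small segments rewrites little data.
cost_small_only = merge_cost(fresh)

# Once the large segment is pulled into a merge, all of it is copied again.
cost_with_large = merge_cost(optimised + fresh)

if __name__ == "__main__":
    print(cost_small_only, cost_with_large)
```

In this model the second merge rewrites the whole index, which is the expensive case described above.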

Before Solr 3.6, optimisation was necessary and recommended. At that
point (or a little before) the TieredMergePolicy became the default, and
this made optimisation generally unnecessary.

Upayavira

On Mon, May 25, 2015, at 07:17 AM, Modassar Ather wrote:
> Thanks everybody for your replies.
> 
> I have noticed the optimization running in background every time I
> indexed.
> This is a 5-node cluster with solr-5.1.0 and uses the CloudSolrClient.
> Kindly share your findings on this issue.
> 
> Our index has almost 100M documents running on SolrCloud. We have been
> optimizing the index after indexing for years and it has worked well for
> us.
> 
> Thanks,
> Modassar
> 
> On Fri, May 22, 2015 at 11:55 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> 
> > Actually, I've recently seen very similar behavior in Solr 4.10.3, but
> > involving hard commits with openSearcher=true, see:
> > https://issues.apache.org/jira/browse/SOLR-7572. Of course I can't
> > reproduce this at will, siigggghhhh.
> >
> > A unit test should be very simple to write though, maybe I can get to it
> > today.
> >
> > Erick
> >
> >
> >
> > On Fri, May 22, 2015 at 8:27 AM, Upayavira <u...@odoko.co.uk> wrote:
> > >
> > >
> > > On Fri, May 22, 2015, at 03:55 PM, Shawn Heisey wrote:
> > >> On 5/21/2015 6:21 AM, Modassar Ather wrote:
> > >> > I am using Solr-5.1.0. I have an indexer class which invokes
> > >> > cloudSolrClient.optimize(true, true, 1). My indexer exits after the
> > >> > invocation of optimize and the optimization keeps on running in the
> > >> > background.
> > >> > Kindly let me know if it is per design and how can I make my indexer
> > >> > to wait until the optimization is over. Is there a
> > >> > configuration/parameter I need to set for the same.
> > >> >
> > >> > Please note that the same indexer with cloudSolrServer.optimize(true,
> > >> > true, 1) on Solr-4.10 used to wait till the optimize was over before
> > >> > exiting.
> > >>
> > >> This is very odd, because I could not get HttpSolrServer to optimize in
> > >> the background, even when that was what I wanted.
> > >>
> > >> I wondered if maybe the Cloud object behaves differently with regard to
> > >> blocking until an optimize is finished ... except that there is no code
> > >> for optimizing in CloudSolrClient at all ... so I don't know where the
> > >> different behavior would actually be happening.
> > >
> > > A more important question is, why are you optimising? Generally it isn't
> > > recommended anymore as it reduces the natural distribution of documents
> > > amongst segments and makes future merges more costly.
> > >
> > > Upayavira
> >
