Hi Erick,
We index several million documents per day, and we optimize every day
when the relative load is low. The reason we optimize is that we don't want
the index sizes to grow too large and auto-optimize to kick in. When
auto-optimize kicks in, performance becomes unpredictable because it is CPU-
and IO-intensive.
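
For illustration, a scheduled optimize call of this kind could be sketched
as below. This is only a sketch under assumptions: the host, port, and
collection name are placeholders, the function names are made up for the
example, and the URL form is the one Erick quotes further down the thread.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_optimize_url(host, port, collection):
    """Build the update URL that asks Solr to optimize a collection.

    host, port and collection are placeholders, not values from this thread.
    """
    query = urlencode({"optimize": "true"})
    return f"http://{host}:{port}/solr/{collection}/update?{query}"


def optimize(host, port, collection, timeout_s=3600):
    """Send the optimize request and return the HTTP status code.

    The long default timeout is an assumption: an optimize on a large
    index can take a while to return.
    """
    url = build_optimize_url(host, port, collection)
    with urlopen(url, timeout=timeout_s) as resp:
        return resp.status
```

A job like this would be triggered by whatever scheduler is already in use
(cron, for example) during the low-load window; nothing in the sketch
decides when that window is.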

In older Solr (4.2), insertion used to fail when the segment size grew too
large. Has anyone seen this problem in SolrCloud?

Also, we have observed that recovery takes a bit more time when the index is
not optimized. We don't have any quantitative measurements for this; it's
just an observation. Is this observation correct?

If we optimize every day, the indexes will not be skewed, right?

Please let me know if my understanding is correct.

Regards,
Rahul

On Mon, Dec 21, 2015 at 9:54 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> You'll probably have to shard before you get to the TB range. At that
> point, all the optimization is done individually on each shard so it
> really doesn't matter how many shards you have.
>
> Just issuing
> http://solr:port/solr/collection/update?optimize=true
>
> is sufficient, that'll forward the optimize command to all the shards
> in the collection.
>
> Best,
> Erick
>
> On Sun, Dec 20, 2015 at 8:19 PM, Zheng Lin Edwin Yeo
> <edwinye...@gmail.com> wrote:
> > Thanks for your information Erick.
> >
> > We have yet to decide how often we will update the index to include new
> > documents that come in. Let's say we update the index once a day; then,
> > when the index is updated, we do the optimization (this will be done at
> > night when there are not many users using the system).
> > But my index will probably grow quite big (potentially to more than
> > 1TB in the future), so does that have to be taken into
> > consideration too?
> >
> > Regards,
> > Edwin
> >
> >
> > On 21 December 2015 at 12:12, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> Much depends on how often the index is updated. If your index only
> >> changes, say, once a day then it's probably a good idea. If you're
> >> constantly updating your index, then I'd recommend that you do _not_
> >> optimize.
> >>
> >> Optimizing will create one large segment. That segment will be
> >> unlikely to be merged since it is so large relative to other segments
> >> for quite a while, resulting in significant wasted space. So if you're
> >> regularly indexing documents that _replace_ existing documents, this
> >> will skew your index.
> >>
> >> Bottom line:
> >> If you have a relatively static index that you can build and then use
> >> for an extended time (as in 12 hours plus) it can be worth the time to
> >> optimize. Otherwise I wouldn't bother.
> >>
> >> Best,
> >> Erick
> >>
> >> On Sun, Dec 20, 2015 at 7:57 PM, Zheng Lin Edwin Yeo
> >> <edwinye...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I would like to find out whether it would be good to write a script
> >> > to run an auto-optimization of the indexes at a certain time every
> >> > day. Is there any advantage to doing so?
> >> >
> >> > I found that optimization can reduce the index size by quite a
> >> > significant amount and allows searches of the index to run faster.
> >> > But will there be an advantage if we do the optimization every day?
> >> >
> >> > I'm using Solr 5.3.0
> >> >
> >> > Regards,
> >> > Edwin
> >>
>
