On Mon, Aug 8, 2016 at 5:10 AM, Callum Lamb <cl...@mintel.com> wrote:
> We have a cronjob that runs every week at a quiet time to run the
> optimize command on our Solr collections. Even when it's quiet it's still an
> extremely heavy operation.
>
> One of the things I keep seeing on Stack Overflow is that optimizing is now
> essentially deprecated and Lucene (we're on Solr 5.5.2) will now keep the
> number of segments at a reasonable level and that the performance impact of
> having deletedDocs is now much less.

Optimize is certainly not deprecated.
The operation was renamed to forceMerge at the Lucene level (but not
the Solr level) due to concerns that people might think it was necessary
for good performance without realizing the cost.
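
For reference, at the Solr level the operation is still requested as
"optimize" through the update handler. A minimal sketch of building such
a request URL follows; the host, port, and collection name are
placeholders (assumptions, not from this thread), and the helper name
optimize_url is hypothetical. It only builds the URL and does not send it.

```python
from urllib.parse import urlencode


def optimize_url(base_url, collection, max_segments=None, wait_searcher=True):
    """Build a Solr update-handler URL that requests an optimize.

    base_url and collection are placeholders supplied by the caller.
    max_segments, if given, asks Solr to merge down to at most that many
    segments instead of the default of one.
    """
    params = {
        "optimize": "true",
        "waitSearcher": "true" if wait_searcher else "false",
    }
    if max_segments is not None:
        params["maxSegments"] = str(max_segments)
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local instance and collection name.
    print(optimize_url("http://localhost:8983/solr", "mycollection", max_segments=1))
```

Passing a larger max_segments value is one way to limit how much merging
is forced, at the price of leaving more segments behind.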

> One of our cores doesn't get optimized and it's currently sitting at 5.5
> million documents with 1.9 million deleted docs. Which seems pretty high to
> me.
>
> How true is this claim? Is optimizing still a good idea for the general
> case?

The cost of optimize will always be high (but the impact of that cost
depends on the user/use case).  The benefit may be small to large.
I don't think one can really give a recommendation for the general case.
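
One way to put a number on the potential benefit side is the fraction of
the index taken up by deleted documents. The sketch below is only
arithmetic on the figures quoted above; it assumes the 5.5 million
figure counts live documents (if it is instead the total including
deletes, the live count would be 3.6 million). The function name
deleted_fraction is hypothetical.

```python
def deleted_fraction(live_docs, deleted_docs):
    """Return the share of all indexed documents that are marked deleted."""
    total = live_docs + deleted_docs
    if total == 0:
        return 0.0
    return deleted_docs / total


if __name__ == "__main__":
    # Assumption: 5.5M is the live document count, 1.9M the deleted count.
    print(f"{deleted_fraction(5_500_000, 1_900_000):.1%}")
    # Alternative reading: 5.5M is the total, so 3.6M documents are live.
    print(f"{deleted_fraction(3_600_000, 1_900_000):.1%}")
```

Whether a given fraction justifies the cost of an optimize still depends
on the use case, as noted above.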

-Yonik
