Is there really a good reason to consolidate down to a single segment?

Any incremental query performance benefit is tiny compared to the loss of 
manageability.

I.e. shouldn't segments _always_ be kept small enough to facilitate 
re-balancing data across shards?   Even in non-cloud instances this is true.  
When a collection grows, you may want to shard/split an existing index by adding a 
node and moving some segments around.    Isn't this the direction Solr is 
going?   With many, smaller segments, this is feasible.  With "one big 
segment", the collection must always be reindexed.

Thus, "optimize" would mean, "get rid of all deleted records" and would, in 
fact, optimize queries by eliminating wasted I/O.   Perhaps worth it for slowly 
changing indexes.   Seems like the Tiered merge policy is 90% there ...    Or 
am I all wet (again)?
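
For what it's worth, here is a minimal sketch of the two requests being contrasted: a full forced merge versus a commit that only expunges deletes. It assumes the standard update handler parameters `optimize`, `maxSegments`, `commit` and `expungeDeletes`; the helper name, base URL and core name are made up for illustration, and nothing is actually sent to Solr.

```python
from urllib.parse import urlencode


def update_url(base_url, core, *, full_merge=False, max_segments=None):
    """Build a Solr update-handler URL (hypothetical helper, builds the URL only).

    full_merge=True  -> "optimize", i.e. a forced merge, optionally capped
                        at max_segments instead of a single segment.
    full_merge=False -> a commit that asks Solr to expunge deleted docs.
    """
    if full_merge:
        params = {"optimize": "true"}
        if max_segments is not None:
            params["maxSegments"] = str(max_segments)
    else:
        params = {"commit": "true", "expungeDeletes": "true"}
    return f"{base_url.rstrip('/')}/{core}/update?{urlencode(params)}"


# Assumed local install and core name, for illustration only.
print(update_url("http://localhost:8983/solr", "mycore"))
print(update_url("http://localhost:8983/solr", "mycore",
                 full_merge=True, max_segments=4))
```

The second form keeps several segments rather than collapsing to one, which is closer to what I'm arguing for above.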

-----Original Message-----
From: Walter Underwood [mailto:wun...@wunderwood.org] 
Sent: Monday, June 29, 2015 10:39 AM
To: solr-user@lucene.apache.org
Subject: Re: optimize status

"Optimize" is a manual full merge.

Solr automatically merges segments as needed. This also expunges deleted 
documents.

We really need to rename "optimize" to "force merge". Is there a Jira for that?

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

On Jun 29, 2015, at 5:15 AM, Steven White <swhite4...@gmail.com> wrote:

> Hi Upayavira,
> 
> This is news to me that we should not optimize an index.
> 
> What about disk space savings? Isn't optimization meant to reclaim disk 
> space, or does Solr somehow do that itself?  Where can I read more about this?
> 
> I'm on Solr 5.1.0 (may switch to 5.2.1)
> 
> Thanks
> 
> Steve
> 
> On Mon, Jun 29, 2015 at 4:16 AM, Upayavira <u...@odoko.co.uk> wrote:
> 
>> I'm afraid I don't understand. You're saying that optimising is 
>> causing performance issues?
>> 
>> Simple solution: DO NOT OPTIMIZE!
>> 
>> Optimisation is very badly named. What it does is squash all 
>> segments in your index into one segment, removing all deleted 
>> documents. It is good to get rid of deletes - in that sense the index is 
>> "optimized".
>> However, future merges become very expensive. The best way to handle 
>> this topic is to leave it to Lucene/Solr to do it for you. Pretend 
>> the "optimize" option never existed.
>> 
>> This is, of course, assuming you are using something like Solr 3.5+.
>> 
>> Upayavira
>> 
>> On Mon, Jun 29, 2015, at 08:08 AM, Summer Shire wrote:
>>> 
>>> Have to, because of performance issues.
>>> Just want to know if there is a way to tap into the status.
>>> 
>>>> On Jun 28, 2015, at 11:37 PM, Upayavira <u...@odoko.co.uk> wrote:
>>>> 
>>>> Bigger question, why are you optimizing? Since 3.6 or so, it 
>>>> generally hasn't been required; it is even a bad thing.
>>>> 
>>>> Upayavira
>>>> 
>>>>> On Sun, Jun 28, 2015, at 09:37 PM, Summer Shire wrote:
>>>>> Hi All,
>>>>> 
>>>>> I have two indexers (independent processes) writing to a common 
>>>>> Solr core.
>>>>> If one indexer process issues an optimize on the core, I want the 
>>>>> second indexer to wait before adding docs until the optimize has 
>>>>> finished.
>>>>> 
>>>>> Are there ways I can do this programmatically?
>>>>> pinging the core when the optimize is happening is returning OK because
>>>>> technically solr allows you to update when an optimize is happening.
>>>>> 
>>>>> any suggestions ?
>>>>> 
>>>>> thanks,
>>>>> Summer
>> 


