Thanks so much, Shawn. I am in a scenario with many inserts while searching,
each <add> consisting of ~500 documents. I will monitor the number of
segments, keeping your considerations in mind :-)
Regards,
Tommaso

2010/11/4 Shawn Heisey <s...@elyograg.org>

> On 11/4/2010 3:27 AM, Tommaso Teofili wrote:
>
>>    - Is mergeFactor a one time configuration setting that is considered
>>    only when creating the index for the first time or can it be adjusted
>>    later even with some docs inside the index? e.g. I have mF to 10 then I
>>    realize I want quicker searches and I set it to 2 so that at the next
>>    optimize/commit I will have no more than 2 segments. My understanding
>>    is that one can adjust mF over time, is it right?
>>
>
> The mergeFactor is applied any time documents are added to the index, not
> just when it is built for the first time.  You can adjust it later, then
> reload the core or restart Solr.  It will apply to any additional indexing
> from that point forward.
>
> With a mergeFactor of 10, having 21 segments (or more) temporarily on the
> disk at the same time is reasonably possible.  I know this applies if you
> are doing a continuous large insert; I'm not sure whether it also applies
> if you are doing several small inserts separately. These segments are:
>
> * The small segment that is being built right now.
> * The previous 10 small segments.
> * The merged segment being created from those above.
> * The previous 9 merged segments.
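The tally in the list above can be sketched as a small calculation. This is only an illustration of the arithmetic in the message: the function name `transient_segment_count` is hypothetical, and generalizing the count to merge factors other than 10 is an assumption, not something stated in the thread.

```python
def transient_segment_count(merge_factor):
    """Count the segments listed above for a given mergeFactor.

    Assumption: the same breakdown holds for any mergeFactor, not only 10.
    """
    small_in_progress = 1                 # the small segment being built right now
    previous_small = merge_factor         # the previous small segments
    merged_in_progress = 1                # the merged segment being created from them
    previous_merged = merge_factor - 1    # the earlier merged segments
    return (small_in_progress + previous_small
            + merged_in_progress + previous_merged)


print(transient_segment_count(10))
```

With a merge factor of 10 the four terms are 1 + 10 + 1 + 9, which gives the 21 segments mentioned in the message.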
>
> If it takes a really long time to merge the last 10 small segments and then
> merge the 10 large segments into an even larger segment, you can end up with
> even more small segments from your continuous insert.  If it should take
> long enough that you actually get 10 more new small segments, the large
> merge will pause while it completes the small merge.  I saw this happen
> recently when I decided to see what happens if I built a single shard from
> our entire database.  It took a really long time, partly from that
> super-merge and the optimize that happened later, and took up 85GB of disk
> space.
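The cascade described in this paragraph (ten small segments merging into one, then ten merged segments merging into a larger one) can be sketched with a toy model. This is a simplified illustration and not Lucene's actual merge policy: it assumes merges are instantaneous, ignores segment sizes, and the names `flush_segment` and `levels` are hypothetical.

```python
def flush_segment(levels, merge_factor):
    """Record one new small segment, then cascade merges upward.

    levels[i] is the number of segments produced by i rounds of merging.
    Returns how many merges this flush triggered.
    Assumption: merges finish instantly, which real merges do not.
    """
    levels[0] += 1
    merges = 0
    level = 0
    while level < len(levels) and levels[level] >= merge_factor:
        # Replace merge_factor segments at this level with one at the next.
        levels[level] -= merge_factor
        if level + 1 == len(levels):
            levels.append(0)
        levels[level + 1] += 1
        merges += 1
        level += 1
    return merges


levels = [0]
for flush_number in range(1, 101):
    merges = flush_segment(levels, merge_factor=10)
    if merges > 1:
        # More than one merge from a single flush is the "super-merge" case.
        print(flush_number, merges, levels)
```

In this model, every tenth flush triggers a small merge, and the hundredth flush triggers both a small merge and a merge of the ten merged segments. Because merges here take no time, the model never shows the extra small segments that pile up while a long merge is running.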
>
> I'm not really sure what happens if you have this continue beyond a single
> super-merge like I have mentioned.
>
>>    - In a replicated environment does it make sense to define different
>>    mergeFactors on master and slave? I'd say no since it influences the
>>    number of segments created, that being a concern of who actually
>>    indexes documents (the master), not of who receives (segments of) the
>>    index, but please correct me if I am wrong.
>>
>
> Because it only applies when indexes are being built, it has no meaning on
> a slave, which, as you said, just copies the data from the master.
>
> Shawn
>
>
