Ok, thanks Shawn!

That makes sense. We'll be experimenting with it.

Best,
Eric

On Wed, Oct 7, 2015 at 5:54 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 10/7/2015 12:00 PM, Eric Torti wrote:
>> Can we read "high reopen rate" as "frequent soft commits"? (In our
>> case, hard commits do not open a searcher. But soft commits do).
>>
>> Considering it does mean "frequent soft commits", I'd say that it
>> doesn't fit our setup because we have an index rate of about 10
>> updates/s and we perform a soft commit every 15 min. So our scenario
>> is not near real time in that sense. In light of this, do you think
>> using NRTCachingDirectory is still convenient?
>
> The NRT factory achieves high speed in NRT situations by flushing very
> small updates to RAM instead of the disk.  As more updates come in,
> older index segments sitting in RAM will eventually be flushed to disk,
> so a sustained flood of updates doesn't really achieve a speed increase,
> but a short burst of updates will be searchable *very* quickly.
>
> NRTCachingDirectoryFactory was chosen for Solr examples (and I think
> it's the Solr default) because it has no real performance downsides, but
> has a strong possibility to be noticeably faster than the standard
> factory in NRT situations.
>
> The only problem with it is that small index segments from recent
> updates might only exist in RAM, and not get flushed to disk, so they
> would be lost if Solr dies or is killed suddenly.  This is part of why
> the updateLog feature exists -- when Solr is started, the transaction
> logs will be replayed, inserting/replacing (at a minimum) all documents
> indexed since the last hard commit.  Once the replay has finished, no
> data will have been lost.  This does require a defined uniqueKey to
> operate correctly.
>
> Thanks,
> Shawn
>
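
For reference, the setup discussed in this thread could look roughly like
the solrconfig.xml fragment below. This is only a sketch, not taken from
either poster's actual configuration. The 15-minute soft commit interval
(900000 ms) comes from the thread; the 60-second hard commit interval is an
assumed placeholder value.

```xml
<!-- Sketch of a solrconfig.xml fragment; interval values are illustrative. -->
<config>
  <!-- Small, recently written segments are cached in RAM before going to disk. -->
  <directoryFactory name="DirectoryFactory"
                    class="solr.NRTCachingDirectoryFactory"/>

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- Transaction log: replayed at startup to recover updates
         indexed since the last hard commit. -->
    <updateLog>
      <str name="dir">${solr.ulog.dir:}</str>
    </updateLog>

    <!-- Hard commit: flushes to disk without opening a new searcher.
         60000 ms is an assumed value, not one given in the thread. -->
    <autoCommit>
      <maxTime>60000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <!-- Soft commit every 15 minutes (900000 ms), as described in the
         thread; this is what opens a new searcher. -->
    <autoSoftCommit>
      <maxTime>900000</maxTime>
    </autoSoftCommit>
  </updateHandler>
</config>
```

The uniqueKey that the transaction log replay depends on is defined in the
schema, not in solrconfig.xml.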
