I have not benchmarked different numbers of segments at different sizes
on different HW etc., so my hunch could very well be wrong for Salman’s case.
I don’t know how frequently his data is updated either.

Have you done #segments benchmarking for your huge datasets?
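
A rough sketch of how such a #segments benchmark could be scripted, in case it is useful. It assumes a Solr instance reachable over HTTP and relies on the update handler's optimize=true and maxSegments parameters. The base URL, the collection name "mycollection", the query list and all function names are placeholders made up for illustration, not anything taken from Salman's setup.

```python
import statistics
import time
import urllib.parse
import urllib.request

# Assumed location of the Solr collection; adjust to the real installation.
SOLR_BASE = "http://localhost:8983/solr/mycollection"


def optimize_url(max_segments, base=SOLR_BASE):
    """Build an update URL asking Solr to merge down to at most max_segments."""
    params = urllib.parse.urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base}/update?{params}"


def select_url(query, base=SOLR_BASE):
    """Build a query URL that returns only the hit count (rows=0)."""
    params = urllib.parse.urlencode({"q": query, "rows": 0, "wt": "json"})
    return f"{base}/select?{params}"


def fetch(url):
    """Issue a blocking HTTP GET and return the raw response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def median_latency(run_query, queries, repeats=5):
    """Median wall-clock seconds per call of run_query(q) over all queries."""
    timings = []
    for _ in range(repeats):
        for q in queries:
            start = time.perf_counter()
            run_query(q)
            timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def benchmark(segment_counts, queries, fetch=fetch):
    """Merge down to each segment count in turn and time the queries.

    Counts are processed from largest to smallest, since merging can only
    reduce the number of segments.
    """
    results = {}
    for n in sorted(segment_counts, reverse=True):
        fetch(optimize_url(n))
        results[n] = median_latency(lambda q: fetch(select_url(q)), queries)
    return results
```

Calling benchmark([10, 5, 1], ["title:foo", "body:bar"]) would then give a median latency per segment count. Cache warm-up and concurrent indexing will skew the numbers, so the queries should be representative of real traffic and the runs repeated.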

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 12 Oct 2015, at 12:56, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
> 
> On Mon, 2015-10-12 at 10:05 +0200, Jan Høydahl wrote:
>> What you do when you call optimize is to force Lucene to merge all
>> those 35M docs into ONE SINGLE index segment. You get better HW
>> utilization if you let Lucene/Solr automatically handle merging,
>> meaning you’ll have around 10 smaller segments that are faster to
>> search across than one huge segment.
> 
> As individual Lucene/Solr shard searches are very much single threaded,
> the single segment version should be faster. Have you observed
> otherwise?
> 
> 
> Optimization is a fine feature if one's workflow is batch oriented with
> sufficiently long pauses between index updates. Nightly index updates
> with few active users at that time could be an example.
> 
> - Toke Eskildsen, State and University Library, Denmark
> 
> 
