I have not benchmarked various numbers of segments at different sizes on different HW etc., so my hunch could very well be wrong for Salman's case. I don't know how frequently his data is updated either.
Have you done #segments benchmarking for your huge datasets?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 12 Oct 2015, at 12:56, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
>
> On Mon, 2015-10-12 at 10:05 +0200, Jan Høydahl wrote:
>> What you do when you call optimize is to force Lucene to merge all
>> those 35M docs into ONE SINGLE index segment. You get better HW
>> utilization if you let Lucene/Solr automatically handle merging,
>> meaning you'll have around 10 smaller segments that are faster to
>> search across than one huge segment.
>
> As individual Lucene/Solr shard searches are very much single threaded,
> the single-segment version should be faster. Have you observed
> otherwise?
>
> Optimization is a fine feature if one's workflow is batch oriented with
> sufficiently long pauses between index updates. Nightly index updates
> with few active users at that time could be an example.
>
> - Toke Eskildsen, State and University Library, Denmark
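
[Editor's note: for readers unfamiliar with what "optimize" does under the hood, it corresponds to Lucene's IndexWriter.forceMerge(maxSegments), which rewrites the index down to at most that many segments. Below is a minimal sketch against the plain Lucene API, assuming a Lucene 5.x setup and a hypothetical index path; Solr triggers the same operation for you when you send an optimize request to a core, so this is illustrative rather than something you would normally run yourself.]

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

import java.nio.file.Paths;

public class ForceMergeExample {
    public static void main(String[] args) throws Exception {
        // "/path/to/index" is a placeholder; point it at an existing Lucene index.
        try (FSDirectory dir = FSDirectory.open(Paths.get("/path/to/index"));
             IndexWriter writer = new IndexWriter(dir,
                     new IndexWriterConfig(new StandardAnalyzer()))) {
            // Merge everything into a single segment -- the expensive rewrite
            // that an "optimize" performs. Without it, the merge policy keeps
            // a handful of smaller segments (TieredMergePolicy defaults to a
            // tier size of 10) and merges them incrementally in the background.
            writer.forceMerge(1);
            writer.commit();
        }
    }
}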