Thanks, Erick.

2> There are, but you'll have to dig.
>> Any pointers on where to get started?

3> Well, I'd ask a counter-question. Are you seeing unacceptable
performance? If not, why worry? :)
>> When you say %, do you mean deleted_docs/NumDocs or deleted_docs/Max_docs?
To answer your question, yes, I see some of our shards taking 3x more time
and 3x more CPU than other shards for the same queries and the same number
of hits (all shards have exactly the same number of docs, but a few shards
have more deleted documents than the rest).

My understanding is that search time / CPU would increase with the # of
segments? The core of my issue is that a few nodes are running with
extremely high CPU (90%+) while the rest run under 30% CPU, and the only
difference between them is the # of segments in the shards on those
machines. The nodes running hot have shards with 30 segments; the ones
running cooler have 20 segments and far fewer deleted documents. Is it
possible that a difference of 10 segments could impact CPU / search time?

Thanks
- Nitin

On Sat, Feb 22, 2014 at 4:36 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> 1> It Depends. Soft commits will not add a new segment. Hard commits
> with openSearcher=true or false _will_ create a new segment.
> 2> There are, but you'll have to dig.
> 3> Well, I'd ask a counter-question. Are you seeing unacceptable
> performance? If not, why worry? :)
>
> A better answer is that 24-28 segments is not at all unusual.
>
> By and large, don't bother with optimize/force merge. What I would do is
> look at the admin screen and note the percentage of deleted documents.
> If it's above some arbitrary number (I typically use 15-20%) and _stays_
> there, consider optimizing.
>
> However! There is a parameter you can explicitly set in solrconfig.xml
> (sorry, which one escapes me now) that increases the "weight" of the %
> of deleted documents when the merge policy decides which segments
> to merge. Upping this number will have the effect of more aggressively
> merging segments with a greater % of deleted docs. But these are
> already pretty heavily weighted for merging...
>
> Best,
> Erick
>
> On Sat, Feb 22, 2014 at 1:23 PM, KNitin <nitin.t...@gmail.com> wrote:
>
> > Hi
> >
> > I have the following questions:
> >
> >    1. I have a job that runs for 3-4 hours, continuously committing data
> >    to a collection with an auto commit of 30 seconds. Does it mean that
> >    every 30 seconds I would get a new Solr segment?
> >    2. My current segment merge policy is set to 10. Will the merger
> >    always continue running in the background to reduce the segments? Is
> >    there a way to see metrics about segment merging from Solr (MBeans or
> >    any other way)?
> >    3. A few of my collections are very large, with around 24-28 segments
> >    per shard and around 16 shards. Is it bad to have this many segments
> >    per shard for a collection? Is it a good practice to optimize the
> >    index very often, or to just rely on segment merges alone?
> >
> > Thanks for the help in advance
> > Nitin
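For reference, the solrconfig.xml parameter Erick alludes to above is most
likely reclaimDeletesWeight on TieredMergePolicy, which raises the priority
of segments with many deleted docs when merges are selected (treat the name
and the values below as an educated guess, not gospel, and check your Solr
version's example config for the exact syntax). A minimal sketch of the
relevant <indexConfig> section in Solr 4.x style, with illustrative values:

    <indexConfig>
      <!-- TieredMergePolicy is the default merge policy.
           reclaimDeletesWeight defaults to 2.0; bumping it (e.g. to 3.0)
           makes segments with a high % of deleted docs merge sooner. -->
      <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
        <int name="maxMergeAtOnce">10</int>
        <int name="segmentsPerTier">10</int>
        <double name="reclaimDeletesWeight">3.0</double>
      </mergePolicy>
    </indexConfig>

As for deleted_docs/NumDocs vs deleted_docs/Max_docs: Max Doc counts live
plus deleted documents, so the "% deleted" figure people usually quote is
deletedDocs / maxDoc (equivalently 1 - numDocs/maxDoc). Both counts are
shown per core on the admin Overview page.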