We are running Solr with the index stored in HDFS. We build the index with MapReduce jobs, using Cloudera's MapReduceIndexerTool with the go-live option to merge the results into our live index.
We are seeing an issue where the number of segments in the index never decreases. It keeps growing until we manually run an optimize. We are using the following Solr config for the merge policy:

    <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
      <int name="maxMergeAtOnce">10</int>
      <int name="segmentsPerTier">10</int>
    </mergePolicy>
    <!--<mergeFactor>10</mergeFactor>-->
    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
      <int name="maxThreadCount">1</int>
      <int name="maxMergeCount">6</int>
    </mergeScheduler>

If we add documents to Solr without using MapReduce, the segments merge as expected.

Any ideas on why we see this behavior? Does the Solr index merge used by go-live prevent the segments from merging?

Thanks,
Jordan