I am using Solr 4.6.0 in cloud mode. The setup has 4 shards, one per
machine, with a ZooKeeper quorum running on 3 other machines. The index
on each shard is about 15GB. I noticed that the second shard has 42
segments, while the remaining shards have between 25 and 30.

I am trying to get the number of segments down to a reasonable count,
like 4 or 5, in order to improve search time. We index some documents
every day, so we don't want to run an optimize every day.

As I understand it, the mergeFactor with the TieredMergePolicy is only
the number of segments per tier. Assuming there were 5 tiers (mergeFactor
of 10) in the second shard, I tried clearing the index, reducing the
mergeFactor and re-indexing the same data in the same manner, multiple
times, but I don't see a consistent reduction in the number of segments:

No mergeFactor set  =>  42 segments
mergeFactor=5       =>  22 segments
mergeFactor=2       =>  22 segments
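
For illustration, the segment budget of a tiered policy can be approximated with a short Python sketch. It is an assumption-laden simplification, not the Lucene code itself: the function name `allowed_segment_count` and its parameters are hypothetical, the 2MB floor segment size is assumed, and the cap on merged segment size and deleted documents are ignored. It only suggests why the segment count on a 15GB index grows roughly with the number of tiers and does not drop in proportion to the merge setting.

```python
import math


def allowed_segment_count(total_bytes, segments_per_tier, max_merge_at_once,
                          floor_segment_bytes=2 * 1024 ** 2):
    """Rough estimate of how many segments a tiered policy tolerates.

    Hypothetical helper: each tier holds up to `segments_per_tier` segments,
    and each tier's segment size is `max_merge_at_once` times the previous one.
    """
    level_size = floor_segment_bytes   # size of a segment in the current tier
    bytes_left = float(total_bytes)    # index bytes not yet assigned to a tier
    allowed = 0
    while True:
        seg_count_level = bytes_left / level_size
        if seg_count_level < segments_per_tier:
            # Last, partially filled tier.
            allowed += math.ceil(seg_count_level)
            break
        allowed += segments_per_tier
        bytes_left -= segments_per_tier * level_size
        level_size *= max_merge_at_once
    return allowed


if __name__ == "__main__":
    index_bytes = 15 * 1024 ** 3  # about 15GB per shard, as described above
    for factor in (10, 5, 2):
        print(factor, allowed_segment_count(index_bytes, factor, factor))
```

Under these assumptions, a smaller setting adds more tiers while shrinking each one, so the estimate falls much more slowly than the setting itself.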

Below is the simple configuration, as specified in the documentation,
that I am using for merging:

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">2</int>
  <int name="segmentsPerTier">2</int>
</mergePolicy>
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>

What is the best way to use merging to limit the number of segments
that are formed?

Also, we are moving from Solr 1.4 (Master-Slave) to Solr 4.6.0 Cloud and
see a large increase in response time, from about 18ms to 150ms. Is this
a known issue? Is there any way to reduce the response time? In the
MBeans, the /select handler statistics for the individual cores show
search times of around 8ms. What causes the overall response time to
increase so much?

-Varun
