Currently my indexing code calls optimize. Once a night, one of my six large shards is optimized, so each shard is optimized only once every six days. Here is the SolrJ call, where server is an instance of HttpSolrServer:

UpdateResponse ur = server.optimize();
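
The nightly rotation could be sketched roughly as follows. This is not the actual indexing code: the class and method names are hypothetical, and picking the shard from the calendar date is an assumption made only to illustrate the "one shard per night, each shard every six days" schedule.

```java
import java.time.LocalDate;

// Hypothetical helper, not part of SolrJ: decides which shard gets tonight's optimize.
final class ShardRotation {
    // Six large shards, as described above.
    static final int SHARD_COUNT = 6;

    // Maps a calendar date to a shard index in [0, SHARD_COUNT).
    // Consecutive days give consecutive indexes, so each shard comes up once every six days.
    static int shardForDate(LocalDate date) {
        return (int) Math.floorMod(date.toEpochDay(), (long) SHARD_COUNT);
    }
}
```

The nightly job would then call something like ShardRotation.shardForDate(LocalDate.now()) and run the optimize only against the HttpSolrServer for that shard.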

I only do this because I want deleted documents regularly removed from the index. Whatever speed gains I might see from getting down to one segment are just an added bonus. After watching all the discussion on the -dev list regarding what to do in Solr due to the Lucene forceMerge rename, I am considering changing this to something like the following:

// Arguments: waitFlush = true, waitSearcher = true, maxSegments = 20
UpdateResponse ur = server.optimize(true, true, 20);

What happens with this if I am already below 20 segments? Will it still expunge all of my (typically several thousand) deleted documents? I am hoping that it will rebuild any segment that contains deleted documents and leave the other segments alone. Possibly irrelevant info: I'm using the following merge policy config:

  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">35</int>
    <int name="segmentsPerTier">35</int>
    <int name="maxMergeAtOnceExplicit">105</int>
  </mergePolicy>

Thanks,
Shawn
