Hi Erick,

After waiting about a week (with daily crawling and indexing), here is the docs summary:
Num Docs: 9738
Max Doc: 15311
Deleted Docs: 5573
Version: 781
Segment Count: 5

Deleted docs are about 57% of numDocs. Meanwhile, the TieredMergePolicy in solrconfig.xml is still commented out:

<!--
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
</mergePolicy>
-->

Should we enable it and wait for the effect?

Thanks!

On Wed, Nov 20, 2013 at 9:55 PM, Bayu Widyasanyata <bwidyasany...@gmail.com> wrote:

> Thanks Erick.
> I will check that on next round.
>
> ---
> wassalam,
> [bayu]
>
> /sent from Android phone/
> On Nov 20, 2013 7:45 PM, "Erick Erickson" <erickerick...@gmail.com> wrote:
>
>> You probably shouldn't optimize at all. The default TieredMergePolicy
>> will eventually purge the deleted files' data, which is really what
>> optimize does. So despite its name, most of the time it's not really
>> worth the effort.
>>
>> Take a look at your Solr admin page, the "overview" link for a core.
>> If the number of deleted docs is a significant percentage of your
>> numDocs (I typically use 20% or so, but YMMV) then optimize
>> might be worthwhile. Otherwise, it's a distraction unless and until
>> you have some evidence that it actually makes a difference.
>>
>> Best,
>> Erick
>>
>>
>> On Wed, Nov 20, 2013 at 7:33 AM, Bayu Widyasanyata
>> <bwidyasany...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > After successfully configured re-crawling script, I sometimes checked
>> > and found on Solr Admin that "Optimized" status of my collection is
>> > not optimized (slash icon).
>> >
>> > Hence I did optimized steps manually.
>> >
>> > How to make my crawling optimized automatically?
>> >
>> > Should we restart Solr (I use Tomcat) as shown on here [1]
>> >
>> > [1] http://wiki.apache.org/nutch/Crawl
>> >
>> > Thanks!
>> >
>> > --
>> > wassalam,
>> > [bayu]
>> >

--
wassalam,
[bayu]
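
For reference, the deleted-docs arithmetic from the summary at the top of this message can be checked with a small Python sketch. It only uses the Num Docs and Max Doc figures quoted above; the variable names are illustrative, and it assumes Deleted Docs is simply Max Doc minus Num Docs, which matches the reported 5573.

```python
# Figures copied from the Solr admin overview quoted in this message.
num_docs = 9738   # live documents
max_doc = 15311   # live + deleted documents still held in segments

# Assumption: deleted docs = maxDoc - numDocs (matches the reported 5573).
deleted_docs = max_doc - num_docs

# Ratio against numDocs, the comparison Erick suggests (his rough threshold is ~20%).
pct_of_num_docs = 100.0 * deleted_docs / num_docs

# Ratio against maxDoc, shown only for comparison.
pct_of_max_doc = 100.0 * deleted_docs / max_doc

print(f"deleted docs:      {deleted_docs}")
print(f"deleted / numDocs: {pct_of_num_docs:.1f}%")
print(f"deleted / maxDoc:  {pct_of_max_doc:.1f}%")
```

Measured against numDocs, the ratio is well above the 20% rule of thumb mentioned in the quoted reply; measured against maxDoc it is lower, so it matters which denominator is being discussed.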