I am using Solr 4.5.1. I have two collections:

Collection 1 - 2 shards, 3 replicas (Shard 1: 115 MB, Shard 2: 55 MB)
Collection 2 - 2 shards, 3 replicas (Shard 1: 3.5 GB, Shard 2: 1 GB)

I have a batch process that performs indexing (a full refresh) once a week on the same index. Here is some information on how I index:

a) I use SolrJ's bulk ADD API for indexing: CloudSolrServer.add(Collection docs).

b) I have the following autoCommit (hard commit) setting for both of my collections (solrconfig.xml):

    <autoCommit>
      <maxDocs>100000</maxDocs>
      <openSearcher>false</openSearcher>
    </autoCommit>

c) I do a programmatic hard commit at the end of the indexing cycle, with openSearcher set to "true", so that the documents show up in the search results.

d) I neither soft commit programmatically nor have any autoSoftCommit settings during the batch indexing process.

e) When I re-index all my data the following week, I don't delete the existing docs. Rather, I just re-index into the same collection.

f) I am using the default mergeFactor of 10 in my solrconfig.xml:

    <mergeFactor>10</mergeFactor>

Here is what I am observing:

1) After a batch indexing cycle, the segment counts for each shard / core are pretty high. The Solr Dashboard reports between 8 and 30 segments on the various cores.

2) Sometimes the Solr Dashboard shows the status of my core as NOT OPTIMIZED. I find this unusual, since I have just finished a batch indexing cycle and would assume that the index should already be optimized. Is this happening because I don't delete my docs before re-indexing all my data?

3) After I run an optimize on my collections, the segment count does reduce significantly, to 1 segment.

Am I doing indexing the right way? Is there a better strategy? Is it necessary to perform an optimize after every batch indexing cycle?

The outcome I am looking for is an optimized index after every major batch indexing cycle.

Thanks!
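
For reference, the cycle described in (a) to (c), plus the optional optimize asked about, could be sketched roughly as follows. This is only an illustration of the call order, not the actual indexing code: IndexClient, RecordingClient, runCycle and the batch size are hypothetical names introduced here so the sketch runs without a Solr cluster. In SolrJ 4.x the corresponding calls would be CloudSolrServer.add(Collection<SolrInputDocument>), commit() and optimize().

```java
import java.util.ArrayList;
import java.util.List;

public class BatchIndexSketch {

    /** Hypothetical stand-in for the SolrJ client; this is NOT a SolrJ type. */
    interface IndexClient {
        void add(List<String> docs); // SolrJ 4.x: CloudSolrServer.add(Collection<SolrInputDocument>)
        void commit();               // SolrJ 4.x: commit(), a hard commit that opens a new searcher
        void optimize();             // SolrJ 4.x: optimize(), merges the index down to one segment
    }

    /** In-memory fake that only records which calls were made, in order. */
    static class RecordingClient implements IndexClient {
        final List<String> calls = new ArrayList<>();
        int docsAdded = 0;

        public void add(List<String> docs) {
            docsAdded += docs.size();
            calls.add("add:" + docs.size());
        }

        public void commit() {
            calls.add("commit");
        }

        public void optimize() {
            calls.add("optimize");
        }
    }

    /**
     * Sends all docs in bulk batches, then issues one explicit commit at the end.
     * The server-side autoCommit (openSearcher=false) is assumed to handle
     * intermediate hard commits, so no commit is issued inside the loop.
     */
    static void runCycle(IndexClient client, List<String> allDocs,
                         int batchSize, boolean optimizeAtEnd) {
        for (int from = 0; from < allDocs.size(); from += batchSize) {
            int to = Math.min(from + batchSize, allDocs.size());
            client.add(new ArrayList<>(allDocs.subList(from, to)));
        }
        client.commit();           // makes the new documents visible to searches
        if (optimizeAtEnd) {
            client.optimize();     // optional: the step this post is asking about
        }
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            docs.add("doc-" + i);
        }
        RecordingClient client = new RecordingClient();
        runCycle(client, docs, 1000, true);
        // prints [add:1000, add:1000, add:500, commit, optimize]
        System.out.println(client.calls);
    }
}
```

The optimize call is kept behind a flag because whether it is needed after every cycle is exactly the open question above.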