Your indexing process looks fine; there's no reason to
change it.

Optimizing is _probably_ unnecessary altogether. In fact, in the 4.x
world it was renamed to "forceMerge" to make it seem less
attractive (I mean, who wouldn't want an optimized index?).

That said, the batch indexing process has nothing at all to
do with optimization. Nothing in the process of adding docs
to a server will trigger an optimize.

In your case, since your index only changes once a week, optimizing
after the batch index is done will help your performance a little
(but perhaps so little you won't notice).

In short, your process seems fine. Indexes are never optimized
unless you explicitly do it. After all, how would Solr know that
you are done with your batch indexing?
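
For what it's worth, an explicit optimize is just a request to the
update handler. Here is a minimal sketch in plain Java that only builds
that request URL. The host, port and collection name are placeholders I
made up, and the class and method names are hypothetical. It assumes the
update handler's optimize, maxSegments and waitSearcher parameters.

```java
/** Sketch: builds the update-handler URL that asks Solr to optimize a collection. */
final class OptimizeRequest {

    private OptimizeRequest() {}

    /**
     * @param baseUrl     Solr root, e.g. "http://localhost:8983/solr" (placeholder)
     * @param collection  collection name (placeholder, assumed to need no URL escaping)
     * @param maxSegments target segment count after the merge; 1 means a single segment
     */
    static String buildUrl(String baseUrl, String collection, int maxSegments) {
        if (maxSegments < 1) {
            throw new IllegalArgumentException("maxSegments must be >= 1");
        }
        // Tolerate a trailing slash on the base URL.
        String root = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return root + "/" + collection
                + "/update?optimize=true"
                + "&maxSegments=" + maxSegments
                + "&waitSearcher=true";
    }

    public static void main(String[] args) {
        // Print the URL a batch job could request after its final hard commit.
        System.out.println(buildUrl("http://localhost:8983/solr", "collection1", 1));
    }
}
```

Your batch job would request that URL once, after the final hard commit.
If you would rather stay in SolrJ, the server object has an optimize()
method that does the same thing.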

Best,
Erick

On Tue, Jun 24, 2014 at 5:32 AM, RadhaJayalakshmi
<rlakshminaraya...@inautix.co.in> wrote:
> I am using Solr 4.5.1. I have two collections:
>     Collection 1 - 2 shards, 3 replicas (Size of Shard 1 - 115 MB,
>     Size of Shard 2 - 55 MB)
>     Collection 2 - 2 shards, 3 replicas (Size of Shard 1 - 3.5 GB,
>     Size of Shard 2 - 1 GB)
>
> I have a batch process that performs indexing (full refresh) - once a week
> on the same index.
>
> Here is some information on how I index:
> a) I use SolrJ's bulk ADD API for indexing - CloudSolrServer.add(Collection
> docs).
> b) I have an autoCommit (hardcommit) setting for both my Collections
> (solrConfig.xml):
>     <autoCommit>
>         <maxDocs>100000</maxDocs>
>         <openSearcher>false</openSearcher>
>     </autoCommit>
> c) I do a programmatic hardcommit at the end of the indexing cycle - with an
> open searcher of "true" - so that the documents show up in the Search
> Results.
> d) I neither programmatically soft commit (nor have any autoSoftCommit
> settings) during the batch indexing process
> e) When I re-index all my data again (the following week) into the same
> index - I don't delete existing docs. Rather, I just re-index into the same
> Collection.
> f) I am using the default mergefactor of 10 in my solrconfig.xml
>     <mergeFactor>10</mergeFactor>
>
> Here is what I am observing:
> 1) After a batch indexing cycle - the segment count for each shard / core
> is pretty high. The Solr Dashboard reports segment counts between 8 - 30
> segments on the various cores.
> 2) Sometimes the Solr Dashboard shows the status of my Core as - NOT
> OPTIMIZED. This I find unusual - since I have just finished a Batch indexing
> cycle - and would assume that the Index should already be optimized - Is
> this happening because I don't delete my docs before re-indexing all my data
> ?
> 3) After I run an optimize on my Collections - the segment count does reduce
> significantly - to 1 segment.
>
> Am I doing indexing the right way ? Is there a better strategy ?
>
> Is it necessary to perform an optimize after every batch indexing cycle ??
>
> The outcome I am looking for is that I need an optimized index after every
> major Batch Indexing cycle.
>
> Thanks!!
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Does-one-need-to-perform-an-optimize-soon-after-doing-a-batch-indexing-using-SolrJ-tp4143686.html
> Sent from the Solr - User mailing list archive at Nabble.com.
