Re: Parallel optimize of index on SolrCloud.

2014-07-09 Thread Mark Miller
I think that’s pretty much a search time param, though it might end up being used on the update side as well. In any case, I know it doesn’t affect commit or optimize. Also, to my knowledge, SolrCloud optimize support was never explicitly added or tested. -- Mark Miller about.me/markrmiller On

Re: Parallel optimize of index on SolrCloud.

2014-07-09 Thread Shawn Heisey
On 7/9/2014 8:49 AM, Timothy Potter wrote:
> Hi Modassar,
>
> Have you tried hitting the cores for each replica directly (instead of
> using the collection)? i.e. if you had col_shard1_replica1 on node1,
> then send the optimize command to that core URL directly:
>
> curl -i -v "http://host:port/so

Re: Parallel optimize of index on SolrCloud.

2014-07-09 Thread Timothy Potter
Hi Modassar, Have you tried hitting the cores for each replica directly (instead of using the collection)? i.e. if you had col_shard1_replica1 on node1, then send the optimize command to that core URL directly: curl -i -v "http://host:port/solr/col_shard1_replica1/update" -H 'Content-type:applic
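The per-core approach suggested here can be sketched as a small script that sends the optimize request to each core's update URL from its own thread. This is only an illustration under assumptions: the node names, port, and the second and third core names in the usage note are hypothetical, and it uses the `optimize=true` request parameter on `/update` rather than the request body from the curl command above. The thread does not settle whether a core in SolrCloud still forwards the command to other replicas, so treat the parallelism as something to verify on a test cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(core_url):
    """Build the update URL that asks a single core to optimize."""
    params = urlencode({"optimize": "true"})
    return f"{core_url.rstrip('/')}/update?{params}"


def optimize_core(core_url):
    """Send the optimize request to one core and return the HTTP status."""
    # Blocks until Solr answers, which can take a long time on a large index.
    with urlopen(optimize_url(core_url)) as resp:
        return resp.status


def optimize_all(core_urls, send=optimize_core):
    """Call `send` for every core URL concurrently, one thread per core.

    `send` is injectable so the fan-out can be exercised without a live Solr.
    Returns a dict mapping each core URL to the value returned by `send`.
    """
    workers = max(1, len(core_urls))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(core_urls, pool.map(send, core_urls)))
```

A call might look like `optimize_all(["http://node1:8983/solr/col_shard1_replica1", "http://node2:8983/solr/col_shard2_replica1", "http://node3:8983/solr/col_shard3_replica1"])`, with the URLs replaced by the real core URLs of the cluster.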

Re: Parallel optimize of index on SolrCloud.

2014-07-09 Thread Modassar Ather
Hi All, Thanks for your kind suggestions and inputs. We have been going the optimize way and it has helped. Testing and benchmarking around memory and performance have already been done. While optimizing, we see scope for improvement by doing it in parallel, so kindly suggest in what w

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Shalin Shekhar Mangar
Hi Walter, I wonder why you think SolrCloud isn't necessary if you're indexing once per week. Isn't the automatic failover and auto-sharding still useful? One can also do custom sharding with SolrCloud if necessary. On Wed, Jul 9, 2014 at 11:38 AM, Walter Underwood wrote: > More memory or fast

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Walter Underwood
More memory or faster disks will make a much bigger improvement than a forced merge. What are you measuring? If it is average query time, that is not a good measure. Look at 90th or 95th percentile. Test with queries from logs. No user can see a 10% or 20% difference. If your managers are watch
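To illustrate why the average can mislead, here is a minimal sketch that compares the mean with the 90th and 95th percentiles. The query times are made-up sample values, not measurements from this thread, and the nearest-rank percentile definition is one common choice among several.

```python
import statistics


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling of pct/100 * n, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical query times in milliseconds: mostly fast, with two slow outliers.
query_times_ms = [20] * 18 + [400, 900]

mean_ms = statistics.mean(query_times_ms)
p90_ms = percentile(query_times_ms, 90)
p95_ms = percentile(query_times_ms, 95)
print(f"mean={mean_ms} ms, p90={p90_ms} ms, p95={p95_ms} ms")
```

In this sample the mean is pulled up by two slow queries while the 95th percentile shows how slow the worst requests actually are, which is closer to what users notice.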

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Modassar Ather
Our index has almost 100M documents and runs on a SolrCloud setup of 3 shards; each shard has an index size of about 700GB (for the record, we are not using stored fields - our documents are pretty large). We perform a full indexing every weekend and during the week there are no updates made to the index

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Walter Underwood
I seriously doubt that you are required to force merge. How much improvement? And is the big performance cost also OK? I have worked on search engines that do automatic merges and offer forced merges for over fifteen years. For all that time, forced merges have usually caused problems. Stop do

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Modassar Ather
Thanks Walter for your inputs. Our use case and performance benchmark require us to invoke optimize. Here we see a chance of improvement in performance of optimize() if invoked in parallel. I found that if *distrib=false* is used, the optimization will happen in parallel. But I could not find a

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Walter Underwood
You probably do not need to force merge (mistakenly called "optimize") your index. Solr does automatic merges, which work just fine. There are only a few situations where a forced merge is even a good idea. The most common one is a replicated (non-cloud) setup with a full reindex every night.

Parallel optimize of index on SolrCloud.

2014-07-08 Thread Modassar Ather
Hi, I need to optimize an index created using the CloudSolrServer APIs under a SolrCloud setup of 3 instances on separate machines. Currently it optimizes sequentially if I invoke cloudSolrServer.optimize(). To make it parallel I tried making three separate HttpSolrServer instances and invoked httpSolrServe