: We have SolrCloud cluster (5 shards and 2 replicas) on 10 boxes with 500 
: million documents. We're using custom sharding where we direct all
: documents with specific business date to specific shard.

        ...

: How do we optimize documents for all shards in Solr Cloud? Do we have to 
: fire five different optimize commands to all five leaders? Also, looks 

Commands like optimize and deleteByQuery are automatically propagated to 
all shards -- you only need to send the command to one node in the 
collection.

: like optimize will be going away and might no longer be necessary - see 
: SOLR-3141<https://issues.apache.org/jira/browse/SOLR-3141> Is that true? 

It's still up for debate, and as you can see from the comments it hasn't 
had much traction lately.  Even if, at some point in the future, sending a 
command named "optimize" ceases to work, the underlying functionality of 
being able to say "force merge down to N segments" will always exist under 
some name, provided you don't go out of your way to use a MergePolicy that 
ignores that command.
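
For illustration, here is a minimal sketch of building such a request 
against the update handler of a single node, using the optimize and 
maxSegments request parameters. The host, port and collection name are 
hypothetical placeholders, and optimize_url is just a helper name made up 
for this example; it only builds the URL and does not contact Solr.

```python
from urllib.parse import urlencode


def optimize_url(base_url, collection, max_segments=1):
    """Build an update-handler URL asking Solr to merge down to max_segments.

    base_url and collection are placeholders for your own setup.
    """
    if max_segments < 1:
        raise ValueError("max_segments must be >= 1")
    params = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


# Hypothetical node and collection; any one node in the collection will do.
print(optimize_url("http://localhost:8983/solr", "collection1", max_segments=2))
```

Fetching that URL (with curl, for example) against any one node should be 
enough, since the command is distributed to the other shards.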

: With Solr 3.6 we used following curl command to purge documents. Now 
: with multiple shards can we still use the same command? We will 

As mentioned above, a deleteByQuery command can be sent to a single node 
and it will be propagated automatically.
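
As a rough sketch (not your original curl command, which isn't shown 
here), a deleteByQuery could be prepared like this using the JSON update 
format. The host, collection name, the business_date field and the date 
range are all assumptions for the example, and delete_by_query_request is 
a made-up helper name.

```python
import json
from urllib.request import Request


def delete_by_query_request(base_url, collection, query):
    """Prepare (but do not send) a POST that deletes all docs matching query."""
    body = json.dumps({"delete": {"query": query}}).encode("utf-8")
    url = f"{base_url.rstrip('/')}/{collection}/update?commit=true"
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical node, collection, field name and cutoff date.
req = delete_by_query_request(
    "http://localhost:8983/solr",
    "collection1",
    "business_date:[* TO 2013-01-01T00:00:00Z]",
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```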

However: if you are already using custom sharding to shard by date, then a 
blanket deleteByQuery across all shards may not be necessary -- you may 
find it easier/faster/cleaner to just delete the shards you no longer need 
as the data in them "expires" ...

https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-DeleteaShard
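
A minimal sketch of what such a call might look like, using the 
DELETESHARD action of the Collections API. The host, collection name and 
shard name below are hypothetical, and delete_shard_url is a helper name 
invented for the example; it only builds the URL.

```python
from urllib.parse import urlencode


def delete_shard_url(base_url, collection, shard):
    """Build a Collections API URL that deletes one shard of a collection."""
    params = urlencode(
        {"action": "DELETESHARD", "collection": collection, "shard": shard}
    )
    return f"{base_url.rstrip('/')}/admin/collections?{params}"


# Hypothetical names: a shard holding one expired business date range.
print(delete_shard_url("http://localhost:8983/solr", "collection1", "shard_2013_01"))
```

Check the linked page for the conditions under which a shard can be 
deleted in your Solr version before relying on this.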

-Hoss
