On 2/27/2015 7:15 AM, tuxedomoon wrote:
> I currently have a SolrCloud with 3 shards + replicas, it is holding 130M
> documents and the r3.large hosts are running out of memory. As it's on 4.2
> there is no shard splitting, I will have to reindex to a 4.3+ version.
> 
> If I had that feature would I need to split each shard into 2 subshards
> resulting in a total of 6 subshards, in order to keep all shards relatively
> equal?
> 
> And since host memory is the problem I'd be migrating subshards to new
> hosts. So it seems I'd be going from 6 hosts to 12.  Are these assumptions
> correct or is there a way to avoid doubling my host count?

All shards that result from a split will reside on the same host(s) as
the original shard.

If you are splitting shards because of memory problems, it is normally a
good idea to add hosts and then use ADDREPLICA and DELETEREPLICA to move
your shard replicas around ... but that's not strictly required.  You
may not need a strict doubling of hosts ... adding 1 or 2 may be enough.
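
As a rough sketch of the move described above, the two Collections API
calls could be built like this.  The base URL, collection name, shard
name, node name and replica name are all made up for illustration, so
substitute what your own clusterstate shows.  The script only prints
the URLs and does not contact Solr, and both actions need a Solr
version newer than 4.2, so check the reference guide for your release.

```python
from urllib.parse import urlencode

# Hypothetical base URL of any node in the cloud.
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(action, **params):
    """Build a Collections API URL; nothing is sent over the network."""
    query = {"action": action}
    query.update(params)
    return SOLR_BASE + "/admin/collections?" + urlencode(query)


def move_replica_urls(collection, shard, new_node, old_replica):
    """Return the two calls in order: add the new copy, then drop the old one."""
    add = collections_api_url(
        "ADDREPLICA", collection=collection, shard=shard, node=new_node
    )
    delete = collections_api_url(
        "DELETEREPLICA", collection=collection, shard=shard, replica=old_replica
    )
    return [add, delete]


if __name__ == "__main__":
    # All four values below are assumptions, not taken from a real cluster.
    for url in move_replica_urls(
        "mycollection", "shard1_0", "newhost1:8983_solr", "core_node3"
    ):
        print(url)
```

Wait for the new replica to finish recovery and show as active before
sending the DELETEREPLICA call, otherwise you can temporarily lose
redundancy for that shard.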

Because it is a lot cleaner, I recommend building a new collection and
reindexing to change the number of shards and hosts.  You should be able
to use your existing collection without interruption until you're ready
to switch ... and if you do not want to reconfigure your application,
you can delete the old collection and set up an alias that points the
original collection name to the new collection.  Coordinating index
updates to make sure the new collection is completely up to date can be
challenging.
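
The switch described above might look something like the following
sketch.  The collection names and base URL are hypothetical, and the
script only prints the URLs rather than sending them.  The order
matters: the old collection is deleted first so that its name is free
to be used as the alias.

```python
from urllib.parse import urlencode

# Hypothetical base URL of any node in the cloud.
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(action, **params):
    """Build a Collections API URL; nothing is sent over the network."""
    query = {"action": action}
    query.update(params)
    return SOLR_BASE + "/admin/collections?" + urlencode(query)


def switch_urls(old_collection, new_collection):
    """Return the calls in order: delete the old collection, then alias its name."""
    delete_old = collections_api_url("DELETE", name=old_collection)
    alias = collections_api_url(
        "CREATEALIAS", name=old_collection, collections=new_collection
    )
    return [delete_old, alias]


if __name__ == "__main__":
    # "products" and "products_v2" are placeholder names.
    for url in switch_urls("products", "products_v2"):
        print(url)
```

Between the two calls the original name will not resolve to anything,
so requests sent during that window can fail.  Only run the DELETE once
you are sure the new collection is fully up to date.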

If you are having memory problems, be prepared for those memory problems
to get at least a little bit worse (and maybe a lot worse) while
splitting shards or building a new collection.

Thanks,
Shawn
