Thanks for your suggestions and recommendations.
If I understand correctly, the MIGRATE command does shard splitting
(around the range of the split.key) and merging behind the scenes.
However, it's a bit difficult to properly monitor the actual migration,
set the proper timeouts, know when to direct indexing and search traffic
to the destination collection, etc.
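For the monitoring part, one option might be to submit the MIGRATE call asynchronously and poll its status. Below is a minimal sketch that only builds the two URLs, assuming a Solr version whose Collections API accepts the async parameter on MIGRATE and exposes REQUESTSTATUS. The base URL, collection names, split key and request id are all made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL; point it at any node of the cluster.
SOLR_BASE = "http://localhost:8983/solr"


def migrate_url(source, target, split_key, request_id, forward_timeout=60):
    """Build a Collections API MIGRATE request that runs asynchronously."""
    params = {
        "action": "MIGRATE",
        "collection": source,
        "target.collection": target,
        "split.key": split_key,
        # Seconds during which writes for split.key are forwarded to the target.
        "forward.timeout": forward_timeout,
        # Caller-chosen id used later to look up the status of this request.
        "async": request_id,
    }
    return SOLR_BASE + "/admin/collections?" + urlencode(params)


def request_status_url(request_id):
    """Build the REQUESTSTATUS call that reports on an async request."""
    params = {"action": "REQUESTSTATUS", "requestid": request_id}
    return SOLR_BASE + "/admin/collections?" + urlencode(params)


if __name__ == "__main__":
    print(migrate_url("src", "dst", "a!", "migrate-1"))
    print(request_status_url("migrate-1"))
```

Polling the second URL until the reported state is no longer running would give a rough signal for when to switch traffic, though it says nothing about documents indexed after forward.timeout expires.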
Not sure how to MIGRATE an entire collection. By providing the full
list of split.keys? I'd be surprised if that was doable, but I guess it
will skip the splitting part, which makes it easier ;-) Or much tougher
by splitting around all the ranges. More seriously, doing a MERGEINDEX
at the core level might not be a bad alternative, provided the hash
ranges are compatible.
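As a rough illustration of the core-level route, here is a sketch that builds a CoreAdmin MERGEINDEXES request using the srcCore form. The base URL and core names are hypothetical, and nothing here checks that the hash ranges match: that remains an assumption to verify by hand in the cluster state.

```python
from urllib.parse import urlencode

# Hypothetical base URL of the node hosting the target core.
SOLR_BASE = "http://localhost:8983/solr"


def merge_indexes_url(target_core, source_cores):
    """Build a CoreAdmin MERGEINDEXES request merging source cores into target_core."""
    # srcCore may be repeated, so use a list of pairs rather than a dict.
    params = [("action", "MERGEINDEXES"), ("core", target_core)]
    params += [("srcCore", name) for name in source_cores]
    return SOLR_BASE + "/admin/cores?" + urlencode(params)


if __name__ == "__main__":
    # Core names below are made up for illustration.
    print(merge_indexes_url("new_shard1_replica1",
                            ["old_shard1_replica1", "old_shard2_replica1"]))
```

The source cores need to be on the same node as the target for this form, and a commit on the target core is needed afterwards before the merged documents become searchable.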
Damien
On 07/07/2014 05:14 PM, Shawn Heisey wrote:
I don't think you'd want to disable mmap. It could be done, by choosing
another DirectoryFactory object. Adding memory is likely to be the only
sane way forward.
Another possibility would be to bump up the maxShardsPerNode value and
build the new collection (with the proper number of shards) only on the
new machines... Then when they are built, move them to their proper homes
and manually adjust the cluster state in ZooKeeper. This will still
generate a lot of I/O, but hopefully it will last for less time on the
wall clock, and it will be something you can do when load is low.
After that's done and you've switched to it, you can add replicas with
either the ADDREPLICA Collections API or with the CoreAdmin API. You
should be on the newest Solr version... Lots of bugs have been found and
fixed.
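The build-then-replicate approach described above could be sketched as two Collections API calls: CREATE restricted to the new machines with a raised maxShardsPerNode, then ADDREPLICA once the collection is in place. This is only an illustration that builds the URLs. The base URL, collection name, node names and shard counts are assumptions, and ADDREPLICA requires a Solr version that ships it.

```python
from urllib.parse import urlencode

# Hypothetical base URL; point it at any node of the cluster.
SOLR_BASE = "http://localhost:8983/solr"


def create_url(name, num_shards, max_shards_per_node, nodes):
    """Build a CREATE request that places all shards on the given nodes only."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "maxShardsPerNode": max_shards_per_node,
        # Comma-separated node names as they appear under live_nodes.
        "createNodeSet": ",".join(nodes),
    }
    return SOLR_BASE + "/admin/collections?" + urlencode(params)


def add_replica_url(collection, shard, node):
    """Build an ADDREPLICA request for one shard on one specific node."""
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "node": node,
    }
    return SOLR_BASE + "/admin/collections?" + urlencode(params)


if __name__ == "__main__":
    # Node names and counts below are made up for illustration.
    new_nodes = ["new1:8983_solr", "new2:8983_solr"]
    print(create_url("big_v2", 8, 4, new_nodes))
    print(add_replica_url("big_v2", "shard1", "old1:8983_solr"))
```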
One thing I wonder is whether the MIGRATE api can be used on an entire
collection. It says it works by shard key, but I suspect that most users
will not be using that functionality.
Thanks,
Shawn