On 1/4/2017 6:23 AM, Hendrik Haddorp wrote:
> Problem is that we would like to run without down times. Rolling
> updates worked fine so far except when creating a collection at the
> wrong time. I just did another test with stateFormat=2. This seems to
> greatly improve the situation. One collection creation got stuck but
> other creations still worked and after a restart of some nodes the
> stuck collection creation also looked ok. For some reason it just
> resulted in two replicas for the same shard getting assigned to the
> same node even though I specified a rule of "shard:*,replica:<2,node:*".

I have no idea what that rule means or where you might be configuring
it. That must be for a feature that I've never used.

If you're going to restart nodes, then don't create collections at that
moment. Wait until after the restart is completely finished. If it's
all automated ... then design your tools so that they do not create
collections and do the restarts at the same time.

Thanks,
Shawn
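
One way the "wait until the restart is completely finished" advice could be built into an automated tool is sketched below. This is only an illustration, not something tested against a real cluster. It assumes the tool can obtain the cluster state as a parsed JSON dict (for example from the Collections API CLUSTERSTATUS action) and that the dict has a "cluster" object with "live_nodes" and per-replica "state" fields; that layout is an assumption to verify against your Solr version. The names cluster_is_settled, wait_until_settled and fetch_status are hypothetical.

```python
import time


def cluster_is_settled(status, expected_nodes):
    """Return True when every expected node is live and every replica is active.

    `status` is assumed to be the parsed JSON of a CLUSTERSTATUS response;
    the key layout used here is an assumption, not a verified contract.
    """
    cluster = status.get("cluster", {})
    live = set(cluster.get("live_nodes", []))
    if not set(expected_nodes) <= live:
        return False  # at least one node is still down or restarting
    for collection in cluster.get("collections", {}).values():
        for shard in collection.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                if replica.get("state") != "active":
                    return False  # a replica is still recovering or down
    return True


def wait_until_settled(fetch_status, expected_nodes,
                       timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll the hypothetical fetch_status() callable until the cluster settles.

    Returns True once the cluster is settled, or False if `timeout`
    seconds pass first. The status is always checked at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if cluster_is_settled(fetch_status(), expected_nodes):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

The tool would call wait_until_settled() after the last node of a rolling restart comes back, and only send the collection CREATE request if it returns True. The restart job and the creation job would still need to be serialized with each other (a shared lock or a single queue), since this check only covers restarts that have already happened.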