Hi Ryan, yes, you can change a number of settings. Have you had a look at http://docs.basho.com/riak/kv/2.1.4/using/admin/riak-admin/#transfer-limit and http://lists.basho.com/pipermail/riak-users_lists.basho.com/2014-July/015529.html ?
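As a rough sanity check on the figures in the quoted `riak-admin transfers` output, the sketch below estimates how long this one partition would need to transfer without interruption. It is only an illustration: the function name and the `bytes_per_mb` parameter are hypothetical, and it assumes that "MB/s" means 2^20 bytes per second (pass 10**6 if it is decimal) and that the reported rate stays roughly constant.

```python
def estimate_handoff_hours(total_bytes, rate_mb_per_s, fraction_done=0.0,
                           bytes_per_mb=1024 * 1024):
    """Estimate the hours needed to move the remaining part of a partition.

    total_bytes   -- "total size" reported by riak-admin transfers
    rate_mb_per_s -- reported throughput, e.g. 1.53
    fraction_done -- portion already transferred, e.g. 0.15 for 15%
    bytes_per_mb  -- assumption: 2**20; use 10**6 for decimal megabytes
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    remaining_bytes = total_bytes * (1.0 - fraction_done)
    seconds = remaining_bytes / (rate_mb_per_s * bytes_per_mb)
    return seconds / 3600.0


# Figures taken from the quoted transfers output.
full = estimate_handoff_hours(78393086512, 1.53)
left = estimate_handoff_hours(78393086512, 1.53, fraction_done=0.15)
print("whole partition: %.1f h, remaining 85%%: %.1f h" % (full, left))
```

Under either reading of "MB", the whole partition works out to somewhere between 13 and 15 hours at this rate, so a handoff that hits a timeout after minutes or a few hours is unlikely to finish unless the timeout is avoided or the transfer rate goes up.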
-Alexander

On Tue, Nov 1, 2016 at 2:43 AM, Ryan Maclear <[email protected]> wrote:
> Good Day,
>
> We have a 4 node riak cluster running inside AWS. The riak is riak-kv 2.1.2
> with AAE enabled on Ubuntu 14.04.4 LTS
>
> We are in the process of replacing one node with another using the process
> described here:
>
> http://docs.basho.com/riak/kv/2.1.4/using/cluster-operations/replacing-node/
>
> We have successfully replaced two of the nodes so far but we are having a
> problem with the third. If we look at /var/log/riak/console.log we see the
> start of the hinted handoff, and some time later (sometimes minutes and
> sometimes hours) we see:
>
> 2016-10-31 06:30:40.090 [error]
> <0.19834.2101>@riak_core_handoff_sender:start_fold:272 hinted transfer of
> riak_kv_vnode from '[email protected]'
> 274031556999544297163190906134303066185487351808 to
> '[email protected]'
> 274031556999544297163190906134303066185487351808 failed because of TCP recv
> timeout
> 2016-10-31 06:30:40.090 [error]
> <0.187.0>@riak_core_handoff_manager:handle_info:303 An outbound handoff of
> partition riak_kv_vnode 274031556999544297163190906134303066185487351808 was
> terminated for reason: {shutdown,timeout}
>
> So the handoff was terminated due to a tcp timeout. The handoff then starts
> again.
>
> This has been going on for some time (about two weeks now).
>
> The current member status is as follows:
>
> riak-admin member-status
> ================================= Membership ==================================
> Status     Ring    Pending    Node
> -------------------------------------------------------------------------------
> leaving     0.0%      --      '[email protected]'
> valid      25.0%      --      '[email protected]'
> valid      25.0%      --      '[email protected]'
> valid      25.0%      --      '[email protected]'
> valid      25.0%      --      '[email protected]'
> -------------------------------------------------------------------------------
> Valid:4 / Leaving:1 / Exiting:0 / Joining:0 / Down:0
>
>
> Here are some questions:
>
> 1. What is the default tcp timeout?
> 2. Is there any way to increase this timeout?
> 3. Is there any way to increase the rate of handoff?
> 4. Are there any other parameters we can tune to try and avoid this?
>
> The output from riak-admin transfers is as follows:
>
> '[email protected]' waiting to handoff 1 partitions
>
> Active Transfers:
>
> transfer type: hinted
> vnode type: riak_kv_vnode
> partition: 274031556999544297163190906134303066185487351808
> started: 2016-11-01 05:30:47 [2.10 hr ago]
> last update: 2016-11-01 07:36:51 [3.03 s ago]
> total size: 78393086512 bytes
> objects transferred: 11440967
>
> 1513 Objs/s
> [email protected] =======> [email protected]
> et et
> |====== | 15%
> 1.53 MB/s
>
>
> Thanks,
> Ryan Maclear
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
