Re: [go-nuts] Designing a recurring event with timeouts using channels

Arya Fri, 09 Apr 2021 13:06:03 -0700

This was very helpful. I was somewhat apprehensive about using channels for 
this since I noticed a few other Raft implementations have leveraged 
"shared-memory" instead of message passing to model the protocol. 
Validation of my approach really helps with making progress on this front. 
It hadn't occurred to me that I could use HdrHistogram to analyze some of the 
core metrics, so I learnt something there.
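
For anyone following along, here is a minimal sketch of the measurements 
suggested in the quoted reply below: a count of elections, a count of failed 
elections, and a histogram of time-to-elect. It assumes a plain fixed-bucket 
histogram from the standard library as a stand-in for HdrHistogram, and every 
name in it (electionStats, observe, snapshot, the bucket bounds) is 
hypothetical rather than taken from my code.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// electionStats is a hypothetical metrics holder. The fixed buckets are a
// simple stand-in for a real histogram such as HdrHistogram.
type electionStats struct {
	mu      sync.Mutex
	started int
	failed  int
	bounds  []time.Duration // ascending upper bounds, one per bucket
	buckets []int           // len(bounds)+1 entries; the last one is overflow
}

func newElectionStats(bounds ...time.Duration) *electionStats {
	return &electionStats{bounds: bounds, buckets: make([]int, len(bounds)+1)}
}

// observe records one finished election. Only successful elections
// contribute their duration to the time-to-elect histogram.
func (s *electionStats) observe(d time.Duration, won bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.started++
	if !won {
		s.failed++
		return
	}
	i := 0
	for i < len(s.bounds) && d > s.bounds[i] {
		i++
	}
	s.buckets[i]++
}

// snapshot returns a copy of the current counters and bucket counts.
func (s *electionStats) snapshot() (started, failed int, buckets []int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.started, s.failed, append([]int(nil), s.buckets...)
}

func main() {
	stats := newElectionStats(10*time.Millisecond, 100*time.Millisecond, time.Second)
	stats.observe(4*time.Millisecond, true)
	stats.observe(150*time.Millisecond, true)
	stats.observe(2*time.Second, false)

	started, failed, buckets := stats.snapshot()
	fmt.Println(started, failed, buckets) // prints: 3 1 [1 0 1 0]
}
```

A mutex keeps the sketch easy to reason about; since elections are rare, lock 
contention here should not matter.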


As far as the TLA+ spec for Raft goes, I found a spec by one of the 
creators of the protocol 
(https://github.com/ongardie/raft.tla/blob/master/raft.tla). I'm hoping that 
will suffice for validating the correctness of the implementation.

Thanks for your thoughtful feedback.

On Friday, April 9, 2021 at 7:14:35 AM UTC-7 [email protected] wrote:

> On Thu, Apr 8, 2021 at 8:15 PM Arya <[email protected]> wrote:
>
>> There are a few things that I am worried about in the above given 
>> snippet. The first thing is whether I am in alignment with golang's idioms 
>> while modeling this process using channels and the second thing is where I 
>> am resetting the voteStatusChan by creating channels within the poll loop. 
>> Something about creating new channels for every tick of the timer seems to 
>> be wasteful (I don't have a particularly nuanced understanding of how 
>> unbounded channel creations will tax the garbage collector or if there are 
>> other dangers lurking in the corner by taking this approach)
>>
>>
> While I have a somewhat overarching idea of the Raft protocol, I can't 
> comment on its correctness because I'm not too well versed in the 
> particular code base.
>
> However, I can comment on the worries: a channel does cost you garbage 
> that will eventually require collection. However, if the GC pressure this 
> creates is fairly small, it is unlikely to have any impact. There's clearly 
> a limit at which it becomes too expensive, but a few channels created per 
> second is highly unlikely to be measurable. You should mostly be worried if 
> the channel creation is in a tight loop and the tick interval is shorter 
> than a millisecond (say). A large tick window, and situations where there is 
> human interaction in the loop, shouldn't pose too much of a worry. The other 
> case is if a pathological reelection loop starts generating thousands of 
> channels, but again, this is unlikely. In particular, one-shot channels are 
> somewhat common where a channel is used for one single interaction, so 
> generating a few shouldn't be a problem. Another way of looking at this is: 
> how much GC pressure does the rest of the program generate? It might be that 
> other parts of the program dominate GC pressure, and thus we can slightly 
> modify Amdahl's law and argue that's where your effort should be.
>
> In the above code, I don't really see any place where channels are created, 
> except when you run an election. If memory serves, Raft doesn't run too 
> many leader elections under normal operation.
>
> As for the channel approach: channels have an advantage over a lot of 
> "smarter" or "faster" approaches in that they are often easier to reason 
> about in code, and for a consensus algorithm, you should probably worry 
> about correctness before speed: it's easy to create a system which is very 
> very very fast and also incorrect, to slightly paraphrase Joe Armstrong. 
> Were I to implement the Raft protocol from scratch in Go, I'd probably reach 
> for a channel approach first (and I would definitely sit with a TLA+ 
> specification for reference).
>
> The key thing I'd work on is to establish a frame of reference through 
> some careful measurement. There is a quantitative point where you hit "good 
> enough for our use case(tm)", and as long as you stay within that, you are 
> likely to succeed, provided your solution is correct. If you have 
> measurements, you know whether you are operating within the boundaries of 
> the frame or not. Count the number of leader elections. Measure the time to 
> elect a leader and dump it into some histogram approximation. Count the 
> number of failed elections. Bumping a counter or updating an HdrHistogram 
> should be cheap. And thinking in terms of system observability is generally 
> valuable.
>
> -- 
> J.
>
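
A minimal sketch of the one-shot channel pattern discussed in the quoted 
reply: each election round gets a fresh channel, and the caller selects on it 
against a timeout. Everything here (voteResult, requestVotes, runElection, the 
delays) is hypothetical and not taken from the original snippet; the vote 
request fan-out is simulated with a sleep.

```go
package main

import (
	"fmt"
	"time"
)

// voteResult is a hypothetical outcome of a single election round.
type voteResult struct {
	term    int
	granted bool
}

// requestVotes stands in for the real RPC fan-out. It returns a one-shot
// channel created for this round only. The channel has capacity 1 so the
// sending goroutine never blocks, even if the receiver already timed out.
func requestVotes(term int, delay time.Duration) <-chan voteResult {
	ch := make(chan voteResult, 1)
	go func() {
		time.Sleep(delay) // simulated network latency
		ch <- voteResult{term: term, granted: true}
	}()
	return ch
}

// runElection waits for the round's result or gives up after timeout.
// The boolean reports whether a result arrived in time.
func runElection(term int, delay, timeout time.Duration) (voteResult, bool) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case res := <-requestVotes(term, delay):
		return res, true
	case <-timer.C:
		return voteResult{term: term}, false
	}
}

func main() {
	// A round that answers well before the timeout.
	res, ok := runElection(1, 5*time.Millisecond, 500*time.Millisecond)
	fmt.Println(res.term, res.granted, ok)

	// A round that is abandoned because the timeout fires first.
	res, ok = runElection(2, 500*time.Millisecond, 5*time.Millisecond)
	fmt.Println(res.term, res.granted, ok)
}
```

An abandoned channel simply becomes garbage once its goroutine finishes, which 
is the per-round cost the reply describes as negligible at a few channels per 
second.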

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/3f17a0d6-2a90-40d2-8b5e-170617d23899n%40googlegroups.com.
