This was very helpful. I was somewhat apprehensive about using channels for this, since I noticed a few other Raft implementations have leveraged "shared-memory" instead of message passing to model the protocol. Validation of my approach really helps with making progress on this front. It didn't occur to me that I can use HDRHistogram to analyze some of the core metrics, so I learnt something there.
As far as the TLA+ spec for Raft goes, I found a spec by one of the creators of the protocol (https://github.com/ongardie/raft.tla/blob/master/raft.tla). I'm hoping that will suffice for validating the correctness of the implementation. Thanks for your thoughtful feedback.

On Friday, April 9, 2021 at 7:14:35 AM UTC-7 [email protected] wrote:

> On Thu, Apr 8, 2021 at 8:15 PM Arya <[email protected]> wrote:
>
>> There are a few things that I am worried about in the above given
>> snippet. The first thing is whether I am in alignment with golang's idioms
>> while modeling this process using channels and the second thing is where I
>> am resetting the voteStatusChan by creating channels within the poll loop.
>> Something about creating new channels for every tick of the timer seems to
>> be wasteful (I don't have a particularly nuanced understanding of how
>> unbounded channel creations will tax the garbage collector or if there are
>> other dangers lurking in the corner by taking this approach)
>>
>>
> While I have a somewhat overarching idea of the Raft protocol, I can't
> comment on its correctness because I'm not too well versed in the
> particular code base.
>
> However, I can comment on the worries: a channel does cost you garbage,
> that will eventually require collection. However, if the GC pressure this
> provides is fairly small it is unlikely to have any impact. There's clearly
> a limit at which it becomes too expensive, but a few channels created per
> second is highly unlikely to be measurable. You should mostly be worried if
> the channel creation is in a tight loop and the tick resolution is lower
> than milliseconds (say). A large tick window, and situations where there is
> human interaction in the loop shouldn't pose too much of a worry. The other
> case is if a pathological reelection loop starts generating thousands of
> channels, but again, this is unlikely.
> In particular, one-shot channels are
> somewhat common where a channel is used for one single interaction, so
> generating a few shouldn't be a problem. Another way of looking at this is:
> how much GC pressure does the rest of the program provide? It might be that
> other parts of the program dominate GC pressure, and thus we can slightly
> modify Amdahl's law and argue that's where your effort should be.
>
> In the above code, I don't really see any place channels are formed,
> except when you run an election. If memory serves, Raft doesn't run too
> many leader elections under normal operations.
>
> As for the channel approach: channels have an advantage over a lot of
> "smarter" or "faster" approaches in that they are often easier to reason
> about in code, and for a consensus algorithm, you should probably worry
> about correctness before speed: it's easy to create a system which is very
> very very fast and also incorrect, to slightly paraphrase Joe Armstrong.
> Were I to create a Raft protocol from scratch in Go, I'd probably reach for
> a channel approach first (and I would definitely sit with a TLA+
> implementation for reference).
>
> The key thing I'd work on is to establish a frame of reference through
> some careful measurement. There is a quantitative point where you hit "good
> enough for our use case(tm)", and as long as you stay within that, you are
> likely to succeed, provided your solution is correct. If you have
> measurement, you know if you are operating within the boundaries of the
> frame or not. Count number of leader elections. Count time to elect a
> leader and dump it into some histogram approximation. Count the number of
> failed elections. Bumping a counter or updating a HDRHistogram should be
> cheap. And thinking in terms of system observability is generally valuable.
>
> --
> J.
>

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
