Gossip Analysis
Hello, Forgive me if I have missed anything in the obvious locations, but I am trying to find out if anyone has done an analysis of the gossip protocol as implemented in Cassandra? In particular, I am interested in the the theoretical propagation time (to all nodes) of a change. For example, if a single node makes a change, what is the number of gossip cycles that must elapse before we can be sure (to very high probability obviously, not 100% sure) that everyone has seen it. For those familiar, this is obviously an epidemic analysis, but the Scuttlebutt-style protocol makes it a bit more complex. If anyone already has this, it would be much appreciated. Thanks! -Bill Katsak Ph.D. Student Department of Computer Science Rutgers University
Re: Gossip Analysis
There is this very old ticket: https://issues.apache.org/jira/browse/CASSANDRA-617 But note that is gossip simulation, not the actual gossiper Cassandra uses (and also very antiquated.) Using the actual gossiper under simulation is unfortunately complicated by https://issues.apache.org/jira/browse/CASSANDRA-6881 That said, I'm pretty confident in our implementation, after beating out the details the scuttlebutt paper won't tell you over the last years, like how to remove a node. So the number of cycles required to answer you is, yet, not scientifically determined in our implementation, but I feel it's sufficient in most deployments right now, and the knob we have to turn if not is -Dcassandra.ring_delay_ms. On Tue, Sep 30, 2014 at 10:50 AM, William Katsak wrote: > Hello, > > Forgive me if I have missed anything in the obvious locations, but I am > trying to find out if anyone has done an analysis of the gossip protocol as > implemented in Cassandra? In particular, I am interested in the the > theoretical propagation time (to all nodes) of a change. For example, if a > single node makes a change, what is the number of gossip cycles that must > elapse before we can be sure (to very high probability obviously, not 100% > sure) that everyone has seen it. For those familiar, this is obviously an > epidemic analysis, but the Scuttlebutt-style protocol makes it a bit more > complex. > > If anyone already has this, it would be much appreciated. > > Thanks! > -Bill Katsak > Ph.D. Student > Department of Computer Science > Rutgers University > >
Re: Gossip Analysis
Thanks for this info, I will read it now. The idea is, I have a modified version of Cassandra for a specific research application. I want to be able to make a supportable statement like this: After a state change, we wait x*gossip_interval in order to ensure with 0.9 probability that all nodes have seen the change. Thanks, Bill On 09/30/2014 12:02 PM, Brandon Williams wrote: There is this very old ticket: https://issues.apache.org/jira/browse/CASSANDRA-617 But note that is gossip simulation, not the actual gossiper Cassandra uses (and also very antiquated.) Using the actual gossiper under simulation is unfortunately complicated by https://issues.apache.org/jira/browse/CASSANDRA-6881 That said, I'm pretty confident in our implementation, after beating out the details the scuttlebutt paper won't tell you over the last years, like how to remove a node. So the number of cycles required to answer you is, yet, not scientifically determined in our implementation, but I feel it's sufficient in most deployments right now, and the knob we have to turn if not is -Dcassandra.ring_delay_ms. On Tue, Sep 30, 2014 at 10:50 AM, William Katsak wrote: Hello, Forgive me if I have missed anything in the obvious locations, but I am trying to find out if anyone has done an analysis of the gossip protocol as implemented in Cassandra? In particular, I am interested in the the theoretical propagation time (to all nodes) of a change. For example, if a single node makes a change, what is the number of gossip cycles that must elapse before we can be sure (to very high probability obviously, not 100% sure) that everyone has seen it. For those familiar, this is obviously an epidemic analysis, but the Scuttlebutt-style protocol makes it a bit more complex. If anyone already has this, it would be much appreciated. Thanks! -Bill Katsak Ph.D. Student Department of Computer Science Rutgers University