Re: Scope of CASSANDRA-14557 (Default and Minimum RF)
I think it is reasonable that system keyspaces would get initialized with a default replication factor, assuming ones that were already initialized would remain intact (however, this should be the same for user-created keyspaces).

Assuming it doesn't change the current behaviour, and that default and min RF, when unset, act the same way the current version does, the only thing we should probably add is a line in cassandra.yaml noting that the default and minimum replication factors will also apply to system keyspaces.

On Wed, May 13, 2020 at 9:01 PM Sumanth Pasupuleti <sumanth.pasupuleti...@gmail.com> wrote:

> Hi,
>
> Based on Alex's suggestion on the ticket, I wanted to reach out to clarify the current scope of the default and minimum replication factors that 14557 defines, and gather thoughts to farm for dissent.
>
> Both configurations (default and minimum) apply not just to user keyspaces, but also to system keyspaces. For instance, this can be handy in deployments that use authenticated C* clusters, where operators have to "remember" to set the system_auth keyspace's RF to a value higher than 1. In such cases, setting default_rf = 3, for example (which I suppose is common in most deployments), would ensure all the system keyspaces (including system_auth) come up with RF=3.
>
> It can be helpful to note that this patch by default does not cause any change to the replication factors. The reason is that the default values of these configurations are set to [defaultRF=1, minimumRF=0] so as not to induce any changes that folks may not expect; instead, the patch offers knobs to define what a sane default RF should be, and a gate on any new keyspaces being created with an RF lower than minimumRF.
>
> Curious to know your thoughts.
>
> Thanks,
> Sumanth

--
alex p
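A minimal sketch of the semantics being discussed may help readers follow the thread. The class and method names below (ReplicationGuard, effectiveRf, validate) are illustrative assumptions, not the actual 14557 patch; only the behaviour (fall back to a default RF when none is given, reject creation below a minimum) comes from the thread itself:

    // Hedged sketch of the default/minimum RF semantics described in this
    // thread; names are hypothetical and not taken from the real patch.
    public final class ReplicationGuard
    {
        private final int defaultRf;  // assumed to mirror default_rf
        private final int minimumRf;  // assumed to mirror a minimum-RF knob

        public ReplicationGuard(int defaultRf, int minimumRf)
        {
            this.defaultRf = defaultRf;
            this.minimumRf = minimumRf;
        }

        // RF to use when keyspace creation does not specify one.
        public int effectiveRf(Integer requestedRf)
        {
            return requestedRf == null ? defaultRf : requestedRf;
        }

        // Gate: reject any new keyspace below the configured floor.
        public void validate(String keyspace, int rf)
        {
            if (rf < minimumRf)
                throw new IllegalArgumentException(String.format(
                    "RF %d for keyspace %s is below the minimum RF %d",
                    rf, keyspace, minimumRf));
        }
    }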
Re: List of serious issues fixed in 3.0.x
Wanted to add some that I remembered:

* https://issues.apache.org/jira/browse/CASSANDRA-12811 - data resurrection; it was marked as normal because it was discovered with a test. Should've been marked as critical.
* https://issues.apache.org/jira/browse/CASSANDRA-12956 - data loss (commit log isn't replayed on custom 2i exception)
* https://issues.apache.org/jira/browse/CASSANDRA-12144 - undeletable/duplicate rows problem; can be considered data resurrection and/or sstable corruption.

On Thu, May 7, 2020 at 6:55 PM Joshua McKenzie wrote:

> "ML is plaintext bro" - thanks Mick. ಠ_ಠ
>
> Since we're stuck in the late 90's, here are some links to a gsheet:
>
> Defects by month:
> https://docs.google.com/spreadsheets/d/1Qt8lLIiqVvK7mlSML7zsmXdAc-LsvktFW5RXJDRtN8k/edit#gid=1584867240
> Defects by component:
> https://docs.google.com/spreadsheets/d/1Qt8lLIiqVvK7mlSML7zsmXdAc-LsvktFW5RXJDRtN8k/edit#gid=1946109279
> Defects by type:
> https://docs.google.com/spreadsheets/d/1Qt8lLIiqVvK7mlSML7zsmXdAc-LsvktFW5RXJDRtN8k/edit#gid=385136105
>
> On Thu, May 7, 2020 at 12:31 PM Joshua McKenzie wrote:
>
>> Hearing the images got killed by the web server. Trying from gmail (sorry for spam). Time to see if it's the apache smtp server or the list culling images:
>>
>> ---
>> I did a little analysis on this data (any defect marked with fixversion 4.0 that rose to the level of critical in terms of availability, correctness, or corruption/loss) and charted some things the rest of the project community might find interesting:
>>
>> 1: Critical (availability, correctness, corruption/loss) defects fixed per month since about 6 months before 3.11.0:
>> [image: monthly.png]
>>
>> 2: Components in which critical defects arose (note: bright red bar == sum of 3 dark red):
>> [image: Total Defects by Component.png]
>>
>> 3: Type of defect found and fixed (bright red: cluster down or permaloss, dark red: temp corrupt/loss, yellow: incorrect response):
>> [image: Total Defects by Type.png]
>>
>> My personal takeaways from this: a ton of great defect-fixing work has gone into 4.0. I'd love it if we had both code coverage analysis for testing on the codebase as well as data to surface where hotspots of defects are in the code that might need further testing (caveat: many have voiced their skepticism of the value of this type of data in the past in this project community, so that's probably another conversation to have on another thread).
>>
>> Hope someone else finds the above interesting if not useful.
>>
>> --
>> Joshua McKenzie
>>
>> On Thu, May 7, 2020 at 12:24 PM Joshua McKenzie wrote:
>>
>>> On Wed, May 6, 2020 at 3:38 PM Dinesh Joshi wrote:
>>>
>>>> Hi Sankalp,
>>>>
>>>> Thanks for bringing this up. At the very minimum, I hope we have regression tests for the specific issues we have fixed. I personally think the project should focus on building a comprehensive test suite. However, some of these issues can only be detected at scale. We need users to test* C* in their environment for their use-cases. Ideally these folks stand up large clusters and tee their traffic to the new cluster and report issues. If we had an automated test suite that everyone can run at a large scale, that would be even better.
>>>>
>>>> Thanks,
>>>> Dinesh
>>>>
>>>> * test != starting C* in a few nodes and looking at logs.
>>>>
>>>>> On May 6, 2020, at 10:11 AM, sankalp kohli wrote:
>>>>>
>>>>> Hi,
>>>>> I want to share some of the serious issues that were found and fixed in 3.0.x. I
Re: Scope of CASSANDRA-14557 (Default and Minimum RF)
I agree. I've updated the patch in 14557 to include a note in cassandra.yaml about these settings also applying to system keyspaces.

On Mon, May 18, 2020 at 1:59 AM Oleksandr Petrov wrote:

> I think it is reasonable that system keyspaces would get initialized with a default replication factor, assuming ones that were already initialized would remain intact (however, this should be the same for user-created keyspaces).
>
> Assuming it doesn't change the current behaviour, and that default and min RF, when unset, act the same way the current version does, the only thing we should probably add is a line in cassandra.yaml noting that the default and minimum replication factors will also apply to system keyspaces.
>
> On Wed, May 13, 2020 at 9:01 PM Sumanth Pasupuleti <sumanth.pasupuleti...@gmail.com> wrote:
>
>> Hi,
>>
>> Based on Alex's suggestion on the ticket, I wanted to reach out to clarify the current scope of the default and minimum replication factors that 14557 defines, and gather thoughts to farm for dissent.
>>
>> Both configurations (default and minimum) apply not just to user keyspaces, but also to system keyspaces. For instance, this can be handy in deployments that use authenticated C* clusters, where operators have to "remember" to set the system_auth keyspace's RF to a value higher than 1. In such cases, setting default_rf = 3, for example (which I suppose is common in most deployments), would ensure all the system keyspaces (including system_auth) come up with RF=3.
>>
>> It can be helpful to note that this patch by default does not cause any change to the replication factors. The reason is that the default values of these configurations are set to [defaultRF=1, minimumRF=0] so as not to induce any changes that folks may not expect; instead, the patch offers knobs to define what a sane default RF should be, and a gate on any new keyspaces being created with an RF lower than minimumRF.
>>
>> Curious to know your thoughts.
>>
>> Thanks,
>> Sumanth
>
> --
> alex p
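To make the "no change by default" point concrete, here is a hedged usage of the hypothetical ReplicationGuard sketch from earlier in the thread: with the shipped defaults [defaultRF=1, minimumRF=0] behaviour is unchanged, while an operator setting default_rf = 3 gets RF=3 for keyspaces that don't specify one, system keyspaces included:

    // Usage of the hypothetical ReplicationGuard sketch above (assumed
    // names; not the actual 14557 patch).
    public class ReplicationGuardDemo
    {
        public static void main(String[] args)
        {
            // Shipped defaults [defaultRF=1, minimumRF=0]: same as today.
            ReplicationGuard legacy = new ReplicationGuard(1, 0);
            System.out.println(legacy.effectiveRf(null)); // 1, unchanged
            legacy.validate("any_ks", 1);                 // floor of 0 gates nothing

            // Operator opts in: default RF of 3, minimum RF of 2.
            ReplicationGuard tuned = new ReplicationGuard(3, 2);
            System.out.println(tuned.effectiveRf(null));  // 3, covers system_auth too
            try
            {
                tuned.validate("scratch", 1);             // below the floor
            }
            catch (IllegalArgumentException expected)
            {
                System.out.println(expected.getMessage()); // gate fires
            }
        }
    }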
Reminder - 2020-05-19 Apache Cassandra Contributor Meeting
Hi everyone,

Reminder that tomorrow at 1PM PST we'll be having a contributor meeting. I gave Jitsi a try for the Kubernetes SIG but ran into a lot of trouble with browser compatibility and recording. I'll just stick with using Zoom to keep it working consistently.

https://datastax.zoom.us/j/390839037

https://cwiki.apache.org/confluence/display/CASSANDRA/2020-05-19+Apache+Cassandra+Contributor+Meeting

Add any agenda items here or email me direct and I can put them in.

Thanks,

Patrick
Re: Reminder - 2020-05-19 Apache Cassandra Contributor Meeting
I'll be there and have added "Cassandra CI Run-through, next steps, help needed, and Q&A" to the agenda. If you have questions on CI, turn up and ask them.

Mick

On Tue, 19 May 2020 at 02:53, Patrick McFadin wrote:

> Hi everyone,
>
> Reminder that tomorrow at 1PM PST we'll be having a contributor meeting. I gave Jitsi a try for the Kubernetes SIG but ran into a lot of trouble with browser compatibility and recording. I'll just stick with using Zoom to keep it working consistently.
>
> https://datastax.zoom.us/j/390839037
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/2020-05-19+Apache+Cassandra+Contributor+Meeting
>
> Add any agenda items here or email me direct and I can put them in.
>
> Thanks,
>
> Patrick