I opened the following ticket for this: https://issues.apache.org/jira/browse/CASSANDRA-18617
Thanks again, Dan On Fri, Jun 16, 2023 at 8:45 AM David Capwell <dcapw...@apple.com> wrote: > Does this sound good? > > > Sounds good to me > > On Jun 16, 2023, at 8:30 AM, Dan Jatnieks <jatni...@pobox.com> wrote: > > Hi all, > > Apologies for the late reply; I didn't mean to start a thread and then > disappear - it was unintended and I feel bad about that. > > I've been taking notes to summarize the discussion points and it matches > what Andres already listed, so I'm glad for that. And thank you Andres for > doing that - much appreciated! > > The plan Andres outlined also sounds good to me. I was not aware > of @Replaces before, and now that I learned it, I agree it should be used > here. > > Converting the old threshold values by subtracting the system > keyspace/tables makes sense to keep the existing guardrail semantics - and > including a message about that will be a good step to reduce any confusion > about how the, possibly odd-looking, new value was determined. > > Dan > > > On Fri, Jun 16, 2023 at 9:58 AM Ekaterina Dimitrova <e.dimitr...@gmail.com> > wrote: > >> Hi all, >> I was following the discussion. What Andres just summarized sounds >> reasonable to me. Let’s just not forget to document also all this. >> Thank you >> Ekaterina >> >> On Fri, 16 Jun 2023 at 10:16, Andrés de la Peña <adelap...@apache.org> >> wrote: >> >>> It seems we agree on removing the default value for the old thresholds, >>> and don't count system keyspaces/tables on the new ones. >>> >>> The old thresholds were on active duty for around ten months, and they >>> have been deprecated for around a year. They will have been deprecated for >>> longer by the time we release 5.0. If we want to keep them in perpetuity, I >>> guess the plan would be: >>> >>> - Remove the default value of the old thresholds in Config.java to make >>> them disabled by default. >>> - Remove the old thresholds from the default cassandra.yaml, although >>> old yamls can still have them. >>> - Use converters (@Replaces tag in Config.java) to read the old >>> threshold values (if present) and apply them to the new guardrails. >>> - During the conversion from the old thresholds to the new guardrails, >>> subtract the current number of system keyspace/tables from the old value. >>> For example, 150 tables in the old threshold translate to 103 tables in the >>> new guardrail, considering that there are 47 system tables. >>> >>> Does this sound good? >>> >>> On Wed, 14 Jun 2023 at 17:26, David Capwell <dcapw...@apple.com> wrote: >>> >>>> That's problematic because the new thresholds we added in >>>> CASSANDRA-17147 don't include system tables. Do you think we should change >>>> that? >>>> >>>> >>>> I wouldn’t change the semantics of the config as it’s already live. I >>>> guess where I am coming from is that logically we have to think about the >>>> system tables, so to your point, if we think 150 is too much and the system >>>> already exposes 50… then we should recommend no more than 100…. >>>> >>>> I find it's better for usability to not count the system tables and >>>> just say "It's recommended not to have more than 100 tables. This doesn't >>>> include system tables.” >>>> >>>> >>>> I am fine with this framing… internally we think about 150 but >>>> publicly speak 100 (due to our 50 tables)... >>>> >>>> >>>> On Jun 14, 2023, at 8:29 AM, Josh McKenzie <jmcken...@apache.org> >>>> wrote: >>>> >>>> In my opinion including system tables defeats that purpose because it >>>> forces users to know details about the system tables. >>>> >>>> Perhaps having a unit test that caps our system tables at some value >>>> and keeping the guardrail user-scope specific would be a better approach. I >>>> see your point about leaking internal details to users, specifically on >>>> things they can't control at this point. >>>> >>>> On Wed, Jun 14, 2023, at 8:19 AM, Andrés de la Peña wrote: >>>> >>>> > Default value I agree with you; features should be off by default! >>>> If we remove the default then we disable the feature by default (which im >>>> cool with) and for anyone who changed the config, they would keep their >>>> behavior >>>> >>>> >>>> I'm glad we agree on at least removing the default value if we keep the >>>> deprecated properties. >>>> >>>> > With that, I kinda don’t agree that including system tables is a >>>> mistake, as we add more we allow less for user tables before we start to >>>> have issues…. >>>> >>>> >>>> That's problematic because the new thresholds we added in >>>> CASSANDRA-17147 don't include system tables. Do you think we should change >>>> that? >>>> >>>> I still think it's better not to include the system tables in the >>>> count. The thresholds on the number of >>>> keyspaces/tables/rows/columns/tombstones are just guidance since they >>>> cannot be exactly related to exact resource consumption. The main purpose >>>> of those thresholds is to prevent obvious antipatterns such as creating >>>> thousands of tables. A benefit of expressing the guardrails in terms of the >>>> number of schema entities, rather than counting the memory usage of those >>>> entities, is that they are easy to understand and reason about. In my >>>> opinion including system tables defeats that purpose because it forces >>>> users to know details about the system tables. The fact that those details >>>> change between versions doesn't help. Including system tables is not going >>>> to make the thresholds precise in terms of measuring memory consumption >>>> because that depends on other factors, such as the columns they store. >>>> >>>> Including system tables also imposes a minimum threshold value, like in >>>> 5.0 you cannot set a threshold value under 45 tables without triggering it >>>> with an empty db. For other thresholds, this can be more tricky. That would >>>> be the case of the guardrail on the number of columns in a partition, where >>>> you would need to know the size of the widest row in the system tables, >>>> which can change over time. >>>> >>>> I guess that if system tables were to be counted, a recommendation for >>>> the threshold would say something like "It's recommended not to have more >>>> than 150 tables. The system already includes 45 tables for internal usage, >>>> so you shouldn't create more than 105 user tables". I find it's better for >>>> usability to not count the system tables and just say "It's recommended not >>>> to have more than 100 tables. This doesn't include system tables." >>>> >>>> On Tue, 13 Jun 2023 at 23:51, Josh McKenzie <jmcken...@apache.org> >>>> wrote: >>>> >>>> >>>> Warning that too many tables (including system) may have negative >>>> behavior I think is fine >>>> >>>> This reminds me of the current situation with our tests where we just >>>> keep adding more and more without really considering the value of the >>>> current set and the costs of that body of work as it keeps growing. >>>> >>>> Having some kind of signal that we need to do some housekeeping with >>>> our system tables, or *something* in the feedback loop that helps us >>>> keep on top of this hygiene over time, seems like a clear benefit to me. >>>> >>>> On Tue, Jun 13, 2023, at 1:42 PM, David Capwell wrote: >>>> >>>> I think that the combined decision of using a default value and >>>> counting system tables was a mistake >>>> >>>> >>>> Default value I agree with you; features should be off by default! If >>>> we remove the default then we disable the feature by default (which im cool >>>> with) and for anyone who changed the config, they would keep their behavior >>>> >>>> As for system tables… each table adds a cost to our bookkeeping, so >>>> when we add new tables the cost grows and the memory per table decreases, >>>> does it not? Warning that too many tables (including system) may have >>>> negative behavior I think is fine, its only if we start to fail is when >>>> things become a problem (upgrading to 5.0 can’t happen due to too many >>>> tables added in the release?); think the feature was warn only, so that >>>> should be fine. With that, I kinda don’t agree that including system >>>> tables is a mistake, as we add more we allow less for user tables before we >>>> start to have issues…. At the same time, if we have improvements in newer >>>> versions that allows higher number of tables, the user then has to update >>>> their configs (well, as long as we don’t make things worse a smaller limit >>>> than needed is fine…) >>>> >>>> we would need to know how many system keyspaces/tables were on the >>>> version we are upgrading from >>>> >>>> >>>> Do we? The logic was pulling from local schema, so to keep the same >>>> behavior we would need to do the same; being version dependent would >>>> actually break the semantics as far as I can tell. >>>> >>>> On Jun 13, 2023, at 9:50 AM, Andrés de la Peña <adelap...@apache.org> >>>> wrote: >>>> >>>> Indeed "keyspace_count_warn_threshold" and "table_count_warn_threshold" >>>> include system keyspaces and tables. Also, differently to the newer >>>> guardrails, they are enabled by default. >>>> >>>> I find that problematic because users need to know how many system >>>> keyspaces/tables there are to know if they need to set the threshold value. >>>> Moreover, if a new release adds some system tables, the threshold can start >>>> to be triggered without changing the number of user tables. That would >>>> force some users to update the threshold values during an upgrade. Even if >>>> they are using the defaults. That situation would happen again in any >>>> release adding new keyspaces/tables. I think adding new system tables is >>>> not that uncommon, and indeed 5.0 does it. >>>> >>>> I think that the combined decision of using a default value and >>>> counting system tables was a mistake. If that's the case, I don't know for >>>> how long we want to remain tied to that mistake. Especially when the old >>>> thresholds tend to create upgrade issues on their own. >>>> >>>> If we were going to use converters, we would need to know how many >>>> system keyspaces/tables were on the version we are upgrading from. I don't >>>> know if that information is available. Or perhaps we could assume that >>>> counting system keyspaces/tables was a bug, and just translate changing the >>>> meaning to not include them. >>>> >>>> >>>> >>>> >>>> >>>> On Tue, 13 Jun 2023 at 16:51, David Capwell <dcapw...@apple.com> wrote: >>>> >>>> > Have we been dropping support entirely for old params or using the >>>> @Replaces annotation into perpetuity? >>>> >>>> >>>> My understanding is that the goal is to keep things around in >>>> perpetuity unless it actively causes us harm… and with @Replaces, there >>>> tends to be no harm to keep around… >>>> >>>> Looking at >>>> https://github.com/apache/cassandra/commit/bae92ee139b411c94228f8fd5bb8befb4183ca9f >>>> we just marked them deprecated and created a brand new config that >>>> matched the old… which I feel was a bad idea…. Renaming configs are fine >>>> with @Replaces, but asking users to migrate with the idea of breaking them >>>> in the future is bad… >>>> >>>> The table_count_warn_threshold config is used at >>>> org.apache.cassandra.cql3.statements.schema.CreateTableStatement#clientWarnings >>>> The tables_warn_threshold config is used at >>>> org.apache.cassandra.cql3.statements.schema.CreateTableStatement#validate >>>> >>>> The only difference I see is that table_count_warn_threshold includes >>>> system tables where as tables_warn_threshold is only user tables… >>>> >>>> > I would like to propose removing the non-guardrail thresholds >>>> 'keyspace_count_warn_threshold' and 'table_count_warn_threshold' >>>> configuration settings on the trunk branch for the next major release. >>>> >>>> Deprecate in 4.1 is way too new for me to accept that, and its low >>>> effort to keep; breaking users is always a bad idea and doing it when not >>>> needed is bad… >>>> >>>> Honestly, I don’t see why we couldn’t use @Replaces here to solve the >>>> semantic gap… table_count_warn_threshold includes the system tables, so we >>>> just need a Converter that takes w/e the value the user put in and >>>> subtracts the system tables… which then gives us the user tables (matching >>>> tables_warn_threshold) >>>> >>>> > On Jun 13, 2023, at 7:57 AM, Josh McKenzie <jmcken...@apache.org> >>>> wrote: >>>> > >>>> >> have subsequently been deprecated since 4.1-alpha in CASSANDRA-17195 >>>> when they were replaced/migrated to guardrails as part of CEP-3 >>>> (Guardrails). >>>> > Have we been dropping support entirely for old params or using the >>>> @Replaces annotation into perpetuity? >>>> > >>>> > I dislike the idea of operators having to remember to update things >>>> between versions and being surprised when things change roughly equally to >>>> us carrying along undocumented deprecated param name mapping roughly >>>> equally. :) >>>> > >>>> > On Mon, Jun 12, 2023, at 5:56 PM, Dan Jatnieks wrote: >>>> >> Hello everyone, >>>> >> >>>> >> I would like to propose removing the non-guardrail thresholds >>>> 'keyspace_count_warn_threshold' and 'table_count_warn_threshold' >>>> configuration settings on the trunk branch for the next major release. >>>> >> >>>> >> These thresholds were first added with CASSANDRA-16309 in 4.0-beta4 >>>> and have subsequently been deprecated since 4.1-alpha in CASSANDRA-17195 >>>> when they were replaced/migrated to guardrails as part of CEP-3 >>>> (Guardrails). >>>> >> >>>> >> I'd appreciate any thoughts about this. I will open a ticket to get >>>> started if there is support for doing this. >>>> >> >>>> >> Reference: >>>> >> https://issues.apache.org/jira/browse/CASSANDRA-16309 >>>> >> https://issues.apache.org/jira/browse/CASSANDRA-17195 >>>> >> CEP-3: Guardrails >>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-3%3A+Guardrails >>>> >> >>>> >> >>>> >> Thanks, >>>> >> Dan Jatnieks >>>> >>>> >>>> >