> In my opinion including system tables defeats that purpose because it forces 
> users to know details about the system tables.
Perhaps having a unit test that caps our system tables at some value and 
keeping the guardrail user-scope specific would be a better approach. I see 
your point about leaking internal details to users, specifically on things they 
can't control at this point.

On Wed, Jun 14, 2023, at 8:19 AM, Andrés de la Peña wrote:
>> > Default value I agree with you; features should be off by default!  If we 
>> > remove the default then we disable the feature by default (which im cool 
>> > with) and for anyone who changed the config, they would keep their behavior
> 
> I'm glad we agree on at least removing the default value if we keep the 
> deprecated properties.
> 
>> > With that, I kinda don’t agree that including system tables is a mistake, 
>> > as we add more we allow less for user tables before we start to have 
>> > issues….
> 
> That's problematic because the new thresholds we added in CASSANDRA-17147 
> don't include system tables. Do you think we should change that?
> 
> I still think it's better not to include the system tables in the count. The 
> thresholds on the number of keyspaces/tables/rows/columns/tombstones are just 
> guidance since they cannot be exactly related to exact resource consumption. 
> The main purpose of those thresholds is to prevent obvious antipatterns such 
> as creating thousands of tables. A benefit of expressing the guardrails in 
> terms of the number of schema entities, rather than counting the memory usage 
> of those entities, is that they are easy to understand and reason about. In 
> my opinion including system tables defeats that purpose because it forces 
> users to know details about the system tables. The fact that those details 
> change between versions doesn't help. Including system tables is not going to 
> make the thresholds precise in terms of measuring memory consumption because 
> that depends on other factors, such as the columns they store.
> 
> Including system tables also imposes a minimum threshold value, like in 5.0 
> you cannot set a threshold value under 45 tables without triggering it with 
> an empty db. For other thresholds, this can be more tricky. That would be the 
> case of the guardrail on the number of columns in a partition, where you 
> would need to know the size of the widest row in the system tables, which can 
> change over time.
> 
> I guess that if system tables were to be counted, a recommendation for the 
> threshold would say something like "It's recommended not to have more than 
> 150 tables. The system already includes 45 tables for internal usage, so you 
> shouldn't create more than 105 user tables". I find it's better for usability 
> to not count the system tables and just say "It's recommended not to have 
> more than 100 tables. This doesn't include system tables."
> 
> On Tue, 13 Jun 2023 at 23:51, Josh McKenzie <jmcken...@apache.org> wrote:
>> __
>>> Warning that too many tables (including system) may have negative behavior 
>>> I think is fine
>> This reminds me of the current situation with our tests where we just keep 
>> adding more and more without really considering the value of the current set 
>> and the costs of that body of work as it keeps growing.
>> 
>> Having some kind of signal that we need to do some housekeeping with our 
>> system tables, or *something* in the feedback loop that helps us keep on top 
>> of this hygiene over time, seems like a clear benefit to me.
>> 
>> On Tue, Jun 13, 2023, at 1:42 PM, David Capwell wrote:
>>>> I think that the combined decision of using a default value and counting 
>>>> system tables was a mistake
>>> 
>>> Default value I agree with you; features should be off by default!  If we 
>>> remove the default then we disable the feature by default (which im cool 
>>> with) and for anyone who changed the config, they would keep their behavior
>>> 
>>> As for system tables… each table adds a cost to our bookkeeping, so when we 
>>> add new tables the cost grows and the memory per table decreases, does it 
>>> not?  Warning that too many tables (including system) may have negative 
>>> behavior I think is fine, its only if we start to fail is when things 
>>> become a problem (upgrading to 5.0 can’t happen due to too many tables 
>>> added in the release?); think the feature was warn only, so that should be 
>>> fine.  With that, I kinda don’t agree that including system tables is a 
>>> mistake, as we add more we allow less for user tables before we start to 
>>> have issues…. At the same time, if we have improvements in newer versions 
>>> that allows higher number of tables, the user then has to update their 
>>> configs (well, as long as we don’t make things worse a smaller limit than 
>>> needed is fine…)
>>> 
>>>> we would need to know how many system keyspaces/tables were on the version 
>>>> we are upgrading from
>>> 
>>> Do we?  The logic was pulling from local schema, so to keep the same 
>>> behavior we would need to do the same; being version dependent would 
>>> actually break the semantics as far as I can tell.
>>> 
>>>> On Jun 13, 2023, at 9:50 AM, Andrés de la Peña <adelap...@apache.org> 
>>>> wrote:
>>>> 
>>>> Indeed "keyspace_count_warn_threshold" and "table_count_warn_threshold" 
>>>> include system keyspaces and tables. Also, differently to the newer 
>>>> guardrails, they are enabled by default. 
>>>> 
>>>> I find that problematic because users need to know how many system 
>>>> keyspaces/tables there are to know if they need to set the threshold 
>>>> value. Moreover, if a new release adds some system tables, the threshold 
>>>> can start to be triggered without changing the number of user tables. That 
>>>> would force some users to update the threshold values during an upgrade. 
>>>> Even if they are using the defaults. That situation would happen again in 
>>>> any release adding new keyspaces/tables. I think adding new system tables 
>>>> is not that uncommon, and indeed 5.0 does it.
>>>> 
>>>> I think that the combined decision of using a default value and counting 
>>>> system tables was a mistake. If that's the case, I don't know for how long 
>>>> we want to remain tied to that mistake. Especially when the old thresholds 
>>>> tend to create upgrade issues on their own.
>>>> 
>>>> If we were going to use converters, we would need to know how many system 
>>>> keyspaces/tables were on the version we are upgrading from. I don't know 
>>>> if that information is available. Or perhaps we could assume that counting 
>>>> system keyspaces/tables was a bug, and just translate changing the meaning 
>>>> to not include them.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Tue, 13 Jun 2023 at 16:51, David Capwell <dcapw...@apple.com> wrote:
>>>>> > Have we been dropping support entirely for old params or using the 
>>>>> > @Replaces annotation into perpetuity?
>>>>> 
>>>>> 
>>>>> My understanding is that the goal is to keep things around in perpetuity 
>>>>> unless it actively causes us harm… and with @Replaces, there tends to be 
>>>>> no harm to keep around…
>>>>> 
>>>>> Looking at 
>>>>> https://github.com/apache/cassandra/commit/bae92ee139b411c94228f8fd5bb8befb4183ca9f
>>>>>  we just marked them deprecated and created a brand new config that 
>>>>> matched the old… which I feel was a bad idea…. Renaming configs are fine 
>>>>> with @Replaces, but asking users to migrate with the idea of breaking 
>>>>> them in the future is bad…
>>>>> 
>>>>> The table_count_warn_threshold config is used at 
>>>>> org.apache.cassandra.cql3.statements.schema.CreateTableStatement#clientWarnings
>>>>> The tables_warn_threshold config is used at 
>>>>> org.apache.cassandra.cql3.statements.schema.CreateTableStatement#validate
>>>>> 
>>>>> The only difference I see is that table_count_warn_threshold includes 
>>>>> system tables where as tables_warn_threshold is only user tables…
>>>>> 
>>>>> > I would like to propose removing the non-guardrail thresholds 
>>>>> > 'keyspace_count_warn_threshold' and 'table_count_warn_threshold' 
>>>>> > configuration settings on the trunk branch for the next major release.
>>>>> 
>>>>> Deprecate in 4.1 is way too new for me to accept that, and its low effort 
>>>>> to keep; breaking users is always a bad idea and doing it when not needed 
>>>>> is bad…
>>>>> 
>>>>> Honestly, I don’t see why we couldn’t use @Replaces here to solve the 
>>>>> semantic gap… table_count_warn_threshold includes the system tables, so 
>>>>> we just need a Converter that takes w/e the value the user put in and 
>>>>> subtracts the system tables… which then gives us the user tables 
>>>>> (matching tables_warn_threshold)
>>>>> 
>>>>> > On Jun 13, 2023, at 7:57 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>> >
>>>>> >> have subsequently been deprecated since 4.1-alpha in CASSANDRA-17195 
>>>>> >> when they were replaced/migrated to guardrails as part of CEP-3 
>>>>> >> (Guardrails).
>>>>> > Have we been dropping support entirely for old params or using the 
>>>>> > @Replaces annotation into perpetuity?
>>>>> >
>>>>> > I dislike the idea of operators having to remember to update things 
>>>>> > between versions and being surprised when things change roughly equally 
>>>>> > to us carrying along undocumented deprecated param name mapping roughly 
>>>>> > equally. :)
>>>>> >
>>>>> > On Mon, Jun 12, 2023, at 5:56 PM, Dan Jatnieks wrote:
>>>>> >> Hello everyone,
>>>>> >>
>>>>> >> I would like to propose removing the non-guardrail thresholds 
>>>>> >> 'keyspace_count_warn_threshold' and 'table_count_warn_threshold' 
>>>>> >> configuration settings on the trunk branch for the next major release.
>>>>> >>
>>>>> >> These thresholds were first added with CASSANDRA-16309 in 4.0-beta4 
>>>>> >> and have subsequently been deprecated since 4.1-alpha in 
>>>>> >> CASSANDRA-17195 when they were replaced/migrated to guardrails as part 
>>>>> >> of CEP-3 (Guardrails).
>>>>> >>
>>>>> >> I'd appreciate any thoughts about this. I will open a ticket to get 
>>>>> >> started if there is support for doing this.
>>>>> >>
>>>>> >> Reference:
>>>>> >> https://issues.apache.org/jira/browse/CASSANDRA-16309
>>>>> >> https://issues.apache.org/jira/browse/CASSANDRA-17195
>>>>> >> CEP-3: Guardrails 
>>>>> >> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-3%3A+Guardrails
>>>>> >>
>>>>> >>
>>>>> >> Thanks,
>>>>> >> Dan Jatnieks
>>>>> 
>> 

Reply via email to