David, I think we're getting a bit off-topic now, but it seems we already have a consensus on the main question. You're right about column descriptions, adding new columns to vtables, and moving fields description from yaml to the source code - all that will be part of other changes anyway, so let's keep it all in mind for further discussion for now.
I still think that relying on keywords is bad when we are trying to expose something for public API and we should explicitly mark all fields with "something", but you are right about the following - if we have a single @Mutable (or so) annotation: - are we able to distinguish mutable fields in the Config class - yes; - are we able to make the SettingsTable updatable with it - yes; - can we change/extend it in the future without backward compatibility issues - I think, yes; So, for now, we have everything we need to continue with SettingsTable and we can implement other our ideas iteratively later discussing the details first. On Wed, 1 Mar 2023 at 19:26, David Capwell <dcapw...@apple.com> wrote: > > > Another option would be to introduce a second class with the same fields as > > the first where we simply specify final for immutable fields, and construct > > it after parsing the Config. > > > > We could even generate the non-final version from the one with final fields. > > > > Not sure this would be nicer, but it is an alternative. > > > Correct me if I misunderstood, but I read this as Config acts like a Builder, > so after creating Config we do .build (logically) to get a new TheRealConfig > which “may” have “final” fields and non-final fields (mutable)? This > logically would be a perfect place to merge the DatabaseDescriptor logic that > normalizes, implements some form of backwards compatibility (one of the > annoying issues with settings table), and validation (also annoying for > settings table)…. Would allow DD to be a simple POJO (though still static) > class…. > > Assuming TheRealConfig is auto generated and not human maintained this could > be another solution, we would just have to be very careful not to fall into > the trap that Lombok put people in and make upgrading JDKs a nightmare… > > > Splitting this into the second class ... well, I would say that just > > increases the entropy. > > Yeah, this is a real issue with that solution, something we would have to be > caution of if we go down that route. > > > From a public API perspective, we have three types of fields in the > > Config class: internal use only (e.g. logger, PROPERTY_PREFIX prefix) > > > You do not have this in Config, but you do in subtypes; transient. SnakeYaml > ignores transient fields, and this is used in types such as EncryptionOptions > for state that must be exposed internally (normally derived from user config) > but must not be exposed to users. In I think 3.11 and earlier you could > define sslContextFactoryInstance in yaml (though it would always fail as > SnakeYaml couldn’t figure out how to build that type), and in 4.0+ you get an > unknown field exception if you try to define it (as we marked it transient to > hide it) > > In normalizing yaml loading and settings table (first release they both had > different logic, which meant settings table would show things not allowed in > yaml), I also added support for com.fasterxml.jackson.annotation.JsonIgnore, > so users may define that to also hide things as well. > > So, from our config point of view, if your field/methods match any of the > following rules, you are “internal” and not accessible from yaml or settings > table > > 1) static > 2) have com.fasterxml.jackson.annotation.JsonIgnore > 3) have transient > 4) not public (private, package, or package-private) > > > @Exposure(policy = Exposure.Policy.READ_WRITE) > > @Exposure(policy = Exposure.Policy.READ_ONLY) > > > I believe this is trying to address the concern you listed on internal > configs correct? My issue here is verbosity and how easy it would be for > authors to forget to add, causing dead code (if you are not read_write or > read_only then logically yaml/settings table shouldn’t touch you). I feel > that no annotation should be “immutable” by default as that is the common > case, the majority of configs are immutable and a small subset are mutable. > > > Stefan mentioned that these annotations could be used to create > > documentation pages, it's true > > > We could go this route, but I wonder if its better to use JavaDoc here (we > already build/publish, though there were questions in slack recently to stop > publishing…)… Majority of docs are in conf/cassandra.yaml, so if we wish to > move to code we would need to address how that file is maintained and how to > document “groups” of configs (rather than documenting each config, we have a > pattern of documenting a feature or pair of configs (such as min/max targets) > and showing the configs that can be tweaked). We did talk about moving to a > “nested” config model, but there was concerns about nesting at a feature > level as some features are cross cutting (so the “group” of configs may be in > different areas), so how we define these “groups” isn’t too clear to me… its > also not the common case so maybe less of a concern (if we document > row_index_read_size_warn_threshold but not > row_index_read_size_fail_threshold, is this still clear?)? > > > The SettingsTable may have the following columns and be truly > > self-descriptive for a user: name, value, default_value, policy, and > > description. > > > If we wish to expose docs in the settings table, this would push us to define > these in code and no longer in conf/cassandra.yml… I am ok with this, but > this does increase the scope as it needs to address the existing models. We > also need better clarity on compatibility with column additions… there is > another dev@ thread pointing out that durable tables cause downgrade issues… > but do vtables? Is it safe to add columns? I should really bring this > question to another thread and have us document…. > > > On Mar 1, 2023, at 6:33 AM, Maxim Muzafarov <mmu...@apache.org> wrote: > > > > Thank you all for your replies. Let me add some comments too, > > > > > > From a public API perspective, we have three types of fields in the > > Config class: internal use only (e.g. logger, PROPERTY_PREFIX prefix), > > read-only use (e.g. cluster_name), and read-write fields that are > > currently mutated with JMX. So a single @Mutable annotation is not > > enough to have clear Config's field separation. Adding two annotations > > @Mutable and @Immutable might solve the problem, but such an approach > > leads to code duplication if we want to extend our solution in future > > with additional parameters such as "description", besides having two > > different annotations for the same thing might confuse developers who > > are not familiar with this discussion. > > > > So, from my point of view, the best way for us might be as follows > > mentioned in the PR (the annotation name needs to reflect that the > > fields are available to the public API and for a user, we can change > > the name): > > @Exposure(policy = Exposure.Policy.READ_WRITE) > > @Exposure(policy = Exposure.Policy.READ_ONLY) > > > > Some other names come into my mind: APIAvailable, APIExposed, > > UserAvailable, UserExposed etc. > > > > > > Stefan mentioned that these annotations could be used to create > > documentation pages, it's true, I have the same thoughts in mind, and > > you can see what it will look like at the link below (the description > > annotation field will be removed from the final version of the PR, but > > may still be added as part of another issue): > > > > https://github.com/apache/cassandra/pull/2133/files#diff-e966f41bc2a418becfe687134ec8cf542eb051eead7fb4917e65a3a2e7c9bce3R392 > > > > The SettingsTable may have the following columns and be truly > > self-descriptive for a user: name, value, default_value, policy, and > > description. > > > > > > Benedict mentioned that we could create a second class to hold such > > information. The best candidate for this is the ConfigFields class, > > which is based on the Config class and contains all the field names as > > constants (I used a small utility class to generate it). But it will > > still require some manual work, as there is no rule to distinguish > > which config field is mutable and which isn't. So we would have to > > update two classes instead of one (the Config class) when adding new > > configuration fields, which we don't want to do. > > > > Here it is in the PR: > > https://github.com/apache/cassandra/pull/2133/files#diff-fcb4c5bc59d4bb127ffbe9f1ce566b2238c5bb92622da430a4ff879781093d3fR31 > > > > On Wed, 1 Mar 2023 at 09:21, Miklosovic, Stefan > > <stefan.mikloso...@netapp.com> wrote: > >> > >> I am fine with annotations. I am not a big of fan of the generation. From > >> my experience whenever we wanted to generate something we had to take care > >> of the generator itself and then we had to live with what it generated > >> (yeah, that is also a thing) instead of writing it by hand once and have > >> some freedom to tweak it however we wanted. Splitting this into the second > >> class ... well, I would say that just increases the entropy. > >> > >> We can parse config class on these annotations and produce the > >> documentation easily. I would probably go so far that I would put that > >> annotation on all fields. We could have two - Mutable, and Immutable. But > >> that is really optional. > >> > >> ________________________________________ > >> From: Benedict <bened...@apache.org> > >> Sent: Wednesday, March 1, 2023 9:09 > >> To: dev@cassandra.apache.org > >> Subject: Re: [DISCUSS] Allow UPDATE on settings virtual table to change > >> running configuration > >> > >> NetApp Security WARNING: This is an external email. Do not click links or > >> open attachments unless you recognize the sender and know the content is > >> safe. > >> > >> > >> > >> Another option would be to introduce a second class with the same fields > >> as the first where we simply specify final for immutable fields, and > >> construct it after parsing the Config. > >> > >> We could even generate the non-final version from the one with final > >> fields. > >> > >> Not sure this would be nicer, but it is an alternative. > >> > >> On 1 Mar 2023, at 02:10, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote: > >> > >> > >> I agree with David that the annotations seem a bit too many but if I have > >> to choose from the three approaches - the annotations one seems most > >> reasonable to me and I didn’t have the chance to consider any others. > >> Volatile seems fragile and unclear as a differentiator. I agree > >> > >> On Tue, 28 Feb 2023 at 17:47, Maxim Muzafarov > >> <mmu...@apache.org<mailto:mmu...@apache.org>> wrote: > >> Folks, > >> > >> If there are no objections to the approach described in this thread, > >> I'd like to proceed with this change. The change seems to be valuable > >> for the upcoming release, so any comments are really appreciated. > >> > >> On Wed, 22 Feb 2023 at 21:51, David Capwell > >> <dcapw...@apple.com<mailto:dcapw...@apple.com>> wrote: > >>> > >>> I guess back to the point of the thread, we need a way to know what > >>> configs are mutable for the settings virtual table, so need some way to > >>> denote that the config > >>> replica_filtering_protection.cached_rows_fail_threshold is mutable. > >>> Given the way that the yaml config works, we can’t rely on the presences > >>> of “final” or not, so need some way to mark a config is mutable for that > >>> table, does anyone want to offer feedback on what works best for them? > >>> > >>> Out of all proposals given so far “volatile” is the least verbose but > >>> also not explicit (as this thread is showing there is debate on if this > >>> should be present), new annotations are a little more verbose but would > >>> be explicit (no surprises), and getter/setters in different classes (such > >>> as DD) is the most verbose and suffers from not being explicit and > >>> ambiguity for mapping back to Config. > >>> > >>> Given the above, annotations sounds like the best option, but do we > >>> really want our config to look as follows? > >>> > >>> @Replaces(oldName = "native_transport_idle_timeout_in_ms", converter = > >>> Converters.MILLIS_DURATION_LONG, deprecated = true) > >>> @Mutable > >>> public DurationSpec.LongMillisecondsBound native_transport_idle_timeout = > >>> new DurationSpec.LongMillisecondsBound("0ms”); > >>> @Mutable > >>> public DurationSpec.LongMillisecondsBound transaction_timeout = new > >>> DurationSpec.LongMillisecondsBound("30s”); > >>> @Mutable > >>> public double phi_convict_threshold = 8.0; > >>> public String partitioner; // assume immutable by default? > >>> > >>> > >>>> On Feb 22, 2023, at 6:20 AM, Benedict > >>>> <bened...@apache.org<mailto:bened...@apache.org>> wrote: > >>>> > >>>> Could you describe the issues? Config that is globally exposed should > >>>> ideally be immutable with final members, in which case volatile is only > >>>> necessary if you’re using the config parameter in a tight loop that you > >>>> need to witness a new value - which shouldn’t apply to any of our config. > >>>> > >>>> There are some weird niches, like updating long values on some > >>>> (unsupported by us) JVMs that may tear. Technically you also require it > >>>> for visibility with the JMM. But in practice it is mostly unnecessary. > >>>> Often what seems to be a volatile issue is really something else. > >>>> > >>>>> On 22 Feb 2023, at 13:18, Benjamin Lerer > >>>>> <b.le...@gmail.com<mailto:b.le...@gmail.com>> wrote: > >>>>> > >>>>> I have seen issues with some updatable parameters which were missing > >>>>> the volatile keyword. > >>>>> > >>>>> Le mer. 22 févr. 2023 à 11:36, Aleksey Yeshchenko > >>>>> <alek...@apple.com<mailto:alek...@apple.com>> a écrit : > >>>>> FWIW most of those volatile fields, if not in fact all of them, should > >>>>> NOT be volatile at all. Someone started the trend and most folks have > >>>>> been copycatting or doing the same for consistency with the rest of the > >>>>> codebase. > >>>>> > >>>>> Please definitely don’t rely on that. > >>>>> > >>>>>> On 21 Feb 2023, at 21:06, Maxim Muzafarov > >>>>>> <mmu...@apache.org<mailto:mmu...@apache.org>> wrote: > >>>>>> > >>>>>> 1. Rely on the volatile keyword in front of fields in the Config class; > >>>>>> > >>>>>> I would say this is the most confusing option for me because it > >>>>>> doesn't give us all the guarantees we need, and also: > >>>>>> - We have no explicit control over what exactly we expose to a user. > >>>>>> When we modify the JMX API, we're implementing a new method for the > >>>>>> MBean, which in turn makes this action an explicit exposure; > >>>>>> - The volatile keyword is not the only way to achieve thread safety, > >>>>>> and looks strange for the public API design point; > >>>>>> - A good example is the setEnableDropCompactStorage method, which > >>>>>> changes the volatile field, but is only visible for testing purposes; > >>>>> > >>>>> > >>> >