Benedict, I am confused. If you are so concerned about virtual tables or CQL, why do you not track changes to those components directly? I believe people usually label them correctly. That way you would be able to provide feedback straight away rather than after the fact. It would be a win for everybody, no?

On Mon, 5 Dec 2022 at 15:10, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:

> Quick idea - can we add a label in Jira API or something like that and then Josh can filter those in the bi-weekly report? In the meantime, if there are big changes that people consider need a DISCUSS thread, they can always open one. I will be happy to help with the mentioned filter/report.
> Also +1 on having a Contributing doc with broader discussion and directions around API
>
> On Mon, 5 Dec 2022 at 8:32, Benedict <bened...@apache.org> wrote:
>
>> I would be fine with a formal API change review period prior to release, but if we go that route people should expect to have to revisit work they completed a while back, and there should be no presumption that decisions taken without a DISCUSS thread should be preferred to alternative suggestions - and we should have a clear policy of reverting any work if it is not revisited based on the outcome of any discussion, since seeking broader input earlier was always an option. I expect this approach could lead to frustration, but it might actually be a better system than separate DISCUSS threads as the changes can be considered holistically.
>>
>> The idea that a DISCUSS thread for each change would be burdensome however is I think mistaken. Even if 70 were the true figure, it would have been around one per week, and they could easily have been batched. I’d also be fine with white listing some changes (eg JMX and warning messages) - but definitely not virtual tables or CQL. These APIs develop strong user dependencies, and are very hard to change.
>>
>> We should not restrict input on our main user experiences to the handful of people with time to closely monitor Jira, most of whom are not even users of Cassandra. We should be seeking the broadest visibility, including casual observers and non-contributors.
>>
>> On 5 Dec 2022, at 13:05, Paulo Motta <pauloricard...@gmail.com> wrote:
>>
>> It feels like a bit of overkill to me to require the addition of any new virtual tables/JMX/configuration/knob to go through a discuss thread. If this would require 70 threads for the previous release I think this would easily become spammy and counter-productive.
>>
>> I think the burden should be on the maintainer to keep up with changes being added to the database and chime in on any areas they feel responsible for, as has been the case and has worked relatively well.
>>
>> I think it makes sense to look into improving visibility of API changes, so people can more easily review a summary of API changes versus reading through the whole changelog (perhaps we need a summarized API change log?).
>>
>> It would also help to have more explicit guidelines on what kinds of API changes are riskier and might require additional visibility via a DISCUSS thread.
>>
>> Also, would it make sense to introduce a new API review stage during release validation, and agree to revert/update any API changes that may be controversial that were not caught during normal review?
>>
>> On Mon, 5 Dec 2022 at 06:49 Andrés de la Peña <adelap...@apache.org> wrote:
>>
>>> Indeed that contribution policy should be clearer and not be on a page titled code style, thanks for bringing that up.
>>>
>>> If we consider all those things APIs, and additions are also considered changes that require a DISCUSS thread, it turns out that almost any not-bugfix ticket would require a mail list thread. In fact, if one goes through CHANGES.txt it's easy to see that most entries would have required a DISCUSS thread.
>>>
>>> I think that such a strict policy would only make us lose agility and increase the burden of almost any contribution. After all, it's not that changes without a DISCUSS thread happen in secret.
>>> Changes are publicly visible on their tickets, those tickets are notified on Slack so anyone can jump into the ticket discussions and set themselves as reviewers, and reviewers can ask for DISCUSS threads whenever they think more opinions or broader consensus are needed.
>>>
>>> Also, a previous DISCUSS thread is not going to prevent changes from being questioned later. We have seen changes that are proposed, discussed and approved as CEPs, reviewed for weeks or months, and finally committed, and still they are questioned shortly after that cycle, and asked to be changed or discussed again. I don't think that an avalanche of DISCUSS threads is going to improve that, since usually the problem is that people don't have the time for deeply looking into the changes when they are happening. I doubt that more notification channels are going to improve that.
>>>
>>> Of course I'm not saying that there should never be DISCUSS threads before starting a change. Probably we can all agree that major changes and things that break compatibility would need prior discussion.
>>>
>>> On Mon, 5 Dec 2022 at 10:16, Benjamin Lerer <ble...@apache.org> wrote:
>>>
>>>> Thanks for opening this thread Josh,
>>>>
>>>> It seems perfectly normal to me that for important changes or questions we raise some discussion to the mailing list.
>>>>
>>>> My understanding of the current proposal implies that for the 4.1 release we should have had to raise over 70 discussion threads.
>>>> We have a minimum of 2 committers required for every patch. Should we not trust them to update nodetool, the virtual tables or other things on their own?
>>>>
>>>> There are already multiple existing ways to track changes in specific code areas. I am personally tracking the areas in which I am the most involved this way and I know that a lot of people do the same.
>>>>
>>>> To be transparent, it is not clear to me what the underlying issue is. Do we have some specific cases that illustrate the underlying problem? Thrift and JMX are from a different time in my opinion.
>>>>
>>>> On Mon, 5 Dec 2022 at 08:09, Berenguer Blasi <berenguerbl...@gmail.com> wrote:
>>>>
>>>>> +1 to moving that into its own section outside the coding style page.
>>>>>
>>>>> Dinesh I also thought in terms of backward compatibility here. But notice the discussion is about _any change_ to the API such as adding new CQL functions. Would adding or changing an exception type or a user warning qualify for a DISCUSS thread also? I wonder if we're talking ourselves into opening a DISCUSS for almost every ticket and something easy to miss.
>>>>>
>>>>> I wonder, you guys know the code better, if 'public APIs' could be matched to a reasonable set of files (cql parsing, yaml, etc) and have jenkins send an email when changes are detected on them. Overkill? bad idea? :thinking:...
>>>>>
>>>>> On 4/12/22 1:14, Dinesh Joshi wrote:
>>>>>
>>>>> We should also very clearly list out what is considered a public API. The current statement that we have is insufficient:
>>>>>
>>>>> public APIs, including CQL, virtual tables, JMX, yaml, system properties, etc.
>>>>>
>>>>> The guidance on treatment of public APIs should also move out of "Code Style" page as it isn't strictly related to code style. Backward compatibility of public APIs is a best practice & project policy.
>>>>>
>>>>> On Dec 2, 2022, at 2:08 PM, Benedict <bened...@apache.org> wrote:
>>>>>
>>>>> I think some of that text also got garbled by mixing up how you approach internal APIs and external APIs. We should probably clarify that there are different burdens for each. Which is all my fault as the formulator. I remember it being much clearer in my head.
>>>>>
>>>>> My view is the same as yours Josh.
>>>>> Evolving the database’s public APIs is something that needs community consensus. The more visibility these decisions get, the better the final outcome (usually). Even small API changes need to be carefully considered to ensure the API evolves coherently, and this is particularly true for something as complex and central as CQL.
>>>>>
>>>>> A DISCUSS thread is a good forcing function to think about what you’re trying to achieve and why, and to provide others a chance to spot potential flaws, alternatives and interactions with work you may not be aware of.
>>>>>
>>>>> It would be nice if there were an easy rubric for whether something needs feedback, but I don’t think there is. One person’s obvious change may be another’s obvious problem. So I think any decision that binds the project going forwards should have a lazy consensus DISCUSS thread at least.
>>>>>
>>>>> I don’t think it needs to be burdensome though - trivial API changes could begin while the DISCUSS thread is underway, expecting they usually won’t raise a murmur.
>>>>>
>>>>> On 2 Dec 2022, at 19:25, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>
>>>>> Came up this morning / afternoon in dev slack:
>>>>> https://the-asf.slack.com/archives/CK23JSY2K/p1669981168190189
>>>>>
>>>>> The gist of it: we're lacking clarity on whether the expectation on the project is to hit the dev ML w/a [DISCUSS] thread on _any_ API modification or only on modifications where the author feels they are adjusting a paradigm / strategy for an API.
>>>>>
>>>>> The code style section on Public APIs is actually a little unclear:
>>>>> https://cassandra.apache.org/_/development/code_style.html
>>>>>
>>>>> Public APIs
>>>>>
>>>>> These considerations are especially important for public APIs, including CQL, virtual tables, JMX, yaml, system properties, etc.
>>>>> Any planned additions must be carefully considered in the context of any existing APIs. Where possible the approach of any existing API should be followed. Where the existing API is poorly suited, a strategy should be developed to modify or replace the existing API with one that is more coherent in light of the changes - which should also carefully consider any planned or expected future changes to minimise churn. Any strategy for modifying APIs should be brought to dev@cassandra.apache.org for discussion.
>>>>>
>>>>> My .02:
>>>>> 1. We should rename that page to a "code contribution guide" as discussed on the slack thread
>>>>> 2. *All* publicly facing API changes (tool output, CQL semantics, JMX, vtables, .java interfaces targeting user extension, etc) should hit the dev ML w/a [DISCUSS] thread.
>>>>>
>>>>> This takes the burden of trying to determine if a change is consistent w/existing strategy or not etc. off the author in isolation and allows devs to work concurrently on API changes w/out risk of someone else working on something that may inform their work or vice versa.
>>>>>
>>>>> We've learned that APIs are *really really hard* to deprecate, disruptive to our users when we change or remove them, and can cause serious pain and ecosystem fragmentation when changed. See: Thrift, current discussions about JMX, etc. They're the definition of a "one-way-door" decision and represent a long-term maintenance burden commitment from the project.
>>>>>
>>>>> Lastly, I'd expect the vast majority of these discuss threads to be quick consensus checks resolved via lazy consensus or after some slight discussion; ideally this wouldn't represent a huge burden of coordination on folks working on changes.
>>>>>
>>>>> So that's 1 opinion. What other opinions are out there?
>>>>>
>>>>> ~Josh