[DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-02 Thread Josh McKenzie
Came up this morning / afternoon in dev slack: 
https://the-asf.slack.com/archives/CK23JSY2K/p1669981168190189

The gist of it: we're lacking clarity on whether the expectation on the project 
is to hit the dev ML w/a [DISCUSS] thread on _any_ API modification or only on 
modifications where the author feels they are adjusting a paradigm / strategy 
for an API.

The code style section on Public APIs is actually a little unclear: 
https://cassandra.apache.org/_/development/code_style.html

> Public APIs
> 
> These considerations are especially important for public APIs, including CQL, 
> virtual tables, JMX, yaml, system properties, etc. Any planned additions must 
> be carefully considered in the context of any existing APIs. Where possible 
> the approach of any existing API should be followed. Where the existing API 
> is poorly suited, a strategy should be developed to modify or replace the 
> existing API with one that is more coherent in light of the changes - which 
> should also carefully consider any planned or expected future changes to 
> minimise churn. Any strategy for modifying APIs should be brought to 
> dev@cassandra.apache.org for discussion.

My .02:
1. We should rename that page to a "code contribution guide" as discussed on 
the slack thread
2. *All* publicly facing API changes (tool output, CQL semantics, JMX, vtables, 
.java interfaces targeting user extension, etc) should hit the dev ML w/a 
[DISCUSS] thread.

This takes the burden of deciding whether a change is consistent w/existing 
strategy off the author working in isolation, and lets devs work concurrently 
on API changes w/out the risk of being unaware of someone else's work that may 
inform theirs, or vice versa.

We've learned that APIs are *really really hard* to deprecate, disruptive to 
our users when we change or remove them, and can cause serious pain and 
ecosystem fragmentation when changed. See: Thrift, current discussions about 
JMX, etc. They're the definition of a "one-way-door" decision and represent a 
long-term maintenance burden commitment from the project.

Lastly, I'd expect the vast majority of these discuss threads to be quick 
consensus checks resolved via lazy consensus or after some slight discussion; 
ideally this wouldn't represent a huge burden of coordination on folks working 
on changes.

So that's 1 opinion. What other opinions are out there?

~Josh

Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-02 Thread Benedict
I think some of that text also got garbled by mixing up how you approach 
internal APIs and external APIs. We should probably clarify that there are 
different burdens for each. Which is all my fault as the formulator. I remember 
it being much clearer in my head.

My view is the same as yours, Josh. Evolving the database’s public APIs is 
something that needs community consensus. The more visibility these decisions 
get, the better the final outcome (usually). Even small API changes need to be 
carefully considered to ensure the API evolves coherently, and this is 
particularly true for something as complex and central as CQL. 

A DISCUSS thread is a good forcing function to think about what you’re trying 
to achieve and why, and to provide others a chance to spot potential flaws, 
alternatives and interactions with work you may not be aware of.

It would be nice if there were an easy rubric for whether something needs 
feedback, but I don’t think there is. One person’s obvious change may be 
another’s obvious problem. So I think any decision that binds the project going 
forwards should have a lazy consensus DISCUSS thread at least.

I don’t think it needs to be burdensome though - trivial API changes could 
begin while the DISCUSS thread is underway, expecting they usually won’t raise 
a murmur.
