By the way https://deepwiki.com/apache/cassandra
- Seems as if it's documenting as it goes along. In general, self-documentation is now a matter of implementation. I think the larger opportunity is to have some sort of combined user / committer-contributed + self-documentation.

On Thu, May 01, 2025 at 5:23 PM, Jeremiah Jordan <jerem...@apache.org> wrote:

> That page does not look great, but the link “Documentation” in the middle of it takes you to the actual page on contributing: https://cassandra.apache.org/_/development/documentation.html
>
> On May 1, 2025 at 4:19:15 PM, Jon Haddad <j...@rustyrazorblade.com> wrote:
>
> Oh, well then, that’s perfect. I looked at our contributing to the docs page and found it empty:
>
> https://cassandra.apache.org/_/docdev/index.html
>
> Will merge in my UCS changes tomorrow.
>
> Jon
>
> On Thu, May 1, 2025 at 1:41 PM Brandon Williams <dri...@gmail.com> wrote:
>
> The governance says "Correcting typos, docs, website, and comments etc operate a “Commit Then Review <https://www.apache.org/foundation/glossary.html#CommitThenReview>” policy"
>
> Kind Regards,
> Brandon
>
> On Thu, May 1, 2025 at 3:21 PM Jon Haddad <j...@rustyrazorblade.com> wrote:
>
> I propose we encourage committers to commit docs without review or a JIRA.
>
> On Thu, May 1, 2025 at 8:06 AM Jon Haddad <j...@rustyrazorblade.com> wrote:
>
> Stefan,
>
> Any feature developed for Cassandra is a collaborative effort. The public branches of accord have been available for months. Have you contributed to the accord docs? You've had plenty of opportunity.
>
> It looks like you're turning your own thread into an airing of grievances. It's not particularly constructive. You can be right (we need more docs) without being hostile.
>
> Jon
>
> On Thu, May 1, 2025 at 7:49 AM Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>
> Yeah, no surprise, I was thinking the discussion would go in this direction.
> I am not completely sure who we are developing this for then. I see statements like this and I am pretty disappointed:
>
> "The project obviously aims to serve end users, but the developer community is the actual project and it is fine to serve that demographic first, or only. "
>
> What is the actual difference between working in a private fork and publicly publishing code that almost nobody understands?
>
> I get that people work for companies etc. but really, we should reflect quite hard on what we are doing here.
>
> Let's take Accord, for example. I cannot see any justification for working on something for 3 years and not documenting how to use it when it really comes to it. Is Accord documentation for users on the way or not? Is the documentation for CEP-45 going to be done or not? How are operators, other than the authors of that change (whom one can count on one hand, all working at one company), supposed to know how to use it? What is "open source" about that, except for it being online and publicly accessible?
>
> Over the last couple of years Cassandra has been getting more and more complex, which might alienate even the developers working on it daily. People might get out of touch with all of the new features being rolled out, and if this trend continues without documenting them along the way, I am very afraid that Cassandra will become an exclusive, self-serving club of elite programmers that an average / beginner user has no way to catch up with, consultants will not know how to consult on it, and so on and so on. How is this good for anybody?
>
> What I am asking for is really not rocket science and nothing "rigid". I am open to lowering the requirements.
>
> I am not asking for rephrasing a whole CEP and presenting it to a user. I am asking for a description of the most common usages and scenarios, with the most important consequences and all configuration parameters.
>
> Let's take a look at CEP-37.
>
> https://cassandra.apache.org/doc/trunk/cassandra/managing/operating/auto_repair.html
>
> This is just wonderful and it is an example of how it should be done.
>
> Why can this not be done for other CEPs too? What was different about CEP-37, where the docs were written together with the code, that it cannot be done similarly for other CEPs as well?
>
> Regards
>
> *From:* Benedict <bened...@apache.org>
> *Date:* Thursday, 1 May 2025 at 14:37
> *To:* dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Cc:* Rolo, Carlos <carlos.r...@netapp.com>, Miklosovic, Stefan <Stefan.mikloso...@netapp.com>, dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>
> I am opposed to this. There’s too much imprecision in the “rule” while simultaneously being much too rigid, and it will be improperly enforced (we already have lots of rule breaking around modifying public APIs, that should have discuss threads and do not, for instance). This kind of arbitrary rule that is unaligned with contributors will likely lead to bad and inconsistent documentation, which is worse than no documentation.
>
> We could perhaps stipulate that for a feature to leave experimental status the community must vote and that documentation should be a consideration. But this will only capture big changes.
>
> We could perhaps try other ideas like moratoriums on contributions that are not documentation, to encourage improvements there.
>
> We could perhaps try having LLMs generate documentation that new contributors could take a first pass at editing for correctness, before a committer takes a final pass.
>
> At the end of the day though, we’re an OSS project and we do have features (big and small) designed, implemented and likely only used by the sole contributor of the feature. We also have features used primarily by active community members who understand it well enough. I don’t think this is a bug in the system. The project obviously aims to serve end users, but the developer community is the actual project and it is fine to serve that demographic first, or only.
>
> I agree we want to improve our documentation, but this is not the right way to go about it.
>
> On 1 May 2025, at 13:19, Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>
> I am not completely sure LLMs are the way to go here. Sure, to have something to further refine ... why not. But to just generate something via LLM and commit that, that would be a no-no from me. These things can start hallucinating quite quickly, then what? Who is going to proofread technical stuff like that? Fixing the hallucinations might take more time than just writing it from scratch.
>
> Anyway, I would really appreciate it if we stayed on track and discussed the proposition mentioned in my first email - the end goal is to codify the need to provide documentation together with the feature. If not provided together, it might be in a separate ticket which will be a blocker for the next release.
>
> I might initiate the voting thread for that ...
>
> Regards
>
> *From:* Rolo, Carlos <carlos.r...@netapp.com>
> *Date:* Thursday, 1 May 2025 at 12:30
> *To:* David Capwell <dcapw...@apple.com>, dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>
> I am a bit out of the loop on how/if this would extend to driver sub-projects.
>
> Because this makes 100% sense, and in the driver space as well.
> Looking into the Java driver docs and making the others similar would be great.
>
> Patrich that LLM suggestion might be a life saver, let me try that!
>
> ------------------------------
>
> *From:* Miklosovic, Stefan via dev <dev@cassandra.apache.org>
> *Sent:* 01 May 2025 08:07
> *To:* David Capwell <dcapw...@apple.com>; dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>
> Denser is better. In your oversimplified example of Accord, as a user who encounters this for the first time, I am definitely interested in what the limitations are. What might happen quite easily is that if it is not dense and we just announce it sparsely, then a user takes it all at face value, and if it starts to diverge from your proclamation then they might feel like they were lied to or they start to be disappointed. You got me? Users do not like surprises they discover themselves while trying it out (often painfully). They just want to know what they are buying themselves into.
>
> If there are super corner-case details, those might be omitted as we have other channels of communication (Slack, mailing list ...) but in general I do not see how a lot of documentation would be bad.
>
> It also depends on who you are writing that documentation for. As said, we talk about user-facing docs here. Documentation for developers, where we are trying to bootstrap them / orient them in the code base, is going to be substantially different from user-facing documentation.
>
> *From:* David Capwell <dcapw...@apple.com>
> *Date:* Wednesday, 30 April 2025 at 23:35
> *To:* dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>
> I wonder at what level we can enforce this. What I mean is, in modeling testing I have found some odd behaviors that people were not aware of (BATCH cell resolution, NULL handling (emptiness…), etc.)… so if documentation is dense this can help force people to think through edge cases or how 2 features interact with each other… If documentation is sparse, then you lose this benefit…
>
> Simple example for Accord
>
> # Sparse
>
> Multiple key transaction support, bringing Apache Cassandra cluster to the RDBMS world!
>
> # Dense
>
> …
>
> Here are the current limitations, …
>
> Here is where we alter Apache Cassandra’s behavior to be more in line with SQL, ...
>
> On Apr 30, 2025, at 1:38 PM, Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>
> To extend the first e-mail to cover the practicalities:
>
> 1. changes introduced to nodetool would not be part of this because they are self-documented (the help docs are autogenerated)
> 2. introduction of changes into cassandra.yaml is already covered, as that is also autogenerated onto the website.
> 3. Applying common sense, if it is enough to just mention it in NEWS.txt, that is also fine.
> 4. metrics - I bet there are some which are not documented; we should find a way to autogenerate them into the website.
>
> I am also to blame and, to show I am not a hypocrite: I have never delivered in-depth user documentation of CEP-24 with examples, use cases, and so on.
> I am trying to be more aware of the documentation when delivering features, to raise awareness about that etc. It is easy to not think about this too much when developers are in a rush and similar. If there was a hard requirement for the documentation, I would have done it right away and I would not need to deal with this now.
>
> I understand that when delivering heavyweights like CEP-15 we cannot expect that all the docs will be done upon delivery, but I want to stress the fact that providing usable documentation should definitely be something to think about when releasing it. Same goes for all other non-trivial features.
>
> *From:* Josh McKenzie <jmcken...@apache.org>
> *Date:* Wednesday, 30 April 2025 at 22:11
> *To:* dev <dev@cassandra.apache.org>
> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>
> This makes intuitive sense to me.
>
> In our case we could tie documentation to the process of promoting a feature from “experimental” to production ready, though I fear that might leave wiggle room for primary authors of some features to leave them as experimental forever, not desiring to take on the burden of documenting something that’s already merged in and usable by experts.
>
> Curious what others think.
>
> On Wed, Apr 30, 2025, at 12:10 PM, Miklosovic, Stefan via dev wrote:
>
> I am at OpenSearchCon and there was a discussion about the documentation of features. In a nutshell, the policy they seem to have is that there are some minimal requirements for documentation in place for each feature introduced. That way, there is no way (or it is greatly minimised) that there would be a feature released or some user-facing change introduced without any documentation on how to use it.
>
> By "documentation", in our case, I mean the docs which would end up in the cassandra.apache.org docs.
>
> In their case, the documentation is either part of the change or there is a documentation issue (in GitHub terms) created which basically blocks the release when not addressed.
>
> When there is no documentation about a feature or improvement, knob to tweak etc, there is virtually nobody who knows about it except the person who committed the code / people who participated in a review. I think this is detrimental to the project. I do not see the point in releasing something undocumented when the only people who know what is going on are the ones who wrote it.
>
> If somebody argued that we have them in CHANGES.txt and NEWS.txt, neither ends up on the website and I do not think they are appropriate vehicles for user-facing documentation or for anything beyond a few sentences.
>
> Could we introduce a policy which would require developers to introduce at least minimal user-facing documentation (if applicable) before delivering it / before releasing it, and it would be part of the reviews?
>
> For now, while we also add documentation, I feel it is a "best-effort" approach; it is not part of the official policy when delivering it.
>
> As of now, I cannot see any information about documentation among the "For Code Contributions" points:
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance
>
> I would like to add a new point there:
>
> Code must not be committed when user-facing functionality is not documented and visible without code inspection.
>
> Regards