On the topic of code style, the wiki not only covers many elements of style that are impossible to encode in a linter, but explicitly uses terminology on many lintable items that is not “hard.” I should know, I formulated it this way deliberately, and made this explicit in the discussion while shepherding it through a vote - only quite recently.
For both these reason I think it is impossible to make checkstyle the source of truth. We can vote for a subset of rules we want everyone to follow without any exception, and translate these into a checkstyle.xml without a vote. But: 1) I personally have seen limited value to strict style enforcement on projects where I’ve interacted with it. Expressing code neatly involves a lot of authorship decisions that a linter simply cannot understand. 2) There’s already a broad array of styles in the project, and this is fine. We don’t have to check-in identical looking code, so long as it is all the same flavour. 3) It would anyway be hugely disruptive to apply to PRs to existing code. I think if you’re keen on this it would be best to start by selecting your preferred subset of hard rules. > About the check style as the source of truth for the style guides, I am +1 to > this as well… I feel that wiki is a bad place for this and we can use the > check style file to generate the wiki text (no idea if there are tools for > this…). About the comment on “hard” requirements… my understanding has been > that new changes must follow and I know several reviewers who act this way in > the review process…. If we agree on a style, why do we want to keep allowing > ignoring it? > On 28 Nov 2022, at 20:18, David Capwell <dcapw...@apple.com> wrote: > > I am strong +1 to new linters, I have been working on SpotBugs but not sent > a patch due to sickness and holidays… > > About the check style as the source of truth for the style guides, I am +1 to > this as well… I feel that wiki is a bad place for this and we can use the > check style file to generate the wiki text (no idea if there are tools for > this…). About the comment on “hard” requirements… my understanding has been > that new changes must follow and I know several reviewers who act this way in > the review process…. If we agree on a style, why do we want to keep allowing > ignoring it? > >> There’s always a handful of people asking for it, but notably few if any of >> the full time contributors doing the majority of the core development of >> Cassandra. > > > I don’t agree with that statement, I know several committers/PMC who want to > switch, but every 6 months this gets brought up and gets shot down… so we > just give up trying… > >> I never really see a good argument articulated for the migration, besides >> general hand waving that ant is old, and people like newer build systems. >> Ant is working fine, so there isn’t a strong technical reason to replace it, >> and there are good organisational reasons not to. > > I don’t know how Mick feels, but I tend to find that maintaining our current > Ant build is a big waste of my time, and that every time I need to go to this > layer its far more brittle than it should be…. There are pushes to refactor > to try to get more order in the build (side conversation between Mick and I), > but these tend to hide the complexity from most…. I strongly feel that our > current build is duct taping things together, this isn’t a complaint of the > maintainers but more how Ant works… > > And there are a few things I feel newer build systems offer that actually > make my life better: > > 1) cache - I hate running tests in a loop as this always has the same > pattern: rebuild everything from scratch for 1m to run a single test that > takes 30 milliseconds…. With ant we “could” force our tasks to cache, but > then we have to go do that for everything… we are basically rewriting and > reimplementing what other builds have > > 2) consistency - CI runs tests using 1 task, but people like to run tests > using a different task… they tend to have different setup and act slightly > differently… I had to go improve this for jvm-dtest and see that simulator is > in the same shape (if you run simulator tests using the normal CI task, they > will fail; you need the bulk task that doesn’t have the same features as the > CI version) > > 3) reusable work, avoid copy/paste - when we add new jars/artifacts we need > to copy/paste a set of tasks to implement a “sub-module” or “sub-project” > (depending on build tool you prefer), not only is this very easy to get > wrong, it discourages splitting work into these units (we talked about moving > utils/concurrent into its own jar for Accord…) > > 4) only having to know 1 build system - right now we are Ant + Maven; to > maintain this project you need to understand both… you can get by not knowing > this until you touch dependencies, and if you need tasks that also know about > those dependencies… you need to see how this is taped together and grasp that > logic… > > Maybe I am bitter as it took a whole day to get SpotBugs working with Ant… > and it is just the following in other builds: > > Gradle: apply plugin: “com.github.spotbugs" > Maven: > > <reporting> > <plugins> > <plugin> > <groupId>com.github.spotbugs</groupId> > <artifactId>spotbugs-maven-plugin</artifactId> > </plugin> > </plugins> > </reporting> > > >> On Nov 28, 2022, at 10:46 AM, Maxim Muzafarov <mmu...@apache.org> wrote: >> >> Thank you all for the feedback and productive discussion. >> >> >> I couldn't have formed my thoughts on the build tools for the product >> better and provided such good examples than Scott did. Rephrasing what >> I wrote in the first letter, seeing Maven/Gradle in the project >> underfoot, a modern IDE will take care of all the necessary files and >> configurations for themselves much better than we do with scripts. I >> fully agree that there is no rush with such migration, and the >> databases in such cases must be more conservative than progressive, >> and not change anything without strong benefits and a broad consensus >> on it. I still believe this consensus can be reached in future and >> when (and if) the consensus will be reached, a clear migration plan >> should be developed for several releases ahead as well. There's still >> a lot of work to be done here that's why I mentioned it at the end of >> my proposal, so as not to pay too much attention to this question at >> this moment. >> I've added a link to this thread to the JIRA issue [1], so we don't >> lose the insights mentioned by members above. >> >> >> I want to take away your concerns about lints expansion for now. I >> thought first of all about making all the source code-checking tools >> more convenient for use with a minimal set of already existing lints >> rather than adding or forcing new rules. I really want to avoid here >> cases with storing multiple configurations for a single tool e.g. >> having different configurations for 'optional' or 'mandatory' checks >> as well as different configurations for 'production' or 'tests'. >> >> Thus, the ideal picture in my mind of all discussed above is : >> >> We have: >> - checkstyle >> - SpotBugs >> - Sonar >> >> They work the same way for: >> - Jenkins builds >> - CirleCI builds >> - GitHub pull requests >> - build on the local machine >> >> For all that, we have the code style webpage [2] (and wiki [3]) is >> pretty well described, there is no need to expand checking tools with >> new rules until we will get working these tools on the minimal set of >> rules. For instance, we can pick up for the checkstyle 'Unused >> imports', 'Import order', for the SpotBugs 'AutoCloseable', >> 'Number.valueOf', for Sonar - only reports to monitor the source code >> trends. >> >> I agree that adding new lints require a broad consensus, so I'd like >> to avoid such debatable questions for now. Moreover, even with the >> lints already agreed upon, it is still risky to implement some of them >> because they can contain a lot of boilerplate changes and may affect >> more important fixes ready for merge. >> >> >> So, as a first step, I can invest my time into the checkstyle tool and >> make it work everywhere with the same configuration. >> WDYT? >> >> >> P.S. >> >> For IntelliJ with the Checkstyle Plugin it's easy to import the >> checkstyle.xml the following way: >> Preferences -> Code Style -> Show Scheme Actions (wheel) -> Import >> scheme -> Checkstyle configuration. >> >> >> [1] https://issues.apache.org/jira/browse/CASSANDRA-17015 >> [2] https://cassandra.apache.org/_/development/code_style.html >> [3] https://cwiki.apache.org/confluence/display/CASSANDRA2/CodeStyle >> >>> On Sun, 27 Nov 2022 at 13:17, Josh McKenzie <jmcken...@apache.org> wrote: >>> >>> My .02 on the build discussion is we should try and keep the guts of that >>> in one place, be it the other email thread or on JIRA. Some insightful >>> points made on this thread but would hate to see this thread derailed on a >>> complex independent topic as well as see some of these points lost on the >>> other discussion. >>> >>> I think there needs to be a lot of community consensus on the broad >>> expansion of lints that can reject patches. >>> >>> +1. It may be worthwhile to configure 2 tiers of lints, optional and >>> required, so we can move to a more gradual process of cleaning up lint >>> violations for those that are interested in that type of work. I know in >>> the past we've seen value in looking at the diff in linting violations even >>> w/a 1k+ noisy violation environment. >>> >>> >>> On Fri, Nov 25, 2022, at 12:41 PM, sc...@paradoxica.net wrote: >>> >>> For me, the strongest arguments in favor of adopting a modern build tool >>> like Maven or Gradle are their ecosystems - both explicit (in terms of >>> plugins), and implicit (in terms of nearly all build tooling supporting >>> both of them, but not ant). >>> >>> Investment in Ant - and in tooling that integrates with Ant - fell off >>> years ago. This makes integrating build-phase aspects of Cassandra with >>> other tooling a very frustrating task that users of most build tools get >>> for free. Many tools built in the last several years don’t support it, or >>> do so only as an afterthought. >>> >>> Two recent examples that have caused pain for me, which I suspect are felt >>> by many: >>> >>> – Integration with internal build systems at many companies that develop >>> Cassandra. Because ant has fallen into disuse, this integration is heavily >>> manual instead of automatic and free. It usually requires forking the >>> project’s build.xml, developing custom tooling around it, or creating a >>> mock Gradle build that wraps ant lifecycle tasks (which also requires >>> overriding ant tasks whose names clash). >>> >>> – Security toolchain integration. Many users and developers of Cassandra >>> also integrate with security tooling at their respective companies. Because >>> Ant has fallen into disuse, most tooling commonly used by security >>> organizations doesn’t support it. SBOMs are a good example, as their >>> introduction postdates ant’s decline. Maven plugins exist to generate them >>> in CycloneDX and SPDX format, but no such plugins exist for ant. Cassandra >>> users and developers who need them must manually write shims to produce >>> SBOMs that users of modern build tools get for free. >>> >>> These might not be use cases anticipated by the project, but they represent >>> work I suspect a large number of contributors to the project are required >>> to perform to make Cassandra usable for them. >>> >>> The ecosystem point means that the fix for this has to be external to the >>> project if the project continues to use Ant: lots of one-off scripts and >>> shims unsuitable for contribution with sole maintainers at their respective >>> companies to provide functionality that users of modern build tools get for >>> free. It also demands continuous, incremental work to adapt to changes in >>> security and build tooling in use at many companies that don’t need to be >>> made for projects using well-supported tools like Maven or Gradle. >>> >>> – Scott >>> >>>> On Nov 25, 2022, at 4:56 AM, Benedict <bened...@apache.org> wrote: >>>> >>>> I think modules are already fairly well supported - we in effect already >>>> have several? FQL, Simulator and others I think. >>>> >>>> I can anyway say with absolute certainty that the main impediment to >>>> modularising is not build tooling, it’s the difficulty of the task, the >>>> cost to the project of undertaking it, and therefore its relative payoff >>>> versus other things that could be undertaken by the folk with the relevant >>>> expertise to do it. >>>> >>>> I’ve also been around long enough to see enough efforts to broaden >>>> contributions fail to have an impact, that basing costly decisions on this >>>> kind of reasoning doesn’t resonate. The main impediments to contributions >>>> are the complexity of the codebase (and problem domain) and our limited >>>> capacity to respond promptly to high quality contributions. Until we fix >>>> those fundamental issues, this kind of rearranging of chairs seems more >>>> like a branding exercise to ourselves than to anyone else. >>>> >>>> >>>>> On 25 Nov 2022, at 10:22, Miklosovic, Stefan >>>>> <stefan.mikloso...@netapp.com> wrote: >>>>> >>>>> I agree with what you wrote. How I understand it is that migrating to >>>>> Maven/Gradle makes the project more "attractive" for newcomers. If a >>>>> project is built on "that old un-cool Ant", it might be a little bit >>>>> off-putting and questionable if we are "stuck in the past on build >>>>> systems and not progressing". >>>>> >>>>> So in that sense I agree this is more "marketing" rather than >>>>> technological question but on the other hand, does not Maven/Gradle allow >>>>> us to modularize the project better? Maybe we would like to modularize >>>>> but nobody is up to that because build system makes it impossible or at >>>>> least quite inconvenient to do so. Do you really think there are not any >>>>> significant benefits to switch even if it "just works" now? >>>>> >>>>> ________________________________________ >>>>> From: Benedict <bened...@apache.org> >>>>> Sent: Friday, November 25, 2022 11:07 >>>>> To: dev@cassandra.apache.org >>>>> Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis >>>>> >>>>> NetApp Security WARNING: This is an external email. Do not click links or >>>>> open attachments unless you recognize the sender and know the content is >>>>> safe. >>>>> >>>>> >>>>> >>>>> >>>>> There’s always a handful of people asking for it, but notably few if any >>>>> of the full time contributors doing the majority of the core development >>>>> of Cassandra. It strikes me as something very appealing to others, but >>>>> less so to those wanting to get on with development. >>>>> >>>>> I never really see a good argument articulated for the migration, besides >>>>> general hand waving that ant is old, and people like newer build systems. >>>>> Ant is working fine, so there isn’t a strong technical reason to replace >>>>> it, and there are good organisational reasons not to. >>>>> >>>>> Why do you consider a migration inevitable? >>>>> >>>>> >>>>> >>>>>> On 25 Nov 2022, at 09:58, Miklosovic, Stefan >>>>>> <stefan.mikloso...@netapp.com> wrote: >>>>>> >>>>>> Interesting take on Ant / no-Ant, Benedict. I am very curious how this >>>>>> unfolds. My long-term perception is that changing it to something else >>>>>> is more or less inevitable but if there is a broader consensus to not do >>>>>> that .... well. >>>>>> >>>>>> ________________________________________ >>>>>> From: Benedict <bened...@apache.org> >>>>>> Sent: Friday, November 25, 2022 10:52 >>>>>> To: dev@cassandra.apache.org >>>>>> Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis >>>>>> >>>>>> NetApp Security WARNING: This is an external email. Do not click links >>>>>> or open attachments unless you recognize the sender and know the content >>>>>> is safe. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I was in a bit of a rush last night. I should say that I’m of course +1 >>>>>> a general endeavour to clean this up, and to expand our use of linters, >>>>>> and I appreciate your volunteering to help out in this way Maxim. >>>>>> >>>>>> However, responding to Stefan, I’m pretty -1 migrating from ant to >>>>>> another build system without really good reason. Migration has a real >>>>>> cost to productivity for all existing contributors, and the phantom of >>>>>> increasing new contributions has never paid off historically. I’m all >>>>>> for easing people into participation, but not at penalty to the existing >>>>>> contributor base. >>>>>> >>>>>> If the only reason is to make it easier to open in a different IDE, we >>>>>> can perhaps have some basic build files outlining code structure for >>>>>> importing, that are compatible with our canonical ant build? We could >>>>>> perhaps even generate them. >>>>>> >>>>>> >>>>>>>> On 25 Nov 2022, at 09:35, Miklosovic, Stefan >>>>>>>> <stefan.mikloso...@netapp.com> wrote: >>>>>>> >>>>>>> For the record, I was testing that same combo Claude mentioned and it >>>>>>> did not work out of the box but it is definitely possible to set up >>>>>>> successfully. I do not remember the details. >>>>>>> >>>>>>> To replay to Maxim, it all seems good to me, roughly, but I humbly >>>>>>> think it all boils down to Maven/Gradle refactoring and on top of that >>>>>>> we can do all else. >>>>>>> >>>>>>> For example, there is (1) where the solution, besides fixing the tests, >>>>>>> is to introduce an Ant task which would check this on build. That being >>>>>>> said, how is that going to look like when we change Ant for something >>>>>>> else? That stuff suddenly becomes obsolete. >>>>>>> >>>>>>> This case maybe applies to other problems we want to solve as well. I >>>>>>> do not want to do something tailored for one build system just to >>>>>>> rewrite it all or to spend significant amount of time on that again >>>>>>> when we switch the build system. >>>>>>> >>>>>>> For that reason I think changing Ant for something else should be top >>>>>>> priority (as I understand that it the hot topic for community for very >>>>>>> long time) and then everything else should follow. We should spend time >>>>>>> on things mentioned only in case they do not collide with any build >>>>>>> system at all. >>>>>>> >>>>>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964 >>>>>>> >>>>>>> Stefan >>>>>>> >>>>>>> ________________________________________ >>>>>>> From: Claude Warren, Jr via dev <dev@cassandra.apache.org> >>>>>>> Sent: Friday, November 25, 2022 10:16 >>>>>>> To: dev@cassandra.apache.org >>>>>>> Subject: Re: [DISCUSSION] Cassandra's code style and source code >>>>>>> analysis >>>>>>> >>>>>>> NetApp Security WARNING: This is an external email. Do not click links >>>>>>> or open attachments unless you recognize the sender and know the >>>>>>> content is safe. >>>>>>> >>>>>>> >>>>>>> >>>>>>> +1 for the concept as a whole. I am certain I could find nits to pick >>>>>>> if I looked deeply. >>>>>>> >>>>>>> @mck -- I did have a problem with Cassandra + Eclipse + Java11 >>>>>>> (Classpath). I gave up and am spending time trying to learn IntelliJ. >>>>>>> I also mentioned it in one of the discussion areas. >>>>>>> >>>>>>> Claude >>>>>>> >>>>>>>> On Thu, Nov 24, 2022 at 8:55 PM Mick Semb Wever >>>>>>>> <m...@apache.org<mailto:m...@apache.org>> wrote: >>>>>>> Thank you for a solid write up Maxim. And welcome to Cassandra, it's >>>>>>> very positive to see you here. >>>>>>> >>>>>>> I whole-heartedly agree with nearly everything you write. Some input >>>>>>> and questions inline. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> As you may know, the infrastructure team has disabled public sign-up >>>>>>>> to ASF JIRA (the GitHub issues are recommended instead). >>>>>>>> >>>>>>> >>>>>>> >>>>>>> I suspect (based on chatter in general, but not on dev@ AFAIK) is to >>>>>>> avoid GH issues and stick with jira. The sign-up hurdle we will >>>>>>> document on the website: CASSANDRA-18064 >>>>>>> >>>>>>> >>>>>>> >>>>>>>> == 1. Make the checkstyle config a single point of truth for the >>>>>>>> source code style. == >>>>>>>> >>>>>>>> The checkstyle is already used and included in the Cassandra project >>>>>>>> build lifecycle (ant command line, Jenkins, CircleCI). There is no >>>>>>>> need to maintain code style configurations for different types of IDEs >>>>>>>> (e.g. IntelliJ inspections configuration) since the checkstyle.xml >>>>>>>> file can be directly imported to IDE used by a developer. This is fair >>>>>>>> for Intellij Idea, NetBeans, and Eclipse. >>>>>>> >>>>>>> >>>>>>> Big +1 >>>>>>> >>>>>>> >>>>>>>> So, I propose to focus on the checks themselves and checking pull >>>>>>>> requests with automation scripts, rather than maintaining these >>>>>>>> integrations. The benefits here are avoiding all issues with >>>>>>>> maintaining configurations for different IDEs. Another good advantage >>>>>>>> of this approach would be the ability to add new checkstyle rules >>>>>>>> without touching IDE configuration - and such tickets will be LFH and >>>>>>>> easy to commit. >>>>>>>> >>>>>>>> The actions points here are: >>>>>>>> >>>>>>>> - create an umbrella JIRA ticket for all checkstyle issues e.g. [8] >>>>>>>> (or label checkstyle); >>>>>>>> - add checkstyle to GitHub pull requests using GitHub actions (execute >>>>>>>> ant command); >>>>>>> >>>>>>> >>>>>>> Instead of custom GHA scripting, please use our existing >>>>>>> cassandra-artifact.sh (which should already include all such checks). >>>>>>> >>>>>>> Something like >>>>>>> https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:cassandra:mck/github-actions/3.11 >>>>>>> >>>>>>> >>>>>>> >>>>>>>> == 3. Enable pushing backwards build results for both Jenkins and >>>>>>>> CircleCI to GitHub pull requests. == >>>>>>>> >>>>>>>> The goal here is to have a "green checkbox" for those GitHub pull >>>>>>>> requests that have corresponding Jenkins/CircleCI runs. According to >>>>>>>> the following links, it is completely possible to have those. >>>>>>>> >>>>>>>> https://github.com/jenkinsci/github-branch-source-plugin >>>>>>>> https://circleci.com/docs/enable-checks/ >>>>>>>> >>>>>>>> Moreover, the GitHub Branch Source Plugin is already enabled for the >>>>>>>> Cassandra project [16]. The same seems should work the same way for >>>>>>>> CirleCI, but I have faced the infrastructure team comment [17] that >>>>>>>> describes admin permissions are required for that, so the question is >>>>>>>> still open here. I could dig a bit more once we've agreed on it. >>>>>>>> >>>>>>>> The actions points here are: >>>>>>>> - enable Jenkins integration for GitHub pull requests; >>>>>>>> - enable CircleCI integration for GitHub pull requests; >>>>>>> >>>>>>> >>>>>>> Some folk use CircleCI, some use ci-cassandra. The green checkbox idea >>>>>>> is great, so long as it's optional. We don't want PRs triggering the >>>>>>> runs either (there are other mechanisms for triggering for now). >>>>>>> >>>>>>> >>>>>>>> The actions points here are: >>>>>>>> >>>>>>>> - initiate a wide survey for user and dev lists, to get to know about >>>>>>>> the usages; >>>>>>>> - remove those configurations that are not used anymore; >>>>>>>> - force migration from Ant to Gradle/Maven; >>>>>>> >>>>>>> >>>>>>> Let's leave this out for now. There's too many unknowns here. If >>>>>>> there's an IDE configuration that's broken, no one has reported it for >>>>>>> ages, and no one is around to fix it, then I say we should raise the >>>>>>> discussion to remove it. >>>>>>> >>>>>>> The Gradle/Maven migration is a hot one, contribute to that discussion >>>>>>> but let's not tangle this work up with it, IMHO. >>>>>>> >>>>>>> Totally agree that IDE project files should be as light weight as >>>>>>> possible. >>>>>>> >>>>>>> >>>>>>>> Summarizing everything proposed above I think it is possible to >>>>>>>> simplify adding small contributions easier to the codebase as well as >>>>>>>> save a bunch of committer's time. >>>>>>>> >>>>>>>> So, >>>>>>>> WDYT about the things described above? >>>>>>>> Should I create a CEP for this? >>>>>>> >>>>>>> >>>>>>> I see no need for a CEP here. An epic and tickets will work. >>>>>>> Again, thanks for the input Maxim! >>>>> >>> >>> >>> >