I can add a few sentences here, because my involvement with the Cassandra mailing list started with looking at the build system. That was 8 or maybe 9 years ago.

The simple fact that the Cassandra core team understands its chain of Ant task execution doesn't automatically make all other contributors and interested parties understand it. Moreover, a person outside the present core team, even with prior knowledge of Ant, will need some time to dive into the task definitions and how they are invoked. There is a reason why usage of Ant has faded over time: maintaining builds such as Cassandra's takes more and more effort as time goes on. Maven is built around convention over configuration, so most external contributors who have worked with it before will arrive with enough knowledge to understand what your build is doing and how. A lot of the hate people feel towards Maven is caused by its limitations, which they try to work around for one reason or another. Gradle is closer to Ant in its flexibility and gives much more power, but requires a bit of thought so as not to complicate things too much.

Switching to a standardized build requires work, but it usually improves architecture and separation of concerns. The system you build might have complex logic because of business, standards or technology requirements, but your build is not part of that complexity. If you assume it is, then you just double the amount of trouble, as your technology will depend on an over-engineered pipeline.

One of the major pain points I found back in 2013 or 2014 was that Cassandra had a lot of cross dependencies between packages, making it impossible to simplify the build in the first place. Today I am even scared to look at it. Whether these cross dependencies are needed or not, I was not able to answer myself, yet most properly designed systems tend to have directed dependencies free of cycles.
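
To illustrate the point about directed, cycle-free dependencies, here is a minimal Python sketch of how a cycle between packages could be detected with a depth-first search. The package names and the `find_cycle` helper are hypothetical and exist only for this example; they are not taken from the Cassandra code base or from any build tool.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of packages, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    state = {pkg: WHITE for pkg in deps}
    path: List[str] = []                  # packages on the current DFS path

    def visit(pkg: str) -> Optional[List[str]]:
        state[pkg] = GREY
        path.append(pkg)
        for dep in deps.get(pkg, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so we closed a cycle
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[pkg] = BLACK
        return None

    for pkg in list(deps):
        if state[pkg] == WHITE:
            found = visit(pkg)
            if found:
                return found
    return None


# Hypothetical package graph: "storage" and "service" depend on each other.
cyclic = {"storage": ["utils", "service"], "service": ["storage"], "utils": []}
acyclic = {"storage": ["utils"], "service": ["storage"], "utils": []}

print(find_cycle(cyclic))
print(find_cycle(acyclic))
```

For the made-up `cyclic` graph the function reports the path storage, service, storage; for `acyclic` it returns `None`. A check of this kind matters for any modular build, since Maven, for instance, refuses cyclic dependencies between the modules of one multi-module build.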

Looking at the history of the root build.xml, I see 75 contributors who have touched it. That is not a small number; I would say that for a project with such a small number of external dependencies it is a lot. It is still a fair question whether you would benefit from using another build tool. Yet I could ask the opposite question: where would you be if you had made the change 7 years ago? How many hours of tweaking Ant would you have saved?

With regard to linters, I saw them used in openHAB, and I know Apache Maven is looking to adopt a unified code style as well. In both cases the ecosystem of the project is quite large and the number of people contributing is larger than the core team. Yet even if Cassandra is smaller in terms of people contributing patches, it has serious enterprise use cases. Spotless, the-last-npe and other build-time plugins can ensure that you never get a PR or patch with amended whitespace or code formatting. Unnecessary parts will be fixed by the tool, so the patch will contain only the needed change. Whether you enforce it through CI or by hand is another question. For sure staying with Ant doesn't

Best regards,
Łukasz


On 28.11.2022 22:19, Benedict wrote:
Scott makes some valid points about legitimate benefits. I personally doubt the high upfront cost of migration will take less than a decade to pay back in time saved managing shims. But, it’s a tangible justification. Conveniently the bulk of my contributions are also at Scott’s prerogative, so if he’s fine with me (and others) wasting their time battling Gradle or Maven, or losing time to the migration, then I think my complaint is functionally neutered.

I think, though, that those pushing such a disruptive change on others had better work very darn hard to deliver a smooth experience.

I tend to find that maintaining our current Ant build is a big waste of my time, and that every time I need to go to this layer it's far more brittle than it should be

Whereas I find Ant an absolute pleasure, and Gradle a nightmare, and already regret using it for Accord. I wasted more than a day just trying to get some test artefacts in one module exposed in another, and eventually gave up. I have made dozens of forays into our Ant build and *never* abandoned my goal because I couldn’t accomplish it. I think people are really glossing over all the pain other build systems bring - even without our complex build requirements.

If you’re going to make the project adopt your preferred build system, you become responsible for the experience of everyone using it. Make sure you’re ready for that.




On 28 Nov 2022, at 20:18, David Capwell <dcapw...@apple.com> wrote:

I am strong +1 to new linters, I have been working on SpotBugs but not sent a patch due to sickness and holidays…

About the check style as the source of truth for the style guides, I am +1 to this as well… I feel that wiki is a bad place for this and we can use the check style file to generate the wiki text (no idea if there are tools for this…).  About the comment on “hard” requirements… my understanding has been that new changes must follow and I know several reviewers who act this way in the review process…. If we agree on a style, why do we want to keep allowing ignoring it?

There’s always a handful of people asking for it, but notably few if any of the full time contributors doing the majority of the core development of Cassandra.

I don’t agree with that statement, I know several committers/PMC who want to switch, but every 6 months this gets brought up and gets shot down… so we just give up trying…

I never really see a good argument articulated for the migration, besides general hand waving that ant is old, and people like newer build systems. Ant is working fine, so there isn’t a strong technical reason to replace it, and there are good organisational reasons not to.

I don’t know how Mick feels, but I tend to find that maintaining our current Ant build is a big waste of my time, and that every time I need to go to this layer it’s far more brittle than it should be…. There are pushes to refactor to try to get more order in the build (side conversation between Mick and me), but these tend to hide the complexity from most…. I strongly feel that our current build is duct taping things together; this isn’t a complaint about the maintainers but more about how Ant works…

And there are a few things I feel newer build systems offer that actually make my life better:

1) cache -  I hate running tests in a loop as this always has the same pattern: rebuild everything from scratch for 1m to run a single test that takes 30 milliseconds…. With ant we “could” force our tasks to cache, but then we have to go do that for everything… we are basically rewriting and reimplementing what other builds have

2) consistency - CI runs tests using 1 task, but people like to run tests using a different task… they tend to have different setup and act slightly differently… I had to go improve this for jvm-dtest and see that simulator is in the same shape (if you run simulator tests using the normal CI task, they will fail; you need the bulk task that doesn’t have the same features as the CI version)

3) reusable work, avoid copy/paste - when we add new jars/artifacts we need to copy/paste a set of tasks to implement a “sub-module” or “sub-project” (depending on build tool you prefer), not only is this very easy to get wrong, it discourages splitting work into these units (we talked about moving utils/concurrent into its own jar for Accord…)

4) only having to know 1 build system - right now we are Ant + Maven; to maintain this project you need to understand both… you can get by not knowing this until you touch dependencies, and if you need tasks that also know about those dependencies… you need to see how this is taped together and grasp that logic…
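
The caching point in (1) can be sketched as follows. This is a minimal illustration, assuming a task may be skipped whenever the content hash of its input files is unchanged; `fingerprint`, `run_if_changed` and the in-memory `cache` dict are hypothetical names, and real build tools track considerably more state (outputs, classpath, tool versions) than this does.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, Iterable


def fingerprint(paths: Iterable[Path]) -> str:
    """Hash the names and contents of all input files into one digest."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(str(path).encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def run_if_changed(name: str,
                   inputs: Iterable[Path],
                   action: Callable[[], None],
                   cache: Dict[str, str]) -> bool:
    """Run `action` only when the inputs differ from the last recorded run.

    Returns True if the action ran, False if it was skipped as up to date.
    """
    current = fingerprint(inputs)
    if cache.get(name) == current:
        return False          # inputs unchanged: skip the task
    action()
    cache[name] = current     # remember what we built against
    return True
```

With something like this in place, running a single 30 millisecond test in a loop would only pay for the compile step when a source file actually changed. Doing the same in Ant means wiring an equivalent check into every task by hand, which is the reimplementation effort described above.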

Maybe I am bitter as it took a whole day to get SpotBugs working with Ant… and it is just the following in other builds:

Gradle:

apply plugin: "com.github.spotbugs"

Maven:

<reporting>
  <plugins>
    <plugin>
      <groupId>com.github.spotbugs</groupId>
      <artifactId>spotbugs-maven-plugin</artifactId>
    </plugin>
  </plugins>
</reporting>


On Nov 28, 2022, at 10:46 AM, Maxim Muzafarov <mmu...@apache.org> wrote:

Thank you all for the feedback and productive discussion.


I couldn't have formed my thoughts on the build tools for the product
better, or provided such good examples, than Scott did. Rephrasing what
I wrote in the first letter: when a modern IDE sees Maven/Gradle in the
project, it takes care of all the necessary files and configurations by
itself much better than we do with scripts. I fully agree that there is
no rush with such a migration, and that databases in such cases must be
more conservative than progressive, and not change anything without
strong benefits and a broad consensus on it. I still believe this
consensus can be reached in the future, and when (and if) it is
reached, a clear migration plan should be developed for several
releases ahead as well. There's still a lot of work to be done here,
which is why I mentioned it at the end of my proposal, so as not to pay
too much attention to this question at this moment.
I've added a link to this thread to the JIRA issue [1], so we don't
lose the insights mentioned by members above.


I want to take away your concerns about lints expansion for now. I
thought first of all about making all the source code-checking tools
more convenient for use with a minimal set of already existing lints
rather than adding or forcing new rules. I really want to avoid here
cases with storing multiple configurations for a single tool e.g.
having different configurations for 'optional' or 'mandatory' checks
as well as different configurations for 'production' or 'tests'.

Thus, the ideal picture in my mind of all discussed above is :

We have:
- checkstyle
- SpotBugs
- Sonar

They work the same way for:
- Jenkins builds
- CircleCI builds
- GitHub pull requests
- build on the local machine

For all that, the code style is already described pretty well on the
webpage [2] (and wiki [3]), so there is no need to expand the checking
tools with new rules until we get these tools working on a minimal set
of rules. For instance, we can pick 'Unused imports' and 'Import
order' for checkstyle, 'AutoCloseable' and 'Number.valueOf' for
SpotBugs, and for Sonar only reports to monitor the source code
trends.

I agree that adding new lints requires a broad consensus, so I'd like
to avoid such debatable questions for now. Moreover, even with the
lints already agreed upon, it is still risky to implement some of them
because they can contain a lot of boilerplate changes and may affect
more important fixes ready for merge.


So, as a first step, I can invest my time into the checkstyle tool and
make it work everywhere with the same configuration.
WDYT?


P.S.

For IntelliJ with the Checkstyle Plugin it's easy to import the
checkstyle.xml the following way:
Preferences -> Code Style -> Show Scheme Actions (wheel) -> Import
scheme -> Checkstyle configuration.


[1] https://issues.apache.org/jira/browse/CASSANDRA-17015
[2] https://cassandra.apache.org/_/development/code_style.html
[3] https://cwiki.apache.org/confluence/display/CASSANDRA2/CodeStyle

On Sun, 27 Nov 2022 at 13:17, Josh McKenzie <jmcken...@apache.org> wrote:

My .02 on the build discussion is we should try and keep the guts of that in one place, be it the other email thread or on JIRA. Some insightful points made on this thread but would hate to see this thread derailed on a complex independent topic as well as see some of these points lost on the other discussion.

I think there needs to be a lot of community consensus on the broad expansion of lints that can reject patches.

+1. It may be worthwhile to configure 2 tiers of lints, optional and required, so we can move to a more gradual process of cleaning up lint violations for those that are interested in that type of work. I know in the past we've seen value in looking at the diff in linting violations even w/a 1k+ noisy violation environment.
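
The idea of looking only at the diff in lint violations, split into required and optional tiers, could be sketched roughly like this. It is an illustration under stated assumptions, not an existing tool: the `Violation` record, the `gate` function and the choice of which rules are "required" are all hypothetical, and the rule names are used purely as examples.

```python
from typing import List, NamedTuple, Set, Tuple


class Violation(NamedTuple):
    # Line numbers are deliberately left out so that unrelated edits which
    # shift code up or down do not make old violations look new.
    file: str
    rule: str
    message: str


def gate(baseline: Set[Violation],
         patched: Set[Violation],
         required_rules: Set[str]) -> Tuple[List[Violation], List[Violation]]:
    """Split violations introduced by a patch into blocking and advisory ones."""
    introduced = patched - baseline
    blocking = sorted(v for v in introduced if v.rule in required_rules)
    advisory = sorted(v for v in introduced if v.rule not in required_rules)
    return blocking, advisory


# Hypothetical data: one pre-existing violation, two added by the patch.
baseline = {Violation("A.java", "UnusedImports", "Unused import java.util.List")}
patched = baseline | {
    Violation("B.java", "UnusedImports", "Unused import java.util.Map"),
    Violation("B.java", "ImportOrder", "Wrong order for java.io.File"),
}

blocking, advisory = gate(baseline, patched, required_rules={"UnusedImports"})
```

Here the pre-existing violation in `A.java` is ignored, the new unused import in `B.java` would reject the patch, and the import-order finding is only reported. That keeps a noisy baseline of existing violations from blocking anyone while still preventing new ones in the required tier.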


On Fri, Nov 25, 2022, at 12:41 PM, sc...@paradoxica.net wrote:

For me, the strongest arguments in favor of adopting a modern build tool like Maven or Gradle are their ecosystems - both explicit (in terms of plugins), and implicit (in terms of nearly all build tooling supporting both of them, but not ant).

Investment in Ant - and in tooling that integrates with Ant - fell off years ago. This makes integrating build-phase aspects of Cassandra with other tooling a very frustrating task that users of most build tools get for free. Many tools built in the last several years don’t support it, or do so only as an afterthought.

Two recent examples that have caused pain for me, which I suspect are felt by many:

– Integration with internal build systems at many companies that develop Cassandra. Because ant has fallen into disuse, this integration is heavily manual instead of automatic and free. It usually requires forking the project’s build.xml, developing custom tooling around it, or creating a mock Gradle build that wraps ant lifecycle tasks (which also requires overriding ant tasks whose names clash).

– Security toolchain integration. Many users and developers of Cassandra also integrate with security tooling at their respective companies. Because Ant has fallen into disuse, most tooling commonly used by security organizations doesn’t support it. SBOMs are a good example, as their introduction postdates ant’s decline. Maven plugins exist to generate them in CycloneDX and SPDX format, but no such plugins exist for ant. Cassandra users and developers who need them must manually write shims to produce SBOMs that users of modern build tools get for free.

These might not be use cases anticipated by the project, but they represent work I suspect a large number of contributors to the project are required to perform to make Cassandra usable for them.

The ecosystem point means that the fix for this has to be external to the project if the project continues to use Ant: lots of one-off scripts and shims unsuitable for contribution with sole maintainers at their respective companies to provide functionality that users of modern build tools get for free. It also demands continuous, incremental work to adapt to changes in security and build tooling in use at many companies that don’t need to be made for projects using well-supported tools like Maven or Gradle.

– Scott

On Nov 25, 2022, at 4:56 AM, Benedict <bened...@apache.org> wrote:

I think modules are already fairly well supported - we in effect already have several? FQL, Simulator and others I think.

I can anyway say with absolute certainty that the main impediment to modularising is not build tooling, it’s the difficulty of the task, the cost to the project of undertaking it, and therefore its relative payoff versus other things that could be undertaken by the folk with the relevant expertise to do it.

I’ve also been around long enough to see enough efforts to broaden contributions fail to have an impact, that basing costly decisions on this kind of reasoning doesn’t resonate. The main impediments to contributions are the complexity of the codebase (and problem domain) and our limited capacity to respond promptly to high quality contributions. Until we fix those fundamental issues, this kind of rearranging of chairs seems more like a branding exercise to ourselves than to anyone else.


On 25 Nov 2022, at 10:22, Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:

I agree with what you wrote. How I understand it is that migrating to Maven/Gradle makes the project more "attractive" for newcomers. If a project is built on "that old un-cool Ant", it might be a little bit off-putting and questionable if we are "stuck in the past on build systems and not progressing".

So in that sense I agree this is more a "marketing" than a technological question, but on the other hand, does Maven/Gradle not allow us to modularize the project better? Maybe we would like to modularize but nobody is up to it because the build system makes it impossible or at least quite inconvenient to do so. Do you really think there are no significant benefits to switching even if it "just works" now?

________________________________________
From: Benedict <bened...@apache.org>
Sent: Friday, November 25, 2022 11:07
To: dev@cassandra.apache.org
Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




There’s always a handful of people asking for it, but notably few if any of the full time contributors doing the majority of the core development of Cassandra. It strikes me as something very appealing to others, but less so to those wanting to get on with development.

I never really see a good argument articulated for the migration, besides general hand waving that ant is old, and people like newer build systems. Ant is working fine, so there isn’t a strong technical reason to replace it, and there are good organisational reasons not to.

Why do you consider a migration inevitable?



On 25 Nov 2022, at 09:58, Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:

Interesting take on Ant / no-Ant, Benedict. I am very curious how this unfolds. My long-term perception is that changing it to something else is more or less inevitable but if there is a broader consensus to not do that .... well.

________________________________________
From: Benedict <bened...@apache.org>
Sent: Friday, November 25, 2022 10:52
To: dev@cassandra.apache.org
Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis


I was in a bit of a rush last night. I should say that I’m of course +1 a general endeavour to clean this up, and to expand our use of linters, and I appreciate your volunteering to help out in this way Maxim.

However, responding to Stefan, I’m pretty -1 migrating from ant to another build system without really good reason. Migration has a real cost to productivity for all existing contributors, and the phantom of increasing new contributions has never paid off historically. I’m all for easing people into participation, but not at penalty to the existing contributor base.

If the only reason is to make it easier to open in a different IDE, we can perhaps have some basic build files outlining code structure for importing, that are compatible with our canonical ant build? We could perhaps even generate them.


On 25 Nov 2022, at 09:35, Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:

For the record, I was testing that same combo Claude mentioned and it did not work out of the box but it is definitely possible to set up successfully. I do not remember the details.

To reply to Maxim, it all seems good to me, roughly, but I humbly think it all boils down to the Maven/Gradle refactoring, and on top of that we can do everything else.

For example, there is (1) where the solution, besides fixing the tests, is to introduce an Ant task which would check this on build. That being said, what is that going to look like when we change Ant for something else? That stuff suddenly becomes obsolete.

This case maybe applies to other problems we want to solve as well. I do not want to do something tailored for one build system just to rewrite it all, or to spend a significant amount of time on it again, when we switch the build system.

For that reason I think changing Ant for something else should be the top priority (as I understand it has been a hot topic for the community for a very long time) and then everything else should follow. We should spend time on the things mentioned only if they do not collide with any build system at all.

(1) https://issues.apache.org/jira/browse/CASSANDRA-17964

Stefan

________________________________________
From: Claude Warren, Jr via dev <dev@cassandra.apache.org>
Sent: Friday, November 25, 2022 10:16
To: dev@cassandra.apache.org
Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis


+1 for the concept as a whole.  I am certain I could find nits to pick if I looked deeply.

@mck -- I did have a problem with Cassandra + Eclipse + Java11 (Classpath).  I gave up and am spending time trying to learn IntelliJ.  I also mentioned it in one of the discussion areas.

Claude

On Thu, Nov 24, 2022 at 8:55 PM Mick Semb Wever <m...@apache.org> wrote:
Thank you for a solid write up Maxim. And welcome to Cassandra, it's
very positive to see you here.

I whole-heartedly agree with nearly everything you write. Some input
and questions inline.



As you may know, the infrastructure team has disabled public sign-up
to ASF JIRA (the GitHub issues are recommended instead).



I suspect the preference (based on chatter in general, but not on dev@
AFAIK) is to avoid GH issues and stick with jira. We will document the
sign-up hurdle on the website: CASSANDRA-18064



== 1. Make the checkstyle config a single point of truth for the
source code style. ==

The checkstyle is already used and included in the Cassandra project
build lifecycle (ant command line, Jenkins, CircleCI). There is no
need to maintain code style configurations for different types of IDEs
(e.g. IntelliJ inspections configuration) since the checkstyle.xml
file can be directly imported into the IDE used by a developer. This
holds for IntelliJ IDEA, NetBeans, and Eclipse.


Big +1


So, I propose to focus on the checks themselves and checking pull
requests with automation scripts, rather than maintaining these
integrations. The benefits here are avoiding all issues with
maintaining configurations for different IDEs. Another good advantage
of this approach would be the ability to add new checkstyle rules
without touching IDE configuration - and such tickets will be LFH and
easy to commit.

The action points here are:

- create an umbrella JIRA ticket for all checkstyle issues e.g. [8]
(or label checkstyle);
- add checkstyle to GitHub pull requests using GitHub actions (execute
ant command);


Instead of custom GHA scripting, please use our existing
cassandra-artifact.sh (which should already include all such checks).

Something like https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:cassandra:mck/github-actions/3.11



== 3. Enable pushing build results back from both Jenkins and
CircleCI to GitHub pull requests. ==

The goal here is to have a "green checkbox" for those GitHub pull
requests that have corresponding Jenkins/CircleCI runs. According to
the following links, it is completely possible to have those.

https://github.com/jenkinsci/github-branch-source-plugin
https://circleci.com/docs/enable-checks/

Moreover, the GitHub Branch Source Plugin is already enabled for the
Cassandra project [16]. The same should apparently work for CircleCI,
but I came across an infrastructure team comment [17] saying that
admin permissions are required for that, so the question is still
open here. I could dig a bit more once we've agreed on it.

The action points here are:
- enable Jenkins integration for GitHub pull requests;
- enable CircleCI integration for GitHub pull requests;


Some folk use CircleCI, some use ci-cassandra. The green checkbox idea
is great, so long as it's optional. We don't want PRs triggering the
runs either (there are other mechanisms for triggering for now).


The action points here are:

- initiate a wide survey for user and dev lists, to get to know about
the usages;
- remove those configurations that are not used anymore;
- force migration from Ant to Gradle/Maven;


Let's leave this out for now. There are too many unknowns here. If
there's an IDE configuration that's broken, no one has reported it for
ages, and no one is around to fix it, then I say we should raise the
discussion to remove it.

The Gradle/Maven migration is a hot one, contribute to that discussion
but let's not tangle this work up with it, IMHO.

Totally agree that IDE project files should be as light weight as possible.


Summarizing everything proposed above, I think it is possible to
make adding small contributions to the codebase easier as well as
save a bunch of committers' time.

So,
WDYT about the things described above?
Should I create a CEP for this?


I see no need for a CEP here. An epic and tickets will work.
Again, thanks for the input Maxim!




