There are three distinct problems you raise: code structure, documentation, and build system.
The build system, as far as I can tell, is a matter of personal preference. I personally dislike the few interactions I've had with maven, but thankfully my interactions with build system innards have been fairly limited; I mostly just use them. Unless maven delivers a concrete and significant benefit, though, it just doesn't seem worth the upheaval to me. If you can make the argument that it actually improves the project in a way that justifies the upheaval, it will certainly be considered, but so far no such justification has been made.

The documentation problem is common to many projects, though: out-of-codebase documentation gets stale very rapidly. When we say "read the code" we mean "read the code and its inline documentation". The quality of this documentation has itself generally been substandard, but it has been improving significantly over the past year or so, and we are endeavouring to improve it with every change. In the meantime, there are videos from a recent bootcamp we ran for both internal and external contributors: http://www.datastax.com/dev/blog/deep-into-cassandra-internals

The code structure would be great to modularise, but the reality is that it is not currently modular. There are no clear dividing lines for much of the project. The problem with refactoring the entire codebase to create separate projects is that it is a significant undertaking that makes maintenance of the project across versions significantly more costly. This creates a net drag on all productivity in the project. Such a major change requires strong consensus, and strong evidence justifying it. So the question is: would this bring in more new work than it costs? The evidence isn't there that it would. It might, but I personally guess that it would not, judging by the results of our other attempts to drive up contributions to the project. Perhaps we can have a wider dialogue about the endeavour, though, and see if a consensus can in fact be built.
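To make the "dividing lines" point a little more concrete: one of the couplings raised in the quoted thread below is MapSerializer depending on AbstractType, which Tyler suggests could be replaced with java.util.Comparator. A minimal sketch of what that kind of decoupling might look like is below. The class and method names (KeyOrderingSketch, sortKeys, utf8, text) are hypothetical and are not Cassandra's actual code; the only assumption is that a client-side serializer needs nothing more from the server-side type than an ordering over serialized keys.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a client-side serializer; NOT Cassandra's MapSerializer.
final class KeyOrderingSketch {
    // The only contract needed from the "type": an ordering over serialized keys.
    // A plain JDK interface keeps this class free of any server-side package.
    private final Comparator<ByteBuffer> keyOrder;

    KeyOrderingSketch(Comparator<ByteBuffer> keyOrder) {
        this.keyOrder = keyOrder;
    }

    // Returns a sorted copy of the serialized keys, leaving the input untouched.
    List<ByteBuffer> sortKeys(List<ByteBuffer> keys) {
        List<ByteBuffer> sorted = new ArrayList<>(keys);
        sorted.sort(keyOrder);
        return sorted;
    }

    // Helper (illustrative only): encode a string as a UTF-8 buffer.
    static ByteBuffer utf8(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // Helper (illustrative only): decode without disturbing the buffer's position.
    static String text(ByteBuffer b) {
        return StandardCharsets.UTF_8.decode(b.duplicate()).toString();
    }

    public static void main(String[] args) {
        // ByteBuffer is Comparable, so naturalOrder() serves as a default ordering here.
        KeyOrderingSketch sketch = new KeyOrderingSketch(Comparator.naturalOrder());
        List<ByteBuffer> sorted =
            sketch.sortKeys(Arrays.asList(utf8("b"), utf8("a"), utf8("c")));
        for (ByteBuffer key : sorted) {
            System.out.println(text(key));
        }
    }
}
```

Whether the real serializers can be cut along a line this clean is exactly the open question; the sketch only shows the shape of the change, not its cost across maintained versions.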
On Thu, Apr 2, 2015 at 9:31 AM, Pierre Devops <pierredev...@gmail.com> wrote:
> Hi all,
>
> Not a cassandra contributor here, but I'm working on the cassandra sources
> too.
>
> This big cassandra source root caused me trouble too, firstly it was not
> easy to import in an IDE, try to import cassandra sources in netbeans, it's
> a headache.
>
> It would be great if we had more small modules/projects in separate POM. It
> will be more easier to work on small part of the project, and as a
> consequences, I'm sure you will have more external contribution to this
> project.
>
> I know cassandra devs are used to ant build model, but it's like a thread I
> opened about updated and more complete documentation about sstable
> structures. I got answer that it was not needed to understand how to use
> Cassandra, and the only way to learn about that is to rtfcode. Because
> people working on cassandra already know how sstable structure are, it's
> not needed to provide up to date documentation.
> So it will take me a very long time to read and understand all the
> serialization code in cassandra to understand the sstable structure before
> I can work on the code. Up to date documentation about internals would have
> gave me the knowledge I need to contribute much quicker.
>
> Here we have the same problem, we have a complex non modular build system,
> and core cassandra dev are used to it, so it's not needed to make something
> more flexible, even if it could facilitate external contribution.
>
>
> 2015-03-31 23:42 GMT+02:00 Benedict Elliott Smith <belliottsm...@datastax.com>:
>
> > I think the problem is everyone currently contributing is comfortable
> > with ant, and as much as it is imperfect, it isn't clear maven is going
> > to be better. Having the requisite maven functionality linked under the
> > hood doesn't seem particularly preferable to the inverse. The status quo
> > has the bonus of zero upheaval for the project and its contributors,
> > though, so it would have to be a very clear win to justify the change in
> > my opinion.
> >
> >
> > On Tue, Mar 31, 2015 at 10:24 PM, Łukasz Dywicki <l...@code-house.org>
> > wrote:
> >
> > > Hey Tyler,
> > > Thank you very much for coming back. I already lost faith that I will
> > > get reply. :-) I am fine with code relocations. Moving constants into
> > > one place where they cause no circular dependencies is cool, I’m all
> > > for doing such thing.
> > >
> > > Currently Cassandra uses ant for doing some of maven functionalities
> > > (such deploying POM.xml into repositories with dependency information),
> > > it uses also maven type of artifact repositories. This can be easily
> > > flipped. Maven can call ant tasks for these parts which can not be made
> > > with existing maven plugins. Here is simplest example:
> > > http://docs.codehaus.org/display/MAVENUSER/Antrun+Plugin - you can see
> > > ant task definition embedded in maven pom.xml.
> > >
> > > Most of things can be made at this moment via maven plugins:
> > > apache-rat-plugin:
> > > http://mvnrepository.com/artifact/org.apache.rat/apache-rat-plugin/0.11
> > > maven-thrift-plugin:
> > > http://mvnrepository.com/artifact/org.apache.thrift.tools/maven-thrift-plugin/0.1.11
> > > antlr4-maven-plugin:
> > > http://mvnrepository.com/artifact/org.antlr/antlr4-maven-plugin/4.5 or
> > > antlr3-maven-plugin:
> > > http://mvnrepository.com/artifact/org.antlr/antlr3-maven-plugin/3.5.2
> > > maven-gpg-plugin:
> > > http://mvnrepository.com/artifact/org.apache.maven.plugins/maven-gpg-plugin/1.6
> > > maven-cobertura-plugin:
> > > http://mojo.codehaus.org/cobertura-maven-plugin/ (but these days jacoco
> > > with java agent instrumentation performs better)
> > > .. and so on
> > >
> > > I already made some evaluation of impact and it is big. Code has to be
> > > separated into different source roots. It’s not easy even for keeping
> > > current artifact structure: cassandra-all, cassandra-thrift and
> > > clientutil (cause of cyclic dependencies). What I can do is prepare of
> > > these src roots with dependencies which are declared for them and push
> > > that to my cassandra fork so you will be able to verify that and
> > > continue with relocations if you will like new build. Creating new
> > > modules (source roots) with maven is simple so you could possibly
> > > extract more than these 3 predefined artifacts/package roots.
> > > Just let me know if you are interested.
> > >
> > > Kind regards,
> > > Lukasz
> > >
> > >
> > > > Message written by Tyler Hobbs <ty...@datastax.com> on 31 Mar 2015,
> > > > at 21:57:
> > > >
> > > > Hi Łukasz,
> > > >
> > > > I'm not very familiar with the build system, but I'll try to respond.
> > > >
> > > > The Serializer dependencies on org.apache.cassandra.transport are
> > > > almost certainly uses of Server.CURRENT_VERSION and Server.VERSION_3.
> > > > These are constants that represent the native protocol version in
> > > > use, which affects how certain types are serialized. These constants
> > > > could easily be moved.
> > > >
> > > > The o.a.c.marshal dependency in MapSerializer is on AbstractType, but
> > > > could easily be replaced with java.util.Comparator.
> > > >
> > > > In any case, I'm not necessarily opposed to improving the build
> > > > system to make these errors more apparent. Would your proposal still
> > > > allow us to build with ant (and just change the way those artifacts
> > > > are built)?
> > > >
> > > > On Tue, Mar 24, 2015 at 7:58 PM, Łukasz Dywicki <l...@code-house.org>
> > > > wrote:
> > > >
> > > >> Dear cassandra committers and development process followers,
> > > >> I would like to bring an important topic off build process of
> > > >> cassandra. I am an external user from community point of view,
> > > >> however I been walking around various projects close to cassandra
> > > >> over past year or even more. What is worrying me a lot is how
> > > >> cassandra is publishing artifacts and how many problems are reported
> > > >> due that.
> > > >>
> > > >> First of all - I want to note that I am not born enemy of Ant
> > > >> itself. I never used it. I am also aware of problems with custom
> > > >> builds made with Maven, however I don’t really want to discuss any
> > > >> particular replacement, yet I want to note that Cassandra JIRA
> > > >> project contains about 116 issues related somehow to maven
> > > >> (http://bit.ly/1GRoXl5, project=CASSANDRA, text ~ maven). Depends on
> > > >> the point of view it might be a lot or a little. By simple
> > > >> statistics it is around 21 issues a year or almost 2 issues a month,
> > > >> many of them breaking maintenance/major releases from user point of
> > > >> view. From other hand it’s not bad considering how project is being
> > > >> built.
> > > >>
> > > >> Current structure has a very big disadvantage - ONE source root for
> > > >> multiple artifacts published in maven repositories and copying
> > > >> classes to jar AFTER they are compiled. Obviously ant copy task
> > > >> doesn’t follow import statements and does not include dependant
> > > >> classes. For example just by making test relocations and extraction
> > > >> of clientutil jar on master branch into separate source root I have
> > > >> found a bug where ListSerializer depends on
> > > >> org.apache.cassandra.transport package. More over clientutil
> > > >> (MapSerializer) does depends on org.apache.cassandra.db.marshal
> > > >> package leading to the fact that it can not be used without
> > > >> cassandra-all present at classpath.
> > > >> Luckily for cassandra CQL as a new interface reduces thrift and
> > > >> clientutil usage reducing amount of issues reported around these,
> > > >> however this just hides a real problem in previous paragraph. I have
> > > >> found a handy tool and made a graph of circular dependencies in
> > > >> cassandra-all.jar. Graph of results can found here:
> > > >> http://grab.by/FRnO. As you can see this graph has multiple levels
> > > >> and solving it is not a simple task.
> > > >> I am afraid a current way of building and packaging cassandra can
> > > >> create huge hiccups when it will come to code refactorings cause
> > > >> entire cassandra will become a house of cards.
> > > >> Restructuring project into smaller pieces is also beneficiary for
> > > >> community since solving bugs in smaller units is definitely easier.
> > > >>
> > > >> At the end of this mail I would like to propose moving Cassandra
> > > >> build system forward, regardless of tool which will be chosen for
> > > >> it. Personally I can volunteer in maven related changes to extract
> > > >> cassandra-thrift, cassandra-clientutil and cassandra-all to make
> > > >> regular maven build. It might be seen as a switch from one big XML
> > > >> into couple smaller. :-) All this depends on Cassandra developers
> > > >> decision to divide source roots or not.
> > > >>
> > > >> Kind regards,
> > > >> Łukasz Dywicki
> > > >> --
> > > >> l...@code-house.org
> > > >> Twitter: ldywicki
> > > >> Blog: http://dywicki.pl
> > > >> Code-House - http://code-house.org
> > > >
> > > >
> > > > --
> > > > Tyler Hobbs
> > > > DataStax <http://datastax.com/>