Dear Cassandra committers and development process followers,
I would like to raise an important topic: the build process of Cassandra. I am 
an external user from the community’s point of view, but I have been involved 
with various projects close to Cassandra for the past year or more. What 
worries me a lot is how Cassandra publishes its artifacts and how many 
problems are reported because of that.

First of all, I want to note that I am not a born enemy of Ant itself; I have 
simply never used it. I am also aware of the problems with custom builds made 
with Maven. I don’t really want to discuss any particular replacement, but I 
do want to note that the Cassandra JIRA project contains about 116 issues 
related in some way to Maven (http://bit.ly/1GRoXl5, project=CASSANDRA, 
text ~ maven). Depending on your point of view, that may be a lot or a 
little. By simple statistics it is around 21 issues a year, or almost 2 
issues a month, many of them breaking maintenance or major releases from the 
user’s point of view. On the other hand, that is not bad considering how the 
project is being built.

The current structure has a very big disadvantage: ONE source root for 
multiple artifacts published to Maven repositories, with classes copied into 
jars AFTER they are compiled. Obviously the Ant copy task doesn’t follow 
import statements and does not pull in dependent classes. For example, just by 
experimentally relocating and extracting the clientutil jar on the master 
branch into a separate source root, I found a bug where ListSerializer depends 
on the org.apache.cassandra.transport package. Moreover, clientutil 
(MapSerializer) depends on the org.apache.cassandra.db.marshal package, which 
means it cannot be used without cassandra-all present on the classpath.
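
To make the copy-task problem concrete, here is a minimal Java sketch of the 
check that a separate source root would perform for free at compile time: 
given the classes selected for a jar and what each of them references, list 
every reference that is not packaged alongside it. The class JarClosureCheck, 
the method missingDependencies and the hand-written reference map are 
hypothetical and only mirror the two findings above; a real check would have 
to derive the references from the compiled classes.

```java
import java.util.*;

// Hypothetical helper, not part of Cassandra or its build.
class JarClosureCheck {

    /**
     * For every class included in a jar, collects the names it references
     * that are NOT included in the same jar. An empty result means the jar
     * is self-contained with respect to the given reference map.
     */
    static Map<String, Set<String>> missingDependencies(
            Set<String> included, Map<String, Set<String>> references) {
        Map<String, Set<String>> missing = new TreeMap<>();
        for (String cls : included) {
            for (String dep : references.getOrDefault(cls, Set.of())) {
                if (!included.contains(dep)) {
                    missing.computeIfAbsent(cls, k -> new TreeSet<>()).add(dep);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: illustrative data only, written by hand to mirror the
        // two dependencies mentioned in this mail.
        Set<String> clientutil = Set.of("ListSerializer", "MapSerializer");
        Map<String, Set<String>> references = Map.of(
                "ListSerializer", Set.of("org.apache.cassandra.transport"),
                "MapSerializer", Set.of("org.apache.cassandra.db.marshal"));

        // Prints each class together with the references the jar lacks.
        missingDependencies(clientutil, references)
                .forEach((cls, deps) -> System.out.println(cls + " -> " + deps));
    }
}
```

With one source root per artifact the compiler itself reports such leaks, 
because a class simply does not compile when something it imports is outside 
the module and its declared dependencies.
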

Luckily for Cassandra, CQL as a new interface reduces thrift and clientutil 
usage, which reduces the number of issues reported around them. However, this 
only hides the real problem described in the previous paragraph. I have found 
a handy tool and made a graph of circular dependencies in cassandra-all.jar. 
The resulting graph can be found here: http://grab.by/FRnO. As you can see, 
the graph has multiple levels and untangling it is not a simple task. I am 
afraid the current way of building and packaging Cassandra can cause huge 
hiccups when it comes to code refactorings, because the whole of Cassandra 
will become a house of cards.
Restructuring the project into smaller pieces is also beneficial for the 
community, since solving bugs in smaller units is definitely easier.
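
For readers who want to reproduce this kind of finding without a dedicated 
tool, a circular dependency can be detected with a plain depth-first search 
over a package dependency map. The sketch below is only an illustration: the 
class PackageCycleFinder, its methods and the package names pkg.a, pkg.b and 
pkg.c are hypothetical placeholders. It is not the tool I used for the graph 
and the data does not come from cassandra-all.jar.

```java
import java.util.*;

// Hypothetical helper, not part of Cassandra or of any existing tool.
class PackageCycleFinder {

    /**
     * Returns one dependency cycle as a list of packages in which the first
     * and last element are the same, or an empty list if the graph is acyclic.
     */
    static List<String> findCycle(Map<String, Set<String>> deps) {
        Set<String> finished = new HashSet<>();   // fully explored packages
        List<String> path = new ArrayList<>();    // packages on the current DFS path
        for (String start : new TreeSet<>(deps.keySet())) {
            List<String> cycle = visit(start, deps, finished, path);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        return List.of();
    }

    private static List<String> visit(String node, Map<String, Set<String>> deps,
                                      Set<String> finished, List<String> path) {
        if (finished.contains(node)) {
            return List.of();                     // already known to be cycle-free
        }
        int seenAt = path.indexOf(node);
        if (seenAt >= 0) {                        // back edge: node is on the current path
            List<String> cycle = new ArrayList<>(path.subList(seenAt, path.size()));
            cycle.add(node);
            return cycle;
        }
        path.add(node);
        for (String next : new TreeSet<>(deps.getOrDefault(node, Set.of()))) {
            List<String> cycle = visit(next, deps, finished, path);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        path.remove(path.size() - 1);             // backtrack
        finished.add(node);
        return List.of();
    }

    public static void main(String[] args) {
        // Assumption: placeholder packages, chosen only to form a cycle.
        Map<String, Set<String>> deps = Map.of(
                "pkg.a", Set.of("pkg.b"),
                "pkg.b", Set.of("pkg.c"),
                "pkg.c", Set.of("pkg.a"));
        System.out.println(findCycle(deps));
    }
}
```

Once the code is split into separate modules such cycles cannot exist between 
modules at all, since Maven refuses to build a reactor whose modules depend on 
each other in a loop.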

To close this mail, I would like to propose moving the Cassandra build system 
forward, regardless of which tool is chosen for it. Personally, I can 
volunteer for the Maven-related changes: extracting cassandra-thrift, 
cassandra-clientutil and cassandra-all into a regular Maven build. It might be 
seen as a switch from one big XML file to a couple of smaller ones. :-) All of 
this depends on the Cassandra developers’ decision whether to divide the 
source roots or not.

Kind regards,
Łukasz Dywicki
—
l...@code-house.org
Twitter: ldywicki
Blog: http://dywicki.pl
Code-House - http://code-house.org
