I am a non-DataStax-employee committer, and a large percentage of my
commits are not reviewed by DataStax employees. I see problems or areas
of improvement in the code base, and directly commit them. No questions
asked, no oversight, no direction at all from DataStax or their
employees. I have had a small number of commits that were reviewed by
Cassandra committers, some of whom are DataStax employees, but the
overwhelming majority have not been.
If you go by pure commit counts (an admittedly dubious rating, but
still), I am #4 on number of commits.
On 06/05/2016 06:33 PM, Mattmann, Chris A (3980) wrote:
Thanks for the info Jonathan. I think I have assessed, based on
the replies thus far and my studying of the archives and
commit and project history, the following situation.
Unfortunately it seems like there is a bit of control going on.
I’m going to call a spade a spade here. A key portion of your
software’s stack, a client driver to use it, exists outside of
Apache in separate communities. This is an inherent risk to the
project. Some of you cite flexibility and adaptability as reasons
for this - I’ve seen it in so many communities over the last 12+
years in the foundation - it’s not really due to those issues.
There is definitely some control going on. I would ask you all
this - has there been a PR or patch in the past year or two that
wasn’t reviewed solely by DataStax committers and PMC? Also,
as to the composition of the PMC, when was the last time a
non-DataStax person was elected to the PMC and/or as a committer?
On their own, the diversity issues are not damning to the
project, but taken together with the citation of other project
communities, even those outside of Apache (e.g., the comments
like “Postgres does it this way, so it’s a good example to
compare us to” or “these other 4 projects at the ASF do it
like this, so X”.. [sic]), and with the perception being created
among those that don’t work at DataStax, there is an issue here.
I would like to see a discussion in your next board report about
the diversity and health issues of the project, and also some
ideas about potential strategies for mitigation.
I appreciate the open and honest conversation thus far. Let’s
keep it up.
Cheers,
Chris
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
On 6/5/16, 1:51 PM, "Jonathan Ellis" <jbel...@gmail.com> wrote:
On Sun, Jun 5, 2016 at 8:32 AM, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:
1. Is Apache Cassandra useful *without* a driver? That is, can
you use the database without a driver to connect to it or in the
real world would your users all have to download at least one
driver in order to use the DB?
The users do need to download a driver--but this is pretty normal for
community-driven OSS databases. Besides the Apache projects I listed,
PostgreSQL also runs on a community-maintained driver model.
2. To confirm again, at one point at least the Java driver code
lived in the code-base, and further, at one point, people did
submit some patches to add drivers, but the PMC didn’t want to
maintain that code (and apparently they didn’t want to create
any new PMC members and/or committers to do so) and so thus
people started their own new projects? That right?
I think that summary over-emphasizes the governance aspect at the expense
of more important considerations:
0. The very first Cassandra driver interface was Thrift. No Thrift clients
were ever part of the Cassandra tree.
1. When we created the CQL protocol, we initially had a Java driver in tree
as a reference implementation.
2. But due primarily to the project management issues mentioned by Nate,
and secondarily to the governance aspects above, we moved quickly back to
the pure community-driven drivers approach that had worked for us before.
2a. While some Apache databases do ship a Java driver in tree, I think that
this hinders adoption because it signals to users that non-Java drivers are
second-class citizens. (No doubt this is not the *intent* of that
decision, but it is a likely consequence nevertheless.)
2b. DataStax saw CQL adoption as a key driver for Cassandra adoption and
hence its own success, and hired a team to accelerate the production of
drivers for the new CQL protocol. These drivers are Apache licensed and
see broad community participation, e.g. with ~70 contributors to the Java
driver.
2c. Neither has DataStax "sucked the oxygen out of the room." Lots of
non-DataStax drivers exist as well.
As Aleksey pointed out earlier, I don't see anyone being harmed by this
state of affairs. Cassandra PMC doesn't want to run drivers projects,
driver authors don't want to be run by Cassandra PMC, and meanwhile users
have Apache licensed drivers that let them be productive with Cassandra.