[this post is available online at https://s.apache.org/aATd ]

by Dinesh Joshi

My journey with Apache began in 1999 with Apache httpd and Apache Tomcat. 
Apache httpd was the de facto webserver at the time on Linux and Tomcat was the 
most well-known Java Servlet container. The LAMP (Linux, Apache httpd, MySQL, 
PHP) stack was a fantastic combination. From that point on, I have always been 
an Apache user, successively exploring technologies like Apache Commons, Apache 
Storm, Apache Hadoop, Apache HBase, and Apache Cassandra. It has been a very 
dependable OSS brand. Being interested in Distributed Systems and Databases, I 
began exploring OSS databases and came across Cassandra.

In early 2018, almost 19 years after I was first introduced to Apache projects, 
I began actively contributing to the Apache Cassandra source. I have always 
been passionate about Cassandra and used it during my Master's at Georgia Tech. Its 
distributed, shared-nothing model is amazing. So when I did get the opportunity 
to contribute to the Cassandra codebase, I decided to make the most of it. Over 
the past year I have contributed over 25 patches as an author and reviewed over 
30 patches. Collaborating with various contributors in the community, we 
successfully proposed the very first CIP (Cassandra Improvement Process), a 
Cassandra Sidecar. We, the community, are now busy building it. I have 
contributed some interesting changes that make Cassandra more reliable and 
help it scale better, namely Zero Copy streaming and the Zstd Compressor, which 
have been featured on the Apache Cassandra blog and at various international 
conferences. This has generated new interest in Cassandra.

I fully credit the Cassandra community with enabling a new contributor like me 
to make meaningful contributions. It is an incredibly passionate community, 
with a lot of questions, answers and knowledge dominating the project Jira 
board and mailing lists. As a new contributor it was incredible to see a lot of 
community interest in what I was contributing. The Sidecar specifically 
generated a lot of discussion and debate within the community and ultimately we 
achieved consensus, the Apache Way! Zero Copy streaming is something that big 
players such as Netflix and Uber were interested in. Contributors from Netflix 
took the initiative in testing and benchmarking it and posting the results on 
Jira. Getting your work into an Open Source project is one thing, but it is 
humbling to see your work being actively evaluated by some of the biggest 
industry names. It is even more fascinating to me how people can overcome 
organizational boundaries to collaborate on a project, and how ideas are 
accepted, debated and implemented as a community, ultimately making it better 
for everyone in the world. Given my contributions to the Cassandra community, 
the PMC recently voted me in as a Committer, which will help me bring in more 
contributions from the community as well as mentor others to join in and 
contribute!

My goal with contributing to Cassandra was to give back to the community the 
knowledge & expertise that I have gained over the years building some of the 
most scalable systems in the world. I have found great mentors along the way 
who have helped me achieve that goal. It is incredible to see the impact we 
have on the world through Apache projects such as Apache Cassandra. 

Cassandra is used at some of the biggest organizations in the world for 
mission-critical applications, and changes like Zero Copy streaming 
(CASSANDRA-14556) or Zstd Compression (CASSANDRA-14482) will have a significant 
impact on many large businesses and, more importantly, people’s lives. 
Specifically, Zero Copy streaming in Cassandra allows the database to recover 
from a failed node several times faster than the existing stable version of 
Cassandra. In addition, it lowers the amount of resources that are required by 
the streaming process. Therefore, an organization running large installations 
of Cassandra can see a meaningful reduction in MTTR (Mean Time to Recovery) 
and can reduce the spare server pool capacity that it needs to maintain. This 
lowers the TCO (Total Cost of Ownership) for Cassandra. Zstd Compression is a 
new lossless compression scheme that offers better compression ratios than the 
existing LZ4 Compression used within Cassandra, at comparable compression 
speed. It can reduce storage needs by up to 40% depending on the 
characteristics of your dataset. Again, this not only reduces expenses but 
also means fewer servers are needed to store data. As a result you are not 
only saving money but, in a way, saving the planet by using fewer servers.
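
As a rough illustration of the general idea behind zero-copy transfer, here is 
a minimal Java sketch. It is not Cassandra's streaming code; the class and 
method names (ZeroCopySketch, transfer, roundTrip) are hypothetical and exist 
only for this example. It relies on the standard FileChannel.transferTo call, 
which lets the operating system move bytes between channels without copying 
them through an application buffer where the platform supports it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of zero-copy file transfer; not taken from Cassandra.
public class ZeroCopySketch {

    // Moves the whole source file into the target via FileChannel.transferTo,
    // so the bytes are never read into a user-space byte[] by this code.
    static long transfer(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop.
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
            return position;
        }
    }

    // Writes `size` zero bytes to a temp file, transfers them to a second
    // temp file, and returns the size of the resulting copy.
    static long roundTrip(int size) {
        try {
            Path source = Files.createTempFile("sketch-src", ".db");
            Path target = Files.createTempFile("sketch-dst", ".db");
            try {
                Files.write(source, new byte[size]);
                transfer(source, target);
                return Files.size(target);
            } finally {
                Files.deleteIfExists(source);
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes transferred: " + roundTrip(1 << 20));
    }
}
```

In a real streaming path the target would typically be a network channel 
rather than a local file; the file-to-file form is used here only to keep the 
sketch self-contained.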

I also believe that being an Open Source contributor is not just about code 
contributions. Contributions come in various forms and one of them is 
documentation. Seeing that Cassandra’s documentation was out of date, I 
proposed Cassandra for Google Season of Documentation to improve it. I have 
also been 
invited to talk about Cassandra at various conferences across Asia, Europe and 
North America. So far, in the past year, I have spoken about Cassandra at 9 
conferences. It is great to engage with the user community at large which is 
very passionate and excited about Cassandra. This is one of the most important 
aspects of community contributions because you get to talk to your users first 
hand. It also generates interest in the project and is key to getting new 
contributors for your project.

In summary, none of this would be possible without a great, supportive 
community, which is the whole point of the ASF – to build great communities 
that foster collaboration, making the world better one contribution at a time.

Dinesh Joshi is a Senior Software Engineer and a Committer on the Apache 
Cassandra project. He has a Master's in Computer Science (Distributed Systems & 
Databases) from Georgia Tech, Atlanta. In the past, Dinesh was a Principal 
Software Engineer at Yahoo building real time distributed systems for Yahoo’s 
Finance Web, iOS & Android apps. He is also an international speaker and 
regularly talks about Apache Cassandra and Databases. In his spare time, he 
volunteers as a mentor for Women Who Code.

# # #

"Success at Apache" is a monthly blog series that focuses on the people and 
processes behind why the ASF "just works". 
https://blogs.apache.org/foundation/category/SuccessAtApache
