As requested, I'm alerting the ML that I've got the first patch of several
to come.  This one only addresses the ColumnFamilyStore class - and only
changes the name.  There's follow up tickets to change method and variable
names.  As we agreed, I'm doing this incrementally, so please let's keep
that in mind, I have no interest in producing a 10K patch that changes all
the things.

Since the patch only touches lines explicitly naming ColumnFamilyStore this
patch should be reasonably non-intrusive.

https://issues.apache.org/jira/browse/CASSANDRA-14340

Jon

On Thu, Mar 22, 2018 at 8:09 AM Jon Haddad <j...@jonhaddad.com> wrote:

> Cool.  I think there’s general agreement that doing this in as small bites
> as possible is going to be the best approach.  I have no interest in mega
> patches.
>
> >  The combined approach takes a
> > change that's already non-trivially dealing with complex subsystem
> > changes and injects a bunch of trivial renaming noise across unrelated
> > subsystems into the signal of an actual logic refactor.
>
> I agree.  This is why I like the idea of proactively working to improve
> the readability of the codebase as a specific goal, rather than being
> wrapped into some other unrelated patch.  Keeping the scope in check is the
> challenge.  Simple class and method renames, as several have pointed out,
> is easy enough with IDEA.
>
> I’ll start with class renames, as individual patches for each of them.
> I’ll be sure to call it out on the ML.  First one will be ColumnFamilyStore
> -> TableStore.
>
> Jon
>
> > On Mar 22, 2018, at 7:13 AM, Jason Brown <jasedbr...@gmail.com> wrote:
> >
> > Jon,
> >
> > Thanks for bringing up this topic. I'll admit that I've been around this
> > code base for long enough, and have enough accumulated history, that I
> > probably can't fully appreciate the impact for a newcomer wrt naming.
> > However, as Josh points out, this situation probably happens to "every
> > non-trivially aged code-base ever".
> >
> > One thing I'd like to add is that with these types of large refactoring
> > changes, the review effort is non-trivial. This is because the review
> still
> > has to ensure that correctness is preserved and it's easy to overlook a
> > seemingly innocuous change.
> >
> > That being said, I am supportive of this effort. However, I believe it's
> > going to be best, for contributor and reviewer, to break it up into
> > smaller, more digestible pieces. I'd also like to request that we not go
> > whole hog and try to do everything in a compressed time frame; reviewer
> > availability is already stretched thin and I'm afraid of deepening the
> > review queue, especially mine :)
> >
> > Thanks,
> >
> > -Jason
> >
> >
> >
> >
> > On Thu, Mar 22, 2018 at 6:41 AM, Josh McKenzie <jmcken...@apache.org>
> wrote:
> >
> >>> Some of us have big patches in flight, things that actually
> >>> pay off some technical debt, and dealing with such renames is rebase
> >> hell :\
> >> For sure, but with a code-base this old / organically grown, I expect
> >> this will always be the case. If we're talking something as simple as
> >> an intellij rename refactor, while menial, couldn't someone with a
> >> giant patch just do the same thing on their side and spend half an
> >> hour of their life clicking next? ;)
> >>
> >>> That said, there is good time for such renames - it’s during
> >>> those major refactors and rewrites. When you are
> >>> changing a subsystem, might as well do the appropriate renames.
> >> Does that hold true for a code-base with as much static state and
> >> abstraction leaking / bad factoring as we have? (i.e. every
> >> non-trivially aged code-base ever) The combined approach takes a
> >> change that's already non-trivially dealing with complex subsystem
> >> changes and injects a bunch of trivial renaming noise across unrelated
> >> subsystems into the signal of an actual logic refactor.
> >>
> >> On Thu, Mar 22, 2018 at 9:31 AM, Aleksey Yeshchenko <alek...@apple.com>
> >> wrote:
> >>> Poor and out-of-date naming of things is probably the least serious
> part
> >> of our technical debt. Bad factoring, and straight-up
> >>> poorly written components is where it’s really at.
> >>>
> >>> Doing a big rename for rename sake alone does more harm than it is
> good,
> >> sometimes. Some of us have big patches
> >>> in flight, things that actually pay off some technical debt, and
> dealing
> >> with such renames is rebase hell :\
> >>>
> >>> That said, there is good time for such renames - it’s during those
> major
> >> refactors and rewrites. When you are
> >>> changing a subsystem, might as well do the appropriate renames.
> >>>
> >>> —
> >>> AY
> >>>
> >>> On 20 March 2018 at 22:04:48, Jon Haddad (j...@jonhaddad.com) wrote:
> >>>
> >>> Whenever I hop around in the codebase, one thing that always manages to
> >> slow me down is needing to understand the context of the variable names
> >> that I’m looking at. We’ve now removed thrift the transport, but the
> >> variables, classes and comments still remain. Personally, I’d like to
> go in
> >> and pay off as much technical debt as possible by refactoring the code
> to
> >> be as close to CQL as possible. Rows should be rows, not partitions, I’d
> >> love to see the term column family removed forever in favor of always
> using
> >> tables. That said, it’s a big task. I did a quick refactor in a branch,
> >> simply changing the ColumnFamilyStore class to TableStore, and pushed
> it up
> >> to GitHub. [1]
> >>>
> >>> Didn’t click on the link? That’s ok. The TL;DR is that it’s almost 2K
> >> LOC changed across 275 files. I’ll note that my branch doesn’t change
> any
> >> of the almost 1000 search results of “columnfamilystore” found in the
> >> codebase and hundreds of tests failed on my branch in CircleCI, so that
> 2K
> >> LOC change would probably be quite a bit bigger. There is, of course, a
> lot
> >> more than just renaming this one class, there’s thousands of variable
> names
> >> using any manner of “cf”, “cfs”, “columnfamily”, names plus comments and
> >> who knows what else. There’s lots of references in probably every file
> that
> >> would have to get updated.
> >>>
> >>> What are people’s thoughts on this? We should be honest with ourselves
> >> and know this isn’t going to get any easier over time. It’s only going
> to
> >> get more confusing for new people to the project, and having to figure
> out
> >> “what kind of row am i even looking at” is a waste of time. There’s
> >> obviously a much bigger impact than just renaming a bunch of files,
> there’s
> >> any number of patches and branches that would become outdated, plus
> anyone
> >> pulling in Cassandra as a dependency would be affected. I don’t really
> have
> >> a solution for the disruption other than “leave it in place”, but in my
> >> mind that’s not a great (or even good) solution.
> >>>
> >>> Anyways, enough out of me. My concern for ergonomics and naming might
> be
> >> significantly higher than the rest of the folks working in the code,
> and I
> >> wanted to put a feeler out there before I decided to dig into this in a
> >> more serious manner.
> >>>
> >>> Jon
> >>>
> >>> [1] https://github.com/apache/cassandra/compare/trunk...
> >> rustyrazorblade:refactor_column_family_store?expand=1 <
> >> https://github.com/apache/cassandra/compare/trunk...
> >> rustyrazorblade:refactor_column_family_store?expand=1>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >>
>
>

Reply via email to