date:20180321

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread DuyHai Doan

+10

There are 2 hard problems in CS: naming things and cache invalidation

Le 20 mars 2018 23:04, "Jon Haddad"  a écrit :

> Whenever I hop around in the codebase, one thing that always manages to
> slow me down is needing to understand the context of the variable names
> that I’m looking at.  We’ve now removed thrift the transport, but the
> variables, classes and comments still remain.  Personally, I’d like to go
> in and pay off as much technical debt as possible by refactoring the code
> to be as close to CQL as possible.  Rows should be rows, not partitions,
> I’d love to see the term column family removed forever in favor of always
> using tables.  That said, it’s a big task.  I did a quick refactor in a
> branch, simply changing the ColumnFamilyStore class to TableStore, and
> pushed it up to GitHub. [1]
>
> Didn’t click on the link?  That’s ok.  The TL;DR is that it’s almost 2K
> LOC changed across 275 files.  I’ll note that my branch doesn’t change any
> of the almost 1000 search results of “columnfamilystore” found in the
> codebase and hundreds of tests failed on my branch in CircleCI, so that 2K
> LOC change would probably be quite a bit bigger.  There is, of course, a
> lot more than just renaming this one class, there’s thousands of variable
> names using any manner of “cf”, “cfs”, “columnfamily”, names plus comments
> and who knows what else.  There’s lots of references in probably every file
> that would have to get updated.
>
> What are people’s thoughts on this?  We should be honest with ourselves
> and know this isn’t going to get any easier over time.  It’s only going to
> get more confusing for new people to the project, and having to figure out
> “what kind of row am i even looking at” is a waste of time.  There’s
> obviously a much bigger impact than just renaming a bunch of files, there’s
> any number of patches and branches that would become outdated, plus anyone
> pulling in Cassandra as a dependency would be affected.  I don’t really
> have a solution for the disruption other than “leave it in place”, but in
> my mind that’s not a great (or even good) solution.
>
> Anyways, enough out of me.  My concern for ergonomics and naming might be
> significantly higher than the rest of the folks working in the code, and I
> wanted to put a feeler out there before I decided to dig into this in a
> more serious manner.
>
> Jon
>
> [1] https://github.com/apache/cassandra/compare/trunk...
> rustyrazorblade:refactor_column_family_store?expand=1 <
> https://github.com/apache/cassandra/compare/trunk...
> rustyrazorblade:refactor_column_family_store?expand=1>

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread kurt greaves

As someone who came to the codebase post CQL but prior to thrift being
removed, +1 to refactor. The current mixing of terminology is a complete
nightmare. This would also give a good opportunity document a lot of code
that simply isn't documented (or incorrect). I'd say it's worth doing it in
multiple steps though, such as refactor of a single class at a time, then
followed by refactor of variable names. We've already done one pretty big
refactor (InetAddressAndPort) for 4.0, I don't see how a few more could
make it any worse (lol).

Row vs partition vs key vs PK is killing me

On 20 March 2018 at 22:04, Jon Haddad  wrote:

> Whenever I hop around in the codebase, one thing that always manages to
> slow me down is needing to understand the context of the variable names
> that I’m looking at.  We’ve now removed thrift the transport, but the
> variables, classes and comments still remain.  Personally, I’d like to go
> in and pay off as much technical debt as possible by refactoring the code
> to be as close to CQL as possible.  Rows should be rows, not partitions,
> I’d love to see the term column family removed forever in favor of always
> using tables.  That said, it’s a big task.  I did a quick refactor in a
> branch, simply changing the ColumnFamilyStore class to TableStore, and
> pushed it up to GitHub. [1]
>
> Didn’t click on the link?  That’s ok.  The TL;DR is that it’s almost 2K
> LOC changed across 275 files.  I’ll note that my branch doesn’t change any
> of the almost 1000 search results of “columnfamilystore” found in the
> codebase and hundreds of tests failed on my branch in CircleCI, so that 2K
> LOC change would probably be quite a bit bigger.  There is, of course, a
> lot more than just renaming this one class, there’s thousands of variable
> names using any manner of “cf”, “cfs”, “columnfamily”, names plus comments
> and who knows what else.  There’s lots of references in probably every file
> that would have to get updated.
>
> What are people’s thoughts on this?  We should be honest with ourselves
> and know this isn’t going to get any easier over time.  It’s only going to
> get more confusing for new people to the project, and having to figure out
> “what kind of row am i even looking at” is a waste of time.  There’s
> obviously a much bigger impact than just renaming a bunch of files, there’s
> any number of patches and branches that would become outdated, plus anyone
> pulling in Cassandra as a dependency would be affected.  I don’t really
> have a solution for the disruption other than “leave it in place”, but in
> my mind that’s not a great (or even good) solution.
>
> Anyways, enough out of me.  My concern for ergonomics and naming might be
> significantly higher than the rest of the folks working in the code, and I
> wanted to put a feeler out there before I decided to dig into this in a
> more serious manner.
>
> Jon
>
> [1]
> https://github.com/apache/cassandra/compare/trunk...rustyrazorblade:refactor_column_family_store?expand=1
> <
> https://github.com/apache/cassandra/compare/trunk...rustyrazorblade:refactor_column_family_store?expand=1
> >

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Benjamin Lerer

I agree that the refactoring make sense but I also know the pain of merging
branches with smaller refactorings. When you have to make a patch for
several branches and that you have to go through several round of reviews
it can be pretty painful.
I know that pain too well so I am not in favor of the bing bang approach.





On Wed, Mar 21, 2018 at 9:05 AM, kurt greaves  wrote:

> As someone who came to the codebase post CQL but prior to thrift being
> removed, +1 to refactor. The current mixing of terminology is a complete
> nightmare. This would also give a good opportunity document a lot of code
> that simply isn't documented (or incorrect). I'd say it's worth doing it in
> multiple steps though, such as refactor of a single class at a time, then
> followed by refactor of variable names. We've already done one pretty big
> refactor (InetAddressAndPort) for 4.0, I don't see how a few more could
> make it any worse (lol).
>
> Row vs partition vs key vs PK is killing me
>
> On 20 March 2018 at 22:04, Jon Haddad  wrote:
>
> > Whenever I hop around in the codebase, one thing that always manages to
> > slow me down is needing to understand the context of the variable names
> > that I’m looking at.  We’ve now removed thrift the transport, but the
> > variables, classes and comments still remain.  Personally, I’d like to go
> > in and pay off as much technical debt as possible by refactoring the code
> > to be as close to CQL as possible.  Rows should be rows, not partitions,
> > I’d love to see the term column family removed forever in favor of always
> > using tables.  That said, it’s a big task.  I did a quick refactor in a
> > branch, simply changing the ColumnFamilyStore class to TableStore, and
> > pushed it up to GitHub. [1]
> >
> > Didn’t click on the link?  That’s ok.  The TL;DR is that it’s almost 2K
> > LOC changed across 275 files.  I’ll note that my branch doesn’t change
> any
> > of the almost 1000 search results of “columnfamilystore” found in the
> > codebase and hundreds of tests failed on my branch in CircleCI, so that
> 2K
> > LOC change would probably be quite a bit bigger.  There is, of course, a
> > lot more than just renaming this one class, there’s thousands of variable
> > names using any manner of “cf”, “cfs”, “columnfamily”, names plus
> comments
> > and who knows what else.  There’s lots of references in probably every
> file
> > that would have to get updated.
> >
> > What are people’s thoughts on this?  We should be honest with ourselves
> > and know this isn’t going to get any easier over time.  It’s only going
> to
> > get more confusing for new people to the project, and having to figure
> out
> > “what kind of row am i even looking at” is a waste of time.  There’s
> > obviously a much bigger impact than just renaming a bunch of files,
> there’s
> > any number of patches and branches that would become outdated, plus
> anyone
> > pulling in Cassandra as a dependency would be affected.  I don’t really
> > have a solution for the disruption other than “leave it in place”, but in
> > my mind that’s not a great (or even good) solution.
> >
> > Anyways, enough out of me.  My concern for ergonomics and naming might be
> > significantly higher than the rest of the folks working in the code, and
> I
> > wanted to put a feeler out there before I decided to dig into this in a
> > more serious manner.
> >
> > Jon
> >
> > [1]
> > https://github.com/apache/cassandra/compare/trunk...
> rustyrazorblade:refactor_column_family_store?expand=1
> > <
> > https://github.com/apache/cassandra/compare/trunk...
> rustyrazorblade:refactor_column_family_store?expand=1
> > >
>

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Sylvain Lebresne

I really don't think anyone has been recently against such renaming, and in
fact, a _lot_ of renaming *has* already happen over time. The problem, as
you carefully noted, is that it's such a big task that there is still a lot
to do. Anyway, I've yet to see a patch renaming things to match the CQL
naming scheme be rejected, so I'd personally encourage such submission. But
maybe with a few caveats (already mentioned largely, so repeating here to
signify my personal agreement with them):
- renaming with large surface area can be painful for ongoing patches or
even future merge. That's not a reason for not doing them, but that's imo a
good enough reason to do things incrementally/in as-small-as-reasonable
steps. Making sure a renaming commit only does renaming and doesn't change
the logic is also pretty nice when you rebase such things.
- breaking hundreds of tests is obviously not ok :)
- pure code renaming is one reasonably simple aspect, but quite a few
renaming may have user visible impact. Particularly around JMX where many
things are name based on their class, and to a lesser extend some of our
tools still use "old" naming. We can't and shouldn't ignore those impact:
such user visible changes should imo be documented, and we should make sure
we have a reasonably painless (and thus incremental) upgrade path. My hunch
is the latter isn't as simple as it seems.


--
Sylvain


On Wed, Mar 21, 2018 at 9:06 AM kurt greaves  wrote:

> As someone who came to the codebase post CQL but prior to thrift being
> removed, +1 to refactor. The current mixing of terminology is a complete
> nightmare. This would also give a good opportunity document a lot of code
> that simply isn't documented (or incorrect). I'd say it's worth doing it in
> multiple steps though, such as refactor of a single class at a time, then
> followed by refactor of variable names. We've already done one pretty big
> refactor (InetAddressAndPort) for 4.0, I don't see how a few more could
> make it any worse (lol).
>
> Row vs partition vs key vs PK is killing me
>
> On 20 March 2018 at 22:04, Jon Haddad  wrote:
>
> > Whenever I hop around in the codebase, one thing that always manages to
> > slow me down is needing to understand the context of the variable names
> > that I’m looking at.  We’ve now removed thrift the transport, but the
> > variables, classes and comments still remain.  Personally, I’d like to go
> > in and pay off as much technical debt as possible by refactoring the code
> > to be as close to CQL as possible.  Rows should be rows, not partitions,
> > I’d love to see the term column family removed forever in favor of always
> > using tables.  That said, it’s a big task.  I did a quick refactor in a
> > branch, simply changing the ColumnFamilyStore class to TableStore, and
> > pushed it up to GitHub. [1]
> >
> > Didn’t click on the link?  That’s ok.  The TL;DR is that it’s almost 2K
> > LOC changed across 275 files.  I’ll note that my branch doesn’t change
> any
> > of the almost 1000 search results of “columnfamilystore” found in the
> > codebase and hundreds of tests failed on my branch in CircleCI, so that
> 2K
> > LOC change would probably be quite a bit bigger.  There is, of course, a
> > lot more than just renaming this one class, there’s thousands of variable
> > names using any manner of “cf”, “cfs”, “columnfamily”, names plus
> comments
> > and who knows what else.  There’s lots of references in probably every
> file
> > that would have to get updated.
> >
> > What are people’s thoughts on this?  We should be honest with ourselves
> > and know this isn’t going to get any easier over time.  It’s only going
> to
> > get more confusing for new people to the project, and having to figure
> out
> > “what kind of row am i even looking at” is a waste of time.  There’s
> > obviously a much bigger impact than just renaming a bunch of files,
> there’s
> > any number of patches and branches that would become outdated, plus
> anyone
> > pulling in Cassandra as a dependency would be affected.  I don’t really
> > have a solution for the disruption other than “leave it in place”, but in
> > my mind that’s not a great (or even good) solution.
> >
> > Anyways, enough out of me.  My concern for ergonomics and naming might be
> > significantly higher than the rest of the folks working in the code, and
> I
> > wanted to put a feeler out there before I decided to dig into this in a
> > more serious manner.
> >
> > Jon
> >
> > [1]
> >
> https://github.com/apache/cassandra/compare/trunk...rustyrazorblade:refactor_column_family_store?expand=1
> > <
> >
> https://github.com/apache/cassandra/compare/trunk...rustyrazorblade:refactor_column_family_store?expand=1
> > >
>

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Stefan Podkowinski

There's also another option, which I just want to mention here for the
sake of discussion.

Quoting the Oracle Support Roadmap:
"Instead of relying on a pre-installed standalone JRE, we encourage
application developers to deliver JREs with their applications."

I've played around with Java 9 a while ago and also tested creating a
self contained JRE using jlink, which you can bundle and ship with your
application. So there's a technical solution for that with Java 9. Of
course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
Classpath exception) first.

Bundling a custom JRE along with Cassandra, would be convenient in a way
that we can do all the testing against the bundled Java version. We
could also switch to a new Java version whenever it fits us. Like e.g.
apache-cassandra-4.0.14_openjdk11u321 and two months later release
apache-cassandra-4.0.15_openjdk12u123. History has shown that planing
and timing new releases isn't always working out for us as expected. I'd
rather prefer not having to tightly coordinate our own releases together
with OpenJDK releases, if it can be avoided. At the same time I'd like
to avoid having users updating to incompatible JREs (think about
8u161/#14173), or have them constantly ask which JRE version to use for
which Cassandra version, always with the risk of automatic updates
causing unexpected issues. Bundling the JRE may help us with that, as it
would become more a matter of testing and getting CI turn green, before
we're ready to bundle the next major JRE update, without getting the
user involved at all.

If you would prefer using a global system JRE, that should still be
possible by installing an unbundled Cassandra version, but you'd have to
pay attention which Java version to use for which Cassandra release,
possibly having to provide patches and do some testing for more recent
Cassandra versions, in case of compatibility issues. If we update 3.11
to Java 13 in mid 2019, we'd have to provide release candidates that can
be used for testing for such incompatibilities by LTS users and have
them provide patches, which then have to fully work with Java 13 of
course. Otherwise I can't see how to make Oracle/Redhat/IBM/Azul LTS
releases work, except on this best effort basis without official support
guarantees by us.

I'm not too enthusiastic about this perspective. But I wouldn't
completely dismiss it either, without talking about all the other
options first.

On 20.03.2018 22:32, Ariel Weisberg wrote:
> Hi,
> 
> Synchronizing with Oracle LTS releases is kind of low value if it's a paid 
> offering. But if someone in the community doesn't want to upgrade and pays 
> Oracle we don't want to get in the way of that.
> 
> Which is how you end up with what Jordan and ElasticSearch suggest. I'm still 
> +1 on that although in my heart of hearts I want  to only support the latest 
> OpenJDK on trunk and after we cut a release only change the JDK if there is a 
> serious issue.
> 
> It's going to be annoying once we have a serious security or correctness 
> issue and we need to move to a later OpenJDK. The majority won't be paying 
> Oracle for LTS. I don't think that will happen that often though.
> 
> Regards,
> Ariel
> 
> If that ends up not working and we find it's a problem to not be getting 
> On Tue, Mar 20, 2018, at 4:50 PM, Jason Brown wrote:
>> Thanks to Hannu and others pointing out that the OracleJDK is a
>> *commercial* LTS, and thus not an option. mea culpa for missing the
>> "commercial" and just focusing on the "LTS" bit. OpenJDK is is, then.
>>
>> Stefan's elastic search link is rather interesting. Looks like they are
>> compiling for both a LTS version as well as the current OpenJDK. They
>> assume some of their users will stick to a LTS version and some will run
>> the current version of OpenJDK.
>>
>> While it's extra work to add JDK version as yet another matrix variable in
>> addition to our branching, is that something we should consider? Or are we
>> going to burden maintainers even more? Do we have a choice? Note: I think
>> this is similar to what Jeremiah is proposed.
>>
>> @Ariel: Going beyond 3 years could be tricky in the worst case because
>> bringing in up to 3 years of JDK changes to an older release might mean
>> some of our dependencies no longer function and now it's not just minor
>> fixes it's bringing in who knows what in terms of updated dependencies
>>
>> I'm not sure we have a choice anymore, as we're basically bound to what the
>> JDK developers choose to do (and we're bound to the JDK ...). However, if
>> we have the changes necessary for the JDK releases higher than the LTS (if
>> we following the elastic search model), perhaps it'll be a reasonably
>> smooth transition?
>>
>> On Tue, Mar 20, 2018 at 1:31 PM, Jason Brown  wrote:
>>
>>> copied directly from dev channel, just to keep with this ML conversation
>>>
>>> 08:08:26   Robert Stupp jasobrown: https://www.azul.com/java-
>>> stable-secure-free-choose-two-three/ and https:/

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Josh McKenzie

As a gut check, the idea of bundling a JRE with C* appeals to me from
a "control your variables" perspective. Simplifies quite a bit of this
problem imo.

On Wed, Mar 21, 2018 at 9:04 AM, Stefan Podkowinski  wrote:
> There's also another option, which I just want to mention here for the
> sake of discussion.
>
> Quoting the Oracle Support Roadmap:
> "Instead of relying on a pre-installed standalone JRE, we encourage
> application developers to deliver JREs with their applications."
>
> I've played around with Java 9 a while ago and also tested creating a
> self contained JRE using jlink, which you can bundle and ship with your
> application. So there's a technical solution for that with Java 9. Of
> course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
> Classpath exception) first.
>
> Bundling a custom JRE along with Cassandra, would be convenient in a way
> that we can do all the testing against the bundled Java version. We
> could also switch to a new Java version whenever it fits us. Like e.g.
> apache-cassandra-4.0.14_openjdk11u321 and two months later release
> apache-cassandra-4.0.15_openjdk12u123. History has shown that planing
> and timing new releases isn't always working out for us as expected. I'd
> rather prefer not having to tightly coordinate our own releases together
> with OpenJDK releases, if it can be avoided. At the same time I'd like
> to avoid having users updating to incompatible JREs (think about
> 8u161/#14173), or have them constantly ask which JRE version to use for
> which Cassandra version, always with the risk of automatic updates
> causing unexpected issues. Bundling the JRE may help us with that, as it
> would become more a matter of testing and getting CI turn green, before
> we're ready to bundle the next major JRE update, without getting the
> user involved at all.
>
> If you would prefer using a global system JRE, that should still be
> possible by installing an unbundled Cassandra version, but you'd have to
> pay attention which Java version to use for which Cassandra release,
> possibly having to provide patches and do some testing for more recent
> Cassandra versions, in case of compatibility issues. If we update 3.11
> to Java 13 in mid 2019, we'd have to provide release candidates that can
> be used for testing for such incompatibilities by LTS users and have
> them provide patches, which then have to fully work with Java 13 of
> course. Otherwise I can't see how to make Oracle/Redhat/IBM/Azul LTS
> releases work, except on this best effort basis without official support
> guarantees by us.
>
> I'm not too enthusiastic about this perspective. But I wouldn't
> completely dismiss it either, without talking about all the other
> options first.
>
>
> On 20.03.2018 22:32, Ariel Weisberg wrote:
>> Hi,
>>
>> Synchronizing with Oracle LTS releases is kind of low value if it's a paid 
>> offering. But if someone in the community doesn't want to upgrade and pays 
>> Oracle we don't want to get in the way of that.
>>
>> Which is how you end up with what Jordan and ElasticSearch suggest. I'm 
>> still +1 on that although in my heart of hearts I want  to only support the 
>> latest OpenJDK on trunk and after we cut a release only change the JDK if 
>> there is a serious issue.
>>
>> It's going to be annoying once we have a serious security or correctness 
>> issue and we need to move to a later OpenJDK. The majority won't be paying 
>> Oracle for LTS. I don't think that will happen that often though.
>>
>> Regards,
>> Ariel
>>
>> If that ends up not working and we find it's a problem to not be getting
>> On Tue, Mar 20, 2018, at 4:50 PM, Jason Brown wrote:
>>> Thanks to Hannu and others pointing out that the OracleJDK is a
>>> *commercial* LTS, and thus not an option. mea culpa for missing the
>>> "commercial" and just focusing on the "LTS" bit. OpenJDK is is, then.
>>>
>>> Stefan's elastic search link is rather interesting. Looks like they are
>>> compiling for both a LTS version as well as the current OpenJDK. They
>>> assume some of their users will stick to a LTS version and some will run
>>> the current version of OpenJDK.
>>>
>>> While it's extra work to add JDK version as yet another matrix variable in
>>> addition to our branching, is that something we should consider? Or are we
>>> going to burden maintainers even more? Do we have a choice? Note: I think
>>> this is similar to what Jeremiah is proposed.
>>>
>>> @Ariel: Going beyond 3 years could be tricky in the worst case because
>>> bringing in up to 3 years of JDK changes to an older release might mean
>>> some of our dependencies no longer function and now it's not just minor
>>> fixes it's bringing in who knows what in terms of updated dependencies
>>>
>>> I'm not sure we have a choice anymore, as we're basically bound to what the
>>> JDK developers choose to do (and we're bound to the JDK ...). However, if
>>> we have the changes necessary for the JDK releases higher than the LTS (if
>>> we following t

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Jason Brown

Similar to Josh, I like Stefan's idea, a lot. iirc, the Basho folks used to
ship a specific version of erlang with Riak, so it's not a new precedent in
the (nosql) database space.
I'm not sure about the licensing, though, as openjdk is GPLv2 w/ classpath
exception - but I can ask Apache Legal about it.

On Wed, Mar 21, 2018 at 6:10 AM, Josh McKenzie  wrote:

> As a gut check, the idea of bundling a JRE with C* appeals to me from
> a "control your variables" perspective. Simplifies quite a bit of this
> problem imo.
>
> On Wed, Mar 21, 2018 at 9:04 AM, Stefan Podkowinski 
> wrote:
> > There's also another option, which I just want to mention here for the
> > sake of discussion.
> >
> > Quoting the Oracle Support Roadmap:
> > "Instead of relying on a pre-installed standalone JRE, we encourage
> > application developers to deliver JREs with their applications."
> >
> > I've played around with Java 9 a while ago and also tested creating a
> > self contained JRE using jlink, which you can bundle and ship with your
> > application. So there's a technical solution for that with Java 9. Of
> > course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
> > Classpath exception) first.
> >
> > Bundling a custom JRE along with Cassandra, would be convenient in a way
> > that we can do all the testing against the bundled Java version. We
> > could also switch to a new Java version whenever it fits us. Like e.g.
> > apache-cassandra-4.0.14_openjdk11u321 and two months later release
> > apache-cassandra-4.0.15_openjdk12u123. History has shown that planing
> > and timing new releases isn't always working out for us as expected. I'd
> > rather prefer not having to tightly coordinate our own releases together
> > with OpenJDK releases, if it can be avoided. At the same time I'd like
> > to avoid having users updating to incompatible JREs (think about
> > 8u161/#14173), or have them constantly ask which JRE version to use for
> > which Cassandra version, always with the risk of automatic updates
> > causing unexpected issues. Bundling the JRE may help us with that, as it
> > would become more a matter of testing and getting CI turn green, before
> > we're ready to bundle the next major JRE update, without getting the
> > user involved at all.
> >
> > If you would prefer using a global system JRE, that should still be
> > possible by installing an unbundled Cassandra version, but you'd have to
> > pay attention which Java version to use for which Cassandra release,
> > possibly having to provide patches and do some testing for more recent
> > Cassandra versions, in case of compatibility issues. If we update 3.11
> > to Java 13 in mid 2019, we'd have to provide release candidates that can
> > be used for testing for such incompatibilities by LTS users and have
> > them provide patches, which then have to fully work with Java 13 of
> > course. Otherwise I can't see how to make Oracle/Redhat/IBM/Azul LTS
> > releases work, except on this best effort basis without official support
> > guarantees by us.
> >
> > I'm not too enthusiastic about this perspective. But I wouldn't
> > completely dismiss it either, without talking about all the other
> > options first.
> >
> >
> > On 20.03.2018 22:32, Ariel Weisberg wrote:
> >> Hi,
> >>
> >> Synchronizing with Oracle LTS releases is kind of low value if it's a
> paid offering. But if someone in the community doesn't want to upgrade and
> pays Oracle we don't want to get in the way of that.
> >>
> >> Which is how you end up with what Jordan and ElasticSearch suggest. I'm
> still +1 on that although in my heart of hearts I want  to only support the
> latest OpenJDK on trunk and after we cut a release only change the JDK if
> there is a serious issue.
> >>
> >> It's going to be annoying once we have a serious security or
> correctness issue and we need to move to a later OpenJDK. The majority
> won't be paying Oracle for LTS. I don't think that will happen that often
> though.
> >>
> >> Regards,
> >> Ariel
> >>
> >> If that ends up not working and we find it's a problem to not be getting
> >> On Tue, Mar 20, 2018, at 4:50 PM, Jason Brown wrote:
> >>> Thanks to Hannu and others pointing out that the OracleJDK is a
> >>> *commercial* LTS, and thus not an option. mea culpa for missing the
> >>> "commercial" and just focusing on the "LTS" bit. OpenJDK is is, then.
> >>>
> >>> Stefan's elastic search link is rather interesting. Looks like they are
> >>> compiling for both a LTS version as well as the current OpenJDK. They
> >>> assume some of their users will stick to a LTS version and some will
> run
> >>> the current version of OpenJDK.
> >>>
> >>> While it's extra work to add JDK version as yet another matrix
> variable in
> >>> addition to our branching, is that something we should consider? Or
> are we
> >>> going to burden maintainers even more? Do we have a choice? Note: I
> think
> >>> this is similar to what Jeremiah is proposed.
> >>>
> >>> @Ariel: Going beyond 3 years could be t

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Eric Evans

On Wed, Mar 21, 2018 at 3:48 AM, Sylvain Lebresne  wrote:

[ ... ]

> - pure code renaming is one reasonably simple aspect, but quite a few
> renaming may have user visible impact. Particularly around JMX where many
> things are name based on their class, and to a lesser extend some of our
> tools still use "old" naming. We can't and shouldn't ignore those impact:
> such user visible changes should imo be documented, and we should make sure
> we have a reasonably painless (and thus incremental) upgrade path. My hunch
> is the latter isn't as simple as it seems.

Speaking as someone who has personally been burned by this
(repeatedly, and it's on-going), please think very carefully before
making such changes.  I hate to think about of all the hours I wasted
shaving this breed of yak.

> On Wed, Mar 21, 2018 at 9:06 AM kurt greaves  wrote:

[ ... ]

-- 
Eric Evans
john.eric.ev...@gmail.com

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Eric Evans

On Wed, Mar 21, 2018 at 8:04 AM, Stefan Podkowinski  wrote:

> There's also another option, which I just want to mention here for the
> sake of discussion.
>
> Quoting the Oracle Support Roadmap:
> "Instead of relying on a pre-installed standalone JRE, we encourage
> application developers to deliver JREs with their applications."
>
> I've played around with Java 9 a while ago and also tested creating a
> self contained JRE using jlink, which you can bundle and ship with your
> application. So there's a technical solution for that with Java 9. Of
> course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
> Classpath exception) first.
>
> Bundling a custom JRE along with Cassandra, would be convenient in a way
> that we can do all the testing against the bundled Java version. We
> could also switch to a new Java version whenever it fits us. Like e.g.
> apache-cassandra-4.0.14_openjdk11u321 and two months later release
> apache-cassandra-4.0.15_openjdk12u123. History has shown that planing
> and timing new releases isn't always working out for us as expected. I'd
> rather prefer not having to tightly coordinate our own releases together
> with OpenJDK releases, if it can be avoided. At the same time I'd like
> to avoid having users updating to incompatible JREs (think about
> 8u161/#14173), or have them constantly ask which JRE version to use for
> which Cassandra version, always with the risk of automatic updates
> causing unexpected issues. Bundling the JRE may help us with that, as it
> would become more a matter of testing and getting CI turn green, before
> we're ready to bundle the next major JRE update, without getting the
> user involved at all.
>
> If you would prefer using a global system JRE, that should still be
> possible by installing an unbundled Cassandra version, but you'd have to
> pay attention which Java version to use for which Cassandra release,
> possibly having to provide patches and do some testing for more recent
> Cassandra versions, in case of compatibility issues. If we update 3.11
> to Java 13 in mid 2019, we'd have to provide release candidates that can
> be used for testing for such incompatibilities by LTS users and have
> them provide patches, which then have to fully work with Java 13 of
> course. Otherwise I can't see how to make Oracle/Redhat/IBM/Azul LTS
> releases work, except on this best effort basis without official support
> guarantees by us.
>
> I'm not too enthusiastic about this perspective. But I wouldn't
> completely dismiss it either, without talking about all the other
> options first.
>

Personally, I don't like the idea of vendoring like this at all.  Wasn't
portability supposed to be one of the big selling points of Java?  Wouldn't
our efforts be better directed at being portable to within a few releases
of the JDK, than to tightly couple it to once specific version?


> On 20.03.2018 22:32, Ariel Weisberg wrote:
> > Hi,
> >
> > Synchronizing with Oracle LTS releases is kind of low value if it's a
> paid offering. But if someone in the community doesn't want to upgrade and
> pays Oracle we don't want to get in the way of that.
> >
> > Which is how you end up with what Jordan and ElasticSearch suggest. I'm
> still +1 on that although in my heart of hearts I want  to only support the
> latest OpenJDK on trunk and after we cut a release only change the JDK if
> there is a serious issue.
> >
> > It's going to be annoying once we have a serious security or correctness
> issue and we need to move to a later OpenJDK. The majority won't be paying
> Oracle for LTS. I don't think that will happen that often though.
> >
> > Regards,
> > Ariel
> >
> > If that ends up not working and we find it's a problem to not be getting
> > On Tue, Mar 20, 2018, at 4:50 PM, Jason Brown wrote:
> >> Thanks to Hannu and others pointing out that the OracleJDK is a
> >> *commercial* LTS, and thus not an option. mea culpa for missing the
> >> "commercial" and just focusing on the "LTS" bit. OpenJDK is is, then.
> >>
> >> Stefan's elastic search link is rather interesting. Looks like they are
> >> compiling for both a LTS version as well as the current OpenJDK. They
> >> assume some of their users will stick to a LTS version and some will run
> >> the current version of OpenJDK.
> >>
> >> While it's extra work to add JDK version as yet another matrix variable
> in
> >> addition to our branching, is that something we should consider? Or are
> we
> >> going to burden maintainers even more? Do we have a choice? Note: I
> think
> >> this is similar to what Jeremiah is proposed.
> >>
> >> @Ariel: Going beyond 3 years could be tricky in the worst case because
> >> bringing in up to 3 years of JDK changes to an older release might mean
> >> some of our dependencies no longer function and now it's not just minor
> >> fixes it's bringing in who knows what in terms of updated dependencies
> >>
> >> I'm not sure we have a choice anymore, as we're basically bound to what
> the
> >> JDK devel

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Josh McKenzie

> Wasn't portability supposed to be one of the
> big selling points of Java?
Historically, yes, but their change in release cadence and support
structure is something they're pushing, not us. We have to figure out
how to make the best of a change that is, at best, orthogonal to our
interests as a project.

Maintaining portability within a few releases of the JDK for each
supported version of C* is a non-trivial amount of work with a 6 month
refresh timer on it.

On Wed, Mar 21, 2018 at 9:49 AM, Eric Evans  wrote:
> On Wed, Mar 21, 2018 at 8:04 AM, Stefan Podkowinski  wrote:
>
>> There's also another option, which I just want to mention here for the
>> sake of discussion.
>>
>> Quoting the Oracle Support Roadmap:
>> "Instead of relying on a pre-installed standalone JRE, we encourage
>> application developers to deliver JREs with their applications."
>>
>> I've played around with Java 9 a while ago and also tested creating a
>> self contained JRE using jlink, which you can bundle and ship with your
>> application. So there's a technical solution for that with Java 9. Of
>> course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
>> Classpath exception) first.
>>
>> Bundling a custom JRE along with Cassandra, would be convenient in a way
>> that we can do all the testing against the bundled Java version. We
>> could also switch to a new Java version whenever it fits us. Like e.g.
>> apache-cassandra-4.0.14_openjdk11u321 and two months later release
>> apache-cassandra-4.0.15_openjdk12u123. History has shown that planing
>> and timing new releases isn't always working out for us as expected. I'd
>> rather prefer not having to tightly coordinate our own releases together
>> with OpenJDK releases, if it can be avoided. At the same time I'd like
>> to avoid having users updating to incompatible JREs (think about
>> 8u161/#14173), or have them constantly ask which JRE version to use for
>> which Cassandra version, always with the risk of automatic updates
>> causing unexpected issues. Bundling the JRE may help us with that, as it
>> would become more a matter of testing and getting CI turn green, before
>> we're ready to bundle the next major JRE update, without getting the
>> user involved at all.
>>
>> If you would prefer using a global system JRE, that should still be
>> possible by installing an unbundled Cassandra version, but you'd have to
>> pay attention which Java version to use for which Cassandra release,
>> possibly having to provide patches and do some testing for more recent
>> Cassandra versions, in case of compatibility issues. If we update 3.11
>> to Java 13 in mid 2019, we'd have to provide release candidates that can
>> be used for testing for such incompatibilities by LTS users and have
>> them provide patches, which then have to fully work with Java 13 of
>> course. Otherwise I can't see how to make Oracle/Redhat/IBM/Azul LTS
>> releases work, except on this best effort basis without official support
>> guarantees by us.
>>
>> I'm not too enthusiastic about this perspective. But I wouldn't
>> completely dismiss it either, without talking about all the other
>> options first.
>>
>
> Personally, I don't like the idea of vendoring like this at all.  Wasn't
> portability supposed to be one of the big selling points of Java?  Wouldn't
> our efforts be better directed at being portable to within a few releases
> of the JDK, than to tightly couple it to once specific version?
>
>
>> On 20.03.2018 22:32, Ariel Weisberg wrote:
>> > Hi,
>> >
>> > Synchronizing with Oracle LTS releases is kind of low value if it's a
>> paid offering. But if someone in the community doesn't want to upgrade and
>> pays Oracle we don't want to get in the way of that.
>> >
>> > Which is how you end up with what Jordan and ElasticSearch suggest. I'm
>> still +1 on that although in my heart of hearts I want  to only support the
>> latest OpenJDK on trunk and after we cut a release only change the JDK if
>> there is a serious issue.
>> >
>> > It's going to be annoying once we have a serious security or correctness
>> issue and we need to move to a later OpenJDK. The majority won't be paying
>> Oracle for LTS. I don't think that will happen that often though.
>> >
>> > Regards,
>> > Ariel
>> >
>> > If that ends up not working and we find it's a problem to not be getting
>> > On Tue, Mar 20, 2018, at 4:50 PM, Jason Brown wrote:
>> >> Thanks to Hannu and others pointing out that the OracleJDK is a
>> >> *commercial* LTS, and thus not an option. mea culpa for missing the
>> >> "commercial" and just focusing on the "LTS" bit. OpenJDK is is, then.
>> >>
>> >> Stefan's elastic search link is rather interesting. Looks like they are
>> >> compiling for both a LTS version as well as the current OpenJDK. They
>> >> assume some of their users will stick to a LTS version and some will run
>> >> the current version of OpenJDK.
>> >>
>> >> While it's extra work to add JDK version as yet another matrix variable
>> in
>> >> addition to

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread J. D. Jordan

I was reading up on the bundling possibility the other day as well. I like the 
idea from release standpoint. And if someone wants to use a different version 
they can always recompile it themselves.
As Stefan pointed out, the main issue I see is the one of how licensing plays 
out with this.

-Jeremiah

> On Mar 21, 2018, at 9:49 AM, Eric Evans  wrote:
> 
>> On Wed, Mar 21, 2018 at 8:04 AM, Stefan Podkowinski  wrote:
>> 
>> There's also another option, which I just want to mention here for the
>> sake of discussion.
>> 
>> Quoting the Oracle Support Roadmap:
>> "Instead of relying on a pre-installed standalone JRE, we encourage
>> application developers to deliver JREs with their applications."
>> 
>> I've played around with Java 9 a while ago and also tested creating a
>> self contained JRE using jlink, which you can bundle and ship with your
>> application. So there's a technical solution for that with Java 9. Of
>> course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
>> Classpath exception) first.
>> 
>> Bundling a custom JRE along with Cassandra, would be convenient in a way
>> that we can do all the testing against the bundled Java version. We
>> could also switch to a new Java version whenever it fits us. Like e.g.
>> apache-cassandra-4.0.14_openjdk11u321 and two months later release
>> apache-cassandra-4.0.15_openjdk12u123. History has shown that planing
>> and timing new releases isn't always working out for us as expected. I'd
>> rather prefer not having to tightly coordinate our own releases together
>> with OpenJDK releases, if it can be avoided. At the same time I'd like
>> to avoid having users updating to incompatible JREs (think about
>> 8u161/#14173), or have them constantly ask which JRE version to use for
>> which Cassandra version, always with the risk of automatic updates
>> causing unexpected issues. Bundling the JRE may help us with that, as it
>> would become more a matter of testing and getting CI turn green, before
>> we're ready to bundle the next major JRE update, without getting the
>> user involved at all.
>> 
>> If you would prefer using a global system JRE, that should still be
>> possible by installing an unbundled Cassandra version, but you'd have to
>> pay attention which Java version to use for which Cassandra release,
>> possibly having to provide patches and do some testing for more recent
>> Cassandra versions, in case of compatibility issues. If we update 3.11
>> to Java 13 in mid 2019, we'd have to provide release candidates that can
>> be used for testing for such incompatibilities by LTS users and have
>> them provide patches, which then have to fully work with Java 13 of
>> course. Otherwise I can't see how to make Oracle/Redhat/IBM/Azul LTS
>> releases work, except on this best effort basis without official support
>> guarantees by us.
>> 
>> I'm not too enthusiastic about this perspective. But I wouldn't
>> completely dismiss it either, without talking about all the other
>> options first.
>> 
> 
> Personally, I don't like the idea of vendoring like this at all.  Wasn't
> portability supposed to be one of the big selling points of Java?  Wouldn't
> our efforts be better directed at being portable to within a few releases
> of the JDK, than to tightly couple it to once specific version?
> 
> 
>>> On 20.03.2018 22:32, Ariel Weisberg wrote:
>>> Hi,
>>> 
>>> Synchronizing with Oracle LTS releases is kind of low value if it's a
>> paid offering. But if someone in the community doesn't want to upgrade and
>> pays Oracle we don't want to get in the way of that.
>>> 
>>> Which is how you end up with what Jordan and ElasticSearch suggest. I'm
>> still +1 on that although in my heart of hearts I want  to only support the
>> latest OpenJDK on trunk and after we cut a release only change the JDK if
>> there is a serious issue.
>>> 
>>> It's going to be annoying once we have a serious security or correctness
>> issue and we need to move to a later OpenJDK. The majority won't be paying
>> Oracle for LTS. I don't think that will happen that often though.
>>> 
>>> Regards,
>>> Ariel
>>> 
>>> If that ends up not working and we find it's a problem to not be getting
 On Tue, Mar 20, 2018, at 4:50 PM, Jason Brown wrote:
 Thanks to Hannu and others pointing out that the OracleJDK is a
 *commercial* LTS, and thus not an option. mea culpa for missing the
 "commercial" and just focusing on the "LTS" bit. OpenJDK is is, then.
 
 Stefan's elastic search link is rather interesting. Looks like they are
 compiling for both a LTS version as well as the current OpenJDK. They
 assume some of their users will stick to a LTS version and some will run
 the current version of OpenJDK.
 
 While it's extra work to add JDK version as yet another matrix variable
>> in
 addition to our branching, is that something we should consider? Or are
>> we
 going to burden maintainers even more? Do we have a choice? Note: I
>> think
 this is s

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Gerald Henriksen

On Wed, 21 Mar 2018 14:04:39 +0100, you wrote:

>There's also another option, which I just want to mention here for the
>sake of discussion.
>
>Quoting the Oracle Support Roadmap:
>"Instead of relying on a pre-installed standalone JRE, we encourage
>application developers to deliver JREs with their applications."
>
>I've played around with Java 9 a while ago and also tested creating a
>self contained JRE using jlink, which you can bundle and ship with your
>application. So there's a technical solution for that with Java 9. Of
>course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
>Classpath exception) first.
>
>Bundling a custom JRE along with Cassandra, would be convenient in a way
>that we can do all the testing against the bundled Java version. We
>could also switch to a new Java version whenever it fits us.

To a certain extent though the issue isn't whether Cassandra works
well with the given JRE but rather the issue of having a supported JRE
in a production environment.

If Cassandra ships with a bundled JRE does that then mean the
people/organizations downloading and using that product are going to
expect the Cassandra project to provide bug and security updates to
the JRE as well as Cassandra?

What happens if an organization gets hacked due to an issue in an out
of date JRE that Cassandra bundled?  Yes, that can currently happen if
the organization chooses to run Cassandra on an unsupported JRE.  But
in that case the organization has made that decision, not Cassandra.

Essentially any security concious entity, whether a person or
organization, running any software stack on top of Java (or I guess
any of the other languages based on the JVM) is going to have to make
a choice between constantly updating their JRE or going with a LTS
version (either from Oracle or Red Hat or any other company that is
willing to provide it).  Or maybe even move to .Net now that it is
supported on Linux.

I don't think there are any great choices here for Cassandra or any of
the other Java based projects but an easy solution (in terms of basing
the project on a supported JRE that can be downloaded for free) would
be to choose whatever version of OpenJDK is supported by Red Hat or
any other Linux distribution that offers a LTS release.

So for example basing on OpenJDK 8 gets you support until October 2020
with paying for Red Hat, or for free (with mainly security updates) by
using Centos.

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Eric Evans

On Wed, Mar 21, 2018 at 9:00 AM, Josh McKenzie  wrote:
>> Wasn't portability supposed to be one of the
>> big selling points of Java?
> Historically, yes, but their change in release cadence and support
> structure is something they're pushing, not us. We have to figure out
> how to make the best of a change that is, at best, orthogonal to our
> interests as a project.
>
> Maintaining portability within a few releases of the JDK for each
> supported version of C* is a non-trivial amount of work with a 6 month
> refresh timer on it.

Do we know this though?  More frequent releases should also mean that
each new release has accumulated less change (the typical trade-off).
Also, these JDKs are supposed to be backward compatible, if this isn't
the case because (for example) we are (ab)using private interfaces,
that's probably worth looking at anyway.

> On Wed, Mar 21, 2018 at 9:49 AM, Eric Evans  wrote:
>> On Wed, Mar 21, 2018 at 8:04 AM, Stefan Podkowinski  wrote:
>>
>>> There's also another option, which I just want to mention here for the
>>> sake of discussion.
>>>
>>> Quoting the Oracle Support Roadmap:
>>> "Instead of relying on a pre-installed standalone JRE, we encourage
>>> application developers to deliver JREs with their applications."
>>>
>>> I've played around with Java 9 a while ago and also tested creating a
>>> self contained JRE using jlink, which you can bundle and ship with your
>>> application. So there's a technical solution for that with Java 9. Of
>>> course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
>>> Classpath exception) first.
>>>
>>> Bundling a custom JRE along with Cassandra, would be convenient in a way
>>> that we can do all the testing against the bundled Java version. We
>>> could also switch to a new Java version whenever it fits us. Like e.g.
>>> apache-cassandra-4.0.14_openjdk11u321 and two months later release
>>> apache-cassandra-4.0.15_openjdk12u123. History has shown that planing
>>> and timing new releases isn't always working out for us as expected. I'd
>>> rather prefer not having to tightly coordinate our own releases together
>>> with OpenJDK releases, if it can be avoided. At the same time I'd like
>>> to avoid having users updating to incompatible JREs (think about
>>> 8u161/#14173), or have them constantly ask which JRE version to use for
>>> which Cassandra version, always with the risk of automatic updates
>>> causing unexpected issues. Bundling the JRE may help us with that, as it
>>> would become more a matter of testing and getting CI turn green, before
>>> we're ready to bundle the next major JRE update, without getting the
>>> user involved at all.
>>>
>>> If you would prefer using a global system JRE, that should still be
>>> possible by installing an unbundled Cassandra version, but you'd have to
>>> pay attention which Java version to use for which Cassandra release,
>>> possibly having to provide patches and do some testing for more recent
>>> Cassandra versions, in case of compatibility issues. If we update 3.11
>>> to Java 13 in mid 2019, we'd have to provide release candidates that can
>>> be used for testing for such incompatibilities by LTS users and have
>>> them provide patches, which then have to fully work with Java 13 of
>>> course. Otherwise I can't see how to make Oracle/Redhat/IBM/Azul LTS
>>> releases work, except on this best effort basis without official support
>>> guarantees by us.
>>>
>>> I'm not too enthusiastic about this perspective. But I wouldn't
>>> completely dismiss it either, without talking about all the other
>>> options first.
>>>
>>
>> Personally, I don't like the idea of vendoring like this at all.  Wasn't
>> portability supposed to be one of the big selling points of Java?  Wouldn't
>> our efforts be better directed at being portable to within a few releases
>> of the JDK, than to tightly couple it to once specific version?
>>
>>
>>> On 20.03.2018 22:32, Ariel Weisberg wrote:
>>> > Hi,
>>> >
>>> > Synchronizing with Oracle LTS releases is kind of low value if it's a
>>> paid offering. But if someone in the community doesn't want to upgrade and
>>> pays Oracle we don't want to get in the way of that.
>>> >
>>> > Which is how you end up with what Jordan and ElasticSearch suggest. I'm
>>> still +1 on that although in my heart of hearts I want  to only support the
>>> latest OpenJDK on trunk and after we cut a release only change the JDK if
>>> there is a serious issue.
>>> >
>>> > It's going to be annoying once we have a serious security or correctness
>>> issue and we need to move to a later OpenJDK. The majority won't be paying
>>> Oracle for LTS. I don't think that will happen that often though.
>>> >
>>> > Regards,
>>> > Ariel
>>> >
>>> > If that ends up not working and we find it's a problem to not be getting
>>> > On Tue, Mar 20, 2018, at 4:50 PM, Jason Brown wrote:
>>> >> Thanks to Hannu and others pointing out that the OracleJDK is a
>>> >> *commercial* LTS, and thus not an option. mea culpa for

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Jeff Jirsa




> On Mar 21, 2018, at 7:21 AM, Gerald Henriksen  wrote:
> 
>> On Wed, 21 Mar 2018 14:04:39 +0100, you wrote:
>> Bundling a custom JRE along with Cassandra, would be convenient in a way
>> that we can do all the testing against the bundled Java version. We
>> could also switch to a new Java version whenever it fits us.
> 
> To a certain extent though the issue isn't whether Cassandra works
> well with the given JRE but rather the issue of having a supported JRE
> in a production environment.
> 

This, plus the license issue, probably makes this a non-option.

(The license question is closed for the record:

https://www.apache.org/licenses/GPL-compatibility.html - The Apache Software 
Foundation does not allow its own projects to distribute software under 
licenses more restrictive than the Apache License)

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Eric Evans

On Wed, Mar 21, 2018 at 9:21 AM, Gerald Henriksen  wrote:
> On Wed, 21 Mar 2018 14:04:39 +0100, you wrote:
>
>>There's also another option, which I just want to mention here for the
>>sake of discussion.
>>
>>Quoting the Oracle Support Roadmap:
>>"Instead of relying on a pre-installed standalone JRE, we encourage
>>application developers to deliver JREs with their applications."
>>
>>I've played around with Java 9 a while ago and also tested creating a
>>self contained JRE using jlink, which you can bundle and ship with your
>>application. So there's a technical solution for that with Java 9. Of
>>course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
>>Classpath exception) first.
>>
>>Bundling a custom JRE along with Cassandra, would be convenient in a way
>>that we can do all the testing against the bundled Java version. We
>>could also switch to a new Java version whenever it fits us.
>
> To a certain extent though the issue isn't whether Cassandra works
> well with the given JRE but rather the issue of having a supported JRE
> in a production environment.
>
> If Cassandra ships with a bundled JRE does that then mean the
> people/organizations downloading and using that product are going to
> expect the Cassandra project to provide bug and security updates to
> the JRE as well as Cassandra?
>
> What happens if an organization gets hacked due to an issue in an out
> of date JRE that Cassandra bundled?  Yes, that can currently happen if
> the organization chooses to run Cassandra on an unsupported JRE.  But
> in that case the organization has made that decision, not Cassandra.

Exactly.  It is common for organizations to evaluate JVM errata
against their environment/requirements and the use-cases they have,
then act accordingly.  If applications start embedding their own JVM
this becomes a combinatorial nightmare.

> Essentially any security concious entity, whether a person or
> organization, running any software stack on top of Java (or I guess
> any of the other languages based on the JVM) is going to have to make
> a choice between constantly updating their JRE or going with a LTS
> version (either from Oracle or Red Hat or any other company that is
> willing to provide it).  Or maybe even move to .Net now that it is
> supported on Linux.
>
> I don't think there are any great choices here for Cassandra or any of
> the other Java based projects but an easy solution (in terms of basing
> the project on a supported JRE that can be downloaded for free) would
> be to choose whatever version of OpenJDK is supported by Red Hat or
> any other Linux distribution that offers a LTS release.
>
> So for example basing on OpenJDK 8 gets you support until October 2020
> with paying for Red Hat, or for free (with mainly security updates) by
> using Centos.

Agreed.  Someone said this elsewhere as well, that the community will
work this out.

Even if you are not running say Debian, or RedHat, those distributions
will be backporting critical fixes to their JVMs; This work is going
to be done, and will be available to anyone.  If this becomes an
issue, it'll be an issue facing a lot of people, and I expect
unofficial LTS releases will quickly become available.

-- 
Eric Evans
john.eric.ev...@gmail.com

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Ariel Weisberg

Hi,

I'm not clear on what building and bundling our own JRE/JDK accomplishes? What 
is our source for JRE updates going to be? Are we going to build our own and 
does Oracle release the source for their LTS releases? Are we going to extract 
LTS updates from CentOS?

If the goal of bundling is to simplify upgrading the JRE/JDK for users by 
synchronizing it with updating Cassandra, well I think that isn't so bad. Sure 
it's a responsibility on our part, but if we can extract the Ubuntu or CentOS 
build it's "just" a matter of getting the latest bug fix JRE/JDK every time we 
cut a minor release. However it's us taking responsibility for it when we 
didn't previously and I don't see why we can't just document where to get 
updates from instead of bundling it ourselves.

Is there a licensing issue that forces us to just ship the JRE vs JDK or is it 
about download size? For release builds just the JRE is fine, but for building 
from source it would be nice if it bundled or fetched the full JDK. If we can 
avoid checking in a large binary distribution that would also be good.

Ariel

On Wed, Mar 21, 2018, at 10:21 AM, Gerald Henriksen wrote:
> On Wed, 21 Mar 2018 14:04:39 +0100, you wrote:
> 
> >There's also another option, which I just want to mention here for the
> >sake of discussion.
> >
> >Quoting the Oracle Support Roadmap:
> >"Instead of relying on a pre-installed standalone JRE, we encourage
> >application developers to deliver JREs with their applications."
> >
> >I've played around with Java 9 a while ago and also tested creating a
> >self contained JRE using jlink, which you can bundle and ship with your
> >application. So there's a technical solution for that with Java 9. Of
> >course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
> >Classpath exception) first.
> >
> >Bundling a custom JRE along with Cassandra, would be convenient in a way
> >that we can do all the testing against the bundled Java version. We
> >could also switch to a new Java version whenever it fits us.
> 
> To a certain extent though the issue isn't whether Cassandra works
> well with the given JRE but rather the issue of having a supported JRE
> in a production environment.
> 
> If Cassandra ships with a bundled JRE does that then mean the
> people/organizations downloading and using that product are going to
> expect the Cassandra project to provide bug and security updates to
> the JRE as well as Cassandra?
> 
> What happens if an organization gets hacked due to an issue in an out
> of date JRE that Cassandra bundled?  Yes, that can currently happen if
> the organization chooses to run Cassandra on an unsupported JRE.  But
> in that case the organization has made that decision, not Cassandra.
> 
> Essentially any security concious entity, whether a person or
> organization, running any software stack on top of Java (or I guess
> any of the other languages based on the JVM) is going to have to make
> a choice between constantly updating their JRE or going with a LTS
> version (either from Oracle or Red Hat or any other company that is
> willing to provide it).  Or maybe even move to .Net now that it is
> supported on Linux.
> 
> I don't think there are any great choices here for Cassandra or any of
> the other Java based projects but an easy solution (in terms of basing
> the project on a supported JRE that can be downloaded for free) would
> be to choose whatever version of OpenJDK is supported by Red Hat or
> any other Linux distribution that offers a LTS release.
> 
> So for example basing on OpenJDK 8 gets you support until October 2020
> with paying for Red Hat, or for free (with mainly security updates) by
> using Centos.
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Josh McKenzie

> Even if you are not running say Debian, or RedHat, those distributions
> will be backporting critical fixes to their JVMs; This work is going
> to be done, and will be available to anyone.
This would certainly mitigate a lot of the core problems with the new
release model. Has there been any public statements of plans/intent
with regards to distros doing this?

In terms of the burden of bugfixes and security fixes if we bundled a
JRE w/C*, cutting a patch release of C* with a new JRE distribution
would be a really low friction process (add to build, check CI, green,
done), so I don't think that would be a blocker for the concept.

The whole license restriction aspect of it, however, obviously is (and
makes complete sense).

On Wed, Mar 21, 2018 at 10:41 AM, Eric Evans  wrote:
> On Wed, Mar 21, 2018 at 9:21 AM, Gerald Henriksen  wrote:
>> On Wed, 21 Mar 2018 14:04:39 +0100, you wrote:
>>
>>>There's also another option, which I just want to mention here for the
>>>sake of discussion.
>>>
>>>Quoting the Oracle Support Roadmap:
>>>"Instead of relying on a pre-installed standalone JRE, we encourage
>>>application developers to deliver JREs with their applications."
>>>
>>>I've played around with Java 9 a while ago and also tested creating a
>>>self contained JRE using jlink, which you can bundle and ship with your
>>>application. So there's a technical solution for that with Java 9. Of
>>>course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
>>>Classpath exception) first.
>>>
>>>Bundling a custom JRE along with Cassandra, would be convenient in a way
>>>that we can do all the testing against the bundled Java version. We
>>>could also switch to a new Java version whenever it fits us.
>>
>> To a certain extent though the issue isn't whether Cassandra works
>> well with the given JRE but rather the issue of having a supported JRE
>> in a production environment.
>>
>> If Cassandra ships with a bundled JRE does that then mean the
>> people/organizations downloading and using that product are going to
>> expect the Cassandra project to provide bug and security updates to
>> the JRE as well as Cassandra?
>>
>> What happens if an organization gets hacked due to an issue in an out
>> of date JRE that Cassandra bundled?  Yes, that can currently happen if
>> the organization chooses to run Cassandra on an unsupported JRE.  But
>> in that case the organization has made that decision, not Cassandra.
>
> Exactly.  It is common for organizations to evaluate JVM errata
> against their environment/requirements and the use-cases they have,
> then act accordingly.  If applications start embedding their own JVM
> this becomes a combinatorial nightmare.
>
>> Essentially any security concious entity, whether a person or
>> organization, running any software stack on top of Java (or I guess
>> any of the other languages based on the JVM) is going to have to make
>> a choice between constantly updating their JRE or going with a LTS
>> version (either from Oracle or Red Hat or any other company that is
>> willing to provide it).  Or maybe even move to .Net now that it is
>> supported on Linux.
>>
>> I don't think there are any great choices here for Cassandra or any of
>> the other Java based projects but an easy solution (in terms of basing
>> the project on a supported JRE that can be downloaded for free) would
>> be to choose whatever version of OpenJDK is supported by Red Hat or
>> any other Linux distribution that offers a LTS release.
>>
>> So for example basing on OpenJDK 8 gets you support until October 2020
>> with paying for Red Hat, or for free (with mainly security updates) by
>> using Centos.
>
> Agreed.  Someone said this elsewhere as well, that the community will
> work this out.
>
> Even if you are not running say Debian, or RedHat, those distributions
> will be backporting critical fixes to their JVMs; This work is going
> to be done, and will be available to anyone.  If this becomes an
> issue, it'll be an issue facing a lot of people, and I expect
> unofficial LTS releases will quickly become available.
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Stefan Podkowinski

On 21.03.2018 15:41, Ariel Weisberg wrote:
> I'm not clear on what building and bundling our own JRE/JDK accomplishes? 

If we talk about OpenJDK, there will be only a single Java version
supported at any time and that is the latest Java version (11, 12, ..).
There is no overlap between supported versions. Therefor it doesn't
really make a lot of sense for us to officially support "a few releases
of the JDK" when we talk about OpenJDK releases. What we'd have to do is
to keep up with new Java versions by testing them and updating our code
base if necessary. Keep in mind that branches like 4.0 and 3.11 will
span several Java versions.

We can do this by communicating a list of branches and corresponding
Java releases that are officially supported. But we can also just bundle
and ship the latest OpenJDK release that we know is to be working for
any Cassandra branch right away, which would avoid any incompatibility
issues between our releases and JREs installed by the user and is
probably easier for everyone. But thats pretty much the biggest selling
point on bundling the JRE, but will probably not happen anyway due to
the licensing restrictions.


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Jason Brown

fwiw, a naive internet search turned up [1]. tl;dr use the java 9's jlink
(or java8's javapackager) to build a full app+jre package for distribution.

I started digging into the legal aspects, and (trying to) searching
legal-discuss@. May just send an email to them later today to speed up this
discovery process.

[1]
https://steveperkins.com/using-java-9-modularization-to-ship-zero-dependency-native-apps/

On Wed, Mar 21, 2018 at 8:25 AM, Stefan Podkowinski  wrote:

> On 21.03.2018 15:41, Ariel Weisberg wrote:
> > I'm not clear on what building and bundling our own JRE/JDK accomplishes?
>
> If we talk about OpenJDK, there will be only a single Java version
> supported at any time and that is the latest Java version (11, 12, ..).
> There is no overlap between supported versions. Therefor it doesn't
> really make a lot of sense for us to officially support "a few releases
> of the JDK" when we talk about OpenJDK releases. What we'd have to do is
> to keep up with new Java versions by testing them and updating our code
> base if necessary. Keep in mind that branches like 4.0 and 3.11 will
> span several Java versions.
>
> We can do this by communicating a list of branches and corresponding
> Java releases that are officially supported. But we can also just bundle
> and ship the latest OpenJDK release that we know is to be working for
> any Cassandra branch right away, which would avoid any incompatibility
> issues between our releases and JREs installed by the user and is
> probably easier for everyone. But thats pretty much the biggest selling
> point on bundling the JRE, but will probably not happen anyway due to
> the licensing restrictions.
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Gerald Henriksen

On Wed, 21 Mar 2018 10:52:06 -0400, you wrote:

>> Even if you are not running say Debian, or RedHat, those distributions
>> will be backporting critical fixes to their JVMs; This work is going
>> to be done, and will be available to anyone.
>This would certainly mitigate a lot of the core problems with the new
>release model. Has there been any public statements of plans/intent
>with regards to distros doing this?

I am not familiar enough with the others to say definitely, but that
is the whole point of the distributions - though admittedly usually
better done for key things like kernels and compilers, less so for
random applications.  The distribution tests and releases, and then
maintains for the duration of the support of the release the software
contained in the release.

So in the case of the Red Hat / Fedora ecosystem you get:

Fedora - maintain the current version and the immediate preceding
version (usually means you get about 1 year).

Red Hat - their policy is given here* but can be up to 6 years
depending on which version of Red Hat you are on.  Those on Red Hat
will get the most benefit (being able to file bugs, etc) but even
those using the free Centos version will get the security fixes and
any other backported fixes given that Centos mirrors Red Hat EL.

So (at least for the Linux world) all that may be needed is surveying
the various distributions to find a version that has broad support for
a reasonable time period.

The problem will come for those using Windows or MacOS, where neither
Microsoft or Apple provide Java and thus those users will either have
to:

1) accept the JRE upgrade treadmill
2) adopt the Oracle JRE (and pay if necessary for it)
3) hope that some 3rd party either provides a free or cheaper than
Oracle version based off of OpenJDK

What Cassandra does in those cases will be a discussion to be had,
perhaps in a couple of months, when the available options is clearer.
In particular the big question for many organizations will be what is
the cost of Oracle support, and if they find it reasonable they may
just pay it.

I suppose there is also the possibility of whether Cassandra (or any
other Java based open source project) will continue to support Windows
or MacOS as platforms if they become to much hassle.

It may also be worth considering getting an informal group of projects
based on Java contacting the Java maintainers at various Linux/BSD
distributions and come up with a consence going forward as how to
handle the issue of what versions of Java to support and which to
skip.

* https://access.redhat.com/articles/1299013

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Michael Burman


On 03/21/2018 04:52 PM, Josh McKenzie wrote:


This would certainly mitigate a lot of the core problems with the new
release model. Has there been any public statements of plans/intent
with regards to distros doing this?
Since the latest official LTS version is Java 8, that's the only one 
with publicly available information For RHEL, OpenJDK8 will receive 
updates until October 2020.  "A major version of OpenJDK is supported 
for a period of six years from the time that it is first introduced in 
any version of RHEL, or until the retirement date of the underlying RHEL 
platform , whichever is earlier." [1]


[1] https://access.redhat.com/articles/1299013


In terms of the burden of bugfixes and security fixes if we bundled a
JRE w/C*, cutting a patch release of C* with a new JRE distribution
would be a really low friction process (add to build, check CI, green,
done), so I don't think that would be a blocker for the concept.

And do we have someone actively monitoring CVEs for this? Would we ship 
a version of OpenJDK which ensures that it works with all the major 
distributions? Would we run tests against all the major distributions 
for each of the OpenJDK version we would ship after each CVE with each 
Cassandra version? Who compiles the OpenJDK distribution we would create 
(which wouldn't be the official one if we need to maintain support for 
each distribution we support) ? What if one build doesn't work for one 
distro? Would we not update that CVE? OpenJDK builds that are in the 
distros are not necessarily the pure ones from the upstream, they might 
include patches that provide better support for the distribution - or 
even fix bugs that are not yet in the upstream version.


I guess we also need the Windows versions, maybe the PowerPC & ARM 
versions also at some point. I'm not sure if we plan to support J9 or 
other JVMs at some point.


We would also need to create CVE reports after each Java CVE for 
Cassandra as well I would assume since it would affect us separately 
(and updating only the Java wouldn't help).


To me this sounds like an understatement of the amount of work that 
would go to this. Not to mention the bad publicity if Java CVEs are not 
actually patched instantly in the Cassandra also (and then each user 
would have to validate that the shipped version actually works with 
their installation in their hardware since they won't get support for it 
from the vendors as it's unofficial package).


  - Micke

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Stefan Podkowinski

The idea was not about building a custom JDK and ship it along with
Cassandra, rather than using the new modular run-time images feature [0]
introduced in Java 9. See also the link posted by Jason [1] for an
practical introduction.

[0] http://openjdk.java.net/jeps/220
[1]
https://steveperkins.com/using-java-9-modularization-to-ship-zero-dependency-native-apps/


On 21.03.18 17:26, Michael Burman wrote:
> On 03/21/2018 04:52 PM, Josh McKenzie wrote:
>
>> This would certainly mitigate a lot of the core problems with the new
>> release model. Has there been any public statements of plans/intent
>> with regards to distros doing this?
> Since the latest official LTS version is Java 8, that's the only one
> with publicly available information For RHEL, OpenJDK8 will receive
> updates until October 2020.  "A major version of OpenJDK is supported
> for a period of six years from the time that it is first introduced in
> any version of RHEL, or until the retirement date of the underlying
> RHEL platform , whichever is earlier." [1]
>
> [1] https://access.redhat.com/articles/1299013
>
>> In terms of the burden of bugfixes and security fixes if we bundled a
>> JRE w/C*, cutting a patch release of C* with a new JRE distribution
>> would be a really low friction process (add to build, check CI, green,
>> done), so I don't think that would be a blocker for the concept.
>>
> And do we have someone actively monitoring CVEs for this? Would we
> ship a version of OpenJDK which ensures that it works with all the
> major distributions? Would we run tests against all the major
> distributions for each of the OpenJDK version we would ship after each
> CVE with each Cassandra version? Who compiles the OpenJDK distribution
> we would create (which wouldn't be the official one if we need to
> maintain support for each distribution we support) ? What if one build
> doesn't work for one distro? Would we not update that CVE? OpenJDK
> builds that are in the distros are not necessarily the pure ones from
> the upstream, they might include patches that provide better support
> for the distribution - or even fix bugs that are not yet in the
> upstream version.
>
> I guess we also need the Windows versions, maybe the PowerPC & ARM
> versions also at some point. I'm not sure if we plan to support J9 or
> other JVMs at some point.
>
> We would also need to create CVE reports after each Java CVE for
> Cassandra as well I would assume since it would affect us separately
> (and updating only the Java wouldn't help).
>
> To me this sounds like an understatement of the amount of work that
> would go to this. Not to mention the bad publicity if Java CVEs are
> not actually patched instantly in the Cassandra also (and then each
> user would have to validate that the shipped version actually works
> with their installation in their hardware since they won't get support
> for it from the vendors as it's unofficial package).
>
>   - Micke
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Jason Brown

If we went down this path, I can't imagine we would build OpenJDK
ourselves, but probably build a release with jlink or javapackager. I
haven't done homework on that yet, but i *think* it uses a blessed OpenJDK
release for the packaging (or perhaps whatever JDK you happen to be
compiling/building with). Thus as long as we build/release when an openJDK
rev is released, we would hypothetically be ok from a secutiry POV.

That being said, Micke's points about multiple architectures and other OSes
(Windows for sure, macOS not so sure) are a legit concern as those would
need to be separate packages, with separate CI/testing and so on :(

I'm not sure betting the farm on linux disto support is the path to
happiness, either. Not everyone uses one of the distros mentioned (RH,
ubuntu), nor does everyone use linux (sure, the vast majority is Linux/x86,
but we do support Windows deployment and macOS development).

-Jason



On Wed, Mar 21, 2018 at 9:26 AM, Michael Burman  wrote:

> On 03/21/2018 04:52 PM, Josh McKenzie wrote:
>
> This would certainly mitigate a lot of the core problems with the new
>> release model. Has there been any public statements of plans/intent
>> with regards to distros doing this?
>>
> Since the latest official LTS version is Java 8, that's the only one with
> publicly available information For RHEL, OpenJDK8 will receive updates
> until October 2020.  "A major version of OpenJDK is supported for a period
> of six years from the time that it is first introduced in any version of
> RHEL, or until the retirement date of the underlying RHEL platform ,
> whichever is earlier." [1]
>
> [1] https://access.redhat.com/articles/1299013
>
> In terms of the burden of bugfixes and security fixes if we bundled a
>> JRE w/C*, cutting a patch release of C* with a new JRE distribution
>> would be a really low friction process (add to build, check CI, green,
>> done), so I don't think that would be a blocker for the concept.
>>
>> And do we have someone actively monitoring CVEs for this? Would we ship a
> version of OpenJDK which ensures that it works with all the major
> distributions? Would we run tests against all the major distributions for
> each of the OpenJDK version we would ship after each CVE with each
> Cassandra version? Who compiles the OpenJDK distribution we would create
> (which wouldn't be the official one if we need to maintain support for each
> distribution we support) ? What if one build doesn't work for one distro?
> Would we not update that CVE? OpenJDK builds that are in the distros are
> not necessarily the pure ones from the upstream, they might include patches
> that provide better support for the distribution - or even fix bugs that
> are not yet in the upstream version.
>
> I guess we also need the Windows versions, maybe the PowerPC & ARM
> versions also at some point. I'm not sure if we plan to support J9 or other
> JVMs at some point.
>
> We would also need to create CVE reports after each Java CVE for Cassandra
> as well I would assume since it would affect us separately (and updating
> only the Java wouldn't help).
>
> To me this sounds like an understatement of the amount of work that would
> go to this. Not to mention the bad publicity if Java CVEs are not actually
> patched instantly in the Cassandra also (and then each user would have to
> validate that the shipped version actually works with their installation in
> their hardware since they won't get support for it from the vendors as it's
> unofficial package).
>
>   - Micke
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Jason Brown

ftr, I've sent a message to legal-discuss to inquire about the licensing
aspect of the OpenJDK as we've been discussing. I believe anyone can follow
the thread by subscribing to the legal-discuss@ ML, or you can wait for
updates on this thread as I get them.

On Wed, Mar 21, 2018 at 9:49 AM, Jason Brown  wrote:

> If we went down this path, I can't imagine we would build OpenJDK
> ourselves, but probably build a release with jlink or javapackager. I
> haven't done homework on that yet, but i *think* it uses a blessed OpenJDK
> release for the packaging (or perhaps whatever JDK you happen to be
> compiling/building with). Thus as long as we build/release when an openJDK
> rev is released, we would hypothetically be ok from a secutiry POV.
>
> That being said, Micke's points about multiple architectures and other
> OSes (Windows for sure, macOS not so sure) are a legit concern as those
> would need to be separate packages, with separate CI/testing and so on :(
>
> I'm not sure betting the farm on linux disto support is the path to
> happiness, either. Not everyone uses one of the distros mentioned (RH,
> ubuntu), nor does everyone use linux (sure, the vast majority is Linux/x86,
> but we do support Windows deployment and macOS development).
>
> -Jason
>
>
>
> On Wed, Mar 21, 2018 at 9:26 AM, Michael Burman 
> wrote:
>
>> On 03/21/2018 04:52 PM, Josh McKenzie wrote:
>>
>> This would certainly mitigate a lot of the core problems with the new
>>> release model. Has there been any public statements of plans/intent
>>> with regards to distros doing this?
>>>
>> Since the latest official LTS version is Java 8, that's the only one with
>> publicly available information For RHEL, OpenJDK8 will receive updates
>> until October 2020.  "A major version of OpenJDK is supported for a period
>> of six years from the time that it is first introduced in any version of
>> RHEL, or until the retirement date of the underlying RHEL platform ,
>> whichever is earlier." [1]
>>
>> [1] https://access.redhat.com/articles/1299013
>>
>> In terms of the burden of bugfixes and security fixes if we bundled a
>>> JRE w/C*, cutting a patch release of C* with a new JRE distribution
>>> would be a really low friction process (add to build, check CI, green,
>>> done), so I don't think that would be a blocker for the concept.
>>>
>>> And do we have someone actively monitoring CVEs for this? Would we ship
>> a version of OpenJDK which ensures that it works with all the major
>> distributions? Would we run tests against all the major distributions for
>> each of the OpenJDK version we would ship after each CVE with each
>> Cassandra version? Who compiles the OpenJDK distribution we would create
>> (which wouldn't be the official one if we need to maintain support for each
>> distribution we support) ? What if one build doesn't work for one distro?
>> Would we not update that CVE? OpenJDK builds that are in the distros are
>> not necessarily the pure ones from the upstream, they might include patches
>> that provide better support for the distribution - or even fix bugs that
>> are not yet in the upstream version.
>>
>> I guess we also need the Windows versions, maybe the PowerPC & ARM
>> versions also at some point. I'm not sure if we plan to support J9 or other
>> JVMs at some point.
>>
>> We would also need to create CVE reports after each Java CVE for
>> Cassandra as well I would assume since it would affect us separately (and
>> updating only the Java wouldn't help).
>>
>> To me this sounds like an understatement of the amount of work that would
>> go to this. Not to mention the bad publicity if Java CVEs are not actually
>> patched instantly in the Cassandra also (and then each user would have to
>> validate that the shipped version actually works with their installation in
>> their hardware since they won't get support for it from the vendors as it's
>> unofficial package).
>>
>>   - Micke
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>
>>
>

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Jason Brown

Well, that was quick. TL;DR Redistributing any part of the OpenJDK is
basically a no-go.

Thus, that option is off the table.

On Wed, Mar 21, 2018 at 10:46 AM, Jason Brown  wrote:

> ftr, I've sent a message to legal-discuss to inquire about the licensing
> aspect of the OpenJDK as we've been discussing. I believe anyone can follow
> the thread by subscribing to the legal-discuss@ ML, or you can wait for
> updates on this thread as I get them.
>
> On Wed, Mar 21, 2018 at 9:49 AM, Jason Brown  wrote:
>
>> If we went down this path, I can't imagine we would build OpenJDK
>> ourselves, but probably build a release with jlink or javapackager. I
>> haven't done homework on that yet, but i *think* it uses a blessed OpenJDK
>> release for the packaging (or perhaps whatever JDK you happen to be
>> compiling/building with). Thus as long as we build/release when an openJDK
>> rev is released, we would hypothetically be ok from a secutiry POV.
>>
>> That being said, Micke's points about multiple architectures and other
>> OSes (Windows for sure, macOS not so sure) are a legit concern as those
>> would need to be separate packages, with separate CI/testing and so on :(
>>
>> I'm not sure betting the farm on linux disto support is the path to
>> happiness, either. Not everyone uses one of the distros mentioned (RH,
>> ubuntu), nor does everyone use linux (sure, the vast majority is Linux/x86,
>> but we do support Windows deployment and macOS development).
>>
>> -Jason
>>
>>
>>
>> On Wed, Mar 21, 2018 at 9:26 AM, Michael Burman 
>> wrote:
>>
>>> On 03/21/2018 04:52 PM, Josh McKenzie wrote:
>>>
>>> This would certainly mitigate a lot of the core problems with the new
 release model. Has there been any public statements of plans/intent
 with regards to distros doing this?

>>> Since the latest official LTS version is Java 8, that's the only one
>>> with publicly available information For RHEL, OpenJDK8 will receive updates
>>> until October 2020.  "A major version of OpenJDK is supported for a period
>>> of six years from the time that it is first introduced in any version of
>>> RHEL, or until the retirement date of the underlying RHEL platform ,
>>> whichever is earlier." [1]
>>>
>>> [1] https://access.redhat.com/articles/1299013
>>>
>>> In terms of the burden of bugfixes and security fixes if we bundled a
 JRE w/C*, cutting a patch release of C* with a new JRE distribution
 would be a really low friction process (add to build, check CI, green,
 done), so I don't think that would be a blocker for the concept.

 And do we have someone actively monitoring CVEs for this? Would we ship
>>> a version of OpenJDK which ensures that it works with all the major
>>> distributions? Would we run tests against all the major distributions for
>>> each of the OpenJDK version we would ship after each CVE with each
>>> Cassandra version? Who compiles the OpenJDK distribution we would create
>>> (which wouldn't be the official one if we need to maintain support for each
>>> distribution we support) ? What if one build doesn't work for one distro?
>>> Would we not update that CVE? OpenJDK builds that are in the distros are
>>> not necessarily the pure ones from the upstream, they might include patches
>>> that provide better support for the distribution - or even fix bugs that
>>> are not yet in the upstream version.
>>>
>>> I guess we also need the Windows versions, maybe the PowerPC & ARM
>>> versions also at some point. I'm not sure if we plan to support J9 or other
>>> JVMs at some point.
>>>
>>> We would also need to create CVE reports after each Java CVE for
>>> Cassandra as well I would assume since it would affect us separately (and
>>> updating only the Java wouldn't help).
>>>
>>> To me this sounds like an understatement of the amount of work that
>>> would go to this. Not to mention the bad publicity if Java CVEs are not
>>> actually patched instantly in the Cassandra also (and then each user would
>>> have to validate that the shipped version actually works with their
>>> installation in their hardware since they won't get support for it from the
>>> vendors as it's unofficial package).
>>>
>>>   - Micke
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>>
>>>
>>
>

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Lerh Chuan Low

For reasons others have mentioned (nightmare to continuously update branch
and resolve merge conflicts, existing patches/big features..) it will be a
nightmare. It seems like in software projects (just basing it off personal
experience) people typically refactor if a ticket they are working on
touches the part of the code base that needs refactoring, I've not really
seen a freeze and work off technical debt before (I'll admit upfront I
don't know much).

Thinking about it, the only ones I could come up with are the same as
Sylvain had mentioned, which is start with a small subset and just do only
renaming and cleaning up comments; no logic changes. I would think some
parts of the code may take ages more before a ticket finds its way to it
(and a knowledgable enough person is involved to even guide the refactor).

So definitely, you have my (moral) support if you are going to go with it,
+1 +1 +1

On 22 March 2018 at 00:31, Eric Evans  wrote:

> On Wed, Mar 21, 2018 at 3:48 AM, Sylvain Lebresne 
> wrote:
>
> [ ... ]
>
> > - pure code renaming is one reasonably simple aspect, but quite a few
> > renaming may have user visible impact. Particularly around JMX where many
> > things are name based on their class, and to a lesser extend some of our
> > tools still use "old" naming. We can't and shouldn't ignore those impact:
> > such user visible changes should imo be documented, and we should make
> sure
> > we have a reasonably painless (and thus incremental) upgrade path. My
> hunch
> > is the latter isn't as simple as it seems.
>
> Speaking as someone who has personally been burned by this
> (repeatedly, and it's on-going), please think very carefully before
> making such changes.  I hate to think about of all the hours I wasted
> shaving this breed of yak.
>
> > On Wed, Mar 21, 2018 at 9:06 AM kurt greaves 
> wrote:
>
> [ ... ]
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Jeremiah D Jordan

+1 if you are willing to take it on.  As the person who performed the 
Table->Keyspace rename of 2.0, I say good luck!  From hindsight of doing that, 
as others suggested, I would come at this in multiple tickets.
I would suggest a simple class rename with intellij refactoring tools or 
something as the first ticket.  This is going to touch the most files at once, 
but will be mechanical, and for the most part if it compiles it was right :).
After you have done that you can take on other renaming of things with a 
smaller scope.
Also as others have said the main things to be wary of are the naming of things 
in JMX metrics.  Ideally we would keep around deprecated aliases of the old JMX 
names for a release before removing them.  The other thing is to watch out for 
class names in byte man scripts in dtest.

-Jeremiah

> On Mar 21, 2018, at 4:48 AM, Sylvain Lebresne  wrote:
> 
> I really don't think anyone has been recently against such renaming, and in
> fact, a _lot_ of renaming *has* already happen over time. The problem, as
> you carefully noted, is that it's such a big task that there is still a lot
> to do. Anyway, I've yet to see a patch renaming things to match the CQL
> naming scheme be rejected, so I'd personally encourage such submission. But
> maybe with a few caveats (already mentioned largely, so repeating here to
> signify my personal agreement with them):
> - renaming with large surface area can be painful for ongoing patches or
> even future merge. That's not a reason for not doing them, but that's imo a
> good enough reason to do things incrementally/in as-small-as-reasonable
> steps. Making sure a renaming commit only does renaming and doesn't change
> the logic is also pretty nice when you rebase such things.
> - breaking hundreds of tests is obviously not ok :)
> - pure code renaming is one reasonably simple aspect, but quite a few
> renaming may have user visible impact. Particularly around JMX where many
> things are name based on their class, and to a lesser extend some of our
> tools still use "old" naming. We can't and shouldn't ignore those impact:
> such user visible changes should imo be documented, and we should make sure
> we have a reasonably painless (and thus incremental) upgrade path. My hunch
> is the latter isn't as simple as it seems.
> 
> 
> --
> Sylvain
> 
> 
> On Wed, Mar 21, 2018 at 9:06 AM kurt greaves  wrote:
> 
>> As someone who came to the codebase post CQL but prior to thrift being
>> removed, +1 to refactor. The current mixing of terminology is a complete
>> nightmare. This would also give a good opportunity document a lot of code
>> that simply isn't documented (or incorrect). I'd say it's worth doing it in
>> multiple steps though, such as refactor of a single class at a time, then
>> followed by refactor of variable names. We've already done one pretty big
>> refactor (InetAddressAndPort) for 4.0, I don't see how a few more could
>> make it any worse (lol).
>> 
>> Row vs partition vs key vs PK is killing me
>> 
>> On 20 March 2018 at 22:04, Jon Haddad  wrote:
>> 
>>> Whenever I hop around in the codebase, one thing that always manages to
>>> slow me down is needing to understand the context of the variable names
>>> that I’m looking at.  We’ve now removed thrift the transport, but the
>>> variables, classes and comments still remain.  Personally, I’d like to go
>>> in and pay off as much technical debt as possible by refactoring the code
>>> to be as close to CQL as possible.  Rows should be rows, not partitions,
>>> I’d love to see the term column family removed forever in favor of always
>>> using tables.  That said, it’s a big task.  I did a quick refactor in a
>>> branch, simply changing the ColumnFamilyStore class to TableStore, and
>>> pushed it up to GitHub. [1]
>>> 
>>> Didn’t click on the link?  That’s ok.  The TL;DR is that it’s almost 2K
>>> LOC changed across 275 files.  I’ll note that my branch doesn’t change
>> any
>>> of the almost 1000 search results of “columnfamilystore” found in the
>>> codebase and hundreds of tests failed on my branch in CircleCI, so that
>> 2K
>>> LOC change would probably be quite a bit bigger.  There is, of course, a
>>> lot more than just renaming this one class, there’s thousands of variable
>>> names using any manner of “cf”, “cfs”, “columnfamily”, names plus
>> comments
>>> and who knows what else.  There’s lots of references in probably every
>> file
>>> that would have to get updated.
>>> 
>>> What are people’s thoughts on this?  We should be honest with ourselves
>>> and know this isn’t going to get any easier over time.  It’s only going
>> to
>>> get more confusing for new people to the project, and having to figure
>> out
>>> “what kind of row am i even looking at” is a waste of time.  There’s
>>> obviously a much bigger impact than just renaming a bunch of files,
>> there’s
>>> any number of patches and branches that would become outdated, plus
>> anyone
>>> pulling in Cassandra as a dependency would be af

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Jeff Jirsa

Please please please ping the list and ask if anyone has big commits ready to 
merge before actually committing any huge automated refactors - people who may 
be sitting on big patches will thank you if they don’t have to rebase against 
huge IntelliJ refactors . 

-- 
Jeff Jirsa


> On Mar 21, 2018, at 5:49 PM, Jeremiah D Jordan  
> wrote:
> 
> +1 if you are willing to take it on.  As the person who performed the 
> Table->Keyspace rename of 2.0, I say good luck!  From hindsight of doing 
> that, as others suggested, I would come at this in multiple tickets.
> I would suggest a simple class rename with intellij refactoring tools or 
> something as the first ticket.  This is going to touch the most files at 
> once, but will be mechanical, and for the most part if it compiles it was 
> right :).
> After you have done that you can take on other renaming of things with a 
> smaller scope.
> Also as others have said the main things to be wary of are the naming of 
> things in JMX metrics.  Ideally we would keep around deprecated aliases of 
> the old JMX names for a release before removing them.  The other thing is to 
> watch out for class names in byte man scripts in dtest.
> 
> -Jeremiah
> 
>> On Mar 21, 2018, at 4:48 AM, Sylvain Lebresne  wrote:
>> 
>> I really don't think anyone has been recently against such renaming, and in
>> fact, a _lot_ of renaming *has* already happen over time. The problem, as
>> you carefully noted, is that it's such a big task that there is still a lot
>> to do. Anyway, I've yet to see a patch renaming things to match the CQL
>> naming scheme be rejected, so I'd personally encourage such submission. But
>> maybe with a few caveats (already mentioned largely, so repeating here to
>> signify my personal agreement with them):
>> - renaming with large surface area can be painful for ongoing patches or
>> even future merge. That's not a reason for not doing them, but that's imo a
>> good enough reason to do things incrementally/in as-small-as-reasonable
>> steps. Making sure a renaming commit only does renaming and doesn't change
>> the logic is also pretty nice when you rebase such things.
>> - breaking hundreds of tests is obviously not ok :)
>> - pure code renaming is one reasonably simple aspect, but quite a few
>> renaming may have user visible impact. Particularly around JMX where many
>> things are name based on their class, and to a lesser extend some of our
>> tools still use "old" naming. We can't and shouldn't ignore those impact:
>> such user visible changes should imo be documented, and we should make sure
>> we have a reasonably painless (and thus incremental) upgrade path. My hunch
>> is the latter isn't as simple as it seems.
>> 
>> 
>> --
>> Sylvain
>> 
>> 
>>> On Wed, Mar 21, 2018 at 9:06 AM kurt greaves  wrote:
>>> 
>>> As someone who came to the codebase post CQL but prior to thrift being
>>> removed, +1 to refactor. The current mixing of terminology is a complete
>>> nightmare. This would also give a good opportunity document a lot of code
>>> that simply isn't documented (or incorrect). I'd say it's worth doing it in
>>> multiple steps though, such as refactor of a single class at a time, then
>>> followed by refactor of variable names. We've already done one pretty big
>>> refactor (InetAddressAndPort) for 4.0, I don't see how a few more could
>>> make it any worse (lol).
>>> 
>>> Row vs partition vs key vs PK is killing me
>>> 
 On 20 March 2018 at 22:04, Jon Haddad  wrote:
 
 Whenever I hop around in the codebase, one thing that always manages to
 slow me down is needing to understand the context of the variable names
 that I’m looking at.  We’ve now removed thrift the transport, but the
 variables, classes and comments still remain.  Personally, I’d like to go
 in and pay off as much technical debt as possible by refactoring the code
 to be as close to CQL as possible.  Rows should be rows, not partitions,
 I’d love to see the term column family removed forever in favor of always
 using tables.  That said, it’s a big task.  I did a quick refactor in a
 branch, simply changing the ColumnFamilyStore class to TableStore, and
 pushed it up to GitHub. [1]
 
 Didn’t click on the link?  That’s ok.  The TL;DR is that it’s almost 2K
 LOC changed across 275 files.  I’ll note that my branch doesn’t change
>>> any
 of the almost 1000 search results of “columnfamilystore” found in the
 codebase and hundreds of tests failed on my branch in CircleCI, so that
>>> 2K
 LOC change would probably be quite a bit bigger.  There is, of course, a
 lot more than just renaming this one class, there’s thousands of variable
 names using any manner of “cf”, “cfs”, “columnfamily”, names plus
>>> comments
 and who knows what else.  There’s lots of references in probably every
>>> file
 that would have to get updated.
 
 What are people’s thoughts on this?  We should be honest with ourselves
>>

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Rahul Singh

+1 - can help with sections of the code

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Mar 21, 2018, 4:25 PM -0500, Lerh Chuan Low , wrote:
> For reasons others have mentioned (nightmare to continuously update branch
> and resolve merge conflicts, existing patches/big features..) it will be a
> nightmare. It seems like in software projects (just basing it off personal
> experience) people typically refactor if a ticket they are working on
> touches the part of the code base that needs refactoring, I've not really
> seen a freeze and work off technical debt before (I'll admit upfront I
> don't know much).
>
> Thinking about it, the only ones I could come up with are the same as
> Sylvain had mentioned, which is start with a small subset and just do only
> renaming and cleaning up comments; no logic changes. I would think some
> parts of the code may take ages more before a ticket finds its way to it
> (and a knowledgable enough person is involved to even guide the refactor).
>
> So definitely, you have my (moral) support if you are going to go with it,
> +1 +1 +1
>
> On 22 March 2018 at 00:31, Eric Evans  wrote:
>
> > On Wed, Mar 21, 2018 at 3:48 AM, Sylvain Lebresne  > wrote:
> >
> > [ ... ]
> >
> > > - pure code renaming is one reasonably simple aspect, but quite a few
> > > renaming may have user visible impact. Particularly around JMX where many
> > > things are name based on their class, and to a lesser extend some of our
> > > tools still use "old" naming. We can't and shouldn't ignore those impact:
> > > such user visible changes should imo be documented, and we should make
> > sure
> > > we have a reasonably painless (and thus incremental) upgrade path. My
> > hunch
> > > is the latter isn't as simple as it seems.
> >
> > Speaking as someone who has personally been burned by this
> > (repeatedly, and it's on-going), please think very carefully before
> > making such changes. I hate to think about of all the hours I wasted
> > shaving this breed of yak.
> >
> > > On Wed, Mar 21, 2018 at 9:06 AM kurt greaves  > wrote:
> >
> > [ ... ]
> >
> > --
> > Eric Evans
> > john.eric.ev...@gmail.com
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Dinesh Joshi

Having been through a bunch of refactors in other projects, I would suggest 
doing small incremental patches isolated in related parts of the code. It would 
also be nice if you could give us a heads up on changes you're making.
Best of luck!
Dinesh 

On Tuesday, March 20, 2018, 3:04:43 PM PDT, Jon Haddad  
wrote:  
 
 Whenever I hop around in the codebase, one thing that always manages to slow 
me down is needing to understand the context of the variable names that I’m 
looking at.  We’ve now removed thrift the transport, but the variables, classes 
and comments still remain.  Personally, I’d like to go in and pay off as much 
technical debt as possible by refactoring the code to be as close to CQL as 
possible.  Rows should be rows, not partitions, I’d love to see the term column 
family removed forever in favor of always using tables.  That said, it’s a big 
task.  I did a quick refactor in a branch, simply changing the 
ColumnFamilyStore class to TableStore, and pushed it up to GitHub. [1]

Didn’t click on the link?  That’s ok.  The TL;DR is that it’s almost 2K LOC 
changed across 275 files.  I’ll note that my branch doesn’t change any of the 
almost 1000 search results of “columnfamilystore” found in the codebase and 
hundreds of tests failed on my branch in CircleCI, so that 2K LOC change would 
probably be quite a bit bigger.  There is, of course, a lot more than just 
renaming this one class, there’s thousands of variable names using any manner 
of “cf”, “cfs”, “columnfamily”, names plus comments and who knows what else.  
There’s lots of references in probably every file that would have to get 
updated.

What are people’s thoughts on this?  We should be honest with ourselves and 
know this isn’t going to get any easier over time.  It’s only going to get more 
confusing for new people to the project, and having to figure out “what kind of 
row am i even looking at” is a waste of time.  There’s obviously a much bigger 
impact than just renaming a bunch of files, there’s any number of patches and 
branches that would become outdated, plus anyone pulling in Cassandra as a 
dependency would be affected.  I don’t really have a solution for the 
disruption other than “leave it in place”, but in my mind that’s not a great 
(or even good) solution.

Anyways, enough out of me.  My concern for ergonomics and naming might be 
significantly higher than the rest of the folks working in the code, and I 
wanted to put a feeler out there before I decided to dig into this in a more 
serious manner. 

Jon

[1] 
https://github.com/apache/cassandra/compare/trunk...rustyrazorblade:refactor_column_family_store?expand=1

Re: Paying off tech debt and correctly naming things

Re: Paying off tech debt and correctly naming things

Re: Paying off tech debt and correctly naming things

Re: Paying off tech debt and correctly naming things

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: Paying off tech debt and correctly naming things

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

Re: Paying off tech debt and correctly naming things

Re: Paying off tech debt and correctly naming things

Re: Paying off tech debt and correctly naming things

Re: Paying off tech debt and correctly naming things

Re: Paying off tech debt and correctly naming things

30 matches

Site Navigation

Mail list logo

Footer information