Re: [Discuss] CASSANDRA-16999 introduction of a column in system.peers_v2

2024-02-13 Thread Sam Tunnicliffe
Late to the party I'm afraid, but I'd agree with Abe's proposal to deprecate 
the dual port approach given that CASSANDRA-10559 makes it pretty much 
redundant. Adding further yaml options like "default ssl/no ssl" feels like 
another nasty band-aid that we'll have to live with for the foreseeable future.

Also, if CASSANDRA-16999 is only going to trunk, why can't we just deprecate 
dual ports in 5.0 (as it isn't at -rc stage yet) and remove it from trunk? That 
seems preferable to shoehorning something into the new system_views.peers 
table, which isn't going to help any existing drivers anyway as none of them 
will be using it.

 
> On 12 Feb 2024, at 07:58, Štefan Miklošovič wrote:
> 
> I think that a configuration like
> 
> client_encryption_options.enabled=true
> client_encryption_options.optional=true
> native_transport_port != native_transport_port_ssl
> 
> exposes a legitimate bug that should be fixed. If we look at (1), when these
> ports are not equal, the normal port is explicitly set to be unencrypted and
> the _ssl port is meant to be encrypted. The latter is not always true,
> because in (2) we throw only if client_encryption_options' encryption_policy
> is UNENCRYPTED. We do not throw if it is OPTIONAL. If we say that the normal
> port is always unencrypted, why don't we also say that the _ssl port is
> always encrypted? This asymmetry should be fixed.
> 
> However, I think it is too late to fix anything but trunk. Adding a column 
> might break clients and fixing the logic around ports might potentially break 
> the deployments so our best shot seems to be:
> 
> 1) from 4.0 to trunk - apply a patch which informs users that a single
> port is preferable to dual ports
> 2) for trunk - apply a patch which adds a native_port_ssl column to 
> system_views.peers so the Cassandra Java Driver can connect to such a
> deployment
> 3) optionally fix the bug I described above.
> 
> I am not sure how to evaluate the severity of 3). I would like to see it from 
> 4.0 to trunk but I also understand that if it is too disruptive, we can leave 
> it just to trunk.
> 
> The problems you described are the result of us using one 
> client_encryption_options for both ssl and non-ssl ports. I would say that 
> the first problem you described, even if it looks weird, is less serious, 
> because a user knowingly uses two ports and one of them is said to be 
> unencrypted and another one encrypted. So the fact that a non-ssl port still 
> enables unencrypted traffic is somehow expected. What is not expected is that 
> _ssl port might still accept unencrypted traffic.
> 
> To sum it up for the driver, I do not think there is a nice solution for the
> dual-port deployments already out there, so it would work just for trunk.
> Actually, because native_port_ssl is missing from the system table, there
> is currently no way to use dual ports successfully: the Java driver simply
> cannot reliably connect to SSL-enabled nodes using ssl and non-ssl ports.
> 
> (1) 
> https://github.com/apache/cassandra/blob/097c1231e2466163fe3f8b36b12cdc5235eb1403/src/java/org/apache/cassandra/service/NativeTransportService.java#L94
> 
> (2) 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/config/DatabaseDescriptor.java#L905-L915
> 
> On Thu, Feb 8, 2024 at 9:58 PM Abe Ratnofsky wrote:
>> > Deprecating helps nothing for existing releases. We can’t/shouldn’t remove 
>> > the feature in existing releases.
>> 
>> The deprecation I'm proposing is intended to push people to configure their 
>> servers in a way that is more secure and maximizes compatibility with 
>> clients. Deprecating can help for existing releases - the better 
>> configuration already exists, and it's likely that users of dual-native-port 
>> optional SSL can use it. At the very least, users should be made aware of 
>> the risks of dual-native-port operation.
>> 
>> Currently, if a user specifies the following server configuration:
>> - client_encryption_options.enabled=true
>> - client_encryption_options.optional=false
>> - native_transport_port != native_transport_port_ssl
>> 
>> then the server will still handle unencrypted traffic on 
>> native_transport_port. This feels like a security risk: it would be 
>> reasonable to interpret that this configuration requires all traffic to be 
>> encrypted.
>> 
>> And if a user specifies this server configuration:
>> - client_encryption_options.enabled=true
>> - client_encryption_options.optional=true
>> - native_transport_port != native_transport_port_ssl
>> 
>> then clients can still send unencrypted traffic to 
>> native_transport_port_ssl, since the server handles optional encryption on 
>> this port. In this case, there are two ports that accept unencrypted 
>> traffic, one of which also accepts encrypted traffic.
>> 
>> In both cases:
>> - Clients configured to use SSL will discover non-SSL ports from 
>> system.peers_v2 and fail to connect to those 
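
The two dual-port cases described above can be written down as a small truth table. This is only a sketch of the behaviour as reported in this thread, not Cassandra's actual code; the function name and the port numbers 9042 and 9142 are illustrative assumptions.

```python
def dual_port_traffic(optional: bool, native_port: int = 9042, ssl_port: int = 9142):
    """Model which traffic each port accepts when
    client_encryption_options.enabled=true and the two ports differ,
    following the description in this thread (hypothetical helper)."""
    if native_port == ssl_port:
        raise ValueError("this sketch only models dual-port deployments")
    return {
        # The plain native port always serves unencrypted traffic.
        native_port: {"plaintext"},
        # The _ssl port also accepts plaintext when encryption is optional.
        ssl_port: {"tls", "plaintext"} if optional else {"tls"},
    }


for optional in (False, True):
    print(f"optional={optional}: {dual_port_traffic(optional)}")
```

In both rows the plain native port still accepts unencrypted traffic, and with optional=true the _ssl port does as well, which is the asymmetry the thread flags.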

Re: [Discuss] CASSANDRA-16999 introduction of a column in system.peers_v2

2024-02-13 Thread Brandon Williams
On Tue, Feb 13, 2024 at 6:17 AM Sam Tunnicliffe  wrote:
> Also, if CASSANDRA-16999 is only going to trunk, why can't we just deprecate 
> dual ports in 5.0 (as it isn't at -rc stage yet) and remove it from trunk? 
> That seems preferable to shoehorning something into the new 
> system_views.peers table, which isn't going to help any existing drivers 
> anyway as none of them will be using it.

I agree and I think it will be a mess having the port in 3.x, then not
in 4.0, 4.1, or 5.0, then resurrected again after that.

Kind Regards,
Brandon


Re: [Discuss] CASSANDRA-16999 introduction of a column in system.peers_v2

2024-02-13 Thread Štefan Miklošovič
Alright, so my interpretation, even more so after Sam's and Brandon's
mails, is that we should just get rid of dual ports completely in trunk and
deprecate them in 5.0.

There are already patches for the 3.x and 4.x branches of the driver, so I
had been assuming we might resurrect this feature. But if there is actually
no need for it, then complete removal in trunk is probably unavoidable.

On Tue, Feb 13, 2024 at 1:27 PM Brandon Williams  wrote:

> On Tue, Feb 13, 2024 at 6:17 AM Sam Tunnicliffe  wrote:
> > Also, if CASSANDRA-16999 is only going to trunk, why can't we just
> deprecate dual ports in 5.0 (as it isn't at -rc stage yet) and remove it
> from trunk? That seems preferable to shoehorning something into the new
> system_views.peers table, which isn't going to help any existing drivers
> anyway as none of them will be using it.
>
> I agree and I think it will be a mess having the port in 3.x, then not
> in 4.0, 4.1, or 5.0, then resurrected again after that.
>
> Kind Regards,
> Brandon
>


Re: [VOTE] Release Apache Cassandra 4.1.4

2024-02-13 Thread Štefan Miklošovič
A gentle reminder that we still have some work to do here.
Currently, we are missing at least one more binding +1.

Regards

On Fri, Feb 9, 2024 at 5:08 AM Paulo Motta  wrote:

> +1
>
> Reviewed changelog + tested docker image build (verify binary
> gpg+sha512sum) + jre11 startup.
>
> On Tue, Feb 6, 2024 at 4:03 PM Brandon Williams  wrote:
>
>> I think we lost track of this one over some concern about
>> CASSANDRA-19097, but that turned out to be a test problem.  +1
>>
>> Kind Regards,
>> Brandon
>>
>> On Fri, Jan 26, 2024 at 4:31 AM Štefan Miklošovič
>>  wrote:
>> >
>> > Proposing the test build of Cassandra 4.1.4 for release.
>> >
>> > sha1: 99d9faeef57c9cf5240d11eac9db5b283e45a4f9
>> > Git: https://github.com/apache/cassandra/tree/4.1.4-tentative
>> > Maven Artifacts:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1324/org/apache/cassandra/cassandra-all/4.1.4/
>> >
>> > The Source and Build Artifacts, and the Debian and RPM packages and
>> repositories, are available here:
>> https://dist.apache.org/repos/dist/dev/cassandra/4.1.4/
>> >
>> > The vote will be open for 72 hours (longer if needed). Everyone who has
>> tested the build is invited to vote. Votes by PMC members are considered
>> binding. A vote passes if there are at least three binding +1s and no -1's.
>> >
>> > [1]: CHANGES.txt:
>> https://github.com/apache/cassandra/blob/4.1.4-tentative/CHANGES.txt
>> > [2]: NEWS.txt:
>> https://github.com/apache/cassandra/blob/4.1.4-tentative/NEWS.txt
>>
>
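
The sha512sum part of the verification mentioned above can be sketched as follows. This is a minimal illustration, assuming the published checksum text carries the hex digest as its first whitespace-separated token; the helper names are hypothetical, and the GPG signature check is a separate step that is not covered here.

```python
import hashlib
from pathlib import Path


def sha512_hex(artifact: Path) -> str:
    """Stream the file so large artifacts do not have to fit in memory."""
    digest = hashlib.sha512()
    with artifact.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_text: str) -> bool:
    """Compare the file's digest with the first token of the checksum text."""
    tokens = checksum_text.split()
    return bool(tokens) and tokens[0].lower() == sha512_hex(artifact)
```

A caller would read the downloaded checksum file as text and pass it in together with the path of the artifact being verified.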


[Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-13 Thread Branimir Lambov
Hi All,

CASSANDRA-18753 introduces a second set of defaults (in a separate
"cassandra_latest.yaml") that enable new features of Cassandra. The
objective is two-fold: to be able to test the database in this
configuration, and to point potential users that are evaluating the
technology to an optimized set of defaults that give a clearer picture of
the expected performance of the database for a new user. The objective is
to get this configuration into 5.0 to have the extra bit of confidence that
we are not releasing (and recommending) options that have not gone through
thorough CI.

The implementation has already gone through review, but I'd like to get
people's opinion on two things:
- There are currently a number of test failures when the new options are
selected, some of which appear to be genuine problems. Is the community
okay with committing the patch before all of these are addressed? This
should prevent the introduction of new failures and make sure we don't
release before clearing the existing ones.
- I'd like to get an opinion on what's suitable wording and documentation
for the new defaults set. Currently, the patch proposes adding the
following text to the yaml (see
https://github.com/apache/cassandra/pull/2896/files):
# NOTE:
#   This file is provided in two versions:
# - cassandra.yaml: Contains configuration defaults for a "compatible"
#   configuration that operates using settings that are backwards-compatible
#   and interoperable with machines running older versions of Cassandra.
#   This version is provided to facilitate pain-free upgrades for existing
#   users of Cassandra running in production who want to gradually and
#   carefully introduce new features.
# - cassandra_latest.yaml: Contains configuration defaults that enable
#   the latest features of Cassandra, including improved functionality as
#   well as higher performance. This version is provided for new users of
#   Cassandra who want to get the most out of their cluster, and for users
#   evaluating the technology.
#   To use this version, simply copy this file over cassandra.yaml, or specify
#   it using the -Dcassandra.config system property, e.g. by running
# cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
# /NOTE
Does this sound sensible? Should we add a pointer to this defaults set
elsewhere in the documentation?

Regards,
Branimir
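
For scripted test runs, the second option in the note above (pointing -Dcassandra.config at the new file) could be assembled roughly like this. It is a sketch under assumptions: the helper name and the /opt/cassandra location are hypothetical, and the three-slash file:/// form is a choice made here, whereas the quoted note writes file:/$CASSANDRA_HOME.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def latest_config_arg(cassandra_home: str) -> str:
    """Build the JVM system property pointing at conf/cassandra_latest.yaml."""
    config = PurePosixPath(cassandra_home) / "conf" / "cassandra_latest.yaml"
    if not config.is_absolute():
        raise ValueError("CASSANDRA_HOME must be an absolute path")
    # An absolute POSIX path already starts with '/', which yields file:///...
    return "-Dcassandra.config=file://" + quote(str(config))


print(latest_config_arg("/opt/cassandra"))
```

The resulting string would be passed as an extra argument when launching cassandra, leaving the default cassandra.yaml untouched.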


Re: [Discuss] CASSANDRA-16999 introduction of a column in system.peers_v2

2024-02-13 Thread Jon Haddad
+1 to deprecating dual ports and removing in 5.0

On Tue, Feb 13, 2024 at 4:29 AM Štefan Miklošovič wrote:

> Alright, so my interpretation, even more so after Sam's and Brandon's
> mails, is that we should just get rid of dual ports completely in trunk and
> deprecate them in 5.0.
>
> There are already patches for the 3.x and 4.x branches of the driver, so I
> had been assuming we might resurrect this feature. But if there is actually
> no need for it, then complete removal in trunk is probably unavoidable.
>
> On Tue, Feb 13, 2024 at 1:27 PM Brandon Williams  wrote:
>
>> On Tue, Feb 13, 2024 at 6:17 AM Sam Tunnicliffe  wrote:
>> > Also, if CASSANDRA-16999 is only going to trunk, why can't we just
>> deprecate dual ports in 5.0 (as it isn't at -rc stage yet) and remove it
>> from trunk? That seems preferable to shoehorning something into the new
>> system_views.peers table, which isn't going to help any existing drivers
>> anyway as none of them will be using it.
>>
>> I agree and I think it will be a mess having the port in 3.x, then not
>> in 4.0, 4.1, or 5.0, then resurrected again after that.
>>
>> Kind Regards,
>> Brandon
>>
>


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-13 Thread David Capwell
> and to point potential users that are evaluating the technology to an 
> optimized set of defaults

Left this comment in the GH… is there a reason all guardrails and reliability 
(aka repair retries) configs are off by default?  They are off by default in 
the normal config for backwards compatibility reasons, but if we are defining a 
config saying what we recommend, we should enable these things by default IMO.

> There are currently a number of test failures when the new options are 
> selected, some of which appear to be genuine problems. Is the community okay 
> with committing the patch before all of these are addressed?

I was tagged on CASSANDRA-19042. The paxos repair message handling does not 
have the repair reliability improvements that 5.0 has, so it can cause repairs 
to deadlock forever (same as current 4.x repairs).  Bringing these up to par 
with the rest of repair would be very much welcome (they are also lacking 
visibility, so you need to fall back to heap dumps to see what’s going on; 
same as 4.0.x but not 4.1.x), but I doubt I have cycles to do that…. This 
refactor is not 100% trivial as it has fun subtle concurrency issues to 
address (message retries and deduping), and making sure this logic works with 
the existing repair simulation tests does require refactoring how the paxos 
cleanup state is tracked, which could have subtle consequences.
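
To illustrate why retries and deduping go together, here is a generic sketch that is not taken from Cassandra's repair code: every name in it is hypothetical, and lost acknowledgements are simulated with a simple counter. A sender that retries after a lost ack cannot tell whether the message was applied, so the receiver has to recognise duplicates by id.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Applies each message at most once, keyed by message id."""
    seen: set = field(default_factory=set)
    applied: list = field(default_factory=list)

    def handle(self, msg_id: int, payload: str) -> str:
        if msg_id not in self.seen:      # first delivery: apply the side effect
            self.seen.add(msg_id)
            self.applied.append(payload)
        return "ack"                     # duplicates are acked but not re-applied


def send_with_retries(receiver, msg_id, payload, drop_acks, max_attempts=3):
    """Retry until an ack arrives; drop_acks simulates how many acks are lost."""
    for attempt in range(1, max_attempts + 1):
        receiver.handle(msg_id, payload)
        if attempt > drop_acks:          # this ack made it back to the sender
            return attempt
        # Ack lost: the sender cannot know the message was applied, so it retries.
    raise TimeoutError(f"no ack for message {msg_id} after {max_attempts} attempts")


receiver = Receiver()
attempts = send_with_retries(receiver, msg_id=1, payload="cleanup", drop_acks=2)
print(attempts, receiver.applied)
```

Without the `seen` check, the payload would be applied once per attempt, which is the kind of subtle issue retries introduce.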

I do think this should be fixed, but should it block 5.0?  Not sure… will leave 
to others….

Should we merge the configs breaking these tests?  No…. When we have failing 
tests people do not spend the time to figure out if their logic caused a 
regression and merge, making things more unstable… so when we merge failing 
tests that leads to people merging even more failing tests...

> On Feb 13, 2024, at 8:41 AM, Branimir Lambov  wrote:
> 
> Hi All,
> 
> CASSANDRA-18753 introduces a second set of defaults (in a separate 
> "cassandra_latest.yaml") that enable new features of Cassandra. The objective 
> is two-fold: to be able to test the database in this configuration, and to 
> point potential users that are evaluating the technology to an optimized set 
> of defaults that give a clearer picture of the expected performance of the 
> database for a new user. The objective is to get this configuration into 5.0 
> to have the extra bit of confidence that we are not releasing (and 
> recommending) options that have not gone through thorough CI.
> 
> The implementation has already gone through review, but I'd like to get 
> people's opinion on two things:
> - There are currently a number of test failures when the new options are 
> selected, some of which appear to be genuine problems. Is the community okay 
> with committing the patch before all of these are addressed? This should 
> prevent the introduction of new failures and make sure we don't release 
> before clearing the existing ones.
> - I'd like to get an opinion on what's suitable wording and documentation for 
> the new defaults set. Currently, the patch proposes adding the following text 
> to the yaml (see https://github.com/apache/cassandra/pull/2896/files):
> # NOTE:
> #   This file is provided in two versions:
> # - cassandra.yaml: Contains configuration defaults for a "compatible"
> #   configuration that operates using settings that are backwards-compatible
> #   and interoperable with machines running older versions of Cassandra.
> #   This version is provided to facilitate pain-free upgrades for existing
> #   users of Cassandra running in production who want to gradually and
> #   carefully introduce new features.
> # - cassandra_latest.yaml: Contains configuration defaults that enable
> #   the latest features of Cassandra, including improved functionality as
> #   well as higher performance. This version is provided for new users of
> #   Cassandra who want to get the most out of their cluster, and for users
> #   evaluating the technology.
> #   To use this version, simply copy this file over cassandra.yaml, or specify
> #   it using the -Dcassandra.config system property, e.g. by running
> # cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
> # /NOTE
> Does this sound sensible? Should we add a pointer to this defaults set 
> elsewhere in the documentation?
> 
> Regards,
> Branimir



Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-13 Thread David Capwell
> so can cause repairs to deadlock forever

Small correction: I finished fixing the tests in CASSANDRA-19042, and we don’t 
deadlock; we time out and fail the repair if any of those messages are dropped.

> On Feb 13, 2024, at 11:04 AM, David Capwell  wrote:
> 
>> and to point potential users that are evaluating the technology to an 
>> optimized set of defaults
> 
> Left this comment in the GH… is there a reason all guardrails and reliability 
> (aka repair retries) configs are off by default?  They are off by default in 
> the normal config for backwards compatibility reasons, but if we are defining 
> a config saying what we recommend, we should enable these things by default 
> IMO.
> 
>> There are currently a number of test failures when the new options are 
>> selected, some of which appear to be genuine problems. Is the community okay 
>> with committing the patch before all of these are addressed?
> 
> I was tagged on CASSANDRA-19042. The paxos repair message handling does not 
> have the repair reliability improvements that 5.0 has, so it can cause 
> repairs to deadlock forever (same as current 4.x repairs).  Bringing these up 
> to par with the rest of repair would be very much welcome (they are also 
> lacking visibility, so you need to fall back to heap dumps to see what’s 
> going on; same as 4.0.x but not 4.1.x), but I doubt I have cycles to do 
> that…. This refactor is not 100% trivial as it has fun subtle concurrency 
> issues to address (message retries and deduping), and making sure this logic 
> works with the existing repair simulation tests does require refactoring how 
> the paxos cleanup state is tracked, which could have subtle consequences.
> 
> I do think this should be fixed, but should it block 5.0?  Not sure… will 
> leave to others….
> 
> Should we merge the configs breaking these tests?  No…. When we have failing 
> tests people do not spend the time to figure out if their logic caused a 
> regression and merge, making things more unstable… so when we merge failing 
> tests that leads to people merging even more failing tests...
> 
>> On Feb 13, 2024, at 8:41 AM, Branimir Lambov  wrote:
>> 
>> Hi All,
>> 
>> CASSANDRA-18753 introduces a second set of defaults (in a separate 
>> "cassandra_latest.yaml") that enable new features of Cassandra. The 
>> objective is two-fold: to be able to test the database in this 
>> configuration, and to point potential users that are evaluating the 
>> technology to an optimized set of defaults that give a clearer picture of 
>> the expected performance of the database for a new user. The objective is to 
>> get this configuration into 5.0 to have the extra bit of confidence that we 
>> are not releasing (and recommending) options that have not gone through 
>> thorough CI.
>> 
>> The implementation has already gone through review, but I'd like to get 
>> people's opinion on two things:
>> - There are currently a number of test failures when the new options are 
>> selected, some of which appear to be genuine problems. Is the community okay 
>> with committing the patch before all of these are addressed? This should 
>> prevent the introduction of new failures and make sure we don't release 
>> before clearing the existing ones.
>> - I'd like to get an opinion on what's suitable wording and documentation 
>> for the new defaults set. Currently, the patch proposes adding the following 
>> text to the yaml (see https://github.com/apache/cassandra/pull/2896/files):
>> # NOTE:
>> #   This file is provided in two versions:
>> # - cassandra.yaml: Contains configuration defaults for a "compatible"
>> #   configuration that operates using settings that are backwards-compatible
>> #   and interoperable with machines running older versions of Cassandra.
>> #   This version is provided to facilitate pain-free upgrades for existing
>> #   users of Cassandra running in production who want to gradually and
>> #   carefully introduce new features.
>> # - cassandra_latest.yaml: Contains configuration defaults that enable
>> #   the latest features of Cassandra, including improved functionality as
>> #   well as higher performance. This version is provided for new users of
>> #   Cassandra who want to get the most out of their cluster, and for users
>> #   evaluating the technology.
>> #   To use this version, simply copy this file over cassandra.yaml, or specify
>> #   it using the -Dcassandra.config system property, e.g. by running
>> # cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
>> # /NOTE
>> Does this sound sensible? Should we add a pointer to this defaults set 
>> elsewhere in the documentation?
>> 
>> Regards,
>> Branimir
>