[DISCUSS] Deprecation of IEndpointSnitch (CASSANDRA-19488)

2024-10-31 Thread Sam Tunnicliffe
Since CEP-21, the source of truth for topology info (a node's datacenter & 
rack) is ClusterMetadata. Each node provides its dc/rack when it registers 
itself with the cluster prior to joining and this information is effectively 
immutable (for now). This significantly reduces the scope of IEndpointSnitch's 
responsibilities and CASSANDRA-19488 proposes a refactoring which breaks out 
the remaining functionality into a handful of new providers (full details can 
be found in the JIRA). 

This is one of the more widely used extension points in Cassandra, so we wanted 
to bring it to the mailing list in addition to discussing on JIRA. 

To be clear, no operator intervention should be necessary when upgrading. To 
ease migration onto the new config and to allow us to deprecate snitches in a 
controlled way, it will remain fully supported to configure nodes using the 
endpoint_snitch setting in yaml. A SnitchAdapter acts as a facade in this case, 
presenting the new interfaces to calling code while delegating to the legacy 
snitch. Most of the in-tree snitches have been refactored to extract 
implementations of the new interfaces so that their functionality can be used 
via the new configuration.

Some questions for the list:

* We have added 2 new methods to IEndpointSnitch, which have essentially been 
pulled up from Ec2MultiRegionSnitch and GossipingPropertyFileSnitch to support 
ReconnectableSnitchHelper. Currently, these are added as default methods on the 
interface so that out-of-tree snitches remain binary compatible. However, it 
would be safer to break binary compatibility in this case to ensure that any 
custom snitches out in the wild must be updated and their behaviour is 
preserved. So the question is, would there be objections to extending the (now 
deprecated) IEndpointSnitch interface in this way?

* Python dtests and config are currently unchanged (aside from some error 
message checks) so these are exercising the path whereby the clusters are 
configured with endpoint_snitch and make use of the compatibility adapter. 
In-jvm upgrade dtests switch from old to new style configuration on upgrade to 
5.1 (though in truth, these don't exercise snitches much at all as a special 
dtest snitch is used throughout). cassandra-latest.yaml contains the new 
settings, while cassandra.yaml and the variations in test/conf retain the old 
style settings. How should we approach updating these configs so that we 
maintain a balance between test coverage, compatibility during upgrades and 
encouraging the use of new style config in fresh clusters?



Re: [DISCUSS] Usage of "var" instead of types in the code

2024-10-31 Thread Berenguer Blasi
Nuke it out of orbit. -1 everywhere imo. I don't need another level of 
indirection and one more plate to spin while connecting dots in the code.


On 30/10/24 17:47, Francisco Guerrero wrote:

Yeah, I'm also in favor of disallowing usage of var in
production code.

On 2024/10/30 16:43:06 Jacek Lewandowski wrote:

+1 for allow in tests and ban in production code.




- - -- --- -  -
Jacek Lewandowski


śr., 30 paź 2024 o 17:37 David Capwell  napisał(a):


I am fine with allow in tests ban in src

On Oct 30, 2024, at 7:16 AM, Štefan Miklošovič 
wrote:

https://issues.apache.org/jira/browse/CASSANDRA-20038

On Wed, Oct 30, 2024 at 2:40 PM Brandon Williams  wrote:


Allow in tests, forbid elsewhere is my vote and simple to implement.

Kind Regards,
Brandon

On Wed, Oct 30, 2024 at 6:46 AM Štefan Miklošovič
 wrote:

against vars:

Benjamin
Stefan
Benedict
Yifan

somewhere in the middle:

Dinesh

Caleb +1 in tests and +0 in prod code.
David - against flat out banning, limiting it just for some places

Josh prod / test on/off switch.

for:

Jon

Correct me if I am wrong and I have "miscategorized" somebody.

 From what I see, I think the outcome is that we might use this in tests

and let's just forbid it in prod code. My reasoning behind this is that we
might all learn how to live with this in tests and we may re-evaluate in
the future if its usage proved to be beneficial and we are comfortable to
have it in prod code as well (or we go to ban its usage in tests too). In
other words, if the majority just can't live without this in prod code we
can take a look at this again in the future.

Not banning it in tests does not mean that from now on we need to use

vars there. It just means that tests which will introduce the usage of vars
will not be rejected.

Does this sound reasonable to everybody?

On Wed, Oct 30, 2024 at 12:31 PM Benjamin Lerer 

wrote:

I recently faced some var usage in the CQL layer part where it was

making the code pretty hard to understand. I am in favor of prohibiting it.

Le mar. 29 oct. 2024 à 20:32, Caleb Rackliffe <

calebrackli...@gmail.com> a écrit :

Josh's example of "good" usage seems defensible, because the declared

type is already obfuscated to Collection anyway, and my eyeballs are going
to skip to the right anyway to see the instantiated type. I'm +0 on
prohibiting it in non-test code, and -1 on prohibiting it in tests.

On Tue, Oct 29, 2024 at 2:23 PM Jon Haddad 

wrote:

When I first saw var as a Java keyword, my initial reaction was one

of skepticism for the reasons already mentioned.  However, in practice,
I've been using var for years across many code bases (Java and Kotlin) and
have never had an issue with readability.  I find it to be significantly
more pleasant to work with.

Jon




On Tue, Oct 29, 2024 at 12:13 PM Štefan Miklošovič <

smikloso...@apache.org> wrote:

I get that there might be some situations when its usage does not

pose any imminent problem but the question is how we are going to enforce
that _at scale_? What is allowed and what not ... Sure we can have a code
style for that, but its existence itself does not guarantee that everybody
will adhere to that _all the time_. People are people and make mistakes,
stuff is overlooked etc.

Upon the review we will argue whether "this particular usage of var

is meaningful or not" and "if the codestyle counts with this or not" and as
many reviews we will have there will be so many outcomes. Dozens of people
contribute to the code and more than that reads it, everybody has some
opinion about that ... Sure, just use var when you prototype etc but I
don't think it is a lot to ask to merge it without vars. Come on ...

On Tue, Oct 29, 2024 at 8:08 PM Yifan Cai 

wrote:

I am in favor of disallowing the `var` keyword.

It does not provide a good readability, especially in the

environments w/o type inference, e.g. text editor or github site.

It could introduce performance degradation without being noticed.

Consider the following code for example,

Set allNames()
{
 return null;
}

boolean contains(String name)
{
 var names = allNames();
 return names.contains(name);
}

Then, allNames is refactored to return List later. The contains

method then runs slower.

List allNames()
{
 return null;
}


- Yifan

On Tue, Oct 29, 2024 at 11:53 AM Josh McKenzie <

jmcken...@apache.org> wrote:

(sorry for the double-post)

Jeremy Hanna kicked this link to a style guideline re: inference

my way. Interesting read for those that are curious:
https://openjdk.org/projects/amber/guides/lvti-style-guide

On Tue, Oct 29, 2024, at 2:47 PM, Josh McKenzie wrote:

To illustrate my position from above:

Good usage:

Collection names = new ArrayList<>();

becomes

var names = new ArrayList();


Bad usage:

Collection names = myObject.getNames();

becomes

var names = myObject.getNames();


Effectively, anything that's not clearly redundant in assignment

shouldn't use inference IMO. Thinking more de

Case-insensitivity in login workflow

2024-10-31 Thread Joel Shepherd
Hi - I've been doing some exploration in Cassandra's client 
authentication workflow and got tripped up by surprising (to me) 
behavior regarding role name case sensitivity.


I made an implementation of IAuthenticator and SaslNegotiator. After 
evaluateResponse(), when the client responds successfully to the node's 
auth challenge, getAuthenticatedUser() returns the user name in the 
letter-case provided by my authentication provider (external to 
Cassandra), which actually performs the authentication. I.e., if my 
authn provider successfully auths the caller, it returns a corresponding 
user name/identity like 'Susie' (without the quotes); now my 
SaslNegotiator will return 'Susie' (without quotes) when 
getAuthenticatedUser() is called.


My understanding is that unquoted user/role names in Cassandra are 
case-insensitive and are forced to lowercase when stored.


However, when Cassandra's AuthUtil.handleLogin() goes to login the 
authn'd user, it will fail to find any roles for Susie because the role 
name as stored (susie) doesn't match the rolename returned by 
getAuthenticatedUser() (Susie): it makes a case-sensitive comparison.


I was surprised by this and I'm wondering, during the authentication and 
login process, where does responsibility for appropriately casing the 
authn'd user/role name live? Is that intended to be with the node 
authenticator (i.e., I need to force Susie to susie for 
getAuthenticatedUser()), or should it be happening outside the 
authenticator and in the central Cassandra login implementation? Or is 
my understanding wrong?


Thanks! -- Joel.