Hi Josh - Thanks for all the feedback: appreciate it. Responses to
specific points interwoven below ...
On 7/9/2025 3:25 AM, Josh McKenzie wrote:
Sorry for the delay in getting to this Joel. This is great work -
really well thought out and put together. I'm a +1 on it; had the
following observations or questions reviewing the CEP:
Rather than going with a new discretely allowed option to a closed
list of allowable options in STARTUP, what if we went the route we did
with SUPPORTED and offered a variable [string multimap] so we could
add future STARTUP options w/out fully revving the protocol spec in
the future?
(https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v5.spec#L462-L491)
If I'm understanding, you're suggesting making a one-time protocol
change by adding something like:
STARTUP: {
"CQL_VERSION": "3.11",
<other existing options>
"OPTIONS": <multimap of option name => value>
}
... so that in the future new options could just be keys in the
multimap, which would not be considered a protocol change? Or placing
all the options -- current and future -- into the multimap?
I don't have a strong opinion on this. My weak opinion is that --
especially if the intent is to put all options in the multimap -- it's
either going to require careful client-node coordination to roll out the
protocol change, or it's going to require the node to expect and handle
both flavors of the STARTUP message structure. Doable but I'm not sure
that part of the protocol changes so frequently that it's worth the
effort. I'm not going to argue if you or others have a different and
stronger opinion on it.
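
A minimal sketch of what "handle both flavors" could look like on the node side, assuming the nested "OPTIONS" multimap from the example above. The function name and the "AUTHENTICATORS" key are hypothetical and used only for illustration; nothing here is taken from the protocol spec or the CEP.

```python
def normalize_startup(body):
    """Return a flat {option: [values]} view of a STARTUP body.

    Accepts the existing flat string map as well as the hypothetical
    nested "OPTIONS" multimap discussed in this thread.
    """
    options = {}
    for key, value in body.items():
        if key == "OPTIONS":
            continue  # handled below
        options[key] = [value]
    for key, values in body.get("OPTIONS", {}).items():
        # Assumption: a top-level entry wins over a nested one, so
        # existing clients keep their current meaning.
        options.setdefault(key, list(values))
    return options


legacy = {"CQL_VERSION": "3.0.0"}
nested = {"CQL_VERSION": "3.0.0",
          "OPTIONS": {"AUTHENTICATORS": ["mtls", "password"]}}
print(normalize_startup(legacy))
print(normalize_startup(nested))
```

Either shape ends up in the same flat view, which is roughly the extra work a node would take on if both structures had to be accepted during a rollout.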
Any opinion on us deprecating the existing
cassandra.yaml#authenticator: option to be renamed as
"fallback_authenticator", and updating the text to reflect that? Or
/default_authenticator/; something to denote through the naming that
the role of that is secondary to the newly added
/authenticator_negotiation/? I'm 50/50 on this; I don't love
deprecation (knowing we're going to keep supporting the old
nomenclature into perpetuity), but I do think the topic of auth is
important enough that a little config disruption and maintenance here
to steer new users to the new, more secure method of auth might
warrant the change.
I hadn't thought about that but I like the idea. I was striving for
maximum backwards compatibility, but you're right that the existing key
could continue to be supported, just not publicized. Going forward,
something like "default_authenticator" will be less confusing than just
"authenticator". I'll add it to the doc.
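
A rough sketch of how the old key could stay supported but unadvertised, assuming the new key ends up being called "default_authenticator". The function name, the plain-dict config, and the choice to reject configs that set both keys are all assumptions for illustration, not part of the CEP.

```python
import warnings


def resolve_default_authenticator(config):
    """Prefer the new key; accept the legacy key with a deprecation warning."""
    new_key, old_key = "default_authenticator", "authenticator"
    if new_key in config and old_key in config:
        # Assumption: refusing an ambiguous config is safer than guessing.
        raise ValueError(f"set only one of {new_key!r} and {old_key!r}")
    if new_key in config:
        return config[new_key]
    if old_key in config:
        warnings.warn(f"{old_key!r} is deprecated; use {new_key!r}",
                      DeprecationWarning)
        return config[old_key]
    # Neither key set: fall back to no authentication.
    return "AllowAllAuthenticator"


print(resolve_default_authenticator(
    {"default_authenticator": "PasswordAuthenticator"}))
```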
`‘requireAuthentication’ should be set to true as soon as all clients
are using other authenticators.`
A hot prop or mutable vtable (... did we ever do mutable / change
config through vtables? Hm, looks like not yet:
https://issues.apache.org/jira/browse/CASSANDRA-15254) so we could
change this live on a cluster w/out bouncing would be nice. Be nicer
if that change also coincided w/a change in config via 1 UX through
the DB. ;) But that's a /different/ problem than what you're looking
to address so worth deferring on that piece probably.
Agree it'd be great to be able to enable authn and negotiation without
having to bounce nodes. Will add.
re: thundering herd risk - this tickled my memory. <snip> We have a
separate executor specifically for auth messages in Dispatcher.java;
might be worth keeping an eye on this to see if it proves to be a new
bottleneck w/a more heavyweight negotiation on connection.
Ah, thanks for pointing that out. I wouldn't expect STARTUP (which is
where negotiation will actually be resolved server-side) to be
significantly more expensive than it is today, but will keep an eye on
that. A second potential bottleneck might lie in the increased overhead
of the required OPTIONS/SUPPORTED handshake, which I believe today is
optional: the client can and often does initiate connection with STARTUP
alone. Again, while I wouldn't expect those to be very expensive to
handle, offloading them from the requestExecutor might protect the
workers serving the actual data plane traffic.
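
To make the offloading idea concrete, here is a small sketch of routing handshake messages to their own pool. The class, the pool sizes, and the set of opcodes treated as handshake traffic are assumptions for illustration; this is not how Dispatcher.java is structured today.

```python
from concurrent.futures import ThreadPoolExecutor

# Messages exchanged before a connection is ready for queries. Which
# opcodes belong here is an assumption made for this sketch.
HANDSHAKE_OPCODES = {"OPTIONS", "STARTUP", "AUTH_RESPONSE"}


class SketchDispatcher:
    """Hypothetical dispatcher; not the real Dispatcher.java."""

    def __init__(self, request_workers=4, handshake_workers=1):
        self.request_executor = ThreadPoolExecutor(
            max_workers=request_workers, thread_name_prefix="request")
        self.handshake_executor = ThreadPoolExecutor(
            max_workers=handshake_workers, thread_name_prefix="handshake")

    def executor_for(self, opcode):
        """Pick the pool so handshake bursts cannot starve query workers."""
        if opcode in HANDSHAKE_OPCODES:
            return self.handshake_executor
        return self.request_executor

    def dispatch(self, opcode, handler):
        """Submit handler() to the pool chosen for this opcode."""
        return self.executor_for(opcode).submit(handler)

    def shutdown(self):
        self.request_executor.shutdown()
        self.handshake_executor.shutdown()
```

With this split, a reconnect storm queues up behind the handshake pool while the request pool keeps serving data plane traffic.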
The JIRA link in the CEP just links to the ASF C* JIRA - I couldn't
find a ticket for this CEP yet in JIRA. Generally we use that to link
to a specific ticket; was a minor speed bump just FYI.
I was deferring creating a JIRA until if/when the CEP was adopted, but
I'll create one today. Can always close it later.
And we have some prior art in the following:
- CASSANDRA-13048: Support SASL mechanism negotiation in existing
Authenticators
- CASSANDRA-11471: Add SASL mechanism negotiation to the native protocol
Might be good to link to those and once we get a JIRA up for CEP-50,
we can flag those 2 as duplicates of it and close them out once it's done.
Great: thanks for the pointers.
One detail where CEP-50 varies a bit from the SASL RFC's description of
negotiation (https://datatracker.ietf.org/doc/html/rfc4422#section-3.2)
is that the RFC suggests the server send the client a list of
authenticators to choose from, and the client responds with its chosen
authenticator. The RFC doesn't say that the protocol "MUST" or even
"SHOULD" operate this way, just that implementations "commonly" do. In
early drafts, I actually described the exchange as working this way, but
eventually backed away from it because it seems to create the risk of a
malicious or confused client picking the least secure option offered by
the server: the server loses some control. The current proposal is that
the client sends the server a list of auth mechanisms that the client
can support, and the server makes the final decision about which to use,
which discourages the client from choosing the weakest usable option.
(It doesn't outright prevent it, however.) I believe this is still
SASL-compliant because SASL doesn't mandate a particular exchange, but
did want to call it out. I'll fold this into the CEP as well.
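
A minimal sketch of the server-side choice described above, assuming the node keeps an ordered preference list (strongest first) and the client sends an unordered list of what it supports. The function name and the short mechanism names are made up for illustration.

```python
def choose_authenticator(server_preference, client_supported):
    """Return the server's most-preferred mechanism the client also supports.

    server_preference is ordered strongest first; the order of
    client_supported is deliberately ignored. Returns None when there
    is no mechanism in common.
    """
    offered = set(client_supported)
    for name in server_preference:
        if name in offered:
            return name
    return None


# The client lists "password" first, but the server's order decides.
print(choose_authenticator(["mtls", "password"], ["password", "mtls"]))
# A client that only advertises "password" can still end up on it.
print(choose_authenticator(["mtls", "password"], ["password"]))
```

The second call shows why this discourages the weakest option without preventing it: a client can still narrow the outcome by advertising less than it supports.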
Thanks! -- Joel.
Overall - looks great. Again: +1 from me.
On Tue, Jul 8, 2025, at 8:34 PM, Joel Shepherd wrote:
Hi Doug - That's an interesting suggestion for a load test: I'll
include something like that in our plans.
You're right about the logic in RoleManager: it should be doing the
right thing with MutualTlsWithPasswordFallbackAuthenticator.
Thanks! -- Joel.
On 7/8/2025 1:52 PM, Doug Rohrer wrote:
+1 from me (I think committer +1s are "binding" on CEPs given our
previous "how we do things" conversation, but either way, +1)...
One interesting perf test to think about would be the difference
between negotiated auth with MutualTlsAuthenticator and
PasswordAuthenticator and the combined
MutualTlsWithPasswordFallbackAuthenticator, as I think it'll provide
a pretty good indication of the (hopefully negligible) performance
difference when negotiating.
Also, because I was curious about your last comment...
MutualTlsWithPasswordFallbackAuthenticator _derives_ from
PasswordAuthenticator, so *instanceof PasswordAuthenticator* checks
return *true* for instances of it, and therefore (assuming you're
talking about supportedOptions/alterableOptions in
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L167-L172),
it should work fine. Not that that helps _you_ solve your problem,
but at least the existing classes should work.
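
The same point in a few lines, using Python's isinstance as a stand-in for Java's instanceof. The classes below are empty placeholders that only mirror the inheritance relationship described above, and the helper name is hypothetical.

```python
class PasswordAuthenticator:
    """Placeholder for the real class; only the hierarchy matters here."""


class MutualTlsWithPasswordFallbackAuthenticator(PasswordAuthenticator):
    """Derives from PasswordAuthenticator, as in the real code base."""


class MutualTlsAuthenticator:
    """Unrelated to PasswordAuthenticator in this sketch."""


def uses_password_options(authenticator):
    # A subclass instance passes the same check as its parent class.
    return isinstance(authenticator, PasswordAuthenticator)


print(uses_password_options(MutualTlsWithPasswordFallbackAuthenticator()))
print(uses_password_options(MutualTlsAuthenticator()))
```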
Thanks for putting the CEP together and working on the implementation!
Doug
On Jul 3, 2025, at 2:38 PM, Joel Shepherd <sheph...@amazon.com> wrote:
Thanks Andy - My hope/expectation is this would significantly
reduce the amount of friction involved when either implementing or
migrating to a new authenticator. I suspect it will benefit more
complex environments too, when a single authenticator isn't ideal
for all clients.
At this point, the stickiest bits I've run into have involved logic
that switches on "the" authenticator class, because of the
hardcoded dependency on a specific authn implementation, and
working out how to sanely extend the logic once it's possible for a
single node to be using different authenticators for different
client sessions. Solvable but will take some refactoring and might
also generate debate about what the right behaviors are in that
scenario. An example is the RoleManager which currently creates
additional role attributes if the Password Authenticator is in use
... which, now that you mention it, I wonder if that's already
broken for the MutualTlsWithPasswordFallbackAuthenticator. Hmm.
Anyway, thanks for the feedback -- Joel.
On 7/3/2025 7:52 AM, Tolbert, Andy wrote:
Hi Joel,
+1 (nb), I think this is a really good idea and a well-fleshed-out CEP!
The capability to allow the server to support multiple
authenticators would be very useful. CEP-34 added a
'MutualTlsWithPasswordFallbackAuthenticator' for simultaneously
supporting both mTLS and Password authentication, primarily as a
means to introduce mTLS auth without breaking password auth and
also a possible gradual migration to mTLS, but this only works for
combining these two particular authenticators, and also creates
another authenticator to support.
The migration to any new auth strategy would likely involve the
need to simultaneously support an existing and new auth provider.
I think the approach you describe is well thought out and should
meet this need assuming users use a driver that supports it.
In terms of the protocol, utilizing the capabilities of the
existing OPTIONS, STARTUP and SUPPORTED messages to communicate
what authenticators are supported/should be used is pretty clever
as it shouldn't require a protocol version uprev, and hopefully
wouldn't be too complicated for a driver to implement.
Thanks,
Andy
On Mon, Jun 30, 2025 at 11:44 AM Joel Shepherd
<sheph...@amazon.com> wrote:
Erm ... and here's the CEP:
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-50%3A+Authentication+Negotiation
(Thanks for the heads up, Abe ...)
-- Joel.
On 6/30/2025 9:37 AM, Joel Shepherd wrote:
> Hello - We would like to propose CEP-50: Authentication
> Negotiation for adoption by the community: <link> .
>
> This CEP proposes minor changes to the initial handshake protocol
> (OPTIONS, SUPPORTED and STARTUP messages) to enable a client to inform
> the node of the authenticators supported by the client, and changes in
> the node's authentication-related areas to enable it to pick its
> preferred authenticator for each client connection. The CEP
> explains why this approach is proposed, instead of implementing a
> "negotiating authenticator".
>
> Authentication negotiation will make it easier and safer for
> administrators to migrate clusters to stronger authentication
> mechanisms (including switching on authentication for a cluster that
> has been using "allow-all" authentication) without downtime, and to
> support environments where different clients prefer different
> authentication mechanisms (e.g., username and password for ad-hoc
> cqlsh access, mutual TLS for programmatic access, etc.), without
> having to pick a single "lowest common denominator" authenticator for
> all. Additionally, the proposed changes are intended to be backwards
> compatible for both clients and nodes.
>
> Thanks in advance for your time and feedback. Please keep the
> discussion on this mailing list thread.
>
> Thanks! -- Joel.