Re: [DISCUSS] CEP-55 Generated role names

Štefan Miklošovič Wed, 17 Sep 2025 07:28:48 -0700

On Wed, Sep 17, 2025 at 2:17 AM Joel Shepherd <[email protected]> wrote:


> Could I make a suggestion? Well, I will make a suggestion :-) , but if
> it's not useful then feel free to ignore it.
>
> Could we talk a bit about how users/operators would work with the CREATE
> ROLE features you're proposing?
>
> The use-case as I understand it is that there are organizations that have
> or are going to create large numbers of clusters (say > 3), and they would
> appreciate some automation around creating role names and credentials for
> all those clusters. The proposal is to extend the CREATE ROLE statement to
> enable the database to generate those names and credentials automatically,
> including persisting them in the database itself.
>
> One thing I'm wondering about is what kind of tooling those organizations
> are likely to be using for creating/managing all those clusters. Are they
> going to be scripting, or are they going to be using some third-party
> tooling like Terraform, CloudFormation, Puppet, etc.? If they're using
> tooling like that, which is going to be a more natural fit: making
> role/password generation available through CQL, or through Sidecar APIs, or
> ... ? I don't have an opinion at the moment so that's not a rhetorical
> question. I'd actually like to reason through what's going to work best for
> the folks who actually have to manage tons of clusters all day long.
>
> Somewhat related to that ... is there any need for role "stability" across
> clusters: e.g. I want to create a role that can access existing tables but
> not create/drop tables or keyspaces, and for my own sanity I want that role
> to have the same name on every cluster I operate. Do I have to implement a
> custom role name generator to do that, or is that common enough
> functionality that it should be supportable by the tooling I'm using to
> manage my clusters?
>

I do not think we have such a requirement for "stability". If you had this
requirement, you would not use the feature we are discussing here; you
would create the roles manually. I also do not think that having the same
name everywhere is a good idea in general. A username is security-sensitive
as well. For cases where it does not really matter, I think the best
approach is to just have them randomized. An argument like "hey, but I want
to be able to type it into the console, how am I going to memorize a random
role name?" is from my perspective not completely valid because, ideally,
both the username and the password are meant to be stored e.g. in
~/.cassandra/credentials, chmod-ed to 400. You are not supposed to type
your username and password manually anywhere.
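
To make the "randomized name plus credentials file" idea concrete, here is
a minimal Python sketch of what an operator would otherwise script by hand.
It is only an illustration under assumptions: the helper names
(generate_role_name, generate_password, write_credentials) are hypothetical
and are not part of CEP-55 or Cassandra, and the [PlainTextAuthProvider]
section layout is my assumption about what cqlsh expects in the credentials
file, so check it against your version.

```python
import os
import secrets
import string

# Hypothetical alphabet for generated role names: lowercase letters and digits.
ALPHABET = string.ascii_lowercase + string.digits


def generate_role_name(prefix="role_", length=16):
    """Return `prefix` followed by `length` random characters from ALPHABET."""
    return prefix + "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_password(length=32):
    """Return a random URL-safe password of exactly `length` characters."""
    # token_urlsafe(n) yields more than n characters, so truncating is safe.
    return secrets.token_urlsafe(length)[:length]


def write_credentials(path, username, password):
    """Write a credentials file that is owner-read-only (mode 0400) from creation.

    The section name below is an assumption about the cqlsh credentials format.
    """
    # O_EXCL refuses to overwrite an existing file; 0o400 is the "chmod 400".
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)
    with os.fdopen(fd, "w") as f:
        f.write("[PlainTextAuthProvider]\n")
        f.write(f"username = {username}\n")
        f.write(f"password = {password}\n")
```

With something like this, nobody types or memorizes the name: the client
reads both values from the file, which is the workflow I am describing.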

Of course there might be core technical users whose name is the same all
the time, but there are additional layers of checks to ensure that such a
role is able to log in etc. I do not want to go into too much detail here.

I do not see why we should have a ton of logic / functionality outside of
Cassandra for doing basic things. I think that Cassandra is notorious for
its "do it yourself" approach and I think _that_ is the primary impediment
to broader adoption, not whether we dare to introduce CREATE GENERATED ROLE.
The focus on usability is completely missing. For a lot of things you want,
you have to have "tooling" which you need to take care of and so on. People
are sick of it. They just want to do the thing in the most efficient and
time-saving manner.

When I was introduced to this community for the first time, like 2015-16
maybe, I remember that there was somebody on the mailing list complaining
that "repair should be automatic", "that should be provided", "this should
be natively in". People have been seeing this for years. It took just 9
years to finally introduce automatic repairs. Thank god for the repair
people finally doing that. They are worth their weight in gold. But the
response at the time was "well, if you need it you need to write it
yourself, there is no 'one size fits all', you need to take care of that
yourself". Just imagine that. This was a genuinely meant response. How are
we going to make this popular if everything beyond the trivial is left to
the end user to figure out? Who in their right mind is going to put up with
that? People just want to turn on the thing and not think too much about it
anymore.


> I don't have strong opinions on CQL vs Sidecar, but I think one way to
> frame the debate is to look at which will work best with the tooling that
> people already use to manage large numbers of clusters.
>
> Thanks -- Joel.
> On 9/16/2025 3:15 PM, Štefan Miklošovič wrote:
>
>
> Oh crap, what feedback! If nothing else, this teaches everybody a lesson:
> if you are tired of waiting or impatient and want to move quickly, the
> surest way to get fast feedback is to just propose your ideas, then boldly
> proclaim that you are going to do something, and the universe will
> mysteriously take care of finding somebody who will reject it. Because
> people are not always interested in agreeing. A lot of the time, they take
> action only when they disagree and are put in front of it. So don't be
> afraid to take some flak as soon as possible!
>
>
>
> On Tue, Sep 16, 2025 at 9:05 PM Patrick McFadin <[email protected]>
> wrote:
>
>> Thanks Mick, I'm just digging into this more after a long week of travel.
>>
>> Generally, I'm -1 for adding more custom syntax. Another concern of mine
>> is adding control plane actions in DDL. I understand the usefulness of a
>> feature like this in ops. It's a great idea. Here would be my counter
>> proposal:
>>
>>  - Leave the CQL as is and keep "CREATE ROLE" etc as is, and avoid making
>> changes to core Cassandra.
>>
>
> Why should we keep it "as is"? Genuinely asking. Why? Where is this need
> for conserving stuff coming from? Is this what we are doing here? Adding as
> little as possible? I think we are stifling innovation unnecessarily. There
> was the same discussion about constraints and CHECK NOT NULL / NOT NULL
> where we were trying to follow "the Holy Postgres Grail". I just don't get
> it. Are we not obsessed with that at this point? Literally nobody cares if
> there will be CREATE GENERATED ROLE. Nobody. Cares. So I do not take this
> point of yours as valid without some strong backing from your side.
>
>
>>  - Move the generation & policy to the sidecar project. A sidecar
>> endpoint will generate the role name/password, enforce
>> prefix/suffix/length requirements, ensure uniqueness, and then return the
>> role and password (or a secret handle) to the caller.
>>
>
> Well the problem I see in putting this to Sidecar is that this would be
> only possible to do via HTTP(S). Not everybody is interested in it. Hardly.
> Zero interest. Sidecar is 0.2.0 at this point. I think that realistically
> speaking I am not far from the truth at all if I say that there is
> practically nobody who is using 0.2.0 in production. 0.2.0. I do not count
> exceptions such as early adopters or Analytics.
>
> Putting this to Sidecar almost guarantees nobody is going to use this
> particular functionality. People have their own control planes, their own
> way of generating this stuff and they are not going to deploy Sidecar just
> because they want to delegate this task to it. Come on. I think that it
> would, paradoxically, create more problems for them. Not less. So again, I
> do not take this point as something which is solving anything. This will
> have 0 users when put in Sidecar. I think it would be better if we just
> flat out refuse this instead of putting that to Sidecar. It is even worse
> imho.
>
> Another problem with Sidecar I see is that the current implementation is
> pluggable. How do you want to make this pluggable in Sidecar? Pluggable
> how? People might have their own opinion on how role names should be
> generated. That is why you can just code your own generator / validator,
> put it on the class path and be done with it. How are you supposed to
> "patch Sidecar"? You create a custom implementation, then you put it on the
> class path of Sidecar? Is this even supported? I think that you have
> proposed it in good faith but I don't think that would fly.
>
>
>> Why?
>>  - End users will have it faster since it will work with any version of
>> Cassandra supporting the CREATE syntax. (No having to backport either)
>>  - Keeps control plane actions optional and separated. Not an attack
>> surface inside core Cassandra
>>
>
> Thirdly, what _attack surface_? I think you are pretty aware of the fact
> that this feature is by default turned off. If you have an organisation
> deploying hundreds of clusters and for each they have to figure out some
> role name for a user which is going to use it, how is this going to be
> abused concretely? There are dedicated accounts for CQL management,
> creation of a role is tied to some workflow etc. What is attacked exactly
> and how? Concrete examples please.
>
> Dineshi had the concern that "what if we just have a script which will
> generate roles repeatedly nonstop?" How is this different from having a
> script which would generate roles itself instead of Cassandra and it would
> execute that? What's the difference really? If you want to abuse it you
> will. There is no protection against that unless we put some rate limiting
> in front of it - which I have no problem addressing in follow-up work,
> as already explained.
>
>
>>  - We keep the syntax of CQL more generic and less one-off.
>>
>
> I don't think this is relevant, really. I think we should abandon this
> mindset. At this point, to make the point, I suspect that CQL had to "hurt
> you" somehow :)
>
> Regards
>
>
>>  - k8s/Cloud native friendly with separation of control plane/data plane.
>>
>> Patrick
>>
>>
>> On Tue, Sep 16, 2025 at 7:31 AM Mick <[email protected]> wrote:
>>
>>>
>>>
>>>
>>> > I think enough time passed for everybody to participate in the
>>> discussion so I would just move on and start the voting thread soon.
>>>
>>>
>>>
>>> Can we give CEP discussions longer than ~one week, please.
>>>
>>> Folk are easily away/offline for a whole week.  Take for example many
>>> who were at Community over Code and may still be catching up on their
>>> inbox, thinking dev@ is a less urgent folder.
>>>
>>> I haven't looked at how fast the other CEP discuss threads have turned
>>> around, I apologise if I'm only singling one out, my concern applies
>>> generally.
>>>
>>>
