Adding wiremock to test dependencies

2023-06-20 Thread Miklosovic, Stefan
Hi,

we want to introduce the WireMock library (1) into the project as a test
dependency to test CASSANDRA-16555.

In that patch (WIP here (2)), we want to test how the snitch behaves based on
what the Amazon EC2 Identity Service, version 2, returns to it. The snitch has
to call version 2 of the AWS Identity Service to get a token, which it then
uses to get the AZ of the node it is running on.

The last comment of mine in (3) elaborates on the approaches we were
considering; mocking HTTP communication / requests with WireMock seems to be
the most comfortable and straightforward solution.
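
For readers unfamiliar with the flow being mocked, here is a minimal sketch of
it in Java. It is not the code from the patch and it does not use the WireMock
API: to stay dependency-free it swaps in the JDK's built-in
com.sun.net.httpserver.HttpServer as the stub. The endpoint paths and header
names are those of the EC2 metadata service, version 2; the class and method
names, the token value and the status codes used for the failure cases are
assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these names come from the Cassandra patch.
final class MetadataStubSketch
{
    static final String TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds";
    static final String TOKEN_HEADER = "X-aws-ec2-metadata-token";
    static final String AZ_PATH = "/latest/meta-data/placement/availability-zone";

    // Starts a local stub on a free port that plays the metadata service.
    static HttpServer startStub(String token, String az) throws IOException
    {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        // Step 1: the token is only handed out for a PUT carrying a TTL header.
        server.createContext("/latest/api/token", exchange -> {
            boolean ok = "PUT".equals(exchange.getRequestMethod())
                         && exchange.getRequestHeaders().getFirst(TTL_HEADER) != null;
            respond(exchange, ok ? 200 : 400, ok ? token : "bad request");
        });
        // Step 2: the AZ is only returned when the request presents that token.
        server.createContext(AZ_PATH, exchange -> {
            boolean ok = token.equals(exchange.getRequestHeaders().getFirst(TOKEN_HEADER));
            respond(exchange, ok ? 200 : 401, ok ? az : "unauthorized");
        });
        server.start();
        return server;
    }

    private static void respond(HttpExchange exchange, int status, String body) throws IOException
    {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(status, bytes.length);
        try (OutputStream out = exchange.getResponseBody())
        {
            out.write(bytes);
        }
    }

    // What a snitch has to do: fetch a token first, then use it to ask for the AZ.
    static String fetchAvailabilityZone(String baseUrl) throws IOException, InterruptedException
    {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest tokenRequest = HttpRequest.newBuilder(URI.create(baseUrl + "/latest/api/token"))
                                              .header(TTL_HEADER, "30")
                                              .PUT(HttpRequest.BodyPublishers.noBody())
                                              .build();
        String token = send(client, tokenRequest);
        HttpRequest azRequest = HttpRequest.newBuilder(URI.create(baseUrl + AZ_PATH))
                                           .header(TOKEN_HEADER, token)
                                           .GET()
                                           .build();
        return send(client, azRequest);
    }

    private static String send(HttpClient client, HttpRequest request) throws IOException, InterruptedException
    {
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200)
            throw new IOException("Unexpected status " + response.statusCode() + " for " + request.uri());
        return response.body();
    }

    public static void main(String[] args) throws Exception
    {
        HttpServer stub = startStub("test-token", "us-east-1a");
        try
        {
            String baseUrl = "http://127.0.0.1:" + stub.getAddress().getPort();
            System.out.println(fetchAvailabilityZone(baseUrl));
        }
        finally
        {
            stub.stop(0);
        }
    }
}
```

A mocking library would replace startStub() with declarative stubs, but the
shape of the test stays the same: point the snitch at a local base URL and
assert on what it derives from the canned responses.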

WireMock is licensed under the Apache License 2.0 (4) and is well maintained.

Are people OK with us introducing this to the build?

(1) https://wiremock.org/
(2) 
https://github.com/apache/cassandra/pull/2403/files#diff-dc04778c6659040f1c00f37e97a9b1530a532d3d1e3620427bd6628d1b2ec048
(3) https://issues.apache.org/jira/browse/CASSANDRA-16555
(4) https://github.com/wiremock/wiremock/blob/master/LICENSE.txt

Regards

Re: Adding wiremock to test dependencies

2023-06-20 Thread Miklosovic, Stefan
I forgot to mention that in the future this will not be limited to testing AWS
IDSv2. We have other snitches which also call similar "services" to get
metadata about the instance a node runs on, and this communication is
currently not tested at all. I can well imagine the testing efforts being
expanded to all the other snitches as well.


From: Miklosovic, Stefan
Sent: Tuesday, June 20, 2023 13:35
To: dev@cassandra.apache.org
Subject: Adding wiremock to test dependencies

Hi,

we want to introduce wiremock library (1) into the project as a test dependency 
to test CASSANDRA-16555.

In that patch, (wip here (2)), we want to test how would such snitch behave 
based on what Amazon EC2 Identity Service of version 2 returned to that snitch. 
AWS Identity service of version 2 is necessary to call in order to get a token 
with which a snitch is going to get AZ of a node it is called from.

The last comment of mine in (3) elaborates about approaches we were considering 
and mocking http communication / requests with wiremock seems to be like the 
most comfortable and straightforward solution.

Wiremock is Apache licence 2.0 (4) and is well maintained.

Are people OK with us introducing this to the build?

(1) https://wiremock.org/
(2) 
https://github.com/apache/cassandra/pull/2403/files#diff-dc04778c6659040f1c00f37e97a9b1530a532d3d1e3620427bd6628d1b2ec048
(3) https://issues.apache.org/jira/browse/CASSANDRA-16555
(4) https://github.com/wiremock/wiremock/blob/master/LICENSE.txt

Regards


Re: Adding wiremock to test dependencies

2023-06-20 Thread Brandon Williams
I was concerned about 'jre8' being in the dependency name and looked
into which java versions were supported, and it looks like 17 is, so
we should verify but I am +1 if that is the case.

https://github.com/wiremock/wiremock/issues/1655

Kind Regards,
Brandon

On Tue, Jun 20, 2023 at 6:35 AM Miklosovic, Stefan
 wrote:
>
> Hi,
>
> we want to introduce wiremock library (1) into the project as a test 
> dependency to test CASSANDRA-16555.
>
> In that patch, (wip here (2)), we want to test how would such snitch behave 
> based on what Amazon EC2 Identity Service of version 2 returned to that 
> snitch. AWS Identity service of version 2 is necessary to call in order to 
> get a token with which a snitch is going to get AZ of a node it is called 
> from.
>
> The last comment of mine in (3) elaborates about approaches we were 
> considering and mocking http communication / requests with wiremock seems to 
> be like the most comfortable and straightforward solution.
>
> Wiremock is Apache licence 2.0 (4) and is well maintained.
>
> Are people OK with us introducing this to the build?
>
> (1) https://wiremock.org/
> (2) 
> https://github.com/apache/cassandra/pull/2403/files#diff-dc04778c6659040f1c00f37e97a9b1530a532d3d1e3620427bd6628d1b2ec048
> (3) https://issues.apache.org/jira/browse/CASSANDRA-16555
> (4) https://github.com/wiremock/wiremock/blob/master/LICENSE.txt
>
> Regards


Re: Adding wiremock to test dependencies

2023-06-20 Thread Josh McKenzie
Speaking only to the "we don't want to add a dependency on something that's 
unstable or likely to fizzle out", looks good there to me:
 • Long-term project health / activity looks robust: 
https://github.com/wiremock/wiremock/graphs/contributors
 • Pretty diverse set of contributors in the last couple of years: 
https://github.com/wiremock/wiremock/graphs/contributors?from=2020-12-29&to=2023-06-20&type=c

On Tue, Jun 20, 2023, at 9:05 AM, Brandon Williams wrote:
> I was concerned about 'jre8' being in the dependency name and looked
> into which java versions were supported, and it looks like 17 is, so
> we should verify but I am +1 if that is the case.
> 
> https://github.com/wiremock/wiremock/issues/1655
> 
> Kind Regards,
> Brandon
> 
> On Tue, Jun 20, 2023 at 6:35 AM Miklosovic, Stefan
>  wrote:
> >
> > Hi,
> >
> > we want to introduce wiremock library (1) into the project as a test 
> > dependency to test CASSANDRA-16555.
> >
> > In that patch, (wip here (2)), we want to test how would such snitch behave 
> > based on what Amazon EC2 Identity Service of version 2 returned to that 
> > snitch. AWS Identity service of version 2 is necessary to call in order to 
> > get a token with which a snitch is going to get AZ of a node it is called 
> > from.
> >
> > The last comment of mine in (3) elaborates about approaches we were 
> > considering and mocking http communication / requests with wiremock seems 
> > to be like the most comfortable and straightforward solution.
> >
> > Wiremock is Apache licence 2.0 (4) and is well maintained.
> >
> > Are people OK with us introducing this to the build?
> >
> > (1) https://wiremock.org/
> > (2) 
> > https://github.com/apache/cassandra/pull/2403/files#diff-dc04778c6659040f1c00f37e97a9b1530a532d3d1e3620427bd6628d1b2ec048
> > (3) https://issues.apache.org/jira/browse/CASSANDRA-16555
> > (4) https://github.com/wiremock/wiremock/blob/master/LICENSE.txt
> >
> > Regards
> 


Re: CASSANDRA-18554 - mTLS based client and internode authenticators

2023-06-20 Thread Jyothsna Konisa
Hi Yuki,

Sorry, I missed answering your other question in the reply above. To check
which identities are associated with a given role, one can query the table
to list the identities for that role. Also note that only the super user can
add identities to or remove them from the table. Not even read-write users
can modify the table.
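
To make the rules discussed in this thread concrete, here is a minimal
in-memory sketch in Java. It is not the Cassandra implementation; the class
and method names are hypothetical. It only models three points: identity is
the key (so each identity maps to exactly one role), a role may have many
identities, and only a super user may add or drop identities. Rejecting a
second ADD for an existing identity is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical in-memory model; not the Cassandra implementation.
final class IdentityToRoleSketch
{
    // Keyed by identity, like a table whose primary key is the identity column.
    private final Map<String, String> roleByIdentity = new HashMap<>();

    void addIdentity(boolean callerIsSuperuser, String identity, String role)
    {
        requireSuperuser(callerIsSuperuser);
        // Assumption: re-adding an existing identity is rejected, not overwritten.
        if (roleByIdentity.putIfAbsent(identity, role) != null)
            throw new IllegalStateException("Identity already exists: " + identity);
    }

    void dropIdentity(boolean callerIsSuperuser, String identity)
    {
        requireSuperuser(callerIsSuperuser);
        roleByIdentity.remove(identity);
    }

    // Returns the single role for an identity, or null if it is unknown.
    String roleOf(String identity)
    {
        return roleByIdentity.get(identity);
    }

    // Lists every identity currently mapped to the given role.
    SortedSet<String> identitiesOf(String role)
    {
        SortedSet<String> identities = new TreeSet<>();
        for (Map.Entry<String, String> entry : roleByIdentity.entrySet())
            if (entry.getValue().equals(role))
                identities.add(entry.getKey());
        return identities;
    }

    private static void requireSuperuser(boolean callerIsSuperuser)
    {
        if (!callerIsSuperuser)
            throw new SecurityException("Only a super user may modify identities");
    }
}
```

With this model, adding two identities to `readonly_user` makes
identitiesOf("readonly_user") return both, while trying to attach one of them
to a second role fails.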

Also, if others have no concerns regarding this patch, can we move forward
with the merge, or do we need a vote on this one?

Thanks,
Jyothsna Konisa.


On Mon, Jun 19, 2023 at 4:00 PM Jyothsna Konisa 
wrote:

> Hi Yuki,
> You are right regarding adding a custom validator. If one wants to
> implement a CN based validator, they can do that and configure that
> validator in Cassandra.yaml in "authenticator.parameters.
> validator_class_name".
>
> Regarding a role having multiple identities, yes a role can have multiple
> identities associated with it. For example, there can be several read_only
> users for a given cluster, so the role `readonly_user` can be associated
> with multiple identities.
>
> Regarding the uniqueness of identity, each identity should be associated
> with only one role. For example, a single identity can not be both admin
> user and a read only user.
>
> We have ensured this by carefully designing the schema of the new table
> for storing identity information by making identity as the primary key
> which guarantees that each identity is unique and the same role can have
> multiple identities.
>
> Thanks,
> Jyothsna Konisa.
>
> On Sun, Jun 18, 2023 at 5:42 PM Yuki Morishita  wrote:
>
>> Hi,
>>
>> I was discussing with users the other day regarding a similar feature.
>> They were thinking of implementing the custom Authenticator similar to
>> what MySQL offers:
>>
>> CREATE USER 'jeffrey'@'localhost'
>>   REQUIRE SUBJECT '/C=SE/ST=Stockholm/L=Stockholm/
>> O=MySQL demo client certificate/
>> CN=client/emailAddress=cli...@example.com';
>>
>> (https://dev.mysql.com/doc/refman/8.0/en/create-user.html#create-user-tls
>> )
>>
>> I think they can implement a custom Validator that validates the identity
>> (for their case, CN) associated with a role using the certificate's
>> subject, so that's great!
>>
>> Regarding new CQL syntax,
>>
>> > ADD IDENTITY 'testIdentity' TO ROLE 'testRole';
>> > DROP IDENTITY 'testIdentity';
>>
>> This means a role can have multiple identities, and each identity must
>> be unique?
>> How can users check what identities are associated with certain roles?
>>
>>
>> On Sun, Jun 18, 2023 at 12:15 AM Dinesh Joshi  wrote:
>>
>>> Folks, any feedback here?
>>>
>>> On 6/15/23 12:46, Jyothsna Konisa wrote:
>>> > Hi Everyone!
>>> >
>>> > We are adding the following CQL queries in this patch for adding and
>>> dropping identities in the new `system_auth.identity_to_role` table.
>>> >
>>> > ADD IDENTITY 'testIdentity' TO ROLE 'testRole';
>>> > DROP IDENTITY 'testIdentity';
>>> >
>>> > Please let us know if anyone has any concerns!
>>> >
>>> > Thanks,
>>> > Jyothsna Konisa.
>>> >
>>> >
>>> > On Sat, Jun 3, 2023 at 7:18 AM Derek Chen-Becker
>>> > <de...@chen-becker.org> wrote:
>>> >
>>> > Sounds great, thanks for the clarification!
>>> >
>>> > Cheers,
>>> >
>>> > Derek
>>> >
>>> > On Sat, Jun 3, 2023 at 12:48 AM Dinesh Joshi wrote:
>>> >
>>> >> On Jun 2, 2023, at 9:06 PM, Derek Chen-Becker
>>> >> <de...@chen-becker.org> wrote:
>>> >>
>>> >> This certainly looks like a nice addition to the operator's
>>> >> tools for securing cluster access. Out of curiosity, is there
>>> >> anything in this work that would *preclude* a different
>>> >> authentication scheme for internode at some point in the
>>> >> future? Has there ever been discussion of pluggability similar
>>> >> to the client protocol?
>>> >
>>> > This is a pluggable implementation so it's not mandatory to use
>>> > it and doesn't preclude one from using a different mechanism in
>>> > the future. We haven't explicitly discussed pluggability i.e.
>>> > part of protocol negotiation in the past for internode
>>> > connections. However, this work also does not preclude us from
>>> > implementing such changes. If we do add negotiation this could
>>> > be one of the authentication mechanisms. So it would be
>>> > complementary.
>>> >
>>> >
>>> >> Also, am I correct in understanding that this would allow for
>>> >> multiple certificates for the same identity (e.g. distinct
>>> >> cert per node)? I certainly understand the decision to keep
>>> >> things simple and have all nodes share identity from the
>>> >> perspective of operational simplicity, but I also don't want
>>> >> to get in a situation where a single compromised node would
>>> >> require an invalidati

Re: [DISCUSS] The future of CREATE INDEX

2023-06-20 Thread Caleb Rackliffe
For everyone previously following this, just created
https://issues.apache.org/jira/browse/CASSANDRA-18615 :)
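
For anyone joining late, one reading of the poll outcome quoted below (a
YAML-controlled default, plus an option that forces the index type to be
spelled out) can be sketched in Java as follows. The class, method and
parameter names are hypothetical and are not taken from the ticket; the sketch
only shows the order in which the three inputs would be consulted.

```java
import java.util.Optional;

// Hypothetical sketch of default resolution; names are not from the ticket.
final class IndexDefaultSketch
{
    /**
     * @param usingClause     index implementation named in the statement, if any
     * @param yamlDefault     implementation configured as the default on this node
     * @param requireExplicit when true, statements without USING are rejected
     */
    static String resolve(Optional<String> usingClause, String yamlDefault, boolean requireExplicit)
    {
        // An explicit USING clause always wins over any configured default.
        if (usingClause.isPresent())
            return usingClause.get();
        if (requireExplicit)
            throw new IllegalArgumentException("CREATE INDEX must name an implementation with USING");
        return yamlDefault;
    }
}
```

Because the default lives in per-node configuration here, the sketch also
makes the concern raised later in the thread visible: two nodes with different
yamlDefault values would resolve the same statement differently.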

On Fri, May 19, 2023 at 1:34 PM Caleb Rackliffe 
wrote:

> Posted on ASF Slack to see if we can get more responses, but so far the
> leaders seem to be...
>
> [POLL] Centralize existing syntax or create new syntax?
>
> 1.) CREATE INDEX ... USING ... WITH OPTIONS...
>
> (i.e. centralize)
>
> [POLL] Should there be a default? (YES/NO)
>
> Yes
>
> [POLL] What to do with the default?
>
> 3.) and 4.) i.e. YAML options to control default and requirement to
> specify a default
>
> (i.e. w/o changing default in 5.0)
>
> On Thu, May 18, 2023 at 3:33 AM Miklosovic, Stefan <
> stefan.mikloso...@netapp.com> wrote:
>
>> I don't want to hijack this thread, I just want to say that the point 4)
>> seems to be recurring. I second Caleb in saying that transactional metadata
>> would probably fix this. Because of the problem of not being sure that all
>> config is same, cluster-wide, I basically dropped the effort on CEP-24
>> because different local configurations might compromise the security.
>>
>> 
>> From: Henrik Ingo 
>> Sent: Wednesday, May 17, 2023 22:32
>> To: dev@cassandra.apache.org
>> Subject: Re: [DISCUSS] The future of CREATE INDEX
>>
>> I have read the thread but chose to reply to the top message...
>>
>> I'm coming to this with the background of having worked with MySQL, where
>> both the storage engine and index implementation had many options, and
>> often of course some index types were only available in some engines.
>>
>> I would humbly suggest:
>>
>> 1. What's up with naming anything "legacy". Calling the current index
>> type "2i" seems perfectly fine with me. From what I've heard it can work
>> great for many users?
>>
>> 2. It should be possible to always specify the index type explicitly. In
>> other words, it should be possible to CREATE CUSTOM INDEX ... USING "2i"
>> (if it isn't already)
>>
>> 2b) It should be possible to just say "SAI" or "SASIIndex", not the full
>> Java path.
>>
>> 3. It's a fair point that the "CUSTOM" word may make this sound a bit too
>> special... The simplest change IMO is to just make the CUSTOM word optional.
>>
>> 4. Benedict's point that a YAML option is per node is a good one... For
>> example, you wouldn't want some nodes to create a 2i index and other nodes
>> a SAI index for the same index. That said, how many other YAML options
>> can you think of that would create total chaos if different nodes actually
>> had different values for them? For example what if a guardrail allowed some
>> action on some nodes but not others?  Maybe what we need is a jira ticket
>> to enforce that certain sections of the config must not differ?
>>
>> 5. That said, the default index type could also be a property of the
>> keyspace
>>
>> 6. MySQL allows the DBA to determine the default engine. This seems to
>> work well. If the user doesn't care, they don't care, if they do, they use
>> the explicit syntax.
>>
>> henrik
>>
>>
>> On Wed, May 10, 2023 at 12:45 AM Caleb Rackliffe <
>> calebrackli...@gmail.com> wrote:
>> Earlier today, Mick started a thread on the future of our index creation
>> DDL on Slack:
>>
>> https://the-asf.slack.com/archives/C018YGVCHMZ/p1683527794220019
>>
>> At the moment, there are two ways to create a secondary index.
>>
>> 1.) CREATE INDEX [IF NOT EXISTS] [name] ON <table> (<column>)
>>
>> This creates an optionally named legacy 2i on the provided table and
>> column.
>>
>> ex. CREATE INDEX my_index ON ks.tbl(my_text_col)
>>
>> 2.) CREATE CUSTOM INDEX [IF NOT EXISTS] [name] ON <table> (<column>)
>> USING <class> [WITH OPTIONS = <map>]
>>
>> This creates a secondary index on the provided table and column using the
>> specified 2i implementation class and (optional) parameters.
>>
>> ex. CREATE CUSTOM INDEX my_index ON ks.tbl(my_text_col) USING
>> 'StorageAttachedIndex'
>>
>> (Note that the work on SAI added aliasing, so `StorageAttachedIndex` is
>> shorthand for the fully-qualified class name, which is also valid.)
>>
>> So what is there to discuss?
>>
>> The concern Mick raised is...
>>
>> "...just folk continuing to use CREATE INDEX  because they think CREATE
>> CUSTOM INDEX is advanced (or just don't know of it), and we leave users
>> doing 2i (when they think they are, and/or we definitely want them to be,
>> using SAI)"
>>
>> To paraphrase, we want people to use SAI once it's available where
>> possible, and the default behavior of CREATE INDEX could be at odds w/ that.
>>
>> The proposal we seem to have landed on is something like t

Re: CASSANDRA-18554 - mTLS based client and internode authenticators

2023-06-20 Thread Yuki Morishita
Hi Jyothsna,

Thanks, sorry I have additional questions regarding set up and migration:

* Initial set up

Say you are building a brand new Cassandra cluster with

authenticator:
  class_name: org.apache.cassandra.auth.MutualTlsAuthenticator
  parameters:
    validator_class_name: org.apache.cassandra.auth.SpiffeCertificateValidator

What will be the op's first step to set up the roles and identities?
Is default cassandra / cassandra super user login still required to set up
other roles and identities?
If initial cassandra super user login is required, does that mean super
users and "cassandra" superuser bypass mTLS check?

* Migration

If you are currently using PasswordAuthenticator and would like to migrate
to mTLS authentication:

I *think* you need to first use MutualTlsWithPasswordFallbackAuthenticator
so the current roles can login with their password,
and eventually the admin sets up identity and then can switch to mTLS auth.
Is this the expected way for migration?

I think thorough documentation for this new feature, including the new CQL
syntax, setup and migration, would be greatly appreciated.


On Wed, Jun 21, 2023 at 4:13 AM Jyothsna Konisa 
wrote:

> Hi Yuki,
>
> Sorry I missed answering your other question in the above reply. Regarding
> checking what identities are associated with a given role, one can make a
> query to list identities for a given role to the table. Also note that,
> addition or removal of identities from the table can only be performed by
> the super user only. Not even read-write users can perform modifications to
> the table.
>
> Also, If others have no concerns regarding this patch, can we move forward
> with the merge? or do we need voting on this one?
>
> Thanks,
> Jyothsna Konisa.
>
>
> On Mon, Jun 19, 2023 at 4:00 PM Jyothsna Konisa 
> wrote:
>
>> Hi Yuki,
>> You are right regarding adding a custom validator. If one wants to
>> implement a CN based validator, they can do that and configure that
>> validator in Cassandra.yaml in "authenticator.parameters.
>> validator_class_name".
>>
>> Regarding a role having multiple identities, yes a role can have multiple
>> identities associated with it. For example, there can be several read_only
>> users for a given cluster, so the role `readonly_user` can be associated
>> with multiple identities.
>>
>> Regarding the uniqueness of identity, each identity should be associated
>> with only one role. For example, a single identity can not be both admin
>> user and a read only user.
>>
>> We have ensured this by carefully designing the schema of the new table
>> for storing identity information by making identity as the primary key
>> which guarantees that each identity is unique and the same role can have
>> multiple identities.
>>
>> Thanks,
>> Jyothsna Konisa.
>>
>> On Sun, Jun 18, 2023 at 5:42 PM Yuki Morishita 
>> wrote:
>>
>>> Hi,
>>>
>>> I was discussing with users the other day regarding a similar feature.
>>> They were thinking of implementing the custom Authenticator similar to
>>> what MySQL offers:
>>>
>>> CREATE USER 'jeffrey'@'localhost'
>>>   REQUIRE SUBJECT '/C=SE/ST=Stockholm/L=Stockholm/
>>> O=MySQL demo client certificate/
>>> CN=client/emailAddress=cli...@example.com';
>>>
>>> (
>>> https://dev.mysql.com/doc/refman/8.0/en/create-user.html#create-user-tls
>>> )
>>>
>>> I think they can implement a custom Validator that validates the
>>> identity (for their case, CN) associated with a role using the
>>> certificate's subject, so that's great!
>>>
>>> Regarding new CQL syntax,
>>>
>>> > ADD IDENTITY 'testIdentity' TO ROLE 'testRole';
>>> > DROP IDENTITY 'testIdentity';
>>>
>>> This means a role can have multiple identities, and each identity must
>>> be unique?
>>> How can users check what identities are associated with certain roles?
>>>
>>>
>>> On Sun, Jun 18, 2023 at 12:15 AM Dinesh Joshi  wrote:
>>>
>>>> Folks, any feedback here?
>>>>
>>>> On 6/15/23 12:46, Jyothsna Konisa wrote:
>>>> > Hi Everyone!
>>>> >
>>>> > We are adding the following CQL queries in this patch for adding and
>>>> > dropping identities in the new `system_auth.identity_to_role` table.
>>>> >
>>>> > ADD IDENTITY 'testIdentity' TO ROLE 'testRole';
>>>> > DROP IDENTITY 'testIdentity';
>>>> >
>>>> > Please let us know if anyone has any concerns!
>>>> >
>>>> > Thanks,
>>>> > Jyothsna Konisa.
>>>> >
>>>> >
>>>> > On Sat, Jun 3, 2023 at 7:18 AM Derek Chen-Becker
>>>> > <de...@chen-becker.org> wrote:
>>>> >
>>>> > Sounds great, thanks for the clarification!
>>>> >
>>>> > Cheers,
>>>> >
>>>> > Derek
>>>> >
>>>> > On Sat, Jun 3, 2023 at 12:48 AM Dinesh Joshi wrote:
>>>> >
>>>> >> On Jun 2, 2023, at 9:06 PM, Derek Chen-Becker
>>>> >> <de...@chen-becker.org> wrote:
>>>> >>
>>>> >> This certainly looks like a nice addition to the operator's
>>>> >> tools for securing cluster access. Out of curiosity, is t