Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread guo Maxwell
I think replication_factor (or replication) is important😄. These concepts
lead directly to the concept of read and write consistency (i.e.
ONE/QUORUM/ALL and so on), which users need to care about.
And the consistency level is very important to Cassandra, in my opinion.

Our experience is that many users do not initially care about replication
factor and consistency level; later this incurs a lot of explanation
costs, and users also come to feel that the database is not easy to use.

So why not educate users well from the beginning? There are not many
of these concepts. Some databases tell users from the beginning that
they are cloud-native and that they separate storage and compute. I think the
replication factor should be easier to understand than these.
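
To make the link between replication factor and consistency level concrete, here is a minimal Python sketch of the usual quorum arithmetic: a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap the latest acknowledged write when the replicas written plus the replicas read exceed RF. The function names are illustrative only and are not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a quorum for a given replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return write_replicas + read_replicas > rf


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # QUORUM writes combined with QUORUM reads overlap on at least one replica.
    print(q, reads_see_latest_write(rf, q, q))
    # ONE writes combined with ONE reads give no such guarantee at RF = 3.
    print(reads_see_latest_write(rf, 1, 1))
```

With RF = 1 every level collapses to a single replica, which is one reason single-node users may never notice these concepts.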


On Thu, Apr 6, 2023 at 14:37, Jacek Lewandowski wrote:

> Haha... we have opinions against each name :)
>
> According to what Caleb said, I don't think all new users start learning
> Cassandra by understanding replication.
> There are probably many small projects where Cassandra is used on a single
> node, or bigger projects where people try different things to make some
> PoC. Understanding the internals and architecture of Cassandra is not
> crucial - they want to start writing queries as soon as possible, and the
> less prior knowledge is required to do that, the better.
>
> That being said, we should maybe even go further and assume some default
> replication config, like simple 1, so that
> creating a namespace boils down to a simple CREATE
> KEYSPACE|SCHEMA|DATABASE|NAMESPACE ;
>
> thanks,
> - - -- --- -  -
> Jacek Lewandowski
>
>
> On Thu, 6 Apr 2023 at 04:09, guo Maxwell wrote:
>
>> Either KEYSPACE, DATABASE, or SCHEMA is better than NAMESPACE.
>> NAMESPACE is always used in HBase, which is a table store in my mind.
>> For existing users, NAMESPACE may take some time to be accepted. HBase
>> and Cassandra users may also mix up the corresponding terms.
>> By the terminology standard of databases, DATABASE or SCHEMA may be
>> better; by the terminology standard of the NoSQL database (Cassandra),
>> KEYSPACE is better.
>>
>>
>> On Thu, Apr 6, 2023 at 07:09, Caleb Rackliffe wrote:
>>
>>> KEYSPACE isn’t a terrible name for a namespace that also configures how
>>> keys are replicated. NAMESPACE is accurate but not comprehensive. DATABASE
>>> doesn’t seem to have the advantages of either.
>>>
>>> I’m neutral on NAMESPACE and slightly -1 on DATABASE. It’s hard for me
>>> to believe KEYSPACE is really a stumbling block for new users, especially
>>> when it connotes something those users should understand about them (the
>>> replication configuration).
>>>
>>> On Apr 5, 2023, at 4:16 AM, Aleksey Yeshchenko 
>>> wrote:
>>>
>>> FYI we support SCHEMA as an alias to KEYSPACE today (have since
>>> always). Can use CREATE SCHEMA in place of CREATE KEYSPACE, etc.
>>>
>>> On 4 Apr 2023, at 19:23, Henrik Ingo  wrote:
>>>
>>> I find the Postgres terminology overly complex. Where most SQL databases
>>> will have several *databases*, each containing several *tables*, in
>>> Postgres we have namespaces, databases, schemas and tables...
>>>
>>> Oracle seems to also use the words database, schema and tables. I don't
>>> know if it has namespaces.
>>>
>>> Ah, ok, so SQL Server actually is like Oracle too!
>>>
>>>
>>> So in MySQL, referring unambiguously (aka full path) to a table would be:
>>>
>>> SELECT * FROM mydb.mytable;
>>>
>>> Whereas in Postgresql and Oracle and SQL Server you'd have to:
>>>
>>> SELECT * FROM mydb.myschema.mytable;   /* And I don't even know what
>>> to do with the namespace! */
>>>
>>>
>>> https://www.postgresql.org/docs/current/catalog-pg-namespace.html
>>> https://www.postgresql.org/docs/current/ddl-schemas.html
>>>
>>> https://docs.oracle.com/database/121/ADMQS/GUID-6E0CE8C9-7DC4-450C-BAE0-2E1CDD882993.htm#ADMQS0821
>>>
>>> https://docs.oracle.com/database/121/ADMQS/GUID-8AC1A325-3542-48A0-9B0E-180D633A5BD1.htm#ADMQS081
>>>
>>> https://learn.microsoft.com/en-us/sql/t-sql/statements/create-schema-transact-sql?view=sql-server-ver16
>>>
>>> https://learn.microsoft.com/en-us/sql/t-sql/statements/create-database-transact-sql?view=sql-server-ver16&tabs=sqlpool
>>>
>>> The Microsoft docs perhaps best explain the role of each: The Database
>>> contains the configuration of physical things like where on disk is the
>>> database stored. The Schema on the other hand contains "logical" objects
>>> like tables, views and procedures.
>>>
>>> MongoDB has databases and collections. As an easter egg / inside joke,
>>> it also supports the command `SHOW TABLES` as a synonym for collections.
>>>
>>> A TABLESPACE btw is something else completely:
>>> https://docs.oracle.com/database/121/ADMQS/GUID-F05EE514-FFC6-4E86-A592-802BA5A49254.htm#ADMQS12053
>>>
>>>
>>>
>>> Personally I would be in favor of introducing `DATABASE` as a synonym
>>> for KEYSPACE. The latter could remain the "official" usage.
>>>
>>> he

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Miklosovic, Stefan
I am against simplifying that so much, up to the point that there is some 
implicit replication strategy. While I understand the preferences towards 
having it all "easier", what is wrong with knowing that there are some 
replication strategies and my data will be replicated just once? There is also 
an educational aspect to that.

Also, having four ways to create a "keyspace" (keyspace, schema, database,
namespace) feels pretty confusing to me. Are they equal? Why four? Isn't it
just better to have one way of doing that? Having four ways to do it feels like
we do not know how to name it.

Somebody already mentioned in this thread that Postgres is quite complex in 
this. Maybe adding "DATABASE" would be OK but anything beyond that (NAMESPACE 
etc) is just too much imo.
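
For reference, the two forms being debated can be sketched as follows. This is only an illustration in Python: the explicit `WITH replication = {...}` clause is existing CQL, while omitting it and falling back to SimpleStrategy with a replication factor of 1 is the proposal from this thread, not current behaviour. The helper name `create_keyspace_cql` and the `DEFAULT_REPLICATION` constant are hypothetical.

```python
# Hypothetical default, mirroring the "simple 1" suggestion in this thread.
DEFAULT_REPLICATION = {"class": "SimpleStrategy", "replication_factor": 1}


def create_keyspace_cql(name, replication=None):
    """Build an explicit CREATE KEYSPACE statement.

    When no replication map is given, the assumed default is spelled out,
    so the statement still shows the user what they are getting.
    """
    rep = DEFAULT_REPLICATION if replication is None else replication
    parts = []
    for key, value in rep.items():
        # Strings are single-quoted as CQL literals; numbers are emitted as-is.
        rendered = f"'{value}'" if isinstance(value, str) else str(value)
        parts.append(f"'{key}': {rendered}")
    return f"CREATE KEYSPACE {name} WITH replication = {{{', '.join(parts)}}};"


if __name__ == "__main__":
    print(create_keyspace_cql("demo"))
    print(create_keyspace_cql(
        "prod", {"class": "NetworkTopologyStrategy", "dc1": 3}))
```

Keeping the clause explicit is a single line of text, which is the educational aspect argued for above.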



Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Berenguer Blasi
One aspect to take into account is that we might not even get as far
as having a chance to educate the user. They start the thing up, see a
wall of logs and then start seeing 'keyspace' (what is that?), etc.
Everything seems so foreign and out of band to their 'normal' experience
that they just move on to the next option they had in mind.


My 2cts.


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Miklosovic, Stefan
So let's rename Keyspace (Java class) to Database then. If we are concerned that
the logs would be full of "keyspaces" while the user created a "database",
and that this is a source of inconsistency, should it not be resolved and
unified somehow?

I think that it is just too late to rename keyspace to something else. That
term has become so entrenched in the Cassandra-verse over the years that it just
does not make sense to try to get rid of it.

Also, a "beginner" might not look into the logs at all. I think that they will
be all over CQL trying to load some data etc. rather than looking into the
logs, which are not important to them. Who looks into the actual logs while
logging into the console, whatever DB they are using? These are not beginners imho.

BTW keep in mind that all nodetool commands which use the "keyspace"
terminology would probably have to be adapted to the "database" term as well.



Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Berenguer Blasi
No beginner is going to look for keyspace in the logs imo; that's not what I
was pointing at. But upon starting C* you get a wall of logs, which is
less user-friendly imo than a nice simple message saying it has just
started. Then you go to cqlsh, and keyspace and RF are the first things
you are going to hit. You might think 'Too much overall hassle, I'll go
check something else instead'.



Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread guo Maxwell
>
> So let's rename Keyspace (Java class) to Database then. If we are concerned
> that the logs would be full of "keyspaces" while the user created a
> "database", and that this is a source of inconsistency, should it not be
> resolved and unified somehow?
>
> I think that it is just too late to rename keyspace to something else.
> That term has become so entrenched in the Cassandra-verse over the years
> that it just does not make sense to try to get rid of it.
>
> Also, a "beginner" might not look into the logs at all. I think that they
> will be all over CQL trying to load some data etc. rather than looking
> into the logs, which are not important to them. Who looks into the actual
> logs while logging into the console, whatever DB they are using? These are
> not beginners imho.
>
> BTW keep in mind that all nodetool commands which use the "keyspace"
> terminology would probably have to be adapted to the "database" term as
> well.
>

+1


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Mick Semb Wever
>
> Something like "TABLESPACE" or "TABLEGROUP" would *theoretically* better
> satisfy point 1 and 2 above but subjectively I kind of recoil at both
> equally. So there's that.
>



TABLEGROUP would work for me.  Immediately intuitive.

brain-storming…

A keyspace today defines replication strategy, rf, and durable_writes. If
they also had the table options that could be defined as defaults for all
tables in that group, and one tablegroup could be a child and inherit
settings from another tablegroup, you could logically group tables in ways
that both benefit your application platform's taxonomy and the spread of
keyspace/table settings. DATABASE, NAMESPACE, whatever, can be aliases to
it too, if you like.
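
A rough Python sketch of the inheritance idea, purely to illustrate the brainstorm above: each group holds only the settings it overrides, and everything else is resolved from its parent. The `TableGroup` class, its fields, and the example setting values are all hypothetical; nothing like this exists in Cassandra today.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class TableGroup:
    """Hypothetical table group: a named bag of settings with an optional parent."""
    name: str
    settings: Dict[str, Any] = field(default_factory=dict)
    parent: Optional["TableGroup"] = None

    def effective_settings(self) -> Dict[str, Any]:
        """Merge settings from the root down; the nearest definition wins."""
        inherited = self.parent.effective_settings() if self.parent else {}
        return {**inherited, **self.settings}


if __name__ == "__main__":
    platform = TableGroup(
        "platform",
        {"replication_factor": 3, "durable_writes": True, "default_time_to_live": 0},
    )
    # The child overrides one table-level default and inherits the rest.
    analytics = TableGroup("analytics", {"default_time_to_live": 86400}, parent=platform)
    print(analytics.effective_settings())
```

Resolving settings at read time rather than copying them into each child means a change to the parent group is picked up by every group below it, which is the "spread of settings" benefit described above.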


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Miklosovic, Stefan
I just do not share your concerns, Berenguer. Maybe you have a different
experience, but I have never seen anybody decide whether to use a given
database based on whether its startup logs are easy to parse,
conceptually and mentally. Let's talk about simplifying the startup logs then,
and not logging what is not necessary. We would probably find log messages
that are not needed.


From: Berenguer Blasi 
Sent: Thursday, April 6, 2023 10:22
To: dev@cassandra.apache.org
Subject: Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

NetApp Security WARNING: This is an external email. Do not click links or open 
attachments unless you recognize the sender and know the content is safe.




No beginner is going to look for keyspace in logs imo, that's not what I
was pointing at. But upon starting C* you get a wall of logs which is
less user friendly imo than having a nice simple message saying you just
started. Then you go to cqlsh, and keyspace and RF are the first things
the user is going to hit. They might think 'Too much overall hassle, I'll go
check something else instead'.

On 6/4/23 10:01, Miklosovic, Stefan wrote:
> So let's rename Keyspace (Java class) to Database then. If we are concerned 
> that looking into logs would be full of "keyspaces" but a user created 
> "database" and it is a source of inconsistencies, should not it be somehow 
> resolved and unified?
>
> I think that it is just too late to rename keyspace to something else. That 
> term is so entrenched over the years in Cassandra-verse that it just does not 
> make sense to try to get rid of that.
>
> Also, a "beginner" might not look into the logs at all. I think that they 
> will be all over CQL trying to load some data there etc. rather than looking 
> into the logs, which are not important at that point. Who is looking into 
> the actual logs while logging into the console, whatever DB they are using? 
> These are not beginners imho.
>
> BTW keep in mind that all nodetool commands which use the "keyspace" 
> terminology would probably have to be adapted to the "database" term as well.
>
> 
> From: Berenguer Blasi 
> Sent: Thursday, April 6, 2023 9:47
> To: dev@cassandra.apache.org
> Subject: Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE
>
>
>
>
>
> One aspect to take into account is that we might not even get as far
> as having a chance to educate the user. They start the thing up, see a
> wall of logs and then start seeing 'keyspace' (what is that?), etc.
> Everything seems so foreign and out of band to their 'normal' experience
> they just move on to the next option they had in mind.
>
> My 2cts.
>
> On 6/4/23 9:30, Miklosovic, Stefan wrote:
>> I am against simplifying that so much, up to the point that there is some 
>> implicit replication strategy. While I understand the preferences towards 
>> having it all "easier", what is wrong with knowing that there are some 
>> replication strategies and my data will be replicated just once? There is 
>> also an educational aspect to that.
>>
>> Also, having 4 ways to create a "keyspace" (keyspace, schema, database, 
>> namespace) feels pretty confusing to me. Are they equal? Why four? Isn't it 
>> just better to have one way of doing that? Having 4 ways to do that feels 
>> like we do not know how to name it.
>>
>> Somebody already mentioned in this thread that Postgres is quite complex in 
>> this. Maybe adding "DATABASE" would be OK but anything beyond that 
>> (NAMESPACE etc) is just too much imo.
>>
>> 
>> From: Jacek Lewandowski 
>> Sent: Thursday, April 6, 2023 8:36
>> To: dev@cassandra.apache.org
>> Subject: Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE
>>
>>
>>
>>
>> Haha... we have opinions against each name :)
>>
>> According to what Caleb said, I don't think all new users start learning 
>> Cassandra from understanding the replication.
>> There are probably many small projects where Cassandra is used on a single 
>> node, or bigger projects where people
>> try different things to make some PoC. Understanding the internals, 
>> architecture of Cassandra is not crucial - they
>> want to start writing queries as soon as possible and the less prior 
>> knowledge is required to do that the better.
>>
>> That being said, we should maybe even go further and assume some default 
>> replication config, like simple 1, so that
>> creating one boils down to simply CREATE 
>> KEYSPACE|SCHEMA|DATABASE|NAMESPACE ;
>>
>> thanks,
>> - - -- --- -  -
>> Jacek Lewandowski
>>
>>
>> czw., 6 

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Benedict
KEYSPACE is fine. If we want to introduce a standard nomenclature like DATABASE 
that’s also fine. Inventing brand new ones is not fine, there’s no benefit.

I think it would be fine to introduce some arbitrary unrelated concept for 
assigning tables with similar behaviours some configuration that is orthogonal 
to replication, but that should be a different discussion about how we evolve 
config.

> On 6 Apr 2023, at 09:40, Mick Semb Wever  wrote:
> 
> 
>> Something like "TABLESPACE" or 'TABLEGROUP" would theoretically better 
>> satisfy point 1 and 2 above but subjectively I kind of recoil at both 
>> equally. So there's that.
> 
> 
> 
> TABLEGROUP would work for me.  Immediately intuitive.
> 
> brain-storming…
> 
> A keyspace today defines replication strategy, rf, and durable_writes. If 
> they also had the table options that could be defined as defaults for all 
> tables in that group, and one tablegroup could be a child and inherit 
> settings from another tablegroup, you could logically group tables in ways 
> that both benefit your application platform's taxonomy and the spread of 
> keyspace/table settings. DATABASE, NAMESPACE, whatever, can be aliases to it 
> too, if you like.
> 
> 
>  


Re: [VOTE] Release Apache Cassandra 4.0.9

2023-04-06 Thread Miklosovic, Stefan
zstd-jni 1.5.5 was just released.

I am bumping it under this ticket 
https://issues.apache.org/jira/browse/CASSANDRA-18429

I am building CI as we speak.

Up to you to fail the vote and we realistically release 4.0.9 after Easter


From: Miklosovic, Stefan 
Sent: Wednesday, April 5, 2023 12:21
To: dev@cassandra.apache.org
Subject: Re: [VOTE] Release Apache Cassandra 4.0.9





Let's wait one more day for that; I have asked when 1.5.5 is going to be out. 
Everything suggests that it is pretty much ready for release.

Voting is planned to be held for 72h anyway.

https://github.com/luben/zstd-jni/issues/254


From: Brandon Williams 
Sent: Wednesday, April 5, 2023 12:03
To: dev@cassandra.apache.org
Subject: Re: [VOTE] Release Apache Cassandra 4.0.9





On Wed, Apr 5, 2023 at 1:36 AM Miklosovic, Stefan
 wrote:

> I think that we should fail the vote and bump zstd-jni to 1.5.5. I do not 
> think it is a good idea to release 4.0.9 with such a bug in it when we can 
> wait one more day or so.

I'm not sure it makes sense to delay a release waiting for another
project to make their own release for a bug that is so rare it hasn't
surfaced in any release to date, since they all have it.  I think
getting CASSANDRA-18125 is probably a more prevalent issue and
delaying that for an unknown amount of time doesn't seem prudent.  We
could always make another release when Zstd is ready.


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Ekaterina Dimitrova
“KEYSPACE is fine. If we want to introduce a standard nomenclature like
DATABASE that’s also fine. Inventing brand new ones is not fine, there’s no
benefit.

I think it would be fine to introduce some arbitrary unrelated concept for
assigning tables with similar behaviours some configuration that is
orthogonal to replication, but that should be a different discussion about
how we evolve config.”

+1


On Thu, 6 Apr 2023 at 5:26, Benedict  wrote:

> KEYSPACE is fine. If we want to introduce a standard nomenclature like
> DATABASE that’s also fine. Inventing brand new ones is not fine, there’s no
> benefit.
>
> I think it would be fine to introduce some arbitrary unrelated concept for
> assigning tables with similar behaviours some configuration that is
> orthogonal to replication, but that should be a different discussion about
> how we evolve config.
>
> On 6 Apr 2023, at 09:40, Mick Semb Wever  wrote:
>
> 
>
> Something like "TABLESPACE" or "TABLEGROUP" would *theoretically* better
>> satisfy point 1 and 2 above but subjectively I kind of recoil at both
>> equally. So there's that.
>>
>
>
>
> TABLEGROUP would work for me.  Immediately intuitive.
>
> brain-storming…
>
> A keyspace today defines replication strategy, rf, and durable_writes. If
> they also had the table options that could be defined as defaults for all
> tables in that group, and one tablegroup could be a child and inherit
> settings from another tablegroup, you could logically group tables in ways
> that both benefit your application platform's taxonomy and the spread of
> keyspace/table settings. DATABASE, NAMESPACE, whatever, can be aliases to
> it too, if you like.
>
>
>
>
>


Re: [VOTE] Release Apache Cassandra 4.0.9

2023-04-06 Thread Mick Semb Wever
> Up to you to fail the vote and we realistically release 4.0.9 after Easter
>


-1 to the vote.

I support your initial veto and reasoning, and it appears you are willing
to recut once 18429 is resolved.


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Mick Semb Wever
> … but that should be a different discussion about how we evolve config.
>


I disagree. Nomenclature being difficult can benefit from holistic and
forward thinking.

Sure you can label this off-topic if you like, but I value our discuss
threads being collaborative in an open-mode. Sometimes the best idea is on
the tail end of a sequence of bad and/or unpopular ideas.


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Josh McKenzie
> KEYSPACE is fine. If we want to introduce a standard nomenclature like 
> DATABASE that’s also fine. Inventing brand new ones is not fine, there’s no 
> benefit.
I'm with Benedict in principle, with Aleksey in practice; I think KEYSPACE and 
SCHEMA are actually fine enough.

If and when we get to any kind of multi-tenancy, having a more metaphorical 
abstraction that users are familiar with like these becomes more valuable; it's 
pretty clear that things in different keyspaces, different databases, or even 
different schemas could have different access rules, resourcing, etc from one 
another.

While the off-the-cuff logical TABLEGROUP thing is a *literal* statement about 
what the thing is, it'd be another unique term to us;  we have enough things in 
our system where we've charted our own path. My personal .02 is we don't need 
to go adding more. :)

On Thu, Apr 6, 2023, at 8:54 AM, Mick Semb Wever wrote:
> 
>> … but that should be a different discussion about how we evolve config.
> 
>  
> I disagree. Nomenclature being difficult can benefit from holistic and 
> forward thinking.
> Sure you can label this off-topic if you like, but I value our discuss 
> threads being collaborative in an open-mode. Sometimes the best idea is on 
> the tail end of a sequence of bad and/or unpopular ideas.
> 
> 
>> 
> 


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Mike Adamson
My apologies. I started this discussion off the back of a usability
discussion around new user accessibility to Cassandra and the premise that
there is an initial steep learning curve for new users, including new users
who have worked for a long time in the traditional DBMS field.

On the basis of the reason for the discussion, TABLEGROUP doesn't sit well
because of user types / functions / indexes etc., which are not strictly
tables, and it is also yet another Cassandra-only term.

NAMESPACE could work but its different usage in other systems could be
just as confusing to new users.

And, I certainly don't think having multiple names for the same thing just
to satisfy different parties is a good idea at all.

I'm quite happy to leave things as they are if that is the consensus.

On Thu, 6 Apr 2023 at 14:16, Josh McKenzie  wrote:

> KEYSPACE is fine. If we want to introduce a standard nomenclature like
> DATABASE that’s also fine. Inventing brand new ones is not fine, there’s no
> benefit.
>
> I'm with Benedict in principle, with Aleksey in practice; I think KEYSPACE
> and SCHEMA are actually fine enough.
>
> If and when we get to any kind of multi-tenancy, having a more
> metaphorical abstraction that users are familiar with like these becomes
> more valuable; it's pretty clear that things in different keyspaces,
> different databases, or even different schemas could have different access
> rules, resourcing, etc from one another.
>
> While the off-the-cuff logical TABLEGROUP thing is a *literal* statement
> about what the thing is, it'd be another unique term to us;  we have enough
> things in our system where we've charted our own path. My personal .02 is
> we don't need to go adding more. :)
>
> On Thu, Apr 6, 2023, at 8:54 AM, Mick Semb Wever wrote:
>
>
> … but that should be a different discussion about how we evolve config.
>
>
>
> I disagree. Nomenclature being difficult can benefit from holistic and
> forward thinking.
> Sure you can label this off-topic if you like, but I value our discuss
> threads being collaborative in an open-mode. Sometimes the best idea is on
> the tail end of a sequence of bad and/or unpopular ideas.
>
>
>
>
>
>

-- 
*Mike Adamson*
Engineering

+1 650 389 6000 | datastax.com


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Bowen Song via dev

> I'm quite happy to leave things as they are if that is the consensus.

+1 to the above


On 06/04/2023 14:54, Mike Adamson wrote:
My apologies. I started this discussion off the back of a usability 
discussion around new user accessibility to Cassandra and the premise 
that there is an initial steep learning curve for new users. Including 
new users who have worked for a long time in the traditional DBMS field.


On the basis of the reason for the discussion,  TABLEGROUP doesn't sit 
well because of user types / functions / indexes etc. which are not 
strictly tables and is also yet another Cassandra only term.


NAMESPACE could work but its different usage in other systems could 
be just as confusing to new users.


And, I certainly don't think having multiple names for the same thing 
just to satisfy different parties is a good idea at all.


I'm quite happy to leave things as they are if that is the consensus.

On Thu, 6 Apr 2023 at 14:16, Josh McKenzie  wrote:


KEYSPACE is fine. If we want to introduce a standard nomenclature
like DATABASE that’s also fine. Inventing brand new ones is not
fine, there’s no benefit.

I'm with Benedict in principle, with Aleksey in practice; I think
KEYSPACE and SCHEMA are actually fine enough.

If and when we get to any kind of multi-tenancy, having a more
metaphorical abstraction that users are familiar with like these
becomes more valuable; it's pretty clear that things in different
keyspaces, different databases, or even different schemas could
have different access rules, resourcing, etc from one another.

While the off-the-cuff logical TABLEGROUP thing is a /literal/
statement about what the thing is, it'd be another unique term to
us;  we have enough things in our system where we've charted our
own path. My personal .02 is we don't need to go adding more. :)

On Thu, Apr 6, 2023, at 8:54 AM, Mick Semb Wever wrote:


… but that should be a different discussion about how we
evolve config.



I disagree. Nomenclature being difficult can benefit from
holistic and forward thinking.
Sure you can label this off-topic if you like, but I value our
discuss threads being collaborative in an open-mode.
Sometimes the best idea is on the tail end of a sequence of bad
and/or unpopular ideas.








--
*Mike Adamson*
Engineering

+1 650 389 6000 | datastax.com


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Henrik Ingo
On Thu, Apr 6, 2023 at 4:16 PM Josh McKenzie  wrote:

> KEYSPACE is fine. If we want to introduce a standard nomenclature like
> DATABASE that’s also fine. Inventing brand new ones is not fine, there’s no
> benefit.
>
> I'm with Benedict in principle, with Aleksey in practice; I think KEYSPACE
> and SCHEMA are actually fine enough.
>
>
Having learned that SCHEMA already exists as a synonym for KEYSPACE, I
think everything is good here. If Cassandra evolves to a richer database
(transactions and queries beyond just key based access) then gradually
adopting SCHEMA as the primary name might feel natural. Once we get there.


> If and when we get to any kind of multi-tenancy, having a more
> metaphorical abstraction that users are familiar with like these becomes
> more valuable; it's pretty clear that things in different keyspaces,
> different databases, or even different schemas could have different access
> rules, resourcing, etc from one another.
>
>
At Datastax I've tried, with some success actually, to ban the use of the
word "Database" in our cloud service, because it was too overloaded.
Various people, one group of which were the UI designers that expose their
point of view to actual users, had completely different ideas of what a
"database" is. I remember at least:
 - the cluster of servers / VMs in the cloud that together contain a
Cassandra database. => It's a cluster.
 - One tenant in a multi-tenant cluster => It's a tenant
 - A KEYSPACE. This would have been most correct in my world view, but was
actually the least used. => KEYSPACE or SCHEMA
 - The software product: Cassandra, DSE, or Astra

I think the first two were the ones actually used in the UI.

Now that I think about this email thread, the different expectations of
what the word "database" means might correlate with whether the speaker's
background is in the Oracle/PostgreSQL/Microsoft camp, or the MySQL/MongoDB
camp.


So it's like me trying to order a biscuit in a US cafe.

henrik





> While the off-the-cuff logical TABLEGROUP thing is a *literal* statement
> about what the thing is, it'd be another unique term to us;  we have enough
> things in our system where we've charted our own path. My personal .02 is
> we don't need to go adding more. :)
>
> On Thu, Apr 6, 2023, at 8:54 AM, Mick Semb Wever wrote:
>
>
> … but that should be a different discussion about how we evolve config.
>
>
>
> I disagree. Nomenclature being difficult can benefit from holistic and
> forward thinking.
> Sure you can label this off-topic if you like, but I value our discuss
> threads being collaborative in an open-mode. Sometimes the best idea is on
> the tail end of a sequence of bad and/or unpopular ideas.
>
>
>
>
>
>

-- 

Henrik Ingo

c. +358 40 569 7354

w. www.datastax.com

  
  


Re: [VOTE] CEP-26: Unified Compaction Strategy

2023-04-06 Thread Joseph Lynch
+1

This proposal looks really exciting!

-Joey

On Wed, Apr 5, 2023 at 2:13 AM Aleksey Yeshchenko  wrote:
>
> +1
>
> On 4 Apr 2023, at 16:56, Ekaterina Dimitrova  wrote:
>
> +1
>
> On Tue, 4 Apr 2023 at 11:44, Benjamin Lerer  wrote:
>>
>> +1
>>
>> On Tue, 4 Apr 2023 at 17:17, Andrés de la Peña  
>> wrote:
>>>
>>> +1
>>>
>>> On Tue, 4 Apr 2023 at 15:09, Jeremy Hanna  
>>> wrote:

 +1 nb, will be great to have this in the codebase - it will make nearly 
 every table's compaction work more efficiently.  The only possible 
 exception is tables that are well suited for TWCS.

 On Apr 4, 2023, at 8:00 AM, Berenguer Blasi  
 wrote:

 +1

 On 4/4/23 14:36, J. D. Jordan wrote:

 +1

 On Apr 4, 2023, at 7:29 AM, Brandon Williams  wrote:

 
 +1

 On Tue, Apr 4, 2023, 7:24 AM Branimir Lambov  wrote:
>
> Hi everyone,
>
> I would like to put CEP-26 to a vote.
>
> Proposal:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-26%3A+Unified+Compaction+Strategy
>
> JIRA and draft implementation:
> https://issues.apache.org/jira/browse/CASSANDRA-18397
>
> Up-to-date documentation:
> https://github.com/blambov/cassandra/blob/CASSANDRA-18397/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md
>
> Discussion:
> https://lists.apache.org/thread/8xf5245tclf1mb18055px47b982rdg4b
>
> The vote will be open for 72 hours.
> A vote passes if there are at least three binding +1s and no binding 
> vetoes.
>
> Thanks,
> Branimir


>


Re: [DISCUSS] CEP-29 CQL NOT Operator

2023-04-06 Thread David Capwell
Overall I welcome this feature. I was trying to use this 1-2 months back 
and found we didn’t support it, so I'm glad to see it coming!

From a testing point of view, I think we would want to have good fuzz testing 
covering complex types (frozen/non-frozen collections, tuples, udt, etc.), and 
reverse ordering; both areas tend to cause the most problems for new features 
(and existing ones).

We will also want a way to disable this feature, and optionally to disable 
individual parts of it (such as milestone 2’s NOT IN for partition keys).

> On Apr 4, 2023, at 2:28 AM, Piotr Kołaczkowski  wrote:
> 
> Hi everyone!
> 
> I created a new CEP for adding NOT support to the query language and
> want to start discussion around it:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-29%3A+CQL+NOT+operator
> 
> Happy to get your feedback.
> --
> Piotr



Re: [VOTE] CEP-26: Unified Compaction Strategy

2023-04-06 Thread Josh McKenzie
+1

On Thu, Apr 6, 2023, at 12:18 PM, Joseph Lynch wrote:
> +1
> 
> This proposal looks really exciting!
> 
> -Joey
> 
> On Wed, Apr 5, 2023 at 2:13 AM Aleksey Yeshchenko  wrote:
> >
> > +1
> >
> > On 4 Apr 2023, at 16:56, Ekaterina Dimitrova  wrote:
> >
> > +1
> >
> > On Tue, 4 Apr 2023 at 11:44, Benjamin Lerer  wrote:
> >>
> >> +1
> >>
> >> On Tue, 4 Apr 2023 at 17:17, Andrés de la Peña  
> >> wrote:
> >>>
> >>> +1
> >>>
> >>> On Tue, 4 Apr 2023 at 15:09, Jeremy Hanna  
> >>> wrote:
> 
>  +1 nb, will be great to have this in the codebase - it will make nearly 
>  every table's compaction work more efficiently.  The only possible 
>  exception is tables that are well suited for TWCS.
> 
>  On Apr 4, 2023, at 8:00 AM, Berenguer Blasi  
>  wrote:
> 
>  +1
> 
>  On 4/4/23 14:36, J. D. Jordan wrote:
> 
>  +1
> 
>  On Apr 4, 2023, at 7:29 AM, Brandon Williams  wrote:
> 
>  
>  +1
> 
>  On Tue, Apr 4, 2023, 7:24 AM Branimir Lambov  wrote:
> >
> > Hi everyone,
> >
> > I would like to put CEP-26 to a vote.
> >
> > Proposal:
> > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-26%3A+Unified+Compaction+Strategy
> >
> > JIRA and draft implementation:
> > https://issues.apache.org/jira/browse/CASSANDRA-18397
> >
> > Up-to-date documentation:
> > https://github.com/blambov/cassandra/blob/CASSANDRA-18397/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md
> >
> > Discussion:
> > https://lists.apache.org/thread/8xf5245tclf1mb18055px47b982rdg4b
> >
> > The vote will be open for 72 hours.
> > A vote passes if there are at least three binding +1s and no binding 
> > vetoes.
> >
> > Thanks,
> > Branimir
> 
> 
> >
> 

Re: [VOTE] CEP-26: Unified Compaction Strategy

2023-04-06 Thread Francisco Guerrero
+1 (nb)

On 2023/04/06 17:30:37 Josh McKenzie wrote:
> +1
> 
> On Thu, Apr 6, 2023, at 12:18 PM, Joseph Lynch wrote:
> > +1
> > 
> > This proposal looks really exciting!
> > 
> > -Joey
> > 
> > On Wed, Apr 5, 2023 at 2:13 AM Aleksey Yeshchenko  wrote:
> > >
> > > +1
> > >
> > > On 4 Apr 2023, at 16:56, Ekaterina Dimitrova  
> > > wrote:
> > >
> > > +1
> > >
> > > On Tue, 4 Apr 2023 at 11:44, Benjamin Lerer  wrote:
> > >>
> > >> +1
> > >>
> > >> On Tue, 4 Apr 2023 at 17:17, Andrés de la Peña  
> > >> wrote:
> > >>>
> > >>> +1
> > >>>
> > >>> On Tue, 4 Apr 2023 at 15:09, Jeremy Hanna  
> > >>> wrote:
> > 
> >  +1 nb, will be great to have this in the codebase - it will make 
> >  nearly every table's compaction work more efficiently.  The only 
> >  possible exception is tables that are well suited for TWCS.
> > 
> >  On Apr 4, 2023, at 8:00 AM, Berenguer Blasi  
> >  wrote:
> > 
> >  +1
> > 
> >  On 4/4/23 14:36, J. D. Jordan wrote:
> > 
> >  +1
> > 
> >  On Apr 4, 2023, at 7:29 AM, Brandon Williams  wrote:
> > 
> >  
> >  +1
> > 
> >  On Tue, Apr 4, 2023, 7:24 AM Branimir Lambov  
> >  wrote:
> > >
> > > Hi everyone,
> > >
> > > I would like to put CEP-26 to a vote.
> > >
> > > Proposal:
> > > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-26%3A+Unified+Compaction+Strategy
> > >
> > > JIRA and draft implementation:
> > > https://issues.apache.org/jira/browse/CASSANDRA-18397
> > >
> > > Up-to-date documentation:
> > > https://github.com/blambov/cassandra/blob/CASSANDRA-18397/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md
> > >
> > > Discussion:
> > > https://lists.apache.org/thread/8xf5245tclf1mb18055px47b982rdg4b
> > >
> > > The vote will be open for 72 hours.
> > > A vote passes if there are at least three binding +1s and no binding 
> > > vetoes.
> > >
> > > Thanks,
> > > Branimir
> > 
> > 
> > >
> > 


Re: [VOTE] CEP-26: Unified Compaction Strategy

2023-04-06 Thread Mick Semb Wever
+1

On Thu, 6 Apr 2023 at 19:32, Francisco Guerrero  wrote:

> +1 (nb)
>
> On 2023/04/06 17:30:37 Josh McKenzie wrote:
> > +1
> >
> > On Thu, Apr 6, 2023, at 12:18 PM, Joseph Lynch wrote:
> > > +1
> > >
> > > This proposal looks really exciting!
> > >
> > > -Joey
> > >
> > > On Wed, Apr 5, 2023 at 2:13 AM Aleksey Yeshchenko 
> wrote:
> > > >
> > > > +1
> > > >
> > > > On 4 Apr 2023, at 16:56, Ekaterina Dimitrova 
> wrote:
> > > >
> > > > +1
> > > >
> > > > On Tue, 4 Apr 2023 at 11:44, Benjamin Lerer 
> wrote:
> > > >>
> > > >> +1
> > > >>
> > > > >> On Tue, 4 Apr 2023 at 17:17, Andrés de la Peña <
> adelap...@apache.org> wrote:
> > > >>>
> > > >>> +1
> > > >>>
> > > >>> On Tue, 4 Apr 2023 at 15:09, Jeremy Hanna <
> jeremy.hanna1...@gmail.com> wrote:
> > > 
> > >  +1 nb, will be great to have this in the codebase - it will make
> nearly every table's compaction work more efficiently.  The only possible
> exception is tables that are well suited for TWCS.
> > > 
> > >  On Apr 4, 2023, at 8:00 AM, Berenguer Blasi <
> berenguerbl...@gmail.com> wrote:
> > > 
> > >  +1
> > > 
> > >  On 4/4/23 14:36, J. D. Jordan wrote:
> > > 
> > >  +1
> > > 
> > >  On Apr 4, 2023, at 7:29 AM, Brandon Williams 
> wrote:
> > > 
> > >  
> > >  +1
> > > 
> > >  On Tue, Apr 4, 2023, 7:24 AM Branimir Lambov 
> wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > I would like to put CEP-26 to a vote.
> > > >
> > > > Proposal:
> > > >
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-26%3A+Unified+Compaction+Strategy
> > > >
> > > > JIRA and draft implementation:
> > > > https://issues.apache.org/jira/browse/CASSANDRA-18397
> > > >
> > > > Up-to-date documentation:
> > > >
> https://github.com/blambov/cassandra/blob/CASSANDRA-18397/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md
> > > >
> > > > Discussion:
> > > > https://lists.apache.org/thread/8xf5245tclf1mb18055px47b982rdg4b
> > > >
> > > > The vote will be open for 72 hours.
> > > > A vote passes if there are at least three binding +1s and no
> binding vetoes.
> > > >
> > > > Thanks,
> > > > Branimir
> > > 
> > > 
> > > >
> > >
>


Re: [VOTE] CEP-26: Unified Compaction Strategy

2023-04-06 Thread Patrick McFadin
+1

Thanks to Lorina for getting people excited about it at Cassandra Forward!

On Thu, Apr 6, 2023 at 10:37 AM Mick Semb Wever  wrote:

> +1
>
> On Thu, 6 Apr 2023 at 19:32, Francisco Guerrero 
> wrote:
>
>> +1 (nb)
>>
>> On 2023/04/06 17:30:37 Josh McKenzie wrote:
>> > +1
>> >
>> > On Thu, Apr 6, 2023, at 12:18 PM, Joseph Lynch wrote:
>> > > +1
>> > >
>> > > This proposal looks really exciting!
>> > >
>> > > -Joey
>> > >
>> > > On Wed, Apr 5, 2023 at 2:13 AM Aleksey Yeshchenko 
>> wrote:
>> > > >
>> > > > +1
>> > > >
>> > > > On 4 Apr 2023, at 16:56, Ekaterina Dimitrova 
>> wrote:
>> > > >
>> > > > +1
>> > > >
>> > > > On Tue, 4 Apr 2023 at 11:44, Benjamin Lerer 
>> wrote:
>> > > >>
>> > > >> +1
>> > > >>
>> > > >> On Tue, 4 Apr 2023 at 17:17, Andrés de la Peña <
>> adelap...@apache.org> wrote:
>> > > >>>
>> > > >>> +1
>> > > >>>
>> > > >>> On Tue, 4 Apr 2023 at 15:09, Jeremy Hanna <
>> jeremy.hanna1...@gmail.com> wrote:
>> > > 
>> > >  +1 nb, will be great to have this in the codebase - it will make
>> nearly every table's compaction work more efficiently.  The only possible
>> exception is tables that are well suited for TWCS.
>> > > 
>> > >  On Apr 4, 2023, at 8:00 AM, Berenguer Blasi <
>> berenguerbl...@gmail.com> wrote:
>> > > 
>> > >  +1
>> > > 
>> > >  On 4/4/23 14:36, J. D. Jordan wrote:
>> > > 
>> > >  +1
>> > > 
>> > >  On Apr 4, 2023, at 7:29 AM, Brandon Williams 
>> wrote:
>> > > 
>> > >  
>> > >  +1
>> > > 
>> > >  On Tue, Apr 4, 2023, 7:24 AM Branimir Lambov 
>> wrote:
>> > > >
>> > > > Hi everyone,
>> > > >
>> > > > I would like to put CEP-26 to a vote.
>> > > >
>> > > > Proposal:
>> > > >
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-26%3A+Unified+Compaction+Strategy
>> > > >
>> > > > JIRA and draft implementation:
>> > > > https://issues.apache.org/jira/browse/CASSANDRA-18397
>> > > >
>> > > > Up-to-date documentation:
>> > > >
>> https://github.com/blambov/cassandra/blob/CASSANDRA-18397/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md
>> > > >
>> > > > Discussion:
>> > > >
>> https://lists.apache.org/thread/8xf5245tclf1mb18055px47b982rdg4b
>> > > >
>> > > > The vote will be open for 72 hours.
>> > > > A vote passes if there are at least three binding +1s and no
>> binding vetoes.
>> > > >
>> > > > Thanks,
>> > > > Branimir
>> > > 
>> > > 
>> > > >
>> > >
>>
>


Re: [DISCUSS] CEP-29 CQL NOT Operator

2023-04-06 Thread Patrick McFadin
I love that this is finally coming to Cassandra. Absolutely hate that, once
again, we'll be endorsing the use of ALLOW FILTERING. This is an
anti-pattern that keeps getting legitimized.

Hot take: Should we just not do Milestones 1 and 2 and wait for an
index-only Milestone 3?

Patrick

On Thu, Apr 6, 2023 at 10:04 AM David Capwell  wrote:

> Overall I welcome this feature, was trying to use this around 1-2 months
> back and found we didn’t support, so glad to see it coming!
>
> From a testing point of view, I think we would want to have good fuzz
> testing covering complex types (frozen/non-frozen collections, tuples, udt,
> etc.), and reverse ordering; both sections tend to cause the most problem
> for new features (and existing ones)
>
> We also will want a way to disable this feature, and optionally disable at
> different sections (such as m2’s NOT IN for partition keys).
>
> > On Apr 4, 2023, at 2:28 AM, Piotr Kołaczkowski 
> wrote:
> >
> > Hi everyone!
> >
> > I created a new CEP for adding NOT support to the query language and
> > want to start discussion around it:
> >
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-29%3A+CQL+NOT+operator
> >
> > Happy to get your feedback.
> > --
> > Piotr
>
>


Re: [DISCUSS] CEP-29 CQL NOT Operator

2023-04-06 Thread Jeremy Hanna
Considering all of the examples require using ALLOW FILTERING with the 
partition key specified, I think it's appropriate to consider separating out 
use of ALLOW FILTERING within a partition versus ALLOW FILTERING across the 
whole table.  A few years back we had a discussion about this in ASF slack in 
the context of capability restrictions and it seems relevant here.  That is, we 
don't want people to get comfortable using ALLOW FILTERING across the whole 
table.  However, there are times when ALLOW FILTERING within a partition is 
reasonable.

Ticket to discuss separating them out: 
https://issues.apache.org/jira/browse/CASSANDRA-15803
Summary: Perhaps add an optional [WITHIN PARTITION] or something similar to 
make it backwards compatible and indicate that this is purely within the 
specified partition.

This also gives us the ability to disallow table-scan types of ALLOW FILTERING 
from a guardrail perspective, because the intent is explicit. That is, 
operators could disallow ALLOW FILTERING but allow ALLOW FILTERING WITHIN 
PARTITION, or whatever is decided.

I do NOT want to hijack a good discussion but I thought this separation could 
be useful within this context.

Jeremy

> On Apr 6, 2023, at 3:00 PM, Patrick McFadin  wrote:
> 
> I love that this is finally coming to Cassandra. Absolutely hate that, once 
> again, we'll be endorsing the use of ALLOW FILTERING. This is an anti-pattern 
> that keeps getting legitimized.
> 
> Hot take: Should we just not do Milestones 1 and 2 and wait for an index-only 
> Milestone 3? 
> 
> Patrick
> 
> On Thu, Apr 6, 2023 at 10:04 AM David Capwell wrote:
>> Overall I welcome this feature. I was trying to use this around 1-2 months 
>> back and found we didn’t support it, so glad to see it coming!
>> 
>> From a testing point of view, I think we would want to have good fuzz 
>> testing covering complex types (frozen/non-frozen collections, tuples, udt, 
>> etc.), and reverse ordering; both sections tend to cause the most problems 
>> for new features (and existing ones).
>> 
>> We also will want a way to disable this feature, and optionally disable at 
>> different sections (such as m2’s NOT IN for partition keys).
>> 
>> > On Apr 4, 2023, at 2:28 AM, Piotr Kołaczkowski wrote:
>> > 
>> > Hi everyone!
>> > 
>> > I created a new CEP for adding NOT support to the query language and
>> > want to start discussion around it:
>> > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-29%3A+CQL+NOT+operator
>> > 
>> > Happy to get your feedback.
>> > --
>> > Piotr
>> 



Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Dinesh Joshi
I’m strongly in favor of leaving terminology as-is.

On Apr 6, 2023, at 7:20 AM, Bowen Song via dev wrote:

> I'm quite happy to leave things as they are if that is the consensus.

+1 to the above

On 06/04/2023 14:54, Mike Adamson wrote:

My apologies. I started this discussion off the back of a usability
discussion around new user accessibility to Cassandra and the premise
that there is an initial steep learning curve for new users, including
new users who have worked for a long time in the traditional DBMS field.

On the basis of the reason for the discussion, TABLEGROUP doesn't sit
well because of user types / functions / indexes etc., which are not
strictly tables, and it is also yet another Cassandra-only term.

NAMESPACE could work, but its different usage in other systems could be
just as confusing to new users.

And I certainly don't think having multiple names for the same thing
just to satisfy different parties is a good idea at all.

I'm quite happy to leave things as they are if that is the consensus.
  
On Thu, 6 Apr 2023 at 14:16, Josh McKenzie wrote:
  
KEYSPACE is fine. If we want to introduce a standard nomenclature like
DATABASE that’s also fine. Inventing brand new ones is not fine, there’s
no benefit.

I'm with Benedict in principle, with Aleksey in practice; I think
KEYSPACE and SCHEMA are actually fine enough.

If and when we get to any kind of multi-tenancy, having a more
metaphorical abstraction that users are familiar with like these becomes
more valuable; it's pretty clear that things in different keyspaces,
different databases, or even different schemas could have different
access rules, resourcing, etc. from one another.

While the off-the-cuff logical TABLEGROUP thing is a literal statement
about what the thing is, it'd be another unique term to us; we have
enough things in our system where we've charted our own path. My
personal .02 is we don't need to go adding more. :)
  
On Thu, Apr 6, 2023, at 8:54 AM, Mick Semb Wever wrote:

… but that should be a different discussion about how we evolve config.

I disagree. Nomenclature being difficult can benefit from holistic and
forward thinking.

Sure you can label this off-topic if you like, but I value our discuss
threads being collaborative in an open-mode. Sometimes the best idea is
on the tail end of a sequence of bad and/or unpopular ideas.
  
-- 
Mike Adamson
Engineering
+1 650 389 6000 | datastax.com


Re: [VOTE] Release Apache Cassandra 4.0.9

2023-04-06 Thread Dinesh Joshi
-1 as well. We need to upgrade Zstd.

> 
> On Apr 6, 2023, at 4:57 AM, Mick Semb Wever  wrote:
> 
> 
> 
>  
>> Up to you to fail the vote and we realistically release 4.0.9 after Easter
> 
> 
> -1 to the vote. 
> 
> I support your initial veto and reasoning, and it appears you are willing to 
> recut once 18429 is resolved.