Re: [VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-06 Thread Mick Semb Wever
Marianne and Matt,
are you saying that this v1->v2 performance degradation is more than
just CASSANDRA-18086?
If so, what's the ETA for creating a ticket for it?

On Tue, 6 Dec 2022 at 00:07, Marianne Lyne Manaog 
wrote:

> Following on what Matt said:
>
> - Here is the link to the Cassandra repo with the bugfix of wait time from
> ms to ns:
> https://github.com/apache/cassandra/compare/trunk...marianne-manaog:cassandra:bugfix/wait-from-ms-to-ns
>
> - The Paxos configuration used is:
>
>   paxos_contention_wait_randomizer: uniform
>   paxos_contention_min_wait: 0
>   paxos_contention_max_wait: 100ms
>
> - V1 and V2 have the same configuration except for paxos_variant, which
> changes accordingly.
>
> *Results: V1 (100 partitions)*
>
> - Average read: 28948
> - Standard deviation: 416.271
> - Coefficient of variation: 1.44%
> - Average write: 19248
> - Standard deviation: 158.595
> - Coefficient of variation: 0.82%
>
> *Results: V2 (100 partitions)*
>
> - Average read: 12307
> - Standard deviation: 2367.473
> - Coefficient of variation: 19.24%
> - Average write: 5780
> - Standard deviation: 1154.261
> - Coefficient of variation: 19.97%
>
>
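
The coefficient figures quoted above are the standard deviation expressed as a percentage of the mean. A minimal Python sketch that recomputes them from the quoted averages and standard deviations (the dictionary keys and the function name are illustrative, not taken from the benchmark tooling):

```python
# Mean and standard deviation pairs copied from the quoted results.
results = {
    "v1 read": (28948, 416.271),
    "v1 write": (19248, 158.595),
    "v2 read": (12307, 2367.473),
    "v2 write": (5780, 1154.261),
}


def coefficient_of_variation(mean: float, std_dev: float) -> float:
    """Return the standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean


for name, (mean, std_dev) in results.items():
    print(f"{name}: {coefficient_of_variation(mean, std_dev):.2f}%")
```

The recomputed values agree with the quoted ones: V2 varies by roughly 19-20% of its mean, against less than 1.5% for V1.
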
> On Mon, Dec 5, 2022 at 1:50 PM Matt Fleming 
> wrote:
>
>> Me and Marianne are also still chasing a performance issue with Paxos v2
>> when compared with v1. We
>> see way more contention on v2 for a LOCAL_SERIALIZABLE workload that
>> writes/reads to only 100
>> partitions (v2 performs better for higher partition counts). We're still
>> investigating what's going
>> on.
>>
>> Should that be a -1 vote? I'm not sure :)
>>
>> On Mon, 5 Dec 2022 at 11:37, Benedict  wrote:
>>
>>> -0
>>>
>>> CASSANDRA-18086 should probably be fixed and merged first, as Paxos v2
>>> will be unlikely to work well for users without it. Either that or we need
>>> to update NEWS.txt to mention it.
>>>
>>> On 5 Dec 2022, at 11:01, Aleksey Yeshchenko  wrote:
>>>
>>> +1
>>>
>>> On 5 Dec 2022, at 10:17, Benjamin Lerer  wrote:
>>>
>>> +1
>>>
>>> On Mon, 5 Dec 2022 at 11:02, Berenguer Blasi wrote:
>>>
>>> +1
>>>
>>> On 5/12/22 10:53, guo Maxwell wrote:
>>>
>>> +1
>>>
>>> On Mon, 5 Dec 2022 at 5:33 PM, Mick Semb Wever wrote:
>>>
>>>> Proposing the test build of Cassandra 4.1.0 GA for release.
>>>>
>>>> sha1: b807f97b37933fac251020dbd949ee8ef245b158
>>>> Git:
>>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.1.0-tentative
>>>> Maven Artifacts:
>>>> https://repository.apache.org/content/repositories/orgapachecassandra-1281/org/apache/cassandra/cassandra-all/4.1.0/
>>>>
>>>> The Source and Build Artifacts, and the Debian and RPM packages and
>>>> repositories, are available here:
>>>> https://dist.apache.org/repos/dist/dev/cassandra/4.1.0/
>>>>
>>>> The vote will be open for 72 hours (longer if needed). Everyone who
>>>> has tested the build is invited to vote. Votes by PMC members are
>>>> considered binding. A vote passes if there are at least three binding +1s
>>>> and no -1's.
>>>>
>>>> [1]: CHANGES.txt:
>>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.1.0-tentative
>>>> [2]: NEWS.txt:
>>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.1.0-tentative
>>>
>>> --
>>> you are the apple of my eye !


Re: [VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-06 Thread Benedict
There’s another 1-line fix needed with the patch, and some tests, that I (or 
someone else) can likely post later today. It might be worth also giving Matt 
and Marianne a day or two to run some further tests to confirm everything is 
working as expected.



Re: [VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-06 Thread Aleksey Yeshchenko
Switching to -1 then.

Let’s get these fixes in and roll a pure RC2.




Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-06 Thread Andrés de la Peña
This will require some long introduction for context:

The MAX/MIN functions aggregate rows to get the row with min/max column
value according to their comparator. For collections, the comparison is on
the lexicographical order of the collection elements. That's the very same
comparator that is used when collections are used as clustering keys and
for ORDER BY.
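
To illustrate the ordering described above, here is a small Python sketch. Python's built-in list comparison is also lexicographical, so it is used as a stand-in for the collection comparator; the rows and values are made up, and this is only an analogy, not Cassandra's implementation.

```python
# Each row is (partition key, list column). Values are made up for illustration.
rows = [
    ("a", [1, 2, 3]),
    ("b", [1, 3]),
    ("c", [1, 2, 3, 4]),
]

# Python compares lists element by element, which mirrors the lexicographical
# ordering described above, so max() picks the row whose list sorts last.
max_row = max(rows, key=lambda row: row[1])
min_row = min(rows, key=lambda row: row[1])

# The row with the most elements is a different one.
longest_row = max(rows, key=lambda row: len(row[1]))

print(max_row)      # ('b', [1, 3])
print(min_row)      # ('a', [1, 2, 3])
print(longest_row)  # ('c', [1, 2, 3, 4])
```

Under this ordering, the maximum is the collection that sorts last, which is not necessarily the one with the most elements.
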

However, a bug in the MIN/MAX aggregate functions caused the results to be
presented in their unserialized form, although the row selection was
correct. That bug was recently fixed by CASSANDRA-17811. During that ticket
we also considered simply disabling MIN/MAX on collections, since applying
those functions to collections doesn't seem super useful. However, that
option was quickly discarded and the operation was fixed so that the MIN/MAX
functions work correctly for every data type.

As a byproduct of the internal improvements of that fix, CASSANDRA-8877
introduced a new set of functions that can perform aggregations of the
elements of a collection. Those were named "map_keys", "map_values",
"collection_min", "collection_max", "collection_sum", and
"collection_count". Those are the names mentioned on the mailing list thread
about function naming conventions. Despite doing a kind of
within-collection aggregation, these functions are not what we usually call
aggregate functions, since they don't aggregate multiple rows together.

On a different line of work, CASSANDRA-17425 added to trunk a MAXWRITETIME
function to get the max timestamp of a multi-cell column. However, the new
collection functions can be used in combination with the WRITETIME and TTL
functions to retrieve the min/max/sum/avg timestamp or ttl of a multi-cell
column. Since the new functions give a generic way of aggregating
timestamps and TTLs of multi-cell columns, CASSANDRA-18078 proposed to
remove that MAXWRITETIME function.

Yifan Cai, author of the MAXWRITETIME function, agreed to remove that
function in favour of the new generic collection functions. However, the
MAXWRITETIME function can work on both single-cell and multi-cell columns,
whereas "COLLECTION_MAX(WRITETIME(column))" would only work on multi-cell
columns. That's because WRITETIME of a non-multi-cell column doesn't
return a collection, and one should simply use "WRITETIME(column)" instead.
So it was proposed in CASSANDRA-18037 that collection functions applied to
a non-collection value treat that value as the only element of a
singleton collection. So, for example, COLLECTION_MAX(7) =
COLLECTION_MAX([7]) = 7. That ticket has already been reviewed and it's
mostly ready to commit.
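
The singleton-collection rule can be sketched in Python as follows. This is only a model of the proposed semantics: the function names `as_collection` and `collection_max` and the timestamp values are hypothetical, and an empty collection is not handled.

```python
def as_collection(value):
    """Wrap a non-collection value in a singleton list; copy collections into a list."""
    if isinstance(value, (list, tuple, set, frozenset)):
        return list(value)
    return [value]


def collection_max(value):
    """Hypothetical model of COLLECTION_MAX under the singleton-collection rule."""
    return max(as_collection(value))


# Made-up write timestamps: a multi-cell column has one per element,
# a single-cell column has exactly one.
multi_cell_writetimes = [1670321000000001, 1670321000000007, 1670321000000003]
single_cell_writetime = 1670321000000005

print(collection_max(7))                      # 7
print(collection_max([7]))                    # 7
print(collection_max(multi_cell_writetimes))  # 1670321000000007
print(collection_max(single_cell_writetime))  # 1670321000000005
```

With this rule the same expression covers both single-cell and multi-cell columns, which is the case MAXWRITETIME handled.
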

Now we can go straight to the point:

Recently Benedict brought back the idea of deprecating aggregate functions
applied to collections, the very same idea that was mentioned in the
CASSANDRA-17811 description almost four months ago. That way we could
rename the new collection functions to MIN/MAX/SUM/AVG, the same as the classic
aggregate functions. MIN/MAX/SUM/AVG would then be aggregate
functions when applied to non-collection columns, and scalar functions when
applied to collections. We can't do that with COUNT because there would be
an ambiguity, so the proposal for that case is renaming COLLECTION_COUNT to
SIZE. Benedict, please correct me if I'm not presenting the proposal
correctly.

I however would prefer to keep aggregate functions working on collections,
and keep the names of the new collection functions as "COLLECTION_*".
Reasons are:

1 - Making aggregate functions not work on collections might be considered
as breaking backward compatibility and require a deprecation plan.
2 - Keeping aggregate functions working on collections might not look
super useful, but they make the set of aggregate functions consistent and
applicable to every column type.
3 - Using the "COLLECTION_" prefix on collection functions establishes a
clear distinction between row aggregations and collection aggregations,
while at the same time exposing the analogy between each pair of functions.
4 - Not using the "COLLECTION_" prefix forces us to search for workarounds
such as using the column type when possible, or trying to figure out
synonyms like in the case of COUNT/SIZE. Even if that works for this case,
future functions may run into more trouble when trying to find
workarounds to avoid clashing with existing function names. For example, we
might want to add a SIZE function that gets the size in bytes of any
column, or we might want to add a MAX function that gets the maximum of a
set of columns, etc. An example of the synonym-based approach that comes
to mind is MySQL's MAX and GREATEST functions, where MAX is for row
aggregation and GREATEST is for column aggregation.
5 - If MIN/MAX function selection is based on the column type, we can't
implement Yifan's proposal of making COLLECTION_MAX(7) =
COLLECTION_MAX([7]) = 7, which would be very useful for combining
collection functions with time functions.

What do others think?

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-06 Thread Benedict
Thanks Andres, I think community input on direction here will be invaluable. 
There’s a bunch of interrelated tickets, and my opinions are as follows:

1. I think it is a mistake to offer a function MAX that operates over rows 
containing collections, returning the collection with the most elements. This 
is just a nonsensical operation to support IMO. We should decide as a community 
whether we “fix” this aggregation, or remove it.
2. I think “collection_” prefixed methods are non-intuitive for discovery, and 
all else equal it would be better to use MAX, MIN, etc., same as for aggregations.
3. I think it is peculiar to permit methods named collection_ to operate over 
non-collection types when they are explicitly collection variants.

Given (1), (2) becomes simple except for COUNT which remains ambiguous, but 
this could be solved by either providing a separate method for collections 
(e.g. SIZE) which seems fine to me, or by offering a precedence order for 
matching and a keyword for overriding the precedence order (e.g. 
COUNT(collection AS COLLECTION)).
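
A small Python sketch of why COUNT stays ambiguous on a collection column: both the row aggregate and the per-row scalar reading are meaningful. The rows, the column name "tags", and the helper names `count_rows` and `size` are hypothetical, chosen only for illustration.

```python
# Made-up rows with a list column named "tags".
rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": ["c"]},
    {"id": 3, "tags": ["d", "e", "f"]},
]


def count_rows(rows, column):
    """Row aggregate: how many rows have a value for the column."""
    return sum(1 for row in rows if row[column] is not None)


def size(collection):
    """Scalar, per-row function: how many elements the collection holds."""
    return len(collection)


print(count_rows(rows, "tags"))             # 3
print([size(row["tags"]) for row in rows])  # [2, 1, 3]
```

A separate name such as SIZE keeps the two readings apart without needing a precedence rule.
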

Given (2), (3) is a little more difficult. However, I think this can be solved 
several ways. 
 - We could permit explicit casts to collection types, that for a collection 
type would be a no-op, and for a single value would create a collection
 - With precedence orders, by always selecting the scalar function last
 - By permitting WRITETIME to accept a binary operator reduce function to 
resolve multiple values
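
The last option can be sketched in Python as below. This is only a model of the idea, assuming the reduce function is a plain binary operator folded over the per-cell timestamps; the `writetime` signature and the sample timestamps are hypothetical and do not describe an existing CQL feature.

```python
from functools import reduce
from typing import Callable, List, Optional, Union


def writetime(
    cell_timestamps: List[int],
    reducer: Optional[Callable[[int, int], int]] = None,
) -> Union[int, List[int]]:
    """Hypothetical model of WRITETIME with an optional reduce function.

    cell_timestamps holds one timestamp per cell: several for a multi-cell
    column, exactly one for a single-cell column.
    """
    if reducer is None:
        # Without a reducer, return every per-cell timestamp.
        return list(cell_timestamps)
    # Fold the timestamps pairwise with the binary operator.
    return reduce(reducer, cell_timestamps)


print(writetime([3, 9, 5]))       # [3, 9, 5]
print(writetime([3, 9, 5], max))  # 9
print(writetime([3, 9, 5], min))  # 3
print(writetime([7], max))        # 7
```

Because a single timestamp reduces to itself, this shape works for single-cell and multi-cell columns alike without a collection_ prefixed function.
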

These decisions all imply trade-offs on each other, and affect the evolution of 
CQL, so I think community input would be helpful.


Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-06 Thread Benjamin Lerer
I am sorry but I still do not understand what problem we are trying to
solve.
All examples given so far have been about significant features, which we
already discuss on this mailing list, not about minor changes that happen
multiple times per week.
Is it a trust issue?


Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-06 Thread Aleksey Yeshchenko
Public APIs are 1) essentially forever-lasting and 2) are what our users 
actually get to interact with. A suboptimal public API can be annoying to work 
with at best, and at worst force and cement suboptimal implementations.

The goal here is to make sure that whatever public API changes we introduce are 
as good as they can be, first time around. Getting as many diverse eyes on it 
as possible helps with achieving this goal. Making these changes more visible 
and allowing for longer periods of revision maximises the opportunity for 
someone to spot an issue or suggest an improvement.

This isn’t about trust, but about recognition of one’s own limitations. Most 
active committers - *absolute* most of us - are indeed *not* Cassandra users or 
Cassandra operators. Our predominant interaction with Cassandra is via editing 
Java code in our IDEs. We don’t usually have a lot of experience or skin in the 
game when it comes to consuming Cassandra’s APIs. We should welcome and 
actively seek inputs of those who do. Giving more time to other developers to 
react and contribute is pretty important as well.

The mechanisms suggested here don’t strike me as being too costly. Starting a 
lightweight informal thread even for every individual proposal is no huge deal, 
surely. We aren’t talking about CEP level of commitment here.

It’s not the first time you bring up trust, I feel, but there really is no need 
to go all defensive here. No person or organisation is being singled out. 
Admitting that API design can genuinely benefit from user input and input of 
others in general, to me, is productive humility - a sign of maturity. It’s not 
a reason to be offended.

> On 6 Dec 2022, at 13:53, Benjamin Lerer  wrote:
> 
> I am sorry but I still do not understand what problem we are trying to solve.
> All examples given so far have been about significant features which we 
> already discuss on this mailing not about minor changes that happen multiple 
> times per week.
> Is it a trust issue ?



Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-06 Thread Ekaterina Dimitrova
If the big works are already accompanied by discussions, I do not see a
reason a list pointing to Jira tickets with small API changes could not
serve our needs. A nice label and commitment that API change will be
brought in the description with a few sentences seems more than enough to
me.

Also, I am confused… when I asked about the access of JDK internals which
can lead to trouble, breaking changes and urgent patches needed, we agreed
Jira is enough and we shouldn’t trigger too much noise here, but for small
API changes this is preferable? What am I missing here? Maybe there is
something more that people have in mind that I struggle to see?

On Tue, 6 Dec 2022 at 8:53, Benjamin Lerer  wrote:

> I am sorry but I still do not understand what problem we are trying to
> solve.
> All examples given so far have been about significant features which we
> already discuss on this mailing not about minor changes that happen
> multiple times per week.
> Is it a trust issue ?
>


Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-06 Thread Ekaterina Dimitrova
Aleksey, I absolutely agree with the point that users will be able to give more
input, help early on with feedback, etc. But I struggle to see why having
two sentences on the ticket that explicitly mention the exact API change,
which could even go in a new, separate required field, is not enough to trigger
discussion if needed. We could even make it mandatory:
“Is this an API change?” and if the answer is “yes”, people should explain in
a few sentences.

It seems neat to me, streamlined and easy to find historically in tickets.
Especially if we consider many of them will be accepted with lazy consensus
as it was pointed out earlier on this thread.



Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-06 Thread Benjamin Lerer
>
> It’s not the first time you bring up trust, I feel, but there really is no
> need to go all defensive here.


I am not defensive. I am simply trying to understand the need to put in
place a process that has a high cost in terms of time and effort for small
changes.
So far nobody has been able to provide me with examples of times where it
would have been needed. I am sorry. I see the cost not the benefit.




Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-06 Thread Jeremiah D Jordan
> 1. I think it is a mistake to offer a function MAX that operates over rows 
> containing collections, returning the collection with the most elements. This 
> is just a nonsensical operation to support IMO. We should decide as a 
> community whether we “fix” this aggregation, or remove it.

The current MAX function does not work this way, afaik. It returns the row with 
the column that has the highest value in the clustering-order sense, as if the 
collection were used as a clustering key. While that may also have limited use, 
I don’t think it is worthwhile to deprecate such use, given all the headache that 
comes with doing so.

> 2. I think “collection_" prefixed methods are non-intuitive for discovery, 
> and all-else equal it would be better to use MAX,MIN, etc, same as for 
> aggregations.

If we actually wanted to move towards using the existing names with new 
meanings, then I think that would take us multiple major releases.  First 
deprecate existing use in current releases.  Then make it an error in the next 
major release X.  Then change the behavior in major release X+1.  Just 
switching the behavior without having a major where such queries error out 
would make a bunch of user queries start returning “wrong” data.
Also I don’t think those functions being cross row aggregations for some column 
types, but within row collection operations for other types, is any more 
intuitive, and actually would be more confusing.  So I am -1 on using the same 
names.

> 3. I think it is peculiar to permit methods named collection_ to operate over 
> non-collection types when they are explicitly collection variants.

While I could see some point to this, I do not think it would be confusing for 
something named collection_XXX to treat a non-collection as a collection of 1.  
But maybe there is a better name for these functions.  Rather than seeing them 
as collection variants, we should see them as variants that operate on the data 
in a single row, rather than aggregating across multiple rows.  But even with 
that perspective I don’t know what the best name would be.
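
As a rough illustration of "treat a non-collection as a collection of 1", the 
sketch below defines a within-row maximum in Python. The name collection_max 
and its behaviour on empty collections are assumptions made for illustration 
only; this is not how Cassandra implements its functions.

```python
from collections.abc import Iterable


def collection_max(value):
    """Hypothetical within-row maximum; not Cassandra's implementation."""
    # Strings are iterable in Python but stand in for scalar text columns here.
    if isinstance(value, Iterable) and not isinstance(value, (str, bytes)):
        # An empty collection yields None (an assumption of this sketch).
        return max(value, default=None)
    # A scalar behaves like a collection holding exactly one element.
    return value


print(collection_max([3, 7, 5]))  # 7
print(collection_max(4))          # 4
```

Either way the function looks only at the value in one row and never combines 
values across rows.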

> On Dec 6, 2022, at 7:30 AM, Benedict  wrote:
> 
> Thanks Andres, I think community input on direction here will be invaluable. 
> There’s a bunch of interrelated tickets, and my opinions are as follows:
> 
> 1. I think it is a mistake to offer a function MAX that operates over rows 
> containing collections, returning the collection with the most elements. This 
> is just a nonsensical operation to support IMO. We should decide as a 
> community whether we “fix” this aggregation, or remove it.
> 2. I think “collection_" prefixed methods are non-intuitive for discovery, 
> and all-else equal it would be better to use MAX,MIN, etc, same as for 
> aggregations.
> 3. I think it is peculiar to permit methods named collection_ to operate over 
> non-collection types when they are explicitly collection variants.
> 
> Given (1), (2) becomes simple except for COUNT which remains ambiguous, but 
> this could be solved by either providing a separate method for collections 
> (e.g. SIZE) which seems fine to me, or by offering a precedence order for 
> matching and a keyword for overriding the precedence order (e.g. 
> COUNT(collection AS COLLECTION)).
> 
> Given (2), (3) is a little more difficult. However, I think this can be 
> solved several ways. 
> - We could permit explicit casts to collection types, that for a collection 
> type would be a no-op, and for a single value would create a collection
> - With precedence orders, by always selecting the scalar function last
> - By permitting WRITETIME to accept a binary operator reduce function to 
> resolve multiple values
> 
> These decisions all imply trade-offs on each other, and affect the evolution 
> of CQL, so I think community input would be helpful.
> 
>> On 6 Dec 2022, at 12:44, Andrés de la Peña  wrote:
>> 
>> 
>> This will require some long introduction for context:
>> 
>> The MAX/MIN functions aggregate rows to get the row with min/max column 
>> value according to their comparator. For collections, the comparison is on 
>> the lexicographical order of the collection elements. That's the very same 
>> comparator that is used when collections are used as clustering keys and for 
>> ORDER BY.
>> 
>> However, a bug in the MIN/MAX aggregate functions used to make that the 
>> results were presented in their unserialized form, although the row 
>> selection was correct. That bug was recently solved by CASSANDRA-17811. 
>> During that ticket it was also considered the option of simply disabling 
>> MIN/MAX on collection since applying those functions to collections, since 
>> they don't seem super useful. However, that option was quickly discarded and 
>> the operation was fixed so the MIN/MAX functions correctly work for every 
>> data type.
>> 
>> As a byproduct of the internal improvements of that fix, CASSANDRA-8877 
>> introduced a new set of functions that can perform aggregations of the 
>> ele

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-06 Thread Benedict
As far as I am aware it has never worked in a release, and so deprecating it is 
probably not as challenging as you think. Only folk that have been able to 
parse the raw bytes of the collection in storage format would be affected - 
which we can probably treat as zero.


> On 6 Dec 2022, at 17:31, Jeremiah D Jordan  wrote:
> 
> 
>> 
>> 1. I think it is a mistake to offer a function MAX that operates over rows 
>> containing collections, returning the collection with the most elements. 
>> This is just a nonsensical operation to support IMO. We should decide as a 
>> community whether we “fix” this aggregation, or remove it.
> 
> The current MAX function does not work this way afaik?  It returns the row 
> with the column that has the highest value in clustering order sense, like if 
> the collection was used as a clustering key.  While that also may have 
> limited use, I don’t think it worth while to deprecate such use and all the 
> headache that comes with doing so.
> 
>> 2. I think “collection_" prefixed methods are non-intuitive for discovery, 
>> and all-else equal it would be better to use MAX,MIN, etc, same as for 
>> aggregations.
> 
> If we actually wanted to move towards using the existing names with new 
> meanings, then I think that would take us multiple major releases.  First 
> deprecate existing use in current releases.  Then make it an error in the 
> next major release X.  Then change the behavior in major release X+1.  Just 
> switching the behavior without having a major where such queries error out 
> would make a bunch of user queries start returning “wrong” data.
> Also I don’t think those functions being cross row aggregations for some 
> column types, but within row collection operations for other types, is any 
> more intuitive, and actually would be more confusing.  So I am -1 on using 
> the same names.
> 
>> 3. I think it is peculiar to permit methods named collection_ to operate 
>> over non-collection types when they are explicitly collection variants.
> 
> While I could see some point to this, I do not think it would be confusing 
> for something named collection_XXX to treat a non-collection as a collection 
> of 1.  But maybe there is a better name for these function.  Rather than 
> seeing them as collection variants, we should see them as variants that 
> operate on the data in a single row, rather than aggregating across multiple 
> rows.  But even with that perspective I don’t know what the best name would 
> be.
> 
>> On Dec 6, 2022, at 7:30 AM, Benedict  wrote:
>> 
>> Thanks Andres, I think community input on direction here will be invaluable. 
>> There’s a bunch of interrelated tickets, and my opinions are as follows:
>> 
>> 1. I think it is a mistake to offer a function MAX that operates over rows 
>> containing collections, returning the collection with the most elements. 
>> This is just a nonsensical operation to support IMO. We should decide as a 
>> community whether we “fix” this aggregation, or remove it.
>> 2. I think “collection_" prefixed methods are non-intuitive for discovery, 
>> and all-else equal it would be better to use MAX,MIN, etc, same as for 
>> aggregations.
>> 3. I think it is peculiar to permit methods named collection_ to operate 
>> over non-collection types when they are explicitly collection variants.
>> 
>> Given (1), (2) becomes simple except for COUNT which remains ambiguous, but 
>> this could be solved by either providing a separate method for collections 
>> (e.g. SIZE) which seems fine to me, or by offering a precedence order for 
>> matching and a keyword for overriding the precedence order (e.g. 
>> COUNT(collection AS COLLECTION)).
>> 
>> Given (2), (3) is a little more difficult. However, I think this can be 
>> solved several ways. 
>> - We could permit explicit casts to collection types, that for a collection 
>> type would be a no-op, and for a single value would create a collection
>> - With precedence orders, by always selecting the scalar function last
>> - By permitting WRITETIME to accept a binary operator reduce function to 
>> resolve multiple values
>> 
>> These decisions all imply trade-offs on each other, and affect the evolution 
>> of CQL, so I think community input would be helpful.
>> 
>>> On 6 Dec 2022, at 12:44, Andrés de la Peña wrote:
>>> 
>>> 
>>> This will require some long introduction for context:
>>> 
>>> The MAX/MIN functions aggregate rows to get the row with min/max column 
>>> value according to their comparator. For collections, the comparison is on 
>>> the lexicographical order of the collection elements. That's the very same 
>>> comparator that is used when collections are used as clustering keys and 
>>> for ORDER BY.
>>> 
>>> However, a bug in the MIN/MAX aggregate functions used to make that the 
>>> results were presented in their unserialized form, although the row 
>>> selection was correct. That bug was recently solved by CASSANDRA-17811. 
>>> During that ticket it was also consi

Re: [VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-06 Thread Mick Semb Wever
>
> Let’s get these fixes in and roll a pure RC2.
>


Given that we're talking about two one-liners, just changing time units,
what are folks' thoughts about skipping rc2 and just re-cutting 4.1.0 (GA)?

(I agree with the rules, and am in favour of making the exception.)


Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-06 Thread Aleksey Yeshchenko
>  I am simply trying to understand the need to put in place a process that has 
> a high cost in terms of time and effort for small changes.

It is an additional cost, but it’s not a high cost. And certainly not a high 
*marginal* cost - when compared to all the admin involved in getting a patch 
committed. A tiny extra step. One quick email is nothing. If a conversation 
comes out of it, then yes, engaging with it will take time, but then again, 
it’ll also mean that the effort has not been wasted and has attracted input.

> So far nobody has been able to provide me with examples of times where it 
> would have been needed. I am sorry. I see the cost not the benefit.

Many deprecated config params and bits of DDL, for example, would qualify. Some 
CQL beyond DDL - including early authn/authz bits introduced by me - could’ve 
been designed better the first time around and not needed a rewrite/deprecation.

> On 6 Dec 2022, at 16:21, Benjamin Lerer  wrote:
> 
>> It’s not the first time you bring up trust, I feel, but there really is no 
>> need to go all defensive here.
> 
> I am not defensive. I am simply trying to understand the need to put in place 
> a process that has a high cost in terms of time and effort for small changes.
> So far nobody has been able to provide me with examples of times where it 
> would have been needed. I am sorry. I see the cost not the benefit.
>
> 
> On Tue, 6 Dec 2022 at 16:18, Aleksey Yeshchenko wrote:
>> Public APIs are 1) essentially forever-lasting and 2) are what our users 
>> actually get to interact with. A suboptimal public API can be annoying to 
>> work with at best, and at worst force and cement suboptimal implementations.
>> 
>> The goal here is to make sure that whatever public API changes we introduce 
>> are as good as they can be, first time around. Getting as many diverse eyes 
>> on it as possible helps with achieving this goal. Making these changes more 
>> visible and allowing for longer periods of revision maximises the 
>> opportunity for someone to spot an issue or suggest an improvement.
>> 
>> This isn’t about trust, but about recognition of one’s own limitations. Most 
>> active committers - *absolute* most of us - are indeed *not* Cassandra users 
>> or Cassandra operators. Our predominant interaction with Cassandra is via 
>> editing Java code in our IDEs. We don’t usually have a lot of experience or 
>> skin in the game when it comes to consuming Cassandra’s APIs. We should 
>> welcome and actively seek inputs of those who do. Giving more time to other 
>> developers to react and contribute is pretty important as well.
>> 
>> The mechanisms suggested here don’t strike me as being too costly. Starting 
>> a lightweight informal thread even for every individual proposal is no huge 
>> deal, surely. We aren’t talking about CEP level of commitment here.
>> 
>> It’s not the first time you bring up trust, I feel, but there really is no 
>> need to go all defensive here. No person or organisation is being singled 
>> out. Admitting that API design can genuinely benefit from user input and 
>> input of others in general, to me, is productive humility - a sign of 
>> maturity. It’s not a reason to be offended.
>> 
>> > On 6 Dec 2022, at 13:53, Benjamin Lerer wrote:
>> > 
>> > I am sorry but I still do not understand what problem we are trying to 
>> > solve.
>> > All examples given so far have been about significant features which we 
>> > already discuss on this mailing not about minor changes that happen 
>> > multiple times per week.
>> > Is it a trust issue ?
>> 



Re: [VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-06 Thread Marianne Lyne Manaog
Here is CASSANDRA-18097 with the bug fix for the performance regression
encountered with 100 partitions in V2.
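
The summary statistics quoted below can be sanity-checked with a short script. 
This is only a sketch: it takes the reported averages and standard deviations 
as given (the thread does not state their units), recomputes the coefficient 
of variation as standard deviation divided by mean, and adds a V2-to-V1 ratio 
of the averages, which is not a figure from the original message.

```python
# (mean, standard deviation) pairs copied from the quoted results.
results = {
    "v1_read": (28948, 416.271),
    "v1_write": (19248, 158.595),
    "v2_read": (12307, 2367.473),
    "v2_write": (5780, 1154.261),
}


def cv_percent(mean, stddev):
    """Coefficient of variation, expressed as a percentage."""
    return 100.0 * stddev / mean


for name, (mean, stddev) in results.items():
    print(f"{name}: CV = {cv_percent(mean, stddev):.2f}%")

# Ratio of V2 to V1 averages (derived here, not reported in the thread).
read_ratio = results["v2_read"][0] / results["v1_read"][0]
write_ratio = results["v2_write"][0] / results["v1_write"][0]
print(f"read ratio V2/V1: {read_ratio:.2f}")
print(f"write ratio V2/V1: {write_ratio:.2f}")
```

The recomputed percentages match the quoted ones (1.44%, 0.82%, 19.24%, 
19.97%).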

On Mon, Dec 5, 2022 at 2:05 PM Marianne Lyne Manaog <
marianne.man...@ieee.org> wrote:

> Following on what Matt said:
>
> - Here is the link to the Cassandra repo with the bugfix of wait time from
> ms to ns:
> https://github.com/apache/cassandra/compare/trunk...marianne-manaog:cassandra:bugfix/wait-from-ms-to-ns
>
> - the Paxos configuration used is:
>
>   paxos_contention_wait_randomizer: uniform
>
>   paxos_contention_min_wait: 0
>
>   paxos_contention_max_wait: 100ms
>
> - V1 and V2 have the same configurations except for paxos_variant: which
> changes accordingly
>
> *Results: V1 (100 partitions)*
>
> - Average read: 28948
>
> - Standard Deviation: 416.271
>
> - Coefficient of variance: 1.44%
>
> - Average write: 19248
>
> - Standard Deviation:158.595
>
> - Coefficient of variance:0.82%
>
> *Results: V2 (100 partitions)*
>
> - Average read: 12307
>
> - Standard Deviation: 2367.473
>
> - Coefficient of variance: 19.24%
>
> - Average write: 5780
>
> - Standard Deviation: 1154.261
>
> - Coefficient of variance: 19.97%
>
>
> On Mon, Dec 5, 2022 at 1:50 PM Matt Fleming 
> wrote:
>
>> Me and Marianne are also still chasing a performance issue with Paxos v2
>> when compared with v1. We
>> see way more contention on v2 for a LOCAL_SERIALIZABLE workload that
>> writes/reads to only 100
>> partitions (v2 performs better for higher partition counts). We're still
>> investigating what's going
>> on.
>>
>> Should that be a -1 vote? I'm not sure :)
>>
>> On Mon, 5 Dec 2022 at 11:37, Benedict  wrote:
>>
>>> -0
>>>
>>> CASSANDRA-18086 should probably be fixed and merged first, as Paxos v2
>>> will be unlikely to work well for users without it. Either that or we need
>>> to update NEWS.txt to mention it.
>>>
>>> On 5 Dec 2022, at 11:01, Aleksey Yeshchenko  wrote:
>>>
>>> +1
>>>
>>> On 5 Dec 2022, at 10:17, Benjamin Lerer  wrote:
>>>
>>> +1
>>>
>>> On Mon, 5 Dec 2022 at 11:02, Berenguer Blasi wrote:
>>>
 +1
 On 5/12/22 10:53, guo Maxwell wrote:

 +1

 On Mon, 5 Dec 2022 at 5:33 PM, Mick Semb Wever wrote:

>
> Proposing the test build of Cassandra 4.1.0 GA for release.
>
> sha1: b807f97b37933fac251020dbd949ee8ef245b158
> Git:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.1.0-tentative
> Maven Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1281/org/apache/cassandra/cassandra-all/4.1.0/
>
> The Source and Build Artifacts, and the Debian and RPM packages and
> repositories, are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/4.1.0/
>
> The vote will be open for 72 hours (longer if needed). Everyone who
> has tested the build is invited to vote. Votes by PMC members are
> considered binding. A vote passes if there are at least three binding +1s
> and no -1's.
>
> [1]: CHANGES.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.1.0-tentative
> [2]: NEWS.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.1.0-tentative
>
 --
 you are the apple of my eye !


>>>


Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-06 Thread J. D. Jordan
If the functionality truly has never actually worked, then throwing an error 
that MAX is not supported for collections seems reasonable.

But we should throw an error. I do not think functions that aggregate across 
rows and functions that operate within a row should share the same name.

My expectation as a user would be that MAX either always aggregates across 
rows, so results in a single row of output or always operates within a row, so 
returns the full set of rows matching the query.

So if we want a max that aggregates across rows that works for collections we 
could change it to return the aggregated max across all rows. Or we just leave 
it as an error and if someone wants the max across all rows they would ask for 
MAX(COLLECTION_MAX(column)). Yes, I still agree COLLECTION_MAX may be a bad name.
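
A small sketch of that composition, using Python in place of CQL: the 
within-row step returns one value per row, and the cross-row aggregate then 
reduces those values to a single result. The rows and the "scores" column are 
hypothetical.

```python
# Hypothetical rows, each with a collection column named "scores".
rows = [
    {"id": 1, "scores": [3, 7, 5]},
    {"id": 2, "scores": [9, 1]},
    {"id": 3, "scores": [4]},
]

# Within-row step (the COLLECTION_MAX role): one output value per input row.
per_row_max = [max(row["scores"]) for row in rows]

# Cross-row step (the MAX role): a single output value for the whole result set.
overall_max = max(per_row_max)

print(per_row_max)  # [7, 9, 4]
print(overall_max)  # 9
```

The first step keeps the row count unchanged and the second collapses it to 
one, which is why sharing a single name for both would be ambiguous.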

> On Dec 6, 2022, at 11:55 AM, Benedict  wrote:
> 
> As far as I am aware it has never worked in a release, and so deprecating it 
> is probably not as challenging as you think. Only folk that have been able to 
> parse the raw bytes of the collection in storage format would be affected - 
> which we can probably treat as zero.
> 
> 
>> On 6 Dec 2022, at 17:31, Jeremiah D Jordan  wrote:
>> 
>> 
>>> 
>>> 1. I think it is a mistake to offer a function MAX that operates over rows 
>>> containing collections, returning the collection with the most elements. 
>>> This is just a nonsensical operation to support IMO. We should decide as a 
>>> community whether we “fix” this aggregation, or remove it.
>> 
>> The current MAX function does not work this way afaik?  It returns the row 
>> with the column that has the highest value in clustering order sense, like 
>> if the collection was used as a clustering key.  While that also may have 
>> limited use, I don’t think it worth while to deprecate such use and all the 
>> headache that comes with doing so.
>> 
>>> 2. I think “collection_" prefixed methods are non-intuitive for discovery, 
>>> and all-else equal it would be better to use MAX,MIN, etc, same as for 
>>> aggregations.
>> 
>> If we actually wanted to move towards using the existing names with new 
>> meanings, then I think that would take us multiple major releases.  First 
>> deprecate existing use in current releases.  Then make it an error in the 
>> next major release X.  Then change the behavior in major release X+1.  Just 
>> switching the behavior without having a major where such queries error out 
>> would make a bunch of user queries start returning “wrong” data.
>> Also I don’t think those functions being cross row aggregations for some 
>> column types, but within row collection operations for other types, is any 
>> more intuitive, and actually would be more confusing.  So I am -1 on using 
>> the same names.
>> 
>>> 3. I think it is peculiar to permit methods named collection_ to operate 
>>> over non-collection types when they are explicitly collection variants.
>> 
>> While I could see some point to this, I do not think it would be confusing 
>> for something named collection_XXX to treat a non-collection as a collection 
>> of 1.  But maybe there is a better name for these function.  Rather than 
>> seeing them as collection variants, we should see them as variants that 
>> operate on the data in a single row, rather than aggregating across multiple 
>> rows.  But even with that perspective I don’t know what the best name would 
>> be.
>> 
>>> On Dec 6, 2022, at 7:30 AM, Benedict wrote:
>>> 
>>> Thanks Andres, I think community input on direction here will be 
>>> invaluable. There’s a bunch of interrelated tickets, and my opinions are as 
>>> follows:
>>> 
>>> 1. I think it is a mistake to offer a function MAX that operates over rows 
>>> containing collections, returning the collection with the most elements. 
>>> This is just a nonsensical operation to support IMO. We should decide as a 
>>> community whether we “fix” this aggregation, or remove it.
>>> 2. I think “collection_" prefixed methods are non-intuitive for discovery, 
>>> and all-else equal it would be better to use MAX,MIN, etc, same as for 
>>> aggregations.
>>> 3. I think it is peculiar to permit methods named collection_ to operate 
>>> over non-collection types when they are explicitly collection variants.
>>> 
>>> Given (1), (2) becomes simple except for COUNT which remains ambiguous, but 
>>> this could be solved by either providing a separate method for collections 
>>> (e.g. SIZE) which seems fine to me, or by offering a precedence order for 
>>> matching and a keyword for overriding the precedence order (e.g. 
>>> COUNT(collection AS COLLECTION)).
>>> 
>>> Given (2), (3) is a little more difficult. However, I think this can be 
>>> solved several ways. 
>>> - We could permit explicit casts to collection types, that for a collection 
>>> type would be a no-op, and for a single value would create a collection
>>> - With precedence orders, by always selecting the scalar function last
>>> - By permitting WRI

Re: [VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-06 Thread Aleksey Yeshchenko
Sure.

> On 6 Dec 2022, at 18:14, Mick Semb Wever  wrote:
> 
>> Let’s get these fixes in and roll a pure RC2.
> 
> 
> 
> Given that we're talking about two one-liners, just changing time units, what 
> are folks thoughts about skipping rc2 and just re-cutting 4.1.0 (GA) ?
> 
> (I agree with the rules, and am in favour of making the exception.)
>