Re: [EXTERNAL] Re: [DISCUSS] Next release date

2023-03-13 Thread Andrés de la Peña
>
> Should we clarify that part first by getting an idea of the status of the
> different CEPs and other big pieces of work?


CEP-20 (dynamic data masking) should hopefully be ready by the end of this
month.

It's composed of seven small tickets. Five of those tickets are ready, and
two are under review. Altogether it will be ~6K LOC, touching around 100
files.

On Thu, 9 Mar 2023 at 21:17, Mick Semb Wever  wrote:

> > > > One place we've been weak historically is in distinguishing between
> tickets we consider "nice to have" and things that are "blockers". We don't
> have any metadata that currently distinguishes those two, so determining
> what our burndown leading up to 5.0 looks like is a lot more data massaging
> and hand-waving than I'd prefer right now.
> > >
> > > We distinguish "blockers" with `Priority=Urgent` or
> `Severity=Critical`, or by linking the ticket as blocking to a specific
> ticket that spells it out. We do have the metadata, but yes it requires
> some work…
> >
> > For everything not urgent or a blocker, does it matter whether something
> has a fixver of where we think it's going to land or where we'd like to see
> it land? At the end of the day, neither of those scenarios will actually
> shift a release date if we're proactively putting "blocker / urgent" status
> on new features, improvements, and bugs we think are significant enough to
> delay a release right?
>
>
> Ooops, actually we were using the -beta, and -rc fixVersion
> placeholders to denote the blockers once "the bridge was crossed"
> (while Urgent and Critical is used more broadly, e.g. patch releases).
> If we use this approach, then we could add a 5.0-alpha placeholder
> that indicates a consensus on tickets blocking the branching (if we
> agree alpha1 should be cut at the same time we branch…). IMHO such
> tickets should also still be marked as Urgent, but I suggest we use
> Urgent/Critical as an initial state, and the fixVersion placeholders
> where we have consensus or it is according to our release criteria
> :shrug:
>
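As a rough illustration of the two-stage scheme quoted above (Urgent/Critical as an initial state, fixVersion placeholders once there is consensus), here is a minimal Python sketch. The Ticket class, the ticket keys, and the exact placeholder names "5.0-beta" and "5.0-rc" are assumptions made for the example; only the "5.0-alpha" name and the Priority/Severity values come from the thread.

```python
from dataclasses import dataclass, field

# Placeholder fixVersions. "5.0-alpha" is the one proposed in the thread;
# the "5.0-beta" and "5.0-rc" spellings are assumed for illustration.
BLOCKER_PLACEHOLDERS = {"5.0-alpha", "5.0-beta", "5.0-rc"}


@dataclass
class Ticket:
    """Hypothetical, simplified view of a Jira ticket."""
    key: str
    priority: str = "Normal"
    severity: str = "Normal"
    fix_versions: set = field(default_factory=set)


def blocker_state(ticket):
    """Return 'agreed', 'candidate', or 'none' for a ticket."""
    # A placeholder fixVersion means consensus (or release criteria) says it blocks.
    if ticket.fix_versions & BLOCKER_PLACEHOLDERS:
        return "agreed"
    # Urgent/Critical alone is only the initial, not-yet-agreed state.
    if ticket.priority == "Urgent" or ticket.severity == "Critical":
        return "candidate"
    return "none"


tickets = [
    Ticket("EXAMPLE-1", priority="Urgent"),
    Ticket("EXAMPLE-2", priority="Urgent", fix_versions={"5.0-alpha"}),
    Ticket("EXAMPLE-3", fix_versions={"5.0"}),
]
states = {t.key: blocker_state(t) for t in tickets}
print(states)
```

Keeping "candidate" and "agreed" separate is what would let a burndown count only the tickets the project has actually agreed are blocking.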


Re: [EXTERNAL] Re: [DISCUSS] Next release date

2023-03-13 Thread Mike Adamson
CEP-7 Storage Attached Index is in review with ~430 files and ~70k LOC. The
bulk of the project is in 3 main patches. The first patch (in-memory index
and query path) is merged to the feature branch CASSANDRA-16052 and the
second patch (on-disk write and literal / string index) is in review.

Mike

On Thu, 9 Mar 2023 at 09:13, Branimir Lambov  wrote:

> CEPs 25 (trie-indexed sstables) and 26 (unified compaction strategy)
> should both be ready for review by mid-April.
>
> Both are around 10k LOC, fairly isolated, and in need of a committer to
> review.
>
> Regards,
> Branimir
>
> On Mon, Mar 6, 2023 at 11:25 AM Benjamin Lerer  wrote:
>
>> Sorry, I realized that when I started the discussion I probably did not
>> frame it enough as I see that it is now going into different directions.
>> The concerns I am seeing are:
>> 1) A too small amount of time between releases  is inefficient from a
>> development perspective and from a user perspective. From a development
>> point of view because we are missing time to deliver some features. From a
>> user perspective because they cannot follow with the upgrade.
>> 2) Some features are so anticipated (Accord being the one mentioned) that
>> people would prefer to delay the release to make sure that it is available
>> as soon as possible.
>> 3) We do not know how long we need to go from the freeze to GA. We hope
>> for 2 months but our last experience was 6 months. So delaying the release
>> could mean not releasing this year.
>> 4) For people doing marketing it is really hard to promote a product when
>> you do not know when the release will come and what features might be there.
>>
>> All those concerns are probably even made worse by the fact that we do
>> not have a clear visibility on where we are.
>>
>> Should we clarify that part first by getting an idea of the status of the
>> different CEPs and other big pieces of work? From there we could agree on
>> some timeline for the freeze. We could then discuss how to make predictable
>> the time from freeze to GA.
>>
>>
>>
>> On Sat, 4 Mar 2023 at 18:14, Josh McKenzie wrote:
>>
>>> (for convenience sake, I'm referring to both Major and Minor semver
>>> releases as "major" in this email)
>>>
>>> > The big feature from our perspective for 5.0 is ACCORD (CEP-15) and I
>>> > would advocate to delay until this has sufficient quality to be in
>>> > production.
>>>
>>> This approach can be pretty unpredictable in this domain; often
>>> unforeseen things come up in implementation that can give you a long tail
>>> on something being production ready. For the record - I don't intend to
>>> single Accord out *at all* on this front, quite the opposite given how
>>> much rigor's gone into the design and implementation. I'm just thinking
>>> from my personal experience: everything I've worked on, overseen, or
>>> followed closely on this codebase always has a few tricks up its sleeve
>>> along the way to having edge-cases stabilized.
>>>
>>> Much like on some other recent topics, I think there's a nuanced middle
>>> ground where we take things on a case-by-case basis. Some factors that have
>>> come up in this thread that resonated with me:
>>>
>>> For a given potential release date 'X':
>>> 1. How long has it been since the last release?
>>> 2. How long do we expect qualification to take from a "freeze" (i.e. no
>>> new improvement or features, branch) point?
>>> 3. What body of merged production ready work is available?
>>> 4. What body of new work do we have high confidence will be ready within
>>> Y time?
>>>
>>> I think it's worth defining a loose "minimum bound and upper bound" on
>>> release cycles we want to try and stick with barring extenuating
>>> circumstances. For instance: try not to release sooner than maybe 10 months
>>> out from a prior major, and try not to release later than 18 months out
>>> from a prior major. Make exceptions if truly exceptional things land, are
>>> about to land, or bugs are discovered around those boundaries.
>>>
>>> Applying the above framework to what we have in flight, our last release
>>> date, expectations on CI, etc - targeting an early fall freeze (pending CEP
>>> status) and mid to late fall or December release "feels right" to me.
>>>
>>> With the exception, of course, that if something merges earlier, is
>>> stable, and we feel is valuable enough to cut a major based on that, we do
>>> it.
>>>
>>> ~Josh
>>>
>>> On Fri, Mar 3, 2023, at 7:37 PM, German Eichberger via dev wrote:
>>>
>>> Hi,
>>>
>>> We shouldn't release just for release's sake. Are there enough new
>>> features, and are they working well enough (quality!)?
>>>
>>> The big feature from our perspective for 5.0 is ACCORD (CEP-15) and I
>>> would advocate to delay until this has sufficient quality to be in
>>> production.
>>>
>>> Just because something is released doesn't mean anyone is gonna use it.
>>> To add some operator perspective: Every time there is a new release we need
>>> to decide
>>> 1) are we supporting it
>>
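The loose "minimum bound and upper bound" quoted above (no sooner than about 10 months and no later than 18 months after the prior major) can be sketched as a simple date-window check. This is only an illustration in Python: the last-major date is hypothetical, and the 10 and 18 month defaults are the examples from the quoted email rather than agreed policy.

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to 28 so it stays valid."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, min(d.day, 28))


def release_window(last_major, min_months=10, max_months=18):
    """Return the (earliest, latest) target dates for the next major."""
    return add_months(last_major, min_months), add_months(last_major, max_months)


def classify(candidate, last_major):
    """Say where a candidate release date falls relative to the window."""
    earliest, latest = release_window(last_major)
    if candidate < earliest:
        return "too soon"
    if candidate > latest:
        return "overdue"
    return "in window"


last_major = date(2022, 12, 15)  # hypothetical date of the prior major
print(release_window(last_major))
print(classify(date(2023, 12, 1), last_major))
```

The exceptions mentioned in the email (something exceptional landing, or bugs found near the boundaries) are deliberately left out; they are judgment calls rather than arithmetic.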

[DISCUSS] Remove deprecated CQL functions dateof and unixtimestampof on 5.0

2023-03-13 Thread Andrés de la Peña
The CQL functions "dateof" and "unixtimestampof" were deprecated on
Cassandra 2.2.0, almost eight years ago [1]. They were deprecated in favour
of the then new "totimestamp" and "tounixtimestamp" functions.

I think that we can finally remove those functions in 5.0, since they have
been deprecated for so long.

A note about their deprecation was added to NEWS.txt [2], and they were
marked as deprecated on CQL.textile [3]. They are also listed as deprecated
on the new doc [4].

I came to this while working on the adoption of snake case conventions for
CQL function names on CASSANDRA-18037. It probably doesn't make sense to
add new "date_of" and "unix_timestamp_of" aliases for them.

What do you think? Should we remove them?

[1]
https://github.com/apache/cassandra/commit/c08aaabd95d4872593c29807de6ec1485cefa7fa
[2] https://github.com/apache/cassandra/blob/trunk/NEWS.txt#L1421-L1423
[3]
https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile#time-conversion-functions
[4]
https://github.com/apache/cassandra/blob/trunk/doc/modules/cassandra/pages/cql/functions.adoc#time-conversion-functions
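For anyone who still has queries using the deprecated names, the migration is a straight rename to the replacements named above. The following Python sketch rewrites call sites in a CQL string; the table and column names are made up, and the regex is naive (it would also rewrite a match inside a string literal), so treat it as a starting point rather than a migration tool.

```python
import re

# Deprecated name -> replacement, as stated in the message above.
REPLACEMENTS = {"dateof": "totimestamp", "unixtimestampof": "tounixtimestamp"}

# Match the function name only when it is a whole identifier followed by "(".
_CALL = re.compile(r"\b(dateof|unixtimestampof)(\s*\()", re.IGNORECASE)


def migrate(cql):
    """Rewrite deprecated time-conversion calls to their replacements."""
    return _CALL.sub(lambda m: REPLACEMENTS[m.group(1).lower()] + m.group(2), cql)


# Hypothetical query; "ks.events" and "id" are illustrative names.
query = "SELECT dateOf(id), unixTimestampOf(id) FROM ks.events"
print(migrate(query))
```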


Re: [DISCUSS] Remove deprecated CQL functions dateof and unixtimestampof on 5.0

2023-03-13 Thread Miklosovic, Stefan
I am +1.

Could you please link the ticket to 
https://issues.apache.org/jira/browse/CASSANDRA-17973 ?

Thanks


From: Andrés de la Peña 
Sent: Monday, March 13, 2023 13:22
To: dev@cassandra.apache.org
Subject: [DISCUSS] Remove deprecated CQL functions dateof and unixtimestampof 
on 5.0


The CQL functions "dateof" and "unixtimestampof" were deprecated on Cassandra 
2.2.0, almost eight years ago [1]. They were deprecated in favour of the then 
new "totimestamp" and "tounixtimestamp" functions.

I think that we can finally remove those functions in 5.0, since they have been 
deprecated for so long.

A note about their deprecation was added to NEWS.txt [2], and they were marked 
as deprecated on CQL.textile [3]. They are also listed as deprecated on the new 
doc [4].

I came to this while working on the adoption of snake case conventions for CQL 
function names on CASSANDRA-18037. It probably doesn't make sense to add new 
"date_of" and "unix_timestamp_of" aliases for them.

What do you think? Should we remove them?

[1] 
https://github.com/apache/cassandra/commit/c08aaabd95d4872593c29807de6ec1485cefa7fa
[2] https://github.com/apache/cassandra/blob/trunk/NEWS.txt#L1421-L1423
[3] 
https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile#time-conversion-functions
[4] 
https://github.com/apache/cassandra/blob/trunk/doc/modules/cassandra/pages/cql/functions.adoc#time-conversion-functions


Re: [DISCUSS] Remove deprecated CQL functions dateof and unixtimestampof on 5.0

2023-03-13 Thread Miklosovic, Stefan
Actually, this one https://issues.apache.org/jira/browse/CASSANDRA-18306


From: Miklosovic, Stefan 
Sent: Monday, March 13, 2023 13:26
To: dev@cassandra.apache.org
Subject: Re: [DISCUSS] Remove deprecated CQL functions dateof and 
unixtimestampof on 5.0


I am +1.

Could you please link the ticket to 
https://issues.apache.org/jira/browse/CASSANDRA-17973 ?

Thanks


From: Andrés de la Peña 
Sent: Monday, March 13, 2023 13:22
To: dev@cassandra.apache.org
Subject: [DISCUSS] Remove deprecated CQL functions dateof and unixtimestampof 
on 5.0


The CQL functions "dateof" and "unixtimestampof" were deprecated on Cassandra 
2.2.0, almost eight years ago [1]. They were deprecated in favour of the then 
new "totimestamp" and "tounixtimestamp" functions.

I think that we can finally remove those functions in 5.0, since they have been 
deprecated for so long.

A note about their deprecation was added to NEWS.txt [2], and they were marked 
as deprecated on CQL.textile [3]. They are also listed as deprecated on the new 
doc [4].

I came to this while working on the adoption of snake case conventions for CQL 
function names on CASSANDRA-18037. It probably doesn't make sense to add new 
"date_of" and "unix_timestamp_of" aliases for them.

What do you think? Should we remove them?

[1] 
https://github.com/apache/cassandra/commit/c08aaabd95d4872593c29807de6ec1485cefa7fa
[2] https://github.com/apache/cassandra/blob/trunk/NEWS.txt#L1421-L1423
[3] 
https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile#time-conversion-functions
[4] 
https://github.com/apache/cassandra/blob/trunk/doc/modules/cassandra/pages/cql/functions.adoc#time-conversion-functions


[DISCUSS] New dependencies with Chronicle-Queue update

2023-03-13 Thread Mick Semb Wever
JDK17 requires us to update our chronicle-queue dependency: CASSANDRA-18049

We use chronicle-queue for both audit logging and fql.

This update pulls in a number of new transitive dependencies.

affinity-3.23ea1.jar
asm-analysis-9.2.jar
asm-commons-9.2.jar
asm-tree-9.2.jar
asm-util-9.2.jar
jffi-1.3.9.jar
jna-platform-5.5.0.jar
jnr-a64asm-1.0.0.jar
jnr-constants-0.10.3.jar
jnr-ffi-2.2.11.jar
jnr-x86asm-1.0.2.jar
posix-2.24ea4.jar


More info here:
https://issues.apache.org/jira/browse/CASSANDRA-18049?focusedCommentId=17699393&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17699393


Objections?


Re: [EXTERNAL] Re: [DISCUSS] Next release date

2023-03-13 Thread Berenguer Blasi
TTL (CASSANDRA-14227) is undergoing review and it's in final stages 
afaik. A big rebase and perf re-testing will be needed to confirm all is 
still good. I would expect this to happen this month.


Then the feature flag and downgradability issues, which are unknown atm in 
terms of complexity, are next.


On 13/3/23 12:34, Mike Adamson wrote:
CEP-7 Storage Attached Index is in review with ~430 files and ~70k 
LOC. The bulk of the project is in 3 main patches. The first patch 
(in-memory index and query path) is merged to the feature branch 
CASSANDRA-16052 and the second patch (on-disk write and literal / 
string index) is in review.


Mike

On Thu, 9 Mar 2023 at 09:13, Branimir Lambov  wrote:

CEPs 25 (trie-indexed sstables) and 26 (unified compaction
strategy) should both be ready for review by mid-April.

Both are around 10k LOC, fairly isolated, and in need of a
committer to review.

Regards,
Branimir

On Mon, Mar 6, 2023 at 11:25 AM Benjamin Lerer 
wrote:

Sorry, I realized that when I started the discussion I
probably did not frame it enough as I see that it is now going
into different directions.
The concerns I am seeing are:
1) A too small amount of time between releases is inefficient
from a development perspective and from a user perspective.
From a development point of view because we are missing time
to deliver some features. From a user perspective because they
cannot follow with the upgrade.
2) Some features are so anticipated (Accord being the one
mentioned) that people would prefer to delay the release to
make sure that it is available as soon as possible.
3) We do not know how long we need to go from the freeze to
GA. We hope for 2 months but our last experience was 6 months.
So delaying the release could mean not releasing this year.
4) For people doing marketing it is really hard to promote a
product when you do not know when the release will come and
what features might be there.

All those concerns are probably even made worse by the fact
that we do not have a clear visibility on where we are.

Should we clarify that part first by getting an idea of the
status of the different CEPs and other big pieces of work?
From there we could agree on some timeline for the freeze. We
could then discuss how to make predictable the time from
freeze to GA.



On Sat, 4 Mar 2023 at 18:14, Josh McKenzie wrote:

(for convenience sake, I'm referring to both Major and
Minor semver releases as "major" in this email)


The big feature from our perspective for 5.0 is ACCORD
(CEP-15) and I would advocate to delay until this has
sufficient quality to be in production.

This approach can be pretty unpredictable in this domain;
often unforeseen things come up in implementation that can
give you a long tail on something being production ready.
For the record - I don't intend to single Accord out /at
all/ on this front, quite the opposite given how much
rigor's gone into the design and implementation. I'm just
thinking from my personal experience: everything I've
worked on, overseen, or followed closely on this codebase
always has a few tricks up its sleeve along the way to
having edge-cases stabilized.

Much like on some other recent topics, I think there's a
nuanced middle ground where we take things on a
case-by-case basis. Some factors that have come up in this
thread that resonated with me:

For a given potential release date 'X':
1. How long has it been since the last release?
2. How long do we expect qualification to take from a
"freeze" (i.e. no new improvement or features, branch) point?
3. What body of merged production ready work is available?
4. What body of new work do we have high confidence will
be ready within Y time?

I think it's worth defining a loose "minimum bound and
upper bound" on release cycles we want to try and stick
with barring extenuating circumstances. For instance: try
not to release sooner than maybe 10 months out from a
prior major, and try not to release later than 18 months
out from a prior major. Make exceptions if truly
exceptional things land, are about to land, or bugs are
discovered around those boundaries.

Applying the above framework to what we have in flight,
our last release date, expectations on CI, etc - targeting
an early fall freeze (pending CEP status) and mid to late
fa

[DISCUSS] Lift MessagingService.minimum_version to 40 in trunk

2023-03-13 Thread Mick Semb Wever
If we do not recommend and do not test direct upgrades from 3.x to
5.x, we have the opportunity to clean up a fair chunk of code by
making `MessagingService.minimum_version=40`

As Cassandra versions 4.x and 5.0 are all on
`MessagingService.current_version=40`, lifting
MessagingService.minimum_version would make it equal to the
current_version.

We already don't allow mixed-version streaming. The only
argument I can see for keeping minimum_version=30 is supporting
non-streaming messages between 3.x and 5.0 nodes, which I can't find a
basis for.

An _example_ of the code that can be cleaned up is in the patch
attached to the ticket:
CASSANDRA-18314 – Lift MessagingService.minimum_version to 40

What do you think?
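To make the effect concrete, here is a small Python sketch of the version gate being discussed. The numbers 30 and 40 are the labels used in this email, not necessarily the constants in the codebase, and both function names are hypothetical.

```python
CURRENT_VERSION = 40  # label used in this email for 4.x and 5.0


def accepts(peer_version, minimum_version, current_version=CURRENT_VERSION):
    """True if a peer's messaging version falls in the supported range."""
    return minimum_version <= peer_version <= current_version


def versions_needing_compat(minimum_version, current_version=CURRENT_VERSION,
                            known=(30, 40)):
    """Older versions that still require backward-compatibility code paths."""
    return [v for v in known if minimum_version <= v < current_version]


# With the minimum at 30, a 3.x peer is accepted and compat code is needed.
print(accepts(30, minimum_version=30), versions_needing_compat(30))
# With the minimum lifted to 40, the compat list is empty.
print(accepts(30, minimum_version=40), versions_needing_compat(40))
```

Once the compatibility list is empty, every branch that serializes or deserializes an older format has no caller left, which is the kind of cleanup the attached patch gives an example of.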


Re: [DISCUSS] New dependencies with Chronicle-Queue update

2023-03-13 Thread Jeremiah D Jordan
Given we need to upgrade to support JDK17 it seems fine to me. The only 
concern I have is that some of those libraries are already pretty old; for 
example, the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4 years old. 
I think we should use the most recent versions of all libraries where 
possible?

> On Mar 13, 2023, at 7:42 AM, Mick Semb Wever  wrote:
> 
> JDK17 requires us to update our chronicle-queue dependency: CASSANDRA-18049
> 
> We use chronicle-queue for both audit logging and fql.
> 
> This update pulls in a number of new transitive dependencies.
> 
> affinity-3.23ea1.jar
> asm-analysis-9.2.jar
> asm-commons-9.2.jar
> asm-tree-9.2.jar
> asm-util-9.2.jar
> jffi-1.3.9.jar
> jna-platform-5.5.0.jar
> jnr-a64asm-1.0.0.jar
> jnr-constants-0.10.3.jar
> jnr-ffi-2.2.11.jar
> jnr-x86asm-1.0.2.jar
> posix-2.24ea4.jar
> 
> 
> More info here:
> https://issues.apache.org/jira/browse/CASSANDRA-18049?focusedCommentId=17699393&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17699393
> 
> 
> Objections?
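One way to check the concern about old versions is to split each jar name from the list above into artifact and version and compare numerically. This Python sketch does that; the only "latest" version it knows is the jna-platform 5.13.0 figure from this message, and the parsing rule (version starts at the last "-" followed by a digit) is an assumption that happens to fit these names.

```python
import re

# Artifact name, then "-", then a version that starts with a digit.
_JAR = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.]*)\.jar$")


def parse_jar(filename):
    """Split 'jna-platform-5.5.0.jar' into ('jna-platform', '5.5.0')."""
    m = _JAR.match(filename)
    if m is None:
        raise ValueError(f"unrecognised jar name: {filename}")
    return m.group("name"), m.group("version")


def version_key(version):
    """Numeric sort key; suffixes such as 'ea1' in '3.23ea1' are ignored."""
    return tuple(int(re.match(r"\d+", part).group()) for part in version.split("."))


def is_older(version, latest):
    """Compare component-wise as integers, so 5.5.0 sorts before 5.13.0."""
    return version_key(version) < version_key(latest)


latest_known = {"jna-platform": "5.13.0"}  # figure quoted in this message
jars = ["affinity-3.23ea1.jar", "jna-platform-5.5.0.jar", "jnr-a64asm-1.0.0.jar"]
outdated = [
    name
    for name, version in map(parse_jar, jars)
    if name in latest_known and is_older(version, latest_known[name])
]
print(outdated)
```

Comparing the components as integers matters here: as plain strings, "5.5.0" would sort after "5.13.0".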



Re: [DISCUSS] New dependencies with Chronicle-Queue update

2023-03-13 Thread Brandon Williams
I know it was just an example but we upgraded JNA to 5.13 in
CASSANDRA-18050 as part of the JDK17 effort, so at least that is taken
care of.

Kind Regards,
Brandon

On Mon, Mar 13, 2023 at 10:39 AM Jeremiah D Jordan
 wrote:
>
> Given we need to upgrade to support JDK17 it seems fine to me.  The only 
> concern I have is that some of those libraries are already pretty old, for 
> example the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4 years 
> old.  I think we should we use the most recent versions of all libraries 
> where possible?
>
> > On Mar 13, 2023, at 7:42 AM, Mick Semb Wever  wrote:
> >
> > JDK17 requires us to update our chronicle-queue dependency: CASSANDRA-18049
> >
> > We use chronicle-queue for both audit logging and fql.
> >
> > This update pulls in a number of new transitive dependencies.
> >
> > affinity-3.23ea1.jar
> > asm-analysis-9.2.jar
> > asm-commons-9.2.jar
> > asm-tree-9.2.jar
> > asm-util-9.2.jar
> > jffi-1.3.9.jar
> > jna-platform-5.5.0.jar
> > jnr-a64asm-1.0.0.jar
> > jnr-constants-0.10.3.jar
> > jnr-ffi-2.2.11.jar
> > jnr-x86asm-1.0.2.jar
> > posix-2.24ea4.jar
> >
> >
> > More info here:
> > https://issues.apache.org/jira/browse/CASSANDRA-18049?focusedCommentId=17699393&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17699393
> >
> >
> > Objections?
>


Should we cut some new releases?

2023-03-13 Thread Benjamin Lerer
Hi everybody,

Benedict and Jon recently committed the patch for CASSANDRA-18125, which
fixes some serious problems at the memtable/flush level. Should we consider
cutting some releases that contain this fix?


Re: Should we cut some new releases?

2023-03-13 Thread Ekaterina Dimitrova
+1

On Mon, 13 Mar 2023 at 12:23, Benjamin Lerer  wrote:

> Hi everybody,
>
> Benedict and Jon recently committed the patch for CASSANDRA-18125, which
> fixes some serious problems at the memtable/flush level. Should we consider
> cutting some releases that contain this fix?
>


Re: [DISCUSS] New dependencies with Chronicle-Queue update

2023-03-13 Thread Ekaterina Dimitrova
“ > Given we need to upgrade to support JDK17 it seems fine to me.  The
only concern I have is that some of those libraries are already pretty old,
for example the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4
years old.  I think we should we use the most recent versions of all
libraries where possible?”
+1

On Mon, 13 Mar 2023 at 12:10, Brandon Williams  wrote:

> I know it was just an example but we upgraded JNA to 5.13 in
> CASSANDRA-18050 as part of the JDK17 effort, so at least that is taken
> care of.
>
> Kind Regards,
> Brandon
>
> On Mon, Mar 13, 2023 at 10:39 AM Jeremiah D Jordan
>  wrote:
> >
> > Given we need to upgrade to support JDK17 it seems fine to me.  The only
> concern I have is that some of those libraries are already pretty old, for
> example the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4 years
> old.  I think we should we use the most recent versions of all libraries
> where possible?
> >
> > > On Mar 13, 2023, at 7:42 AM, Mick Semb Wever  wrote:
> > >
> > > JDK17 requires us to update our chronicle-queue dependency:
> CASSANDRA-18049
> > >
> > > We use chronicle-queue for both audit logging and fql.
> > >
> > > This update pulls in a number of new transitive dependencies.
> > >
> > > affinity-3.23ea1.jar
> > > asm-analysis-9.2.jar
> > > asm-commons-9.2.jar
> > > asm-tree-9.2.jar
> > > asm-util-9.2.jar
> > > jffi-1.3.9.jar
> > > jna-platform-5.5.0.jar
> > > jnr-a64asm-1.0.0.jar
> > > jnr-constants-0.10.3.jar
> > > jnr-ffi-2.2.11.jar
> > > jnr-x86asm-1.0.2.jar
> > > posix-2.24ea4.jar
> > >
> > >
> > > More info here:
> > >
> https://issues.apache.org/jira/browse/CASSANDRA-18049?focusedCommentId=17699393&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17699393
> > >
> > >
> > > Objections?
> >
>


Re: [DISCUSSION] Cassandra + Java 17

2023-03-13 Thread Ekaterina Dimitrova
Hi everyone,
To close this thread, I see lazy consensus around JDK internals accesses as
follows:
- we keep the current accesses to JDK internals in the Cassandra codebase;
they were already carefully considered during earlier reviews. If changes in
JDK internals (which are not guaranteed to be backward compatible) cause
breakage, it will be addressed on a case-by-case basis, as in the example I
provided earlier, CASSANDRA-14173
- we will stay very conservative and carefully consider any new access to
JDK internals proposed for the Cassandra codebase. We document in tickets
what other options were considered, to make future maintenance easier and
to have context in case of breakages. Also covered in [1]

Best regards,
Ekaterina

[1]https://lists.apache.org/thread/33dt0c3kgskrzqtp4h8y411tqv2d6qvh



On Tue, 7 Mar 2023 at 14:58, Ekaterina Dimitrova 
wrote:

> Thanks Benjamin, please, find below my comments.
>
> "It is not necessarily a problem as long as we do get an issue with the
> Modules boundaries and their access. For me it needs to be looked at on a
> case by case basis."
>
> We can still use --add-opens/--add-exports with Java 17 (I mentioned I
> added some as part of introducing experimental JDK17 support -
> CASSANDRA-18258) so the concern is that we should do it as little as
> possible as things can break at any time. JDK internals do not guarantee
> backward compatibility. When something can break is unknown and we need to
> be careful. We also had agreement on that in [1]
>
> "It was used mainly to get around the fact that Java did not offer other
> means to do certain things. Outside of trying to anticipate some of the
> restrictions of that API and make sure that the JDK offers a suitable
> replacement for us. I am not sure that there is much that we can do. But I
> might misunderstand your question."
> I think in some cases it was used for performance, too.
> I think some people in our community might have more history around
> Unsafe. I know there were different conversations in the past between
> members of our community and the JDK community regarding Unsafe replacement
> in time but I cannot find any final outcome so it is an honest question if
> anyone has something more to share here.
>
> Now in some of the cases we use internals I found there is fallback which
> can still have some implications like being slower option. So there are
> nuances and lots of history in decisions taken around the Cassandra
> codebase, as usual. Regarding Unsafe, with JDK17 the only concern is with
> Jamm so far because we do not use ObjectFieldOffset with lambdas
> (implemented internally as hidden classes) in Cassandra code or at least  I
> didn't find such a place so far.
>
> Even if we decide to go with - "let's keep things as is we will look into
> breakages in time", there needs to be visibility and awareness and
> consideration at least when new code is added. I've heard different
> opinions on the topic around the community, honestly - whether the code
> should stay as-is and breakages to be addressed on a per case basis or not.
> I do not see us having exact guidance. Thoughts?
>
> [1] https://lists.apache.org/thread/33dt0c3kgskrzqtp4h8y411tqv2d6qvh
>
> On Thu, 2 Mar 2023 at 7:48, Benjamin Lerer  wrote:
>
>> Hey Ekaterina,
>> Thanks for the update and all the work.
>>
>>
>>> -- we also use setAccessible at numerous places.
>>
>>
>> It is not necessarily a problem  as long as we do get an issue with the
>> Modules boundaries and their access. For me it needs to be looked at on a
>> case by case basis.
>>
>> - thoughts around the usage/future of Unsafe? History around the choice
>>> of using it in C* and future plans I might not know of?
>>>
>>
>> It was used mainly to get around the fact that Java did not offer other
>> means to do certain things.
>> Outside of trying to anticipate some of the restrictions of that API and
>> make sure that the JDK offers a suitable replacement for us. I am not sure
>> that there is much that we can do. But I might misunderstand your question.
>>
>> On Wed, 1 Mar 2023 at 21:16, Ekaterina Dimitrova wrote:
>>
>>> Hi everyone,
>>> Some updates and questions around JDK 17 below.
>>> First of all, I wanted to let people know that currently Cassandra trunk
>>> can be already compiled and run with J8 + J11 + J17. This is the product of
>>> the realization that the feature branch makes it harder for working on
>>> JDK17 related tickets due to the involvement of too many moving parts.
>>> Agreement reached in [1] that new JDK introduction can be done
>>> incrementally. Scripted UDFs removed, hooks to be added in a follow up
>>> ticket.
>>> What does this mean?
>>> - Currently you can compile and run Cassandra trunk  with JDK 17(further
>>> to 8+11). You can run unit and java distributed tests already with JDK17
>>> - CASSANDRA-18106 in progress,  enabling CCM to handle JDK8, 11 and 17
>>> with trunk and when that is ready we w

Re: [DISCUSSION] Cassandra + Java 17

2023-03-13 Thread Derek Chen-Becker
Thanks for all of the work on this, Ekaterina! I would add a third point:

- If newer JVMs offer a standard way to access things that no longer
requires unsafe or native access, we can move to that standard approach
once all supported JVMs are in scope

The case I was thinking about specifically was the method to get the PID,
since Java 9 introduced a way to do this (although we have to wait until
Java 8 is unsupported). As the JVM continues to improve I suspect other
areas where we have native or unsafe workarounds will become replaceable.
Note that performance would still be a sufficient reason to keep existing
native or unsafe access since that's an important requirement.

Cheers,

Derek

On Mon, Mar 13, 2023 at 11:28 AM Ekaterina Dimitrova 
wrote:

> Hi everyone,
> To close this thread, I see lazy consensus around JDK internals accesses
> as follows:
> - we keep the current accesses to JDK internals in the Cassandra codebase,
> they were carefully considered during former reviews already. If there is
> breakage from changes in JDK internals (which are not guaranteed to provide
> backward compatibility) - they will be addressed on a per case basis. Like
> the example I provided earlier - CASSANDRA-14173
> - we will keep on being very conservative and carefully consider any new
> access to JDK internals being proposed to land in the Cassandra codebase.
> We document in tickets what other options were considered to make easier
> future maintenance or in case of breakages in order to have context.
> Covered also in [1]
>
> Best regards,
> Ekaterina
>
> [1]https://lists.apache.org/thread/33dt0c3kgskrzqtp4h8y411tqv2d6qvh
>
>
>
> On Tue, 7 Mar 2023 at 14:58, Ekaterina Dimitrova 
> wrote:
>
>> Thanks Benjamin, please, find below my comments.
>>
>> "It is not necessarily a problem as long as we do get an issue with the
>> Modules boundaries and their access. For me it needs to be looked at on a
>> case by case basis."
>>
>> We can still use --add-opens/--add-exports with Java 17 (I mentioned I
>> added some as part of introducing experimental JDK17 support -
>> CASSANDRA-18258) so the concern is that we should do it as little as
>> possible as things can break at any time. JDK internals do not guarantee
>> backward compatibility. When something can break is unknown and we need to
>> be careful. We also had agreement on that in [1]
>>
>> "It was used mainly to get around the fact that Java did not offer other
>> means to do certain things. Outside of trying to anticipate some of the
>> restrictions of that API and make sure that the JDK offers a suitable
>> replacement for us. I am not sure that there is much that we can do. But I
>> might misunderstand your question."
>> I think in some cases it was used for performance, too.
>> I think some people in our community might have more history around
>> Unsafe. I know there were different conversations in the past between
>> members of our community and the JDK community regarding Unsafe replacement
>> in time but I cannot find any final outcome so it is an honest question if
>> anyone has something more to share here.
>>
>> Now, in some of the cases where we use internals, I found there is a fallback,
>> which can still have implications, such as being the slower option. So there
>> are nuances and lots of history in decisions taken around the Cassandra
>> codebase, as usual. Regarding Unsafe, with JDK17 the only concern so far is
>> Jamm, because we do not use objectFieldOffset with lambdas
>> (implemented internally as hidden classes) in Cassandra code, or at least I
>> didn't find such a place so far.
>>
>> Even if we decide to go with "let's keep things as is, we will look into
>> breakages in time", there needs to be visibility, awareness, and
>> consideration, at least when new code is added. I've heard different
>> opinions on the topic around the community, honestly: whether the code
>> should stay as-is, with breakages addressed on a per-case basis, or not.
>> I do not see us having exact guidance. Thoughts?
>>
>> [1] https://lists.apache.org/thread/33dt0c3kgskrzqtp4h8y411tqv2d6qvh
>>
>> On Thu, 2 Mar 2023 at 7:48, Benjamin Lerer  wrote:
>>
>>> Hey Ekaterina,
>>> Thanks for the update and all the work.
>>>
>>>
>>>> - we also use setAccessible at numerous places.
>>>
>>>
>>> It is not necessarily a problem as long as we do not get an issue with the
>>> Modules boundaries and their access. For me it needs to be looked at on a
>>> case by case basis.
>>>
>>>> - thoughts around the usage/future of Unsafe? History around the choice
>>>> of using it in C* and future plans I might not know of?
>>>
>>> It was used mainly to get around the fact that Java did not offer other
>>> means to do certain things.
>>> Outside of trying to anticipate some of the restrictions of that API and
>>> make sure that the JDK offers a suitable replacement for us, I am not sure
>>> that there is much that we can do. But I might misunderstand your question.
>>>
>>> On Wed, 1 Mar 2023 at

Re: [DISCUSS] New dependencies with Chronicle-Queue update

2023-03-13 Thread Josh McKenzie
> I think we should use the most recent versions of all libraries where
> possible?
To clarify, are we talking "most recent versions of all libraries *when we have 
to update them anyway for a dependency*"? Not *all libraries all libraries*...

If the former, I agree. If the latter, here be dragons. :)

On Mon, Mar 13, 2023, at 1:13 PM, Ekaterina Dimitrova wrote:
> "Given we need to upgrade to support JDK17 it seems fine to me. The only
> concern I have is that some of those libraries are already pretty old, for
> example the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4 years
> old. I think we should use the most recent versions of all libraries
> where possible?"
> +1
> 
> On Mon, 13 Mar 2023 at 12:10, Brandon Williams  wrote:
>> I know it was just an example but we upgraded JNA to 5.13 in
>> CASSANDRA-18050 as part of the JDK17 effort, so at least that is taken
>> care of.
>> 
>> Kind Regards,
>> Brandon
>> 
>> On Mon, Mar 13, 2023 at 10:39 AM Jeremiah D Jordan
>>  wrote:
>> >
>> > Given we need to upgrade to support JDK17 it seems fine to me. The only
>> > concern I have is that some of those libraries are already pretty old, for
>> > example the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4 years
>> > old. I think we should use the most recent versions of all libraries
>> > where possible?
>> >
>> > > On Mar 13, 2023, at 7:42 AM, Mick Semb Wever  wrote:
>> > >
>> > > JDK17 requires us to update our chronicle-queue dependency: 
>> > > CASSANDRA-18049
>> > >
>> > > We use chronicle-queue for both audit logging and fql.
>> > >
>> > > This update pulls in a number of new transitive dependencies.
>> > >
>> > > affinity-3.23ea1.jar
>> > > asm-analysis-9.2.jar
>> > > asm-commons-9.2.jar
>> > > asm-tree-9.2.jar
>> > > asm-util-9.2.jar
>> > > jffi-1.3.9.jar
>> > > jna-platform-5.5.0.jar
>> > > jnr-a64asm-1.0.0.jar
>> > > jnr-constants-0.10.3.jar
>> > > jnr-ffi-2.2.11.jar
>> > > jnr-x86asm-1.0.2.jar
>> > > posix-2.24ea4.jar
>> > >
>> > >
>> > > More info here:
>> > > https://issues.apache.org/jira/browse/CASSANDRA-18049?focusedCommentId=17699393&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17699393
>> > >
>> > >
>> > > Objections?
>> >


Re: [DISCUSS] New dependencies with Chronicle-Queue update

2023-03-13 Thread J. D. Jordan
Yes exactly. If we are updating a library for some reason, we should update it
to the latest one that makes sense.

On Mar 13, 2023, at 1:17 PM, Josh McKenzie wrote:
> > I think we should use the most recent versions of all libraries where
> > possible?
> To clarify, are we talking "most recent versions of all libraries when we have
> to update them anyway for a dependency"? Not all libraries all libraries...
>
> If the former, I agree. If the latter, here be dragons. :)
>
> On Mon, Mar 13, 2023, at 1:13 PM, Ekaterina Dimitrova wrote:
> > "Given we need to upgrade to support JDK17 it seems fine to me. The only
> > concern I have is that some of those libraries are already pretty old, for
> > example the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4 years
> > old. I think we should use the most recent versions of all libraries
> > where possible?"
> > +1
> >
> > On Mon, 13 Mar 2023 at 12:10, Brandon Williams wrote:
> >> I know it was just an example but we upgraded JNA to 5.13 in
> >> CASSANDRA-18050 as part of the JDK17 effort, so at least that is taken
> >> care of.
> >>
> >> Kind Regards,
> >> Brandon
> >>
> >> On Mon, Mar 13, 2023 at 10:39 AM Jeremiah D Jordan wrote:
> >> >
> >> > Given we need to upgrade to support JDK17 it seems fine to me. The only
> >> > concern I have is that some of those libraries are already pretty old, for
> >> > example the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4 years
> >> > old. I think we should use the most recent versions of all libraries
> >> > where possible?
> >> >
> >> > > On Mar 13, 2023, at 7:42 AM, Mick Semb Wever wrote:
> >> > >
> >> > > JDK17 requires us to update our chronicle-queue dependency:
> >> > > CASSANDRA-18049
> >> > >
> >> > > We use chronicle-queue for both audit logging and fql.
> >> > >
> >> > > This update pulls in a number of new transitive dependencies.
> >> > >
> >> > > affinity-3.23ea1.jar
> >> > > asm-analysis-9.2.jar
> >> > > asm-commons-9.2.jar
> >> > > asm-tree-9.2.jar
> >> > > asm-util-9.2.jar
> >> > > jffi-1.3.9.jar
> >> > > jna-platform-5.5.0.jar
> >> > > jnr-a64asm-1.0.0.jar
> >> > > jnr-constants-0.10.3.jar
> >> > > jnr-ffi-2.2.11.jar
> >> > > jnr-x86asm-1.0.2.jar
> >> > > posix-2.24ea4.jar
> >> > >
> >> > > More info here:
> >> > > https://issues.apache.org/jira/browse/CASSANDRA-18049?focusedCommentId=17699393&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17699393
> >> > >
> >> > > Objections?

Re: Should we cut some new releases?

2023-03-13 Thread Miklosovic, Stefan
Yes, I was waiting for CASSANDRA-18125 to be in.

I can release 4.1.1 to staging tomorrow morning CET if nobody objects to that.

Not sure about 4.0.9. We released 4.0.8 just a few weeks ago. I would do 4.1.1
first.


From: Ekaterina Dimitrova 
Sent: Monday, March 13, 2023 18:12
To: dev@cassandra.apache.org
Subject: Re: Should we cut some new releases?

+1

On Mon, 13 Mar 2023 at 12:23, Benjamin Lerer <ble...@apache.org> wrote:
Hi everybody,

Benedict and Jon recently committed the patch for 
CASSANDRA-18125 which 
fixes some serious problems at the memtable/flush level. Should we consider 
cutting some releases that contain this fix?


Re: [EXTERNAL] Re: [DISCUSS] Next release date

2023-03-13 Thread Ekaterina Dimitrova
The ETA for getting JDK17 far enough along to drop JDK8 is end of May to June.
Blockers to dropping JDK8 are listed against CASSANDRA-18255. Production-ready
JDK17 support remains a bigger, unknown task, which we need more help on.


On Mon, 13 Mar 2023 at 8:48, Berenguer Blasi 
wrote:

> TTL (CASSANDRA-14227) is undergoing review and it's in final stages afaik.
> A big rebase and perf re-testing will be needed to confirm all is still
> good. I would expect this to happen this month.
>
> Then the feature flag and downgradability issue, which are unknown atm in
> terms of complexity, are next.
> On 13/3/23 12:34, Mike Adamson wrote:
>
> CEP-7 Storage Attached Index is in review with ~430 files and ~70k LOC.
> The bulk of the project is in 3 main patches. The first patch (in-memory
> index and query path) is merged to the feature branch CASSANDRA-16052 and
> the second patch (on-disk write and literal / string index) is in review.
>
> Mike
>
> On Thu, 9 Mar 2023 at 09:13, Branimir Lambov  wrote:
>
>> CEPs 25 (trie-indexed sstables) and 26 (unified compaction strategy)
>> should both be ready for review by mid-April.
>>
>> Both are around 10k LOC, fairly isolated, and in need of a committer to
>> review.
>>
>> Regards,
>> Branimir
>>
>> On Mon, Mar 6, 2023 at 11:25 AM Benjamin Lerer  wrote:
>>
>>> Sorry, I realized that when I started the discussion I probably did not
>>> frame it enough as I see that it is now going into different directions.
>>> The concerns I am seeing are:
>>> 1) Too short a time between releases is inefficient from a
>>> development perspective and from a user perspective. From a development
>>> point of view because we are missing time to deliver some features. From a
>>> user perspective because they cannot keep up with the upgrades.
>>> 2) Some features are so anticipated (Accord being the one mentioned)
>>> that people would prefer to delay the release to make sure that it is
>>> available as soon as possible.
>>> 3) We do not know how long we need to go from the freeze to GA. We hope
>>> for 2 months but our last experience was 6 months. So delaying the release
>>> could mean not releasing this year.
>>> 4) For people doing marketing it is really hard to promote a product
>>> when you do not know when the release will come and what features might be
>>> there.
>>>
>>> All those concerns are probably made even worse by the fact that we do
>>> not have clear visibility on where we are.
>>>
>>> Should we clarify that part first by getting an idea of the status of
>>> the different CEPs and other big pieces of work? From there we could agree
>>> on some timeline for the freeze. We could then discuss how to make
>>> the time from freeze to GA predictable.
>>>
>>>
>>>
>>> On Sat, 4 Mar 2023 at 18:14, Josh McKenzie wrote:
>>>
>>>> (for convenience sake, I'm referring to both Major and Minor semver
>>>> releases as "major" in this email)
>>>>
>>>>> The big feature from our perspective for 5.0 is ACCORD (CEP-15) and I
>>>>> would advocate to delay until this has sufficient quality to be in
>>>>> production.
>>>>
>>>> This approach can be pretty unpredictable in this domain; often
>>>> unforeseen things come up in implementation that can give you a long tail
>>>> on something being production ready. For the record - I don't intend to
>>>> single Accord out *at all* on this front, quite the opposite given how
>>>> much rigor's gone into the design and implementation. I'm just thinking
>>>> from my personal experience: everything I've worked on, overseen, or
>>>> followed closely on this codebase always has a few tricks up its sleeve
>>>> along the way to having edge-cases stabilized.
>>>>
>>>> Much like on some other recent topics, I think there's a nuanced middle
>>>> ground where we take things on a case-by-case basis. Some factors that have
>>>> come up in this thread that resonated with me:
>>>>
>>>> For a given potential release date 'X':
>>>> 1. How long has it been since the last release?
>>>> 2. How long do we expect qualification to take from a "freeze" (i.e. no
>>>> new improvement or features, branch) point?
>>>> 3. What body of merged production ready work is available?
>>>> 4. What body of new work do we have high confidence will be ready
>>>> within Y time?
>>>>
>>>> I think it's worth defining a loose "minimum bound and upper bound" on
>>>> release cycles we want to try and stick with barring extenuating
>>>> circumstances. For instance: try not to release sooner than maybe 10 months
>>>> out from a prior major, and try not to release later than 18 months out
>>>> from a prior major. Make exceptions if truly exceptional things land, are
>>>> about to land, or bugs are discovered around those boundaries.
>>>>
>>>> Applying the above framework to what we have in flight, our last
>>>> release date, expectations on CI, etc - targeting an early fall freeze
>>>> (pending CEP status) and mid to late fall or December release "feels right"
>>>> to me.
>>>>
>>>> With the exception, of cou

Re: Should we cut some new releases?

2023-03-13 Thread Jacek Lewandowski
+1

On Mon, 13 Mar 2023 at 20:36, Miklosovic, Stefan
<stefan.mikloso...@netapp.com> wrote:

> Yes, I was waiting for CASSANDRA-18125 to be in.
>
> I can release 4.1.1 to staging tomorrow morning CET if nobody objects to that.
>
> Not sure about 4.0.9. We released 4.0.8 just a few weeks ago. I would do
> 4.1.1 first.
>
> 
> From: Ekaterina Dimitrova 
> Sent: Monday, March 13, 2023 18:12
> To: dev@cassandra.apache.org
> Subject: Re: Should we cut some new releases?
>
> +1
>
> On Mon, 13 Mar 2023 at 12:23, Benjamin Lerer <ble...@apache.org> wrote:
> Hi everybody,
>
> Benedict and Jon recently committed the patch for CASSANDRA-18125<
> https://issues.apache.org/jira/browse/CASSANDRA-18125> which fixes some
> serious problems at the memtable/flush level. Should we consider cutting
> some releases that contain this fix?
>


Re: [DISCUSS] New dependencies with Chronicle-Queue update

2023-03-13 Thread Mick Semb Wever
On Mon, 13 Mar 2023 at 16:39, Jeremiah D Jordan 
wrote:

> Given we need to upgrade to support JDK17 it seems fine to me.  The only
> concern I have is that some of those libraries are already pretty old, for
> example the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4 years
> old.
>


Good catch. I've updated the transitive dependencies to their latest.
(Taking this approach is kinda unfortunate, as pinning the transitive
dependency versions requires declaring them explicitly.)

Note that adding jnr-ffi, jffi, and openhft:posix introduces
platform/machine-dependent differences, as native libraries are used
when available. While we don't have a choice (the
alternative would be to rewrite the o.a.c.utils.binlog package without
chronicle-queue), it's still worth drawing attention to.

The new transitive dependencies are now:

 affinity-3.23ea1.jar
 asm-analysis-9.4.jar
 asm-commons-9.4.jar
 asm-tree-9.4.jar
 asm-util-9.4.jar
 jffi-1.3.11-native.jar
 jffi-1.3.11.jar
 jna-platform-5.13.0.jar
 jnr-a64asm-1.0.0.jar
 jnr-constants-0.10.4.jar
 jnr-ffi-2.2.13.jar
 jnr-x86asm-1.0.2.jar
 posix-2.24ea4.jar
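
As a side note on what "latest" means for versions such as jna-platform 5.5.0 versus 5.13.0: the components have to be compared as numbers, because a plain string comparison would rank 5.5.0 above 5.13.0. The following is only a minimal sketch under stated assumptions. `DottedVersion` is a hypothetical helper, not part of Cassandra or of any build tool, and it handles purely numeric dotted versions only (a qualifier such as `3.23ea1` or `1.3.11-native` is out of scope and would be rejected).

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of Cassandra or any build tool.
final class DottedVersion {
    private DottedVersion() {}

    // Compares purely numeric dotted versions ("5.5.0", "5.13.0") component by component.
    // Assumption: no qualifiers. A string such as "3.23ea1" makes Integer.parseInt throw
    // NumberFormatException, which is left to propagate.
    static int compare(String a, String b) {
        String[] left = a.split("\\.");
        String[] right = b.split("\\.");
        int length = Math.max(left.length, right.length);
        for (int i = 0; i < length; i++) {
            // Missing trailing components count as 0, so "5.13" equals "5.13.0".
            int l = i < left.length ? Integer.parseInt(left[i]) : 0;
            int r = i < right.length ? Integer.parseInt(right[i]) : 0;
            if (l != r) {
                return Integer.compare(l, r);
            }
        }
        return 0;
    }

    // Returns the numerically newest version from a non-empty list.
    static String newest(List<String> versions) {
        return versions.stream()
                       .max(DottedVersion::compare)
                       .orElseThrow(() -> new IllegalArgumentException("no versions given"));
    }
}
```

With this sketch, `DottedVersion.compare("5.13.0", "5.5.0")` is positive, whereas `"5.13.0".compareTo("5.5.0")` is negative because the character '1' sorts before '5'.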


[DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-13 Thread Mick Semb Wever
If we do not recommend and do not test direct upgrades from 3.x to
5.x, we can clean up a fair bit by removing code related to sstable
formats m*, as Cassandra versions 4.x and 5.0 are all on sstable
formats n*.

We don't allow mixed-version streaming, so it's not possible today to
stream any such older sstable format between nodes. This
compatibility break impacts only node-local and/or offline usage.
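
To make the m* versus n* cut-off concrete, here is a minimal sketch of the kind of check that dropping the older formats implies. Everything in it is an assumption made for illustration: `SstableVersionGate`, `MIN_SUPPORTED`, and `isSupported` are hypothetical names and not Cassandra's actual API, and the sketch assumes that sstable versions are two lowercase letters (for example `mc`, `na`, `nb`) that order lexicographically, with `na` as the oldest version kept.

```java
// Hypothetical helper for illustration only; this is not Cassandra's actual API.
final class SstableVersionGate {
    private SstableVersionGate() {}

    // Assumption: "na" is the oldest sstable version that stays readable.
    static final String MIN_SUPPORTED = "na";

    // Returns true when the given two-letter sstable version is "na" or newer.
    // Assumption: versions are two lowercase letters that order lexicographically.
    static boolean isSupported(String version) {
        if (version == null || !version.matches("[a-z]{2}")) {
            throw new IllegalArgumentException("unexpected sstable version: " + version);
        }
        return version.compareTo(MIN_SUPPORTED) >= 0;
    }
}
```

Under these assumptions, `isSupported("mc")` is false and `isSupported("nb")` is true, so an sstable written by 3.x would have to be rewritten on 4.x before it reaches a 5.0 node.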

Some arguments raised to keep m* sstable formats are:
 - offline cluster upgrade, e.g. direct from 3.x to 5.0,
 - single-invocation sstableupgrade usage
 - third-party tools based on the above

Personally I am not in favour of keeping, or recommending users use,
code we don't test.

An _example_ of the code that can be cleaned up is in the patch
attached to the ticket:
CASSANDRA-18312 – Drop support for sstable formats before `na`

What do you think?


Re: Should we cut some new releases?

2023-03-13 Thread Berenguer Blasi

+1

On 13/3/23 21:25, Jacek Lewandowski wrote:

+1

On Mon, 13 Mar 2023 at 20:36, Miklosovic, Stefan wrote:


Yes, I was waiting for CASSANDRA-18125 to be in.

I can release 4.1.1 to staging tomorrow morning CET if nobody
objects to that.

Not sure about 4.0.9. We released 4.0.8 just a few weeks ago. I
would do 4.1.1 first.


From: Ekaterina Dimitrova 
Sent: Monday, March 13, 2023 18:12
To: dev@cassandra.apache.org
Subject: Re: Should we cut some new releases?

+1

On Mon, 13 Mar 2023 at 12:23, Benjamin Lerer <ble...@apache.org> wrote:
Hi everybody,

Benedict and Jon recently committed the patch for
CASSANDRA-18125
which fixes some serious problems at the memtable/flush level.
Should we consider cutting some releases that contain this fix?