from:"Jordan West"

Re: Welcome Patrick McFadin as Cassandra Committer

2023-02-02 Thread Jordan West

Congratulations Patrick! Well deserved.

Jordan

On Thu, Feb 2, 2023 at 10:18 Brandon Williams  wrote:

> Congratulations, Patrick!
>
> Kind Regards,
> Brandon
>
> On Thu, Feb 2, 2023 at 11:58 AM Benjamin Lerer  wrote:
> >
> > The PMC members are pleased to announce that Patrick McFadin has accepted
> > the invitation to become committer today.
> >
> > Thanks a lot, Patrick, for everything you have done for this project and
> its community through the years.
> >
> > Congratulations and welcome!
> >
> > The Apache Cassandra PMC members
>

[DISCUSS] Using ACCP or tc-native by default

2023-06-22 Thread Jordan West

Hi,

I’m wondering if there is appetite to change the default SSL provider for
Cassandra going forward to either ACCP [1] or tc-native in Netty? Our
deployment as well as others I’m aware of make this change in their fork
and it can lead to significant performance improvement. When recently
qualifying 4.1 without using ACCP (by accident) we noticed p99 latencies
were 2x higher than 3.0 w/ ACCP. Wiring up ACCP can be a bit of a pain and
also requires some amount of customization. I think it could be great for
the wider community to adopt it.

The biggest hurdle I foresee is licensing but ACCP is Apache 2.0 licensed.
Anything else I am missing before opening a JIRA and submitting a patch?

Jordan


[1]
https://github.com/corretto/amazon-corretto-crypto-provider

Re: [DISCUSS] Using ACCP or tc-native by default

2023-06-22 Thread Jordan West

Glad to see there is support for this! I think ACCP would be a good choice
since there seems to be a lot of experience deploying it. I’ve opened
https://issues.apache.org/jira/browse/CASSANDRA-18624. I should have some
time to work on the patch soon and I will try to provide some graphs that
show the performance benefit from a recent benchmark.

Jordan


On Thu, Jun 22, 2023 at 19:28 Fleming, Jackson 
wrote:

> We run ACCP in production on 1000s of nodes across Cassandra 3.11 and 4
> with great results.
>
>
>
> Would love to see it baked into Cassandra.
>
>
>
> Jackson
>
>
>
> *From: *David Capwell 
> *Date: *Friday, 23 June 2023 at 9:22 am
> *To: *dev 
> *Subject: *Re: [DISCUSS] Using ACCP or tc-native by default
>
>
>
>
> +1 to ACCP
>
>
>
> On Jun 22, 2023, at 3:05 PM, C. Scott Andreas 
> wrote:
>
>
>
> +1 for ACCP and can attest to its results. ACCP also optimizes for a range
> of hash functions and other cryptographic primitives beyond TLS
> acceleration for Netty.
>
>
>
> On Jun 22, 2023, at 2:07 PM, Jeff Jirsa  wrote:
>
>
>
>
>
> Either would be better than today.
>
>
>
> On Thu, Jun 22, 2023 at 1:57 PM Jordan West  wrote:
>
> Hi,
>
>
>
> I’m wondering if there is appetite to change the default SSL provider for
> Cassandra going forward to either ACCP [1] or tc-native in Netty? Our
> deployment as well as others I’m aware of make this change in their fork
> and it can lead to significant performance improvement. When recently
> qualifying 4.1 without using ACCP (by accident) we noticed p99 latencies
> were 2x higher than 3.0 w/ ACCP. Wiring up ACCP can be a bit of a pain and
> also requires some amount of customization. I think it could be great for
> the wider community to adopt it.
>
>
>
> The biggest hurdle I foresee is licensing but ACCP is Apache 2.0 licensed.
> Anything else I am missing before opening a JIRA and submitting a patch?
>
>
>
> Jordan
>
>
>
>
>
> [1]
>
> https://github.com/corretto/amazon-corretto-crypto-provider
>
>
>
>
>

Re: [DISCUSS] Using ACCP or tc-native by default

2023-07-26 Thread Jordan West

I left my comments on the JIRA itself but generally they mirror Scott and
Joey’s thoughts.

Jordan

On Wed, Jul 26, 2023 at 07:26 C. Scott Andreas  wrote:

> Peter, thanks for your message.
>
> You are receiving these emails because your address is subscribed to the
> Apache Cassandra "dev@" developer mailing list. You can unsubscribe from
> this list by sending an email to dev-unsubscr...@cassandra.apache.org.
> Subscribers to the mailing list are not able to take this action on others'
> behalf.
>
> More information on the project's mailing lists and how to join/leave them
> is here: https://cassandra.apache.org/_/community.html
>
> Cheers,
>
> – Scott
>
> On Jul 26, 2023, at 7:11 AM, C. Scott Andreas 
> wrote:
>
>
> Can you say more about the shape of your concern?
>
> JCA/JCE conformance and correctness of the functions implemented are a
> responsibility of the ACCP/Corretto test suite.
> These are thoroughly exercised by Amazon and bundled into the Corretto JDK
> distribution Amazon ships as well.
>
> With regard to Cassandra, the hash and cryptographic functions utilized in
> ACCP are also thoroughly exercised by Cassandra’s unit and in-JVM dtest
> suite.
>
> I wouldn’t propose fragmenting our build into a matrix of JDK x arch x
> ACCP/no, in the same way that we wouldn’t for tcnative vs. not.
>
> - Scott
>
> On Jul 26, 2023, at 6:48 AM, Mick Semb Wever  wrote:
>
> 
>
>> So if a service is not there it will just search where it is next. I
>> completely forgot this aspect of it ... Folks from Corretto forgot to
>> mention this behavior as well, interesting. It is not as we are going to
>> use this _as the only provider_.
>>
>
>
> I'm still uncomfortable assuming upgrades work without having the
> appropriate tests in place.  That's the crux for me.  Existing JCE tests
> (with and without accp) should cover this?
>
>
>

Re: [DISCUSS] Using ACCP or tc-native by default

2023-07-26 Thread Jordan West

We do and I’m sensitive to that 100% but there is no reason ACCP should
break upgrades afaik. The algorithms it implements are identical and for
the ones it doesn’t the JRE implementation is used — ACCP is the higher
priority implementation. Do we have any examples of it breaking anything?
Or that it’s problematic?

We recently did a 4.1 upgrade that was mixed JRE / ACCP and it worked fine.
It’s how we figured out ACCP was missing because 4.1 was noticeably slower
(graph in JIRA) and the JRE crypto library dominated the flamegraph (can
try to dig up a screenshot maybe).
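
To make the fallback behaviour above concrete, here is a toy Python model of priority-ordered provider lookup. It is only a sketch and not the JCA API: the provider names and algorithm labels are made up, and the only point it shows is that the higher-priority provider serves what it implements while everything else falls through to the next provider in the list.

```python
from typing import Optional

# Hypothetical provider table, highest priority first. The algorithm labels
# are placeholders and say nothing about what ACCP or the JRE actually ship.
PROVIDERS = [
    ("accelerated", {"alg-1", "alg-2"}),
    ("jre-default", {"alg-1", "alg-2", "alg-3"}),
]


def resolve_provider(algorithm: str, providers=PROVIDERS) -> Optional[str]:
    """Return the name of the first provider that implements `algorithm`."""
    for name, algorithms in providers:
        if algorithm in algorithms:
            return name
    return None  # no registered provider offers this algorithm


def resolve_all(algorithms, providers=PROVIDERS) -> dict:
    """Resolve several algorithms at once, e.g. to compare two node setups."""
    return {alg: resolve_provider(alg, providers) for alg in algorithms}
```

A node without the accelerated provider is modelled by passing `PROVIDERS[1:]`; every algorithm still resolves, just to the slower implementation, which is the mixed-cluster situation described above.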

Jordan

On Wed, Jul 26, 2023 at 08:35 Mick Semb Wever  wrote:

>
>
> Can you say more about the shape of your concern?
>>
>
>
> Integration testing where some nodes are running JCE and others accp, and
> various configurations that are and are not accp compatible/native.
>
> I'm not referring to (re-) unit testing accp or jce themselves, or matrix
> testing over them, but our commitment to always-on upgrades against all
> possible configurations that integrate.  We've history with config changes
> breaking upgrades, for as simple as they are.
>

Re: [DISCUSS] Using ACCP or tc-native by default

2023-07-26 Thread Jordan West

+1 Scott. And agreed all involved are looking out for the best interests of
C* users. And I appreciate those with concerns contributing to addressing
them.

I’m all for making upgrades smooth bc I do them so often. A huge portion of
our 4.1 qualification is “will it break on upgrade”? Because of that I’m
confident in this patch and concerned about many other areas. I think it’s
commendable to want to reach a point where teams have the trust in the
community to have done that for them but that starts w better test coverage
and concrete evidence.

Given all that, I think we should move forward w Ayushi’s proposal to make
it on by default.

Jordan

On Wed, Jul 26, 2023 at 12:14 C. Scott Andreas  wrote:

> I think these concerns are well-intended, but they feel rooted in
> uncertainty rather than in factual examples of areas where risk is present.
> I would appreciate elaboration on the specific areas of risk that folks
> imagine.
>
> I would encourage those who express skepticism to try the patch, and I
> endorse Ayushi's proposal to enable it by default.
>
>
> – Scott
>
> On Jul 26, 2023, at 12:03 PM, "Miklosovic, Stefan" <
> stefan.mikloso...@netapp.com> wrote:
>
>
> We can make it opt-in, wait one major to see what bugs pop up and we might
> do that opt-out eventually. We do not need to hurry up with this. I
> understand everybody's expectations and excitement but it really boils down
> to one line change in yaml. People who are so much after the performance
> will be definitely aware of this knob to turn on to squeeze even more perf
> ...
>
> I look around dtests Jeremiah mentioned but I would just move on and make
> it opt-in if we are not 100% persuaded about it _yet_.
>
> 
> From: Mick Semb Wever 
> Sent: Wednesday, July 26, 2023 20:48
> To: dev@cassandra.apache.org
> Subject: Re: [DISCUSS] Using ACCP or tc-native by default
>
>
>
>
>
> What comes to mind is how we brought down people's clusters and made
> sstables unreadable with the introduction of the chunk_length configuration
> in 1.0. It wasn't about how tested the compression libraries were, but
> about the new configuration itself. Introducing silent defaults has more
> surface area for bugs than introducing explicit defaults that only apply to
> new clusters and are so opt-in for existing clusters.
>
>
>
> On Wed, 26 Jul 2023 at 20:13, J. D. Jordan wrote:
> Enabling ssl for the upgrade dtests would cover this use case. If those
> don’t currently exist I see no reason it won’t work so I would be fine for
> someone to figure it out post merge if there is a concern. What JCE
> provider you use should have no upgrade concerns.
>
> -Jeremiah
>
> On Jul 26, 2023, at 1:07 PM, Miklosovic, Stefan <
> stefan.mikloso...@netapp.com> wrote:
>
> Am I understanding it correctly that tests you are talking about are only
> required in case we make ACCP to be default provider?
>
> I can live with not making it default and still deliver it if tests are
> not required. I do not think that these kind of tests were required couple
> mails ago when opt-in was on the table.
>
> While I tend to agree with people here who seem to consider testing this
> scenario to be unnecessary exercise, I am afraid that I will not be able to
> deliver that as testing something like this is quite complicated matter.
> There is a lot of aspects which could be tested I can not even enumerate
> right now ... so I try to meet you somewhere in the middle.
>
> 
> From: Mick Semb Wever <m...@apache.org>
> Sent: Wednesday, July 26, 2023 17:34
> To: dev@cassandra.apache.org
> Subject: Re: [DISCUSS] Using ACCP or tc-native by default
>
>
>
>
>
>
> Can you say more about the shape of your concern?
>
>
> Integration testing where some nodes are running JCE and others accp, and
> various configurations that are and are not accp compatible/native.
>
> I'm not referring to (re-) unit testing accp or jce themselves, or matrix
> testing over them, but our commitment to always-on upgrades against all
> possible configurations that integrate. We've history with config changes
> breaking upgrades, for as simple as they are.
>
>
>
>

Re: [DISCUSS] Using ACCP or tc-native by default

2023-07-26 Thread Jordan West

It sounds like some of the concerns have shifted then. I would like to
better understand the YAML one. Like Jeremiah said it may be a better topic
for the ticket. Would appreciate an example exception or error people are
concerned about.

If the issue is the “fail fast” on start I’m sure we can find a solution
everyone accepts and move forward.
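
For discussion on the ticket, the startup decision being debated could be sketched roughly as below. This is only an illustration under assumptions: the two option names are hypothetical and are not taken from the patch, and "accelerated" / "default" are placeholder labels for ACCP and the JRE provider.

```python
from dataclasses import dataclass


@dataclass
class CryptoProviderConfig:
    # Hypothetical knobs; the real option names are whatever the patch defines.
    enabled: bool = True            # "on by default"
    fail_on_missing: bool = False   # fail fast at startup if it cannot load


class StartupError(RuntimeError):
    """Raised when the node should refuse to start."""


def choose_provider(config: CryptoProviderConfig, provider_available: bool) -> str:
    """Decide which provider a node starts with, or refuse to start."""
    if not config.enabled:
        return "default"
    if provider_available:
        return "accelerated"
    if config.fail_on_missing:
        raise StartupError("accelerated provider requested but not available")
    return "default"  # fall back to the JRE provider and keep starting
```

Written this way, the only path that can stop a node from starting is the explicit `fail_on_missing` case, which seems to be the part that needs the test coverage.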

If we are agreed “on by default” is the way to go that’s awesome!

Jordan

On Wed, Jul 26, 2023 at 12:59 Jeremiah Jordan 
wrote:

> I had a discussion with Mick on slack.  His concern is not with enabling
> ACCP.  His concern is around the testing of the new C* yaml config code
> which is included in the patch that is used to decide if ACCP should be
> enabled or not, and if startup should fail if it can’t be enabled.
>
> I agree.  We should make sure that the new C* yaml config code is solid
> before we commit this patch, especially when it has the possibility of
> cause node startup to fail on purpose.  But that should be a discussion for
> the ticket I think, not for this thread.
>
> So I think we are back to the original question.  Should ACCP be used by
> default in trunk.  From what I have seen I do not see anyone who is against
> that?
>
> -Jeremiah
>
>
> On Jul 26, 2023 at 2:53:02 PM, Jordan West  wrote:
>
>> +1 Scott. And agreed all involved are looking out for the best interests
>> of C* users. And I appreciate those with concerns contributing to
>> addressing them.
>>
>> I’m all for making upgrades smooth bc I do them so often. A huge portion
>> of our 4.1 qualification is “will it break on upgrade”? Because of that I’m
>> confident in this patch and concerned about many other areas. I think it’s
>> commendable to want to reach a point where teams have the trust in the
>> community to have done that for them but that starts w better test coverage
>> and concrete evidence.
>>
>> Given all that, I think we should move forward w Ayushi’s proposal to
>> make it on by default.
>>
>> Jordan
>>
>> On Wed, Jul 26, 2023 at 12:14 C. Scott Andreas 
>> wrote:
>>
>>> I think these concerns are well-intended, but they feel rooted in
>>> uncertainty rather than in factual examples of areas where risk is present.
>>> I would appreciate elaboration on the specific areas of risk that folks
>>> imagine.
>>>
>>> I would encourage those who express skepticism to try the patch, and I
>>> endorse Ayushi's proposal to enable it by default.
>>>
>>>
>>> – Scott
>>>
>>> On Jul 26, 2023, at 12:03 PM, "Miklosovic, Stefan" <
>>> stefan.mikloso...@netapp.com> wrote:
>>>
>>>
>>> We can make it opt-in, wait one major to see what bugs pop up and we
>>> might do that opt-out eventually. We do not need to hurry up with this. I
>>> understand everybody's expectations and excitement but it really boils down
>>> to one line change in yaml. People who are so much after the performance
>>> will be definitely aware of this knob to turn on to squeeze even more perf
>>> ...
>>>
>>> I look around dtests Jeremiah mentioned but I would just move on and
>>> make it opt-in if we are not 100% persuaded about it _yet_.
>>>
>>> 
>>> From: Mick Semb Wever 
>>> Sent: Wednesday, July 26, 2023 20:48
>>> To: dev@cassandra.apache.org
>>> Subject: Re: [DISCUSS] Using ACCP or tc-native by default
>>>
>>>
>>>
>>>
>>>
>>> What comes to mind is how we brought down people's clusters and made
>>> sstables unreadable with the introduction of the chunk_length configuration
>>> in 1.0. It wasn't about how tested the compression libraries were, but
>>> about the new configuration itself. Introducing silent defaults has more
>>> surface area for bugs than introducing explicit defaults that only apply to
>>> new clusters and are so opt-in for existing clusters.
>>>
>>>
>>>
>>> On Wed, 26 Jul 2023 at 20:13, J. D. Jordan <jeremiah.jor...@gmail.com> wrote:
>>> Enabling ssl for the upgrade dtests would cover this use case. If those
>>> don’t currently exist I see no reason it won’t work so I would be fine for
>>> someone to figure it out post merge if there is a concern. What JCE
>>> provider you use should have no upgrade concerns.

Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-11-15 Thread Jordan West

I would also like to back this proposal. We change this default because
several incidents have occurred when it was left at auto. There are rare
cases where auto/mmap is the better option, but as a default
mmap_index_only is safer.
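
A small sketch of who would actually be affected by the change, assuming (as described in the quoted thread below) that "auto" currently behaves like mmap and would behave like mmap_index_only afterwards. The function names are hypothetical and this is a model of the setting, not Cassandra code.

```python
# Which SSTable files each explicit mode memory-maps.
MMAPPED = {
    "mmap": {"data": True, "index": True},
    "mmap_index_only": {"data": False, "index": True},
}


def resolve_mode(configured: str, auto_target: str) -> str:
    """Turn the configured value into the mode that is actually applied."""
    if configured == "auto":
        return auto_target
    if configured not in MMAPPED:
        raise ValueError(f"unknown disk_access_mode: {configured}")
    return configured


def affected_by_default_change(configured: str,
                               old_auto: str = "mmap",
                               new_auto: str = "mmap_index_only") -> bool:
    """True if this configuration behaves differently after the change."""
    return resolve_mode(configured, old_auto) != resolve_mode(configured, new_auto)
```

Only nodes left on "auto" change behaviour; anyone who pinned mmap or mmap_index_only explicitly keeps what they have.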

On Wed, Nov 15, 2023 at 6:35 AM Paulo Motta  wrote:

> Hi,
>
> I would like to get back to this. I proposed this default configuration
> change on the user list ~1 month ago and there were no comments [1].
>
> I created CASSANDRA-19021 [2] to make the proposed change and Stefan
> kindly submitted a patch, CI is looking good.
>
> Any objections to making this change in 5.0? If not, we will merge in 24
> hours.
>
> Thanks,
>
> Paulo
>
> [1] - https://lists.apache.org/thread/w0gkdj7fhylycqwmd73p0kfck7jr8qth
> [2] - https://issues.apache.org/jira/browse/CASSANDRA-19021
>
> On Wed, Sep 6, 2023 at 5:12 PM Paulo Motta 
> wrote:
>
>> > I wonder why disk_access_mode property is not in cassandra.yaml
>> (looking into trunk right now)
>>
>> I think there's a prehistoric reason why it was removed but I can't
>> remember right now.
>>
>> > Do you all think we can add it there with brief explanation what each
>> option does?
>>
>> We could reinclude it as long as we provide a clear recommendation on
>> when to change from the default since this is an advanced setting which
>> should be rarely changed. But I still think we should provide a more
>> stable/foolproof default (mmap_index_only) since the current default (mmap)
>> is known to cause instability in some scenarios.
>>
>> Also there is a technicality with changing the default, if we change the
>> "auto" behavior from mmap to mmap_index_only this may affect users relying
>> on the default "mmap" behavior. Not sure the best way to address that, is a
>> big NEWS note sufficient? Even though users are expected to read NEWS when
>> upgrading we know well not all users read it.
>>
>> > Shall we also share this thread with @user?
>>
>> Thanks Ekaterina! If we decide to change the default we can run this
>> through the user@ list to see what the user community thinks.
>>
>> On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova 
>> wrote:
>>
>>> Thanks for starting this discussion, Paulo!
>>>
>>> Shall we also share this thread with @user?
>>>
>>> On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas 
>>> wrote:
>>>
 Supportive of switching the default to mmap_index_only as well.

 I don’t have numbers handy to share, but my experience has been
 significantly lower read latency and I wouldn’t run with auto. I’ve also
 not observed substantial heap pressure after switching - it was strictly an
 improvement.

 - Scott

 —
 Mobile

 On Sep 6, 2023, at 8:50 AM, Paulo Motta 
 wrote:

 Hi,

 I've been bitten by OOMs with disk_access_mode:auto/mmap that were
 fixed by changing to disk_access_mode:mmap_index_only. In a particular
 benchmark I got 5x more read throughput on 3.11.x with disk_access_mode:
 mmap_index_only vs disk_access_mode: auto/mmap.

 Changing disk_access_mode to mmap_index_only seems to be a common
 recommendation on forums[1][2][3][4] and slack (find by searching
 disk_access_mode in the #cassandra channel on
 https://the-asf.slack.com/).

 It's not clear to me when using the default
 disk_access_mode:auto/mmap is beneficial, perhaps only when the read set
 fits in memory? Mick seems to think on CASSANDRA-15531 [5], that
 mmap_index_only has a higher heap cost and should be only used when
 warranted. However it's not uncommon to see people being bitten with OOMs
 or lower read performance due to the default disk_access_mode, so it makes
 me think it's not the best fool-proof default.

 Should we consider changing default "auto" behavior of
 "disk_access_mode" to be "mmap_index_only" instead of "mmap" in 5.0 since
 it's likely safer and perhaps more performant?

 Thanks,

 Paulo

 [1]
 https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
 [2] https://phabricator.wikimedia.org/T137419
 [3] https://stackoverflow.com/a/55975471
 [4]
 https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
 [5] https://issues.apache.org/jira/browse/CASSANDRA-15531

Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-04-14 Thread Jordan West

Thanks for proposing this CEP! We have something like this internally so I
have some familiarity with the approach and the challenges. After reading
the CEP a couple things come to mind:

1. I would like to see more abstraction of how the files get moved / put in
place with the proposed solution being the default implementation. That
would allow others to plug in alternative means of data movement like
pulling down backups from S3 or rsync, etc.

2. I do agree with Jon’s last email that the lifecycle / orchestration
portion is the more challenging aspect. It would be nice to address that as
well so we don’t end up with something like repair where the building
blocks are there but the hard parts are left to the operator. I do,
however, see that portion being done in a follow-on CEP to limit the scope
of CEP-40 and have a higher chance for success by incrementally adding
these features.
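
To illustrate point 1, the abstraction could look something like the sketch below. Everything here is hypothetical (the `DataMover` interface, its `fetch` method, and the local-copy stand-in are not from the CEP); it only shows the shape of a pluggable data movement layer where the Sidecar streaming approach would be the default implementation.

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Iterable, List


class DataMover(ABC):
    """Hypothetical interface: put one source file in place at the destination."""

    @abstractmethod
    def fetch(self, relative_path: str, destination_root: Path) -> Path:
        ...


class LocalCopyMover(DataMover):
    """Stand-in implementation that copies from a local directory.

    A default implementation would stream from the source Sidecar instead;
    alternatives could pull backups from S3 or shell out to rsync.
    """

    def __init__(self, source_root: Path):
        self.source_root = source_root

    def fetch(self, relative_path: str, destination_root: Path) -> Path:
        target = destination_root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(self.source_root / relative_path, target)
        return target


def migrate(mover: DataMover, files: Iterable[str], destination_root: Path) -> List[Path]:
    """Move every listed file using whichever mover was plugged in."""
    return [mover.fetch(f, destination_root) for f in files]
```

The orchestration code only ever talks to `DataMover`, so swapping the transport does not touch the lifecycle logic from point 2.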

Jordan

On Thu, Apr 11, 2024 at 12:31 Jon Haddad  wrote:

> First off, let me apologize for my initial reply, it came off harsher than
> I had intended.
>
> I know I didn't say it initially, but I like the idea of making it easier
> to replace a node.  I think it's probably not obvious to folks that you can
> use rsync (with stunnel, or alternatively rclone), and for a lot of teams
> it's intimidating to do so.  Whether it actually is easy or not to do with
> rsync is irrelevant.  Having tooling that does it right is better than duct
> taping things together.
>
> So with that said, if you're looking to get feedback on how to make the
> CEP more generally useful, I have a couple thoughts.
>
> > Managing the Cassandra processes like bringing them up or down while
> migrating the instances.
>
> Maybe I missed this, but I thought we already had support for managing the
> C* lifecycle with the sidecar?  Maybe I'm misremembering.  It seems to me
> that adding the ability to make this entire workflow self managed would be
> the biggest win, because having a live migrate *feature* instead of what's
> essentially a runbook would be far more useful.
>
> > To verify whether the desired file set matches with source, only file
> path and size is considered at the moment. Strict binary level verification
> is deferred for later.
>
> Scott already mentioned this is a problem and I agree, we cannot simply
> rely on file path and size.
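
A minimal sketch of why path and size alone can miss a change, assuming a plain directory comparison (the function names are hypothetical and this is not the Sidecar API). Two files with the same name and the same length but different bytes compare equal by size and differ by content digest.

```python
import hashlib
from pathlib import Path


def manifest_by_size(root: Path) -> dict:
    """Map relative path -> file size in bytes."""
    return {str(p.relative_to(root)): p.stat().st_size
            for p in sorted(root.rglob("*")) if p.is_file()}


def manifest_by_digest(root: Path) -> dict:
    """Map relative path -> SHA-256 of the file contents."""
    result = {}
    for p in sorted(root.rglob("*")):
        if p.is_file():
            h = hashlib.sha256()
            with p.open("rb") as f:
                # Read in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            result[str(p.relative_to(root))] = h.hexdigest()
    return result


def mismatches(source: dict, destination: dict) -> list:
    """Paths missing on either side or whose recorded value differs."""
    return sorted(k for k in source.keys() | destination.keys()
                  if source.get(k) != destination.get(k))
```

Hashing every file is of course far more expensive than a stat call, which is presumably why the CEP defers it.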
>
> TL;DR: I like the intention of the CEP.  I think it would be better if it
> managed the entire lifecycle of the migration, but you might not have an
> appetite to implement all that.
>
> Jon
>
>
> On Thu, Apr 11, 2024 at 10:01 AM Venkata Hari Krishna Nukala <
> n.v.harikrishna.apa...@gmail.com> wrote:
>
>> Thanks Jon & Scott for taking time to go through this CEP and providing
>> inputs.
>>
>> I am completely with what Scott had mentioned earlier (I would have added
>> more details into the CEP). Adding a few more points to the same.
>>
>> Having a solution with Sidecar can make the migration easy without
>> depending on rsync. At least in the cases I have seen, rsync is not enabled
>> by default and most of them want to run OS/images with as minimal
>> requirements as possible. Installing rsync requires admin privileges and
>> syncing data is a manual operation. If an API is provided with Sidecar,
>> then tooling can be built around it reducing the scope for manual errors.
>>
>> Performance-wise, at least in the cases I had seen, the File
>> Streaming API in Sidecar performs a lot better. To give an idea on the
>> performance, I would like to quote "up to 7 Gbps/instance writes (depending
>> on hardware)" from CEP-28 as this CEP proposes to leverage the same.
>>
>> For:
>>
>> >When enabled for LCS, single sstable uplevel will mutate only the level
>> of an SSTable in its stats metadata component, which wouldn't alter the
>> filename and may not alter the length of the stats metadata component. A
>> change to the level of an SSTable on the source via single sstable uplevel
>> may not be caught by a digest based only on filename and length.
>>
>> In this case file size may not change, but the timestamp of last modified
>> time would change, right? It is addressed in section MIGRATING ONE
>> INSTANCE, point 2.b.ii which says "If a file is present at the destination
>> but did not match (by size or timestamp) with the source file, then local
>> file is deleted and added to list of files to download.". And after
>> download by final data copy task, file should match with source.
>>
>> On Thu, Apr 11, 2024 at 7:30 AM C. Scott Andreas 
>> wrote:
>>
>>> Oh, one note on this item:
>>>
>>> >  The operator can ensure that files in the destination matches with
>>> the source. In the first iteration of this feature, an API is introduced to
>>> calculate digest for the list of file names and their lengths to identify
>>> any mismatches. It does not validate the file contents at the binary level,
>>> but, such feature can be added at a later point of time.
>>>
>>> When enabled for LCS, single sstable uplevel will

Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-04-19 Thread Jordan West

If we are considering the main process then we have to do some additional
work to ensure that it doesn’t put pressure on the JVM and introduce
latency. That’s one thing I like about having it as an external process:
it’s not bulletproof, but it’s one less thing to worry about.

Jordan

On Thu, Apr 18, 2024 at 15:39 Francisco Guerrero  wrote:

> My understanding from the proposal is that Sidecar would be able to migrate
> from a Cassandra instance that is already dead and cannot recover. This is
> a scenario that is possible where Sidecar should still be able to migrate
> to a new instance.
>
> Alternatively, Cassandra itself could have some flag to start up with
> limited subsystems enabled to allow live migration.
>
> In any case, we'll need to weigh the pros and cons of each alternative and
> decide if the live migration process can be handled within the C* process
> itself or if we allow this functionality to be handled by Sidecar.
>
> I am looking forward to this feature though, as it will be of great value
> for many users across the ecosystem.
>
> On 2024/04/18 22:25:23 Jon Haddad wrote:
> > Hmm... I guess if you're using encryption you can't use ZCS so there's
> > that.
> >
> > It probably makes sense to implement kernel TLS:
> > https://www.kernel.org/doc/html/v5.7/networking/tls.html
> >
> > Then we can get ZCS all the time, for bootstrap & replacements.
> >
> > Jon
> >
> >
> > On Thu, Apr 18, 2024 at 12:50 PM Jon Haddad  wrote:
> >
> > > Ariel, having it in C* process makes sense to me.
> > >
> > > Please correct me if I'm wrong here, but shouldn't using ZCS to
> > > transfer have no distinguishable difference in overhead from doing it
> > > using the sidecar?  Since the underlying call is sendfile, never
> > > hitting userspace, I can't see why we'd opt for the transfer in
> > > sidecar.  What's the advantage of duplicating the work that's already
> > > been done?
> > >
> > > I can see using the sidecar for coordination to start and stop
> > > instances or do things that require something out of process.
> > >
> > > Jon
> > >
> > >
> > > On Thu, Apr 18, 2024 at 12:44 PM Ariel Weisberg 
> wrote:
> > >
> > >> Hi,
> > >>
> > >> If there is a faster/better way to replace a node why not have
> > >> Cassandra support that natively without the sidecar so people who
> > >> aren’t running the sidecar can benefit?
> > >>
> > >> Copying files over a network shouldn’t be slow in C* and it would also
> > >> already have all the connectivity issues solved.
> > >>
> > >> Regards,
> > >> Ariel
> > >>
> > >> On Fri, Apr 5, 2024, at 6:46 AM, Venkata Hari Krishna Nukala wrote:
> > >>
> > >> Hi all,
> > >>
> > >> I have filed CEP-40 [1] for live migrating Cassandra instances using
> > >> the Cassandra Sidecar.
> > >>
> > >> When someone needs to move all or a portion of the Cassandra nodes
> > >> belonging to a cluster to different hosts, the traditional approach of
> > >> Cassandra node replacement can be time-consuming due to repairs and
> > >> the bootstrapping of new nodes. Depending on the volume of the storage
> > >> service load, replacements (repair + bootstrap) may take anywhere from
> > >> a few hours to days.
> > >>
> > >> Proposing a Sidecar based solution to address these challenges. This
> > >> solution proposes transferring data from the old host (source) to the
> > >> new host (destination) and then bringing up the Cassandra process at
> > >> the destination, to enable fast instance migration. This approach
> > >> would help to minimise node downtime, as it is based on a Sidecar
> > >> solution for data transfer and avoids repairs and bootstrap.
> > >> Looking forward to the discussions.
> > >>
> > >> [1]
> > >> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-40%3A+Data+Transfer+Using+Cassandra+Sidecar+for+Live+Migrating+Instances
> > >>
> > >> Thanks!
> > >> Hari
> > >>
> > >>
> > >>
> >
>

Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-04-20 Thread Jordan West

I do really like the framing that replacing a node is restoring a node and
then kicking off a replace. That is effectively what we do internally.

I also agree we should be able to do data movement well both internal to
Cassandra and externally for a variety of reasons.

We’ve seen great performance with “ZCS+TLS” even though it’s not full zero
copy — nodes that previously took *days* to replace now take a few hours.
But we have seen it put pressure on nodes and drive up latencies which is
the main reason we still rely on an external data movement system by
default — falling back to ZCS+TLS as needed.
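
For anyone less familiar with the sendfile path mentioned in the quoted thread below, here is a small Python sketch of the idea (this is not Cassandra's streaming code, and `send_file` is a hypothetical helper). On platforms that support it, the kernel moves file pages to the socket without copying them through a userspace buffer. With TLS done in userspace the bytes have to be encrypted in the process first, so that path is not available as-is.

```python
import socket


def send_file(sock: socket.socket, path: str) -> int:
    """Send a whole file over a connected stream socket.

    socket.sendfile() uses os.sendfile() where the platform supports it, so
    the file contents do not pass through a userspace buffer; otherwise it
    falls back to plain send(). Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)
```

The receiver just reads from its end of the connection as usual; nothing about the wire format changes.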

Jordan

On Fri, Apr 19, 2024 at 19:15 Jon Haddad  wrote:

> Jeff, this is probably the best explanation and justification of the idea
> that I've heard so far.
>
> I like it because
>
> 1) we really should have something official for backups
> 2) backups / object store would be great for analytics
> 3) it solves a much bigger problem than the single goal of moving
> instances.
>
> I'm a huge +1 in favor of this perspective, with live migration being one
> use case for backup / restore.
>
> Jon
>
>
> On Fri, Apr 19, 2024 at 7:08 PM Jeff Jirsa  wrote:
>
>> I think Jordan and German had an interesting insight, or at least their
>> comment made me think about this slightly differently, and I’m going to
>> repeat it so it’s not lost in the discussion about zerocopy / sendfile.
>>
>> The CEP treats this as “move a live instance from one machine to
>> another”. I know why the author wants to do this.
>>
>> If you think of it instead as “change backup/restore mechanism to be able
>> to safely restore from a running instance”, you may end up with a cleaner
>> abstraction that’s easier to think about (and may also be easier to
>> generalize in clouds where you have other tools available ).
>>
>> I’m not familiar enough with the sidecar to know the state of
>> orchestration for backup/restore, but “ensure the original source node
>> isn’t running” , “migrate the config”, “choose and copy a snapshot” , maybe
>> “forcibly exclude the original instance from the cluster” are all things
>> the restore code is going to need to do anyway, and if restore doesn’t do
>> that today, it seems like we can solve it once.
>>
>> Backup probably needs to be generalized to support many sources, too.
>> Object storage is obvious (s3 download). Block storage is obvious (snapshot
>> and reattach). Reading sstables from another sidecar seems reasonable, too.
>>
>> It accomplishes the original goal, in largely the same fashion, it just
>> makes the logic reusable for other purposes?
>>
>>
>>
>>
>>
>> On Apr 19, 2024, at 5:52 PM, Dinesh Joshi  wrote:
>>
>> 
>> On Thu, Apr 18, 2024 at 12:46 PM Ariel Weisberg 
>> wrote:
>>
>>>
>>> If there is a faster/better way to replace a node why not  have
>>> Cassandra support that natively without the sidecar so people who aren’t
>>> running the sidecar can benefit?
>>>
>>
>> I am not the author of the CEP so take whatever I say with a pinch of
>> salt. Scott and Jordan have pointed out some benefits of doing this in the
>> Sidecar vs Cassandra.
>>
>> Today Cassandra is able to do fast node replacements. However, this CEP
>> is addressing an important corner case when Cassandra is unable to start up
>> due to old / ailing hardware. Can we fix it in Cassandra so it doesn't die
>> on old hardware? Sure. However, you would still need operator intervention
>> to start it up in some special mode both on the old and new node so the new
>> node can peer with the old node, copy over its data and join the ring. This
>> would still require some orchestration outside the database. The Sidecar
>> can do that orchestration for the operator. The point I'm making here is
>> that the CEP addresses a real issue. The way it is currently built can
>> improve over time with improvements in Cassandra.
>>
>> Dinesh
>>
>>

Re: [DISCUSS] Donating easy-cass-stress to the project

2024-04-30 Thread Jordan West

I would likely commit to it as well

Jordan

On Mon, Apr 29, 2024 at 10:55 David Capwell  wrote:

> So: besides Jon, who in the community expects/desires to maintain this
> going forward?
>
>
> I have been maintaining a fork for years, so don’t mind helping maintain
> this project.
>
> On Apr 28, 2024, at 4:08 AM, Mick Semb Wever  wrote:
>
> A separate subproject like dtest and the Java driver would maybe help
>> address concerns with introducing a gradle build system and Kotlin.
>>
>
>
> Nit, dtest is a separate repository, not a subproject.  The Java driver is
> one repository to be in the Drivers subproject.  Esoteric maybe, but ASF
> terminology we need to get right :-)
>
> To your actual point (IIUC), it can be a separate repository and not a
> separate subproject.  This permits it to be kotlin+gradle, while not having
> the formal subproject procedures.  It still needs 3 responsible committers
> from the get-go to show sustainability.  Would easy-cass-stress have
> releases, or always be a codebase users work directly with ?
>
> Can/Should we first demote cassandra-stress by moving it out to a separate
> repo ?
>  ( Can its imports work off non-snapshot dependencies ? )
> It might feel like an extra prerequisite step to introduce, but maybe it
> helps move the needle forward and make this conversation a bit
> easier/obvious.
>
>
>

Re: [DISCUSS] Gossip Protocol Change

2024-05-16 Thread Jordan West

I’m a big +1 on 18917 or more testing of gossip. While I appreciate that it
makes TCM more complicated, gossip and schema propagation bugs have been
the source of our two worst data loss events in the last 3 years. Data loss
should immediately cause us to evaluate what we can do better.

We will likely live with gossip for at least 1, maybe 2, more years.
Otherwise outside of bug fixes (and to some degree even still) I think the
only other solution is to not touch gossip *at all* until we are all
TCM-only which I don’t think is practical or realistic. Recent changes to
gossip in 4.1 introduced several subtle bugs that had serious impact (from
data loss to loss of ability to safely replace nodes in the cluster).

I am happy to contribute some time to this if lack of folks is the issue.

Jordan

On Mon, May 13, 2024 at 17:05 David Capwell  wrote:

> So, I created https://issues.apache.org/jira/browse/CASSANDRA-18917 which
> lets you do deterministic gossip simulation testing across large clusters
> within seconds… I stopped this work as it conflicted with TCM (they were
> trying to merge that week) and it hit issues where some nodes never
> converged… I didn’t have time to debug so I had to drop the patch…
>
> This type of change would be a good reason to resurrect that patch as
> testing gossip is super dangerous right now… its behavior is only in a few
> people's heads and even then it's just bits and pieces scattered across
> multiple people (and likely missing pieces)…
>
> My brain is far too fried right now to say your idea is safe or not, but
> honestly feel that we would need to improve our tests (we have 0) before
> making such a change…
>
> I do welcome the patch though...
>
>
> On May 12, 2024, at 8:05 PM, Zemek, Cameron via dev <
> dev@cassandra.apache.org> wrote:
>
> In looking into CASSANDRA-19580 I noticed something that raises a
> question. With Gossip SYN it doesn't check for missing digests. If it's
> empty for shadow round it will add everything from endpointStateMap to the
> reply. But why not include missing entries in normal replies? The
> branching for reply handling of SYN requests could then be merged into
> single code path (though shadow round handles empty state different with
> CASSANDRA-16213). The potential downside is performance impact, as this requires doing a
> set difference.
>
> For example, something along the lines of:
>
> ```
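> // Endpoints known locally that the sender's digest list omitted.
> Set<InetAddressAndPort> missing = new HashSet<>(endpointStateMap.keySet());
> missing.removeAll(gDigestList.stream().map(GossipDigest::getEndpoint).collect(Collectors.toSet()));
> // Add a digest with generation 0 and version 0 for each missing endpoint.
> for (InetAddressAndPort endpoint : missing)
> {
>     gDigestList.add(new GossipDigest(endpoint, 0, 0));
> }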
> ```
>
> It seems odd to me that after shadow round for a new node we have
> endpointStateMap with only itself as an entry. Then the only way it gets
> the gossip state is by another node choosing to send the new node a gossip
> SYN. The choosing of this is random. Yeah this happens every second so
> eventually it's going to receive one (outside the issue of CASSANDRA-19580
> where it doesn't if it's in a dead state like hibernate), but doesn't this
> open up bootstrapping to failures on very large clusters as it can take
> longer before it's sent a SYN (as the odds of being chosen for SYN get
> lower)? For years I've been seeing bootstrap failures with 'Unable to contact
> any seeds' but they are infrequent and I've never been able to figure out how to
> reproduce in order to open a ticket, but I wonder if some of them have been
> due to not receiving a SYN message before it does the seenAnySeed check.
>
>
>
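
The set-difference idea in the snippet above can be tried outside the Cassandra codebase. The following is a minimal, self-contained sketch: `Digest` is a stand-in for `GossipDigest`, endpoints are plain strings instead of `InetAddressAndPort`, and `withMissing` is a hypothetical helper name. It uses a plain loop rather than a stream, and it does not model shadow-round handling.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class MissingDigests {
    /** Stand-in for GossipDigest: endpoint plus (generation, maxVersion). */
    static final class Digest {
        final String endpoint;
        final int generation;
        final int maxVersion;

        Digest(String endpoint, int generation, int maxVersion) {
            this.endpoint = endpoint;
            this.generation = generation;
            this.maxVersion = maxVersion;
        }
    }

    /**
     * Returns the received digests plus a (0, 0) digest for every endpoint
     * present in the local state map but absent from the received list.
     */
    static List<Digest> withMissing(List<Digest> received, Map<String, ?> endpointStateMap) {
        Set<String> missing = new HashSet<>(endpointStateMap.keySet());
        for (Digest d : received) {
            missing.remove(d.endpoint);
        }
        List<Digest> result = new ArrayList<>(received);
        for (String endpoint : missing) {
            result.add(new Digest(endpoint, 0, 0));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Digest> out = withMissing(
            List.of(new Digest("10.0.0.1", 1, 5)),
            Map.of("10.0.0.1", "state-a", "10.0.0.2", "state-b"));
        for (Digest d : out) {
            System.out.println(d.endpoint + " " + d.generation + " " + d.maxVersion);
        }
    }
}
```

The cost raised in the message is visible here: each SYN would copy the key set and do one removal per received digest.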

Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-05-20 Thread Jordan West

On Wed, May 1, 2024 at 3:34 AM Alex Petrov  wrote:

>
> We can implement CEP-40 using a similar approach: we can leave the source
> node as both a read and write target, and allow the new node to be a target
> for (pending) writes. Unfortunately, this does not help with availability
> (in fact, it decreases write availability, since we will have to collect
> 2+1 mandatory write responses instead of just 2), but increases durability,
> and I think helps to fully eliminate the second phase. This also increases
> read availability when the source node is up, since we can still use the
> source node as a part of read quorum.
>
>
I 100% agree that this is the more durable approach. And that bringing the
source node down reduces availability during the second phase. While my
inclination is that it would be better to implement the logic in the manner
you describe, from a pure correctness perspective, that loss of
availability of the r/w quorum is rare in my experience. Running a setup
like CEP-40 currently describes (but using S3 for the file transfer) for
over 3 years, in practice I have a hard time remembering one incident of
it. I'm sure it's happened, but at the rate we replace hardware it's not
something we deal with regularly despite taking the risk. I do agree as
well that it needs to be well documented, as surprising edge cases are never fun.
I think the existing and future TCM implementations cover the more
conservative/correct case and having this option as an alternative, or for
when the instance is unable to bring up the C* process, is good to have.
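
The "2+1 mandatory write responses" arithmetic quoted above can be written out as a small sketch. It assumes a replication factor of 3 with QUORUM writes, which is what "2" suggests but is not stated in the thread. `WriteQuorum` and its method names are hypothetical.

```java
final class WriteQuorum {
    /** Majority of the replication factor, e.g. 2 when RF = 3. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** Acks needed when pending (write-only) replicas must also respond. */
    static int requiredAcks(int replicationFactor, int pendingReplicas) {
        return quorum(replicationFactor) + pendingReplicas;
    }

    /** Number of nodes that receive the write: natural replicas plus pending ones. */
    static int writeTargets(int replicationFactor, int pendingReplicas) {
        return replicationFactor + pendingReplicas;
    }

    public static void main(String[] args) {
        System.out.println("steady state: " + requiredAcks(3, 0) + " of " + writeTargets(3, 0));
        System.out.println("with pending node: " + requiredAcks(3, 1) + " of " + writeTargets(3, 1));
    }
}
```

With one pending node the write needs 3 acks from 4 targets instead of 2 from 3. More nodes must answer, which is the write-availability cost described, while reads can still use the source node.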



> On Fri, Apr 5, 2024, at 12:46 PM, Venkata Hari Krishna Nukala wrote:
>
> Hi all,
>
> I have filed CEP-40 [1] for live migrating Cassandra instances using the
> Cassandra Sidecar.
>
> When someone needs to move all or a portion of the Cassandra nodes
> belonging to a cluster to different hosts, the traditional approach of
> Cassandra node replacement can be time-consuming due to repairs and the
> bootstrapping of new nodes. Depending on the volume of the storage service
> load, replacements (repair + bootstrap) may take anywhere from a few hours
> to days.
>
> Proposing a Sidecar based solution to address these challenges. This
> solution proposes transferring data from the old host (source) to the new
> host (destination) and then bringing up the Cassandra process at the
> destination, to enable fast instance migration. This approach would help to
> minimise node downtime, as it is based on a Sidecar solution for data
> transfer and avoids repairs and bootstrap.
>
> Looking forward to the discussions.
>
> [1]
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-40%3A+Data+Transfer+Using+Cassandra+Sidecar+for+Live+Migrating+Instances
>
> Thanks!
> Hari
>
>
>

Re: [DISCUSS] ccm as a subproject

2024-05-20 Thread Jordan West

I would also love to see CCM as an official side project. It is important
to the project and I personally use it regularly.

Jordan

On Thu, May 16, 2024 at 7:55 AM Josh McKenzie  wrote:

> We do still have the issues of DSE-supporting code in it, as we do with
> the drivers.  I doubt any of us strongly object to it: there's no trickery
> happening here on the user; but we should be aware of it and have a rough
> direction sketched out for when someone else comes along wanting to add
> support for their proprietary product.
>
> IMO as long as it's documented well at the outset and we have plans to
> slowly refactor to move it to clean boundaries (epic in JIRA anyone <3) so
> it can be extracted into a separately maintained module by folks that need
> it, I think we'd be in great shape. That'd also pave a path for others
> wanting to add support for their proprietary products as well. Win-win.
>
> There's always this chicken or egg problem w/things like ccm. Do people
> not contribute to it because it's out of the umbrella, or is it out of the
> umbrella because people don't need to contribute to it?
>
> I hadn't thought about other subprojects relying on it. That's a very good
> point.
>
> On Thu, May 16, 2024, at 4:48 AM, Jacek Lewandowski wrote:
>
> +1 (my personal opinion)
>
> How to deal with the DSE-supporting code is a separate discussion IMO
>
> - - -- --- -  -
> Jacek Lewandowski
>
>
> czw., 16 maj 2024 o 10:21 Berenguer Blasi 
> napisał(a):
>
>
> +1 ccm is super useful
> On 16/5/24 10:09, Mick Semb Wever wrote:
>
>
>
> On Wed, 15 May 2024 at 16:24, Josh McKenzie  wrote:
>
> Right now ccm isn't formally a subproject of Cassandra or under governance
> of the ASF. Given it's an integral component of our CI as well as for
> local testing for many devs, and we now have more experience w/our muscle
> on IP clearance and ingesting / absorbing subprojects where we can't track
> down every single contributor to get an ICLA, seems like it might be worth
> revisiting the topic of donation of ccm to Apache.
>
> For what it's worth, Sylvain originally and then DataStax after transfer
> have both been incredible and receptive stewards of the projects and repos,
> so this isn't about any response to any behavior on their part.
> Structurally, however, it'd be better for the health of the project(s)
> long-term to have ccm promoted in. As far as I know there was strong
> receptivity to that donation in the past but the IP clearance was the
> primary hurdle.
>
> Anyone have any thoughts for or against?
>
> https://github.com/riptano/ccm
>
>
>
>
> We've been working on this along with the python-driver (just haven't
> raised it yet).  It is recognised, like the python-driver, as a key
> dependency that would best be in the project.
>
> Obtaining the CLAs should be much easier, the contributors to ccm are less
> diverse, being more the people we know already.
>
> We do still have the issues of DSE-supporting code in it, as we do with
> the drivers.  I doubt any of us strongly object to it: there's no trickery
> happening here on the user; but we should be aware of it and have a rough
> direction sketched out for when someone else comes along wanting to add
> support for their proprietary product.  We also don't want to be pushing
> downstream users to be having to create their own forks either.
>
> Great to see general consensus (so far) in receiving it :)
>
>
>
>

Re: [DISCUSS] Stream Pipelines on hot paths

2024-06-06 Thread Jordan West

Similarly in the "don't use them in the main project but am ok with tests"
camp

On Thu, Jun 6, 2024 at 4:46 AM Štefan Miklošovič <
stefan.mikloso...@gmail.com> wrote:

> I have created
>
> https://issues.apache.org/jira/browse/CASSANDRA-19673
>
> to gather all your ideas about what to remove. If you stumble upon some
> code which is susceptible to rewriting, just put it there.
>
> On Wed, Jun 5, 2024 at 6:35 PM  wrote:
>
>> I would like to vote for banning streams in all non-test code. It may not
>> be easy for new contributors to distinguish between hot path and non-hot
>> path. So would be great if we can simply block them in non-test code and
>> update codestyle to detect the usage.
>>
>>
>> On Jun 4, 2024, at 6:26 PM, Josh McKenzie  wrote:
>>
>> I'm in the "ban in non-test cases, allow in tests" camp. Can sometimes
>> make things more expressive and concise.
>>
>> On Mon, Jun 3, 2024, at 12:07 PM, Sam wrote:
>>
>> Added.
>>
>> Here is the 'after' profile
>>
>> 
>>
>> On Sun, 2 Jun 2024 at 20:50, Mick Semb Wever  wrote:
>>
>>
>>
>> On profiling a 90% write workload I found
>> StorageProxy::updateCoordinatorWriteLatencyTableMetric to be a hot-path,
>> consuming between 15-20% of ModificationStatement::executeWithoutCondition
>> cycles.
>>
>> https://github.com/apache/cassandra/pull/3344
>> 
>>
>>
>>
>> Ouch.  Ok, I've no idea what constitutes an ok "slow path" now…
>>
>> Sam, can you also share in the ticket the easy-cass-stress profile you
>> used please.
>>
>>
>>
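
For readers unfamiliar with the trade-off discussed in this thread, here is a minimal sketch of the same computation written as a stream pipeline and as a plain loop. It is not the code from the linked pull request. `HotPathExample` and the "keyspace.table" strings are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class HotPathExample {
    /** Stream version: concise, but sets up a pipeline and collector on every call. */
    static Set<String> distinctTablesViaStream(List<String> tableNames) {
        return tableNames.stream().collect(Collectors.toSet());
    }

    /** Loop version: same result, with only the result set allocated. */
    static Set<String> distinctTablesViaLoop(List<String> tableNames) {
        Set<String> tables = new HashSet<>();
        for (String name : tableNames) {
            tables.add(name);
        }
        return tables;
    }

    public static void main(String[] args) {
        List<String> names = List.of("ks.a", "ks.b", "ks.a");
        System.out.println(distinctTablesViaStream(names).equals(distinctTablesViaLoop(names)));
    }
}
```

Whether the difference matters depends on how often the method runs, which is why a profile like the one mentioned above is the deciding evidence.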

Re: [DISCUSS] Stream Pipelines on hot paths

2024-06-07 Thread Jordan West

Agreed Aleksey. I wouldn’t be opposed to more nuanced use but the burden
that adds seems impractical. A simple rule is easier.

Jordan

On Fri, Jun 7, 2024 at 05:59 Aleksey Yeshchenko  wrote:

> I am okay with its use off hot paths in principle, and I’ve done it
> myself.
>
> But as others have mentioned, it’s not obvious to every contributor what
> is and isn’t a hot path. Also, the codebase is a living, shifting thing: a
> cold path today can suddenly become hot tomorrow - it’s not uncommon.
>
> Another benefit to this binary decision flow is that we can easily enforce
> it with our lint tooling just for non-test part of the codebase. It’s just
> easier to scale.
>
>
> On 7 Jun 2024, at 10:27, Štefan Miklošovič 
> wrote:
>
> I think it makes sense to use streams to make the life easier for a dev
> when constructing some log messages or something like that in clearly not
> hot paths. Nothing wrong with that ... Collectors.joining(", ") and that
> kind of stuff. I do not think that doing this aggressively and "orthodoxly"
> is necessary.
>
>
>
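
The `Collectors.joining(", ")` case mentioned above has a loop-based equivalent that a blanket "no streams in non-test code" rule would require. A minimal sketch, with the hypothetical class name `JoinExample`:

```java
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collectors;

final class JoinExample {
    /** Stream version, as one might write when building a log message. */
    static String viaStream(List<Integer> tokens) {
        return tokens.stream().map(String::valueOf).collect(Collectors.joining(", "));
    }

    /** Loop version using java.util.StringJoiner; same output. */
    static String viaLoop(List<Integer> tokens) {
        StringJoiner joiner = new StringJoiner(", ");
        for (Integer token : tokens) {
            joiner.add(String.valueOf(token));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaStream(List.of(1, 2, 3)));
        System.out.println(viaLoop(List.of(1, 2, 3)));
    }
}
```

The loop form is a few lines longer, which is the expressiveness cost that the simple rule accepts in exchange for being easy to lint.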

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-23 Thread Jordan West

I am generally for this CEP, particularly the sizeOf guardrail. For
example, we recently had an incident caused by a client who wrote outside
of the contract we had verbally established. The constraint would have let
us encode that contract into the database. In this case, clients are
writing large blobs at the application layer and internally the client
performs chunking.  We had established a chunk size of 64k, for example.
However, the application team wanted to use a different programming
language than the ones we provide clients for so they wrote their own. The
new client had a bug that did not honor the agreed upon chunk size and
wrote chunks that were MBs in size. This eventually led to a production
incident and the issue was discovered as a result of a bunch of analysis
(dumping sstables, etc). Had we had the sizeOf guardrail it would have
turned a production incident with hours of investigation into a bug found
immediately during development. Could this be done with a node-level
guardrail? Likely. But config has the issues described above and it's
possible to have two tables with different constraints around similar
fields (for example, two different chunk size configs due to data shape).
Could it be done at the client layer? Yes that's what we are doing now, but
this incident highlights the weakness with that approach (having to
implement the contract everywhere and having disjoint features across
clients).
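
To make the chunking contract concrete, here is a minimal sketch of what each client has to get right today. It assumes "64k" means 64 KiB. `BlobChunker`, `chunk`, and `validate` are hypothetical names, and `validate` only illustrates the kind of check a size constraint could perform at write time. It is not CEP-42 syntax or behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BlobChunker {
    /** The agreed chunk size from the message, assumed here to be 64 KiB. */
    static final int MAX_CHUNK_BYTES = 64 * 1024;

    /** Splits a blob into chunks of at most MAX_CHUNK_BYTES. */
    static List<byte[]> chunk(byte[] blob) {
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < blob.length; offset += MAX_CHUNK_BYTES) {
            int end = Math.min(blob.length, offset + MAX_CHUNK_BYTES);
            chunks.add(Arrays.copyOfRange(blob, offset, end));
        }
        return chunks;
    }

    /** Rejects an oversized chunk, as a write-time size check would. */
    static void validate(byte[] chunk) {
        if (chunk.length > MAX_CHUNK_BYTES) {
            throw new IllegalArgumentException(
                "chunk is " + chunk.length + " bytes, limit is " + MAX_CHUNK_BYTES);
        }
    }

    public static void main(String[] args) {
        List<byte[]> chunks = chunk(new byte[200_000]);
        for (byte[] c : chunks) {
            validate(c);
        }
        System.out.println(chunks.size() + " chunks");
    }
}
```

A client that skips `chunk` and writes the whole blob would pass unnoticed unless something equivalent to `validate` runs on the server side, which is the gap the incident exposed.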

I also think there is benefit to application owners. Encoding constraints
in the database ensures continuity as ownership and contributors change and
reduces the need for comments or documentation as the means to enforce or
share this knowledge.

I think enforcing them at write time makes sense. Thinking about it in the
scope of compaction for example reminds me of a data loss incident where
someone ran a validation in an older version (like 2.0 or 2.1) and a bunch
of 4 byte ints were thrown away because the field expected an 8 byte long.

My primary concern would be ensuring that we don't implement constraints
that require a read before write (not inList comes to mind as an example of
one that could imply reading before writing and could confuse a user if it
doesn't).

Regarding the conflict with existing guardrails, I do think that is
tougher. On one hand I find this feature to be more evolved than those
guardrails and would be fine to see them be replaced by it. On the other,
the guardrails provide sole control to the operator which is nice but adds
some complexity that has been rightly called out.  But I don't see that as
a reason not to go forward with this feature. We should pick a path and
accept the tradeoffs.

Jordan

On Thu, Jun 13, 2024 at 2:39 PM Bernardo Botella <
conta...@bernardobotella.com> wrote:

> Thanks a lot for your comments Abe!
>
> I do agree that the Constraint clause should be as simple as possible. I
> will add a note on the CEP along with some specifics about the proposed
> constraints (removing the ones that are contentious, and adding them to a
> possible future additions section). And yeah, I also think that these
> constraints will help different Cassandra operating paradigms (multi-tenant
> clusters and diverse workflows).
>
> Besides that, I hope that I’ve addressed all the potential concerns and
> feedback on the thread. Let’s allow a bit more time for others to chime in
> (any further feedback will be more than welcome), but I’d like to move
> forward with a voting soon if no other concerns are pointed out.
>
> All in all, thanks a lot to everyone that participated in the thread and
> added to the discussion!
> Bernardo
>
>
>
> > On Jun 12, 2024, at 2:37 PM, Abe Ratnofsky  wrote:
> >
> > I've thought about this some more. It would be useful for Cassandra to
> support user-defined "guardrails" (or constraints, whatever you want to
> call them), that could be applied per keyspace or table. Whether a user or
> an operator is considered the owner of a table depends on the organization
> deploying Cassandra, so allowing both parties to protect their tables
> against mis-use seems good to me, especially for large multi-tenant
> clusters with diverse workloads.
> >
> > For example, it would be really useful if a user could set the
> Guardrails.{read,write}ConsistencyLevels for their tables, or declare
> whether all operations should be over LWTs to avoid mixing regular and LWT
> workloads.
> >
> > I'm hesitant about adding lots of expression syntax to the CONSTRAINT
> clause. I think I'd prefer a function calling syntax that represents:
> > 1. Whether the constraint is system / keyspace / table scoped
> > 2. Where in query processing the constraint is checked
> > 3. What is executed by the check
>
>

Re: [VOTE] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-07-02 Thread Jordan West

+1

On Fri, Jun 28, 2024 at 05:56  wrote:

> +1
>
>
> On Jun 27, 2024, at 3:03 PM, Josh McKenzie  wrote:
>
> +1
>
> On Thu, Jun 27, 2024, at 12:40 AM, Abhijeet Dubey wrote:
>
> +1
>
> On Thu, Jun 27, 2024 at 1:47 AM Francisco Guerrero 
> wrote:
>
> +1
>
> On 2024/06/21 15:13:31 Venkata Hari Krishna Nukala wrote:
> > Hi everyone,
> >
> > I would like to start the voting for CEP-40 as all the feedback in the
> > discussion thread seems to be addressed.
> >
> > Proposal:
> >
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-40%3A+Data+Transfer+Using+Cassandra+Sidecar+for+Live+Migrating+Instances
> > Discussion thread:
> > https://lists.apache.org/thread/g397668tp0zybf29g8hgbllv7t3j493f
> >
> > As per the CEP process documentation, this vote will be open for 72 hours
> > (longer if needed).
> >
> > Thanks!
> > Hari
> >
>
>
>
> --
> *Abhijeet Dubey*
> Software Engineer @ Apple Inc.
> IIT Bombay Computer Science & Engineering Class of 2019
> Contact : +91-9900190105
> Apple Inc. | IIT Bombay
>
>
>

Re: [VOTE] CEP-42: Constraints Framework

2024-07-02 Thread Jordan West

+1

On Tue, Jul 2, 2024 at 12:15 Francisco Guerrero  wrote:

> +1
>
> On 2024/07/02 18:45:33 Josh McKenzie wrote:
> > +1
> >
> > On Tue, Jul 2, 2024, at 1:18 PM, Abe Ratnofsky wrote:
> > > +1 (nb)
> > >
> > >> On Jul 2, 2024, at 12:15 PM, Yifan Cai  wrote:
> > >>
> > >> +1 on CEP-42.
> > >>
> > >> - Yifan
> > >>
> > >> On Tue, Jul 2, 2024 at 5:17 AM Jon Haddad  wrote:
> > >>> +1
> > >>>
> > >>> On Tue, Jul 2, 2024 at 5:06 AM  wrote:
> >  +1
> > 
> > 
> > > On Jul 1, 2024, at 8:34 PM, Doug Rohrer  wrote:
> > >
> > > +1 (nb) - Thanks for all of the suggestions and Bernardo for
> wrangling the CEP into shape!
> > >
> > > Doug
> > >
> > >> On Jul 1, 2024, at 3:06 PM, Dinesh Joshi 
> wrote:
> > >>
> > >> +1
> > >>
> > >> On Mon, Jul 1, 2024 at 11:58 AM Ariel Weisberg 
> wrote:
> > >>> __
> > >>> Hi,
> > >>>
> > >>> I am +1 on CEP-42 with the latest updates to the CEP to clarify
> syntax, error messages, constraint naming and generated naming, alter/drop,
> describe etc.
> > >>>
> > >>> I think this now tracks very closely to how other SQL databases
> define constraints and the syntax is easily extensible to multi-column and
> multi-table constraints.
> > >>>
> > >>> Ariel
> > >>>
> > >>> On Mon, Jul 1, 2024, at 9:48 AM, Bernardo Botella wrote:
> >  With all the feedback that came in the discussion thread after
> the call for votes, I’d like to extend the period another 72 hours starting
> today.
> > 
> >  As before, a vote passes if there are at least 3 binding +1s
> and no binding vetoes.
> > 
> >  Thanks,
> >  Bernardo Botella
> > 
> > > On Jun 24, 2024, at 7:17 AM, Bernardo Botella <
> conta...@bernardobotella.com> wrote:
> > >
> > > Hi everyone,
> > >
> > > I would like to start the voting for CEP-42.
> > >
> > > Proposal:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-42%3A+Constraints+Framework
> > > Discussion:
> https://lists.apache.org/thread/xc2phmxgsc7t3y9b23079vbflrhyyywj
> > >
> > > The vote will be open for 72 hours. A vote passes if there are
> at least 3 binding +1s and no binding vetoes.
> > >
> > > Thanks,
> > > Bernardo Botella
> > >>>
>

Re: secondary index table - tombstones surviving compactions

2018-05-30 Thread Jordan West

Hi Roman,

I was able to reproduce the issue you described. I filed
https://issues.apache.org/jira/browse/CASSANDRA-14479. More details there.

Thanks for reporting!
Jordan


On Wed, May 23, 2018 at 12:06 AM, Roman Bielik <
roman.bie...@openmindnetworks.com> wrote:

> Hi,
>
> I apologise for the late response. I wanted to run some further tests so I can
> provide more information to you.
>
> @Jeff, no I don't set the "only_purge_repaired_tombstone" option. It
> should
> be default: False.
> But no I don't run repairs during the tests.
>
> @Eric, I understand that rapid deletes/inserts are some kind of
> antipattern, nevertheless I'm not experiencing any problems with that
> (except for the 2nd indices).
>
> Update: I run a new test where I delete the indexed columns extra, plus
> delete the whole row at the end.
> And surprisingly this test scenario works fine. Using nodetool flush +
> compact (in order to expedite the test) seems to always purge the index
> table.
> So that's great because I seem to have found a workaround, on the other
> hand, could there be a bug in Cassandra - leaking index table?
>
> Test details:
> Create table with LeveledCompactionStrategy;
> 'tombstone_compaction_interval': 60; gc_grace_seconds=60
> There are two indexed columns for comparison: column1, column2
> Insert keys {1..x} with random values in column1 & column2
> Delete {key:column2} (but not column1)
> Delete {key}
> Repeat n-times from the inserts
> Wait 1 minute
> nodetool flush
> nodetool compact (sometimes compact  
> nodetool cfstats
>
> What I observe is, that the data table is empty, column2 index table is
> also empty and column1 index table has non-zero (leaked) "space used" and
> "estimated rows".
>
> Roman
>
>
>
>
>
>
> On 18 May 2018 at 16:13, Jeff Jirsa  wrote:
>
> > This would matter for the base table, but would be less likely for the
> > secondary index, where the partition key is the value of the base row
> >
> > Roman: there’s a config option related to only purging repaired
> tombstones
> > - do you have that enabled ? If so, are you running repairs?
> >
> > --
> > Jeff Jirsa
> >
> >
> > > On May 18, 2018, at 6:41 AM, Eric Stevens  wrote:
> > >
> > > The answer to Question 3 is "yes."  One of the more subtle points about
> > > tombstones is that Cassandra won't remove them during compaction if
> there
> > > is a bloom filter on any SSTable on that replica indicating that it
> > > contains the same partition (not primary) key.  Even if it is older
> than
> > > gc_grace, and would otherwise be a candidate for cleanup.
> > >
> > > If you're recycling partition keys, your tombstones may never be able
> to
> > be
> > > cleaned up, because in this scenario there is a high probability that
> an
> > > SSTable not involved in that compaction also contains the same
> partition
> > > key, and so compaction cannot have confidence that it's safe to remove
> > the
> > > tombstone (it would have to fully materialize every record in the
> > > compaction, which is too expensive).
> > >
> > > In general it is an antipattern in Cassandra to write to a given
> > partition
> > > indefinitely for this and other reasons.
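
The rule described in the message above can be expressed as a small model. This is a deliberately simplified sketch of the explanation as written, not Cassandra's compaction code, whose real logic is more involved. `TombstonePurge`, `SSTable`, `containing`, and `canPurge` are hypothetical names.

```java
import java.util.List;

final class TombstonePurge {
    /** Stand-in for an SSTable's bloom filter check on a partition key. */
    interface SSTable {
        boolean mightContain(String partitionKey);
    }

    /** Helper that builds a fake SSTable holding exactly one partition key. */
    static SSTable containing(String partitionKey) {
        return key -> key.equals(partitionKey);
    }

    /**
     * A tombstone is droppable only if it is older than gc_grace AND no SSTable
     * outside the current compaction may contain the same partition key.
     */
    static boolean canPurge(String partitionKey, long deletionTimeSec, long nowSec,
                            long gcGraceSec, List<SSTable> notInCompaction) {
        if (deletionTimeSec + gcGraceSec > nowSec) {
            return false; // still within gc_grace
        }
        for (SSTable sstable : notInCompaction) {
            if (sstable.mightContain(partitionKey)) {
                return false; // another SSTable may hold data for this partition
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canPurge("pk1", 0, 1000, 600, List.of(containing("pk1"))));
        System.out.println(canPurge("pk1", 0, 1000, 600, List.of(containing("pk2"))));
    }
}
```

In this model, recycling partition keys keeps the second condition failing, since some other SSTable almost always reports that it may contain the key.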
> > >
> > > On Fri, May 18, 2018 at 2:37 AM Roman Bielik <
> > > roman.bie...@openmindnetworks.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> I have a Cassandra 3.11 table (with compact storage) and using
> secondary
> > >> indices with rather unique data stored in the indexed columns. There
> are
> > >> many inserts and deletes, so in order to avoid tombstones piling up
> I'm
> > >> re-using primary keys from a pool (which works fine).
> > >> I'm aware that this design pattern is not ideal, but for now I can not
> > >> change it easily.
> > >>
> > >> The problem is, the size of 2nd index tables keeps growing (filled
> with
> > >> tombstones) no matter what.
> > >>
> > >> I tried some aggressive configuration (just for testing) in order to
> > >> expedite the tombstone removal but with little-to-zero effect:
> > >> COMPACTION = { 'class':
> > >> 'LeveledCompactionStrategy', 'unchecked_tombstone_compaction':
> 'true',
> > >> 'tombstone_compaction_interval': 600 }
> > >> gc_grace_seconds = 600
> > >>
> > >> I'm aware that perhaps Materialized views could provide a solution to
> > this,
> > >> but I'm bound to the Thrift interface, so can not use them.
> > >>
> > >> Questions:
> > >> 1. Is there something I'm missing? How come compaction does not remove
> > the
> > >> obsolete indices/tombstones from 2nd index tables? Can I trigger the
> > >> cleanup manually somehow?
> > >> I have tried nodetool flush, compact, rebuild_index on both data table
> > and
> > >> internal Index table, but with no result.
> > >>
> > >> 2. When deleting a record I'm deleting the whole row at once - which
> > would
> > >> create one tombstone for the whole record if I'm correct. Would it
> help
> > to
> > >> delete the indexed columns separately creating extra tombstone for
> each
> > >> cell?
> > >> As I understand the underlying mechan

Re: Difference between heartbeat and generation on a Gossip packet

2018-06-28 Thread Jordan West

On Tue, Jun 26, 2018 at 3:08 PM, Abdelkrim Fitouri 
wrote:

> Hello,
>
> I am studying the gossip part of Cassandra and wondering about the
> difference between the heartbeat and generation data exchanged for the
> autodiscovery.
>
> many thanks for any help.
>

If you haven’t had a chance to check it out, Jason Brown’s Cassandra Summit
talk on gossip is a great resource:
https://www.youtube.com/watch?v=FuP1Fvrv6ZQ. Details about
heartbeat/generation start around 6 minutes in.

Jordan

>
> --
>
> Best Regards.
>
> Abdelkarim.
>
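
As a rough illustration of the question above, which the thread itself does not spell out: in the usual description of Cassandra gossip, the generation is fixed when a node starts and changes on restart, while the heartbeat version keeps increasing while the node runs. The sketch below shows how two such pairs could be ordered under that assumption. `HeartbeatCompare` and `remoteIsNewer` are hypothetical names, not Cassandra APIs.

```java
final class HeartbeatCompare {
    /**
     * Returns true if the remote (generation, version) pair is newer than the
     * local one: a different generation decides first, then the version.
     */
    static boolean remoteIsNewer(int localGeneration, int localVersion,
                                 int remoteGeneration, int remoteVersion) {
        if (remoteGeneration != localGeneration) {
            return remoteGeneration > localGeneration; // node restarted since we last heard
        }
        return remoteVersion > localVersion; // same run of the process, later heartbeat
    }

    public static void main(String[] args) {
        System.out.println(remoteIsNewer(100, 50, 101, 1));
        System.out.println(remoteIsNewer(100, 50, 100, 49));
    }
}
```

The first call models a restart: the version counter started over, but the higher generation still makes the remote state newer.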

Re: [VOTE] Branching Change for 4.0 Freeze

2018-07-13 Thread Jordan West

+1 (non-binding)

On Fri, Jul 13, 2018 at 5:02 AM, J. D. Jordan 
wrote:

> -0 (non-binding) as well for similar reasons to Gary.
>
> > On Jul 12, 2018, at 8:23 AM, Gary Dusbabek  wrote:
> >
> > -0
> >
> > I'm not interested in sparking a discussion on this, because a) it has
> > already happened and b) it seems I am in a minority. But thought I should
> > at least include the rationale for my vote:
> > * This proposal goes against the "scratch an itch" philosophy of making
> > contributions to an Apache project and IMO will discourage contributions
> > that are casual or new.
> > * It feels dictatorial. IMO the right way to do this would be for
> > impassioned committers to -1 any patch that goes against elements a, b,
> or
> > c of what this vote is for.
> >
> > Gary.
> >
> >
> > On Wed, Jul 11, 2018 at 4:46 PM sankalp kohli 
> > wrote:
> >
> >> Hi,
> >>As discussed in the thread[1], we are proposing that we will not
> branch
> >> on 1st September but will only allow following merges into trunk.
> >>
> >> a. Bug and Perf fixes to 4.0.
> >> b. Critical bugs in any version of C*.
> >> c. Testing changes to help test 4.0
> >>
> >> If someone has a change which does not fall under these three, we can
> >> always discuss it and have an exception.
> >>
> >> Vote will be open for 72 hours.
> >>
> >> Thanks,
> >> Sankalp
> >>
> >> [1]
> >>
> >> https://lists.apache.org/thread.html/494c3ced9e83ceeb53fa127e44eec6
> e2588a01b769896b25867fd59f@%3Cdev.cassandra.apache.org%3E
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

Re: GitHub PR ticket spam

2018-08-06 Thread Jordan West

+1 (nb) for the worklog approach.

Jordan

On Mon, Aug 6, 2018 at 4:53 PM dinesh.jo...@yahoo.com.INVALID
 wrote:

> +1 for preserving it as worklog.
> Dinesh
> P.S.: Apologies for the github spam :-)
> On Monday, August 6, 2018, 3:09:28 PM PDT, Mick Semb Wever <
> m...@apache.org> wrote:
>
>
> >
> > Great idea. +1 to moving it to the work log.
> >
>
>
> https://issues.apache.org/jira/browse/INFRA-16879
>
>
>
>

Re: QA signup

2018-09-06 Thread Jordan West

Thanks for starting this thread Jon!

On Thu, Sep 6, 2018 at 5:51 AM Jonathan Haddad  wrote:

> For 4.0, I'm thinking it would be a good idea to put together a list of the
> things that need testing and see if people are willing to help test / break
> those things.  My goal here is to get as much coverage as possible, and let
> folks focus on really hammering on specific things rather than just firing
> up a cluster and rubber stamping it.  If we're going to be able to
> confidently deploy 4.0 quickly after its release we're going to need a
> high attention to detail.
>
>
+1 to a more coordinated effort. I think we could use the Confluence that
was set up a little bit ago since it was set up for this purpose, at least
for finalized plans and results:
https://cwiki.apache.org/confluence/display/CASSANDRA.


> In addition to a signup sheet, I think providing some guidance on how to QA
> each thing that's being tested would go a long way.  Throwing "hey please
> test sstable streaming" over the wall will only get quality feedback from
> folks that are already heavily involved in the development process.  It
> would be nice to bring some new faces into the project by providing a
> little guidance.
>

> We could help facilitate this even further by considering the people
> signing up to test a particular feature as a team, with seasoned Cassandra
> veterans acting as team leads.
>

+1 to this as well. I am always a fan of folks learning about a
subsystem/project through testing. It can be challenging to get folks new
to a project excited about testing first but for those that do, or for
committers who want to learn another part of the db, it's a great way to
learn.

Another thing we can do here is make sure teams are writing about the
testing they are doing and their results. This will help share knowledge
about techniques and approaches that others can then apply. This knowledge
can be shared on the mailing list, a blog post, or in JIRA.

 Jordan


> Any thoughts?  I'm happy to take the lead on this.
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>

Re: Proposing an Apache Cassandra Management process

2018-09-23 Thread Jordan West

I think this feature is important to the community and I don’t want to
stifle that but if committers/contributors are working on the management
process instead of testing 4.0 it takes away from it regardless of where
the code lives. Waiting to merge until after 4.0, at a minimum, would
benefit the testing effort.

Jordan

On Sat, Sep 22, 2018 at 10:06 AM Sankalp Kohli 
wrote:

> This is not part of core database and a separate repo and so my impression
> is that this can continue to make progress. Also we can always make
> progress and not merge it till freeze is lifted.
>
> Open to ideas/suggestions if someone thinks otherwise.
>
> > On Sep 22, 2018, at 03:13, kurt greaves  wrote:
> >
> > Is this something we're moving ahead with despite the feature freeze?
> >
> > On Sat, 22 Sep 2018 at 08:32, dinesh.jo...@yahoo.com.INVALID
> >  wrote:
> >
> >> I have created a sub-task - CASSANDRA-14783. Could we get some feedback
> >> before we begin implementing anything?
> >>
> >> Dinesh
> >>
> >>On Thursday, September 20, 2018, 11:22:33 PM PDT, Dinesh Joshi <
> >> dinesh.jo...@yahoo.com.INVALID> wrote:
> >>
> >> I have updated the doc with a short paragraph providing the
> >> clarification. Sankalp's suggestion is already part of the doc. If there
> >> aren't further objections could we move this discussion over to the jira
> >> (CASSANDRA-14395)?
> >>
> >> Dinesh
> >>
> >>> On Sep 18, 2018, at 10:31 AM, sankalp kohli 
> >> wrote:
> >>>
> >>> How about we start with a few basic features in side car. How about
> >> starting with this
> >>> 1. Bulk nodetool commands: User can curl any sidecar and be able to run
> >> a nodetool command in bulk across the cluster.
> >>>
> >>
> :/bulk/nodetool/tablestats?arg0=keyspace_name.table_name&arg1= >> required>
> >>>
> >>> And later
> >>> 2: Health checks.
> >>>
> >>> On Thu, Sep 13, 2018 at 11:34 AM dinesh.jo...@yahoo.com.INVALID <
> >> dinesh.jo...@yahoo.com.invalid> wrote:
> >>> I will update the document to add that point. The document did not mean
> >> to serve as a design or architectural document but rather something that
> >> would spark a discussion on the idea.
> >>> Dinesh
> >>>
> >>>   On Thursday, September 13, 2018, 10:59:34 AM PDT, Jonathan Haddad <
> >> j...@jonhaddad.com > wrote:
> >>>
> >>> Most of the discussion and work was done off the mailing list - there's
> >> a
> >>> big risk involved when folks disappear for months at a time and
> resurface
> >>> with a big pile of code plus an agenda that you failed to loop everyone
> in
> >>> on. In addition, by your own words the design document didn't
> accurately
> >>> describe what was being built.  I don't write this to try to argue
> about
> >>> it, I just want to put some perspective for those of us that weren't
> part
> >>> of this discussion on a weekly basis over the last several months.
> Going
> >>> forward let's keep things on the ML so we can avoid confusion and
> >>> frustration for all parties.
> >>>
> >>> With that said - I think Blake made a really good point here and it's
> >>> helped me understand the scope of what's being built better.  Looking
> at
> >> it
> >>> from a different perspective it doesn't seem like there's as much
> overlap
> >>> as I had initially thought.  There's the machinery that runs certain
> >> tasks
> >>> (what Joey has been working on) and the user facing side of exposing
> that
> >>> information in management tool.
> >>>
> >>> I do appreciate (and like) the idea of not trying to boil the ocean,
> and
> >>> working on things incrementally.  Putting a thin layer on top of
> >> Cassandra
> >>> that can perform cluster wide tasks does give us an opportunity to move
> >> in
> >>> the direction of a general purpose user-facing admin tool without
> >>> committing to trying to write the full stack all at once (or even make
> >>> decisions on it now).  We do need a sensible way of doing rolling
> >> restarts
> >>> / scrubs / scheduling and Reaper wasn't built for that, and even though
> >> we
> >>> can add it I'm not sure if it's the best mechanism for the long term.
> >>>
> >>> So if your goal is to add maturity to the project by making cluster
> wide
> >>> tasks easier by providing a framework to build on top of, I'm in favor
> of
> >>> that and I don't see it as antithetical to what I had in mind with
> >> Reaper.
> >>> Rather, the two are more complementary than I had originally realized.
> >>>
> >>> Jon
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Sep 13, 2018 at 10:39 AM dinesh.jo...@yahoo.com.INVALID
> >>> <dinesh.jo...@yahoo.com.invalid>
> wrote:
> >>>
>  I have a few clarifications -
>  The scope of the management process is not to simply run repair
>  scheduling. Repair scheduling is one of the many features we could
>  implement or adopt from existing sources. So could we please split the
>  Management Process discussion and the repair scheduling?
>  After re-reading the management process proposal, I see we missed to
>

Re: Both Java 8 and Java 11 required for producing a tarball

2019-03-13 Thread Jordan West

A couple related JIRAs for reference:

https://issues.apache.org/jira/browse/CASSANDRA-14714
https://issues.apache.org/jira/browse/CASSANDRA-14712

On Wed, Mar 6, 2019 at 7:34 PM Michael Shuler 
wrote:

> On 3/6/19 7:10 PM, Stefan Miklosovic wrote:
> > I am trying to build 4.0 from sources and prior to this I was doing
> >
> > ant artifacts
> >
> > in order to get distribution tarball to play with.
> >
> > If I understand this right, if I do not run Ant with Java 11,
> > java.version.8 will be true so it will skip building tarballs.
>
> Correct. You'll get a JDK8-only jar, but no full tar artifact set.
>
> > 1) Why couldn't one create a tarball while running on Java 8 only?
>
> The build system needs a dual-JDK install to build the artifacts with
> support for each/both.
>
> > 2) What is the current status of Java 11 / Java 8? Is it there just "to
> try
> > it out if it runs with that" or are there different reasons behind it?
>
> JDK8 runtime is the default, JDK11 runtime is optional, but supported.
> Here's the JIRA with all the details:
> https://issues.apache.org/jira/browse/CASSANDRA-9608
>
> I just pushed a WIP branch to do a dual-JDK build via docker, since we
> need to work on this, too. (lines may wrap:)
>
> git clone -b tar-artifact-build
> https://gitbox.apache.org/repos/asf/cassandra-builds.git
>
> cd cassandra-builds/
>
> docker build -t cass-build-tars -f docker/buster-image.docker docker/
>
> docker run --rm -v `pwd`/dist:/dist `docker images -f
> label=org.cassandra.buildenv=buster -q` /home/build/build-tars.sh trunk
>
> After all that, here's my tar artifacts:
>
> (tar-artifact-build)mshuler@hana:~/git/cassandra-builds$ ls -l dist/
> total 94328
> -rw-r--r-- 1 mshuler mshuler 50385890 Mar  6 21:16
> apache-cassandra-4.0-SNAPSHOT-bin.tar.gz
> -rw-r--r-- 1 mshuler mshuler 46198947 Mar  6 21:16
> apache-cassandra-4.0-SNAPSHOT-src.tar.gz
>
> Or you could drop a dual-JDK install on your machine, export the env
> vars you found and `ant artifacts` should produce the tars :)
>
> --
> Kind regards,
> Michael
>
>
>

Re: Choosing a supported Python 3 major version for cqlsh

2019-03-19 Thread Jordan West

On Mon, Mar 18, 2019 at 7:52 PM Michael Shuler 
wrote:

> On 3/18/19 9:06 PM, Patrick Bannister wrote:
> > I recommend we pick the longest supported stable release available. That
> > would be Python 3.7, which is planned to get its last release in 2023,
> four
> > years from now.
> > - Python 3.5 was planned to get its last major release yesterday
> > - Python 3.6 is planned to get its last major release in December 2021,
> > about three years from now
> >
> > Any feedback on picking a tested Python version for cqlshlib? I'm
> inclined
> > to focus on Python 3.7 as we push toward finishing the ticket.
>
> The correct method of choosing this would be to target runtime
> functionality on the version in the latest LTS release of the likely
> most-used OS. Ubuntu 18.04 LTS comes with python-3.6.5. I would think it
> highly likely that if it runs properly on 3.6, it should run on 3.7


In my experience working with a different python project recently this
isn’t the case. There are reserved keywords that were added between 3.6 and
3.7:
https://docs.python.org/3/whatsnew/3.7.html
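
To make that concrete, here is a minimal sketch of the kind of breakage I mean. It assumes the names in question are `async` and `await`, which the notes linked above list as reserved keywords starting in 3.7. The helper names `reserved_names` and `compiles` are hypothetical and only for illustration; they are not part of cqlsh or cqlshlib.

```python
import keyword
import sys


def reserved_names(names):
    """Return the names that the running interpreter treats as keywords."""
    return [name for name in names if keyword.iskeyword(name)]


def compiles(source):
    """Return True if the running interpreter accepts this source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    # Identifiers that older code may have used as ordinary variable names.
    candidates = ["async", "await", "loop"]
    print("interpreter:", sys.version_info[:2])
    print("reserved here:", reserved_names(candidates))
    # Assigning to a reserved name is rejected at compile time.
    print("'async = 1' compiles:", compiles("async = 1"))
```

Running this under both interpreters shows the difference: code that uses `async` or `await` as an identifier is accepted by 3.6 but is a syntax error on 3.7, so "works on 3.6" does not by itself imply "works on 3.7".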

Jordan



> fine. Using some 3.7-only feature/syntax and making it difficult on
> people to install/use on Ubuntu LTS would be user-unfriendly.
>
> https://packages.ubuntu.com/bionic/python3
>
> There is not a similar CentOS package search, but I see a couple docs
> say that python-3.6 is available via the SCL repository for this OS. I
> do not see a 3.7 installation noted.
>
> Shoot for the lowest common denominator in real world usage, not the
> latest release from upstream. Super strong opinion, here.
>
> --
> Michael
>
>
>

Cassandra 4.0 Quality and Stability Update

2019-04-10 Thread Jordan West

In September, the community chose to freeze trunk to begin working on
Quality and Stability with the goal of releasing the most stable Cassandra
major in the project’s history. While lots of work has been ongoing and
folks could follow along with progress on JIRA I thought it would be useful
to cover what has been accomplished so far since I’ve spent a good amount
of time working with others on various testing projects.

During this time we have made significant progress on improving the Quality
and Stability of Cassandra — not only Cassandra 4.0 but also the Cassandra
3.x series and future Cassandra releases. Additionally, testing has
provided the opportunity for new community members and committers to
contribute. While not comprehensive the community has found at least 25
bugs that can be classified as either Data Loss, Corruption, Incorrect
Response, Loss of Stability, Loss of Availability, Concurrency Issues,
Performance Issues, and Lack of Safety. These bugs have been found by a
variety of methodologies including commonly used ones like unit testing and
canary deployments. However, the majority of the bugs have been found or
confirmed using new methodologies like the ones described in some recent
blog posts [1] [2].

Additionally, the state of the test suites and test tooling have improved.
CASSANDRA-14806 [3] brought some much welcomed improvements to the circleci
workflow and made it easier for people to run (d)tests on supported
platforms (jdk8/11) and the work to get upgrade tests running found several
bugs including CASSANDRA-14958 [4].

While we have made significant progress there is still more to do before we
can be truly confident in a Cassandra 4.0 release. Some ongoing and
outstanding work includes:

* Improving the state of the cqlsh tests [5]
* There is ongoing discussion on the new MessagingService [6] which will
require significant review and testing
* Additional upgrade testing for Cassandra 4.0 including additional support
for upgrade testing using in-jvm dtests [7]
* Work to increase coverage of important areas and new features in
Cassandra 4.0 [8]

While the list above may seem short, the last item contains a long list of
important areas the community has previously discussed adding coverage to.
If you are looking for areas to contribute this is a great starting point.
If there is a name down on an area you are interested in I would encourage
you to reach out to them to discuss how you can help further increase the
community’s confidence in the Quality and Stability of Cassandra.

Below is an incomplete list of many of the severe bugs found during this
part of the release cycle. Thanks again to all of the community members who
contributed to finding these bugs and improving Cassandra for everyone.

CASSANDRA-15004: Anti-compaction briefly removes sstables from the read path
CASSANDRA-14958: Counters fail to increment on 2.X to 3.X mixed version
clusters
CASSANDRA-14936: Anticompaction should throw exceptions on errors, not just
log them
CASSANDRA-14672: After deleting data in 3.11.3, reads fail: "open marker
and close marker have different deletion times"
CASSANDRA-14912: LegacyLayout errors on collection tombstones from dropped
columns
CASSANDRA-14843: Drop/add column name with different Kind can result in
corruption
CASSANDRA-14568: CorruptSSTableExceptions in 3.0.17.1 (CASSANDRA-14568 v2)
Static collection deletions are corrupted in 3.0 <-> 2.{1,2} messages
CASSANDRA-14749: Collection Deletions for Dropped Columns in 2.1/3.0
mixed-mode can delete rows
CASSANDRA-14568: Static collection deletions are corrupted in 3.0 ->
2.{1,2} messages
CASSANDRA-14861: Inaccurate sstable min/max metadata can cause data loss
CASSANDRA-14823: Legacy sstables with range tombstones spanning multiple
index blocks create invalid bound sequences on 3.0+ (#1193)
CASSANDRA-14873: Missing rows when reading 2.1 SSTables in 3.0
CASSANDRA-14838: Dropped columns can cause reverse sstable iteration to
return prematurely
CASSANDRA-14803: Rows that cross index block boundaries can cause
incomplete reverse reads in some cases.
CASSANDRA-14766: DESC order reads can fail to return the last Unfiltered in
the partition (#1170)
CASSANDRA-14991: SSL Cert Hot Reloading should defensively check for sanity
of the new keystore/truststore before loading it
CASSANDRA-14794: Avoid calling iter.next() in a loop when notifying
indexers about range tombstones
CASSANDRA-14780: Avoid creating empty compaction tasks after truncate
CASSANDRA-14657: Handle failures in upgradesstables/cleanup/relocate
CASSANDRA-14638: Column result order can change in 'SELECT *' results when
upgrading from 2.1 to 3.0 causing response corruption for queries using
prepared statements when static columns are used
CASSANDRA-14919: Regression in paging queries in mixed version clusters
CASSANDRA-14554: LifecycleTransaction encounters
ConcurrentModificationException when used in multi-threaded context
CASSANDRA-14935: PendingAntiCompaction should be mor

Re: Stabilising Internode Messaging in 4.0

2019-04-10 Thread Jordan West

There is a lot of discussion here so I’ll try to keep my opinions brief:

1. The bug fixes are a requirement in order to have a stable 4.0. Whether
they come from this patch or the original I have less of an opinion. I do
think it’s important to minimize code changes at this time in the
development/freeze cycle — including large refactors which add risk despite
how they are being discussed here. From that perspective, I would prefer to
see more targeted fixes but since we don’t have them and we have this patch
the decision is different.

2. We should not be measuring complexity in LoC with the exception that all
20k lines *do need to be reviewed* (not just the important parts and because
code refactoring tools have bugs too) and more lines take more time.
Otherwise, it’s a poor metric for how long this will take to review.
Further, it seems odd that the authors are projecting how long it will take
to review — this should be the charge of the reviewers and I would like to
hear from them on this. It’s clear this is a complex patch, as risky as
something like 8099 (and the original rewrite by Jason). We should treat it
as such and not try to merge it in quickly for the sake of it, repeating
past mistakes. The goal of reviewing the messaging service work was to do
just that. It would be a disservice to rush in these changes now. If the
goal is to get the patch in that should be the priority, not completing it
“in two weeks”. It’s clear several community members have pushed back on
that and are not comfortable with the time frame.

3. If we need to add special forks of Netty classes to the code because of
“how we use Netty” that is a concern to me w.r.t. quality, stability, and
maintenance. Is there a place that documents/justifies our non-traditional
usage? (I saw some in JavaDocs but found it lacking in *why*, though it had a
lot of how/what was changed.) Given folks in the community have decent
relationships with the Netty team perhaps we should leverage that as well.
Have we reached out to them?

4. In principle, I agree with the technical improvements you mention
(backpressure / checksumming / etc). These things should be there. Are they
a hard requirement for 4.0? In my opinion, no, and Dinesh has done a good job
of providing some reasons as to why so I won’t go into much detail here. In
short, a bug and a missing safety mechanism are not the same thing. It’s
also important to consider how many users a change like that covers and for
what risk — we found a bug in 13304 late in review, had it slipped through
it would have subjected users to silent corruption they thought couldn’t
occur anymore because we included the feature in a prod Cassandra release.

5. The patch could seriously benefit from some commit hygiene that would
make it easier for folks to review. Had this been done not only would
review be easier but also the piecemeal breakup of features/fixes could
have been done more easily. I don’t buy the premise that this wasn’t
possible. If we had to add the feature/fix later it would have been
possible. I’m sure there was a smart way we could have organized it, if it
was a priority.

6. Have any upgrade tests been done/added? I would also like to see some
performance benchmarks before merging so that the patch is in a similar
state as 14503 in terms of performance testing.

I’m sure I have more thoughts but these seem like the important ones for
now.

Jordan

On Wed, Apr 10, 2019 at 8:21 AM Dinesh Joshi 
wrote:

> Here are my 2¢.
>
> I think the general direction of this work is valuable but I have a few
> concerns I’d like to address. More inline.
>
> > On Apr 4, 2019, at 1:13 PM, Aleksey Yeschenko 
> wrote:
> >
> > I would like to propose CASSANDRA-15066 [1] - an important set of bug
> fixes
> > and stability improvements to internode messaging code that Benedict, I,
> > and others have been working on for the past couple of months.
> >
> > First, some context.   This work started off as a review of
> CASSANDRA-14503
> > (Internode connection management is race-prone [2]), CASSANDRA-13630
> > (Support large internode messages with netty) [3], and a pre-4.0
> > confirmatory review of such a major new feature.
> >
> > However, as we dug in, we realized this was insufficient. With more than
> 50
> > bugs uncovered [4] - dozens of them critical to correctness and/or
> > stability of the system - a substantial rework was necessary to
> guarantee a
> > solid internode messaging subsystem for the 4.0 release.
> >
> > In addition to addressing all of the uncovered bugs [4] that were unique
> to
> > trunk + 13630 [3] + 14503 [2], we used this opportunity to correct some
> > long-existing, pre-4.0 bugs and stability issues. For the complete list
> of
> > notable bug fixes, read the comments to CASSANDRA-15066 [1]. But I’d like
> > to highlight a few.
>
> Do you have regression tests that will fail if these bugs are reintroduced
> at a later point?
>
> > # Lack of message integrity checks
> >
> > It’s known that TCP checksums

Re: Stabilising Internode Messaging in 4.0

2019-04-12 Thread Jordan West

e they a hard requirement for 4.0?
> >>
> >> One thing that comes to mind is protocol versioning and consistency. If
> >> changes adding checksumming and handshake do not make it to 4.0, we grow
> >> the upgrade matrix and have to put changes to the separate protocol
> >> version. I'm not sure how many other internode protocol changes we have
> >> planned for 4.next, but this is definitely something we should keep in
> >> mind.
> >>
> >>> 2. We should not be measuring complexity in LoC with the exception that
> >> all 20k lines *do need to be reviewed* (not just the important parts and
> >> because code refactoring tools have bugs too) and more lines take more
> >> time.
> >>
> >> Everything should definitely be reviewed. But with different rigour: one
> >> thing is to review byte arithmetic and protocol formats and completely
> >> different thing is to verify that Verb moved from one place to the
> other is
> >> still used. Concentrating on a smaller subset doesn't make a patch
> smaller,
> >> just helps to guide reviewers and observers, so my primary goal was to
> help
> >> people, hence my reference to the jira comment I'm working on.
> >>
> >>
> >> On Wed, Apr 10, 2019 at 6:13 PM Sankalp Kohli 
> >> wrote:
> >>
> >>> I think we should wait for testing doc on confluence to come up and
> >>> discuss what all needs to be added there to gain confidence.
> >>>
> >>> If the work is more to split the patch into smaller parts and delays
> 4.0
> >>> even more, can we use time in adding more test coverage, documentation
> >> and
> >>> design docs to this component?  Will that be a middle ground here ?
> >>>
> >>> Examples of large changes not going well is due to lack of testing,
> >>> documentation and design docs not because they were big. Being big adds
> >> to
> >>> the complexity and increased the total bug count but small changes can
> be
> >>> equally worse in terms of impact.
> >>>
> >>>
> >>>> On Apr 10, 2019, at 8:53 AM, Jordan West  wrote:
> >>>>
> >>>> There is a lot of discussion here so I’ll try to keep my opinions brief:
> >>>>
> >>>> 1. The bug fixes are a requirement in order to have a stable 4.0.
> >> Whether
> >>>> they come from this patch or the original I have less of an opinion. I
> >> do
> >>>> think it’s important to minimize code changes at this time in the
> >>>> development/freeze cycle — including large refactors which add risk
> >>> despite
> >>>> how they are being discussed here. From that perspective, I would
> >> prefer
> >>> to
> >>>> see more targeted fixes but since we don’t have them and we have this
> >>> patch
> >>>> the decision is different.
> >>>>
> >>>> 2. We should not be measuring complexity in LoC with the exception
> that
> >>> all
> >>>> 20k lines *do need to be reviewed* (not just the important parts and
> >>> because
> >>>> code refactoring tools have bugs too) and more lines take more time.
> >>>> Otherwise, it’s a poor metric for how long this will take to review.
> >>>> Further, it seems odd that the authors are projecting how long it will
> >>> take
> >>>> to review — this should be the charge of the reviewers and I would
> like
> >>> to
> >>>> hear from them on this. It’s clear this is a complex patch, as risky as
> >>>> something like 8099 (and the original rewrite by Jason). We should
> >> treat
> >>> it
> >>>> as such and not try to merge it in quickly for the sake of it,
> >> repeating
> >>>> past mistakes. The goal of reviewing the messaging service work was to
> >> do
> >>>> just that. It would be a disservice to rush in these changes now. If
> >> the
> >>>> goal is to get the patch in that should be the priority, not
> completing
> >>> it
> >>>> “in two weeks”. It’s clear several community members have pushed back
> on
> >>>> that and are not comfortable with the time frame.
> >>>>
> >>>> 3. If we need to add special forks of Netty classes to the code
> because
> >>> of
> >>>> “how we use Netty” that is a concern to me w.r.t to q

Re: Cassandra 4.0 Quality and Stability Update

2019-04-12 Thread Jordan West

Hi Dinesh,

Great question! Unfortunately I don’t have a great definition of what
“alpha” means in the Cassandra community so it’s hard for me to answer that
directly. However, I am of the opinion that we are not yet at the point of
being able to branch trunk — we are finding too many bugs at too quick a
pace still and have yet to make enough significant progress on the test
plan [1] previously linked. I do think it would be beneficial to cut an
official build (maybe after internode messaging settles down) as a preview
for the community and to make it easier for folks to run on dev/test
hardware. In the Riak community we called these “pre” builds (Riak 2.0.0preX)
and they were nothing more than a stable place on trunk released
periodically until we reached a point where we branched.

Regarding metrics, the first major step towards that was Benedict’s and
others’ work (thanks all!) to re-organize JIRA. We now have a better set of
inputs to automatically build reports around release quality metrics, etc.
We have yet to take this and turn it into JIRA reports but I am working
with Scott Andreas on it — I don’t have a timeframe just yet but I hope
soon. If you would like to help please let me know.

In the meantime, Scott and I have kept a list which is where the data I
used came from. We absolutely need to make this public and the efforts
mentioned above will accomplish that.

Jordan

[1]
https://cwiki.apache.org/confluence/display/CASSANDRA/4.0+Quality%3A+Components+and+Test+Plans

On Thu, Apr 11, 2019 at 4:21 PM Dinesh Joshi  wrote:

> Hey Jordan,
>
> Thanks for the update! Do you have a sense of where we are in terms of
> stability and where do we need to be in order to cut an alpha? I also
> remember a discussion on measuring release quality[1]. Not sure where we
> landed on it. Any idea how we are doing on that front?
>
> Thanks,
>
> Dinesh
>
> [1]
> https://lists.apache.org/thread.html/3a444be1a3097c0c55d15268ccb0fe7aab83ef276b87bf55bf4f3bc2@%3Cdev.cassandra.apache.org%3E
>
> > On Apr 10, 2019, at 8:25 AM, Jordan West  wrote:
> >
> > In September, the community chose to freeze trunk to begin working on
> > Quality and Stability with the goal of releasing the most stable
> Cassandra
> > major in the project’s history. While lots of work has been ongoing and
> > folks could follow along with progress on JIRA I thought it would be
> useful
> > to cover what has been accomplished so far since I’ve spent a good amount
> > of time working with others on various testing projects.
> >
> > During this time we have made significant progress on improving the
> Quality
> > and Stability of Cassandra — not only Cassandra 4.0 but also the
> Cassandra
> > 3.x series and future Cassandra releases. Additionally, testing has
> > provided the opportunity for new community members and committers to
> > contribute. While not comprehensive the community has found at least 25
> > bugs that can be classified as either Data Loss, Corruption, Incorrect
> > Response, Loss of Stability, Loss of Availability, Concurrency Issues,
> > Performance Issues, and Lack of Safety. These bugs have been found by a
> > variety of methodologies including commonly used ones like unit testing
> and
> > canary deployments. However, the majority of the bugs have been found or
> > confirmed using new methodologies like the ones described in some
> recent
> > blog posts [1] [2].
> >
> > Additionally, the state of the test suites and test tooling have
> improved.
> > CASSANDRA-14806 [3] brought some much welcomed improvements to the
> circleci
> > workflow and made it easier for people to run (d)tests on supported
> > platforms (jdk8/11) and the work to get upgrade tests running found
> several
> > bugs including CASSANDRA-14958 [4].
> >
> > While we have made significant progress there is still more to do before
> we
> > can be truly confident in a Cassandra 4.0 release. Some ongoing and
> > outstanding work includes:
> >
> > * Improving the state of the cqlsh tests [5]
> > * There is ongoing discussion on the new MessagingService [6] which will
> > require significant review and testing
> > * Additional upgrade testing for Cassandra 4.0 including additional
> support
> > for upgrade testing using in-jvm dtests [7]
> > * Work to increase coverage of important areas and new features in
> > Cassandra 4.0 [8]
> >
> > While the list above may seem short, the last item contains a long list
> of
> > important areas the community has previously discussed adding coverage
> to.
> > If you are looking for areas to contribute this is a great starting
> point.
> > If there is a name down on an area you are interested in I

Re: Stabilising Internode Messaging in 4.0

2019-04-12 Thread Jordan West

Since there seems to be an assumption that I haven’t read the code let me
clarify: I am working on making time to be a reviewer on this and I have
already spent a few hours with the patch before I sent any replies, likely
more than most who are replying here. Again, just because I disagree on
non-technical matters does not mean I haven’t considered the technical. I
am sharing what I think is necessary for the authors
to make review higher quality. I will not compromise my review standards on
a patch like this as I have said already. Telling me to review it to talk
more about it directly ignores my feedback and requires me to acquiesce all
of my concerns, which as I said I won’t do as a reviewer.

And yes I am arguing for changing how the Cassandra community approaches
large patches. In the same way the freeze changed how we approached major
releases and the decision to do so has been a net benefit as measured by
quality and stability. Existing community members have already chimed in in
support of things like better commit hygiene.

The past approaches haven’t prioritized quality and stability and it really
shows. What I and others here are suggesting has worked all over our
industry and is adopted by companies big (like Google, as I linked
previously) and small (like many startups I and others have worked for).
Everything we want to do: better testing, better review, better code, is
made easier with better design review, better discussion, and more
digestible patches among many of the other things suggested in this thread.

Jordan

On Fri, Apr 12, 2019 at 12:01 PM Benedict Elliott Smith 
wrote:

> I would once again exhort everyone making these kinds of comment to
> actually read the code, and to comment on Jira.  Preferably with a
> justification by reference to the code for how or why it would improve the
> patch.
>
> As far as a design document is concerned, it’s very unclear what is being
> requested.  We already had plans, as Jordan knows, to produce a wiki page
> for posterity, and a blog post closer to release.  However, I have never
> heard of this as a requirement for review, or for commit.  We have so far
> taken two members of the community through the patch over video chat, and
> would be more than happy to do the same for others.  So far nobody has had
> any difficulty getting to grips with its structure.
>
> If the project wants to modify its normal process for putting a patch
> together, this is a whole different can of worms, and I am strongly -1.
> I’m not sure what precedent we’re trying to set imposing arbitrary
> constraints pre-commit for work that has already met the project’s
> inclusion criteria.
>
>
> > On 12 Apr 2019, at 18:58, Pavel Yaskevich  wrote:
> >
> > I haven't actually looked at the code
>
>
>
>

Re: [VOTE] Apache Cassandra Release Lifecycle

2019-10-07 Thread Jordan West

+1 nb — to both the document and the benefits listed in Scott’s email.

Jordan

On Fri, Oct 4, 2019 at 2:26 PM Scott Andreas  wrote:

> There are two main benefits to agreeing on this:
>
> 1. Providing clarity for contributors on release phases – i.e., what types
> of changes are expected to land or be deferred during a particular point in
> that cycle.
>
> 2. Providing semantic clarity to users of Cassandra in terms of what they
> can expect from a given release.
>
> The second one is more important. The document stands as a commitment
> between the Cassandra project and its users regarding what can be expected
> from each type of release. It defines GA releases as "recommended for
> production deployment for all users," setting a standard of quality that we
> aim to uphold together and that users can depend on. Affirming what it
> means for a release to be EOL, deprecated, or in maintenance signals
> importance of upgrading to a GA version.
>
> The prerelease phases set expectations for us as contributors to produce a
> more stable release: what type of testing/validation is needed at what
> time, at which point interfaces/protocols solidify, when a release is
> considered feature complete, etc.
>
> Creating clarity for ourselves will help us build a better database, and
> creating clarity for our users will help give them the confidence to run it.
>
> I want to thank Sumanth for his work on this document and to everyone else
> who's contributed.
>
> I support it (+1 nb).
>
> – Scott
>
> 
> From: Stefan Podkowinski 
> Sent: Tuesday, October 1, 2019 1:43 PM
> To: dev@cassandra.apache.org
> Subject: Re: [VOTE] Apache Cassandra Release Lifecycle
>
> What exactly will be the implication of the outcome of this vote, in
> case the content is agreed upon? What's the proposal of the voting?
>
> The document seems to be informative rather than formal. It's verbose on
> definitions that should be commonly understood or can only be broadly
> defined (what is alpha/beta/RC, GA for production, etc.), while at the
> same time is unclear and weasel-wordy on our actual commitment and rules.
>
>
> On 30.09.19 20:51, sankalp kohli wrote:
> > Hi,
> >  We have discussed in the email thread[1] about Apache Cassandra
> Release
> > Lifecycle. We came up with a doc[2] for it. Please vote on it if you
> agree
> > with the content of the doc[2].
> >
> > Thanks,
> > Sankalp
> >
> > [1]
> >
> https://lists.apache.org/thread.html/c610b23f9002978636b66d09f0e0481ed3de9b78895050da22c91c6f@%3Cdev.cassandra.apache.org%3E
> > [2]
> >
> https://docs.google.com/document/d/1bS6sr-HSrHFjZb0welife6Qx7u3ZDgRiAoENMLYlfz8/edit#heading=h.633eppni91tw
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>

Re: [VOTE-2] Apache Cassandra Release Lifecycle

2019-10-10 Thread Jordan West

+1 nb

On Wed, Oct 9, 2019 at 11:00 PM Per Otterström
 wrote:

> +1 nb
>
> -Original Message-
> From: Mick Semb Wever 
> Sent: den 10 oktober 2019 07:08
> To: dev@cassandra.apache.org
> Subject: Re: [VOTE-2] Apache Cassandra Release Lifecycle
>
>
> > We have discussed in the email thread[1] about Apache Cassandra
> > Release Lifecycle. We came up with a doc[2] for it. We have finalized
> > the doc here[3] Please vote on it if you agree with the content of the
> doc [3].
>
>
> +1 nb
>
> the doc is good. it adds value for users, and does not need to be perfect.
> thanks Sumanth for doing this.

Re: Offering some project management services

2020-01-10 Thread Jordan West

Extra time contributed to the project by an experienced community member in
either developer or project management areas would be very helpful in
completing 4.0. Thanks for volunteering, Josh -- and +1 on thanking Scott
for his existing efforts (and Benedict and others who worked to improve the
JIRA workflow this release cycle). I agree that having folks drive
regular community check-ins on progress (or work to keep progress up to
date), among other things, would be beneficial. One thing I'd love to see
again is a regular (every two weeks?) update on progress on the dev list
(similar to what Jeff Jirsa used to send around -- it also included a call
for reviews iirc).

Jordan

On Fri, Jan 10, 2020 at 9:04 AM Joshua McKenzie 
wrote:

> >
> > developer time from your employer would probably be more impactful
>
>  Certainly, and there's movement on that side as well but that's
> independent from my current purview so I don't feel it appropriate for me
> to speak to that.
>
>   the project has already largely agreed on the work that is necessary for
> > 4.0, and is executing on it as quickly as resources allow
>
> This JQL on the release
> <
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20cassandra%20and%20fixversion%20~%204.0%20and%20resolution%20%3D%20unresolved%20and%20status%20!%3D%20resolved%20and%20(assignee%20is%20empty%20or%20(reviewer%20is%20empty%20and%20reviewers%20is%20empty))%20order%20by%20priority%20desc%2C%20assignee
> >
> indicates that 49 of the 72 open issues are lacking either an assignee or a
> reviewer. I can only speak to my experience on this and other software
> projects, but I find a lot of things slip through the cracks by virtue of
> not having ownership for various points in their pipeline or stall based on
> people not realizing things are on their plate (we had quite a few tickets
> marked 4.0 assigned to people no longer active on the project, for
> instance).
>
> I also believe that what qualifies as scope for a release requires constant
> vigilance and healthy gentle skepticism from the devil's advocate position
> on minimizing scope to help counter-balance our tendencies as engineers to
> want to get things into releases, especially when there are longer cycle
> times. We've seen it on almost every major release on this project, and
> it's healthy and a great sign of people's passion and dedication to this
> project and their craft, but without a countering force I personally
> believe it leads to lengthened cycle times and isn't a healthy balance for
> the project. This is a strong opinion of mine but it's loosely held; I'm
> open to other data or experiences that can help shape this perspective.
>
> One thing I want to clarify - Scott in particular and the community as a
> whole has been doing great work both managing this project and driving
> things forward; I'm not trying to step into some perceived gap or rescue
> something, but rather meet people where they are and add what value I can
> and work with the project to help keep momentum high and remove blockers or
> stalls from people's workflows.
>
> Does the above make sense?
>
>
>
> On Fri, Jan 10, 2020 at 8:30 AM Benedict Elliott Smith <
> bened...@apache.org>
> wrote:
>
> > I personally welcome your increased participation in any role, and more
> > focus on project delivery is certainly a great thing.  But developer time
> > from your employer would probably be more impactful, as the main active
> > contributors right now have their own project management infrastructure,
> > and are already dedicating what resources they have to 4.0.  So it's not
> > 100% clear what resources you'll be able to facilitate better deploying.
> >
> > I think the project has already largely agreed on the work that is
> > necessary for 4.0, and is executing on it as quickly as resources allow.
> >
> >
> > On 10/01/2020, 16:18, "Joshua McKenzie"  wrote:
> >
> > Hey all,
> >
> > I've recently had some cycles free up I can dedicate to the
> open-source
> > project. My intuition is that I can add the most value right now by
> > engaging in some simple project management type work (help get
> > assignees
> > and reviewers for things critical path for 4.0, help stimulate and
> > facilitate discussions about scope for the upcoming release and
> > subsequent
> > releases, general triage and test board health, etc).
> >
> > Before I wade into the project and start poking and prodding us all,
> > does
> > anyone have any concerns with me stepping (back ;) ) into this role,
> or
> > have any feedback or recommendations before doing so?
> >
>

Re: Apache Cassandra Contributor Meeting

2020-01-20 Thread Jordan West

Thanks Patrick. Looking forward to tomorrow’s meeting. I added an agenda
item around 4.0 — it’s not my intention to lead that section necessarily
but I think a check in / progress update / follow up on Josh’s email will
be good to cover.

Jordan

On Mon, Jan 13, 2020 at 6:11 PM Patrick McFadin  wrote:

> And I sent this without saying when. Let me save you a click on the
> confluence link.
>
> January 21, 1PM PST
>
> On Mon, Jan 13, 2020 at 5:28 PM Patrick McFadin 
> wrote:
>
> > Hi everyone,
> >
> > In order to catch up on what's happening here, here's the establishing
> > thread:
> >
> https://lists.apache.org/thread.html/aa54420a43671c00392978f2b0920bc6926ca9ba1e61a486ad39fb21%40%3Cdev.cassandra.apache.org%3E
> >
> > Key points that Scott Andreas proposed in the initial email were
> >
> > Motivation for such a meeting
> > 1. We currently have Slack, JIRA and emails however an agenda driven
> video
> > meeting can help facilitate alignment within the community.
> > 2. This will give an opportunity to the community to summarize past
> > progress and talk about future tasks.
> > 3. Agenda notes can serve as newsletters for the community.
> >
> > To that, I humbly offer my services as a community organizer to help with
> > the logistics and setup. I'm happy to say this is finally happening and I
> > apologize this has taken so long. I saw some of the examples mentioned in
> > the original thread for other open source projects and I "borrowed"
> heavily
> > from them.
> >
> > I created a page in the Cassandra Confluence page to hopefully centralize
> > both logistics and records of each call. You can find it here:
> >
> https://cwiki.apache.org/confluence/display/CASSANDRA/Apache+Cassandra+Contributor+Meeting
> >
> > The meetings are on Zoom and set to be wide open. Anyone can join via
> > computer or phone. I'm using a tier that allows for 100 participants. If
> we
> > need more, I can change the type of meeting but it's more of a pain for
> > logistics. We can try this and see how it goes. Once the meeting starts
> > I'll hit record, I'll post the video on YouTube and add the link to the
> > notes. All meeting notes for each agenda items can live in the doc above
> > and remain as a permanent record. After the meeting, I'll send the notes
> > link to the dev list as a reminder that it happened to anyone subscribed.
> >
> > If you have agenda items, please edit the Confluence page and add your
> > name and what you would like discussed.
> >
> > My contribution here is as an organizer. Please feel free to email or
> > Slack if you need anything. Most important, a video meet is an alpha
> > product and we'll learn a lot from the first time trying. I'll try to
> keep
> > note of things to improve in the doc.
> >
> > See you there,
> >
> > Patrick
> >
>

Re: Cassandra 4.0 Dev Work Status

2020-01-22 Thread Jordan West

Hi Everyone,

Josh is traveling this week so he sent me a brief summary and I offered to
send it to the mailing list w/ a few updates. There was enough progress in
the last week to warrant an update.

The 4.0 board can be found at
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355. More
details below.

*Progress*: We closed 8 more tickets this week for a rolling total of 26
(up from 18 in the last update) of 122 (up from 115 last week) across
4.0-alpha, 4.0-beta, and 4.0. Closed:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20fixversion%20in%20(4.0%2C%204.0.0%2C%204.0-alpha%2C%204.0-beta)%20AND%20resolved%20%3E%3D%20-4w

Total:
https://issues.apache.org/jira/issues/?filter=12347782

*LHF / Failing Tests*: 3 of the 6 failing tests now have an assignee. The
remaining 3 unassigned tickets can be found at
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1660&quickFilter=1658

*Needs Reviewer*: 6 tickets need a reviewer. This is down from 10 last
week. They can be found at
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1659

*Available to work*: 3 alpha (the remaining test failures), 4 beta, and 18
RC issues are unassigned. They can be found at
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&view=detail&selectedIssue=CASSANDRA-15308&quickFilter=1661&quickFilter=1658

*Ready to Commit*: 7 tickets are marked ready to commit. They can be found
at
https://issues.apache.org/jira/browse/CASSANDRA-15461?jql=project%20%3D%20CASSANDRA%20AND%20fixversion%20in%20(4.0%2C%204.0.0%2C%204.0-alpha%2C%204.0-beta)%20AND%20status%20%3D%20%22Ready%20to%20Commit%22%20

*Testing*:  On our 4.0 Quality and Test Plan Wiki (
https://cwiki.apache.org/confluence/display/CASSANDRA/4.0+Quality%3A+Components+and+Test+Plans
)
we have 5 remaining open Shepherd positions (down from 7 last week). 11
areas do not have a tracking ticket (down from 13 last week). 13 areas
remain not started.

Thanks, everyone, for your contributions! It's exciting to see this progress.

Jordan

On Wed, Jan 15, 2020 at 5:37 AM Benedict Elliott Smith 
wrote:

> Specifically, if anyone's interested, I think we should probably maintain
> three tags for work landing in 4.0, e.g. 4.0-alpha1, 4.0-alpha, 4.0
>
> This helps track all of the relevant information, the first limited
> release, the first general release, and the point in the release process it
> was delivered.
>
> On 15/01/2020, 13:34, "Benedict Elliott Smith" 
> wrote:
>
> I think there's always been a distinction in the way we treat
> alphas/betas versus patch releases, because they have a staged delivery
> (landing for dev and users in different releases).  I don't know we've ever
> been totally consistent about it across major versions though.
>
> I think we can view 4.0-alpha as equivalent to 4.x, except that it has
> value being maintained after commit, to historically track where things
> land in the release process.  It was discussed somewhere, not ages ago and
> I can't remember where, that there was value in this.  There's probably
> also value in introducing 4.0-alpha1 etc on top.
>
> We should probably decide and document it, as you say, so that we can
> at least be consistent next major.
>
>
> On 15/01/2020, 13:18, "Joshua McKenzie"  wrote:
>
> Historically I believe we used the ".x" nomenclature to indicate
> general
> release we wanted things in (4.x, 3.11.x, 3.6.x, etc), and then
> upon merge
> update the FixVersion to reflect which release it actually went
> in. Is that
> still a thing, and whether a thing or not, is the current
> appropriate usage
> of FixVersion on the project documented somewhere?
>
> On Wed, Jan 15, 2020 at 2:24 AM Scott Andreas <
> sc...@paradoxica.net> wrote:
>
> > Just realized I'd misunderstood Mick's original email, apologies.
> >
> > I'd originally interpreted it as a question of prioritization,
> but the
> > intent was to ensure that the Fix Version field reflects the
> release a
> > given change is /included in/, not /originally targeted for/.
> Apologies for
> > my misunderstanding.
> >
> > Agreed yes; it'd make sense to update recently-committed items
> that have a
> > future fix version to indicate they were resolved during alpha.
> I haven't
> > seen fix version refer to specific alpha releases (given that
> there's just
> > one at the m

Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Jordan West

On Thu, Jan 23, 2020 at 9:09 AM Jeff Jirsa  wrote:

> On Thu, Jan 23, 2020 at 6:18 AM Jeremiah Jordan 
> wrote:
>
> > It is the reviewer's and author's job to make sure CI ran and didn’t
> > introduce new failing tests; it doesn’t matter how they were run. It is
> > just as easy to let something through when “pr triggered” tests have a
> > failure as it is tests manually linked from a JIRA comment, if the author
> > and reviewer think the failures are not new.
> >
>
> Agreed. Any committer who commits while tests are broken is ignoring
> policy. Moving patch submission from one system to another won't somehow
> make committers adhere to policy.
>

Agreed, but at the same time, having a mechanism that reduces the amount of
manual work required by the reviewer/committer (running tests/viewing
reported results) should increase the likelihood the policy is adhered to,
if it’s not being adhered to currently. I would welcome information being
automatically added from CI during review, and if PRs are an easy way to
accomplish that in the future then +1.


>
> >
> > If someone wants to set up some extra niceties, like auto triggered builds
> > or something, to happen if people use the PR workflow, then I see no
> > problem there. But I don’t think we need to force use of PRs.
> >
> > This is why I don’t think we need to “switch” to using PRs. There is no
> > need to switch. People can “also” use PRs. If someone who likes the PR
> > workflow sets up some more nice stuff to happen when it is used, that
> would
> > probably encourage more people to do things that way. But it doesn’t need
> > to be forced.
> >
>
> Agreed.
>
+1. I don’t think we need a requirement yet as much as encouragement. If some
folks start using this approach, hopefully its overall use will spread as
others see the benefits. I used it for the first time on my most recent
ticket and have found it convenient so far.

We can always revisit a requirement if failures on trunk continue to be a
problem. Better visibility would be a good start.

Jordan

Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-24 Thread Jordan West

Keeping trunk green at all times is a great goal to strive for, and I'd love
to continue to work towards it, but in my experience it's not easy. Flaky
tests, for the reasons folks mentioned, are a real challenge. A standard we
could use while we work towards the more ambitious one (and which, as Josh
mentioned, we are pretty close to using already) is one I've seen work well:
multiple successive green runs (ideally on different platforms as well)
before a certain release, plus better visibility/documentation of test runs
& flakiness.

We can make incremental improvements towards this! Some I've heard on this
thread or am personally interested in are below. I think even making one or
two of these changes would be an improvement.

- A regularly run / on commit trunk build, visible to the public, should
give us more visibility into test health vs. today's status quo of having to
search the CI history of different branches.

- A process of documenting known flaky tests, such as a JIRA and maybe an
annotation (or just a comment) that references that JIRA (not one that runs
the test multiple times to mask flakiness). Those JIRAs can be assigned to
specific releases in the current cycle like we have been doing for 4.0.
This could be paired w/ making it explicit when in the release cycle it's ok
to merge w/ flaky tests (if they are documented).

- Surfacing CI results on JIRA when CI is triggered (manually or
automatically) makes things easier for reviewers and for checking history at
a later date.

- Running CI automatically for contributions that the ASF says it's ok for
-- as David said, other projects seem to make this work and it doesn't seem
to be an insurmountable problem since the list of signed ICLA users is
known & the GitHub API is powerful.

- Automatically transitioning JIRAs to Patch Available when the PR method
is used to open a ticket (don't know if this is possible; currently it adds
the pull-request-available label)
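
To make the flaky-test documentation idea above a bit more concrete, here is a
rough sketch of what such an annotation could look like. Everything in it is
hypothetical: the KnownFlaky name, its jira and fixVersion fields, and the
placeholder ticket id CASSANDRA-00000 are made up for illustration and are not
anything that exists in the codebase. It only records which JIRA tracks a
flaky test; it deliberately does not retry anything.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class KnownFlakySketch {
    /**
     * Hypothetical marker for a test that is known to be flaky.
     * It documents the tracking JIRA; it does NOT re-run the test.
     */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface KnownFlaky {
        String jira();                  // tracking ticket, e.g. a CASSANDRA-* id
        String fixVersion() default ""; // release the fix is targeted at, if any
    }

    /** Stand-in test class; the method names and ticket id are placeholders. */
    static class ExampleTest {
        @KnownFlaky(jira = "CASSANDRA-00000", fixVersion = "4.0-beta")
        public void testSometimesTimesOut() {}

        public void testStable() {}
    }

    /** Collects test method name -> tracking JIRA for every annotated method. */
    static Map<String, String> flakyTests(Class<?> testClass) {
        Map<String, String> result = new TreeMap<>();
        for (Method m : testClass.getDeclaredMethods()) {
            KnownFlaky flaky = m.getAnnotation(KnownFlaky.class);
            if (flaky != null) {
                result.put(m.getName(), flaky.jira());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the documented flaky tests found on the stand-in class.
        System.out.println(flakyTests(ExampleTest.class));
    }
}
```

A report built this way could be published alongside CI results, so a reviewer
can tell a documented flaky failure apart from a new one without changing how
the test itself runs.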

Jordan


On Fri, Jan 24, 2020 at 9:30 AM Joshua McKenzie 
wrote:

> >
> > I also don't think it leads to the right behaviour or incentives.
>
> The gap between when a test is authored and the point at which it's
> determined to be flaky, as well as the difficulty with responsibility
> assignment (an "unrelated" change can in some cases make a previously stable
> test become flaky), makes this a real devil of a problem to fix. Hence its
> long and rich legacy. ;)
>
> While I agree with the general sentiment of "if we email the dev list with
> a failure, or we git blame a test and poke the author to fix it they'll do
> the right thing", we still end up in cases where people have rotated off
> the project and nobody feels a sense of ownership over a test failure for
> something someone else wrote, or a circumstance in which another change
> broke something, etc. At least from where I sit, I can't see a solution to
> this problem that doesn't involve some collective action for things not
> directly under one's purview.
>
> Also, fwiw in my experience, "soft" gatekeeping for things like this will
> just lead to the problem persisting into perpetuity. The problem strikes me
> as too complex and temporally / unpredictably distributed to be solvable by
> incentivizing the "right" behavior (proactive prevention of introduction of
> things like this, hygiene and rigor on authorship, etc), but I'm sure
> there's ways of approaching this that I'm not thinking of.
>
> But maybe I'm making a mountain out of a molehill. @bes - if you think that
> emailing the dev list when a failure is encountered on rotation would be
> sufficient to keep this problem under control with an obviously much
> lighter touch, I'm +1 for giving it a shot.
>
> On Fri, Jan 24, 2020 at 10:12 AM Benedict Elliott Smith <
> bened...@apache.org>
> wrote:
>
> > > due to oversight on a commit or a delta breaking some test the author
> > thinks is unrelated to their diff but turns out to be a second-order
> > consequence of their change that they didn't expect
> >
> > In my opinion/experience, this is all a direct consequence of lack of
> > trust in CI caused by flakiness.  We have finite time to dedicate to our
> > jobs, and figuring out whether or not a run is really clean for this
> patch
> > is genuinely costly when you cannot trust the result. Those costs
> > multiply rapidly across the contributor base.
> >
> > That does not conflict with what you are saying.  I don't, however, think
> > it is reasonable to place the burden on the person trying to commit at
> that
> > moment, whether or not by positive sentiment or "computer says no".  I
> also
> > don't think it leads to the right behaviour or incentives.
> >
> > I further think there's been a degradation of community behaviour to some
> > extent caused by the bifurcation of CI infrastructure and approach.
> > Ideally we would all use a common platform, and there would be regular
> > trunk runs to compare against, like-for-like.
> >
> > IMO, we should email dev@ if there are failing runs for trunk, and there
> > sh

Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-24 Thread Jordan West

Looks like there are two Slack plugins for Jenkins. They trigger after
builds, and if my rusty Jenkins-fu is right, the trunk build can be scheduled
to run daily and then have the plugin post to Slack when it's done. I'm not an
expert and can't poke at the Jenkins instance myself, so I'm not sure what
limitations there are.

https://plugins.jenkins.io/slack
https://plugins.jenkins.io/global-slack-notifier

Jordan

On Fri, Jan 24, 2020 at 11:27 AM Jeff Jirsa  wrote:

> Can someone find a circleci or jenkins bot that posts to the #cassandra-dev
> channel in ASF slack once a day?
>
>
> On Fri, Jan 24, 2020 at 11:23 AM Jordan West  wrote:
>
> > Keeping trunk green at all times is a great goal to strive for, I'd love
> to
> > continue to work towards it, but in my experience its not easy. Flaky
> > tests, for the reason folks mentioned, are a real challenge. A standard
> we
> > could use while we work towards the more ambitious one, and we are pretty
> > close to using already as Josh mentioned, that I've seen work well is
> > multiple successive green runs (ideally on different platforms as well)
> > before a certain release and better visibility/documentation of test
> runs &
> > flakiness.
> >
> > We can make incremental improvements towards this! Some I've heard on
> this
> > thread or am personally interested in are below. I think even making one
> or
> > two of these changes would be an improvement.
> >
> > - A regularly run / on commit trunk build, visible to the public, should
> > give us more visibility into test health vs. todays status quo of having
> to
> > search the CI history of different branches.
> >
> > - A process of documenting known flaky tests like a JIRA and maybe an
> > annotation (or just a comment) that references that JIRA (not that runs
> the
> > test multiple times to ask flakiness). Those JIRAs can be assigned to
> > specific releases in the current cycle like we have been doing for 4.0.
> > This could be paired w/ making it explicit when in the release cycle its
> ok
> > to merge w/ flaky tests (if they are documented).
> >
> > - Surfacing CI results on JIRA when CI is triggered (manually or
> > automatically) makes it easier for reviewers and checking history at a
> > later date.
> >
> > - Running CI automatically for contributions that the ASF says its ok for
> > -- as David said, other projects seem to make this work and it doesn't
> seem
> > to be an insurmountable problem since the list of signed ICLA users is
> > known & the GitHub API is powerful.
> >
> > - Automatically transitioning JIRAs to Patch Available when the PR method
> > is used to open a ticket (don't know if this is possible, currently it
> adds
> > the pull-request-available label)
> >
> > Jordan
> >
> >
> > On Fri, Jan 24, 2020 at 9:30 AM Joshua McKenzie 
> > wrote:
> >
> > > >
> > > > I also don't think it leads to the right behaviour or incentives.
> > >
> > > The gap between when a test is authored and the point at which it's
> > > determined to be flaky, as the difficulty with responsibility
> assignment
> > > (an "unrelated" change can in some cases make a previously stable test
> > > become flaky) makes this a real devil of a problem to fix. Hence it's
> > long
> > > and rich legacy. ;)
> > >
> > > While I agree with the general sentiment of "if we email the dev list
> > with
> > > a failure, or we git blame a test and poke the author to fix it they'll
> > do
> > > the right thing", we still end up in cases where people have rotated
> off
> > > the project and nobody feels a sense of ownership over a test failure
> for
> > > something someone else wrote, or a circumstance in which another change
> > > broke something, etc. At least from where I sit, I can't see a solution
> > to
> > > this problem that doesn't involve some collective action for things not
> > > directly under one's purview.
> > >
> > > Also, fwiw in my experience, "soft" gatekeeping for things like this
> will
> > > just lead to the problem persisting into perpetuity. The problem
> strikes
> > me
> > > as too complex and temporally / unpredictably distributed to be
> solvable
> > by
> > > incentivizing the "right" behavior (proactive prevention of
> introduction
> > of
> > > things like this, hygiene and rigor on authorship, etc), but I'm sure
> > > t

Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-24 Thread Jordan West

That’s awesome that we have that set up. I was checking out b.a.o after my
email and noticed some recent runs. I don’t mean to prescribe any specific
way of surfacing results as long as they are easily accessible to all
contributors (well documented where to find them, etc).

Progress on posting results to JIRA is also awesome.

Thanks Mick!


Jordan


On Fri, Jan 24, 2020 at 12:24 PM Mick Semb Wever  wrote:

> > In my opinion/experience, this is all a direct consequence of lack of
> trust in CI caused by flakiness.
>
>
> The challenge of this project's test state certainly feels
> insurmountable at times…
>
> Having been battling away with Jenkins, because I do have ASF access and
> don't have premium CircleCI access, I've developed a bit of a routine for
> evaluating the Jenkins CI results the best I can for even the most trivial
> of patches, so I've got some input to this…
>
> A canonical record of test results is important, and we didn't have that
> until yesterday: take a look in bui...@cassandra.apache.org.  It is now
> possible to search for commit SHAs and find their test results.
>
> And with the new pipeline builds these test results are summarised for all
> the different test build types. These summarised results also go to slack's
> #cassandra-builds channel. The summarised results contains a lot and I
> haven't completely verified them, any help would be appreciated there. The
> idea is also to post these results back to the jira ticket. How to do
> that is already figured out. This was discussed in the 'Cassandra CI
> Status' thread and in CASSANDRA-15496.
>
> In addition, build failures (and the resuming success) for the 'artifacts'
> build step go to the builds ML, and to the author (if their email address
> can be determined).  As we stabilise the pipeline's builds, eg starting
> with unit tests, we could then more easily move into the "no broken
> windows" mode.
>
>
> > I also don't think it leads to the right behaviour or incentives.
>
>
> I agree that a gatekeeping approach won't work, we need instead to
> incentivise more reviewing, code cleaners, test fixers, documenters, etc.
> These actions should be praised and valued as much as any other. That said,
> a little blame often goes a long way.
>
> With all this^ said I don't see the need for special daily build with
> results posted to the dev ML.
>
>
> regards,
> Mick

Re: Testing out JIRA as replacement for cwiki tracking of 4.0 quality testing

2020-02-01 Thread Jordan West

Thanks for taking this up Josh. I'm for whatever we think will result in a
more accurate view of progress. Edit access has been a friction point. I'd
like to hear from others as well, but generally I'm +1 to giving it a
shot.

Jordan

On Thu, Jan 30, 2020 at 1:45 PM Joshua McKenzie 
wrote:

> From my 4.0 status progress email earlier today, we still have quite a few
> testing initiatives that are lacking Shepherds or tracking tickets in JIRA:
> [Areas needing Shepherds] - 6
> ...
>
> [Areas needing tracking tickets] - 11
> ...
>
> I went ahead and tried out the format of creating an epic in JIRA as a
> central location to collect this information in one place. A WIP look at
> this is here:
> https://issues.apache.org/jira/browse/CASSANDRA-15536. I don't want to get
> too far into prototyping this as if we don't collectively want to go this
> route, I don't want to have 11 JIRAs created plus an epic we'd then delete
> and spam the list.
>
> My .02: I think it'd improve our ability to collaborate and lower friction
> to testing if we could do so on JIRA instead of the cwiki. *I suspect* the
> edit access restrictions there plus general UX friction (difficult to have
> collab discussion, comment chains, links to things, etc) make the Confluence
> wiki a worse tool for this job than JIRA. Plus if we do it in JIRA we can
> track the outstanding scope in the single board and it's far easier to
> visualize everything in one place so we can all know where attention and
> resources need to be directed to best move the needle on things.
>
> But that's just my opinion. What does everyone else think? Like the JIRA
> route? Hate it? No opinion?
>
> If we do decide we want to go the epic / JIRA route, I'd be happy to
> migrate the rest of the information in there for things that haven't been
> completed yet on the wiki (ticket creation, assignee/reviewer chains, links
> to epic).
>
> So what does everyone think?
>

Re: 20200217 4.0 Status Update

2020-02-18 Thread Jordan West

On Mon, Feb 17, 2020 at 12:52 PM Jeff Jirsa  wrote:

>
> beyond the client proto change being painful for anything other than major
> releases
>
>
This came up during the community meeting today and I wanted to bring a
question about it to the list: could someone who is *very* familiar with
the client proto share w/ the list why changing the proto in anything other
than a major release is so difficult? I hear this a lot and it seems to be
fact. So that all of us don't have to go read the code, a brief summary
would be super helpful. Or if there is a ticket that already covers this
even better! I'd also be curious if there have ever been any thoughts to
address it as it seems to be a consistent hurdle during the release cycle
and one that tends to further increase scope.

Thanks,
Jordan

>
>
> > On Feb 17, 2020, at 12:43 PM, Jon Meredith 
> wrote:
> >
> > My turn to give an update on 4.0 status. The 4.0 board created by Josh
> can
> > be found at
> >
> >
> > https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355.
> >
> >
> > We have 94 unresolved tickets marked against the 4.0 release. [1]
> >
> >
> > Things seem to have settled into a phase of working to resolve issues,
> with
> > few new issues added.
> >
> >
> > 2 new tickets opened (that are marked against 4.0)
> >
> > 11 tickets closed (including one of the newly opened ones)
> >
> > 39 tickets received updates to JIRA of some kind in the last week
> >
> >
> > Cumulative flow over the last couple of weeks shows todo reducing and
> done
> > increasing as it should as we continue to close out work for the release.
> >
> >
> >
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&projectKey=CASSANDRA&view=reporting&chart=cumulativeFlowDiagram&swimlane=939&swimlane=936&swimlane=931&column=1505&column=1506&column=1514&column=1509&column=1512&column=1507&days=14
> >
> >
> > Notables
> >
> > - Python 3 support for cqlsh has been committed (thank you all who
> > persevered on this)
> >
> > - Some activity on Windows support - perhaps not dead yet.
> >
> > - Lots of movement on documentation
> >
> > - Lots of activity on flaky tests.
> >
> > - Oldest ticket with a patch award goes to CASSANDRA-2848
> >
> >
> > There are 18 tickets marked as patch available (easy access from the
> > Dashboard [2], apologies if they're already picked up for review)
> >
> >
> > CASSANDRA-15567 Allow EXTRA_CLASSPATH to work in tarball/source
> > installations
> >
> > CASSANDRA-15553 Preview repair should include sstables from finalized
> > incremental repair sessions
> >
> > CASSANDRA-15550 Fix flaky test
> > org.apache.cassandra.streaming.StreamTransferTaskTest
> > testFailSessionDuringTransferShouldNotReleaseReferences
> >
> > CASSANDRA-15488/CASSANDRA-15353 Configuration file
> >
> > CASSANDRA-15484/CASSANDRA-15353 Read Repair
> >
> > CASSANDRA-15482/CASSANDRA-15353 Guarantees
> >
> > CASSANDRA-15481/CASSANDRA-15353 Data Modeling
> >
> > CASSANDRA-15393/CASSANDRA-15387 Add byte array backed cells
> >
> > CASSANDRA-15391/CASSANDRA-15387 Reduce heap footprint of commonly
> allocated
> > objects
> >
> > CASSANDRA-15367 Memtable memory allocations may deadlock
> >
> > CASSANDRA-15308 Fix flakey testAcquireReleaseOutbound -
> > org.apache.cassandra.net.ConnectionTest
> >
> > CASSANDRA-15305 Fix multi DC nodetool status output
> >
> > CASSANDRA-14973 Bring v5 driver out of beta, introduce v6 before 4.0
> > release is cut
> >
> > CASSANDRA-14939 fix some operational holes in incremental repair
> >
> > CASSANDRA-14904 SSTableloader doesn't understand listening for CQL
> > connections on multiple ports
> >
> > CASSANDRA-14842 SSL connection problems when upgrading to 4.0 when
> > upgrading from 3.0.x
> >
> > CASSANDRA-14761 Rename speculative_retry to match additional_write_policy
> >
> > CASSANDRA-2848 Make the Client API support passing down timeouts
> >
> >
> > *LHF / Failing Tests*: We have 7 unassigned test failures that are all
> >
> > great candidates to pick up and get involved in:
> >
> >
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&projectKey=CASSANDRA&quickFilter=1660&quickFilter=1661&quickFilter=1658
> >
> >
> > Thanks again to everybody for all the contributions. It's really good to
> > see the open issue count start dropping.
> >
> >
> > Feedback on whether this information is useful and how it can be improved
> > is both welcome and appreciated.
> >
> >
> > Cheers, Jon
> >
> >
> > [1] Unresolved 4.0 tickets
> >
> https://issues.apache.org/jira/browse/CASSANDRA-15567?filter=12347782&jql=project%20%3D%20cassandra%20AND%20fixversion%20in%20(4.0%2C%204.0.0%2C%204.0-alpha%2C%204.0-beta)%20AND%20status%20!%3D%20Resolved
> >
> > [2] Patch Available
> >
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12334910

20200224 4.0 Status Update

2020-02-24 Thread Jordan West

The board can be found at
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355

We continue to make positive progress. There are 92 tickets [1] that remain
open at this time. We opened 4 tickets in the last 7 days and closed 16
(including 3 of the 4 new tickets). We also added two tickets to the
tracking that were improperly tagged and not covered by last week's query.
NOTE: this excludes the epic tickets created by Josh as part of his effort
to move the test plan from cwiki to JIRA.

Cumulative flow for the last week shows a slight increase in total work, a
noticeable increase in completed work, and a significant decrease in
tickets awaiting review [2].

Notable Tickets That Need a Reviewer:

- Flaky test fix: https://issues.apache.org/jira/browse/CASSANDRA-15308
- BufferPool causing nodes to crash bug:
https://issues.apache.org/jira/browse/CASSANDRA-15358
- Allow EXTRA_CLASSPATH to work in tarball/source installations:
https://issues.apache.org/jira/browse/CASSANDRA-15567
- SSL issue (looks like it might need to be picked up for more than
review?): https://issues.apache.org/jira/browse/CASSANDRA-14842
- SSTableloader bug (also looks like it might have more work to do besides
review): https://issues.apache.org/jira/browse/CASSANDRA-14904

Unassigned Tickets (Mostly Test Failures):

https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&projectKey=CASSANDRA&quickFilter=1660&quickFilter=1661&quickFilter=1658

Notable Patches Merged This Week:

- Lots of doc fixes (too many to list here, thanks to everyone involved in
this effort!)
- Several flaky test fixes including:
https://issues.apache.org/jira/browse/CASSANDRA-15575
- A gossip bug fix: https://issues.apache.org/jira/browse/CASSANDRA-15592
- Memory consumption improvements in metrics:
https://issues.apache.org/jira/browse/CASSANDRA-15213

Thanks Everyone!
Jordan

[1]
https://issues.apache.org/jira/issues/?jql=project%20%3D%20cassandra%20AND%20fixversion%20IN%20(4.0%2C%204.0.0%2C%204.0-alpha%2C%204.0-beta)%20AND%20%20status%20!%3D%20resolved

[2]
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&projectKey=CASSANDRA&view=reporting&chart=cumulativeFlowDiagram&swimlane=939&swimlane=936&swimlane=931&column=1505&column=1506&column=1514&column=1509&column=1512&column=1507&days=14
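
For readers who want to see what the filter in [1] actually asks JIRA for,
the percent-encoded jql parameter can be decoded with the Java standard
library. This is only an illustrative sketch: the class and method names
(JqlDecoder, decode) are made up for this example and are not part of
Cassandra or JIRA. The encoded string is copied from the URL in [1].

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only: makes the jql parameter
// from footnote [1] human-readable.
final class JqlDecoder {

    // The jql value exactly as it appears in the footnote URL.
    static final String ENCODED_JQL =
        "project%20%3D%20cassandra%20AND%20fixversion%20IN%20"
        + "(4.0%2C%204.0.0%2C%204.0-alpha%2C%204.0-beta)"
        + "%20AND%20%20status%20!%3D%20resolved";

    static String decode(String encoded) {
        // URLDecoder turns %XX escapes back into characters
        // (and would turn '+' into a space, though none occur here).
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode(ENCODED_JQL));
    }
}
```

Read this way, the count of open tickets is simply every CASSANDRA issue
with one of the four listed fix versions whose status is not resolved.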

Re: [proposal] Introduce AssertJ in test framework

2020-03-10 Thread Jordan West

If it encourages more and higher quality test writing +1 (nb). Also, low
risk given it’s a test dep.

As with QuickTheories, introducing it alongside a new or updated test
could be a good way to get it merged.

Jordan

On Tue, Mar 10, 2020 at 10:33 AM Benjamin Lerer 
wrote:

> +1
>
> On Tue, Mar 10, 2020 at 6:18 PM Jon Haddad  wrote:
>
> > I've used assertj in a lot of projects, I prefer it by a wide margin over
> > using only junit.
> >
> > On Tue, Mar 10, 2020 at 9:45 AM David Capwell 
> wrote:
> >
> > > +1 from me
> > >
> > > In CASSANDRA-15564 I build my own assert chain to make the tests
> cleaner;
> > > did it since assertj wasn't there.
> > >
> > > On Tue, Mar 10, 2020, 9:28 AM Kevin Gallardo <
> > kevin.galla...@datastax.com>
> > > wrote:
> > >
> > > > > I would like to propose adding AssertJ
> > > > > <https://assertj.github.io/doc/> as a test dependency and therefore
> > > > > have it available for writing unit/distributed/any test assertions.
> > > >
> > > > > In addition to the examples mentioned on the AssertJ docs page
> > > > > (allows to do elaborate and comprehensible assertions on
> > > > > Collections, String, and *custom assertions*), here's an example of
> > > > > a dtest I was looking at, that could be translated to AssertJ
> > > > > syntax, just to give an idea of how the syntax would apply:
> > > > *JUnit asserts*:
> > > > > try {
> > > > >     [...]
> > > > > } catch (Exception e) {
> > > > >     Assert.assertTrue(e instanceof RuntimeException);
> > > > >     RuntimeException re = ((RuntimeException) e);
> > > > >     Assert.assertTrue(re.getCause() instanceof ReadTimeoutException);
> > > > >     ReadTimeoutException rte = ((ReadTimeoutException) e.getCause());
> > > > >     Assert.assertTrue(rte.getMessage().contains("blabla")
> > > > >                       && rte.getMessage().contains("andblablo"));
> > > > > }
> > > >
> > > > *AssertJ style:*
> > > > > try {
> > > > >     [...]
> > > > > } catch (Exception e) {
> > > > >     Assertions.assertThat(e)
> > > > >               .isInstanceOf(RuntimeException.class)
> > > > >               .hasCauseExactlyInstanceOf(ReadTimeoutException.class)
> > > > >               .hasMessageContaining("blabla")
> > > > >               .hasMessageContaining("andblablo");
> > > > > }
> > > >
> > > > > The syntax is more explicit and more comprehensible, but more
> > > > > importantly, when one of the JUnit assertTrue() calls fails, you
> > > > > don't know *why*, you only know that the resulting boolean
> > > > > expression is false.
> > > > > If a failure happened with the AssertJ tests, the failure would say
> > > > > "Exception did not contain expected message, expected "blabla",
> > > > > actual "notblabla"" (same for a lot of other situations); this makes
> > > > > debugging a failure after a test ran and failed much easier. With
> > > > > JUnit asserts you would have to additionally add a message
> > > > > explaining what the expected value is *and* what the actual value
> > > > > is, for each assert that is more complex than an assertEquals on a
> > > > > number, I suppose. I have seen a lot of tests so far that only test
> > > > > the expected behavior via assertTrue and do not show the incorrect
> > > > > values when the test fails, which would come for free with AssertJ.
> > > >
> > > > Other examples randomly picked from the test suite:
> > > >
> > > >
> > > >
> > >
> >
> *org.apache.cassandra.repair.RepairJobTest#testNoTreeRetainedAfterDistance:*
> > > > Replace assertion:
> > > > > assertTrue(messages.stream().allMatch(m -> m.verb() == Verb.SYNC_REQ));
> > > > > With:
> > > > > assertThat(messages)
> > > > >     .extracting(Message::verb)
> > > > >     .containsOnly(Verb.SYNC_REQ);
> > > >
> > > > As a result, if any of the messages is not a Verb.SYNC_REQ, the test
> > > > failure will show the actual "Verb"s of messages.
> > > >
> > > > Replace:
> > > > assertTrue(millisUntilFreed < TEST_TIMEOUT_S * 1000);
> > > > With:
> > > > > assertThat(millisUntilFreed)
> > > > >     .isLessThan(TEST_TIMEOUT_S * 1000);
> > > > >
> > > > > Same effect if the condition is not satisfied, more explicit error
> > > > > message explaining why the test failed.
> > > > >
> > > > > AssertJ also allows Custom assertions which are also very useful and
> > > > > could potentially be leveraged in the future.
> > > > >
> > > > > This would only touch on the tests' assertions, the rest of the test
> > > > > setup and execution remains untouched (still uses JUnit for the test
> > > > > execution).
> > > > Thanks.
> > > >
> > > > --
> > > > Kévin Gallardo.
> > > >
> > >
> >
>
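
The core argument in the proposal above is that a bare boolean assertion
loses the actual values on failure, while a descriptive assertion reports
them. The sketch below illustrates that difference using only the Java
standard library. It is a hand-rolled stand-in, not AssertJ: the class and
method names (DescriptiveAssertDemo, assertAllExtractedEqual) are
hypothetical, and AssertJ's real API is richer and differs in detail.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustration only: contrasts a bare boolean check with a check that
// reports the values it actually saw.
final class DescriptiveAssertDemo {

    // Bare boolean check: on failure the error carries no values at all.
    static void assertTrue(boolean condition) {
        if (!condition) {
            throw new AssertionError();
        }
    }

    // Descriptive check: on failure the error lists every extracted value.
    static <T, R> void assertAllExtractedEqual(List<T> items,
                                               Function<T, R> extractor,
                                               R expected) {
        List<R> actual = items.stream().map(extractor).collect(Collectors.toList());
        for (R value : actual) {
            if (!expected.equals(value)) {
                throw new AssertionError(
                    "expected only <" + expected + "> but found " + actual);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in data; the real test works on Message objects and Verb values.
        List<String> verbs = List.of("SYNC_REQ", "VALIDATION_REQ");
        try {
            assertTrue(verbs.stream().allMatch("SYNC_REQ"::equals));
        } catch (AssertionError e) {
            System.out.println("boolean check: " + e.getMessage());
        }
        try {
            assertAllExtractedEqual(verbs, v -> v, "SYNC_REQ");
        } catch (AssertionError e) {
            System.out.println("descriptive check: " + e.getMessage());
        }
    }
}
```

One caveat on the quoted AssertJ translation: hasMessageContaining there is
applied to the outer exception rather than to its cause, so it is not an
exact equivalent of the JUnit version, which checks the message of the
ReadTimeoutException.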

20200317 4.0 Status Update

2020-03-17 Thread Jordan West

Hi Everyone!

One day late again but here is the weekly status update. Link to JIRA
board:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&projectKey=CASSANDRA

We opened 3 new tickets last week and closed 13 including 1 of the new
tickets. The current open count is 98 (reminder that this now includes all
4.0-rc tagged tickets).
Opened:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1670
Closed:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1671

There are 19 tickets with no assignee. Notably, there are none left in
alpha! There are still some flaky tests looking for assignees, however.
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&selectedIssue=CASSANDRA-14801&quickFilter=1658

The largest outstanding body of work is to complete the test plans in
https://issues.apache.org/jira/browse/CASSANDRA-15536. If you are able to
contribute in one of the areas please work with the assignees and other
contributors. Folks involved in each area are working on lists of what to
test. A few areas still have no assignee.

8 tickets need a reviewer. Half are flaky test fixes (yay!).
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&selectedIssue=CASSANDRA-14904&quickFilter=1659

Cumulative flow diagram continues to show healthy progress. The total
amount of work is still growing but the amount of completed work is growing
more quickly.
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&projectKey=CASSANDRA&view=reporting&chart=cumulativeFlowDiagram&swimlane=939&swimlane=936&swimlane=931&column=1505&column=1506&column=1514&column=1509&column=1512&column=1507&days=90

Thanks everyone!
Jordan

20200407 4.0 Status Update

2020-04-07 Thread Jordan West

Hi Everyone,

The board can be found here:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355

[Tickets That Need Attention]
A reminder that Josh has added a new 'Needs Attention' filter to show any
tasks that are stalled, need an assignee or need a reviewer. Makes it easy
if you would like to find something to work on that helps push us closer to
4.0.0's release.

Needs Attention:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1719

[Alpha Status]
We continue to have no tickets that need assignees in alpha. Of the
remaining tickets in alpha, 8 have a reviewer / reviews are in progress, 3
are in need of reviewers, 3 are in progress, and 3 are not started or
require more information (or are being worked on but haven't had ticket
metadata updated). The tickets that need a reviewer or are not yet started
are a great place to help out if you are looking for something to
contribute.

Needs Reviewer:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1659

[Stalled Tickets]
63 tickets are stalled (have not been updated in >14d). 6 of these are
tagged for alpha. Of the remaining, 14 are in beta and 43 are in rc.

Stalled:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1694

[Open vs. Closed Last 7 Days & Cumulative Flow]
We opened 14 issues in the last 7 days and closed 26 (8 of which were new).
Most of the new tickets were test failures that were fixed in alpha. While
the number of issues in the release continues to grow (many of which are
flaky test failures), we continue to make more progress completing issues
(as shown by this week's net of 12 closed tickets).

Notable New Tickets:
https://issues.apache.org/jira/browse/CASSANDRA-15690 is a critical read
path bug that can lead to a transient incorrect response. This affects 4.0
as well as previous versions of Cassandra including 3.11 and 3.0.x.

Opened:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1670
Closed:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1671

Cumulative Flow:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&view=reporting&chart=cumulativeFlowDiagram&swimlane=939&swimlane=936&swimlane=931&column=1505&column=1506&column=1514&column=1509&column=1512&column=1507&days=30

Thanks for all your efforts! Also, if there is anything else you would like
to see in these weekly updates please let us know.

Jordan

Re: Keeping test-only changes out of CHANGES.txt

2020-04-08 Thread Jordan West

+1 (nb) to the change and +1 (nb) to updating the docs to reflect this.

Jordan

On Wed, Apr 8, 2020 at 11:30 AM  wrote:

> +1
>
> > El 8 abr 2020, a las 19:05, e.dimitr...@gmail.com escribió:
> >
> > +1
> >
> > Sent from my iPhone
> >
> >> On 8 Apr 2020, at 13:50, Joshua McKenzie  wrote:
> >>
> >> +1
> >>
>  On Wed, Apr 8, 2020 at 12:26 PM Sam Tunnicliffe 
> wrote:
> >>>
> >>> +1
> >>>
> > On 8 Apr 2020, at 15:08, Mick Semb Wever  wrote:
> 
>  Can we agree on keeping such test changes out of CHANGES.txt ?
> 
>  We already don't put entries into CHANGES.txt if it is not a change
>  from any previous release.
> 
>  There was some discussion before¹ about this, and the problem that
>  being selective meant what ended up there being arbitrary. I think
>  this can be solved with an easy rule of thumb that if it only touches
>  *Test.java classes, or it is only about fixing a test, then it
>  shouldn't be in CHANGES.txt. That means if the patch does touch any
>  runtime code then you do still need to add an entry to CHANGES.txt.
>  This avoids the whole "arbitrary" problem,  and maintains CHANGES.txt
>  as user-facing formatted text to be searched through.
> 
>  If there's agreement I can commit to going through 4.0 changes and
>  removing those that never touched runtime code.
> 
>  regards,
>  Mick
> 
>  ¹)
> >>>
> https://lists.apache.org/thread.html/a94946887081d8a408dd5cd01a203664f4d0197df713f0c63364a811%40%3Cdev.cassandra.apache.org%3E
>
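
The rule of thumb proposed in this thread (a patch that only touches test
code stays out of CHANGES.txt, while anything touching runtime code still
needs an entry) can be sketched as a small check over the changed paths.
This is an illustration, not a project tool: the class name, the method
names, and the assumption that test code is identified by a test/ prefix
or a Test.java suffix are all hypothetical, and the paths are example
values only.

```java
import java.util.List;

// Illustrative only: encodes the rule of thumb from this thread.
final class ChangesEntryRule {

    // Assumption for this sketch: a file counts as test-only if it lives
    // under test/ or its name ends in Test.java.
    static boolean isTestOnly(String path) {
        return path.startsWith("test/") || path.endsWith("Test.java");
    }

    // A patch needs a CHANGES.txt entry as soon as it touches any non-test file.
    static boolean needsChangesEntry(List<String> changedPaths) {
        return changedPaths.stream().anyMatch(p -> !isTestOnly(p));
    }

    public static void main(String[] args) {
        // Test-only patch: prints false.
        System.out.println(needsChangesEntry(List.of(
            "test/unit/org/apache/cassandra/repair/RepairJobTest.java")));
        // Patch that also touches runtime code: prints true.
        System.out.println(needsChangesEntry(List.of(
            "src/java/org/apache/cassandra/repair/RepairJob.java",
            "test/unit/org/apache/cassandra/repair/RepairJobTest.java")));
    }
}
```

A path-based check cannot capture the "it is only about fixing a test"
half of the rule, which still needs a human judgment call.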

Re: 20200407 4.0 Status Update

2020-04-08 Thread Jordan West

I’ve added a “LHF” quick filter to the board which filters on this. You may
have to click “Show More” to make it visible.

Jordan

On Tue, Apr 7, 2020 at 6:09 PM  wrote:

> Hi Manish,
> You can check those tickets which have Complexity: Low Hanging Fruit
>
> Ekaterina
>
> Sent from my iPhone
>
> > On 7 Apr 2020, at 20:04, Manish G  wrote:
> >
> > Hi,
> >
> > Can there be a filter like 'good first issue' which new people can use to
> > find issues to start with?
> >
> > Manish
> >
> >> On Tue, Apr 7, 2020 at 10:44 PM Jordan West  wrote:
> >>
> >> Hi Everyone,
> >>
> >> The board can be found here:
> >> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355
> >>
> >> [Tickets That Need Attention]
> >> A reminder that Josh has added a new 'Needs Attention' filter to show
> any
> >> tasks that are stalled, need an assignee or need a reviewer. Makes it
> easy
> >> if you would like to find something to work on that helps push us
> closer to
> >> 4.0.0's release.
> >>
> >> Needs Attention:
> >>
> >>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1719
> >>
> >> [Alpha Status]
> >> We continue to have no tickets that need assignees in alpha. Of the
> >> remaining tickets in alpha, 8 have a reviewer / reviews are in
> progress, 3
> >> are in need of reviewers, 3 are in progress, and 3 are not started or
> >> require more information (or are being worked on but haven't had ticket
> >> metadata updated). The tickets in need of a reviewer or are not yet
> started
> >> are a great place to help out if you are looking for something to
> >> contribute.
> >>
> >> Needs Reviewer:
> >>
> >>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1659
> >>
> >> [Stalled Tickets]
> >> 63 tickets are stalled (have not been updated in >14d). 6 of these are
> >> tagged for alpha. Of the remaining, 14 are in beta and 43 are in rc.
> >>
> >> Stalled:
> >>
> >>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1694
> >>
> >> [Open vs. Closed Last 7 Days & Cumulative Flow]
> >> We opened 14 issues in the last 7 days and closed 26 (8 of which were
> new).
> >> Most of the new tickets were test failures that were fixed in alpha.
> While
> >> the number of issues in the release continues to grow (many of which are
> >> flaky test failures), we continue to make more progress completing
> issues
> >> (as shown by this week's net of 12 closed tickets).
> >>
> >> Notable New Tickets:
> >> https://issues.apache.org/jira/browse/CASSANDRA-15690 is a critical
> read
> >> path bug that can lead to a transient incorrect response. This affects
> 4.0
> >> as well as previous versions of Cassandra including 3.11 and 3.0.x.
> >>
> >> Opened:
> >>
> >>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1670
> >> Closed:
> >>
> >>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1671
> >>
> >> Cumulative Flow:
> >>
> >>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&view=reporting&chart=cumulativeFlowDiagram&swimlane=939&swimlane=936&swimlane=931&column=1505&column=1506&column=1514&column=1509&column=1512&column=1507&days=30
> >>
> >> Thanks for all your efforts! Also, if there is anything else you would
> like
> >> to see in these weekly updates please let us know.
> >>
> >> Jordan
> >>

Re: Drivers support for Cassandra 4.0

2020-04-10 Thread Jordan West

On Thu, Apr 9, 2020 at 7:30 AM Alexandre Dutra 
wrote:

> * Java drivers 3.9.0 and 4.6.0 will be released in the next few weeks.
> They will include
> support for missing features (transient replication and
> now-in-seconds), effectively
> providing complete support for protocol v5 in its current state. To
> make it as easy as
> possible for users to adopt C* 4.0, we decided to release both major
> branches of the Java
> driver, including 3.x, even if this branch is now in maintenance mode.



This is great to hear and I think will be a big benefit to 4.0 adoption.

Thanks for the update!
Jordan

Re: [VOTE] Release Apache Cassandra 4.0-alpha4

2020-04-13 Thread Jordan West

+1 (non-binding)

Thanks to all those who ran the tests and checks!

Jordan

On Mon, Apr 13, 2020 at 6:02 PM Sumanth Pasupuleti <
sumanth.pasupuleti...@gmail.com> wrote:

> +1 (non-binding)
>
> All java8 UTs, jvmdtests and dtests pass
>
> https://circleci.com/workflow-run/d7b3f62d-c9ad-43d6-9152-2655e27feccb?signup-404=true
>
> On Mon, Apr 13, 2020 at 5:56 PM Jeff Jirsa  wrote:
>
> > +1
> >
> >
> > > On Apr 10, 2020, at 4:02 PM, Mick Semb Wever  wrote:
> > >
> > > Proposing the test build of Cassandra 4.0-alpha4 for release.
> > >
> > > sha1: d00c004cc10986fc41c2070f9c5d0007e03a45c3
> > > Git:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-alpha4-tentative
> > > Maven Artifacts:
> > >
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1202/org/apache/cassandra/cassandra-all/4.0-alpha4/
> > >
> > > The Source and Build Artifacts, and the Debian and RPM packages and
> > > repositories, are available here:
> > > https://dist.apache.org/repos/dist/dev/cassandra/4.0-alpha4/
> > >
> > > The vote will be open for at least 96 hours (longer than normal,
> > > because of Easter holidays for many). Everyone who has tested the
> > > build is invited to vote. Votes by PMC members are considered binding.
> > > A vote passes if there are at least three binding +1s.
> > >
> > > [1]: CHANGES.txt:
> > >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-alpha4-tentative
> > > [2]: NEWS.txt:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-alpha4-tentative
> > >
> > >
> > > regards,
> > > Mick

Re: Discussion: addition to CEP guide

2020-04-22 Thread Jordan West

+1 (nb) on both counts. Thanks for bringing this up!

Jordan

On Wed, Apr 22, 2020 at 11:53 AM Joshua McKenzie 
wrote:

> >
> > Maybe put it high up the list, e.g. after Description of Approach?
>
> Really great point. Definitely not the lowest priority item.
>
> I'll leave this thread open for another 24 or 48 for feedback; if
> noncontroversial I'll edit then.
>
> On Wed, Apr 22, 2020 at 1:45 PM Scott Andreas 
> wrote:
>
> > Sounds good to me as well, thanks for suggesting.
> >
> > 
> > From: Jon Haddad 
> > Sent: Wednesday, April 22, 2020 9:54 AM
> > To: dev@cassandra.apache.org
> > Subject: Re: Discussion: addition to CEP guide
> >
> > Great idea Josh, +1
> >
> > On Wed, Apr 22, 2020 at 9:47 AM Benedict Elliott Smith <
> > bened...@apache.org>
> > wrote:
> >
> > > +1.  This might also serve to produce specific points of discussion
> > around
> > > the topic as the CEP progresses.
> > >
> > > Maybe put it high up the list, e.g. after Description of Approach?
> > >
> > >
> > >
> > > On 22/04/2020, 17:40, "Joshua McKenzie"  wrote:
> > >
> > > Link to CEP guide:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652201
> > >
> > > Currently the CEP guide reads:
> > > ---
> > >
> > > > *A CEP should contain the following sections:*
> > > >
> > > >    - *Scope,*
> > > >    - *Goals (and non-goals),*
> > > >    - *Description of Approach,*
> > > >    - *Timeline,*
> > > >    - *Mailing list / Slack channels,*
> > > >    - *Related JIRA tickets.*
> > >
> > > --
> > > > What does everyone think about adding another bullet item as follows:
> > > >
> > > >    - A test plan covering performance, correctness, failure, and
> > > >      boundary conditions (as applicable)
> > > >
> > > > --
> > > > Or some variation thereof. I personally think it's worth calling out
> > > > "We should think about and discuss how we're going to test something
> > > > from a high level collectively before we dive into it", since we've
> > > > had some pain in the past in that area.
> > >
> > > Thoughts?

Re: Calling for release managers (Committers and PMC)

2020-05-07 Thread Jordan West

*raises hand*

- Jordan

On Thu, May 7, 2020 at 11:29 AM Mick Semb Wever  wrote:

> The Cassandra release process has had some improvements to be better in
> line with the ASF guidelines: sha256 & sha512 checksums, staged
> artefacts in svnpubsub, deb and rpm repositories complete and signed
> in staging, and separate scripts and manual steps merged together.
>
> The updated documentation for cutting, voting, and publishing a
> release is found here:
> https://cassandra.apache.org/doc/latest/development/release_process.html
>
> I am hoping to get as many Committers* and PMC members interested as
> possible for cutting a future release.
>
> Who is interested? How many names can I get :-)
>
> The more that are interested then the easier it is to take turns and
> be flexible depending on our own availability each time. I will help
> out everyone on their first run. Indeed most of my motivation in
> getting involved with the release process was to make it all as simple
> and as forgettable as possible, so the role of the release manager can
> change easily from release to release.
>
> *When a Committer cuts a release, a PMC member has to perform the very
> last post-vote publish step.
>
> regards,
> Mick
>
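
The message above mentions sha256 and sha512 checksums for release
artefacts. As a rough illustration of what such a checksum is, the sketch
below computes both digests with the Java standard library. It is not the
project's release tooling, which is not shown in this thread: the class
and method names are hypothetical, and the input bytes are a stand-in for
a real artefact.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch only; not the actual release scripts.
final class ChecksumSketch {

    // Returns the lowercase hex digest of the given bytes for an
    // algorithm name such as "SHA-256" or "SHA-512".
    static String hexDigest(String algorithm, byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Both algorithms are required on every Java platform, so this
            // would indicate a misspelled algorithm name.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] artifact = "stand-in for release artifact bytes"
            .getBytes(StandardCharsets.UTF_8);
        System.out.println("sha256: " + hexDigest("SHA-256", artifact));
        System.out.println("sha512: " + hexDigest("SHA-512", artifact));
    }
}
```

Someone verifying a download would compare a digest computed like this
against the published checksum file for the same artefact.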

4.0 Ticket Review and 20200512 Status Update

2020-05-12 Thread Jordan West

Hi Everyone,

This week's status update is below but we (Josh, Jon M, and myself) thought
it was important for us all to take a closer look at which parts of the
release cycle we've assigned different tickets to so we can get a better
idea of how close we truly are to releasing beta1 and what work is left
between then and rc/ga. With the release lifecycle document (
https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle) the
community agreed upon previously in mind, we looked at all outstanding
tickets.

In many cases, tickets were already assigned to where we would have
expected them to be based on that document and the ticket's
perceived severity. In a few cases, tickets' assigned stage in the cycle
was different from what the document outlined. In cases where we felt it
was obvious we have reached out to those involved and updated it
accordingly. Those tickets fell primarily into these categories:

* Client API / Configuration / Other user interfaces - As user
interfaces, these should be stabilized before the end of the alpha stage
* Testing Epic - should be completed by the end of the beta stage
* Changes in RC - We shouldn't be making substantial changes in a Release
Candidate but documentation, etc. is OK at this stage in the cycle.

Of the remaining tickets, some we felt were important for the community to
discuss on a case by case basis as we have with other larger changes in the
past. In particular, as we look at these, we think it is important for the
community to think about whether or not we should block the 4.0 release on
them or if they would be well suited to a subsequent release, one that we
feel confident will have a shorter cycle than 4.0.

Please look through them, especially if you are involved in one. We
encourage all discussion to happen outside this thread: either in JIRA or
in a separate [DISCUSS] thread on the mailing list. Questions I found
helpful while reviewing myself: 1) Would you block the release over this
ticket? 2) Would you prioritize this ticket over testing? 3) Does fixing
this ticket make 4.0 a more stable release?

Transient Replication:
- https://issues.apache.org/jira/browse/CASSANDRA-15670: Transient
Replication: unable to insert data when the keyspace is configured with the
SimpleStrategy
- https://issues.apache.org/jira/browse/CASSANDRA-14697: Transient
Replication 4.0 pre-release followup work
- https://issues.apache.org/jira/browse/CASSANDRA-14404: Transient
Replication & Cheap Quorums: Decouple storage requirements from consensus
group size using incremental repair

Repair & Streaming:
- https://issues.apache.org/jira/browse/CASSANDRA-15665: StreamManager
should clearly differentiate between "initiator" and "receiver" sessions
- https://issues.apache.org/jira/browse/CASSANDRA-14939: fix some
operational holes in incremental repair
- https://issues.apache.org/jira/browse/CASSANDRA-15406: Add command to
show the progress of data streaming and index build

Indexing:
- https://issues.apache.org/jira/browse/CASSANDRA-15533: Don't allocate
unneeded MergeIterator in OnDiskToken#iterator
- https://issues.apache.org/jira/browse/CASSANDRA-13606: Improve handling
of 2i initialization failures

Test Tooling:
- https://issues.apache.org/jira/browse/CASSANDRA-15624: Avoid lazy
initializing shut down instances when trying to send them messages

Other Features / Improvements:
- https://issues.apache.org/jira/browse/CASSANDRA-15241: Virtual table to
expose current running queries
- https://issues.apache.org/jira/browse/CASSANDRA-15211: Remove
BaseIterator.stopChild
- https://issues.apache.org/jira/browse/CASSANDRA-14793: Improve system
table handling when losing a disk when using JBOD
- https://issues.apache.org/jira/browse/CASSANDRA-14694: add latency sample
for speculative read repair writes
- https://issues.apache.org/jira/browse/CASSANDRA-14361: Allow
SimpleSeedProvider to resolve multiple IPs per DNS name
- https://issues.apache.org/jira/browse/CASSANDRA-13994: Remove COMPACT
STORAGE internals before 4.0 release
- https://issues.apache.org/jira/browse/CASSANDRA-2848: Make the Client API
support passing down timeouts (This one has been discussed before but given
its current status and being interface related it could block a beta1
release)

Ok, on with the regular status update:

[Board]
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355

[Tickets that Need Attention]
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&projectKey=CASSANDRA&quickFilter=1723&quickFilter=1719

There are 6 tickets in alpha that need attention and 18 in beta. The alpha
tickets are either flaky tests or configuration changes. The beta tickets
are primarily grouped into test plans and known bugs.

[Needs Reviewer]
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&projectKey=CASSANDRA&quickFilter=1661&quickFilter=1659

3 tickets are looking for reviewers, although only one is not related to
Transient Replication.

[Alpha Status]

Re: [DISCUSS] CASSANDRA-13994

2020-05-27 Thread Jordan West

On Wed, May 27, 2020 at 1:23 PM Joshua McKenzie 
wrote:

> Maybe. Do we just time box, say we're going to cut an RC and give it 4
> weeks, if nothing awful surfaces we GA?
>

I've seen that work well in the past on other projects. I agree with the
notion that RCs are real candidates for release if no one finds issues with
them. Ideally we would have as few RCs as possible and have more
alphas/betas.

>
> On Wed, May 27, 2020 at 4:12 PM Brandon Williams  wrote:
>
> > Absolutely my understanding.
> >
> > On Wed, May 27, 2020, 2:49 PM Jeremiah D Jordan <
> jeremiah.jor...@gmail.com
> > >
> > wrote:
> >
> > > > A clear point to cut RC's doesn't surface from the above for me.
> > > Releasing
> > > > an RC before broad verification seems wrong, and cutting an RC after
> > the
> > > 4
> > > > points above may as well be GA because it's all known scope.
> > >
> > > > Isn’t the whole point of an RC that it could be the GA?  It is a
> > > “release candidate”, meaning if no one finds any issues with it, that
> can
> > > > then become the release?  So that seems like exactly the right time to
> > make
> > > RC releases?
> > >
> > > > On May 27, 2020, at 2:45 PM, Joshua McKenzie 
> > > wrote:
> > > >
> > > > I think we're all on the same page here; I was focusing more on the
> > > release
> > > > lifecycles and sequencing than the entire version cycle. Good to
> > broaden
> > > > scope I think.
> > > >
> > > > One thing we're not considering is the separation of API changes from
> > > major
> > > > changes and how that intersects with release milestones.
> > > >
> > > > Meaning:
> > > > 1. alpha phase
> > > > 2. Milestone: API freeze (all API changes pushed to next major)
> > > > 3. beta phase
> > > > > 4. Verification phase (all major disruptive changes pushed to next major)
> > > >
> > > > A clear point to cut RC's doesn't surface from the above for me.
> > > Releasing
> > > > an RC before broad verification seems wrong, and cutting an RC after
> > the
> > > 4
> > > > points above may as well be GA because it's all known scope.
> > > >
> > > > Thoughts?
> > > >
> > > > On Wed, May 27, 2020 at 3:28 PM Scott Andreas 
> > > wrote:
> > > >
> > > >> That makes sense to me, yep.
> > > >>
> > > >> My hope and expectation is that the time required for "verification
> > > work"
> > > >> will shrink dramatically in the not too distant future - ideally to
> a
> > > >> period of less than a month. In this world, the cost of missing one
> > > train
> > > >> is reduced to catching the next one.
> > > >>
> > > >> One of the main goals in shifting focus from "testing" and "test
> > plans"
> > > to
> > > >> "test engineering" is automating as many aspects of release
> > > qualification
> > > >> as possible, with an asymptotic ideal as a function of compute
> > capacity
> > > and
> > > >> time. While such automation will never be complete (it's likely that
> > > >> development of new features will/must include qualification infra
> > > changes
> > > >> to exercise them), if we're able to apply the same rigor to major
> > > releases
> > > >> as we are to patchlevel builds with little incremental effort, I'd
> be
> > > >> thrilled.
> > > >>
> > > >> This is mostly a way of saying:
> > > >> – I like the cadence/sequencing Benedict proposes below.
> > > >> – I think improvements in test engineering can reduce/eliminate
> > > >> invalidation and may increase the scope of what can be a candidate
> for
> > > >> merge on a given branch
> > > >> – And if not, the cost of missing the train is lower because we'll
> be
> > > able
> > > >> to deliver major releases more often.
> > > >>
> > > >> Scott
> > > >>
> > > >> 
> > > >> From: Jeremiah D Jordan 
> > > >> Sent: Wednesday, May 27, 2020 11:54 AM
> > > >> To: Cassandra DEV
> > > >> Subject: Re: [DISCUSS] CASSANDRA-13994
> > > >>
> > > >> +1 strongly agree.  If we aren’t going to let something go into
> 4.0.0
> > > >> because it would "invalidate testing” then we can not let such a
> thing
> > > go
> > > >> into 4.0.1 unless we plan to re-do said testing for the patch
> release.
> > > >>
> > > >>> On May 27, 2020, at 1:31 PM, Benedict Elliott Smith <
> > > bened...@apache.org>
> > > >> wrote:
> > > >>>
> > > >>> I'm being told this still isn't clear, so let me try in a
> > bullet-point
> > > >> timeline:
> > > >>>
> > > >>> * 4.0 Beta
> > > >>> * 4.0 Verification Work
> > > >>> * [Merge Window]
> > > >>> * 4.0 GA
> > > >>> * 4.0 Minor Releases
> > > >>> * ...
> > > >>> * 5.0 Dev
> > > >>> * ...
> > > >>> * 5.0 Verification Work
> > > >>> * GA 5.0
> > > >>>
> > > >>> I think that anything that is prohibited from "[Merge Window]"
> > because
> > > >> it invalidates "4.0 Verification Work" must also be prohibited until
> > > "5.0
> > > >> Dev" because the next equivalent work that can now validate it
> occurs
> > > only
> > > >> at "5.0 Verification Work"
> > > >>>
> > > >>> On 27/05/2020, 19:05, "Benedict Elliott Smith" <
> bened...@apache.org
> > >
> > > >> wrote:

Re: [DISCUSSION] Flaky tests

2020-05-28 Thread Jordan West

> On Wed, May 27, 2020 at 5:13 PM Ekaterina Dimitrova <
> ekaterina.dimitr...@datastax.com> wrote:

> - No flaky tests according to Jenkins or CircleCI? Also, some people run
> > the free tier, others take advantage of premium CircleCI. What should be
> > the framework?

While I agree that we should use the Apache infrastructure as the canonical
infrastructure, failures in either (or any) environment matter when it comes
to flaky tests.

On Wed, May 27, 2020 at 5:23 PM Joshua McKenzie 
wrote:

>
> At least for me, what I learned in the past is we'd drive to a green test
> board and immediately transition it as a milestone, so flaky tests would
> reappear like a disappointing game of whack-a-mole. They seem frustratingly
> ever-present.
>
>
Agreed. Having multiple successive green runs would be a better bar than
one on a single platform imo.

> I'd personally advocate for us taking the following stance on flaky tests
> from this point in the cycle forward:
>
>- Default posture to label fix version as beta
>- *excepting* on case-by-case basis, if flake could imply product defect
>that would greatly impair beta testing we leave alpha
>

I would be in favor of tightening this further to flakes that imply
interface changes or major defects (e.g. corruption or data loss). To do
so would require evaluation of the flaky test, however, which I think is in
sync with our "start in alpha and make exceptions to move to beta" approach. The
difference would be that we better define and widen what flaky tests can be
punted to beta and my guess is we could already evaluate all outstanding
flaky test tickets by that bar.

Jordan

20200602 Cassandra 4.0 Status Update

2020-06-03 Thread Jordan West

Hi Everyone,

[Board]
The board can be found here:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355

[New Tickets]
In the last 7 days we have opened 7 new tickets and closed 1. Two were in
Alpha (including the one that was closed) and the rest were assigned
to Beta. Please remember to be thoughtful about what Fix Version you assign
a ticket to as we get closer to being able to release beta1.

https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle

[Tickets That Need Attention]
There are 4 alpha issues and 33 beta issues that need attention. Of the 4
alpha tickets, 2 are in review, 1 is being actively worked on and the last
is blocked by the ongoing work. The number of beta tickets is considerably
higher than our last status update primarily due to tickets being
re-assigned from rc but we have also created 5 new beta issues in the last
7 days that need attention.

[Needs Reviewer]
2 tickets assigned to Alpha are looking for reviewers:
- https://issues.apache.org/jira/browse/CASSANDRA-15848:  Fully purged
static row causes NPE in repaired data tracking
- https://issues.apache.org/jira/browse/CASSANDRA-15792:
test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair

5 tickets assigned to Beta are looking for reviewers:
- https://issues.apache.org/jira/browse/CASSANDRA-15229: BufferPool
Regression
- https://issues.apache.org/jira/browse/CASSANDRA-15838: Add deb and rpm
packaging to artifacts test script
- https://issues.apache.org/jira/browse/CASSANDRA-15833: Unresolvable false
digest mismatch during upgrade due to CASSANDRA-10657
- https://issues.apache.org/jira/browse/CASSANDRA-15842: Fix flaky
org.apache.cassandra.schema.SchemaTest.testTransKsMigration-cdc
- https://issues.apache.org/jira/browse/CASSANDRA-15841: Fix flaky
junit.framework.TestSuite.org.apache.cassandra.io.sstable.CQLSSTableWriterTest-cdc

[Alpha Status]
There are 11 tickets not marked done that are assigned to Alpha (down from
17 at the time of the last status update). The issues consist of
interface-breaking changes, flaky tests, and a critical bug.

[Beta Status]
There are 41 issues not marked done that are assigned to Beta (up from 38
at the time of the last status update). These tickets consist primarily
of the testing epic work, known regressions, planned
improvements, and flaky tests.

[Cumulative Flow Diagram]
A visual measure of our progress can be found here:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&projectKey=CASSANDRA&view=reporting&chart=cumulativeFlowDiagram&swimlane=939&swimlane=936&swimlane=931&column=1505&column=1506&column=1514&column=1509&column=1512&column=1507&from=2020-04-07&to=2020-04-28

Thanks everyone!
Jordan

Re: [DISCUSS] governance on the Apache Cassandra project

2020-06-04 Thread Jordan West

Glad to see the PMC has been discussing these topics and is making efforts
towards improving on the status quo. Thanks for sharing the draft. I'll
leave more detailed questions/comments on the doc itself but as a whole it's
encouraging to see the PMC rely more heavily on the community and make an
effort to keep its view of active participants current.

Jordan

On Thu, Jun 4, 2020 at 9:54 AM Joshua McKenzie  wrote:

> Hello project!
>
> The pmc has been discussing how we make decisions as a pmc, how we make
> decisions as a project of committers and contributors, what decisions are
> made where, and how those decisions are ratified and by whom. We came to
> the conclusion that there's value in having a more formal (though
> lightweight) structure around these topics as well as start to enumerate
> some expectations on how we interact with each other on the project as it
> matures.
>
> A link to the current draft of the governance doc is here:
>
> https://docs.google.com/document/d/1wOrJBkgudY2BxEVtubq9IbiFFC3d3efJSj9OIrGcqQ8/edit#
>
> The doc is only 2 pages long; if you're interested in engaging in a
> discussion about how we evolve and collaborate as a project, please take
> some time to read through the doc, think through things, and engage on this
> thread here.
>
> Thanks everyone, and looking forward to a great discussion!
>
> ~Josh McKenzie
>

Re: [DISCUSS] governance on the Apache Cassandra project

2020-06-04 Thread Jordan West

I missed the end of Josh's email that suggested engaging here, and the doc
doesn't allow comments anyway, so here are some more questions / thoughts:

- Regarding the PMC roll call, is there any definition of "active on the
project and want to participate"?

- Will the PMC roll call apply to the PMC itself? That was my original read
of it, but looking closer, it's an email to dev@.

- 24-hour periods seem a little short, especially on weekends

- The bar regarding code review: I am generally +1 on requiring more eyes
on code review. Two areas that I think could use clarification: for
low-risk patches like test fixes, etc., it may be too strong, and for
high-risk patches the caveat that the author can be a reviewer if also a
committer is too weak.

I'll start with those 4 to limit the potential branching of this thread.

Jordan


On Thu, Jun 4, 2020 at 12:06 PM Jordan West  wrote:

> Glad to see the PMC has been discussing these topics and is making efforts
> towards improving on the status quo. Thanks for sharing the draft. I'll
> leave more detailed questions/comments on the doc itself but as a whole it's
> encouraging to see the PMC rely more heavily on the community and make an
> effort to keep its view of active participants current.
>
> Jordan
>
> On Thu, Jun 4, 2020 at 9:54 AM Joshua McKenzie 
> wrote:
>
>> Hello project!
>>
>> The pmc has been discussing how we make decisions as a pmc, how we make
>> decisions as a project of committers and contributors, what decisions are
>> made where, and how those decisions are ratified and by whom. We came to
>> the conclusion that there's value in having a more formal (though
>> lightweight) structure around these topics as well as start to enumerate
>> some expectations on how we interact with each other on the project as it
>> matures.
>>
>> A link to the current draft of the governance doc is here:
>>
>> https://docs.google.com/document/d/1wOrJBkgudY2BxEVtubq9IbiFFC3d3efJSj9OIrGcqQ8/edit#
>>
>> The doc is only 2 pages long; if you're interested in engaging in a
>> discussion about how we evolve and collaborate as a project, please take
>> some time to read through the doc, think through things, and engage on
>> this
>> thread here.
>>
>> Thanks everyone, and looking forward to a great discussion!
>>
>> ~Josh McKenzie
>>
>

Re: [VOTE] Project governance wiki doc

2020-06-16 Thread Jordan West

+1 nb

On Tue, Jun 16, 2020 at 5:45 PM Jake Luciani  wrote:

> +1
>
> On Tue, Jun 16, 2020 at 5:37 PM Benedict Elliott Smith <
> bened...@apache.org>
> wrote:
>
> > +1
> >
> > On 16/06/2020, 22:23, "Nate McCall"  wrote:
> >
> > +1 (binding)
> >
> > On Wed, Jun 17, 2020 at 4:19 AM Joshua McKenzie <
> jmcken...@apache.org>
> > wrote:
> >
> > > Added unratified draft to the wiki here:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/CASSANDRA/Apache+Cassandra+Project+Governance
> > >
> > > I propose the following:
> > >
> > >1. We leave the vote open for 1 week (close at end of day
> 6/23/20)
> > >unless there's a lot of feedback on the wiki we didn't get on
> gdoc
> > >2. pmc votes are considered binding
> > >3. committer and community votes are considered advisory /
> > non-binding
> > >
> > > Any objections / revisions to the above?
> > >
> > > Thanks!
> > >
> > > ~Josh
> > >
> >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>
> --
> http://twitter.com/tjake
>

Re: [VOTE] Project governance wiki doc (take 2)

2020-06-20 Thread Jordan West

+1 (nb)

On Sat, Jun 20, 2020 at 11:13 AM Jonathan Ellis  wrote:

> +1
>
> On Sat, Jun 20, 2020 at 10:12 AM Joshua McKenzie 
> wrote:
>
> > Link to doc:
> >
> >
> https://cwiki.apache.org/confluence/display/CASSANDRA/Apache+Cassandra+Project+Governance
> >
> > Change since previous cancelled vote:
> > "A simple majority of this electorate becomes the low-watermark for votes
> > in favour necessary to pass a motion, with new PMC members added to the
> > calculation."
> >
> > This previously read "super majority". We have lowered the low water mark
> > to "simple majority" to balance strong consensus against risk of stall
> due
> > to low participation.
> >
> >
> >- Vote will run through 6/24/20
> >- pmc votes considered binding
> >- simple majority of binding participants passes the vote
> >- committer and community votes considered advisory
> >
> > Lastly, I propose we take the count of pmc votes in this thread as our
> > initial roll call count for electorate numbers and low watermark
> > calculation on subsequent votes.
> >
> > Thanks again everyone (and specifically Benedict and Jon) for the time
> and
> > collaboration on this.
> >
> > ~Josh
> >
>
>
> --
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced
>

Re: [VOTE] Project governance wiki doc (take 2)

2020-06-24 Thread Jordan West

On Wed, Jun 24, 2020 at 3:43 PM Dinesh Joshi  wrote:

> 3. Discussion #3 - "... 1 business day notice period."  Whose business day
> is it? US? Europe? Australia? NZ? We are a distributed community and so 1
> business day is ambiguous. ASF typically states a 48-72 hour period which
> gives enough time to cover everyone in the community. We want to avoid
> people getting disenfranchised due to their location. I propose we make
> this longer and avoid using 'business day' language.
>
>
I'll take responsibility for that. It was one of the discussions on the
Google doc during the initial round of feedback.  The intention was to
ensure folks didn't feel obligated to check the mailing list on the
weekends or holidays (regardless of location) since we are all volunteering
our time. I intended it to mean "not on weekends or holidays for you". We
can use more specific language if we feel it's necessary.


> Thanks,
>
> Dinesh
>
> [1] https://www.apache.org/foundation/voting.html#Veto
>
> > On Jun 24, 2020, at 2:59 PM, sankalp kohli 
> wrote:
> >
> > +1
> >
> > On Wed, Jun 24, 2020 at 8:37 AM Jake Luciani  wrote:
> >
> >> +1 (b)
> >>
> >> On Wed, Jun 24, 2020 at 9:59 AM Joshua McKenzie 
> >> wrote:
> >>
> >>> A reminder: this vote will close at midnight PST today in roughly 17
> >> hours.
> >>>
> >>>
> >>> On Mon, Jun 22, 2020 at 2:20 PM J. D. Jordan <
> jeremiah.jor...@gmail.com>
> >>> wrote:
> >>>
>  +1 non-binding
> 
> > On Jun 22, 2020, at 1:18 PM, Stefan Podkowinski 
> >>> wrote:
> >
> > +1
> >
> >> On 22.06.20 20:12, Blake Eggleston wrote:
> >> +1
> >>
>  On Jun 20, 2020, at 8:12 AM, Joshua McKenzie <
> >> jmcken...@apache.org>
>  wrote:
> >>>
> >>> Link to doc:
> >>>
> 
> >>>
> >>
> https://cwiki.apache.org/confluence/display/CASSANDRA/Apache+Cassandra+Project+Governance
> >>>
> >>> Change since previous cancelled vote:
> >>> "A simple majority of this electorate becomes the low-watermark for
>  votes
> >>> in favour necessary to pass a motion, with new PMC members added to
> >>> the
> >>> calculation."
> >>>
> >>> This previously read "super majority". We have lowered the low
> >> water
>  mark
> >>> to "simple majority" to balance strong consensus against risk of
> >>> stall
>  due
> >>> to low participation.
> >>>
> >>>
> >>>  - Vote will run through 6/24/20
> >>>  - pmc votes considered binding
> >>>  - simple majority of binding participants passes the vote
> >>>  - committer and community votes considered advisory
> >>>
> >>> Lastly, I propose we take the count of pmc votes in this thread as
> >>> our
> >>> initial roll call count for electorate numbers and low watermark
> >>> calculation on subsequent votes.
> >>>
> >>> Thanks again everyone (and specifically Benedict and Jon) for the
> >>> time
>  and
> >>> collaboration on this.
> >>>
> >>> ~Josh
> >>
> >>
> >>
> >
> >
> 
> 
> 
> >>>
> >>
> >>
> >> --
> >> http://twitter.com/tjake
> >>
>
>
>
>

Re: [DISCUSS] When to branch 4.0

2020-06-26 Thread Jordan West

Thanks for bringing this up Josh. I think the last time we all discussed
this on the mailing list was during the initial freeze thread where we
agreed "that between the September freeze date and beta, a new branch would
not be created and trunk would only have bug fixes and performance
improvements committed to it." Now that we are closer to beta and have a
more formal governance model, I think it's good to revisit.

I'm torn. Folks should absolutely be able to scratch an itch as part of the
ethos of the project but we also haven't made substantial progress on the
testing epic -- less than I expected when I was +1 to branch at beta in the
initial proposal. A few general thoughts come to mind around this:

- Does not having a target branch truly discourage folks from scratching an
itch? Is the lack of a timeline on when they could actually see that merge
in a release more substantial?

- Regarding timeline (and scope), I wonder if we would be better branching
after we have a bit more of an idea of our future process and development /
release lifecycle. Perhaps we should discuss that first?

- I haven't seen any CEPs, etc for major features. These discussions aren't
blocked by the freeze and would presumably precede any need to merge to
trunk?

- For smaller changes that don't need CEPs, I know maintaining a
long-running branch can be a pain, but I would like to better understand how many
of these are truly out there before asking the committers to maintain and
merge into a 4th branch (which is not super challenging but is measurable
overhead).

Jordan

On Fri, Jun 26, 2020 at 6:43 AM Joshua McKenzie 
wrote:

> As we close on cutting beta1, a new consequence of our release lifecycle is
> becoming apparent. With guarantees of API stability in the beta phase, any
> work that is deferred from alpha to the next major due to API impacting
> changes will atrophy for as long as the beta is active.
>
> Cutting a branch for the 4.0 line upon release of beta1 would mitigate this
> problem and allow work in flight to be merged in, as well as unblock the
> work of others who may not be focusing on testing 4.0, whether it be due to
> their interest and focus or due to saturation on the work in scope for the
> release.
>
> The obvious downsides to cutting a branch of a major and allowing dev on
> trunk to continue separately are 1) the need to merge up to trunk on changes
> going into beta, and 2) a risk of a lack of focus on testing the release
> due to the availability of developing on trunk. My personal thoughts on
> those two points:
>
> 1) changes going into beta should be small enough that a fast-forward merge
> should be available in the majority of cases up to trunk. We've done this
> with previous releases and it wasn't prohibitively expensive in terms of
> time. Further, I would posit that changes going into beta should be on the
> smaller side, further mitigating the burden of this process.
>
> 2) We've been feature frozen since late 2018 with the expressed intention
> to encourage work on testing and stabilizing 4.0. I am not aware of any
> contributors whose time and energy has been invested in testing 4.0 that
> would otherwise have gone to trunk (i.e. this approach achieving its
> desired outcomes), however I do know of several major contributors and
> contributions that have atrophied and been deterred from further work on
> the project due to waiting for 4.0 to release.
>
> I don't think it's appropriate for us as an existing body of contributors
> to mandate either how each other or more detrimentally how other new
> contributors engage with and contribute to the project by disallowing
> contributions past 4.0; I take the position that we should do everything
> reasonably possible to encourage a diversity of contributions to the
> project. I deeply believe that making deliberate decisions to prioritize
> new engagement and interaction on the project as the context in which it's
> used evolves is vital to the project's health long term.
>
> That's just my .02 - I'd be curious to hear what everyone else thinks.
>
> ~Josh
>

Re: [DISCUSS] When to branch 4.0

2020-06-26 Thread Jordan West

On Fri, Jun 26, 2020 at 2:58 PM Benedict Elliott Smith 
wrote:

> > Nothing is stopping us from discussing CEPs now, and nothing prevents
> folks from making their own feature branches.
>
> I disagree.  CEPs are just as - if not more - of a distraction than
> branching.
>

> Work doesn't happen in a vacuum.  Those who are today focusing what
> resources they can on shipping 4.0.0 will have to divert resources to the
> new CEP and feature development happening on the project.  It is
> unrealistic to expect otherwise.
>
> We can have a swifter 4.0.0 release, or we can begin earnestly developing
> new features, but we cannot have both.
>
>
Agreed 100% and I would prefer to see us all focus on getting 4.0.0 out. I
only meant we never explicitly voted to prevent CEPs from being submitted
at this time and it was more in response to a comment in the initial email
in this thread.


>
> On 26/06/2020, 22:09, "Jon Haddad"  wrote:
>
> We currently have 2.1, 2.2, 3.0, 3.11, and trunk.  With a new branch
> we'll
> _also_ have whatever is next, let's call it 5.0.
>
> Nothing is stopping us from discussing CEPs now, and nothing prevents
> folks
> from making their own feature branches.
>
> If we're going to add another branch (4.0) and let people merge to
> trunk,
> we're making an *active* decision to push the 4.0 release out even
> further,
> because the folks working on it will have to learn the new code when
> they
> merge forward.
>
> I'm -1 on branching before we release 4.0.
>
> On Fri, Jun 26, 2020 at 2:04 PM Mick Semb Wever 
> wrote:
>
> > >
> > > > Branching anytime before we release 4.0.0 adds extra burden to the folks
> trying
> > > to
> > > > get 4.0.0 out the door (because of merge up)
> > >
> > > Given both that we've done this with every major release in the
> past, as
> > > well as the type of work we'd expect to land during the beta phase
> > > (smaller, non-api breaking, defect fixing or smaller performance
> > tuning), I
> > > didn't personally originally weigh this as being as much of a
> burden as
> > you
> > > perceive it to be.
> >
> >
> >
> > Looking at this a different way, you might say we have previously
> cut the
> > release branch somewhere around beta. Because previous releases
> haven't
> > (all) had so much focus on testing and alphas. My impression is that
> 4.0.0
> > will be closer compared to a second or third patch of previous major
> > releases.
> >
> > So I would have thought it makes sense around beta or RC to branch,
> > especially RC because between RC and GA it is more about a cooling
> period,
> > public acceptance and testing. That RC to GA window should be quiet
> enough
> > that it makes sense to branch then, and kick off the CEP discussions.
> >
> > I don't think the forward merging really is so much of a problem,
> it's a
> > normal activity in the C* codebase, and the additional merge-forward
> window
> > between either beta or RC, to GA is short.
> >
> > Thanks Ekaterina and Benjamin and Josh for raising the discussion.
> >
>
>
>
>
>

Re: [DISCUSS] Revisiting Java 11's experimental status

2020-07-13 Thread Jordan West

Thanks for bringing this up Jon! My current thinking is we should
officially support both 8 and 11. That increases the surface area we need
to test, but I think it's hard to predict what different users will run given
the current transition in the Java landscape.

Jordan

On Mon, Jul 13, 2020 at 11:42 AM Jon Haddad  wrote:

> Support for Java 11 was added a long time ago, and it's been about 2 years
> since it was released (Sept 2018).  Had we released Cassandra 4 close to
> that date, I'd be fine with keeping the status as experimental, but at this
> point I'm wondering if releasing a new major version of C* that's primarily
> targeting Java 8 as the only "official" supported version is a good idea.
>
> To those of you that are planning on rolling out C* 4.0, are you planning
> on using Java 8 still, or moving to 11?  Speaking for myself, I can say I
> don't think I'd want to use 8 anymore.  If most folks are testing with 11
> at this point, I think we should consider making 11 the recommended version
> and really only encouraging Java 8 for legacy purposes - teams who have a
> restriction that prevents them from upgrading.
>
> To those of you planning on moving to 4.0 soon after its release, are you
> planning on deploying to JDK 11 or 8?
>
> [1] https://www.oracle.com/java/technologies/java-se-support-roadmap.html
>

Re: [VOTE] Release Apache Cassandra 4.0-beta1

2020-07-16 Thread Jordan West

+1 nb

On Thu, Jul 16, 2020 at 9:38 AM Yifan Cai  wrote:

> +1 nb
>
> 
> From: Robert Stupp 
> Sent: Thursday, July 16, 2020 2:59:34 AM
> To: dev@cassandra.apache.org 
> Subject: Re: [VOTE] Release Apache Cassandra 4.0-beta1
>
> +1 (nb)
>
> —
> Robert Stupp
> @snazy
>
> > On 15. Jul 2020, at 20:07, Jasonstack Zhao Yang <
> zhaoyangsingap...@gmail.com> wrote:
> >
> > +1 (nb)
> >
> > On Thu, 16 Jul 2020 at 01:28, Brandon Williams  wrote:
> >
> >> +1 (binding)
> >>
> >> On Tue, Jul 14, 2020, 6:06 PM Mick Semb Wever  wrote:
> >>
> >>> Proposing the test build of Cassandra 4.0-beta1 for release.
> >>>
> >>> sha1: 5e767711360ecc4bc05a7cd219f0e680bfada004
> >>> Git:
> >>>
> >>>
> >>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-beta1-tentative
> >>> Maven Artifacts:
> >>>
> >>>
> >>
> https://repository.apache.org/content/repositories/orgapachecassandra-1210/org/apache/cassandra/cassandra-all/4.0-beta1/
> >>>
> >>> The Source and Build Artifacts, and the Debian and RPM packages and
> >>> repositories, are available here:
> >>> https://dist.apache.org/repos/dist/dev/cassandra/4.0-beta1/
> >>>
> >>> The vote will be open for 72 hours (longer if needed). Everyone who has
> >>> tested the build is invited to vote. Votes by PMC members are
> considered
> >>> binding. A vote passes if there are at least three binding +1s and no
> >> -1s.
> >>>
> >>> Eventual publishing and announcement of the 4.0-beta1 release will be
> >>> coordinated, as described in
> >>>
> >>>
> >>
> https://lists.apache.org/thread.html/r537fe799e7d5e6d72ac791fdbe9098ef0344c55400c7f68ff65abe51%40%3Cdev.cassandra.apache.org%3E
> >>>
> >>> [1]: CHANGES.txt:
> >>>
> >>>
> >>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-beta1-tentative
> >>> [2]: NEWS.txt:
> >>>
> >>>
> >>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-beta1-tentative
> >>>
> >>
>
>

Re: [Vote] Remove Windows support from 4.0+

2020-08-10 Thread Jordan West

It wasn't directly regarding removing support but we did reach out to
cassandra-users@ for testing 4.0 on Windows and got no response:
https://www.mail-archive.com/user@cassandra.apache.org/msg60234.html

Jordan


On Mon, Aug 10, 2020 at 4:16 AM Benedict Elliott Smith 
wrote:

> Have we considered first asking the user list if there's anyone willing to
> donate resources to maintain compatibility?
>
> I know I have in the (distant) past handled Jira filed by (production)
> Windows users.  I don’t know how prevalent they are, but perhaps we should
> offer them a chance to step up before cutting them off?  I understand
> nobody presently involved has the resources or inclination to maintain
> them, but if the effort is low it is not infeasible that somebody else
> might.
>
> On 10/08/2020, 12:11, "Aleksey Yeshchenko" 
> wrote:
>
> +1
>
> > On 10 Aug 2020, at 04:14, Yuki Morishita  wrote:
> >
> > As per the discussion(*), I propose to remove Windows support from
> 4.0
> > release and onward.
> >
> > Windows scripts are not maintained and we lack Windows test
> > environments. Windows users can use Docker or cloud environments to
> > set up Cassandra application development.
> >
> > If the vote passes, I will create the following tickets to officially
> > remove Windows support from 4.0:
> >
> > - Remove Windows scripts and add notice to NEWS.txt
> > - Update "Getting Started" documents for Windows users (to direct
> them
> > to use docker or cloud)
> >
> > Regards,
> > Yuki
> >
> > --
> > *:
> https://mail-archives.apache.org/mod_mbox/cassandra-dev/202007.mbox/%3CCAGM0Up_3GoPucCP-U18L1akzBXS1eJoKbui997%3DajcCfKJQdng%40mail.gmail.com%3E
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
>

Re: Committing `CASSANDRA-13701 Lower default num_tokens` and the dtest slowdown…

2020-08-20 Thread Jordan West

What sort of commitment is there to the follow-up tickets? Are the
follow-ups "make this faster" or are there specific tasks we know will
help? I'm concerned by the increase in testing run times on circle but
don't think that should prevent a good/decided upon default from merging.

Jordan

On Wed, Aug 19, 2020 at 9:49 AM Mick Semb Wever  wrote:

> It was agreed¹ that 4.0 should have the new configuration defaults of
>   num_tokens: 16
>   allocate_tokens_for_local_replication_factor: 3
>
> 13701's patches: against cassandra, cassandra-builds, cassandra-dtest, ccm;
> are reviewed, tested, and ready to commit. But the ccm and dtest patches
> required ccm having to now start nodes sequentially, and adding some longer
> timeout values in the dtests.
>
> The consequence of this is CI runs now take longer. ci-cassandra.a.o's
> dtests take ~30% longer, and circleci's dtests (with vnodes) have gone from
> ~22 to ~43 minutes. The general opinion (on slack²) is to commit, and work
> on improving ccm and dtest startup times in a subsequent ticket.
>
> 13701 was intended to be committed before the first beta release because of
> its user-facing changes. But these numbers are significant enough it makes
> sense to touch base with dev@
>
> Does anyone (strongly) object to the "commit + follow up ticket" approach?
>
> regards,
> Mick
>
>
> ¹ –
>
> https://lists.apache.org/thread.html/ra829084fcf344e9e96fa5c61cb31e909c8629091993471594b65ea89%40%3Cdev.cassandra.apache.org%3E
> ² – https://the-asf.slack.com/archives/CK23JSY2K/p1597747395032600 and
>
> https://the-asf.slack.com/archives/CK23JSY2K/p1597849774078200?thread_ts=1597762085.048300&cid=CK23JSY2K
>

Re: [DISCUSS] Change style guide to recommend use of @Override

2020-09-01 Thread Jordan West

+1

On Tue, Sep 1, 2020 at 12:22 PM Benedict Elliott Smith 
wrote:

> +1
>
> On 01/09/2020, 20:09, "Caleb Rackliffe"  wrote:
>
> +1
>
> On Tue, Sep 1, 2020, 2:00 PM Jasonstack Zhao Yang <jasonstack.z...@gmail.com> wrote:
>
> > +1
> >
> > On Wed, 2 Sep 2020 at 02:45, Dinesh Joshi  wrote:
> >
> > > +1
> > >
> > > > On Sep 1, 2020, at 11:27 AM, David Capwell  wrote:
> > > >
> > > > Currently our style guide recommends to avoid using @Override and updates
> > > > intellij's code style to exclude it by default; I would like to propose we
> > > > change this recommendation to use it and to update intellij's style to
> > > > include it by default.
> > > >
> > > > @Override is used by javac to enforce that a method is in fact overriding
> > > > from an abstract class or an interface and if this stops being true (such
> > > > as a refactor happens) then a compiler error is thrown; when we default to
> > > > excluding, it makes it harder to detect that a refactor catches all
> > > > implementations and can lead to subtle and hard to track down bugs.
> > > >
> > > > This proposal is for new code and would not be to go rewrite all code at
> > > > once, but would recommend new code adopt this style, and to pull old code
> > > > forward which is related to changes being made (similar to our stance on
> > > > imports).
> > > >
> > > > If people are ok with this, I will file a JIRA, update the docs, and
> > > > update intellij's formatting.
> > > >
> > > > Thanks for your time!
>
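
Editorial sketch for the thread above: the compiler check described in the quoted proposal can be illustrated with a small Java example. The class and method names here (Codec, SilentCodec, CheckedCodec, encode, viaBase) are hypothetical and do not come from the Cassandra code base. The sketch assumes a refactor that changed the base method's parameter type from int to long, which is one way an override can silently stop being one.

```java
public class OverrideDemo {
    // Hypothetical base class after a refactor: the parameter type is now long (it was int).
    static class Codec {
        String encode(long value) { return "base:" + value; }
    }

    // Written before the refactor and without @Override. It still compiles,
    // but encode(int) is now a separate overload, not an override of encode(long).
    static class SilentCodec extends Codec {
        String encode(int value) { return "custom:" + value; }
    }

    // With @Override, javac checks that this method really overrides a supertype method.
    // Putting @Override on SilentCodec.encode(int) would be rejected at compile time.
    static class CheckedCodec extends Codec {
        @Override
        String encode(long value) { return "custom:" + value; }
    }

    // Callers that only know the base type always resolve to encode(long).
    static String viaBase(Codec codec, long value) {
        return codec.encode(value);
    }

    public static void main(String[] args) {
        System.out.println(viaBase(new SilentCodec(), 7L));  // base:7 (customisation silently lost)
        System.out.println(viaBase(new CheckedCodec(), 7L)); // custom:7
    }
}
```

The first call shows the kind of subtle bug the proposal mentions: nothing fails to compile, yet the subclass behaviour is no longer used. Had SilentCodec carried the annotation, the refactor would have surfaced as a compiler error instead.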

Re: Creating a branch for 5.0 …?

2020-09-11 Thread Jordan West

It still seems to me that the best use of our efforts as a community is to
come together to get a stable 4.0 out as fast as possible. It would address
the branching and freeze issues that have been raised -- neither of which
currently prevent people from "scratching an itch", maintaining a
non-official side branch, or running CI on those branches. We've generally
stopped on the "planning" (I use this word lightly) or progress checking
front since beta was released.

Also, fwiw, it seems unlikely this conversation will be any different than
the one we had on the same topic 11 weeks ago:
https://lists.apache.org/thread.html/raf3592f2297abfb120563d216eeea26bfb3a6e048b246492815954ff%40%3Cdev.cassandra.apache.org%3E
.

Jordan


On Fri, Sep 11, 2020 at 5:44 AM Benedict Elliott Smith 
wrote:

> > if we do these contributions in secret
>
> Are you aware of any work happening (or expected to happen) in this way?
> This seems a very different problem than the one the thread was opened with.
>
> > it will be even harder for folk to put in late reviews
>
> It is always harder to revert and revisit committed work, than to review
> work that has not been merged.  So the flood gates you expect to open will
> still flood those people working on 4.0, only worse. There is also no such
> thing as a "late review" in this context; the review happens, at whatever
> pace is necessary, as agreed recently by the community.  If an organisation
> drops several huge patches, progress will quite reasonably be slow.  The
> best way to mitigate this would be to invest more of those secret resources
> into shipping 4.0, so the project can be on an even keel.
>
>
>
> On 11/09/2020, 13:06, "Mick Semb Wever"  wrote:
>
> > For significant new feature work, the option of working in a public,
> > long-running, trunk-based feature branch is available. If we look at a
> > specific example like CEP-7/SAI, I’m not sure how it would benefit much
> > from a 5.0 branch, at least until it fundamentally depended on other
> > 5.0-targeted work.
>
> Caleb, I'm seeing an important value to the branch (given there's no
> inter-dependencies between patches) is the CI builds on the cassandra-5.0
> branch, and the efforts of rebasing centralised from many feature branches
> to one preview branch.
>
> Raising the CEP process is interesting. Anything significant enough to
> warrant a CEP still has to go through that process (which has limited
> throughput atm) and I can't imagine anything that size making it to the
> cassandra-5.0 before we got to 4.0-rc (which is hopefully in a few months).
> But we are sending the clear signal that we are no longer shutting out
> these contributions.
>
> > Maybe the effort should be done in the area of getting more people on
> > board technically so they can start to review things themselves (which
> > indeed takes a lot of time and patience) instead of creating a new
> > branch so they can pile up their stuff there.
>
> Stefan, the cassandra-5.0 is not a substitute for reviews. Good habits in
> preparation for reviews: like rebasing your feature branch, having CI
> results ready to view; and the review process itself remains exactly the
> same, and will take the same time as before.
>
> You do have strong review preparation habits in place. I can see that the
> CI builds (not just a selection of tests but the whole complete pipeline)
> being part of the value you are taking advantage of here.  We want to
> re-apply that value also to the cassandra-5.0 branch with its patches that
> are post-review yet, not yet merged to trunk. That CI would help smoke out
> the combination (sequence) of reviewed patches all put together, and easing
> the burden of the re-review of those patches,  before they land in trunk.
>
> Again… if the feature freeze is now a quickly shortening window, it's going
> to be very limited to what might make it into such a branch, so mostly
> about sending the signal that this final hurdle can be worked around if it
> means we retain any such significant new contributions.
>
> > Work conducted without the engagement of the community can also expect to
> > be heavily revised when the community finally engages with it, as signalled
> > with the CEP process.
>
> Benedict, good point and it loops into what Caleb touches on. The CEP
> intends to bring out community involvement earlier in the development
> cycle, to avoid the late revisions. And under the feature freeze the CEP
> process is an obvious bottleneck and I don't think we can get around that.
>
> As far as dev involvement goes, it doesn't stop just because something is
> merged to trunk, commits in trunk can also be re-reviewed and then
> reverted, but that's something we want to avoid.  So yes, ofc there will be
> th

Re: Creating a branch for 5.0 …?

2020-09-11 Thread Jordan West

On Fri, Sep 11, 2020 at 9:18 AM Joshua McKenzie 
wrote:

> I thought it was the former; seems like it's
> the latter.
>
>
Looking at the thread I link, both are discussed (initially the former but
it turns to the latter briefly as well before going down a path similar to
the one this thread is starting to go down). In that thread we discuss
several things to improve the 4.0 process including that "we lack clarity
around scoping of this release and don't seem to have a project-wide
consensus on how we're handling this.". I think we'd do better to improve
the situation in this regard and have a plan to get 4.0 out instead of
debating branching. An official 5.0 branch is only as useful as the 5.0
release timeline (otherwise it's just a branch name, maintenance, and
unreleased code). That's predicated on us getting 4.0 out the door anyways.

Jordan


>
>
> On Fri, Sep 11, 2020 at 12:10 PM, Jordan West  wrote:
>
> > It still seems to me that the best use of our efforts as a community is to
> > come together to get a stable 4.0 out as fast as possible. It would address
> > the branching and freeze issues that have been raised -- neither of which
> > currently prevent people from "scratching an itch", maintaining a
> > non-official side branch, or running CI on those branches. We've generally
> > stopped on the "planning" (I use this word lightly) or progress checking
> > front since beta was released.
> >
> > Also, fwiw, it seems unlikely this conversation will be any different than
> > the one we had on the same topic 11 weeks ago:
> > https://lists.apache.org/thread.html/raf3592f2297abfb120563d216eeea26bfb3a6e048b246492815954ff%40%3Cdev.cassandra.apache.org%3E
> > .
> >
> > Jordan
> >
> > On Fri, Sep 11, 2020 at 5:44 AM Benedict Elliott Smith wrote:
> >
> > > if we do these contributions in secret
> >
> > Are you aware of any work happening (or expected to happen) in this way?
> > This seems a very different problem than the one the thread was opened
> > with.
> >
> > > it will be even harder for folk to put in late reviews
> >
> > It is always harder to revert and revisit committed work, than to review
> > work that has not been merged. So the flood gates you expect to open will
> > still flood those people working on 4.0, only worse. There is also no such
> > thing as a "late review" in this context; the review happens, at whatever
> > pace is necessary, as agreed recently by the community. If an organisation
> > drops several huge patches, progress will quite reasonably be slow. The
> > best way to mitigate this would be to invest more of those secret resources
> > into shipping 4.0, so the project can be on an even keel.
> >
> > On 11/09/2020, 13:06, "Mick Semb Wever"  wrote:
> >
> > > For significant new feature work, the option of working in a public,
> > > long-running, trunk-based feature branch is available. If we look at a
> > > specific example like CEP-7/SAI, I’m not sure how it would benefit much
> > > from a 5.0 branch, at least until it fundamentally depended on other
> > > 5.0-targeted work.
> >
> > Caleb, I'm seeing an important value to the branch (given there's no
> > inter-dependencies between patches) is the CI builds on the cassandra-5.0
> > branch, and the efforts of rebasing centralised from many feature branches
> > to one preview branch.
> >
> > Raising the CEP process is interesting. Anything significant enough to
> > warrant a CEP still has to go through that process (which has limited
> > throughput atm) and I can't imagine anything that size making it to the
> > cassandra-5.0 before we got to 4.0-rc (which is hopefully in a few months).
> > But we are sending the clear signal that we are no longer shutting out
> > these contributions.
> >
> > > Maybe the effort should be done in the area of getting more people on
> > > board technically so they can start to review things themselves (which
> > > indeed takes a lot of time and patience) instead of creating a new branch
> > > so they can pile up their stuff there.
> >
> > Stefan, the cassandra-5.0 is not a substitute for reviews. Good habits in
> > preparation for reviews: like rebasing your feature branch, having CI
> > results ready to view; and the review process itself remains exactly the
> > same, and will take the same time as before.
> >
> > You do have strong r

Re: [VOTE] Accept the Harry donation

2020-09-16 Thread Jordan West

+1

On Wed, Sep 16, 2020 at 10:29 AM sankalp kohli 
wrote:

> +1
>
> On Wed, Sep 16, 2020 at 10:07 AM Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:
>
> > +1 (non-binding)
> >
> > On Wed, 16 Sep 2020 at 12:52, Dinesh Joshi  wrote:
> >
> > > +1
> > >
> > > Dinesh
> > >
> > > > > On Sep 16, 2020, at 9:30 AM, Joshua McKenzie wrote:
> > > > >
> > > > > +1
> > > > >
> > > > >> On Wed, Sep 16, 2020 at 11:22 AM, Aleksey Yeshchenko <alek...@apple.com.invalid> wrote:
> > > > >>
> > > > >> +1
> > > > >>
> > > > >> On 16 Sep 2020, at 16:09, Sumanth Pasupuleti wrote:
> > > > >>
> > > > >> +1 (non-binding)
> > > > >>
> > > > >> On Wed, Sep 16, 2020 at 7:41 AM Jon Meredith wrote:
> > > > >>
> > > > >> +1 (non-binding)
> > > > >>
> > > > >> On Wed, Sep 16, 2020 at 8:28 AM David Capwell wrote:
> > > > >>
> > > > >> +1
> > > > >>
> > > > >> Sent from my iPhone
> > > > >>
> > > > >> On Sep 16, 2020, at 6:34 AM, Brandon Williams wrote:
> > > > >>
> > > > >> +1
> > > > >>
> > > > >> On Wed, Sep 16, 2020, 4:45 AM Mick Semb Wever wrote:
> > > > >>
> > > > >> This vote is about officially accepting the Harry donation from Alex
> > > > >> Petrov and Benedict Elliott Smith, that was worked on in CASSANDRA-15348.
> > > > >>
> > > > >> The Incubator IP Clearance has been filled out at
> > > > >> http://incubator.apache.org/ip-clearance/apache-cassandra-harry.html
> > > > >>
> > > > >> This vote is a required part of the IP Clearance process. It follows the
> > > > >> same voting rules as releases, i.e. from the PMC a minimum of three +1s and
> > > > >> no -1s.
> > > > >>
> > > > >> Please cast your votes:
> > > > >> [ ] +1 Accept the contribution into Cassandra
> > > > >> [ ] -1 Do not
> >
>

Re: Creating a branch for 5.0 …?

2020-09-24 Thread Jordan West

https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle was
voted on after and under GA states: "A new branch is created for the
release with the new major version, limiting any new feature addition to
the new release branch, with new feature development will continue to
happen only on trunk."

Jordan


On Thu, Sep 24, 2020 at 8:12 AM Jake Luciani  wrote:

> > Today the community still has in force an explicit vote prohibiting the
> merge of this work.  You must conduct a vote to rescind this decision.
>
> Actually, the vote was defined to hold until beta release:
>
>
> https://lists.apache.org/thread.html/5ee66f3986bf8308912c216bd1b5f9aea35443626db9f92cdca4d7b9%40%3Cdev.cassandra.apache.org%3E
>
> -Jake
>
>
> On Thu, Sep 24, 2020 at 11:05 AM Brandon Williams 
> wrote:
>
> > On Thu, Sep 24, 2020 at 9:55 AM Benedict Elliott Smith
> >  wrote:
> > >
> > > You do not have the authority to unilaterally overrule the community
> > process.  This is a serious breach of your responsibilities as a member
> of
> > the PMC.
> >
> > Feel free to complain that I'm creating branches we intend to someday,
> > perhaps even in 2020, release.
> >
> > > I have deleted this branch, and will do so again if you repeat this.
> >
> > This would create some interesting tickets for INFRA, but I won't
> > waste their time with you either. Whether either of us has the
> > authority to do such on ASF infrastructure is irrelevant, since that
> > is the only thing that can be argued here.  The ASL absolutely allows
> > people to innovate on their own with the code, so let's just move the
> > bits.
> >
> > Those who wish to innovate,
> > https://github.com/driftx/cassandra/tree/cassandra-5.0 is now open for
> > business, PRs accepted. This will be maintained to track trunk on the
> > ASF servers.
> >
> > I guess this is the apache way.
> >
> >
>
> --
> http://twitter.com/tjake
>

Re: Creating a branch for 5.0 …?

2020-09-24 Thread Jordan West

The intention was the former. It was discussed during ApacheCon in 2019,
and many people expressed the desire to wait until GA, including some who
were initially opposed to the freeze.


Jordan

On Thu, Sep 24, 2020 at 9:04 AM Joshua McKenzie 
wrote:

>
> https://lists.apache.org/thread.html/5ee66f3986bf8308912c216bd1b5f9aea35443626db9f92cdca4d7b9%40%3Cdev.cassandra.apache.org%3E
>
> "*From: *sankalp kohli
> *To: *dev
> *Subject: *Re: [VOTE] Branching Change for 4.0 Freeze
> *Date: *2018/07/11 21:50:08
> *List: *dev@cassandra.apache.org
>
> We will be in this state till beta is reached."
>
> The release lifecycle doc says: "*A new branch is created for the release
> with the new major version, limiting any new feature addition to the new
> release branch, with new feature development will continue to happen only
> on trunk.*"
>
> This could be read as "we won't branch until we hit GA" or "we will make
> sure we definitely branch at GA so disruptive features don't go into it",
> the latter of which is what we've done in all prior releases. I'm curious
> if there was any point in that discussion where the intention was made
> explicit if anyone has a link.
>
> On Thu, Sep 24, 2020 at 11:53 AM, Benedict Elliott Smith <bened...@apache.org> wrote:
>
> > I'm not sure what you are referring to here, that vote said nothing about
> > branching at beta.
> >
> > The most recent vote on the topic anyway was for the Release Lifecycle
> > process, which stipulates branching at GA.
> >
> > https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle
> >
> > We can vote to modify this document, or to make an exception, but I am
> > aware of no other vote stipulating anything about the point at which we
> > branch.
> >
> > On 24/09/2020, 16:49, "Jake Luciani"  wrote:
> >
> > > Today the community still has in force an explicit vote prohibiting the
> > > merge of this work.
> >
> > You referred to an explicit vote here. I assume that was the one you were
> > referring to? Yes, the community should decide.
> > Call a vote if you think the community thinks we should continue the
> > freeze vs continuing to rely on beliefs about the community.
> >
> > I'm simply pointing out the branching of 4.0 post beta was the plan of
> > last record.
> >
> > Jake
> >
> > On Thu, Sep 24, 2020 at 11:44 AM Benedict Elliott Smith wrote:
> >
> > The community does everything through discussion and consensus. Does that
> > include branching, or not?
> >
> > If there is no consensus, a vote is held. Whether or not you consider the
> > vote from 2018 still valid, you still need to seek the consent of the
> > community for your action today. Or is that not sacrosanct anymore?
> >
> > On 24/09/2020, 16:22, "Jake Luciani"  wrote:
> >
> > I'm sorry I see no issue with branching 4.0 as it was the thing we voted
> > on back in 2018. If you wish to extend the freeze you should call a new
> > vote.
> >
> > On Thu, Sep 24, 2020 at 11:15 AM Benedict Elliott Smith <benedict@apache.org> wrote:
> >
> > Nobody has any problem with an external repository being maintained. Just
> > bear in mind the normal process will need to take place to merge to the ASF
> > repository, and that there may be feedback and review requests to address,
> > so merge order and diffs will probably change.
> >
> > On 24/09/2020, 16:05, "Brandon Williams"  wrote:
> >
> > > On Thu, Sep 24, 2020 at 9:55 AM Benedict Elliott Smith wrote:
> > >
> > > You do not have the authority to unilaterally overrule the community
> > > process. This is a serious breach of your responsibilities as a member of
> > > the PMC.
> >
> > Feel free to complain that I'm creating branches we intend to someday,
> > perhaps even in 2020, release.
> >
> > > I have deleted this branch, and will do so again if you repeat this.
> >
> > This would create some interesting tickets for INFRA, but I won't waste
> > their time with you either. Whether either of us has the authority to do
> > such on ASF infrastructure is irrelevant, since that is the only thing
> > that can be argued here. The ASL absolutely allows people to innovate on
> > their own with the code, so let's just move the bits.
> >
> > Those who wish to innovate,
> > https://github.com/driftx/cassandra/tree/cassandra-5.0 is now open for
> > business, PRs accepted. This will be maintained to track trunk on the
> > ASF servers.
> >
> > I guess this is the apache way.

Re: Creating a branch for 5.0 …?

2020-09-24 Thread Jordan West

On Thu, Sep 24, 2020 at 10:19 AM Joshua McKenzie 
wrote:

> Jordan: thanks for providing that context - it's quite helpful. Was that
> aspect of the conversation captured and shared with the rest of the project
> on the mailing list? It's a shame if not, since that may have contributed
> quite a bit to misalignment and misunderstanding over time.


Unfortunately, I don't believe it was. The discussion occurred between
committers/contributors/PMC members on the floor in the venue and was
brought to the mailing list for a vote to ensure everyone was included. It
certainly would have been better if we were able to capture more of it (we
took as many notes as we could and translated them into that wiki doc but
there was no ability to record/etc). While it's true that even more detail
would have been helpful, many of the folks who are pushing back against
what was discussed there were not in attendance or active in the open
source project at the time.

>
> Fewer and fewer people have the appetite to deal with this bickering and
> exposing anyone new to this seems like a guaranteed way to turn them away
> from the project for good.
>

Speaking for myself, the tone and tenor, in addition to the lack of
progress, in these discussions have absolutely affected my desire to
participate in the open source project (from taking additional reviews that
would speed up 4.0 in my free time, to encouraging new community members to
join, to participating in these discussions entirely). I hope we can find a
better way to communicate.

One suggestion I have for folks participating is to consider what feedback
is best given on the public mailing list and what feedback is better
delivered directly. We don't have to do EVERYTHING on the public mailing
list. Only the parts that pertain to Cassandra. And sometimes we can
resolve our personal differences better when there isn't an additional
audience around.


>
>
> On Thu, Sep 24, 2020 at 12:22 PM, Benedict Elliott Smith <
> bened...@apache.org> wrote:
>
> > The discussion on the present topic has not concluded, and if we are
> > making an exception to 4.0 only then it really needs to.
> >
> > Members of one organisation have been pushing hard for feature
> development
> > to proceed, arguing it harms unnamed third parties. A request that these
> > third parties be asked to participate in the discussion has so far gone
> > unanswered. It is reasonable that this is answered before a vote, since
> > this is the entire basis of the argument in favour of branching.
> >
> > Given this is the basis of argument, I would also propose a less
> > contentious vote, should one be undertaken: to create a cassandra-5.0
> > branch that is open only to contributions from those unaffiliated by
> > employment with any existing committers. This seems to alleviate the
> > concerns precipitating this discussion, while mitigating the concerns of
> > those who are opposed to it.
> >
> > On 24/09/2020, 17:02, "Jake Luciani"  wrote:
> >
> > The vote was to unfreeze new changes at beta, so logically that means
> > non-bugfix work goes into trunk.
> >
> > Jordan, thanks. That is a more recent vote so thanks. That being said,
> > under that line Benedict comments this needs to be discussed. So how
> about
> > we just have a Vote on branching cassandra-4.0 and the issue will be
> > decided?
> >
> > Jake
> >
> > On Thu, Sep 24, 2020 at 11:53 AM Benedict Elliott Smith wrote:
> >
> > I'm not sure what you are referring to here, that vote said nothing about
> > branching at beta.
> >
> > The most recent vote on the topic anyway was for the Release Lifecycle
> > process, which stipulates branching at GA.
> >
> > https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle
> >
> > We can vote to modify this document, or to make an exception, but I am
> > aware of no other vote stipulating anything about the point at which we
> > branch.
> >
> > On 24/09/2020, 16:49, "Jake Luciani"  wrote:
> >
> > Today the community still has in force an explicit vote prohibiting the
> > merge of this work.
> >
> > You referred to an explicit vote here. I assume that was the one you were
> > referring to? Yes, the community should decide.
> > Call a vote if you think the community thinks we should continue the
> > freeze
> > vs continuing to rely on beliefs about the community.
> >
> > I'm simply pointing out the branching of 4.0 post beta was the plan of
> > last
> > record.
> >
> > Jake
> >
> > On Thu, Sep 24, 2020 at 11:44 AM Benedict Elliott Smith <
> benedict@apache.
> > org>
> > wrote:
> >
> > The community does everything through discussion and consensus.
> >
> > Does that
> >
> > include branching, or not?
> >
> > If there is no consensus, a vote is held. Whether or not you
> >
> > consider the
> >
> > vote from 2018 still valid, you still need to seek the consent of the
> > community for your action today. Or is that not sacrosanct anymore?
> >
> > On 24/09/2020, 16:22, "Jake Luciani"  wrote:
>

Re: [VOTE] Release dtest-api 0.0.6

2020-10-08 Thread Jordan West

+1

On Thu, Oct 8, 2020 at 7:25 AM Brandon Williams  wrote:

> +1
>
> On Thu, Oct 8, 2020 at 3:20 AM Oleksandr Petrov
>  wrote:
> >
> > Proposing the test build of in-jvm dtest API 0.0.6 for release.
> >
> > Repository:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra-in-jvm-dtest-api.git;a=shortlog;h=refs/tags/0.0.6
> >
> > Candidate SHA:
> >
> https://github.com/apache/cassandra-in-jvm-dtest-api/commit/9efeb731b6ff4036fa822b0282b27d273975cd6f
> > tagged with 0.0.6
> > Artifact:
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1220/org/apache/cassandra/dtest-api/0.0.6/
> >
> > Key signature: 9E66CEC6106D578D0B1EB9BFF1000962B7F6840C
> >
> > Changes since last release:
> >
> >   * CASSANDRA-16148: Add IInstance#getReleaseVersionString
> >
> > The vote will be open for 24 hours. Everyone who has tested the build is
> > invited to vote. Votes by PMC members are considered binding. A vote
> passes
> > if there are at least three binding +1s.
> >
> > -- Alex
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

Re: Welcome Paulo Motta as Cassandra PMC member

2021-02-10 Thread Jordan West

Congrats Paulo!

Jordan

On Tue, Feb 9, 2021 at 10:25 PM Berenguer Blasi 
wrote:

> Congrats Paulo! well done.
>
> On 9/2/21 21:28, Lorina Poland wrote:
> > Congratulations Paulo!
> > Lorina Poland
> > e. lor...@datastax.com
> > w. www.datastax.com
> >
> >
> >
> > On Tue, Feb 9, 2021 at 10:30 AM Andrés de la Peña <
> a.penya.gar...@gmail.com>
> > wrote:
> >
> >> Congrats Paulo!
> >>
> >> On Tue, 9 Feb 2021 at 17:42, Sumanth Pasupuleti <
> >> sumanth.pasupuleti...@gmail.com> wrote:
> >>
> >>> Congratulations Paulo!
> >>>
> >>> On Tue, Feb 9, 2021 at 8:10 AM Jasonstack Zhao Yang <
> >>> jasonstack.z...@gmail.com> wrote:
> >>>
>  Congrats Paulo!
> 
>  On Wed, 10 Feb 2021 at 00:03, Ekaterina Dimitrova <
> >> e.dimitr...@gmail.com
>  wrote:
> 
> > Congrats! Well done!
> >
> > On Tue, 9 Feb 2021 at 11:02, J. D. Jordan  > wrote:
> >
> >> Congrats Paulo! A great addition to the PMC.
> >>
> >>> On Feb 9, 2021, at 9:59 AM, Jonathan Ellis 
>  wrote:
> >>> Congratulations, Paulo!  Well deserved.
> >>>
>  On Tue, Feb 9, 2021 at 9:54 AM Benjamin Lerer <
> >> benjamin.le...@datastax.com>
>  wrote:
> 
>  The PMC's members are pleased to announce that Paulo Motta has
> > accepted
>  the invitation to become a PMC member yesterday.
> 
>  Thanks a lot, Paulo, for everything you have done for the
> >> project
>  all
> >> these
>  years.
> 
>  Congratulations and welcome
> 
>  The Apache Cassandra PMC members
> 
> >>>
> >>> --
> >>> Jonathan Ellis
> >>> co-founder, http://www.datastax.com
> >>> @spyced
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

Re: Welcome Joey Lynch as Cassandra PMC member

2024-07-27 Thread Jordan West

Congrats Joey!

Jordan

On Fri, Jul 26, 2024 at 09:22 Maxim Muzafarov  wrote:

> My congratulations Joseph Lynch!
>
> On Thu, 25 Jul 2024 at 18:15, Paulo Motta  wrote:
> >
> > Congratulations Joey!
> >
> > On Thu, 25 Jul 2024 at 00:55 Venkata Hari Krishna Nukala <
> n.v.harikrishna.apa...@gmail.com> wrote:
> >>
> >> Congratulations Joey!!
> >>
> >> On Thu, 25 Jul 2024 at 7:20 AM, Joseph Lynch 
> wrote:
> >>>
> >>> Thank you all for the warm wishes and I greatly appreciate this
> opportunity!
> >>>
> >>> This is such a great community and I am proud to be part of it.
> >>>
> >>> Cheers!
> >>> -Joey
> >>>
> >>> On Wed, Jul 24, 2024 at 10:12 AM Benjamin Lerer 
> wrote:
> 
>  The PMC's members are pleased to announce that Joey Lynch has
> accepted the invitation to become a PMC member.
> 
>  Thanks a lot, Joey, for everything you have done for the project all
> these years.
> 
>  Congratulations and welcome
> 
>  The Apache Cassandra PMC members
>

Re: [VOTE] Release Apache Cassandra 4.1.6

2024-07-30 Thread Jordan West

If there is a quick fix for the interface issue as Alex is describing then
I am all for it. I think binary compatibility isn’t required here so the
compile time compat would be good enough.

Otherwise I tend to agree that while it should be considered a public
interface we haven’t had a strict definition of those interfaces being so
nor have I seen it enforced in the past with e.g. compaction classes,
partitioners, topology strategies, or anywhere else a class can be
injected. Not saying we never have but I haven’t seen us consistently do so
looking at the excerpt Alex provided.

So in the case that it’s not a simple fix my vote would be to continue with
the release and then start a discuss thread about increasing the scope of
defined public / stable interfaces.

Jordan

On Tue, Jul 30, 2024 at 08:53 Jon Haddad  wrote:

> This patch fixes a long standing issue that's the root cause of
> availability failures.  Even though folks can specify a custom query
> handler with the -D flag, the number of users impacted by this is going to
> be incredibly small.  On the other hand, the fix helps every single user of
> 4.1 that puts too much pressure on the cluster, which happens fairly
> regularly.
>
> My POV is that it's a fairly weak argument that this is a public
> interface, but I don't consider it worth debating whether it is or not,
> because even if it is, this improves stability of the database for all
> users, so it's worth going in.  Let's not be dogmatic about fixes that help
> 99% of users because an incredibly small number that actually implement a
> custom query handler will need to make a trivial update in order to use the
> latest 4.1.6 dependency.
>
> Jon
>
>
>
> On Tue, Jul 30, 2024 at 8:09 AM J. D. Jordan 
> wrote:
>
>> Given we allow a pluggable query handler implementation to be specified
>> for the server with a -D during startup, I would consider the query
>> handler one of our public interfaces.
>>
>> On Jul 30, 2024, at 9:35 AM, Alex Petrov  wrote:
>>
>> 
>> Hi Tommy,
>>
>> Thank you for spotting this and bringing this to the community's attention.
>>
>> I believe our primary interfaces are the native and internode protocols, and
>> CLI tools. Most interfaces are used to abstract implementations
>> internally. A few interfaces, such as DataType, Partitioner, and Triggers, can
>> be depended upon by external tools using Cassandra as a library. There is
>> no official way to plug in a QueryHandler, so I did not consider it to be a
>> part of our public API.
>>
>> From [1]:
>>
>> > These considerations are especially important for public APIs,
>> including CQL, virtual tables, JMX, yaml, system properties, etc. Any
>> planned additions must be carefully considered in the context of any
>> existing APIs. Where possible the approach of any existing API should be
>> followed.
>>
>> Maybe we should have an exhaustive list of public APIs, and explicitly
>> mention that native and internode protocols are included, alongside with
>> nodetool command API and output, but also which classes/interfaces
>> specifically should be evolved with care.
>>
>> Thank you,
>> --Alex
>>
>> [1] https://cassandra.apache.org/_/development/index.html
>>
>> On Tue, Jul 30, 2024, at 10:56 AM, Tommy Stendahl via dev wrote:
>>
>> Hi,
>>
>> There is a change in the QueryHandler interface introduced by
>> https://issues.apache.org/jira/browse/CASSANDRA-19534
>>
>> Do we allow such changes between 4.1.5 and 4.1.6?
>> CASSANDRA-19534 looks like a very good change so maybe there is an
>> exception in this case?
>>
>> /Tommy
>>
>> -Original Message-
>> *From*: Brandon Williams > >
>> *Reply-To*: dev@cassandra.apache.org
>> *To*: dev > >
>> *Subject*: [VOTE] Release Apache Cassandra 4.1.6
>> *Date*: Mon, 29 Jul 2024 09:36:04 -0500
>>
>> Proposing the test build of Cassandra 4.1.6 for release.
>>
>>
>> sha1: b662744af59f3a3dfbfeb7314e29fecb93abfd80
>>
>> Git:
>>
>> https://github.com/apache/cassandra/tree/4.1.6-tentative
>>
>>
>> Maven Artifacts:
>>
>> https://repository.apache.org/content/repositories/orgapachecassandra-1339/org/apache/cassandra/cassandra-all/4.1.6/
>>
>>
>>
>> The Source and Build Artifacts, and the Debian and RPM packages and
>>
>> reposi

Re: [VOTE] Release Apache Cassandra 4.1.6

2024-07-30 Thread Jordan West

I would make the case that loss of availability / significant performance
issue, regardless of the amount of time it has existed for, is worth fixing
on the branches that are widely deployed by the community. Especially when
weighed against a loosely defined public interface issue.

The queuing issue has been a persistent problem (like you said 10 years)
and I regularly (approx once every 1-2 weeks) have to tell my customers “we
either have to wait for Cassandra to clear the queues or do a rolling
restart to fix it”, both of which come at a cost during an incident where a
client overloaded the DB and the impact is severe or business impacting.
Especially for customers doing LWTs or using non-standard RFs which are
also more prevalent in my experience than an external implementation of
QueryHandler.

While not data loss, I would argue this is a critical bug and if we did
find a data loss issue dormant for 10 years (which has happened in the
past) we would fix it as soon as it was found and a patch was made
available on all actively maintained versions.

Jordan

On Tue, Jul 30, 2024 at 10:30 Jeff Jirsa  wrote:

>
> It’s a 10 year old flaw in an 18 month old branch. Why does it need to go
> into 4.1, it’s not a regression and it clearly breaks compatibility?
>
>
>
>
> On Jul 30, 2024, at 8:52 AM, Jon Haddad  wrote:
>
> This patch fixes a long standing issue that's the root cause of
> availability failures.  Even though folks can specify a custom query
> handler with the -D flag, the number of users impacted by this is going to
> be incredibly small.  On the other hand, the fix helps every single user of
> 4.1 that puts too much pressure on the cluster, which happens fairly
> regularly.
>
> My POV is that it's a fairly weak argument that this is a public
> interface, but I don't consider it worth debating whether it is or not,
> because even if it is, this improves stability of the database for all
> users, so it's worth going in.  Let's not be dogmatic about fixes that help
> 99% of users because an incredibly small number that actually implement a
> custom query handler will need to make a trivial update in order to use the
> latest 4.1.6 dependency.
>
> Jon
>
>
>
> On Tue, Jul 30, 2024 at 8:09 AM J. D. Jordan 
> wrote:
>
>> Given we allow a pluggable query handler implementation to be specified
>> for the server with a -D during startup, I would consider the query
>> handler one of our public interfaces.
>>
>> On Jul 30, 2024, at 9:35 AM, Alex Petrov  wrote:
>>
>> 
>> Hi Tommy,
>>
>> Thank you for spotting this and bringing this to the community's attention.
>>
>> I believe our primary interfaces are the native and internode protocols, and
>> CLI tools. Most interfaces are used to abstract implementations
>> internally. A few interfaces, such as DataType, Partitioner, and Triggers, can
>> be depended upon by external tools using Cassandra as a library. There is
>> no official way to plug in a QueryHandler, so I did not consider it to be a
>> part of our public API.
>>
>> From [1]:
>>
>> > These considerations are especially important for public APIs,
>> including CQL, virtual tables, JMX, yaml, system properties, etc. Any
>> planned additions must be carefully considered in the context of any
>> existing APIs. Where possible the approach of any existing API should be
>> followed.
>>
>> Maybe we should have an exhaustive list of public APIs, and explicitly
>> mention that native and internode protocols are included, alongside with
>> nodetool command API and output, but also which classes/interfaces
>> specifically should be evolved with care.
>>
>> Thank you,
>> --Alex
>>
>> [1] https://cassandra.apache.org/_/development/index.html
>>
>> On Tue, Jul 30, 2024, at 10:56 AM, Tommy Stendahl via dev wrote:
>>
>> Hi,
>>
>> There is a change in the QueryHandler interface introduced by
>> https://issues.apache.org/jira/browse/CASSANDRA-19534
>>
>> Do we allow such changes between 4.1.5 and 4.1.6?
>> CASSANDRA-19534 looks like a very good change so maybe there is an
>> exception in this case?
>>
>> /Tommy
>>
>> -Original Message-
>> *From*: Brandon Williams > >
>> *Reply-To*: dev@cassandra.apache.org
>> *To*: dev > >
>> *Subject*: [VOTE] Release Apache Cassandra 4.1.6
>> *Date*: Mon, 29 Jul 2024 09:36:04 -0500
>>
>> Proposing the test build of Cassandra 4.1.6 for release.
>>
>>
>> sha1: b662744af59f3a3dfbfeb7314e29fecb93abfd80
>>
>> Git:
>>
>> https://github.com/apache/cassandra/tree/4.1.6-tentative
>>
>>
>> Maven Artifacts:
>>
>> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Frepository.apache.org%2Fcontent%2Frep

Re: [DISCUSS] Removing support for deterministic table IDs

2024-07-30 Thread Jordan West

Generally no disagreement but more of a curiosity: what’s the motivation
for removal? Just that it’s not needed? Otherwise it’s relatively cheap and
DDL aren’t high throughput (or at least shouldn’t be since we can only deal
with so many tables)

Jordan
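
For readers new to the thread, here is a rough sketch of why a name-derived ID sidesteps the concurrent table creation problem mentioned in the quoted proposal below. It is only an illustration: the namespace value and the use of `uuid5` are assumptions made for this example, not the derivation Cassandra actually uses.

```python
import uuid

# Hypothetical namespace for this sketch only; not a value Cassandra uses.
SKETCH_NAMESPACE = uuid.UUID("00000000-0000-0000-0000-000000000000")


def random_table_id() -> uuid.UUID:
    """Each caller gets a different ID, so two nodes creating the same
    table at the same time would disagree about its ID."""
    return uuid.uuid4()


def deterministic_table_id(keyspace: str, table: str) -> uuid.UUID:
    """The ID is a pure function of the names, so independent creators
    of the same table arrive at the same ID without coordinating."""
    return uuid.uuid5(SKETCH_NAMESPACE, f"{keyspace}.{table}")
```

With a single coordinator handing out schema changes in order (the role TCM plays in the thread), the random variant no longer produces conflicting IDs, which is the stated reason the option is no longer needed.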

On Tue, Jul 30, 2024 at 15:04 Caleb Rackliffe 
wrote:

> To summarize all this noise I've created, the plan would be...
>
> 1.) Leave CQL WITH id intact.
> 2.) Deprecate and WARN on *use_deterministic_table_id *in 5.0.x.
> 3.) Ignore and WARN on *use_deterministic_table_id *in 5.1.
> 4.) Profit
>
> On Tue, Jul 30, 2024 at 4:46 PM Caleb Rackliffe 
> wrote:
>
>> No intention of touching WITH id in CQL
>>
>> On Tue, Jul 30, 2024 at 4:10 PM Caleb Rackliffe 
>> wrote:
>>
>>> To clarify, my plan was to deprecate in Config/JMX and ignore it, not
>>> remove it entirely so it breaks existing YAMLs and JMX clients.
>>>
>>> This should be fine, if I'm reading the upgrade notes correctly, as no
>>> table or view creation operations will be allowed on 5.1 nodes until
>>> upgrade is complete and the CMS has been initialized.
>>>
>>> On Tue, Jul 30, 2024 at 3:54 PM J. D. Jordan 
>>> wrote:
>>>
 +1 to deprecate it. What does removing it buy us?

 On Jul 30, 2024, at 3:52 PM, David Capwell  wrote:

 Users can provide ids and TCM can manage to make them safe, so agree
 we don’t really need the feature anymore.  I am fine with deprecating the
 feature, but removing would be a breaking change for anyone that had that
 config in place, so not a fan of breaking the config interface.

 On Jul 30, 2024, at 1:38 PM, Caleb Rackliffe 
 wrote:

 I'd like to propose removing deterministic table IDs for new *user*
 tables and views in trunk. With TCM in place, it looks like the reason we
 added *use_deterministic_table_id*, concurrent table creations, is no
 longer a concern.

 Thoughts? Objections?

Re: [DISCUSS] Removing support for deterministic table IDs

2024-07-30 Thread Jordan West

Understood. Nits aside I have no objection to your plan.

Jordan

On Tue, Jul 30, 2024 at 15:42 Caleb Rackliffe 
wrote:

> I think the main motivation for ignoring the YAML option and removing
> down the line is that we probably never would have created it if TCM
> existed at that point of creation. I'd liken it to what we did w/ some
> no-longer-relevant options for the batch commit log.
>
> On Tue, Jul 30, 2024 at 5:19 PM Jordan West  wrote:
>
>> Generally no disagreement but more of a curiosity: what’s the motivation
>> for removal? Just that it’s not needed? Otherwise it’s relatively cheap and
>> DDL aren’t high throughput (or at least shouldn’t be since we can only deal
>> with so many tables)
>>
>> Jordan
>>
>> On Tue, Jul 30, 2024 at 15:04 Caleb Rackliffe 
>> wrote:
>>
>>> To summarize all this noise I've created, the plan would be...
>>>
>>> 1.) Leave CQL WITH id intact.
>>> 2.) Deprecate and WARN on *use_deterministic_table_id *in 5.0.x.
>>> 3.) Ignore and WARN on *use_deterministic_table_id *in 5.1.
>>> 4.) Profit
>>>
>>> On Tue, Jul 30, 2024 at 4:46 PM Caleb Rackliffe <
>>> calebrackli...@gmail.com> wrote:
>>>
>>>> No intention of touching WITH id in CQL
>>>>
>>>> On Tue, Jul 30, 2024 at 4:10 PM Caleb Rackliffe <
>>>> calebrackli...@gmail.com> wrote:
>>>>
>>>>> To clarify, my plan was to deprecate in Config/JMX and ignore it, not
>>>>> remove it entirely so it breaks existing YAMLs and JMX clients.
>>>>>
>>>>> This should be fine, if I'm reading the upgrade notes correctly, as no
>>>>> table or view creation operations will be allowed on 5.1 nodes until
>>>>> upgrade is complete and the CMS has been initialized.
>>>>>
>>>>> On Tue, Jul 30, 2024 at 3:54 PM J. D. Jordan <
>>>>> jeremiah.jor...@gmail.com> wrote:
>>>>>
>>>>>> +1 to deprecate it. What does removing it buy us?
>>>>>>
>>>>>> On Jul 30, 2024, at 3:52 PM, David Capwell 
>>>>>> wrote:
>>>>>>
>>>>>> Users can provide ids and TCM can manage to make them safe, so agree
>>>>>> we don’t really need the feature anymore.  I am fine with deprecating the
>>>>>> feature, but removing would be a breaking change for anyone that had that
>>>>>> config in place, so not a fan of breaking the config interface.
>>>>>>
>>>>>> On Jul 30, 2024, at 1:38 PM, Caleb Rackliffe <
>>>>>> calebrackli...@gmail.com> wrote:
>>>>>>
>>>>>> I'd like to propose removing deterministic table IDs for new *user*
>>>>>> tables and views in trunk. With TCM in place, it looks like the reason we
>>>>>> added *use_deterministic_table_id*, concurrent table creations, is
>>>>>> no longer a concern.
>>>>>>
>>>>>> Thoughts? Objections?
>>>>>>
>>>>>>
>>>>>>

Re: [DISCUSS] state of lz4-java

2024-08-06 Thread Jordan West

Oof this is a tough one. I agree with both of you on the concerns. I don’t
have good answers but I do see a few options:

- upgrade to 1.10.0 (optionally and w rigorous testing ofc) and then start
to think of a deprecation plan for LZ4 in the project. That would be
unfortunate but we could probably do some testing around Zstd config to get
comparable-ish performance. I rely heavily on LZ4 currently so this is less
ideal but maybe necessary?

- since the project is Apache licensed we could fork / adopt for our needs?
I’m not familiar enough with the hurdles we’d have to surmount around the
licensing but at least it’s a compatible license. Maybe Mick knows more
about what would be required for this route?

- see if someone from the community wants to pick up ownership of lz4-java.
This would be the simplest if it works. The project would have a new owner
/ steward.

- write our own compatible JNI wrapper around the lz4 project instead of
using lz4-java. This would be the most work but would give us ownership of the
framing and compatibility issues going forward.

Jordan
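
To make the framing concern concrete, here is a small analogy using DEFLATE from the Python standard library, since the standard library has no LZ4 binding. It is not LZ4 and says nothing about lz4-java's actual wire format; it only shows how the same compression algorithm wrapped in different containers stops being interchangeable.

```python
import gzip
import zlib

payload = b"cassandra" * 64


def raw_deflate(data: bytes) -> bytes:
    """Compress to a bare DEFLATE stream: no header and no checksum."""
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()


def can_decode_as_zlib(blob: bytes) -> bool:
    """Return True when a reader expecting the zlib container accepts the blob."""
    try:
        zlib.decompress(blob)
        return True
    except zlib.error:
        return False


zlib_framed = zlib.compress(payload)  # DEFLATE inside the zlib container
gzip_framed = gzip.compress(payload)  # DEFLATE inside the gzip container
bare = raw_deflate(payload)           # DEFLATE with no container at all
```

Only `zlib_framed` is accepted by `can_decode_as_zlib`; the other two need a reader that knows their container. Any replacement for lz4-java would have to match its framing in the same way for existing clients and nodes to keep talking to each other.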

On Fri, Aug 2, 2024 at 04:48 Štefan Miklošovič 
wrote:

> Indeed, I also want to add that I tried to checkout 1.10.0 submodule of
> lz4 in lz4-java git repository and I built it and all tests of lz4-java
> just passed fine. That is nice to know but that does not change the fact
> that there is nobody behind it to drive the releases, do the bug fixes and
> similar. My visibility into what it is doing is zero and that project
> misses a person to take care of it in a broader way.
>
> On Fri, Aug 2, 2024 at 1:15 PM Brandon Williams  wrote:
>
>> I just want to note that lz4 is used for network compression, either
>> between nodes or more importantly for clients, so interoperability is
>> key and we need to be careful about changing things here.
>>
>> Kind Regards,
>> Brandon
>>
>> On Fri, Aug 2, 2024 at 3:05 AM Štefan Miklošovič 
>> wrote:
>> >
>> > I just want to raise awareness about lz4-java library we use for LZ4
>> compressor. We are using the version 1.8.0, there is already version 1.10.0
>> of the underlying lz4 project which lz4-java integrates.
>> >
>> > We can see from NEWS (1) that after 1.8.0 there are a lot of
>> performance improvements and in 1.10.0 they implemented multithreading
>> compression (2) which provides great compression times. This seems to be
>> supported for "cli" only, I am not completely sure yet if multithreaded
>> compression would be something which we might use for our case too though.
>> >
>> > Anyway, the thing is lz4-java seems to be a dead project. There is no
>> official release from 1.8.0, the submodule of lz4 was updated to 1.9.3 but
>> it was never released and the main contributor behind lz4-java is
>> unresponsive. (I even wrote him an email with no response yet). There are
>> numerous requests from users to release a new version, questions about the
>> state of the project but the response is none. It truly seems like the
>> project was abandoned.
>> >
>> > That is quite unfortunate and I am not sure what we should do about
>> that. I think that people will eventually fork so the project might
>> continue in some fashion. Nevertheless, I think the state of that library
>> is in a quite bad position.
>> >
>> > We might look into alternatives but I do not think that switching it to
>> anything else will be easy because these custom libraries are often not
>> compatible between themselves as the way they implement the specification
>> might slightly vary.
>> >
>> > On lz4.org (3), there is Apache Commons listed as a lib which
>> implements block and frame specification and it is interoperable so maybe
>> we might take a look into how we would replace it if lz4-java is indeed
>> abandoned?
>> >
>> > There is a note about lz4-java (or any lib in that category) like this:
>> >
>> > They use the block compression format, but add their own frame / header
>> logic (or none at all) Consequently, they are not interoperable with LZ4
>> command line utility, nor (generally) between themselves.
>> >
>> > Do we even want to make any progress in this area or are we happy to
>> have it on 1.8.0 forever?
>> >
>> > (1) https://github.com/lz4/lz4/blob/dev/NEWS
>> > (2) https://github.com/lz4/lz4/blob/dev/NEWS#L2
>> > (3) https://lz4.org/
>>
>

Re: [DISCUSS] inotify for detection of manually removed snapshots

2024-08-09 Thread Jordan West

I lean towards the documentation approach vs complicating the
implementation.

For me personally: I regularly use shell commands to operate on snapshots.
That includes listing them. I probably should use nodetool for it all
instead though.

Jordan
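
For anyone weighing the options in this thread, here is a minimal sketch of the cached listing with an opt-in refresh. The class and parameter names (`SnapshotListing`, `refresh`) are made up for this example and are not Cassandra's SnapshotManager API; it also treats every subdirectory as one snapshot, which is a simplification.

```python
from pathlib import Path


class SnapshotListing:
    """Caches snapshot directory names; the disk is rescanned only on request."""

    def __init__(self, root: Path):
        self._root = root
        self._cache = None  # filled on first use

    def _scan(self):
        # Simplification: every subdirectory of root counts as one snapshot.
        return sorted(p.name for p in self._root.iterdir() if p.is_dir())

    def names(self, refresh: bool = False):
        # Default path: serve from memory and skip the directory walk.
        if refresh or self._cache is None:
            self._cache = self._scan()
        return list(self._cache)
```

If a snapshot is removed with `rm -rf`, `names()` keeps reporting it until someone calls `names(refresh=True)`. That stale window is what the documentation approach asks operators to accept.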

On Fri, Aug 9, 2024 at 08:09 Štefan Miklošovič 
wrote:

> I understand and agree. It is just that it would be cool if we avoided the
> situation when there is a figurative ABC company which has these "bash
> scripts removing snapshots from cron by rm -rf every second Sunday at 3:00
> am" because "that was their workflow for ages".
>
> I am particularly sensitive to this as Cassandra is very cautious when it
> comes to not disrupting the workflows already out there.
>
> I do not know how frequent this would be and if somebody started to
> complain. I mean ... they could still remove it by hand, right? It is just
> listsnapshots would not be relevant anymore without refreshing it. I think
> that might be acceptable. It would be something else if we flat out made
> manual deletion forbidden, which it is not.
>
> On Fri, Aug 9, 2024 at 4:50 PM Bowen Song via dev <
> dev@cassandra.apache.org> wrote:
>
>> If we have the documentation in place, we can then consider the cache to
>> be the master copy of metadata, and rely on it to be always accurate and up
>> to date. If someone deletes the snapshot files from filesystem, they can't
>> complain about Cassandra stopped working correctly - which is the same if
>> they had manually deleted some SSTable files (they shouldn't).
>> On 09/08/2024 11:16, Štefan Miklošovič wrote:
>>
>> We could indeed do that. Does your suggestion mean that there should not
>> be a problem with caching it all once explicitly stated like that?
>>
>> On Fri, Aug 9, 2024 at 12:01 PM Bowen Song via dev <
>> dev@cassandra.apache.org> wrote:
>>
>>> Has anyone considered simply updating the documentation saying this?
>>>
>>> "Removing the snapshot files directly from the filesystem may break
>>> things. Always use the `nodetool` command or JMX to remove snapshots."
>>> On 09/08/2024 09:18, Štefan Miklošovič wrote:
>>>
>>> If we consider caching it all to be too much, we might probably make
>>> caching an option an admin would need to opt-in into? There might be a flag
>>> in cassandra.yaml, once enabled, it would be in memory, otherwise it would
>>> just load it as it was so people can decide if caching is enough for them
>>> or they want to have it as it was before (would be by default set to as it
>>> was). This puts additional complexity into SnapshotManager but it should be
>>> in general doable.
>>>
>>> Let me know what you think, I would really like to have this resolved,
>>> 18111 brings a lot of code cleanup and simplifies stuff a lot.
>>>
>>> On Wed, Aug 7, 2024 at 11:30 PM Josh McKenzie 
>>> wrote:
>>>
 If you have a lot of snapshots and have for example a metric monitoring
 them and their sizes, if you don’t cache it, creating the metric can cause
 performance degradation. We added the cache because we saw this happen to
 databases more than once.

 I mean, I believe you, I'm just surprised querying out metadata for
 files and basic computation is leading to hundreds of ms pause times even
 on systems with a lot of files. Aren't most / all of these values cached at
 the filesystem layer so we're basically just tomato / tomahto caching
 systems, either one we maintain or one the OS maintains?

 Or is there really just a count of files well outside what I'm thinking
 here?

 Anyway, not trying to cause a ruckus and make needless noise, trying to
 learn. ;)

 On Wed, Aug 7, 2024, at 3:20 PM, Štefan Miklošovič wrote:

 On Wed, Aug 7, 2024 at 6:39 PM Yifan Cai  wrote:

 With WatcherService, when events are missed (which is to be expected),
 you will still need to list the files. It seems to me that WatcherService
 doesn't offer significant benefits in this case.

 Yeah I think we leave it out eventually.

 Regarding listing directory with a refresh flag, my concern is the
 potential for abuse. End-users might/could always refresh before listing,
 which could undermine the purpose of caching. Perhaps Jeremiah can provide
 more insight on this.

 Well, by default, it would not be refreshed every single time. You
 would need to opt-in into that. If there is a shop which has users with a
 direct access to the disk of Cassandra nodes and they are removing data
 manually, I do not know what to say, what is nodetool clearsnapshot and jmx
 methods good for then? I do not think we can prevent people from shooting
 into their feet if they are absolutely willing to do that.

 If they want to refresh that every time, that would be equal to the
 current behavior. It would be at most as "bad" as it is now.

 IMO, caching is best handled internally. I have a

Re: [VOTE] Release Apache Cassandra 4.1.6 - take 2

2024-08-15 Thread Jordan West

+1

btw, I validated the tarball release super quickly using easy-cass-lab (
https://github.com/rustyrazorblade/easy-cass-lab). Steps I took:
1. modified cassandra_versions.yaml to point the 4.1 version at the new
release tarball of 4.1.6
2. built the image: bin/easy-cass-lab build-image (got a coffee for this
part, it takes a few)
3. created and started a cluster:
   $ bin/easy-cass-lab init --instance r5d.large -c 3 sanitycheck416
   $ bin/easy-cass-lab up
   $ bin/easy-cass-lab use 4.1
   $ bin/easy-cass-lab start
4. waited for the cluster to complete startup and join

Connected to sanitycheck416 at cassandra0:9042
[cqlsh 6.1.0 | Cassandra 4.1.6 | CQL spec 3.4.6 | Native protocol v5]
Use HELP for help.

On Thu, Aug 15, 2024 at 1:10 AM Tommy Stendahl via dev <
dev@cassandra.apache.org> wrote:

> +1 (nb)
>
> -Original Message-
> *From*: Jon Haddad  >
> *Reply-To*: dev@cassandra.apache.org
> *To*: dev@cassandra.apache.org
> *Subject*: Re: [VOTE] Release Apache Cassandra 4.1.6 - take 2
> *Date*: Wed, 14 Aug 2024 11:07:54 -0700
>
> +1
>
> On Wed, Aug 14, 2024 at 9:55 AM Brandon Williams <
> brandonwilli...@apache.org> wrote:
>
> Proposing the test build of Cassandra 4.1.6 for release.
>
> sha1: 790de1079811278a2b431c2ced7c7ea02d290a25
> Git: https://github.com/apache/cassandra/tree/4.1.6-tentative
> Maven Artifacts:
>
> https://repository.apache.org/content/repositories/orgapachecassandra-1340/org/apache/cassandra/cassandra-all/4.1.6/
>
> The Source and Build Artifacts, and the Debian and RPM packages and
> repositories, are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/4.1.6/
>
> The vote will be open for 72 hours (longer if needed). Everyone who
> has tested the build is invited to vote. Votes by PMC members are
> considered binding. A vote passes if there are at least three binding
> +1s and no -1's.
>
> [1]: CHANGES.txt:
> https://github.com/apache/cassandra/blob/4.1.6-tentative/CHANGES.txt
> [2]: NEWS.txt:
> https://github.com/apache/cassandra/blob/4.1.6-tentative/NEWS.txt
>
>
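
The pass rule quoted in the vote email above (at least three binding +1s and no -1's) can be sketched as a small tally. The `Vote` type and field names are made up for this example, and treating any -1, binding or not, as blocking is an assumption based on a literal reading of the email.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0, or -1
    binding: bool  # True when cast by a PMC member


def vote_passes(votes, min_binding_plus_ones=3):
    """Apply the quoted rule: enough binding +1s and no -1 from anyone."""
    binding_plus_ones = sum(1 for v in votes if v.binding and v.value == 1)
    any_minus_one = any(v.value == -1 for v in votes)
    return binding_plus_ones >= min_binding_plus_ones and not any_minus_one
```

A non-binding +1, such as the "+1 (nb)" in this thread, is recorded but does not count toward the threshold of three.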

Re: [VOTE] Release Apache Cassandra 5.0-rc2

2024-08-22 Thread Jordan West

+1, validated again with easy-cass-lab

On Thu, Aug 22, 2024 at 6:48 AM Mick Semb Wever  wrote:

>  .
>
>
> The vote will be open for 72 hours (longer if needed). Everyone who has
>> tested the build is invited to vote. Votes by PMC members are considered
>> binding. A vote passes if there are at least three binding +1s and no -1's.
>>
>
>
>
> +1
>
> Checked
> - signing correct
> - checksums correct
> - source artefact builds (JDK 11+17)
> - binary artefact runs (JDK 11+17)
> - debian package runs (JDK 11+17)
> - debian repo runs (JDK 11+17)
> - redhat* package runs (JDK11+17)
> - redhat* repo runs (JDK 11+17)
>
>
>

Re: Welcome Doug Rohrer as Cassandra Committer

2024-08-23 Thread Jordan West

Awesome! Congratulations Doug!

On Fri, Aug 23, 2024 at 12:17 Štefan Miklošovič 
wrote:

> Great news! Congratulations, Doug.
>
> On Fri, Aug 23, 2024 at 8:55 PM Dinesh Joshi  wrote:
>
>> The Apache Cassandra PMC is thrilled to announce that Doug Rohrer has
>> accepted the invitation to become a committer!
>>
>> Doug has worked on several aspects of Cassandra, Sidecar, and
>> Analytics. Congratulations and welcome!
>>
>> The Apache Cassandra PMC members
>>
>

Re: Welcome Jordan West and Stefan Miklosovic as Cassandra PMC members!

2024-08-31 Thread Jordan West

Thanks all!!!

On Sat, Aug 31, 2024 at 07:55 J. D. Jordan 
wrote:

> Two great additions to the PMC. Congratulations to you both!
>
> -Jeremiah Jordan
>
> > On Aug 30, 2024, at 3:21 PM, Jon Haddad  wrote:
> >
> > 
> > The PMC's members are pleased to announce that Jordan West and Stefan
> Miklosovic have accepted invitations to become PMC members.
> >
> > Thanks a lot, Jordan and Stefan, for everything you have done for the
> project all these years.
> >
> > Congratulations and welcome!!
> >
> > The Apache Cassandra PMC
>

Re: 【DISCUSS】The configuration of Commitlog archiving

2024-09-02 Thread Jordan West

+1 to Scott’s comments. Once you expose those YAML config params outside of
a single node which many of us do, this becomes an RCE attack vector.
Something more structured as Scott proposes, similar to snapshots, would be
preferred. Would recommend a CEP.

Jordan
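
To illustrate the more structured direction Scott describes below, here is a minimal sketch in which the configuration only names an archive format from a fixed allow-list and the compression happens in-process. All names here (`ARCHIVE_FORMATS`, `archive_segment`) are hypothetical and not existing Cassandra options. The formats are the ones available in the Python standard library, used as stand-ins for zstd or lz4.

```python
import bz2
import gzip
import lzma
import shutil
from pathlib import Path

# Hypothetical allow-list: format name -> (file opener, file suffix).
ARCHIVE_FORMATS = {
    "gzip": (gzip.open, ".gz"),
    "bz2": (bz2.open, ".bz2"),
    "xz": (lzma.open, ".xz"),
}


def archive_segment(segment: Path, archive_dir: Path, fmt: str) -> Path:
    """Compress one segment file in-process; no shell command is built or run."""
    if fmt not in ARCHIVE_FORMATS:
        # Anything outside the allow-list is rejected, never executed.
        raise ValueError(f"unsupported archive format: {fmt!r}")
    opener, suffix = ARCHIVE_FORMATS[fmt]
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / (segment.name + suffix)
    with segment.open("rb") as src, opener(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

Because the configured value is only ever looked up in a table, changing it at runtime (for example through an MBean) can at worst select another known format or fail validation, rather than run an arbitrary command.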

On Fri, Aug 30, 2024 at 20:58 C. Scott Andreas  wrote:

> I appreciate this report and would love to work toward the direction it
> recommends.
>
> I’m also familiar with past concerns raised by others with our FQL
> configuration parameters that allow passing shell commands for FQL segment
> archival.
>
> We bias toward ensuring an MBean exists for dynamic modification of yaml
> parameters. When we couple dynamic configuration updates and arbitrary
> shell command execution, we introduce vectors for arbitrary code execution,
> data exfiltration, and data compromise that have a lower bar to achieve
> than local file write.
>
> I agree that we should work toward removing operator-provided shell
> commands in yaml.
>
> For concerns like archival, these seem like areas that Cassandra could
> easily accomplish itself without shelling out to gzip/zstd/lz4-compress a
> file. Introducing a new config structure that declares an archival format,
> accompanying implementations for compression/decompression, and deprecation
> of the prior approach sounds both reasonable and desirable to me.
>
> – Scott
>
> —
> Mobile
>
> On Aug 30, 2024, at 10:25 PM, Bowen Song via dev 
> wrote:
>
> 
>
> I'm not sure what is the concern here. Is it a malicious user exploiting
> this? Or human error with unintended consequences?
>
> For malicious user, in order to exploit this, an attacker needs to be able
> to write to the config file. The config file on Linux by default is owned
> by the root user and has the -rw-r--r-- permission, that means the attacker
> must either gain root access to the system or has the ability to write
> arbitrary file on the filesystem. With either of these permission, they can
> already do almost anything they want (e.g. modify a SUID executable file).
> They wouldn't even need to exploit this to run a script or dangerous
> command. So this sounds like a non-issue to me, at least on Linux-based
> OSes.
>
> For human error, if the operator puts "rm -rf" in it, the software should
> treat it as the operator actually wants to do that. I personally don't like
> software attempting to outsmart human, which often ends up interfering with
> legitimate use cases. The best thing a software can do is log it, so
> there's some traceability if and when things go wrong.
>
> So, IMO, there's nothing wrong with the implementation in Cassandra.
>
>
> On 30/08/2024 17:13, guo Maxwell wrote:
>
> Commitlog has the ability to archive log files (see
> CommitLogArchiver.java). We can archive and restore the commitlog by
> configuring archive_command and restore_command in
> commitlog_archiving.properties. The archive_command and restore_command
> can be any Linux/Unix shell command. However, I found that the shell
> command can actually be filled with any script, even "rm -rf". I have
> tested this situation and it succeeded, with my test file being deleted.
>
> Personally, I think this is dangerous behavior, because there are
> no system-level restrictions and users are allowed to do anything in these
> shell commands. So here I want to discuss with you whether it is
> necessary to impose any restrictions on use, or whether we need a new way of
> archiving/restoring the commitlog.
>
> Of course, before that, I would also like to ask, how many people are
> using archive and restore of commitlog? It seems that the commitlog archive
> code has not been updated for a long time.
>
> I have two ideas.
> One is to restrict the command content based on the existing usage,
> for example strictly allowing only the current cp/mv/ln of %path to
> %name. Other redundant strings in the command would not be allowed.
> The other: as I roughly investigated the archiving of MySQL and
> PostgreSQL, they do not give users much freedom (I am talking about
> letting users define their own archiving command), and archive directly
> to a designated location. For us, I feel that we can follow C*'s
> incremental backup of sstables and add a hardlink to the commitlog in the
> specified location. But this may change the original configuration
> method, such as setting the archive and restore locations of the node
> through nodetool and deprecating the commitlog_archiving.properties
> configuration.
>
> I am just putting forward some views  here, and looking forward to your
> feedback. 😀
>
>
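
To make the first idea in the original post concrete (only allowing cp/mv/ln
from %path to a destination ending in %name), here is a minimal sketch of
such a check. It is an illustration only, not Cassandra code: the function
name, the allowed-binary list and the forbidden-character set are all
assumptions made for this example.

```python
import shlex

# Hypothetical allow-list; a real policy would be decided by the project.
ALLOWED_BINARIES = {"cp", "mv", "ln", "/bin/cp", "/bin/mv", "/bin/ln"}
# Characters that would let a command chain, substitute, or redirect.
FORBIDDEN_CHARS = set(";|&$`<>\n")


def is_allowed_archive_command(command: str) -> bool:
    """Accept only '<cp|mv|ln> %path <destination ending in %name>'."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    if len(tokens) != 3:
        return False
    binary, source, destination = tokens
    return (
        binary in ALLOWED_BINARIES
        and source == "%path"
        and destination.endswith("%name")
    )


if __name__ == "__main__":
    for cmd in ("/bin/ln %path /backup/%name", "rm -rf /"):
        print(cmd, "->", is_allowed_archive_command(cmd))
```

A check like this narrows what the property can do but still shells out, so
it does not replace the structured configuration suggested later in the
thread.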

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-12 Thread Jordan West

I’m +1 on enabling rejection by default on all branches. We have been bitten
by silent data loss (due to other bugs like the schema issues in 4.1) from
lack of rejection on several occasions, and short of writing extremely
specialized tooling it’s unrecoverable. While both lack of availability and
data loss are critical, I will always pick lack of availability over data
loss. It’s better to fail a write that will be lost than silently lose it.

Of course, a change like this requires very good communication in NEWS.txt
and elsewhere, but I think it’s well worth it. While it may surprise some
users, I think they would be more surprised that they were silently losing
data.

Jordan

On Thu, Sep 12, 2024 at 10:22 Mick Semb Wever  wrote:

> Thanks for starting the thread Caleb, it is a big and impacting patch.
>
> Appreciate the criticality, in a new major release rejection by default is
> obvious.   Otherwise the logging and metrics is an important addition to
> help users validate the existence and degree of any problem.
>
> Also worth mentioning that rejecting writes can cause degraded
> availability in situations that pose no problem.  This is a coordination
> problem on a probabilistic design, it's choose your evil: unnecessary
> degraded availability or mislocated data (eventual data loss).   Logging
> and metrics makes alerting on and handling the data mislocation possible,
> i.e. avoids data loss with manual intervention.  (Logging and metrics also
> face the same problem with false positives.)
>
> I'm +0 for rejection default in 5.0.1, and +1 for only logging default in
> 4.x
>
>
> On Thu, 12 Sept 2024 at 18:56, Jeff Jirsa  wrote:
>
>> This patch is so hard for me.
>>
>> The safety it adds is critical and should have been added a decade ago.
>> Also it’s a huge patch, and touches “everything”.
>>
>> It definitely belongs in 5.0. I’d probably reject by default in 5.0.1.
>>
>> 4.0 / 4.1 - if we treat this like a fix for latent opportunity for data
>> loss (which it implicitly is), I guess?
>>
>>
>>
>> > On Sep 12, 2024, at 9:46 AM, Brandon Williams  wrote:
>> >
>> > On Thu, Sep 12, 2024 at 11:41 AM Caleb Rackliffe
>> >  wrote:
>> >>
>> >> Are you opposed to the patch in its entirety, or just rejecting unsafe
>> operations by default?
>> >
>> > I had the latter in mind.  Changing any default in a patch release is
>> > a potential surprise for operators and one of this nature especially
>> > so.
>> >
>> > Kind Regards,
>> > Brandon
>>
>>
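
For readers new to the thread, the trade-off being debated (reject an
out-of-range write versus only log it) can be sketched as below. This is a
toy model, not the CASSANDRA-13704 patch: the class and function names, the
`reject` flag and the return values are hypothetical, and the ranges are
assumed to be start-exclusive, end-inclusive and allowed to wrap around the
ring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRange:
    """A (start, end] range on the token ring; wraps when start >= end."""
    start: int
    end: int

    def contains(self, token: int) -> bool:
        if self.start < self.end:
            return self.start < token <= self.end
        # Wrapping range: covers everything after start or up to end.
        return token > self.start or token <= self.end


class OutOfRangeWrite(Exception):
    """Raised when a replica refuses a write for a token it does not own."""


def handle_write(token, owned_ranges, reject=True, log=print):
    """Apply a write, or log/reject it when the token is not owned here."""
    if any(r.contains(token) for r in owned_ranges):
        return "applied"
    log(f"token {token} is outside the ranges this replica owns")
    if reject:
        # Fail the write: availability suffers, but nothing is mislocated.
        raise OutOfRangeWrite(token)
    # Log-only mode: the write lands on a node that does not own it.
    return "applied-out-of-range"
```

With `reject=False` the operator only learns about the problem after the data
has already been written to the wrong place, which is the concern raised in
this thread.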

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-12 Thread Jordan West

I think folks not losing sleep over this are only in that position because
they don’t know it’s happening. Like Brandon said, ignorance is bliss (but
it’s a false bliss).

Very few users do the work necessary to detect data loss outside the
obvious paths. I agree with Caleb, if we log and give them no means to
remediate we are giving them nightmares with no recourse. While failed
writes will be a surprise it’s the correct solution because it’s the only
one that prevents data loss which we should always strive to get rid of.

Jordan

On Thu, Sep 12, 2024 at 11:31 Caleb Rackliffe 
wrote:

> We aren’t counting on users to read NEWS.txt. That’s the point. We’re
> saying we’re going to make things safer, as they should always have been,
> and if someone out there has tooling that somehow allows them to avoid the
> risks, they can disable rejection.
>
> > On Sep 12, 2024, at 1:21 PM, Brandon Williams  wrote:
> >
> > On Thu, Sep 12, 2024 at 1:13 PM Caleb Rackliffe
> >  wrote:
> >>
> >> I think I can count at least 4 people on this thread who literally have
> lost sleep over this.
> >
> > Probably good examples of not being the majority though, heh.
> >
> > If we are counting on users to read NEWS.txt, can we not count on them
> > to enable rejection if this is important to them?
> >
> > Kind Regards,
> > Brandon
>

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-12 Thread Jordan West

To clarify my response:

We didn’t hit a bug “like it”. We hit a bug that resulted in an improper
view of the ring (on my phone so can’t dig up the JIRA but it was a ring
issue introduced in 4.1.0 and fixed in 4.1.4 iirc). There have been several
bugs of this form in the past. So it wasn’t that we were operationally
wrong. C* itself was wrong. And because this patch wasn’t present we had
silent data loss as a result. Even worse, the scale of that data loss is
extremely hard to measure.

This incident is considered one of the top three most impactful /
significant incidents ever at my current employer. All three were data
loss related and two could’ve been prevented by this patch, again trading
reduced availability for what was instead unrecoverable data loss that
required setting up an entirely new cluster and having a copy of the data
in an external system (not backups but literally an unrelated external
system).

Put more succinctly: we got very lucky and had this patch been there and
rejection enabled we wouldn’t have needed luck.

Jordan

On Thu, Sep 12, 2024 at 12:36 Mick Semb Wever  wrote:

> Great that the discussion explores the issue as well.
>
> So far we've heard three* companies being impacted, and four times in
> total…?  Info is helpful here.
>
> *) Jordan, you say you've been hit by _other_ bugs _like_ it.  Jon i'm
> assuming the company you refer to doesn't overlap. JD we know it had
> nothing to do with range movements and could/should have been prevented far
> simpler with operational correctness/checks.
>
> In the extreme, when no writes have gone to any of the replicas, what
> happened ? Either this was CL.*ONE, or it was an operational failure (not
> C* at fault).  If it's an operational fault, both the coordinator and the
> node can be wrong.  With CL.ONE, just the coordinator can be wrong and the
> problem still exists (and with rejection enabled the operator is now more
> likely to ignore it).
>
> WRT to the remedy, is it not to either run repair (when 1+ replica has
> it), or to load flushed and recompacted sstables (from the period in
> question) to their correct nodes.  This is not difficult, but
> understandably lost-sleep and time-intensive.
>
> Neither of the above two points I feel are that material to the outcome,
> but I think it helps keep the discussion on track and informative.   We
> also know there are many competent operators out there that do detect data
> loss.
>
>
>
> On Thu, 12 Sept 2024 at 20:07, Caleb Rackliffe 
> wrote:
>
>> If we don’t reject by default, but log by default, my fear is that we’ll
>> simply be alerting the operator to something that has already gone very
>> wrong that they may not be in any position to ever address.
>>
>> On Sep 12, 2024, at 12:44 PM, Jordan West  wrote:
>>
>> 
>> I’m +1 on enabling rejection by default on all branches. We have been bitten
>> by silent data loss (due to other bugs like the schema issues in 4.1) from
>> lack of rejection on several occasions, and short of writing extremely
>> specialized tooling it’s unrecoverable. While both lack of availability and
>> data loss are critical, I will always pick lack of availability over data
>> loss. It’s better to fail a write that will be lost than silently lose it.
>>
>> Of course, a change like this requires very good communication in
>> NEWS.txt and elsewhere, but I think it’s well worth it. While it may surprise
>> some users, I think they would be more surprised that they were silently
>> losing data.
>>
>> Jordan
>>
>> On Thu, Sep 12, 2024 at 10:22 Mick Semb Wever  wrote:
>>
>>> Thanks for starting the thread Caleb, it is a big and impacting patch.
>>>
>>> Appreciate the criticality, in a new major release rejection by default
>>> is obvious.   Otherwise the logging and metrics is an important addition to
>>> help users validate the existence and degree of any problem.
>>>
>>> Also worth mentioning that rejecting writes can cause degraded
>>> availability in situations that pose no problem.  This is a coordination
>>> problem on a probabilistic design, it's choose your evil: unnecessary
>>> degraded availability or mislocated data (eventual data loss).   Logging
>>> and metrics makes alerting on and handling the data mislocation possible,
>>> i.e. avoids data loss with manual intervention.  (Logging and metrics also
>>> face the same problem with false positives.)
>>>
>>> I'm +0 for rejection default in 5.0.1, and +1 for only logging default
>>> in 4.x
>>>
>>>
>>> On Thu, 12 Sept 2024 at 18:56, Jeff Jirsa  wrote:
>>>
>>>> This patch is so har

Re: Welcome Chris Bannister, James Hartig, Jackson Flemming and João Reis, as cassandra-gocql-driver committers

2024-09-12 Thread Jordan West

Congrats, welcome!

On Thu, Sep 12, 2024 at 13:16 Dinesh Joshi  wrote:

> Congratulations, everyone!
>
> On Thu, Sep 12, 2024 at 4:40 AM Mick Semb Wever  wrote:
>
>> The PMC's members are pleased to announce that Chris Bannister, James
>> Hartig, Jackson Flemming and João Reis have accepted invitations to
>> become committers on the Drivers subproject.
>>
>> Thanks a lot for everything you have done with the gocql driver all these
>> years.  We are very excited to see the driver now inside the Apache
>> Cassandra project.
>>
>> Congratulations and welcome!!
>>
>> The Apache Cassandra PMC
>>
>

Re: [DISCUSS] Chronicle Queue's development model and a hypothetical replacement of the library

2024-09-16 Thread Jordan West

Thanks for the sleuthing Stefan! This definitely is a bit unfortunate. It
sounds like a replacement is not really practical so I'll ignore that
option for now, until a viable alternative is proposed. I am -1 on us
writing our own without strong, strong justification -- primarily because I
think the likelihood is we introduce more bugs before getting to something
stable.

Regarding the remaining options, mostly some thoughts:

- It would be nice to have some specific evidence of other projects using
the EA versions and what their developers have said about it.
- It sounds like if we go with the EA route, the onus to test for
correctness / compatibility increases. They do test, but anything marked
"early access" I think deserves more scrutiny from the C* community before
release. That could come in the form of more tests (or showing that we
already have good coverage of where it's used).
- I assume each time we upgrade we would pick the most recently released EA
version.

Jordan


On Mon, Sep 16, 2024 at 1:46 PM Štefan Miklošovič 
wrote:

> We are using a library called Chronicle Queue (1) and its dependencies and
> we ship them in the distribution tarball.
>
> The version we use in 5.0 / trunk as I write this is 2.23.36. If you look
> closely here (2), there is one more release like this, 2.23.37 and after
> that all these releases have "ea" in their name.
>
> "ea" stands for "early access". The project has changed the versioning /
> development model in such a way that "ea" releases act, more or less, as
> glorified snapshots which are indeed released to Maven Central but the
> "regular" releases are not there. The reason behind this is that "regular"
> releases are published only for customers who pay to the company behind
> this project and they offer commercial support for that.
>
> "regular" releases are meant to get all the bug fixes after "ea" is
> published and they are official stable releases. On the other hand "ea"
> releases are the ones where the development happens and every now and then,
> once the developers think that it is time to cut new 2.x, they just publish
> that privately.
>
> I was investigating how this all works here (3) and while they said that,
> I quote (4):
>
> "In my experience this is consumed by a large number of open source
> projects reliably (for our other artifacts too). This development/ea branch
> still goes through an extensive test suite prior to release. Releases from
> this branch will contain the latest features and bug fixes."
>
> I am not completely sure if we are OK with this. For the record, Mick is
> not overly comfortable with that and Brandon would prefer to just replace
> it / get rid of this dependency (comments / reasons / discussion from (5)
> to the end)
>
> The question is if we are OK with how things are and if we are then what
> are the rules when upgrading the version of this project in Cassandra in
> the context of "ea" versions they publish.
>
> If we are not OK with this, then the question is what we are going to
> replace it with.
>
> If we are going to replace it, I very briefly took a look and there is
> practically nothing out there which would hit all the buttons for us.
> Chronicle is just perfect for this job and I am not a fan of rewriting this
> at all.
>
> I would like to have this resolved because there is CEP-12 I plan to
> deliver and I hit this and I do not want to base that work on something we
> might eventually abandon. There are some ideas for CEP-12 how to bypass
> this without using Chronicle but I would like to firstly hear your opinion.
>
> Regards
>
> (1) https://github.com/OpenHFT/Chronicle-Queue
> (2) https://repo1.maven.org/maven2/net/openhft/chronicle-core/
> (3) https://github.com/OpenHFT/Chronicle-Core/issues/668
> (4)
> https://github.com/OpenHFT/Chronicle-Core/issues/668#issuecomment-2322038676
> (5)
> https://issues.apache.org/jira/browse/CASSANDRA-18712?focusedCommentId=17878254&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17878254
>

Re: [VOTE] Release Apache Cassandra 4.0-rc1

2021-03-30 Thread Jordan West

+1 nb. Very excited to see the project reach this milestone. Congrats
everyone and thank you all for the effort and hard work!

Jordan

On Tue, Mar 30, 2021 at 8:33 AM Scott Andreas  wrote:

> +1 nb.
>
> This is a huge milestone for the project.
>
> 
> From: Paulo Motta 
> Sent: Tuesday, March 30, 2021 4:57 AM
> To: Cassandra DEV
> Subject: Re: [VOTE] Release Apache Cassandra 4.0-rc1
>
> +1
>
> Em ter., 30 de mar. de 2021 às 00:25, Dinesh Joshi 
> escreveu:
>
> > +1
> >
> > Dinesh
> >
> > > On Mar 29, 2021, at 1:41 PM, Nate McCall  wrote:
> > >
> > > +1
> > >
> > >
> > >> On Tue, Mar 30, 2021 at 2:06 AM Mick Semb Wever 
> wrote:
> > >>
> > >> Proposing the test build of Cassandra 4.0-rc1 for release.
> > >>
> > >> sha1: 2facbc97ea215faef1735d9a3d5697162f61bc8c
> > >> Git:
> > >>
> > >>
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-rc1-tentative
> > >> Maven Artifacts:
> > >>
> > >>
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1234/org/apache/cassandra/cassandra-all/4.0-rc1/
> > >>
> > >> The Source and Build Artifacts, and the Debian and RPM packages and
> > >> repositories, are available here:
> > >> https://dist.apache.org/repos/dist/dev/cassandra/4.0-rc1/
> > >>
> > >> The vote will be open for 72 hours (longer if needed). Everyone who
> has
> > >> tested the build is invited to vote. Votes by PMC members are
> considered
> > >> binding. A vote passes if there are at least three binding +1s and no
> > -1's.
> > >>
> > >> Known issues with this release, that are planned to be fixed in
> 4.0-rc2,
> > >> are
> > >> - four files were missing copyright headers,
> > >> - LICENSE and NOTICE contain additional unneeded information,
> > >> - jar files under lib/ in the source artefact.
> > >>
> > >> These issues are actively being worked on, along with our expectations
> > that
> > >> the ASF makes the policy around them more explicit so it is clear
> > exactly
> > >> what is required of us.
> > >>
> > >>
> > >> [1]: CHANGES.txt:
> > >>
> > >>
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-rc1-tentative
> > >> [2]: NEWS.txt:
> > >>
> > >>
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-rc1-tentative
> > >>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>

Re: Download source release / binary files in source release

2021-03-30 Thread Jordan West

I have yet to see a legal reason why including binaries in packages is a
bad thing. I’ve read the thread and the documents linked. In fact, it looks
like it’s done specifically to avoid legal issues with copyleft licenses.
It’s very common for Apache to hold on to past policies at the expense of
its projects’ users (see the slow transition to Git) all while claiming to
do it for their benefit. It’s a decade later, the landscape has changed. We
should absolutely protect the project legally but trying to guess the
spirit of open source at the cost of users is of little benefit to all
stakeholders.

In the end this discussion has moved to a list most of us don’t have access
to and when asked to contribute the original reporter basically said “Your
problem. You fix it” despite having a significant amount of experience in
making builds “comply”.  It’s also causing the delay of the project’s first
major release in 5 years, to which many on this list have contributed large
portions of their lives. That’s not very in the spirit of open source
and I am disappointed again by the ASF’s role in this, which continues to
be ambiguous and at the cost of its users and developers.

All that said, if we fix this great. If we don’t, eh. As long as we are
legally compliant with the licenses of the dependencies we use we should
value convenience for users over pedantry and statements that are a
decade old. If there is a legal reason to change this it’s been explained
poorly by the ASF and needs clarification. It also can only be so important
if we are only catching it now after so many releases with the project.

Jordan

On Tue, Mar 30, 2021 at 7:19 AM Joshua McKenzie 
wrote:

> FWIW I don't have access to what's being raised with the board so
> effectively can't participate in this discussion beyond +1'ing Jirsa:
>
> Based on this point, I personally won't vote to approve a future release
> > with binary packages, but I also strongly disagree with the assertion in
> > that same past thread that it's worth nuking a 10+year history of
> releases.
> > That's the type of action that would severely diminish trust in the
> > foundation.
>
>
> We SHOULD look at what's required to rebuild PAST releases.
>
>
> We should keep in mind what's best for our users. While avoiding including
> compiled binaries that can't be verified as open source makes complete
> sense from a "maximize safety to our users" perspective and can be done on
> forward-going releases with minimal lift, we also have to consider how we
> get There from Here on past releases. Pulling the rug out from our entire
> user-base and releases after over a decade based on a conversation that
> happened off-list (i.e. not on the C* dev list) 9 years ago is, hopefully
> we can agree, not in our users' best interests nor the best interests of
> this project's longevity.
>
> ~Josh
>
> On Tue, Mar 30, 2021 at 9:38 AM Mick Semb Wever  wrote:
>
> > >
> > > It good to see you are taking action, but I think the situation is a
> > > little more seriously that you may realise, I suggest you look at what
> > > actions the board has taken in similar situations in the past. I'll
> > update
> > > the board agenda item to reflect the current situation.
> > >
> >
> >
> > The current board agenda item is still not accurate. The PMC members and
> > the project are not ignoring the issue.
> >
> > Also, it would be nice if you could reference this thread, in both the
> > board's agenda item and ML post, to allow people to have a complete view
> of
> > the discussion.
> >
> > I am happy to add information to the agenda item if you agree to it.
> > Better yet, I suggest that we work together in public to word it. Most
> > people on this list do not have access to the message. There is a
> community
> > here, and the way we work together to solve problems matters.
> >
>

Re: Welcome Caleb Rackliffe as Cassandra committer

2021-05-14 Thread Jordan West

Congrats Caleb!

Jordan

On Fri, May 14, 2021 at 10:43 AM Scott Andreas  wrote:

> Congratulations, Caleb!
>
> — Scott
>
> > On May 14, 2021, at 10:29 AM, Andrés de la Peña <
> a.penya.gar...@gmail.com> wrote:
> >
> > Congrats Caleb, well deserved! :)
> >
> >> On Fri, 14 May 2021 at 17:53, Paulo Motta 
> wrote:
> >>
> >> Awesome, congratulations Caleb!! :)
> >>
> >> Em sex., 14 de mai. de 2021 às 13:16, Patrick McFadin <
> pmcfa...@gmail.com>
> >> escreveu:
> >>
> >>> YES! Love seeing this. A very much deserved congratulations Caleb!
> >>>
> >>> Patrick
> >>>
> >>> On Fri, May 14, 2021 at 9:12 AM David Capwell
>  >>>
> >>> wrote:
> >>>
>  Congrats!
> 
> > On May 14, 2021, at 8:52 AM, Charles Cao 
> >> wrote:
> >
> > Congrats Caleb! Well deserved :)
> >
> > ~Charles
> >
> >> On May 14, 2021, at 07:30, Yifan Cai  wrote:
> >>
> >> Congrats Caleb!
> >>
> >>> On May 14, 2021, at 6:56 AM, Joshua McKenzie  >>>
>  wrote:
> >>>
> >>> Congrats Caleb!
> >>>
> > On Fri, May 14, 2021 at 9:10 AM Brandon Williams <
> >> dri...@gmail.com
> 
>  wrote:
> 
>  Congrats Caleb! Well deserved.
> 
> > On Fri, May 14, 2021, 8:03 AM Mick Semb Wever 
>  wrote:
> >
> > The PMC members are pleased to announce that Caleb Rackliffe has
> > accepted the invitation to become committer.
> >
> > Thanks heaps Caleb for helping make Cassandra awesome!
> >
> > Congratulations and welcome,
> > The Apache Cassandra PMC members
> >
> 
> >>
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> 
> 
>  -
>  To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>  For additional commands, e-mail: dev-h...@cassandra.apache.org
> 
> 
> >>>
> >>
>

Re: Welcome Dinesh Joshi as Cassandra PMC member

2021-06-03 Thread Jordan West

Congratulations Dinesh!

Jordan

On Thu, Jun 3, 2021 at 1:40 AM Mick Semb Wever  wrote:

> Congrats Dinesh. Thanks for all the help given and offered whenever it is
> needed!
>
> On Wed, 2 Jun 2021 at 18:16, Benjamin Lerer  wrote:
>
> >  The PMC's members are pleased to announce that Dinesh Joshi has accepted
> > the invitation to become a PMC member.
> >
> > Thanks a lot, Dinesh, for everything you have done for the project all
> > these years.
> >
> > Congratulations and welcome
> >
> > The Apache Cassandra PMC members
> >
>

[ANNOUNCE] 3.0.23/3.0.24/3.11.9/3.11.10 Can Potentially Corrupt Data During Schema Changes

2021-07-25 Thread Jordan West

The bug reported in CASSANDRA-16735 [1] was known to cause corruption
thought to be recoverable but can, in fact, induce *non-recoverable*
corruption in some partitions. If you are not yet on 3.0.23, 3.0.24,
3.11.9, or 3.11.10, it is recommended you wait to upgrade until the
Cassandra community releases 3.0.25 and 3.11.11. Once released, skip
directly from 3.0.22 to 3.0.25 or from 3.11.8 to 3.11.11. For those
already on the affected versions, the Cassandra community is working
to release 3.0.25 and 3.11.11 immediately. Immediate upgrade to 3.0.25
or 3.11.11 is recommended and all schema changes should be stopped
until the upgrade is complete.

While the issue has been known for some time, the severity of the
issue was not well understood. This understanding has improved and
with that we are suggesting the above actions for all users.


The issue was introduced by a fix for CASSANDRA-15899 [2] which
affected all versions up to and including 3.0.22 and 3.11.8. The fix
for CASSANDRA-16735 was to revert the patch made in CASSANDRA-15899
meaning clusters will continue to be susceptible to this transient
issue.

In summary:

- 3.0.22 and before/3.11.8 and before - susceptible to CASSANDRA-15899
which carries considerably less risk relative to CASSANDRA-16735.

- 3.0.23, 3.0.24, 3.11.9, 3.11.10 - has the CASSANDRA-15899 patch that
introduces the bug reported in CASSANDRA-16735. This makes Cassandra
susceptible to non-recoverable corruption and should be upgraded
immediately.

- 3.0.25, 3.11.11 - has CASSANDRA-15899 patch reverted by patch in
CASSANDRA-16735 -- no longer susceptible to unrecoverable corruption
but continues to be susceptible to CASSANDRA-15899.
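
The guidance above can be turned into a small lookup, for example to audit
the versions reported by a fleet. This is only a sketch: the function names
and the wording of the returned advice are made up for illustration, and only
the version numbers come from this announcement.

```python
# Releases carrying the CASSANDRA-15899 patch that introduced CASSANDRA-16735.
AFFECTED = {"3.0.23", "3.0.24", "3.11.9", "3.11.10"}
# First release on each line that reverts the patch.
FIXED_IN = {(3, 0): "3.0.25", (3, 11): "3.11.11"}


def parse(version: str) -> tuple:
    """Turn '3.11.10' into (3, 11, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def advise(version: str) -> str:
    """Return the action this announcement recommends for a given version."""
    major, minor, patch = parse(version)
    line = (major, minor)
    if line not in FIXED_IN:
        return "not covered by this announcement"
    target = FIXED_IN[line]
    if version in AFFECTED:
        return f"stop schema changes and upgrade to {target} immediately"
    if (major, minor, patch) >= parse(target):
        return "already includes the CASSANDRA-16735 revert"
    return f"wait for {target} and upgrade to it directly"


if __name__ == "__main__":
    for v in ("3.0.22", "3.11.10", "3.11.11"):
        print(v, "->", advise(v))
```

Comparing parsed tuples rather than strings matters here, because "3.11.9"
sorts after "3.11.10" as plain text.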


[1] https://issues.apache.org/jira/browse/CASSANDRA-16735
[2] https://issues.apache.org/jira/browse/CASSANDRA-15899
