From an operator's view, I think the most reliable indicator is not the
total count of corruption events, but their frequency. Let me try to
explain that with some examples:
1. Many corruption events in a short period of time, then nothing after that:
the disk is probably still healthy
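(A minimal, purely illustrative Java sketch of that sliding-window heuristic -
none of these class or method names exist in Cassandra, and the thresholds are
made up:)

import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: classify disk health by the recent *rate* of corruption
// events rather than their lifetime count.
public class CorruptionRateTracker
{
    private final Deque<Long> eventTimestamps = new ArrayDeque<>();
    private final long windowMillis;   // e.g. a 24h sliding window
    private final int burstThreshold;  // events per window that look like ongoing degradation

    public CorruptionRateTracker(long windowMillis, int burstThreshold)
    {
        this.windowMillis = windowMillis;
        this.burstThreshold = burstThreshold;
    }

    // Record a corruption event; returns true if the recent rate suggests a failing disk.
    public synchronized boolean recordAndCheck(long nowMillis)
    {
        eventTimestamps.addLast(nowMillis);
        // Drop events that have fallen out of the sliding window.
        while (!eventTimestamps.isEmpty() && eventTimestamps.peekFirst() < nowMillis - windowMillis)
            eventTimestamps.removeFirst();
        // A burst followed by silence drains out of the window and reads as
        // healthy again; a sustained frequency keeps tripping the threshold.
        return eventTimestamps.size() >= burstThreshold;
    }
}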
It was mainly to integrate with Hadoop - I used it from 0.6 to 1.2 in
production prior to starting at DataStax, and at that time I was stitching
together Cloudera's distribution of Hadoop with Cassandra. Back then there
were others who used it as well. As far as I know, usage dropped off when
What is the Hadoop code for? For interacting with Cassandra from Hadoop via
CQL, or Thrift if it's that old, or for looking directly at SSTables? Been
using C* since 2 and have never used it.
Agreed to deprecate in the next possible 4.1.x version and remove in 5.0.
Rahul Singh
Chief Executive Officer | Business Platform
When we attempt to rectify any bit-error by streaming data from
peers, we implicitly take a lock on token ownership. A user needs to
know that it is unsafe to change token ownership in a cluster that
is currently in the process of repairing a corruption error on one
of its instances
Link to the next episode:
https://drive.google.com/file/d/1IePasf681bU-7xRNl4tBzWvVG28y4tQK/view?usp=share_link
s2Ep3 - Loren Sands-Ramshaw
(You may have to download it to play)
FYI - Experimenting with a video podcast on this one.
It will remain in staging for 72 hours, going live (assuming no
> > > One place we've been weak historically is in distinguishing between
> > > tickets we consider "nice to have" and things that are "blockers". We
> > > don't have any metadata that currently distinguishes those two, so
> > > determining what our burndown leading up to 5.0 looks like is a lot
> there's a point at which a host limping along is better put down and replaced
I did a basic literature review and it looks like load (total program-erase
cycles), disk age, and operating temperature all lead to BER increases. We
don't need to build a whole model of disk failure; we could probably
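(Purely as an illustration of "a simple heuristic rather than a whole model" -
the factor names and thresholds below are assumptions drawn from the summary
above, not from any SMART spec or Cassandra code:)

// Hypothetical sketch: flag a host for replacement review when several of the
// literature-reported BER drivers are elevated at once.
public final class DiskRiskHeuristic
{
    public static boolean looksHighRisk(long programEraseCycles,
                                        double ageYears,
                                        double operatingTempCelsius)
    {
        int riskFactors = 0;
        if (programEraseCycles > 3_000) riskFactors++; // heavy cumulative write load
        if (ageYears > 4.0)             riskFactors++; // older media
        if (operatingTempCelsius > 50)  riskFactors++; // sustained high temperature
        // Two or more elevated factors: treat the disk as suspect rather than
        // trying to model the bit-error rate precisely.
        return riskFactors >= 2;
    }
}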
> I'm not seeing any reasons why CEP-21 would make this more difficult to
> implement
I think I communicated poorly - I was just trying to point out that there's a
point at which a host limping along is better put down and replaced than
piecemeal flagging range after range as dead and working around it
Honestly, I don't think moving it out in its current state is a win,
either. I'm +1 to deprecation in 4.1.x and removal in 5.0. If someone in
the community wants or needs the Hadoop code, it should be in a separate
repo/package, just like the Spark Connector.
Derek
On Thu, Mar 9, 2023 at 10:07 AM M
> We do have the metadata, but yes it requires some work…
My wording was poor; we have the *potential* to have this metadata, but to my
knowledge we don't have the muscle of consistently setting it, or any kind of
heuristic to determine when something should block a release or not. At least
on 4
>
> One place we've been weak historically is in distinguishing between
> tickets we consider "nice to have" and things that are "blockers". We don't
> have any metadata that currently distinguishes those two, so determining
> what our burndown leading up to 5.0 looks like is a lot more data massaging
Is there a ticket for that?
- - -- --- - -
Jacek Lewandowski
Thu, 9 Mar 2023 at 20:27, Mick Semb Wever wrote:
>
>
> On Thu, 9 Mar 2023 at 18:54, Brandon Williams wrote:
>
>> I think if we reach consensus here that decides it. I too vote to
>> deprecate in 4.1.x.
I'm not seeing any reasons why CEP-21 would make this more difficult to
implement, besides the fact that it hasn't landed yet.
There are two major potential pitfalls that CEP-21 would help us avoid:
1. Bit-errors beget further bit-errors, so we ought to be resistant to a high
frequency of corruption
On Thu, 9 Mar 2023 at 18:54, Brandon Williams wrote:
> I think if we reach consensus here that decides it. I too vote to
> deprecate in 4.1.x. This means we would remove it in 5.0.
>
+1
There is also this roadmap page, but we haven’t updated it lately. It
still contains 4.1 updates from last year.
https://cwiki.apache.org/confluence/display/CASSANDRA/Roadmap
On Thu, 9 Mar 2023 at 13:51, Josh McKenzie wrote:
> Added an "Epics" quick filter; could help visualize what our high pri
> Personally, I'd like to see the fix for this issue come after CEP-21. It
> could be feasible to implement a fix before then, that detects bit-errors on
> the read path and refuses to respond to the coordinator, implicitly having
> speculative execution handle the retry against another replica
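(A self-contained Java sketch of the "refuse to respond" shape of that idea -
ChecksummedBlock and CorruptBlockException are hypothetical names, not
Cassandra's actual read path:)

import java.util.zip.CRC32;

public class ChecksummedReadExample
{
    static class CorruptBlockException extends RuntimeException
    {
        CorruptBlockException(String msg) { super(msg); }
    }

    static class ChecksummedBlock
    {
        final byte[] data;
        final long expectedCrc;
        ChecksummedBlock(byte[] data, long expectedCrc) { this.data = data; this.expectedCrc = expectedCrc; }
    }

    // Verify the block before answering; throwing here stands in for
    // "refuse to respond to the coordinator".
    static byte[] readVerified(ChecksummedBlock block)
    {
        CRC32 crc = new CRC32();
        crc.update(block.data);
        if (crc.getValue() != block.expectedCrc)
            // The coordinator never sees the corrupt payload; speculative
            // execution can retry the read against another replica.
            throw new CorruptBlockException("bit-error detected, aborting local read");
        return block.data;
    }
}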
Added an "Epics" quick filter; could help visualize what our high priority
features are for given releases:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2649
Our cumulative flow diagram of 5.0-related tickets is pretty large. Probably
not a great indicator for
+1 (nb) for deprecation in 4.x and removal in 5.0
On 2023/03/09 18:04:27 Jeremy Hanna wrote:
> +1 from me to deprecate in 4.x and remove in 5.0.
>
> > On Mar 9, 2023, at 12:01 PM, J. D. Jordan wrote:
> >
> > +1 from me to deprecate in 4.x and remove in 5.0.
> >
> > -Jeremiah
> >
> >> On Mar
Thanks for proposing this discussion, Bowen. I see a few different issues here:
1. How do we safely handle corruption of a handful of tokens without taking an
entire instance offline for re-bootstrap? This includes refusal to serve read
requests for the corrupted token(s), and correct repair of t
+1 from me to deprecate in 4.x and remove in 5.0.
> On Mar 9, 2023, at 12:01 PM, J. D. Jordan wrote:
>
> +1 from me to deprecate in 4.x and remove in 5.0.
>
> -Jeremiah
>
>> On Mar 9, 2023, at 11:53 AM, Brandon Williams wrote:
>>
>> I think if we reach consensus here that decides it. I too
+1 from me to deprecate in 4.x and remove in 5.0.
-Jeremiah
> On Mar 9, 2023, at 11:53 AM, Brandon Williams wrote:
>
> I think if we reach consensus here that decides it. I too vote to
> deprecate in 4.1.x. This means we would remove it in 5.0.
>
> Kind Regards,
> Brandon
>
>> On Thu, Mar 9
I think if we reach consensus here that decides it. I too vote to
deprecate in 4.1.x. This means we would remove it in 5.0.
Kind Regards,
Brandon
On Thu, Mar 9, 2023 at 11:32 AM Ekaterina Dimitrova
wrote:
>
> Deprecation sounds good to me, but I am not completely sure in which version
> we can
Deprecation would mean that the code has to be there for the whole of 5.0 so we
can remove it for real in 6.0?
From: Ekaterina Dimitrova
Sent: Thursday, March 9, 2023 18:32
To: dev@cassandra.apache.org
Subject: Re: Role of Hadoop code in Cassandra 5.0
Deprecation sounds good to me, but I am not completely sure in which
version we can do it. If it is possible to add a deprecation warning in the
4.x series or at least 4.1.x - I vote for that.
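(For illustration, a 4.1.x deprecation warning could look roughly like this -
CqlInputFormat is one of the classes under org.apache.cassandra.hadoop, but the
exact annotation and warning shown here are an assumption, not a committed
change:)

/**
 * @deprecated Hadoop integration is deprecated and scheduled for removal in 5.0.
 */
@Deprecated
public class CqlInputFormat // actual class hierarchy omitted for brevity
{
    private static final org.slf4j.Logger logger =
        org.slf4j.LoggerFactory.getLogger(CqlInputFormat.class);

    static
    {
        // Warn once, when the class is first loaded by a user's Hadoop job.
        logger.warn("org.apache.cassandra.hadoop is deprecated and will be removed in 5.0");
    }
}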
On Thu, 9 Mar 2023 at 12:14, Jacek Lewandowski
wrote:
> Is it possible to deprecate it in the 4.1.x patch release? :)
Is it possible to deprecate it in the 4.1.x patch release? :)
- - -- --- - -
Jacek Lewandowski
Thu, 9 Mar 2023 at 18:11, Brandon Williams wrote:
> This is my feeling too, but I think we should accomplish this by
> deprecating it first. I don't expect anything will change after the
> deprecation period.
... because - why Hadoop? This is something to be made a separate
project if there is a need for that. Just like the Spark Cassandra
Connector. Why do we need to include Hadoop-specific classes and
nothing specific for other frameworks?
- - -- --- - -
Jacek Lewandowski
This is my feeling too, but I think we should accomplish this by
deprecating it first. I don't expect anything will change after the
deprecation period.
Kind Regards,
Brandon
On Thu, Mar 9, 2023 at 11:09 AM Jacek Lewandowski
wrote:
>
> I vote for removing it entirely.
>
> thanks
> - - -- --- --
I vote for removing it entirely.
thanks
- - -- --- - -
Jacek Lewandowski
Thu, 9 Mar 2023 at 18:07, Miklosovic, Stefan wrote:
> Derek,
>
> I have a couple more points ... I do not think that extracting it to a
> separate repository is a "win". That code is on Hadoop 1.0
Derek,
I have a couple more points ... I do not think that extracting it to a separate
repository is a "win". That code is on Hadoop 1.0.3. We would be spending a lot
of work extracting it just to end up with 10-year-old code with occasional
updates (in my humble opinion, just to make it compilable
Hi Jeremiah,
I'm fully aware of that, which is why I said that deleting the affected
SSTable files is "less safe".
If the "bad blocks" logic is implemented and the node abort the current
read query when hitting a bad block, it should remain safe, as the data
in other SSTable files will not b
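(A minimal sketch of that "bad blocks" idea - BadBlockRegistry and
AbortedReadException are hypothetical names, not existing Cassandra classes:)

import java.util.ArrayList;
import java.util.List;

public class BadBlockRegistry
{
    static class AbortedReadException extends RuntimeException
    {
        AbortedReadException(String msg) { super(msg); }
    }

    // Known-corrupt byte ranges [start, end) within a single SSTable data file.
    private final List<long[]> badRanges = new ArrayList<>();

    public synchronized void markBad(long start, long end)
    {
        badRanges.add(new long[]{ start, end });
    }

    // Called before serving a block. Aborting the whole query (rather than
    // silently skipping the block) is what keeps the read safe: skipping could
    // drop a tombstone and let shadowed data in other SSTables resurface.
    public synchronized void ensureReadable(long start, long end)
    {
        for (long[] range : badRanges)
            if (start < range[1] && end > range[0])
                throw new AbortedReadException("read overlaps a known-corrupt range; aborting query");
    }
}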
It is actually more complicated than just removing the sstable and running
repair.
In the face of expired tombstones that might be covering data in other sstables,
the only safe way to deal with a bad sstable is to wipe the token range in the
bad sstable and rebuild/bootstrap that range (or wipe/re
What about asking somebody from the Hadoop project to update it directly in
Cassandra? I think those people have loads of experience with integrations like
this. If we bumped the version to something like 3.3.x, refreshed the code, and
put some tests on top, I think we could just leave it there for couple m
I think the question isn't "Who ... is still using that?" but more "Are we
actually going to support it?" If we're on a version that old, it would
appear that we've basically abandoned it, although there do appear to have
been refactoring commits (for other things) in the last couple of years. I
wou
>
> I've also found some useful Cassandra's JIRA dashboards for previous
> releases to track progress and scope, but we don't have anything
> similar for the next release. Should we create it?
> Cassandra 4.0 GA scope
> Cassandra 4.1 GA scope
>
https://issues.apache.org/jira/secure/RapidBoard.jspa?
When I was a release manager for another Apache project, I found it
useful to create confluence pages for the upcoming release, both for
transparency of release dates and for benchmarks. Of course, the dates
can be updated when we have a better understanding of the scope
of the release.
Do we
Hi list,
I stumbled upon the Hadoop package again. I think there was some discussion about
the relevancy of the Hadoop code some time ago, but I would like to ask this again.
Do you think Hadoop code (1) is still relevant in 5.0? Who in the industry is
still using that?
We might drop a lot of code and
It was reported in CASSANDRA-18307 that the Debian and Red Hat packages
for 4.0.8 did not make it to the JFrog repository - this has now been
corrected. Sorry for any inconvenience.
Kind Regards,
Brandon
On Tue, Feb 14, 2023 at 3:39 PM Miklosovic, Stefan
wrote:
>
> The Cassandra team is pleased t
CEPs 25 (trie-indexed sstables) and 26 (unified compaction strategy) should
both be ready for review by mid-April.
Both are around 10k LOC, fairly isolated, and in need of a committer to
review.
Regards,
Branimir
On Mon, Mar 6, 2023 at 11:25 AM Benjamin Lerer wrote:
> Sorry, I realized that wh