From an operator's view, I think the most reliable indicator is not the
total count of corruption events, but their frequency. Let me try to
explain that with some examples:
1. Many corruption events in a short period of time, then nothing after that:
the disk is probably still healthy
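(A minimal, purely illustrative Java sketch of that sliding-window heuristic -
none of these class or method names exist in Cassandra, and the thresholds are
made up:)

import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: classify disk health by the recent *rate* of corruption
// events rather than their lifetime count.
public class CorruptionRateTracker
{
    private final Deque<Long> eventTimestamps = new ArrayDeque<>();
    private final long windowMillis;   // e.g. a 24h sliding window
    private final int burstThreshold;  // events per window that look like ongoing degradation

    public CorruptionRateTracker(long windowMillis, int burstThreshold)
    {
        this.windowMillis = windowMillis;
        this.burstThreshold = burstThreshold;
    }

    // Record a corruption event; returns true if the recent rate suggests a failing disk.
    public synchronized boolean recordAndCheck(long nowMillis)
    {
        eventTimestamps.addLast(nowMillis);
        // Drop events that have fallen out of the sliding window.
        while (!eventTimestamps.isEmpty() && eventTimestamps.peekFirst() < nowMillis - windowMillis)
            eventTimestamps.removeFirst();
        // A burst followed by silence drains out of the window and reads as
        // healthy again; a sustained frequency keeps tripping the threshold.
        return eventTimestamps.size() >= burstThreshold;
    }
}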
It was mainly to integrate with Hadoop - I used it from 0.6 to 1.2 in
production prior to starting at DataStax, and at that time I was stitching
together Cloudera's distribution of Hadoop with Cassandra. Back then there
were others who used it as well. As far as I know, usage dropped off when
What is the Hadoop code for? For interacting with Cassandra from Hadoop via
CQL, or Thrift if it's that old, or for looking directly at SSTables? Been
using C* since 2 and have never used it.
Agreed to deprecate in the next possible 4.1.x version and remove in 5.0.
Rahul Singh
Chief Executive Officer | Business Platform
When we attempt to rectify any bit-error by streaming data from
peers, we implicitly take a lock on token ownership. A user needs to
know that it is unsafe to change token ownership in a cluster that
is currently in the process of repairing a corruption error on one
of its instances
Link to the next episode:
https://drive.google.com/file/d/1IePasf681bU-7xRNl4tBzWvVG28y4tQK/view?usp=share_link
s2Ep3 - Loren Sands-Ramshaw
(You may have to download it to play)
FYI - Experimenting with a video podcast on this one.
It will remain in staging for 72 hours, going live (assuming no
> > > One place we've been weak historically is in distinguishing between
> > > tickets we consider "nice to have" and things that are "blockers". We
> > > don't have any metadata that currently distinguishes those two, so
> > > determining what our burndown leading up to 5.0 looks like is a lot
> there's a point at which a host limping along is better put down and replaced
I did a basic literature review and it looks like load (total program-erase
cycles), disk age, and operating temperature all lead to BER increases. We
don't need to build a whole model of disk failure; we could probably
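(Purely as an illustration of "a simple heuristic rather than a whole model" -
the factor names and thresholds below are assumptions drawn from the summary
above, not from any SMART spec or Cassandra code:)

// Hypothetical sketch: flag a host for replacement review when several of the
// literature-reported BER drivers are elevated at once.
public final class DiskRiskHeuristic
{
    public static boolean looksHighRisk(long programEraseCycles,
                                        double ageYears,
                                        double operatingTempCelsius)
    {
        int riskFactors = 0;
        if (programEraseCycles > 3_000) riskFactors++; // heavy cumulative write load
        if (ageYears > 4.0)             riskFactors++; // older media
        if (operatingTempCelsius > 50)  riskFactors++; // sustained high temperature
        // Two or more elevated factors: treat the disk as suspect rather than
        // trying to model the bit-error rate precisely.
        return riskFactors >= 2;
    }
}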
> I'm not seeing any reasons why CEP-21 would make this more difficult to
> implement
I think I communicated poorly - I was just trying to point out that there's a
point at which a host limping along is better put down and replaced than
piecemeal flagging range after range as dead and working around it
Honestly, I don't think moving it out in its current state is a win,
either. I'm +1 to deprecation in 4.1.x and removal in 5.0. If someone in
the community wants or needs the Hadoop code, it should be in a separate
repo/package, just like the Spark Connector.
Derek
On Thu, Mar 9, 2023 at 10:07 AM M
> We do have the metadata, but yes it requires some work…
My wording was poor; we have the *potential* to have this metadata, but to my
knowledge we don't have the muscle of consistently setting it, or any kind of
heuristic to determine when something should block a release or not. At least
on 4
>
> One place we've been weak historically is in distinguishing between
> tickets we consider "nice to have" and things that are "blockers". We don't
> have any metadata that currently distinguishes those two, so determining
> what our burndown leading up to 5.0 looks like is a lot more data massaging
Is there a ticket for that?
- - -- --- - -
Jacek Lewandowski
Thu, 9 Mar 2023 at 20:27, Mick Semb Wever wrote:
>
>
> On Thu, 9 Mar 2023 at 18:54, Brandon Williams wrote:
>
>> I think if we reach consensus here that decides it. I too vote to
>> deprecate in 4.1.x.
I'm not seeing any reasons why CEP-21 would make this more difficult to
implement, besides the fact that it hasn't landed yet.
There are two major potential pitfalls that CEP-21 would help us avoid:
1. Bit-errors beget further bit-errors, so we ought to be resistant to a high
frequency of corruption
On Thu, 9 Mar 2023 at 18:54, Brandon Williams wrote:
> I think if we reach consensus here that decides it. I too vote to
> deprecate in 4.1.x. This means we would remove it in 5.0.
>
+1
There is also this roadmap page, but we haven’t updated it lately. It
still contains 4.1 updates from last year.
https://cwiki.apache.org/confluence/display/CASSANDRA/Roadmap
On Thu, 9 Mar 2023 at 13:51, Josh McKenzie wrote:
> Added an "Epics" quick filter; could help visualize what our high pri
> Personally, I'd like to see the fix for this issue come after CEP-21. It
> could be feasible to implement a fix before then, that detects bit-errors on
> the read path and refuses to respond to the coordinator, implicitly having
> speculative execution handle the retry against another replica
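(A self-contained Java sketch of the "refuse to respond" shape of that idea -
ChecksummedBlock and CorruptBlockException are hypothetical names, not
Cassandra's actual read path:)

import java.util.zip.CRC32;

public class ChecksummedReadExample
{
    static class CorruptBlockException extends RuntimeException
    {
        CorruptBlockException(String msg) { super(msg); }
    }

    static class ChecksummedBlock
    {
        final byte[] data;
        final long expectedCrc;
        ChecksummedBlock(byte[] data, long expectedCrc) { this.data = data; this.expectedCrc = expectedCrc; }
    }

    // Verify the block before answering; throwing here stands in for
    // "refuse to respond to the coordinator".
    static byte[] readVerified(ChecksummedBlock block)
    {
        CRC32 crc = new CRC32();
        crc.update(block.data);
        if (crc.getValue() != block.expectedCrc)
            // The coordinator never sees the corrupt payload; speculative
            // execution can retry the read against another replica.
            throw new CorruptBlockException("bit-error detected, aborting local read");
        return block.data;
    }
}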
Added an "Epics" quick filter; could help visualize what our high priority
features are for given releases:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2649
Our cumulative flow diagram of 5.0-related tickets is pretty large. Probably
not a great indicator for
+1 (nb) for deprecation in 4.x and removal in 5.0
On 2023/03/09 18:04:27 Jeremy Hanna wrote:
> +1 from me to deprecate in 4.x and remove in 5.0.
>
> > On Mar 9, 2023, at 12:01 PM, J. D. Jordan wrote:
> >
> > +1 from me to deprecate in 4.x and remove in 5.0.
> >
> > -Jeremiah
> >
> >> On Mar
Thanks for proposing this discussion, Bowen. I see a few different issues here:
1. How do we safely handle corruption of a handful of tokens without taking an
entire instance offline for re-bootstrap? This includes refusal to serve read
requests for the corrupted token(s), and correct repair of t
+1 from me to deprecate in 4.x and remove in 5.0.
> On Mar 9, 2023, at 12:01 PM, J. D. Jordan wrote:
>
> +1 from me to deprecate in 4.x and remove in 5.0.
>
> -Jeremiah
>
>> On Mar 9, 2023, at 11:53 AM, Brandon Williams wrote:
>>
>> I think if we reach consensus here that decides it. I too
+1 from me to deprecate in 4.x and remove in 5.0.
-Jeremiah
> On Mar 9, 2023, at 11:53 AM, Brandon Williams wrote:
>
> I think if we reach consensus here that decides it. I too vote to
> deprecate in 4.1.x. This means we would remove it in 5.0.
>
> Kind Regards,
> Brandon
>
>> On Thu, Mar 9
I think if we reach consensus here that decides it. I too vote to
deprecate in 4.1.x. This means we would remove it in 5.0.
Kind Regards,
Brandon
On Thu, Mar 9, 2023 at 11:32 AM Ekaterina Dimitrova
wrote:
>
> Deprecation sounds good to me, but I am not completely sure in which version
> we can
Deprecation would mean that the code has to be there for the whole of 5.0 so we
can remove it for real in 6.0?
From: Ekaterina Dimitrova
Sent: Thursday, March 9, 2023 18:32
To: dev@cassandra.apache.org
Subject: Re: Role of Hadoop code in Cassandra 5.0
Deprecation sounds good to me, but I am not completely sure in which
version we can do it. If it is possible to add a deprecation warning in the
4.x series or at least 4.1.x - I vote for that.
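(For illustration, a 4.1.x deprecation warning could look roughly like this -
CqlInputFormat is one of the classes under org.apache.cassandra.hadoop, but the
exact annotation and warning shown here are an assumption, not a committed
change:)

/**
 * @deprecated Hadoop integration is deprecated and scheduled for removal in 5.0.
 */
@Deprecated
public class CqlInputFormat // actual class hierarchy omitted for brevity
{
    private static final org.slf4j.Logger logger =
        org.slf4j.LoggerFactory.getLogger(CqlInputFormat.class);

    static
    {
        // Warn once, when the class is first loaded by a user's Hadoop job.
        logger.warn("org.apache.cassandra.hadoop is deprecated and will be removed in 5.0");
    }
}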
On Thu, 9 Mar 2023 at 12:14, Jacek Lewandowski
wrote:
> Is it possible to deprecate it in the 4.1.x patch release? :)
Is it possible to deprecate it in the 4.1.x patch release? :)
- - -- --- - -
Jacek Lewandowski
Thu, 9 Mar 2023 at 18:11, Brandon Williams wrote:
> This is my feeling too, but I think we should accomplish this by
> deprecating it first. I don't expect anything will change after the
> deprecation period.
... because - why Hadoop? This is something to be made a separate
project if there is a need for that. Just like the Spark Cassandra
Connector. Why do we need to include Hadoop-specific classes and
nothing specific for other frameworks?
- - -- --- - -
Jacek Lewandowski
This is my feeling too, but I think we should accomplish this by
deprecating it first. I don't expect anything will change after the
deprecation period.
Kind Regards,
Brandon
On Thu, Mar 9, 2023 at 11:09 AM Jacek Lewandowski
wrote:
>
> I vote for removing it entirely.
>
> thanks
> - - -- --- --
I vote for removing it entirely.
thanks
- - -- --- - -
Jacek Lewandowski
Thu, 9 Mar 2023 at 18:07, Miklosovic, Stefan wrote:
> Derek,
>
> I have a couple more points ... I do not think that extracting it to a
> separate repository is a "win". That code is on Hadoop 1.0
Derek,
I have a couple more points ... I do not think that extracting it to a separate
repository is a "win". That code is on Hadoop 1.0.3. We would be spending a lot
of work extracting it just to end up with 10-year-old code with occasional
updates (in my humble opinion, just to make it compilable
Hi Jeremiah,
I'm fully aware of that, which is why I said that deleting the affected
SSTable files is "less safe".
If the "bad blocks" logic is implemented and the node abort the current
read query when hitting a bad block, it should remain safe, as the data
in other SSTable files will not b
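(A minimal sketch of that "bad blocks" idea - BadBlockRegistry and
AbortedReadException are hypothetical names, not existing Cassandra classes:)

import java.util.ArrayList;
import java.util.List;

public class BadBlockRegistry
{
    static class AbortedReadException extends RuntimeException
    {
        AbortedReadException(String msg) { super(msg); }
    }

    // Known-corrupt byte ranges [start, end) within a single SSTable data file.
    private final List<long[]> badRanges = new ArrayList<>();

    public synchronized void markBad(long start, long end)
    {
        badRanges.add(new long[]{ start, end });
    }

    // Called before serving a block. Aborting the whole query (rather than
    // silently skipping the block) is what keeps the read safe: skipping could
    // drop a tombstone and let shadowed data in other SSTables resurface.
    public synchronized void ensureReadable(long start, long end)
    {
        for (long[] range : badRanges)
            if (start < range[1] && end > range[0])
                throw new AbortedReadException("read overlaps a known-corrupt range; aborting query");
    }
}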
It is actually more complicated than just removing the sstable and running
repair.
In the face of expired tombstones that might be covering data in other sstables,
the only safe way to deal with a bad sstable is to wipe the token range in the
bad sstable and rebuild/bootstrap that range (or wipe/re
What about asking somebody from the Hadoop project to update it directly in
Cassandra? I think those people have loads of experience with integrations like
this. If we bumped the version to something like 3.3.x, refreshed the code, and
put some tests on top, I think we could just leave it there for couple m
I think the question isn't "Who ... is still using that?" but more "Are we
actually going to support it?" If we're on a version that old, it would
appear that we've basically abandoned it, although there do appear to have
been refactoring commits (for other things) in the last couple of years. I
wou
>
> I've also found some useful Cassandra's JIRA dashboards for previous
> releases to track progress and scope, but we don't have anything
> similar for the next release. Should we create it?
> Cassandra 4.0 GA scope
> Cassandra 4.1 GA scope
>
https://issues.apache.org/jira/secure/RapidBoard.jspa?
When I was a release manager for another Apache project, I found it
useful to create confluence pages for the upcoming release, both for
transparency of release dates and for benchmarks. Of course, the dates
can be updated when we have a better understanding of the scope
of the release.
Do we
Hi list,
I stumbled upon the Hadoop package again. I think there was some discussion about
the relevancy of the Hadoop code some time ago, but I would like to ask this again.
Do you think Hadoop code (1) is still relevant in 5.0? Who in the industry is
still using that?
We might drop a lot of code and
It was reported in CASSANDRA-18307 that the Debian and Red Hat packages
for 4.0.8 did not make it to the JFrog repository - this has now been
corrected. Sorry for any inconvenience.
Kind Regards,
Brandon
On Tue, Feb 14, 2023 at 3:39 PM Miklosovic, Stefan
wrote:
>
> The Cassandra team is pleased t
CEPs 25 (trie-indexed sstables) and 26 (unified compaction strategy) should
both be ready for review by mid-April.
Both are around 10k LOC, fairly isolated, and in need of a committer to
review.
Regards,
Branimir
On Mon, Mar 6, 2023 at 11:25 AM Benjamin Lerer wrote:
> Sorry, I realized that wh