Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-11 Thread Branimir Lambov
Predefined configurations make most sense to me too. My personal preference is to use a variation of the mechanism for defining memtable configurations, where the details of each are laid out explicitly in the yaml, and the s

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-09 Thread Benedict
That should never need to happen in UCS as far as I understand. Levels should be defined by the properties of the sstable, not by assignment, so all sstables should be placed in the correct bucket on creation by definition. I haven’t read the code though, so there might be some impediment to tha

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-09 Thread Jeff Jirsa
On 2024/12/09 17:33:09 Benedict wrote: > I think it would make sense to support overriding the default FP in the UCS > parameters, so we can treat it as a direct replacement. Desired FP is > directly related to sstable overlaps after all. > > Can you think of any other usability gaps like t

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-09 Thread Benedict
I think it would make sense to support overriding the default FP in the UCS parameters, so we can treat it as a direct replacement. Desired FP is directly related to sstable overlaps after all. Can you think of any other usability gaps like this? > On 9 Dec 2024, at 12:06, Jeff Jirsa wrote: >

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-09 Thread Jeff Jirsa
On 2024/12/09 16:26:45 Benedict wrote: > I think it’s important to remember that UCS broadly speaking subsumes both LCS > and STCS, with various subtle but important refinements. So while it offers a > broader parameter space it might be best to conceive of it as a suite of > compaction strategi

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-09 Thread Benedict
I think it’s important to remember that UCS broadly speaking subsumes both LCS and STCS, with various subtle but important refinements. So while it offers a broader parameter space it might be best to conceive of it as a suite of compaction strategies, two of which are direct replacements to LCS an

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-09 Thread Chris Lohfink
Very true I was too harsh when trying to enumerate what I saw as blockers, and that wasn't fair of me as a lot of great work has gone into it. I am sorry for that! > Maybe telling what problem you actually have with it and how to simplify so it is easier to digest would be more appropriate. The t

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-08 Thread Paulo Motta
I think this would be beneficial since we have a loose agreement that UCS is a promising option as a new default. I plan to summarize the views expressed in this thread to propose a plan to make compaction usability smoother for new users in Cassandra, if there are any short term actions we can agree t

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-08 Thread Jordan West
While we continue the discussion here on short term defaults do we all feel it would be beneficial to start a new thread on what is required to get UCS over the line as a default? So we can have both discussions going at once? On Sun, Dec 8, 2024 at 8:44 AM Paulo Motta wrote: > > Hi Dave, > > I

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-08 Thread Paulo Motta
Hi Dave, I appreciate these performance/cost considerations and I believe these should be taken into account when evaluating default changes. I am trying to frame this as a usability issue with the database by shipping with STCS by default. I think it's possible to classify workloads into two t

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-08 Thread Dave Herrington
…the analysis I describe would need to be weighted by table size. I have several representative production cluster tablestats analyses that show r:w ratio by table, including table size. I can check to see how this analysis plays out on a few of these. -Dave David A. Herrington II President and

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-08 Thread Dave Herrington
Paulo, I understand your perspective. Short of waiting for UCS to prove itself out, I guess it comes down to the assertion that a strong majority of Cassandra use cases would benefit from using LCS vs. STCS. The conventional wisdom is that workloads need to be read-heavy to make the extra resour

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-08 Thread Paulo Motta
> I feel it could be maddening to customers if LCS started showing up in schemas after an upgrade just because the default changed. Fwiw I’m not proposing changing the default for existing users or clusters, but just for new tables in new clusters. Existing clusters would keep the legacy default a

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-08 Thread Paulo Motta
Hi Dave, I'm also in the field and my experience is different. I have seen new users shooting themselves in the foot with the default compaction strategy STCS on a regular basis over the past few years and have been recommending them to switch to LCS and they no longer encounter issues after maki

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Dave Herrington
Chiming in from the field, I think maintaining the familiar status quo until a panacea compaction strategy proves itself out (could that be UCS?) makes sense to me. I feel it could be maddening to customers if LCS started showing up in schemas after an upgrade just because the default changed. If

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Jeff Jirsa
> On Dec 7, 2024, at 7:08 PM, Mick Semb Wever wrote: > > Chiming in with my two cents… > > >> When people have the luxury of working in environments where clusters are >> massively over provisioned, LCS as a default makes a lot of sense, because >> there's not much downside. The use cases

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Mick Semb Wever
Chiming in with my two cents… When people have the luxury of working in environments where clusters are > massively over provisioned, LCS as a default makes a lot of sense, because > there's not much downside. The use cases where you'd actually fall behind > in compaction are pretty slim, so the

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Dinesh Joshi
To summarize, we only have 3 options here – 1. Make the default LCS and document this change. 2. Leave the default as STCS. 3. Leave the default undefined and force the operators to choose a compaction strategy. Given that there is apprehension on setting the default, I would suggest we provide n

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Jordan West
It’s a good point that if we plan to qualify UCS as a default changing it now has little value. STCS also has massively bad use cases, it’s not a C across the board (in particular when SSTables per read gets super high on dense nodes) though. It also requires more disk overhead and overprovisionin

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Jeff Jirsa
People who know enough to read the docs to find those profiles know how to read the docs to choose the right compaction already. Beyond that, it just clutters up the grammar and metadata. The reality here is there’s no single compaction strategy that works for everyone, so unless there’s strong

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Paulo Motta
Hi Jon, Thanks for your perspective on this topic, that's helpful feedback! I understand the default compaction strategy choice depends on multiple deployment and workload factors. How about this to allow Cassandra to be more flexible for different workload types: Update CQL with: CREATE TABLE

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Jon Haddad
When people have the luxury of working in environments where clusters are massively over provisioned, LCS as a default makes a lot of sense, because there's not much downside. The use cases where you'd actually fall behind in compaction are pretty slim, so the negative impact isn't felt. Most peo

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Dinesh Joshi
On Sat, Dec 7, 2024 at 4:16 AM Brandon Williams wrote: > I agree with your sentiment here. It's a growing problem that we > don't have anyone focussed on writing user docs any longer - if you > open a ticket for docs, unfortunately you will probably need to drive > it for it to go anywhere. > H

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Jordan West
Generally agree with the following sentiments: - LCS as the stable default, it’s not perfect and can blow up but it’s the best in the majority of cases. All of the compaction strategies come with foot guns of varying sizes. If STCS is replaced by UCS it definitely should not be the default. - mov

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Brandon Williams
On Sat, Dec 7, 2024 at 6:05 AM Štefan Miklošovič wrote: > ouch ... that hurts ... whoever did that job. Could we be more emotions-less > here? Branimir did an excellent job and for _technical_ documentation there > is nothing wrong with it. It is another problem that the documentation is not >

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-07 Thread Štefan Miklošovič
On Sat, Dec 7, 2024 at 4:42 AM Chris Lohfink wrote: > While I am actually +1 on LCS being default as it handles more use cases > well compared to STCS. I am -1 on UCS being default anywhere currently, > the UX is horrible, documentation is unreadable and it's only available on > a release barely

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Tolbert, Andy
> @Andy - you can set the default compaction strategy in C* yaml now. Oh, this is very cool and I'm happy to see it! Looks like that landed as part of the UCS contribution itself (CASSANDRA-18397 Unified Compaction Strategy), great idea. >

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread cnlwsu
Same. I can’t think of a scenario beyond just writes outpacing compaction throughput. What’s the 20%? Chris On Dec 6, 2024, at 10:58 PM, Dinesh Joshi wrote: > I’m genuinely curious to understand how is defaulting to LCS going to cause a nightmare? I am not sure what the concern is

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Dinesh Joshi
On Fri, Dec 6, 2024 at 9:23 PM Jon Haddad wrote: > For a very common example, a lot of clusters are now using the k8ssandra > operator in AWS, which needs EBS. It's incredibly easy to fall behind on > compaction there. It's why I'm so interested in seeing CASSANDRA-15452 get > merged in. I've

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Jon Haddad
For a very common example, a lot of clusters are now using the k8ssandra operator in AWS, which needs EBS. It's incredibly easy to fall behind on compaction there. It's why I'm so interested in seeing CASSANDRA-15452 get merged in. I've dealt with quite a few of these clusters, in fact I just wo

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Dinesh Joshi
I’m genuinely curious to understand how is defaulting to LCS going to cause a nightmare? I am not sure what the concern is over here. On Fri, Dec 6, 2024 at 8:53 PM Jon Haddad wrote: > You're ignoring the other side here. For the folks who *can't* use LCS, > defaulting to it is a nightmare. > >

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Tolbert, Andy
It's also quite easy for STCS to make clusters inoperable, and it can be quite difficult to dig yourself out of. It's not hard to find yourself in a state where you have old 100GB+ SSTables full of expired data that never get compacted sitting around for months. Write amplification is a thing, b

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Dinesh Joshi
I would argue that the vast majority of real world workloads are read heavy. LCS would therefore be a net benefit for the average user. To mitigate the write amplification concern I would make this change and make sure it is well documented for operators so they’re not caught off guard. On Fri, Dec 6

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Jon Haddad
You're ignoring the other side here. For the folks who *can't* use LCS, defaulting to it is a nightmare. Sorry, but you can't screw over 20% of the community to make life a little better for the 80%. This is a terrible tradeoff. Jon On Fri, Dec 6, 2024 at 8:36 PM Dinesh Joshi wrote: > I woul

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Dinesh Joshi
I have to agree with Chris here. In the vast majority of cases LCS hits the sweet spot and avoids the STCS pitfalls. UCS is too new and I would not make that a default OOTB. Philosophically, as a project, we should wait until critical features like these reach a certain level of maturity prior to

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Jeff Jirsa
And it works for that most of the time, so what’s the concern? “You lose throughput because iops / write amplification go up, so the perf of the default install goes down” ? (But the cost per byte goes way down, too)? > On Dec 6, 2024, at 8:01 PM, Brad wrote: > > > Could you elaborate what

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Brad
> Could you elaborate what you mean by 'disk storage management'? I often see clusters use LCS as an easy fix to avoid the 50% disk free recommendation of STCS without considering the write magnification implications. On Fri, Dec 6, 2024 at 10:46 PM Dinesh Joshi wrote: > Could you elaborate wha

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Dinesh Joshi
Could you elaborate what you mean by 'disk storage management'? On Fri, Dec 6, 2024 at 7:30 PM Brad wrote: > I'm -1 on LCS being the default, seen far too many people use it for disk > storage management > > On Fri, Dec 6, 2024 at 10:08 PM Jon Haddad > wrote: > >> I'm -1 on LCS being the defaul

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Jeff Jirsa
I’m probably closely aligned with Chris here, fwiw. - Jeff > On Dec 6, 2024, at 7:40 PM, Chris Lohfink wrote: > > While I am actually +1 on LCS being default as it handles more use cases well > compared to STCS. I am -1 on UCS being default anywhere currently, the UX is > horrible, documenta

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Chris Lohfink
While I am actually +1 on LCS being default as it handles more use cases well compared to STCS. I am -1 on UCS being default anywhere currently, the UX is horrible, documentation is unreadable and it's only available on a release barely anyone uses yet (not adequately tested in production). Seems l

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Brad
I'm -1 on LCS being the default, seen far too many people use it for disk storage management On Fri, Dec 6, 2024 at 10:08 PM Jon Haddad wrote: > I'm -1 on LCS being the default, since using it in the wrong situations > renders clusters inoperable. > > > On Fri, Dec 6, 2024 at 7:03 PM Paulo Motta

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Jon Haddad
I'm -1 on LCS being the default, since using it in the wrong situations renders clusters inoperable. On Fri, Dec 6, 2024 at 7:03 PM Paulo Motta wrote: > > I'd prefer to see the default go from STCS to UCS > > I’m proposing this for latest unstable (cassandra_latest.yaml) since it’s > a more rec

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Paulo Motta
> I'd prefer to see the default go from STCS to UCS I’m proposing this for latest unstable (cassandra_latest.yaml) since it’s a more recent strategy still being adopted. For latest stable (cassandra.yaml) I’d prefer LCS since it does not need tuning to support mutable workloads (UPDATE/DELETE) and

Re: Re-evaluate compaction defaults in 5.1/trunk

2024-12-06 Thread Jon Haddad
I'd prefer to see the default go from STCS to UCS, probably with scaling_parameters T4. That's essentially the same as STCS but without the ridiculous SSTable growth, allowing us to leverage the fast streaming path more often. I don't think there's any valid use cases for STCS anymore now that we