Re: Improved DeletionTime serialization to reduce disk size

2023-07-03 Thread Benedict
I checked and I’m pretty sure we do, but it doesn’t apply any liveness
optimisation. I had misunderstood the optimisation you proposed. Ideally we
would encode any non-live timestamp with the delta offset, but since that’s a
distinct optimisation perhaps that can be left to another patch.

Are we happy, though, that the two different deletion time serialisers can
store different ranges of timestamp? Both are large ranges, but I am not 100%
comfortable with them diverging.

On 3 Jul 2023, at 05:45, Berenguer Blasi  wrote:

I can look into it. I don't have a deep knowledge of the sstable format, hence
why I wanted to document it someday. But DeletionTime is being serialized in
other places as well iirc, and I doubt (finger in the air) we'll have that
Epoch handy.

On 29/6/23 17:22, Benedict wrote:

So I’m just taking a quick peek at SerializationHeader and we already have a
method for reading and writing a deletion time with offsets from EncodingStats.

So perhaps we simply have a bug where we are using DeletionTime Serializer
instead of SerializationHeader.writeLocalDeletionTime? It looks to me like
this is already available at most (perhaps all) of the relevant call sites.

On 29 Jun 2023, at 15:53, Josh McKenzie  wrote:

I would prefer we not plan on two distinct changes to this

I agree with this sentiment, and

+1, if you have time for this approach and no other in this window.

People are going to use 5.0 for a while. Better to have an improvement in
their hands for that duration than no improvement at all IMO. Justifies the
cost of the double implementation and transitions to me.

On Tue, Jun 27, 2023, at 5:43 AM, Mick Semb Wever wrote:

Just for completeness, the change is a handful of LOC. The rest is added
tests, and we'd lose the sstable format change opportunity window.

+1, if you have time for this approach and no other in this window.

(If you have time for the other, or someone else does, then the technically
superior approach should win)

Re: Improved DeletionTime serialization to reduce disk size

2023-07-03 Thread Berenguer Blasi
Thanks for the comments Benedict. Given DeletionTime.localDeletionTime 
is what caps everything to year 2106 (uint encoded now) I am ok with a 
DeletionTime.markForDeleteAt that can go up to year 4284, personal 
opinion ofc.
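
For reference, a rough back-of-the-envelope sketch of the two ranges being
compared (the 56-bit figure below is only an illustrative assumption, not the
serializer's actual bit layout):

    import java.time.Instant;

    // Rough arithmetic only; not Cassandra code.
    public class DeletionTimeRangeSketch
    {
        public static void main(String[] args)
        {
            // localDeletionTime as an unsigned 32-bit count of seconds since
            // the Unix epoch tops out early in 2106.
            long maxUint32Seconds = 0xFFFFFFFFL;                         // 4,294,967,295 s
            System.out.println(Instant.ofEpochSecond(maxUint32Seconds)); // 2106-02-07T06:28:15Z

            // markForDeleteAt is a microsecond timestamp; if an encoding left
            // it, say, 56 usable bits (an assumption for illustration only),
            // the cap lands a couple of millennia out, in the same ballpark
            // as year 4284.
            long illustrativeMaxMicros = 1L << 56;
            System.out.println(Instant.ofEpochSecond(illustrativeMaxMicros / 1_000_000L));
        }
    }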


And yes I hope once I read, doc and understand the sstable format better 
I can look into your suggestion and anything else I come across.


On 3/7/23 9:46, Benedict wrote:
I checked and I’m pretty sure we do, but it doesn’t apply any liveness 
optimisation. I had misunderstood the optimisation you proposed. 
Ideally we would encode any non-live timestamp with the delta offset, 
but since that’s a distinct optimisation perhaps that can be left to 
another patch.


Are we happy, though, that the two different deletion time serialisers 
can store different ranges of timestamp? Both are large ranges, but I 
am not 100% comfortable with them diverging.


On 3 Jul 2023, at 05:45, Berenguer Blasi  
wrote:




I can look into it. I don't have a deep knowledge of the sstable 
format, hence why I wanted to document it someday. But DeletionTime is 
being serialized in other places as well iirc, and I doubt (finger in 
the air) we'll have that Epoch handy.


On 29/6/23 17:22, Benedict wrote:
So I’m just taking a quick peek at SerializationHeader and we 
already have a method for reading and writing a deletion time with 
offsets from EncodingStats.


So perhaps we simply have a bug where we are using DeletionTime 
Serializer instead of SerializationHeader.writeLocalDeletionTime? It 
looks to me like this is already available at most (perhaps all) of 
the relevant call sites.
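
To make the offset idea concrete, here is a minimal sketch of delta-encoding a
deletion time against per-sstable minimums such as those EncodingStats tracks;
the class, fields and vint codec below are illustrative stand-ins, not
Cassandra's actual serializer:

    import java.io.DataOutput;
    import java.io.IOException;

    // Illustrative only: delta-encode deletion times against per-sstable
    // minimums so that a variable-length encoding stores most values in a
    // few bytes.
    final class OffsetDeletionTimeSketch
    {
        private final long minTimestampMicros;       // hypothetical stand-ins for
        private final long minLocalDeletionSeconds;  // the minimums EncodingStats tracks

        OffsetDeletionTimeSketch(long minTimestampMicros, long minLocalDeletionSeconds)
        {
            this.minTimestampMicros = minTimestampMicros;
            this.minLocalDeletionSeconds = minLocalDeletionSeconds;
        }

        void writeDeletionTime(DataOutput out, long markedForDeleteAtMicros,
                               long localDeletionSeconds) throws IOException
        {
            // Writing deltas rather than absolute values is what shrinks the
            // on-disk size.
            writeUnsignedVInt(out, markedForDeleteAtMicros - minTimestampMicros);
            writeUnsignedVInt(out, localDeletionSeconds - minLocalDeletionSeconds);
        }

        // Simple LEB128-style varint, standing in for Cassandra's vint codec.
        private static void writeUnsignedVInt(DataOutput out, long value) throws IOException
        {
            while ((value & ~0x7FL) != 0)
            {
                out.writeByte((int) ((value & 0x7F) | 0x80));
                value >>>= 7;
            }
            out.writeByte((int) value);
        }
    }

Because most deletion times in a given sstable cluster near those minimums,
the deltas stay small and the varint usually spends only a byte or two on them.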




On 29 Jun 2023, at 15:53, Josh McKenzie  wrote:



I would prefer we not plan on two distinct changes to this

I agree with this sentiment, and


+1, if you have time for this approach and no other in this window.
People are going to use 5.0 for a while. Better to have an 
improvement in their hands for that duration than no improvement at 
all IMO. Justifies the cost of the double implementation and 
transitions to me.


On Tue, Jun 27, 2023, at 5:43 AM, Mick Semb Wever wrote:


Just for completeness, the change is a handful of LOC. The rest is
added tests, and we'd lose the sstable format change
opportunity window.



+1, if you have time for this approach and no other in this window.

(If you have time for the other, or someone else does, then the 
technically superior approach should win)





Re: [VOTE] CEP 33 - CIDR filtering authorizer

2023-07-03 Thread Shailaja Koppu
Voting passes with 11 +1s and no -1. Closing this thread now. 

Thank you everyone.


> On Jun 27, 2023, at 6:17 PM, Shailaja Koppu  wrote:
> 
> Hi Team,
> 
> (Starting a new thread for VOTE instead of reusing the DISCUSS thread, to 
> follow usual procedure).
> 
> Please vote on CEP 33 - CIDR filtering authorizer 
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-33%3A+CIDR+filtering+authorizer
> 
> Thanks,
> Shailaja



Re: [DISCUSS] Maintain backwards compatibility after dependency upgrade in the 5.0

2023-07-03 Thread Maxim Muzafarov
I'd like to mention the approach we took here: to untangle the driver
update in tests from the dropwizard library version (cassandra-driver
3.11 requires the "old" JMXReporter classes on the classpath), we have
copied those classes into the tests themselves, as the Apache License
2.0 allows. This way we can update the metrics library itself
and then update the driver used in the tests afterwards.
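
For anyone hitting this from the tooling side, the incompatibility boils down
to a one-line import difference between Metrics 3.x and 4.x; a minimal sketch
(the class and metric names here are just examples):

    import com.codahale.metrics.MetricRegistry;
    // 4.x location; in 3.x the same class lives at com.codahale.metrics.JmxReporter,
    // so code compiled against one package fails with NoClassDefFoundError when
    // only the other is on the classpath.
    import com.codahale.metrics.jmx.JmxReporter;

    public class JmxReporterExample
    {
        public static void main(String[] args)
        {
            MetricRegistry registry = new MetricRegistry();
            registry.counter("example.requests").inc();

            // The builder API itself is unchanged between 3.x and 4.x; only
            // the package of JmxReporter differs.
            JmxReporter reporter = JmxReporter.forRegistry(registry).build();
            reporter.start();
        }
    }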

If there are no objections, we need another committer to take a look
at these changes:
https://issues.apache.org/jira/browse/CASSANDRA-14667
https://github.com/apache/cassandra/pull/2238/files

Thanks in advance for your help!

On Wed, 28 Jun 2023 at 16:04, Bowen Song via dev
 wrote:
>
> IMHO, anyone upgrading software between major versions should expect to
> see breaking changes. Introducing breaking or major changes is the whole
> point of bumping major version numbers.
>
> Since the library upgrade needs to happen sooner or later, I don't see
> any reason why it should not happen in the 5.0 release.
>
>
> On 27/06/2023 19:21, Maxim Muzafarov wrote:
> > Hello everyone,
> >
> >
> > We use the Dropwizard Metrics 3.1.5 library, which provides a basic
> > set of classes to easily expose Cassandra internals to a user through
> > various interfaces (the most common being JMX). We want to upgrade
> > this library version in the next major release 5.0 up to the latest
> > stable 4.2.19 for the following reasons:
> > - the 3.x (and 4.0.x) Dropwizard Metrics library is no longer
> > supported, which means that if we face a critical CVE, we'll still
> > need to upgrade, so it's better to do it sooner and more calmly;
> > - as of 4.2.5 the library supports jdk11, jdk17, so we will be in-sync
> > [1] as well as having some of the compatibility fixes mentioned in the
> > related JIRA [2];
> > - there have been a few user-related requests [3][4] whose
> > applications collide with the old version of the library, we want to
> > help them;
> >
> >
> > The problem
> >
> > The problem with simply upgrading is that the JmxReporter class of the
> > library has moved from the com.codahale.metrics package in the 3.x
> > release to the com.codahale.metrics.jmx package in the 4.x release.
> > This is a problem for applications/tools that rely on the cassandra
> > classpath (lib/jars) as after the upgrade they may be looking for the
> > JmxReporter class which has changed its location.
> >
> > A good example of the problem that we (or a user) may face after the
> > upgrade is our tests and the cassandra-driver-core 3.1.1, which uses
> > the old 3.x version of the library in tests. Of course, in this case,
> > we can upgrade the cassandra driver up to 4.x [5][6] to fix the
> > problem, as the new driver uses a newer version of the library, but
> > that's another story I won't go into for now. I'm talking more about
> > visualising the problem a user might face after upgrading to 5.0 if
> > he/she relies on the cassandra classpath, but on the other hand, they
> > might not face this problem at all because, as I understand, they will
> > provide this library in their applications by themselves.
> >
> >
> > So, since Cassandra has a huge ecosystem and a variety of tools that I
> > can't even imagine, the main question here is:
> >
> > Can we move forward with this change without breaking backwards
> > compatibility with any kind of tools that we have considering the
> > example above as the main case? Do you have any thoughts on this?
> >
> > The changes are here:
> > https://github.com/apache/cassandra/pull/2238/files
> >
> >
> >
> > [1] 
> > https://github.com/dropwizard/metrics/pull/2180/files#diff-5dbf1a803ecc13ff945a08ed3eb09149a83615e83f15320550af8e3a91976446R14
> > [2] https://issues.apache.org/jira/browse/CASSANDRA-14667
> > [3] https://github.com/dropwizard/metrics/issues/1581#issuecomment-628430870
> > [4] https://issues.apache.org/jira/browse/STORM-3204
> > [5] https://issues.apache.org/jira/browse/CASSANDRA-15750
> > [6] https://issues.apache.org/jira/browse/CASSANDRA-17231


Re: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-03 Thread Maxim Muzafarov
For me, the biggest benefit of keeping the build scripts and CI
configurations in the same project is that these files are versioned in
the same way as the main sources. This ensures that we can build past
releases without any annoying errors in the scripts, so I would say
this is a pretty necessary change.

I'd like to mention an approach that could work for projects with a
huge number of tests. Instead of running all the tests through the
available CI agents every time, we can have presets of tests:
- base tests (to make sure that your design basically works, the set
will not run longer than 30 min);
- pre-commit tests (the number of tests to make sure that we can
safely commit new changes and fit the run into the 1-2 hour build
timeframe);
- nightly builds (scheduled task to build everything we have once a
day and notify the ML if that build fails);
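
As a rough illustration only, such presets could be expressed as JUnit 4
categories that the build then includes or excludes per preset; the marker
interfaces and test class below are hypothetical, not existing Cassandra code:

    import org.junit.Test;
    import org.junit.experimental.categories.Category;

    // Hypothetical marker interfaces for the three presets described above.
    interface BaseTests {}       // quick design-level checks, aiming for < 30 min total
    interface PreCommitTests {}  // the set that fits a 1-2 hour pre-commit build
    interface NightlyTests {}    // everything, built and run once a day

    // Hypothetical test class showing how tests would opt into presets; the
    // build could then include or exclude categories per preset.
    public class ExamplePresetTaggingTest
    {
        @Test
        @Category(BaseTests.class)
        public void basicBehaviourWorks()
        {
            // fast sanity check
        }

        @Test
        @Category({ PreCommitTests.class, NightlyTests.class })
        public void heavierEndToEndScenario()
        {
            // longer-running coverage
        }
    }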


My question here is:
Should we mention in this concept how we will build the sub-projects
(e.g. Accord) alongside Cassandra?

On Fri, 30 Jun 2023 at 23:19, Josh McKenzie  wrote:
>
> Not everyone will have access to such resources, if all you have is 1 such 
> pod you'll be waiting a long time (in theory one month, and you actually need 
> a few bigger pods for some of the more extensive tests, e.g. large upgrade 
> tests)….
>
> One thing worth calling out: I believe we have a lot of low hanging fruit in 
> the domain of "find long running tests and speed them up". Early 2022 I was 
> poking around at our unit tests on CASSANDRA-17371 and found that 2.62% of 
> our tests made up 20.4% of our runtime 
> (https://docs.google.com/spreadsheets/d/1-tkH-hWBlEVInzMjLmJz4wABV6_mGs-2-NNM2XoVTcA/edit#gid=1501761592).
>  This kind of finding is pretty consistent; I remember Carl Yeksigian at NGCC 
> back in like 2015 axing an hour plus of aggregate runtime by just devoting an 
> afternoon to looking at a few badly behaving tests.
>
> I'd like to see us move from "1 pod 1 month" down to something a lot more 
> manageable. :)
>
> Shout-out to Berenger's work on CASSANDRA-16951 for dtest cluster reuse (not 
> yet merged), and I have CASSANDRA-15196 to remove the CDC vs. non segment 
> allocator distinction and axe the test-cdc target entirely.
>
> Ok. Enough of that. Don't want to derail us, just wanted to call out that the 
> state of things today isn't the way it has to be.
>
> On Fri, Jun 30, 2023, at 4:41 PM, Mick Semb Wever wrote:
>
> - There are hw constraints, is there any approximation on how long it will 
> take to run all tests? Or is there a stated goal that we will strive to reach 
> as a project?
>
> Have to defer to Mick on this; I don't think the changes outlined here will 
> materially change the runtime on our currently donated nodes in CI.
>
>
>
> A recent comparison between CircleCI and the jenkins code underneath 
> ci-cassandra.a.o was done (not yet shared) to see whether a 'repeatable CI' can 
> be both lower cost and have the same turnaround time.  The exercise uncovered 
> that there's a lot of waste in our jenkins builds, and once the jenkinsfile 
> becomes standalone it can stash and unstash the build results.  From this, a 
> conservative estimate was that even if we only brought the build time down to 
> double that of circleci it would still be significantly lower cost while still 
> using on-demand ec2 instances. (The goal is to use spot instances.)
>
> The real problem here is that our CI pipeline uses ~1000 containers. 
> ci-cassandra.a.o only has 100 executors (and a few of these at any time are 
> often down for disk self-cleaning).   The idea with 'repeatable CI', and to a 
> broader extent Josh's opening email, is that no one will need to use 
> ci-cassandra.a.o for pre-commit work anymore.  For post-commit we don't care 
> if it takes 7 hours (we care about stability of results, which 'repeatable 
> CI' also helps us with).
>
> While pre-commit testing will be more accessible to everyone, it will still 
> depend on the resources you have access to.  For the fastest turn-around 
> times you will need a k8s cluster that can spawn 1000 pods (4cpu, 8GB ram) 
> which will run for up to 1-30 minutes, or the equivalent.  Not everyone will 
> have access to such resources, if all you have is 1 such pod you'll be 
> waiting a long time (in theory one month, and you actually need a few bigger 
> pods for some of the more extensive tests, e.g. large upgrade tests)….
>
>


Re: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-03 Thread Josh McKenzie
> Instead of running all the tests through available CI agents every time we 
> can have presets of tests:
Back when I joined the project in 2014, unit tests took ~ 5 minutes to run on a 
local machine. We had pre-commit and post-commit tests as a distinction as 
well, but also had flakes in the pre and post batch. I'd love to see us get 
back to a unit test regime like that.

The challenge we've always had is flaky tests showing up in either the 
pre-commit or post-commit groups, and the difficulty of attributing a flaky 
failure to where it was introduced (not to lay blame but to educate, learn, 
and prevent recurrence). While historically further-reduced smoke testing 
suites would just mean more flakes showing up downstream, the rule of 
multiplexing new or changed tests might go a long way toward mitigating that.

> Should we mention in this concept how we will build the sub-projects (e.g. 
> Accord) alongside Cassandra?
I think it's an interesting question, but I also think there's no real 
dependency of process between primary mainline branches and feature branches. 
My intuition is that having the same bar (green CI, multiplex, don't introduce 
flakes, smart smoke suite tiering) would be a good idea on feature branches so 
there's not a death march right before merge, squashing flakes when you have to 
multiplex hundreds of tests before merge to mainline (since presumably a 
feature branch would impact a lot of tests).

Now that I write that all out it does sound Painful. =/

On Mon, Jul 3, 2023, at 10:38 AM, Maxim Muzafarov wrote:
> For me, the biggest benefit of keeping the build scripts and CI
> configurations in the same project is that these files are versioned
> in the same way as the main sources. This ensures that we can build
> past releases without any annoying errors in the scripts, so I would
> say this is a pretty necessary change.
> 
> I'd like to mention the approach that could work for the projects with
> a huge amount of tests. Instead of running all the tests through
> available CI agents every time we can have presets of tests:
> - base tests (to make sure that your design basically works, the set
> will not run longer than 30 min);
> - pre-commit tests (the number of tests to make sure that we can
> safely commit new changes and fit the run into the 1-2 hour build
> timeframe);
> - nightly builds (scheduled task to build everything we have once a
> day and notify the ML if that build fails);
> 
> 
> My question here is:
> Should we mention in this concept how we will build the sub-projects
> (e.g. Accord) alongside Cassandra?
> 
> On Fri, 30 Jun 2023 at 23:19, Josh McKenzie  wrote:
> >
> > Not everyone will have access to such resources, if all you have is 1 such 
> > pod you'll be waiting a long time (in theory one month, and you actually 
> > need a few bigger pods for some of the more extensive tests, e.g. large 
> > upgrade tests)….
> >
> > One thing worth calling out: I believe we have a lot of low hanging fruit 
> > in the domain of "find long running tests and speed them up". Early 2022 I 
> > was poking around at our unit tests on CASSANDRA-17371 and found that 2.62% 
> > of our tests made up 20.4% of our runtime 
> > (https://docs.google.com/spreadsheets/d/1-tkH-hWBlEVInzMjLmJz4wABV6_mGs-2-NNM2XoVTcA/edit#gid=1501761592).
> >  This kind of finding is pretty consistent; I remember Carl Yeksigian at 
> > NGCC back in like 2015 axing an hour plus of aggregate runtime by just 
> > devoting an afternoon to looking at a few badly behaving tests.
> >
> > I'd like to see us move from "1 pod 1 month" down to something a lot more 
> > manageable. :)
> >
> > Shout-out to Berenger's work on CASSANDRA-16951 for dtest cluster reuse 
> > (not yet merged), and I have CASSANDRA-15196 to remove the CDC vs. non 
> > segment allocator distinction and axe the test-cdc target entirely.
> >
> > Ok. Enough of that. Don't want to derail us, just wanted to call out that 
> > the state of things today isn't the way it has to be.
> >
> > On Fri, Jun 30, 2023, at 4:41 PM, Mick Semb Wever wrote:
> >
> > - There are hw constraints, is there any approximation on how long it will 
> > take to run all tests? Or is there a stated goal that we will strive to 
> > reach as a project?
> >
> > Have to defer to Mick on this; I don't think the changes outlined here will 
> > materially change the runtime on our currently donated nodes in CI.
> >
> >
> >
> > A recent comparison between CircleCI and the jenkins code underneath 
> > ci-cassandra.a.o was done (not yet shared) to see whether a 'repeatable CI' can 
> > be both lower cost and have the same turnaround time.  The exercise uncovered 
> > that there's a lot of waste in our jenkins builds, and once the jenkinsfile 
> > becomes standalone it can stash and unstash the build results.  From this, a 
> > conservative estimate was that even if we only brought the build time down to 
> > double that of circleci it would still be significantly lower cost while