>
> - There are hw constraints, is there any approximation on how long it will
> take to run all tests? Or is there a stated goal that we will strive to
> reach as a project?
>
> Have to defer to Mick on this; I don't think the changes outlined here
> will materially change the runtime on our currently donated nodes in CI.
>


A recent comparison between CircleCI and the Jenkins code underneath
ci-cassandra.a.o was done (not yet shared) to see whether a 'repeatable CI'
can be both lower cost and have the same turnaround time.  The exercise
uncovered that there's a lot of waste in our Jenkins builds, and that once
the Jenkinsfile becomes standalone it can stash and unstash the build
results.  From this, a conservative estimate was that even if we only brought
the build time to double that of CircleCI, it would still be significantly
lower cost while still using on-demand EC2 instances. (The goal is to use
spot instances.)

The real problem here is that our CI pipeline uses ~1000 containers.
ci-cassandra.a.o only has 100 executors (and a few of these are often down
for disk self-cleaning).  The idea with 'repeatable CI', and to
a broader extent Josh's opening email, is that no one will need to use
ci-cassandra.a.o for pre-commit work anymore.  For post-commit we don't
care if it takes 7 hours (we care about stability of results, which
'repeatable CI' also helps us with).

While pre-commit testing will be more accessible to everyone, it will still
depend on the resources you have access to.  For the fastest
turnaround times you will need a k8s cluster that can spawn 1000 pods
(4 CPU, 8GB RAM), each running for anywhere from 1 to 30 minutes, or the
equivalent.  Not everyone will have access to such resources; if all you
have is one such pod you'll be waiting a long time (in theory one month, and
you actually need a few bigger pods for some of the more extensive tests,
e.g. large upgrade tests)….
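
For anyone wanting to plug in their own numbers, here is a rough
back-of-envelope sketch of that scaling.  It is only an upper bound under
assumptions that are mine, not measurements: every job is taken to run for
the full 30 minutes, jobs are scheduled in waves, and the bigger pods needed
for the large upgrade tests are ignored.  The function name and the
100-worker case are purely illustrative.

```python
import math


def worst_case_wall_clock_minutes(jobs: int, workers: int, max_job_minutes: float) -> float:
    """Upper bound on wall-clock time when jobs run in waves of `workers`.

    Assumes (pessimistically) that every wave takes `max_job_minutes`.
    """
    waves = math.ceil(jobs / workers)
    return waves * max_job_minutes


JOBS = 1000           # ~1000 containers/pods, as quoted in this thread
MAX_JOB_MINUTES = 30  # upper end of the 1 to 30 minute range

# 1000 workers: everything in one wave; 1 worker: fully serial.
# 100 workers is a hypothetical middle case, not a claim about any real cluster.
for workers in (1000, 100, 1):
    minutes = worst_case_wall_clock_minutes(JOBS, workers, MAX_JOB_MINUTES)
    hours = minutes / 60
    print(f"{workers:>4} workers: {hours:7.1f} h ({hours / 24:5.1f} days)")
```

With a single pod this pessimistic bound comes to about 500 hours, roughly
three weeks, which is the same order of magnitude as the "one month" figure
once the heavier tests are added on top.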
