So revised proposal:
On Release Lifecycle cwiki page:
- Ensure we have parity on jobs run between circle and asf-ci
- Allow usage of circleci as gatekeeper for releases. 1 green run -> beta, 3
green runs consecutive -> ga
- No new consistent regressions on CI for asf compared to prior branches
- Explicitly do not consider ci-cassandra asf flaky tests as release blockers
Changes to codify into documentation:
- On patch before commit, multiplex @500 all new tests, changed tests, or
expected to be impacted tests ("expected to be impacted" piece pending
multi-class multiplexing support):
- Add support for multi-class specification in multiplexer and document
Add informal project commitment during next major release lifecycle to continue
working on bringing asf ci-cassandra up to where it can be formal gatekeeper
for release.
On Wed, Sep 28, 2022, at 10:13 AM, Ekaterina Dimitrova wrote:
> If we talk blockers nothing more than ensuring we see all tests we want
> pre-release, IMHO.
> The other points sound to me like future important improvements that will
> help us significantly in the flaky test fight.
>
> On Wed, 28 Sep 2022 at 10:08, Josh McKenzie <[email protected]> wrote:
>> __
>> I'm receptive to that but I wouldn't gate our ability to get 4.1 out the
>> door based on circle on that. Honestly probably only need to have the parity
>> of coverage be the blocker for its use in retrospect.
>>
>> On Wed, Sep 28, 2022, at 1:32 AM, Berenguer Blasi wrote:
>>> I would add an option for generate.sh to detect all changed *Test.java
>>> files, that would be handy imo.
>>>
>>> On 28/9/22 4:29, Josh McKenzie wrote:
>>>> So:
>>>> 1. 500 iterations on multiplexer
>>>> 2. Augmenting generate.sh to allow providing multiple class names and
>>>> generating a single config that'll multiplex all the tests provided
>>>> 3. Test parity / pre-release config added on circleci (see
>>>> https://issues.apache.org/jira/browse/CASSANDRA-17930), specifically
>>>> dtest-large, dtest-offheap, test-large-novnode
>>>> If we get the above 3, are we at a place where we're good to consider
>>>> vetting releases on circleci for beta / rc / ga?
>>>>
>>>> On Tue, Sep 27, 2022, at 11:28 AM, Ekaterina Dimitrova wrote:
>>>>>> “I have plans on modifying the multiplexer to allow specifying a list of
>>>>>> classes per test target, so we don't have to needlessly suffer with this”
>>>>>>
>>>>>>
>>>>>> That would be great, I was thinking of that the other day too. With that
>>>>>> said I’ll be happy to support you in that effort too :-)
>>>>>
>>>>> On Tue, 27 Sep 2022 at 11:18, Josh McKenzie <[email protected]> wrote:
>>>>>>
>>>>>>> I have plans on modifying the multiplexer to allow specifying a list of
>>>>>>> classes per test target, so we don't have to needlessly suffer with this
>>>>>> This sounds integral to us multiplexing tests on large diffs whether we
>>>>>> go with circle for releases or not and would be a great addition!
>>>>>>
>>>>>> On Tue, Sep 27, 2022, at 6:19 AM, Andrés de la Peña wrote:
>>>>>>>> 250 iterations isn't enough; I use 500 as a low water mark.
>>>>>>>
>>>>>>> I agree that 500 iterations would be a reasonable minimum. We have seen
>>>>>>> flaky unit tests requiring far more iterations, but that's not very
>>>>>>> common. We could use to 500 iterations as default, and discretionary
>>>>>>> use a higher limit in tests that are quick and might be prone to
>>>>>>> concurrency issues. I can change the defaults on CirceCI config file if
>>>>>>> we agree to a new limit, the current default of 100 iterations is quite
>>>>>>> arbitrary.
>>>>>>>
>>>>>>> The test multiplexer allows to either run test individual test methods
>>>>>>> or entire classes. It is quite frequent to see tests methods that pass
>>>>>>> individually but fail when they are run together with the other tests
>>>>>>> in the same class. Because of this, I think that we should always run
>>>>>>> entire classes when repeating new or modified tests. The only exception
>>>>>>> to this would be Python dtests, which usually are more resource
>>>>>>> intensive and not so prone to that type of issues.
>>>>>>>
>>>>>>>> For CI on a patch, run the pre-commit suite and also run multiplexer
>>>>>>>> with 250 runs on new, changed, or related tests to ensure not flaky
>>>>>>>
>>>>>>> The multiplexer only allows to run a single test class per push. This
>>>>>>> is ok for fixing existing flakies (its original purpose), and for most
>>>>>>> minor changes, but it can be quite inconvenient for testing large
>>>>>>> patches that add or modify many tests. For example, the patch for
>>>>>>> CEP-19 directly modifies 31 test classes, which means 31 CircleCI
>>>>>>> config pushes. This number can be somewhat reduced with some wildcards
>>>>>>> on the class names, but the process is still quite inconvenient. I
>>>>>>> guess that other large patches will find the same problem. I have plans
>>>>>>> on modifying the multiplexer to allow specifying a list of classes per
>>>>>>> test target, so we don't have to needlessly suffer with this.
>>>>>>>
>>>>>>> On Mon, 26 Sept 2022 at 22:44, Brandon Williams <[email protected]>
>>>>>>> wrote:
>>>>>>>> On Mon, Sep 26, 2022 at 1:31 PM Josh McKenzie <[email protected]>
>>>>>>>> wrote:
>>>>>>>> >
>>>>>>>> > 250 iterations isn't enough; I use 500 as a low water mark.
>>>>>>>> >
>>>>>>>> > Say more here. I originally had it at 500 but neither Mick nor I
>>>>>>>> > knew why and figured we could suss this out on this thread.
>>>>>>>>
>>>>>>>> I've seen flakies that passed with less later exhibit at that point.
>>>>>>>>
>>>>>>>> > This is also assuming that circle and ASF CI run the same tests,
>>>>>>>> > which
>>>>>>>> > is not entirely true.
>>>>>>>> >
>>>>>>>> > +1: we need to fix this. My intuition is the path to getting
>>>>>>>> > circle-ci in parity on coverage is a shorter path than getting ASF
>>>>>>>> > CI to 3 green runs for GA. That consistent w/your perception as well
>>>>>>>> > or do you disagree?
>>>>>>>>
>>>>>>>> I agree that bringing parity to the coverage will be the shorter path.
>>>>>>
>>>>
>>