Re: [DISCUSS] What should we do with @Ignore tests?

Mark Hanson Fri, 03 Jan 2020 11:34:53 -0800

So, I just boil it down to the pragmatic situation where you are in a file that 
has some commented tests with “TODO” and in the process of cleaning up the file 
you are confronted with what to do about them.


I think the realistic answer is that no one is going to pay down technical 
debt, but us. So just figure out what the story is on the tests and do the 
right thing, with a bias towards getting them working.

Thanks,
Mark

> On Jan 2, 2020, at 1:17 PM, John Blum <jb...@pivotal.io> wrote:
> 
> +1 to Kirk's comments.
> 
> Also, regarding (c), using AssumeThat [1] (or, alternatively & IMO
> preferrably, [2]) might provide some temporary relief.
> 
> 
> [1] https://junit.org/junit4/javadoc/latest/org/junit/Assume.html
> [2]
> https://joel-costigliola.github.io/assertj/core-8/api/org/assertj/core/api/Assumptions.html
> 
> 
> On Thu, Jan 2, 2020 at 11:46 AM Kirk Lund <kl...@apache.org> wrote:
> 
>> The Geode code base has 328 tests across 145 files with @Ignore. The vast
>> majority of these were disabled pre-Geode because the previous group's
>> policy was to disable any test that failed in CI, then file a bug system
>> ticket, fix it, and re-enable it. However, the process never followed
>> through beyond disabling the test and filing a ticket. Since that was the
>> old pre-Apache bug system, most of us don't have access to it.
>> 
>> From a Pivotal point-of-view, it's possible that every Geode commit that
>> passes precheckin but then breaks a Pivotal Hydra test could have been
>> caught by one of those disabled tests. So we are probably already
>> experiencing pain and paying the price for not re-enabling these tests (in
>> at least some cases).
>> 
>> @Ignore was introduced in JUnit 4, but most of these tests were JUnit 3
>> style. After I upgraded us to JUnit 4, I found every JUnit 3 style of test
>> that had been disabled by renaming the test, fixed the name of the test,
>> and added @Test plus @Ignore so that we can at least be aware of how many
>> tests are disabled and actively search for them. I tried to add include a
>> comment in the annotation if I could figure out why it was disabled. Some
>> of these comments are things like (a) TRAC ticket # if known, (b) not yet
>> implemented, (c) skip test for this configuration, (d) not yet implemented
>> for partitioned region, etc. Sometimes I just reused a java comment that
>> the disabler left above the disabled test.
>> 
>> More info on some of those @Ignore comments:
>> 
>> (b) not yet implemented -- this means probably means the test was never
>> finished and should just be deleted but it might also be another example of
>> (c) or (d) in certain test classes
>> 
>> (c) skip test for this configuration -- this is a class that extends
>> another test class but one or more test methods are overridden with @Ignore
>> to prevent the test from running in the subclass because it's invalid for
>> that configuration -- we can't just delete these, we have to remove
>> class-hierarchy (stop extending test classes!) which will result in some
>> redundancy between two test classes
>> 
>> I do not think we should blindly delete them all, and I think we need to
>> spend time studying each one individually to determine what the state of
>> the test is and what should be done for it. So even if many in our
>> community want to just delete all tests with @Ignore, that's not going to
>> work in at least half of the test classes. You'll have to tease apart
>> sub-classes from super-classes.
>> 
>> Plus you won't actually know that the test has other coverage unless you
>> spend the time analyzing it, which is what I think we should do.
>> 
>> Some quick examples:
>> 
>> *1) ClientServerTimeSyncDUnitTest*
>> 
>> Notes: this test class has only two tests and both are @Ignored, so maybe
>> deleting them isn't the best answer.
>> 
>> testClientTimeAdvances is marked with @Ignore("Bug 52327"). #52327 is
>> "ClientServerTimeSyncDUnitTest.testClientTimeAdvances fails intermittently"
>> 
>> testClientTimeSlowsDown is marked with @Ignore("not yet implemented") which
>> either means the test was never finished or the functionality it's trying
>> to test was never finished in the product code. It looks like the test
>> never had any assertions.
>> 
>> Personally, I don't think I'd want to delete either of these tests. I would
>> prefer to fix them up.
>> 
>> If we really care about Apache Geode, our test coverage, and the quality of
>> the product, then I think we should separate super-sub classes that force
>> us to disable test in sub-class that are invalid test cases and fix analyze
>> each test on a case-by-case basis to determine what should be done. And I
>> believe that the default approach for each test should be to fix it rather
>> than delete it.
>> 
>> *2) DistributedAckPersistentRegionCCEDUnitTest*
>> 
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/DistributedAckPersistentRegionCCEDUnitTest.java:41:
>> @Ignore("Skip test for this configuration")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/DistributedAckPersistentRegionCCEDUnitTest.java:46:
>> @Ignore("Skip test for this configuration")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/DistributedAckPersistentRegionCCEDUnitTest.java:51:
>> @Ignore("Skip test for this configuration")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/DistributedAckRegionCCEDUnitTest.java:110:
>> @Ignore("replicates don't allow local destroy")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/DistributedAckRegionCCEDUnitTest.java:117:
>> @Ignore("replicates don't allow local destroy")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/DistributedAckRegionCCEDUnitTest.java:294:
>> @Ignore("Enable after fix for bug GEODE-1891 was #45704")
>> 
>> Here you see 3 test methods with "Skip test for this configuration". If you
>> delete these from the sub-class, it'll run these three tests and probably
>> blow up because it's an invalid configuration. To fix this we'll have to
>> NOT extend the super-classes and start inlining super-classes and deleting
>> what's not needed.
>> 
>> Next are 2 test methods with "replicates don't allow local destroy" --
>> sounds like the same thing as "Skip test for this configuration" to me.
>> 
>> Last one is "Enable after fix for bug GEODE-1891 was #45704". This
>> represents product functionality that doesn't work how we want it to AND we
>> have a test that demonstrates it. Seems a shame to delete this test. If a
>> developer ever tries to fix GEODE-1891, they'll want this test. I suppose
>> you could copy-paste the test into the comments or attachments of
>> GEODE-1891 and delete it from code.
>> 
>> *3) GlobalRegionCCEDUnitTest*
>> 
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/GlobalRegionCCEDUnitTest.java:101:
>> @Ignore("replicates don't allow local destroy")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/GlobalRegionCCEDUnitTest.java:108:
>> @Ignore("replicates don't allow local destroy")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/GlobalRegionCCEDUnitTest.java:137:
>> @Ignore("TODO: takes too long with global regions and cause false dunit
>> hangs")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/GlobalRegionCCEDUnitTest.java:143:
>> @Ignore("TODO: takes too long with global regions and cause false dunit
>> hangs")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/GlobalRegionCCEDUnitTest.java:149:
>> @Ignore("TODO: takes too long with global regions and cause false dunit
>> hangs")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/GlobalRegionCCEDUnitTest.java:221:
>> @Ignore("Disabling due to bug #52347")
>> 
>> geode-core/src/distributedTest/java/org/apache/geode/cache30/GlobalRegionCCEDUnitTest.java:231:
>> @Ignore("TODO: reenable this test")
>> 
>> Again we have sub-class overriding and ignoring tests that are invalid with
>> "replicates don't allow local destroy". Next we have "TODO: takes too long
>> with global regions". Apparently someone a long time ago decided these 3
>> tests take too long and disabled them. Sure seems like we should re-enable
>> them to me! If they take too long, then look for sleeps. If GLOBAL takes
>> too long but any users still use it too bad. If GLOBAL takes too long but
>> NO users still use, then we deprecate GLOBAL, release Geode 2.0 and delete
>> it!
>> 
>> #52347 is "testConcurrentEventsOnEmptyRegion fails intermittently in CI".
>> We could look into who filed #52347 and assign fixing the flakiness in this
>> test method to that developer.
>> 
>> "TODO: reenable this test" -- I guess we should reenable this test and see
>> how flaky it is?! After all that's what it says to do.
>> 
>> I quick glance over all of the results of $ git grep "@Ignore" shows that
>> the above is pretty much the sort of things we're looking at.
>> 
>> I'm 100% against deleting them. If everyone else wants to delete them, then
>> I propose reassigning me to do nothing but fix ALL @Ignored tests and I'll
>> jump on it.
>> 
>> And just to be very clear:
>> 
>> -1 to all proposals to delete all tests with @Ignore (I can't think of any
>> argument to change me to -0)
>> 
>> -Kirk
>> 
>> On Tue, Dec 31, 2019 at 2:08 PM Alexander Murmann <amurm...@apache.org>
>> wrote:
>> 
>>> Here are a few things that are true for me or I believe are true in
>>> general:
>>> 
>>>   - Our test suite is more flaky than we'd like it to be
>>>   - I don't believe that adding more Unit tests that follow existing
>>>   patterns buys us that much. I'd rather see something similar to what
>>> some
>>>   folks are doing with Membership right now where we isolate the code
>> and
>>>   test it more systematically
>>>   - We have other testing gaps: We have benchmarks 👏🎉, but we are
>> still
>>>   lacking coverage in that ares; our community is still lacking HA
>> tests.
>>> I'd
>>>   rather fill those than bring back old DUnit tests that are chosen
>>> somewhat
>>>   at random.
>>>   - I'd rather be deliberate about what tests we introduce than
>> wholesale
>>>   bring back a set of tests, since any of these re-introduced tests has
>> a
>>>   potential to be flaky. Let's make sure our tests carry their weight.
>>>   - If we delete these tests, we can always go back to a SHA from today
>>>   and bring them back at a later date
>>>   - These tests have been ignored since a very long time and we've
>> shipped
>>>   without them and nobody has missed them enough to bring them back.
>>> 
>>> Given all the above, my vote is for less noise in our code, which means
>>> deleting all ignored tests. If we want to keep them, I'd love to hear a
>>> plan of action on how we bring them back. Having a bunch of dead code
>> helps
>>> nobody.
>>> 
>>> On Tue, Dec 31, 2019 at 1:50 PM Mark Hanson <mhan...@pivotal.io> wrote:
>>> 
>>>> Hi All,
>>>> 
>>>> As part of what I am doing to fix flaky tests, I periodically come
>> across
>>>> tests that are @Ignore’d. I am curious what we would like to do with
>> them
>>>> generally speaking. We could fix them, which would seem obvious, but we
>>> are
>>>> struggling to fix flaky tests as it is.  We could delete them, but
>> those
>>>> tests were written for a reason (I hope).  Or we could leave them. This
>>>> pollutes searches etc as inactive code requiring upkeep at least.
>>>> 
>>>> I don’t have an easy answer. Some have suggested deleting them. I tend
>> to
>>>> lean that direction, but I thought I would consult the community for a
>>>> broader perspective.
>>>> 
>>>> Thanks,
>>>> Mark
>>> 
>> 
> 
> 
> -- 
> -John
> Spring Data Team

Re: [DISCUSS] What should we do with @Ignore tests?

Reply via email to