Please don't forget that we are NOT supposed to ignore failures in
FlakyTest. That was never the intention of FlakyTest. The intention was
this:

1) run the test under a testing target that ALWAYS forks (or forks every 30
distributed test cases) -- this was enough to stabilize many of these
flickering tests, presumably because they use short sleeps or waits and
sharing the JVM with other tests was causing GC pauses that caused the test
to fail

2) move ALL flickering tests to one target so that we don't waste time
analyzing flickering tests in all of the gradle targets

3) moving a flickering test to FlakyTest is meant to be TEMPORARY -- think
of it as better than @Ignore or Assume because it's still running and
should NOT be ignored but it gives you a bit of breathing room to fix it

Using @Ignore or non-OS-Assume is much worse than using FlakyTest because
the test continues to run under FlakyTest and should never be ignored by a
developer running precheckin.

Everyone, please don't ignore FlakyTest failures! If a flickering test
moves from IntegrationTest to FlakyTest and then continues to flicker, then
fixing that test needs to become a high priority. If the test stops
flickering (because it's running in a clean JVM) then fixing it is low
priority but still needs to be planned for.

We should have a strict policy that if a flickering test continues
flickering for a week or more then it should be moved to FlakyTest and
scheduled to be fixed. Also, if a flickering test moves to FlakyTest and
continues to fail a couple times a week then fixing it should become a high
priority -- it's not excuse to just ignore that test and it never was.

I honestly don't know how so many people ended up with the wrong
understanding of what FlakyTest is all about. It's all about keeping the
non-FlakyTest targets 100% GREEN(!), fixing the flakiness ASAP and moving
them back to their original target after they are fixed. Everyone should
know that if a FlakyTest fails for them in precheckin, then you cannot
ignore that -- you must investigate to find out what the cause is -- it may
be that the developer has an uncommitted change that introduces a bug which
causes a previously flickering test to start failing consistently. Anyone
who ignores this is not doing their job correctly.

Thanks,
Kirk

On Thu, Sep 14, 2017 at 10:30 AM, Patrick Rhomberg <prhomb...@pivotal.io>
wrote:

> These problems stem from a dirty testing environment occupying the default
> port, which this test wants to use.  I have a pull request open to ignore
> the test when the default port is not available.
>
> On the one hand, an ignored test is essentially dead code.  If the test is
> consistently skipped and not only occasionally skipped, it's not informing
> us of anything.  But on the other hand, a flaky test ends up being more or
> less ignored and its errors end up being discarded as general flakiness.
>
> I'd favor the "assume X or skip" route over marking a test flaky, but that
> will require some diligence on our part to make sure these tests do end up
> eventually running.
>
> Pull request #771: https://github.com/apache/geode/pull/771
>
> On Thu, Sep 14, 2017 at 8:51 AM, Kirk Lund <kl...@apache.org> wrote:
>
> > These tests have been failing intermittently in the integrationTest
> target
> > for over a week. I'm going to add the category FlakyTest to these tests.
> > See GEODE-3426.
> >
> > org.apache.geode.rest.internal.web.RestServersJUnitTest > testGet FAILED
> >     org.junit.ComparisonFailure: expected:<[200]> but was:<[404]>
> >         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> > Method)
> >         at
> > sun.reflect.NativeConstructorAccessorImpl.newInstance(
> > NativeConstructorAccessorImpl.java:62)
> >         at
> > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
> > DelegatingConstructorAccessorImpl.java:45)
> >         at
> > org.apache.geode.rest.internal.web.RestServersJUnitTest.testGet(
> > RestServersJUnitTest.java:49)
> >
> > org.apache.geode.rest.internal.web.RestServersJUnitTest >
> > testServerStartedOnDefaultPort FAILED
> >     org.json.JSONException: Value <html> of type java.lang.String cannot
> be
> > converted to JSONArray
> >         at org.json.JSON.typeMismatch(JSON.java:108)
> >         at org.json.JSONArray.<init>(JSONArray.java:85)
> >         at
> > org.apache.geode.rest.internal.web.GeodeRestClient.
> > getJsonArray(GeodeRestClient.java:99)
> >         at
> > org.apache.geode.rest.internal.web.RestServersJUnitTest.
> > testServerStartedOnDefaultPort(RestServersJUnitTest.java:55)
> >
>

Reply via email to