I created this test to reproduce a race condition in ProcessGDBRemote. Since it exercises a race, it will not fail 100% of the time, but I agree with Tamas that we should keep it marked XFAIL to avoid noise on the buildbots.
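
For reference, the marking itself is just a decorator on the test method. A minimal sketch of what that looks like in the lldb test suite follows; the class and test names are placeholders, and the import path and decorator signature are written from memory of the in-tree helpers, so treat them as assumptions rather than verified API:

    # Sketch only: the test/class names are hypothetical and the decorator
    # import and signature are assumed, not checked against the current tree.
    from lldbsuite.test import lldbtest
    from lldbsuite.test.decorators import expectedFailureAll


    class GdbRemoteRaceTestCase(lldbtest.TestBase):

        # XFAIL on Linux: the ProcessGDBRemote race does not reproduce on
        # every run, so leaving it as a plain failure would be noisy on the
        # bots.
        @expectedFailureAll(oslist=["linux"])
        def test_restart_race(self):
            self.build()
            # ... launch the inferior and provoke the race here ...
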
pl

On 19 October 2015 at 12:30, Tamas Berghammer via lldb-dev
<lldb-dev@lists.llvm.org> wrote:
> The expected flakey marking works a bit differently than you described:
> * Run the test.
> * If it passes, it is recorded as a successful test and we are done.
> * Run the test again.
> * If it passes the 2nd time, record it as an expected failure (IMO
>   "expected flakey" would be a better result, but we don't have that
>   category).
> * If it fails 2 times in a row, record it as a failure, because a flakey
>   test should pass at least once in every 2 runs (this means we need a
>   ~95% success rate to keep the build bot green most of the time). If it
>   isn't passing often enough for that, it should be marked as an expected
>   failure. It is done this way to detect the case where a flakey test
>   gets broken completely by a new change.
>
> I checked some stats for TestRaise on the build bot, and under the
> current definition of expected flakey we shouldn't mark it as flakey,
> because it will often fail 2 times in a row (its passing rate is ~50%),
> which will be reported as a failure and make the build bot red.
>
> I will send you the full stats from the last 100 builds in a separate
> off-list mail, as it is too big for the mailing list. If somebody else is
> interested in it, let me know.
>
> Tamas
>
> On Sun, Oct 18, 2015 at 2:18 AM Todd Fiala <todd.fi...@gmail.com> wrote:
>>
>> Nope, no good either when I limit the flakey to DWO.
>>
>> So perhaps I don't understand how the flakey marking works. I thought it
>> meant:
>> * Run the test.
>> * If it passes, it goes as a successful test. Then we're done.
>> * Run the test again.
>> * If it passes, then we're done and mark it a successful test. If it
>>   fails, then mark it an expected failure.
>>
>> But that's definitely not the behavior I'm seeing, as a flakey marking
>> in the above scheme should never produce a failing test.
>>
>> I'll have to revisit the flakey test marking to see what it's really
>> doing, since my understanding is clearly flawed!
>>
>> On Sat, Oct 17, 2015 at 5:57 PM, Todd Fiala <todd.fi...@gmail.com> wrote:
>>>
>>> Hmm, the flakey behavior may be specific to dwo. Testing it locally as
>>> unconditionally flakey on Linux is failing on dwarf. All the ones I see
>>> succeed are dwo. I wouldn't expect a diff there, but that seems to be
>>> the case.
>>>
>>> So, the request still stands, but I won't be surprised if we find that
>>> dwo sometimes passes while dwarf doesn't (or at least not often enough
>>> to get through the flakey setting).
>>>
>>> On Sat, Oct 17, 2015 at 4:57 PM, Todd Fiala <todd.fi...@gmail.com>
>>> wrote:
>>>>
>>>> Hi Tamas,
>>>>
>>>> I think you have pulled stats on failing tests for me in the past. Can
>>>> you dig up the failure rate for TestRaise.py's test_restart_bug()
>>>> variants on Ubuntu 14.04 x86_64? I'd like to mark it as flaky on
>>>> Linux, since it is passing most of the time over here. But I want to
>>>> see if that's valid across all Ubuntu 14.04 x86_64. (If it is passing
>>>> some of the time, I'd prefer marking it flakey so that we don't see
>>>> unexpected successes.)
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> -Todd
>>>
>>> --
>>> -Todd
>>
>> --
>> -Todd
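
For readers following the thread in the archive: the two-run policy Tamas describes above amounts to roughly the following. This is a minimal standalone sketch of the policy, not the actual lldb test-runner code, and run_test here is just a stand-in for invoking the real test:

    import random

    def classify_flakey(run_test):
        """Apply the two-run flakey policy described above.

        run_test is any zero-argument callable that returns True on pass.
        """
        if run_test():
            return "success"            # passed on the first try: done
        if run_test():
            return "expected failure"   # passed on the retry: recorded as XFAIL
        return "failure"                # failed twice in a row: the bot goes red

    # Sanity check of the ~95% figure: with per-run pass rate p, both runs
    # fail with probability (1 - p)**2, so p = 0.95 reddens the bot on only
    # ~0.25% of builds, while TestRaise's ~50% pass rate does so ~25% of
    # the time.
    for p in (0.95, 0.5):
        reds = sum(
            classify_flakey(lambda: random.random() < p) == "failure"
            for _ in range(10_000)
        )
        print(f"pass rate {p}: bot goes red in ~{reds / 10_000:.1%} of builds")

The simulation at the end is why the bar for the flakey marking is so high: a test that only passes about half the time, like TestRaise, will fail both runs often enough to keep turning the bot red, which is exactly the situation Tamas describes.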