Re: [lldb-dev] TestRaise.py test_restart_bug flakey stats

2015-10-19 Thread Todd Fiala via lldb-dev
Okay. I think for the time being, the XFAIL makes sense. Per my previous email, though, I think we should move away from unexpected success (XPASS) being a "sometimes meaningful, sometimes meaningless" signal. For almost all cases, an unexpected success is an actionable signal. I don't want it

Re: [lldb-dev] TestRaise.py test_restart_bug flakey stats

2015-10-19 Thread Todd Fiala via lldb-dev
Thanks, Tamas.
On Mon, Oct 19, 2015 at 4:30 AM, Tamas Berghammer wrote:
> The expected flakey works a bit differently than you described:
> * Run the test
> * If it passes, it goes as a successful test and we are done
> * Run the test again
> * If it passes the 2nd time then record it as

Re: [lldb-dev] TestRaise.py test_restart_bug flakey stats

2015-10-19 Thread Pavel Labath via lldb-dev
I have created this test to reproduce a race condition in ProcessGDBRemote. Given that it tests a race condition, it cannot be failing 100% of the time, but I agree with Tamas that we should keep it as XFAIL to avoid noise in the buildbots.
pl
On 19 October 2015 at 12:30, Tamas Berghammer via lld

Re: [lldb-dev] TestRaise.py test_restart_bug flakey stats

2015-10-19 Thread Tamas Berghammer via lldb-dev
The expected flakey works a bit differently than you described:
* Run the test
* If it passes, it goes as a successful test and we are done
* Run the test again
* If it passes the 2nd time then record it as expected failure (IMO expected flakey would be a better result, but we don't have th
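The retry flow described in this message can be sketched roughly as follows. This is only an illustration: `run_expected_flakey` and the result strings are hypothetical names, not the LLDB test suite's actual decorator or result codes. The message is cut off before it says what happens when the second run also fails, so reporting that case as a failure is an assumption.

```python
from typing import Callable


def run_expected_flakey(test: Callable[[], bool]) -> str:
    """Hypothetical sketch of the 'expected flakey' retry flow.

    `test` is any callable returning True on pass and False on failure.
    """
    # First attempt: a pass is recorded as an ordinary success.
    if test():
        return "PASS"

    # Second attempt: per the message, a pass on the retry is
    # recorded as an expected failure rather than as a success.
    if test():
        return "XFAIL"

    # Assumption: two failures in a row count as a real failure.
    # The archived message is truncated before this case.
    return "FAIL"
```

For example, a test that fails once and then passes would come out as `"XFAIL"` under this sketch, which is the point where it differs from treating a passing retry as a plain success.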

Re: [lldb-dev] TestRaise.py test_restart_bug flakey stats

2015-10-17 Thread Todd Fiala via lldb-dev
Nope, no good either when I limit the flakey to DWO. So perhaps I don't understand how the flakey marking works. I thought it meant:
* run the test.
* If it passes, it goes as a successful test. Then we're done.
* run the test again.
* If it passes, then we're done and mark it a successful test.

Re: [lldb-dev] TestRaise.py test_restart_bug flakey stats

2015-10-17 Thread Todd Fiala via lldb-dev
Hmm, the flakey behavior may be specific to dwo. Testing it locally as unconditionally flaky on Linux is failing on dwarf. All the ones I see succeed are dwo. I wouldn't expect a diff there but that seems to be the case. So, the request still stands but I won't be surprised if we find that dwo