Re: [lldb-dev] proposal for reworked flaky test category

2015-10-20 Thread Enrico Granata via lldb-dev
> On Oct 19, 2015, at 4:40 PM, Zachary Turner via lldb-dev wrote: >> Yea, I definitely agree with you there. >> Is this going to end up with an @expectedFlakeyWindows, @expectedFlakeyLinux, @expectedFlakeyDarwin, @expectedFlakeyAndroid, @expectedFlakeyFreeBSD? >> It's starting to get …

Re: [lldb-dev] proposal for reworked flaky test category

2015-10-20 Thread Zachary Turner via lldb-dev
Well, that's basically what I meant with this: @test_status(status=flaky, host=[win, linux, android, darwin, bsd], target=[win, linux, android, darwin, bsd], compiler=[gcc, clang], debug_info=[dsym, dwarf, dwo]) but it has keyword parameters that allow you to specify the conditions under which the …

Re: [lldb-dev] proposal for reworked flaky test category

2015-10-20 Thread Todd Fiala via lldb-dev
I'm not totally sure yet here. Right now there is a generic category mechanism, but it is only settable via a file in a directory, or overridden via the test case class method called getCategories(). I think I'd want a more general decorator that allows you to tag a method itself with categories.
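A method-level category decorator of the kind described above might look like this; the decorator name `add_test_categories` and its semantics are illustrative assumptions, since at the time the only mechanisms were the per-directory file and the class-wide getCategories() override:

```python
def add_test_categories(categories):
    """Sketch (hypothetical name): tag an individual test method with
    extra categories, rather than setting them per-directory or via a
    class-wide getCategories() override."""
    def decorator(func):
        # Accumulate rather than overwrite, so decorators can stack.
        func.categories = getattr(func, "categories", []) + list(categories)
        return func
    return decorator

# Stacked usage: the inner decorator runs first, so categories
# accumulate bottom-up on the same method.
@add_test_categories(["flakey"])
@add_test_categories(["basic_process"])
def test_inferior_crash():
    pass
```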

Re: [lldb-dev] proposal for reworked flaky test category

2015-10-19 Thread Zachary Turner via lldb-dev
Yea, I definitely agree with you there. Is this going to end up with an @expectedFlakeyWindows, @expectedFlakeyLinux, @expectedFlakeyDarwin, @expectedFlakeyAndroid, @expectedFlakeyFreeBSD? It's starting to get a little crazy; at some point I think we just need something that we can use like this: …

Re: [lldb-dev] proposal for reworked flaky test category

2015-10-19 Thread Todd Fiala via lldb-dev
My initial proposal was an attempt to not entirely skip running them on our end, and still get them to generate actionable signals without conflating them with unexpected successes (which, semantically, they absolutely are not). On Mon, Oct 19, 2015 at 4:33 PM, Todd Fiala wrote: > Nope, I have …

Re: [lldb-dev] proposal for reworked flaky test category

2015-10-19 Thread Todd Fiala via lldb-dev
Nope, I have no issue with what you said. We don't want to run them over here at all because we don't see enough useful info come out of them. You need time-series data for that to be somewhat useful, and even then it is only useful if you see a sharp change in it after a specific change. So I r…

Re: [lldb-dev] proposal for reworked flaky test category

2015-10-19 Thread Zachary Turner via lldb-dev
Don't get me wrong, I like the idea of running flakey tests a couple of times and seeing if one passes (Chromium does this as well, so it's not without precedent). If I sounded harsh, it's because I *want* to be harsh on flaky tests. Flaky tests indicate literally the *worst* kind of bugs because …
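The rerun policy referred to here (run a flaky test a couple of times and count it as passing if any attempt passes) can be sketched as a small helper; `run_flaky` and its `attempts` parameter are hypothetical names, not part of the lldb test runner:

```python
def run_flaky(test_fn, attempts=2):
    """Run `test_fn` up to `attempts` times; succeed as soon as one
    attempt passes, otherwise re-raise the last failure."""
    last_error = None
    for _ in range(attempts):
        try:
            test_fn()
            return True
        except AssertionError as err:
            last_error = err
    raise last_error

# A test that fails on its first attempt and passes on the second,
# standing in for a genuinely nondeterministic test:
calls = {"n": 0}
def sometimes_fails():
    calls["n"] += 1
    assert calls["n"] > 1
```

This is also why flaky tests are the worst kind of signal: the retry loop makes the suite green, but the nondeterminism it papers over is still there.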

Re: [lldb-dev] proposal for reworked flaky test category

2015-10-19 Thread Todd Fiala via lldb-dev
Okay, so I'm not a fan of the flaky tests myself, nor of test suites taking longer to run than needed. Enrico is going to add a new 'flakey' category to the test categorization. Scratch all the other complexity I offered up. What we're going to ask is: if a test is flakey, please add it to the 'flakey' …
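With a 'flakey' category in place, a runner that wants to skip such tests only has to consult each test's category set. This filter is a sketch of that idea under assumed attribute names, not the actual dotest.py implementation:

```python
def should_run(test_fn, skip_categories):
    """Return True unless the test carries a category the user asked
    to skip, e.g. skip_categories={'flakey'} on a bot whose owners do
    not want to run flaky tests at all."""
    cats = set(getattr(test_fn, "categories", []))
    return not (cats & set(skip_categories))

# A test tagged with the new category:
def flaky_test():
    pass
flaky_test.categories = ["flakey"]
```

The appeal of the category approach is exactly this symmetry: bots that want the tests run them (perhaps with retries), and bots that don't simply disable the category.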

Re: [lldb-dev] proposal for reworked flaky test category

2015-10-19 Thread Zachary Turner via lldb-dev
On Mon, Oct 19, 2015 at 12:50 PM Todd Fiala via lldb-dev <lldb-dev@lists.llvm.org> wrote: > Hi all, > > I'd like unexpected successes (i.e. tests marked as unexpected failure > that in fact pass) to retain the actionable meaning that something is > wrong. The wrong part is that either (1) the test …

Re: [lldb-dev] proposal for reworked flaky test category

2015-10-19 Thread Todd Fiala via lldb-dev
> I'd like unexpected successes (i.e. tests marked as unexpected failure that in fact pass) argh, that should have been "(i.e. tests marked as *expected* failure that in fact pass)" On Mon, Oct 19, 2015 at 12:50 PM, Todd Fiala wrote: > Hi all, > > I'd like unexpected successes (i.e. tests marked …

[lldb-dev] proposal for reworked flaky test category

2015-10-19 Thread Todd Fiala via lldb-dev
Hi all, I'd like unexpected successes (i.e. tests marked as unexpected failure that in fact pass) to retain the actionable meaning that something is wrong. The wrong part is that either (1) the test now passes consistently and the author of the fix just missed updating the test definition (or perhaps …
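The status semantics this proposal wants to preserve can be stated precisely: a test marked expected-failure that in fact passes is an "unexpected success", and the point of the thread is to keep that signal actionable rather than dilute it with flaky tests. A minimal sketch of the classification:

```python
def classify(marked_expected_failure, passed):
    """Map the (marking, outcome) pair to a result status, following
    the semantics described in this thread: an expected-failure test
    that passes is an actionable 'unexpected success'."""
    if marked_expected_failure:
        return "unexpected success" if passed else "expected failure"
    return "pass" if passed else "fail"
```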