Re: [lldb-dev] test rerun phase is in

Todd Fiala via lldb-dev Tue, 15 Dec 2015 13:46:12 -0800

Hmm, yeah it looks like it did the rerun and then after finishing the
rerun, it's just hanging.


Let's have a look right after r255676 goes through this builder.  I hit a
hang in the curses output display caused by recursively taking a lock that
was not recursive-enabled.  I would have expected to see that in my earlier
testing with the basic results output that this builder is using, but it's
possible that we're somehow hitting a path here that attempts to take a
lock recursively.
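
To illustrate the kind of hang I mean, here is a minimal sketch using
Python's standard threading module.  This is not the test runner code; the
helper name try_nested_acquire is made up for the example, and the nested
acquire is non-blocking only so that the demo returns instead of hanging.

```python
import threading


def try_nested_acquire(lock):
    """Take `lock`, then try to take it again from the same thread.

    Returns True if the nested acquire succeeded.  The nested attempt is
    non-blocking so the demo reports the outcome rather than hanging.
    """
    with lock:
        got_it = lock.acquire(blocking=False)
        if got_it:
            # Undo the nested acquire; the `with` block releases the outer one.
            lock.release()
        return got_it


# A plain Lock is not reentrant: the nested acquire fails here, and a
# blocking acquire at that point would wait forever.
plain_result = try_nested_acquire(threading.Lock())

# An RLock may be re-acquired by the thread that already holds it.
reentrant_result = try_nested_acquire(threading.RLock())

print("threading.Lock nested acquire succeeded:", plain_result)
print("threading.RLock nested acquire succeeded:", reentrant_result)
```

With a blocking acquire, the plain Lock case is exactly the "process never
exits" symptom: the thread waits on a lock that it already holds itself.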

Do you know if it is happening every single time a rerun occurs?
 (Hopefully yes?)

-Todd

On Tue, Dec 15, 2015 at 1:38 PM, Todd Fiala <[email protected]> wrote:

> Yep, I'll have a look!
>
> On Tue, Dec 15, 2015 at 12:43 PM, Ying Chen <[email protected]> wrote:
>
>> Hi Todd,
>>
>> We noticed on the lldb Android builders that the test_runner didn't exit
>> after the rerun, which caused a buildbot timeout because the process hung
>> for over 20 minutes.
>> Could you please check whether that's related to your change?
>>
>> Please see the following builds.
>>
>> http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-android/builds/4305/steps/test3/logs/stdio
>>
>> http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-android/builds/4305/steps/test7/logs/stdio
>>
>> Thanks,
>> Ying
>>
>> On Mon, Dec 14, 2015 at 4:52 PM, Todd Fiala via lldb-dev <
>> [email protected]> wrote:
>>
>>> And, btw, this shows the rerun logic working (via the --rerun-all-issues
>>> flag):
>>>
>>> time test/dotest.py --executable `pwd`/build/Debug/lldb --threads 24
>>> --rerun-all-issues
>>> Testing: 416 test suites, 24 threads
>>> 377 out of 416 test suites processed - TestSBTypeTypeClass.py
>>>
>>> Session logs for test failures/errors/unexpected successes will go into
>>> directory '2015-12-14-16_44_28'
>>> Command invoked: test/dotest.py --executable
>>> /Users/tfiala/src/lldb-tot/lldb/build/Debug/lldb --threads 24
>>> --rerun-all-issues -s 2015-12-14-16_44_28 --results-port 62322 --inferior
>>> -p TestMultithreaded.py
>>> /Users/tfiala/src/lldb-tot/lldb/packages/Python/lldbsuite/test
>>> --event-add-entries worker_index=3:int
>>>
>>> Configuration: arch=x86_64 compiler=clang
>>> ----------------------------------------------------------------------
>>> Collected 8 tests
>>>
>>> lldb_codesign: no identity found
>>> lldb_codesign: no identity found
>>> lldb_codesign: no identity found
>>> lldb_codesign: no identity found
>>> lldb_codesign: no identity found
>>> lldb_codesign: no identity found
>>> lldb_codesign: no identity found
>>>
>>> [TestMultithreaded.py FAILED]
>>> Command invoked: /usr/bin/python test/dotest.py --executable
>>> /Users/tfiala/src/lldb-tot/lldb/build/Debug/lldb --threads 24
>>> --rerun-all-issues -s 2015-12-14-16_44_28 --results-port 62322 --inferior
>>> -p TestMultithreaded.py
>>> /Users/tfiala/src/lldb-tot/lldb/packages/Python/lldbsuite/test
>>> --event-add-entries worker_index=3:int
>>> 396 out of 416 test suites processed - TestMiBreak.py
>>>
>>> Session logs for test failures/errors/unexpected successes will go into
>>> directory '2015-12-14-16_44_28'
>>> Command invoked: test/dotest.py --executable
>>> /Users/tfiala/src/lldb-tot/lldb/build/Debug/lldb --threads 24
>>> --rerun-all-issues -s 2015-12-14-16_44_28 --results-port 62322 --inferior
>>> -p TestDataFormatterObjC.py
>>> /Users/tfiala/src/lldb-tot/lldb/packages/Python/lldbsuite/test
>>> --event-add-entries worker_index=12:int
>>>
>>> Configuration: arch=x86_64 compiler=clang
>>> ----------------------------------------------------------------------
>>> Collected 26 tests
>>>
>>>
>>> [TestDataFormatterObjC.py FAILED]
>>> Command invoked: /usr/bin/python test/dotest.py --executable
>>> /Users/tfiala/src/lldb-tot/lldb/build/Debug/lldb --threads 24
>>> --rerun-all-issues -s 2015-12-14-16_44_28 --results-port 62322 --inferior
>>> -p TestDataFormatterObjC.py
>>> /Users/tfiala/src/lldb-tot/lldb/packages/Python/lldbsuite/test
>>> --event-add-entries worker_index=12:int
>>> 416 out of 416 test suites processed - TestLldbGdbServer.py
>>> 2 test files marked for rerun
>>>
>>>
>>> Rerunning the following files:
>>>
>>> functionalities/data-formatter/data-formatter-objc/TestDataFormatterObjC.py
>>>   api/multithreaded/TestMultithreaded.py
>>> Testing: 2 test suites, 1 thread
>>> 2 out of 2 test suites processed - TestMultithreaded.py
>>> Test rerun complete
>>>
>>>
>>> =============
>>> Issue Details
>>> =============
>>> UNEXPECTED SUCCESS: test_symbol_name_dsym
>>> (functionalities/completion/TestCompletion.py)
>>> UNEXPECTED SUCCESS: test_symbol_name_dwarf
>>> (functionalities/completion/TestCompletion.py)
>>>
>>> ===================
>>> Test Result Summary
>>> ===================
>>> Test Methods:       1695
>>> Reruns:               30
>>> Success:            1367
>>> Expected Failure:     90
>>> Failure:               0
>>> Error:                 0
>>> Exceptional Exit:      0
>>> Unexpected Success:    2
>>> Skip:                236
>>> Timeout:               0
>>> Expected Timeout:      0
>>>
>>> On Mon, Dec 14, 2015 at 4:51 PM, Todd Fiala <[email protected]>
>>> wrote:
>>>
>>>> And that fixed the rest as well.  Thanks, Siva!
>>>>
>>>> -Todd
>>>>
>>>> On Mon, Dec 14, 2015 at 4:44 PM, Todd Fiala <[email protected]>
>>>> wrote:
>>>>
>>>>> Heh you were skinning the same cat :-)
>>>>>
>>>>> That fixed the one I was just looking at, running the others now.
>>>>>
>>>>> On Mon, Dec 14, 2015 at 4:42 PM, Todd Fiala <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Yep, will try now...  (I was just looking at the condition testing
>>>>>> logic since it looks like something isn't quite right there).
>>>>>>
>>>>>> On Mon, Dec 14, 2015 at 4:39 PM, Siva Chandra <[email protected]
>>>>>> > wrote:
>>>>>>
>>>>>>> Can you try again after taking my change at r255584?
>>>>>>>
>>>>>>> On Mon, Dec 14, 2015 at 4:31 PM, Todd Fiala via lldb-dev
>>>>>>> <[email protected]> wrote:
>>>>>>> > I'm having some of these blow up.
>>>>>>> >
>>>>>>> > In the case of test/lang/c/typedef/Testtypedef.py, it looks like some
>>>>>>> > of the @expected decorators were changed a bit, and perhaps they are
>>>>>>> > not pound for pound the same.  For example, this test used to really
>>>>>>> > be marked XFAIL (via an expectedFailureClang directive), but it looks
>>>>>>> > like the current marking of compiler="clang" is either not right or
>>>>>>> > not working, since the test is run on OS X and is treated like it is
>>>>>>> > expected to pass.
>>>>>>> >
>>>>>>> > I'm drilling into that a bit more; that's just the first of several
>>>>>>> > that fail with these changes on OS X.
>>>>>>> >
>>>>>>> > On Mon, Dec 14, 2015 at 3:03 PM, Zachary Turner <
>>>>>>> [email protected]> wrote:
>>>>>>> >>
>>>>>>> >> I've checked in r255567 which fixes a problem pointed out by Siva.  It
>>>>>>> >> doesn't sound related to r255542, but looking at those logs I can't
>>>>>>> >> really tell how my CL would be related.  If r255567 doesn't fix the
>>>>>>> >> bots, would someone mind helping me briefly?  r255542 seems pretty
>>>>>>> >> straightforward, so I don't see why it would have an effect here.
>>>>>>> >>
>>>>>>> >> On Mon, Dec 14, 2015 at 2:35 PM Todd Fiala <[email protected]>
>>>>>>> wrote:
>>>>>>> >>>
>>>>>>> >>> Ah yes I see.  Thanks, Ying (and Siva!  Saw your comments too).
>>>>>>> >>>
>>>>>>> >>> On Mon, Dec 14, 2015 at 2:34 PM, Ying Chen <[email protected]>
>>>>>>> wrote:
>>>>>>> >>>>
>>>>>>> >>>> Seems this is the first build that fails, and it only has one CL
>>>>>>> >>>> 255542.
>>>>>>> >>>>
>>>>>>> >>>> http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-cmake/builds/9446
>>>>>>> >>>> I believe Zachary is looking at that problem.
>>>>>>> >>>>
>>>>>>> >>>> On Mon, Dec 14, 2015 at 2:18 PM, Todd Fiala <
>>>>>>> [email protected]>
>>>>>>> >>>> wrote:
>>>>>>> >>>>>
>>>>>>> >>>>> I am seeing several failures on the Ubuntu 14.04 testbot, but
>>>>>>> >>>>> unfortunately a number of changes went in at the same time on
>>>>>>> >>>>> that build.  The failures I'm seeing do not appear to be related
>>>>>>> >>>>> at all to the test running infrastructure.
>>>>>>> >>>>>
>>>>>>> >>>>> Is anybody with a fast Linux system able to take a look to see
>>>>>>> >>>>> what exactly is failing there?
>>>>>>> >>>>>
>>>>>>> >>>>> -Todd
>>>>>>> >>>>>
>>>>>>> >>>>> On Mon, Dec 14, 2015 at 1:39 PM, Todd Fiala <
>>>>>>> [email protected]>
>>>>>>> >>>>> wrote:
>>>>>>> >>>>>>
>>>>>>> >>>>>> Hi all,
>>>>>>> >>>>>>
>>>>>>> >>>>>> I just put in the single-worker, low-load, follow-up test run pass
>>>>>>> >>>>>> in r255543.  Most of the work for it went in late last week; this
>>>>>>> >>>>>> mostly just flips it on.
>>>>>>> >>>>>>
>>>>>>> >>>>>> The feature works like this:
>>>>>>> >>>>>>
>>>>>>> >>>>>> * First test phase works as before: run all tests using whatever
>>>>>>> >>>>>> level of concurrency is normally used.  (e.g. 8 works on an
>>>>>>> >>>>>> 8-logical-core box.)
>>>>>>> >>>>>>
>>>>>>> >>>>>> * Any timeouts, failures, errors, or anything else that would have
>>>>>>> >>>>>> caused a test failure is eligible for rerun if either (1) it was
>>>>>>> >>>>>> marked as a flakey test via the flakey decorator, or (2) the
>>>>>>> >>>>>> --rerun-all-issues command line flag is provided.
>>>>>>> >>>>>>
>>>>>>> >>>>>> * After the first test phase, any tests that met rerun eligibility
>>>>>>> >>>>>> get run again in a serial test phase.  Their results will overwrite
>>>>>>> >>>>>> (i.e. replace) the previous result for the given test method.
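
The three rules above can be sketched roughly as follows.  This is only an
illustration under stated assumptions, not the code that went in: the names
MethodResult, FAILING, eligible_for_rerun and run_with_rerun are all
hypothetical, and the status strings are stand-ins for whatever the real
result events carry.

```python
from dataclasses import dataclass

# Statuses treated as "would have caused a test failure" in this sketch
# (hypothetical names, not the real event status values).
FAILING = {"failure", "error", "timeout", "exceptional_exit"}


@dataclass
class MethodResult:
    test_id: str
    status: str            # e.g. "success", "failure", "timeout"
    flakey: bool = False   # stands in for the flakey decorator


def eligible_for_rerun(result, rerun_all_issues):
    """A failing result is rerun if it is flakey or the flag is set."""
    return result.status in FAILING and (result.flakey or rerun_all_issues)


def run_with_rerun(first_phase, rerun_one, rerun_all_issues=False):
    """Merge first-phase results with a serial rerun of eligible tests.

    `first_phase` is the list of results from the concurrent phase and
    `rerun_one` is a callable that reruns a single test by itself.
    """
    results = {r.test_id: r for r in first_phase}
    to_rerun = [r for r in first_phase
                if eligible_for_rerun(r, rerun_all_issues)]
    for old in to_rerun:  # serial phase: one test at a time
        # The rerun result replaces the earlier one for the same test.
        results[old.test_id] = rerun_one(old)
    return results


first = [
    MethodResult("a", "success"),
    MethodResult("b", "timeout", flakey=True),
    MethodResult("c", "failure"),
]


def pass_on_rerun(old):
    # Pretend every test passes once the machine is under low load.
    return MethodResult(old.test_id, "success", old.flakey)


final = run_with_rerun(first, pass_on_rerun)
final_all = run_with_rerun(first, pass_on_rerun, rerun_all_issues=True)
print({k: v.status for k, v in final.items()})
print({k: v.status for k, v in final_all.items()})
```

In the first call only the flakey timeout "b" is rerun, so "c" stays a
failure; with rerun_all_issues=True the plain failure "c" is rerun as well.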
>>>>>>> >>>>>>
>>>>>>> >>>>>> The net result should be that tests that are load sensitive and
>>>>>>> >>>>>> intermittently fail during the first, higher-concurrency test phase
>>>>>>> >>>>>> should (in theory) pass in the second test phase, when the test
>>>>>>> >>>>>> suite is only using a single worker.  This should make the test
>>>>>>> >>>>>> suite generate fewer false positives on test failure notification,
>>>>>>> >>>>>> which should make continuous integration servers (testbots) much
>>>>>>> >>>>>> more useful in terms of generating actionable signals caused by
>>>>>>> >>>>>> version control changes to the lldb or related sources.
>>>>>>> >>>>>>
>>>>>>> >>>>>> Please let me know if you see any issues with this when running the
>>>>>>> >>>>>> test suite using the default output.  I'd like to fix this up ASAP.
>>>>>>> >>>>>> And for those interested in the implementation, I'm happy to do
>>>>>>> >>>>>> post-commit review/changes as needed to get it in good shape.
>>>>>>> >>>>>>
>>>>>>> >>>>>> I'll be watching the builders now and will address any issues as I
>>>>>>> >>>>>> see them.
>>>>>>> >>>>>>
>>>>>>> >>>>>> Thanks!
>>>>>>> >>>>>> --
>>>>>>> >>>>>> -Todd
>>>>>>> >>>>>
>>>>>>> >>>>>
>>>>>>> >>>>>
>>>>>>> >>>>>
>>>>>>> >>>>> --
>>>>>>> >>>>> -Todd
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> --
>>>>>>> >>> -Todd
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > --
>>>>>>> > -Todd
>>>>>>> >
>>>>>>> > _______________________________________________
>>>>>>> > lldb-dev mailing list
>>>>>>> > [email protected]
>>>>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -Todd
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -Todd
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -Todd
>>>>
>>>
>>>
>>>
>>> --
>>> -Todd
>>>
>>>
>>>
>>
>
>
> --
> -Todd
>



-- 
-Todd

_______________________________________________
lldb-dev mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
