Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Pavel Labath via lldb-dev
On 6 February 2018 at 18:53, Zachary Turner  wrote:
> I'm not claiming that it's definitely caused by dotest and that moving away
> from dotest is going to fix all the problems.  Rather, I'm claiming that
> dotest has an unknown amount of flakiness (which may be 0, but may be
> large), and the alternative has a known amount of flakiness (which is very
> close to, if not equal to 0).

Well, it may be unknown to you, but as someone who has managed a bot
running tests for a long time, I can tell you that it's pretty
close to 0. Some tests still fail sometimes, but the failure rate is
approximately at the same level as failures caused by the bot not
being able to reach the svn server to fetch the sources.

That said, I'm still in favor of replacing the test runner with lit. I
just think it needs to be done with a steady hand.


>> So I believe we need more lightweight tests, and lldb-test can provide
>> us with that. The main question for me (and that's something I don't
>> really have an answer to) is how to make writing tests like that easy.
>> E.g. for these "foreign" language plugins, the only way to make a
>> self-contained regression test would be to check-in some dwarf which
>> mimics what the compiler in question would produce. But doing that is
>> extremely tedious as we don't have any tooling for that.
>
>
>  Most of these other language plugins are being removed anyway.  Which
> language plugins are going to still remain that aren't some flavor of c/c++?

Well, right now we have another thread proposing the addition of a
Rust plugin, and we will want to resurrect Java support sooner or
later. Go/OCaml folks may want to do the same, if doing so does not
require them to invent a whole test framework.

So, I'm not sure where you were heading with that question...

On 6 February 2018 at 18:53, Zachary Turner  wrote:
>
>
> On Tue, Feb 6, 2018 at 8:19 AM Pavel Labath via lldb-dev
>  wrote:
>>
>> On 6 February 2018 at 15:41, Davide Italiano  wrote:
>> > On Tue, Feb 6, 2018 at 7:09 AM, Pavel Labath  wrote:
>> >> On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
>> >>
>> >> So, I guess my question is: are you guys looking into making sure that
>> >> others are also able to reproduce the 0-fail+0-xpass state? I would
>> >> love to run the mac test suite locally, as I tend to touch a lot of
>> >> stuff that impacts all targets, but as it stands now, I have very
>> >> little confidence that the tests I am running reflect in any way the
>> >> results you will get when you run the test on your end.
>> >>
>> >> I am ready to supply any test logs or information you need if you want
>> >> to try to tackle this.
>> >>
>> >
>> > Yes, I'm definitely interested in making the testsuite
>> > working/reliable on any configuration.
>> > I was afraid there were a lot of latent issues, that's why I sent this
>> > mail in the first place.
>> > It's also the reason why I started thinking about `lldb-test` as a
>> > driver for testing, because I found the testsuite to be a little
>> > inconsistent/brittle depending on the environment it's run on (which,
>> > FWIW, doesn't happen when you run lit/FileCheck or even the unit tests
>> > in lldb). I'm not currently claiming switching to a different method
>> > would improve the situation, but it's worth a shot.
>> >
>>
>> Despite Zachary's claims, I do not believe this is caused by the test
>> driver (dotest). It's definitely not beautiful, but I haven't seen an
>> issue that would be caused by this in a long time. The issue is that
>> the tests are doing too much -- even the simplest involves compiling a
>> fully working executable, which pulls in a lot of stuff from the
>> environment (runtime libraries, dynamic linker, ...) that we have no
>> control of. And of course it makes it impossible to test the debugging
>> functionality of any other platform than what you currently have in
>> front of you.
>
> I'm not claiming that it's definitely caused by dotest and that moving away
> from dotest is going to fix all the problems.  Rather, I'm claiming that
> dotest has an unknown amount of flakiness (which may be 0, but may be
> large), and the alternative has a known amount of flakiness (which is very
> close to, if not equal to 0).  So we should do it because, among other
> benefits, it replaces an unknown with a known that is at least as good, if
> not better.
>
>
>>
>>
>> In this sense, the current setup makes an excellent integration test
>> suite -- if you run the tests and they pass, you can be fairly
>> confident that the debugging on your system is setup correctly.
>> However, it makes a very bad regression test suite, as the tests will
>> be checking something different on each machine.
>>
>> So I believe we need more lightweight tests, and lldb-test can provide
>> us with that. The main question for me (and that's something I don't
>> really have an answer to) is how to make writing tests like that easy.
>> E.g. for these "foreign" language plugins, the only way to make a
>> self-contained regression test would be to check-in some dwarf which
>> mimics what the compiler in question would produce. But doing that is
>> extremely tedious as we don't have any tooling for that.

Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Zachary Turner via lldb-dev
On Wed, Feb 7, 2018 at 2:38 AM Pavel Labath  wrote:

> On 6 February 2018 at 18:53, Zachary Turner  wrote:
> > I'm not claiming that it's definitely caused by dotest and that moving away
> > from dotest is going to fix all the problems.  Rather, I'm claiming that
> > dotest has an unknown amount of flakiness (which may be 0, but may be
> > large), and the alternative has a known amount of flakiness (which is very
> > close to, if not equal to 0).
>
> Well, it may be unknown to you, but as someone who has managed a bot
> running tests for a long time, I can tell you that it's pretty
> close to 0. Some tests still fail sometimes, but the failure rate is
> approximately at the same level as failures caused by the bot not
> being able to reach the svn server to fetch the sources.

As someone who gave up on trying to set up a bot due to flakiness, I have a
different experience.



>
> That said, I'm still in favor of replacing the test runner with lit. I
> just think it needs to be done with a steady hand.
>
>
> >> So I believe we need more lightweight tests, and lldb-test can provide
> >> us with that. The main question for me (and that's something I don't
> >> really have an answer to) is how to make writing tests like that easy.
> >> E.g. for these "foreign" language plugins, the only way to make a
> >> self-contained regression test would be to check-in some dwarf which
> >> mimics what the compiler in question would produce. But doing that is
> >> extremely tedious as we don't have any tooling for that.
> >
> >
> >  Most of these other language plugins are being removed anyway.  Which
> > language plugins are going to still remain that aren't some flavor of c/c++?
>
> Well, right now we have another thread proposing the addition of a
> Rust plugin, and we will want to resurrect Java support sooner or
> later. Go/OCaml folks may want to do the same, if doing so does not
> require them to invent a whole test framework.
>
> So, I'm not sure where you were heading with that question...


Rust is based on llvm, so we have the tools necessary for that. The rest
are still maybes and somedays, so we can cross that bridge when (if) we
come to it.


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Pavel Labath via lldb-dev
On 7 February 2018 at 14:20, Zachary Turner  wrote:
>
> As someone who gave up on trying to set up a bot due to flakiness, I have a
> different experience.

I did not say it was easy to get to the present point, and I am
certain that the situation is much harder on windows. But I believe
this is due to reasons not related to the test runner (such as various
posixisms spread throughout the codebase and the fact that windows uses a
completely different (i.e. less tested) code path for debugging).

FWIW, we also have a windows bot running remote tests targeting
android. It's not as stable as the one hosted on linux, but most of
the issues I've seen there also do not point towards dotest.

> Rust is based on llvm, so we have the tools necessary for that. The rest
> are still maybes and somedays, so we can cross that bridge when (if) we
> come to it.

I don't know enough about Rust to say whether that is true. If it uses
llvm as a backend then I guess we could check in some Rust-generated
IR to serve as a test case (but we would still have to figure out what
exactly to do with it).

However, I would assert that even for C family languages a more
low-level approach than "$CC -g" for generating debug info would be
useful. People generally will not have their compiler and debugger
versions in sync, so we need tests that check we handle debug info
produced by older versions of clang (or gcc for that matter). And
then, there are the tests to make sure we handle "almost valid" debug
info gracefully...
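
To make this concrete, here is a rough sketch of what such a test could
look like. The compiler name, file names, and the lldb-test/FileCheck
invocation are hypothetical; this is just one possible shape for it:

  # Done once, offline, with whatever compiler version we want to keep
  # covering; the resulting .s file is checked into the repository.
  $ clang-5.0 -g -S test.c -o test-clang5.s

  # At test time only an assembler is needed, so the test no longer
  # depends on the host compiler or environment.
  $ llvm-mc -triple=x86_64-pc-linux -filetype=obj test-clang5.s -o test.o
  $ lldb-test symbols test.o | FileCheck test-clang5.checks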


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Davide Italiano via lldb-dev
On Wed, Feb 7, 2018 at 7:57 AM, Pavel Labath  wrote:
> On 7 February 2018 at 14:20, Zachary Turner  wrote:
>>
>> As someone who gave up on trying to set up a bot due to flakiness, I have a
>> different experience.
>
> I did not say it was easy to get to the present point, and I am
> certain that the situation is much harder on windows. But I believe
> this is due to reasons not related to the test runner (such as various
> posixisms spread throughout the codebase and the fact that windows uses a
> completely different (i.e. less tested) code path for debugging).
>
> FWIW, we also have a windows bot running remote tests targeting
> android. It's not as stable as the one hosted on linux, but most of
> the issues I've seen there also do not point towards dotest.
>
>> Rust is based on llvm, so we have the tools necessary for that. The rest
>> are still maybes and somedays, so we can cross that bridge when (if) we
>> come to it.
>
> I don't know enough about Rust to say whether that is true. If it uses
> llvm as a backend then I guess we could check in some Rust-generated
> IR to serve as a test case (but we would still have to figure out what
> exactly to do with it).
>
> However, I would assert that even for C family languages a more
> low-level approach than "$CC -g" for generating debug info would be
> useful. People generally will not have their compiler and debugger
> versions in sync, so we need tests that check we handle debug info
> produced by older versions of clang (or gcc for that matter). And
> then, there are the tests to make sure we handle "almost valid" debug
> info gracefully...

This last category is really interesting (and, unfortunately, given our
current testing strategy, almost entirely untested).
I think the proper thing here is to have tooling that generates broken
debug info, just as yaml2obj can generate broken object files, and to test
with that.
lldb does a great deal of work trying to "recover" via a lot of heuristics
when the debug info is wrong but not too far off. In order to have better
control of this code path, we need better testing for it; otherwise it
will break (and we'll be forced to remove the code path entirely).
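
To sketch the idea: yaml2obj's existing ELF syntax already gets us part of
the way there. The section content below is just a placeholder for
deliberately invalid DWARF, and the lldb-test invocation is only one
possible consumer of the result:

  # broken-debug-info.yaml: a hand-written object whose .debug_info is
  # intentionally malformed.
  --- !ELF
  FileHeader:
    Class:   ELFCLASS64
    Data:    ELFDATA2LSB
    Type:    ET_REL
    Machine: EM_X86_64
  Sections:
    - Name:    .debug_info
      Type:    SHT_PROGBITS
      Content: "DEADBEEF"

  $ yaml2obj broken-debug-info.yaml > broken.o
  $ lldb-test symbols broken.o   # should degrade gracefully, not crash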

--
Davide


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Pavel Labath via lldb-dev
On 6 February 2018 at 15:51, Davide Italiano  wrote:
>
> FWIW, I strongly believe we should all agree on a configuration to run
> tests and standardize on that.
> It's unfortunate that we have two build systems, but there are plans
> to move away from the manually maintained Xcode project, as many agree
> it's a terrible maintenance burden.
> So, FWIW, I'll share my conf (I'm on High Sierra):
>
>
> git clone https://github.com/monorepo
> symlink clang -> tools
> symlink lldb -> tools
> symlink libcxx -> projects (this particular one has caused lots of
> trouble for me in the past, and I realized it's undocumented :()
>
> cmake -GNinja -DCMAKE_BUILD_TYPE=Release ../llvm
> ninja check-lldb
>
Right, so I tried following these instructions as precisely as I could.

- The first thing that failed was the libc++ link step (missing -lcxxabi_shared).

So, I added libcxxabi to the build, and tried again.
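
For reference, the extra step was roughly the following (the libcxxabi
checkout path is illustrative, assuming the same monorepo/symlink layout
as above):

  $ ln -s /path/to/monorepo/libcxxabi llvm/projects/libcxxabi
  $ cmake -GNinja -DCMAKE_BUILD_TYPE=Release ../llvm
  $ ninja check-lldb
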
Aaand, I have to say the situation is much better now: I got two
unexpected successes and one timeout:
UNEXPECTED SUCCESS: test_lldbmi_output_grammar
(tools/lldb-mi/syntax/TestMiSyntax.py)
UNEXPECTED SUCCESS: test_process_interrupt_dsym
(functionalities/thread/state/TestThreadStates.py)
TIMEOUT: test_breakpoint_doesnt_match_file_with_different_case_dwarf
(functionalities/breakpoint/breakpoint_case_sensitivity/TestBreakpointCaseSensitivity.py)

On the second run I got these results:
FAIL: test_launch_in_terminal (functionalities/tty/TestTerminal.py)
UNEXPECTED SUCCESS: test_lldbmi_output_grammar
(tools/lldb-mi/syntax/TestMiSyntax.py)
UNEXPECTED SUCCESS: test_process_interrupt_dwarf
(functionalities/thread/state/TestThreadStates.py)


So, checking out libc++ certainly helped a lot (this definitely needs to
be documented somewhere). Of these, the MI test seems to be failing
consistently. The rest appear to be flakes. I am attaching the logs
from the second run, but there doesn't appear to be anything
interesting there...


Attachments:
- Failure-LaunchInTerminalTestCase-test_launch_in_terminal.log
- UnexpectedSuccess-MiSyntaxTestCase-test_lldbmi_output_grammar.log
- UnexpectedSuccess-ThreadStateTestCase-test_process_interrupt_dwarf.log


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Davide Italiano via lldb-dev
On Wed, Feb 7, 2018 at 9:32 AM, Pavel Labath  wrote:
> On 6 February 2018 at 15:51, Davide Italiano  wrote:
>>
>> FWIW, I strongly believe we should all agree on a configuration to run
>> tests and standardize on that.
>> It's unfortunate that we have two build systems, but there are plans
>> to move away from the manually maintained Xcode project, as many agree
>> it's a terrible maintenance burden.
>> So, FWIW, I'll share my conf (I'm on High Sierra):
>>
>>
>> git clone https://github.com/monorepo
>> symlink clang -> tools
>> symlink lldb -> tools
>> symlink libcxx -> projects (this particular one has caused lots of
>> trouble for me in the past, and I realized it's undocumented :()
>>
>> cmake -GNinja -DCMAKE_BUILD_TYPE=Release ../llvm
>> ninja check-lldb
>>
> Right, so I tried following these instructions as precisely as I could.
>
> - The first thing that failed was the libc++ link step (missing -lcxxabi_shared).
>
> So, I added libcxxabi to the build, and tried again.
> Aaand, I have to say the situation is much better now: I got two
> unexpected successes and one timeout:
> UNEXPECTED SUCCESS: test_lldbmi_output_grammar
> (tools/lldb-mi/syntax/TestMiSyntax.py)
> UNEXPECTED SUCCESS: test_process_interrupt_dsym
> (functionalities/thread/state/TestThreadStates.py)
> TIMEOUT: test_breakpoint_doesnt_match_file_with_different_case_dwarf
> (functionalities/breakpoint/breakpoint_case_sensitivity/TestBreakpointCaseSensitivity.py)
>
> On the second run I got these results:
> FAIL: test_launch_in_terminal (functionalities/tty/TestTerminal.py)
> UNEXPECTED SUCCESS: test_lldbmi_output_grammar
> (tools/lldb-mi/syntax/TestMiSyntax.py)
> UNEXPECTED SUCCESS: test_process_interrupt_dwarf
> (functionalities/thread/state/TestThreadStates.py)
>
>
> So, checking out libc++ certainly helped a lot (this definitely needs to
> be documented somewhere). Of these, the MI test seems to be failing
> consistently. The rest appear to be flakes. I am attaching the logs
> from the second run, but there doesn't appear to be anything
> interesting there...

Terrific that we're making progress! I plan to take a look at the
`lldb-mi` failure soon, as I can reproduce it here fairly
consistently.

About the others, we've seen
functionalities/breakpoint/breakpoint_case_sensitivity/TestBreakpointCaseSensitivity.py
failing on the bots, and I think it might be due to a Spotlight issue
Adrian found (and fixed).
You might still have stale `.dSYM` bundles lying around from old build directories.

To fix this, you need to wipe out all old build artifacts:

- Inside of the LLDB source tree:
 $ git clean -f -d

- Globally:
 $ find / -name a.out.dSYM -exec rm -rf \{} \;

This is a long shot, but it might help you.

--
Davide


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Adrian Prantl via lldb-dev


> On Feb 6, 2018, at 9:29 AM, Davide Italiano via lldb-dev 
>  wrote:
> 
> On Tue, Feb 6, 2018 at 8:18 AM, Pavel Labath  wrote:
>> On 6 February 2018 at 15:41, Davide Italiano  wrote:
>>> On Tue, Feb 6, 2018 at 7:09 AM, Pavel Labath  wrote:
 On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
 
 So, I guess my question is: are you guys looking into making sure that
 others are also able to reproduce the 0-fail+0-xpass state? I would
 love to run the mac test suite locally, as I tend to touch a lot of
 stuff that impacts all targets, but as it stands now, I have very
 little confidence that the tests I am running reflect in any way the
 results you will get when you run the test on your end.
 
 I am ready to supply any test logs or information you need if you want
 to try to tackle this.
 
>>> 
>>> Yes, I'm definitely interested in making the testsuite
>>> working/reliable on any configuration.
>>> I was afraid there were a lot of latent issues, that's why I sent this
>>> mail in the first place.
>>> It's also the reason why I started thinking about `lldb-test` as a
>>> driver for testing, because I found the testsuite to be a little
>>> inconsistent/brittle depending on the environment it's run on (which,
>>> FWIW, doesn't happen when you run lit/FileCheck or even the unit tests
>>> in lldb). I'm not currently claiming switching to a different method
>>> would improve the situation, but it's worth a shot.
>>> 
>> 
>> Despite Zachary's claims, I do not believe this is caused by the test
>> driver (dotest). It's definitely not beautiful, but I haven't seen an
>> issue that would be caused by this in a long time. The issue is that
>> the tests are doing too much -- even the simplest involves compiling a
>> fully working executable, which pulls in a lot of stuff from the
>> environment (runtime libraries, dynamic linker, ...) that we have no
>> control of. And of course it makes it impossible to test the debugging
>> functionality of any other platform than what you currently have in
>> front of you.
>> 
>> In this sense, the current setup makes an excellent integration test
>> suite -- if you run the tests and they pass, you can be fairly
>> confident that the debugging on your system is setup correctly.
>> However, it makes a very bad regression test suite, as the tests will
>> be checking something different on each machine.
>> 
> 
> Yes, I didn't complain about "dotest" in general, but, as you say, the
> fact that it pulls in lots of stuff we don't really have control over.
> Also, most of the time I actually found that we'd been sloppy about
> watching the bots for a while, or had XFAILed tests instead of fixing
> them, and that resulted in issues piling up. This is a more general
> problem not necessarily tied to `dotest` as a driver.
> 
>> So I believe we need more lightweight tests, and lldb-test can provide
>> us with that. The main question for me (and that's something I don't
> 
> +1.
> 
>> really have an answer to) is how to make writing tests like that easy.
>> E.g. for these "foreign" language plugins, the only way to make a
>> self-contained regression test would be to check-in some dwarf which
>> mimics what the compiler in question would produce. But doing that is
>> extremely tedious as we don't have any tooling for that. Since debug
>> info is very central to what we do, having something like that would
>> go a long way towards improving the testing situation, and it would be
>> useful for C/C++ as well, as we generally need to make sure that we
>> work with a wide range of compiler versions, not just accept what ToT
>> clang happens to produce.
>> 
> 
> I think the plan here (and I'd love to spend some time on this once we
> have stability, which it seems we're slowly getting) is to enhance
> `yaml2*` to do the work for us.
> I do agree it is a major undertaking, but even spending a month on it will
> go a long way IMHO. I will try to come up with a plan after discussing
> with folks on my team (I'd also really love to get input from the DWARF
> people in llvm, e.g. Eric or David Blaikie).

The last time I looked into yaml2obj was to use it for testing llvm-dwarfdump,
and back then I concluded that it needs a lot of work to be useful even for
testing dwarfdump. In its current state it is both too low-level (e.g., you
need to manually specify all Mach-O load commands, you have to manually compute 
and specify the size of each debug info section) and too high-level (it can 
only auto-generate one exact version of .debug_info headers) to be useful.

If we could make a tool whose input roughly looks like the output of dwarfdump, 
then this might be a viable option. Note that I'm not talking about syntax but 
about the abstraction level of the contents.
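
Purely as an illustration of the abstraction level I mean (this is not an
existing format), the input could describe DIEs roughly the way dwarfdump
prints them, with offsets, abbreviations and section sizes computed by the
tool:

  DW_TAG_compile_unit
    DW_AT_producer ("clang version 5.0.0")
    DW_AT_language (DW_LANG_C_plus_plus)
    DW_TAG_variable
      DW_AT_name ("g")
      DW_AT_type ("int")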

In summary, I think this is an interesting direction to explore, but we 
shouldn't underestimate the amount of work necessary to make this useful.

-- adrian

> 
>> 
>> PS: I saw your second email as 

[lldb-dev] [6.0.0 Release] Release Candidate 2 tagged

2018-02-07 Thread Hans Wennborg via lldb-dev
Dear testers,

There's been a lot of merges since rc1, and hopefully the tests are in
a better state now.

6.0.0-rc2 was just tagged, after r324506.

Please test, let me know how it goes, and upload binaries.

Thanks,
Hans