Re: [lldb-dev] [Openmp-dev] [llvm-dev] Updates on SVN to GitHub migration

2018-10-22 Thread David Greene via lldb-dev
I had a short side-conversation at one of the roundtables about existing
users of the subproject repositories.  It would be helpful to have
instructions about the best way to move local branches in those
repositories to the monorepo and some scripts to help with the
transition.  I know someone posted an example project a while ago with
some scripts but my sense is that those scripts were particular to that
project and maybe not generally applicable.

Once the monorepo goes live (tomorrow?), what happens to the existing
subproject mirrors?  Do they get wiped away and replaced with history
from the monorepo?  Or are entirely new mirrors created?  Or do they
just continue to mirror SVN until SVN becomes read-only?

The first option would essentially be a rewrite of history for the
subproject repositories.  We'll need to know if/when that is going to
happen.

  -David

Jonas Hahnfeld via Openmp-dev  writes:

> (+openmp-dev, they should know about this!)
>
> Recapping the "Concerns"
> (https://llvm.org/docs/Proposals/GitHubMove.html#id12) there is a
> proposal of "single-subproject Git mirrors" for people who are only
> contributing to standalone subprojects. I think this will be easy in
> the transition period, we can just continue to move the current
> official git mirrors. Will this "service" be continued after GitHub
> becomes the 'one source of truth'? I'd strongly vote for yes, but I'm
> not sure how that's going to work on a technical level.
>
> Thanks,
> Jonas
>
> On 2018-10-20 03:14, Tom Stellard via llvm-dev wrote:
>> On 10/19/2018 05:47 PM, Tom Stellard via lldb-dev wrote:
>>> TLDR: Official monorepo repository will be published on
>>> Tuesday, Oct 23, 2018.  After this date, you should modify
>>> your workflows to use the monorepo ASAP.  Current workflows
>>> will be supported for at most 1 more year.
>>>
>>> Hi,
>>>
>>> We had 2 round-tables this week at the Developer Meeting to
>>> discuss the SVN to GitHub migration, and I wanted to update
>>> the rest of the community on what we discussed.
>>>
>>> The most important outcome from that meeting is that we
>>> now have a timeline for completing the transition which looks
>>> like this:
>>>
>>
>> Step 1:
>>> Tues Oct 23, 2018:
>>>
>>> The latest monorepo prototype[1] will be moved over to the LLVM
>>> organization github project[2] and will begin mirroring the current
>>> SVN repository.  Commits will still be made to the SVN repository
>>> just as they are today.
>>>
>>> All community members should begin migrating their workflows that
>>> rely on SVN or the current git mirrors to use the new monorepo.
>>>
>>> For CI jobs or internal mirrors pulling from SVN or
>>> http://llvm.org/git/*.git you should modify them to pull from
>>> the new monorepo and also to deal with the new repository
>>> layout.
>>>
>>> For Developers, you should begin using the new monorepo
>>> for your development and using the provided scripts[3]
>>> to commit your code.  These scripts will allow you to commit
>>> to SVN from the monorepo without using git-svn.
>>>
>>>
>>
>> Sorry, I hit send before I was done.  Here is the rest of the mail:
>>
>> Step 2:
>>
>> Around the time of next year's developer meeting (1 year at the most),
>> we will turn off commit access to the SVN server and enable commit
>> access to the monorepo.  At this point the monorepo will become the
>> 'one source of truth' for the project.  Community members *must* have
>> updated their workflows by this date and are encouraged to begin
>> updating workflows ASAP.
>>
>> A lot of people asked at the developer meeting about the future
>> of bugzilla and phabricator and whether or not we will use
>> github issues and pull requests.  These are important questions,
>> but are unrelated to the migration of the code.
>>
>> We also came up with a TODO list for things we want to accomplish
>> as a community in the next year and beyond related to github.  I
>> am working on putting these into bugzilla so we can track progress
>> better and I will send a follow-up email about this.
>>
>> -Tom
>>
>>>
>>>
>>>
>>>
>>> [1] https://github.com/llvm-git-prototype/llvm
>>> [2] https://github.com/llvm/
>>> [3] https://llvm.org/docs/GettingStarted.html#for-developers-to-work-with-a-git-monorepo
>>>
>>> ___
>>> lldb-dev mailing list
>>> lldb-dev@lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>
>> ___
>> LLVM Developers mailing list
>> llvm-...@lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> ___
> Openmp-dev mailing list
> openmp-...@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] [Openmp-dev] Updates on SVN to GitHub migration

2018-10-23 Thread David Greene via lldb-dev
Thanks!  In our case we have a branch off master for each component and
we've developed there, merging from master from time to time (but never
our local branch to master).  I guess that puts us in the "multirepo
with merges" category.  We don't really care about maintaining the
history of merges from master.  Even rebasing our changes on the current
monorepo master would probably be fine (understanding that there will be
merge conflicts we'll need to resolve).

The biggest challenge will be coordinating changes to subprojects that
depend on each other.  We don't want to just rebase all changes from
subproject X and then all changes from subproject Y, because we'll end
up with unbuildable history for most of the resulting commits.  In the
foggy days of the past, I did our svn-to-git conversion and we had
similar needs, merging multiple svn repositories into one git
repository.  I had to carefully craft scripts to get the commit ordering
right.  I could probably do the same much more easily with git.

I've not used subtree merges in quite the way described so we'll have to
experiment a bit and see what works.

   -David

Justin Lebar  writes:

>> It would be helpful to have instructions about the best way to move
>> local branches in those repositories to the monorepo and some scripts
>> to help with the transition.
>
> I put together https://reviews.llvm.org/D53414.
>
> On Mon, Oct 22, 2018 at 8:31 AM David Greene via llvm-dev
>  wrote:
>
> I had a short side-conversation at one of the roundtables about
> existing users of the subproject repositories.  It would be helpful
> to have instructions about the best way to move local branches in
> those repositories to the monorepo and some scripts to help with the
> transition.  I know someone posted an example project a while ago
> with some scripts but my sense is that those scripts were particular
> to that project and maybe not generally applicable.
> 
> Once the monorepo goes live (tomorrow?), what happens to the existing
> subproject mirrors?  Do they get wiped away and replaced with history
> from the monorepo?  Or are entirely new mirrors created?  Or do they
> just continue to mirror SVN until SVN becomes read-only?
> 
> The first option would essentially be a rewrite of history for the
> subproject repositories.  We'll need to know if/when that is going to
> happen.
> 
> -David
> 
> Jonas Hahnfeld via Openmp-dev  writes:
> 
> > (+openmp-dev, they should know about this!)
> >
> > Recapping the "Concerns"
> > (https://llvm.org/docs/Proposals/GitHubMove.html#id12) there is a
> > proposal of "single-subproject Git mirrors" for people who are only
> > contributing to standalone subprojects. I think this will be easy in
> > the transition period, we can just continue to move the current
> > official git mirrors. Will this "service" be continued after GitHub
> > becomes the 'one source of truth'? I'd strongly vote for yes, but I'm
> > not sure how that's going to work on a technical level.
> >
> > Thanks,
> > Jonas
> >
> > On 2018-10-20 03:14, Tom Stellard via llvm-dev wrote:
> >> On 10/19/2018 05:47 PM, Tom Stellard via lldb-dev wrote:
> >>> TLDR: Official monorepo repository will be published on
> >>> Tuesday, Oct 23, 2018.  After this date, you should modify
> >>> your workflows to use the monorepo ASAP.  Current workflows
> >>> will be supported for at most 1 more year.
> >>>
> >>> Hi,
> >>>
> >>> We had 2 round-tables this week at the Developer Meeting to
> >>> discuss the SVN to GitHub migration, and I wanted to update
> >>> the rest of the community on what we discussed.
> >>>
> >>> The most important outcome from that meeting is that we
> >>> now have a timeline for completing the transition which looks
> >>> like this:
> >>>
> >>
> >> Step 1:
> >>> Tues Oct 23, 2018:
> >>>
> >>> The latest monorepo prototype[1] will be moved over to the LLVM
> >>> organization github project[2] and will begin mirroring the current
> >>> SVN repository.  Commits will still be made to the SVN repository
> >>> just as they are today.
> >>>
> >>> All community members should begin migrating their workflows that
> >>> rely on SVN or the current git mirrors to use the new monorepo.
> >>>
> >>> For CI jobs or internal mirrors pulling from SVN or
> >>> http://llvm.org/git/*.git you should modify them to pull from
> >>> the new monorepo and also to deal with the new repository
> >>> layout.
> >>>
> >>> For Developers, you should begin using the new monorepo
> >>> for your development and using the provided scripts[3]
> >>> to commit your code.  These scripts will allow you to commit

Re: [lldb-dev] [llvm-dev] Updates on SVN to GitHub migration

2018-10-23 Thread David Greene via lldb-dev
Apparently the GitHub UI is not great about showing tags.  If you clone
and do git tag you'll see them.  Someone at one of the dev. meeting
roundtables found a way to see them on GitHub but I don't remember the
details.

  -David

Jacob Carlborg via llvm-dev  writes:

> On 2018-10-20 02:47, Tom Stellard via llvm-dev wrote:
>> TLDR: Official monorepo repository will be published on
>> Tuesday, Oct 23, 2018.  After this date, you should modify
>> your workflows to use the monorepo ASAP.  Current workflows
>> will be supported for at most 1 more year.
>
> Looks like the prototype repository [1] is missing quite a lot of
> tags, like all of the RELEASE_NNN tags [2].
>
> [1] https://github.com/llvm-git-prototype/llvm
> [2] https://llvm.org/svn/llvm-project/cfe/tags/
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [cfe-dev] [llvm-dev] [RFC] LLVM bug lifecycle BoF - triaging

2018-10-31 Thread David Greene via lldb-dev
Richard Smith via cfe-dev  writes:

> In fact, I think it'd be entirely reasonable to subscribe cfe-dev to
> all clang bugs (fully subscribe -- email on all updates!). I don't see
> any reason whatsoever why a bug update should get *less* attention
> than non-bug development discussion.

Some of us are on space-limited machines (I'm thinking of personal
equipment, not corporate infrastructure) and getting all bug updates for
components could put a real squeeze on things.

I agree that cfe-bugs, for example, should get copied on all updates but
those updates should be opt-in.

 -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] [cfe-dev] [Call for Volunteers] Bug triaging

2018-11-08 Thread David Greene via lldb-dev
Zachary Turner via llvm-dev  writes:

> Just so I'm clear, are we going to attempt to clean up and/or merge
> the components? If we are, it makes sense to do that before we start
> putting ourselves as default CC's on the various components since they
> will just change. If not, it would be nice to get some clarification
> on that now.

I agree that we could use component cleanup and that it should happen
before assigning triagers.

> I think a good starting point would be to get rid of any component
> with less than 10 bugs reported so far this year and merge them all
> into an "Other" component.

Here's how I might organize things, given a clean slate:

Bugzilla

Build

clang/driver
clang/frontend
clang/llvm
clang/tools

compiler-rt

Documentation

libc++
libc++-abi

llvm/optimizer
llvm/codegen

lld

lldb

LNT

OpenMP

Packaging

Phabricator

Polly

Test Suite

Tools

Triage   # Replaces new-bugs

Website

These are not necessarily restricted to particular directory structures.
For example, an "lld" bug might very well be against a common library
under llvm/.  I'm trying to put forward something that would make sense
to users of llvm-based compilers as well as more casual LLVM developers
while providing some categorization for broad topics (the llvm optimizer
is very different from llvm codegen, for example).


I'm not sure what should go under "Tools."  Should it be limited to
things in llvm/tools or should it include things like clang-tidy?  Maybe
we'd want an llvm/tools component and leave Tools for user-facing tools
like the clang static analyzer.  In that case clang/tools can maybe go
away.

Just some ideas.

  -David

___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] [monorepo] Downstream branch zipping tool available

2018-11-12 Thread David Greene via lldb-dev
Building on the great work that James Knight did on
migrate-downstream-fork.py (Thanks, James!) [1], I've created a simple
tool to take migrated downstream fork branches and zip them into a
single history given a history containing submodule updates of
subprojects [2].

With migrate-downstream-fork.py, one is left with a set of unrelated
histories, one per subproject:

  llvm:         * V Add my fancy LLVM feature
  clang:        * G Fix my dumb clang bug
  compiler-rt:  * Z Merge from upstream compiler-rt

One can do an octopus merge to unify them:

  *-- Merge llvm, clang and compiler-rt
  |\ \
  * \ \  V Add my fancy LLVM feature
  |  * |  G Fix my dumb clang bug
  |  | *  Z Merge from upstream compiler-rt

Unfortunately, that doesn't show the logical history of development,
where changes were effectively applied to subprojects in a linear
fashion.  This makes it more difficult to do bisects, among other things,
because none of the downstream integration happens until the octopus
merge.

Let's say that downstream you have a local mirror for each LLVM
subproject you work on.  Suppose also that you have an "umbrella"
repository that holds submodule references to all those local mirrors.
Various commits in the umbrella update submodule references:

  * Update llvm submodule to V
  * Update clang submodule to G
  * Don't update any submodules, fix scripts or something
  * Update compiler-rt submodule to Z
  |

zip-downstream-fork.py will take these submodule updates and "inline"
them into the umbrella history, making it appear that the downstream
commits were applied against the monorepo in the order implied by the
umbrella history:

  * A Add my fancy LLVM feature
  * B Fix my dumb clang bug
  * C Merge from upstream compiler-rt
  |

Parent relationships for merges from upstream are preserved, though as
top-level comments in zip-downstream-fork.py explain, the history graph
can look a little strange.  Commits that don't update submodules are
skipped on the assumption that they modify things uninteresting to a
monorepo history.  Such commits could be preserved but doing so has some
caveats as explained in the comments.  Perhaps your umbrella repository
holds your build scripts.  You'd probably want to migrate that to the
zipped history.  If there's strong demand for this I could look into
doing it.

There are various other limitations to the tool explained in the
comments.  It was enough to get us going and I'm hopeful it will be
useful for others.  It seems to do the right thing with our repositories
but YMMV.  Feel free to open PRs with bug fixes.  :)

To get this to work, you'll need to apply a PR for
migrate-downstream-fork.py to fix issues with --revmap-out [3].

-David

[1] 
https://github.com/jyknight/llvm-git-migration/blob/master/migrate-downstream-fork.py
[2] 
https://github.com/jyknight/llvm-git-migration/pull/2/commits/a3b44a294c20f1762cb42b5794e6130c5b27f22d
[3] https://github.com/jyknight/llvm-git-migration/pull/1
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [monorepo] Downstream branch zipping tool available

2018-12-18 Thread David Greene via lldb-dev
Björn Pettersson A  writes:

> We have used llvm as the umbrella repo, so in llvm we have a "master"
> branch (from the git single repo version of llvm) and a couple of
> downstream branches (let's call them "down0", "down1") containing our
> downstream work (with frequent merges from "master").

Ok.

> The downstream branches have tools/clang and runtimes/compiler-rt as
> submodules, as well as a couple of downstream submodules.

Ok.

> In our downstream version of clang we have a similar structure.
> A "master" branch (mapping to the git single repo version clang),
> and a couple of downstream branches. The downstream branches have
> tools/extra (i.e. clang-tools-extra) as a submodule.

So the clang submodule in llvm has a submodule itself?  I wasn't even
aware that was possible.

> I can also mention that the clang, compiler-rt and clang-tools-extra
> submodules aren't present from the beginning of history. They have
> been added later on.

That shouldn't be a problem for the script.  We have the same sort of
history.

> I doubt that zip-downstream-fork.py will work out-of-the-box.
> Hopefully I'll be able to patch it for our scenario. Any guidelines
> might be helpful. But maybe it isn't even worth trying to adapt
> zip-downstream-fork.py to do something useful for our scenario?

Yeah, non-submodule-update commits in the llvm repository would be
dropped per this comment:

# - The script assumes that any commits in the umbrella history that
#   do not update submodules should be discarded.  It is not clear
#   what should happen if such a commit happens to touch files with
#   the same name as those in the monorepo (README files are typical).
#   Adding support to keep these commits should be straightforward,
#   but because decisions are likely to vary based on particular
#   setups, we just punt for now.

This happens around line 288 in zip-downstream-fork.py:

if self.prev_submodules == submodules:
  # This is a commit that modified some file in the umbrella and
  # didn't update any submodules.  Assume we don't want it.
  self.debug('No submodule updates')
  return self.substitute_commit(commit, githash)

If you return commit here instead of calling substitute_commit, it
should retain the commit unaltered.  That's not quite what you want for
the monorepo, though: you want commits to llvm to appear under the llvm
directory in the monorepo.  The code to do that is in
migrate-downstream-fork.py around line 106 in commit_filter:

# OK -- NOT an upstream commit: move the tree under the correct subdir, and
# preserve everything outside that subdir.  The tricky part is figuring out
# *which* parent to get the rest of the tree (other than the named
# subproject) from, in case of a merge.

You could try to copy this verbatim into zip-downstream-fork.py or it
could be factored out into a common library.  If a significant number of
people have a setup similar to yours, it may very well be worth doing
that.  You'd also need to add the check for upstream commits.
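
For concreteness, the first change (keeping umbrella-only commits)
amounts to a one-line edit of the snippet quoted above.  An untested
sketch, not something I've run:

if self.prev_submodules == submodules:
  # This commit modified files in the umbrella without updating any
  # submodules.  Keep it unaltered instead of dropping it (the stock
  # script calls substitute_commit here).
  self.debug('No submodule updates, keeping commit')
  return commit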

Now that I think about it, what you really want is something that runs
migrate-downstream-fork.py on the commits in llvm and something that
runs zip-downstream-fork.py on commits in other projects, but they have
to run simultaneously to keep the commits in the proper order.  If both
migrate-downstream-fork.py and zip-downstream-fork.py were refactored to
put most of their code in a package/library, then a third tool could be
created to do what you need.  Obviously, that will take some work to
accomplish.  You'd also want James' guidance on changing
migrate-downstream-fork.py.  There are certain enhancements to
zip-downstream-fork.py that I didn't make because I didn't want to mess
with migrate-downstream-fork.py (see the comments at the top of
zip-downstream-fork.py).

zip-downstream-fork.py also doesn't consider submodules of other
submodules.  You can maybe get that to work by altering how
find_submodules looks for submodule commits.  It would have to recurse
over the submodules it finds.
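
Sketched with raw git plumbing (not the script's actual code, and
assuming all subproject commits have been fetched into a single object
store, which may not match your setup), the recursion might look
something like this:

import subprocess

def git(*args):
    return subprocess.check_output(('git',) + args, text=True)

def find_submodules_recursive(commit, prefix=''):
    # 'git ls-tree -r' prints "<mode> <type> <sha>\t<path>" for each
    # entry; gitlink entries (submodule references) have type "commit".
    result = []
    for line in git('ls-tree', '-r', commit).splitlines():
        meta, path = line.split('\t', 1)
        mode, objtype, sha = meta.split()
        if objtype == 'commit':
            result.append((prefix + path, sha))
            # A submodule's own tree may contain gitlinks too (e.g.
            # clang carrying clang-tools-extra under tools/extra), so
            # descend into the referenced commit as well.
            result.extend(
                find_submodules_recursive(sha, prefix + path + '/'))
    return result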

> If someone else got a similar scenario, let me know. Perhaps we can
> do some joint effort in adapting the zipper script.

Unfortunately, I don't have any bandwidth to hack on this right now.
I'm happy to answer questions, though.

   -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] [monorepo] Downstream branch zipping tool available

2018-12-19 Thread David Greene via lldb-dev
David Greene via llvm-dev  writes:

> Now that I think about it, what you really want is something that runs
> migrate-downstream-fork.py on the commits in llvm and something that
> runs zip-downstream-fork.py on commits in other projects, but they have
> to run simultaneously to keep the commits in the proper order.

After pondering this overnight, I think a better approach might be to do
the enhancement of zip-downstream-fork.py described in the comments:

# - The script requires a history with submodule updates.  It should
#   be fairly straightforward to enhance the script to take a revlist
#   directly, ordering the commits according to the revlist.  Such a
#   revlist could be generated from an umbrella repository or via
#   site-specific mechanisms.  This would be passed to
#   fast_filter_branch.py directly, rather than generating a list via
#   expand_ref_pattern(self.reflist) in Zipper.run as is currently
#   done.  Changes would need to be made to fast_filter_branch.py to
#   accept a revlist to process directly, bypassing its invocation of
#   git rev-list within do_filter.

If that were done, then it should be possible to write a tool to
generate such a revlist from your llvm master project.  The tool would
examine each commit in llvm and if it were a commit to llvm, it would
add its hash to the revlist.  If it were a submodule update it would
traverse the gitlink to find the commit in the corresponding project
(see find_submodules_in_entry in zip-downstream-fork.py).  It would then
add that commit's hash to the revlist.  If a commit updates multiple
submodules then you just have to pick an arbitrary order.
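
A rough sketch of such a generator, using raw git plumbing rather than
fast_filter_branch.py's machinery (untested; merge and root commits
need extra handling in diff-tree, which I've elided):

import subprocess

def git(*args):
    return subprocess.check_output(('git',) + args, text=True)

def submodule_updates(commit):
    # Diff against the first parent; gitlink entries have mode 160000.
    # Yields the new submodule hash for each updated gitlink.
    for line in git('diff-tree', '-r', '--no-commit-id',
                    commit).splitlines():
        meta, path = line.split('\t', 1)
        old_mode, new_mode, old_sha, new_sha, status = \
            meta.lstrip(':').split()
        if new_mode == '160000':
            yield new_sha

def generate_revlist(umbrella_ref):
    revlist = []
    for commit in git('rev-list', '--reverse', '--first-parent',
                      umbrella_ref).split():
        updates = list(submodule_updates(commit))
        if updates:
            # A submodule update: add the referenced subproject
            # commit(s), in arbitrary order if several were updated at
            # once.  If an update moved a submodule up several commits,
            # you'd walk 'git rev-list old_sha..new_sha' here instead.
            revlist.extend(updates)
        else:
            # A plain llvm commit: add its own hash.
            revlist.append(commit)
    return revlist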

All of the code to do the traversal is already in
zip-downstream-fork.py.  You could enhance it to output a revlist in the
same way fast_filter_branch can output a revmap and have it not actually
rewrite any commits.  You would have to tell it to not skip
non-submodule-update commits as described in my previous message.

This all assumes that each submodule update only adds one new commit
from the project linked by the submodule (zip-downstream-fork.py also
makes this assumption).  If a submodule update represents moving a
submodule up multiple commits, then you'd need something that can walk
that history and add hashes to the revlist.

The more I think about it, the more it seems to me that this is the
easiest way to go.  It's much less work than refactoring two tools and
should require relatively minimal changes to migrate-downstream-fork.py.

-David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] [monorepo] Downstream repo import tool available

2019-01-09 Thread David Greene via lldb-dev
I've created a tool that takes a downstream repository and imports it
into the monorepo such that trees appear under a given subdirectory in
the monorepo:

https://github.com/jyknight/llvm-git-migration/pull/6

This is useful for downstream users who have repositories that make
heavy use of LLVM libraries and logically operate as extensions of the
LLVM ecosystem.

By default, downstream repo commits are rewritten such that the *only*
blobs in their trees are from the downstream repo.  Thus checking out
such commits will populate the workarea with *only* the subdirectory
containing the imported repository artifacts.  A post-import merge with
an upstream monorepo commit will unify the trees and result in checkouts
that populate the workarea with monorepo and downstream repo artifacts.
There are some experimental options to rewrite downstream repo trees
alongside an existing monorepo tree but they are not well-tested.

There is no effort to interleave commits from the downstream repo with
commits from other imported downstream repos, or downstream branches of
the monorepo.  That would be a separate tool, I think, and would require
some fundamental reworking of fast_filter_branch.py that I didn't want
to tackle at this point.  If such downstream repos and/or branches have
commits ordered via submodule updates in some "umbrella" repository,
then zip-downstream-fork.py can be used to import and interleave them:

https://github.com/jyknight/llvm-git-migration/pull/2

Hopefully others will find these tools useful.

 -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] [monorepo] Much improved downstream zipping tool available

2019-01-29 Thread David Greene via lldb-dev
Hi all,

I've updated the downstream fork zipping tool that I posted about last
November [1].  It is much improved in every way.  The most important
enhancements are:

- Does a better job of simplifying history

- Handles nested submodules

- Will put non-submodule-update content in a subdirectory of the
  monorepo

- Updates tags

In addition there are plenty of the requisite bug fixes.  The latest
version of the tool can be found here:

https://github.com/greened/llvm-git-migration/tree/zip

With the nested submodules and the subdirectory features, the tool can
now take a downstream llvm repository with submodules (e.g. clang in
tools/clang and so on) as an umbrella and order the commits according to
changes in llvm and its submodules.

Björn, this new version may well be able to handle the tasks you
outlined in December [2].

I've written some recipes as proposed additions to the GitHub migration
proposal [3].  If you have a different scenario, please comment there
and if it seems like a common case I can add a recipe for it so we can
all benefit from the learning.

Much of the bugfixing work was the result of some artificial histories I
created to shake out problems.  I believe it is ready for some testing
in the wild.  If you do try it, please let me know how it worked for you
and any problems you run into.  I will try to fix them.  It's easiest if
you can provide me with a test repository showing the problem but even a
verbal description of what is happening can help.

I hope this tool is helpful to the community.

 -David

[1] http://lists.llvm.org/pipermail/llvm-dev/2018-November/127704.html
[2] http://lists.llvm.org/pipermail/llvm-dev/2018-December/128620.html
[3] https://reviews.llvm.org/D56550
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [monorepo] Much improved downstream zipping tool available

2019-01-29 Thread David Greene via lldb-dev
Björn Pettersson A  writes:

> In the new monorepo UC1 may or may not be a parent to UL1.
> We could actually have something like this:
>
>   UL4->UC2->UL3->UL2->UL1->UL0->UC1
>
> Our DL1 commit should preferably have UL1 as parent after
> conversion
>
>   UL4->UC2->UL3->UL2->UL1->UL0->UC1
>                        |
>                  ...->DL1
>
> but since it also includes DC1 (via submodule reference) we
> want to zip in DC1 before DL1, right? 
>
>   UL4->UC2->UL3->UL2->UL1->UL0->UC1
>                        |
>             ...->DC1->DL1
>
> The problem is that DC1 is based on UC1, so we would get something
> like this
>
>   UL4->UC2->UL3->UL2->UL1->UL0->UC1
>                        |         |
>             ...->DC1->DL1        |
>                   ^              |
>                   |              |
>                   +--------------+
>
> Which is not correct, since then we also get the UL0 commit
> as predecessor to DL1.

To be clear, is DC1 a commit that updates the clang submodule to UC1 and
DL1 a separate local commit to llvm that merges in UL1?

When zip-downstream-fork.py runs, it *always* uses the exact trees in
use by each downstream commit, whether from submodules or the umbrella
itself.  It tries very hard to maintain the state of the trees as they
appeared in the umbrella repository.

Since in your case llvm isn't a submodule (it's the "umbrella"), DL1
will absolutely have the tree from UL1, not UL0.  This is how
migrate-downstream-fork.py works and zip-downstream-fork.py won't touch
the llvm tree since it's not a submodule.  The commit DL1 doesn't update
any submodules so it will just use the clang tree from DC1.

I haven't tested this case explicitly but I would expect the resulting
history graph to look as you diagrammed above (reformatted to make it
clear there isn't a cycle):

   UL4->UC2->UL3->UL2->UL1->UL0->UC1 <- monorepo/master
                        |         |
                        |  .------'
                        | /
                        |/
            ... ->DC1->DL1  <- zip/master

The "redundant" edge here is indicating that the state of the llvm tree
at DL1 is based on UL1, not UL0.  All other projects will be in the
state at UC1 (assuming you don't have other submodules under llvm).  I
know it looks strange but this is the best I could come up with because
in general there is no guarantee that submodule updates were in any way
correlated with when upstream commits were made (as you discovered!).
There's some discussion of this issue on the documentation I posted [1],
as well as in header comments in zip-downstream-fork.py.

The difficulty with this is that going forward, if you merge from
monorepo/master git will think you already have the changes from UL0.
There are at least two ways to work around this issue.  The first is to
just manually apply the llvm diff from UL1 to UL0 on top of zip/master
and then merge from monorepo/master after that.  The other way is to
freeze your local split repositories and merge from the upstream split
masters for all subprojects before running migrate-downstream-fork.py
and zip-downstream-fork.py.  Then everything will have the most
up-to-date trees and you should be fine going forward.  Doing such a
merge isn't possible for everyone at the time they want to migrate, but
the manual diff/patch method should suffice for those situations.  You
just have to somehow remember to do it before the next merge from
upstream.  Creating an auxiliary branch with the patch applied is one
way to remember.

I haven't really thought of a better way to handle situations like this
so I'm open to ideas!

   -David

[1] https://reviews.llvm.org/D56550
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [Openmp-dev] [Github] RFC: linear history vs merge commits

2019-01-29 Thread David Greene via lldb-dev
Tom Stellard via Openmp-dev  writes:

> As part of the migration of LLVM's source code to github, we need to update
> our developer policy with instructions about how to interact with the new git
> repository.  There are a lot of different topics we will need to discuss, but
> I would like to start by initiating a discussion about our merge commit
> policy.  Should we:
>
> 1. Disallow merge commits and enforce a linear history by requiring a
>rebase before push.
>
> 2. Allow merge commits.
>
> 3. Require merge commits and disallow rebase before push.
>
> I'm going to propose that if we cannot reach a consensus that we
> adopt policy #1, because this is essentially what we have now
> with SVN.

I agree with proposing #1 in general.  It results in a nice clean
history and will be familiar to everyone working on the project.

However, there is a place for merge commits.  If there's a bug in a
release and we want to make a point release, it might make sense to make
a commit on the release branch and then merge the release branch to
master in order to ensure the fix lands there as well.  Another option
is to commit to master first and then cherry-pick to release.  A third
option is to use the "hotfix branch" method of git flow [1], which would
result in a merge commit to the release branch and another merge commit
to master.

I've seen projects that commit to release first and then merge release
to master and also projects that commit to master and cherry-pick to
release.  I personally haven't seen projects employ the git flow hotfix
branch method rigorously.

But GitHub is read-only for almost a year, right?  So we really don't
have to decide this for a while.  I have not tried using the monorepo
and committing to SVN with it.  How does that work?  What would it do
with merge commits?

  -David

[1] https://nvie.com/posts/a-successful-git-branching-model
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] [Github] RFC: linear history vs merge commits

2019-01-30 Thread David Greene via lldb-dev
Bruce Hoult via llvm-dev  writes:

> How about:
>
> Require a rebase, followed by git merge --no-ff
>
> This creates a linear history, but with extra null merge commits
> delimiting each related series of patches.
>
> I believe it is compatible with bisect.
>
> https://linuxhint.com/git_merge_noff_option/

We've done both and I personally prefer the strict linear history by a
lot.  It's just much easier to understand a linear history.

  -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] [Github] RFC: linear history vs merge commits

2019-01-30 Thread David Greene via lldb-dev
Jeremy Lakeman via llvm-dev  writes:

> 4. Each reviewed bug / feature must be rebased onto the current "known
> good" commit, merged into a "probably good" commit, tested by build
> bots, and only then pushed to trunk. Keeping trunk's history more
> usable, with most bad patches reworked and resubmitted instead of
> reverted.

If you mean having a submitted PR trigger builds and only allow merging
if all builds pass, that may be doable.  Of course by the time it's
merged it will be against some later commit (so it should be rebased)
and there's no guarantee it will build against *that* commit.  But it
will tend to filter out most problems.

Trying to keep a branch buildable at all times is a hard problem, but
the above is probably a 90% solution.

-David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [monorepo] Much improved downstream zipping tool available

2019-01-30 Thread David Greene via lldb-dev
Björn Pettersson A  writes:

> In llvm (split) we have:
>
>   UL4->UL3->UL2->UL1->UL0
>                     \
>          ...->DL2->DL1
>
> In clang (split) we have:
>
>   UC4->UC3->UC2->UC1->UC0
>                     \
>          ...->DC2->DC1
>
>
> DL1 is a commit that updates the clang submodule to DC1 (and in this
> scenario at the same time merges UL1 and DL2 in llvm).

Ok, in that case I would expect the resulting history to look like this:

UL4->UC2->UL3->UL2->UL1->UL0->UC1  <- monorepo/master
                     |         |
                     |  .------'
                     |  /
                     | /
       ... ->DL2->DL1/DC1  <- zip/master
                  /
      ... ->DC2--'

As a submodule update, DC1 is "inlined" into DL1 and its commit message
is appended to that of DL1.  I'm presuming here that llvm never updated
the clang submodule to DC2, so it remains an independent commit.

The inlining is done assuming that submodule updates represent a single
logical change.  Submodule updates are assumed to be related to whatever
changes happen in the umbrella so they all get smushed together into one
commit.

The edge UC1->DL1 represents the use of UC1's tree for every project
*except* llvm, because clang was a submodule of llvm (and updated to DC1
which merged UC1) and no other project was a submodule in llvm.  DL1
still has the llvm tree from UL1 plus any local changes you may have
made.

Admittedly, this is tricky to understand.  Believe me, there were a lot
of headaches involved trying to figure out what the right thing to do
is.  This is my best stab at that.

I don't think I have a test that creates this kind of graph.  It would
be interesting to see if it works.  :) At the moment I'm busy with other
things.  Give it a try and see if it does what you expect.

> How does git know that it should follow the parent relation from
> DL1 to UL1 for the llvm subdir, and not the UL0->UC1->DC1->DL1
> path? I mean, if I check out commit DC1 I will see the contribution
> from UL0 in the llvm subdir, and DL1 includes the changes from DC1.

With the history above this is no longer an issue since you can't check
out DC1 as such.  It's related to the llvm tree in DL1.

Let's say we have a commit DC3 and commit DL3 updated llvm's clang
submodule to DC3.  Commit DC4 was never referenced in a submodule
update.  The graph should then look like this:

UL4->UC2->UL3->UL2->UL1->UL0->UC1  <- monorepo/master
                     |         |
                     |    .----'
                     |    /
                     |   /
... ->DL3/DC3->DL2->DL1/DC1  <- zip/master
            /\           /
... ->DC4--'  `->DC2----'

DC3 is related to DL3 so it got inlined.  DC2 has an llvm tree based on
DL3.

Hopefully, this is now clear as mud.  :)

 -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [cfe-dev] [llvm-dev] [Github] RFC: linear history vs merge commits

2019-01-31 Thread David Greene via lldb-dev
Mehdi AMINI  writes:

> What is the practical plan to enforce the lack of merges? When we
> looked into this GitHub would not support this unless also forcing
> every change to go through a pull request (i.e. no pre-receive hooks
> on direct push to master were possible). Did this change? Are we
> hoping to get support from GitHub on this?
>
> We may write this rule in the developer guide, but I fear it'll be
> hard to enforce in practice.

Again, changes aren't going through git yet, right?  Not until SVN is
decommissioned late this year (or early next).  SVN requires a strict
linear history so it handles the enforcement for now.

My personal opinion is that when SVN is decommissioned we should use pull
requests, simply because that's what's familiar to the very large
development community outside LLVM.  It will lower the bar to entry for
new contributors.  This has all sorts of implications we need to discuss
of course, but we have some time to do that.

If we don't use PRs, then we're pretty much on the honor system to
disallow merges AFAIK.

 -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [cfe-dev] [llvm-dev] [Github] RFC: linear history vs merge commits

2019-01-31 Thread David Greene via lldb-dev
 writes:

> Systems that I've seen will funnel all submitted PRs into a single queue
> which *does* guarantee that the trial builds are against HEAD and there
> are no "later commits" that can screw it up. If the trial build passes,
> the PR goes in and becomes the new HEAD.

The downside of a system like this is that when changes are going in
rapidly, the queue grows very large and it takes forever to get your
change merged.  PR builds are serialized and if a "build" means "make
sure it builds on all the Buildbots" then it takes a very long time
indeed.

There are ways to parallelize builds by speculating that PRs will build
cleanly but it gets pretty complicated quickly.

> But this would be a radical redesign of the LLVM bot system, and would
> have to wait until we're done with the GitHub migration and can spend
> a couple of years debating the use of PRs. :-)

Heh.  Fully guaranteeing buildability of a branch is not a trivial task
and will take a LOT of thought and discussion.

  -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [monorepo] Much improved downstream zipping tool available

2019-01-31 Thread David Greene via lldb-dev
Björn Pettersson A  writes:

>> Ok, in that case I would expect the resulting history to look like
>> this:
>> 
>> UL4->UC2->UL3->UL2->UL1->UL0->UC1  <- monorepo/master
>>                      |         |
>>                      |  .------'
>>                      |  /
>>                      | /
>>        ... ->DL2->DL1/DC1  <- zip/master
>>                   /
>>       ... ->DC2--'
>> 
>
> I still do not understand how that actually works technically, but maybe
> it does if you say so. But I also believe that "git log" etc on DL1/DC1
> will show that commit UL0 is part of my tree (which it isn't?). This will
> be really confusing when looking back at the history when debugging etc.

Yes, it will look like UL0 is part of your tree.  The edge from
UL1->DL1, which looks redundant, is actually there as a visual reminder
of the state of the llvm tree.

Unfortunately, git just doesn't have a good way to express the kind of
history we're creating here.  Since redundant edges are oddball in git
and git itself never creates them, I thought it would be strange enough
to stick out as a reminder.

If there's some other way to express this (Git notes?  Commit message?)
that would be more helpful, I'd be happy to consider it.

 -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [cfe-dev] [llvm-dev] [Github] RFC: linear history vs merge commits

2019-01-31 Thread David Greene via lldb-dev
Roman Lebedev  writes:

> *Does* LLVM want to switch from phabricator to github pr's?
> I personally don't recall previous discussions.
> Personally, i hope not, i hope phabricator should/will stay.

I find Phab pretty unintuitive.  I just started using it in earnest
about four months ago, so that's one datapoint from people new to it.

-David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [cfe-dev] [llvm-dev] [Github] RFC: linear history vs merge commits

2019-02-01 Thread David Greene via lldb-dev
Oh, I'm completely in favor of making bad commits much less likely.  I
simply think there is a decent solution between "let everything in" and
"don't let everything in unless its proven to work everywhere" that gets
90% of the improvement.  The complexity of guaranteeing a buildable
branch is high.  If someone wants to take that on, great!  I just don't
think we should reasonably expect a group of volunteers to do it.  :)

-David

Jeremy Lakeman  writes:

> I realise that llvm trunk can move fairly quickly. 
> So my original, but brief, suggestion was to merge the current set of
> approved patches together rather than attempting them one at a time.
> Build on a set of fast smoke test bots. If something breaks, it should
> be possible to bisect it to reject a PR and make forward progress.
> Occasionally bisecting a large set of PR's should still be less bot
> time than attempting to build each of them individually.
> Blocking the PR's due to target specific and or slow bot failures
> would be a different question.
> You could probably do this with a linear history, so long as the final
> tree is the same as the merge commit, it should still build.
> I'm just fond of the idea of trying to prevent bad commits from ever
> being merged. Since they sometimes waste everyone's time.
>
> On Fri, 1 Feb 2019 at 04:02, David Greene  wrote:
>
>  writes:
> 
> > Systems that I've seen will funnel all submitted PRs into a single
> > queue which *does* guarantee that the trial builds are against HEAD
> > and there are no "later commits" that can screw it up. If the trial
> > build passes, the PR goes in and becomes the new HEAD.
> 
> The downside of a system like this is that when changes are going in
> rapidly, the queue grows very large and it takes forever to get your
> change merged. PR builds are serialized and if a "build" means "make
> sure it builds on all the Buildbots" then it takes a very long time
> indeed.
> 
> There are ways to parallelize builds by speculating that PRs will
> build cleanly but it gets pretty complicated quickly.
> 
> > But this would be a radical redesign of the LLVM bot system, and
> > would have to wait until we're done with the GitHub migration and
> > can spend a couple of years debating the use of PRs. :-)
> 
> Heh. Fully guaranteeing buildability of a branch is not a trivial
> task and will take a LOT of thought and discussion.
> 
> -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] [GitHub] RFC: Enforcing no merge commit policy

2019-03-20 Thread David Greene via lldb-dev
Tom Stellard via llvm-dev  writes:

> 1. Have either a status check or use the "Rebase and Merge" policy for
> pull requests to disallow merge commits.  Disable direct pushes to the
> master branch and update the git-llvm script to create a pull request
> when a developer does `git llvm push` and then have it automatically
> merged if there are no merge commits.

This seems to be the least disruptive to existing developers while
maintaining the invariant we want.  It has the advantage of potentially
being extended in the future to add additional criteria for merges
(e.g. "it builds" or "it passes this set of tests").

> 2. Enforce no merge commits for pull requests, but still allow
> developers to push directly to master without checking for merge
> requests.  This is essentially a best effort approach where we avoid
> having to implement our own custom work-flow for committing, while
> accepting the possibility that someone might accidentally push a merge
> commit.

To me this is the least desirable.  I'd prefer we have one way of
getting changes into the repository.

> 3. Disable direct pushes to master, don't update the git-llvm script
> and require all developers to use pull requests, where this policy
> will be enforced.

I am completely fine with this.  It's friendlier to new contributors.
That said, I assume with #1 we wouldn't prevent users from git push-ing
their local branch to GitHub (i.e. not using git llvm push) and manually
opening a PR, so either #1 or #3 works for new developers.

> Which option do you prefer?

Maybe a slight preference for #1 as being less disruptive than #3 but I
really don't care either way.

  -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] RFC: End-to-end testing

2019-10-08 Thread David Greene via lldb-dev
[ I am initially copying only a few lists since they seem like
  the most impacted projects and I didn't want to spam all the mailing
  lists.  Please let me know if other lists should be included. ]

I submitted D68230 for review but this is not about that patch per se.
The patch allows update_cc_test_checks.py to process tests that should
check target asm rather than LLVM IR.  We use this facility downstream
for our end-to-end tests.  It strikes me that it might be useful for
upstream to do similar end-to-end testing.

Now that the monorepo is about to become the canonical source of truth,
we have an opportunity for convenient end-to-end testing that we didn't
easily have before with svn (yes, it could be done but in an ugly way).
AFAIK the only upstream end-to-end testing we have is in test-suite and
many of those codes are very large and/or unfocused tests.

With the monorepo we have a place to put lit-style tests that exercise
multiple subprojects, for example tests that ensure the entire clang
compilation pipeline executes correctly.  We could, for example, create
a top-level "test" directory and put end-to-end tests there.  Some of
the things that could be tested include:

- Pipeline execution (debug-pass=Executions)
- Optimization warnings/messages
- Specific asm code sequences out of clang (e.g. ensure certain loops
  are vectorized)
- Pragma effects (e.g. ensure loop optimizations are honored)
- Complete end-to-end PGO (generate a profile and re-compile)
- GPU/accelerator offloading
- Debuggability of clang-generated code

Each of these things is tested to some degree within their own
subprojects, but AFAIK there are currently no dedicated tests ensuring
such things work through the entire clang pipeline flow and with other
tools that make use of the results (debuggers, etc.).  It is relatively
easy to break the pipeline while the individual subproject tests
continue to pass.

I realize that some folks prefer to work on only a portion of the
monorepo (for example, they just hack on LLVM).  I am not sure how to
address those developers WRT end-to-end testing.  On the one hand,
requiring them to run end-to-end testing means they will have to at
least check out and build the monorepo.  On the other hand, it seems
less than ideal to have people developing core infrastructure and not
running tests.

I don't yet have a formal proposal but wanted to put this out to spur
discussion and gather feedback and ideas.  Thank you for your interest
and participation!

-David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [Openmp-dev] [cfe-dev] RFC: End-to-end testing

2019-10-08 Thread David Greene via lldb-dev
David Blaikie via Openmp-dev  writes:

> I have a bit of concern about this sort of thing - worrying it'll lead to
> people being less cautious about writing the more isolated tests.

That's a fair concern.  Reviewers will still need to insist on small
component-level tests to go along with patches.  We don't have to
sacrifice one to get the other.

> Dunno if they need a new place or should just be more stuff in test-suite,
> though.

There are at least two problems I see with using test-suite for this:

- It is a separate repository and thus is not as convenient as tests
  that live with the code.  One cannot commit an end-to-end test
  atomically with the change meant to be tested.

- It is full of large codes which is not the kind of testing I'm talking
  about.

Let me describe how I recently added some testing in our downstream
fork.

- I implemented a new feature along with a C source test.

- I used clang to generate asm from that test and captured the small
  piece of it I wanted to check in an end-to-end test.

- I used clang to generate IR just before the feature kicked in and
  created an opt-style test for it.  Generating this IR is not always
  straightfoward and it would be great to have better tools to do this,
  but that's another discussion.

- I took the IR out of opt (after running my feature) and created an
  llc-style test out of it to check the generated asm.  The checks are
  the same as in the original C end-to-end test.

So the tests are checking at each stage that the expected input is
generating the expected output and the end-to-end test checks that we go
from source to asm correctly.

These are all really small tests, easily runnable as part of the normal
"make check" process.

 -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-09 Thread David Greene via lldb-dev
Mehdi AMINI via cfe-dev  writes:

>> I have a bit of concern about this sort of thing - worrying it'll lead to
>> people being less cautious about writing the more isolated tests.
>>
>
> I have the same concern. I really believe we need to be careful about
> testing at the right granularity to keep things both modular and the
> testing maintainable (for instance checking vectorized ASM from a C++
> source through clang has always been considered a bad FileCheck practice).
> (Not saying that there is no space for better integration testing in some
> areas).

I absolutely disagree about vectorization tests.  We have seen
vectorization loss in clang even though related LLVM lit tests pass,
because something else in the clang pipeline changed that caused the
vectorizer to not do its job.  We need both kinds of tests.  There are
many asm tests of value beyond vectorization and they should include
component as well as end-to-end tests.

> For instance I remember asking about implementing test based on checking if
> some loops written in C source file were properly vectorized by the -O2 /
> -O3 pipeline and it was deemed like the kind of test that we don't want to
> maintain: instead I was pointed at the test-suite to add better benchmarks
> there for the end-to-end story. What is interesting is that the test-suite
> is not gonna be part of the monorepo!

And it shouldn't be.  It's much too big.  But there is a place for small
end-to-end tests that live alongside the code.

>>> We could, for example, create
>>> a top-level "test" directory and put end-to-end tests there.  Some of
>>> the things that could be tested include:
>>>
>>> - Pipeline execution (debug-pass=Executions)
>>>
>>> - Optimization warnings/messages
>>> - Specific asm code sequences out of clang (e.g. ensure certain loops
>>>   are vectorized)
>>> - Pragma effects (e.g. ensure loop optimizations are honored)
>>> - Complete end-to-end PGO (generate a profile and re-compile)
>>> - GPU/accelerator offloading
>>> - Debuggability of clang-generated code
>>>
>>> Each of these things is tested to some degree within their own
>>> subprojects, but AFAIK there are currently no dedicated tests ensuring
>>> such things work through the entire clang pipeline flow and with other
>>> tools that make use of the results (debuggers, etc.).  It is relatively
>>> easy to break the pipeline while the individual subproject tests
>>> continue to pass.
>>>
>>
>
> I'm not sure I really see much in your list that isn't purely about testing
> clang itself here?

Debugging and PGO involve other components, no?  If we want to put clang
end-to-end tests in the clang subdirectory, that's fine with me.  But we
need a place for tests that cut across components.

I could also imagine llvm-mca end-to-end tests through clang.

> Actually the first one seems more of a pure LLVM test.

Definitely not.  It would test the pipeline as constructed by clang,
which is very different from the default pipeline constructed by
opt/llc.  The old and new pass managers also construct different
pipelines.  As we have seen with various mailing list messages, this is
surprising to users.  Best to document and check it with testing.

  -David
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] [cfe-dev] RFC: End-to-end testing

2019-10-09 Thread David Greene via lldb-dev
Mehdi AMINI via llvm-dev  writes:

>> I absolutely disagree about vectorization tests.  We have seen
>> vectorization loss in clang even though related LLVM lit tests pass,
>> because something else in the clang pipeline changed that caused the
>> vectorizer to not do its job.
>
> Of course, and as I mentioned I tried to add these tests (probably 4 or 5
> years ago), but someone (I think Chandler?) was asking me at the time: does
> it affect a benchmark performance? If so why isn't it tracked there? And if
> not does it matter?
> The benchmark was presented as the actual way to check this invariant
> (because you're only vectorizing to get performance, not for the sake of it).
> So I never pursued, even if I'm a bit puzzled that we don't have such tests.

Thanks for explaining.

Our experience is that relying solely on performance tests to uncover
such issues is problematic for several reasons:

- Performance varies from implementation to implementation.  It is
  difficult to keep tests up-to-date for all possible targets and
  subtargets.
  
- Partially as a result, but also for other reasons, performance tests
  tend to be complicated, either in code size or in the numerous code
  paths tested.  This makes such tests hard to debug when there is a
  regression.

- Performance tests don't focus on the why/how of vectorization.  They
  just check, "did it run fast enough?"  Maybe the test ran fast enough
  for some other reason but we still lost desired vectorization and
  could have run even faster.

With a small asm test one can document why vectorization is desired
and how it comes about right in the test.  Then when it breaks it's
usually pretty obvious what the problem is.

They don't replace performance tests, they complement each other, just
as end-to-end and component tests complement each other.

>> Debugging and PGO involve other components, no?
>
> I don't think that you need anything else than LLVM core (which is a
> dependency of clang) itself?

What about testing that what clang produces is debuggable with lldb?
debuginfo tests do that now but AFAIK they are not end-to-end.

> Things like PGO (unless you're using frontend instrumentation) don't even
> have anything to do with clang, so we may get into the situation David
> mentioned where we would rely on clang to test LLVM features, which I find
> non-desirable.

We would still expect component-level tests.  This would be additional
end-to-end testing, not replacing existing testing methodology.  I agree
the concern is valid, but shouldn't code review discover such problems?

>> > Actually the first one seems more of a pure LLVM test.
>>
>> Definitely not.  It would test the pipeline as constructed by clang,
>> which is very different from the default pipeline constructed by
>> opt/llc.
>
> I am not convinced it is "very" different (they are using the
> PassManagerBuilder AFAIK); I am only aware of very subtle differences.

I don't think clang exclusively uses PassManagerBuilder but it's been a
while since I looked at that code.

> But more fundamentally: *should* they be different? I would want `opt -O3`
> to be able to reproduce 1-1 the clang pipeline.

Which clang pipeline?  -O3?  -Ofast?  opt currently can't do -Ofast.  I
don't *think* -Ofast affects the pipeline itself but I am not 100%
certain.

> Isn't it the role of LLVM PassManagerBuilder to expose what is the "-O3"
> pipeline?

Ideally, yes.  In practice, it's not.

> If we see the PassManagerBuilder as the abstraction for the pipeline, then
> I don't see what testing belongs to clang here; this seems like a layering
> violation (and when maintaining the PassManagerBuilder in LLVM I wouldn't
> want to have to update the tests of all the subprojects using it because
> they retest the same feature).

If nothing else, end-to-end testing of the pipeline would uncover
layering violations.  :)

>> The old and new pass managers also construct different
>> pipelines.  As we have seen with various mailing list messages, this is
>> surprising to users.  Best to document and check it with testing.
>>
>
> Yes: both old and new pass managers are LLVM components, so hopefully that
> are documented and tested in LLVM :)

But we have nothing to guarantee that what clang does matches what opt
does.  Currently they do different things.

 -David


Re: [lldb-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-09 Thread David Greene via lldb-dev
Mehdi AMINI via cfe-dev  writes:

> I don't think these particular tests are the most controversial though, and
> it is really still fairly "focused" testing. I'm much more curious about
> larger end-to-end scope: for instance since you mention debug info and
> LLDB, what about a test that would verify that LLDB can print a particular
> variable's content from a test that would come as a source program, for
> instance. Such tests are valuable in the absolute; it isn't clear to me
> that we could in practice block any commit that would break such a test,
> though: a bug fix or an improvement in one of the passes may be perfectly
> correct in isolation but make the test fail by exposing a bug where we
> are already losing some debug info precision in a totally unrelated part
> of the codebase.
> I wonder how you see this managed in practice: would you gate any change
> to InstCombine (or another mid-level pass) on not regressing any of the
> debug-info quality tests on any of the backends, and from any frontend
> (not only clang)? Or worse: a middle-end change that ends up with a
> slightly different DWARF construct on this particular test, which would
> trip LLDB but not GDB (basically exposing a bug in LLDB). Should we
> require the contributor of the InstCombine change to debug LLDB and fix
> it first?

Good questions!  I think for situations like this I would tend toward
allowing the change and the test would alert us that something else is
wrong.  At that point it's probably a case-by-case decision.  Maybe we
XFAIL the test.  Maybe the fix is easy enough that we just do it and the
test starts passing again.  What's the policy for breaking current tests
when the change itself is fine but exposes a problem elsewhere (adding
an assert, for example)?

   -David


Re: [lldb-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-09 Thread David Greene via lldb-dev
Philip Reames via cfe-dev  writes:

> A challenge we already have - as in, I've broken these tests and had to
> fix them - is that an end-to-end test which checks either IR or assembly
> ends up being extraordinarily fragile.  Completely unrelated profitable
> transforms create small differences which cause spurious test failures.
> This is a very real issue today with the few end-to-end clang tests we
> have, and I am extremely hesitant to expand those tests without giving
> this workflow problem serious thought.  If we don't, this could bring
> development on middle-end transforms to a complete stop.  (Not kidding.)

Do you have a pointer to these tests?  We literally have tens of
thousands of end-to-end tests downstream and while some are fragile, the
vast majority are not.  A test that, for example, checks the entire
generated asm for a match is indeed very fragile.  A test that checks
whether a specific instruction/mnemonic was emitted is generally not, at
least in my experience.  End-to-end tests require some care in
construction.  I don't think update_llc_test_checks.py-type operation is
desirable.
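
To make the distinction concrete, a hedged sketch (illustrative flags
and target, not an existing test): checking one mnemonic pins down only
the property under test, whereas matching the whole function body breaks
on any unrelated scheduling or register-allocation change.

  // RUN: %clang -O2 -target x86_64-unknown-linux-gnu -mfma \
  // RUN:   -ffp-contract=fast -S -o - %s | FileCheck %s

  double fma_test(double a, double b, double c) {
    return a * b + c;   // Expect a fused multiply-add instruction.
  }
  // CHECK-LABEL: fma_test:
  // CHECK: vfmadd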

Still, you raise a valid point and I think present some good options
below.

> A couple of approaches we could consider:
>
>  1. Simply restrict end-to-end tests to crash/assert cases.  (i.e. no
> property of the generated code is checked, other than that it is
> generated)  This isn't as restrictive as it sounds when combined
> with coverage-guided fuzzer corpora.

I would be pretty hesitant to do this but I'd like to hear more about
how you see this working with coverage/fuzzing.

>  2. Auto-update all diffs, but report them to a human user for
> inspection.  This ends up meaning that tests never "fail" per se,
> but that individuals who have expressed interest in particular tests
> get an automated notification and a chance to respond on list with a
> reduced example.

That's certainly workable.

>  3. As a variant on the former, don't auto-update tests, but only inform
> the *contributor* of an end-to-end test of a failure. Responsibility
> for determining failure vs false positive lies solely with them, and
> normal channels are used to report a failure after it has been
> confirmed/analyzed/explained.

I think I like this best of the three but it raises the question of what
happens when the contributor is no longer contributing.  Who's
responsible for the test?  Maybe it just sits there until someone else
claims it.

> I really think this is a problem we need to think through and find a
> workable solution for before end-to-end testing as proposed becomes a
> practically workable option.

Noted.  I'm very happy to have this discussion and work the problem.

 -David


Re: [lldb-dev] [llvm-dev] [cfe-dev] RFC: End-to-end testing

2019-10-10 Thread David Greene via lldb-dev
Florian Hahn via llvm-dev  writes:

>> - Performance varies from implementation to implementation.  It is
>>  difficult to keep tests up-to-date for all possible targets and
>>  subtargets.
>
> Could you expand a bit more what you mean here? Are you concerned
> about having to run the performance tests on different kinds of
> hardware? In what way do the existing benchmarks require keeping
> up-to-date?

We have to support many different systems and those systems are always
changing (new processors, new BIOS, new OS, etc.).  Performance can vary
widely day to day from factors completely outside the compiler's
control.  As the performance changes you have to keep updating the tests
to expect the new performance numbers.  Relying on performance
measurements to ensure something like vectorization is happening just
isn't reliable in our experience.

> With tests checking ASM, wouldn’t we end up with lots of checks for
> various targets/subtargets that we need to keep up to date?

Yes, that's true.  But the only thing that changes the asm generated is
the compiler.

> Just considering AArch64 as an example, people might want to check the
> ASM for different architecture versions and different vector
> extensions and different vendors might want to make sure that the ASM
> on their specific cores does not regress.

Absolutely.  We do a lot of that sort of thing downstream.

>> - Partially as a result, but also for other reasons, performance tests
>>  tend to be complicated, either in code size or in the numerous code
>>  paths tested.  This makes such tests hard to debug when there is a
>>  regression.
>
> I am not sure they have to. Have you considered adding the small test
> functions/loops as micro-benchmarks using the existing google
> benchmark infrastructure in test-suite?

We have tried nightly performance runs using LNT/test-suite and have
found it to be very unreliable, especially the microbenchmarks.

> I think that might be able to address the points here relatively
> adequately. The separate micro benchmarks would be relatively small
> and we should be able to track down regressions in a similar fashion
> as if it would be a stand-alone file we compile and then analyze the
> ASM. Plus, we can easily run it and verify the performance on actual
> hardware.

A few of my colleagues really struggled to get consistent results out of
LNT.  They asked for help and discussed with a few upstream folks, but
in the end were not able to get something reliable working.  I've talked
to a couple of other people off-list and they've had similar
experiences.  It would be great if we have a reliable performance suite.
Please tell us how to get it working!  :)

But even then, I still maintain there is a place for the kind of
end-to-end testing I describe.  Performance testing would complement it.
Neither is a replacement for the other.

>> - Performance tests don't focus on the why/how of vectorization.  They
>>  just check, "did it run fast enough?"  Maybe the test ran fast enough
>>  for some other reason but we still lost desired vectorization and
>>  could have run even faster.
>> 
>
> If you would add a new micro-benchmark, you could check that it
> produces the desired result when adding it. The runtime-tracking
> should cover cases where we lost optimizations. I guess if the
> benchmarks are too big, additional optimizations in one part could
> hide lost optimizations somewhere else. But I would assume this to be
> relatively unlikely, as long as the benchmarks are isolated.

Even then I have seen small performance tests vary widely due to system
issues (see above).  Again, there is a place for them but they are not
sufficient.

> Also, checking the assembly for vector code does also not guarantee
> that the vector code will be actually executed. So for example by just
> checking the assembly for certain vector instructions, we might miss
> that we regressed performance, because we messed up the runtime checks
> guarding the vector loop.

Oh absolutely.  Presumably such checks would be included in the test or
would be checked by a different test.  As always, tests have to be
constructed intelligently.  :)
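
For instance, a sketch of what that could look like (illustrative; a
real test would match the guard more precisely than a bare "cmp"):

  // RUN: %clang -O2 -target x86_64-unknown-linux-gnu -S -o - %s \
  // RUN:   | FileCheck %s

  // No restrict qualifiers: a runtime alias check must guard the
  // vector loop, so the test pins down both the guard and the body.
  void add(float *a, float *b, float *c, int n) {
    for (int i = 0; i < n; ++i)
      c[i] = a[i] + b[i];
  }
  // CHECK-LABEL: add:
  // CHECK: cmp
  // CHECK: addps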

  -David


Re: [lldb-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-10 Thread David Greene via lldb-dev
Florian Hahn via cfe-dev  writes:

> Have you considered alternatives to checking the assembly for ensuring
> vectorization or other transformations? For example, instead of
> checking the assembly, we could check LLVM’s statistics or
> optimization remarks.

Yes, absolutely.  We have tests that do things like that.  I don't want
to focus on the asm bit, that's just one type of test.  The larger issue
is end-to-end tests that ensure the compiler and other tools are working
correctly, be it from checking messages, statistics, asm or something
else.

> This idea of leveraging statistics and optimization remarks to track
> the impact of changes on overall optimization results is nothing new
> and I think several people already discussed it in various forms. For
> regular benchmark runs, in addition to tracking the existing
> benchmarks, we could also track selected optimization remarks
> (e.g. loop-vectorize, but not necessarily noisy ones like gvn) and
> statistics. Comparing those run-to-run could potentially highlight new
> end-to-end issues on a much larger scale, across all existing
> benchmarks integrated in test-suite. We might be able to detect loss
> in vectorization pro-actively, instead of requiring someone to file a
> bug report and then we add an isolated test after the fact.

That's an interesting idea!  I would love to get more use out of
test-suite.
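
For what it's worth, remark-based checks also work in small lit-style
tests today (hypothetical example; -Rpass=loop-vectorize is the clang
flag, and the remark text is abbreviated and may change):

  // RUN: %clang -O2 -Rpass=loop-vectorize -c %s -o /dev/null 2>&1 \
  // RUN:   | FileCheck %s

  void scale(float *restrict a, int n) {
    for (int i = 0; i < n; ++i)
      a[i] *= 2.0f;
  }
  // CHECK: vectorized loop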

   -David


Re: [lldb-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-10 Thread David Greene via lldb-dev
Renato Golin via cfe-dev  writes:

> I'd recommend trying to move any e2e tests into the test-suite and
> make it easier to run, and leave specific tests only in the repo (to
> guarantee independence of components).

That would be a shame.  Where is test-suite run right now?  Are there
bots?  How are regressions reported?

> The last thing we want is to create direct paths from front-ends to
> back-ends and make LLVM IR transformation less flexible.

I'm not sure I follow.  Can you explain this a bit?

-David


Re: [lldb-dev] [Openmp-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-11 Thread David Greene via lldb-dev
"Robinson, Paul via Openmp-dev"  writes:

> David Greene, will you be at the LLVM Dev Meeting? If so, could you sign
> up for a Round Table session on this topic?  Obviously lots to discuss
> and concerns to be addressed.

That's a great idea.  I will be there.  I'm also hoping to help run a
round table on complex types, so we'll need times that don't conflict.
What times work well for folks?

> (1) Executable tests. These obviously require an execution platform; for
> feasibility reasons this means host==target and the guarantee of having
> a linker (possibly but not necessarily LLD) and a runtime (possibly but
> not necessarily including libcxx).  Note that the LLDB tests and the 
> debuginfo-tests project already have this kind of dependency, and in the
> case of debuginfo-tests, this is exactly why it's a separate project.

Ok.  I'd like to learn more about debuginfo-tests and how they're set
up.

> (2) Non-executable tests.  These are near-identical in character to the
> existing clang/llvm test suites and I'd expect lit to drive them.  The 
> only material difference from the majority(*) of existing clang tests is 
> that they are free to depend on LLVM features/passes.  The only difference 
> from the majority of existing LLVM tests is that they have [Obj]{C,C++} as 
> their input source language.

Right.  These are the kinds of tests I've been thinking about.

  -David


Re: [lldb-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-11 Thread David Greene via lldb-dev
Renato Golin via cfe-dev  writes:

> On Thu, 10 Oct 2019 at 22:26, David Greene  wrote:
>> That would be a shame.  Where is test-suite run right now?  Are there
>> bots?  How are regressions reported?
>
> There is no shame in making the test-suite better.

That's not what I meant, sorry.  I mean it would be a shame not to be
able to put end-to-end tests next to the code they test.  Tests that are
separated from code either tend to not get written/committed or tend to
not get run pre-merge.

> We do have bots running them in full CI for multiple targets, yes.
> Regressions are reported and fixed. The benchmarks are also followed
> by a smaller crowd, and regressions on those are also fixed (but more
> slowly).

How are regressions reported?  On llvm-dev?

> I'm not proposing to move e2e off to a dark corner, I'm proposing to
> have a scaled testing strategy that can ramp up and down as needed,
> without upsetting the delicate CI and developer balance.

I don't think I quite understand what you mean.  CI is part of
development.  If tests break we have to fix them, regardless of where
the tests live.

> Sure, e2e tests are important, but they need to catch bugs that the
> other tests don't catch, not be our front-line safety net.

Of course.

> A few years back there was a big effort to clean up the LIT tests from
> duplicates and speed up inefficient code, and a lot of tests were
> removed. If we just add the e2e tests today and they never catch anything
> relevant, they'll be the next candidates to go.

I'm confused.  On the one hand you say you don't want to put e2e tests
in a dark corner, but here you speculate they could be easily removed.
Presumably a test was added because there was some failure that other
tests did not catch.  It's true that once a test is fixed it's very
likely it will never break again.  Is that a reason to remove tests?

If something end-to-end breaks in the field, it would be great if we
could capture a component-level test for it.  That would be my first
goal.  I agree it can be tempting to stop at an e2e test level and not
investigate further down.  We definitely want to avoid that.  My guess
is that over time we'll gain experience of what a good e2e test is for
the LLVM project.

> The delta that e2e can test is really important, but really small and
> fairly rare. So running it less frequent (every few dozen commits)
> will most likely be enough for anything we can possibly respond to
> upstream.

I think that's probably reasonable.

> Past experiences have, over and over, shown us that new shiny CI toys
> get rusty, noisy, and dumped.

I don't think e2e testing is shiny or new.  :)

> We want to have the tests, in a place anyone can test, that the bots
> *will* test periodically, and that don't annoy developers often enough
> to be a target.

What do you mean by "annoy?"  Taking too long to run?

> In a nutshell:
>  * We still need src2src tests, to ensure connection points (mainly
> IR) are canonical and generic, avoiding hidden contracts

Yes.

>  * We want the end2end tests to *add* coverage, not overlap with or
> replace existing tests

Yes, but I suspect people will disagree about what constitutes
"overlap."

>  * We don't want those tests to become a burden to developers by
> breaking on unrelated changes and making bots red for obscure reasons

Well, tests are going to break.  If a test is too fragile it should be
fixed or removed.

>  * We don't want them to be a burden to our CI efforts, slowing down
> regular LIT testing and becoming a target for removal

I certainly think less frequent running of tests could help with that if
it proves to be a burden.

> The orders of magnitude for number of commits we want to run tests are:
>  * LIT base, linker, compiler-RT, etc: ~1
>  * Test-suite correctness, end-2-end: ~10
>  * Multi-stage build, benchmarks: ~100
>
> We already have that ratio (somewhat) with buildbots, so it should be
> simple to add e2e to the test suite at the right scale.

Would it be possible to keep them in the monorepo but have bots that
exercise those tests at the test-suite frequency?  I suspect that if e2e
tests live in test-suite very few people will ever run them before
merging to master.

>> > The last thing we want is to create direct paths from front-ends to
>> > back-ends and make LLVM IR transformation less flexible.
>>
>> I'm not sure I follow.  Can you explain this a bit?
>
> Right, I had written a long paragraph about it but deleted in the
> final version of my email. :)
>
> The main point is that we want to avoid hidden contracts between the
> front-end and the back-end.
>
> We want to make sure all front-ends can produce canonical IR, and that
> the middle-end can optimise the IR and that the back-end can lower
> that to asm in a way that runs correctly on the target. As we have
> multiple back-ends and are soon to have a second official front-end,
> we want to make sure we have good coverage on the multi-step tests
> (AST to IR, IR to asm, etc).


Re: [lldb-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-11 Thread David Greene via lldb-dev
Sean Silva via cfe-dev  writes:

>> We have to support many different systems and those systems are always
>> changing (new processors, new BIOS, new OS, etc.).  Performance can vary
>> widely day to day from factors completely outside the compiler's
>> control.  As the performance changes you have to keep updating the tests
>> to expect the new performance numbers.  Relying on performance
>> measurements to ensure something like vectorization is happening just
>> isn't reliable in our experience.
>
> Could you compare performance with vectorization turned on and off?

That might catch more things but now you're running tests twice and it
still won't catch some cases.

-David


Re: [lldb-dev] [Openmp-dev] [llvm-dev] [cfe-dev] RFC: End-to-end testing

2019-10-16 Thread David Greene via lldb-dev
Renato Golin via Openmp-dev  writes:

> But if we have some consensus on doing a clean job, then I would
> actually like to have that kind of intermediary check (diagnostics,
> warnings, etc) on most test-suite tests, which would cover at least
> the main vectorisation issues. Later, we could add more analysis
> tools, if we want.

I think this makes a lot of sense.

> It would be as simple as adding CHECK lines on the execution of the
> compilation process (in CMake? Make? wrapper?) and keeping the check
> files with the tests / per file.

Yep.

> I think we're on the same page regarding almost everything, but
> perhaps I haven't been clear enough on the main point, which I think
> it's pretty simple. :)

Personally, I still find source-to-asm tests to be highly valuable and I
don't think we need test-suite for that.  Such tests don't (usually)
depend on system libraries (headers may occasionally be an issue but I
would argue that the test is too fragile in that case).

So maybe we separate concerns.  Use test-suite to do the kind of
system-level testing you've discussed but still allow some tests in a
monorepo top-level directory that test across components but don't
depend on system configurations.

If people really object to a top-level monorepo test directory I guess
they could go into test-suite but that makes it much more cumbersome to
run what really should be very simple tests.

   -David


Re: [lldb-dev] [Openmp-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-16 Thread David Greene via lldb-dev
"Robinson, Paul via Openmp-dev"  writes:

>> I always ran check-all before every patch, FWIW.
>
> Yep.  Although I run check-all before *starting* on a patch, to make sure
> the starting point is clean.  It usually is, but I've been caught enough
> times to be slightly wary.

This is interesting.  I literally have never seen a clean check-all.  I
suspect that is because we have more components built than (most?)
others along with multiple targets.

   -David


Re: [lldb-dev] [Openmp-dev] [cfe-dev] [llvm-dev] RFC: End-to-end testing

2019-10-16 Thread David Greene via lldb-dev
> I'm inclined to the direction suggested by others that the monorepo is
> orthogonal to this issue and top-level tests might not be the right thing.
>
> lldb already does end-to-end testing in its tests, for instance.
>
> Clang does in some tests (the place I always hit is anything that's
> configured API-wise on the MCContext - there's no way to test that
> configuration on the clang boundary, so the only test that we can write is
> one that tests the effect of that API/programmatic configuration done by
> clang to the MCContext (function sections, for instance) - in some cases
> I've just skipped the testing, in others I've written the end-to-end test
> in clang (& an LLVM test for the functionality that uses llvm-mc or
> similar)).

I'd be totally happy putting such tests under clang.  This whole
discussion was spurred by D68230 where some noted that previous
discussion had determined we didn't want source-to-asm tests in clang
and the test update script explicitly forbade it.

If we're saying we want to reverse that decision, I'm very glad!

-David


Re: [lldb-dev] [llvm-dev] [Openmp-dev] [cfe-dev] RFC: End-to-end testing

2019-10-17 Thread David Greene via lldb-dev
David Blaikie via llvm-dev  writes:

> & I generally agree that end-to-end testing should be very limited - but
> there are already some end-to-end-ish tests in clang and I don't think
> they're entirely wrong there. I don't know much about the vectorization
> tests - but any test that requires a tool to maintain/generate makes me a
> bit skeptical and doubly-so if we were testing all of those end-to-end too.
> (I'd expect maybe one or two sample/example end-to-end tests, to test
> certain integration points, but exhaustive testing would usually be left to
> narrower tests (so if you have one subsystem with three codepaths {1, 2, 3}
> and another subsystem with 3 codepaths {A, B, C}, you don't test the full
> combination of {1, 2, 3} X {A, B, C} (9 tests), you test each set
> separately, and maybe one representative sample end-to-end (so you end up
> with maybe 7-8 tests))

That sounds reasonable.  End-to-end tests are probably going to be very
much a case-by-case thing.  I imagine we'd start with the component
tests as is done today and then if we see some failure in end-to-end
operation that isn't covered by the existing component tests we'd add an
end-to-end test.  Or maybe we create some new component tests to cover
it.

> Possible I know so little about the vectorization issues in particular that
> my thoughts on testing don't line up with the realities of that particular
> domain.

Vectorization is only one small part of what I imagine we'd want to test
in an end-to-end fashion.  There are lots of examples of "we want this
code generated" beyond vectorization.

   -David


Re: [lldb-dev] [llvm-dev] [cfe-dev] [Openmp-dev] RFC: End-to-end testing

2019-10-17 Thread David Greene via lldb-dev
Mehdi AMINI via llvm-dev  writes:

> The main thing I see that will justify push-back on such test is the
> maintenance: you need to convince everyone that every component in LLVM
> must also maintain (update, fix, etc.) the tests that are in other
> components (clang, flang, other future subprojects, etc.). Changing the
> vectorizer in the middle-end may now require understanding what a test
> written in Fortran (or Haskell?) is checking in some Hexagon assembly.
> This is a non-trivial burden when you compute the full matrix of
> possible frontends and backends.

That's true.  But don't we want to make sure the complete compiler works
as expected?  And don't we want to be alerted as soon as possible if
something breaks?  To my knowledge we have very few end-to-end tests of
the type I've been thinking about.  That worries me.

> Even if you write very small tests for checking vectorization, what is
> next? What about unrolling, inlining, loop-fusion, etc.? Why would we
> limit the end-to-end FileCheck testing to vectorization?

I actually think vectorization is probably lower on the concern list for
end-to-end testing than more focused things like FMA generation,
prefetching, and so on.  This is because there isn't a lot after the
vectorization pass that can mess up vectorization.  Once something is
vectorized, it is likely to stay vectorized.  On the other hand, I have
frequently seen prefetches dropped or poorly scheduled by passes that
run long after the prefetch got inserted into the IR.
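
A sketch of the kind of prefetch check I have in mind (illustrative
target and test; __builtin_prefetch defaults to a high-locality read
prefetch, which x86 lowers to prefetcht0):

  // RUN: %clang -O2 -target x86_64-unknown-linux-gnu -S -o - %s \
  // RUN:   | FileCheck %s

  void sum(float *a, float *s, int n) {
    for (int i = 0; i < n; ++i) {
      __builtin_prefetch(&a[i + 64]);   // Prefetch well ahead of use.
      *s += a[i];
    }
  }
  // The prefetch must survive the rest of the pipeline into the asm.
  // CHECK: prefetcht0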

> So the monorepo vs the test-suite seems like a false dichotomy: if such
> tests don't make it in the monorepo it will be (I believe) because folks
> won't want to maintain them. Putting them "elsewhere" is fine but it does
> not solve the question of the maintenance of the tests.

Agree 100%.

  -David