Re: genmatch and cond vs "for (cnd cond vec_cond)" for gimple

2021-06-13 Thread Richard Biener via Gcc
On June 13, 2021 4:03:16 AM GMT+02:00, Andrew Pinski  wrote:
>On Sat, Jun 12, 2021 at 5:21 PM Andrew Pinski 
>wrote:
>>
>> On Sat, Jun 12, 2021 at 4:54 PM Andrew Pinski 
>wrote:
>> >
>> > Hi all,
>> >   While moving the simple A CMP 0 ? A : -A patterns from
>> > fold_cond_expr_with_comparison to match, I ran into an issue where
>> > using cond directly in the patterns works while "for cnd (cond
>> > vec_cond)" doesn't.
>> > It looks like in the first case we are able to correctly handle the
>> > cond's first operand being a comparison, while with the for loop we
>> > are not.
>> >
>> > That is the following additional pattern works:
>> > /* A == 0 ? A : -A  same as -A */
>> > (simplify
>> >  (cond (eq @0 zerop) @0 (negate@1 @0))
>> >   (if (!HONOR_SIGNED_ZEROS (element_mode (type)))
>> >@1))
>> > (simplify
>> >  (cond (eq @0 zerop) zerop (negate@1 @0))
>> >   (if (!HONOR_SIGNED_ZEROS (element_mode (type)))
>> >@1))
>> >
>> > While this one does not work:
>> > (for cnd (cond vec_cond)
>> > /* A == 0 ? A : -A  same as -A */
>> > (simplify
>> >  (cnd (eq @0 zerop) @0 (negate@1 @0))
>> >   (if (!HONOR_SIGNED_ZEROS (element_mode (type)))
>> >@1))
>> > (simplify
>> >  (cnd (eq @0 zerop) zerop (negate@1 @0))
>> >   (if (!HONOR_SIGNED_ZEROS (element_mode (type)))
>> >@1)))
>> >
>> >  CUT ---
>> > I will try to debug genmatch some but I wanted to get this email out
>> > to record what will need to be fixed to continue the movement of
>> > phiopt over to match.
>>
>> So the problem is we lower for loops first and then cond.  Though
>> swapping the order in genmatch's lower function causes invalid C++
>> code to be generated :(.
>> Still trying to figure out why though.
>
>I figured out why; lower_cond does not copy for_subst_vec for the new
>simplifier.  Fixing that allows swapping the order of the two lower
>functions, which is needed in this case.
>I will submit the patch for this when I submit the patch set for
>converting the simple "A CMP 0 ? A : -A" patterns of
>fold_cond_expr_with_comparison.

Hmm, it was done this way on purpose because cond lowering cannot see what is a 
cond before for lowering has run. So why does it not work with for lowered? 

I'll have a look next week. 

Richard. 

>Thanks,
>Andrew Pinski
>
>>
>> Thanks,
>> Andrew
>>
>> >
>> > Thanks,
>> > Andrew Pinski
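At the source level, the fold these patterns implement is simply the
following (a minimal, hypothetical sketch -- not code from the thread):

/* With !HONOR_SIGNED_ZEROS the whole conditional collapses to the
   negation: when a == 0.0 the two arms differ only in the sign of zero,
   and otherwise the result is -a anyway.  */
double
fold_example (double a)
{
  return a == 0.0 ? a : -a;   /* foldable to plain -a */
}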



Re: progress update after initial GSoC virtual meetup

2021-06-13 Thread Ankur Saini via Gcc



> On 08-Jun-2021, at 11:24 PM, David Malcolm  wrote:
> 
> Is there a URL for your branch?

No, currently it is only a local branch on my machine. Should I upload it to a 
hosting site (like GitHub), or can I also create a branch on the remote?

> The issue is that the analyzer currently divides calls into
> (a) calls where GCC's middle-end "knows" which function is called, and
> thus the call site has a cgraph_node.
> (b) calls where GCC's middle-end doesn't "know" which function is
> called.
> 
> The analyzer handles
>  (a) by building call and return edges in the supergraph, and
> processing them, and
>  (b) with an "unknown call" handler, which conservatively sets lots of
> state to "unknown" to handle the effects of an arbitrary call, and
> where the call doesn't get its own exploded_edge.

> 
> In this bug we have a variant of (b), let's call it (c): GCC's middle-
> end doesn't know which function is called, but the analyzer's
> region_model *does* know at a particular exploded_node.

But how will we know this at the time the supergraph is created? Aren't the 
exploded graph and region model created after the supergraph?

>  I expect this kind of thing will also arise for virtual function calls.

Yes, it would be a similar case: if the call is not devirtualised, GCC's 
middle-end would not know which function is being called, but our region model 
would.

>  So I think you should look at supergraph.cc at where it handles calls; I 
> think we
> need to update how it handles (b), so that it can handle the (c) cases,
> probably by splitting supernodes at all call sites, rather than just
> those with cgraph_edges, and then creating exploded_edges (with custom
> edge info) for calls where the analyzer "figured out" what the function
> pointer was in the region_model, even if there wasn't a cgraph_node.

> 
> Does that make sense?

OK, so we are leaving the decision of how to handle case (b) to the exploded 
graph, using the additional info from the region model, and creating call and 
return supernodes for all types of function calls whether or not the 
middle-end knows which function is called.  Makes sense.  (And this answers my 
previous question.)

I went through supergraph.cc and can see the splitting happening in the 
constructor (supergraph::supergraph()) at the end of the first pass.

> 
> Or you could attack the problem from the other direction, by looking at
> what GCC generates for a vfunc call, and seeing if you can get the
> region_model to "figure out" what the function pointer is at a
> particular exploded_node.

I will also be looking at this after fixing the above problem; my current 
plan is to see how GCC's devirtualiser does it.
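For example, I think the non-devirtualised case looks roughly like this
(a hypothetical sketch):

/* Unless GCC can devirtualise the call, 'b->f ()' is compiled as a load
   of a function pointer from the vtable followed by an indirect call, so
   the middle-end does not, in general, know which function is called,
   while the analyzer's region_model may know the dynamic type of *b.  */
struct Base { virtual void f () {} };
struct Derived : Base { void f () override {} };

void use (Base *b)
{
  b->f ();   /* indirect call through the vtable slot for f */
}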

> 
>> 
>> Also, should I prefer discussing this bug here (the gcc mailing
>> list) or on Bugzilla itself?
> 
> Either way works for me.  Maybe on this list?  (given that this feels
> like a design question)

ok

> 
> Hope this is helpful
> Dave

Thanks

- Ankur

Re: progress update after initial GSoC virtual meetup

2021-06-13 Thread David Malcolm via Gcc
On Sun, 2021-06-13 at 19:11 +0530, Ankur Saini wrote:
> 
> 
> > On 08-Jun-2021, at 11:24 PM, David Malcolm 
> > wrote:
> > 
> > Is there a URL for your branch?
> 
> No, currently it is only a local branch on my machine. Should I upload it
> to a hosting site (like GitHub), or can I also create a branch on the
> remote?

At some point we want you to be able to push patches to trunk, so as a
step towards that I think it would be good for you to have a personal
branch on the gcc git repository.

A guide to getting access is here:
  https://gcc.gnu.org/gitwrite.html

I will sponsor you.

> 
> > The issue is that the analyzer currently divides calls into
> > (a) calls where GCC's middle-end "knows" which function is called,
> > and
> > thus the call site has a cgraph_node.
> > (b) calls where GCC's middle-end doesn't "know" which function is
> > called.
> > 
> > The analyzer handles
> >  (a) by building call and return edges in the supergraph, and
> > processing them, and
> >  (b) with an "unknown call" handler, which conservatively sets lots
> > of
> > state to "unknown" to handle the effects of an arbitrary call, and
> > where the call doesn't get its own exploded_edge.
> 
> > 
> > In this bug we have a variant of (b), let's call it (c): GCC's
> > middle-
> > end doesn't know which function is called, but the analyzer's
> > region_model *does* know at a particular exploded_node.
> 
> But how will we know this at the time the supergraph is created?
> Aren't the exploded graph and region model created after the supergraph?

You are correct.

What I'm thinking is that when we create the supergraph we should split
the nodes at more calls, not just at those calls that have a
cgraph_edge, but also at those that are calls to an unknown function
pointer (or maybe even split them at *all* calls).

Then, later, when engine.cc is building the exploded_graph, the
supergraph will have a superedge for those calls, and we can create an
exploded_edge representing the call.  That way if we discover the
function pointer then (rather than having it from a cgraph_edge), we
can build exploded nodes and exploded edges that are similar to the "we
had a cgraph_edge" case.  You may need to generalize some of the event-
handling code to do this.

Does that make sense?

You might want to try building some really simple examples of this, to
make it as easy as possible to see what's happening, and to debug.
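For example, something like this might be a good starting point (a
hypothetical sketch; I haven't checked what the analyzer currently does
with it):

/* Case (c): before any optimisation the call through 'fn' is an indirect
   call, but the region_model can track that 'fn' points to 'callee' at
   the call site's exploded_node.  */
void callee (int *p) { *p = 42; }

void test (int *p)
{
  void (*fn) (int *) = callee;
  fn (p);
}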

> 
> >  I expect this kind of thing will also arise for virtual function
> > calls.
> 
> Yes, it would be a similar case: if the call is not devirtualised,
> GCC's middle-end would not know which function is being called, but our
> region model would.

Yes.

> 
> >  So I think you should look at supergraph.cc at where it handles
> > calls; I think we
> > need to update how it handles (b), so that it can handle the (c)
> > cases,
> > probably by splitting supernodes at all call sites, rather than
> > just
> > those with cgraph_edges, and then creating exploded_edges (with
> > custom
> > edge info) for calls where the analyzer "figured out" what the
> > function
> > pointer was in the region_model, even if there wasn't a
> > cgraph_node.
> 
> > 
> > Does that make sense?
> 
> OK, so we are leaving the decision of how to handle case (b) to the
> exploded graph, using the additional info from the region model, and
> creating call and return supernodes for all types of function calls
> whether or not the middle-end knows which function is called.  Makes
> sense.  (And this answers my previous question.)
> 
> I went through supergraph.cc and can see the splitting happening in
> the constructor (supergraph::supergraph()) at the end of the first
> pass.

It sounds to me like you are on the right track.

> 
> > 
> > Or you could attack the problem from the other direction, by
> > looking at
> > what GCC generates for a vfunc call, and seeing if you can get the
> > region_model to "figure out" what the function pointer is at a
> > particular exploded_node.
> 
> I will also be looking at this after fixing the above problem; my
> current plan is to see how GCC's devirtualiser does it.

OK.

> 
> > 
> > > 
> > > Also, should I prefer discussing this bug here (the gcc mailing
> > > list) or on Bugzilla itself?
> > 
> > Either way works for me.  Maybe on this list?  (given that this
> > feels
> > like a design question)
> 
> ok
> 
> > 
> > Hope this is helpful
> > Dave
> 
> Thanks
> 
> - Ankur

Great.

Let me know how you get on.

As I understand it, Google recommends that we exchange emails about our
GSoC project at least twice a week, so please do continue to report in,
whether you're making progress or if you feel you're stuck on something.

Hope this is constructive.
Dave




[PATCH 1/2] libstdc++: Count pretty-printed tuple elements from 0 not 1

2021-06-13 Thread Paul Smith via Gcc
Show 0-based offsets for std::tuple members, to match std::get.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdTuplePrinter): Don't increment
self.count until after generating the result string.
---
 libstdc++-v3/python/libstdcxx/v6/printers.py | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 550e0ecdd22..14a6d998690 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -560,16 +560,17 @@ class StdTuplePrinter:
 # Process left node and set it as head.
 self.head  = self.head.cast (nodes[0].type)

-self.count = self.count + 1
-
 # Finally, check the implementation.  If it is
 # wrapped in _M_head_impl return that, otherwise return
 # the value "as is".
 fields = impl.type.fields ()
-if len (fields) < 1 or fields[0].name != "_M_head_impl":
-return ('[%d]' % self.count, impl)
-else:
-return ('[%d]' % self.count, impl['_M_head_impl'])
+if len (fields) > 0 and fields[0].name == "_M_head_impl":
+impl = impl['_M_head_impl']
+
+out = '[%d]' % self.count
+self.count = self.count + 1
+
+return (out, impl)

 def __init__ (self, typename, val):
 self.typename = strip_versioned_namespace(typename)
--
2.28.0



[PATCH 2/2] libstdc++: Use template form for pretty-printing tuple elements

2021-06-13 Thread Paul Smith via Gcc
std::tuple elements are retrieved via std::get<> (template) not
[] (array); have the generated output string match this.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdTuplePrinter): Use <> not [].
---
The previous patch seems uncontroversial to me.  I don't know about this one:
I'm not sure if there's any precedent for this type of output, although to me
it looks better since tuple elements cannot be retrieved via array indexing.

 libstdc++-v3/python/libstdcxx/v6/printers.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 14a6d998690..0063a3185a6 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -567,7 +567,7 @@ class StdTuplePrinter:
 if len (fields) > 0 and fields[0].name == "_M_head_impl":
 impl = impl['_M_head_impl']

-out = '[%d]' % self.count
+out = '<%d>' % self.count
 self.count = self.count + 1

 return (out, impl)
--
2.28.0
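For illustration, a hypothetical example of the intended effect (only the
element labels produced by the printer are described; gdb's surrounding
output is omitted):

#include <tuple>
#include <string>

int
main ()
{
  std::tuple<int, std::string> t{42, "hi"};
  /* Printing 't' under gdb: before these patches the elements were
     labelled "[1]" and "[2]"; after the previous patch they are "[0]"
     and "[1]"; with this patch they become "<0>" and "<1>", matching
     std::get<0>(t) and std::get<1>(t).  */
  return std::get<0> (t) == 42 ? 0 : 1;
}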



Re: GCC/clang warning incompatibility with unused private member variables

2021-06-13 Thread Jason Merrill via Gcc
On Fri, Jun 11, 2021 at 4:03 PM Jason Merrill  wrote:

> On 6/11/21 3:37 PM, Markus Faehling wrote:
> > Hello,
> >
> > I'm currently facing a problem where I cannot get both gcc and clang
> > warning-free simultaneously in my project. My code looks somewhat like
> > this:
> >
> > class Test {
> >  int a_;
> >  void b() {};
> > };
> >
> > This code gives me the (usually very useful) "-Wunused-private-field"
> > warning on clang. But because I have the unused member on purpose, I
> > would like to add the [[maybe_unused]] attribute to it:
> >
> > class Test {
> >  [[maybe_unused]] int a_;
> >  void b() {};
> > };
> >
> > While this version is warning-free in clang, gcc has a "-Wattributes"
> > warning because it ignores the [[maybe_unused]] attribute. But I do not
> > want to disable either of these warnings because they are still very
> > useful in other situations.
> >
> > Would it be possible to ignore the "-Wattributes" warning for
> > [[maybe_unused]] in places where other compilers might use the attribute?
> >
> > Demonstration on godbolt.org: https://godbolt.org/z/8oT4Kr5eM
>
> You can use #pragma to disable a warning for a particular section of code:
>
> #pragma GCC diagnostic push
> #pragma GCC diagnostic ignored "-Wattributes"
> class Test {
>   [[maybe_unused]] int a_;
>   void b() {};
> };
> #pragma GCC diagnostic pop
>
> But I also agree that GCC shouldn't warn here.


I've pushed a change to trunk to stop warning about this case.

Jason


Build failure due to format-truncation

2021-06-13 Thread José Rui Faustino de Sousa via Gcc



Hi All!

While building I started to get this error:

../../gcc-master/gcc/opts.c: In function ‘void 
print_filtered_help(unsigned int, unsigned int, unsigned int, unsigned 
int, gcc_options*, unsigned int)’:
../../gcc-master/gcc/opts.c:1497:26: error: ‘  ’ directive output may be 
truncated writing 2 bytes into a region of size between 1 and 256 
[-Werror=format-truncation=]

 1497 |   "%s  %s", help, _(use_diagnosed_msg));
  |  ^~
../../gcc-master/gcc/opts.c:1496:22: note: ‘snprintf’ output 3 or more 
bytes (assuming 258) into a destination of size 256

 1496 | snprintf (new_help, sizeof new_help,
  | ~^~~
 1497 |   "%s  %s", help, _(use_diagnosed_msg));
  |   ~
cc1plus: all warnings being treated as errors

My guess is that it is due to the use of the flag "-fstrict-overflow".

I am not complaining or saying that this is a bug, but the whole code 
compiles fine with this flag except for this single instance...


Thank you very much.

Best regards,
José Rui
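For context, the class of diagnostic being reported can be reproduced
with something like this (a hypothetical sketch, not the actual opts.c
code):

#include <cstdio>

void
show (const char *msg)
{
  char help[256];
  char new_help[256];
  std::snprintf (help, sizeof help, "%s", msg);
  /* 'help' may hold up to 255 characters, so "%s  %s" can need more than
     the 256 bytes available in new_help; GCC's -Wformat-truncation
     analysis can flag the call as possibly truncating, and with -Werror
     that becomes a build failure.  */
  std::snprintf (new_help, sizeof new_help, "%s  %s", help, msg);
  std::puts (new_help);
}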



Re: replacing the backwards threader and more

2021-06-13 Thread Jeff Law via Gcc




On 6/9/2021 5:48 AM, Aldy Hernandez wrote:

Hi Jeff.  Hi folks.

What started as a foray into severing the old (forward) threader's 
dependency on evrp, turned into a rewrite of the backwards threader 
code.  I'd like to discuss the possibility of replacing the current 
backwards threader with a new one that gets far more threads and can 
potentially subsume all threaders in the future.


I won't include code here, as it will just detract from the high level 
discussion.  But if it helps, I could post what I have, which just 
needs some cleanups and porting to the latest trunk changes Andrew has 
made.


Currently the backwards threader works by traversing DEF chains 
through PHIs leading to possible paths that start in a constant. When 
such a path is found, it is checked to see if it is profitable, and if 
so, the constant path is threaded.  The current implementation is 
rather limited since backwards paths must end in a constant.  For 
example, the backwards threader can't get any of the tests in 
gcc.dg/tree-ssa/ssa-thread-14.c:


  if (a && b)
    foo ();
  if (!b && c)
    bar ();

etc.
Right.  And these kinds of cases are particularly interesting to capture 
-- not only do you remove the runtime test/compare, all the setup code 
usually dies as well.  I can't remember who, but someone added some bits 
to detect these cases in DOM a while back, and while the number of 
additional jumps threaded wasn't great, the overall impact was much 
better than we initially realized.  Instead of allowing removal of a 
single compare/branch, it typically allowed removal of a chain of 
logicals that fed the conditional.
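To make that concrete with the ssa-thread-14.c-style shape (a
hypothetical sketch, not the actual testcase):

void foo ();
void bar ();

void
example (int a, int b, int c)
{
  if (a && b)
    foo ();
  /* On the path where 'a && b' was true we know b != 0, so '!b && c'
     must be false; threading that path lets the second test and the
     logicals feeding it die.  */
  if (!b && c)
    bar ();
}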




After my refactoring patches to the threading code, it is now possible 
to drop in an alternate implementation that shares the profitability 
code (is this path profitable?), the jump registry, and the actual 
jump threading code.  I have leveraged this to write a ranger-based 
threader that gets every single thread the current code gets, plus 
90-130% more.

Sweet.



Here are the details from the branch, which should be very similar to 
trunk.  I'm presenting the branch numbers because they contain 
Andrew's upcoming relational query which significantly juices up the 
results.
Yea, I'm not surprised that the relational query helps significantly 
here.  And I'm not surprised that we can do much better with the 
backwards threader with a rewrite.


Much of the ranger design was done with the idea of using it in the 
backwards jump threader in mind.  Backwards threading is, IMHO, a much 
better way to think about the problem.  The backwards threader also has 
a much stronger region copier -- so we don't have to live with the 
various limitations of the old jump threading approach.






New threader:
  ethread:             65043   (+3.06%)
  dom:                 32450   (-13.3%)
  backwards threader:  72482   (+89.6%)
  vrp:                 40532   (-30.7%)
  Total threaded:     210507   (+6.70%)

This means that the new code gets 89.6% more jump threading 
opportunities than the code I want to replace.  In doing so, it 
reduces the amount of DOM threading opportunities by 13.3% and by 
30.7% from the VRP jump threader.  The total  improvement across the 
jump threading opportunities in the compiler is 6.70%.
This looks good at first glance.  It's worth noting that the backwards 
threader runs before the others, so, yea, as it captures more stuff I 
would expect DOM/VRP to capture fewer things.    It would be interesting 
to know the breakdown of things caught by VRP1/VRP2 and how much of that 
is secondary opportunities that are only appearing because we've done a 
better job earlier.


And just to be clear, I expect that we're going to leave some of those 
secondary opportunities on the table -- we just don't want it to be too 
many :-)  When I last looked at this my sense was wiring the backwards 
threader and ranger together should be enough to subsume VRP1/VRP2 jump 
threading.




However, these are pessimistic numbers...

I have noticed that some of the threading opportunities that DOM and 
VRP now get are not because they're smarter, but because they're 
picking up opportunities that the new code exposes.  I experimented 
with running an iterative threader, and then seeing what VRP and DOM 
could actually get.  This is too expensive to do in real life, but it 
at least shows what the effect of the new code is on DOM/VRP's abilities:


  Iterative threader:
    ethread:           65043   (+3.06%)
    dom:               31170   (-16.7%)
    thread:            86717   (+127%)
    vrp:               33851   (-42.2%)
  Total threaded:     216781   (+9.90%)

This means that the new code not only gets 127% more cases, but it 
reduces the DOM and VRP opportunities considerably (16.7% and 42.2% 
respectively).   The end result is that we have the possibility of 
getting almost 10% more jump threading opportunities in the entire 
compilation run.


(Note that the new code gets even more opportunities, but I'm only 
reporting the profitable ones that made it 

gcc-12-20210613 is now available

2021-06-13 Thread GCC Administrator via Gcc
Snapshot gcc-12-20210613 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/12-20210613/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 12 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch master 
revision 681143b9b94d7f1c88a7c34e2250865c31191959

You'll find:

 gcc-12-20210613.tar.xz   Complete GCC

  SHA256=af053fe0ffebc344ba9b9cfa8c5f96ae594dfc44ceb0eb615c24aafe88e9f442
  SHA1=ee3d81ca83b1729a5fff59254092ac5c8cdeb7be

Diffs from 12-20210606 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-12
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: replacing the backwards threader and more

2021-06-13 Thread Richard Biener via Gcc
On Mon, Jun 14, 2021 at 12:02 AM Jeff Law via Gcc  wrote:
>
>
>
> On 6/9/2021 5:48 AM, Aldy Hernandez wrote:
> > Hi Jeff.  Hi folks.
> >
> > What started as a foray into severing the old (forward) threader's
> > dependency on evrp, turned into a rewrite of the backwards threader
> > code.  I'd like to discuss the possibility of replacing the current
> > backwards threader with a new one that gets far more threads and can
> > potentially subsume all threaders in the future.
> >
> > I won't include code here, as it will just detract from the high level
> > discussion.  But if it helps, I could post what I have, which just
> > needs some cleanups and porting to the latest trunk changes Andrew has
> > made.
> >
> > Currently the backwards threader works by traversing DEF chains
> > through PHIs leading to possible paths that start in a constant. When
> > such a path is found, it is checked to see if it is profitable, and if
> > so, the constant path is threaded.  The current implementation is
> > rather limited since backwards paths must end in a constant.  For
> > example, the backwards threader can't get any of the tests in
> > gcc.dg/tree-ssa/ssa-thread-14.c:
> >
> >   if (a && b)
> > foo ();
> >   if (!b && c)
> > bar ();
> >
> > etc.
> Right.  And these kinds of cases are particularly interesting to capture
> -- not only do you remove the runtime test/compare, all the setup code
> usually dies as well.  I can't remember who, but someone added some bits
> to detect these cases in DOM a while back and while the number of
> additional jumps threaded wasn't great, the overall impact was much
> better than we initially realized.  Instead of allowing removal of a
> single compare/branch, it typically allowed removal of a chain of
> logicals that fed the conditional.
>
> >
> > After my refactoring patches to the threading code, it is now possible
> > to drop in an alternate implementation that shares the profitability
> > code (is this path profitable?), the jump registry, and the actual
> > jump threading code.  I have leveraged this to write a ranger-based
> > threader that gets every single thread the current code gets, plus
> > 90-130% more.
> Sweet.
>
> >
> > Here are the details from the branch, which should be very similar to
> > trunk.  I'm presenting the branch numbers because they contain
> > Andrew's upcoming relational query which significantly juices up the
> > results.
> Yea, I'm not surprised that the relational query helps significantly
> here.  And I'm not surprised that we can do much better with the
> backwards threader with a rewrite.
>
> Much of the ranger design was done with the idea of using it in the
> backwards jump threader in mind.  Backwards threading is, IMHO, a much
> better way to think about the problem.  The backwards threader also has
> a much stronger region copier -- so we don't have to live with the
> various limitations of the old jump threading approach.
>
>
>
> >
> > New threader:
> >  ethread:             65043   (+3.06%)
> >  dom:                 32450   (-13.3%)
> >  backwards threader:  72482   (+89.6%)
> >  vrp:                 40532   (-30.7%)
> >   Total threaded:    210507   (+6.70%)
> >
> > This means that the new code gets 89.6% more jump threading
> > opportunities than the code I want to replace.  In doing so, it
> > reduces the amount of DOM threading opportunities by 13.3% and by
> > 30.7% from the VRP jump threader.  The total  improvement across the
> > jump threading opportunities in the compiler is 6.70%.
> This looks good at first glance.  It's worth noting that the backwards
> threader runs before the others, so, yea, as it captures more stuff I
> would expect DOM/VRP to capture fewer things.  It would be interesting
> to know the breakdown of things caught by VRP1/VRP2 and how much of that
> is secondary opportunities that are only appearing because we've done a
> better job earlier.
>
> And just to be clear, I expect that we're going to leave some of those
> secondary opportunities on the table -- we just don't want it to be too
> many :-)  When I last looked at this my sense was wiring the backwards
> threader and ranger together should be enough to subsume VRP1/VRP2 jump
> threading.
>
> >
> > However, these are pessimistic numbers...
> >
> > I have noticed that some of the threading opportunities that DOM and
> > VRP now get are not because they're smarter, but because they're
> > picking up opportunities that the new code exposes.  I experimented
> > with running an iterative threader, and then seeing what VRP and DOM
> > could actually get.  This is too expensive to do in real life, but it
> > at least shows what the effect of the new code is on DOM/VRP's abilities:
> >
> >   Iterative threader:
> >     ethread:         65043   (+3.06%)
> >     dom:             31170   (-16.7%)
> >     thread:          86717   (+127%)
> >     vrp:             33851   (-42.2%)
> >   Total threaded:   216781   (+9.90%)
> >
> > This means that the new code not only gets 127% more cases, but it
> > reduce