Tree SSA If-combine optimization pass in GCC

2015-02-17 Thread Ajit Kumar Agarwal
Hello All:

I have been looking at the if-combine (if-merging) optimization pass, which 
works on the tree-SSA form of the intermediate representation. 
The pass merges IF-THEN-ELSE constructs whose condition expressions are 
congruent or similar.

If-combine applies when the two IF-THEN-ELSE constructs are contiguous. 
What happens when they are not contiguous but far apart, with other code in 
between? Does if-combine handle that case? It would require head duplication 
and tail duplication of the code in between to bring the two IF-THEN-ELSE 
constructs next to each other.
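
To make the question concrete, here is a minimal sketch of what duplicating the 
in-between code would look like at source level. The names separated, 
duplicated and middle are hypothetical, and the rewrite is only valid under the 
assumption that the code in between does not change the condition.

```cpp
#include <cassert>

// Stands in for the "code in between" the two IFs (hypothetical).
int middle(int v) { return v * 3 + 1; }

// Original shape: if (c) A; middle; if (c) B;
int separated(bool c, int v) {
    if (c) v += 10;      // first IF-THEN
    v = middle(v);       // unrelated code between the two tests
    if (c) v -= 2;       // second IF-THEN, same condition
    return v;
}

// After duplicating `middle` into both arms, the two tests of `c`
// become adjacent and collapse into a single branch.
int duplicated(bool c, int v) {
    if (c) {
        v += 10;
        v = middle(v);   // copy 1 of the in-between code
        v -= 2;
    } else {
        v = middle(v);   // copy 2 of the in-between code
    }
    return v;
}

// Both shapes must compute the same result for every input.
void check_duplication() {
    for (int v = -5; v <= 5; ++v) {
        assert(separated(true, v) == duplicated(true, v));
        assert(separated(false, v) == duplicated(false, v));
    }
}
```

The price is code growth: the in-between code now exists twice, which is why 
such a transformation would need a cost model.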

After head and tail duplication of the code in between, the IF-THEN-ELSE 
sequence becomes contiguous. Apart from this, does the tree-ssa-ifcombine pass 
consider the control flow of the bodies of the IF-THEN-ELSE? Is there any 
limitation on the control flow inside those bodies?

Could someone explain the scope of the tree-ssa-ifcombine optimization pass 
with respect to the above points?

Thoughts Please?

Thanks & Regards
Ajit


Cost Calculation on Loop Invariant on Arithmetic operations on RTL

2015-02-17 Thread Ajit Kumar Agarwal
Hello All:

I can see that the loop invariant pass in GCC on RTL considers register 
pressure and computes costs with respect to the SET destination node in RTL.

The loop invariant pass handles only address arithmetic candidates for loop 
invariance.

In the function get_inv_cost, I can see the following check guarding the update 
of comp_cost; comp_cost is also incremented for all the dependences of the def 
node of the invariant variable.

if (!inv->cheap_address || inv->def->n_addr_uses < inv->def->n_uses)
  (*comp_cost) += inv->cost + inv->eqno;

Is there any specific reason for checking that n_addr_uses is less than the 
actual number of uses for address arithmetic candidates of loop invariance? 
I think we should be more aggressive and do the cost calculation without this 
check.

One more point: the above cost calculation should not take the register 
pressure costs into account.
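
For readers following along, here is a minimal sketch of one plausible reading 
of that check. The structs are simplified, hypothetical stand-ins and not GCC's 
actual data structures; only the field names cheap_address, n_uses, n_addr_uses 
and cost are taken from the snippet above, and the eqno term is left out.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; not GCC's real types.
struct Def {
    unsigned n_uses;       // all uses of the defined register
    unsigned n_addr_uses;  // uses that appear inside a memory address
};

struct Invariant {
    bool cheap_address;    // assumed: cheap when folded into an address
    int cost;              // cost of computing the invariant once
    Def def;
};

// Adds the computation cost unless the invariant is a cheap address
// whose every use is an address use (assumed rationale: hoisting such
// an invariant saves little, since the addressing mode absorbs it).
void add_computation_cost(const Invariant& inv, int& comp_cost) {
    if (!inv.cheap_address || inv.def.n_addr_uses < inv.def.n_uses)
        comp_cost += inv.cost;
}

void check_cost_rule() {
    int c = 0;
    add_computation_cost(Invariant{true, 4, {3, 3}}, c);   // only address uses
    assert(c == 0);
    add_computation_cost(Invariant{true, 4, {3, 2}}, c);   // one non-address use
    assert(c == 4);
    add_computation_cost(Invariant{false, 4, {3, 3}}, c);  // address not cheap
    assert(c == 8);
}
```

Dropping the condition, as proposed, would make the first case count its cost 
as well.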

Thoughts Please ?

Thanks & Regards
Ajit


Re: Tree SSA If-combine optimization pass in GCC

2015-02-17 Thread Richard Biener
On Tue, Feb 17, 2015 at 9:22 AM, Ajit Kumar Agarwal wrote:
> Hello All:
>
> I can see the IF-combining (If-merging) pass of optimization on tree-ssa form 
> of intermediate representation.
> The IF-combine or merging takes of merging the IF-THEN-ELSE if the condition 
> Expr found be congruent or
> Similar.
>
> The IF-combine happens if the two IF-THEN-ELSE are contiguous to each other.
> If the IF-THEN-ELSE happens to be not contiguous but are wide apart with 
> there is code in between.
> Does the If-combine takes care of this. This requires to do the 
> head-duplication and Tail-duplication for the
> Code in between If-THEN-ELSE to bring the IF-THEN-ELSE contiguous to each 
> other.
>
> After the head and tail duplication of the code in between the IF-THEN-ElSE 
> sequence becomes contiguous
> to each other. Apart from this, Does the tree-ssa-if-combine pass considers 
> the control flow of the body
> of the IF-THEN-ELSE. Is there any limitation on control flow of the body of 
> the IF-THEN-ELSE.
>
> Can I know the scope of tree-ssa-ifcombine optimizations pass with respect to 
> the above points.
>
> Thoughts Please?

if-combine is a simple CFG + condition pattern matcher.  It does not
perform head/tail duplication.  Also there is no "control flow" in the
bodies, control flow is part of the CFG that is matched so I'm not quite
getting your last question.

if-combine was designed to accompany IL-only patterns that get
partly translated into control flow.  Like

  tem1 = name & bit1;
  tem2 = name & bit2;
  tem3 = tem1 | tem2;
  if (tem3)
    ...

vs.

  tem1 = name & bit1;
  if (tem1)
    goto x;
  else
    {
      tem2 = name & bit2;
      if (tem2)
        goto x;
    }

x:
  ...
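
The equivalence of the two shapes can be checked with a small sketch. The 
function names are hypothetical, and `return true` stands in for `goto x`; this 
only illustrates the pattern, not how the pass itself is implemented.

```cpp
#include <cassert>

// Straight-line form: a single combined test.
bool test_combined(unsigned name, unsigned bit1, unsigned bit2) {
    unsigned tem1 = name & bit1;
    unsigned tem2 = name & bit2;
    unsigned tem3 = tem1 | tem2;
    return tem3 != 0;
}

// Control-flow form: two chained tests that reach the same target.
bool test_chained(unsigned name, unsigned bit1, unsigned bit2) {
    unsigned tem1 = name & bit1;
    if (tem1)
        return true;          // "goto x"
    unsigned tem2 = name & bit2;
    if (tem2)
        return true;          // "goto x"
    return false;
}

// Exhaustively compare both forms over a small range of values.
void check_if_combine() {
    for (unsigned name = 0; name < 16; ++name)
        for (unsigned b1 = 0; b1 < 16; ++b1)
            for (unsigned b2 = 0; b2 < 16; ++b2)
                assert(test_combined(name, b1, b2) == test_chained(name, b1, b2));
}
```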

Richard.

> Thanks & Regards
> Ajit


RE: Tree SSA If-combine optimization pass in GCC

2015-02-17 Thread Ajit Kumar Agarwal


-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com] 
Sent: Tuesday, February 17, 2015 3:42 PM
To: Ajit Kumar Agarwal
Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; 
Nagaraju Mekala
Subject: Re: Tree SSA If-combine optimization pass in GCC

On Tue, Feb 17, 2015 at 9:22 AM, Ajit Kumar Agarwal wrote:
> Hello All:
>
> I can see the IF-combining (If-merging) pass of optimization on tree-ssa form 
> of intermediate representation.
> The IF-combine or merging takes of merging the IF-THEN-ELSE if the 
> condition Expr found be congruent or Similar.
>
> The IF-combine happens if the two IF-THEN-ELSE are contiguous to each other.
> If the IF-THEN-ELSE happens to be not contiguous but are wide apart with 
> there is code in between.
> Does the If-combine takes care of this. This requires to do the 
> head-duplication and Tail-duplication for the Code in between If-THEN-ELSE to 
> bring the IF-THEN-ELSE contiguous to each other.
>
> After the head and tail duplication of the code in between the 
> IF-THEN-ElSE sequence becomes contiguous to each other. Apart from 
> this, Does the tree-ssa-if-combine pass considers the control flow of the 
> body of the IF-THEN-ELSE. Is there any limitation on control flow of the body 
> of the IF-THEN-ELSE.
>
> Can I know the scope of tree-ssa-ifcombine optimizations pass with respect to 
> the above points.
>
> Thoughts Please?

>>if-combine is a simple CFG + condition pattern matcher.  It does not perform 
>>head/tail duplication.  Also there is no "control flow" in the bodies, 
>>control flow is part of the CFG that is matched so I'm not quite getting 
>>your last question.

Thanks! My last question was about control flow, such as loops, inside the 
IF-THEN-ELSE. This can arise when loop unswitching is performed and the loop 
body is placed inside the IF-THEN-ELSE. In that case, can the two IF-THEN-ELSE 
constructs be merged if the condition expressions match and the control flow 
of their bodies matches?

There are many cases in the SPEC 2006 benchmarks where if-combine could be 
enabled if the if-then-else sequence were made contiguous by performing 
head/tail duplication. 

>>if-combine was designed to accompany IL-only patterns that get partly 
>>translated into control flow.  Like

>>  tem1 = name & bit1;
>>  tem2 = name & bit2;
>>  tem3 = tem1 | tem2;
>>  if (tem3)
>>    ...

>>vs.

>>  tem1 = name & bit1;
>>  if (tem1)
>>    goto x;
>>  else
>>    {
>>      tem2 = name & bit2;
>>      if (tem2)
>>        goto x;
>>    }

>>x:
>>  ...

Thanks for the examples. This explains the scope of if-combine optimization 
pass.

Thanks & Regards
Ajit

Richard.

> Thanks & Regards
> Ajit


Re: Cost Calculation on Loop Invariant on Arithmetic operations on RTL

2015-02-17 Thread Steven Bosscher
On Tue, Feb 17, 2015 at 9:45 AM, Ajit Kumar Agarwal wrote:
> Hello All:
>
> I can see the Loop invariant pass in the GCC on RTL considering the register 
> pressure and the cost manipulation
> With respect to SET destination node in RTL.
>
> The Loop invariant takes care of only address arithmetic candidates of Loop 
> invariance.
>
> In the function get_inv_cost, I can see the following check for the updation 
> of comp_cost and the comp_cost
> Gets incremented with respect all the dependence on the def node of the 
> invariant variable.
>
> If (!inv->cheap_address || inv-def->n_addr_uses < inv->def->n_uses)
>   (*comp_cost += inv->cost + inv->eqno
>
> Is there any specific reasons of addr_uses  less than the actual uses check 
> for the address arithmetic candidate
> Of Loop invariant. I think we should be aggressive enough and can do the cost 
> calculation without this check.
>
> One more point the above cost calculation should not be considering the 
> register pressure_costs.
>
> Thoughts Please ?

When all is said and done, it's just a matter of what works best for
performance of the produced code. Heuristics and theory are often not
on the same page, even if the heuristics are "rationalized". You
should just try and see what the effects are if you change something.

Ciao!
Steven


Re: Tree SSA If-combine optimization pass in GCC

2015-02-17 Thread Richard Biener
On Tue, Feb 17, 2015 at 11:26 AM, Ajit Kumar Agarwal wrote:
>
>
> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Tuesday, February 17, 2015 3:42 PM
> To: Ajit Kumar Agarwal
> Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; 
> Nagaraju Mekala
> Subject: Re: Tree SSA If-combine optimization pass in GCC
>
> On Tue, Feb 17, 2015 at 9:22 AM, Ajit Kumar Agarwal wrote:
>> Hello All:
>>
>> I can see the IF-combining (If-merging) pass of optimization on tree-ssa 
>> form of intermediate representation.
>> The IF-combine or merging takes of merging the IF-THEN-ELSE if the
>> condition Expr found be congruent or Similar.
>>
>> The IF-combine happens if the two IF-THEN-ELSE are contiguous to each other.
>> If the IF-THEN-ELSE happens to be not contiguous but are wide apart with 
>> there is code in between.
>> Does the If-combine takes care of this. This requires to do the
>> head-duplication and Tail-duplication for the Code in between If-THEN-ELSE 
>> to bring the IF-THEN-ELSE contiguous to each other.
>>
>> After the head and tail duplication of the code in between the
>> IF-THEN-ElSE sequence becomes contiguous to each other. Apart from
>> this, Does the tree-ssa-if-combine pass considers the control flow of the 
>> body of the IF-THEN-ELSE. Is there any limitation on control flow of the 
>> body of the IF-THEN-ELSE.
>>
>> Can I know the scope of tree-ssa-ifcombine optimizations pass with respect 
>> to the above points.
>>
>> Thoughts Please?
>
>>>if-combine is a simple CFG + condition pattern matcher.  It does not perform 
>>>head/tail duplication.  Also there is no "control flow" in the bodies, 
>>>control flow is part of the CFG that is matched so I'm not quite getting 
>>>your last question.
>
> Thanks ! My last question was If there is a control flow likes  loops inside 
> the IF-THEN-ELSE, which could be possible if the Loop unswitching is performed
> and the Loop body is placed inside the IF-THEN-ELSE, then in that case the 
> two IF-THEN-ELSE can be merged if the cond expr matches and the control flow
> of the body of If-then-else matches?
>
> There are many cases in SPEC 2006 benchmarks where the IF-combine could be 
> enabled if the if-then-else sequence is made contiguous by performing
> the head/tail duplication.

I'd be curious what those cases look like.  Care to file some bugreports
with testcases?

>>>if-combine was designed to accompany IL-only patterns that get partly 
>>>translated into control flow.  Like
>
> >>  tem1 = name & bit1;
> >>  tem2 = name & bit2;
> >>  tem3 = tem1 | tem2;
> >>  if (tem3)
> >>    ...
>
> >>vs.
>
> >>  tem1 = name & bit1;
> >>  if (tem1)
> >>    goto x;
> >>  else
> >>    {
> >>      tem2 = name & bit2;
> >>      if (tem2)
> >>        goto x;
> >>    }
>
> >>x:
> >>  ...
> Thanks for the examples. This explains the scope of if-combine optimization 
> pass.
>
> Thanks & Regards
> Ajit
>
> Richard.
>
>> Thanks & Regards
>> Ajit


Re: Operator "~", decltype() and templates.

2015-02-17 Thread Jonathan Wakely
On 17 February 2015 at 15:10, Paweł Tomulik wrote:
> Is this a bug? The original program compiles with clang.

Yes, please report it as described at https://gcc.gnu.org/bugs/

In any case, "is this a bug?" questions are inappropriate on this
mailing list, they belong on the gcc-help list.


Operator "~", decltype() and templates.

2015-02-17 Thread Paweł Tomulik
Hi,

the following program does not compile with g++ 4.9.2:

#include <iostream>
#include <cstdlib>

template <typename T>
auto tt(T x) -> decltype(~x) // <-- here
{ return ~x; }

int main()
{
  std::cout << tt(10) << std::endl;
  return EXIT_SUCCESS;
}

ptomulik@tea:$ g++ -std=c++11 -g -O0 -Wall -Wextra -Werror -pedantic -o
test-gcc test.cpp
test.cpp: In function ‘int main()’:
test.cpp:10:21: error: no matching function for call to ‘tt(int)’
   std::cout << tt(10) << std::endl;
 ^
test.cpp:10:21: note: candidate is:
test.cpp:5:6: note: template<class T> decltype (~ x) tt(T)
 auto tt(T x) -> decltype(~x)
  ^
test.cpp:5:6: note:   template argument deduction/substitution failed:
test.cpp: In substitution of ‘template<class T> decltype (~ x) tt(T)
[with T = int]’:
test.cpp:10:21:   required from here
test.cpp:5:6: error: ‘x’ was not declared in this scope


This is specific to operator "~". Note that, for example, each of the
following functions compiles without a problem:


template <typename T>
auto tt(T x) -> decltype(-x)
{ return -x; }

template <typename T>
auto tt(T x) -> decltype(!x)
{ return !x; }

template <typename T>
auto tt(T x) -> decltype(+x)
{ return +x; }

template <typename T>
auto tt(T x) -> decltype(~T())
{ return ~x; }
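
In the spirit of the last variant, a possible workaround sketch is to name the 
operand through std::declval<T>() rather than through the parameter x, which 
also drops the requirement that T be default-constructible. The function name 
bit_not is hypothetical, and that this sidesteps the error on g++ 4.9.2 is an 
assumption based on the decltype(~T()) variant compiling.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>   // std::declval

// The return type is computed from an unevaluated T&& operand
// instead of the function parameter `x`.
template <typename T>
auto bit_not(T x) -> decltype(~std::declval<T>())
{ return ~x; }

// For int, the deduced return type is int (no promotion needed).
static_assert(std::is_same<decltype(bit_not(10)), int>::value,
              "bit_not(int) should return int");

void check_bit_not() {
    assert(bit_not(10) == ~10);
    assert(bit_not(0u) == ~0u);
}
```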


Is this a bug? The original program compiles with clang.

Best Regards!
-- 
Pawel Tomulik


RE: Tree SSA If-combine optimization pass in GCC

2015-02-17 Thread Ajit Kumar Agarwal


-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com] 
Sent: Tuesday, February 17, 2015 5:49 PM
To: Ajit Kumar Agarwal
Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; 
Nagaraju Mekala
Subject: Re: Tree SSA If-combine optimization pass in GCC

On Tue, Feb 17, 2015 at 11:26 AM, Ajit Kumar Agarwal wrote:
>
>
> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Tuesday, February 17, 2015 3:42 PM
> To: Ajit Kumar Agarwal
> Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli 
> Hunsigida; Nagaraju Mekala
> Subject: Re: Tree SSA If-combine optimization pass in GCC
>
> On Tue, Feb 17, 2015 at 9:22 AM, Ajit Kumar Agarwal wrote:
>> Hello All:
>>
>> I can see the IF-combining (If-merging) pass of optimization on tree-ssa 
>> form of intermediate representation.
>> The IF-combine or merging takes of merging the IF-THEN-ELSE if the 
>> condition Expr found be congruent or Similar.
>>
>> The IF-combine happens if the two IF-THEN-ELSE are contiguous to each other.
>> If the IF-THEN-ELSE happens to be not contiguous but are wide apart with 
>> there is code in between.
>> Does the If-combine takes care of this. This requires to do the 
>> head-duplication and Tail-duplication for the Code in between If-THEN-ELSE 
>> to bring the IF-THEN-ELSE contiguous to each other.
>>
>> After the head and tail duplication of the code in between the 
>> IF-THEN-ElSE sequence becomes contiguous to each other. Apart from 
>> this, Does the tree-ssa-if-combine pass considers the control flow of the 
>> body of the IF-THEN-ELSE. Is there any limitation on control flow of the 
>> body of the IF-THEN-ELSE.
>>
>> Can I know the scope of tree-ssa-ifcombine optimizations pass with respect 
>> to the above points.
>>
>> Thoughts Please?
>
>>>if-combine is a simple CFG + condition pattern matcher.  It does not perform 
>>>head/tail duplication.  Also there is no "control flow" in the bodies, 
>>>control flow is part of the CFG that is matched so I'm not quite getting 
>>>your last question.
>
> Thanks ! My last question was If there is a control flow likes  loops 
> inside the IF-THEN-ELSE, which could be possible if the Loop 
> unswitching is performed and the Loop body is placed inside the IF-THEN-ELSE, 
> then in that case the two IF-THEN-ELSE can be merged if the cond expr matches 
> and the control flow of the body of If-then-else matches?
>
> There are many cases in SPEC 2006 benchmarks where the IF-combine 
> could be enabled if the if-then-else sequence is made contiguous by 
> performing the head/tail duplication.

>>I'd be curious what those cases look like.  Care to file some bugreports with 
>>testcases?

This is not a bug; it is a performance optimization opportunity in the h264ref 
SPEC 2006 benchmark. Here is an example.


var1 = funcptr();
for (...)
{
  /* code here */
  for (...)
  {
    /* code here */
    for (...)
    {
      /* code here */

      if (*funcptr() == FastPely())
        FastPely();
      else
        (*funcptr)();

      /* There are 16 such IF statements.  */

      /* code here */
    } /* end for */
    /* code here */
  } /* end for */
  /* code here */
} /* end for */

The funcptr has two targets, FastPely() and UMVPely(). After indirect call 
promotion, the target is known to be either FastPely() or UMVPely().

The transformed code after indirect call promotion looks as follows.

var1 = funcptr();
for (...)
{
  /* code here */
  for (...)
  {
    /* code here */
    for (...)
    {
      /* code here */

      if (var1 == FastPely())
        FastPely();
      else
        UMVPely();

      /* There are 16 such IF statements.  */

      /* code here */
    } /* end for */
    /* code here */
  } /* end for */
  /* code here */
} /* end for */

After indirect call promotion, FastPely or UMVPely can be inlined, because the 
target is known to be one of the two and the call becomes a candidate for the 
inlining heuristics.
As the transformed code shows, the IF-THEN-ELSE statements (16 of them) can be 
if-combined and merged, and the calls then inlined.
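
A minimal, self-contained sketch of the promotion step follows. The function 
bodies and the PelFn signature are placeholders, and the direct call in the 
else branch is only valid under the assumption stated above that the pointer 
has exactly these two targets.

```cpp
#include <cassert>

using PelFn = int (*)(int);              // hypothetical signature

int FastPely(int x) { return x + 1; }    // placeholder body
int UMVPely(int x)  { return x * 2; }    // placeholder body

// Before: an opaque indirect call the inliner cannot see through.
int call_indirect(PelFn fn, int x) { return fn(x); }

// After promotion: compare against a known target and call directly,
// which exposes both callees to inlining.
int call_promoted(PelFn fn, int x) {
    if (fn == FastPely)
        return FastPely(x);
    else
        return UMVPely(x);   // assumes fn can only be FastPely or UMVPely
}

void check_promotion() {
    for (int x = -3; x <= 3; ++x) {
        assert(call_promoted(FastPely, x) == call_indirect(FastPely, x));
        assert(call_promoted(UMVPely, x) == call_indirect(UMVPely, x));
    }
}
```

With 16 such tests on the same pointer in one loop body, every test repeats the 
same comparison, which is what makes merging them attractive.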

Also, the code above the IF and below the for can be head duplicated or tail 
duplicated, which then makes this a candidate for three-level loop 
unswitching. It becomes a loop unswitching candidate after the if-combine or 
merging.
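
For reference, a single-level sketch of loop unswitching on a loop-invariant 
condition. The names and loop bodies are hypothetical and far simpler than the 
h264ref loops; the point is only that the test moves out of the loop and the 
loop is cloned per outcome.

```cpp
#include <cassert>
#include <vector>

// Original: the loop-invariant flag is tested on every iteration.
int sum_original(const std::vector<int>& v, bool fast) {
    int s = 0;
    for (int x : v) {
        if (fast) s += x + 1;
        else      s += x * 2;
    }
    return s;
}

// Unswitched: the test is hoisted and each arm gets its own loop copy.
int sum_unswitched(const std::vector<int>& v, bool fast) {
    int s = 0;
    if (fast) {
        for (int x : v) s += x + 1;
    } else {
        for (int x : v) s += x * 2;
    }
    return s;
}

void check_unswitching() {
    const std::vector<int> v{1, 2, 3, 4};
    assert(sum_original(v, true) == sum_unswitched(v, true));
    assert(sum_original(v, false) == sum_unswitched(v, false));
}
```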

I am planning to implement the above optimizations in GCC for the h264ref 
SPEC 2006 benchmark. This gives significant gains.
I have implemented the above optimization in the Open64 compiler, where it 
produced significant gains.

Thanks & Regards
Ajit
   
>>>if-combine was designed to accompany IL-only patterns that get partly 
>>>translated into control flow.  Like
>
>   >>tem1 = name & bit1;
>   >>tem2 = name & bit2;
>   >>tem3 = tem1 | tem2;
>   >>if (tem3)
> ...
>
>>>vs.
>
>   >>tem1 = name & bit1;
>   >>if (tem1)
>>>goto x;
>   >>else
> >>{