date:20141106

LTO IPA inline decisions in GCC trunk.

2014-11-06 Thread Venkataramanan Kumar

Hi Honza,

I experimented building Coremark with both PGO and LTO at -O3 level on
Aarch64 machine.  First I generated profiles using the recommended
seeds in Coremark's readme.txt. Then compiled again with -O3 -flto and
-fprofile-use.

I tried using GCC Linaro compiler (september) which is based on FSF
4.9 and GCC trunk 30-sep-2014.

With linaro compiler perf events show 5% less instruction counts
compared to the GCC trunk version I used.

I looked at the generated code and seeing that IPA inlining have
changed between linaro and trunk.

Linaro compiler does not seem to inline a function called "crcu32",
but trunk inlines it but does not inline "crcu16. Also trunk does not
detect an IPA indirect inlining on a function called "cmp_complex".

The number of partitions for ltrans is 3 in Linaro compiler and
reduced to 2 in trunk.

Eyeballing the dump it seems --param max-inline-insns-auto limit
reached and hence deciding not to inline some functions. I tried
increasing this limit from 40 to 45, 50 and 100. But is not helping in
inlining "crcu32" in trunk, but inlines "cmp_complex" when set to
limit set 45. But this is not reducing the instruction count.

With Linaro compiler I tried to manually not to inline crcu16. Now
Linaro compiler behaves in same way as trunk. It inlines crcu32,
crcu16 is not inlined and instruction count increases.

So inlining "crcu16", seem to increasing the instruction counts in
trunk. I tried to latest trunk on X86_64 machine only and inlining
behavior is same to the trunk version I used in Aarch64.

LTO may not be best thing to try on Coremark, but just wanted to check
if trunk (5.0) is better compared to GCC 4.9.

Can you suggest where should I look in GCC to see why these inline
decisions changes in trunk? Also compared to FSF 4.9, inline size
calculation in IPA have changed now in trunk?

Please advise.

regards,
Venkat.

Re: Match-and-simplify and COND_EXPR

2014-11-06 Thread Richard Biener

On Wed, 5 Nov 2014, Andrew Pinski wrote:

> Hi,
>   I was trying to hook up tree-ssa-phiopt to match-and-simplify using
> either gimple_build (or rather using gimple_simplify depending on if
> we want to produce cond_expr for conditional move).  I ran into a
> problem.
> With the pattern below:
> /* a ? 0 : 1 -> a if 0 and 1 are integral types. */
> (simplify
>   (cond_expr @0 integer_zerop integer_onep)
>   (if (INTEGRAL_TYPE_P (type))
> (convert @0)))
> 
> This produces a gimple statement with an incorrect gimple statement
> where we have have a convert with an compare expression.
> 
> t.c: In function ‘f’:
> t.c:1:5: error: invalid operand in unary operation
>  int f(int c, int a, int b)
>  ^
> _4 = (int) (c_2(D) != 0);
> 
> 
> Did I implement the pattern incorrectly or is there a bug due to the
> way cond_expr handles its 0th operand in that it is valid for compares
> there in gimple form.

There are probably still corner-cases where the GENERIC exprs we
embed into GIMPLE operands are not handled correctly or optimally.
We jump through multiple hoops to get that right already but I'm sure
I missed some ;)  COND_EXPRs are difficult because the embedded
comparison is simplified separately but how the result needs to be
represented depends on the context.  So more example patterns
certainly help here.

Feel free to open a bugreport.

Btw, I usually "test" this by inspecting the generated code for
a single pattern in a test.pd file and

./build/genmatch --gimple test.pd

Thanks,
Richard.

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284
(AG Nuernberg)
Maxfeldstrasse 5, 90409 Nuernberg, Germany

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Richard Biener

On Thu, Nov 6, 2014 at 8:02 AM, Tobias Grosser  wrote:
> On 06.11.2014 07:04, Roman Gareev wrote:
>>>
>>> CLooG is not necessarily needed. You can run graphite just with ISL. The
>>> main reason that ISL code generation is not enabled by default is that we
>>> did not yet get extensive testing and it was unclear who will have the
>>> time
>>> to fix possible bugs.
>>
>>
>> Could you please advise me which test suites should be used to make
>> performance comparison between CLooG and ISL generator? (I would like
>> to do this, even though the old generator is removed).
>
>
> I do not have specific advices. You can use various open source benchmarks
> e.g. the LLVM test suite or, if you have access, you could run SPEC or
> something.
>
>>> @Mircae, Roman: Would you have time to help with bug-fixing if we do the
>>> switch now? (I am happy to review patches and give advice, but can not do
>>> the full move myself)
>>
>>
>> I could find time for this. What do you mean by ‘switch’? If I’m not
>> mistaken, ISL generator is already used by default. Should we remove
>> support of CLooG generator and all files related to it?
>
>
> Wow, I must really have been sleeping (or just forgetting). The switch
> already happened. This is amazing.
>
> As the ISL code generator has been default since a while and we did not get
> many bug reports, the actual switch seems to have worked well. We could
> probably still need some testing, but in this case it is most likely time to
> drop the CLooG support entirely. Are you interested to provide the relevant
> patches?
>
> Also, as Tobias suggested we should raise the minimal supported isl level to
> 0.14 to be sure PR 62289 is fixed.

As I am testing with system isl/cloog that would be unfortunate as it means
I'd either drop testing graphite for 4.8 and 4.9 or for 5.0 ...  AFAIK ISL
12.x and 13+ cannot co-exist in the system due to include file conflicts.

Richard.

> Cheers,
> Tobias
>

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Tobias Grosser


On 06.11.2014 10:05, Richard Biener wrote:

On Thu, Nov 6, 2014 at 8:02 AM, Tobias Grosser  wrote:

On 06.11.2014 07:04, Roman Gareev wrote:


CLooG is not necessarily needed. You can run graphite just with ISL. The
main reason that ISL code generation is not enabled by default is that we
did not yet get extensive testing and it was unclear who will have the
time
to fix possible bugs.



Could you please advise me which test suites should be used to make
performance comparison between CLooG and ISL generator? (I would like
to do this, even though the old generator is removed).



I do not have specific advices. You can use various open source benchmarks
e.g. the LLVM test suite or, if you have access, you could run SPEC or
something.


@Mircae, Roman: Would you have time to help with bug-fixing if we do the
switch now? (I am happy to review patches and give advice, but can not do
the full move myself)



I could find time for this. What do you mean by ‘switch’? If I’m not
mistaken, ISL generator is already used by default. Should we remove
support of CLooG generator and all files related to it?



Wow, I must really have been sleeping (or just forgetting). The switch
already happened. This is amazing.

As the ISL code generator has been default since a while and we did not get
many bug reports, the actual switch seems to have worked well. We could
probably still need some testing, but in this case it is most likely time to
drop the CLooG support entirely. Are you interested to provide the relevant
patches?

Also, as Tobias suggested we should raise the minimal supported isl level to
0.14 to be sure PR 62289 is fixed.


As I am testing with system isl/cloog that would be unfortunate as it means
I'd either drop testing graphite for 4.8 and 4.9 or for 5.0 ...  AFAIK ISL
12.x and 13+ cannot co-exist in the system due to include file conflicts.


Sven,

is there any chance we can add the deprecated isl_int includes back into 
isl 0.14.1. This would unblock the testing and we could remove them as 
soon as gcc 4.8/4.9 has been phased out.


This would also fix my polly code coverage tests which do not work since 
isl_int has been moved into different header files, as ubuntu does not 
want to update isl as long as such an update breaks gcc.


Cheers,
Tobias

Re: LTO IPA inline decisions in GCC trunk.

2014-11-06 Thread Jan Hubicka

> Hi Honza,
Hello,
> 
> I experimented building Coremark with both PGO and LTO at -O3 level on
> Aarch64 machine.  First I generated profiles using the recommended seeds in
> Coremark's readme.txt. Then compiled again with -O3 -flto and -fprofile-use.
> 
> I tried using GCC Linaro compiler (september) which is based on FSF 4.9 and
> GCC trunk 30-sep-2014.
> 
> With linaro compiler perf events show 5% less instruction counts compared
> to the GCC trunk version I used.
> 
> I looked at the generated code and seeing that IPA inlining have changed
> between linaro and trunk.
> 
> Linaro compiler does not seem to inline a function called "crcu32", but
> trunk inlines it but does not inline "crcu16. Also trunk does not detect an
> IPA indirect inlining on a function called "cmp_complex".

Can you please compile with -fdump-ipa-inline-details -fdump-tree-release_ssa 
and send
me the dumps from both compilers? It should not be that hard to debug this.
> 
> The number of partitions for ltrans is 3 in Linaro compiler and reduced to
> 2 in trunk.
> 
> Eyeballing the dump it seems --param max-inline-insns-auto limit reached
> and hence deciding not to inline some functions. I tried increasing this
> limit from 40 to 45, 50 and 100. But is not helping in inlining "crcu32" in
> trunk, but inlines "cmp_complex" when set to limit set 45. But this is not
> reducing the instruction count.
> 
> With Linaro compiler I tried to manually not to inline crcu16. Now Linaro
> compiler behaves in same way as trunk. It inlines crcu32, crcu16 is not
> inlined and instruction count increases.
> 
> So inlining "crcu16", seem to increasing the instruction counts in trunk. I
> tried to latest trunk on X86_64 machine only and inlining behavior is same
> to the trunk version I used in Aarch64.
> 
> LTO may not be best thing to try on Coremark, but just wanted to check if
> trunk (5.0) is better compared to GCC 4.9.
> 
> Can you suggest where should I look in GCC to see why these inline
> decisions changes in trunk? Also compared to FSF 4.9, inline size
> calculation in IPA have changed now in trunk?

One important change for mainline compared to 4.9 is that with profile feedback
it can now bypass max-inline-insns-single/auto limits.

This is change I did in early stage1
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01110.html and I wanted to see if
there are any testcases.  I think we may make more selective decisions about
what call is considered hot in this case (our current cgraph_maybe_hot_edge_p
is very conservative). 

So in your case the main problem seems to be not inlining crcu16? Of course the
change above does not directly explain it, but perhaps some expensive inlining
early in the decision stage prevents useful inlining later..

Honza
> 
> Please advise.
> 
> regards,
> Venkat.

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Richard Biener

On 11/6/14, Tobias Grosser  wrote:
> On 06.11.2014 10:05, Richard Biener wrote:
>> On Thu, Nov 6, 2014 at 8:02 AM, Tobias Grosser  wrote:
>>> On 06.11.2014 07:04, Roman Gareev wrote:
>
> CLooG is not necessarily needed. You can run graphite just with ISL.
> The
> main reason that ISL code generation is not enabled by default is that
> we
> did not yet get extensive testing and it was unclear who will have the
> time
> to fix possible bugs.


 Could you please advise me which test suites should be used to make
 performance comparison between CLooG and ISL generator? (I would like
 to do this, even though the old generator is removed).
>>>
>>>
>>> I do not have specific advices. You can use various open source
>>> benchmarks
>>> e.g. the LLVM test suite or, if you have access, you could run SPEC or
>>> something.
>>>
> @Mircae, Roman: Would you have time to help with bug-fixing if we do
> the
> switch now? (I am happy to review patches and give advice, but can not
> do
> the full move myself)


 I could find time for this. What do you mean by ‘switch’? If I’m not
 mistaken, ISL generator is already used by default. Should we remove
 support of CLooG generator and all files related to it?
>>>
>>>
>>> Wow, I must really have been sleeping (or just forgetting). The switch
>>> already happened. This is amazing.
>>>
>>> As the ISL code generator has been default since a while and we did not
>>> get
>>> many bug reports, the actual switch seems to have worked well. We could
>>> probably still need some testing, but in this case it is most likely time
>>> to
>>> drop the CLooG support entirely. Are you interested to provide the
>>> relevant
>>> patches?
>>>
>>> Also, as Tobias suggested we should raise the minimal supported isl level
>>> to
>>> 0.14 to be sure PR 62289 is fixed.
>>
>> As I am testing with system isl/cloog that would be unfortunate as it
>> means
>> I'd either drop testing graphite for 4.8 and 4.9 or for 5.0 ...  AFAIK
>> ISL
>> 12.x and 13+ cannot co-exist in the system due to include file conflicts.
>
> Sven,
>
> is there any chance we can add the deprecated isl_int includes back into
> isl 0.14.1. This would unblock the testing and we could remove them as
> soon as gcc 4.8/4.9 has been phased out.
>
> This would also fix my polly code coverage tests which do not work since
> isl_int has been moved into different header files, as ubuntu does not
> want to update isl as long as such an update breaks gcc.

Ah, so it's still there?  Maybe we can simply add some configury to detect
its location and fix build with a patch for 4.8 and 4.9?  Is there maybe even
a preprocessor macro one can check that newer ISL provide that can tell
us where to look for isl_int?

Thanks,
Richard.

> Cheers,
> Tobias
>
>

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Tobias Grosser


On 06.11.2014 11:15, Richard Biener wrote:

On 11/6/14, Tobias Grosser  wrote:

On 06.11.2014 10:05, Richard Biener wrote:

On Thu, Nov 6, 2014 at 8:02 AM, Tobias Grosser  wrote:

On 06.11.2014 07:04, Roman Gareev wrote:


CLooG is not necessarily needed. You can run graphite just with ISL.
The
main reason that ISL code generation is not enabled by default is that
we
did not yet get extensive testing and it was unclear who will have the
time
to fix possible bugs.



Could you please advise me which test suites should be used to make
performance comparison between CLooG and ISL generator? (I would like
to do this, even though the old generator is removed).



I do not have specific advices. You can use various open source
benchmarks
e.g. the LLVM test suite or, if you have access, you could run SPEC or
something.


@Mircae, Roman: Would you have time to help with bug-fixing if we do
the
switch now? (I am happy to review patches and give advice, but can not
do
the full move myself)



I could find time for this. What do you mean by ‘switch’? If I’m not
mistaken, ISL generator is already used by default. Should we remove
support of CLooG generator and all files related to it?



Wow, I must really have been sleeping (or just forgetting). The switch
already happened. This is amazing.

As the ISL code generator has been default since a while and we did not
get
many bug reports, the actual switch seems to have worked well. We could
probably still need some testing, but in this case it is most likely time
to
drop the CLooG support entirely. Are you interested to provide the
relevant
patches?

Also, as Tobias suggested we should raise the minimal supported isl level
to
0.14 to be sure PR 62289 is fixed.


As I am testing with system isl/cloog that would be unfortunate as it
means
I'd either drop testing graphite for 4.8 and 4.9 or for 5.0 ...  AFAIK
ISL
12.x and 13+ cannot co-exist in the system due to include file conflicts.


Sven,

is there any chance we can add the deprecated isl_int includes back into
isl 0.14.1. This would unblock the testing and we could remove them as
soon as gcc 4.8/4.9 has been phased out.

This would also fix my polly code coverage tests which do not work since
isl_int has been moved into different header files, as ubuntu does not
want to update isl as long as such an update breaks gcc.


Ah, so it's still there?


Yes, we just need to include some additional header files.

Besides isl/aff.h we now also would need isl/deprecated/aff.h


 Maybe we can simply add some configury to detect
its location and fix build with a patch for 4.8 and 4.9?

> Is there maybe even

a preprocessor macro one can check that newer ISL provide that can tell
us where to look for isl_int?


We could detect the need to include these files based on isl's version 
macros.


So you suggest that adding conditional includes would allow us to move 
forward and would be part of some of the next gcc point releases. This

would be perfect.

Tobias

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Richard Biener

On 11/6/14, Tobias Grosser  wrote:
> On 06.11.2014 11:15, Richard Biener wrote:
>> On 11/6/14, Tobias Grosser  wrote:
>>> On 06.11.2014 10:05, Richard Biener wrote:
 On Thu, Nov 6, 2014 at 8:02 AM, Tobias Grosser 
 wrote:
> On 06.11.2014 07:04, Roman Gareev wrote:
>>>
>>> CLooG is not necessarily needed. You can run graphite just with ISL.
>>> The
>>> main reason that ISL code generation is not enabled by default is
>>> that
>>> we
>>> did not yet get extensive testing and it was unclear who will have
>>> the
>>> time
>>> to fix possible bugs.
>>
>>
>> Could you please advise me which test suites should be used to make
>> performance comparison between CLooG and ISL generator? (I would like
>> to do this, even though the old generator is removed).
>
>
> I do not have specific advices. You can use various open source
> benchmarks
> e.g. the LLVM test suite or, if you have access, you could run SPEC or
> something.
>
>>> @Mircae, Roman: Would you have time to help with bug-fixing if we do
>>> the
>>> switch now? (I am happy to review patches and give advice, but can
>>> not
>>> do
>>> the full move myself)
>>
>>
>> I could find time for this. What do you mean by ‘switch’? If I’m not
>> mistaken, ISL generator is already used by default. Should we remove
>> support of CLooG generator and all files related to it?
>
>
> Wow, I must really have been sleeping (or just forgetting). The switch
> already happened. This is amazing.
>
> As the ISL code generator has been default since a while and we did
> not
> get
> many bug reports, the actual switch seems to have worked well. We
> could
> probably still need some testing, but in this case it is most likely
> time
> to
> drop the CLooG support entirely. Are you interested to provide the
> relevant
> patches?
>
> Also, as Tobias suggested we should raise the minimal supported isl
> level
> to
> 0.14 to be sure PR 62289 is fixed.

 As I am testing with system isl/cloog that would be unfortunate as it
 means
 I'd either drop testing graphite for 4.8 and 4.9 or for 5.0 ...  AFAIK
 ISL
 12.x and 13+ cannot co-exist in the system due to include file
 conflicts.
>>>
>>> Sven,
>>>
>>> is there any chance we can add the deprecated isl_int includes back into
>>> isl 0.14.1. This would unblock the testing and we could remove them as
>>> soon as gcc 4.8/4.9 has been phased out.
>>>
>>> This would also fix my polly code coverage tests which do not work since
>>> isl_int has been moved into different header files, as ubuntu does not
>>> want to update isl as long as such an update breaks gcc.
>>
>> Ah, so it's still there?
>
> Yes, we just need to include some additional header files.
>
> Besides isl/aff.h we now also would need isl/deprecated/aff.h
>
>>  Maybe we can simply add some configury to detect
>> its location and fix build with a patch for 4.8 and 4.9?
>  > Is there maybe even
>> a preprocessor macro one can check that newer ISL provide that can tell
>> us where to look for isl_int?
>
> We could detect the need to include these files based on isl's version
> macros.
>
> So you suggest that adding conditional includes would allow us to move
> forward and would be part of some of the next gcc point releases. This
> would be perfect.

Yes.  Please make sure - if you produce a patch - that it works with both
inline copy of ISL and system installed one.

Thanks,
Richard.

> Tobias
>
>
>
>
>
>

Re: Match-and-simplify and COND_EXPR

2014-11-06 Thread Richard Biener

On Wed, 5 Nov 2014, Andrew Pinski wrote:

> Hi,
>   I was trying to hook up tree-ssa-phiopt to match-and-simplify using
> either gimple_build (or rather using gimple_simplify depending on if
> we want to produce cond_expr for conditional move).  I ran into a
> problem.
> With the pattern below:
> /* a ? 0 : 1 -> a if 0 and 1 are integral types. */
> (simplify
>   (cond_expr @0 integer_zerop integer_onep)
>   (if (INTEGRAL_TYPE_P (type))
> (convert @0)))

Ok, so you are capturing a GENERIC expr here but nothing knows that.
It would work if you'd do (ugh)

(for op (lt le eq ne ge gt)
 (simplify
  (cond_expr (op @0 @1) integer_zerop integer_onep)
  (if (INTEGRAL_TYPE_P (type))
   (convert (op @0 @1)
(simplify
 (cond_expr SSA_NAME@0 integer_zerop integer_onep)
  (if (INTEGRAL_TYPE_P (type))
   (convert @0

as a workaround.  To make your version work will require (quite)
some special-casing in the code generator or maybe the resimplify
helper.  Let me see if I can cook up a "simple" fix.

Richard.

> This produces a gimple statement with an incorrect gimple statement
> where we have have a convert with an compare expression.
> 
> t.c: In function ‘f’:
> t.c:1:5: error: invalid operand in unary operation
>  int f(int c, int a, int b)
>  ^
> _4 = (int) (c_2(D) != 0);
> 
> 
> Did I implement the pattern incorrectly or is there a bug due to the
> way cond_expr handles its 0th operand in that it is valid for compares
> there in gimple form.
> 
> I don't have a good patch to test tree-ssa-phiopt.c connection due the
> patch having extra code in it already which handles creating cond_expr
> for conditional moves already.
> 
> Thanks,
> Andrew Pinski
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284
(AG Nuernberg)
Maxfeldstrasse 5, 90409 Nuernberg, Germany

Re: Match-and-simplify and COND_EXPR

2014-11-06 Thread Richard Biener

On Thu, 6 Nov 2014, Richard Biener wrote:

> On Wed, 5 Nov 2014, Andrew Pinski wrote:
> 
> > Hi,
> >   I was trying to hook up tree-ssa-phiopt to match-and-simplify using
> > either gimple_build (or rather using gimple_simplify depending on if
> > we want to produce cond_expr for conditional move).  I ran into a
> > problem.
> > With the pattern below:
> > /* a ? 0 : 1 -> a if 0 and 1 are integral types. */
> > (simplify
> >   (cond_expr @0 integer_zerop integer_onep)
> >   (if (INTEGRAL_TYPE_P (type))
> > (convert @0)))
> 
> Ok, so you are capturing a GENERIC expr here but nothing knows that.
> It would work if you'd do (ugh)
> 
> (for op (lt le eq ne ge gt)
>  (simplify
>   (cond_expr (op @0 @1) integer_zerop integer_onep)
>   (if (INTEGRAL_TYPE_P (type))
>(convert (op @0 @1)
> (simplify
>  (cond_expr SSA_NAME@0 integer_zerop integer_onep)
>   (if (INTEGRAL_TYPE_P (type))
>(convert @0
> 
> as a workaround.  To make your version work will require (quite)
> some special-casing in the code generator or maybe the resimplify
> helper.  Let me see if I can cook up a "simple" fix.

Sth like below (for the real fix this has to be replicated in
all gimple_resimplifyN functions).  I'm missing a testcase
where the pattern would apply (and not be already folded by fold),
so I didn't check if it actually works.

Bah, of course we should fix COND_EXPRs to not embed a GENERIC
expr...

Richard.

Index: gcc/gimple-match-head.c
===
--- gcc/gimple-match-head.c (revision 217035)
+++ gcc/gimple-match-head.c (working copy)
@@ -90,6 +90,13 @@ gimple_resimplify1 (gimple_seq *seq,
code_helper *res_code, tree type, tree *res_ops,
tree (*valueize)(tree))
 {
+  /* ???  Stupid tcc_comparison GENERIC trees in COND_EXPRs.  */
+  if (COMPARISON_CLASS_P (res_ops[0]))
+res_ops[0] = gimple_build (seq,
+  TREE_CODE (res_ops[0]), TREE_TYPE (res_ops[0]),
+  TREE_OPERAND (res_ops[0], 0),
+  TREE_OPERAND (res_ops[0], 1));
+
   if (constant_for_folding (res_ops[0]))
 {
   tree tem = NULL_TREE;

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Sven Verdoolaege

On Thu, Nov 06, 2014 at 10:43:16AM +0100, Tobias Grosser wrote:
> is there any chance we can add the deprecated isl_int includes back into isl
> 0.14.1. This would unblock the testing and we could remove them as soon as
> gcc 4.8/4.9 has been phased out.

I prefer Richard's solution.
In any case, you need to test later versions of isl before you accept
for use in older versions of gcc.  Even if there are no incompatibilities
in the API, the internal representations of results of operations may
very well change, which may, for example, result in significantly
different output from the AST generator.

skimo

Loop invariant motion from cold block

2014-11-06 Thread Pat Haugen

It appears to me that both loop invariant motion passes (tree/rtl) don't 
look at basic block frequencies and will gladly hoist invariant code 
from a cold block within a loop. This can impact performance by 
executing (possibly costly) code that would otherwise not be executed, 
adds another register that is live through the loop, and possibly 
increases number of callee-saved registers used in the procedure (which 
require save/restore) if there is a call in the loop. Following 
simplified test was patterned after code observed in Boost library, 
which apparently has a fair number of asserts sprinkled throughout.


volatile int x;
void assert_fail(int, char *, char *);
void foo(int *a, int n, int k) {
  int i;

  for(i=0; iCompiling the above with -O2, tree-LIM (tree-ssa-loop-im.c) will yank 
the 'k/5' expression out of the loop and rtl-LIM (loop-invariant.c) will 
yank the address computations for the 2 strings out of the loop. The 
default probability for __builtin_expect is 90%, but even adding --param 
builtin-expect-probability=100 to the compile such that the block 
containing the call looks to never be executed still results in 
invariant hoisting.


A simple change like the following prevents the hoisting in the rtl 
pass, I assume something similar could also be added to the tree pass. 
But I'd like to hear others thoughts on whether this is an acceptable 
approach/issue that should be fixed. The code below prevents motion if 
the block containing it is executed <= 10% of the time the loop header 
is. Is there a better approach there instead of another magic number? 
Should it be based on PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY)? Basing 
on BUILTIN_EXPECT_PROBABILITY seems to make sense to me for the above 
case, but what about the more general case where __builtin_expect() 
isn't used and branch probabilities are guessed based on other heurstics?


Index: loop-invariant.c
===
--- loop-invariant.c(revision 216843)
+++ loop-invariant.c(working copy)
@@ -1000,6 +1000,11 @@ find_invariants_bb (basic_block bb, bool
 {
   rtx_insn *insn;

+
+  /* Don't move invariants from cold block inside the loop. */
+  if (bb->frequency <= (bb->loop_father->header->frequency / 10) )
+return;
+
   FOR_BB_INSNS (bb, insn)
 {
   if (!NONDEBUG_INSN_P (insn))


Thanks for any feedback,
Pat

Re: Loop invariant motion from cold block

2014-11-06 Thread Richard Biener

On Thu, Nov 6, 2014 at 7:08 PM, Pat Haugen  wrote:
> It appears to me that both loop invariant motion passes (tree/rtl) don't
> look at basic block frequencies and will gladly hoist invariant code from a
> cold block within a loop. This can impact performance by executing (possibly
> costly) code that would otherwise not be executed, adds another register
> that is live through the loop, and possibly increases number of callee-saved
> registers used in the procedure (which require save/restore) if there is a
> call in the loop. Following simplified test was patterned after code
> observed in Boost library, which apparently has a fair number of asserts
> sprinkled throughout.
>
> volatile int x;
> void assert_fail(int, char *, char *);
> void foo(int *a, int n, int k) {
>   int i;
>
>   for(i=0; i if (__builtin_expect(x,0))
>   assert_fail(k / 5, "one", "two");
> a[i] = k;
>   }
> }
>
> Compiling the above with -O2, tree-LIM (tree-ssa-loop-im.c) will yank the
> 'k/5' expression out of the loop and rtl-LIM (loop-invariant.c) will yank
> the address computations for the 2 strings out of the loop. The default
> probability for __builtin_expect is 90%, but even adding --param
> builtin-expect-probability=100 to the compile such that the block containing
> the call looks to never be executed still results in invariant hoisting.
>
> A simple change like the following prevents the hoisting in the rtl pass, I
> assume something similar could also be added to the tree pass. But I'd like
> to hear others thoughts on whether this is an acceptable approach/issue that
> should be fixed. The code below prevents motion if the block containing it
> is executed <= 10% of the time the loop header is. Is there a better
> approach there instead of another magic number? Should it be based on
> PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY)? Basing on
> BUILTIN_EXPECT_PROBABILITY seems to make sense to me for the above case, but
> what about the more general case where __builtin_expect() isn't used and
> branch probabilities are guessed based on other heurstics?

Shouldn't we never hoist anything from a bb with lower execution frequency
to a bb with higher one?  It seems LIM simply assumes that inside a loop
is always higher frequency than outside of it.

So - why artificially have that factor of 0.1 instead of simply checking for
less than?

Btw, we also hoist through multiple loop nests so even an unlikely
branch in the innermost loop may execute more times than the outermost
loop entry.   So I believe the change has to be made somewhere else,
like in invariantness_dom_walker::before_dom_children.

Btw, for your example with calling a noreturn function in the loop exit
destination the block containing the k/5 shouldn't be in the loop at all.

I wonder why they are not using __attribute__((noreturn)) in which case
they'd even get better profiles can can get rid of the __builtin_expect?

Richard.

> Index: loop-invariant.c
> ===
> --- loop-invariant.c(revision 216843)
> +++ loop-invariant.c(working copy)
> @@ -1000,6 +1000,11 @@ find_invariants_bb (basic_block bb, bool
>  {
>rtx_insn *insn;
>
> +
> +  /* Don't move invariants from cold block inside the loop. */
> +  if (bb->frequency <= (bb->loop_father->header->frequency / 10) )
> +return;
> +
>FOR_BB_INSNS (bb, insn)
>  {
>if (!NONDEBUG_INSN_P (insn))
>
>
> Thanks for any feedback,
> Pat
>
>
>

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Roman Gareev

> As the ISL code generator has been default since a while and we did not get
> many bug reports, the actual switch seems to have worked well. We could
> probably still need some testing, but in this case it is most likely time to
> drop the CLooG support entirely. Are you interested to provide the relevant
> patches?
>
> Also, as Tobias suggested we should raise the minimal supported isl level to
> 0.14 to be sure PR 62289 is fixed.

Yes, if you don’t mind, I’ll try to provide them by the end of the
week. I’m planning to make two patches: the first one removes files
related to CLooG code generator, the second one removes support of
CLooG library from configure and make files. What do you think about
this?

-- 
Cheers, Roman Gareev.

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Tobias Grosser


On 06.11.2014 20:08, Roman Gareev wrote:

As the ISL code generator has been default since a while and we did not get
many bug reports, the actual switch seems to have worked well. We could
probably still need some testing, but in this case it is most likely time to
drop the CLooG support entirely. Are you interested to provide the relevant
patches?

Also, as Tobias suggested we should raise the minimal supported isl level to
0.14 to be sure PR 62289 is fixed.


Yes, if you don’t mind, I’ll try to provide them by the end of the
week. I’m planning to make two patches: the first one removes files
related to CLooG code generator, the second one removes support of
CLooG library from configure and make files. What do you think about
this?


Sounds good as long as they will compile and pass tests independently. 
Please also remember updating the documentation.


Cheers,
Tobias

Re: Loop invariant motion from cold block

2014-11-06 Thread Pat Haugen


On 11/06/2014 01:00 PM, Richard Biener wrote:

Shouldn't we never hoist anything from a bb with lower execution frequency
to a bb with higher one?  It seems LIM simply assumes that inside a loop
is always higher frequency than outside of it.

So - why artificially have that factor of 0.1 instead of simply checking for
less than?
Yes, that would seem to be the best general approach. My initial hack 
was done wrt to example given and apparently I just kept my tunnel 
vision goggles on.

Btw, we also hoist through multiple loop nests so even an unlikely
branch in the innermost loop may execute more times than the outermost
loop entry.   So I believe the change has to be made somewhere else,
like in invariantness_dom_walker::before_dom_children.

Btw, for your example with calling a noreturn function in the loop exit
destination the block containing the k/5 shouldn't be in the loop at all.

I wonder why they are not using __attribute__((noreturn)) in which case
they'd even get better profiles can can get rid of the __builtin_expect?
Good point, and probably the approach that should really be taken in 
that code.


Thanks,
Pat

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Tobias Burnus


Richard Biener wrote:

On Thu, Nov 6, 2014 at 8:02 AM, Tobias Grosser  wrote:

Also, as Tobias suggested we should raise the minimal supported isl level to
0.14 to be sure PR 62289 is fixed.

As I am testing with system isl/cloog that would be unfortunate as it means
I'd either drop testing graphite for 4.8 and 4.9 or for 5.0 ...  AFAIK ISL
12.x and 13+ cannot co-exist in the system due to include file conflicts.


An alternative would be to still accept ISL 0.12.x in terms of 
configure, while updating infrastructure / 
contrib/download_prerequisites to ISL 0.14 (while keeping 0.12 for GCC 
4.8/4.9 on the ftp server).


The question is only what then to write at 
https://gcc.gnu.org/install/prerequisites.html? "0.14.0" (but still 
accepting 0.12)? "0.12.2 or later, recommended 0.14.0"?


Tobias

gcc-4.8-20141106 is now available

2014-11-06 Thread gccadmin

Snapshot gcc-4.8-20141106 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20141106/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch 
revision 217203

You'll find:

 gcc-4.8-20141106.tar.bz2 Complete GCC

  MD5=3a3c474c18240199f009efa5362e3f14
  SHA1=b5605b0972fd79967afa63012ae673f78619cd79

Diffs from 4.8-20141030 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

RE: [Aarch64] LRA

2014-11-06 Thread Evandro Menezes

That's what I assumed.  However, can reload spill GPRs into FPRs as LRA
does?  For even after specifying -mno-lra, I still see excessive slots in
FPRs.

Thank you,

-- 
Evandro Menezes  Austin, TX


> -Original Message-
> From: Richard Earnshaw [mailto:rearn...@arm.com]
> Sent: Wednesday, November 05, 2014 11:49
> To: Evandro Menezes; gcc@gcc.gnu.org
> Subject: Re: [Aarch64] LRA
> 
> On 05/11/14 17:14, Evandro Menezes wrote:
> > It doesn't seem that the option -mno-lra does what it implies.  If so,
> > what does it do, for the it does result in differences.
> 
> It causes the compiler to use 'reload' rather than LRA for handling part
of
> the register allocation process.
> 
> R.

Re: Question on param MAX_PENDING_LIST_LENGTH in sched-deps

2014-11-06 Thread Jeff Law


On 11/04/14 20:29, Bin.Cheng wrote:

Hi,
The parameter MAX_PENDING_LIST_LENGTH is set to 32 by default.  It
seems to me the length of pending list can't be larger than 32.  But
in sched-deps.c, below code is used:
   /* Pending lists can't get larger with a readonly context.  */
   if (!deps->readonly
   && ((deps->pending_read_list_length + 
deps->pending_write_list_length)
   > MAX_PENDING_LIST_LENGTH))

Since we compares it using ">", the list can have 33 instructions at
most, which is inconsistent with the parameter name.

Well, it's some kind of nit picking.
Yea, seems like a bit of a nit.  Might be worth fixing as someone might 
size an array based on the param's value only to find out it can have an 
extra element.




Another question.  This parameter is introduced to prevent GCC from
running for too long time.  I am not clear if the running time is a
quadratic function of pending list, or a quadratic function of
dependency nodes?  If it's the latter, could we use the number of
dependency nodes as the parameter directly?  Because in some cases
like memory initialization, there are more than 32 store instructions
in flow, but the dependency are actually very simple.

See:

https://gcc.gnu.org/ml/gcc-patches/2001-07/msg01668.html

I don't recall what part of the algorithm was quadratic, but it should 
be too hard to find out given the thread references the target & the 
testcase.


jeff

Re: [Aarch64] LRA

2014-11-06 Thread Andrew Pinski

On Thu, Nov 6, 2014 at 3:03 PM, Evandro Menezes  wrote:
> That's what I assumed.  However, can reload spill GPRs into FPRs as LRA
> does?  For even after specifying -mno-lra, I still see excessive slots in
> FPRs.

Not fully.  What is happening most likely is IRA is deciding to use
some FPRs for some psedu-registers due to the costs (look at the .ira
dump) and then reloading from FPRs to GPRs.

Thanks,
Andrew


>
> Thank you,
>
> --
> Evandro Menezes  Austin, TX
>
>
>> -Original Message-
>> From: Richard Earnshaw [mailto:rearn...@arm.com]
>> Sent: Wednesday, November 05, 2014 11:49
>> To: Evandro Menezes; gcc@gcc.gnu.org
>> Subject: Re: [Aarch64] LRA
>>
>> On 05/11/14 17:14, Evandro Menezes wrote:
>> > It doesn't seem that the option -mno-lra does what it implies.  If so,
>> > what does it do, for the it does result in differences.
>>
>> It causes the compiler to use 'reload' rather than LRA for handling part
> of
>> the register allocation process.
>>
>> R.
>
>

Re: Question on param MAX_PENDING_LIST_LENGTH in sched-deps

2014-11-06 Thread Bin.Cheng

On Fri, Nov 7, 2014 at 7:10 AM, Jeff Law  wrote:
> On 11/04/14 20:29, Bin.Cheng wrote:
>>
>> Hi,
>> The parameter MAX_PENDING_LIST_LENGTH is set to 32 by default.  It
>> seems to me the length of pending list can't be larger than 32.  But
>> in sched-deps.c, below code is used:
>>/* Pending lists can't get larger with a readonly context.  */
>>if (!deps->readonly
>>&& ((deps->pending_read_list_length +
>> deps->pending_write_list_length)
>>> MAX_PENDING_LIST_LENGTH))
>>
>> Since we compares it using ">", the list can have 33 instructions at
>> most, which is inconsistent with the parameter name.
>>
>> Well, it's some kind of nit picking.
>
> Yea, seems like a bit of a nit.  Might be worth fixing as someone might size
> an array based on the param's value only to find out it can have an extra
> element.
Yes, I found this only because in some extreme case with lots of
consecutive stores, length of 33 breaks the store pairing one time in
the middle of instruction flow.


>
>>
>> Another question.  This parameter is introduced to prevent GCC from
>> running for too long time.  I am not clear if the running time is a
>> quadratic function of pending list, or a quadratic function of
>> dependency nodes?  If it's the latter, could we use the number of
>> dependency nodes as the parameter directly?  Because in some cases
>> like memory initialization, there are more than 32 store instructions
>> in flow, but the dependency are actually very simple.
>
> See:
>
> https://gcc.gnu.org/ml/gcc-patches/2001-07/msg01668.html
>
> I don't recall what part of the algorithm was quadratic, but it should be
> too hard to find out given the thread references the target & the testcase.
At least dependency analysis part is of quadratic complexity because
for each new memory access, it iterates over all insns in the list to
see if it depends on them.  But this seems not the real problem, there
are comments in both source code and the patch you mentioned, that
perf issue is caused by too many dependency nodes created in this
scenario.

Also this is introduced at time we only have low frequency/memory
machines, I bet it can be relax to a larger number without hurting
compilation time.  The reason I didn't is because I can't get anything
good from it, seems it barely has any impact on performance.

Thanks,
bin

Re: Match-and-simplify and COND_EXPR

2014-11-06 Thread Andrew Pinski

On Thu, Nov 6, 2014 at 2:40 AM, Richard Biener  wrote:
> On Thu, 6 Nov 2014, Richard Biener wrote:
>
>> On Wed, 5 Nov 2014, Andrew Pinski wrote:
>>
>> > Hi,
>> >   I was trying to hook up tree-ssa-phiopt to match-and-simplify using
>> > either gimple_build (or rather using gimple_simplify depending on if
>> > we want to produce cond_expr for conditional move).  I ran into a
>> > problem.
>> > With the pattern below:
>> > /* a ? 0 : 1 -> a if 0 and 1 are integral types. */
>> > (simplify
>> >   (cond_expr @0 integer_zerop integer_onep)
>> >   (if (INTEGRAL_TYPE_P (type))
>> > (convert @0)))
>>
>> Ok, so you are capturing a GENERIC expr here but nothing knows that.
>> It would work if you'd do (ugh)
>>
>> (for op (lt le eq ne ge gt)
>>  (simplify
>>   (cond_expr (op @0 @1) integer_zerop integer_onep)
>>   (if (INTEGRAL_TYPE_P (type))
>>(convert (op @0 @1)
>> (simplify
>>  (cond_expr SSA_NAME@0 integer_zerop integer_onep)
>>   (if (INTEGRAL_TYPE_P (type))
>>(convert @0
>>
>> as a workaround.  To make your version work will require (quite)
>> some special-casing in the code generator or maybe the resimplify
>> helper.  Let me see if I can cook up a "simple" fix.
>
> Sth like below (for the real fix this has to be replicated in
> all gimple_resimplifyN functions).  I'm missing a testcase
> where the pattern would apply (and not be already folded by fold),
> so I didn't check if it actually works.

You do need to check if seq is NULL though as gimple_build depends on
seq not being NULL.  But otherwise yes this works for me.

>
> Bah, of course we should fix COND_EXPRs to not embed a GENERIC
> expr...

Yes totally agree.  For my changes to tree-ssa-phiopt, I no longer
embed it.  Though we need to change loop ifconvert still.

Thanks,
Andrew

>
> Richard.
>
> Index: gcc/gimple-match-head.c
> ===
> --- gcc/gimple-match-head.c (revision 217035)
> +++ gcc/gimple-match-head.c (working copy)
> @@ -90,6 +90,13 @@ gimple_resimplify1 (gimple_seq *seq,
> code_helper *res_code, tree type, tree *res_ops,
> tree (*valueize)(tree))
>  {
> +  /* ???  Stupid tcc_comparison GENERIC trees in COND_EXPRs.  */
> +  if (COMPARISON_CLASS_P (res_ops[0]))
> +res_ops[0] = gimple_build (seq,
> +  TREE_CODE (res_ops[0]), TREE_TYPE (res_ops[0]),
> +  TREE_OPERAND (res_ops[0], 0),
> +  TREE_OPERAND (res_ops[0], 1));
> +
>if (constant_for_folding (res_ops[0]))
>  {
>tree tem = NULL_TREE;
>

Re: Match-and-simplify and COND_EXPR

2014-11-06 Thread Richard Biener

On November 7, 2014 5:03:19 AM CET, Andrew Pinski  wrote:
>On Thu, Nov 6, 2014 at 2:40 AM, Richard Biener 
>wrote:
>> On Thu, 6 Nov 2014, Richard Biener wrote:
>>
>>> On Wed, 5 Nov 2014, Andrew Pinski wrote:
>>>
>>> > Hi,
>>> >   I was trying to hook up tree-ssa-phiopt to match-and-simplify
>using
>>> > either gimple_build (or rather using gimple_simplify depending on
>if
>>> > we want to produce cond_expr for conditional move).  I ran into a
>>> > problem.
>>> > With the pattern below:
>>> > /* a ? 0 : 1 -> a if 0 and 1 are integral types. */
>>> > (simplify
>>> >   (cond_expr @0 integer_zerop integer_onep)
>>> >   (if (INTEGRAL_TYPE_P (type))
>>> > (convert @0)))
>>>
>>> Ok, so you are capturing a GENERIC expr here but nothing knows that.
>>> It would work if you'd do (ugh)
>>>
>>> (for op (lt le eq ne ge gt)
>>>  (simplify
>>>   (cond_expr (op @0 @1) integer_zerop integer_onep)
>>>   (if (INTEGRAL_TYPE_P (type))
>>>(convert (op @0 @1)
>>> (simplify
>>>  (cond_expr SSA_NAME@0 integer_zerop integer_onep)
>>>   (if (INTEGRAL_TYPE_P (type))
>>>(convert @0
>>>
>>> as a workaround.  To make your version work will require (quite)
>>> some special-casing in the code generator or maybe the resimplify
>>> helper.  Let me see if I can cook up a "simple" fix.
>>
>> Sth like below (for the real fix this has to be replicated in
>> all gimple_resimplifyN functions).  I'm missing a testcase
>> where the pattern would apply (and not be already folded by fold),
>> so I didn't check if it actually works.
>
>You do need to check if seq is NULL though as gimple_build depends on
>seq not being NULL.  But otherwise yes this works for me.
>
>>
>> Bah, of course we should fix COND_EXPRs to not embed a GENERIC
>> expr...
>
>Yes totally agree.  For my changes to tree-ssa-phiopt, I no longer
>embed it.  Though we need to change loop ifconvert still.

Istr expansion or code quality does not like us to cse the condition of two 
cobd_exprs either.  After all I had a patch set at some point doing that 
conversion (though as well for gimple_conds).

Richard.

>Thanks,
>Andrew
>
>>
>> Richard.
>>
>> Index: gcc/gimple-match-head.c
>> ===
>> --- gcc/gimple-match-head.c (revision 217035)
>> +++ gcc/gimple-match-head.c (working copy)
>> @@ -90,6 +90,13 @@ gimple_resimplify1 (gimple_seq *seq,
>> code_helper *res_code, tree type, tree *res_ops,
>> tree (*valueize)(tree))
>>  {
>> +  /* ???  Stupid tcc_comparison GENERIC trees in COND_EXPRs.  */
>> +  if (COMPARISON_CLASS_P (res_ops[0]))
>> +res_ops[0] = gimple_build (seq,
>> +  TREE_CODE (res_ops[0]), TREE_TYPE
>(res_ops[0]),
>> +  TREE_OPERAND (res_ops[0], 0),
>> +  TREE_OPERAND (res_ops[0], 1));
>> +
>>if (constant_for_folding (res_ops[0]))
>>  {
>>tree tem = NULL_TREE;
>>

Re: Match-and-simplify and COND_EXPR

2014-11-06 Thread pinskia





> On Nov 6, 2014, at 11:24 PM, Richard Biener  wrote:
> 
>> On November 7, 2014 5:03:19 AM CET, Andrew Pinski  wrote:
>> On Thu, Nov 6, 2014 at 2:40 AM, Richard Biener 
>> wrote:
>>> On Thu, 6 Nov 2014, Richard Biener wrote:
>>> 
> On Wed, 5 Nov 2014, Andrew Pinski wrote:
> 
> Hi,
>  I was trying to hook up tree-ssa-phiopt to match-and-simplify
>> using
> either gimple_build (or rather using gimple_simplify depending on
>> if
> we want to produce cond_expr for conditional move).  I ran into a
> problem.
> With the pattern below:
> /* a ? 0 : 1 -> a if 0 and 1 are integral types. */
> (simplify
>  (cond_expr @0 integer_zerop integer_onep)
>  (if (INTEGRAL_TYPE_P (type))
>(convert @0)))
 
 Ok, so you are capturing a GENERIC expr here but nothing knows that.
 It would work if you'd do (ugh)
 
 (for op (lt le eq ne ge gt)
 (simplify
  (cond_expr (op @0 @1) integer_zerop integer_onep)
  (if (INTEGRAL_TYPE_P (type))
   (convert (op @0 @1)
 (simplify
 (cond_expr SSA_NAME@0 integer_zerop integer_onep)
  (if (INTEGRAL_TYPE_P (type))
   (convert @0
 
 as a workaround.  To make your version work will require (quite)
 some special-casing in the code generator or maybe the resimplify
 helper.  Let me see if I can cook up a "simple" fix.
>>> 
>>> Sth like below (for the real fix this has to be replicated in
>>> all gimple_resimplifyN functions).  I'm missing a testcase
>>> where the pattern would apply (and not be already folded by fold),
>>> so I didn't check if it actually works.
>> 
>> You do need to check if seq is NULL though as gimple_build depends on
>> seq not being NULL.  But otherwise yes this works for me.
>> 
>>> 
>>> Bah, of course we should fix COND_EXPRs to not embed a GENERIC
>>> expr...
>> 
>> Yes totally agree.  For my changes to tree-ssa-phiopt, I no longer
>> embed it.  Though we need to change loop ifconvert still.
> 
> Istr expansion or code quality does not like us to cse the condition of two 
> cobd_exprs either.  After all I had a patch set at some point doing that 
> conversion (though as well for gimple_conds).

I thought I changed that when I did the expansion of cond_expr into conditional 
move.  We need to something similar for cond_expr of jumps too. 

Thanks,
Andrew


> 
> Richard.
> 
>> Thanks,
>> Andrew
>> 
>>> 
>>> Richard.
>>> 
>>> Index: gcc/gimple-match-head.c
>>> ===
>>> --- gcc/gimple-match-head.c (revision 217035)
>>> +++ gcc/gimple-match-head.c (working copy)
>>> @@ -90,6 +90,13 @@ gimple_resimplify1 (gimple_seq *seq,
>>>code_helper *res_code, tree type, tree *res_ops,
>>>tree (*valueize)(tree))
>>> {
>>> +  /* ???  Stupid tcc_comparison GENERIC trees in COND_EXPRs.  */
>>> +  if (COMPARISON_CLASS_P (res_ops[0]))
>>> +res_ops[0] = gimple_build (seq,
>>> +  TREE_CODE (res_ops[0]), TREE_TYPE
>> (res_ops[0]),
>>> +  TREE_OPERAND (res_ops[0], 0),
>>> +  TREE_OPERAND (res_ops[0], 1));
>>> +
>>>   if (constant_for_folding (res_ops[0]))
>>> {
>>>   tree tem = NULL_TREE;
>>> 
> 
>

LTO IPA inline decisions in GCC trunk.

Re: Match-and-simplify and COND_EXPR

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

Re: LTO IPA inline decisions in GCC trunk.

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

Re: Match-and-simplify and COND_EXPR

Re: Match-and-simplify and COND_EXPR

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

Loop invariant motion from cold block

Re: Loop invariant motion from cold block

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

Re: Loop invariant motion from cold block

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

gcc-4.8-20141106 is now available

RE: [Aarch64] LRA

Re: Question on param MAX_PENDING_LIST_LENGTH in sched-deps

Re: [Aarch64] LRA

Re: Question on param MAX_PENDING_LIST_LENGTH in sched-deps

Re: Match-and-simplify and COND_EXPR

Re: Match-and-simplify and COND_EXPR

Re: Match-and-simplify and COND_EXPR

25 matches

Site Navigation

Mail list logo

Footer information