Re: [PATCH] RTEMS: Add LEON3/SPARC multilibs

2013-09-19 Thread Eric Botcazou
> I don't expect that this will be back ported to GCC 4.8.  You also need
> Binutils 2.24 for this.

From a SPARC maintainership viewpoint, I'd think that this is backportable for 
the upcoming 4.8.2 release, and the patches are essentially SPARC-specific, 
but perhaps the RMs are of a different opinion here.


2013-08-09  Eric Botcazou  

* configure.ac: Add GAS check for LEON instructions on SPARC.
* configure: Regenerate.
* config.in: Likewise.
* config.gcc (with_cpu): Remove sparc-leon*-* and deal with LEON in the
sparc*-*-* block.
* config/sparc/sparc.opt (LEON, LEON3): New masks.
* config/sparc/sparc.h (ASM_CPU32_DEFAULT_SPEC): Set to AS_LEON_FLAG
for LEON or LEON3.
(ASM_CPU_SPEC): Pass AS_LEON_FLAG if -mcpu=leon or -mcpu=leon3.
(AS_LEON_FLAG): New macro.
* config/sparc/sparc.c (sparc_option_override): Set MASK_LEON for leon
and MASK_LEON3 for leon3 and unset them if HAVE_AS_LEON is not defined.
Deal with LEON and LEON3 for the memory model.
* config/sparc/sync.md (atomic_compare_and_swap): Enable if LEON3
(atomic_compare_and_swap_1): Likewise.
(*atomic_compare_and_swap_1): Likewise.

2013-07-23  Eric Botcazou  

* doc/invoke.texi (SPARC Options): Document new leon3 processor value.

2013-07-22  Eric Botcazou  

* config.gcc (sparc*-*-*): Accept leon3 processor.
(sparc-leon*-*): Merge with sparc*-*-* and add leon3 support.
* doc/invoke.texi (SPARC Options): Adjust -mfix-ut699 entry.
* config/sparc/sparc-opts.h (enum processor_type): Add PROCESSOR_LEON3.
* config/sparc/sparc.opt (enum processor_type): Add leon3.
(mfix-ut699): Adjust comment.
* config/sparc/sparc.h (TARGET_CPU_leon3): New define.
(CPP_CPU32_DEFAULT_SPEC): Add leon3 support.
(CPP_CPU_SPEC): Likewise.
(ASM_CPU_SPEC): Likewise.
* config/sparc/sparc.c (leon3_cost): New constant.
(sparc_option_override): Add leon3 support.
(mem_ref): New function.
(sparc_gate_work_around_errata): Return true if -mfix-ut699 is enabled.
(sparc_do_work_around_errata): Look into the instruction in the delay
slot and adjust accordingly.  Add fix for the data cache nullify issues
of the UT699.  Change insertion position for the NOP.
* config/sparc/leon.md (leon_fpalu, leon_fpmds, write_buf): Delete.
(leon3_load): New reservation.
(leon_store): Bump latency to 2.
(grfpu): New automaton.
(grfpu_alu): New unit.
(grfpu_ds): Likewise.
(leon_fp_alu): Adjust.
(leon_fp_mult): Delete.
(leon_fp_div): Split into leon_fp_divs and leon_fp_divd.
(leon_fp_sqrt): Split into leon_fp_sqrts and leon_fp_sqrtd.
* config/sparc/sparc.md (cpu): Add leon3.
* config/sparc/sync.md (atomic_exchangesi): Disable if -mfix-ut699.
(swapsi): Likewise.
(atomic_test_and_set): Likewise.
(ldstub): Likewise.

2013-05-28  Eric Botcazou  

* doc/invoke.texi (SPARC Options): Document -mfix-ut699.
* builtins.c (expand_builtin_mathfn) : Try to widen the
mode if the instruction isn't available in the original mode.
* config/sparc/sparc.opt (mfix-ut699): New option.
* config/sparc/sparc.md (muldf3_extend): Disable if -mfix-ut699.
(divdf3): Turn into expander.
(divdf3_nofix): New insn.
(divdf3_fix): Likewise.
(divsf3): Disable if -mfix-ut699.
(sqrtdf2): Turn into expander.
(sqrtdf2_nofix): New insn.
(sqrtdf2_fix): Likewise.
(sqrtsf2): Disable if -mfix-ut699.

-- 
Eric Botcazou


Re: [PATCH, committed] SH: Fix PR58314 (unsatisfied constraints)

2013-09-19 Thread Christian Bruel
Hi Kaz, Oleg,

On 09/19/2013 01:15 AM, Kaz Kojima wrote:
> Christian Bruel  wrote:
>> && (!can_create_pseudo_p () && REG_P (operands[0]) && REG_P (operands[1]))"
>>
>> is necessary ?
> It looks like another hack to allow the 2nd and 3rd alternatives only
> when reloading.  If so, it might be a bit cleaner to use a special
> predicate like
>
>
This still looks complicated to me. I have tested for sh-superh-elf and
sh-linux the attached patch that just "fixes" the issue reported by
Richard with no regression and absolutely no differences in code
generation for CSIBe and a few other benches (eembc, coremark, ...). 
The spill alternatives are correctly selected and the original PR still
passes.

If OK I'd like to apply it to trunk/4.8. If there is a need for an
additional hack, how about sending it separately?

Many thanks,

Christian
2013-09-13  Christian Bruel  

	* config/sh/sh.md (mov_reg_reg): Use general_movd*_operand predicate and guard insn with reg only operand.

Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 202699)
+++ gcc/config/sh/sh.md	(working copy)
@@ -6894,9 +6894,11 @@ label:
 ;; reloading MAC subregs otherwise.  For that probably special patterns
 ;; would be required.
 (define_insn "*mov_reg_reg"
-  [(set (match_operand:QIHI 0 "arith_reg_dest" "=r,m,*z")
-	(match_operand:QIHI 1 "register_operand" "r,*z,m"))]
-  "TARGET_SH1 && !t_reg_operand (operands[1], VOIDmode)"
+  [(set (match_operand:QIHI 0 "general_movdst_operand" "=r,m,*z")
+	(match_operand:QIHI 1 "general_movsrc_operand" "r,*z,m"))]
+  "TARGET_SH1 && !t_reg_operand (operands[1], VOIDmode)
+   && arith_reg_dest (operands[0], mode)
+   && register_operand (operands[1], mode)"
   "@
 mov		%1,%0
 mov.	%1,%0


Re: [PATCH] RTEMS: Add LEON3/SPARC multilibs

2013-09-19 Thread Sebastian Huber

On 2013-09-19 09:23, Eric Botcazou wrote:

> > I don't expect that this will be back ported to GCC 4.8.  You also need
> > Binutils 2.24 for this.
>
> From a SPARC maintainership viewpoint, I'd think that this is backportable for
> the upcoming 4.8.2 release, and the patches are essentially SPARC-specific,
> but perhaps the RMs are of a different opinion here.


A back port would be quite nice for us, since we work currently on SMP support 
for LEON3/4 and C11 atomic operations would be very handy for this.


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.


Re: [v3] More noexcept -- 4th

2013-09-19 Thread Paolo Carlini

Hi,

On 09/19/2013 05:46 AM, Marc Glisse wrote:
> Hello,
>
> I did not touch the regular basic_string because Paolo usually says
> not to touch it, but I could do it as well if wanted.

If you like, please go ahead, there are no ABI issues in this case.
Indeed, in the current implementation the move constructor isn't
unconditionally noexcept due to the allocators. We have got a bug report
about that, a very recent one. You could add the decorations to the
current basic_string too and put a comment right before the constructor
mentioning the bug # and the fact that things have to be reworked anyway
in the C++11 conforming implementation.


The patch is otherwise Ok with me, thanks again!
Paolo.


Re: [PATCH, committed] SH: Fix PR58314 (unsatisfied constraints)

2013-09-19 Thread Oleg Endo
Hi,

On Thu, 2013-09-19 at 10:44 +0200, Christian Bruel wrote:
> Hi Kaz, Oleg,
> 
> On 09/19/2013 01:15 AM, Kaz Kojima wrote:
> > Christian Bruel  wrote:
> >> && (!can_create_pseudo_p () && REG_P (operands[0]) && REG_P (operands[1]))"
> >>
> >> is necessary ?
> > It looks like another hack to allow the 2nd and 3rd alternatives only
> > when reloading.  If so, it might be a bit cleaner to use a special
> > predicate like
> >
> >
> This still looks complicated to me. I have tested for sh-superh-elf and
> sh-linux the attached patch that just "fixes" the issue reported by
> Richard with no regression and absolutely no differences in code
> generation for CSIBe and a few other benches (eembc, coremark, ...). 
> The spill alternatives are correctly selected and the original PR still
> passes.
> 
> If OK I'd like to apply it to trunk/4.8. If there is a need for an
> additional hack, how about sending it separately?

Yeah, the move patterns probably could use some cleanup / refactoring
anyway.  I also wonder what is going to happen if LRA is used ... but
that's another story.  Have you also checked the patch for SH2A?

Cheers,
Oleg



Re: [v3] More noexcept -- 4th

2013-09-19 Thread Paolo Carlini

On 09/19/2013 10:50 AM, Paolo Carlini wrote:
> Hi,
>
> On 09/19/2013 05:46 AM, Marc Glisse wrote:
>> Hello,
>>
>> I did not touch the regular basic_string because Paolo usually says
>> not to touch it, but I could do it as well if wanted.
> If you like, please go ahead, there are no ABI issues in this case.
> Indeed, in the current implementation the move constructor

Read this as *assignment* of course, sorry about the confusion. But the
issue is very clear.


Paolo.


Re: [gomp4] Tweak GOMP_target{,_data,_update} arguments

2013-09-19 Thread Michael V. Zolotukhin
Hi Jakub,

Thanks for the explanation, it's getting a bit clearer, though I still have some
questions.

> __OPENMP_TARGET__ would be a linker plugin inserted symbol at the start of
> some linker plugin created data section, which would start with some header
> and then data.
> Say
> uleb128 number_of_supported_targets - n
> uleb128 number_of_host_var_pairs - m
> [ name of offload target (asciiz?)
>   relative offset to the start of the offload data for the target (in MIC 
> case embedded DSO)
>   size of the offload data
>   perhaps something how to find the target addresses array
> ] repeated n times
> [ host_address, size ] repeated m times
> (for the functions passed to GOMP_target the pair would be [ 
> foobar.omp_fn.25, 1 ] ).
So, in this table we store host addresses of global variables, marked with
'pragma omp declare target', and addresses of host-versions of OMP-versioned
functions.  Correct?  Also, there are pointers to images of target-binaries,
which are (presumably) placed in other (or the same?) data sections.

> So, when GOMP_target{,_data,_update} is called, it could easily determine
> if the calling shared library resp. binary has been offloaded or not
That's right.
Then, if no initialization has been performed yet, GOMP_target{,_data,_update}
is initialized.  Now let's look at the initialization.
In initialization GOMP_target* looks at the __OPENMP_TARGET__ table (its address
is passed as the 3rd argument), finds pointer to a data section with
target-binary image, loads it to memory, runs a process on a target from it
(e.g. in COI using COIProcessCreateFromFile and/or
COIProcessLoadLibraryFromMemory).
Global variables are mapped and the corresponding host<->target address pairs
are inserted to the splay tree, as usual.
Also, GOMP_target* should do the same for function addresses.  Could you please
describe this step in more details?  Do we want to just add some offset to
host_function_address (as we want host versions of functions to be ordered
exactly as the target versions)?

> See above, names are just a bad idea.  You can just use some magic wrapper
> name in the target binary (the one sitting in libgomp), to which you just
> pass the pair of function address and it's argument and the named function
> will just read the (target) function pointer and (target) pointer argument
> from misc data block and tail call that function.
Yes, if we know target function pointer, we can do this.

Basically, the main question I have now is how would we figure out target
function address?  Of course, after initialization we just look for it in our
splay tree, so the question relates to the initialization step.

Thanks, Michael


Re: [PATCH 1/n] Add conditional compare support

2013-09-19 Thread Richard Earnshaw
On 18/09/13 10:45, Zhenqiang Chen wrote:
> 
>> -Original Message-
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: Tuesday, August 27, 2013 8:18 PM
>> To: Richard Earnshaw
>> Cc: Zhenqiang Chen; GCC Patches
>> Subject: Re: [PATCH 1/n] Add conditional compare support
>>
>> On Tue, Aug 27, 2013 at 1:56 PM, Richard Earnshaw 
>> wrote:
>>> On 27/08/13 12:10, Richard Biener wrote:
>>>> What's this for and what's the desired semantics? I don't like having
>>>> extra tree codes for this.  Is this for a specific instruction set
>>>> feature?
>>>
>>> The background is to support the conditional compare instructions in
>>> ARM (more effectively) and AArch64 at all.
>>>
>>> The current method used in ARM is to expand into a series of
>>> store-flag instructions and then hope that combine can optimize them
>>> away (though that fails far too often, particularly when the first
>>> instruction in the sequence is combined into another pattern).  To
>>> make it work at all the compiler has to lie about the costs of various
>>> store-flag type operations which overall risks producing worse code
>>> and means we also have to support many more complex multi-instruction
>>> patterns than is desirable.  I really don't want to go down the same route
>>> for AArch64.
>>>
>>> The idea behind all this is to capture potential conditional compare
>>> operations early enough in the mid end that we can keep track of them
>>> until RTL expand time and then to emit the correct logic on all
>>> targets depending on what is the best thing for that target.  The
>>> current method of lowering into store-flag sequences doesn't really cut it.
>>
>> It seems to me that then the initial instruction selection process (aka RTL
>> expansion) needs to be improved.  As we are expanding with having the CFG
>> around it should be easy enough to detect AND/ORIF cases and do better
>> here.  Yeah, I suppose this asks to turn existing jump expansion optimizations
>> up-side-down to optimize with the GIMPLE CFG in mind.
>>
>> The current way of LOGICAL_OP_NON_SHORT_CIRCUIT is certainly bogus -
>> fold-const.c is way too early to decide this.  Similar to the ongoing work of
>> expanding / building-up switch expressions in a GIMPLE pass, moving expand
>> complexity up the pipeline this asks for a GIMPLE phase that moves this
>> decision down closer to RTL expansion.
>> (We have tree-ssa-ifcombine.c that is a related GIMPLE transform pass)
>>
> 
> The patch is updated according to your comments. It is a basic support,
> which does not touch ifcombine and jump related optimizations yet.
> 
> Current method is:
> 1) In fold-const, if HAVE_conditional_compare, set
>    LOGICAL_OP_NON_SHORT_CIRCUIT to optimize_function_for_speed_p.
>    So we do not depend on BRANCH_COST.
> 2) Identify CCMP during expanding. A CCMP candidate is a BIT_AND_EXPR
>    or BIT_IOR_EXPR, whose operators are compares.
> 3) Add a new op in optab to expand the CCMP to optimized RTL,
>    e.g. and_scc_scc/ior_scc_scc in ARM.
> 
> Bootstrap on ARM Chrome book.
> No make check regression on Pandaboard.
> 
> Thanks!
> -Zhenqiang
> 
> ChangeLog:
> 2013-09-18  Zhenqiang Chen  
> 
>   * config/arm/arm.md (conditional_compare): New.
>   * expr.c (is_conditional_compare_candidate_p, expand_ccmp_expr):
> New.
>   (expand_expr_real_1): Identify conditional compare.
>   * fold-const.c (LOGICAL_OP_NON_SHORT_CIRCUIT): Update.
>   * optabs.c (expand_ccmp_op): New.
>   (get_rtx_code): Handle BIT_AND_EXPR and BIT_IOR_EXPR.
>   * optabs.def (ccmp_optab): New.
>   * optabs.h (expand_ccmp_op): New.
> 
> 
> basic-conditional-compare-support2.patch
> 
> 
> 

Some general comments.

1) How do we get to a conditional branch from this code?  It seems your
new pattern generates a store-flag value rather than a branch expansion.
Are you expecting combine or some other later pass to clean that up?

2) Assuming we do end up with branches, why would we not want to do this
optimization when optimizing for space?

cmp r0, r1
cmpne   r2, r3
beq L1

is shorter than

cmp r0, r1
beq L1
cmp r2, r3
beq L1

3) Is there any way to generalize this for more complex expressions?  Eg

if (a == b || a == c || a == d)

should become

cmp a, b
cmpne   a, c
cmpne   a, d
...

Obviously, when optimizing for speed there will probably be a limit on
the number of compares that are desirable, but I don't see why we should
arbitrarily cap it at 2.
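
To make point 3 concrete, here is a minimal C++ sketch. The function names are invented for illustration and are not part of the patch. It only shows that, when the compares have no side effects, the short-circuit form and a flat form that evaluates every compare give the same answer; the flat form is the shape that a cmp/cmpne chain could implement.

```cpp
#include <cassert>

// Short-circuit form: conceptually one compare-and-branch per test.
bool any_equal_short_circuit(int a, int b, int c, int d) {
    if (a == b) return true;
    if (a == c) return true;
    return a == d;
}

// Flat form: every compare is evaluated and the results are OR-ed together.
// No branch is needed between the individual compares.
bool any_equal_flat(int a, int b, int c, int d) {
    return (a == b) | (a == c) | (a == d);
}

// Hypothetical helper: both forms must agree because the compares are pure.
void check_forms_agree(int a, int b, int c, int d) {
    assert(any_equal_short_circuit(a, b, c, d) == any_equal_flat(a, b, c, d));
}
```

Nothing in the sketch limits the chain to two compares, which is the point of the question above.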

R.




Re: [gomp4] Tweak GOMP_target{,_data,_update} arguments

2013-09-19 Thread Jakub Jelinek
On Thu, Sep 19, 2013 at 12:58:28PM +0400, Michael V. Zolotukhin wrote:
> Thanks for the explanation, it's getting a bit clearer, though I still have 
> some
> questions.
> 
> > __OPENMP_TARGET__ would be a linker plugin inserted symbol at the start of
> > some linker plugin created data section, which would start with some header
> > and then data.
> > Say
> > uleb128 number_of_supported_targets - n
> > uleb128 number_of_host_var_pairs - m
> > [ name of offload target (asciiz?)
> >   relative offset to the start of the offload data for the target (in MIC 
> > case embedded DSO)
> >   size of the offload data
> >   perhaps something how to find the target addresses array
> > ] repeated n times
> > [ host_address, size ] repeated m times
> > (for the functions passed to GOMP_target the pair would be [ 
> > foobar.omp_fn.25, 1 ] ).
> So, in this table we store host addresses of global variables, marked with
> 'pragma omp declare target', and addresses of host-versions of OMP-versioned
> functions.  Correct?  Also, there are pointers to images of target-binaries,
> which are (presumably) placed in other (or the same?) data sections.

Yeah.  How exactly we define the section is up to us, but it should have all
the information that GOMP_target* will need to offload the stuff from the
current shared library or binary, and everything needed to initialize the
{ host_addr, size } -> { target_addr } mapping of declare target global var
definitions and functions passed to GOMP_target.  The fewer relocations
the section has, the better.  But, if we need any relocations, it will need
to be in a relro section, and supposedly the embedded shared library (resp.
libraries) don't need any relocations on them and will be large, thus
supposedly they should live in different sections and the header should just
point to them (e.g. using offset relative to __OPENMP_TARGET__ or something
that doesn't need dynamic relocation).  Similarly, if the linker plugin puts in the
array of [ host_address, size ] rewritten such that host_address is an
offset from __OPENMP_TARGET__, then we won't need dynamic relocations for
that.  Another complication is dependent shared libraries.
Consider

liba.c:

#pragma omp declare target
int i;
int foo (void)
{
  return ++i;
}
#pragma omp end declare target

main.c:

#include <stdlib.h>

#pragma omp declare target
extern int i;
extern int foo (void);
#pragma omp end declare target
int main ()
{
  int j;
  #pragma omp target
    {
      j = i;
      j += foo ();
    }
  if (j != 1)
    abort ();
  return 0;
}

gcc -shared -O2 -fpic -fopenmp -o liba.so -Wl,-soname,liba.so liba.c
gcc -O2 -fopenmp -o main main.c -L. -la
./main

Perhaps the linker plugin can extract the target shared libraries from
the embedded sections of dependent shared libraries (if any), and link the
"main" shared library against that, but GOMP_target will need to know that
it can't just offload main.so, but also has to offload the dependent
liba.so (and of course libgomp.so.1 from the libgomp plugin).
What does ICC do in this case?

> > So, when GOMP_target{,_data,_update} is called, it could easily determine
> > if the calling shared library resp. binary has been offloaded or not
> That's right.
> Then, if no initialization has been performed yet, GOMP_target{,_data,_update}
> is initialized.  Now let's look at the initialization.
> In initialization GOMP_target* looks at the __OPENMP_TARGET__ table (its 
> address
> is passed as the 3rd argument), finds pointer to a data section with
> target-binary image, loads it to memory, runs a process on a target from it
> (e.g. in COI using COIProcessCreateFromFile and/or
> COIProcessLoadLibraryFromMemory).
> Global variables are mapped and the corresponding host<->target address pairs
> are inserted to the splay tree, as usual.
> Also, GOMP_target* should do the same for function addresses.  Could you 
> please
> describe this step in more details?  Do we want to just add some offset to
> host_function_address (as we want host versions of functions to be ordered
> exactly as the target versions)?

The idea was that the host [ host_addr, size ] array (in some named section)
would be ordered exactly the same as corresponding [ targ_addr ] array in
the target shared library.  So, [25] pair in the host array will correspond
to [25] in the target shared library array.
So you just walk the whole arrays, and in each iteration pick nth host array
pair plus corresponding nth target array address, and put it into the splay
tree.

In the above testcase, host liba.so would contain a [ &i, sizeof(int) ]
pair and target liba.so corresponding [ &i ] entry (target i in that case).
In host main there would be [ &main.omp_fn.0, 1 ] and in target main.so
corresponding [ &main.omp_fn.0 ] (target main.omp_fn.0 in that case).
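
A rough C++ sketch of that walk, purely as an illustration: the struct names and build_mapping are hypothetical, std::map stands in for the splay tree, and the real section layout is still undecided.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical record for one [ host_address, size ] pair.
struct HostEntry {
    std::uintptr_t host_addr;
    std::size_t size;
};

// Hypothetical value stored per host address in the lookup structure.
struct TargetMapping {
    std::uintptr_t target_addr;
    std::size_t size;
};

// Pair the nth host entry with the nth target address.  This relies on
// both arrays being emitted in exactly the same order.
std::map<std::uintptr_t, TargetMapping>
build_mapping(const std::vector<HostEntry>& host,
              const std::vector<std::uintptr_t>& target) {
    assert(host.size() == target.size());
    std::map<std::uintptr_t, TargetMapping> mapping;
    for (std::size_t n = 0; n < host.size(); ++n)
        mapping[host[n].host_addr] = TargetMapping{target[n], host[n].size};
    return mapping;
}
```

Variables and functions go through the same loop; a function just shows up as an entry with size 1.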

> > See above, names are just a bad idea.  You can just use some magic wrapper
> > name in the target binary (the one sitting in libgomp), to which you just
> > pass the pair of function address and it's argument and the 

[PATCH] Fix up omp sections expansion

2013-09-19 Thread Jakub Jelinek
Hi!

If a sections construct has only non-fallthru section directives, as in
the attached testcase (including the case where all or some of them can be
cancelled), we generate too large an argument for GOMP_sections_start,
so if there are more threads than section directives (including the implicit
first) in the sections construct, we can hit __builtin_trap ().

Fixed thusly, tested on x86_64-linux and i686-linux, committed to trunk/4.8.

2013-09-19  Jakub Jelinek  

* omp-low.c (expand_omp_sections): Always pass len - 1 to
GOMP_sections_start, even if !exit_reachable.
libgomp/
* testsuite/libgomp.c/sections-2.c: New test.

--- gcc/omp-low.c.jj	2013-09-19 09:15:08.0 +0200
+++ gcc/omp-low.c   2013-09-19 12:59:49.325031241 +0200
@@ -6862,8 +6862,7 @@ expand_omp_sections (struct omp_region *
 {
   /* If we are not inside a combined parallel+sections region,
 call GOMP_sections_start.  */
-  t = build_int_cst (unsigned_type_node,
-exit_reachable ? len - 1 : len);
+  t = build_int_cst (unsigned_type_node, len - 1);
   u = builtin_decl_explicit (BUILT_IN_GOMP_SECTIONS_START);
   stmt = gimple_build_call (u, 1, t);
 }
--- libgomp/testsuite/libgomp.c/sections-2.c.jj	2013-09-19 13:06:46.138913032 +0200
+++ libgomp/testsuite/libgomp.c/sections-2.c	2013-09-19 13:07:00.250841349 +0200
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <unistd.h>
+
+__attribute__((noinline, noclone, noreturn))
+void
+foo ()
+{
+  sleep (4);
+  exit (0);
+}
+
+int
+main ()
+{
+  #pragma omp parallel
+  {
+    #pragma omp sections
+      {
+        foo ();
+      #pragma omp section
+        foo ();
+      #pragma omp section
+        foo ();
+      }
+  }
+  return 0;
+}

Jakub


Re: [PATCH] manage dom-walk_data initialization and finalization with constructors and destructors

2013-09-19 Thread Richard Biener
Jeff Law  wrote:
>On 09/18/2013 11:17 AM, Michael Matz wrote:
>> Hi,
>>
>> On Wed, 18 Sep 2013, Jeff Law wrote:
>>
>>> On 09/18/2013 10:24 AM, Michael Matz wrote:

>>>> I'm irritated by the member name uglification (e.g. equiv_stack_ with
>>>> trailing underscore).  I know that's a certain style to mark private
>>>> members, but I think it's a bad style (like prefixing variable names with
>>>> their type), and before it sets a precedent in GCCs c++ coding style I'd
>>>> like this to be changed, like in the below.
>>>
>>> We're already using the trailing underscore idiom for private objects
>>> moving into classes (see the pass class).
>>
>> I know, and I don't like it there either.
>Well, as Ian pointed out, it is in our recommended style guidelines and
>you'll find uses in the vec class as well.  It's well established at
>this point and I see no compelling reason to go back unless you can
>convince the project as a whole to change the C++ guidelines.
>
>>
>>>> I'd also like us to not use member privatization in our classes, but
>>>> that's not in the patch, but if we could agree on that it would be nice.
>>> Member privatization is quite natural.  What specifically do you not like
>>> about the practice?
>>
>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00302.html
>>
>> That was conditional on "when we need to jump through hoops", but for
>> consistency it'd make sense to avoid it everywhere.
>> (I know that Ian agreed to that mail, but somehow the mailing list
>> archives don't have that!?)
>I don't see anything in Trevor's work that requires jumping through 
>hoops.  It's pretty simple stuff.  And again, as Ian pointed out, our 
>established guidelines for C++ usage encourage this behaviour.
>
>
>>
>>>> Regstrapped on x86-64-linux, okay?
>>>
>>> Obviously any ChangeLog, formatting and such can go in.  However, the
>>> trailing underscore should stay given it's already established practice
>>> and has several nice benefits.
>>
>> What's the benefit of reading and writing such noisy lines? :
>>
>>*out_mode = mode_;
>>mode_ = GET_MODE_WIDER_MODE (mode_);
>>count_++;
>It makes it very clear to the reader that we're dealing with objects
>that belong to a class instance rather than direct access to an auto or
>static.  That can be important.
>

There is a language-specific way, too.  Just qualify accesses with this->; that
also avoids all the interesting name-lookup issues with dependent names.
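
A small C++ sketch of the dependent-name case, with made-up class names (Base, Derived, bump_twice) used only for illustration:

```cpp
#include <cassert>

template <typename T>
struct Base {
    T count = 0;
};

template <typename T>
struct Derived : Base<T> {
    void bump() {
        // An unqualified `count` would not be found here: Base<T> is a
        // dependent base class, so unqualified lookup does not search it.
        // Qualifying with this-> makes the name dependent and it resolves
        // at instantiation time.
        this->count++;
    }
};

// Hypothetical usage helper.
int bump_twice() {
    Derived<int> d;
    d.bump();
    d.bump();
    assert(d.count == 2);
    return d.count;
}
```

The same this-> spelling also marks member accesses for the reader without needing a naming convention.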

Richard.

>>
>> The uglification merely makes code harder to write and read, it should be
>> used in cases where you _don't_ want developers to write such names.
>I feel it makes it harder in some ways and easier in others.
>
>Given it's recommended by our C++ guidelines which were discussed at
>length, I'm going to explicitly NAK your patch.  If you want to re-open
>the guidelines for C++ usage, then that's fine with me and if we as a
>project change the guidelines to disallow such things, then that will be
>the time to remove the trailing underscores, private members, etc.
>
>FWIW, I have worked on large C++ codebases that were a free-for-all and
>found them *amazingly* painful.  The restricted set allowed for GCC is
>actually quite reasonable IMHO, particularly for projects where the main
>body of code is evolving from a pure C base.
>
>
>Jeff




Fix uninitialized memory access in cgraph.c

2013-09-19 Thread Jan Hubicka
Hi,
Richard pointed out to me that we use the speculative flag before we
initialize it.

Bootstrapped/regtested x86_64-linux, committed.

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 202739)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2013-09-19  Jan Hubicka  
+
+   * cgraph.c (cgraph_create_edge_1): Avoid uninitialized read
+   of speculative flag.
+
 2013-09-19  Jakub Jelinek  
 
* omp-low.c (expand_omp_sections): Always pass len - 1 to
Index: cgraph.c
===
--- cgraph.c	(revision 202739)
+++ cgraph.c	(working copy)
@@ -870,12 +870,12 @@ cgraph_create_edge_1 (struct cgraph_node
 edge->call_stmt_cannot_inline_p = true;
   else
 edge->call_stmt_cannot_inline_p = false;
-  if (call_stmt && caller->call_site_hash)
-cgraph_add_edge_to_call_site_hash (edge);
 
   edge->indirect_info = NULL;
   edge->indirect_inlining_edge = 0;
   edge->speculative = false;
+  if (call_stmt && caller->call_site_hash)
+cgraph_add_edge_to_call_site_hash (edge);
 
   return edge;
 }


[PATCH] Simplify & refactor a bit of tree-ssa-dom.c

2013-09-19 Thread Jeff Law


I find it amazing to look at code I wrote in the past, the occasional 
WTF always makes it worth it.


The code to manage the temporary expression & const/copies tables in 
tree-ssa-dom.c around jump threading looks overly convoluted.  In 
particular the code reused an existing unwind point for the temporary 
expression stack when threading one outgoing edge of a block.


I can only guess this was in response to that code at one time being 
much more expensive than it is now.  Given how cheap this code is now, 
handling the temporary expression stack in the most obvious way seems 
much wiser.


The refactoring should also make it easier to get some missing 
expressions into the table without tons of unwanted code duplication.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
Installed on the trunk.
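
For readers not familiar with the marker idiom, here is a simplified C++ sketch of it. ScopedTable and its members are hypothetical and are not the tree-ssa-dom.c data structures; the sketch also assumes each key is recorded at most once, so unwinding never has to restore a shadowed entry.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a table plus its unwind stack.
class ScopedTable {
public:
    // Push a marker (an empty key) so a later unwind knows where to stop.
    void push_marker() { stack_.push_back(""); }

    // Record a temporary entry and remember it on the unwind stack.
    void record(const std::string& key, int value) {
        assert(!key.empty());    // the empty key is reserved for markers
        assert(!contains(key));  // simplification: no shadowing
        table_[key] = value;
        stack_.push_back(key);
    }

    // Remove everything recorded since the most recent marker.
    void unwind_to_marker() {
        while (!stack_.empty()) {
            std::string key = stack_.back();
            stack_.pop_back();
            if (key.empty())
                break;           // reached the marker
            table_.erase(key);
        }
    }

    bool contains(const std::string& key) const {
        return table_.count(key) != 0;
    }

private:
    std::map<std::string, int> table_;
    std::vector<std::string> stack_;
};
```

Pushing the marker, recording, and unwinding in one place is what lets every caller treat threading an edge as a self-contained step.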



commit f71b6fa579d113c2de9ec0ab921bbd2dcc7be43c
Author: Jeff Law 
Date:   Thu Sep 19 05:54:23 2013 -0600

   * tree-ssa-dom.c (record_temporary_equivalences): New function
split out of dom_opt_dom_walker::after_dom_children.
(dom_opt_dom_walker::thread_across_edge): Move common code
in here from dom_opt_dom_walker::after_dom_children.
(dom_opt_dom_walker::after_dom_children): Corresponding
simplifications.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index be5b1d9..6c5f5d6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2013-09-17  Jeff Law  
+
+   * tree-ssa-dom.c (record_temporary_equivalences): New function
+   split out of dom_opt_dom_walker::after_dom_children.
+   (dom_opt_dom_walker::thread_across_edge): Move common code
+   in here from dom_opt_dom_walker::after_dom_children.
+   (dom_opt_dom_walker::after_dom_children): Corresponding simplifications.
+
 2013-09-19  Jan Hubicka  
 
* cgraph.c (cgraph_create_edge_1): Avoid uninitialized read
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index aac7aa4..f561386 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1070,6 +1070,31 @@ simplify_stmt_for_jump_threading (gimple stmt,
   return lookup_avail_expr (stmt, false);
 }
 
+static void
+record_temporary_equivalences (edge e)
+{
+  int i;
+  struct edge_info *edge_info = (struct edge_info *) e->aux;
+
+  /* If we have info associated with this edge, record it into
+ our equivalence tables.  */
+  if (edge_info)
+{
+  cond_equivalence *eq;
+  tree lhs = edge_info->lhs;
+  tree rhs = edge_info->rhs;
+
+  /* If we have a simple NAME = VALUE equivalence, record it.  */
+  if (lhs && TREE_CODE (lhs) == SSA_NAME)
+   record_const_or_copy (lhs, rhs);
+
+  /* If we have 0 = COND or 1 = COND equivalences, record them
+into our expression hash tables.  */
+  for (i = 0; edge_info->cond_equivalences.iterate (i, &eq); ++i)
+   record_cond (eq);
+}
+}
+
 /* Wrapper for common code to attempt to thread an edge.  For example,
it handles lazily building the dummy condition and the bookkeeping
when jump threading is successful.  */
@@ -1083,9 +1108,27 @@ dom_opt_dom_walker::thread_across_edge (edge e)
integer_zero_node, integer_zero_node,
NULL, NULL);
 
+  /* Push a marker on both stacks so we can unwind the tables back to their
+ current state.  */
+  avail_exprs_stack.safe_push (NULL);
+  const_and_copies_stack.safe_push (NULL_TREE);
+
+  /* Traversing E may result in equivalences we can utilize.  */
+  record_temporary_equivalences (e);
+
+  /* With all the edge equivalences in the tables, go ahead and attempt
+ to thread through E->dest.  */
   ::thread_across_edge (dummy_cond_, e, false,
&const_and_copies_stack,
simplify_stmt_for_jump_threading);
+
+  /* And restore the various tables to their state before
+ we threaded this edge. 
+
+ XXX The code in tree-ssa-threadedge.c will restore the state of
+ the const_and_copies table.  We just have to restore the expression
+ table.  */
+  remove_local_expressions_from_table ();
 }
 
 /* PHI nodes can create equivalences too.
@@ -1905,9 +1948,6 @@ dom_opt_dom_walker::after_dom_children (basic_block bb)
   && (single_succ_edge (bb)->flags & EDGE_ABNORMAL) == 0
   && potentially_threadable_block (single_succ (bb)))
 {
-  /* Push a marker on the stack, which thread_across_edge expects
-and will remove.  */
-  const_and_copies_stack.safe_push (NULL_TREE);
   thread_across_edge (single_succ_edge (bb));
 }
   else if ((last = last_stmt (bb))
@@ -1923,79 +1963,15 @@ dom_opt_dom_walker::after_dom_children (basic_block bb)
   /* Only try to thread the edge if it reaches a target block with
 more than one predecessor and more than one successor.  */
   if (potentially_threadable_block (true_edge->dest))
-   {
- struct edge_info *edge_info;
- unsigned int i;
-
- /* Push a marker ont

Re: [PATCH] Fix PR58417 -- r202700 appears to be causing ICEs

2013-09-19 Thread Kyrill Tkachov

On 18/09/13 20:37, Paul Pluzhnikov wrote:
> Richard,
>
> I am seeing ICEs in libstdc++ make check that I didn't see yesterday:

I'm also seeing these on arm and aarch64.

Kyrill




spawn /home/ppluzhnikov/Archive/gcc-svn/build/./gcc/xg++
-shared-libgcc -B/home/ppluzhnikov/Archive/gcc-svn/build/./gcc
-nostdinc++ 
-L/home/ppluzhnikov/Archive/gcc-svn/build/x86_64-unknown-linux-gnu/libstdc++-v3/src
-L/home/ppluzhnikov/Archive/gcc-svn/build/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
-L/home/ppluzhnikov/Archive/gcc-svn/build/x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
-B/home/ppluzhnikov/Archive/gcc-svn-install/x86_64-unknown-linux-gnu/bin/
-B/home/ppluzhnikov/Archive/gcc-svn-install/x86_64-unknown-linux-gnu/lib/
-isystem 
/home/ppluzhnikov/Archive/gcc-svn-install/x86_64-unknown-linux-gnu/include
-isystem 
/home/ppluzhnikov/Archive/gcc-svn-install/x86_64-unknown-linux-gnu/sys-include
-B/home/ppluzhnikov/Archive/gcc-svn/build/x86_64-unknown-linux-gnu/./libstdc++-v3/src/.libs
-D_GLIBCXX_ASSERT -fmessage-length=0 -ffunction-sections
-fdata-sections -g -O2 -D_GNU_SOURCE -g -O2 -D_GNU_SOURCE
-DLOCALEDIR="." -nostdinc++
-I/home/ppluzhnikov/Archive/gcc-svn/build/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu
-I/home/ppluzhnikov/Archive/gcc-svn/build/x86_64-unknown-linux-gnu/libstdc++-v3/include
-I/home/ppluzhnikov/Archive/gcc-svn/libstdc++-v3/libsupc++
-I/home/ppluzhnikov/Archive/gcc-svn/libstdc++-v3/include/backward
-I/home/ppluzhnikov/Archive/gcc-svn/libstdc++-v3/testsuite/util
/home/ppluzhnikov/Archive/gcc-svn/libstdc++-v3/testsuite/ext/random/normal_mv_distribution/cons/default.cc
-std=c++0x ./libtestc++.a -Wl,--gc-sections -lm -o ./default.exe
/home/ppluzhnikov/Archive/gcc-svn/libstdc++-v3/testsuite/ext/random/normal_mv_distribution/cons/default.cc:
In constructor '__gnu_cxx::normal_mv_distribution<_Dimen,
_RealType>::param_type::param_type() [with long unsigned int _Dimen =
2ul; _RealType = double]':
/home/ppluzhnikov/Archive/gcc-svn/libstdc++-v3/testsuite/ext/random/normal_mv_distribution/cons/default.cc:49:1:
internal compiler error: in build_polynomial_chrec, at
tree-chrec.h:148
  }
  ^
0xf57001 build_polynomial_chrec
 ../../gcc/tree-chrec.h:148
0xf5b8ab chrec_fold_plus_1
 ../../gcc/tree-chrec.c:321
0xf5c441 chrec_fold_plus_poly_poly
 ../../gcc/tree-chrec.c:153
0xf5c441 chrec_fold_plus_1
 ../../gcc/tree-chrec.c:279
0xb6ef7a interpret_rhs_expr
 ../../gcc/tree-scalar-evolution.c:1692
0xb6fecd interpret_gimple_assign
 ../../gcc/tree-scalar-evolution.c:1810
0xb6fecd analyze_scalar_evolution_1
 ../../gcc/tree-scalar-evolution.c:1892
0xb707b2 analyze_scalar_evolution(loop*, tree_node*)
 ../../gcc/tree-scalar-evolution.c:1947
0xb73667 analyze_scalar_evolution_in_loop
 ../../gcc/tree-scalar-evolution.c:2043
0xb73eaf simple_iv(loop*, loop*, tree_node*, affine_iv*, bool)
 ../../gcc/tree-scalar-evolution.c:3167
0x9817c6 estimate_function_body_sizes
 ../../gcc/ipa-inline-analysis.c:2563
0x9822e0 compute_inline_parameters(cgraph_node*, bool)
 ../../gcc/ipa-inline-analysis.c:2696
0x982630 inline_analyze_function
 ../../gcc/ipa-inline-analysis.c:3684
0x9827c7 inline_generate_summary()
 ../../gcc/ipa-inline-analysis.c:3735
0xa30df6 execute_ipa_summary_passes(ipa_opt_pass_d*)
 ../../gcc/passes.c:2000
0x7e8844 ipa_passes
 ../../gcc/cgraphunit.c:2008
0x7e8844 compile()
 ../../gcc/cgraphunit.c:2115
0x7e8a89 finalize_compilation_unit()
 ../../gcc/cgraphunit.c:2269
0x5fcc20 cp_write_global_declarations()
 ../../gcc/cp/decl2.c:4360
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
compiler exited with status 1


Perhaps they are related to your commit:


r202700 | rguenth | 2013-09-18 05:31:45 -0700 (Wed, 18 Sep 2013) | 25 lines

2013-09-18  Richard Biener  

 PR tree-optimization/58417
 * tree-chrec.c (chrec_fold_plus_1): Assert that we do not
 have chrecs with symbols defined in the loop as operands.
 (chrec_fold_multiply): Likewise.
 * tree-scalar-evolution.c (interpret_rhs_expr): Instantiate
 parameters before folding binary operations.
 (struct instantiate_cache_entry_hasher): Remove.
 (struct instantiate_cache_type): Use a pointer-map.
 (instantiate_cache_type::instantiate_cache_type): New function.
 (instantiate_cache_type::get): Likewise.
 (instantiate_cache_type::set): Likewise.
 (instantiate_cache_type::~instantiate_cache_type): Adjust.
 (get_instantiated_value_entry): Likewise.
 (global_cache): New global.
 (instantiate_scev_r, instantiate_scev_poly, instantiate_scev_binary,
 instantiate_array_ref, instantiate_scev_convert, instantiate_scev_3,
 instantiate_scev_2, inst

Re: [PATCH] manage dom-walk_data initialization and finalization with constructors and destructors

2013-09-19 Thread Michael Matz
Hi,

On Wed, 18 Sep 2013, Jeff Law wrote:

> > I know, and I don't like it there either.
> 
> Well, as Ian pointed out, it is in our recommended style guidelines and 
> you'll find uses in the vec class as well.

As I said, yes; I also said those were pre-existing from the C times 
already, so they don't support the new c++ guidelines.  I do have several 
issues with the style guidelines, and yes, it's my fault for not having 
gone through the pains of trying to fight off those things last year :-/

> It's well established at this point

I wouldn't call two recent examples well established, but well.

> I don't see anything in Trevor's work that requires jumping through 
> hoops.

Me neither, from that perspective it's okay.  It's merely that I doubt the 
value of any syntactic privatization like it's implemented in C++, you can 
#define it away, hence the compiler can't make use of that information for 
code generation, and the cognitive value for the developer ("hey I 
shouldn't look at this member from outside") is dubious, as that probably 
is a general rule, no direct data member access from non-members (although 
I have problems with that too).

And I think the fact that Trevor made one data member non-private to 
access it from a non-member function (move_computations_dom_walker::todo) 
just underlines my point: private is useless and gets in the way.

> > What's the benefit of reading and writing such noisy lines? :
> > 
> >*out_mode = mode_;
> >mode_ = GET_MODE_WIDER_MODE (mode_);
> >count_++;
> 
> It makes it very clear to the reader that we're dealing with objects that
> belong to a class instance rather than direct access to an auto or static.
> That can be important.

this->x.

From the wiki it seems that was discussed (on the wiki, not the mailing 
list) and rejected by Lawrence on the grounds of introducing too long 
lines.  I agree with that, but I don't agree that therefore members should 
be named foo_.

> Given it's recommended by our C++ guidelines which were discussed at 
> length, I'm going to explicitly NAK your patch.

Hmmkay.

> FWIW, I have worked on large C++ codebases

Me too.

> that were a free-for-all and found them *amazingly* painful.

I don't think any of my mails about style can be interpreted as advocating 
free-for-all.

> The restricted set allowed for GCC is actually quite reasonable IMHO, 
> particularly for projects where the main body of code is evolving from a 
> pure C base.

Funnily it's the small things that weren't much discussed (probably 
because they are deemed not very important) in the convention that give 
me a hard time, nits such as these syntactic uglifications.  The larger 
things indeed mostly are okayish.


Ciao,
Michael.


gimple build interface

2013-09-19 Thread Andrew MacLeod
First attempt bounced from gcc-patches for some reason  trying one 
more time.


I'm looking at pulling ssa specific bits out of gimple.c so that it 
doesn't require the ssa headers, and can concentrate on basic gimple 
support.  I stumbled across the "new" build interface in gimple.c 
consisting of  these routines:



/* Helper functions to build GIMPLE statements.  */
tree create_gimple_tmp (tree, enum ssa_mode = M_SSA);
gimple build_assign (enum tree_code, tree, int, enum ssa_mode = M_SSA);
gimple build_assign (enum tree_code, gimple, int, enum ssa_mode = M_SSA);
gimple build_assign (enum tree_code, tree, tree, enum ssa_mode = M_SSA);
gimple build_assign (enum tree_code, gimple, tree, enum ssa_mode = M_SSA);
gimple build_assign (enum tree_code, tree, gimple, enum ssa_mode = M_SSA);
gimple build_assign (enum tree_code, gimple, gimple, enum ssa_mode = M_SSA);
gimple build_type_cast (tree, tree, enum ssa_mode = M_SSA);
gimple build_type_cast (tree, gimple, enum ssa_mode = M_SSA);



currently only used in asan.c

the routine giving me trouble is:

tree
create_gimple_tmp (tree type, enum ssa_mode mode)
{
  return (mode == M_SSA)
 ? make_ssa_name (type, NULL)
 : create_tmp_var (type, NULL);
}


Other than one other routine that doesn't belong in gimple.c anyway, 
this call to make_ssa_name() is the only thing that still forces gimple.c 
to include tree-ssa.h.


This new interface is really trying to bridge the gap between gimple and 
ssa and simplify the life of anyone needing to generate a series of 
instructions.  Before actually making any code changes, I wanted to get 
a consensus on the future direction of this interface.


I see the benefit in the streamlined asan.c code,  but I detest that 
ssa_mode flag.  And as long as it supports SSA, I don't think it should 
be in gimple.c.


I think this is of most use to ssa passes that need to construct code 
snippets, so I propose we make this ssa specific and put it in 
tree-ssa.c (renaming it ssa_build_assign),  *OR* we could leave it 
general purpose and put it in its own set of files, 
gimple-ssa-build.[ch] or something that crosses the border between the 
two representations.


I'd also suggest that the final optional parameter be changed to tree 
*lhs = NULL_TREE,  which would allow the caller to specify the LHS if 
they want, otherwise make_ssa_name would be called.   If we want to 
leave it supporting both gimple and ssa, then anyone from gimple land 
could pass in a gimple LHS variable thus avoiding the call to 
make_ssa_name.


Thoughts?
Andrew
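
The optional-LHS idea above could look roughly like the following.  This is a sketch under heavy assumptions: toy_tree, toy_assign and toy_make_ssa_name are stand-ins invented for illustration and are not GCC's tree, gimple or make_ssa_name, and the LHS is passed by value rather than with the exact signature suggested in the mail.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Stand-ins only: these are NOT GCC's tree/gimple types.
struct toy_tree { std::string name; };
typedef toy_tree *tree;
struct toy_assign { tree lhs; tree op0; tree op1; char code; };

// Toy replacement for make_ssa_name: hands out a fresh temporary.
// A deque keeps earlier elements at stable addresses as it grows.
static tree
toy_make_ssa_name ()
{
  static std::deque<toy_tree> pool;
  toy_tree t;
  t.name = "_" + std::to_string (pool.size () + 1);
  pool.push_back (t);
  return &pool.back ();
}

// Build "lhs = op0 CODE op1".  If the caller supplies LHS it is used
// as is; otherwise a fresh SSA-style name is created, so no mode flag
// is needed.
static toy_assign
ssa_build_assign (char code, tree op0, tree op1, tree lhs = nullptr)
{
  if (lhs == nullptr)
    lhs = toy_make_ssa_name ();
  toy_assign stmt = { lhs, op0, op1, code };
  return stmt;
}
```

With this shape a non-SSA caller passes its own temporary as the last argument, while an SSA pass simply omits it.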


[PATCH, SH4] Fix PR58475 insn swapb does not satisfy its constraints

2013-09-19 Thread Christian Bruel
Hello,

This patch fixes the aforementioned PR by no longer accepting FPUL_REG as
a valid register for arith_reg_operand on TARGET_SH4.  (This was an odd
SH4-only exception within the SH family.)

The only affected insn is movsf_ie, which is used for reg-fpreg transfers.
Its condition now mentions fpul_operand explicitly, which lets us simplify
the matching logic a bit by removing the extra checks.

The testsuite survived (no regression) for 
-m2,-m2a,-m2a-nofpu,-m2a-single,-m2a-single-only,-m3,-m3e,-m4,-m4-single,-m4-single-only,-m4a,-m4a-single,-m4a-single-only

No performance impact on a large number of benchmarks (CSIBE, EEMBC,
Coremark, ...)

sh4-linux-elf survived a full Linux distribution rebuild

OK for trunk?

many thanks,

Christian


2013-09-19  Christian Bruel  

	PR target/58475
	* config/sh/sh.md (movsf_ie): Allow fpul_operand.
	* config/sh/predicates.md (arith_reg_operand): Disallow FPUL_REG.

2013-09-19  Christian Bruel  

	PR target/58475
	* gcc.target/sh/torture/pr58475.c: New test.

Index: gcc/config/sh/predicates.md
===
--- gcc/config/sh/predicates.md	(revision 202699)
+++ gcc/config/sh/predicates.md	(working copy)
@@ -154,7 +154,7 @@
 
   return (regno != T_REG && regno != PR_REG
 	  && ! TARGET_REGISTER_P (regno)
-	  && (regno != FPUL_REG || TARGET_SH4)
+	  && regno != FPUL_REG
 	  && regno != MACH_REG && regno != MACL_REG);
 }
   /* Allow a no-op sign extension - compare LOAD_EXTEND_OP.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 202699)
+++ gcc/config/sh/sh.md	(working copy)
@@ -8203,15 +8205,9 @@ label:
(use (match_operand:PSI 2 "fpscr_operand" "c,c,c,c,c,c,c,c,c,c,c,c,c,c,c,c,c,c,c"))
(clobber (match_scratch:SI 3 "=X,X,Bsc,Bsc,&z,X,X,X,X,X,X,X,X,y,X,X,X,X,X"))]
   "TARGET_SH2E
-   && (arith_reg_operand (operands[0], SFmode)
-   || arith_reg_operand (operands[1], SFmode)
-   || arith_reg_operand (operands[3], SImode)
-   || (fpul_operand (operands[0], SFmode)
-	   && memory_operand (operands[1], SFmode)
-	   && GET_CODE (XEXP (operands[1], 0)) == POST_INC)
-   || (fpul_operand (operands[1], SFmode)
-	   && memory_operand (operands[0], SFmode)
-	   && GET_CODE (XEXP (operands[0], 0)) == PRE_DEC))"
+   && (arith_reg_operand (operands[0], SFmode) || fpul_operand (operands[0], SFmode)
+   || arith_reg_operand (operands[1], SFmode) || fpul_operand (operands[1], SFmode)
+   || arith_reg_operand (operands[3], SImode))"
   "@
 	fmov	%1,%0
 	mov	%1,%0

Index: gcc/testsuite/gcc.target/sh/torture/pr58475.c
===
--- gcc/testsuite/gcc.target/sh/torture/pr58475.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/torture/pr58475.c	(working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile { target "sh*-*-*" } } */
+
+int
+kerninfo(int __bsx, double tscale)
+{
+ return (
+	 (int)(__extension__
+	   ({
+		 ((((__bsx) & 0xff000000u) >> 24)
+		  | (((__bsx) & 0x00ff0000) >> 8)
+		  | (((__bsx) & 0x0000ff00) << 8)
+		  | (((__bsx) & 0x000000ff) << 24)
+		  ); }))
+	   * tscale);
+}


Re: [PATCH, AArch64] Fix the pointer-typed function argument expansion in aarch64_simd_expand_args

2013-09-19 Thread Yufeng Zhang

Ping~

Thanks,
Yufeng

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00774.html


On 09/10/13 18:12, Yufeng Zhang wrote:

Oops, now attach the correct patch and change log.

Thanks,
Yufeng

gcc/

* config/aarch64/aarch64-builtins.c (aarch64_simd_expand_args):
Call aarch64_simd_expand_args to update op[argc].


On 09/10/13 18:08, Yufeng Zhang wrote:

This patch fixes a number of test failures in gcc.target/aarch64/v*.c in
ILP32.

The corresponding RTL patterns for some load/store builtins have Pmode
(i.e. DImode) specified for their address operands.  However, coming
from a pointer-typed function argument, op[argc] will have SImode in
ILP32.  Instead of duplicating these RTL patterns to cope with SImode
operand (which e.g. would complicate arm_neon.h), we explicitly convert
the operand to Pmode here; an address operand in a RTL shall have Pmode
anyway.  Note that if op[argc] already has DImode,
convert_memory_address will simply return it





Re: Drop generic32 cost model

2013-09-19 Thread Ian Lance Taylor
On Wed, Sep 18, 2013 at 1:39 PM, Jan Hubicka  wrote:
>
> when the generic model was introduced, 32bit only CPUs were still common on the
> market.  It would be stupid to tune 64bit code for CPUs that will never run 
> it.
> We thus introduced two models - generic32 that was considering needs
> of 32bit cpus (centrinos in particular) and generic64 that didn't.
>
>  /* Generic32 should produce code tuned for PPro, Pentium4, Nocona,
> Athlon and K8.  */
>  /* Generic64 should produce code tuned for Nocona and K8.  */
>
> were the original definitions, which are still in the source.
>
> Today the 32bit only CPUs are no longer important.  This patch thus
> drops 32bit generic.  This has the effect of dropping the following flags
> for generic at -m32:
>  use_leave, avoid_vector_decode, slow_imul_imm32_mem, slow_imul_imm8
> that are currently enabled for generic64 only.  This was to accommodate
> earlier AMD chips that are no longer relevant either.
>
> I also updated comment:
> ! /* Generic64 should produce code tuned for Nocona and K8.  */
> to:
> ! /* Generic should produce code tuned for Core-i7 (and newer chips)
> !and btver1 (and newer chips).  */
> This is what I think generic represents today (it also fares well on earlier
> cores and amdfam10, but we probably don't want to get too limited by these
> anymore).
>
> I would like to proceed with modernization of generic64 - in particular
> to switch it to 4 issue scheduling model and revisit individual flags
> incrementally.
>
> Bootstrapped/regtested x86_64-linux, will commit it tomorrow if there
> are no complaints.
>
> Honza
>
> * i386.h (TARGET_GENERIC32, TARGET_GENERIC64): Remove.
> (TARGET_GENERIC): Use PROCESSOR_GENERIC.
> (enum processor_type): Unify generic32 and 64.
> * i386.md (cpu): Likewise.
> * x86-tune.def (use_leave): Enable for generic32.
> (avoid_vector_decode, slow_imul_imm32_mem, slow_imul_imm8): Likewise.
> * athlon.md: Change generic64 to generic in all occurrences.
> * i386-c.c (ix86_target_macros_internal): Unify generic64 and 32.
> (ix86_target_macros_internal): Likewise.
> * driver-i386.c (host_detect_local_cpu): Likewise.
> * i386.c (generic64_memcpy, generic64_memset, generic64_cost): Rename 
> to ..
> (generic_memcpy, generic_memset, generic_cost): This one.
> (generic32_memcpy, generic32_memset, generic32_cost): Remove.
> (m_GENERIC32, m_GENERIC64): Remove.
> (m_GENERIC): Turn into one flag.
> (processor_target): Unify generic tunings.
> (ix86_option_override_internal): Replace generic32/64 by generic.
> (ix86_issue_rate): Likewise.
> (ix86_adjust_cost): Likewise.



I'm seeing infinite recursion in decide_alg on
x86_64-unknown-linux-gnu when compiling the C code in
libgo/runtime/go-append.c with -m32, and I suspect that this patch is
the culprit.  The file compiles fine without -m32.

#5  0x00b6cc23 in decide_alg (count=count@entry=0, expected_size=2048,
memset=memset@entry=false,
dynamic_check=dynamic_check@entry=0x7fffd72c,
noalign=noalign@entry=0x7fffd72b)
at ../../trunk/gcc/config/i386/i386.c:22769
#6  0x00b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
memset=memset@entry=false,
dynamic_check=dynamic_check@entry=0x7fffd72c,
noalign=noalign@entry=0x7fffd72b)
at ../../trunk/gcc/config/i386/i386.c:22871
#7  0x00b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
memset=memset@entry=false,
dynamic_check=dynamic_check@entry=0x7fffd72c,
noalign=noalign@entry=0x7fffd72b)
at ../../trunk/gcc/config/i386/i386.c:22871
#8  0x00b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
memset=memset@entry=false,
dynamic_check=dynamic_check@entry=0x7fffd72c,
noalign=noalign@entry=0x7fffd72b)
at ../../trunk/gcc/config/i386/i386.c:22871
#9  0x00b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
memset=memset@entry=false,
dynamic_check=dynamic_check@entry=0x7fffd72c,
noalign=noalign@entry=0x7fffd72b)
at ../../trunk/gcc/config/i386/i386.c:22871
#10 0x00b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
memset=memset@entry=false,
dynamic_check=dynamic_check@entry=0x7fffd72c,
noalign=noalign@entry=0x7fffd72b)
at ../../trunk/gcc/config/i386/i386.c:22871


decide_alg is being called from ix86_expand_movmem, from
expand_builtin_memcpy, for the call at line 61 of go-append.c.
  __builtin_memcpy (n, a.__values, a.__count * element_size);

I'm continuing to look.

Ian


Re: Drop generic32 cost model

2013-09-19 Thread Jan Hubicka
> 
> 
> decide_alg is being called from ix86_expand_movmem, from
> expand_builtin_memcpy, for the call at line 61 of go-append.c.
>   __builtin_memcpy (n, a.__values, a.__count * element_size);
> 
> I'm continuing to look.

Indeed it is a problem with this patch - the issue is that generic64 had
dummy 32bit alg entries (those should not cause an infinite loop, I will
fix that too).

I am testing:
Index: i386.c
===
--- i386.c  (revision 202741)
+++ i386.c  (working copy)
@@ -1648,11 +1648,13 @@ struct processor_costs slm_cost = {
and btver1 (and newer chips).  */
 
 static stringop_algs generic_memcpy[2] = {
-  DUMMY_STRINGOP_ALGS,
+  {libcall, {{32, loop, false}, {8192, rep_prefix_4_byte, false},
+ {-1, libcall, false}}},
   {libcall, {{32, loop, false}, {8192, rep_prefix_8_byte, false},
  {-1, libcall, false}}}};
 static stringop_algs generic_memset[2] = {
-  DUMMY_STRINGOP_ALGS,
+  {libcall, {{32, loop, false}, {8192, rep_prefix_4_byte, false},
+ {-1, libcall, false}}},
   {libcall, {{32, loop, false}, {8192, rep_prefix_8_byte, false},
  {-1, libcall, false}}}};
 static const

> 
> Ian
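
For readers unfamiliar with the table format in the patch above, each entry pairs an upper size bound with an algorithm, and -1 means "no upper bound".  The following is a simplified sketch of how such a table might be consulted, not the real decide_alg: every toy_ name is hypothetical, the per-entry bool from the patch is omitted, and the lookup rule (first entry whose bound covers the size) is an assumption about the general shape only.

```cpp
#include <cassert>

// Simplified stand-ins; the real definitions live in gcc/config/i386.
enum toy_alg
{
  toy_libcall,
  toy_loop,
  toy_rep_prefix_4_byte,
  toy_rep_prefix_8_byte
};

// max == -1 means "no upper bound".
struct toy_size_entry { long max; toy_alg alg; };

struct toy_stringop_algs
{
  toy_alg unknown_size;         // used when the size is not known
  toy_size_entry size[3];
};

// Mirrors the shape of the 32-bit entry added by the patch.
static const toy_stringop_algs toy_generic_memcpy_32 =
  { toy_libcall,
    { { 32, toy_loop },
      { 8192, toy_rep_prefix_4_byte },
      { -1, toy_libcall } } };

// Pick the first entry whose bound covers SIZE.  SIZE < 0 means the
// size is unknown at compile time.
static toy_alg
toy_pick_alg (const toy_stringop_algs &algs, long size)
{
  if (size < 0)
    return algs.unknown_size;
  for (const toy_size_entry &e : algs.size)
    if (e.max == -1 || size <= e.max)
      return e.alg;
  return toy_libcall;           // no entry matched
}
```

Under these assumptions a 16-byte copy selects the simple loop, a 4096-byte copy selects the 4-byte rep prefix, and anything above 8192 bytes falls through to the library call.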


Re: Drop generic32 cost model

2013-09-19 Thread Ian Lance Taylor
On Thu, Sep 19, 2013 at 8:38 AM, Jan Hubicka  wrote:
>>
>>
>> decide_alg is being called from ix86_expand_movmem, from
>> expand_builtin_memcpy, for the call at line 61 of go-append.c.
>>   __builtin_memcpy (n, a.__values, a.__count * element_size);
>>
>> I'm continuing to look.
>
> Indeed it is a problem with this patch - the issue is that generic64 had
> dummy 32bit alg entries (those should not cause an infinite loop, I will
> fix that too).

Thanks, that patch at least lets the build continue.

Ian


[wwwdocs] Mention ubsan in 4.9 changes.html

2013-09-19 Thread Marek Polacek
Maybe it'd be worth noting in changes.html that GCC now has the
ubsan...

Ok to apply?

--- www/htdocs/gcc-4.9/changes.html.mp  2013-09-19 16:54:32.113724993 +0200
+++ www/htdocs/gcc-4.9/changes.html 2013-09-19 17:07:05.418030738 +0200
@@ -38,6 +38,14 @@
 AddressSanitizer, a fast memory error detector, is now available on 
ARM.
 
   
+  
+UndefinedBehaviorSanitizer, a fast undefined behavior detector,
+has been added and can be enabled via 
-fsanitize=undefined.
+Various computations will be instrumented to detect undefined behavior
+at runtime.  UndefinedBehaviorSanitizer is currently available for C
+and C++ languages.
+
+  
 
 New Languages and Language specific improvements
 
Marek


Re: [PATCH, committed] SH: Fix PR58314 (unsatisfied constraints)

2013-09-19 Thread Richard Sandiford
Christian Bruel  writes:
> Index: gcc/config/sh/sh.md
> ===
> --- gcc/config/sh/sh.md   (revision 202699)
> +++ gcc/config/sh/sh.md   (working copy)
> @@ -6894,9 +6894,11 @@ label:
>  ;; reloading MAC subregs otherwise.  For that probably special patterns
>  ;; would be required.
>  (define_insn "*mov_reg_reg"
> -  [(set (match_operand:QIHI 0 "arith_reg_dest" "=r,m,*z")
> - (match_operand:QIHI 1 "register_operand" "r,*z,m"))]
> -  "TARGET_SH1 && !t_reg_operand (operands[1], VOIDmode)"
> +  [(set (match_operand:QIHI 0 "general_movdst_operand" "=r,m,*z")
> + (match_operand:QIHI 1 "general_movsrc_operand" "r,*z,m"))]
> +  "TARGET_SH1 && !t_reg_operand (operands[1], VOIDmode)
> +   && arith_reg_dest (operands[0], <MODE>mode)
> +   && register_operand (operands[1], <MODE>mode)"

This defeats the purpose of changing the predicates though.  The problem
with the original pattern was that you shouldn't have a situation where
the constraints allow a combination that recog wouldn't match to the
same define_insn.  Constraints must always match a subset of what
recog would match.

Sorry for just saying something's wrong without suggesting a fix,
but I don't know anything about the SH port.  In general though,
the "r<-r", "r<-m" and "m<-r" alternatives should be part of a single
define_insn, rather than split across several.  Which sounds like what
Oleg was suggesting about folding the r<-r alternatives back into the
main patterns.

Thanks,
Richard


Re: [PATCH][gomp4] Plugins Support in LibGOMP (Take 2)

2013-09-19 Thread Michael V. Zolotukhin
Hi Jakub,

Updated patch and my answers are below.

> The OpenMP standard has the omp_is_initial_device () function that can be
> used to query whether the code is offloaded or not.  So I don't think we
> need to do the logging.  For the device 257 hack we of course don't return
> that as true, but that is a hack that is going away.
Ok that sounds good too.

> > @@ -50,6 +59,10 @@ struct target_mem_desc {
> >struct target_mem_desc *prev;
> >/* Number of items in following list.  */
> >size_t list_count;
> > +
> > +  /* Corresponding target device descriptor.  */
> > +  struct gomp_device_descr* device_descr;
> 
> Please put the space before *, not after it.
I wasn't aware of that rule, sorry.  Fixed.

> > +  /* Plugin file name.  */
> > +  char plugin_name[PATH_MAX];
> 
> I don't like such fixed size arrays, for most cases
> it will be big memory waste.  What do you need the plugin_name
> for?  And, if you really need it past dlopen, can't you store
> it as const char *plugin_name instead?
I kept it just in case - it easily could be removed, and I did it in the current
version of the patch.

> > +
> > +  /* Plugin file handler.  */
> > +  void *plugin_handle;
> > +
> > +  /* Function handlers.  */
> > +  bool (*device_available_func) (void);
> 
> The scan hook shouldn't give you just bool whether the device is available,
> but how many devices of that kind are available.  You can have 2 MIC
> cards and one or two HSAIL GPGPU in a box e.g.  Plus, is this hook useful
> after the initialization at all?  I'd say it would be enough to just
> dlsym it during initialization, ask how many devices it has and just create
> that many device structures with that plugin_handle.
> What you want are hooks for device_alloc (taking size and align arguments,
> returning uintptr_t target address), device_free (taking uintptr_t target
> address and perhaps size), device_copyto (like memcpy, just with target
> address uintptr_t instead of void *) and device_copyfrom (similarly),
> and device_run hook or similar (taking host and target fn and target
> uintptr_t address of the block with pointers).
That's just a stub showing how everything would work in the future, once the
libgomp<->plugin interface is finally settled.
I think it's better to wait a little until we have progressed further in
the development of the libgomp plugin - we will probably spot new issues in the
interface.  Anyway, it's easy to add any routines we want here.
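
To make the hook list in the quoted review concrete, here is one possible shape for such a table, backed by plain host memory.  It is a sketch only: toy_device_hooks and every signature in it are assumptions made for illustration, not the settled libgomp<->plugin interface, the device_run hook is simplified to take only a host function, and error handling is omitted.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical hook table following the list in the quoted review.
struct toy_device_hooks
{
  int (*get_num_devices) (void);
  uintptr_t (*device_alloc) (size_t size, size_t align);
  void (*device_free) (uintptr_t target_addr);
  void (*device_copyto) (uintptr_t target_dst, const void *host_src,
                         size_t n);
  void (*device_copyfrom) (void *host_dst, uintptr_t target_src, size_t n);
  void (*device_run) (void (*host_fn) (void *), uintptr_t target_block);
};

// A fake "device" that lives in host memory.
static int host_num_devices (void) { return 1; }

static uintptr_t
host_alloc (size_t size, size_t align)
{
  (void) align;                 // malloc alignment suffices for this toy
  return (uintptr_t) malloc (size);
}

static void host_free (uintptr_t p) { free ((void *) p); }

static void
host_copyto (uintptr_t dst, const void *src, size_t n)
{ memcpy ((void *) dst, src, n); }

static void
host_copyfrom (void *dst, uintptr_t src, size_t n)
{ memcpy (dst, (const void *) src, n); }

static void
host_run (void (*fn) (void *), uintptr_t block)
{ fn ((void *) block); }

static const toy_device_hooks host_hooks =
  { host_num_devices, host_alloc, host_free,
    host_copyto, host_copyfrom, host_run };

// Example offloaded function: doubles the int stored in the block.
static void double_int (void *block) { *(int *) block *= 2; }

// Round trip through the hooks: allocate, copy in, run, copy out, free.
static int
toy_offload_double (const toy_device_hooks &dev, int value)
{
  uintptr_t buf = dev.device_alloc (sizeof value, alignof (int));
  dev.device_copyto (buf, &value, sizeof value);
  dev.device_run (double_int, buf);
  dev.device_copyfrom (&value, buf, sizeof value);
  dev.device_free (buf);
  return value;
}
```

Returning a device count from the scan hook, rather than a bool, lets the runtime create one descriptor per device that shares the same plugin handle.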

> You need to call pthread_once here too, so that omp_get_num_devices returns
> the correct number.
>  ...
> Thus, IMHO you should just call gomp_get_num_devices () here, or after the
> if (device_id == -1) block, and that will ensure gomp_target_init has been
> already called.  Just save the return value into a temporary.
Fixed.
> 
> > +  if (device_id == -1)
> >  {
> >struct gomp_task_icv *icv = gomp_icv (false);
> > -  device = icv->default_device_var;
> > +  device_id = icv->default_device_var;
> >  }
> >/* FIXME: Temporary hack for testing non-shared address spaces on host.  
> > */
> > -  if (device == 257)
> > -return 257;
> > -  if (device >= gomp_get_num_devices ())
> > -return -1;
> > -  return -1;
> > +  if (device_id == 257)
> > +return &devices[0];
> 
> Guess the hack should be if gomp_get_num_devices () returned 0 and
> device_id == 257, otherwise the hack device won't be created.
Currently we always have at least one device (see FIXME in
gomp_find_available_plugins routine) - even if we found no plugins, we create a
hack device.  If we found some plugins, then we don't create a new device for
the hack, but use the devices[0] for it.

> > -  struct target_mem_desc *tgt
> > -= gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
> > +  struct target_mem_desc *tgt = NULL;
> > +  tgt = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
> 
> Why this change?
Changed back.

> >tgt->list_count = mapnum;
> >tgt->refcount = 1;
> > +  tgt->device_descr = devicep;
> > +
> > +  if (!devicep)
> > +return tgt;
> 
> Why this conditional?  mapnum == 0 conditional below will do the trick.
Fixed.

> > +  /* FIXME: currently only device 257 is available and it is a hack which 
> > is
> > + done only to test the functionality early.  We need to enable all 
> > devices,
> > + not only this one.  */
> 
> Yeah, I don't see why the FIXME is here, just use gomp_map_vars
> unconditionally, or conditionally on some flag in the device descr structure
> (whether device has non-shared address space).
Removed.

> > +  if (devicep->id == 257)
> >  {
> >struct target_mem_desc *tgt
> > -   = gomp_map_vars (mapnum, hostaddrs, sizes, kinds, true);
> > +   = gomp_map_vars (devicep, mapnum, hostaddrs, sizes, kinds, true);
> >fn ((void *) tgt->tgt_start);
> 
> And thus would be devicep->device_run hook.
We'll start device_run hook here once the interface libgomp<->plugin is fully 
set.

> Why devicep here, when

Re: [PATCH,ARM] fix testsuite failures for arm-none-linux-gnueabihf

2013-09-19 Thread Charles Baylis
Hi

Here is an updated version.

Changelog:

* gcc.dg/builtin-apply2.c: skip test on arm hardfloat ABI targets
* gcc.dg/tls/pr42894.c: Remove options, forcing -mthumb fails
with hardfloat, and test is not thumb-specific
* gcc.target/arm/thumb-ltu.c: Avoid test failure with
hardfloat ABI by requiring arm_thumb1_ok
* lib/target-supports.exp
(check_effective_target_arm_fp16_ok_nocache): don't force
-mfloat-abi=soft when building for hardfloat target

On 19 August 2013 16:34, Richard Earnshaw  wrote:
> On 15/08/13 15:10, Charles Baylis wrote:
>> Hi
>>
>> The attached patch fixes some tests which fail when testing gcc for a
>> arm-none-linux-gnueabihf target because they do not expect to be built
>> with a hard float ABI.
>>
>> The change in target-supports.exp fixes arm-fp16-ops-5.c and 
>> arm-fp16-ops-6.c.
>>
>> Tested on arm-none-linux-gnueabihf using qemu-arm, and does not cause
>> any other tests to break.
>>
>> Comments? This is my first patch, so please point out anything wrong.
>>
>>
>
>>
>>
>> 2013-08-15  Charles Baylis  
>>
>> * gcc.dg/builtin-apply2.c: skip test on arm hardfloat ABI targets
>> * gcc.dg/tls/pr42894.c: Use -mfloat-abi=soft as Thumb1 does
>> not support hardfloat ABI
>> * arm/thumb-ltu.c: Use -mfloat-abi=soft as Thumb1 does not
>> support hardfloat ABI
>> * target-supports.exp: don't force -mfloat-abi=soft when
>> building for hardfloat target
>>
>>
>> hf-fixes.txt
>>
>>
>> Index: gcc/testsuite/gcc.dg/builtin-apply2.c
>> ===
>> --- gcc/testsuite/gcc.dg/builtin-apply2.c (revision 201726)
>> +++ gcc/testsuite/gcc.dg/builtin-apply2.c (working copy)
>> @@ -1,6 +1,7 @@
>>  /* { dg-do run } */
>>  /* { dg-skip-if "Variadic funcs have all args on stack. Normal funcs have 
>> args in registers." { "aarch64*-*-* avr-*-* " } { "*" } { "" } } */
>>  /* { dg-skip-if "Variadic funcs use Base AAPCS.  Normal funcs use VFP 
>> variant." { "arm*-*-*" } { "-mfloat-abi=hard" } { "" } } */
>> +/* { dg-skip-if "Variadic funcs use Base AAPCS.  Normal funcs use VFP 
>> variant." { "arm*-*-gnueabihf" } { "*" } { "-mfloat-abi=soft*" } } */
>>
>
>
> As you've noticed, basing the test's behaviour on the config variant
> doesn't work reliably.  The builtin-apply2 test really should be skipped
> if the current test variant is not soft-float.  We already have
> check_effective_target_arm_hf_eabi in target-supports.exp that checks
> whether __ARM_PCS_VFP is defined during a compilation.  So you can replace
> both arm related lines in builtin-apply2 with
>
>  /* { dg-skip-if "Variadic funcs use Base AAPCS.  Normal funcs use VFP
> variant." { "arm*-*-*" && arm_hf_eabi} { "*" } { "" } } */
>
>>  /* PR target/12503 */
>>  /* Origin:  */
>> Index: gcc/testsuite/gcc.dg/tls/pr42894.c
>> ===
>> --- gcc/testsuite/gcc.dg/tls/pr42894.c(revision 201726)
>> +++ gcc/testsuite/gcc.dg/tls/pr42894.c(working copy)
>> @@ -1,6 +1,7 @@
>>  /* PR target/42894 */
>>  /* { dg-do compile } */
>>  /* { dg-options "-march=armv5te -mthumb" { target arm*-*-* } } */
>> +/* { dg-options "-march=armv5te -mthumb -mfloat-abi=soft" { target 
>> arm*-*-*hf } } */
>>  /* { dg-require-effective-target tls } */
>>
>
> Although the original PR was for Thumb1, this is a generic test.  I'm
> not convinced that on ARM it should try to force thumb1.  Removing the
> original dg-options line should solve the problem and we then get better
> multi-lib testing as well.
>
>>  extern __thread int t;
>> Index: gcc/testsuite/gcc.target/arm/thumb-ltu.c
>> ===
>> --- gcc/testsuite/gcc.target/arm/thumb-ltu.c  (revision 201726)
>> +++ gcc/testsuite/gcc.target/arm/thumb-ltu.c  (working copy)
>> @@ -1,6 +1,6 @@
>>  /* { dg-do compile } */
>>  /* { dg-skip-if "incompatible options" { arm*-*-* } { "-march=*" } { 
>> "-march=armv6" "-march=armv6j" "-march=armv6z" } } */
>> -/* { dg-options "-mcpu=arm1136jf-s -mthumb -O2" } */
>> +/* { dg-options "-mcpu=arm1136jf-s -mthumb -O2 -mfloat-abi=soft" } */
>>
>
> This won't work if there's an explicit -mfloat-abi={softfp,hard} on the
> multilib options.  Probably the best thing to do here is to skip the
> test if arm_thumb1_ok is not true.
>
>>  void f(unsigned a, unsigned b, unsigned c, unsigned d)
>>  {
>> Index: gcc/testsuite/lib/target-supports.exp
>> ===
>> --- gcc/testsuite/lib/target-supports.exp (revision 201726)
>> +++ gcc/testsuite/lib/target-supports.exp (working copy)
>> @@ -2445,6 +2445,11 @@
>>   # Must generate floating-point instructions.
>>   return 0
>>  }
>> +if [check-flags [list "" { *-*-gnueabihf } { "*" } { "" } ]] {
>> +# Use existing float-abi and force an fpu which supports fp16
>
> This should use arm_hf_eabi as described above.
>
>> + se

Re: [PATCH] manage dom-walk_data initialization and finalization with constructors and destructors

2013-09-19 Thread Trevor Saunders
On Thu, Sep 19, 2013 at 03:23:21PM +0200, Michael Matz wrote:
> > I don't see anything in Trevor's work that requires jumping through 
> > hoops.
> 
> Me neither, from that perspective it's okay.  It's merely that I doubt the 
> value of any syntactic privatization like it's implemented in C++, you can 
> #define it away, hence the compiler can't make use of that information for 

No, it can't make use of it if someone does something crazy like #define
it away, which is at least a little tricky because of the ':'.  I believe
clang does in fact make use of private to find unused fields (maybe it
does something else, but I can't imagine what that would be).

> code generation, and the cognitive value for the developer ("hey I 
> shouldn't look at this member from outside") is dubious, as that probably 
> is a general rule, no direct data member access from non-members (although 
> I have problems with that too).

The value is that when you read code you *know* that something is only
used in certain places instead of hoping that is true.

> And I think the fact that Trevor made one data member non-private to 
> access it from a non-member function (move_computations_dom_walker::todo) 
> just underlines my point: private is useless and gets in the way.

It certainly shows a case where that's true, but it doesn't really show
that's always true.

> > > What's the benefit of reading and writing such noisy lines? :
> > > 
> > >*out_mode = mode_;
> > >mode_ = GET_MODE_WIDER_MODE (mode_);
> > >count_++;
> > 
> > It makes it very clear to the reader that we're dealing with objects that
> > belong to a class instance rather than direct access to an auto or static.
> > That can be important.
> 
> this->x.
> 
> From the wiki it seems that was discussed (on the wiki, not the mailing 
> list) and rejected by Lawrence on the grounds of introducing too long 
> lines.  I agree with that, but I don't agree that therefore members should 
> be named foo_.

this-> also has the disadvantage that you always have to remember it, and
it fundamentally doesn't help you know where a member could possibly be
used.
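
To make the two positions concrete, here is a minimal sketch of the
convention being argued about.  The class and its members are hypothetical
(loosely modelled on the mode_/count_ lines quoted in this thread), not code
from the patch.  Because width_ and count_ are private, a reader knows
next () is the only place they change.

```cpp
#include <cassert>  // for the checks in the usage note

// Hypothetical iterator over power-of-two widths; none of these names
// come from the patch under discussion.
class width_iterator
{
public:
  explicit width_iterator (int start) : width_ (start), count_ (0) {}

  // The only function that modifies width_ and count_.
  bool next (int *out_width)
  {
    if (width_ > 64)
      return false;
    *out_width = width_;
    width_ *= 2;
    count_++;
    return true;
  }

  int count () const { return count_; }

private:
  int width_;  // trailing underscore marks a data member
  int count_;
};

// Walks 8, 16, 32, 64 and returns how many widths were visited.
inline int
count_widths ()
{
  width_iterator it (8);
  int w;
  while (it.next (&w))
    ;
  return it.count ();
}
```

Writing it.width_ = 0 outside the class is rejected at compile time, which
is the guarantee being defended above; whether the underscore spelling is
worth the noise is the separate question.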

Trev

> 
> > Given it's recommended by our C++ guidelines which were discussed at 
> > length, I'm going to explicitly NAK your patch.
> 
> Hmmkay.
> 
> > FWIW, I have worked on large C++ codebases
> 
> Me too.
> 
> > that were a free-for-all and found them *amazingly* painful.
> 
> I don't think any of my mails about style can be interpreted as advocating 
> free-for-all.
> 
> > The restricted set allowed for GCC is actually quite reasonable IMHO, 
> > particularly for projects where the main body of code is evolving from a 
> > pure C base.
> 
> Funnily it's the small things that weren't much discussed (probably 
> because they are deemed not very important) in the convention that give 
> me a hard time, nits such as these syntactic uglifications.  The larger 
> things indeed mostly are okayish.
> 
> 
> Ciao,
> Michael.


[committed] Fix two simd vectorization issues (PR tree-optimization/58472)

2013-09-19 Thread Jakub Jelinek
Hi!

I've committed the following patch to trunk (except for the testcase, which
went to gomp-4_0-branch only).  The tree-vect-stmts.c hunks went in as obvious:
inv_p was being used uninitialized for simd lane accesses.  The omp-low.c
change fixes a -Wuninitialized warning on simd code.

2013-09-19  Jakub Jelinek  

PR tree-optimization/58472
* tree-vect-stmts.c (vectorizable_store, vectorizable_load): For
simd_lane_access set inv_p = false.
* omp-low.c (lower_rec_input_clauses): Set TREE_NO_WARNING on
the simduid magic VAR_DECL.

* c-c++-common/gomp/pr58472.c: New test.

--- gcc/tree-vect-stmts.c.jj   2013-09-18 12:17:55.0 +0200
+++ gcc/tree-vect-stmts.c   2013-09-19 14:17:00.771484495 +0200
@@ -4182,6 +4182,7 @@ vectorizable_store (gimple stmt, gimple_
  dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr));
  dataref_offset = build_int_cst (reference_alias_ptr_type
  (DR_REF (first_dr)), 0);
+ inv_p = false;
}
  else
dataref_ptr
@@ -5077,6 +5078,7 @@ vectorizable_load (gimple stmt, gimple_s
  dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr));
  dataref_offset = build_int_cst (reference_alias_ptr_type
  (DR_REF (first_dr)), 0);
+ inv_p = false;
}
  else
dataref_ptr
--- gcc/omp-low.c.jj   2013-09-19 12:59:49.0 +0200
+++ gcc/omp-low.c   2013-09-19 18:23:27.860618153 +0200
@@ -3460,6 +3460,9 @@ lower_rec_input_clauses (tree clauses, g
   if (lane)
 {
   tree uid = create_tmp_var (ptr_type_node, "simduid");
+  /* Don't want uninit warnings on simduid, it is always uninitialized,
+but we use it not for the value, but for the DECL_UID only.  */
+  TREE_NO_WARNING (uid) = 1;
   gimple g
= gimple_build_call_internal (IFN_GOMP_SIMD_LANE, 1, uid);
   gimple_call_set_lhs (g, lane);
--- gcc/testsuite/c-c++-common/gomp/pr58472.c.jj   2013-09-19 18:47:56.309209103 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr58472.c   2013-09-19 18:26:24.0 +0200
@@ -0,0 +1,16 @@
+/* PR tree-optimization/58472 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wall -fopenmp" } */
+
+float a[1024], b[1024];
+
+float
+foo ()
+{
+  float s = 0.f;
+  unsigned int i;
+#pragma omp simd reduction(+:s)
+  for (i = 0; i < 1024; ++i)
+s += a[i] * b[i];
+  return s;
+}

Jakub


Go patch committed: Fix inconsistent check for == as memcmp

2013-09-19 Thread Ian Lance Taylor
The Go frontend was inconsistent in determining whether a struct could
use memcmp for the == operator.  This could cause one package to decide
that it could use == and a package importing that one to determine that
it could not.  The effect was an undefined symbol at link time.  This
patch fixes the problem.  I've added a test case (bug479) to the master
testsuite, which will be brought over to the gccgo testsuite in due
course.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline and 4.8 branch.
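
As background on why the "can == be a memcmp?" decision has to be made
consistently, here is a minimal C++ sketch (not the Go frontend's logic; the
struct and helper names are made up for illustration).  It assumes IEEE 754
floats, where +0.0 and -0.0 compare equal but have different bit patterns,
so a bytewise comparison and a field-by-field comparison disagree.

```cpp
#include <cassert>  // for the checks in the usage note
#include <cstring>

// Hypothetical struct with floating-point fields and no padding.
struct point
{
  float x;
  float y;
};

// Field-by-field equality: what == has to mean for this struct.
inline bool
fieldwise_equal (const point &a, const point &b)
{
  return a.x == b.x && a.y == b.y;
}

// Bytewise equality: only valid when every field can be compared as raw
// bytes, which is not the case for floats.
inline bool
bytewise_equal (const point &a, const point &b)
{
  return std::memcmp (&a, &b, sizeof (point)) == 0;
}
```

With a = {0.0f, 1.0f} and b = {-0.0f, 1.0f}, the first helper reports equal
and the second does not.  If two compilation units disagree about which kind
of comparison a type gets, they will reference different helper symbols,
which matches the link-time failure described above.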

Ian

diff -r 2b23d9831cf7 go/expressions.cc
--- a/go/expressions.cc	Wed Sep 18 16:26:07 2013 -0700
+++ b/go/expressions.cc	Thu Sep 19 10:28:24 2013 -0700
@@ -7752,8 +7752,6 @@
 	return false;
   if (arg_type->is_abstract())
 	return false;
-  if (arg_type->named_type() != NULL)
-	arg_type->named_type()->convert(this->gogo_);
 
   unsigned int ret;
   if (this->code_ == BUILTIN_SIZEOF)
diff -r 2b23d9831cf7 go/types.cc
--- a/go/types.cc	Wed Sep 18 16:26:07 2013 -0700
+++ b/go/types.cc	Thu Sep 19 10:28:24 2013 -0700
@@ -2288,9 +2288,7 @@
   }
 
 case TYPE_NAMED:
-  // Begin converting this type to the backend representation.
-  // This will create a placeholder if necessary.
-  this->get_backend(gogo);
+  this->named_type()->convert(gogo);
   return this->named_type()->is_named_backend_type_size_known();
 
 case TYPE_FORWARD:


Re: [GOOGLE] Patch to fix AutoFDO LIPO performance regression

2013-09-19 Thread Dehao Chen
Thanks, patch updated:

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 202725)
+++ gcc/Makefile.in (working copy)
@@ -2960,7 +2960,7 @@ coverage.o : coverage.c $(GCOV_IO_H) $(CONFIG_H) $
 auto-profile.o : auto-profile.c $(CONFIG_H) $(SYSTEM_H) $(FLAGS_H) \
$(BASIC_BLOCK_H) $(DIAGNOSTIC_CORE_H) $(GCOV_IO_H) $(INPUT_H) profile.h \
$(LANGHOOKS_H) $(OPTS_H) $(TREE_PASS_H) $(CGRAPH_H) $(GIMPLE_H)
value-prof.h \
-   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) $(AUTO_PROFILE_H)
+   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) l-ipo.h $(AUTO_PROFILE_H)
 cselib.o : cselib.c $(CONFIG_H) $(SYSTEM_H) coretypes.h dumpfile.h
$(TM_H) $(RTL_H) \
$(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h $(RECOG_H) \
$(EMIT_RTL_H) $(DIAGNOSTIC_CORE_H) $(FUNCTION_H) \
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 202725)
+++ gcc/auto-profile.c (working copy)
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "value-prof.h"
 #include "coverage.h"
 #include "params.h"
+#include "l-ipo.h"
 #include "auto-profile.h"

 /* The following routines implements AutoFDO optimization.
@@ -1290,6 +1291,13 @@ auto_profile (void)
   init_node_map ();
   profile_info = autofdo::afdo_profile_info;

+  cgraph_pre_profiling_inlining_done = true;
+  cgraph_process_module_scope_statics ();
+  /* Now perform link to allow cross module inlining.  */
+  cgraph_do_link ();
+  varpool_do_link ();
+  cgraph_unify_type_alias_sets ();
+
   FOR_EACH_FUNCTION (node)
 {
   if (!gimple_has_body_p (node->symbol.decl))
@@ -1301,21 +1309,33 @@ auto_profile (void)

   push_cfun (DECL_STRUCT_FUNCTION (node->symbol.decl));

+  if (L_IPO_COMP_MODE)
+{
+  basic_block bb;
+  FOR_EACH_BB (bb)
+{
+  gimple_stmt_iterator gsi;
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+ {
+  gimple stmt = gsi_stmt (gsi);
+  if (is_gimple_call (stmt))
+lipo_fixup_cgraph_edge_call_target (stmt);
+ }
+}
+ }
+
   autofdo::afdo_annotate_cfg ();
   compute_function_frequency ();
   update_ssa (TODO_update_ssa);

+  /* Local pure-const may imply need to fixup the cfg.  */
+  if (execute_fixup_cfg () & TODO_cleanup_cfg)
+ cleanup_tree_cfg ();
+
   current_function_decl = NULL;
   pop_cfun ();
 }

-  cgraph_pre_profiling_inlining_done = true;
-  cgraph_process_module_scope_statics ();
-  /* Now perform link to allow cross module inlining.  */
-  cgraph_do_link ();
-  varpool_do_link ();
-  cgraph_unify_type_alias_sets ();
-
   return TODO_rebuild_cgraph_edges;
 }

On Wed, Sep 18, 2013 at 5:16 PM, Xinliang David Li  wrote:
> On Wed, Sep 18, 2013 at 4:51 PM, Dehao Chen  wrote:
>> This patch fixes up the call graph edge targets during the AutoFDO pass, so
>> that when rebuilding call graph edges, it can find the correct callee.
>>
>> Bootstrapped and passed regression test. Benchmark tests on-going.
>>
>> Ok for google-4_8 branch?
>>
>> Thanks,
>> Dehao
>>
>> Index: gcc/Makefile.in
>> ===
>> --- gcc/Makefile.in (revision 202725)
>> +++ gcc/Makefile.in (working copy)
>> @@ -2960,7 +2960,7 @@ coverage.o : coverage.c $(GCOV_IO_H) $(CONFIG_H) $
>>  auto-profile.o : auto-profile.c $(CONFIG_H) $(SYSTEM_H) $(FLAGS_H) \
>> $(BASIC_BLOCK_H) $(DIAGNOSTIC_CORE_H) $(GCOV_IO_H) $(INPUT_H) profile.h \
>> $(LANGHOOKS_H) $(OPTS_H) $(TREE_PASS_H) $(CGRAPH_H) $(GIMPLE_H)
>> value-prof.h \
>> -   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) $(AUTO_PROFILE_H)
>> +   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) l-ipo.h $(AUTO_PROFILE_H)
>>  cselib.o : cselib.c $(CONFIG_H) $(SYSTEM_H) coretypes.h dumpfile.h
>> $(TM_H) $(RTL_H) \
>> $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h $(RECOG_H) \
>> $(EMIT_RTL_H) $(DIAGNOSTIC_CORE_H) $(FUNCTION_H) \
>> Index: gcc/auto-profile.c
>> ===
>> --- gcc/auto-profile.c (revision 202725)
>> +++ gcc/auto-profile.c (working copy)
>> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "value-prof.h"
>>  #include "coverage.h"
>>  #include "params.h"
>> +#include "l-ipo.h"
>>  #include "auto-profile.h"
>>
>>  /* The following routines implements AutoFDO optimization.
>> @@ -1290,6 +1291,13 @@ auto_profile (void)
>>init_node_map ();
>>profile_info = autofdo::afdo_profile_info;
>>
>> +  cgraph_pre_profiling_inlining_done = true;
>> +  cgraph_process_module_scope_statics ();
>> +  /* Now perform link to allow cross module inlining.  */
>> +  cgraph_do_link ();
>> +  varpool_do_link ();
>> +  cgraph_unify_type_alias_sets ();
>> +
>>FOR_EACH_FUNCTION (node)
>>  {
>>if (!gimple_has_body_p (node->symbol.decl))
>> @@ -1301,6 +1309,21 @@ auto_profile (void)
>>
>>push_cfun (DECL_STRUCT_FUNCTION (node->s

Re: [GOOGLE] Patch to fix AutoFDO LIPO performance regression

2013-09-19 Thread Xinliang David Li
ok.

David

On Thu, Sep 19, 2013 at 10:10 AM, Dehao Chen  wrote:
> Thanks, patch updated:
>
> Index: gcc/Makefile.in
> ===
> --- gcc/Makefile.in (revision 202725)
> +++ gcc/Makefile.in (working copy)
> @@ -2960,7 +2960,7 @@ coverage.o : coverage.c $(GCOV_IO_H) $(CONFIG_H) $
>  auto-profile.o : auto-profile.c $(CONFIG_H) $(SYSTEM_H) $(FLAGS_H) \
> $(BASIC_BLOCK_H) $(DIAGNOSTIC_CORE_H) $(GCOV_IO_H) $(INPUT_H) profile.h \
> $(LANGHOOKS_H) $(OPTS_H) $(TREE_PASS_H) $(CGRAPH_H) $(GIMPLE_H)
> value-prof.h \
> -   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) $(AUTO_PROFILE_H)
> +   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) l-ipo.h $(AUTO_PROFILE_H)
>  cselib.o : cselib.c $(CONFIG_H) $(SYSTEM_H) coretypes.h dumpfile.h
> $(TM_H) $(RTL_H) \
> $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h $(RECOG_H) \
> $(EMIT_RTL_H) $(DIAGNOSTIC_CORE_H) $(FUNCTION_H) \
> Index: gcc/auto-profile.c
> ===
> --- gcc/auto-profile.c (revision 202725)
> +++ gcc/auto-profile.c (working copy)
> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "value-prof.h"
>  #include "coverage.h"
>  #include "params.h"
> +#include "l-ipo.h"
>  #include "auto-profile.h"
>
>  /* The following routines implements AutoFDO optimization.
> @@ -1290,6 +1291,13 @@ auto_profile (void)
>init_node_map ();
>profile_info = autofdo::afdo_profile_info;
>
> +  cgraph_pre_profiling_inlining_done = true;
> +  cgraph_process_module_scope_statics ();
> +  /* Now perform link to allow cross module inlining.  */
> +  cgraph_do_link ();
> +  varpool_do_link ();
> +  cgraph_unify_type_alias_sets ();
> +
>FOR_EACH_FUNCTION (node)
>  {
>if (!gimple_has_body_p (node->symbol.decl))
> @@ -1301,21 +1309,33 @@ auto_profile (void)
>
>push_cfun (DECL_STRUCT_FUNCTION (node->symbol.decl));
>
> +  if (L_IPO_COMP_MODE)
> +{
> +  basic_block bb;
> +  FOR_EACH_BB (bb)
> +{
> +  gimple_stmt_iterator gsi;
> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> + {
> +  gimple stmt = gsi_stmt (gsi);
> +  if (is_gimple_call (stmt))
> +lipo_fixup_cgraph_edge_call_target (stmt);
> + }
> +}
> + }
> +
>autofdo::afdo_annotate_cfg ();
>compute_function_frequency ();
>update_ssa (TODO_update_ssa);
>
> +  /* Local pure-const may imply need to fixup the cfg.  */
> +  if (execute_fixup_cfg () & TODO_cleanup_cfg)
> + cleanup_tree_cfg ();
> +
>current_function_decl = NULL;
>pop_cfun ();
>  }
>
> -  cgraph_pre_profiling_inlining_done = true;
> -  cgraph_process_module_scope_statics ();
> -  /* Now perform link to allow cross module inlining.  */
> -  cgraph_do_link ();
> -  varpool_do_link ();
> -  cgraph_unify_type_alias_sets ();
> -
>return TODO_rebuild_cgraph_edges;
>  }
>
> On Wed, Sep 18, 2013 at 5:16 PM, Xinliang David Li  wrote:
>> On Wed, Sep 18, 2013 at 4:51 PM, Dehao Chen  wrote:
>>> This patch fixes up the call graph edge targets during the AutoFDO pass, so
>>> that when rebuilding call graph edges, it can find the correct callee.
>>>
>>> Bootstrapped and passed regression test. Benchmark tests on-going.
>>>
>>> Ok for google-4_8 branch?
>>>
>>> Thanks,
>>> Dehao
>>>
>>> Index: gcc/Makefile.in
>>> ===
>>> --- gcc/Makefile.in (revision 202725)
>>> +++ gcc/Makefile.in (working copy)
>>> @@ -2960,7 +2960,7 @@ coverage.o : coverage.c $(GCOV_IO_H) $(CONFIG_H) $
>>>  auto-profile.o : auto-profile.c $(CONFIG_H) $(SYSTEM_H) $(FLAGS_H) \
>>> $(BASIC_BLOCK_H) $(DIAGNOSTIC_CORE_H) $(GCOV_IO_H) $(INPUT_H) profile.h 
>>> \
>>> $(LANGHOOKS_H) $(OPTS_H) $(TREE_PASS_H) $(CGRAPH_H) $(GIMPLE_H)
>>> value-prof.h \
>>> -   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) $(AUTO_PROFILE_H)
>>> +   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) l-ipo.h 
>>> $(AUTO_PROFILE_H)
>>>  cselib.o : cselib.c $(CONFIG_H) $(SYSTEM_H) coretypes.h dumpfile.h
>>> $(TM_H) $(RTL_H) \
>>> $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h $(RECOG_H) \
>>> $(EMIT_RTL_H) $(DIAGNOSTIC_CORE_H) $(FUNCTION_H) \
>>> Index: gcc/auto-profile.c
>>> ===
>>> --- gcc/auto-profile.c (revision 202725)
>>> +++ gcc/auto-profile.c (working copy)
>>> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "value-prof.h"
>>>  #include "coverage.h"
>>>  #include "params.h"
>>> +#include "l-ipo.h"
>>>  #include "auto-profile.h"
>>>
>>>  /* The following routines implements AutoFDO optimization.
>>> @@ -1290,6 +1291,13 @@ auto_profile (void)
>>>init_node_map ();
>>>profile_info = autofdo::afdo_profile_info;
>>>
>>> +  cgraph_pre_profiling_inlining_done = true;
>>> +  cgraph_process_module_scope_statics ();
>>> +  /* Now perform link to allow cross module inlining.  */
>>> +  

Re: [PATCH] manage dom-walk_data initialization and finalization with constructors and destructors

2013-09-19 Thread Mike Stump
On Sep 19, 2013, at 6:23 AM, Michael Matz  wrote:
> Me neither, from that perspective it's okay.  It's merely that I doubt the 
> value of any syntactic privatization like it's implemented in C++, you can 
> #define it away, hence the compiler can't make use of that information for 
> code generation, and the cognitive value for the developer ("hey I 
> shouldn't look at this member from outside") is dubious, as that probably 
> is a general rule, no direct data member access from non-members (although 
> I have problems with that too).

If we are making engineering decisions on the basis of people being able to say 
#define private public, well, we are so far off into the weeds as to not be 
funny.

ODR:

  --each definition of D shall consist of the same sequence  of  tokens;

Just because you see no value in private doesn't mean others don't.  Consider 
this: it would not be in the language if everyone shared your view.

[C++1y] [PATCH 2/4] Support nested generic lambdas.

2013-09-19 Thread Adam Butcher
* lambda.c (maybe_add_lambda_conv_op): Don't check for instantiated
callop in the case of generic lambdas.
---
 gcc/cp/lambda.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index b04448b..2ffa7e0 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -810,7 +810,7 @@ maybe_add_lambda_conv_op (tree type)
 = (DECL_TEMPLATE_INFO (callop)
 && DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (callop)) == callop);
 
-  if (DECL_INITIAL (callop) == NULL_TREE)
+  if (!generic_lambda_p && DECL_INITIAL (callop) == NULL_TREE)
 {
   /* If the op() wasn't instantiated due to errors, give up.  */
   gcc_assert (errorcount || sorrycount);
-- 
1.8.4



[C++1y] [PATCH 0/4] Fixes and enhancements to generic lambdas and implicit function templates.

2013-09-19 Thread Adam Butcher
Hi all,

The following series contain a few miscellaneous updates to generic lambdas and
implicit function templates.


[1/4]: Use translation-unit-global rather than parameter-list-local counter for
   generic type names to facilitate nested implicit function
   templates.

  Using a function-local counter means that nested generic lambdas generate
  duplicate (conflicting) template type parameters.


[2/4]: Support nested generic lambdas.

  Bug fix; remove assertion not applicable to generic lambdas.


[3/4]: Ensure implicit template parameters have distinct canonical types.

  Unsure about my solution here.  I tinkered with externalizing
  'canonical_type_parameter' from pt.c but was not sure whether it was
  necessary.  It seemed sufficient to make TYPE_CANONICAL distinct for
  each parameter, so I simply made it point to the generated template
  parameter type.


[4/4]: Generate more intuitive name for 'auto' parameters.

  Potentially contentious.  This makes the names generated for implicit template
  parameter types take the form '' rather than '__GenN'.  The former,
  IMHO, looks better in diagnostics.  A better solution might be to make the
  transformation in the diagnostic code rather than relabel the type, but this
  appears to work in my simplistic test cases.
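
For readers who have not tried the feature, here is a minimal sketch of the
kind of source that items [1/4] and [2/4] are about.  It is plain C++14
(-std=c++1y at the time) written for illustration, not taken from the
patches or their test cases; the function names are made up.

```cpp
#include <cassert>  // for the checks in the usage note

// A generic lambda returning another generic lambda.  Each 'auto'
// parameter becomes an implicit template parameter, so the inner and
// outer closures each need their own distinct parameter type.
inline int
nested_sum ()
{
  auto outer = [] (auto x) {
    return [x] (auto y) { return x + y; };
  };
  return outer (1) (2);
}

// A captureless generic lambda converted to a plain function pointer,
// which goes through the closure type's conversion operator.
inline int
via_function_pointer ()
{
  int (*fp) (int) = [] (auto v) { return v + 1; };
  return fp (41);
}
```

Both functions only exercise the feature from the user's side; they say
nothing about how the compiler names or canonicalizes the synthesized
parameter types.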


On the subject of test cases; I'm trying to put together a set to test all
features of the generic lambda and implicit function template updates.  This is
taking longer than I'd hoped as I'm only getting a few minutes here and there to
spend on this at the moment.

Cheers,
Adam


 gcc/cp/lambda.c |  2 +-
 gcc/cp/parser.c | 30 ++
 2 files changed, 19 insertions(+), 13 deletions(-)

-- 
1.8.4



[C++1y] [PATCH 3/4] Ensure implicit template parameters have distinct canonical types.

2013-09-19 Thread Adam Butcher
* parser.c (add_implicit_template_parms): Set the canonical type of a
generic parameter to be that of the newly generated type such that it is
unique.
---
 gcc/cp/parser.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 7e9ade2..148e2f2 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -29005,6 +29005,9 @@ add_implicit_template_parms (cp_parser *parser, size_t 
expect_count,
cur_type = cp_build_qualified_type (new_type, TYPE_QUALS (cur_type));
   else
cur_type = new_type;
+
+  /* Make the canonical type of the parameter distinct.  */
+  TYPE_CANONICAL (TREE_TYPE (TREE_VALUE (p))) = cur_type;
 }
 
   gcc_assert (synth_count == expect_count);
-- 
1.8.4



[C++1y] [PATCH 4/4] Generate more intuitive name for 'auto' parameters.

2013-09-19 Thread Adam Butcher
* parser.c (make_generic_type_name): Spell generic type names ''
rather than '__GenN'.
---
 gcc/cp/parser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 148e2f2..a54496a 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -28902,7 +28902,7 @@ make_generic_type_name ()
 {
   char buf[32];
   static int i = 0;
-  sprintf (buf, "__GenT%d", i);
+  sprintf (buf, "", ++i);
   return get_identifier (buf);
 }
 
-- 
1.8.4



Re: [PATCH] manage dom-walk_data initialization and finalization with constructors and destructors

2013-09-19 Thread Richard Sandiford
Michael Matz  writes:
> What's the benefit of reading and writing such noisy lines? :
>
>   *out_mode = mode_;
>   mode_ = GET_MODE_WIDER_MODE (mode_);
>   count_++;
>
> The uglification merely makes code harder to write and read, it should be 
> used in cases where you _don't_ want developers to write such names.

Heh.  Since it's my code being used as the example here: I also find it
very ugly FWIW.  I only added the underscores because that's what the
conventions said.

But we're never going to get consensus on this kind of thing.  E.g. I
know some people really hate the GNU formatting style (although I very
much like it).  So I just held my nose while writing the patch.

Thanks,
Richard


Re: gimple build interface

2013-09-19 Thread Andrew MacLeod

On 09/19/2013 09:24 AM, Andrew MacLeod wrote:


I think this is of most use to ssa passes that need to construct code 
snippets, so I propose we make this ssa specific and put it in 
tree-ssa.c (renaming it ssa_build_assign),  *OR* we could leave it 
general purpose and put it in its own set of files, 
gimple-ssa-build.[ch] or something that crosses the border between the 
two representations.


I'd also suggest that the final optional parameter be changed to tree 
*lhs = NULL_TREE,  which would allow the caller to specify the LHS if 
they want, otherwise make_ssa_name would be called. If we want to 
leave it supporting both gimple and ssa, then anyone from gimple land 
could pass in a gimple LHS variable thus avoiding the call to 
make_ssa_name


Thoughts?
Andrew
Anyway, here is a patch which does that and a bit more.  I didn't rename 
build_assign() to ssa_build_assign(), even though SSA names are the only 
kind actually created right now.  We can leave that for the day someone 
actually decides to flesh this interface out, and maybe we'll want to 
pass in gimple_tmps and call them from front ends or other places; 
then it would have to be renamed again.  So I just left it as is for the 
moment, but that could be changed.


I also moved gimple_replace_lhs() to tree-ssa.c and renamed it 
ssa_replace_lhs(). It calls insert_debug_temp_for_var_def() from 
tree-ssa.c  and that only works with the immediate use operands.. so 
that is an SSA specific routine, which makes this one SSA specific as well.


With those two changes gimple.c no longer needs to include tree-ssa.h; it is 
replaced with tree-flow.h.  Some preliminary work toward moving the 
immediate use routines out of tree-flow.h includes:


struct count_ptr_d, count_ptr_derefs() and count_uses_and_derefs() also get 
moved to tree-ssa.c, since they also require the immediate use 
mechanism and are thus SSA dependent as well.


This bootstraps on x86_64-unknown-linux-gnu and has no new regressions. 
  OK?


Andrew

	* gimple.c (gimple_replace_lhs): Move to tree-ssa.c and rename.
	(struct count_ptr_d, count_ptr_derefs, count_uses_and_derefs): Move to
	tree-ssa.c
	(create_gimple_tmp): Delete.
	(get_expr_type, build_assign, build_type_cast): Move to tree-ssa.c
	* tree-ssa.c (struct count_ptr_d, count_ptr_derefs,
	count_uses_and_derefs): Relocate from gimple.c.
	(get_expr_type): Relocate from gimple.c.
	(build_assign, build_type_cast): Change to only create ssanames.
	(ssa_replace_lhs): Renamed from gimple_replace_lhs in gimple.c.
	* gimple.h: Move prototypes to...
	* tree-ssa.h: Here.
	* tree-ssa-reassoc.c (repropagate_negates): Use ssa_replace_lhs.
	* tree-ssa-math-opts.c (execute_cse_reciprocals): Use ssa_replace_lhs.


Index: gimple.c
===
*** gimple.c	(revision 202720)
--- gimple.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 30,36 
  #include "basic-block.h"
  #include "gimple.h"
  #include "diagnostic.h"
! #include "tree-ssa.h"
  #include "value-prof.h"
  #include "flags.h"
  #include "alias.h"
--- 30,36 
  #include "basic-block.h"
  #include "gimple.h"
  #include "diagnostic.h"
! #include "tree-flow.h"
  #include "value-prof.h"
  #include "flags.h"
  #include "alias.h"
*** gimple_set_lhs (gimple stmt, tree lhs)
*** 2156,2194 
  gcc_unreachable();
  }
  
- /* Replace the LHS of STMT, an assignment, either a GIMPLE_ASSIGN or a
-GIMPLE_CALL, with NLHS, in preparation for modifying the RHS to an
-expression with a different value.
- 
-This will update any annotations (say debug bind stmts) referring
-to the original LHS, so that they use the RHS instead.  This is
-done even if NLHS and LHS are the same, for it is understood that
-the RHS will be modified afterwards, and NLHS will not be assigned
-an equivalent value.
- 
-Adjusting any non-annotation uses of the LHS, if needed, is a
-responsibility of the caller.
- 
-The effect of this call should be pretty much the same as that of
-inserting a copy of STMT before STMT, and then removing the
-original stmt, at which time gsi_remove() would have update
-annotations, but using this function saves all the inserting,
-copying and removing.  */
- 
- void
- gimple_replace_lhs (gimple stmt, tree nlhs)
- {
-   if (MAY_HAVE_DEBUG_STMTS)
- {
-   tree lhs = gimple_get_lhs (stmt);
- 
-   gcc_assert (SSA_NAME_DEF_STMT (lhs) == stmt);
- 
-   insert_debug_temp_for_var_def (NULL, lhs);
- }
- 
-   gimple_set_lhs (stmt, nlhs);
- }
  
  /* Return a deep copy of statement STMT.  All the operands from STMT
 are reallocated and copied using unshare_expr.  The DEF, USE, VDEF
--- 2156,2161 
*** gimple_get_alias_set (tree t)
*** 3739,3834 
  }
  
  
- /* Data structure used to count the number of dereferences to PTR
-inside an expression.  */
- struct count_ptr_d
- {
-   tree ptr;
-   unsigned num_stores;
-   unsigned n

[C++1y] [PATCH 1/4] Use translation-unit-global rather than parameter-list-local counter for generic type names to facilitate nested implicit function templates.

2013-09-19 Thread Adam Butcher
* parser.c (make_generic_type_name): Use static count rather than
parameter and ...
(add_implicit_template_parms): ... propagate interface change here.
---
 gcc/cp/parser.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 2cd60f0..7e9ade2 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -28898,9 +28898,10 @@ c_parse_file (void)
template parameter implied by `auto' or a concept identifier). */
 
 static tree
-make_generic_type_name (int i)
+make_generic_type_name ()
 {
   char buf[32];
+  static int i = 0;
   sprintf (buf, "__GenT%d", i);
   return get_identifier (buf);
 }
@@ -28915,14 +28916,14 @@ tree_type_is_auto_or_concept (const_tree t)
   return TREE_TYPE (t) && is_auto_or_concept (TREE_TYPE (t));
 }
 
-/* Add COUNT implicit template parameters gleaned from the generic
-   type parameters in PARAMETERS to the CURRENT_TEMPLATE_PARMS
-   (creating a new template parameter list if necessary).  Returns
-   PARAMETERS suitably rewritten to reference the newly created types
-   or ERROR_MARK_NODE on failure.  */
+/* Add EXPECT_COUNT implicit template parameters gleaned from the generic
+   type parameters in PARAMETERS to the CURRENT_TEMPLATE_PARMS (creating a new
+   template parameter list if necessary).  Returns PARAMETERS suitably 
rewritten
+   to reference the newly created types or ERROR_MARK_NODE on failure.  */
 
 tree
-add_implicit_template_parms (cp_parser *parser, size_t count, tree parameters)
+add_implicit_template_parms (cp_parser *parser, size_t expect_count,
+tree parameters)
 {
   gcc_assert (current_binding_level->kind == sk_function_parms);
 
@@ -28931,7 +28932,7 @@ add_implicit_template_parms (cp_parser *parser, size_t 
count, tree parameters)
   bool become_template =
 fn_parms_scope->level_chain->kind != sk_template_parms;
 
-  size_t synth_idx = 0;
+  size_t synth_count = 0;
 
   /* Roll back a scope level and either introduce a new template parameter list
  or update an existing one.  The function scope is added back after 
template
@@ -28973,7 +28974,7 @@ add_implicit_template_parms (cp_parser *parser, size_t 
count, tree parameters)
   ++processing_template_parmlist;
 }
 
-  for (tree p = parameters; p && synth_idx < count; p = TREE_CHAIN (p))
+  for (tree p = parameters; p && synth_count < expect_count; p = TREE_CHAIN 
(p))
 {
   tree generic_type_ptr
= find_type_usage (TREE_VALUE (p), tree_type_is_auto_or_concept);
@@ -28981,7 +28982,9 @@ add_implicit_template_parms (cp_parser *parser, size_t 
count, tree parameters)
   if (!generic_type_ptr)
continue;
 
-  tree synth_id = make_generic_type_name (synth_idx++);
+  ++synth_count;
+
+  tree synth_id = make_generic_type_name ();
   tree synth_tmpl_parm = finish_template_type_parm (class_type_node,
synth_id);
   tparms = process_template_parm (tparms, DECL_SOURCE_LOCATION (TREE_VALUE
@@ -29004,7 +29007,7 @@ add_implicit_template_parms (cp_parser *parser, size_t 
count, tree parameters)
cur_type = new_type;
 }
 
-  gcc_assert (synth_idx == count);
+  gcc_assert (synth_count == expect_count);
 
   push_binding_level (fn_parms_scope);
 
-- 
1.8.4



[PATCH] Amend attribute used documentation (PR other/58467)

2013-09-19 Thread Marek Polacek
__attribute__((used)) is meant to be used only on VAR_DECLs that are
TREE_STATIC, but the documentation does not say that.  Thus fixed.
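
A minimal sketch of the intended use, assuming GCC or a compatible compiler
(the attribute is an extension); the variable name and its contents are
hypothetical:

```cpp
#include <cassert>  // for the checks in the usage note

// A file-scope variable with static storage duration.  Nothing in this
// translation unit needs to reference it; the attribute asks the
// compiler to emit it in the object file anyway.
static const char build_tag[] __attribute__ ((used)) = "hypothetical-build-tag";
```

A typical reason to do this is to embed an identification string that is
later found by inspecting the object file rather than by the program itself.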

Ok?

2013-09-19  Marek Polacek  

PR other/58467
* doc/extend.texi: Document that attribute used is meant to be used
on variables with static storage duration.

--- gcc/doc/extend.texi.mp  2013-09-19 16:22:16.214492101 +0200
+++ gcc/doc/extend.texi 2013-09-19 16:35:19.874041331 +0200
@@ -4891,8 +4891,9 @@ to be possibly unused.  GCC does not pro
 variable.
 
 @item used
-This attribute, attached to a variable, means that the variable must be
-emitted even if it appears that the variable is not referenced.
+This attribute, attached to a variable with static storage duration, means
+that the variable must be emitted even if it appears that the variable is
+not referenced.
 
 When applied to a static data member of a C++ class template, the
 attribute also means that the member is instantiated if the

Marek


Merge from 4.8 branch to gccgo branch

2013-09-19 Thread Ian Lance Taylor
I merged revision 202754 from the GCC 4.8 branch to the gccgo branch.

Ian


Re: [wide-int] Fix LTO regression that I'd introduced

2013-09-19 Thread Kenneth Zadeck

this looks fine to me.
On 09/19/2013 02:56 PM, Richard Sandiford wrote:

It turns out that gcc20's version of binutils is too old for the LTO plugin,
so the tests I'd been running hadn't exercised it.  This patch fixes a
regression that Kenny pointed out.

The problem was that build_int_cst and build_int_cst_type were using
the signedness of the type to decide how the HWI should be extended,
whereas they're supposed to use sign extension regardless.

Tested on x86_64-linux-gnu, this time with trunk binutils.  OK for wide-int?

Thanks,
Richard


gcc/
* tree.h (wi::hwi): Delete.
* tree.c (build_int_cst, build_int_cst_type): Use wi::shwi.
(build_int_cstu): Use wi::uhwi.

Index: gcc/tree.h
===
--- gcc/tree.h  (revision 202746)
+++ gcc/tree.h  (working copy)
@@ -5206,8 +5206,6 @@
  
  namespace wi
  {
-  hwi_with_prec hwi (HOST_WIDE_INT, const_tree);
-
template 
bool fits_to_tree_p (const T &x, const_tree);
  
@@ -5216,12 +5214,6 @@
wide_int from_mpz (const_tree, mpz_t, bool);
  }
  
-inline wi::hwi_with_prec
-wi::hwi (HOST_WIDE_INT val, const_tree type)
-{
-  return hwi_with_prec (val, TYPE_PRECISION (type), TYPE_SIGN (type));
-}
-
  template 
  bool
  wi::fits_to_tree_p (const T &x, const_tree type)
Index: gcc/tree.c
===
--- gcc/tree.c  (revision 202746)
+++ gcc/tree.c  (working copy)
@@ -1056,13 +1056,13 @@
if (!type)
  type = integer_type_node;
  
-  return wide_int_to_tree (type, wi::hwi (low, type));
+  return wide_int_to_tree (type, wi::shwi (low, TYPE_PRECISION (type)));
  }
  
  tree
  build_int_cstu (tree type, unsigned HOST_WIDE_INT cst)
  {
-  return wide_int_to_tree (type, wi::hwi (cst, type));
+  return wide_int_to_tree (type, wi::uhwi (cst, TYPE_PRECISION (type)));
  }
  
  /* Create an INT_CST node with a LOW value sign extended to TYPE.  */
@@ -1071,7 +1071,7 @@
  build_int_cst_type (tree type, HOST_WIDE_INT low)
  {
gcc_assert (type);
-  return wide_int_to_tree (type, wi::hwi (low, type));
+  return wide_int_to_tree (type, wi::shwi (low, TYPE_PRECISION (type)));
  }
  
  /* Constructs tree in type TYPE from with value given by CST.  Signedness




[Patch, Fortran] PR57697/58469 - Fix another defined-assignment issue

2013-09-19 Thread Tobias Burnus

This patch fixes two issues:

a) It could happen that no code change took place. In that case, the 
code freed an expression that should still be used.


b) In my previous patch, I used a pointer assignment to the temporary of 
the LHS (after its allocation) [only if the LHS was initially 
unassigned]. That led to a problem with double deallocation (temporary 
+ LHS). In the previous test case, it didn't matter as the LHS wasn't 
freed (implicit SAVE in the main program). That's now solved by a 
NULL-pointer assignment.
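
The idea behind (b) can be sketched outside the compiler. The following is
only an analogy in C++, not the code gfortran generates: `Slot`, `release`
and `cleanup` are hypothetical names, with `tmp` standing in for the
temporary t1 and `lhs` for the left-hand side.

```cpp
#include <cstdlib>

// Analogy only: a slot that may own a heap allocation.
struct Slot { int *data; };

// Free the slot's allocation if it has one; report whether a free happened.
inline bool release(Slot &s)
{
  if (s.data == nullptr)
    return false;
  std::free(s.data);
  s.data = nullptr;
  return true;
}

// Clean up both slots.  If the temporary aliases the LHS, it is nullified
// first, mirroring "if (associated (t1, lhs)) t1 => NULL ()", so the shared
// allocation is freed exactly once.  Returns the number of frees performed.
inline int cleanup(Slot &tmp, Slot &lhs)
{
  if (tmp.data != nullptr && tmp.data == lhs.data)
    tmp.data = nullptr;

  int freed = 0;
  if (release(tmp))
    ++freed;
  if (release(lhs))
    ++freed;
  return freed;
}
```

Without the nullification step, both calls to `release` would free the same
pointer, which is the double deallocation described above.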


Finally, I corrected some indenting issues and removed unreachable code.

Built and regtested on x86-64-gnu-linux.
OK for the trunk and the 4.8 branch?

Tobias

PS: For the testcase of (a), I am not quite sure whether the intrinsic 
assignment should invoke the defined assignment. It currently doesn't 
for gfortran and crayftn. In any case, the invalid freeing is wrong.
2013-09-19  Tobias Burnus  

	PR fortran/57697
	PR fortran/58469
	* resolve.c (generate_component_assignments): Avoid double free
	at runtime and freeing a still-being used expr.

2013-09-19  Tobias Burnus  

	PR fortran/57697
	PR fortran/58469
	* gfortran.dg/defined_assignment_8.f90: New.
	* gfortran.dg/defined_assignment_9.f90: New.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index d33fe49..4befb9fd 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -9602,8 +9602,9 @@ generate_component_assignments (gfc_code **code, gfc_namespace *ns)
 		  && gfc_expr_attr ((*code)->expr1).allocatable)
 		{
 		  gfc_code *block;
-  gfc_expr *cond;
-  cond = gfc_get_expr ();
+		  gfc_expr *cond;
+
+		  cond = gfc_get_expr ();
 		  cond->ts.type = BT_LOGICAL;
 		  cond->ts.kind = gfc_default_logical_kind;
 		  cond->expr_type = EXPR_OP;
@@ -9621,7 +9622,7 @@ generate_component_assignments (gfc_code **code, gfc_namespace *ns)
 		  add_code_to_chain (&block, &head, &tail);
 		}
 	}
-	  }
+	}
   else if (this_code->op == EXEC_ASSIGN && !this_code->next)
 	{
 	  /* Don't add intrinsic assignments since they are already
@@ -9643,13 +9644,6 @@ generate_component_assignments (gfc_code **code, gfc_namespace *ns)
 	}
 }
 
-  /* This is probably not necessary.  */
-  if (this_code)
-{
-  gfc_free_statements (this_code);
-  this_code = NULL;
-}
-
   /* Put the temporary assignments at the top of the generated code.  */
   if (tmp_head && component_assignment_level == 1)
 {
@@ -9658,6 +9652,28 @@ generate_component_assignments (gfc_code **code, gfc_namespace *ns)
   tmp_head = tmp_tail = NULL;
 }
 
+  // If we did a pointer assignment - thus, we need to ensure that the LHS is
+  // not accidentally deallocated. Hence, nullify t1.
+  if (t1 && (*code)->expr1->symtree->n.sym->attr.allocatable
+  && gfc_expr_attr ((*code)->expr1).allocatable)
+{
+  gfc_code *block;
+  gfc_expr *cond;
+  gfc_expr *e;
+
+  e = gfc_lval_expr_from_sym ((*code)->expr1->symtree->n.sym);
+  cond = gfc_build_intrinsic_call (ns, GFC_ISYM_ASSOCIATED, "associated",
+   (*code)->loc, 2, gfc_copy_expr (t1), e);
+  block = gfc_get_code (EXEC_IF);
+  block->block = gfc_get_code (EXEC_IF);
+  block->block->expr1 = cond;
+  block->block->next = build_assignment (EXEC_POINTER_ASSIGN,
+	t1, gfc_get_null_expr (&(*code)->loc),
+	NULL, NULL, (*code)->loc);
+  gfc_append_code (tail, block);
+  tail = block;
+}
+
   /* Now attach the remaining code chain to the input code.  Step on
  to the end of the new code since resolution is complete.  */
   gcc_assert ((*code)->op == EXEC_ASSIGN);
@@ -9667,7 +9683,8 @@ generate_component_assignments (gfc_code **code, gfc_namespace *ns)
   gfc_free_expr ((*code)->expr1);
   gfc_free_expr ((*code)->expr2);
   **code = *head;
-  free (head);
+  if (head != tail)
+free (head);
   *code = tail;
 
   component_assignment_level--;
diff --git a/gcc/testsuite/gfortran.dg/defined_assignment_8.f90 b/gcc/testsuite/gfortran.dg/defined_assignment_8.f90
new file mode 100644
index 000..aab8085
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/defined_assignment_8.f90
@@ -0,0 +1,40 @@
+! { dg-do compile }
+!
+! PR fortran/58469
+!
+! Related: PR fortran/57697
+!
+! Was ICEing before
+!
+module m0
+  implicit none
+  type :: component
+integer :: i = 42
+  contains
+procedure :: assign0
+generic :: assignment(=) => assign0
+  end type
+  type, extends(component) :: comp2
+real :: aa
+  end type comp2
+  type parent
+type(comp2) :: foo
+  end type
+contains
+  elemental subroutine assign0(lhs,rhs)
+class(component), intent(INout) :: lhs
+class(component), intent(in) :: rhs
+lhs%i = 20
+  end subroutine
+end module
+
+program main
+  use m0
+  implicit none
+  type(parent), allocatable :: left
+  type(parent) :: right
+  print *, right%foo
+  left = right
+  print *, left%foo
+  if (left%foo%i /= 42) call abort()
+end
diff --git a/gcc/testsuite/gfortran.dg/

Re: Drop generic32 cost model

2013-09-19 Thread Jan Hubicka
> 
> I did some experiment with code alignment. I found
> -fno-align-loops -fno-align-functions -fno-align-jumps
> had no negative performance impacts on current
> Intel processors while reducing code sizes by 1-2%.
> Should we use
> 
> {&generic_cost, 0, 0, 0, 0, 0},
> 
> instead?

Good, revisiting alignment rules is on my TODO list.  AMD chips have aligned
decoding windows, so I think they may be more sensitive to jumps near the end of
a 32-byte boundary.  The current alignment settings for Bulldozer do not make much
sense to me, especially the alignment of functions to an 11-byte boundary :) So I
am running some experiments now.
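
To make the arithmetic concrete, here is a small sketch of the two
quantities involved: how much padding an alignment request costs, and
whether an instruction straddles a 32-byte window. The helper names are
hypothetical and this is plain address arithmetic, not a model of any
particular decoder.

```cpp
#include <cstdint>

// Bytes of padding needed to move `addr` up to the next multiple of `align`
// (align must be non-zero).
inline unsigned padding_for(std::uint64_t addr, unsigned align)
{
  return static_cast<unsigned>((align - addr % align) % align);
}

// True if the byte range [addr, addr + len) spans two `window`-byte blocks.
inline bool crosses_window(std::uint64_t addr, unsigned len,
                           unsigned window = 32)
{
  return len != 0 && addr / window != (addr + len - 1) / window;
}
```

For example, a 5-byte jump starting at offset 30 occupies bytes 30 to 34 and
so crosses the boundary at 32, while aligning it to 16 would cost 2 bytes of
padding.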

Honza
> 
> Thanks.
> 
> -- 
> H.J.


Re: [GOOGLE] Sets cgraph_node count during annotation

2013-09-19 Thread Xinliang David Li
Looks good.

David

On Thu, Sep 19, 2013 at 1:15 PM, Dehao Chen  wrote:
> This patch sets cgraph_node count during AutoFDO annotation, otherwise
> execute_fixup_cfg will clear all the BB counts.
>
> bootstrapped and passed regression test.
>
> OK for google-4_8 branch?
>
> Thanks,
> Dehao
>
> Index: gcc/auto-profile.c
> ===
> --- gcc/auto-profile.c (revision 202753)
> +++ gcc/auto-profile.c (working copy)
> @@ -1234,6 +1234,7 @@ afdo_annotate_cfg (void)
>
>if (s == NULL)
>  return;
> +  cgraph_get_node (current_function_decl)->count = s->head_count ();
>ENTRY_BLOCK_PTR->count = s->head_count ();
>gcov_type max_count = ENTRY_BLOCK_PTR->count;


[wide-int] Fix LTO regression that I'd introduced

2013-09-19 Thread Richard Sandiford
It turns out that gcc20's version of binutils is too old for the LTO plugin,
so the tests I'd been running hadn't exercised it.  This patch fixes a
regression that Kenny pointed out.

The problem was that build_int_cst and build_int_cst_type were using
the signedness of the type to decide how the HWI should be extended,
whereas they're supposed to use sign extension regardless.
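
A sketch of the distinction, assuming a 64-bit host integer widened to a
128-bit value. `Wide128` and both helpers are hypothetical and are not the
wide-int API; they only show how the upper bits differ between the two
kinds of extension.

```cpp
#include <cstdint>

// Hypothetical 128-bit value stored as two 64-bit halves.
struct Wide128 {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Sign extension: the upper half replicates the sign bit of the host value.
inline Wide128 extend_signed(std::int64_t v)
{
  return { static_cast<std::uint64_t>(v),
           v < 0 ? ~std::uint64_t{0} : std::uint64_t{0} };
}

// Zero extension: the upper half is always zero.
inline Wide128 extend_unsigned(std::uint64_t v)
{
  return { v, std::uint64_t{0} };
}
```

With these helpers, -1 becomes all ones under `extend_signed`, whereas the
same bit pattern passed to `extend_unsigned` leaves the upper half zero, so
choosing the extension from the type's signedness changes the result.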

Tested on x86_64-linux-gnu, this time with trunk binutils.  OK for wide-int?

Thanks,
Richard


gcc/
* tree.h (wi::hwi): Delete.
* tree.c (build_int_cst, build_int_cst_type): Use wi::shwi.
(build_int_cstu): Use wi::uhwi.

Index: gcc/tree.h
===
--- gcc/tree.h  (revision 202746)
+++ gcc/tree.h  (working copy)
@@ -5206,8 +5206,6 @@
 
 namespace wi
 {
-  hwi_with_prec hwi (HOST_WIDE_INT, const_tree);
-
   template 
   bool fits_to_tree_p (const T &x, const_tree);
 
@@ -5216,12 +5214,6 @@
   wide_int from_mpz (const_tree, mpz_t, bool);
 }
 
-inline wi::hwi_with_prec
-wi::hwi (HOST_WIDE_INT val, const_tree type)
-{
-  return hwi_with_prec (val, TYPE_PRECISION (type), TYPE_SIGN (type));
-}
-
 template 
 bool
 wi::fits_to_tree_p (const T &x, const_tree type)
Index: gcc/tree.c
===
--- gcc/tree.c  (revision 202746)
+++ gcc/tree.c  (working copy)
@@ -1056,13 +1056,13 @@
   if (!type)
 type = integer_type_node;
 
-  return wide_int_to_tree (type, wi::hwi (low, type));
+  return wide_int_to_tree (type, wi::shwi (low, TYPE_PRECISION (type)));
 }
 
 tree
 build_int_cstu (tree type, unsigned HOST_WIDE_INT cst)
 {
-  return wide_int_to_tree (type, wi::hwi (cst, type));
+  return wide_int_to_tree (type, wi::uhwi (cst, TYPE_PRECISION (type)));
 }
 
 /* Create an INT_CST node with a LOW value sign extended to TYPE.  */
@@ -1071,7 +1071,7 @@
 build_int_cst_type (tree type, HOST_WIDE_INT low)
 {
   gcc_assert (type);
-  return wide_int_to_tree (type, wi::hwi (low, type));
+  return wide_int_to_tree (type, wi::shwi (low, TYPE_PRECISION (type)));
 }
 
 /* Constructs tree in type TYPE from with value given by CST.  Signedness


[GOOGLE] Sets cgraph_node count during annotation

2013-09-19 Thread Dehao Chen
This patch sets cgraph_node count during AutoFDO annotation, otherwise
execute_fixup_cfg will clear all the BB counts.

bootstrapped and passed regression test.

OK for google-4_8 branch?

Thanks,
Dehao

Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 202753)
+++ gcc/auto-profile.c (working copy)
@@ -1234,6 +1234,7 @@ afdo_annotate_cfg (void)

   if (s == NULL)
 return;
+  cgraph_get_node (current_function_decl)->count = s->head_count ();
   ENTRY_BLOCK_PTR->count = s->head_count ();
   gcov_type max_count = ENTRY_BLOCK_PTR->count;


[Patch] match_results::format and regex_replace

2013-09-19 Thread Tim Shen
This patch completes the last two parts of the whole regex module, but
two problems remain:

1) regex_traits<>::transform_primary [28.7.7]. I don't know how to
implement it correctly. Can anyone give some advice?

2) Digraph support. Does it need to be done, since the standard doesn't
specify it?

Tested under -m64. I'll do a full test before committing.
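
For readers unfamiliar with the two interfaces named in the subject, here is
a minimal usage sketch against the standard C++11 <regex> API. The function
names and the "key=value" pattern are made up for illustration and are not
taken from the patch or its tests.

```cpp
#include <regex>
#include <string>

// Rewrite every "key=value" pair in `text` as "value:key".
inline std::string swap_pairs(const std::string &text)
{
  static const std::regex pair_re("(\\w+)=(\\w+)");
  // $1 and $2 refer to the first and second capture groups.
  return std::regex_replace(text, pair_re, "$2:$1");
}

// Describe only the first "key=value" pair, using match_results::format.
// Returns an empty string if there is no match.
inline std::string describe_first(const std::string &text)
{
  static const std::regex pair_re("(\\w+)=(\\w+)");
  std::smatch m;
  if (!std::regex_search(text, m, pair_re))
    return std::string();
  return m.format("key '$1' has value '$2'");
}
```

regex_replace applies the format string to every match in the input, while
match_results::format expands it for a single match that has already been
found.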

Thanks!


-- 
Tim Shen


a.patch
Description: Binary data


Re: [PATCH, SH4] Fix PR58475 insn swapb does not satisfy its constraints

2013-09-19 Thread Kaz Kojima
Christian Bruel  wrote:
> This patch fixes the aforementioned PR by refusing FPUL_REG to be an
> acceptable reg for any arithmetic_operand on TARGET_SH4. (This was a
> strange SH4 singularity with regards to the SH family).
> 
> The only impacted insn is movsf_ie used for reg-fpreg transfers. So the
> condition now mentions explicitly fpul_operand, allowing to simplify a
> bit the logic to match by removing the extra checks.
> 
> The testsuite survived (no regression) for 
> -m2,-m2a,-m2a-nofpu,-m2a-single,-m2a-single-only,-m3,-m3e,-m4,-m4-single,-m4-single-only,-m4a,-m4a-single,-m4a-single-only
> 
> No performance impact on a large number of benchmarks (CSIBE, EEMBC,
> Coremark, ...)
> 
> sh4-linux-elf survived a full Linux distribution rebuild
> 
> OK for trunk?

OK.

Regards,
kaz


Re: [PATCH] Amend attribute used documentation (PR other/58467)

2013-09-19 Thread Ian Lance Taylor
On Thu, Sep 19, 2013 at 7:36 AM, Marek Polacek  wrote:
> __attribute__((used)) is meant to be used only on VAR_DECLs that are
> TREE_STATIC, but the documentation does not say that.  Thus fixed.
>
> Ok?
>
> 2013-09-19  Marek Polacek  
>
> PR other/58467
> * doc/extend.texi: Document that attribute used is meant to be used
> on variables with static storage duration.

This is OK.  Thanks.

Ian


Re: [PATCH, SH4] Fix PR58475 insn swapb does not satisfy its constraints

2013-09-19 Thread Kaz Kojima
Christian Bruel  wrote:
>  (define_insn "*mov_reg_reg"
> -  [(set (match_operand:QIHI 0 "arith_reg_dest" "=r,m,*z")
> - (match_operand:QIHI 1 "register_operand" "r,*z,m"))]
> -  "TARGET_SH1 && !t_reg_operand (operands[1], VOIDmode)"
> +  [(set (match_operand:QIHI 0 "general_movdst_operand" "=r,m,*z")
> + (match_operand:QIHI 1 "general_movsrc_operand" "r,*z,m"))]
> +  "TARGET_SH1 && !t_reg_operand (operands[1], VOIDmode)
> +   && arith_reg_dest (operands[0], mode)
> +   && register_operand (operands[1], mode)"

I thought that predicates explicitly allowing mem only when reload
in progress are defensive because I guess there is no guarantee
that the condition part of the insn will never be used in spilling.
Re-factoring suggested by Oleg and Richard would be the right thing
to do, though it might be a bit invasive for 4.8.

Regards,
kaz


Re: [PATCH, SH4] Fix PR58475 insn swapb does not satisfy its constraints

2013-09-19 Thread Kaz Kojima
> Christian Bruel  wrote:
>>  (define_insn "*mov_reg_reg"
>> -  [(set (match_operand:QIHI 0 "arith_reg_dest" "=r,m,*z")
>> -(match_operand:QIHI 1 "register_operand" "r,*z,m"))]
>> -  "TARGET_SH1 && !t_reg_operand (operands[1], VOIDmode)"
>> +  [(set (match_operand:QIHI 0 "general_movdst_operand" "=r,m,*z")
>> +(match_operand:QIHI 1 "general_movsrc_operand" "r,*z,m"))]
>> +  "TARGET_SH1 && !t_reg_operand (operands[1], VOIDmode)
>> +   && arith_reg_dest (operands[0], mode)
>> +   && register_operand (operands[1], mode)"
> 
> I thought that predicates explicitly allowing mem only when reload
> in progress are defensive because I guess there is no guarantee
> that the condition part of the insn will never be used in spilling.
> Re-factoring suggested by Oleg and Richard would be the right thing
> to do, though it might be a bit invasive for 4.8.

Ugh, this should be for

Re: [PATCH, committed] SH: Fix PR58314 (unsatisfied constraints)
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01447.html

Sorry for the wrong reply.

Regards,
kaz


Re: [PATCH] Fix PR58417

2013-09-19 Thread David Edelsohn
This patch has caused 6 new libstdc++ failures on AIX. All look like:

/home/dje/src/src/libstdc++-v3/testsuite/ext/random/normal_mv_distribution/cons/default.cc:49:1:
internal compiler error: in build_polynomial_chrec, at
tree-chrec.h:148
 }

- David


Re: [GOOGLE] Patch to fix AutoFDO LIPO performance regression

2013-09-19 Thread Xinliang David Li
I did not catch this in the last review. The CFG cleanup should be
done before afdo_annotate_cfg and just after the call statement fixup.
Otherwise the CFG cleanup will zero out all profile counts. After
moving this up, you don't need the follow-up patch which sets the
cgraph node's count -- that should be done in
rebuild_cgraph_edges.

David

On Thu, Sep 19, 2013 at 10:10 AM, Dehao Chen  wrote:
> Thanks, patch updated:
>
> Index: gcc/Makefile.in
> ===
> --- gcc/Makefile.in (revision 202725)
> +++ gcc/Makefile.in (working copy)
> @@ -2960,7 +2960,7 @@ coverage.o : coverage.c $(GCOV_IO_H) $(CONFIG_H) $
>  auto-profile.o : auto-profile.c $(CONFIG_H) $(SYSTEM_H) $(FLAGS_H) \
> $(BASIC_BLOCK_H) $(DIAGNOSTIC_CORE_H) $(GCOV_IO_H) $(INPUT_H) profile.h \
> $(LANGHOOKS_H) $(OPTS_H) $(TREE_PASS_H) $(CGRAPH_H) $(GIMPLE_H) value-prof.h \
> -   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) $(AUTO_PROFILE_H)
> +   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) l-ipo.h $(AUTO_PROFILE_H)
>  cselib.o : cselib.c $(CONFIG_H) $(SYSTEM_H) coretypes.h dumpfile.h $(TM_H) $(RTL_H) \
> $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h $(RECOG_H) \
> $(EMIT_RTL_H) $(DIAGNOSTIC_CORE_H) $(FUNCTION_H) \
> Index: gcc/auto-profile.c
> ===
> --- gcc/auto-profile.c (revision 202725)
> +++ gcc/auto-profile.c (working copy)
> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "value-prof.h"
>  #include "coverage.h"
>  #include "params.h"
> +#include "l-ipo.h"
>  #include "auto-profile.h"
>
>  /* The following routines implements AutoFDO optimization.
> @@ -1290,6 +1291,13 @@ auto_profile (void)
>init_node_map ();
>profile_info = autofdo::afdo_profile_info;
>
> +  cgraph_pre_profiling_inlining_done = true;
> +  cgraph_process_module_scope_statics ();
> +  /* Now perform link to allow cross module inlining.  */
> +  cgraph_do_link ();
> +  varpool_do_link ();
> +  cgraph_unify_type_alias_sets ();
> +
>FOR_EACH_FUNCTION (node)
>  {
>if (!gimple_has_body_p (node->symbol.decl))
> @@ -1301,21 +1309,33 @@ auto_profile (void)
>
>push_cfun (DECL_STRUCT_FUNCTION (node->symbol.decl));
>
> +  if (L_IPO_COMP_MODE)
> +{
> +  basic_block bb;
> +  FOR_EACH_BB (bb)
> +{
> +  gimple_stmt_iterator gsi;
> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> + {
> +  gimple stmt = gsi_stmt (gsi);
> +  if (is_gimple_call (stmt))
> +lipo_fixup_cgraph_edge_call_target (stmt);
> + }
> +}
> + }
> +
>autofdo::afdo_annotate_cfg ();
>compute_function_frequency ();
>update_ssa (TODO_update_ssa);
>
> +  /* Local pure-const may imply need to fixup the cfg.  */
> +  if (execute_fixup_cfg () & TODO_cleanup_cfg)
> + cleanup_tree_cfg ();
> +
>current_function_decl = NULL;
>pop_cfun ();
>  }
>
> -  cgraph_pre_profiling_inlining_done = true;
> -  cgraph_process_module_scope_statics ();
> -  /* Now perform link to allow cross module inlining.  */
> -  cgraph_do_link ();
> -  varpool_do_link ();
> -  cgraph_unify_type_alias_sets ();
> -
>return TODO_rebuild_cgraph_edges;
>  }
>
> On Wed, Sep 18, 2013 at 5:16 PM, Xinliang David Li  wrote:
>> On Wed, Sep 18, 2013 at 4:51 PM, Dehao Chen  wrote:
>>> This patch fixes up the call graph edge targets during the AutoFDO pass, so
>>> that when rebuilding call graph edges, it can find the correct callee.
>>>
>>> Bootstrapped and passed regression test. Benchmark tests on-going.
>>>
>>> Ok for google-4_8 branch?
>>>
>>> Thanks,
>>> Dehao
>>>
>>> Index: gcc/Makefile.in
>>> ===
>>> --- gcc/Makefile.in (revision 202725)
>>> +++ gcc/Makefile.in (working copy)
>>> @@ -2960,7 +2960,7 @@ coverage.o : coverage.c $(GCOV_IO_H) $(CONFIG_H) $
>>>  auto-profile.o : auto-profile.c $(CONFIG_H) $(SYSTEM_H) $(FLAGS_H) \
>>> $(BASIC_BLOCK_H) $(DIAGNOSTIC_CORE_H) $(GCOV_IO_H) $(INPUT_H) profile.h \
>>> $(LANGHOOKS_H) $(OPTS_H) $(TREE_PASS_H) $(CGRAPH_H) $(GIMPLE_H) value-prof.h \
>>> -   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) $(AUTO_PROFILE_H)
>>> +   $(COVERAGE_H) coretypes.h $(TREE_H) $(PARAMS_H) l-ipo.h $(AUTO_PROFILE_H)
>>>  cselib.o : cselib.c $(CONFIG_H) $(SYSTEM_H) coretypes.h dumpfile.h $(TM_H) $(RTL_H) \
>>> $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h $(RECOG_H) \
>>> $(EMIT_RTL_H) $(DIAGNOSTIC_CORE_H) $(FUNCTION_H) \
>>> Index: gcc/auto-profile.c
>>> ===
>>> --- gcc/auto-profile.c (revision 202725)
>>> +++ gcc/auto-profile.c (working copy)
>>> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "value-prof.h"
>>>  #include "coverage.h"
>>>  #include "params.h"
>>> +#include "l-ipo.h"
>>>  #include "auto-profile.h"
>>>
>>>  /* The following r

Re: [Patch, Fortran] PR 58099: [4.8/4.9 Regression] [F03] over-zealous procedure-pointer error checking

2013-09-19 Thread Tobias Burnus

Hi Janus, hi all,

On August 15, 2013, Janus Weil wrote:

Hi Mikael,

Regtested on x86_64-unknown-linux-gnu. Ok for trunk?


This looks good to me but I let Tobias have the final word as he
expressed some concerns in the PR audit trail.


Sorry for the very belated reply. I played with the patch and it looks 
okay.


(Except for an issue with intrinsic elemental, but that's a separate 
bug. See PR for the details.)


Tobias