date:20130426

Re: [PATCH] Fold VEC_[LR]SHIFT_EXPR (PR tree-optimization/57051)

2013-04-26 Thread Richard Biener

On Thu, 25 Apr 2013, Jakub Jelinek wrote:

> Hi!
> 
> This patch adds folding of v>> and v<< with constant arguments, which helps to
> optimize the testcase from the PR back into a constant store after the
> vectorized loop is unrolled.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2013-04-25  Jakub Jelinek  
> 
>   PR tree-optimization/57051
>   * fold-const.c (const_binop): Handle VEC_LSHIFT_EXPR
>   and VEC_RSHIFT_EXPR if shift count is a multiple of element
>   bitsize.
> 
> --- gcc/fold-const.c.jj   2013-04-12 10:16:25.0 +0200
> +++ gcc/fold-const.c  2013-04-24 12:37:11.789122719 +0200
> @@ -1380,17 +1380,42 @@ const_binop (enum tree_code code, tree a
>        int count = TYPE_VECTOR_SUBPARTS (type), i;
>        tree *elts = XALLOCAVEC (tree, count);
>  
> -      for (i = 0; i < count; i++)
> +      if (code == VEC_LSHIFT_EXPR
> +          || code == VEC_RSHIFT_EXPR)
>          {
> -          tree elem1 = VECTOR_CST_ELT (arg1, i);
> -
> -          elts[i] = const_binop (code, elem1, arg2);
> +          if (!host_integerp (arg2, 1))
> +            return NULL_TREE;
>  
> -          /* It is possible that const_binop cannot handle the given
> -             code and return NULL_TREE */
> -          if (elts[i] == NULL_TREE)
> +          unsigned HOST_WIDE_INT shiftc = tree_low_cst (arg2, 1);
> +          unsigned HOST_WIDE_INT outerc = tree_low_cst (TYPE_SIZE (type), 1);
> +          unsigned HOST_WIDE_INT innerc
> +            = tree_low_cst (TYPE_SIZE (TREE_TYPE (type)), 1);
> +          if (shiftc >= outerc || (shiftc % innerc) != 0)
>              return NULL_TREE;
> +          int offset = shiftc / innerc;
> +          if (code == VEC_LSHIFT_EXPR)
> +            offset = -offset;
> +          tree zero = build_zero_cst (TREE_TYPE (type));
> +          for (i = 0; i < count; i++)
> +            {
> +              if (i + offset < 0 || i + offset >= count)
> +                elts[i] = zero;
> +              else
> +                elts[i] = VECTOR_CST_ELT (arg1, i + offset);
> +            }
>          }
> +      else
> +        for (i = 0; i < count; i++)
> +          {
> +            tree elem1 = VECTOR_CST_ELT (arg1, i);
> +
> +            elts[i] = const_binop (code, elem1, arg2);
> +
> +            /* It is possible that const_binop cannot handle the given
> +               code and return NULL_TREE */
> +            if (elts[i] == NULL_TREE)
> +              return NULL_TREE;
> +          }
>  
>        return build_vector (type, elts);
>      }
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
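
Note on the patch in this message: the whole-element shift it folds can be
modelled outside GCC. The following is only a sketch under stated
assumptions: fold_vec_shift and its parameters are hypothetical names, plain
int arrays stand in for VECTOR_CST elements, and a return value of 0 stands
in for the NULL_TREE returns. It follows the index arithmetic of the patch
as written and says nothing about target element ordering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-alone model of the folding in the patch: shift a
   constant vector of COUNT elements, each ELEM_BITS wide, by SHIFT_BITS.
   Returns 1 and fills OUT on success, 0 when the shift is not folded.  */
int
fold_vec_shift (const int *in, int *out, int count,
                unsigned shift_bits, unsigned elem_bits, int is_lshift)
{
  assert (in != NULL && out != NULL && count > 0 && elem_bits > 0);

  unsigned total_bits = (unsigned) count * elem_bits;

  /* Only shifts by a whole number of elements, smaller than the whole
     vector, are folded.  */
  if (shift_bits >= total_bits || shift_bits % elem_bits != 0)
    return 0;

  int offset = (int) (shift_bits / elem_bits);
  if (is_lshift)
    offset = -offset;

  for (int i = 0; i < count; i++)
    {
      int src = i + offset;
      /* Elements shifted in from outside the vector become zero.  */
      out[i] = (src < 0 || src >= count) ? 0 : in[src];
    }
  return 1;
}
```

In this model, shifting {1, 2, 3, 4} right by one 32-bit element gives
{2, 3, 4, 0}, and a shift by 8 bits is rejected because it is not a multiple
of the element size.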

Re: [PATCH] Fix bootstrap with go (uninit warning with ab edges)

2013-04-26 Thread Richard Biener

On Fri, 26 Apr 2013, Jakub Jelinek wrote:

> Hi!
> 
> Bootstrap currently fails in libgo, there are false positive warnings
> that ({anonymous}) is uninitialized in multiple places.
> 
> The testcase below reproduces this issue too.  The problem is
> that the ab edges to setjmp call are added conservatively, thus they can be
> added even from calls before the setjmp call, that are never executed after
> the setjmp, and in their bb's some variables might not be initialized yet,
> even if their initialization dominates the setjmp call.
> 
> As discussed on IRC, the following patch lets us ignore the SSA_NAMEs on such
> abnormal edges.  Perhaps later on we could try to do something for the
> common case where there is exactly one setjmp call or exactly one nonlocal
> goto label in a function; we could then avoid creating abnormal edges that
> aren't needed (calls that don't appear on any path from the single setjmp
> call to exit don't need to have an ab edge to it).  Bootstrapped/regtested on
> x86_64-linux and i686-linux (including go), ok for trunk?

Ok.

Thanks,
Richard.

> 2013-04-25  Jakub Jelinek  
> 
>   * tree-ssa-uninit.c (compute_uninit_opnds_pos): In functions
>   with nonlocal goto receivers or returns twice calls, ignore
>   uninitialized values from abnormal edges to nl goto receiver
>   or returns twice call.
> 
>   * gcc.dg/setjmp-5.c: New test.
> 
> --- gcc/tree-ssa-uninit.c.jj  2013-03-04 10:37:48.0 +0100
> +++ gcc/tree-ssa-uninit.c 2013-04-25 17:52:55.215166853 +0200
> @@ -151,7 +151,21 @@ compute_uninit_opnds_pos (gimple phi)
>        if (TREE_CODE (op) == SSA_NAME
>            && ssa_undefined_value_p (op)
>            && !can_skip_redundant_opnd (op, phi))
> -        MASK_SET_BIT (uninit_opnds, i);
> +        {
> +          /* Ignore SSA_NAMEs on abnormal edges to setjmp
> +             or nonlocal goto receiver.  */
> +          if (cfun->has_nonlocal_label || cfun->calls_setjmp)
> +            {
> +              edge e = gimple_phi_arg_edge (phi, i);
> +              if (e->flags & EDGE_ABNORMAL)
> +                {
> +                  gimple last = last_stmt (e->src);
> +                  if (last && stmt_can_make_abnormal_goto (last))
> +                    continue;
> +                }
> +            }
> +          MASK_SET_BIT (uninit_opnds, i);
> +        }
>      }
>    return uninit_opnds;
>  }
> --- gcc/testsuite/gcc.dg/setjmp-5.c.jj   2013-04-25 17:54:49.679559650 +0200
> +++ gcc/testsuite/gcc.dg/setjmp-5.c   2013-04-25 17:55:08.084460447 +0200
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -Wall" } */
> +
> +#include <setjmp.h>
> +
> +void bar (int);
> +
> +jmp_buf buf;
> +int v;
> +
> +void
> +foo (void)
> +{
> +  int i;
> +  bar (0);
> +  bar (1);
> +  i = 5;
> +  int j = setjmp (buf);
> +  if (j == 0)
> +    bar (2);
> +  v = i; /* { dg-bogus "may be used uninitialized in this function" } */
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
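
Note on the patch in this message: the filtering it adds to
compute_uninit_opnds_pos can be illustrated with a simplified model. This is
a sketch, not GCC code: struct phi_arg_model and uninit_operand_mask are
hypothetical names, the three flags stand in for ssa_undefined_value_p, the
EDGE_ABNORMAL test and stmt_can_make_abnormal_goto, and the
can_skip_redundant_opnd check is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified description of one PHI argument.  */
struct phi_arg_model
{
  int is_undefined;              /* models ssa_undefined_value_p (op) */
  int edge_is_abnormal;          /* models e->flags & EDGE_ABNORMAL */
  int src_ends_in_abnormal_goto; /* models stmt_can_make_abnormal_goto */
};

/* Return a bit mask with bit I set when argument I should be reported as
   possibly uninitialized.  Undefined values arriving over an abnormal edge
   from a statement that can make an abnormal goto are ignored, but only in
   functions that call setjmp or have a nonlocal label.  */
unsigned
uninit_operand_mask (const struct phi_arg_model *args, size_t n,
                     int fn_has_setjmp_or_nonlocal_label)
{
  assert ((args != NULL || n == 0) && n <= 8 * sizeof (unsigned));

  unsigned mask = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (!args[i].is_undefined)
        continue;
      if (fn_has_setjmp_or_nonlocal_label
          && args[i].edge_is_abnormal
          && args[i].src_ends_in_abnormal_goto)
        continue;
      mask |= 1u << i;
    }
  return mask;
}
```

The second argument of a PHI such as { undefined via normal edge, undefined
via abnormal edge from a call, defined } is dropped from the mask only when
the function flag is set, which is the behaviour the testcase relies on.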

Re: [PATCH] Preserve loops from CFG build until after RTL loop opts

2013-04-26 Thread Richard Biener

On Thu, 25 Apr 2013, Jakub Jelinek wrote:

> On Thu, Apr 25, 2013 at 04:19:20PM +0200, Richard Biener wrote:
> > 
> > This is the patch that I consider final as a first step (to avoid
> > changing too much at once).  I've analyzed the few failures
> > and compared to the previous patch changed the tree-ssa-tailmerge.c
> > part to deal with merging of loop latch and loop preheader (even
> > if that's a really bad idea) to not regress gcc.dg/pr50763.c.
> > Any suggestion on how to improve that part welcome.
> > I've had to change a few testcases, mostly in parts that are not
> > really related to what they check (also reverting to a previous
> > testcase state).  One remaining failure is
> > 
> > FAIL: gcc.dg/pr53265.c  (test for bogus messages, line 147)
> > 
> > which I don't want to deal with in this patch.  I can either
> > followup or prepare the discussed fix of adding a copyprop
> > pass before cunrolli.  Another possibility is to XFAIL the above.
> 
> As that testcase was derived from real-world code (gcc itself, if I remember
> well two places actually hitting it at that point), I'd strongly prefer
> the former, rather than XFAIL, because people would be pretty much annoyed
> by the false positive warning, scratching heads where they invoke undefined
> behavior when they actually don't.  But sure, it can wait for a follow-up
> patch.
> 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages
> > including Ada (including 32bit multilibs).
> > 
> > Ok for trunk?
> 
> Yes, thanks.

A bootstrap & test together with the additional copyprop pass did not
show any issues so the following is what I committed in addition
to the patch.

Richard.

* passes.c (init_optimization_passes): Schedule a copy-propagation
pass before complete unrolling of inner loops.

Index: gcc/passes.c
===
--- gcc/passes.c (revision 198332)
+++ gcc/passes.c (working copy)
@@ -1397,6 +1397,7 @@ init_optimization_passes (void)
 They ensure memory accesses are not indirect wherever possible.  */
   NEXT_PASS (pass_strip_predict_hints);
   NEXT_PASS (pass_rename_ssa_copies);
+  NEXT_PASS (pass_copy_prop);
   NEXT_PASS (pass_complete_unrolli);
   NEXT_PASS (pass_ccp);
   /* After CCP we rewrite no longer addressed locals into SSA

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Paolo Carlini


On 04/24/2013 08:20 PM, Jason Merrill wrote:
> On 04/24/2013 02:02 PM, Paolo Carlini wrote:
>> +#if __cplusplus < 201300L
>
> Don't test for this value.  Use <= 201103L instead.
>
> OK with that change.

Thanks Jason.

Today I'm noticing an issue with the underlying <stdio.h> from glibc,
which makes our <cstdio> unusable. It does have:


#if !defined __USE_ISOC11 \
|| (defined __cplusplus && __cplusplus <= 201103L)
/* Get a newline-terminated string from stdin, removing the newline.
   DO NOT USE THIS FUNCTION!!  There is no limit on how much it will read.

   The function has been officially removed in ISO C11.  This opportunity
   is used to also remove it from the GNU feature list.  It is now only
   available when explicitly using an old ISO C, Unix, or POSIX standard.
   GCC defines _GNU_SOURCE when building C++ code and the function is still
   in C++11, so it is also available for C++.

   This function is a possible cancellation point and therefore not
   marked with __THROW.  */
extern char *gets (char *__s) __wur __attribute_deprecated__;

I don't think the header should check __cplusplus <= 201103L like that. 
Jakub, Jason, what do you think?


Thanks,
Paolo.
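
Note on the glibc comment quoted above ("There is no limit on how much it
will read"): code that loses access to gets typically moves to a bounded
read. The following is a minimal sketch assuming standard C only; read_line
is a hypothetical helper, not something from glibc or libstdc++.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from FP into BUF, which holds SIZE
   bytes, and drop the trailing newline if there is one.  Returns BUF, or
   NULL at end of file or on a read error.  Unlike gets, it never writes
   more than SIZE bytes.  */
char *
read_line (char *buf, size_t size, FILE *fp)
{
  assert (buf != NULL && fp != NULL && size > 0 && size <= INT_MAX);

  if (fgets (buf, (int) size, fp) == NULL)
    return NULL;

  /* strcspn returns the index of the first newline, or of the
     terminating null byte when the line had none.  */
  buf[strcspn (buf, "\n")] = '\0';
  return buf;
}
```

A line longer than SIZE - 1 characters is returned in pieces over several
calls rather than overflowing the buffer.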

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Jakub Jelinek

On Fri, Apr 26, 2013 at 12:32:40PM +0200, Paolo Carlini wrote:
> On 04/24/2013 08:20 PM, Jason Merrill wrote:
> >On 04/24/2013 02:02 PM, Paolo Carlini wrote:
> >>+#if __cplusplus < 201300L
> >
> >Don't test for this value.  Use <= 201103L instead.
> >
> >OK with that change.
> Thanks Jason.
> 
> Today I'm noticing an issue with the underlying <stdio.h> from glibc,
> which makes our <cstdio> unusable. It does have:
> 
> #if !defined __USE_ISOC11 \
>     || (defined __cplusplus && __cplusplus <= 201103L)
> /* Get a newline-terminated string from stdin, removing the newline.
>    DO NOT USE THIS FUNCTION!!  There is no limit on how much it will read.
> 
>    The function has been officially removed in ISO C11.  This opportunity
>    is used to also remove it from the GNU feature list.  It is now only
>    available when explicitly using an old ISO C, Unix, or POSIX standard.
>    GCC defines _GNU_SOURCE when building C++ code and the function is still
>    in C++11, so it is also available for C++.
> 
>    This function is a possible cancellation point and therefore not
>    marked with __THROW.  */
> extern char *gets (char *__s) __wur __attribute_deprecated__;
> 
> I don't think the header should check __cplusplus <= 201103L like
> that. Jakub, Jason, what do you think?

I guess Ulrich added this with the expectation that gets will be also
removed from C++1y.  Has there been any discussions regarding that in the WG
already?

Jakub

Re: [patch] Hash table changes from cxx-conversion branch - config part

2013-04-26 Thread Rainer Orth

Lawrence,

> * config/sol2.c'solaris_comdat_htab
>
> Fold comdat_hash and comdat_eq into new struct comdat_entry_hasher.
[...]
> Index: gcc/ChangeLog
>
> 2013-04-24  Lawrence Crowl  
>   * config/sol2.c (solaris_comdat_htab):
>   Change type to hash_table.  Update dependent calls and types.
>
>   * config/t-sol2: Update for above.

Just a nit: your ChangeLog entries are formatted strangely: there should be
no newline immediately after the colons, and it is better to group e.g. all
*sol2* entries together without a blank line in between.

Unfortunately, your mail client mangled the patch, so I had to apply it
manually.  Once that was done, it survived an i386-pc-solaris2.10
bootstrap without regressions, so the Solaris parts are ok.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Paolo Carlini


Hi,

On 04/26/2013 12:42 PM, Jakub Jelinek wrote:
> On Fri, Apr 26, 2013 at 12:32:40PM +0200, Paolo Carlini wrote:
>> On 04/24/2013 08:20 PM, Jason Merrill wrote:
>>> On 04/24/2013 02:02 PM, Paolo Carlini wrote:
>>>> +#if __cplusplus < 201300L
>>>
>>> Don't test for this value.  Use <= 201103L instead.
>>>
>>> OK with that change.
>>
>> Thanks Jason.
>>
>> Today I'm noticing an issue with the underlying <stdio.h> from glibc,
>> which makes our <cstdio> unusable. It does have:
>>
>> #if !defined __USE_ISOC11 \
>>     || (defined __cplusplus && __cplusplus <= 201103L)
>> /* Get a newline-terminated string from stdin, removing the newline.
>>    DO NOT USE THIS FUNCTION!!  There is no limit on how much it will read.
>>
>>    The function has been officially removed in ISO C11.  This opportunity
>>    is used to also remove it from the GNU feature list.  It is now only
>>    available when explicitly using an old ISO C, Unix, or POSIX standard.
>>    GCC defines _GNU_SOURCE when building C++ code and the function is still
>>    in C++11, so it is also available for C++.
>>
>>    This function is a possible cancellation point and therefore not
>>    marked with __THROW.  */
>> extern char *gets (char *__s) __wur __attribute_deprecated__;
>>
>> I don't think the header should check __cplusplus <= 201103L like
>> that. Jakub, Jason, what do you think?
>
> I guess Ulrich added this with the expectation that gets will be also
> removed from C++1y.  Has there been any discussions regarding that in the WG
> already?

Not to my best knowledge. I'm adding in CC Jonathan and Daniel too.

We could certainly mirror in the C++ library what glibc does, but at 
this point it seems premature to me to assume that C++1y (C++14, in 
practice) will be stricter than C++11 as regards gets.


Paolo.

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Daniel Krügler

2013/4/26 Paolo Carlini :
> Hi,
>
> On 04/26/2013 12:42 PM, Jakub Jelinek wrote:
>>
>> On Fri, Apr 26, 2013 at 12:32:40PM +0200, Paolo Carlini wrote:
>>>
>>> On 04/24/2013 08:20 PM, Jason Merrill wrote:

>>>> On 04/24/2013 02:02 PM, Paolo Carlini wrote:
>>>>>
>>>>> +#if __cplusplus < 201300L
>>>>
>>>> Don't test for this value.  Use <= 201103L instead.
>>>>
>>>> OK with that change.
>>>
>>> Thanks Jason.
>>>
>>> Today I'm noticing an issue with the underlying <stdio.h> from glibc,
>>> which makes our <cstdio> unusable. It does have:
>>>
>>> #if !defined __USE_ISOC11 \
>>>  || (defined __cplusplus && __cplusplus <= 201103L)
>>> /* Get a newline-terminated string from stdin, removing the newline.
>>> DO NOT USE THIS FUNCTION!!  There is no limit on how much it will
>>> read.
>>>
>>> The function has been officially removed in ISO C11.  This
>>> opportunity
>>> is used to also remove it from the GNU feature list.  It is now only
>>> available when explicitly using an old ISO C, Unix, or POSIX
>>> standard.
>>> GCC defines _GNU_SOURCE when building C++ code and the function is
>>> still
>>> in C++11, so it is also available for C++.
>>>
>>> This function is a possible cancellation point and therefore not
>>> marked with __THROW.  */
>>> extern char *gets (char *__s) __wur __attribute_deprecated__;
>>>
>>> I don't think the header should check __cplusplus <= 201103L like
>>> that. Jakub, Jason, what do you think?
>>
>> I guess Ulrich added this with the expectation that gets will be also
>> removed from C++1y.  Has there been any discussions regarding that in the
>> WG
>> already?
>
> Not to my best knowledge. I'm adding in CC Jonathan and Daniel too.
>
> We could certainly mirror in the C++ library what glibc does, but at this
> point it seems premature to me to assume that C++1y (C++14, in practice)
> will be stricter than C++11 as regards gets.

Jonathan recently submitted an LWG issue for this (not yet part of the
available list). I'm in the process to add the new issue within the
following days. He's essentially suggesting to remove gets() from
C++14.

- Daniel

> Paolo.

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Daniel Krügler

2013/4/26 Daniel Krügler :
> 2013/4/26 Paolo Carlini :
>> Hi,
>>
>> On 04/26/2013 12:42 PM, Jakub Jelinek wrote:
>>>
>>> On Fri, Apr 26, 2013 at 12:32:40PM +0200, Paolo Carlini wrote:

>>>> On 04/24/2013 08:20 PM, Jason Merrill wrote:
>>>>>
>>>>> On 04/24/2013 02:02 PM, Paolo Carlini wrote:
>>>>>>
>>>>>> +#if __cplusplus < 201300L
>>>>>
>>>>> Don't test for this value.  Use <= 201103L instead.
>>>>>
>>>>> OK with that change.
>>>>
>>>> Thanks Jason.
>>>>
>>>> Today I'm noticing an issue with the underlying <stdio.h> from glibc,
>>>> which makes our <cstdio> unusable. It does have:
>>>>
>>>> #if !defined __USE_ISOC11 \
>>>>  || (defined __cplusplus && __cplusplus <= 201103L)
>>>> /* Get a newline-terminated string from stdin, removing the newline.
>>>> DO NOT USE THIS FUNCTION!!  There is no limit on how much it will
>>>> read.
>>>>
>>>> The function has been officially removed in ISO C11.  This
>>>> opportunity
>>>> is used to also remove it from the GNU feature list.  It is now only
>>>> available when explicitly using an old ISO C, Unix, or POSIX
>>>> standard.
>>>> GCC defines _GNU_SOURCE when building C++ code and the function is
>>>> still
>>>> in C++11, so it is also available for C++.
>>>>
>>>> This function is a possible cancellation point and therefore not
>>>> marked with __THROW.  */
>>>> extern char *gets (char *__s) __wur __attribute_deprecated__;
>>>>
>>>> I don't think the header should check __cplusplus <= 201103L like
>>>> that. Jakub, Jason, what do you think?
>>>
>>> I guess Ulrich added this with the expectation that gets will be also
>>> removed from C++1y.  Has there been any discussions regarding that in the
>>> WG
>>> already?
>>
>> Not to my best knowledge. I'm adding in CC Jonathan and Daniel too.
>>
>> We could certainly mirror in the C++ library what glibc does, but at this
>> point it seems premature to me to assume that C++1y (C++14, in practice)
>> will be stricter than C++11 as regards gets.
>
> Jonathan recently submitted an LWG issue for this (not yet part of the
> available list). I'm in the process to add the new issue within the
> following days. He's essentially suggesting to remove gets() from
> C++14.

To clarify this: The C++ Standard currently refers to a gets() function
that does not exist anymore in the reference C99. So, its removal
looks more than reasonable to me.

> - Daniel
>
>> Paolo.

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Paolo Carlini


Hi,

On 04/26/2013 12:57 PM, Daniel Krügler wrote:
> Jonathan recently submitted an LWG issue for this (not yet part of the
> available list). I'm in the process to add the new issue within the
> following days. He's essentially suggesting to remove gets() from C++14.

Thanks Daniel. Then, it seems safe to do in v3 the same as <stdio.h>, to
enable testing -std=c++1y. Do you have already a DR # which I can write
in a comment?


Thanks!

Paolo.

//
Index: include/c_global/cstdio
===
--- include/c_global/cstdio (revision 198333)
+++ include/c_global/cstdio (working copy)
@@ -69,7 +69,9 @@
 #undef ftell
 #undef fwrite
 #undef getc
-#undef gets
+#if __cplusplus <= 201103L
+# undef gets
+#endif
 #undef perror
 #undef printf
 #undef putc
@@ -116,7 +118,9 @@
   using ::fwrite;
   using ::getc;
   using ::getchar;
+#if __cplusplus <= 201103L
   using ::gets;
+#endif
   using ::perror;
   using ::printf;
   using ::putc;
Index: include/c_std/cstdio
===
--- include/c_std/cstdio (revision 198333)
+++ include/c_std/cstdio (working copy)
@@ -70,7 +70,9 @@
 #undef fwrite
 #undef getc
 #undef getchar
-#undef gets
+#if __cplusplus <= 201103L
+# undef gets
+#endif
 #undef perror
 #undef printf
 #undef putc
@@ -117,7 +119,9 @@
   using ::fwrite;
   using ::getc;
   using ::getchar;
+#if __cplusplus <= 201103L
   using ::gets;
+#endif
   using ::perror;
   using ::printf;
   using ::putc;
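
Note on the patch above: the interaction between the glibc guard and the new
cstdio guard can be written out as two predicates. This is only a model of
the preprocessor conditions quoted in this thread, with hypothetical
function names; a cplusplus value of 0 stands for compiling as C, and
whether __USE_ISOC11 is defined in a given configuration is an input here,
not something the sketch determines.

```c
#include <assert.h>

/* Models the glibc condition
     !defined __USE_ISOC11
     || (defined __cplusplus && __cplusplus <= 201103L)
   under which the header declares gets.  */
int
glibc_declares_gets (int use_isoc11, long cplusplus)
{
  assert (cplusplus >= 0);
  return !use_isoc11 || (cplusplus != 0 && cplusplus <= 201103L);
}

/* Models the condition the patch puts around "#undef gets" and
   "using ::gets;" in cstdio.  */
int
cstdio_uses_gets (long cplusplus)
{
  return cplusplus <= 201103L;
}
```

With __USE_ISOC11 defined and __cplusplus at 201300L the header no longer
declares gets, and with the patch cstdio no longer refers to it; for
201103L both predicates still hold.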

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Daniel Krügler

2013/4/26 Paolo Carlini :
> Hi,
>
>
> On 04/26/2013 12:57 PM, Daniel Krügler wrote:
>>
>> Jonathan recently submitted an LWG issue for this (not yet part of the
>> available list). I'm in the process to add the new issue within the
>> following days. He's essentially suggesting to remove gets() from C++14.
>
> Thanks Daniel. Then, it seems safe to do in v3 the same as <stdio.h>, to
> enable testing -std=c++1y. Do you have already a DR # which I can write in a
> comment?

I have no number yet, because there are several issues in the
pipeline. I'll  send it to you once I have it.

- Daniel

> Thanks!
>
> Paolo.
>
> //

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Paolo Carlini


On 04/26/2013 01:02 PM, Daniel Krügler wrote:
> I have no number yet, because there are several issues in the
> pipeline. I'll send it to you once I have it.

Ok, thanks!

Paolo.

Re: vtables patch 1/3: allow empty array initializations

2013-04-26 Thread Bernd Schmidt

On 04/24/2013 09:14 PM, DJ Delorie wrote:
>> 24 bits stored as three bytes, or four? How does this affect vtable
>> layout? I would have expected the C++ frontend and libsupc++ to
>> currently be inconsistent with each other given such a setup.
> 
> In memory, four, I think.  The address registers really are three
> bytes though.  They're PSImode and gcc doesn't really have a good way
> of using any specified PSImode precision.

I took a look myself to find out the answer to the second part of my
question (I'll post a few patches to get m32c working on trunk later).
It turns out that as I expected, C++ support is somewhat broken on this
target. My vtables patch series fixes the following execution tests:

+PASS: g++.dg/abi/local1.C -std=c++98 execution test
+PASS: g++.dg/abi/local1.C -std=c++11 execution test
+PASS: g++.dg/ext/attr-alias-2.C -std=c++98 execution test
+PASS: g++.dg/ext/attr-alias-2.C -std=c++11 execution test
+PASS: g++.dg/lto/pr42987 cp_lto_pr42987_0.o-cp_lto_pr42987_1.o execute -flto -g
+PASS: g++.dg/lto/pr42987 cp_lto_pr42987_0.o-cp_lto_pr42987_1.o execute -flto -flto-partition=none -g
+PASS: g++.old-deja/g++.mike/thunk2.C -std=c++98 execution test
+PASS: g++.old-deja/g++.mike/thunk2.C -std=c++11 execution test
+PASS: g++.old-deja/g++.other/rtti3.C -std=gnu++98 execution test
+PASS: g++.old-deja/g++.other/rtti3.C -std=gnu++11 execution test
+PASS: g++.old-deja/g++.other/rtti4.C -std=gnu++98 execution test
+PASS: g++.old-deja/g++.other/rtti4.C -std=gnu++11 execution test

The downside is that it would be an ABI change for -mcpu=m32cm in terms
of generated code (to make it match what libsupc++ expects). Would you
consider that acceptable for this target, considering it is a bug fix
and the new layout is more space efficient?

Bernd

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Daniel Krügler

2013/4/26 Daniel Krügler :
> 2013/4/26 Daniel Krügler :
>> 2013/4/26 Paolo Carlini :
>>> Hi,
>>>
>>> On 04/26/2013 12:42 PM, Jakub Jelinek wrote:

>>>> On Fri, Apr 26, 2013 at 12:32:40PM +0200, Paolo Carlini wrote:
>>>>>
>>>>> On 04/24/2013 08:20 PM, Jason Merrill wrote:
>>>>>>
>>>>>> On 04/24/2013 02:02 PM, Paolo Carlini wrote:
>>>>>>>
>>>>>>> +#if __cplusplus < 201300L
>>>>>>
>>>>>> Don't test for this value.  Use <= 201103L instead.
>>>>>>
>>>>>> OK with that change.
>>>>>
>>>>> Thanks Jason.
>>>>>
>>>>> Today I'm noticing an issue with the underlying <stdio.h> from glibc,
>>>>> which makes our <cstdio> unusable. It does have:
>>>>>
>>>>> #if !defined __USE_ISOC11 \
>>>>>  || (defined __cplusplus && __cplusplus <= 201103L)
>>>>> /* Get a newline-terminated string from stdin, removing the newline.
>>>>> DO NOT USE THIS FUNCTION!!  There is no limit on how much it will
>>>>> read.
>>>>>
>>>>> The function has been officially removed in ISO C11.  This
>>>>> opportunity
>>>>> is used to also remove it from the GNU feature list.  It is now only
>>>>> available when explicitly using an old ISO C, Unix, or POSIX
>>>>> standard.
>>>>> GCC defines _GNU_SOURCE when building C++ code and the function is
>>>>> still
>>>>> in C++11, so it is also available for C++.
>>>>>
>>>>> This function is a possible cancellation point and therefore not
>>>>> marked with __THROW.  */
>>>>> extern char *gets (char *__s) __wur __attribute_deprecated__;
>>>>>
>>>>> I don't think the header should check __cplusplus <= 201103L like
>>>>> that. Jakub, Jason, what do you think?
>>>>
>>>> I guess Ulrich added this with the expectation that gets will be also
>>>> removed from C++1y.  Has there been any discussions regarding that in the
>>>> WG
>>>> already?
>>>
>>> Not to my best knowledge. I'm adding in CC Jonathan and Daniel too.
>>>
>>> We could certainly mirror in the C++ library what glibc does, but at this
>>> point it seems premature to me to assume that C++1y (C++14, in practice)
>>> will be stricter than C++11 as regards gets.
>>
>> Jonathan recently submitted an LWG issue for this (not yet part of the
>> available list). I'm in the process to add the new issue within the
>> following days. He's essentially suggesting to remove gets() from
>> C++14.
>
> To clarify this: The C++ Standard currently refers to a gets() function
> that does not exist anymore in the reference C99. So, its removal
> looks more than reasonable to me.

Sorry, I need to correct myself here: gets() is part of C99 TC3, but has
been deprecated.

>> - Daniel
>>
>>> Paolo.

[PATCH] Stream loops with LTO

2013-04-26 Thread Richard Biener


This adds loop tree streaming to LTO.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-04-26  Richard Biener  

* Makefile.in (lto-streamer-in.o): Add $(CFGLOOP_H) dependency.
(lto-streamer-out.o): Likewise.
* cfgloop.c (init_loops_structure): Export, add struct function
argument and adjust.
(flow_loops_find): Adjust.
* cfgloop.h (enum loop_estimation): Add EST_LAST.
(init_loops_structure): Declare.
* lto-streamer-in.c: Include cfgloop.h.
(input_cfg): Input the loop tree.
* lto-streamer-out.c: Include cfgloop.h.
(output_cfg): Output the loop tree.
(output_struct_function_base): Do not drop PROP_loops.

Index: trunk/gcc/Makefile.in
===
*** trunk.orig/gcc/Makefile.in  2013-04-26 10:01:45.0 +0200
--- trunk/gcc/Makefile.in   2013-04-26 10:36:17.494133278 +0200
*** lto-streamer-in.o: lto-streamer-in.c $(C
*** 2174,2184 
 $(TM_H) toplev.h $(DIAGNOSTIC_CORE_H) $(EXPR_H) $(FLAGS_H) $(PARAMS_H) \
 input.h $(HASHTAB_H) $(BASIC_BLOCK_H) $(TREE_FLOW_H) $(TREE_PASS_H) \
 $(CGRAPH_H) $(FUNCTION_H) $(GGC_H) $(DIAGNOSTIC_H) $(EXCEPT_H) debug.h \
!$(IPA_UTILS_H) $(LTO_STREAMER_H) toplev.h \
 $(DATA_STREAMER_H) $(GIMPLE_STREAMER_H) $(TREE_STREAMER_H)
  lto-streamer-out.o : lto-streamer-out.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 $(TM_H) $(DIAGNOSTIC_CORE_H) $(TREE_H) $(EXPR_H) $(FLAGS_H) $(PARAMS_H) input.h \
!$(HASHTAB_H) $(BASIC_BLOCK_H) tree-iterator.h \
 $(TREE_FLOW_H) $(TREE_PASS_H) $(CGRAPH_H) $(FUNCTION_H) $(GGC_H) \
 $(DIAGNOSTIC_CORE_H) $(EXCEPT_H) $(LTO_STREAMER_H) $(DIAGNOSTIC_CORE_H) \
 $(DATA_STREAMER_H) $(STREAMER_HOOKS_H) $(GIMPLE_STREAMER_H) \
--- 2174,2184 
 $(TM_H) toplev.h $(DIAGNOSTIC_CORE_H) $(EXPR_H) $(FLAGS_H) $(PARAMS_H) \
 input.h $(HASHTAB_H) $(BASIC_BLOCK_H) $(TREE_FLOW_H) $(TREE_PASS_H) \
 $(CGRAPH_H) $(FUNCTION_H) $(GGC_H) $(DIAGNOSTIC_H) $(EXCEPT_H) debug.h \
!$(IPA_UTILS_H) $(LTO_STREAMER_H) toplev.h $(CFGLOOP_H) \
 $(DATA_STREAMER_H) $(GIMPLE_STREAMER_H) $(TREE_STREAMER_H)
  lto-streamer-out.o : lto-streamer-out.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 $(TM_H) $(DIAGNOSTIC_CORE_H) $(TREE_H) $(EXPR_H) $(FLAGS_H) $(PARAMS_H) input.h \
!$(HASHTAB_H) $(BASIC_BLOCK_H) tree-iterator.h $(CFGLOOP_H) \
 $(TREE_FLOW_H) $(TREE_PASS_H) $(CGRAPH_H) $(FUNCTION_H) $(GGC_H) \
 $(DIAGNOSTIC_CORE_H) $(EXCEPT_H) $(LTO_STREAMER_H) $(DIAGNOSTIC_CORE_H) \
 $(DATA_STREAMER_H) $(STREAMER_HOOKS_H) $(GIMPLE_STREAMER_H) \
Index: trunk/gcc/cfgloop.c
===
*** trunk.orig/gcc/cfgloop.c    2013-04-26 10:01:45.0 +0200
--- trunk/gcc/cfgloop.c 2013-04-26 10:38:35.297698554 +0200
*** alloc_loop (void)
*** 339,346 
  /* Initializes loops structure LOOPS, reserving place for NUM_LOOPS loops
 (including the root of the loop tree).  */
  
! static void
! init_loops_structure (struct loops *loops, unsigned num_loops)
  {
struct loop *root;
  
--- 339,347 
  /* Initializes loops structure LOOPS, reserving place for NUM_LOOPS loops
 (including the root of the loop tree).  */
  
! void
! init_loops_structure (struct function *fn,
! struct loops *loops, unsigned num_loops)
  {
struct loop *root;
  
*** init_loops_structure (struct loops *loop
*** 349,359 
  
/* Dummy loop containing whole function.  */
root = alloc_loop ();
!   root->num_nodes = n_basic_blocks;
!   root->latch = EXIT_BLOCK_PTR;
!   root->header = ENTRY_BLOCK_PTR;
!   ENTRY_BLOCK_PTR->loop_father = root;
!   EXIT_BLOCK_PTR->loop_father = root;
  
loops->larray->quick_push (root);
loops->tree_root = root;
--- 350,360 
  
/* Dummy loop containing whole function.  */
root = alloc_loop ();
!   root->num_nodes = n_basic_blocks_for_function (fn);
!   root->latch = EXIT_BLOCK_PTR_FOR_FUNCTION (fn);
!   root->header = ENTRY_BLOCK_PTR_FOR_FUNCTION (fn);
!   ENTRY_BLOCK_PTR_FOR_FUNCTION (fn)->loop_father = root;
!   EXIT_BLOCK_PTR_FOR_FUNCTION (fn)->loop_father = root;
  
loops->larray->quick_push (root);
loops->tree_root = root;
*** flow_loops_find (struct loops *loops)
*** 411,417 
if (!loops)
  {
loops = ggc_alloc_cleared_loops ();
!   init_loops_structure (loops, 1);
  }
  
/* Ensure that loop exits were released.  */
--- 412,418 
if (!loops)
  {
loops = ggc_alloc_cleared_loops ();
!   init_loops_structure (cfun, loops, 1);
  }
  
/* Ensure that loop exits were released.  */
Index: trunk/gcc/cfgloop.h
===
*** trunk.orig/gcc/cfgloop.h    2013-04-26 10:01:45.0 +0200
--- trunk/gcc/cfgloop.h 2013-04-26 10:41:45.600861044 +0200
*** enum loop_est
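
Note on the cfgloop.c hunks above: init_loops_structure now takes the
function as an explicit argument instead of reading the current function's
globals. The following sketch illustrates that change in isolation. All
type and function names here are hypothetical stand-ins for the GCC
structures, and the assumption that this is what lets the LTO reader set up
loops for a function other than cfun is an inference from the ChangeLog,
not something stated in the message.

```c
#include <assert.h>
#include <stddef.h>

struct loop_model;

/* Hypothetical stand-ins for basic_block, struct loop and struct function.  */
struct bb_model
{
  struct loop_model *loop_father;
};

struct loop_model
{
  unsigned num_nodes;
  struct bb_model *header;
  struct bb_model *latch;
};

struct function_model
{
  unsigned n_basic_blocks;
  struct bb_model entry_block;
  struct bb_model exit_block;
};

/* Set up ROOT as the dummy loop containing the whole of FN, taking every
   piece of per-function state from FN rather than from a global.  */
void
init_root_loop (struct function_model *fn, struct loop_model *root)
{
  assert (fn != NULL && root != NULL);

  root->num_nodes = fn->n_basic_blocks;
  root->latch = &fn->exit_block;
  root->header = &fn->entry_block;
  fn->entry_block.loop_father = root;
  fn->exit_block.loop_father = root;
}
```

Because nothing is read from a global, two functions can be initialized
back to back without switching any notion of a current function.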

[PATCH, ARM] Remove incscc and decscc patterns from thumb2.md

2013-04-26 Thread Greta Yorsh

This patch removes dead patterns for incscc and decscc from thumb2.md.

It's a cleanup after this patch:
http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00955.html
which removed incscc and decscc expanders and the corresponding patterns
from arm.md, but not from thumb2.md.

No regression on qemu for arm-none-eabi cortex-a15 thumb.

Ok for trunk?

Thanks,
Greta

gcc/

2013-04-05  Greta Yorsh  

* config/arm/thumb2.md (thumb2_incscc, thumb2_decscc): Delete.

diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 6aa76f6..968cc0c 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -24,32 +24,6 @@
 ;; changes made in armv5t as "thumb2".  These are considered part
 ;; the 16-bit Thumb-1 instruction set.
 
-(define_insn "*thumb2_incscc"
-  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
-(plus:SI (match_operator:SI 2 "arm_comparison_operator"
-[(match_operand:CC 3 "cc_register" "") (const_int 0)])
- (match_operand:SI 1 "s_register_operand" "0,?r")))]
-  "TARGET_THUMB2"
-  "@
-  it\\t%d2\;add%d2\\t%0, %1, #1
-  ite\\t%D2\;mov%D2\\t%0, %1\;add%d2\\t%0, %1, #1"
-  [(set_attr "conds" "use")
-   (set_attr "length" "6,10")]
-)
-
-(define_insn "*thumb2_decscc"
-  [(set (match_operand:SI0 "s_register_operand" "=r,r")
-(minus:SI (match_operand:SI  1 "s_register_operand" "0,?r")
- (match_operator:SI 2 "arm_comparison_operator"
-   [(match_operand   3 "cc_register" "") (const_int 0)])))]
-  "TARGET_THUMB2"
-  "@
-   it\\t%d2\;sub%d2\\t%0, %1, #1
-   ite\\t%D2\;mov%D2\\t%0, %1\;sub%d2\\t%0, %1, #1"
-  [(set_attr "conds" "use")
-   (set_attr "length" "6,10")]
-)
-
 ;; Thumb-2 only allows shift by constant on data processing instructions 
 (define_insn "*thumb_andsi_not_shiftsi_si"
   [(set (match_operand:SI 0 "s_register_operand" "=r")

Re: [PATCH] Fix linking with -findirect-dispatch

2013-04-26 Thread Matthias Klose

On 16.04.2013 11:55, Matthias Klose wrote:
> On 16.04.2013 11:48, Jakub Jelinek wrote:
>> On Tue, Apr 16, 2013 at 11:37:07AM +0200, Andreas Schwab wrote:
>>> Jakub Jelinek  writes:
>>>
>>>> at dynamic link time it is a dummy library with no symbols that just
>>>> adds DT_NEEDED of the latest and greatest libgcj.so.N, which provides
>>>> all the symbols.
>>>
>>> Which is exactly the problem.  --no-copy-dt-needed-entries has been the
>>> default for a long time now.
>>
>> Why would that be a problem?  libgcj.so the linker sees (i.e. the dummy
>> library) doesn't intentionally have DT_NEEDED libgcj.so.N, programs and
>> shared libraries linked with -findirect-dispatch should be adding
>> libgcj_bc.so to DT_NEEDED, not libgcj.so.N.
>>
>> If this is caused by some recent broken linker change, then that should be
>> better reverted.
> 
> I don't see this with binutils 2.23.2.

I do see this now too, however the root of the problem seems to be a linker
which defaults to --as-needed (which is the case on SuSe afaik).  I can see this
without installing anything, just running the testsuite shows several hundred
of these failures.

  Matthias

Re: [PATCH] Fix linking with -findirect-dispatch

2013-04-26 Thread Andrew Haley

On 04/26/2013 12:22 PM, Matthias Klose wrote:
> I do see this now too, however the root of the problem seems to be a linker
> which defaults to --as-needed (which is the case on SuSe afaik).

Is this a non-standard thing?  So SuSe has a special configure option
which does this?  We can always patch in --no-as-needed.

Andrew.

Re: [PATCH] Loop distribution improvements

2013-04-26 Thread Marc Glisse



ping http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00336.html

On Fri, 5 Apr 2013, Marc Glisse wrote:


On Fri, 5 Apr 2013, Marc Glisse wrote:

Shouldn't we change integer_all_onesp to do what its name says and create a 
separate integer_minus_onep for the single place I could find where it 
would break, the folding of x * -1 ?


2013-04-05  Marc Glisse  

* tree.c (integer_all_onesp) : Test that both
components are all 1s.
(integer_minus_onep): New function.
* tree.h (integer_minus_onep): Declare it.
* fold-const.c (fold_binary_loc) : Test
integer_minus_onep instead of integer_all_onesp.

It passes bootstrap+testsuite on x86_64-linux-gnu, but if someone else wants 
to go through the (not that long) list of integer_all_onesp to check for 
things that might break... I did not change places where the name "-1" might 
make more sense than "all 1s" but the type cannot be complex.
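
To make the distinction concrete, here is a minimal standalone sketch of the two predicates. It is a toy model, not GCC code: `struct toy_cst`, `toy_all_onesp` and `toy_minus_onep` are hypothetical names, and a fixed 32-bit signed component stands in for a real tree constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for an integer or complex-integer constant (hypothetical,
   not GCC's tree representation).  */
struct toy_cst
{
  bool is_complex;
  int32_t real;
  int32_t imag;			/* Ignored unless is_complex.  */
};

/* True if every bit of V is set.  */
static bool
all_bits_set (int32_t v)
{
  return (uint32_t) v == UINT32_MAX;
}

/* "All ones": every component has all bits set.  */
bool
toy_all_onesp (struct toy_cst c)
{
  return all_bits_set (c.real) && (!c.is_complex || all_bits_set (c.imag));
}

/* "Minus one": the value -1, which for a complex constant is -1 + 0i.  */
bool
toy_minus_onep (struct toy_cst c)
{
  return c.real == -1 && (!c.is_complex || c.imag == 0);
}

/* Usage: the two predicates agree on scalars and differ on complex.  */
void
toy_predicates_demo (void)
{
  struct toy_cst scalar = { false, -1, 0 };
  struct toy_cst complex_minus_one = { true, -1, 0 };
  struct toy_cst complex_all_ones = { true, -1, -1 };

  assert (toy_all_onesp (scalar) && toy_minus_onep (scalar));
  assert (!toy_all_onesp (complex_minus_one)
	  && toy_minus_onep (complex_minus_one));
  assert (toy_all_onesp (complex_all_ones)
	  && !toy_minus_onep (complex_all_ones));
}
```

Under this model, folding x * -1 wants the "minus one" test, while bit-mask style folds want the "all ones" test.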


--
Marc Glisse

[PATCH] Properly outline loop tree in move_sese_region_to_fn

2013-04-26 Thread Richard Biener


This implements the missing outlining of loop tree parts in
move_sese_region_to_fn (used by OMP lowering and thus autopar).

Testing coverage is somewhat weak because OMP lowering doesn't
update the loop tree when building loops.  Something to be fixed
so the loop fixup can be avoided.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2013-04-26  Richard Biener  

* omp-low.c (finalize_task_copyfn): Do not drop PROP_loops.
(expand_omp_taskreg): Likewise.  Mark loops for fixup.
* tree-cfg.c (move_block_to_fn): Remap loop fathers.
(fixup_loop_arrays_after_move): New function.
(move_sese_region_to_fn): Properly outline the loop tree parts
of the SESE region.
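
The renumbering done by fixup_loop_arrays_after_move can be pictured with a small standalone model. This is only a sketch under assumptions: `struct toy_loop`, `struct toy_fn` and `toy_fixup_after_move` are hypothetical names, a fixed-size array replaces the real vector, and the recursion over subloops is inferred from the function's comment rather than taken from the patch text.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LOOPS 16

/* Toy loop node (hypothetical, not GCC's struct loop).  */
struct toy_loop
{
  int num;			/* Index into the owning function's array.  */
  struct toy_loop *inner;	/* First subloop, or NULL.  */
  struct toy_loop *next;	/* Next sibling, or NULL.  */
};

/* Toy per-function loop array.  */
struct toy_fn
{
  struct toy_loop *larray[TOY_MAX_LOOPS];
  int nloops;			/* Used slots, including NULL holes.  */
};

/* Move LOOP and its subloops from FROM's array to TO's array.  */
void
toy_fixup_after_move (struct toy_fn *from, struct toy_fn *to,
		      struct toy_loop *loop)
{
  assert (to->nloops < TOY_MAX_LOOPS);

  /* Discard it from the old loop array, leaving a hole.  */
  from->larray[loop->num] = NULL;

  /* Place it in the new loop array, assigning it a new number.  */
  loop->num = to->nloops;
  to->larray[to->nloops++] = loop;

  /* Assumed step: handle the subloops the same way.  */
  for (struct toy_loop *sub = loop->inner; sub != NULL; sub = sub->next)
    toy_fixup_after_move (from, to, sub);
}
```

Note that the old array keeps its length and merely gets NULL entries, so loop numbers of the loops that stay behind do not change.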

Index: trunk/gcc/omp-low.c
===
*** trunk.orig/gcc/omp-low.c2013-04-26 11:27:31.0 +0200
--- trunk/gcc/omp-low.c 2013-04-26 12:01:56.643354202 +0200
*** finalize_task_copyfn (gimple task_stmt)
*** 1258,1267 
  return;
  
child_cfun = DECL_STRUCT_FUNCTION (child_fn);
! 
!   /* Inform the callgraph about the new function.  */
!   DECL_STRUCT_FUNCTION (child_fn)->curr_properties
! = cfun->curr_properties & ~PROP_loops;
  
push_cfun (child_cfun);
bind = gimplify_body (child_fn, false);
--- 1258,1264 
  return;
  
child_cfun = DECL_STRUCT_FUNCTION (child_fn);
!   DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
  
push_cfun (child_cfun);
bind = gimplify_body (child_fn, false);
*** finalize_task_copyfn (gimple task_stmt)
*** 1276,1281 
--- 1273,1279 
gimple_set_body (child_fn, seq);
pop_cfun ();
  
+   /* Inform the callgraph about the new function.  */
cgraph_add_new_function (child_fn, false);
  }
  
*** expand_omp_taskreg (struct omp_region *r
*** 3573,3578 
--- 3571,3581 
new_bb = move_sese_region_to_fn (child_cfun, entry_bb, exit_bb, block);
if (exit_bb)
single_succ_edge (new_bb)->flags = EDGE_FALLTHRU;
+   /* ???  As the OMP expansion process does not update the loop
+  tree of the original function before outlining the region to
+the new child function we need to discover loops in the child.
+Arrange for that.  */
+   child_cfun->x_current_loops->state |= LOOPS_NEED_FIXUP;
  
/* Remove non-local VAR_DECLs from child_cfun->local_decls list.  */
num = vec_safe_length (child_cfun->local_decls);
*** expand_omp_taskreg (struct omp_region *r
*** 3589,3596 
vec_safe_truncate (child_cfun->local_decls, dstidx);
  
/* Inform the callgraph about the new function.  */
!   DECL_STRUCT_FUNCTION (child_fn)->curr_properties
!   = cfun->curr_properties & ~PROP_loops;
cgraph_add_new_function (child_fn, true);
  
/* Fix the callgraph edges for child_cfun.  Those for cfun will be
--- 3592,3598 
vec_safe_truncate (child_cfun->local_decls, dstidx);
  
/* Inform the callgraph about the new function.  */
!   DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
cgraph_add_new_function (child_fn, true);
  
/* Fix the callgraph edges for child_cfun.  Those for cfun will be
Index: trunk/gcc/tree-cfg.c
===
*** trunk.orig/gcc/tree-cfg.c   2013-04-26 11:27:31.0 +0200
--- trunk/gcc/tree-cfg.c2013-04-26 12:59:55.282564848 +0200
*** move_block_to_fn (struct function *dest_
*** 6366,6373 
  
/* Remove BB from dominance structures.  */
delete_from_dominance_info (CDI_DOMINATORS, bb);
if (current_loops)
! remove_bb_from_loops (bb);
  
/* Link BB to the new linked list.  */
move_block_after (bb, after);
--- 6366,6379 
  
/* Remove BB from dominance structures.  */
delete_from_dominance_info (CDI_DOMINATORS, bb);
+ 
+   /* Move BB from its current loop to the copy in the new function.  */
if (current_loops)
! {
!   struct loop *new_loop = (struct loop *)bb->loop_father->aux;
!   if (new_loop)
!   bb->loop_father = new_loop;
! }
  
/* Link BB to the new linked list.  */
move_block_after (bb, after);
*** replace_block_vars_by_duplicates (tree b
*** 6599,6604 
--- 6605,6629 
  replace_block_vars_by_duplicates (block, vars_map, to_context);
  }
  
+ /* Fixup the loop arrays and numbers after moving LOOP and its subloops
+from FN1 to FN2.  */
+ 
+ static void
+ fixup_loop_arrays_after_move (struct function *fn1, struct function *fn2,
+ struct loop *loop)
+ {
+   /* Discard it from the old loop array.  */
+   (*fn1->x_current_loops->larray)[loop->num] = NULL;
+ 
+   /* Place it in the new loop array, assigning it a new number.  */
+   loop->num = vec_safe_length (fn2->x_current_loops->larray);
+   vec_safe_push (fn2->x_

[C++ testcase, committed] PR 55708

2013-04-26 Thread Paolo Carlini


Hi,

tested x86_64-linux, committed to mainline.

Thanks,
Paolo.

//
2013-04-26  Paolo Carlini  

PR c++/55708
* g++.dg/cpp0x/constexpr-55708.C: New.
Index: g++.dg/cpp0x/constexpr-55708.C
===
--- g++.dg/cpp0x/constexpr-55708.C  (revision 0)
+++ g++.dg/cpp0x/constexpr-55708.C  (working copy)
@@ -0,0 +1,30 @@
+// PR c++/55708
+// { dg-do compile { target c++11 } }
+
+template
+struct AA { static constexpr int val = N; };
+
+template
+//constexpr unsigned long long mymax(A a,B b){ // <-- compiles 
+constexpr unsigned long long mymax(A && a,const B& b){
+  return a
+constexpr long long operator"" _y() noexcept
+{
+  return AA<1, mymax(1,2)>::val; // <-- crashes gcc
+  // return mymax(1,2);   // <-- compiles
+  // return AA<1,2>::val; // <-- compiles
+}
+
+template
+constexpr unsigned long long do_y() noexcept
+{
+  return AA<1, mymax(1,2)>::val; // <-- crashes gcc
+}
+
+int main()
+{
+  return 1_y + do_y();
+}

[PATCH] Fix PR57081

2013-04-26 Thread Richard Biener


This should fix PR57081 - the problem is that number-of-iteration
estimates are not freed by DCE and thus possibly dead stmts are looked
at in loop prediction.  The fix is to always free number-of-iteration
estimates at loop_optimizer_finalize time (this is the behavior
from before preserving loops at all).

Bootstrap & regtest pending on x86_64-unknown-linux-gnu.

Richard.

2013-04-26  Richard Biener  

PR tree-optimization/57081
* loop-init.c: Include tree-flow.h.
(loop_optimizer_finalize): Free number of iteration estimates.
* Makefile.in (loop-init.o): Add $(TREE_FLOW_H) dependency.

* gcc.dg/torture/pr57081.c: New testcase.

Index: gcc/loop-init.c
===
*** gcc/loop-init.c (revision 198334)
--- gcc/loop-init.c (working copy)
*** along with GCC; see the file COPYING3.
*** 30,35 
--- 30,36 
  #include "flags.h"
  #include "df.h"
  #include "ggc.h"
+ #include "tree-flow.h"
  
  
  /* Apply FLAGS to the loop state.  */
*** loop_optimizer_finalize (void)
*** 142,147 
--- 143,150 
if (loops_state_satisfies_p (LOOPS_HAVE_RECORDED_EXITS))
  release_recorded_exits ();
  
+   free_numbers_of_iterations_estimates ();
+ 
/* If we should preserve loop structure, do not free it but clear
   flags that advanced properties are there as we are not preserving
   that in full.  */
Index: gcc/Makefile.in
===
*** gcc/Makefile.in (revision 198334)
--- gcc/Makefile.in (working copy)
*** cfgloopmanip.o : cfgloopmanip.c $(CONFIG
*** 3181,3187 
  loop-init.o : loop-init.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) $(GGC_H) \
 $(BASIC_BLOCK_H) hard-reg-set.h $(CFGLOOP_H) \
 coretypes.h $(TM_H) $(OBSTACK_H) $(TREE_PASS_H) $(FLAGS_H) \
!$(REGS_H) $(DF_H)
  loop-unswitch.o : loop-unswitch.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 $(DUMPFILE_H) \
 $(RTL_H) $(TM_H) $(BASIC_BLOCK_H) hard-reg-set.h $(CFGLOOP_H) $(PARAMS_H) \
--- 3189,3195 
  loop-init.o : loop-init.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) $(GGC_H) \
 $(BASIC_BLOCK_H) hard-reg-set.h $(CFGLOOP_H) \
 coretypes.h $(TM_H) $(OBSTACK_H) $(TREE_PASS_H) $(FLAGS_H) \
!$(REGS_H) $(DF_H) $(TREE_FLOW_H)
  loop-unswitch.o : loop-unswitch.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 $(DUMPFILE_H) \
 $(RTL_H) $(TM_H) $(BASIC_BLOCK_H) hard-reg-set.h $(CFGLOOP_H) $(PARAMS_H) \
Index: gcc/testsuite/gcc.dg/torture/pr57081.c
===
*** gcc/testsuite/gcc.dg/torture/pr57081.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr57081.c  (working copy)
***
*** 0 
--- 1,22 
+ /* { dg-do compile } */
+ 
+ int a;
+ 
+ void f(void)
+ {
+   int b;
+ 
+   if(0)
+ lbl:
+   goto lbl;
+ 
+   if(b)
+ {
+   int p = 0;
+   goto lbl;
+ }
+ 
+   a = 0;
+   while(b++);
+   goto lbl;
+ }

Re: Logic operators ! && || for vectors

2013-04-26 Thread Marc Glisse


Ping http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00783.html

Even if we decide not to implement logic operators in the front-end, we 
still need the fix for the wrong code (the 2 save_expr in 
cp_build_binary_op, is that part of the patch ok with the 
vector-scalar-2.c testcase? and for 4.8?) and to avoid ICEing on __m128i 
f(__m128d a,__m128d b){return awarn_logical_operator.


On Fri, 12 Apr 2013, Marc Glisse wrote:


Hello,

this adds support for vector !, && and ||. In the long run, I think it would 
be good to be able to use TRUTH_*_EXPR with vectors, but that's probably a 
lot of work.


It currently restricts && and || to vector-vector operations. I'd like to 
also support mixed scalar-vector later, but it is a bit more complicated. 
With vectors, && evaluates both sides. For scal && vec, we have the choice of 
making it short-circuit: cond_expr((bool)scal, vec!=0, {0}) or do a vector 
and. For vec && scal, it seems clear we have to evaluate both operands, but 
then we can also make it a cond_expr instead of a BIT_AND_EXPR (technically, 
I think I can achieve that with save_expr and a compound_expr, I don't know 
if there is a better way to add statements).
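
For reference, the vector-vector semantics described here can be written out by hand with the existing vector extension. This is a sketch, assuming a compiler that supports `vector_size` types with lane-wise comparisons and subscripting (GCC and Clang do); the helper names are made up for illustration and are not part of the patch.

```c
#include <assert.h>

/* Four 32-bit ints in one 16-byte vector (GCC/Clang vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Lane-wise "a && b": a lane is -1 (all bits set) when both inputs are
   nonzero in that lane and 0 otherwise.  Both operands are always
   evaluated; nothing short-circuits.  */
v4si
vec_logical_and (v4si a, v4si b)
{
  const v4si zero = { 0, 0, 0, 0 };
  return (a != zero) & (b != zero);
}

/* Lane-wise "a || b".  */
v4si
vec_logical_or (v4si a, v4si b)
{
  const v4si zero = { 0, 0, 0, 0 };
  return (a != zero) | (b != zero);
}

/* Lane-wise "!a".  */
v4si
vec_logical_not (v4si a)
{
  const v4si zero = { 0, 0, 0, 0 };
  return a == zero;
}

/* Usage: only lane 0 has both inputs nonzero.  */
void
vec_logic_demo (void)
{
  v4si a = { 1, 0, -5, 0 };
  v4si b = { 2, 3, 0, 0 };
  v4si r = vec_logical_and (a, b);
  assert (r[0] == -1 && r[1] == 0 && r[2] == 0 && r[3] == 0);
}
```

A short-circuiting scal && vec, as discussed above, would instead skip the vector comparison entirely when the scalar is zero, which this lane-wise form cannot express.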


The missing save_expr before build_vector_from_val are a bug I introduced 
when I adapted the code from the C front-end. This bit (and the 
vector-scalar-2.c testcase that goes with it) should probably be backported 
to 4.8.


The code we generate for these examples is not very good, but that's a 
different issue.


Bootstrap+testsuite on x86_64-linux-gnu.

2013-04-12  Marc Glisse  

gcc/cp/
* typeck.c (cp_build_binary_op): Call save_expr before
build_vector_from_val.
: Handle vectors.
(cp_build_unary_op) : Likewise.

gcc/c-family/
* c-common.c (warn_logical_operator): Give up for vectors.

gcc/testsuite/
* c-c++-common/vector-scalar-2.c: New testcase.
* g++.dg/ext/vector22.C: Likewise.
* g++.dg/ext/vector23.C: Likewise.
* g++.dg/ext/vector9.C: Adapt.
* g++.dg/other/error23.C: Adapt.




--
Marc Glisse

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Jonathan Wakely

On 26 April 2013 12:06, Daniel Krügler wrote:
>>>
>>> Jonathan recently submitted an LWG issue for this (not yet part of the
>>> available list). I'm in the process of adding the new issue within the
>>> following days. He's essentially suggesting to remove gets() from
>>> C++14.
>>
>> To clarify this: The C++ Standard currently refers to a gets() function
>> that does not exist anymore in the reference C99. So, its removal
>> looks more than reasonable to me.
>
> Sorry, I need to correct myself here: gets() is part of C99 TC3, but has
> been deprecated.

Yes. When I submitted the issue I was under the impression (along with
most of the BSI C++ panel) that the C++ standard references the
current C standard, which is now C11.  I was told in Bristol that the
reference to ISO/IEC 9899:1999 is fine and so we only reference the
C99 library, which still includes gets().  So I think the issue I
submitted is NAD.

IMHO it wouldn't be a bad thing to remove gets() from the C++ Standard
Library even if it's technically still in C99.  It is evil and should
be killed with fire.  But that probably can't be handled with a DR and
so is too late for C++14.

[AArch64] Map frint intrinsics to standard pattern names directly.

2013-04-26 Thread James Greenhalgh


Hi,

This patch maps the frint style intrinsics directly to their
standard pattern name versions and adds support for frintn, which
does not map to a standard pattern name.
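
As a quick reference, the renames listed in the ChangeLog below can be read as a lookup table. The following is only an illustrative sketch: `struct frint_name`, `frint_names` and `standard_name_for` are hypothetical names, not anything in the aarch64 backend.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per frint variant: the instruction-style builtin name and the
   name it has after this patch.  */
struct frint_name
{
  const char *insn;
  const char *standard;
};

static const struct frint_name frint_names[] = {
  { "frintz", "btrunc" },
  { "frintp", "ceil" },
  { "frintm", "floor" },
  { "frinti", "nearbyint" },
  { "frintx", "rint" },
  { "frinta", "round" },
  /* frintn has no standard pattern name, so it keeps its own.  */
  { "frintn", "frintn" },
};

/* Return the post-patch builtin name for INSN, or NULL if INSN is not
   one of the frint variants.  */
const char *
standard_name_for (const char *insn)
{
  assert (insn != NULL);
  for (size_t i = 0; i < sizeof frint_names / sizeof frint_names[0]; i++)
    if (strcmp (frint_names[i].insn, insn) == 0)
      return frint_names[i].standard;
  return NULL;
}
```

For example, `standard_name_for ("frintm")` yields "floor", matching the change to the BUILT_IN_FLOOR case in aarch64_builtin_vectorized_function.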

Regression tested on aarch64-none-elf with no issues.

Thanks,
James

---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/aarch64-builtins.c
(aarch64_builtin_vectorized_function): Fold to standard pattern names.
* config/aarch64/aarch64-simd-builtins.def (frintn): New.
(frintz): Rename to...
(btrunc): ...this.
(frintp): Rename to...
(ceil): ...this.
(frintm): Rename to...
(floor): ...this.
(frinti): Rename to...
(nearbyint): ...this.
(frintx): Rename to...
(rint): ...this.
(frinta): Rename to...
(round): ...this.
* config/aarch64/aarch64-simd.md
(aarch64_frint): Delete.
(2): Convert to insn.
* config/aarch64/aarch64.md (unspec): Add UNSPEC_FRINTN.
* config/aarch64/iterators.md (FRINT): Add UNSPEC_FRINTN.
(frint_pattern): Likewise.
(frint_suffix): Likewise.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 2851e2b..08bfe01 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1224,19 +1224,19 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in)
&& in_mode == N##Fmode && in_n == C)
 	case BUILT_IN_FLOOR:
 	case BUILT_IN_FLOORF:
-	  return AARCH64_FIND_FRINT_VARIANT (frintm);
+	  return AARCH64_FIND_FRINT_VARIANT (floor);
 	case BUILT_IN_CEIL:
 	case BUILT_IN_CEILF:
-	  return AARCH64_FIND_FRINT_VARIANT (frintp);
+	  return AARCH64_FIND_FRINT_VARIANT (ceil);
 	case BUILT_IN_TRUNC:
 	case BUILT_IN_TRUNCF:
-	  return AARCH64_FIND_FRINT_VARIANT (frintz);
+	  return AARCH64_FIND_FRINT_VARIANT (btrunc);
 	case BUILT_IN_ROUND:
 	case BUILT_IN_ROUNDF:
-	  return AARCH64_FIND_FRINT_VARIANT (frinta);
+	  return AARCH64_FIND_FRINT_VARIANT (round);
 	case BUILT_IN_NEARBYINT:
 	case BUILT_IN_NEARBYINTF:
-	  return AARCH64_FIND_FRINT_VARIANT (frinti);
+	  return AARCH64_FIND_FRINT_VARIANT (nearbyint);
 	case BUILT_IN_SQRT:
 	case BUILT_IN_SQRTF:
 	  return AARCH64_FIND_FRINT_VARIANT (sqrt);
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 6e69298..9b06a68 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -247,13 +247,14 @@
   BUILTIN_VDQ_BHSI (BINOP, umax, 3)
   BUILTIN_VDQ_BHSI (BINOP, umin, 3)
 
-  /* Implemented by aarch64_frint.  */
-  BUILTIN_VDQF (UNOP, frintz, 0)
-  BUILTIN_VDQF (UNOP, frintp, 0)
-  BUILTIN_VDQF (UNOP, frintm, 0)
-  BUILTIN_VDQF (UNOP, frinti, 0)
-  BUILTIN_VDQF (UNOP, frintx, 0)
-  BUILTIN_VDQF (UNOP, frinta, 0)
+  /* Implemented by 2.  */
+  BUILTIN_VDQF (UNOP, btrunc, 2)
+  BUILTIN_VDQF (UNOP, ceil, 2)
+  BUILTIN_VDQF (UNOP, floor, 2)
+  BUILTIN_VDQF (UNOP, nearbyint, 2)
+  BUILTIN_VDQF (UNOP, rint, 2)
+  BUILTIN_VDQF (UNOP, round, 2)
+  BUILTIN_VDQF (UNOP, frintn, 2)
 
   /* Implemented by aarch64_fcvt.  */
   BUILTIN_VDQF (UNOP, fcvtzs, 0)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 5862d26..5f14cc6 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1232,7 +1232,9 @@
(set_attr "simd_mode" "")]
 )
 
-(define_insn "aarch64_frint"
+;; Vector versions of the floating-point frint patterns.
+;; Expands to btrunc, ceil, floor, nearbyint, rint, round.
+(define_insn "2"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
 	(unspec:VDQF [(match_operand:VDQF 1 "register_operand" "w")]
 		  FRINT))]
@@ -1242,15 +1244,6 @@
(set_attr "simd_mode" "")]
 )
 
-;; Vector versions of the floating-point frint patterns.
-;; Expands to btrunc, ceil, floor, nearbyint, rint, round.
-(define_expand "2"
-  [(set (match_operand:VDQF 0 "register_operand")
-	(unspec:VDQF [(match_operand:VDQF 1 "register_operand")]
-		  FRINT))]
-  "TARGET_SIMD"
-  {})
-
 (define_insn "aarch64_fcvt"
   [(set (match_operand: 0 "register_operand" "=w")
 	(FIXUORS: (unspec:
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 330f78c..4342c2d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -74,6 +74,7 @@
 UNSPEC_FRINTA
 UNSPEC_FRINTI
 UNSPEC_FRINTM
+UNSPEC_FRINTN
 UNSPEC_FRINTP
 UNSPEC_FRINTX
 UNSPEC_FRINTZ
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 58a2a9e..a2ad866 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -692,7 +692,8 @@
 			  UNSPEC_UZP1 UNSPEC_UZP2])
 
 (define_int_iterator FRINT [UNSPEC_FRINTZ UNSPEC_FRINTP UNSPEC_FRINTM
-			 UNSPEC_FRINTI UNSPEC_FRINTX UNSPEC_FRINTA])
+			 UNSPEC_FRINTN UNSPEC_FRINTI UNSPEC_FRINTX
+			 UNSPEC_FRINTA])
 
 (define_

[AArch64] Convert NEON frint implementations to use builtins.

2013-04-26 Thread James Greenhalgh


Hi,

This patch renames the vrnd intrinsics,
which previously were vrnd

At the same time, we move these intrinsics to an RTL-based intrinsic.

Regression tested on aarch64-none-elf with no issues.

Thanks,
James

---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/arm_neon.h (vrndq_f<32, 64>): Rename to...
(vrndq_f<32, 64>): ...This, implement using builtin.
(vrnd_f32): Implement using builtins.
(vrnd_f<32, 64>): New.

gcc/testsuite/

2013-04-26  James Greenhalgh  

* gcc.target/aarch64/vect-vrnd.c: New.
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 6f5ca8e..c868a46 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -14941,171 +14941,6 @@ vrev64q_u32 (uint32x4_t a)
   return result;
 }
 
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vrnd_f32 (float32x2_t a)
-{
-  float32x2_t result;
-  __asm__ ("frintz %0.2s,%1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vrnda_f32 (float32x2_t a)
-{
-  float32x2_t result;
-  __asm__ ("frinta %0.2s,%1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vrndm_f32 (float32x2_t a)
-{
-  float32x2_t result;
-  __asm__ ("frintm %0.2s,%1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vrndn_f32 (float32x2_t a)
-{
-  float32x2_t result;
-  __asm__ ("frintn %0.2s,%1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vrndp_f32 (float32x2_t a)
-{
-  float32x2_t result;
-  __asm__ ("frintp %0.2s,%1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
-vrndq_f32 (float32x4_t a)
-{
-  float32x4_t result;
-  __asm__ ("frintz %0.4s,%1.4s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vrndq_f64 (float64x2_t a)
-{
-  float64x2_t result;
-  __asm__ ("frintz %0.2d,%1.2d"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
-vrndqa_f32 (float32x4_t a)
-{
-  float32x4_t result;
-  __asm__ ("frinta %0.4s,%1.4s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vrndqa_f64 (float64x2_t a)
-{
-  float64x2_t result;
-  __asm__ ("frinta %0.2d,%1.2d"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
-vrndqm_f32 (float32x4_t a)
-{
-  float32x4_t result;
-  __asm__ ("frintm %0.4s,%1.4s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vrndqm_f64 (float64x2_t a)
-{
-  float64x2_t result;
-  __asm__ ("frintm %0.2d,%1.2d"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
-vrndqn_f32 (float32x4_t a)
-{
-  float32x4_t result;
-  __asm__ ("frintn %0.4s,%1.4s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vrndqn_f64 (float64x2_t a)
-{
-  float64x2_t result;
-  __asm__ ("frintn %0.2d,%1.2d"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
-vrndqp_f32 (float32x4_t a)
-{
-  float32x4_t result;
-  __asm__ ("frintp %0.4s,%1.4s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vrndqp_f64 (float64x2_t a)
-{
-  float64x2_t result;
-  __asm__ ("frintp %0.2d,%1.2d"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
 #define vrshrn_high_n_s16(a, b, c)  \
   __extension__

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Daniel Krügler

2013/4/26 Jonathan Wakely :
> On 26 April 2013 12:06, Daniel Krügler wrote:

>>>> Jonathan recently submitted an LWG issue for this (not yet part of the
>>>> available list). I'm in the process of adding the new issue within the
>>>> following days. He's essentially suggesting to remove gets() from
>>>> C++14.
>>>
>>> To clarify this: The C++ Standard currently refers to a gets() function
>>> that does not exist anymore in the reference C99. So, its removal
>>> looks more than reasonable to me.
>>
>> Sorry, I need to correct myself here: gets() is part of C99 TC3, but has
>> been deprecated.
>
> Yes. When I submitted the issue I was under the impression (along with
> most of the BSI C++ panel) that the C++ standard references the
> current C standard, which is now C11.  I was told in Bristol that the
> reference to ISO/IEC 9899:1999 is fine and so we only reference the
> C99 library, which still includes gets().  So I think the issue I
> submitted is NAD.
>
> IMHO it wouldn't be a bad thing to remove gets() from the C++ Standard
> Library even if it's technically still in C99.  It is evil and should
> be killed with fire.  But that probably can't be handled with a DR and
> so is too late for C++14.

There exists at least one further issue similar to that one, such as

http://cplusplus.github.io/LWG/lwg-active.html#2241

so let's see (at a minimum, the gets() function should still keep the
deprecated attribute).

- Daniel

[PATCH, AArch64] Testcases for ANDS instruction

2013-04-26 Thread Ian Bolton

I made some testcases to go with my implementation of ANDS in the backend,
but Naveen Hurugalawadi got the ANDS patterns in before me!
 
I'm now just left with the testcases, but they are still worth adding, so
here they are.

Tests are working correctly as of current trunk.

OK to commit?

Cheers,
Ian


2013-04-26  Ian Bolton  

   * gcc.target/aarch64/ands.c: New test.
   * gcc.target/aarch64/ands2.c: Likewise.

Index: gcc/testsuite/gcc.target/aarch64/ands2.c
===
--- gcc/testsuite/gcc.target/aarch64/ands2.c(revision 0)
+++ gcc/testsuite/gcc.target/aarch64/ands2.c(revision 0)
@@ -0,0 +1,157 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps -fno-inline" } */
+
+extern void abort (void);
+
+int
+ands_si_test1 (int a, int b, int c)
+{
+  int d = a & b;
+
+  /* { dg-final { scan-assembler-not "ands\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } } */
+  /* { dg-final { scan-assembler "and\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+ands_si_test2 (int a, int b, int c)
+{
+  int d = a & 0x;
+
+  /* { dg-final { scan-assembler-not "ands\tw\[0-9\]+, w\[0-9\]+, -1717986919" } } */
+  /* { dg-final { scan-assembler "and\tw\[0-9\]+, w\[0-9\]+, -1717986919" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+ands_si_test3 (int a, int b, int c)
+{
+  int d = a & (b << 3);
+
+  /* { dg-final { scan-assembler-not "ands\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" } } */
+  /* { dg-final { scan-assembler "and\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+typedef long long s64;
+
+s64
+ands_di_test1 (s64 a, s64 b, s64 c)
+{
+  s64 d = a & b;
+
+  /* { dg-final { scan-assembler-not "ands\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" } } */
+  /* { dg-final { scan-assembler "and\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+s64
+ands_di_test2 (s64 a, s64 b, s64 c)
+{
+  s64 d = a & 0xll;
+
+  /* { dg-final { scan-assembler-not "ands\tx\[0-9\]+, x\[0-9\]+, -6148914691236517206" } } */
+  /* { dg-final { scan-assembler "and\tx\[0-9\]+, x\[0-9\]+, -6148914691236517206" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+s64
+ands_di_test3 (s64 a, s64 b, s64 c)
+{
+  s64 d = a & (b << 3);
+
+  /* { dg-final { scan-assembler-not "ands\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, lsl 3" } } */
+  /* { dg-final { scan-assembler "and\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, lsl 3" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+main ()
+{
+  int x;
+  s64 y;
+
+  x = ands_si_test1 (29, 4, 5);
+  if (x != 13)
+abort ();
+
+  x = ands_si_test1 (5, 2, 20);
+  if (x != 25)
+abort ();
+
+  x = ands_si_test2 (29, 4, 5);
+  if (x != 34)
+abort ();
+
+  x = ands_si_test2 (1024, 2, 20);
+  if (x != 1044)
+abort ();
+
+  x = ands_si_test3 (35, 4, 5);
+  if (x != 41)
+abort ();
+
+  x = ands_si_test3 (5, 2, 20);
+  if (x != 25)
+abort ();
+
+  y = ands_di_test1 (0x13029ll,
+ 0x32004ll,
+ 0x505050505ll);
+
+  if (y != ((0x13029ll & 0x32004ll) + 0x32004ll + 0x505050505ll))
+abort ();
+
+  y = ands_di_test1 (0x5000500050005ll,
+ 0x2111211121112ll,
+ 0x02020ll);
+  if (y != 0x5000500052025ll)
+abort ();
+
+  y = ands_di_test2 (0x13029ll,
+ 0x32004ll,
+ 0x505050505ll);
+  if (y != ((0x13029ll & 0xll) + 0x32004ll + 0x505050505ll))
+abort ();
+
+  y = ands_di_test2 (0x540004100ll,
+ 0x32004ll,
+ 0x805050205ll);
+  if (y != (0x540004100ll + 0x805050205ll))
+abort ();
+
+  y = ands_di_test3 (0x13029ll,
+ 0x06408ll,
+ 0x505050505ll);
+  if (y != ((0x13029ll & (0x06408ll << 3))
+   + 0x06408ll + 0x505050505ll))
+abort ();
+
+  y = ands_di_test3 (0x130002900ll,
+ 0x08808ll,
+ 0x505050505ll);
+  if (y != (0x130002900ll + 0x505050505ll))
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/gcc.target/aarch64/ands.c
===
--- gcc/testsuite/gcc.target/aarch64/ands.c (revision 0)
+++ gcc/testsuite/gcc.target/aarch64/ands.c (revision 0)
@@ -0,0 +1,151 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps -fno-inline" } */
+
+extern void abort (void);
+
+int
+ands_si_test1 (int a, int b, int c)
+{
+  int d = a & b;
+
+  /* { dg-final { scan-assembler "ands\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } } */
+  if (d == 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+ands_si_test2 (int a, int

[AArch64] Fix order of modes to lroundmn2 standard names.

2013-04-26 Thread James Greenhalgh


Hi,

The vector versions of lroundmn2, lfloormn2, lceilmn2 convert from
mode m to mode n. The current implementation has this backwards.
For correctness, this patch swaps the n and m parameters.
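
The ordering matters because the standard names are built as "l" + operation + source mode m + destination mode n + "2". A minimal sketch of that composition follows; `conversion_pattern_name` is a hypothetical helper written for illustration only, and the mode strings are the ones that appear in the builtin names elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write the standard pattern name for OP converting from floating-point
   mode M to integer mode N into BUF.  Return false if BUF is too small.  */
bool
conversion_pattern_name (char *buf, size_t len, const char *op,
			 const char *m, const char *n)
{
  assert (buf != NULL && len > 0);
  int written = snprintf (buf, len, "l%s%s%s2", op, m, n);
  return written >= 0 && (size_t) written < len;
}

/* Usage: the floating-point source mode comes first, the integer
   destination mode second.  */
void
conversion_pattern_name_demo (void)
{
  char buf[32];
  bool ok = conversion_pattern_name (buf, sizeof buf, "floor", "v2df", "v2di");
  assert (ok);
  assert (strcmp (buf, "lfloorv2dfv2di2") == 0);
}
```

With the modes swapped, the same call would produce "lfloorv2div2df2", which is the kind of backwards name this patch corrects.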

There is no need to backport this patch as a bug fix, as nothing uses
this name for expansion in 4.8.

Regression tested on aarch64-none-elf with no regressions.

Thanks,
James Greenhalgh

---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/aarch64-simd.md
(l2): Rename to...
(l2): ... This.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 5f14cc6..b716fbe 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1257,7 +1257,7 @@
 
 ;; Vector versions of the fcvt standard patterns.
 ;; Expands to lbtrunc, lround, lceil, lfloor
-(define_expand "l2"
+(define_expand "l2"
   [(set (match_operand: 0 "register_operand")
 	(FIXUORS: (unspec:
 			   [(match_operand:VDQF 1 "register_operand")]

[AArch64] Map fcvt intrinsics to builtin name directly.

2013-04-26 Thread James Greenhalgh


Hi,

This patch uses the new builtin-mapping infrastructure
to map the fcvt family of builtins directly to their
GCC standard pattern name.

Regression tested on aarch64-none-elf with no regressions.

Thanks,
James

---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/aarch64-builtins.c
(aarch64_builtin_vectorized_function): Use new names for
fcvt builtins.
* config/aarch64/aarch64-simd-builtins.def (fcvtzs): Split as...
(lbtruncv2sf, lbtruncv4sf, lbtruncv2df): ...This.
(fcvtzu): Split as...
(lbtruncuv2sf, lbtruncuv4sf, lbtruncuv2df): ...This.
(fcvtas): Split as...
(lroundv2sf, lroundv4sf, lroundv2df, lroundsf, lrounddf): ...This.
(fcvtau): Split as...
(lrounduv2sf, lrounduv4sf, lrounduv2df, lroundusf, lroundudf): ...This.
(fcvtps): Split as...
(lceilv2sf, lceilv4sf, lceilv2df): ...This.
(fcvtpu): Split as...
(lceiluv2sf, lceiluv4sf, lceiluv2df, lceilusf, lceiludf): ...This.
(fcvtms): Split as...
(lfloorv2sf, lfloorv4sf, lfloorv2df): ...This.
(fcvtmu): Split as...
(lflooruv2sf, lflooruv4sf, lflooruv2df, lfloorusf, lfloorudf): ...This.
(lfrintnv2sf, lfrintnv4sf, lfrintnv2df, lfrintnsf, lfrintndf): New.
(lfrintnuv2sf, lfrintnuv4sf, lfrintnuv2df): Likewise.
(lfrintnusf, lfrintnudf): Likewise.
* config/aarch64/aarch64-simd.md
(l2): Convert to
define_insn.
(aarch64_fcvt): Remove.
* config/aarch64/iterators.md (FCVT): Include UNSPEC_FRINTN.
(fcvt_pattern): Likewise.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 08bfe01..f540568 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1245,9 +1245,33 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in)
   (out_mode == N##Imode && out_n == C \
&& in_mode == N##Fmode && in_n == C)
 	case BUILT_IN_LFLOOR:
-	  return AARCH64_FIND_FRINT_VARIANT (fcvtms);
+	  {
+	tree new_tree = NULL_TREE;
+	if (AARCH64_CHECK_BUILTIN_MODE (2, D))
+	  new_tree =
+		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lfloorv2dfv2di];
+	else if (AARCH64_CHECK_BUILTIN_MODE (4, S))
+	  new_tree =
+		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lfloorv4sfv4si];
+	else if (AARCH64_CHECK_BUILTIN_MODE (2, S))
+	  new_tree =
+		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lfloorv2sfv2si];
+	return new_tree;
+	  }
 	case BUILT_IN_LCEIL:
-	  return AARCH64_FIND_FRINT_VARIANT (fcvtps);
+	  {
+	tree new_tree = NULL_TREE;
+	if (AARCH64_CHECK_BUILTIN_MODE (2, D))
+	  new_tree =
+		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lceilv2dfv2di];
+	else if (AARCH64_CHECK_BUILTIN_MODE (4, S))
+	  new_tree =
+		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lceilv4sfv4si];
+	else if (AARCH64_CHECK_BUILTIN_MODE (2, S))
+	  new_tree =
+		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lceilv2sfv2si];
+	return new_tree;
+	  }
 	default:
 	  return NULL_TREE;
   }
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 9b06a68..4654bd5 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -256,15 +256,59 @@
   BUILTIN_VDQF (UNOP, round, 2)
   BUILTIN_VDQF (UNOP, frintn, 2)
 
-  /* Implemented by aarch64_fcvt.  */
-  BUILTIN_VDQF (UNOP, fcvtzs, 0)
-  BUILTIN_VDQF (UNOP, fcvtzu, 0)
-  BUILTIN_VDQF (UNOP, fcvtas, 0)
-  BUILTIN_VDQF (UNOP, fcvtau, 0)
-  BUILTIN_VDQF (UNOP, fcvtps, 0)
-  BUILTIN_VDQF (UNOP, fcvtpu, 0)
-  BUILTIN_VDQF (UNOP, fcvtms, 0)
-  BUILTIN_VDQF (UNOP, fcvtmu, 0)
+  /* Implemented by l2.  */
+  VAR1 (UNOP, lbtruncv2sf, 2, v2si)
+  VAR1 (UNOP, lbtruncv4sf, 2, v4si)
+  VAR1 (UNOP, lbtruncv2df, 2, v2di)
+
+  VAR1 (UNOP, lbtruncuv2sf, 2, v2si)
+  VAR1 (UNOP, lbtruncuv4sf, 2, v4si)
+  VAR1 (UNOP, lbtruncuv2df, 2, v2di)
+
+  VAR1 (UNOP, lroundv2sf, 2, v2si)
+  VAR1 (UNOP, lroundv4sf, 2, v4si)
+  VAR1 (UNOP, lroundv2df, 2, v2di)
+  /* Implemented by l2.  */
+  VAR1 (UNOP, lroundsf, 2, si)
+  VAR1 (UNOP, lrounddf, 2, di)
+
+  VAR1 (UNOP, lrounduv2sf, 2, v2si)
+  VAR1 (UNOP, lrounduv4sf, 2, v4si)
+  VAR1 (UNOP, lrounduv2df, 2, v2di)
+  VAR1 (UNOP, lroundusf, 2, si)
+  VAR1 (UNOP, lroundudf, 2, di)
+
+  VAR1 (UNOP, lceilv2sf, 2, v2si)
+  VAR1 (UNOP, lceilv4sf, 2, v4si)
+  VAR1 (UNOP, lceilv2df, 2, v2di)
+
+  VAR1 (UNOP, lceiluv2sf, 2, v2si)
+  VAR1 (UNOP, lceiluv4sf, 2, v4si)
+  VAR1 (UNOP, lceiluv2df, 2, v2di)
+  VAR1 (UNOP, lceilusf, 2, si)
+  VAR1 (UNOP, lceiludf, 2, di)
+
+  VAR1 (UNOP, lfloorv2sf, 2, v2si)
+  VAR1 (UNOP, lfloorv4sf, 2, v4si)
+  VAR1 (UNOP, lfloorv2df, 2, v2di)
+
+  VAR1 (UNOP, lflooruv2sf, 2, v2si)
+  VAR1 (UNOP, lflooruv4sf, 2, v4si)
+  VAR1 (UNOP, lflooruv2df, 2, v2di)
+  VAR1 (UNOP, lfloorusf, 2, si)
+  VAR1 (UNOP, lfloorudf, 2, di)
+
+  VAR1 (UNOP, lfrintnv2sf, 2, v2si)
+  VAR

Re: [AArch64] Fix order of modes to lroundmn2 standard names.

2013-04-26 Thread Marcus Shawcroft


On 26/04/13 14:09, James Greenhalgh wrote:


Hi,

The vector versions of lroundmn2, lfloormn2, lceilmn2 convert from
mode m to mode n. The current implementation has this backwards.
For correctness, this patch swaps the n and m parameters.

There is no need to backport this patch as a bug fix, as nothing uses
this name for expansion in 4.8.

Regression tested on aarch64-none-elf with no regressions.

Thanks,
James Greenhalgh

---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/aarch64-simd.md
(l2): Rename to...
(l2): ... This.



OK
/Marcus

Re: [PATCH, ARM] Remove incscc and decscc patterns from thumb2.md

2013-04-26 Thread Richard Earnshaw


On 26/04/13 12:13, Greta Yorsh wrote:

This patch removes dead patterns for incscc and decscc from thumb2.md.

It's a cleanup after this patch:
http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00955.html
which removed incscc and decscc expanders and the corresponding patterns
from arm.md, but not from thumb2.md.

No regression on qemu for arm-none-eabi cortex-a15 thumb.

Ok for trunk?

Thanks,
Greta

gcc/

2013-04-05  Greta Yorsh  

* config/arm/thumb2.md (thumb2_incscc, thumb2_decscc): Delete.



OK.

R.

[AArch64] Add vector int to float conversions.

2013-04-26 Thread James Greenhalgh


Hi,

This patch wires up builtins for int to float conversions in
Tree, and uint to float conversions in RTL.

Regression tested for aarch64-none-elf with no regressions.

Thanks,
James

---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/aarch64-builtins.c
(aarch64_fold_builtin): Fold float conversions.
* config/aarch64/aarch64-simd-builtins.def
(floatv2si, floatv4si, floatv2di): New.
(floatunsv2si, floatunsv4si, floatunsv2di): Likewise.
* config/aarch64/aarch64-simd.md
(2): New, expands to float and floatuns.
* config/aarch64/iterators.md (FLOATUORS): New.
(optab): Add float, floatuns.
(su_optab): Likewise.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index f540568..d2e5136 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1296,6 +1296,11 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args,
   BUILTIN_VDQF (UNOP, abs, 2)
 	return fold_build1 (ABS_EXPR, type, args[0]);
 	break;
+  VAR1 (UNOP, floatv2si, 2, v2sf)
+  VAR1 (UNOP, floatv4si, 2, v4sf)
+  VAR1 (UNOP, floatv2di, 2, v2df)
+	return fold_build1 (FLOAT_EXPR, type, args[0]);
+	break;
   default:
 	break;
 }
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 4654bd5..029e091 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -310,6 +310,15 @@
   VAR1 (UNOP, lfrintnusf, 2, si)
   VAR1 (UNOP, lfrintnudf, 2, di)
 
+  /* Implemented by 2.  */
+  VAR1 (UNOP, floatv2si, 2, v2sf)
+  VAR1 (UNOP, floatv4si, 2, v4sf)
+  VAR1 (UNOP, floatv2di, 2, v2df)
+
+  VAR1 (UNOP, floatunsv2si, 2, v2sf)
+  VAR1 (UNOP, floatunsv4si, 2, v4sf)
+  VAR1 (UNOP, floatunsv2di, 2, v2df)
+
   /* Implemented by
  aarch64_.  */
   BUILTIN_VALL (BINOP, zip1, 0)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 4c678ba..067c849 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1257,6 +1257,16 @@
(set_attr "simd_mode" "")]
 )
 
+(define_insn "2"
+  [(set (match_operand:VDQF 0 "register_operand" "=w")
+	(FLOATUORS:VDQF
+	  (match_operand: 1 "register_operand" "w")))]
+  "TARGET_SIMD"
+  "cvtf\\t%0., %1."
+  [(set_attr "simd_type" "simd_icvtf")
+   (set_attr "simd_mode" "")]
+)
+
 (define_insn "aarch64_vmls"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
(minus:VDQF (match_operand:VDQF 1 "register_operand" "0")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 5c769f8..8668d3f 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -530,6 +530,9 @@
 ;; Iterator for integer conversions
 (define_code_iterator FIXUORS [fix unsigned_fix])
 
+;; Iterator for float conversions
+(define_code_iterator FLOATUORS [float unsigned_float])
+
 ;; Code iterator for variants of vector max and min.
 (define_code_iterator MAXMIN [smax smin umax umin])
 
@@ -557,6 +560,8 @@
 			 (zero_extend "zero_extend")
 			 (sign_extract "extv")
 			 (zero_extract "extzv")
+			 (float "float")
+			 (unsigned_float "floatuns")
 			 (and "and")
 			 (ior "ior")
 			 (xor "xor")
@@ -579,6 +584,7 @@
 (define_code_attr su_optab [(sign_extend "") (zero_extend "u")
 			(div "") (udiv "u")
 			(fix "") (unsigned_fix "u")
+			(float "s") (unsigned_float "u")
 			(ss_plus "s") (us_plus "u")
 			(ss_minus "s") (us_minus "u")])

[PATCH, AArch64] Support BICS instruction in the backend

2013-04-26 Thread Ian Bolton

With these patterns, we can now generate BICS in the appropriate places.

I've included test cases.

This has been run on linux and bare-metal regression tests.

OK to commit?

Cheers,
Ian



2013-04-26  Ian Bolton  

gcc/
* config/aarch64/aarch64.md (*and_one_cmpl3_compare0):
New pattern.
(*and_one_cmplsi3_compare0_uxtw): Likewise.
(*and_one_cmpl_3_compare0): Likewise.
(*and_one_cmpl_si3_compare0_uxtw): Likewise.

testsuite/
* gcc.target/aarch64/bics.c: New test.
* gcc.target/aarch64/bics2.c: Likewise.

Index: gcc/testsuite/gcc.target/aarch64/bics.c
===
--- gcc/testsuite/gcc.target/aarch64/bics.c (revision 0)
+++ gcc/testsuite/gcc.target/aarch64/bics.c (revision 0)
@@ -0,0 +1,107 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps -fno-inline" } */
+
+extern void abort (void);
+
+int
+bics_si_test1 (int a, int b, int c)
+{
+  int d = a & ~b;
+
+  /* { dg-final { scan-assembler "bics\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } } */
+  if (d == 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+bics_si_test2 (int a, int b, int c)
+{
+  int d = a & ~(b << 3);
+
+  /* { dg-final { scan-assembler "bics\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" } } */
+  if (d == 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+typedef long long s64;
+
+s64
+bics_di_test1 (s64 a, s64 b, s64 c)
+{
+  s64 d = a & ~b;
+
+  /* { dg-final { scan-assembler "bics\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" } } */
+  if (d == 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+s64
+bics_di_test2 (s64 a, s64 b, s64 c)
+{
+  s64 d = a & ~(b << 3);
+
+  /* { dg-final { scan-assembler "bics\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, lsl 3" } } */
+  if (d == 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+main ()
+{
+  int x;
+  s64 y;
+
+  x = bics_si_test1 (29, ~4, 5);
+  if (x != ((29 & 4) + ~4 + 5))
+abort ();
+
+  x = bics_si_test1 (5, ~2, 20);
+  if (x != 25)
+abort ();
+
+  x = bics_si_test2 (35, ~4, 5);
+  if (x != ((35 & ~(~4 << 3)) + ~4 + 5))
+abort ();
+
+  x = bics_si_test2 (96, ~2, 20);
+  if (x != 116)
+abort ();
+
+  y = bics_di_test1 (0x13029ll,
+ ~0x32004ll,
+ 0x505050505ll);
+
+  if (y != ((0x13029ll & 0x32004ll) + ~0x32004ll + 0x505050505ll))
+abort ();
+
+  y = bics_di_test1 (0x5000500050005ll,
+ ~0x2111211121112ll,
+ 0x02020ll);
+  if (y != 0x5000500052025ll)
+abort ();
+
+  y = bics_di_test2 (0x13029ll,
+ ~0x06408ll,
+ 0x505050505ll);
+  if (y != ((0x13029ll & ~(~0x06408ll << 3))
+   + ~0x06408ll + 0x505050505ll))
+abort ();
+
+  y = bics_di_test2 (0x130002900ll,
+ ~0x08808ll,
+ 0x505050505ll);
+  if (y != (0x130002900ll + 0x505050505ll))
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/gcc.target/aarch64/bics2.c
===
--- gcc/testsuite/gcc.target/aarch64/bics2.c(revision 0)
+++ gcc/testsuite/gcc.target/aarch64/bics2.c(revision 0)
@@ -0,0 +1,111 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps -fno-inline" } */
+
+extern void abort (void);
+
+int
+bics_si_test1 (int a, int b, int c)
+{
+  int d = a & ~b;
+
+  /* { dg-final { scan-assembler-not "bics\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } } */
+  /* { dg-final { scan-assembler "bic\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+bics_si_test2 (int a, int b, int c)
+{
+  int d = a & ~(b << 3);
+
+  /* { dg-final { scan-assembler-not "bics\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" } } */
+  /* { dg-final { scan-assembler "bic\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+typedef long long s64;
+
+s64
+bics_di_test1 (s64 a, s64 b, s64 c)
+{
+  s64 d = a & ~b;
+
+  /* { dg-final { scan-assembler-not "bics\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" } } */
+  /* { dg-final { scan-assembler "bic\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+s64
+bics_di_test2 (s64 a, s64 b, s64 c)
+{
+  s64 d = a & ~(b << 3);
+
+  /* { dg-final { scan-assembler-not "bics\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, lsl 3" } } */
+  /* { dg-final { scan-assembler "bic\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, lsl 3" } } */
+  if (d <= 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+main ()
+{
+  int x;
+  s64 y;
+
+  x = bics_si_test1 (29, ~4, 5);
+  if (x != ((29 & 4) + ~4 + 5))
+abort ();
+
+  x = bics_si_test1 (5, ~2, 20);
+  if (x != 25)
+abort ();
+
+  x = bics_si_test2 (35, ~4, 5);
+  if (x != ((35 & ~(~4 << 3)) + ~4 + 5))
+abort ();
+
+  x = bics_si_test2 (96

[AArch64] Implement vector float->double widening and double->float narrowing.

2013-04-26 Thread James Greenhalgh


Hi,

gcc.dg/vect/vect-float-truncate-1.c and
gcc.dg/vect/vect-float-extend-1.c
were failing because widening floats to doubles and narrowing doubles
to floats were not wired up.

This patch fixes that by implementing the standard names:

vec_pack_trunc_v2df
Taking two vectors of V2DFmode and returning one vector of V4SF mode.

`vec_unpacks_float_hi_v4sf', `vec_unpacks_float_lo_v4sf'
Taking one vector of V4SF mode and splitting it to two vectors of V2DF mode.
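
As a rough illustration of the element flow these standard names describe, here is a scalar C model. The `model_*` function names are hypothetical and not part of GCC, and the lane order (low half first) is an assumption read off the vec_select indices and the fcvtn/fcvtn2 sequence in the patch, for little-endian lane numbering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of vec_pack_trunc_v2df: narrow two
   2-element double vectors into one 4-element float vector.
   LO fills lanes 0-1 and HI fills lanes 2-3 (assumed lane order).  */
void
model_vec_pack_trunc_v2df (float out[4], const double lo[2],
			   const double hi[2])
{
  assert (out != NULL && lo != NULL && hi != NULL);
  out[0] = (float) lo[0];
  out[1] = (float) lo[1];
  out[2] = (float) hi[0];
  out[3] = (float) hi[1];
}

/* Hypothetical scalar model of vec_unpacks_lo_v4sf: widen lanes 0-1.  */
void
model_vec_unpacks_lo_v4sf (double out[2], const float in[4])
{
  assert (out != NULL && in != NULL);
  out[0] = (double) in[0];
  out[1] = (double) in[1];
}

/* Hypothetical scalar model of vec_unpacks_hi_v4sf: widen lanes 2-3.  */
void
model_vec_unpacks_hi_v4sf (double out[2], const float in[4])
{
  assert (out != NULL && in != NULL);
  out[0] = (double) in[2];
  out[1] = (double) in[3];
}
```

Packing and then unpacking only round-trips values that are exactly representable as float; anything else is rounded by the narrowing step.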

Patch regression tested on aarch64-none-elf with no regressions,
and shown to fix the bug.

Thanks,
James
---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/aarch64-simd-builtins.def (vec_unpacks_hi_): New.
(float_truncate_hi_): Likewise.
(float_extend_lo_): Likewise.
(float_truncate_lo_): Likewise.
* config/aarch64/aarch64-simd.md (vec_unpacks_lo_v4sf): New.
(aarch64_float_extend_lo_v2df): Likewise.
(vec_unpacks_hi_v4sf): Likewise.
(aarch64_float_truncate_lo_v2sf): Likewise.
(aarch64_float_truncate_hi_v4sf): Likewise.
(vec_pack_trunc_v2df): Likewise.
(vec_pack_trunc_df): Likewise.
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 029e091..2aa9877 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -338,3 +338,9 @@
   BUILTIN_VDQF (BINOP, frecps, 0)
 
   BUILTIN_VDQF (UNOP, abs, 2)
+
+  VAR1 (UNOP, vec_unpacks_hi_, 10, v4sf)
+  VAR1 (BINOP, float_truncate_hi_, 0, v4sf)
+
+  VAR1 (UNOP, float_extend_lo_, 0, v2df)
+  VAR1 (UNOP, float_truncate_lo_, 0, v2sf)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 067c849..4546094 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1267,6 +1267,108 @@
(set_attr "simd_mode" "")]
 )
 
+;; Conversions between vectors of floats and doubles.
+;; Contains a mix of patterns to match standard pattern names
+;; and those for intrinsics.
+
+;; Float widening operations.
+
+(define_insn "vec_unpacks_lo_v4sf"
+  [(set (match_operand:V2DF 0 "register_operand" "=w")
+	(float_extend:V2DF
+	  (vec_select:V2SF
+	(match_operand:V4SF 1 "register_operand" "w")
+	(parallel [(const_int 0) (const_int 1)])
+	  )))]
+  "TARGET_SIMD"
+  "fcvtl\\t%0.2d, %1.2s"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V2DF")]
+)
+
+(define_insn "aarch64_float_extend_lo_v2df"
+  [(set (match_operand:V2DF 0 "register_operand" "=w")
+	(float_extend:V2DF
+	  (match_operand:V2SF 1 "register_operand" "w")))]
+  "TARGET_SIMD"
+  "fcvtl\\t%0.2d, %1.2s"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V2DF")]
+)
+
+(define_insn "vec_unpacks_hi_v4sf"
+  [(set (match_operand:V2DF 0 "register_operand" "=w")
+	(float_extend:V2DF
+	  (vec_select:V2SF
+	(match_operand:V4SF 1 "register_operand" "w")
+	(parallel [(const_int 2) (const_int 3)])
+	  )))]
+  "TARGET_SIMD"
+  "fcvtl2\\t%0.2d, %1.4s"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V2DF")]
+)
+
+;; Float narrowing operations.
+
+(define_insn "aarch64_float_truncate_lo_v2sf"
+  [(set (match_operand:V2SF 0 "register_operand" "=w")
+  (float_truncate:V2SF
+	(match_operand:V2DF 1 "register_operand" "w")))]
+  "TARGET_SIMD"
+  "fcvtn\\t%0.2s, %1.2d"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V2SF")]
+)
+
+(define_insn "aarch64_float_truncate_hi_v4sf"
+  [(set (match_operand:V4SF 0 "register_operand" "=w")
+(vec_concat:V4SF
+  (match_operand:V2SF 1 "register_operand" "0")
+  (float_truncate:V2SF
+	(match_operand:V2DF 2 "register_operand" "w"))))]
+  "TARGET_SIMD"
+  "fcvtn2\\t%0.4s, %2.2d"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V4SF")]
+)
+
+(define_expand "vec_pack_trunc_v2df"
+  [(set (match_operand:V4SF 0 "register_operand")
+  (vec_concat:V4SF
+	(float_truncate:V2SF
+	(match_operand:V2DF 1 "register_operand"))
+	(float_truncate:V2SF
+	(match_operand:V2DF 2 "register_operand"))
+	  ))]
+  "TARGET_SIMD"
+  {
+rtx tmp = gen_reg_rtx (V2SFmode);
+emit_insn (gen_aarch64_float_truncate_lo_v2sf (tmp, operands[1]));
+emit_insn (gen_aarch64_float_truncate_hi_v4sf (operands[0],
+		   tmp, operands[2]));
+DONE;
+  }
+)
+
+(define_expand "vec_pack_trunc_df"
+  [(set (match_operand:V2SF 0 "register_operand")
+  (vec_concat:V2SF
+	(float_truncate:SF
+	(match_operand:DF 1 "register_operand"))
+	(float_truncate:SF
+	(match_operand:DF 2 "register_operand"))
+	  ))]
+  "TARGET_SIMD"
+  {
+rtx tmp = gen_reg_rtx (V2SFmode);
+emit_insn (gen_move_lo_quad_v2df (tmp, operands[1]));
+emit_insn (gen_move_hi_quad_v2df (tmp, operands[2]));
+emit_insn (gen_aarch64_float_truncate_lo_v2sf (operands[0], tmp));
+DONE;
+  }
+)
+
 (define_insn "aarch64_vmls"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
(minus:VDQF (match_operand:VDQF 1 "reg

Re: [PATCH, AArch64] Testcases for ANDS instruction

2013-04-26 Thread Richard Earnshaw


On 26/04/13 13:54, Ian Bolton wrote:

I made some testcases to go with my implementation of ANDS in the backend,
but Naveen Hurugalawadi got the ANDS patterns in before me!

I'm now just left with the testcases, but they are still worth adding, so
here they are.

Tests are working correctly as of current trunk.

OK to commit?

Cheers,
Ian


2013-04-26  Ian Bolton  

* gcc.target/aarch64/ands.c: New test.
* gcc.target/aarch64/ands2.c: Likewise


aarch64-ands-tests-svn-patch-v2.txt


Index: gcc/testsuite/gcc.target/aarch64/ands2.c
===
--- gcc/testsuite/gcc.target/aarch64/ands2.c(revision 0)
+++ gcc/testsuite/gcc.target/aarch64/ands2.c(revision 0)
@@ -0,0 +1,157 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps -fno-inline" } */
+
+extern void abort (void);
+
+int
+ands_si_test1 (int a, int b, int c)
+{
+  int d = a & b;
+
+  /* { dg-final { scan-assembler-not "ands\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } } */


This rule

+  /* { dg-final { scan-assembler "and\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } } */


Will match anything that this rule


+  /* { dg-final { scan-assembler "and\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" } } */


matches (though not vice versa).

Similarly for the x register variants.



R.
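
The overlap described in this review can be checked outside the testsuite. The sketch below is only an approximation: it uses POSIX extended regular expressions from <regex.h> as a stand-in for the Tcl regexps that scan-assembler actually uses, and `pattern_matches`, `plain_and` and `shifted_and` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* The two patterns from the test, with the Tcl escaping removed.
   "\t" is a literal tab character in the C string.  */
const char plain_and[] = "and\tw[0-9]+, w[0-9]+, w[0-9]+";
const char shifted_and[] = "and\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3";

/* Return 1 if the POSIX extended regex PATTERN matches anywhere in
   TEXT, 0 if it does not, and -1 if PATTERN fails to compile.  */
int
pattern_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

Because the match is unanchored, the shorter pattern is satisfied by the prefix of a shifted instruction, so it cannot tell the two forms apart; only the longer pattern is specific to the `lsl 3` form.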

[AArch64] Vectorize over more math.h functions.

2013-04-26 Thread James Greenhalgh


Hi,

This patch adds float -> int builtins to the set
of builtins we can try to vectorize in aarch64_builtin_vectorized_function.

In particular, we add BUILT_IN_IFLOORF, BUILT_IN_ICEILF, BUILT_IN_LROUND,
BUILT_IN_IROUNDF.

The BUILT_IN_LROUND cases won't be triggered unless -ffast-math
or something else which turns off inexact errors is enabled.

Regression tested for aarch64-none-elf with no regressions.

Thanks,
James

---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/aarch64-builtins.c
(aarch64_builtin_vectorized_function): Vectorize over ifloorf,
iceilf, lround, iroundf.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index d2e5136..53d2c6a 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1245,6 +1245,7 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in)
   (out_mode == N##Imode && out_n == C \
&& in_mode == N##Fmode && in_n == C)
 	case BUILT_IN_LFLOOR:
+	case BUILT_IN_IFLOORF:
 	  {
 	tree new_tree = NULL_TREE;
 	if (AARCH64_CHECK_BUILTIN_MODE (2, D))
@@ -1259,6 +1260,7 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in)
 	return new_tree;
 	  }
 	case BUILT_IN_LCEIL:
+	case BUILT_IN_ICEILF:
 	  {
 	tree new_tree = NULL_TREE;
 	if (AARCH64_CHECK_BUILTIN_MODE (2, D))
@@ -1272,6 +1274,22 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in)
 		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lceilv2sfv2si];
 	return new_tree;
 	  }
+	case BUILT_IN_LROUND:
+	case BUILT_IN_IROUNDF:
+	  {
+	tree new_tree = NULL_TREE;
+	if (AARCH64_CHECK_BUILTIN_MODE (2, D))
+	  new_tree =
+		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lroundv2dfv2di];
+	else if (AARCH64_CHECK_BUILTIN_MODE (4, S))
+	  new_tree =
+		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lroundv4sfv4si];
+	else if (AARCH64_CHECK_BUILTIN_MODE (2, S))
+	  new_tree =
+		aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_lroundv2sfv2si];
+	return new_tree;
+	  }
+
 	default:
 	  return NULL_TREE;
   }

[AArch64] Add vector fix, fixuns, fix_trunc, fixuns_trunc standard patterns

2013-04-26 Thread James Greenhalgh


Hi,

This patch enables vectorization over conversions by implementing the
fix, fixuns, fix_trunc, fixuns_trunc, and ftrunc standard pattern names.

Each of these is implemented by the frintz instruction
(round towards zero).
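
For reference, round-towards-zero is also what a plain C float-to-integer cast does, so loops like the following are the kind of source these standard names are meant to cover. This is a minimal sketch: the function names are hypothetical, and whether GCC actually vectorizes them depends on the compiler version and options.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: signed conversion.  A C cast from float to
   int discards the fractional part, i.e. rounds towards zero.  */
void
fix_trunc_loop (int *out, const float *in, size_t n)
{
  assert (out != NULL && in != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = (int) in[i];
}

/* Hypothetical example: unsigned conversion.  Inputs must be
   non-negative and in range for the result to be defined.  */
void
fixuns_trunc_loop (unsigned int *out, const float *in, size_t n)
{
  assert (out != NULL && in != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = (unsigned int) in[i];
}
```

Note that negative inputs round up towards zero (-1.9 becomes -1), which is what distinguishes this mode from floor.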

The expanders for these are blank as they are already
implemented by the lrint standard patterns. We are
just connecting the dots for another set of standard names.

Regression tested for aarch64-none-elf with no regressions.

Thanks,
James

---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/aarch64-simd.md
(2): New, maps to fix, fixuns.
(2): New, maps to
fix_trunc, fixuns_trunc.
(ftrunc2): New.
* config/aarch64/iterators.md (optab): Add fix, fixuns.
(fix_trunc_optab): New.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 4546094..32ea587 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1257,6 +1257,29 @@
(set_attr "simd_mode" "")]
 )
 
+(define_expand "2"
+  [(set (match_operand: 0 "register_operand")
+	(FIXUORS: (unspec:
+			   [(match_operand:VDQF 1 "register_operand")]
+			   UNSPEC_FRINTZ)))]
+  "TARGET_SIMD"
+  {})
+
+(define_expand "2"
+  [(set (match_operand: 0 "register_operand")
+	(FIXUORS: (unspec:
+			   [(match_operand:VDQF 1 "register_operand")]
+			   UNSPEC_FRINTZ)))]
+  "TARGET_SIMD"
+  {})
+
+(define_expand "ftrunc2"
+  [(set (match_operand:VDQF 0 "register_operand")
+	(unspec:VDQF [(match_operand:VDQF 1 "register_operand")]
+		  UNSPEC_FRINTZ))]
+  "TARGET_SIMD"
+  {})
+
 (define_insn "2"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
 	(FLOATUORS:VDQF
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 8668d3f..d774c4c 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -560,6 +560,8 @@
 			 (zero_extend "zero_extend")
 			 (sign_extract "extv")
 			 (zero_extract "extzv")
+			 (fix "fix")
+			 (unsigned_fix "fixuns")
 			 (float "float")
 			 (unsigned_float "floatuns")
 			 (and "and")
@@ -580,6 +582,9 @@
 			 (lt "lt")
 			 (ge "ge")])
 
+(define_code_attr fix_trunc_optab [(fix "fix_trunc")
+   (unsigned_fix "fixuns_trunc")])
+
 ;; Optab prefix for sign/zero-extending operations
 (define_code_attr su_optab [(sign_extend "") (zero_extend "u")
 			(div "") (udiv "u")

Re: [C++ Patch] Define __cplusplus == 201300L for -std=c++1y

2013-04-26 Thread Gabriel Dos Reis

On Fri, Apr 26, 2013 at 7:41 AM, Jonathan Wakely  wrote:
>  But that probably can't be handled with a DR and
> so is too late for C++14.

BSI still gets to submit a NB comment demanding its removal.

-- Gaby

[AArch64] fcvt instructions - arm_neon.h changes.

2013-04-26 Thread James Greenhalgh


This patch updates the implementation in arm_neon.h of the vcvt
intrinsics. Where appropriate we use C statements, and where that is
not possible we fall back to builtins.

There were a number of errors with names and types in the current
revision of the file. These have been corrected.

Regression tested with no regressions.

Thanks,
James

---
gcc/

2013-04-26  James Greenhalgh  

* config/aarch64/arm_neon.h
(vcvt_f<32,64>_s<32,64>): Rewrite in C.
(vcvt_f<32,64>_s<32,64>): Rewrite using builtins.
(vcvt__f<32,64>_f<32,64>): Likewise.
(vcvt_<32,64>_f<32,64>): Likewise.
(vcvta_<32,64>_f<32,64>): Likewise.
(vcvtm_<32,64>_f<32,64>): Likewise.
(vcvtn_<32,64>_f<32,64>): Likewise.
(vcvtp_<32,64>_f<32,64>): Likewise.

gcc/testsuite/

2013-04-26  James Greenhalgh  

* gcc.target/aarch64/vect-vcvt.c: New.
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index c868a46..7d37744 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -5882,100 +5882,12 @@ vcntq_u8 (uint8x16_t a)
 
 /* vcvt_f32_f16 not supported */
 
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vcvt_f32_f64 (float64x2_t a)
-{
-  float32x2_t result;
-  __asm__ ("fcvtn %0.2s,%1.2d"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vcvt_f32_s32 (int32x2_t a)
-{
-  float32x2_t result;
-  __asm__ ("scvtf %0.2s, %1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vcvt_f32_u32 (uint32x2_t a)
-{
-  float32x2_t result;
-  __asm__ ("ucvtf %0.2s, %1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vcvt_f64_f32 (float32x2_t a)
-{
-  float64x2_t result;
-  __asm__ ("fcvtl %0.2d,%1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
-vcvt_f64_s64 (uint64x1_t a)
-{
-  float64x1_t result;
-  __asm__ ("scvtf %d0, %d1"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
-vcvt_f64_u64 (uint64x1_t a)
-{
-  float64x1_t result;
-  __asm__ ("ucvtf %d0, %d1"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
 /* vcvt_high_f16_f32 not supported */
 
 /* vcvt_high_f32_f16 not supported */
 
 static float32x2_t vdup_n_f32 (float32_t);
 
-__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
-vcvt_high_f32_f64 (float32x2_t a, float64x2_t b)
-{
-  float32x4_t result = vcombine_f32 (a, vdup_n_f32 (0.0f));
-  __asm__ ("fcvtn2 %0.4s,%2.2d"
-   : "+w"(result)
-   : "w"(b)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vcvt_high_f64_f32 (float32x4_t a)
-{
-  float64x2_t result;
-  __asm__ ("fcvtl2 %0.2d,%1.4s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
 #define vcvt_n_f32_s32(a, b)\
   __extension__ \
 ({  \
@@ -6024,160 +5936,6 @@ vcvt_high_f64_f32 (float32x4_t a)
result;  \
  })
 
-__extension__ static __inline int32x2_t __attribute__ ((__always_inline__))
-vcvt_s32_f32 (float32x2_t a)
-{
-  int32x2_t result;
-  __asm__ ("fcvtzs %0.2s, %1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline uint32x2_t __attribute__ ((__always_inline__))
-vcvt_u32_f32 (float32x2_t a)
-{
-  uint32x2_t result;
-  __asm__ ("fcvtzu %0.2s, %1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int32x2_t __attribute__ ((__always_inline__))
-vcvta_s32_f32 (float32x2_t a)
-{
-  int32x2_t result;
-  __asm__ ("fcvtas %0.2s, %1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline uint32x2_t __attribute__ ((__always_inline__))
-vcvta_u32_f32 (float32x2_t a)
-{
-  uint32x2_t result;
-  __asm__ ("fcvtau %0.2s, %1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline float64_

Re: [AArch64] Map frint intrinsics to standard pattern names directly.

2013-04-26 Thread Marcus Shawcroft

On 26 April 2013 13:42, James Greenhalgh  wrote:
>
> Hi,
>
> This patch maps the frint style intrinsics directly to their
> standard pattern name versions and adds support for frintn, which
> does not map to a standard pattern name.
>
> Regression tested on aarch64-none-elf with no issues.
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2013-04-26  James Greenhalgh  
>
> * config/aarch64/aarch64-builtins.c
> (aarch64_builtin_vectorized_function): Fold to standard pattern names.
> * config/aarch64/aarch64-simd-builtins.def (frintn): New.
> (frintz): Rename to...
> (btrunc): ...this.
> (frintp): Rename to...
> (ceil): ...this.
> (frintm): Rename to...
> (floor): ...this.
> (frinti): Rename to...
> (nearbyint): ...this.
> (frintx): Rename to...
> (rint): ...this.
> (frinta): Rename to...
> (round): ...this.
> * config/aarch64/aarch64-simd.md
> (aarch64_frint): Delete.
> (2): Convert to insn.
> * config/aarch64/aarch64.md (unspec): Add UNSPEC_FRINTN.
> * config/aarch64/iterators.md (FRINT): Add UNSPEC_FRINTN.
> (frint_pattern): Likewise.
> (frint_suffix): Likewise.

OK
/Marcus

Re: [AArch64] Convert NEON frint implementations to use builtins.

2013-04-26 Thread Marcus Shawcroft

On 26 April 2013 13:45, James Greenhalgh  wrote:
>
> Hi,
>
> This patch renames the vrnd intrinsics,
> which previously were vrnd
>
> At the same time, we move these intrinsics to an RTL-based intrinsic.
>
> Regression tested on aarch64-none-elf with no issues.
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2013-04-26  James Greenhalgh  
>
> * config/aarch64/arm_neon.h (vrndq_f<32, 64>): Rename to...
> (vrndq_f<32, 64>): ...This, implement using builtin.
> (vrnd_f32): Implement using builtins.
> (vrnd_f<32, 64>): New.
>
> gcc/testsuite/
>
> 2013-04-26  James Greenhalgh  
>
> * gcc.target/aarch64/vect-vrnd.c: New.

OK
/Marcus

[AArch64][Testsuite] Enable vect_uintfloat_cvt for AArch64.

2013-04-26 Thread James Greenhalgh


Hi,

While modifying all the vcvt builtins we've fixed enough bugs
that we can now enable vect_uintfloat_cvt for AArch64. Do that.
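
As a hedged illustration of what this effective-target covers, the loop below converts unsigned integers to floats, which is the shape of code the vect_uintfloat_cvt tests exercise. The function name is hypothetical and the sketch is not taken from the testsuite.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: unsigned int to float conversion loop.
   The unsigned interpretation matters for values above INT_MAX,
   which a signed conversion would treat as negative.  */
void
uint_to_float_loop (float *out, const unsigned int *in, size_t n)
{
  assert (out != NULL && in != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = (float) in[i];
}
```

The value 4000000000 is a useful probe: it is exactly representable as a float and only converts to a positive result if the unsigned form of the conversion is used.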

Patch tested to ensure all newly enabled tests succeed.

James
---
gcc/testsuite/

2013-04-26  James Greenhalgh  

* lib/target-supports.exp (vect_uintfloat_cvt): Enable for AArch64.
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 4604af6..33086c6 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2012,6 +2012,7 @@ proc check_effective_target_vect_uintfloat_cvt { } {
 	  || ([istarget powerpc*-*-*]
 		  && ![istarget powerpc-*-linux*paired*])
 	  || [istarget x86_64-*-*] 
+	  || [istarget aarch64*-*-*]
 	  || ([istarget arm*-*-*]
 		  && [check_effective_target_arm_neon_ok])} {
set et_vect_uintfloat_cvt_saved 1

Re: [PATCH, AArch64] Support BICS instruction in the backend

2013-04-26 Thread Marcus Shawcroft

+  /* { dg-final { scan-assembler "bics\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" } } */

+  /* { dg-final { scan-assembler "bics\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, lsl 3" } } */

Ian, these two patterns have the same issue Richard just highlighted
on your other patch, i.e. the first pattern will also match anything
matched by the second pattern.

/Marcus

On 26 April 2013 14:24, Ian Bolton  wrote:
> With these patterns, we can now generate BICS in the appropriate places.
>
> I've included test cases.
>
> This has been run on linux and bare-metal regression tests.
>
> OK to commit?
>
> Cheers,
> Ian
>
>
>
> 2013-04-26  Ian Bolton  
>
> gcc/
> * config/aarch64/aarch64.md (*and_one_cmpl3_compare0):
> New pattern.
> (*and_one_cmplsi3_compare0_uxtw): Likewise.
> (*and_one_cmpl_3_compare0): Likewise.
> (*and_one_cmpl_si3_compare0_uxtw): Likewise.
>
> testsuite/
> * gcc.target/aarch64/bics.c: New test.
> * gcc.target/aarch64/bics2.c: Likewise.

[PATCH, AArch64] Support LDR/STR to/from S and D registers

2013-04-26 Thread Ian Bolton

This patch allows us to load to and store from the S and D registers,
which helps with doing scalar operations in those registers.

This has been regression tested on bare-metal and linux.

OK for trunk?

Cheers,
Ian


2013-04-26  Ian Bolton  

* config/aarch64/aarch64.md (movsi_aarch64): Support LDR/STR
from/to S register.
(movdi_aarch64): Support LDR/STR from/to D register.

Index: gcc/config/aarch64/aarch64.md
===
--- gcc/config/aarch64/aarch64.md   (revision 198231)
+++ gcc/config/aarch64/aarch64.md   (working copy)
@@ -808,26 +808,28 @@ (define_expand "mov"
 )
 
 (define_insn "*movsi_aarch64"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,m, *w, r,*w")
-   (match_operand:SI 1 "aarch64_mov_operand" " r,M,m,rZ,rZ,*w,*w"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,*w,m,  m,*w, r,*w")
+   (match_operand:SI 1 "aarch64_mov_operand"  " r,M,m, m,rZ,*w,rZ,*w,*w"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
   "@
mov\\t%w0, %w1
mov\\t%w0, %1
ldr\\t%w0, %1
+   ldr\\t%s0, %1
str\\t%w1, %0
+   str\\t%s1, %0
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1"
-  [(set_attr "v8type" "move,alu,load1,store1,fmov,fmov,fmov")
+  [(set_attr "v8type" "move,alu,load1,load1,store1,store1,fmov,fmov,fmov")
(set_attr "mode" "SI")
-   (set_attr "fp" "*,*,*,*,yes,yes,yes")]
+   (set_attr "fp" "*,*,*,*,*,*,yes,yes,yes")]
 )
 
 (define_insn "*movdi_aarch64"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,m, r,  r,  *w, r,*w,w")
-   (match_operand:DI 1 "aarch64_mov_operand"  " r,r,k,N,m,rZ,Usa,Ush,rZ,*w,*w,Dd"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,*w,m,  m,r,  r,  *w, r,*w,w")
+   (match_operand:DI 1 "aarch64_mov_operand"  " r,r,k,N,m, m,rZ,*w,Usa,Ush,rZ,*w,*w,Dd"))]
   "(register_operand (operands[0], DImode)
 || aarch64_reg_or_zero (operands[1], DImode))"
   "@
@@ -836,16 +838,18 @@ (define_insn "*movdi_aarch64"
mov\\t%x0, %1
mov\\t%x0, %1
ldr\\t%x0, %1
+   ldr\\t%d0, %1
str\\t%x1, %0
+   str\\t%d1, %0
adr\\t%x0, %a1
adrp\\t%x0, %A1
fmov\\t%d0, %x1
fmov\\t%x0, %d1
fmov\\t%d0, %d1
movi\\t%d0, %1"
-  [(set_attr "v8type" "move,move,move,alu,load1,store1,adr,adr,fmov,fmov,fmov,fmov")
+  [(set_attr "v8type" "move,move,move,alu,load1,load1,store1,store1,adr,adr,fmov,fmov,fmov,fmov")
(set_attr "mode" "DI")
-   (set_attr "fp" "*,*,*,*,*,*,*,*,yes,yes,yes,yes")]
+   (set_attr "fp" "*,*,*,*,*,*,*,*,*,*,yes,yes,yes,yes")]
 )
 
 (define_insn "insv_imm"

Make m32c build, fix PSImode truncation

2013-04-26 Thread Bernd Schmidt

This patch here:
  http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00661.html

changed simplification code from
 case TRUNCATE:
-  /* We can't handle truncation to a partial integer mode here
- because we don't know the real bitsize of the partial
- integer mode.  */
-  if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
-break;

to
+  if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
+   {
+ if (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op)))
+   return rtl_hooks.gen_lowpart_no_emit (mode, op);
+ /* We can't handle truncation to a partial integer mode here
+because we don't know the real bitsize of the partial
+integer mode.  */
+ break;
+   }

This is problematic for m32c; it defines TRULY_NOOP_TRUNCATION as 1, and
it's not really possible to define it meaningfully for partial int
modes, since it only gets passed precisions. Allowing subregs of PSImode
values leads to out of registers reload failures, so it kind of relies
on the previous behaviour.

The patch below restores the old behaviour. Bootstrapped and tested on
x86_64-linux, and it makes m32c build. Ok?


Bernd
	* simplify-rtx.c (simplify_unary_operation_1): Don't try to
	simplify truncations of partial int modes.

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 791f91a..6a8221c 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -1038,12 +1038,6 @@ simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op)
 
   if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
 	{
-	  if (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op)))
-	{
-	  temp = rtl_hooks.gen_lowpart_no_emit (mode, op);
-	  if (temp)
-		return temp;
-	}
 	  /* We can't handle truncation to a partial integer mode here
 	 because we don't know the real bitsize of the partial
 	 integer mode.  */

Re: [PATCH] Preserve loops from CFG build until after RTL loop opts

2013-04-26 Thread Tom de Vries

On 25/04/13 16:19, Richard Biener wrote:

> and compared to the previous patch changed the tree-ssa-tailmerge.c
> part to deal with merging of loop latch and loop preheader (even
> if that's a really bad idea) to not regress gcc.dg/pr50763.c.
> Any suggestion on how to improve that part welcome.

>   * tree-ssa-tail-merge.c: Include cfgloop.h.
>   (replace_block_by): When merging loop latches mark loops for fixup.

> Index: trunk/gcc/tree-ssa-tail-merge.c
> ===
> *** trunk.orig/gcc/tree-ssa-tail-merge.c  2013-04-25 11:31:14.0 +0200
> --- trunk/gcc/tree-ssa-tail-merge.c   2013-04-25 12:39:00.236390580 +0200
> *** along with GCC; see the file COPYING3.
> *** 197,202 
> --- 197,203 
>   #include "gimple-pretty-print.h"
>   #include "tree-ssa-sccvn.h"
>   #include "tree-dump.h"
> + #include "cfgloop.h"
>   
>   /* ??? This currently runs as part of tree-ssa-pre.  Why is this not
>  a stand-alone GIMPLE pass?  */
> *** replace_block_by (basic_block bb1, basic
> *** 1459,1464 
> --- 1460,1476 
> /* Mark the basic block as deleted.  */
> mark_basic_block_deleted (bb1);
>   
> +   /* ???  If we merge the loop preheader with the loop latch we are creating
> +  additional entries into the loop, eventually rotating it.
> +  Mark loops for fixup in this case.
> +  ???  This is a completely unwanted transform and will wreck most
> +  loops at this point - but with just not considering loop latches as
> +  merge candidates we fail to commonize the two loops in gcc.dg/pr50763.c.
> +  A better fix to avoid that regression is needed.  */
> +   if (current_loops
> +   && bb2->loop_father->latch == bb2)
> + loops_state_set (LOOPS_NEED_FIXUP);
> + 
> /* Redirect the incoming edges of bb1 to bb2.  */
> for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i)
>   {

Richard,

I'm not sure if I get your comment about the two loops in pr50763.c. There is
just one loop, both before and after tail-merge.

BEFORE:
2(e)
   / \
  *   *
 3 9
  \   /
   * *
4
   / \
  *   *
 5 10
 | |
 * *
 +--*7 8(x)
 |  / \*
 \ *   *  /
  6 11

TAIL-MERGE:
1. merges empty blocks 10 and 11,
2. merges empty blocks 5 and 6,
3. merges blocks 4 and 7, which are empty except for testing the conditional.

The transformations in steps 2 and 3 affect the loop.

AFTER:
2(e)
   / \
  *   *
 3 9
  \   /
   * *
+-*4,7
|  / \
\ *   *
 5,6   10,11
\
 *
  8(x)

Although steps 2 and 3 reduce the number of BBs, which could make sense when
compiling for size, I wonder whether this transformation works in general.

Step 3 only works if the same conditional is tested, which means an infinite
loop.

Step 2 works if the loop pre-header and the loop latch are empty. This will be
the case quite often since loop_optimizer_init is called with
LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES before pass_pre. OTOH, loops
will typically have a non-virtual phi, with different values coming from the
loop pre-header and the loop latch, which prevents the optimization.

So I think this is really a corner case, and we should disregard it if that
makes things simpler.

Rather than fixing up the loop structure, we could prevent tail-merge in these
cases.

The current fix tests for current_loops == NULL, and I'm not sure that can still
happen there, given that we have PROP_loops.

It's not evident to me that the test bb2->loop_father->latch == bb2 is
sufficient. Before calling tail_merge_optimize, we call loop_optimizer_finalize
in which we assert that LOOPS_MAY_HAVE_MULTIPLE_LATCHES from there on, so in
theory we might miss some latches.

But I guess that pre (having started out with simple latches) maintains simple
latches throughout, and that tail-merge does the same.

Tentative patch attached. I'll try build & test.

[ Btw, it would be nice if restricting the optimization also means that we can
  simplify dominator handling in the pass. ]

Thanks,
- Tom

2013-04-26  Tom de Vries  

* tree-ssa-tail-merge.c (find_same_succ_bb): Skip loop latch bbs.
(replace_block_by): Don't set LOOPS_NEED_FIXUP.

* gcc.dg/pr50763.c: Update test.
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index f2ab7444..e49c3e7 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -689,7 +689,8 @@ find_same_succ_bb (basic_block bb, same_succ *same_p)
   edge_iterator ei;
   edge e;

-  if (bb == NULL)
+  if (bb == NULL
+  || bb->loop_father->latch == bb)
 return;
   bitmap_set_bit (same->bbs, bb->index);
   FOR_EACH_EDGE (e, ei, bb->succs)
@@ -1460,17 +1461,6 @@ replace_block_by (basic_block bb1, basic_block bb2)
   /* Mark the basic block as deleted.  */
   mark_basic_block_deleted (bb1);

-  /* ???

[PATCH] Allow nested use of attributes in MD-files

2013-04-26 Thread Michael Zolotukhin

Hi,
This patch allows attributes to be used inside other attributes in MD files.
Currently an attribute can't depend on both mode and code: we only have mode
attributes and code attributes, and a mode attribute can't depend on the
code.

So, if we write, for example,
  (define_mode_attr attr_name [(SI "ps") (DI "")])
then <attr_name> in a pattern with both mode and code iterators will be
replaced with "ps" or "" depending on the mode, but "" won't be expanded
further.

Here is a small example to show when it could be needed.  Suppose we have two
instructions, add and subtract, and each of them operates on SI and DI mode
registers.  Suppose also that for DI-mode addition it's preferable to use an
alias (say, a faster version of a similar instruction).  I.e. we want to have
patterns that emit the following:

  plus,  SI: add_32
  plus,  DI: fast_add_64
  minus, SI: sub_32
  minus, DI: sub_64

Currently we need to have a separate pattern in MD-file for fast_add_64, but
with the change I suggest it could be written within one pattern as follows:

(define_mode_iterator MI [SI DI])
(define_code_iterator plusminus [plus minus])

(define_mode_attr Madd [(SI "add_32") (DI "fast_add_64")])
(define_mode_attr Msub [(SI "sub_32") (DI "sub_64")])
(define_code_attr CodeModeAttribute [(plus "") (minus "")])

(define_insn ""
  [(set:MI (match_operand:MI 0 "register_operand" "r")
   (plusminus:MI (match_operand:MI 1 "register_operand" "r")
 (match_operand:MI 2 "register_operand" "r")))]
   ""
"")

This could be used for all kinds of iterators and it could be very useful
when several different substs are applied to the same pattern.
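
The "keep applying while it has any effect" idea from the ChangeLog entry
below can be sketched as a small fixed-point loop.  This is only an
illustration, not the read-rtl.c change: struct attr, expand_once and
expand_fully are hypothetical names, and the table used in the usage note
assumes that the code attribute's values name the mode attributes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical attribute table entry: "<name>" expands to VALUE.  */
struct attr { const char *name; const char *value; };

/* Copy SRC to DST (capacity CAP), replacing each "<name>" found in TABLE
   by its value.  Returns the number of substitutions made in this pass.  */
static int
expand_once (const char *src, char *dst, size_t cap,
             const struct attr *table, size_t n)
{
  size_t out = 0;
  int changed = 0;

  assert (cap > 0);
  while (*src != '\0' && out + 1 < cap)
    {
      const char *end = (*src == '<') ? strchr (src, '>') : NULL;
      size_t i = n;

      if (end != NULL)
        for (i = 0; i < n; i++)
          if (strlen (table[i].name) == (size_t) (end - src - 1)
              && strncmp (table[i].name, src + 1, end - src - 1) == 0)
            break;

      if (i < n)
        {
          size_t len = strlen (table[i].value);
          if (out + len >= cap)
            break;              /* Out of room: truncate.  */
          memcpy (dst + out, table[i].value, len);
          out += len;
          src = end + 1;
          changed++;
        }
      else
        dst[out++] = *src++;
    }
  dst[out] = '\0';
  return changed;
}

/* Expand repeatedly until a pass changes nothing.  The pass limit guards
   against attributes that refer to themselves; strings longer than the
   temporary buffer are truncated.  */
static void
expand_fully (const char *src, char *dst, size_t cap,
              const struct attr *table, size_t n)
{
  char tmp[256];
  int pass;

  snprintf (dst, cap, "%s", src);
  for (pass = 0; pass < 16; pass++)
    {
      snprintf (tmp, sizeof tmp, "%s", dst);
      if (expand_once (tmp, dst, cap, table, n) == 0)
        break;
    }
}
```

For the (plus, DI) instance one might use the table
{ {"CodeModeAttribute", "<Madd>"}, {"Madd", "fast_add_64"} }: a single pass
stops at "<Madd>", while expand_fully carries on to "fast_add_64".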

The patch is regtested and bootstrapped on i386 and x86_64, and tested on
Specs2k, 2k6.

Is it ok for trunk?

gcc/ChangeLog
2013-04-26  Michael Zolotukhin  

* read-rtl.c (copy_rtx_for_iterators): Continue applying iterators
while it has any effect.


--
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.


attr.patch
Description: Binary data

Re: RFA: enable LRA for rs6000

2013-04-26 Thread Michael Meissner

Vlad, in going through the LRA test differences, some of the bswap64 tests are
failing because LRA turns register-to-register swaps into a store/load
sequence.  For example, if gcc.target/powerpc/bswap64-4.c is compiled for
32-bit, for this function:

long long swap_reg (long long a) { return __builtin_bswap64 (a); }

LRA gives:

swap_reg:
stwu 1,-16(1)
li 9,4
stw 3,8(1)
stw 4,12(1)
addi 10,1,8
lwbrx 3,9,10
lwbrx 4,0,10
addi 1,1,16
blr

And the traditional code generation is:

swap_reg:
rlwinm 9,4,8,0x
rlwinm 10,3,8,0x
rlwimi 9,4,24,0,7
rlwimi 10,3,24,0,7
rlwimi 9,4,24,16,23
rlwimi 10,3,24,16,23
mr 4,10
mr 3,9

I assume the rlwinm's are to be preferred because there is no LHS
(load-hit-store), and also because in this case the rlwinm's for the 2
registers can be done in parallel.

The test gcc.target/powerpc/vect-83_64.c is failing in LRA:

vect-83_64.c: In function ‘main1’:
vect-83_64.c:30:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90)

 }
 ^
0x104dca7f lra_constraints(bool)
/home/meissner/fsf-src/meissner-lra/gcc/lra-constraints.c:3613
0x104ca67b lra(_IO_FILE*)
/home/meissner/fsf-src/meissner-lra/gcc/lra.c:2278
0x1047d6eb do_reload
/home/meissner/fsf-src/meissner-lra/gcc/ira.c:4619
0x1047d6eb rest_of_handle_reload
/home/meissner/fsf-src/meissner-lra/gcc/ira.c:4731
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

I'm also seeing quite a few Fortran failures for -m32:

gfortran.dg/PR19872.f
gfortran.dg/advance_1.f90
gfortran.dg/advance_4.f90
gfortran.dg/advance_5.f90
gfortran.dg/advance_6.f90
gfortran.dg/append_1.f90
gfortran.dg/associated_2.f90
gfortran.dg/assumed_rank_1.f90
gfortran.dg/assumed_rank_2.f90
gfortran.dg/assumed_rank_7.f90
gfortran.dg/assumed_type_2.f90
gfortran.dg/backspace_10.f90
gfortran.dg/backspace_2.f
gfortran.dg/backspace_8.f
gfortran.dg/backspace_9.f
gfortran.dg/bound_2.f90
gfortran.dg/bound_7.f90
gfortran.dg/bound_8.f90
gfortran.dg/char_cshift_1.f90
gfortran.dg/char_cshift_2.f90
gfortran.dg/char_cshift_3.f90
gfortran.dg/char_eoshift_1.f90
gfortran.dg/char_eoshift_2.f90
gfortran.dg/char_eoshift_3.f90
gfortran.dg/char_eoshift_4.f90
gfortran.dg/char_eoshift_5.f90
gfortran.dg/char_length_8.f90
gfortran.dg/chmod_1.f90
gfortran.dg/chmod_2.f90
gfortran.dg/chmod_3.f90
gfortran.dg/comma.f
gfortran.dg/convert_2.f90
gfortran.dg/convert_implied_open.f90
gfortran.dg/cr_lf.f90
gfortran.dg/cshift_bounds_1.f90
gfortran.dg/cshift_bounds_2.f90
gfortran.dg/cshift_bounds_3.f90
gfortran.dg/cshift_bounds_4.f90
gfortran.dg/cshift_nan_1.f90
gfortran.dg/dev_null.F90
gfortran.dg/direct_io_1.f90
gfortran.dg/direct_io_11.f90
gfortran.dg/direct_io_12.f90
gfortran.dg/direct_io_2.f90
gfortran.dg/direct_io_3.f90
gfortran.dg/direct_io_5.f90
gfortran.dg/direct_io_8.f90
gfortran.dg/endfile.f90
gfortran.dg/endfile_2.f90
gfortran.dg/eof_4.f90
gfortran.dg/eoshift.f90
gfortran.dg/eoshift_bounds_1.f90
gfortran.dg/error_format.f90
gfortran.dg/f2003_inquire_1.f03
gfortran.dg/f2003_io_1.f03
gfortran.dg/f2003_io_5.f03
gfortran.dg/f2003_io_7.f03
gfortran.dg/fmt_cache_1.f
gfortran.dg/fmt_error_4.f90
gfortran.dg/fmt_error_5.f90
gfortran.dg/fmt_t_5.f90
gfortran.dg/fmt_t_7.f
gfortran.dg/ftell_3.f90
gfortran.dg/hollerith4.f90
gfortran.dg/inquire_10.f90
gfortran.dg/inquire_13.f90
gfortran.dg/inquire_15.f90
gfortran.dg/inquire_9.f90
gfortran.dg/inquire_size.f90
gfortran.dg/iomsg_1.f90
gfortran.dg/iostat_2.f90
gfortran.dg/list_read_10.f90
gfortran.dg/list_read_6.f90
gfortran.dg/list_read_7.f90
gfortran.dg/list_read_9.f90
gfortran.dg/matmul_1.f90
gfortran.dg/matmul_5.f90
gfortran.dg/maxloc_bounds_1.f90
gfortran.dg/maxloc_bounds_2.f90
gfortran.dg/maxloc_bounds_3.f90
gfortran.dg/maxloc_bounds_6.f90
gfortran.dg/maxloc_bounds_8.f90
gfortran.dg/namelist_44.f90
gfortran.dg/namelist_45.f90
gfortran.dg/namelist_46.f90
gfortran.dg/namelist_66.f90
gfortran.dg/namelist_72.f
gfortran.dg/namelist_82.f90
gfortran.dg/negative_automatic_size.f90
gfortran.dg/negative_unit.f
gfortran.dg/negative_unit_int8.f
gfortran.dg/newunit_1.f90
gfortran.dg/newunit_3.f90
gfortran.dg/open_access_append_1.f90
gfortran.dg/open_errors.f90
gfortran.dg/open_negative_unit_1.f90
gfortran.dg/open_new.f90
gfortran.dg/open_readonly_1.f90
gfortran.dg/open_status_1.f90
gfortran.dg/open_status_2.f90
gfortran.dg/open_status_3.f90
gfortran.dg/optional_dim_2.f90
gfortran.dg/optional_dim_3.f90
gfortran.dg/overwrite_1.f
gfortran.dg/pr16597.f90
gfortran.dg/pr16935.f90
gfortran.dg/pr20954.f
gfortran.dg/pr39865.f90
gfortran.dg/pr46804.f90
gfortran.dg/pr47878.f90
gfortran.dg/read_comma.f
gfortran.dg/read_eof_4.f90
gfortran.dg/read_eof_8.f90
gfortran.dg/read_eof_all.f90
gfortran.dg/read_list_eof_1.f90
gfortran.dg/read_many_1.f
gfortran.dg/read_no_eor.f90
gfor

[PATCH] Two -mxop wrong-code fixes (PR target/56866)

2013-04-26 Thread Jakub Jelinek

Hi!

This patch fixes two wrong-code bugs with -mxop.
One is that the vpmacsdqh instruction can only be used for
vec_widen_smult_odd_v4si, not for vec_widen_umult_odd_v4si.  Consider an
unsigned V4SImode h* with arguments
{ 3, 3, 3, 3 } h* { 0xaaaaaaab, 0xaaaaaaab, 0xaaaaaaab, 0xaaaaaaab }
(but not known at compile time).  If we use vpmacsdqh, it sign-extends
the numbers and thus computes (3 * 0xffffffffaaaaaaabULL) >> 32,
i.e. 0xffffffff, while we want (3 * 0xaaaaaaabULL) >> 32, i.e. 2.
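
The difference between the signed and the unsigned high-part multiply can be
checked with a few lines of scalar C.  This is just a sketch of the
arithmetic on one element, with made-up helper names; it does not model the
XOP instruction itself.  The interesting inputs are those with the top bit
set, such as the multiplier 0xaaaaaaab.

```c
#include <stdint.h>

/* High 32 bits of the unsigned 32x32->64 product, i.e. what an unsigned
   widening multiply followed by a shift by 32 should produce.  */
static uint32_t
umul_high (uint32_t a, uint32_t b)
{
  return (uint32_t) (((uint64_t) a * b) >> 32);
}

/* Reinterpret X as a signed 32-bit value, widened to 64 bits.  */
static int64_t
sign_extend32 (uint32_t x)
{
  return (x & 0x80000000u) ? (int64_t) x - 4294967296LL : (int64_t) x;
}

/* High 32 bits of the product after sign-extending both inputs, which is
   how a signed multiply treats the very same bit patterns.  */
static uint32_t
smul_high (uint32_t a, uint32_t b)
{
  int64_t prod = sign_extend32 (a) * sign_extend32 (b);

  return (uint32_t) ((uint64_t) prod >> 32);
}
```

The two helpers agree as long as both inputs have a clear top bit and
diverge otherwise, which is why the signed instruction can't stand in for
the unsigned operation.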

The second bug is a wrong shift count for the immediate xop_rotr.
We want (element bitsize - immediate) to transform the r>> immediate
into an r<< immediate, but the old ( * 8) expression gives the element
bitsize only for V4SImode, where it is 32.  For V2DImode it is 16 instead
of the desired 64, for V8HImode it is 64 instead of the desired 16, and for
V16QImode it is 128 instead of the desired 8.
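
The identity behind the fix, rotate right by n equals rotate left by
(element bitsize - n), can be sketched in scalar C for any element width up
to 64 bits.  The helper names are hypothetical and the code says nothing
about what the hardware does with an out-of-range count.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low BITS bits, 1 <= BITS <= 64.  */
static uint64_t
low_mask (unsigned bits)
{
  assert (bits >= 1 && bits <= 64);
  return bits == 64 ? ~UINT64_C (0) : (UINT64_C (1) << bits) - 1;
}

/* Rotate the low BITS bits of X left by N (N is reduced modulo BITS).  */
static uint64_t
rotl_bits (uint64_t x, unsigned bits, unsigned n)
{
  uint64_t mask = low_mask (bits);

  x &= mask;
  n %= bits;
  if (n == 0)
    return x;
  return ((x << n) | (x >> (bits - n))) & mask;
}

/* Rotate the low BITS bits of X right by N (N is reduced modulo BITS).  */
static uint64_t
rotr_bits (uint64_t x, unsigned bits, unsigned n)
{
  uint64_t mask = low_mask (bits);

  x &= mask;
  n %= bits;
  if (n == 0)
    return x;
  return ((x >> n) | (x << (bits - n))) & mask;
}

/* Left-rotate count equivalent to a right rotate by N, 0 <= N < BITS.  */
static unsigned
rotr_as_rotl_count (unsigned bits, unsigned n)
{
  assert (n < bits);
  return bits - n;
}
```

The count has to be derived from the width of one element (8, 16, 32 or 64),
not from the element count or the width of the whole vector.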

Bootstrapped/regtested on x86_64-linux, configured --with-arch=bdver2,
fixes:

-FAIL: gcc.c-torture/execute/pr51581-1.c execution,  -O3 -fomit-frame-pointer
-FAIL: gcc.c-torture/execute/pr51581-1.c execution,  -O3 -fomit-frame-pointer -funroll-loops
-FAIL: gcc.c-torture/execute/pr51581-1.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
-FAIL: gcc.c-torture/execute/pr51581-1.c execution,  -O3 -g
-FAIL: gcc.c-torture/execute/pr51581-2.c execution,  -O3 -fomit-frame-pointer
-FAIL: gcc.c-torture/execute/pr51581-2.c execution,  -O3 -fomit-frame-pointer -funroll-loops
-FAIL: gcc.c-torture/execute/pr51581-2.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
-FAIL: gcc.c-torture/execute/pr51581-2.c execution,  -O3 -g
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -O1
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -O2
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -O3 -fomit-frame-pointer
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -O3 -fomit-frame-pointer -funroll-loops
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -O3 -g
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -Os
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -Og -g
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -O2 -flto -fno-use-linker-plugin -flto-partition=none
-FAIL: gcc.c-torture/execute/pr53645.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
-FAIL: gcc.c-torture/execute/pr56866.c execution,  -O3 -fomit-frame-pointer
-FAIL: gcc.c-torture/execute/pr56866.c execution,  -O3 -fomit-frame-pointer -funroll-loops
-FAIL: gcc.c-torture/execute/pr56866.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
-FAIL: gcc.c-torture/execute/pr56866.c execution,  -O3 -g
-FAIL: gcc.dg/vect/pr51581-1.c execution test
-FAIL: gcc.dg/vect/pr51581-2.c execution test
-FAIL: gcc.dg/vect/pr51581-3.c execution test
-FAIL: gcc.dg/vect/pr51581-1.c -flto execution test
-FAIL: gcc.dg/vect/pr51581-2.c -flto execution test
-FAIL: gcc.dg/vect/pr51581-3.c -flto execution test
-FAIL: gcc.target/i386/avx-mul-1.c execution test
-FAIL: gcc.target/i386/avx-pr51581-1.c execution test
-FAIL: gcc.target/i386/avx-pr51581-2.c execution test
-FAIL: gcc.target/i386/pr56866.c execution test
-FAIL: gcc.target/i386/sse2-mul-1.c execution test
-FAIL: gcc.target/i386/sse4_1-mul-1.c execution test
-FAIL: gcc.target/i386/xop-mul-1.c execution test

failures that appear with stock gcc when just the testsuite/
part of the patch is applied.  Ok for trunk/4.8 and partly for 4.7
(the i386.c bug was introduced on 2012-06-25, but the sse.md
bug already existed in 4.7)?

2013-04-26  Jakub Jelinek  

PR target/56866
* config/i386/i386.c (ix86_expand_mul_widen_evenodd): Don't
use xop_pmacsdqh if uns_p.
* config/i386/sse.md (xop_rotr3): Fix up computation of
the immediate rotate count.

* gcc.c-torture/execute/pr56866.c: New test.
* gcc.target/i386/pr56866.c: New test.

--- gcc/config/i386/i386.c.jj   2013-04-22 10:26:22.0 +0200
+++ gcc/config/i386/i386.c  2013-04-26 10:28:51.793534370 +0200
@@ -40841,7 +40841,7 @@ ix86_expand_mul_widen_evenodd (rtx dest,
  the even slots.  For some cpus this is faster than a PSHUFD.  */
   if (odd_p)
 {
-  if (TARGET_XOP && mode == V4SImode)
+  if (TARGET_XOP && mode == V4SImode && !uns_p)
{
  x = force_reg (wmode, CONST0_RTX (wmode));
  emit_insn (gen_xop_pmacsdqh (dest, op1, op2, x));
--- gcc/config/i386/sse.md.jj   2013-04-02 20:24:37.0 +0200
+++ gcc/config/i386/sse.md  2013-04-26 13:25:32.729590863 +0200
@@ -9924,7 +9924,8 @@ (define_insn "xop_rotr3"
 (match_operand:SI 2 "const_0_to__operand" "n")))]
   "TARGET_XOP"
 {
-  operands[3] = GEN_INT (( * 8) - INTVAL (operands[2]));
+  operands[3]
+= GEN_INT (GET_MODE_BITSIZE (mode) - INTVAL (operands[2]));
   return \"vprot\t{%3, %1, %0|%0, %1, %3}\";
 }
   [(set_attr "type" "sseishft")
--- gcc

Re: [build, driver] RFC: Support compressed debug sections

2013-04-26 Thread Joseph S. Myers

On Thu, 11 Apr 2013, Rainer Orth wrote:

> +gz=
> +Common Driver JoinedOrMissing
> +-gz= Generate compressed debug sections in format 

Although handled entirely in specs, I think it's best to use the Enum .opt 
facility to list the valid arguments to this option, so the option 
handling machinery can properly detect invalid arguments.  (And, since an 
empty argument isn't meaningful, use Joined rather than JoinedOrMissing.)

The integer values assigned to each valid argument string are of course 
arbitrary since nothing will use them.

> +@item -gz@r{[}=@var{type}@r{]}
> +@opindex gz
> +Produce compressed debug sections in DWARF format (if that is
> +supported).  If @var{type} is not given, the default type depends on the
> +capabilities of the assembler and linker used.  @var{type} may be one of
> +@option{none} (don't compress debug sections), @option{zlib} (use zlib
> +compression in ELF gABI format), or @option{zlib-gnu} (use zlib
> +compression in tradition GNU format).

"traditional".

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 4/6] Handwritten part of conversion of passes to instances of C++ classes

2013-04-26 Thread David Malcolm

On Fri, 2013-04-19 at 21:23 -0400, David Malcolm wrote:
>   This is the hand-written part of the patch; it is required for the
>   preceding auto-generated patch to make sense.

[Answering my own patch]

This patch isn't yet good enough as-is: upon investigating test case
failures I found that the patch wasn't properly handling instances of
passes, leading to the symptom of:
  cc1plus: error: unrecognized command line option '-fdump-tree-fre1'
which was because both of the instances of "fre" were erroneously
getting the dump switch '-fdump-tree-fre' (i.e. missing the trailing
instance number).

This highlighted a deeper issue with converting the passes to C++
classes: register_pass expects a pre-allocated pass, but potentially
needs to create multiple copies of the pass if it's going to be inserted
in multiple places, and gcc/passes.c:make_pass_instance creates these
instances using a memcpy with a fixed size (based on the pass type).  If
we're going to support hanging extra data off of a pass instance, it
means a small reorganization here.

I'm working on a revised patch series which respects the status quo as
per -fdump- option names, and which changes register_pass so that you
pass in a callback for creating passes, rather than creating the pass
yourself.
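
A rough sketch of why a creation callback helps, in plain C with entirely
hypothetical names (struct pass, struct counting_pass, pass_factory and
register_pass_instances are not GCC's types or functions): a registrar that
clones a pre-allocated pass with a fixed-size memcpy can't know how much
extra data a derived pass carries, whereas a factory allocates and
initializes the complete object for every instance.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical base pass descriptor.  */
struct pass
{
  const char *name;
  int instance;   /* 1-based instance number, e.g. the "1" in "fre1".  */
};

/* Hypothetical derived pass with extra per-instance state.  */
struct counting_pass
{
  struct pass base;   /* First member, so &p->base can be cast back.  */
  int budget;         /* Data a sizeof (struct pass) copy would lose.  */
};

/* A factory returns a freshly allocated, fully initialized pass.  */
typedef struct pass *(*pass_factory) (void);

static struct pass *
make_counting_pass (void)
{
  struct counting_pass *p = malloc (sizeof *p);

  if (p == NULL)
    return NULL;
  p->base.name = "fre";
  p->base.instance = 0;
  p->budget = 42;
  return &p->base;
}

/* Create COUNT instances through FACTORY and number them from 1.  The
   registrar never needs to know the size of the derived type.  Returns
   the number of instances actually created.  */
static int
register_pass_instances (pass_factory factory, struct pass **out, int count)
{
  int i;

  assert (factory != NULL && out != NULL);
  for (i = 0; i < count; i++)
    {
      out[i] = factory ();
      if (out[i] == NULL)
        return i;
      out[i]->instance = i + 1;
    }
  return count;
}
```

Each instance then gets its own number, from which a dump switch name with
the trailing instance number can be formed, and its own copy of the extra
data.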

Dave

Patch ping - Add a new option "-fstack-protector-strong"

2013-04-26 Thread 沈涵

Hi, I'd like to ping the patch '-fstack-protector-strong':

- http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00945.html
  Add a new option '-fstack-protector-strong' to protect only
stack-smashing-vulnerable functions.

Thanks,
H.

[C++ Patch/RFC] PR 56450

2013-04-26 Thread Paolo Carlini


Hi,

in this ICE on valid, finish_decltype_type doesn't handle a 
COMPOUND_EXPR when the second argument, 
id_expression_or_member_access_p, is true.


As analyzed by Jakub in the audit trail, at parsing time for the test 
involving:


   struct A3 { static const int dummy = 0; };

the *.dummy in the template argument of has_const_int_dummy is parsed as 
a COMPONENT_REF, but then, in case DECLTYPE_TYPE of tsubst it's turned 
into a COMPOUND_EXPR when it becomes clear that dummy is actually a 
static data member. Then id_expression_or_member_access_p == true, as 
stored in the DECLTYPE_TYPE, isn't correct anymore and 
finish_decltype_type can't handle its arguments.


Now, right before calling finish_decltype_type, we are *already* 
handling a case where, before the instantiation, we have ~id which, upon 
instantiation can turn out to be either a complement expression or a 
destructor: I thought we could handle in the same place this additional 
case of disambiguation and adjust id_expression_or_member_access_p to 
false in this case too. The below passes testing on x86_64-linux.


Thanks,
Paolo.

///
Index: cp/pt.c
===
--- cp/pt.c (revision 198340)
+++ cp/pt.c (working copy)
@@ -11781,11 +11781,12 @@ tsubst (tree t, tree args, tsubst_flags_t complain
 case DECLTYPE_TYPE:
   {
tree type;
+   tree expr = DECLTYPE_TYPE_EXPR (t);
 
++cp_unevaluated_operand;
++c_inhibit_evaluation_warnings;
 
-   type = tsubst_copy_and_build (DECLTYPE_TYPE_EXPR (t), args,
+   type = tsubst_copy_and_build (expr, args,
  complain|tf_decltype, in_decl,
  /*function_p*/false,
  /*integral_constant_expression*/false);
@@ -11801,12 +11802,16 @@ tsubst (tree t, tree args, tsubst_flags_t complain
else
  {
bool id = DECLTYPE_TYPE_ID_EXPR_OR_MEMBER_ACCESS_P (t);
-   if (id && TREE_CODE (DECLTYPE_TYPE_EXPR (t)) == BIT_NOT_EXPR
-   && EXPR_P (type))
- /* In a template ~id could be either a complement expression
-or an unqualified-id naming a destructor; if instantiating
-it produces an expression, it's not an id-expression or
-member access.  */
+   /* In a template ~id could be either a complement expression
+  or an unqualified-id naming a destructor; if instantiating
+  it produces an expression, it's not an id-expression or
+  member access.  Likewise, if a COMPONENT_REF becomes upon
+  instantiation a COMPOUND_EXPR, it's actually a static data
+  member.  */
+   if (id && ((TREE_CODE (expr) == BIT_NOT_EXPR
+   && EXPR_P (type))
+  || (TREE_CODE (expr) == COMPONENT_REF
+  && TREE_CODE (type) == COMPOUND_EXPR)))
  id = false;
type = finish_decltype_type (type, id, complain);
  }
Index: testsuite/g++.dg/cpp0x/decltype52.C
===
--- testsuite/g++.dg/cpp0x/decltype52.C (revision 0)
+++ testsuite/g++.dg/cpp0x/decltype52.C (working copy)
@@ -0,0 +1,39 @@
+// PR c++/56450
+// { dg-do compile { target c++11 } }
+
+template
+T&& declval();
+
+template
+struct enable_if { };
+
+template
+struct enable_if
+{ typedef T type; };
+
+template
+struct is_same
+{ static constexpr bool value = false; };
+
+template
+struct is_same
+{ static constexpr bool value = true; };
+
+template< typename, typename = void >
+struct has_const_int_dummy
+{ static constexpr bool value = false; };
+
+template< typename T >
+struct has_const_int_dummy< T, typename enable_if< is_same< decltype(
+declval< T >().dummy ), const int >::value >::type >
+{ static constexpr bool value = true; };
+
+struct A0 { const int dummy; };
+struct A1 {};
+struct A2 { int dummy(); };
+struct A3 { static const int dummy = 0; };
+
+static_assert( has_const_int_dummy< A0 >::value, "A0" );
+static_assert( !has_const_int_dummy< A1 >::value, "A1" );
+static_assert( !has_const_int_dummy< A2 >::value, "A2" );
+static_assert( !has_const_int_dummy< A3 >::value, "A3" );  // ICE

Re: RFA: enable LRA for rs6000 [32-bit fortran]

2013-04-26 Thread Michael Meissner

In addition to all of the failures in the 32-bit gfortran suite, I ran one run
of the 32-bit SPEC 2006 Fortran tests, and the following benchmarks fail:

410.bwaves      416.gamess      434.zeusmp
437.leslie3d    454.calculix    459.GemsFDTD
465.tonto       481.wrf

The following 2 benchmarks succeed:

435.gromacs 436.cactusADM

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

patch for recent LRA changes -- another try

2013-04-26 Thread Vladimir Makarov

  I already submitted an analogous patch a few days ago. Unfortunately,
it resulted in some libstdc++ test failures.  This version of the patch
fixes the problem, which was a wrong choice of insn alternatives that
resulted in MMX regs being used.


  The patch was successfully bootstrapped on x86/x86-64.

  Committed as rev. 198344.

2013-04-26  Vladimir Makarov  

* rtl.h (struct rtx_def): Add comment for field jump.
(LRA_SUBREG_P): New macro.
* recog.c (register_operand): Check LRA_SUBREG_P.
* lra.c (lra): Add note at the end of RTL code. Align non-empty
stack frame.
* lra-spills.c (lra_spill): Align stack after spilling pseudos.
(lra_final_code_change): Skip subreg change for operators.
* lra-eliminations.c (eliminate_regs_in_insn): Make return earlier
if there are no operand changes.
* lra-constraints.c (curr_insn_set): New.
(match_reload): Set LRA_SUBREG_P.
(emit_spill_move): Ditto.
(check_and_process_move): Use curr_insn_set. Process only single
set insns.  Don't initialize sec_mem_p and change_p.
(simplify_operand_subreg): Use LRA_SUBREG_P.
(reg_in_class_p): New function.
(process_alt_operands): Use it.  Use #if HAVE_ATTR_enabled instead
of #ifdef.  Add code to remove cycling.
(process_address): Check EXTRA_CONSTRAINT_STR. Process even if
non-null disp.  Reload inner instead of disp when base and index
are null.  Try to put lo_sum into register.
(EBB_PROBABILITY_CUTOFF): Redefine probability in percents.
(check_and_process_move): Move code for move cost check to
simple_move_p.  Remove equiv_substitution.
(simple_move_p): New function.
(curr_insn_transform): Initialize sec_mem_p and change_p.  Set up
curr_insn_set.  Call check_and_process_move only for single set
insns.  Use the new function.  Move call of check_and_process_move
after operand equiv substitution and address process.


Index: lra-constraints.c
===
--- lra-constraints.c   (revision 198330)
+++ lra-constraints.c   (working copy)
@@ -135,10 +135,11 @@
reload insns.  */
 static int bb_reload_num;
 
-/* The current insn being processed and corresponding its data (basic
-   block, the insn data, the insn static data, and the mode of each
-   operand).  */
+/* The current insn being processed and corresponding its single set
+   (NULL otherwise), its data (basic block, the insn data, the insn
+   static data, and the mode of each operand).  */
 static rtx curr_insn;
+static rtx curr_insn_set;
 static basic_block curr_bb;
 static lra_insn_recog_data_t curr_id;
 static struct lra_static_insn_data *curr_static_id;
@@ -698,6 +699,7 @@ match_reload (signed char out, signed ch
new_out_reg = gen_lowpart_SUBREG (outmode, reg);
  else
new_out_reg = gen_rtx_SUBREG (outmode, reg, 0);
+ LRA_SUBREG_P (new_out_reg) = 1;
  /* If the input reg is dying here, we can use the same hard
 register for REG and IN_RTX.  We do it only for original
 pseudos as reload pseudos can die although original
@@ -721,6 +723,7 @@ match_reload (signed char out, signed ch
 it at the end of LRA work.  */
  clobber = emit_clobber (new_out_reg);
  LRA_TEMP_CLOBBER_P (PATTERN (clobber)) = 1;
+ LRA_SUBREG_P (new_in_reg) = 1;
  if (GET_CODE (in_rtx) == SUBREG)
{
  rtx subreg_reg = SUBREG_REG (in_rtx);
@@ -855,40 +858,34 @@ static rtx
 emit_spill_move (bool to_p, rtx mem_pseudo, rtx val)
 {
   if (GET_MODE (mem_pseudo) != GET_MODE (val))
-val = gen_rtx_SUBREG (GET_MODE (mem_pseudo),
- GET_CODE (val) == SUBREG ? SUBREG_REG (val) : val,
- 0);
+{
+  val = gen_rtx_SUBREG (GET_MODE (mem_pseudo),
+   GET_CODE (val) == SUBREG ? SUBREG_REG (val) : val,
+   0);
+  LRA_SUBREG_P (val) = 1;
+}
   return (to_p
  ? gen_move_insn (mem_pseudo, val)
  : gen_move_insn (val, mem_pseudo));
 }
 
 /* Process a special case insn (register move), return true if we
-   don't need to process it anymore.  Return that RTL was changed
-   through CHANGE_P and macro SECONDARY_MEMORY_NEEDED says to use
-   secondary memory through SEC_MEM_P. */
+   don't need to process it anymore.  INSN should be a single set
+   insn.  Set up that RTL was changed through CHANGE_P and macro
+   SECONDARY_MEMORY_NEEDED says to use secondary memory through
+   SEC_MEM_P.  */
 static bool
-check_and_process_move (bool *change_p, bool *sec_mem_p)
+check_and_process_move (bool *change_p, bool *sec_mem_p ATTRIBUTE_UNUSED)
 {
   int sregno, dregno;
-  rtx set, dest, src, dreg, sreg, old_sreg, new_reg, before, scratch_reg;
+  rtx dest, src, dreg, sreg, old_sreg, new_reg, before, scratch_reg

[PATCH] Fix PR57077 (issue8840045)

2013-04-26 Thread Teresa Johnson

This patch fixes PR57077. Certain new uses of apply_probability
are actually scaling the counts up, and the scale factor should not 
be treated as a probability as the value may exceed REG_BR_PROB_BASE.
One example (from the PR) is when scaling counts up in LTO when merging
profiles. Another example I found when preparing the patch to use
the rounding divide in more places is when inlining COMDAT functions.

Add new helper function apply_scale that does the scaling without
the probability range check. I audited the new uses of apply_probability
and changed the calls as appropriate.

Profilebootstrapped and tested on x86_64-unknown-linux-gnu. Verified that this
fixes the lto-bootstrap issue. Ok for trunk?
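
For readers without basic-block.h at hand, the split described above can be
sketched roughly as follows.  This is only an illustration, not the committed
code: count_t, PROB_BASE, rdiv, scale_count and scale_by_probability are
hypothetical stand-ins for gcov_type, REG_BR_PROB_BASE, the rounding divide,
apply_scale and apply_probability, and the value 10000 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for gcov_type and REG_BR_PROB_BASE.  */
typedef int64_t count_t;
#define PROB_BASE 10000

/* Rounding division, for non-negative operands only.  */
static count_t
rdiv (count_t x, count_t y)
{
  return (x + y / 2) / y;
}

/* Scale COUNT by SCALE / PROB_BASE.  SCALE may exceed PROB_BASE,
   so no range check is done here.  */
static count_t
scale_count (count_t count, int scale)
{
  return rdiv (count * scale, PROB_BASE);
}

/* Same arithmetic, but PROB must be a real probability.  */
static count_t
scale_by_probability (count_t count, int prob)
{
  assert (prob >= 0 && prob <= PROB_BASE);
  return scale_count (count, prob);
}
```

With these definitions a scale factor of 25000 (2.5x) is accepted by
scale_count, while passing it to scale_by_probability would trip the range
check.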

2013-04-26  Teresa Johnson  

* basic-block.h (apply_scale): New function.
(apply_probability): Use apply_scale.
* gimple-streamer-in.c (input_bb): Ditto.
* lto-streamer-in.c (input_cfg): Ditto.
* lto-cgraph.c (merge_profile_summaries): Ditto.
* tree-optimize.c (execute_fixup_cfg): Ditto.
* tree-inline.c (copy_bb): Update comment to use
apply_scale.
(copy_edges_for_bb): Ditto.
(copy_cfg_body): Ditto.

Index: gimple-streamer-in.c
===
--- gimple-streamer-in.c(revision 198344)
+++ gimple-streamer-in.c(working copy)
@@ -329,8 +329,8 @@ input_bb (struct lto_input_block *ib, enum LTO_tag
   index = streamer_read_uhwi (ib);
   bb = BASIC_BLOCK_FOR_FUNCTION (fn, index);
 
-  bb->count = apply_probability (streamer_read_gcov_count (ib),
- count_materialization_scale);
+  bb->count = apply_scale (streamer_read_gcov_count (ib),
+   count_materialization_scale);
   bb->frequency = streamer_read_hwi (ib);
   bb->flags = streamer_read_hwi (ib);
 
Index: lto-streamer-in.c
===
--- lto-streamer-in.c   (revision 198344)
+++ lto-streamer-in.c   (working copy)
@@ -635,8 +635,8 @@ input_cfg (struct lto_input_block *ib, struct func
 
  dest_index = streamer_read_uhwi (ib);
  probability = (int) streamer_read_hwi (ib);
- count = apply_probability ((gcov_type) streamer_read_gcov_count (ib),
- count_materialization_scale);
+ count = apply_scale ((gcov_type) streamer_read_gcov_count (ib),
+   count_materialization_scale);
  edge_flags = streamer_read_uhwi (ib);
 
  dest = BASIC_BLOCK_FOR_FUNCTION (fn, dest_index);
Index: tree-inline.c
===
--- tree-inline.c   (revision 198344)
+++ tree-inline.c   (working copy)
@@ -1519,7 +1519,7 @@ copy_bb (copy_body_data *id, basic_block bb, int f
  basic_block_info automatically.  */
   copy_basic_block = create_basic_block (NULL, (void *) 0,
  (basic_block) prev->aux);
-  /* Update to use apply_probability().  */
+  /* Update to use apply_scale().  */
   copy_basic_block->count = bb->count * count_scale / REG_BR_PROB_BASE;
 
   /* We are going to rebuild frequencies from scratch.  These values
@@ -1891,7 +1891,7 @@ copy_edges_for_bb (basic_block bb, gcov_type count
&& old_edge->dest->aux != EXIT_BLOCK_PTR)
  flags |= EDGE_FALLTHRU;
new_edge = make_edge (new_bb, (basic_block) old_edge->dest->aux, flags);
-/* Update to use apply_probability().  */
+/* Update to use apply_scale().  */
new_edge->count = old_edge->count * count_scale / REG_BR_PROB_BASE;
new_edge->probability = old_edge->probability;
   }
@@ -2278,7 +2278,7 @@ copy_cfg_body (copy_body_data * id, gcov_type coun
incoming_frequency += EDGE_FREQUENCY (e);
incoming_count += e->count;
  }
-  /* Update to use apply_probability().  */
+  /* Update to use apply_scale().  */
   incoming_count = incoming_count * count_scale / REG_BR_PROB_BASE;
   /* Update to use EDGE_FREQUENCY.  */
   incoming_frequency
Index: tree-optimize.c
===
--- tree-optimize.c (revision 198344)
+++ tree-optimize.c (working copy)
@@ -131,15 +131,15 @@ execute_fixup_cfg (void)
 ENTRY_BLOCK_PTR->count);
 
   ENTRY_BLOCK_PTR->count = cgraph_get_node (current_function_decl)->count;
-  EXIT_BLOCK_PTR->count = apply_probability (EXIT_BLOCK_PTR->count,
- count_scale);
+  EXIT_BLOCK_PTR->count = apply_scale (EXIT_BLOCK_PTR->count,
+   count_scale);
 
   FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR->succs)
-e->count = apply_probability (e->count, count_scale);
+e->count = apply_scale (e->count, count_scale);
 
   FOR_EACH_BB (bb)
 {
-  bb->count = apply_probability (bb->count, count_

[RFA][PATCH] Eliminate more unnecessary type conversions

2013-04-26 Thread Jeff Law



So looking at more dumps made it pretty obvious that my previous patch 
to tree-vrp.c to eliminate useless casts to boolean types which fed into 
comparisons could and should be generalized.


Given:

  x1 = (T1) x0;
  if (x1 COND CONST)

If the known value range for x0 fits into T1, then we can rewrite as

  x1 = (T1) x0;
  if (x0 COND (T)CONST)

Which typically makes the first statement dead and may allow further 
simplifications.
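
As a source-level illustration of the rewrite above (a sketch only, with
T1 assumed to be unsigned char, x0 assumed to be an int whose known range is
[0, 255], and hypothetical function names), the two forms of the condition
agree inside that range and differ outside it:

```c
#include <stdbool.h>

/* Original form: the comparison is done on the narrowed value X1.  */
static bool
cond_on_narrowed (int x0)
{
  unsigned char x1 = (unsigned char) x0;
  return x1 == 5;
}

/* Rewritten form: the comparison is done on X0 directly.  This is
   equivalent only when X0 is known to fit in unsigned char.  */
static bool
cond_on_original (int x0)
{
  return x0 == 5;
}
```

For x0 == 261 the narrowed value is 5, so the two functions disagree, which
is why the known range of x0 has to fit into T1.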


Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for 
the trunk?


commit ad290c7270201042bfc3cde1d84c12e639e4bff7
Author: Jeff Law 
Date:   Fri Apr 26 12:52:06 2013 -0600

* tree-vrp.c (range_fits_type_p): Move to earlier point in file.
(simplify_cond_using_ranges): Generalize code to simplify
COND_EXPRs where one argument is a constant and the other
is an SSA_NAME created by an integral type conversion.

* gcc.dg/tree-ssa/vrp88.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d06eee6..f9b207c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2013-04-26  Jeff Law  
+
+   * tree-vrp.c (range_fits_type_p): Move to earlier point in file.
+   (simplify_cond_using_ranges): Generalize code to simplify
+   COND_EXPRs where one argument is a constant and the other
+   is an SSA_NAME created by an integral type conversion.
+
 2013-04-26  Vladimir Makarov  
 
* rtl.h (struct rtx_def): Add comment for field jump.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index bbea9fa..6d7839f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2013-04-26  Jeff Law  
+
+   * gcc.dg/tree-ssa/vrp88.c: New test.
+
 2013-04-26  Jakub Jelinek  
 
PR go/57045
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp88.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp88.c
new file mode 100644
index 000..e43bdff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp88.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+
+/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+
+
+typedef const struct bitmap_head_def *const_bitmap;
+typedef unsigned long BITMAP_WORD;
+typedef struct bitmap_element_def {
+  struct bitmap_element_def *next;
+  BITMAP_WORD bits[((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u))];
+} bitmap_element;
+typedef struct bitmap_head_def {
+  bitmap_element *first;
+} bitmap_head;
+unsigned char
+bitmap_single_bit_set_p (const_bitmap a)
+{
+  unsigned long count = 0;
+  const bitmap_element *elt;
+  unsigned ix;
+  if ((!(a)->first))
+return 0;
+  elt = a->first;
+  if (elt->next != ((void *)0))
+return 0;
+  for (ix = 0; ix != ((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u)); ix++)
+{
+  count += __builtin_popcountl (elt->bits[ix]);
+  if (count > 1)
+ return 0;
+}
+  return count == 1;
+}
+
+/* Verify that VRP simplified an "if" statement.  */
+/* { dg-final { scan-tree-dump "Folded into: if.*" "vrp1"} } */
+/* { dg-final { cleanup-tree-dump "vrp1" } } */
+
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index cb4a09a..07e3e01 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -8509,6 +8509,57 @@ test_for_singularity (enum tree_code cond_code, tree op0,
   return NULL;
 }
 
+/* Return whether the value range *VR fits in an integer type specified
+   by PRECISION and UNSIGNED_P.  */
+
+static bool
+range_fits_type_p (value_range_t *vr, unsigned precision, bool unsigned_p)
+{
+  tree src_type;
+  unsigned src_precision;
+  double_int tem;
+
+  /* We can only handle integral and pointer types.  */
+  src_type = TREE_TYPE (vr->min);
+  if (!INTEGRAL_TYPE_P (src_type)
+  && !POINTER_TYPE_P (src_type))
+return false;
+
+  /* An extension is fine unless VR is signed and unsigned_p,
+ and so is an identity transform.  */
+  src_precision = TYPE_PRECISION (TREE_TYPE (vr->min));
+  if ((src_precision < precision
+   && !(unsigned_p && !TYPE_UNSIGNED (src_type)))
+  || (src_precision == precision
+ && TYPE_UNSIGNED (src_type) == unsigned_p))
+return true;
+
+  /* Now we can only handle ranges with constant bounds.  */
+  if (vr->type != VR_RANGE
+  || TREE_CODE (vr->min) != INTEGER_CST
+  || TREE_CODE (vr->max) != INTEGER_CST)
+return false;
+
+  /* For sign changes, the MSB of the double_int has to be clear.
+ An unsigned value with its MSB set cannot be represented by
+ a signed double_int, while a negative value cannot be represented
+ by an unsigned double_int.  */
+  if (TYPE_UNSIGNED (src_type) != unsigned_p
+  && (TREE_INT_CST_HIGH (vr->min) | TREE_INT_CST_HIGH (vr->max)) < 0)
+return false;
+
+  /* Then we can perform the conversion on both ends and compare
+ the result for equality.  */
+  tem = tree_to_double_int (vr->min).ext (precision, unsigned_p);
+  if (tree_to_double_int (vr->min) != tem)
+return false;
+  tem = tree_to_double_int (vr->max).ext (precision, unsigned_p);
+  if (tree_to_double_int (vr->max) != tem)
+return false;
+
+  retu

[PATCH] Fix VRP LSHIFT_EXPR non-singleton shift count handling (PR tree-optimization/57083)

2013-04-26 Thread Jakub Jelinek

Hi!

If the shift count range is [0, 1], then for an unsigned LSHIFT_EXPR
bound is the topmost bit, but as the llshift method always sign-extends
the result into double_int, the test doesn't properly find out that
deriving the value range is unsafe.  In this case
vr0 is [0x7fff8001, 0x8001], thus when shifting up by 0 or one bit
we might shift out either zero or 1.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?
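
The effect of the sign extension can be modelled outside of GCC.  The
following is only a sketch using 64-bit integers in place of double_int;
low_mask, sext, zext and below_bound_p are hypothetical helpers, and the
precision of 32 and the upper bound 0x80000001 used in the note after the
code are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low PREC bits set; PREC is assumed to be in 1..64.  */
static uint64_t
low_mask (unsigned prec)
{
  return prec == 64 ? ~UINT64_C (0) : (UINT64_C (1) << prec) - 1;
}

/* Sign-extend the low PREC bits of V to 64 bits.  */
static uint64_t
sext (uint64_t v, unsigned prec)
{
  uint64_t sign = UINT64_C (1) << (prec - 1);
  v &= low_mask (prec);
  return (v ^ sign) - sign;
}

/* Zero-extend the low PREC bits of V to 64 bits.  */
static uint64_t
zext (uint64_t v, unsigned prec)
{
  return v & low_mask (prec);
}

/* Unsigned test modelled here: is the range maximum below LOW_BOUND?  */
static bool
below_bound_p (uint64_t vr0_max, uint64_t low_bound)
{
  return vr0_max < low_bound;
}
```

With a sign-extended bound of 1 << 31 the comparison against 0x80000001
succeeds, so the range looks safe to shift; with the zero-extended bound it
fails, as it should.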

2013-04-26  Jakub Jelinek  

PR tree-optimization/57083
* tree-vrp.c (extract_range_from_binary_expr_1): For LSHIFT_EXPR with
non-singleton shift count range, zero extend low_bound for uns case.

* gcc.dg/torture/pr57083.c: New test.

--- gcc/tree-vrp.c.jj   2013-04-24 12:07:07.0 +0200
+++ gcc/tree-vrp.c  2013-04-26 17:59:41.077938198 +0200
@@ -2837,7 +2837,7 @@ extract_range_from_binary_expr_1 (value_
 
  if (uns)
{
- low_bound = bound;
+ low_bound = bound.zext (prec);
  high_bound = complement.zext (prec);
  if (tree_to_double_int (vr0.max).ult (low_bound))
{
--- gcc/testsuite/gcc.dg/torture/pr57083.c.jj   2013-04-26 18:09:05.396031875 +0200
+++ gcc/testsuite/gcc.dg/torture/pr57083.c  2013-04-26 18:08:51.0 +0200
@@ -0,0 +1,15 @@
+/* PR tree-optimization/57083 */
+/* { dg-do run { target int32plus } } */
+
+extern void abort (void);
+short x = 1;
+int y = 0;
+
+int
+main ()
+{
+  unsigned t = (0x7fff8001U - x) << (y == 0);
+  if (t != 0xU)
+abort ();
+  return 0;
+}

Jakub

[PATCH] Improve vec_widen_?mult_odd_*

2013-04-26 Thread Jakub Jelinek

Hi!

On
#define N 4096
unsigned int b[N], d[N];

void
foo (void)
{
  int i;
  for (i = 0; i < N; i++)
d[i] = b[i] / 3;
}
testcase I was looking at earlier today because of the XOP issues,
I've noticed we generate unnecessary code:
vmovdqa .LC0(%rip), %ymm2
...
vpsrlq  $32, %ymm2, %ymm3
before the loop and in the loop:
vmovdqa b(%rax), %ymm0
vpmuludq b(%rax), %ymm2, %ymm1
...
vpsrlq  $32, %ymm0, %ymm0
vpmuludq %ymm3, %ymm0, %ymm0
...
.LC0:
.long   -1431655765
.long   -1431655765
.long   -1431655765
.long   -1431655765
.long   -1431655765
.long   -1431655765
.long   -1431655765
.long   -1431655765
The first vpsrlq and the extra register live across the loop are not
needed: if each pair of constants in the constant vector is equal, we can
just use .LC0(%rip) (i.e. %ymm2 above) in both places.
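
The reasoning can be checked with plain integers.  This is a sketch only,
not the i386.c code: pairs_equal_p and lane_multiplicand are hypothetical
helpers that model a constant vector as an array of 32-bit elements and a
64-bit lane as the pair (even element, odd element).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every odd-indexed element equals the even-indexed element
   just before it.  N is assumed to be even.  */
static bool
pairs_equal_p (const uint32_t *v, int n)
{
  for (int i = 0; i < n; i += 2)
    if (v[i] != v[i + 1])
      return false;
  return true;
}

/* Low 32 bits of the 64-bit lane built from elements LO and HI,
   optionally after shifting the lane right by 32 bits.  These are the
   bits a widening multiply of the low halves would consume.  */
static uint32_t
lane_multiplicand (uint32_t lo, uint32_t hi, bool shifted)
{
  uint64_t lane = ((uint64_t) hi << 32) | lo;
  if (shifted)
    lane >>= 32;
  return (uint32_t) lane;
}
```

When the pair is equal, the shifted and unshifted lanes supply the same
multiplicand, so the shift of the constant can be dropped.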

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux.

Apparently ix86_expand_binop_builtin wasn't prepared for NULL predicates
(but generic code is); alternatively, perhaps I could add a predicate
that would accept nonimmediate_operand or CONST_VECTOR if that is
preferable.
Also, not sure if force_reg or copy_to_mode_reg is preferable.

2013-04-26  Jakub Jelinek  

* config/i386/i386.c (ix86_expand_binop_builtin): Allow NULL
predicate.
(const_vector_equal_evenodd_p): New function.
(ix86_expand_mul_widen_evenodd): Force op1 resp. op2 into register
if they aren't nonimmediate operands.  If their original values
satisfy const_vector_equal_evenodd_p, don't shift them.
* config/i386/sse.md (mul3): Remove predicates.  For the
SSE4.1 case force operands[{1,2}] into registers if not
nonimmediate_operand.
(vec_widen_smult_even_v4si): Use nonimmediate_operand predicates
instead of register_operand.
(vec_widen_mult_odd_): Remove predicates.

--- gcc/config/i386/i386.c.jj   2013-04-26 15:11:37.0 +0200
+++ gcc/config/i386/i386.c  2013-04-26 19:03:54.777293448 +0200
@@ -30149,9 +30150,11 @@ ix86_expand_binop_builtin (enum insn_cod
   op1 = gen_lowpart (TImode, x);
 }
 
-  if (!insn_data[icode].operand[1].predicate (op0, mode0))
+  if (insn_data[icode].operand[1].predicate
+  && !insn_data[icode].operand[1].predicate (op0, mode0))
 op0 = copy_to_mode_reg (mode0, op0);
-  if (!insn_data[icode].operand[2].predicate (op1, mode1))
+  if (insn_data[icode].operand[2].predicate
+  && !insn_data[icode].operand[2].predicate (op1, mode1))
 op1 = copy_to_mode_reg (mode1, op1);
 
   pat = GEN_FCN (icode) (target, op0, op1);
@@ -40826,6 +40829,24 @@ ix86_expand_vecop_qihi (enum rtx_code co
   gen_rtx_fmt_ee (code, qimode, op1, op2));
 }
 
+/* Helper function of ix86_expand_mul_widen_evenodd.  Return true
+   if op is CONST_VECTOR with all odd elements equal to their
+   preceeding element.  */
+
+static bool
+const_vector_equal_evenodd_p (rtx op)
+{
+  enum machine_mode mode = GET_MODE (op);
+  int i, nunits = GET_MODE_NUNITS (mode);
+  if (GET_CODE (op) != CONST_VECTOR
+  || nunits != CONST_VECTOR_NUNITS (op))
+return false;
+  for (i = 0; i < nunits; i += 2)
+if (CONST_VECTOR_ELT (op, i) != CONST_VECTOR_ELT (op, i + 1))
+  return false;
+  return true;
+}
+
 void
 ix86_expand_mul_widen_evenodd (rtx dest, rtx op1, rtx op2,
   bool uns_p, bool odd_p)
@@ -40833,6 +40854,12 @@ ix86_expand_mul_widen_evenodd (rtx dest,
   enum machine_mode mode = GET_MODE (op1);
   enum machine_mode wmode = GET_MODE (dest);
   rtx x;
+  rtx orig_op1 = op1, orig_op2 = op2;
+
+  if (!nonimmediate_operand (op1, mode))
+op1 = force_reg (mode, op1);
+  if (!nonimmediate_operand (op2, mode))
+op2 = force_reg (mode, op2);
 
   /* We only play even/odd games with vectors of SImode.  */
   gcc_assert (mode == V4SImode || mode == V8SImode);
@@ -40849,10 +40876,12 @@ ix86_expand_mul_widen_evenodd (rtx dest,
}
 
   x = GEN_INT (GET_MODE_UNIT_BITSIZE (mode));
-  op1 = expand_binop (wmode, lshr_optab, gen_lowpart (wmode, op1),
- x, NULL, 1, OPTAB_DIRECT);
-  op2 = expand_binop (wmode, lshr_optab, gen_lowpart (wmode, op2),
- x, NULL, 1, OPTAB_DIRECT);
+  if (!const_vector_equal_evenodd_p (orig_op1))
+   op1 = expand_binop (wmode, lshr_optab, gen_lowpart (wmode, op1),
+   x, NULL, 1, OPTAB_DIRECT);
+  if (!const_vector_equal_evenodd_p (orig_op2))
+   op2 = expand_binop (wmode, lshr_optab, gen_lowpart (wmode, op2),
+   x, NULL, 1, OPTAB_DIRECT);
   op1 = gen_lowpart (mode, op1);
   op2 = gen_lowpart (mode, op2);
 }
--- gcc/config/i386/sse.md.jj   2013-04-26 15:11:37.0 +0200
+++ gcc/config/i386/sse.md  2013-04-26 18:59:03.838753277 +0200
@@ -5631,14 +5631,16 @@ (define_insn

[PATCH] Fix a -Wsign-compare warning in i386.c

2013-04-26 Thread Jakub Jelinek

Hi!

GCC 4.7.2 emits a -Wsign-compare warning when an unsigned iterator is compared
with cregs_size.  GCC 4.8 doesn't warn about it (otherwise bootstrap would
fail), because it calls maybe_constant_value before emitting the warning,
but still I'd say it is better to use the same signedness.
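
For background, the reason -Wsign-compare exists can be shown with a small
sketch (hypothetical function names, not taken from i386.c).  In a mixed
comparison the int operand is converted to unsigned int first, which the
first function spells out with an explicit cast:

```c
#include <stdbool.h>

/* What "a < b" computes when A is int and B is unsigned int:
   A is converted to unsigned int before the comparison.  */
static bool
less_after_conversion (int a, unsigned int b)
{
  return (unsigned int) a < b;
}

/* Comparison with both operands unsigned from the start, as when an
   unsigned iterator is compared with an unsigned size.  */
static bool
less_same_signedness (unsigned int a, unsigned int b)
{
  return a < b;
}
```

A negative value converts to a very large unsigned one, so -1 does not
compare as less than 1u.  Keeping both operands the same signedness avoids
the question entirely.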

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-04-26  Jakub Jelinek  

* config/i386/i386.c (ix86_expand_call): Make cregs_size unsigned.

--- gcc/config/i386/i386.c.jj   2013-04-26 19:11:33.0 +0200
+++ gcc/config/i386/i386.c  2013-04-26 19:12:21.329725950 +0200
@@ -23714,7 +23714,8 @@ ix86_expand_call (rtx retval, rtx fnaddr
  rtx callarg2,
  rtx pop, bool sibcall)
 {
-  int const cregs_size = ARRAY_SIZE (x86_64_ms_sysv_extra_clobbered_registers);
+  unsigned int const cregs_size
+= ARRAY_SIZE (x86_64_ms_sysv_extra_clobbered_registers);
   rtx vec[3 + cregs_size];
   rtx use = NULL, call;
   unsigned int vec_len = 0;

Jakub

Re: [patch, mips] Fix for PR target/56942

2013-04-26 Thread Steve Ellcey

On Wed, 2013-04-24 at 07:45 +0100, Richard Sandiford wrote:
> "Steve Ellcey "  writes:
> > 2013-04-19  Andrew Bennett 
> > Steve Ellcey  
> >
> > PR target/56942
> > * config/mips/mips.md (casesi_internal_mips16_): Use
> > next_active_insn instead of next_real_insn.
> 
> Hmm, I don't really like this.  Steven said from ARM in
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56809:
> 
> ---
> Target bug, this is wrong:
> 
>   rtx diff_vec = PATTERN (next_real_insn (operands[2]));
> 
> A jump_table_data is not a real insn.  Before my patch this worked
> by accident because the jump table would hide in a JUMP_INSN and 
> next_real_insn returned any JUMP_P insn.
> 
> Use next_active_insn instead.
> ---
> 
> But using next_real_insn was at least as correct (IMO, more correct)
> as next_active_insn before r197266.  It seems counterintuitive that
> something can be "active" but not "real".
> 
> Richard

So should we put the active_insn_p hack/FIXME into next_real_insn?  That
doesn't seem like much of a win but it would probably fix the problem.

Steve Ellcey
sell...@imgtec.com



From emit-rtl.c:

/* Find the next insn after INSN that really does something.  This routine
   does not look inside SEQUENCEs.  After reload this also skips over
   standalone USE and CLOBBER insn.  */

int
active_insn_p (const_rtx insn)
{
  return (CALL_P (insn) || JUMP_P (insn)
  || JUMP_TABLE_DATA_P (insn) /* FIXME */
  || (NONJUMP_INSN_P (insn)
  && (! reload_completed
  || (GET_CODE (PATTERN (insn)) != USE
  && GET_CODE (PATTERN (insn)) != CLOBBER))));
}

Re: RFA: enable LRA for rs6000

2013-04-26 Thread Vladimir Makarov


On 13-04-26 11:30 AM, Michael Meissner wrote:

Vlad, in going through the LRA test differences, some of the bswap64 tests are
failing because LRA converts the swaps for register/register converts into
store/load.  For example, if gcc.target/powerpc/bswap64-4.c is compiled on
32-bit, for this function:

long long swap_reg (long long a) { return __builtin_bswap64 (a); }

LRA gives:

swap_reg:
 stwu 1,-16(1)
 li 9,4
 stw 3,8(1)
 stw 4,12(1)
 addi 10,1,8
 lwbrx 3,9,10
 lwbrx 4,0,10
 addi 1,1,16
 blr

And the traditional code generation is:

swap_reg:
 rlwinm 9,4,8,0x
 rlwinm 10,3,8,0x
 rlwimi 9,4,24,0,7
 rlwimi 10,3,24,0,7
 rlwimi 9,4,24,16,23
 rlwimi 10,3,24,16,23
 mr 4,10
 mr 3,9

I assume the rlwinm's are to be preferred because there is no LHS
(load-hit-store), and also in this case, the rlwinm's for the 2 registers
are done in parallel.
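
For reference, the register-only approach amounts to byte-swapping each
32-bit half with rotates and masks and exchanging the halves.  The sketch
below is plain C with hypothetical helper names; it illustrates the idea and
does not claim to match the exact insn sequence the rs6000 patterns emit.

```c
#include <stdint.h>

/* Rotate X left by N bits; N is assumed to be in 1..31.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* Byte-swap a 32-bit value using two rotates and two masks.  */
static uint32_t
bswap32_rot (uint32_t x)
{
  return (rotl32 (x, 8) & 0x00ff00ffu) | (rotl32 (x, 24) & 0xff00ff00u);
}

/* Byte-swap a 64-bit value held as two 32-bit halves: swap each half
   and exchange them, without going through memory.  */
static uint64_t
bswap64_regs (uint64_t a)
{
  uint32_t hi = (uint32_t) (a >> 32);
  uint32_t lo = (uint32_t) a;
  return ((uint64_t) bswap32_rot (lo) << 32) | bswap32_rot (hi);
}
```

The two halves are independent of each other, which is what allows the two
swaps to proceed in parallel.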

The test gcc.target/powerpc/vect-83_64.c is failing in LRA:

vect-83_64.c: In function ‘main1’:
vect-83_64.c:30:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90)

  }
  ^
0x104dca7f lra_constraints(bool)
 /home/meissner/fsf-src/meissner-lra/gcc/lra-constraints.c:3613
0x104ca67b lra(_IO_FILE*)
 /home/meissner/fsf-src/meissner-lra/gcc/lra.c:2278
0x1047d6eb do_reload
 /home/meissner/fsf-src/meissner-lra/gcc/ira.c:4619
0x1047d6eb rest_of_handle_reload
 /home/meissner/fsf-src/meissner-lra/gcc/ira.c:4731
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.


The patch below solved the above problems.  The problem was that I
increased alternatives with '?' (although the manual says that it should
affect only regclass).  It slightly improves x86/x86-64 generated code
performance.  But power has some strange insns with  (four ?).  It
looks like voodoo to me.


I've committed the patch into the branch.



I'm also seeing quite a few Fortran failures for -m32:

It seems fixed too.

gfortran.dg/PR19872.f
gfortran.dg/advance_1.f90
gfortran.dg/advance_4.f90
gfortran.dg/advance_5.f90
gfortran.dg/advance_6.f90
gfortran.dg/append_1.f90
gfortran.dg/associated_2.f90
gfortran.dg/assumed_rank_1.f90
gfortran.dg/assumed_rank_2.f90
gfortran.dg/assumed_rank_7.f90
gfortran.dg/assumed_type_2.f90
gfortran.dg/backspace_10.f90
gfortran.dg/backspace_2.f
gfortran.dg/backspace_8.f
gfortran.dg/backspace_9.f
gfortran.dg/bound_2.f90
gfortran.dg/bound_7.f90
gfortran.dg/bound_8.f90
gfortran.dg/char_cshift_1.f90
gfortran.dg/char_cshift_2.f90
gfortran.dg/char_cshift_3.f90
gfortran.dg/char_eoshift_1.f90
gfortran.dg/char_eoshift_2.f90
gfortran.dg/char_eoshift_3.f90
gfortran.dg/char_eoshift_4.f90
gfortran.dg/char_eoshift_5.f90
gfortran.dg/char_length_8.f90
gfortran.dg/chmod_1.f90
gfortran.dg/chmod_2.f90
gfortran.dg/chmod_3.f90
gfortran.dg/comma.f
gfortran.dg/convert_2.f90
gfortran.dg/convert_implied_open.f90
gfortran.dg/cr_lf.f90
gfortran.dg/cshift_bounds_1.f90
gfortran.dg/cshift_bounds_2.f90
gfortran.dg/cshift_bounds_3.f90
gfortran.dg/cshift_bounds_4.f90
gfortran.dg/cshift_nan_1.f90
gfortran.dg/dev_null.F90
gfortran.dg/direct_io_1.f90
gfortran.dg/direct_io_11.f90
gfortran.dg/direct_io_12.f90
gfortran.dg/direct_io_2.f90
gfortran.dg/direct_io_3.f90
gfortran.dg/direct_io_5.f90
gfortran.dg/direct_io_8.f90
gfortran.dg/endfile.f90
gfortran.dg/endfile_2.f90
gfortran.dg/eof_4.f90
gfortran.dg/eoshift.f90
gfortran.dg/eoshift_bounds_1.f90
gfortran.dg/error_format.f90
gfortran.dg/f2003_inquire_1.f03
gfortran.dg/f2003_io_1.f03
gfortran.dg/f2003_io_5.f03
gfortran.dg/f2003_io_7.f03
gfortran.dg/fmt_cache_1.f
gfortran.dg/fmt_error_4.f90
gfortran.dg/fmt_error_5.f90
gfortran.dg/fmt_t_5.f90
gfortran.dg/fmt_t_7.f
gfortran.dg/ftell_3.f90
gfortran.dg/hollerith4.f90
gfortran.dg/inquire_10.f90
gfortran.dg/inquire_13.f90
gfortran.dg/inquire_15.f90
gfortran.dg/inquire_9.f90
gfortran.dg/inquire_size.f90
gfortran.dg/iomsg_1.f90
gfortran.dg/iostat_2.f90
gfortran.dg/list_read_10.f90
gfortran.dg/list_read_6.f90
gfortran.dg/list_read_7.f90
gfortran.dg/list_read_9.f90
gfortran.dg/matmul_1.f90
gfortran.dg/matmul_5.f90
gfortran.dg/maxloc_bounds_1.f90
gfortran.dg/maxloc_bounds_2.f90
gfortran.dg/maxloc_bounds_3.f90
gfortran.dg/maxloc_bounds_6.f90
gfortran.dg/maxloc_bounds_8.f90
gfortran.dg/namelist_44.f90
gfortran.dg/namelist_45.f90
gfortran.dg/namelist_46.f90
gfortran.dg/namelist_66.f90
gfortran.dg/namelist_72.f
gfortran.dg/namelist_82.f90
gfortran.dg/negative_automatic_size.f90
gfortran.dg/negative_unit.f
gfortran.dg/negative_unit_int8.f
gfortran.dg/newunit_1.f90
gfortran.dg/newunit_3.f90
gfortran.dg/open_access_append_1.f90
gfortran.dg/open_errors.f90
gfortran.dg/open_negative_unit_1.f90
gfortran.dg/open_new.f90
gfortran.dg/open_readonly_1.f90
gfortran.dg/open_status_1.f90
gfortran.dg/open_status_2.f90
gfortran.dg/

Re: RFA: enable LRA for rs6000 [32-bit fortran]

2013-04-26 Thread Vladimir Makarov


On 13-04-26 2:04 PM, Michael Meissner wrote:

In addition to all of the failures in the 32-bit gfortran suite, I ran one run
of the 32-bit SPEC 2006 Fortran tests, and the following benchmarks fail:

410.bwaves      416.gamess      434.zeusmp
437.leslie3d    454.calculix    459.GemsFDTD
465.tonto       481.wrf

I'll work on this on Monday.

The following 2 benchmarks succeed:

435.gromacs 436.cactusADM

Re: RFA: enable LRA for rs6000

2013-04-26 Thread Michael Meissner

On Fri, Apr 26, 2013 at 07:00:37PM -0400, Vladimir Makarov wrote:
> 2013-04-26  Vladimir Makarov  
> 
> * lra.c (setup_operand_alternative): Ignore '?'.
> * lra-constraints.c (process_alt_operands): Print cost dump for
> alternatives.  Check only moves for cycling.
> (curr_insn_transform): Print insn name.

I'm not sure I'm comfortable with ignoring the '?' altogether.  For example, if
you do something in the GPR unit, instructions run at one cycle, while if you
do it in the vector unit, it runs in two cycles.  In the past, I've seen cases
where it wanted to spill floating point values from the floating point
registers to the CTR.  And if you spill to the LR, it can interfere with the
call cache.

Admittedly, when to use '!', '?', and '*' is unclear, and unfortunately it has
changed over time.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

RE: [gomp4] Some progress on #pragma omp simd

2013-04-26 Thread Iyer, Balaji V

Hello Aldy and Jakub,
Please see my response below.

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Aldy Hernandez [mailto:al...@redhat.com]
> Sent: Wednesday, April 24, 2013 7:22 PM
> To: Jakub Jelinek
> Cc: Iyer, Balaji V; Richard Henderson; gcc-patches@gcc.gnu.org
> Subject: Re: [gomp4] Some progress on #pragma omp simd
> 
> [Balaji, see below].
> 
> Ok, this is confusing.  While the document in the link you posted (the ICC
> manual?) says so, the document I'm following says otherwise.
> 
> I'm following this (which, until a few days ago, was a link accessible from
> the Cilk Plus web page, though I no longer see it):

Yes, I am aware of the missing link. We are currently looking into it.

> 
> http://software.intel.com/sites/default/files/m/4/e/7/3/1/40297-Intel_Cilk_plus_lang_spec_2.htm
> 
> The document above is for version 1.1 of the Cilk Plus language extension
> specification, which I was told was the latest.  There it explicitly says
> that the clauses behave exactly like in OpenMP:
> 
> "The syntax and semantics of the various simd-openmp-data-clauses are
> detailed in the OpenMP specification.
> (http://www.openmp.org/mp-documents/spec30.pdf, Section 2.9.3)."
> 
> Balaji, can you verify which is correct?  For that matter, which are the
> official specs from which we should be basing this work?

The privatization clause makes a variable private to each SIMD lane.  In general, I
would follow the spec.  If you have further questions, please feel free to ask.

Thanks,

Balaji V. Iyer.

> 
> Aldy
> 
> 
> On 04/24/13 01:40, Jakub Jelinek wrote:
> > On Wed, Apr 24, 2013 at 08:25:36AM +0200, Jakub Jelinek wrote:
> >> BTW, the semantics of private/firstprivate/lastprivate described in
> >> http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/cref_cls/common/cppref_pragma_simd.htm
> >> doesn't seem to match the semantics of those in #pragma omp simd.
> >> private in OpenMP I understand is private to the whole loop (or SIMD
> >> lane?;
> >
> > SIMD lane apparently.  Guess that is going to be quite difficult,
> > because at the point of omp lowering or expansion we are nowhere close
> > to knowing what vectorization factor we are going to choose, all we
> > have is an upper bound on that based on the target ISA and safelen clause.
> > If say private clause is used with C++ classes with non-trivial
> > ctors/dtors that would make a difference.  Plus how to represent this in
> > the IL.
> >
> > struct A { A (); ~A (); A (const A &); int i; };
> >
> > void
> > foo ()
> > {
> >A a, b;
> >#pragma omp simd private (a) lastprivate (b)
> >for (int i = 0; i < 10; i++)
> >  {
> >a.i++;
> >b.i++;
> >  }
> > }
> >
> > Right now what gomp4 branch does is that it will just construct
> > private vars around the whole loop, as in:
> > void
> > foo ()
> > {
> >A a, b;
> >{
> >  A a', b';
> >  int i;
> >  for (i = 0; i < 10; i++)
> >{
> > a'.i++;
> > b'.i++;
> > if (i == 9)
> >   b = b';
> >}
> >}
> > }
> >
> > Jakub
> >

Re: [C++ Patch/RFC] PR 56450

2013-04-26 Thread Jason Merrill

Why should id_expression_or_member_access_p be false? 
"declval().dummy" is a class member access (5.2.5) regardless of what 
kind of member dummy is.


Jason

Backport r185150 to google-4_7 and google-4_8(save --std flag as cl_args)

2013-04-26 Thread Dehao Chen

Bootstrapped and passed regression tests.

Okay for google-4_7 and google-4_8 branches?

Thanks,
Dehao

2012-03-09   Rong Xu  

* opts-global.c (lipo_save_cl_args): save -std option.
Google ref b/6117980.

Index: gcc/opts-global.c
===
--- gcc/opts-global.c (revision 198164)
+++ gcc/opts-global.c (working copy)
@@ -277,7 +277,7 @@ lipo_save_cl_args (struct cl_decoded_option *decod
   */
   if (opt[0] == '-'
   && (opt[1] == 'f' || opt[1] == 'm' || opt[1] == 'W' || opt[1] == 'O'
-  || (strstr (opt, "--param") == opt))
+  || (strstr (opt, "--param") == opt) || (strstr (opt, "-std=")))
   && !strstr(opt, "-frandom-seed")
   && !strstr(opt, "-fripa-disallow-opt-mismatch")
   && !strstr(opt, "-Wripa-opt-mismatch"))
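
Pulled out of lipo_save_cl_args, the option filter with the new -std test
looks roughly like the sketch below.  The function names has_prefix and
option_is_saved are hypothetical, and the surrounding LIPO bookkeeping is
omitted.  Note that the --param test only matches at the start of the
option, while the -std= test matches anywhere in it, so both -std=... and
--std=... are picked up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if S starts with PREFIX; same meaning as strstr (s, p) == s.  */
static bool
has_prefix (const char *s, const char *prefix)
{
  return strncmp (s, prefix, strlen (prefix)) == 0;
}

/* Sketch of the filter: should OPT be recorded as a saved option?  */
static bool
option_is_saved (const char *opt)
{
  if (opt[0] != '-')
    return false;

  bool kind_ok = (opt[1] == 'f' || opt[1] == 'm'
                  || opt[1] == 'W' || opt[1] == 'O'
                  || has_prefix (opt, "--param")
                  || strstr (opt, "-std=") != NULL);

  return (kind_ok
          && !strstr (opt, "-frandom-seed")
          && !strstr (opt, "-fripa-disallow-opt-mismatch")
          && !strstr (opt, "-Wripa-opt-mismatch"));
}
```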

Re: Backport r185150 to google-4_7 and google-4_8(save --std flag as cl_args)

2013-04-26 Thread Xinliang David Li

Ok.

David

On Fri, Apr 26, 2013 at 6:42 PM, Dehao Chen  wrote:
> Bootstrapped and passed regression tests.
>
> Okay for google-4_7 and google-4_8 branches?
>
> Thanks,
> Dehao
>
> 2012-03-09   Rong Xu  
>
> * opts-global.c (lipo_save_cl_args): save -std option.
> Google ref b/6117980.
>
> Index: gcc/opts-global.c
> ===
> --- gcc/opts-global.c (revision 198164)
> +++ gcc/opts-global.c (working copy)
> @@ -277,7 +277,7 @@ lipo_save_cl_args (struct cl_decoded_option *decod
>*/
>if (opt[0] == '-'
>&& (opt[1] == 'f' || opt[1] == 'm' || opt[1] == 'W' || opt[1] == 'O'
> -  || (strstr (opt, "--param") == opt))
> +  || (strstr (opt, "--param") == opt) || (strstr (opt, "-std=")))
>&& !strstr(opt, "-frandom-seed")
>&& !strstr(opt, "-fripa-disallow-opt-mismatch")
>&& !strstr(opt, "-Wripa-opt-mismatch"))

[GOOGLE] Disallow importing modules with different --std

2013-04-26 Thread Dehao Chen

This patch forbids importing a module as an aux module if its --std
differs from that of the primary module.

Bootstrapped and passed regression test.

OK for google branches?

Thanks,
Dehao

Index: gcc/coverage.c
===
--- gcc/coverage.c (revision 198353)
+++ gcc/coverage.c (working copy)
@@ -384,6 +384,7 @@ incompatible_cl_args (struct gcov_module_info* mod
   char **warning_opts2 = XNEWVEC (char *, mod_info2->num_cl_args);
   char **non_warning_opts1 = XNEWVEC (char *, mod_info1->num_cl_args);
   char **non_warning_opts2 = XNEWVEC (char *, mod_info2->num_cl_args);
+  char *std_opts1 = NULL, *std_opts2 = NULL;
   unsigned int i, num_warning_opts1 = 0, num_warning_opts2 = 0;
   unsigned int num_non_warning_opts1 = 0, num_non_warning_opts2 = 0;
   bool warning_mismatch = false;
@@ -396,7 +397,7 @@ incompatible_cl_args (struct gcov_module_info* mod
 mod_info2->num_bracket_paths + mod_info2->num_cpp_defines +
 mod_info2->num_cpp_includes;

-  bool *cg_opts1, *cg_opts2, has_any_incompatible_cg_opts;
+  bool *cg_opts1, *cg_opts2, has_any_incompatible_cg_opts,
has_incompatible_std;
   unsigned int num_cg_opts = 0;

   for (i = 0; force_matching_cg_opts[i].opt_str; i++)
@@ -426,6 +427,8 @@ incompatible_cl_args (struct gcov_module_info* mod
 char *option_string = mod_info1->string_array[start_index1 + i];

 check_cg_opts (cg_opts1, option_string);
+ if (strstr (option_string, "-std="))
+  std_opts1 = option_string;

 slot = htab_find_slot (option_tab1, option_string, INSERT);
 if (!*slot)
@@ -445,6 +448,8 @@ incompatible_cl_args (struct gcov_module_info* mod
 char *option_string = mod_info2->string_array[start_index2 + i];

 check_cg_opts (cg_opts2, option_string);
+ if (strstr (option_string, "-std="))
+  std_opts2 = option_string;

 slot = htab_find_slot (option_tab2, option_string, INSERT);
 if (!*slot)
@@ -454,6 +459,10 @@ incompatible_cl_args (struct gcov_module_info* mod
   }
   }

+  has_incompatible_std =
+  std_opts1 != std_opts2 && (std_opts1 == NULL || std_opts2 == NULL
+ || strcmp (std_opts1, std_opts2));
+
   /* Compare warning options. If these mismatch, we emit a warning.  */
   if (num_warning_opts1 != num_warning_opts2)
 warning_mismatch = true;
@@ -498,7 +507,7 @@ incompatible_cl_args (struct gcov_module_info* mod
htab_delete (option_tab1);
htab_delete (option_tab2);
return ((flag_ripa_disallow_opt_mismatch && non_warning_mismatch)
-   || has_any_incompatible_cg_opts);
+   || has_any_incompatible_cg_opts || has_incompatible_std);
 }

 /* Support for module sorting based on user specfication.  */
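
The has_incompatible_std expression is a NULL-safe string comparison.  In
isolation it can be sketched as follows; std_opts_differ is a hypothetical
name, and the arguments stand for the -std option strings recorded for the
two modules, or NULL when a module had none.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the two -std option strings differ.  Two NULL pointers
   (neither module used -std) count as equal; one NULL and one
   non-NULL count as different.  */
static bool
std_opts_differ (const char *std1, const char *std2)
{
  return (std1 != std2
          && (std1 == NULL || std2 == NULL
              || strcmp (std1, std2) != 0));
}
```

The leading pointer comparison handles the case where both are NULL before
strcmp is ever reached.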

Re: [GOOGLE] Disallow importing modules with different --std

2013-04-26 Thread Xinliang David Li

OK with benchmark testing.

This needs to be in all google branches (47, 48 and main).

David

On Fri, Apr 26, 2013 at 7:57 PM, Dehao Chen  wrote:
> This patch forbids importing a module as an aux module if its --std
> differs from that of the primary module.
>
> Bootstrapped and passed regression test.
>
> OK for google branches?
>
> Thanks,
> Dehao
>
> Index: gcc/coverage.c
> ===
> --- gcc/coverage.c (revision 198353)
> +++ gcc/coverage.c (working copy)
> @@ -384,6 +384,7 @@ incompatible_cl_args (struct gcov_module_info* mod
>char **warning_opts2 = XNEWVEC (char *, mod_info2->num_cl_args);
>char **non_warning_opts1 = XNEWVEC (char *, mod_info1->num_cl_args);
>char **non_warning_opts2 = XNEWVEC (char *, mod_info2->num_cl_args);
> +  char *std_opts1 = NULL, *std_opts2 = NULL;
>unsigned int i, num_warning_opts1 = 0, num_warning_opts2 = 0;
>unsigned int num_non_warning_opts1 = 0, num_non_warning_opts2 = 0;
>bool warning_mismatch = false;
> @@ -396,7 +397,7 @@ incompatible_cl_args (struct gcov_module_info* mod
>  mod_info2->num_bracket_paths + mod_info2->num_cpp_defines +
>  mod_info2->num_cpp_includes;
>
> -  bool *cg_opts1, *cg_opts2, has_any_incompatible_cg_opts;
> +  bool *cg_opts1, *cg_opts2, has_any_incompatible_cg_opts, has_incompatible_std;
>unsigned int num_cg_opts = 0;
>
>for (i = 0; force_matching_cg_opts[i].opt_str; i++)
> @@ -426,6 +427,8 @@ incompatible_cl_args (struct gcov_module_info* mod
>  char *option_string = mod_info1->string_array[start_index1 + i];
>
>  check_cg_opts (cg_opts1, option_string);
> + if (strstr (option_string, "-std="))
> +  std_opts1 = option_string;
>
>  slot = htab_find_slot (option_tab1, option_string, INSERT);
>  if (!*slot)
> @@ -445,6 +448,8 @@ incompatible_cl_args (struct gcov_module_info* mod
>  char *option_string = mod_info2->string_array[start_index2 + i];
>
>  check_cg_opts (cg_opts2, option_string);
> + if (strstr (option_string, "-std="))
> +  std_opts2 = option_string;
>
>  slot = htab_find_slot (option_tab2, option_string, INSERT);
>  if (!*slot)
> @@ -454,6 +459,10 @@ incompatible_cl_args (struct gcov_module_info* mod
>}
>}
>
> +  has_incompatible_std =
> +  std_opts1 != std_opts2 && (std_opts1 == NULL || std_opts2 == NULL
> + || strcmp (std_opts1, std_opts2));
> +
>/* Compare warning options. If these mismatch, we emit a warning.  */
>if (num_warning_opts1 != num_warning_opts2)
>  warning_mismatch = true;
> @@ -498,7 +507,7 @@ incompatible_cl_args (struct gcov_module_info* mod
> htab_delete (option_tab1);
> htab_delete (option_tab2);
> return ((flag_ripa_disallow_opt_mismatch && non_warning_mismatch)
> -   || has_any_incompatible_cg_opts);
> +   || has_any_incompatible_cg_opts || has_incompatible_std);
>  }
>
>  /* Support for module sorting based on user specfication.  */

[wwwdocs] C++14 support for binary literals says No instead of Yes

2013-04-26 Thread Ed Smith-Rowland

In htdocs/projects/cxx1y.html it says no for support of binary 
literals.  I think that's a Yes actually.


Here is a little patchlet.

Am I missing something?

Index: htdocs/projects/cxx1y.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx1y.html,v
retrieving revision 1.1
diff -r1.1 cxx1y.html
56c56
<   No
---
>   Yes
