Re: Fix libgomp crash without TLS (PR42616)

2014-09-30 Thread Varvara Rainchik
Corrected patch: call pthread_setspecific (gomp_tls_key, NULL) in
gomp_thread_start if HAVE_TLS is not defined.

2014-09-19  Varvara Rainchik  

* libgomp.h (gomp_thread): For non TLS case create thread data.
* team.c (non_tls_thread_data_destructor,
create_non_tls_thread_data): New functions.


---
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index bcd5b34..2f33d99 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -467,9 +467,15 @@ static inline struct gomp_thread *gomp_thread (void)
 }
 #else
 extern pthread_key_t gomp_tls_key;
-static inline struct gomp_thread *gomp_thread (void)
+extern struct gomp_thread *create_non_tls_thread_data (void);
+static struct gomp_thread *gomp_thread (void)
 {
-  return pthread_getspecific (gomp_tls_key);
+  struct gomp_thread *thr = pthread_getspecific (gomp_tls_key);
+  if (thr == NULL)
+  {
+thr = create_non_tls_thread_data ();
+  }
+  return thr;
 }
 #endif

diff --git a/libgomp/team.c b/libgomp/team.c
index e6a6d8f..1854d8a 100644
--- a/libgomp/team.c
+++ b/libgomp/team.c
@@ -41,6 +41,7 @@ pthread_key_t gomp_thread_destructor;
 __thread struct gomp_thread gomp_tls_data;
 #else
 pthread_key_t gomp_tls_key;
+struct gomp_thread initial_thread_tls_data;
 #endif


@@ -130,6 +131,9 @@ gomp_thread_start (void *xdata)
   gomp_sem_destroy (&thr->release);
   thr->thread_pool = NULL;
   thr->task = NULL;
+#ifndef HAVE_TLS
+  pthread_setspecific (gomp_tls_key, NULL);
+#endif
   return NULL;
 }

@@ -222,8 +226,16 @@ gomp_free_pool_helper (void *thread_pool)
 void
 gomp_free_thread (void *arg __attribute__((unused)))
 {
-  struct gomp_thread *thr = gomp_thread ();
-  struct gomp_thread_pool *pool = thr->thread_pool;
+  struct gomp_thread *thr;
+  struct gomp_thread_pool *pool;
+#ifdef HAVE_TLS
+  thr = gomp_thread ();
+#else
+  thr = pthread_getspecific (gomp_tls_key);
+  if (thr == NULL)
+return;
+#endif
+  pool = thr->thread_pool;
   if (pool)
 {
   if (pool->threads_used > 0)
@@ -910,6 +922,21 @@ gomp_team_end (void)
 }
 }

+/* Destructor for data created in create_non_tls_thread_data.  */
+
+#ifndef HAVE_TLS
+void
+non_tls_thread_data_destructor (void *arg __attribute__((unused)))
+{
+  struct gomp_thread *thr = pthread_getspecific (gomp_tls_key);
+  if (thr != NULL && thr != &initial_thread_tls_data)
+  {
+gomp_free_thread (arg);
+free (thr);
+pthread_setspecific (gomp_tls_key, NULL);
+  }
+}
+#endif

 /* Constructors for this file.  */

@@ -917,9 +944,7 @@ static void __attribute__((constructor))
 initialize_team (void)
 {
 #ifndef HAVE_TLS
-  static struct gomp_thread initial_thread_tls_data;
-
-  pthread_key_create (&gomp_tls_key, NULL);
+  pthread_key_create (&gomp_tls_key, non_tls_thread_data_destructor);
   pthread_setspecific (gomp_tls_key, &initial_thread_tls_data);
 #endif

@@ -927,6 +952,19 @@ initialize_team (void)
 gomp_fatal ("could not create thread pool destructor.");
 }

+/* Create data for thread created by pthread_create.  */
+
+#ifndef HAVE_TLS
+struct gomp_thread *create_non_tls_thread_data (void)
+{
+  struct gomp_thread *thr = gomp_malloc_cleared (sizeof (struct gomp_thread));
+  pthread_setspecific (gomp_tls_key, thr);
+  gomp_sem_init (&thr->release, 0);
+
+  return thr;
+}
+#endif
+
 static void __attribute__((destructor))
 team_destructor (void)
 {




2014-09-24 14:19 GMT+04:00 Varvara Rainchik :
> *Ping*
>
> 2014-09-19 15:41 GMT+04:00 Varvara Rainchik :
>> I've corrected my patch according to what you said. To differentiate
>> second case in destructor I've added pthread_setspecific
>> (gomp_tls_key, NULL) at the end of gomp_thread_start. So, destructor
>> can simply skip the case when pthread_getspecific (gomp_tls_key)
>> returns 0. I also think that it's better to set 0 in gomp_thread_start
>> explicitly as thread data is initialized by a local variable in this
>> function.
>>
>> But, I see that pthread_getspecific always returns 0 in destructor
>> because data pointer is implicitly set to 0 before destructor call in
>> glibc:
>>
>> (pthread_create.c):
>>
>> /* Always clear the data. */
>> level2[inner].data = NULL;
>>
>> /* Make sure the data corresponds to a valid
>> key. This test fails if the key was
>> deallocated and also if it was
>> re-allocated. It is the user's
>> responsibility to free the memory in this
>> case. */
>> if (level2[inner].seq
>>== __pthread_keys[idx].seq
>>/* It is not necessary to register a destructor
>>   function. */
>>  && __pthread_keys[idx].destr != NULL)
>> /* Call the user-provided destructor. */
>> __pthread_keys[idx].destr (data);
>>
>> I suppose it's not necessary if everything is cleaned up in
>> gomp_thread_start  and destructor. What do you think?
>>
>>
>> Changes are bootstrapped and regtested on x86_64-linux.
>>
>> 2014-09-19  Varvara Rainchik  
>>
>> * libgomp.h (gomp_thread): For non TLS case create thread data.
>> * team.c (non_tls_thread_data_destructor,
>> create_non_tls_thread_data): New functions.
>>
>>
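
The two points discussed in the quoted message (creating per-thread data lazily on first access, and the key's slot already being cleared when the destructor runs) can be illustrated outside libgomp. This is a minimal standalone sketch, not the libgomp code: `thread_data`, `get_thread_data`, `run_worker` and the two counters are hypothetical names introduced only for illustration. POSIX specifies that at thread exit the key's value is set to NULL and the destructor is then called with the previous value as its argument, so the destructor below frees its argument rather than relying on pthread_getspecific.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct gomp_thread.  */
struct thread_data { int counter; };

static pthread_key_t data_key;
static pthread_once_t data_key_once = PTHREAD_ONCE_INIT;

/* Illustration only: records what the destructor observed.  */
static int destructor_calls;
static int slot_was_null_in_destructor;

static void
thread_data_destructor (void *arg)
{
  /* The slot has already been cleared; ARG holds the old value.  */
  if (pthread_getspecific (data_key) == NULL)
    slot_was_null_in_destructor = 1;
  destructor_calls++;
  free (arg);
}

static void
make_key (void)
{
  if (pthread_key_create (&data_key, thread_data_destructor) != 0)
    abort ();
}

/* Return this thread's data, allocating it on first use.  */
struct thread_data *
get_thread_data (void)
{
  struct thread_data *td;

  pthread_once (&data_key_once, make_key);
  td = pthread_getspecific (data_key);
  if (td == NULL)
    {
      td = calloc (1, sizeof *td);
      if (td == NULL || pthread_setspecific (data_key, td) != 0)
        abort ();
    }
  return td;
}

static void *
worker (void *arg)
{
  /* Both calls must return the same lazily created object.  */
  *(int *) arg = (get_thread_data () == get_thread_data ());
  return NULL;
}

/* Run one thread created directly with pthread_create; return 1 if
   the thread saw a stable data pointer.  */
int
run_worker (void)
{
  pthread_t tid;
  int same = 0;

  if (pthread_create (&tid, NULL, worker, &same) != 0)
    abort ();
  pthread_join (tid, NULL);
  return same;
}
```

After run_worker () returns, the destructor has run once for the worker thread and has seen an already-cleared slot, which matches the glibc behaviour quoted above.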

Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-30 Thread Yury Gribov

On 09/30/2014 09:40 AM, Jakub Jelinek wrote:

On Mon, Sep 29, 2014 at 05:24:02PM -0700, Konstantin Serebryany wrote:

I don't think we are ever going to support recovery for regular ASan
(Kostya, correct me if I'm wrong).


I hope so too.
Another point is that with asan-instrumentation-with-call-threshold=0
(instrumentation with callbacks)


The normal (non-recovery) callbacks are __attribute__((noreturn)) for
performance reasons, and you do need different callbacks and different
generated code if you want to recover (after the callback you need jump
back to a basic block after the conditional jump).
So, in that case you would need -fsanitize-recover=address.


I see no problem in enabling -fsanitize-recover by default for
-fsanitize=undefined and


This becomes more interesting when we use asan and ubsan together.


That is fairly common case.


I think we can summarize:
* the current option -fsanitize-recover is misleading; it's really 
-fubsan-recover
* we need a way to selectively enable/disable recovery for different 
sanitizers


The most prominent solution seems to be
* allow -fsanitize-recover=tgt1,tgt2 syntax
* -fsanitize-recover without options would still mean UBSan recovery

The question is what to do with -fno-sanitize-recover then.
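
To make the proposed -fsanitize-recover=tgt1,tgt2 syntax concrete, here is a rough sketch of how such a comma-separated list could be turned into a bit mask. It is not GCC's option handling: the function name, the flag names and the set of accepted target names are all assumptions made for illustration. A missing or empty argument is mapped to UBSan recovery only, following the second bullet above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical recovery flags, one bit per sanitizer.  */
enum
{
  RECOVER_ADDRESS = 1 << 0,
  RECOVER_KERNEL_ADDRESS = 1 << 1,
  RECOVER_UNDEFINED = 1 << 2
};

struct recover_target
{
  const char *name;
  unsigned int flag;
};

/* Assumed spelling of the target names.  */
static const struct recover_target recover_targets[] =
{
  { "address", RECOVER_ADDRESS },
  { "kernel-address", RECOVER_KERNEL_ADDRESS },
  { "undefined", RECOVER_UNDEFINED }
};

/* Parse ARG ("tgt1,tgt2,...") into *MASK.  A NULL or empty ARG models
   plain -fsanitize-recover and selects UBSan recovery only.  Return 0
   on success, -1 if a name is not recognized.  */
int
parse_recover_targets (const char *arg, unsigned int *mask)
{
  const size_t n = sizeof recover_targets / sizeof recover_targets[0];
  unsigned int result = 0;

  if (arg == NULL || *arg == '\0')
    {
      *mask = RECOVER_UNDEFINED;
      return 0;
    }

  while (*arg != '\0')
    {
      const char *comma = strchr (arg, ',');
      size_t len = comma ? (size_t) (comma - arg) : strlen (arg);
      size_t i;

      for (i = 0; i < n; i++)
        if (strlen (recover_targets[i].name) == len
            && strncmp (arg, recover_targets[i].name, len) == 0)
          {
            result |= recover_targets[i].flag;
            break;
          }
      if (i == n)
        return -1;

      arg += len;
      if (*arg == ',')
        arg++;
    }

  *mask = result;
  return 0;
}
```

The sketch deliberately leaves the -fno-sanitize-recover question open; it only shows that a per-sanitizer mask falls out naturally from the list syntax.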

-Y



Re: [PATCH v2] Fix signed integer overflow in gcc/data-streamer.c

2014-09-30 Thread Markus Trippelsdorf
On 2014.09.28 at 14:57 +0200, Markus Trippelsdorf wrote:
> On 2014.09.28 at 14:36 +0200, Steven Bosscher wrote:
> > 
> > Can you use HOST_WIDE_INT_1U for this?
> 
> Sure. Thanks for the suggestion. 
> (Fix now resembles similar idiom in data-streamer-in.c)

I checked in the fix as obvious.

-- 
Markus


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-30 Thread Yury Gribov

On 09/30/2014 10:56 AM, Yury Gribov wrote:

On 09/30/2014 04:24 AM, Konstantin Serebryany wrote:

On Mon, Sep 29, 2014 at 4:26 PM, Alexey Samsonov 
wrote:

I don't think we are ever going to support recovery for regular ASan
(Kostya, correct me if I'm wrong).


I hope so too.
Another point is that with asan-instrumentation-with-call-threshold=0
(instrumentation with callbacks)
we can and probably will allow to recover from errors (glibc demands
that),
but that does not require any compile-time flag.


I don't know details but are you absolutely sure that you won't want to
do inline instrumentation of glibc in the future? This would then
require -fasan-recover.


FYI in kernel we had exactly this situation: outline instrumentation 
allowed us to hide recovery inside callbacks but then turned out to be 
too slow so we are now switching back to inline instrumentation (which 
requires -fasan-recovery).


-Y



Infrastructure for forward propagation of polymorphic call contexts

2014-09-30 Thread Jan Hubicka
Hi,
this patch adds an interface for forward propagation of polymorphic contexts.
The API is as follows:
 - OFFSET_BY that allows to change offset of context
   (to support ipa-prop ancestor functions),
 - MAKE_SPECULATIVE to kill non-speculative info when we can not keep track
   of dynamic type changes, and
 - COMBINE_WITH that is used to produce context that is combination of two that
   are both known to be valid.

There is also USELESS_P predicate that can be used to throw away context that
has nothing important in it.

This is all infrastructure needed to propagate inside inliner modulo the need
to revisit all code that track dynamic type changes - the old code is all built
around the idea that there is only single inheritance and that one class can
contain another only as a base (not as field).  I plan to do that incrementally.

Early results seem pretty good, bumping up the number of devirtualizations done by
the inliner on Firefox from 200 to 300 full and 13000 speculative ones (out of cca
3 remaining polymorphic calls after ipa-devirt pass).  This should further
improve once I re-enable dynamic type tracking.  I will need to validate this
with profile data though.

Merging two contexts turns out to be quite a lot of code and current
implementation may turn out to be too busy. Conceptually it is an easy walk from
smaller types to bigger types, but it gets nasty in the details because one class
may contain another as a base, as a field, or via placement new allocation.
All three cases are a bit different for merging.  I want to get some experience
with current implementation and perhaps simplify/extend it in future.
Perhaps we do not really care much about fancy cases.

The patch also fixes some issues I found while reorganizing the code.

I have bootstrapped/regtested this patch with ipa-prop modified to use the
new bits.  I plan to commit this after some further testing tomorrow.

Honza

* ipa-polymorphic-call.c
(ipa_polymorphic_call_context::restrict_to_inner_class):
Rename EXPECTED_TYPE to OTR_TYPE; Validate speculation late;
use speculation_consistent_p to do so; Add CONSIDER_BASES
and CONSIDER_PLACEMENT_NEW parameters.
(contains_type_p): Add CONSIDER_PLACEMENT_NEW and CONSIDER_BASES;
short circuit obvious cases.
(ipa_polymorphic_call_context::dump): Improve formatting.
(ipa_polymorphic_call_context::ipa_polymorphic_call_context): Use
combine_speculation_with to record speculations; Do not ICE when
object is located in pointer type decl; do not ICE for methods
of UNION_TYPE; do not record nonpolymorphic types.
(ipa_polymorphic_call_context::speculation_consistent_p): New method.
(ipa_polymorphic_call_context::combine_speculation_with): New method.
(ipa_polymorphic_call_context::combine_with): New method.
(ipa_polymorphic_call_context::make_speculative): Move here; use
combine speculation.
* cgraph.h (ipa_polymorphic_call_context): Update
restrict_to_inner_class prototype; add offset_by, make_speculative, 
combine_with, useless_p, combine_speculation_with and
speculation_consistent_p methods.
(ipa_polymorphic_call_context::offset_by): New method.
(ipa_polymorphic_call_context::useless_p): New method.
Index: ipa-polymorphic-call.c
===
--- ipa-polymorphic-call.c  (revision 215658)
+++ ipa-polymorphic-call.c  (working copy)
@@ -53,7 +53,9 @@ along with GCC; see the file COPYING3.
 /* Return true when TYPE contains an polymorphic type and thus is interesting
for devirtualization machinery.  */
 
-static bool contains_type_p (tree, HOST_WIDE_INT, tree);
+static bool contains_type_p (tree, HOST_WIDE_INT, tree,
+bool consider_placement_new = true,
+bool consider_bases = true);
 
 bool
 contains_polymorphic_type_p (const_tree type)
@@ -99,13 +101,13 @@ possible_placement_new (tree type, tree
  <= tree_to_uhwi (TYPE_SIZE (type);
 }
 
-/* THIS->OUTER_TYPE is a type of memory object where object of EXPECTED_TYPE
+/* THIS->OUTER_TYPE is a type of memory object where object of OTR_TYPE
is contained at THIS->OFFSET.  Walk the memory representation of
THIS->OUTER_TYPE and find the outermost class type that match
-   EXPECTED_TYPE or contain EXPECTED_TYPE as a base.  Update THIS
+   OTR_TYPE or contain OTR_TYPE as a base.  Update THIS
to represent it.
 
-   If EXPECTED_TYPE is NULL, just find outermost polymorphic type with
+   If OTR_TYPE is NULL, just find outermost polymorphic type with
virtual table present at possition OFFSET.
 
For example when THIS represents type
@@ -119,11 +121,20 @@ possible_placement_new (tree type, tree
sizeof(int). 
 
If we can not find corresponding class, give up by setting
-   THIS->OUTER_TYPE to EXPECTED_TYPE and THIS->OFFSET to NULL. 
-  

Re: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o compile, (internal compiler error)"

2014-09-30 Thread Richard Sandiford
"David Sherwood"  writes:
> @@ -1859,9 +1861,11 @@ static basic_block curr_bb;
>  
>  /* This recursive function creates allocnos corresponding to
> pseudo-registers containing in X.  True OUTPUT_P means that X is
> -   a lvalue.  */
> +   a lvalue.  The 'parent' parameter corresponds to the parent expression
> +   of 'rtx'.
> + */

Coding style nit: parameters should be written in caps rather than in quotes
and the "*/" should be on the same line as the "." (two spaces in between).

> +   if (outer_regno < 0 ||
> +   !in_hard_reg_set_p (reg_class_contents[aclass],
> +   outer_mode, outer_regno))

Another one, sorry: || should be at the start of the line rather than
the end.  Also, indentation should be by tabs as far as possible,
then spaces.

Since Vlad already OK'd this version, I committed it as below.  Thanks
for the patch!

Richard


2014-09-30  David Sherwood  

* ira-int.h (ira_allocno): Add "wmode" field.
* ira-build.c (create_insn_allocnos): Add new "parent" function
parameter.
* ira-conflicts.c (ira_build_conflicts): Add conflicts for registers
that cannot be accessed in wmode.

Index: gcc/ira-int.h
===
--- gcc/ira-int.h   2014-09-22 08:36:23.613797736 +0100
+++ gcc/ira-int.h   2014-09-30 08:50:55.936083472 +0100
@@ -283,6 +283,9 @@ struct ira_allocno
   /* Mode of the allocno which is the mode of the corresponding
  pseudo-register.  */
   ENUM_BITFIELD (machine_mode) mode : 8;
+  /* Widest mode of the allocno which in at least one case could be
+ for paradoxical subregs where wmode > mode.  */
+  ENUM_BITFIELD (machine_mode) wmode : 8;
   /* Register class which should be used for allocation for given
  allocno.  NO_REGS means that we should use memory.  */
   ENUM_BITFIELD (reg_class) aclass : 16;
@@ -315,7 +318,7 @@ struct ira_allocno
  number (0, ...) - 2.  Value -1 is used for allocnos spilled by the
  reload (at this point pseudo-register has only one allocno) which
  did not get stack slot yet.  */
-  short int hard_regno;
+  int hard_regno : 16;
   /* Allocnos with the same regno are linked by the following member.
  Allocnos corresponding to inner loops are first in the list (it
  corresponds to depth-first traverse of the loops).  */
@@ -436,6 +439,7 @@ #define ALLOCNO_TOTAL_NO_STACK_REG_P(A)
 #define ALLOCNO_BAD_SPILL_P(A) ((A)->bad_spill_p)
 #define ALLOCNO_ASSIGNED_P(A) ((A)->assigned_p)
 #define ALLOCNO_MODE(A) ((A)->mode)
+#define ALLOCNO_WMODE(A) ((A)->wmode)
 #define ALLOCNO_PREFS(A) ((A)->allocno_prefs)
 #define ALLOCNO_COPIES(A) ((A)->allocno_copies)
 #define ALLOCNO_HARD_REG_COSTS(A) ((A)->hard_reg_costs)
Index: gcc/ira-build.c
===
--- gcc/ira-build.c 2014-08-26 12:09:28.234659250 +0100
+++ gcc/ira-build.c 2014-09-30 09:00:05.541337392 +0100
@@ -524,6 +524,7 @@ ira_create_allocno (int regno, bool cap_
   ALLOCNO_BAD_SPILL_P (a) = false;
   ALLOCNO_ASSIGNED_P (a) = false;
   ALLOCNO_MODE (a) = (regno < 0 ? VOIDmode : PSEUDO_REGNO_MODE (regno));
+  ALLOCNO_WMODE (a) = ALLOCNO_MODE (a);
   ALLOCNO_PREFS (a) = NULL;
   ALLOCNO_COPIES (a) = NULL;
   ALLOCNO_HARD_REG_COSTS (a) = NULL;
@@ -893,6 +894,7 @@ create_cap_allocno (ira_allocno_t a)
   parent = ALLOCNO_LOOP_TREE_NODE (a)->parent;
   cap = ira_create_allocno (ALLOCNO_REGNO (a), true, parent);
   ALLOCNO_MODE (cap) = ALLOCNO_MODE (a);
+  ALLOCNO_WMODE (cap) = ALLOCNO_WMODE (a);
   aclass = ALLOCNO_CLASS (a);
   ira_set_allocno_class (cap, aclass);
   ira_create_allocno_objects (cap);
@@ -1859,9 +1861,9 @@ ira_traverse_loop_tree (bool bb_p, ira_l
 
 /* This recursive function creates allocnos corresponding to
pseudo-registers containing in X.  True OUTPUT_P means that X is
-   a lvalue.  */
+   an lvalue.  PARENT corresponds to the parent expression of X.  */
 static void
-create_insn_allocnos (rtx x, bool output_p)
+create_insn_allocnos (rtx x, rtx outer, bool output_p)
 {
   int i, j;
   const char *fmt;
@@ -1876,7 +1878,15 @@ create_insn_allocnos (rtx x, bool output
  ira_allocno_t a;
 
  if ((a = ira_curr_regno_allocno_map[regno]) == NULL)
-   a = ira_create_allocno (regno, false, ira_curr_loop_tree_node);
+   {
+ a = ira_create_allocno (regno, false, ira_curr_loop_tree_node);
+ if (outer != NULL && GET_CODE (outer) == SUBREG)
+   {
+ enum machine_mode wmode = GET_MODE (outer);
+ if (GET_MODE_SIZE (wmode) > GET_MODE_SIZE (ALLOCNO_WMODE (a)))
+   ALLOCNO_WMODE (a) = wmode;
+   }
+   }
 
  ALLOCNO_NREFS (a)++;
  ALLOCNO_FREQ (a) += REG_FREQ_FROM_BB (curr_bb);
@@ -1887,25 +1897,25 @@ create_insn_allocnos (rtx x, bool output
 }
   else if (code == SET)
 {
-  create_insn_a

Re: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o compile, (internal compiler error)"

2014-09-30 Thread Andreas Schwab
Richard Sandiford  writes:

> @@ -315,7 +318,7 @@ struct ira_allocno
>   number (0, ...) - 2.  Value -1 is used for allocnos spilled by the
>   reload (at this point pseudo-register has only one allocno) which
>   did not get stack slot yet.  */
> -  short int hard_regno;
> +  int hard_regno : 16;

If you want negative numbers you need to make that explicitly signed.
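
As a small illustration of this point: the C standard leaves it implementation-defined whether a bit-field declared with plain `int` is signed or unsigned, so a field that must hold -1 is safest written as `signed int`. The struct and function names below are hypothetical and only mirror the shape of the field in the patch.

```c
/* Hypothetical miniature of the field discussed above.  */
struct regno_fields
{
  signed int explicit_signed : 16;      /* can hold -1 */
  unsigned int explicit_unsigned : 16;  /* holds 0..65535 only */
};

/* Store V in a signed 16-bit bit-field and read it back.  */
int
store_signed16 (int v)
{
  struct regno_fields f = { 0, 0 };
  f.explicit_signed = v;
  return f.explicit_signed;
}

/* Store V in an unsigned 16-bit bit-field and read it back; negative
   values wrap modulo 65536.  */
int
store_unsigned16 (int v)
{
  struct regno_fields f = { 0, 0 };
  f.explicit_unsigned = v;
  return f.explicit_unsigned;
}
```

With these definitions, store_signed16 (-1) yields -1 while store_unsigned16 (-1) yields 65535; a plain `int hard_regno : 16` may behave like either one depending on the compiler.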

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [Patch] Cleanup widest_int_mode_for_size

2014-09-30 Thread James Greenhalgh
*ping*

Thanks,
James

On Tue, Sep 23, 2014 at 10:17:21AM +0100, James Greenhalgh wrote:
> 
> Hi,
> 
> The comment on widest_int_mode_for_size claims that it returns the
> widest integer mode no wider than size. The implementation looks more
> like it finds the widest integer mode smaller than size. Everywhere it
> is used, the mode it is looking for is ultimately checked against an
> expected alignment or is used for heuristics that should be thinking
> about that check, so pull it in to here.
> 
> Throughout expr.c corrections are made for this fact - adding one to
> the size passed to this function. This feels a bit backwards to me.
> 
> This patch fixes that, and then fixes the fallout throughout expr.c.
> Generally, this means simplifying a bunch of move_by_pieces style copy
> code.
> 
> Bootstrapped on x86_64, arm and AArch64 with no issues.
> 
> OK for trunk?
> 
> Thanks,
> James
> 
> 2014-09-23  James Greenhalgh  
> 
>   * expr.c (MOVE_BY_PIECES_P): Remove off-by-one correction to
>   move_by_pieces_ninsns.
>   (CLEAR_BY_PIECES_P): Likewise.
>   (SET_BY_PIECES_P): Likewise.
>   (STORE_BY_PIECES_P): Likewise.
>   (widest_int_mode_for_size): Return the widest mode in which the
>   given size fits.
>   (move_by_pieces): Remove off-by-one correction for max_size,
>   simplify copy loop body.
>   (move_by_pieces_ninsns): Simplify copy body.
>   (can_store_by_pieces): Remove off-by-one correction for max_size,
>   simplify copy body.
>   (store_by_pieces_1): Likewise.
> 
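
The off-by-one described in the quoted message can be shown with a toy model. This is only a sketch: the table of mode sizes is an assumption standing in for the walk over GET_MODE_WIDER_MODE, 0 stands in for VOIDmode, and the two function names are hypothetical. It ignores the alignment check that the patch also folds in.

```c
#include <stddef.h>

/* Assumed integer mode sizes in bytes, narrowest first.  */
static const unsigned int toy_mode_sizes[] = { 1, 2, 4, 8, 16 };
#define N_TOY_MODES (sizeof toy_mode_sizes / sizeof toy_mode_sizes[0])

/* Old behaviour: widest mode strictly smaller than SIZE.
   Returns 0 when no mode qualifies.  */
unsigned int
widest_size_below (unsigned int size)
{
  unsigned int best = 0;
  size_t i;

  for (i = 0; i < N_TOY_MODES; i++)
    if (toy_mode_sizes[i] < size)
      best = toy_mode_sizes[i];
  return best;
}

/* New behaviour: widest mode no wider than SIZE.
   Returns 0 when no mode qualifies.  */
unsigned int
widest_size_fitting (unsigned int size)
{
  unsigned int best = 0;
  size_t i;

  for (i = 0; i < N_TOY_MODES; i++)
    if (toy_mode_sizes[i] <= size)
      best = toy_mode_sizes[i];
  return best;
}
```

With the strict comparison a caller that wants the 8-byte mode has to pass 8 + 1, which is the correction the patch removes from the callers; with `<=` passing 8 is enough.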

> diff --git a/gcc/expr.c b/gcc/expr.c
> index a6233f3..0af9b9a 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -161,7 +161,7 @@ static void write_complex_part (rtx, rtx, bool);
> to perform a structure copy.  */
>  #ifndef MOVE_BY_PIECES_P
>  #define MOVE_BY_PIECES_P(SIZE, ALIGN) \
> -  (move_by_pieces_ninsns (SIZE, ALIGN, MOVE_MAX_PIECES + 1) \
> +  (move_by_pieces_ninsns (SIZE, ALIGN, MOVE_MAX_PIECES) \
> < (unsigned int) MOVE_RATIO (optimize_insn_for_speed_p ()))
>  #endif
>  
> @@ -169,7 +169,7 @@ static void write_complex_part (rtx, rtx, bool);
> called to clear storage.  */
>  #ifndef CLEAR_BY_PIECES_P
>  #define CLEAR_BY_PIECES_P(SIZE, ALIGN) \
> -  (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
> +  (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES) \
> < (unsigned int) CLEAR_RATIO (optimize_insn_for_speed_p ()))
>  #endif
>  
> @@ -177,7 +177,7 @@ static void write_complex_part (rtx, rtx, bool);
> called to "memset" storage with byte values other than zero.  */
>  #ifndef SET_BY_PIECES_P
>  #define SET_BY_PIECES_P(SIZE, ALIGN) \
> -  (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
> +  (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES) \
> < (unsigned int) SET_RATIO (optimize_insn_for_speed_p ()))
>  #endif
>  
> @@ -185,7 +185,7 @@ static void write_complex_part (rtx, rtx, bool);
> called to "memcpy" storage when the source is a constant string.  */
>  #ifndef STORE_BY_PIECES_P
>  #define STORE_BY_PIECES_P(SIZE, ALIGN) \
> -  (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
> +  (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES) \
> < (unsigned int) MOVE_RATIO (optimize_insn_for_speed_p ()))
>  #endif
>  
> @@ -801,18 +801,23 @@ alignment_for_piecewise_move (unsigned int max_pieces, 
> unsigned int align)
>return align;
>  }
>  
> -/* Return the widest integer mode no wider than SIZE.  If no such mode
> -   can be found, return VOIDmode.  */
> +/* Return the widest integer mode no wider than SIZE which can be accessed
> +   at the given ALIGNMENT.  If no such mode can be found, return VOIDmode.
> +   If SIZE would fit exactly in a mode, return that mode. */
>  
>  static enum machine_mode
> -widest_int_mode_for_size (unsigned int size)
> +widest_int_mode_for_size_and_alignment (unsigned int size,
> + unsigned int align)
>  {
>enum machine_mode tmode, mode = VOIDmode;
>  
>for (tmode = GET_CLASS_NARROWEST_MODE (MODE_INT);
> tmode != VOIDmode; tmode = GET_MODE_WIDER_MODE (tmode))
> -if (GET_MODE_SIZE (tmode) < size)
> -  mode = tmode;
> +{
> +  if (GET_MODE_SIZE (tmode) <= size
> +   && align >= GET_MODE_ALIGNMENT (tmode))
> + mode = tmode;
> +}
>  
>return mode;
>  }
> @@ -855,7 +860,7 @@ move_by_pieces (rtx to, rtx from, unsigned HOST_WIDE_INT 
> len,
>enum machine_mode to_addr_mode;
>enum machine_mode from_addr_mode = get_address_mode (from);
>rtx to_addr, from_addr = XEXP (from, 0);
> -  unsigned int max_size = MOVE_MAX_PIECES + 1;
> +  unsigned int max_size = MOVE_MAX_PIECES;
>enum insn_code icode;
>  
>align = MIN (to ? MEM_ALIGN (to) : align, MEM_ALIGN (from));
> @@ -907,7 +912,7 @@ move_by_pieces (rtx to, rtx from, unsigned HOST_WIDE_INT 
> len,
>MODE might not be used depending on the definitions of the
>USE_* macros below.  */
>enum machine_mode mode ATTR

Re: [PATCH, ARM]Option support to new ARM MCU Cortex-M7

2014-09-30 Thread Ramana Radhakrishnan
On Wed, Sep 24, 2014 at 6:17 AM, Terry Guo  wrote:
> Hi there,
>
> The attached patch intends to provide option support to newly announced core
> Cortex-M7 and related FPU:
> http://www.arm.com/about/newsroom/arm-supercharges-mcu-market-with-high-performance-cortex-m7-processor.php
> http://www.arm.com/products/processors/cortex-m/cortex-m7-processor.php
>
> The required Binutils support is
> https://sourceware.org/ml/binutils/2014-09/msg00201.html.
>
> Is it OK to trunk?

OK.

Ramana

>
> BR,
> Terry
>
> 2014-09-24  Terry Guo  
>
>  * config/arm/arm-cores.def (cortex-m7): New core name.
>  * config/arm/arm-fpus.def (fpv5-sp-d16): New fpu name.
>  (fpv5-d16): Ditto.
>  * config/arm/arm-tables.opt: Regenerated.
>  * config/arm/arm-tune.md: Likewise.
>  * doc/invoke.texi: Document new cpu and fpu names.
>  * config/arm/arm.h (TARGET_VFP5): New macro.
>  * config/arm/vfp.md (2,
>  smax3, smin3): Enabled for FPU FPv5.


Re: [fortran,patch] Forbid assignment of different character kinds

2014-09-30 Thread Tobias Burnus
FX wrote:
> Now, here's a tiny patch to silence the related warning in PR36534.
> I also remove the condition on gfc_current_form != FORM_FIXED, as diagnostics
> should be emitted based on language/pedantic options, not source form.

Looks good to me. However:

In the test case, could you also add "PR fortran/36534" as a
comment?


Additionally, I wonder whether instead of the name-based checking
+ && (sym->name[0] != '_' || sym->name[1] != '_'))
it wouldn't be cleaner to check
  && sym->attr.intrinsic
(If you change it to attr.intrinsic, you need to set
the attribute also in intrinsic.c's gfc_convert_type_warn.)

I know that using __... names is not really possible in Fortran (except as C
binding name), but - still - I think it is cleaner. But I am fine with
either version.

Tobias


[COMMITTED][shrink-wrap] should not sink instructions which may cause trap ?

2014-09-30 Thread Jiong Wang


On 30/09/14 05:26, Jeff Law wrote:

On 09/29/14 12:06, Jiong Wang wrote:

thanks for pointing this out, patch updated.

re-tested, pass x86-64 bootstrap and no regression on check-gcc/g++.
pass aarch64-none-elf cross check also.

ok for trunk?

Yes this is fine.


committed as 215709.




BTW, another bug exposed by linux x86-64 kernel build, and it's at

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63404

the problem is caused by a missed clobber/use check. I will send
a separate patch for review. really sorry for causing the trouble,
the insn move in generic code is actually not that generic, it is related
to some backend features...

Noted.  These things happen, it's one of the things that makes working
with RTL tough and one of the reasons we made a major focus away from
RTL as the primary IL for optimization work.  But for things like
shrink-wrapping, RTL is the right place to be.


thanks for the explanation. Now I feel I have a deeper understanding of why it's
called RTL, the "Register Transfer Language" :)



jeff







RE: [PATCH] Fix PR preprocessor/58893 access to uninitialized memory

2014-09-30 Thread Bernd Edlinger


Hi Jeff,

On Mon, 29 Sep 2014 22:40:58, Jeff Law wrote:
>
> On 09/27/14 03:53, Bernd Edlinger wrote:
 Comment before this change. Someone not familiar with this code is
 going to have no idea why these two lines exist.

>>>
>>> Ok, I added a comment now, do you like it?
> Yes.
>
>
>>>
 Please try to include a testcase. If you're having trouble reproducing
 on the trunk, you could use MALLOC_PERTURB per c#8 in the bug report.
 If there's a way to set environment variables in our testing framework
 that may be a reasonable way to test (if you need to do that, limit
 testing to linux targets as we'll have a dependency on glibc features).

>>>
>>> For whatever reason, the first -include must end with a pragma
>>> as in the PR, and MALLOC_PERTURB_ must be set to something.
>>> Then we get an ICE, otherwise we get an error message without line number.
>>> I tried to make this a valid test case, but that might be less trivial than
>>> it looks at first sight.
>
>>>
>>> I tried to set MALLOC_PERTURB_=123 globally, like this:
>>>
>>> MALLOC_PERTURB_=123 make -k check
>>>
>>> but then this happened:
> Sigh. Yea, I guess if we're hitting the allocator insanely hard,
> scrubbing memory might turn out to slow things down in a significant
> way. Or it may simply be the case that we're using free'd memory in
> some way and with the MALLOC_PERTURB changes we're in an infinite loop
> in the dumping code or something similar.
>

Yeah, that is an interesting thing.
I debugged that, and it turns out, that this is just incredibly slow.
It seems to be in the macro expansion of this construct:

#define t16(x) x x x x x x x x x x x x x x x x
#define M (sizeof (t16(t16(t16(t16(t16(" ")))))) - 1)

libcpp is calling realloc 1.000.000 times for this, resizing
the memory by just one byte at a time. And the worst case of
realloc is O(n), so in the worst case realloc would have
to copy 1/2 * 1.000.000^2 bytes = 500 GB of memory.

With this little change in libcpp, the test suite passed, without any
further regressions:

--- libcpp/charset.c.jj    2014-08-19 07:34:31.0 +0200
+++ libcpp/charset.c    2014-09-30 10:45:26.676954120 +0200
@@ -537,6 +537,7 @@ convert_no_conversion (iconv_t cd ATTRIB
   if (to->len + flen> to->asize)
 {
   to->asize = to->len + flen;
+  to->asize *= 2;
   to->text = XRESIZEVEC (uchar, to->text, to->asize);
 }
   memcpy (to->text + to->len, from, flen);

I will prepare a patch for that later.
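
The worst-case estimate above (about 1/2 * n^2 bytes copied for n one-byte appends) can be checked with a small counting model. This is a sketch under a pessimistic assumption: every realloc is taken to move the block and copy its current contents, which real allocators often avoid. The function names are hypothetical; the second one mirrors the "to->asize *= 2" change.

```c
#include <stddef.h>

/* Bytes copied when the buffer is resized to fit exactly on every
   one-byte append and every resize has to move the data.  */
unsigned long long
copied_exact_fit (size_t n_appends)
{
  unsigned long long copied = 0;
  size_t len = 0, asize = 0;
  size_t i;

  for (i = 0; i < n_appends; i++)
    {
      if (len + 1 > asize)
        {
          copied += len;        /* existing contents are moved */
          asize = len + 1;
        }
      len++;
    }
  return copied;
}

/* Same model, but the new size is doubled after each resize.  */
unsigned long long
copied_doubling (size_t n_appends)
{
  unsigned long long copied = 0;
  size_t len = 0, asize = 0;
  size_t i;

  for (i = 0; i < n_appends; i++)
    {
      if (len + 1 > asize)
        {
          copied += len;
          asize = (len + 1) * 2;
        }
      len++;
    }
  return copied;
}
```

For 1.000.000 appends the exact-fit model copies 499999500000 bytes, i.e. roughly the 500 GB mentioned above, while the doubling model stays below 2.000.000 bytes in total.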

Interestingly, if I define MALLOC_CHECK_=3 _and_ MALLOC_PERTURB_
this test passes, even without the above change,
but the test case 
  gfortran.dg/realloc_on_assign_5.f03 fails in this configuration,
which is a known bug: PR 47674. However it passes when only MALLOC_PERTURB_
is defined.

Weird...

>
>>>
>>>
>>> Well, I added a test case, but it does not reliably fail without the
>>> patch, because setting
>>> MALLOC_PERTURB_ causes too much trouble at this time.
>>>
>>> I would propose to set MALLOC_PERTURB_ globally at a later time.
> Sorry, just to be clear, I wasn't suggesting to set it globally, but
> just for the duration of this test as a potentially easier way to
> trigger the failure.
>
> However, it may make sense to do that at some point. I also think that
> Jakub bootstraps and runs the regression suite with valgrind late in the
> release cycle, which would catch this problem if it raises its head again.
>
>>>
>>> Boot-Strapped & Regression-Tested on x86_64-linux-gnu.
>>> Ok for trunk?
> Yes, this is OK for the trunk.
>

Thanks!
Bernd.

> jeff
>
  

[PATCH RFC]Pair load store instructions using a generic scheduling fusion pass

2014-09-30 Thread Bin Cheng
Hi,
Last time I posted the patch pairing consecutive load/store instructions on
ARM, the patch got some review comments.  The most important one, as
suggested by Jeff and Mike, is to do the load/store pairing using the
existing scheduling facility.  In the initial investigation, I cleaned up
Mike's patch and fixed some implementation bugs in it; it can now find as
many load/store pairs as my old patch.  Then I decided to take one step
forward and introduce a generic instruction fusion infrastructure in GCC,
because in essence, load/store pairing is no different from other
instruction fusion: all these optimizations want is to push instructions
together in the instruction flow.
So here comes this patch.  It adds a new sched_fusion pass just before
peephole2.  The methodology is like:
1) The priority in scheduler is extended into [fusion_priority, priority]
pair, with fusion_priority as the major key and priority as the minor key.
2) The back-end assigns a priority pair to each instruction; instructions
that want to be fused together get the same fusion_priority assigned.
3) The haifa scheduler schedules instructions based on fusion priorities,
all other parts are just like the original sched2 pass.  Of course, some
functions can be simplified/skipped in this process.
4) With instructions fused together in flow, the following peephole2 pass
can easily transform interesting instructions into other forms, just like
ldrd/strd for ARM.
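
The ordering in steps 1) to 3) amounts to a two-key comparison. The sketch below is not the haifa scheduler code: `toy_insn`, `rank_for_fusion` and `order_for_fusion` are hypothetical names, dependencies between instructions are ignored, and it is an assumption that larger values rank first for both keys.

```c
#include <stdlib.h>

/* Hypothetical, simplified view of an instruction.  */
struct toy_insn
{
  int uid;
  int fusion_priority;  /* major key: equal values want to be adjacent */
  int priority;         /* minor key: order within a fusion group */
};

/* qsort comparator: larger fusion_priority first, then larger
   priority, then smaller uid to make the result deterministic.  */
static int
rank_for_fusion (const void *pa, const void *pb)
{
  const struct toy_insn *a = pa;
  const struct toy_insn *b = pb;

  if (a->fusion_priority != b->fusion_priority)
    return (a->fusion_priority < b->fusion_priority)
           - (a->fusion_priority > b->fusion_priority);
  if (a->priority != b->priority)
    return (a->priority < b->priority) - (a->priority > b->priority);
  return (a->uid > b->uid) - (a->uid < b->uid);
}

/* Sort INSNS so that instructions sharing a fusion_priority end up
   next to each other.  */
void
order_for_fusion (struct toy_insn *insns, size_t n)
{
  qsort (insns, n, sizeof insns[0], rank_for_fusion);
}
```

Once instructions with the same fusion_priority sit next to each other, a later peephole-style pass only has to look at neighbouring instructions, which is the role peephole2 plays in step 4).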

The new infrastructure can handle different kinds of fusion in one pass.
It's also easy to extend for new fusion cases, all it takes is to identify
instructions which want to be fused and assign new fusion priorities to
them.  Also as Mike suggested last time, the patch can be very simple by
reusing existing scheduler facility.

I collected performance data for both cortex-a15 and cortex-a57 (with a
local peephole ldp/stp patch), the benchmarks can be obviously improved on
arm/aarch64.  I also collected instrument data about how many load/store
pairs are found.  For the four versions of load/store pair patches:
0) The original Mike's patch.
1) My original prototype patch.
2) Cleaned up pass of Mike (with implementation bugs resolved).
3) This new prototype fusion pass.

The numbers of paired opportunities satisfy below relations:
3 * N0 ~ N1 ~ N2 < N3
For example, for one benchmark suite, we have:
N0 ~= 1300
N1/N2 ~= 5000
N3 ~= 7500

As a matter of fact, if we move the sched_fusion and peephole2 passes after
register renaming and enable the register renaming pass, this patch can find
even more load/store pairs (~11000 for the above benchmark suite).  But the
rename pass has its own impact on performance and we need more benchmark data
before making that change.

Of course, this patch is not a perfect solution; it does miss load/store
pairs in some corner cases which have complicated instruction dependencies.
Actually it breaks one load/store pair test on armv6 because of such a corner
case, which is why the pass is disabled by default on non-armv7 processors.  I
may investigate the failure and try to enable the pass for all arm targets
in the future.

So any comments on this?

2014-09-30  Bin Cheng  
Mike Stump  

* timevar.def (TV_SCHED_FUSION): New time var.
* passes.def (pass_sched_fusion): New pass.
* config/arm/arm.c (TARGET_SCHED_FUSION_PRIORITY): New.
(extract_base_offset_in_addr, fusion_load_store): New.
(arm_sched_fusion_priority): New.
(arm_option_override): Disable scheduling fusion on non-armv7
processors by default.
* sched-int.h (struct _haifa_insn_data): New field.
(INSN_FUSION_PRIORITY, FUSION_MAX_PRIORITY, sched_fusion): New.
* sched-rgn.c (rest_of_handle_sched_fusion): New.
(pass_data_sched_fusion, pass_sched_fusion): New.
(make_pass_sched_fusion): New.
* haifa-sched.c (sched_fusion): New.
(insn_cost): Handle sched_fusion.
(priority): Handle sched_fusion by calling target hook.
(enum rfs_decision): New enum value.
(rfs_str): New element for RFS_FUSION.
(rank_for_schedule): Support sched_fusion.
(schedule_insn, max_issue, prune_ready_list): Handle sched_fusion.
(schedule_block, fix_tick_ready): Handle sched_fusion.
* common.opt (flag_schedule_fusion): New.
* tree-pass.h (make_pass_sched_fusion): New.
* target.def (fusion_priority): New.
* doc/tm.texi.in (TARGET_SCHED_FUSION_PRIORITY): New.
* doc/tm.texi: Regenerated.
* doc/invoke.texi (-fschedule-fusion): New.

gcc/testsuite/ChangeLog
2014-09-30  Bin Cheng  

* gcc.target/arm/ldrd-strd-pair-1.c: New test.
* gcc.target/arm/vfp-1.c: Improve scanning string.

Index: gcc/timevar.def
===
--- gcc/timevar.def (revision 215662)
+++ gcc/timevar.def (working copy)
@@ -244,6 +244,7 @@ DEFTIMEVAR (TV_IFCVT2, "if-conversion 2")
 DEFTIME

Re: Fix libgomp crash without TLS (PR42616)

2014-09-30 Thread Jakub Jelinek
On Tue, Sep 30, 2014 at 11:03:47AM +0400, Varvara Rainchik wrote:
> Corrected patch: call pthread_setspecific (gomp_tls_key, NULL) in
> gomp_thread_start if HAVE_TLS is not defined.
> 
> 2014-09-19  Varvara Rainchik  
> 
> * libgomp.h (gomp_thread): For non TLS case create thread data.
> * team.c (non_tls_thread_data_destructor,
> create_non_tls_thread_data): New functions.

I actually wonder when we have emutls support in libgcc if it wouldn't
be better to just define HAVE_TLS always to 1 (i.e. remove all the
conditionals on it), then you wouldn't need to bother with this at all.

I don't have an OS which doesn't support native TLS though, so somebody with
such a system would need to test it and benchmark if it doesn't make things
slower.

Richard, thoughts on this?

Jakub


[PATCH GCC]Improve candidate selecting in IVOPT

2014-09-30 Thread Bin Cheng
Hi,
As analyzed in PR62178, IVOPT can't find the optimal iv set for that case.
The problem with the current heuristic algorithm is that it only replaces
candidates with ones not in the current solution, one by one, starting from a
small solution.
This patch adds another heuristic, which starts by assigning the best
candidate to each iv use and then replaces candidates with ones in the current
solution.
Before this patch, there are two runs of find_optimal_iv_set_1 to find the
optimal iv sets; we name them set_a and set_b.  After this patch we will
also have set_c.  Finally, IVOPT chooses the best one from set_a/set_b/set_c.
To show that this patch is necessary, I collected instrumentation data for gcc
bootstrap, spec2k and eembc, and can confirm that for some cases only the newly
added heuristic finds the optimal iv set.  The number of cases in which
set_c is the optimal one is at the same level as for set_b.
As for compilation time, the newly added function is actually one
iteration of the previous selection algorithm, so it should be much faster than
the previous process.
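
A toy C sketch of the two ideas described above: start from the
individually cheapest candidate for each use, and finally keep the cheapest
of the three sets.  It is only an illustration under simplifying
assumptions; the fixed-size cost table and the names best_cand_for_use,
initial_solution and pick_best_set are hypothetical, and the real cost
model in tree-ssa-loop-ivopts.c is richer than a plain table lookup.

```c
#include <stddef.h>

#define N_USES 3
#define N_CANDS 4

/* Return the candidate with the lowest cost for USE.  COST[u][c] is a
   made-up cost of expressing use U with candidate C.  */
static size_t
best_cand_for_use (unsigned cost[N_USES][N_CANDS], size_t use)
{
  size_t c, best = 0;

  for (c = 1; c < N_CANDS; c++)
    if (cost[use][c] < cost[use][best])
      best = c;
  return best;
}

/* Starting point of the new heuristic: every use gets its individually
   cheapest candidate.  */
static void
initial_solution (unsigned cost[N_USES][N_CANDS], size_t chosen[N_USES])
{
  size_t u;

  for (u = 0; u < N_USES; u++)
    chosen[u] = best_cand_for_use (cost, u);
}

/* Final step: return the index of the cheapest of three sets
   (set_a, set_b, set_c); ties keep the earlier set.  */
static size_t
pick_best_set (const unsigned set_cost[3])
{
  size_t i, best = 0;

  for (i = 1; i < 3; i++)
    if (set_cost[i] < set_cost[best])
      best = i;
  return best;
}
```

The pruning step that merges candidates afterwards is left out on purpose;
the sketch only shows where the new heuristic starts and how the final
choice among the three sets is made.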

I also added one target dependent test case.
Bootstrapped and tested on x86_64, tested on aarch64.  Any comments?

2014-09-30  Bin Cheng  

PR tree-optimization/62178
* tree-ssa-loop-ivopts.c (enum sel_type): New.
(iv_ca_add_use): Add parameter RELATED_P and find the best cand
for iv use if it's true.
(try_add_cand_for, get_initial_solution): Change parameter ORIGINALP
to SELECT_TYPE and handle it.
(find_optimal_iv_set_1): Ditto.
(try_prune_iv_set, find_optimal_iv_set_2): New functions.
(find_optimal_iv_set): Call find_optimal_iv_set_2 and choose the
best candidate set.

gcc/testsuite/ChangeLog
2014-09-30  Bin Cheng  

PR tree-optimization/62178
* gcc.target/aarch64/pr62178.c: New test.

Index: gcc/testsuite/gcc.target/aarch64/pr62178.c
===
--- gcc/testsuite/gcc.target/aarch64/pr62178.c  (revision 0)
+++ gcc/testsuite/gcc.target/aarch64/pr62178.c  (revision 0)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a[30 +1][30 +1], b[30 +1][30 +1], r[30 +1][30 +1];
+
+void Intmm (int run) {
+  int i, j, k;
+
+  for ( i = 1; i <= 30; i++ )
+for ( j = 1; j <= 30; j++ ) {
+  r[i][j] = 0;
+  for(k = 1; k <= 30; k++ )
+r[i][j] += a[i][k]*b[k][j];
+}
+}
+
+/* { dg-final { scan-assembler "ld1r\\t\{v\[0-9\]+\."} } */
Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 215113)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -254,6 +254,14 @@ struct iv_inv_expr_ent
   hashval_t hash;
 };
 
+/* Types used to start selecting the candidate for each IV use.  */
+enum sel_type
+{
+  SEL_ORIGINAL,/* Start selecting from original cands.  */
+  SEL_IMPORTANT,   /* Start selecting from important cands.  */
+  SEL_RELATED  /* Start selecting from related cands.  */
+};
+
 /* The data used by the induction variable optimizations.  */
 
 typedef struct iv_use *iv_use_p;
@@ -5417,22 +5425,51 @@ iv_ca_set_cp (struct ivopts_data *data, struct iv_
 }
 
 /* Extend set IVS by expressing USE by some of the candidates in it
-   if possible.  Consider all important candidates if candidates in
-   set IVS don't give any result.  */
+   if possible.  If RELATED_P is FALSE, consider all important
+   candidates if candidates in set IVS don't give any result;
+   otherwise, try to find the best one from related or all candidates,
+   depending on consider_all_candidates.  */
 
 static void
 iv_ca_add_use (struct ivopts_data *data, struct iv_ca *ivs,
-  struct iv_use *use)
+  struct iv_use *use, bool related_p)
 {
   struct cost_pair *best_cp = NULL, *cp;
   bitmap_iterator bi;
   unsigned i;
   struct iv_cand *cand;
 
-  gcc_assert (ivs->upto >= use->id);
+  gcc_assert (ivs->upto == use->id);
   ivs->upto++;
   ivs->bad_uses++;
 
+  if (related_p)
+{
+  if (data->consider_all_candidates)
+   {
+ for (i = 0; i < n_iv_cands (data); i++)
+   {
+ cand = iv_cand (data, i);
+ cp = get_use_iv_cost (data, use, cand);
+ if (cheaper_cost_pair (cp, best_cp))
+   best_cp = cp;
+   }
+   }
+  else
+   {
+ EXECUTE_IF_SET_IN_BITMAP (use->related_cands, 0, i, bi)
+   {
+ cand = iv_cand (data, i);
+ cp = get_use_iv_cost (data, use, cand);
+ if (cheaper_cost_pair (cp, best_cp))
+   best_cp = cp;
+}
+   }
+
+  iv_ca_set_cp (data, ivs, use, best_cp);
+  return;
+}
+
   EXECUTE_IF_SET_IN_BITMAP (ivs->cands, 0, i, bi)
 {
   cand = iv_cand (data, i);
@@ -5440,7 +5477,7 @@ iv_ca_add_use (struct ivopts_data *data, struct iv
   if (cheaper_cost_pair (cp, best_cp))
best_cp = cp;
 }
-   
+

[PATCH, i386]: Enable reminder{sd,df,xf} and fmod{sf,df,xf} only for flag_finite_math_only.

2014-09-30 Thread Uros Bizjak
Hello!

According to C99, the remainder function returns:

   If x or y is a NaN, a NaN is returned.

   If x is an infinity, and y is not a NaN, a domain error occurs,
and a NaN is returned.

   If y is zero, and x is not a NaN, a domain error occurs, and a
NaN is returned.

and fmod returns:

   If x or y is a NaN, a NaN is returned.

   If x is an infinity, a domain error occurs, and a NaN is returned.

   If y is zero, a domain error occurs, and a NaN is returned.

   If x is +0 (-0), and y is not zero, +0 (-0) is returned.

However, the x87 fprem and fprem1 instructions that are used to implement
these builtin functions do not return NaN for infinities, but generate an
invalid-arithmetic-operand exception.

The attached patch enables these builtins for finite math only, consistent
with the gcc documentation:

'-ffinite-math-only':

 Allow optimizations for floating-point arithmetic that assume that
 arguments and results are not NaNs or +-Infs.

 This option is not turned on by any '-O' option since it can result
 in incorrect output for programs that depend on an exact
 implementation of IEEE or ISO rules/specifications for math
 functions.  It may, however, yield faster code for programs that do
 not require the guarantees of these specifications.

2014-09-30  Uros Bizjak  

* config/i386/i386.md (fmodxf3): Enable for flag_finite_math_only only.
(fmod<mode>3): Ditto.
(fpremxf4_i387): Ditto.
(remainderxf3): Ditto.
(remainder<mode>3): Ditto.
(fprem1xf4_i387): Ditto.

Patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32}. The patch also fixes ieee_2.f90 testsuite failure with FX's
pending IEEE support improvement patch.

The patch will be committed to mainline and other release branches.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 215705)
+++ config/i386/i386.md (working copy)
@@ -13813,7 +13813,8 @@
(set (reg:CCFP FPSR_REG)
(unspec:CCFP [(match_dup 2) (match_dup 3)]
 UNSPEC_C2_FLAG))]
-  "TARGET_USE_FANCY_MATH_387"
+  "TARGET_USE_FANCY_MATH_387
+   && flag_finite_math_only"
   "fprem"
   [(set_attr "type" "fpspc")
(set_attr "mode" "XF")])
@@ -13822,7 +13823,8 @@
   [(use (match_operand:XF 0 "register_operand"))
(use (match_operand:XF 1 "general_operand"))
(use (match_operand:XF 2 "general_operand"))]
-  "TARGET_USE_FANCY_MATH_387"
+  "TARGET_USE_FANCY_MATH_387
+   && flag_finite_math_only"
 {
   rtx_code_label *label = gen_label_rtx ();
 
@@ -13845,7 +13847,8 @@
   [(use (match_operand:MODEF 0 "register_operand"))
(use (match_operand:MODEF 1 "general_operand"))
(use (match_operand:MODEF 2 "general_operand"))]
-  "TARGET_USE_FANCY_MATH_387"
+  "TARGET_USE_FANCY_MATH_387
+   && flag_finite_math_only"
 {
   rtx (*gen_truncxf) (rtx, rtx);
 
@@ -13884,7 +13887,8 @@
(set (reg:CCFP FPSR_REG)
(unspec:CCFP [(match_dup 2) (match_dup 3)]
 UNSPEC_C2_FLAG))]
-  "TARGET_USE_FANCY_MATH_387"
+  "TARGET_USE_FANCY_MATH_387
+   && flag_finite_math_only"
   "fprem1"
   [(set_attr "type" "fpspc")
(set_attr "mode" "XF")])
@@ -13893,7 +13897,8 @@
   [(use (match_operand:XF 0 "register_operand"))
(use (match_operand:XF 1 "general_operand"))
(use (match_operand:XF 2 "general_operand"))]
-  "TARGET_USE_FANCY_MATH_387"
+  "TARGET_USE_FANCY_MATH_387
+   && flag_finite_math_only"
 {
   rtx_code_label *label = gen_label_rtx ();
 
@@ -13916,7 +13921,8 @@
   [(use (match_operand:MODEF 0 "register_operand"))
(use (match_operand:MODEF 1 "general_operand"))
(use (match_operand:MODEF 2 "general_operand"))]
-  "TARGET_USE_FANCY_MATH_387"
+  "TARGET_USE_FANCY_MATH_387
+   && flag_finite_math_only"
 {
   rtx (*gen_truncxf) (rtx, rtx);
 


Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure

2014-09-30 Thread Jakub Jelinek
On Fri, Sep 26, 2014 at 04:36:21PM +0400, Ilya Verbin wrote:
> 2014-09-26  Bernd Schmidt  
>   Thomas Schwinge  
>   Ilya Verbin  
>   Andrey Turetskiy  
> 
>   * configure: Regenerate.
>   * configure.ac (--enable-as-accelerator-for)
>   (--enable-offload-targets): New configure options.
> gcc/
>   * Makefile.in (real_target_noncanonical, accel_dir_suffix)
>   (enable_as_accelerator): New variables substituted by configure.
>   (libsubdir, libexecsubdir, unlibsubdir): Tweak for the possibility of
>   being configured as an offload compiler.
>   (DRIVER_DEFINES): Pass new defines DEFAULT_REAL_TARGET_MACHINE and
>   ACCEL_DIR_SUFFIX.
>   (install-cpp, install-common, install_driver, install-gcc-ar): Do not
>   install for the offload compiler.
>   * config.in: Regenerate.
>   * configure: Regenerate.
>   * configure.ac (real_target_noncanonical, accel_dir_suffix)
>   (enable_as_accelerator, enable_offload_targets): Compute new variables.
>   (--enable-as-accelerator-for, --enable-offload-targets): New options.
>   (ACCEL_COMPILER): Define if the compiler is built as the accel compiler.
>   (OFFLOAD_TARGETS): List of target names suitable for offloading.
>   (ENABLE_OFFLOADING): Define if list of offload targets is not empty.
> gcc/cp/
>   * Make-lang.in (c++.install-common): Do not install for the offload
>   compiler.
> gcc/fortran/
>   * Make-lang.in (fortran.install-common): Do not install for the offload
>   compiler.
> libgcc/
>   * Makefile.in (crtompbegin$(objext), crtompend$(objext)): New rule.
>   * configure: Regenerate.
>   * configure.ac (--enable-as-accelerator-for)
>   (--enable-offload-targets): New configure options.
>   (extra_parts): Add crtompbegin.o and crtompend.o if
>   enable_offload_targets is not empty.
>   * ompstuff.c: New file.
> libgomp/
>   * config.h.in: Regenerate.
>   * configure: Regenerate.
>   * configure.ac: Check for libdl, required for plugin support.
>   (PLUGIN_SUPPORT): Define if plugins are supported.
>   (--enable-offload-targets): New configure option.
>   (enable_offload_targets): Support Intel MIC targets.
>   (OFFLOAD_TARGETS): List of target names suitable for offloading.
> lto-plugin/
>   * Makefile.am (libexecsubdir): Tweak for the possibility of being
>   configured for offload compiler.
>   (accel_dir_suffix): New variable substituted by configure.
>   * Makefile.in: Regenerate.
>   * configure: Regenerate.
>   * configure.ac (--enable-as-accelerator-for): New option.

If you add the documentation Joseph requested, looks good to me
(once the rest is reviewed too).

Jakub


Re: [debug-early] rearrange some checks in gen_subprogram_die

2014-09-30 Thread Richard Biener
On Mon, Sep 29, 2014 at 8:54 PM, Aldy Hernandez  wrote:
> I'm rearranging some code in Michael's original patch to minimize the
> difference with mainline.
>
> It seems that the check for DECL_STRUCT_FUNCTION (decl)->gimple_df, was
> merely a check to see if we had already set the FDE bits for the decl in
> question.

Sounds more like a check whether the frontend is finished?

>  I've moved the check inside the original DECL_EXTERNAL check,
> thus making it obvious what is being accomplished.
>
> I also got rid of mainline's gcc_checking_assert of fun.  We're
> dereferencing it immediately after.  That should be enough to trigger an
> ICE.
>
> Also I removed Michael's check for DECL_STRUCT_FUNCTION(decl), since
> mainline drops into this codepath regardless, and has/had that
> gcc_checking_assert anyhow.
>
> No regressions.
>
> Committed to branch.
>
> Aldy


Re: [PATCH][PING] PR62120

2014-09-30 Thread Ilya Tocar
Ping.

On 15 Sep 18:43, Ilya Tocar wrote:
> On 01 Sep 18:38, Ilya Tocar wrote:
> > > Please mention the PR in the ChangeLog entry and add some testcases
> > > (can be gcc.target/i386/, but we should have it tested).
> > > Does this change anything on say register short sil __asm ("sil"); in
> > > 32-bit mode (when it IMHO should be rejected too?)?
> > >
> > Do we support "sil" at all? In i386.h i see:
> > 
> > /* Note we are omitting these since currently I don't know how
> > to get gcc to use these, since they want the same but different
> > number as al, and ax.
> > */
> > #define QI_REGISTER_NAMES \
> > {"al", "dl", "cl", "bl", "sil", "dil", "bpl", "spl",}
> > 
> > And gcc doesn't recognize sil.
> > 
> > Added testcase, and fixed avx512f-additional-reg-names.c to be valid on
> > 32 bits. Ok for trunk?
> >
> 
> Slightly updated tests.
> Ok for trunk?
> 
> gcc/
> 
> 2014-09-15  Ilya Tocar  
> 
>PR middle-end/62120
>* varasm.c (decode_reg_name_and_count): Check availability for
>registers from ADDITIONAL_REGISTER_NAMES.
> 
> Testsuite/
> 
> 2014-09-15  Ilya Tocar  
> 
>PR middle-end/62120
>* gcc.target/i386/avx512f-additional-reg-names.c: Use register vaild
>in 32-bit mode.
>* gcc.target/i386/pr62120.c: New.
> 
> ---
>  gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c | 2 +-
>  gcc/testsuite/gcc.target/i386/pr62120.c  | 8 
>  gcc/varasm.c | 5 +++--
>  3 files changed, 12 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr62120.c
> 
> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
> index 164a1de..98a9052 100644
> --- a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
> +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
> @@ -3,7 +3,7 @@
>  
>  void foo ()
>  {
> -  register int zmm_var asm ("zmm9") __attribute__((unused));
> +  register int zmm_var asm ("zmm7") __attribute__((unused));
>  
>__asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" );
>  }
> diff --git a/gcc/testsuite/gcc.target/i386/pr62120.c b/gcc/testsuite/gcc.target/i386/pr62120.c
> new file mode 100644
> index 000..bfb8c47
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr62120.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mno-sse" } */
> +
> +void foo ()
> +{
> +  register int zmm_var asm ("ymm9");/* { dg-error "invalid register name" } */
> +  register int zmm_var2 asm ("23");/* { dg-error "invalid register name" } */
> +}
> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index cd4a230..9c12b81 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -888,7 +888,7 @@ decode_reg_name_and_count (const char *asmspec, int *pnregs)
>if (asmspec[0] != 0 && i < 0)
>   {
> i = atoi (asmspec);
> -   if (i < FIRST_PSEUDO_REGISTER && i >= 0)
> +   if (i < FIRST_PSEUDO_REGISTER && i >= 0 && reg_names[i][0])
>   return i;
> else
>   return -2;
> @@ -925,7 +925,8 @@ decode_reg_name_and_count (const char *asmspec, int *pnregs)
>  
>   for (i = 0; i < (int) ARRAY_SIZE (table); i++)
> if (table[i].name[0]
> -   && ! strcmp (asmspec, table[i].name))
> +   && ! strcmp (asmspec, table[i].name)
> +   && reg_names[table[i].number][0])
>   return table[i].number;
>}
>  #endif /* ADDITIONAL_REGISTER_NAMES */
> -- 
> 1.8.3.1
> 


[patch] fix expand_builtin_init_dwarf_reg_sizes wrt register spans

2014-09-30 Thread Olivier Hainque
Hello,

Exception propagation has been failing for a while for the SPE/e500 family of
powerpc targets. The issue boils down to an assert failure through:

uw_init_context_1 ()
...
_Unwind_SetSpColumn (context, outer_cfa, &sp_slot);

then

_Unwind_SetSpColumn ()
...
int size = dwarf_reg_size_table[__builtin_dwarf_sp_column ()];

if (size == sizeof(_Unwind_Ptr))
   tmp_sp->ptr = (_Unwind_Ptr) cfa;
else
  {
gcc_assert (size == sizeof(_Unwind_Word));

Indeed, dwarf_reg_size_table[sp] is 8 while sizeof(_Unwind_Word) and
sizeof(_Unwind_Ptr) are both 4. dwarf_reg_size_table[sp] being 8 is an outcome of:

  commit 275035b56823b26d5fb7e90fad945b998648edf2
  Date:   Thu Sep 5 14:09:07 2013 +

   PR target/58139
   * reginfo.c (choose_hard_reg_mode): Scan through all mode classes
   looking for widest mode.

choose_hard_reg_mode returning 8 for register r1 is not incorrect per se,
I think.

The problem, IMO, is expand_builtin_init_dwarf_reg_sizes ignoring the
targetm.dwarf_register_span hook, unlike other functions in dwarf2cfi.c.

This patch is a proposal to fix this.

The general idea is to isolate the dwarf_reg_size computation for a single
register in a separate function, called either for the "current" register
if it doesn't span, or for each item of the span otherwise.
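
The shape of that idea, as a standalone C sketch rather than the patch
itself: struct reg_desc, init_one_reg_size, init_reg_sizes, N_COLUMNS and
the column numbers are all hypothetical and only stand in for the real
register span and unwind column machinery.

```c
#include <assert.h>
#include <stddef.h>

#define N_COLUMNS 8   /* made-up number of unwind columns */

/* Hypothetical description of one register: the unwind columns of its
   pieces (one piece if it does not span) and the size of each piece.  */
struct reg_desc
{
  size_t columns[2];
  size_t n_pieces;
  unsigned char piece_size;
};

/* Record SIZE bytes for a single unwind COLUMN.  */
static void
init_one_reg_size (unsigned char *table, size_t column, unsigned char size)
{
  assert (column < N_COLUMNS);
  table[column] = size;
}

/* Record the size of every piece of REG, one table entry per piece.  */
static void
init_reg_sizes (unsigned char *table, const struct reg_desc *reg)
{
  size_t i;

  for (i = 0; i < reg->n_pieces; i++)
    init_one_reg_size (table, reg->columns[i], reg->piece_size);
}
```

With a register described as two 4-byte pieces, each column gets size 4
instead of a single entry of size 8, which is the size the unwinder's
assertion expects in the failing case.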

Working on this, I noticed that expand_builtin_init_dwarf_reg_sizes was
also not honoring DWARF_REG_TO_UNWIND_COLUMN. ISTM that it should, so the
patch adjusts this as well.

We have been using a slight variant of this in production for a few months
on a gcc-4.9 base. Our 4.9 patch required adjustment to account for the
introduction of the dwarf_frame_reg_mode hook on mainline in the interim.

Tested by verifying that the proper size is stored for GPRs on
powerpc-eabispe, then with bootstrap and regtest for languages=all,ada on
x86_64-linux.

OK to commit ?

Thanks in advance for your feedback,

With Kind Regards,

Olivier

2014-09-30  Olivier Hainque  

libgcc/
* unwind-dw2.c (DWARF_REG_TO_UNWIND_COLUMN): Move default def to ...

gcc/
* defaults.h: ... here.
* dwarf2cfi.c (init_one_dwarf_reg_size): New helper, processing
one particular reg for expand_builtin_init_dwarf_reg_sizes. Apply
DWARF_REG_TO_UNWIND_COLUMN.
(expand_builtin_init_dwarf_reg_sizes): Rework to use helper and
account for dwarf register spans.



cfispan.diff
Description: Binary data


[ping*2] define CROSS = @CROSS@ in gcc/Makefile.in

2014-09-30 Thread Olivier Hainque
Hello,

ping on https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00056.html

Thanks in advance,

With Kind Regards,

Olivier

On Sep 1, 2014, at 17:26 , Olivier Hainque  wrote:

> Hello,
> 
> This patch is necessary for proper operation of a piece
> of the Ada Makefile fragment which tests the value of $(CROSS).
> 
> @ substitutions aren't performed for the language specific
> Makefile fragments so using @CROSS directly isn't an option
> there.
> 
> We have been using this for years and multiple targets in our
> local trees. Bootstrapped & reg-tested on x86_64-linux.
> 
> OK to commit ?
> 
> Thanks in advance for your feedback,
> 
> Olivier
> 
> 2014-09-01  Olivier Hainque  
> 
>   * Makefile.in (CROSS): Define, to @CROSS@.
> 
> 
> 
> 




Re: [PATCH][PING] PR62120

2014-09-30 Thread Jakub Jelinek
On Tue, Sep 30, 2014 at 02:44:21PM +0400, Ilya Tocar wrote:
> > 2014-09-15  Ilya Tocar  
> > 
> >PR middle-end/62120
> >* varasm.c (decode_reg_name_and_count): Check availability for
> >registers from ADDITIONAL_REGISTER_NAMES.
> > 
> > Testsuite/
> > 
> > 2014-09-15  Ilya Tocar  
> > 
> >PR middle-end/62120
> >* gcc.target/i386/avx512f-additional-reg-names.c: Use register vaild

s/vaild/valid/

> >in 32-bit mode.
> >* gcc.target/i386/pr62120.c: New.
> > 
> > ---
> >  gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c | 2 +-
> >  gcc/testsuite/gcc.target/i386/pr62120.c  | 8 
> >  gcc/varasm.c | 5 +++--
> >  3 files changed, 12 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr62120.c
> > 
> > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
> > index 164a1de..98a9052 100644
> > --- a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
> > +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
> > @@ -3,7 +3,7 @@
> >  
> >  void foo ()
> >  {
> > -  register int zmm_var asm ("zmm9") __attribute__((unused));
> > +  register int zmm_var asm ("zmm7") __attribute__((unused));
> >  
> >__asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" );

Please use zmm6 instead, zmm7 is clobbered in the following statement.

Otherwise LGTM.

Jakub


Re: [Patch, Fortran] Add CO_BROADCAST

2014-09-30 Thread Dominique d'Humières
This is what I have committed as r215715:

Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 215714)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2014-30-09  Dominique d'Humieres 
+
+   * gfortran.dg/coarray_collectives_9.f90: Fix some dg-error.
+
 2014-09-30  Jakub Jelinek  
 
PR inline-asm/63282
Index: gcc/testsuite/gfortran.dg/coarray_collectives_9.f90
===
--- gcc/testsuite/gfortran.dg/coarray_collectives_9.f90 (revision 215714)
+++ gcc/testsuite/gfortran.dg/coarray_collectives_9.f90 (working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fcoarray=single" }
+! { dg-options "-fcoarray=single -fmax-errors=40" }
 !
 !
 ! CO_BROADCAST/CO_REDUCE
@@ -29,7 +29,7 @@
   call co_reduce("abc") ! { dg-error "Missing actual argument 'operator' in call to 'co_reduce'" }
   call co_broadcast(1, source_image=1) ! { dg-error "'a' argument of 'co_broadcast' intrinsic at .1. must be a variable" }
   call co_reduce(a=1, operator=red_f) ! { dg-error "'a' argument of 'co_reduce' intrinsic at .1. must be a variable" }
-  call co_reduce(a=val, operator=red_f2) ! { dg-error "OPERATOR argument at (1) must be a PURE function" }
+  call co_reduce(a=val, operator=red_f2) ! { dg-error "OPERATOR argument at \\(1\\) must be a PURE function" }
 
   call co_broadcast(val, source_image=[1,2]) ! { dg-error "must be a scalar" }
   call co_broadcast(val, source_image=1.0) ! { dg-error "must be INTEGER" }
@@ -49,8 +49,8 @@
   call co_reduce(val, red_f, stat=[1,2]) ! { dg-error "must be a scalar" }
   call co_reduce(val, red_f, stat=1.0) ! { dg-error "must be INTEGER" }
   call co_reduce(val, red_f, stat=1) ! { dg-error "must be a variable" }
-  call co_reduce(val, red_f, stat=i, result_image=1) ! OK
-  call co_reduce(val, red_f, stat=i, errmsg=errmsg, result_image=1) ! OK
+  call co_reduce(val, red_f, stat=i, result_image=1) ! { dg-error "CO_REDUCE at \\(1\\) is not yet implemented" }
+  call co_reduce(val, red_f, stat=i, errmsg=errmsg, result_image=1) ! { dg-error "CO_REDUCE at \\(1\\) is not yet implemented" }
   call co_reduce(val, red_f, stat=i, errmsg=[errmsg], result_image=1) ! { dg-error "must be a scalar" }
   call co_reduce(val, red_f, stat=i, errmsg=5, result_image=1) ! { dg-error "must be CHARACTER" }
   call co_reduce(val, red_f, errmsg="abc") ! { dg-error "must be a variable" }
@@ -57,6 +57,6 @@
   call co_reduce(val, red_f, stat=i8) ! { dg-error "The stat= argument at .1. must be a kind=4 integer variable" }
   call co_reduce(val, red_f, errmsg=msg4) ! { dg-error "The errmsg= argument at .1. must be a default-kind character variable" }
 
-  call co_broadcasr(vec(idx), 1) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_sum shall not have a vector subscript" }
-  call co_reduce(vec([1,3,2]), red_f) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_min shall not have a vector subscript" }
+  call co_broadcast(vec(idx), 1) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_broadcast shall not have a vector subscript" }
+  call co_reduce(vec([1,3,2]), red_f) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_reduce shall not have a vector subscript" }
 end program test

Dominique



Re: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o compile, (internal compiler error)"

2014-09-30 Thread Richard Sandiford
Andreas Schwab  writes:
> Richard Sandiford  writes:
>
>> @@ -315,7 +318,7 @@ struct ira_allocno
>>   number (0, ...) - 2.  Value -1 is used for allocnos spilled by the
>>   reload (at this point pseudo-register has only one allocno) which
>>   did not get stack slot yet.  */
>> -  short int hard_regno;
>> +  int hard_regno : 16;
>
> If you want negative numbers you need to make that explicitly signed.

Are you sure?  In:

  struct { int i : 16; unsigned int j : 1; } x = { -1, 0 };
  int foo (void) { return x.i; }

foo returns -1 rather than 65535.  I can't see any precedent in gcc/*.[hc]
for explicitly marking bitfields as signed.
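
For reference, ISO C leaves it implementation-defined whether a plain
"int" bit-field is signed or unsigned; GCC treats it as signed by default,
and -funsigned-bitfields changes that.  The sketch below therefore spells
the signedness out so the result does not depend on that choice.  The
struct and function names are hypothetical and not taken from ira-int.h,
and a 32-bit int is assumed.

```c
/* Explicitly signed 16-bit field: -1 reads back as -1.  */
struct regno_signed
{
  signed int hard_regno : 16;
  unsigned int flag : 1;
};

/* Explicitly unsigned 16-bit field, for contrast.  */
struct regno_unsigned
{
  unsigned int hard_regno : 16;
  unsigned int flag : 1;
};

/* Store -1 in the signed field and read it back.  */
static int
read_signed (void)
{
  struct regno_signed x = { -1, 0 };
  return x.hard_regno;
}

/* Store the all-ones pattern in the unsigned field and read it back.  */
static int
read_unsigned (void)
{
  struct regno_unsigned x = { 65535u, 0 };
  return x.hard_regno;
}
```

read_signed returns -1 and read_unsigned returns 65535; a plain
"int hard_regno : 16" behaves like the first struct with GCC's defaults.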

Thanks,
Richard



Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure

2014-09-30 Thread Thomas Schwinge
Hi!

On Fri, 26 Sep 2014 16:36:21 +0400, Ilya Verbin  wrote:
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -58,6 +58,7 @@ build=@build@
>  host=@host@
>  target=@target@
>  target_noncanonical:=@target_noncanonical@
> +real_target_noncanonical:=@real_target_noncanonical@
>  
>  # Sed command to transform gcc to installed name.
>  program_transform_name := @program_transform_name@
> @@ -66,6 +67,10 @@ program_transform_name := @program_transform_name@
>  # Directories used during build
>  # -
>  
> +# Normally identical to target_noncanonical, except for compilers built
> +# as accelerator targets.
> +accel_dir_suffix = @accel_dir_suffix@

Doesn't that comment belong to real_target_noncanonical just above?  By
default, accel_dir_suffix is empty.  Probably also move accel_dir_suffix
above to where real_target_noncanonical is being set?


> --- a/configure.ac
> +++ b/configure.ac
> @@ -286,6 +286,24 @@ case ${with_newlib} in
>yes) skipdirs=`echo " ${skipdirs} " | sed -e 's/ target-newlib / /'` ;;
>  esac
>  
> +AC_ARG_ENABLE(as-accelerator-for,
> +[AS_HELP_STRING([--enable-as-accelerator-for=ARG],
> + [build as offload target compiler.
> + Specify offload host triple by ARG])],
> +ENABLE_AS_ACCELERATOR_FOR=$enableval,
> +ENABLE_AS_ACCELERATOR_FOR=no)

I don't see $ENABLE_AS_ACCELERATOR_FOR being used anywhere, so this can
probably be removed?

Also, given that we're adding --enable-as-accelerator-for and
--enable-offload-targets to the top-level configure, do we really need to
repeat (and thus, in the future, maintain) those also in the subdirectory
configure files where the respective enable_* flags are evaluated?  If my
understanding of Autoconf is correct, then the enable_* variables will be
available in the subdirectory configure files, even without repeating the
AC_ARG_ENABLE instantiations.

How is this handled for other --enable-[...] flags that are needed in
several configure files?

> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac
> @@ -883,6 +887,53 @@ AC_ARG_ENABLE(languages,
>  esac],
>  [enable_languages=c])
>  
> +AC_ARG_ENABLE(as-accelerator-for,
> +[AS_HELP_STRING([--enable-as-accelerator-for=ARG],
> + [build as offload target compiler.
> + Specify offload host triple by ARG])],
> +[
> +  AC_DEFINE(ACCEL_COMPILER, 1,
> +[Define if this compiler should be built as the offload target compiler.])
> +  enable_as_accelerator=yes
> +  case "${target}" in
> +*-intelmicemul-*)
> +  # In this case we expect offload compiler to be built as native, so we
> +  # need to rename the driver to avoid clashes with host's drivers.
> +  program_transform_name="s&^&${target}-&" ;;
> +  esac
> +  sedscript="s#${target_noncanonical}#${enable_as_accelerator_for}-accel-${target_noncanonical}#"
> +  program_transform_name=`echo $program_transform_name | sed $sedscript`
> +  accel_dir_suffix=/accel/${target_noncanonical}
> +  real_target_noncanonical=${enable_as_accelerator_for}
> +], [enable_as_accelerator=no])
> +AC_SUBST(enable_as_accelerator)

Thus, here you should be able to remove the AC_ARG_ENABLE for
--enable-as-accelerator-for, and just do something like (untested):

if test x"$enable_as_accelerator" != x; then
  AC_DEFINE([...])
  [...]
fi

I'm not sure whether the AC_SUBST for enable_as_accelerator needs to be
repeated.  (I think it needs to be repeated.)

> +AC_ARG_ENABLE(offload-targets,

Likewise, and also in the other */configure.ac files.


> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac

> +AC_SUBST(real_target_noncanonical)
> +AC_SUBST(accel_dir_suffix)

Can't we move these two AC_SUBST invocation up to where the variables are
being defined?


> --- /dev/null
> +++ b/libgcc/ompstuff.c
> @@ -0,0 +1,80 @@
> +/* Specialized bits of code needed for the OpenMP offloading tables.

> +#define OFFLOAD_FUNC_TABLE_SECTION_NAME "__gnu_offload_funcs"
> +#define OFFLOAD_VAR_TABLE_SECTION_NAME "__gnu_offload_vars"

Here we use __gnu_offload_* names, and...

> +void *_omp_func_table[0]
> +  __attribute__ ((__used__, visibility ("hidden"),
> +   section (OFFLOAD_FUNC_TABLE_SECTION_NAME))) = { };
> +void *_omp_var_table[0]
> +  __attribute__ ((__used__, visibility ("hidden"),
> +   section (OFFLOAD_VAR_TABLE_SECTION_NAME))) = { };

..., also use OFFLOAD_*_TABLE_*, but on the other hand use _omp_*_table.
For consistency, shouldn't these also be named _offload_*_table?  Then
the »OpenMP« could be removed from the description in line 1 (just
»needed for the offloading tables«), and the file could also be renamed to
offloadstuff.c or similar.  I certainly do acknowledge the role that
OpenMP has played in the development of this, but this
infrastructure is not tied to OpenMP.

> +void *__OPENMP_TARGET__[]
> +  __attribute__ ((__visibility__ ("hidden"))) =
> +{
> +  &_omp_func_table, &_omp_funcs_end,
> +  &_omp_var_table, &_omp_vars_end
> +};

Also, the name __OPENMP_

Re: [PATCH 2/n] OpenMP 4.0 offloading infrastructure: LTO streaming

2014-09-30 Thread Thomas Schwinge
Hi!

As just discussed for the libgcc changes in
,
just some suggestions regarding the terminology, where I think that the
term »target« might be confusing in comments or symbols' names.  That is,
in the following, »target« should possibly be replaced by »offload[ing]«
or similar:

On Mon, 29 Sep 2014 21:37:04 +0400, Ilya Verbin  wrote:
> --- a/gcc/lto-cgraph.c
> +++ b/gcc/lto-cgraph.c
> @@ -321,6 +321,11 @@ referenced_from_other_partition_p (symtab_node *node, lto_symtab_encoder_t encod
>  
>for (i = 0; node->iterate_referring (i, ref); i++)
>  {
> +  /* Ignore references from non-target nodes while streaming NODE into
> +  offload target section.  */
> +  if (!ref->referring->need_lto_streaming)
> + continue;
> +
>if (ref->referring->in_other_partition
>|| !lto_symtab_encoder_in_partition_p (encoder, ref->referring))
>   return true;
> @@ -339,9 +344,16 @@ reachable_from_other_partition_p (struct cgraph_node *node, lto_symtab_encoder_t
>if (node->global.inlined_to)
>  return false;
>for (e = node->callers; e; e = e->next_caller)
> -if (e->caller->in_other_partition
> - || !lto_symtab_encoder_in_partition_p (encoder, e->caller))
> -  return true;
> +{
> +  /* Ignore references from non-target nodes while streaming NODE into
> +  offload target section.  */
> +  if (!e->caller->need_lto_streaming)
> + continue;
> +
> +  if (e->caller->in_other_partition
> +   || !lto_symtab_encoder_in_partition_p (encoder, e->caller))
> + return true;
> +}
>return false;
>  }

> --- a/gcc/lto-section-names.h
> +++ b/gcc/lto-section-names.h
> @@ -25,6 +25,11 @@ along with GCC; see the file COPYING3.  If not see
> name for the functions and static_initializers.  For other types of
> sections a '.' and the section type are appended.  */
>  #define LTO_SECTION_NAME_PREFIX ".gnu.lto_"
> +#define OMP_SECTION_NAME_PREFIX ".gnu.target_lto_"

What about:

#define OFFLOAD_SECTION_NAME_PREFIX ".gnu.offload_lto_"

> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -8337,6 +8345,11 @@ expand_omp_target (struct omp_region *region)
>push_cfun (child_cfun);
>cgraph_edge::rebuild_edges ();
>  
> +  /* Prevent IPA from removing child_fn as unreachable, since there are no
> +  refs from the parent function to the target side child_fn.  */
> +  node = cgraph_node::get (child_fn);
> +  node->mark_force_output ();
> +
>/* Some EH regions might become dead, see PR34608.  If
>pass_cleanup_cfg isn't the first pass to happen with the
>new child, these dead EH edges might cause problems.


Grüße,
 Thomas




Re: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o compile, (internal compiler error)"

2014-09-30 Thread Andreas Schwab
Richard Sandiford  writes:

> Andreas Schwab  writes:
>> Richard Sandiford  writes:
>>
>>> @@ -315,7 +318,7 @@ struct ira_allocno
>>>   number (0, ...) - 2.  Value -1 is used for allocnos spilled by the
>>>   reload (at this point pseudo-register has only one allocno) which
>>>   did not get stack slot yet.  */
>>> -  short int hard_regno;
>>> +  int hard_regno : 16;
>>
>> If you want negative numbers you need to make that explicitly signed.
>
> Are you sure?

See C11, 6.7.2#5.

Each of the comma-separated multisets designates the same type,
except that for bit-fields, it is implementation-defined whether the
specifier int designates the same type as signed int or the same
type as unsigned int.


Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH, i386]: Enable remainder{sd,df,xf} and fmod{sf,df,xf} only for flag_finite_math_only.

2014-09-30 Thread FX
> The patch will be committed to mainline and other release branches.

Thanks!

FX


[c++-concepts] function concepts with deduced return type

2014-09-30 Thread Andrew Sutton
Do not allow function concepts with a deduced return type. Return type
deduction only happens during instantiation, and concepts are never
instantiated. Therefore, we can't determine the return type of a function
concept until we try to normalize the return expression.

2014-09-25  Andrew Sutton  

Explicitly disallow function concepts with deduced return types.
* gcc/cp/constraint.cc (check_function_concept): Remove check
for deduced return type.
* gcc/cp/decl.c (check_concept_fn): Explicitly check for
deduced return type.
* gcc/testsuite/g++.dg/concepts/fn-concept2.C: New.

Andrew Sutton
Index: testsuite/g++.dg/concepts/fn-concept2.C
===
--- testsuite/g++.dg/concepts/fn-concept2.C	(revision 0)
+++ testsuite/g++.dg/concepts/fn-concept2.C	(revision 0)
@@ -0,0 +1,7 @@
+// { dg-options "-std=c++1z" }
+
+template<typename T>
+  concept auto C1() { return 0; } // { dg-error "deduced return type" }
+
+template<typename T>
+  concept int C2() { return 0; } // { dg-error "return type" }
Index: cp/constraint.cc
===
--- cp/constraint.cc	(revision 215718)
+++ cp/constraint.cc	(working copy)
@@ -280,11 +280,6 @@ check_function_concept (tree fn)
 {
   location_t loc = DECL_SOURCE_LOCATION (fn);
 
-  // If fn was declared with auto, make sure the result type is bool.
-  if (FNDECL_USED_AUTO (fn) && TREE_TYPE (fn) != boolean_type_node) 
-error_at (loc, "deduced type of concept definition %qD is %qT and not %qT", 
-  fn, TREE_TYPE (fn), boolean_type_node);
-
   // Check that the function is comprised of only a single
   // return statement.
   tree body = DECL_SAVED_TREE (fn);
Index: cp/decl.c
===
--- cp/decl.c	(revision 215718)
+++ cp/decl.c	(working copy)
@@ -7525,9 +7525,13 @@ check_concept_fn (tree fn)
   if (DECL_ARGUMENTS (fn))
 error ("concept %q#D declared with function parameters", fn);
 
-  // The result type must be convertible to bool.
-  if (!same_type_p (TREE_TYPE (TREE_TYPE (fn)), boolean_type_node))
-error ("concept %q#D result must be bool", fn);
+  // The declared return type of the concept shall be bool, and
+  // it shall not be deduced from its definition.
+  tree type = TREE_TYPE (TREE_TYPE (fn));
+  if (is_auto (type))
+error ("concept %q#D declared with a deduced return type", fn);
+  else if (type != boolean_type_node)
+error ("concept %q#D with return type %qT", fn, type);
 }
 
 /* Helper function.  Replace the temporary this parameter injected


Re: [Patch AArch64] Fix extended register width

2014-09-30 Thread Marcus Shawcroft
On 22 September 2014 19:41, Carrot Wei  wrote:
> Hi
>
> The extended register width in add/adds/sub/subs/cmp instructions is
> not always the same as the target register; it depends on both the target
> register width and the extension type. But in the current implementation the
> extended register width is always the same as the target register. We have
> noticed it can generate the following wrong assembler code when compiling
> an internal application:
>
> add x2, x20, x0, sxtw 3
>
> The correct assembler should be
>
> add x2, x20, w0, sxtw 3

Hi,

The assembler deliberately accepts the first form as a programmer
convenience.  Given the above example:

AARCH64 GAS  x.s page 1


   1 0000 82CE20AB  adds x2, x20, x0, sxtw 3
   2 0004 82CE20AB  adds x2, x20, w0, sxtw 3

Note both forms are correctly assembled.  The GAS implementation
contains code at (or near) tc-aarch64.c:5461 that specifically catches
the former.

... therefore I see no need to change the behaviour of gcc.
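
For reference, the rule behind the two spellings can be sketched in C as
below.  This is only an illustration, assuming the architectural convention
that in the 64-bit extended-register forms the source register is written
as Wm for the UXTB/UXTH/UXTW and SXTB/SXTH/SXTW extensions and as Xm for
UXTX/SXTX.  The enum and the function name are hypothetical; they are not
taken from GCC or GAS.

```c
/* Hypothetical names, for illustration only.  */
enum extend_op { UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW, SXTX };

/* Return the register-name prefix ('w' or 'x') that the strict syntax
   uses for the extended source register.  DEST_IS_64BIT is nonzero
   when the destination is an X register.  */
static char
extended_reg_prefix (int dest_is_64bit, enum extend_op op)
{
  /* The 32-bit forms always take a W source register.  */
  if (!dest_is_64bit)
    return 'w';
  /* In the 64-bit forms only UXTX and SXTX take an X source.  */
  return (op == UXTX || op == SXTX) ? 'x' : 'w';
}
```

Under that assumption, extended_reg_prefix (1, SXTW) yields 'w', which is
the "w0, sxtw 3" spelling from the report; the "x0, sxtw 3" spelling is the
convenience form that the assembler also accepts.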

Cheers
/Marcus


[PATCH 1/2] PR 63340: Avoid harmful union classes in ira-costs.c

2014-09-30 Thread Richard Sandiford
This patch is the first of two to fix PR 63340, which is an ia64
regression caused by:

2014-09-22  Richard Sandiford  

* hard-reg-set.h: Include hash-table.h.
(target_hard_regs): Add a finalize method and a x_simplifiable_subregs
field.
* target-globals.c (target_globals::~target_globals): Call
hard_regs->finalize.
* rtl.h (subreg_shape): New structure.
(shape_of_subreg): New function.
(simplifiable_subregs): Declare.
* reginfo.c (simplifiable_subreg): New structure.
(simplifiable_subregs_hasher): Likewise.
(simplifiable_subregs): New function.
(invalid_mode_changes): Delete.
(valid_mode_changes, valid_mode_changes_obstack): New variables.
(record_subregs_of_mode): Remove subregs_of_mode parameter.
Record valid mode changes in valid_mode_changes.
(find_subregs_of_mode): Remove subregs_of_mode parameter.
Update calls to record_subregs_of_mode.
(init_subregs_of_mode): Remove invalid_mode_changes and bitmap
handling.  Initialize new variables.  Update call to
find_subregs_of_mode.
(invalid_mode_change_p): Check new variables instead of
invalid_mode_changes.
(finish_subregs_of_mode): Finalize new variables instead of
invalid_mode_changes.
(target_hard_regs::finalize): New function.
* ira-costs.c (print_allocno_costs): Call invalid_mode_change_p
even when CLASS_CANNOT_CHANGE_MODE is undefined.

After that patch, we consider a hard register to be invalid for a pseudo
register if the hard register doesn't allow all the mode changes
required by the pseudo register.  A class is invalid if all hard
registers in it are invalid.  The problem was that this redefines the
behaviour for union classes like ia64's GR_AND_FR_REGS.  Before the
patch, validity was based entirely on CANNOT_CHANGE_MODE_CLASS, which
rejects any superset of FR_REGS if a mode change is invalid for FR_REGS.
After the patch, mode changes allowed by GR_REGS are also allowed by
GR_AND_FR_REGS.

As explained in the big block comment in the patch, there's an argument
whether the mode changes allowed by union classes should be the union or
the intersection of the mode changes allowed by subclasses.  Both are
right in some cases and wrong in others.  The upshot is that we really
shouldn't be using union classes if only one of the subclasses is valid.

This first patch lays the groundwork for the main fix by:

(a) taking the register mode as well as the "approximate" allocation
class into account in setup_regno_cost_classes_by_aclass, so that
a register's cost classes are always "correct" for its mode.

(b) checking contains_reg_of_mode when setting up the cost classes
rather than when using them, so that this can be taken into account
when deciding whether union classes are useful.

(c) excluding classes that allow the same set of registers as existing
cost classes, such as GR_AND_FR_REGS if only GR_REGS or FR_REGS
are valid.

(d) using the cost_classes "index" array to map excluded classes to the
appropriate subclass, so that we can still take the subunion of
two cost classes and expect it to have a valid index.  E.g. if:

  A = { 1          }
  B = {    2, 3, 4 }
  C = { 1, 2, 3    }
  D = { 1, 2, 3, 4 }

and if 4 is invalid for a particular pseudo register, A, B and C
are useful but D is redundant with C.  If A \subunion B gives D,
then the lookup will map it to C.
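
The mapping described in (c) and (d) can be sketched as follows.  This is a
minimal, self-contained illustration that models each class as a bitmask of
hard registers 1..4; the names (class_regs, restrict_classes, map) are
hypothetical and this is not the ira-costs.c code.

```c
/* Hypothetical stand-ins for register classes.  */
enum { A, B, C, D, NUM_CLASSES };

/* Bit N set means hard register N belongs to the class.  */
static const unsigned class_regs[NUM_CLASSES] = {
  [A] = 1u << 1,
  [B] = (1u << 2) | (1u << 3) | (1u << 4),
  [C] = (1u << 1) | (1u << 2) | (1u << 3),
  [D] = (1u << 1) | (1u << 2) | (1u << 3) | (1u << 4),
};

/* Restrict every class to the registers in VALID.  MAP[cl] receives the
   position of the kept class that allows the same valid registers as
   class CL, or -1 if CL has no valid register left.  Returns the number
   of distinct classes kept.  */
static int
restrict_classes (unsigned valid, int map[NUM_CLASSES])
{
  unsigned kept_regs[NUM_CLASSES];
  int num_kept = 0;

  for (int cl = 0; cl < NUM_CLASSES; cl++)
    {
      unsigned regs = class_regs[cl] & valid;

      map[cl] = -1;
      if (regs == 0)
        continue;
      /* Reuse an earlier class that allows the same registers.  */
      for (int k = 0; k < num_kept; k++)
        if (kept_regs[k] == regs)
          map[cl] = k;
      if (map[cl] < 0)
        {
          kept_regs[num_kept] = regs;
          map[cl] = num_kept++;
        }
    }
  return num_kept;
}
```

With valid = { 1, 2, 3 } (register 4 invalid), this sketch keeps three
classes and gives D the same index as C, so a lookup of D still lands on a
valid entry.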

This significantly reduces the number of redundant classes in the
cost_classes structure, so it's also a minor compile-time improvement.
The time for -O0 fold-const.ii on x86_64 improved by ~0.5%.
(record_reg_classes was previously the hottest function in the
compilation; after the patch it goes down to number 2, though
it's still costly.)

I did a diff of the assembly output before and after the patch
on x86_64-linux-gnu, powerpc64-linux-gnu, s390x-linux-gnu and
aarch64-linux-gnu.  There were some minor register allocation
changes in a handful of files, but nothing major.

Tested on x86_64-linux-gnu, powerpc64-linux-gnu and aarch64-elf.
OK to install?

Thanks,
Richard


gcc/
PR rtl-optimization/63340 (part 1)
* ira-costs.c (all_cost_classes): New variable.
(complete_cost_classes): New function, split out from...
(setup_cost_classes): ...here.
(initiate_regno_cost_classes): Set up all_cost_classes.
(restrict_cost_classes): New function.
(setup_regno_cost_classes_by_aclass): Restrict the cost classes to
registers that are valid for the register's mode.
(setup_regno_cost_classes_by_mode): Model the mode cache as a
restriction of all_cost_classes to a particular mode.
(print_allocno_costs): Remove contains_reg_of_mode check.
(print_pseudo_costs, find_costs_and_classes): Likewise.

Index: gcc/ira-costs.c
===

[PATCH 2/2] PR 63340: Avoid harmful union classes in ira-costs.c

2014-09-30 Thread Richard Sandiford
This part of the patch actually fixes the PR.  It takes the reginfo.c
"invalid mode changes" into account when computing the cost classes,
rather than when using them.  The code in the first patch then ensures
that we don't add X_AND_Y_REGS to the cost classes if only X or Y allow
the required mode changes.

Tested in the same way as patch 1.  OK to install?

Thanks,
Richard


gcc/
PR rtl-optimization/63340 (part 2)
* ira-costs.c (setup_regno_cost_classes_by_aclass): Restrict the
classes to registers that are allowed by valid_mode_changes_for_regno.
(setup_regno_cost_classes_by_mode): Likewise.
(print_allocno_costs): Remove invalid_mode_change_p test.
(print_pseudo_costs, find_costs_and_classes): Likewise.

Index: gcc/ira-costs.c
===
--- gcc/ira-costs.c 2014-09-30 10:56:20.352946321 +0100
+++ gcc/ira-costs.c 2014-09-30 11:05:02.678826946 +0100
@@ -378,12 +378,18 @@ setup_regno_cost_classes_by_aclass (int
   classes_ptr = cost_classes_aclass_cache[aclass] = (cost_classes_t) *slot;
 }
   if (regno_reg_rtx[regno] != NULL_RTX)
-/* Restrict the classes to those that are valid for REGNO's mode
-   (which might for example exclude singleton classes if the mode requires
-   two registers).  */
-classes_ptr = restrict_cost_classes (classes_ptr,
-PSEUDO_REGNO_MODE (regno),
-reg_class_contents[ALL_REGS]);
+{
+  /* Restrict the classes to those that are valid for REGNO's mode
+(which might for example exclude singleton classes if the mode
+requires two registers).  Also restrict the classes to those that
+are valid for subregs of REGNO.  */
+  const HARD_REG_SET *valid_regs = valid_mode_changes_for_regno (regno);
+  if (!valid_regs)
+   valid_regs = &reg_class_contents[ALL_REGS];
+  classes_ptr = restrict_cost_classes (classes_ptr,
+  PSEUDO_REGNO_MODE (regno),
+  *valid_regs);
+}
   regno_cost_classes[regno] = classes_ptr;
 }
 
@@ -396,11 +402,17 @@ setup_regno_cost_classes_by_aclass (int
 static void
 setup_regno_cost_classes_by_mode (int regno, enum machine_mode mode)
 {
-  if (cost_classes_mode_cache[mode] == NULL)
-cost_classes_mode_cache[mode]
-  = restrict_cost_classes (&all_cost_classes, mode,
-  reg_class_contents[ALL_REGS]);
-  regno_cost_classes[regno] = cost_classes_mode_cache[mode];
+  if (const HARD_REG_SET *valid_regs = valid_mode_changes_for_regno (regno))
+regno_cost_classes[regno] = restrict_cost_classes (&all_cost_classes,
+  mode, *valid_regs);
+  else
+{
+  if (cost_classes_mode_cache[mode] == NULL)
+   cost_classes_mode_cache[mode]
+ = restrict_cost_classes (&all_cost_classes, mode,
+  reg_class_contents[ALL_REGS]);
+  regno_cost_classes[regno] = cost_classes_mode_cache[mode];
+}
 }
 
 /* Finilize info about the cost classes for each pseudo.  */
@@ -1526,14 +1538,11 @@ print_allocno_costs (FILE *f)
   for (k = 0; k < cost_classes_ptr->num; k++)
{
  rclass = cost_classes[k];
- if (! invalid_mode_change_p (regno, (enum reg_class) rclass))
-   {
- fprintf (f, " %s:%d", reg_class_names[rclass],
-  COSTS (costs, i)->cost[k]);
- if (flag_ira_region == IRA_REGION_ALL
- || flag_ira_region == IRA_REGION_MIXED)
-   fprintf (f, ",%d", COSTS (total_allocno_costs, i)->cost[k]);
-   }
+ fprintf (f, " %s:%d", reg_class_names[rclass],
+  COSTS (costs, i)->cost[k]);
+ if (flag_ira_region == IRA_REGION_ALL
+ || flag_ira_region == IRA_REGION_MIXED)
+   fprintf (f, ",%d", COSTS (total_allocno_costs, i)->cost[k]);
}
   fprintf (f, " MEM:%i", COSTS (costs, i)->mem_cost);
   if (flag_ira_region == IRA_REGION_ALL
@@ -1564,9 +1573,8 @@ print_pseudo_costs (FILE *f)
   for (k = 0; k < cost_classes_ptr->num; k++)
{
  rclass = cost_classes[k];
- if (! invalid_mode_change_p (regno, (enum reg_class) rclass))
-   fprintf (f, " %s:%d", reg_class_names[rclass],
-COSTS (costs, regno)->cost[k]);
+ fprintf (f, " %s:%d", reg_class_names[rclass],
+  COSTS (costs, regno)->cost[k]);
}
   fprintf (f, " MEM:%i\n", COSTS (costs, regno)->mem_cost);
 }
@@ -1803,10 +1811,6 @@ find_costs_and_classes (FILE *dump_file)
  for (k = 0; k < cost_classes_ptr->num; k++)
{
  rclass = cost_classes[k];
- /* Ignore classes that are too small or invalid for this
-operand.  */
- if (invalid_mode_change_p (i, (enum re

Re: Move tail merging pass forward

2014-09-30 Thread Dominique Dhumieres
> testcase in PR35545 shows case where profile feedback infrastructure ...

The test g++.dg/tree-prof/pr35545.C yields

UNRESOLVED: g++.dg/tree-prof/pr35545.C scan-ipa-dump-not optimized "OBJ_TYPE_REF"

The following patch

--- ../_clean/gcc/testsuite/g++.dg/tree-prof/pr35545.C  2014-09-27 15:01:44.0 +0200
+++ gcc/testsuite/g++.dg/tree-prof/pr35545.C    2014-09-30 15:09:46.0 +0200
@@ -48,5 +48,5 @@ int main()
 }
 /* { dg-final-use { scan-ipa-dump "Indirect call -> direct call" "profile_estimate" } } */
 /* { dg-final-use { cleanup-ipa-dump "profile" } } */
-/* { dg-final-use { scan-ipa-dump-not "OBJ_TYPE_REF" "optimized" } } */
+/* { dg-final-use { scan-tree-dump-not "OBJ_TYPE_REF" "optimized" } } */
 /* { dg-final-use { cleanup-tree-dump "optimized" } } */

fixes it.

Dominique


Re: [AArch64] Wire up vqdmullh_laneq_s16 and vqdmullh_laneq_s32

2014-09-30 Thread Marcus Shawcroft
On 24 September 2014 16:06, James Greenhalgh  wrote:
>
> Hi,
>
> As per the subject line this patch adds support for two arm_neon.h
> intrinsics that we had missed.
>
> We also need to fix the signature of vqdmulls_lane_s32, which is an
> obvious extension to this patch while we are in the area.
>
> Tested for simd.exp and aarch64.exp with no issues.
>
> OK?

OK
/Marcus


[C++ Patch PING] Re: [PATCH] make excessive template instantiation depth a fatal error

2014-09-30 Thread Paolo Carlini

Hi all, hi Jason,

On 08/24/2014 12:11 PM, Manuel López-Ibáñez wrote:

PING: https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01709.html
Today, I picked up this unreviewed patch prepared by Manuel back in August 
and trivially completed it by adjusting the testcases (all the tweaks 
seem to be the expected ones given the patch proper, no surprises). How does 
it look?


Thanks!
Paolo.

//
2014-09-30  Paolo Carlini  

* g++.dg/cpp0x/decltype26.C: Adjust.
* g++.dg/cpp0x/decltype28.C: Likewise.
* g++.dg/cpp0x/decltype29.C: Likewise.
* g++.dg/cpp0x/decltype32.C: Likewise.
* g++.dg/cpp0x/enum11.C: Likewise.
* g++.dg/template/arrow1.C: Likewise.
* g++.dg/template/pr23510.C: Likewise.
* g++.dg/template/recurse.C: Likewise.
* g++.dg/template/recurse2.C: Likewise.
* g++.dg/template/vtable2.C: Likewise.
* g++.old-deja/g++.pt/infinite1.C: Likewise.


Re: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o compile, (internal compiler error)"

2014-09-30 Thread Richard Earnshaw
On 30/09/14 12:51, Andreas Schwab wrote:
> Richard Sandiford  writes:
> 
>> Andreas Schwab  writes:
>>> Richard Sandiford  writes:
>>>
>>>> @@ -315,7 +318,7 @@ struct ira_allocno
>>>>   number (0, ...) - 2.  Value -1 is used for allocnos spilled by the
>>>>   reload (at this point pseudo-register has only one allocno) which
>>>>   did not get stack slot yet.  */
>>>> -  short int hard_regno;
>>>> +  int hard_regno : 16;
>>>
>>> If you want negative numbers you need to make that explicitly signed.
>>
>> Are you sure?
> 
> See C11, 6.7.2#5.
> 
> Each of the comma-separated multisets designates the same type,
> except that for bit-fields, it is implementation-defined whether the
> specifier int designates the same type as signed int or the same
> type as unsigned int.
> 
> 
> Andreas.
> 

GCC is written in C++ these days, so technically, you need the C++
standard :-)

GNU C defaults to signed bitfields (see trouble.texi).  However, since
GCC is supposed to bootstrap using a portable ISO C++ compiler, there's
an argument for removing the ambiguity entirely by being explicit.  We
no longer have to worry about compilers that don't support the signed
keyword.
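
As a minimal C sketch of what being explicit would look like (the struct
and function names here are hypothetical; this is not the real
ira_allocno, and it illustrates the C rule quoted above rather than the
C++ one):

```c
/* Hypothetical struct for illustration only.  */
struct allocno_sketch
{
  /* Explicitly signed, so -1 is representable whatever the compiler
     chooses for plain "int" bit-fields.  */
  signed int hard_regno : 16;
  /* Plain "int": whether this is signed is implementation-defined
     in C (C11 6.7.2p5).  */
  int maybe_signed : 16;
};

/* Return nonzero if storing -1 in the explicitly signed field reads
   back as a negative value.  */
static int
signed_field_keeps_minus_one (void)
{
  struct allocno_sketch a;

  a.hard_regno = -1;
  return a.hard_regno < 0;
}
```

The explicitly signed field holds -1 on any conforming compiler; for the
plain "int" field that depends on the implementation's choice.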

R.




Re: [Patch, Fortran] Add CO_BROADCAST

2014-09-30 Thread James Greenhalgh
On Tue, Sep 30, 2014 at 11:59:16AM +0100, Dominique d'Humières wrote:
> This is what I have committed as r215715:
> 
> Index: gcc/testsuite/ChangeLog
> ===
> --- gcc/testsuite/ChangeLog   (revision 215714)
> +++ gcc/testsuite/ChangeLog   (working copy)
> @@ -1,3 +1,7 @@
> +2014-30-09  Dominique d'Humieres 
 
Note that the date-line for ChangeLog entries should be in the form:

YYYY-MM-DD  Name  <email>

I pushed the following as r215723/r215724 (215723 fixes the date, 215724
fixes the two spaces after the name - sorry for the double commit spam) to
fix it up for you.
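
A check for the date part of such a line could be sketched in C as below.
The helper name is hypothetical and nothing like it is claimed to exist in
the GCC tree; it only verifies the YYYY-MM-DD date and the two spaces that
follow it, not the name or the email address.

```c
#include <ctype.h>
#include <string.h>

/* Return nonzero if LINE starts with a date of the form YYYY-MM-DD
   (month 01-12, day 01-31) followed by exactly two spaces and then a
   non-space character.  Hypothetical helper, for illustration only.  */
static int
changelog_date_line_ok (const char *line)
{
  static const char pattern[] = "dddd-dd-dd";
  size_t i;
  int month, day;

  /* Need the 10-character date, two spaces and at least one more
     character.  */
  if (strlen (line) < sizeof pattern + 2)
    return 0;

  for (i = 0; pattern[i] != '\0'; i++)
    if (pattern[i] == 'd'
        ? !isdigit ((unsigned char) line[i])
        : line[i] != '-')
      return 0;

  month = (line[5] - '0') * 10 + (line[6] - '0');
  day = (line[8] - '0') * 10 + (line[9] - '0');
  if (month < 1 || month > 12 || day < 1 || day > 31)
    return 0;

  return line[10] == ' ' && line[11] == ' ' && line[12] != ' ';
}
```

This sketch rejects a line beginning "2014-30-09" (month 30) and accepts
one beginning "2014-09-30" followed by two spaces and a name.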

Thanks,
James

---
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 215722)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -6,7 +6,7 @@
* gcc.target/aarch64/scalar_intrinsics.c (test_vqdmulls_s32):  Fix
return type.
 
-2014-30-09  Dominique d'Humieres 
+2014-09-30  Dominique d'Humieres  
 
* gfortran.dg/coarray_collectives_9.f90: Fix some dg-error.



Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer

2014-09-30 Thread Richard Earnshaw
On 29/09/14 19:32, Richard Henderson wrote:
> On 09/29/2014 11:12 AM, Jiong Wang wrote:
>> +inline rtx single_set_no_clobber_use (const rtx_insn *insn)
>> +{
>> +  if (!INSN_P (insn))
>> +return NULL_RTX;
>> +
>> +  if (GET_CODE (PATTERN (insn)) == SET)
>> +return PATTERN (insn);
>> +
>> +  /* Defer to the more expensive case, and return NULL_RTX if there is
>> + USE or CLOBBER.  */
>> +  return single_set_2 (insn, PATTERN (insn), true);
>>  }
> 
> What more expensive case?
> 
> If you're disallowing USE and CLOBBER, then single_set is just GET_CODE == SET.
> 
> I think this function is somewhat useless, and should not be added.
> 
> An adjustment to move_insn_for_shrink_wrap may be reasonable though.  I haven't
> tried to understand the miscompilation yet.  I can imagine that this would
> disable quite a bit of shrink wrapping for x86 though.  Can we do better in
> understanding when the clobbered register is live at the location to which we'd
> like to move the insns?
> 
> 
> r~
> 

I think part of the problem is in the naming of single_set().  From the
name it's not entirely obvious to users that this includes insns that
clobber registers or which write other registers that are unused after
that point.  I've previously had to fix a bug where this assumption was
made (e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54300).

Most uses of single_set prior to register allocation are probably safe;
but later uses are fraught with potential problems of this nature and
may well be bugs waiting to happen.

R.



Re: [C++ Patch PING] Re: [PATCH] make excessive template instantiation depth a fatal error

2014-09-30 Thread Paolo Carlini

... forgot to attach the complete patch ;)

Paolo.


Index: cp/cp-tree.h
===
--- cp/cp-tree.h(revision 215710)
+++ cp/cp-tree.h(working copy)
@@ -5418,7 +5418,6 @@ extern const char *lang_decl_name (tree, int, boo
 extern const char *lang_decl_dwarf_name(tree, int, bool);
 extern const char *language_to_string  (enum languages);
 extern const char *class_key_or_enum_as_string (tree);
-extern void print_instantiation_context(void);
 extern void maybe_warn_variadic_templates   (void);
 extern void maybe_warn_cpp0x   (cpp0x_warn_str str);
 extern bool pedwarn_cxx98   (location_t, int, const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
@@ -5633,7 +5632,7 @@ extern tree tsubst_copy_and_build (tree, tree, ts
 tree, bool, bool);
 extern tree most_general_template  (tree);
 extern tree get_mostly_instantiated_function_type (tree);
-extern int problematic_instantiation_changed   (void);
+extern bool problematic_instantiation_changed  (void);
 extern void record_last_problematic_instantiation (void);
 extern struct tinst_level *current_instantiation(void);
 extern tree maybe_get_template_decl_from_type_decl (tree);
@@ -5661,7 +5660,8 @@ extern tree fold_non_dependent_expr_sfinae(tree,
 extern bool alias_type_or_template_p(tree);
 extern bool alias_template_specialization_p (const_tree);
 extern bool explicit_class_specialization_p (tree);
-extern int push_tinst_level (tree);
+extern bool push_tinst_level(tree);
+extern bool push_tinst_level_loc(tree, location_t);
 extern void pop_tinst_level (void);
 extern struct tinst_level *outermost_tinst_level(void);
 extern void init_template_processing   (void);
Index: cp/error.c
===
--- cp/error.c  (revision 215710)
+++ cp/error.c  (working copy)
@@ -3360,16 +3360,6 @@ maybe_print_instantiation_context (diagnostic_cont
   record_last_problematic_instantiation ();
   print_instantiation_full_context (context);
 }
-
-/* Report the bare minimum context of a template instantiation.  */
-void
-print_instantiation_context (void)
-{
-  print_instantiation_partial_context
-(global_dc, current_instantiation (), input_location);
-  pp_newline (global_dc->printer);
-  diagnostic_flush_buffer (global_dc);
-}
 
 /* Report what constexpr call(s) we're trying to expand, if any.  */
 
Index: cp/pt.c
===
--- cp/pt.c (revision 215710)
+++ cp/pt.c (working copy)
@@ -8347,26 +8347,26 @@ static GTY(()) struct tinst_level *last_error_tins
 /* We're starting to instantiate D; record the template instantiation context
for diagnostics and to restore it later.  */
 
-int
+bool
 push_tinst_level (tree d)
 {
+  return push_tinst_level_loc (d, input_location);
+}
+
+/* We're starting to instantiate D; record the template instantiation context
+   at LOC for diagnostics and to restore it later.  */
+
+bool
+push_tinst_level_loc (tree d, location_t loc)
+{
   struct tinst_level *new_level;
 
   if (tinst_depth >= max_tinst_depth)
 {
-  last_error_tinst_level = current_tinst_level;
-  if (TREE_CODE (d) == TREE_LIST)
-   error ("template instantiation depth exceeds maximum of %d (use "
-  "-ftemplate-depth= to increase the maximum) substituting %qS",
-  max_tinst_depth, d);
-  else
-   error ("template instantiation depth exceeds maximum of %d (use "
-  "-ftemplate-depth= to increase the maximum) instantiating %qD",
-  max_tinst_depth, d);
-
-  print_instantiation_context ();
-
-  return 0;
+  fatal_error ("template instantiation depth exceeds maximum of %d"
+   " (use -ftemplate-depth= to increase the maximum)",
+   max_tinst_depth);
+  return false;
 }
 
   /* If the current instantiation caused problems, don't let it instantiate
@@ -8373,11 +8373,11 @@ push_tinst_level (tree d)
  anything else.  Do allow deduction substitution and decls usable in
  constant expressions.  */
   if (limit_bad_template_recursion (d))
-return 0;
+return false;
 
   new_level = ggc_alloc<tinst_level> ();
   new_level->decl = d;
-  new_level->locus = input_location;
+  new_level->locus = loc;
   new_level->errors = errorcount+sorrycount;
   new_level->in_system_header_p = in_system_header_at (input_location);
   new_level->next = current_tinst_level;
@@ -8387,7 +8387,7 @@ push_tinst_level (tree d)
   if (GATHER_STATISTICS && (tinst_depth > depth_reached))
 depth_reached = tinst_depth;
 
-  return 1;
+  return true;
 }
 
 /* We're done instantiating this template; return to the instantiation
@@ -2

Re: [Patch ARM-AArch64/testsuite v2 01/21] Neon intrinsics execution tests initial framework.

2014-09-30 Thread Christophe Lyon
On 10 July 2014 12:12, Marcus Shawcroft  wrote:
> On 1 July 2014 11:05, Christophe Lyon  wrote:
>> * documentation (README)
>> * dejagnu driver (neon-intrinsics.exp)
>> * support macros (arm-neon-ref.h, compute-ref-data.h)
>> * Tests for 3 intrinsics: vaba, vld1, vshl
>
> Hi, The terminology in armv8 is advsimd rather than neon.  Can we
> rename neon-intrinsics to advsimd-intrinsics or simd-intrinsics
> throughout please.  The existing gcc.target/aarch64/simd directory of
> tests will presumably be superseded by this more comprehensive set of
> tests so I suggest these tests go in gcc.target/aarch64/advsimd and we
> eventually remove gcc.target/aarch64/simd/ directory.
>
> GNU style should apply throughout this patch series, notably double
> space after period in comments and README text.  Space before left
> parenthesis in function/macro call and function declaration.  The
> function name in a declaration goes on a new line.  The GCC wiki notes
> on test case state individual test should have file names ending in
> _, see here https://gcc.gnu.org/wiki/TestCaseWriting
>

Hi,

For the record, these tests are based on a testsuite I wrote quite
some time ago:
https://gitorious.org/arm-neon-tests/

where obviously I had no such requirement (and v8 wasn't public yet)

So I prefer to apply the changes you request in my main version before
re-submitting it here.
(libsanitizer-style, sort-of).

This will take me some time, so the next version of my patch series
should not be expected really soon :-(

Christophe.


> I'm OK with the execute only no scan nature of the tests.
>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/neon-intrinsics/README 
>> b/gcc/testsuite/gcc.target/aarch64/neon-intrinsics/README
>> new file mode 100644
>> index 000..232bb1d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/neon-intrinsics/README
>> @@ -0,0 +1,132 @@
>> +This directory contains executable tests for ARM/AArch64 Neon
>> +intrinsics.
>
> Neon -> Advanced SIMD as below.
>
>> +
>> +It is meant to cover execution cases of all the Advanced SIMD
>> +intrinsics, but does not scan the generated assembler code.
>
>> +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
>> +
>> +typedef union {
>> +  struct {
>
> GNU style: { on new line.
>
>> +#define Neon_Cumulative_Sat  __read_neon_cumulative_sat()
>> +#define Set_Neon_Cumulative_Sat(x)  __set_neon_cumulative_sat((x))
>
> Upper case the macros rather than camel case.
>
>> +# Copyright (C) 2013 Free Software Foundation, Inc.
>
> s/13/14/
>
> Cheers
> /Marcus


Re: Enable TBAA on anonymous types with LTO

2014-09-30 Thread Jason Merrill

On 09/29/2014 11:36 AM, Jan Hubicka wrote:

> If C++ FE sets canonical type always to main variant, it should work.
> Is it always the case?


No.  For a compound type like a pointer or function the canonical type 
strips all typedefs, but a main variant does not.



>>    namespace {
>>    struct B {};
>>    }
>>    struct A
>>    {
>>      void t(B);
>>      void t2();
>>    };


> Yep, A seems to be not anonymous and mangled as A.  I think it is ODR violation
> to declare such type in more than one compilation unit (and we will warn on
> it). We can make it anonymous, but I think it is C++ FE to do so.


Yes, it's an ODR violation.  The FE currently warns about a field with 
internal type, and I suppose could warn about other members as well.



> I really think that anonymous types are meant to not be accessible from other
> compilation unit and I do not see why other languages need different rule.


Agreed.


> This does not work for types built from ODR types that are not ODR themselves.


I'm not sure what you mean.  In C++ the only types not subject to the 
ODR are local to one translation unit, so merging isn't an issue.  Do 
you mean types from other languages?


Jason



Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer

2014-09-30 Thread Jiong Wang


On 30/09/14 05:21, Jeff Law wrote:

> On 09/29/14 13:24, Jiong Wang wrote:
>>
>> I don't think so. from the x86-64 bootstrap, there is no regression
>> on the number of functions shrink-wrapped. actually speaking,
>> previously only single mov dest, src handled, so the disallowing
>> USE/CLOBBER will not disallow shrink-wrap opportunity which was
>> allowed previously.
>
> This is the key, of course.  shrink-wrapping is very restrictive in its
> ability to sink insns.  The only forms it'll currently shrink are simple
> moves.  Arithmetic, logicals, etc are left alone.  Thus disallowing
> USE/CLOBBER does not impact the x86 port in any significant way.


Yes, and we get +1.22% (2567 compared with 2536) functions shrink-wrapped
on the x86-64 bootstrap once we sink more insns than just the simple
"mov dest, src", and I remember a similar percentage on the glibc build.

On aarch64, the overall number of functions shrink-wrapped increased by 25%
on some programs. Maybe we could gain the same on other RISC backends.



I do agree with Richard that it would be useful to see the insns that
are incorrectly sunk and the surrounding context.


Insns 14 and 182 are sunk incorrectly. The details are below.

(insn 14 173 174 2 (parallel [
            (set (reg:QI 37 r8 [orig:86 D.32480 ] [86])
                (lshiftrt:QI (reg:QI 37 r8 [orig:86 D.32480 ] [86])
                    (const_int 2 [0x2])))
            (clobber (reg:CC 17 flags))
        ]) /home/andi/lsrc/linux/block/blk-flush2.c:50 547 {*lshrqi3_1}
     (expr_list:REG_EQUAL (lshiftrt:QI (mem:QI (plus:DI (reg/v/f:DI 43 r14 [orig:85 q ] [85])
                    (const_int 1612 [0x64c])) [20 *q_7+1612 S1 A32])
            (const_int 2 [0x2]))
        (nil)))
(insn 174 14 182 2 (set (reg:QI 44 r15 [orig:86 D.32480 ] [86])
        (reg:QI 37 r8 [orig:86 D.32480 ] [86])) /home/andi/lsrc/linux/block/blk-flush2.c:50 93 {*movqi_internal}
     (nil))
(insn 182 174 16 2 (parallel [
            (set (reg:SI 44 r15 [orig:86 D.32480 ] [86])
                (and:SI (reg:SI 44 r15 [orig:86 D.32480 ] [86])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) /home/andi/lsrc/linux/block/blk-flush2.c:50 376 {*andsi_1}
     (nil))


Jeff








Re: [C++ Patch PING] Re: [PATCH] make excessive template instantiation depth a fatal error

2014-09-30 Thread Jason Merrill

OK.

Jason


Re: Fix libgomp crash without TLS (PR42616)

2014-09-30 Thread Richard Henderson
On 09/30/2014 02:52 AM, Jakub Jelinek wrote:
> On Tue, Sep 30, 2014 at 11:03:47AM +0400, Varvara Rainchik wrote:
>> Corrected patch: call pthread_setspecific (gomp_tls_key, NULL) in
>> gomp_thread_start if HAVE_TLS is not defined.
>>
>> 2014-09-19  Varvara Rainchik  
>>
>> * libgomp.h (gomp_thread): For non TLS case create thread data.
>> * team.c (non_tls_thread_data_destructor,
>> create_non_tls_thread_data): New functions.
> 
> I actually wonder when we have emutls support in libgcc if it wouldn't
> be better to just define HAVE_TLS always to 1 (i.e. remove all the
> conditionals on it), then you wouldn't need to bother with this at all.
> 
> I don't have an OS which doesn't support native TLS though, so somebody with
> such a system would need to test it and benchmark if it doesn't make things
> slower.
> 
> Richard, thoughts on this?

I like that idea better as well.
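
For illustration only, here is a rough sketch of what the "always HAVE_TLS" variant amounts to: the per-thread record is a __thread variable and the accessor just returns its address, so there is no NULL case to handle. The names thread_data, current_thread_data and tls_is_per_thread are hypothetical and not taken from libgomp, and the sketch assumes the toolchain provides __thread either natively or through emutls.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for libgomp's struct gomp_thread.  */
struct thread_data
{
  int team_id;
};

/* One zero-initialized instance per thread.  __thread is a GNU extension;
   on targets without native TLS it is assumed to be emulated.  */
static __thread struct thread_data tls_data;

/* Accessor in the style of the HAVE_TLS branch: never returns NULL.  */
static inline struct thread_data *
current_thread_data (void)
{
  return &tls_data;
}

/* Records the value this thread initially sees, then modifies its own copy.  */
static void *
worker (void *arg)
{
  int *seen = arg;
  *seen = current_thread_data ()->team_id;
  current_thread_data ()->team_id = 42;
  return NULL;
}

/* Returns 1 if the worker saw a fresh copy and left ours untouched,
   0 if the data was shared, -1 if a thread call failed.  */
int
tls_is_per_thread (void)
{
  pthread_t tid;
  int seen_by_worker = -1;

  current_thread_data ()->team_id = 1;
  if (pthread_create (&tid, NULL, worker, &seen_by_worker) != 0)
    return -1;
  if (pthread_join (tid, NULL) != 0)
    return -1;
  return seen_by_worker == 0 && current_thread_data ()->team_id == 1;
}
```

Whether emulated TLS is fast enough for libgomp is exactly the benchmarking question raised above; the sketch says nothing about that.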


r~



Re: [PATCH, rs6000] Generate LE code for vec_lvsl and vec_lvsr that is compatible with BE code

2014-09-30 Thread Segher Boessenkool
On Mon, Sep 29, 2014 at 05:26:14PM -0500, Bill Schmidt wrote:
> The method used in this patch is to perform a byte-reversal of the
> result of the lvsl/lvsr.  This is accomplished by loading the vector
> char constant {0,1,...,15}, which will appear in the register from left
> to right as {15,...,1,0}.  A vperm instruction (which uses BE element
> ordering) is applied to the result of the lvsl/lvsr using the loaded
> constant as the permute control vector.

It would be nice if you could arrange the generated sequence such that
for the common case where the vec_lvsl feeds a vperm it results in
just lvsr;vnot machine instructions.  Not so easy to do though :-(
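
The byte reversal described in the quoted text can be modelled in plain C to see what the permute does. This is only a software sketch under stated assumptions: model_lvsl, model_vperm and byte_reverse are hypothetical helper names, array index 0 stands for the leftmost (BE element 0) byte of the register, and none of this is the code the patch generates.

```c
#include <stdint.h>

/* Model of lvsl: for an address whose low four bits are SH, the result
   bytes are SH, SH+1, ..., SH+15, left to right.  */
void
model_lvsl (uintptr_t addr, uint8_t out[16])
{
  unsigned int sh = (unsigned int) (addr & 0xf);
  for (int i = 0; i < 16; i++)
    out[i] = (uint8_t) (sh + i);
}

/* Model of vperm with BE element ordering: each control byte selects one
   byte (low five bits) from the 32-byte concatenation of A and B.  */
void
model_vperm (const uint8_t a[16], const uint8_t b[16],
             const uint8_t ctl[16], uint8_t out[16])
{
  for (int i = 0; i < 16; i++)
    {
      unsigned int sel = ctl[i] & 0x1f;
      out[i] = sel < 16 ? a[sel] : b[sel - 16];
    }
}

/* Byte-reverse IN into OUT by permuting with the control vector
   {15, 14, ..., 1, 0}, as it appears in the register left to right.
   IN and OUT must not overlap.  */
void
byte_reverse (const uint8_t in[16], uint8_t out[16])
{
  uint8_t ctl[16];
  for (int i = 0; i < 16; i++)
    ctl[i] = (uint8_t) (15 - i);
  model_vperm (in, in, ctl, out);
}
```

With a low address nibble of 3, the modelled lvsl result runs 3..18 left to right, and the reversed vector runs 18..3.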

Some minor comments...

> -(define_insn "altivec_lvsl"
> +(define_expand "altivec_lvsl"
> +  [(use (match_operand:V16QI 0 "register_operand" ""))
> +   (use (match_operand:V16QI 1 "memory_operand" "Z"))]

A define_expand should not have constraints.

> +  "TARGET_ALTIVEC"
> +  "

No need for the quotes.

> +{
> +  if (VECTOR_ELT_ORDER_BIG)
> +emit_insn (gen_altivec_lvsl_direct (operands[0], operands[1]));
> +  else
> +{
> +  int i;
> +  rtx mask, perm[16], constv, vperm;
> +  mask = gen_reg_rtx (V16QImode);
> +  emit_insn (gen_altivec_lvsl_direct (mask, operands[1]));
> +  for (i = 0; i < 16; ++i)

i++ is the common style.


Segher


[PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables

2014-09-30 Thread Ilya Verbin
Hello,

This patch creates 2 vectors with decls: offload_funcs and offload_vars.
libgomp will use addresses from these arrays to look up offloaded code.

During compilation they are output to:
* .gnu.offload_lto_offload_table section as IR for offload compiler;
* .gnu.lto_offload_table section as IR (if compiled with -flto);
* binary __gnu_offload_funcs/vars sections, or using
  targetm.record_offload_symbol hook for PTX.

During the linking phase:
* without -flto: a linker joins __gnu_offload_funcs/vars sections from all
  objects.
* with -flto -flto-partition=none: a compiler reads .gnu.lto_offload_table
  sections from all objects and writes the final joint table into
  __gnu_offload_funcs/vars in the final binary.
* with -flto:
  * at WPA stage a compiler reads .gnu.lto_offload_table sections from all
objects and writes the joint table into .gnu.lto_offload_table in the
first LTO partition;
  * at LTRANS stage a compiler reads .gnu.lto_offload_table from the first
partition and writes the final table into __gnu_offload_funcs/vars in
the final binary.
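
As a rough illustration of the table handling described above, the following sketch joins per-object tables in order and then looks an address up in the joint table. All names here (offload_fn, join_tables, lookup_offload_index, kernel_a/b/c) are hypothetical; the real tables live in the sections named above, and the assumption that a host address is resolved to an index in the joint table is mine, not something the patch description states.

```c
#include <stddef.h>

/* Hypothetical entry type: the address of one offloadable host function.  */
typedef void (*offload_fn) (void);

/* Three distinct placeholder functions standing in for outlined
   target regions.  */
int last_kernel;
void kernel_a (void) { last_kernel = 1; }
void kernel_b (void) { last_kernel = 2; }
void kernel_c (void) { last_kernel = 3; }

/* Append SRC[0..N_SRC) to DST, which holds *N_DST entries and has room
   for CAP.  Returns 0 on success, -1 if the joint table would not fit.  */
int
join_tables (offload_fn *dst, size_t *n_dst, size_t cap,
             const offload_fn *src, size_t n_src)
{
  if (*n_dst > cap || n_src > cap - *n_dst)
    return -1;
  for (size_t i = 0; i < n_src; i++)
    dst[(*n_dst)++] = src[i];
  return 0;
}

/* Return the index of ADDR in TABLE, or -1 if ADDR is not in the table.  */
ptrdiff_t
lookup_offload_index (const offload_fn *table, size_t n, offload_fn addr)
{
  for (size_t i = 0; i < n; i++)
    if (table[i] == addr)
      return (ptrdiff_t) i;
  return -1;
}
```

The point of the sketch is only that the order of entries must be preserved when tables from several objects (or LTO partitions) are joined, since an index into the joint table is meaningful only if every consumer sees the same order.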

Bootstrapped and regtested on top of patch 2.  Is it OK for trunk?

Thanks,
  -- Ilya


2014-09-30  Ilya Verbin  
Bernd Schmidt  
Andrey Turetskiy  
Michael Zolotukhin  

gcc/
* Makefile.in (GTFILES): Add omp-low.h to list of GC files.
* cgraphunit.c: Include omp-low.h.
(initialize_offload): Collect global variables with "omp declare target"
attribute into offload_vars vector.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_RECORD_OFFLOAD_SYMBOL): Document.
* gengtype.c (open_base_files): Add omp-low.h to ifiles.
* lto-cgraph.c (output_offload_tables): New function.
(input_offload_tables): Likewise.
* lto-section-in.c (lto_section_name): Add "offload_table".
* lto-section-names.h (OFFLOAD_VAR_TABLE_SECTION_NAME): Define.
(OFFLOAD_FUNC_TABLE_SECTION_NAME): Likewise.
* lto-streamer-out.c (lto_output): Call output_offload_tables.
* lto-streamer.h (lto_section_type): Add LTO_section_offload_table.
(output_offload_tables, input_offload_tables): Declare.
* omp-low.c: Include common/common-target.h and lto-section-names.h.
(offload_funcs, offload_vars): New global  vectors.
(expand_omp_target): Add child_fn into offload_funcs vector.
(add_decls_addresses_to_decl_constructor): New function.
(omp_finish_file): Likewise.
* omp-low.h (omp_finish_file, offload_funcs, offload_vars): Declare.
* target.def (record_offload_symbol): New DEFHOOK.
* toplev.c: Include omp-low.h.
(compile_file): Call omp_finish_file.
gcc/lto/
* lto/lto.c (read_cgraph_and_symbols): Call input_offload_tables.

---

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index aa1c360..5c08f4b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2278,6 +2278,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/tree-profile.c $(srcdir)/tree-nested.c \
   $(srcdir)/tree-parloops.c \
   $(srcdir)/omp-low.c \
+  $(srcdir)/omp-low.h \
   $(srcdir)/targhooks.c $(out_file) $(srcdir)/passes.c $(srcdir)/cgraphunit.c \
   $(srcdir)/cgraphclones.c \
   $(srcdir)/tree-phinodes.c \
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index a6b0bac..3c9bd04 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -211,6 +211,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-nested.h"
 #include "gimplify.h"
 #include "dbgcnt.h"
+#include "omp-low.h"
 #include "lto-section-names.h"
 
 /* Queue of cgraph nodes scheduled to be added into cgraph.  This is a
@@ -1996,7 +1997,9 @@ output_in_order (bool no_reorder)
 }
 
 /* Check whether there is at least one function or global variable to offload.
-   */
+   Also collect all such global variables into OFFLOAD_VARS, the functions were
+   already collected in omp-low.c.  They will be streamed out in
+   ipa_write_summaries.  */
 
 static bool
 initialize_offload (void)
@@ -2020,6 +2023,7 @@ initialize_offload (void)
  || DECL_SIZE (vnode->decl) == 0)
continue;
   have_offload = true;
+  vec_safe_push (offload_vars, vnode->decl);
 }
 
   return have_offload;
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 10af50e..80da884 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11195,6 +11195,12 @@ If defined, this function returns an appropriate alignment in bits for an atomic
 ISO C11 requires atomic compound assignments that may raise floating-point exceptions to raise exceptions corresponding to the arithmetic operation whose result was successfully stored in a compare-and-exchange sequence.  This requires code equivalent to calls to @code{feholdexcept}, @code{feclearexcept} and @code{feupdateenv} to be generated at appropriate points in the compare-and-exchange sequence.  This hook should set @code{*@var{hold}} to an expres

Re: [PATCH C++] - SD-6 Implementation Part 1 - __has_include.

2014-09-30 Thread Jason Merrill

On 09/29/2014 11:18 AM, Ed Smith-Rowland wrote:

+  /* Nonzero to prevent macro expansion.  */
+  unsigned char in__has_include__;


I don't see anything checking this flag to prevent macro expansion. 
Does the comment just need a change?



+  /* Binary literals and variable length arrays have been allowed in g++
+before C++11 and were standardized for C++14.  */
+  if (!pedantic || cxx_dialect > cxx11)
+   {
+ cpp_define (pfile, "__cpp_binary_literals=201304");
+   }


This comment also needs an update.


+//  Try a macro.
+#define COMPLEX_INC "complex.h"
+#if __has_include(COMPLEX_INC)
+#else
+#  error COMPLEX_INC
+#endif


Are you sure this is what SD-6 means?  I interpret it as trying to 
specify something equivalent to the #include directive, namely that 
first we look for an explicit header-name, then try a more flexible 
parse that should include macro expansion.  But this can wait for a 
clarification from SG10.
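
For reference, a small sketch of the uncontroversial form, where the argument is an explicit header-name rather than a macro. It is written in C on the assumption that the preprocessor in use provides __has_include; HAVE_COMPLEX_H and have_complex_header are hypothetical names chosen for the example.

```c
/* Guard the operator itself: a preprocessor that does not know
   __has_include must never see it used in an #if expression.  */
#if defined(__has_include)
#  if __has_include(<complex.h>)
#    define HAVE_COMPLEX_H 1
#  endif
#endif

#ifndef HAVE_COMPLEX_H
#  define HAVE_COMPLEX_H 0
#endif

/* Returns 1 when <complex.h> was found at preprocessing time, else 0.  */
int
have_complex_header (void)
{
  return HAVE_COMPLEX_H;
}
```

The nested #if is deliberate: combining the two tests with && would still make an older preprocessor parse the unknown operator.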


The patch is OK with those comment tweaks.

Jason



Re: [C++ Patch PING] Re: [PATCH] make excessive template instantiation depth a fatal error

2014-09-30 Thread Manuel López-Ibáñez
I don't want to cause you more work, Paolo, but perhaps this should be
documented in https://gcc.gnu.org/gcc-5/changes.html?

Something like:

* Excessive template instantiation depth is now a fatal error. This
prevents excessive diagnostics that usually do not help to identify
the problem.

Thanks for taking care of this!

Cheers,

Manuel.

On 30 September 2014 16:38, Jason Merrill  wrote:
> OK.
>
> Jason


Re: [PATCH][AArch64] LR register not used in leaf functions

2014-09-30 Thread Jiong Wang

On 27/09/14 22:20, Kugan wrote:


On 23/09/14 01:58, Jiong Wang wrote:

On 22/09/14 16:43, Kugan wrote:


AArch64 has the same issue ARM had, where the LR register was not used in
leaf functions. This was reported in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42017. On AArch64, the
test case needs more live ranges before LR_REGNUM is needed, i.e. the
test case in the PR needs additional loops, up to r31, for AArch64 to
show the problem.

The same fix (from the thread
https://gcc.gnu.org/ml/gcc-patches/2011-04/msg02191.html) which went
into ARM should apply to AArch64 as well. Regression tested on qemu for
aarch64-none-linux-gnu with no new regressions. Is this OK for trunk?

This is still only a partial fix. LR should be a caller-saved register
that is free to use as long as it is saved properly across function calls.

Indeed. This should be improved in the generic code. Right now, if a
hard register is used in EPILOGUE_USES, it conflicts with all the live
ranges until a call site kills it.  I think we should have this patch until
the generic code can be improved.


Below is my local patch. LR is treated as a free register and, strictly
following the AArch64 ABI, a frame is always created and FP is maintained
properly if LR is clobbered under -fno-omit-frame-pointer.


gcc/
  * config/aarch64/aarch64.h (CALL_USED_REGISTERS): Mark LR as caller-save.
  (EPILOGUE_USES): Guard the check by epilogue_completed.
  * config/aarch64/aarch64.c (aarch64_layout_frame): Explicitly check for LR.
  (aarch64_can_eliminate): Check LR_REGNUM liveness.

gcc/testsuite/
  * gcc.target/aarch64/lr_free_1.c: New testcase for -fomit-frame-pointer.
  * gcc.target/aarch64/lr_free_2.c: New testcase for leaf -fno-omit-frame-pointer.
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index db950da..892b310 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -250,7 +250,7 @@ extern unsigned long aarch64_tune_flags;
 1, 1, 1, 1,   1, 1, 1, 1,	/* R0 - R7 */		\
 1, 1, 1, 1,   1, 1, 1, 1,	/* R8 - R15 */		\
 1, 1, 1, 0,   0, 0, 0, 0,	/* R16 - R23 */		\
-0, 0, 0, 0,   0, 1, 0, 1,	/* R24 - R30, SP */	\
+0, 0, 0, 0,   0, 1, 1, 1,	/* R24 - R30, SP */	\
 1, 1, 1, 1,   1, 1, 1, 1,	/* V0 - V7 */		\
 0, 0, 0, 0,   0, 0, 0, 0,	/* V8 - V15 */		\
 1, 1, 1, 1,   1, 1, 1, 1,   /* V16 - V23 */ \
@@ -309,7 +309,7 @@ extern unsigned long aarch64_tune_flags;
considered live at the start of the called function.  */
 
 #define EPILOGUE_USES(REGNO) \
-  ((REGNO) == LR_REGNUM)
+  (epilogue_completed && (REGNO) == LR_REGNUM)
 
 /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
the stack pointer does not matter.  The value is tested only in
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 15c7be6..8b39b2a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1864,7 +1864,8 @@ aarch64_layout_frame (void)
   /* ... and any callee saved register that dataflow says is live.  */
   for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
 if (df_regs_ever_live_p (regno)
-	&& !call_used_regs[regno])
+	&& (regno == R30_REGNUM
+	|| !call_used_regs[regno]))
   cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
 
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
@@ -4313,6 +4314,16 @@ aarch64_can_eliminate (const int from, const int to)
 
   return false;
 }
+  else
+{
+  /* If we decided that we didn't need a leaf frame pointer but then used
+	 LR in the function, then we'll want a frame pointer after all, so
+	 prevent this elimination to ensure a frame pointer is used.  */
+  if (to == STACK_POINTER_REGNUM
+	  && flag_omit_leaf_frame_pointer
+	  && df_regs_ever_live_p (LR_REGNUM))
+	return false;
+}
 
   return true;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/lr_free_1.c b/gcc/testsuite/gcc.target/aarch64/lr_free_1.c
new file mode 100644
index 000..4c530a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lr_free_1.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-fno-inline -O2 -fomit-frame-pointer -ffixed-x2 -ffixed-x3 -ffixed-x4 -ffixed-x5 -ffixed-x6 -ffixed-x7 -ffixed-x8 -ffixed-x9 -ffixed-x10 -ffixed-x11 -ffixed-x12 -ffixed-x13 -ffixed-x14 -ffixed-x15 -ffixed-x16 -ffixed-x17 -ffixed-x18 -ffixed-x19 -ffixed-x20 -ffixed-x21 -ffixed-x22 -ffixed-x23 -ffixed-x24 -ffixed-x25 -ffixed-x26 -ffixed-x27 -ffixed-28 -ffixed-29 --save-temps -mgeneral-regs-only -fno-ipa-cp" } */
+
+extern void abort ();
+
+int
+dec (int a, int b)
+{
+  return a + b;
+}
+
+int
+cal (int a, int b)
+{
+  int sum1 = a * b;
+  int sum2 = a / b;
+  int sum = dec (sum1, sum2);
+  return a + b + sum + sum1 + sum2;
+}
+
+int
+main (int argc, char **argv)
+{
+  int ret = cal (2, 1);
+
+  if (ret != 11)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "str\tx30, \\\[sp, -\[0-9\]+\\\]!" 2 } } */
+/* { dg-final { scan-assembler "str\tw30, \\\[sp, \[0-9\]+

Re: [debug-early] rearrange some checks in gen_subprogram_die

2014-09-30 Thread Aldy Hernandez

On 09/30/14 03:23, Richard Biener wrote:

On Mon, Sep 29, 2014 at 8:54 PM, Aldy Hernandez  wrote:

I'm rearranging some code in Michael's original patch to minimize the
difference with mainline.

It seems that the check for DECL_STRUCT_FUNCTION (decl)->gimple_df, was
merely a check to see if we had already set the FDE bits for the decl in
question.


Sounds more like a check whether the frontend is finished?


Is that the canonical way for checking the FE is finished?  Seems kinda 
odd.  I'd prefer to check for ->fde, since this is the actual reason the 
rest of dwarf generation will not work in this case.


Either way, I'm not terribly attached to this particular part of the
patch.  If you'd rather I use ->gimple_df, I can use it.  It just
doesn't seem very readable.


Aldy


[PATCH] gcc.c-torture/ cleanup

2014-09-30 Thread Marek Polacek
I did this as a part of preparing the testsuite to cope with the
(possible) gnu11 default.  But I think it's a reasonable cleanup
on its own.  With gnu11, we'd start to warn about defaulting to
int, missing function declarations, and functions without return
type.  I added -fgnu89-inline when a test relies on gnu89 inline
semantics, and -std=gnu89 if a test relies on the gnu89 standard.
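
To make the kind of source-level fix concrete, here is a small sketch of the "defaulting to int" and "missing function declaration" cases. The functions f and g are hypothetical and not taken from any of the tests listed below; the "before" form is shown only in a comment.

```c
/* Before (quiet under gnu89, diagnosed once gnu11 is the default):

     f (x) { return g (x) + 1; }

   After: explicit return and parameter types, and a declaration of g
   that is visible before its first use.  */
int g (int);

int
f (int x)
{
  return g (x) + 1;
}

int
g (int x)
{
  return x * 2;
}
```

Tests that instead depend on the old behaviour, such as gnu89 extern inline, keep their source and get the matching option via dg-options.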

I have patches that cover the rest of C testsuite, but let's do this
piecewise.

Tested on x86_64-linux: vanilla results == results with this patch ==
results with this patch and gnu11 as a default.

Does this approach make sense?

2014-09-30  Marek Polacek  

* gcc.c-torture/compile/2120-2.c: Use -fgnu89-inline.
* gcc.c-torture/compile/2009-1.c: Likewise.
* gcc.c-torture/compile/2009-2.c: Likewise.
* gcc.c-torture/compile/20021120-1.c: Likewise.
* gcc.c-torture/compile/20021120-2.c: Likewise.
* gcc.c-torture/compile/20050215-1.c: Likewise.
* gcc.c-torture/compile/20050215-2.c: Likewise.
* gcc.c-torture/compile/20050215-3.c: Likewise.
* gcc.c-torture/compile/pr37669.c: Likewise.
* gcc.c-torture/execute/20020107-1.c: Likewise.
* gcc.c-torture/execute/restrict-1.c: Likewise.
* gcc.c-torture/compile/20090721-1.c: Fix defaulting to int.
* gcc.c-torture/execute/930529-1.c: Likewise.
* gcc.c-torture/execute/920612-1.c: Likewise.
* gcc.c-torture/execute/920711-1.c: Likewise.
* gcc.c-torture/execute/990127-2.c: Likewise.
* gcc.c-torture/execute/pr40386.c: Likewise.
* gcc.c-torture/execute/pr57124.c: Likewise.
* gcc.c-torture/compile/pr34808.c: Add function declarations.
* gcc.c-torture/compile/pr42299.c: Likewise.
* gcc.c-torture/compile/pr48517.c: Use -std=gnu89.
* gcc.c-torture/compile/simd-6.c: Likewise.
* gcc.c-torture/execute/pr53645-2.c: Likewise.
* gcc.c-torture/execute/pr53645.c: Likewise.
* gcc.c-torture/execute/20001121-1.c: Use -fgnu89-inline.  Add function
declarations.
* gcc.c-torture/execute/980608-1.c: Likewise.
* gcc.c-torture/execute/bcp-1.c: Likewise.
* gcc.c-torture/execute/p18298.c: Likewise.
* gcc.c-torture/execute/unroll-1.c: Likewise.
* gcc.c-torture/execute/va-arg-7.c: Likewise.
* gcc.c-torture/execute/va-arg-8.c: Likewise.
* gcc.c-torture/execute/930526-1.c: Use -fgnu89-inline.  Add function
declarations.  Fix defaulting to int.
* gcc.c-torture/execute/961223-1.c: Likewise.
* gcc.c-torture/execute/loop-2c.c: Use -fgnu89-inline and
-Wno-pointer-to-int-cast.  Fix defaulting to int.

diff --git gcc/gcc/testsuite/gcc.c-torture/compile/2120-2.c gcc/gcc/testsuite/gcc.c-torture/compile/2120-2.c
index 737eb92..939c52d 100644
--- gcc/gcc/testsuite/gcc.c-torture/compile/2120-2.c
+++ gcc/gcc/testsuite/gcc.c-torture/compile/2120-2.c
@@ -1,3 +1,5 @@
+/* { dg-options "-fgnu89-inline" } */
+
 extern __inline__ int
 odd(int i)
 {
diff --git gcc/gcc/testsuite/gcc.c-torture/compile/2009-1.c gcc/gcc/testsuite/gcc.c-torture/compile/2009-1.c
index b4b80ae..5d036c9 100644
--- gcc/gcc/testsuite/gcc.c-torture/compile/2009-1.c
+++ gcc/gcc/testsuite/gcc.c-torture/compile/2009-1.c
@@ -1,3 +1,4 @@
+/* { dg-options "-fgnu89-inline" } */
 /* { dg-require-weak "" } */
 /* { dg-require-alias "" } */
 #define ASMNAME(cname)  ASMNAME2 (__USER_LABEL_PREFIX__, cname)
diff --git gcc/gcc/testsuite/gcc.c-torture/compile/2009-2.c gcc/gcc/testsuite/gcc.c-torture/compile/2009-2.c
index e06809f..ea1176a 100644
--- gcc/gcc/testsuite/gcc.c-torture/compile/2009-2.c
+++ gcc/gcc/testsuite/gcc.c-torture/compile/2009-2.c
@@ -1,3 +1,4 @@
+/* { dg-options "-fgnu89-inline" } */
 /* { dg-require-weak "" } */
 /* { dg-require-alias "" } */
 #define ASMNAME(cname)  ASMNAME2 (__USER_LABEL_PREFIX__, cname)
diff --git gcc/gcc/testsuite/gcc.c-torture/compile/20021120-1.c gcc/gcc/testsuite/gcc.c-torture/compile/20021120-1.c
index 423f8ec..3dc4928 100644
--- gcc/gcc/testsuite/gcc.c-torture/compile/20021120-1.c
+++ gcc/gcc/testsuite/gcc.c-torture/compile/20021120-1.c
@@ -4,6 +4,8 @@
 /* Verify that GCC doesn't get confused by the
redefinition of an extern inline function. */
 
+/* { dg-options "-fgnu89-inline" } */
+
 extern int inline foo () { return 0; }
 extern int inline bar () { return 0; }
 static int inline bar () { return foo(); }
diff --git gcc/gcc/testsuite/gcc.c-torture/compile/20021120-2.c gcc/gcc/testsuite/gcc.c-torture/compile/20021120-2.c
index 51f0e25..cd9eda0 100644
--- gcc/gcc/testsuite/gcc.c-torture/compile/20021120-2.c
+++ gcc/gcc/testsuite/gcc.c-torture/compile/20021120-2.c
@@ -4,6 +4,8 @@
 /* Verify that GCC doesn't get confused by the
redefinition of an extern inline function. */
 
+/* { dg-options "-fgnu89-inline" } */
+
 extern int inline foo () { return 0; }
 extern int inline bar (

Re: [PATCH, rs6000] Generate LE code for vec_lvsl and vec_lvsr that is compatible with BE code

2014-09-30 Thread Bill Schmidt
On Tue, 2014-09-30 at 09:50 -0500, Segher Boessenkool wrote:
> On Mon, Sep 29, 2014 at 05:26:14PM -0500, Bill Schmidt wrote:
> > The method used in this patch is to perform a byte-reversal of the
> > result of the lvsl/lvsr.  This is accomplished by loading the vector
> > char constant {0,1,...,15}, which will appear in the register from left
> > to right as {15,...,1,0}.  A vperm instruction (which uses BE element
> > ordering) is applied to the result of the lvsl/lvsr using the loaded
> > constant as the permute control vector.
> 
> It would be nice if you could arrange the generated sequence such that
> for the common case where the vec_lvsl feeds a vperm it is results in
> just lvsr;vnot machine instructions.  Not so easy to do though :-(

Yes -- as you note, that only works when feeding a vperm, which is what
we expect but generally a lot of work to prove.  Again, this is
deprecated usage so it seems not worth spending the effort on this...

> 
> Some minor comments...
> 
> > -(define_insn "altivec_lvsl"
> > +(define_expand "altivec_lvsl"
> > +  [(use (match_operand:V16QI 0 "register_operand" ""))
> > +   (use (match_operand:V16QI 1 "memory_operand" "Z"))]
> 
> A define_expand should not have constraints.

Thanks for catching this -- that one slipped through (pasto).

> 
> > +  "TARGET_ALTIVEC"
> > +  "
> 
> No need for the quotes.

Ok.

> 
> > +{
> > +  if (VECTOR_ELT_ORDER_BIG)
> > +emit_insn (gen_altivec_lvsl_direct (operands[0], operands[1]));
> > +  else
> > +{
> > +  int i;
> > +  rtx mask, perm[16], constv, vperm;
> > +  mask = gen_reg_rtx (V16QImode);
> > +  emit_insn (gen_altivec_lvsl_direct (mask, operands[1]));
> > +  for (i = 0; i < 16; ++i)
> 
> i++ is the common style.

Now that we're being compiled as C++, ++i is the common style there --
is there guidance about this for gcc style these days?

Thanks,
Bill

> 
> 
> Segher
> 




Re: [Bug libstdc++/62313] Data race in debug iterators

2014-09-30 Thread Jonathan Wakely

On 26/09/14 11:05 +0100, Jonathan Wakely wrote:

On 26/09/14 00:00 +0200, François Dumont wrote:



Apart from those minor adjustments I think this looks good, but I'd
like to know that it has been tested with -fsanitize=thread, even if
only lightly tested.




Hi

  Dmitry, who reported the bug, confirmed the fix. Can I go ahead
and commit?


Yes, OK.


This caused some failures in the printer tests:

Running
/home/jwakely/src/gcc/gcc/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
 ...
FAIL: libstdc++-prettyprinters/debug.cc print deqiter
FAIL: libstdc++-prettyprinters/debug.cc print lstiter
FAIL: libstdc++-prettyprinters/debug.cc print lstciter
FAIL: libstdc++-prettyprinters/debug.cc print mpiter
FAIL: libstdc++-prettyprinters/debug.cc print spciter



Re: [libstdc++] Refactor python/hook.in

2014-09-30 Thread Jonathan Wakely

On 29/09/14 14:11 +0100, Jonathan Wakely wrote:

On 29/09/14 06:02 -0700, Siva Chandra wrote:

The attached patch refactors python/hook.in so that there are no
individual function calls to load pretty printers and xmethods. This
was suggested by Tom here:
https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02589.html. He indicates
that it is better to put as little as possible in the hook file. The
attached patch removes all code which explicitly loads the hooks from
hook.in.


This looks good to me, thanks.

I'll commit it later this week unless I hear objections from Tom.


Committed to trunk - thanks for the patch.


Re: [PATCH v2] Fix signed integer overflow in gcc/data-streamer.c

2014-09-30 Thread Diego Novillo
On Tue, Sep 30, 2014 at 3:09 AM, Markus Trippelsdorf
 wrote:
> On 2014.09.28 at 14:57 +0200, Markus Trippelsdorf wrote:
>> On 2014.09.28 at 14:36 +0200, Steven Bosscher wrote:
>> >
>> > Can you use HOST_WIDE_INT_1U for this?
>>
>> Sure. Thanks for the suggestion.
>> (Fix now resembles similar idiom in data-streamer-in.c)
>
> I checked in the fix as obvious.

Sorry for the delay. Yes, the fix is obvious. Thanks.
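
For readers who have not seen the idiom: the general shape of such a fix is to start the shift from an unsigned 1 of the wide type, so that reaching the top bit is well defined. The sketch below is not the expression from data-streamer.c; WIDE_INT_1U is a hypothetical stand-in for HOST_WIDE_INT_1U, assumed here to be 64 bits wide, and bit_at is an invented helper.

```c
#include <stdint.h>

/* Hypothetical stand-in for HOST_WIDE_INT_1U: the constant 1 with an
   unsigned 64-bit type.  */
#define WIDE_INT_1U ((uint64_t) 1)

/* Return a value with only bit POS set, for 0 <= POS <= 63.
   Writing (int64_t) 1 << 63 instead would overflow the signed type,
   which is undefined behaviour; the unsigned shift is well defined.  */
uint64_t
bit_at (unsigned int pos)
{
  return WIDE_INT_1U << pos;
}
```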


Re: [C++ Patch PING] Re: [PATCH] make excessive template instantiation depth a fatal error

2014-09-30 Thread Paolo Carlini

Hi,

On 09/30/2014 04:51 PM, Manuel López-Ibáñez wrote:

I don't want to cause you more work Paolo, but perhaps this should be
documented in https://gcc.gnu.org/gcc-5/changes.html. ?

Something like:

* Excessive template instantiation depth is now a fatal error. This
prevents excessive diagnostics that usually do not help to identify
the problem.

Thanks for taking care of this!
You are welcome. No problem about the changes.html bits, I'll take care 
of that too.


Paolo.


[PATCH, committed] PR 63410: Fix missing plugin headers

2014-09-30 Thread David Malcolm
We install the header "pass_manager.h", but it can't be included by a
plugin, since it includes "pass-instances.def", and we don't currently
install that.

Similarly, the installed header pretty-print.h now uses
wide-int-print.h, but the latter isn't installed.

FWIW, both of these issues prevent building gcc-python-plugin.

Fixed by the attached patch. 

Bootstrapped on x86_64-unknown-linux (Fedora 20), verified that "make
install" installs the previously-missing files.

Committed to trunk as obvious, as r215727.

The missing pass-instances.def also affects the 4.9 branch (and is
currently blocking gcc-python-plugin for gcc 4.9); I'll fix it on that
branch after bootstrapping (without the wide-int part).

Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 215726)
+++ gcc/ChangeLog	(revision 215727)
@@ -1,3 +1,9 @@
+2014-09-30  David Malcolm  
+
+	PR plugins/63410
+	* Makefile.in (PRETTY_PRINT_H): Add wide-int-print.h.
+	(PLUGIN_HEADERS): Add pass-instances.def.
+
 2014-09-30  James Greenhalgh  
 
 	* config/aarch64/aarch64-simd-builtins.def (sqdmull_laneq): Expand
Index: gcc/Makefile.in
===
--- gcc/Makefile.in	(revision 215726)
+++ gcc/Makefile.in	(revision 215727)
@@ -916,7 +916,7 @@
 		$(BITMAP_H) sbitmap.h $(BASIC_BLOCK_H) $(GIMPLE_H) \
 		$(HASHTAB_H) $(CGRAPH_H) $(IPA_REFERENCE_H) \
 		tree-ssa-alias.h
-PRETTY_PRINT_H = pretty-print.h $(INPUT_H) $(OBSTACK_H)
+PRETTY_PRINT_H = pretty-print.h $(INPUT_H) $(OBSTACK_H) wide-int-print.h
 TREE_PRETTY_PRINT_H = tree-pretty-print.h $(PRETTY_PRINT_H)
 GIMPLE_PRETTY_PRINT_H = gimple-pretty-print.h $(TREE_PRETTY_PRINT_H)
 DIAGNOSTIC_CORE_H = diagnostic-core.h $(INPUT_H) bversion.h diagnostic.def
@@ -3148,7 +3148,7 @@
   tree-ssa-loop.h tree-ssa-loop-ivopts.h tree-ssa-loop-manip.h \
   tree-ssa-loop-niter.h tree-ssa-ter.h tree-ssa-threadedge.h \
   tree-ssa-threadupdate.h inchash.h wide-int.h signop.h hash-map.h \
-  hash-set.h
+  hash-set.h pass-instances.def
 
 # generate the 'build fragment' b-header-vars
 s-header-vars: Makefile


Re: [PATCH, rs6000] Generate LE code for vec_lvsl and vec_lvsr that is compatible with BE code

2014-09-30 Thread Segher Boessenkool
On Tue, Sep 30, 2014 at 10:24:23AM -0500, Bill Schmidt wrote:
> On Tue, 2014-09-30 at 09:50 -0500, Segher Boessenkool wrote:
> > On Mon, Sep 29, 2014 at 05:26:14PM -0500, Bill Schmidt wrote:
> > > The method used in this patch is to perform a byte-reversal of the
> > > result of the lvsl/lvsr.  This is accomplished by loading the vector
> > > char constant {0,1,...,15}, which will appear in the register from left
> > > to right as {15,...,1,0}.  A vperm instruction (which uses BE element
> > > ordering) is applied to the result of the lvsl/lvsr using the loaded
> > > constant as the permute control vector.
> > 
> > It would be nice if you could arrange the generated sequence such that
> > for the common case where the vec_lvsl feeds a vperm it is results in
> > just lvsr;vnot machine instructions.  Not so easy to do though :-(
> 
> Yes -- as you note, that only works when feeding a vperm, which is what
> we expect but generally a lot of work to prove.

I meant generating a sequence that just "falls out" as you want it after
optimisation.  E.g. lvsr;vnot;vand(splat8(31));vperm can have the vand
absorbed by the vperm.  But that splat is nasty when not optimised away :-(

> Again, this is
> deprecated usage so it seems not worth spending the effort on this...

There is that yes :-)

> > i++ is the common style.
> 
> Now that we're being compiled as C++, ++i is the common style there --

The GCC source code didn't magically change to say "++i" everywhere it
said "i++" before, when we started compiling it with ++C :-P

> is there guidance about this for gcc style these days?

codingconventions.html doesn't say.

grep | wc in rs6000/ shows 317 vs. 86; so a lot of stuff has already
leaked in (and in gcc/*.c it is 6227 vs. 793).  Some days I think the
world has gone insane :-(

To me "++i" reads as "danger, pre-increment!"  Old habits I suppose.
I'll shut up now.


Segher


Re: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o compile, (internal compiler error)"

2014-09-30 Thread Joseph S. Myers
On Tue, 30 Sep 2014, Richard Earnshaw wrote:

> GCC is written in C++ these days, so technically, you need the C++
> standard :-)

And, while C++14 requires plain int bit-fields to be signed, GCC is 
written in C++98/C++03.
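
One way to sidestep the question entirely, sketched here in C (where the signedness of a plain int bit-field is likewise left to the implementation), is to spell the signedness out. The struct and function names are hypothetical and unrelated to the failing test.

```c
/* Bit-fields with explicit signedness behave the same regardless of how
   the implementation treats a plain "int" bit-field.  */
struct packed_flags
{
  signed int delta : 3;    /* explicitly signed: holds -4..3 */
  unsigned int mode : 3;   /* explicitly unsigned: holds 0..7 */
};

/* Store V in the signed field and read it back.  */
int
roundtrip_delta (int v)
{
  struct packed_flags f = { 0, 0 };
  f.delta = v;
  return f.delta;
}

/* Store V in the unsigned field and read it back; values above 7 are
   reduced modulo 8.  */
unsigned int
roundtrip_mode (unsigned int v)
{
  struct packed_flags f = { 0, 0 };
  f.mode = v;
  return f.mode;
}
```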

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, rs6000] Generate LE code for vec_lvsl and vec_lvsr that is compatible with BE code

2014-09-30 Thread Bill Schmidt
On Tue, 2014-09-30 at 11:04 -0500, Segher Boessenkool wrote:
> On Tue, Sep 30, 2014 at 10:24:23AM -0500, Bill Schmidt wrote:
> > On Tue, 2014-09-30 at 09:50 -0500, Segher Boessenkool wrote:
> > > On Mon, Sep 29, 2014 at 05:26:14PM -0500, Bill Schmidt wrote:
> > > > The method used in this patch is to perform a byte-reversal of the
> > > > result of the lvsl/lvsr.  This is accomplished by loading the vector
> > > > char constant {0,1,...,15}, which will appear in the register from left
> > > > to right as {15,...,1,0}.  A vperm instruction (which uses BE element
> > > > ordering) is applied to the result of the lvsl/lvsr using the loaded
> > > > constant as the permute control vector.
> > > 
> > > It would be nice if you could arrange the generated sequence such that
> > > for the common case where the vec_lvsl feeds a vperm it is results in
> > > just lvsr;vnot machine instructions.  Not so easy to do though :-(
> > 
> > Yes -- as you note, that only works when feeding a vperm, which is what
> > we expect but generally a lot of work to prove.
> 
> I meant generating a sequence that just "falls out" as you want it after
> optimisation.  E.g. lvsr;vnot;vand(splat8(31));vperm can have the vand
> absorbed by the vperm.  But that splat is nasty when not optimised away :-(

Especially since splat8(31) requires vsub(splat8(15),splat8(-16))...

To get something that is correct with and without feeding a vperm and
actually performs well just ain't happening here...

> 
> > Again, this is
> > deprecated usage so it seems not worth spending the effort on this...
> 
> There is that yes :-)
> 
> > > i++ is the common style.
> > 
> > Now that we're being compiled as C++, ++i is the common style there --
> 
> The GCC source code didn't magically change to say "++i" everywhere it
> said "i++" before, when we started compiling it with ++C :-P
> 
> > is there guidance about this for gcc style these days?
> 
> codingconventions.html doesn't say.
> 
> grep | wc in rs6000/ shows 317 vs. 86; so a lot of stuff has already
> leaked in (and in gcc/*.c it is 6227 vs. 793).  Some days I think the
> world has gone insane :-(
> 
> To me "++i" reads as "danger, pre-increment!"  Old habits I suppose.
> I'll shut up now.

Heh.  I have to go back and forth between C and C++ a lot these days and
find it's best for my sanity to just stick with the preincrement form
now...

Thanks,
Bill

> 
> 
> Segher
> 




C++ PATCH to use CONVERT_EXPR for dummy objects

2014-09-30 Thread Jason Merrill
A recent question from richi about the C++ FE's use of NOP_EXPR and 
CONVERT_EXPR led me to switch these functions to use CONVERT_EXPR 
instead, since we don't want STRIP_NOPS to remove them.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit e1215a9814036a16bb97609e97c3402b9b78c84c
Author: Jason Merrill 
Date:   Mon Sep 29 10:50:01 2014 -0400

	* method.c (build_stub_object): Use CONVERT_EXPR.
	* tree.c (build_dummy_object): Likewise.
	(is_dummy_object): Adjust.

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index d0e0105..b427d65 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -852,7 +852,7 @@ build_stub_type (tree type, int quals, bool rvalue)
 static tree
 build_stub_object (tree reftype)
 {
-  tree stub = build1 (NOP_EXPR, reftype, integer_one_node);
+  tree stub = build1 (CONVERT_EXPR, reftype, integer_one_node);
   return convert_from_reference (stub);
 }
 
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index a7bb38b..2247eb5 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -2979,7 +2979,7 @@ member_p (const_tree decl)
 tree
 build_dummy_object (tree type)
 {
-  tree decl = build1 (NOP_EXPR, build_pointer_type (type), void_node);
+  tree decl = build1 (CONVERT_EXPR, build_pointer_type (type), void_node);
   return cp_build_indirect_ref (decl, RO_NULL, tf_warning_or_error);
 }
 
@@ -3028,7 +3028,7 @@ is_dummy_object (const_tree ob)
 {
   if (INDIRECT_REF_P (ob))
 ob = TREE_OPERAND (ob, 0);
-  return (TREE_CODE (ob) == NOP_EXPR
+  return (TREE_CODE (ob) == CONVERT_EXPR
 	  && TREE_OPERAND (ob, 0) == void_node);
 }
 


Re: Enable TBAA on anonymous types with LTO

2014-09-30 Thread Jan Hubicka
> On 09/29/2014 11:36 AM, Jan Hubicka wrote:
> >If C++ FE sets canonical type always to main variant, it should work.
> >Is it always the case?
> 
> No.  For a compound type like a pointer or function the canonical
> type strips all typedefs, but a main variant does not.
> 
> >>>   namespace {
> >>>   struct B {};
> >>>   }
> >>>   struct A
> >>>   {
> >>>   void t(B);
> >>>   void t2();
> >>>   };
> >
> >Yep, A seems to be not anonymous and mangled as A.  I think it is ODR 
> >violation
> >to declare such type in more than one compilation unit (and we will warn on
> >it). We can make it anonymous, but I think it is C++ FE to do so.
> 
> Yes, it's an ODR violation.  The FE currently warns about a field
> with internal type, and I suppose could warn about other members as
> well.

The testcase seems to get around without a warning for both G++ and clang
(at least without -Wall)
> 
> >I really think that anonymous types are meant to not be accessible from other
> >compilation unit and I do not see why other languages need different rule.
> 
> Agreed.
> 
> >This does not work for types build from ODR types that are not ODR 
> >themselves.
> 
> I'm not sure what you mean.  In C++ the only types not subject to
> the ODR are local to one translation unit, so merging isn't an
> issue.  Do you mean types from other languages?

Yes, Richard would like

namespace {
  struct A {int a;};
}

to be considered as aliasing with

  struct B {int b;};

in the other unit if that unit is built in C language (or any other than C++).

Honza
> Jason


Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer

2014-09-30 Thread Jeff Law

On 09/30/14 08:37, Jiong Wang wrote:


On 30/09/14 05:21, Jeff Law wrote:

On 09/29/14 13:24, Jiong Wang wrote:

I don't think so. From the x86-64 bootstrap, there is no regression
in the number of functions shrink-wrapped. Previously only a single
"mov dest, src" was handled, so disallowing USE/CLOBBER will not
remove any shrink-wrap opportunity that was allowed before.

This is the key, of course.  shrink-wrapping is very restrictive in its
ability to sink insns.  The only forms it'll currently shrink are simple
moves.  Arithmetic, logicals, etc are left alone.  Thus disallowing
USE/CLOBBER does not impact the x86 port in any significant way.


Yes, and we could get +1.22% more functions shrink-wrapped (2567 compared
with 2536) on the x86-64 bootstrap once we sink more insns than the simple
"mov dest, src". I remember a similar percentage on the glibc build.

On aarch64, the number of functions shrink-wrapped increased by 25% on some
programs. Maybe we could gain the same on other RISC backends.



I do agree with Richard that it would be useful to see the insns that
are incorrectly sunk and the surrounding context.
So I must be missing something.  I thought the shrink-wrapping code 
wouldn't sink arithmetic/logical insns like we see with insn 14 and insn 
182.  I thought it was limited to reg-reg copies and constant 
initializations.


Jeff




Re: [PATCH] Fix PR preprocessor/58893 access to uninitialized memory

2014-09-30 Thread Jeff Law

On 09/30/14 03:01, Bernd Edlinger wrote:

Sigh. Yea, I guess if we're hitting the allocator insanely hard,
scrubbing memory might turn out to slow things down in a significant
way. Or it may simply be the case that we're using free'd memory in
some way and with the MALLOC_PERTURB changes we're in an infinite loop
in the dumping code or something similar.



Yeah, that is an interesting thing.
I debugged that, and it turns out, that this is just incredibly slow.
It seems to be in the macro expansion of this construct:

#define t16(x) x x x x x x x x x x x x x x x x
#define M (sizeof (t16(t16(t16(t16(t16(" ")) - 1)

libcpp is calling realloc 1.000.000 times for this, resizing
the memory by just one byte at a time. And the worst case of
realloc is O(n), so in the worst case realloc would have
to copy 1/2 * 1.000.000^2 bytes = 500 GB of memory.

With this little change in libcpp, the test suite passed, without any
further regressions:

--- libcpp/charset.c.jj	2014-08-19 07:34:31.0 +0200
+++ libcpp/charset.c	2014-09-30 10:45:26.676954120 +0200
@@ -537,6 +537,7 @@ convert_no_conversion (iconv_t cd ATTRIB
   if (to->len + flen > to->asize)
     {
       to->asize = to->len + flen;
+      to->asize *= 2;
       to->text = XRESIZEVEC (uchar, to->text, to->asize);
     }
   memcpy (to->text + to->len, from, flen);

I will prepare a patch for that later.

Thanks for digging into this.  We usually try to throttle this growth a 
little.  Something like this would be consistent with other cases in GCC:


to->asize += to->asize / 4;
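
To put rough numbers on the growth policies discussed above, here is a
minimal C sketch that only counts reallocations and worst-case bytes copied
when a buffer is extended one byte at a time. The names (simulate_appends,
growth_stats, GROW_EXACT, GROW_QUARTER) are hypothetical and this is not
libcpp code; it assumes the worst case where every realloc moves the whole
block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct growth_stats
{
  uint64_t reallocs;      /* number of times the buffer was resized */
  uint64_t bytes_copied;  /* worst case: each resize copies the old data */
};

enum growth_policy
{
  GROW_EXACT,    /* asize = len + flen                      */
  GROW_QUARTER   /* asize = len + flen; asize += asize / 4  */
};

/* Append N bytes one at a time and record the resize cost.  */
static struct growth_stats
simulate_appends (size_t n, enum growth_policy policy)
{
  struct growth_stats st = { 0, 0 };
  size_t len = 0, asize = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (len + 1 > asize)
        {
          st.reallocs++;
          st.bytes_copied += len;
          asize = len + 1;
          if (policy == GROW_QUARTER)
            asize += asize / 4;
          assert (asize >= len + 1);
        }
      len++;
    }
  return st;
}
```

With n = 16^5 = 1048576 (the length of the string in the test case), the
exact-fit policy resizes on every append and copies n*(n-1)/2 bytes in this
model, which is the roughly 500 GB figure quoted above. The 25% policy needs
fewer than a hundred resizes for the same input.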




Interestingly, if I define MALLOC_CHECK_=3 _and_ MALLOC_PERTURB_
this test passes, even without the above change,
but the test case
   gfortran.dg/realloc_on_assign_5.f03 fails in this configuration,
which is a known bug: PR 47674. However it passes when only MALLOC_PERTURB_
is defined.

Weird...

Yea, but that's par for the course when dealing with memory errors.

Jeff



Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer

2014-09-30 Thread Jeff Law

On 09/30/14 08:15, Richard Earnshaw wrote:


I think part of the problem is in the naming of single_set().  From the
name it's not entirely obvious to users that this includes insns that
clobber registers or which write other registers that are unused after
that point.  I've previously had to fix a bug where this assumption was
made (eg https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54300)

Most uses of single_set prior to register allocation are probably safe;
but later uses are fraught with potential problems of this nature and
may well be bugs waiting to happen.
Very possibly.  There's a bit of a natural tension here in that often we 
don't much care about the additional CLOBBERS, but when we get it wrong, 
obviously it's bad.


I haven't done any research, but I suspect the change to ignore clobbers 
in single_set came in as part of exposing the CC register and avoiding 
regressions all over the place as a result.


I wonder what would happen if we ignored prior to register allocation, 
then rejected insns with those CLOBBERs once register allocation started.


Jeff


[PATCH X86, PR62128] Rotate pattern for AVX2

2014-09-30 Thread Evgeny Stupachenko
Hi,

Patch resubmitted from https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01400.html

The patch fixes PR62128 and "gcc.target/i386/pr52252-atom.c" in
core-avx2 make check.
The test in pr62128 is exactly TEST 22 from
gcc.dg/torture/vshuf-v32qi.c. It will check if the pattern is correct
or not.
The patch was developed similarly to define_insn_and_split
"*avx_vperm_broadcast_".
The patch passed x86 bootstrap and make check (+2 new passes for
-march=core-avx2).
Is it ok?

Evgeny

ChangeLog:

2014-09-30  Evgeny Stupachenko  

* config/i386/sse.md (avx2_palignrv4di): New.
* config/i386/sse.md (avx2_rotate_perm): New.


palignr_hsw_pattern.patch
Description: Binary data


Re: [PATCH] gcc.c-torture/ cleanup

2014-09-30 Thread Jeff Law

On 09/30/14 09:22, Marek Polacek wrote:

I did this as a part of preparing the testsuite to cope with the
(possible) gnu11 default.  But I think it's a reasonable cleanup
on its own.  With gnu11, we'd start to warn about defaulting to
int, missing function declarations, and functions without return
type.  I added -fgnu89-inline when a test relies on a gnu89 inline
semantics, and -std=gnu89 if a test relies on gnu89 standard.

I have patches that cover the rest of C testsuite, but let's do this
piecewise.

Tested on x86_64-linux: vanilla results == results with this patch ==
results with this patch and gnu11 as a default.

Does this approach make sense?

2014-09-30  Marek Polacek  

* gcc.c-torture/compile/2120-2.c: Use -fgnu89-inline.
* gcc.c-torture/compile/2009-1.c: Likewise.
* gcc.c-torture/compile/2009-2.c: Likewise.
* gcc.c-torture/compile/20021120-1.c: Likewise.
* gcc.c-torture/compile/20021120-2.c: Likewise.
* gcc.c-torture/compile/20050215-1.c: Likewise.
* gcc.c-torture/compile/20050215-2.c: Likewise.
* gcc.c-torture/compile/20050215-3.c: Likewise.
* gcc.c-torture/compile/pr37669.c: Likewise.
* gcc.c-torture/execute/20020107-1.c: Likewise.
* gcc.c-torture/execute/restrict-1.c: Likewise.
* gcc.c-torture/compile/20090721-1.c: Fix defaulting to int.
* gcc.c-torture/execute/930529-1.c: Likewise.
* gcc.c-torture/execute/920612-1.c: Likewise.
* gcc.c-torture/execute/920711-1.c: Likewise.
* gcc.c-torture/execute/990127-2.c: Likewise.
* gcc.c-torture/execute/pr40386.c: Likewise.
* gcc.c-torture/execute/pr57124.c: Likewise.
* gcc.c-torture/compile/pr34808.c: Add function declarations.
* gcc.c-torture/compile/pr42299.c: Likewise.
* gcc.c-torture/compile/pr48517.c: Use -std=gnu89.
* gcc.c-torture/compile/simd-6.c: Likewise.
* gcc.c-torture/execute/pr53645-2.c: Likewise.
* gcc.c-torture/execute/pr53645.c: Likewise.
* gcc.c-torture/execute/20001121-1.c: Use -fgnu89-inline.  Add function
declarations.
* gcc.c-torture/execute/980608-1.c: Likewise.
* gcc.c-torture/execute/bcp-1.c: Likewise.
* gcc.c-torture/execute/p18298.c: Likewise.
* gcc.c-torture/execute/unroll-1.c: Likewise.
* gcc.c-torture/execute/va-arg-7.c: Likewise.
* gcc.c-torture/execute/va-arg-8.c: Likewise.
* gcc.c-torture/execute/930526-1.c: Use -fgnu89-inline.  Add function
declarations.  Fix defaulting to int.
* gcc.c-torture/execute/961223-1.c: Likewise.
* gcc.c-torture/execute/loop-2c.c: Use -fgnu89-inline and
-Wno-pointer-to-int-cast.  Fix defaulting to int.

OK.
Jeff



Re: [ping*2] define CROSS = @CROSS@ in gcc/Makefile.in

2014-09-30 Thread Jeff Law

On 09/30/14 04:48, Olivier Hainque wrote:

Hello,

ping on https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00056.html

Thanks in advance,

With Kind Regards,

Olivier

On Sep 1, 2014, at 17:26 , Olivier Hainque  wrote:


Hello,

This patch is necessary for proper operation of a piece
of the Ada Makefile fragment which tests the value of $(CROSS).

@ substitutions aren't performed for the language specific
Makefile fragments so using @CROSS directly isn't an option
there.

We have been using this for years and multiple targets in our
local trees. Bootstrapped & reg-tested on x86_64-linux.

OK to commit ?

Thanks in advance for your feedback,

Olivier

2014-09-01  Olivier Hainque  

* Makefile.in (CROSS): Define, to @CROSS@.

OK.
Jeff



Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer

2014-09-30 Thread Richard Earnshaw
On 30/09/14 17:45, Jeff Law wrote:
> On 09/30/14 08:15, Richard Earnshaw wrote:
>>
>> I think part of the problem is in the naming of single_set().  From the
>> name it's not entirely obvious to users that this includes insns that
>> clobber registers or which write other registers that are unused after
>> that point.  I've previously had to fix a bug where this assumption was
>> made (eg https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54300)
>>
>> Most uses of single_set prior to register allocation are probably safe;
>> but later uses are fraught with potential problems of this nature and
>> may well be bugs waiting to happen.
> Very possibly.  There's a bit of a natural tension here in that often we 
> don't much care about the additional CLOBBERS, but when we get it wrong, 
> obviously it's bad.
> 
> I haven't done any research, but I suspect the change to ignore clobbers 
> in single_set came in as part of exposing the CC register and avoiding 
> regressions all over the place as a result.

It's not just clobbers; it ignores patterns like

(parallel
 [(set (a) (...))
  (set (b) (...))])
[(reg_note (reg_unused (b)))]

Which is probably fine before register allocation but definitely
something you have to think about afterwards.
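
To make the distinction concrete, here is a toy C model, not GCC's actual
RTL representation: an insn is a flat list of SET/CLOBBER/USE elements, and
a flag stands in for a REG_UNUSED note. All toy_* names are hypothetical.
The permissive function mimics the documented single_set behaviour in a
simplified way (it ignores CLOBBERs, USEs and SETs whose result is unused);
the strict one only accepts an insn that is exactly one SET.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_SET, TOY_CLOBBER, TOY_USE };

struct toy_elt
{
  enum toy_kind kind;
  int dest;          /* register number written, clobbered or used */
  bool dest_unused;  /* stands in for a REG_UNUSED note on DEST */
};

struct toy_insn
{
  const struct toy_elt *elts;
  size_t n_elts;
};

/* Permissive: skip CLOBBERs, USEs and dead SETs; return the sole live
   SET, or NULL if there is none or more than one.  */
static const struct toy_elt *
toy_single_set (const struct toy_insn *insn)
{
  const struct toy_elt *found = NULL;

  assert (insn != NULL);
  for (size_t i = 0; i < insn->n_elts; i++)
    {
      const struct toy_elt *e = &insn->elts[i];
      if (e->kind != TOY_SET || e->dest_unused)
        continue;
      if (found != NULL)
        return NULL;  /* two live SETs */
      found = e;
    }
  return found;
}

/* Strict: the insn must consist of exactly one SET and nothing else.  */
static const struct toy_elt *
toy_strict_single_set (const struct toy_insn *insn)
{
  assert (insn != NULL);
  if (insn->n_elts == 1 && insn->elts[0].kind == TOY_SET)
    return &insn->elts[0];
  return NULL;
}
```

For the parallel shown above (a live set of a plus a dead set of b), the
permissive query returns the set of a while the strict one returns NULL. A
pass that moves insns after register allocation would have to account for
the write to b even though the permissive answer hides it.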

> 
> I wonder what would happen if we ignored prior to register allocation, 
> then rejected insns with those CLOBBERs once register allocation started.
> 

Might work; but it might miss some useful cases as well...

R.




Re: [PATCH X86, PR62128] Rotate pattern for AVX2

2014-09-30 Thread H.J. Lu
On Tue, Sep 30, 2014 at 9:47 AM, Evgeny Stupachenko  wrote:
> Hi,
>
> Patch resubmitted from 
> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01400.html
>
> The patch fix PR62128 and  "gcc.target/i386/pr52252-atom.c" in
> core-avx2 make check.
> The test in pr62128 is exactly TEST 22 from
> gcc.dg/torture/vshuf-v32qi.c. It will check if the pattern is correct
> or not.
> The patch developed similar to define_insn_and_split
> "*avx_vperm_broadcast_".
> The patch passed x86 bootstrap and make check (+2 new passes for
> -march=core-avx2).
> Is it ok?
>
> Evgeny
>
> ChangeLog:
>
> 2014-09-30  Evgeny Stupachenko  
>
> * config/i386/sse.md (avx2_palignrv4di): New.
> * config/i386/sse.md (avx2_rotate_perm): New.

Please mention PR target/62128 in ChangeLog and
add 2 testcases to gcc.target/i386 such that they fail
without this patch using the default GCC configuration.

Thanks.

-- 
H.J.


Re: [PATCH X86, PR62128] Rotate pattern for AVX2

2014-09-30 Thread Uros Bizjak
On Tue, Sep 30, 2014 at 6:47 PM, Evgeny Stupachenko  wrote:

> Patch resubmitted from 
> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01400.html
>
> The patch fix PR62128 and  "gcc.target/i386/pr52252-atom.c" in
> core-avx2 make check.
> The test in pr62128 is exactly TEST 22 from
> gcc.dg/torture/vshuf-v32qi.c. It will check if the pattern is correct
> or not.
> The patch developed similar to define_insn_and_split
> "*avx_vperm_broadcast_".
> The patch passed x86 bootstrap and make check (+2 new passes for
> -march=core-avx2).
> Is it ok?
>
> Evgeny
>
> ChangeLog:
>
> 2014-09-30  Evgeny Stupachenko  
>
> * config/i386/sse.md (avx2_palignrv4di): New.
> * config/i386/sse.md (avx2_rotate_perm): New.

+(define_insn "avx2_palignrv4di"
+  [(set (match_operand:V4DI 0 "register_operand" "=x")
+ (unspec:V4DI
+  [(match_operand:V4DI 1 "register_operand" "x")
+   (match_operand:V4DI 2 "nonimmediate_operand" "xm")
+   (match_operand:SI 3 "const_0_to_255_operand" "n")]
+  UNSPEC_VPALIGNRDI))]
+  "TARGET_AVX2"
+  "vpalignr\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "OI")])

Just reuse UNSPEC_PALIGNR, no need for a new unspec.

+(define_insn_and_split "avx2_rotate_perm"
+  [(set (match_operand:V_256 0 "register_operand" "=&x")
+  (vec_select:V_256
+ (match_operand:V_256 1 "register_operand" "x")
+ (match_parallel 2 "palignr_operand"
+  [(match_operand 3 "const_int_operand" "n")])))]
+  "TARGET_AVX2"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]

This should be a define_expand. There is nothing that requires hard
registers. You can achieve mode-changes by using gen_lowpart, see many
examples in sse.md

+  if (shift < 16)
+ emit_insn (gen_avx2_palignrv4di (op0,
+ op0,
+ op1,
+ GEN_INT (shift)));
+  else if (shift > 16)
+ emit_insn (gen_avx2_palignrv4di (op0,
+ op1,
+ op0,
+ GEN_INT (shift - 16)));

What happens when shift == 16?

Uros.


Re: [PATCH X86, PR62128] Rotate pattern for AVX2

2014-09-30 Thread Jakub Jelinek
On Tue, Sep 30, 2014 at 10:03:05AM -0700, H.J. Lu wrote:
> On Tue, Sep 30, 2014 at 9:47 AM, Evgeny Stupachenko  
> wrote:
> > Hi,
> >
> > Patch resubmitted from 
> > https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01400.html
> >
> > The patch fix PR62128 and  "gcc.target/i386/pr52252-atom.c" in
> > core-avx2 make check.
> > The test in pr62128 is exactly TEST 22 from
> > gcc.dg/torture/vshuf-v32qi.c. It will check if the pattern is correct
> > or not.
> > The patch developed similar to define_insn_and_split
> > "*avx_vperm_broadcast_".
> > The patch passed x86 bootstrap and make check (+2 new passes for
> > -march=core-avx2).
> > Is it ok?
> >
> > Evgeny
> >
> > ChangeLog:
> >
> > 2014-09-30  Evgeny Stupachenko  
> >
> > * config/i386/sse.md (avx2_palignrv4di): New.
> > * config/i386/sse.md (avx2_rotate_perm): New.
> 
> Please mention PR target/62128 in ChangeLog and
> add 2 testcases to gcc.target/i386 such that they fail
> without this patch using the default GCC configuration.

Also, just use
(avx2_rotate_perm): New.
for the 4th ChangeLog entry line, no point duplicating sse.md...

Jakub


C++ PATCHes to add __is_trivially_*

2014-09-30 Thread Jason Merrill
Ville asked for help with the necessary compiler intrinsics for the 
is_trivially_* C++11 library traits.


The first patch cleans up a few oddities I noticed with the existing 
intrinsics.  __is_convertible_to was never implemented and isn't needed. 
 There's no need for a second grokdeclarator in trait parsing since 
cp_parser_type_id already does a grokdeclarator.  And the assert at the 
top of finish_trait_expr is redundant with the gcc_unreachable in the 
switch.


The second patch adds __is_trivially_copyable, which just uses the 
existing trivially_copyable_p predicate in the compiler.


The third patch adds __is_trivially_assignable and 
__is_trivially_constructible, which work by building up an expression 
representing assignment or object declaration and then scanning it for 
calls to functions other than trivial special member functions.  Note 
that there are still bugs in trivial_fn_p that are exposed by this 
intrinsic.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 7a3d9e80fb97691115e574915fd632f85c0974b7
Author: Jason Merrill 
Date:   Thu Sep 25 12:34:43 2014 -0400

c-family/
	* c-common.h (enum rid): Remove RID_IS_CONVERTIBLE_TO.
	* c-common.c (c_common_reswords): Remove __is_convertible_to.
cp/
	* cp-tree.h (cp_trait_kind): Remove CPTK_IS_CONVERTIBLE_TO.
	* cxx-pretty-print.c (pp_cxx_trait_expression): Likewise.
	* semantics.c (trait_expr_value): Likewise.
	(finish_trait_expr): Likewise.
	* parser.c (cp_parser_primary_expression): Likewise.
	(cp_parser_trait_expr): Likewise. Remove redundant grokdeclarator.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index a9e0191..0324a0a 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -472,7 +472,6 @@ const struct c_common_resword c_common_reswords[] =
   { "__is_abstract",	RID_IS_ABSTRACT, D_CXXONLY },
   { "__is_base_of",	RID_IS_BASE_OF, D_CXXONLY },
   { "__is_class",	RID_IS_CLASS,	D_CXXONLY },
-  { "__is_convertible_to", RID_IS_CONVERTIBLE_TO, D_CXXONLY },
   { "__is_empty",	RID_IS_EMPTY,	D_CXXONLY },
   { "__is_enum",	RID_IS_ENUM,	D_CXXONLY },
   { "__is_final",	RID_IS_FINAL,	D_CXXONLY },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 5ec79a0..5ba7859 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -138,7 +138,7 @@ enum rid
   RID_HAS_TRIVIAL_CONSTRUCTOR, RID_HAS_TRIVIAL_COPY,
   RID_HAS_TRIVIAL_DESTRUCTOR,  RID_HAS_VIRTUAL_DESTRUCTOR,
   RID_IS_ABSTRACT, RID_IS_BASE_OF,
-  RID_IS_CLASS,RID_IS_CONVERTIBLE_TO,
+  RID_IS_CLASS,
   RID_IS_EMPTY,RID_IS_ENUM,
   RID_IS_FINAL,RID_IS_LITERAL_TYPE,
   RID_IS_POD,  RID_IS_POLYMORPHIC,
diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
index b4a72d6..e6e90f7 100644
--- a/gcc/cp/cp-tree.def
+++ b/gcc/cp/cp-tree.def
@@ -354,9 +354,9 @@ DEFTREECODE (STMT_EXPR, "stmt_expr", tcc_expression, 1)
is applied.  */
 DEFTREECODE (UNARY_PLUS_EXPR, "unary_plus_expr", tcc_unary, 1)
 
-/** C++0x extensions. */
+/** C++11 extensions. */
 
-/* A static assertion.  This is a C++0x extension.
+/* A static assertion.  This is a C++11 extension.
STATIC_ASSERT_CONDITION contains the condition that is being
checked.  STATIC_ASSERT_MESSAGE contains the message (a string
literal) to be displayed if the condition fails to hold.  */
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 5d8badc..0bb6ef9 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -645,7 +645,6 @@ typedef enum cp_trait_kind
   CPTK_IS_ABSTRACT,
   CPTK_IS_BASE_OF,
   CPTK_IS_CLASS,
-  CPTK_IS_CONVERTIBLE_TO,
   CPTK_IS_EMPTY,
   CPTK_IS_ENUM,
   CPTK_IS_FINAL,
diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c
index f5f91c8..f0734ec 100644
--- a/gcc/cp/cxx-pretty-print.c
+++ b/gcc/cp/cxx-pretty-print.c
@@ -388,7 +388,6 @@ pp_cxx_userdef_literal (cxx_pretty_printer *pp, tree t)
  __is_abstract ( type-id )
  __is_base_of ( type-id , type-id )
  __is_class ( type-id )
- __is_convertible_to ( type-id , type-id ) 
  __is_empty ( type-id )
  __is_enum ( type-id )
  __is_literal_type ( type-id )
@@ -2373,9 +2372,6 @@ pp_cxx_trait_expression (cxx_pretty_printer *pp, tree t)
 case CPTK_IS_CLASS:
   pp_cxx_ws_string (pp, "__is_class");
   break;
-case CPTK_IS_CONVERTIBLE_TO:
-  pp_cxx_ws_string (pp, "__is_convertible_to");
-  break;
 case CPTK_IS_EMPTY:
   pp_cxx_ws_string (pp, "__is_empty");
   break;
@@ -2411,7 +2407,7 @@ pp_cxx_trait_expression (cxx_pretty_printer *pp, tree t)
   pp_cxx_left_paren (pp);
   pp->type_id (TRAIT_EXPR_TYPE1 (t));
 
-  if (kind == CPTK_IS_BASE_OF || kind == CPTK_IS_CONVERTIBLE_TO)
+  if (kind == CPTK_IS_BASE_OF)
 {
   pp_cxx_separate_with (pp, ',');
   pp->type_id (TRAIT_EXPR_TYPE2 (t));
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 4563145..63cc0d3 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -4134,7 +4

Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-30 Thread Alexey Samsonov
On Tue, Sep 30, 2014 at 12:07 AM, Yury Gribov  wrote:
> On 09/30/2014 09:40 AM, Jakub Jelinek wrote:
>>
>> On Mon, Sep 29, 2014 at 05:24:02PM -0700, Konstantin Serebryany wrote:

>>>> I don't think we ever going to support recovery for regular ASan
>>>> (Kostya, correct me if I'm wrong).
>>>
>>>
>>> I hope so too.
>>> Another point is that with asan-instrumentation-with-call-threshold=0
>>> (instrumentation with callbacks)
>>
>>
>> The normal (non-recovery) callbacks are __attribute__((noreturn)) for
>> performance reasons, and you do need different callbacks and different
>> generated code if you want to recover (after the callback you need jump
>> back to a basic block after the conditional jump).
>> So, in that case you would need -fsanitize-recover=address.
>>
>>>> I see no problem in enabling -fsanitize-recover by default for
>>>> -fsanitize=undefined and
>>>
>>>
>>> This becomes more interesting when we use asan and ubsan together.
>>
>>
>> That is fairly common case.
>
>
> I think we can summarize:
> * the current option -fsanitize-recover is misleading; it's really
> -fubsan-recover
> * we need a way to selectively enable/disable recovery for different
> sanitizers
>
> The most prominent solution seems to be
> * allow -fsanitize-recover=tgt1,tgt2 syntax
> * -fsanitize-recover without options would still mean UBSan recovery
>
> The question is what to do with -fno-sanitize-recover then.

We can make -f(no-)?sanitize-recover= flags accept the same values as
-f(no-)?sanitize= flags. In this case,

"-fsanitize-recover" will be a deprecated alias of
"-fsanitize-recover=undefined", and
"-fno-sanitize-recover" will be a deprecated alias of
"-fno-sanitize-recover=undefined".
If a user provides "-fsanitize-recover=address", we can instruct the
instrumentation pass to
use recoverable instrumentation.

>
> -Y
>



-- 
Alexey Samsonov, Mountain View, CA


Re: C++ PATCHes to add __is_trivially_*

2014-09-30 Thread Paolo Carlini

Hi,

On 09/30/2014 07:13 PM, Jason Merrill wrote:
Ville asked for help with the necessary compiler intrinsics for the 
is_trivially_* C++11 library traits.


The first patch cleans up a few oddities I noticed with the existing 
intrinsics.  __is_convertible_to was never implemented and isn't 
needed.  There's no need for a second grokdeclarator in trait parsing 
since cp_parser_type_id already does a grokdeclarator.  And the assert 
at the top of finish_trait_expr is redundant with the gcc_unreachable 
in the switch.


The second patch adds __is_trivially_copyable, which just uses the 
existing trivially_copyable_p predicate in the compiler.


The third patch adds __is_trivially_assignable and 
__is_trivially_constructible, which work by building up an expression 
representing assignment or object declaration and then scanning it for 
calls to functions other than trivial special member functions.  Note 
that there are still bugs in trivial_fn_p that are exposed by this 
intrinsic.

Great. I think this can be as well marked as PR c++/26099.

By the way, if I remember correctly, the idea of having 
__is_convertible_to leading to unimplemented instead of simply being not 
recognized, goes back to this kind of idea:


http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2518.html

and Intel too was in favor of somewhat standardizing those intrinsics. 
In fact, both current icc and clang++ accept and implement 
__is_convertible_to.


Paolo.


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-30 Thread Jakub Jelinek
On Tue, Sep 30, 2014 at 10:26:39AM -0700, Alexey Samsonov wrote:
> > I think we can summarize:
> > * the current option -fsanitize-recover is misleading; it's really
> > -fubsan-recover
> > * we need a way to selectively enable/disable recovery for different
> > sanitizers
> >
> > The most prominent solution seems to be
> > * allow -fsanitize-recover=tgt1,tgt2 syntax
> > * -fsanitize-recover without options would still mean UBSan recovery
> >
> > The question is what to do with -fno-sanitize-recover then.
> 
> We can make -f(no-)?sanitize-recover= flags accept the same values as
> -f(no-)?sanitize= flags. In this case,
> 
> "-fsanitize-recover" will be a deprecated alias of
> "-fsanitize-recover=undefined", and
> "-fno-sanitize-recover" will be a deprecated alias of
> "-fno-sanitize-recover=undefined".
> If a user provides "-fsanitize-recover=address", we can instruct the
> instrumentation pass to
> use recoverable instrumentation.

Would we accept -fsanitize-recover=undefined 
-fno-sanitize-recover=signed-integer-overflow
as recovering everything but signed integer overflows, i.e. the decision
whether to recover a particular call would check similar bitmask as
is checked whether to sanitize something at all?

Jakub


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-30 Thread Alexey Samsonov
On Tue, Sep 30, 2014 at 10:33 AM, Jakub Jelinek  wrote:
> On Tue, Sep 30, 2014 at 10:26:39AM -0700, Alexey Samsonov wrote:
>> > I think we can summarize:
>> > * the current option -fsanitize-recover is misleading; it's really
>> > -fubsan-recover
>> > * we need a way to selectively enable/disable recovery for different
>> > sanitizers
>> >
>> > The most prominent solution seems to be
>> > * allow -fsanitize-recover=tgt1,tgt2 syntax
>> > * -fsanitize-recover without options would still mean UBSan recovery
>> >
>> > The question is what to do with -fno-sanitize-recover then.
>>
>> We can make -f(no-)?sanitize-recover= flags accept the same values as
>> -f(no-)?sanitize= flags. In this case,
>>
>> "-fsanitize-recover" will be a deprecated alias of
>> "-fsanitize-recover=undefined", and
>> "-fno-sanitize-recover" will be a deprecated alias of
>> "-fno-sanitize-recover=undefined".
>> If a user provides "-fsanitize-recover=address", we can instruct the
>> instrumentation pass to
>> use recoverable instrumentation.
>
> Would we accept -fsanitize-recover=undefined 
> -fno-sanitize-recover=signed-integer-overflow
> as recovering everything but signed integer overflows, i.e. the decision
> whether to recover a particular call would check similar bitmask as
> is checked whether to sanitize something at all?

Yes, the logic for creating a set of recoverable sanitizers should be
the same as the logic for creating a set of enabled sanitizers.
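
A minimal sketch, in C, of how such a recover mask could be accumulated from
the command line, assuming flags are processed left to right. The SAN_* bit
assignments, the three-member "undefined" group and the function names are
made up for illustration; this is not the option-handling code of GCC or
clang.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum
{
  SAN_ADDRESS = 1 << 0,
  SAN_SHIFT = 1 << 1,
  SAN_SIGNED_INTEGER_OVERFLOW = 1 << 2,
  SAN_NULL = 1 << 3,
  /* Hypothetical group: a real "undefined" covers more checks.  */
  SAN_UNDEFINED = SAN_SHIFT | SAN_SIGNED_INTEGER_OVERFLOW | SAN_NULL
};

static const struct
{
  const char *name;
  unsigned bits;
} san_names[] = {
  { "address", SAN_ADDRESS },
  { "shift", SAN_SHIFT },
  { "signed-integer-overflow", SAN_SIGNED_INTEGER_OVERFLOW },
  { "null", SAN_NULL },
  { "undefined", SAN_UNDEFINED },
};

/* Map one name of length N to its bits; unknown names map to 0.  */
static unsigned
lookup_name (const char *s, size_t n)
{
  for (size_t i = 0; i < sizeof san_names / sizeof san_names[0]; i++)
    if (strlen (san_names[i].name) == n
        && memcmp (san_names[i].name, s, n) == 0)
      return san_names[i].bits;
  return 0;
}

/* Parse a comma-separated list such as "address,null".  */
static unsigned
parse_list (const char *list)
{
  unsigned bits = 0;
  while (*list != '\0')
    {
      size_t n = strcspn (list, ",");
      bits |= lookup_name (list, n);
      list += n;
      if (*list == ',')
        list++;
    }
  return bits;
}

/* Apply one -f[no-]sanitize-recover[=list] flag to MASK.  */
static unsigned
apply_recover_flag (unsigned mask, const char *arg)
{
  static const char on[] = "-fsanitize-recover";
  static const char off[] = "-fno-sanitize-recover";
  const char *rest;
  unsigned bits;
  int enable;

  assert (arg != NULL);
  if (strncmp (arg, on, sizeof on - 1) == 0)
    {
      enable = 1;
      rest = arg + sizeof on - 1;
    }
  else if (strncmp (arg, off, sizeof off - 1) == 0)
    {
      enable = 0;
      rest = arg + sizeof off - 1;
    }
  else
    return mask;

  if (*rest == '\0')
    bits = SAN_UNDEFINED;  /* bare form: alias for "=undefined" */
  else if (*rest == '=')
    bits = parse_list (rest + 1);
  else
    return mask;

  return enable ? (mask | bits) : (mask & ~bits);
}
```

Feeding "-fsanitize-recover=undefined" and then
"-fno-sanitize-recover=signed-integer-overflow" through apply_recover_flag
leaves every bit of the group set except the signed-integer-overflow one,
which matches the behaviour asked about above. The instrumentation pass
would then test the relevant bit per check, the same way it decides whether
to sanitize at all.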

-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-30 Thread Jakub Jelinek
On Tue, Sep 30, 2014 at 10:36:34AM -0700, Alexey Samsonov wrote:
> > Would we accept -fsanitize-recover=undefined 
> > -fno-sanitize-recover=signed-integer-overflow
> > as recovering everything but signed integer overflows, i.e. the decision
> > whether to recover a particular call would check similar bitmask as
> > is checked whether to sanitize something at all?
> 
> Yes, the logic for creating a set of recoverable sanitizers should be
> the same as the logic for creating a set of enabled sanitizers.

LGTM, will hack it up soon in GCC then.

Jakub


Re: [Bug libstdc++/62313] Data race in debug iterators

2014-09-30 Thread François Dumont

I forgot to check pretty printer tests indeed, I will.

François

On 30/09/2014 17:32, Jonathan Wakely wrote:

On 26/09/14 11:05 +0100, Jonathan Wakely wrote:

On 26/09/14 00:00 +0200, François Dumont wrote:



Apart from those minor adjustments I think this looks good, but I'd
like to know that it has been tested with -fsanitize=thread, even if
only lightly tested.




Hi

  Dmitry, who reported the bug, confirmed the fix. Can I go ahead 
and commit ?


Yes, OK.


This caused some failures in the printer tests:

Running
/home/jwakely/src/gcc/gcc/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp 
...

FAIL: libstdc++-prettyprinters/debug.cc print deqiter
FAIL: libstdc++-prettyprinters/debug.cc print lstiter
FAIL: libstdc++-prettyprinters/debug.cc print lstciter
FAIL: libstdc++-prettyprinters/debug.cc print mpiter
FAIL: libstdc++-prettyprinters/debug.cc print spciter






Re: [PATCH X86, PR62128] Rotate pattern for AVX2

2014-09-30 Thread Uros Bizjak
On Tue, Sep 30, 2014 at 7:06 PM, Uros Bizjak  wrote:
> On Tue, Sep 30, 2014 at 6:47 PM, Evgeny Stupachenko  
> wrote:
>
>> Patch resubmitted from 
>> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01400.html
>>
>> The patch fix PR62128 and  "gcc.target/i386/pr52252-atom.c" in
>> core-avx2 make check.
>> The test in pr62128 is exactly TEST 22 from
>> gcc.dg/torture/vshuf-v32qi.c. It will check if the pattern is correct
>> or not.
>> The patch developed similar to define_insn_and_split
>> "*avx_vperm_broadcast_".
>> The patch passed x86 bootstrap and make check (+2 new passes for
>> -march=core-avx2).
>> Is it ok?

Please try following (totally untested) expander:

--cut here--
(define_expand "avx2_rotate_perm"
  [(set (match_operand:V_256 0 "register_operand")
        (vec_select:V_256
          (match_operand:V_256 1 "register_operand")
          (match_parallel 2 "palignr_operand"
            [(match_operand 3 "const_int_operand" "n")])))]
  "TARGET_AVX2"
{
  int shift = INTVAL (operands[3]) * ;
  rtx insn;

  rtx op0 = gen_lowpart (V4DImode, operands[0]);
  rtx op1 = gen_lowpart (V4DImode, operands[1]);

  emit_insn (gen_avx2_permv2ti (op0, op1, op1, GEN_INT (33)));

  op0 = gen_lowpart (V2TImode, operands[0]);
  op1 = gen_lowpart (V2TImode, operands[1]);

  if (shift < GET_MODE_SIZE (TImode))
    insn = gen_avx2_palignrv2ti (op0, op0, op1, GEN_INT (shift));
  else
    insn = gen_avx2_palignrv2ti (op0, op1, op0, GEN_INT (shift - 16));

  emit_insn (insn);
  DONE;
}
--cut here--

BTW: Looking at the code above, it looks to me that avx2_permv2ti
should accept V2TImode operands, not V4DImode.

Uros.


Re: [PATCH] Redesign jump threading profile updates

2014-09-30 Thread Teresa Johnson
On Mon, Sep 29, 2014 at 9:33 PM, Jeff Law  wrote:
> On 09/29/14 08:19, Teresa Johnson wrote:
>>>
>>>
>>> Just an update - I found some good test cases by compiling the
>>> c-torture tests with profile feedback with and without my patch. But
>>> in the cases I pulled out I saw that there were still a couple profile
>>> or probability insanities introduced by jump threading (albeit far
>>> less than before), so I wanted to investigate before I commit. I ran
>>> out of time this week and will not get to this until I get back from
>>> vacation the week after next.
>>
>>
>> Hi Jeff,
>>
>> I finally had a chance to get back to this and look at the remaining
>> insanities in the new test cases I created. It turns out that there
>> were still a few issues in the case where there were guessed
>> frequencies and no profile counts. The two test cases I created do use
>> FDO, and the insanities in the routines with profile counts went away
>> with my patch. But the outlined copies of routines that were also
>> inlined into the main routine still had estimated frequencies, and
>> these still had a few issues.
>>
>> The problem is that the profile updates are done incrementally as we
>> walk and update the paths in ssa_fix_duplicate_block_edges, including
>> the block and edge counts, the block frequencies and the
>> probabilities. This is very difficult to do when only operating on
>> frequencies since the edge frequencies are derived from the source
>> block frequency and the probability. Therefore, once the source block
>> frequency is updated, the edge frequency is also affected, and it is
>> really difficult to figure out what the update to the edge frequency
>> (essentially the probability) is using the same incremental update
>> approach. I was attempting to handle this with the routine
>> deduce_freq, for example, but this turned out to have issues for
>> certain types of paths. I tried a few other approaches, but they start
>> looking really ugly and I didn't want to add a parallel but different
>> algorithm in the case of no profile counts.
>>
>> So by far the simplest approach was simply to take a snapshot of the
>> existing block and edge frequencies along the path before we start the
>> updates in ssa_fix_duplicate_block_edges, by copying them into the
>> profile count fields of those blocks and edges. Then the existing
>> algorithm operates the same as when we do have counts, and can
>> essentially operate incrementally on the edge frequencies since they
>> live in the count field of the edge and are no longer affected anytime
>> the source block is updated. Since the algorithm does update block
>> frequencies and probabilities as well (based on the count updates
>> performed), we can simply clear out these fake count fields at the end
>> of ssa_fix_duplicate_block_edges. This takes care of the remaining
>> insanities introduced by jump threading from these test cases. During
>> testing I also added in some checking to ensure that the count fields
>> for the whole routine were cleared properly to make sure the new
>> clear_counts_path was not missing anything (checking is a little too
>> heavyweight to add in normally).
>>
>> New patch below (also attached since my mailer sometimes eats spaces).
>> The differences between the old patch and the new one:
>> - removed deduce_freq (which was my least favorite part of the patch
>> anyway!), and its call from recompute_probabilities, since it is no
>> longer necessary.
>> - two new routines freqs_to_counts_path and clear_counts_path, invoked
>> from ssa_fix_duplicate_block_edges.
>> - two new tests
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?
>>
>> Thanks,
>> Teresa
>>
>> gcc:
>>
>> 2014-09-29  Teresa Johnson  
>>
>>  * tree-ssa-threadupdate.c (struct ssa_local_info_t): New
>>  duplicate_blocks bitmap.
>>  (remove_ctrl_stmt_and_useless_edges): Ditto.
>>  (create_block_for_threading): Ditto.
>>  (compute_path_counts): New function.
>>  (update_profile): Ditto.
>>  (recompute_probabilities): Ditto.
>>  (update_joiner_offpath_counts): Ditto.
>>  (freqs_to_counts_path): Ditto.
>>  (clear_counts_path): Ditto.
>>  (ssa_fix_duplicate_block_edges): Update profile info.
>>  (ssa_create_duplicates): Pass new parameter.
>>  (ssa_redirect_edges): Remove old profile update.
>>  (thread_block_1): New duplicate_blocks bitmap,
>>  remove old profile update.
>>  (thread_single_edge): Pass new parameter.
>>
>> gcc/testsuite:
>>
>> 2014-09-29  Teresa Johnson  
>>
>>  * testsuite/gcc.dg/tree-prof/20050826-2.c: New test.
>>  * testsuite/gcc.dg/tree-prof/cmpsf-1.c: Ditto.
>
> Given I'd already been through this pretty thoroughly, I just gave this a
> cursory review.
>
> clear_counts_path needs a function comment.  It's pretty obvious what it's
> doing, but for completeness let's go ahead and get the obvious comment
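
The snapshot idea described in the quoted message can be sketched in C as follows. This is only an illustration under stated assumptions: the struct layouts, PROB_BASE and the function names are hypothetical stand-ins loosely modelled on the description of freqs_to_counts_path and clear_counts_path, not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PROB_BASE 10000   /* assumed fixed-point scale for probabilities */

struct block
{
  int frequency;          /* guessed block frequency */
  long count;             /* profile count, 0 when there is no profile */
};

struct edge
{
  struct block *src;
  int probability;        /* scaled by PROB_BASE */
  long count;
};

/* An edge frequency is derived from the source block, so it changes
   as soon as src->frequency is updated.  */
long
edge_frequency (const struct edge *e)
{
  return (long) e->src->frequency * e->probability / PROB_BASE;
}

/* Copy the current frequencies along PATH into the count fields so
   that later incremental updates work on stable per-edge values.  */
void
snapshot_freqs_as_counts (struct edge **path, size_t n)
{
  assert (path != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      path[i]->src->count = path[i]->src->frequency;
      path[i]->count = edge_frequency (path[i]);
    }
}

/* Drop the fake counts again once the update is finished.  */
void
clear_fake_counts (struct edge **path, size_t n)
{
  assert (path != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      path[i]->src->count = 0;
      path[i]->count = 0;
    }
}
```

After the snapshot, changing a source block's frequency no longer disturbs the value stored in the edge's count field, which is what lets the count-based update algorithm run unchanged.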

Re: [PATCH X86, PR62128] Rotate pattern for AVX2

2014-09-30 Thread Uros Bizjak
On Tue, Sep 30, 2014 at 8:08 PM, Uros Bizjak  wrote:
> On Tue, Sep 30, 2014 at 7:06 PM, Uros Bizjak  wrote:
>> On Tue, Sep 30, 2014 at 6:47 PM, Evgeny Stupachenko  
>> wrote:
>>
>>> Patch resubmitted from 
>>> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01400.html
>>>
>>> The patch fixes PR62128 and "gcc.target/i386/pr52252-atom.c" in
>>> core-avx2 make check.
>>> The test in pr62128 is exactly TEST 22 from
>>> gcc.dg/torture/vshuf-v32qi.c. It will check if the pattern is correct
>>> or not.
>>> The patch was developed along the lines of define_insn_and_split
>>> "*avx_vperm_broadcast_".
>>> The patch passed x86 bootstrap and make check (+2 new passes for
>>> -march=core-avx2).
>>> Is it ok?
>
> Please try following (totally untested) expander:

As usual, the wrong version was pasted. This should read:

--cut here--
(define_expand "avx2_rotate_perm"
  [(set (match_operand:V_256 0 "register_operand")
        (vec_select:V_256
          (match_operand:V_256 1 "register_operand")
          (match_parallel 2 "palignr_operand"
            [(match_operand 3 "const_int_operand" "n")])))]
  "TARGET_AVX2"
{
  int shift = INTVAL (operands[3]) * ;
  rtx insn;

  rtx op1 = gen_lowpart (V4DImode, operands[1]);
  rtx t2 = gen_reg_rtx (V4DImode);

  emit_insn (gen_avx2_permv2ti (t2, op1, op1, GEN_INT (33)));

  rtx op0 = gen_lowpart (V2TImode, operands[0]);
  op1 = gen_lowpart (V2TImode, operands[1]);
  t2 = gen_lowpart (V2TImode, t2);

  if (shift < GET_MODE_SIZE (TImode))
    insn = gen_avx2_palignrv2ti (op0, t2, op1, GEN_INT (shift));
  else
    insn = gen_avx2_palignrv2ti (op0, op1, t2, GEN_INT (shift - 16));

  emit_insn (insn);
  DONE;
}
--cut here--

Uros.
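
For readers following the lane arithmetic, here is a scalar C model of the sequence the expander emits: a 128-bit lane swap followed by a per-lane byte alignment. It is a sketch under assumptions: the shift is taken in bytes, the function names are made up for illustration, and the two helpers model my understanding of vperm2i128 with immediate 33 (0x21, both sources equal) and of 256-bit vpalignr, not GCC code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 32, LANE_BYTES = 16 };

/* Model of vperm2i128 with immediate 33 and both sources equal:
   the two 128-bit lanes trade places.  */
static void
swap_lanes (uint8_t dst[VEC_BYTES], const uint8_t src[VEC_BYTES])
{
  memcpy (dst, src + LANE_BYTES, LANE_BYTES);
  memcpy (dst + LANE_BYTES, src, LANE_BYTES);
}

/* Model of 256-bit vpalignr: in each lane, concatenate HI:LO, shift
   right by IMM bytes and keep the low 16 bytes.  DST must not alias
   HI or LO.  */
static void
palignr_256 (uint8_t dst[VEC_BYTES], const uint8_t hi[VEC_BYTES],
             const uint8_t lo[VEC_BYTES], unsigned imm)
{
  assert (imm < LANE_BYTES);
  for (unsigned lane = 0; lane < VEC_BYTES; lane += LANE_BYTES)
    for (unsigned i = 0; i < LANE_BYTES; i++)
      {
        unsigned k = i + imm;
        dst[lane + i] = k < LANE_BYTES ? lo[lane + k]
                                       : hi[lane + k - LANE_BYTES];
      }
}

/* Full 32-byte rotate: dst[i] = src[(i + shift) % 32].  DST must not
   alias SRC.  */
void
rotate_bytes (uint8_t dst[VEC_BYTES], const uint8_t src[VEC_BYTES],
              unsigned shift)
{
  uint8_t swapped[VEC_BYTES];

  assert (shift < VEC_BYTES);
  swap_lanes (swapped, src);
  if (shift < LANE_BYTES)
    palignr_256 (dst, swapped, src, shift);
  else
    palignr_256 (dst, src, swapped, shift - LANE_BYTES);
}
```

In this model a shift of exactly 16 bytes takes the else branch with an alignment of 0, so the result is just the lane-swapped vector and the second step is a plain copy.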


Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer

2014-09-30 Thread Jiong Wang
2014-09-30 17:30 GMT+01:00 Jeff Law :
> On 09/30/14 08:37, Jiong Wang wrote:
>>
>>
>> On 30/09/14 05:21, Jeff Law wrote:
>>>
>>> On 09/29/14 13:24, Jiong Wang wrote:

>>>> I don't think so. From the x86-64 bootstrap, there is no regression
>>>> in the number of functions shrink-wrapped. Previously only a single
>>>> "mov dest, src" was handled, so disallowing USE/CLOBBER will not
>>>> remove any shrink-wrap opportunity that was allowed before.
>>>
>>> This is the key, of course.  shrink-wrapping is very restrictive in its
>>> ability to sink insns.  The only forms it'll currently shrink are simple
>>> moves.  Arithmetic, logicals, etc are left alone.  Thus disallowing
>>> USE/CLOBBER does not impact the x86 port in any significant way.
>>
>>
>> yes, and we could get +1.22% (2567 compared with 2536) functions
>> shrink-wrapped after sinking more insns than just simple
>> "mov dest, src" on the x86-64 bootstrap, and I remember a similar
>> percentage on the glibc build.
>>
>> while on aarch64, the overall number of functions shrink-wrapped
>> increased by 25% on some programs. maybe we could gain the same on
>> other RISC backends.
>>
>>>
>>> I do agree with Richard that it would be useful to see the insns that
>>> are incorrectly sunk and the surrounding context.
>
> So I must be missing something.  I thought the shrink-wrapping code wouldn't
> sink arithmetic/logical insns like we see with insn 14 and insn 182.  I
> thought it was limited to reg-reg copies and constant initializations.

Yes, it was limited to reg-reg copies, and my previous sink improvement
aimed to sink any rtx that

  A: is a single_set
  B: has a src operand that is any combination of no more than one
     register and no non-constant objects.

Some operators, like shift, may have side effects. IMHO, all side
effects are reflected in the RTX, so together with this
fail_on_clobber_use modification, the rtx returned by
single_set_no_clobber_use is safe to sink if it meets limit B above and
passes the later register use/def check in move_insn_for_shrink_wrap?

Regards,
Jiong

>
> Jeff
>
>
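
Limit B from the message above can be illustrated with a toy expression tree in C. Everything here is a hypothetical model: the expr type, its kinds and the function names are invented for the sketch, "non-constant object" is modelled as a memory reference, and registers are counted per occurrence, which is a simplification of whatever the real pass does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for RTL source operands.  */
enum expr_kind { EXPR_REG, EXPR_CONST, EXPR_MEM, EXPR_OP };

struct expr
{
  enum expr_kind kind;
  const struct expr *lhs;   /* used by EXPR_OP only */
  const struct expr *rhs;   /* used by EXPR_OP only */
};

/* Count register occurrences in E.  Return -1 if E references a
   non-constant object, modelled here as EXPR_MEM.  */
static int
count_regs (const struct expr *e)
{
  if (e == NULL)
    return 0;
  switch (e->kind)
    {
    case EXPR_REG:
      return 1;
    case EXPR_CONST:
      return 0;
    case EXPR_MEM:
      return -1;
    case EXPR_OP:
      {
        int l = count_regs (e->lhs);
        int r = count_regs (e->rhs);
        return (l < 0 || r < 0) ? -1 : l + r;
      }
    }
  return -1;
}

/* Limit B: at most one register and no non-constant objects.  */
int
src_meets_limit_b (const struct expr *src)
{
  assert (src != NULL);
  int regs = count_regs (src);
  return regs >= 0 && regs <= 1;
}
```

A shift of one register by a constant passes this test, while an operation on two registers or on a memory operand does not.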


[debug-early] do not add location info/etc to abstract instances

2014-09-30 Thread Aldy Hernandez

Hi Jason.

As discussed on IRC, DIEs of abstract instances of functions (those 
tagged with DW_AT_inline), cannot include information that would be 
different between an abstract inline and an out-of-line copy.  This, as 
well as (seeming) gdb snafus regarding abstract origins, was the reason 
I was seeing fewer guality regressions on my branch.


With the current patch, not only do we fix this oversight, but are now 
at feature *and* bug parity with mainline.  I guess that's good :(.


There were a few problems that needed fixing.  First, 
gen_formal_parameter_die() was reusing the abstract instance DIE's 
parameter, instead of creating a new die with abstract origin set. 
Also, gen_formal_parameter_die() had some old Michael code, working 
around decl_ultimate_origin hacks.  I've fixed all of these problems.


I also fixed gen_subprogram_die(), so it's more aware of a previously 
generated die.


I would appreciate if you could take a look at this patch, particularly 
at the gen_formal_parameter_die change, since I'm not sure what the 
proper way is of checking that a parameter belongs to an abstract 
instance.  Below is what I'm using:



   bool reusing_die;
-  if (parm_die)
+  if (parm_die
+      /* Make sure the function to which this parameter belongs is
+         not an abstract instance.  If it is, we can't reuse anything.
+         We must create a new DW_TAG_formal_parameter with a
+         corresponding DW_AT_abstract_origin.  */
+      && !get_AT (context_die, DW_AT_abstract_origin))
     reusing_die = true;


Also, seeing as my changes could potentially render invalid DIEs, I 
thought it best to add, at the very least, a check for the inline 
problem described above (see check_die_inline).  For that matter, I 
wonder if we could add more checks to check_die() to make it a general 
dwarf DIE sanity checker.  It still amazes me that you dwarf hackers do 
all this magic, without any sort of checks.


And finally, making changes to _when_ we generate DIEs can sometimes 
lead to NOT generating DIEs early, and silently behaving like mainline 
(that is, generating everything at the end of compilation).  To solve 
this problem, which I'm sure I'll stumble into, I've added 
-fdump-early-debug-stats which will set the ->dumped_early bit in every 
DIE generated after parsing, and then dumping the DIEs after the late 
dwarf generation has run.  This way I can see if we have too many DIEs 
that were NOT generated early.  It is likely this will only be a 
temporary hacking tool to check I didn't do anything stupid along the way :).


How does this look?

Aldy
commit d9b215d7c3aeea2c601e0d983cbe424990c1beab
Author: Aldy Hernandez 
Date:   Tue Sep 30 08:20:44 2014 -0700

* common.opt (fdump-early-debug-stats): New.
* debug.h (dwarf2out_mark_early_dies): New prototype.
(dwarf2out_dump_early_debug_stats): New prototype.
* toplev.c (compile_file): Dump early debug stats if requested.
* dwarf2out.c (check_die): Check that DIEs containing a
DW_AT_inline do not contain any invalid modifiers.
(gen_formal_parameter_die): Do not reuse parameters that belong to
an abstract instance.
Do not care that an abstract origin is itself.
(gen_subprogram_die): Handle old_die's better.
(print_die): Print dumped_early bit.
(mark_early_dies_helper): New.
(dwarf2out_mark_early_dies): New.
(dwarf2out_dump_early_debug_stats): New.
(check_die_inline): New.
(check_die): Call check_die_inline.

diff --git a/gcc/common.opt b/gcc/common.opt
index 634a72b..c01f935 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1132,6 +1132,10 @@ fdump-unnumbered-links
 Common Report Var(flag_dump_unnumbered_links)
 Suppress output of previous and next insn numbers in debugging dumps
 
+fdump-early-debug-stats
+Common Report Var(flag_dump_early_debug_stats)
+Dump all dwarf DIEs, specifying if they were generated during the early debug stage
+
 fdwarf2-cfi-asm
 Common Report Var(flag_dwarf2_cfi_asm) Init(HAVE_GAS_CFI_DIRECTIVE)
 Enable CFI tables via GAS assembler directives.
diff --git a/gcc/debug.h b/gcc/debug.h
index ec387ca..7158a48 100644
--- a/gcc/debug.h
+++ b/gcc/debug.h
@@ -190,11 +190,12 @@ extern bool dwarf2out_do_frame (void);
 extern bool dwarf2out_do_cfi_asm (void);
 extern void dwarf2out_switch_text_section (void);
 
+extern void dwarf2out_mark_early_dies (void);
+extern void dwarf2out_dump_early_debug_stats (void);
+
 const char *remap_debug_filename (const char *);
 void add_debug_prefix_map (const char *);
 
-extern void dwarf2out_early_decl (tree);
-
 /* For -fdump-go-spec.  */
 
 extern const struct gcc_debug_hooks *
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index c92101f..a713435 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -5358,9 +5358,9 @@ print_die (dw_die_ref die, FILE *outfile)
   unsigned ix;
 
   print_spaces (outfile);
-  fprintf (outfile, "DIE %4ld: %s (%p)\n",
+  fprintf (outfile, "DIE 

Re: [PATCH, committed] PR 63410: Fix missing plugin headers

2014-09-30 Thread Mike Stump
On Sep 30, 2014, at 8:45 AM, David Malcolm  wrote:
> We install the header "pass_manager.h", but it can't be included by a
> plugin, since it includes "pass-instances.def", and we don't currently
> install that.
> 
> Similarly, the installed header pretty-print.h now uses
> wide-int-print.h, but the latter isn't installed.

So, installing wide-int-print.h seems reasonable.


Re: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o compile, (internal compiler error)"

2014-09-30 Thread Mike Stump
On Sep 30, 2014, at 9:15 AM, Joseph S. Myers  wrote:
> On Tue, 30 Sep 2014, Richard Earnshaw wrote:
> 
>> GCC is written in C++ these days, so technically, you need the C++
>> standard :-)
> 
> And, while C++14 requires plain int bit-fields to be signed, GCC is 
> written in C++98/C++03.

So, seemingly left unstated in the thread is what is required by the language 
standard we write in…  From c++98:

  It is implementation-defined whether bit-fields and objects of char
  type are represented as signed or unsigned quantities.  The signed
  specifier forces char objects and bit-fields to be signed; it is
  redundant with other integral types.

So, I think you need a signed on bitfields if you want them to be signed.  It 
doesn’t matter what g++ does, if we want to be portable to any C++ compiler.
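
As a small illustration of that advice, the sketch below spells out the signedness of each bit-field explicitly. It is written in C, where a plain int bit-field is likewise implementation-defined as signed or unsigned; the same declarations are also valid C++98. The struct and function names are invented for the example.

```c
#include <assert.h>

struct packed_fields
{
  signed int delta : 4;     /* explicitly signed: holds -8..7 */
  unsigned int kind : 4;    /* explicitly unsigned: holds 0..15 */
};

/* Store VALUE in the signed 4-bit field and read it back.  With the
   explicit "signed", negative values survive on any conforming
   compiler.  */
int
store_and_load_delta (int value)
{
  struct packed_fields f = { 0, 0 };

  assert (value >= -8 && value <= 7);
  f.delta = value;
  return f.delta;
}
```

Had delta been declared as a plain "int delta : 4", reading back a stored -3 could yield 13 on a compiler that treats plain bit-fields as unsigned.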

Re: __intN patch 3/5: main __int128 -> __intN conversion.

2014-09-30 Thread DJ Delorie

Joseph S. Myers  wrote:
> The non-C++/libstdc++ parts are OK with those changes.

Jonathan Wakely  wrote:
> >* libstdc++-v3/
> > * src/c++11/limits.cc: Add support for __intN types.
> > * include/std/type_traits: Likewise.
> > * include/std/limits: Likewise.
> > * include/c_std/cstdlib: Likewise.
> > * include/bits/cpp_type_traits.h: Likewise.
> > * include/c_global/cstdlib: Likewise.
> 
> These libstdc++ changes are OK for trunk.

Do I still need approval for gcc/cp/* or do these cover it?


Re: [PATCH X86, PR62128] Rotate pattern for AVX2

2014-09-30 Thread Evgeny Stupachenko
>What happens when shift == 16?
We emit just gen_avx2_permv2ti. We don't need additional palignr.

On Tue, Sep 30, 2014 at 9:06 PM, Uros Bizjak  wrote:
> On Tue, Sep 30, 2014 at 6:47 PM, Evgeny Stupachenko  
> wrote:
>
>> Patch resubmitted from 
>> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01400.html
>>
>> The patch fixes PR62128 and "gcc.target/i386/pr52252-atom.c" in
>> core-avx2 make check.
>> The test in pr62128 is exactly TEST 22 from
>> gcc.dg/torture/vshuf-v32qi.c. It will check if the pattern is correct
>> or not.
>> The patch was developed along the lines of define_insn_and_split
>> "*avx_vperm_broadcast_".
>> The patch passed x86 bootstrap and make check (+2 new passes for
>> -march=core-avx2).
>> Is it ok?
>>
>> Evgeny
>>
>> ChangeLog:
>>
>> 2014-09-30  Evgeny Stupachenko  
>>
>> * config/i386/sse.md (avx2_palignrv4di): New.
>> * config/i386/sse.md (avx2_rotate_perm): New.
>
> +(define_insn "avx2_palignrv4di"
> +  [(set (match_operand:V4DI 0 "register_operand" "=x")
> + (unspec:V4DI
> +  [(match_operand:V4DI 1 "register_operand" "x")
> +   (match_operand:V4DI 2 "nonimmediate_operand" "xm")
> +   (match_operand:SI 3 "const_0_to_255_operand" "n")]
> +  UNSPEC_VPALIGNRDI))]
> +  "TARGET_AVX2"
> +  "vpalignr\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> +  [(set_attr "type" "sselog")
> +   (set_attr "prefix" "vex")
> +   (set_attr "mode" "OI")])
>
> Just reuse UNSPEC_PALIGNR, no need for a new unspec.
>
> +(define_insn_and_split "avx2_rotate_perm"
> +  [(set (match_operand:V_256 0 "register_operand" "=&x")
> +  (vec_select:V_256
> + (match_operand:V_256 1 "register_operand" "x")
> + (match_parallel 2 "palignr_operand"
> +  [(match_operand 3 "const_int_operand" "n")])))]
> +  "TARGET_AVX2"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
>
> This should be a define_expand. There is nothing that requires hard
> registers. You can achieve mode-changes by using gen_lowpart, see many
> examples in sse.md
>
> +  if (shift < 16)
> + emit_insn (gen_avx2_palignrv4di (op0,
> + op0,
> + op1,
> + GEN_INT (shift)));
> +  else if (shift > 16)
> + emit_insn (gen_avx2_palignrv4di (op0,
> + op1,
> + op0,
> + GEN_INT (shift - 16)));
>
> What happens when shift == 16?
>
> Uros.


Re: [PR libfortran/62768] Handle filenames with embedded nulls

2014-09-30 Thread Janne Blomqvist
On Thu, Sep 18, 2014 at 11:33 PM, Hans-Peter Nilsson  wrote:
> On Thu, 18 Sep 2014, Janne Blomqvist wrote:
>> >  If you look back at the patch I posted, there's a
>> > typo. :-}  Duly warned about, but I'd rather expect the build to
>> > fail.
>>
>> Yes, strange that it didn't fail. There's no prototype for cf_fstrcpy,
>> and since we use std=gnu11 prototypes should be mandatory. Also, since
>> there's no symbol called cf_fstrcpy, at least the linking should
>> fail. Unless the link picked up some old inquire.o file?
>
> For closure: no, linking certainly *did* fail and no executable
> was created for those tests; failing linking correctly counts as
> a fail too.
>
>> > Apparently libgfortran is not compiled with -Werror, at least
>> > not for crosses.  Maybe -Werror is there for native but I'm not
>> > sure as I see some "warning: array subscript has type 'char'
>> > [-Wchar-subscripts]" which seems generic and also some others.
>> > Though no more than can be fixed or excepted, IMHO.
>>
>> No, Werror isn't used. It was tried, but apparently caused issues.
>
> 'k.  Maybe -Werror=implicit-function-declaration is a middle
> way.

Good idea. I committed r215741 as obvious which adds this to the compile flags.

I'm sure there are other warnings that can be enabled with -Werror=...
in a similar fashion, but this is a start at least. Another approach
would be to enable -Werror if some conditions are met. Such as

- native build
- --enable-maintainer-mode
- glibc target

I'm not in the mood to torture myself with autofoo to do this ATM, but
food for thought..

-- 
Janne Blomqvist


Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer

2014-09-30 Thread Steven Bosscher
On Tue, Sep 30, 2014 at 6:51 PM, Richard Earnshaw wrote:
> It's not just clobbers; it ignores patterns like
>
> (parallel
>  [(set (a) (...)
>   (set (b) (...)])
> [(reg_note (reg_unused(b))]
>
> Which is probably fine before register allocation but definitely
> something you have to think about afterwards.

Even before RA this isn't always fine. We have checks for
!multiple_sets for this.

Ciao!
Steven


Re: [Bug libstdc++/62313] Data race in debug iterators

2014-09-30 Thread François Dumont

Hi

I prefer to submit this patch to you because I am not very 
comfortable with Python stuff.


I simply rely on the Python cast feature. It doesn't really matter, but 
will it simply treat the debug iterator as a normal one, or will 
it go through the C++ explicit cast operator on debug iterators?


François


On 30/09/2014 17:32, Jonathan Wakely wrote:

On 26/09/14 11:05 +0100, Jonathan Wakely wrote:

On 26/09/14 00:00 +0200, François Dumont wrote:



Apart from those minor adjustments I think this looks good, but I'd
like to know that it has been tested with -fsanitize=thread, even if
only lightly tested.




Hi

  Dmitry, who reported the bug, confirmed the fix. Can I go ahead 
and commit ?


Yes, OK.


This caused some failures in the printer tests:

Running
/home/jwakely/src/gcc/gcc/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp 
...

FAIL: libstdc++-prettyprinters/debug.cc print deqiter
FAIL: libstdc++-prettyprinters/debug.cc print lstiter
FAIL: libstdc++-prettyprinters/debug.cc print lstciter
FAIL: libstdc++-prettyprinters/debug.cc print mpiter
FAIL: libstdc++-prettyprinters/debug.cc print spciter




Index: python/libstdcxx/v6/printers.py
===
--- python/libstdcxx/v6/printers.py	(revision 215741)
+++ python/libstdcxx/v6/printers.py	(working copy)
@@ -460,7 +460,7 @@
     # and return the wrapped iterator value.
     def to_string (self):
         itype = self.val.type.template_argument(0)
-        return self.val['_M_current'].cast(itype)
+        return self.val.cast(itype)
 
 class StdMapPrinter:
     "Print a std::map or std::multimap"


Re: [Patch AArch64] Fix extended register width

2014-09-30 Thread Eric Christopher
On Tue, Sep 30, 2014 at 5:57 AM, Marcus Shawcroft
 wrote:
> On 22 September 2014 19:41, Carrot Wei  wrote:
>> Hi
>>
>> The extended register width in add/adds/sub/subs/cmp instructions is
>> not always the same as target register, it depends on both target
>> register width and extension type. But in current implementation the
>> extended register width is always the same as target register. We have
>> noticed it can generate following wrong assembler code when compiled
>> an internal application,
>>
>> add x2, x20, x0, sxtw 3
>>
>> The correct assembler should be
>>
>> add x2, x20, w0, sxtw 3
>

Hi Marcus,

> Hi,
>
> The assembler deliberately accepts the first form as a programmer
> convenience.  Given the above example:
>

I've been doing some reading of the ARM-v8 ARM and the language the
ARM uses here for this instruction matches the "shall" and not
"should" language it uses in other locations:

"Is the 32-bit name of the second general-purpose source register,
encoded in the "Rm" field."

This seems to say that a conforming assembler should error on a
non-32bit named register here. As I said, same sort of verbiage used
elsewhere for shall, in "should" cases the ARM is very careful to
spell it out.

Now if we want to change the ARM philosophy here I'm not opposed, but
I think we'd want some more explicit documentation about how/where
things should be more relaxed versus a bunch of "this is convenient to
accept here" stuff. That kind of thing has a tendency to end up in
some pretty fun context sensitive parsing madness.

Thoughts?

-eric


> AARCH64 GAS  x.s page 1
>
>
>    1      82CE20AB     adds x2, x20, x0, sxtw 3
>    2 0004 82CE20AB     adds x2, x20, w0, sxtw 3
>
> Note both forms are correctly assembled.  The GAS implementation
> contains code at (or near) tc-aarch64.c:5461 that specifically catches
> the former.
>
> ... therefore I see no need to change the behaviour of gcc.
>
> Cheers
> /Marcus
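
The rule Carrot describes, that the width of the extended source register depends on both the destination width and the extension type, can be sketched as a small C helper. This reflects my reading of the A64 extended-register forms (X is used only for UXTX/SXTX with a 64-bit destination) and should be treated as an assumption; the enum and function name are hypothetical and are not taken from GCC or GAS.

```c
#include <assert.h>

enum extend_op
{
  EXT_UXTB, EXT_UXTH, EXT_UXTW, EXT_UXTX,
  EXT_SXTB, EXT_SXTH, EXT_SXTW, EXT_SXTX
};

/* Return the register-name prefix ('w' or 'x') to print for the
   extended source operand of add/adds/sub/subs/cmp.  */
char
extended_reg_prefix (int dest_is_64bit, enum extend_op op)
{
  assert (op <= EXT_SXTX);
  if (dest_is_64bit && (op == EXT_UXTX || op == EXT_SXTX))
    return 'x';
  return 'w';
}
```

For the example in this thread, a 64-bit destination with sxtw gives 'w', matching "add x2, x20, w0, sxtw 3".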


Re: [PATCH, rs6000] Generate LE code for vec_lvsl and vec_lvsr that is compatible with BE code

2014-09-30 Thread Segher Boessenkool
On Tue, Sep 30, 2014 at 11:18:39AM -0500, Bill Schmidt wrote:
> > I meant generating a sequence that just "falls out" as you want it after
> > optimisation.  E.g. lvsr;vnot;vand(splat8(31));vperm can have the vand
> > absorbed by the vperm.  But that splat is nasty when not optimised away :-(
> 
> Especially since splat8(31) requires vsub(splat8(15),splat8(-16))...

vspltisb vT,-5 ; vsrb vD,vT,vT # :-)

> To get something that is correct with and without feeding a vperm and
> actually performs well just ain't happening here...

Yeah.


Segher
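
The byte arithmetic behind both ways of building splat8(31) can be checked with a scalar model in C. This is a sketch for one byte lane only; the helper names are invented, and the comments describe my understanding of vspltisb (5-bit signed immediate), vsububm (modulo subtract) and vsrb (shift by the low three bits of the count byte).

```c
#include <assert.h>
#include <stdint.h>

/* vspltisb takes a 5-bit signed immediate, so only -16..15 can be
   splatted directly; 31 is out of range.  */
uint8_t
splat_imm (int imm)
{
  assert (imm >= -16 && imm <= 15);
  return (uint8_t) imm;
}

/* vsrb shifts each byte right by the low three bits of the matching
   byte in the shift-count vector.  */
uint8_t
byte_shift_right (uint8_t value, uint8_t count)
{
  return (uint8_t) (value >> (count & 7));
}

/* vsub (splat8 (15), splat8 (-16)): 15 - (-16) modulo 256.  */
uint8_t
splat31_via_sub (void)
{
  return (uint8_t) (splat_imm (15) - splat_imm (-16));
}

/* vspltisb vT,-5 ; vsrb vD,vT,vT: 0xFB shifted right by 0xFB & 7.  */
uint8_t
splat31_via_shift (void)
{
  uint8_t t = splat_imm (-5);
  return byte_shift_right (t, t);
}
```

Both routes give 31: -5 is 0xFB, its low three bits are 3, and 0xFB >> 3 is 0x1F. The shift version needs only one splat, which is the point of the one-liner above.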


Re: [PATCH RFC]Pair load store instructions using a generic scheduling fusion pass

2014-09-30 Thread Mike Stump
On Sep 30, 2014, at 2:22 AM, Bin Cheng  wrote:
> Then I decided to take one step forward to introduce a generic
> instruction fusion infrastructure in GCC, because in essence, load/store
> pair is no different from other instruction fusion; all these
> optimizations want is to push instructions together in the instruction flow.

I like the step you took.  I had exactly this in mind when I wrote the original.

> N0 ~= 1300
> N1/N2 ~= 5000
> N3 ~= 7500

Nice.  Would be nice to see metrics for time to ensure that the code isn’t 
actually worse (CSiBE and/or spec and/or some other).  I didn’t have any large 
scale benchmark runs with my code and I did worry about extending lifetimes and 
register pressure.

> I cleared up Mike's patch and fixed some implementation bugs in it

So, I’m wondering what the bugs or missed opportunities were?  And, if they 
were of the type of problem that generated incorrect code or if they were of 
the type that was merely a missed opportunity.

[Patch, libgfortran, committed] Fix -Wmaybe-uninitialized warnings

2014-09-30 Thread Janne Blomqvist
Hi,

I compiled libgfortran with -Werror, and fixed the fallout with the
attached patch. Committed r215742 as obvious.

2014-10-01  Janne Blomqvist  

* intrinsics/pack_generic.c (pack_s_internal): Fix
-Wmaybe-uninitialized warning.
* m4/unpack.m4 (unpack0_'rtype_code`): Likewise.
(unpack1_'rtype_code`): Likewise.
* generated/unpack_*.m4: Regenerated.


-- 
Janne Blomqvist
diff --git a/libgfortran/intrinsics/pack_generic.c b/libgfortran/intrinsics/pack_generic.c
index 3fbfa0a..831f396 100644
--- a/libgfortran/intrinsics/pack_generic.c
+++ b/libgfortran/intrinsics/pack_generic.c
@@ -463,6 +463,9 @@ pack_s_internal (gfc_array_char *ret, const gfc_array_char *array,
   index_type total;
 
   dim = GFC_DESCRIPTOR_RANK (array);
+  /* Initialize sstride[0] to avoid -Wmaybe-uninitialized
+ complaints.  */
+  sstride[0] = size;
   ssize = 1;
   for (n = 0; n < dim; n++)
 {
diff --git a/libgfortran/m4/unpack.m4 b/libgfortran/m4/unpack.m4
index e945446..271eae2 100644
--- a/libgfortran/m4/unpack.m4
+++ b/libgfortran/m4/unpack.m4
@@ -105,6 +105,8 @@ unpack0_'rtype_code` ('rtype` *ret, const 'rtype` *vector,
   else
 {
   dim = GFC_DESCRIPTOR_RANK (ret);
+  /* Initialize to avoid -Wmaybe-uninitialized complaints.  */
+  rstride[0] = 1;
   for (n = 0; n < dim; n++)
{
  count[n] = 0;
@@ -250,6 +252,8 @@ unpack1_'rtype_code` ('rtype` *ret, const 'rtype` *vector,
   else
 {
   dim = GFC_DESCRIPTOR_RANK (ret);
+  /* Initialize to avoid -Wmaybe-uninitialized complaints.  */
+  rstride[0] = 1;
   for (n = 0; n < dim; n++)
{
  count[n] = 0;
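
The pattern behind these warnings can be reduced to a short C sketch: an array element that is only written inside a loop which may run zero times, and is read afterwards. The function and MAX_RANK below are hypothetical and only mirror the shape of the fix, not the libgfortran code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANK 15   /* assumed upper bound for this sketch */

/* Copy RANK strides into a local array and return the first one.
   Without the explicit store to local[0], the compiler cannot prove
   that local[0] is written when RANK is 0 and may warn with
   -Wmaybe-uninitialized.  */
size_t
first_stride (const size_t *strides, int rank)
{
  size_t local[MAX_RANK];

  assert (rank >= 0 && rank <= MAX_RANK);
  local[0] = 1;   /* default used when the loop body never runs */
  for (int n = 0; n < rank; n++)
    local[n] = strides[n];
  return local[0];
}
```

The extra store costs nothing measurable and also gives the rank-0 case a defined value.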


Re: [PATCH X86, PR62128] Rotate pattern for AVX2

2014-09-30 Thread Evgeny Stupachenko
expand_vselect for some reason ignores the expander.
Does it work with expanders?
The comment talks about insn only:
/* Construct (set target (vec_select op0 (parallel perm))) and
   return true if that's a valid instruction in the active ISA.  */

On Tue, Sep 30, 2014 at 10:21 PM, Uros Bizjak  wrote:
> On Tue, Sep 30, 2014 at 8:08 PM, Uros Bizjak  wrote:
>> On Tue, Sep 30, 2014 at 7:06 PM, Uros Bizjak  wrote:
>>> On Tue, Sep 30, 2014 at 6:47 PM, Evgeny Stupachenko  
>>> wrote:
>>>
>>>> Patch resubmitted from
>>>> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01400.html
>>>>
>>>> The patch fixes PR62128 and "gcc.target/i386/pr52252-atom.c" in
>>>> core-avx2 make check.
>>>> The test in pr62128 is exactly TEST 22 from
>>>> gcc.dg/torture/vshuf-v32qi.c. It will check if the pattern is correct
>>>> or not.
>>>> The patch was developed along the lines of define_insn_and_split
>>>> "*avx_vperm_broadcast_".
>>>> The patch passed x86 bootstrap and make check (+2 new passes for
>>>> -march=core-avx2).
>>>> Is it ok?
>>
>> Please try following (totally untested) expander:
>
> As usual, the wrong version was pasted. This should read:
>
> --cut here--
> (define_expand "avx2_rotate_perm"
>   [(set (match_operand:V_256 0 "register_operand")
>         (vec_select:V_256
>           (match_operand:V_256 1 "register_operand")
>           (match_parallel 2 "palignr_operand"
>             [(match_operand 3 "const_int_operand" "n")])))]
>   "TARGET_AVX2"
> {
>   int shift = INTVAL (operands[3]) * ;
>   rtx insn;
>
>   rtx op1 = gen_lowpart (V4DImode, operands[1]);
>   rtx t2 = gen_reg_rtx (V4DImode);
>
>   emit_insn (gen_avx2_permv2ti (t2, op1, op1, GEN_INT (33)));
>
>   rtx op0 = gen_lowpart (V2TImode, operands[0]);
>   op1 = gen_lowpart (V2TImode, operands[1]);
>   t2 = gen_lowpart (V2TImode, t2);
>
>   if (shift < GET_MODE_SIZE (TImode))
>     insn = gen_avx2_palignrv2ti (op0, t2, op1, GEN_INT (shift));
>   else
>     insn = gen_avx2_palignrv2ti (op0, op1, t2, GEN_INT (shift - 16));
>
>   emit_insn (insn);
>   DONE;
> }
> --cut here--
>
> Uros.


Re: [PATCH GCC]Improve candidate selecting in IVOPT

2014-09-30 Thread Sebastian Pop
Bin Cheng wrote:
> Hi,
> As analyzed in PR62178, IVOPT can't find the optimal iv set for that case.
> The problem with current heuristic algorithm is it only replaces candidate
> with ones not in current solution one by one, starting from small solution.
> This patch adds another heuristic which starts from assigning the best
> candidate for each iv use, then replaces candidate with ones in the current
> solution.
> Before this patch, there are two runs of find_optimal_set_1 to find the
> optimal iv sets, we name them as set_a and set_b.  After this patch we will
> have set_c.  At last, IVOPT chooses the best one from set_a/set_b/set_c.  To
> prove that this patch is necessary, I collected instrumental data for gcc
> bootstrap, spec2k, eembc and can confirm for some cases only the newly added
> heuristic can find the optimal iv set.  The number of these cases in which
> set_c is the optimal one is on the same level of set_b.
> As for the compilation time, the newly added function actually is one
> iteration of previous selection algorithm, it should be much faster than
> previous process.
> 
> I also added one target dependent test case.
> Bootstrap and test on x86_64, test on aarch64.  Any comments?

I verified that the patch fixes the performance regression on intmm.  I have
seen improvements to other benchmarks, and very small degradations that could
very well be noise.

Thanks for fixing this perf issue!
Sebastian

> 
> 2014-09-30  Bin Cheng  
> 
>   PR tree-optimization/62178
>   * tree-ssa-loop-ivopts.c (enum sel_type): New.
>   (iv_ca_add_use): Add parameter RELATED_P and find the best cand
>   for iv use if it's true.
>   (try_add_cand_for, get_initial_solution): Change parameter ORIGINALP
>   to SELECT_TYPE and handle it.
>   (find_optimal_iv_set_1): Ditto.
>   (try_prune_iv_set, find_optimal_iv_set_2): New functions.
>   (find_optimal_iv_set): Call find_optimal_iv_set_2 and choose the
>   best candidate set.
> 
> gcc/testsuite/ChangeLog
> 2014-09-30  Bin Cheng  
> 
>   PR tree-optimization/62178
>   * gcc.target/aarch64/pr62178.c: New test.

> Index: gcc/testsuite/gcc.target/aarch64/pr62178.c
> ===
> --- gcc/testsuite/gcc.target/aarch64/pr62178.c(revision 0)
> +++ gcc/testsuite/gcc.target/aarch64/pr62178.c(revision 0)
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +int a[30 +1][30 +1], b[30 +1][30 +1], r[30 +1][30 +1];
> +
> +void Intmm (int run) {
> +  int i, j, k;
> +
> +  for ( i = 1; i <= 30; i++ )
> +    for ( j = 1; j <= 30; j++ ) {
> +      r[i][j] = 0;
> +      for(k = 1; k <= 30; k++ )
> +        r[i][j] += a[i][k]*b[k][j];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler "ld1r\\t\{v\[0-9\]+\."} } */
> Index: gcc/tree-ssa-loop-ivopts.c
> ===
> --- gcc/tree-ssa-loop-ivopts.c(revision 215113)
> +++ gcc/tree-ssa-loop-ivopts.c(working copy)
> @@ -254,6 +254,14 @@ struct iv_inv_expr_ent
>hashval_t hash;
>  };
>  
> +/* Types used to start selecting the candidate for each IV use.  */
> +enum sel_type
> +{
> +  SEL_ORIGINAL,  /* Start selecting from original cands.  */
> +  SEL_IMPORTANT, /* Start selecting from important cands.  */
> +  SEL_RELATED    /* Start selecting from related cands.  */
> +};
> +
>  /* The data used by the induction variable optimizations.  */
>  
>  typedef struct iv_use *iv_use_p;
> @@ -5417,22 +5425,51 @@ iv_ca_set_cp (struct ivopts_data *data, struct iv_
>  }
>  
>  /* Extend set IVS by expressing USE by some of the candidates in it
> -   if possible.  Consider all important candidates if candidates in
> -   set IVS don't give any result.  */
> +   if possible.  If RELATED_P is FALSE, consider all important
> +   candidates if candidates in set IVS don't give any result;
> +   otherwise, try to find the best one from related or all candidates,
> +   depending on consider_all_candidates.  */
>  
>  static void
>  iv_ca_add_use (struct ivopts_data *data, struct iv_ca *ivs,
> -               struct iv_use *use)
> +               struct iv_use *use, bool related_p)
>  {
>    struct cost_pair *best_cp = NULL, *cp;
>    bitmap_iterator bi;
>    unsigned i;
>    struct iv_cand *cand;
>  
> -  gcc_assert (ivs->upto >= use->id);
> +  gcc_assert (ivs->upto == use->id);
>    ivs->upto++;
>    ivs->bad_uses++;
>  
> +  if (related_p)
> +    {
> +      if (data->consider_all_candidates)
> +        {
> +          for (i = 0; i < n_iv_cands (data); i++)
> +            {
> +              cand = iv_cand (data, i);
> +              cp = get_use_iv_cost (data, use, cand);
> +              if (cheaper_cost_pair (cp, best_cp))
> +                best_cp = cp;
> +            }
> +        }
> +      else
> +        {
> +          EXECUTE_IF_SET_IN_BITMAP (use->related_cands, 0, i, bi)
> +            {
> +              cand = iv_cand (data, i);
> +   cp 

Re: C++ PATCHes to add __is_trivially_*

2014-09-30 Thread Ville Voutilainen
>>Ville asked for help with the necessary compiler intrinsics for the
>>is_trivially_* C++11 library traits. The first patch cleans up a few
>>oddities I noticed with the
>Great. I think this can be as well marked as PR c++/26099.

There's also PR c++/63362.

The intrinsics still fail to support certain variadic cases, such as

template <typename T, typename... Args> void bar() {
  static_assert(__is_trivially_constructible(T, Args...), "");
}
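
For comparison, here is a minimal sketch of the same variadic shape expressed through the standard library trait rather than the raw intrinsic. It assumes a compiler and standard library where pack expansion into the trait works; the helper trivially_constructible_from and the types Pod and UserCtor are hypothetical names introduced only for this example.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper (not from the patch): forwards a whole parameter pack
// to the standard trait, the same shape as the bar<T, Args...>() case above.
template <typename T, typename... Args>
constexpr bool trivially_constructible_from() {
  return std::is_trivially_constructible<T, Args...>::value;
}

struct Pod { int x; double y; };       // all special members are trivial
struct UserCtor { UserCtor(int) {} };  // user-provided constructor

// An empty pack asks about default construction.
static_assert(trivially_constructible_from<int>(), "trivial default ctor");
// A one-element pack asks about construction from that argument type.
static_assert(trivially_constructible_from<Pod, const Pod&>(), "trivial copy");
static_assert(!trivially_constructible_from<UserCtor, int>(), "user-provided");
static_assert(!trivially_constructible_from<std::string>(), "non-trivial");
```

If the intrinsic rejects the pack expansion, a wrapper of this shape is where the failure would show up, since a library implementation of the trait has to pass Args... through in exactly this way.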


Re: __intN patch 3/5: main __int128 -> __intN conversion.

2014-09-30 Thread Jonathan Wakely

On 30/09/14 15:37 -0400, DJ Delorie wrote:
>Joseph S. Myers  wrote:
>> The non-C++/libstdc++ parts are OK with those changes.
>
>Jonathan Wakely  wrote:
>> >* libstdc++-v3/
>> >* src/c++11/limits.cc: Add support for __intN types.
>> >* include/std/type_traits: Likewise.
>> >* include/std/limits: Likewise.
>> >* include/c_std/cstdlib: Likewise.
>> >* include/bits/cpp_type_traits.h: Likewise.
>> >* include/c_global/cstdlib: Likewise.
>>
>> These libstdc++ changes are OK for trunk.
>
>Do I still need approval for gcc/cp/* or do these cover it?

I can't approve gcc/cp/* changes and as Joseph says non-C++ I think
you need to chase someone else (i.e. Jason :-) for that.


Re: __intN patch 3/5: main __int128 -> __intN conversion.

2014-09-30 Thread DJ Delorie

Joseph S. Myers  wrote:
> The non-C++/libstdc++ parts are OK with those changes.

Jonathan Wakely  wrote:
> These libstdc++ changes are OK for trunk.

Jason/Nathan,

Could one of you two please review the remaining C++ parts (cp/*) ?

  https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02360.html

Thanks!

