Re: [PATCH] Delete GCJ

2016-10-10 Thread Iain Sandoe

> On 10 Oct 2016, at 05:03, Matthias Klose  wrote:
> 
> On 07.10.2016 10:30, Iain Sandoe wrote:
>> 
>>> On 7 Oct 2016, at 00:58, Matthias Klose  wrote:
>>> 
>>> On 06.10.2016 20:00, Mike Stump wrote:
 On Oct 6, 2016, at 9:56 AM, Rainer Orth  
 wrote:
> I wouldn't hard-fail, but completely disable objc-gc with an appropriate
> warning.  The Objective-C maintainers may have other preferences, though.
>>> 
>>> I think I can't do that in the top level make file very well (currently I 
>>> only
>>> have the pkg-config check there for an early failure, but that check doesn't
>>> tell me if the library is present for all multilib variants). And I can't 
>>> check
>>> for multilibs because I don't know if the bootstrap compiler is multilib 
>>> aware.
>> 
>> hrm, so perhaps we need a --with-target-boehm-gc= type arrangement, and it’s 
>> the configurer’s responsibility to provide a path with appropriate 
>> headers/libs for the multi-lib configuration being attempted.
> 
> I don't understand what you are proposing here.

given that:
 * auto-detection of the capabilities could be quite difficult (or, in the 
   general case, unachievable) for the reasons you gave;
 * the chosen target libraries might be in a non-standard place;

making it an explicit requirement to point to them, and to ensure that the 
versions/multi-libs provided are suitable (by passing 
--with-target-boehm-gc=/path/to/suitable/), would mean that the dependent 
configury (for objc-gc) could simply be conditional upon the presence of 
"--with-target-boehm-gc".

I suppose that one could make "--with-target-boehm-gc" (no attached path) 
declare that the library (and requisite multi-lib versions) will be found in the 
sysroot without any additional work.

The point here was to simplify the dependent configury so that it only needs to 
test something that the configuring user specifies (i.e. if they specify 
objc-gc, then they need also to specify the place that the gc lib can be found).

 gcc historically is fairly weak at complex configurations.  I need the 32 
 bit libraries to support -m32, but, those libraries might not be present, 
 but do I build all the rest of my libraries, and if I do, do I test them 
 once built, but what if other dependent external libraries are missing.  
 Do I turn off the multilib, or do I not?
 
 I used to manage some of this by passing in configure flags to control 
 multilibbing based upon what libraries were installed and then run testing 
 based upon that.  Of course, that's all external to gcc proper. Doesn't 
 really make gcc any easier to configure and build or advance gcc.
 
 We could smell the system at configure time, and turn on and off multilib 
 variants and things like objc gc. Target specific, but I think it helps to 
 ponder this in a target independent way.  This can then turn on and off 
 objc gc support directly.  To get it on, one would need to install the 
 needed libraries, and reconfigure and rebuild gcc.  I think I might like 
 that the best.  Has a nice ease of use about it, and then everything gcc 
 does is rather sane (no funny build errors when a needed library isn't 
 present).
 
 
 So, I think, if I understand what you propose, I'm fine with that.
>>> 
>>> So your proposal is to replace the ": dnl ..." line in libobjc/configure.ac 
>>> with
>>> a hard error message and leave it to the user to correctly configure GCC?  
>>> That
>>> would rely on the compiler to find the library in a system wide multilib 
>>> aware
>>> directory (e.g. /usr/lib/i386-linux-gnu, or /usr/lib32).  Is this the case 
>>> for
>>> Solaris and Darwin?
>> 
>> for Darwin, it’s not a default install (but then neither are the host deps 
>> such as gmp & friends) - so the toolchain builder on Darwin already needs to 
>> make some provisions outside the system.  It’s just that the only target 
>> provisions to date have been the sysroot (we haven’t yet made use of add-on 
>> target libs).
>> 
>>> I'm fine with that, it wouldn't affect configurations like x86_64-linux-gnu
>>> where multilib is the default (but objc-gc is not).
>>> 
>>> Looking back at libjava, I think everybody disabled multilibs for libjava,
>>> because nobody had a complete gtk2 stack for multilibs, however that was a
>>> complete subdir, not just a certain configuration in that subdir. Looking 
>>> back
>>> at libffi and separate released libffi's I first built multilib'ed libffi
>>> libraries from the libffi source for Debian/Ubuntu, then dropped these 
>>> because
>>> they were not used, and until today GCC internal and external libffi are
>>> hopelessly out of sync, so you couldn't use an external libffi to build 
>>> libjava.
>> 
>> Because Darwin’s libjava does not depend on the gtk2 stack, actually normally 
>> libjava (and libffi, gc) were generally built and tested (by those who cared 
>> to do it) as multilibs [the default].
>>> 
>>> In the past I looked

[PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Kyrill Tkachov

Hi Richard,

As I mentioned, here is the patch, which applies on top of the main store 
merging patch, to re-implement encode_tree_to_bitpos
so that it operates on the bytes directly.

This works fine on little-endian but breaks on big-endian, even for merging 
bitfields within a single byte.
Consider the code snippet from gcc.dg/store_merging_6.c:

struct bar {
  int a : 3;
  unsigned char b : 4;
  unsigned char c : 1;
  char d;
  char e;
  char f;
  char g;
};

void
foo1 (struct bar *p)
{
  p->b = 3;
  p->a = 2;
  p->c = 1;
  p->d = 4;
  p->e = 5;
}

The correct GIMPLE for these merged stores on big-endian is:
  MEM[(voidD.49 *)p_2(D)] = 18180;
  MEM[(charD.8 *)p_2(D) + 2B] = 5;

whereas with this patch we emit:
  MEM[(voidD.49 *)p_2(D)] = 39428;
  MEM[(charD.8 *)p_2(D) + 2B] = 5;

The dump for merging the individual stores without this patch (using the 
correct but costly wide_int approach in the base patch) is:
After writing 3 of size 4 at position 3 the merged region contains:
6 0 0 0 0 0
After writing 2 of size 3 at position 0 the merged region contains:
46 0 0 0 0 0
After writing 1 of size 1 at position 7 the merged region contains:
47 0 0 0 0 0
After writing 4 of size 8 at position 8 the merged region contains:
47 4 0 0 0 0
After writing 5 of size 8 at position 16 the merged region contains:
47 4 5 0 0 0


And with this patch it is:
After writing 3 of size 4 at position 3 the merged region contains:
18 0 0 0 0 0
After writing 2 of size 3 at position 0 the merged region contains:
1a 0 0 0 0 0
After writing 1 of size 1 at position 7 the merged region contains:
9a 0 0 0 0 0
After writing 4 of size 8 at position 8 the merged region contains:
9a 4 0 0 0 0
After writing 5 of size 8 at position 16 the merged region contains:
9a 4 5 0 0 0

(Note the dump just dumps the byte array from index 0 to  so the first 
thing printed is the lowest numbered byte.
Also, each byte is dumped in hex.)
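
For readers comparing the two dumps, here is a minimal sketch in plain C of
how the first byte comes out under the two bit-numbering conventions.  It is
not code from the patch: insert_bits and replay_foo1_first_byte are
hypothetical helpers, and the only inputs taken from the message are the
value/size/position triples printed in the dumps.

```c
#include <assert.h>

#define BITS_PER_UNIT 8

/* Hypothetical helper, not part of the patch: store the low SIZE bits of
   VAL into BYTE at bit position POS.  With BIG_ENDIAN_BITS nonzero,
   position 0 is the most significant bit of the byte; otherwise it is the
   least significant bit.  Requires POS + SIZE <= BITS_PER_UNIT.  */
unsigned char
insert_bits (unsigned char byte, unsigned int val, unsigned int pos,
             unsigned int size, int big_endian_bits)
{
  unsigned int mask = (1u << size) - 1;
  unsigned int shift = big_endian_bits ? BITS_PER_UNIT - pos - size : pos;

  byte &= (unsigned char) ~(mask << shift);
  byte |= (unsigned char) ((val & mask) << shift);
  return byte;
}

/* Replay the three sub-byte stores of foo1 in the order shown in the dumps.  */
unsigned char
replay_foo1_first_byte (int big_endian_bits)
{
  unsigned char byte = 0;
  byte = insert_bits (byte, 3, 3, 4, big_endian_bits);  /* p->b = 3 */
  byte = insert_bits (byte, 2, 0, 3, big_endian_bits);  /* p->a = 2 */
  byte = insert_bits (byte, 1, 7, 1, big_endian_bits);  /* p->c = 1 */
  return byte;
}

/* The two conventions reproduce the first byte of each dump.  */
void
check_foo1_first_byte (void)
{
  assert (replay_foo1_first_byte (1) == 0x47);
  assert (replay_foo1_first_byte (0) == 0x9a);
}
```

With the following byte 0x04, these give 0x4704 = 18180 and 0x9a04 = 39428,
the two constants quoted above.  That suggests the new code numbers bits
within a byte the little-endian way even on big-endian, though the sketch
does not locate where in the patch that happens.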

The code as included here doesn't do any byte swapping for big-endian but as 
seen from the dump even writing a sub-byte
bitfield goes wrong so it would be nice to resolve that before going forward.
Any help with debugging this is hugely appreciated. I've included an ASCII 
diagram of the steps in the algorithm
in the patch itself.

Thanks,
Kyrill
diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
index 45dc615482780bcb4c9e0d261f25c334d83dd878..c6c81596528a065e6dec023522c1430997fb4593 100644
--- a/gcc/gimple-ssa-store-merging.c
+++ b/gcc/gimple-ssa-store-merging.c
@@ -199,6 +199,87 @@ dump_char_array (FILE *fd, unsigned char *ptr, unsigned int len)
   fprintf (fd, "\n");
 }
 
+/* Fill a byte array PTR of SZ elements with zeroes.  This is to be used by
+   encode_tree_to_bitpos to zero-initialize most likely small arrays but
+   with a non-compile-time-constant size.  */
+
+static inline void
+zero_char_buf (unsigned char *ptr, unsigned int sz)
+{
+  for (unsigned int i = 0; i < sz; i++)
+ptr[i] = 0;
+}
+
+/* Shift the bytes in PTR of SZ elements by AMNT bits, carrying over the bits
+   between adjacent elements.  */
+
+static void
+shift_bytes_in_array (unsigned char *ptr, unsigned int sz, unsigned int amnt)
+{
+  unsigned char carry_over = 0U;
+  unsigned char carry_mask = (~0U) << ((unsigned char)(BITS_PER_UNIT - amnt));
+  unsigned char clear_mask = (~0U) << amnt;
+
+  for (unsigned int i = 0; i < sz; i++)
+{
+  unsigned prev_carry_over = carry_over;
+  carry_over
+	= (ptr[i] & carry_mask) >> (BITS_PER_UNIT - amnt);
+
+  ptr[i] <<= amnt;
+  if (i != 0)
+	{
+	  ptr[i] &= clear_mask;
+	  ptr[i] |= prev_carry_over;
+	}
+}
+}
+
+/* In the byte array PTR clear the bit region starting at bit
+   START and is LEN bits wide.  START should be within [0, BITS_PER_UNIT).
+   For regions spanning multiple bytes do this recursively until we reach
+   zero LEN or a region contained within a single byte.  */
+
+static void
+clear_bit_region (unsigned char *ptr, unsigned int start,
+		  unsigned int len)
+{
+  /* Degenerate base case.  */
+  if (len == 0)
+return;
+
+  /* Second base case.  */
+  if ((start + len) <= BITS_PER_UNIT)
+{
+  unsigned char mask = (~0U) << ((unsigned char)(BITS_PER_UNIT - len));
+  mask >>= BITS_PER_UNIT - (start + len);
+
+  ptr[0] &= ~mask;
+
+  return;
+}
+  /* Clear most significant bits in a byte and proceed with the next byte.  */
+  else if (start != 0)
+{
+  clear_bit_region (ptr, start, BITS_PER_UNIT - start);
+  clear_bit_region (ptr + 1, 0, len - (BITS_PER_UNIT - start) + 1);
+}
+  /* Whole bytes need to be cleared.  */
+  else if (start == 0 && len > BITS_PER_UNIT)
+{
+  unsigned int nbytes = len / BITS_PER_UNIT;
+  /* We could recurse on each byte but do the loop here to avoid
+	 recursing too deep.  */
+  for (unsigned int i = 0; i < nbytes; i++)
+	ptr[i] = 0U;
+  /* Clear the remaining sub-byte region if there is one.  */
+  if (len % BITS_PER_UNIT != 0)
+	clear_bit_region (ptr + nbytes, 0, le

[PATCH][v5] GIMPLE store merging pass

2016-10-10 Thread Kyrill Tkachov

Hi all,

This is another revision of the pass addressing Richard's feedback [1].
I believe I've addressed all of it and added more comments to the code where
needed.

The output_merged_store function now uses the new split_group helper to break
up the merged store into multiple regular-sized stores.

The apply_stores function that splats the stores in a group together can now
return a bool to indicate failure and is used to quickly reject one-store groups
and other store groups that we cannot output.

One thing I've been struggling with is reimplementing encode_tree_to_bitpos,
the function that applies a tree constant to the merged byte array.
I've tried to reimplement it by writing the constant to a byte array with 
native_encode_expr
and manipulating the bytes directly to insert them into the appropriate bit 
position without
constructing an intermediate wide_int.  This works, but only for little-endian.
On big-endian it generated wrong code.
So this patch doesn't include that implementation but rather uses the previous 
one that uses
a wide_int but is correct on both endiannesses.
Richard, I am sending out a patch that implements the cheaper algorithm 
separately if you
want to help debug it.

This has been bootstrapped and tested on arm, aarch64, aarch64_be, x86_64.
Besides the encode_tree_to_bitpos reimplementation (which will have its own 
thread)
does this version look good?

Thanks,
Kyrill

[1] https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02225.html

2016-10-10  Kyrylo Tkachov  

PR middle-end/22141
* Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
* common.opt (fstore-merging): New Optimization option.
* opts.c (default_options_table): Add entry for
OPT_ftree_store_merging.
* fold-const.h (can_native_encode_type_p): Declare prototype.
* fold-const.c (can_native_encode_type_p): Define.
* params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
* passes.def: Insert pass_tree_store_merging.
* tree-pass.h (make_pass_store_merging): Declare extern
prototype.
* gimple-ssa-store-merging.c: New file.
* doc/invoke.texi (Optimization Options): Document
-fstore-merging.

2016-10-10  Kyrylo Tkachov  
Jakub Jelinek  
Andrew Pinski  

PR middle-end/22141
PR rtl-optimization/23684
* gcc.c-torture/execute/pr22141-1.c: New test.
* gcc.c-torture/execute/pr22141-2.c: Likewise.
* gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
* gcc.target/aarch64/ldp_stp_4.c: Likewise.
* gcc.dg/store_merging_1.c: New test.
* gcc.dg/store_merging_2.c: Likewise.
* gcc.dg/store_merging_3.c: Likewise.
* gcc.dg/store_merging_4.c: Likewise.
* gcc.dg/store_merging_5.c: Likewise.
* gcc.dg/store_merging_6.c: Likewise.
* gcc.dg/store_merging_7.c: Likewise.
* gcc.target/i386/pr22141.c: Likewise.
* gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.
* g++.dg/init/new17.C: Likewise.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2a98e62b03ac8b84e4595660ac952a8bb3eb1d7f..fd4353fd94f3f12d1b4c799896a704981c6ea9a1 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1300,6 +1300,7 @@ OBJS = \
 	gimple-ssa-isolate-paths.o \
 	gimple-ssa-nonnull-compare.o \
 	gimple-ssa-split-paths.o \
+	gimple-ssa-store-merging.o \
 	gimple-ssa-strength-reduction.o \
 	gimple-ssa-sprintf.o \
 	gimple-streamer-in.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index ca48872072e0780c25178e714f96a8aa7f37eb1a..79255c865e4ff4d49b4331337ede172e9e2d7e31 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1460,6 +1460,10 @@ fstrict-volatile-bitfields
 Common Report Var(flag_strict_volatile_bitfields) Init(-1) Optimization
 Force bitfield accesses to match their type width.
 
+fstore-merging
+Common Report Var(flag_store_merging) Optimization
+Merge adjacent stores.
+
 fguess-branch-probability
 Common Report Var(flag_guess_branch_prob) Optimization
 Enable guessing of branch probabilities.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d9667e7f5d91b25e4160fdfc6aae2e5d64ba260d..bd60decb4d2f7883e2c3834d5c819fba78622177 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -403,7 +403,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsingle-precision-constant -fsplit-ivs-in-unroller @gol
 -fsplit-paths @gol
 -fsplit-wide-types -fssa-backprop -fssa-phiopt @gol
--fstdarg-opt -fstrict-aliasing @gol
+-fstdarg-opt -fstore-merging -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
 -ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
@@ -414,8 +414,8 @@ Objective-C and Objective-C++ Dialects}.
 -ftree-loop-vectorize @gol
 -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol
 -ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol
--ftree-switch-conversion -ftree-tail-merge -ftree-ter @gol
--ftree-vectorize -ftree-vrp -funconstrained-commons @gol
+-ft

Re: [PATCH][simplify-rtx] Zero-initialise local array in simplify_immed_subreg

2016-10-10 Thread Kyrill Tkachov


On 07/10/16 18:56, Andrew Pinski wrote:

On Fri, Oct 7, 2016 at 10:55 AM, Andrew Pinski  wrote:

On Fri, Oct 7, 2016 at 7:08 AM, Kyrill Tkachov
 wrote:

Hi all,

I've encountered another wrong-code bug with the store merging pass. This
time it's in RTL.
The test gcc.target/aarch64/aapcs64/test_27.c on aarch64 merges a few __fp16
values at GIMPLE level but
during RTL dse1 one of the constants generated gets wrongly reinterpreted
from HImode to HFmode
by simplify_immed_subreg. The HFmode value it ends up producing is
completely bogus.

Stepping through the code with GDB shows that the problem is in the hunk
touched by this patch, where it
fills in an array of longs with the value bytes before passing it down to
real_from_target.
It fills in the array by ORing in each byte.

However, the array it declares for this does not get properly
zero-initialised for modes
less than 32 bits wide (HFmode in this case). The fix in this patch is to
just use an array initialiser
to zero it out. This makes the failure go away.

Bootstrapped and tested on aarch64, x86_64.

Ok for trunk?

Even though this has been approved I think it is best to write this
code like this:
long tmp[MAX_BITSIZE_MODE_ANY_MODE / 32];
  /* real_from_target wants its input in words affected by
 FLOAT_WORDS_BIG_ENDIAN.  However, we ignore this,
 and use WORDS_BIG_ENDIAN instead; see the documentation
 of SUBREG in rtl.texi.  */
memset (tmp, 0, sizeof(tmp));


Also the / 32 should be changed to / (sizeof(long) * BITS_PER_CHAR)
but that was there before hand.


Agreed about this, but I'm not sure about the memset.
long tmp[] = { 0 };
looks shorter and cleaner to me and is guaranteed to do the right
thing as well, no?
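
As a rough illustration of the two alternatives being discussed, the sketch
below zeroes a fixed-size array once with an initialiser and once with
memset, then ORs bytes into it.  It is not the simplify_immed_subreg code:
N_WORDS is a hypothetical stand-in for MAX_BITSIZE_MODE_ANY_MODE / 32, the
packing order is assumed, and unsigned long is used instead of long so the
shifts are well defined.  Note that in standard C an initialiser on an array
declared with empty brackets would size it to one element, so the sketch
keeps the explicit size.

```c
#include <assert.h>
#include <string.h>

#define N_WORDS 4  /* hypothetical stand-in for MAX_BITSIZE_MODE_ANY_MODE / 32 */

/* Pack NBYTES bytes into 32-bit chunks of an array zeroed by an
   initialiser, and return the first chunk.  */
unsigned long
pack_with_initializer (const unsigned char *bytes, unsigned int nbytes)
{
  unsigned long tmp[N_WORDS] = { 0 };  /* all elements start at zero */

  for (unsigned int i = 0; i < nbytes && i < 4 * N_WORDS; i++)
    tmp[i / 4] |= (unsigned long) bytes[i] << ((i % 4) * 8);
  return tmp[0];
}

/* Same packing, but the array is zeroed with memset.  */
unsigned long
pack_with_memset (const unsigned char *bytes, unsigned int nbytes)
{
  unsigned long tmp[N_WORDS];

  memset (tmp, 0, sizeof (tmp));
  for (unsigned int i = 0; i < nbytes && i < 4 * N_WORDS; i++)
    tmp[i / 4] |= (unsigned long) bytes[i] << ((i % 4) * 8);
  return tmp[0];
}

/* A 16-bit payload leaves the upper bits of the first chunk at zero
   only because the array was cleared before ORing.  */
void
check_packing (void)
{
  const unsigned char half[2] = { 0x00, 0x3c };

  assert (pack_with_initializer (half, 2) == 0x3c00ul);
  assert (pack_with_memset (half, 2) == 0x3c00ul);
}
```

Both forms clear the whole array; without either, a payload narrower than
32 bits would be ORed into whatever the stack happened to contain.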

Thanks,
Kyrill


Thanks,
Andrew


Thanks,
Andrew Pinski



Thanks,
Kyrill

2016-10-06  Kyrylo Tkachov  

 * simplify-rtx.c (simplify_immed_subreg): Zero-initialize tmp array
 before merging in bytes to pass down to real_from_target.




Re: [PATCH] Fix PR77826

2016-10-10 Thread Marc Glisse

On Fri, 7 Oct 2016, Richard Biener wrote:


On Thu, 6 Oct 2016, Marc Glisse wrote:


On Wed, 5 Oct 2016, Richard Biener wrote:


The following will fix PR77826, the issue that in match.pd matching
up two things uses operand_equal_p which is too lax about the type
of the toplevel entity (at least for integer constants).

Bootstrap / regtest pending on x86_64-unknown-linux-gnu.


The following is what I have applied.

Richard.

2016-10-05  Richard Biener  

PR middle-end/77826
* genmatch.c (dt_operand::gen_match_op): Amend operand_equal_p
with types_match for GIMPLE code gen to handle type mismatched
constants properly.


I don't understand the disparity between generic and gimple here. Why let
(char)1 and (long)1 match in generic but not in gimple? And there are probably
several transformations in match.pd that could do with an update if constants
don't match anymore. Or did I misunderstand what the patch does?


The disparity is mostly that with GENERIC unfolded trees such as (char)1
are a bug while in GIMPLE the fact that the match.pd machinery does
valueization makes those a "feature" we have to deal with.  Originally
I've restricted GENERIC as well but with its types_match_p implementation
it resulted in too many missed matches.


I shouldn't have written (long)1, I meant the fact that 1 (as a char 
constant) and 1 (as a long constant) will now be matching captures in 
generic and not in gimple. If we are going in the direction of not 
matching constants of different types, I'd rather do it consistently and 
update the patterns as needed to avoid the missed optimizations. The 
missed matches exist in gimple as well, and generic optimization seems 
less important than gimple to me.


An example that regressed at -O (looking at the .optimized dump)

int f(int a, unsigned b){
  a &= 1;
  b &= 1;
  return a&b;
}



If we stick to the old behavior, maybe we could have some genmatch magic 
to help with the constant capture weirdness. With matching captures, we 
could select which operand (among those supposed to be equivalent) is 
actually captured more cleverly, either with an explicit marker, or by 
giving priority to the one that is not immediately below convert? in the 
pattern.


And if we move to stricter matching, maybe genmatch could be updated so 
convert could also match integer constants in some cases.



I agree that some transforms would need updates - I've actually tried
to implement a warning for genmatch whenever seeing a match with
(T)@0 but there isn't any good existing place to sneak that in.




* match.pd ((X /[ex] A) * A -> X): Properly handle converted
and constant A.


This regressed
int f(int*a,int*b){return 4*(int)(b-a);}


This is because (int)(b-a) could be a truncation in which case
multiplying with 4 might not result in the same value as
b-a truncated(?).  The comment before the unpatched patterns
said "sign-changing conversions" but nothing actually verified this.
Might be that truncations are indeed ok now that I think about it.
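
The truncation question raised above can be explored with a small C sketch.
It uses unsigned types so that wraparound is well defined, which is an
assumption of the sketch and not a statement about the GIMPLE types
involved.  The function names are hypothetical.  When A divides X exactly,
truncating the quotient and multiplying back gives the same 32-bit value as
truncating X directly, because both are computed modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: exact division in 64 bits, truncation to 32 bits,
   then multiplication back.  The caller guarantees x % a == 0.  */
uint32_t
truncated_exact_div_then_mul (uint64_t x, uint32_t a)
{
  uint32_t q = (uint32_t) (x / a);
  return q * a;  /* wraps modulo 2^32 */
}

/* The simplified form: just truncate X.  */
uint32_t
truncate_only (uint64_t x)
{
  return (uint32_t) x;
}

void
check_truncation_commutes (void)
{
  const uint64_t samples[] = { 8u, 0x100000008ull, 0xFFFFFFFF00000000ull };

  for (unsigned int i = 0; i < sizeof (samples) / sizeof (samples[0]); i++)
    assert (truncated_exact_div_then_mul (samples[i], 4)
            == truncate_only (samples[i]));
}
```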


2015-05-22  Marc Glisse  

PR tree-optimization/63387
* match.pd ((X /[ex] A) * A -> X): Remove unnecessary condition.

Apparently I forgot to remove the comment at that time :-(


Btw, do you have a better suggestion as to how to handle the original
issue rather than not relying on operand_equal_p for constants?


In previous cases, in order to get the right version of a matching 
capture, we used non-matching captures and an explicit call to 
operand_equal_p, for instance:


/* X - (X / Y) * Y is the same as X % Y.  */
(simplify
 (minus (convert1? @2) (convert2? (mult:c (trunc_div @0 @1) @1)))
 /* We cannot use matching captures here, since in the case of
    constants we really want the type of @0, not @2.  */
 (if (operand_equal_p (@0, @2, 0)
      && (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)))
  (convert (trunc_mod @0 @1))))

That seems to match the solutions that Jakub and you were discussing in 
the PR, something like it should work (we can discuss the exact code).
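
For reference, the arithmetic identity that this pattern encodes can be
checked in plain C, where integer division truncates toward zero.  The
function name below is hypothetical, and the sketch says nothing about the
type-matching question itself.

```c
#include <assert.h>

/* Hypothetical helper: compute X % Y as X - (X / Y) * Y.
   Requires y != 0 and no overflow in x / y.  */
int
mod_via_div (int x, int y)
{
  return x - (x / y) * y;
}

void
check_mod_identity (void)
{
  assert (mod_via_div (7, 3) == 7 % 3);
  assert (mod_via_div (-7, 3) == -7 % 3);
  assert (mod_via_div (7, -3) == 7 % -3);
}
```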


I don't know if it is better. The old behavior of matching captures with 
inconsistent types was confusing. A behavior with strict matching may 
complicate things (will we duplicate patterns, or write operand_equal_p 
explicitly to mimic the old behavior?). The recent inconsistency between 
generic and gimple doesn't appeal to me much...


--
Marc Glisse


[Ada] Fix wrong code with biased subtype

2016-10-10 Thread Eric Botcazou
This is a regression present on all active branches for a subtype of a biased 
type declared with explicit constraints, which is itself biased.  The compiler 
generates code that computes a wrong value for the conversion to an integer.

Tested on x86_64-suse-linux, applied on all active branches.


2016-10-10  Eric Botcazou  

* gcc-interface/utils.c (convert): For a biased input type, convert
the bias itself to the base type before adding it.


2016-10-10  Eric Botcazou  

* gnat.dg/biased_subtype.adb: New test.

-- 
Eric Botcazou
Index: gcc-interface/utils.c
===
--- gcc-interface/utils.c	(revision 240890)
+++ gcc-interface/utils.c	(working copy)
@@ -4193,12 +4193,15 @@ convert (tree type, tree expr)
   return convert (type, unpadded);
 }
 
-  /* If the input is a biased type, adjust first.  */
+  /* If the input is a biased type, convert first to the base type and add
+ the bias.  Note that the bias must go through a full conversion to the
+ base type, lest it is itself a biased value; this happens for subtypes
+ of biased types.  */
   if (ecode == INTEGER_TYPE && TYPE_BIASED_REPRESENTATION_P (etype))
 return convert (type, fold_build2 (PLUS_EXPR, TREE_TYPE (etype),
    fold_convert (TREE_TYPE (etype), expr),
-   fold_convert (TREE_TYPE (etype),
-		 TYPE_MIN_VALUE (etype))));
+   convert (TREE_TYPE (etype),
+		TYPE_MIN_VALUE (etype))));
 
   /* If the input is a justified modular type, we need to extract the actual
  object before converting it to any other type with the exceptions of an
@@ -4502,7 +4505,12 @@ convert (tree type, tree expr)
 	  && (ecode == ARRAY_TYPE || ecode == UNCONSTRAINED_ARRAY_TYPE
 	  || (ecode == RECORD_TYPE && TYPE_CONTAINS_TEMPLATE_P (etype
 	return unchecked_convert (type, expr, false);
-  else if (TYPE_BIASED_REPRESENTATION_P (type))
+
+  /* If the output is a biased type, convert first to the base type and
+	 subtract the bias.  Note that the bias itself must go through a full
+	 conversion to the base type, lest it is a biased value; this happens
+	 for subtypes of biased types.  */
+  if (TYPE_BIASED_REPRESENTATION_P (type))
 	return fold_convert (type,
 			 fold_build2 (MINUS_EXPR, TREE_TYPE (type),
 	  convert (TREE_TYPE (type), expr),
-- { dg-do run }
-- { dg-options "-gnatws" }

procedure Biased_Subtype is

   CIM_Max_AA : constant := 9_999_999;
   CIM_Min_AA : constant := -999_999;

   type TIM_AA is range CIM_Min_AA..CIM_Max_AA + 1;
   for TIM_AA'Size use 24;

   subtype STIM_AA is TIM_AA range TIM_AA(CIM_Min_AA)..TIM_AA(CIM_Max_AA);

   SAA : STIM_AA := 1;

begin
   if Integer(SAA) /= 1 then
      raise Program_Error;
   end if;
end;
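
As background for the test above, here is a minimal C model of a biased
representation.  It is a sketch under stated assumptions, not the GNAT
implementation.  The bounds are taken from TIM_AA in the test, and
biased_encode and biased_decode are hypothetical names.  The stored value is
the logical value minus the lower bound, which is what lets the range fit in
24 bits.  Decoding adds the bound back, so the bound has to be an ordinary
base-type value at that point.

```c
#include <assert.h>
#include <stdint.h>

#define TIM_AA_MIN (-999999)   /* CIM_Min_AA in the test */
#define TIM_AA_MAX 10000000    /* CIM_Max_AA + 1 in the test */

/* Hypothetical model: stored form is the offset from the lower bound.  */
uint32_t
biased_encode (int32_t value)
{
  return (uint32_t) (value - TIM_AA_MIN);
}

/* Hypothetical model: add the (unbiased) lower bound back.  */
int32_t
biased_decode (uint32_t stored)
{
  return (int32_t) stored + TIM_AA_MIN;
}

void
check_biased_model (void)
{
  /* The whole range fits in 24 bits once biased.  */
  assert (biased_encode (TIM_AA_MAX) < (1u << 24));
  /* The value used by the test round-trips.  */
  assert (biased_encode (1) == 1000000u);
  assert (biased_decode (biased_encode (1)) == 1);
}
```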


Re: [AArch64][11/14] ARMv8.2-A FP16 testsuite selector

2016-10-10 Thread James Greenhalgh
On Thu, Jul 07, 2016 at 05:18:41PM +0100, Jiong Wang wrote:
> ARMv8.2-A adds support for scalar and vector FP16 instructions to ARM and
> AArch64. This patch adds support for testing code for AArch64 targets
> using the new instructions. It is based on the target-support code for
> ARMv8.2-A added for ARM (AArch32).

OK.

Thanks,
James

> gcc/testsuite/
> 2016-07-07  Matthew Wahab 
> Jiong Wang 
> 
> * target-supports.exp (add_options_for_arm_v8_2a_fp16_scalar):
> Mention AArch64 support.
> (add_options_for_arm_v8_2a_fp16_neon): Likewise.
> (check_effective_target_arm_v8_2a_fp16_scalar_ok_nocache): Support
> AArch64 targets.
> (check_effective_target_arm_v8_2a_fp16_neon_ok_nocache): Support
> AArch64 targets.
> (check_effective_target_arm_v8_2a_fp16_scalar_hw): Support AArch64
> targets.
> (check_effective_target_arm_v8_2a_fp16_neon_hw): Likewise.
> 



Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-10 Thread Richard Biener
On Sat, Oct 8, 2016 at 8:56 PM, Eric Botcazou  wrote:
> Hi,
>
> adding patterns for unsigned arithmetic overflow checking in a back-end can
> have unexpected fallout because of a latent GC issue: when they are present,
> GIMPLE optimization passes can create complex (math. sense) types at will by
> invoking build_complex_type.  Now build_complex_type goes through the type
> canonicalization hashtable, which is GC-ed, so its behavior depends on the
> actual collection points.
>
> The other type-building functions present in tree.c do the same so no big deal
> but build_complex_type is special because it also does:
>
>   /* We need to create a name, since complex is a fundamental type.  */
>   if (! TYPE_NAME (t))
>     {
>       const char *name;
>       if (component_type == char_type_node)
>         name = "complex char";
>       else if (component_type == signed_char_type_node)
>         name = "complex signed char";
>       else if (component_type == unsigned_char_type_node)
>         name = "complex unsigned char";
>       else if (component_type == short_integer_type_node)
>         name = "complex short int";
>       else if (component_type == short_unsigned_type_node)
>         name = "complex short unsigned int";
>       else if (component_type == integer_type_node)
>         name = "complex int";
>       else if (component_type == unsigned_type_node)
>         name = "complex unsigned int";
>       else if (component_type == long_integer_type_node)
>         name = "complex long int";
>       else if (component_type == long_unsigned_type_node)
>         name = "complex long unsigned int";
>       else if (component_type == long_long_integer_type_node)
>         name = "complex long long int";
>       else if (component_type == long_long_unsigned_type_node)
>         name = "complex long long unsigned int";
>       else
>         name = 0;
>
>       if (name != 0)
>         TYPE_NAME (t) = build_decl (UNKNOWN_LOCATION, TYPE_DECL,
>                                     get_identifier (name), t);
>     }
>
> so it creates a DECL node every time a new canonical complex type is created,
> bumping the DECL_UID counter in the process.  Which means that the DECL_UID
> counter is sensitive to the collection points, which in turn means that the
> result of algorithms depending on the DECL_UID counter also is.
>
> This for example resulted in a bootstrap comparison failure on a SPARC/Solaris
> machine doing a strict stage2/stage3 comparison because the contents of the
> .debug_loc section were different: location lists computed by var-tracking
> were slightly different because of a different hashing.
>
> I'm not sure whether the hashing done by var-tracking should be sensitive to
> the DECL_UID of nodes or not, but I think that having the DECL_UID counter
> depend on the collection points is highly undesirable, so the attached patch
> attempts to prevent it; it at least fixed the bootstrap comparison failure.

I believe the rule is that you might only depend on the order of objects
with respect to their DECL_UID, not the actual value of the DECL_UID.
As var-tracking shouldn't look at TYPE_DECLs (?) it's probably a latent
var-tracking bug as well.
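
The dependency described in the quoted message can be illustrated with a toy
model in C.  None of this is GCC code: the model_ names are hypothetical, a
single cache slot stands in for the GC-ed type hash table, and a plain
counter stands in for DECL_UID.  The point is only that the UID handed to a
later declaration changes depending on whether a collection emptied the
cache in between.

```c
#include <assert.h>

unsigned int model_next_uid;    /* stand-in for the DECL_UID counter */
int model_cache_valid;          /* nonzero while the cached type survives */
unsigned int model_cached_uid;  /* UID of the cached type's name decl */

/* Model a collection that drops the otherwise unreferenced cache entry.  */
void
model_collect (void)
{
  model_cache_valid = 0;
}

/* Model a named type build: a cache miss creates a name decl and
   therefore consumes a UID; a cache hit consumes nothing.  */
unsigned int
model_build_named_type (void)
{
  if (!model_cache_valid)
    {
      model_cached_uid = model_next_uid++;
      model_cache_valid = 1;
    }
  return model_cached_uid;
}

/* Build the type twice, optionally collecting in between, then return
   the UID an unrelated new declaration would receive.  */
unsigned int
model_run (int collect_in_between)
{
  model_next_uid = 0;
  model_cache_valid = 0;

  model_build_named_type ();
  if (collect_in_between)
    model_collect ();
  model_build_named_type ();
  return model_next_uid++;
}

void
check_model (void)
{
  assert (model_run (0) == 1);  /* second build hit the cache */
  assert (model_run (1) == 2);  /* second build consumed another UID */
}
```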

> Tested on x86_64-suse-linux, OK for the mainline?

I'd prefer the named parameter to be defaulted to false and the few
places in the FEs fixed (eventually that name business should be
handled like names for nodes like integer_type_node -- I see no
reason why build_complex_type should have this special-case at all!
That is, why are the named variants in the type hash in the first place?)

Richard.

>
> 2016-10-08  Eric Botcazou  
>
> * tree.h (build_complex_type): Add second parameter with default.
> * builtins.c (expand_builtin_cexpi): Pass false in call to above.
> (fold_builtin_sincos): Likewise.
> (fold_builtin_arith_overflow): Likewise.
> * gimple-fold.c (fold_builtin_atomic_compare_exchange): Likewise.
> (gimple_fold_call): Likewise.
> * stor-layout.c (bitwise_type_for_mode): Likewise.
> * tree-ssa-dce.c (maybe_optimize_arith_overflow): Likewise.
> * tree-ssa-math-opts.c (match_uaddsub_overflow): Likewise.
> * tree.c (build_complex): Likewise.
> (build_complex_type): Add NAMED second parameter and adjust recursive
> call.  Create a TYPE_DECL only if NAMED is true.
>
> --
> Eric Botcazou


Re: [VRP] Allocate bitmap before copying

2016-10-10 Thread Richard Biener
On Sat, Oct 8, 2016 at 9:34 PM, kugan  wrote:
> Hi,
>
> In vrp_intersect_ranges_1, when !vr0->equiv, we are copying vr1->equiv
> without allocating bitmap. This patch fixes this.
>
> Bootstrap and regression testing are ongoing. Is this OK if no new
> regressions?

Ok for trunk and branches.

Richard.

> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
> 2016-10-09  Kugan Vivekanandarajah  
>
> * tree-vrp.c (vrp_intersect_ranges_1): Allocate bitmap before
>   copying.


Re: [RFC][VRP] Improve intersect_ranges

2016-10-10 Thread Richard Biener
On Sat, Oct 8, 2016 at 9:38 PM, kugan  wrote:
> Hi Richard,
>
> Thanks for the review.
> On 07/10/16 20:11, Richard Biener wrote:
>>
>> On Fri, Oct 7, 2016 at 12:00 AM, kugan
>>  wrote:
>>>
>>> Hi,
>>>
>>> In vrp intersect_ranges, Richard recently changed it to create integer
>>> value
>>> ranges when it is integer singleton.
>>>
>>> Maybe we should do the same when the other range is a complex ranges with
>>> SSA_NAME (like [x+2, +INF])?
>>>
>>> Attached patch tries to do this. There are cases where it will be
>>> beneficial
>>> as the  testcase in the patch. (For this testcase to work with Early VRP,
>>> we
>>> need the patch posted at
>>> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00413.html)
>>>
>>> Bootstrapped and regression tested on x86_64-linux-gnu with no new
>>> regressions.
>>
>>
>> This is not clearly a win, in fact it can completely lose an ASSERT_EXPR
>> because there is no way to add its effect back as an equivalence.  The
>> current choice of always using the "left" keeps the ASSERT_EXPR range
>> and is able to record the other range via an equivalence.
>
>
> How about changing the order in Early VRP when we are dealing with the same
> SSA_NAME in inner and outer scope. Here is a patch that does this. Is this
> OK if no new regressions?

I'm not sure if this is a good way forward.  The failure with the testcase is
that we don't extract a range for k from if (j < k) which I believe another
patch from you addresses?

As said the issue is with the equivalence / value-range representation so
you can't do sth like

  /* Discover VR when condition is true.  */
  extract_range_for_var_from_comparison_expr (op0, code, op0, op1, &vr);
  if (old_vr->type == VR_RANGE || old_vr->type == VR_ANTI_RANGE)
    vrp_intersect_ranges (&vr, old_vr);

  /* If we found any usable VR, set the VR to ssa_name and create a
     PUSH old value in the stack with the old VR.  */
  if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
    {
      new_vr = vrp_value_range_pool.allocate ();
      *new_vr = vr;
      push_value_range (op0, new_vr);
      ->>>  add equivalence to old_vr for new_vr.

because old_vr and new_vr are the 'same' (they are associated with SSA name op0)

Richard.

> Thanks,
> Kugan
>
>
>
>
>
>> My thought on this was that we need to separate "ranges" and associated
>> SSA names so we can introduce new ranges w/o the need for an SSA name
>> (and thus we can create an equivalence to the ASSERT_EXPR range).
>> IIRC I started on this at some point but never finished it ...
>>
>> Richard.
>>
>>> Thanks,
>>> Kugan
>>>
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2016-10-07  Kugan Vivekanandarajah  
>>>
>>> * gcc.dg/tree-ssa/evrp6.c: New test.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2016-10-07  Kugan Vivekanandarajah  
>>>
>>> * tree-vrp.c (intersect_ranges): If we failed to handle
>>> the intersection and the other range involves computation with
>>> symbolic values, choose integer range if available.
>>>
>>>
>>>
>


[Ada] Fix inter-unit inlining failure

2016-10-10 Thread Eric Botcazou
This is a regression present on the mainline and 6 branch: the compiler fails 
to inline across units a function declared with pragma Inline_Always because 
the middle-end detects a type mismatch for an argument, after gimplification 
removed a conversion.  The fix is to make the conversion more robust.

Tested on x86_64-suse-linux, applied on mainline and 6 branch.


2016-10-10  Eric Botcazou  

* gcc-interface/utils2.c (find_common_type): Do not return the LHS
type if it's an array with non-constant lower bound and the RHS type
is an array with a constant one.


2016-10-10  Eric Botcazou  

* gnat.dg/inline13.ad[sb]: New test.
* gnat.dg/inline13_pkg.ad[sb]: New helper.

-- 
Eric Botcazou

-- { dg-do compile }
-- { dg-options "-O -gnatn" }

package body Inline13 is

  function F (L : Arr) return String is
    Local : Arr (1 .. L'Length);
    Ret : String (1 .. L'Length);
    Pos : Natural := 1;
  begin
    Local (1 .. L'Length) := L;
    for I in 1 .. Integer (L'Length) loop
      Ret (Pos .. Pos + 8) := " " & Inline13_Pkg.Padded (Local (I));
      Pos := Pos + 9;
    end loop;
    return Ret;
  end;

end Inline13;

with Inline13_Pkg;

package Inline13 is

  type Arr is array (Positive range <>) of Inline13_Pkg.T;

  function F (L : Arr) return String;

end Inline13;

package body Inline13_Pkg is

  function Padded (Value : T) return Padded_T is
  begin
    return Padded_T(Value);
  end Padded;

end Inline13_Pkg;

package Inline13_Pkg is

  subtype Padded_T is String (1..8);

  type T is new Padded_T;

  function Padded (Value : T) return Padded_T;
  pragma Inline_Always (Padded);

end Inline13_Pkg;
Index: gcc-interface/utils2.c
===
--- gcc-interface/utils2.c	(revision 240890)
+++ gcc-interface/utils2.c	(working copy)
@@ -215,27 +215,40 @@ find_common_type (tree t1, tree t2)
  calling into build_binary_op), some others are really expected and we
  have to be careful.  */
 
+  const bool variable_record_on_lhs
+    = (TREE_CODE (t1) == RECORD_TYPE
+       && TREE_CODE (t2) == RECORD_TYPE
+       && get_variant_part (t1)
+       && !get_variant_part (t2));
+
+  const bool variable_array_on_lhs
+    = (TREE_CODE (t1) == ARRAY_TYPE
+       && TREE_CODE (t2) == ARRAY_TYPE
+       && !TREE_CONSTANT (TYPE_MIN_VALUE (TYPE_DOMAIN (t1)))
+       && TREE_CONSTANT (TYPE_MIN_VALUE (TYPE_DOMAIN (t2))));
+
   /* We must avoid writing more than what the target can hold if this is for
  an assignment and the case of tagged types is handled in build_binary_op
  so we use the lhs type if it is known to be smaller or of constant size
  and the rhs type is not, whatever the modes.  We also force t1 in case of
  constant size equality to minimize occurrences of view conversions on the
- lhs of an assignment, except for the case of record types with a variant
- part on the lhs but not on the rhs to make the conversion simpler.  */
+ lhs of an assignment, except for the case of types with a variable part
+ on the lhs but not on the rhs to make the conversion simpler.  */
   if (TREE_CONSTANT (TYPE_SIZE (t1))
   && (!TREE_CONSTANT (TYPE_SIZE (t2))
 	  || tree_int_cst_lt (TYPE_SIZE (t1), TYPE_SIZE (t2))
 	  || (TYPE_SIZE (t1) == TYPE_SIZE (t2)
-	  && !(TREE_CODE (t1) == RECORD_TYPE
-		   && TREE_CODE (t2) == RECORD_TYPE
-		   && get_variant_part (t1)
-		   && !get_variant_part (t2)))))
+	  && !variable_record_on_lhs
+	  && !variable_array_on_lhs)))
 return t1;
 
-  /* Otherwise, if the lhs type is non-BLKmode, use it.  Note that we know
- that we will not have any alignment problems since, if we did, the
- non-BLKmode type could not have been used.  */
-  if (TYPE_MODE (t1) != BLKmode)
+  /* Otherwise, if the lhs type is non-BLKmode, use it, except for the case of
+ a non-BLKmode rhs and array types with a variable part on the lhs but not
+ on the rhs to make sure the conversion is preserved during gimplification.
+ Note that we know that we will not have any alignment problems since, if
+ we did, the non-BLKmode type could not have been used.  */
+  if (TYPE_MODE (t1) != BLKmode
+  && (TYPE_MODE (t2) == BLKmode || !variable_array_on_lhs))
 return t1;
 
   /* If the rhs type is of constant size, use it whatever the modes.  At


Re: Compile-time improvement for if conversion.

2016-10-10 Thread Richard Biener
On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev  wrote:
> Hi All,
>
> Here is an implementation of Richard's proposal:
>
> < For general infrastructure it would be nice to expose a (post-)dominator
> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
> < what makes if-conversion expensive is the post-dom compute which happens
> < for each loop for the whole function.  It shouldn't be very difficult
> < to write this,
> < sharing as much as possible code with the current DOM code might need
> < quite some refactoring though.
>
> I implemented this proposal by adding calculation of dominance info
> for SESE regions and incorporating this change into the if-conversion pass.
> A SESE region is built by adding the loop pre-header and possibly a fake
> post-header block to the loop body. The fake post-header is deleted after
> predication is complete.
>
> Bootstrapping and regression testing did not show any new failures.
>
> Is it OK for trunk?

It's mostly reasonable but I have a few comments.  First, re-using
bb->dom[] for the dominator info is somewhat fragile but indeed
a requirement to make the patch reasonably small.  Please,
in calculate_dominance_info_for_region, make sure that
!dom_info_available_p (dir).

You pass loop * everywhere but require ->aux to be set up as
an array of BBs forming the region with special BBs at array ends.

Please instead pass in a vec<basic_block> which avoids using ->aux
and also allows other non-loop-based SESE regions to be used
(I couldn't spot anything that relies on this being a loop).

Adding a convenience wrapper for loop * would of course be nice,
to cover the special pre/post-header code in tree-if-conv.c.

In theory a SESE region is fully specified by its entry and exit _edge_,
so you might want to see if it's possible to use such a pair of edges
to guard the dfs/idom walks to avoid the need to create fake blocks.

Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
please use split_edge() of the entry/exit edges.
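
As a rough illustration of computing dominators for a region only, ignoring everything outside it, here is a minimal sketch. It assumes a toy CFG with at most 32 blocks stored as predecessor bitmasks; toy_cfg, region_dominators and the bitmask encoding are hypothetical and are not the dominance.c interface. It also uses a simple iterative set-intersection formulation, not the algorithm implemented in dominance.c.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BB 32

/* Hypothetical toy CFG: pred[b] is a bitmask of the predecessors of b.  */
struct toy_cfg
{
  int n_blocks;
  uint32_t pred[MAX_BB];
};

/* Compute dominator sets for the blocks in REGION (a bitmask), taking
   ENTRY as the region entry.  Predecessors outside REGION are ignored.
   On return dom[b] is a bitmask of the blocks that dominate b, or 0 for
   blocks outside the region.  */
static void
region_dominators (const struct toy_cfg *cfg, uint32_t region, int entry,
                   uint32_t dom[MAX_BB])
{
  assert (cfg->n_blocks <= MAX_BB);

  /* Start from "everything dominates everything" inside the region.  */
  for (int b = 0; b < cfg->n_blocks; b++)
    dom[b] = ((region >> b) & 1) ? region : 0;
  dom[entry] = UINT32_C (1) << entry;

  /* Iterate dom[b] = {b} | intersection of dom[p] over in-region preds.  */
  int changed = 1;
  while (changed)
    {
      changed = 0;
      for (int b = 0; b < cfg->n_blocks; b++)
        {
          if (b == entry || !((region >> b) & 1))
            continue;
          uint32_t meet = region;
          uint32_t preds = cfg->pred[b] & region;
          for (int p = 0; p < cfg->n_blocks; p++)
            if ((preds >> p) & 1)
              meet &= dom[p];
          uint32_t next = meet | (UINT32_C (1) << b);
          if (next != dom[b])
            {
              dom[b] = next;
              changed = 1;
            }
        }
    }
}
```

Post-dominators for the same region could be obtained by running the same loop over successor masks with the region exit as the starting block.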

Richard.

> ChangeLog:
> 2016-10-05  Yuri Rumyantsev  
>
> * dominance.c: Include cfgloop.h for loop recognition.
> (dom_info): Add new functions and add boolean argument to recognize
> computation for loop region.
> (dom_info::dom_info): New function.
> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
> handle unvisited blocks.
> (dom_info::calc_idoms): Likewise.
> (compute_dom_fast_query_in_region): New function.
> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
> false argument.
> (calculate_dominance_info_for_region): New function.
> (free_dominance_info_for_region): Likewise.
> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
> argument.
> * dominance.h: Add prototype for introduced functions
> calculate_dominance_info_for_region and
> free_dominance_info_for_region.
> * tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
> (build_sese_region): New function.
> (if_convertible_loop_p_1): Invoke local version of post-dominators
> calculation, free it after basic block predication and delete created
> fake post-header block if any.
> (tree_if_conversion): Delete call of free_dominance_info for
> post-dominators, free ifc_sese_bbs which represents SESE region.
> (pass_if_conversion::execute): Delete detection of infinite loops
> and fake edges to exit block since post-dominator calculation is
> performed per if-converted loop only.


Re: [PATCH, PR77558] Remove RECORD_TYPE special-casing in std_canonical_va_list_type

2016-10-10 Thread Richard Biener
On Sun, Sep 25, 2016 at 11:08 AM, Tom de Vries  wrote:
> Hi,
>
> this patch fixes PR77558, an ice-on-invalid-code 6/7 regression.
>
> The fix for PR71602 introduced the invalid-code test-case
> c-c++-common/va-arg-va-list-type.c:
> ...
> __builtin_va_list *pap;
>
> void
> fn1 (void)
> {
>   __builtin_va_arg (pap, double); /* { dg-error "first argument to 'va_arg'
> not of type 'va_list'" } */
> }
> ...
>
> The test-case passes for x86_64, but fails for aarch64 and ICEs for arm.
>
> The ICE happens because the patch for PR71602 is incomplete. The patch tries
> to be more strict about returning a canonical va_list only for actual
> va_lists, but doesn't implement this for structure va_list types, such as we
> have for arm, aarch64 and alpha.
>
> This patch adds the missing part, and fixes the ICE.
>
> OK for trunk, 6-branch?

Ok.

Richard.

> Thanks,
> - Tom


Re: [AArch64][12/14] ARMv8.2-A testsuite for new data movement intrinsics

2016-10-10 Thread James Greenhalgh
On Thu, Jul 07, 2016 at 05:19:09PM +0100, Jiong Wang wrote:
> This patch contains testcases for those new scalar intrinsics which are only
> available for AArch64.

OK.

Thanks,
James

> 
> gcc/testsuite/
> 2016-07-07  Jiong Wang 
> 
> * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
> (FP16_SUPPORTED): Enable AArch64.
> * gcc.target/aarch64/advsimd-intrinsics/vdup_lane.c: Add support for
> vdup*_laneq.
> * gcc.target/aarch64/advsimd-intrinsics/vduph_lane.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vtrn_half.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vuzp_half.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vzip_half.c: New.
> 



Re: PING: [PATCH] Be more conservative in early inliner if FDO is enabled

2016-10-10 Thread Richard Biener
On Mon, Oct 10, 2016 at 4:23 AM, Yuan, Pengfei  wrote:
> Hi,
>
> What is the decision on this patch?
> https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01041.html

Honza approved the patch already.

Richard.

> Regards,
> Yuan, Pengfei
>
>> A new patch for trunk is attached.
>>
>> Regards,
>> Yuan, Pengfei
>>
>>
>> 2016-09-16  Yuan Pengfei  
>>
>>   * doc/invoke.texi (--param early-inlining-insns-feedback): New.
>>   * ipa-inline.c (want_early_inline_function_p): Use
>>   PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
>>   * params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
>>   (PARAM_EARLY_INLINING_INSNS): Change help string accordingly.
>>
>>
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 8eb5eff..6e7659a 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -9124,12 +9124,18 @@ given call expression.  This parameter limits 
>> inlining only to call expressions
>>  whose probability exceeds the given threshold (in percents).
>>  The default value is 10.
>>
>>  @item early-inlining-insns
>> +@itemx early-inlining-insns-feedback
>>  Specify growth that the early inliner can make.  In effect it increases
>>  the amount of inlining for code having a large abstraction penalty.
>>  The default value is 14.
>>
>> +The @option{early-inlining-insns-feedback} parameter is used only when
>> +profile feedback-directed optimizations are enabled (by
>> +@option{-fprofile-generate} or @option{-fprofile-use}).
>> +The default value is 2.
>> +
>>  @item max-early-inliner-iterations
>>  Limit of iterations of the early inliner.  This basically bounds
>>  the number of nested indirect calls the early inliner can resolve.
>>  Deeper chains are still handled by late inlining.
>> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
>> index 5c9366a..e028c08 100644
>> --- a/gcc/ipa-inline.c
>> +++ b/gcc/ipa-inline.c
>> @@ -594,10 +594,17 @@ want_early_inline_function_p (struct cgraph_edge *e)
>>  }
>>else
>>  {
>>int growth = estimate_edge_growth (e);
>> +  int growth_limit;
>>int n;
>>
>> +  if ((profile_arc_flag && !flag_test_coverage)
>> +   || (flag_branch_probabilities && !flag_auto_profile))
>> + growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_FEEDBACK);
>> +  else
>> + growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS);
>> +
>>if (growth <= 0)
>>   ;
>>else if (!e->maybe_hot_p ()
>>  && growth > 0)
>> @@ -610,9 +617,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>>xstrdup_for_dump (callee->name ()), callee->order,
>>growth);
>> want_inline = false;
>>   }
>> -  else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
>> +  else if (growth > growth_limit)
>>   {
>> if (dump_file)
>>   fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>>"growth %i exceeds --param early-inlining-insns\n",
>> @@ -622,9 +629,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>>growth);
>> want_inline = false;
>>   }
>>else if ((n = num_calls (callee)) != 0
>> -&& growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
>> +&& growth * (n + 1) > growth_limit)
>>   {
>> if (dump_file)
>>   fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>>"growth %i exceeds --param early-inlining-insns "
>> diff --git a/gcc/params.def b/gcc/params.def
>> index 79b7dd4..91ea513 100644
>> --- a/gcc/params.def
>> +++ b/gcc/params.def
>> @@ -199,12 +199,20 @@ DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
>>  DEFPARAM(PARAM_IPCP_UNIT_GROWTH,
>>"ipcp-unit-growth",
>>"How much can given compilation unit grow because of the 
>> interprocedural constant propagation (in percent).",
>>10, 0, 0)
>> -DEFPARAM(PARAM_EARLY_INLINING_INSNS,
>> -  "early-inlining-insns",
>> -  "Maximal estimated growth of function body caused by early inlining 
>> of single call.",
>> -  14, 0, 0)
>> +DEFPARAM (PARAM_EARLY_INLINING_INSNS_FEEDBACK,
>> +   "early-inlining-insns-feedback",
>> +   "Maximal estimated growth of function body caused by early "
>> +   "inlining of single call.  Used when profile feedback-directed "
>> +   "optimizations are enabled.",
>> +   2, 0, 0)
>> +DEFPARAM (PARAM_EARLY_INLINING_INSNS,
>> +   "early-inlining-insns",
>> +   "Maximal estimated growth of function body caused by early "
>> +   "inlining of single call.  Used when profile feedback-directed "
>> +   "optimizations are not enabled.",
>> +   14, 0, 0)
>>  DEFPARAM(PARAM_LARGE_STACK_FRAME,
>>"large-stack-frame",
>>"The size of stack frame to be considered large.",
>>256, 0, 0)
>
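
To make the effect of the two parameters concrete, here is a minimal stand-alone model of the growth check in the quoted hunk. It is a sketch only: the helper names are hypothetical, the constants are the default values given in the patch (14, and 2 with profile feedback), and the hotness test on the call edge is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Default values taken from the quoted params.def hunk.  */
enum
{
  EARLY_INLINING_INSNS = 14,
  EARLY_INLINING_INSNS_FEEDBACK = 2
};

/* Pick the growth limit depending on whether FDO is in use.  */
static int
early_growth_limit (bool fdo_enabled)
{
  return fdo_enabled ? EARLY_INLINING_INSNS_FEEDBACK : EARLY_INLINING_INSNS;
}

/* Return true if GROWTH is acceptable for a callee that itself contains
   N_CALLS calls.  Mirrors the two limit comparisons in the patch; the
   cold-edge check is deliberately omitted from this sketch.  */
static bool
growth_within_limit (int growth, int n_calls, bool fdo_enabled)
{
  assert (n_calls >= 0);
  int limit = early_growth_limit (fdo_enabled);

  if (growth <= 0)
    return true;
  if (growth > limit)
    return false;
  if (n_calls != 0 && growth * (n_calls + 1) > limit)
    return false;
  return true;
}
```

With the defaults, a callee with an estimated growth of 10 passes the check in a normal build but is rejected once profile feedback is enabled.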


Re: [AArch64][13/14] ARMv8.2-A testsuite for new vector intrinsics

2016-10-10 Thread James Greenhalgh
On Thu, Jul 07, 2016 at 05:19:25PM +0100, Jiong Wang wrote:
> This patch contains testcases for those new vector intrinsics which are only
> available for AArch64.


OK.

Thanks,
James

> gcc/testsuite/
> 2016-07-07  Jiong Wang 
> 
> * gcc.target/aarch64/advsimd-intrinsics/vdiv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vfmas_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vfmas_n_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmaxnmv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmaxv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vminnmv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vminv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmul_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulx_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulx_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulx_n_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vpminmaxnm_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrndi_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vsqrt_f16_1.c: New.
> 




Re: [AArch64][14/14] ARMv8.2-A testsuite for new scalar intrinsics

2016-10-10 Thread James Greenhalgh
On Thu, Jul 07, 2016 at 05:19:37PM +0100, Jiong Wang wrote:
> This patch contains testcases for those new scalar intrinsics which are only
> available for AArch64.

OK.

Thanks,
James

> gcc/testsuite/
> 2016-07-07  Jiong Wang 
> 
> * gcc.target/aarch64/advsimd-intrinsics/unary_scalar_op.inc:
> Support FMT64.
> * gcc.target/aarch64/advsimd-intrinsics/vabdh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcageh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcagth_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcaleh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcalth_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vceqh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vceqzh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcgeh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcgezh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcgth_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcgtzh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcleh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vclezh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vclth_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcltzh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtah_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtah_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtah_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtah_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_s16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_s64_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_u16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_u64_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_s16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_s64_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_u16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_u64_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtmh_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtmh_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtmh_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtmh_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtnh_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtnh_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtnh_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtnh_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtph_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtph_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtph_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtph_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vfmash_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmaxh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vminh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulh_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulxh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulxh_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrecpeh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrecpsh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrecpxh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrsqrteh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrsqrtsh_f16_1.c: New.




[Ada] Fix type checking failure with pragma Volatile_Full_Access

2016-10-10 Thread Eric Botcazou
The problem is that we put an alias set on a variant that is not the main one.

Tested on x86_64-suse-linux, applied on the mainline.


2016-10-10  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity): Put volatile qualifier
on types at the very end of the processing.
(gnat_to_gnu_param): Remove redundant test.
(change_qualified_type): Do nothing for unconstrained array types.


2016-10-10  Eric Botcazou  

* gnat.dg/specs/vfa.ads: New test.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 240890)
+++ gcc-interface/decl.c	(working copy)
@@ -4728,14 +4728,6 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  && AGGREGATE_TYPE_P (gnu_type)
 	  && TYPE_BY_REFERENCE_P (gnu_type))
 	SET_TYPE_MODE (gnu_type, BLKmode);
-
-	  if (Treat_As_Volatile (gnat_entity))
-	{
-	  const int quals
-		= TYPE_QUAL_VOLATILE
-		  | (Is_Atomic_Or_VFA (gnat_entity) ? TYPE_QUAL_ATOMIC : 0);
-	  gnu_type = change_qualified_type (gnu_type, quals);
-	}
 	}
 
   /* If this is a derived type, relate its alias set to that of its parent
@@ -4816,6 +4808,14 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 			 ? ALIAS_SET_COPY : ALIAS_SET_SUPERSET);
 	}
 
+  if (Treat_As_Volatile (gnat_entity))
+	{
+	  const int quals
+	= TYPE_QUAL_VOLATILE
+	  | (Is_Atomic_Or_VFA (gnat_entity) ? TYPE_QUAL_ATOMIC : 0);
+	  gnu_type = change_qualified_type (gnu_type, quals);
+	}
+
   if (!gnu_decl)
 	gnu_decl = create_type_decl (gnu_entity_name, gnu_type,
  artificial_p, debug_info_p,
@@ -5386,12 +5386,9 @@ gnat_to_gnu_param (Entity_Id gnat_param,
 }
 
   /* If this is a read-only parameter, make a variant of the type that is
- read-only.  ??? However, if this is an unconstrained array, that type
- can be very complex, so skip it for now.  Likewise for any other
- self-referential type.  */
-  if (ro_param
-  && TREE_CODE (gnu_param_type) != UNCONSTRAINED_ARRAY_TYPE
-  && !CONTAINS_PLACEHOLDER_P (TYPE_SIZE (gnu_param_type)))
+ read-only.  ??? However, if this is a self-referential type, the type
+ can be very complex, so skip it for now.  */
+  if (ro_param && !CONTAINS_PLACEHOLDER_P (TYPE_SIZE (gnu_param_type)))
 gnu_param_type = change_qualified_type (gnu_param_type, TYPE_QUAL_CONST);
 
   /* For foreign conventions, pass arrays as pointers to the element type.
@@ -6254,6 +6251,10 @@ gnu_ext_name_for_subprog (Entity_Id gnat
 static tree
 change_qualified_type (tree type, int type_quals)
 {
+  /* Qualifiers must be put on the associated array type.  */
+  if (TREE_CODE (type) == UNCONSTRAINED_ARRAY_TYPE)
+return type;
+
   return build_qualified_type (type, TYPE_QUALS (type) | type_quals);
 }
 
-- { dg-do compile }
-- { dg-options "-g" }

package VFA is

  type Rec is record
    A : Short_Integer;
    B : Short_Integer;
  end record;

  type Rec_VFA is new Rec;
  pragma Volatile_Full_Access (Rec_VFA);

end VFA;


Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Richard Biener
On Mon, 10 Oct 2016, Kyrill Tkachov wrote:

> Hi Richard,
> 
> As I mentioned, here is the patch applying to the main store merging patch to
> re-implement encode_tree_to_bitpos
> to operate on the bytes directly.
> 
> This works fine on little-endian but breaks on big-endian, even for merging
> bitfields within a single byte.
> Consider the code snippet from gcc.dg/store_merging_6.c:
> 
> struct bar {
>   int a : 3;
>   unsigned char b : 4;
>   unsigned char c : 1;
>   char d;
>   char e;
>   char f;
>   char g;
> };
> 
> void
> foo1 (struct bar *p)
> {
>   p->b = 3;
>   p->a = 2;
>   p->c = 1;
>   p->d = 4;
>   p->e = 5;
> }
> 
> The correct GIMPLE for these merged stores on big-endian is:
>   MEM[(voidD.49 *)p_2(D)] = 18180;
>   MEM[(charD.8 *)p_2(D) + 2B] = 5;
> 
> whereas with this patch we emit:
>   MEM[(voidD.49 *)p_2(D)] = 39428;
>   MEM[(charD.8 *)p_2(D) + 2B] = 5;
> 
> The dump for merging the individual stores without this patch (using the
> correct but costly wide_int approach in the base patch) is:
> After writing 3 of size 4 at position 3 the merged region contains:
> 6 0 0 0 0 0
> After writing 2 of size 3 at position 0 the merged region contains:
> 46 0 0 0 0 0
> After writing 1 of size 1 at position 7 the merged region contains:
> 47 0 0 0 0 0
> After writing 4 of size 8 at position 8 the merged region contains:
> 47 4 0 0 0 0
> After writing 5 of size 8 at position 16 the merged region contains:
> 47 4 5 0 0 0
> 
> 
> And with this patch it is:
> After writing 3 of size 4 at position 3 the merged region contains:
> 18 0 0 0 0 0
> After writing 2 of size 3 at position 0 the merged region contains:
> 1a 0 0 0 0 0
> After writing 1 of size 1 at position 7 the merged region contains:
> 9a 0 0 0 0 0
> After writing 4 of size 8 at position 8 the merged region contains:
> 9a 4 0 0 0 0
> After writing 5 of size 8 at position 16 the merged region contains:
> 9a 4 5 0 0 0
> 
> (Note the dump just dumps the byte array from index 0 to  so the first
> thing printed is the lowest numbered byte.
> Also, each byte is dumped in hex.)
> 
> The code as included here doesn't do any byte swapping for big-endian but as
> seen from the dump even writing a sub-byte
> bitfield goes wrong so it would be nice to resolve that before going forward.
> Any help with debugging this is hugely appreciated. I've included an ASCII
> diagram of the steps in the algorithm
> in the patch itself.
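
The first-byte values in the two dumps above can be reproduced with a small stand-alone sketch that places a bitfield value into a byte under either bit numbering. The helper name place_bits and its bits_big_endian flag are hypothetical; this is not code from the store merging patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Insert the low SIZE bits of VALUE into BYTE at bit position POS.
   With BITS_BIG_ENDIAN false, position 0 is the least significant bit;
   with it true, position 0 is the most significant bit.  */
static unsigned char
place_bits (unsigned char byte, unsigned int value, unsigned int size,
            unsigned int pos, bool bits_big_endian)
{
  assert (size >= 1 && pos + size <= CHAR_BIT);

  unsigned int mask = (1u << size) - 1;
  unsigned int shift = bits_big_endian ? CHAR_BIT - pos - size : pos;

  return (unsigned char) ((byte & ~(mask << shift))
                          | ((value & mask) << shift));
}
```

Applying the three sub-byte stores from foo1 (3 of size 4 at position 3, 2 of size 3 at position 0, 1 of size 1 at position 7) gives 0x18, 0x1a, 0x9a when position 0 is the least significant bit and 0x06, 0x46, 0x47 when it is the most significant bit, which matches the two dumps. Likewise 18180 is 0x4704 and 39428 is 0x9a04.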

Ah, I think you need to account for BITS_BIG_ENDIAN in 
shift_bytes_in_array.  You have to shift towards MSB which means changing
left to right shifts for BITS_BIG_ENDIAN.

You also seem to fail to account for amnt / BITS_PER_UNIT != 0.
Independently of BYTES_BIG_ENDIAN it would be

  ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
...

(so best use a single load / store and operate on a temporary).
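
A minimal sketch of such a shift, handling a shift amount of a byte or more and computing each result byte in a temporary, might look as follows. It is an assumption-laden illustration: shift_buffer_left is a hypothetical name, the buffer is treated as little-endian with byte 0 least significant, CHAR_BIT stands in for BITS_PER_UNIT, and the BITS_BIG_ENDIAN direction change is not modeled.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Shift the SZ-byte buffer PTR left by AMNT bits, i.e. towards
   higher-numbered bytes, discarding bits shifted past the end.
   Bytes are processed from the top down so the shift can be done
   in place.  */
static void
shift_buffer_left (unsigned char *ptr, size_t sz, unsigned int amnt)
{
  assert (ptr != NULL || sz == 0);

  size_t byte_shift = amnt / CHAR_BIT;
  unsigned int bit_shift = amnt % CHAR_BIT;

  for (size_t i = sz; i-- > 0;)
    {
      /* Build the new value of byte I in a temporary.  */
      unsigned int val = 0;
      if (i >= byte_shift)
        {
          size_t src = i - byte_shift;
          val = (unsigned int) ptr[src] << bit_shift;
          /* Pull in the bits carried out of the next lower byte.  */
          if (src > 0)
            val |= ptr[src - 1] >> (CHAR_BIT - bit_shift);
        }
      ptr[i] = (unsigned char) val;
    }
}
```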

Richard.

> Thanks,
> Kyrill
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Richard Biener
On Mon, 10 Oct 2016, Richard Biener wrote:

> On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
> 
> > Hi Richard,
> > 
> > As I mentioned, here is the patch applying to the main store merging patch 
> > to
> > re-implement encode_tree_to_bitpos
> > to operate on the bytes directly.
> > 
> > This works fine on little-endian but breaks on big-endian, even for merging
> > bitfields within a single byte.
> > Consider the code snippet from gcc.dg/store_merging_6.c:
> > 
> > struct bar {
> >   int a : 3;
> >   unsigned char b : 4;
> >   unsigned char c : 1;
> >   char d;
> >   char e;
> >   char f;
> >   char g;
> > };
> > 
> > void
> > foo1 (struct bar *p)
> > {
> >   p->b = 3;
> >   p->a = 2;
> >   p->c = 1;
> >   p->d = 4;
> >   p->e = 5;
> > }
> > 
> > The correct GIMPLE for these merged stores on big-endian is:
> >   MEM[(voidD.49 *)p_2(D)] = 18180;
> >   MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > 
> > whereas with this patch we emit:
> >   MEM[(voidD.49 *)p_2(D)] = 39428;
> >   MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > 
> > The dump for merging the individual stores without this patch (using the
> > correct but costly wide_int approach in the base patch) is:
> > After writing 3 of size 4 at position 3 the merged region contains:
> > 6 0 0 0 0 0
> > After writing 2 of size 3 at position 0 the merged region contains:
> > 46 0 0 0 0 0
> > After writing 1 of size 1 at position 7 the merged region contains:
> > 47 0 0 0 0 0
> > After writing 4 of size 8 at position 8 the merged region contains:
> > 47 4 0 0 0 0
> > After writing 5 of size 8 at position 16 the merged region contains:
> > 47 4 5 0 0 0
> > 
> > 
> > And with this patch it is:
> > After writing 3 of size 4 at position 3 the merged region contains:
> > 18 0 0 0 0 0
> > After writing 2 of size 3 at position 0 the merged region contains:
> > 1a 0 0 0 0 0
> > After writing 1 of size 1 at position 7 the merged region contains:
> > 9a 0 0 0 0 0
> > After writing 4 of size 8 at position 8 the merged region contains:
> > 9a 4 0 0 0 0
> > After writing 5 of size 8 at position 16 the merged region contains:
> > 9a 4 5 0 0 0
> > 
> > (Note the dump just dumps the byte array from index 0 to  so the first
> > thing printed is the lowest numbered byte.
> > Also, each byte is dumped in hex.)
> > 
> > The code as included here doesn't do any byte swapping for big-endian but as
> > seen from the dump even writing a sub-byte
> > bitfield goes wrong so it would be nice to resolve that before going 
> > forward.
> > Any help with debugging this is hugely appreciated. I've included an ASCII
> > diagram of the steps in the algorithm
> > in the patch itself.
> 
> Ah, I think you need to account for BITS_BIG_ENDIAN in 
> shift_bytes_in_array.  You have to shift towards MSB which means changing
> left to right shifts for BITS_BIG_ENDIAN.
> 
> You also seem to fail to account for amnt / BITS_PER_UNIT != 0.
> Independently of BYTES_BIG_ENDIAN it would be

Ok, that would matter only if you'd merge shift_bytes_in_array,
clear_bit_region and the |-ring of that into the final buffer
(which should be possible).

Richard.


Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-10 Thread Eric Botcazou
> I believe the rule is that you might only depend on the order of objects
> with respect to their DECL_UID, not the actual value of the DECL_UID.
> As var-tracking shouldn't look at TYPE_DECLs (?) it's probably a latent
> var-tracking bug as well.

It presumably doesn't look at TYPE_DECLs, simply the DECL_UID of variables is 
also different so this changes some hashing.

> I'd prefer the named parameter to be defaulted to false and the few
> places in the FEs fixed (eventually that name business should be
> handled like names for nodes like integer_type_node -- I see no
> reason why build_complex_type should have this special-case at all!
> That is, why are the named variants in the type hash in the first place?)

I think that the calls in build_common_tree_nodes need to be changed too then:

  complex_integer_type_node = build_complex_type (integer_type_node);
  complex_float_type_node = build_complex_type (float_type_node);
  complex_double_type_node = build_complex_type (double_type_node);
  complex_long_double_type_node = build_complex_type (long_double_type_node);

in addition to:

./ada/gcc-interface/decl.c: = build_complex_type
./ada/gcc-interface/decl.c:  return build_complex_type (nt);
./ada/gcc-interface/trans.c:  tree gnu_ctype = build_complex_type (gnu_type);
./c/c-decl.c: specs->type = build_complex_type (specs->type);
./c/c-decl.c: specs->type = build_complex_type (specs->type);
./c/c-decl.c: specs->type = build_complex_type (specs->type);
./c/c-parser.c:  build_complex_type
./c/c-typeck.c: return build_complex_type (subtype);
./c-family/c-common.c:  return build_complex_type (inner_type);
./c-family/c-lex.c:   type = build_complex_type (type);
./cp/decl.c:type = build_complex_type (type);
./cp/typeck.c:  return build_type_attribute_variant (build_complex_type (subtype),
./fortran/trans-types.c:gfc_build_complex_type (tree scalar_type)
./fortran/trans-types.c:  type = gfc_build_complex_type (type);
./go/go-gcc.cc: build_complex_type(TREE_TYPE(real_tree)),
./go/go-gcc.cc:  type = build_complex_type(type);
./lto/lto-lang.c:   return build_complex_type (inner_type);

Or perhaps *only* the calls in build_common_tree_nodes need to be changed?

It's certainly old code (r29604, September 1999).

-- 
Eric Botcazou


Re: [AArch64][0/14] ARMv8.2-A FP16 extension support

2016-10-10 Thread James Greenhalgh
On Wed, Oct 05, 2016 at 05:44:08PM +0100, Jiong Wang wrote:
> On 27/09/16 17:03, Jiong Wang wrote:
> >
> > Now as ARM patches have gone in around r240427, I have done a quick
> > confirmation on the status of these four pending testsuite patches:
> >
> >   https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00337.html
> >   https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00338.html
> >   https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00339.html
> >   https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00340.html
> >
> > The result is they apply cleanly on gcc trunk, and there is no
> > regression on AArch64 native regression test.  Testcases enabled
> > without requirement of FP16 all passed.
> >
> > I will give a final run on ARM native board and AArch64 emulation
> > environment with ARMv8.2-A FP16 enabled. (Have done this before, just
> > in case something changed during these days)
> >
> > OK for trunk if there is no regression?
> >
> > Thanks
> 
> Finished the final tests on emulator with FP16 enabled.
> 
>   * No regression on AARCH64, all new testcases passed.
>   * No regression on AARCH32, part of these new testcases UNRESOLVED
>     because they should be skipped on AARCH32, fixed by the attached
>     trivial patch which I will merge into the 4th patch (no effect on
>     changelog).
> 
> OK to commit these patches?

And to be explicit, this is OK too.

Thanks for the tests!

Cheers,
James



Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-10 Thread Richard Biener
On Mon, Oct 10, 2016 at 12:38 PM, Eric Botcazou  wrote:
>> I believe the rule is that you might only depend on the order of objects
>> with respect to their DECL_UID, not the actual value of the DECL_UID.
>> As var-tracking shouldn't look at TYPE_DECLs (?) it's probably a latent
>> var-tracking bug as well.
>
> It presumably doesn't look at TYPE_DECLs, simply the DECL_UID of variables is
> also different so this changes some hashing.
>
>> I'd prefer the named parameter to be defaulted to false and the few
>> places in the FEs fixed (eventually that name business should be
>> handled like names for nodes like integer_type_node -- I see no
>> reason why build_complex_type should have this special-case at all!
>> That is, why are the named variants in the type hash in the first place?)
>
> I think that the calls in build_common_tree_nodes need to be changed too then:
>
>   complex_integer_type_node = build_complex_type (integer_type_node);
>   complex_float_type_node = build_complex_type (float_type_node);
>   complex_double_type_node = build_complex_type (double_type_node);
>   complex_long_double_type_node = build_complex_type (long_double_type_node);
>
> in addition to:
>
> ./ada/gcc-interface/decl.c: = build_complex_type
> ./ada/gcc-interface/decl.c:  return build_complex_type (nt);
> ./ada/gcc-interface/trans.c:  tree gnu_ctype = build_complex_type
> (gnu_type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-parser.c:  build_complex_type
> ./c/c-typeck.c: return build_complex_type (subtype);
> ./c-family/c-common.c:  return build_complex_type (inner_type);
> ./c-family/c-lex.c:   type = build_complex_type (type);
> ./cp/decl.c:type = build_complex_type (type);
> ./cp/typeck.c:  return build_type_attribute_variant (build_complex_type
> (subtype),
> ./fortran/trans-types.c:gfc_build_complex_type (tree scalar_type)
> ./fortran/trans-types.c:  type = gfc_build_complex_type (type);
> ./go/go-gcc.cc:
> build_complex_type(TREE_TYPE(real_tree)),
> ./go/go-gcc.cc:  type = build_complex_type(type);
> ./lto/lto-lang.c:   return build_complex_type (inner_type);
>
> Or perhaps *only* the calls in build_common_tree_nodes need to be changed?

I think only the calls in build_common_tree_nodes -- those are the ones
built early and that survive GC.  The patch is ok if it passes testing
with that.

Richard.

> It's certainly old code (r29604, September 1999).
> --
> Eric Botcazou


Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-10 Thread Richard Biener
On Mon, Oct 10, 2016 at 12:38 PM, Eric Botcazou  wrote:
>> I believe the rule is that you might only depend on the order of objects
>> with respect to their DECL_UID, not the actual value of the DECL_UID.
>> As var-tracking shouldn't look at TYPE_DECLs (?) it's probably a latent
>> var-tracking bug as well.
>
> It presumably doesn't look at TYPE_DECLs, simply the DECL_UID of variables is
> also different so this changes some hashing.

Yes.  But that's not the only source for DECL_UID differences.  Btw,
I see lots of FOR_EACH_HASH_TABLE_ELEMENT in var-tracking.c
but they don't look like their outcome is supposed to be dependent on
element ordering.

Did you track down where exactly the code-gen difference appeared?

>> I'd prefer the named parameter to be defaulted to false and the few
>> places in the FEs fixed (eventually that name business should be
>> handled like names for nodes like integer_type_node -- I see no
>> reason why build_complex_type should have this special-case at all!
>> That is, why are the named variants in the type hash in the first place?)
>
> I think that the calls in build_common_tree_nodes need to be changed too then:
>
>   complex_integer_type_node = build_complex_type (integer_type_node);
>   complex_float_type_node = build_complex_type (float_type_node);
>   complex_double_type_node = build_complex_type (double_type_node);
>   complex_long_double_type_node = build_complex_type (long_double_type_node);
>
> in addition to:
>
> ./ada/gcc-interface/decl.c: = build_complex_type
> ./ada/gcc-interface/decl.c:  return build_complex_type (nt);
> ./ada/gcc-interface/trans.c:  tree gnu_ctype = build_complex_type
> (gnu_type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-parser.c:  build_complex_type
> ./c/c-typeck.c: return build_complex_type (subtype);
> ./c-family/c-common.c:  return build_complex_type (inner_type);
> ./c-family/c-lex.c:   type = build_complex_type (type);
> ./cp/decl.c:type = build_complex_type (type);
> ./cp/typeck.c:  return build_type_attribute_variant (build_complex_type
> (subtype),
> ./fortran/trans-types.c:gfc_build_complex_type (tree scalar_type)
> ./fortran/trans-types.c:  type = gfc_build_complex_type (type);
> ./go/go-gcc.cc:
> build_complex_type(TREE_TYPE(real_tree)),
> ./go/go-gcc.cc:  type = build_complex_type(type);
> ./lto/lto-lang.c:   return build_complex_type (inner_type);
>
> Or perhaps *only* the calls in build_common_tree_nodes need to be changed?
>
> It's certainly old code (r29604, September 1999).
>
> --
> Eric Botcazou


Fix invalid doloop setup on ia64 (PR target/77738)

2016-10-10 Thread Andreas Schwab
On ia64 the doloop pattern can only work with DImode, so it should
reject any other mode.  Bootstrapped and regtested on ia64-suse-linux.

Andreas.

PR target/77738
* config/ia64/ia64.md ("doloop_end"): Reject if mode of loop
pseudo is not DImode.

---
 gcc/config/ia64/ia64.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index 7bc21fd8ca..afde75aa74 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -3959,6 +3959,9 @@
(use (match_operand 1 "" ""))]  ; label
   ""
 {
+  if (GET_MODE (operands[0]) != DImode)
+    FAIL;
+
   emit_jump_insn (gen_doloop_end_internal (gen_rtx_REG (DImode, AR_LC_REGNUM),
   operands[1]));
   DONE;
-- 
2.10.1

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: PING: [PATCH] Be more conservative in early inliner if FDO is enabled

2016-10-10 Thread Yuan, Pengfei
> On Mon, Oct 10, 2016 at 4:23 AM, Yuan, Pengfei  wrote:
> > Hi,
> >
> > What is the decision on this patch?
> > https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01041.html
> 
> Honza approved the patch already.
> 
> Richard.

Do I need to sign a copyright assignment for the patch?
Moreover, I do not have the permission to commit it.

Regards,
Yuan, Pengfei



Re: [PATCH 2/3] Fold __builtin_memchr (version 2)

2016-10-10 Thread Martin Liška
On 10/07/2016 01:21 PM, Wilco Dijkstra wrote:
> Hi,
> 
>> -static int
>> +int
>> target_char_cast (tree cst, char *p)
> 
>> +  if (target_char_cast (arg2, &c))
>> +return false;
> 
> I believe target_char_cast is incorrect if the host/target chars are not
> identical (depending on how constant strings are created there may be
> signed/unsigned mismatches too). I recently added target_char_cst_p to
> gimple-fold.c to avoid char representation mismatches, so it would be
> better to use that instead.
> 
> Wilco
> 

Thank you for the predicate, I'm going to use it.

I have one additional question: should c_getstr also be protected by a
similar guard? Or is it always safe to grab a char* via
TREE_STRING_POINTER and use it with host string functions (strcmp, ...)?

Martin


Re: Fix invalid doloop setup on ia64 (PR target/77738)

2016-10-10 Thread Bernd Schmidt

On 10/10/2016 12:51 PM, Andreas Schwab wrote:
> On ia64 the doloop pattern can only work with DImode, so it should
> reject any other mode.  Bootstrapped and regtested on ia64-suse-linux.
>
> Andreas.
>
> PR target/77738
> * config/ia64/ia64.md ("doloop_end"): Reject if mode of loop
> pseudo is not DImode.


Ok. Same issue as on every target that uses doloop.


Bernd



Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Kyrill Tkachov


On 10/10/16 11:22, Richard Biener wrote:
> On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
>
> > Hi Richard,
> >
> > As I mentioned, here is the patch applying to the main store merging patch to
> > re-implement encode_tree_to_bitpos
> > to operate on the bytes directly.
> >
> > This works fine on little-endian but breaks on big-endian, even for merging
> > bitfields within a single byte.
> > Consider the code snippet from gcc.dg/store_merging_6.c:
> >
> > struct bar {
> >    int a : 3;
> >    unsigned char b : 4;
> >    unsigned char c : 1;
> >    char d;
> >    char e;
> >    char f;
> >    char g;
> > };
> >
> > void
> > foo1 (struct bar *p)
> > {
> >    p->b = 3;
> >    p->a = 2;
> >    p->c = 1;
> >    p->d = 4;
> >    p->e = 5;
> > }
> >
> > The correct GIMPLE for these merged stores on big-endian is:
> >    MEM[(voidD.49 *)p_2(D)] = 18180;
> >    MEM[(charD.8 *)p_2(D) + 2B] = 5;
> >
> > whereas with this patch we emit:
> >    MEM[(voidD.49 *)p_2(D)] = 39428;
> >    MEM[(charD.8 *)p_2(D) + 2B] = 5;
> >
> > The dump for merging the individual stores without this patch (using the
> > correct but costly wide_int approach in the base patch) is:
> > After writing 3 of size 4 at position 3 the merged region contains:
> > 6 0 0 0 0 0
> > After writing 2 of size 3 at position 0 the merged region contains:
> > 46 0 0 0 0 0
> > After writing 1 of size 1 at position 7 the merged region contains:
> > 47 0 0 0 0 0
> > After writing 4 of size 8 at position 8 the merged region contains:
> > 47 4 0 0 0 0
> > After writing 5 of size 8 at position 16 the merged region contains:
> > 47 4 5 0 0 0
> >
> > And with this patch it is:
> > After writing 3 of size 4 at position 3 the merged region contains:
> > 18 0 0 0 0 0
> > After writing 2 of size 3 at position 0 the merged region contains:
> > 1a 0 0 0 0 0
> > After writing 1 of size 1 at position 7 the merged region contains:
> > 9a 0 0 0 0 0
> > After writing 4 of size 8 at position 8 the merged region contains:
> > 9a 4 0 0 0 0
> > After writing 5 of size 8 at position 16 the merged region contains:
> > 9a 4 5 0 0 0
> >
> > (Note the dump just dumps the byte array from index 0 to  so the first
> > thing printed is the lowest numbered byte.
> > Also, each byte is dumped in hex.)
> >
> > The code as included here doesn't do any byte swapping for big-endian but as
> > seen from the dump even writing a sub-byte
> > bitfield goes wrong so it would be nice to resolve that before going forward.
> > Any help with debugging this is hugely appreciated. I've included an ASCII
> > diagram of the steps in the algorithm
> > in the patch itself.
>
> Ah, I think you need to account for BITS_BIG_ENDIAN in
> shift_bytes_in_array.  You have to shift towards MSB which means changing
> left to right shifts for BITS_BIG_ENDIAN.

Thanks, I'll try it out. But this is on aarch64 where
BITS_BIG_ENDIAN is 0 even when BYTES_BIG_ENDIAN is 1
so there's something else bad here.

> You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
> Independently of BYTES_BIG_ENDIAN it would be
>
>    ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
> ...

doh, yes. I'll fix that.

> (so best use a single load / store and operate on a temporary).

Thanks,
Kyrill

> Richard.
>
> > Thanks,
> > Kyrill





Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Kyrill Tkachov


On 10/10/16 12:06, Kyrill Tkachov wrote:
>
> On 10/10/16 11:22, Richard Biener wrote:
> > On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
> >
> > > Hi Richard,
> > >
> > > As I mentioned, here is the patch applying to the main store merging patch to
> > > re-implement encode_tree_to_bitpos
> > > to operate on the bytes directly.
> > >
> > > This works fine on little-endian but breaks on big-endian, even for merging
> > > bitfields within a single byte.
> > > Consider the code snippet from gcc.dg/store_merging_6.c:
> > >
> > > struct bar {
> > >    int a : 3;
> > >    unsigned char b : 4;
> > >    unsigned char c : 1;
> > >    char d;
> > >    char e;
> > >    char f;
> > >    char g;
> > > };
> > >
> > > void
> > > foo1 (struct bar *p)
> > > {
> > >    p->b = 3;
> > >    p->a = 2;
> > >    p->c = 1;
> > >    p->d = 4;
> > >    p->e = 5;
> > > }
> > >
> > > The correct GIMPLE for these merged stores on big-endian is:
> > >    MEM[(voidD.49 *)p_2(D)] = 18180;
> > >    MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > >
> > > whereas with this patch we emit:
> > >    MEM[(voidD.49 *)p_2(D)] = 39428;
> > >    MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > >
> > > The dump for merging the individual stores without this patch (using the
> > > correct but costly wide_int approach in the base patch) is:
> > > After writing 3 of size 4 at position 3 the merged region contains:
> > > 6 0 0 0 0 0
> > > After writing 2 of size 3 at position 0 the merged region contains:
> > > 46 0 0 0 0 0
> > > After writing 1 of size 1 at position 7 the merged region contains:
> > > 47 0 0 0 0 0
> > > After writing 4 of size 8 at position 8 the merged region contains:
> > > 47 4 0 0 0 0
> > > After writing 5 of size 8 at position 16 the merged region contains:
> > > 47 4 5 0 0 0
> > >
> > > And with this patch it is:
> > > After writing 3 of size 4 at position 3 the merged region contains:
> > > 18 0 0 0 0 0
> > > After writing 2 of size 3 at position 0 the merged region contains:
> > > 1a 0 0 0 0 0
> > > After writing 1 of size 1 at position 7 the merged region contains:
> > > 9a 0 0 0 0 0
> > > After writing 4 of size 8 at position 8 the merged region contains:
> > > 9a 4 0 0 0 0
> > > After writing 5 of size 8 at position 16 the merged region contains:
> > > 9a 4 5 0 0 0
> > >
> > > (Note the dump just dumps the byte array from index 0 to  so the first
> > > thing printed is the lowest numbered byte.
> > > Also, each byte is dumped in hex.)
> > >
> > > The code as included here doesn't do any byte swapping for big-endian but as
> > > seen from the dump even writing a sub-byte
> > > bitfield goes wrong so it would be nice to resolve that before going forward.
> > > Any help with debugging this is hugely appreciated. I've included an ASCII
> > > diagram of the steps in the algorithm
> > > in the patch itself.
> >
> > Ah, I think you need to account for BITS_BIG_ENDIAN in
> > shift_bytes_in_array.  You have to shift towards MSB which means changing
> > left to right shifts for BITS_BIG_ENDIAN.
>
> Thanks, I'll try it out. But this is on aarch64 where
> BITS_BIG_ENDIAN is 0 even when BYTES_BIG_ENDIAN is 1
> so there's something else bad here.
>
> > You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
> > Independently of BYTES_BIG_ENDIAN it would be
> >
> >    ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
> > ...
>
> doh, yes. I'll fix that.

Scratch that, just read your other reply.
The precondition for that function is that the shift amount is less than
BITS_PER_UNIT.
I'll clarify that in the comment.

Kyrill

> > (so best use a single load / store and operate on a temporary).
>
> Thanks,
> Kyrill
>
> > Richard.
> >
> > > Thanks,
> > > Kyrill







Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Richard Biener
On Mon, 10 Oct 2016, Kyrill Tkachov wrote:

> 
> On 10/10/16 11:22, Richard Biener wrote:
> > On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
> > 
> > > Hi Richard,
> > > 
> > > As I mentioned, here is the patch applying to the main store merging patch
> > > to
> > > re-implement encode_tree_to_bitpos
> > > to operate on the bytes directly.
> > > 
> > > This works fine on little-endian but breaks on big-endian, even for
> > > merging
> > > bitfields within a single byte.
> > > Consider the code snippet from gcc.dg/store_merging_6.c:
> > > 
> > > struct bar {
> > >int a : 3;
> > >unsigned char b : 4;
> > >unsigned char c : 1;
> > >char d;
> > >char e;
> > >char f;
> > >char g;
> > > };
> > > 
> > > void
> > > foo1 (struct bar *p)
> > > {
> > >p->b = 3;
> > >p->a = 2;
> > >p->c = 1;
> > >p->d = 4;
> > >p->e = 5;
> > > }
> > > 
> > > The correct GIMPLE for these merged stores on big-endian is:
> > >MEM[(voidD.49 *)p_2(D)] = 18180;
> > >MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > > 
> > > whereas with this patch we emit:
> > >MEM[(voidD.49 *)p_2(D)] = 39428;
> > >MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > > 
> > > The dump for merging the individual stores without this patch (using the
> > > correct but costly wide_int approach in the base patch) is:
> > > After writing 3 of size 4 at position 3 the merged region contains:
> > > 6 0 0 0 0 0
> > > After writing 2 of size 3 at position 0 the merged region contains:
> > > 46 0 0 0 0 0
> > > After writing 1 of size 1 at position 7 the merged region contains:
> > > 47 0 0 0 0 0
> > > After writing 4 of size 8 at position 8 the merged region contains:
> > > 47 4 0 0 0 0
> > > After writing 5 of size 8 at position 16 the merged region contains:
> > > 47 4 5 0 0 0
> > > 
> > > 
> > > And with this patch it is:
> > > After writing 3 of size 4 at position 3 the merged region contains:
> > > 18 0 0 0 0 0
> > > After writing 2 of size 3 at position 0 the merged region contains:
> > > 1a 0 0 0 0 0
> > > After writing 1 of size 1 at position 7 the merged region contains:
> > > 9a 0 0 0 0 0
> > > After writing 4 of size 8 at position 8 the merged region contains:
> > > 9a 4 0 0 0 0
> > > After writing 5 of size 8 at position 16 the merged region contains:
> > > 9a 4 5 0 0 0
> > > 
> > > (Note the dump just dumps the byte array from index 0 to  so the
> > > first
> > > thing printed is the lowest numbered byte.
> > > Also, each byte is dumped in hex.)
> > > 
> > > The code as included here doesn't do any byte swapping for big-endian but
> > > as
> > > seen from the dump even writing a sub-byte
> > > bitfield goes wrong so it would be nice to resolve that before going
> > > forward.
> > > Any help with debugging this is hugely appreciated. I've included an ASCII
> > > diagram of the steps in the algorithm
> > > in the patch itself.
> > Ah, I think you need to account for BITS_BIG_ENDIAN in
> > shift_bytes_in_array.  You have to shift towards MSB which means changing
> > left to right shifts for BITS_BIG_ENDIAN.
> 
> Thanks, I'll try it out. But this is on aarch64 where
> BITS_BIG_ENDIAN is 0 even when BYTES_BIG_ENDIAN is 1
> so there's something else bad here.

Maybe I'm confusing all the macros, so maybe it's BYTES_BIG_ENDIAN
(vs. WORDS_BIG_ENDIAN -- in theory this approach should work for
pdp11 as well).

Richard.

> > You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
> > Independently of BYTES_BIG_ENDIAN it would be
> > 
> >ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
> > ...
> 
> doh, yes. I'll fix that.
> 
> > (so best use a single load / store and operate on a temporary).
> 
> Thanks,
> Kyrill
> 
> > Richard.
> > 
> > > Thanks,
> > > Kyrill
> > > 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
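
The two dumps quoted in this thread differ only in which end of the byte counts as bit position 0. A minimal sketch of that difference, assuming 8-bit bytes and fields that fit in one byte; place_bits and merge_first_byte are hypothetical names used for illustration and are not part of the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not GCC code: OR a SIZE-bit VALUE into one byte at
// bit position POS, for fields that fit within a single byte.
//   msb_first = true  : position 0 is the most significant bit
//   msb_first = false : position 0 is the least significant bit
inline std::uint8_t
place_bits(std::uint8_t byte, unsigned value, unsigned size, unsigned pos,
           bool msb_first)
{
  assert(size >= 1 && pos + size <= 8);
  const unsigned mask = (1u << size) - 1u;
  const unsigned shift = msb_first ? 8u - pos - size : pos;
  return static_cast<std::uint8_t>(byte | ((value & mask) << shift));
}

// Replays the three sub-byte stores from foo1 in the order of the dump.
inline std::uint8_t
merge_first_byte(bool msb_first)
{
  std::uint8_t byte = 0;
  byte = place_bits(byte, 3, 4, 3, msb_first);  // p->b = 3
  byte = place_bits(byte, 2, 3, 0, msb_first);  // p->a = 2
  byte = place_bits(byte, 1, 1, 7, msb_first);  // p->c = 1
  return byte;
}
```

With msb_first set, the byte goes through 0x06, 0x46, 0x47, which matches the wide_int dump; with it cleared, the byte goes through 0x18, 0x1a, 0x9a, which matches the dump with the patch. Followed by the byte 0x04 for p->d, these give 0x4704 = 18180 and 0x9a04 = 39428, the two constants in the GIMPLE above.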


Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Richard Biener
On Mon, 10 Oct 2016, Richard Biener wrote:

> On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
> 
> > 
> > On 10/10/16 11:22, Richard Biener wrote:
> > > On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
> > > 
> > > > Hi Richard,
> > > > 
> > > > As I mentioned, here is the patch applying to the main store merging 
> > > > patch
> > > > to
> > > > re-implement encode_tree_to_bitpos
> > > > to operate on the bytes directly.
> > > > 
> > > > This works fine on little-endian but breaks on big-endian, even for
> > > > merging
> > > > bitfields within a single byte.
> > > > Consider the code snippet from gcc.dg/store_merging_6.c:
> > > > 
> > > > struct bar {
> > > >int a : 3;
> > > >unsigned char b : 4;
> > > >unsigned char c : 1;
> > > >char d;
> > > >char e;
> > > >char f;
> > > >char g;
> > > > };
> > > > 
> > > > void
> > > > foo1 (struct bar *p)
> > > > {
> > > >p->b = 3;
> > > >p->a = 2;
> > > >p->c = 1;
> > > >p->d = 4;
> > > >p->e = 5;
> > > > }
> > > > 
> > > > The correct GIMPLE for these merged stores on big-endian is:
> > > >MEM[(voidD.49 *)p_2(D)] = 18180;
> > > >MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > > > 
> > > > whereas with this patch we emit:
> > > >MEM[(voidD.49 *)p_2(D)] = 39428;
> > > >MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > > > 
> > > > The dump for merging the individual stores without this patch (using the
> > > > correct but costly wide_int approach in the base patch) is:
> > > > After writing 3 of size 4 at position 3 the merged region contains:
> > > > 6 0 0 0 0 0
> > > > After writing 2 of size 3 at position 0 the merged region contains:
> > > > 46 0 0 0 0 0
> > > > After writing 1 of size 1 at position 7 the merged region contains:
> > > > 47 0 0 0 0 0
> > > > After writing 4 of size 8 at position 8 the merged region contains:
> > > > 47 4 0 0 0 0
> > > > After writing 5 of size 8 at position 16 the merged region contains:
> > > > 47 4 5 0 0 0
> > > > 
> > > > 
> > > > And with this patch it is:
> > > > After writing 3 of size 4 at position 3 the merged region contains:
> > > > 18 0 0 0 0 0
> > > > After writing 2 of size 3 at position 0 the merged region contains:
> > > > 1a 0 0 0 0 0
> > > > After writing 1 of size 1 at position 7 the merged region contains:
> > > > 9a 0 0 0 0 0
> > > > After writing 4 of size 8 at position 8 the merged region contains:
> > > > 9a 4 0 0 0 0
> > > > After writing 5 of size 8 at position 16 the merged region contains:
> > > > 9a 4 5 0 0 0
> > > > 
> > > > (Note the dump just dumps the byte array from index 0 to  so the
> > > > first
> > > > thing printed is the lowest numbered byte.
> > > > Also, each byte is dumped in hex.)
> > > > 
> > > > The code as included here doesn't do any byte swapping for big-endian 
> > > > but
> > > > as
> > > > seen from the dump even writing a sub-byte
> > > > bitfield goes wrong so it would be nice to resolve that before going
> > > > forward.
> > > > Any help with debugging this is hugely appreciated. I've included an 
> > > > ASCII
> > > > diagram of the steps in the algorithm
> > > > in the patch itself.
> > > Ah, I think you need to account for BITS_BIG_ENDIAN in
> > > shift_bytes_in_array.  You have to shift towards MSB which means changing
> > > left to right shifts for BITS_BIG_ENDIAN.
> > 
> > Thanks, I'll try it out. But this is on aarch64 where
> > BITS_BIG_ENDIAN is 0 even when BYTES_BIG_ENDIAN is 1
> > so there's something else bad here.
> 
> Maybe I'm confusing all the macros, so maybe it's BYTES_BIG_ENDIAN
> (vs. WORDS_BIG_ENDIAN -- in theory this approach should work for
> pdp11 as well).

Or maybe I'm confusing how get_inner_reference numbers "bits" when
it returns bitpos... (and how a multi-byte value in target memory
representation has to be "shifted" by bitpos).

I really thought BITS_BIG_ENDIAN is the only thing that matters...

Btw, I reproduced on ppc64-linux (which has BITS_BIG_ENDIAN).

Richard.

> Richard.
> 
> > > You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
> > > Independently of BYTES_BIG_ENDIAN it would be
> > > 
> > >ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
> > > ...
> > 
> > doh, yes. I'll fix that.
> > 
> > > (so best use a single load / store and operate on a temporary).
> > 
> > Thanks,
> > Kyrill
> > 
> > > Richard.
> > > 
> > > > Thanks,
> > > > Kyrill
> > > > 
> > 
> > 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
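
For readers following the shift_bytes_in_array discussion, here is a rough sketch of a sub-byte left shift over a byte array that carries bits into the next byte, with each byte handled through a wider temporary as suggested above. It assumes 8-bit bytes and that ptr[0] is the least significant byte; shift_bytes_left is a hypothetical name, and this is not the code from the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch, not the patch's code: shift the SZ bytes at PTR left
// by AMNT bits, treating ptr[0] as the least significant byte, so bits
// shifted out of ptr[i] carry into the low bits of ptr[i + 1].
// Precondition (as stated in the thread): amnt is less than the byte width.
inline void
shift_bytes_left(std::uint8_t *ptr, std::size_t sz, unsigned amnt)
{
  assert(amnt < 8);
  if (amnt == 0)
    return;
  unsigned carry = 0;
  for (std::size_t i = 0; i < sz; ++i)
    {
      // Work on a wider temporary: one load and one store per byte.
      const unsigned tmp = (static_cast<unsigned>(ptr[i]) << amnt) | carry;
      ptr[i] = static_cast<std::uint8_t>(tmp & 0xffu);
      carry = tmp >> 8;
    }
}
```

Because the shift amount stays below the byte width, the carry never spans more than one byte, which is why the amnt / BITS_PER_UNIT offset does not appear here.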


Re: [PATCH] Implement C++17 node extraction and insertion (P0083R5)

2016-10-10 Thread Jonathan Wakely

On 21/09/16 14:48 +0100, Jonathan Wakely wrote:
> This implements container node extraction/insertion, and merging. The
> patch includes Debug Mode support and pretty printers for the node
> handles.
>
> Most of the changes are fairly straightforward, with two things worth
> pointing out.
>
> There's a FIXME in bits/hashtable.h due to an exception-safety issue.
> If the hash function or equality predicate throws then the node is
> destroyed and deallocated. It would be better to leave it unchanged in
> the node_handle argument.
>
> I didn't want to make all map and multimap specializations friends of
> each other (and similarly for all sets and multisets, and again for
> the unordered ones). That would make it too easy to accidentally
> access the internals of a map from a map. So I defined the
> _Rb_tree_merge_helper and _Hash_merge_helper class templates to
> mediate access, so that any access to a "foreign" container type must
> be done through that type, and only certain internals can be obtained.


I forgot to mention another thing worth calling out.

I spoke to Richi and Michael Matz at the Cauldron and they assured me
that the middle end won't do any optimizations that would cause the
"magic happens here" part to do the wrong thing. Specifically, the
node handle for maps does this to get a non-const pointer to the
key_type in the pair:

 auto& __key = const_cast<_Key&>(__ptr->_M_valptr()->first);
 _M_pkey = _S_pointer_to(__key);
 _M_pmapped = _S_pointer_to(__ptr->_M_valptr()->second);

Where _S_pointer_to() is:

 template<typename _Tp>
   using __pointer = __ptr_rebind;

 template<typename _Tp>
   __pointer<_Tp>
   _S_pointer_to(_Tp& __obj)
   { return pointer_traits<__pointer<_Tp>>::pointer_to(__obj); }

The potentially worrying part of this is the const_cast, but as we
know that the pair is inside a non-const node allocated on the heap,
it will never be in read-only memory or actually non-modifiable. Once
we have std::launder we could consider using that, i.e.

 _M_pkey = _S_pointer_to(std::launder(__key));
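
Seen from the user side, the reason the node handle needs a non-const pointer to the key is that the C++17 interface lets callers modify the key of an extracted node. A small sketch using only the standard std::map::extract / insert interface; the rekey helper is a hypothetical name, and nothing here depends on the libstdc++ internals discussed above:

```cpp
#include <map>
#include <string>
#include <utility>

// Change the key of one element without copying or reallocating its node.
// Returns false if OLD_KEY is absent or NEW_KEY is already taken; in the
// latter case the extracted node is put back under its original key.
inline bool
rekey(std::map<int, std::string>& m, int old_key, int new_key)
{
  auto nh = m.extract(old_key);      // node handle; empty if key not found
  if (nh.empty())
    return false;
  nh.key() = new_key;                // non-const access to the stored key
  auto res = m.insert(std::move(nh));
  if (res.inserted)
    return true;
  res.node.key() = old_key;          // collision: restore and reinsert
  m.insert(std::move(res.node));
  return false;
}
```

While the node is outside the container, writing through key() cannot break the map's ordering invariant, which is what makes the const_cast in the implementation harmless in practice.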



Re: [PATCH 2/3] Fold __builtin_memchr (version 2)

2016-10-10 Thread Wilco Dijkstra
Martin Liška  wrote:
> On 10/07/2016 01:21 PM, Wilco Dijkstra wrote:
>
> > I believe target_char_cast is incorrect if the host/target chars are not 
> > identical
> > (depending on how constant strings are created there may be signed/unsigned
> > mismatches too). I recently added target_char_cst_p to gimple-fold.c to 
> > avoid
> > char representation mismatches, so it would be better to use that instead.
>
> Thank you for the predicate, I'm going to use it.
>
> I have one additional question whether also c_getstr should be guarded
> with a similar guard? Or is it always safe to grab a char* by 
> TREE_STRING_POINTER
> and use it by a host string functions (strcmp, ...)?

Yes I guess that one is incorrect too. I can't find the internal
implementation of tree strings, but it may well be that GCC just doesn't
support any mismatches in host/target character size. In any case an
explicit check won't do any harm as it isn't possible to use host string
functions if there is a mismatch in character size.

Another thing, what happens with:

memchr ("abc", 225, 10);

It seems your new code will call memchr with the given size (and
potentially crash) rather than report the obvious bug and set a
consistent return value that doesn't rely on reading random memory on
the host.

Wilco





Re: [PATCH 2/3] Fold __builtin_memchr (version 2)

2016-10-10 Thread Martin Liška
On 10/10/2016 01:28 PM, Wilco Dijkstra wrote:
> Martin Liška  wrote:
>> On 10/07/2016 01:21 PM, Wilco Dijkstra wrote:
>>
>>> I believe target_char_cast is incorrect if the host/target chars are not 
>>> identical
>>> (depending on how constant strings are created there may be signed/unsigned
>>> mismatches too). I recently added target_char_cst_p to gimple-fold.c to 
>>> avoid
>>> char representation mismatches, so it would be better to use that instead.
>>
>> Thank you for the predicate, I'm going to use it.
>>
>> I have one additional question whether also c_getstr should be guarded
>> with a similar guard? Or is it always safe to grab a char* by 
>> TREE_STRING_POINTER
>> and use it by a host string functions (strcmp, ...)?
> 
> Yes I guess that one is incorrect too. I can't find the internal 
> implementation of tree strings,
> but it may well be that GCC just doesn't support any mismatches in 
> host/target character
> size. In any case an explicit check won't do any harm as it isn't possible to 
> use host string
> functions if there is a mismatch in character size.

I will dig into this. I'll build a cross-compiler that has a different
character size.

> 
> Another thing, what happens with:
> 
> memchr ("abc", 225, 10);
> 
> It seems your new code will call memchr with the given size (and potentially 
> crash) rather
> than report the obvious bug and set a consistent return value that doesn't 
> rely on reading
> random memory on the host.

I asked Jakub about that on IRC already:

<marxin> Hi. Just thinking whether we should fold a case like __builtin_memchr ("a", 'x', 2), which is ubsan?
<jakub> marxin: what do you mean by that?  That is NULL, without undefined behavior
<marxin> jakub: sry, s/2/3
<jakub> marxin: don't fold that in that case
<marxin> jakub: good, I thought that

It's an opportunity for a warning. I talked to Martin Sebor, and he's
aware of this as a possible improvement to the sprintf warnings he's
currently working on.

Martin

> 
> Wilco
> 
> 
> 
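
The rule agreed on above (fold only when the length stays within the string constant) can be modelled outside the compiler roughly as follows. This is a sketch under assumptions: try_fold_memchr is a hypothetical name, the constant's size is passed in explicitly including the trailing NUL, and none of this is GCC's actual folding code:

```cpp
#include <cstddef>
#include <cstring>
#include <optional>

// Hypothetical model of the folding decision, not GCC's implementation.
// STR points at a constant of STR_SIZE bytes (including the trailing NUL).
// Returns std::nullopt when the call must be left alone because N reads
// past the constant; otherwise the offset of the match, or -1 for
// "not found" (the folded call would yield a null pointer).
inline std::optional<std::ptrdiff_t>
try_fold_memchr(const char *str, std::size_t str_size, int c, std::size_t n)
{
  if (n > str_size)
    return std::nullopt;             // out-of-bounds read: do not fold
  const void *hit = std::memchr(str, c, n);
  if (hit == nullptr)
    return std::ptrdiff_t{-1};
  return static_cast<const char *>(hit) - str;
}
```

With this model, ("a", 'x', 2) folds to "not found" because the constant is two bytes long, while ("a", 'x', 3) and ("abc", 225, 10) are left unfolded, so the host never reads past the end of the constant.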



RE: [PATCH] [ARC] Disable compact casesi patterns for arcv2

2016-10-10 Thread Claudiu Zissulescu

> > gcc/
> > 2016-05-09  Claudiu Zissulescu  
> >
> > * common/config/arc/arc-common.c (arc_option_optimization_table):
> > Remove compact casesi option.
> > * config/arc/arc.c (arc_override_options): Use compact casesi
> > option only for pre-ARCv2 cores.
> > * doc/invoke.texi (mcompact-casesi): Update text.
> 
> Looks good to me.
> 

Committed r240916.

Thank you for your review,
Claudiu


Re: Compile-time improvement for if conversion.

2016-10-10 Thread Yuri Rumyantsev
Thanks Richard for your comments.
I'd like to respond to your last comment about using split_edge()
instead of creating a fake post-header. I started with this splitting,
but it requires fixing up the closed SSA form by creating additional phi
nodes, so I decided to change only the CFG without updating the SSA form.
The other changes look reasonable and I will make them.

2016-10-10 12:52 GMT+03:00 Richard Biener :
> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev  wrote:
>> Hi All,
>>
>> Here is implementation of Richard proposal:
>>
>> < For general infrastructure it would be nice to expose a (post-)dominator
>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>> < what makes if-conversion expensive is the post-dom compute which happens
>> < for each loop for the whole function.  It shouldn't be very difficult
>> < to write this,
>> < sharing as much as possible code with the current DOM code might need
>> < quite some refactoring though.
>>
>> I implemented this proposal by adding calculation of dominance info
>> for SESE regions and incorporate this change to if conversion pass.
>> SESE region is built by adding loop pre-header and possibly fake
>> post-header blocks to loop body. Fake post-header is deleted after
>> predication completion.
>>
>> Bootstrapping and regression testing did not show any new failures.
>>
>> Is it OK for trunk?
>
> It's mostly reasonable but I have a few comments.  First, re-using
> bb->dom[] for the dominator info is somewhat fragile but indeed
> a requirement to make the patch reasonably small.  Please,
> in calculate_dominance_info_for_region, make sure that
> !dom_info_available_p (dir).
>
> You pass loop * everywhere but require ->aux to be set up as
> an array of BBs forming the region with special BBs at array ends.
>
> Please instead pass in a vec<basic_block> which avoids using ->aux
> and also allows other non-loop-based SESE regions to be used
> (I couldn't spot anything that relies on this being a loop).
>
> Adding a convenience wrapper for loop  * would be of course nice,
> to cover the special pre/post-header code in tree-if-conv.c.
>
> In theory a SESE region is fully specified by its entry end exit _edge_,
> so you might want to see if it's possible to use such a pair of edges
> to guard the dfs/idom walks to avoid the need to create fake blocks.
>
> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
> please use split_edge() of the entry/exit edges.
>
> Richard.
>
>> ChangeLog:
>> 2016-10-05  Yuri Rumyantsev  
>>
>> * dominance.c : Include cfgloop.h for loop recognition.
>> (dom_info): Add new functions and add boolean argument to recognize
>> computation for loop region.
>> (dom_info::dom_info): New function.
>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>> handle unvisited blocks.
>> (dom_info::calc_idoms): Likewise.
>> (compute_dom_fast_query_in_region): New function.
>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>> false argument.
>> (calculate_dominance_info_for_region): New function.
>> (free_dominance_info_for_region): Likewise.
>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>> argument.
>> * dominance.h: Add prototype for introduced functions
>> calculate_dominance_info_for_region and
>> free_dominance_info_for_region.
>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>> (build_sese_region): New function.
>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>> calculation, free it after basic block predication and delete created
>> fake post-header block if any.
>> (tree_if_conversion): Delete call of free_dominance_info for
>> post-dominators, free ifc_sese_bbs which represents SESE region.
>> (pass_if_conversion::execute): Delete detection of infinite loops
>> and fake edges to exit block since post-dominator calculation is
>> performed per if-converted loop only.


[PATCH] Add noexcept to enable_shared_from_this::weak_from_this

2016-10-10 Thread Jonathan Wakely

I missed out the "noexcept" on these new functions.

* include/bits/shared_ptr.h (enable_shared_from_this::weak_from_this):
Add noexcept.
* include/bits/shared_ptr_base.h
(__enable_shared_from_this::weak_from_this): Likewise.
* testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc:
Test exception-specification of weak_from_this.

Tested powerpc64le-linux, committing to trunk.
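
For illustration only (this is not the testsuite file), here is a minimal
sketch of what the added exception-specification guarantees. The type
Widget and the two helper functions are made up for the example; it
assumes a C++17 library that provides weak_from_this.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical example type; any class deriving from
// enable_shared_from_this would do.
struct Widget : std::enable_shared_from_this<Widget> { };

// Both the non-const and the const overload are expected to be noexcept.
static_assert(noexcept(std::declval<Widget&>().weak_from_this()),
              "non-const weak_from_this is noexcept");
static_assert(noexcept(std::declval<const Widget&>().weak_from_this()),
              "const weak_from_this is noexcept");

// An object owned by a shared_ptr hands out a weak_ptr that shares
// ownership with that shared_ptr.
bool weak_tracks_owner()
{
  auto sp = std::make_shared<Widget>();
  std::weak_ptr<Widget> wp = sp->weak_from_this();
  return !wp.expired() && wp.lock() == sp;
}

// An object that no shared_ptr owns yields an empty weak_ptr instead of
// throwing (unlike shared_from_this, which throws bad_weak_ptr).
bool weak_is_empty_without_owner()
{
  Widget w;
  return w.weak_from_this().expired();
}
```

The second helper shows why noexcept is appropriate here: the unowned case
returns an empty weak_ptr rather than reporting an error.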

commit 3f386e54098cb01df83a131ed4a8e22c0b0b52bd
Author: Jonathan Wakely 
Date:   Mon Oct 10 11:42:00 2016 +0100

Add noexcept to enable_shared_from_this::weak_from_this

* include/bits/shared_ptr.h (enable_shared_from_this::weak_from_this):
Add noexcept.
* include/bits/shared_ptr_base.h
(__enable_shared_from_this::weak_from_this): Likewise.
* testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc:
Test exception-specification of weak_from_this.

diff --git a/libstdc++-v3/include/bits/shared_ptr.h 
b/libstdc++-v3/include/bits/shared_ptr.h
index b2523b8..cbcb3b3 100644
--- a/libstdc++-v3/include/bits/shared_ptr.h
+++ b/libstdc++-v3/include/bits/shared_ptr.h
@@ -593,11 +593,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
 #define __cpp_lib_enable_shared_from_this 201603
   weak_ptr<_Tp>
-  weak_from_this()
+  weak_from_this() noexcept
   { return this->_M_weak_this; }
 
   weak_ptr<const _Tp>
-  weak_from_this() const
+  weak_from_this() const noexcept
   { return this->_M_weak_this; }
 #endif
 
diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index 4ae2668..e8820a1 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1562,11 +1562,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
   __weak_ptr<_Tp, _Lp>
-  weak_from_this()
+  weak_from_this() noexcept
   { return this->_M_weak_this; }
 
   __weak_ptr<const _Tp, _Lp>
-  weak_from_this() const
+  weak_from_this() const noexcept
   { return this->_M_weak_this; }
 #endif
 
diff --git 
a/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc
 
b/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc
index b5ebb81..9c33396 100644
--- 
a/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc
+++ 
b/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc
@@ -26,6 +26,9 @@
 
 struct X : public std::enable_shared_from_this<X> { };
 
+static_assert( noexcept(std::declval().weak_from_this()) );
+static_assert( noexcept(std::declval().weak_from_this()) );
+
 void
 test01()
 {


Re: Compile-time improvement for if conversion.

2016-10-10 Thread Richard Biener
On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev  wrote:
> Thanks Richard for your comments.
> I'd like to answer on your last comment regarding use split_edge()
> instead of creating fake post-header. I started with this splitting
> but it requires to fix-up closed ssa form by creating additional phi
> nodes, so I decided to use only cfg change without updating ssa form.
> Other changes look reasonable and will fix them.

Ah.  In this case can you investigate what it takes to make the entry/exit
edges rather than BBs?  That is, introduce those "fakes" only internally
in dominance.c?
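
As a rough illustration of computing dominance info for a region only:
the sketch below is neither the patch nor the algorithm used in
dominance.c (which is based on Lengauer-Tarjan). It uses the simpler
iterative scheme of Cooper, Harvey and Kennedy on a toy CFG. The names
Cfg, region_idoms and demo_cfg are made up for the example. The region is
a list of block ids with the entry first, and edges leaving the region are
ignored, so no fake blocks are created. Post-dominators would be the same
walk on the reversed graph, starting from the exit.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy CFG: blocks are integer ids, succs[b] lists the successors of b.
using Cfg = std::vector<std::vector<int>>;

// Immediate dominators for the blocks listed in `region` (region[0] is
// the entry).  Returns idom indexed by block id; the entry maps to
// itself and blocks outside the region (or unreachable) map to -1.
std::vector<int> region_idoms(const Cfg& succs, const std::vector<int>& region)
{
  const int n = static_cast<int>(succs.size());
  std::vector<char> in_region(n, 0);
  for (int b : region) in_region[b] = 1;

  // Iterative DFS from the entry, confined to the region, recording
  // postorder numbers.
  const int entry = region.front();
  std::vector<int> post_num(n, -1), postorder;
  std::vector<char> seen(n, 0);
  std::vector<std::pair<int, std::size_t>> stack{{entry, 0}};
  seen[entry] = 1;
  while (!stack.empty()) {
    int b = stack.back().first;
    std::size_t& i = stack.back().second;
    if (i < succs[b].size()) {
      int s = succs[b][i++];
      if (in_region[s] && !seen[s]) { seen[s] = 1; stack.push_back({s, 0}); }
    } else {
      post_num[b] = static_cast<int>(postorder.size());
      postorder.push_back(b);
      stack.pop_back();
    }
  }

  // Predecessor lists restricted to edges inside the region.
  std::vector<std::vector<int>> preds(n);
  for (int b : postorder)
    for (int s : succs[b])
      if (in_region[s]) preds[s].push_back(b);

  // Iterate to a fixed point in reverse postorder.
  std::vector<int> idom(n, -1);
  idom[entry] = entry;
  for (bool changed = true; changed; ) {
    changed = false;
    for (auto it = postorder.rbegin(); it != postorder.rend(); ++it) {
      int b = *it;
      if (b == entry) continue;
      int new_idom = -1;
      for (int p : preds[b]) {
        if (idom[p] == -1) continue;          // predecessor not processed yet
        if (new_idom == -1) { new_idom = p; continue; }
        int a = p, c = new_idom;              // walk both up to a common dominator
        while (a != c) {
          while (post_num[a] < post_num[c]) a = idom[a];
          while (post_num[c] < post_num[a]) c = idom[c];
        }
        new_idom = a;
      }
      if (new_idom != idom[b]) { idom[b] = new_idom; changed = true; }
    }
  }
  return idom;
}

// 0 = pre-header, 1 = header, 2/3 = branches, 4 = latch, 5/6 = after loop.
Cfg demo_cfg()
{
  return {{1}, {2, 3}, {4}, {4}, {1, 5}, {6}, {}};
}
```

With the region {0, 1, 2, 3, 4} of demo_cfg(), blocks 5 and 6 are never
visited, which is the point of the per-region computation.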

> 2016-10-10 12:52 GMT+03:00 Richard Biener :
>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev  wrote:
>>> Hi All,
>>>
>>> Here is implementation of Richard proposal:
>>>
>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>> < what makes if-conversion expensive is the post-dom compute which happens
>>> < for each loop for the whole function.  It shouldn't be very difficult
>>> < to write this,
>>> < sharing as much as possible code with the current DOM code might need
>>> < quite some refactoring though.
>>>
>>> I implemented this proposal by adding calculation of dominance info
>>> for SESE regions and incorporate this change to if conversion pass.
>>> SESE region is built by adding loop pre-header and possibly fake
>>> post-header blocks to loop body. Fake post-header is deleted after
>>> predication completion.
>>>
>>> Bootstrapping and regression testing did not show any new failures.
>>>
>>> Is it OK for trunk?
>>
>> It's mostly reasonable but I have a few comments.  First, re-using
>> bb->dom[] for the dominator info is somewhat fragile but indeed
>> a requirement to make the patch reasonably small.  Please,
>> in calculate_dominance_info_for_region, make sure that
>> !dom_info_available_p (dir).
>>
>> You pass loop * everywhere but require ->aux to be set up as
>> an array of BBs forming the region with special BBs at array ends.
>>
>> Please instead pass in a vec<basic_block> which avoids using ->aux
>> and also allows other non-loop-based SESE regions to be used
>> (I couldn't spot anything that relies on this being a loop).
>>
>> Adding a convenience wrapper for loop  * would be of course nice,
>> to cover the special pre/post-header code in tree-if-conv.c.
>>
>> In theory a SESE region is fully specified by its entry end exit _edge_,
>> so you might want to see if it's possible to use such a pair of edges
>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>
>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>> please use split_edge() of the entry/exit edges.
>>
>> Richard.
>>
>>> ChangeLog:
>>> 2016-10-05  Yuri Rumyantsev  
>>>
>>> * dominance.c : Include cfgloop.h for loop recognition.
>>> (dom_info): Add new functions and add boolean argument to recognize
>>> computation for loop region.
>>> (dom_info::dom_info): New function.
>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>> handle unvisited blocks.
>>> (dom_info::calc_idoms): Likewise.
>>> (compute_dom_fast_query_in_region): New function.
>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>> false argument.
>>> (calculate_dominance_info_for_region): New function.
>>> (free_dominance_info_for_region): Likewise.
>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>> argument.
>>> * dominance.h: Add prototype for introduced functions
>>> calculate_dominance_info_for_region and
>>> free_dominance_info_for_region.
>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>> (build_sese_region): New function.
>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>> calculation, free it after basic block predication and delete created
>>> fake post-header block if any.
>>> (tree_if_conversion): Delete call of free_dominance_info for
>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>> and fake edges to exit block since post-dominator calculation is
>>> performed per if-converted loop only.


[PATCH] Define std::allocator::is_always_equal

2016-10-10 Thread Jonathan Wakely

I somehow only added the is_always_equal nested typedef to the
allocator<void> specialization, not the primary template. All the
containers still do the right thing, because they use
allocator_traits<allocator<T>>::is_always_equal which gives the right
answer, but we still need to provide allocator<T>::is_always_equal to
be conforming.

* include/bits/allocator.h (allocator::is_always_equal): Define.
* testsuite/20_util/allocator/requirements/typedefs.cc: Test for
is_always_equal.
* testsuite/util/testsuite_allocator.h
(uneq_allocator::is_always_equal): Define as false_type.

Tested powerpc64le-linux, committed to trunk.
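
For illustration only, a small sketch of how is_always_equal is seen
through allocator_traits. The tagged_allocator type is hypothetical,
loosely in the spirit of the testsuite's uneq_allocator, and is not code
from the patch.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical stateful allocator: two instances are interchangeable only
// if their tags match, so it declares is_always_equal as false_type.
template<typename T>
struct tagged_allocator
{
  using value_type = T;
  using is_always_equal = std::false_type;

  explicit tagged_allocator(int tag = 0) noexcept : tag_(tag) { }

  template<typename U>
  tagged_allocator(const tagged_allocator<U>& other) noexcept
  : tag_(other.tag_) { }

  T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
  void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }

  int tag_;
};

template<typename T, typename U>
bool operator==(const tagged_allocator<T>& a, const tagged_allocator<U>& b) noexcept
{ return a.tag_ == b.tag_; }

template<typename T, typename U>
bool operator!=(const tagged_allocator<T>& a, const tagged_allocator<U>& b) noexcept
{ return !(a == b); }

// Containers query the trait through allocator_traits, which uses the
// member typedef when present and otherwise falls back to is_empty.
static_assert(std::allocator_traits<std::allocator<int>>::is_always_equal::value,
              "std::allocator instances always compare equal");
static_assert(!std::allocator_traits<tagged_allocator<int>>::is_always_equal::value,
              "tagged_allocator instances may differ");
```

Going through allocator_traits is what the containers do, which is why they
gave the right answer even before the primary template had the typedef.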


commit f5020f0fa1dc815eda37d8b1040e7c16f1554114
Author: Jonathan Wakely 
Date:   Mon Oct 10 12:04:24 2016 +0100

Define std::allocator::is_always_equal

* include/bits/allocator.h (allocator::is_always_equal): Define.
* testsuite/20_util/allocator/requirements/typedefs.cc: Test for
is_always_equal.
* testsuite/util/testsuite_allocator.h
(uneq_allocator::is_always_equal): Define as false_type.

diff --git a/libstdc++-v3/include/bits/allocator.h 
b/libstdc++-v3/include/bits/allocator.h
index 984d800..8e78165 100644
--- a/libstdc++-v3/include/bits/allocator.h
+++ b/libstdc++-v3/include/bits/allocator.h
@@ -50,6 +50,9 @@
 #endif
 
 #define __cpp_lib_incomplete_container_elements 201505
+#if __cplusplus >= 201103L
+# define __cpp_lib_allocator_is_always_equal 201411
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -80,7 +83,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // 2103. std::allocator propagate_on_container_move_assignment
   typedef true_type propagate_on_container_move_assignment;
 
-#define __cpp_lib_allocator_is_always_equal 201411
   typedef true_type is_always_equal;
 #endif
 };
@@ -113,6 +115,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 2103. std::allocator propagate_on_container_move_assignment
   typedef true_type propagate_on_container_move_assignment;
+
+  typedef true_type is_always_equal;
 #endif
 
   allocator() throw() { }
diff --git a/libstdc++-v3/testsuite/20_util/allocator/requirements/typedefs.cc 
b/libstdc++-v3/testsuite/20_util/allocator/requirements/typedefs.cc
index 028daa9..1b3f14f 100644
--- a/libstdc++-v3/testsuite/20_util/allocator/requirements/typedefs.cc
+++ b/libstdc++-v3/testsuite/20_util/allocator/requirements/typedefs.cc
@@ -48,3 +48,6 @@ static_assert( is_same::rebind::other,
 static_assert( is_same::propagate_on_container_move_assignment,
std::true_type>::value,
"propagate_on_container_move_assignment" );
+
+static_assert( is_same::is_always_equal, std::true_type>::value,
+   "is_always_equal" );
diff --git a/libstdc++-v3/testsuite/util/testsuite_allocator.h 
b/libstdc++-v3/testsuite/util/testsuite_allocator.h
index 8537a83..dd7e22d 100644
--- a/libstdc++-v3/testsuite/util/testsuite_allocator.h
+++ b/libstdc++-v3/testsuite/util/testsuite_allocator.h
@@ -297,6 +297,7 @@ namespace __gnu_test
 
 #if __cplusplus >= 201103L
   typedef std::true_type   propagate_on_container_swap;
+  typedef std::false_type  is_always_equal;
 #endif
 
   template


[PATCH] LWG 2733, LWG 2759 reject bool in gcd and lcm

2016-10-10 Thread Jonathan Wakely

These DRs are only in Tentatively Ready status, but they're not
controversial so implementing them immediately seems sensible.

The deleted function is sufficient, but the static assertions are more
user-friendly (and are only tested once, not in every recursive call
to __gcd or __lcm).

* include/experimental/numeric (gcd, lcm): Make bool arguments
ill-formed.
* include/std/numeric (gcd, lcm): Likewise.
* testsuite/26_numerics/gcd/gcd_neg.cc: New test.
* testsuite/26_numerics/lcm/lcm_neg.cc: New test.

Tested x86_64-linux, committed to trunk.
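
For illustration only, the static-assertion approach can be sketched
outside the library as follows. checked_gcd is a made-up wrapper name that
simply forwards to std::gcd after the checks; it assumes C++17.

```cpp
#include <numeric>
#include <type_traits>

// Hypothetical wrapper showing the up-front checks: they fire once at the
// call site with a readable message, instead of deep inside a helper.
template<typename M, typename N>
constexpr std::common_type_t<M, N>
checked_gcd(M m, N n)
{
  static_assert(std::is_integral<M>::value, "gcd arguments are integers");
  static_assert(std::is_integral<N>::value, "gcd arguments are integers");
  static_assert(!std::is_same<M, bool>::value, "gcd arguments are not bools");
  static_assert(!std::is_same<N, bool>::value, "gcd arguments are not bools");
  return std::gcd(m, n);
}

// Ordinary integer arguments work, including mixed types and negatives.
static_assert(checked_gcd(12, 18) == 6, "gcd(12, 18)");
static_assert(checked_gcd(-4, 6L) == 2, "gcd uses absolute values");
static_assert(std::is_same<decltype(checked_gcd(4, 6L)), long>::value,
              "result type is the common type");

// checked_gcd(true, 4) would fail to compile with
// "gcd arguments are not bools".
```

Since the arguments are taken by value, deduction strips cv-qualifiers, so
a plain is_same comparison against bool is enough in this sketch.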

commit a785026d8d928a1492daf6919a57d6cda714f714
Author: Jonathan Wakely 
Date:   Mon Oct 10 11:58:27 2016 +0100

LWG 2733, LWG 2759 reject bool in gcd and lcm

* include/experimental/numeric (gcd, lcm): Make bool arguments
ill-formed.
* include/std/numeric (gcd, lcm): Likewise.
* testsuite/26_numerics/gcd/gcd_neg.cc: New test.
* testsuite/26_numerics/lcm/lcm_neg.cc: New test.

diff --git a/libstdc++-v3/include/experimental/numeric 
b/libstdc++-v3/include/experimental/numeric
index 6d1dc21..0ce4bda 100644
--- a/libstdc++-v3/include/experimental/numeric
+++ b/libstdc++-v3/include/experimental/numeric
@@ -57,8 +57,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr common_type_t<_Mn, _Nn>
 gcd(_Mn __m, _Nn __n)
 {
-  static_assert(is_integral<_Mn>::value, "arguments to gcd are integers");
-  static_assert(is_integral<_Nn>::value, "arguments to gcd are integers");
+  static_assert(is_integral<_Mn>::value, "gcd arguments are integers");
+  static_assert(is_integral<_Nn>::value, "gcd arguments are integers");
+  static_assert(!is_same<_Mn, bool>::value, "gcd arguments are not bools");
+  static_assert(!is_same<_Nn, bool>::value, "gcd arguments are not bools");
   return std::__detail::__gcd(__m, __n);
 }
 
@@ -67,8 +69,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr common_type_t<_Mn, _Nn>
 lcm(_Mn __m, _Nn __n)
 {
-  static_assert(is_integral<_Mn>::value, "arguments to lcm are integers");
-  static_assert(is_integral<_Nn>::value, "arguments to lcm are integers");
+  static_assert(is_integral<_Mn>::value, "lcm arguments are integers");
+  static_assert(is_integral<_Nn>::value, "lcm arguments are integers");
+  static_assert(!is_same<_Mn, bool>::value, "lcm arguments are not bools");
+  static_assert(!is_same<_Nn, bool>::value, "lcm arguments are not bools");
   return std::__detail::__lcm(__m, __n);
 }
 
diff --git a/libstdc++-v3/include/std/numeric b/libstdc++-v3/include/std/numeric
index 7b1ab98..4414081 100644
--- a/libstdc++-v3/include/std/numeric
+++ b/libstdc++-v3/include/std/numeric
@@ -96,6 +96,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __abs_integral(_Tp __val)
 { return __val; }
 
+  void __abs_integral(bool) = delete;
+
   template
 constexpr common_type_t<_Mn, _Nn>
 __gcd(_Mn __m, _Nn __n)
@@ -129,8 +131,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr common_type_t<_Mn, _Nn>
 gcd(_Mn __m, _Nn __n)
 {
-  static_assert(is_integral<_Mn>::value, "arguments to gcd are integers");
-  static_assert(is_integral<_Nn>::value, "arguments to gcd are integers");
+  static_assert(is_integral<_Mn>::value, "gcd arguments are integers");
+  static_assert(is_integral<_Nn>::value, "gcd arguments are integers");
+  static_assert(!is_same<_Mn, bool>::value, "gcd arguments are not bools");
+  static_assert(!is_same<_Nn, bool>::value, "gcd arguments are not bools");
   return __detail::__gcd(__m, __n);
 }
 
@@ -140,8 +144,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr common_type_t<_Mn, _Nn>
 lcm(_Mn __m, _Nn __n)
 {
-  static_assert(is_integral<_Mn>::value, "arguments to lcm are integers");
-  static_assert(is_integral<_Nn>::value, "arguments to lcm are integers");
+  static_assert(is_integral<_Mn>::value, "lcm arguments are integers");
+  static_assert(is_integral<_Nn>::value, "lcm arguments are integers");
+  static_assert(!is_same<_Mn, bool>::value, "lcm arguments are not bools");
+  static_assert(!is_same<_Nn, bool>::value, "lcm arguments are not bools");
   return __detail::__lcm(__m, __n);
 }
 
diff --git a/libstdc++-v3/testsuite/26_numerics/gcd/gcd_neg.cc 
b/libstdc++-v3/testsuite/26_numerics/gcd/gcd_neg.cc
new file mode 100644
index 000..231ce8d
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/gcd/gcd_neg.cc
@@ -0,0 +1,39 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the impl

[PATCH] Implement LWG 2192 and LWG 2294 for std::abs

2016-10-10 Thread Jonathan Wakely

It looks like I forgot to send this patch to the lists last month.

This implements the requirements that all overloads of std::abs are
declared by either of <cstdlib> or <cmath>. This ensures that
including only one of those headers and calling std::abs doesn't cause
conversions from integers to floating point types, or vice versa.

* doc/xml/manual/intro.xml: Document LWG 2192 changes.
* doc/html/*: Regenerate.
* include/Makefile.am: Add bits/std_abs.h.
* include/Makefile.in: Regenerate.
* include/bits/std_abs.h: New header defining all required overloads
of std::abs in one place (LWG 2294).
* include/c_global/cmath (abs(double), abs(float), abs(long double)):
Move to bits/std_abs.h.
(abs<_Tp>(_Tp)): Remove.
* include/c_global/cstdlib (abs(long), abs(long long), abs(__int)):
Move to bits/std_abs.h.
* testsuite/26_numerics/headers/cmath/dr2192.cc: New test.
* testsuite/26_numerics/headers/cmath/dr2192_neg.cc: New test.
* testsuite/26_numerics/headers/cstdlib/dr2192.cc: New test.
* testsuite/26_numerics/headers/cstdlib/dr2192_neg.cc: New test.

Tested on ppc64le and x86_64 GNU/Linux, and committed to trunk on 30
September.
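
For illustration only, the intended overload selection can be checked like
this. The sketch includes both headers so that it also holds on libraries
without the LWG 2192 change; with the change, either header alone should
give the same result types.

```cpp
#include <cmath>
#include <cstdlib>
#include <type_traits>

// Integer arguments stay integers ...
static_assert(std::is_same<decltype(std::abs(-3)), int>::value, "int");
static_assert(std::is_same<decltype(std::abs(-3L)), long>::value, "long");
static_assert(std::is_same<decltype(std::abs(-3LL)), long long>::value,
              "long long");

// ... and floating-point arguments stay floating point.
static_assert(std::is_same<decltype(std::abs(-2.5f)), float>::value, "float");
static_assert(std::is_same<decltype(std::abs(-2.5)), double>::value, "double");
static_assert(std::is_same<decltype(std::abs(-2.5L)), long double>::value,
              "long double");

// std::abs(0u) is deliberately not shown: with the generic template
// removed, an unsigned argument has no exact overload.
```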

commit 9e441fcfca8a90e72a3cdbed42303dc2353b3da2
Author: redi 
Date:   Fri Sep 30 16:07:43 2016 +

Implement LWG 2192 and LWG 2294 for std::abs

* doc/xml/manual/intro.xml: Document LWG 2192 changes.
* doc/html/*: Regenerate.
* include/Makefile.am: Add bits/std_abs.h.
* include/Makefile.in: Regenerate.
* include/bits/std_abs.h: New header defining all required overloads
of std::abs in one place (LWG 2294).
* include/c_global/cmath (abs(double), abs(float), abs(long double)):
Move to bits/std_abs.h.
(abs<_Tp>(_Tp)): Remove.
* include/c_global/cstdlib (abs(long), abs(long long), abs(__int)):
Move to bits/std_abs.h.
* testsuite/26_numerics/headers/cmath/dr2192.cc: New test.
* testsuite/26_numerics/headers/cmath/dr2192_neg.cc: New test.
* testsuite/26_numerics/headers/cstdlib/dr2192.cc: New test.
* testsuite/26_numerics/headers/cstdlib/dr2192_neg.cc: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@240660 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/doc/xml/manual/intro.xml 
b/libstdc++-v3/doc/xml/manual/intro.xml
index 238ab24..4747851 100644
--- a/libstdc++-v3/doc/xml/manual/intro.xml
+++ b/libstdc++-v3/doc/xml/manual/intro.xml
@@ -940,6 +940,13 @@ requirements of the license of GCC.
 Add emplace and emplace_back 
member functions.
 
 
+http://www.w3.org/1999/xlink"; 
xlink:href="../ext/lwg-defects.html#2192">2192:
+   Validity and return type of std::abs(0u) is 
unclear
+
+Move all declarations to a common header and remove the
+generic abs which accepted unsigned arguments.
+
+
 http://www.w3.org/1999/xlink"; 
xlink:href="../ext/lwg-defects.html#2196">2196:
Specification of 
is_*[copy/move]_[constructible/assignable] unclear for 
non-referencable types
 
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 7782258..4e63fbb 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -159,6 +159,7 @@ bits_headers = \
${bits_srcdir}/shared_ptr_base.h \
${bits_srcdir}/slice_array.h \
${bits_srcdir}/sstream.tcc \
+   ${bits_srcdir}/std_abs.h \
${bits_srcdir}/std_mutex.h \
${bits_srcdir}/stl_algo.h \
${bits_srcdir}/stl_algobase.h \
diff --git a/libstdc++-v3/include/bits/std_abs.h 
b/libstdc++-v3/include/bits/std_abs.h
new file mode 100644
index 000..ab0f980
--- /dev/null
+++ b/libstdc++-v3/include/bits/std_abs.h
@@ -0,0 +1,107 @@
+// -*- C++ -*- C library enhancements header.
+
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file include/bits/std_abs.h
+ *  This is 

Re: [PATCH, ARM 5/7] Add support for MOVT/MOVW to ARMv8-M Baseline

2016-10-10 Thread Christophe Lyon
Hi Thomas,

On 13 July 2016 at 17:34, Thomas Preudhomme
 wrote:
> On Wednesday 13 July 2016 17:14:52 Christophe Lyon wrote:
>> Hi Thomas,
>
> Hi Christophe,
>
>>
>> I'm seeing:
>> gcc.target/arm/pr42574.c: syntax error in target selector
>> "arm_thumb1_ok && { ! arm_thumb1_movt_ok }" for " dg-do 1 compile {
>> arm_thumb1_ok && { ! arm_thumb1_movt_ok } } "
>
> Oops. I remember the trial and error to find the right amount of curly braces
> yet I can indeed reproduce the error now. The target keyword is missing. I'll
> submit a patch asap.
>
> Best regards,
>
> Thomas

I've noticed that the new test
  gcc.target/arm/movdi_movw.c scan-assembler-times movw\tr0, #61680 1
fails on armeb-none-linux-gnueabihf
--with-mode=thumb --with-cpu=cortex-a9 --with-fpu=neon-fp16

The other new tests pass, and using --with-mode=arm makes all three
of them unsupported.

Sorry I missed it when I reported the other error.

Can you have a look?

Thanks

Christophe


[PATCH] [ARC] New option handling, refurbish multilib support.

2016-10-10 Thread Claudiu Zissulescu
Hi Andrew,

This is an updated version of the patch originally sent to the mailing
list a while ago.

What is new:
 - Do not use MULTILIB_REUSE, as its semantics changed and the old usage
was causing build issues.
 - Update the invoke.texi documentation, adding the nps400 option to mcpu.

This patch is important as it changes how we handle CPU variations and
multilib support. It would be great if you could review this patch
before any of the others.

Thanks,
Claudiu

gcc/
2016-05-09  Claudiu Zissulescu  

* config/arc/arc-arch.h: New file.
* config/arc/arc-arches.def: Likewise.
* config/arc/arc-cpus.def: Likewise.
* config/arc/arc-options.def: Likewise.
* config/arc/t-multilib: Likewise.
* config/arc/genmultilib.awk: Likewise.
* config/arc/genoptions.awk: Likewise.
* config/arc/arc-tables.opt: Likewise.
* config/arc/driver-arc.c: Likewise.
* common/config/arc/arc-common.c (arc_handle_option): Trace
toggled options.
* config.gcc (arc*-*-*): Add arc-tables.opt to arc's extra
options; check for supported cpu against arc-cpus.def file.
(arc*-*-elf*, arc*-*-linux-uclibc*): Use new make fragment; define
TARGET_CPU_BUILD macro; add driver-arc.o as an extra object.
* config/arc/arc-c.def: Add emacs local variables.
* config/arc/arc-opts.h (processor_type): Use arc-cpus.def file.
(FPU_FPUS, FPU_FPUD, FPU_FPUDA, FPU_FPUDA_DIV, FPU_FPUDA_FMA)
(FPU_FPUDA_ALL, FPU_FPUS_DIV, FPU_FPUS_FMA, FPU_FPUS_ALL)
(FPU_FPUD_DIV, FPU_FPUD_FMA, FPU_FPUD_ALL): New defines.
(DEFAULT_arc_fpu_build): Define.
(DEFAULT_arc_mpy_option): Define.
* config/arc/arc-protos.h (arc_init): Delete.
* config/arc/arc.c (arc_cpu_name): New variable.
(arc_selected_cpu, arc_selected_arch, arc_arcem, arc_archs)
(arc_arc700, arc_arc600, arc_arc601): New variable.
(arc_init): Add static; remove selection of default tune value,
cleanup obsolete error messages.
(arc_override_options): Make use of .def files for selecting the
right cpu and option configurations.
* config/arc/arc.h (stdbool.h): Include.
(TARGET_CPU_DEFAULT): Define.
(CPP_SPEC): Remove mcpu=NPS400 handling.
(arc_cpu_to_as): Declare.
(EXTRA_SPEC_FUNCTIONS): Define.
(OPTION_DEFAULT_SPECS): Likewise.
(ASM_DEFAULT): Remove.
(ASM_SPEC): Use arc_cpu_to_as.
(DRIVER_SELF_SPECS): Remove deprecated options.
(arc_arc700, arc_arc600, arc_arc601, arc_arcem, arc_archs):
Declare.
(TARGET_ARC600, TARGET_ARC601, TARGET_ARC700, TARGET_EM)
(TARGET_HS, TARGET_V2, TARGET_ARC600): Make them use arc_arc*
variables.
(MULTILIB_DEFAULTS): Use ARC_MULTILIB_CPU_DEFAULT.
* config/arc/arc.md (attr_cpu): Remove.
* config/arc/arc.opt (arc_mpy_option): Make it target variable.
(mno-mpy): Deprecate.
(mcpu=ARC600, mcpu=ARC601, mcpu=ARC700, mcpu=NPS400, mcpu=ARCEM)
(mcpu=ARCHS): Remove.
(mcrc, mdsp-packa, mdvbf, mmac-d16, mmac-24, mtelephony, mrtsc):
Deprecate.
(mbarrel_shifte, mspfp_, mdpfp_, mdsp_pack, mmac_): Remove.
(arc_fpu): Use new defines.
(arc_seen_options): New target variable.
* config/arc/t-arc (driver-arc.o): New target.
(arc-cpus, t-multilib, arc-tables.opt): Likewise.
* config/arc/t-arc-newlib: Delete.
* config/arc/t-arc-uClibc: Renamed to t-uClibc.
* doc/invoke.texi (ARC): Update arc options.
---
 gcc/common/config/arc/arc-common.c | 162 -
 gcc/config.gcc |  47 +
 gcc/config/arc/arc-arch.h  | 120 ++
 gcc/config/arc/arc-arches.def  |  35 +++
 gcc/config/arc/arc-c.def   |   4 +
 gcc/config/arc/arc-cpus.def|  47 +
 gcc/config/arc/arc-options.def |  69 +
 gcc/config/arc/arc-opts.h  |  47 +++--
 gcc/config/arc/arc-protos.h|   1 -
 gcc/config/arc/arc-tables.opt  |  90 
 gcc/config/arc/arc.c   | 186 ++---
 gcc/config/arc/arc.h   |  91 -
 gcc/config/arc/arc.md  |   5 -
 gcc/config/arc/arc.opt | 109 ++--
 gcc/config/arc/driver-arc.c|  80 +++
 gcc/config/arc/genmultilib.awk | 203 +
 gcc/config/arc/genoptions.awk  |  86 
 gcc/config/arc/t-arc   |  19 
 gcc/config/arc/t-arc-newlib|  46 -
 gcc/config/arc/t-arc-uClibc|  20 
 gcc/config/arc/t-multilib  |  34 +++
 gcc/config/arc/t-uClibc|  20 
 gcc/doc/invoke.texi|  90 +---
 23 files changed, 1235 insertions(+), 376 deletions(-)
 create mode 100

[patch,avr] Use avr-passes.def to register passes.

2016-10-10 Thread Georg-Johann Lay
This is a code clean-up using the new -passes.def feature in order to 
register avr target passes and to get -fdump-xxx etc. to work for such passes.


Ok for trunk?

Johann

* config/avr/avr-passes.def: New file.
* config/avr/t-avr (PASSES_EXTRA): Add avr-passes.def.
* config/avr/avr-protos.h (gcc::context, rtl_opt_pass): Declare.
(make_avr_pass_recompute_notes): New proto.
* config/avr/avr.c (make_avr_pass_recompute_notes): New function.
(avr_pass_recompute_notes): Use anonymous namespace.
(avr_register_passes): Remove function...
(avr_option_override): ...and its call.
Index: config/avr/avr-passes.def
===
--- config/avr/avr-passes.def	(nonexistent)
+++ config/avr/avr-passes.def	(working copy)
@@ -0,0 +1,28 @@
+/* Description of target passes for AVR.
+   Copyright (C) 2016 Free Software Foundation, Inc. */
+
+/* This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 3, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+/* This avr-specific pass (re)computes insn notes, in particular REG_DEAD
+   notes which are used by `avr.c::reg_unused_after' and branch offset
+   computations.  These notes must be correct, i.e. there must be no
+   dangling REG_DEAD notes; otherwise wrong code might result, cf. PR64331.
+
+   DF needs (correct) CFG, hence right before free_cfg is the last
+   opportunity to rectify notes.  */
+
+INSERT_PASS_BEFORE (pass_free_cfg, 1, avr_pass_recompute_notes);
Index: config/avr/avr-protos.h
===
--- config/avr/avr-protos.h	(revision 240915)
+++ config/avr/avr-protos.h	(working copy)
@@ -154,6 +154,11 @@ extern void asm_output_float (FILE *file
 
 extern bool avr_have_dimode;
 
+namespace gcc { class context; }
+class rtl_opt_pass;
+
+extern rtl_opt_pass *make_avr_pass_recompute_notes (gcc::context *);
+
 /* From avr-log.c */
 
 #define avr_dump(...) avr_vdump (NULL, __FUNCTION__, __VA_ARGS__)
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 240915)
+++ config/avr/avr.c	(working copy)
@@ -295,6 +295,7 @@ avr_to_int_mode (rtx x)
 : simplify_gen_subreg (int_mode_for_mode (mode), x, mode, 0);
 }
 
+namespace {
 
 static const pass_data avr_pass_data_recompute_notes =
 {
@@ -328,20 +329,12 @@ public:
   }
 }; // avr_pass_recompute_notes
 
+} // anon namespace
 
-static void
-avr_register_passes (void)
+rtl_opt_pass*
+make_avr_pass_recompute_notes (gcc::context *ctxt)
 {
-  /* This avr-specific pass (re)computes insn notes, in particular REG_DEAD
- notes which are used by `avr.c::reg_unused_after' and branch offset
- computations.  These notes must be correct, i.e. there must be no
- dangling REG_DEAD notes; otherwise wrong code might result, cf. PR64331.
-
- DF needs (correct) CFG, hence right before free_cfg is the last
- opportunity to rectify notes.  */
-
-  register_pass (new avr_pass_recompute_notes (g, "avr-notes-free-cfg"),
- PASS_POS_INSERT_BEFORE, "*free_cfg", 1);
+  return new avr_pass_recompute_notes (ctxt, "avr-notes-free-cfg");
 }
 
 
@@ -464,11 +457,6 @@ avr_option_override (void)
   init_machine_status = avr_init_machine_status;
 
   avr_log_set_avr_log();
-
-  /* Register some avr-specific pass(es).  There is no canonical place for
- pass registration.  This function is convenient.  */
-
-  avr_register_passes ();
 }
 
 /* Function to set up the backend function structure.  */
Index: config/avr/t-avr
===
--- config/avr/t-avr	(revision 240915)
+++ config/avr/t-avr	(working copy)
@@ -16,6 +16,8 @@
 # along with GCC; see the file COPYING3.  If not see
 # .
 
+PASSES_EXTRA += $(srcdir)/config/avr/avr-passes.def
+
 driver-avr.o: $(srcdir)/config/avr/driver-avr.c \
   $(CONFIG_H) $(SYSTEM_H) coretypes.h \
   $(srcdir)/config/avr/avr-arch.h $(TM_H)


Re: Compile-time improvement for if conversion.

2016-10-10 Thread Yuri Rumyantsev
Richard,

If a "fake" exit or entry block is created in dominance.c, how can we
determine its only predecessor or successor without using
the notion of a loop?

2016-10-10 15:00 GMT+03:00 Richard Biener :
> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev  wrote:
>> Thanks Richard for your comments.
>> I'd like to respond to your last comment about using split_edge()
>> instead of creating a fake post-header. I started with this splitting,
>> but it requires fixing up closed SSA form by creating additional phi
>> nodes, so I decided to change only the CFG without updating SSA form.
>> The other changes look reasonable and I will make them.
>
> Ah.  In this case can you investigate what it takes to make the entry/exit
> edges rather than BBs?  That is, introduce those "fakes" only internally
> in dominance.c?
>
>> 2016-10-10 12:52 GMT+03:00 Richard Biener :
>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev  wrote:
 Hi All,

 Here is implementation of Richard proposal:

 < For general infrastructure it would be nice to expose a (post-)dominator
 < compute for MESE (post-dominators) / SEME (dominators) regions.  I 
 believe
 < what makes if-conversion expensive is the post-dom compute which happens
 < for each loop for the whole function.  It shouldn't be very difficult
 < to write this,
 < sharing as much as possible code with the current DOM code might need
 < quite some refactoring though.

 I implemented this proposal by adding calculation of dominance info
 for SESE regions and incorporate this change to if conversion pass.
 SESE region is built by adding loop pre-header and possibly fake
 post-header blocks to loop body. Fake post-header is deleted after
 predication completion.

 Bootstrapping and regression testing did not show any new failures.

 Is it OK for trunk?
>>>
>>> It's mostly reasonable but I have a few comments.  First, re-using
>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>> a requirement to make the patch reasonably small.  Please,
>>> in calculate_dominance_info_for_region, make sure that
>>> !dom_info_available_p (dir).
>>>
>>> You pass loop * everywhere but require ->aux to be set up as
>>> an array of BBs forming the region with special BBs at array ends.
>>>
>>> Please instead pass in a vec which avoids using ->aux
>>> and also allows other non-loop-based SESE regions to be used
>>> (I couldn't spot anything that relies on this being a loop).
>>>
>>> Adding a convenience wrapper for loop  * would be of course nice,
>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>
>>> In theory a SESE region is fully specified by its entry and exit _edge_,
>>> so you might want to see if it's possible to use such a pair of edges
>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>
>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>> please use split_edge() of the entry/exit edges.
>>>
>>> Richard.
>>>
 ChangeLog:
 2016-10-05  Yuri Rumyantsev  

 * dominance.c : Include cfgloop.h for loop recognition.
 (dom_info): Add new functions and add boolean argument to recognize
 computation for loop region.
 (dom_info::dom_info): New function.
 (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
 handle unvisited blocks.
 (dom_info::calc_idoms): Likewise.
 (compute_dom_fast_query_in_region): New function.
 (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
 false argument.
 (calculate_dominance_info_for_region): New function.
 (free_dominance_info_for_region): Likewise.
 (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
 argument.
 * dominance.h: Add prototype for introduced functions
 calculate_dominance_info_for_region and
 free_dominance_info_for_region.
 * tree-if-conv.c: Add local variables ifc_sese_bbs & fake_postheader.
 (build_sese_region): New function.
 (if_convertible_loop_p_1): Invoke local version of post-dominators
 calculation, free it after basic block predication and delete created
 fake post-header block if any.
 (tree_if_conversion): Delete call of free_dominance_info for
 post-dominators, free ifc_sese_bbs which represents SESE region.
 (pass_if_conversion::execute): Delete detection of infinite loops
 and fake edges to exit block since post-dominator calculation is
 performed per if-converted loop only.
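
For readers following this thread, the idea of computing dominators for a single-entry region only, instead of for the whole function, can be sketched as below. This is a minimal illustration and not the code from the patch: it uses the classic iterative set-intersection formulation rather than the algorithm in dominance.c, and Region, region_idoms and diamond_region are hypothetical names. It assumes at most 64 blocks, block 0 as the single entry, every block reachable from the entry, and predecessor lists restricted to blocks inside the region.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical region description (not a GCC type): blocks are numbered
// 0..n-1 inside the region, block 0 is the single entry, and preds[b] lists
// only those predecessors of b that lie inside the region.
struct Region {
  std::vector<std::vector<int>> preds;
};

// Number of set bits in a dominator set.
static int count_bits(std::uint64_t set) {
  int n = 0;
  for (; set != 0; set &= set - 1) ++n;
  return n;
}

// Returns the immediate dominator of every block; the entry gets -1.
std::vector<int> region_idoms(const Region& r) {
  const int n = static_cast<int>(r.preds.size());
  assert(n > 0 && n <= 64);
  const std::uint64_t all =
      (n == 64) ? ~std::uint64_t{0} : ((std::uint64_t{1} << n) - 1);
  // dom[b] is a bit set of the blocks dominating b; start from "everything".
  std::vector<std::uint64_t> dom(n, all);
  dom[0] = 1;  // the entry is dominated only by itself
  for (bool changed = true; changed;) {
    changed = false;
    for (int b = 1; b < n; ++b) {
      std::uint64_t meet = all;
      for (int p : r.preds[b]) meet &= dom[p];
      meet |= std::uint64_t{1} << b;
      if (meet != dom[b]) {
        dom[b] = meet;
        changed = true;
      }
    }
  }
  // The strict dominators of b form a chain; the one with the largest
  // dominator set of its own is closest to b, i.e. the immediate dominator.
  std::vector<int> idom(n, -1);
  for (int b = 1; b < n; ++b) {
    int best_size = -1;
    for (int d = 0; d < n; ++d) {
      if (d == b || ((dom[b] >> d) & 1) == 0) continue;
      const int size = count_bits(dom[d]);
      if (size > best_size) {
        best_size = size;
        idom[b] = d;
      }
    }
  }
  return idom;
}

// Diamond-shaped region: 0 -> {1, 2} -> 3.
Region diamond_region() {
  Region r;
  r.preds = {{}, {0}, {0}, {1, 2}};
  return r;
}
```

Post-dominators for a single-exit region follow from the same routine applied to the reversed edges, with the exit block numbered 0.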


[avr,committed] Include string.h in gen-avr-mmcu-texi.c

2016-10-10 Thread Georg-Johann Lay

https://gcc.gnu.org/r240925
https://gcc.gnu.org/r240926
https://gcc.gnu.org/r240927

gen-avr-mmcu-texi.c missed the inclusion of string.h (for strcmp).  Applied as 
obvious.



Johann

* config/avr/gen-avr-mmcu-texi.c (string.h): Include.

Index: config/avr/gen-avr-mmcu-texi.c
===
--- config/avr/gen-avr-mmcu-texi.c  (revision 240924)
+++ config/avr/gen-avr-mmcu-texi.c  (revision 240925)
@@ -19,6 +19,7 @@

 #include 
 #include 
+#include <string.h>

 #define IN_GEN_AVR_MMCU_TEXI



[patch, fortran, committed] Fix PR 77915

2016-10-10 Thread Thomas Koenig

Hello world,

I have committed the attached patch to trunk as obvious after
regression-testing. Will commit to gcc6 soon.

Regards

Thomas

2016-10-10  Thomas Koenig  

PR fortran/77915
* frontend-passes.c (inline_matmul_assign):  Return early if
inside a FORALL statement.

2016-10-10  Thomas Koenig  

PR fortran/77915
* gfortran.dg/matmul_11.f90:  New test.
Index: frontend-passes.c
===
--- frontend-passes.c	(revision 240927)
+++ frontend-passes.c	(working copy)
@@ -2857,6 +2857,11 @@ inline_matmul_assign (gfc_code **c, int *walk_subt
   if (in_where)
 return 0;
 
+  /* The BLOCKS generated for the temporary variables and FORALL don't
+ mix.  */
+  if (forall_level > 0)
+return 0;
+
   /* For now don't do anything in OpenMP workshare, it confuses
  its translation, which expects only the allowed statements in there.
  We should figure out how to parallelize this eventually.  */
! { dg-do compile }
! { dg-options "-ffrontend-optimize -fdump-tree-original" }
! PR 77915 - ICE of matmul with forall.
program x
  integer, parameter :: d = 3
  real,dimension(d,d,d) :: cube,xcube
  real, dimension(d,d) :: cmatrix
  integer :: i,j
  forall(i=1:d,j=1:d)
 xcube(i,j,:) = matmul(cmatrix,cube(i,j,:))
  end forall
end program x

! { dg-final { scan-tree-dump-times "_gfortran_matmul" 1 "original" } }


Re: [PATCH v4 0/6] Separate shrink-wrapping

2016-10-10 Thread Segher Boessenkool
Ping.

On Mon, Oct 03, 2016 at 01:48:17PM +, Segher Boessenkool wrote:
> I updated according to Jeff's latest comments (importantly, we cannot
> move a *logue in front of a move in general), and added some testcases.
> 
> Bootstrapping is in progress on today's trunk, powerpc64-linux and
> powerpc64le-linux.
> 
> Is this okay to commit now?
> 
> 
> Segher
> 
> 
> Segher Boessenkool (6):
>   separate shrink-wrap: New command-line flag, status flag, hooks, and
> doc
>   dce: Don't dead-code delete separately wrapped restores
>   regrename: Don't rename restores
>   shrink-wrap: Shrink-wrapping for separate components
>   rs6000: Separate shrink-wrapping
>   shrink-wrap: Testcases for separate shrink-wrapping
> 
>  gcc/common.opt |   4 +
>  gcc/config/rs6000/rs6000.c | 269 +++-
>  gcc/dce.c  |   9 +
>  gcc/doc/invoke.texi|  11 +-
>  gcc/doc/tm.texi|  63 ++
>  gcc/doc/tm.texi.in |  38 ++
>  gcc/emit-rtl.h |   4 +
>  gcc/function.c |  15 +-
>  gcc/regrename.c|   7 +
>  gcc/shrink-wrap.c  | 741 
> +
>  gcc/shrink-wrap.h  |   1 +
>  gcc/target.def |  57 ++
>  .../gcc.target/powerpc/shrink-wrap-separate-0.c|  22 +
>  .../gcc.target/powerpc/shrink-wrap-separate-1.c|  18 +
>  .../gcc.target/powerpc/shrink-wrap-separate-2.c|  26 +
>  15 files changed, 1265 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c
> 
> -- 
> 1.9.3


PING! Re: [PATCH, Fortran] Extension: COTAN and degree-valued trig intrinsics with -fdec-math

2016-10-10 Thread Fritz Reese
https://gcc.gnu.org/ml/fortran/2016-09/msg00163.html [original]
https://gcc.gnu.org/ml/fortran/2016-09/msg00183.html [latest]

On Wed, Sep 28, 2016 at 4:14 PM, Fritz Reese  wrote:
> Attached is a patch extending the GNU Fortran front-end to support
> some additional math intrinsics, enabled with a new compile flag
> -fdec-math. The flag adds the COTAN intrinsic (cotangent), as well as
> degree versions of all trigonometric intrinsics (SIND, TAND, ACOSD,
> etc...). This extension allows for further compatibility with legacy
> code that depends on the compiler to support such intrinsic functions.

Patch is still pending. Current draft of the patch is re-attached for
convenience, since it was amended twice since the original post. OK
for trunk?
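
As a rough illustration of what the degree-valued intrinsics mean mathematically, here is a small sketch. It is not taken from the patch and says nothing about how the front end resolves or simplifies these calls; the namespace and helper names are hypothetical, chosen only to mirror the intrinsic names listed above.

```cpp
#include <cassert>
#include <cmath>

namespace dec_math_sketch {

const double kPi = 3.14159265358979323846;

inline double radians(double deg) { return deg * kPi / 180.0; }
inline double degrees(double rad) { return rad * 180.0 / kPi; }

// Degree-valued forward functions convert the argument first.
inline double sind(double deg) { return std::sin(radians(deg)); }
inline double cosd(double deg) { return std::cos(radians(deg)); }
inline double tand(double deg) { return std::tan(radians(deg)); }

// Degree-valued inverse functions convert the result instead.
inline double asind(double x) { return degrees(std::asin(x)); }
inline double acosd(double x) { return degrees(std::acos(x)); }
inline double atand(double x) { return degrees(std::atan(x)); }
inline double atan2d(double y, double x) { return degrees(std::atan2(y, x)); }

// Cotangent of an angle in radians, and its degree-valued counterpart.
inline double cotan(double rad) { return std::cos(rad) / std::sin(rad); }
inline double cotand(double deg) { return cotan(radians(deg)); }

// Comparison helper with a loose absolute tolerance.
inline bool nearly_equal(double a, double b) { return std::fabs(a - b) < 1e-9; }

}  // namespace dec_math_sketch
```

With these definitions, sind(30.0) is approximately 0.5 and acosd(0.5) is approximately 60.0, up to floating-point rounding.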

---
Fritz Reese


2016-09-28  Fritz Reese  

 New flag -fdec-math for COTAN and degree trig intrinsics.

gcc/fortran/
* lang.opt: New flag -fdec-math.
* options.c (set_dec_flags): Enable with -fdec.
* invoke.texi, gfortran.texi, intrinsic.texi: Update documentation.
* intrinsics.c (add_functions, do_simplify): New intrinsics
with -fdec-math.
* gfortran.h (gfc_isym_id): New isym GFC_ISYM_COTAN.
* gfortran.h (gfc_resolve_atan2d, gfc_resolve_cotan,
gfc_resolve_trigd, gfc_resolve_atrigd): New prototypes.
* iresolve.c (resolve_trig_call, get_degrees, get_radians,
is_trig_resolved, gfc_resolve_cotan, gfc_resolve_trigd,
gfc_resolve_atrigd, gfc_resolve_atan2d): New functions.
* intrinsics.h (gfc_simplify_atan2d, gfc_simplify_atrigd,
gfc_simplify_cotan, gfc_simplify_trigd): New prototypes.
* simplify.c (simplify_trig_call, degrees_f, radians_f,
gfc_simplify_cotan, gfc_simplify_trigd, gfc_simplify_atrigd,
gfc_simplify_atan2d): New functions.

gcc/testsuite/gfortran.dg/
* dec_math.f90: New testsuite.
commit 126e89b660fad6b21f50c48e2af616225a727586
Author: Fritz Reese 
Date:   Wed Sep 28 16:11:23 2016 -0400

	New flag -fdec-math for COTAN and degree trig intrinsics.

	gcc/fortran/
	* lang.opt: New flag -fdec-math.
	* options.c (set_dec_flags): Enable with -fdec.
	* invoke.texi, gfortran.texi, intrinsic.texi: Update documentation.
	* intrinsics.c (add_functions, do_simplify): New intrinsics
	with -fdec-math.
	* gfortran.h (gfc_isym_id): New isym GFC_ISYM_COTAN.
	* gfortran.h (gfc_resolve_atan2d, gfc_resolve_cotan,
	gfc_resolve_trigd, gfc_resolve_atrigd): New prototypes.
	* iresolve.c (resolve_trig_call, get_degrees, get_radians,
	is_trig_resolved, gfc_resolve_cotan, gfc_resolve_trigd,
	gfc_resolve_atrigd, gfc_resolve_atan2d): New functions.
	* intrinsics.h (gfc_simplify_atan2d, gfc_simplify_atrigd,
	gfc_simplify_cotan, gfc_simplify_trigd): New prototypes.
	* simplify.c (simplify_trig_call, degrees_f, radians_f,
	gfc_simplify_cotan, gfc_simplify_trigd, gfc_simplify_atrigd,
	gfc_simplify_atan2d): New functions.

	gcc/testsuite/gfortran.dg/
	* dec_math.f90: New testsuite.

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index d6b92a6..f8f3d4a 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -391,6 +391,7 @@ enum gfc_isym_id
   GFC_ISYM_CONVERSION,
   GFC_ISYM_COS,
   GFC_ISYM_COSH,
+  GFC_ISYM_COTAN,
   GFC_ISYM_COUNT,
   GFC_ISYM_CPU_TIME,
   GFC_ISYM_CSHIFT,
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 3ebe3c7..a11eb84 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -1466,6 +1466,7 @@ without warning.
 * Form feed as whitespace::
 * TYPE as an alias for PRINT::
 * %LOC as an rvalue::
+* Extended math intrinsics::
 @end menu
 
 @node Old-style kind specifications
@@ -2519,6 +2520,42 @@ integer :: i
 call sub(%loc(i))
 @end smallexample
 
+@node Extended math intrinsics
+@subsection Extended math intrinsics
+@cindex intrinsics, math
+@cindex intrinsics, trigonometric functions
+
+GNU Fortran supports an extended list of mathematical intrinsics with the
+compile flag @option{-fdec-math} for compatability with legacy code.
+These intrinsics are described fully in @ref{Intrinsic Procedures} where it is
+noted that they are extensions and should be avoided whenever possible.
+
+Specifically, @option{-fdec-math} enables the @ref{COTAN} intrinsic, and
+trigonometric intrinsics which accept or produce values in degrees instead of
+radians.  Here is a summary of the new intrinsics:
+
+@multitable @columnfractions .5 .5
+@headitem Radians @tab Degrees
+@item @code{@ref{ACOS}}   @tab @code{@ref{ACOSD}}*
+@item @code{@ref{ASIN}}   @tab @code{@ref{ASIND}}*
+@item @code{@ref{ATAN}}   @tab @code{@ref{ATAND}}*
+@item @code{@ref{ATAN2}}  @tab @code{@ref{ATAN2D}}*
+@item @code{@ref{COS}}@tab @code{@ref{COSD}}*
+@item @code{@ref{COTAN}}* @tab @code{@ref{COTAND}}*
+@item @code{@ref{SIN}}@tab @code{@ref{SIND}}*
+@item @code{@ref{TAN}}@tab @code{@ref{TAND}}*
+@end 

Re: [PATCH 4/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Andris Pavenis

On 09/25/2016 07:25 PM, Arnaud Charlet wrote:

>>  int
>>  __gnat_get_maximum_file_name_length (void)
>>  {
>> +#if defined (__DJGPP__)
>> +  return (_use_lfn(".")) ? -1 : 8;
>> +#else
>>    return -1;
>> +#endif
>>  }
>
> Is the above change really necessary? Would be nice to get rid of this
> extra code. The rest looks OK to me.


It is possible to leave this part out for now.

We could return to this part later separately.

Andris

PS. What about last versions of other 2 not yet approved patches (1 and 3)?

>From bd1698bff232bdc4258c70f49add1869276184db Mon Sep 17 00:00:00 2001
From: Andris Pavenis 
Date: Mon, 10 Oct 2016 18:14:52 +0300
Subject: [PATCH 4/4] [DJGPP, Ada] Ada support

* ada/adaint.c: Include process.h, signal.h, dir.h and utime.h for DJGPP.
  ISALPHA: include <ctype.h> and define to isalpha for DJGPP when IN_RTS is defined.
  (DIR_SEPARATOR) define to '\\' for DJGPP.
  (__gnat_get_file_names_case_sensitive): return 0 for DJGPP unless
  overridden in environment
  (__gnat_is_absolute_path): Support MS-DOS style absolute paths for DJGPP.
  (__gnat_portable_spawn): Use spawnvp for DJGPP.
  (__gnat_portable_no_block_spawn): Use spawnvp for DJGPP.
  (__gnat_portable_wait): Return 0 for DJGPP.
---
 gcc/ada/adaint.c | 39 ---
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index f317865..17d6f1f 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -112,7 +112,18 @@
 extern "C" {
 #endif
 
-#if defined (__MINGW32__) || defined (__CYGWIN__)
+#if defined (__DJGPP__)
+
+/* For isalpha-like tests in the compiler, we're expected to resort to
+   safe-ctype.h/ISALPHA.  This isn't available for the runtime library
+   build, so we fallback on ctype.h/isalpha there.  */
+
+#ifdef IN_RTS
+#include <ctype.h>
+#define ISALPHA isalpha
+#endif
+
+#elif defined (__MINGW32__) || defined (__CYGWIN__)
 
 #include "mingw32.h"
 
@@ -165,11 +176,16 @@ UINT CurrentCCSEncoding;
 #include 
 #endif
 
-#if defined (_WIN32)
-
+#if defined (__DJGPP__)
 #include 
 #include 
 #include 
+#include 
+#undef DIR_SEPARATOR
+#define DIR_SEPARATOR '\\'
+
+#elif defined (_WIN32)
+
 #include 
 #include 
 #include 
@@ -560,7 +576,7 @@ __gnat_get_file_names_case_sensitive (void)
 	{
 	  /* By default, we suppose filesystems aren't case sensitive on
 	 Windows and Darwin (but they are on arm-darwin).  */
-#if defined (WINNT) \
+#if defined (WINNT) || defined (__DJGPP__) \
   || (defined (__APPLE__) && !(defined (__arm__) || defined (__arm64__)))
 	  file_names_case_sensitive_cache = 0;
 #else
@@ -576,7 +592,7 @@ __gnat_get_file_names_case_sensitive (void)
 int
 __gnat_get_env_vars_case_sensitive (void)
 {
-#if defined (WINNT)
+#if defined (WINNT) || defined (__DJGPP__)
  return 0;
 #else
  return 1;
@@ -1646,7 +1662,7 @@ __gnat_is_absolute_path (char *name, int length)
 #else
   return (length != 0) &&
  (*name == '/' || *name == DIR_SEPARATOR
-#if defined (WINNT)
+#if defined (WINNT) || defined(__DJGPP__)
   || (length > 1 && ISALPHA (name[0]) && name[1] == ':')
 #endif
 	  );
@@ -2234,7 +2250,7 @@ __gnat_portable_spawn (char *args[] ATTRIBUTE_UNUSED)
 #if defined (__vxworks) || defined(__PikeOS__)
   return -1;
 
-#elif defined (_WIN32)
+#elif defined (__DJGPP__) || defined (_WIN32)
   /* args[0] must be quotes as it could contain a full pathname with spaces */
   char *args_0 = args[0];
   args[0] = (char *)xmalloc (strlen (args_0) + 3);
@@ -2606,6 +2622,12 @@ __gnat_portable_no_block_spawn (char *args[] ATTRIBUTE_UNUSED)
   /* Not supported.  */
   return -1;
 
+#elif defined(__DJGPP__)
+  if (spawnvp (P_WAIT, args[0], args) != 0)
+return -1;
+  else
+return 0;
+
 #elif defined (_WIN32)
 
   HANDLE h = NULL;
@@ -2649,6 +2671,9 @@ __gnat_portable_wait (int *process_status)
 
   pid = win32_wait (&status);
 
+#elif defined (__DJGPP__)
+  /* Child process has already ended in case of DJGPP.
+ No need to do anything. Just return success. */
 #else
 
   pid = waitpid (-1, &status, 0);
-- 
2.7.4
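
The MS-DOS style absolute path test that the patch enables for DJGPP can be sketched in isolation as follows. This is only an illustration of the condition in the __gnat_is_absolute_path hunk with DIR_SEPARATOR taken as '\\'; the function names are hypothetical and is_ascii_alpha merely stands in for ISALPHA.

```cpp
#include <cassert>

// Hypothetical stand-in for ISALPHA, restricted to ASCII letters.
constexpr bool is_ascii_alpha(char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// A path counts as absolute if it starts with '/' or '\\', or if it starts
// with a drive letter followed by ':'.
constexpr bool is_absolute_path_dos(const char* name, int length) {
  return length != 0
         && (name[0] == '/' || name[0] == '\\'
             || (length > 1 && is_ascii_alpha(name[0]) && name[1] == ':'));
}

static_assert(is_absolute_path_dos("C:\\GNAT", 7), "drive letter prefix");
static_assert(is_absolute_path_dos("/dev/env", 8), "leading slash");
static_assert(!is_absolute_path_dos("src\\a.adb", 9), "relative path");
```

Note that, as written, a drive-relative name such as "C:foo" also passes the test, because only the first two characters are inspected.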



Re: [PATCH 4/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Arnaud Charlet
> >>  int
> >>  __gnat_get_maximum_file_name_length (void)
> >>  {
> >>+#if defined (__DJGPP__)
> >>+  return (_use_lfn(".")) ? -1 : 8;
> >>+#else
> >>return -1;
> >>+#endif
> >>  }
> >Is the above change really necessary? Would be nice to get rid of this
> >extra code. The rest looks OK to me.
> 
> It is possible to leave this part out for now.

OK without this part then.

> PS. What about last versions of other 2 not yet approved patches (1 and 3)?

There has been a lot of back and forth and many updates, so I do not know where
we are on these. I'm pretty sure I OKed one of the other parts, but best
to resubmit them cleanly (so with latest patches, changelog, etc...).

Arno


[PATCH] Implement constexpr std::addressof for C++17

2016-10-10 Thread Jonathan Wakely

Thanks to the new __builtin_addressof that Jakub added we can do this
now.

* doc/xml/manual/intro.xml: Document DR 2296 status.
* doc/xml/manual/status_cxx2017.xml: Update status.
* include/bits/move.h (__addressof): Add _GLIBCXX_CONSTEXPR and
call __builtin_addressof.
(addressof): Add _GLIBCXX17_CONSTEXPR.
* testsuite/20_util/addressof/requirements/constexpr.cc: New test.
* testsuite/20_util/forward/c_neg.cc: Adjust dg-error lineno.
* testsuite/20_util/forward/f_neg.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.
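
A small usage sketch of what this enables is given below. It is not the new testsuite file; Tricky and the other names are made up for illustration, and the compile-time part is guarded by the feature-test macro so the snippet still builds where the library lacks constexpr std::addressof.

```cpp
#include <cassert>
#include <memory>

// A type whose overloaded unary & hides the real address.
struct Tricky {
  int value;
  constexpr Tricky* operator&() const { return nullptr; }
};

// Namespace-scope object with static storage duration, so its address is a
// permitted result of a constant expression.
constexpr Tricky tricky{42};

#if defined(__cpp_lib_addressof_constexpr)
// Only compiled when the library advertises constexpr std::addressof.
constexpr const Tricky* tricky_ptr = std::addressof(tricky);
static_assert(tricky_ptr->value == 42, "addressof bypasses operator&");
#endif

// Runtime check that works with or without the constexpr support.
inline bool addressof_bypasses_overload() {
  return &tricky == nullptr && std::addressof(tricky)->value == 42;
}
```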


commit d450a86bff30b38464e440f6157c39b399b54cc1
Author: Jonathan Wakely 
Date:   Thu Oct 6 18:53:28 2016 +0100

Implement constexpr std::addressof for C++17

* doc/xml/manual/intro.xml: Document DR 2296 status.
* doc/xml/manual/status_cxx2017.xml: Update status.
* include/bits/move.h (__addressof): Add _GLIBCXX_CONSTEXPR and
call __builtin_addressof.
(addressof): Add _GLIBCXX17_CONSTEXPR.
* testsuite/20_util/addressof/requirements/constexpr.cc: New test.
* testsuite/20_util/forward/c_neg.cc: Adjust dg-error lineno.
* testsuite/20_util/forward/f_neg.cc: Likewise.

diff --git a/libstdc++-v3/doc/xml/manual/intro.xml 
b/libstdc++-v3/doc/xml/manual/intro.xml
index 4747851..265ef67 100644
--- a/libstdc++-v3/doc/xml/manual/intro.xml
+++ b/libstdc++-v3/doc/xml/manual/intro.xml
@@ -961,6 +961,13 @@ requirements of the license of GCC.
 is included by .
 
 
+http://www.w3.org/1999/xlink"; 
xlink:href="../ext/lwg-defects.html#2296">2296:
+   std::addressof should be constexpr
+
+Use __builtin_addressof and add
+constexpr to addressof for C++17 and later.
+
+
 http://www.w3.org/1999/xlink"; 
xlink:href="../ext/lwg-defects.html#2313">2313:
tuple_size should always derive from 
integral_constant
 
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index c03978e..c6b8440 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -253,14 +253,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
std::addressof should be constexpr 
   
http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0304r0.html#2296";>
LWG2296

   
-   No 
+   7 
__cpp_lib_addressof_constexpr >= 201603 
 
 
diff --git a/libstdc++-v3/include/bits/move.h b/libstdc++-v3/include/bits/move.h
index 9deec42..a5002fc 100644
--- a/libstdc++-v3/include/bits/move.h
+++ b/libstdc++-v3/include/bits/move.h
@@ -43,12 +43,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @ingroup utilities
*/
   template<typename _Tp>
-inline _Tp*
+inline _GLIBCXX_CONSTEXPR _Tp*
 __addressof(_Tp& __r) _GLIBCXX_NOEXCEPT
-{
-  return reinterpret_cast<_Tp*>
-   (&const_cast<char&>(reinterpret_cast<const volatile char&>(__r)));
-}
+{ return __builtin_addressof(__r); }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
@@ -123,6 +120,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // declval, from type_traits.
 
+#if __cplusplus > 201402L
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 2296. std::addressof should be constexpr
+# define __cpp_lib_addressof_constexpr 201603
+#endif
   /**
*  @brief Returns the actual address of the object or function
* referenced by r, even in the presence of an overloaded
@@ -131,7 +133,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @return   The actual address.
   */
   template<typename _Tp>
-inline _Tp*
+inline _GLIBCXX17_CONSTEXPR _Tp*
 addressof(_Tp& __r) noexcept
 { return std::__addressof(__r); }
 
diff --git a/libstdc++-v3/testsuite/20_util/addressof/requirements/constexpr.cc 
b/libstdc++-v3/testsuite/20_util/addressof/requirements/constexpr.cc
new file mode 100644
index 000..998d087
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/addressof/requirements/constexpr.cc
@@ -0,0 +1,55 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++1z" }
+// { dg-do compile { target c++1z } }
+
+#include 
+
+// LWG 2296 std::addressof should be constexpr
+
+#ifndef __cp

[Committed] S/390: Wrap more macro args into ()

2016-10-10 Thread Andreas Krebbel
It turned out that a few uses of macro arguments were missing their
surrounding ().  One real problem caused by this was detected by the
int-in-bool-context warning in the definition of DBX_REGISTER_NUMBER.
While at it, I've also tried to fix other places where brackets might
be missing.
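
To show one way such a problem can arise, here is a self-contained sketch with two macros shaped like the old and new DBX_REGISTER_NUMBER. The macro and function names are hypothetical, and the conditional argument is just an example of an expression with lower precedence than the operators inside the macro body; it is not claimed to be the call site that triggered the warning.

```cpp
#include <cassert>

// Same shape as the old definition: the argument is pasted in bare.
#define UNSAFE_REGNO_MAP(regno) \
  ((regno >= 38 && regno <= 53) ? regno + 30 : regno)

// Same shape as the fixed definition: every use is parenthesized.
#define SAFE_REGNO_MAP(regno) \
  (((regno) >= 38 && (regno) <= 53) ? (regno) + 30 : (regno))

// Pass a conditional expression, which binds more loosely than >=, && and +.
constexpr int unsafe_map(bool is_vector) {
  return UNSAFE_REGNO_MAP(is_vector ? 40 : 0);
}

constexpr int safe_map(bool is_vector) {
  return SAFE_REGNO_MAP(is_vector ? 40 : 0);
}

// With parentheses the argument is evaluated as a unit: 40 -> 70, 0 -> 0.
static_assert(safe_map(true) == 70, "40 lies in 38..53, so 30 is added");
static_assert(safe_map(false) == 0, "0 is returned unchanged");

// Without them the pieces of the argument regroup with the macro body.
static_assert(unsafe_map(true) == 40, "the + 30 is lost");
static_assert(unsafe_map(false) == 30, "30 appears from nowhere");
```

Compilers may warn about the unsafe expansion, which is the point: the integer constants end up being used as truth values.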

gcc/ChangeLog:

2016-10-10  Andreas Krebbel  

* config/s390/s390.h: Wrap more macros args in brackets and fix
some formatting.
---
 gcc/ChangeLog  |  4 +++
 gcc/config/s390/s390.h | 88 ++
 2 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6d27102..abe0194 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2016-10-10  Andreas Krebbel  
+
+   * config/s390/s390.h: Wrap more macros args in brackets and fix
+
 2016-10-10  Andreas Schwab  
 
PR target/77738
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index 3a7be1a..501c8e4 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -320,9 +320,9 @@ extern const char *s390_host_detect_local_cpu (int argc, 
const char **argv);
FUNCTION is VOIDmode because calling convention maintains SP.
BLOCK needs Pmode for SP.
NONLOCAL needs twice Pmode to maintain both backchain and SP.  */
-#define STACK_SAVEAREA_MODE(LEVEL)  \
-  (LEVEL == SAVE_FUNCTION ? VOIDmode\
-  : LEVEL == SAVE_NONLOCAL ? (TARGET_64BIT ? OImode : TImode) : Pmode)
+#define STACK_SAVEAREA_MODE(LEVEL) \
+  ((LEVEL) == SAVE_FUNCTION ? VOIDmode \
+   : (LEVEL) == SAVE_NONLOCAL ? (TARGET_64BIT ? OImode : TImode) : Pmode)
 
 
 /* Type layout.  */
@@ -491,7 +491,7 @@ extern const char *s390_host_detect_local_cpu (int argc, 
const char **argv);
   s390_hard_regno_mode_ok ((REGNO), (MODE))
 
 #define HARD_REGNO_RENAME_OK(FROM, TO)  \
-  s390_hard_regno_rename_ok (FROM, TO)
+  s390_hard_regno_rename_ok ((FROM), (TO))
 
 #define MODES_TIEABLE_P(MODE1, MODE2)  \
(((MODE1) == SFmode || (MODE1) == DFmode)   \
@@ -584,7 +584,7 @@ enum reg_class
reload can decide not to use the hard register because some
constant was forced to be in memory.  */
 #define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(regno)  \
-  (regno != BASE_REGNUM ? 0.0 : 0.5)
+  ((regno) != BASE_REGNUM ? 0.0 : 0.5)
 
 /* Register -> class mapping.  */
 extern const enum reg_class regclass_map[FIRST_PSEUDO_REGISTER];
@@ -617,10 +617,10 @@ extern const enum reg_class 
regclass_map[FIRST_PSEUDO_REGISTER];
 
  FIXME: Should we try splitting it into two vlgvg's/vlvg's instead?  */
 #define SECONDARY_MEMORY_NEEDED(CLASS1, CLASS2, MODE)  \
-  (((reg_classes_intersect_p (CLASS1, VEC_REGS)
\
- && reg_classes_intersect_p (CLASS2, GENERAL_REGS))
\
-|| (reg_classes_intersect_p (CLASS1, GENERAL_REGS) \
-   && reg_classes_intersect_p (CLASS2, VEC_REGS))) \
+  (((reg_classes_intersect_p ((CLASS1), VEC_REGS)  \
+ && reg_classes_intersect_p ((CLASS2), GENERAL_REGS))  \
+|| (reg_classes_intersect_p ((CLASS1), GENERAL_REGS)   \
+   && reg_classes_intersect_p ((CLASS2), VEC_REGS)))   \
&& (!TARGET_DFP || !TARGET_64BIT || GET_MODE_SIZE (MODE) != 8)  \
&& (!TARGET_VX || (SCALAR_FLOAT_MODE_P (MODE)   \
  && GET_MODE_SIZE (MODE) > 8)))
@@ -630,7 +630,7 @@ extern const enum reg_class 
regclass_map[FIRST_PSEUDO_REGISTER];
 #define SECONDARY_MEMORY_NEEDED_MODE(MODE) \
  (GET_MODE_BITSIZE (MODE) < 32 \
   ? mode_for_size (32, GET_MODE_CLASS (MODE), 0)   \
-  : MODE)
+  : (MODE))
 
 
 /* Stack layout and calling conventions.  */
@@ -720,8 +720,8 @@ extern const enum reg_class 
regclass_map[FIRST_PSEUDO_REGISTER];
 /* Define the dwarf register mapping.
v16-v31 -> 68-83
rX  -> X  otherwise  */
-#define DBX_REGISTER_NUMBER(regno) \
-  ((regno >= 38 && regno <= 53) ? regno + 30 : regno)
+#define DBX_REGISTER_NUMBER(regno) \
+  (((regno) >= 38 && (regno) <= 53) ? (regno) + 30 : (regno))
 
 /* Frame registers.  */
 
@@ -832,24 +832,25 @@ CUMULATIVE_ARGS;
operand.  If we find one, push the reload and jump to WIN.  This
macro is used in only one place: `find_reloads_address' in reload.c.  */
 #define LEGITIMIZE_RELOAD_ADDRESS(AD, MODE, OPNUM, TYPE, IND, WIN) \
-do {   \
-  rtx new_rtx = legitimize_reload_address (AD, MODE, OPNUM, (int)(TYPE));  
\
-  if (new_rtx) \
-{  \
-  (AD) = new_rtx;  \
-  goto WIN;  

[PING][PATCH 1/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Andris Pavenis

I'd like to ping this patch.

The latest version of the patch, together with its ChangeLog entry, can be
found in the mailing list archive:

https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01229.html

Andris



Re: [Committed] S/390: Wrap more macro args into ()

2016-10-10 Thread Andreas Schwab
On Oct 10 2016, Andreas Krebbel  wrote:

> @@ -491,7 +491,7 @@ extern const char *s390_host_detect_local_cpu (int argc, 
> const char **argv);
>s390_hard_regno_mode_ok ((REGNO), (MODE))
>  
>  #define HARD_REGNO_RENAME_OK(FROM, TO)  \
> -  s390_hard_regno_rename_ok (FROM, TO)
> +  s390_hard_regno_rename_ok ((FROM), (TO))

That should not be necessary.  The only way to get an error is if you
play dirty games with macros expanding to a bare comma.
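
A sketch of the corner case being referred to, using made-up names (arg_count, CALL_PLAIN, CALL_PAREN and COMMA are not from s390.h): when a macro argument is passed straight through as a function argument, the extra parentheses change nothing unless the argument expands to a bare comma.

```cpp
#include <cassert>

// Two overloads so the number of arguments actually passed is observable.
constexpr int arg_count(int) { return 1; }
constexpr int arg_count(int, int) { return 2; }

#define CALL_PLAIN(X) arg_count(X)
#define CALL_PAREN(X) arg_count((X))

// The "dirty game": a macro that expands to a bare comma.
#define COMMA ,

// An ordinary expression argument behaves identically in both forms.
static_assert(CALL_PLAIN(1 + 2) == 1, "one argument");
static_assert(CALL_PAREN(1 + 2) == 1, "one argument");

// COMMA is expanded only after the macro argument has been collected, so the
// plain form ends up passing two function arguments, while the parenthesized
// form passes a single comma expression.
static_assert(CALL_PLAIN(1 COMMA 2) == 2, "two arguments");
static_assert(CALL_PAREN(1 COMMA 2) == 1, "one comma expression");
```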

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PING][PATCH 3/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Andris Pavenis

I'd like to ping patch

https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00164.html

Additional comments about using ZCX_By_Default := true are in

https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00845.html

Andris



Re: [PATCH 4/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Andris Pavenis

On 10/10/2016 06:22 PM, Arnaud Charlet wrote:



> > PS. What about last versions of other 2 not yet approved patches (1 and 3)?
>
> There has been a lot of back and forth and many updates, so I do not know where
> we are on these. I'm pretty sure I OKed one of the other parts, but best
> to resubmit them cleanly (so with latest patches, changelog, etc...).

There are no changes since I submitted the last versions of patches 1 and 3, so
I just pointed to the messages in the mail archives in separate e-mails.

Andris



[hsa-branch 4/9] Add expansion of reciprocal of square root

2016-10-10 Thread Martin Jambor
Hi,

this patch simply adds expansion of the reciprocal of square root gimple
internal function into its HSAIL equivalent.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (gen_hsa_insn_for_internal_fn_call): Also handle IFN_RSQRT.
---
 gcc/hsa-gen.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index deb2a07..efb87a0 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5386,6 +5386,10 @@ gen_hsa_insn_for_internal_fn_call (gcall *stmt, hsa_bb 
*hbb)
   gen_hsa_unaryop_for_builtin (BRIG_OPCODE_SQRT, stmt, hbb);
   break;
 
+case IFN_RSQRT:
+  gen_hsa_unaryop_for_builtin (BRIG_OPCODE_NRSQRT, stmt, hbb);
+  break;
+
 case IFN_TRUNC:
   gen_hsa_unaryop_for_builtin (BRIG_OPCODE_TRUNC, stmt, hbb);
   break;
-- 
2.10.0



[hsa-branch 2/9] Lastprivate lowering for gridified kernels

2016-10-10 Thread Martin Jambor
Hi,

this patch implements the lastprivate data sharing clauses of gridified
OpenMP looping constructs.  It adds code to construct a special
condition to identify the "last" loop iteration using special HSA
instructions, because that way we do not need information about all HSA
dimensions conveyed from callers and could modify only a small fraction
of the non-gridification code.

On the gridification side, it creates group-segment copies of internal
loop lastprivate variables as a means to transfer the value from the
"last" work-item to all work-items that then continue working with the
value.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* gimple.h (GF_OMP_FOR_GRID_PHONY): Added comment.
(GF_OMP_FOR_GRID_INTRA_GROUP): New.
(gimple_omp_for_grid_phony): Added checking assert.
(gimple_omp_for_set_grid_phony): Likewise.
(gimple_omp_for_grid_intra_group): New function.
(gimple_omp_for_set_grid_intra_group): Likewise.
(gimple_omp_for_grid_group_iter): Added checking assert.
(gimple_omp_for_set_grid_group_iter): Likewise.
* omp-low.c (lower_lastprivate_clauses): Also handle predicates
that are not simple comparisons.
(grid_lastprivate_predicate): New function.
(lower_omp_for_lastprivate): Generate conditions for gridified kernels.
(lower_omp_for): Adjust phony predicate call.
(grid_parallel_clauses_gridifiable): Allow lastprivate.
(grid_inner_loop_gridifiable_p): Likewise.
(grid_mark_tiling_loops): Generate copies of lastprivate variables
to group variables.
(grid_mark_tiling_parallels_and_loops): Create binds for bodies of
parallel statements.
(grid_process_kernel_body_copy): Avoid reusing variable name.
---
 gcc/gimple.h  |  36 +
 gcc/omp-low.c | 235 +-
 2 files changed, 187 insertions(+), 84 deletions(-)

diff --git a/gcc/gimple.h b/gcc/gimple.h
index ce3a161..3e84e6b0 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -162,7 +162,12 @@ enum gf_mask {
 GF_OMP_FOR_KIND_CILKSIMD   = GF_OMP_FOR_SIMD | 1,
 GF_OMP_FOR_COMBINED= 1 << 4,
 GF_OMP_FOR_COMBINED_INTO   = 1 << 5,
+/* The following flag must not be used on GF_OMP_FOR_KIND_GRID_LOOP loop
+   statements.  */
 GF_OMP_FOR_GRID_PHONY  = 1 << 6,
+/* The following two flags should only be set on GF_OMP_FOR_KIND_GRID_LOOP
+   loop statements.  */
+GF_OMP_FOR_GRID_INTRA_GROUP= 1 << 6,
 GF_OMP_FOR_GRID_GROUP_ITER  = 1 << 7,
 GF_OMP_TARGET_KIND_MASK= (1 << 4) - 1,
 GF_OMP_TARGET_KIND_REGION  = 0,
@@ -5123,6 +5128,8 @@ gimple_omp_for_set_pre_body (gimple *gs, gimple_seq 
pre_body)
 static inline bool
 gimple_omp_for_grid_phony (const gomp_for *omp_for)
 {
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  != GF_OMP_FOR_KIND_GRID_LOOP);
   return (gimple_omp_subcode (omp_for) & GF_OMP_FOR_GRID_PHONY) != 0;
 }
 
@@ -5131,18 +5138,45 @@ gimple_omp_for_grid_phony (const gomp_for *omp_for)
 static inline void
 gimple_omp_for_set_grid_phony (gomp_for *omp_for, bool value)
 {
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  != GF_OMP_FOR_KIND_GRID_LOOP);
   if (value)
 omp_for->subcode |= GF_OMP_FOR_GRID_PHONY;
   else
 omp_for->subcode &= ~GF_OMP_FOR_GRID_PHONY;
 }
 
+/* Return the kernel_intra_group of a GRID_LOOP OMP_FOR statement.  */
+
+static inline bool
+gimple_omp_for_grid_intra_group (const gomp_for *omp_for)
+{
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  == GF_OMP_FOR_KIND_GRID_LOOP);
+  return (gimple_omp_subcode (omp_for) & GF_OMP_FOR_GRID_INTRA_GROUP) != 0;
+}
+
+/* Set kernel_intra_group flag of OMP_FOR to VALUE.  */
+
+static inline void
+gimple_omp_for_set_grid_intra_group (gomp_for *omp_for, bool value)
+{
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  == GF_OMP_FOR_KIND_GRID_LOOP);
+  if (value)
+omp_for->subcode |= GF_OMP_FOR_GRID_INTRA_GROUP;
+  else
+omp_for->subcode &= ~GF_OMP_FOR_GRID_INTRA_GROUP;
+}
+
 /* Return true if iterations of a grid OMP_FOR statement correspond to HSA
groups.  */
 
 static inline bool
 gimple_omp_for_grid_group_iter (const gomp_for *omp_for)
 {
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  == GF_OMP_FOR_KIND_GRID_LOOP);
   return (gimple_omp_subcode (omp_for) & GF_OMP_FOR_GRID_GROUP_ITER) != 0;
 }
 
@@ -5151,6 +5185,8 @@ gimple_omp_for_grid_group_iter (const gomp_for *omp_for)
 static inline void
 gimple_omp_for_set_grid_group_iter (gomp_for *omp_for, bool value)
 {
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  == GF_OMP_FOR_KIND_GRID_LOOP);
   if (value)
 omp_for->subcode |= GF_OMP_FOR_GRID_GROUP_ITER;
   else
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index ee5d2df..05015bd

[hsa-branch 3/9] Handle simds within gridified loops gracefully

2016-10-10 Thread Martin Jambor
Hi,

this patch deals with simd constructs in gridified OpenMP loops.
Standalone simds are dealt with by forcing the gridified copy to have
OMP_CLAUSE_SAFELEN_EXPR of one, while simds which are a part of a
combined construct with the gridified parallel loop are simply
discarded.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* omp-low.c (grid_find_ungridifiable_statement): Do not bail out
for simd loops.
(grid_inner_loop_gridifiable_p): Likewise.
(grid_process_grid_body): New function.
(grid_eliminate_combined_simd_part): Likewise.
(grid_mark_tiling_loops): Use it. Walk body of the loop with
grid_process_grid_body.
(grid_process_kernel_body_copy): Likewise.
---
 gcc/omp-low.c | 137 +++---
 1 file changed, 122 insertions(+), 15 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 05015bd..a51474b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -17478,17 +17478,6 @@ grid_find_ungridifiable_statement (gimple_stmt_iterator *gsi,
   *handled_ops_p = true;
   wi->info = stmt;
   return error_mark_node;
-
-case GIMPLE_OMP_FOR:
-  if ((gimple_omp_for_kind (stmt) & GF_OMP_FOR_SIMD)
- && gimple_omp_for_combined_into_p (stmt))
-   {
- *handled_ops_p = true;
- wi->info = stmt;
- return error_mark_node;
-   }
-  break;
-
 default:
   break;
 }
@@ -17614,10 +17603,6 @@ grid_inner_loop_gridifiable_p (gomp_for *gfor, grid_prop *grid)
dump_printf_loc (MSG_MISSED_OPTIMIZATION, grid->target_loc,
   GRID_MISSED_MSG_PREFIX "the inner loop contains "
   "call to a noreturn function\n");
- else if (gimple_code (bad) == GIMPLE_OMP_FOR)
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, grid->target_loc,
-GRID_MISSED_MSG_PREFIX "the inner loop contains "
-"a simd construct\n");
  else
dump_printf_loc (MSG_MISSED_OPTIMIZATION, grid->target_loc,
 GRID_MISSED_MSG_PREFIX "the inner loop contains "
@@ -18212,6 +18197,113 @@ grid_copy_leading_local_assignments (gimple_seq src, gimple_stmt_iterator *dst,
   return NULL;
 }
 
+/* Statement walker function to make adjustments to statements within the
+   gridified kernel copy.  */
+
+static tree
+grid_process_grid_body (gimple_stmt_iterator *gsi, bool *handled_ops_p,
+   struct walk_stmt_info *)
+{
+  *handled_ops_p = false;
+  gimple *stmt = gsi_stmt (*gsi);
+  if (gimple_code (stmt) == GIMPLE_OMP_FOR
+  && (gimple_omp_for_kind (stmt) & GF_OMP_FOR_SIMD))
+  {
+gomp_for *loop = as_a <gomp_for *> (stmt);
+tree clauses = gimple_omp_for_clauses (loop);
+tree cl = find_omp_clause (clauses, OMP_CLAUSE_SAFELEN);
+if (cl)
+  OMP_CLAUSE_SAFELEN_EXPR (cl) = integer_one_node;
+else
+  {
+   tree c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_SAFELEN);
+   OMP_CLAUSE_SAFELEN_EXPR (c) = integer_one_node;
+   OMP_CLAUSE_CHAIN (c) = clauses;
+   gimple_omp_for_set_clauses (loop, c);
+  }
+  }
+  return NULL_TREE;
+}
+
+/* Given a PARLOOP that is a normal for looping construct but also a part of a
+   combined construct with a simd loop, eliminate the simd loop.  */
+
+static void
+grid_eliminate_combined_simd_part (gomp_for *parloop)
+{
+  struct walk_stmt_info wi;
+
+  memset (&wi, 0, sizeof (wi));
+  wi.val_only = true;
+  enum gf_mask msk = GF_OMP_FOR_SIMD;
+  wi.info = (void *) &msk;
+  walk_gimple_seq (gimple_omp_body (parloop), find_combined_for, NULL, &wi);
+  gimple *stmt = (gimple *) wi.info;
+  /* We expect that the SIMD is the only statement in the parallel loop.  */
+  gcc_assert (stmt
+ && gimple_code (stmt) == GIMPLE_OMP_FOR
+ && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_SIMD)
+ && gimple_omp_for_combined_into_p (stmt)
+ && !gimple_omp_for_combined_p (stmt));
+  gomp_for *simd = as_a <gomp_for *> (stmt);
+
+  /* Copy over the iteration properties because the body refers to the index in
+ the bottom-most loop.  */
+  unsigned i, collapse = gimple_omp_for_collapse (parloop);
+  gcc_checking_assert (collapse == gimple_omp_for_collapse (simd));
+  for (i = 0; i < collapse; i++)
+{
+  gimple_omp_for_set_index (parloop, i, gimple_omp_for_index (simd, i));
+  gimple_omp_for_set_initial (parloop, i, gimple_omp_for_initial (simd, i));
+  gimple_omp_for_set_final (parloop, i, gimple_omp_for_final (simd, i));
+  gimple_omp_for_set_incr (parloop, i, gimple_omp_for_incr (simd, i));
+}
+
+  tree *tgt= gimple_omp_for_clauses_ptr (parloop);
+  while (*tgt)
+tgt = &OMP_CLAUSE_CHAIN (*tgt);
+
+  /* Copy over all clauses, except for linear clauses, which are turned into
+ private clauses, and all other simd-specific clauses, which are

[hsa-branch 6/9] Expand FMA_EXPR to HSAIL

2016-10-10 Thread Martin Jambor
Hi,

the following patch adds expansion of fused multiply and add to HSAIL.
The scalar variant is straightforwardly converted to an HSAIL equivalent
while any vector instance is expanded into separate multiplication and
additions.
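
The two cases described above can be illustrated with a small host-side sketch.  It is not the HSAIL expansion itself: std::fma stands in for the single scalar instruction, and the names Vec4, scalar_fma and vector_mul_add are assumptions made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical four-lane vector type, used only for this illustration.
using Vec4 = std::array<float, 4>;

// Scalar case: a single fused operation (one rounding step).
float scalar_fma(float a, float b, float c) {
  return std::fma(a, b, c);
}

// Vector case: no fused form is assumed to exist, so the operation is split
// into a multiplication into a temporary followed by an addition.
Vec4 vector_mul_add(const Vec4& a, const Vec4& b, const Vec4& c) {
  Vec4 tmp{};
  Vec4 dest{};
  for (std::size_t i = 0; i < tmp.size(); ++i)
    tmp[i] = a[i] * b[i];      // corresponds to the separate multiplication
  for (std::size_t i = 0; i < dest.size(); ++i)
    dest[i] = tmp[i] + c[i];   // corresponds to the separate addition
  return dest;
}
```

The split form rounds twice where the fused form rounds once, so the two variants may differ in the last bit for inputs that are not exactly representable.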

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (gen_hsa_insns_for_operation_assignment): Handle
FMA_EXPR and ternary operators in general.  Remove obsolete
fallthrough comments.
---
 gcc/hsa-gen.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index ac83e9e..ad40087 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -3076,6 +3076,23 @@ gen_hsa_insns_for_operation_assignment (gimple *assign, hsa_bb *hbb)
 case NEGATE_EXPR:
   opcode = BRIG_OPCODE_NEG;
   break;
+case FMA_EXPR:
+  /* There is a native HSA instruction for scalar FMAs but not for vector
+ones.  */
+  if (TREE_CODE (TREE_TYPE (lhs)) == VECTOR_TYPE)
+   {
+ hsa_op_reg *dest
+   = hsa_cfun->reg_for_gimple_ssa (gimple_assign_lhs (assign));
+ hsa_op_with_type *op1 = hsa_reg_or_immed_for_gimple_op (rhs1, hbb);
+ hsa_op_with_type *op2 = hsa_reg_or_immed_for_gimple_op (rhs2, hbb);
+ hsa_op_with_type *op3 = hsa_reg_or_immed_for_gimple_op (rhs3, hbb);
+ hsa_op_reg *tmp = new hsa_op_reg (dest->m_type);
+ gen_hsa_binary_operation (BRIG_OPCODE_MUL, tmp, op1, op2, hbb);
+ gen_hsa_binary_operation (BRIG_OPCODE_ADD, dest, tmp, op3, hbb);
+ return;
+   }
+  opcode = BRIG_OPCODE_MAD;
+  break;
 case MIN_EXPR:
   opcode = BRIG_OPCODE_MIN;
   break;
@@ -3275,14 +3292,18 @@ gen_hsa_insns_for_operation_assignment (gimple *assign, hsa_bb *hbb)
   switch (rhs_class)
 {
 case GIMPLE_TERNARY_RHS:
-  gcc_unreachable ();
+  {
+   hsa_op_with_type *op3 = hsa_reg_or_immed_for_gimple_op (rhs3, hbb);
+   hsa_insn_basic *insn = new hsa_insn_basic (4, opcode, dest->m_type, dest,
+  op1, op2, op3);
+   hbb->append_insn (insn);
+  }
   return;
 
-  /* Fall through */
 case GIMPLE_BINARY_RHS:
   gen_hsa_binary_operation (opcode, dest, op1, op2, hbb);
   break;
-  /* Fall through */
+
 case GIMPLE_UNARY_RHS:
   gen_hsa_unary_operation (opcode, dest, op1, hbb);
   break;
-- 
2.10.0



[hsa-branch 1/9] Builtins for gridsize and currentworkgroupsize

2016-10-10 Thread Martin Jambor
Hi,

the patch below makes the griddim and currentworkgroupsize special HSA
instructions available for omp lowering through a builtin.  They are
then used by a subsequent patch to implement conditions determining the
last iteration for the lastprivate OpenMP sharing clause.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-builtins.def (BUILT_IN_HSA_GRIDSIZE): New.
(BUILT_IN_HSA_CURRENTWORKGROUPSIZE): Likewise.
* hsa-gen.c (gen_hsa_insns_for_call): Handle BUILT_IN_HSA_GRIDSIZE.
---
 gcc/hsa-builtins.def | 4 
 gcc/hsa-gen.c| 6 ++
 2 files changed, 10 insertions(+)

diff --git a/gcc/hsa-builtins.def b/gcc/hsa-builtins.def
index dcd0c55..cc0409e 100644
--- a/gcc/hsa-builtins.def
+++ b/gcc/hsa-builtins.def
@@ -33,3 +33,7 @@ DEF_HSA_BUILTIN (BUILT_IN_HSA_WORKITEMID, "hsa_workitemid",
 BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_HSA_BUILTIN (BUILT_IN_HSA_WORKITEMABSID, "hsa_workitemabsid",
 BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_HSA_BUILTIN (BUILT_IN_HSA_GRIDSIZE, "hsa_gridsize",
+BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_HSA_BUILTIN (BUILT_IN_HSA_CURRENTWORKGROUPSIZE, "hsa_currentworkgroupsize",
+BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index f63608c..deb2a07 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5812,6 +5812,12 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
 case BUILT_IN_HSA_WORKITEMABSID:
   query_hsa_grid_dim (stmt, BRIG_OPCODE_WORKITEMABSID, hbb);
   break;
+case BUILT_IN_HSA_GRIDSIZE:
+  query_hsa_grid_dim (stmt, BRIG_OPCODE_GRIDSIZE, hbb);
+  break;
+case BUILT_IN_HSA_CURRENTWORKGROUPSIZE:
+  query_hsa_grid_dim (stmt, BRIG_OPCODE_CURRENTWORKGROUPSIZE, hbb);
+  break;
 
 case BUILT_IN_GOMP_BARRIER:
   hbb->append_insn (new hsa_insn_br (0, BRIG_OPCODE_BARRIER, BRIG_TYPE_NONE,
-- 
2.10.0



[hsa-branch 5/9] Properly detect variadic arguments

2016-10-10 Thread Martin Jambor
Hi,

this patch from Martin Liska properly detects some variadic calls which we
failed to detect before during expansion to HSAIL.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Liska  
Martin Jambor  

* hsa-gen.c (verify_function_arguments): Properly detect variadic
arguments.
---
 gcc/hsa-gen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index efb87a0..ac83e9e 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -3444,13 +3444,14 @@ gen_hsa_insns_for_switch_stmt (gswitch *s, hsa_bb *hbb)
 static void
 verify_function_arguments (tree decl)
 {
+  tree type = TREE_TYPE (decl);
   if (DECL_STATIC_CHAIN (decl))
 {
   HSA_SORRY_ATV (EXPR_LOCATION (decl),
 "HSA does not support nested functions: %D", decl);
   return;
 }
-  else if (!TYPE_ARG_TYPES (TREE_TYPE (decl)))
+  else if (!TYPE_ARG_TYPES (type) || stdarg_p (type))
 {
   HSA_SORRY_ATV (EXPR_LOCATION (decl),
 "HSA does not support functions with variadic arguments "
-- 
2.10.0



[hsa-branch 7/9] Ignore prefetch builtin

2016-10-10 Thread Martin Jambor
Hi,

this patch makes HSAIL expansion ignore prefetch built-ins.  It is a bit
less straightforward than it sounds because we also need to handle cases
where the call does not pass the gimple_call_builtin_p test because of
argument type mismatches.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (gen_hsa_insns_for_call): Ignore prefetch builtin.
---
 gcc/hsa-gen.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index ad40087..8893a28 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5530,6 +5530,12 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
   if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
 {
   tree function_decl = gimple_call_fndecl (stmt);
+  /* Prefetch pass can create type-mismatching prefetch builtin calls which
+fail the gimple_call_builtin_p test above.  Handle them here.  */
+  if (DECL_BUILT_IN_CLASS (function_decl)
+ && DECL_FUNCTION_CODE (function_decl) == BUILT_IN_PREFETCH)
+   return;
+
   if (function_decl == NULL_TREE)
{
  HSA_SORRY_AT (gimple_location (stmt),
@@ -5962,6 +5968,8 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
gen_hsa_alloca (call, hbb);
break;
   }
+case BUILT_IN_PREFETCH:
+  break;
 default:
   {
gen_hsa_insns_for_direct_call (stmt, hbb);
-- 
2.10.0



[hsa-branch 8/9] Fail instead of calling an unknown GOMP builtin

2016-10-10 Thread Martin Jambor
Hi,

this patch is a bit of a hack to make sure we do not emit calls to
libgomp run-time functions which are not available on the HSA GPU side,
such as run-time loop scheduling routines.  If we fail on the caller
side, we avoid issues with the finalizer looking at calls to non-existent
functions.
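
The name test at the heart of this hack is a plain prefix comparison.  A minimal stand-alone sketch follows, assuming only that the interesting builtins all start with "__builtin_GOMP_"; the helper name is_gomp_builtin_name is made up for the example and does not exist in hsa-gen.c.

```cpp
#include <cassert>
#include <cstring>

// Return true if the identifier S starts with the GOMP builtin prefix.
// strncmp stops at the terminating NUL, so shorter strings are safe.
bool is_gomp_builtin_name(const char* s) {
  static const char prefix[] = "__builtin_GOMP_";
  // sizeof includes the terminating NUL, hence the "- 1".
  return std::strncmp(s, prefix, sizeof(prefix) - 1) == 0;
}
```

Because strncmp never reads past the end of a shorter string, no separate length check is needed in this form.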

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (gen_hsa_insns_for_call): Fail when encountering a
GOMP builtin that we cannot process ourselves.
---
 gcc/hsa-gen.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 8893a28..fd0dbcd 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5972,7 +5972,15 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
   break;
 default:
   {
-   gen_hsa_insns_for_direct_call (stmt, hbb);
+   tree name_tree = DECL_NAME (fndecl);
+   const char *s = IDENTIFIER_POINTER (name_tree);
+   size_t len = strlen (s);
+   if (len > 4 && (strncmp (s, "__builtin_GOMP_", 15) == 0))
+ HSA_SORRY_ATV (gimple_location (stmt),
+"support for HSA does not implement GOMP function %s",
+s);
+   else
+ gen_hsa_insns_for_direct_call (stmt, hbb);
return;
   }
 }
-- 
2.10.0



[hsa-branch 9/9] Fix another finalizer type complaint

2016-10-10 Thread Martin Jambor
Hi,

the following patch deals with a finalizer error issued when we have a
register-register move of an HSAIL vector type.  Apparently, such a move
must obey the same rules as vector loads and stores.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (hsa_build_append_simple_mov): Use mem_type_for_type.
---
 gcc/hsa-gen.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index fd0dbcd..0b25f66 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -2227,8 +2227,10 @@ hsa_reg_or_immed_for_gimple_op (tree op, hsa_bb *hbb)
 void
 hsa_build_append_simple_mov (hsa_op_reg *dest, hsa_op_base *src, hsa_bb *hbb)
 {
-  hsa_insn_basic *insn = new hsa_insn_basic (2, BRIG_OPCODE_MOV, dest->m_type,
-dest, src);
+  /* Moves of packed data between registers need to adhere to the same type
+ rules like when dealing with memory.  */
+  BrigType16_t tp = mem_type_for_type (dest->m_type);
+  hsa_insn_basic *insn = new hsa_insn_basic (2, BRIG_OPCODE_MOV, tp, dest, src);
   if (hsa_op_reg *sreg = dyn_cast <hsa_op_reg *> (src))
 gcc_assert (hsa_type_bit_size (dest->m_type)
== hsa_type_bit_size (sreg->m_type));
-- 
2.10.0


Re: [PATCH] Improve performance of list::reverse

2016-10-10 Thread Jonathan Wakely

On 09/10/16 16:23 +0100, Elliot Goodrich wrote:

Hi,

If we unroll the loop so that we iterate both forwards and backwards,
we can take advantage of memory-level parallelism when chasing
pointers. This means that reverse takes 35% less time when nodes are
randomly scattered in memory and about the same time if nodes are
contiguous.

Further, as our node pointers will never alias, we can interleave the
swaps of the next and previous pointers to remove further data
dependencies. This takes another 5% off the time when nodes are
scattered in memory and takes 20% off when nodes are contiguous.

All in all we save 20%-40% depending on the memory layout.
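
For readers without the patch at hand, the idea of walking the list from both ends can be sketched as below.  This is not the submitted patch: Node is a hypothetical stand-in for the list node type, the list is assumed to be circular with a sentinel node (as in libstdc++), and link_ring and forward_values are helpers invented for the example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a list node; not libstdc++'s _List_node_base.
struct Node {
  Node* next;
  Node* prev;
  int value;
};

// Reverse a circular doubly linked list with sentinel HEAD by walking inwards
// from both ends, so two independent pointer chases are in flight at once.
void reverse_two_ended(Node* head) {
  Node* front = head->next;
  Node* back = head->prev;
  std::swap(head->next, head->prev);
  while (front != head) {            // false only for an empty list
    if (front == back) {             // odd count: a single middle node is left
      std::swap(front->next, front->prev);
      break;
    }
    // Read both successors before touching either node.
    Node* next_front = front->next;
    Node* next_back = back->prev;
    std::swap(front->next, front->prev);
    std::swap(back->next, back->prev);
    if (next_front == back)          // even count: the walkers were adjacent
      break;
    front = next_front;
    back = next_back;
  }
}

// Helper: link NODES, in order, into a ring behind the sentinel HEAD.
void link_ring(Node& head, std::vector<Node>& nodes) {
  Node* prev = &head;
  for (Node& n : nodes) {
    prev->next = &n;
    n.prev = prev;
    prev = &n;
  }
  prev->next = &head;
  head.prev = prev;
}

// Helper: collect the values by following the next pointers.
std::vector<int> forward_values(const Node& head) {
  std::vector<int> out;
  for (const Node* n = head.next; n != &head; n = n->next)
    out.push_back(n->value);
  return out;
}
```

The two checks inside the loop are the ones the message proposes to hoist when the parity of the element count is known up front.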


Nice, thanks for the patch.

Do you have (or are you willing to sign) a copyright assignment for
GCC?

See https://gcc.gnu.org/contribute.html#legal for details.


For future improvement, by passing whether there is an odd or even
number of nodes in the list we can hoist one of the ifs out of the
loop and gain another 5-10% but most likely this is only possible when
_GLIBCXX_USE_CXX11_ABI is defined and size() is O(1). This would bring
the saving to 30%-45%. Is it worth writing a new overload of
_M_reverse which takes the size of the list?


That certainly seems worthwhile. Do we need an overload or can it just
be done with #if? It seems to me we'd either want to use the size, or
not use it, we wouldn't want both versions defined at once. That
suggests #if to me.



[PATCH] Minor simplification to std::_Bind_result helpers

2016-10-10 Thread Jonathan Wakely

We don't need to define new class templates for the SFINAE helpers in
_Bind_result, we can just use alias templates. This also moves where
the helpers are used to the return types, instead of as a defaulted
argument.
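
A stand-alone sketch of the same technique, outside the library, may make the change easier to follow.  The names enable_if_void, disable_if_void and Caller are assumptions for this example and are not the identifiers used in <functional>.

```cpp
#include <cassert>
#include <type_traits>

// Alias templates replace the helper class templates: the alias names the
// return type directly, or is ill-formed and removes the overload.
template<typename Res>
using enable_if_void
  = typename std::enable_if<std::is_void<Res>::value>::type;

template<typename Res, typename Result>
using disable_if_void
  = typename std::enable_if<!std::is_void<Res>::value, Result>::type;

struct Caller {
  int calls = 0;

  // Selected when Res is not void: returns a value.
  template<typename Res>
  disable_if_void<Res, int>
  call(int x) { ++calls; return x * 2; }

  // Selected when Res is void: discards the result.
  template<typename Res>
  enable_if_void<Res>
  call(int) { ++calls; }
};
```

With the constraint in the return type, the functions no longer need a defaulted dummy parameter, which is the simplification the patch makes to __call.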

* include/std/functional (_Bind_result::__enable_if_void): Use alias
template instead of class template.
(_Bind_result::__disable_if_void): Likewise.
(_Bind_result::__call): Adjust uses of __enable_if_void and
__disable_if_void.

Tested powerpc64le-linux, committed to trunk.

commit 1330ba1b3b4ccddc64e532756aa2f571f27ae2ad
Author: Jonathan Wakely 
Date:   Mon Oct 10 17:00:49 2016 +0100

Minor simplification to std::_Bind_result helpers

* include/std/functional (_Bind_result::__enable_if_void): Use alias
template instead of class template.
(_Bind_result::__disable_if_void): Likewise.
(_Bind_result::__call): Adjust uses of __enable_if_void and
__disable_if_void.

diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional
index 1c7523e..2587392 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -1000,15 +1000,17 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // sfinae types
   template<typename _Res>
-   struct __enable_if_void : enable_if<is_void<_Res>::value, int> { };
+   using __enable_if_void
+ = typename enable_if<is_void<_Res>{}>::type;
+
   template<typename _Res>
-   struct __disable_if_void : enable_if<!is_void<_Res>::value, int> { };
+   using __disable_if_void
+ = typename enable_if<!is_void<_Res>{}, _Result>::type;
 
   // Call unqualified
   template<typename _Res, typename... _Args, std::size_t... _Indexes>
-   _Result
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __disable_if_void<_Res>::type = 0)
+   __disable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>)
{
  return _M_f(_Mu<_Bound_args>()
  (std::get<_Indexes>(_M_bound_args), __args)...);
@@ -1016,9 +1018,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call unqualified, return void
   template<typename _Res, typename... _Args, std::size_t... _Indexes>
-   void
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __enable_if_void<_Res>::type = 0)
+   __enable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>)
{
  _M_f(_Mu<_Bound_args>()
   (std::get<_Indexes>(_M_bound_args), __args)...);
@@ -1026,9 +1027,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as const
   template<typename _Res, typename... _Args, std::size_t... _Indexes>
-   _Result
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __disable_if_void<_Res>::type = 0) const
+   __disable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>) const
{
  return _M_f(_Mu<_Bound_args>()
  (std::get<_Indexes>(_M_bound_args), __args)...);
@@ -1036,9 +1036,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as const, return void
   template<typename _Res, typename... _Args, std::size_t... _Indexes>
-   void
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __enable_if_void<_Res>::type = 0) const
+   __enable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>) const
{
  _M_f(_Mu<_Bound_args>()
   (std::get<_Indexes>(_M_bound_args),  __args)...);
@@ -1046,9 +1045,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as volatile
   template<typename _Res, typename... _Args, std::size_t... _Indexes>
-   _Result
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __disable_if_void<_Res>::type = 0) volatile
+   __disable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>) volatile
{
  return _M_f(_Mu<_Bound_args>()
  (__volget<_Indexes>(_M_bound_args), __args)...);
@@ -1056,9 +1054,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as volatile, return void
   template<typename _Res, typename... _Args, std::size_t... _Indexes>
-   void
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __enable_if_void<_Res>::type = 0) volatile
+   __enable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>) volatile
{
  _M_f(_Mu<_Bound_args>()
   (__volget<_Indexes>(_M_bound_args), __args)...);
@@ -1066,9 +1063,9 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as const volatile
   template<typename _Res, typename... _Args, std::size_t... _Indexes>
-   _Result
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __disable_if_void<_Res>::type = 0) const volatile
+   __disable_if_void<_Res>
+   __call(tuple<_Args...>&& __args,
+  _Index_tuple<_Indexes...>) const volatile
{
  return _M_f(_Mu<_Bound_args>()
  (__volget<_Indexes>(_M_bound_args), __args)...);
@@ -1076,10 +1073,9 @@ _GLIBCXX_MEM_FN

Go patch committed: remove GCC-specific linemap usage

2016-10-10 Thread Ian Lance Taylor
This patch by Than McIntosh removes a GCC-specific use of the linemap
code to retrieve the line number.  Bootstrapped and ran Go testsuite
on x86_64-pc-linux-gnu.  Committed to mainline.
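
The shape of the change, a pure virtual method on the abstract interface plus a static forwarding accessor, can be sketched in isolation as follows.  The classes here are simplified stand-ins: Location is reduced to a bare line number, and set_instance and Fake_linemap are inventions for the example that do not appear in the patch.

```cpp
#include <cassert>

// Hypothetical, much-simplified location type for this sketch.
struct Location {
  int line;
};

// Abstract interface seen by the frontend; it knows nothing about GCC.
class Linemap {
 public:
  virtual ~Linemap() {}

  // Backend-specific query, implemented by each concrete linemap.
  virtual int location_line(Location) = 0;

  // Static convenience wrapper that forwards to the registered instance.
  static int location_to_line(Location loc) {
    assert(instance_ != nullptr);
    return instance_->location_line(loc);
  }

  // Assumed registration hook, only for this example.
  static void set_instance(Linemap* lm) { instance_ = lm; }

 private:
  static Linemap* instance_;
};

Linemap* Linemap::instance_ = nullptr;

// Stand-in for a backend implementation such as Gcc_linemap.
class Fake_linemap : public Linemap {
 public:
  int location_line(Location loc) override { return loc.line; }
};
```

Callers in the frontend then go through Linemap::location_to_line and never touch a backend-specific macro.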

Ian

2016-10-10  Than McIntosh  

* go-linemap.cc (Gcc_linemap::location_line): New method.
Index: gcc/go/go-linemap.cc
===
--- gcc/go/go-linemap.cc(revision 240755)
+++ gcc/go/go-linemap.cc(working copy)
@@ -32,6 +32,9 @@ class Gcc_linemap : public Linemap
   std::string
   to_string(Location);
 
+  int
+  location_line(Location);
+
  protected:
   Location
   get_predeclared_location();
@@ -88,6 +91,13 @@ Gcc_linemap::to_string(Location location
   return ss.str();
 }
 
+// Return the line number for a given location (for debugging dumps)
+int
+Gcc_linemap::location_line(Location loc)
+{
+  return LOCATION_LINE(loc.gcc_location());
+}
+
 // Stop getting locations.
 
 void
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 240941)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-9401e714d690e3907a64ac5c8cd5aed9e28f511b
+f3658aea2493c7f1c4a72502f9e7da562c7764c4
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/escape.cc
===
--- gcc/go/gofrontend/escape.cc (revision 240941)
+++ gcc/go/gofrontend/escape.cc (working copy)
@@ -145,7 +145,7 @@ Node::details() const
   std::stringstream details;
 
   if (!this->is_sink())
-details << " l(" << LOCATION_LINE(this->location().gcc_location()) << ")";
+details << " l(" << Linemap::location_to_line(this->location()) << ")";
 
   bool is_varargs = false;
   bool is_address_taken = false;
Index: gcc/go/gofrontend/go-linemap.h
===
--- gcc/go/gofrontend/go-linemap.h  (revision 240755)
+++ gcc/go/gofrontend/go-linemap.h  (working copy)
@@ -63,6 +63,10 @@ class Linemap
   virtual std::string
   to_string(Location) = 0;
 
+  // Return the line number for a given location (for debugging dumps)
+  virtual int
+  location_line(Location) = 0;
+
  protected:
   // Return a special Location used for predeclared identifiers.  This
   // Location should be different from that for any actual source
@@ -135,6 +139,14 @@ class Linemap
 go_assert(Linemap::instance_ != NULL);
 return Linemap::instance_->to_string(loc);
   }
+
+  // Return line number for location
+  static int
+  location_to_line(Location loc)
+  {
+go_assert(Linemap::instance_ != NULL);
+return Linemap::instance_->location_line(loc);
+  }
 };
 
 // The backend interface must define this function.  It should return


Re: [v3 PATCH] Make any's copy assignment operator exception-safe, don't copy the underlying value when any is moved, make in_place constructors explicit.

2016-10-10 Thread Jonathan Wakely

On 08/10/16 16:07 +0300, Ville Voutilainen wrote:

Tested on Linux-x64.

2016-10-08  Ville Voutilainen  

   Make any's copy assignment operator exception-safe,
   don't copy the underlying value when any is moved,
   make in_place constructors explicit.
   * include/std/any (any(in_place_type_t<_ValueType>, _Args&&...)):
   Make explicit.
   (any(in_place_type_t<_ValueType>, initializer_list<_Up>, _Args&&...)):
   Likewise.
   (operator=(const any&)): Make strongly exception-safe.
   (operator=(any&&)): Reset the manager when resetting the value.
   This makes the state saner if an exception is thrown during the move.
   (_Manager_internal<_Tp>::_S_manage): Move in _Op_xfer, don't copy.
   * testsuite/20_util/any/assign/2.cc: Adjust.
   * testsuite/20_util/any/assign/exception.cc: New.
   * testsuite/20_util/any/cons/2.cc: Adjust.
   * testsuite/20_util/any/cons/explicit.cc: New.
   * testsuite/20_util/any/misc/any_cast_neg.cc: Adjust.



diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 9160035..78bdf89 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -179,7 +179,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Tp = _Decay<_ValueType>,
  typename _Mgr = _Manager<_Tp>,
  __any_constructible_t<_Tp, _Args&&...> = false>
-  any(in_place_type_t<_ValueType>, _Args&&... __args)
+  explicit any(in_place_type_t<_ValueType>, _Args&&... __args)
  : _M_manager(&_Mgr::_S_manage)
  {
_Mgr::_S_create(_M_storage, std::forward<_Args>(__args)...);
@@ -192,8 +192,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Mgr = _Manager<_Tp>,
  __any_constructible_t<_Tp, initializer_list<_Up>,
_Args&&...> = false>
-  any(in_place_type_t<_ValueType>,
- initializer_list<_Up> __il, _Args&&... __args)
+  explicit any(in_place_type_t<_ValueType>,
+  initializer_list<_Up> __il, _Args&&... __args)
  : _M_manager(&_Mgr::_S_manage)
  {
_Mgr::_S_create(_M_storage, __il, std::forward<_Args>(__args)...);


I prefer to put "explicit" on a line of its own, as we do for return
types, but I won't complain if you leave it like this.


@@ -211,11 +211,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
reset();
  else if (this != &__rhs)
{
- if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
- _Arg __arg;
- __arg._M_any = this;
- __rhs._M_manager(_Op_clone, &__rhs, &__arg);
+ any(__rhs).swap(*this);


I was trying to avoid the "redundant" xfer operations that the swap
does, but I don't think we can do that and be exception safe. This is
simple and safe, and I think it's optimal. Thanks.
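
For reference, the copy-and-swap idiom used in that one-liner can be shown on a toy type.  This is only a sketch under assumptions: Holder and Payload are hypothetical classes, not std::any, and Payload can be told to throw from its copy constructor so that the strong guarantee is observable.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical payload whose copy constructor can be made to throw.
struct Payload {
  int value;
  bool throw_on_copy;
  explicit Payload(int v) : value(v), throw_on_copy(false) {}
  Payload(const Payload& other)
    : value(other.value), throw_on_copy(other.throw_on_copy) {
    if (other.throw_on_copy)
      throw std::runtime_error("copy failed");
  }
};

// Hypothetical owner type standing in for a type-erased container.
class Holder {
 public:
  explicit Holder(int v) : p_(new Payload(v)) {}
  Holder(const Holder& other) : p_(new Payload(*other.p_)) {}
  ~Holder() { delete p_; }

  void swap(Holder& other) noexcept { std::swap(p_, other.p_); }

  // Copy first, then swap: if the copy throws, *this is untouched.
  Holder& operator=(const Holder& rhs) {
    if (this != &rhs)
      Holder(rhs).swap(*this);
    return *this;
  }

  int value() const { return p_->value; }
  void make_copy_throw() { p_->throw_on_copy = true; }

 private:
  Payload* p_;
};
```

The temporary takes over the old payload in the swap and destroys it at the end of the full expression, so no explicit destroy step is needed.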


}
  return *this;
}
@@ -232,7 +228,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  else if (this != &__rhs)
{
  if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
+   reset();


If you're going to use reset() then you don't need the has_value()
check first. I think the reason I didn't use reset() was to avoid the
dead store to _M_manager that reset() does, since the compiler might
not detect it's dead (because the next store is done by the call
through a function pointer).

This code was all pretty carefully written to avoid any redundant
operations. Does this change buy us anything except simpler code?



  _Arg __arg;
  __arg._M_any = this;
  __rhs._M_manager(_Op_xfer, &__rhs, &__arg);
@@ -556,7 +552,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__ptr->~_Tp();
break;
  case _Op_xfer:
-   ::new(&__arg->_M_any->_M_storage._M_buffer) _Tp(*__ptr);
+   ::new(&__arg->_M_any->_M_storage._M_buffer) _Tp
+ (std::move(*const_cast<_Tp*>(__ptr)));


I was looking at this recently and wondering why I did a copy not a
move. *cough* no redundant operations *cough* Oops.




Re: [PATCH] Implement new hook for max_align_t_align

2016-10-10 Thread John David Anglin
Attached is an updated version using the new builtin __MAX_ALIGN_T_ALIGN__.
This simplifies the declaration of max_align_t and ensures it is always the
same as max_align_t_align().
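
From the user's side, the intended relationship can be sketched as below.  The macro __MAX_ALIGN_T_ALIGN__ is only what this patch proposes, so the sketch falls back to alignof (std::max_align_t) when it is not defined; the function names are made up for the example, and the values are in bytes, whereas the hook itself returns bits.

```cpp
#include <cassert>
#include <cstddef>

// Alignment, in bytes, up to which an alignment counts as "fundamental".
// __MAX_ALIGN_T_ALIGN__ is the macro proposed in this patch; compilers
// without it take the fallback branch.
constexpr std::size_t max_fundamental_alignment() {
#ifdef __MAX_ALIGN_T_ALIGN__
  return __MAX_ALIGN_T_ALIGN__;
#else
  return alignof(std::max_align_t);
#endif
}

// Byte-based analogue of the check done by cxx_fundamental_alignment_p.
constexpr bool is_fundamental_alignment(std::size_t align) {
  return align <= max_fundamental_alignment();
}
```

Anything above this bound is an extended alignment, which is what triggers the warnings about new that the PA hook is meant to suppress.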

Tested on hppa-unknown-linux-gnu.  Okay for trunk?

Dave
--
John David Anglin   dave.ang...@bell.net


2016-10-10  John David Anglin  

gcc/c-family/
* c-common.c (c_stddef_cpp_builtins): Add __MAX_ALIGN_T_ALIGN__ builtin
define.
(max_align_t_align): Move to targhooks.c.
* c-common.h (max_align_t_align): Delete.
gcc/
* target.def (max_align_t_align): New target hook.
* targhooks.c (default_max_align_t_align): New.
* targhooks.h (default_max_align_t_align): Declare.
* config/pa/pa.c (pa_max_align_t_align): New.
(TARGET_MAX_ALIGN_T_ALIGN): Define.
* ginclude/stddef.h (max_align_t): Use __MAX_ALIGN_T_ALIGN__ builtin.
* doc/tm.texi.in (TARGET_MAX_ALIGN_T_ALIGN): Add documentation hook.
* doc/tm.texi: Update.
gcc/cp/
* decl.c (cxx_init_decl_processing): Use max_align_t_align target hook.
* init.c (build_new_1): Likewise.

Index: c-family/c-common.c
===
--- c-family/c-common.c (revision 240901)
+++ c-family/c-common.c (working copy)
@@ -6683,6 +6683,8 @@
 builtin_define_with_value ("__INTPTR_TYPE__", INTPTR_TYPE, 0);
   if (UINTPTR_TYPE)
 builtin_define_with_value ("__UINTPTR_TYPE__", UINTPTR_TYPE, 0);
+  builtin_define_with_int_value ("__MAX_ALIGN_T_ALIGN__",
+targetm.max_align_t_align () / BITS_PER_UNIT);
 }
 
 static void
@@ -12925,22 +12927,6 @@
   return stv_nothing;
 }
 
-/* Return the alignment of std::max_align_t.
-
-   [support.types.layout] The type max_align_t is a POD type whose alignment
-   requirement is at least as great as that of every scalar type, and whose
-   alignment requirement is supported in every context.  */
-
-unsigned
-max_align_t_align ()
-{
-  unsigned int max_align = MAX (TYPE_ALIGN (long_long_integer_type_node),
-   TYPE_ALIGN (long_double_type_node));
-  if (float128_type_node != NULL_TREE)
-max_align = MAX (max_align, TYPE_ALIGN (float128_type_node));
-  return max_align;
-}
-
 /* Return true iff ALIGN is an integral constant that is a fundamental
alignment, as defined by [basic.align] in the c++-11
specifications.
@@ -12954,7 +12940,7 @@
 bool
 cxx_fundamental_alignment_p (unsigned align)
 {
-  return (align <= max_align_t_align ());
+  return (align <= targetm.max_align_t_align ());
 }
 
 /* Return true if T is a pointer to a zero-sized aggregate.  */
Index: c-family/c-common.h
===
--- c-family/c-common.h (revision 240901)
+++ c-family/c-common.h (working copy)
@@ -866,7 +866,6 @@
 extern bool keyword_is_storage_class_specifier (enum rid);
 extern bool keyword_is_type_qualifier (enum rid);
 extern bool keyword_is_decl_specifier (enum rid);
-extern unsigned max_align_t_align (void);
 extern bool cxx_fundamental_alignment_p (unsigned);
 extern bool pointer_to_zero_sized_aggr_p (tree);
 extern bool diagnose_mismatched_attributes (tree, tree);
Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 240901)
+++ config/pa/pa.c  (working copy)
@@ -194,6 +194,7 @@
 static bool pa_legitimate_constant_p (machine_mode, rtx);
 static unsigned int pa_section_type_flags (tree, const char *, int);
 static bool pa_legitimate_address_p (machine_mode, rtx, bool);
+static unsigned int pa_max_align_t_align (void);
 
 /* The following extra sections are only used for SOM.  */
 static GTY(()) section *som_readonly_data_section;
@@ -400,6 +401,9 @@
 #undef TARGET_LRA_P
 #define TARGET_LRA_P hook_bool_void_false
 
+#undef TARGET_MAX_ALIGN_T_ALIGN
+#define TARGET_MAX_ALIGN_T_ALIGN pa_max_align_t_align
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Parse the -mfixed-range= option string.  */
@@ -10719,4 +10723,16 @@
   return NULL_RTX;
 }
 
+/* The maximum alignment in bits for the POD type std::max_align_t.
+   This is set to 128 on 32-bit non HP-UX systems to suppress warnings
+   about new with extended alignment.  This arises because various POSIX
+   types such as pthread_mutex_t have for historical reasons 128-bit
+   alignment but the default alignment of std::max_align_t is 64 bits.  */
+
+static unsigned int
+pa_max_align_t_align (void)
+{
+  return TARGET_HPUX && !TARGET_64BIT ? 64 : 128;
+}
+
 #include "gt-pa.h"
Index: cp/decl.c
===
--- cp/decl.c   (revision 240901)
+++ cp/decl.c   (working copy)
@@ -4082,7 +4082,7 @@
   if (aligned_new_threshold == -1)
 aligned_new_threshold = (cxx_dialect >= cxx1z) ? 1 : 0;
   if (aligned_new_threshold == 1)
-aligned_new_threshold = max_align_t_align () / BITS_PER_UNIT;
+aligned_new_threshold = targ

Re: [v3 PATCH] Make any's copy assignment operator exception-safe, don't copy the underlying value when any is moved, make in_place constructors explicit.

2016-10-10 Thread Jonathan Wakely

On 10/10/16 19:19 +0100, Jonathan Wakely wrote:

On 08/10/16 16:07 +0300, Ville Voutilainen wrote:

Tested on Linux-x64.

2016-10-08  Ville Voutilainen  

  Make any's copy assignment operator exception-safe,
  don't copy the underlying value when any is moved,
  make in_place constructors explicit.
  * include/std/any (any(in_place_type_t<_ValueType>, _Args&&...)):
  Make explicit.
  (any(in_place_type_t<_ValueType>, initializer_list<_Up>, _Args&&...)):
  Likewise.
  (operator=(const any&)): Make strongly exception-safe.
  (operator=(any&&)): Reset the manager when resetting the value.
  This makes the state saner if an exception is thrown during the move.
  (_Manager_internal<_Tp>::_S_manage): Move in _Op_xfer, don't copy.
  * testsuite/20_util/any/assign/2.cc: Adjust.
  * testsuite/20_util/any/assign/exception.cc: New.
  * testsuite/20_util/any/cons/2.cc: Adjust.
  * testsuite/20_util/any/cons/explicit.cc: New.
  * testsuite/20_util/any/misc/any_cast_neg.cc: Adjust.



diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 9160035..78bdf89 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -179,7 +179,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Tp = _Decay<_ValueType>,
  typename _Mgr = _Manager<_Tp>,
 __any_constructible_t<_Tp, _Args&&...> = false>
-  any(in_place_type_t<_ValueType>, _Args&&... __args)
+  explicit any(in_place_type_t<_ValueType>, _Args&&... __args)
 : _M_manager(&_Mgr::_S_manage)
 {
   _Mgr::_S_create(_M_storage, std::forward<_Args>(__args)...);
@@ -192,8 +192,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Mgr = _Manager<_Tp>,
 __any_constructible_t<_Tp, initializer_list<_Up>,
_Args&&...> = false>
-  any(in_place_type_t<_ValueType>,
- initializer_list<_Up> __il, _Args&&... __args)
+  explicit any(in_place_type_t<_ValueType>,
+  initializer_list<_Up> __il, _Args&&... __args)
 : _M_manager(&_Mgr::_S_manage)
 {
   _Mgr::_S_create(_M_storage, __il, std::forward<_Args>(__args)...);


I prefer to put "explicit" on a line of its own, as we do for return
types, but I won't complain if you leave it like this.


@@ -211,11 +211,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
reset();
 else if (this != &__rhs)
{
- if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
- _Arg __arg;
- __arg._M_any = this;
- __rhs._M_manager(_Op_clone, &__rhs, &__arg);
+ any(__rhs).swap(*this);


I was trying to avoid the "redundant" xfer operations that the swap
does, but I don't think we can do that and be exception safe. This is
simple and safe, and I think it's optimal. Thanks.


As discussed on IRC, it can be:

 else if (this != &__rhs)
   *this = any(__rhs);

which does one clone, one xfer and one destroy.

This way the effort of avoiding an extra xfer op is in the move
assignment operator.
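
The copy-then-move-assign pattern described above can be illustrated with a
minimal sketch. This is not the std::any implementation: Payload and Box are
hypothetical stand-ins, and the only point shown is that the copy (the one
step that may throw) finishes before the target object is modified.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical payload whose copy constructor can be made to throw.
struct Payload
{
  int value;
  static bool throw_on_copy;

  explicit Payload(int v) : value(v) { }
  Payload(const Payload& other) : value(other.value)
  {
    if (throw_on_copy)
      throw std::runtime_error("copy failed");
  }
};

bool Payload::throw_on_copy = false;

// Hypothetical owning wrapper standing in for a type-erased container.
class Box
{
  Payload* ptr_ = nullptr;

public:
  Box() = default;
  explicit Box(int v) : ptr_(new Payload(v)) { }
  Box(const Box& rhs) : ptr_(rhs.ptr_ ? new Payload(*rhs.ptr_) : nullptr) { }
  Box(Box&& rhs) noexcept : ptr_(rhs.ptr_) { rhs.ptr_ = nullptr; }
  ~Box() { delete ptr_; }

  // Move assignment: destroy the old value, then steal the new one.
  Box& operator=(Box&& rhs) noexcept
  {
    if (this != &rhs)
      {
        delete ptr_;
        ptr_ = rhs.ptr_;
        rhs.ptr_ = nullptr;
      }
    return *this;
  }

  // Copy assignment: the temporary is built first, so a throwing copy
  // leaves *this untouched; the rest is a non-throwing move assignment.
  Box& operator=(const Box& rhs)
  {
    if (this != &rhs)
      *this = Box(rhs);
    return *this;
  }

  bool has_value() const { return ptr_ != nullptr; }
  int value() const { return ptr_->value; }
};

// Returns true if a throwing copy assignment left the target unchanged.
bool assignment_is_strongly_safe()
{
  Box target(1);
  Box source(2);
  Payload::throw_on_copy = true;
  bool threw = false;
  try { target = source; }
  catch (const std::runtime_error&) { threw = true; }
  Payload::throw_on_copy = false;
  return threw && target.has_value() && target.value() == 1;
}
```

In this sketch the cost of a successful copy assignment is one copy, one
pointer transfer and one destruction of the old value, which mirrors the
clone/xfer/destroy count mentioned above.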


As a drive-by fix, on operator=(_ValueType&& __rhs) please indent the
return type to line up with "operator".


[PATCH] Use noexcept instead of _GLIBCXX_USE_NOEXCEPT

2016-10-10 Thread Jonathan Wakely

This file is compiled with -std=gnu++11 so there's no need to use the
macro; we can use noexcept directly.

* libsupc++/eh_ptr.cc (exception_ptr): Replace _GLIBCXX_USE_NOEXCEPT
with noexcept.

Tested powerpc64le-linux, committed to trunk.
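
A minimal sketch of why the substitution is safe, assuming a header that must
still be usable from C++98 and a source file that is always built as C++11 or
later. MY_NOEXCEPT and Handle are hypothetical names used for illustration;
they are not the libstdc++ macro or the exception_ptr class.

```cpp
#include <cassert>

// Hypothetical stand-in for a dialect-dependent library macro.
#if __cplusplus >= 201103L
# define MY_NOEXCEPT noexcept
#else
# define MY_NOEXCEPT throw()
#endif

// "Header" part: declarations may be seen by any dialect, so they use
// the macro.
struct Handle
{
  void* p;
  Handle() MY_NOEXCEPT;
  void* get() const MY_NOEXCEPT;
};

// "Source file" part: always compiled as C++11 or later, so the keyword
// can be spelled directly and still matches the declarations above.
Handle::Handle() noexcept : p(nullptr) { }
void* Handle::get() const noexcept { return p; }

static_assert(noexcept(Handle()), "constructor must be noexcept");
static_assert(noexcept(Handle().get()), "get() must be noexcept");

// Runtime-visible form of the same check.
bool handle_is_noexcept()
{
  return noexcept(Handle().get());
}
```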


commit f80e14a01697a34b835638a303967c0a7ad194a1
Author: Jonathan Wakely 
Date:   Mon Oct 10 18:38:20 2016 +0100

Use noexcept instead of _GLIBCXX_USE_NOEXCEPT

* libsupc++/eh_ptr.cc (exception_ptr): Replace _GLIBCXX_USE_NOEXCEPT
with noexcept.

diff --git a/libstdc++-v3/libsupc++/eh_ptr.cc b/libstdc++-v3/libsupc++/eh_ptr.cc
index 3b8e0a01..f3c910b 100644
--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -63,33 +63,31 @@ static_assert( adjptr<__cxa_exception>()
 #endif
 }
 
-std::__exception_ptr::exception_ptr::exception_ptr() _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::exception_ptr() noexcept
 : _M_exception_object(0) { }
 
 
-std::__exception_ptr::exception_ptr::exception_ptr(void* obj)
-_GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::exception_ptr(void* obj) noexcept
 : _M_exception_object(obj)  { _M_addref(); }
 
 
-std::__exception_ptr::exception_ptr::exception_ptr(__safe_bool)
-_GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::exception_ptr(__safe_bool) noexcept
 : _M_exception_object(0) { }
 
 
 std::__exception_ptr::
-exception_ptr::exception_ptr(const exception_ptr& other) _GLIBCXX_USE_NOEXCEPT
+exception_ptr::exception_ptr(const exception_ptr& other) noexcept
 : _M_exception_object(other._M_exception_object)
 { _M_addref(); }
 
 
-std::__exception_ptr::exception_ptr::~exception_ptr() _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::~exception_ptr() noexcept
 { _M_release(); }
 
 
 std::__exception_ptr::exception_ptr&
 std::__exception_ptr::
-exception_ptr::operator=(const exception_ptr& other) _GLIBCXX_USE_NOEXCEPT
+exception_ptr::operator=(const exception_ptr& other) noexcept
 {
   exception_ptr(other).swap(*this);
   return *this;
@@ -97,7 +95,7 @@ exception_ptr::operator=(const exception_ptr& other) 
_GLIBCXX_USE_NOEXCEPT
 
 
 void
-std::__exception_ptr::exception_ptr::_M_addref() _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::_M_addref() noexcept
 {
   if (_M_exception_object)
 {
@@ -109,7 +107,7 @@ std::__exception_ptr::exception_ptr::_M_addref() 
_GLIBCXX_USE_NOEXCEPT
 
 
 void
-std::__exception_ptr::exception_ptr::_M_release() _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::_M_release() noexcept
 {
   if (_M_exception_object)
 {
@@ -128,13 +126,12 @@ std::__exception_ptr::exception_ptr::_M_release() 
_GLIBCXX_USE_NOEXCEPT
 
 
 void*
-std::__exception_ptr::exception_ptr::_M_get() const _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::_M_get() const noexcept
 { return _M_exception_object; }
 
 
 void
-std::__exception_ptr::exception_ptr::swap(exception_ptr &other)
-  _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::swap(exception_ptr &other) noexcept
 {
   void *tmp = _M_exception_object;
   _M_exception_object = other._M_exception_object;
@@ -144,27 +141,24 @@ std::__exception_ptr::exception_ptr::swap(exception_ptr 
&other)
 
 // Retained for compatibility with CXXABI_1.3.
 void
-std::__exception_ptr::exception_ptr::_M_safe_bool_dummy()
-  _GLIBCXX_USE_NOEXCEPT { }
+std::__exception_ptr::exception_ptr::_M_safe_bool_dummy() noexcept { }
 
 
 // Retained for compatibility with CXXABI_1.3.
 bool
-std::__exception_ptr::exception_ptr::operator!() const _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::operator!() const noexcept
 { return _M_exception_object == 0; }
 
 
 // Retained for compatibility with CXXABI_1.3.
-std::__exception_ptr::exception_ptr::operator __safe_bool() const
-_GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::operator __safe_bool() const noexcept
 {
   return _M_exception_object ? &exception_ptr::_M_safe_bool_dummy : 0;
 }
 
 
 const std::type_info*
-std::__exception_ptr::exception_ptr::__cxa_exception_type() const
-  _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::__cxa_exception_type() const noexcept
 {
   __cxa_exception *eh = __get_exception_header_from_obj (_M_exception_object);
   return eh->exceptionType;
@@ -172,19 +166,17 @@ 
std::__exception_ptr::exception_ptr::__cxa_exception_type() const
 
 
 bool std::__exception_ptr::operator==(const exception_ptr& lhs,
- const exception_ptr& rhs)
-  _GLIBCXX_USE_NOEXCEPT
+ const exception_ptr& rhs) noexcept
 { return lhs._M_exception_object == rhs._M_exception_object; }
 
 
 bool std::__exception_ptr::operator!=(const exception_ptr& lhs,
- const exception_ptr& rhs)
-  _GLIBCXX_USE_NOEXCEPT
+ const exception_ptr& rhs) noexcept
 { return !(lhs == rhs);}
 
 
 std::exception_ptr
-std::current_exception() _GLIBCXX_USE_NOEXCEPT
+std::current_except

[PATCH] Correct C++11 implementation status docs

2016-10-10 Thread Jonathan Wakely

The std::list allocator status and the note about timed mutexes are
out of date; both are completely implemented now (there's a
fallback timed mutex for targets without _POSIX_TIMEOUTS).

* doc/xml/manual/status_cxx2011.xml: Correct C++11 status.

Committed to trunk. I'll backport this to the branches as appropriate.


commit cdeed69de9aae70a15633a160378d84fbd03478c
Author: Jonathan Wakely 
Date:   Mon Oct 10 19:33:23 2016 +0100

Correct C++11 implementation status docs

* doc/xml/manual/status_cxx2011.xml: Correct C++11 status.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index 83a266f..705f2ee 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -1340,12 +1340,10 @@ particular release.
   
 
 
-  
   23.2.1
   General container requirements
-  Partial
-  list does not meet the requirements
- relating to allocator use and propagation.
+  Y
+  
 
 
   23.2.2
@@ -1396,11 +1394,10 @@ particular release.
   
 
 
-  
   23.3.5
   Class template list
-  Partial
-  Incomplete allocator support.
+  Y
+  
 
 
   23.3.6
@@ -2349,8 +2346,7 @@ particular release.
   30.4.1.3
   Timed mutex types
   
-  On POSIX sytems these types are only defined if the OS
- supports the POSIX Timeouts option. 
+  
 
 
   30.4.1.3.1


Re: [PATCH 10/16] Introduce class function_reader (v3)

2016-10-10 Thread Richard Sandiford
David Malcolm  writes:
> On Wed, 2016-10-05 at 18:00 +0200, Bernd Schmidt wrote:
>> On 10/05/2016 06:15 PM, David Malcolm wrote:
>> >* errors.c: Use consistent pattern for bconfig.h vs config.h
>> >includes.
>> >(progname): Wrap with #ifdef GENERATOR_FILE.
>> >(error): Likewise.  Add "error: " to message.
>> >(fatal): Likewise.
>> >(internal_error): Likewise.
>> >(trim_filename): Likewise.
>> >(fancy_abort): Likewise.
>> >* errors.h (struct file_location): Move here from read-md.h.
>> >(file_location::file_location): Likewise.
>> >(error_at): New decl.
>> 
>> Can you split these out into a separate patch as well? I'll require
>> more 
>> explanation for them and they seem largely independent.
>
> [CCing Richard Sandiford]
>
> The gen* tools have their own diagnostics system, in errors.c:
>
> /* warning, error, and fatal.  These definitions are suitable for use
>in the generator programs; the compiler has a more elaborate suite
>of diagnostic printers, found in diagnostic.c.  */
>
> with file locations tracked using read-md.h's struct file_location,
> rather than location_t (aka libcpp's source_location).
>
> Implementing an RTL frontend by using the RTL reader from read-rtl.c
> means that we now need a diagnostics subsystem on the *host* for
> handling errors in RTL files, rather than just on the build machine.
>
> There seem to be two ways to do this:
>
>   (A) build the "light" diagnostics system (errors.c) for the host as
> well as build machine, and link it with the RTL reader there, so there
> are two parallel diagnostics subsystems.
>
>   (B) build the "real" diagnostics system (diagnostics*) for the
> *build* machine as well as the host, and use it from the gen* tools,
> eliminating the "light" system, and porting the gen* tools to use
> libcpp for location tracking.
>
> Approach (A) seems to be simpler, which is what this part of the patch
> does.
>
> I've experimented with approach (B).  I think it's doable, but it's
> much more invasive (perhaps needing a libdiagnostics.a and a
> build/libdiagnostics.a in gcc/Makefile.in), so I hope this can be
> followup work.
>
> I can split the relevant parts out into a separate patch, but I was
> wondering if either of you had a strong opinion on (A) vs (B) before I
> do so?

(A) sounds fine to me FWIW.  And sorry for the slow reply.

Thanks,
Richard


[PATCH] Fix PR77824

2016-10-10 Thread Bill Schmidt
Hi,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77824 reports unreachable code
where MODIFY_EXPR is being tested instead of SSA_NAME to identify RHS's for
copies.  This patch corrects that.  I instrumented the compiler to identify
copies being added to the candidate table, and found that this now occurs
frequently in GCC's support libraries as well as throughout SPEC CPU2006.
I spot-checked the SLSR dumps for a number of code examples and found that,
while copies often now appear in the candidate table, and sometimes appear
in candidate chains representing potential opportunities, I have not yet
found a place where this changes code generation.

Bootstrapped and tested for powerpc64le-unknown-linux-gnu with no
regressions, committed.
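
The reason a MODIFY_EXPR case can never be reached may be easier to see in a
toy model. The sketch below is an assumption-laden illustration only: the
enums and structs imitate the naming of GCC's internals but are not GCC's
types or API. It models the convention that, for a plain copy "x = y", the
code reported for the right-hand side is the code of the operand itself.

```cpp
#include <cassert>

// Toy model only; these names imitate, but are not, GCC's data structures.
enum tree_code { SSA_NAME, INTEGER_CST, PLUS_EXPR, MULT_EXPR, MODIFY_EXPR };

struct operand { tree_code code; };

struct assign
{
  bool single_rhs;   // true for "x = y", false for "x = y op z"
  tree_code op;      // operator; meaningful only when !single_rhs
  operand rhs1;
};

// For a single-operand assignment the "rhs code" is the operand's own code.
tree_code rhs_code(const assign& a)
{
  return a.single_rhs ? a.rhs1.code : a.op;
}

enum kind { COPY, ARITH, OTHER };

kind classify(const assign& a)
{
  switch (rhs_code(a))
    {
    case SSA_NAME:
      return COPY;      // a copy from one SSA name to another
    case PLUS_EXPR:
    case MULT_EXPR:
      return ARITH;
    case MODIFY_EXPR:   // not produced for any copy in this model
    default:
      return OTHER;
    }
}
```

In this model a switch that tests for MODIFY_EXPR silently skips every copy,
which is the kind of dead case the bug report describes.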

Thanks,

Bill



2016-10-10  Bill Schmidt  

PR tree-optimization/77824
* gimple-ssa-strength-reduction.c (stmt_cost): Explicitly return
zero cost for copies.
(find_candidates_dom_walker::before_dom_children): Replace
MODIFY_EXPR with SSA_NAME.
(replace_mult_candidate): Likewise.
(replace_profitable_candidates): Likewise.


Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 240924)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -688,6 +688,9 @@ stmt_cost (gimple *gs, bool speed)
 
 /* Note that we don't assign costs to copies that in most cases
will go away.  */
+case SSA_NAME:
+  return 0;
+  
 default:
   ;
 }
@@ -1693,7 +1696,7 @@ find_candidates_dom_walker::before_dom_children (b
  gcc_fallthrough ();
 
CASE_CONVERT:
-   case MODIFY_EXPR:
+   case SSA_NAME:
case NEGATE_EXPR:
  rhs1 = gimple_assign_rhs1 (gs);
  if (TREE_CODE (rhs1) != SSA_NAME)
@@ -1724,7 +1727,7 @@ find_candidates_dom_walker::before_dom_children (b
  slsr_process_cast (gs, rhs1, speed);
  break;
 
-   case MODIFY_EXPR:
+   case SSA_NAME:
  slsr_process_copy (gs, rhs1, speed);
  break;
 
@@ -2010,7 +2013,7 @@ replace_mult_candidate (slsr_cand_t c, tree basis_
   && bump.to_shwi () != HOST_WIDE_INT_MIN
   /* It is not useful to replace casts, copies, or adds of
 an SSA name and a constant.  */
-  && cand_code != MODIFY_EXPR
+  && cand_code != SSA_NAME
   && !CONVERT_EXPR_CODE_P (cand_code)
   && cand_code != PLUS_EXPR
   && cand_code != POINTER_PLUS_EXPR
@@ -3445,7 +3448,7 @@ replace_profitable_candidates (slsr_cand_t c)
 to a cast or copy.  */
   if (i >= 0
  && profitable_increment_p (i) 
- && orig_code != MODIFY_EXPR
+ && orig_code != SSA_NAME
  && !CONVERT_EXPR_CODE_P (orig_code))
{
  if (phi_dependent_cand_p (c))



[PATCH] Update docs on libstdc++ source-code layout

2016-10-10 Thread Jonathan Wakely

Self-explanatory updates to the docs, and regenerating after the
various recent changes.

* doc/xml/manual/appendix_contributing.xml (contrib.organization):
Describe other subdirectories and add markup. Remove outdated
reference to check-script target.
* doc/html/*: Regenerate.

Committed to trunk.

commit 40bed069fd9497174b398c683d684fc825867cb7
Author: Jonathan Wakely 
Date:   Mon Oct 10 19:54:50 2016 +0100

Update docs on libstdc++ source-code layout

* doc/xml/manual/appendix_contributing.xml (contrib.organization):
Describe other subdirectories and add markup. Remove outdated
reference to check-script target.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml 
b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
index d7df13c..ee35dd9 100644
--- a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
+++ b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
@@ -199,91 +199,104 @@
   
 
   
-The unpacked source directory of libstdc++ contains the files
-needed to create the GNU C++ Library.
+The libstdc++-v3 directory in the
+GCC sources contains the files needed to create the GNU C++ Library.
   
 
   
 It has subdirectories:
 
-  doc
+  doc
 Files in HTML and text format that document usage, quirks of the
 implementation, and contributor checklists.
 
-  include
+  include
 All header files for the C++ library are within this directory,
 modulo specific runtime-related files that are in the libsupc++
 directory.
 
-include/std
+include/std
   Files meant to be found by #include  directives in
   standard-conforming user programs.
 
-include/c
+include/c
   Headers intended to directly include standard C headers.
-  [NB: this can be enabled via --enable-cheaders=c]
+  [NB: this can be enabled via --enable-cheaders=c]
 
-include/c_global
+include/c_global
   Headers intended to include standard C headers in
   the global namespace, and put select names into the std::
   namespace.  [NB: this is the default, and is the same as
-  --enable-cheaders=c_global]
+  --enable-cheaders=c_global]
 
-include/c_std
+include/c_std
   Headers intended to include standard C headers
   already in namespace std, and put select names into the std::
-  namespace.  [NB: this is the same as --enable-cheaders=c_std]
+  namespace.  [NB: this is the same as
+  --enable-cheaders=c_std]
 
-include/bits
+include/bits
   Files included by standard headers and by other files in
   the bits directory.
 
-include/backward
+include/backward
   Headers provided for backward compatibility, such as .
   They are not used in this library.
 
-include/ext
+include/ext
   Headers that define extensions to the standard library.  No
   standard header refers to any of them.
 
-  scripts
+  scripts
 Scripts that are used during the configure, build, make, or test
 process.
 
-  src
+  src
 Files that are used in constructing the library, but are not
 installed.
 
-  testsuites/[backward, demangle, ext, performance, thread, 17_* to 30_*]
+src/c++98
+Source files compiled using -std=gnu++98.
+
+src/c++11
+Source files compiled using -std=gnu++11.
+
+src/filesystem
+Source files for the Filesystem TS.
+
+src/shared
+Source code included by other files under both
+src/c++98 and
+src/c++11
+
+  testsuites/[backward, demangle, ext, 
performance, thread, 17_* to 30_*]
 Test programs are here, and may be used to begin to exercise the
 library.  Support for "make check" and "make check-install" is
 complete, and runs through all the subdirectories here when this
 command is issued from the build directory.  Please note that
-"make check" requires DejaGNU 1.4 or later to be installed.  Please
-note that "make check-script" calls the script mkcheck, which
-requires bash, and which may need the paths to bash adjusted to
-work properly, as /bin/bash is assumed.
+"make check" requires DejaGNU 1.4 or later to be installed.
 
 Other subdirectories contain variant versions of certain files
 that are meant to be copied or linked by the configure script.
 Currently these are:
 
-  config/abi
-  config/cpu
-  config/io
-  config/locale
-  config/os
+  config/abi
+  config/cpu
+  config/io
+  config/locale
+  config/os
 
 In addition, a subdirectory holds the convenience library libsupc++.
 
-  libsupc++
+  libsupc++
 Contains the runtime library for C++, including exception
 handling and memory allocation and deallocation, RTTI, terminate
 handlers, etc.
 
-Note that glibc also has a bits/ subdirectory.  We will either
-need to be careful not to collide with names in its bits/
-directory; or rename bits to (e.g.) cppbits/.
+Note that glibc also has a b

Re: [v3 PATCH] Make any's copy assignment operator exception-safe, don't copy the underlying value when any is moved, make in_place constructors explicit.

2016-10-10 Thread Ville Voutilainen
On 10 October 2016 at 21:19, Jonathan Wakely  wrote:
> I prefer to put "explicit" on a line of its own, as we do for return
> types, but I won't complain if you leave it like this.

Changed.

>> + any(__rhs).swap(*this);
>
>
> I was trying to avoid the "redundant" xfer operations that the swap
> does, but I don't think we can do that and be exception safe. This is
> simple and safe, and I think it's optimal. Thanks.

Right, as discussed, this is now just a move assignment from a temporary.

>
>> }
>>   return *this;
>> }
>> @@ -232,7 +228,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>   else if (this != &__rhs)
>> {
>>   if (has_value())
>> -   _M_manager(_Op_destroy, this, nullptr);
>> +   reset();
>
>
> If you're going to use reset() then you don't need the has_value()
> check first. I think the reason I didn't use reset() was to avoid the

I removed the check, works fine.

> dead store to _M_manager that reset() does, since the compiler might
> not detect it's dead (because the next store is done by the call
> through a function pointer).
> This code was all pretty carefully written to avoid any redundant
> operations. Does this change buy us anything except simpler code?

As discussed, destroying the value but leaving the manager non-null will
do bad things.

New patch attached, ok for trunk?

2016-10-10  Ville Voutilainen  

Make any's copy assignment operator exception-safe,
don't copy the underlying value when any is moved,
make in_place constructors explicit.
* include/std/any (any(in_place_type_t<_ValueType>, _Args&&...)):
Make explicit.
(any(in_place_type_t<_ValueType>, initializer_list<_Up>, _Args&&...)):
Likewise.
(operator=(const any&)): Make strongly exception-safe.
(operator=(any&&)): reset() unconditionally in the case where
rhs has a value.
(operator=(_ValueType&&)): Indent the return type.
(_Manager_internal<_Tp>::_S_manage): Move in _Op_xfer, don't copy.
* testsuite/20_util/any/assign/2.cc: Adjust.
* testsuite/20_util/any/assign/exception.cc: New.
* testsuite/20_util/any/cons/2.cc: Adjust.
* testsuite/20_util/any/cons/explicit.cc: New.
* testsuite/20_util/any/misc/any_cast_neg.cc: Adjust.
diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 9160035..45a2145 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -179,6 +179,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Tp = _Decay<_ValueType>,
  typename _Mgr = _Manager<_Tp>,
   __any_constructible_t<_Tp, _Args&&...> = false>
+  explicit
   any(in_place_type_t<_ValueType>, _Args&&... __args)
   : _M_manager(&_Mgr::_S_manage)
   {
@@ -192,6 +193,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Mgr = _Manager<_Tp>,
   __any_constructible_t<_Tp, initializer_list<_Up>,
_Args&&...> = false>
+  explicit
   any(in_place_type_t<_ValueType>,
  initializer_list<_Up> __il, _Args&&... __args)
   : _M_manager(&_Mgr::_S_manage)
@@ -207,16 +209,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 /// Copy the state of another object.
 any& operator=(const any& __rhs)
 {
-  if (!__rhs.has_value())
-   reset();
-  else if (this != &__rhs)
-   {
- if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
- _Arg __arg;
- __arg._M_any = this;
- __rhs._M_manager(_Op_clone, &__rhs, &__arg);
-   }
+  *this = any(__rhs);
   return *this;
 }
 
@@ -231,8 +224,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
reset();
   else if (this != &__rhs)
{
- if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
+ reset();
  _Arg __arg;
  __arg._M_any = this;
  __rhs._M_manager(_Op_xfer, &__rhs, &__arg);
@@ -243,7 +235,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 /// Store a copy of @p __rhs as the contained object.
 template>
-enable_if_t::value, any&>
+  enable_if_t::value, any&>
   operator=(_ValueType&& __rhs)
   {
*this = any(std::forward<_ValueType>(__rhs));
@@ -556,7 +548,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__ptr->~_Tp();
break;
   case _Op_xfer:
-   ::new(&__arg->_M_any->_M_storage._M_buffer) _Tp(*__ptr);
+   ::new(&__arg->_M_any->_M_storage._M_buffer) _Tp
+ (std::move(*const_cast<_Tp*>(__ptr)));
__ptr->~_Tp();
__arg->_M_any->_M_manager = __any->_M_manager;
const_cast(__any)->_M_manager = nullptr;
diff --git a/libstdc++-v3/testsuite/20_util/any/assign/2.cc 
b/libstdc++-v3/testsuite/20_util/any/assign/2.cc
index b333e5d..28f06a0 100644
--- a/libstdc++-v3/testsuite/20_util/any/assign/2.cc
+++ b/libstdc++-v3/testsuite/20_util/any/assign/2.cc
@@ -24,28 +24,69 @@
 using std::any;
 using std::any_cast;
 
+bool moved = false;
+bool copied = false;
+
 s

Re: [PATCH] 77864 Fix noexcept conditions for map/set default constructors

2016-10-10 Thread François Dumont

On 09/10/2016 17:14, Jonathan Wakely wrote:

On 08/10/16 22:55 +0200, François Dumont wrote:

On 06/10/2016 23:34, Jonathan Wakely wrote:

On 06/10/16 22:17 +0200, François Dumont wrote:
Another approach is to rely on existing compiler ability to compute 
conditional noexcept when defaulting implementations. This is what 
I have done in this patch.
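
A minimal sketch of the language rule this approach relies on: a default
constructor that is defaulted on its first declaration gets an exception
specification deduced from its members. TinyMap and the two comparator types
are hypothetical; this is not the stl_map.h code.

```cpp
#include <cassert>
#include <type_traits>

struct NoThrowCompare { NoThrowCompare() noexcept { } };
struct ThrowingCompare { ThrowingCompare() { } };   // may throw

// Hypothetical container skeleton; only the comparator member matters.
template<typename Compare>
struct TinyMap
{
  Compare comp;
  TinyMap() = default;   // noexcept iff Compare's default ctor is noexcept
};

static_assert(
  std::is_nothrow_default_constructible<TinyMap<NoThrowCompare>>::value,
  "deduced noexcept(true)");
static_assert(
  !std::is_nothrow_default_constructible<TinyMap<ThrowingCompare>>::value,
  "deduced noexcept(false)");

// Runtime-visible form of the same two checks.
bool defaulted_ctor_tracks_members()
{
  return std::is_nothrow_default_constructible<TinyMap<NoThrowCompare>>::value
    && !std::is_nothrow_default_constructible<TinyMap<ThrowingCompare>>::value;
}
```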


The new default constructor on _Rb_tree_node_base is not a problem 
as it is not used to build _Rb_tree_node.


Why not?


_Rb_tree_node_base is used in two contexts: as a member of _Rb_tree_impl,
in which case we need the new default constructor, and as the base
class of _Rb_tree_node, which is never constructed. Nodes are
allocated and then the associated value is constructed through the
allocator; the node default constructor itself is never invoked.


In C++03 mode that is true, but it's only valid because the type is
trivially-constructible. If the type requires "non-vacuous
initialization" then it's not valid to allocate memory for it and
start using it without invoking a constructor.


  Good to know.


If you add a
non-trivial constructor then we can't do that any more.

In C++11 and later, see line 550 in 

   ::new(__node) _Rb_tree_node<_Val>;

This default-constructs a tree node. Currently there is no
user-provided default constructor, so default-construction does no
initialization. Adding your constructor would mean it is used for
every node.


I missed this call, indeed. I should have deleted the default constructor
and run a compilation to be sure.




   If you think it is cleaner to create an intermediate type that 
will take care of this initialization through its default constructor 
I can do that.




I'll try to do the same for copy constructor/assignment and move 
constructor/assignment.


We need to make sure we don't change whether any of those operations
are trivial (which shouldn't be a problem for copy/move, because they
are definitely very non-trivial and will stay that way!)

Does this change the default constructors from non-trivial to trivial?
It would be a major compiler bug if defaulting a constructor
made it trivial.


I must be misunderstanding you, because this is not a bug:


No, my fault, I was misunderstanding you. Now that I know about validity 
of using a "non-constructed" type only if trivial, it is much clearer.
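
The distinction at issue can be sketched with type traits. This is an
illustration written for this point in the thread, not the example from the
quoted message, and all four struct names are hypothetical. It assumes
C++11 or later.

```cpp
#include <cassert>
#include <type_traits>

// No constructor declared: default construction is trivial.
struct Implicit { int* parent; };

// Defaulted on first declaration: still trivial.
struct Defaulted
{
  int* parent;
  Defaulted() = default;
};

// User-provided body: non-trivial, so it must actually be invoked.
struct UserProvided
{
  int* parent;
  UserProvided() : parent(nullptr) { }
};

// Defaulted outside the class: counts as user-provided, hence non-trivial.
struct OutOfLine
{
  int* parent;
  OutOfLine();
};
OutOfLine::OutOfLine() = default;

static_assert(std::is_trivially_default_constructible<Implicit>::value, "");
static_assert(std::is_trivially_default_constructible<Defaulted>::value, "");
static_assert(!std::is_trivially_default_constructible<UserProvided>::value, "");
static_assert(!std::is_trivially_default_constructible<OutOfLine>::value, "");

// Runtime-visible summary of the four cases.
bool triviality_matches_expectations()
{
  return std::is_trivially_default_constructible<Implicit>::value
    && std::is_trivially_default_constructible<Defaulted>::value
    && !std::is_trivially_default_constructible<UserProvided>::value
    && !std::is_trivially_default_constructible<OutOfLine>::value;
}
```

Only the first two types could be used in storage obtained from an allocator
without running a constructor, which is why adding a user-provided default
constructor to a node base class is not a neutral change.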


So here is the fixed patch with your proposed intermediate type 
containing the necessary default constructor.


Being tested; OK to commit if successful?

François

diff --git a/libstdc++-v3/include/bits/stl_map.h b/libstdc++-v3/include/bits/stl_map.h
index e5b2a1b..dea7d5b 100644
--- a/libstdc++-v3/include/bits/stl_map.h
+++ b/libstdc++-v3/include/bits/stl_map.h
@@ -167,11 +167,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /**
*  @brief  Default constructor creates no elements.
*/
-  map()
-  _GLIBCXX_NOEXCEPT_IF(
-	  is_nothrow_default_constructible::value
-	  && is_nothrow_default_constructible::value)
-  : _M_t() { }
+#if __cplusplus < 201103L
+  map() : _M_t() { }
+#else
+  map() = default;
+#endif
 
   /**
*  @brief  Creates a %map with no elements.
diff --git a/libstdc++-v3/include/bits/stl_multimap.h b/libstdc++-v3/include/bits/stl_multimap.h
index d240427..7e86b76 100644
--- a/libstdc++-v3/include/bits/stl_multimap.h
+++ b/libstdc++-v3/include/bits/stl_multimap.h
@@ -164,11 +164,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /**
*  @brief  Default constructor creates no elements.
*/
-  multimap()
-  _GLIBCXX_NOEXCEPT_IF(
-	  is_nothrow_default_constructible::value
-	  && is_nothrow_default_constructible::value)
-  : _M_t() { }
+#if __cplusplus < 201103L
+  multimap() : _M_t() { }
+#else
+  multimap() = default;
+#endif
 
   /**
*  @brief  Creates a %multimap with no elements.
diff --git a/libstdc++-v3/include/bits/stl_multiset.h b/libstdc++-v3/include/bits/stl_multiset.h
index cc068a9..7fe2fbd 100644
--- a/libstdc++-v3/include/bits/stl_multiset.h
+++ b/libstdc++-v3/include/bits/stl_multiset.h
@@ -144,11 +144,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /**
*  @brief  Default constructor creates no elements.
*/
-  multiset()
-  _GLIBCXX_NOEXCEPT_IF(
-	  is_nothrow_default_constructible::value
-	  && is_nothrow_default_constructible::value)
-  : _M_t() { }
+#if __cplusplus < 201103L
+  multiset() : _M_t() { }
+#else
+  multiset() = default;
+#endif
 
   /**
*  @brief  Creates a %multiset with no elements.
diff --git a/libstdc++-v3/include/bits/stl_set.h b/libstdc++-v3/include/bits/stl_set.h
index 3938351..5ed9672 100644
--- a/libstdc++-v3/include/bits/stl_set.h
+++ b/libstdc++-v3/include/bits/stl_set.h
@@ -147,11 +147,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /**
*  @brief  Default constructor creates no elements.
*/
-  set()
-  _GLIBCXX_NOEXCEPT_IF(
-	  is_nothrow_default_constructible::value
-	  && is_not

Re: PING! Re: [PATCH, Fortran] Extension: COTAN and degree-valued trig intrinsics with -fdec-math

2016-10-10 Thread Jerry DeLisle
On 10/10/2016 08:06 AM, Fritz Reese wrote:
> https://gcc.gnu.org/ml/fortran/2016-09/msg00163.html [original]
> https://gcc.gnu.org/ml/fortran/2016-09/msg00183.html [latest]
> 
> On Wed, Sep 28, 2016 at 4:14 PM, Fritz Reese  wrote:
>> Attached is a patch extending the GNU Fortran front-end to support
>> some additional math intrinsics, enabled with a new compile flag
>> -fdec-math. The flag adds the COTAN intrinsic (cotangent), as well as
>> degree versions of all trigonometric intrinsics (SIND, TAND, ACOSD,
>> etc...). This extension allows for further compatibility with legacy
>> code that depends on the compiler to support such intrinsic functions.
> 
> Patch is still pending. Current draft of the patch is re-attached for
> convenience, since it was amended twice since the original post. OK
> for trunk?
> 

OK, thanks for the work.

Jerry


Move OVERRIDE/FINAL from gcc/coretypes.h to include/ansidecl.h (was: Re: [PATCH 1/2] Add OVERRIDE and FINAL macros to coretypes.h)

2016-10-10 Thread Pedro Alves
Please find below a patch moving the FINAL/OVERRIDE macros to
include/ansidecl.h, as I was suggesting in the earlier discussion:

On 05/06/2016 07:33 PM, Trevor Saunders wrote:
> On Fri, May 06, 2016 at 07:10:33PM +0100, Pedro Alves wrote:
>> On 05/06/2016 06:56 PM, Pedro Alves wrote:

>> I was going to suggest to put this in include/ansidecl.h,
>> so that all C++ libraries / programs in binutils-gdb use the same
>> thing, instead of each reinventing the wheel, and I found
>> there's already something there:
>>
>> /* This is used to mark a class or virtual function as final.  */
>> #if __cplusplus >= 201103L
>> #define GCC_FINAL final
>> #elif GCC_VERSION >= 4007
>> #define GCC_FINAL __final
>> #else
>> #define GCC_FINAL
>> #endif
>>
>> From:
>>
>>  https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00455.html
>>
>> Apparently the patch that actually uses that was reverted,
>> as I can't find any use.
> 
> Yeah, I wanted to use it to work around gdb not dealing well with stuff
> in the anon namespace, but somehow that broke aix, and some people
> objected and I haven't gotten back to it.
> 
>> I like your names without the GCC_ prefix better though,
>> for the same reason of standardizing binutils-gdb + gcc
>> on the same symbols.
> 
> I agree, though I'm not really sure when gdb / binutils stuff will
> support building as C++11.

Meanwhile, GDB master is C++-only nowadays, and we support building
with a C++11 compiler, provided there are C++03 fallbacks in place.
I'd like to start using FINAL/OVERRIDE, and seems better to me to
standardize on the same symbol names across the trees.
This patch removes the existing GCC_FINAL macro, since nothing is
using it.
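
For anyone who has not used the macros yet, here is a minimal sketch of how they read in client code. The macro definitions are copied from the patch; the `shape` and `triangle` classes and `sides_via_base` are hypothetical names made up for illustration only.

```cpp
#include <cassert>

/* Same definitions as in the patch: real keywords with C++11,
   empty otherwise.  */
#if __cplusplus >= 201103
#define OVERRIDE override
#define FINAL final
#else
#define OVERRIDE
#define FINAL
#endif

/* Hypothetical classes, for illustration only.  */
struct shape
{
  virtual ~shape () {}
  virtual int sides () const { return 0; }
};

struct triangle : shape
{
  /* With a C++11 compiler a signature mismatch here is diagnosed;
     with C++03 the macros expand to nothing and this still builds.  */
  int sides () const OVERRIDE FINAL { return 3; }
};

/* Call through the base class to show that the override is picked up.  */
int sides_via_base (const shape &s) { return s.sides (); }
```

The point of the empty fallback is that the same source builds with a C++03 compiler, just without the extra checking.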

OK to apply?

From: Pedro Alves 
Date: 2016-10-10 19:25:47 +0100

Move OVERRIDE/FINAL from gcc/coretypes.h to include/ansidecl.h

So that GDB and other projects that share the top level can use them.

Bootstrapped with all default languages on x86-64 Fedora 23.

gcc/ChangeLog:
-mm-dd  Pedro Alves  

* coretypes.h (OVERRIDE, FINAL): Delete, moved to
include/ansidecl.h.

include/ChangeLog:
-mm-dd  Pedro Alves  

* ansidecl.h (GCC_FINAL): Delete.
(OVERRIDE, FINAL): New, moved from gcc/coretypes.h.
---

 gcc/coretypes.h |   25 -
 1 file changed, 25 deletions(-)

diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index fe1e984..a9c4df9 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -367,31 +367,6 @@ typedef void (*gt_pointer_operator) (void *, void *);
 typedef unsigned char uchar;
 #endif
 
-/* C++11 adds the ability to add "override" after an implementation of a
-   virtual function in a subclass, to:
- (A) document that this is an override of a virtual function
- (B) allow the compiler to issue a warning if it isn't (e.g. a mismatch
- of the type signature).
-
-   Similarly, it allows us to add a "final" to indicate that no subclass
-   may subsequently override the vfunc.
-
-   Provide OVERRIDE and FINAL as macros, allowing us to get these benefits
-   when compiling with C++11 support, but without requiring C++11.
-
-   For gcc, use "-std=c++11" to enable C++11 support; gcc 6 onwards enables
-   this by default (actually GNU++14).  */
-
-#if __cplusplus >= 201103
-/* C++11 claims to be available: use it: */
-#define OVERRIDE override
-#define FINAL final
-#else
-/* No C++11 support; leave the macros empty: */
-#define OVERRIDE
-#define FINAL
-#endif
-
 /* Most host source files will require the following headers.  */
 #if !defined (GENERATOR_FILE) && !defined (USED_FOR_TARGET)
 #include "machmode.h"
diff --git a/include/ansidecl.h b/include/ansidecl.h
index 6e4bfc2..ee93421 100644
--- a/include/ansidecl.h
+++ b/include/ansidecl.h
@@ -313,13 +313,29 @@ So instead we use the macro below and test it against 
specific values.  */
 #define ENUM_BITFIELD(TYPE) unsigned int
 #endif
 
-/* This is used to mark a class or virtual function as final.  */
-#if __cplusplus >= 201103L
-#define GCC_FINAL final
-#elif GCC_VERSION >= 4007
-#define GCC_FINAL __final
+/* C++11 adds the ability to add "override" after an implementation of a
+   virtual function in a subclass, to:
+ (A) document that this is an override of a virtual function
+ (B) allow the compiler to issue a warning if it isn't (e.g. a mismatch
+ of the type signature).
+
+   Similarly, it allows us to add a "final" to indicate that no subclass
+   may subsequently override the vfunc.
+
+   Provide OVERRIDE and FINAL as macros, allowing us to get these benefits
+   when compiling with C++11 support, but without requiring C++11.
+
+   For gcc, use "-std=c++11" to enable C++11 support; gcc 6 onwards enables
+   this by default (actually GNU++14).  */
+
+#if __cplusplus >= 201103
+/* C++11 claims to be available: use it: */
+#define OVERRIDE override
+#define FINAL final
 #else
-#define GCC_FINAL
+/* No C++11 support; leave the macros empty: */
+#define OVERRIDE
+#define FINAL
 #endif
 
 #ifdef __cplusplus


New Swedish PO file for 'gcc' (version 6.2.0)

2016-10-10 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-6.2.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[libgo] Silence compiler error message

2016-10-10 Thread Eric Botcazou
Hi,

on Solaris the configuration of the library yields an ugly:

checking whether linker supports split/non-split linked together... cc1: 
error: '-fsplit-stack' is not supported by this compiler configuration
xgcc: error: conftest1.o: No such file or directory
no

Tested on x86-64/Linux and SPARC/Solaris, OK for the mainline?


2016-10-10  Eric Botcazou  

* configure.ac (libgo_cv_c_linker_split_non_split): Redirect compiler
output to /dev/null.
* configure: Regenerate.

-- 
Eric Botcazou

Index: configure.ac
===
--- configure.ac	(revision 240888)
+++ configure.ac	(working copy)
@@ -447,9 +447,9 @@ EOF
 cat > conftest2.c << EOF
 void f() {}
 EOF
-$CC -c -fsplit-stack $CFLAGS $CPPFLAGS conftest1.c
-$CC -c $CFLAGS $CPPFLAGS conftest2.c
-if $CC -o conftest conftest1.$ac_objext conftest2.$ac_objext; then
+$CC -c -fsplit-stack $CFLAGS $CPPFLAGS conftest1.c >/dev/null 2>&1
+$CC -c $CFLAGS $CPPFLAGS conftest2.c > /dev/null 2>&1
+if $CC -o conftest conftest1.$ac_objext conftest2.$ac_objext > /dev/null 2>&1; then
   libgo_cv_c_linker_split_non_split=yes
 else
   libgo_cv_c_linker_split_non_split=no


Re: PING! Re: [PATCH, Fortran] Extension: COTAN and degree-valued trig intrinsics with -fdec-math

2016-10-10 Thread Steve Kargl
On Mon, Oct 10, 2016 at 12:29:32PM -0700, Jerry DeLisle wrote:
> On 10/10/2016 08:06 AM, Fritz Reese wrote:
> > https://gcc.gnu.org/ml/fortran/2016-09/msg00163.html [original]
> > https://gcc.gnu.org/ml/fortran/2016-09/msg00183.html [latest]
> > 
> > On Wed, Sep 28, 2016 at 4:14 PM, Fritz Reese  wrote:
> >> Attached is a patch extending the GNU Fortran front-end to support
> >> some additional math intrinsics, enabled with a new compile flag
> >> -fdec-math. The flag adds the COTAN intrinsic (cotangent), as well as
> >> degree versions of all trigonometric intrinsics (SIND, TAND, ACOSD,
> >> etc...). This extension allows for further compatibility with legacy
> >> code that depends on the compiler to support such intrinsic functions.
> > 
> > Patch is still pending. Current draft of the patch is re-attached for
> > convenience, since it was amended twice since the original post. OK
> > for trunk?
> > 
> 
> OK, thanks for the work.
> 

Sorry about following behind. I did intend to review the patch, but
time got away from me.  There are a few small clean-ups that can be
done.  For example,

+static gfc_expr *
+get_radians (gfc_expr *deg)
+{
+  mpfr_t tmp;
...

+  /* Set factor = pi / 180.  */
+  factor = gfc_get_constant_expr (deg->ts.type, deg->ts.kind, &deg->where);
+  mpfr_const_pi (factor->value.real, GFC_RND_MODE);
+  mpfr_init (tmp);
+  mpfr_set_d (tmp, 180.0, GFC_RND_MODE);
+  mpfr_div (factor->value.real, factor->value.real, tmp, GFC_RND_MODE);
+  mpfr_clear (tmp);

the tmp variable is unneeded in the above.  Converting the double
precision 180.0 to mpfr_t and then dividing is probably slower
than just dividing by 180.

+  /* Set factor = pi / 180.  */
+  factor = gfc_get_constant_expr (deg->ts.type, deg->ts.kind, &deg->where);
+  mpfr_const_pi (factor->value.real, GFC_RND_MODE);
+  mpfr_div_ui (factor->value.real, factor->value.real, 180, GFC_RND_MODE);

Of course, the clean-up can be done post-commit by Fritz.
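
For readers following along without the patch at hand: the factor in question is just pi / 180, used to turn a degree argument into radians before calling the ordinary trig function. Below is a rough sketch in plain double arithmetic. It is not MPFR and not the gfortran code; the names `deg_to_rad_factor`, `sind_sketch` and `cotan_sketch` are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

/* factor = pi / 180, dividing by the integer directly.  */
double deg_to_rad_factor ()
{
  return std::acos (-1.0) / 180;
}

/* Degree-valued sine: convert to radians, then call sin.  */
double sind_sketch (double deg)
{
  return std::sin (deg * deg_to_rad_factor ());
}

/* Cotangent of a radian argument, as 1 / tan.  */
double cotan_sketch (double rad)
{
  return 1.0 / std::tan (rad);
}
```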

-- 
Steve


[PATCH], PR 77924, Fix PowerPC breakage on AIX

2016-10-10 Thread Michael Meissner
I accidentally broke AIX with my patch on October 6th.  That patch split
-mfloat128 into -mfloat128-type and -mfloat128 under PowerPC Linux.  This patch
fixes that issue.  I bootstrapped it on PowerPC Linux with no regressions, and
David Edelsohn reports that it fixes the problem on AIX.  Is it ok to apply the
patch?
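
To make the change in the condition easier to see, here is a small sketch that models the old and new tests as plain boolean functions. It is only a model: `ieeequad` stands in for TARGET_IEEEQUAD and `long_double_128` for TARGET_LONG_DOUBLE_128, and it assumes that FLOAT128_IEEE_P (TFmode) is true exactly when TARGET_IEEEQUAD is set. The function names are made up.

```cpp
#include <cassert>

/* Old test: TARGET_IEEEQUAD || !TARGET_LONG_DOUBLE_128.  */
bool old_creates_ibm128 (bool ieeequad, bool long_double_128)
{
  return ieeequad || !long_double_128;
}

/* New test: TARGET_LONG_DOUBLE_128 && FLOAT128_IEEE_P (TFmode),
   with FLOAT128_IEEE_P (TFmode) modelled as ieeequad (assumption).  */
bool new_creates_ibm128 (bool ieeequad, bool long_double_128)
{
  return long_double_128 && ieeequad;
}
```

Under that assumption the two tests only disagree when long double is not 128 bits: the old one created the distinct type there, the new one does not.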

2016-10-10  Michael Meissner  

PR target/77924
* config/rs6000/rs6000.c (rs6000_init_builtins): Only create the
distinct __ibm128 IBM extended double type if long doubles are
128-bits and the default format for long double is IEEE 128-bit.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 240941)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -16572,10 +16572,10 @@ rs6000_init_builtins (void)
  floating point, we need make sure the type is non-zero or else self-test
  fails during bootstrap.
 
- We don't register a built-in type for __ibm128 or __float128 if the type
- is the same as long double.  Instead we add a #define for __ibm128 or
- __float128 in rs6000_cpu_cpp_builtins to long double.  */
-  if (TARGET_IEEEQUAD || !TARGET_LONG_DOUBLE_128)
+ We don't register a built-in type for __ibm128 if the type is the same as
+ long double.  Instead we add a #define for __ibm128 in
+ rs6000_cpu_cpp_builtins to long double.  */
+  if (TARGET_LONG_DOUBLE_128 && FLOAT128_IEEE_P (TFmode))
 {
   ibm128_float_type_node = make_node (REAL_TYPE);
   TYPE_PRECISION (ibm128_float_type_node) = 128;

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Fwd: C++ PATCH for c++/77890, 77912 (C++17 class deduction issues)

2016-10-10 Thread Jason Merrill
77890: we were losing the CLASS_PLACEHOLDER_TEMPLATE when reducing the
level of a TEMPLATE_TYPE_PARM.

77912: after 77890 was fixed, we were complaining about an undefined
deduction guide; set cp_unevaluated_operand to prevent that.
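
For context, a small sketch of the kind of code involved: C++17 class template argument deduction used inside a function template and a generic lambda, where the initializer is dependent. It is modelled on the testcases below but is not one of them; the `value` member and `deduced_value` are added here for illustration, and it needs a C++17 compiler.

```cpp
#include <cassert>
#include <type_traits>

template<class T> struct S
{
  T value;
  S (T v) : value (v) {}
};

/* S (t) names the template without arguments; the type is deduced
   from the constructor once T is known.  */
template<class T> auto f (T t) { return S (t); }

static_assert (std::is_same<decltype (f (42)), S<int> >::value,
	       "f (42) should deduce S<int>");

/* Same deduction inside a generic lambda.  */
int deduced_value ()
{
  auto s = [] (auto a) { return S (a); } (42);
  return s.value;
}
```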

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 927a7c6142c7e08bdc37e94e776201f801de0df8
Author: Jason Merrill 
Date:   Mon Oct 10 13:52:50 2016 -0400

C++17 class deduction issues

PR c++/77890
PR c++/77912
* pt.c (do_class_deduction): Set cp_unevaluated_operand.
(tsubst) [TEMPLATE_TYPE_PARM]: Copy CLASS_PLACEHOLDER_TEMPLATE.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f6cd3ea..28b1c98 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13233,11 +13233,15 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
TYPE_POINTER_TO (r) = NULL_TREE;
TYPE_REFERENCE_TO (r) = NULL_TREE;
 
-   /* Propagate constraints on placeholders.  */
 if (TREE_CODE (t) == TEMPLATE_TYPE_PARM)
-  if (tree constr = PLACEHOLDER_TYPE_CONSTRAINTS (t))
-   PLACEHOLDER_TYPE_CONSTRAINTS (r)
- = tsubst_constraint (constr, args, complain, in_decl);
+ {
+   /* Propagate constraints on placeholders.  */
+   if (tree constr = PLACEHOLDER_TYPE_CONSTRAINTS (t))
+ PLACEHOLDER_TYPE_CONSTRAINTS (r)
+   = tsubst_constraint (constr, args, complain, in_decl);
+   else if (tree pl = CLASS_PLACEHOLDER_TEMPLATE (t))
+ CLASS_PLACEHOLDER_TEMPLATE (r) = pl;
+ }
 
if (TREE_CODE (r) == TEMPLATE_TEMPLATE_PARM)
  /* We have reduced the level of the template
@@ -24431,9 +24435,10 @@ do_class_deduction (tree tmpl, tree init, 
tsubst_flags_t complain)
   return error_mark_node;
 }
 
+  ++cp_unevaluated_operand;
   tree t = build_new_function_call (cands, &args, /*koenig*/false,
complain|tf_decltype);
-
+  --cp_unevaluated_operand;
   release_tree_vector (args);
 
   return TREE_TYPE (t);
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction19.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction19.C
new file mode 100644
index 000..38327d1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction19.C
@@ -0,0 +1,20 @@
+// PR c++/77912
+// { dg-options -std=c++1z }
+
+template<class T> struct S{S(T){}}; 
+
+//error: invalid use of template type parameter 'S'
+template<class T> auto f(T t){return S(t);}
+
+int main()
+{
+  //fails
+  f(42);
+
+  //fails
+  //error: invalid use of template type parameter 'S'
+  [](auto a){return S(a);}(42); 
+
+  //works
+  [](int a){return S(a);}(42);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction20.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction20.C
new file mode 100644
index 000..58e8f7d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction20.C
@@ -0,0 +1,21 @@
+// PR c++/77890
+// { dg-options -std=c++1z }
+
+template<class F> struct S{S(F&&f){}}; 
+void f()
+{
+  S([]{});
+}
+
+template <typename TF>
+struct scope_guard : TF
+{
+scope_guard(TF f) : TF{f} { }
+~scope_guard() { (*this)(); }
+};
+
+void g() 
+{
+struct K { void operator()() {} };
+scope_guard _{K{}};
+}


Re: [PATCH] 77864 Fix noexcept conditions for map/set default constructors

2016-10-10 Thread Tim Song
Trying again...with a few edits.

> On Mon, Oct 10, 2016 at 3:24 PM, François Dumont 
> wrote:
>
> @@ -602,24 +612,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  struct _Rb_tree_impl : public _Node_allocator
>  {
>_Key_compare _M_key_compare;
> -  _Rb_tree_node_base _M_header;
> +  _Rb_header_node _M_header;
> +#if __cplusplus < 201103L
>size_type _M_node_count; // Keeps track of size of tree.
> +#else
> +  size_type _M_node_count = 0; // Keeps track of size of tree.
> +#endif
>
> +#if __cplusplus < 201103L
>_Rb_tree_impl()
> -  : _Node_allocator(), _M_key_compare(), _M_header(),
> -_M_node_count(0)
> -  { _M_initialize(); }
> +  : _M_node_count(0)
> +  { }
> +#else
> +  _Rb_tree_impl() = default;
> +#endif


The default constructor of the associative containers is required to
value-initialize the comparator (see their synopses in
[map/set/multimap/multiset.overview]).

 _Rb_tree_impl() = default; doesn't do that; it default-initializes the
 comparator instead.
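
A minimal sketch of the difference, using made-up types rather than the real _Rb_tree_impl: `less_by_mode` is a hypothetical stateful comparator, `impl_defaulted` mirrors the `= default` constructor, and `impl_value_init` mirrors a constructor that names the comparator in its mem-initializer list.

```cpp
#include <cassert>
#include <type_traits>

/* Hypothetical stateful comparator with no constructor of its own.  */
struct less_by_mode
{
  int mode;
  bool operator() (int a, int b) const
  { return mode == 0 ? a < b : b < a; }
};

/* Defaulted constructor: cmp is default-initialized, so cmp.mode is
   left indeterminate when the object itself is default-initialized.  */
struct impl_defaulted
{
  less_by_mode cmp;
  impl_defaulted () = default;
};

/* cmp () in the mem-initializer list value-initializes the comparator,
   which zero-initializes cmp.mode.  */
struct impl_value_init
{
  less_by_mode cmp;
  impl_value_init () : cmp () {}
};

/* The defaulted constructor is trivial, i.e. it performs no
   initialization at all; the hand-written one is not.  */
static_assert (std::is_trivially_default_constructible<impl_defaulted>::value,
	       "defaulted constructor does nothing");
static_assert (!std::is_trivially_default_constructible<impl_value_init>::value,
	       "explicit constructor initializes cmp");

int mode_after_value_init ()
{
  impl_value_init x;
  return x.cmp.mode;
}
```

The sketch deliberately never reads `cmp.mode` from a default-initialized `impl_defaulted`, since that value is indeterminate.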

Tim


[tree-optimization/71947] Avoid unwanted propagations

2016-10-10 Thread Jeff Law



So if we have an equality conditional between A & B, we record into our 
const/copy tables A = B and B = A.


This helps us discover some of the more obscure equivalences. But it 
also creates problems with an expression like


A ^ B

Where we might cprop the first operand generating

B ^ B

Then the second generating

B ^ A

And we've lost the folding opportunity.  At first I'd tried folding 
after each propagation step, but that turns into a bit of a nightmare 
because of changes in the underlying structure of the gimple statement 
and cycles that may develop if we re-build the operand cache after folding.


This approach is simpler and should catch all these cases for binary 
operators.  We just track the last copy propagated argument and refuse 
to ping-pong propagations.
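
Here is a toy model of the ping-pong and of the guard. It is not the code in cprop_into_stmt: operands are plain strings, the const/copy table is a std::map, and all of the names (`copy_table`, `record_equality`, `cprop_naive`, `cprop_guarded`) are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, std::string> copy_table;
typedef std::pair<std::string, std::string> operands;

/* An equality A == B records both A = B and B = A.  */
copy_table record_equality (const std::string &a, const std::string &b)
{
  copy_table eq;
  eq[a] = b;
  eq[b] = a;
  return eq;
}

/* Naive propagation: replace each operand by its recorded copy.  */
operands cprop_naive (const copy_table &eq, operands ops)
{
  copy_table::const_iterator it = eq.find (ops.first);
  if (it != eq.end ())
    ops.first = it->second;
  it = eq.find (ops.second);
  if (it != eq.end ())
    ops.second = it->second;
  return ops;
}

/* Guarded propagation: remember the name that was just replaced and
   refuse to propagate that same name back into the statement.  */
operands cprop_guarded (const copy_table &eq, operands ops)
{
  std::string last_replaced;
  copy_table::const_iterator it = eq.find (ops.first);
  if (it != eq.end ())
    {
      last_replaced = ops.first;
      ops.first = it->second;
    }
  it = eq.find (ops.second);
  if (it != eq.end () && it->second != last_replaced)
    ops.second = it->second;
  return ops;
}
```

With A == B recorded, the naive version turns (A, B) into (B, A), while the guarded one stops at (B, B), which is the form that folds.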


It fixes the tests from 71947 and 77647 without regressing (obviously). 
I've included an xfailed test for a more complex situation that we don't 
currently handle (would require backtracking from the equality 
comparison through the logicals that feed the equality comparison).


Bootstrapped and regression tested on x86_64.  Applied to the trunk.

commit 6223e6e425b6de916f0330b9dbe5698765d4a73c
Author: law 
Date:   Mon Oct 10 20:40:59 2016 +

PR tree-optimization/71947
* tree-ssa-dom.c (cprop_into_stmt): Avoid replacing A with B, then
B with A within a single statement.

PR tree-optimization/71947
* gcc.dg/tree-ssa/pr71947-1.c: New test.
* gcc.dg/tree-ssa/pr71947-2.c: New test.
* gcc.dg/tree-ssa/pr71947-3.c: New test.
* gcc.dg/tree-ssa/pr71947-4.c: New test.
* gcc.dg/tree-ssa/pr71947-5.c: New test.
* gcc.dg/tree-ssa/pr71947-6.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@240947 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1738bc7..16e25bf 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2016-10-10  Jeff Law  
+
+PR tree-optimization/71947
+   * tree-ssa-dom.c (cprop_into_stmt): Avoid replacing A with B, then
+   B with A within a single statement.
+
 2016-10-10  Bill Schmidt  
 
PR tree-optimization/77824
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 04966cf..e31bcc6 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,13 @@
+2016-10-10  Jeff Law  
+
+   PR tree-optimization/71947
+   * gcc.dg/tree-ssa/pr71947-1.c: New test.
+   * gcc.dg/tree-ssa/pr71947-2.c: New test.
+   * gcc.dg/tree-ssa/pr71947-3.c: New test.
+   * gcc.dg/tree-ssa/pr71947-4.c: New test.
+   * gcc.dg/tree-ssa/pr71947-5.c: New test.
+   * gcc.dg/tree-ssa/pr71947-6.c: New test.
+
 2016-10-10  Thomas Koenig  
 
PR fortran/77915
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
new file mode 100644
index 000..b033495
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+
+int f(int x, int y)
+{
+   int ret;
+
+   if (x == y)
+ ret = x ^ y;
+   else
+ ret = 1;
+
+   return ret;
+}
+
+/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
+
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
new file mode 100644
index 000..de8f88b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+
+int f(int x, int y)
+{
+  int ret;
+  if (x == y)
+ret = x - y;
+  else
+ret = 1;
+
+  return ret;
+}
+
+/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
new file mode 100644
index 000..e79847f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+int f(int x, int y)
+{
+  int ret = 10;
+  if (x == y)
+ret = x  -  y;
+  return ret;
+}
+
+/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-4.c
new file mode 100644
index 000..a881f0d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-4.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+
+
+static inline long load(long *p)
+{
+long ret;
+asm ("movq  %1,%0\n\t" : "=r" (ret) : "m" (*p));
+if (ret != *p)
+__builtin_unreachable();
+return ret;
+}
+
+long foo(long *mem)
+{
+long ret;
+ret = load(mem);
+return ret + *mem;
+}
+
+/* { dg-final { scan-tree-dump "Folded to: _\[

[patch] aarch64-*-freebsd* support for gcc.

2016-10-10 Thread Andreas Tobler

Hi all,

the attached patch brings support for the aarch64-*-freebsd* target.

Bootstrapped and tested, results on the list. Not that many results due 
to board instabilities; I lack a Cavium ;)


Ok for main? And if yes, how far can I backport? Down to 5.4?

TIA,
Andreas

libgcc:

2016-10-10  Andreas Tobler  

* config.host: Add support for aarch64-*-freebsd*.

gcc:

2016-10-10  Andreas Tobler  

* config.gcc: Add aarch64-*-freebsd* support.
* config.host: Likewise.
* config/aarch64/aarch64-freebsd.h: New file.
* config/aarch64/t-aarch64-freebsd: Ditto.

toplevel:

2016-10-10  Andreas Tobler 

* configure.ac: Add aarch64-*-freebsd*.
* configure: Regenerate.

Index: configure.ac
===
--- configure.ac(revision 240948)
+++ configure.ac(working copy)
@@ -727,6 +727,9 @@
   *-*-vxworks*)
 noconfigdirs="$noconfigdirs target-libffi"
 ;;
+  aarch64*-*-freebsd*)
+noconfigdirs="$noconfigdirs target-libffi"
+;;
   alpha*-*-*vms*)
 noconfigdirs="$noconfigdirs target-libffi"
 ;;
Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 240948)
+++ gcc/config.gcc  (working copy)
@@ -937,6 +937,11 @@
done
TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
;;
+aarch64*-*-freebsd*)
+   tm_file="${tm_file} dbxelf.h elfos.h ${fbsd_tm_file}"
+   tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-freebsd.h"
+   tmake_file="${tmake_file} aarch64/t-aarch64 aarch64/t-aarch64-freebsd"
+   ;;
 aarch64*-*-linux*)
tm_file="${tm_file} dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h"
tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-linux.h"
Index: gcc/config.host
===
--- gcc/config.host (revision 240948)
+++ gcc/config.host (working copy)
@@ -99,7 +99,7 @@
 esac
 
 case ${host} in
-  aarch64*-*-linux*)
+  aarch64*-*-freebsd* | aarch64*-*-linux*)
 case ${target} in
   aarch64*-*-*)
host_extra_gcc_objs="driver-aarch64.o"
Index: gcc/config/aarch64/aarch64-freebsd.h
===
--- gcc/config/aarch64/aarch64-freebsd.h(nonexistent)
+++ gcc/config/aarch64/aarch64-freebsd.h(working copy)
@@ -0,0 +1,94 @@
+/* Definitions for AArch64 running FreeBSD
+   Copyright (C) 2016 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_AARCH64_FREEBSD_H
+#define GCC_AARCH64_FREEBSD_H
+
+#undef  SUBTARGET_CPP_SPEC
+#define SUBTARGET_CPP_SPEC FBSD_CPP_SPEC
+
+#if TARGET_BIG_ENDIAN_DEFAULT
+#define TARGET_LINKER_EMULATION  "aarch64fbsdb"
+#else
+#define TARGET_LINKER_EMULATION  "aarch64fbsd"
+#endif
+
+#undef  SUBTARGET_EXTRA_LINK_SPEC
+#define SUBTARGET_EXTRA_LINK_SPEC " -m" TARGET_LINKER_EMULATION
+
+#undef  FBSD_TARGET_LINK_SPEC
+#define FBSD_TARGET_LINK_SPEC " \
+%{p:%nconsider using `-pg' instead of `-p' with gprof (1) } \
+%{v:-V} \
+%{assert*} %{R*} %{rpath*} %{defsym*}   \
+%{shared:-Bshareable %{h*} %{soname*}}  \
+%{symbolic:-Bsymbolic}  \
+%{static:-Bstatic}  \
+%{!static:  \
+  %{rdynamic:-export-dynamic}   \
+  %{!shared:-dynamic-linker " FBSD_DYNAMIC_LINKER " }}  \
+-X" SUBTARGET_EXTRA_LINK_SPEC " \
+%{mbig-endian:-EB} %{mlittle-endian:-EL}"
+
+#if TARGET_FIX_ERR_A53_835769_DEFAULT
+#define CA53_ERR_835769_SPEC \
+  " %{!mno-fix-cortex-a53-835769:--fix-cortex-a53-835769}"
+#else
+#define CA53_ERR_835769_SPEC \
+  " %{mfix-cortex-a53-835769:--fix-cortex-a53-835769}"
+#endif
+
+#ifdef TARGET_FIX_ERR_A53_843419_DEFAULT
+#define CA53_ERR_843419_SPEC \
+  " %{!mno-fix-cortex-a53-843419:--fix-cortex-a53-843419}"
+#else
+#define CA53_ERR_843419_SPEC \
+  " %{mfix-cortex-a53-843419:--fix-cortex-a53-843419}"
+#endif
+
+#undef  LINK_SPEC
+#define LINK_SPEC FBSD_TARGET_LINK_SPEC

Re: [patch] aarch64-*-freebsd* support for gcc.

2016-10-10 Thread Jeff Law

On 10/10/2016 03:07 PM, Andreas Tobler wrote:

Hi all,

the attached patch brings support for the aarch64-*-freebsd* target.

Bootstrapped and tested, results on the list. Not that many results due
to board instabilities; I lack a Cavium ;)

Ok for main? And if yes, how far can I backport? Down to 5.4?

TIA,
Andreas

libgcc:

2016-10-10  Andreas Tobler  

* config.host: Add support for aarch64-*-freebsd*.

gcc:

2016-10-10  Andreas Tobler  

* config.gcc: Add aarch64-*-freebsd* support.
* config.host: Likewise.
* config/aarch64/aarch64-freebsd.h: New file.
* config/aarch64/t-aarch64-freebsd: Ditto.

toplevel:

2016-10-10  Andreas Tobler 

* configure.ac: Add aarch64-*-freebsd*.
* configure: Regenerate.

Certainly OK for the trunk.  Jakub, Richi & Joseph make the rules for 
the release branches.


jeff


Re: [PATCH 4/5] shrink-wrap: Shrink-wrapping for separate components

2016-10-10 Thread Jeff Law

On 09/30/2016 04:34 AM, Segher Boessenkool wrote:

[ whoops, message too big, resending with the attachment compressed ]

On Tue, Sep 27, 2016 at 03:14:51PM -0600, Jeff Law wrote:

With transposition issue addressed, the only blocker I see are some
simple testcases we can add to the suite.  They don't have to be real
extensive.  And one motivating example for the list archives, ideally
the glibc malloc case.


And here is the malloc testcase.

A very important (for performance) function is _int_malloc, which starts
with

[ ... ]

Thanks.  What I think is important to note with this example is the bits 
that were pushed into the path with the sysmalloc/alloc_perturb calls. 
That's an unlikely path.


We have to extrapolate a bit from the assembly provided.  In the not 
separately shrink-wrapped version, we have a full prologue of stores and 
two instances of a full epilogue (though only one ever executes) provided.


With separate shrink wrapping the (presumably) very cold path where we 
error has virtually no prologue/epilogue.  That's probably a nop from a 
performance standpoint.


More interesting is the path where we call sysmalloc/alloc_perturb.  It's 
a cold path, but not as cold as the error path.  We save/restore 4 regs 
in that case, rather than a full prologue/epilogue.  So there's clearly 
a savings there, though again, via the expect it's a cold path.


Where we have to extrapolate is the hot path.  Presumably on the hot 
path we're saving/restoring ~4 fewer registers.   I haven't verified 
that, but that is kind of the whole point here.




Thanks,
Jeff


Re: PATCH to introduce c-family/c-warn.c

2016-10-10 Thread Jeff Law

On 10/10/2016 10:36 AM, Marek Polacek wrote:

As outlined recently, this patch creates a new c-warn.c file, where various
diagnostic routines should reside, making c-common.c a little bit shorter.
There are no function changes though.  While at it, I fixed all tabs/space
problems in those functions that I've moved.  Some functions are contentious
and could arguably be in either file.

Next step is probably to create c-attribs.c.

Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?

2016-10-10  Marek Polacek  

* Makefile.in (C_COMMON_OBJS): Add c-family/c-warn.o.

* c-common.c (fold_for_warn): No longer static.
(bool_promoted_to_int_p): Likewise.
(c_common_get_narrower): Likewise.
(constant_expression_warning): Move to c-warn.c.
(constant_expression_error): Likewise.
(overflow_warning): Likewise.
(warn_logical_operator): Likewise.
(find_array_ref_with_const_idx_r): Likewise.
(warn_tautological_cmp): Likewise.
(expr_has_boolean_operands_p): Likewise.
(warn_logical_not_parentheses): Likewise.
(warn_if_unused_value): Likewise.
(strict_aliasing_warning): Likewise.
(sizeof_pointer_memaccess_warning): Likewise.
(check_main_parameter_types): Likewise.
(conversion_warning): Likewise.
(warnings_for_convert_and_check): Likewise.
(match_case_to_enum_1): Likewise.
(match_case_to_enum): Likewise.
(c_do_switch_warnings): Likewise.
(warn_for_omitted_condop): Likewise.
(readonly_error): Likewise.
(lvalue_error): Likewise.
(invalid_indirection_error): Likewise.
(warn_array_subscript_with_type_char): Likewise.
(warn_about_parentheses): Likewise.
(warn_for_unused_label): Likewise.
(warn_for_div_by_zero): Likewise.
(warn_for_memset): Likewise.
(warn_for_sign_compare): Likewise.
(do_warn_double_promotion): Likewise.
(do_warn_unused_parameter): Likewise.
(record_locally_defined_typedef): Likewise.
(maybe_record_typedef_use): Likewise.
(maybe_warn_unused_local_typedefs): Likewise.
(maybe_warn_bool_compare): Likewise.
(maybe_warn_shift_overflow): Likewise.
(warn_duplicated_cond_add_or_warn): Likewise.
(diagnose_mismatched_attributes): Likewise.
* c-common.h: Move the declarations from c-warn.c to its own section.
* c-warn.c: New file.

OK and creating c-attribs.c is pre-approved as well.

jeff



Re: [patch] aarch64-*-freebsd* support for gcc.

2016-10-10 Thread Andreas Tobler

On 10.10.16 23:10, Jeff Law wrote:

On 10/10/2016 03:07 PM, Andreas Tobler wrote:

Hi all,

the attached patch brings support for the aarch64-*-freebsd* target.

Bootstrapped and tested, results on the list. Not that many results due
to board instabilities; I lack a Cavium ;)

Ok for main? And if yes, how far can I backport? Down to 5.4?

TIA,
Andreas

libgcc:

2016-10-10  Andreas Tobler  

* config.host: Add support for aarch64-*-freebsd*.

gcc:

2016-10-10  Andreas Tobler  

* config.gcc: Add aarch64-*-freebsd* support.
* config.host: Likewise.
* config/aarch64/aarch64-freebsd.h: New file.
* config/aarch64/t-aarch64-freebsd: Ditto.

toplevel:

2016-10-10  Andreas Tobler 

* configure.ac: Add aarch64-*-freebsd*.
* configure: Regenerate.

Certainly OK for the trunk.  Jakub, Richi & Joseph make the rules for
the release branches.


Thank you again Jeff.

Committed to trunk with r240949.

Andreas



Re: [PATCH, C++] Warn on redefinition of builtin functions (PR c++/71973)

2016-10-10 Thread Bernd Edlinger
On 10/06/16 22:37, Bernd Edlinger wrote:
> On 10/06/16 16:14, Kyrill Tkachov wrote:
>>
>> @@ -1553,7 +1588,7 @@ duplicate_decls (tree newdecl, tree olddecl, bool
>>
>> /* Whether or not the builtin can throw exceptions has no
>>bearing on this declarator.  */
>> -  TREE_NOTHROW (olddecl) = 0;
>> +  TREE_NOTHROW (olddecl) = TREE_NOTHROW (newdecl);
>>
>> The comment would need to be updated I think.
>
> Probably, yes.
>
> Actually the code did *not* do what the comment said, and
> effectively set the nothrow attribute to zero, thus
> the eh handlers were emitted when not needed.
>
> And IMHO the new code does now literally do what the comment
> said.
>
> At this point there follow 1000+ lines of code, in the same
> function that merge olddecl into newdecl and back again.
>
> The code is dependent on the types_match variable,
> and in the end newdecl is freed and olddecl is returned.
>
> At some places the code is impossible to understand:
> Look for the memcpy :-/
>
> I *think* the intention is to merge the attributes from the
> builtin when the header file does not explicitly give
> some or all attributes, and the parameters match.
>
> But when the parameters do not match, the header file
> changes the builtin's signature, and overrides the
> builtin attributes more or less with defaults or
> whatever is in the header file.
>
>


A few more thoughts, that may help to clarify a few things here.

Regarding this hunk:

else if (! same_type_p (TREE_VALUE (t1), TREE_VALUE (t2)))
  break;
+ if (t1 || t2
+ || ! same_type_p (TREE_TYPE (TREE_TYPE (olddecl)),
+   TREE_TYPE (TREE_TYPE (newdecl))))
+   warning_at (DECL_SOURCE_LOCATION (newdecl),
+   OPT_Wbuiltin_function_redefined,
+   "declaration of %q+#D conflicts with built-in "
+   "declaration %q#D", newdecl, olddecl);
}
  else if ((DECL_EXTERN_C_P (newdecl)

meanwhile I start to think that the "if" here is unnecessary,
because if decls_match returns false, the declarations are certainly
different.  And the warning is thus already justified at this point.
Removing the if changes nothing, the condition is always satisfied.

Regarding this hunk:

/* Whether or not the builtin can throw exceptions has no
  bearing on this declarator.  */
-  TREE_NOTHROW (olddecl) = 0;
+  TREE_NOTHROW (olddecl) = TREE_NOTHROW (newdecl);

You may ask, why the old code was working most of the time.
I think, usually, when types_match == true, there happens another
assignment to TREE_NOTHROW, later in that function around line 2183:

   /* Merge the type qualifiers.  */
   if (TREE_READONLY (newdecl))
 TREE_READONLY (olddecl) = 1;
   if (TREE_THIS_VOLATILE (newdecl))
 TREE_THIS_VOLATILE (olddecl) = 1;
   if (TREE_NOTHROW (newdecl))
 TREE_NOTHROW (olddecl) = 1;

This is in a big "if (types_match)", so I think that explains
why the old code normally worked, and why it fails if the
parameters don't match.  But I still have no idea what to say
in the comment, except that the code should do exactly what
the comment above says.
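
To spell the reasoning out, here is a toy model of the two assignments, with plain bools in place of the tree flags. It only models the two statements quoted above, not the rest of duplicate_decls, and the function names are made up.

```cpp
#include <cassert>

/* Old code: clear the flag first, then the later merge sets it again,
   but only inside the "if (types_match)" block.  */
bool merged_nothrow_old (bool new_nothrow, bool types_match)
{
  bool old_nothrow = false;	/* TREE_NOTHROW (olddecl) = 0;  */
  if (types_match && new_nothrow)
    old_nothrow = true;		/* merge around line 2183  */
  return old_nothrow;
}

/* Patched code: take the flag from newdecl up front.  */
bool merged_nothrow_new (bool new_nothrow, bool types_match)
{
  bool old_nothrow = new_nothrow;
  if (types_match && new_nothrow)
    old_nothrow = true;
  return old_nothrow;
}
```

In this model the two versions agree whenever types_match is true, and differ only for a nothrow newdecl whose parameters do not match.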


Bernd.


Re: [v3 PATCH] Make any's copy assignment operator exception-safe, don't copy the underlying value when any is moved, make in_place constructors explicit.

2016-10-10 Thread Jonathan Wakely

On 10/10/16 22:21 +0300, Ville Voutilainen wrote:

This code was all pretty carefully written to avoid any redundant
operations. Does this change buy us anything except simpler code?


As discussed, destroying the value but leaving the manager non-null will
do bad things.


Oops again on my part! Not so carefully written, or tested.


New patch attached, ok for trunk?


OK, thanks.
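
For readers of the archive, the two properties named in the subject line can be illustrated with a small sketch. It does not use the real std::any internals: Payload, Holder and fail_next_copy are names invented here, and copy-then-swap is just one common way to get the guarantee, not necessarily what the patch does.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical payload whose copy constructor can be told to throw.
struct Payload {
  int value;
  static bool& fail_next_copy() { static bool flag = false; return flag; }
  explicit Payload(int v) : value(v) {}
  Payload(const Payload& other) : value(other.value) {
    if (fail_next_copy()) throw std::runtime_error("copy failed");
  }
};

// Hypothetical owning holder, standing in for a type-erased container.
class Holder {
  Payload* p_ = nullptr;
public:
  Holder() = default;
  explicit Holder(int v) : p_(new Payload(v)) {}
  Holder(const Holder& other)
    : p_(other.p_ ? new Payload(*other.p_) : nullptr) {}
  // Moving steals the pointer; the contained value is not copied.
  Holder(Holder&& other) noexcept : p_(other.p_) { other.p_ = nullptr; }
  ~Holder() { delete p_; }

  // Copy first, then swap: if the copy throws, *this is untouched.
  Holder& operator=(const Holder& rhs) {
    Holder tmp(rhs);          // may throw, nothing modified yet
    std::swap(p_, tmp.p_);    // cannot throw
    return *this;             // tmp destroys the old value
  }

  bool has_value() const { return p_ != nullptr; }
  int value() const { assert(p_ != nullptr); return p_->value; }
};
```

The point of the ordering is that the old value is destroyed only after the new one exists, so there is never a state where the value is gone but the holder still claims to have one.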


Re: [PATCH 4/5] shrink-wrap: Shrink-wrapping for separate components

2016-10-10 Thread Segher Boessenkool
On Mon, Oct 10, 2016 at 03:21:31PM -0600, Jeff Law wrote:
> On 09/30/2016 04:34 AM, Segher Boessenkool wrote:
> >[ whoops, message too big, resending with the attachment compressed ]
> >
> >On Tue, Sep 27, 2016 at 03:14:51PM -0600, Jeff Law wrote:
> >>With transposition issue addressed, the only blocker I see are some
> >>simple testcases we can add to the suite.  They don't have to be real
> >>extensive.  And one motivating example for the list archives, ideally
> >>the glibc malloc case.
> >
> >And here is the malloc testcase.
> >
> >A very important (for performance) function is _int_malloc, which starts
> >with
> [ ... ]
> Thanks.  What I think is important to note with this example is the bits 
> that were pushed into the path with the sysmalloc/alloc_perturb calls. 
> That's an unlikely path.

alloc_perturb is a no-op, and inlined as such: as nothing :-)

> We have to extrapolate a bit from the assembly provided.  In the not 
> separately shrink-wrapped version, we have a full prologue of stores and 
> two instances of a full epilogue (though only one ever executes) provided.
> 
> With separate shrink wrapping the (presumably) very cold path where we 
> error has virtually no prologue/epilogue.  That's probably a nop from a 
> performance standpoint.
> 
> More interesting is the path where we call sysmalloc/alloc_perturb, it's 
> a cold path, but not as cold as the error path.  We save/restore 4 regs 
> in that case.  Rather than a full prologue/epilogue.  So there's clearly 
> a savings there, though again, via the expect it's a cold path.
> 
> Where we have to extrapolate is the hot path.  Presumably on the hot 
> path we're saving/restoring ~4 fewer registers.   I haven't verified 
> that, but that is kind of the whole point here.

We save/restore just four registers total on the hot path.  And yes,
that is the point :-)

The hot exit is

.L683:
ld 14,144(1)
ld 15,152(1)
ld 25,232(1)
ld 30,272(1)
addi 3,4,16
.L673:
addi 1,1,288
blr

so four GPR restores and no LR restore.  Without separate shrink-wrapping
this was

.L641:
addi 3,21,16
b .L631

[ ... ]

.L631:
addi 1,1,288
ld 29,16(1)
ld 14,-144(1)
ld 15,-136(1)
ld 16,-128(1)
ld 17,-120(1)
ld 18,-112(1)
ld 19,-104(1)
ld 20,-96(1)
ld 21,-88(1)
ld 22,-80(1)
ld 23,-72(1)
ld 24,-64(1)
mtlr 29
ld 25,-56(1)
ld 26,-48(1)
ld 27,-40(1)
ld 28,-32(1)
ld 29,-24(1)
ld 30,-16(1)
ld 31,-8(1)
blr

(18 GPRs as well as LR).

I didn't show this path because there is a whole bunch of branches with
inline asm in the way.

The sysmalloc path was

.L635:
li 4,0
.L761:
addi 1,1,288
mr 3,14
ld 14,16(1)
ld 15,-136(1)
ld 16,-128(1)
ld 17,-120(1)
ld 18,-112(1)
ld 19,-104(1)
ld 20,-96(1)
ld 21,-88(1)
ld 22,-80(1)
ld 23,-72(1)
ld 24,-64(1)
ld 25,-56(1)
mtlr 14
ld 26,-48(1)
ld 14,-144(1)
ld 27,-40(1)
ld 28,-32(1)
ld 29,-24(1)
ld 30,-16(1)
ld 31,-8(1)
b sysmalloc

and now is

.L677:
mr 3,14
ld 15,152(1)
ld 14,144(1)
ld 25,232(1)
ld 30,272(1)
li 4,0
addi 1,1,288
b sysmalloc

I attach malloc.s.{no,yes}, I hope you can stomach that.  Well you
can read HP-PA, heh.
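
For readers without the attachments, the source shape behind the assembly above looks roughly like the sketch below: a hot path marked as expected, and a cold path that ends in a tail call to an out-of-line helper. This is not the glibc code; alloc_fast_or_slow, refill_slow and cached_block are hypothetical names, and __builtin_expect and the noinline attribute are GCC/Clang extensions. Whether a given compiler actually places separate save/restore components on each path for this function is not claimed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical cold helper, kept out of line so the call remains a real call
// (playing the role of sysmalloc in the discussion above).
__attribute__((noinline)) void* refill_slow(std::size_t n) {
  return std::malloc(n);
}

// Hypothetical one-slot cache standing in for the allocator's fast bins.
void* cached_block = nullptr;

void* alloc_fast_or_slow(std::size_t n) {
  assert(n > 0);
  // Hot path: expected to be taken, and needs very little saved state.
  if (__builtin_expect(cached_block != nullptr, 1)) {
    void* p = cached_block;
    cached_block = nullptr;
    return p;
  }
  // Cold path: a tail call, like the "b sysmalloc" in the listings above.
  return refill_slow(n);
}
```

Comparing the output of a compiler built with and without the patch on a function of this shape is one way to see the difference in the epilogues.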


Segher


malloc.s.no.gz
Description: GNU Zip compressed data


malloc.s.yes.gz
Description: GNU Zip compressed data


Re: [PATCH] Update docs on libstdc++ source-code layout

2016-10-10 Thread Jonathan Wakely

On 10/10/16 19:57 +0100, Jonathan Wakely wrote:

Self-explanatory updates to the docs, and regenerating after the
various recent changes.

* doc/xml/manual/appendix_contributing.xml (contrib.organization):
Describe other subdirectories and add markup. Remove outdated
reference to check-script target.
* doc/html/*: Regenerate.

Committed to trunk.


Some further markup improvements and corrections for outdated text.

Committed to trunk.

commit ae505b77cef62a4ee79dd374e75d88c223e945ed
Author: Jonathan Wakely 
Date:   Mon Oct 10 23:33:15 2016 +0100

Improve docs on libstdc++ source-code layout

	* doc/xml/manual/appendix_contributing.xml (contrib.organization):
	Replace  with nested  elements. Update
	some more outdated text.
	* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
index ee35dd9..1ee848f 100644
--- a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
+++ b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
@@ -203,105 +203,195 @@
 GCC sources contains the files needed to create the GNU C++ Library.
   
 
-  
+
 It has subdirectories:
+
 
-  doc
+
+  
+  doc
+  
 Files in HTML and text format that document usage, quirks of the
 implementation, and contributor checklists.
+
+  
 
-  include
+  
+  include
+  
 All header files for the C++ library are within this directory,
 modulo specific runtime-related files that are in the libsupc++
 directory.
 
-include/std
-  Files meant to be found by #include  directives in
-  standard-conforming user programs.
+
+
+include/std
+
+  Files meant to be found by #include  directives
+  in standard-conforming user programs.
+  
+
 
-include/c
+
+include/c
+
   Headers intended to directly include standard C headers.
   [NB: this can be enabled via --enable-cheaders=c]
+  
+
 
-include/c_global
+
+include/c_global
+
   Headers intended to include standard C headers in
-  the global namespace, and put select names into the std::
+  the global namespace, and put select names into the std::
   namespace.  [NB: this is the default, and is the same as
   --enable-cheaders=c_global]
+  
+
 
-include/c_std
+
+include/c_std
+
   Headers intended to include standard C headers
-  already in namespace std, and put select names into the std::
+  already in namespace std, and put select names into the std::
   namespace.  [NB: this is the same as
   --enable-cheaders=c_std]
+  
+
 
-include/bits
+
+include/bits
+
   Files included by standard headers and by other files in
   the bits directory.
+  
+
 
-include/backward
-  Headers provided for backward compatibility, such as .
+
+include/backward
+
+  Headers provided for backward compatibility, such as
+  .
   They are not used in this library.
+
+
 
-include/ext
+
+include/ext
+
   Headers that define extensions to the standard library.  No
-  standard header refers to any of them.
+  standard header refers to any of them, in theory (there are some
+  exceptions).
+  
+
+
+  
+  
 
-  scripts
+  
+  scripts
+  
 Scripts that are used during the configure, build, make, or test
 process.
+
+  
 
-  src
+  
+  src
+  
 Files that are used in constructing the library, but are not
 installed.
 
-src/c++98
+
+
+src/c++98
+
 Source files compiled using -std=gnu++98.
+  
+
 
-src/c++11
+
+src/c++11
+
 Source files compiled using -std=gnu++11.
+  
+
 
-src/filesystem
+
+src/filesystem
+
 Source files for the Filesystem TS.
+  
+
 
-src/shared
+
+src/shared
+
 Source code included by other files under both
 src/c++98 and
 src/c++11
+  
+
+
+  
+  
 
-  testsuites/[backward, demangle, ext, performance, thread, 17_* to 30_*]
+  
+  testsuites/[backward, demangle, ext, performance, thread, 17_* to 30_*]
+  
 Test programs are here, and may be used to begin to exercise the
 library.  Support for "make check" and "make check-install" is
 complete, and runs through all the subdirectories here when this
 command is issued from the build directory.  Please note that
 "make check" requires DejaGNU 1.4 or later to be installed.
+
+  
+
 
+
 Other subdirectories contain variant versions of certain files
 that are meant to be copied or linked by the configure script.
 Currently these are:
+config/abi
+config/allocator
+config/cpu
+config/io
+config/locale
+config/os
+
+
 
-  config/abi
-  config/cpu
-  config/io
-  config/locale
-  config/os
-
+
 In addition, a subdirecto

Re: Always support float128 on ia64 (PR target/77586)

2016-10-10 Thread Jeff Law

On 10/04/2016 10:46 AM, Joseph Myers wrote:

Bug 77586, and previously
, reports
ia64-elf failing to build because of float128_type_node being NULL,
but being used by the back end for __float128.

The global float128_type_node is only available conditionally, if
target hooks indicate TFmode is not only available as a scalar mode
and of the right format, but also supported in libgcc.  The back-end
support, however, expects the type always to be available for
__float128 even if the libgcc support is missing.

Although a target-specific node could be restored in the case where
libgcc support is missing, it seems better to address the missing
libgcc support.  Thus, this patch enables TFmode soft-fp in libgcc
globally for all ia64 targets.  Support for XFmode in libgcc (that is,
for libgcc2.c XFmode functions, not soft-fp) is also enabled for all
ia64 targets so that ia64 no longer needs to define the
TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P hook.

I've confirmed that ia64-elf builds cc1 with this patch and it passes
-fself-test.  I have not otherwise tested the patch.  It's plausible
that ia64-elf and ia64-freebsd might work as-is, but ia64-vms probably
needs further changes, by someone familiar with VMS shared libraries,
to implement an equivalent of ia64/t-softfp-compat in that case
(avoiding conflicts between __divtf3 from soft-fp and the old alias
for __divxf3).

gcc:
2016-10-04  Joseph Myers  

PR target/77586
* config/ia64/ia64.c (ia64_libgcc_floating_mode_supported_p)
(TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Remove.
* config/ia64/elf.h (IA64_NO_LIBGCC_TFMODE): Likewise.
* config/ia64/freebsd.h (IA64_NO_LIBGCC_TFMODE): Likewise.
* config/ia64/vms.h (IA64_NO_LIBGCC_XFMODE)
(IA64_NO_LIBGCC_TFMODE): Likewise.

libgcc:
2016-10-04  Joseph Myers  

PR target/77586
* config.host (ia64*-*-elf*, ia64*-*-freebsd*, ia64-hp-*vms*): Use
soft-fp.

Given it's a clear step forward and the inability to test the least 
common platform (vms), I'm OK with this patch.


jeff



Re: [PATCH] Improve performance of list::reverse

2016-10-10 Thread Elliot Goodrich
I haven't yet but I will try and sort it out tomorrow.

If we're replacing the current method with one that takes a size
parameter when _GLIBCXX_USE_CXX11_ABI is defined, is this going to
cause any issues with ABI compatibility? If not, then I agree that we
should go with the #if version.
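
A rough sketch of what the "#if version" could look like, outside the library. Node, reverse_walk, reverse_sized and reverse_ring are hypothetical names, and the ring with a sentinel is only an assumption about the list layout; the real _M_reverse may differ. With libstdc++, including <cstddef> also defines _GLIBCXX_USE_CXX11_ABI; with other libraries the macro is absent and the walking version is chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical node of a circular doubly linked list with a sentinel.
struct Node {
  Node* next;
  Node* prev;
};

// Size-unaware version: one pointer walks the whole ring.
inline void reverse_walk(Node* head) {
  Node* p = head;
  do {
    std::swap(p->next, p->prev);
    p = p->prev;                 // this is the old next pointer
  } while (p != head);
}

// Size-aware version: n is the number of nodes excluding the sentinel.
// The loop handles pairs only; the odd middle node is fixed up once, after
// the loop, instead of being tested for on every iteration.
inline void reverse_sized(Node* head, std::size_t n) {
  Node* f = head->next;          // walks forward from the first node
  Node* b = head->prev;          // walks backward from the last node
  std::swap(head->next, head->prev);
  for (std::size_t i = 0; i < n / 2; ++i) {
    Node* fn = f->next;          // read both successors before writing
    Node* bp = b->prev;
    f->next = f->prev;  f->prev = fn;
    b->prev = b->next;  b->next = bp;
    f = fn;
    b = bp;
  }
  if (n % 2 != 0)
    std::swap(f->next, f->prev); // here f == b, the middle node
}

// Exactly one body is compiled in, chosen at preprocessing time.
inline void reverse_ring(Node* head, std::size_t n) {
  assert(head != nullptr);
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
  reverse_sized(head, n);
#else
  (void) n;
  reverse_walk(head);
#endif
}
```

Both bodies produce the same ring, so callers of reverse_ring cannot observe which one was selected.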

On 10 October 2016 at 17:12, Jonathan Wakely  wrote:
> On 09/10/16 16:23 +0100, Elliot Goodrich wrote:
>>
>> Hi,
>>
>> If we unroll the loop so that we iterate both forwards and backwards,
>> we can take advantage of memory-level parallelism when chasing
>> pointers. This means that reverse takes 35% less time when nodes are
>> randomly scattered in memory and about the same time if nodes are
>> contiguous.
>>
>> Further, as our node pointers will never alias, we can interleave the
>> swaps of the next and previous pointers to remove further data
>> dependencies. This takes another 5% off the time when nodes are
>> scattered in memory and takes 20% off when nodes are contiguous.
>>
>> All in all we save 20%-40% depending on the memory layout.
>
>
> Nice, thanks for the patch.
>
> Do you have (or are you willing to sign) a copyright assignment for
> GCC?
>
> See https://gcc.gnu.org/contribute.html#legal for details.
>
>> For future improvement, by passing whether there is an odd or even
>> number of nodes in the list we can hoist one of the ifs out of the
>> loop and gain another 5-10% but most likely this is only possible when
>> _GLIBCXX_USE_CXX11_ABI is defined and size() is O(1). This would bring
>> the saving to 30%-45%. Is it worth writing a new overload of
>> _M_reverse which takes the size of the list?
>
>
> That certainly seems worthwhile. Do we need an overload or can it just
> be done with #if? It seems to me we'd either want to use the size, or
> not use it, we wouldn't want both versions defined at once. That
> suggests #if to me.
>
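
The two-ended traversal described in the quoted message can be sketched as follows. This is not the submitted patch: Node and reverse_two_ended are hypothetical names, and a circular list with a sentinel is assumed. No timing claims are made for this sketch; the percentages above are the submitter's measurements.

```cpp
#include <cassert>
#include <utility>

// Hypothetical node of a circular doubly linked list with a sentinel.
struct Node {
  Node* next;
  Node* prev;
};

// Reverses the ring by walking from both ends towards the middle. The two
// walks chase independent pointers, and the writes to f and b touch
// different nodes, so they can be interleaved freely.
inline void reverse_two_ended(Node* head) {
  assert(head != nullptr);
  Node* f = head->next;              // first node, walks forward
  Node* b = head->prev;              // last node, walks backward
  std::swap(head->next, head->prev);
  if (f == head)
    return;                          // empty list: only the sentinel
  for (;;) {
    if (f == b) {                    // odd count: single middle node
      std::swap(f->next, f->prev);
      return;
    }
    Node* fn = f->next;              // read both successors before writing
    Node* bp = b->prev;
    f->next = f->prev;  f->prev = fn;
    b->prev = b->next;  b->next = bp;
    if (fn == b)                     // even count: the pair was adjacent
      return;
    f = fn;
    b = bp;
  }
}
```

The two checks inside the loop are what a size-taking variant would hoist out, since the parity of the node count decides in advance which exit is taken.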

