Re: [PATCH][ARM][2/4] Fix operand costing logic for SMUL[TB][TB]

2016-02-03 Thread Nick Clifton
Hi Kyrill,

> 2016-01-22  Kyrylo Tkachov  
> 
> * config/arm/arm.c (arm_new_rtx_costs, MULT case): Properly extract
> the operands of the SIGN_EXTENDs from a SMUL[TB][TB] rtx.

Approved - please apply.

Cheers
  Nick
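
For readers less familiar with the instructions named in the subject line, here is a minimal C model of the SMUL[TB][TB] family: each variant multiplies one sign-extended 16-bit half of each operand. The helper names (sext16, smulbb, smultb, smultt) are illustrative only and are not taken from arm.c. The sketch models the instruction semantics, not the rtx costing code touched by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 bits of x to a 32-bit signed value
   without relying on implementation-defined conversions.  */
static int32_t sext16(uint32_t x)
{
    x &= 0xffffu;
    return (x & 0x8000u) ? (int32_t)x - 0x10000 : (int32_t)x;
}

/* SMULBB: bottom half of a times bottom half of b.  */
int32_t smulbb(uint32_t a, uint32_t b)
{
    return sext16(a) * sext16(b);
}

/* SMULTB: top half of a times bottom half of b.  */
int32_t smultb(uint32_t a, uint32_t b)
{
    return sext16(a >> 16) * sext16(b);
}

/* SMULTT: top half of a times top half of b.  */
int32_t smultt(uint32_t a, uint32_t b)
{
    return sext16(a >> 16) * sext16(b >> 16);
}

/* Small self-check: 0xfff1 in a halfword is -15 when sign-extended.  */
void smul_self_check(void)
{
    assert(smulbb(0x0001fff1u, 3u) == -45);
    assert(smultb(0xfff10000u, 0x12340002u) == -30);
}
```

The unused halves of each operand are ignored, which is why the cost code has to look through the SIGN_EXTEND to find the real operands.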
  


Re: [PATCH][ARM][1/4] PR target/65932: Add testcase

2016-02-03 Thread Nick Clifton
Hi Kyrill,

  I would like to approve this patch, but cannot, since it is not ARM
  specific.  I think that if you ping the list you may be able to get a
  response, and it would be nice to see this whole patch series checked
  in before the gcc 6 branch occurs.

Cheers
  Nick

PS.  If necessary you could always move the test to gcc.target/arm...



Re: [PATCH][ARM][4/4] Adjust gcc.target/arm/wmul-[123].c tests

2016-02-03 Thread Nick Clifton

> 2016-01-22  Kyrylo Tkachov  
> 
> * gcc.target/arm/wmul-3.c: Simplify test to generate just
> a single smulbb instruction.
> * gcc.target/arm/wmul-1.c: Add -mtune=cortex-a9 to dg-options.
> * gcc.target/arm/wmul-2.c: Likewise.

Approved - please apply.

Cheers
  Nick
  


Re: [Patch, MIPS] Fix PR target/68273, passing args in wrong regs

2016-02-03 Thread Eric Botcazou
> Can you explain why the GCC internals cause us to get SCmode instead of
> BLKmode for the example with _Complex?  I don't understand that.  It
> seems wrong to me and I don't understand where it is coming from.

compute_record_mode uses BLKmode only as a last resort because this will 
significantly pessimize over non-BLKmode at the RTL level.  As Richard said 
(and as discussed many times over the years), you ought not to rely on the 
mode for the calling convention of aggregate types.

-- 
Eric Botcazou



Re: [Patch, Fortran] PR 69495: unused-label warning does not tell which flag triggered it

2016-02-03 Thread Manfred Schwarb

Am 02.02.2016 um 21:26 schrieb Janus Weil:

Hi all,

here is a diagnostics patch, which makes sure that the responsible
flag is printed in several warning messages (for which this was still
missing).

The only case that I'm not completely sure about is the hunk in
intrinsic.c. In particular I was not able to trigger this warning and
found no occurrence of it in the testsuite. Could someone check if the
flag that I'm using there is correct, please?

As a small extra the patch also mentions the -Wpedantic flag in the
gfortran documentation.

It regtests cleanly on x86_64-linux-gnu. Ok for trunk?

Cheers,
Janus



   if (source_size < result_size)
-gfc_warning (0, "Intrinsic TRANSFER at %L has partly undefined result: "
-"source size %ld < result size %ld", &source->where,
-(long) source_size, (long) result_size);
+gfc_warning (OPT_Wsurprising, "Intrinsic TRANSFER at %L has partly "
+"undefined result: source size %ld < result size %ld",
+&source->where, (long) source_size, (long) result_size);
 


Breaking these strings apart will probably hamper translation.

Cheers,
Manfred
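
As a point of reference for the concern above: adjacent string literals in C are concatenated by the compiler, so the two layouts in the hunk yield the same format string at run time. Whether message-extraction tools treat the two layouts identically is a separate question that this sketch does not settle. The array and function names are illustrative.

```c
#include <assert.h>
#include <string.h>

/* The format string as split before the patch.  */
static const char old_layout[] =
    "Intrinsic TRANSFER at %L has partly undefined result: "
    "source size %ld < result size %ld";

/* The format string as split after the patch.  */
static const char new_layout[] =
    "Intrinsic TRANSFER at %L has partly "
    "undefined result: source size %ld < result size %ld";

/* Returns 1 when both layouts produce the same string.  */
int layouts_are_identical(void)
{
    return strcmp(old_layout, new_layout) == 0;
}

/* Self-check: the split point does not change the resulting string.  */
void layout_self_check(void)
{
    assert(layouts_are_identical());
}
```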


Re: [PING] Add new mexecute-only arm option.

2016-02-03 Thread mickael guene

Hi Sandra,

 Thanks for your feedback.

On 02/02/2016 08:57 PM, Sandra Loosemore wrote:
> On 02/02/2016 02:06 AM, mickael guene wrote:
>> Hi All,
>>
>>Ping for following thread :
>>
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01968.html
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01969.html
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01970.html
>
> Two comments:
>
> (1) MIPS has had a similar option for quite some time called
> -mcode-readable=.  It might be less confusing to use a similar name for
> the ARM option with the similar reversed sense to -mexecute-only, even
> if it doesn't need to be a tristate flag like for MIPS.
 I was unaware of this MIPS option. But anyway I would rather prefer to
stick with -mexecute-only since it's very similar to armcc option naming
for the same feature (--execute_only).

> (2) I suggest changing the help string for the command line option
>
>> +
>> +mexecute-only
>> +Target Report Var(target_execute_only) Init(0)
>> +Forbid load into text sections.
>
> to use the same wording as the documentation in the manual:
>
>> +@item -mexecute-only
>> +@opindex mexecute-only
>> +Disable read memory access inside code sections.  Only code fetching is
>> +allowed.
>> +This option is off by default.
>> +
>
> Or at least, "load into text sections" is confusing.  (You load *from*
> the text section, not *into* it, right?)

 You're right. I will reuse documentation sentence.

Regards
Mickael


Re: [PATCH][ARM][1/4] PR target/65932: Add testcase

2016-02-03 Thread Kyrill Tkachov

Hi Nick,

On 03/02/16 08:35, Nick Clifton wrote:

Hi Kyrill,

   I would like to approve this patch, but cannot, since it is not ARM
   specific.  I think that if you ping the list you may be able to get a
   response, and it would be nice to see this whole patch series checked
   in before the gcc 6 branch occurs.

Cheers
   Nick

PS.  If necessary you could always move the test to gcc.target/arm...



Thanks again for looking at these.
I'd like to keep this test in the generic directory as we're
trying to avoid cluttering the gcc.target directories with
tests that are not arm-specific.

CC'ing Jakub then. Jakub, is it ok to have the test from
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01717.html
in gcc.c-torture/execute/ ?

Thanks,
Kyrill


Re: [PATCH][ARM][1/4] PR target/65932: Add testcase

2016-02-03 Thread Jakub Jelinek
On Wed, Feb 03, 2016 at 09:46:31AM +, Kyrill Tkachov wrote:
> Hi Nick,
> 
> On 03/02/16 08:35, Nick Clifton wrote:
> >Hi Kyrill,
> >
> >   I would like to approve this patch, but cannot, since it is not ARM
> >   specific.  I think that if you ping the list you may be able to get a
> >   response, and it would be nice to see this whole patch series checked
> >   in before the gcc 6 branch occurs.
> >
> >Cheers
> >   Nick
> >
> >PS.  If necessary you could always move the test to gcc.target/arm...
> >
> 
> Thanks again for looking at these.
> I'd like to keep this test in the generic directory as we're
> trying to avoid cluttering the gcc.target directories with
> tests that are not arm-specific.
> 
> CC'ing Jakub then. Jakub, is it ok to have the test from
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01717.html
> in gcc.c-torture/execute/ ?

The test is not portable, so not in this form.
Does the test reproduce what you want if you change
0xfff1 to -15?  If yes, the test is ok with that change.

Jakub
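
One way to see the portability concern, sketched under the assumption that the constant is compared against a signed 16-bit value (the test itself is not quoted in this thread): 0xfff1 is a non-negative literal whose type depends on the width of int, whereas -15 has the same value on every target. The function names below are illustrative.

```c
#include <assert.h>

/* Compares against the hex literal.  With a 32-bit int the literal is the
   int 65521, so a short holding -15 does not match; with a 16-bit int the
   literal has type unsigned int and the comparison can succeed.  */
int matches_hex_literal(short v)
{
    return v == 0xfff1;
}

/* Compares against -15, which has the same value on every target.  */
int matches_minus_15(short v)
{
    return v == -15;
}

/* Self-check of the portable form only.  */
void literal_self_check(void)
{
    assert(matches_minus_15((short)-15));
}
```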


Re: [PATCH 4/4][AArch64] Cost CCMP instruction sequences to choose better expand order

2016-02-03 Thread James Greenhalgh
On Thu, Jan 28, 2016 at 02:33:20PM +, James Greenhalgh wrote:
> On Mon, Jan 25, 2016 at 08:09:39PM +, Wilco Dijkstra wrote:
> > Andreas Schwab  wrote:
> > 
> > > FAIL: gcc.target/aarch64/ccmp_1.c scan-assembler-times \tcmp\tw[0-9]+, 0 4
> > > FAIL: gcc.target/aarch64/ccmp_1.c scan-assembler adds\t
> > > FAIL: gcc.target/aarch64/ccmp_1.c scan-assembler-times fccmpe\t.*0\\.0 1
> > 
> > Yes I noticed those too, and here is the fix. Richard's recent change added
> > UNSPEC to the CCMP patterns to stop combine optimizing the CCMP CCmode
> > immediate in a rare case. This requires a change to the CCMP cost 
> > calculation
> > as the CCMP instruction with unspec is no longer recognized.
> > 
> > Fix the ccmp_1.c test to allow both '0' and 'wzr' on cmp - BTW is there a
> > regular expression that correctly implements (0|xzr)? If I use that the test
> > still fails somehow but \[0wzr\]+ works fine... Is the correct syntax
> > documented somewhere?
> > 
> > Finally to ensure FCCMPE is emitted on relational compares, add
> > -ffinite-math-only.
> > 
> > ChangeLog:
> > 2016-01-25  Wilco Dijkstra  
> > 
> > gcc/
> > * config/aarch64/aarch64.c (aarch64_if_then_else_costs):
> > Remove CONST_INT_P check in CCMP cost calculation.
> > 
> > gcc/testsuite/
> > * gcc.target/aarch64/ccmp_1.c: Fix test issues.

I'm still seeing:

  FAIL: gcc.target/aarch64/ccmp_1.c scan-assembler-times \\tcmp\\tw[0-9]+, (0|wzr) 4

Looking at the assembly generated for me with this testcase I see ccmp
with zero in 5 places:

  f3:
        cmp     w1, 34
        ccmp    w0, 19, 0, eq
        cset    w0, eq
        ret
  f4:
        cmp     w0, 35
        ccmp    w1, 20, 0, eq
        cset    w0, eq
        ret

  f7:
        cmp     w0, 0
        ccmp    w1, 7, 0, eq
        cset    w0, eq
        ret

  f8:
        cmp     w1, 0
        ccmp    w0, 9, 0, eq
        cset    w0, eq
        ret

  f11:
        fcmpe   d0, #0.0
        ccmp    w0, 30, 0, mi
        cset    w0, eq
        ret

Are these all expected? If so, can you spin the "obvious" patch to bump
this number to 5.

Thanks,
James



Re: [PATCH][ARM][1/4] PR target/65932: Add testcase

2016-02-03 Thread Kyrill Tkachov


On 03/02/16 09:51, Jakub Jelinek wrote:

On Wed, Feb 03, 2016 at 09:46:31AM +, Kyrill Tkachov wrote:

Hi Nick,

On 03/02/16 08:35, Nick Clifton wrote:

Hi Kyrill,

   I would like to approve this patch, but cannot, since it is not ARM
   specific.  I think that if you ping the list you may be able to get a
   response, and it would be nice to see this whole patch series checked
   in before the gcc 6 branch occurs.

Cheers
   Nick

PS.  If necessary you could always move the test to gcc.target/arm...


Thanks again for looking at these.
I'd like to keep this test in the generic directory as we're
trying to avoid cluttering the gcc.target directories with
tests that are not arm-specific.

CC'ing Jakub then. Jakub, is it ok to have the test from
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01717.html
in gcc.c-torture/execute/ ?

The test is not portable, so not in this form.
Does the test reproduce what you want if you change
0xfff1 to -15?  If yes, the test is ok with that change.


thanks for spotting this.
Yes, -15 works just as well (bug is reproducible)
I'll make the change.

Kyrill


Jakub




Re: [Patch, fortran, pr67451, gcc-5, v1] [5/6 Regression] ICE with sourced allocation from coarray

2016-02-03 Thread Andre Vehreschild
Hi Paul,

thanks for the review. Committed as:

r233099 for gcc-5, and

r233101 for trunk.

Regards,
Andre


On Tue, 2 Feb 2016 19:44:00 +0100
Paul Richard Thomas  wrote:

> Hi Andre,
> 
> This one looks good too. As every day goes by, I see more and more why
> Tobias was so keen to incorporate all objects into a single descriptor
> type :-)
> 
> OK for 5-branch.
> 
> Thanks for both the patches
> 
> Paul
> 
> On 1 February 2016 at 13:34, Andre Vehreschild  wrote:
> > Oh, well, now with attachments. I am sorry.
> >
> > - Andre
> >
> > On Mon, 1 Feb 2016 13:20:24 +0100
> > Andre Vehreschild  wrote:
> >  
> >> Hi all,
> >>
> >> here is the backport of the patch for pr67451 for gcc-5. Because the
> >> structure of the allocate() in trunk is quite different the patch looks
> >> somewhat different, too, but essentially does the same.
> >>
> >> Bootstrapped and regtests ok on x86_64-linux-gnu/F23.
> >>
> >> Ok for gcc-5-branch?
> >>
> >> Here is the link to the mainline patch:
> >> https://gcc.gnu.org/ml/fortran/2016-01/msg00093.html
> >>
> >> Regards,
> >>   Andre
> >>
> >> On Fri, 29 Jan 2016 19:17:24 +0100
> >> Andre Vehreschild  wrote:
> >>  
> >> > Hi all,
> >> >
> >> > attached is a patch to fix a regression in current gfortran when a
> >> > coarray is used in the source=-expression of an allocate(). The ICE was
> >> > caused by the class information, i.e., _vptr and so on, not at the
> >> > expected place. The patch fixes this.
> >> >
> >> > The patch also fixes pr69418, which I will flag as a duplicate in a
> >> > second.
> >> >
> >> > Bootstrapped and regtested ok on x86_64-linux-gnu/F23.
> >> >
> >> > Ok for trunk?
> >> >
> >> > Backport to gcc-5 is pending, albeit more difficult, because the
> >> > allocate() implementation on 5 is not as advanced as the one in 6.
> >> >
> >> > Regards,
> >> > Andre  
> >>
> >>  
> >
> >
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  
> 
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 233098)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,13 @@
+2016-02-03  Andre Vehreschild  
+
+	PR fortran/67451
+	PR fortran/69418
+	* trans-expr.c (gfc_copy_class_to_class): For coarrays just the
+	pointer is passed.  Take it as is without trying to deref the
+	_data component.
+	* trans-stmt.c (gfc_trans_allocate): Take care of coarrays as
+	argument to source=-expression.
+
 2016-01-30  Bud Davis  
 	Mikael Morin  
 
Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c	(Revision 233098)
+++ gcc/fortran/trans-expr.c	(Arbeitskopie)
@@ -1019,6 +1019,7 @@
   tree fcn;
   tree fcn_type;
   tree from_data;
+  tree from_class_base = NULL;
   tree from_len;
   tree to_data;
   tree to_len;
@@ -1035,21 +1036,41 @@
   from_len = to_len = NULL_TREE;
 
   if (from != NULL_TREE)
-fcn = gfc_class_vtab_copy_get (from);
+{
+  /* Check that from is a class.  When the class is part of a coarray,
+	 then from is a common pointer and is to be used as is.  */
+  tmp = POINTER_TYPE_P (TREE_TYPE (from)) && !DECL_P (from)
+	  ? TREE_OPERAND (from, 0) : from;
+  if (GFC_CLASS_TYPE_P (TREE_TYPE (tmp))
+	  || (DECL_P (tmp) && GFC_DECL_CLASS (tmp)))
+	{
+	  from_class_base = from;
+	  from_data = gfc_class_data_get (from_class_base);
+	}
+  else
+	{
+	  /* For arrays two component_refs can be present.  */
+	  if (TREE_CODE (tmp) == COMPONENT_REF)
+	tmp = TREE_OPERAND (tmp, 0);
+	  if (TREE_CODE (tmp) == COMPONENT_REF)
+	tmp = TREE_OPERAND (tmp, 0);
+	  from_class_base = tmp;
+	  from_data = from;
+	}
+  fcn = gfc_class_vtab_copy_get (from_class_base);
+}
   else
-fcn = gfc_class_vtab_copy_get (to);
+{
+  fcn = gfc_class_vtab_copy_get (to);
+  from_data = gfc_class_vtab_def_init_get (to);
+}
 
   fcn_type = TREE_TYPE (TREE_TYPE (fcn));
 
-  if (from != NULL_TREE)
-  from_data = gfc_class_data_get (from);
-  else
-from_data = gfc_class_vtab_def_init_get (to);
-
   if (unlimited)
 {
-  if (from != NULL_TREE && unlimited)
-	from_len = gfc_class_len_get (from);
+  if (from_class_base != NULL_TREE)
+	from_len = gfc_class_len_get (from_class_base);
   else
 	from_len = integer_zero_node;
 }
Index: gcc/fortran/trans-stmt.c
===
--- gcc/fortran/trans-stmt.c	(Revision 233098)
+++ gcc/fortran/trans-stmt.c	(Arbeitskopie)
@@ -5180,7 +5180,7 @@
  _vptr, _len and element_size for expr3.  */
   if (code->expr3)
 {
-  bool vtab_needed = false;
+  bool vtab_needed = false, is_coarray = gfc_is_coarray (code->expr3);
   /* expr3_tmp gets the tree when code->expr3.mold is set, i.e.,
 	 the expression is only needed to get the _vptr, _len a.s.o.  */
   tree expr3_tmp = NULL_TREE;
@@ -5245,7 +5245,8 @@
 		{
 	

Re: [PATCH 4/4] Un-XFAIL ssa-dom-cse-2.c for most platforms

2016-02-03 Thread Alan Lawrence

On 26/01/16 12:23, Dominik Vogt wrote:

On Mon, Dec 21, 2015 at 01:13:28PM +, Alan Lawrence wrote:

...the test passes with --param sra-max-scalarization-size-Ospeed.

Verified on aarch64 and with stage1 compiler for hppa, powerpc, sparc, s390.


How did you test this on s390?  For me, the test still fails
unless I add -march=z13 (s390x).


Sorry for the slow response, was away last week. On x86 host, I built a compiler

configure --enable-languages=c,c++,lto --target=s390-none-linux-gnu
make all-gcc
make check-gcc RUNTESTFLAGS=tree-ssa.exp=ssa-dom-cse-2.c

and that shows the tests passing. gcc -v shows little further:
Reading specs from ./gcc/specs
COLLECT_GCC=./gcc/xgcc
COLLECT_LTO_WRAPPER=./gcc/lto-wrapper
Target: s390-none-linux-gnu
Configured with: /work/alalaw01/src2/gcc/configure --enable-languages=c,c++,lto 
--target=s390-none-linux-gnu

Thread model: posix
gcc version 6.0.0 20151206 (experimental) (GCC)

I speculate that perhaps the -march=z13 is default for s390 *linux* ???

If you can send me alternative configury, or dumps from your failing compiler 
(dom{2,3}-details, optimized), I can take a look.


Thanks, Alan


Re: [PATCH 4/4][AArch64] Cost CCMP instruction sequences to choose better expand order

2016-02-03 Thread Wilco Dijkstra
James Greenhalgh wrote:
> I'm still seeing:
>
>  FAIL: gcc.target/aarch64/ccmp_1.c scan-assembler-times \\tcmp\\tw[0-9]+, (0|wzr) 4

That's because "(0|wzr)" is not correctly matching due to the weird regular 
expression syntax used in the testsuite (I tried with several escapes to no 
avail). It looks like Richard committed that, perhaps accidentally? I'll change 
it back to "0" (count 4 is right as it only matches CMP, not CCMP).

Wilco
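
The point that the pattern counts only CMP and not CCMP can be checked with a small matcher. This is a sketch only: the testsuite evaluates its patterns as Tcl regular expressions, whereas the code below uses POSIX extended regular expressions from <regex.h> as a stand-in, and count_matches is a hypothetical helper, not part of the testsuite.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Count non-overlapping matches of a POSIX extended regular expression
   in text.  Returns -1 if the pattern does not compile.  */
int count_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    int count = 0;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    while (*text != '\0' && regexec(&re, text, 1, &m, 0) == 0) {
        count++;
        /* Continue after the match; step one byte on an empty match.  */
        text += (m.rm_eo > 0) ? m.rm_eo : 1;
    }

    regfree(&re);
    return count;
}
```

Because the pattern requires a tab directly before "cmp", a line such as "\tccmp\tw1, 7, 0, eq" does not match: the character before "cmp" there is 'c'. For the f7 sequence above, count_matches("\tcmp\tw[0-9]+, 0", ...) therefore counts the cmp line only.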



[COMMITTED][AArch64] Fix ccmp_1.c test

2016-02-03 Thread Wilco Dijkstra
Fix the ccmp_1.c test back to use '0' as regular expressions don't work 
correctly. '0' is right due to compare with zero now printing as 'CMP w0, 0' 
rather than 'CMP w0, wzr' (since r232921).

Committed as trivial patch in r233102.

ChangeLog:
2016-02-03  Wilco Dijkstra  

gcc/testsuite/
* gcc.target/aarch64/ccmp_1.c: Fix test issue.

diff --git a/gcc/testsuite/gcc.target/aarch64/ccmp_1.c 
b/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
index 
7c962cbb396f5ef9634cf3091dd12327ad2a7b03..fd38b2cfdb834c6395ddb939b54d23345c51b8d9
 100644
--- a/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
@@ -85,7 +85,7 @@ f13 (int a, int b)
 /* { dg-final { scan-assembler "cmp\t(.)+34" } } */
 /* { dg-final { scan-assembler "cmp\t(.)+35" } } */
 
-/* { dg-final { scan-assembler-times "\tcmp\tw\[0-9\]+, (0|wzr)" 4 } } */
+/* { dg-final { scan-assembler-times "\tcmp\tw\[0-9\]+, 0" 4 } } */
 /* { dg-final { scan-assembler-times "fcmpe\t(.)+0\\.0" 2 } } */
 /* { dg-final { scan-assembler-times "fcmp\t(.)+0\\.0" 2 } } */




Re: [Patch, fortran] PR66089 fix elemental dependency mishandling

2016-02-03 Thread Paul Richard Thomas
Dear Mikael,

The patch is OK for trunk.

A small niggle: Although present in the original testcase, 'a' is unused.

I am not in a position to find out for myself, right now, but does the
testcase of comment #10 work with this patch?

Thanks for the patch

Paul

On 1 February 2016 at 23:07, Mikael Morin  wrote:
> Hello,
>
> this is about the case
>
>c(:) = elemental_func(c(1), ...)
>
> where as a result of a trunk change, only a reference to c(1) is saved to a
> temporary variable, instead of its value.
>
> The fix tries to avoid as much copying as possible by detecting the above
> case, using a new field, needs_temporary.
>
> The patch is a variant of the one that has been on bugzilla for months.
> The main difference is the usage of gfc_expr_is_variable instead of the
> check for expr_type == EXPR_VARIABLE (the former includes pointer-returning
> functions as well).
>
> Regression-tested on x86_64-unknown-linux-gnu.  OK for trunk?
> Mikael
>



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein
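
The reference-versus-value problem described in the quoted message can be illustrated in C. This is an analogy under stated assumptions, not the gfortran implementation: add stands in for the elemental function, and the two loops model saving a reference to c(1) versus saving its value before the assignment c(:) = elemental_func(c(1), ...). All names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an elemental function.  */
static int add(int x, int y)
{
    return x + y;
}

/* Saves only a reference to c[0]: the first store changes the value that
   later iterations read, so the results drift.  */
void assign_by_reference(int *c, size_t n, int y)
{
    assert(c != NULL && n > 0);
    const int *first = &c[0];
    for (size_t i = 0; i < n; i++)
        c[i] = add(*first, y);
}

/* Saves the value of c[0] in a temporary before any element is written,
   so every element is computed from the original value.  */
void assign_by_value(int *c, size_t n, int y)
{
    assert(c != NULL && n > 0);
    const int first = c[0];
    for (size_t i = 0; i < n; i++)
        c[i] = add(first, y);
}
```

Starting from {1, 2, 3} with y = 10, the by-value version yields {11, 11, 11}, while the by-reference version yields {11, 21, 21}.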


Re: [PATCH] Fix c/69643, named address space wrong-code

2016-02-03 Thread Richard Biener
On February 3, 2016 8:11:01 AM GMT+01:00, Richard Henderson  
wrote:
>On 02/03/2016 06:05 PM, Richard Biener wrote:
>  I wasn't aware that STRIP_NOPS strips ADDR_SPACE_CONVERT_EXPR.
>>
>> Isn't this maybe failing to use that (unable to look at the
>attachment from my phone).
>
>The test case does fail to use ADDR_SPACE_CONVERT_EXPR.
>Perhaps it's because of the intermediate cast to uintptr_t?

Ah.  Isn't to/from int conversion also address-space specific?

I wonder if it makes sense to have ADDR_SPACE_CONVERT if there is the loophole 
of going through an integer type...

That is, if the address spaces are not subsets, how can going through an int 
make sense? Isn't the testcase somehow invalid then?

>Of course, for this case, the intermediate cast is required
>because __seg_[fg]s are *not* subsets of ADDR_SPACE_GENERIC,
>and thus a direct cast between the pointer types results in
>an error message.

As for a patch I'd repeatedly pondered on not stripping int <-> pointer 
conversions at all, similar to what STRIP_SIGN_NOPS does.  Don't remember 
actually trying this or the fallout though.

Richard.

>
>r~




[PATCH, i386, COMMITTED] Fix PR69118.

2016-02-03 Thread Kirill Yukhin
Hello,
As proposed in PR69118 - fixed condition of compare pattern.

Bootstrapped, regtested & committed to main trunk & gcc-5-branch.

gcc/
PR target/69118
* config/i386/sse.md (define_insn "avx512f_maskcmp3"):
Fix target.

--
Thanks, K

commit 7fa978b9b80a6d50a81065755be81acc2923b0e2
Author: Kirill Yukhin 
Date:   Wed Feb 3 12:37:13 2016 +0300

AVX512. Fix PR69118 - wrong target for compare pattern.

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 7f89679..045a85f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -2788,7 +2788,7 @@
(match_operator: 3 "sse_comparison_operator"
  [(match_operand:VF 1 "register_operand" "v")
   (match_operand:VF 2 "nonimmediate_operand" "vm")]))]
-  "TARGET_SSE"
+  "TARGET_AVX512F"
   "vcmp%D3\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecmp")
(set_attr "length_immediate" "1")


[PATCH, testsuite]: Improve check_effective_target_fsanitize_thread

2016-02-03 Thread Uros Bizjak
Hello!

Attached patch improves detection of working -fsanitize=thread option.
The check for a working -fsanitize=thread option times out with older glibcs,
so tsan_init detects this case and sets dg-do-what-default to
compile.

Recently Eric changed check_effective_target_fsanitize_thread to a
runtime test, and we *again* unnecessarily waste 5 minutes of test
time here, waiting for a test to timeout.

Attached patch moves the detection to
check_effective_target_fsanitize_thread function. Now, the function
first checks if the compiler is able to create executable (and exits
early if not) and later in the function sets what to do by default,
depending on the outcome of the runtime test.

BTW: The dg-do-what-default of link was chosen to avoid the testsuite
error due to the usage of additional-sources in some testcases. Also,
we don't need additional compile flags for check_no_compiler_messages
and check_runtime since TEST_ALWAYS_FLAG is always set.

2016-02-03  Uros Bizjak  

* lib/tsan-dg.exp (tsan_init): Move check if tsan executable
works from here ...
(check_effective_target_fsanitize_thread): ... to here.  Do not
specify additional compile flags for the test source.
* lib/asan-dg.exp (check_effective_target_fsanitize_address): Do not
specify additional compile flags for the test source.

Patch was tested on x86_64-linux-gnu (CentOS 5.11 and Fedora 23).

OK for mainline?

Uros.
diff --git a/gcc/testsuite/lib/asan-dg.exp b/gcc/testsuite/lib/asan-dg.exp
index 994160e..a1198c0 100644
--- a/gcc/testsuite/lib/asan-dg.exp
+++ b/gcc/testsuite/lib/asan-dg.exp
@@ -20,7 +20,7 @@
 proc check_effective_target_fsanitize_address {} {
 return [check_no_compiler_messages fsanitize_address executable {
int main (void) { return 0; }
-} "-fsanitize=address"]
+}]
 }
 
 proc asan_include_flags {} {
diff --git a/gcc/testsuite/lib/tsan-dg.exp b/gcc/testsuite/lib/tsan-dg.exp
index eb1f3a9..5745fe7 100644
--- a/gcc/testsuite/lib/tsan-dg.exp
+++ b/gcc/testsuite/lib/tsan-dg.exp
@@ -15,12 +15,30 @@
 # .
 
 # Return 1 if compilation with -fsanitize=thread is error-free for trivial
-# code, 0 otherwise.
+# code, 0 otherwise.  Also set what to do by default here.
 
 proc check_effective_target_fsanitize_thread {} {
-return [check_runtime fsanitize_thread {
+global individual_timeout
+global dg-do-what-default
+
+if ![check_no_compiler_messages fsanitize_thread executable {
int main (void) { return 0; }
-} "-fsanitize=thread"]
+}] {
+return 0
+}
+
+# Lower timeout value in case test does not terminate properly.
+set individual_timeout 20
+if [check_runtime_nocache tsan_works {
+   int main () { return 0; }
+}] {
+   set dg-do-what-default run
+} else {
+   set dg-do-what-default link
+}
+unset individual_timeout
+
+return 1
 }
 
 #
@@ -101,22 +119,6 @@ proc tsan_init { args } {
set TEST_ALWAYS_FLAGS "$link_flags -fsanitize=thread -g"
}
 }
-
-set dg-do-what-default run
-if { $link_flags != "" } {
-   global individual_timeout
-
-   # Lower timeout value in case test does not terminate properly.
-   set individual_timeout 20
-   if [check_runtime_nocache tsan_works {
-   int main () { return 0; }
-   } "-fsanitize=thread -g"] {
-   set dg-do-what-default run
-   } else {
-   set dg-do-what-default compile
-   }
-   unset individual_timeout
-}
 }
 
 #


Re: [PATCH, testsuite]: Improve check_effective_target_fsanitize_thread

2016-02-03 Thread Jakub Jelinek
On Wed, Feb 03, 2016 at 02:53:56PM +0100, Uros Bizjak wrote:
> diff --git a/gcc/testsuite/lib/asan-dg.exp b/gcc/testsuite/lib/asan-dg.exp
> index 994160e..a1198c0 100644
> --- a/gcc/testsuite/lib/asan-dg.exp
> +++ b/gcc/testsuite/lib/asan-dg.exp
> @@ -20,7 +20,7 @@
>  proc check_effective_target_fsanitize_address {} {
>  return [check_no_compiler_messages fsanitize_address executable {
>   int main (void) { return 0; }
> -} "-fsanitize=address"]
> +}]
>  }
>  

This is just weird.  What if fsanitize_address effective target is used
outside of asan/ (i.e. without asan_init first) ?

Jakub


Re: [PATCH, testsuite]: Improve check_effective_target_fsanitize_thread

2016-02-03 Thread Uros Bizjak
On Wed, Feb 3, 2016 at 2:59 PM, Jakub Jelinek  wrote:
> On Wed, Feb 03, 2016 at 02:53:56PM +0100, Uros Bizjak wrote:
>> diff --git a/gcc/testsuite/lib/asan-dg.exp b/gcc/testsuite/lib/asan-dg.exp
>> index 994160e..a1198c0 100644
>> --- a/gcc/testsuite/lib/asan-dg.exp
>> +++ b/gcc/testsuite/lib/asan-dg.exp
>> @@ -20,7 +20,7 @@
>>  proc check_effective_target_fsanitize_address {} {
>>  return [check_no_compiler_messages fsanitize_address executable {
>>   int main (void) { return 0; }
>> -} "-fsanitize=address"]
>> +}]
>>  }
>>
>
> This is just weird.  What if fsanitize_address effective target is used
> outside of asan/ (i.e. without asan_init first) ?

It won't work, you also need various link flags that are part of
TEST_ALWAYS_FLAGS.

Uros.


[PATCH] Fix up various issues with missing lhs on calls with addressable return value (PR ipa/69241, PR c++/69649)

2016-02-03 Thread Jakub Jelinek
Hi!

As mentioned in the PR, the expander now requires that calls with
TREE_ADDRESSABLE result type have lhs set (so that a temporary of the
type doesn't have to be created by the middle-end).

The first 3 hunks (gimplify.c and cgraphunit.c) are to fix noreturn
functions with such return types, by making sure they have the lhs.
The ipa-split.c changes force the *.part.N function to have void
return if we don't need the return value from the split function.
If the split part contains any MEM_REFs that refer to the DECL_BY_REFERENCE
RESULT_DECL SSA_NAME, then consider_split will already mark them
split_point->split_part_set_retval, and if we just the SSA_NAME otherwise,
we (missed optimization) give up on fn splitting in that case earlier (could
force split_point->split_part_set_retval instead and even that would be
handled well in this case).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-02-03  Jakub Jelinek  
Patrick Palka  

PR ipa/69241
PR c++/69649
* gimplify.c (gimplify_modify_expr): Set lhs even for noreturn
calls if the return type is TREE_ADDRESSABLE.
* cgraphunit.c (cgraph_node::expand_thunk): Likewise.
* ipa-split.c (split_function): Fix doubled "we" in comment.
Use void return type for the split part even if
!split_point->split_part_set_retval.

* g++.dg/ipa/pr69241-1.C: New test.
* g++.dg/ipa/pr69241-2.C: New test.
* g++.dg/ipa/pr69241-3.C: New test.
* g++.dg/ipa/pr69649.C: New test.

--- gcc/gimplify.c.jj   2016-02-02 20:42:00.0 +0100
+++ gcc/gimplify.c  2016-02-03 11:04:06.400757668 +0100
@@ -4828,7 +4828,8 @@ gimplify_modify_expr (tree *expr_p, gimp
}
}
   notice_special_calls (call_stmt);
-  if (!gimple_call_noreturn_p (call_stmt))
+  if (!gimple_call_noreturn_p (call_stmt)
+ || TREE_ADDRESSABLE (TREE_TYPE (*to_p)))
gimple_call_set_lhs (call_stmt, *to_p);
   assign = call_stmt;
 }
--- gcc/cgraphunit.c.jj 2016-01-20 10:55:15.0 +0100
+++ gcc/cgraphunit.c2016-02-03 11:04:41.034279370 +0100
@@ -1703,7 +1703,8 @@ cgraph_node::expand_thunk (bool output_a
   bsi = gsi_start_bb (bb);
 
   /* Build call to the function being thunked.  */
-  if (!VOID_TYPE_P (restype) && !alias_is_noreturn)
+  if (!VOID_TYPE_P (restype)
+ && (!alias_is_noreturn || TREE_ADDRESSABLE (restype)))
{
  if (DECL_BY_REFERENCE (resdecl))
{
@@ -1770,7 +1771,7 @@ cgraph_node::expand_thunk (bool output_a
  || DECL_BY_REFERENCE (resdecl)))
 gimple_call_set_return_slot_opt (call, true);
 
-  if (restmp && !alias_is_noreturn)
+  if (restmp)
{
   gimple_call_set_lhs (call, restmp);
  gcc_assert (useless_type_conversion_p (TREE_TYPE (restmp),
--- gcc/ipa-split.c.jj  2016-01-04 14:55:52.0 +0100
+++ gcc/ipa-split.c 2016-02-03 13:01:45.905136051 +0100
@@ -1254,7 +1254,7 @@ split_function (basic_block return_bb, s
   else
main_part_return_p = true;
 }
-  /* The main part also returns if we we split on a fallthru edge
+  /* The main part also returns if we split on a fallthru edge
  and the split part returns.  */
   if (split_part_return_p)
 FOR_EACH_EDGE (e, ei, split_point->entry_bb->preds)
@@ -1364,8 +1364,9 @@ split_function (basic_block return_bb, s
   /* Now create the actual clone.  */
   cgraph_edge::rebuild_edges ();
   node = cur_node->create_version_clone_with_body
-(vNULL, NULL, args_to_skip, !split_part_return_p, split_point->split_bbs,
- split_point->entry_bb, "part");
+(vNULL, NULL, args_to_skip,
+ !split_part_return_p || !split_point->split_part_set_retval,
+ split_point->split_bbs, split_point->entry_bb, "part");
 
   node->split_part = true;
 
--- gcc/testsuite/g++.dg/ipa/pr69241-1.C.jj 2016-02-03 10:56:10.624328263 
+0100
+++ gcc/testsuite/g++.dg/ipa/pr69241-1.C2016-02-03 11:01:18.600075039 
+0100
@@ -0,0 +1,12 @@
+// PR ipa/69241
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct R { R (const R &) {} };
+__attribute__ ((noreturn)) R bar ();
+
+R
+foo ()
+{
+  bar ();
+}
--- gcc/testsuite/g++.dg/ipa/pr69241-2.C.jj 2016-02-03 10:56:07.996364556 
+0100
+++ gcc/testsuite/g++.dg/ipa/pr69241-2.C2016-02-03 11:01:42.958738639 
+0100
@@ -0,0 +1,18 @@
+// PR ipa/69241
+// { dg-do compile }
+// { dg-options "-O2" }
+
+__attribute__((noreturn)) void foo (int);
+struct R { R (const R &) {} };
+
+R
+bar ()
+{
+  foo (0);
+}
+
+R
+baz ()
+{
+  foo (0);
+}
--- gcc/testsuite/g++.dg/ipa/pr69241-3.C.jj 2016-02-03 11:00:39.840610317 
+0100
+++ gcc/testsuite/g++.dg/ipa/pr69241-3.C2016-02-03 11:01:02.044303678 
+0100
@@ -0,0 +1,12 @@
+// PR ipa/69241
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct R { int x[100]; };
+__attribute__ ((noreturn)) R bar ();
+
+void
+foo ()
+{
+  bar ();
+}
--- gcc/testsuite/g++.dg/ipa/pr696

Re: [PATCH, testsuite]: Improve check_effective_target_fsanitize_thread

2016-02-03 Thread Jakub Jelinek
On Wed, Feb 03, 2016 at 03:12:27PM +0100, Uros Bizjak wrote:
> On Wed, Feb 3, 2016 at 2:59 PM, Jakub Jelinek  wrote:
> > On Wed, Feb 03, 2016 at 02:53:56PM +0100, Uros Bizjak wrote:
> >> diff --git a/gcc/testsuite/lib/asan-dg.exp b/gcc/testsuite/lib/asan-dg.exp
> >> index 994160e..a1198c0 100644
> >> --- a/gcc/testsuite/lib/asan-dg.exp
> >> +++ b/gcc/testsuite/lib/asan-dg.exp
> >> @@ -20,7 +20,7 @@
> >>  proc check_effective_target_fsanitize_address {} {
> >>  return [check_no_compiler_messages fsanitize_address executable {
> >>   int main (void) { return 0; }
> >> -} "-fsanitize=address"]
> >> +}]
> >>  }
> >>
> >
> > This is just weird.  What if fsanitize_address effective target is used
> > outside of asan/ (i.e. without asan_init first) ?
> 
> It won't work, you also need various link flags that are part of
> TEST_ALWAYS_FLAGS.

The patch is ok then.

Jakub


[gomp4] Fix use of declare'd vars by routine procedures.

2016-02-03 Thread James Norris



I've backported this patch from trunk to gomp-4_0-branch. This patch
updates a previous patch to gomp4 in dealing with variables used
within a routine procedure.

Reference:

https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01231.html

Thanks,
Jim


 ChangeLog entries...

gcc/
* gimplify.c (omp_notice_variable): Add usage check.

gcc/testsuite/
* c-c++-common/goacc/routine-5.c: Add tests.

gcc/c/
* c-typeck.c (build_external_ref): Remove usage check.

gcc/cp/
* semantics.c (finish_id_expression): Remove usage check.
Index: gcc/ChangeLog.gomp
===
--- gcc/ChangeLog.gomp	(revision 233104)
+++ gcc/ChangeLog.gomp	(working copy)
@@ -1,3 +1,9 @@
+2016-02-03  James Norris  
+	* gimplify.c (omp_notice_variable): Add usage check.
+
 2016-01-22  Nathan Sidwell  
 
 	* omp-low.c (struct oacc_loop): Add 'inner' field.
Index: gcc/c/ChangeLog.gomp
===
--- gcc/c/ChangeLog.gomp	(revision 233104)
+++ gcc/c/ChangeLog.gomp	(working copy)
@@ -1,3 +1,7 @@
+2016-02-03  James Norris  
+
+	* c-typeck.c (build_external_ref): Remove usage check.
+
 2016-01-14  James Norris  
 
 	* c-parser.c (c_finish_oacc_routine): Remove attribute.
Index: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c	(revision 233104)
+++ gcc/c/c-typeck.c	(working copy)
@@ -2677,26 +2677,6 @@
   tree ref;
   tree decl = lookup_name (id);
 
-  if (decl
-  && decl != error_mark_node
-  && current_function_decl
-  && TREE_CODE (decl) == VAR_DECL
-  && is_global_var (decl)
-  && get_oacc_fn_attrib (current_function_decl))
-{
-  /* Validate data type for use with routine directive.  */
-  if (lookup_attribute ("omp declare target link",
-			DECL_ATTRIBUTES (decl))
-	  || ((!lookup_attribute ("omp declare target",
-  DECL_ATTRIBUTES (decl))
-	   && ((TREE_STATIC (decl) && !DECL_EXTERNAL (decl))
-		   || (!TREE_STATIC (decl) && DECL_EXTERNAL (decl))
-	{
-	  error_at (loc, "invalid use in % function");
-	  return error_mark_node;
-	}
-}
-
   /* In Objective-C, an instance variable (ivar) may be preferred to
  whatever lookup_name() found.  */
   decl = objc_lookup_ivar (decl, id);
Index: gcc/cp/ChangeLog.gomp
===
--- gcc/cp/ChangeLog.gomp	(revision 233104)
+++ gcc/cp/ChangeLog.gomp	(working copy)
@@ -1,3 +1,7 @@
+2016-02-03  James Norris  
+
+	* semantics.c (finish_id_expression): Remove usage check.
+
 2016-01-20  Cesar Philippidis  
 
 	* parser.c (cp_parser_oacc_all_clauses): Call finish_omp_clauses
Index: gcc/cp/semantics.c
===
--- gcc/cp/semantics.c	(revision 233104)
+++ gcc/cp/semantics.c	(working copy)
@@ -3712,25 +3712,6 @@
 
 	  decl = convert_from_reference (decl);
 	}
-
-  if (decl != error_mark_node
-	  && current_function_decl
-	  && TREE_CODE (decl) == VAR_DECL
-	  && is_global_var (decl)
-  && get_oacc_fn_attrib (current_function_decl))
-	{
-	  /* Validate data type for use with routine directive.  */
-	  if (lookup_attribute ("omp declare target link",
-DECL_ATTRIBUTES (decl))
-	  || ((!lookup_attribute ("omp declare target",
-  DECL_ATTRIBUTES (decl))
-		   && ((TREE_STATIC (decl) && !DECL_EXTERNAL (decl))
-			|| (!TREE_STATIC (decl) && DECL_EXTERNAL (decl))
-	{
-	  *error_msg = "invalid use in % function";
-	  return error_mark_node;
-	}
-	}
 }
 
   return cp_expr (decl, location);
Index: gcc/gimplify.c
===
--- gcc/gimplify.c	(revision 233104)
+++ gcc/gimplify.c	(working copy)
@@ -6095,9 +6095,9 @@
   if (ctx->region_type == ORT_NONE)
 return lang_hooks.decls.omp_disregard_value_expr (decl, false);
 
-  /* Threadprivate variables are predetermined.  */
   if (is_global_var (decl))
 {
+  /* Threadprivate variables are predetermined.  */
   if (DECL_THREAD_LOCAL_P (decl))
 	return omp_notice_threadprivate_variable (ctx, decl, NULL_TREE);
 
@@ -6108,6 +6108,30 @@
 	  if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
 	return omp_notice_threadprivate_variable (ctx, decl, value);
 	}
+
+  if (gimplify_omp_ctxp->outer_context == NULL
+	  && VAR_P (decl)
+	  && get_oacc_fn_attrib (current_function_decl))
+	{
+	  location_t loc = DECL_SOURCE_LOCATION (decl);
+
+	  if (lookup_attribute ("omp declare target link",
+DECL_ATTRIBUTES (decl)))
+	{
+	  error_at (loc,
+			"%qE with % clause used in % function",
+			DECL_NAME (decl));
+	  return false;
+	}
+	  else if (!lookup_attribute ("omp declare target",
+  DECL_ATTRIBUTES (decl)))
+	{
+	  error_at (loc,
+			"%qE requires a % directive for use "
+			"in a % function", DECL_NAME (decl));
+	  ret

Re: [PATCH, testsuite]: Improve check_effective_target_fsanitize_thread

2016-02-03 Thread Eric Botcazou
> Attached patch improves detection of working -fsanitize=thread option.
> Check for working -fsanitize=thread option timeouts with older glibcs,
> so tsan_init detects this case and sets default compile flags to
> compile.
> 
> Recently Eric changed check_effective_target_fsanitize_thread to a
> runtime test, and we *again* unnecessarily waste 5 minutes of test
> time here, waiting for a test to timeout.

Well, if you don't want someone else to do it again, you'd better adjust the 
comments as well because, as Jakub put it, the whole stuff is rather weird.

-- 
Eric Botcazou


Re: [wwwdocs] Update changes.html for LTO and IPA

2016-02-03 Thread Jonathan Wakely

On 19/01/16 16:45 +0100, Jan Hubicka wrote:

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.46
diff -u -r1.46 changes.html
--- changes.html22 Dec 2015 19:23:31 -  1.46
+++ changes.html19 Jan 2016 15:42:56 -
@@ -43,6 +43,64 @@
of array bounds.  In particular, it enables
-fsanitize=bounds as well as instrumentation of
flexible array member-like arrays.
+Type based alias analysis now disambiguate accesses to different
+   pointers. This improve precision of the alias oracle by about 20-30%
+   on higher-level C++ programs. Programs doing invalid type punning
+   of pointer types may now need -fno-strict-aliasing
+   to work correctly.
+Alias oracle now correctly supports weakref and
+   alias attributes. This makes it possible to access
+   both variable and its alias in one translation unit which is common
+   with link-time optimization.
+Value range propagation now assume that this pointer
+   of C++ methods is non-NULL.  This eliminates many NULL pointer checks


s/of C++ methods/in C++ member functions/


+   but also breaks some non-conforming code-bases (such as Qt-5, Chromium,
+   KDevelop). As a termporary work-around


s/termporary/temporary/



RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-03 Thread Claudiu Zissulescu
First, I will split this patch in two. The first part will deal with the FPU
instructions. The second patch will try to address a new ABI optimized for
odd-even registers; the comments on -mabi=optimized are numerous and I need
to prepare a careful answer.
The remainder of this email focuses on the FPU patch.

> +  case EQ:
> +  case NE:
> +  case UNORDERED:
> +  case UNLT:
> +  case UNLE:
> +  case UNGT:
> +  case UNGE:
> +   return CC_FPUmode;
> +
> +  case LT:
> +  case LE:
> +  case GT:
> +  case GE:
> +  case ORDERED:
> +   return CC_FPUEmode;
> 
> cse and other code transformations are likely to do better if you use
> just one mode for these.  It is also very odd to have comparisons and their
> inverse use different modes.  Have you done any benchmarking for this?

Right, ORDERED should be in CC_FPUmode. The CC_FPU/CC_FPUE modes are inspired
by the arm port. The reason for having the two modes, CC_FPU and CC_FPUE, is
to be able to emit signaling FPU compare instructions.  We can use a single
CC_FPU mode here instead of two, but we may lose functionality.
Regarding benchmarks, I do not have an established benchmark for this;
however, as far as I can see, the code generated for the FPU looks clean.
Please let me know whether it is acceptable to go on with CC_FPU/CC_FPUE plus
the ORDERED fix, or whether you prefer a single mode.
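
A minimal sketch of the mode selection with ORDERED moved to the quiet
group, as agreed above. The enum names (cmp_code, cc_mode, CMP_*) and the
function name are hypothetical stand-ins for GCC's rtx comparison codes and
the ARC CC modes; this is not the code from the patch.

```c
/* Hypothetical stand-ins for the rtx comparison codes used in the hunk.  */
enum cmp_code
{
  CMP_EQ, CMP_NE, CMP_UNORDERED, CMP_ORDERED,
  CMP_UNLT, CMP_UNLE, CMP_UNGT, CMP_UNGE,
  CMP_LT, CMP_LE, CMP_GT, CMP_GE
};

/* Hypothetical stand-ins for CC_FPUmode and CC_FPUEmode.  */
enum cc_mode { CC_FPU, CC_FPUE };

/* Pick the CC mode for a floating-point comparison.  Only the ordered
   relational codes use the signaling (CC_FPUE) mode; everything else,
   including ORDERED, uses the quiet (CC_FPU) mode.  */
enum cc_mode
select_fpu_cc_mode (enum cmp_code code)
{
  switch (code)
    {
    case CMP_LT:
    case CMP_LE:
    case CMP_GT:
    case CMP_GE:
      return CC_FPUE;

    case CMP_EQ:
    case CMP_NE:
    case CMP_UNORDERED:
    case CMP_ORDERED:
    case CMP_UNLT:
    case CMP_UNLE:
    case CMP_UNGT:
    case CMP_UNGE:
    default:
      return CC_FPU;
    }
}
```

With a single mode, the function would simply return CC_FPU for every code,
at the cost of never selecting the signaling compare.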

> +  /* ARCHS has 64-bit data-path which makes use of the even-odd paired
> + registers.  */
> +  if (TARGET_HS)
> +{
> +  for (regno = 1; regno < 32; regno +=2)
> +   {
> + arc_hard_regno_mode_ok[regno] = S_MODES;
> +   }
> +}
> +
> 
> Does TARGET_HS with -mabi=default allow for passing DFmode / DImode
> arguments
> in odd registers?  I fear you might run into reload trouble when trying to
> access the values.

Although I have not run into this issue so far, I cannot say that it will not
happen. Hence, I would create a new register class to hold the odd-even
registers, and then the above code will not be needed. What do you say?

> still in "subdf3":
> +  else if (TARGET_FP_DOUBLE)
> 
> So this implies that both (TARGET_DPFP) and (TARGET_FP_DOUBLE) might
> be
> true at
> the same time.  In that case, so we really want to prefer the
> (TARGET_DPFP) expansion?

TARGET_DPFP (FPX instructions) and TARGET_FP_DOUBLE (FPU) are mutually
exclusive. There should be a check for this case in the arc_init () function.

> +(define_insn "*cmpsf_trap_fpu"
> 
> That name makes as little sense to me as having two separate modes
> CC_FPU and CC_FPUE
> for positive / negated usage and having two comparison patterns pre
> precision that
> do the same but pretend to be dissimilar.
> 

The F{S/D}CMPF instruction is similar to the F{S/D}CMP instruction in cases
when either of the instruction operands is a signaling NaN. The FxCMPF
instruction updates the invalid flag (FPU_STATUS.IV) when either of the
operands is a quiet or signaling NaN, whereas the FxCMP instruction updates
the invalid flag (FPU_STATUS.IV) only when either of the operands is a
signaling NaN. We need to use FxCMPF only if we keep both CC_FPU and CC_FPUE;
otherwise, we shall use only the FxCMP instruction.

> Also, the agglomeration of S/D with FU{S,Z}ED is confusing.  Could you
> spare another underscore? 

Is this better?

#define TARGET_FP_SP_BASE   ((arc_fpu_build & FPU_SP) != 0)
#define TARGET_FP_DP_BASE   ((arc_fpu_build & FPU_DP) != 0)
#define TARGET_FP_SP_FUSED  ((arc_fpu_build & FPU_SF) != 0)
#define TARGET_FP_DP_FUSED  ((arc_fpu_build & FPU_DF) != 0)
#define TARGET_FP_SP_CONV   ((arc_fpu_build & FPU_SC) != 0)
#define TARGET_FP_DP_CONV   ((arc_fpu_build & FPU_DC) != 0)
#define TARGET_FP_SP_SQRT   ((arc_fpu_build & FPU_SD) != 0)
#define TARGET_FP_DP_SQRT   ((arc_fpu_build & FPU_DD) != 0)
#define TARGET_FP_DP_AX ((arc_fpu_build & FPX_DP) != 0)

Thanks,
Claudiu


Default compute dimensions (runtime)

2016-02-03 Thread Nathan Sidwell

Jakub,
this is the runtime side of default compute dimension support.

1) extend the -fopenacc-dim=X:Y:Z syntax to allow '-' indicating a runtime 
choice.  (0 also indicates that, but I thought best to have an explicit syntax 
as well).


2) New plugin helper 'GOMP_PLUGIN_acc_default_dims' that parses a 
GOMP_OPENACC_DIM environment variable.  The syntax here is the same as that for 
the -fopenacc-dim option -- except '-' isn't permitted.  I have future-proofed 
the interface by including a plugin tag parameter.  This will permit 
device_type support.


3) the plugin itself lazily calls GOMP_PLUGIN_acc_default_dims when it sees an 
unspecified dimension.  Validates the default dimensions and then plugs them 
into the launch parameters.
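
A rough sketch of how a "gang:worker:vector" string could be parsed under
the rules in the three points above. The function name parse_dims, the
allow_dash flag and the DIM_* names are hypothetical; the encoding (-1 for
an omitted size, 0 for '-') follows the values used in the omp-low.c hunk
of this patch, but this is not the patch's implementation.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical names for the three compute dimensions.  */
enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, DIM_MAX };

/* Parse STR as up to three ':'-separated sizes into DIMS.
   An omitted size becomes -1 (target default), '-' becomes 0 (decide at
   runtime) and is accepted only when ALLOW_DASH is nonzero.
   Returns 0 on success and -1 if STR is malformed.  */
int
parse_dims (const char *str, int dims[DIM_MAX], int allow_dash)
{
  const char *pos = str;

  for (int ix = 0; ix < DIM_MAX; ix++)
    dims[ix] = -1;

  for (int ix = 0; *pos && ix < DIM_MAX; ix++)
    {
      if (ix > 0)
	{
	  /* Every size after the first must be preceded by ':'.  */
	  if (*pos != ':')
	    return -1;
	  pos++;
	}

      if (*pos == '-')
	{
	  if (!allow_dash)
	    return -1;
	  dims[ix] = 0;
	  pos++;
	}
      else if (*pos != ':' && *pos != '\0')
	{
	  char *end;

	  errno = 0;
	  long val = strtol (pos, &end, 10);
	  if (errno || end == pos || val <= 0 || val > INT_MAX)
	    return -1;
	  dims[ix] = (int) val;
	  pos = end;
	}
    }

  /* Trailing characters after the third size are an error.  */
  return *pos ? -1 : 0;
}
```

The compile-time option would call this with allow_dash set, while the
environment variable parser would clear it, since '-' isn't permitted
there.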


The testcase reuses the compile-time testcase by breaking its core out into a
header file and explicitly setting the environment variable before first
launch.  The original testcase also explicitly sets the environment variable,
to make sure it's not being considered.


There doesn't seem to be a mechanism for warning messages -- only debug ones
or fatal errors.  I'm not sure what the best approach to handling errors in
the env var parsing is, and ducked out by silently ignoring problems (the
plugin will then provide fallback values).


ok?

nathan
2016-02-03  Nathan Sidwell  

	gcc/
	* doc/invoke.texi (fopenacc-dim): Document runtime support.
	* omp-low.c (oacc_parse_default_dims): Add runtime support.

	libgomp/
	* libgomp.map (GOMP_PLUGIN_acc_default_dims): New.
	* oacc-parallel.c (GOACC_parallel_keyed): Zero initialize dims.
	* oacc-plugin.c (GOMP_PLUGIN_acc_default_dims): New.
	* oacc-plugin.h (GOMP_PLUGIN_acc_default_dims): Declare.
	* plugin/plugin-nvptx.c (nvptx_exec): Add support for runtime
	default dimensions.
	* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c: Breakout
	body to and #include ...
	* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.h: ... this.
	* testsuite/libgomp.oacc-c-c++-common/loop-dim-default-2.c: New.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 233084)
+++ gcc/doc/invoke.texi	(working copy)
@@ -1969,7 +1969,12 @@ have support for @option{-pthread}.
 Specify default compute dimensions for parallel offload regions that do
 not explicitly specify.  The @var{geom} value is a triple of
 ':'-separated sizes, in order 'gang', 'worker' and, 'vector'.  A size
-can be omitted, to use a target-specific default value.
+can be omitted, to use a target-specific default value. Use '-' to defer
+the size determination until execution.  In that case, the environment
+variable @var{GOMP_OPENACC_DIM} should be set.  It has the same format
+as the option value, except that '-' is not permitted.  If it is unset,
+a target-specific value is chosen. Runtime and compile-time values can
+be freely mixed.
 
 @item -fopenmp
 @opindex fopenmp
Index: gcc/omp-low.c
===
--- gcc/omp-low.c	(revision 233084)
+++ gcc/omp-low.c	(working copy)
@@ -20275,9 +20275,14 @@ oacc_parse_default_dims (const char *dim
 	  pos++;
 	}
 
-	  if (*pos != ':')
+	  long val = -1;
+	  if (*pos == '-')
+	{
+	  pos++;
+	  val = 0;
+	}
+	  else if (*pos != ':')
 	{
-	  long val;
 	  const char *eptr;
 
 	  errno = 0;
@@ -20285,8 +20290,8 @@ oacc_parse_default_dims (const char *dim
 	  if (errno || val <= 0 || (int) val != val)
 		goto malformed;
 	  pos = eptr;
-	  oacc_default_dims[ix] = (int) val;
 	}
+	  oacc_default_dims[ix] = (int) val;
 	}
   if (*pos)
 	{
Index: libgomp/libgomp.map
===
--- libgomp/libgomp.map	(revision 233084)
+++ libgomp/libgomp.map	(working copy)
@@ -411,4 +411,5 @@ GOMP_PLUGIN_1.0 {
 GOMP_PLUGIN_1.1 {
   global:
 	GOMP_PLUGIN_target_task_completion;
+	GOMP_PLUGIN_acc_default_dims;
 } GOMP_PLUGIN_1.0;
Index: libgomp/oacc-parallel.c
===
--- libgomp/oacc-parallel.c	(revision 233084)
+++ libgomp/oacc-parallel.c	(working copy)
@@ -103,6 +103,7 @@ GOACC_parallel_keyed (int device, void (
   return;
 }
 
+  memset (dims, 0, sizeof (dims));
   va_start (ap, kinds);
   /* TODO: This will need amending when device_type is implemented.  */
   while ((tag = va_arg (ap, unsigned)) != 0)
Index: libgomp/oacc-plugin.c
===
--- libgomp/oacc-plugin.c	(revision 233084)
+++ libgomp/oacc-plugin.c	(working copy)
@@ -29,6 +29,9 @@
 #include "libgomp.h"
 #include "oacc-plugin.h"
 #include "oacc-int.h"
+#include "gomp-constants.h"
+#include 
+#include 
 
 void
 GOMP_PLUGIN_async_unmap_vars (void *ptr)
@@ -46,3 +49,41 @@ GOMP_PLUGIN_acc_thread (void)
   struct goacc_thread *thr = goacc_thread ();
   return thr ? thr->target_tls : NULL;
 }
+
+/* Determine runtime default compute dimensions fr

Re: [C++ PATCH] Fix -Wunused-function (PR debug/66869)

2016-02-03 Thread Jason Merrill

On 01/29/2016 01:30 PM, Jakub Jelinek wrote:

On Fri, Jan 29, 2016 at 11:35:07AM +0100, Jakub Jelinek wrote:

I can try to stick there an assert whether for FUNCTION_DECL
(DECL_INITIAL (decl) == 0) == DECL_EXTERNAL (decl).


Tried that, but cancelled that quickly, I see lots of cases where
DECL_INITIAL is non-NULL, but DECL_EXTERNAL is set, and some
where DECL_INITIAL is NULL, and DECL_EXTERNAL is not set,
at least in the other two spots (check_global_declaration in cgraphunit.c
and c-decl.c).  Haven't waited long enough to find out if the C++ FE is some
exception.


My thought was that if DECL_INITIAL is non-null, the function is 
defined, so it seems odd to warn about a lack of definition.


Jason




Re: [patch, c++] delete "com_interface" attribute

2016-02-03 Thread Jason Merrill

OK.

Jason


Re: [PATCH] Fix PR c++/69056 (argument pack deduction failure during overload resolution)

2016-02-03 Thread Jason Merrill

OK.

Jason


Re: [PATCH] Partially fix PR c++/12277 (Warn on dynamic cast with known NULL results)

2016-02-03 Thread Jason Merrill

On 11/09/2015 04:30 AM, Patrick Palka wrote:

+ if (complain & tf_warning)
+   {
+ if (VAR_P (old_expr))
+   warning (0, "dynamic_cast of %q#D to %q#T can never succeed",
+   old_expr, type);
+ else
+   warning (0, "dynamic_cast of %q#E to %q#T can never succeed",
+   old_expr, type);
+   }
+ return build_zero_cst (type);


You also need to handle throwing bad_cast in the reference case.

Jason




Re: Default compute dimensions (runtime)

2016-02-03 Thread Alexander Monakov
Hello,

On Wed, 3 Feb 2016, Nathan Sidwell wrote:
> 1) extend the -fopenacc-dim=X:Y:Z syntax to allow '-' indicating a runtime
> choice.  (0 also indicates that, but I thought best to have an explicit syntax
> as well).

Does it work when the user specifies one of the dimensions, so that references
to it are subject to constant folding and VRP, but leaves some other dimension
unspecified, and when eventually GOMP_OPENACC_DIM is parsed at runtime, the
runtime-specified value of the first dimension is different from what the
compiler saw, invalidating all folding and propagation?


Here:

+ /* Do some sanity checking.  The CUDA API doesn't appear to
+provide queries to determine these limits.  */
+ if (default_dims[GOMP_DIM_GANG] < 1)
+   default_dims[GOMP_DIM_GANG] = 32;
+ if (default_dims[GOMP_DIM_WORKER] < 1
+ || default_dims[GOMP_DIM_WORKER] > 32)
+   default_dims[GOMP_DIM_WORKER] = 32;
+ default_dims[GOMP_DIM_VECTOR] = 32;

I don't see why you say that because cuDeviceGetAttribute provides 
CU_DEVICE_ATTRIBUTE_WARP_SIZE, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X (which is not too useful for this case) and
cuFuncGetAttribute that allows to get a per-function thread limit.  There's a
patch on gomp-nvptx branch that adds querying some of those to the plugin.
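
As an illustration only, the sanity check quoted above could take the limits
as inputs instead of hard-coding 32. In this sketch warp_size and
max_threads_per_block are plain integers; in the plugin they would
presumably be obtained through cuDeviceGetAttribute. The function name, the
DIM_* names and the fallback choices are assumptions, not code from either
branch.

```c
/* Hypothetical names for the three compute dimensions.  */
enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, DIM_MAX };

/* Clamp runtime default dimensions against device limits.
   WARP_SIZE and MAX_THREADS_PER_BLOCK are assumed to have been queried
   from the device by the caller.  */
void
clamp_default_dims (int dims[DIM_MAX], int warp_size,
		    int max_threads_per_block)
{
  /* Fall back to the constant from the quoted hunk if the query result
     is unusable.  */
  if (warp_size < 1)
    warp_size = 32;

  /* A block holds workers * vector-length threads.  */
  int max_workers = max_threads_per_block / warp_size;
  if (max_workers < 1)
    max_workers = 1;

  /* Same gang fallback as the quoted hunk.  */
  if (dims[DIM_GANG] < 1)
    dims[DIM_GANG] = 32;
  if (dims[DIM_WORKER] < 1 || dims[DIM_WORKER] > max_workers)
    dims[DIM_WORKER] = max_workers;
  dims[DIM_VECTOR] = warp_size;
}
```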

Alexander


Re: Default compute dimensions (runtime)

2016-02-03 Thread Nathan Sidwell

On 02/03/16 11:10, Alexander Monakov wrote:

Hello,

On Wed, 3 Feb 2016, Nathan Sidwell wrote:

1) extend the -fopenacc-dim=X:Y:Z syntax to allow '-' indicating a runtime
choice.  (0 also indicates that, but I thought best to have an explicit syntax
as well).


Does it work when the user specifies one of the dimensions, so that references
to it are subject to constant folding and VRP, but leaves some other dimension
unspecified, and when eventually GOMP_OPENACC_DIM is parsed at runtime, the
runtime-specified value of the first dimension is different from what the
compiler saw, invalidating all folding and propagation?


You can only override at runtime those dimensions that you said you'd override 
at runtime when you compiled your program.



I don't see why you say that because cuDeviceGetAttribute provides
CU_DEVICE_ATTRIBUTE_WARP_SIZE, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X (which is not too useful for this case) and
cuFuncGetAttribute that allows to get a per-function thread limit.  There's a
patch on gomp-nvptx branch that adds querying some of those to the plugin.


thanks.  There doesn't appear to be one for number of physical CTAs though, 
right?

nathan


Re: [PATCH] Fix constexpr evaluation of comparisons involving pointer-to-members

2016-02-03 Thread Jason Merrill

On 12/22/2015 12:07 AM, Patrick Palka wrote:

+  if (code == EQ_EXPR || code == NE_EXPR)
+{
+  if (TREE_CODE (lhs) == PTRMEM_CST && CONSTANT_CLASS_P (rhs))
+   lhs = cplus_expand_constant (lhs);
+  if (TREE_CODE (rhs) == PTRMEM_CST && CONSTANT_CLASS_P (lhs))
+   rhs = cplus_expand_constant (rhs);
+}


If both sides are PTRMEM_CST, we should be able to compare them without 
expanding, using cp_tree_equal.


Jason




[PATCH] PR69619: Fix exponential issue in ccmp.c

2016-02-03 Thread Wilco Dijkstra
This patch fixes an exponential issue in ccmp.c.  When deciding which ccmp
expansion to use, the tree nodes gs0 and gs1 are fully expanded twice.  If
they contain more CCMP opportunities, their subtrees are also expanded twice.
When the trees are complex the expansion takes exponential time and memory.
As a workaround in GCC6 compute the cost of the first expansion early, and
only try the alternative expansion if the cost is low enough.  This rarely
affects real code, eg. SPECINT2006 has identical codesize.
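
A toy model of the blow-up described above; it is not GCC code. Each level
stands for a comparison whose operand subtree is itself a CCMP candidate,
and count_expansions and try_both are hypothetical names. Expanding each
subtree twice doubles the work at every level, while expanding it once keeps
the total linear in the depth.

```c
/* Count how many expansions a chain of DEPTH nested candidates needs.
   If TRY_BOTH is nonzero, every level expands its subtree twice (once
   per candidate ordering); otherwise it expands it once.  */
unsigned long long
count_expansions (unsigned depth, int try_both)
{
  if (depth == 0)
    return 1;

  unsigned long long sub = count_expansions (depth - 1, try_both);
  return 1 + (try_both ? 2 * sub : sub);
}
```

For a depth of 10 this gives 2047 expansions when both orderings are always
tried and 11 when only one is, which is the gap the cost threshold in the
patch is meant to close for complex trees.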

For GCC7 we should improve the way this works and simplify the backend
interface.  I don't see why the backend should expand tree expressions,
especially when they are not part of the CCMP sequence.

OK for commit?

ChangeLog:
2016-02-03  Wilco Dijkstra  

gcc/
PR target/69619
* ccmp.c (expand_ccmp_expr_1): Avoid evaluating gs0/gs1
twice when complex.

gcc/testsuite/

* gcc.dg/pr69619.c: Add new test.

diff --git a/gcc/ccmp.c b/gcc/ccmp.c
index 9f1ce295554d17c0c3e39676632a07cabe7d5493..dce610488f2d13d6983f3752fb884c8af7ed3bc8 100644
--- a/gcc/ccmp.c
+++ b/gcc/ccmp.c
@@ -183,19 +183,25 @@ expand_ccmp_expr_1 (gimple *g, rtx *prep_seq, rtx *gen_seq)
gimple_assign_rhs1 (gs0),
gimple_assign_rhs2 (gs0));
 
- tmp2 = targetm.gen_ccmp_first (&prep_seq_2, &gen_seq_2, rcode1,
-gimple_assign_rhs1 (gs1),
-gimple_assign_rhs2 (gs1));
-
- if (!tmp && !tmp2)
-   return NULL_RTX;
-
  if (tmp != NULL)
{
  ret = expand_ccmp_next (gs1, code, tmp, &prep_seq_1, &gen_seq_1);
  cost1 = seq_cost (safe_as_a <rtx_insn *> (prep_seq_1), speed_p);
  cost1 += seq_cost (safe_as_a <rtx_insn *> (gen_seq_1), speed_p);
}
+
+ /* FIXME: Temporary workaround for PR69619.
+Avoid exponential compile time due to expanding gs0 and gs1 twice.
+If gs0 and gs1 are complex, the cost will be high, so avoid
+reevaluation if above an arbitrary threshold.  */
+ if ((tmp == NULL) || (cost1 < 100))
+   tmp2 = targetm.gen_ccmp_first (&prep_seq_2, &gen_seq_2, rcode1,
+  gimple_assign_rhs1 (gs1),
+  gimple_assign_rhs2 (gs1));
+
+ if (!tmp && !tmp2)
+   return NULL_RTX;
+
  if (tmp2 != NULL)
{
  ret2 = expand_ccmp_next (gs0, code, tmp2, &prep_seq_2,
diff --git a/gcc/testsuite/gcc.dg/pr69619.c b/gcc/testsuite/gcc.dg/pr69619.c
new file mode 100644
index 
..a200bdf310fc2c02b008e0c13fb9c917784423f8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69619.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, b, c, d;
+int e[100];
+void
+fn1 ()
+{
+  int *f = &d;
+  c = 6;
+  for (; c; c--)
+{
+  b = 0;
+  for (; b <= 5; b++)
+   {
+ short g = e[(b + 2) * 9 + c];
+ *f = *f == a && e[(b + 2) * 9 + c];
+   }
+}
+}



Re: Default compute dimensions (runtime)

2016-02-03 Thread Alexander Monakov
On Wed, 3 Feb 2016, Nathan Sidwell wrote:
> You can only override at runtime those dimensions that you said you'd override
> at runtime when you compiled your program.

Ah, I see.  That's not obvious to me, so perhaps added documentation can be
expanded to explain that?  (I now see that the plugin silently drops
user-provided dimensions where a value recorded at compile time is present;
not sure if that'd be worth a runtime diagnostic, could be very noisy)
 
> > I don't see why you say that because cuDeviceGetAttribute provides
> > CU_DEVICE_ATTRIBUTE_WARP_SIZE, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
> > CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X (which is not too useful for this case)
> > and cuFuncGetAttribute that allows to get a per-function thread limit.
> > There's a patch on gomp-nvptx branch that adds querying some of those to
> > the plugin.
> 
> thanks.  There doesn't appear to be one for number of physical CTAs though,
> right?

Sorry, I don't understand the question: CTA is a logical entity.  One could
derive limit of possible concurrent CTAs from number of SMs
(CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT) multiplied by how many CTAs fit on
one multiprocessor.  The latter figure can be taken as a rough worst-case
value, or semi-intelligent per-kernel estimate based on register limits
(there's code on gomp-nvptx branch that does this), or one can use the cuOcc*
API to ask the driver for a precise per-kernel figure.
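
A back-of-the-envelope sketch of the register-based estimate mentioned
above. All names and parameters are hypothetical, and a real estimate would
also have to account for shared memory and per-SM thread and block limits;
this only shows the arithmetic of SM count times CTAs per SM.

```c
/* Estimate how many CTAs can be resident at once, considering only the
   register file.  Returns 0 if any input is not positive or if a single
   CTA does not fit on one multiprocessor.  */
long
estimate_concurrent_ctas (int sm_count, int regs_per_sm,
			  int regs_per_thread, int threads_per_cta)
{
  if (sm_count <= 0 || regs_per_sm <= 0
      || regs_per_thread <= 0 || threads_per_cta <= 0)
    return 0;

  /* Registers one CTA needs, computed in long to avoid int overflow.  */
  long regs_per_cta = (long) regs_per_thread * threads_per_cta;
  long ctas_per_sm = regs_per_sm / regs_per_cta;

  return (long) sm_count * ctas_per_sm;
}
```

For example, 16 multiprocessors with 65536 registers each, running a kernel
that uses 32 registers per thread with 256 threads per CTA, would give
8 CTAs per multiprocessor and 128 in total under this model.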

Alexander


Re: [PATCH] s390: Add -fsplit-stack support

2016-02-03 Thread Ulrich Weigand
Marcin Kościelnicki wrote:

> Comment fixed, split_stack_marker gone, reorg gone.  Generated code seems 
> sane,
> but testsuite still running.
> 
> I will need to modify the gold patch to handle the "leaf function taking 
> non-split
> stack function address" issue - this will likely require messing with the 
> target
> independent plumbing, the hook for that doesn't seem to get enough params.

Thanks for making those changes; the patch is looking a lot nicer (and
shorter :-)) now!  Just to clarify, your original patch series had two
common-code prerequisite patches (3/5 and 4/5) -- it looks like those may
still be needed?  If so, we'll have to get approval from the appropriate
middle-end maintainers before this patch can go in as well.

As to the back-end patch, I've now only got some cosmetic issues:

> +  insn = emit_insn (gen_main_base_64 (r1, parm_base));

Now that we aren't using the literal pool infrastructure for the block any more,
I guess we shouldn't be using it to load the address either.  Just something
like:
  insn = emit_move_insn (r1, gen_rtx_LABEL_REF (VOIDmode, parm_base));
should do it.

> +(define_insn "split_stack_data"
> +  [(unspec_volatile [(match_operand 0 "" "X")
> +  (match_operand 1 "" "X")
> +  (match_operand 2 "consttable_operand" "X")
> +  (match_operand 3 "consttable_operand" "X")]

And similarly here, just use const_int_operand.

Otherwise, this all looks very good to me.

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] Fix compile/memory hog in the combiner (PR rtl-optimization/69592)

2016-02-03 Thread Bernd Schmidt

On 02/02/2016 09:59 AM, Jakub Jelinek wrote:


I wonder if it wouldn't be better to pass around some structure, containing
for the common case fixed size cache and perhaps fall back to hash_map if
there are more calls to cache than that.  Plus perhaps a recursion depth, so
that we avoid other pathological cases.


I had the same thought. Maybe for stage1?


Bernd



Re: [PATCH] Fix compile/memory hog in the combiner (PR rtl-optimization/69592)

2016-02-03 Thread Jakub Jelinek
On Wed, Feb 03, 2016 at 06:07:25PM +0100, Bernd Schmidt wrote:
> On 02/02/2016 09:59 AM, Jakub Jelinek wrote:
> 
> >I wonder if it wouldn't be better to pass around some structure, containing
> >for the common case fixed size cache and perhaps fall back to hash_map if
> >there are more calls to cache than that.  Plus perhaps a recursion depth, so
> >that we avoid other pathological cases.
> 
> I had the same thought. Maybe for stage1?

Yeah.

Jakub


[PATCH] Re: [wwwdocs] Add common C++ issues to /gcc-6/porting_to.html

2016-02-03 Thread David Malcolm
On Tue, 2016-02-02 at 20:36 +, Jonathan Wakely wrote:

I had some difficulty reading the new section; mostly due to the
leapfrogging of C++11 by the default (my immediate reaction was "why is
it talking about C++11 when the option says GNU++14?")

I'm attaching a patch which I hope clarifies it, for people like me who
aren't experts in C++ standards (with that as a caveat... I'm not an
expert in C++ standards).

Is the attached OK to commit? (it validates)

Some other notes below.

> Index: htdocs/gcc-6/porting_to.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v
> retrieving revision 1.2
> diff -u -r1.2 porting_to.html
> --- htdocs/gcc-6/porting_to.html  27 Jan 2016 14:40:26 -  
> 1.2
> +++ htdocs/gcc-6/porting_to.html  2 Feb 2016 20:32:29 -

[...snip...]

> +Cannot convert 'bool' to 'T*'
> +
> +
> +The current C++ standard only allows integer literals to be used as
> null

Which standard?

> +pointer constants, so other constants such as false and
> +(1 - 1) cannot be used where a null pointer is desired.
> Code that
> +fails to compile with this error should be changed to use
> nullptr,
> +or 0, or NULL.
> +
> +
> +Cannot convert 'std::ostream' to 'bool'
> +
> +
> +Since C++11 iostream classes are no longer implicitly convertible to
> +void* so it is no longer valid to do something like:

Should there be a comma between "C++11" and "iostream" here?  (or maybe
rewrite as: "As of C++11, iostream classes"... ?)

[...snip...]


Hope this is constructive
Dave

Index: htdocs/gcc-6/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v
retrieving revision 1.3
diff -u -p -r1.3 porting_to.html
--- htdocs/gcc-6/porting_to.html	2 Feb 2016 20:34:14 -	1.3
+++ htdocs/gcc-6/porting_to.html	3 Feb 2016 17:04:50 -
@@ -36,8 +36,10 @@ manner. Additions and suggestions for im
 Default standard is now GNU++14
 
 
-GCC defaults to -std=gnu++14 instead of -std=gnu++98.
-This brings several changes that users should be aware of.  The following
+GCC 6 defaults to -std=gnu++14 instead of -std=gnu++98:
+the C++14 standard, plus GNU extensions.
+This brings several changes that users should be aware of, some new with the C++14
+standard, others that appeared with the C++11 standard.  The following
 paragraphs describe some of these changes and suggest how to deal with them.
 
 
@@ -45,6 +47,10 @@ paragraphs describe some of these change
 use the -std=gnu++98 command-line option, perhaps by putting it
 in CXXFLAGS or similar variables in Makefiles.
 
+Alternatively, you might prefer to update to gnu++11, bringing in the C++11
+changes but not the C++14 ones.  If so, use the -std=gnu++11
+command-line option.
+
 Narrowing conversions
 
 


[PATCH] s390: Add -fsplit-stack support

2016-02-03 Thread Marcin Kościelnicki
libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_DATA): New unspec.
(split_stack_prologue): New expand.
(split_stack_space_check): New expand.
(split_stack_data): New insn.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.
---
Changes applied.  Testsuite still running, still works on my simple tests.

As for common code prerequisites: #3 is no longer needed, and very likely
so is #4 (it fixes problems that I've only seen with ESA mode, and testsuite
runs just fine without it now).

 gcc/ChangeLog|  30 ++
 gcc/common/config/s390/s390-common.c |  14 +
 gcc/config/s390/s390-protos.h|   1 +
 gcc/config/s390/s390.c   | 214 +++-
 gcc/config/s390/s390.md  | 138 
 libgcc/ChangeLog |   7 +
 libgcc/config.host   |   4 +-
 libgcc/config/s390/morestack.S   | 609 +++
 libgcc/config/s390/t-stack-s390  |   2 +
 libgcc/generic-morestack.c   |   4 +
 10 files changed, 1016 insertions(+), 7 deletions(-)
 create mode 100644 libgcc/config/s390/morestack.S
 create mode 100644 libgcc/config/s390/t-stack-s390

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 92db764..8e3f9f7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,33 @@
+2016-02-03  Marcin Kościelnicki  
+
+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL): New unspec.
+   (UNSPECV_SPLIT_STACK_DATA): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_space_check): New expand.
+   (split_stack_data): New insn.
+   (split_stack_call): New expand.
+   (split_stack_call_*): New insn.
+   (split_stack_cond_call): New expand.
+   (split_stack_cond_call_*): New insn.
+
 2016-02-03  Kirill Yukhin  
 
PR target/69118
diff --git a/gcc/common/config/s390/s390-common.c b/gcc/common/config/s390/s390-common.c
index 4519c21..1e497e6 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts ATTRIBUTE_UNUSED,
 }
 }
 
+/* -fsplit-stack uses a field in the TCB, available with glibc-2.23.
+   We don't verify it, since earlier versions just have padding at
+   its place, which works just as well.  */
+
+static bool
+s390_supports_split_stack (bool report ATTRIBUTE_UNUSED,
+  struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
 #undef TARGET_DEFAULT_TARGET_FLAGS
 #define TARGET_DEFAULT_TARGET_FLAGS (TARGET_DEFAULT)
 
@@ -117,4 +128,7 @@ s390_handle_option (struct gcc_options *opts ATTRIBUTE_UNUSED,
 #undef TARGET_OPTION

Re: [Patch, fortran] PR66089 fix elemental dependency mishandling

2016-02-03 Thread Mikael Morin

On 03/02/2016 14:00, Paul Richard Thomas wrote:

Dear Mikael,

The patch is OK for trunk.

A small niggle: Although present in the original testcase, 'a' is unused.


Indeed, I'll remove it.


I am not in a position to find out for myself, right now, but does the
testcase of comment #10 work with this patch?


No, it doesn't.  I plan to propose a separate patch for comment #10.

Thanks for the review.


Mikael


Re: [PATCH] s390: Add -fsplit-stack support

2016-02-03 Thread Ulrich Weigand
Marcin Kościelnicki wrote:

> libgcc/ChangeLog:
> 
>   * config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
>   * config/s390/morestack.S: New file.
>   * config/s390/t-stack-s390: New file.
>   * generic-morestack.c (__splitstack_find): Add s390-specific code.
> 
> gcc/ChangeLog:
> 
>   * common/config/s390/s390-common.c (s390_supports_split_stack):
>   New function.
>   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
>   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
>   * config/s390/s390.c (struct machine_function): New field
>   split_stack_varargs_pointer.
>   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
>   in s390_emit_prologue.
>   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
>   vararg pointer.
>   (morestack_ref): New global.
>   (SPLIT_STACK_AVAILABLE): New macro.
>   (s390_expand_split_stack_prologue): New function.
>   (s390_live_on_entry): New function.
>   (s390_va_start): Use split-stack vararg pointer if appropriate.
>   (s390_asm_file_end): Emit the split-stack note sections.
>   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
>   * config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
>   (UNSPECV_SPLIT_STACK_CALL): New unspec.
>   (UNSPECV_SPLIT_STACK_DATA): New unspec.
>   (split_stack_prologue): New expand.
>   (split_stack_space_check): New expand.
>   (split_stack_data): New insn.
>   (split_stack_call): New expand.
>   (split_stack_call_*): New insn.
>   (split_stack_cond_call): New expand.
>   (split_stack_cond_call_*): New insn.
> ---
> Changes applied.  Testsuite still running, still works on my simple tests.
> 
> As for common code prerequisites: #3 is no longer needed, and very likely
> so is #4 (it fixes problems that I've only seen with ESA mode, and testsuite
> runs just fine without it now).

OK, I see.  The patch is OK for mainline then, assuming testing passes.

Thanks again,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] Fix constexpr evaluation of comparisons involving pointer-to-members

2016-02-03 Thread Patrick Palka

On Tue, 2 Feb 2016, Jason Merrill wrote:


On 12/22/2015 12:07 AM, Patrick Palka wrote:

+  if (code == EQ_EXPR || code == NE_EXPR)
+{
+  if (TREE_CODE (lhs) == PTRMEM_CST && CONSTANT_CLASS_P (rhs))
+   lhs = cplus_expand_constant (lhs);
+  if (TREE_CODE (rhs) == PTRMEM_CST && CONSTANT_CLASS_P (lhs))
+   rhs = cplus_expand_constant (rhs);
+}


If both sides are PTRMEM_CST, we should be able to compare them without 
expanding, using cp_tree_equal.


Ah, okay.  Here's an updated patch that uses cp_tree_equal and avoids
calling cplus_expand_constant altogether, by assuming that we should
only expect to compare a PTRMEM_CST against another PTRMEM_CST or
against the constant -1.  I also improved the coverage of the test case.
Does this look reasonable?


gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_binary_expression): Fold equality
comparisons involving PTRMEM_CSTs.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-ptrmem5.C: New test.
---
 gcc/cp/constexpr.c | 21 +++--
 gcc/testsuite/g++.dg/cpp0x/constexpr-ptrmem5.C | 17 +
 2 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-ptrmem5.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index b076991..c5e6642 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1593,7 +1593,7 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, tree t,
bool /*lval*/,
bool *non_constant_p, bool *overflow_p)
 {
-  tree r;
+  tree r = NULL_TREE;
   tree orig_lhs = TREE_OPERAND (t, 0);
   tree orig_rhs = TREE_OPERAND (t, 1);
   tree lhs, rhs;
@@ -1612,7 +1612,24 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, tree t,
   location_t loc = EXPR_LOCATION (t);
   enum tree_code code = TREE_CODE (t);
   tree type = TREE_TYPE (t);
-  r = fold_binary_loc (loc, code, type, lhs, rhs);
+
+  if ((code == EQ_EXPR || code == NE_EXPR))
+{
+  bool is_code_eq = (code == EQ_EXPR);
+
+  if (TREE_CODE (lhs) == PTRMEM_CST
+ && TREE_CODE (rhs) == PTRMEM_CST)
+   r = constant_boolean_node (cp_tree_equal (lhs, rhs) == is_code_eq,
+  type);
+  else if ((TREE_CODE (lhs) == PTRMEM_CST
+   || TREE_CODE (rhs) == PTRMEM_CST)
+  && (integer_minus_onep (lhs)
+  || integer_minus_onep (rhs)))
+   r = constant_boolean_node (code == NE_EXPR, type);
+}
+
+  if (r == NULL_TREE)
+r = fold_binary_loc (loc, code, type, lhs, rhs);
   if (r == NULL_TREE)
 {
   if (lhs == orig_lhs && rhs == orig_rhs)
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-ptrmem5.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-ptrmem5.C
new file mode 100644
index 000..b1318c4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-ptrmem5.C
@@ -0,0 +1,17 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(x) static_assert ((x), #x)
+
+struct X { int a, b; };
+
+void
+foo ()
+{
+  SA (&X::a);
+  SA (&X::a == &X::a);
+  SA (!(&X::a != &X::a));
+  SA (&X::a != &X::b);
+  SA (!(&X::a == &X::b));
+  SA ((!&X::b) == 0);
+  SA (!(&X::b == 0));
+}
--
2.7.0.240.g19e8eb6



Re: [PATCH] Re: [wwwdocs] Add common C++ issues to /gcc-6/porting_to.html

2016-02-03 Thread Jonathan Wakely

On 03/02/16 12:13 -0500, David Malcolm wrote:

On Tue, 2016-02-02 at 20:36 +, Jonathan Wakely wrote:

I had some difficulty reading the new section; mostly due to the
leapfrogging of C++11 by the default (my immediate reaction was "why is
it talking about C++11 when the option says GNU++14?")

I'm attaching a patch which I hope clarifies it, for people like me who
aren't experts in C++ standards (with that as a caveat... I'm not an
expert in C++ standards).

Is the attached OK to commit? (it validates)

Some other notes below.


Index: htdocs/gcc-6/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v
retrieving revision 1.2
diff -u -r1.2 porting_to.html
--- htdocs/gcc-6/porting_to.html27 Jan 2016 14:40:26 -  
1.2
+++ htdocs/gcc-6/porting_to.html2 Feb 2016 20:32:29 -


[...snip...]


+Cannot convert 'bool' to 'T*'
+
+
+The current C++ standard only allows integer literals to be used as
null


Which standard?


The current one :-)

That was deliberate, because formally the C++11 standard _did_ allow
false and (1 - 1) as null pointer constants. That was changed for
C++14, but was the subject of a defect report against C++11 and so G++
implements the C++14 rule even in C++11 mode (as does Clang).

So it would be wrong to say C++11, and misleading to say C++14.

Maybe we should just say "The C++ standard no longer allows ..."
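
To make the porting issue concrete, here is a minimal sketch of the kind of code that trips the "cannot convert 'bool' to 'T*'" error, with the replacement alongside.  The function find_entry is a made-up helper for illustration, not something taken from the porting notes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, for illustration only: returns a pointer to the
// matching entry, or a null pointer when nothing was found.
int *
find_entry (int *table, std::size_t size, int key)
{
  assert (table != nullptr);
  for (std::size_t i = 0; i < size; ++i)
    if (table[i] == key)
      return &table[i];
  // return false;   // old habit: 'false' used as a null pointer constant
  return nullptr;    // replacement; 0 or NULL would work as well
}
```

Only the final return statement changes; callers that compare the result against a null pointer are unaffected.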



+pointer constants, so other constants such as false and
+(1 - 1) cannot be used where a null pointer is desired.
Code that
+fails to compile with this error should be changed to use
nullptr,
+or 0, or NULL.
+
+
+Cannot convert 'std::ostream' to 'bool'
+
+
+Since C++11 iostream classes are no longer implicitly convertible to
+void* so it is no longer valid to do something like:


Should there be a comma between "C++" and "iostream" here?  (or maybe
rewrite as: "As of C++, iostream classes"... ?)


As long as we keep the "11" that would be an improvement, yes.
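
Since the example from the patch is snipped above, here is a small sketch of the iostream case, assuming the usual situation of a function returning bool from a stream.  The helper write_ok is hypothetical and only there to show the explicit conversion.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: writes TEXT and reports whether the stream is
// still usable afterwards.
bool
write_ok (std::ostream &os, const std::string &text)
{
  os << text;
  // return os;                   // relied on the old conversion to void*
  return static_cast<bool> (os);  // explicit conversion, same as !os.fail ()
}
```

Conditions such as `if (os)` keep working unchanged, because the conversion operator is applied contextually there; only places that need an actual bool value have to be adjusted.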


[...snip...]


Hope this is constructive
Dave



Index: htdocs/gcc-6/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v
retrieving revision 1.3
diff -u -p -r1.3 porting_to.html
--- htdocs/gcc-6/porting_to.html2 Feb 2016 20:34:14 -   1.3
+++ htdocs/gcc-6/porting_to.html3 Feb 2016 17:04:50 -
@@ -36,8 +36,10 @@ manner. Additions and suggestions for im
Default standard is now GNU++14


-GCC defaults to -std=gnu++14 instead of -std=gnu++98.
-This brings several changes that users should be aware of.  The following
+GCC 6 defaults to -std=gnu++14 instead of 
-std=gnu++98:
+the C++14 standard, plus GNU extensions.
+This brings several changes that users should be aware of, some new with the 
C++14
+standard, others that appeared with the C++11 standard.  The following
paragraphs describe some of these changes and suggest how to deal with them.



Looks good to me.


@@ -45,6 +47,10 @@ paragraphs describe some of these change
use the -std=gnu++98 command-line option, perhaps by putting it
in CXXFLAGS or similar variables in Makefiles.

+Alternatively, you might prefer to update to gnu++11, bringing in the C++11
+changes but not the C++14 ones.  If so, use the -std=gnu++11
+command-line option.


I see no harm in adding this, although the changes from C++98 to C++11
are huge, and the changes from C++11 to C++14 are tiny, so for the
purposes of porting code to work with GCC 6 it's unlikely to help.



patch to fix PR69461

2016-02-03 Thread Vladimir Makarov

  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69461

  The patch actually solves several issues.  Before the patch, LRA had 
>800 more failures on the GCC testsuite on power8.  After the patch, LRA 
has the same number of failures as reload.


Working on the patch, I think I found a typo in 
rs6000.c::rs6000_legitimate_address_p.  The code that looks suspicious to me:


  if (reg_offset_p && reg_addr[mode].fused_toc
      && toc_fusion_mem_wrapped (x, mode))
    return 1;

The function works with an address (x), but toc_fusion_mem_wrapped expects 
a memory reference rather than an address.  Therefore the function never 
returns 1 for a toc_fusion_wrapped address.


Mike and Peter, what do you think about this code?

Anyway, the patch was successfully bootstrapped and tested on power8.

Committed as rev..

Index: ChangeLog
===
--- ChangeLog	(revision 233106)
+++ ChangeLog	(working copy)
@@ -1,3 +1,12 @@
+2016-02-03  Vladimir Makarov  
+	Alexandre Oliva  
+
+	PR target/69461
+	* lra-constraints.c (simplify_operand_subreg): Check additionally
+	address validity after potential reloading.
+	(process_address_1): Check insns validity.  In case of failure do
+	nothing.
+
 2016-02-03  Kirill Yukhin  
 
 	PR target/69118
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 233106)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2016-02-03  Vladimir Makarov  
+	Alexandre Oliva  
+
+	PR target/69461
+	* gcc.target/powerpc/pr69461.c: New.
+
 2016-02-03  Uros Bizjak  
 
 	* lib/tsan-dg.exp (tsan_init): Move check if tsan executable
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 233106)
+++ lra-constraints.c	(working copy)
@@ -1411,6 +1411,21 @@ simplify_operand_subreg (int nop, machin
 	  || valid_address_p (GET_MODE (subst), XEXP (subst, 0),
 			  MEM_ADDR_SPACE (subst)))
 	return true;
+  else if ((get_constraint_type (lookup_constraint
+ (curr_static_id->operand[nop].constraint))
+		!= CT_SPECIAL_MEMORY)
+	   /* We still can reload address and if the address is
+		  valid, we can remove subreg without reloading its
+		  inner memory.  */
+	   && valid_address_p (GET_MODE (subst),
+   regno_reg_rtx
+   [ira_class_hard_regs
+[base_reg_class (GET_MODE (subst),
+		 MEM_ADDR_SPACE (subst),
+		 ADDRESS, SCRATCH)][0]],
+   MEM_ADDR_SPACE (subst)))
+	return true;
+
   /* If the address was valid and became invalid, prefer to reload
 	 the memory.  Typical case is when the index scale should
 	 correspond the memory.  */
@@ -2958,6 +2973,8 @@ process_address_1 (int nop, bool check_o
 {
   if (ad.index == NULL)
 	{
+	  rtx_insn *insn;
+	  rtx_insn *last = get_last_insn ();
 	  int code = -1;
 	  enum reg_class cl = base_reg_class (ad.mode, ad.as,
 	  SCRATCH, SCRATCH);
@@ -2966,9 +2983,6 @@ process_address_1 (int nop, bool check_o
 	  new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "addr");
 	  if (HAVE_lo_sum)
 	{
-	  rtx_insn *insn;
-	  rtx_insn *last = get_last_insn ();
-
 	  /* addr => lo_sum (new_base, addr), case (2) above.  */
 	  insn = emit_insn (gen_rtx_SET
 (new_reg,
@@ -3004,6 +3018,20 @@ process_address_1 (int nop, bool check_o
 	{
 	  /* addr => new_base, case (2) above.  */
 	  lra_emit_move (new_reg, addr);
+
+	  for (insn = last == NULL_RTX ? get_insns () : NEXT_INSN (last);
+		   insn != NULL_RTX;
+		   insn = NEXT_INSN (insn))
+		if (recog_memoized (insn) < 0)
+		  break;
+	  if (insn != NULL_RTX)
+		{
+		  /* Do nothing if we cannot generate right insns.
+		 This is analogous to reload pass behaviour.  */
+		  delete_insns_since (last);
+		  end_sequence ();
+		  return false;
+		}
 	  *ad.inner = new_reg;
 	}
 	}
Index: testsuite/gcc.target/powerpc/pr69461.c
===
--- testsuite/gcc.target/powerpc/pr69461.c	(revision 0)
+++ testsuite/gcc.target/powerpc/pr69461.c	(working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mlra" } */
+
+extern void _setjmp (void);
+typedef struct {
+  double real;
+  double imag;
+} Py_complex;
+Py_complex a;
+Py_complex fn1();
+Py_complex fn2() { return fn1(); }
+void fn3() {
+  _setjmp();
+  a = fn2();
+}


Re: [Patch, Fortran] PR 69495: unused-label warning does not tell which flag triggered it

2016-02-03 Thread Janus Weil
Hi,

2016-02-03 10:21 GMT+01:00 Manfred Schwarb :
>> here is a diagnostics patch, which makes sure that the responsible
>> flag is printed in several warning messages (for which this was still
>> missing).
>>
>
>if (source_size < result_size)
> -gfc_warning (0, "Intrinsic TRANSFER at %L has partly undefined result: "
> -"source size %ld < result size %ld", &source->where,
> -(long) source_size, (long) result_size);
> +gfc_warning (OPT_Wsurprising, "Intrinsic TRANSFER at %L has partly "
> +"undefined result: source size %ld < result size %ld",
> +&source->where, (long) source_size, (long) result_size);
>
> Breaking apart of these strings will probably hamper translation.

thanks for the comment, I was not aware that this is a problem (in
fact I'm rather ignorant of the translation process as a whole). I was
just trying to comply with the GNU coding standards by avoiding
overlong lines.

So, I assume the problem is that the strings are being broken
*differently* than before, right? (Obviously they were broken already
...) I guess I will just move the start of the warning message to a
new line in order to avoid this.

Btw, if anyone notices any further cases where the flag is missing in
the warning message, please let me know. (I haven't searched through
the whole gfortran code for more such cases and I'm not planning on
doing so, but I'll be happy to include further cases in the patch if
pointed out to me ...)

Also I guess I should mention Manuel and Dominique in the Changelog
(for their supportive comments in the PR).

Cheers,
Janus


Re: [PATCH] Partially fix PR c++/12277 (Warn on dynamic cast with known NULL results)

2016-02-03 Thread Patrick Palka

On Tue, 2 Feb 2016, Jason Merrill wrote:


On 11/09/2015 04:30 AM, Patrick Palka wrote:

+ if (complain & tf_warning)
+   {
+ if (VAR_P (old_expr))
+	warning (0, "dynamic_cast of %q#D to %q#T can never succeed",
+   old_expr, type);
+ else
+	warning (0, "dynamic_cast of %q#E to %q#T can never succeed",
+   old_expr, type);
+   }
+ return build_zero_cst (type);


You also need to handle throwing bad_cast in the reference case.


Oops, fixed.  I also updated the test case to confirm that the expected
number of calls to bad_cast is generated.
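
For anyone who wants to see the runtime behaviour the warning is about, here is a minimal sketch.  The class names mirror the C1/C2 pair from the test case; the two helper functions are made up for illustration.  It relies only on standard C++11 semantics: a failed pointer cast yields a null pointer and a failed reference cast throws std::bad_cast.

```cpp
#include <typeinfo>

struct C1 { virtual ~C1 () { } };
struct C2 final { virtual ~C2 () { } };   // final, and unrelated to C1

// Pointer form: no object can be both a C1 and a C2, so the cast
// always produces a null pointer.
bool
pointer_cast_is_null (C1 *p)
{
  return dynamic_cast<C2 *> (p) == nullptr;
}

// Reference form: there is no null reference, so the failure is
// reported by throwing std::bad_cast.
bool
reference_cast_throws (C1 &r)
{
  try
    {
      (void) dynamic_cast<C2 &> (r);
      return false;
    }
  catch (const std::bad_cast &)
    {
      return true;
    }
}
```

This is why the reference case cannot simply fold to a zero constant the way the pointer case does.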

-- >8 --

gcc/cp/ChangeLog:

PR c++/12277
* rtti.c (build_dynamic_cast_1): Warn on dynamic_cast that can
never succeed due to either the target type or the static type
being marked final.

gcc/testsuite/ChangeLog:

PR c++/12277
* g++.dg/rtti/dyncast8.C: New test.
---
 gcc/cp/rtti.c| 26 ++
 gcc/testsuite/g++.dg/rtti/dyncast8.C | 53 
 2 files changed, 79 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/rtti/dyncast8.C

diff --git a/gcc/cp/rtti.c b/gcc/cp/rtti.c
index a43ff85..b1454d9 100644
--- a/gcc/cp/rtti.c
+++ b/gcc/cp/rtti.c
@@ -694,6 +694,32 @@ build_dynamic_cast_1 (tree type, tree expr, tsubst_flags_t complain)

  target_type = TYPE_MAIN_VARIANT (TREE_TYPE (type));
  static_type = TYPE_MAIN_VARIANT (TREE_TYPE (exprtype));
+
+ if ((CLASSTYPE_FINAL (static_type)
+  && !DERIVED_FROM_P (target_type, static_type))
+ || (CLASSTYPE_FINAL (target_type)
+ && !DERIVED_FROM_P (static_type, target_type)))
+   {
+ if (complain & tf_warning)
+   {
+ if (VAR_P (old_expr))
+   warning (0, "dynamic_cast of %q#D to %q#T can never succeed",
+   old_expr, type);
+ else
+   warning (0, "dynamic_cast of %q#E to %q#T can never succeed",
+   old_expr, type);
+   }
+
+ if (tc == REFERENCE_TYPE)
+   {
+ tree expr = throw_bad_cast ();
+ TREE_TYPE (expr) = type;
+ return expr;
+   }
+ else
+   return build_zero_cst (type);
+   }
+
  td2 = get_tinfo_decl (target_type);
  if (!mark_used (td2, complain) && !(complain & tf_error))
return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/rtti/dyncast8.C b/gcc/testsuite/g++.dg/rtti/dyncast8.C
new file mode 100644
index 000..f63ed5d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/rtti/dyncast8.C
@@ -0,0 +1,53 @@
+// PR c++/12277
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-fdump-tree-original" }
+
+struct A1 { virtual ~A1 () { } };
+struct A2 { virtual ~A2 () { } };
+
+struct B1 { virtual ~B1 () { } };
+struct B2 final : B1 { virtual ~B2 () { } };
+
+struct C1 { virtual ~C1 () { } };
+struct C2 final { virtual ~C2 () { } };
+
+A1 *a1;
+
+B1 *b1;
+B2 *b2;
+
+C1 *c1;
+C2 *c2;
+
+void
+foo (void)
+{
+  {
+A2 *x = dynamic_cast<A2 *> (a1);
+  }
+
+  {
+B2 *x = dynamic_cast<B2 *> (b1);
+// The following cast may throw bad_cast.
+B2 &y = dynamic_cast<B2 &> (*b1);
+  }
+
+  {
+B1 *x = dynamic_cast<B1 *> (b2);
+B1 &y = dynamic_cast<B1 &> (*b2);
+  }
+
+  {
+C2 *x = dynamic_cast<C2 *> (c1); // { dg-warning "can never succeed" }
+// The following cast may throw bad_cast.
+C2 &y = dynamic_cast<C2 &> (*c1); // { dg-warning "can never succeed" }
+  }
+
+  {
+C1 *x = dynamic_cast<C1 *> (c2); // { dg-warning "can never succeed" }
+// The following cast may throw bad_cast.
+C1 &y = dynamic_cast<C1 &> (*c2); // { dg-warning "can never succeed" }
+  }
+}
+
+// { dg-final { scan-tree-dump-times "bad_cast" 3 "original" } }
--
2.7.0.240.g19e8eb6



Re: [PATCH] s390: Add -fsplit-stack support

2016-02-03 Thread Marcin Kościelnicki




The second issue I'm still not sure about is the magic nop marker
for frameless functions.  In an earlier mail you wrote:


Both currently supported
architectures always emit split-stack code on every function.


At least for rs6000 this doesn't appear to be true; in
rs6000_expand_split_stack_prologue we have:

   if (!info->push_p)
 return;

so it does nothing for frameless routines.

Now on i386 we do indeed generate code for frameless routines;
in fact, the *same* full stack check is generated as for any
other routine.  Now I'm wondering: is there a reason why
this check would be necessary (and there's simply a bug in
the rs6000 implementation)?  Then we obviously should do the
same on s390.


Try that on powerpc64(le):

$ cat a.c
#include <stdio.h>

void f(void) {
}

typedef void (*fptr)(void);

fptr g(void);

int main() {
 printf("%p\n", g());
}

$ cat b.c
void f(void);

typedef void (*fptr)(void);

fptr g(void) {
 return f;
}

$ gcc -O3 -fsplit-stack -c b.c
$ gcc -O3 -c a.c
$ gcc a.o b.o -fuse-ld=gold

I don't have a recent enough gcc for powerpc, but from what I've seen in
the code, this should explode with a linker error.

Of course, mixing split-stack and non-split-stack code when function
pointers are involved is sketchy anyway, so what's one more bug...

That said, for s390, we can avoid the above problem by checking the
relocation in gold now that ESA paths are gone - for direct function
calls (the only ones we care about), we should be seeing a relocation in
brasl.  So I'll remove the nopmark thing and add proper recognition in
gold.


Ugh. I take that back.  For -fPIC, the load-address sequence is:

larl    %r1,f@GOTENT
lg      %r2,0(%r1)
br      %r14

And (sibling) call sequence is:

larl    %r1,f@GOTENT
lg      %r1,0(%r1)
br      %r1

It seems there's no proper way to recognize a call vs a load address - 
so we can either go with emitting the marker, or have the same problem 
as on ppc.


So - how much should we care?





On the other hand, if rs6000 works fine *without* any code
in frameless routines, why wouldn't that work for s390 too?

Emitting a nop (that is always executed) still looks weird to me.


Bye,
Ulrich







Re: [PATCH] [ARC] Add single/double IEEE precision FPU support.

2016-02-03 Thread Joern Wolfgang Rennecke



On 03/02/16 15:02, Claudiu Zissulescu wrote:

First, I will split this patch in two. The first part will deal with the FPU 
instructions. The second patch will try to address a new ABI optimized for 
odd-even registers, as the comments for the mabi=optimized are numerous and I 
need to carefully prepare an answer.
The remainder of this email will focus on the FPU patch.


+  case EQ:
+  case NE:
+  case UNORDERED:
+  case UNLT:
+  case UNLE:
+  case UNGT:
+  case UNGE:
+   return CC_FPUmode;
+
+  case LT:
+  case LE:
+  case GT:
+  case GE:
+  case ORDERED:
+   return CC_FPUEmode;

cse and other code transformations are likely to do better if you use
just one mode for these.  It is also very odd to have comparisons and their
inverse use different modes.  Have you done any benchmarking for this?

Right, ORDERED should be in CC_FPUmode. The inspiration for the 
CC_FPU/CC_FPUE modes is the arm port.


I can't see how this code in the arm port can actually work correctly.  When,
for instance, combine simplifies a comparison, the comparison code can change,
and it will use SELECT_CC_MODE to find a new mode for the comparison.  Thus,
whether a comparison traps on qNaNs or not will depend on the whims of combine.
Also, the trapping comparisons don't show the side effect of trapping on
qNaNs, which means they can be speculated.

To make the trapping comparisons safe, they should display the side effect in
the rtl, and only be used when requested by options, type attributes, pragmas
etc.
They could almost be safe to use by default for -ffinite-math-only, except
that when the frontend knows how to tell qNaNs and sNaNs apart, and speculates
a comparison after some integer/fp mixed computation when it can infer that no
sNaN will occur, you could still get an unexpected signal.
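
As a reminder of what the comparison codes in the quoted hunk mean at the source level, here is a small sketch in plain C++.  The function names are made up and simply mirror the rtl codes LT, UNLT and UNORDERED; the sketch only shows the results of the comparisons on NaN operands and deliberately does not look at floating-point exception flags, since which instruction (signalling or quiet) gets emitted is exactly the point under discussion.

```cpp
#include <cmath>

// Ordered "less than" (rtl code LT): false if either operand is a NaN.
bool
lt (double a, double b)
{
  return a < b;
}

// "Unordered or less than" (rtl code UNLT): the logical inverse of GE,
// so it is true when a < b or when either operand is a NaN.
bool
unlt (double a, double b)
{
  return !(a >= b);
}

// UNORDERED: true exactly when at least one operand is a NaN.
bool
unordered (double a, double b)
{
  return std::isunordered (a, b);
}
```

Note that LT and its inverse UNGE (like GE and UNLT here) end up in different groups in the quoted hunk, which is the oddity pointed out above.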

  The reason for having the two CC_FPU and CC_FPUE modes is to emit signaling
  FPU compare instructions.

I don't know if your compare instructions are signalling for quiet NaNs
(I hope they're not), but the mode of the comparison result shouldn't be used
to distinguish that - it's not safe, see above.
The purpose of the mode of the result is to distinguish different
interpretations for the bit patterns inside the comparison result flags.

   We can use a single CC_FPU mode here instead of two, but we may lose
   functionality.

Can you define what that functionality actually is, and show some simple test
code to demonstrate how it works with your port extension?

Regarding benchmarks, I do not have an established benchmark for this; however,
as far as I could see, the code generated for FPU looks clean.
Please let me know if it is acceptable to go with CC_FPU/CC_FPUE, and the
ORDERED fix, further on.

No, there should be a discernible and achievable purpose for comparison modes,
which you have not demonstrated so far for the CC_FPU/CC_FPUE dichotomy.

  Or, to have a single mode.

Yes.



+  /* ARCHS has 64-bit data-path which makes use of the even-odd paired
+ registers.  */
+  if (TARGET_HS)
+{
+  for (regno = 1; regno < 32; regno +=2)
+   {
+ arc_hard_regno_mode_ok[regno] = S_MODES;
+   }
+}
+

Does TARGET_HS with -mabi=default allow for passing DFmode / DImode
arguments
in odd registers?  I fear you might run into reload trouble when trying to
access the values.

Although I haven't bumped into this issue until now, I do not say it may not
happen. Hence, I would create a new register class to hold the odd-even
registers, so the above code will not be needed. What do you say?

That would have been possible a couple of years ago, but these days, all the
constituent registers of a multi-reg hard register have to be in a
constraint's register class for the constraint to match.
You could fudge this by using two classes for two-reg allocations, one with
0,1, 4,5, 8,9 ... the other with 2,3, 6,7, 10,11 ... , but then you need
another pair for four-register allocations, and maybe you want to add various
union and intersection classes, and the register allocators are rather
rubbish when it comes to balancing allocations between classes of similar size
and utility, so you should really try to avoid this.
A way to avoid this is not to give the option of using the old ABI while
enforcing alignment in registers.
Or you could use a different mode for the argument passing when it ends up
unaligned; I suppose BLKmode should work, using a vector to designate the
constituent registers of the function argument.

+(define_insn "*cmpsf_trap_fpu"

That name makes as little sense to me as having two separate modes
CC_FPU and CC_FPUE
for positive / negated usage and having two comparison patterns per
precision that
do the same but pretend to be dissimilar.

The F{S/D}CMPF instruction is similar to the F{S/D}CMP instruction
Oops. I missed the 'f' suffix.  So the "*trap_fpu" patterns really are 
different...

  in cases when either of the instruction opera

Re: [PATCH] [wwwdocs] Add common C++ issues to /gcc-6/porting_to.html

2016-02-03 Thread Mike Stump
On Feb 3, 2016, at 9:13 AM, David Malcolm  wrote:
>> +pointer constants, so other constants such as false and
>> +(1 - 1) cannot be used where a null pointer is desired.

So, I’d leave this out entirely.  The subject is porting, not the fine detail 
pedanticism only a language lawyer could love.  Was this text from a porting 
experience, or an invention based upon compiler/language mods?

Re: [PATCH] fix #69251 - [6 Regression] ICE in unify_array_domain on a flexible array member

2016-02-03 Thread Jason Merrill

On 02/02/2016 08:21 PM, Martin Sebor wrote:

On 02/02/2016 05:28 AM, Jason Merrill wrote:

On 01/25/2016 05:55 PM, Martin Sebor wrote:

The downside of this approach is that it prevents everything but
the front end from distinguishing flexible array members from
arrays of unspecified or unknown bounds.  The immediate impact
is that it prevents us from maintaining ABI compatibility with GCC
5 (with -fabi-version=9) and from diagnosing the mangling change.
This means should we decide to adopt this approach, the final
version of the patch for c++/69277 mentioned above that's still
pending approval will need to be tweaked to have the ABI checks
removed.


That's unfortunate, but I think acceptable.


* decl.c (compute_array_index_type): Return null for flexible array
members.


Instead of this, I would think we can remove the calls to
compute_array_index_type added by your earlier patch, as well as many
other changes from that patch to handle null TYPE_MAX_VALUE.


Yes, that's possible but it didn't seem essential at this stage.
I wanted to make only conservative changes to avoid any further
fallout.  I also wasn't sure whether the ABI issue above would
make this approach unviable.


I guess my sense of conservatism is different from yours: I think 
removing recent changes is conservative in that it minimizes the change 
from previous versions of the compiler.



* tree.c (array_of_runtime_bound_p): Handle gracefully array types
with null TYPE_MAX_VALUE.


This seems unneeded.


(build_ctor_subob_ref): Loosen debug checking to handle flexible
array members.


And this shouldn't need the TYPE_MAX_VALUE check.


I went ahead and made the requested changes.  They might seem
perfectly innocuous to you but the removal of the tests for
TYPE_MAX_VALUE(t) being null makes me nervous at this stage.
I'm not nearly comfortable enough with the code to be confident
that they're all 100% safe.  I defer to your better judgment
on this.


It was impossible to have null TYPE_MAX_VALUE until you introduced that 
in compute_array_index_type, and thus we didn't test for it; if we 
aren't doing that anymore I can't imagine where it would come from now.



@@ -4120,9 +4120,8 @@ walk_subobject_offsets (tree type,
- || !domain
- /* Flexible array members have no upper bound.  */
- || !TYPE_MAX_VALUE (domain))
+ /* Flexible array members have a null domain.  */
+ || !domain)


With this patch flexible array members are a special case of array of 
unknown bound, so I don't think we need to call them out in a comment here.



@@ -875,10 +875,11 @@ dump_type_suffix (cxx_pretty_printer *pp, tree t, int flags)
-  if (TYPE_DOMAIN (t) && TYPE_MAX_VALUE (TYPE_DOMAIN (t)))
+  /* C++ flexible array members have a null domain.  */
+  if (tree dtype = TYPE_DOMAIN (t))


Likewise.

OK with these two comments removed.

Jason



Re: [PATCH] [wwwdocs] Add common C++ issues to /gcc-6/porting_to.html

2016-02-03 Thread Jakub Jelinek
On Wed, Feb 03, 2016 at 10:42:37AM -0800, Mike Stump wrote:
> On Feb 3, 2016, at 9:13 AM, David Malcolm  wrote:
> >> +pointer constants, so other constants such as false and
> >> +(1 - 1) cannot be used where a null pointer is desired.
> 
> So, I’d leave this out entirely.  The subject is porting, not the fine detail 
> pedanticism only a language lawyer could love.  Was this text from a porting 
> experience, or an invention based upon compiler/language mods?

I believe trying to use false as pointer is from porting experience,
(1 - 1) most likely not really used in the wild, but just clarifies what is
and what is not the null pointer constant.

Jakub


Re: [PATCH] Partially fix PR c++/12277 (Warn on dynamic cast with known NULL results)

2016-02-03 Thread Jason Merrill

OK.

Jason


[ARM, PATCH v2 0/2] PR68532: Fix VZIP/VUZP recognition for big endian

2016-02-03 Thread charles . baylis
From: Charles Baylis 

This is an updated patch, which fixes the following issues:
. big endian ICE with vshuf-* tests
. style issues reported by check_GNU_style.sh

This has no regressions with -mfpu=neon, for arm-unknown-linux-gnueabihf and
armeb-unknown-linux-gnueabihf. The new test passes for both, and big endian has
new PASSes for the vshuf-* execution tests, which currently fail on trunk.

The comment about the failures due to failure to vectorize seems to have been
incorrect.

Link to previous thread:
https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00060.html

Charles Baylis (2):
  [ARM] PR68532: Fix up vuzp for big endian
  [ARM] PR68532 Fix up vzip recognition for big endian

 gcc/config/arm/arm.c  | 77 +--
 gcc/config/arm/arm_neon.h | 72 -
 gcc/testsuite/gcc.c-torture/execute/pr68532.c | 24 +
 3 files changed, 122 insertions(+), 51 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr68532.c

-- 
1.9.1



[PATCH 2/2] [ARM] PR68532 Fix up vzip recognition for big endian

2016-02-03 Thread charles . baylis
From: Charles Baylis 

gcc/ChangeLog:

2016-02-03  Charles Baylis  

PR target/68532
* config/arm/arm.c (arm_evpc_neon_vzip): Allow for big endian lane
order.
* config/arm/arm_neon.h (vzipq_s8): Adjust shuffle patterns for big
endian.
(vzipq_s16): Likewise.
(vzipq_s32): Likewise.
(vzipq_f32): Likewise.
(vzipq_u8): Likewise.
(vzipq_u16): Likewise.
(vzipq_u32): Likewise.
(vzipq_p8): Likewise.
(vzipq_p16): Likewise.

Change-Id: I327678f5e73c1de2f413c1d22769ab42ce1d6c16

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e9aa982..24239db 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28318,15 +28318,21 @@ arm_evpc_neon_vzip (struct expand_vec_perm_d *d)
   unsigned int i, high, mask, nelt = d->nelt;
   rtx out0, out1, in0, in1;
   rtx (*gen)(rtx, rtx, rtx, rtx);
+  int first_elem;
+  bool is_swapped;
 
   if (GET_MODE_UNIT_SIZE (d->vmode) >= 8)
 return false;
 
+  is_swapped = BYTES_BIG_ENDIAN ? true : false;
+
   /* Note that these are little-endian tests.  Adjust for big-endian later.  */
+  first_elem = d->perm[neon_endian_lane_map (d->vmode, 0) ^ is_swapped];
+
   high = nelt / 2;
-  if (d->perm[0] == high)
+  if (first_elem == neon_endian_lane_map (d->vmode, high))
 ;
-  else if (d->perm[0] == 0)
+  else if (first_elem == neon_endian_lane_map (d->vmode, 0))
 high = 0;
   else
 return false;
@@ -28334,11 +28340,16 @@ arm_evpc_neon_vzip (struct expand_vec_perm_d *d)
 
   for (i = 0; i < nelt / 2; i++)
 {
-  unsigned elt = (i + high) & mask;
-  if (d->perm[i * 2] != elt)
+  unsigned elt =
+   neon_pair_endian_lane_map (d->vmode, i + high) & mask;
+  if (d->perm[neon_pair_endian_lane_map (d->vmode, 2 * i + is_swapped)]
+ != elt)
return false;
-  elt = (elt + nelt) & mask;
-  if (d->perm[i * 2 + 1] != elt)
+  elt =
+   neon_pair_endian_lane_map (d->vmode, i + nelt + high)
+   & mask;
+  if (d->perm[neon_pair_endian_lane_map (d->vmode, 2 * i + !is_swapped)]
+ != elt)
return false;
 }
 
@@ -28362,10 +28373,9 @@ arm_evpc_neon_vzip (struct expand_vec_perm_d *d)
 
   in0 = d->op0;
   in1 = d->op1;
-  if (BYTES_BIG_ENDIAN)
+  if (is_swapped)
 {
   std::swap (in0, in1);
-  high = !high;
 }
 
   out0 = d->target;
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 2e014b6..aa17f49 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -8453,9 +8453,9 @@ vzipq_s8 (int8x16_t __a, int8x16_t __b)
   int8x16x2_t __rv;
 #ifdef __ARM_BIG_ENDIAN
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x16_t)
-  { 24, 8, 25, 9, 26, 10, 27, 11, 28, 12, 29, 13, 30, 14, 31, 15 });
+  { 20, 4, 21, 5, 22, 6, 23, 7, 16, 0, 17, 1, 18, 2, 19, 3 });
   __rv.val[1] = __builtin_shuffle (__a, __b, (uint8x16_t)
-  { 16, 0, 17, 1, 18, 2, 19, 3, 20, 4, 21, 5, 22, 6, 23, 7 });
+  { 28, 12, 29, 13, 30, 14, 31, 15, 24, 8, 25, 9, 26, 10, 27, 11 });
 #else
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x16_t)
   { 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23 });
@@ -8471,9 +8471,9 @@ vzipq_s16 (int16x8_t __a, int16x8_t __b)
   int16x8x2_t __rv;
 #ifdef __ARM_BIG_ENDIAN
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint16x8_t)
-  { 12, 4, 13, 5, 14, 6, 15, 7 });
+  { 10, 2, 11, 3, 8, 0, 9, 1 });
   __rv.val[1] = __builtin_shuffle (__a, __b, (uint16x8_t)
-  { 8, 0, 9, 1, 10, 2, 11, 3 });
+  { 14, 6, 15, 7, 12, 4, 13, 5 });
 #else
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint16x8_t)
   { 0, 8, 1, 9, 2, 10, 3, 11 });
@@ -8488,8 +8488,8 @@ vzipq_s32 (int32x4_t __a, int32x4_t __b)
 {
   int32x4x2_t __rv;
 #ifdef __ARM_BIG_ENDIAN
-  __rv.val[0] = __builtin_shuffle (__a, __b, (uint32x4_t) { 6, 2, 7, 3 });
-  __rv.val[1] = __builtin_shuffle (__a, __b, (uint32x4_t) { 4, 0, 5, 1 });
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint32x4_t) { 5, 1, 4, 0 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint32x4_t) { 7, 3, 6, 2 });
 #else
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint32x4_t) { 0, 4, 1, 5 });
   __rv.val[1] = __builtin_shuffle (__a, __b, (uint32x4_t) { 2, 6, 3, 7 });
@@ -8502,8 +8502,8 @@ vzipq_f32 (float32x4_t __a, float32x4_t __b)
 {
   float32x4x2_t __rv;
 #ifdef __ARM_BIG_ENDIAN
-  __rv.val[0] = __builtin_shuffle (__a, __b, (uint32x4_t) { 6, 2, 7, 3 });
-  __rv.val[1] = __builtin_shuffle (__a, __b, (uint32x4_t) { 4, 0, 5, 1 });
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint32x4_t) { 5, 1, 4, 0 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint32x4_t) { 7, 3, 6, 2 });
 #else
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint32x4_t) { 0, 4, 1, 5 });
   __rv.val[1] = __builtin_shuffle (__a, __b, (uint32x4_t) { 2, 6, 3, 7 });
@@ -8517,9 +8517,9 @@ vzipq_u8 (uint8x16_t __a, uint8x16_t __b)
   uint8x16x2_t __rv;
 #ifdef __ARM_BIG_ENDIAN
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint

[PATCH 1/2] [ARM] PR68532: Fix up vuzp for big endian

2016-02-03 Thread charles . baylis
From: Charles Baylis 

gcc/ChangeLog:

2016-02-03  Charles Baylis  

PR target/68532
* config/arm/arm.c (neon_endian_lane_map): New function.
(neon_pair_endian_lane_map): New function.
(arm_evpc_neon_vuzp): Allow for big endian lane order.
* config/arm/arm_neon.h (vuzpq_s8): Adjust shuffle patterns for big
endian.
(vuzpq_s16): Likewise.
(vuzpq_s32): Likewise.
(vuzpq_f32): Likewise.
(vuzpq_u8): Likewise.
(vuzpq_u16): Likewise.
(vuzpq_u32): Likewise.
(vuzpq_p8): Likewise.
(vuzpq_p16): Likewise.

gcc/testsuite/ChangeLog:

2015-12-15  Charles Baylis  

PR target/68532
* gcc.c-torture/execute/pr68532.c: New test.

Change-Id: Ifd35d79bd42825f05403a1b96d8f34ef0f21dac3
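
As a reading aid, the lane mapping added by the patch below can be modelled
outside GCC.  This is only a sketch: the function and parameter names are
illustrative, endianness is passed in explicitly instead of using
BYTES_BIG_ENDIAN, and the lane count is assumed to be a power of two.

```cpp
#include <cassert>

// Standalone model of the mapping between architectural lane order and
// GCC lane order.  NELEMS is the number of lanes, VEC_BYTES the vector
// size in bytes.  Names are illustrative, not the GCC ones.
int
lane_map (bool big_endian, int nelems, int vec_bytes, int lane)
{
  assert (lane >= 0 && lane < nelems);
  if (big_endian)
    {
      // Reverse lane order.
      lane = nelems - 1 - lane;
      // Reverse D register order for 128-bit (Q register) vectors.
      if (vec_bytes == 16)
        lane ^= nelems / 2;
    }
  return lane;
}

// Same idea for an index into a pair of vectors: map the lane within
// its vector and keep the bit that selects the second vector.
int
pair_lane_map (bool big_endian, int nelems, int vec_bytes, int lane)
{
  assert (lane >= 0 && lane < 2 * nelems);
  if (big_endian)
    lane = lane_map (true, nelems, vec_bytes, lane & (nelems - 1))
           + (lane & nelems);
  return lane;
}
```

For eight 16-bit lanes in a 128-bit vector, big-endian lanes 0..7 map to
3, 2, 1, 0, 7, 6, 5, 4.  If I read the patch correctly, placing
pair_lane_map (le[a]) at position lane_map (a) for the little-endian
vuzpq_s16 pattern { 0, 2, 4, 6, 8, 10, 12, 14 } gives
{ 5, 7, 1, 3, 13, 15, 9, 11 }, the new big-endian val[0] pattern.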

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d8a2745..e9aa982 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28208,6 +28208,35 @@ arm_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
   arm_expand_vec_perm_1 (target, op0, op1, sel);
 }
 
+/* map lane ordering between architectural lane order, and GCC lane order,
+   taking into account ABI.  See comment above output_move_neon for details.  */
+static int
+neon_endian_lane_map (machine_mode mode, int lane)
+{
+  if (BYTES_BIG_ENDIAN)
+  {
+int nelems = GET_MODE_NUNITS (mode);
+/* Reverse lane order.  */
+lane = (nelems - 1 - lane);
+/* Reverse D register order, to match ABI.  */
+if (GET_MODE_SIZE (mode) == 16)
+  lane = lane ^ (nelems / 2);
+  }
+  return lane;
+}
+
+/* some permutations index into pairs of vectors, this is a helper function
+   to map indexes into those pairs of vectors.  */
+static int
+neon_pair_endian_lane_map (machine_mode mode, int lane)
+{
+  int nelem = GET_MODE_NUNITS (mode);
+  if (BYTES_BIG_ENDIAN)
+lane =
+  neon_endian_lane_map (mode, lane & (nelem - 1)) + (lane & nelem);
+  return lane;
+}
+
 /* Generate or test for an insn that supports a constant permutation.  */
 
 /* Recognize patterns for the VUZP insns.  */
@@ -28218,14 +28247,22 @@ arm_evpc_neon_vuzp (struct expand_vec_perm_d *d)
   unsigned int i, odd, mask, nelt = d->nelt;
   rtx out0, out1, in0, in1;
   rtx (*gen)(rtx, rtx, rtx, rtx);
+  int first_elem;
+  int swap;
 
   if (GET_MODE_UNIT_SIZE (d->vmode) >= 8)
 return false;
 
-  /* Note that these are little-endian tests.  Adjust for big-endian later.  */
-  if (d->perm[0] == 0)
+  /* arm_expand_vec_perm_const_1 () helpfully swaps the operands for the
+ big endian pattern on 64 bit vectors, so we correct for that.  */
+  swap = BYTES_BIG_ENDIAN && !d->one_vector_p
+&& GET_MODE_SIZE (d->vmode) == 8 ? d->nelt : 0;
+
+  first_elem = d->perm[neon_endian_lane_map (d->vmode, 0)] ^ swap;
+
+  if (first_elem == neon_endian_lane_map (d->vmode, 0))
 odd = 0;
-  else if (d->perm[0] == 1)
+  else if (first_elem == neon_endian_lane_map (d->vmode, 1))
 odd = 1;
   else
 return false;
@@ -28233,8 +28270,9 @@ arm_evpc_neon_vuzp (struct expand_vec_perm_d *d)
 
   for (i = 0; i < nelt; i++)
 {
-  unsigned elt = (i * 2 + odd) & mask;
-  if (d->perm[i] != elt)
+  unsigned elt =
+   (neon_pair_endian_lane_map (d->vmode, i) * 2 + odd) & mask;
+  if ((d->perm[i] ^ swap) != neon_pair_endian_lane_map (d->vmode, elt))
return false;
 }
 
@@ -28258,10 +28296,9 @@ arm_evpc_neon_vuzp (struct expand_vec_perm_d *d)
 
   in0 = d->op0;
   in1 = d->op1;
-  if (BYTES_BIG_ENDIAN)
+  if (swap)
 {
   std::swap (in0, in1);
-  odd = !odd;
 }
 
   out0 = d->target;
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 47816d5..2e014b6 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -8741,9 +8741,9 @@ vuzpq_s8 (int8x16_t __a, int8x16_t __b)
   int8x16x2_t __rv;
 #ifdef __ARM_BIG_ENDIAN
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x16_t)
-  { 17, 19, 21, 23, 25, 27, 29, 31, 1, 3, 5, 7, 9, 11, 13, 15 });
+  { 9, 11, 13, 15, 1, 3, 5, 7, 25, 27, 29, 31, 17, 19, 21, 23 });
   __rv.val[1] = __builtin_shuffle (__a, __b, (uint8x16_t)
-  { 16, 18, 20, 22, 24, 26, 28, 30, 0, 2, 4, 6, 8, 10, 12, 14 });
+  { 8, 10, 12, 14, 0, 2, 4, 6, 24, 26, 28, 30, 16, 18, 20, 22 });
 #else
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x16_t)
   { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30 });
@@ -8759,9 +8759,9 @@ vuzpq_s16 (int16x8_t __a, int16x8_t __b)
   int16x8x2_t __rv;
 #ifdef __ARM_BIG_ENDIAN
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint16x8_t)
-  { 9, 11, 13, 15, 1, 3, 5, 7 });
+  { 5, 7, 1, 3, 13, 15, 9, 11 });
   __rv.val[1] = __builtin_shuffle (__a, __b, (uint16x8_t)
-  { 8, 10, 12, 14, 0, 2, 4, 6 });
+  { 4, 6, 0, 2, 12, 14, 8, 10 });
 #else
   __rv.val[0] = __builtin_shuffle (__a, __b, (uint16x8_t)
   { 0, 2, 4, 6, 8, 10, 12, 14 });
@@ -8776,8 +8776,8 @@ vuzpq_s32 (int32x4_t __a, int32x4_t __b)
 {
   int32x4x2_t 

Re: [PATCH] [graphite] document that isl-0.16 is supported

2016-02-03 Thread Mike Stump
On Feb 2, 2016, at 10:29 PM, Sebastian Huber wrote:
> If it is so basic to choose the latest release or the one on the system, then
> why does contrib/download_prerequisites use ancient versions, e.g. the
> six-year-old GMP 4.3.2?

Because no one has seen fit to update it?  I’ll plead ignorance as to why
bumping to the latest release would be bad/wrong.


Re: [PATCH 2/4][AArch64] Increase the loop peeling limit

2016-02-03 Thread Evandro Menezes

On 01/08/16 16:55, Evandro Menezes wrote:

On 12/16/2015 02:11 PM, Evandro Menezes wrote:

On 12/16/2015 05:24 AM, Richard Earnshaw (lists) wrote:

On 15/12/15 23:34, Evandro Menezes wrote:

On 12/14/2015 05:26 AM, James Greenhalgh wrote:

On Thu, Dec 03, 2015 at 03:07:43PM -0600, Evandro Menezes wrote:

On 11/20/2015 05:53 AM, James Greenhalgh wrote:

On Thu, Nov 19, 2015 at 04:04:41PM -0600, Evandro Menezes wrote:

On 11/05/2015 02:51 PM, Evandro Menezes wrote:

2015-11-05  Evandro Menezes 

gcc/

* config/aarch64/aarch64.c
(aarch64_override_options_internal):
Increase loop peeling limit.

This patch increases the limit for the number of peeled insns.
With this change, I noticed no major regression in either
Geekbench v3 or SPEC CPU2000 while some benchmarks, typically FP
ones, improved significantly.

I tested this tuning on Exynos M1 and on A57. ThunderX seems to
benefit from this tuning too.  However, I'd appreciate comments
from other stakeholders.

Ping.

I'd like to leave this for a call from the port maintainers. I can see why
this leads to more opportunities for vectorization, but I'm concerned about
the wider impact on code size. Certainly I wouldn't expect this to be our
default at -O2 and below.

My gut feeling is that this doesn't really belong in the back-end (there are
presumably good reasons why the default for this parameter across GCC has
fluctuated from 400 to 100 to 200 over recent years), but as I say, I'd
like Marcus or Richard to make the call as to whether or not we take this
patch.

Please, correct me if I'm wrong, but loop peeling is enabled only
with loop unrolling (and with PGO).  If so, then extra code size is
not a concern, for this heuristic is only active when unrolling
loops, when code size is already of secondary importance.

My understanding was that loop peeling is enabled from -O2 upwards, and
is also used to partially peel unaligned loops for vectorization (allowing
the vector code to be well aligned), or to completely peel inner loops which
may then become amenable to SLP vectorization.

If I'm wrong then I take back these objections. But I was sure this
parameter was used in a number of situations outside of just
-funroll-loops/-funroll-all-loops. Certainly I remember seeing performance
sensitivities to this parameter at -O3 in some internal workloads I was
analysing.

Vectorization, including SLP, is only enabled at -O3, isn't it?  It
seems to me that peeling is only used by optimizations which already
lead to potential increase in code size.

For instance, with "-Ofast -funroll-all-loops", the total text size for
the SPEC CPU2000 suite is 26.9MB with this proposed change and 26.8MB
without it; with just "-O2", it is the same at 23.1MB regardless of this
setting.

So it seems to me that this proposal should be neutral for up to -O2.

Thank you,


My preference would be to not diverge from the global parameter
settings.  I haven't looked in detail at this parameter but it seems to
me there are two possible paths:

1) We could get agreement globally that the parameter should be 
increased.

2) We could agree that this specific use of the parameter is distinct
from some other uses and deserves a new param in its own right with a
higher value.



Here's what I have observed, not only in AArch64: architectures
benefit differently from certain loop optimizations, especially those
dealing with vectorization, be it because some have plenty of
registers for more aggressive loop unrolling, or because some have
lower costs to vectorize.  With this, I'm trying to suggest that there
may be a case for adjusting this parameter to suit loop optimizations
better to specific targets.  While it is not the only parameter
related to loop optimizations, it seems to be the one with the
desired effects, as exemplified by PPC, S390 and x86 (AOSP).  Though
there is the possibility that they are actually side-effects, as
Richard Biener perhaps implied in another reply.





Gents,

Any new thoughts on this proposal?



Ping?

--
Evandro Menezes



Re: [PATCH 2/4 v2][AArch64] Add support for FCCMP

2016-02-03 Thread Evandro Menezes

On 01/21/16 16:55, Evandro Menezes wrote:

On 01/21/16 16:07, James Greenhalgh wrote:

On Thu, Jan 21, 2016 at 01:58:31PM -0600, Evandro Menezes wrote:

Hi, James.

On 01/21/16 03:24, James Greenhalgh wrote:

On Wed, Jan 06, 2016 at 02:44:47PM -0600, Evandro Menezes wrote:

On 01/06/2016 06:04 AM, Wilco Dijkstra wrote:
Here's what I had in mind when I inquired about distinguishing FCMP from
FCCMP.  As you can see in the patch, Exynos is the only target that
cares about it, but I wonder if ThunderX or Xgene would too.

What do you think?

The new attributes look fine (I've got a similar outstanding change), however
please don't add them to non-AArch64 cores. We only need it for thunderx.md,
cortex-a53.md, cortex-a57.md, xgene1.md and exynos-m1.md.

 Add support for the FCCMP insn types

 2016-01-04  Evandro Menezes 

 gcc/
 * config/aarch64/aarch64.md (fccmp): Change insn type.
 (fccmpe): Likewise.
 * config/aarch64/thunderx.md (thunderx_fcmp): Add
"fccmp{s,d}" types.
 * config/arm/cortex-a53.md (cortex_a53_fpalu): Likewise.
 * config/arm/cortex-a57.md (cortex_a57_fp_cmp): Likewise.
 * config/arm/xgene1.md (xgene1_fcmp): Likewise.
 * config/arm/exynos-m1.md (exynos_m1_fp_ccmp): New insn
reservation.
 * config/arm/types.md (fccmps): Add new insn type.
 (fccmpd): Likewise.

Got it.  Here's an updated patch.  Again, assuming that your
original patch is in place.  Perhaps you can build on it.

If we don't have any targets which care about the fccmps/fccmpd split in
the code base, do we really need it? Can we just follow the example of
fcsel?

The Exynos M1 does care about the difference between FCMP and FCCMP,
as can be seen in the patch.
More explicitly:

(define_insn_reservation "exynos_m1_fp_cmp" 4
   (and (eq_attr "tune" "exynosm1")
(eq_attr "type" "fcmps, fcmpd"))
   "em1_nmisc")

(define_insn_reservation "exynos_m1_fp_ccmp" 7
   (and (eq_attr "tune" "exynosm1")
(eq_attr "type" "fccmps, fccmpd"))
   "em1_st, em1_nmisc")

I think I was unclear. Your exynos-m1 model cares about splitting fcmp[s/d]
and fccmp, but it doesn't care about splitting fccmp into fccmps/fccmpd. It
is the split to fccmps/fccmpd that I think is unnecessary at this time.


Indeed.  However, it seems to me that the jury is still out about the 
{s,d} suffix, isn't it?  Otherwise, whatever others deem better.  I 
myself am agnostic about it.



diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
index 321ff89..daf7162 100644
--- a/gcc/config/arm/types.md
+++ b/gcc/config/arm/types.md
@@ -70,6 +70,7 @@
  ; f_rint[d,s]    double/single floating point rount to integral.
  ; f_store[d,s]   double/single store to memory. Used for VFP unit.
  ; fadd[d,s]      double/single floating-point scalar addition.
+; fccmp[d,s]     double/single floating-point conditional compare.

Can we follow the convention fcsel uses of calling out "From ARMv8-A:"
for this type?

I'm not sure I follow.  Though I didn't refer to the ISA spec, I
used the description from it for the *fccmp* type.

Something like:

; fccmp          From ARMv8-A: floating point conditional compare.

Just to capture that this instruction is only available for cores implementing
ARMv8-A.



Got it.

Let me try this again:

   Add support for the FCCMP insn types

   2016-01-21  Evandro Menezes  

   gcc/
* config/aarch64/aarch64.md (fccmp): Change insn type.
(fccmpe): Likewise.
* config/aarch64/thunderx.md (thunderx_fcmp): Add
   "fccmp{s,d}" types.
* config/arm/cortex-a53.md (cortex_a53_fpalu): Likewise.
* config/arm/cortex-a57.md (cortex_a57_fp_cmp): Likewise.
* config/arm/xgene1.md (xgene1_fcmp): Likewise.
* config/arm/exynos-m1.md (exynos_m1_fp_ccmp): New insn
   reservation.
* config/arm/types.md (fccmps): Add new insn type.
(fccmpd): Likewise.



*Ping*

--
Evandro Menezes



[PATCH] Fix valgrind reported issue during char constant lexing (PR c++/69628)

2016-02-03 Thread Jakub Jelinek
Hi!

If we report an error from cpp_interpret_charconst (or functions it calls),
we leave *pchars_seen and *unsignedp uninitialized, and as the return
value for an error (0) is also a valid return value for valid programs,
various callers look at the uninitialized variables.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?
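
The pattern behind the fix, shown in isolation.  This is a simplified
stand-in, not the libcpp code: interpret_charconst and its parameters are
hypothetical, and the point is only that every early-return path gives the
out-parameters defined values because 0 cannot serve as an error signal.

```cpp
#include <cstring>

// Hypothetical stand-in for a lexer routine that returns a value and
// fills in two out-parameters.  0 is both the error return and a valid
// result, so callers cannot use the return value to detect failure.
unsigned
interpret_charconst (const char *body, unsigned *pchars_seen, int *unsignedp)
{
  if (std::strlen (body) == 0)
    {
      // Error path: still give the out-parameters defined values.
      *pchars_seen = 0;
      *unsignedp = 0;
      return 0;
    }

  // Normal path: report how many characters were seen.
  *pchars_seen = static_cast<unsigned> (std::strlen (body));
  *unsignedp = 0;
  // Value of the first character, read as unsigned char.
  return static_cast<unsigned char> (body[0]);
}
```

A caller that reads *pchars_seen after a failed call now sees 0 instead of
whatever happened to be on its stack.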

2016-02-03  Jakub Jelinek  

PR c++/69628
* charset.c (cpp_interpret_charconst): Clear *PCHARS_SEEN
and *UNSIGNEDP if bailing out early due to errors.

* g++.dg/parse/pr69628.C: New test.

--- libcpp/charset.c.jj 2016-01-04 15:14:08.0 +0100
+++ libcpp/charset.c    2016-02-03 13:44:05.100120898 +0100
@@ -1620,10 +1620,17 @@ cpp_interpret_charconst (cpp_reader *pfi
   if (token->val.str.len == (size_t) (2 + wide + u8))
 {
   cpp_error (pfile, CPP_DL_ERROR, "empty character constant");
+  *pchars_seen = 0;
+  *unsignedp = 0;
+  return 0;
+}
+  else if (!cpp_interpret_string (pfile, &token->val.str, 1, &str,
+ token->type))
+{
+  *pchars_seen = 0;
+  *unsignedp = 0;
   return 0;
 }
-  else if (!cpp_interpret_string (pfile, &token->val.str, 1, &str, token->type))
-return 0;
 
   if (wide)
 result = wide_str_to_charconst (pfile, str, pchars_seen, unsignedp,
--- gcc/testsuite/g++.dg/parse/pr69628.C.jj 2016-02-03 13:47:55.300061110 +0100
+++ gcc/testsuite/g++.dg/parse/pr69628.C    2016-02-03 13:47:32.0 +0100
@@ -0,0 +1,5 @@
+// PR c++/69628
+// { dg-do compile }
+
+0''; // { dg-error "empty character constant" }
+// { dg-error "expected unqualified-id before numeric constant" "" { target *-*-* } 4 }

Jakub


[PATCH] Fix valgrind reported issue with diagnostics caret (PR c/69627)

2016-02-03 Thread Jakub Jelinek
Hi!

As range->m_caret.m_{line,column} is only initialized if
range->m_show_caret_p is true, we really shouldn't be looking at those
fields otherwise.
Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?
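
In miniature, the change amounts to testing the flag before touching the
fields it guards.  The struct below is a simplified, hypothetical model of a
location range, not the real class.

```cpp
// Hypothetical, simplified model of a location range.  The caret
// fields are only meaningful when show_caret_p is true.
struct range
{
  bool show_caret_p;
  int caret_line;
  int caret_column;
};

bool
draw_caret_p (const range &r, int row, int column)
{
  // Test the flag first; && short-circuits, so the caret fields are
  // not read when the flag is false.
  return r.show_caret_p
         && row == r.caret_line
         && column == r.caret_column;
}
```

With the flag tested last, as before the patch, the comparison would read
the caret fields even for ranges that never set them.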

2016-02-03  Jakub Jelinek  

PR c/69627
* diagnostic-show-locus.c (layout::get_state_at_point): Don't read
range->m_caret fields if range->m_show_caret_p is false.

* gcc.dg/pr69627.c: New test.

--- gcc/diagnostic-show-locus.c.jj  2016-01-26 20:50:26.0 +0100
+++ gcc/diagnostic-show-locus.c 2016-02-03 14:12:30.472706582 +0100
@@ -722,9 +722,10 @@ layout::get_state_at_point (/* Inputs.
 
  /* Are we at the range's caret?  is it visible? */
  out_state->draw_caret_p = false;
- if (row == range->m_caret.m_line
+ if (range->m_show_caret_p
+ && row == range->m_caret.m_line
  && column == range->m_caret.m_column)
-   out_state->draw_caret_p = range->m_show_caret_p;
+   out_state->draw_caret_p = true;
 
  /* Within a multiline range, don't display any underline
 in any leading or trailing whitespace on a line.
--- gcc/testsuite/gcc.dg/pr69627.c.jj   2016-02-03 14:21:17.063450583 +0100
+++ gcc/testsuite/gcc.dg/pr69627.c  2016-02-03 14:28:31.765465915 +0100
@@ -0,0 +1,27 @@
+/* PR c/69627 */
+/* { dg-do compile } */
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+void
+foo ()
+{
+  float t[2] = { 1, 2 };
+  int const *s = 0;
+  t[1] / s;/* { dg-error "invalid operands to binary /" } */
+/* { dg-begin-multiline-output "" }
+   t[1] / s;
+    ^
+   { dg-end-multiline-output "" } */
+}
+
+void
+bar ()
+{
+  float t[2] = { 1, 2 };
+  int const *s[2] = { 0, 0 };
+  t[1] / s[0]; /* { dg-error "invalid operands to binary /" } */
+/* { dg-begin-multiline-output "" }
+   t[1] / s[0];
+    ^ 
+   { dg-end-multiline-output "" } */
+}

Jakub


Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-02-03 Thread Jakub Jelinek
Hi!

On Tue, Feb 02, 2016 at 05:09:34PM +0300, Ilya Enkovich wrote:
> And it's too late to do it after STV pass and therefore we disable it
> when stack is not properly aligned. I think this argumentation goes in
> a loop.

This is a P1 that needs to be fixed.  So that we don't defer this forever,
what about the following patch?  As neither stv nor preferred-stack-boundary
nor incoming-stack-boundary are allowed target attribute/GCC target pragma
switches, I can't find a problem with that.  We don't track at expansion
time whether the function is leaf or not, so the patch pessimistically
assumes that the function might be leaf and checks both incoming and
preferred stack boundaries.
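
The option handling in the patch can be sketched in isolation.  The mask
constant and function name here are illustrative, not the GCC definitions;
the boundaries are in bits and the threshold of 64 is taken from the hunk
in the patch.

```cpp
// Illustrative constant; not the real GCC mask value.
const unsigned MASK_STV_SKETCH = 1u << 0;

// Boundaries are in bits, as in the patch.
unsigned
override_stv (unsigned flags, bool stv_set_by_user,
              int preferred_boundary, int incoming_boundary)
{
  // Enable the pass by default unless the user chose explicitly.
  if (!stv_set_by_user)
    flags |= MASK_STV_SKETCH;
  // Then disable it whenever the stack might be under-aligned,
  // regardless of how the flag came to be set.
  if (preferred_boundary < 64 || incoming_boundary < 64)
    flags &= ~MASK_STV_SKETCH;
  return flags;
}
```

Note that in this sketch, as in the hunk, the second test also overrides an
explicit request for the pass.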

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-02-03  Jakub Jelinek  
Ilya Enkovich  
H.J. Lu  

PR target/69454
* config/i386/i386.c (convert_scalars_to_vector): Remove
stack alignment fixes.
(ix86_option_override_internal): Disable TARGET_STV if stack
might not be aligned enough.
(ix86_minimum_alignment): Assert that TARGET_STV is false.

* gcc.target/i386/pr69454-1.c: New test.
* gcc.target/i386/pr69454-2.c: New test.

--- gcc/config/i386/i386.c.jj   2016-02-02 20:42:01.024287587 +0100
+++ gcc/config/i386/i386.c  2016-02-03 18:45:43.271997909 +0100
@@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
   bitmap_obstack_release (NULL);
   df_process_deferred_rescans ();
 
-  /* Conversion means we may have 128bit register spills/fills
- which require aligned stack.  */
-  if (converted_insns)
-{
-  if (crtl->stack_alignment_needed < 128)
-   crtl->stack_alignment_needed = 128;
-  if (crtl->stack_alignment_estimated < 128)
-   crtl->stack_alignment_estimated = 128;
-}
-
   return 0;
 }
 
@@ -5453,6 +5443,13 @@ ix86_option_override_internal (bool main
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
 opts->x_target_flags |= MASK_STV;
+  /* Disable STV if -mpreferred-stack-boundary=2 or
+ -mincoming-stack-boundary=2 - the needed
+ stack realignment will be extra cost the pass doesn't take into
+ account and the pass can't realign the stack.  */
+  if (ix86_preferred_stack_boundary < 64
+  || ix86_incoming_stack_boundary < 64)
+opts->x_target_flags &= ~MASK_STV;
   if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
   && !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
 opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
@@ -29323,7 +29320,10 @@ ix86_minimum_alignment (tree exp, machin
   if ((mode == DImode || (type && TYPE_MODE (type) == DImode))
   && (!type || !TYPE_USER_ALIGN (type))
   && (!decl || !DECL_USER_ALIGN (decl)))
-return 32;
+{
+  gcc_checking_assert (!TARGET_STV);
+  return 32;
+}
 
   return align;
 }
--- gcc/testsuite/gcc.target/i386/pr69454-1.c.jj    2016-02-03 18:44:17.642175753 +0100
+++ gcc/testsuite/gcc.target/i386/pr69454-1.c   2016-02-03 18:44:17.642175753 +0100
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -msse2 -mno-accumulate-outgoing-args -mpreferred-stack-boundary=2" } */
+
+typedef struct { long long w64[2]; } V128;
+extern V128* fn2(void);
+long long a;
+V128 b;
+void fn1() {
+  V128 *c = fn2();
+  c->w64[0] = a ^ b.w64[0];
+}
--- gcc/testsuite/gcc.target/i386/pr69454-2.c.jj    2016-02-03 18:44:17.655175574 +0100
+++ gcc/testsuite/gcc.target/i386/pr69454-2.c   2016-02-03 18:44:17.655175574 +0100
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -mpreferred-stack-boundary=2" } */
+
+extern void fn2 ();
+long long a, b;
+
+void fn1 ()
+{
+  long long c = a;
+  a = b ^ a;
+  fn2 ();
+  a = c;
+}


Jakub


Re: [PATCH] Fix valgrind reported issue with diagnostics caret (PR c/69627)

2016-02-03 Thread David Malcolm
On Wed, 2016-02-03 at 21:07 +0100, Jakub Jelinek wrote:
> Hi!
> 
> As range->m_caret.m_{line,column} is only initialized if
> range->m_show_caret_p is true, we really shouldn't be looking at
> those
> fields otherwise.
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for
> trunk?

I'm not a reviewer, but fwiw the change looks good to me; thanks.

[the uninitialized data is coming from:
gcc_rich_location::add_expr, which leaves m_show_caret_p as false, and
doesn't bother initializing m_caret].



Re: [PATCH] Fix c/69643, named address space wrong-code

2016-02-03 Thread Richard Henderson

On 02/04/2016 12:46 AM, Richard Biener wrote:

On February 3, 2016 8:11:01 AM GMT+01:00, Richard Henderson wrote:

On 02/03/2016 06:05 PM, Richard Biener wrote:
  I wasn't aware that STRIP_NOPS strips ADDR_SPACE_CONVERT_EXPR.


Isn't this maybe failing to use that (unable to look at the
attachment from my phone).

The test case does fail to use ADDR_SPACE_CONVERT_EXPR.
Perhaps it's because of the intermediate cast to uintptr_t?


Ah.  Isn't to/from int conversion also address-space specific?


No, we just copy the bits there.


I wonder if it makes sense to have ADDR_SPACE_CONVERT if there is the
loophole of going through an integer type...


Dunno.


That is, if the address spaces are not subsets, how can going through an int
make sense? Isn't the testcase somehow invalid then?


Well, that depends.  For x86, they really are subsets, but the compiler does 
not know the relationship between them, so it cannot perform the conversion itself.


The test case is using implementation-defined conversions between pointers and 
integers (GCC defines the conversion to be bitwise).  So I don't think the test 
case is in any way invalid.
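
For readers unfamiliar with the idiom, a generic sketch of a pointer
round-trip through an integer follows.  It deliberately does not use named
address spaces, so it only illustrates the conversion itself, not the
wrong-code issue in the PR; the function name is made up.

```cpp
#include <cstdint>

// Round-trip a pointer through an integer type.  The pointer/integer
// conversions are implementation-defined; converting a void * to
// uintptr_t and back yields a pointer equal to the original.
void *
through_integer (void *p)
{
  std::uintptr_t bits = reinterpret_cast<std::uintptr_t> (p);
  return reinterpret_cast<void *> (bits);
}
```

Because only the bits are copied, nothing in such a conversion tells the
compiler how two address spaces relate to each other.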


The user-friendly way to do this would probably be some sort of pragma that 
allows user-defined address spaces, and user-defined conversion between them. 
But that's certainly not going to happen in the near-term.


[ Irritatingly, there's a new Haswell instruction that would allow the compiler 
user-space access to the fs/gs_base registers, and then the compiler really 
would know the relationship.


However, the new instruction needs to be enabled by the OS, and not one has 
done so yet.  Then there's that further complication of all those non-Haswell 
cpus still running around.  So in practice we'd still want to be using the 
user-defined spaces. ]



As for a patch I'd repeatedly pondered on not stripping int <-> pointer
conversions at all, similar to what STRIP_SIGN_NOPS does.  Don't remember
actually trying this or the fallout though.


I'll run that through a test cycle and see what happens.


r~



Re: [PATCH] Fix valgrind reported issue with diagnostics caret (PR c/69627)

2016-02-03 Thread Jeff Law

On 02/03/2016 01:25 PM, David Malcolm wrote:

On Wed, 2016-02-03 at 21:07 +0100, Jakub Jelinek wrote:

Hi!

As range->m_caret.m_{line,column} is only initialized if
range->m_show_caret_p is true, we really shouldn't be looking at
those
fields otherwise.
Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for
trunk?


I'm not a reviewer, but fwiw the change looks good to me; thanks.

[the uninitialized data is coming from:
gcc_rich_location::add_expr, which leaves m_show_caret_p as false, and
doesn't bother initializing m_caret].

Given you know this code better than anyone, that should carry enough 
weight to be an approval, even if we haven't gone through the formal 
process of appointing you as a reviewer for these bits.


jeff






Re: patch to fix PR69461

2016-02-03 Thread Michael Meissner
On Wed, Feb 03, 2016 at 01:02:57PM -0500, Vladimir Makarov wrote:
>   The following patch fixes
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69461
> 
>   The patch actually solves several issues.  Before the patch LRA
> has >800 more failures on GCC testsuite on power8.  After the patch
> the LRA has the same number of failures as reload.
> 
> Working on the patch, I think I found some typo in
> rs6000.c::rs6000_legitimate_address_p.  The code suspicious to me:
> 
>   if (reg_offset_p && reg_addr[mode].fused_toc &&
> toc_fusion_mem_wrapped (x, mode))
> return 1;
> 
> The function works with address (x) but toc_fusion_mem_wrapped
> requires memory instead of address.  Therefore the function never
> returns 1 for toc_fusion_wrapped address.
> 
> Mike and Peter, what do you think about this code?
> 
> Anyway, the patch was successfully bootstrapped and tested on power8.
> 
> Committed as rev..

It looks like it would solve the problem (not knowing the inner details of
lra).

You are correct that toc_fusion_mem_wrapped expects a MEM, while
rs6000_legitimate_address_p was passing it the address.

We are testing the following patch to fix this:

2016-02-03  Michael Meissner  
Vladimir Makarov  

* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Fix thinko
in validating fused toc addresses.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 233107)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -8399,7 +8399,8 @@ rs6000_legitimate_address_p (machine_mod
   && legitimate_constant_pool_address_p (x, mode,
 reg_ok_strict || lra_in_progress))
 return 1;
-  if (reg_offset_p && reg_addr[mode].fused_toc && toc_fusion_mem_wrapped (x, mode))
+  if (reg_offset_p && reg_addr[mode].fused_toc && GET_CODE (x) == UNSPEC
+  && XINT (x, 1) == UNSPEC_FUSION_ADDIS)
 return 1;
   /* For TImode, if we have load/store quad and TImode in VSX registers, only
  allow register indirect addresses.  This will allow the values to go in

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



New Swedish PO file for 'cpplib' (version 6.1-b20160131)

2016-02-03 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/sv.po

(This file, 'cpplib-6.1-b20160131.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[PING][ARM] Re: Use vector wide add for mixed-mode adds

2016-02-03 Thread Michael Collison

Second Ping. Most recent patch posted here:

https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01682.html

Regards,

Michael Collison

--
Michael Collison
Linaro Toolchain Working Group
michael.colli...@linaro.org



[wwwdocs] Add more PowerPC information to gcc-6/changes.html

2016-02-03 Thread Bill Schmidt
Hi,

The following was applied to the website to record additional GCC 6
changes for PowerPC.  The changes passed XHTML verification.

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.54
diff -r1.54 changes.html
361a362,434
> PowerPC64 now supports IEEE 128-bit floating-point using the
>   __float128 data type.  In GCC 6, this is NOT enabled by default,
>   but you can enable it with -mfloat128.  The IEEE 128-bit
>   floating-point support requires the use of the VSX instruction
>   set.  IEEE 128-bit floating-point values are passed and returned
>   as a single vector value.  The software emulator for IEEE 128-bit
>   floating-point support is only built on PowerPC Linux systems
>   where the default cpu is at least power7.  On future ISA 3.0
>   systems (power9 and later), you will be able to use the
>   -mfloat128-hardware option to use the ISA 3.0 instructions
>   that support IEEE 128-bit floating-point.  An additional type
>   (__ibm128) has been added to refer to the IBM extended double
>   type that normally implements long double.  This will allow
>   for a future transition to implementing long double with IEEE
>   128-bit floating-point.
> Basic support has been added for POWER9 hardware that will use the
>   recently published OpenPOWER ISA 3.0 instructions.  The following
>   new switches are available:
>   
> -mcpu=power9:  Implement all of the ISA 3.0
> instructions supported by the compiler.
> -mtune=power9:  In the future, apply tuning for
> POWER9 systems.  Currently, POWER8 tunings are used.
> -mmodulo:  Generate code using the ISA 3.0
> integer instructions (modulus, count trailing zeros, array
> index support, integer multiply/add).
> -mpower9-fusion:  Generate code to suitably fuse
> instruction sequences for a POWER9 system.
> -mpower9-dform:  Generate code to use the new D-form
> (register +offset) memory instructions for the vector
> registers.
> -mpower9-vector:  Generate code using the new ISA
> 3.0 vector (VSX or Altivec) instructions.
> -mpower9-minmax:  Reserved for future development.
> 
> -mtoc-fusion:  Keep TOC entries together to provide
> more fusion opportunities.
>   
> New constraints have been added to support IEEE 128-bit
>   floating-point and ISA 3.0 instructions:
>   
> wb:  Altivec register if -mpower9-dform is
> enabled.
> we:  VSX register if -mpower9-vector is enabled
> for 64-bit code generation.
> wo:  VSX register if -mpower9-vector is
> enabled.
> wp:  Reserved for future use if long double
> is implemented with IEEE 128-bit floating-point instead
> of IBM extended double.
> wq:  VSX register if -mfloat128 is enabled.
> wF:  Memory operand suitable for POWER9 fusion
> load/store.
> wG:  Memory operand suitable for TOC fusion memory
> references.
> wL:  Integer constant identifying the element
> number mfvsrld accesses within a vector.
>   
> Support has been added for __builtin_cpu_is () and
>   __builtin_cpu_supports (), allowing for very fast access to
>   AT_PLATFORM, AT_HWCAP, and AT_HWCAP2 values.  This requires
>   use of glibc 2.23 or later.
> All hardware transactional memory builtins now correctly
>   behave as memory barriers.  Programmers can use #ifdef __TM_FENCE__
>   to determine whether their "old" compiler treats the builtins
>   as barriers.
> Split-stack support has been added for gccgo on PowerPC64
>   for both big- and little-endian (but NOT for 32-bit).  The gold
>   linker from at least binutils 2.25.1 must be available in the PATH
>   when configuring and building gccgo to enable split stack.  (The
>   requirement for binutils 2.25.1 applies to PowerPC64 only.)  The
>   split-stack feature allows a small initial stack size to be
>   allocated for each goroutine, which increases as needed.




[PATCH] bootstrap/69611

2016-02-03 Thread Andreas Tobler

Hi all,

this patch fixes bootstrap on FreeBSD PowerPC and hopefully all other 
PowerPC targets which do not have float128 support.


The patch itself is a bandaid to survive stage4. We have to come up with 
a better solution for FreeBSD and all other soft float targets which do 
not support float128.


The patch was tested by Michael Meissner on different POWER machines.

Ok to commit to trunk?

TIA,
Andreas

2016-02-03  Andreas Tobler  

PR bootstrap/69611
* config/rs6000/sfp-machine.h: Guard __sfp_exceptions with
__FLOAT128__ to compile only for __float128 capable targets.

Index: config/rs6000/sfp-machine.h
===
--- config/rs6000/sfp-machine.h (revision 233109)
+++ config/rs6000/sfp-machine.h (working copy)
@@ -110,7 +110,7 @@
    floating point on pre-ISA 3.0 machines without the IEEE 128-bit floating
    point support.  */

-#ifndef __NO_FPRS__
+#ifdef __FLOAT128__
 #define ISA_BIT(x) (1LL << (63 - x))

 /* Use the same bits of the FPSCR.  */
@@ -151,7 +151,7 @@
   } while (0)

 # define FP_ROUNDMODE  (_fpscr & FP_RND_MASK)
-#endif /* !__NO_FPRS__ */
+#endif /* !__FLOAT128__ */

 /* Define ALIASNAME as a strong alias for NAME.  */
 # define strong_alias(name, aliasname) _strong_alias(name, aliasname)


Re: [PATCH] Fix c/69643, named address space wrong-code

2016-02-03 Thread Richard Henderson

On 02/04/2016 07:30 AM, Richard Henderson wrote:

On 02/04/2016 12:46 AM, Richard Biener wrote:

As for a patch I'd repeatedly pondered on not stripping int <-> pointer
conversions at all, similar to what STRIP_SIGN_NOPS does.  Don't remember
actually trying this or the fallout though.


I'll run that through a test cycle and see what happens.



+FAIL: c-c++-common/fold-bitand-4.c  -Wc++-compat   scan-tree-dump-times original "& 15" 1
+FAIL: c-c++-common/fold-bitand-4.c  -Wc++-compat   scan-tree-dump-times original "return [^\\n0-9]*0;" 2
+FAIL: c-c++-common/fold-bitand-4.c  -Wc++-compat   scan-tree-dump-times original "return [^\\n0-9]*12;" 1

+FAIL: gcc.dg/fold-bitand-1.c scan-tree-dump-times original "&c4 & 3" 0
+FAIL: gcc.dg/fold-bitand-1.c scan-tree-dump-times original "&c8 & 3" 0
+FAIL: gcc.dg/fold-bitand-1.c scan-tree-dump-times original "return 0" 2
+FAIL: gcc.dg/fold-bitand-2.c scan-tree-dump-times original "& 3" 0
+FAIL: gcc.dg/fold-bitand-2.c scan-tree-dump-times original "return 0" 1
+FAIL: gcc.dg/fold-bitand-2.c scan-tree-dump-times original "return 1" 1
+FAIL: gcc.dg/fold-bitand-2.c scan-tree-dump-times original "return 2" 1
+FAIL: gcc.dg/fold-bitand-2.c scan-tree-dump-times original "return 3" 1
+FAIL: gcc.dg/fold-bitand-3.c scan-tree-dump-times original "& 3" 0
+FAIL: gcc.dg/fold-bitand-3.c scan-tree-dump-times original "return 1" 2
+FAIL: gcc.dg/pr52355.c (test for excess errors)
+FAIL: gcc.dg/tree-ssa/foldaddr-1.c scan-tree-dump-times original "return 0" 1
+FAIL: gcc.dg/tree-ssa/ivopt_4.c scan-tree-dump-times ivopts "ivtmp.[0-9_]* = PHI <" 1
+FAIL: gcc.dg/tree-ssa/pr21985.c scan-tree-dump-times optimized "foo ([0-9]*)" 2
+FAIL: gcc.dg/tree-ssa/pr22051-2.c scan-tree-dump-times optimized "r_. = (int) q" 1

+FAIL: gcc.target/i386/addr-space-5.c scan-assembler gs:


So, it even fails the new test I added there at the end.
Patch below, just in case I've misunderstood what you suggested.



r~



diff --git a/gcc/tree.c b/gcc/tree.c
index fa7646b..3e79c4b 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -12219,6 +12219,10 @@ block_ultimate_origin (const_tree block)
 bool
 tree_nop_conversion_p (const_tree outer_type, const_tree inner_type)
 {
+  /* Do not strip conversions between pointers and integers.  */
+  if (POINTER_TYPE_P (outer_type) != POINTER_TYPE_P (inner_type))
+return false;
+
   /* Use precision rather then machine mode when we can, which gives
  the correct answer even for submode (bit-field) types.  */
   if ((INTEGRAL_TYPE_P (outer_type)
@@ -12272,8 +12276,7 @@ tree_sign_nop_conversion (const_tree exp)
   outer_type = TREE_TYPE (exp);
   inner_type = TREE_TYPE (TREE_OPERAND (exp, 0));

-  return (TYPE_UNSIGNED (outer_type) == TYPE_UNSIGNED (inner_type)
- && POINTER_TYPE_P (outer_type) == POINTER_TYPE_P (inner_type));
+  return TYPE_UNSIGNED (outer_type) == TYPE_UNSIGNED (inner_type);
 }

 /* Strip conversions from EXP according to tree_nop_conversion and



Re: [Patch, Fortran] PR 69495: unused-label warning does not tell which flag triggered it

2016-02-03 Thread Manfred Schwarb

Am 03.02.2016 um 19:18 schrieb Janus Weil:

Hi,

2016-02-03 10:21 GMT+01:00 Manfred Schwarb :

here is a diagnostics patch, which makes sure that the responsible
flag is printed in several warning messages (for which this was still
missing).



if (source_size < result_size)
-gfc_warning (0, "Intrinsic TRANSFER at %L has partly undefined result: "
-"source size %ld < result size %ld", &source->where,
-(long) source_size, (long) result_size);
+gfc_warning (OPT_Wsurprising, "Intrinsic TRANSFER at %L has partly "
+"undefined result: source size %ld < result size %ld",
+&source->where, (long) source_size, (long) result_size);

Breaking apart of these strings will probably hamper translation.


thanks for the comment, I was not aware that this is a problem (in
fact I'm rather ignorant of the translation process as a whole). I was
just trying to comply with the GNU coding standards by avoiding
overlong lines.

So, I assume the problem is that the strings are being broken
*differently* than before, right? (Obviously they were broken already
...) I guess I will just move the start of the warning message to a
new line in order to avoid this.



There are 2 things with translation, and there is a third issue:
- As you noticed, breaking things differently means translation has to be
  done again.
- Normally, each string is translated independently, and depending on the
  language there may be lack of context (e.g. adjectives get different suffixes
  depending on the noun).
- grep'ability: you got such an error message, then you want to look for the
  corresponding code and do a grep for e.g. "partly undefined result".
  GOTCHA!

So IMHO strings should be left intact, irrespective of some arbitrary 80 char
limits.
Other projects, e.g. the Linux kernel, deliberately violate the 80 char limit
where needed, and do not always break strings. I do not know how it is handled
in the GCC project, but I guess common sense is always a good recipe.

Of course it is no problem to split at natural boundaries, e.g. at ":", ";" or
"." characters.

Cheers,
Manfred



Btw, if anyone notices any further cases where the flag is missing in
the warning message, please let me know. (I haven't searched through
the whole gfortran code for more such cases and I'm not planning on
doing so, but I'll be happy to include further cases in the patch if
pointed out to me ...)

Also I guess I should mention Manuel and Dominique in the Changelog
(for their supportive comments in the PR).

Cheers,
Janus





libgo patch committed: Update to Go 1.6rc1

2016-02-03 Thread Ian Lance Taylor
I've committed a patch to the libgo library to update it to Go 1.6rc1,
the first release candidate for the Go 1.6 release.  As usual with
major libgo updates, the change is too large to include here.  I've
attached the changes to gccgo-specific files.

This update does not have many OS-specific changes, but please do let
me know about any problems building on different systems.

Bootstrapped and ran Go tests on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

gotools/ChangeLog:
2016-02-03  Ian Lance Taylor  

* Makefile.am (go_cmd_gofmt_files): Update to Go 1.6rc1 by adding
internal.go.
* Makefile.in: Rebuild.
Index: libgo/MERGE
===
--- libgo/MERGE (revision 232239)
+++ libgo/MERGE (working copy)
@@ -1,4 +1,4 @@
-f2e4c8b5fb3660d793b2c545ef207153db0a34b1
+036b8fd40b60830ca1d152f17148e52b96d8aa6c
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 232239)
+++ libgo/Makefile.am   (working copy)
@@ -846,9 +846,7 @@ go_net_common_files = \
go/net/parse.go \
go/net/pipe.go \
go/net/fd_poll_runtime.go \
-   go/net/port.go \
go/net/port_unix.go \
-   go/net/race0.go \
$(go_net_sendfile_file) \
go/net/sock_posix.go \
$(go_net_sock_file) \
@@ -1018,7 +1016,7 @@ go_os_files = \
$(go_os_sys_file) \
$(go_os_cloexec_file) \
go/os/types.go \
-   go/os/types_notwin.go
+   go/os/types_unix.go
 
 go_path_files = \
go/path/match.go \
@@ -1100,7 +1098,8 @@ go_strings_files = \
go/strings/replace.go \
go/strings/search.go \
go/strings/strings.go \
-   go/strings/strings_decl.go
+   go/strings/strings_decl.go \
+   go/strings/strings_generic.go
 go_strings_c_files = \
go/strings/indexbyte.c
 
@@ -1109,7 +1108,6 @@ go_sync_files = \
go/sync/mutex.go \
go/sync/once.go \
go/sync/pool.go \
-   go/sync/race0.go \
go/sync/runtime.go \
go/sync/rwmutex.go \
go/sync/waitgroup.go
@@ -1196,7 +1194,6 @@ go_compress_bzip2_files = \
 go_compress_flate_files = \
go/compress/flate/copy.go \
go/compress/flate/deflate.go \
-   go/compress/flate/fixedhuff.go \
go/compress/flate/huffman_bit_writer.go \
go/compress/flate/huffman_code.go \
go/compress/flate/inflate.go \
@@ -1367,7 +1364,8 @@ go_debug_dwarf_files = \
go/debug/dwarf/unit.go
 go_debug_elf_files = \
go/debug/elf/elf.go \
-   go/debug/elf/file.go
+   go/debug/elf/file.go \
+   go/debug/elf/reader.go
 go_debug_gosym_files = \
go/debug/gosym/pclntab.go \
go/debug/gosym/symtab.go
@@ -1450,7 +1448,6 @@ go_go_build_files = \
go/go/build/read.go \
go/go/build/syslist.go
 go_go_constant_files = \
-   go/go/constant/go14.go \
go/go/constant/value.go
 go_go_doc_files = \
go/go/doc/comment.go \
@@ -1461,7 +1458,8 @@ go_go_doc_files = \
go/go/doc/reader.go \
go/go/doc/synopsis.go
 go_go_format_files = \
-   go/go/format/format.go
+   go/go/format/format.go \
+   go/go/format/internal.go
 go_go_importer_files = \
go/go/importer/importer.go
 go_go_parser_files = \
@@ -1489,7 +1487,6 @@ go_go_types_files = \
go/go/types/eval.go \
go/go/types/expr.go \
go/go/types/exprstring.go \
-   go/go/types/go12.go \
go/go/types/initorder.go \
go/go/types/labels.go \
go/go/types/lookup.go \
@@ -1512,6 +1509,7 @@ go_go_types_files = \
go/go/types/universe.go
 
 go_go_internal_gcimporter_files = \
+   go/go/internal/gcimporter/bimport.go \
go/go/internal/gcimporter/exportdata.go \
go/go/internal/gcimporter/gcimporter.go
 go_go_internal_gccgoimporter_files = \
@@ -1578,20 +1576,46 @@ go_index_suffixarray_files = \
go/index/suffixarray/qsufsort.go \
go/index/suffixarray/suffixarray.go
 
-go_internal_format_files = \
-   go/internal/format/format.go
+go_internal_golang_org_x_net_http2_hpack_files = \
+   go/internal/golang.org/x/net/http2/hpack/encode.go \
+   go/internal/golang.org/x/net/http2/hpack/hpack.go \
+   go/internal/golang.org/x/net/http2/hpack/huffman.go \
+   go/internal/golang.org/x/net/http2/hpack/tables.go
+go_internal_race_files = \
+   go/internal/race/doc.go \
+   go/internal/race/norace.go
 go_internal_singleflight_files = \
go/internal/singleflight/singleflight.go
 
 if LIBGO_IS_LINUX
-internal_syscall_unix_getrandom_file = 
go/internal/syscall/unix/getrandom_linux.go
+if LIBGO_IS_386
+internal_syscall_unix_getrandom_files = 
go/internal/syscall/unix/getrandom_linux.go 
go/internal/syscall/unix/getrandom_linux_386.go
+else
+if LIBGO_IS_X86_64
+internal_syscall_u

Re: [PATCH] [wwwdocs] Add common C++ issues to /gcc-6/porting_to.html

2016-02-03 Thread Jonathan Wakely

On 03/02/16 19:47 +0100, Jakub Jelinek wrote:

On Wed, Feb 03, 2016 at 10:42:37AM -0800, Mike Stump wrote:

On Feb 3, 2016, at 9:13 AM, David Malcolm  wrote:
>> +pointer constants, so other constants such as false and
>> +(1 - 1) cannot be used where a null pointer is desired.

So, I’d leave this out entirely.  The subject is porting, not the fine detail 
pedanticism only a language lawyer could love.  Was this text from a porting 
experience, or an invention based upon compiler/language mods?


I believe trying to use false as pointer is from porting experience,
(1 - 1) most likely not really used in the wild, but just clarifies what is
and what is not the null pointer constant.


Yes, there are *dozens* of packages that fail to build due to "return
false;" in a function that returns a pointer of some kind.

I can't imagine what the authors of that code were thinking, if they
were thinking, or what was wrong with "return NULL;" or "return 0;"
but it compiled in C++98 so apparently people did it.  It doesn't
compile in C++11, so I added it to the page. The pedantic details of
which standard (or DR) changed the rules matter much less than the
fact the rules changed.



Re: [PATCH] bootstrap/69611

2016-02-03 Thread David Edelsohn
this patch fixes bootstrap on FreeBSD PowerPC and hopefully all other
PowerPC targets which do not have float128 support.

The patch itself is a bandaid to survive stage4. We have to come up
with a better solution for FreeBSD and all other soft float targets
which do not support float128.

The patch was tested by Michael Meissner on different POWER machines.

Ok to commit to trunk?

TIA,
Andreas

2016-02-03  Andreas Tobler  

PR bootstrap/69611
* config/rs6000/sfp-machine.h: Guard __sfp_exceptions with
__FLOAT128__ to compile only for __float128 capable targets.


Okay.

Thanks, David


Re: [Patch, Fortran] PR 69495: unused-label warning does not tell which flag triggered it

2016-02-03 Thread Joseph Myers
On Wed, 3 Feb 2016, Manfred Schwarb wrote:

> There are 2 things with translation, and there is a third issue:
> - As you noticed, breaking things differently means translation has to be
>   done again.
> - Normally, each string is translated independently, and depending on the
>   language there may be lack of context (e.g. adjectives get different
> suffixes
>   depending on the noun).

I believe gettext works fine with (compile-time) string constant 
concatenation - that is, extracts the whole concatenated string for 
translation, so these are non-issues.  What doesn't work includes:

* Runtime concatenation of strings or otherwise putting English fragments 
together at runtime.

* String constant concatenation where one of the concatenated pieces comes 
from a macro expansion.

* The argument for translation being a conditional expression:

  error (cond ? "message 1" : "message 2");

(in this case, only one of the messages may be extracted for translation, 
so you need to mark both of them up with appropriate macros such as G_).
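
The two cases can be sketched as follows.  This is a minimal standalone
illustration, assuming an identity macro as a stand-in for the G_ marker;
split_message and pick_message are hypothetical names, not functions in GCC.

```c
#include <string.h>

/* Stand-in for a G_-style marker: it only tags a literal so that a
   message extractor can find it, and expands to the literal itself.  */
#define G_(msgid) msgid

/* Adjacent string literals are joined at compile time, so a message
   split across source lines is still one string constant.  */
static const char *
split_message (void)
{
  return "Intrinsic TRANSFER at %L has partly "
	 "undefined result: source size %ld < result size %ld";
}

/* With a conditional expression, each alternative is marked up
   separately so that both can be extracted.  */
static const char *
pick_message (int cond)
{
  return cond ? G_("message 1") : G_("message 2");
}
```

At run time the split literal behaves as a single string, so searching the
resulting message for "partly undefined result" still succeeds even though
the source text is broken in the middle of that phrase.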

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Fix -mcpu=power8 atomic expansion (PR target/69644)

2016-02-03 Thread Jakub Jelinek
Hi!

rs6000_expand_atomic_compare_and_swap uses oldval directly in
a comparison instruction, but oldval might be a CONST_INT not suitable
for the instruction (such as in the testcase below in SImode comparison
0x8000 constant).  We need to force those into register if they don't
satisfy the predicate.
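
As a rough standalone illustration (not GCC code), the sketch below checks
whether a constant fits a signed 16-bit immediate.  It assumes that the
constant half of reg_or_short_operand accepts values in [-0x8000, 0x7fff];
fits_signed_16 is a hypothetical helper name.

```c
/* Hypothetical helper: returns 1 if VALUE lies in [-0x8000, 0x7fff],
   i.e. fits a signed 16-bit immediate field.  This range is an
   assumption about the predicate, not code taken from GCC.  */
static int
fits_signed_16 (long value)
{
  return value >= -0x8000L && value <= 0x7fffL;
}
```

Under that assumption 0x7fff fits but 0x8000 does not, which is why the
constant in the testcase has to be copied into a register first.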

Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk?

2016-02-03  Jakub Jelinek  

PR target/69644
* config/rs6000/rs6000.c (rs6000_expand_atomic_compare_and_swap):
Force oldval into register if it does not satisfy reg_or_short_operand
predicate.  Fix up formatting.

* gcc.dg/pr69644.c: New test.

--- gcc/config/rs6000/rs6000.c.jj   2016-02-02 20:42:01.0 +0100
+++ gcc/config/rs6000/rs6000.c  2016-02-03 08:45:31.345521112 +0100
@@ -22275,6 +22275,9 @@ rs6000_expand_atomic_compare_and_swap (r
   else if (reg_overlap_mentioned_p (retval, oldval))
 oldval = copy_to_reg (oldval);
 
+  if (mode != TImode && !reg_or_short_operand (oldval, mode))
+oldval = copy_to_mode_reg (mode, oldval);
+
   mem = rs6000_pre_atomic_barrier (mem, mod_s);
 
   label1 = NULL_RTX;
@@ -22289,10 +22292,8 @@ rs6000_expand_atomic_compare_and_swap (r
 
   x = retval;
   if (mask)
-{
-  x = expand_simple_binop (SImode, AND, retval, mask,
-  NULL_RTX, 1, OPTAB_LIB_WIDEN);
-}
+x = expand_simple_binop (SImode, AND, retval, mask,
+NULL_RTX, 1, OPTAB_LIB_WIDEN);
 
   cond = gen_reg_rtx (CCmode);
   /* If we have TImode, synthesize a comparison.  */
--- gcc/testsuite/gcc.dg/pr69644.c.jj   2016-02-03 08:42:20.827165549 +0100
+++ gcc/testsuite/gcc.dg/pr69644.c  2016-02-03 08:41:51.0 +0100
@@ -0,0 +1,11 @@
+/* PR target/69644 */
+/* { dg-do compile } */
+
+int
+main ()
+{
+  unsigned short x = 0x8000;
+  if (!__sync_bool_compare_and_swap (&x, 0x8000, 0) || x)
+__builtin_abort ();
+  return 0;
+}

Jakub


Re: [Patch, MIPS] Fix PR target/68273, passing args in wrong regs

2016-02-03 Thread Steve Ellcey

Here is a new patch for PR target/68273.  It makes the GCC calling conventions
compatible with LLVM so that the two agree with each other in all the cases
I could think of testing and it fixes the reported defect.

I couldn't get the GCC compatibility test to work (see
https://gcc.gnu.org/ml/gcc/2016-02/msg00017.html) so I wasn't able to use
that for compatibility testing, instead I hand examined routines with various
argument types (structures, ints, complex, etc) to see what registers 
GCC and LLVM were accessing.

This change does introduce an ABI/compatibility change with GCC itself and
that is obviously a concern.  Basically, any type that is passed by value
and has an external alignment applied to it may get passed in different
registers because we strip off that alignment (via TYPE_MAIN_VARIANT) before
determining the alignment of the variable.

If we don't want to break the GCC compatibility then we will continue to have an
incompatibility with LLVM and we will need to find another way to deal
with the aligned int variable that SRA is creating and passing to a function
that expects a 'normal' integer.

One thought I had was that we could compare TYPE_ALIGN (type) and
TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) and, if they are different, issue
a warning during compilation about a possible incompatibility with
older objects.
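
At the source level, the situation such a warning would flag looks roughly
like the sketch below.  It reuses the alignedint typedef from the new
testcase; the two helper functions are hypothetical and only expose what
__alignof__ reports, they are not part of the patch.

```c
#include <stddef.h>

/* Same typedef as in pr68273-1.c: an int whose alignment has been
   raised with an attribute.  */
typedef int alignedint __attribute__ ((aligned (8)));

/* Hypothetical helper: alignment of the attributed typedef.  */
static size_t
aligned_int_alignment (void)
{
  return __alignof__ (alignedint);
}

/* Hypothetical helper: nonzero when the typedef's alignment differs
   from that of plain int, the case a warning would report.  */
static int
alignment_differs (void)
{
  return __alignof__ (alignedint) != __alignof__ (int);
}
```

On targets where int is 4-byte aligned the two alignments differ, even
though both types hold the same value.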

Steve Ellcey
sell...@imgtec.com


2016-02-03  Steve Ellcey  

PR target/68273
* config/mips/mips.c (mips_function_arg_boundary): Fix argument
alignment.


diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 697abc2..4aa215f 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -5644,7 +5644,10 @@ mips_function_arg_boundary (machine_mode mode, const_tree type)
 {
   unsigned int alignment;
 
-  alignment = type ? TYPE_ALIGN (type) : GET_MODE_ALIGNMENT (mode);
+  alignment = type
+   ? TYPE_ALIGN (TYPE_MAIN_VARIANT (type))
+   : GET_MODE_ALIGNMENT (mode);
+
   if (alignment < PARM_BOUNDARY)
 alignment = PARM_BOUNDARY;
   if (alignment > STACK_BOUNDARY)




2016-02-03  Steve Ellcey  

PR target/68273
* gcc.c-torture/execute/pr68273-1.c: New test.
* gcc.c-torture/execute/pr68273-2.c: New test.


diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c b/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c
index e69de29..3ce07c6 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c
@@ -0,0 +1,74 @@
+/* Make sure that the alignment attribute on an argument passed by
+   value does not affect the calling convention and what registers
+   arguments are passed in.  */
+
+extern void exit (int);
+extern void abort (void);
+
+typedef int alignedint __attribute__((aligned(8)));
+
+int  __attribute__((noinline))
+foo1 (int a, alignedint b)
+{ return a + b; }
+
+int __attribute__((noinline))
+foo2 (int a, int b)
+{
+  return a + b;
+}
+
+int __attribute__((noinline))
+bar1 (alignedint x)
+{
+  return foo1 (1, x);
+}
+
+int __attribute__((noinline))
+bar2 (alignedint x)
+{
+  return foo1 (1, (alignedint) 99);
+}
+
+int __attribute__((noinline))
+bar3 (alignedint x)
+{
+  return foo1 (1, x + (alignedint) 1);
+}
+
+alignedint q = 77;
+
+int __attribute__((noinline))
+bar4 (alignedint x)
+{
+  return foo1 (1, q);
+}
+
+
+int __attribute__((noinline))
+bar5 (alignedint x)
+{
+  return foo2 (1, x);
+}
+
+int __attribute__((noinline))
+use_arg_regs (int i, int j, int k)
+{
+  return i+j-k;
+}
+
+int main()
+{
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (foo1 (19,13) != 32) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar1 (-33) != -32) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar2 (1) != 100) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar3 (17) != 19) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar4 (-33) != 78) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar5 (-84) != -83) abort ();
+   exit (0);
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c b/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c
index e69de29..1661be9 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c
@@ -0,0 +1,109 @@
+/* Make sure that the alignment attribute on an argument passed by
+   value does not affect the calling convention and what registers
+   arguments are passed in.  */
+
+extern void exit (int);
+extern void abort (void);
+
+typedef struct s {
+   char c;
+   char d;
+} t1;
+typedef t1 t1_aligned8  __attribute__((aligned(8)));
+typedef t1 t1_aligned16 __attribute__((aligned(16)));
+typedef t1 t1_aligned32 __attribute__((aligned(32)));
+
+int bar1(int a, t1 b)
+{
+   return a + b.c + b.d;
+}
+
+int bar2(int a, t1_aligned8 b)
+{
+   return a + b.c + b.d;
+}
+
+int bar3(int a, t1_aligned16 b)
+{
+   re

Re: [PATCH] Fix -mcpu=power8 atomic expansion (PR target/69644)

2016-02-03 Thread David Edelsohn
On Wed, Feb 3, 2016 at 5:28 PM, Jakub Jelinek  wrote:
> Hi!
>
> rs6000_expand_atomic_compare_and_swap uses oldval directly in
> a comparison instruction, but oldval might be a CONST_INT not suitable
> for the instruction (such as in the testcase below in SImode comparison
> 0x8000 constant).  We need to force those into register if they don't
> satisfy the predicate.
>
> Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk?
>
> 2016-02-03  Jakub Jelinek  
>
> PR target/69644
> * config/rs6000/rs6000.c (rs6000_expand_atomic_compare_and_swap):
> Force oldval into register if it does not satisfy reg_or_short_operand
> predicate.  Fix up formatting.
>
> * gcc.dg/pr69644.c: New test.

Okay.

Thanks, David


Re: PR 69577: Invalid RA of destination subregs

2016-02-03 Thread Richard Sandiford
Richard Sandiford  writes:
> Uros Bizjak  writes:
>> On Tue, Feb 2, 2016 at 5:54 PM, Kyrill Tkachov
>>  wrote:
>>> Hi Richard,
>>>
>>>
>>> On 02/02/16 14:56, Richard Sandiford wrote:

 In PR 69577 we have:

A: (set (reg:V2TI X) ...)
B: (set (subreg:TI (reg:V2TI X) 0) ...)

 X gets allocated to an AVX register, as usual for V2TI.  The problem is
 that the movti for B doesn't then preserve the other half of X, even
 though the subreg semantics are supposed to guarantee that.

 If instead the same value had been set by:

A': (set (subreg:TI (reg:V2TI X) 16) ...)
B: (set (subreg:TI (reg:V2TI X) 0) ...)

 the subreg in A' would have prevented the use of AVX registers for X,
 since you can't directly access the high part.

 IMO these are really the same thing.  An alternative way to view it
 is that the original sequence is equivalent to:

A: (set (reg:V2TI X) ...)
B1: (set (subreg:TI (reg:V2TI X) 0) ...)
B2: (set (subreg:TI (reg:V2TI X) 16) (subreg:TI (reg:V2TI X) 16))

 in which B2 is a no-op and therefore implicit.  The handling ought
 to be the same regardless of whether there is an rtl insn that
 explicitly assigns to (subreg:TI (reg:V2TI X) 16).
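
The expected subreg semantics can be pictured with a toy byte-level model.
This is only a sketch: struct toy_v2ti and both functions are hypothetical,
and zeroing the rest of the register in the faulty case is an arbitrary
choice made for illustration.

```c
#include <string.h>

/* Toy model of a V2TI pseudo: 32 bytes seen as two 16-byte TI halves.  */
struct toy_v2ti
{
  unsigned char bytes[32];
};

/* Model of (set (subreg:TI (reg:V2TI X) OFFSET) VALUE): only the
   16 bytes at OFFSET change, the other half is preserved.  */
static void
toy_set_subreg_ti (struct toy_v2ti *x, size_t offset,
		   const unsigned char value[16])
{
  memcpy (x->bytes + offset, value, 16);
}

/* Model of the faulty behaviour: a move that writes the low half but
   does not preserve the high half (zeroed here for illustration).  */
static void
toy_clobbering_move (struct toy_v2ti *x, const unsigned char value[16])
{
  memset (x->bytes, 0, sizeof x->bytes);
  memcpy (x->bytes, value, 16);
}
```

Writing offset 0 with toy_set_subreg_ti leaves the bytes at offset 16
untouched, while toy_clobbering_move loses them.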

 This patch implements that idea.  Hopefully the comments explain
 what's going on.

 Tested on x86_64-linux-gnu so far.  Will test on aarch64-linux-gnu and
 arm-linux-gnueabihf as well.  OK to install if the additional testing
 succeeds?
>>>
>>>
>>> For me this patch causes an ICE when building libgcc during an
>>> aarch64-none-elf build.
>>> It's a segfault with the trace:
>>> 0xb0ac2a crash_signal
>>> $SRC/gcc/toplev.c:335
>>> 0xa7cfd7 init_subregs_of_mode()
>>> $SRC/gcc/reginfo.c:1345
>>> 0x96fc4b init_costs
>>> $SRC/gcc/ira-costs.c:2187
>>> 0x97419e ira_set_pseudo_classes(bool, _IO_FILE*)
>>> $SRC/gcc/ira-costs.c:2237
>>> 0x106fd1e alloc_global_sched_pressure_data
>>> $SRC/gcc/haifa-sched.c:7244
>>> 0x106fd1e sched_init()
>>> $SRC/gcc/haifa-sched.c:7394
>>> 0x107109a haifa_sched_init()
>>> $SRC/gcc/haifa-sched.c:7406
>>> 0xab37ac schedule_insns()
>>> $SRC/gcc/sched-rgn.c:3504
>>> 0xab3f5b rest_of_handle_sched
>>> $SRC/gcc/sched-rgn.c:3717
>>> 0xab3f5b execute
>>> $SRC/gcc/sched-rgn.c:3825
>>
>> Also on x86_64-linux-gnu when building -m32 multilib:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00d28264 in init_subregs_of_mode () at
>> /home/uros/gcc-svn/trunk/gcc/reginfo.c:1345
>> 1345FOR_EACH_INSN_DEF (def, insn)
>> (gdb) p insn
>> $1 = (rtx_insn *) 0x7fffef9f4d40
>> (gdb) p debug_rtx (insn)
>> (code_label 60 31 39 10 9 "" [3 uses])
>> $2 = void
>> (gdb) p def
>> $3 = (df_ref) 0x0
>
> Bah, sorry.  I test with --enable-checking=yes,rtl,df, and it turns out
> that df checking masks this kind of problem.  -m32 builds (and tests)
> fine with it but not without.
>
> Here's the patch again with the obvious fix.  Retesting now with just
> --enable-checking=yes,rtl.

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabihf.
OK to install?

Thanks,
Richard

> gcc/
>   PR rtl-optimization/69577
>   * reginfo.c (record_subregs_of_mode): Add a partial_def parameter.
>   (find_subregs_of_mode): Update accordingly.  Iterate over partial
>   definitions.
>
> gcc/testsuite/
>   PR rtl-optimization/69577
>   * gcc.target/i386/pr69577.c: New test.
>
> diff --git a/gcc/reginfo.c b/gcc/reginfo.c
> index 6814eed..ccf53bf 100644
> --- a/gcc/reginfo.c
> +++ b/gcc/reginfo.c
> @@ -1244,8 +1244,16 @@ simplifiable_subregs (const subreg_shape &shape)
>  static HARD_REG_SET **valid_mode_changes;
>  static obstack valid_mode_changes_obstack;
>  
> +/* Restrict the choice of register for SUBREG_REG (SUBREG) based
> +   on information about SUBREG.
> +
> +   If PARTIAL_DEF, SUBREG is a partial definition of a multipart inner
> +   register and we want to ensure that the other parts of the inner
> +   register are correctly preserved.  If !PARTIAL_DEF we need to
> +   ensure that SUBREG itself can be formed.  */
> +
>  static void
> -record_subregs_of_mode (rtx subreg)
> +record_subregs_of_mode (rtx subreg, bool partial_def)
>  {
>unsigned int regno;
>  
> @@ -1256,15 +1264,41 @@ record_subregs_of_mode (rtx subreg)
>if (regno < FIRST_PSEUDO_REGISTER)
>  return;
>  
> +  subreg_shape shape (shape_of_subreg (subreg));
> +  if (partial_def)
> +{
> +  /* The number of independently-accessible SHAPE.outer_mode values
> +  in SHAPE.inner_mode is GET_MODE_SIZE (SHAPE.inner_mode) / SIZE.
> +  We need to check that the assignment will preserve all the other
> +  SIZE-byte chunks in the inner register besides the one that
> +  includes SUBREG.
> +
> +  In practice it is enough to check whether an equivalent
> +  SHAPE.inner_mode value in an 

random struct-layout-1 link failures caused by timeouts

2016-02-03 Thread Mike Stump
I added dg-timeout-factor support to compat.exp, so that one can use it in 
struct-layout-1.exp test cases.


To use it, one can do something like:

diff --git a/gcc/testsuite/gcc.dg/compat/struct-layout-1_generate.c 
b/gcc/testsuite/gcc.dg/compat/struct-layout-1_generate.c
index 80c7355..bc34f2a 100644
--- a/gcc/testsuite/gcc.dg/compat/struct-layout-1_generate.c
+++ b/gcc/testsuite/gcc.dg/compat/struct-layout-1_generate.c
@@ -50,7 +50,8 @@ const char *dg_options[] = {
 "/* { dg-options \"%s-I%s -fno-common\" { target hppa*-*-hpux* 
powerpc*-*-darwin* } } */\n",
 "/* { dg-options \"%s-I%s -mno-mmx -fno-common -Wno-abi\" { target 
i?86-*-darwin* x86_64-*-darwin* } } */\n",
 "/* { dg-options \"%s-I%s -mno-base-addresses\" { target mmix-*-* } } */\n",
-"/* { dg-options \"%s-I%s -mlongcalls -mtext-section-literals\" { target 
xtensa*-*-* } } */\n"
+"/* { dg-options \"%s-I%s -mlongcalls -mtext-section-literals\" { target 
xtensa*-*-* } } */\n",
+"/* { dg-timeout-factor 10 } */\n"
 #define NDG_OPTIONS (sizeof (dg_options) / sizeof (dg_options[0]))
 };

if they want.  On my target and my usual host environment, I randomly get 
timeouts on linking.  The default of 300 doesn’t seem to be enough.

I don’t know if others hit this (I’m -j24 with 128 GB ram on a local zfs with 
spinning rust under it), so I didn’t bother checking in the above.  If others 
want to report on whether they see random link failures, I’d be happy to check 
it in, let me know.


 
* lib/compat.exp (compat-get-options-main): Add dg-timeout-factor
support for struct-layout-1.exp.

diff --git a/gcc/testsuite/lib/compat.exp b/gcc/testsuite/lib/compat.exp
index 77d6705..63d78cc 100644
--- a/gcc/testsuite/lib/compat.exp
+++ b/gcc/testsuite/lib/compat.exp
@@ -170,7 +170,8 @@ proc compat-get-options-main { src } {
if { ![string compare "dg-options" $cmd] \
 || [string match "dg-prune-output" $cmd] \
 || [string match "dg-skip-if" $cmd] \
-|| [string match "dg-require-*" $cmd]  } {
+|| [string match "dg-require-*" $cmd] \
+|| [string match "dg-timeout-factor" $cmd]  } {
set status [catch "$op" errmsg]
if { $status != 0 } {
perror "src: $errmsg for \"$op\"\n"
@@ -215,7 +216,8 @@ proc compat-get-options { src } {
set cmd [lindex $op 0]
if { ![string compare "dg-options" $cmd] \
 || ![string compare "dg-prune-output" $cmd] \
-|| ![string compare "dg-xfail-if" $cmd] } {
+|| ![string compare "dg-xfail-if" $cmd] \
+|| ![string compare "dg-timeout-factor" $cmd] } {
set status [catch "$op" errmsg]
if { $status != 0 } {
perror "src: $errmsg for \"$op\"\n"
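
A minimal sketch of the filtering the patch extends, assuming POSIX fnmatch as a stand-in for Tcl's "string match".  The names directive_is_handled and effective_timeout are hypothetical and are not part of compat.exp; the pattern list mirrors the patched condition in compat-get-options-main, and 300 seconds is the default quoted above.

```cpp
#include <cassert>
#include <fnmatch.h>  // POSIX glob matching, used here in place of Tcl's "string match"
#include <string>

// Hypothetical helper: true when a dg- directive name is one the options
// scanner evaluates, following the patched compat-get-options-main condition.
bool directive_is_handled(const std::string &cmd)
{
  assert(!cmd.empty());
  static const char *const patterns[] = {
    "dg-options", "dg-prune-output", "dg-skip-if",
    "dg-require-*", "dg-timeout-factor"
  };
  for (const char *pattern : patterns)
    if (fnmatch(pattern, cmd.c_str(), 0) == 0)
      return true;
  return false;
}

// dg-timeout-factor scales the base timeout, so a factor of 10 applied to
// the 300 second default gives 3000 seconds.
int effective_timeout(int base_seconds, int factor)
{
  return base_seconds * factor;
}
```

Directives that are not in the list, such as dg-do, are skipped by this scan, which is why dg-timeout-factor had no effect in struct-layout-1 tests before the change.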



Re: [Patch, MIPS] Patch for PR 68400, a mips16 bug

2016-02-03 Thread Richard Sandiford
Andrew Bennett  writes:
>> -Original Message-
>> From: Matthew Fortune
>> Sent: 30 January 2016 16:46
>> To: Richard Sandiford; Steve Ellcey
>> Cc: gcc-patches@gcc.gnu.org; c...@codesourcery.com; Andrew Bennett
>> Subject: RE: [Patch, MIPS] Patch for PR 68400, a mips16 bug
>> 
>> Richard Sandiford  writes:
>> > "Steve Ellcey "  writes:
>> > > Here is a patch for PR 68400.  The problem is that and_operands_ok was
>> checking
>> > > one operand to see if it was a memory_operand but MIPS16 addressing is
>> more
>> > > restrictive than what the general memory_operand allows.  The fix was to
>> > > call mips_classify_address if TARGET_MIPS16 is set because it will do a
>> > > more complete mips16 addressing check and reject operands that do not 
>> > > meet
>> > > the more restrictive requirements.
>> > >
>> > > I ran the GCC testsuite with no regressions and have included a test case
>> as
>> > > part of this patch.
>> > >
>> > > OK to checkin?
>> > >
>> > > Steve Ellcey
>> > > sell...@imgtec.com
>> > >
>> > >
>> > > 2016-01-26  Steve Ellcey  
>> > >
>> > >  PR target/68400
>> > >  * config/mips/mips.c (and_operands_ok): Add MIPS16 check.
>> > >
>> > >
>> > >
>> > > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
>> > > index dd54d6a..adeafa3 100644
>> > > --- a/gcc/config/mips/mips.c
>> > > +++ b/gcc/config/mips/mips.c
>> > > @@ -8006,9 +8006,18 @@ mask_low_and_shift_p (machine_mode mode, rtx mask,
>> rtx shift, int
>> > maxlen)
>> > >  bool
>> > >  and_operands_ok (machine_mode mode, rtx op1, rtx op2)
>> > >  {
>> > > -  return (memory_operand (op1, mode)
>> > > -  ? and_load_operand (op2, mode)
>> > > -  : and_reg_operand (op2, mode));
>> > > +
>> > > +  if (memory_operand (op1, mode))
>> > > +{
>> > > +  if (TARGET_MIPS16) {
>> > > +struct mips_address_info addr;
>> > > +if (!mips_classify_address (&addr, op1, mode, false))
>> > > +  return false;
>> > > +  }
>> >
>> > Nit: brace formatting.
>> >
>> > It looks like the patch only works by accident.  The code above
>> > is passing the MEM, rather than the address inside the MEM, to
>> > mips_classify_address.  Since (mem (mem ...)) is never valid on MIPS,
>> > the effect is to disable the memory alternatives of *and3_mips16
>> > unconditionally.
>> >
>> > The addresses that occur in the testcase are valid as far as
>> > mips_classify_address is concerned.  FWIW, there shouldn't be any
>> > difference between the addresses that memory_operand accepts and the
>> > addresses that mips_classify_address accepts.
>> >
>> > In theory, the "W" constraints in *and3_mips16 are supposed to
>> > tell the target-independent code that this instruction cannot handle
>> > constant addresses or stack-based addresses.  That seems to be working
>> > correctly during RA for the testcase.  The problem comes in regcprop,
>> > which ends up creating a second stack pointer rtx distinct from
>> > stack_pointer_rtx:
>> >
>> > (reg/f:SI 29 $sp [375])
>> >
>> > (Note the ORIGINAL_REGNO of 375.)  This then defeats the test in
>> > mips_stack_address_p:
>> >
>> > bool
>> > mips_stack_address_p (rtx x, machine_mode mode)
>> > {
>> >   struct mips_address_info addr;
>> >
>> >   return (mips_classify_address (&addr, x, mode, false)
>> >  && addr.type == ADDRESS_REG
>> >  && addr.reg == stack_pointer_rtx);
>> > }
>> >
>> > Change the == to rtx_equal_p and the test passes.  I don't think that's
>> > the correct fix though -- the fix is to stop a second stack pointer rtx
>> > from being created.
>> 
>> Agreed, though I'm inclined to say do both. We actually hit this
>> same issue while testing some 4.9.2 based tools recently but I hadn't
>> got confirmation from Andrew (cc'd) whether it was definitely the same
>> issue. Andrew fixed this by updating all tests against stack_pointer_rtx
>> to compare register numbers instead (but rtx_equal_p is better still).

It looks from the patch like it's only "all" for the MIPS target.
Target-independent code would continue to expect pointer equality.

So sorry to be awkward, but I really don't think it's a good idea
to do both.  If we want to allow more than one stack pointer rtx,
we should do it consistently across the codebase rather than in
specific parts of one target.  And if we do that, there's no
need to "fix" the regcprop.c issue; we'd then have redefined
things so that the current regcprop.c behaviour is correct.

If instead we decide to stick with the traditional semantics and
require the stack pointer rtx to be exactly stack_pointer_rtx,
we should just fix the regcprop.c bug and leave the comparisons
with stack_pointer_rtx alone.

Thanks,
Richard
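
The two semantics discussed in this thread can be sketched with a toy model.  ToyReg and the two predicates are hypothetical and are not GCC's rtx types or functions; register 29 and ORIGINAL_REGNO 375 are taken from the dump quoted above.

```cpp
#include <cassert>

// Toy stand-in for a register rtx; this is not GCC's real rtx_def.
struct ToyReg {
  int regno;           // hard register number
  int original_regno;  // pseudo the value came from, like ORIGINAL_REGNO
};

const int TOY_SP_REGNUM = 29;  // $sp on MIPS, as in the quoted dump

// The single shared object, playing the role of stack_pointer_rtx.
ToyReg toy_stack_pointer = { TOY_SP_REGNUM, TOY_SP_REGNUM };

// Traditional semantics: only the shared object is the stack pointer.
bool is_stack_pointer_by_identity(const ToyReg *x)
{
  return x == &toy_stack_pointer;
}

// Relaxed semantics: any register with the right number counts.
bool is_stack_pointer_by_regno(const ToyReg *x)
{
  assert(x != nullptr);
  return x->regno == TOY_SP_REGNUM;
}
```

A second object with regno 29 and original_regno 375 passes the regno test but fails the identity test, which is the mismatch that defeats mips_stack_address_p.  Whichever rule is chosen, the point made above is that it should hold across the whole codebase rather than in one target.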


[PATCH], PR 69461, PowerPC specific fix for toc-fusion

2016-02-03 Thread Michael Meissner
In PR 69461, Vlad mentioned that in rs6000_legitimate_address_p, I was trying
to validate an address for TOC fusion, but I was using a predicate that looked
for a MEM instead of an address.

I bootstrapped the compiler on a little endian power8 and there were no
regressions.  In addition, Segher Boessenkool says that Vlad's patch together
with this patch fixes a lot of the errors that he was looking at.

Is the patch ok to check in?

2016-02-03  Michael Meissner  
Vladimir Makarov  

PR target/69461
* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Fix thinko
in validating fused toc addresses.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 233107)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -8399,7 +8399,8 @@ rs6000_legitimate_address_p (machine_mod
   && legitimate_constant_pool_address_p (x, mode,
 reg_ok_strict || lra_in_progress))
 return 1;
-  if (reg_offset_p && reg_addr[mode].fused_toc && toc_fusion_mem_wrapped (x, 
mode))
+  if (reg_offset_p && reg_addr[mode].fused_toc && GET_CODE (x) == UNSPEC
+  && XINT (x, 1) == UNSPEC_FUSION_ADDIS)
 return 1;
   /* For TImode, if we have load/store quad and TImode in VSX registers, only
  allow register indirect addresses.  This will allow the values to go in
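
The thinko can be illustrated with a toy model, assuming a simplified node type.  ToyRtx, ToyCode and both predicates are hypothetical stand-ins, not the rs6000 back end's real definitions.

```cpp
#include <cassert>

enum ToyCode { TOY_REG, TOY_UNSPEC, TOY_MEM };

const int TOY_UNSPEC_FUSION_ADDIS = 1;  // hypothetical stand-in value

// Toy expression node; this is not GCC's rtx.
struct ToyRtx {
  ToyCode code;
  int unspec_kind;        // meaningful only for TOY_UNSPEC
  const ToyRtx *address;  // meaningful only for TOY_MEM
};

// Checks an address directly, in the spirit of the patched condition.
bool toy_fusion_address_p(const ToyRtx *addr)
{
  assert(addr != nullptr);
  return addr->code == TOY_UNSPEC
         && addr->unspec_kind == TOY_UNSPEC_FUSION_ADDIS;
}

// Expects a MEM and looks at the address inside it, in the spirit of the
// predicate that was being called by mistake.
bool toy_fusion_mem_p(const ToyRtx *x)
{
  assert(x != nullptr);
  return x->code == TOY_MEM
         && x->address != nullptr
         && toy_fusion_address_p(x->address);
}
```

Handing a bare address to the MEM-expecting predicate always yields false, because the outer code is never TOY_MEM.  That is the same shape as the bug described here: the check compiled and ran, but could never accept a fused TOC address.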


Re: [PATCH], PR 69461, PowerPC specific fix for toc-fusion

2016-02-03 Thread David Edelsohn
On Wed, Feb 3, 2016 at 6:34 PM, Michael Meissner
 wrote:
> In PR 69461, Vlad mentioned that in rs6000_legitimate_address_p, I was trying
> to validate an address for TOC fusion, but I was using a predicate that looked
> for a MEM instead of an address.
>
> I bootstrapped the compiler on a little endian power8 and there were no
> regressions.  In addition, Segher Boessenkool, says that with Vlad's patch and
> this patch, it fixes a lot of the errors that he was looking at.
>
> Is the patch ok to check in?
>
> 2016-02-03  Michael Meissner  
> Vladimir Makarov  
>
> PR target/69461
> * config/rs6000/rs6000.c (rs6000_legitimate_address_p): Fix thinko
> in validating fused toc addresses.

Okay.

Thanks, David


Re: [PATCH] PR69619: Fix exponential issue in ccmp.c

2016-02-03 Thread Bernd Schmidt

On 02/03/2016 05:35 PM, Wilco Dijkstra wrote:

- tmp2 = targetm.gen_ccmp_first (&prep_seq_2, &gen_seq_2, rcode1,
-gimple_assign_rhs1 (gs1),
-gimple_assign_rhs2 (gs1));
-


It looks like after this patch tmp2 could be used uninitialized? Should 
be fixed.



+
+ /* FIXME: Temporary workaround for PR69619.
+Avoid exponential compile time due to expanding gs0 and gs1 twice.
+If gs0 and gs1 are complex, the cost will be high, so avoid
+reevaluation if above an arbitrary threshold.  */
+ if ((tmp == NULL) || (cost1 < 100))


Two sets of unnecessary parentheses. Also, I think the cost should be 
based on COSTS_N_INSNS for proper units.


Otherwise I think this is a reasonable workaround for this stage. Ok 
with these changes.



Bernd
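
A sketch of the condition with both review comments applied.  TOY_COSTS_N_INSNS is a local stand-in assumed to match the scaling of GCC's COSTS_N_INSNS macro, and should_try_second_order and the threshold of 25 instructions are hypothetical; 25 was chosen only because it equals the literal 100 in the patch under that scaling.

```cpp
#include <cassert>

// Local stand-in for COSTS_N_INSNS, assumed to scale by 4 as in rtl.h.
#define TOY_COSTS_N_INSNS(N) ((N) * 4)

// Hypothetical threshold expressed in instruction units.
const int TOY_REEVALUATION_LIMIT = TOY_COSTS_N_INSNS (25);

// Try the second operand order when the first attempt failed, or when the
// first operand was cheap enough that expanding it again is acceptable.
bool should_try_second_order(const void *first_result, int cost1)
{
  assert(cost1 >= 0);
  return first_result == nullptr || cost1 < TOY_REEVALUATION_LIMIT;
}
```

Writing the limit through the macro keeps it in the same units as the cost it is compared with, and the condition needs no parentheses around either operand of ||.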


Re: [PATCH] gcc: invoke: delete -mno-fma4 docs

2016-02-03 Thread Bernd Schmidt

On 02/01/2016 08:13 AM, Mike Frysinger wrote:

We don't document the -mno-xxx variants for other flags here, and the
paragraph here specifically says "Each has a corresponding -mno- option
to disable use of these instructions".  Drop the -mno-fma4 line.

2016-01-31  Mike Frysinger  

* doc/invoke.texi: Delete -mno-fma4.


Ok.


Bernd


Re: [Patch, MIPS] Fix PR target/68273, passing args in wrong regs

2016-02-03 Thread Mike Stump
On Feb 3, 2016, at 2:30 PM, Steve Ellcey  wrote:
> Here is a new patch for PR target/68273.

So this doesn’t fix aarch64, c6x, epiphany, ia64, iq2000, rs6000, rx, sparc, 
tilegx, tilepro or xtensa.

:-(  That’s one of the problems with having each port copy and paste swaths of 
code from other ports to express the same thing instead of ports sharing just 
one copy of code.  My port is also broken in the same way (currently).

I’m curious why the caller of the hook can’t grab the main variant, if it 
wants.  If it did this, then all the ports are fixed wrt this issue.

Re: [PATCH] [wwwdocs] Add common C++ issues to /gcc-6/porting_to.html

2016-02-03 Thread Mike Stump
On Feb 3, 2016, at 2:03 PM, Jonathan Wakely  wrote:
> Yes, there are *dozens* of packages that fail to build due to "return
> false;" in a function that returns a pointer of some kind.

Wow, curious.  Anyway, that removes my objection.


[committed, PATCH] Define check_union_passing6 only for CHECK_FLOAT128

2016-02-03 Thread H.J. Lu
Index: ChangeLog
===
--- ChangeLog   (revision 233123)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2016-02-03  H.J. Lu  
+
+   * gcc.target/i386/iamcu/test_passing_unions.c (check_union_passing6):
+   Define only if CHECK_FLOAT128 is defined.
+   (main): Properly initialize u5.
+
 2016-02-03  Jakub Jelinek  
 
PR c/69627
Index: gcc.target/i386/iamcu/test_passing_unions.c
===
--- gcc.target/i386/iamcu/test_passing_unions.c (revision 233123)
+++ gcc.target/i386/iamcu/test_passing_unions.c (working copy)
@@ -94,6 +94,7 @@ check_union_passing5(union un5 u ATTRIBU
 #define check_union_passing4 WRAP_CALL(check_union_passing4)
 #define check_union_passing5 WRAP_CALL(check_union_passing5)
 
+#ifdef CHECK_FLOAT128
 union un6
 {
   __float128 f128;
@@ -111,6 +112,7 @@ check_union_passing6(union un6 u ATTRIBU
 }
 
 #define check_union_passing6 WRAP_CALL(check_union_passing6)
+#endif
 
 int
 main (void)
@@ -123,9 +125,11 @@ main (void)
   struct long_struct ls;
 #endif /* CHECK_LARGER_UNION_PASSING */
   union un4 u4[8];
-  union un5 u5 = { 48.394 };
+  union un5 u5;
   int i;
+#ifdef CHECK_FLOAT128
   union un6 u6;
+#endif
 
   /* Check a union with char, int.  */
   clear_struct_registers;
@@ -208,14 +212,17 @@ main (void)
   u4[4], u4[5], u4[6], u4[7]);
 
   clear_struct_registers;
+  u5.d = 48.394;
   iregs.I0 = u5.ll & 0xffffffff;
   iregs.I1 = (u5.ll >> 32) & 0xffffffff;
   num_iregs = 2;
   clear_int_hardware_registers;
   check_union_passing5(u5);
 
+#ifdef CHECK_FLOAT128
   u6.i = 2;
   check_union_passing6(u6);
+#endif
 
   return 0;
 }


Re: PR 69577: Invalid RA of destination subregs

2016-02-03 Thread Richard Henderson

On 02/03/2016 04:56 AM, Richard Sandiford wrote:


gcc/
PR rtl-optimization/69577
* reginfo.c (record_subregs_of_mode): Add a partial_def parameter.
(find_subregs_of_mode): Update accordingly.  Iterate over partial
definitions.

gcc/testsuite/
PR rtl-optimization/69577
* gcc.target/i386/pr69577.c: New test.


Ok.


r~


Merge from trunk to gccgo branch

2016-02-03 Thread Ian Lance Taylor
I merged GCC trunk revision 233110 to the gccgo branch.

Ian


Re: [PATCH] Partially fix PR c++/12277 (Warn on dynamic cast with known NULL results)

2016-02-03 Thread Patrick Palka
On Wed, Feb 3, 2016 at 1:40 PM, Patrick Palka  wrote:
> On Tue, 2 Feb 2016, Jason Merrill wrote:
>
>> On 11/09/2015 04:30 AM, Patrick Palka wrote:
>>>
>>> + if (complain & tf_warning)
>>> +   {
>>> + if (VAR_P (old_expr))
>>> +   warning (0, "dynamic_cast of %q#D to %q#T can never
>>> succeed",
>>> +   old_expr, type);
>>> + else
>>> +   warning (0, "dynamic_cast of %q#E to %q#T can never
>>> succeed",
>>> +   old_expr, type);
>>> +   }
>>> + return build_zero_cst (type);
>>
>>
>> You also need to handle throwing bad_cast in the reference case.
>
>
> Oops, fixed.  I also updated the test case to confirm that the expected
> number of calls to bad_cast is generated.
>
> -- >8 --
>
> gcc/cp/ChangeLog:
>
> PR c++/12277
> * rtti.c (build_dynamic_cast_1): Warn on dynamic_cast that can
> never succeed due to either the target type or the static type
> being marked final.
>
> gcc/testsuite/ChangeLog:
>
> PR c++/12277
> * g++.dg/rtti/dyncast8.C: New test.
> ---
>  gcc/cp/rtti.c| 26 ++
>  gcc/testsuite/g++.dg/rtti/dyncast8.C | 53
> 
>  2 files changed, 79 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/rtti/dyncast8.C
>
> diff --git a/gcc/cp/rtti.c b/gcc/cp/rtti.c
> index a43ff85..b1454d9 100644
> --- a/gcc/cp/rtti.c
> +++ b/gcc/cp/rtti.c
> @@ -694,6 +694,32 @@ build_dynamic_cast_1 (tree type, tree expr,
> tsubst_flags_t complain)
>
>   target_type = TYPE_MAIN_VARIANT (TREE_TYPE (type));
>   static_type = TYPE_MAIN_VARIANT (TREE_TYPE (exprtype));
> +
> + if ((CLASSTYPE_FINAL (static_type)
> +  && !DERIVED_FROM_P (target_type, static_type))
> + || (CLASSTYPE_FINAL (target_type)
> + && !DERIVED_FROM_P (static_type, target_type)))
> +   {
> + if (complain & tf_warning)
> +   {
> + if (VAR_P (old_expr))
> +   warning (0, "dynamic_cast of %q#D to %q#T can never
> succeed",
> +   old_expr, type);
> + else
> +   warning (0, "dynamic_cast of %q#E to %q#T can never
> succeed",
> +   old_expr, type);
> +   }

It just occurred to me that issuing this warning during template
instantiation may be undesirable if the dynamic_cast being built was
originally dependent (for the same reasons it is undesirable to issue
-Wuseless-cast warnings during instantiation, which we already avoid
doing).  Especially so since this particular warning is not associated
with any warning flag.  I'll try to come up with a more careful patch
for stage 1.
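
For illustration, a cast of the kind the warning targets, next to a non-final counterpart where the same cross-cast can succeed.  All class and function names here are hypothetical and are not taken from the dyncast8.C test.

```cpp
#include <cassert>

struct Shape { virtual ~Shape() {} };
struct Drawable { virtual ~Drawable() {} };

// Circle is final and does not derive from Drawable, so no object reached
// through a Circle* can also be a Drawable.
struct Circle final : Shape {};

Drawable *circle_as_drawable(Circle *c)
{
  assert(c != nullptr);
  return dynamic_cast<Drawable *>(c);  // always yields a null pointer
}

// Square is not final, so a more derived class may add Drawable and the
// cross-cast has to be checked at run time.
struct Square : Shape {};
struct DrawableSquare : Square, Drawable {};

Drawable *square_as_drawable(Square *s)
{
  assert(s != nullptr);
  return dynamic_cast<Drawable *>(s);
}
```

Only the first cast is decidable at compile time.  In a template, the same expression could be harmless for most instantiations, which is the concern raised above about warning during instantiation.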


Re: patch to fix PR69461

2016-02-03 Thread Vladimir Makarov

On 02/03/2016 03:49 PM, Michael Meissner wrote:

On Wed, Feb 03, 2016 at 01:02:57PM -0500, Vladimir Makarov wrote:

   The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69461

   The patch actually solves several issues.  Before the patch, LRA had
more than 800 additional failures on the GCC testsuite on power8.  After
the patch, LRA has the same number of failures as reload.

While working on the patch, I think I found a typo in
rs6000.c::rs6000_legitimate_address_p.  The code that looks suspicious to me:

   if (reg_offset_p && reg_addr[mode].fused_toc &&
toc_fusion_mem_wrapped (x, mode))
 return 1;

The function works with an address (x), but toc_fusion_mem_wrapped
requires a MEM rather than an address.  Therefore the function never
returns 1 for a toc_fusion_wrapped address.

Mike and Peter, what do you think about this code?

Anyway, the patch was successfully bootstrapped and tested on power8.

Committed as rev..

It looks like it would solve the problem (not knowing the inner details of
lra).

You are correct that the call to toc_fusion_mem_wrapped expects a MEM, and
rs6000_legitimate_address_p was passing it the address.

We are testing the following patch to fix this:

2016-02-03  Michael Meissner  
Vladimir Makarov  

* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Fix thinko
in validating fused toc addresses.


Thanks, Mike.  I found it when trying to fix numerous LRA failures on 
power8.  When the reload pass cannot figure out what to do with an 
illegitimate address, it does nothing, whereas LRA still tried to do 
something.  I've made LRA work like reload in this case, so the LRA 
failures were fixed even though the fused toc wrapper address was believed 
to be illegitimate.


Still, it makes sense to consider the address legitimate, as a lot of 
code in GCC needs this.






[PATCH] obsolete the deprecated rtems targets

2016-02-03 Thread tbsaunde+gcc
From: Trevor Saunders 

hi,

Joel said in http://gcc.gnu.org/ml/gcc/2016-01/msg00016.html we should drop
support for these targets because rtems has stopped supporting them, so this
marks them as obsolete and we can remove them in gcc 7.

Tested that building for {avr,h8300,m32r}-rtems without --enable-obsolete
fails, and that make all-gcc for i686-linux-gnu succeeds.  OK?

Trev

gcc/ChangeLog:

2016-02-03  Trevor Saunders  

* config.gcc: Mark deprecated rtems targets as obsolete.
---
 gcc/config.gcc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c602358..e26742e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -240,6 +240,9 @@ case ${target} in
  | *-knetbsd-* \
  | *-openbsd2* \
  | *-openbsd3* \
+ | avr-*rtems* \
+ | h8300-*rtems*   \
+ | m32r-*rtems*\
  )
 if test "x$enable_obsolete" != xyes; then
   echo "*** Configuration ${target} is obsolete." >&2
-- 
2.7.0



Re: [PATCH] fix #69251 - [6 Regression] ICE in unify_array_domain on a flexible array member

2016-02-03 Thread Martin Sebor

I've committed the patch with the changes below.  Just to clarify
my concern (since put to rest):


It was impossible to have null TYPE_MAX_VALUE until you introduced that
in compute_array_index_type, and thus we didn't test for it; if we
aren't doing that anymore I can't imagine where it would come from now.


The patch with the TYPE_MAX_VALUE checks has been in place for
several weeks.  Although unlikely, it seemed conceivable that
a change could have gone in since then that has introduced
a dependency on the domain being non-null for flexible array
members in some corner case.  But you know the code far better
than me (and the changes being committed) so I trust you when
you say the removal is safe.

Martin


[RFC] Variants of __typeof

2016-02-03 Thread Richard Henderson
While attempting to write some code that uses the new x86 named address space 
support in gcc 6, I found that __typeof is very unhelpful.  In particular, given


int __seg_fs *ptr;
__typeof(*ptr) obj;

OBJ will not have type "int" but "int __seg_fs", which means that you can't use 
it to create temporaries within statement expressions.


In the process of writing this, I found a hack in __typeof added just to 
support _Atomic, which suggests that one of these variants would be more 
generally helpful than the hack.


I add __typeof_noas and __typeof_noqual.  The first strips only the address 
space, leaving 'const' and 'volatile' (and, I suppose 'restrict').  The second 
strips all qualifiers, essentially yielding the TYPE_MAIN_VARIANT.
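
To illustrate the problem with an existing compiler, here is a small sketch using 'const' in place of an address space qualifier.  It assumes GCC or a compatible compiler, since __typeof__ is a GNU extension; the function names are hypothetical.  The proposed __typeof_noas and __typeof_noqual keywords exist only in this RFC, so the sketch shows the arithmetic workaround instead.

```cpp
#include <cassert>
#include <type_traits>

// __typeof__ keeps the qualifiers of its operand, so a temporary declared
// this way from a const object would itself be const.
bool typeof_keeps_const()
{
  const int src = 42;
  return std::is_const<__typeof__(src)>::value;
}

// Workaround for arithmetic types: the type of the expression "src + 0" is
// plain int, so the temporary can be modified.  Note that this also applies
// integer promotion to types narrower than int.
bool arithmetic_drops_const()
{
  const int src = 42;
  __typeof__(src + 0) tmp = src;
  tmp += 1;
  assert(tmp == 43);
  return std::is_const<__typeof__(tmp)>::value;
}
```

The workaround does not extend to structs or pointers in general, which is part of the motivation for a keyword that strips qualifiers directly.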


Thoughts?


r~
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 378afae..4972fe1 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -509,6 +509,10 @@ const struct c_common_resword c_common_reswords[] =
   { "__transaction_cancel", RID_TRANSACTION_CANCEL, 0 },
   { "__typeof",RID_TYPEOF, 0 },
   { "__typeof__",  RID_TYPEOF, 0 },
+  { "__typeof_noas",   RID_TYPEOF_NOAS, 0 },
+  { "__typeof_noas__", RID_TYPEOF_NOAS, 0 },
+  { "__typeof_noqual", RID_TYPEOF_NOQUAL, 0 },
+  { "__typeof_noqual__", RID_TYPEOF_NOQUAL, 0 },
   { "__underlying_type", RID_UNDERLYING_TYPE, D_CXXONLY },
   { "__volatile",  RID_VOLATILE,   0 },
   { "__volatile__",RID_VOLATILE,   0 },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index fa3746c..6c6c2b1 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -100,9 +100,10 @@ enum rid
   /* C extensions */
   RID_ASM,   RID_TYPEOF,   RID_ALIGNOF,  RID_ATTRIBUTE,  RID_VA_ARG,
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,  RID_CHOOSE_EXPR,
-  RID_TYPES_COMPATIBLE_P,  RID_BUILTIN_COMPLEX, 
RID_BUILTIN_SHUFFLE,
-  RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
+  RID_TYPES_COMPATIBLE_P,  RID_BUILTIN_COMPLEX,
+  RID_BUILTIN_SHUFFLE, RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
   RID_FRACT, RID_ACCUM, RID_AUTO_TYPE, RID_BUILTIN_CALL_WITH_STATIC_CHAIN,
+  RID_TYPEOF_NOAS, RID_TYPEOF_NOQUAL,
 
   /* C11 */
   RID_ALIGNAS, RID_GENERIC,
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index eede3a7..199a39f 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -561,6 +561,8 @@ c_token_starts_typename (c_token *token)
case RID_STRUCT:
case RID_UNION:
case RID_TYPEOF:
+   case RID_TYPEOF_NOAS:
+   case RID_TYPEOF_NOQUAL:
case RID_CONST:
case RID_ATOMIC:
case RID_VOLATILE:
@@ -722,6 +724,8 @@ c_token_starts_declspecs (c_token *token)
case RID_STRUCT:
case RID_UNION:
case RID_TYPEOF:
+   case RID_TYPEOF_NOAS:
+   case RID_TYPEOF_NOQUAL:
case RID_CONST:
case RID_VOLATILE:
case RID_RESTRICT:
@@ -2530,6 +2534,8 @@ c_parser_declspecs (c_parser *parser, struct c_declspecs 
*specs,
  declspecs_add_type (loc, specs, t);
  break;
case RID_TYPEOF:
+   case RID_TYPEOF_NOAS:
+   case RID_TYPEOF_NOQUAL:
  /* ??? The old parser rejected typeof after other type
 specifiers, but is a syntax error the best way of
 handling this?  */
@@ -3179,7 +3185,12 @@ c_parser_typeof_specifier (c_parser *parser)
   ret.spec = error_mark_node;
   ret.expr = NULL_TREE;
   ret.expr_const_operands = true;
-  gcc_assert (c_parser_next_token_is_keyword (parser, RID_TYPEOF));
+
+  enum rid keyword = c_parser_peek_token (parser)->keyword;
+  gcc_assert (keyword == RID_TYPEOF
+ || keyword == RID_TYPEOF_NOAS
+ || keyword == RID_TYPEOF_NOQUAL);
+
   c_parser_consume_token (parser);
   c_inhibit_evaluation_warnings++;
   in_typeof++;
@@ -3221,9 +3232,19 @@ c_parser_typeof_specifier (c_parser *parser)
   /* For use in macros such as those in <stdatomic.h>, remove all
 qualifiers from atomic types.  (const can be an issue for more macros
 using typeof than just the <stdatomic.h> ones.)  */
-  if (ret.spec != error_mark_node && TYPE_ATOMIC (ret.spec))
-   ret.spec = c_build_qualified_type (ret.spec, TYPE_UNQUALIFIED);
+  if (ret.spec != error_mark_node
+ && keyword == RID_TYPEOF
+ && TYPE_ATOMIC (ret.spec))
+   keyword = RID_TYPEOF_NOQUAL;
 }
+
+  /* If requested, drop (some) qualifiers.  */
+  if (keyword != RID_TYPEOF && ret.spec != error_mark_node)
+ret.spec = c_build_qualified_type (ret.spec,
+  keyword == RID_TYPEOF_NOQUAL
+  ? TYPE_UNQUALIFIED
+  : TYPE_QUALS_NO_ADDR_SPACE (ret.spec));
+
   c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
   return ret;
 }
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index d03b0c9..49bdf4f 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -978,6 +978,8 @@ c

Re: [PATCH] obsolete the deprecated rtems targets

2016-02-03 Thread Jeff Law

On 02/03/2016 09:36 PM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

hi,

Joel said in http://gcc.gnu.org/ml/gcc/2016-01/msg00016.html we should drop
support for these targets because rtems has stopped supporting them, so this
marks them as obsolete and we can remove them in gcc 7.

Tested that building for {avr,h8300,m32r}-rtems without --enable-obsolete
fails, and that make all-gcc for i686-linux-gnu succeeds.  OK?

Trev

gcc/ChangeLog:

2016-02-03  Trevor Saunders  

* config.gcc: Mark deprecated rtems targets as obsolete.

OK.
jeff



RE: Turnoff prefetching for -march=znver1

2016-02-03 Thread Kumar, Venkataramanan
Hi Uros,

> -Original Message-
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Wednesday, February 03, 2016 2:20 AM
> To: Stepanyan, Victoria
> Cc: gcc-patches@gcc.gnu.org; ger...@pfeifer.com; rguent...@suse.de;
> Kumar, Venkataramanan
> Subject: Re: Turnoff prefetching for -march=znver1
> 
> On Tue, Feb 2, 2016 at 9:28 PM, Stepanyan, Victoria
>  wrote:
> > Hi Maintainers,
> >
> > This patch disables prefetching for -march=znver1 which is turned on by
> default.
> >
> > gcc/ChangeLog:
> >
> > 2016-02-02 Victoria Stepanyan 
> >
> > * gcc/config/i386/x86-tune.def: Disable default prefetching for -
> march=znver1
> >
> > Ok for trunk?
> 
> OK.

Thank you. I upstreamed it on behalf of Victoria.
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=233127


> 
> Thanks,
> Uros.

Regards,
Venkat.