[PATCH] combine: Don't mess with subregs of floating point (PR78590)

2016-11-30 Thread Segher Boessenkool
PR78590 shows a problem in change_zero_ext, where we change a zero_extend
of a subreg to a logical and.  We should only do this if the thing we are
taking the subreg of is a scalar integer, otherwise we will take a subreg
of (e.g.) a float in a different size, which is nonsensical and hits an
assert.

Tested on powerpc64-linux; committing to trunk.


Segher


2016-11-30  Segher Boessenkool  

PR rtl-optimization/78590
* combine.c (change_zero_ext): Transform zero_extend of subregs only
if the subreg_reg is a scalar integer mode.

---
 gcc/combine.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index 5696eea9..fd33a4d 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11285,6 +11285,7 @@ change_zero_ext (rtx pat)
   else if (GET_CODE (x) == ZERO_EXTEND
   && SCALAR_INT_MODE_P (mode)
   && GET_CODE (XEXP (x, 0)) == SUBREG
+  && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (XEXP (x, 0
   && !paradoxical_subreg_p (XEXP (x, 0))
   && subreg_lowpart_p (XEXP (x, 0)))
{
-- 
1.9.3



Re: [PATCH] libiberty: avoid reading past end of buffer in strndup/xstrndup (PR c/78498)

2016-11-30 Thread Richard Biener
On Tue, Nov 29, 2016 at 11:08 PM, David Malcolm  wrote:
> libiberty's implementations of strndup and xstrndup call strlen on
> the input string, and hence can read past the end of the input buffer
> if it isn't zero-terminated (such as is the case in PR c/78498, where
> the input string is from the input.c line cache).
>
> This patch converts them to use strnlen instead (as glibc's
> implementation of them does), avoiding reading more than n bytes
> from the input buffer.  strnlen is provided by libiberty.
>
> Successfully bootstrapped®rtested on x86_64-pc-linux-gnu;
> adds 6 PASS results to gcc.sum.
>
> The patch also adds some selftests for this case, which showed
> the problem and the fix nicely via "make selftest-valgrind".
> Unfortunately I had to put these selftests within the gcc
> subdirectory, rather than libiberty, since selftest.h is C++ and
> is itself in the gcc subdirectory.  If that's unacceptable, I can
> just drop the selftest.c part of the patch (or we somehow support
> selftests from within libiberty itself, though I'm not sure how to
> do that, if libiberty is meant as a cross-platform compat library,
> rather than as a base support layer; the simplest thing to do seemed
> to be to put them in the "gcc" subdir).

Ok.

Thanks,
Richard.

> gcc/ChangeLog:
> PR c/78498
> * selftest.c (selftest::assert_strndup_eq): New function.
> (selftest::test_strndup): New function.
> (selftest::test_libiberty): New function.
> (selftest::selftest_c_tests): Call test_libiberty.
>
> gcc/testsuite/ChangeLog:
> PR c/78498
> * gcc.dg/format/pr78494.c: New test case.
>
> libiberty/ChangeLog:
> PR c/78498
> * strndup.c (strlen): Delete decl.
> (strnlen): Add decl.
> (strndup): Call strnlen rather than strlen.
> * xstrndup.c: Likewise.
> ---
>  gcc/selftest.c| 48 
> +++
>  gcc/testsuite/gcc.dg/format/pr78494.c | 12 +
>  libiberty/strndup.c   |  7 ++---
>  libiberty/xstrndup.c  |  5 +---
>  4 files changed, 63 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/format/pr78494.c
>
> diff --git a/gcc/selftest.c b/gcc/selftest.c
> index 2a729be..6df73c2 100644
> --- a/gcc/selftest.c
> +++ b/gcc/selftest.c
> @@ -198,6 +198,53 @@ read_file (const location &loc, const char *path)
>return result;
>  }
>
> +/* Selftests for libiberty.  */
> +
> +/* Verify that both strndup and xstrndup generate EXPECTED
> +   when called on SRC and N.  */
> +
> +static void
> +assert_strndup_eq (const char *expected, const char *src, size_t n)
> +{
> +  char *buf = strndup (src, n);
> +  if (buf)
> +ASSERT_STREQ (expected, buf);
> +  free (buf);
> +
> +  buf = xstrndup (src, n);
> +  ASSERT_STREQ (expected, buf);
> +  free (buf);
> +}
> +
> +/* Verify that strndup and xstrndup work as expected.  */
> +
> +static void
> +test_strndup ()
> +{
> +  assert_strndup_eq ("", "test", 0);
> +  assert_strndup_eq ("t", "test", 1);
> +  assert_strndup_eq ("te", "test", 2);
> +  assert_strndup_eq ("tes", "test", 3);
> +  assert_strndup_eq ("test", "test", 4);
> +  assert_strndup_eq ("test", "test", 5);
> +
> +  /* Test on an string without zero termination.  */
> +  const char src[4] = {'t', 'e', 's', 't'};
> +  assert_strndup_eq ("", src, 0);
> +  assert_strndup_eq ("t", src, 1);
> +  assert_strndup_eq ("te", src, 2);
> +  assert_strndup_eq ("tes", src, 3);
> +  assert_strndup_eq ("test", src, 4);
> +}
> +
> +/* Run selftests for libiberty.  */
> +
> +static void
> +test_libiberty ()
> +{
> +  test_strndup ();
> +}
> +
>  /* Selftests for the selftest system itself.  */
>
>  /* Sanity-check the ASSERT_ macros with various passing cases.  */
> @@ -245,6 +292,7 @@ test_read_file ()
>  void
>  selftest_c_tests ()
>  {
> +  test_libiberty ();
>test_assertions ();
>test_named_temp_file ();
>test_read_file ();
> diff --git a/gcc/testsuite/gcc.dg/format/pr78494.c 
> b/gcc/testsuite/gcc.dg/format/pr78494.c
> new file mode 100644
> index 000..4b53a68
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/format/pr78494.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -Wall -Wextra -fdiagnostics-show-caret" } */
> +
> +void f (void)
> +{
> +  __builtin_printf ("%i", ""); /* { dg-warning "expects argument of type" } 
> */
> +/* { dg-begin-multiline-output "" }
> +   __builtin_printf ("%i", "");
> +  ~^   ~~
> +  %s
> +   { dg-end-multiline-output "" } */
> +}
> diff --git a/libiberty/strndup.c b/libiberty/strndup.c
> index 9e9b4e2..4556b96 100644
> --- a/libiberty/strndup.c
> +++ b/libiberty/strndup.c
> @@ -33,7 +33,7 @@ memory was available.  The result is always NUL terminated.
>  #include "ansidecl.h"
>  #include 
>
> -extern size_t  strlen (const char*);
> +extern size_t  strnlen (const char *s, size_t maxlen);
>  extern PTR malloc (size_t);
>  extern PTR memcpy 

Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Andreas Schwab
configure: error: no --with-target-bdw-gc options and no bdw-gc pkg-config 
module found
make[1]: *** [Makefile:19775: configure-target-libobjc] Error 1

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [RFA] Handle target with no length attributes sanely in bb-reorder.c

2016-11-30 Thread Richard Biener
On Tue, Nov 29, 2016 at 5:07 PM, Jeff Law  wrote:
> On 11/29/2016 03:23 AM, Richard Biener wrote:
>>
>> On Mon, Nov 28, 2016 at 10:23 PM, Jeff Law  wrote:
>>>
>>>
>>>
>>> I was digging into  issues around the patches for 78120 when I stumbled
>>> upon
>>> undesirable bb copying in bb-reorder.c on the m68k.
>>>
>>> The core issue is that the m68k does not define a length attribute and
>>> therefore generic code assumes that the length of all insns is 0 bytes.
>>
>>
>> What other targets behave like this?
>
> ft32, nvptx, mmix, mn10300, m68k, c6x, rl78, vax, ia64, m32c

Ok.

> cris has a hack to define a length, even though no attempt is made to make
> it accurate.  The hack specifically calls out that it's to make bb-reorder
> happy.
>
>>
>>> That in turn makes bb-reorder think it is infinitely cheap to copy basic
>>> blocks.  In the two codebases I looked at (GCC's runtime libraries and
>>> newlib) this leads to a 10% and 15% undesirable increase in code size.
>>>
>>> I've taken a slight variant of this patch and bootstrapped/regression
>>> tested
>>> it on x86_64-linux-gnu to verify sanity as well as built the m68k target
>>> libraries noted above.
>>>
>>> OK for the trunk?
>>
>>
>> I wonder if it isn't better to default to a length of 1 instead of zero
>> when
>> there is no length attribute.  There are more users of the length
>> attribute
>> in bb-reorder.c (and elsewhere as well I suppose).
>
> I pondered that as well, but felt it was riskier given we've had a default
> length of 0 for ports that don't define lengths since the early 90s.  It's
> certainly easy enough to change that default if you'd prefer.  I don't have
> a strong preference either way.

Thinking about this again maybe targets w/o insn-length should simply
always use the 'simple' algorithm instead of the STV one?  At least that
might be what your change effectively does in some way?

Richard.

> Jeff


[PATCH] simplify-rtx: Add missing line for previous commit (PR78583)

2016-11-30 Thread Segher Boessenkool
The comment for the added case to simplify_truncation reads

  /* Turn (truncate:M1 (*_extract:M2 (reg:M2) (len) (pos))) into
 (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without
 changing len.  */

but I forget to check the two modes M2 are actually the same.  Tested on
powerpc64-linux, committing to trunk as obvious.


Segher


2016-11-30  Segher Boessenkool  

PR rtl-optimization/78583
* simplify-rtx.c (simplify_truncation): Add check missing from the
previous commit.

---
 gcc/simplify-rtx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index f922aaf..ead2f5a 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -752,6 +752,7 @@ simplify_truncation (machine_mode mode, rtx op,
  changing len.  */
   if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT)
   && REG_P (XEXP (op, 0))
+  && GET_MODE (XEXP (op, 0)) == mode
   && CONST_INT_P (XEXP (op, 1))
   && CONST_INT_P (XEXP (op, 2)))
 {
-- 
1.9.3



Re: [libstdc++, testsuite] Add dg-require-thread-fence

2016-11-30 Thread Christophe Lyon
On 29 November 2016 at 21:18, Jonathan Wakely  wrote:
> On 16/11/16 22:18 +0100, Christophe Lyon wrote:
>>
>> On 15 November 2016 at 12:50, Jonathan Wakely  wrote:
>>>
>>> On 14/11/16 14:32 +0100, Christophe Lyon wrote:


 On 20 October 2016 at 19:40, Jonathan Wakely  wrote:
>
>
> On 20/10/16 10:33 -0700, Mike Stump wrote:
>>
>>
>>
>> On Oct 20, 2016, at 9:34 AM, Jonathan Wakely 
>> wrote:
>>>
>>>
>>>
>>>
>>> On 20/10/16 09:26 -0700, Mike Stump wrote:



 On Oct 20, 2016, at 5:20 AM, Jonathan Wakely 
 wrote:
>
>
>
>
> I am considering leaving this in the ARM backend to force people to
> think what they want to do about thread safety with statics and C++
> on bare-metal systems.
>>>
>>>
>>>
>>>
>>> The quoting makes it look like those are my words, but I was quoting
>>> Ramana from https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02751.html
>>>
 Not quite in the GNU spirit?  The port people should decide the best
 way
 to get as much functionality as possible and everything should just
 work, no
 sharp edges.

 Forcing people to think sounds like a sharp edge?
>>>
>>>
>>>
>>>
>>> I'm inclined to agree, but we are talking about bare metal systems,
>>
>>
>>
>>
>> So?  gcc has been doing bare metal systems for more than 2 years now.
>> It
>> is pretty good at it.  All my primary targets today are themselves
>> bare
>> metal systems (I test with newlib).
>>
>>> where there is no one-size-fits-all solution.
>>
>>
>>
>>
>> Configurations are like ice cream cones.  Everyone gets their flavor
>> no
>> matter how weird or strange.  Putting nails in a cone because you
>> don't
>> know
>> if they like vanilla or chocolate isn't reasonable.  If you want, make
>> two
>> flavors, and vend two, if you want to just do one, pick the flavor and
>> vend
>> it.  Put an enum #define default_flavor vanilla, and you then have
>> support
>> for any flavor you want.  Want to add a configure option for the
>> flavor
>> select, add it.  You want to make a -mflavor=chocolate option, add it.
>> gcc
>> is literally littered with these things.
>
>
>
>
> Like I said, you can either build the library with
> -fno-threadsafe-statics or you can provide a definition of the missing
> symbol.
>
 I gave this a try (using CXXFLAGS_FOR_TARGET=-fno-threadsafe-statics).
 It seems to do the trick indeed: almost all tests now pass, the flag is
 added
 to testcase compilation.

 Among the 6 remaining failures, I noticed these two:
 - experimental/type_erased_allocator/2.cc: still complains about the
 missing
 __sync_synchronize. Does it need dg-require-thread-fence?
>>>
>>>
>>>
>>> Yes, I think that test actually uses atomics directly, so does depend
>>> on the fence.
>>>
>> I've attached the patch to achieve this.
>> Is it OK?
>
>
> Yes, OK, thanks.
>
Thanks, committed.

 - abi/header_cxxabi.c complains because the option is not valid for C.
 I can see the test is already skipped for other C++-only options: it is
 OK
 if I submit a patch to skip it if -fno-threadsafe-statics is used?
>>>
>>>
>>>
>>> Yes, it makes sense there too.
>>
>>
>> This one is not as obvious as I hoped. I tried:
>> -// { dg-skip-if "invalid options for C" { *-*-* } { "-std=c++??"
>> "-std=gnu++??" } }
>> +// { dg-skip-if "invalid options for C" { *-*-* } { "-std=c++??"
>> "-std=gnu++??" "-fno-threadsafe-statics" } }
>>
>> but it does not work.
>>
>> I set CXXFLAGS_FOR_TARGET=-fno-threadsafe-statics
>> before running GCC's configure.
>>
>> This results in -fno-threadsafe-statics being used when compiling the
>> tests,
>> but dg-skip-if does not consider it: it would if I passed it via
>> runtestflags/target-board, but then it would mean passing this flag
>> to all tests, not only the c++ ones, leading to errors everywhere.
>>
>> Am I missing something?
>
>
> I'm not sure how to deal with that.
>
I'll try to think about it, but I can live with that single "known" failure.
It's better than the ~3500 failures I used to have, and hopefully
won't report false regressions anymore.

Christophe


Re: [PATCH] Fix PR78306

2016-11-30 Thread Richard Biener
On Tue, 29 Nov 2016, Jeff Law wrote:

> On 11/29/2016 12:47 AM, Richard Biener wrote:
> > > Balaji added this check explicitly. There should be tests in the testsuite
> > > (spawnee_inline, spawner_inline) which exercise that code.
> > 
> > Yes he did, but no, nothing in the testsuite.
> I believe the tests are:
> 
> c-c++-common/cilk-plus/CK/spawnee_inline.c
> c-c++-common/cilk-plus/CK/spawner_inline.c
> 
> But as I mentioned, they don't check for proper behaviour

Actually they do -- and both show what the issue might be, cilk+
uses setjmp but we already have code to disallow inlining of
functions calling setjmp (but we happily inline into functions
calling setjmp).  When mangling the testcases to try forcing
inlining I still (the patch was already applied) get

/space/rguenther/src/gcc-git/gcc/testsuite/c-c++-common/cilk-plus/CK/spawnee_inline.c:
 
In function ‘fib’:
/space/rguenther/src/gcc-git/gcc/testsuite/c-c++-common/cilk-plus/CK/spawnee_inline.c:9:50:
 
error: function ‘fib’ can never be copied because it receives a non-local 
goto

so the intent was probably to disallow inlining of functions calling
cilk_spawn, not to disable inlining into functions calling cilk_spawn.

But as seen above this is already handled by generic code handling
setjmp.

> 
> > 
> > There is _nowhere_ documented _why_ the checks were added.  Why is
> > inlining a transform that can do anything bad to a function using
> > cilk_spawn?
> I know, it's disappointing.  Even the tests mentioned above don't shed any
> real light on the issue.

One issue is obvious (but already handled).  Why all inlining should
be disabled is indeed still a mystery.

Richard.

> 
> Jeff
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [patch,lto] Fix PR78562: Wrong type mismatch warning for built-ins with same asm name.

2016-11-30 Thread Richard Biener
On Tue, 29 Nov 2016, Georg-Johann Lay wrote:

> This is a fix for a wrong warning from -Wlto-type-mismatch that reports a type
> mismatch for two built-in functions.
> 
> The avr backend has several built-ins that have the same asm name because
> their assembler implementation in libgcc is exactly the same. The prototypes
> might differ, however.
> 
> This patch skips the warning for built-in types as discussed in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78562#c6
> 
> Testing against avr-unknown-none, this resolves all FAILs because of that
> warning, e.g. gcc.target/avr/torture/builtins-5-countlsfx.c
> 
> Ok for trunk?

Ok.

Thanks,
Richard.

> Johann
> 
> gcc/lto/
>   PR lto/78562
>   * lto-symtab.c (lto_symtab_merge_decls_2): Don't diagnose type
>   mismatch if the two types are built-in.
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

2016-11-30 Thread Christophe Lyon
Hi Tamar,


On 29 November 2016 at 14:54, James Greenhalgh  wrote:
> On Tue, Nov 29, 2016 at 01:48:22PM +, Kyrill Tkachov wrote:
>>
>> On 29/11/16 09:50, Tamar Christina wrote:
>> >Hi All,
>> >
>> >The new patch contains the proper types for the intrinsics that should be 
>> >returning uint64x1
>> >and has the rest of the comments by Christophe in them.
>>
>> Ok with an appropriate ChangeLog entry.
>
> Also OK from an AArch64 persepctive based on the detailed review from
> Christophe.
>
> Thanks,
> James
>

After you committed this patch (r242962), I've noticed some
regressions as follows:
* on aarch64, vreinterpret_p128 and vreinterpret_p64 fail to compile
with errors like
warning: implicit declaration of function 'vreinterpretq_p64_p128
warning: implicit declaration of function 'vreinterpretq_p128_s8
error: incompatible types when assigning to type 'poly64x2_t' from type 'int'
etc...

* on arm configured for armv8-a, several tests fail to link or compile:
vbsl.c:(.text+0x24f0): undefined reference to `expected_poly64x1'
vdup-vmov.c:227:38: error: 'expected0_poly64x1' undeclared
vdup_lane.c:(.text+0x1584): undefined reference to `expected_poly64x1'

You can have more details at
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242962/report-build-info.html


Christophe


Re: [PATCH][AArch64] PR target/71112: Properly create lowpart of pic_offset_table_rtx with -fpie

2016-11-30 Thread Kyrill Tkachov


On 30/11/16 05:27, Andrew Pinski wrote:

On Tue, Nov 29, 2016 at 1:09 AM, Kyrill Tkachov
 wrote:

Hi all,

This ICE only occurs on big-endian ILP32 -fpie code. The expansion code
generates the invalid load:
(insn 6 5 7 (set (reg/f:SI 76)
 (unspec:SI [
 (mem/u/c:SI (lo_sum:SI (nil)
 (symbol_ref:SI ("dbs") [flags 0x40] )) [0  S4 A8])
 ] UNSPEC_GOTSMALLPIC28K))
  (expr_list:REG_EQUAL (symbol_ref:SI ("dbs") [flags 0x40] )
 (nil)))

to load the symbol. Note the (nil) argument to lo_sum.
The buggy hunk meant to take the lowpart of the pic_offset_table_rtx
register but it did so by explicitly
constructing a subreg, for which the offset is wrong for big-endian. The
right way is to use gen_lowpart which
knows what exactly to do, with this patch we emit:
(insn 6 5 7 (set (reg/f:SI 76)
 (unspec:SI [
 (mem/u/c:SI (lo_sum:SI (subreg:SI (reg:DI 73) 4)
 (symbol_ref:SI ("dbs") [flags 0x40] )) [0  S4 A8])
 ] UNSPEC_GOTSMALLPIC28K))
  (expr_list:REG_EQUAL (symbol_ref:SI ("dbs") [flags 0x40] )
 (nil)))

and everything works fine.

Bootstrapped and tested on aarch64-none-linux-gnu.
Also tested on aarch64_be-none-elf.
Ok for trunk?

Naveen posted the exact same patch:
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02305.html
Though with a slightly different testcase :).


Ah, indeed. Sorry for the noise :)
Kyrill


Thanks,
Andrew


Thanks,
Kyrill

2016-11-29  Kyrylo Tkachov  

 PR target/71112
 * config/aarch64/aarch64.c (aarch64_load_symref_appropriately,
 case SYMBOL_SMALL_GOT_28K): Use gen_lowpart rather than constructing
 subreg directly.

2016-11-29  Kyrylo Tkachov  

 PR target/71112
 * gcc.c-torture/compile/pr71112.c: New test.




RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

2016-11-30 Thread Tamar Christina
Hi Christophe,

> After you committed this patch (r242962), I've noticed some regressions as
> follows:
> * on aarch64, vreinterpret_p128 and vreinterpret_p64 fail to compile with
> errors like
> warning: implicit declaration of function 'vreinterpretq_p64_p128
> warning: implicit declaration of function 'vreinterpretq_p128_s8
> error: incompatible types when assigning to type 'poly64x2_t' from type 'int'
> etc...

Sorry for the screw up. On the last patch I only tested the file p64_p128.c.
I'll fix these asap.

> 
> * on arm configured for armv8-a, several tests fail to link or compile:
> vbsl.c:(.text+0x24f0): undefined reference to `expected_poly64x1'
> vdup-vmov.c:227:38: error: 'expected0_poly64x1' undeclared
> vdup_lane.c:(.text+0x1584): undefined reference to `expected_poly64x1'
> 
> You can have more details at
> http://people.linaro.org/~christophe.lyon/cross-
> validation/gcc/trunk/242962/report-build-info.html
> 
> 
> Christophe


Re: [PATCH, GCC/ARM] Fix PR77933: stack corruption on ARM when using high registers and lr

2016-11-30 Thread Thomas Preudhomme

Hi,

Is this ok to backport to gcc-5-branch and gcc-6-branch? Patch applies cleanly 
(patches attached for reference).



2016-11-17  Thomas Preud'homme  

Backport from mainline
2016-11-17  Thomas Preud'homme  

gcc/
PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
being live in the function and lr needing to be saved.  Distinguish
between already saved pushable registers and registers to push.
Check for LR being an available pushable register.

gcc/testsuite/
PR target/77933
* gcc.target/arm/pr77933-1.c: New test.
* gcc.target/arm/pr77933-2.c: Likewise.


Best regards,

Thomas


On 17/11/16 20:15, Thomas Preudhomme wrote:

Hi Kyrill,

I've committed the following updated patch where the test is restricted to Thumb
execution mode and skipping it if not possible since -mtpcs-leaf-frame is only
available in Thumb mode. I've considered the change obvious.

*** gcc/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
being live in the function and lr needing to be saved.  Distinguish
between already saved pushable registers and registers to push.
Check for LR being an available pushable register.


*** gcc/testsuite/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* gcc.target/arm/pr77933-1.c: New test.
* gcc.target/arm/pr77933-2.c: Likewise.

Best regards,

Thomas

On 17/11/16 10:04, Kyrill Tkachov wrote:


On 09/11/16 16:41, Thomas Preudhomme wrote:

I've reworked the patch following comments from Wilco [1] (sorry could not
find it in my MUA for some reason).

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00317.html


== Context ==

When saving registers, function thumb1_expand_prologue () aims at minimizing
the number of push instructions. One of the optimization it does is to push LR
alongside high register(s) (after having moved them to low register(s)) when
there is no low register to save. The way this is implemented is to add LR to
the pushable_regs mask if it is live just before pushing the registers in that
mask. The mask of live pushable registers which is used to detect whether LR
needs to be saved is then clear to ensure LR is only saved once.


== Problem ==

However beyond deciding what register to push pushable_regs is used to track
what pushable register can be used to move a high register before being
pushed, hence the name. That mask is cleared when all high registers have been
assigned a low register but the clearing assumes the high registers were
assigned to the registers with the biggest number in that mask. This is not
the case because LR is not considered when looking for a register in that
mask. Furthermore, LR might have been saved in the TARGET_BACKTRACE path above
yet the mask of live pushable registers is not cleared in that case.


== Solution ==

This patch changes the loop to iterate over register LR to r0 so as to both
fix the stack corruption reported in PR77933 and reuse lr to push some high
register when possible. This patch also introduce a new variable
lr_needs_saving to record whether LR (still) needs to be saved at a given
point in code and sets the variable accordingly throughout the code, thus
fixing the second issue. Finally, this patch create a new push_mask variable
to distinguish between the mask of registers to push and the mask of live
pushable registers.


== Note ==

Other bits could have been improved but have been left out to allow the patch
to be backported to stable branch:

(1) using argument registers that are not holding an argument
(2) using push_mask consistently instead of l_mask (in TARGET_BACKTRACE), mask
(low register push) and push_mask
(3) the !l_mask case improved in TARGET_BACKTRACE since offset == 0
(4) rename l_mask to a more appropriate name (live_pushable_regs_mask?)

ChangeLog entry are as follow:

*** gcc/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
being live in the function and lr needing to be saved. Distinguish
between already saved pushable registers and registers to push.
Check for LR being an available pushable register.


*** gcc/testsuite/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* gcc.target/arm/pr77933-1.c: New test.
* gcc.target/arm/pr77933-2.c: Likewise.


Testing: no regression on arm-none-eabi GCC cross-compiler targeting Cortex-M0

Is this ok for trunk?



Ok.
Thanks,
Kyrill


Best regards,

Thomas

On 02/11/16 17:08, Thomas Preudhomme wrote:

Hi,

When saving registers, function thumb1_expand_prologue () aims at minimizing
the
number of push instructions. One of the optimization it does is to push lr
alongside high register(s) (after having moved them to low register(s)) when
there is no low register to s

[Patch, Fortran, committed] PR 78592: [7 Regression] ICE in gfc_find_specific_dtio_proc, at fortran/interface.c:4939

2016-11-30 Thread Janus Weil
Hi all,

I have just committed a completely obvious patch for this PR. All it
does is rearrange some expressions to avoid an ICE (see attachment):

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=243005

Cheers,
Janus
Index: gcc/fortran/interface.c
===
--- gcc/fortran/interface.c (revision 243004)
+++ gcc/fortran/interface.c (working copy)
@@ -4933,15 +4933,15 @@ gfc_find_specific_dtio_proc (gfc_symbol *derived,
  && tb_io_st->n.sym
  && tb_io_st->n.sym->generic)
{
- gfc_interface *intr;
- for (intr = tb_io_st->n.sym->generic; intr; intr = intr->next)
+ for (gfc_interface *intr = tb_io_st->n.sym->generic;
+  intr && intr->sym && intr->sym->formal;
+  intr = intr->next)
{
  gfc_symbol *fsym = intr->sym->formal->sym;
- if (intr->sym && intr->sym->formal
- && ((fsym->ts.type == BT_CLASS
- && CLASS_DATA (fsym)->ts.u.derived == extended)
-   || (fsym->ts.type == BT_DERIVED
-   && fsym->ts.u.derived == extended)))
+ if ((fsym->ts.type == BT_CLASS
+  && CLASS_DATA (fsym)->ts.u.derived == extended)
+ || (fsym->ts.type == BT_DERIVED
+ && fsym->ts.u.derived == extended))
{
  dtio_sub = intr->sym;
  break;
Index: gcc/testsuite/gfortran.dg/dtio_18.f90
===
--- gcc/testsuite/gfortran.dg/dtio_18.f90   (nonexistent)
+++ gcc/testsuite/gfortran.dg/dtio_18.f90   (working copy)
@@ -0,0 +1,15 @@
+! { dg-do compile }
+!
+! PR 78592: [7 Regression] ICE in gfc_find_specific_dtio_proc, at 
fortran/interface.c:4939
+!
+! Contributed by Gerhard Steinmetz 
+
+program p
+   type t
+   end type
+   type(t) :: z
+   interface write(formatted)
+  module procedure wf! { dg-error "is neither function nor subroutine" 
}
+   end interface
+   print *, z
+end


Re: [PATCH] combine: Tweak change_zero_ext

2016-11-30 Thread Bin.Cheng
On Tue, Nov 29, 2016 at 9:14 PM, Christophe Lyon
 wrote:
> On 29 November 2016 at 20:38, Uros Bizjak  wrote:
>>> 2016-11-26  Segher Boessenkool  
>>>
>>> * combine.c (change_zero_ext): Also handle extends from a subreg
>>> to a mode bigger than that of the operand of the subreg.
>>
>> This patch introduced:
>>
>> FAIL: gcc.target/i386/pr44578.c (internal compiler error)
>>
>> on i686 (or x86_64 32bit multi-lib).
>>
>> ./cc1 -O2 -mtune=athlon64 -m32 -quiet pr44578.c
>> pr44578.c: In function ‘test’:
>> pr44578.c:18:1: internal compiler error: in gen_rtx_SUBREG, at emit-rtl.c:908
>>  }
>>  ^
>> 0x81493b gen_rtx_SUBREG(machine_mode, rtx_def*, int)
>> /home/uros/gcc-svn/trunk/gcc/emit-rtl.c:908
>> 0x122609f change_zero_ext
>> /home/uros/gcc-svn/trunk/gcc/combine.c:11260
>> 0x1226207 recog_for_combine
>> /home/uros/gcc-svn/trunk/gcc/combine.c:11346
>> 0x1236db3 try_combine
>> /home/uros/gcc-svn/trunk/gcc/combine.c:3501
>> 0x123a3e0 combine_instructions
>> /home/uros/gcc-svn/trunk/gcc/combine.c:1265
>> 0x123a3e0 rest_of_handle_combine
>> /home/uros/gcc-svn/trunk/gcc/combine.c:14581
>> 0x123a3e0 execute
>> /home/uros/gcc-svn/trunk/gcc/combine.c:14626
>>
>> Uros.
>
> Hi,
>
> I'm seeing a similar error on aarch64:
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vduph_lane.c   -O1
> (internal compiler error)
> with the same backtrace.
Hi,
I also saw the same failure for below tests on aarch64.

/tmp/.../src/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vget_lane.c:
In function 'exec_vget_lane':
/tmp/.../src/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vget_lane.c:136:1:
internal compiler error: in gen_rtx_SUBREG, at emit-rtl.c:908
0x7d9cd1 gen_rtx_SUBREG(machine_mode, rtx_def*, int)
/tmp/.../src/gcc/gcc/emit-rtl.c:908
0x7d9d25 gen_lowpart_SUBREG(machine_mode, rtx_def*)
/tmp/.../src/gcc/gcc/emit-rtl.c:924
0x1112562 change_zero_ext
/tmp/.../src/gcc/gcc/combine.c:11260
0x1112b47 recog_for_combine
/tmp/.../src/gcc/gcc/combine.c:11346

Thanks,
bin
>
> Christophe


Re: [PATCH, GCC/ARM] Fix PR77933: stack corruption on ARM when using high registers and lr

2016-11-30 Thread Richard Earnshaw (lists)
On 30/11/16 09:50, Thomas Preudhomme wrote:
> Hi,
> 
> Is this ok to backport to gcc-5-branch and gcc-6-branch? Patch applies
> cleanly (patches attached for reference).
> 
> 
> 2016-11-17  Thomas Preud'homme  
> 
> Backport from mainline
> 2016-11-17  Thomas Preud'homme  
> 
> gcc/
> PR target/77933
> * config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
> being live in the function and lr needing to be saved.  Distinguish
> between already saved pushable registers and registers to push.
> Check for LR being an available pushable register.
> 
> gcc/testsuite/
> PR target/77933
> * gcc.target/arm/pr77933-1.c: New test.
> * gcc.target/arm/pr77933-2.c: Likewise.
> 

Your attached patch doesn't appear to match your ChangeLog.  (rmprofile
patch?).

R.

> 
> Best regards,
> 
> Thomas
> 
> 
> On 17/11/16 20:15, Thomas Preudhomme wrote:
>> Hi Kyrill,
>>
>> I've committed the following updated patch where the test is
>> restricted to Thumb
>> execution mode and skipping it if not possible since -mtpcs-leaf-frame
>> is only
>> available in Thumb mode. I've considered the change obvious.
>>
>> *** gcc/ChangeLog ***
>>
>> 2016-11-08  Thomas Preud'homme  
>>
>> PR target/77933
>> * config/arm/arm.c (thumb1_expand_prologue): Distinguish
>> between lr
>> being live in the function and lr needing to be saved. 
>> Distinguish
>> between already saved pushable registers and registers to push.
>> Check for LR being an available pushable register.
>>
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2016-11-08  Thomas Preud'homme  
>>
>> PR target/77933
>> * gcc.target/arm/pr77933-1.c: New test.
>> * gcc.target/arm/pr77933-2.c: Likewise.
>>
>> Best regards,
>>
>> Thomas
>>
>> On 17/11/16 10:04, Kyrill Tkachov wrote:
>>>
>>> On 09/11/16 16:41, Thomas Preudhomme wrote:
 I've reworked the patch following comments from Wilco [1] (sorry
 could not
 find it in my MUA for some reason).

 [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00317.html


 == Context ==

 When saving registers, function thumb1_expand_prologue () aims at
 minimizing
 the number of push instructions. One of the optimization it does is
 to push LR
 alongside high register(s) (after having moved them to low
 register(s)) when
 there is no low register to save. The way this is implemented is to
 add LR to
 the pushable_regs mask if it is live just before pushing the
 registers in that
 mask. The mask of live pushable registers which is used to detect
 whether LR
 needs to be saved is then clear to ensure LR is only saved once.


 == Problem ==

 However beyond deciding what register to push pushable_regs is used
 to track
 what pushable register can be used to move a high register before being
 pushed, hence the name. That mask is cleared when all high registers
 have been
 assigned a low register but the clearing assumes the high registers
 were
 assigned to the registers with the biggest number in that mask. This
 is not
 the case because LR is not considered when looking for a register in
 that
 mask. Furthermore, LR might have been saved in the TARGET_BACKTRACE
 path above
 yet the mask of live pushable registers is not cleared in that case.


 == Solution ==

 This patch changes the loop to iterate over register LR to r0 so as
 to both
 fix the stack corruption reported in PR77933 and reuse lr to push
 some high
 register when possible. This patch also introduce a new variable
 lr_needs_saving to record whether LR (still) needs to be saved at a
 given
 point in code and sets the variable accordingly throughout the code,
 thus
 fixing the second issue. Finally, this patch create a new push_mask
 variable
 to distinguish between the mask of registers to push and the mask of
 live
 pushable registers.


 == Note ==

 Other bits could have been improved but have been left out to allow
 the patch
 to be backported to stable branch:

 (1) using argument registers that are not holding an argument
 (2) using push_mask consistently instead of l_mask (in
 TARGET_BACKTRACE), mask
 (low register push) and push_mask
 (3) the !l_mask case improved in TARGET_BACKTRACE since offset == 0
 (4) rename l_mask to a more appropriate name (live_pushable_regs_mask?)

 ChangeLog entry are as follow:

 *** gcc/ChangeLog ***

 2016-11-08  Thomas Preud'homme  

 PR target/77933
 * config/arm/arm.c (thumb1_expand_prologue): Distinguish
 between lr
 being live in the function and lr needing to be saved.
 Distinguish
 between already saved pushable registers and registers to push.
 Check fo

Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Matthias Klose
On 30.11.2016 09:29, Andreas Schwab wrote:
> configure: error: no --with-target-bdw-gc options and no bdw-gc pkg-config 
> module found
> make[1]: *** [Makefile:19775: configure-target-libobjc] Error 1
> 
> Andreas.

that's a bit terse. Could you send the complete output for the configuration of
the libobjc subdir and the config.log?

I assume that is a configuration with --enable-objc-gc and then the pkg-config
module cannot be found.  Are gc/gc.h and libgc.so in standard paths without
having the bdw-gc pkg-config module available? Which libgc version is installed?

Thanks, Matthias



Re: [PATCH] combine: Tweak change_zero_ext

2016-11-30 Thread Kyrill Tkachov


On 30/11/16 10:02, Bin.Cheng wrote:

On Tue, Nov 29, 2016 at 9:14 PM, Christophe Lyon
 wrote:

On 29 November 2016 at 20:38, Uros Bizjak  wrote:

2016-11-26  Segher Boessenkool  

* combine.c (change_zero_ext): Also handle extends from a subreg
to a mode bigger than that of the operand of the subreg.

This patch introduced:

FAIL: gcc.target/i386/pr44578.c (internal compiler error)

on i686 (or x86_64 32bit multi-lib).

./cc1 -O2 -mtune=athlon64 -m32 -quiet pr44578.c
pr44578.c: In function ‘test’:
pr44578.c:18:1: internal compiler error: in gen_rtx_SUBREG, at emit-rtl.c:908
  }
  ^
0x81493b gen_rtx_SUBREG(machine_mode, rtx_def*, int)
 /home/uros/gcc-svn/trunk/gcc/emit-rtl.c:908
0x122609f change_zero_ext
 /home/uros/gcc-svn/trunk/gcc/combine.c:11260
0x1226207 recog_for_combine
 /home/uros/gcc-svn/trunk/gcc/combine.c:11346
0x1236db3 try_combine
 /home/uros/gcc-svn/trunk/gcc/combine.c:3501
0x123a3e0 combine_instructions
 /home/uros/gcc-svn/trunk/gcc/combine.c:1265
0x123a3e0 rest_of_handle_combine
 /home/uros/gcc-svn/trunk/gcc/combine.c:14581
0x123a3e0 execute
 /home/uros/gcc-svn/trunk/gcc/combine.c:14626

Uros.

Hi,

I'm seeing a similar error on aarch64:
FAIL: gcc.target/aarch64/advsimd-intrinsics/vduph_lane.c   -O1
(internal compiler error)
with the same backtrace.

Hi,
I also saw the same failure for below tests on aarch64.

/tmp/.../src/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vget_lane.c:
In function 'exec_vget_lane':
/tmp/.../src/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vget_lane.c:136:1:
internal compiler error: in gen_rtx_SUBREG, at emit-rtl.c:908
0x7d9cd1 gen_rtx_SUBREG(machine_mode, rtx_def*, int)
 /tmp/.../src/gcc/gcc/emit-rtl.c:908
0x7d9d25 gen_lowpart_SUBREG(machine_mode, rtx_def*)
 /tmp/.../src/gcc/gcc/emit-rtl.c:924
0x1112562 change_zero_ext
 /tmp/.../src/gcc/gcc/combine.c:11260
0x1112b47 recog_for_combine
 /tmp/.../src/gcc/gcc/combine.c:11346


This is PR 78590 and Segher fixed it today.
Kyrill


Thanks,
bin

Christophe




Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Andreas Schwab
On Nov 30 2016, Matthias Klose  wrote:

> I assume that is a configuration with --enable-objc-gc

No.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Richard Biener
On Wed, Nov 30, 2016 at 11:06 AM, Matthias Klose  wrote:
> On 30.11.2016 09:29, Andreas Schwab wrote:
>> configure: error: no --with-target-bdw-gc options and no bdw-gc pkg-config 
>> module found
>> make[1]: *** [Makefile:19775: configure-target-libobjc] Error 1
>>
>> Andreas.
>
> that's a bit terse. Could you send the complete output for the configuration 
> of
> the libobjc subdir and the config.log?
>
> I assume that is a configuration with --enable-objc-gc and then the pkg-config
> module cannot be found.  Are gc/gc.h and libgc.so in standard paths without
> having the bdw-gc pkg-config module available? Which libgc version is 
> installed?

I see the same failure with just

 ../configure --enable-languages=objc

usually we disable languages (with a diagnostic) if requirements
cannot be fulfilled.

But it seems the default chosen is bad somehow... (and breaks my bootstraps with
default languages).

Richard.


> Thanks, Matthias
>


Re: [PATCH, GCC/ARM] Fix ICE when compiling empty FIQ interrupt handler in ARM mode

2016-11-30 Thread Thomas Preudhomme

Hi,

Is this ok to backport this fix together with its follow-up testcase fix to 
gcc-5-branch and gcc-6-branch? Both patches apply cleanly (patches attached for 
reference).



2016-11-30  Thomas Preud'homme  

Backport from mainline
2016-11-16  Thomas Preud'homme  

gcc/
* config/arm/arm.md (arm_addsi3): Add alternative for addition of
general register with general register or ARM constant into SP
register.

gcc/testsuite/
* gcc.target/arm/empty_fiq_handler.c: New test.

Backport from mainline
2016-11-21  Thomas Preud'homme  

gcc/testsuite/
* gcc.target/arm/empty_fiq_handler.c: Skip if -mthumb is passed in and
target is Thumb-only.


Best regards,

Thomas


On 16/11/16 09:39, Kyrill Tkachov wrote:


On 09/11/16 16:19, Thomas Preudhomme wrote:

Hi,

This patch fixes the following ICE when building when compiling an empty FIQ
interrupt handler in ARM mode:

empty_fiq_handler.c:5:1: error: insn does not satisfy its constraints:
 }
 ^

(insn/f 13 12 14 (set (reg/f:SI 13 sp)
(plus:SI (reg/f:SI 11 fp)
(const_int 4 [0x4]))) irq.c:5 4 {*arm_addsi3}
 (expr_list:REG_CFA_ADJUST_CFA (set (reg/f:SI 13 sp)
(plus:SI (reg/f:SI 11 fp)
(const_int 4 [0x4])))
(nil)))

The ICE was provoked by missing an alternative to reflect that ARM mode can do
an add of general register into sp which is unpredictable in Thumb mode add
immediate.

ChangeLog entries are as follow:

*** gcc/ChangeLog ***

2016-11-04  Thomas Preud'homme  

* config/arm/arm.md (arm_addsi3): Add alternative for addition of
general register with general register or ARM constant into SP
register.


*** gcc/testsuite/ChangeLog ***

2016-11-04  Thomas Preud'homme  

* gcc.target/arm/empty_fiq_handler.c: New test.


Testing: bootstrapped on ARMv7-A ARM mode & testsuite shows no regression.

Is this ok for trunk?



I see that "r" does not include the stack pointer (STACK_REG is separate from
GENERAL_REGs) so we are indeed missing
that constraint.

Ok for trunk.
Thanks,
Kyrill


Best regards,

Thomas


diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 32e015d44b3503612b5f29d895880cb52e523817..faa08e47b6b1b21de7fa11cc3f878003bf6ee081 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -567,9 +567,9 @@
 ;;  (plus (reg rN) (reg sp)) into (reg rN).  In this case reload will
 ;; put the duplicated register first, and not try the commutative version.
 (define_insn_and_split "*arm_addsi3"
-  [(set (match_operand:SI  0 "s_register_operand" "=rk,l,l ,l ,r ,k ,r,r ,k ,r ,k,k,r ,k ,r")
-(plus:SI (match_operand:SI 1 "s_register_operand" "%0 ,l,0 ,l ,rk,k ,r,rk,k ,rk,k,r,rk,k ,rk")
- (match_operand:SI 2 "reg_or_int_operand" "rk ,l,Py,Pd,rI,rI,k,Pj,Pj,L ,L,L,PJ,PJ,?n")))]
+  [(set (match_operand:SI  0 "s_register_operand" "=rk,l,l ,l ,r ,k ,r,k ,r ,k ,r ,k,k,r ,k ,r")
+	(plus:SI (match_operand:SI 1 "s_register_operand" "%0 ,l,0 ,l ,rk,k ,r,r ,rk,k ,rk,k,r,rk,k ,rk")
+		 (match_operand:SI 2 "reg_or_int_operand" "rk ,l,Py,Pd,rI,rI,k,rI,Pj,Pj,L ,L,L,PJ,PJ,?n")))]
   "TARGET_32BIT"
   "@
add%?\\t%0, %0, %2
@@ -579,6 +579,7 @@
add%?\\t%0, %1, %2
add%?\\t%0, %1, %2
add%?\\t%0, %2, %1
+   add%?\\t%0, %1, %2
addw%?\\t%0, %1, %2
addw%?\\t%0, %1, %2
sub%?\\t%0, %1, #%n2
@@ -598,10 +599,10 @@
 		  operands[1], 0);
   DONE;
   "
-  [(set_attr "length" "2,4,4,4,4,4,4,4,4,4,4,4,4,4,16")
+  [(set_attr "length" "2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,16")
(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "yes,yes,yes,yes,no,no,no,no,no,no,no,no,no,no,no")
-   (set_attr "arch" "t2,t2,t2,t2,*,*,*,t2,t2,*,*,a,t2,t2,*")
+   (set_attr "predicable_short_it" "yes,yes,yes,yes,no,no,no,no,no,no,no,no,no,no,no,no")
+   (set_attr "arch" "t2,t2,t2,t2,*,*,*,a,t2,t2,*,*,a,t2,t2,*")
(set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
 		  (const_string "alu_imm")
 		  (const_string "alu_sreg")))
diff --git a/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
new file mode 100644
index ..8313f2199122be153a737946e817a5e3bee60372
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { ! arm_cortex_m } { "-mthumb" } } */
+
+/* Below code used to trigger an ICE due to missing constraints for
+   sp = fp + cst pattern.  */
+
+void fiq_handler (void) __attribute__((interrupt ("FIQ")));
+
+void
+fiq_handler (void)
+{
+}
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 47171b99682207226aa4f9a76d4dfb54d6c2814b..86df1c0366be6c4b9b4ebf76821a8100c4e9fc16 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -575,9 +575,9 @@
 ;;  (plus (reg rN) (reg sp)) into (reg rN).  In this case reload will
 ;; put the duplicated register first, and not try the

Re: [PATCH][AArch64] PR target/78362: Make sure to only take REGNO of a register

2016-11-30 Thread Kyrill Tkachov

Ping.

Thanks,
Kyrill

On 23/11/16 14:16, Kyrill Tkachov wrote:


Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01664.html

Thanks,
Kyrill

On 16/11/16 16:57, Kyrill Tkachov wrote:

Hi all,

As the PR says we have an RTL checking failure that occurs when building libgcc 
for aarch64.
The expander code for addsi3 takes the REGNO of a SUBREG in operands[1]. The 
three operands
in the failing case are:
{(reg:SI 78), (subreg:SI (reg:DI 77) 0), (subreg:SI (reg:DI 73 [ ivtmp.9 ]) 0)}

According to the documentation of register_operand (which is the predicate for 
operands[1]),
operands[1] can be a REG or a SUBREG. If it's a subreg it may also contain a 
MEM before reload
(because it is guaranteed to be reloaded into a register later). Anyway, the 
bottom line is that
we have to be careful when taking REGNO of expressions during expand-time.

This patch extracts the inner rtx in case we have a SUBREG and checks that it's 
a REG before
checking its REGNO.

Bootstrapped and tested on aarch64-none-linux-gnu. Tested aarch64-none-elf with 
RTL checking enabled
(without this patch that doesn't build).

Ok for trunk?
Thanks,
Kyrill

2016-11-16  Kyrylo Tkachov  

PR target/78362
* config/aarch64/aarch64.md (add3): Extract inner expression
from a subreg in operands[1] and don't call REGNO on a non-reg expression
when deciding to force operands[2] into a reg.

2016-11-16  Kyrylo Tkachov  

PR target/78362
* gcc.c-torture/compile/pr78362.c: New test.






Re: [PATCH, GCC/ARM] Fix PR77933: stack corruption on ARM when using high registers and lr

2016-11-30 Thread Thomas Preudhomme



On 30/11/16 10:04, Richard Earnshaw (lists) wrote:

On 30/11/16 09:50, Thomas Preudhomme wrote:

Hi,

Is this ok to backport to gcc-5-branch and gcc-6-branch? Patch applies
cleanly (patches attached for reference).


2016-11-17  Thomas Preud'homme  

Backport from mainline
2016-11-17  Thomas Preud'homme  

gcc/
PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
being live in the function and lr needing to be saved.  Distinguish
between already saved pushable registers and registers to push.
Check for LR being an available pushable register.

gcc/testsuite/
PR target/77933
* gcc.target/arm/pr77933-1.c: New test.
* gcc.target/arm/pr77933-2.c: Likewise.



Your attached patch doesn't appear to match your ChangeLog.  (rmprofile
patch?).


There is 3 attached patches. I attached one by mistake indeed (2 double-clicks 
instead of 1?). Let's try again with only the 2 correct patches.


Best regards,

Thomas


R.



Best regards,

Thomas


On 17/11/16 20:15, Thomas Preudhomme wrote:

Hi Kyrill,

I've committed the following updated patch where the test is
restricted to Thumb
execution mode and skipping it if not possible since -mtpcs-leaf-frame
is only
available in Thumb mode. I've considered the change obvious.

*** gcc/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Distinguish
between lr
being live in the function and lr needing to be saved.
Distinguish
between already saved pushable registers and registers to push.
Check for LR being an available pushable register.


*** gcc/testsuite/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* gcc.target/arm/pr77933-1.c: New test.
* gcc.target/arm/pr77933-2.c: Likewise.

Best regards,

Thomas

On 17/11/16 10:04, Kyrill Tkachov wrote:


On 09/11/16 16:41, Thomas Preudhomme wrote:

I've reworked the patch following comments from Wilco [1] (sorry
could not
find it in my MUA for some reason).

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00317.html


== Context ==

When saving registers, function thumb1_expand_prologue () aims at
minimizing
the number of push instructions. One of the optimization it does is
to push LR
alongside high register(s) (after having moved them to low
register(s)) when
there is no low register to save. The way this is implemented is to
add LR to
the pushable_regs mask if it is live just before pushing the
registers in that
mask. The mask of live pushable registers which is used to detect
whether LR
needs to be saved is then clear to ensure LR is only saved once.


== Problem ==

However beyond deciding what register to push pushable_regs is used
to track
what pushable register can be used to move a high register before being
pushed, hence the name. That mask is cleared when all high registers
have been
assigned a low register but the clearing assumes the high registers
were
assigned to the registers with the biggest number in that mask. This
is not
the case because LR is not considered when looking for a register in
that
mask. Furthermore, LR might have been saved in the TARGET_BACKTRACE
path above
yet the mask of live pushable registers is not cleared in that case.


== Solution ==

This patch changes the loop to iterate over register LR to r0 so as
to both
fix the stack corruption reported in PR77933 and reuse lr to push
some high
register when possible. This patch also introduce a new variable
lr_needs_saving to record whether LR (still) needs to be saved at a
given
point in code and sets the variable accordingly throughout the code,
thus
fixing the second issue. Finally, this patch create a new push_mask
variable
to distinguish between the mask of registers to push and the mask of
live
pushable registers.


== Note ==

Other bits could have been improved but have been left out to allow
the patch
to be backported to stable branch:

(1) using argument registers that are not holding an argument
(2) using push_mask consistently instead of l_mask (in
TARGET_BACKTRACE), mask
(low register push) and push_mask
(3) the !l_mask case improved in TARGET_BACKTRACE since offset == 0
(4) rename l_mask to a more appropriate name (live_pushable_regs_mask?)

ChangeLog entry are as follow:

*** gcc/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Distinguish
between lr
being live in the function and lr needing to be saved.
Distinguish
between already saved pushable registers and registers to push.
Check for LR being an available pushable register.


*** gcc/testsuite/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* gcc.target/arm/pr77933-1.c: New test.
* gcc.target/arm/pr77933-2.c: Likewise.


Testing: no regression on arm-none-eabi GCC cross-compiler targeting
Cortex-M0

Is this ok for trunk?



Re: [PATCH, GCC/ARM] Fix PR77904: callee-saved register trashed when clobbering sp

2016-11-30 Thread Thomas Preudhomme

Hi,

Is this ok to backport to gcc-5-branch and gcc-6-branch? Patch applies cleanly 
(patches attached for reference).



2016-11-30 Thomas Preud'homme 

Backport from mainline
2016-11-22  Thomas Preud'homme  

gcc/
PR target/77904
* config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
in save register mask if it is needed.

gcc/testsuite/
PR target/77904
* gcc.target/arm/pr77904.c: New test.


Best regards,

Thomas


On 22/11/16 10:45, Thomas Preudhomme wrote:

On 17/11/16 09:11, Kyrill Tkachov wrote:


On 17/11/16 08:56, Thomas Preudhomme wrote:

On 16/11/16 10:30, Kyrill Tkachov wrote:

Hi Thomas,

On 03/11/16 16:52, Thomas Preudhomme wrote:

Hi,

When using a callee-saved register to save the frame pointer the Thumb-1
prologue fails to save the callee-saved register before that. For ARM and
Thumb-2 targets the frame pointer is handled as a special case but nothing is
done for Thumb-1 targets. This patch adds the same logic for Thumb-1 targets.

ChangeLog entries are as follow:

*** gcc/ChangeLog ***

2016-11-02  Thomas Preud'homme 

PR target/77904
* config/arm/arm.c (thumb1_compute_save_reg_mask): mark frame pointer
in save register mask if it is needed.



s/mark/Mark/



*** gcc/testsuite/ChangeLog ***

2016-11-02  Thomas Preud'homme 

PR target/77904
* gcc.target/arm/pr77904.c: New test.


Testing: Testsuite shows no regression when run with arm-none-eabi GCC
cross-compiler for Cortex-M0 target.

Is this ok for trunk?



I'd ask for a bootstrap, but this code is Thumb-1 only so it wouldn't affect
anything.


I can bootstrap for armv4t with --with-mode=thumb which would at least
exercise the path. I'll try such a bootstrap on qemu.



If you can get it to work, then yes please.


Bootstrap came back clean so I've committed the patch (r242693). Thanks!

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c01a3c878968f6e6f07358b0686e4a59e34f56b7..62fd5c557e4a356bc2bf0141d35b959dfb859842 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19361,6 +19361,10 @@ thumb1_compute_save_reg_mask (void)
 if (df_regs_ever_live_p (reg) && callee_saved_reg_p (reg))
   mask |= 1 << reg;
 
+  /* Handle the frame pointer as a special case.  */
+  if (frame_pointer_needed)
+mask |= 1 << HARD_FRAME_POINTER_REGNUM;
+
   if (flag_pic
   && !TARGET_SINGLE_PIC_BASE
   && arm_pic_register != INVALID_REGNUM
diff --git a/gcc/testsuite/gcc.target/arm/pr77904.c b/gcc/testsuite/gcc.target/arm/pr77904.c
new file mode 100644
index ..76728c07e73350ce44160cabff3dd2fa7a6ef021
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr77904.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+__attribute__ ((noinline, noclone)) void
+clobber_sp (void)
+{
+  __asm volatile ("" : : : "sp");
+}
+
+int
+main (void)
+{
+  int ret;
+
+  __asm volatile ("mov\tr4, #0xf4\n\t"
+		  "mov\tr5, #0xf5\n\t"
+		  "mov\tr6, #0xf6\n\t"
+		  "mov\tr7, #0xf7\n\t"
+		  "mov\tr0, #0xf8\n\t"
+		  "mov\tr8, r0\n\t"
+		  "mov\tr0, #0xfa\n\t"
+		  "mov\tr10, r0"
+		  : : : "r0", "r4", "r5", "r6", "r7", "r8", "r10");
+  clobber_sp ();
+
+  __asm volatile ("cmp\tr4, #0xf4\n\t"
+		  "bne\tfail\n\t"
+		  "cmp\tr5, #0xf5\n\t"
+		  "bne\tfail\n\t"
+		  "cmp\tr6, #0xf6\n\t"
+		  "bne\tfail\n\t"
+		  "cmp\tr7, #0xf7\n\t"
+		  "bne\tfail\n\t"
+		  "mov\tr0, r8\n\t"
+		  "cmp\tr0, #0xf8\n\t"
+		  "bne\tfail\n\t"
+		  "mov\tr0, r10\n\t"
+		  "cmp\tr0, #0xfa\n\t"
+		  "bne\tfail\n\t"
+		  "mov\t%0, #1\n"
+		  "fail:\n\t"
+		  "sub\tr0, #1"
+		  : "=r" (ret) : :);
+  return ret;
+}
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 83cb13d1195beb19d6301f5c83a7eb544a91d877..ae479a43fe8514f2eacb7d56f89916b48f720768 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19390,6 +19390,10 @@ thumb1_compute_save_reg_mask (void)
 if (df_regs_ever_live_p (reg) && callee_saved_reg_p (reg))
   mask |= 1 << reg;
 
+  /* Handle the frame pointer as a special case.  */
+  if (frame_pointer_needed)
+mask |= 1 << HARD_FRAME_POINTER_REGNUM;
+
   if (flag_pic
   && !TARGET_SINGLE_PIC_BASE
   && arm_pic_register != INVALID_REGNUM
diff --git a/gcc/testsuite/gcc.target/arm/pr77904.c b/gcc/testsuite/gcc.target/arm/pr77904.c
new file mode 100644
index ..76728c07e73350ce44160cabff3dd2fa7a6ef021
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr77904.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+__attribute__ ((noinline, noclone)) void
+clobber_sp (void)
+{
+  __asm volatile ("" : : : "sp");
+}
+
+int
+main (void)
+{
+  int ret;
+
+  __asm volatile ("mov\tr4, #0xf4\n\t"
+		  "mov\tr5, #0xf5\n\t"
+		  "mov\tr6, #0xf6\n\t"
+		  "mov\tr7, #0xf7\n\t"
+		  "mov\tr0, #0xf8\n\t"
+		  "mov\tr8, r0\n\t"
+		  "mov\tr0, #0xfa\n\t"
+		  "mov\tr10, r0"
+		  : : : "r0", "r4", "r5", "r6", "r7", "r8",

Re: [PATCH, GCC/ARM] Fix PR77904: callee-saved register trashed when clobbering sp

2016-11-30 Thread Thomas Preudhomme
Sorry, the bug cannot be reproduced on gcc-5-branch so it's probably better to 
only do a backport to gcc-6-branch.


Ok for a backport to gcc-6-branch?

Best regards,

Thomas

On 30/11/16 10:42, Thomas Preudhomme wrote:

Hi,

Is this ok to backport to gcc-5-branch and gcc-6-branch? Patch applies cleanly
(patches attached for reference).


2016-11-30 Thomas Preud'homme 

Backport from mainline
2016-11-22  Thomas Preud'homme  

gcc/
PR target/77904
* config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
in save register mask if it is needed.

gcc/testsuite/
PR target/77904
* gcc.target/arm/pr77904.c: New test.


Best regards,

Thomas


On 22/11/16 10:45, Thomas Preudhomme wrote:

On 17/11/16 09:11, Kyrill Tkachov wrote:


On 17/11/16 08:56, Thomas Preudhomme wrote:

On 16/11/16 10:30, Kyrill Tkachov wrote:

Hi Thomas,

On 03/11/16 16:52, Thomas Preudhomme wrote:

Hi,

When using a callee-saved register to save the frame pointer the Thumb-1
prologue fails to save the callee-saved register before that. For ARM and
Thumb-2 targets the frame pointer is handled as a special case but nothing is
done for Thumb-1 targets. This patch adds the same logic for Thumb-1 targets.

ChangeLog entries are as follow:

*** gcc/ChangeLog ***

2016-11-02  Thomas Preud'homme 

PR target/77904
* config/arm/arm.c (thumb1_compute_save_reg_mask): mark frame pointer
in save register mask if it is needed.



s/mark/Mark/



*** gcc/testsuite/ChangeLog ***

2016-11-02  Thomas Preud'homme 

PR target/77904
* gcc.target/arm/pr77904.c: New test.


Testing: Testsuite shows no regression when run with arm-none-eabi GCC
cross-compiler for Cortex-M0 target.

Is this ok for trunk?



I'd ask for a bootstrap, but this code is Thumb-1 only so it wouldn't affect
anything.


I can bootstrap for armv4t with --with-mode=thumb which would at least
exercise the path. I'll try such a bootstrap on qemu.



If you can get it to work, then yes please.


Bootstrap came back clean so I've committed the patch (r242693). Thanks!

Best regards,

Thomas


Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Jakub Jelinek
On Wed, Nov 30, 2016 at 11:17:32AM +0100, Richard Biener wrote:
> On Wed, Nov 30, 2016 at 11:06 AM, Matthias Klose  wrote:
> > On 30.11.2016 09:29, Andreas Schwab wrote:
> >> configure: error: no --with-target-bdw-gc options and no bdw-gc pkg-config 
> >> module found
> >> make[1]: *** [Makefile:19775: configure-target-libobjc] Error 1
> >>
> >> Andreas.
> >
> > that's a bit terse. Could you send the complete output for the 
> > configuration of
> > the libobjc subdir and the config.log?
> >
> > I assume that is a configuration with --enable-objc-gc and then the 
> > pkg-config
> > module cannot be found.  Are gc/gc.h and libgc.so in standard paths without
> > having the bdw-gc pkg-config module available? Which libgc version is 
> > installed?
> 
> I see the same failure with just
> 
>  ../configure --enable-languages=objc
> 
> usually we disable languages (with a diagnostic) if requirements
> cannot be fulfilled.
> 
> But it seems the default chosen is bad somehow... (and breaks my bootstraps 
> with
> default languages).

I'm now testing the default (no --enable-objc-gc, --enable-objc-gc=*,
--disable-objc-gc) with.  Ok for trunk if it succeeds?

2016-11-30  Jakub Jelinek  

* configure.ac (--enable-objc-gc): If not given, default to
enable_objc_gc=no.
* configure: Regenerated.

--- libobjc/configure.ac.jj 2016-11-30 08:57:26.0 +0100
+++ libobjc/configure.ac2016-11-30 11:47:33.085828683 +0100
@@ -203,7 +203,7 @@ gt_BITFIELD_TYPE_MATTERS
 AC_ARG_ENABLE(objc-gc,
 [AS_HELP_STRING([--enable-objc-gc],
[enable use of Boehm's garbage collector with the
-GNU Objective-C runtime])])
+GNU Objective-C runtime])],,enable_objc_gc=no)
 AC_ARG_WITH([target-bdw-gc],
 [AS_HELP_STRING([--with-target-bdw-gc=PATHLIST],
[specify prefix directory for installed bdw-gc package.
--- libobjc/configure.jj2016-11-30 08:57:26.0 +0100
+++ libobjc/configure   2016-11-30 11:47:44.720680375 +0100
@@ -11509,6 +11509,8 @@ $as_echo "#define HAVE_BITFIELD_TYPE_MAT
 # Check whether --enable-objc-gc was given.
 if test "${enable_objc_gc+set}" = set; then :
   enableval=$enable_objc_gc;
+else
+  enable_objc_gc=no
 fi
 
 
Jakub


Re: [PATCH] arc: Avoid store/load pipeline hazard

2016-11-30 Thread Andrew Burgess
* Claudiu Zissulescu  [2016-11-28 09:29:22 
+]:

> Hi Andrew,
> 
> Approved, but ...

Thanks.  Committed as r243007 with the changes you suggested.

Thanks again,
Andrew




>  
> > +/* Return true if a load instruction (CONSUMER) uses the same address
> > +   as a store instruction (PRODUCER). This function is used to avoid
> > +   st/ld address hazard in ARC700 cores. */
> 
> 
> Missing space after dot.
> 
> > +  /* Peel the producer and the consumer for the address. */
> 
> Likewise.
> 
> > +  out_set = single_set (producer);
> > +  if (out_set)
> > +{
> > +  out_addr = SET_DEST (out_set);
> > +  if (!out_addr)
> > +   return false;
> > +  if (GET_CODE (out_addr) == ZERO_EXTEND
> > + || GET_CODE (out_addr) == SIGN_EXTEND)
> > +   out_addr = XEXP (out_addr, 0);
> > +
> > +  if (!MEM_P(out_addr))
> 
> Space between macro and parenthesis.
> 
> > +   return false;
> > +
> > +  in_set = single_set (consumer);
> > +  if (in_set)
> > +   {
> > + in_addr = SET_SRC (in_set);
> > + if (!in_addr)
> > +   return false;
> > + if (GET_CODE (in_addr) == ZERO_EXTEND
> > + || GET_CODE (in_addr) == SIGN_EXTEND)
> > +   in_addr = XEXP (in_addr, 0);
> > +
> > + if (!MEM_P(in_addr))
> 
> Likewise.
> 
> > +   return false;
> > + /* Get rid of the MEM() and check if the addresses are
> > +equivalent. */
> 
> Space, space after dot.
> 
> > + in_addr = XEXP (in_addr, 0);
> > + out_addr = XEXP (out_addr, 0);
> > +
> > + return exp_equiv_p (in_addr, out_addr, 0, true);
> > +   }
> > +}
> > +  return false;
> > +}
> > +
> >  /* The same functionality as arc_hazard.  It is called in machine
> > reorg before any other optimization.  Hence, the NOP size is taken
> > into account when doing branch shortening.  */
> > @@ -6501,6 +6553,29 @@ workaround_arc_anomaly (void)
> >   emit_insn_before (gen_nopv (), succ0);
> > }
> >  }
> > +
> > +  if (TARGET_ARC700)
> > +{
> > +  rtx_insn *succ1;
> > +
> > +  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
> > +   {
> > + succ0 = next_real_insn(insn);
> 
> Space between function name and parenthesis.
> 
> > + if (arc_store_addr_hazard_p (insn, succ0))
> > +   {
> > + emit_insn_after (gen_nopv (), insn);
> > + emit_insn_after (gen_nopv (), insn);
> > + continue;
> > +   }
> > +
> > + /* Avoid adding nops if the instruction between the ST and LD is
> > +a call or jump. */
> > + succ1 = next_real_insn(succ0);
> 
> Likewise.
> 
> > + if (succ0 && !JUMP_P (succ0) && !CALL_P (succ0)
> > + && arc_store_addr_hazard_p (insn, succ1))
> > +   emit_insn_after (gen_nopv (), insn);
> > +   }
> > +}
> >  }
> 
> Best,
> Claudiu


Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Matthias Klose
On 30.11.2016 11:52, Jakub Jelinek wrote:
> On Wed, Nov 30, 2016 at 11:17:32AM +0100, Richard Biener wrote:
>> On Wed, Nov 30, 2016 at 11:06 AM, Matthias Klose  wrote:
>>> On 30.11.2016 09:29, Andreas Schwab wrote:
 configure: error: no --with-target-bdw-gc options and no bdw-gc pkg-config 
 module found
 make[1]: *** [Makefile:19775: configure-target-libobjc] Error 1

 Andreas.
>>>
>>> that's a bit terse. Could you send the complete output for the 
>>> configuration of
>>> the libobjc subdir and the config.log?
>>>
>>> I assume that is a configuration with --enable-objc-gc and then the 
>>> pkg-config
>>> module cannot be found.  Are gc/gc.h and libgc.so in standard paths without
>>> having the bdw-gc pkg-config module available? Which libgc version is 
>>> installed?
>>
>> I see the same failure with just
>>
>>  ../configure --enable-languages=objc
>>
>> usually we disable languages (with a diagnostic) if requirements
>> cannot be fulfilled.
>>
>> But it seems the default chosen is bad somehow... (and breaks my bootstraps 
>> with
>> default languages).
> 
> I'm now testing the default (no --enable-objc-gc, --enable-objc-gc=*,
> --disable-objc-gc) with.  Ok for trunk if it succeeds?
> 
> 2016-11-30  Jakub Jelinek  
> 
>   * configure.ac (--enable-objc-gc): If not given, default to
>   enable_objc_gc=no.
>   * configure: Regenerated.
> 
> --- libobjc/configure.ac.jj   2016-11-30 08:57:26.0 +0100
> +++ libobjc/configure.ac  2016-11-30 11:47:33.085828683 +0100
> @@ -203,7 +203,7 @@ gt_BITFIELD_TYPE_MATTERS
>  AC_ARG_ENABLE(objc-gc,
>  [AS_HELP_STRING([--enable-objc-gc],
>   [enable use of Boehm's garbage collector with the
> -  GNU Objective-C runtime])])
> +  GNU Objective-C runtime])],,enable_objc_gc=no)
>  AC_ARG_WITH([target-bdw-gc],
>  [AS_HELP_STRING([--with-target-bdw-gc=PATHLIST],
>   [specify prefix directory for installed bdw-gc package.
> --- libobjc/configure.jj  2016-11-30 08:57:26.0 +0100
> +++ libobjc/configure 2016-11-30 11:47:44.720680375 +0100
> @@ -11509,6 +11509,8 @@ $as_echo "#define HAVE_BITFIELD_TYPE_MAT
>  # Check whether --enable-objc-gc was given.
>  if test "${enable_objc_gc+set}" = set; then :
>enableval=$enable_objc_gc;
> +else
> +  enable_objc_gc=no
>  fi
>  

I can confirm, that this works for any configureation not enabling the objc-gc.
I found another issue with enabling objc-gc, but your patch restores the
standard bootstrap.

Matthias





Re: [1/9][RFC][DWARF] Reserve three DW_OP numbers in vendor extension space

2016-11-30 Thread Jiong Wang

On 16/11/16 14:02, Jakub Jelinek wrote:

On Wed, Nov 16, 2016 at 02:54:56PM +0100, Mark Wielaard wrote:

On Wed, 2016-11-16 at 10:00 +, Jiong Wang wrote:

  The two operations DW_OP_AARCH64_paciasp and DW_OP_AARCH64_paciasp_deref were
designed as shortcut operations when LR is signed with A key and using
function's CFA as salt.  This is the default behaviour of return address
signing so is expected to be used for most of the time.  DW_OP_AARCH64_pauth
is designed as a generic operation that allow describing pointer signing on
any value using any salt and key in case we can't use the shortcut operations
we can use this.


I admit to not fully understand the salting/keying involved. But given
that the DW_OP space is really tiny, so we would like to not eat up too
many of them for new opcodes. And given that introducing any new DW_OPs
using for CFI unwinding will break any unwinder anyway causing us to
update them all for this new feature. Have you thought about using a new
CIE augmentation string character for describing that the return
address/link register used by a function/frame is salted/keyed?

This seems a good description of CIE records and augmentation
characters:http://www.airs.com/blog/archives/460

It obviously also involves updating all unwinders to understand the new
augmentation character (and possible arguments). But it might be more
generic and saves us from using up too many DW_OPs.


From what I understood, the return address is not always scrambled, so
it doesn't apply to the whole function, just to most of it (except for
an insn in the prologue and some in the epilogue).  So I think one op is
needed.  But can't it be just a toggable flag whether the return address
is scrambled + some arguments to it?
Thus DW_OP_AARCH64_scramble .uleb128 0 would mean that the default
way of scrambling starts here (if not already active) or any kind of
scrambling ends here (if already active), and
DW_OP_AARCH64_scramble .uleb128 non-zero would be whatever encoding you need
to represent details of the less common variants with details what to do.
Then you'd just hook through some MD_* macro in the unwinder the
descrambling operation if the scrambling is active at the insns you unwind
on.

  Jakub


Hi Mark, Jakub:

  Thanks very much for the suggestions.

  I have done some experiments on your ideas and am thinking it's good to
  combine them together.  The use of DW_CFA instead of DW_OP can avoid building
  all information from scratch at each unwind location, while we can indicate
  the signing key index through new AArch64 CIE augmentation 'B'. This new
  approach reduce the unwind table size overhead from ~25% to ~5% when return
  address signing enabled, it also largely simplified dwarf generation code for
  return address signing.

  As one new DWARF call frame instruction is needed for AArch64, I want to reuse
  DW_CFA_GNU_window_save to save the space.  It is in vendor extension space and
  used for Sparc only, I think it make sense to reuse it for AArch64. On
  AArch64, DW_CFA_GNU_window_save toggle return address sign status which kept
  in a new boolean type column in DWARF table,  so DW_CFA_GNU_window_save takes
  no argument on AArch64, the same as on Sparc, this makes no difference to 
those
  existed encoding, length calculation code.

  Meanwhile one new DWARF expression operation number is still needed for
  AArch64, it's useful for describing those complex pointer signing scenarios
  and it will be used to multiplex some further extensions on AArch64.

  OK on this proposal and to install this patch to gcc trunk?

Hi GDB, Binutils maintainer:

  OK on this proposal and install this patch to binutils-gdb master?

include/
2016-11-29   Richard Earnshaw  
 Jiong Wang  

* dwarf2.def (DW_OP_AARCH64_operation): Reserve the number 0xea.


diff --git a/include/dwarf2.def b/include/dwarf2.def
index bb916ca238221151cf49359c25fd92643c7e60af..f3892a20da1fe13ddb419e5d7eda07f2c8d8b0c6 100644
--- a/include/dwarf2.def
+++ b/include/dwarf2.def
@@ -684,6 +684,12 @@ DW_OP (DW_OP_HP_unmod_range, 0xe5)
 DW_OP (DW_OP_HP_tls, 0xe6)
 /* PGI (STMicroelectronics) extensions.  */
 DW_OP (DW_OP_PGI_omp_thread_num, 0xf8)
+/* AARCH64 extensions.
+   DW_OP_AARCH64_operation takes one mandatory unsigned LEB128 operand.
+   Bits[6:0] of this operand is the action code, all others bits are initialized
+   to 0 except explicitly documented for one action.  Please refer AArch64 DWARF
+   ABI documentation for details.  */
+DW_OP (DW_OP_AARCH64_operation, 0xea)
 DW_END_OP
 
 DW_FIRST_ATE (DW_ATE_void, 0x0)
@@ -765,7 +771,8 @@ DW_CFA (DW_CFA_hi_user, 0x3f)
 
 /* SGI/MIPS specific.  */
 DW_CFA (DW_CFA_MIPS_advance_loc8, 0x1d)
-/* GNU extensions.  */
+/* GNU extensions.
+   NOTE: DW_CFA_GNU_window_save is multiplexed on Sparc and AArch64.  */
 DW_CFA (DW_CFA_GNU_window_save, 0x2d)
 DW_CFA (DW_CFA_GNU_args_size, 0x2e)
 DW_CFA (DW_CFA_GNU_negative_offset_extended, 0x2f)


Re: [libstdc++, testsuite] Add dg-require-thread-fence

2016-11-30 Thread Jonathan Wakely

On 30/11/16 09:50 +0100, Christophe Lyon wrote:

I'll try to think about it, but I can live with that single "known" failure.
It's better than the ~3500 failures I used to have, and hopefully
won't report false regressions anymore.


OK, cool. One rather than 3500 is a nice improvement :-)




Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Matthias Klose
On 30.11.2016 11:52, Jakub Jelinek wrote:
> On Wed, Nov 30, 2016 at 11:17:32AM +0100, Richard Biener wrote:
>> On Wed, Nov 30, 2016 at 11:06 AM, Matthias Klose  wrote:
>>> On 30.11.2016 09:29, Andreas Schwab wrote:
 configure: error: no --with-target-bdw-gc options and no bdw-gc pkg-config 
 module found
 make[1]: *** [Makefile:19775: configure-target-libobjc] Error 1

 Andreas.
>>>
>>> that's a bit terse. Could you send the complete output for the 
>>> configuration of
>>> the libobjc subdir and the config.log?
>>>
>>> I assume that is a configuration with --enable-objc-gc and then the 
>>> pkg-config
>>> module cannot be found.  Are gc/gc.h and libgc.so in standard paths without
>>> having the bdw-gc pkg-config module available? Which libgc version is 
>>> installed?
>>
>> I see the same failure with just
>>
>>  ../configure --enable-languages=objc
>>
>> usually we disable languages (with a diagnostic) if requirements
>> cannot be fulfilled.
>>
>> But it seems the default chosen is bad somehow... (and breaks my bootstraps 
>> with
>> default languages).
> 
> I'm now testing the default (no --enable-objc-gc, --enable-objc-gc=*,
> --disable-objc-gc) with.  Ok for trunk if it succeeds?
> 
> 2016-11-30  Jakub Jelinek  
> 
>   * configure.ac (--enable-objc-gc): If not given, default to
>   enable_objc_gc=no.
>   * configure: Regenerated.
> 
> --- libobjc/configure.ac.jj   2016-11-30 08:57:26.0 +0100
> +++ libobjc/configure.ac  2016-11-30 11:47:33.085828683 +0100
> @@ -203,7 +203,7 @@ gt_BITFIELD_TYPE_MATTERS
>  AC_ARG_ENABLE(objc-gc,
>  [AS_HELP_STRING([--enable-objc-gc],
>   [enable use of Boehm's garbage collector with the
> -  GNU Objective-C runtime])])
> +  GNU Objective-C runtime])],,enable_objc_gc=no)
>  AC_ARG_WITH([target-bdw-gc],
>  [AS_HELP_STRING([--with-target-bdw-gc=PATHLIST],
>   [specify prefix directory for installed bdw-gc package.
> --- libobjc/configure.jj  2016-11-30 08:57:26.0 +0100
> +++ libobjc/configure 2016-11-30 11:47:44.720680375 +0100
> @@ -11509,6 +11509,8 @@ $as_echo "#define HAVE_BITFIELD_TYPE_MAT
>  # Check whether --enable-objc-gc was given.
>  if test "${enable_objc_gc+set}" = set; then :
>enableval=$enable_objc_gc;
> +else
> +  enable_objc_gc=no
>  fi

There's one more fix needed for the case of only having the pkg-config module
installed when configuring with --enable-objc-gc. We can't use PKG_CHECK_MODULES
directly because the pkg.m4 macros choke on the dash in the module name. Thus
setting the CFLAGS and LIBS directly. Ok to install?

* configure.ac: Set BDW_GC_CFLAGS and BDW_GC_LIBS after checking
for the existence of the pkg-config modules.
* Regenerate.


--- libobjc/configure.ac	(Revision 243006)
+++ libobjc/configure.ac	(Arbeitskopie)
@@ -225,7 +225,9 @@
   if test "x$with_target_bdw_gc$with_target_bdw_gc_include$with_target_bdw_gc_lib" = x; then
 dnl no bdw-gw options, fall back to the bdw-gc pkg-config module
 PKG_CHECK_EXISTS(bdw-gc,
-  AC_MSG_RESULT([using bdw-gc pkg-config module]),
+  [AC_MSG_RESULT([using bdw-gc pkg-config module])
+   BDW_GC_CFLAGS=`$PKG_CONFIG --cflags bdw-gc`
+   BDW_GC_LIBS=`$PKG_CONFIG --libs bdw-gc`],
   AC_MSG_ERROR([no --with-target-bdw-gc options and no bdw-gc pkg-config module found]))
   else
 dnl bdw-gw options passed by configure flags


Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Richard Biener
On Wed, Nov 30, 2016 at 11:52 AM, Jakub Jelinek  wrote:
> On Wed, Nov 30, 2016 at 11:17:32AM +0100, Richard Biener wrote:
>> On Wed, Nov 30, 2016 at 11:06 AM, Matthias Klose  wrote:
>> > On 30.11.2016 09:29, Andreas Schwab wrote:
>> >> configure: error: no --with-target-bdw-gc options and no bdw-gc 
>> >> pkg-config module found
>> >> make[1]: *** [Makefile:19775: configure-target-libobjc] Error 1
>> >>
>> >> Andreas.
>> >
>> > that's a bit terse. Could you send the complete output for the 
>> > configuration of
>> > the libobjc subdir and the config.log?
>> >
>> > I assume that is a configuration with --enable-objc-gc and then the 
>> > pkg-config
>> > module cannot be found.  Are gc/gc.h and libgc.so in standard paths without
>> > having the bdw-gc pkg-config module available? Which libgc version is 
>> > installed?
>>
>> I see the same failure with just
>>
>>  ../configure --enable-languages=objc
>>
>> usually we disable languages (with a diagnostic) if requirements
>> cannot be fulfilled.
>>
>> But it seems the default chosen is bad somehow... (and breaks my bootstraps 
>> with
>> default languages).
>
> I'm now testing the default (no --enable-objc-gc, --enable-objc-gc=*,
> --disable-objc-gc) with.  Ok for trunk if it succeeds?

Ok.

Thanks,
Richard.

> 2016-11-30  Jakub Jelinek  
>
> * configure.ac (--enable-objc-gc): If not given, default to
> enable_objc_gc=no.
> * configure: Regenerated.
>
> --- libobjc/configure.ac.jj 2016-11-30 08:57:26.0 +0100
> +++ libobjc/configure.ac2016-11-30 11:47:33.085828683 +0100
> @@ -203,7 +203,7 @@ gt_BITFIELD_TYPE_MATTERS
>  AC_ARG_ENABLE(objc-gc,
>  [AS_HELP_STRING([--enable-objc-gc],
> [enable use of Boehm's garbage collector with the
> -GNU Objective-C runtime])])
> +GNU Objective-C runtime])],,enable_objc_gc=no)
>  AC_ARG_WITH([target-bdw-gc],
>  [AS_HELP_STRING([--with-target-bdw-gc=PATHLIST],
> [specify prefix directory for installed bdw-gc package.
> --- libobjc/configure.jj2016-11-30 08:57:26.0 +0100
> +++ libobjc/configure   2016-11-30 11:47:44.720680375 +0100
> @@ -11509,6 +11509,8 @@ $as_echo "#define HAVE_BITFIELD_TYPE_MAT
>  # Check whether --enable-objc-gc was given.
>  if test "${enable_objc_gc+set}" = set; then :
>enableval=$enable_objc_gc;
> +else
> +  enable_objc_gc=no
>  fi
>
>
> Jakub


Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Richard Biener
On Wed, Nov 30, 2016 at 12:30 PM, Matthias Klose  wrote:
> On 30.11.2016 11:52, Jakub Jelinek wrote:
>> On Wed, Nov 30, 2016 at 11:17:32AM +0100, Richard Biener wrote:
>>> On Wed, Nov 30, 2016 at 11:06 AM, Matthias Klose  wrote:
 On 30.11.2016 09:29, Andreas Schwab wrote:
> configure: error: no --with-target-bdw-gc options and no bdw-gc 
> pkg-config module found
> make[1]: *** [Makefile:19775: configure-target-libobjc] Error 1
>
> Andreas.

 that's a bit terse. Could you send the complete output for the 
 configuration of
 the libobjc subdir and the config.log?

 I assume that is a configuration with --enable-objc-gc and then the 
 pkg-config
 module cannot be found.  Are gc/gc.h and libgc.so in standard paths without
 having the bdw-gc pkg-config module available? Which libgc version is 
 installed?
>>>
>>> I see the same failure with just
>>>
>>>  ../configure --enable-languages=objc
>>>
>>> usually we disable languages (with a diagnostic) if requirements
>>> cannot be fulfilled.
>>>
>>> But it seems the default chosen is bad somehow... (and breaks my bootstraps 
>>> with
>>> default languages).
>>
>> I'm now testing the default (no --enable-objc-gc, --enable-objc-gc=*,
>> --disable-objc-gc) with.  Ok for trunk if it succeeds?
>>
>> 2016-11-30  Jakub Jelinek  
>>
>>   * configure.ac (--enable-objc-gc): If not given, default to
>>   enable_objc_gc=no.
>>   * configure: Regenerated.
>>
>> --- libobjc/configure.ac.jj   2016-11-30 08:57:26.0 +0100
>> +++ libobjc/configure.ac  2016-11-30 11:47:33.085828683 +0100
>> @@ -203,7 +203,7 @@ gt_BITFIELD_TYPE_MATTERS
>>  AC_ARG_ENABLE(objc-gc,
>>  [AS_HELP_STRING([--enable-objc-gc],
>>   [enable use of Boehm's garbage collector with the
>> -  GNU Objective-C runtime])])
>> +  GNU Objective-C runtime])],,enable_objc_gc=no)
>>  AC_ARG_WITH([target-bdw-gc],
>>  [AS_HELP_STRING([--with-target-bdw-gc=PATHLIST],
>>   [specify prefix directory for installed bdw-gc package.
>> --- libobjc/configure.jj  2016-11-30 08:57:26.0 +0100
>> +++ libobjc/configure 2016-11-30 11:47:44.720680375 +0100
>> @@ -11509,6 +11509,8 @@ $as_echo "#define HAVE_BITFIELD_TYPE_MAT
>>  # Check whether --enable-objc-gc was given.
>>  if test "${enable_objc_gc+set}" = set; then :
>>enableval=$enable_objc_gc;
>> +else
>> +  enable_objc_gc=no
>>  fi
>
> There's one more fix needed for the case of only having the pkg-config module
> installed when configuring with --enable-objc-gc. We can't use 
> PKG_CHECK_MODULES
> directly because the pkg.m4 macros choke on the dash in the module name. Thus
> setting the CFLAGS and LIBS directly. Ok to install?

Why not fix pkg.m4?

Richard.

> * configure.ac: Set BDW_GC_CFLAGS and BDW_GC_LIBS after checking
> for the existence of the pkg-config modules.
> * Regenerate.
>
>


Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-11-30 Thread Andrew Burgess
* Jeff Law  [2016-11-29 10:35:50 -0700]:

> On 11/29/2016 07:02 AM, Andrew Burgess wrote:
> > * Jeff Law  [2016-11-28 15:08:46 -0700]:
> > 
> > > On 11/24/2016 02:40 PM, Andrew Burgess wrote:
> > > > * Christophe Lyon  [2016-11-21 13:47:09 
> > > > +0100]:
> > > > 
> > > > > On 20 November 2016 at 18:27, Mike Stump  
> > > > > wrote:
> > > > > > On Nov 19, 2016, at 1:59 PM, Andrew Burgess 
> > > > > >  wrote:
> > > > > > > > So, your new test fails on arm* targets:
> > > > > > > 
> > > > > > > After a little digging I think the problem might be that
> > > > > > > -freorder-blocks-and-partition is not supported on arm.
> > > > > > > 
> > > > > > > This should be detected as the new tests include:
> > > > > > > 
> > > > > > >/* { dg-require-effective-target freorder } */
> > > > > > > 
> > > > > > > however this test passed on arm as -freorder-blocks-and-partition 
> > > > > > > does
> > > > > > > not issue any warning unless -fprofile-use is also passed.
> > > > > > > 
> > > > > > > The patch below extends check_effective_target_freorder to check 
> > > > > > > using
> > > > > > > -fprofile-use.  With this change in place the tests are skipped on
> > > > > > > arm.
> > > > > > 
> > > > > > > All feedback welcome,
> > > > > > 
> > > > > > Seems reasonable, unless a 
> > > > > > -freorder-blocks-and-partition/-fprofile-use person thinks this is 
> > > > > > the wrong solution.
> > > > > > 
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > As promised, I tested this patch: it makes
> > > > > gcc.dg/tree-prof/section-attr-[123].c
> > > > > unsupported on arm*, and thus they are not failing anymore :-)
> > > > > 
> > > > > However, it also makes other tests unsupported, while they used to 
> > > > > pass:
> > > > > 
> > > > >   gcc.dg/pr33648.c
> > > > >   gcc.dg/pr46685.c
> > > > >   gcc.dg/tree-prof/20041218-1.c
> > > > >   gcc.dg/tree-prof/bb-reorg.c
> > > > >   gcc.dg/tree-prof/cold_partition_label.c
> > > > >   gcc.dg/tree-prof/comp-goto-1.c
> > > > >   gcc.dg/tree-prof/pr34999.c
> > > > >   gcc.dg/tree-prof/pr45354.c
> > > > >   gcc.dg/tree-prof/pr50907.c
> > > > >   gcc.dg/tree-prof/pr52027.c
> > > > >   gcc.dg/tree-prof/va-arg-pack-1.c
> > > > > 
> > > > > and failures are now unsupported:
> > > > >   gcc.dg/tree-prof/cold_partition_label.c
> > > > >   gcc.dg/tree-prof/section-attr-1.c
> > > > >   gcc.dg/tree-prof/section-attr-2.c
> > > > >   gcc.dg/tree-prof/section-attr-3.c
> > > > > 
> > > > > So, maybe this patch is too strong?
> > > > 
> > > > In all of the cases that used to pass the tests are compile only tests
> > > > (except for cold_partition_label, which I discuss below).
> > > > 
> > > > On ARM passing -fprofile-use and -freorder-blocks-and-partition
> > > > results in a warning, and the -freorder-blocks-and-partition flag is
> > > > ignored.  However, disabling -freorder-blocks-and-partition doesn't
> > > > stop any of the tests compiling, hence the passes.
> > > > 
> > > > All the tests include:
> > > > 
> > > >   /* { dg-require-effective-target freorder } */
> > > > 
> > > > which I understand to mean, the tests requires the 'freorder' feature
> > > > to be supported (which corresponds to -freorder-blocks-and-partition).
> > > > 
> > > > For cold_partition_label and my new tests it's seems clear that the
> > > > lack of support for -freorder-blocks-and-partition on ARM is the cause
> > > > of the test failures.
> > > > 
> > > > So, is it reasonable to give up the other tests as "unsupported"?  I'd
> > > > be inclined to say yes, but I happy to rework the patch if anyone has
> > > > a suggestion for an alternative approach.
> > > It is reasonable.  It's not uncommon to have to drop various tests to
> > > UNSUPPORTED, particularly things which depend on assembler/linker
> > > capabilities, the target runtime system, etc.
> > 
> > OK, I'm going to take that as approval for my patch[1].  I'll wait a
> > couple of days to give people a chance to correct me, then I'll push
> > the change.  This should resolve the test regressions I introduced for
> > ARM.
> I'll just go ahead and explicitly ACK this.

Committed as r243009.

Thanks,
Andrew


[PATCH, GCC/ARM] Add multilib mapping for Cortex-M23 & Cortex-M33

2016-11-30 Thread Thomas Preudhomme

Hi,

With ARM Cortex-M23 and Cortex-M33 and the support for RM profile multilib added 
recently, it's time to add the corresponding CPU to architecture mappings in 
config/arm/t-rmprofile. Note that Cortex-M33 is mapped to ARMv8-M Mainline 
because there is no transitive closure of mappings and the multilib for ARMv8-M 
Mainline with DSP extensions is ARMv8-M Mainline.


ChangeLog entry is as follows:


*** gcc/ChangeLog ***

2016-11-30  Thomas Preud'homme  

* config/arm/t-rmprofile: Add mappings for Cortex-M23 and Cortex-M33.


Testing: Linking fails before this patch when targeting one of these two cores 
and using rmprofile multilib but succeeds with the patch.


Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index c8b5c9cbd03694eea69855e20372afa3e97d6b4c..93aa909b4d942ad9875a95e0d4397ff17b317905 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -102,6 +102,8 @@ MULTILIB_MATCHES   += march?armv6s-m=mcpu?cortex-m1.small-multiply
 MULTILIB_MATCHES   += march?armv7-m=mcpu?cortex-m3
 MULTILIB_MATCHES   += march?armv7e-m=mcpu?cortex-m4
 MULTILIB_MATCHES   += march?armv7e-m=mcpu?cortex-m7
+MULTILIB_MATCHES   += march?armv8-m.base=mcpu?cortex-m23
+MULTILIB_MATCHES   += march?armv8-m.main=mcpu?cortex-m33
 MULTILIB_MATCHES   += march?armv7=mcpu?cortex-r4
 MULTILIB_MATCHES   += march?armv7=mcpu?cortex-r4f
 MULTILIB_MATCHES   += march?armv7=mcpu?cortex-r5


Re: [PATCH][AArch64] PR target/78362: Make sure to only take REGNO of a register

2016-11-30 Thread Ramana Radhakrishnan
On Wed, Nov 16, 2016 at 4:57 PM, Kyrill Tkachov
 wrote:
> Hi all,
>
> As the PR says we have an RTL checking failure that occurs when building
> libgcc for aarch64.
> The expander code for addsi3 takes the REGNO of a SUBREG in operands[1]. The
> three operands
> in the failing case are:
> {(reg:SI 78), (subreg:SI (reg:DI 77) 0), (subreg:SI (reg:DI 73 [ ivtmp.9 ])
> 0)}
>
> According to the documentation of register_operand (which is the predicate
> for operands[1]),
> operands[1] can be a REG or a SUBREG. If it's a subreg it may also contain a
> MEM before reload
> (because it is guaranteed to be reloaded into a register later). Anyway, the
> bottom line is that
> we have to be careful when taking REGNO of expressions during expand-time.
>
> This patch extracts the inner rtx in case we have a SUBREG and checks that
> it's a REG before
> checking its REGNO.
>
> Bootstrapped and tested on aarch64-none-linux-gnu. Tested aarch64-none-elf
> with RTL checking enabled
> (without this patch that doesn't build).
>
> Ok for trunk?


LGTM but can't approve.

Ramana

> Thanks,
> Kyrill
>
> 2016-11-16  Kyrylo Tkachov  
>
> PR target/78362
> * config/aarch64/aarch64.md (add3): Extract inner expression
> from a subreg in operands[1] and don't call REGNO on a non-reg
> expression
> when deciding to force operands[2] into a reg.
>
> 2016-11-16  Kyrylo Tkachov  
>
> PR target/78362
> * gcc.c-torture/compile/pr78362.c: New test.


Re: [PATCH, GCC/ARM] Add multilib mapping for Cortex-M23 & Cortex-M33

2016-11-30 Thread Ramana Radhakrishnan
On Wed, Nov 30, 2016 at 11:48 AM, Thomas Preudhomme
 wrote:
> Hi,
>
> With ARM Cortex-M23 and Cortex-M33 and the support for RM profile multilib
> added recently, it's time to add the corresponding CPU to architecture
> mappings in config/arm/t-rmprofile. Note that Cortex-M33 is mapped to
> ARMv8-M Mainline because there is no transitive closure of mappings and the
> multilib for ARMv8-M Mainline with DSP extensions is ARMv8-M Mainline.
>
> ChangeLog entry is as follows:
>
>
> *** gcc/ChangeLog ***
>
> 2016-11-30  Thomas Preud'homme  
>
> * config/arm/t-rmprofile: Add mappings for Cortex-M23 and
> Cortex-M33.
>
>
> Testing: Linking fails before this patch when targeting one of these two
> cores and using rmprofile multilib but succeeds with the patch.
>
> Is this ok for stage3?

This is OK despite stage3 as it only adds support for a
micro-architecture in a multilib fragment for library selection to
work properly when the CPU is on the command line.

Thanks,
Ramana

>
> Best regards,
>
> Thomas


Re: [PATCH][AArch64] PR target/78362: Make sure to only take REGNO of a register

2016-11-30 Thread James Greenhalgh
On Wed, Nov 16, 2016 at 04:57:29PM +, Kyrill Tkachov wrote:
> Hi all,
> 
> As the PR says we have an RTL checking failure that occurs when building
> libgcc for aarch64.  The expander code for addsi3 takes the REGNO of a SUBREG
> in operands[1]. The
> three operands in the failing case are:
> {(reg:SI 78), (subreg:SI (reg:DI 77) 0), (subreg:SI (reg:DI 73 [ ivtmp.9 ]) 
> 0)}
> 
> According to the documentation of register_operand (which is the predicate
> for operands[1]), operands[1] can be a REG or a SUBREG. If it's a subreg it
> may also contain a MEM before reload (because it is guaranteed to be reloaded
> into a register later). Anyway, the bottom line is that we have to be careful
> when taking REGNO of expressions during expand-time.
> 
> This patch extracts the inner rtx in case we have a SUBREG and checks that
> it's a REG before checking its REGNO.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu. Tested aarch64-none-elf
> with RTL checking enabled (without this patch that doesn't build).
> 
> Ok for trunk?

OK.

Thanks,
James

> Thanks,
> Kyrill
> 
> 2016-11-16  Kyrylo Tkachov  
> 
> PR target/78362
> * config/aarch64/aarch64.md (add3): Extract inner expression
> from a subreg in operands[1] and don't call REGNO on a non-reg expression
> when deciding to force operands[2] into a reg.
> 
> 2016-11-16  Kyrylo Tkachov  
> 
> PR target/78362
> * gcc.c-torture/compile/pr78362.c: New test.




Re: [PATCH, ARM] Further improve stack usage on sha512 (PR 77308)

2016-11-30 Thread Wilco Dijkstra
Bernd Edlinger wrote:
> On 11/29/16 16:06, Wilco Dijkstra wrote:
> > Bernd Edlinger wrote:
> >
> > -  "TARGET_32BIT && reload_completed
> > +  "TARGET_32BIT && ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)
> > && ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))"
> >
> > This is equivalent to "&& (!TARGET_IWMMXT || reload_completed)" since we're
> > already excluding NEON.
>
> Aehm, no.  This would split the addi_neon insn before it is clear
> if the reload pass will assign a VFP register.

Hmm that's strange... This instruction shouldn't be used to also split some 
random
Neon pattern - for example arm_subdi3 doesn't do the same. To understand and
reason about any of these complex patterns they should all work in the same 
way...

> But when I make *arm_cmpdi_insn split early, it ICEs:

(insn 4870 4869 1636 87 (set (scratch:SI)
 (minus:SI (minus:SI (subreg:SI (reg:DI 2261) 4)
 (subreg:SI (reg:DI 473 [ X$14 ]) 4))
 (ltu:SI (reg:CC_C 100 cc)
 (const_int 0 [0] "pr77308-2.c":140 -1
  (nil))

That's easy, we don't have a sbcs , r1, r2 pattern. A quick workaround 
is
to create a temporary for operand[2] (if before reload) so it will match the 
standard
sbcs pattern, and then the split works fine.

> So it is certainly possible, but not really simple to improve the
> stack size even further.  But I would prefer to do that in a
> separate patch.

Yes separate patches would be fine. However there is a lot of scope to improve 
this
further. For example after your patch shifts and logical operations are 
expanded in
expand, add/sub are in split1 after combine runs and everything else is split 
after
reload. It doesn't make sense to split different operations at different times 
- it means
you're still going to get the bad DImode subregs and miss lots of optimization
opportunities due to the mix of partly split and partly not-yet-split 
operations.

> BTW: there are also negd2_compare, *negdi_extendsidi,
> *negdi_zero_extendsidi, *thumb2_negdi2.

I have a patch to merge thumb2_negdi2 into arm_negdi2. For extends, if we split 
them
at expand time, then none of the combined alu+extend patterns will be needed, 
and
that will be a huge simplification.

> I think it would be a precondition to have test cases that exercise
> each of these patterns before we try to split these instructions.

Agreed.

Wilco

[PATCH] Fix ira.c RTL sharing bug

2016-11-30 Thread Jakub Jelinek
Hi!

I've discovered today that RTL sharing verification has been disabled for
the last 3.5 years.  My --enable-checking=yes,rtl,extra bootstraps on
x86_64-linux and i686-linux revealed only two bugs.

Here is a fix for one of them, x can be shared with another insn that has
it in REG_EQUIV already, so we need to unshare it.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-30  Jakub Jelinek  

* ira.c (ira_update_equiv_info_by_shuffle_insn): Use copy_rtx
for REG_EQUIV argument.

--- gcc/ira.c.jj2016-11-18 18:11:24.0 +0100
+++ gcc/ira.c   2016-11-30 10:53:48.061704159 +0100
@@ -2647,7 +2647,7 @@ ira_update_equiv_info_by_shuffle_insn (i
}
   if (find_reg_note (insn, REG_EQUIV, x) == NULL_RTX)
{
- note = set_unique_reg_note (insn, REG_EQUIV, x);
+ note = set_unique_reg_note (insn, REG_EQUIV, copy_rtx (x));
  gcc_assert (note != NULL_RTX);
  if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
{

Jakub


[PATCH] Fix rtl sharing bug in i?86 stv pass

2016-11-30 Thread Jakub Jelinek
Hi!

This is another RTL sharing bug, tmp is created through gen_rtx_SUBREG
and so shouldn't be shared by the move insn with the following insn that
uses it.

Bootstrapped/regtested on x86_64-linux and i686-linux
(--enable-checking=yes,rtl,extra), ok for trunk?

2016-11-30  Jakub Jelinek  

* config/i386/i386.c (dimode_scalar_chain::convert_op): Avoid
sharing the SUBREG rtx between move and following insn.

--- gcc/config/i386/i386.c.jj   2016-11-30 10:07:26.0 +0100
+++ gcc/config/i386/i386.c  2016-11-30 11:10:21.009991127 +0100
@@ -3723,7 +3723,7 @@ dimode_scalar_chain::convert_op (rtx *op
  emit_insn_before (seq, insn);
}
 
-  emit_insn_before (gen_move_insn (tmp, vec_cst), insn);
+  emit_insn_before (gen_move_insn (copy_rtx (tmp), vec_cst), insn);
   *op = tmp;
 }
   else

Jakub


[PATCHv3 5/7, GCC, ARM, V8M] Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute

2016-11-30 Thread Andre Vieira (lists)
Hi,

I got a bug report against the old version of this patch and fixed it
here. This had to do with GCC optimizations sharing types with and
without the 'cmse_nonsecure_call' attribute.  The patch now no longer
sets the main variant, this didn't seem to do what I thought it did.
Instead the patch now creates distinct type copies for every declared
pointer that eventually points to the function type with the attribute,
it will also create a distinct copy for the function type itself.
Another change in this patch was to make 'arm_comp_type_attributes', the
ARM implementation of TARGET_COMP_TYPE_ATTRIBUTES, deny compatibility
between function types with the attribute and without.

I added a test case to test the issue solved with these changes.

*** gcc/ChangeLog ***
2016-11-xx  Andre Vieira
Thomas Preud'homme  

* config/arm/arm.c (gimplify.h): New include.
(arm_handle_cmse_nonsecure_call): New.
(arm_attribute_table): Added cmse_nonsecure_call.
(arm_comp_type_attributes): Deny compatibility of function types
with
without the cmse_nonsecure_call attribute.
* doc/extend.texi (ARM ARMv8-M Security Extensions): New attribute.

*** gcc/testsuite/ChangeLog ***
2016-11-xx  Andre Vieira
Thomas Preud'homme  

* gcc.target/arm/cmse/cmse-3.c: Add tests.
* gcc.target/arm/cmse/cmse-4.c: Add tests.
* gcc.target/arm/cmse/cmse-15.c: New.


Cheers,
Andre
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
9509b0e25c19f35ed312efa1d86efc18a0db7674..d1160d5001830beea981df7454f09daf7fac2c5a
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -63,6 +63,7 @@
 #include "tm-constrs.h"
 #include "rtl-iter.h"
 #include "optabs-libfuncs.h"
+#include "gimplify.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -137,6 +138,7 @@ static tree arm_handle_isr_attribute (tree *, tree, tree, 
int, bool *);
 static tree arm_handle_notshared_attribute (tree *, tree, tree, int, bool *);
 #endif
 static tree arm_handle_cmse_nonsecure_entry (tree *, tree, tree, int, bool *);
+static tree arm_handle_cmse_nonsecure_call (tree *, tree, tree, int, bool *);
 static void arm_output_function_epilogue (FILE *, HOST_WIDE_INT);
 static void arm_output_function_prologue (FILE *, HOST_WIDE_INT);
 static int arm_comp_type_attributes (const_tree, const_tree);
@@ -348,6 +350,8 @@ static const struct attribute_spec arm_attribute_table[] =
   /* ARMv8-M Security Extensions support.  */
   { "cmse_nonsecure_entry", 0, 0, true, false, false,
 arm_handle_cmse_nonsecure_entry, false },
+  { "cmse_nonsecure_call", 0, 0, true, false, false,
+arm_handle_cmse_nonsecure_call, true },
   { NULL,   0, 0, false, false, false, NULL, false }
 };
 
@@ -6753,6 +6757,78 @@ arm_handle_cmse_nonsecure_entry (tree *node, tree name,
   return NULL_TREE;
 }
 
+
+/* Called upon detection of the use of the cmse_nonsecure_call attribute, this
+   function will check whether the attribute is allowed here and will add the
+   attribute to the function type tree or otherwise issue a diagnostic.  The
+   reason we check this at declaration time is to only allow the use of the
+   attribute with declarations of function pointers and not function
+   declarations.  This function checks NODE is of the expected type and issues
+   diagnostics otherwise using NAME.  If it is not of the expected type
+   *NO_ADD_ATTRS will be set to true.  */
+
+static tree
+arm_handle_cmse_nonsecure_call (tree *node, tree name,
+tree /* args */,
+int /* flags */,
+bool *no_add_attrs)
+{
+  tree decl = NULL_TREE, fntype = NULL_TREE;
+  tree main_variant, type;
+
+  if (!use_cmse)
+{
+  *no_add_attrs = true;
+  warning (OPT_Wattributes, "%qE attribute ignored without -mcmse option.",
+  name);
+  return NULL_TREE;
+}
+
+  if (TREE_CODE (*node) == VAR_DECL || TREE_CODE (*node) == TYPE_DECL)
+{
+  decl = *node;
+  fntype = TREE_TYPE (decl);
+}
+
+  while (fntype != NULL_TREE && TREE_CODE (fntype) == POINTER_TYPE)
+fntype = TREE_TYPE (fntype);
+
+  if (!decl || TREE_CODE (fntype) != FUNCTION_TYPE)
+{
+   warning (OPT_Wattributes, "%qE attribute only applies to base type of a 
"
+"function pointer", name);
+   *no_add_attrs = true;
+   return NULL_TREE;
+}
+
+  *no_add_attrs |= cmse_func_args_or_return_in_stack (NULL, name, fntype);
+
+  if (*no_add_attrs)
+return NULL_TREE;
+
+  /* Prevent trees being shared among function types with and without
+ cmse_nonsecure_call attribute.  */
+  type = TREE_TYPE (decl);
+
+  type = build_distinct_type_copy (type);
+  TREE_TYPE (decl) = type;
+  fntype = type;
+
+  while (TREE_CODE (fntype) != FUNCTION_TYPE)
+{
+  type = fntype;
+  fntype = TREE_TYPE (fntype);
+  fntype = build_distinct_type_copy (fntyp

Re: [PATCH 7/7, GCC, ARM, V8M] Added support for ARMV8-M Security Extension cmse_nonsecure_caller intrinsic

2016-11-30 Thread Andre Vieira (lists)
Hi,

I changed the testcase with this patch since the old testcase was
casting a function pointer to another function pointer and using that
pointer to call the function. This is undefined behavior. The new test
reflects a more sane use of the intrinsics.

Cheers,
Andre
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
5ed38d1608cfbfbd1248d76705fcf675bc36c2b2..1206814fb95268ba5ba008245513b0dc94f1
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -528,6 +528,8 @@ enum arm_builtins
   ARM_BUILTIN_GET_FPSCR,
   ARM_BUILTIN_SET_FPSCR,
 
+  ARM_BUILTIN_CMSE_NONSECURE_CALLER,
+
 #undef CRYPTO1
 #undef CRYPTO2
 #undef CRYPTO3
@@ -1833,6 +1835,17 @@ arm_init_builtins (void)
= add_builtin_function ("__builtin_arm_stfscr", ftype_set_fpscr,
ARM_BUILTIN_SET_FPSCR, BUILT_IN_MD, NULL, 
NULL_TREE);
 }
+
+  if (use_cmse)
+{
+  tree ftype_cmse_nonsecure_caller
+   = build_function_type_list (unsigned_type_node, NULL);
+  arm_builtin_decls[ARM_BUILTIN_CMSE_NONSECURE_CALLER]
+   = add_builtin_function ("__builtin_arm_cmse_nonsecure_caller",
+   ftype_cmse_nonsecure_caller,
+   ARM_BUILTIN_CMSE_NONSECURE_CALLER, BUILT_IN_MD,
+   NULL, NULL_TREE);
+}
 }
 
 /* Return the ARM builtin for CODE.  */
@@ -2453,6 +2466,12 @@ arm_expand_builtin (tree exp,
   emit_insn (pat);
   return target;
 
+case ARM_BUILTIN_CMSE_NONSECURE_CALLER:
+  target = gen_reg_rtx (SImode);
+  op0 = arm_return_addr (0, NULL_RTX);
+  emit_insn (gen_addsi3 (target, op0, const1_rtx));
+  return target;
+
 case ARM_BUILTIN_TEXTRMSB:
 case ARM_BUILTIN_TEXTRMUB:
 case ARM_BUILTIN_TEXTRMSH:
diff --git a/gcc/config/arm/arm_cmse.h b/gcc/config/arm/arm_cmse.h
index 
894343bb835b61e09c14668d45aa43a8693fd011..82b58b1c4f4a12ba6062e2cc2632653788d0eeb7
 100644
--- a/gcc/config/arm/arm_cmse.h
+++ b/gcc/config/arm/arm_cmse.h
@@ -163,6 +163,13 @@ __attribute__ ((__always_inline__))
 cmse_TTAT (void *__p)
 __CMSE_TT_ASM (at)
 
+/* FIXME: diagnose use outside cmse_nonsecure_entry functions.  */
+__extension__ static __inline int __attribute__ ((__always_inline__))
+cmse_nonsecure_caller (void)
+{
+  return __builtin_arm_cmse_nonsecure_caller ();
+}
+
 #define CMSE_AU_NONSECURE  2
 #define CMSE_MPU_NONSECURE 16
 #define CMSE_NONSECURE 18
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 
e8d8deefca87eacbdde6042fc1885c0d0a85a5b8..587a30d7efc6b1f2653efe96ed233828ebeb570a
 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -12640,6 +12640,7 @@ cmse_address_info_t cmse_TTAT_fptr (FPTR)
 void * cmse_check_address_range (void *, size_t, int)
 typeof(p) cmse_nsfptr_create (FPTR p)
 intptr_t cmse_is_nsfptr (FPTR)
+int cmse_nonsecure_caller (void)
 @end smallexample
 
 @node AVR Built-in Functions
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c 
b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
index 
d5b9a2d9d59569de170da814ae660e9fb2b943e7..c13272eed683aa06db027cd4646e5fe67817212b
 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
@@ -65,3 +65,42 @@ int foo (char * p)
 /* { dg-final { scan-assembler-times "ttat " 2 } } */
 /* { dg-final { scan-assembler-times "bl.cmse_check_address_range" 7 } } */
 /* { dg-final { scan-assembler-not "cmse_check_pointed_object" } } */
+
+int __attribute__ ((cmse_nonsecure_entry))
+baz (void)
+{
+  return cmse_nonsecure_caller ();
+}
+
+typedef int __attribute__ ((cmse_nonsecure_call)) (int_nsfunc_t) (void);
+
+int default_callback (void)
+{
+  return 0;
+}
+
+int_nsfunc_t * fp = (int_nsfunc_t *) default_callback;
+
+void __attribute__ ((cmse_nonsecure_entry))
+qux (int_nsfunc_t * callback)
+{
+  fp = cmse_nsfptr_create (callback);
+}
+
+int call_callback (void)
+{
+  if (cmse_is_nsfptr (fp))
+  return fp ();
+  else
+return default_callback ();
+}
+/* { dg-final { scan-assembler "baz:" } } */
+/* { dg-final { scan-assembler "__acle_se_baz:" } } */
+/* { dg-final { scan-assembler "qux:" } } */
+/* { dg-final { scan-assembler "__acle_se_qux:" } } */
+/* { dg-final { scan-assembler-not "\tcmse_nonsecure_caller" } } */
+/* { dg-final { scan-rtl-dump "and.*reg.*const_int 1" expand } } */
+/* { dg-final { scan-assembler "bic" } } */
+/* { dg-final { scan-assembler "push\t\{r4, r5, r6" } } */
+/* { dg-final { scan-assembler "msr\tAPSR_nzcvq" } } */
+/* { dg-final { scan-assembler-times "bl\\s+__gnu_cmse_nonsecure_call" 1 } } */


[PATCH] Reenable RTL sharing verification

2016-11-30 Thread Jakub Jelinek
Hi!

The http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01055.html
change broke all RTL sharing verification, even with --enable-checking=rtl
we don't verify anything for the last 3.5 years.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux together
with the previously posted 2 patches, ok for trunk if those 2 are approved?

Can people try --enable-checking=rtl on more targets?

2016-11-30  Jakub Jelinek  

* emit-rtl.c (verify_insn_sharing): Call verify_rtx_sharing instead of
reset_used_flags.

--- gcc/emit-rtl.c.jj   2016-11-23 16:47:35.0 +0100
+++ gcc/emit-rtl.c  2016-11-30 10:04:59.691135836 +0100
@@ -2826,10 +2826,10 @@ static void
 verify_insn_sharing (rtx insn)
 {
   gcc_assert (INSN_P (insn));
-  reset_used_flags (PATTERN (insn));
-  reset_used_flags (REG_NOTES (insn));
+  verify_rtx_sharing (PATTERN (insn), insn);
+  verify_rtx_sharing (REG_NOTES (insn), insn);
   if (CALL_P (insn))
-reset_used_flags (CALL_INSN_FUNCTION_USAGE (insn));
+verify_rtx_sharing (CALL_INSN_FUNCTION_USAGE (insn), insn);
 }
 
 /* Go through all the RTL insn bodies and check that there is no unexpected

Jakub


Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Matthias Klose
On 30.11.2016 12:38, Richard Biener wrote:
> On Wed, Nov 30, 2016 at 12:30 PM, Matthias Klose  wrote:

>> There's one more fix needed for the case of only having the pkg-config module
>> installed when configuring with --enable-objc-gc. We can't use 
>> PKG_CHECK_MODULES
>> directly because the pkg.m4 macros choke on the dash in the module name. Thus
>> setting the CFLAGS and LIBS directly. Ok to install?
> 
> Why not fix pkg.m4?

as discussed on irc, now avoiding the use pkg-config, and assuming that libgc is
available in default locations when no --with-target-bdw-gc flags are given.
Checked with --enable-objc-gc and --enabled-objc-gc=auto.  Ok to install?

Matthias




	* configure.ac: Don't use pkg-config to check for bdw-gc.

config/

	* pkg.m4: Remove.

libobjc/

	* configure.ac: Don't use pkg-config to check for bdw-gc.

gcc/

	* doc/install.texi: Don't use pkg-config to check for bdw-gc.

Index: configure.ac
===
--- configure.ac	(Revision 243006)
+++ configure.ac	(Arbeitskopie)
@@ -29,7 +29,6 @@
 m4_include([ltversion.m4])
 m4_include([lt~obsolete.m4])
 m4_include([config/isl.m4])
-m4_include([config/pkg.m4])
 
 AC_INIT(move-if-change)
 AC_PREREQ(2.64)
@@ -2076,10 +2075,8 @@
 case ,${enable_languages},:${enable_objc_gc} in *,objc,*:yes|*,objc,*:auto)
   AC_MSG_CHECKING([for bdw garbage collector])
   if test "x$with_target_bdw_gc$with_target_bdw_gc_include$with_target_bdw_gc_lib" = x; then
-dnl no bdw-gw options, fall back to the bdw-gc pkg-config module
-PKG_CHECK_EXISTS(bdw-gc,
-  AC_MSG_RESULT([using bdw-gc pkg-config module]),
-  AC_MSG_ERROR([no --with-target-bdw-gc options and no bdw-gc pkg-config module found]))
+dnl no bdw-gw options, assume default locations
+AC_MSG_RESULT([using bdw-gc in default locations])
   else
 dnl bdw-gw options, first error checking, complete checking in libobjc
 if test "x$with_target_bdw_gc_include" = x && test "x$with_target_bdw_gc_lib" != x; then
Index: gcc/doc/install.texi
===
--- gcc/doc/install.texi	(Revision 243006)
+++ gcc/doc/install.texi	(Arbeitskopie)
@@ -2204,8 +2204,7 @@
 @option{--with-target-bdw-gc-lib} must always be specified together
 for each multilib variant and they take precedence over
 @option{--with-target-bdw-gc}.  If none of these options are
-specified, the values are taken from the @command{pkg-config}
-@samp{bdw-gc} module.
+specified, the library is assumed in default locations.
 @end table
 
 @html
Index: libobjc/configure.ac
===
--- libobjc/configure.ac	(Revision 243006)
+++ libobjc/configure.ac	(Arbeitskopie)
@@ -18,8 +18,6 @@
 #along with GCC; see the file COPYING3.  If not see
 #.
 
-m4_include([../config/pkg.m4])
-
 AC_PREREQ(2.64)
 AC_INIT(package-unused, version-unused,, libobjc)
 AC_CONFIG_SRCDIR([objc/objc.h])
@@ -223,10 +221,9 @@
 *)
   AC_MSG_CHECKING([for bdw garbage collector])
   if test "x$with_target_bdw_gc$with_target_bdw_gc_include$with_target_bdw_gc_lib" = x; then
-dnl no bdw-gw options, fall back to the bdw-gc pkg-config module
-PKG_CHECK_EXISTS(bdw-gc,
-  AC_MSG_RESULT([using bdw-gc pkg-config module]),
-  AC_MSG_ERROR([no --with-target-bdw-gc options and no bdw-gc pkg-config module found]))
+dnl no bdw-gw options, assuming bdw-gc in default locations
+BDW_GC_CFLAGS=
+BDW_GC_LIBS="-lgc"
   else
 dnl bdw-gw options passed by configure flags
 if test "x$with_target_bdw_gc_include" = x && test "x$with_target_bdw_gc_lib" != x; then


[PATCH][PR sanitizer/78532] Fix libsanitizer build failure on sparc64-linux-gnu.

2016-11-30 Thread Maxim Ostapenko

Hi,

this patch restores libsanitizer build on sparc64-linux-gnu with Glibc 
2.20+ on board.

I verified that this patch fixes the build error locally with Glibc 2.24.

Ok for mainline?
libsanitizer/ChangeLog:

2016-11-30  Maxim Ostapenko  

	PR sanitizer/78532
	* sanitizer_common/sanitizer_platform_limits_posix.h
	(__sanitizer_sigaction): Adjust for sparc targets and various Glibc
	versions.

diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
index d1a3051..066bf41 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -25,6 +25,10 @@
 # define GET_LINK_MAP_BY_DLOPEN_HANDLE(handle) ((link_map*)(handle))
 #endif  // !SANITIZER_FREEBSD
 
+#ifndef __GLIBC_PREREQ
+#define __GLIBC_PREREQ(x, y) 0
+#endif
+
 namespace __sanitizer {
   extern unsigned struct_utsname_sz;
   extern unsigned struct_stat_sz;
@@ -628,7 +632,14 @@ namespace __sanitizer {
 #endif
 #ifndef __mips__
 #if defined(__sparc__)
+#if __GLIBC_PREREQ (2, 20)
+// On sparc glibc 2.19 and earlier sa_flags was unsigned long, and
+// __glibc_reserved0 didn't exist.
+int __glibc_reserved0;
+int sa_flags;
+#else
 unsigned long sa_flags;
+#endif
 #else
 int sa_flags;
 #endif


Re: [PATCH] Reenable RTL sharing verification

2016-11-30 Thread Richard Biener
On Wed, 30 Nov 2016, Jakub Jelinek wrote:

> Hi!
> 
> The http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01055.html
> change broke all RTL sharing verification, even with --enable-checking=rtl
> we don't verify anything for the last 3.5 years.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux together
> with the previously posted 2 patches, ok for trunk if those 2 are approved?

Ok.

Thanks,
Richard.

> Can people try --enable-checking=rtl on more targets?
> 
> 2016-11-30  Jakub Jelinek  
> 
>   * emit-rtl.c (verify_insn_sharing): Call verify_rtx_sharing instead of
>   reset_used_flags.
> 
> --- gcc/emit-rtl.c.jj 2016-11-23 16:47:35.0 +0100
> +++ gcc/emit-rtl.c2016-11-30 10:04:59.691135836 +0100
> @@ -2826,10 +2826,10 @@ static void
>  verify_insn_sharing (rtx insn)
>  {
>gcc_assert (INSN_P (insn));
> -  reset_used_flags (PATTERN (insn));
> -  reset_used_flags (REG_NOTES (insn));
> +  verify_rtx_sharing (PATTERN (insn), insn);
> +  verify_rtx_sharing (REG_NOTES (insn), insn);
>if (CALL_P (insn))
> -reset_used_flags (CALL_INSN_FUNCTION_USAGE (insn));
> +verify_rtx_sharing (CALL_INSN_FUNCTION_USAGE (insn), insn);
>  }
>  
>  /* Go through all the RTL insn bodies and check that there is no unexpected
> 
>   Ja



Re: [PATCH][PR sanitizer/78532] Fix libsanitizer build failure on sparc64-linux-gnu.

2016-11-30 Thread Jakub Jelinek
On Wed, Nov 30, 2016 at 03:10:49PM +0300, Maxim Ostapenko wrote:
> Hi,
> 
> this patch restores libsanitizer build on sparc64-linux-gnu with Glibc 2.20+
> on board.
> I verified that this patch fixes the build error locally with Glibc 2.24.
> 
> Ok for mainline?

> libsanitizer/ChangeLog:
> 
> 2016-11-30  Maxim Ostapenko  
> 
>   PR sanitizer/78532
>   * sanitizer_common/sanitizer_platform_limits_posix.h
>   (__sanitizer_sigaction): Adjust for sparc targets and various Glibc
>   versions.

Ok, but please mention it in LOCAL-PATCHES.

Jakub


Re: [PATCH] Fix PR78306

2016-11-30 Thread Andrew Senkevich
2016-11-30 11:52 GMT+03:00 Richard Biener :
> On Tue, 29 Nov 2016, Jeff Law wrote:
>
>> On 11/29/2016 12:47 AM, Richard Biener wrote:
>> > > Balaji added this check explicitly. There should be tests in the 
>> > > testsuite
>> > > (spawnee_inline, spawner_inline) which exercise that code.
>> >
>> > Yes he did, but no, nothing in the testsuite.
>> I believe the tests are:
>>
>> c-c++-common/cilk-plus/CK/spawnee_inline.c
>> c-c++-common/cilk-plus/CK/spawner_inline.c
>>
>> But as I mentioned, they don't check for proper behaviour
>
> Actually they do -- and both show what the issue might be, cilk+
> uses setjmp but we already have code to disallow inlining of
> functions calling setjmp (but we happily inline into functions
> calling setjmp).  When mangling the testcases to try forcing
> inlining I still (the patch was already applied) get
>
> /space/rguenther/src/gcc-git/gcc/testsuite/c-c++-common/cilk-plus/CK/spawnee_inline.c:
> In function ‘fib’:
> /space/rguenther/src/gcc-git/gcc/testsuite/c-c++-common/cilk-plus/CK/spawnee_inline.c:9:50:
> error: function ‘fib’ can never be copied because it receives a non-local
> goto
>
> so the intent was probably to disallow inlining of functions calling
> cilk_spawn, not to disable inlining into functions calling cilk_spawn.
>
> But as seen above this is already handled by generic code handling
> setjmp.
>
>>
>> >
>> > There is _nowhere_ documented _why_ the checks were added.  Why is
>> > inlining a transform that can do anything bad to a function using
>> > cilk_spawn?
>> I know, it's disappointing.  Even the tests mentioned above don't shed any
>> real light on the issue.
>
> One issue is obvious (but already handled).  Why all inlining should
> be disabled is indeed still a mystery.

I can suppose inline should be disabled for the next function after
cilk_spawn because spawn should be done for function.
If no way to disable the next call inlining it looks it was disabled
for all function to fix Cilk Plus Conformance Suite test fail.


--
WBR,
Andrew


Re: [PATCH] Fix ira.c RTL sharing bug

2016-11-30 Thread Bernd Schmidt

On 11/30/2016 01:02 PM, Jakub Jelinek wrote:

I've discovered today that RTL sharing verification has been disabled for
the last 3.5 years.  My --enable-checking=yes,rtl,extra bootstraps on
x86_64-linux and i686-linux revealed only two bugs.


Ok for both patches.


Bernd


[PATCH committed] Fix part of PR78555 - gcc/real.c:2890:25: runtime error: left shift of negative value -125

2016-11-30 Thread Markus Trippelsdorf
bootstrap-ubsan gcc shows:
 gcc/real.c:2890:25: runtime error: left shift of negative value -125

Fixed by casting to unsigned. 
Tested on ppc64le. Committed as obvious.

   PR ipa/78555
   * real.c (real_hash): Add cast to avoid left
   shifting of negative values.

diff --git a/gcc/real.c b/gcc/real.c
index 66e88e2ad366..eabe22de8510 100644
--- a/gcc/real.c
+++ b/gcc/real.c
@@ -2887,7 +2887,7 @@ real_hash (const REAL_VALUE_TYPE *r)
   return h;

 case rvc_normal:
-  h |= REAL_EXP (r) << 3;
+  h |= (unsigned int)REAL_EXP (r) << 3;
   break;

 case rvc_nan:

-- 
Markus


[PATCH] ira: Don't substitute into TRAP_IF insns (PR78610)

2016-11-30 Thread Segher Boessenkool
In the testcase, IRA propagates a constant into a TRAP_IF insn, which
then becomes an unconditional trap.  Unconditional traps are control
flow insns so doing this requires surgery on the cfg.  We cannot do
that here, so instead refuse to do the substitution.

Bootstrapping + regression testing on powerpc64-linux {-m64,-m32}
(the bug happened here with -m32); okay for trunk if this succeeds?


Segher


2016-11-30  Segher Boessenkool  

PR rtl-optimization/78610
* ira.c (combine_and_move_insns): Don't substitute into TRAP_IF
instructions.

gcc/testsuite/
PR rtl-optimization/78610
* gcc.c-torture/compile/pr78610.c: New testcase.

---
 gcc/ira.c |  5 +
 gcc/testsuite/gcc.c-torture/compile/pr78610.c | 14 ++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr78610.c

diff --git a/gcc/ira.c b/gcc/ira.c
index d20ec99..ccd4980 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3669,6 +3669,11 @@ combine_and_move_insns (void)
   if (JUMP_P (use_insn))
continue;
 
+  /* Also don't substitute into a conditional trap insn -- it can become
+an unconditional trap, and that is a flow control insn.  */
+  if (GET_CODE (PATTERN (use_insn)) == TRAP_IF)
+   continue;
+
   df_ref def = DF_REG_DEF_CHAIN (regno);
   gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && DF_REF_INSN_INFO (def));
   rtx_insn *def_insn = DF_REF_INSN (def);
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr78610.c 
b/gcc/testsuite/gcc.c-torture/compile/pr78610.c
new file mode 100644
index 000..0415ae6
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr78610.c
@@ -0,0 +1,14 @@
+/* PR rtl-optimization/78610 */
+
+unsigned int ao, gl;
+
+void
+ri (void)
+{
+  for (;;)
+{
+  if (ao != 1)
+ao /= 0;
+  gl = 0;
+}
+}
-- 
1.9.3



Re: [PATCH] ira: Don't substitute into TRAP_IF insns (PR78610)

2016-11-30 Thread Richard Biener
On Wed, Nov 30, 2016 at 1:46 PM, Segher Boessenkool
 wrote:
> In the testcase, IRA propagates a constant into a TRAP_IF insn, which
> then becomes an unconditional trap.  Unconditional traps are control
> flow insns so doing this requires surgery on the cfg.

Huh, that's an odd choice ;)  I'd say TRAP_IF should be a control-flow insn
as well, but well...

>  We cannot do
> that here, so instead refuse to do the substitution.
>
> Bootstrapping + regression testing on powerpc64-linux {-m64,-m32}
> (the bug happened here with -m32); okay for trunk if this succeeds?
>
>
> Segher
>
>
> 2016-11-30  Segher Boessenkool  
>
> PR rtl-optimization/78610
> * ira.c (combine_and_move_insns): Don't substitute into TRAP_IF
> instructions.
>
> gcc/testsuite/
> PR rtl-optimization/78610
> * gcc.c-torture/compile/pr78610.c: New testcase.
>
> ---
>  gcc/ira.c |  5 +
>  gcc/testsuite/gcc.c-torture/compile/pr78610.c | 14 ++
>  2 files changed, 19 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr78610.c
>
> diff --git a/gcc/ira.c b/gcc/ira.c
> index d20ec99..ccd4980 100644
> --- a/gcc/ira.c
> +++ b/gcc/ira.c
> @@ -3669,6 +3669,11 @@ combine_and_move_insns (void)
>if (JUMP_P (use_insn))
> continue;
>
> +  /* Also don't substitute into a conditional trap insn -- it can become
> +an unconditional trap, and that is a flow control insn.  */
> +  if (GET_CODE (PATTERN (use_insn)) == TRAP_IF)
> +   continue;
> +
>df_ref def = DF_REG_DEF_CHAIN (regno);
>gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && DF_REF_INSN_INFO (def));
>rtx_insn *def_insn = DF_REF_INSN (def);
> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr78610.c 
> b/gcc/testsuite/gcc.c-torture/compile/pr78610.c
> new file mode 100644
> index 000..0415ae6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/compile/pr78610.c
> @@ -0,0 +1,14 @@
> +/* PR rtl-optimization/78610 */
> +
> +unsigned int ao, gl;
> +
> +void
> +ri (void)
> +{
> +  for (;;)
> +{
> +  if (ao != 1)
> +ao /= 0;
> +  gl = 0;
> +}
> +}
> --
> 1.9.3
>


Re: [PATCH] ira: Don't substitute into TRAP_IF insns (PR78610)

2016-11-30 Thread Jeff Law

On 11/30/2016 05:46 AM, Segher Boessenkool wrote:

In the testcase, IRA propagates a constant into a TRAP_IF insn, which
then becomes an unconditional trap.  Unconditional traps are control
flow insns so doing this requires surgery on the cfg.  We cannot do
that here, so instead refuse to do the substitution.

Bootstrapping + regression testing on powerpc64-linux {-m64,-m32}
(the bug happened here with -m32); okay for trunk if this succeeds?


Segher


2016-11-30  Segher Boessenkool  

PR rtl-optimization/78610
* ira.c (combine_and_move_insns): Don't substitute into TRAP_IF
instructions.

gcc/testsuite/
PR rtl-optimization/78610
* gcc.c-torture/compile/pr78610.c: New testcase.
Funny how you speculated there could be these issues hiding in the 
weeds, then just a few days later, one crawls out.


OK.  I do wonder if we're going to need a better mechanism for this in 
the long term, but OK for now.


jeff



Re: [PATCH] ira: Don't substitute into TRAP_IF insns (PR78610)

2016-11-30 Thread Jeff Law

On 11/30/2016 05:52 AM, Richard Biener wrote:

On Wed, Nov 30, 2016 at 1:46 PM, Segher Boessenkool
 wrote:

In the testcase, IRA propagates a constant into a TRAP_IF insn, which
then becomes an unconditional trap.  Unconditional traps are control
flow insns so doing this requires surgery on the cfg.


Huh, that's an odd choice ;)  I'd say TRAP_IF should be a control-flow insn
as well, but well...

Perhaps.

The situation with TRAP_IF reminds me a lot of an indirect call, which 
after some kind of propagation turns into a direct call to a 
non-returning function.  It transforms from something that is not 
control flow in the traditional sense to something we do consider 
control flow.


Jeff



Re: [PATCH] Partial solution to LWG 523

2016-11-30 Thread Jonathan Wakely

On 26/11/16 16:27 -0800, Tim Shen wrote:

@@ -235,23 +242,86 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _StrTransT
  _M_transform(_CharT __ch) const
  {
-   return _M_transform_impl(__ch, typename integral_constant::type());
+   _StrTransT __str = _StrTransT(1, __ch);


I know the copy will be elided, but this could be:

_StrTransT __str(1, __ch);

Not a big deal though.


diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index 953aa87..2fb70b7 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1000,7 +1000,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  element_type&
  operator*() const noexcept
  {
-   __glibcxx_assert(_M_ptr != nullptr);
+   __glibcxx_assert(_M_get() != nullptr);
return *_M_get();
  }


Oops, thanks, but let's fix this separately (I'll do it now) so the
rest of the patch only touches regex stuff.


diff --git a/libstdc++-v3/testsuite/28_regex/traits/char/icase.cc 
b/libstdc++-v3/testsuite/28_regex/traits/char/icase.cc
new file mode 100644
index 000..6b2d9ee
--- /dev/null
+++ b/libstdc++-v3/testsuite/28_regex/traits/char/icase.cc
@@ -0,0 +1,75 @@
+// { dg_do run }
+// { dg-do run { target c++11 } }


The first dg-do should be removed.

OK for trunk with those changes, thanks.




[PATCH] [PR78112] Remove the g++.dg/pr78112.C testcase

2016-11-30 Thread Pierre-Marie de Rodat
Hello,

I recently added a testcase for PR78112's resolution. Unfortunately,
what it tests changes from one platform to another and I even get
different results for a single platform (see
). As multiple
developpers reported these errors and as the testcase relies on a
compiler behavior that still looks bogous to me, I suggest to remove the
testcase for now.

Ok to commit? Thank you in advance!

gcc/testsuite/
PR debug/78112
* g++.dg/pr78112.C: Remove unreliable testcase.
---
 gcc/testsuite/g++.dg/pr78112.C | 162 -
 1 file changed, 162 deletions(-)
 delete mode 100644 gcc/testsuite/g++.dg/pr78112.C

diff --git a/gcc/testsuite/g++.dg/pr78112.C b/gcc/testsuite/g++.dg/pr78112.C
deleted file mode 100644
index 986171d..000
--- a/gcc/testsuite/g++.dg/pr78112.C
+++ /dev/null
@@ -1,162 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-g -dA -std=gnu++11" } */
-/* { dg-final { scan-assembler-times DW_AT_inline 6 { xfail *-*-aix* } } } */
-/* { dg-final { scan-assembler-times DW_AT_object_pointer 37 { xfail *-*-aix* 
} } } */
-namespace std
-{
-template  struct integral_constant
-{
-  static constexpr _Tp value = 0;
-};
-template  using __bool_constant = integral_constant;
-struct __not_ : integral_constant
-{
-};
-template  class _Op,
-  typename... _Args>
-struct __detector
-{
-  using type = _Op<_Args...>;
-};
-template  class _Op,
-  typename... _Args>
-using __detected_or = __detector<_Default, void, _Op, _Args...>;
-template  class _Op,
-  typename... _Args>
-using __detected_or_t = typename __detected_or<_Default, _Op, _Args...>::type;
-template  class _Default,
-  template  class _Op, typename... _Args>
-using __detected_or_t_ = __detected_or_t<_Default<_Args...>, _Op, _Args...>;
-template  struct pair;
-template  using __replace_first_arg_t = int;
-}
-class new_allocator
-{
-};
-namespace std
-{
-template  using __allocator_base = new_allocator;
-template  class allocator : __allocator_base
-{
-};
-template  struct equal_to;
-struct __allocator_traits_base
-{
-  template 
-  using __rebind = typename _Alloc::template rebind<_Up>::other;
-};
-template 
-using __alloc_rebind
-= __detected_or_t_<__replace_first_arg_t,
-   __allocator_traits_base::__rebind, _Alloc, _Up>;
-}
-struct __alloc_traits
-{
-  static int _S_select_on_copy;
-};
-namespace std
-{
-template  struct hash;
-namespace __detail
-{
-struct _Select1st;
-struct _Hashtable_traits
-{
-  using __hash_cached = __bool_constant<0>;
-};
-template  struct _Hash_node;
-template  struct _Hash_node<_Value, 0>
-{
-  int &_M_v ();
-};
-struct _Mod_range_hashing;
-struct _Default_ranged_hash;
-struct _Prime_rehash_policy
-{
-  using __has_load_factor = integral_constant;
-};
-struct _Map_base
-{
-};
-struct _Insert
-{
-};
-template 
-using __has_load_factor = typename _Policy::__has_load_factor;
-template ,
- __has_load_factor, _RehashPolicy> >
-struct _Rehash_base;
-template 
-struct _Rehash_base<_RehashPolicy, _Traits, integral_constant > {};
-struct _Hashtable_ebo_helper { template  
_Hashtable_ebo_helper (_OtherTp); };
-struct _Hashtable_base {};
-struct _Equality {};
-template  struct _Hashtable_alloc : _Hashtable_ebo_helper
-{
-  using __ebo_node_alloc = _Hashtable_ebo_helper;
-  using __node_type = typename _NodeAlloc::value_type;
-  using __node_alloc_traits = __alloc_traits;
-  template  _Hashtable_alloc (_Alloc) : __ebo_node_alloc (0) 
{}
-  template  __node_type *_M_allocate_node (...);
-};
-}
-template  using __cache_default = __not_;
-template 
-struct _Hashtable : __detail::_Hashtable_base, __detail::_Map_base, 
__detail::_Insert, __detail::_Rehash_base<_RehashPolicy, int>,
-  __detail::_Equality, __detail:: _Hashtable_alloc<__alloc_rebind<_Alloc, 
__detail::_Hash_node<_Value, _Traits::__hash_cached::value> > >
-{
-  using __traits_type = _Traits;
-  using __hash_cached = typename __traits_type::__hash_cached;
-  using __node_type = __detail::_Hash_node<_Value, __hash_cached::value>;
-  using __node_alloc_type = __alloc_rebind<_Alloc, __node_type>;
-  using __hashtable_alloc = __detail::_Hashtable_alloc<__node_alloc_type>;
-  using typename __hashtable_alloc::__node_alloc_traits;
-  _Key key_type;
-  _RehashPolicy _M_rehash_policy;
-  template 
-  void _M_assign (const _Hashtable &, const _NodeGenerator &);
-  _Hashtable ();
-  _Hashtable (const _Hashtable &);
-};
-template 
-template 
-void _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal, _H1, _H2, _Hash, 
_RehashPolicy,
-_Traits>::_M_assign (const _Hashtable &__ht, const 
_NodeGenerator &__node_gen) try {
-__node_type *__this_n = __node_gen (0); } catch (...) {}
-template 
-_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal, _H1, _H2, _Hash, 
_RehashPolicy, _Traits>::_Hashtable (const _Hashtable &__ht)
-: __hashtable_alloc (__node_alloc_traits::_S_select_on_copy

Re: [PATCH] ira: Don't substitute into TRAP_IF insns (PR78610)

2016-11-30 Thread Richard Biener
On Wed, Nov 30, 2016 at 1:57 PM, Jeff Law  wrote:
> On 11/30/2016 05:52 AM, Richard Biener wrote:
>>
>> On Wed, Nov 30, 2016 at 1:46 PM, Segher Boessenkool
>>  wrote:
>>>
>>> In the testcase, IRA propagates a constant into a TRAP_IF insn, which
>>> then becomes an unconditional trap.  Unconditional traps are control
>>> flow insns so doing this requires surgery on the cfg.
>>
>>
>> Huh, that's an odd choice ;)  I'd say TRAP_IF should be a control-flow
>> insn
>> as well, but well...
>
> Perhaps.
>
> The situation with TRAP_IF reminds me a lot of an indirect call, which after
> some kind of propagation turns into a direct call to a non-returning
> function.  It transforms from something that is not control flow in the
> traditional sense to something we do consider control flow.

OTOH the noreturn flag is now on the GIMPLE stmt via
gimple_call_[set_]_ctrl_altering_p and thus the mere
transform into a direct call does _not_ make the call noreturn!  It only
opens up the possibility to optimize it so ;)  (fixup-cfg does that)

There's also the reverse - an indirect noreturn call transformed to
a direct not noreturn call.  This is why we have the flag explicit on the
IL as well (we keep that call as "noreturn" as we have no idea where to
redirect a new fallthru to).

Richard.

> Jeff
>


Re: [RFC] Assert DECL_ABSTRACT_ORIGIN is different from the decl itself

2016-11-30 Thread Martin Jambor
Hi,

On Tue, Nov 29, 2016 at 10:17:02AM -0700, Jeff Law wrote:
> On 11/29/2016 03:13 AM, Richard Biener wrote:
> > On Mon, Nov 28, 2016 at 6:28 PM, Martin Jambor  wrote:
> > > Hi Jeff,
> > > 
> > > On Mon, Nov 28, 2016 at 08:46:05AM -0700, Jeff Law wrote:
> > > > On 11/28/2016 07:27 AM, Martin Jambor wrote:
> > > > > Hi,
> > > > > 
> > > > > one of a number of symptoms of an otherwise unrelated HSA bug I've
> > > > > been debugging today is gcc crashing or hanging in the C++ pretty
> > > > > printer when attempting to emit a warning because dump_decl() ended up
> > > > > in an infinite recursion calling itself on the DECL_ABSTRACT_ORIGIN of
> > > > > the decl it was looking at, which was however the same thing.  (It was
> > > > > set to itself on purpose in set_decl_origin_self as a part of final
> > > > > pass, the decl was being printed because it was itself an abstract
> > > > > origin of another one).
> > > > > 
> > > > > If someone ever faces a similar problem, the following (untested)
> > > > > patch might save them a bit of time.  I have eventually decided not to
> > > > > make it a checking-only assert because it is on a cold path and
> > > > > because at release-build optimization levels, the tail-call is
> > > > > optimized to a jump and thus an infinite loop if the described
> > > > > situation happens, and I suppose an informative ICE is better tan that
> > > > > even for users.
> > > > > 
> > > > > What do you think?  Would it be reasonable for trunk even now or
> > > > > should I queue it for the next stage1?
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > Martin
> > > > > 
> > > > > 
> > > > > gcc/cp/
> > > > > 
> > > > > 2016-11-28  Martin Jambor  
> > > > > 
> > > > > * error.c (dump_decl): Add an assert that DECL_ABSTRACT_ORIGIN
> > > > > is not the decl itself.
> > > > Given it's on an error/debug path it ought to be plenty safe for now. 
> > > > What's
> > > > more interesting is whether or not DECL_ABSTRACT_ORIGIN can legitimately
> > > > point to itself and if so, how is that happening.
> > > 
> > > Well, I tried to explain it in my original email but I also wanted to
> > > be as brief as possible, so perhaps it is necessary to elaborate a bit:
> > > 
> > > There is a function set_decl_origin_self() in dwarf2out.c that does
> > > just that, sets DECL_ABSTRACT_ORIGIN to the decl itself, and its
> > > comment makes it clear that is intended (according to git blame, the
> > > whole comment and much of the implementation come from 1992, though ;-)
> > > The function is called from the "final" pass through dwarf2out_decl(),
> > > and gen_decl_die().
> > > 
> > > So, for one reason or another, this is the intended behavior.
> > > Apparently, after that one is not supposed to be printing the decl
> > > name of such a "finished" a function.  It is too bad however that this
> > > can happen if a "finished" function is itself an abstract origin of a
> > > different one, which is optimized and expanded only afterwards and you
> > > attempt to print its decl name, because it triggers printing the decl
> > > name of the finished function, in turn triggering the infinite
> > > recursion/loop.  I am quite surprised that we have not hit this
> > > earlier (e.g. with warnings in IPA-CP clones) but perhaps there is a
> > > reason.
> > > 
> > > I will append the patch to some bootstrap and testing run and commit
> > > it afterwards if it passes.
> > 
> > Other users explicitely check for the self-reference when walking origins.
> I think that makes it pretty clear that we have to handle self-reference.
> So it seems that rather than an assert that we should just not walk down a
> self-referencing DECL_ABSTRACT_ORIGIN.
> 

I'm not sure what you mean by "walk down."  The code in dump_decl()
that deals with function decls is:

case FUNCTION_DECL:
  if (! DECL_LANG_SPECIFIC (t))
{
  if (DECL_ABSTRACT_ORIGIN (t))
dump_decl (pp, DECL_ABSTRACT_ORIGIN (t), flags);
  else
pp_string (pp, M_(""));
}
  else if (DECL_GLOBAL_CTOR_P (t) || DECL_GLOBAL_DTOR_P (t))
dump_global_iord (pp, t);
  else
dump_function_decl (pp, t, flags);
  break;

I suppose that there are good reasons for treating
!DECL_LANG_SPECIFIC(t) specially and not pass it down to
dump_function_decl, even if DECL_ABSTRACT_ORIGIN (t) == t.  But
printing , though perhaps better than an ICE or hang, feels
also wrong and we already print it when we shouldn't (see PR 78589).

So I wonder what the options are... perhaps it seems that we can call
dump_function_name which starts with code handling
!DECL_LANG_SPECIFIC(t) cases, even instead of the weird 
thing?

I guess I'll give it a try later this week.

Thanks,

Martin


Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Dominik Vogt
On Wed, Nov 23, 2016 at 02:22:07PM +, Segher Boessenkool wrote:
> r242414, for PR77881, introduces some bugs (PR78390, PR78438, PR78477).
> It all has the same root cause: that patch makes combine convert every
> lowpart subreg of a logical shift right to a zero_extract.  This cannot
> work at all if it is not a constant shift, and it has to be a bit more
> careful exactly which bits it extracts.
> 
> Tested on powerpc64-linux {-m32,-m64} (where it fixes the regression
> c-c++-common/torture/vector-compare-1.c fails at -O1, and where it also
> has a bootstrap failure with some other patches).  Also tested that the
> x86_64 compiler still generates the wanted code for the PR77881 testcase.

Is this a side effect of the patch series?

  Trying 7 -> 9:
  ...
  Failed to match this instruction:
  (set (reg:SI 68 [ v_or ])
  (ior:SI (subreg:SI (zero_extract:DI (reg:DI 2 %r2 [ v_x ])
   ^^
  (const_int 16 [0x10])
  (const_int 0 [0])) 4)
^^^
  (reg:SI 70 [ v_and1 ])))

Shouldn't this be simply

  ...
  (ior:SI (zero_extract:SI (reg:DI) (16) (0)))
  ...

?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] Fix rtl sharing bug in i?86 stv pass

2016-11-30 Thread Uros Bizjak
On Wed, Nov 30, 2016 at 1:04 PM, Jakub Jelinek  wrote:
> Hi!
>
> This is another RTL sharing bug, tmp is created through gen_rtx_SUBREG
> and so shouldn't be shared by the move insn with the following insn that
> uses it.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux
> (--enable-checking=yes,rtl,extra), ok for trunk?
>
> 2016-11-30  Jakub Jelinek  
>
> * config/i386/i386.c (dimode_scalar_chain::convert_op): Avoid
> sharing the SUBREG rtx between move and following insn.

Uh, OK for mainline and backports.

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2016-11-30 10:07:26.0 +0100
> +++ gcc/config/i386/i386.c  2016-11-30 11:10:21.009991127 +0100
> @@ -3723,7 +3723,7 @@ dimode_scalar_chain::convert_op (rtx *op
>   emit_insn_before (seq, insn);
> }
>
> -  emit_insn_before (gen_move_insn (tmp, vec_cst), insn);
> +  emit_insn_before (gen_move_insn (copy_rtx (tmp), vec_cst), insn);
>*op = tmp;
>  }
>else
>
> Jakub


[Patch, Fortran, committed] PR 78593: [6/7 Regression] ICE in gfc_match_varspec, at fortran/primary.c:2053

2016-11-30 Thread Janus Weil
Hi all,

I have just committed to trunk another obvious one-line fix for an
ICE-on-invalid regression:

https://gcc.gnu.org/viewcvs?rev=243020&root=gcc&view=rev

I plan to backport this to the gcc-6 branch within a week or so.

Cheers,
Janus
Index: gcc/fortran/primary.c
===
--- gcc/fortran/primary.c   (revision 243019)
+++ gcc/fortran/primary.c   (working copy)
@@ -2050,7 +2050,7 @@ gfc_match_varspec (gfc_expr *primary, int equiv_fl
   if (m != MATCH_YES)
return MATCH_ERROR;
 
-  if (sym->f2k_derived)
+  if (sym && sym->f2k_derived)
tbp = gfc_find_typebound_proc (sym, &t, name, false, 
&gfc_current_locus);
   else
tbp = NULL;
Index: gcc/testsuite/gfortran.dg/derived_result.f90
===
--- gcc/testsuite/gfortran.dg/derived_result.f90(nonexistent)
+++ gcc/testsuite/gfortran.dg/derived_result.f90(working copy)
@@ -0,0 +1,10 @@
+! { dg-do compile }
+!
+! PR 78593: [6/7 Regression] ICE in gfc_match_varspec, at 
fortran/primary.c:2053
+!
+! Contributed by Gerhard Steinmetz 
+
+type(t) function add (x, y)! { dg-error "is not accessible" }
+  integer, intent(in) :: x, y
+  add%a = x + y! { dg-error "Unclassifiable statement" }
+end


Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Andre Vehreschild
Hi Paul,

thanks for the review. Committed with the changes requested and the one
reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with -m32 as
r243021. 

Thanks for the review and tests.

Regards,
Andre

On Wed, 30 Nov 2016 07:49:13 +0100
Paul Richard Thomas  wrote:

> Dear Andre,
> 
> This all looks OK to me. The only comment that I have that you might
> deal with before committing is that some of the Boolean expressions,
> eg:
> +  int caf_dereg_mode
> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
> +  || c->attr.codimension)
> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
> +  : GFC_CAF_COARRAY_DEREGISTER)
> +  : GFC_CAF_COARRAY_NOCOARRAY;
> 
> are getting be sufficiently convoluted that a small, appropriately
> named, helper function might be clearer. Of course, this is true of
> many parts of gfortran but it is not too late to start making the code
> a bit clearer.
> 
> You can commit to the present trunk as far as I am concerned. I know
> that the caf enthusiasts will test it to bits before release!
> 
> Regards
> 
> Paul
> 
> 
> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:
> > PING!
> >
> > I know it's a lengthy patch, but comments would be nice anyway.
> >
> > - Andre
> >
> > On Tue, 22 Nov 2016 20:46:50 +0100
> > Andre Vehreschild  wrote:
> >  
> >> Hi all,
> >>
> >> attached patch addresses the need of extending the API of the caf-libs to
> >> enable allocatable components asynchronous allocation. Allocatable
> >> components in derived type coarrays are different from regular coarrays or
> >> coarrayed components. The latter have to be allocated on all images or on
> >> none. Furthermore is the allocation a point of synchronisation.
> >>
> >> For allocatable components the F2008 allows to have some allocated on some
> >> images and on others not. Furthermore is the registration with the caf-lib,
> >> that an allocatable component is present in a derived type coarray no
> >> longer a synchronisation point. To implement these features two new types
> >> of coarray registration have been introduced. The first one just
> >> registering the component with the caf-lib and the latter doing the
> >> allocate. Furthermore has the caf-API been extended to provide a query
> >> function to learn about the allocation status of a component on a remote
> >> image.
> >>
> >> Sorry, that the patch is rather lengthy. Most of this is due to the
> >> structure_alloc_comps' signature change. The routine and its wrappers are
> >> used rather often which needed the appropriate changes.
> >>
> >> I know I left two or three TODOs in the patch to remind me of things I
> >> have to investigate further. For the current state these TODOs are no
> >> reason to hold back the patch. The third party library opencoarrays
> >> implements the mpi-part of the caf-model and will change in sync. It would
> >> of course be advantageous to just have to say: With gcc-7 gfortran
> >> implements allocatable components in derived coarrays nearly completely.
> >>
> >> I know we are in stage 3. But the patch bootstraps and regtests ok on
> >> x86_64-linux/F23. So, is it ok for trunk or shall it go to 7.2?
> >>
> >> Regards,
> >>   Andre  
> >
> >
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  
> 
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: libgfortran/caf/single.c
===
--- libgfortran/caf/single.c	(Revision 243020)
+++ libgfortran/caf/single.c	(Arbeitskopie)
@@ -144,11 +144,17 @@
   || type == CAF_REGTYPE_CRITICAL || type == CAF_REGTYPE_EVENT_STATIC
   || type == CAF_REGTYPE_EVENT_ALLOC)
 local = calloc (size, sizeof (bool));
+  else if (type == CAF_REGTYPE_COARRAY_ALLOC_REGISTER_ONLY)
+local = NULL;
   else
 local = malloc (size);
-  *token = malloc (sizeof (struct caf_single_token));
 
-  if (unlikely (local == NULL || *token == NULL))
+  if (type != CAF_REGTYPE_COARRAY_ALLOC_ALLOCATE_ONLY)
+*token = malloc (sizeof (struct caf_single_token));
+
+  if (unlikely (*token == NULL
+		|| (local == NULL
+		&& type != CAF_REGTYPE_COARRAY_ALLOC_REGISTER_ONLY)))
 {
   /* Freeing the memory conditionally seems pointless, but
 	 caf_internal_error () may return, when a stat is given and then the
@@ -163,7 +169,7 @@
 
   single_token = TOKEN (*token);
   single_token->memptr = local;
-  single_token->owning_memory = true;
+  single_token->owning_memory = type != CAF_REGTYPE_COARRAY_ALLOC_REGISTER_ONLY;
   single_token->desc = GFC_DESCRIPTOR_RANK (data) > 0 ? data : NULL;
 
 
@@ -184,7 +190,7 @@
 
 
 void
-_gfortran_caf_deregister (caf_token_t *token, int *stat,
+_gfortran_caf_deregister (caf_token_t *token, caf_deregister_t type, int *stat,
 			  char *errmsg __attribute__ ((unused)),
 			  int errmsg_len __attribute__ ((unused)))
 {
@@ -193,7 +199,16 @@
   if (

Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Segher Boessenkool
On Wed, Nov 30, 2016 at 02:12:35PM +0100, Dominik Vogt wrote:
> On Wed, Nov 23, 2016 at 02:22:07PM +, Segher Boessenkool wrote:
> > r242414, for PR77881, introduces some bugs (PR78390, PR78438, PR78477).
> > It all has the same root cause: that patch makes combine convert every
> > lowpart subreg of a logical shift right to a zero_extract.  This cannot
> > work at all if it is not a constant shift, and it has to be a bit more
> > careful exactly which bits it extracts.
> > 
> > Tested on powerpc64-linux {-m32,-m64} (where it fixes the regression
> > c-c++-common/torture/vector-compare-1.c fails at -O1, and where it also
> > has a bootstrap failure with some other patches).  Also tested that the
> > x86_64 compiler still generates the wanted code for the PR77881 testcase.
> 
> Is this a side effect of the patch series?

I do not know; I cannot tell just from this, and there is no source
snippet to try.  Maybe Michael can tell?

>   Trying 7 -> 9:
>   ...
>   Failed to match this instruction:
>   (set (reg:SI 68 [ v_or ])
>   (ior:SI (subreg:SI (zero_extract:DI (reg:DI 2 %r2 [ v_x ])
>^^
>   (const_int 16 [0x10])
>   (const_int 0 [0])) 4)
> ^^^
>   (reg:SI 70 [ v_and1 ])))
> 
> Shouldn't this be simply
> 
>   ...
>   (ior:SI (zero_extract:SI (reg:DI) (16) (0)))
>   ...

That seems nicer, sure.  OTOH that will never match on a target that
does not have zero_extract:SI from :DI.  *_extracts are nasty.


Segher


Re: [PATCH] ira: Don't substitute into TRAP_IF insns (PR78610)

2016-11-30 Thread Segher Boessenkool
On Wed, Nov 30, 2016 at 01:52:51PM +0100, Richard Biener wrote:
> On Wed, Nov 30, 2016 at 1:46 PM, Segher Boessenkool
>  wrote:
> > In the testcase, IRA propagates a constant into a TRAP_IF insn, which
> > then becomes an unconditional trap.  Unconditional traps are control
> > flow insns so doing this requires surgery on the cfg.
> 
> Huh, that's an odd choice ;)  I'd say TRAP_IF should be a control-flow insn
> as well, but well...

It doesn't really matter here, converting a conditional TRAP_IF to an
unconditional one requires changing the cfg in any case (and we cannot
do that here).

Making every TRAP_IF a control flow insn means they will end their BB;
that will make some things simpler, sure.  It will also limit some of
the RTL optimizations a bit.


Segher


Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Michael Matz
Hi,

On Wed, 30 Nov 2016, Dominik Vogt wrote:

> On Wed, Nov 23, 2016 at 02:22:07PM +, Segher Boessenkool wrote:
> > r242414, for PR77881, introduces some bugs (PR78390, PR78438, PR78477).
> > It all has the same root cause: that patch makes combine convert every
> > lowpart subreg of a logical shift right to a zero_extract.  This cannot
> > work at all if it is not a constant shift, and it has to be a bit more
> > careful exactly which bits it extracts.
> > 
> > Tested on powerpc64-linux {-m32,-m64} (where it fixes the regression
> > c-c++-common/torture/vector-compare-1.c fails at -O1, and where it also
> > has a bootstrap failure with some other patches).  Also tested that the
> > x86_64 compiler still generates the wanted code for the PR77881 testcase.
> 
> Is this a side effect of the patch series?

What is "this"?

>   Trying 7 -> 9:
>   ...
>   Failed to match this instruction:
>   (set (reg:SI 68 [ v_or ])
>   (ior:SI (subreg:SI (zero_extract:DI (reg:DI 2 %r2 [ v_x ])
>^^
>   (const_int 16 [0x10])
>   (const_int 0 [0])) 4)
> ^^^
>   (reg:SI 70 [ v_and1 ])))
> 
> Shouldn't this be simply
> 
>   ...
>   (ior:SI (zero_extract:SI (reg:DI) (16) (0)))
>   ...

I don't think mode-changing _extracts are valid in this context.  From the 
docu:

  `(sign_extract:M LOC SIZE POS)'
  ...
 The mode M is the same as the mode that would be used for LOC if
 it were a register.

Probably it could be made to work just fine, but I'm not sure it'd be 
worth much, as then the targets would need to care for mode-changes 
occuring not just through subregs as usual, but also through extracts.


Ciao,
Michael.


Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Janus Weil
Hi Andre,

after your commit I see several warnings when compiling libgfortran
(see below). Could you please fix those (if possible)?

Thanks,
Janus



/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
‘_gfortran_caf_is_present’:
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
this statement may fall through [-Wimplicit-fallthrough=]
 if (riter->next == NULL)
^
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
   case CAF_ARR_REF_VECTOR:
   ^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
this statement may fall through [-Wimplicit-fallthrough=]
 if (riter->next == NULL)
^
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
   case CAF_ARR_REF_VECTOR:
   ^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
this statement may fall through [-Wimplicit-fallthrough=]
 if (riter->next == NULL)
^
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
   case CAF_ARR_REF_VECTOR:
   ^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
this statement may fall through [-Wimplicit-fallthrough=]
 if (riter->next == NULL)
^
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
   case CAF_ARR_REF_VECTOR:
   ^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
‘_gfortran_caf_get_by_ref’:
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
‘src_size’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
   if (size == 0 || src_size == 0)
~^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
‘_gfortran_caf_send_by_ref’:
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
‘src_size’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
   if (size == 0 || src_size == 0)
~^~~~




2016-11-30 14:30 GMT+01:00 Andre Vehreschild :
> Hi Paul,
>
> thanks for the review. Committed with the changes requested and the one
> reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with -m32 
> as
> r243021.
>
> Thanks for the review and tests.
>
> Regards,
> Andre
>
> On Wed, 30 Nov 2016 07:49:13 +0100
> Paul Richard Thomas  wrote:
>
>> Dear Andre,
>>
>> This all looks OK to me. The only comment that I have that you might
>> deal with before committing is that some of the Boolean expressions,
>> eg:
>> +  int caf_dereg_mode
>> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
>> +  || c->attr.codimension)
>> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
>> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
>> +  : GFC_CAF_COARRAY_DEREGISTER)
>> +  : GFC_CAF_COARRAY_NOCOARRAY;
>>
>> are getting be sufficiently convoluted that a small, appropriately
>> named, helper function might be clearer. Of course, this is true of
>> many parts of gfortran but it is not too late to start making the code
>> a bit clearer.
>>
>> You can commit to the present trunk as far as I am concerned. I know
>> that the caf enthusiasts will test it to bits before release!
>>
>> Regards
>>
>> Paul
>>
>>
>> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:
>> > PING!
>> >
>> > I know it's a lengthy patch, but comments would be nice anyway.
>> >
>> > - Andre
>> >
>> > On Tue, 22 Nov 2016 20:46:50 +0100
>> > Andre Vehreschild  wrote:
>> >
>> >> Hi all,
>> >>
>> >> attached patch addresses the need of extending the API of the caf-libs to
>> >> enable allocatable components asynchronous allocation. Allocatable
>> >> components in derived type coarrays are different from regular coarrays or
>> >> coarrayed components. The latter have to be allocated on all images or on
>> >> none. Furthermore is the allocation a point of synchronisation.
>> >>
>> >> For allocatable components the F2008 allows to have some allocated on some
>> >> images and on others not. Furthermore is the registration with the 
>> >> caf-lib,
>> >> that an allocatable component is present in a derived type coarray no
>> >> longer a synchronisation point. To implement these features two new types
>> >> of coarray registration have been introduced. The first one just
>> >> registering the component with the caf-lib and the latter doing the
>> >> allocate. Furthermore has the caf-API been extended to provide a query
>> >> function to learn about the allocation status of a component on a remote
>> >> image.
>> >>
>> >> Sorry, that the patch is rather lengthy. Most of this is due to the
>> >> structure_alloc_comps' signature change. The routine and its wrappers are
>> >> used rather often which needed the appropriate changes.
>> >>
>> >> I know I left two or three TODOs in the patch to remind me of things I
>> >> have to investigate further. For the current state these TODOs are no
>> >> reason to hold back the patch. The third party library opencoarr

Re: [PATCH] ira: Don't substitute into TRAP_IF insns (PR78610)

2016-11-30 Thread Segher Boessenkool
On Wed, Nov 30, 2016 at 05:54:58AM -0700, Jeff Law wrote:
> Funny how you speculated there could be these issues hiding in the 
> weeds, then just a few days later, one crawls out.

Two (there is PR78607 as well).  Although that one seems related to the
combine one.  All the same reporter, it's not a big coincidence ;-)


Segher


[PATCH PR78574]Fix infinite recursion in find_deriving_biv_for_expr

2016-11-30 Thread Bin Cheng
Hi,
Loop header PHI defining IV(biv) may not be identified as biv because its 
increment statement is in (irreducible) inner loop.  Function 
find_deriving_biv_for_expr doesn't take this into consideration and runs into 
infinite recursion.  The patch fixes this issue by skipping such loop header 
PHI.  Bootstrap and test on x86_64 and AArch64, is it OK?

BTW, we don't mark such IV as biv because of below code:

  /* If the increment is in the subloop, ignore it.  */
  incr_bb = gimple_bb (SSA_NAME_DEF_STMT (var));
  if (incr_bb->loop_father != data->current_loop
  || (incr_bb->flags & BB_IRREDUCIBLE_LOOP))
continue;

I thought twice and this check may be too strict.  Given valid incr_iv returned 
by simple_iv, we know it behaves just like usual increment IVs.  In other 
words, though the increment statement is executed multiple times in inner loop, 
it computes the same result for every iteration.  Anyway this is stage1 work.

Thanks,
bin

2016-11-30  Bin Cheng  

PR tree-optimization/78574
* tree-ssa-loop-ivopts.c (find_deriving_biv_for_expr): Skip loop
header PHI that doesn't define biv.

gcc/testsuite/ChangeLog
2016-11-30  Bin Cheng  

PR tree-optimization/78574
* gcc.c-torture/compile/pr78574.c: New test.diff --git a/gcc/testsuite/gcc.c-torture/compile/pr78574.c 
b/gcc/testsuite/gcc.c-torture/compile/pr78574.c
new file mode 100644
index 000..8c91d1e
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr78574.c
@@ -0,0 +1,27 @@
+/* PR tree-optimization/78574 */
+
+int a, d, f, g;
+int b[1];
+short h;
+int main() {
+  long j;
+  int k, i;
+  for (; j; j++) {
+i = 0;
+for (; i < 6; i++) {
+  int l = a, m = d || g;
+L:
+  l ^ m | a;
+}
+b[j + 1] = 2;
+++k;
+for (; g; g++) {
+  d ^= h;
+  if (f)
+for (;;)
+  ;
+}
+  }
+  if (k)
+goto L;
+}
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 5c667a2..00b287a 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -1853,6 +1853,11 @@ find_deriving_biv_for_expr (struct ivopts_data *data, 
tree expr)
 {
   ssa_op_iter iter;
   use_operand_p use_p;
+  basic_block phi_bb = gimple_bb (phi);
+
+  /* Skip loop header PHI that doesn't define biv.  */
+  if (phi_bb->loop_father == data->current_loop)
+   return NULL;
 
   if (virtual_operand_p (gimple_phi_result (phi)))
return NULL;


Re: [PATCH][AArch64] Separate shrink wrapping hooks implementation

2016-11-30 Thread Kyrill Tkachov


On 29/11/16 20:29, Segher Boessenkool wrote:

Hi James, Kyrill,

On Tue, Nov 29, 2016 at 10:57:33AM +, James Greenhalgh wrote:

+static sbitmap
+aarch64_components_for_bb (basic_block bb)
+{
+  bitmap in = DF_LIVE_IN (bb);
+  bitmap gen = &DF_LIVE_BB_INFO (bb)->gen;
+  bitmap kill = &DF_LIVE_BB_INFO (bb)->kill;
+
+  sbitmap components = sbitmap_alloc (V31_REGNUM + 1);
+  bitmap_clear (components);
+
+  /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets.  */
+  for (unsigned regno = R0_REGNUM; regno <= V31_REGNUM; regno++)

The use of R0_REGNUM and V31_REGNUM scare me a little bit, as we're hardcoding
where the end of the register file is (does this, for example, fall apart
with the SVE work that was recently posted). Something like a
LAST_HARDREG_NUM might work?

Components and registers aren't the same thing (you can have components
for things that aren't just a register save, e.g. the frame setup, stack
alignment, save of some non-GPR via a GPR, PIC register setup, etc.)
The loop here should really only cover the non-volatile registers, and
there should be some translation from register number to component number
(it of course is convenient to have a 1-1 translation for the GPRs and
floating point registers).  For rs6000 many things in the backend already
use non-symbolic numbers for the FPRs and GPRs, so that is easier there.


Anyway, here's the patch with James's comments implemented.
I've introduced LAST_SAVED_REGNUM which is used to delimit the registers
considered for shrink-wrapping.

aarch64_process_components is introduced and used to implement
the emit_prologue_components and emit_epilogue_components functions in a single 
place.

Bootstrapped and tested on aarch64-none-linux-gnu.

Thanks,
Kyrill

2016-11-30  Kyrylo Tkachov  

* config/aarch64/aarch64.h (machine_function): Add
reg_is_wrapped_separately field.
* config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Define new constant.
* config/aarch64/aarch64.c (emit_set_insn): Change return type to
rtx_insn *.
(aarch64_save_callee_saves): Don't save registers that are wrapped
separately.
(aarch64_restore_callee_saves): Don't restore registers that are
wrapped separately.
(offset_9bit_signed_unscaled_p, offset_12bit_unsigned_scaled_p,
aarch64_offset_7bit_signed_scaled_p): Move earlier in the file.
(aarch64_get_separate_components): New function.
(aarch64_get_next_set_bit): Likewise.
(aarch64_components_for_bb): Likewise.
(aarch64_disqualify_components): Likewise.
(aarch64_emit_prologue_components): Likewise.
(aarch64_emit_epilogue_components): Likewise.
(aarch64_set_handled_components): Likewise.
(aarch64_process_components): Likewise.
(TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS,
TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB,
TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS,
TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS,
TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS,
TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Define.

+static void
+aarch64_disqualify_components (sbitmap, edge, sbitmap, bool)
+{
+}

Is there no default "do nothing" hook for this?

I can make the shrink-wrap code do nothing here if this hook isn't
defined, if you want?


I don't mind either way.
If you do it I'll then remove the empty implementation in aarch64.




Segher


commit 194816281ec6da2620bb34c9278ed7edf8bcf0da
Author: Kyrylo Tkachov 
Date:   Tue Oct 11 09:25:54 2016 +0100

[AArch64] Separate shrink wrapping hooks implementation

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 82bfe14..48e6e2c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1138,7 +1138,7 @@ aarch64_is_extend_from_extract (machine_mode mode, rtx mult_imm,
 
 /* Emit an insn that's a simple single-set.  Both the operands must be
known to be valid.  */
-inline static rtx
+inline static rtx_insn *
 emit_set_insn (rtx x, rtx y)
 {
   return emit_insn (gen_rtx_SET (x, y));
@@ -3135,6 +3135,9 @@ aarch64_save_callee_saves (machine_mode mode, HOST_WIDE_INT start_offset,
 	  || regno == cfun->machine->frame.wb_candidate2))
 	continue;
 
+  if (cfun->machine->reg_is_wrapped_separately[regno])
+   continue;
+
   reg = gen_rtx_REG (mode, regno);
   offset = start_offset + cfun->machine->frame.reg_offset[regno];
   mem = gen_mem_ref (mode, plus_constant (Pmode, stack_pointer_rtx,
@@ -3143,6 +3146,7 @@ aarch64_save_callee_saves (machine_mode mode, HOST_WIDE_INT start_offset,
   regno2 = aarch64_next_callee_save (regno + 1, limit);
 
   if (regno2 <= limit
+	  && !cfun->machine->reg_is_wrapped_separately[regno2]
 	  && ((cfun->machine->frame.reg_offset[regno] + UNITS_PER_WORD)
 	  == cfun->machine->frame.reg_offset[regno2]))
 
@@ -3191,6 +3195,9 @@ aarch64_restore_callee_saves (machine_mode mode,
regno <= limit;
regno = aarch64_next_callee_save (regno + 1, limit))
 {
+  if (cfun->machine->reg_is_

Re: [PATCH PR78574]Fix infinite recursion in find_deriving_biv_for_expr

2016-11-30 Thread Richard Biener
On Wed, Nov 30, 2016 at 2:54 PM, Bin Cheng  wrote:
> Hi,
> Loop header PHI defining IV(biv) may not be identified as biv because its 
> increment statement is in (irreducible) inner loop.  Function 
> find_deriving_biv_for_expr doesn't take this into consideration and runs into 
> infinite recursion.  The patch fixes this issue by skipping such loop header 
> PHI.  Bootstrap and test on x86_64 and AArch64, is it OK?

Ok.

Richard.

> BTW, we don't mark such IV as biv because of below code:
>
>   /* If the increment is in the subloop, ignore it.  */
>   incr_bb = gimple_bb (SSA_NAME_DEF_STMT (var));
>   if (incr_bb->loop_father != data->current_loop
>   || (incr_bb->flags & BB_IRREDUCIBLE_LOOP))
> continue;
>
> I thought twice and this check may be too strict.  Given valid incr_iv 
> returned by simple_iv, we know it behaves just like usual increment IVs.  In 
> other words, though the increment statement is executed multiple times in 
> inner loop, it computes the same result for every iteration.  Anyway this is 
> stage1 work.
>
> Thanks,
> bin
>
> 2016-11-30  Bin Cheng  
>
> PR tree-optimization/78574
> * tree-ssa-loop-ivopts.c (find_deriving_biv_for_expr): Skip loop
> header PHI that doesn't define biv.
>
> gcc/testsuite/ChangeLog
> 2016-11-30  Bin Cheng  
>
> PR tree-optimization/78574
> * gcc.c-torture/compile/pr78574.c: New test.


Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Segher Boessenkool
On Wed, Nov 30, 2016 at 02:43:12PM +0100, Michael Matz wrote:
> > Shouldn't this be simply
> > 
> >   ...
> >   (ior:SI (zero_extract:SI (reg:DI) (16) (0)))
> >   ...
> 
> I don't think mode-changing _extracts are valid in this context.  From the 
> docu:
> 
>   `(sign_extract:M LOC SIZE POS)'
>   ...
>  The mode M is the same as the mode that would be used for LOC if
>  it were a register.
> 
> Probably it could be made to work just fine, but I'm not sure it'd be 
> worth much, as then the targets would need to care for mode-changes 
> occuring not just through subregs as usual, but also through extracts.

The patch https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02987.html I
submitted yesterday deals with this same issue, FWIW -- some ports
apparently already do mode-changing extracts.


Segher


Re: [PATCH] Fix PR78306

2016-11-30 Thread Richard Biener
On Wed, 30 Nov 2016, Andrew Senkevich wrote:

> 2016-11-30 11:52 GMT+03:00 Richard Biener :
> > On Tue, 29 Nov 2016, Jeff Law wrote:
> >
> >> On 11/29/2016 12:47 AM, Richard Biener wrote:
> >> > > Balaji added this check explicitly. There should be tests in the 
> >> > > testsuite
> >> > > (spawnee_inline, spawner_inline) which exercise that code.
> >> >
> >> > Yes he did, but no, nothing in the testsuite.
> >> I believe the tests are:
> >>
> >> c-c++-common/cilk-plus/CK/spawnee_inline.c
> >> c-c++-common/cilk-plus/CK/spawner_inline.c
> >>
> >> But as I mentioned, they don't check for proper behaviour
> >
> > Actually they do -- and both show what the issue might be, cilk+
> > uses setjmp but we already have code to disallow inlining of
> > functions calling setjmp (but we happily inline into functions
> > calling setjmp).  When mangling the testcases to try forcing
> > inlining I still (the patch was already applied) get
> >
> > /space/rguenther/src/gcc-git/gcc/testsuite/c-c++-common/cilk-plus/CK/spawnee_inline.c:
> > In function ‘fib’:
> > /space/rguenther/src/gcc-git/gcc/testsuite/c-c++-common/cilk-plus/CK/spawnee_inline.c:9:50:
> > error: function ‘fib’ can never be copied because it receives a non-local
> > goto
> >
> > so the intent was probably to disallow inlining of functions calling
> > cilk_spawn, not to disable inlining into functions calling cilk_spawn.
> >
> > But as seen above this is already handled by generic code handling
> > setjmp.
> >
> >>
> >> >
> >> > There is _nowhere_ documented _why_ the checks were added.  Why is
> >> > inlining a transform that can do anything bad to a function using
> >> > cilk_spawn?
> >> I know, it's disappointing.  Even the tests mentioned above don't shed any
> >> real light on the issue.
> >
> > One issue is obvious (but already handled).  Why all inlining should
> > be disabled is indeed still a mystery.
> 
> I can suppose inline should be disabled for the next function after
> cilk_spawn because spawn should be done for function.
> If no way to disable the next call inlining it looks it was disabled
> for all function to fix Cilk Plus Conformance Suite test fail.

I see.  Even with GCC 6.2 the conformace test suite has a lot of FAILs
(and compiler ICEs):

In 734 tests: 229 (60 compilation, 169 execution) tests do not conform 
specification, 505 conform

using the suite in version 1.2.1.  For current trunk I get

In 734 tests: 198 (43 compilation, 155 execution) tests do not conform 
specification, 536 conform

where gcc 6.2 vs. gcc 7 diff is

-  PASS cn_cilk_for_label_in.c
-  PASS cn_cilk_for_label_out.c
+  FAIL cn_cilk_for_label_in.c (internal compiler error)
+  FAIL cn_cilk_for_label_out.c (internal compiler error)
-  PASS cn_cilk_for_return.c
+  FAIL cn_cilk_for_return.c (internal compiler error)
-  PASS ep_cilk_for_compare1.c
+  FAIL ep_cilk_for_compare1.c (internal compiler error)
-  PASS ep_cilk_for_increment1.c
+  FAIL ep_cilk_for_increment1.c (internal compiler error)
-  PASS ep_cilk_for_nest1.c
-  PASS ep_cilk_for_pragma2.c
+  FAIL ep_cilk_for_nest1.c (internal compiler error)
+  FAIL ep_cilk_for_pragma2.c (internal compiler error)
-  PASS cn_cilk_for_init1.cpp
+  FAIL cn_cilk_for_init1.cpp (internal compiler error)
-  FAIL cn_omp_simd_lastprivate2-1.c (internal compiler error)
-  FAIL cn_omp_simd_lastprivate2-2.c (internal compiler error)
-  FAIL cn_omp_simd_lastprivate3.c
-  FAIL cn_omp_simd_lastprivate4-1.c (internal compiler error)
+  PASS cn_omp_simd_lastprivate2-1.c
+  PASS cn_omp_simd_lastprivate2-2.c
+  PASS cn_omp_simd_lastprivate3.c
+  PASS cn_omp_simd_lastprivate4-1.c
-  FAIL cn_omp_simd_linear2.c
-  FAIL cn_omp_simd_linear3-1.c (internal compiler error)
+  PASS cn_omp_simd_linear2.c
+  PASS cn_omp_simd_linear3-1.c
-  FAIL cn_omp_simd_linear6-2.c
+  PASS cn_omp_simd_linear6-2.c
-  FAIL cn_omp_simd_private3.c
-  FAIL cn_omp_simd_private4-1.c (internal compiler error)
+  PASS cn_omp_simd_private3.c
+  PASS cn_omp_simd_private4-1.c
-  FAIL cn_omp_simd_reduction1.c (internal compiler error)
-  FAIL cn_omp_simd_reduction2.c (internal compiler error)
-  FAIL cn_omp_simd_reduction3.c
-  FAIL cn_omp_simd_reduction4-1.c
-  FAIL cn_omp_simd_reduction4-2.c (internal compiler error)
-  FAIL cn_omp_simd_reduction5.c (internal compiler error)
+  PASS cn_omp_simd_reduction1.c
+  PASS cn_omp_simd_reduction2.c
+  PASS cn_omp_simd_reduction3.c
+  PASS cn_omp_simd_reduction4-1.c
+  PASS cn_omp_simd_reduction4-2.c
+  FAIL cn_omp_simd_reduction5.c
-  FAIL cn_simd_linear1.c
-  FAIL cn_simd_linear2.c
-  FAIL cn_simd_linear3.c
+  PASS cn_simd_linear1.c
+  PASS cn_simd_linear2.c
+  PASS cn_simd_linear3.c
-  FAIL cn_omp_simd_lastprivate1.cpp
-  FAIL cn_omp_simd_lastprivate2.cpp
+  PASS cn_omp_simd_lastprivate1.cpp
+  PASS cn_omp_simd_lastprivate2.cpp
-  FAIL cn_omp_simd_linear2-1.cpp
+  PASS cn_omp_simd_linear2-1.cpp
-  FAIL cn_omp_simd_private1.cpp
-  FAIL cn_omp_simd_private2.cpp
+  PASS cn_omp_simd_private1.cpp
+  PASS cn_omp_simd_private2.cpp
-  FAIL ep_om

Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Andre Vehreschild
Janus,

those fallthroughs are fully intentional and each and everyone is documented.
When you can tell me a way to remove those false positive warnings I am happy to
do so, when it comes at no extra costs at runtime.

- Andre

On Wed, 30 Nov 2016 14:48:38 +0100
Janus Weil  wrote:

> Hi Andre,
> 
> after your commit I see several warnings when compiling libgfortran
> (see below). Could you please fix those (if possible)?
> 
> Thanks,
> Janus
> 
> 
> 
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> ‘_gfortran_caf_is_present’:
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
> this statement may fall through [-Wimplicit-fallthrough=]
>  if (riter->next == NULL)
> ^
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
>case CAF_ARR_REF_VECTOR:
>^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
> this statement may fall through [-Wimplicit-fallthrough=]
>  if (riter->next == NULL)
> ^
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
>case CAF_ARR_REF_VECTOR:
>^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
> this statement may fall through [-Wimplicit-fallthrough=]
>  if (riter->next == NULL)
> ^
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
>case CAF_ARR_REF_VECTOR:
>^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
> this statement may fall through [-Wimplicit-fallthrough=]
>  if (riter->next == NULL)
> ^
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
>case CAF_ARR_REF_VECTOR:
>^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> ‘_gfortran_caf_get_by_ref’:
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
> ‘src_size’ may be used uninitialized in this function
> [-Wmaybe-uninitialized]
>if (size == 0 || src_size == 0)
> ~^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> ‘_gfortran_caf_send_by_ref’:
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
> ‘src_size’ may be used uninitialized in this function
> [-Wmaybe-uninitialized]
>if (size == 0 || src_size == 0)
> ~^~~~
> 
> 
> 
> 
> 2016-11-30 14:30 GMT+01:00 Andre Vehreschild :
> > Hi Paul,
> >
> > thanks for the review. Committed with the changes requested and the one
> > reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with
> > -m32 as r243021.
> >
> > Thanks for the review and tests.
> >
> > Regards,
> > Andre
> >
> > On Wed, 30 Nov 2016 07:49:13 +0100
> > Paul Richard Thomas  wrote:
> >  
> >> Dear Andre,
> >>
> >> This all looks OK to me. The only comment that I have that you might
> >> deal with before committing is that some of the Boolean expressions,
> >> eg:
> >> +  int caf_dereg_mode
> >> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
> >> +  || c->attr.codimension)
> >> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
> >> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
> >> +  : GFC_CAF_COARRAY_DEREGISTER)
> >> +  : GFC_CAF_COARRAY_NOCOARRAY;
> >>
> >> are getting be sufficiently convoluted that a small, appropriately
> >> named, helper function might be clearer. Of course, this is true of
> >> many parts of gfortran but it is not too late to start making the code
> >> a bit clearer.
> >>
> >> You can commit to the present trunk as far as I am concerned. I know
> >> that the caf enthusiasts will test it to bits before release!
> >>
> >> Regards
> >>
> >> Paul
> >>
> >>
> >> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:  
> >> > PING!
> >> >
> >> > I know it's a lengthy patch, but comments would be nice anyway.
> >> >
> >> > - Andre
> >> >
> >> > On Tue, 22 Nov 2016 20:46:50 +0100
> >> > Andre Vehreschild  wrote:
> >> >  
> >> >> Hi all,
> >> >>
> >> >> attached patch addresses the need of extending the API of the caf-libs
> >> >> to enable allocatable components asynchronous allocation. Allocatable
> >> >> components in derived type coarrays are different from regular coarrays
> >> >> or coarrayed components. The latter have to be allocated on all images
> >> >> or on none. Furthermore is the allocation a point of synchronisation.
> >> >>
> >> >> For allocatable components the F2008 allows to have some allocated on
> >> >> some images and on others not. Furthermore is the registration with the
> >> >> caf-lib, that an allocatable component is present in a derived type
> >> >> coarray no longer a synchronisation point. To implement these features
> >> >> two new types of coarray registration have been introduced. The first
> >> >> one just registering the component with the caf-lib and the latter
> >> >> doing the allocate. Furthermore has the caf-API been extended to
> >> >> provide a query function

Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Andre Vehreschild
Hi all,

on IRC:
15:28:22 dominiq:  vehre: add /* FALLTHROUGH */

Done and committed as obvious as r243023.

- Andre

On Wed, 30 Nov 2016 15:22:46 +0100
Andre Vehreschild  wrote:

> Janus,
> 
> those fallthroughs are fully intentional and each and everyone is documented.
> When you can tell me a way to remove those false positive warnings I am happy
> to do so, when it comes at no extra costs at runtime.
> 
> - Andre
> 
> On Wed, 30 Nov 2016 14:48:38 +0100
> Janus Weil  wrote:
> 
> > Hi Andre,
> > 
> > after your commit I see several warnings when compiling libgfortran
> > (see below). Could you please fix those (if possible)?
> > 
> > Thanks,
> > Janus
> > 
> > 
> > 
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> > ‘_gfortran_caf_is_present’:
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
> > this statement may fall through [-Wimplicit-fallthrough=]
> >  if (riter->next == NULL)
> > ^
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
> >case CAF_ARR_REF_VECTOR:
> >^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
> > this statement may fall through [-Wimplicit-fallthrough=]
> >  if (riter->next == NULL)
> > ^
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
> >case CAF_ARR_REF_VECTOR:
> >^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
> > this statement may fall through [-Wimplicit-fallthrough=]
> >  if (riter->next == NULL)
> > ^
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
> >case CAF_ARR_REF_VECTOR:
> >^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
> > this statement may fall through [-Wimplicit-fallthrough=]
> >  if (riter->next == NULL)
> > ^
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
> >case CAF_ARR_REF_VECTOR:
> >^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> > ‘_gfortran_caf_get_by_ref’:
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
> > ‘src_size’ may be used uninitialized in this function
> > [-Wmaybe-uninitialized]
> >if (size == 0 || src_size == 0)
> > ~^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> > ‘_gfortran_caf_send_by_ref’:
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
> > ‘src_size’ may be used uninitialized in this function
> > [-Wmaybe-uninitialized]
> >if (size == 0 || src_size == 0)
> > ~^~~~
> > 
> > 
> > 
> > 
> > 2016-11-30 14:30 GMT+01:00 Andre Vehreschild :  
> > > Hi Paul,
> > >
> > > thanks for the review. Committed with the changes requested and the one
> > > reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with
> > > -m32 as r243021.
> > >
> > > Thanks for the review and tests.
> > >
> > > Regards,
> > > Andre
> > >
> > > On Wed, 30 Nov 2016 07:49:13 +0100
> > > Paul Richard Thomas  wrote:
> > >
> > >> Dear Andre,
> > >>
> > >> This all looks OK to me. The only comment that I have that you might
> > >> deal with before committing is that some of the Boolean expressions,
> > >> eg:
> > >> +  int caf_dereg_mode
> > >> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
> > >> +  || c->attr.codimension)
> > >> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
> > >> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
> > >> +  : GFC_CAF_COARRAY_DEREGISTER)
> > >> +  : GFC_CAF_COARRAY_NOCOARRAY;
> > >>
> > >> are getting be sufficiently convoluted that a small, appropriately
> > >> named, helper function might be clearer. Of course, this is true of
> > >> many parts of gfortran but it is not too late to start making the code
> > >> a bit clearer.
> > >>
> > >> You can commit to the present trunk as far as I am concerned. I know
> > >> that the caf enthusiasts will test it to bits before release!
> > >>
> > >> Regards
> > >>
> > >> Paul
> > >>
> > >>
> > >> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:
> > >> > PING!
> > >> >
> > >> > I know it's a lengthy patch, but comments would be nice anyway.
> > >> >
> > >> > - Andre
> > >> >
> > >> > On Tue, 22 Nov 2016 20:46:50 +0100
> > >> > Andre Vehreschild  wrote:
> > >> >
> > >> >> Hi all,
> > >> >>
> > >> >> attached patch addresses the need of extending the API of the caf-libs
> > >> >> to enable allocatable components asynchronous allocation. Allocatable
> > >> >> components in derived type coarrays are different from regular
> > >> >> coarrays or coarrayed components. The latter have to be allocated on
> > >> >> all images or on none. Furthermore is the allocation a point of
> > >> >> synchronisation.
> > >> >>
> > >> >> For allocatable components the F2008 allows to have some allocated on
> > >> >> some images and on others n

[Patch doc] Document _Float16 availability on ARM/AArch64

2016-11-30 Thread James Greenhalgh

Hi,

As subject - update extend.texi to mention availability of _Float16 types
on ARM and AArch64.

OK?

Thanks,
James

---
2016-11-30  James Greenhalgh  

* doc/extend.texi (Floating Types): Document availability of
_Float16 on ARM/AArch64.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index d873403..7d3d17a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -997,8 +997,10 @@ IEEE binary128 format.  The @code{_Float64x} type is supported on all
 systems where @code{__float128} is supported.  The @code{_Float32}
 type is supported on all systems supporting IEEE binary32; the
 @code{_Float64} and @code{Float32x} types are supported on all systems
-supporting IEEE binary64.  GCC does not currently support
-@code{_Float16} or @code{_Float128x} on any systems.
+supporting IEEE binary64.  The @code{_Float16} type is supported on AArch64
+systems by default, and on ARM systems when the IEEE format for 16-bit
+floating point types is selected with @option{-mfp16-format=ieee}.
+GCC does not currently support @code{_Float128x} on any systems.
 
 On the PowerPC, @code{__ibm128} provides access to the IBM extended
 double format, and it is intended to be used by the library functions


[PATCH,libstdc++] xfail operator new on AIX

2016-11-30 Thread David Edelsohn
AIX shared libraries do not allow overriding and interposition of
symbols by default, which is used to override operators, such as
operator new (and operator delete) in C++.  Four libstdc++ testcases
rely on this behavior and fail on AIX.  Jonathan and I have decided to
XFAIL the testcases to reduce the noise in the testsuite output.

With the recent libstdc++ testsuite changes, there now are only four
(4) failures in the libstdc++ testsuite on AIX -- all due to one real
bug in the C++ front-end for targets with stabs debugging.

Thanks, David

* testsuite/18_support/50594.cc: XFAIL on AIX.
* testsuite/ext/mt_allocator/check_new.cc: Same.
* testsuite/ext/pool_allocator/check_new.cc: Same.
* testsuite/27_io/ios_base/storage/11584.cc: Same.

Index: 18_support/50594.cc
===
--- 18_support/50594.cc (revision 243019)
+++ 18_support/50594.cc (working copy)
@@ -1,5 +1,6 @@
 // { dg-options "-fwhole-program" }
 // { dg-additional-options "-static-libstdc++" { target *-*-mingw* } }
+// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }

 // Copyright (C) 2011-2016 Free Software Foundation, Inc.
 //
Index: ext/mt_allocator/check_new.cc
===
--- ext/mt_allocator/check_new.cc   (revision 243019)
+++ ext/mt_allocator/check_new.cc   (working copy)
@@ -1,3 +1,5 @@
+// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+
 // 2001-11-25  Phil Edwards  
 //
 // Copyright (C) 2001-2016 Free Software Foundation, Inc.
Index: ext/pool_allocator/check_new.cc
===
--- ext/pool_allocator/check_new.cc (revision 243019)
+++ ext/pool_allocator/check_new.cc (working copy)
@@ -1,3 +1,5 @@
+// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+
 // 2001-11-25  Phil Edwards  
 //
 // Copyright (C) 2001-2016 Free Software Foundation, Inc.
Index: 27_io/ios_base/storage/11584.cc
===
--- 27_io/ios_base/storage/11584.cc (revision 243019)
+++ 27_io/ios_base/storage/11584.cc (working copy)
@@ -1,3 +1,5 @@
+// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+
 // 2004-01-25 jlqu...@gcc.gnu.org

 // Copyright (C) 2004-2016 Free Software Foundation, Inc.


Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Michael Matz
Hi,

On Wed, 30 Nov 2016, Segher Boessenkool wrote:

> > I don't think mode-changing _extracts are valid in this context.  From the 
> > docu:
> > 
> >   `(sign_extract:M LOC SIZE POS)'
> >   ...
> >  The mode M is the same as the mode that would be used for LOC if
> >  it were a register.
> > 
> > Probably it could be made to work just fine, but I'm not sure it'd be 
> > worth much, as then the targets would need to care for mode-changes 
> > occuring not just through subregs as usual, but also through extracts.
> 
> The patch https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02987.html I
> submitted yesterday deals with this same issue, FWIW -- some ports
> apparently already do mode-changing extracts.

Yeah, saw that a bit later.  So, hmmm.  I'm not sure what to make of it, 
if the targets choose to use mode-changing extracts I guess that's fine, 
as they presumably will have written patterns that recognize them.  But I 
don't think we should willy-nilly generate such patterns as we can't know 
if the target deals with them or not.  We could of course always generate 
both variants: (subreg:M1 (extract:M2 (object:M2)) and
(extract:M1 (object:M2)) and see if either matches, but that seems a bit 
too much work.


Ciao,
Michael.


Re: [PATCH] Partial solution to LWG 523

2016-11-30 Thread Jonathan Wakely

On 30/11/16 13:03 +, Jonathan Wakely wrote:

On 26/11/16 16:27 -0800, Tim Shen wrote:

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index 953aa87..2fb70b7 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1000,7 +1000,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 element_type&
 operator*() const noexcept
 {
-   __glibcxx_assert(_M_ptr != nullptr);
+   __glibcxx_assert(_M_get() != nullptr);
return *_M_get();
 }


Oops, thanks, but let's fix this separately (I'll do it now) so the
rest of the patch only touches regex stuff.


I've fixed that with this patch, committed to trunk.

commit 18ee83fae4fef2fc720ef6aef0754377e6fe29e8
Author: Jonathan Wakely 
Date:   Wed Nov 30 13:11:50 2016 +

Fix condition in shared_ptr assertion

2016-11-30  Tim Shen  

	* include/bits/shared_ptr_base.h
	(__shared_ptr_access::operator*()): Fix assertion.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 953aa87..2fb70b7 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1000,7 +1000,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   element_type&
   operator*() const noexcept
   {
-	__glibcxx_assert(_M_ptr != nullptr);
+	__glibcxx_assert(_M_get() != nullptr);
 	return *_M_get();
   }
 


[Patch Doc] Update documentation for __fp16 type

2016-11-30 Thread James Greenhalgh

Hi,

Documentation for __fp16 seems to have drifted out of line with
compiler behaviour over time.

This patch tries to fix that up, documenting AArch64 support and
removing comments on restrictions on using the __fp16 type for arguments
and return types.

OK?

Thanks,
James

---
2016-11-30  James Greenhalgh  

* doc/extend.texi (Half-Precision): Update to document current
compiler behaviour.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7d3d17a..cf16ec3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1012,11 +1012,12 @@ that handle conversions if/when long double is changed to be IEEE
 @cindex half-precision floating point
 @cindex @code{__fp16} data type
 
-On ARM targets, GCC supports half-precision (16-bit) floating point via
-the @code{__fp16} type.  You must enable this type explicitly
-with the @option{-mfp16-format} command-line option in order to use it.
+On ARM and AArch64 targets, GCC supports half-precision (16-bit) floating
+point via the @code{__fp16} type defined in the ARM C Language Extensions.
+On ARM systems, you must enable this type explicitly with the
+@option{-mfp16-format} command-line option in order to use it.
 
-ARM supports two incompatible representations for half-precision
+ARM targets support two incompatible representations for half-precision
 floating-point values.  You must choose one of the representations and
 use it consistently in your program.
 
@@ -1031,22 +1032,20 @@ format, but does not support infinities or NaNs.  Instead, the range
 of exponents is extended, so that this format can represent normalized
 values in the range of @math{2^{-14}} to 131008.
 
-The @code{__fp16} type is a storage format only.  For purposes
-of arithmetic and other operations, @code{__fp16} values in C or C++
-expressions are automatically promoted to @code{float}.  In addition,
-you cannot declare a function with a return value or parameters
-of type @code{__fp16}.
+The GCC port for AArch64 only supports the IEEE 754-2008 format, and does
+not require use of the @option{-mfp16-format} command-line option.
 
-Note that conversions from @code{double} to @code{__fp16}
-involve an intermediate conversion to @code{float}.  Because
-of rounding, this can sometimes produce a different result than a
-direct conversion.
+The @code{__fp16} type may only be used as an argument to intrinsics defined
+in @code{}, or as a storage format.  For purposes of
+arithmetic and other operations, @code{__fp16} values in C or C++
+expressions are automatically promoted to @code{float}.
 
-ARM provides hardware support for conversions between
+The ARM target provides hardware support for conversions between
 @code{__fp16} and @code{float} values
-as an extension to VFP and NEON (Advanced SIMD).  GCC generates
-code using these hardware instructions if you compile with
-options to select an FPU that provides them;
+as an extension to VFP and NEON (Advanced SIMD), and from ARMv8 provides
+hardware support for conversions between @code{__fp16} and @code{double}
+values.  GCC generates code using these hardware instructions if you
+compile with options to select an FPU that provides them;
 for example, @option{-mfpu=neon-fp16 -mfloat-abi=softfp},
 in addition to the @option{-mfp16-format} option to select
 a half-precision format.
@@ -1054,8 +1053,12 @@ a half-precision format.
 Language-level support for the @code{__fp16} data type is
 independent of whether GCC generates code using hardware floating-point
 instructions.  In cases where hardware support is not specified, GCC
-implements conversions between @code{__fp16} and @code{float} values
-as library calls.
+implements conversions between @code{__fp16} and other types as library
+calls.
+
+It is recommended that code which is intended to be portable use the
+@code{_Float16} type defined by ISO/IEC TS18661:3-2005
+(@xref{Floating Types}).
 
 @node Decimal Float
 @section Decimal Floating Types


Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Janus Weil
Hi,

> on IRC:
> 15:28:22 dominiq:  vehre: add /* FALLTHROUGH */
>
> Done and committed as obvious as r243023.

thanks. However, I still see these two:


>> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
>> > ‘_gfortran_caf_get_by_ref’:
>> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
>> > ‘src_size’ may be used uninitialized in this function
>> > [-Wmaybe-uninitialized]
>> >if (size == 0 || src_size == 0)
>> > ~^~~~
>> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
>> > ‘_gfortran_caf_send_by_ref’:
>> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
>> > ‘src_size’ may be used uninitialized in this function
>> > [-Wmaybe-uninitialized]
>> >if (size == 0 || src_size == 0)
>> > ~^~~~

Can you please fix them as well?

Thanks,
Janus




>> > 2016-11-30 14:30 GMT+01:00 Andre Vehreschild :
>> > > Hi Paul,
>> > >
>> > > thanks for the review. Committed with the changes requested and the one
>> > > reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with
>> > > -m32 as r243021.
>> > >
>> > > Thanks for the review and tests.
>> > >
>> > > Regards,
>> > > Andre
>> > >
>> > > On Wed, 30 Nov 2016 07:49:13 +0100
>> > > Paul Richard Thomas  wrote:
>> > >
>> > >> Dear Andre,
>> > >>
>> > >> This all looks OK to me. The only comment that I have that you might
>> > >> deal with before committing is that some of the Boolean expressions,
>> > >> eg:
>> > >> +  int caf_dereg_mode
>> > >> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
>> > >> +  || c->attr.codimension)
>> > >> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
>> > >> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
>> > >> +  : GFC_CAF_COARRAY_DEREGISTER)
>> > >> +  : GFC_CAF_COARRAY_NOCOARRAY;
>> > >>
>> > >> are getting be sufficiently convoluted that a small, appropriately
>> > >> named, helper function might be clearer. Of course, this is true of
>> > >> many parts of gfortran but it is not too late to start making the code
>> > >> a bit clearer.
>> > >>
>> > >> You can commit to the present trunk as far as I am concerned. I know
>> > >> that the caf enthusiasts will test it to bits before release!
>> > >>
>> > >> Regards
>> > >>
>> > >> Paul
>> > >>
>> > >>
>> > >> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:
>> > >> > PING!
>> > >> >
>> > >> > I know it's a lengthy patch, but comments would be nice anyway.
>> > >> >
>> > >> > - Andre
>> > >> >
>> > >> > On Tue, 22 Nov 2016 20:46:50 +0100
>> > >> > Andre Vehreschild  wrote:
>> > >> >
>> > >> >> Hi all,
>> > >> >>
>> > >> >> attached patch addresses the need of extending the API of the 
>> > >> >> caf-libs
>> > >> >> to enable allocatable components asynchronous allocation. Allocatable
>> > >> >> components in derived type coarrays are different from regular
>> > >> >> coarrays or coarrayed components. The latter have to be allocated on
>> > >> >> all images or on none. Furthermore is the allocation a point of
>> > >> >> synchronisation.
>> > >> >>
>> > >> >> For allocatable components the F2008 allows to have some allocated on
>> > >> >> some images and on others not. Furthermore is the registration with
>> > >> >> the caf-lib, that an allocatable component is present in a derived
>> > >> >> type coarray no longer a synchronisation point. To implement these
>> > >> >> features two new types of coarray registration have been introduced.
>> > >> >> The first one just registering the component with the caf-lib and the
>> > >> >> latter doing the allocate. Furthermore has the caf-API been extended
>> > >> >> to provide a query function to learn about the allocation status of a
>> > >> >> component on a remote image.
>> > >> >>
>> > >> >> Sorry, that the patch is rather lengthy. Most of this is due to the
>> > >> >> structure_alloc_comps' signature change. The routine and its wrappers
>> > >> >> are used rather often which needed the appropriate changes.
>> > >> >>
>> > >> >> I know I left two or three TODOs in the patch to remind me of things 
>> > >> >> I
>> > >> >> have to investigate further. For the current state these TODOs are no
>> > >> >> reason to hold back the patch. The third party library opencoarrays
>> > >> >> implements the mpi-part of the caf-model and will change in sync. It
>> > >> >> would of course be advantageous to just have to say: With gcc-7
>> > >> >> gfortran implements allocatable components in derived coarrays nearly
>> > >> >> completely.
>> > >> >>
>> > >> >> I know we are in stage 3. But the patch bootstraps and regtests ok on
>> > >> >> x86_64-linux/F23. So, is it ok for trunk or shall it go to 7.2?
>> > >> >>
>> > >> >> Regards,
>> > >> >>   Andre
>> > >> >
>> > >> >
>> > >> > --
>> > >> > Andre Vehreschild * Email: vehre ad gmx dot de
>> > >>
>> > >>
>> > >>
>> > >
>> > >
>> > > --
>> > > Andre Vehreschild * Email: vehre a

Re: [v3 PATCH] Implement LWG 2534, Constrain rvalue stream operators.

2016-11-30 Thread Ville Voutilainen
On 30 November 2016 at 16:59, David Edelsohn  wrote:
> I believe that this broke g++.old-deja/g++.law/ctors10.C
>
> invalid initialization of reference of type 'Klasse::Err&' from
> expression of type 'std::basic_ostream::__ostream_type {aka
> std::basic_ostream}


I'll take a look.


Re: [v3 PATCH] Implement LWG 2534, Constrain rvalue stream operators.

2016-11-30 Thread David Edelsohn
I believe that this broke g++.old-deja/g++.law/ctors10.C

invalid initialization of reference of type 'Klasse::Err&' from
expression of type 'std::basic_ostream::__ostream_type {aka
std::basic_ostream}

- David


Re: [v3 PATCH] LWG 2766, LWG 2749

2016-11-30 Thread Jonathan Wakely

On 26/11/16 14:47 +0200, Ville Voutilainen wrote:

Updated patches attached, and tested with the full testsuite on Linux-PPC64.


Both patches are OK for trunk with the minor tweaks noted below.
Thanks.



diff --git a/libstdc++-v3/include/bits/stl_pair.h 
b/libstdc++-v3/include/bits/stl_pair.h
index ef52538..981dbeb 100644
--- a/libstdc++-v3/include/bits/stl_pair.h
+++ b/libstdc++-v3/include/bits/stl_pair.h
@@ -478,6 +478,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
swap(pair<_T1, _T2>& __x, pair<_T1, _T2>& __y)
noexcept(noexcept(__x.swap(__y)))
{ __x.swap(__y); }
+
+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if,
+  __is_swappable<_T2>>::value>::type
+swap(pair<_T1, _T2>&, pair<_T1, _T2>&) = delete;
+#endif
#endif // __cplusplus >= 201103L

  /**
diff --git a/libstdc++-v3/include/bits/unique_ptr.h 
b/libstdc++-v3/include/bits/unique_ptr.h
index f9ec60f..21b0bac 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -649,6 +649,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
swap(unique_ptr<_Tp, _Dp>& __x,
 unique_ptr<_Tp, _Dp>& __y) noexcept
{ __x.swap(__y); }


Insert a blank line here please.


+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if::value>::type
+swap(unique_ptr<_Tp, _Dp>&,
+unique_ptr<_Tp, _Dp>&) = delete;
+#endif

  template
diff --git a/libstdc++-v3/include/std/array b/libstdc++-v3/include/std/array
index 3ab0355..86100b5 100644
--- a/libstdc++-v3/include/std/array
+++ b/libstdc++-v3/include/std/array
@@ -287,6 +287,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
swap(array<_Tp, _Nm>& __one, array<_Tp, _Nm>& __two)
noexcept(noexcept(__one.swap(__two)))
{ __one.swap(__two); }


Insert a blank line here please.


+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if<
+  !_GLIBCXX_STD_C::__array_traits<_Tp, _Nm>::_Is_swappable::value>::type
+swap(array<_Tp, _Nm>&, array<_Tp, _Nm>&) = delete;
+#endif

  template
constexpr _Tp&




diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index 63cacd4..8952750 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -1442,17 +1442,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
forward_as_tuple(_Elements&&... __args) noexcept
{ return tuple<_Elements&&...>(std::forward<_Elements>(__args)...); }

-  template
-struct __is_tuple_like_impl> : true_type
-{ };
-
-  // Internal type trait that allows us to sfinae-protect tuple_cat.
-  template
-struct __is_tuple_like
-: public __is_tuple_like_impl::type>::type>::type
-{ };
-
  template
struct __make_tuple_impl;

@@ -1596,6 +1585,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
swap(tuple<_Elements...>& __x, tuple<_Elements...>& __y)
noexcept(noexcept(__x.swap(__y)))
{ __x.swap(__y); }


Insert a blank line here please.


+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if...>::value>::type
+swap(tuple<_Elements...>&, tuple<_Elements...>&) = delete;
+#endif

  // A class (and instance) which can be used in 'tie' when an element
  // of a tuple is not required
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index e5f2bba..49f76cc 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2593,9 +2593,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  template 
struct __is_nothrow_swappable;

+  template
+class tuple;
+
+  template
+struct __is_tuple_like_impl : false_type
+{ };
+
+  template
+struct __is_tuple_like_impl> : true_type
+{ };
+
+  // Internal type trait that allows us to sfinae-protect tuple_cat.
+  template
+struct __is_tuple_like
+: public __is_tuple_like_impl::type>::type>::type


We can lose the std:: quals here.


+{ };
+
  template
inline
-typename enable_if<__and_,
+typename enable_if<__and_<__not_<__is_tuple_like<_Tp>>,
+ is_move_constructible<_Tp>,
  is_move_assignable<_Tp>>::value>::type
swap(_Tp&, _Tp&)
noexcept(__and_,


Re: Ping: Re: [patch, avr] Add flash size to device info and make wrap around default

2016-11-30 Thread Georg-Johann Lay

On 30.11.2016 07:27, Pitchumani Sivanupandi wrote:

On Tuesday 29 November 2016 10:06 PM, Denis Chertykov wrote:

2016-11-28 10:17 GMT+03:00 Pitchumani Sivanupandi
:

On Saturday 26 November 2016 12:11 AM, Denis Chertykov wrote:

I'm sorry for delay.

I have a problem with the patch:
(Stripping trailing CRs from patch; use --binary to disable.)
patching file avr-arch.h
(Stripping trailing CRs from patch; use --binary to disable.)
patching file avr-devices.c
(Stripping trailing CRs from patch; use --binary to disable.)
patching file avr-mcus.def
Hunk #1 FAILED at 62.
1 out of 1 hunk FAILED -- saving rejects to file avr-mcus.def.rej
(Stripping trailing CRs from patch; use --binary to disable.)
patching file gen-avr-mmcu-specs.c
Hunk #1 succeeded at 215 (offset 5 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file specs.h
Hunk #1 succeeded at 58 (offset 1 line).
Hunk #2 succeeded at 66 (offset 1 line).


There are changes in avr-mcus.def after this patch is submitted.
Now, I have incorporated the changes and attached the resolved patch.

Regards,
Pitchumani

gcc/ChangeLog

2016-11-09  Pitchumani Sivanupandi 

 * config/avr/avr-arch.h (avr_mcu_t): Add flash_size member.
 * config/avr/avr-devices.c(avr_mcu_types): Add flash size info.
 * config/avr/avr-mcu.def: Likewise.
 * config/avr/gen-avr-mmcu-specs.c (print_mcu): Remove hard-coded
prefix
 check to find wrap-around value, instead use MCU flash size. For 8k
flash
 devices, update link_pmem_wrap spec string to add
--pmem-wrap-around=8k.
 * config/avr/specs.h: Remove link_pmem_wrap from LINK_RELAX_SPEC
and
 add to linker specs (LINK_SPEC) directly.

Committed.

It looks like only avr-mcus.def and ChangeLog are committed.
Without the other changes trunk build is broken.

Regards,
Pitchumani


Hi, I allowed me to commit the missing files.

http://gcc.gnu.org/r243033

Johann



Re: [PATCH GCC]Simplify (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2)

2016-11-30 Thread Richard Biener
On Fri, Nov 18, 2016 at 5:53 PM, Bin Cheng  wrote:
> Hi,
> This is a rework of https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02007.html.
> Though review comments suggested it could be merged with last kind 
> simplification
> of fold_cond_expr_with_comparison, it's not really applicable.  As a matter 
> of fact,
> the suggestion stands for patch 
> @https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02005.html.
> I had previous patch 
> (https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01898.html)
> moving fold_cond_expr_with_comparison to match.pd pattern and incorporated
> that patch into it.  For this one, the rework is trivial, just renames 
> several variable
> tags as suggested.  Bootstrap and test on x86_64 and AArch64, is it OK?

+ A) Operand x is a unsigned to signed type conversion and c1 is
+   integer zero.  In this case,
+ (signed type)x  < 0  <=>  x  > MAX_VAL(signed type)
+ (signed type)x >= 0  <=>  x <= MAX_VAL(signed type)

for (singed type)x < 0 -> x > signed-type-max we probably do a reverse
"canonicalization" transform?  Yeah,

/* Non-equality compare simplifications from fold_binary  */
(for cmp (lt gt le ge)
...
 (if (wi::eq_p (@1, signed_max)
  && TYPE_UNSIGNED (arg1_type)
  /* We will flip the signedness of the comparison operator
 associated with the mode of @1, so the sign bit is
 specified by this mode.  Check that @1 is the signed
 max associated with this sign bit.  */
  && prec == GET_MODE_PRECISION (TYPE_MODE (arg1_type))
  /* signed_type does not work on pointer types.  */
  && INTEGRAL_TYPE_P (arg1_type))
  /* The following case also applies to X < signed_max+1
 and X >= signed_max+1 because previous transformations.  */
  (if (cmp == LE_EXPR || cmp == GT_EXPR)
   (with { tree st = signed_type_for (arg1_type); }
(if (cmp == LE_EXPR)
 (ge (convert:st @0) { build_zero_cst (st); })
 (lt (convert:st @0) { build_zero_cst (st); }))

+   if (cmp_code == GE_EXPR)
+ cmp_code = LE_EXPR;
+   c1 = wide_int_to_tree (op_type, wi::max_value (to_type));
+ }
...
+   if (op == PLUS_EXPR)
+ real_c1 = wide_int_to_tree (op_type,
+ wi::sub (c3, c2, sgn, &overflow));
+   else
+ real_c1 = wide_int_to_tree (op_type,
+ wi::add (c3, c2, sgn, &overflow));

can you avoid the tree building here and just continue using wide-ints please?
Simply do the wide_int_to_tree in the result patterns.

Otherwise looks ok to me.

Thanks,
Richard.


> Thanks,
> bin
>
> 2016-11-17  Bin Cheng  
>
> * match.pd: Add new pattern:
> (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2).
>
> gcc/testsuite/ChangeLog
> 2016-11-17  Bin Cheng  
>
> * gcc.dg/fold-bopcond-1.c: New test.
> * gcc.dg/fold-bopcond-2.c: New test.


Re: [PATCHv2 4/7, GCC, ARM, V8M] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers

2016-11-30 Thread Andre Vieira (lists)
On 23/11/16 11:52, Andre Vieira (lists) wrote:
> Hi,
> 
> After some extra testing I realized there was an issue with the way we
> were clearing registers when returning from a cmse_nonsecure_entry
> function for ARMv8-M.Baseline.  This patch fixes that and changes the
> testcase to catch the issue.
> 
> The problem was I was always using LR to clear the registers, however,
> due to the way the Thumb-1 backend works, we can't guarantee LR will
> contain the address to which we will be returning at the time of
> clearing. Instead we use r0 to clear r1-r3 and IP. If the function does
> not use r0 to return a value, we clear r0 with 0 before using it to
> clear everything else. As for LR, we move the value of the register used
> to return into it prior to returning.
> 
> This satisfies the requirements of not leaking secure information since
> all registers hold either:
> - values to return
> - 0
> - return address
> 
> No changes to ChangeLog.
> 
> Cheers,
> Andre
> 
Hi,

So I seemed to have forgotten to address two of your comments earlier,
done in this version.

To reiterate:
After some extra testing I realized there was an issue with the way we
were clearing registers when returning from a cmse_nonsecure_entry
function for ARMv8-M Baseline.  This patch fixes that and changes the
testcase to catch the issue.

The problem was I was always using LR to clear the registers, however,
due to the way the Thumb-1 backend works, we can't guarantee LR will
contain the address to which we will be returning at the time of
clearing. Instead we use r0 to clear r1-r3 and IP. If the function does
not use r0 to return a value, we clear r0 with 0 before using it to
clear everything else. As for LR, we move the value of the register used
to return into it prior to returning.

This satisfies the requirements of not leaking secure information since
all registers hold either:
- values to return
- 0
- return address

*** gcc/ChangeLog ***
2016-11-xx  Andre Vieira
 Thomas Preud'homme  

 * config/arm/arm.c (output_return_instruction): Clear
 registers.
 (thumb2_expand_return): Likewise.
 (thumb1_expand_epilogue): Likewise.
 (thumb_exit): Likewise.
 (arm_expand_epilogue): Likewise.
 (cmse_nonsecure_entry_clear_before_return): New.
 (comp_not_to_clear_mask_str_un): New.
 (compute_not_to_clear_mask): New.
 * config/arm/thumb1.md (*epilogue_insns): Change length attribute.
 * config/arm/thumb2.md (*thumb2_cmse_entry_return): Duplicate
 thumb2_return pattern for cmse_nonsecure_entry functions.

*** gcc/testsuite/ChangeLog ***
2016-11-xx  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse.exp: Test different multilibs separate.
 * gcc.target/arm/cmse/struct-1.c: New.
 * gcc.target/arm/cmse/bitfield-1.c: New.
 * gcc.target/arm/cmse/bitfield-2.c: New.
 * gcc.target/arm/cmse/bitfield-3.c: New.
 * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers are
cleared.
 * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.

Cheers,
Andre
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
db7e0c842fff1b0aee5059e3ea4813059caa8d03..6a9db85aa879e1c5547908dcc9f036ee37de489e
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16297,6 +16297,279 @@ note_invalid_constants (rtx_insn *insn, HOST_WIDE_INT 
address, int do_pushes)
   return;
 }
 
+/* This function computes the clear mask and PADDING_BITS_TO_CLEAR for structs
+   and unions in the context of ARMv8-M Security Extensions.  It is used as a
+   helper function for both 'cmse_nonsecure_call' and 'cmse_nonsecure_entry'
+   functions.  The PADDING_BITS_TO_CLEAR pointer can be the base to either one
+   or four masks, depending on whether it is being computed for a
+   'cmse_nonsecure_entry' return value or a 'cmse_nonsecure_call' argument
+   respectively.  The tree for the type of the argument or a field within an
+   argument is passed in ARG_TYPE, the current register this argument or field
+   starts in is kept in the pointer REGNO and updated accordingly, the bit this
+   argument or field starts at is passed in STARTING_BIT and the last used bit
+   is kept in LAST_USED_BIT which is also updated accordingly.  */
+
+static unsigned HOST_WIDE_INT
+comp_not_to_clear_mask_str_un (tree arg_type, int * regno,
+  uint32_t * padding_bits_to_clear,
+  unsigned starting_bit, int * last_used_bit)
+
+{
+  unsigned HOST_WIDE_INT not_to_clear_reg_mask = 0;
+
+  if (TREE_CODE (arg_type) == RECORD_TYPE)
+{
+  unsigned current_bit = starting_bit;
+  tree field;
+ 

[PATCH,rs6000] Correct mode of operand 2 in vector extract half-word and word instruction patterns

2016-11-30 Thread Kelvin Nilsen

This patch corrects an error in a patch committed on 2016-10-18 to add
built-in function support for Power9 string operations.  In that
original patch, the mode for operand 2 of the newly added vector
extract half-word and full-word instruction patterns was described as
V16QI, even though those instruction patterns were conceptually
operating on V8HI and V4SI operands respectively.

This patch changes the modes of the operands for these instruction
patterns to better represent the intended types.  This patch improves
readability and maintainability of code.  It does not affect
correctness of generated code, since the existing implementation
implicitly coerces the operand types to the declared type.

The patch has been bootstrapped and tested on powerpc64le-unknown-linux
without regressions.

Is this ok for the trunk?

gcc/ChangeLog:

2016-11-30  Kelvin Nilsen  

PR target/78577
* config/rs6000/vsx.md (vextuhlx): Revise mode of operand 2.
(vextuhrx): Likewise.
(vextuwlx): Likewise.
(vextuwrx): Likewise.

Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 242948)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -3648,7 +3648,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
 [(match_operand:SI 1 "register_operand" "r")
- (match_operand:V16QI 2 "altivec_register_operand" "v")]
+ (match_operand:V8HI 2 "altivec_register_operand" "v")]
 UNSPEC_VEXTUHLX))]
   "TARGET_P9_VECTOR"
   "vextuhlx %0,%1,%2"
@@ -3659,7 +3659,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
 [(match_operand:SI 1 "register_operand" "r")
- (match_operand:V16QI 2 "altivec_register_operand" "v")]
+ (match_operand:V8HI 2 "altivec_register_operand" "v")]
 UNSPEC_VEXTUHRX))]
   "TARGET_P9_VECTOR"
   "vextuhrx %0,%1,%2"
@@ -3670,7 +3670,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
 [(match_operand:SI 1 "register_operand" "r")
- (match_operand:V16QI 2 "altivec_register_operand" "v")]
+ (match_operand:V4SI 2 "altivec_register_operand" "v")]
 UNSPEC_VEXTUWLX))]
   "TARGET_P9_VECTOR"
   "vextuwlx %0,%1,%2"
@@ -3681,7 +3681,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
 [(match_operand:SI 1 "register_operand" "r")
- (match_operand:V16QI 2 "altivec_register_operand" "v")]
+ (match_operand:V4SI 2 "altivec_register_operand" "v")]
 UNSPEC_VEXTUWRX))]
   "TARGET_P9_VECTOR"
   "vextuwrx %0,%1,%2"


-- 
Kelvin Nilsen, Ph.D.  kdnil...@linux.vnet.ibm.com
home office: 801-756-4821, cell: 520-991-6727
IBM Linux Technology Center - PPC Toolchain



[v3 PATCH] Fix testsuite failures caused by the patch implementing LWG 2534.

2016-11-30 Thread Ville Voutilainen
2016-11-30  Ville Voutilainen  

Fix testsuite failures caused by the patch implementing LWG 2534.
* include/std/istream (__is_convertible_to_basic_istream):
Change the return types of __check, introduce stream_type.
(operator>>(_Istream&&, _Tp&&)):
Use __is_convertible_to_basic_istream::stream_type as the return type.
* include/std/ostream (__is_convertible_to_basic_ostream):
Change the return types of __check, introduce stream_type.
(operator>>(_Ostream&&, _Tp&&)):
Use __is_convertible_to_basic_ostream::stream_type as the return type.
diff --git a/libstdc++-v3/include/std/istream b/libstdc++-v3/include/std/istream
index 4f0e940..81df402 100644
--- a/libstdc++-v3/include/std/istream
+++ b/libstdc++-v3/include/std/istream
@@ -913,11 +913,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __is_convertible_to_basic_istream
 {
   template
-  static true_type __check(basic_istream<_Ch, _Up>*);
+  static basic_istream<_Ch, _Up>& __check(basic_istream<_Ch, _Up>*);
 
-  static false_type __check(void*);
+  static void __check(void*);
 public:
-  using type = decltype(__check(declval<_Tp*>()));
+  using stream_type = decltype(__check(declval<_Tp*>()));
+  using type = __not_>;
   constexpr static bool value = type::value;
   };
 
@@ -949,7 +950,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __is_convertible_to_basic_istream<
typename remove_reference<_Istream>::type>,
  __is_extractable<_Istream&, _Tp&&>>::value,
-  _Istream&>::type
+  typename __is_convertible_to_basic_istream<
+typename
+remove_reference<_Istream>::type>::stream_type>::type
 operator>>(_Istream&& __is, _Tp&& __x)
 {
   __is >> std::forward<_Tp>(__x);
diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream
index a1fe892..64db7c7 100644
--- a/libstdc++-v3/include/std/ostream
+++ b/libstdc++-v3/include/std/ostream
@@ -617,11 +617,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __is_convertible_to_basic_ostream
   {
 template
-static true_type __check(basic_ostream<_Ch, _Up>*);
+static basic_ostream<_Ch, _Up>& __check(basic_ostream<_Ch, _Up>*);
 
-static false_type __check(void*);
+static void __check(void*);
   public:
-using type = decltype(__check(declval<_Tp*>()));
+using stream_type = decltype(__check(declval<_Tp*>()));
+using type = __not_>;
 constexpr static bool value = type::value;
   };
 
@@ -650,8 +651,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __is_convertible_to_basic_ostream<
typename remove_reference<_Ostream>::type>,
  __is_insertable<_Ostream&, const _Tp&>>::value,
-  _Ostream&>::type
- //basic_ostream<_CharT, _Traits>&
+  typename __is_convertible_to_basic_ostream<
+typename
+remove_reference<_Ostream>::type>::stream_type>::type
 operator<<(_Ostream&& __os, const _Tp& __x)
 {
   __os << __x;


Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Andre Vehreschild
Fixed -> r243034.

- Andre

On Wed, 30 Nov 2016 15:53:39 +0100
Janus Weil  wrote:

> Hi,
> 
> > on IRC:
> > 15:28:22 dominiq:  vehre: add /* FALLTHROUGH */
> >
> > Done and committed as obvious as r243023.  
> 
> thanks. However, I still see these two:
> 
> 
> >> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> >> > ‘_gfortran_caf_get_by_ref’:
> >> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
> >> > ‘src_size’ may be used uninitialized in this function
> >> > [-Wmaybe-uninitialized]
> >> >if (size == 0 || src_size == 0)
> >> > ~^~~~
> >> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> >> > ‘_gfortran_caf_send_by_ref’:
> >> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
> >> > ‘src_size’ may be used uninitialized in this function
> >> > [-Wmaybe-uninitialized]
> >> >if (size == 0 || src_size == 0)
> >> > ~^~~~  
> 
> Can you please fix them as well?
> 
> Thanks,
> Janus
> 
> 
> 
> 
> >> > 2016-11-30 14:30 GMT+01:00 Andre Vehreschild :  
> >> > > Hi Paul,
> >> > >
> >> > > thanks for the review. Committed with the changes requested and the one
> >> > > reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with
> >> > > -m32 as r243021.
> >> > >
> >> > > Thanks for the review and tests.
> >> > >
> >> > > Regards,
> >> > > Andre
> >> > >
> >> > > On Wed, 30 Nov 2016 07:49:13 +0100
> >> > > Paul Richard Thomas  wrote:
> >> > >  
> >> > >> Dear Andre,
> >> > >>
> >> > >> This all looks OK to me. The only comment that I have that you might
> >> > >> deal with before committing is that some of the Boolean expressions,
> >> > >> eg:
> >> > >> +  int caf_dereg_mode
> >> > >> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
> >> > >> +  || c->attr.codimension)
> >> > >> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
> >> > >> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
> >> > >> +  : GFC_CAF_COARRAY_DEREGISTER)
> >> > >> +  : GFC_CAF_COARRAY_NOCOARRAY;
> >> > >>
> >> > >> are getting be sufficiently convoluted that a small, appropriately
> >> > >> named, helper function might be clearer. Of course, this is true of
> >> > >> many parts of gfortran but it is not too late to start making the code
> >> > >> a bit clearer.
> >> > >>
> >> > >> You can commit to the present trunk as far as I am concerned. I know
> >> > >> that the caf enthusiasts will test it to bits before release!
> >> > >>
> >> > >> Regards
> >> > >>
> >> > >> Paul
> >> > >>
> >> > >>
> >> > >> On 28 November 2016 at 19:33, Andre Vehreschild 
> >> > >> wrote:  
> >> > >> > PING!
> >> > >> >
> >> > >> > I know it's a lengthy patch, but comments would be nice anyway.
> >> > >> >
> >> > >> > - Andre
> >> > >> >
> >> > >> > On Tue, 22 Nov 2016 20:46:50 +0100
> >> > >> > Andre Vehreschild  wrote:
> >> > >> >  
> >> > >> >> Hi all,
> >> > >> >>
> >> > >> >> attached patch addresses the need of extending the API of the
> >> > >> >> caf-libs to enable allocatable components asynchronous allocation.
> >> > >> >> Allocatable components in derived type coarrays are different from
> >> > >> >> regular coarrays or coarrayed components. The latter have to be
> >> > >> >> allocated on all images or on none. Furthermore is the allocation
> >> > >> >> a point of synchronisation.
> >> > >> >>
> >> > >> >> For allocatable components the F2008 allows to have some allocated
> >> > >> >> on some images and on others not. Furthermore is the registration
> >> > >> >> with the caf-lib, that an allocatable component is present in a
> >> > >> >> derived type coarray no longer a synchronisation point. To
> >> > >> >> implement these features two new types of coarray registration
> >> > >> >> have been introduced. The first one just registering the component
> >> > >> >> with the caf-lib and the latter doing the allocate. Furthermore
> >> > >> >> has the caf-API been extended to provide a query function to learn
> >> > >> >> about the allocation status of a component on a remote image.
> >> > >> >>
> >> > >> >> Sorry, that the patch is rather lengthy. Most of this is due to the
> >> > >> >> structure_alloc_comps' signature change. The routine and its
> >> > >> >> wrappers are used rather often which needed the appropriate
> >> > >> >> changes.
> >> > >> >>
> >> > >> >> I know I left two or three TODOs in the patch to remind me of
> >> > >> >> things I have to investigate further. For the current state these
> >> > >> >> TODOs are no reason to hold back the patch. The third party
> >> > >> >> library opencoarrays implements the mpi-part of the caf-model and
> >> > >> >> will change in sync. It would of course be advantageous to just
> >> > >> >> have to say: With gcc-7 gfortran implements allocatable components
> >> > >> >> in derived coarrays nearly completely.
> >> > >> >>
> >> > >> >> I know we are in stage 3. But the patch bootstraps a

Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Dominik Vogt
On Wed, Nov 30, 2016 at 03:40:32PM +0100, Michael Matz wrote:
> Hi,
> 
> On Wed, 30 Nov 2016, Segher Boessenkool wrote:
> 
> > > I don't think mode-changing _extracts are valid in this context.  From 
> > > the 
> > > docu:
> > > 
> > >   `(sign_extract:M LOC SIZE POS)'
> > >   ...
> > >  The mode M is the same as the mode that would be used for LOC if
> > >  it were a register.
> > > 
> > > Probably it could be made to work just fine, but I'm not sure it'd be 
> > > worth much, as then the targets would need to care for mode-changes 
> > > occuring not just through subregs as usual, but also through extracts.
> > 
> > The patch https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02987.html I
> > submitted yesterday deals with this same issue, FWIW -- some ports
> > apparently already do mode-changing extracts.
> 
> Yeah, saw that a bit later.  So, hmmm.  I'm not sure what to make of it, 
> if the targets choose to use mode-changing extracts I guess that's fine, 
> as they presumably will have written patterns that recognize them.  But I 
> don't think we should willy-nilly generate such patterns as we can't know 
> if the target deals with them or not.

Just working on such a pattern for s390x, I had the impression
that combine uses the automatic mode change when it's there, and
otherwise it does things differently, that is, it tries different
combinations when it has the pattern than without.  There seems to
be at least some kind of detection.

> We could of course always generate 
> both variants: (subreg:M1 (extract:M2 (object:M2)) and
> (extract:M1 (object:M2)) and see if either matches, but that seems a bit 
> too much work.

--

The insns that combine tried are:

  (insn 7 4 8 2 (set (reg:DI 69)
(lshiftrt:DI (reg/v:DI 66 [ v_x ])
(const_int 48 [0x30])))

  (insn 9 8 10 2 (parallel [
(set (reg:SI 68 [ v_or ])
(ior:SI (reg:SI 70 [ v_and1 ])
(subreg:SI (reg:DI 69) 4)))
(clobber (reg:CC 33 %cc))
])

A while ago combine handled the situation well, resulting in the
new "risbg" instruction, but for a while it's not been working.
It's a bit difficult to track that down to a specific commit
because of the broken "combine"-patch that took a while to fix.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH,rs6000] Correct mode of operand 2 in vector extract half-word and word instruction patterns

2016-11-30 Thread Segher Boessenkool
On Wed, Nov 30, 2016 at 08:35:08AM -0700, Kelvin Nilsen wrote:
> This patch corrects an error in a patch committed on 2016-10-18 to add
> built-in function support for Power9 string operations.  In that
> original patch, the mode for operand 2 of the newly added vector
> extract half-word and full-word instruction patterns was described as
> V16QI, even though those instruction patterns were conceptually
> operating on V8HI and V4SI operands respectively.
> 
> This patch changes the modes of the operands for these instruction
> patterns to better represent the intended types.  This patch improves
> readability and maintainability of code.  It does not affect
> correctness of generated code, since the existing implementation
> implicitly coerces the operand types to the declared type.
> 
> The patch has been bootstrapped and tested on powerpc64le-unknown-linux
> without regressions.
> 
> Is this ok for the trunk?

Okay.  Thanks,


Segher


> 2016-11-30  Kelvin Nilsen  
> 
>   PR target/78577
>   * config/rs6000/vsx.md (vextuhlx): Revise mode of operand 2.
>   (vextuhrx): Likewise.
>   (vextuwlx): Likewise.
>   (vextuwrx): Likewise.


Re: [PATCH 7/9] Add RTL-error-handling to host

2016-11-30 Thread Bernd Schmidt

On 11/29/2016 10:13 PM, Bernd Schmidt wrote:

On 11/29/2016 07:53 PM, David Malcolm wrote:


Would you prefer that I went with approach (B), or is approach (A)
acceptable?


Well, I was hoping there'd be an approach (C) where the read-rtl code
uses whatever diagnostics framework that is available. Maybe it'll turn
out that's too hard. Somehow the current patch looked strange to me, but
if there's no easy alternative maybe we'll have to go with it.


So, I've tried to build patches 1-6 + 8, without #7. It looks like the 
differences are as follows:


- A lack of seen_error in errors.[ch], could be easily added, and
  basically a spelling mismatch between have_error and errorcount.
- A lack of fatal in diagnostics.c. Could maybe be added to just call
  fatal_error?

All this seems simpler and cleaner to fix than linking two different 
error handling frameworks into one binary. Do you see any other 
difficulties?



Bernd



Re: [PATCH] Fix prs 78602 & 78560 on PowerPC (vec_extract/vec_sec)

2016-11-30 Thread Segher Boessenkool
On Wed, Nov 30, 2016 at 12:55:29AM -0500, Michael Meissner wrote:
> I have done full bootstraps and make check with no regressions on a little
> endian power8 (64-bit only), a big endian power8 (64-bit only), and a big
> endian power7 (both 32-bit and 64-bit).  Cann I install both patches to the
> trunk?

Yes please.  Thanks,


Segher


> 2016-11-29  Michael Meissner  
> 
>   PR target/78602
>   * config/rs6000/rs6000.c (rs6000_expand_vector_extract): If the
>   element is not a constant or in a register, force it to a
>   register.
> 
> 2016-11-29  Michael Meissner  
> 
>   PR target/78560
>   * config/rs6000/rs6000.c (rs6000_expand_vector_set): Force value
>   that will be set to a vector element to be in a register.
>   * config/rs6000/vsx.md (vsx_set__p9): Fix thinko that used
>   the wrong multiplier to convert the element number to a byte
>   offset.


Re: [v3 PATCH] Fix testsuite failures caused by the patch implementing LWG 2534.

2016-11-30 Thread Jonathan Wakely

On 30/11/16 17:58 +0200, Ville Voutilainen wrote:

   Fix testsuite failures caused by the patch implementing LWG 2534.
   * include/std/istream (__is_convertible_to_basic_istream):
   Change the return types of __check, introduce stream_type.
   (operator>>(_Istream&&, _Tp&&)):
   Use __is_convertible_to_basic_istream::stream_type as the return type.
   * include/std/ostream (__is_convertible_to_basic_ostream):
   Change the return types of __check, introduce stream_type.
   (operator>>(_Ostream&&, _Tp&&)):
   Use __is_convertible_to_basic_ostream::stream_type as the return type.


As discussed on IRC, please change "stream_type" to istream_type and
ostream_type, as appropriate, because those names are already used by
stream iterators, so users can't define them as macros.

And you could make the remove_reference happen inside the
__is_convertible_to_basic_[io]stream trait, since it's only ever used
with references, but that seems stylistic.

OK with the stream_type renaming.




Re: [Patches] Add variant constexpr support for visit, comparisons and get

2016-11-30 Thread Jonathan Wakely

On 26/11/16 21:38 -0800, Tim Shen wrote:

This 4-patch series contains the following in order:

a.diff: Remove uses-allocator ctors. They are going away, and removing
it reduces the maintenance burden from now on.


Yay! less code.


b.diff: Add constexpr support for get<> and comparisons. This patch
also involves small refactoring of _Variant_storage.

c.diff: Fix some libc++ test failures.

d.diff: Add constexpr support for visit. This patch also removes
__storage, __get_alternative, and __reserved_type_map, since we don't
need to support reference/void types for now.

The underlying design doesn't change - we still use the vtable
approach to achieve O(1) runtime cost even under -O0.


Great stuff.




   * include/std/variant: Implement constexpr comparison and get<>.
   * testsuite/20_util/variant/compile.cc: Tests.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 2d9303a..a913074 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -154,31 +154,63 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  template
using __storage = _Alternative;

-  template>
+  // _Uninitialized nullify the destructor calls.


This wording confused me slightly. How about:

 "_Uninitialized makes destructors trivial"


+  // This is necessary, since we define _Variadic_union as a recursive union,
+  // and we don't want to inspect the union members one by one in its dtor,
+  // it's slow.


Please change "it's slow" to "that's slow".


+  template>
struct _Uninitialized;


I'm still unsure that is_literal_type is the right trait here. If it's
definitely right then we should probably *not* deprecate it in C++17!


  template
struct _Uninitialized<_Type, false>
{
-  constexpr _Uninitialized() = default;
-
  template
  constexpr _Uninitialized(in_place_index_t<0>, _Args&&... __args)
  { ::new (&_M_storage) _Type(std::forward<_Args>(__args)...); }

+  const _Type& _M_get() const &
+  {
+   return *static_cast(
+   static_cast(&_M_storage));
+  }
+
+  _Type& _M_get() &
+  { return *static_cast<_Type*>(static_cast(&_M_storage)); }
+
+  const _Type&& _M_get() const &&
+  {
+   return std::move(*static_cast(
+   static_cast(&_M_storage)));
+  }
+
+  _Type&& _M_get() &&
+  {
+   return std::move(*static_cast<_Type*>(static_cast(&_M_storage)));
+  }
+
  typename std::aligned_storage::type
  _M_storage;


I think this could use __aligned_membuf, which would reduce the
alignment requirements for some types (e.g. long long on x86-32).

That would also mean you get the _M_ptr() member so don't need all the
casts.


+  ~_Variant_storage()
+  { _M_destroy_impl(std::make_index_sequence{}); }


You can use index_sequence_for<_Types...> here.


@@ -598,9 +645,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_S_apply_all_alts(_Array_type& __vtable, index_sequence<__indices...>)
{ (_S_apply_single_alt<__indices>(__vtable._M_arr[__indices]), ...); }

-  template
+  template


This needs to be _Tp not T


+  return __lhs._M_equal_to(__rhs,
+  std::make_index_sequence{});


Another one that could use index_sequence_for<_Types...>


+  return __lhs._M_less_than(__rhs,
+   std::make_index_sequence{});


Same again.



   * include/bits/enable_special_members.h: Make
   _Enable_default_constructor constexpr.
   * include/std/variant (variant::emplace, variant::swap, std::swap,
   std::hash): Sfinae on emplace and std::swap; handle __poison_hash 
bases
   of duplicated types.

diff --git a/libstdc++-v3/include/bits/enable_special_members.h 
b/libstdc++-v3/include/bits/enable_special_members.h
index 07c6c99..4f4477b 100644
--- a/libstdc++-v3/include/bits/enable_special_members.h
+++ b/libstdc++-v3/include/bits/enable_special_members.h
@@ -118,7 +118,8 @@ template
operator=(_Enable_default_constructor&&) noexcept = default;

// Can be used in other ctors.
-explicit _Enable_default_constructor(_Enable_default_constructor_tag) { }
+constexpr explicit
+_Enable_default_constructor(_Enable_default_constructor_tag) { }
  };

+  void _M_reset()
+  {
+   _M_reset_impl(std::make_index_sequence{});
+   _M_index = variant_npos;
+  }
+
  ~_Variant_storage()
-  { _M_destroy_impl(std::make_index_sequence{}); }
+  { _M_reset(); }


These can also use index_sequence_for<_Types...>


@@ -1253,14 +1285,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

  template
struct hash>
-: private __poison_hash>...
+: private __detail::__variant::_Variant_hash_base<
+   variant<_Types...>, std::make_index_sequence>


And again.


{
  using result_type = size_t;
  using argument_type = variant<_Types...>;

  size_t
  operator()(const variant<_Types...>& __t) const
-  noexcept((... && 
noexcep

[PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA

2016-11-30 Thread Kyrill Tkachov

Hi all,

In this awkward ICE we have a *load_multiple pattern that is being transformed 
in reload from:
(insn 55 67 151 3 (parallel [
(set (reg:SI 0 r0)
(mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
(set (reg:SI 158 [ c+4 ])
(mem/u/c:SI (plus:SI (reg/f:SI 147)
(const_int 4 [0x4])) [2 c+4 S4 A32]))
]) arm-crash.c:25 393 {*load_multiple}
 (expr_list:REG_UNUSED (reg:SI 0 r0)
(nil)))


into the invalid:
(insn 55 67 70 3 (parallel [
(set (reg:SI 0 r0)
(mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
(set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
(const_int -4 [0xfffc])) [4 %sfp+-12 S4 
A32])
(mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
(const_int 4 [0x4])) [2 c+4 S4 A32]))
]) arm-crash.c:25 393 {*load_multiple}
 (nil))

The operands of *load_multiple are not validated through constraints like LRA 
is used to, but rather through
a match_parallel predicate which ends up calling ldm_stm_operation_p to 
validate the multiple sets.
But this means that LRA cannot reason about the constraints properly.
This two-regiseter load should not have used *load_multiple anyway, it should 
have used *ldm2_ from ldmstm.md
and indeed it did until the loop2_invariant pass which copied the ldm2_ pattern:
(insn 27 23 28 4 (parallel [
(set (reg:SI 0 r0)
(mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
(set (reg:SI 1 r1)
(mem/u/c:SI (plus:SI (reg/f:SI 147)
(const_int 4 [0x4])) [2 c+4 S4 A32]))
]) "ldm.c":25 385 {*ldm2_}
 (nil))

into:
(insn 55 19 67 3 (parallel [
(set (reg:SI 0 r0)
(mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
(set (reg:SI 158)
(mem/u/c:SI (plus:SI (reg/f:SI 147)
(const_int 4 [0x4])) [2 c+4 S4 A32]))
]) "ldm.c":25 404 {*load_multiple}
 (expr_list:REG_UNUSED (reg:SI 0 r0)
(nil)))

Note that it now got recognised as load_multiple because the second register is 
not a hard register but the pseudo 158.
In any case, the solution suggested in the PR (and I agree with it) is to 
restrict *load_multiple to after reload.
The similar pattern *load_multiple_with_writeback also has a similar condition 
and the comment above *load_multiple says that
it's used to generate epilogues, which is done after reload anyway. For 
pre-reload load-multiples the patterns in ldmstm.md
should do just fine.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2016-11-30  Kyrylo Tkachov  

PR target/71436
* config/arm/arm.md (*load_multiple): Add reload_completed to
matching condition.

2016-11-30  Kyrylo Tkachov  

PR target/71436
* gcc.c-torture/compile/pr71436.c: New test.
commit 996d28e2353badd1b29ef000f94d40c7dab9010f
Author: Kyrylo Tkachov 
Date:   Tue Nov 29 15:07:30 2016 +

[ARM] Restrict *load_multiple pattern till after LRA

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 74c44f3..22d2a84 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11807,12 +11807,15 @@ (define_insn ""
 
 ;; Patterns in ldmstm.md don't cover more than 4 registers. This pattern covers
 ;; large lists without explicit writeback generated for APCS_FRAME epilogue.
+;; The operands are validated through the load_multiple_operation
+;; match_parallel predicate rather than through constraints so enable it only
+;; after reload.
 (define_insn "*load_multiple"
   [(match_parallel 0 "load_multiple_operation"
 [(set (match_operand:SI 2 "s_register_operand" "=rk")
   (mem:SI (match_operand:SI 1 "s_register_operand" "rk")))
 ])]
-  "TARGET_32BIT"
+  "TARGET_32BIT && reload_completed"
   "*
   {
 arm_output_multireg_pop (operands, /*return_pc=*/false,
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
new file mode 100644
index 000..ab08d5d
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
@@ -0,0 +1,35 @@
+/* PR target/71436.  */
+
+#pragma pack(1)
+struct S0
+{
+  volatile int f0;
+  short f2;
+};
+
+void foo (struct S0 *);
+int a, d;
+static struct S0 b[5];
+static struct S0 c;
+void fn1 ();
+void
+main ()
+{
+  {
+struct S0 e;
+for (; d; fn1 ())
+  {
+{
+  a = 3;
+  for (; a >= 0; a -= 1)
+{
+  {
+e = c;
+  }
+  b[a] = e;
+}
+}
+  }
+  }
+  foo (b);
+}


Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Matthias Klose
On 30.11.2016 12:38, Richard Biener wrote:
> On Wed, Nov 30, 2016 at 12:30 PM, Matthias Klose  wrote:
>> There's one more fix needed for the case of only having the pkg-config module
>> installed when configuring with --enable-objc-gc. We can't use 
>> PKG_CHECK_MODULES
>> directly because the pkg.m4 macros choke on the dash in the module name. Thus
>> setting the CFLAGS and LIBS directly. Ok to install?
> 
> Why not fix pkg.m4?
>
> Richard.

Jakub suggested to avoid using pkg-config at all, so we can get rid off this
code altogether.

To re-enable bootstrap with --enable-objc-gc enabled, I checked in this change
as a stop-gap:

>> * configure.ac: Set BDW_GC_CFLAGS and BDW_GC_LIBS after checking
>> for the existence of the pkg-config modules.
>> * Regenerate.

Matthias



Re: [PATCH, ARM] Further improve stack usage on sha512 (PR 77308)

2016-11-30 Thread Bernd Edlinger
On 11/30/16 13:01, Wilco Dijkstra wrote:
> Bernd Edlinger wrote:
>> On 11/29/16 16:06, Wilco Dijkstra wrote:
>>> Bernd Edlinger wrote:
>>>
>>> -  "TARGET_32BIT && reload_completed
>>> +  "TARGET_32BIT && ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)
>>>  && ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))"
>>>
>>> This is equivalent to "&& (!TARGET_IWMMXT || reload_completed)" since we're
>>> already excluding NEON.
>>
>> Aehm, no.  This would split the addi_neon insn before it is clear
>> if the reload pass will assign a VFP register.
>
> Hmm that's strange... This instruction shouldn't be used to also split some 
> random
> Neon pattern - for example arm_subdi3 doesn't do the same. To understand and
> reason about any of these complex patterns they should all work in the same 
> way...
>
I was a bit surprised as well, when I saw that happen.
But subdi3 is different:
   "TARGET_32BIT && !TARGET_NEON"
   "#"  ; "subs\\t%Q0, %Q1, %Q2\;sbc\\t%R0, %R1, %R2"
   "&& reload_completed"

so this never splits anything if TARGET_NEON.
but adddi3 can not expand if TARGET_NEON but it's pattern simply
looks exactly like the addi3_neon:

(define_insn_and_split "*arm_adddi3"
   [(set (match_operand:DI  0 "s_register_operand" 
"=&r,&r,&r,&r,&r")
 (plus:DI (match_operand:DI 1 "s_register_operand" "%0, 0, r, 0, r")
  (match_operand:DI 2 "arm_adddi_operand"  "r,  0, r, 
Dd, Dd")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT && !TARGET_NEON"
   "#"
   "TARGET_32BIT && reload_completed
&& ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))"

(define_insn "adddi3_neon"
   [(set (match_operand:DI 0 "s_register_operand" 
"=w,?&r,?&r,?w,?&r,?&r,?&r")
 (plus:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,w,r,0,r")
  (match_operand:DI 2 "arm_adddi_operand" 
"w,r,0,w,r,Dd,Dd")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_NEON"
{
   switch (which_alternative)
 {
 case 0: /* fall through */
 case 3: return "vadd.i64\t%P0, %P1, %P2";
 case 1: return "#";
 case 2: return "#";
 case 4: return "#";
 case 5: return "#";
 case 6: return "#";
 default: gcc_unreachable ();
 }

Even the return "#" explicitly invokes the former pattern.
So I think the author knew that, and did it on purpose.


>> But when I make *arm_cmpdi_insn split early, it ICEs:
>
> (insn 4870 4869 1636 87 (set (scratch:SI)
>  (minus:SI (minus:SI (subreg:SI (reg:DI 2261) 4)
>  (subreg:SI (reg:DI 473 [ X$14 ]) 4))
>  (ltu:SI (reg:CC_C 100 cc)
>  (const_int 0 [0] "pr77308-2.c":140 -1
>   (nil))
>
> That's easy, we don't have a sbcs , r1, r2 pattern. A quick 
> workaround is
> to create a temporary for operand[2] (if before reload) so it will match the 
> standard
> sbcs pattern, and then the split works fine.
>
>> So it is certainly possible, but not really simple to improve the
>> stack size even further.  But I would prefer to do that in a
>> separate patch.
>
> Yes separate patches would be fine. However there is a lot of scope to 
> improve this
> further. For example after your patch shifts and logical operations are 
> expanded in
> expand, add/sub are in split1 after combine runs and everything else is split 
> after
> reload. It doesn't make sense to split different operations at different 
> times - it means
> you're still going to get the bad DImode subregs and miss lots of optimization
> opportunities due to the mix of partly split and partly not-yet-split 
> operations.
>

Yes.  I did the add/sub differently because it was more easy this way,
and it was simply sufficient to make the existing test cases happy.

Also, the biggest benefit was IIRC from the very early splitting
of the anddi/iordi/xordi patterns, because they have completely
separate data flow in low and high parts.  And that is not
the case for the arihmetic patterns, but nevertheless they
can still be optimized, preferably, when a new test case
is found, that can demonstrate an improvement.

I am not sure why the cmpdi pattern have an influence at all,
because from the data flow you need all 64 bits of both sides.
Nevertheless it is a fact: With the modified test case I
get 264 bytes frame size, and that was 1920 before.

I attached the completely untested follow-up patch now, but I would
like to post that one again for review, after I applied my current
patch, which is still waiting for final review (please feel pinged!).


This is really exciting...


Thanks
Bernd.
--- gcc/config/arm/arm.md.orig	2016-11-27 09:22:41.794790123 +0100
+++ gcc/config/arm/arm.md	2016-11-30 16:40:30.140532737 +0100
@@ -4738,7 +4738,7 @@
(clobber (reg:CC CC_REGNUM))]
   "TARGET_ARM"
   "#"   ; "rsbs\\t%Q0, %Q1, #0\;rsc\\t%R0, %R1, #0"
-  "&& reload_completed"
+  "&& ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)"
   [(parallel [(set (reg:CC CC_REGNUM)
 		   (compare:CC (const_int 0) (match_dup 1)))
 	  (set (match_d

Re: [PATCH 2/4] [ARC] Cleanup implementation.

2016-11-30 Thread Andrew Burgess
* Claudiu Zissulescu  [2016-11-16 11:17:59 
+0100]:

> gcc/
> 2016-06-30  Claudiu Zissulescu  

There seem to be two sets of changes here:

> 
>   * config/arc/arc-protos.h (insn_is_tls_gd_dispatch): Remove.
>   * config/arc/arc.c (arc_unspec_offset): New function.
>   (arc_finalize_pic): Change.
>   (arc_emit_call_tls_get_addr): Likewise.
>   (arc_legitimize_tls_address): Likewise.
>   (arc_legitimize_pic_address): Likewise.
>   (insn_is_tls_gd_dispatch): Remove.
>   * config/arc/arc.h (INSN_REFERENCES_ARE_DELAYED): Change.

This lot seem pretty straight forward.  As always I cringe a little
when I see "Change" as the description, but given that actual change
is straight forward and small it's not that bad..

>   * config/arc/arc.md (ls_gd_load): Remove.
>   (tls_gd_dispatch): Likewise.

I don't see the connection between these two parts?  Plus it would be
nice to have some more words _somewhere_ for why these are being
removed.  The commit message is probably the right place I'd have
thought.

But assuming your reason for removing the patterns is solid this patch
looks fine.  You should commit with an extended description.

Thanks,
Andrew


> ---
>  gcc/config/arc/arc-protos.h |  1 -
>  gcc/config/arc/arc.c| 41 ++---
>  gcc/config/arc/arc.h|  2 +-
>  gcc/config/arc/arc.md   | 34 --
>  4 files changed, 19 insertions(+), 59 deletions(-)
> 
> diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
> index 6008744..bda3d46 100644
> --- a/gcc/config/arc/arc-protos.h
> +++ b/gcc/config/arc/arc-protos.h
> @@ -118,6 +118,5 @@ extern bool arc_eh_uses (int regno);
>  extern int regno_clobbered_p (unsigned int, rtx_insn *, machine_mode, int);
>  extern bool arc_legitimize_reload_address (rtx *, machine_mode, int, int);
>  extern void arc_secondary_reload_conv (rtx, rtx, rtx, bool);
> -extern bool insn_is_tls_gd_dispatch (rtx_insn *);
>  extern void arc_cpu_cpp_builtins (cpp_reader *);
>  extern rtx arc_eh_return_address_location (void);
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 428676f..7eadb3c 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -2893,6 +2893,15 @@ arc_eh_return_address_location (void)
>  
>  /* PIC */
>  
> +/* Helper to generate unspec constant.  */
> +
> +static rtx
> +arc_unspec_offset (rtx loc, int unspec)
> +{
> +  return gen_rtx_CONST (Pmode, gen_rtx_UNSPEC (Pmode, gen_rtvec (1, loc),
> +unspec));
> +}
> +
>  /* Emit special PIC prologues and epilogues.  */
>  /* If the function has any GOTOFF relocations, then the GOTBASE
> register has to be setup in the prologue
> @@ -2918,9 +2927,7 @@ arc_finalize_pic (void)
>gcc_assert (flag_pic != 0);
>  
>pat = gen_rtx_SYMBOL_REF (Pmode, "_DYNAMIC");
> -  pat = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, pat), ARC_UNSPEC_GOT);
> -  pat = gen_rtx_CONST (Pmode, pat);
> -
> +  pat = arc_unspec_offset (pat, ARC_UNSPEC_GOT);
>pat = gen_rtx_SET (baseptr_rtx, pat);
>  
>emit_insn (pat);
> @@ -4989,8 +4996,7 @@ arc_emit_call_tls_get_addr (rtx sym, int reloc, rtx eqv)
>  
>start_sequence ();
>  
> -  rtx x = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, sym), reloc);
> -  x = gen_rtx_CONST (Pmode, x);
> +  rtx x = arc_unspec_offset (sym, reloc);
>emit_move_insn (r0, x);
>use_reg (&call_fusage, r0);
>  
> @@ -5046,17 +5052,18 @@ arc_legitimize_tls_address (rtx addr, enum tls_model 
> model)
>addr = gen_rtx_CONST (Pmode, addr);
>base = arc_legitimize_tls_address (base, TLS_MODEL_GLOBAL_DYNAMIC);
>return gen_rtx_PLUS (Pmode, force_reg (Pmode, base), addr);
> +
>  case TLS_MODEL_GLOBAL_DYNAMIC:
>return arc_emit_call_tls_get_addr (addr, UNSPEC_TLS_GD, addr);
> +
>  case TLS_MODEL_INITIAL_EXEC:
> -  addr = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr), UNSPEC_TLS_IE);
> -  addr = gen_rtx_CONST (Pmode, addr);
> +  addr = arc_unspec_offset (addr, UNSPEC_TLS_IE);
>addr = copy_to_mode_reg (Pmode, gen_const_mem (Pmode, addr));
>return gen_rtx_PLUS (Pmode, arc_get_tp (), addr);
> +
>  case TLS_MODEL_LOCAL_EXEC:
>  local_exec:
> -  addr = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr), UNSPEC_TLS_OFF);
> -  addr = gen_rtx_CONST (Pmode, addr);
> +  addr = arc_unspec_offset (addr, UNSPEC_TLS_OFF);
>return gen_rtx_PLUS (Pmode, arc_get_tp (), addr);
>  default:
>gcc_unreachable ();
> @@ -5087,14 +5094,11 @@ arc_legitimize_pic_address (rtx orig, rtx oldx)
>else if (!flag_pic)
>   return orig;
>else if (CONSTANT_POOL_ADDRESS_P (addr) || SYMBOL_REF_LOCAL_P (addr))
> - return gen_rtx_CONST (Pmode,
> -   gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr),
> -   ARC_UNSPEC_GOTOFFPC));
> + return arc_unspec_offset (addr, ARC_UNSPEC_GOTOFFPC);
>  
>/* This symbol must b

Re: [libgomp] No references to env.c -> no libgomp construction

2016-11-30 Thread Alexander Monakov
[redirecting from gcc@ to gcc-patches@ for patch submission]

On Wed, 30 Nov 2016, Sebastian Huber wrote:
> > On Tue, 29 Nov 2016, Sebastian Huber wrote:
> > > * env.c: Split out ICV definitions into...
> > > * icv.c: ...here (new file) and...
> > > * icv-device.c: ...here. New file.
> > > 
> > > the env.c contains now only local symbols (at least for target 
> > > *-rtems*-*):
> > > 
> > [...]
> > > 
> > > Thus the libgomp constructor is not linked in into executables.
> > 
> > Thanks for the report.  This issue affects only static libgomp.a (and not on
> > NVPTX where env.c is deliberately empty).
> > 
> > I think the minimal solution here is to #include  from icv.c instead 
> > of
> > compiling it separately (using <> inclusion rather than "" so in case of 
> > NVPTX
> > we pick up the empty config/nvptx/env.c from toplevel icv.c).
> > 
> > A slightly more involved but perhaps a preferable approach is to remove
> > config/nvptx/env.c, introduce LIBGOMP_OFFLOADED_ONLY macro, and use it to
> > guard inclusion of env.c from icv.c (which then can use the #include "env.c"
> > form).
> 
> I guess its sufficient to move
> 
> pthread_attr_t gomp_thread_attr;
> 
> from team.c (NVPTX seems to provide its own team.c) to env.c.  This generates
> a reference from team.c to env.c and the constructor is pulled in.

Well, yes, generally definitions of variables must be in the same translation
unit as the constructor that would initialize them -- so at the moment it's
wrong for both the ICV definitions and gomp_thread_attr to be defined elsewhere.

In reply to this message I'm posting 3 patches that solve the issue by moving
ICV definitions back to env.c and using new LIBGOMP_OFFLOADED_ONLY macro to
avoid environment-related functionality on nvptx.  I think it would be good to
move gomp_thread_attr to env.c too, but that's not a part of this patchset.

Thanks.
Alexander


  1   2   >