Re: [PATCH] Fix A < 0 ? C : 0 optimization (PR tree-optimization/78720)

2016-12-10 Thread Paolo Bonzini


On 09/12/2016 20:26, Jakub Jelinek wrote:
> +tree ctype = TREE_TYPE (@1);
> +   }
> +   (if (shift >= 0)
> +(bit_and
> + (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
> + @1)
> +(bit_and
> + (lshift (convert:ctype @0) { build_int_cst (integer_type_node, -shift); 
> })
> + @1)

The ":ctype" shouldn't be needed because the COND_EXPR already has @1's
type.

Paolo


Re: [PATCH] Fix A < 0 ? C : 0 optimization (PR tree-optimization/78720)

2016-12-10 Thread Jakub Jelinek
On Sat, Dec 10, 2016 at 10:05:50AM +0100, Paolo Bonzini wrote:
> 
> 
> On 09/12/2016 20:26, Jakub Jelinek wrote:
> > +tree ctype = TREE_TYPE (@1);
> > +   }
> > +   (if (shift >= 0)
> > +(bit_and
> > + (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
> > + @1)
> > +(bit_and
> > + (lshift (convert:ctype @0) { build_int_cst (integer_type_node, 
> > -shift); })
> > + @1)
> 
> The ":ctype" shouldn't be needed because the COND_EXPR already has @1's
> type.

Note I'm actually bootstrapping/regtesting today following patch instead,
where it clearly isn't needed.  But for the above case with the shift, it
isn't clear for me how it would work without :ctype - how the algorithm
of figuring out what type to convert to works.  Because the first operand
of lshift isn't used next to @1.

2016-12-09  Jakub Jelinek  
Marc Glisse  

PR tree-optimization/78720
* match.pd (A < 0 ? C : 0): Only optimize for signed A.  If shift
is negative, first convert to @1's type and then lshift it by -shift.

* gcc.c-torture/execute/pr78720.c: New test.

--- gcc/match.pd.jj 2016-12-09 10:19:10.909735559 +0100
+++ gcc/match.pd2016-12-10 09:21:26.260516596 +0100
@@ -2768,17 +2768,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (ncmp (convert:stype @0) { build_zero_cst (stype); })
 
 /* If we have A < 0 ? C : 0 where C is a power of 2, convert
-   this into a right shift followed by ANDing with C.  */
+   this into a right shift or sign extension followed by ANDing with C.  */
 (simplify
  (cond
   (lt @0 integer_zerop)
   integer_pow2p@1 integer_zerop)
- (with {
+ (if (!TYPE_UNSIGNED (TREE_TYPE (@0)))
+  (with {
 int shift = element_precision (@0) - wi::exact_log2 (@1) - 1;
-  }
-  (bit_and
-   (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
-   @1)))
+   }
+   (if (shift >= 0)
+(bit_and
+ (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
+ @1)
+/* Otherwise ctype must be wider than TREE_TYPE (@0) and pure
+   sign extension followed by AND with C will achieve the effect.  */
+(bit_and (convert @0) @1)
 
 /* When the addresses are not directly of decls compare base and offset.
This implements some remaining parts of fold_comparison address
--- gcc/testsuite/gcc.c-torture/execute/pr78720.c.jj2016-12-10 
09:18:43.386574179 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr78720.c   2016-12-10 
09:18:43.386574179 +0100
@@ -0,0 +1,29 @@
+/* PR tree-optimization/78720 */
+
+__attribute__((noinline, noclone)) long int
+foo (signed char x)
+{
+  return x < 0 ? 0x8L : 0L;
+}
+
+__attribute__((noinline, noclone)) long int
+bar (signed char x)
+{
+  return x < 0 ? 0x80L : 0L;
+}
+
+__attribute__((noinline, noclone)) long int
+baz (signed char x)
+{
+  return x < 0 ? 0x20L : 0L;
+}
+
+int
+main ()
+{
+  if (foo (-1) != 0x8L || bar (-1) != 0x80L || baz (-1) != 0x20L
+  || foo (0) != 0L || bar (0) != 0L || baz (0) != 0L
+  || foo (31) != 0L || bar (31) != 0L || baz (31) != 0L)
+__builtin_abort ();
+  return 0;
+}


Jakub


Re: [PATCH] Fix A < 0 ? C : 0 optimization (PR tree-optimization/78720)

2016-12-10 Thread Marc Glisse

On Sat, 10 Dec 2016, Jakub Jelinek wrote:


* match.pd (A < 0 ? C : 0): Only optimize for signed A.  If shift
is negative, first convert to @1's type and then lshift it by -shift.


Thanks, the ChangeLog needs updating.


--- gcc/match.pd.jj 2016-12-09 10:19:10.909735559 +0100
+++ gcc/match.pd2016-12-10 09:21:26.260516596 +0100
@@ -2768,17 +2768,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(ncmp (convert:stype @0) { build_zero_cst (stype); })

/* If we have A < 0 ? C : 0 where C is a power of 2, convert
-   this into a right shift followed by ANDing with C.  */
+   this into a right shift or sign extension followed by ANDing with C.  */
(simplify
 (cond
  (lt @0 integer_zerop)
  integer_pow2p@1 integer_zerop)
- (with {
+ (if (!TYPE_UNSIGNED (TREE_TYPE (@0)))
+  (with {
int shift = element_precision (@0) - wi::exact_log2 (@1) - 1;
-  }
-  (bit_and
-   (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
-   @1)))
+   }
+   (if (shift >= 0)
+(bit_and
+ (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
+ @1)
+/* Otherwise ctype must be wider than TREE_TYPE (@0) and pure
+   sign extension followed by AND with C will achieve the effect.  */
+(bit_and (convert @0) @1)


It is funny to notice that the fact that @1 is a power of 2 has become 
mostly irrelevant. When C fits in the type of A, we can do an arithmetic 
shift right of precision-1 to obtain a mask of 0 or -1, then bit_and works 
not just for powers of 2. Otherwise, imagining that A is int32_t and C 
int64_t, all we care about is that the low 31 bits of C are 0 (otherwise 
we could also do an arithmetic right shift, either before or after the 
convert). But that's another patch, fixing the PR is what matters for now.


--
Marc Glisse


Re: [PATCH] Fix A < 0 ? C : 0 optimization (PR tree-optimization/78720)

2016-12-10 Thread Jakub Jelinek
On Sat, Dec 10, 2016 at 11:45:35AM +0100, Marc Glisse wrote:
> On Sat, 10 Dec 2016, Jakub Jelinek wrote:
> 
> > * match.pd (A < 0 ? C : 0): Only optimize for signed A.  If shift
> > is negative, first convert to @1's type and then lshift it by -shift.
> 
> Thanks, the ChangeLog needs updating.

Indeed, here it is with updated ChangeLog and as an added benefit, also
successfully bootstrapped/regtested on x86_64-linux and i686-linux.

> It is funny to notice that the fact that @1 is a power of 2 has become
> mostly irrelevant. When C fits in the type of A, we can do an arithmetic
> shift right of precision-1 to obtain a mask of 0 or -1, then bit_and works
> not just for powers of 2. Otherwise, imagining that A is int32_t and C
> int64_t, all we care about is that the low 31 bits of C are 0 (otherwise we
> could also do an arithmetic right shift, either before or after the
> convert). But that's another patch, fixing the PR is what matters for now.

You're right.

2016-12-09  Jakub Jelinek  
Marc Glisse  

PR tree-optimization/78720
* match.pd (A < 0 ? C : 0): Only optimize for signed A.  If shift
is negative, sign extend to @1's type and than AND with C.

* gcc.c-torture/execute/pr78720.c: New test.

--- gcc/match.pd.jj 2016-12-09 10:19:10.909735559 +0100
+++ gcc/match.pd2016-12-10 09:21:26.260516596 +0100
@@ -2768,17 +2768,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (ncmp (convert:stype @0) { build_zero_cst (stype); })
 
 /* If we have A < 0 ? C : 0 where C is a power of 2, convert
-   this into a right shift followed by ANDing with C.  */
+   this into a right shift or sign extension followed by ANDing with C.  */
 (simplify
  (cond
   (lt @0 integer_zerop)
   integer_pow2p@1 integer_zerop)
- (with {
+ (if (!TYPE_UNSIGNED (TREE_TYPE (@0)))
+  (with {
 int shift = element_precision (@0) - wi::exact_log2 (@1) - 1;
-  }
-  (bit_and
-   (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
-   @1)))
+   }
+   (if (shift >= 0)
+(bit_and
+ (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
+ @1)
+/* Otherwise ctype must be wider than TREE_TYPE (@0) and pure
+   sign extension followed by AND with C will achieve the effect.  */
+(bit_and (convert @0) @1)
 
 /* When the addresses are not directly of decls compare base and offset.
This implements some remaining parts of fold_comparison address
--- gcc/testsuite/gcc.c-torture/execute/pr78720.c.jj2016-12-10 
09:18:43.386574179 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr78720.c   2016-12-10 
09:18:43.386574179 +0100
@@ -0,0 +1,29 @@
+/* PR tree-optimization/78720 */
+
+__attribute__((noinline, noclone)) long int
+foo (signed char x)
+{
+  return x < 0 ? 0x8L : 0L;
+}
+
+__attribute__((noinline, noclone)) long int
+bar (signed char x)
+{
+  return x < 0 ? 0x80L : 0L;
+}
+
+__attribute__((noinline, noclone)) long int
+baz (signed char x)
+{
+  return x < 0 ? 0x20L : 0L;
+}
+
+int
+main ()
+{
+  if (foo (-1) != 0x8L || bar (-1) != 0x80L || baz (-1) != 0x20L
+  || foo (0) != 0L || bar (0) != 0L || baz (0) != 0L
+  || foo (31) != 0L || bar (31) != 0L || baz (31) != 0L)
+__builtin_abort ();
+  return 0;
+}


Jakub


[C++ PATCH] Demangle decltype(auto) (PR c++/78761)

2016-12-10 Thread Jakub Jelinek
Hi!

Mangling of decltype(auto) has been added back in April 2013:
https://gcc.gnu.org/ml/gcc-patches/2013-04/msg01273.html
and matches the
https://mentorembedded.github.io/cxx-abi/abi.html#mangling
 ::= Da # auto
 ::= Dc # decltype(auto)
part, but the corresponding demangler change has never been added.

The following patch adds it.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-10  Jakub Jelinek  

PR c++/78761
* cp-demangle.c (cplus_demangle_type): Demangle Dc as decltype(auto).
* testsuite/demangle-expected: Add test for decltype(auto).

--- libiberty/cp-demangle.c.jj  2016-11-30 08:57:16.0 +0100
+++ libiberty/cp-demangle.c 2016-12-10 09:31:17.698046004 +0100
@@ -2590,7 +2590,11 @@ cplus_demangle_type (struct d_info *di)
  /* auto */
  ret = d_make_name (di, "auto", 4);
  break;
- 
+   case 'c':
+ /* decltype(auto) */
+ ret = d_make_name (di, "decltype(auto)", 14);
+ break;
+
case 'f':
  /* 32-bit decimal floating point */
  ret = d_make_builtin_type (di, &cplus_demangle_builtin_types[26]);
--- libiberty/testsuite/demangle-expected.jj2016-11-16 07:13:50.0 
+0100
+++ libiberty/testsuite/demangle-expected   2016-12-10 09:39:44.277651914 
+0100
@@ -4200,6 +4200,9 @@ decltype (new auto({parm#1})) f(int
 _Z1fIiERDaRKT_S1_
 auto& f(int const&, int)
 --format=gnu-v3
+_Z1gIiEDcRKT_S0_
+decltype(auto) g(int const&, int)
+--format=gnu-v3
 _Z1gILi1EEvR1AIXT_EER1BIXscbT_EE
 void g<1>(A<1>&, B(1)>&)
 --format=gnu-v3

Jakub


Re: [PATCH, GCC/ARM] Define arm_arch_core_flags in a single file

2016-12-10 Thread Richard Earnshaw (lists)
On 09/12/16 16:56, Thomas Preudhomme wrote:
> Hi,
> 
> This patch moves the definition of arm_arch_core_flags along with the
> declaration of its structure element type to
> common/config/arm/arm-common.c so that it is not *defined* in all object
> files including tm.h. Otherwise, GCC gets bloated with as many copy of
> that array as there is file including tm.h and executable as well given
> that crtbegin gets a copy too.
> 
> ChangeLog entry is as follows:
> 
> 
> *** gcc/ChangeLog ***
> 
> 2016-12-09  Thomas Preud'homme  
> 
> * config/arm/arm-opts.h: Move struct arm_arch_core_flag and
> arm_arch_core_flags to ...
> * common/config/arm/arm-common.c: There.
> 
> 
> Testing: successfully built GCC configured with --with-cpu=cortex-a15
> and optional_thumb testcases pass (without any option for -1 and -2,
> with -mcpu=cortex-m3 for -3 to satisfy requirements): build is working
> and feature as well.
> 
> Is this ok for stage3?
> 
> Best regards,
> 
> Thomas

OK.

R.


[PATCH] gcc: config: tilegx: Reserve prev insn when delete useless insn

2016-12-10 Thread chengang
From: Chen Gang 

When check bundle, gcc may still need modify the prev insn. Original
implementation will lose the prev insn.

Also correct the coding styles of relate code.

2016-12-10  Chen Gang 

gcc/
PR target/78222
* tilegx.c (tilegx_gen_bundle): Reserve prev insn when delete
useless insn.
---
 gcc/config/tilegx/tilegx.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/gcc/config/tilegx/tilegx.c b/gcc/config/tilegx/tilegx.c
index 76a7455..f846ca7 100644
--- a/gcc/config/tilegx/tilegx.c
+++ b/gcc/config/tilegx/tilegx.c
@@ -4469,8 +4469,7 @@ tilegx_gen_bundles (void)
   rtx_insn *end = NEXT_INSN (BB_END (bb));
 
   prev = NULL;
-  for (insn = next_insn_to_bundle (BB_HEAD (bb), end); insn;
-  prev = insn, insn = next)
+  for (insn = next_insn_to_bundle (BB_HEAD (bb), end); insn; insn = next)
{
  next = next_insn_to_bundle (NEXT_INSN (insn), end);
 
@@ -4499,14 +4498,17 @@ tilegx_gen_bundles (void)
  /* Delete barrier insns, because they can mess up the
 emitting of bundle braces.  If it is end-of-bundle, then
 the previous insn must be marked end-of-bundle.  */
- if (get_attr_type (insn) == TYPE_NOTHING) {
-   if (GET_MODE (insn) == QImode && prev != NULL
-   && GET_MODE (prev) == SImode)
- {
-   PUT_MODE (prev, QImode);
- }
-   delete_insn (insn);
- }
+ if (get_attr_type (insn) == TYPE_NOTHING)
+   {
+ if (GET_MODE (insn) == QImode && prev != NULL
+ && GET_MODE (prev) == SImode)
+   {
+ PUT_MODE (prev, QImode);
+   }
+   delete_insn (insn);
+   }
+ else
+   prev = insn;
}
 }
 }
-- 
2.5.0



Re: [PATCH] Add support for Fuchsia (OS)

2016-12-10 Thread Richard Earnshaw
On 08/12/16 22:55, Josh Conner wrote:
> + arm*-*-fuchsia*)
> +   tm_file="${tm_file} fuchsia.h arm/fuchsia-elf.h glibc-stdint.h"
> +   tmake_file="${tmake_file} arm/t-bpabi"
> +   ;;


This will leave the default cpu as arm7tdmi.  Is that what you want?
It's fine if it is, but if not, you should consider setting
target_cpu_cname here as well.

R.


Re: [PATCH] Fix A < 0 ? C : 0 optimization (PR tree-optimization/78720)

2016-12-10 Thread Richard Biener
On December 10, 2016 12:01:37 PM GMT+01:00, Jakub Jelinek  
wrote:
>On Sat, Dec 10, 2016 at 11:45:35AM +0100, Marc Glisse wrote:
>> On Sat, 10 Dec 2016, Jakub Jelinek wrote:
>> 
>> >* match.pd (A < 0 ? C : 0): Only optimize for signed A.  If shift
>> >is negative, first convert to @1's type and then lshift it by
>-shift.
>> 
>> Thanks, the ChangeLog needs updating.
>
>Indeed, here it is with updated ChangeLog and as an added benefit, also
>successfully bootstrapped/regtested on x86_64-linux and i686-linux.

OK.

Richard.

>> It is funny to notice that the fact that @1 is a power of 2 has
>become
>> mostly irrelevant. When C fits in the type of A, we can do an
>arithmetic
>> shift right of precision-1 to obtain a mask of 0 or -1, then bit_and
>works
>> not just for powers of 2. Otherwise, imagining that A is int32_t and
>C
>> int64_t, all we care about is that the low 31 bits of C are 0
>(otherwise we
>> could also do an arithmetic right shift, either before or after the
>> convert). But that's another patch, fixing the PR is what matters for
>now.
>
>You're right.
>
>2016-12-09  Jakub Jelinek  
>   Marc Glisse  
>
>   PR tree-optimization/78720
>   * match.pd (A < 0 ? C : 0): Only optimize for signed A.  If shift
>   is negative, sign extend to @1's type and than AND with C.
>
>   * gcc.c-torture/execute/pr78720.c: New test.
>
>--- gcc/match.pd.jj2016-12-09 10:19:10.909735559 +0100
>+++ gcc/match.pd   2016-12-10 09:21:26.260516596 +0100
>@@ -2768,17 +2768,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (ncmp (convert:stype @0) { build_zero_cst (stype); })
> 
> /* If we have A < 0 ? C : 0 where C is a power of 2, convert
>-   this into a right shift followed by ANDing with C.  */
>+   this into a right shift or sign extension followed by ANDing with
>C.  */
> (simplify
>  (cond
>   (lt @0 integer_zerop)
>   integer_pow2p@1 integer_zerop)
>- (with {
>+ (if (!TYPE_UNSIGNED (TREE_TYPE (@0)))
>+  (with {
> int shift = element_precision (@0) - wi::exact_log2 (@1) - 1;
>-  }
>-  (bit_and
>-   (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
>-   @1)))
>+   }
>+   (if (shift >= 0)
>+(bit_and
>+ (convert (rshift @0 { build_int_cst (integer_type_node, shift);
>}))
>+ @1)
>+/* Otherwise ctype must be wider than TREE_TYPE (@0) and pure
>+   sign extension followed by AND with C will achieve the effect. 
>*/
>+(bit_and (convert @0) @1)
> 
>/* When the addresses are not directly of decls compare base and
>offset.
>This implements some remaining parts of fold_comparison address
>--- gcc/testsuite/gcc.c-torture/execute/pr78720.c.jj   2016-12-10
>09:18:43.386574179 +0100
>+++ gcc/testsuite/gcc.c-torture/execute/pr78720.c  2016-12-10
>09:18:43.386574179 +0100
>@@ -0,0 +1,29 @@
>+/* PR tree-optimization/78720 */
>+
>+__attribute__((noinline, noclone)) long int
>+foo (signed char x)
>+{
>+  return x < 0 ? 0x8L : 0L;
>+}
>+
>+__attribute__((noinline, noclone)) long int
>+bar (signed char x)
>+{
>+  return x < 0 ? 0x80L : 0L;
>+}
>+
>+__attribute__((noinline, noclone)) long int
>+baz (signed char x)
>+{
>+  return x < 0 ? 0x20L : 0L;
>+}
>+
>+int
>+main ()
>+{
>+  if (foo (-1) != 0x8L || bar (-1) != 0x80L || baz (-1) != 0x20L
>+  || foo (0) != 0L || bar (0) != 0L || baz (0) != 0L
>+  || foo (31) != 0L || bar (31) != 0L || baz (31) != 0L)
>+__builtin_abort ();
>+  return 0;
>+}
>
>
>   Jakub




Re: [PATCH] gcc: config: tilegx: Reserve prev insn when delete useless insn

2016-12-10 Thread Bernd Edlinger
On 12/10/16 12:23, cheng...@emindsoft.com.cn wrote:
> From: Chen Gang 
>
> When check bundle, gcc may still need modify the prev insn. Original
> implementation will lose the prev insn.
>
> Also correct the coding styles of relate code.
>
> 2016-12-10  Chen Gang 
>
>   gcc/
>   PR target/78222
>   * tilegx.c (tilegx_gen_bundle): Reserve prev insn when delete
>   useless insn.

I think this was already fixed by Walt:


r242617 | walt | 2016-11-19 03:34:17 +0100 (Sat, 19 Nov 2016) | 5 lines
Changed paths:
M /trunk/gcc/ChangeLog
M /trunk/gcc/config/tilegx/tilegx.c

TILE-Gx: Fix bundling when encountering consecutive barriers.

 * config/tilegx/tilegx.c (tilegx_gen_bundles): Preserve
   end-of-bundle marker for consecutive barriers.


But the formatting here is still odd, and should be fixed:
TAB usage, single statements in braces, { not in a line of its own.

I am however not sure about this statement:

   /* Never wrap {} around inline asm.  */
   if (GET_CODE (PATTERN (insn)) != ASM_INPUT)

... because this does only exclude asm(""); that is basic asm with
empty assembler string.  To exclude all other forms of asm statements
that are hidden in PARALLEL constructs we would need:

   /* Never wrap {} around inline asm.  */
   if (GET_CODE (PATTERN (insn)) != ASM_INPUT
   && asm_noperands (PATTERN (insn)) < 0)


I think this if-condition is probably unnecessary, because it does
apparently not create any problems although it is completely broken.


Bernd.


New Spanish PO file for 'gcc' (version 6.2.0)

2016-12-10 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Spanish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/es.po

(This file, 'gcc-6.2.0.es.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [RFC] combine: Improve change_zero_ext, call simplify_set afterwards.

2016-12-10 Thread Segher Boessenkool
On Fri, Dec 09, 2016 at 04:23:44PM +0100, Dominik Vogt wrote:
> 0001-*
> 
>   Deal with mode expanding zero_extracts in change_zero_ext.  The
>   patch looks good to me, but not sure whether endianness is
>   handled properly.  Is the second argument of gen_rtx_SUBREG
>   correct?

> >From 600ed3dadd5bc2568ab53be8466686abaf27ff3f Mon Sep 17 00:00:00 2001
> From: Dominik Vogt 
> Date: Fri, 9 Dec 2016 02:48:30 +0100
> Subject: [PATCH 1/2] combine: Handle mode expanding zero_extracts in
>  change_zero_ext.
> 
> Example:
> 
>   (zero_extract:DI (reg:SI)
>(const_int 24)
>(const_int 0))
> 
> -->
> 
>   (and:DI (subreg:DI (lshiftrt:SI (reg:SI) (const_int 8))
>  0)
>   (const_int 16777215))
> ---
>  gcc/combine.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/combine.c b/gcc/combine.c
> index b429453..e14a08f 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -11237,18 +11237,24 @@ change_zero_ext (rtx pat)
>if (GET_CODE (x) == ZERO_EXTRACT
> && CONST_INT_P (XEXP (x, 1))
> && CONST_INT_P (XEXP (x, 2))
> -   && GET_MODE (XEXP (x, 0)) == mode)
> +   && (GET_MODE (XEXP (x, 0)) == mode
> +   || GET_MODE_PRECISION (GET_MODE (XEXP (x, 0)))
> +  < GET_MODE_PRECISION (mode)))

If you make this <= you can collapse the two cases into one.

>   {
> +   machine_mode inner_mode = GET_MODE (XEXP (x, 0));
> +
> size = INTVAL (XEXP (x, 1));
>  
> int start = INTVAL (XEXP (x, 2));
> if (BITS_BIG_ENDIAN)
> - start = GET_MODE_PRECISION (mode) - size - start;
> + start = GET_MODE_PRECISION (inner_mode) - size - start;
>  
> if (start)
> - x = gen_rtx_LSHIFTRT (mode, XEXP (x, 0), GEN_INT (start));
> + x = gen_rtx_LSHIFTRT (inner_mode, XEXP (x, 0), GEN_INT (start));
> else
>   x = XEXP (x, 0);
> +   if (mode != inner_mode)
> + x = gen_rtx_SUBREG (mode, x, 0);

gen_lowpart_SUBREG instead?  It's easier to read, and code a little
further down does the same thing.

Okay for trunk with those changes (did you bootstrap+regcheck this
already?)  Thanks,


Segher


Re: [PATCH] [x86] Don't use builtins for SIMD shifts

2016-12-10 Thread Marc Glisse

On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:


Replaces the definitions of the shift intrinsics with GCC extension syntax to
allow GCC to reason about what the instructions does. Tests are added to
ensure the intrinsics still produce the right instructions, and that a few
basic optimizations now work.


I don't think we can do it in such a straightforward way. Those intrinsics 
are well defined for any shift argument, while operator<< and operator>> 
are considered undefined for many input values. I believe you may need to 
use an unsigned type for the lhs of left shifts, and write a << (b & 31) 
to match the semantics for the rhs (I haven't checked Intel's doc). Which 
might require adding a few patterns in sse.md to avoid generating code for 
that AND.


(also, the patch can't go in until next stage 1, around March or April, 
but that doesn't prevent from discussing it)


--
Marc Glisse


Re: [PATCH] [x86] Don't use builtins for SIMD shifts

2016-12-10 Thread Marc Glisse

On Sat, 10 Dec 2016, Marc Glisse wrote:


On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:

Replaces the definitions of the shift intrinsics with GCC extension syntax 
to

allow GCC to reason about what the instructions does. Tests are added to
ensure the intrinsics still produce the right instructions, and that a few
basic optimizations now work.


I don't think we can do it in such a straightforward way. Those intrinsics 
are well defined for any shift argument, while operator<< and operator>> are 
considered undefined for many input values. I believe you may need to use an 
unsigned type for the lhs of left shifts, and write a << (b & 31) to match 
the semantics for the rhs (I haven't checked Intel's doc). Which might 
require adding a few patterns in sse.md to avoid generating code for that 
AND.


Oups, apparently I got that wrong, got confused with the scalar case. Left 
shift by more than precision is well defined for vectors, but it gives 0, 
it doesn't take the shift count modulo the precision. Which is even harder 
to explain to gcc (maybe ((unsigned)b<=31)?a

Re: [PATCH] PR fortran/65173 -- kill off old_cl_list from gfc_namespace.

2016-12-10 Thread Paul Richard Thomas
Dear Steve and Andre,

It turns out that Andre's instrumented gfortran came up trumps. There
was a block in resolve_structure_cons that attempted to remove a
charlen and, in so doing, broke the charlen chain. I have left it to
namespace cleanup to do the job in revision 243517. The module.c stuff
was a very red fish of some kind.

Cheers

Paul

2016-12-10 Paul Thomas  

PR fortran/78350
* resolve.c (resolve_structure_cons): Remove the block that
tried to remove a charlen and rely on namespace cleanup.

On 9 December 2016 at 20:50, Steven G. Kargl
 wrote:
> On Fri, Dec 09, 2016 at 01:54:34PM +0200, Janne Blomqvist wrote:
>> On Fri, Dec 9, 2016 at 1:47 PM, Paul Richard Thomas
>>  wrote:
>>
>> I'm seeing the same thing, I guess. Or rather that the ts->type == 81
>> which is non-sensical. No idea where that comes from..
>>
>> Backtrace from gdb:
>>
>> #0  gfc_code2string (m=m@entry=0x182fe00 ,
>> code=code@entry=81) at ../../trunk-git/gcc/fortran/misc.c:193
>> #1  0x0065645f in mio_name (t=81, m=m@entry=0x182fe00
>> ) at ../../trunk-git/gcc/fortran/module.c:1722
>> #2  0x0065acb9 in mio_name_bt (m=0x182fe00 ,
>> t=) at ../../trunk-git/gcc/fortran/module.c:2106
>> #3  mio_typespec (ts=ts@entry=0x2420918) at
>> ../../trunk-git/gcc/fortran/module.c:2530
>> #4  0x0065ae50 in mio_expr (ep=0x24217d0) at
>> ../../trunk-git/gcc/fortran/module.c:3432
>> #5  0x0065b300 in mio_charlen (clp=clp@entry=0x241fa80) at
>> ../../trunk-git/gcc/fortran/module.c:2500
>> #6  0x0065ad50 in mio_typespec (ts=ts@entry=0x241fa78) at
>> ../../trunk-git/gcc/fortran/module.c:2558
>> #7  0x0065ae50 in mio_expr (ep=ep@entry=0x24208a8) at
>> ../../trunk-git/gcc/fortran/module.c:3432
>> #8  0x0065bb73 in mio_component (c=c@entry=0x2420830,
>> vtype=vtype@entry=0) at ../../trunk-git/gcc/fortran/module.c:2799
>> #9  0x0065bbdb in mio_component_list (cp=cp@entry=0x2422138,
>> vtype=0) at ../../trunk-git/gcc/fortran/module.c:2818
>> #10 0x0065c913 in mio_symbol (sym=sym@entry=0x2422090) at
>> ../../trunk-git/gcc/fortran/module.c:4238
>> #11 0x0065cb8b in write_symbol (n=2, sym=sym@entry=0x2422090)
>> at ../../trunk-git/gcc/fortran/module.c:5613
>> #12 0x0065e87f in write_symbol0 (st=0x2422680) at
>> ../../trunk-git/gcc/fortran/module.c:5653
>> #13 0x0065e7f2 in write_symbol0 (st=0x2420650) at
>> ../../trunk-git/gcc/fortran/module.c:5632
>> #14 0x0065ea03 in write_module () at
>> ../../trunk-git/gcc/fortran/module.c:5992
>> #15 0x0065ec59 in dump_module (name=name@entry=0x76853060
>> "m", dump_flag=dump_flag@entry=1) at
>> ../../trunk-git/gcc/fortran/module.c:6120
>> #16 0x0065edf8 in gfc_dump_module (name=0x76853060 "m",
>> dump_flag=1) at ../../trunk-git/gcc/fortran/module.c:6163
>> #17 0x00675edd in gfc_parse_file () at
>> ../../trunk-git/gcc/fortran/parse.c:6158
>> #18 0x006baa5b in gfc_be_parse_file () at
>> ../../trunk-git/gcc/fortran/f95-lang.c:202
>
> Ugh.  I can only reproduce the ICE under valgrind.  There is
> some interesting stuff happening in module.c
>
> --
> Steve
> http://troutmask.apl.washington.edu/~kargl/
> https://www.youtube.com/watch?v=6hwgPfCcpyQ



-- 
If you're walking down the right path and you're willing to keep
walking, eventually you'll make progress.

Barack Obama


Re: [RFC] combine: Improve change_zero_ext, call simplify_set afterwards.

2016-12-10 Thread Segher Boessenkool
On Sat, Dec 10, 2016 at 10:37:38AM -0600, Segher Boessenkool wrote:
> Okay for trunk with those changes (did you bootstrap+regcheck this
> already?)  Thanks,

Oh, and a changelog :-)


Segher


Re: [PATCH] [x86] Don't use builtins for SIMD shifts

2016-12-10 Thread Marc Glisse

On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:


On Saturday 10 December 2016, Marc Glisse wrote:

On Sat, 10 Dec 2016, Marc Glisse wrote:

On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:

Replaces the definitions of the shift intrinsics with GCC extension
syntax to
allow GCC to reason about what the instructions does. Tests are added to
ensure the intrinsics still produce the right instructions, and that a
few basic optimizations now work.


I don't think we can do it in such a straightforward way. Those
intrinsics are well defined for any shift argument, while operator<< and
operator>> are considered undefined for many input values. I believe you
may need to use an unsigned type for the lhs of left shifts, and write a
<< (b & 31) to match the semantics for the rhs (I haven't checked
Intel's doc). Which might require adding a few patterns in sse.md to
avoid generating code for that AND.


Oups, apparently I got that wrong, got confused with the scalar case. Left
shift by more than precision is well defined for vectors, but it gives 0,
it doesn't take the shift count modulo the precision. Which is even harder
to explain to gcc (maybe ((unsigned)b<=31)?a

Re: cprop fix for PR78626

2016-12-10 Thread Segher Boessenkool
On Thu, Dec 08, 2016 at 01:21:04PM +0100, Bernd Schmidt wrote:
> This is another case where an optimization turns a trap_if 
> unconditional. We have to defer changing the CFG, since the rest of 
> cprop seems to blow up when we modify things while scanning.
> 
> Bootstrapped and tested on x86_64-linux, and reportedly fixes the 
> problem on ppc, ok?

This fixes PR78626, but not yet PR78727.

> @@ -1825,11 +1828,26 @@ one_cprop_pass (void)
>  insn into a NOTE, or deleted the insn.  */
>   if (! NOTE_P (insn) && ! insn->deleted ())
> mark_oprs_set (insn);
> +
> + if (first_uncond_trap == NULL
> + && GET_CODE (PATTERN (insn)) == TRAP_IF
> + && XEXP (PATTERN (insn), 0) == const1_rtx)
> +   first_uncond_trap = insn;
> }
> +   if (first_uncond_trap != NULL && first_uncond_trap != BB_END (bb))
> + uncond_traps.safe_push (first_uncond_trap);
>   }

The problem for PR78727 is that we also need to do this for insns that
already are the last insn in the block:

> +  while (!uncond_traps.is_empty ())
> + {
> +   rtx_insn *insn = uncond_traps.pop ();
> +   basic_block to_split = BLOCK_FOR_INSN (insn);
> +   remove_edge (split_block (to_split, insn));
> +   emit_barrier_after_bb (to_split);
> + }

We need that barrier, and we also need the successor edges removed
(which split_block+remove_edge does).

(PR78727 works fine with just that BB_END test deleted).


Segher


Re: [PATCHv3] [AARCH64] Add variant support to -m="native"and add thunderxt88p1.

2016-12-10 Thread Andrew Pinski
On Sat, Nov 26, 2016 at 1:54 PM, Andrew Pinski  wrote:
> On Tue, Nov 1, 2016 at 11:08 AM, Andrew Pinski  wrote:
>> On Tue, Nov 17, 2015 at 2:10 PM, Andrew Pinski  wrote:
>>> Since ThunderX T88 pass 1 (variant 0) is a ARMv8 part while pass 2 (variant 
>>> 1)
>>> is an ARMv8.1 part, I needed to add detecting of the variant also for this
>>> difference. Also I simplify a little bit and combined the single core and
>>> arch detecting cases so it would be easier to add variant.
>>
>> Actually it is a bit more complex than what I said here, see below for
>> the full table of options and what are enabled/disabled now.
>>
>>> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>>> Tested -mcpu=native on both T88 pass 1 and T88 pass 2 to make sure it is
>>> deecting the two seperately.
>>
>>
>> Here is the final patch in this series updated; I changed the cpu name
>> slightly and made sure I updated invoke.texi too.
>>
>> The names are going to match the names in LLVM (worked with our LLVM
>> engineer here at Cavium about the names).
>> Here are the names recorded and
>> -mpcu=thunderx:
>> *Matches part num 0xA0 (reserved for ThunderX 8x series)
>> *T88 Pass 2 scheduling
>> *Hardware prefetching (software prefetching disabled)
>> *LSE enabled
>> *no v8.1
>>
>> -mcpu=thunderxt88:
>> *Matches part num 0xA1
>> *T88 Pass 2 scheduling
>> *software prefetching enabled
>> *LSE enabled
>> *no v8.1
>>
>> -mcpu=thunderxt88p1 (only for GCC):
>> *Matches part num 0xA1, variant 0
>> *T88 Pass 1 scheduling
>> *software prefetching enabled
>> *no LSE enabled
>> *no v8.1
>>
>> -mcpu=thunderxt81 and -mcpu=thunderxt83:
>> *Matches part num 0xA2/0xA3
>> *T88 Pass 2 scheduling
>> *Hardware prefetching (software prefetching disabled)
>> *LSE enabled
>> *v8.1
>>
>>
>> I have not hooked up software vs hardware prefetching and the
>> scheduler parts (the next patch will do part of that); both ARMv8.1-a
>> and LSE parts are hooked up as those parts are only in
>> aarch64-cores.def.
>>
>> OK?  Bootstrapped and tested on ThunderX T88 and ThunderX T81
>> (aarch64-linux-gnu).
>
> Here is the latest version of the patch.  Updated for the latest
> additions of "falkor".   Also added a comment about the order of
> "thunderxt88p1" and "thunderxt88".
>
> OK?  Bootstrapped and tested on arrch64-linux-gnu with no regressions.

Ping?

>
> Thanks,
> Andrew Pinski
>
> ChangeLog:
>
> * config/aarch64/aarch64-cores.def: Add -1 as the variant to all of the cores.
> (thunderx): Update to include LSE by default.
> (thunderxt88p1): New core.
> (thunderxt88): New core.
> (thunderxt81): New core.
> (thunderxt83): New core.
> * config/aarch64/driver-aarch64.c (struct aarch64_core_data): Add variant 
> field.
> (ALL_VARIANTS): New define.
> (AARCH64_CORE): Support VARIANT operand.
> (cpu_data): Likewise.
> (host_detect_local_cpu): Parse variant field of /proc/cpuinfo.  Combine the 
> arch
> and single core case and support variant searching.
> * common/config/aarch64/aarch64-common.c (AARCH64_CORE): Add VARIANT operand.
> * config/aarch64/aarch64-opts.h (AARCH64_CORE): Likewise.
> * config/aarch64/aarch64.c (AARCH64_CORE): Likewise.
> * config/aarch64/aarch64.h (AARCH64_CORE): Likewise.
> * config/aarch64/aarch64-tune.md: Regenerate.
>
> * doc/invoke.texi (AARCH64/mtune): Document thunderxt88,
> thunderxt88p1, thunderxt81, thunderxt83 as available options.


committed: add i386/t-crtstuff to i[34567]86-*-netbsdelf tmake_file

2016-12-10 Thread Krister Walfridsson
I have committed the attached patch to add i386/t-crtstuff to tmake_file 
for i[34567]86-*-netbsdelf*.


Bootstrapped and tested on i386-unknown-netbsdelf6.1 (fixes 29378 
failures)


   /Krister


2016-12-10  Krister Walfridsson  

* config.host (i[34567]86-*-netbsdelf*): Add i386/t-crtstuff to
tmake_file.Index: libgcc/config.host
===
--- libgcc/config.host  (revision 243517)
+++ libgcc/config.host  (revision 243518)
@@ -599,6 +599,7 @@
md_unwind_header=i386/freebsd-unwind.h
;;
 i[34567]86-*-netbsdelf*)
+   tmake_file="${tmake_file} i386/t-crtstuff"
;;
 x86_64-*-netbsd*)
tmake_file="${tmake_file} i386/t-crtstuff"


Re: [PATCH/AARCH64] Handle ILP32 multi-arch

2016-12-10 Thread Andrew Pinski
On Thu, Nov 10, 2016 at 6:58 PM, Andrew Pinski  wrote:
> On Tue, Oct 25, 2016 at 3:25 PM, Matthias Klose  wrote:
>> On 07.10.2016 23:08, Andrew Pinski wrote:
>>> Hi,
>>>   This patch adds ilp32 multi-arch support.  This is needed to support
>>> multi-arch on Debian like systems.
>>>
>>> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>>> Also tested with ilp32 with a newly built toolchain that supports
>>> ILP32 with Ubuntu 1604 base.
>>>
>>> Thanks,
>>> Andrew
>>>
>>> ChangeLog:
>>> * config/aarch64/t-aarch64-linux (MULTILIB_OSDIRNAMES): Handle
>>> multi-arch for ilp32.
>>
>> I can't approve that, but it looks like a reasonable change, but we should
>> document the multiarch triplet at https://wiki.debian.org/Multiarch/Tuples
>
>
> Ping?

Ping?  This is the only outstanding GCC ILP32 related patch.

>
> Thanks,
> Andrew
>
>>
>> Matthias
>>


committed: make i486 default arch for x86 NetBSD

2016-12-10 Thread Krister Walfridsson
I have committed the attached patch to make i486 the default arch on 
NetBSD in the same way as for FreeBSD, as 386 CPUs are not supported on 
any maintained version of NetBSD.


Bootstrapped and tested on i386-unknown-netbsdelf6.1

   /Krister


2016-12-10  Krister Walfridsson  

* config.gcc (i386-*-netbsd*): Make i486 the default arch on NetBSD.
Generally use cpu generic.Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 243518)
+++ gcc/config.gcc  (revision 243519)
@@ -3061,6 +3061,12 @@
 arch_without_sse2=yes
 arch_without_64bit=yes
 ;;
+  i386-*-netbsd*)
+arch=i486
+cpu=generic
+arch_without_sse2=yes
+arch_without_64bit=yes
+;;
   i386-*-*)
 arch=i386
 cpu=i386


Re: [PATCH, Fortran, pr78672, ctp1, v1] Gfortran test suite failures with a sanitized compiler

2016-12-10 Thread Mikael Morin

Hello,

Le 09/12/2016 à 11:55, Andre Vehreschild a écrit :

Hi Mikael,

thanks a lot for your comments. Note I also have added the reply to your latest
email here.

On Thu, 8 Dec 2016 23:49:57 +0100
Mikael Morin  wrote:




diff --git a/gcc/fortran/data.c b/gcc/fortran/data.c
index 139ce88..4f835b3 100644
--- a/gcc/fortran/data.c
+++ b/gcc/fortran/data.c
@@ -186,7 +186,7 @@ create_character_initializer (gfc_expr *init,
gfc_typespec *ts, for (i = 0; i < len; i++)
dest[start+i] = rvalue->representation.string[i];
 }
-  else
+  else if (rvalue->value.character.string)

This one looks fishy.
Either rvalue is a character constant and its string should be set, or
it’s not a character constant and the value.character.string should not
be accessed at all.


You are completely right. This can *only* occur when invalid-code is given to
the compiler. In this case the offending code was:

 data c / = NULL() /

The syntax may not be correct (just out of my head), but I hope you get the
idea. The sanitizers complaint was that the second argument to the memcpy below
must not be NULL. The above if () makes sure the memcpy does not get called in
this case. So this merely to prevent the compiler from ICEing on systems whose
memcpy is not robust.


Ok, I have been able to reproduce this case, and indeed the function is 
called with null() as rvalue expression.
But my comment still holds. Accessing rvalue->value.character.string 
when rvalue hasn’t type EXPR_CONSTANT is undefined.
A slightly better patch would use rvalue->expr_type == EXPR_CONSTANT as 
else if condition, but the problem remains earlier in the function in 
the assignment to len.

I think the function just should not be called if rvalue is null().




 memcpy (&dest[start], rvalue->value.character.string,
len * sizeof (gfc_char_t));

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index 8afba84..4e4d17c 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -2803,6 +2803,7 @@ compare_actual_formal (gfc_actual_arglist **ap,
gfc_formal_arglist *formal, int i, n, na;
   unsigned long actual_size, formal_size;
   bool full_array = false;
+  gfc_ref *actual_arr_ref;

   actual = *ap;

@@ -2942,37 +2943,38 @@ compare_actual_formal (gfc_actual_arglist **ap,
gfc_formal_arglist *formal, and assumed-shape dummies, the string length
needs to match exactly.  */
   if (a->expr->ts.type == BT_CHARACTER
-  && a->expr->ts.u.cl && a->expr->ts.u.cl->length
-  && a->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT
-  && f->sym->ts.u.cl && f->sym->ts.u.cl && f->sym->ts.u.cl->length
-  && f->sym->ts.u.cl->length->expr_type == EXPR_CONSTANT
-  && (f->sym->attr.pointer || f->sym->attr.allocatable
-  || (f->sym->as && f->sym->as->type == AS_ASSUMED_SHAPE))
-  && (mpz_cmp (a->expr->ts.u.cl->length->value.integer,
-   f->sym->ts.u.cl->length->value.integer) != 0))
-{
-  if (where && (f->sym->attr.pointer || f->sym->attr.allocatable))
-gfc_warning (OPT_Wargument_mismatch,
- "Character length mismatch (%ld/%ld) between
actual "
- "argument and pointer or allocatable dummy
argument "
- "%qs at %L",
- mpz_get_si
(a->expr->ts.u.cl->length->value.integer),
- mpz_get_si
(f->sym->ts.u.cl->length->value.integer),
- f->sym->name, &a->expr->where);
-  else if (where)
-gfc_warning (OPT_Wargument_mismatch,
- "Character length mismatch (%ld/%ld) between
actual "
- "argument and assumed-shape dummy argument %qs "
- "at %L",
- mpz_get_si
(a->expr->ts.u.cl->length->value.integer),
- mpz_get_si
(f->sym->ts.u.cl->length->value.integer),
- f->sym->name, &a->expr->where);
-  return 0;
-}
+ && a->expr->ts.u.cl && a->expr->ts.u.cl->length
+ && a->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT




+ && f->sym->ts.type == BT_CHARACTER && f->sym->ts.u.cl


 ^^^


+ && f->sym->ts.u.cl->length
+ && f->sym->ts.u.cl->length->expr_type == EXPR_CONSTANT
+ && (f->sym->attr.pointer || f->sym->attr.allocatable
+ || (f->sym->as && f->sym->as->type == AS_ASSUMED_SHAPE))
+ && (mpz_cmp (a->expr->ts.u.cl->length->value.integer,
+  f->sym->ts.u.cl->length->value.integer) != 0))
+   {





+ && a->expr->ts.type == BT_CHARACTER)
{
  if (where)
gfc_error ("Actual argument at %L to allocatable or "

That one was just reformatting, right?


No, the check for the correct has been added at ^^^. I agree that reformatting
and the change was not a good idea.




@@ -3039,13 +3041,28