[Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the instruction that moves pc to low register

2014-12-08 Thread Terry Guo
Hi there,

When compile below simple code:

terguo01@terry-pc01:mtpcs-frame$ cat test.c
int main(void)
{
return 0;
}

I got ICE with option -mtpcs-leaf-frame (no error if remove this option).

terguo01@terry-pc01:mtpcs-frame$
/work/terguo01/tools/gcc-arm-none-eabi-5_0-2014q4/bin/arm-none-eabi-gcc
-mtpcs-leaf-frame test.c -c -mcpu=cortex-m0plus -mthumb -da
test.c: In function 'main':
test.c:4:1: error: unrecognizable insn:
 }
 ^
(insn 20 19 21 (set (reg:SI 2 r2)
(reg:SI 15 pc)) test.c:2 -1
 (nil))
test.c:4:1: internal compiler error: in extract_insn, at recog.c:2327
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html\ for instructions.

This RTL is generated in function thumb1_expand_prologue. The expected insn
pattern is thumb1_movsi_insn in thumb1.md. And instruction like "mov r2, pc"
is a legal instruction. Because gcc returns NO_REG for PC register, so no
valid pattern to match instruction that move pc to low register. This patch
intends to add a new insn pattern to legalize such thing.

Tested with GCC regression test. No regression. Is it OK to trunk?

BR,
Terry

2014-12-08  Terry Guo  terry@arm.com

 * config/arm/predicates.md (pc_register): New to match PC register.
 * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern.

gcc/testsuite/ChangeLog:
2014-12-08  Terry Guo  terry@arm.com

 * gcc.target/arm/thumb1-mov-pc.c: New test.diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 032808c..c5ef5ed 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -361,6 +361,10 @@
   (and (match_code "smin,smax,umin,umax")
(match_test "mode == GET_MODE (op)")))
 
+(define_special_predicate "pc_register"
+  (and (match_code "reg")
+   (match_test "REGNO (op) == PC_REGNUM")))
+
 (define_special_predicate "cc_register"
   (and (match_code "reg")
(and (match_test "REGNO (op) == CC_REGNUM")
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index ddedc39..8e6057c 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1780,6 +1780,16 @@
   "
 )
 
+(define_insn "*thumb1_movpc_insn"
+  [(set (match_operand:SI 0 "low_register_operand")
+(match_operand:SI 1 "pc_register"))]
+  "TARGET_THUMB1"
+  "mov\\t%0, pc"
+  [(set_attr "length" "2")
+   (set_attr "conds" "nocond")
+   (set_attr "type"   "mov_reg")]
+)
+
 ;; NB never uses BX.
 (define_insn "*thumb1_tablejump"
   [(set (pc) (match_operand:SI 0 "register_operand" "l*r"))
diff --git a/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c 
b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c
new file mode 100644
index 000..9f94131
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c
@@ -0,0 +1,7 @@
+/* { dg-options "-mtpcs-leaf-frame -O2" } */
+/* { dg-skip-if "" { ! { arm_thumb1 } } } */
+int
+main ()
+{
+  return 0;
+}


Re: [PATCH, i386] Fix PR64003

2014-12-08 Thread Ilya Enkovich
2014-12-07 12:51 GMT+03:00 Richard Sandiford :
> Ilya Enkovich  writes:
>> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>> index 88435d6..9019ed8 100644
>> --- a/gcc/config/i386/i386.md
>> +++ b/gcc/config/i386/i386.md
>> @@ -10958,6 +10958,24 @@
>>  ;; Basic conditional jump instructions.
>>  ;; We ignore the overflow flag for signed branch instructions.
>>
>> +(define_insn "*jcc_1_bnd"
>> +  [(set (pc)
>> + (if_then_else (match_operator 1 "ix86_comparison_operator"
>> +   [(reg FLAGS_REG) (const_int 0)])
>> +   (label_ref (match_operand 0))
>> +   (pc)))]
>> +  "TARGET_MPX && ix86_bnd_prefixed_insn_p (insn)"
>> +  "bnd %+j%C1\t%l0"
>> +  [(set_attr "type" "ibr")
>> +   (set_attr "modrm" "0")
>> +   (set (attr "length")
>> +(if_then_else (and (ge (minus (match_dup 0) (pc))
>> +   (const_int -126))
>> +   (lt (minus (match_dup 0) (pc))
>> +   (const_int 128)))
>> +  (const_int 3)
>> +  (const_int 7)))])
>> +
>
> Sorry, looking at this again, shouldn't the offset be -125 rather
> than -126?  The pc in:
>
>(ge (minus (match_dup 0) (pc))
>(const_int -126))
>
> is the start of the instruction, so we need to add the length
> of the jump itself to the real minimum range of -128.

You are right.  Similarly upper bound should be changed from 128 to 129.

Thanks,
Ilya

>
> Thanks,
> Richard


Re: [PATCH] Finish up string builtin folding move

2014-12-08 Thread Richard Biener
On Thu, 4 Dec 2014, Richard Biener wrote:

> 
> This is the last patch, moving stpcpy folding.  There are still
> string function foldings left but those are exclusively GENERIC
> now with no chance of advertedly recursing from GIMPLE folding
> via GENERIC folding back to GIMPLE folding.  I'll deal with those
> during next stage1.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

The following is what I have applied.

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Richard.

2014-12-08  Richard Biener  

* builtins.c (fold_builtin_0): Remove unused ignore parameter.
(fold_builtin_1): Likewise.
(fold_builtin_3): Likewise.
(fold_builtin_varargs): Likewise.
(fold_builtin_2): Likewise.  Do not fold stpcpy here.
(fold_builtin_n): Adjust.
(fold_builtin_stpcpy): Move to gimple-fold.c.
(gimple_fold_builtin_stpcpy): Moved and gimplified from builtins.c.
(gimple_fold_builtin): Fold stpcpy here.

Index: trunk/gcc/builtins.c
===
*** trunk.orig/gcc/builtins.c   2014-12-04 14:24:58.265803955 +0100
--- trunk/gcc/builtins.c2014-12-04 14:49:45.367752467 +0100
*** static tree fold_builtin_fabs (location_
*** 191,201 
  static tree fold_builtin_abs (location_t, tree, tree);
  static tree fold_builtin_unordered_cmp (location_t, tree, tree, tree, enum 
tree_code,
enum tree_code);
! static tree fold_builtin_0 (location_t, tree, bool);
! static tree fold_builtin_1 (location_t, tree, tree, bool);
! static tree fold_builtin_2 (location_t, tree, tree, tree, bool);
! static tree fold_builtin_3 (location_t, tree, tree, tree, tree, bool);
! static tree fold_builtin_varargs (location_t, tree, tree*, int, bool);
  
  static tree fold_builtin_strpbrk (location_t, tree, tree, tree);
  static tree fold_builtin_strstr (location_t, tree, tree, tree);
--- 191,201 
  static tree fold_builtin_abs (location_t, tree, tree);
  static tree fold_builtin_unordered_cmp (location_t, tree, tree, tree, enum 
tree_code,
enum tree_code);
! static tree fold_builtin_0 (location_t, tree);
! static tree fold_builtin_1 (location_t, tree, tree);
! static tree fold_builtin_2 (location_t, tree, tree, tree);
! static tree fold_builtin_3 (location_t, tree, tree, tree, tree);
! static tree fold_builtin_varargs (location_t, tree, tree*, int);
  
  static tree fold_builtin_strpbrk (location_t, tree, tree, tree);
  static tree fold_builtin_strstr (location_t, tree, tree, tree);
*** fold_builtin_exponent (location_t loc, t
*** 8657,8703 
return NULL_TREE;
  }
  
- /* Fold function call to builtin stpcpy with arguments DEST and SRC.
-Return NULL_TREE if no simplification can be made.  */
- 
- static tree
- fold_builtin_stpcpy (location_t loc, tree fndecl, tree dest, tree src)
- {
-   tree fn, len, lenp1, call, type;
- 
-   if (!validate_arg (dest, POINTER_TYPE)
-   || !validate_arg (src, POINTER_TYPE))
- return NULL_TREE;
- 
-   len = c_strlen (src, 1);
-   if (!len
-   || TREE_CODE (len) != INTEGER_CST)
- return NULL_TREE;
- 
-   if (optimize_function_for_size_p (cfun)
-   /* If length is zero it's small enough.  */
-   && !integer_zerop (len))
- return NULL_TREE;
- 
-   fn = builtin_decl_implicit (BUILT_IN_MEMCPY);
-   if (!fn)
- return NULL_TREE;
- 
-   lenp1 = size_binop_loc (loc, PLUS_EXPR,
- fold_convert_loc (loc, size_type_node, len),
- build_int_cst (size_type_node, 1));
-   /* We use dest twice in building our expression.  Save it from
-  multiple expansions.  */
-   dest = builtin_save_expr (dest);
-   call = build_call_expr_loc (loc, fn, 3, dest, src, lenp1);
- 
-   type = TREE_TYPE (TREE_TYPE (fndecl));
-   dest = fold_build_pointer_plus_loc (loc, dest, len);
-   dest = fold_convert_loc (loc, type, dest);
-   dest = omit_one_operand_loc (loc, type, dest, call);
-   return dest;
- }
- 
  /* Fold function call to builtin memchr.  ARG1, ARG2 and LEN are the
 arguments to the call, and TYPE is its return type.
 Return NULL_TREE if no simplification can be made.  */
--- 8657,8662 
*** fold_builtin_arith_overflow (location_t
*** 9857,9867 
  }
  
  /* Fold a call to built-in function FNDECL with 0 arguments.
!IGNORE is true if the result of the function call is ignored.  This
!function returns NULL_TREE if no simplification was possible.  */
  
  static tree
! fold_builtin_0 (location_t loc, tree fndecl, bool ignore ATTRIBUTE_UNUSED)
  {
tree type = TREE_TYPE (TREE_TYPE (fndecl));
enum built_in_function fcode = DECL_FUNCTION_CODE (fndecl);
--- 9816,9825 
  }
  
  /* Fold a call to built-in function FNDECL with 0 arguments.
!This function returns NULL_TREE if no simplification was possible.  */
  
  static tree
! fold_builtin_0 (location_t loc, tre

Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables

2014-12-08 Thread Richard Biener
On Thu, Dec 4, 2014 at 8:35 PM, Ilya Verbin  wrote:
> Hi,
>
> On 30 Sep 18:53, Ilya Verbin wrote:
>> This patch creates 2 vectors with decls: offload_funcs and offload_vars.
>> libgomp will use addresses from these arrays to look up offloaded code.
>>
>> During the compilation they are outputted to:
>> * binary __gnu_offload_funcs/vars sections, or using
>>   targetm.record_offload_symbol hook for PTX.
>
> In some cases LTO may optimize out a global variable, declared as target, but
> it still will be referenced from the offload table, that will cause a linking
> error.  Here is the example:
>
> #pragma omp declare target
> int G;
> #pragma omp end declare target

So where is that "magic" target use then?  Why doesn't the symtab
reachability code see the use?  (why don't we prune it from the offload
table?)

Richard.

> int main ()
> {
>   int res = 0;
>
>   #pragma omp target map(alloc: G) map(from: res)
> {
>   G = 1;
>   res = G;
> }
>
>   return res;
> }
>
> $ gcc -fopenmp -flto -O1 test.c
> xxx.ltrans0.ltrans.o:.offload_var_table.3973: error: undefined reference to 
> 'G'
>
> This issue can be resolved by forcing output of such variables.
> Is this fix ok?  Should I add a testcase?
>
>
> diff --git a/gcc/varpool.c b/gcc/varpool.c
> index 0526b7f..db28c2a 100644
> --- a/gcc/varpool.c
> +++ b/gcc/varpool.c
> @@ -175,6 +175,7 @@ varpool_node::get_create (tree decl)
>g->have_offload = true;
>if (!in_lto_p)
> vec_safe_push (offload_vars, decl);
> +  node->force_output = 1;
>  #endif
>  }
>
>
> Thanks,
>   -- Ilya


Re: Fix sreal::shift add to_double conversion

2014-12-08 Thread Richard Biener
On Thu, Dec 4, 2014 at 9:06 PM, Jan Hubicka  wrote:
>> On December 4, 2014 7:00:12 PM CET, Jan Hubicka  wrote:
>> >Hi,
>> >this patch fixes implementation of shift in sreal.h that ICEs on 0
>> >value.
>> >Also the comment is copied from shift_right and does not make sense for
>> >sreal.h.
>> >
>> >The to_double conversion is useful for debug output: Most of inliner
>> >data is
>> >now output as integer that does not make much sense.
>>
>> Can't you use dump for this?
>
> Well, dump will get you somewhat less readable form like 1234*2^3 and also it 
> makes it hard to dump
> multiple values with single printf.
>   if (dump)
> {
>   sreal num,den;
>   relative_time_benefit (callee_info, edge, edge_time, &num, &den);
>   fprintf (dump_file,
>"  %f: guessed profile. frequency %f,"
>" benefit %f%%, time w/o inlining %f, time w inlining %f"
>" overall growth %i (current) %i (original)\n",
>badness.to_double (), (double)edge->frequency / 
> CGRAPH_FREQ_BASE,
>(num/den).to_double () * 100,
>compute_uninlined_call_time (callee_info, edge).to_double 
> (),
>compute_inlined_call_time (edge, edge_time).to_double (),
>estimate_growth (callee),
>callee_info->growth);
> }
>
> would look even more ugly if done with dump method.

Hmm, ok.  Please add a comment before to_double that it is only
supposed to be used for debug dumping.

Richard.

> Honza
>>
>> Richard.
>>
>> >Bootstrapped/regtested x86_64-linux, OK?
>> >
>> > * sreal.h (to_double): Declare.
>> > (shift): Do not ICE on 0, update outdated comment.
>> > * sreal.c: Include math.h
>> > (sreal::to_double): New function.
>> >Index: sreal.h
>> >===
>> >--- sreal.h  (revision 218286)
>> >+++ sreal.h  (working copy)
>> >@@ -53,6 +53,7 @@
>> >
>> >   void dump (FILE *) const;
>> >   int64_t to_int () const;
>> >+  double to_double () const;
>> >   sreal operator+ (const sreal &other) const;
>> >   sreal operator- (const sreal &other) const;
>> >   sreal operator* (const sreal &other) const;
>> >@@ -93,12 +94,14 @@
>> >
>> >   sreal shift (int s) const
>> >   {
>> >+/* Zero needs no shifting.  */
>> >+if (!m_sig)
>> >+  return *this;
>> > gcc_checking_assert (s <= SREAL_BITS);
>> > gcc_checking_assert (s >= -SREAL_BITS);
>> >
>> >-/* Exponent should never be so large because shift_right is used
>> >only by
>> >- sreal_add and sreal_sub ant thus the number cannot be shifted out
>> >from
>> >- exponent range.  */
>> >+/* Overflows/drop to 0 could be handled gracefully, but hopefully
>> >we do not
>> >+   need to do so.  */
>> > gcc_checking_assert (m_exp + s <= SREAL_MAX_EXP);
>> > gcc_checking_assert (m_exp + s >= -SREAL_MAX_EXP);
>> >
>> >Index: sreal.c
>> >===
>> >--- sreal.c  (revision 218286)
>> >+++ sreal.c  (working copy)
>> >@@ -47,6 +47,7 @@
>> > sig == 0 && exp == -SREAL_MAX_EXP
>> > */
>> >
>> >+#include 
>> > #include "config.h"
>> > #include "system.h"
>> > #include "coretypes.h"
>> >@@ -167,6 +168,19 @@
>> >   return sign * m_sig;
>> > }
>> >
>> >+/* Return value of *this as double.  */
>> >+
>> >+double
>> >+sreal::to_double () const
>> >+{
>> >+  double val = m_sig;
>> >+  if (m_negative)
>> >+val = -val;
>> >+  if (m_exp)
>> >+val *= exp2 (m_exp);
>> >+  return val;
>> >+}
>> >+
>> > /* Return *this + other.  */
>> >
>> > sreal
>>


Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables

2014-12-08 Thread Ilya Verbin
2014-12-08 12:18 GMT+03:00 Richard Biener :
> On Thu, Dec 4, 2014 at 8:35 PM, Ilya Verbin  wrote:
>> Hi,
>>
>> On 30 Sep 18:53, Ilya Verbin wrote:
>>> This patch creates 2 vectors with decls: offload_funcs and offload_vars.
>>> libgomp will use addresses from these arrays to look up offloaded code.
>>>
>>> During the compilation they are outputted to:
>>> * binary __gnu_offload_funcs/vars sections, or using
>>>   targetm.record_offload_symbol hook for PTX.
>>
>> In some cases LTO may optimize out a global variable, declared as target, but
>> it still will be referenced from the offload table, that will cause a linking
>> error.  Here is the example:
>>
>> #pragma omp declare target
>> int G;
>> #pragma omp end declare target
>
> So where is that "magic" target use then?  Why doesn't the symtab
> reachability code see the use?  (why don't we prune it from the offload
> table?)
>
> Richard.

Nowhere on host-side, but it can remain non-optimized on target-side,
therefore we can not remove it only from the host offload table.

  -- Ilya


Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables

2014-12-08 Thread Jakub Jelinek
On Mon, Dec 08, 2014 at 01:28:10PM +0400, Ilya Verbin wrote:
> 2014-12-08 12:18 GMT+03:00 Richard Biener :
> > On Thu, Dec 4, 2014 at 8:35 PM, Ilya Verbin  wrote:
> >> Hi,
> >>
> >> On 30 Sep 18:53, Ilya Verbin wrote:
> >>> This patch creates 2 vectors with decls: offload_funcs and offload_vars.
> >>> libgomp will use addresses from these arrays to look up offloaded code.
> >>>
> >>> During the compilation they are outputted to:
> >>> * binary __gnu_offload_funcs/vars sections, or using
> >>>   targetm.record_offload_symbol hook for PTX.
> >>
> >> In some cases LTO may optimize out a global variable, declared as target, 
> >> but
> >> it still will be referenced from the offload table, that will cause a 
> >> linking
> >> error.  Here is the example:
> >>
> >> #pragma omp declare target
> >> int G;
> >> #pragma omp end declare target
> >
> > So where is that "magic" target use then?  Why doesn't the symtab
> > reachability code see the use?  (why don't we prune it from the offload
> > table?)
> >
> > Richard.
> 
> Nowhere on host-side, but it can remain non-optimized on target-side,
> therefore we can not remove it only from the host offload table.

Note, the two tables have to match, and after streaming the offloading LTO
IL, it can't be easily adjusted anymore.
Unless we'd allow some magic value (say NULL) for vars that were optimized
away on the host side, then libgomp and/or mkoffload would need to be
adjusted to ignore or remove pairs from the table where the host var has
been optimized away, but the target one not necessarily so.

Jakub


Re: [PATCH, CHKP] Don't try to optimize bounds returned by strchr

2014-12-08 Thread Ilya Enkovich
2014-12-06 1:55 GMT+03:00 Jeff Law :
> On 12/02/14 06:40, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> For strchr calls bounds of the first argument are considered as returned
>> which is wrong because NULL may be returned.  This patch fixes that.
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?
>>
>> Thanks,
>> Ilya
>> --
>> 2014-12-02  Ilya Enkovich  
>>
>> * tree-chkp.c (chkp_build_returned_bound): Don't predict
>> return bounds for strchr calls.
>
> OK.
>
> Do you have a testcase you could add to the suite?

I cannot add a nice test for bounds returned by strchr for NULL until
it is wrapped (which will happen eventually).

Thanks,
Ilya


>
> Jeff
>


Re: [PATCH] [AArch64, RTL] Bics instruction generation for aarch64

2014-12-08 Thread Richard Earnshaw
On 11/11/14 10:38, Alex Velenko wrote:
>  From 98bb6d7323ce79e28be8ef892b919391ed857e1f Mon Sep 17 00:00:00 2001
> From: Alex Velenko 
> Date: Fri, 31 Oct 2014 18:43:32 +
> Subject: [PATCH] [AArch64, RTL] Bics instruction generation for aarch64
> 
> Hi,
> 
> This patch adds rtl patterns for aarch64 to generate bics instructions in
> cases when caputed value gets discarded and only only the status regester
> change of the instruction gets reused.
> 
> Previously, bics would only be generated, if the value computed by bics
> would later be reused, which is not necessarily the case when computing
> this value for "if" statements.
> 
> Is this patch ok?
> 
> Thanks,
> Alex
> 
> gcc/
> 
> 2014-11-10  Alex Velenko  
> 
>  * gcc/config/aarch64/aarch64.md 
> (and_one_cmpl3_compare0_no_reuse):
>New define_insn.
>  * (and_one_cmpl_3_compare0_no_reuse):
>Likewise.
> 
> gcc/testsuite/
> 
> 2014-11-10  Alex Velenko  
> 
>  * gcc.target/aarch64/bics1.c : New testcase.

OK.

R.

> ---
>   gcc/config/aarch64/aarch64.md | 26 
>   gcc/testsuite/gcc.target/aarch64/bics_3.c | 69 
> +++
>   2 files changed, 95 insertions(+)
>   create mode 100644 gcc/testsuite/gcc.target/aarch64/bics_3.c
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 341c26f..6158d82 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -2845,6 +2845,18 @@
> [(set_attr "type" "logics_reg")]
>   )
> 
> +(define_insn "*and_one_cmpl3_compare0_no_reuse"
> +  [(set (reg:CC_NZ CC_REGNUM)
> +(compare:CC_NZ
> + (and:GPI (not:GPI
> +   (match_operand:GPI 0 "register_operand" "r"))
> +  (match_operand:GPI 1 "register_operand" "r"))
> + (const_int 0)))]
> +  ""
> +  "bics\\tzr, %1, %0"
> +  [(set_attr "type" "logics_reg")]
> +)
> +
>   (define_insn "*_one_cmpl_3"
> [(set (match_operand:GPI 0 "register_operand" "=r")
>   (LOGICAL:GPI (not:GPI
> @@ -2894,6 +2906,20 @@
> [(set_attr "type" "logics_shift_imm")]
>   )
> 
> +(define_insn "*and_one_cmpl_3_compare0_no_reuse"
> +  [(set (reg:CC_NZ CC_REGNUM)
> +(compare:CC_NZ
> + (and:GPI (not:GPI
> +   (SHIFT:GPI
> +(match_operand:GPI 0 "register_operand" "r")
> +(match_operand:QI 1 "aarch64_shift_imm_" "n")))
> +  (match_operand:GPI 2 "register_operand" "r"))
> + (const_int 0)))]
> +  ""
> +  "bics\\tzr, %2, %0,  %1"
> +  [(set_attr "type" "logics_shift_imm")]
> +)
> +
>   (define_insn "clz2"
> [(set (match_operand:GPI 0 "register_operand" "=r")
>   (clz:GPI (match_operand:GPI 1 "register_operand" "r")))]
> diff --git a/gcc/testsuite/gcc.target/aarch64/bics_3.c 
> b/gcc/testsuite/gcc.target/aarch64/bics_3.c
> new file mode 100644
> index 000..ecb53e9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/bics_3.c
> @@ -0,0 +1,69 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 --save-temps" } */
> +
> +extern void abort (void);
> +
> +int __attribute__ ((noinline))
> +bics_si_test (int a, int b)
> +{
> +  if (a & ~b)
> +return 1;
> +  else
> +return 0;
> +}
> +
> +int __attribute__ ((noinline))
> +bics_si_test2 (int a, int b)
> +{
> +  if (a & ~ (b << 2))
> +return 1;
> +  else
> +return 0;
> +}
> +
> +typedef long long s64;
> +
> +int __attribute__ ((noinline))
> +bics_di_test (s64 a, s64 b)
> +{
> +  if (a & ~b)
> +return 1;
> +  else
> +return 0;
> +}
> +
> +int __attribute__ ((noinline))
> +bics_di_test2 (s64 a, s64 b)
> +{
> +  if (a & ~(b << 2))
> +return 1;
> +  else
> +return 0;
> +}
> +
> +int
> +main (void)
> +{
> +  int a = 5;
> +  int b = 5;
> +  int c = 20;
> +  s64 d = 5;
> +  s64 e = 5;
> +  s64 f = 20;
> +  if (bics_si_test (a, b))
> +abort ();
> +  if (bics_si_test2 (c, b))
> +abort ();
> +  if (bics_di_test (d, e))
> +abort ();
> +  if (bics_di_test2 (f, e))
> +abort ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+" 
> 2 } } */
> +/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+, 
> lsl 2" 1 } } */
> +/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+" 
> 2 } } */
> +/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+, 
> lsl 2" 1 } } */
> +
> +/* { dg-final { cleanup-saved-temps } } */
> 




[match-and-simplify] Merge from trunk

2014-12-08 Thread Richard Biener

2014-12-08  Richard Biener  

Merge from trunk  r218262 through r218477.



[PATCH] Fix PR ipa/64049

2014-12-08 Thread Bernd Edlinger
Hello,


this surprisingly simple patch fixes PR ipa/64049.  The root cause seems to be
that in this test case we try to devirtualize a method call on a return value.


Boot-Strapped and regression-tested on X86_64-linux-gnu.
OK for trunk?


Thanks
Bernd.
  

patch-pr64049.diff
Description: Binary data


patch-pr64049.diff
Description: Binary data


FW: [PATCH] Fix PR ipa/64049

2014-12-08 Thread Bernd Edlinger


>
> Hello,
>
>
> this surprisingly simple patch fixes PR ipa/64049.  The root cause seems to be
> that in this test case we try to devirtualize a method call on a return value.
>
>
> Boot-Strapped and regression-tested on X86_64-linux-gnu.
> OK for trunk?
>
>
> Thanks
> Bernd.
>

Again with changelog. Sorry.
  

patch-pr64049.diff
Description: Binary data
2014-12-08  Bernd Edlinger  

PR ipa/64049
* ipa-polymorphic-call.c
(pa_polymorphic_call_context::ipa_polymorphic_call): Allow RESULT_DECL.

testsuite/ChangeLog:
2014-12-08  Bernd Edlinger  

PR ipa/64049
* g++.dg/ipa/pr64049.h: New.
* g++.dg/ipa/pr64049-1.C: New.
* g++.dg/ipa/pr64049-2.C: New.



Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name

2014-12-08 Thread Ilya Enkovich
Ping

2014-11-26 15:35 GMT+03:00 Ilya Enkovich :
> On 25 Nov 15:03, Ilya Enkovich wrote:
>> 2014-11-25 14:11 GMT+03:00 Richard Biener :
>> > On Tue, Nov 25, 2014 at 11:19 AM, Ilya Enkovich  
>> > wrote:
>> >
>> > Ok, then it's get_for_asmname ().  That said - the above loops look
>> > bogus to me.  Honza - any better ideas?
>>
>> get_for_asmname ()  returns the first element in a chain of nodes with
>> the same asm name.  May I rely on the order of nodes in this chain?
>> Probably use ASSEMBLER_NAME as a key in chkp_static_var_bounds hash?
>>
>> Thanks,
>> Ilya
>>
>> >
>> > Richard.
>> >
>
> A variant with var's ASSEMBLER_NAME as a key works fine.  Instrumented 
> bootstrap passes.  OK for trunk?
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2014-11-26  Ilya Enkovich  
>
> PR bootstrap/63995
> * tree-chkp.c (chkp_make_static_bounds): Share bounds var
> between nodes sharing assembler name.
>
> gcc/testsuite
>
> 2014-11-26  Ilya Enkovich  
>
> PR bootstrap/63995
> * g++.dg/dg.exp: Add mpx-dg.exp.
> * g++.dg/pr63995-1.C: New.
>
>
> diff --git a/gcc/testsuite/g++.dg/dg.exp b/gcc/testsuite/g++.dg/dg.exp
> index 14beae1..44eab0c 100644
> --- a/gcc/testsuite/g++.dg/dg.exp
> +++ b/gcc/testsuite/g++.dg/dg.exp
> @@ -18,6 +18,7 @@
>
>  # Load support procs.
>  load_lib g++-dg.exp
> +load_lib mpx-dg.exp
>
>  # If a testcase doesn't have special options, use these.
>  global DEFAULT_CXXFLAGS
> diff --git a/gcc/testsuite/g++.dg/pr63995-1.C 
> b/gcc/testsuite/g++.dg/pr63995-1.C
> new file mode 100644
> index 000..82e7606
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr63995-1.C
> @@ -0,0 +1,16 @@
> +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
> +/* { dg-require-effective-target mpx } */
> +/* { dg-options "-O2 -g -fcheck-pointer-bounds -mmpx" } */
> +
> +int test1 (int i)
> +{
> +  extern const int arr[10];
> +  return arr[i];
> +}
> +
> +extern const int arr[10];
> +
> +int test2 (int i)
> +{
> +  return arr[i];
> +}
> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
> index 3e38691..924cb71 100644
> --- a/gcc/tree-chkp.c
> +++ b/gcc/tree-chkp.c
> @@ -2727,9 +2727,23 @@ chkp_make_static_bounds (tree obj)
>/* First check if we already have required var.  */
>if (chkp_static_var_bounds)
>  {
> -  slot = chkp_static_var_bounds->get (obj);
> -  if (slot)
> -   return *slot;
> +  /* For vars we use assembler name as a key in
> +chkp_static_var_bounds map.  It allows to
> +avoid duplicating bound vars for decls
> +sharing assembler name.  */
> +  if (TREE_CODE (obj) == VAR_DECL)
> +   {
> + tree name = DECL_ASSEMBLER_NAME (obj);
> + slot = chkp_static_var_bounds->get (name);
> + if (slot)
> +   return *slot;
> +   }
> +  else
> +   {
> + slot = chkp_static_var_bounds->get (obj);
> + if (slot)
> +   return *slot;
> +   }
>  }
>
>/* Build decl for bounds var.  */
> @@ -2793,7 +2807,13 @@ chkp_make_static_bounds (tree obj)
>if (!chkp_static_var_bounds)
>  chkp_static_var_bounds = new hash_map;
>
> -  chkp_static_var_bounds->put (obj, bnd_var);
> +  if (TREE_CODE (obj) == VAR_DECL)
> +{
> +  tree name = DECL_ASSEMBLER_NAME (obj);
> +  chkp_static_var_bounds->put (name, bnd_var);
> +}
> +  else
> +chkp_static_var_bounds->put (obj, bnd_var);
>
>return bnd_var;
>  }


Re: [PATCH, i386] Fix PR64003

2014-12-08 Thread Ilya Enkovich
On 08 Dec 11:30, Ilya Enkovich wrote:
> 2014-12-07 12:51 GMT+03:00 Richard Sandiford :
> > Ilya Enkovich  writes:
> >> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> >> index 88435d6..9019ed8 100644
> >> --- a/gcc/config/i386/i386.md
> >> +++ b/gcc/config/i386/i386.md
> >> @@ -10958,6 +10958,24 @@
> >>  ;; Basic conditional jump instructions.
> >>  ;; We ignore the overflow flag for signed branch instructions.
> >>
> >> +(define_insn "*jcc_1_bnd"
> >> +  [(set (pc)
> >> + (if_then_else (match_operator 1 "ix86_comparison_operator"
> >> +   [(reg FLAGS_REG) (const_int 0)])
> >> +   (label_ref (match_operand 0))
> >> +   (pc)))]
> >> +  "TARGET_MPX && ix86_bnd_prefixed_insn_p (insn)"
> >> +  "bnd %+j%C1\t%l0"
> >> +  [(set_attr "type" "ibr")
> >> +   (set_attr "modrm" "0")
> >> +   (set (attr "length")
> >> +(if_then_else (and (ge (minus (match_dup 0) (pc))
> >> +   (const_int -126))
> >> +   (lt (minus (match_dup 0) (pc))
> >> +   (const_int 128)))
> >> +  (const_int 3)
> >> +  (const_int 7)))])
> >> +
> >
> > Sorry, looking at this again, shouldn't the offset be -125 rather
> > than -126?  The pc in:
> >
> >(ge (minus (match_dup 0) (pc))
> >(const_int -126))
> >
> > is the start of the instruction, so we need to add the length
> > of the jump itself to the real minimum range of -128.
> 
> You are right.  Similarly upper bound should be changed from 128 to 129.
> 
> Thanks,
> Ilya
> 
> >
> > Thanks,
> > Richard

Here is a patch to fix it.

Thanks,
Ilya
--
2014-12-08  Ilya Enkovich  

PR target/64003
* config/i386/i386.md (*jcc_1_bnd): Fix short jump range.
(*jcc_2_bnd): Likewise.
(jump_bnd): Likewise.


diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 9019ed8..eafba04 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -10970,9 +10970,9 @@
(set_attr "modrm" "0")
(set (attr "length")
   (if_then_else (and (ge (minus (match_dup 0) (pc))
- (const_int -126))
+ (const_int -125))
  (lt (minus (match_dup 0) (pc))
- (const_int 128)))
+ (const_int 129)))
 (const_int 3)
 (const_int 7)))])
 
@@ -11006,9 +11006,9 @@
(set_attr "modrm" "0")
(set (attr "length")
   (if_then_else (and (ge (minus (match_dup 0) (pc))
- (const_int -126))
+ (const_int -125))
  (lt (minus (match_dup 0) (pc))
- (const_int 128)))
+ (const_int 129)))
 (const_int 3)
 (const_int 7)))])
 
@@ -11464,9 +11464,9 @@
   [(set_attr "type" "ibr")
(set (attr "length")
   (if_then_else (and (ge (minus (match_dup 0) (pc))
- (const_int -126))
+ (const_int -125))
  (lt (minus (match_dup 0) (pc))
- (const_int 128)))
+ (const_int 129)))
 (const_int 3)
 (const_int 6)))
(set_attr "modrm" "0")])


[PATCH][wwwdocs] Fix typo in changes.html and minor rewording

2014-12-08 Thread Kyrill Tkachov

Hi all,

As the subject says, this just fixes a typo in the fprofile-generate 
option name and rewords the text in the next sentence a bit.


Ok to commit?

Thanks,
KyrillIndex: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.48
diff -U 3 -r1.48 changes.html
--- htdocs/gcc-5/changes.html	8 Dec 2014 03:17:31 -	1.48
+++ htdocs/gcc-5/changes.html	8 Dec 2014 10:47:05 -
@@ -41,8 +41,8 @@
 	 inline-insns-single limits for hot calls.
  IPA reference pass was significantly sped up making it feasible
 	 to enable -fipa-reference with
-	 -fprofile-generage. This also solve bottleneck
-	 seen when optimizing Chromium with link time optimization.
+	 -fprofile-generate. This also solves a bottleneck
+	 seen when building Chromium with link time optimization.
  Symbol table and call-graph API was reworked to C++ and
 	 simplified.
 

RE: [PATCH][wwwdocs] Fix typo in changes.html and minor rewording

2014-12-08 Thread Bernd Edlinger
Hi Kyrill,


> Hi all,
> 
> As the subject says, this just fixes a typo in the fprofile-generate 
> option name and rewords the text in the next sentence a bit.
> Ok to commit?
> 
> Thanks,
> Kyrill



I think this kind of change does not really need approval.
But, maybe you should also look at fixing the spelling of "sped": s/sped/speed/

Bernd.

--- htdocs/gcc-5/changes.html   8 Dec 2014 03:17:31 -   1.48
+++ htdocs/gcc-5/changes.html   8 Dec 2014 10:47:05 -
@@ -41,8 +41,8 @@
 inline-insns-single limits for hot calls.
  IPA reference pass was significantly sped up making it feasible
 to enable -fipa-reference with
  

Re: [PATCH][wwwdocs] Fix typo in changes.html and minor rewording

2014-12-08 Thread Ramana Radhakrishnan
On Mon, Dec 8, 2014 at 11:24 AM, Bernd Edlinger
 wrote:
> Hi Kyrill,
>
>
>> Hi all,
>>
>> As the subject says, this just fixes a typo in the fprofile-generate
>> option name and rewords the text in the next sentence a bit.
>> Ok to commit?
>>
>> Thanks,
>> Kyrill
>
>
>
> I think this kind of change does not really need approval.
> But, maybe you should also look at fixing the spelling of "sped": 
> s/sped/speed/

sped is past tense and past participle for speed. The usage is correct here.

Ramana

>
> Bernd.
>
> --- htdocs/gcc-5/changes.html   8 Dec 2014 03:17:31 -   1.48
> +++ htdocs/gcc-5/changes.html   8 Dec 2014 10:47:05 -
> @@ -41,8 +41,8 @@
>  inline-insns-single limits for hot calls.
>   IPA reference pass was significantly sped up making it feasible
>  to enable -fipa-reference with
>


[PATCH][4.8] Backport ISL 0.14 support

2014-12-08 Thread Richard Biener
I have installed the following patch on the 4.8 branch after verifying
it still builds ok and enables graphite with system ISL 0.14 and
system ISL 0.12.2 (and the corresponding cloog).

Richard.

2014-12-08  Richard Biener  

Backport from 4.9 branch
2014-12-04  Tobias Burnus  

* configure.ac: Permit also ISL 0.14 with CLooG.
* Makefile.def: Make more dependent on mpfr, mpc, isl, and cloog.
* Makefile.in: Regenerate.
* configure: Regenerate.

gcc/
* configure.ac
(ac_has_isl_schedule_constraints_compute_schedule):
New check.
* graphite-clast-to-gimple.c: For ISL 0.14, include deprecate headers.
* graphite-interchange.c: Ditto.
* graphite-poly.c: Ditto.
* graphite-sese-to-poly.c: Ditto.
* graphite-optimize-isl.c (getScheduleForBandList): Ditto.
Conditionally use ISL 0.13+ functions.
* config.in: Regenerate.
* configure: Regenerate.

Index: configure.ac
===
--- configure.ac(revision 218391)
+++ configure.ac(revision 218392)
@@ -1658,6 +1658,9 @@ if test "x$with_isl" != "xno" &&
 ISL_CHECK_VERSION(0,11)
 if test "${gcc_cv_isl}" = no ; then
   ISL_CHECK_VERSION(0,12)
+  if test "${gcc_cv_isl}" = no ; then
+ISL_CHECK_VERSION(0,14)
+  fi
 fi
   fi
   dnl Only execute fail-action, if ISL has been requested.
Index: gcc/config.in
===
--- gcc/config.in   (revision 218391)
+++ gcc/config.in   (revision 218392)
@@ -1211,6 +1211,12 @@
 #endif
 
 
+/* Define if isl_schedule_constraints_compute_schedule exists. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE
+#endif
+
+
 /* Define to 1 if you have the `kill' function. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_KILL
Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 218391)
+++ gcc/configure.ac(revision 218392)
@@ -5495,8 +5495,31 @@ AC_ARG_VAR(CLOOGLIBS,[How to link CLOOG]
 AC_ARG_VAR(CLOOGINC,[How to find CLOOG include files])
 if test "x${CLOOGLIBS}" != "x" ; then 
AC_DEFINE(HAVE_cloog, 1, [Define if cloog is in use.])
+
+  # Check whether isl_schedule_constraints_compute_schedule is available;
+  # it's new in ISL-0.13.
+  saved_CFLAGS="$CFLAGS"
+  CFLAGS="$CFLAGS $ISLINC"
+  saved_LIBS="$LIBS"
+  LIBS="$LIBS $CLOOGLIBS $ISLLIBS $GMPLIBS"
+
+  AC_MSG_CHECKING([Checking for isl_schedule_constraints_compute_schedule])
+  AC_TRY_LINK([#include ],
+  [isl_schedule_constraints_compute_schedule (NULL);],
+  [ac_has_isl_schedule_constraints_compute_schedule=yes],
+  [ac_has_isl_schedule_constraints_compute_schedule=no])
+  AC_MSG_RESULT($ac_has_isl_schedule_constraints_compute_schedule)
+
+  LIBS="$saved_LIBS"
+  CFLAGS="$saved_CFLAGS"
+
+  if test x"$ac_has_isl_schedule_constraints_compute_schedule" = x"yes"; then
+ AC_DEFINE(HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE, 1,
+   [Define if isl_schedule_constraints_compute_schedule exists.])
+  fi
 fi
 
+
 # Check for plugin support
 AC_ARG_ENABLE(plugin,
 [AS_HELP_STRING([--enable-plugin], [enable plugin support])],
Index: gcc/graphite-clast-to-gimple.c
===
--- gcc/graphite-clast-to-gimple.c  (revision 218391)
+++ gcc/graphite-clast-to-gimple.c  (revision 218392)
@@ -30,6 +30,11 @@ along with GCC; see the file COPYING3.
 #include 
 #include 
 #include 
+#ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE
+#include 
+#include 
+#include 
+#endif
 #endif
 
 #include "system.h"
Index: gcc/graphite-optimize-isl.c
===
--- gcc/graphite-optimize-isl.c (revision 218391)
+++ gcc/graphite-optimize-isl.c (revision 218392)
@@ -28,6 +28,10 @@ along with GCC; see the file COPYING3.
 #include 
 #include 
 #include 
+#ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE
+#include 
+#include 
+#endif
 #endif
 
 #include "system.h"
@@ -373,7 +377,11 @@ getScheduleForBandList (isl_band_list *B
{
  for (i = ScheduleDimensions - 1 ;  i >= 0 ; i--)
{
+#ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE
+ if (isl_band_member_is_coincident (Band, i))
+#else
  if (isl_band_member_is_zero_distance (Band, i))
+#endif
{
  isl_map *TileMap;
  isl_union_map *TileUMap;
Index: gcc/graphite-poly.c
===
--- gcc/graphite-poly.c (revision 218391)
+++ gcc/graphite-poly.c (revision 218392)
@@ -30,6 +30,10 @@ along with GCC; see the file COPYING3.
 #include 
 #include 
 #include 
+#ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE
+#include 
+#include 
+#endif
 #endif
 
 #include "system.h"
I

Re: [Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the instruction that moves pc to low register

2014-12-08 Thread Richard Earnshaw
On 08/12/14 08:24, Terry Guo wrote:
> Hi there,
> 
> When compile below simple code:
> 
> terguo01@terry-pc01:mtpcs-frame$ cat test.c
> int main(void)
> {
> return 0;
> }
> 
> I got ICE with option -mtpcs-leaf-frame (no error if remove this option).
> 
> terguo01@terry-pc01:mtpcs-frame$
> /work/terguo01/tools/gcc-arm-none-eabi-5_0-2014q4/bin/arm-none-eabi-gcc
> -mtpcs-leaf-frame test.c -c -mcpu=cortex-m0plus -mthumb -da
> test.c: In function 'main':
> test.c:4:1: error: unrecognizable insn:
>  }
>  ^
> (insn 20 19 21 (set (reg:SI 2 r2)
> (reg:SI 15 pc)) test.c:2 -1
>  (nil))
> test.c:4:1: internal compiler error: in extract_insn, at recog.c:2327
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See http://gcc.gnu.org/bugs.html\ for instructions.
> 
> This RTL is generated in function thumb1_expand_prologue. The expected insn
> pattern is thumb1_movsi_insn in thumb1.md. And instruction like "mov r2, pc"
> is a legal instruction. Because gcc returns NO_REG for PC register, so no
> valid pattern to match instruction that move pc to low register. This patch
> intends to add a new insn pattern to legalize such thing.
> 
> Tested with GCC regression test. No regression. Is it OK to trunk?
> 
> BR,
> Terry
> 
> 2014-12-08  Terry Guo  terry@arm.com
> 
>  * config/arm/predicates.md (pc_register): New to match PC register.
>  * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern.
> 
> gcc/testsuite/ChangeLog:
> 2014-12-08  Terry Guo  terry@arm.com
> 
>  * gcc.target/arm/thumb1-mov-pc.c: New test.
> 
> 
> thumb1-move-pc-v1.txt
> 
> 
> diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
> index 032808c..c5ef5ed 100644
> --- a/gcc/config/arm/predicates.md
> +++ b/gcc/config/arm/predicates.md
> @@ -361,6 +361,10 @@
>(and (match_code "smin,smax,umin,umax")
> (match_test "mode == GET_MODE (op)")))
>  
> +(define_special_predicate "pc_register"
> +  (and (match_code "reg")
> +   (match_test "REGNO (op) == PC_REGNUM")))
> +
>  (define_special_predicate "cc_register"
>(and (match_code "reg")
> (and (match_test "REGNO (op) == CC_REGNUM")
> diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
> index ddedc39..8e6057c 100644
> --- a/gcc/config/arm/thumb1.md
> +++ b/gcc/config/arm/thumb1.md
> @@ -1780,6 +1780,16 @@
>"
>  )
>  
> +(define_insn "*thumb1_movpc_insn"
> +  [(set (match_operand:SI 0 "low_register_operand")

This needs constraints.

> +(match_operand:SI 1 "pc_register"))]
> +  "TARGET_THUMB1"
> +  "mov\\t%0, pc"
> +  [(set_attr "length" "2")
> +   (set_attr "conds" "nocond")
> +   (set_attr "type"   "mov_reg")]
> +)
> +
>  ;; NB never uses BX.
>  (define_insn "*thumb1_tablejump"
>[(set (pc) (match_operand:SI 0 "register_operand" "l*r"))
> diff --git a/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c 
> b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c
> new file mode 100644
> index 000..9f94131
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c
> @@ -0,0 +1,7 @@
> +/* { dg-options "-mtpcs-leaf-frame -O2" } */
> +/* { dg-skip-if "" { ! { arm_thumb1 } } } */
> +int
> +main ()
> +{
> +  return 0;
> +}
> 




Re: [PATCH][wwwdocs] Fix typo in changes.html and minor rewording

2014-12-08 Thread Kyrill Tkachov


On 08/12/14 11:26, Ramana Radhakrishnan wrote:

On Mon, Dec 8, 2014 at 11:24 AM, Bernd Edlinger
 wrote:

Hi Kyrill,



Hi all,

As the subject says, this just fixes a typo in the fprofile-generate
option name and rewords the text in the next sentence a bit.
Ok to commit?

Thanks,
Kyrill



I think this kind of change does not really need approval.
But, maybe you should also look at fixing the spelling of "sped": s/sped/speed/

sped is past tense and past participle for speed. The usage is correct here.


Thank you both for taking a look. I did consider the typo fix to be 
obvious, but was on the fence about the rewording part.


I'll commit this  later today unless someone objects.

Kyrill



Ramana


Bernd.

--- htdocs/gcc-5/changes.html   8 Dec 2014 03:17:31 -   1.48
+++ htdocs/gcc-5/changes.html   8 Dec 2014 10:47:05 -
@@ -41,8 +41,8 @@
  inline-insns-single limits for hot calls.
   IPA reference pass was significantly sped up making it feasible
  to enable -fipa-reference with






Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype

2014-12-08 Thread Richard Biener
On Thu, 4 Dec 2014, Joseph Myers wrote:

> On Thu, 4 Dec 2014, Richard Biener wrote:
> 
> > OTOH this also means the user cannot provide a conforming
> > implementation on his own and get that used by GCC without editing
> > system headers or including a header with -isystem or similar
> > tricks.
> 
> Well - you could have a pragma / attribute for that purpose (declaring 
> "this program is providing a version of function X that has the semantics 
> GCC expects for function X", so GCC can both generate and optimize calls).  
> Such a pragma / attribute could also override targetm.libc_has_function 
> (for the case of the user providing their own definition of something 
> missing from their system's standard libraries).  A related case would be 
> declaring somehow "I will be linking in libm, even though this translation 
> unit doesn't appear to be using libm functions, so calls to libm functions 
> can be implicitly generated", if GCC were made to avoid introducing uses 
> of libm.  (Again, this would be a matter of providing a cleaner interface 
> rather than something that currently can't be expressed at all - the 
> proposed definition of what it means to use libm explicitly implies that 
> "if (0) (void) sqrt (0);" says that libm is being used.)

I tried to come up with a patch that adds extra checks but the existing
way of having 'implicit' vs. 'explicit' only available as a flag rather
than making the explicitely declared builtin decl available makes
it harder than necessary.  I also run into the issue that we
remove all cgraph edges during GIMPLE optimizations thus embedding
the "is there already a use of the builtin" in the new abstraction
doesn't fly.

This means the frontend has to provide a "used" flag to the middle-end
as well as a "declared" flag.  For providing a declared flag there
is already the place where we special-case STPCPY, for providing
a "used" flag I don't see any obvious place.

I'm not pushing this further for stage3, but for stage1 I'd like
to eventually address this by splitting up builtin_info_type's
'implicit_p' into a flags array providing implicit_p, declared_p,
used_p and maybe declared_in_system_header_p.  Would you be
willing to fill in the gap computing "used_p" in the C frontend?

My non-working abstraction for middle-end folders looks like

/* Return the decl for the builtin function FNCODE or NULL_TREE if it 
cannot
   be emitted by GCC.  */

tree
builtin_decl (enum built_in_function fncode)
{
  tree decl = builtin_decl_implicit (fncode);
  if (decl)
return decl;

  decl = builtin_decl_explicit (fncode);
  if (!decl)
return NULL_TREE;

  /* We cannot use a builtin that has not been declared explicitely.  */
  if (!(builtin_info.flags[fncode] & bif_declared_p))
return NULL_TREE;

  /* We can use a builtin that has been declared explicitely only if
 the program contains a reference to it already.  */
  cgraph_node *node = cgraph_node::get_for_asmname (DECL_ASSEMBLER_NAME 
(decl));
  if (!node || !node->callers)
return NULL_TREE;
  return decl;
}

which currently fails because of us having removing cgraph edges
at the points the function is called.  Thus the cgraph caller
check would be replaced by a check for 'used_p'.  We'd still
get __builtin_exp10 emitted instead of the user-declared "exp10"
variant (we don't have access to that decl).

Without the 'used_p' flag work the rest would be equivalent to
setting implicit_p for explicit declarations we see (the originally
proposed patch which follows the stpcpy example).

Thanks,
Richard.


Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype

2014-12-08 Thread Jakub Jelinek
On Mon, Dec 08, 2014 at 01:24:12PM +0100, Richard Biener wrote:
> I'm not pushing this further for stage3, but for stage1 I'd like
> to eventually address this by splitting up builtin_info_type's
> 'implicit_p' into a flags array providing implicit_p, declared_p,
> used_p and maybe declared_in_system_header_p.  Would you be
> willing to fill in the gap computing "used_p" in the C frontend?

The used_p thing might be problematic, I'd expect that several packages
use libm functions somewhere in dead code or when it is folded into
a constant and don't link with -lm, if those dead or optimized away
uses would be counted as uses nevertheless, then if optimizers create new
libm references because of those, I'd be afraid such programs wouldn't link
anymore.

Jakub


Re: FW: [PATCH] Fix PR ipa/64049

2014-12-08 Thread Richard Biener
On Mon, Dec 8, 2014 at 11:57 AM, Bernd Edlinger
 wrote:
>
>
>>
>> Hello,
>>
>>
>> this surprisingly simple patch fixes PR ipa/64049.  The root cause seems to 
>> be
>> that in this test case we try to devirtualize a method call on a return 
>> value.
>>
>>
>> Boot-Strapped and regression-tested on X86_64-linux-gnu.
>> OK for trunk?

I think positive tests, in this case TREE_CODE (...) == VAR_DECL
are better.

Ok with that change.

Thanks,
Richard.

>>
>> Thanks
>> Bernd.
>>
>
> Again with changelog. Sorry.
>


Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype

2014-12-08 Thread Richard Biener
On Mon, 8 Dec 2014, Jakub Jelinek wrote:

> On Mon, Dec 08, 2014 at 01:24:12PM +0100, Richard Biener wrote:
> > I'm not pushing this further for stage3, but for stage1 I'd like
> > to eventually address this by splitting up builtin_info_type's
> > 'implicit_p' into a flags array providing implicit_p, declared_p,
> > used_p and maybe declared_in_system_header_p.  Would you be
> > willing to fill in the gap computing "used_p" in the C frontend?
> 
> The used_p thing might be problematic, I'd expect that several packages
> use libm functions somewhere in dead code or when it is folded into
> a constant and don't link with -lm, if those dead or optimized away
> uses would be counted as uses nevertheless, then if optimizers create new
> libm references because of those, I'd be afraid such programs wouldn't link
> anymore.

Same applies to your STPCPY special-casing, even without introducing
a use.

The alternative is to decide "used" in the middle-end at one point,
for example at the end of all_lowering_passes where hopefully
we have constant folded and removed dead code enough.  We can also
compute an overall "uses libm" flag to fix the testcase I reported
(of course we'd like to re-compute that at LTO time).

Do you think that's better?  It's of course less well-defined what
is a "use" of exp10 then (as opposed to what Joseph specified
with a reference outside of sizeof() and similar contexts).

Richard.


Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype

2014-12-08 Thread Jakub Jelinek
On Mon, Dec 08, 2014 at 01:54:21PM +0100, Richard Biener wrote:
> On Mon, 8 Dec 2014, Jakub Jelinek wrote:
> 
> > On Mon, Dec 08, 2014 at 01:24:12PM +0100, Richard Biener wrote:
> > > I'm not pushing this further for stage3, but for stage1 I'd like
> > > to eventually address this by splitting up builtin_info_type's
> > > 'implicit_p' into a flags array providing implicit_p, declared_p,
> > > used_p and maybe declared_in_system_header_p.  Would you be
> > > willing to fill in the gap computing "used_p" in the C frontend?
> > 
> > The used_p thing might be problematic, I'd expect that several packages
> > use libm functions somewhere in dead code or when it is folded into
> > a constant and don't link with -lm, if those dead or optimized away
> > uses would be counted as uses nevertheless, then if optimizers create new
> > libm references because of those, I'd be afraid such programs wouldn't link
> > anymore.
> 
> Same applies to your STPCPY special-casing, even without introducing
> a use.

Well, the important difference there is that stpcpy is in libc, not libm,
and you get the former by default usually (-nostdlib is very rare).

Yes, supposedly overall uses libc, uses libm flags would be reasonable.

> The alternative is to decide "used" in the middle-end at one point,
> for example at the end of all_lowering_passes where hopefully
> we have constant folded and removed dead code enough.  We can also
> compute an overall "uses libm" flag to fix the testcase I reported
> (of course we'd like to re-compute that at LTO time).
> 
> Do you think that's better?  It's of course less well-defined what
> is a "use" of exp10 then (as opposed to what Joseph specified
> with a reference outside of sizeof() and similar contexts).

Jakub


Re: [PATCH] PR other/63613: Add fixincludes for dejagnu.h

2014-12-08 Thread Rainer Orth
Jeff Law  writes:

> On 12/04/14 15:42, Rainer Orth wrote:
>> David Malcolm  writes:
>>
>>>  assumed -fgnu89-inline until a recent upstream fix;
>>> see http://lists.gnu.org/archive/html/dejagnu/2014-10/msg00011.html
>>>
>>> Remove the workaround from jit.exp that used -fgnu89-inline
>>> in favor of a fixincludes to dejagnu.h that applies the upstream fix
>>> to a local copy.
>>>
>>> This should make it easier to support C++ testcases from jit.exp.
>>
>> I wonder how this would work if dejagnu.h doesn't live in a system
>> include dir (e.g. a self-compiled version)?  fixincludes won't touch
>> those AFAIU.  The previous version with -fgnu89-inline would still work
>> in that case provided dejagnu.h is found at all.
> Presumably in that case the answer is upgrade dejagnu? :-)

I've two problems with this:

* There's not yet a DejaGnu release available with the fix and I've no
  idea if there are any planned any time soon.  Not everyone is
  comfortable with random git (or whatever) snapshots.

* I don't consider this a critical issue that cannot work without
  current releases.  We're already working around several upstream
  DejaGnu issues in our codebase, and I don't consider this particular
  one important enough to require everyone to upgrade to a not-a-release
  version.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Fix libgomp crash without TLS (PR42616)

2014-12-08 Thread Varvara Rainchik
Hi guys,

Could you please take a look at this issue? This fix is still urgent
for Android.

2014-12-01 18:25 GMT+03:00 Varvara Rainchik :
> Hi Jakub,
>
> Do you think this patch is ok for upstream:
>
> 2014-12-01  Varvara Rainchik  
>
> * libgomp/libgomp.h: Eliminate case when HAVE_TLS is not defined:
> always use tls emulation.
> * libgomp/team.c: Likewise.
>
> --
> diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
> index a1482cc..a659ebc 100644
> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h
> @@ -471,19 +471,11 @@ enum gomp_cancel_kind
>
>  /* ... and here is that TLS data.  */
>
> -#ifdef HAVE_TLS
>  extern __thread struct gomp_thread gomp_tls_data;
>  static inline struct gomp_thread *gomp_thread (void)
>  {
>return &gomp_tls_data;
>  }
> -#else
> -extern pthread_key_t gomp_tls_key;
> -static inline struct gomp_thread *gomp_thread (void)
> -{
> -  return pthread_getspecific (gomp_tls_key);
> -}
> -#endif
>
>  extern struct gomp_task_icv *gomp_new_icv (void);
>
> diff --git a/libgomp/team.c b/libgomp/team.c
> index e6a6d8f..2e8dc47 100644
> --- a/libgomp/team.c
> +++ b/libgomp/team.c
> @@ -37,12 +37,7 @@ pthread_key_t gomp_thread_destructor;
>
>
>  /* This is the libgomp per-thread data structure.  */
> -#ifdef HAVE_TLS
>  __thread struct gomp_thread gomp_tls_data;
> -#else
> -pthread_key_t gomp_tls_key;
> -#endif
> -
>
>  /* This structure is used to communicate across pthread_create.  */
>
> @@ -70,13 +65,7 @@ gomp_thread_start (void *xdata)
>void (*local_fn) (void *);
>void *local_data;
>
> -#ifdef HAVE_TLS
>thr = &gomp_tls_data;
> -#else
> -  struct gomp_thread local_thr;
> -  thr = &local_thr;
> -  pthread_setspecific (gomp_tls_key, thr);
> -#endif
>gomp_sem_init (&thr->release, 0);
>
>/* Extract what we need from data.  */
> @@ -916,13 +905,6 @@ gomp_team_end (void)
>  static void __attribute__((constructor))
>  initialize_team (void)
>  {
> -#ifndef HAVE_TLS
> -  static struct gomp_thread initial_thread_tls_data;
> -
> -  pthread_key_create (&gomp_tls_key, NULL);
> -  pthread_setspecific (gomp_tls_key, &initial_thread_tls_data);
> -#endif
> -
>if (pthread_key_create (&gomp_thread_destructor, gomp_free_thread) != 0)
>  gomp_fatal ("could not create thread pool destructor.");
>  }
>
> Or should I add some extra checks in config/tls.m4? As far as I
> understand you have mentioned case when both native tls and tls
> emulation are not supported. So, is it sufficient to check that
> HAVE_TLS and USE_EMUTLS are not defined to detect this case?
>
> 2014-11-10 16:15 GMT+03:00 Varvara Rainchik :
>> *Ping*
>>
>> 2014-10-13 14:48 GMT+04:00 Varvara Rainchik :
 Now, I wonder on which OS and why does config/tls.m4 CHECK_GCC_TLS
 actually fail?  Can you figure that out?

>>>
>>> On Android check passes with --disable-tls (standard while building
>>> gcc for Android as TLS is not supported in bionic) and fails with
>>> --enable-tls (i686-linux-android/libgomp/conftest.c:32: undefined
>>> reference to `___tls_get_addr'). So, HAVE_TLS is not defined in both
>>> cases.
>>>
 If we get rid of HAVE_TLS code altogether, we might lose support of
 some very old OSes, e.g. some Linux distros with a recent gcc and binutils
 (so that emutls isn't used), but very old glibc (that doesn't support
 TLS or supports it incorrectly, think of pre-2002 glibc).  So, if we get
 rid of !HAVE_TLS code in libgomp, it would be nice if config/tls.m4 
 detected
 it properly and we'd just fail at configure time.
>>>
>>> How can we check this in config/tls.m4? Can we just combine tests on
>>> TLS and emutls? E.g. check whether HAVE_TLS and USE_EMUTLS are both
>>> defined.
>>>
 And if we don't, just make sure that on Android, Darwin and/or M$Win (or
 whatever other OS you had in mind which does support pthreads, but doesn't
 support native TLS) find out why HAVE_AS_TLS is not defined (guess
 config.log should explain that).
>>>
>>> HAVE_AS_TLS is also not defined for Android as it depends on --enable-tls.


Re: Ping with testcase: [PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR

2014-12-08 Thread Marcus Shawcroft
On 3 December 2014 at 10:30, Alan Lawrence  wrote:
> On Wed, Nov 26, 2014 at 04:35:50PM +, James Greenhalgh wrote:
>> Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these
>> intrinsics? There should be no difference between the mid-end definition
>> and the intrinsic definition of their behaviour.
>
> Good point. Done.
>
>> I also note that the integer forms of these now end up as an "abs" RTL
>> expression - can we guarantee that preserves the intrinsics behaviour and
>> no RTL-folder will come along and mis-optimize? Documentation is vague
>> on this point.
>
> I don't think we can guarantee that indefinitely, but it doesn't seem anyone
> has implemented such a rule *at present*. And we'd lose e.g. constant
> folding, if we switched to using an AArch64-specific UNSPEC. If anyone does
> add such a rule, I'd have some hope that the testcase will catch it...
>
>> I'm also not convinced by the SIMD_ARG_SCRATCH foo you add. Looking at
>> the aarch64.md:absdi2 pattern I can't see why we need that scratch at
>> all. It seems we could get away with marking operand 0 early-clobber and
>> using it as our scratch register. Then we could drop all the extra
>> infrastructure from this patch.
>
> Hah, good point. Ok, I attach a revised patch, updating the pattern as you
> suggest and omitting the infrastructure changes. (I'm leaving the now-unused
> qualifier_internal as it stands, as there are arguments both for removing it
> and "fixing" it as per previous patch, but I don't think we should hold this
> up in the case that we want to have that discussion!)
>
> Cross-tested check-gcc on aarch64-none-elf, including previously-posted test
> of vabs_s8.
>
> Ok for trunk (with previously-posted testcase) ?
> --Alan
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.md (absdi2): Remove scratch operand by
> earlyclobbering result operand.
>
> * config/aarch64/aarch64-builtins.c (aarch64_types_unop_qualifiers):
> Remove final qualifier_internal.
> (aarch64_fold_builtin): Stop folding abs builtins, except on floats.

OK, and the test case too /Marcus


Re: Fix libgomp crash without TLS (PR42616)

2014-12-08 Thread Jakub Jelinek
On Mon, Dec 08, 2014 at 04:30:46PM +0300, Varvara Rainchik wrote:
> Hi guys,
> 
> Could you please take a look at this issue? This fix is still urgent
> for Android.

I'm afraid it will break those targets where emutls is not on, but the C
library doesn't support TLS.  I think it is acceptable not to care about
#pragma omp from different pthread_create created threads in that case, but
stopping support completely might be too early.
So, can you instead arrange for HAVE_TLS to be defined if emutls is
supported (check for that during configure)?

Jakub


[PATCH 01/10] rs6000: Clobber XER[CA] in all user asm statements

2014-12-08 Thread Segher Boessenkool
A lot of old user code clobbers the carry bit without telling the compiler
about it.  This used to just work, because the compiler never used the bit
outside of a single RTL instruction.  But that will change.  Let's clobber
the carry bit in every asm; this isn't very expensive.


2014-12-08  Segher Boessenkool  

gcc/
PR target/64180
* config/rs6000/rs6000.c (TARGET_MD_ASM_CLOBBERS): Define.
(rs6000_md_asm_clobbers): New function.

---
 gcc/config/rs6000/rs6000.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index cdb9de9..e58fc81 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1556,6 +1556,9 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 #undef TARGET_ASM_LOOP_ALIGN_MAX_SKIP
 #define TARGET_ASM_LOOP_ALIGN_MAX_SKIP rs6000_loop_align_max_skip
 
+#undef TARGET_MD_ASM_CLOBBERS
+#define TARGET_MD_ASM_CLOBBERS rs6000_md_asm_clobbers
+
 #undef TARGET_OPTION_OVERRIDE
 #define TARGET_OPTION_OVERRIDE rs6000_option_override
 
@@ -3146,6 +3149,19 @@ rs6000_builtin_mask_calculate (void)
  | ((TARGET_LONG_DOUBLE_128)   ? RS6000_BTM_LDBL128 : 0));
 }
 
+/* Implement TARGET_MD_ASM_CLOBBERS.  All asm statements are considered
+   to clobber the XER[CA] bit because clobbering that bit without telling
+   the compiler worked just fine with versions of GCC before GCC 5, and
+   breaking a lot of older code in ways that are hard to track down is
+   not such a great idea.  */
+
+static tree
+rs6000_md_asm_clobbers (tree, tree, tree clobbers)
+{
+  tree s = build_string (strlen (reg_names[CA_REGNO]), reg_names[CA_REGNO]);
+  return tree_cons (NULL_TREE, s, clobbers);
+}
+
 /* Override command line options.  Mostly we process the processor type and
sometimes adjust other TARGET_ options.  */
 
-- 
1.8.1.4



[PATCH 03/10] rs6000: Do not use addic in the ctr* splitters

2014-12-08 Thread Segher Boessenkool
These splitters are only used when for some reason we generated a bdnz
loop but we cannot actually use one (CTR is already used for something
else, for example).  This shouldn't happen often, so making the split
sequence one insn longer isn't a big deal.


2014-12-08  Segher Boessenkool  

gcc/
PR target/64180
* config/rs6000/rs6000.md (*ctr_internal1, *ctr_internal2,
*ctr_internal5, *ctr_internal6): Change "r" alternatives
to "b".  Increase length.
(splitters for these): Split to cmp+addi instead of addic.

---
 gcc/config/rs6000/rs6000.md | 42 --
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index abf20c3..6f4bafb 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -13496,7 +13496,7 @@ (define_expand "ctr"
 
 (define_insn "*ctr_internal1"
   [(set (pc)
-   (if_then_else (ne (match_operand:P 1 "register_operand" "c,*r,*r,*r")
+   (if_then_else (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
  (const_int 1))
  (label_ref (match_operand 0 "" ""))
  (pc)))
@@ -13516,11 +13516,11 @@ (define_insn "*ctr_internal1"
 return \"bdz $+8\;b %l0\";
 }"
   [(set_attr "type" "branch")
-   (set_attr "length" "*,12,16,16")])
+   (set_attr "length" "*,16,20,20")])
 
 (define_insn "*ctr_internal2"
   [(set (pc)
-   (if_then_else (ne (match_operand:P 1 "register_operand" "c,*r,*r,*r")
+   (if_then_else (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
  (const_int 1))
  (pc)
  (label_ref (match_operand 0 "" ""
@@ -13540,13 +13540,13 @@ (define_insn "*ctr_internal2"
 return \"bdnz $+8\;b %l0\";
 }"
   [(set_attr "type" "branch")
-   (set_attr "length" "*,12,16,16")])
+   (set_attr "length" "*,16,20,20")])
 
 ;; Similar but use EQ
 
 (define_insn "*ctr_internal5"
   [(set (pc)
-   (if_then_else (eq (match_operand:P 1 "register_operand" "c,*r,*r,*r")
+   (if_then_else (eq (match_operand:P 1 "register_operand" "c,*b,*b,*b")
  (const_int 1))
  (label_ref (match_operand 0 "" ""))
  (pc)))
@@ -13566,11 +13566,11 @@ (define_insn "*ctr_internal5"
 return \"bdnz $+8\;b %l0\";
 }"
   [(set_attr "type" "branch")
-   (set_attr "length" "*,12,16,16")])
+   (set_attr "length" "*,16,20,20")])
 
 (define_insn "*ctr_internal6"
   [(set (pc)
-   (if_then_else (eq (match_operand:P 1 "register_operand" "c,*r,*r,*r")
+   (if_then_else (eq (match_operand:P 1 "register_operand" "c,*b,*b,*b")
  (const_int 1))
  (pc)
  (label_ref (match_operand 0 "" ""
@@ -13590,7 +13590,7 @@ (define_insn "*ctr_internal6"
 return \"bdz $+8\;b %l0\";
 }"
   [(set_attr "type" "branch")
-   (set_attr "length" "*,12,16,16")])
+   (set_attr "length" "*,16,20,20")])
 
 ;; Now the splitters if we could not allocate the CTR register
 
@@ -13606,13 +13606,12 @@ (define_split
(clobber (match_scratch:CC 3 ""))
(clobber (match_scratch:P 4 ""))]
   "reload_completed"
-  [(parallel [(set (match_dup 3)
-  (compare:CC (plus:P (match_dup 1)
-   (const_int -1))
-  (const_int 0)))
- (set (match_dup 0)
-  (plus:P (match_dup 1)
-   (const_int -1)))])
+  [(set (match_dup 3)
+   (compare:CC (match_dup 1)
+   (const_int 1)))
+   (set (match_dup 0)
+   (plus:P (match_dup 1)
+   (const_int -1)))
(set (pc) (if_then_else (match_dup 7)
   (match_dup 5)
   (match_dup 6)))]
@@ -13632,13 +13631,12 @@ (define_split
(clobber (match_scratch:CC 3 ""))
(clobber (match_scratch:P 4 ""))]
   "reload_completed && ! gpc_reg_operand (operands[0], SImode)"
-  [(parallel [(set (match_dup 3)
-  (compare:CC (plus:P (match_dup 1)
-   (const_int -1))
-  (const_int 0)))
- (set (match_dup 4)
-  (plus:P (match_dup 1)
-   (const_int -1)))])
+  [(set (match_dup 3)
+   (compare:CC (match_dup 1)
+   (const_int 1)))
+   (set (match_dup 4)
+   (plus:P (match_dup 1)
+   (const_int -1)))
(set (match_dup 0)
(match_dup 4))
(set (pc) (if_then_else (match_dup 7)
-- 
1.8.1.4



[PATCH 00/10] rs6000: Carry bit improvements

2014-12-08 Thread Segher Boessenkool
This series makes the PowerPC port deal with the {add,subf}{c,ic,e,ze,me}
instructions directly, not as bigger primitives (which used to be emitted as
multiple machine instructions).

It does this by expanding multi-precision add/sub/neg as well as the unsigned
wordlength sCOND to separate machine insns, which are then simplified where
possible by the combine pass and friends.  For eq/ne this series splits after
combine.

This is PR target/64180.

All patches bootstrapped and tested on powerpc64-linux, runtestflags
-m32,-m64,-m32/-mpowerpc64,-m64/-mlra; also a lot of tiny tests that
I'll submit separately.

Comments?  Is this okay for mainline?


Segher


Segher Boessenkool (10):
  rs6000: Clobber XER[CA] in all user asm statements
  rs6000: Remove addic alternative from various lo_sum patterns
  rs6000: Do not use addic in the ctr* splitters
  rs6000: Remove addic from the normal add pattern
  rs6000: Merge and split add and addic
  rs6000: New add/subf carry insns
  rs6000: The big carry insns addition
  rs6000: compare->two for rld*;rld*.
  rs6000: Remove the (now unused) insn type "compare"
  rs6000: Get rid of the weird decimal thing in add

 gcc/config/rs6000/40x.md  |2 +-
 gcc/config/rs6000/440.md  |2 +-
 gcc/config/rs6000/476.md  |2 +-
 gcc/config/rs6000/601.md  |2 +-
 gcc/config/rs6000/603.md  |2 +-
 gcc/config/rs6000/6xx.md  |2 +-
 gcc/config/rs6000/7450.md |2 +-
 gcc/config/rs6000/7xx.md  |2 +-
 gcc/config/rs6000/8540.md |2 +-
 gcc/config/rs6000/cell.md |   10 +-
 gcc/config/rs6000/darwin.md   |   16 +-
 gcc/config/rs6000/e300c2c3.md |2 +-
 gcc/config/rs6000/e500mc.md   |2 +-
 gcc/config/rs6000/e500mc64.md |2 +-
 gcc/config/rs6000/e5500.md|2 +-
 gcc/config/rs6000/e6500.md|2 +-
 gcc/config/rs6000/mpc.md  |2 +-
 gcc/config/rs6000/power4.md   |5 +-
 gcc/config/rs6000/power5.md   |5 +-
 gcc/config/rs6000/power6.md   |5 +-
 gcc/config/rs6000/power7.md   |5 +-
 gcc/config/rs6000/power8.md   |8 +-
 gcc/config/rs6000/predicates.md   |   14 +
 gcc/config/rs6000/rs6000-protos.h |1 +
 gcc/config/rs6000/rs6000.c|   49 +-
 gcc/config/rs6000/rs6000.md   | 1909 -
 gcc/config/rs6000/rs64.md |2 +-
 gcc/config/rs6000/spe.md  |   18 +-
 28 files changed, 733 insertions(+), 1344 deletions(-)

-- 
1.8.1.4



[PATCH 02/10] rs6000: Remove addic alternative from various lo_sum patterns

2014-12-08 Thread Segher Boessenkool
This means we do not allow GPR0 as base address of those anymore.  The
alternative is to not allow the carry bit to be live over any lo_sum,
which is more expensive.


2014-12-08  Segher Boessenkool  

gcc/
PR target/64180
* config/rs6000/darwin.md (macho_low_si): Remove "r" alternative.
(macho_low_di): Ditto.
* config/rs6000/rs6000.md (*largetoc_low): Ditto.
(tocref): Ditto.
(elf_low): Ditto.
* config/rs6000/spe.md (mov_si_e500_subreg0_elf_low_be): Ditto.
(mov_si_e500_subreg0_elf_low_le): Ditto.
(mov_si_e500_subreg4_elf_low_be): Ditto.  Reformat condition.
(mov_si_e500_subreg4_elf_low_le): Ditto.

---
 gcc/config/rs6000/darwin.md | 16 ++--
 gcc/config/rs6000/rs6000.md | 18 +++---
 gcc/config/rs6000/spe.md| 18 +-
 3 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/gcc/config/rs6000/darwin.md b/gcc/config/rs6000/darwin.md
index 8b816b7..764f847 100644
--- a/gcc/config/rs6000/darwin.md
+++ b/gcc/config/rs6000/darwin.md
@@ -213,22 +213,18 @@ (define_expand "macho_low"
 })
 
 (define_insn "macho_low_si"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-   (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+   (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
   (match_operand 2 "" "")))]
"TARGET_MACHO && ! TARGET_64BIT"
-   "@
-la %0,lo16(%2)(%1)
-addic %0,%1,lo16(%2)")
+   "la %0,lo16(%2)(%1)")
 
 (define_insn "macho_low_di"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r")
-   (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b,!*r")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+   (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b")
   (match_operand 2 "" "")))]
"TARGET_MACHO && TARGET_64BIT"
-   "@
-la %0,lo16(%2)(%1)
-addic %0,%1,lo16(%2)")
+   "la %0,lo16(%2)(%1)")
 
 (define_split
   [(set (mem:V4SI (plus:DI (match_operand:DI 0 "gpc_reg_operand" "")
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index f3b5aae..abf20c3 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -10712,13 +10712,11 @@ (define_insn "*largetoc_high_plus_aix"
"addis %0,%1+%3@u(%2)")
 
 (define_insn "*largetoc_low"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r")
-(lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b,!*r")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+(lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b")
   (match_operand:DI 2 "" "")))]
"TARGET_ELF && TARGET_CMODEL != CMODEL_SMALL"
-   "@
-addi %0,%1,%2@l
-addic %0,%1,%2@l")
+   "addi %0,%1,%2@l")
 
 (define_insn "*largetoc_low_aix"
   [(set (match_operand:P 0 "gpc_reg_operand" "=r")
@@ -10728,7 +10726,7 @@ (define_insn "*largetoc_low_aix"
"la %0,%2@l(%1)")
 
 (define_insn_and_split "*tocref"
-  [(set (match_operand:P 0 "gpc_reg_operand" "=b*r")
+  [(set (match_operand:P 0 "gpc_reg_operand" "=b")
(match_operand:P 1 "small_toc_ref" "R"))]
"TARGET_TOC"
"la %0,%a1"
@@ -10747,13 +10745,11 @@ (define_insn "elf_high"
   "lis %0,%1@ha")
 
 (define_insn "elf_low"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-   (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+   (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
   (match_operand 2 "" "")))]
"TARGET_ELF && ! TARGET_64BIT"
-   "@
-la %0,%2@l(%1)
-addic %0,%1,%K2")
+   "la %0,%2@l(%1)")
 
 ;; Call and call_value insns
 (define_expand "call"
diff --git a/gcc/config/rs6000/spe.md b/gcc/config/rs6000/spe.md
index 8eec7b7..07c293c 100644
--- a/gcc/config/rs6000/spe.md
+++ b/gcc/config/rs6000/spe.md
@@ -2519,7 +2519,7 @@ (define_insn "*mov_si_e500_subreg0_le"
 
 (define_insn_and_split "*mov_si_e500_subreg0_elf_low_be"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 0)
-   (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+   (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
   (match_operand 2 "" "")))]
   "WORDS_BIG_ENDIAN
&& (((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
@@ -2538,13 +2538,13 @@ (define_insn_and_split 
"*mov_si_e500_subreg0_elf_low_be"
 
 (define_insn "*mov_si_e500_subreg0_elf_low_le"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 0)
-   (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+   (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
   (match_operand 2 "" "")))]
   "!WORDS_BIG_ENDIAN
&& (((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
|| (TARGET_SPE && mode != DFmode && mode != TFmode))
&& TARGET_ELF && !TARGET_64BIT)"
-  "addic %0,%1,%K2")
+  "addi %0,%1,%K2")
 
 ;; ??? Could use evstwwe for memory stores in some cases, dep

Re: [PATCH 01/10] rs6000: Clobber XER[CA] in all user asm statements

2014-12-08 Thread David Edelsohn
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool
 wrote:
> A lot of old user code clobbers the carry bit without telling the compiler
> about it.  This used to just work, because the compiler never used the bit
> outside of a single RTL instruction.  But that will change.  Let's clobber
> the carry bit in every asm; this isn't very expensive.
>
>
> 2014-12-08  Segher Boessenkool  
>
> gcc/
> PR target/64180
> * config/rs6000/rs6000.c (TARGET_MD_ASM_CLOBBERS): Define.
> (rs6000_md_asm_clobbers): New function.

Okay.  I don't know of another option that preserves backward
compatibility.  If RTH looks at this series of patches, he may have a
better suggestion.

Thanks, David


Re: [PATCH 02/10] rs6000: Remove addic alternative from various lo_sum patterns

2014-12-08 Thread David Edelsohn
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool
 wrote:
> This means we do not allow GPR0 as base address of those anymore.  The
> alternative is to not allow the carry bit to be live over any lo_sum,
> which is more expensive.
>
>
> 2014-12-08  Segher Boessenkool  
>
> gcc/
> PR target/64180
> * config/rs6000/darwin.md (macho_low_si): Remove "r" alternative.
> (macho_low_di): Ditto.
> * config/rs6000/rs6000.md (*largetoc_low): Ditto.
> (tocref): Ditto.
> (elf_low): Ditto.
> * config/rs6000/spe.md (mov_si_e500_subreg0_elf_low_be): Ditto.
> (mov_si_e500_subreg0_elf_low_le): Ditto.
> (mov_si_e500_subreg4_elf_low_be): Ditto.  Reformat condition.
> (mov_si_e500_subreg4_elf_low_le): Ditto.

Okay.

Thanks, David


Re: [PATCH 03/10] rs6000: Do not use addic in the ctr* splitters

2014-12-08 Thread David Edelsohn
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool
 wrote:
> These splitters are only used when for some reason we generated a bdnz
> loop but we cannot actually use one (CTR is already used for something
> else, for example).  This shouldn't happen often, so making the split
> sequence one insn longer isn't a big deal.
>
>
> 2014-12-08  Segher Boessenkool  
>
> gcc/
> PR target/64180
> * config/rs6000/rs6000.md (*ctr_internal1, *ctr_internal2,
> *ctr_internal5, *ctr_internal6): Change "r" alternatives
> to "b".  Increase length.
> (splitters for these): Split to cmp+addi instead of addic.

Okay.

Thanks, David


[PATCH] DWARFv5 Emit DW_TAG_atomic_type for C11 _Atomic.

2014-12-08 Thread Mark Wielaard
This implements the DW_TAG_atomic_type for C11 _Atomic proposal as adopted
in the latest DWARF5 draft. http://dwarfstd.org/ShowIssue.php?issue=131112.1

This is much simpler than my previous patch. Thanks to Andreas cleanups
for PR 63300 adding new qualifier support to dwarf2out.c is almost trivial.
Now that we have experimental -gdwarf-5 support we can emit this new TAG
just when the user explictly asks for it.

I do have a corresponding GDB patch to take advantage of the new information.

gcc/ChangeLog

PR debug/60782
* dwarf2out.c (modified_type_die): Handle TYPE_QUAL_ATOMIC.

gcc/testsuite/ChangeLog

PR debug/60782
* gcc.dg/debug/dwarf2/atomic.c: New test.
* gcc.dg/debug/dwarf2/stacked-qualified-types-3.c: Likewise.

include/ChangeLog

PR debug/60782
* dwarf2.def: Add DWARFv5 DW_TAG_atomic_type.
---
 gcc/ChangeLog  |  5 
 gcc/dwarf2out.c|  7 -
 gcc/testsuite/ChangeLog|  6 
 gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c | 15 ++
 .../debug/dwarf2/stacked-qualified-types-3.c   | 34 ++
 include/ChangeLog  |  5 
 include/dwarf2.def |  2 ++
 7 files changed, 73 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c
 create mode 100644 
gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-3.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b2eb950..b2fe45c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2014-12-08  Mark Wielaard  
+
+   PR debug/60782
+   * dwarf2out.c (modified_type_die): Handle TYPE_QUAL_ATOMIC.
+
 2014-12-02  Dmitry Vyukov  
 
* asan.c: (asan_finish_file): Use default priority for constructors
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index ca1e3ef..34b327e 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -10551,7 +10551,7 @@ modified_type_die (tree type, int cv_quals, dw_die_ref 
context_die)
   dw_die_ref mod_scope;
   /* Only these cv-qualifiers are currently handled.  */
   const int cv_qual_mask = (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE
-   | TYPE_QUAL_RESTRICT);
+   | TYPE_QUAL_RESTRICT | TYPE_QUAL_ATOMIC);
 
   if (code == ERROR_MARK)
 return NULL;
@@ -10564,6 +10564,10 @@ modified_type_die (tree type, int cv_quals, dw_die_ref 
context_die)
   if (dwarf_version < 3)
 cv_quals &= ~TYPE_QUAL_RESTRICT;
 
+  /* Likewise for DW_TAG_atomic_type for DWARFv5.  */
+  if (dwarf_version < 5)
+cv_quals &= ~TYPE_QUAL_ATOMIC;
+
   /* See if we already have the appropriately qualified variant of
  this type.  */
   qualified_type = get_qualified_type (type, cv_quals);
@@ -10625,6 +10629,7 @@ modified_type_die (tree type, int cv_quals, dw_die_ref 
context_die)
   struct qual_info { int q; enum dwarf_tag t; };
   static const struct qual_info qual_info[] =
{
+ { TYPE_QUAL_ATOMIC, DW_TAG_atomic_type },
  { TYPE_QUAL_RESTRICT, DW_TAG_restrict_type },
  { TYPE_QUAL_VOLATILE, DW_TAG_volatile_type },
  { TYPE_QUAL_CONST, DW_TAG_const_type },
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index f2e54ef..e983a07 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2014-12-08  Mark Wielaard  
+
+   PR debug/60782
+   * gcc.dg/debug/dwarf2/atomic.c: New test.
+   * gcc.dg/debug/dwarf2/stacked-qualified-types-3.c: Likewise.
+
 2014-12-02  Uros Bizjak  
 
* gcc.target/i386/avx512ifma-vpmaddhuq-2.c: Define AVX512IFMA.
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c 
b/gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c
new file mode 100644
index 000..187e695
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -gdwarf-5 -dA" } */
+
+struct Atomics
+{
+  _Atomic(int) counter;
+  struct Pointer
+  {
+_Atomic volatile char *p;
+  } p;
+};
+
+struct Atomics a;
+
+/* { dg-final { scan-assembler-times "DIE \\(\[^\n\]*\\) DW_TAG_atomic_type" 2 
} } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-3.c 
b/gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-3.c
new file mode 100644
index 000..efa3fa0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-3.c
@@ -0,0 +1,34 @@
+/* make sure we don't duplicate type qualifiers unneeded.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -gdwarf-5 -dA" } */
+
+/* This should give us:
+   - One const type pointing to a char
+   - One volatile type pointing to a char
+   - One atomic type pointing to a char
+   - Either one const type pointing to the volatile type pointing to a char
+ or one volatile type pointing to the const type pointing to a char.
+ But not both.
+   - Either one volatile type 

Re: [PATCH] DWARFv5 Emit DW_TAG_atomic_type for C11 _Atomic.

2014-12-08 Thread Jason Merrill

OK.

Jason


Re: [PATCH 04/10] rs6000: Remove addic from the normal add pattern

2014-12-08 Thread David Edelsohn
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool
 wrote:
> This means we can no longer add GPR0+imm.  Register allocation will have
> to use a different register.
>
>
> 2014-12-08  Segher Boessenkool  
>
> gcc/
> PR target/64180
> * config/rs6000/rs6000.md (*add3_internal1): Remove addic
> alternative.
>
> ---
>  gcc/config/rs6000/rs6000.md | 9 +++--
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 6f4bafb..1647f8b 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -1491,17 +1491,14 @@ (define_expand "add3"
>  }
>  })
>
> -;; Discourage ai/addic because of carry but provide it in an alternative
> -;; allowing register zero as source.
>  (define_insn "*add3_internal1"
> -  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,?r,r")
> -   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,r,b")
> - (match_operand:GPR 2 "add_operand" "r,I,I,L")))]
> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
> +   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
> + (match_operand:GPR 2 "add_operand" "r,I,L")))]
>"!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && !DECIMAL_FLOAT_MODE_P 
> (GET_MODE (operands[1]))"
>"@
> add %0,%1,%2
> addi %0,%1,%2
> -   addic %0,%1,%2
> addis %0,%1,%v2"
>[(set_attr "type" "add")])
>
> --
> 1.8.1.4
>

Why are you removing the alternative instead of clobbering XER[CA]?

Thanks, David


Re: match.pd tweaks for vectors and issues with HONOR_NANS

2014-12-08 Thread Marc Glisse

On Tue, 18 Nov 2014, Richard Biener wrote:


I'll try to replace some more TYPE_MODE during stage3...


Btw, a convenience would be to be able to write

HONOR_NANS (type)

thus effectively make HONOR_* inline functions with a machine_mode
and a type overload (and the type overload properly looking at
element types).


Making those functions inline is not easy, because real.h and tree.h don't 
include each other. Here is a version with the functions not inline. I 
was tempted to also overload on gcond const* (for the cases that call 
gimple_cond_lhs) but the arguments were always gimple and not gcond*, so I 
didn't.


Passes bootstrap+testsuite on x86_64-linux-gnu.

2014-12-08  Marc Glisse  

* real.h (HONOR_NANS): Replace macro with 3 overloaded declarations.
* real.c: Include rtl.h and options.h.
(HONOR_NANS): Define three overloads.
* builtins.c (fold_builtin_classify, fold_builtin_unordered_cmp):
Simplify argument of HONOR_NANS.
* fold-const.c (combine_comparisons, fold_truth_not_expr,
fold_cond_expr_with_comparison, merge_truthop_with_opposite_arm,
fold_comparison, fold_binary_loc): Likewise.
* ifcvt.c (noce_try_move, noce_try_minmax): Likewise.
* ipa-inline-analysis.c (add_clause,
set_cond_stmt_execution_predicate): Likewise.
* match.pd: Likewise.
* rtlanal.c (may_trap_p_1): Likewise.
* simplify-rtx.c (simplify_const_relational_operation): Likewise.
* tree-if-conv.c (parse_predicate): Likewise.
* tree-ssa-ccp.c (valid_lattice_transition): Likewise.
* tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise.
* tree-ssa-phiopt.c (minmax_replacement, neg_replacement): Likewise.
* tree-ssa-reassoc.c (eliminate_using_constants): Likewise.
* tree-ssa-tail-merge.c (gimple_equal_p): Likewise.

--
Marc GlisseIndex: builtins.c
===
--- builtins.c  (revision 218467)
+++ builtins.c  (working copy)
@@ -9641,34 +9641,34 @@ fold_builtin_classify (location_t loc, t
   integer_minus_one_node, integer_one_node);
tmp = fold_build3_loc (loc, COND_EXPR, integer_type_node,
   isinf_call, tmp,
   integer_zero_node);
  }
 
return tmp;
   }
 
 case BUILT_IN_ISFINITE:
-  if (!HONOR_NANS (TYPE_MODE (TREE_TYPE (arg)))
+  if (!HONOR_NANS (arg)
  && !HONOR_INFINITIES (TYPE_MODE (TREE_TYPE (arg
return omit_one_operand_loc (loc, type, integer_one_node, arg);
 
   if (TREE_CODE (arg) == REAL_CST)
{
  r = TREE_REAL_CST (arg);
  return real_isfinite (&r) ? integer_one_node : integer_zero_node;
}
 
   return NULL_TREE;
 
 case BUILT_IN_ISNAN:
-  if (!HONOR_NANS (TYPE_MODE (TREE_TYPE (arg
+  if (!HONOR_NANS (arg))
return omit_one_operand_loc (loc, type, integer_zero_node, arg);
 
   if (TREE_CODE (arg) == REAL_CST)
{
  r = TREE_REAL_CST (arg);
  return real_isnan (&r) ? integer_one_node : integer_zero_node;
}
 
   arg = builtin_save_expr (arg);
   return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg);
@@ -9782,27 +9782,26 @@ fold_builtin_unordered_cmp (location_t l
   else if (code0 == REAL_TYPE && code1 == INTEGER_TYPE)
 cmp_type = type0;
   else if (code0 == INTEGER_TYPE && code1 == REAL_TYPE)
 cmp_type = type1;
 
   arg0 = fold_convert_loc (loc, cmp_type, arg0);
   arg1 = fold_convert_loc (loc, cmp_type, arg1);
 
   if (unordered_code == UNORDERED_EXPR)
 {
-  if (!HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0
+  if (!HONOR_NANS (arg0))
return omit_two_operands_loc (loc, type, integer_zero_node, arg0, arg1);
   return fold_build2_loc (loc, UNORDERED_EXPR, type, arg0, arg1);
 }
 
-  code = HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0))) ? unordered_code
-  : ordered_code;
+  code = HONOR_NANS (arg0) ? unordered_code : ordered_code;
   return fold_build1_loc (loc, TRUTH_NOT_EXPR, type,
  fold_build2_loc (loc, code, type, arg0, arg1));
 }
 
 /* Fold __builtin_{,s,u}{add,sub,mul}{,l,ll}_overflow, either into normal
arithmetics if it can never overflow, or into internal functions that
return both result of arithmetics and overflowed boolean flag in
a complex integer result, or some other check for overflow.  */
 
 static tree
Index: fold-const.c
===
--- fold-const.c(revision 218467)
+++ fold-const.c(working copy)
@@ -2585,21 +2585,21 @@ compcode_to_comparison (enum comparison_
and RCODE on the identical operands LL_ARG and LR_ARG.  Take into account
the possibility of trapping if the mode has NaNs, and return NULL_TREE
if this makes the transformation invalid.  */
 
 tree
 

[PATCH 10/10] rs6000: Get rid of the weird decimal thing in add

2014-12-08 Thread Segher Boessenkool
Peter tells me it was an artifact of old versions of the DFP code.  The
condition can never be false; delete it.


2014-12-08  Segher Boessenkool  

gcc/
* config/rs6000/rs6000.md (*add3): Remove condition.

---
 gcc/config/rs6000/rs6000.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index a544e57..6724819 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -1506,7 +1506,7 @@ (define_insn "*add3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
  (match_operand:GPR 2 "add_operand" "r,I,L")))]
-  "!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && !DECIMAL_FLOAT_MODE_P 
(GET_MODE (operands[1]))"
+  ""
   "@
add %0,%1,%2
addi %0,%1,%2
-- 
1.8.1.4



[PATCH 05/10] rs6000: Merge and split add and addic

2014-12-08 Thread Segher Boessenkool
This splits the addic patterns from the add/addi patterns, and merges the
insns and splitters.
Also change type "compare" to "add" for "addic." (the dot form).


2014-12-08  Segher Boessenkool  

gcc/
PR target/64180
* config/rs6000/rs6000.md (*add3_internal1): Rename to
"*add3".
(*add3_internal2, *add3_internal3, and (their splitters):
Delete.
(*add3_dot, *add3_dot2): New.
(*add3_imm_dot, *add3_imm_dot2): New.

---
 gcc/config/rs6000/rs6000.md | 119 +++-
 1 file changed, 74 insertions(+), 45 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 1647f8b..dcdb7c1 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -1491,7 +1491,7 @@ (define_expand "add3"
 }
 })
 
-(define_insn "*add3_internal1"
+(define_insn "*add3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
  (match_operand:GPR 2 "add_operand" "r,I,L")))]
@@ -1510,70 +1510,99 @@ (define_insn "addsi3_high"
   "addis %0,%1,ha16(%2)"
   [(set_attr "type" "add")])
 
-(define_insn "*add3_internal2"
-  [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,?y,?y")
-   (compare:CC (plus:P (match_operand:P 1 "gpc_reg_operand" "%r,r,r,r")
-   (match_operand:P 2 "reg_or_short_operand" 
"r,I,r,I"))
+(define_insn_and_split "*add3_dot"
+  [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
+   (compare:CC (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
+ (match_operand:GPR 2 "gpc_reg_operand" "r,r"))
(const_int 0)))
-   (clobber (match_scratch:P 3 "=r,r,r,r"))]
-  ""
+   (clobber (match_scratch:GPR 0 "=r,r"))]
+  "mode == Pmode"
   "@
-   add. %3,%1,%2
-   addic. %3,%1,%2
-   #
+   add. %0,%1,%2
#"
-  [(set_attr "type" "add,compare,compare,compare")
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)"
+  [(set (match_dup 0)
+   (plus:GPR (match_dup 1)
+(match_dup 2)))
+   (set (match_dup 3)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  ""
+  [(set_attr "type" "add")
(set_attr "dot" "yes")
-   (set_attr "length" "4,4,8,8")])
+   (set_attr "length" "4,8")])
 
-(define_split
-  [(set (match_operand:CC 0 "cc_reg_not_cr0_operand" "")
-   (compare:CC (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "")
- (match_operand:GPR 2 "reg_or_short_operand" ""))
+(define_insn_and_split "*add3_dot2"
+  [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
+   (compare:CC (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
+ (match_operand:GPR 2 "gpc_reg_operand" "r,r"))
(const_int 0)))
-   (clobber (match_scratch:GPR 3 ""))]
-  "reload_completed"
-  [(set (match_dup 3)
+   (set (match_operand:GPR 0 "gpc_reg_operand" "=r,r")
(plus:GPR (match_dup 1)
-(match_dup 2)))
-   (set (match_dup 0)
-   (compare:CC (match_dup 3)
+ (match_dup 2)))]
+  "mode == Pmode"
+  "@
+   add. %0,%1,%2
+   #"
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)"
+  [(set (match_dup 0)
+   (plus:GPR (match_dup 1)
+ (match_dup 2)))
+   (set (match_dup 3)
+   (compare:CC (match_dup 0)
(const_int 0)))]
-  "")
+  ""
+  [(set_attr "type" "add")
+   (set_attr "dot" "yes")
+   (set_attr "length" "4,8")])
 
-(define_insn "*add3_internal3"
-  [(set (match_operand:CC 3 "cc_reg_operand" "=x,x,?y,?y")
-   (compare:CC (plus:P (match_operand:P 1 "gpc_reg_operand" "%r,r,r,r")
-   (match_operand:P 2 "reg_or_short_operand" 
"r,I,r,I"))
+(define_insn_and_split "*add3_imm_dot"
+  [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
+   (compare:CC (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,r")
+ (match_operand:GPR 2 "short_cint_operand" "I,I"))
(const_int 0)))
-   (set (match_operand:P 0 "gpc_reg_operand" "=r,r,r,r")
-   (plus:P (match_dup 1)
-   (match_dup 2)))]
-  ""
+   (clobber (match_scratch:GPR 0 "=r,r"))
+   (clobber (reg:GPR CA_REGNO))]
+  "mode == Pmode"
   "@
-   add. %0,%1,%2
addic. %0,%1,%2
-   #
#"
-  [(set_attr "type" "add,compare,compare,compare")
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)"
+  [(set (match_dup 0)
+   (plus:GPR (match_dup 1)
+ (match_dup 2)))
+   (set (match_dup 3)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  ""
+  [(set_attr "type" "add")
(set_attr "dot" "yes")
-   (set_attr "length" "4,4,8,8")])
+   (set_attr "length" "4,8")])
 
-(define_split
-  [(set (match_operand:CC 3 "cc_reg_not_cr0_operand" "")
-   (compare:CC (plus:P (match_operand:P 1 "gpc_reg_operand" "")
-   (match_oper

[PATCH 08/10] rs6000: compare->two for rld*;rld*.

2014-12-08 Thread Segher Boessenkool
These now are the only remaining patterns that have type "compare".
And they shouldn't: "two" is a better type for them.


2014-12-08  Segher Boessenkool  

gcc/
* config/rs6000/rs6000.md (*anddi3_2rld_dot, *anddi3_rld_dot2):
Change type from "compare" to "two".

---
 gcc/config/rs6000/rs6000.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index eb15fff..d0b8bc8 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7341,7 +7341,7 @@ (define_insn_and_split "*anddi3_2rld_dot"
 {
   build_mask64_2_operands (operands[2], &operands[4]);
 }
-  [(set_attr "type" "compare")
+  [(set_attr "type" "two")
(set_attr "dot" "yes")
(set_attr "length" "8,12")])
 
@@ -7374,7 +7374,7 @@ (define_insn_and_split "*anddi3_2rld_dot2"
 {
   build_mask64_2_operands (operands[2], &operands[4]);
 }
-  [(set_attr "type" "compare")
+  [(set_attr "type" "two")
(set_attr "dot" "yes")
(set_attr "length" "8,12")])
 
-- 
1.8.1.4



[PATCH 04/10] rs6000: Remove addic from the normal add pattern

2014-12-08 Thread Segher Boessenkool
This means we can no longer add GPR0+imm.  Register allocation will have
to use a different register.


2014-12-08  Segher Boessenkool  

gcc/
PR target/64180
* config/rs6000/rs6000.md (*add3_internal1): Remove addic
alternative.

---
 gcc/config/rs6000/rs6000.md | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6f4bafb..1647f8b 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -1491,17 +1491,14 @@ (define_expand "add3"
 }
 })
 
-;; Discourage ai/addic because of carry but provide it in an alternative
-;; allowing register zero as source.
 (define_insn "*add3_internal1"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,?r,r")
-   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,r,b")
- (match_operand:GPR 2 "add_operand" "r,I,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
+   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
+ (match_operand:GPR 2 "add_operand" "r,I,L")))]
   "!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && !DECIMAL_FLOAT_MODE_P 
(GET_MODE (operands[1]))"
   "@
add %0,%1,%2
addi %0,%1,%2
-   addic %0,%1,%2
addis %0,%1,%v2"
   [(set_attr "type" "add")])
 
-- 
1.8.1.4



[PATCH 06/10] rs6000: New add/subf carry insns

2014-12-08 Thread Segher Boessenkool
This implements addc, addic, adde, addze, addme, subfc, subfic, subfe,
subfze, subfme.  Some of those in multiple forms: canonical RTL for the
"immediate" insns has four forms; subfX has a special form for subtracting
a register from itself.

All these new insns get type "add" for now.


2014-12-08  Segher Boessenkool  

gcc/
PR target/64180
* config/rs6000/predicates.md (adde_operand): New.
* config/rs6000/rs6000.md (add3_carry): New.
(add3_imm_carry_pos): New.
(add3_imm_carry_0): New.
(add3_imm_carry_m1): New.
(add3_imm_carry_neg): New.
(add3_carry_in): New.
(add3_carry_in_0): New.
(add3_carry_in_m1): New.
(subf3_carry): New.
(subf3_imm_carry_0): New.
(subf3_imm_carry_m1): New.
(subf3_carry_in): New.
(subf3_carry_in_0): New.
(subf3_carry_in_m1): New.
(subf3_carry_in_xx): New.

---
 gcc/config/rs6000/predicates.md |   6 ++
 gcc/config/rs6000/rs6000.md | 162 
 2 files changed, 168 insertions(+)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index ea230a5..a19cb2f 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -788,6 +788,12 @@ (define_predicate "add_operand"
 || satisfies_constraint_L (op)")
 (match_operand 0 "gpc_reg_operand")))
 
+;; Return 1 if the operand is either a non-special register, or 0, or -1.
+(define_predicate "adde_operand"
+  (if_then_else (match_code "const_int")
+(match_test "INTVAL (op) == 0 || INTVAL (op) == -1")
+(match_operand 0 "gpc_reg_operand")))
+
 ;; Return 1 if OP is a constant but not a valid add_operand.
 (define_predicate "non_add_cint_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index dcdb7c1..e23fadb 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -1634,6 +1634,96 @@ (define_split
 FAIL;
 })
 
+
+(define_insn "add3_carry"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
+   (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+   (match_operand:P 2 "reg_or_short_operand" "rI")))
+   (set (reg:P CA_REGNO)
+   (ltu:P (plus:P (match_dup 1)
+  (match_dup 2))
+  (match_dup 1)))]
+  ""
+  "add%I2c %0,%1,%2"
+  [(set_attr "type" "add")])
+
+(define_insn "*add3_imm_carry_pos"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
+   (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+   (match_operand:P 2 "short_cint_operand" "n")))
+   (set (reg:P CA_REGNO)
+   (geu:P (match_dup 1)
+  (match_operand:P 3 "const_int_operand" "n")))]
+  "INTVAL (operands[2]) > 0
+   && INTVAL (operands[2]) + INTVAL (operands[3]) == 0"
+  "addic %0,%1,%2"
+  [(set_attr "type" "add")])
+
+(define_insn "*add3_imm_carry_0"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
+   (match_operand:P 1 "gpc_reg_operand" "r"))
+   (set (reg:P CA_REGNO)
+   (const_int 0))]
+  ""
+  "addic %0,%1,0"
+  [(set_attr "type" "add")])
+
+(define_insn "*add3_imm_carry_m1"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
+   (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+   (const_int -1)))
+   (set (reg:P CA_REGNO)
+   (ne:P (match_dup 1)
+ (const_int 0)))]
+  ""
+  "addic %0,%1,-1"
+  [(set_attr "type" "add")])
+
+(define_insn "*add3_imm_carry_neg"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
+   (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+   (match_operand:P 2 "short_cint_operand" "n")))
+   (set (reg:P CA_REGNO)
+   (gtu:P (match_dup 1)
+  (match_operand:P 3 "const_int_operand" "n")))]
+  "INTVAL (operands[2]) < 0
+   && INTVAL (operands[2]) + INTVAL (operands[3]) == -1"
+  "addic %0,%1,%2"
+  [(set_attr "type" "add")])
+
+
+(define_insn "add3_carry_in"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r")
+   (plus:GPR (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
+   (match_operand:GPR 2 "adde_operand" "r,n"))
+ (reg:GPR CA_REGNO)))
+   (clobber (reg:GPR CA_REGNO))]
+  ""
+  "@
+   adde %0,%1,%2
+   add%G2e %0,%1"
+  [(set_attr "type" "add")])
+
+(define_insn "add3_carry_in_0"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r")
+   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
+ (reg:GPR CA_REGNO)))
+   (clobber (reg:GPR CA_REGNO))]
+  ""
+  "addze %0,%1"
+  [(set_attr "type" "add")])
+
+(define_insn "add3_carry_in_m1"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r")
+   (plus:GPR (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
+   (reg:GPR CA_REGNO))
+ (const_int -1)))
+   (clobber (reg:GPR CA_REGNO))]
+  ""
+  "addme %0,%1"
+  [(set_attr "type" "add")])
+
+
 (define_expand "one_cmpl2"
   [(set (match_operand:SDI 0 "gpc_reg

[PATCH 09/10] rs6000: Remove the (now unused) insn type "compare"

2014-12-08 Thread Segher Boessenkool

2014-12-08  Segher Boessenkool  

gcc/
* config/rs6000/40x.md (ppc403-compare): Remove "compare".
config/rs6000/440.md (ppc440-compare): Remove "compare".
config/rs6000/476.md (ppc476-compare): Remove "compare".
config/rs6000/601.md (ppc601-compare): Remove "compare".
config/rs6000/603.md (ppc603-compare): Remove "compare".
config/rs6000/6xx.md (ppc604-compare): Remove "compare".
config/rs6000/7450.md (ppc7450-compare): Remove "compare".
config/rs6000/7xx.md (ppc750-compare): Remove "compare".
config/rs6000/8540.md (ppc8540_su): Remove "compare".
config/rs6000/cell.md (cell-fast-cmp, cell-cmp-microcoded): Remove
"compare".
config/rs6000/e300c2c3.md (ppce300c3_cmp): Remove "compare".
config/rs6000/e500mc.md (e500mc_su): Remove "compare".
config/rs6000/e500mc64.md (e500mc64_su2): Remove "compare".
config/rs6000/e5500.md (e5500_sfx2): Remove "compare".
config/rs6000/e6500.md (e6500_sfx2): Remove "compare".
config/rs6000/mpc.md (mpccore-compare): Remove "compare".
config/rs6000/power4.md (power4-compare): Remove "compare".
config/rs6000/power5.md (power5-compare): Remove "compare".
config/rs6000/power6.md (power6-compare): Remove "compare".
config/rs6000/power7.md (power7-compare): Remove "compare".
config/rs6000/power8.md (power8-compare): Remove "compare".  Update
comment.
config/rs6000/rs6000.c (rs6000_adjust_cost) : Remove
(three times).
(is_cracked_insn): Remove TYPE_COMPARE case.
(insn_must_be_first_in_group) : Remove (twice).
config/rs6000/rs6000.md (type): Remove "compare".
(cell_micro): Remove "compare".
config/rs6000/rs64.md (rs64a-compare): Remove "compare".

---
 gcc/config/rs6000/40x.md  |  2 +-
 gcc/config/rs6000/440.md  |  2 +-
 gcc/config/rs6000/476.md  |  2 +-
 gcc/config/rs6000/601.md  |  2 +-
 gcc/config/rs6000/603.md  |  2 +-
 gcc/config/rs6000/6xx.md  |  2 +-
 gcc/config/rs6000/7450.md |  2 +-
 gcc/config/rs6000/7xx.md  |  2 +-
 gcc/config/rs6000/8540.md |  2 +-
 gcc/config/rs6000/cell.md | 10 --
 gcc/config/rs6000/e300c2c3.md |  2 +-
 gcc/config/rs6000/e500mc.md   |  2 +-
 gcc/config/rs6000/e500mc64.md |  2 +-
 gcc/config/rs6000/e5500.md|  2 +-
 gcc/config/rs6000/e6500.md|  2 +-
 gcc/config/rs6000/mpc.md  |  2 +-
 gcc/config/rs6000/power4.md   |  5 ++---
 gcc/config/rs6000/power5.md   |  5 ++---
 gcc/config/rs6000/power6.md   |  5 ++---
 gcc/config/rs6000/power7.md   |  5 ++---
 gcc/config/rs6000/power8.md   |  8 +++-
 gcc/config/rs6000/rs6000.c|  6 --
 gcc/config/rs6000/rs6000.md   |  4 +---
 gcc/config/rs6000/rs64.md |  2 +-
 24 files changed, 32 insertions(+), 48 deletions(-)

diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md
index 0903536..650fe52 100644
--- a/gcc/config/rs6000/40x.md
+++ b/gcc/config/rs6000/40x.md
@@ -53,7 +53,7 @@ (define_insn_reservation "ppc403-three" 1
   "iu_40x,iu_40x,iu_40x")
 
 (define_insn_reservation "ppc403-compare" 3
-  (and (ior (eq_attr "type" "cmp,compare")
+  (and (ior (eq_attr "type" "cmp")
(and (eq_attr "type" "add,logical,shift,exts")
 (eq_attr "dot" "yes")))
(eq_attr "cpu" "ppc403,ppc405"))
diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md
index ff91fdb..b8a1d33 100644
--- a/gcc/config/rs6000/440.md
+++ b/gcc/config/rs6000/440.md
@@ -95,7 +95,7 @@ (define_insn_reservation "ppc440-branch" 1
   "ppc440_issue,ppc440_i_pipe")
 
 (define_insn_reservation "ppc440-compare" 2
-  (and (ior (eq_attr "type" "cmp,compare,cr_logical,delayed_cr,mfcr")
+  (and (ior (eq_attr "type" "cmp,cr_logical,delayed_cr,mfcr")
(and (eq_attr "type" "add,logical,shift,exts")
 (eq_attr "dot" "yes")))
(eq_attr "cpu" "ppc440"))
diff --git a/gcc/config/rs6000/476.md b/gcc/config/rs6000/476.md
index 9bfd6b6..cf47754 100644
--- a/gcc/config/rs6000/476.md
+++ b/gcc/config/rs6000/476.md
@@ -77,7 +77,7 @@ (define_insn_reservation "ppc476-complex-integer" 1
ppc476_i_pipe")
 
 (define_insn_reservation "ppc476-compare" 4
-  (and (ior (eq_attr "type" "compare,mfcr,mfcrf,mtcr,mfjmpr,mtjmpr")
+  (and (ior (eq_attr "type" "mfcr,mfcrf,mtcr,mfjmpr,mtjmpr")
(and (eq_attr "type" "add,logical,shift,exts")
 (eq_attr "dot" "yes")))
(eq_attr "cpu" "ppc476"))
diff --git a/gcc/config/rs6000/601.md b/gcc/config/rs6000/601.md
index de51cbf..af14274 100644
--- a/gcc/config/rs6000/601.md
+++ b/gcc/config/rs6000/601.md
@@ -74,7 +74,7 @@ (define_insn_reservation "ppc601-idiv" 36
 ; compare executes on integer unit, but feeds insns which
 ; execute on the branch unit.
 (define_insn_reservation "ppc601-compare" 3
-  (and (ior (eq_attr "type" "cmp,compare")
+  (and (ior (eq_attr "type" "cmp")
(and (eq_attr "type" "shift,exts")
 (eq_a

[PATCH 07/10] rs6000: The big carry insns addition

2014-12-08 Thread Segher Boessenkool
Now that everything is in place for letting GCC use the carry-using and
carry-producing machine insns as separate RTL insns, switch over all
remaining patterns that clobber CA without telling the compiler.  This
are the multiple-precision add/sub/neg patterns, and the various eq/ne/
ltu/gtu/leu/geu patterns.

The eq/ne patterns are special.  The optimal machine code for those isn't
regular (like e.g. the ltu patterns are), and we want to implement a plain
eq as "cntlz[wd];sr[wd]i"; but that means that if we split those patterns
at expand time, combine will happily put them back together again.  So
expand them as eq/ne, and split later.


2014-12-08  Segher Boessenkool  

gcc/
PR target/64180
* config/rs6000/predicates.md (unsigned_comparison_operator): New.
(signed_comparison_operator): New.
* config/rs6000/rs6000-protos.h (rs6000_emit_eqne): Declare.
* config/rs6000/rs6000.c (rs6000_emit_eqne): New function.
(rs6000_emit_sCOND): Remove ISEL test (move it to the expander).
* config/rs6000/rs6000.md (add3 for SDI): Expand DImode
add to addc,adde directly, if !TARGET_POWERPC64.
(sub3 for SDI): Expand DImode sub to subfc,subfe directly,
if !TARGET_POWERPC64.
(neg2): Delete expander.
(*neg2): Rename to "neg2".
(addti3, subti3): Delete.
(addti3, subti3): New expanders.
(*adddi3_noppc64, *subdi3_noppc64, *negdi2_noppc64): Delete.
(cstore4_unsigned): New expander.
(cstore4): Allow GPR as output (not just SI).  Rewrite.
(cstore4 for FP): Remove superfluous quotes.
(*eq, *eq_compare, *plus_eqsi and splitter,
*compare_plus_eqsi and splitter, *plus_eqsi_compare and splitter,
*neg_eq0, *neg_eq, *ne0_, plus_ne0_,
compare_plus_ne0_ and splitter, *compare_plus_ne0__1 and
splitter, *plus_ne0__compare and splitter, *leu,
*leu_compare and splitter, *plus_leu, *neg_leu,
*and_neg_leu, *ltu, *ltu_compare, *plus_ltu,
*plus_ltu_1, *plus_ltucompare, *neg_ltu, *geu,
*geu_compare and splitter, *plus_geu, *neg_geu,
*and_neg_geu, *plus_gt0, *gtu, *gtu_compare,
*plus_gtu, *plus_gtu_1, *plus_gtu_compare,
*neg_gtu, 12 anonymous insns, and 12 anonymous splitters):
Delete.
(eq3, ne3): New.
(*neg_eq_, *neg_ne_): New.
(*plus_eq_, *plus_ne_): New.
(*minus_eq_, *minus_ne_): New.

---
 gcc/config/rs6000/predicates.md   |8 +
 gcc/config/rs6000/rs6000-protos.h |1 +
 gcc/config/rs6000/rs6000.c|   27 +-
 gcc/config/rs6000/rs6000.md   | 1559 +
 4 files changed, 400 insertions(+), 1195 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index a19cb2f..8b68f9c 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1240,6 +1240,14 @@ (define_predicate "rs6000_cbranch_operator"
   (match_code ("unlt,unle,ungt,unge"
(match_operand 0 "comparison_operator")))
 
+;; Return 1 if OP is an unsigned comparison operator.
+(define_predicate "unsigned_comparison_operator"
+  (match_code "ltu,gtu,leu,geu"))
+
+;; Return 1 if OP is a signed comparison operator.
+(define_predicate "signed_comparison_operator"
+  (match_code "lt,gt,le,ge"))
+
 ;; Return 1 if OP is a comparison operation that is valid for an SCC insn --
 ;; it must be a positive comparison.
 (define_predicate "scc_comparison_operator"
diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 8d27a69..eb64598 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -109,6 +109,7 @@ extern void print_operand (FILE *, rtx, int);
 extern void print_operand_address (FILE *, rtx);
 extern enum rtx_code rs6000_reverse_condition (machine_mode,
   enum rtx_code);
+extern rtx rs6000_emit_eqne (machine_mode, rtx, rtx, rtx);
 extern void rs6000_emit_sISEL (machine_mode, rtx[]);
 extern void rs6000_emit_sCOND (machine_mode, rtx[]);
 extern void rs6000_emit_cbranch (machine_mode, rtx[]);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index e58fc81..f60f4d7 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -19421,6 +19421,27 @@ rs6000_emit_sISEL (machine_mode mode ATTRIBUTE_UNUSED, 
rtx operands[])
   rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx);
 }
 
+/* Emit RTL that sets a register to zero if OP1 and OP2 are equal.  SCRATCH
+   can be used as that dest register.  Return the dest register.  */
+
+rtx
+rs6000_emit_eqne (machine_mode mode, rtx op1, rtx op2, rtx scratch)
+{
+  if (op2 == const0_rtx)
+return op1;
+
+  if (GET_CODE (scratch) == SCRATCH)
+scratch = gen_reg_rtx (mode);
+
+  if (logical_operand (op2, mode))
+emit_insn (gen_rtx_SET (VOIDmode, scratch, gen_rtx_X

Re: [PATCH 04/10] rs6000: Remove addic from the normal add pattern

2014-12-08 Thread Segher Boessenkool
On Mon, Dec 08, 2014 at 09:44:44AM -0500, David Edelsohn wrote:
> > -;; Discourage ai/addic because of carry but provide it in an alternative
> > -;; allowing register zero as source.
> >  (define_insn "*add3_internal1"
> > -  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,?r,r")
> > -   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,r,b")
> > - (match_operand:GPR 2 "add_operand" "r,I,I,L")))]
> > +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
> > +   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
> > + (match_operand:GPR 2 "add_operand" "r,I,L")))]
> >"!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && !DECIMAL_FLOAT_MODE_P 
> > (GET_MODE (operands[1]))"
> >"@
> > add %0,%1,%2
> > addi %0,%1,%2
> > -   addic %0,%1,%2
> > addis %0,%1,%v2"
> >[(set_attr "type" "add")])
> 
> Why are you removing the alternative instead of clobbering XER[CA]?

I should have mentioned, sorry.

We don't want to clobber CA on *every* add, and reload can generate more
adds out of thin air.  And CA is a fixed register.

There may be a better way to do this, but I don't know it.


Segher


Re: [PATCH 04/10] rs6000: Remove addic from the normal add pattern

2014-12-08 Thread David Edelsohn
On Mon, Dec 8, 2014 at 10:01 AM, Segher Boessenkool
 wrote:
> On Mon, Dec 08, 2014 at 09:44:44AM -0500, David Edelsohn wrote:
>> > -;; Discourage ai/addic because of carry but provide it in an alternative
>> > -;; allowing register zero as source.
>> >  (define_insn "*add3_internal1"
>> > -  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,?r,r")
>> > -   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,r,b")
>> > - (match_operand:GPR 2 "add_operand" "r,I,I,L")))]
>> > +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
>> > +   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
>> > + (match_operand:GPR 2 "add_operand" "r,I,L")))]
>> >"!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && 
>> > !DECIMAL_FLOAT_MODE_P (GET_MODE (operands[1]))"
>> >"@
>> > add %0,%1,%2
>> > addi %0,%1,%2
>> > -   addic %0,%1,%2
>> > addis %0,%1,%v2"
>> >[(set_attr "type" "add")])
>>
>> Why are you removing the alternative instead of clobbering XER[CA]?
>
> I should have mentioned, sorry.
>
> We don't want to clobber CA on *every* add, and reload can generate more
> adds out of thin air.  And CA is a fixed register.
>
> There may be a better way to do this, but I don't know it.

Okay.

It's unfortunate that GCC does not have separate patterns for a safe
ADD used by reload.

Thanks, David


Re: [PATCH 05/10] rs6000: Merge and split add and addic

2014-12-08 Thread David Edelsohn
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool
 wrote:
> This splits the addic patterns from the add/addi patterns, and merges the
> insns and splitters.
> Also change type "compare" to "add" for "addic." (the dot form).
>
>
> 2014-12-08  Segher Boessenkool  
>
> gcc/
> PR target/64180
> * config/rs6000/rs6000.md (*add3_internal1): Rename to
> "*add3".
> (*add3_internal2, *add3_internal3, and (their splitters):
> Delete.
> (*add3_dot, *add3_dot2): New.
> (*add3_imm_dot, *add3_imm_dot2): New.

Okay.

Thanks, David


Re: RFA: patch to fix PR64157

2014-12-08 Thread Richard Sandiford
Vladimir Makarov  writes:
>The following patch fixes
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64157
>
>
>After calling target_reinit from save_target_globals for switchable 
> targets (as ppc), a lot of ira data (register sets, classes etc) become 
> undefined.  After that ira-costs.c crashes when the undefined data are used.
>
>The patch was successfully bootstrapped and tested on x86-64.
>
>Ok to commit to the trunk?
>
> 2014-12-05  Vladimir Makarov  
>
>  PR rtl-optimization/64157
>  * toplev.c (target_reinit): Call ira_init.
>
> Index: toplev.c
> ===
> --- toplev.c  (revision 218378)
> +++ toplev.c  (working copy)
> @@ -1888,6 +1888,8 @@ target_reinit (void)
>/* This invokes target hooks to set fixed_reg[] etc, which is
>   mode-dependent.  */
>init_regs ();
> +  /* Set IRA data depended on target parameters.  */
> +  ira_init ();

Could you give more details about how this happens?  It's reverting part of:

2014-06-25  Jan Hubicka  

* toplev.c (backend_init_target): Move init_emit_regs and
init_regs to...
(backend_init) ... here; skip ira_init_once and backend_init_target.
(target_reinit) ... and here; clear
this_target_rtl->lang_dependent_initialized.
(lang_dependent_init_target): Clear
this_target_rtl->lang_dependent_initialized;
break out rtl initialization to ...
(initialize_rtl): ... here; call also backend_init_target
and ira_init_once.
* toplev.h (initialize_rtl): New function.
* function.c: Include toplev.h
(init_function_start): Call initialize_rtl.
* rtl.h (target_rtl): Add target_specific_initialized,
lang_dependent_initialized.

which was supposed to delay the ira_init so that it only gets called
once we start to compile a function.  It sounds from your patch like
that either isn't early enough or isn't happening at all for some reason.

Thanks,
Richard



Re: [PATCH 06/10] rs6000: New add/subf carry insns

2014-12-08 Thread David Edelsohn
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool
 wrote:
> This implements addc, addic, adde, addze, addme, subfc, subfic, subfe,
> subfze, subfme.  Some of those in multiple forms: canonical RTL for the
> "immediate" insns has four forms; subfX has a special form for subtracting
> a register from itself.
>
> All these new insns get type "add" for now.
>
>
> 2014-12-08  Segher Boessenkool  
>
> gcc/
> PR target/64180
> * config/rs6000/predicates.md (adde_operand): New.
> * config/rs6000/rs6000.md (add3_carry): New.
> (add3_imm_carry_pos): New.
> (add3_imm_carry_0): New.
> (add3_imm_carry_m1): New.
> (add3_imm_carry_neg): New.
> (add3_carry_in): New.
> (add3_carry_in_0): New.
> (add3_carry_in_m1): New.
> (subf3_carry): New.
> (subf3_imm_carry_0): New.
> (subf3_imm_carry_m1): New.
> (subf3_carry_in): New.
> (subf3_carry_in_0): New.
> (subf3_carry_in_m1): New.
> (subf3_carry_in_xx): New.

Okay.

Thanks, David


Re: [PATCH 06/10] rs6000: New add/subf carry insns

2014-12-08 Thread David Edelsohn
On Mon, Dec 8, 2014 at 10:41 AM, David Edelsohn  wrote:
> On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool
>  wrote:
>> This implements addc, addic, adde, addze, addme, subfc, subfic, subfe,
>> subfze, subfme.  Some of those in multiple forms: canonical RTL for the
>> "immediate" insns has four forms; subfX has a special form for subtracting
>> a register from itself.
>>
>> All these new insns get type "add" for now.
>>
>>
>> 2014-12-08  Segher Boessenkool  
>>
>> gcc/
>> PR target/64180
>> * config/rs6000/predicates.md (adde_operand): New.
>> * config/rs6000/rs6000.md (add3_carry): New.
>> (add3_imm_carry_pos): New.
>> (add3_imm_carry_0): New.
>> (add3_imm_carry_m1): New.
>> (add3_imm_carry_neg): New.
>> (add3_carry_in): New.
>> (add3_carry_in_0): New.
>> (add3_carry_in_m1): New.
>> (subf3_carry): New.
>> (subf3_imm_carry_0): New.
>> (subf3_imm_carry_m1): New.
>> (subf3_carry_in): New.
>> (subf3_carry_in_0): New.
>> (subf3_carry_in_m1): New.
>> (subf3_carry_in_xx): New.
>
> Okay.
>
> Thanks, David

As we both noticed, there are a few problems with this patch, so I'll
wait for a revised version.

Thanks, David


Re: Fix libgomp crash without TLS (PR42616)

2014-12-08 Thread Varvara Rainchik
Is it ok to add GCC_CHECK_EMUTLS test in libgomp/configure.ac or
should I add in tls.m3 a similair test that would be used only in
libgomp?

2014-12-08 17:03 GMT+03:00 Jakub Jelinek :
> On Mon, Dec 08, 2014 at 04:30:46PM +0300, Varvara Rainchik wrote:
>> Hi guys,
>>
>> Could you please take a look at this issue? This fix is still urgent
>> for Android.
>
> I'm afraid it will break those targets where emutls is not on, but the C
> library doesn't support TLS.  I think it is acceptable not to care about
> #pragma omp from different pthread_create created threads in that case, but
> stopping support completely might be too early.
> So, can you instead arrange for HAVE_TLS to be defined if emutls is
> supported (check for that during configure)?
>
> Jakub


Re: [patch] Fix tilepro includes

2014-12-08 Thread Jan-Benedict Glaw
On Fri, 2014-11-21 08:45:11 -0500, Andrew MacLeod  wrote:
> During the flattening of optabs.h, I updated all the config/* files
> which were affected.   I've been getting spurious failures with
> config-list.mk where my changes would "disappear" and tracked down
> why.
> 
> I was blissfully unaware that the tilepro ports mul-tables.c file is
> actually generated from gen-mul-tables.cc.
> 
> This patch fixes the include issue by adding "#include insn-codes.h"
> to the generated files.  I also added a comment indicating these are
> generated files, and to make changes in the generator.
> 
> This allows all the tile* ports to compile properly again.
> 
> OK for trunk?

Seems this wasn't ever ACKed or applied up to now? I'm still seeing
compilation errors for the tile targets, see eg.
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=382169

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: 23:53 <@jbglaw> So, ich kletter' jetzt mal ins Bett.
the second  : 23:57 <@jever2> .oO( kletter ..., hat er noch Gitter vorm Bett, 
wie früher meine Kinder?)
  00:00 <@jbglaw> jever2: *patsch*
  00:01 <@jever2> *aua*, wofür, Gedanken sind frei!
  00:02 <@jbglaw> Nee, freie Gedanken, die sind seit 1984 doch aus!
  00:03 <@jever2> 1984? ich bin erst seit 1985 verheiratet!


signature.asc
Description: Digital signature


Re: Fix libgomp crash without TLS (PR42616)

2014-12-08 Thread Jakub Jelinek
On Mon, Dec 08, 2014 at 07:01:09PM +0300, Varvara Rainchik wrote:
> Is it ok to add GCC_CHECK_EMUTLS test in libgomp/configure.ac or
> should I add in tls.m3 a similair test that would be used only in
> libgomp?

I think config/tls.m4 would be a better place, but doing it in some way
where the users of config/tls.m4 could actually by using different macros
from there choose what they want (either check solely for real TLS, or
only for emutls, or for both).  And libgomp/configure.ac would then choose
it is happy with both.

Jakub


Re: [PATCH] libgccjit cleanups

2014-12-08 Thread David Malcolm
On Sat, 2014-12-06 at 15:29 -0500, Ulrich Drepper wrote:
> This patch broken out of one I sent earlier with some extensions.  It
> contains only little cleanups to the libgccjit code.
> 
> When creating the linker command line the code now uses an auto_vec
> instead of the fixed size array.
> 
> The second change adds the missing context::set_str_option member
> function to the C++ interface.
> 
> The third change it to the string option handling.  Instead of just
> using the pointer passed to the function the code now makes a copy
> of the string.
> 
> 
> OK?

Thanks.  Overall this is good, a few nitpicks inline below:

> gcc/ChangeLog:
> 
> 2014-12-06  Ulrich Drepper  
> 
>   * jit/jit-playback.c (convert_to_dso): Use auto_vec instead
> of automatic array to build up command line.
> * jit/jit-recording.c (recording::context::set_str_option):
> Make copy of the string.
> (recording::context::~context): Free string options.
> * jit/jit-recording.h (recording::context): Adjust type
> of m_str_options member.
> * jit/libgccjit.h: Adjust comment about
> gcc_jit_context_set_str_option parameter begin used after
> the call.
> * jit/libgccjit++.h (gccjit::context): Add set_str_option
> member function.
> 
> 
> diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c
> index ecdae80..6d1eb8a 100644
> --- a/gcc/jit/jit-playback.c
> +++ b/gcc/jit/jit-playback.c
> @@ -1726,18 +1726,19 @@ convert_to_dso (const char *ctxt_progname)
>   TV_ASSEMBLE.  */
>auto_timevar assemble_timevar (TV_ASSEMBLE);
>const char *errmsg;
> -  const char *argv[7];
> +  auto_vec  argvec;
> +#define ADD_ARG(arg) argvec.safe_push (arg)
>int exit_status = 0;
>int err = 0;
>const char *gcc_driver_name = GCC_DRIVER_NAME;
>  
> -  argv[0] = gcc_driver_name;
> -  argv[1] = "-shared";
> +  ADD_ARG (gcc_driver_name);
> +  ADD_ARG ("-shared");
>/* The input: assembler.  */
> -  argv[2] = m_path_s_file;
> +  ADD_ARG (m_path_s_file);
>/* The output: shared library.  */
> -  argv[3] = "-o";
> -  argv[4] = m_path_so_file;
> +  ADD_ARG ("-o");
> +  ADD_ARG (m_path_so_file);
>  
>/* Don't use the linker plugin.
>   If running with just a "make" and not a "make install", then we'd
> @@ -1746,17 +1747,17 @@ convert_to_dso (const char *ctxt_progname)
>   libto_plugin is a .la at build time, with it becoming installed with
>   ".so" suffix: i.e. it doesn't exist with a .so suffix until install
>   time.  */
> -  argv[5] = "-fno-use-linker-plugin";
> +  ADD_ARG ("-fno-use-linker-plugin");
>  
>/* pex argv arrays are NULL-terminated.  */
> -  argv[6] = NULL;
> +  ADD_ARG (NULL);
>  
>/* pex_one's error-handling requires pname to be non-NULL.  */
>gcc_assert (ctxt_progname);
>  
>errmsg = pex_one (PEX_SEARCH, /* int flags, */
>   gcc_driver_name,
> - const_cast (argv),
> + const_cast  (argvec.address ()),
>   ctxt_progname, /* const char *pname */
>   NULL, /* const char *outname */
>   NULL, /* const char *errname */
> @@ -1783,6 +1784,7 @@ convert_to_dso (const char *ctxt_progname)
>getenv ("PATH"));
>return;
>  }
> +#undef ADD_ARG
>  }
>  
>  /* Top-level hook for playing back a recording context.
> diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
> --- a/gcc/jit/jit-recording.c
> +++ b/gcc/jit/jit-recording.c
> @@ -215,6 +215,9 @@ recording::context::~context ()
>delete m;
>  }
>  
> +  for (i = 0; i < GCC_JIT_NUM_STR_OPTIONS; ++i)
> +free (m_str_options[i]);
> +
>if (m_builtins_manager)
>  delete m_builtins_manager;
>  
> @@ -827,7 +830,7 @@ recording::context::set_str_option (enum 
> gcc_jit_str_option opt,
>"unrecognized (enum gcc_jit_str_option) value: %i", opt);
>return;
>  }
> -  m_str_options[opt] = value;
> +  m_str_options[opt] = xstrdup (value);
>  }

If the user repeatedly calls set_str_option on the same opt, this will
be a leak.

So something like:

  /* Free any existing value.  */
  free (m_str_options[opt]);
  m_str_options[opt] = xstrdup (value);


>  /* Set the given integer option for this context, or add an error if
> diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
> --- a/gcc/jit/jit-recording.h
> +++ b/gcc/jit/jit-recording.h
> @@ -246,7 +246,7 @@ private:
>char *m_first_error_str;
>bool m_owns_first_error_str;
>  
> -  const char *m_str_options[GCC_JIT_NUM_STR_OPTIONS];
> +  char *m_str_options[GCC_JIT_NUM_STR_OPTIONS];
>int m_int_options[GCC_JIT_NUM_INT_OPTIONS];
>bool m_bool_options[GCC_JIT_NUM_BOOL_OPTIONS];
>  
> diff --git a/gcc/jit/libgccjit++.h b/gcc/jit/libgccjit++.h
> --- a/gcc/jit/libgccjit++.h
> +++ b/gcc/jit/libgccjit++.h
> @@ -99,6 +99,9 @@ namespace gccjit
>  void dump_to_file (const std::string &path,
>  bool update_loc

Re: [PATCH 8/9] Negative numbers added for sreal class.

2014-12-08 Thread Martin Liška

On 11/28/2014 10:32 AM, Richard Biener wrote:

On Thu, Nov 27, 2014 at 6:08 PM, Martin Liška  wrote:

On 11/21/2014 04:21 PM, Martin Liška wrote:


On 11/21/2014 04:02 PM, Richard Biener wrote:


On Fri, Nov 21, 2014 at 3:39 PM, Martin Liška  wrote:


Hello.

Ok, this is simplified, one can use sreal a = 12345 and it works ;)


that's a  new API, right?  There is no max () and I think that using
LONG_MIN here is asking for trouble (host dependence).  The
comment in the file says the max should be
sreal (SREAL_MAX_SIG, SREAL_MAX_EXP) and the min
sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP)?



Sure, sreal can store much bigger(smaller) numbers :)


Where do you need sreal::to_double?  The host shouldn't perform
double calculations so it can be only for dumping?  In which case
the user should have used sreal::dump (), maybe with extra
arguments.



That new function was request from Honza, only for debugging purpose.
I agree that dump should this kind of job.

If no other problem, I will run tests once more and commit it.
Thanks,
Martin



-#define SREAL_MAX_EXP (INT_MAX / 4)
+#define SREAL_MAX_EXP (INT_MAX / 8)

this change doesn't look necessary anymore?

Btw, it's also odd that...

   #define SREAL_PART_BITS 32
...
   #define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 1))
   #define SREAL_MAX_SIG (((uint64_t) 1 << SREAL_PART_BITS) - 1)

thus all m_sig values fit in 32bits but we still use a uint64_t m_sig ...
(the implementation uses 64bit for internal computations, but still
the storage is wasteful?)

Of course the way normalize() works requires that storage to be
64bits to store unnormalized values.

I'd say ok with the SREAL_MAX_EXP change reverted.



Hi.

You are right, this change was done because I used one bit for m_negative
(bitfield), not needed any more.

Final version attached.

Thank you,
Martin


Thanks,
Richard.





Otherwise looks good to me and sorry for not noticing the above
earlier.

Thanks,
Richard.


Thanks,
Martin



 };

 extern void debug (sreal &ref);
@@ -76,12 +133,12 @@ inline sreal &operator+= (sreal &a, const sreal
&b)

 inline sreal &operator-= (sreal &a, const sreal &b)
 {
-return a = a - b;
+  return a = a - b;
 }

 inline sreal &operator/= (sreal &a, const sreal &b)
 {
-return a = a / b;
+  return a = a / b;
 }

 inline sreal &operator*= (sreal &a, const sreal &b)
--
2.1.2










Hello.

After IRC discussions, I decided to give sreal another refactoring where I
use int64_t for m_sig.

This approach looks much easier and straightforward. I would like to
ask folk for comments?


I think you want to exclude LONG_MIN (how do you know that LONG_MIN
is min(int64_t)?) as a valid value for m_sig - after all SREAL_MIN_SIG
makes it symmetric.  Or simply use abs_hwi at


Hello.

I decided to use abs_hwi.



+  int64_t s = m_sig < 0 ? -1 : 1;
+  uint64_t sig = m_sig == LONG_MIN ? LONG_MAX : std::abs (m_sig);

-#define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 1))
-#define SREAL_MAX_SIG (((uint64_t) 1 << SREAL_PART_BITS) - 1)
+#define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 2))
+#define SREAL_MAX_SIG (((uint64_t) 1 << (SREAL_PART_BITS - 1)) - 1)

shouldn't this also be -2 in the last line?


It's correct, in the first line, I s/'SREAL_PART_BITS - 1'/'SREAL_PART_BITS - 
2' and
second one is also decremented: s/'SREAL_PART_BITS'/'REAL_PART_BITS - 1'.



That is, you effectively use the MSB as a sign bit?


Yes. It uses signed integer with MSB as a sign bit.



Btw, a further change would be to make m_sig 'signed int', matching
the "real" storage requirements according to SREAL_PART_BITS.
This would of course still require temporaries used for computation
to be 64bits and it would require normalization to work on the
temporaries.  But then we'd get down to 8 bytes storage ...


Agree, we can shrink the size for future.

I've been running tests, I hope this implementation is much nicer than
having bool m_negative. What do you think about trunk acceptation?

Thanks,
Martin



Richard.



I am able to run profiled bootstrap on x86_64-linux-pc and ppc64-linux-pc
and new regression is introduced.

Thanks,
Martin




>From c275c48eb5f0f41f546bb7866d70c1fe066d6d62 Mon Sep 17 00:00:00 2001
From: mliska 
Date: Wed, 26 Nov 2014 15:46:42 +0100
Subject: [PATCH] New sreal implementation which uses int64_t as m_sig.

gcc/ChangeLog:

2014-12-08  Martin Liska  

	* sreal.c (sreal::shift_right): New implementation
	for int64_t as m_sig.
	(sreal::normalize): Likewise.
	(sreal::to_int): Likewise.
	(sreal::operator+): Likewise.
	(sreal::operator-): Likewise.
	(sreal::operator*): Likewise.
	(sreal::operator/): Likewise.
	(sreal::signedless_minus): Removed.
	(sreal::signedless_plus): Removed.
	(sreal::debug): const keyword is added.
	* sreal.h (sreal::operator<): New implementation
	for int64_t as m_sig.
	* ipa-inline.c (recursive_inlining): LONG_MIN is replaced
	with sreal::min ().
---
 gcc/ipa-inline.c |   2 +-
 gcc/sreal.c  | 129 ++

Re: [PATCH 3/4] First usage of cgraph_summary in ipa-prop pass.

2014-12-08 Thread Martin Liška

On 11/14/2014 04:23 PM, Martin Liška wrote:

On 11/14/2014 03:05 PM, Martin Liška wrote:

On 11/11/2014 01:26 PM, mliska wrote:

gcc/ChangeLog:

2014-11-12  Martin Liska  

* auto-profile.c: Include cgraph_summary.h.
* cgraph.c: Likewise.
* cgraphbuild.c: Likewise.
* cgraphclones.c: Likewise.
* cgraphunit.c: Likewise.
* ipa-cp.c: Likewise.
* ipa-devirt.c: Likewise.
* ipa-icf.c: Likewise.
* ipa-inline-analysis.c (evaluate_properties_for_edge): Usage of
ipa_node_params_vector is replaced with ipa_node_params_summary.
(inline_node_duplication_hook): Likewise.
(estimate_function_body_sizes): Likewise.
(remap_edge_change_prob): Likewise.
(inline_merge_summary): Likewise.
* ipa-inline-transform.c: Include of cgraph_summary.h.
* ipa-inline.c (early_inliner): Usage of
ipa_node_params_vector is replaced with ipa_node_params_summary.
* ipa-polymorphic-call.c: Include of cgraph_summary.h.
* ipa-profile.c: Include of cgraph_summary.h.
* ipa-prop.c (struct func_body_info): Struct keyword is removed.
(struct ipa_cst_ref_desc): Likewise.
(ipa_func_spec_opts_forbid_analysis_p): Likewise.
(ipa_alloc_node_params): Likewise.
(ipa_initialize_node_params): Likewise.
(ipa_print_node_jump_functions_for_edge): Likewise.
(ipa_print_node_jump_functions): Likewise.
(ipa_print_all_jump_functions): Likewise.
(ipa_set_jf_constant): Likewise.
(check_stmt_for_type_change): Likewise.
(detect_type_change_from_memory_writes): Likewise.
(find_dominating_aa_status): Likewise.
(parm_bb_aa_status_for_bb): Likewise.
(parm_preserved_before_stmt_p): Likewise.
(parm_ref_data_preserved_p): Likewise.
(parm_ref_data_pass_through_p): Likewise.
(struct ipa_known_agg_contents_list): Likewise.
(get_place_in_agg_contents_list): Likewise.
(build_agg_jump_func_from_list): Likewise.
(determine_locally_known_aggregate_parts): Likewise.
(ipa_compute_jump_functions_for_edge): Likewise.
(ipa_compute_jump_functions_for_bb): Likewise.
(ipa_note_param_call): Likewise.
(ipa_analyze_indirect_call_uses) Likewise.:
(ipa_analyze_virtual_call_uses): Likewise.
(ipa_analyze_call_uses): Likewise.
(visit_ref_for_mod_analysis): Likewise.
(ipa_analyze_controlled_uses): Likewise.
(ipa_analyze_node): Likewise.
(update_jump_functions_after_inlining): Likewise.
(ipa_make_edge_direct_to_target): Likewise.
(ipa_find_agg_cst_for_param): Likewise.
(remove_described_reference): Likewise.
(jfunc_rdesc_usable): Likewise.
(try_decrement_rdesc_refcount): Likewise.
(try_make_edge_direct_simple_call): Likewise.
(try_make_edge_direct_virtual_call): Likewise.
(update_indirect_edges_after_inlining): Likewise.
(propagate_info_to_inlined_callees): Likewise.
(propagate_controlled_uses): Likewise.
(ipa_propagate_indirect_call_infos): Likewise.
(ipa_free_all_edge_args): Likewise.
(ipa_node_params::~ipa_node_params): Likewise.
(ipa_free_all_node_params): Likewise.
(ipa_edge_removal_hook): Likewise.
(ipa_node_removal_hook): Likewise.
(ipa_edge_duplication_hook): Likewise.
(ipa_add_new_function): Removed
(ipa_node_params_cgraph_summary::duplication_hook): New function.
(ipa_node_duplication_hook): Struct keyword removed.
(ipa_register_cgraph_hooks): Removal of old hooks.
(ipa_unregister_cgraph_hooks): Likewise.
(ipa_print_node_params): Struct keyword is removed.
(ipa_print_all_params): Likewise.
(ipa_modify_formal_parameters): Likewise.
(ipa_modify_call_arguments): Likewise.
(ipa_modify_expr): Likewise.
(ipa_get_adjustment_candidate): Likewise.
(index_in_adjustments_multiple_times_p): Likewise.
(ipa_combine_adjustments): Likewise.
(ipa_dump_param_adjustments): Likewise.
(ipa_write_jump_function): Likewise.
(ipa_read_jump_function): Likewise.
(ipa_write_indirect_edge_info): Likewise.
(ipa_read_indirect_edge_info): Likewise.
(ipa_write_node_info): Likewise.
(ipa_read_node_info): Likewise.
(ipa_prop_write_jump_functions): Likewise.
(ipa_prop_read_section): Likewise.
(ipa_prop_read_jump_functions): Likewise.
(write_agg_replacement_chain): Likewise.
(read_agg_replacement_chain): Likewise.
(ipa_prop_write_all_agg_replacement): Likewise.
(read_replacements_section): Likewise.
(ipa_prop_read_all_agg_replacement): Likewise.
(adjust_agg_replacement_values): Likewise.
(ipcp_modif_dom_walker::before_dom_children): Likewise.
(ipcp_transform_function): Likewise.
* ipa-prop.h (struct ipa_node_params): Introduction of new class
ipa_node_params_cgraph_summary.
* ipa-split.c: Include cgraph_summary.h.
* ipa-utils.c: Likewise.
* ipa.c: Likewise.
* omp-low.c: Likewise.
* tree-inline.c: Likewise.
* tree-sra.c: Likewise.
* tree-ssa-pre.c: Likewise.

gcc/lto/ChangeLog:

2014-11-12  Martin Liska  

* lto-p

Re: [PATCH 4/4] Data structure is used for inline_summary struct.

2014-12-08 Thread Martin Liška

On 11/14/2014 04:24 PM, Martin Liška wrote:

On 11/14/2014 03:09 PM, Martin Liška wrote:

On 11/13/2014 05:04 PM, Jan Hubicka wrote:

+  if (!inline_summary_summary)
+inline_summary_summary = (inline_summary_cgraph_summary *) 
inline_summary_cgraph_summary::create_ggc (symtab);


Hehe, this is funny naming scheme.
Peraps inline_summary_d and inline_summary_t for the data and type?


Hello.

I adopted suggested naming scheme.


-
-static void
-inline_node_duplication_hook (struct cgraph_node *src,
-  struct cgraph_node *dst,
-  ATTRIBUTE_UNUSED void *data)
+void
+inline_summary_cgraph_summary::duplication_hook (cgraph_node *src,
+  cgraph_node *dst,
+  inline_summary *,
+  inline_summary *info)


Becuase those are no longer "hooks" but virtual function, I guess we could call 
them
simply duplicate/insert/remove.


Agree with the change.



In a way I would like to see these to be methods of the underlying type rather 
than
virtual methods of the summary, becuase these are operations on the data 
themselves.
I was thinking to model these by specual constructor and copy constructor
(taking the extra node pointer parameters) and standard destructor.  I am not 
sure this
would be more understandable this way?


Motivation for this implementation is:
a) it's useful to have an access to cgraph_node that is associated with a sumary
b) with GTY, we cannot call destructors


-/* Need a typedef for inline_summary because of inline function
-   'inline_summary' below.  */
-typedef struct inline_summary inline_summary_t;
-extern GTY(()) vec *inline_summary_vec;
+class GTY((user)) inline_summary_cgraph_summary: public cgraph_summary 

+{
+public:
+  inline_summary_cgraph_summary (symbol_table *symtab, bool ggc):
+cgraph_summary  (symtab, ggc) {}
+
+  static inline_summary_cgraph_summary *create_ggc (symbol_table *symtab)
+  {
+inline_summary_cgraph_summary *summary = new (ggc_cleared_alloc 
 ()) inline_summary_cgraph_summary(symtab, true);
+summary->disable_insertion_hook ();
+return summary;
+  }
+
+
+  virtual void insertion_hook (cgraph_node *, inline_summary *);
+  virtual void removal_hook (cgraph_node *node, inline_summary *);
+  virtual void duplication_hook (cgraph_node *src, cgraph_node *dst, 
inline_summary *src_data, inline_summary *dst_data);
+};
+
+extern GTY(()) cgraph_summary  *inline_summary_summary;


All in all it looks better than original code.  If we moved insert/


  /* Information kept about parameter of call site.  */
  struct inline_param_summary
@@ -249,10 +265,10 @@ void clone_inlined_nodes (struct cgraph_edge *e, bool, 
bool, int *,
  extern int ncalls_inlined;
  extern int nfunctions_inlined;

-static inline struct inline_summary *
-inline_summary (struct cgraph_node *node)
+static inline inline_summary *
+get_inline_summary (const struct cgraph_node *node)
  {
-  return &(*inline_summary_vec)[node->uid];
+  return (*inline_summary_summary)[node->summary_uid];


Hmm, i guess there is no way to avoid the (*...)? Otherwise it would be cleaner
to use inline_summary[...] instead of get_inline_summary IMO.


I added function_summary::get method, where the usage looks cleaner:
inline_summary_d->get (node).

Thanks,
Martin


Thanks for working on this!
Honza





Patch v3.

Martin


Version v4.

Martin
>From 4f87fdf746532a9ad89fcf85246cc449a49ba09d Mon Sep 17 00:00:00 2001
From: mliska 
Date: Mon, 8 Dec 2014 11:33:08 +0100
Subject: [PATCH 3/3] symbol_summary is used for inline_summary.

gcc/lto/ChangeLog:

2014-12-08  Martin Liska  

	* lto-partition.c (add_symbol_to_partition_1): New inline_summary_d
	is used.
	(undo_partition): Likewise.
	(lto_balanced_map): Likewise.

gcc/ChangeLog:

2014-12-08  Martin Liska  

	* cgraphunit.c (symbol_table::process_new_functions): New inline_summary_d
	is used.
	* ipa-cp.c (ipcp_cloning_candidate_p): Likewise.
	(devirtualization_time_bonus): Likewise.
	(estimate_local_effects): Likewise.
	(ipcp_propagate_stage): Likewise.
	* ipa-inline-analysis.c (evaluate_conditions_for_known_args): Likewise.
	(evaluate_properties_for_edge): Likewise.
	(inline_summary_alloc): Likewise.
	(reset_inline_summary): New inline_summary argument is introduced.
	(inline_summary_t::remove): New function.
	(inline_summary_t::duplicate): Likewise.
	(dump_inline_edge_summary): New inline_summary_d is used.
	(dump_inline_summary): Likewise.
	(estimate_function_body_sizes): Likewise.
	(compute_inline_parameters): Likewise.
	(estimate_edge_devirt_benefit): Likewise.
	(estimate_node_size_and_time): Likewise.
	(inline_update_callee_summaries): Likewise.
	(inline_merge_summary): Likewise.
	(inline_update_overall_summary): Likewise.
	(simple_edge_hints): Likewise.
	(do_estimate_edge_time): Likewise.
	(estimate_time_after_inlining): Likewise.
	(estimate_size_after_inlining): Likewise.
	(do_estimate_growth): Likewise.
	(growth_likely_positive): Likewise.
	(inline_generate_summary): Likewise.
	(inlin

Re: [PATCH 2/4] New data structure for cgraph_summary introduced.

2014-12-08 Thread Martin Liška

On 11/14/2014 04:23 PM, Martin Liška wrote:

On 11/14/2014 03:04 PM, Martin Liška wrote:

On 11/13/2014 04:50 PM, Jan Hubicka wrote:

gcc/ChangeLog:

2014-11-12  Martin Liska  

* Makefile.in: New object file is added.
* cgraph.h (symbol_table::allocate_cgraph_symbol): Summary UID
is filled up.
* cgraph_summary.c: New file.
* cgraph_summary.h: New file.


Since I am trying to get rid of the cgraph prefixes for symbols (keep it for
the graph only) and the summaries can be annotated to variables too. Even if it
not necessarily supported by your current implementation, lets keep API
prepared for it. So I would call it symtab-summary.* for source files and
symtab_summary for base type  (probably function_summary for annotating
functions/cgraph_edge_summary for annotating edges?)


Hello.

I followed your remarks, new class is called function_summary and is located
in symbol-summary.h.




diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index e2becb9..588b6d5 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1225,6 +1225,8 @@ public:
int count_materialization_scale;
/* Unique id of the node.  */
int uid;
+  /* Summary unique id of the node.  */
+  int summary_uid;


What makes summary_uid better than uid?


Because cgraph_node::uid is not a unique ID, it's recycled. As I can see,
there are two remaining usages of the fact that cgraph::uid are quite 
consecutive:

a) node_growth_cache vector is resized according to cgraph_max_uid
b) lto-partition.c: lto_balanced_map

If we change ipa-related stuff to annotations and lto_balanced_map with be 
rewritten,
we can finally unify uid and summary_uid. As Martin correctly pointed out, we 
should
unify cgraph_node dumps, we combine uid and order.




diff --git a/gcc/cgraph_summary.c b/gcc/cgraph_summary.c
new file mode 100644
index 000..9af1d7e
--- /dev/null
+++ b/gcc/cgraph_summary.c


And why do we need this file?  It will need license header if really needed.


Sure, the file can be removed.

Martin



The implementation seems sane - I will check the actual uses :)
Please send the updated patch though.

Honza





Hello.

There's v3 of the patch.

Martin


Hello.

Version 4 of the patch. I have a conversation with Trevor and I hope it can be
acceptable to trunk?

Patch set can bootstrap on x86_64-linux-pc and no regression has been seen.

Thanks,
Martin

>From df978f4a338869b56dc558c70f6c8f3c884b9fe1 Mon Sep 17 00:00:00 2001
From: mliska 
Date: Fri, 5 Dec 2014 15:59:04 +0100
Subject: [PATCH 1/3] New symbol_summary class introduced.

gcc/ChangeLog:

2014-12-08  Martin Liska  

	* cgraph.h (symbol_table::allocate_cgraph_symbol): Summary UID
	is filled up.
	* symbol-summary.h: New file.
	* gengtype.c (open_base_files): Add symbol-summary.h.
	* toplev.c (general_init): Call constructor of symbol_table.
---
 gcc/cgraph.h |   8 ++
 gcc/gengtype.c   |   4 +-
 gcc/symbol-summary.h | 281 +++
 gcc/toplev.c |   3 +-
 4 files changed, 293 insertions(+), 3 deletions(-)
 create mode 100644 gcc/symbol-summary.h

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index a5c5f56..1664bd7 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1237,6 +1237,8 @@ public:
   int count_materialization_scale;
   /* Unique id of the node.  */
   int uid;
+  /* Summary unique id of the node.  */
+  int summary_uid;
   /* ID assigned by the profiling.  */
   unsigned int profile_id;
   /* Time profiler: first run of function.  */
@@ -1810,6 +1812,10 @@ public:
   friend class cgraph_node;
   friend class cgraph_edge;
 
+  symbol_table (): cgraph_max_summary_uid (1)
+  {
+  }
+
   /* Initialize callgraph dump file.  */
   void initialize (void);
 
@@ -2006,6 +2012,7 @@ public:
 
   int cgraph_count;
   int cgraph_max_uid;
+  int cgraph_max_summary_uid;
 
   int edges_count;
   int edges_max_uid;
@@ -2334,6 +2341,7 @@ symbol_table::allocate_cgraph_symbol (void)
   node->uid = cgraph_max_uid++;
 }
 
+  node->summary_uid = cgraph_max_summary_uid++;
   return node;
 }
 
diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index 2dc857e..276f543 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -1842,8 +1842,8 @@ open_base_files (void)
   "tree-ssa-loop-niter.h", "tree-into-ssa.h", "tree-dfa.h", 
   "tree-ssa.h", "reload.h", "cpp-id-data.h", "tree-chrec.h",
   "except.h", "output.h",  "cfgloop.h", "target.h", "lto-streamer.h",
-  "target-globals.h", "ipa-ref.h", "cgraph.h", "ipa-prop.h", 
-  "ipa-inline.h", "dwarf2out.h", "omp-low.h", NULL
+  "target-globals.h", "ipa-ref.h", "cgraph.h", "symbol-summary.h",
+  "ipa-prop.h", "ipa-inline.h", "dwarf2out.h", "omp-low.h", NULL
 };
 const char *const *ifp;
 outf_p gtype_desc_c;
diff --git a/gcc/symbol-summary.h b/gcc/symbol-summary.h
new file mode 100644
index 000..95270e5
--- /dev/null
+++ b/gcc/symbol-summary.h
@@ -0,0 +1,281 @@
+/* Callgraph summary data structure.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   Contribu

Re: [RFC, PATCH, fortran] PR fortran/60255 Deferred character length

2014-12-08 Thread Andre Vehreschild
Hi all,

please find attached a more elaborate patch for pr60255. I totally agree that
my first attempt was just scratching the surface of the work needed.

This patch also is *not* complete, but because I am really new to gfortran
patching, I don't want to present a final patch only to learn then, that I have
violated design rules, common practice or the like. Therefore please comment
and direct me to any sources/ideas to improve the patch.

Topic: 
The pr 60255 is about assigning a char array to an unlimited polymorphic
entity. In the comments the concern about the lost length information is
raised. The patch adds a _len component to the unlimited polymorphic entity
(after _data and _vtab) and adds an assignment of the string length to _len
when a string is pointer assigned to the unlimited poly entity. Furthermore is
the intrinsic len(unlimited poly pointing to a string) resolved to give the
_len component.

Yet missing:
- assign _len component back to deferred char array length component
- transport length along chains of unlimited poly entities, i.e., a => b; c =>
  a where all objects are unlimited poly and b is a string.
- allocate() in this context

Patch dependencies:
none

Comments, concerns, candy welcome!

Regards,
Andre

On Sun, 17 Aug 2014 14:32:21 +0200
domi...@lps.ens.fr (Dominique Dhumieres) wrote:

> > the testcase should check that the code generated is actually working,
> > not just that the ICE disappeared.
> 
> I agree. Note that there is a test in the comment 3 of PR60255 that
> can be used to check the run time behavior (and possibly check the
> vtab issue).
> 
> Dominique


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index 0286c9e..29e31e1 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -2403,6 +2403,38 @@ yes:
   return true;
 }
 
+/* Add the component _len to the class-type variable in c->expr1.  */
+
+void
+gfc_add_len_component (gfc_code *c)
+{
+  /* Just make sure input is correct. This is already at the calling site,
+ but may be this routine is called from somewhere else in the furure.  */
+  gcc_assert (UNLIMITED_POLY(c->expr1)
+  && c->expr2
+  && c->expr2->ts.type== BT_CHARACTER);
+
+  gfc_component *len;
+  gfc_expr *e;
+  /* Check that _len is not present already.  */
+  if ((len= gfc_find_component (c->expr1->ts.u.derived, "_len", true, true)))
+return;
+  /* Create the new component.  */
+  if (!gfc_add_component (c->expr1->ts.u.derived, "_len", &len))
+// Possible errors are already reported in add_component
+return;
+  len->ts.type = BT_INTEGER;
+  len->ts.kind = 4;
+  len->attr.access = ACCESS_PRIVATE;
+
+  /* Build minimal expression to initialize component with zero. */
+  e = gfc_get_expr();
+  e->ts = c->expr1->ts;
+  e->expr_type = EXPR_VARIABLE;
+  len->initializer = gfc_get_int_expr (gfc_default_integer_kind,
+   NULL, 0);
+  gfc_free_expr (e);
+}
 
 /* Find (or generate) the symbol for an intrinsic type's vtab.  This is
needed to support unlimited polymorphism.  */
@@ -2415,18 +2447,9 @@ find_intrinsic_vtab (gfc_typespec *ts)
   gfc_symbol *copy = NULL, *src = NULL, *dst = NULL;
   int charlen = 0;
 
-  if (ts->type == BT_CHARACTER)
-{
-  if (ts->deferred)
-	{
-	  gfc_error ("TODO: Deferred character length variable at %C cannot "
-		 "yet be associated with unlimited polymorphic entities");
-	  return NULL;
-	}
-  else if (ts->u.cl && ts->u.cl->length
-	   && ts->u.cl->length->expr_type == EXPR_CONSTANT)
-	charlen = mpz_get_si (ts->u.cl->length->value.integer);
-}
+  if (ts->type == BT_CHARACTER && !ts->deferred && ts->u.cl && ts->u.cl->length
+  && ts->u.cl->length->expr_type == EXPR_CONSTANT)
+charlen = mpz_get_si (ts->u.cl->length->value.integer);
 
   /* Find the top-level namespace.  */
   for (ns = gfc_current_ns; ns; ns = ns->parent)
@@ -2437,10 +2460,16 @@ find_intrinsic_vtab (gfc_typespec *ts)
 {
   char name[GFC_MAX_SYMBOL_LEN+1], tname[GFC_MAX_SYMBOL_LEN+1];
 
-  if (ts->type == BT_CHARACTER)
-	sprintf (tname, "%s_%d_%d", gfc_basic_typename (ts->type),
-		 charlen, ts->kind);
-  else
+  if (ts->type == BT_CHARACTER) {
+if (!ts->deferred)
+  sprintf (tname, "%s_%d_%d", gfc_basic_typename (ts->type),
+   charlen, ts->kind);
+else
+  /* The type is deferred here. Ensure that this is easily seen in the 
+ vtable. */
+  sprintf (tname, "%s_DEFERRED_%d", gfc_basic_typename (ts->type),
+   ts->kind);
+  } else
 	sprintf (tname, "%s_%d_", gfc_basic_typename (ts->type), ts->kind);
 
   sprintf (name, "__vtab_%s", tname);
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 1058502..f99c3f8 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3192,6 +3192,8 @@ gfc_expr *gfc_class_initializer (gfc_typespec *, gfc_expr *);
 

Re: RFA: patch to fix PR64157

2014-12-08 Thread Vladimir Makarov

On 2014-12-08 10:41 AM, Richard Sandiford wrote:

Vladimir Makarov  writes:

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64157


After calling target_reinit from save_target_globals for switchable
targets (as ppc), a lot of ira data (register sets, classes etc) become
undefined.  After that ira-costs.c crashes when the undefined data are used.

The patch was successfully bootstrapped and tested on x86-64.

Ok to commit to the trunk?

2014-12-05  Vladimir Makarov  

  PR rtl-optimization/64157
  * toplev.c (target_reinit): Call ira_init.

Index: toplev.c
===
--- toplev.c(revision 218378)
+++ toplev.c(working copy)
@@ -1888,6 +1888,8 @@ target_reinit (void)
/* This invokes target hooks to set fixed_reg[] etc, which is
   mode-dependent.  */
init_regs ();
+  /* Set IRA data depended on target parameters.  */
+  ira_init ();


Could you give more details about how this happens?  It's reverting part of:



I don't know this code well, Richard.  I've just see that IRA data is 
undefined after save_target_globals, e.g. max_struct_costs_size has 
value zero which never should happen.  As save_target_globals calls

target_reinit and target_reinit has a comment

/* Reinitialize everything when target parameters, such as register 
usage, 


   have changed.  */

I think it is natural reinitialize data for IRA from here.

I'd appreciate if somebody look at this problem as it is not RA one per 
se.   It is very easy to reproduce even on x86 build for ppc target.




Re: [PATCH] influence JIT linker command line

2014-12-08 Thread David Malcolm
On Sat, 2014-12-06 at 15:56 -0500, Ulrich Drepper wrote:
> This patch supercedes the patch I sent earlier this week to add
> dependencies to the linker command line.  The implementation is
> different.
> 
> First, based on Dave's comment that he wants to keep the interface
> simple, to enable the linker optimizations no new interface is added.
> Instead optimizations are enabled in the linker whenever the compiler
> optimizes, too.  I don't think this will create problems at all since
> the time it takes nowadays is really low; it's only really measurable
> for extremely large files.
> 
> The way to add dependencies is changed.  Instead of allowing an
> unstructured string parameter to be added to the command line the new
> proposed interface allows to introduce a dependency with possibly
> information about the path the dependency is found.  This should be
> useful and implementable if at some point the explicit linker invocation
> is replaced by a library implementation.  The path argument of the new
> interface is used differently depending on the name.  If the name is of
> the form -l* then the -L option is used and a runpath is added.
> Otherwise the path is used to locate the file.
> 
> 
> Comments?

Thanks.

The special meaning of -l as a prefix to make the entrypoint have two
behaviors seems wrong to me.  Similarly, on reading the docs, the
repeated uses of the word "path" as both a param name and a concept seem
confusing, to me at least.

I think I'd prefer splitting the proposed
  gcc_jit_context_add_dependency
into two API entrypoints, one for the -l case, and another for the
filesystem name case.  I think the resulting API would be easier for the
end-user to understand.

Or is there a reason to have both of the -l vs non-dash-l uses go
through the same entrypoint?

"man ld" uses "namespec" as the term for the arg to -l, so maybe the API
could be two entrypoints looking something like this:

(A)

extern void
gcc_jit_context_requires_namespec (gcc_jit_context *ctxt,
   const char *namespec,
   const char *searchdir, /* can be NULL */
   int flags);

which becomes a "-lnamespec", and perhaps a "-Lsearchdir"/"-Rsearchdir"

(putting flags to the end so that we can default them to 0 in the C++
bindings).

and:

(B)

extern void
gcc_jit_context_requires_library (gcc_jit_context *ctxt,
  const char *path,
  int flags);

Do we need separate name and path params here?  If the user has name and
path, can't they simply supply us with the path to the object?  (i.e.
"path" can be absolute, relative, or just a filename)?

Or do we need to distinguish the possiblity that the linker might be
searching in its standard locations, hence something like:

extern void
gcc_jit_context_requires_library (gcc_jit_context *ctxt,
  const char *filename,
  const char *searchdir, /* can be NULL
*/
  int flags);


Again, I'm sorry if I'm not fully grokking your use-cases here.

Some possible concrete examples:

  gcc_jit_context_requires_namespec (ctxt, "foo", NULL, 0);

adds a -lfoo to the link line, looking in the default places.

  gcc_jit_context_requires_namespec (ctxt, "bar",
"/opt/bar-2.0/lib", 0);

adds a -lbar to the link line, looking within the given dir first.

The same, but via the filesystem-based entrypoint:

  gcc_jit_context_requires_library (ctxt, "libfoo.so", 0);

  gcc_jit_context_requires_library (ctxt,
"/opt/bar-2.0/libbar.so",
 0);
and if I understand you, we can also do other kinds of object:
  gcc_jit_context_requires_library (ctxt,
"/opt/bar-2.0/libbar.a",
 0);
  gcc_jit_context_requires_library (ctxt,
"/opt/bar-2.0/libbar.o",
 0);

In the C++ binding API these would become just:

  ctxt.requires_namespec ("foo");
  ctxt.requires_namespec ("bar", "/opt/bar-2.0/lib");


  ctxt.requires_library ("libfoo.so");
  ctxt.requires_library ("/opt/bar-2.0/libbar.so");
  ctxt.requires_library ("/opt/bar-2.0/libbar.a");
  ctxt.requires_library ("/opt/bar-2.0/libbar.o");

Thoughts?

I'm not in love with the names I suggested (though sadly I've already
claimed the word "object" to mean something in the API, when it has its
own meaning in the linker world).

Maybe:
  import_by_{namespec|path}?
  import_{namespec|path}?
  link_by_{namespec|path}?
  link_to_{namespec|path}?
  add_library_by_{namespec|path|filename}?
etc

Various other stuff inline below.

> gcc/ChangeLog:
> 
> 2014-12-06  Ulrich Drepper  
> 
>   * jit/jit-recording.c (recording::context::add_dependency):
> New function.

Go patch committed: fix crash on invalid program

2014-12-08 Thread Ian Lance Taylor
This patch from Chris Manghane fixes a crash-on-invalid in the Go
frontend.  This is GCC PR 64198.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
diff -r 2a85649c19e1 go/parse.cc
--- a/go/parse.cc   Sun Nov 30 17:28:00 2014 -0800
+++ b/go/parse.cc   Mon Dec 08 10:01:38 2014 -0800
@@ -3190,9 +3190,12 @@
   if (token->is_op(OPERATOR_COMMA))
 token = this->advance_token();
   if (!token->is_op(OPERATOR_RPAREN))
-error_at(this->location(), "missing %<)%>");
-  else
-this->advance_token();
+{
+  error_at(this->location(), "missing %<)%>");
+  if (!this->skip_past_error(OPERATOR_RPAREN))
+   return Expression::make_error(this->location());
+}
+  this->advance_token();
   if (func->is_error_expression())
 return func;
   return Expression::make_call(func, args, is_varargs, func->location());


Re: [PATCH] [AArch64, RTL] Bics instruction generation for aarch64

2014-12-08 Thread Alex Velenko



On 08/12/14 10:33, Richard Earnshaw wrote:

On 11/11/14 10:38, Alex Velenko wrote:

  From 98bb6d7323ce79e28be8ef892b919391ed857e1f Mon Sep 17 00:00:00 2001
From: Alex Velenko 
Date: Fri, 31 Oct 2014 18:43:32 +
Subject: [PATCH] [AArch64, RTL] Bics instruction generation for aarch64

Hi,

This patch adds rtl patterns for aarch64 to generate bics instructions in
cases when caputed value gets discarded and only only the status regester
change of the instruction gets reused.

Previously, bics would only be generated, if the value computed by bics
would later be reused, which is not necessarily the case when computing
this value for "if" statements.

Is this patch ok?

Thanks,
Alex

gcc/

2014-11-10  Alex Velenko  

  * gcc/config/aarch64/aarch64.md
(and_one_cmpl3_compare0_no_reuse):
New define_insn.
  * (and_one_cmpl_3_compare0_no_reuse):
Likewise.

gcc/testsuite/

2014-11-10  Alex Velenko  

  * gcc.target/aarch64/bics1.c : New testcase.


OK.

R.


Committed,

Alex


---
   gcc/config/aarch64/aarch64.md | 26 
   gcc/testsuite/gcc.target/aarch64/bics_3.c | 69
+++
   2 files changed, 95 insertions(+)
   create mode 100644 gcc/testsuite/gcc.target/aarch64/bics_3.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 341c26f..6158d82 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2845,6 +2845,18 @@
 [(set_attr "type" "logics_reg")]
   )

+(define_insn "*and_one_cmpl3_compare0_no_reuse"
+  [(set (reg:CC_NZ CC_REGNUM)
+(compare:CC_NZ
+ (and:GPI (not:GPI
+   (match_operand:GPI 0 "register_operand" "r"))
+  (match_operand:GPI 1 "register_operand" "r"))
+ (const_int 0)))]
+  ""
+  "bics\\tzr, %1, %0"
+  [(set_attr "type" "logics_reg")]
+)
+
   (define_insn "*_one_cmpl_3"
 [(set (match_operand:GPI 0 "register_operand" "=r")
   (LOGICAL:GPI (not:GPI
@@ -2894,6 +2906,20 @@
 [(set_attr "type" "logics_shift_imm")]
   )

+(define_insn "*and_one_cmpl_3_compare0_no_reuse"
+  [(set (reg:CC_NZ CC_REGNUM)
+(compare:CC_NZ
+ (and:GPI (not:GPI
+   (SHIFT:GPI
+(match_operand:GPI 0 "register_operand" "r")
+(match_operand:QI 1 "aarch64_shift_imm_" "n")))
+  (match_operand:GPI 2 "register_operand" "r"))
+ (const_int 0)))]
+  ""
+  "bics\\tzr, %2, %0,  %1"
+  [(set_attr "type" "logics_shift_imm")]
+)
+
   (define_insn "clz2"
 [(set (match_operand:GPI 0 "register_operand" "=r")
   (clz:GPI (match_operand:GPI 1 "register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/bics_3.c
b/gcc/testsuite/gcc.target/aarch64/bics_3.c
new file mode 100644
index 000..ecb53e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bics_3.c
@@ -0,0 +1,69 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps" } */
+
+extern void abort (void);
+
+int __attribute__ ((noinline))
+bics_si_test (int a, int b)
+{
+  if (a & ~b)
+return 1;
+  else
+return 0;
+}
+
+int __attribute__ ((noinline))
+bics_si_test2 (int a, int b)
+{
+  if (a & ~ (b << 2))
+return 1;
+  else
+return 0;
+}
+
+typedef long long s64;
+
+int __attribute__ ((noinline))
+bics_di_test (s64 a, s64 b)
+{
+  if (a & ~b)
+return 1;
+  else
+return 0;
+}
+
+int __attribute__ ((noinline))
+bics_di_test2 (s64 a, s64 b)
+{
+  if (a & ~(b << 2))
+return 1;
+  else
+return 0;
+}
+
+int
+main (void)
+{
+  int a = 5;
+  int b = 5;
+  int c = 20;
+  s64 d = 5;
+  s64 e = 5;
+  s64 f = 20;
+  if (bics_si_test (a, b))
+abort ();
+  if (bics_si_test2 (c, b))
+abort ();
+  if (bics_di_test (d, e))
+abort ();
+  if (bics_di_test2 (f, e))
+abort ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+"
2 } } */
+/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+,
lsl 2" 1 } } */
+/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+"
2 } } */
+/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+,
lsl 2" 1 } } */
+
+/* { dg-final { cleanup-saved-temps } } */







[PATCH, committed] Make jit/notes.txt better reflect current status quo

2014-12-08 Thread David Malcolm
The high-level overview diagram in jit/notes.txt showed the
release of the JIT mutex in where I'd like it to be, rather
than where it currently is.

Fix the diagram to reflect reality, and clarify various other things.

Committed to trunk as r218488.

gcc/jit/ChangeLog:
* notes.txt: Show the beginning and ending of
recording::context::compile vs playback::context::compile.  Show
the creation and unlinking of the tempdir.  Show toplev::finalize.
Move "RELEASE MUTEX" to the correct location.  Show
gcc_jit_result_release, and indicate where the
dlopen/dlsym/dlclose occur.
---
 gcc/jit/notes.txt | 38 ++
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/gcc/jit/notes.txt b/gcc/jit/notes.txt
index d337cb4..12bb6d6 100644
--- a/gcc/jit/notes.txt
+++ b/gcc/jit/notes.txt
@@ -18,9 +18,13 @@ Client Code   . Generated .libgccjit.so
│  .   .  .   .
V  .   .  gcc_jit_context_compile .
 ──>  .   .
+  .   .│ start of recording::context::compile ()
   .   .│ .   .
   .   .│ ACQUIRE MUTEX   .
   .   .│ .   .
+  .   .│ start of playback::context::compile ()
+  .   .│   (create tempdir)  .
+  .   .│ .   .
   .   .V───> toplev::main (for now)
   .   .  .   .   │
   .   .  .   .   (various code)
@@ -65,13 +69,33 @@ Client Code   . Generated .libgccjit.so
   .   .  .   .   (the middle─end and 
backend)
   .   .  .   .   ↓
   .   .<─ end of 
toplev::main
-  .   .│ RELEASE MUTEX   .
   .   .│ .   .
-  .   .│ Convert assembler to DSO
+  .   .V───> toplev::finalize
+  .   .  .   . │   (purge internal 
state)
+  .   .< end of 
toplev::finalize
+  .   .│ .   .
+  .   .│ Convert assembler to DSO ("fake.so")
+  .   .│ .   .
+  .   .│ Load DSO (dlopen "fake.so")
   .   .│ .   .
-  .   .│ Load DSO.
+  .   .│ end of playback::context::compile ()
+  .   .│ .   .
+  .   .│ playback::context dtor
+  .   . ──>  .   .
+  .   .   │ Cleanup tempdir  .
+  .   .   │   ("fake.so" is unlinked from the
+  .   .   │filesystem at this point)
+  .   .<──   .   .
+  .   .│ .   .
+  .   .│ RELEASE MUTEX   .
+  .   .│ .   .
+  .   .│ end of recording::context::compile ()
<───  .   .
│  .   .  .   .
+   V  .   .  gcc_jit_result_get_code .
+──>  .   .
+  .   .│ dlsym () within loaded DSO
+   <───  .   .
Get (void*).   .  .   .
│  .   .  .   .
│ Call it  .   .  .   .
@@ -80,5 +104,11 @@ Client Code   . Generated .libgccjit.so
   .│  .  .   .
<───   .  .   .
│  .   .  .   .
+etc│  .   .  .   .
+   │  .   .  .   .
+   V  .   .  gcc_jit_result_release  .
+──>  .   .
+  .   .│ dlclose () the loaded DSO
+  .   .│(code becomes uncallable)
+   <───  .   .
│  .   .  .   .
-etc
-- 
1.8.5.3



Re: [Patch] Improving jump-thread pass for PR 54742

2014-12-08 Thread Jeff Law

On 12/06/14 06:47, Sebastian Pop wrote:

Jeff Law wrote:

OK to commit.  Thanks for your patience.

Can you follow-up with a change which throttles this optimization
when -Os is in effect.  You can check optimize_function_for_size_p
(cfun) and simply avoid the backward traversal or you could allow it
in that case if the amount of copying is suitably small.  Your call.


I think it does not make sense to duplicate paths at -Os: I disabled the FSM
jump-threading when optimizing for size like this.

diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 29b20c8..ce70311 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -1335,8 +1335,9 @@ thread_through_normal_block (edge e,
   return 1;
 }

if (!flag_expensive_optimizations
+ || optimize_function_for_size_p (cfun)
   || TREE_CODE (cond) != SSA_NAME
   || e->dest->loop_father != e->src->loop_father
   || loop_depth (e->dest->loop_father) == 0)
 return 0;

I will regstrap and commit the attached patch.

Looks good to me.  Thanks for taking care of it.

Jeff


[PATCH, committed] libgccjit++.h: use indentation to show inheritance

2014-12-08 Thread David Malcolm
Committed to trunk as r218489.

gcc/jit/ChangeLog:
* libgccjit++.h: Indent the forward declarations of the classes to
show the inheritance hierarchy.
---
 gcc/jit/libgccjit++.h | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/gcc/jit/libgccjit++.h b/gcc/jit/libgccjit++.h
index 67ed5d5..7f6718b 100644
--- a/gcc/jit/libgccjit++.h
+++ b/gcc/jit/libgccjit++.h
@@ -32,16 +32,19 @@ along with GCC; see the file COPYING3.  If not see
 
 namespace gccjit
 {
+  /* Indentation indicates inheritance.  */
   class context;
-  class location;
-  class field;
-  class type;
-  class struct_;
-  class param;
-  class function;
-  class block;
-  class rvalue;
-  class lvalue;
+  class error;
+  class object;
+class location;
+class field;
+class type;
+  class struct_;
+class function;
+class block;
+class rvalue;
+ class lvalue;
+   class param;
 
   /* Errors within the API become C++ exceptions of this class.  */
   class error
-- 
1.8.5.3



Re: [PATCH] Fix PR 61225

2014-12-08 Thread Jeff Law

On 12/05/14 17:16, Segher Boessenkool wrote:

On Fri, Dec 05, 2014 at 03:31:54PM -0700, Jeff Law wrote:

Combine does not consider combining 9 into 7 because there is no LOG_LINK
between them (the link for r88 is between 8 and 7 already).

OK, yea, that's a long standing design decision.  We don't feed a single
def into multiple use sites.


There is no real reason not to do that.  It doesn't increase computational
complexity, although it is of course more expensive than what combine does
today (it is more work, after all).  And combining with a later use does
not have too big a chance to succeed (since it has to keep the result of
the earlier insn around always).
No fundamental reason, it's just always been that way.  One could argue 
that a bridge pattern often makes this unnecessary and bridges have been 
a well known way to work around combine's failings for a long time.



GCC 6 or later ;-)

Yea, I think so :)

jeff


Re: Compare-elim pass

2014-12-08 Thread Jeff Law

On 12/05/14 17:58, Segher Boessenkool wrote:

On Fri, Dec 05, 2014 at 01:40:56PM -0700, Jeff Law wrote:

My first thought would be to allow both and have combine swap the order
in the vector if recog doesn't recognize the pattern.  One could argue
we could go through a full permutation of ordering in the vector, but
that's probably above and beyond the call of duty.


Combine also expects a certain ordering for its *inputs*.  It might not
be the only pass that does this either.  All targets (where this works)
have the compare first in all their patterns.  This goes back, what,
20+ years?  It might not be "officially" canonical, but it's the only
ordering that works everywhere.
Yup, aware that inputs can be an issue as well.  But again, there's no 
reason we can't handle those too.


I guess in the end the first question is whether or not we want a 
canonical form.  I can see arguments either way.  That answer drives the 
next set of questions/issues to work through.





Re: [patch] Fix tilepro includes

2014-12-08 Thread Walter Lee
On 12/8/2014 11:23 AM, Jan-Benedict Glaw wrote:
> On Fri, 2014-11-21 08:45:11 -0500, Andrew MacLeod  wrote:
>> During the flattening of optabs.h, I updated all the config/* files
>> which were affected.   I've been getting spurious failures with
>> config-list.mk where my changes would "disappear" and tracked down
>> why.
>>
>> I was blissfully unaware that the tilepro ports mul-tables.c file is
>> actually generated from gen-mul-tables.cc.
>>
>> This patch fixes the include issue by adding "#include insn-codes.h"
>> to the generated files.  I also added a comment indicating these are
>> generated files, and to make changes in the generator.
>>
>> This allows all the tile* ports to compile properly again.
>>
>> OK for trunk?
> Seems this wasn't ever ACKed or applied up to now? I'm still seeing
> compilation errors for the tile targets, see eg.
> http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=382169

Sorry about that.  Looks good to me.

Walter



Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name

2014-12-08 Thread Jeff Law

On 12/08/14 04:00, Ilya Enkovich wrote:

2014-11-26  Ilya Enkovich  

 PR bootstrap/63995
 * tree-chkp.c (chkp_make_static_bounds): Share bounds var
 between nodes sharing assembler name.

gcc/testsuite

2014-11-26  Ilya Enkovich  

 PR bootstrap/63995
 * g++.dg/dg.exp: Add mpx-dg.exp.
 * g++.dg/pr63995-1.C: New.
Code seems fine.  Not sure how the testcase is actually testing that 
you're sharing bounds though.  The test AFAICT just verifies that it can 
be compiled and doesn't checkout the output in any way.  Or is it the 
case that the test will ICE or segfault without your fix?



Jeff




Re: [PATCH] Fix size & type for cold partition names (hot-cold function partitioning)

2014-12-08 Thread Jeff Law

On 12/05/14 15:41, Caroline Tice wrote:

When hot/cold function splitting occurs, a symbol is generated for the
cold partition, and gets output in the assembly & debug info, but the
symbol currently gets a size of 0 and a type of NOTYPE, as in this
example (on x86_64-linux) from the cold_partition_label test in the
testsuite:

$ readelf -sW cold_partition_label.x02 | grep foo
 36: 00400450 0 NOTYPE  LOCAL  DEFAULT   12 foo.cold.0
 58: 0040049043 FUNCGLOBAL DEFAULT   12 foo
$

This patch fixes this by calculating the right size for the partition,
and outputing the size and type fo the cold partition symbol.  After
applying this patch and looking at the same test, I get:

$ readelf -sW cold_partition_label.x02 | grep foo
 36: 0040045029 FUNCLOCAL  DEFAULT   12 foo.cold.0
 58: 0040049043 FUNCGLOBAL DEFAULT   12 foo
$

This patch has been tested by bootstrapping the compiler, running the
dejagnu testsuite with no regressions, and checked as shown above that
it fixes the original problem.  Is this patch OK to commit to ToT?

-- Caroline Tice
cmt...@google.com

  2014-12-05  Caroline Tice  

 * final.c (final_scan_insn): Change 'cold_function_name' to
 'cold_partition_name' and make it a global variable; also output
 assembly to give it a 'FUNC' type, if appropriate.
 * varasm.c (cold_partition_name): Declare and initialize global
 variable.
 (assemble_start_function): Re-set value for cold_partition_name.
 (assemble_end_function): Output assembly to calculate size of cold
 partition, and associate size with name, if appropriate.
 * varash.h (cold_partition_name): Add extern declaration for global
 variable.
OK for the trunk.  I'm ever-so-slightly worried about the PA and PTX 
ports are they're probably the most sensitive to types.  But we'll deal 
with them if they complain.


Adding a testcase would be helpful.  I believe some of the LTO tests use 
readelf, so there should be some infrastructure you can factor out and 
reuse.


jeff


Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name

2014-12-08 Thread Ilya Enkovich
2014-12-09 0:00 GMT+03:00 Jeff Law :
> On 12/08/14 04:00, Ilya Enkovich wrote:
>>>
>>> 2014-11-26  Ilya Enkovich  
>>>
>>>  PR bootstrap/63995
>>>  * tree-chkp.c (chkp_make_static_bounds): Share bounds var
>>>  between nodes sharing assembler name.
>>>
>>> gcc/testsuite
>>>
>>> 2014-11-26  Ilya Enkovich  
>>>
>>>  PR bootstrap/63995
>>>  * g++.dg/dg.exp: Add mpx-dg.exp.
>>>  * g++.dg/pr63995-1.C: New.
>
> Code seems fine.  Not sure how the testcase is actually testing that you're
> sharing bounds though.  The test AFAICT just verifies that it can be
> compiled and doesn't checkout the output in any way.  Or is it the case that
> the test will ICE or segfault without your fix?

Without this fix compiler emits two symbols with the same name and
assembler complains. Something like that:

/tmp/cceHGHa1.s: Assembler messages:
/tmp/cceHGHa1.s:12080: Error: symbol `__chkp_bounds_of_mode_inner' is
already defined
/tmp/cceHGHa1.s:12086: Error: symbol `__chkp_bounds_of_mode_nunits' is
already defined


Thanks,
Ilya

>
>
> Jeff
>
>


Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call

2014-12-08 Thread Jeff Law

On 12/01/14 11:44, John David Anglin wrote:


To do unconditionally set frame_read?  Or if we don't want to get
that drastic,

if (reload_completed || SIBLING_CALL_P (insn))
insn_info->frame_read = true;

Will test but I, if I read the code correctly, setting
insn_info->frame_read = true results in frame related stores being
killed in a constant call. This doesn't seem like the right
solution.
But isn't killing in this context referring to GEN/KILL types of 
operations for global dataflow?   ISTM that GEN is a store operation 
while KILL is a load operation (over-simplification, but stick with me).


Thus a GEN-GEN will get optimized into a single GEN (dead store 
eliminated) while a GEN-KILL-GEN can not be optimized by DSE because of 
the KILL.


It should always be safe to have an extraneous KILL since that merely 
inhibits optimization.  While an extraneous GEN can be a disaster.


So with setting frame_read, we're going to have more KILLs in the 
dataflow sets, which results in fewer stores being eliminated.  They 
come from:



  /* If the frame is read, the frame related stores are 
killed.  */

  else if (insn_info->frame_read)
{
  store_info_t store_info = i_ptr->store_rec;

  /* Skip the clobbers.  */
  while (!store_info->is_set)
store_info = store_info->next;

  if (store_info->group_id >= 0
  && 
rtx_group_vec[store_info->group_id]->frame_related)

remove_store = true;
}

  if (remove_store)
{
  if (dump_file && (dump_flags & TDF_DETAILS))
dump_insn_info ("removing from active", i_ptr);

  active_local_stores_len--;
  if (last)
last->next_local_store = i_ptr->next_local_store;
  else
active_local_stores = i_ptr->next_local_store;
}
  else
last = i_ptr;

  i_ptr = i_ptr->next_local_store;

AFAICT in this loop, setting FRAME_READ causes us to set REMOVE_STORE 
more often.  REMOVE_STORE in this context seems to indicate that we're 
going to remove a store from the list we are tracking for potential 
removal.  Which seems exactly right.


Here as well:

  /* If this insn reads the frame, kill all the frame related stores.  */
  if (insn_info->frame_read)
{
  FOR_EACH_VEC_ELT (rtx_group_vec, i, group)
if (group->process_globally && group->frame_related)
  {
if (kill)
  bitmap_ior_into (kill, group->group_kill);
bitmap_and_compl_into (gen, group->group_kill);
  }
}

Which also seems like exactly what we want since when we set FRAME_READ 
we end up with a bigger KILL set and a smaller GEN set.



I think the terminology and variable names certainly makes this tougher 
to follow than it should.





Here we have frame related calls being killed before reload because
the argument stores for the sibcall are off frame:

/* Set the gen set of the exit block, and also any block with no
successors that does not have a wild read.  */

static void dse_step3_exit_block_scan (bb_info_t bb_info) { /* The
gen set is all 0's for the exit block except for the
frame_pointer_group.  */

if (stores_off_frame_dead_at_return) { unsigned int i; group_info_t
group;

FOR_EACH_VEC_ELT (rtx_group_vec, i, group) { if
(group->process_globally && group->frame_related) bitmap_ior_into
(bb_info->gen, group->group_kill); } } }
I see your point.  Expanding the GEN set here seems wrong.If we go 
with setting STORES_OFF_FRAME_DEAD_AT_RETURN, then we definitely need to 
update its documentation.


I think an argument could be made that we want both changes if I've 
interpreted this code correctly.


Jeff


Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call

2014-12-08 Thread Jeff Law

On 12/04/14 08:50, John David Anglin wrote:

On 12/1/2014 11:57 AM, Jeff Law wrote:

Prior to reload (ie, in DSE1) there's a bit of magic in that we do
not set frame_read on call insns.  That may in fact be wrong and
possibly the source of the problem.

/* This field is only used for the processing of const functions.
These functions cannot read memory, but they can read the stack
because that is where they may get their parms.  We need to be this
conservative because, like the store motion pass, we don't consider
CALL_INSN_FUNCTION_USAGE when processing call insns. Moreover, we
need to distinguish two cases: 1. Before reload (register
elimination), the stores related to outgoing arguments are stack
pointer based and thus deemed of non-constant base in this pass.
This requires special handling but also means that the frame
pointer based stores need not be killed upon encountering a const
function call.

The above is wrong for sibcalls.  Sibcall arguments are relative to
the incoming argument pointer.  Is this always the frame pointer?
I don't think it's always the frame pointer.  Don't we use an argument 
pointer for the PA64 runtime?  If I recall, it was the only port that 
had a non-eliminable argument pointer at the time.





If that is the case, then it may be possible to eliminate stack based
 stores when we have a sibcall before reload.

if (reload_completed || SIBLING_CALL_P (insn))
insn_info->frame_read = true;

This works.

Dave





Re: [PATCH] Fix PR 61225

2014-12-08 Thread Jeff Law

On 12/04/14 01:43, Zhenqiang Chen wrote:

> >
> >  Part of PR rtl-optimization/61225
> >  * combine.c (refer_same_reg_p): New function.
> >  (combine_instructions): Handle I1 -> I2 -> I3; I2 -> insn.
> >  (try_combine): Add one more parameter TO_COMBINED_INSN, which

>is

> >  used to create a new insn parallel (TO_COMBINED_INSN, I3).
> >
> >testsuite/ChangeLog:
> >2014-08-04  Zhenqiang Chen
> >
> >  * gcc.target/i386/pr61225.c: New test.
THanks for the updates and clarifications.  Just a few minor things and 
while it's a bit of a hack, I'll approve:





+
+/* A is a compare (reg1, 0) and B is SINGLE_SET which SET_SRC is reg2.
+   It returns TRUE, if reg1 == reg2, and no other refer of reg1
+   except A and B.  */
+
+static bool
+refer_same_reg_p (rtx_insn *a, rtx_insn *b)
+{
+  rtx seta = single_set (a);
+  rtx setb = single_set (b);
+
+  if (BLOCK_FOR_INSN (a) != BLOCK_FOR_INSN (b)
+  || !seta
+  || !setb)
+return false;
+
+  if (GET_CODE (SET_SRC (seta)) != COMPARE
+  || GET_MODE_CLASS (GET_MODE (SET_DEST (seta))) != MODE_CC
+  || !REG_P (XEXP (SET_SRC (seta), 0))
+  || XEXP (SET_SRC (seta), 1) != const0_rtx
+  || !REG_P (SET_SRC (setb))
+  || REGNO (SET_SRC (setb)) != REGNO (XEXP (SET_SRC (seta), 0)))
+return false;
Do you need to verify SETA and SETB satisfy single_set?  Or has that 
already been done elsewhere?


The name refer_same_reg_p seems wrong -- your function is verifying the 
underlying RTL store as well as the existence of a a dependency between 
the insns.  Can you try to come up with a better name?


Please use CONST0_RTX (mode)  IIRC that'll allow this to work regardless 
of the size of the modes relative to the host word size.





+
+  if (DF_REG_USE_COUNT (REGNO (SET_SRC (setb))) > 2)
+{
+  df_ref use;
+  rtx insn;
+  unsigned int i = REGNO (SET_SRC (setb));
+
+  for (use = DF_REG_USE_CHAIN (i); use; use = DF_REF_NEXT_REG (use))
+{
+ insn = DF_REF_INSN (use);
+ if (insn != a && insn != b && !(NOTE_P (insn) || DEBUG_INSN_P (insn)))
+   return false;
+   }
+}
+
+  return true;
+}
Is this fragment really needed?  Does it ever trigger?  I'd think that 
for > 2 uses punting would be fine.  Do we really commonly have cases 
with > 2 uses, but where they're all in SETA and SETB?



  }
  }

+ /* Try to combine a compare insn that sets CC
+with a preceding insn that can set CC, and maybe with its
+logical predecessor as well.
+We need this special code because data flow connections
+do not get entered in LOG_LINKS.  */
+ if ((prev = prev_nonnote_nondebug_insn (insn)) != NULL_RTX
+ && refer_same_reg_p (insn, prev)
+ && max_combine >= 4)
+   {
+   struct insn_link *next1;
+   FOR_EACH_LOG_LINK (next1, prev)
+ {
+   rtx_insn *link1 = next1->insn;
+   if (NOTE_P (link1))
+ continue;
+   /* I1 -> I2 -> I3; I2 -> insn;
+  output parallel (insn, I3).  */
+   FOR_EACH_LOG_LINK (nextlinks, link1)
+ if ((next = try_combine (prev, link1,
+  nextlinks->insn, NULL,
+  &new_direct_jump_p,
+  last_combined_insn, insn)) != 0)
+
+   {
+ delete_insn (insn);
+ insn = next;
+ statistics_counter_event (cfun, "four-insn combine", 
1);
+ goto retry;
+   }
+   /* I2 -> I3; I2 -> insn
+  output next = parallel (insn, I3).  */
+   if ((next = try_combine (prev, link1,
+NULL, NULL,
+&new_direct_jump_p,
+last_combined_insn, insn)) != 0)
+
+ {
+   delete_insn (insn);
+   insn = next;
+   statistics_counter_event (cfun, "three-insn combine", 
1);
+   goto retry;
+ }
+ }
+   }
So you've got two new combine cases here, but I think the testcase only 
tests one of them.  Can you include a testcase for both of hte major 
paths above (I1->I2->I3; I2->insn and I2->I3; I2->INSN)


Please make those changes and repost for final approval.

jeff


Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-08 Thread Jeff Law

On 12/02/14 07:18, Ilya Enkovich wrote:

Hi,

Currently symbol names privatization doesn't work for instrumentation clones 
because clones assembler name is transparent alias and therefore alias target 
should be privatized instead.  This patch does it.  Bootstrapped and tested on 
x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2014-12-02  Ilya Enkovich  

* lto/lto-partition.c (privatize_symbol_name): Correctly
privatize instrumentation clones.

gcc/testsuite/

2014-12-02  Ilya Enkovich  

* gcc.dg/lto/lto.exp: Load mpx-dg.exp.
* gcc.dg/lto/chkp-privatize_0.c: New.
* gcc.dg/lto/chkp-privatize_1.c: New.

OK.
jeff



Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name

2014-12-08 Thread Jeff Law

On 12/08/14 14:12, Ilya Enkovich wrote:

2014-12-09 0:00 GMT+03:00 Jeff Law :

On 12/08/14 04:00, Ilya Enkovich wrote:


2014-11-26  Ilya Enkovich  

  PR bootstrap/63995
  * tree-chkp.c (chkp_make_static_bounds): Share bounds var
  between nodes sharing assembler name.

gcc/testsuite

2014-11-26  Ilya Enkovich  

  PR bootstrap/63995
  * g++.dg/dg.exp: Add mpx-dg.exp.
  * g++.dg/pr63995-1.C: New.


Code seems fine.  Not sure how the testcase is actually testing that you're
sharing bounds though.  The test AFAICT just verifies that it can be
compiled and doesn't checkout the output in any way.  Or is it the case that
the test will ICE or segfault without your fix?


Without this fix compiler emits two symbols with the same name and
assembler complains. Something like that:

/tmp/cceHGHa1.s: Assembler messages:
/tmp/cceHGHa1.s:12080: Error: symbol `__chkp_bounds_of_mode_inner' is
already defined
/tmp/cceHGHa1.s:12086: Error: symbol `__chkp_bounds_of_mode_nunits' is
already defined

Ah OK.  Ok for the trunk.

Thanks for your patience,
Jeff



[PATCH] Fix broadcast from scalar patterns (PR target/63594)

2014-12-08 Thread Jakub Jelinek
Hi!

This patch attempts to fix
(set (reg:V*) (vec_duplicate:V* (reg/mem:)))
patterns.  One issue is that there were separate patterns for
broadcast from gpr and separate patterns for broadcast from memory
(and vector reg), that isn't a good idea for reload, which can't then
freely choose.  Another issue is that some pre-AVX2 broadcast patterns
were present above the avx512vl broadcast patterns, so again, reload didn't
have the possibility to use %xmm16-31/%ymm16-31 registers.  Also, the
splitter written for AVX2 broadcasts from gpr went into the way of AVX512VL
broadcasts.  And finally, the avx512*intrin.h headers were using
#ifdef TARGET_64BIT, macro not used anywhere (probably meant to write
__x86_64__ instead, but with the patch we actually just have one set of
builtins.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-12-08  Jakub Jelinek  

PR target/63594
* config/i386/sse.md (vec_dupv4sf): Move after
_vec_dup_gpr pattern.
(*vec_dupv4si, *vec_dupv2di): Likewise.
(_vec_dup_mem): Merge into ...
(_vec_dup_gpr): ... this
pattern.
(*vec_dup AVX2_VEC_DUP_MODE splitter): Disable for
TARGET_AVX512VL (for QI/HI scalar modes only if TARGET_AVX512BW
is set too).
* config/i386/i386.c (enum ix86_builtins): Remove
IX86_BUILTIN_PBROADCASTQ256_MEM_MASK,
IX86_BUILTIN_PBROADCASTQ128_MEM_MASK and
IX86_BUILTIN_PBROADCASTQ512_MEM.
(bdesc_args): Use __builtin_ia32_pbroadcastq512_gpr_mask,
__builtin_ia32_pbroadcastq256_gpr_mask and
__builtin_ia32_pbroadcastq128_gpr_mask instead of *_mem_mask
regardless of OPTION_MASK_ISA_64BIT.
* config/i386/avx512fintrin.h (_mm512_set1_epi64,
_mm512_mask_set1_epi64, _mm512_maskz_set1_epi64): Use *_gpr_mask
builtins regardless of whether TARGET_64BIT is defined or not.
* config/i386/avx512vlintrin.h (_mm256_mask_set1_epi64,
_mm256_maskz_set1_epi64, _mm_mask_set1_epi64, _mm_maskz_set1_epi64):
Likewise.

--- gcc/config/i386/sse.md.jj   2014-12-03 11:52:41.0 +0100
+++ gcc/config/i386/sse.md  2014-12-08 13:26:06.505543457 +0100
@@ -6319,22 +6319,6 @@ (define_insn "avx512f_vec_dup_1"
 (set_attr "prefix" "evex")
 (set_attr "mode" "")])
 
-(define_insn "vec_dupv4sf"
-  [(set (match_operand:V4SF 0 "register_operand" "=x,x,x")
-   (vec_duplicate:V4SF
- (match_operand:SF 1 "nonimmediate_operand" "x,m,0")))]
-  "TARGET_SSE"
-  "@
-   vshufps\t{$0, %1, %1, %0|%0, %1, %1, 0}
-   vbroadcastss\t{%1, %0|%0, %1}
-   shufps\t{$0, %0, %0|%0, %0, 0}"
-  [(set_attr "isa" "avx,avx,noavx")
-   (set_attr "type" "sseshuf1,ssemov,sseshuf1")
-   (set_attr "length_immediate" "1,0,1")
-   (set_attr "prefix_extra" "0,1,*")
-   (set_attr "prefix" "vex,vex,orig")
-   (set_attr "mode" "V4SF")])
-
 ;; Although insertps takes register source, we prefer
 ;; unpcklps with register source since it is shorter.
 (define_insn "*vec_concatv2sf_sse4_1"
@@ -12821,37 +12805,6 @@ (define_split
   operands[1] = adjust_address (operands[1], mode, offs);
 })
 
-(define_insn "*vec_dupv4si"
-  [(set (match_operand:V4SI 0 "register_operand" "=x,x,x")
-   (vec_duplicate:V4SI
- (match_operand:SI 1 "nonimmediate_operand" " x,m,0")))]
-  "TARGET_SSE"
-  "@
-   %vpshufd\t{$0, %1, %0|%0, %1, 0}
-   vbroadcastss\t{%1, %0|%0, %1}
-   shufps\t{$0, %0, %0|%0, %0, 0}"
-  [(set_attr "isa" "sse2,avx,noavx")
-   (set_attr "type" "sselog1,ssemov,sselog1")
-   (set_attr "length_immediate" "1,0,1")
-   (set_attr "prefix_extra" "0,1,*")
-   (set_attr "prefix" "maybe_vex,vex,orig")
-   (set_attr "mode" "TI,V4SF,V4SF")])
-
-(define_insn "*vec_dupv2di"
-  [(set (match_operand:V2DI 0 "register_operand" "=x,x,x,x")
-   (vec_duplicate:V2DI
- (match_operand:DI 1 "nonimmediate_operand" " 0,x,m,0")))]
-  "TARGET_SSE"
-  "@
-   punpcklqdq\t%0, %0
-   vpunpcklqdq\t{%d1, %0|%0, %d1}
-   %vmovddup\t{%1, %0|%0, %1}
-   movlhps\t%0, %0"
-  [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
-   (set_attr "type" "sselog1,sselog1,sselog1,ssemov")
-   (set_attr "prefix" "orig,vex,maybe_vex,orig")
-   (set_attr "mode" "TI,TI,DF,V4SF")])
-
 (define_insn "*vec_concatv2si_sse4_1"
   [(set (match_operand:V2SI 0 "register_operand" "=Yr,*x,x, Yr,*x,x, x, 
*y,*y")
(vec_concat:V2SI
@@ -16665,46 +16618,78 @@ (define_insn "avx512f_broa
(set_attr "mode" "")])
 
 (define_insn "_vec_dup_gpr"
-  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v")
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v,v")
(vec_duplicate:VI12_AVX512VL
- (match_operand: 1 "register_operand" "r")))]
+ (match_operand: 1 "nonimmediate_operand" "vm,r")))]
   "TARGET_AVX512BW"
-  "vpbroadcast\t{%k1, %0|%0, 
%k1}"
+  "@
+   vpbroadcast\t{%1, %0|%0, %1}
+   vpbroadcast\t{%k1, %0|%0, 
%k1}"
   [(set_attr "type" "ssemov")
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
 (define

[PATCH] Two more spots for

2014-12-08 Thread Jakub Jelinek
Hi!

I've noticed two new spots which use GET_MODE_SIZE (mode)
instead of the more efficient .

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-12-08  Jakub Jelinek  

* config/i386/sse.md (*mov_internal, *avx512f_gatherdi_2):
Use  instead of GET_MODE_SIZE (mode).

--- gcc/config/i386/sse.md.jj   2014-12-03 11:52:41.0 +0100
+++ gcc/config/i386/sse.md  2014-12-08 13:00:39.167141532 +0100
@@ -811,7 +811,7 @@ (define_insn "*mov_internal"
   /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
 in avx512f, so we need to use workarounds, to access sse registers
 16-31, which are evex-only. In avx512vl we don't need workarounds.  */
-  if (TARGET_AVX512F && GET_MODE_SIZE (mode) < 64 && !TARGET_AVX512VL
+  if (TARGET_AVX512F &&  < 64 && !TARGET_AVX512VL
  && ((REG_P (operands[0]) && EXT_REX_SSE_REGNO_P (REGNO (operands[0])))
  || (REG_P (operands[1]) && EXT_REX_SSE_REGNO_P (REGNO 
(operands[1])
{
@@ -18208,7 +18196,7 @@ (define_insn "*avx512f_gatherdi_2"
 {
   if (mode != mode)
 {
-  if (GET_MODE_SIZE (mode) != 64)
+  if ( != 64)
return "vgatherq\t{%5, 
%x0%{%1%}|%t0%{%1%}, %g5}";
   else
return "vgatherq\t{%5, 
%t0%{%1%}|%t0%{%1%}, %g5}";

Jakub


Re: [PATCH, CHKP] Keep instrumentation references consistency when merging cgraph nodes

2014-12-08 Thread Jeff Law

On 12/02/14 06:54, Ilya Enkovich wrote:

Hi,

Currently cgraph nodes merge may break instrumented_version references.  It 
depends on order nodes are read and merged and therefore problem is not nicely 
reproducible.  I couldn't make a small reproducer.  This patch fixes problem 
and 253.perlbmk benchmark build with '-O3 -flto -fcheck-pointer-bounds -mmpx'.  
Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
2014-12-02  Ilya Enkovich  

* lto-cgraph.c (input_cgraph_1): Don't break existing
instrumentation clone references.
* lto/lto-symtab.c (lto_cgraph_replace_node): Redirect
instrumented_version references appropriately.

OK.
Jeff



Re: [PATCHv5][PING^4] Vimrc config with GNU formatting

2014-12-08 Thread Jeff Law

On 11/28/14 10:26, Yury Gribov wrote:



commit 3f560e9dd16a5e914b6f2ba82edffe13dfde944c
Author: Yury Gribov
Date:   Thu Oct 2 15:50:52 2014 +0400

 2014-10-02  Laurynas Biveinis
Yury Gribov

 Vim config with GNU formatting.

 contrib/
* vimrc: New file.

 /
* .gitignore: Added .local.vimrc and .lvimrc.
* Makefile.tpl (vimrc, .lvimrc, .local.vimrc): New targets.
* Makefile.in: Regenerate.
OK, primarily because it's an improvement and not the default :-)  Folks 
that want to use can explicitly opt-in.

Jeff



PING: Re: [PATCH 05/05] Add command-line option-parsing to jit testcases

2014-12-08 Thread David Malcolm
On Tue, 2014-11-25 at 20:34 -0500, David Malcolm wrote:
> Add command-line option-parsing to the testcases, so that we can
> manipulate them without needing a recompile (e.g. varying
> optimization levels etc).
> 
> This uses getopt_long, which is a GNU extension to libc.  Is that
> acceptable?

Ping.  Specifically, is it acceptable to use getopt_long within the jit
testcases, or should I find another way to do this (e.g. just getopt)?

Sorry that it wasn't obvious that I was asking for review on this one.

> Implement a --num-iterations option, to override the default of 5.
> 
> When running tests under valgrind, pass in "--num-iterations=1",
> speeding up the tests by a factor of 5.
> 
> gcc/testsuite/ChangeLog:
>   * jit.dg/harness.h (num_iterations): New variable.
> (parse_options): New function.
> (main): Use "num_iterations", rather than hardcoding 5.
>   * jit.dg/jit.exp (fixed_host_execute): When running tests under
>   valgrind, pass in "--num-iterations=1".
> ---
>  gcc/testsuite/jit.dg/harness.h | 46 
> +++---
>  gcc/testsuite/jit.dg/jit.exp   |  5 +
>  2 files changed, 48 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/jit.dg/harness.h b/gcc/testsuite/jit.dg/harness.h
> index bff64de..a30b66e 100644
> --- a/gcc/testsuite/jit.dg/harness.h
> +++ b/gcc/testsuite/jit.dg/harness.h
> @@ -247,17 +247,57 @@ extract_progname (const char *argv0)
>  }
>  
>  #ifndef TEST_PROVIDES_MAIN
> +
> +/* Option parsing, for varying how we run the built testcases
> +   (e.g. when poking at things in the debugger).  */
> +
> +#include 
> +int num_iterations = 5;
> +
> +static void
> +parse_options (int argc, char **argv)
> +{
> +  while (1)
> +{
> +  static struct option long_options[] =
> + {
> +   {"num-iterations", required_argument, 0, 'i'},
> +   {0, 0, 0, 0}
> + };
> +
> +  int option_index = 0;
> +  /* getopt_long is a GNU extension to libc.  */
> +  int c = getopt_long (argc, argv, "i:",
> +long_options, &option_index);
> +  if (c == -1)
> + break;
> +
> +  switch (c)
> + {
> + case 'i':
> +   num_iterations = atoi (optarg);
> +   break;
> + default:
> +   fprintf (stderr, "Usage: %s [--num-iterations NUM]\n",
> +argv[0]);
> +   exit(EXIT_FAILURE);
> + }
> +}
> +}
> +
>  int
>  main (int argc, char **argv)
>  {
>int i;
>  
> -  for (i = 1; i <= 5; i++)
> +  parse_options (argc, argv);
> +
> +  for (i = 1; i <= num_iterations; i++)
>  {
>snprintf (test, sizeof (test),
>   "%s iteration %d of %d",
> -extract_progname (argv[0]),
> -i, 5);
> + extract_progname (argv[0]),
> + i, num_iterations);
>  
>//printf ("ITERATION %d\n", i);
>test_jit (argv[0], NULL);
> diff --git a/gcc/testsuite/jit.dg/jit.exp b/gcc/testsuite/jit.dg/jit.exp
> index a37ccc7..438aabd 100644
> --- a/gcc/testsuite/jit.dg/jit.exp
> +++ b/gcc/testsuite/jit.dg/jit.exp
> @@ -157,6 +157,11 @@ proc fixed_host_execute {args} {
>   set valgrind_params {"valgrind"}
>   lappend valgrind_params "--leak-check=full"
>   lappend valgrind_params "--log-file=${valgrind_logfile}"
> + # When running under valgrind, speed things up by only running one
> + # in-process iteration of the testcase, rather than the default of 5.
> + # Only testcases that use the "main" from harness.h
> + # (#ifndef TEST_PROVIDES_MAIN) will respond to this option.
> + lappend params "--num-iterations=1"
>  } else {
>   set valgrind_params {}
>  }




Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call

2014-12-08 Thread John David Anglin

On 12/8/2014 3:49 PM, Jeff Law wrote:
I think the terminology and variable names certainly makes this 
tougher to follow than it should.

I certainly agree. When I first looked at the code, I thought it
was completely backwards.

Dave

--
John David Anglindave.ang...@bell.net



Re: [Patch] Improving jump-thread pass for PR 54742

2014-12-08 Thread Steve Ellcey
On Sat, 2014-12-06 at 19:21 +, Sebastian Pop wrote:

> > I think it does not make sense to duplicate paths at -Os: I disabled the FSM
> > jump-threading when optimizing for size like this.
> > 
> > diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> > index 29b20c8..ce70311 100644
> > --- a/gcc/tree-ssa-threadedge.c
> > +++ b/gcc/tree-ssa-threadedge.c
> > @@ -1335,8 +1335,9 @@ thread_through_normal_block (edge e,
> >   return 1;
> > }
> >  
> >if (!flag_expensive_optimizations
> > + || optimize_function_for_size_p (cfun)
> >   || TREE_CODE (cond) != SSA_NAME
> >   || e->dest->loop_father != e->src->loop_father
> >   || loop_depth (e->dest->loop_father) == 0)
> > return 0;
> > 
> > I will regstrap and commit the attached patch.
> 
> Bootstrapped and regression tested on x86_64-linux.
> Committed r218451.
> 
> Sebastian

Thanks for getting all this checked in Sebastian, I have tested it on
coremark and I am getting the speed up that I expect.  But I am a little
confused about turning off jump threading.  I am getting the
optimization on coremark with -O3 and that is great and if I use '-O3
-fno-expensive-optimizations' then I don't see this part of the jump
threading, also great.  But I was surprised that if I just used '-O3
-fno-thread-jumps' then I still see this optimization.  Is that
expected?  Should this test also check flag_thread_jumps?  Or should
that be getting checked somewhere else?

Steve Ellcey




Re: PING: Re: [PATCH 05/05] Add command-line option-parsing to jit testcases

2014-12-08 Thread Mike Stump
On Dec 8, 2014, at 12:44 PM, David Malcolm  wrote:
> On Tue, 2014-11-25 at 20:34 -0500, David Malcolm wrote:
>> Add command-line option-parsing to the testcases, so that we can
>> manipulate them without needing a recompile (e.g. varying
>> optimization levels etc).
>> 
>> This uses getopt_long, which is a GNU extension to libc.  Is that
>> acceptable?
> 
> Ping.  Specifically, is it acceptable to use getopt_long within the jit
> testcases, or should I find another way to do this (e.g. just getopt)?

So, the standard by which we measure, does it kill building or testing of ada 
on mingwin?  If it does, then no, not acceptable.  I’d like to think there is 
nothing you can do in jit.exp that could do that.  So, from this perspective, 
yeah, feel free to do what you want.  Git it done first.  The second person 
that wants to port your code to a new machine (a different machine) will trip 
over all the bad things you did, and someone will then have to fix them.

If you only use what gcc already sues, you will be portable to everything gcc 
is portable to.  If you use GNU extensions to libc, well, that isn’t portable 
enough in general.  Heck, even what’s in libc isn’t portable enough, that’s 
half the reason why we have autoconf in the first place.

If jit is on by default everywhere, then you need to be portable everywhere.  
If only on for linux, then well, it already has GNU extensions in libc.  I 
don’t know if it is on by default and you didn’t say, so, hard to comment on it.

Re: [Patch] PR 61692 - Fix for inline asm ICE

2014-12-08 Thread Jeff Law

On 11/15/14 20:59, David Wohlferd wrote:

On 9/15/2014 2:51 PM, Jeff Law wrote:

Let's go with your original inputs + outputs + labels change and punt
the clobbers stuff for now.

jeff


I have also added the test code you requested.

I have a release on file with the FSF, but don't have SVN write access.

Problem:
extract_insn() in recog.c will ICE if (noperands > MAX_RECOG_OPERANDS).
Normally this isn't a problem since expand_asm_operands() in cfgexpand.c
catches and reports a proper error for this condition.  However,
expand_asm_operands() only checks (ninputs + noutputs) instead of
(ninputs + noutputs + nlabels), so you can get the ICE when using "asm
goto."

ChangeLog:
2014-11-15  David Wohlferd 

 PR target/61692
 * cfgexpand.c (expand_asm_operands): Count all inline asm params.
 * testsuite/gcc.dg/pr61692.c: New test.
Thanks.  I put the testsuite ChangeLog entry in the right place and 
added a dg-do directive to the testcase to ensure we just try to compile it.


Thanks for your patience, things got rather busy and obviously I'm still 
working through my personal backlog of patches from folks.


Jeff


Re: [PATCH] Two more spots for

2014-12-08 Thread Uros Bizjak
On Mon, Dec 8, 2014 at 10:43 PM, Jakub Jelinek  wrote:
> Hi!
>
> I've noticed two new spots which use GET_MODE_SIZE (mode)
> instead of the more efficient .
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2014-12-08  Jakub Jelinek  
>
> * config/i386/sse.md (*mov_internal, *avx512f_gatherdi_2):
> Use  instead of GET_MODE_SIZE (mode).

OK.

Thanks,
Uros.


Re: [PATCH] Fix ICE with non-lvalue vector subscripts and make sure non-lvalue vector subscripts aren't used as lvalues (PR target/63764)

2014-12-08 Thread Jeff Law

On 11/20/14 11:09, Jakub Jelinek wrote:

Hi!

This patch fixes ICEs if a non-lvalue vector (say cast of one vector
to another vector type) was subscripted and used as lhs.
The following patch, if *vecp is not lvalue, will copy it to a temporary
variable which can be made addressable for the subscription, and afterwards
wrap it into a NON_LVALUE_EXPR so that it is properly rejected if later used
on the lhs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-20  Jakub Jelinek  

PR target/63764
c-family/
* c-common.h (convert_vector_to_pointer_for_subscript): Change
return type to bool.
* c-common.c: Include gimple-expr.c.
This was in c-family and thus I didn't see it as committed...  So when I 
went to review, I saw the typo gimple-expr.c when you meant to write 
gimple-expr.h in the ChangeLog entry.


Fixed in the obvious way.

jeff


Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call

2014-12-08 Thread John David Anglin

On 12/8/2014 3:01 PM, Jeff Law wrote:

The above is wrong for sibcalls.  Sibcall arguments are relative to
the incoming argument pointer.  Is this always the frame pointer?
I don't think it's always the frame pointer.  Don't we use an argument 
pointer for the PA64 runtime?  If I recall, it was the only port that 
had a non-eliminable argument pointer at the time.

I don't think PA64 is an argument against this as sibcalls don't
work in the PA64 runtime (they are disabled in pa.c) because the 
argument pointer isn't a
fixed register.  I guess in theory it could be fixed if it was saved and 
restored across calls.


DSE as it stands doesn't look at argument pointer based stores and I 
suspect they would

be deleted with current code.

Dave

--
John David Anglindave.ang...@bell.net



Re: [PATCH] check_GNU_style.sh "80 characters exceeded" error fix

2014-12-08 Thread Jeff Law

On 11/24/14 06:51, Mantas Mikaitis wrote:

check_GNU_style.sh error "Lines should not exceed 80 characters" does
not return the file name and line number where error is present,
only the line of code. Whereas other kind of errors return full
information.

This small patch will fix this and make check_GNU_style.sh return full
information when patch contains lines longer than 80 errors.

Tested on patches containing >80 chars lines and the script produces
full information as necessary.

Would this be a useful enhancement for trunk?

Mantas Mikaitis

gcc/ChangeLog:

  * contrib/check_GNU_style.sh (col): Got rid of cut operation
from the pipe chain and instead added cut inside awk command.

Yes.  Please install on the trunk.

Thanks,
Jeff



Re: PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter

2014-12-08 Thread Uros Bizjak
Hello!

> On Tue, Nov 25, 2014 at 5:05 PM, H.J. Lu  wrote:
>> On Tue, Nov 25, 2014 at 7:01 AM, Richard Biener
>>  wrote:
>>> On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu  wrote:
 Hi,

 The enclosed testcase fails on x86 when compiled with -Os since we pass
 a byte parameter with a byte load in caller and read it as an int in
 callee.  The reason it only shows up with -Os is x86 backend encodes
 a byte load with an int load if -O isn't used.  When a byte load is
 used, the upper 24 bits of the register have random value for none
 WORD_REGISTER_OPERATIONS targets.

 It happens because setup_incoming_promotions in combine.c has

   /* The mode and signedness of the argument before any promotions 
 happen
  (equal to the mode of the pseudo holding it at that stage).  */
   mode1 = TYPE_MODE (TREE_TYPE (arg));
   uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));

   /* The mode and signedness of the argument after any source language 
 and
  TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
   mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
   uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));

   /* The mode and signedness of the argument as it is actually passed,
  after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
   mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
  TREE_TYPE (cfun->decl), 0);

 while they are actually passed in register by assign_parm_setup_reg in
 function.c:

   /* Store the parm in a pseudoregister during the function, but we may
  need to do it in a wider mode.  Using 2 here makes the result
  consistent with promote_decl_mode and thus expand_expr_real_1.  */
   promoted_nominal_mode
 = promote_function_mode (data->nominal_type, data->nominal_mode, 
 &unsignedp,
  TREE_TYPE (current_function_decl), 2);

 where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
 and TYPE_MODE (nominal_type). TREE_TYPE here is
>>>
>>> I think the bug is here, not in combine.c.  Can you try going back in 
>>> history
>>> for both snippets and see if they matched at some point?
>>>
>>
>> The bug was introduced by
>>
>> https://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00613.html
>>
>> commit 5d93234932c3d8617ce92b77b7013ef6bede9508
>> Author: shinwell 
>> Date:   Thu Sep 20 11:01:18 2007 +
>>
>>   gcc/
>>   * combine.c: Include cgraph.h.
>>   (setup_incoming_promotions): Rework to allow more aggressive
>>   elimination of sign extensions when all call sites of the
>>   current function are known to lie within the current unit.
>>
>>
>> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@128618
>> 138bc75d-0d04-0410-961f-82ee72b054a4
>>
>> Before this commit, combine.c has
>>
>>   enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
>>   int uns = TYPE_UNSIGNED (TREE_TYPE (arg));
>>
>>   mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);
>>   if (mode == GET_MODE (reg) && mode != DECL_MODE (arg))
>> {
>>   rtx x;
>>   x = gen_rtx_CLOBBER (DECL_MODE (arg), const0_rtx);
>>   x = gen_rtx_fmt_e ((uns ? ZERO_EXTEND : SIGN_EXTEND), mode, x);
>>   record_value_for_reg (reg, first, x);
>> }
>>
>> It matches function.c:
>>
>>   /* This is not really promoting for a call.  However we need to be
>>  consistent with assign_parm_find_data_types and expand_expr_real_1.  */
>>   promoted_nominal_mode
>> = promote_mode (data->nominal_type, data->nominal_mode, &unsignedp, 1);
>>
>> r128618 changed
>>
>> mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);
>>
>> to
>>
>> mode3 = promote_mode (DECL_ARG_TYPE (arg), mode2, &uns3, 1);
>>
>> It breaks none WORD_REGISTER_OPERATIONS targets.
>
> Hmm, I think that DECL_ARG_TYPE makes a difference only
> for non-WORD_REGISTER_OPERATIONS targets.
>
> But yeah, isolated the above change looks wrong.
>
> Your patch is ok for trunk if nobody objects within 24h and for branches
> after a week.
>
> Thanks,
> Richard.

This patch caused PR64213.

Uros.


[Patch, Fortran] Convert gfc_notify_std to common diagnostics

2014-12-08 Thread Tobias Burnus
This patch requires that the gfc_error patch has been applied, 
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00607.html


The patch does some missing '%s' to %qs and '...' to %<...%> for 
gfc_error, does likewise for gfc_notify_std and converts the latter into 
calls to gfc_error and gfc_warning.


Build and regtested on
OK for the trunk, once the gfc_error patch is in?

Tobias
2014-12-08  Tobias Burnus  
	Manuel López-Ibáñez  

fortran/
	* error.c (gfc_error): Add variant which takes a va_list.
	(gfc_notify_std): Convert to common diagnostic.
 	* array.c: Use %qs, %<...%> in more gfc_error calls and
	for gfc_notify_std.
	* check.c: Ditto.
	* data.c: Ditto.
	* decl.c: Ditto.
	* expr.c: Ditto.
	* interface.c: Ditto.
	* intrinsic.c: Ditto.
	* io.c: Ditto.
	* match.c: Ditto.
	* matchexp.c: Ditto.
	* module.c: Ditto.
	* openmp.c: Ditto.
	* parse.c: Ditto.
	* primary.c: Ditto.
	* resolve.c: Ditto.
	* simplify.c: Ditto.
	* symbol.c: Ditto.
	* trans-common.c: Ditto.
	* trans-intrinsic.c: Ditto.

gcc/testsuite/
	* gfortran.dg/realloc_on_assign_21.f90: Update dg-error.
	* gfortran.dg/warnings_are_errors_1.f: Ditto.
	* gfortran.dg/warnings_are_errors_1.f90: Ditto.

diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c
index e27ca01..300bfeb 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -684,7 +684,7 @@ coarray:
 
   if (current_type == AS_EXPLICIT)
 {
-  gfc_error ("Upper bound of last coarray dimension must be '*' at %C");
+  gfc_error ("Upper bound of last coarray dimension must be %<*%> at %C");
   goto cleanup;
 }
 
diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index ef40e66..527123d 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -384,7 +384,7 @@ less_than_bitsize2 (const char *arg1, gfc_expr *expr1, const char *arg2,
   i3 = gfc_validate_kind (BT_INTEGER, expr1->ts.kind, false);
   if (i2 > gfc_integer_kinds[i3].bit_size)
 	{
-	  gfc_error ("'%s + %s' at %L must be less than or equal "
+	  gfc_error ("%<%s + %s%> at %L must be less than or equal "
 		 "to BIT_SIZE(%qs)",
 		 arg2, arg3, &expr2->where, arg1);
 	  return false;
@@ -581,7 +581,7 @@ dim_corank_check (gfc_expr *dim, gfc_expr *array)
   if (mpz_cmp_ui (dim->value.integer, 1) < 0
   || mpz_cmp_ui (dim->value.integer, corank) > 0)
 {
-  gfc_error ("'dim' argument of %qs intrinsic at %L is not a valid "
+  gfc_error ("% argument of %qs intrinsic at %L is not a valid "
 		 "codimension index", gfc_current_intrinsic, &dim->where);
 
   return false;
@@ -631,7 +631,7 @@ dim_rank_check (gfc_expr *dim, gfc_expr *array, int allow_assumed)
   if (mpz_cmp_ui (dim->value.integer, 1) < 0
   || mpz_cmp_ui (dim->value.integer, rank) > 0)
 {
-  gfc_error ("'dim' argument of %qs intrinsic at %L is not a valid "
+  gfc_error ("% argument of %qs intrinsic at %L is not a valid "
 		 "dimension index", gfc_current_intrinsic, &dim->where);
 
   return false;
@@ -1378,7 +1378,7 @@ gfc_check_cmplx (gfc_expr *x, gfc_expr *y, gfc_expr *kind)
   if (x->ts.type == BT_COMPLEX)
 	{
 	  gfc_error ("%qs argument of %qs intrinsic at %L must not be "
-		 "present if 'x' is COMPLEX",
+		 "present if % is COMPLEX",
 		 gfc_current_intrinsic_arg[1]->name, gfc_current_intrinsic,
 		 &y->where);
 	  return false;
@@ -1428,7 +1428,7 @@ check_co_collective (gfc_expr *a, gfc_expr *image_idx, gfc_expr *stat,
   /* Fortran 2008, 12.5.2.4, paragraph 18.  */
   if (gfc_has_vector_subscript (a))
 {
-  gfc_error ("Argument 'A' with INTENT(INOUT) at %L of the intrinsic "
+  gfc_error ("Argument % with INTENT(INOUT) at %L of the intrinsic "
 		 "subroutine %s shall not have a vector subscript",
 		 &a->where, gfc_current_intrinsic);
   return false;
@@ -1728,7 +1728,7 @@ gfc_check_count (gfc_expr *mask, gfc_expr *dim, gfc_expr *kind)
 return false;
   if (!kind_check (kind, 2, BT_INTEGER))
 return false;
-  if (kind && !gfc_notify_std (GFC_STD_F2003, "'%s' intrinsic "
+  if (kind && !gfc_notify_std (GFC_STD_F2003, "%qs intrinsic "
 			   "with KIND argument at %L", 
 			   gfc_current_intrinsic, &kind->where))
 return false;
@@ -1835,7 +1835,7 @@ gfc_check_dcmplx (gfc_expr *x, gfc_expr *y)
   if (x->ts.type == BT_COMPLEX)
 	{
 	  gfc_error ("%qs argument of %qs intrinsic at %L must not be "
-		 "present if 'x' is COMPLEX",
+		 "present if % is COMPLEX",
 		 gfc_current_intrinsic_arg[1]->name, gfc_current_intrinsic,
 		 &y->where);
 	  return false;
@@ -1908,7 +1908,8 @@ gfc_check_dot_product (gfc_expr *vector_a, gfc_expr *vector_b)
   if (! identical_dimen_shape (vector_a, 0, vector_b, 0))
 {
   gfc_error ("Different shape for arguments %qs and %qs at %L for "
-		 "intrinsic 'dot_product'", gfc_current_intrinsic_arg[0]->name,
+		 "intrinsic %",
+		 gfc_current_intrinsic_arg[0]->name,
 		 gfc_current_intrinsic_arg[1]->name, &vector_a->where);
   return false;
 }
@@ -2146,9 +2147,9 @@ gfc

Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype

2014-12-08 Thread Joseph Myers
On Mon, 8 Dec 2014, Richard Biener wrote:

> I'm not pushing this further for stage3, but for stage1 I'd like
> to eventually address this by splitting up builtin_info_type's
> 'implicit_p' into a flags array providing implicit_p, declared_p,
> used_p and maybe declared_in_system_header_p.  Would you be
> willing to fill in the gap computing "used_p" in the C frontend?

C_DECL_USED is the right notion of "used" with the front end, in standard 
terms (i.e. the notion that requires a definition of the function to be 
present somewhere), though as noted some people may be relying on calls 
being optimized out.  (Of course such people could get implicitly 
generated libm calls at present anyway; the point of checking "used" is to 
restrict call generation to cases where we know that some library with the 
function will be linked in, and that the function will have the right 
semantics.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call

2014-12-08 Thread Jeff Law

On 12/08/14 15:15, John David Anglin wrote:

On 12/8/2014 3:01 PM, Jeff Law wrote:

The above is wrong for sibcalls.  Sibcall arguments are relative
to the incoming argument pointer.  Is this always the frame
pointer?

I don't think it's always the frame pointer.  Don't we use an
argument pointer for the PA64 runtime?  If I recall, it was the
only port that had a non-eliminable argument pointer at the time.

I don't think PA64 is an argument against this as sibcalls don't work
in the PA64 runtime (they are disabled in pa.c) because the argument
pointer isn't a fixed register.  I guess in theory it could be fixed
if it was saved and restored across calls.
But there's nothing that says another port in the future won't have 
similar characteristics as the PA, so while the PA isn't particularly 
important, it shows there's cases where arguments won't be accessed by 
the FP.





DSE as it stands doesn't look at argument pointer based stores and I
 suspect they would be deleted with current code.

Agreed.

jeff


Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype

2014-12-08 Thread Joseph Myers
On Mon, 8 Dec 2014, Richard Biener wrote:

> The alternative is to decide "used" in the middle-end at one point,
> for example at the end of all_lowering_passes where hopefully
> we have constant folded and removed dead code enough.  We can also
> compute an overall "uses libm" flag to fix the testcase I reported
> (of course we'd like to re-compute that at LTO time).

If you do that, of course any optimizations before deciding "used" need to 
be conservative - assume that any function except the one you're 
optimizing a call of is not "used".

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: FW: [PATCH] Fix PR ipa/64049

2014-12-08 Thread Jan Hubicka
> On Mon, Dec 8, 2014 at 11:57 AM, Bernd Edlinger
>  wrote:
> >
> >
> >>
> >> Hello,
> >>
> >>
> >> this surprisingly simple patch fixes PR ipa/64049.  The root cause seems 
> >> to be
> >> that in this test case we try to devirtualize a method call on a return 
> >> value.
> >>
> >>
> >> Boot-Strapped and regression-tested on X86_64-linux-gnu.
> >> OK for trunk?
> 
> I think positive tests, in this case TREE_CODE (...) == VAR_DECL
> are better.
> 
> Ok with that change.

The test there assumes that only values with well defined default value are 
parm decls,
so in cases we see a call on default value of other kind of decl, we could turn 
it
to undefined call.

In what cases has RESULT_DECL sane initial value?  We should at least drop a 
comment here.

Honza
> 
> Thanks,
> Richard.
> 
> >>
> >> Thanks
> >> Bernd.
> >>
> >
> > Again with changelog. Sorry.
> >


locales fixes

2014-12-08 Thread François Dumont

Hi

After having installed all necessary locales on my system I end up 
with 4 failures. Here is a patch to fix them all.


For numpunct test I consider that checking value of numpunct::grouping() 
result was not correct. libstdc++ doesn't control this value. I prefer 
to just check that values returned by numpunct are consistent with how 
numeric values are formatted.


For the others, regarding time_get, I found nothing really smart to do. 
For 4.cc I simply had to remove some characters not present anymore. And 
for 38081-1 and 38081-2, despite having glibc 2.19 it is still showing 
previous values on my system. I even wonder if it has anything to do 
with glibc version, it rather depends on the installed locale info, no ?


2014-12-08  François Dumont  

* testsuite/22_locale/numpunct/members/char/3.cc: Check numpunct 
returned

values are consistent with how numbers are formatted.
* testsuite/22_locale/time_get/get_date/wchar_t/4.cc: Update expected
values.
* testsuite/22_locale/time_get/get_weekday/char/38081-1.cc: Don't 
use new

values before glibc 2.20.
* testsuite/22_locale/time_get/get_weekday/char/38081-2.cc: Likewise.

Tested under Linux x86_64.

François

Index: testsuite/22_locale/numpunct/members/char/3.cc
===
--- testsuite/22_locale/numpunct/members/char/3.cc	(revision 218492)
+++ testsuite/22_locale/numpunct/members/char/3.cc	(working copy)
@@ -21,6 +21,7 @@
 
 // 22.2.3.2 Template class numpunct_byname
 
+#include 
 #include 
 #include 
 
@@ -30,13 +31,52 @@
   
   bool test __attribute__((unused)) = true;
 
-  locale loc_it = locale("it_IT");
+  // Use numpunct to generate representation for an arbitrary number.
+  const int number = 10;
 
-  const numpunct& nump_it = use_facet >(loc_it); 
+  ostringstream ostr;
+  ostr.imbue(locale::classic());
+  
+  ostr << number;
+  string c_number = ostr.str();
 
+  locale loc_it = locale("it_IT");
+  const numpunct& nump_it = use_facet >(loc_it);
   string g = nump_it.grouping();
 
-  VERIFY( g == "" );
+  size_t group_index = 0;
+  string it_number;
+  size_t it_pos = 0;
+  size_t offset = 0;
+  for (size_t pos = 0; pos != c_number.size(); ++pos)
+{
+  int group_size =
+	group_index < g.size() ? (int)g[group_index]
+		  : (g.empty()
+			 // No grouping, just make group size big enough so that
+			 // there will be no insertion of thousands separator.
+			 ? c_number.size()
+			 : g[g.size() - 1]);
+  if (pos + offset < it_pos + group_size)
+	it_number.insert(it_number.begin(), c_number[c_number.size() - pos - 1]);
+  else
+	{
+	  // Time to add the group separator.
+	  it_number.insert(it_number.begin(), nump_it.thousands_sep());
+	  it_number.insert(it_number.begin(), c_number[c_number.size() - pos - 1]);
+	  ++offset;
+	  ++group_index;
+	  it_pos = it_number.size() - 1;
+	}
+}
+
+  // Check that the result is identical to the one obtained with an
+  // ostringstream imbued with the it_IT locale.
+  ostr.str("");
+  ostr.imbue(loc_it);
+  ostr << number;
+
+  VERIFY( ostr.str() == it_number );
 }
 
 int main()
Index: testsuite/22_locale/time_get/get_date/wchar_t/4.cc
===
--- testsuite/22_locale/time_get/get_date/wchar_t/4.cc	(revision 218492)
+++ testsuite/22_locale/time_get/get_date/wchar_t/4.cc	(working copy)
@@ -43,7 +43,7 @@
   const ios_base::iostate good = ios_base::goodbit;
   ios_base::iostate errorstate = good;
 
-  const wchar_t wstr[] = { 0x897f, 0x5143, L'2', L'0', L'0', L'3',
+  const wchar_t wstr[] = { /* 0x897f, 0x5143,*/ L'2', L'0', L'0', L'3',
 			   0x5e74, L'1', L'2', 0x6708, L'1', L'7',
 			   0x65e5 , 0x0 };
 
Index: testsuite/22_locale/time_get/get_weekday/char/38081-1.cc
===
--- testsuite/22_locale/time_get/get_weekday/char/38081-1.cc	(revision 218492)
+++ testsuite/22_locale/time_get/get_weekday/char/38081-1.cc	(working copy)
@@ -50,7 +50,7 @@
   // ios_base::iostate&, tm*) const
 
 #if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 14)
-# if __GLIBC__ > 2 || __GLIBC_MINOR__ >= 17
+# if __GLIBC__ > 2 || __GLIBC_MINOR__ >= 20
   iss.str("\xbf\xdd");
 # else
   iss.str("\xbf\xdd\x2e");
@@ -76,7 +76,7 @@
   VERIFY( errorstate == ios_base::eofbit );
 
 #if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 14)
-# if __GLIBC__ > 2 || __GLIBC_MINOR__ >= 17
+# if __GLIBC__ > 2 || __GLIBC_MINOR__ >= 20
   iss.str("\xbf\xdd\xd5\xd4\xd5\xdb\xec\xdd\xd8\xda");
 # else
   iss.str("\xbf\xdd\x2e\xd5\xd4\xd5\xdb\xec\xdd\xd8\xda");
Index: testsuite/22_locale/time_get/get_weekday/char/38081-2.cc
===
--- testsuite/22_locale/time_get/get_weekday/char/38081-2.cc	(revision 218492)
+++ testsuite/22_locale/time_get/get_weekday/char/38081-2.cc	(working copy)
@@ -51,7 +51,7 @@
   // ios_base::i

  1   2   >