Re: [4.8] backport fixes for wrong-code PR57425 and PR57569

2014-03-17 Thread Richard Biener
On Sat, Mar 15, 2014 at 7:05 PM, Mikael Pettersson  wrote:
> This backports the fixes for wrong-code bugs PR57425 and PR57569,
> both marked as 4.8 regressions, from mainline to 4.8 branch.
>
> Tested since June last year on x86_64, powerpc64, sparc64, armv5tel,
> and m68k without regressions.  According to Bill Schmidt it also
> fixes a wrong-code problem for powerpc64le on IBM's 4.8 branch.
>
> Ok for 4.8 branch?

Ok.

Thanks,
Richard.

> Thanks,
>
> /Mikael
>
> (I don't have commit rights, but Bill has agreed to do the commit if
> this backport is approved.)
>
>
> gcc/
>
> 2014-03-15  Mikael Pettersson  
>
> Backport from mainline:
>
> 2013-06-20  Joern Rennecke 
>
> PR rtl-optimization/57425
> PR rtl-optimization/57569
> * alias.c (write_dependence_p): Remove parameters mem_mode and
> canon_mem_addr.  Add parameters x_mode, x_addr and x_canonicalized.
> Changed all callers.
> (canon_anti_dependence): Get comments and semantics in sync.
> Add parameter mem_canonicalized.  Changed all callers.
> * rtl.h (canon_anti_dependence): Update prototype.
>
> 2013-06-16  Joern Rennecke 
>
> PR rtl-optimization/57425
> PR rtl-optimization/57569
> * alias.c (write_dependence_p): Add new parameters mem_mode,
> canon_mem_addr and mem_canonicalized.  Change type of writep to bool.
> Changed all callers.
> (canon_anti_dependence): New function.
> * cse.c (check_dependence): Use canon_anti_dependence.
> * cselib.c (cselib_invalidate_mem): Likewise.
> * rtl.h (canon_anti_dependence): Declare.
>
> gcc/testsuite/
>
> 2014-03-15  Mikael Pettersson  
>
> Backport from mainline:
>
> 2013-06-16  Joern Rennecke 
>
> PR rtl-optimization/57425
> PR rtl-optimization/57569
> * gcc.dg/torture/pr57425-1.c, gcc.dg/torture/pr57425-2.c: New files.
> * gcc.dg/torture/pr57425-3.c, gcc.dg/torture/pr57569.c: Likewise.
>
> --- gcc-4.8.2/gcc/alias.c.~1~   2013-03-05 10:40:38.0 +0100
> +++ gcc-4.8.2/gcc/alias.c   2014-03-15 18:18:31.402652881 +0100
> @@ -156,7 +156,9 @@ static int insert_subset_children (splay
>  static alias_set_entry get_alias_set_entry (alias_set_type);
>  static bool nonoverlapping_component_refs_p (const_rtx, const_rtx);
>  static tree decl_for_component_ref (tree);
> -static int write_dependence_p (const_rtx, const_rtx, int);
> +static int write_dependence_p (const_rtx,
> +  const_rtx, enum machine_mode, rtx,
> +  bool, bool, bool);
>
>  static void memory_modified_1 (rtx, const_rtx, void *);
>
> @@ -2558,15 +2560,24 @@ canon_true_dependence (const_rtx mem, en
>  }
>
>  /* Returns nonzero if a write to X might alias a previous read from
> -   (or, if WRITEP is nonzero, a write to) MEM.  */
> +   (or, if WRITEP is true, a write to) MEM.
> +   If X_CANONCALIZED is true, then X_ADDR is the canonicalized address of X,
> +   and X_MODE the mode for that access.
> +   If MEM_CANONICALIZED is true, MEM is canonicalized.  */
>
>  static int
> -write_dependence_p (const_rtx mem, const_rtx x, int writep)
> +write_dependence_p (const_rtx mem,
> +   const_rtx x, enum machine_mode x_mode, rtx x_addr,
> +   bool mem_canonicalized, bool x_canonicalized, bool writep)
>  {
> -  rtx x_addr, mem_addr;
> +  rtx mem_addr;
>rtx base;
>int ret;
>
> +  gcc_checking_assert (x_canonicalized
> +  ? (x_addr != NULL_RTX && x_mode != VOIDmode)
> +  : (x_addr == NULL_RTX && x_mode == VOIDmode));
> +
>if (MEM_VOLATILE_P (x) && MEM_VOLATILE_P (mem))
>  return 1;
>
> @@ -2590,17 +2601,21 @@ write_dependence_p (const_rtx mem, const
>if (MEM_ADDR_SPACE (mem) != MEM_ADDR_SPACE (x))
>  return 1;
>
> -  x_addr = XEXP (x, 0);
>mem_addr = XEXP (mem, 0);
> -  if (!((GET_CODE (x_addr) == VALUE
> -&& GET_CODE (mem_addr) != VALUE
> -&& reg_mentioned_p (x_addr, mem_addr))
> -   || (GET_CODE (x_addr) != VALUE
> -   && GET_CODE (mem_addr) == VALUE
> -   && reg_mentioned_p (mem_addr, x_addr))))
> +  if (!x_addr)
>  {
> -  x_addr = get_addr (x_addr);
> -  mem_addr = get_addr (mem_addr);
> +  x_addr = XEXP (x, 0);
> +  if (!((GET_CODE (x_addr) == VALUE
> +&& GET_CODE (mem_addr) != VALUE
> +&& reg_mentioned_p (x_addr, mem_addr))
> +   || (GET_CODE (x_addr) != VALUE
> +   && GET_CODE (mem_addr) == VALUE
> +   && reg_mentioned_p (mem_addr, x_addr))))
> +   {
> + x_addr = get_addr (x_addr);
> + if (!mem_canonicalized)
> +   mem_addr = get_addr (mem_addr);
> +   }
>  }
>
>if (! writep)
> @@ -2616,11 +2631,16 @@ write_dependence_p (const_rtx mem, const
>   GET_MODE (mem)))
>  return 0;
>
> -  x_addr = canon_rtx (x_addr);
> -  mem_ad

Re: C++ PATCH for c++/58678 (devirt vs. KDE)

2014-03-17 Thread Jan Hubicka
> Honza suggested that if the destructor for an abstract class can't
> ever be called through the vtable, the front end could avoid
> referring to it from the vtable.  This patch replaces such a
> destructor with __cxa_pure_virtual in the vtable.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.

Thank you!  I would prefer a different marker than __cxa_pure_virtual in the vtable,
most probably simply NULL.

The reason is that __cxa_pure_virtual will appear as a possible target in the
list, and it will prevent devirtualization from happening when we end up with
__cxa_pure_virtual and the real destructor in the list of possible targets.
gimple_get_virt_method_for_vtable knows that vtable lookups that do not
result in a FUNCTION_DECL should be translated to BUILTIN_UNREACHABLE, and
ipa-devirt drops these from the list of targets, unlike __cxa_pure_virtual, which
stays.

The other problem with __cxa_pure_virtual is that it needs an external relocation.
I sort of wondered whether we want to produce a hidden comdat wrapper for
it, so C++ programs are easier to relocate.

I will still keep the patch to mark ABSTRACT classes by a BINFO flag and will
send out the patch I made to make ABSTRACT classes be ignored for
anonymous namespace types.  It seems to make a difference for libreoffice,
which uses a lot of abstract classes.

What do you think of the following patch, which makes ipa-devirt conclude
that destructor calls are never done on types in construction?
If the effect of doing so is undefined, I think it is safe to drop them from
the list of targets, and that really helps to reduce the lists.


Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 208492)
+++ ipa-devirt.c(working copy)
@@ -1511,7 +1558,10 @@ possible_polymorphic_call_targets (tree
   target = NULL;
 }
 
-  maybe_record_node (nodes, target, inserted, can_refer, &complete);
+  /* Destructors are never called through construction virtual tables,
+ because the type is always known.  */
+  if (target && DECL_CXX_DESTRUCTOR_P (target))
+context.maybe_in_construction = false;
 
   if (target)
 {


[PATCH][match-and-simplify] Commit bootstrap workaround

2014-03-17 Thread Richard Biener

This temporarily adds -fpermissive to the gimple-match.c compile
to allow bootstrapping.

Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages.

Richard.

2014-03-17  Richard Biener  

* Makefile.in (gimple-match.o-warn): Temporarily add -fpermissive
to allow bootstrapping.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 208609)
+++ gcc/Makefile.in (working copy)
@@ -196,7 +196,7 @@ GCC_WARN_CXXFLAGS = $(LOOSE_WARN) $($(@D
 # flex output may yield harmless "no previous prototype" warnings
 build/gengtype-lex.o-warn = -Wno-error
 gengtype-lex.o-warn = -Wno-error
-gimple-match.o-warn = -Wno-error
+gimple-match.o-warn = -Wno-error -fpermissive
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either


[patch testsuite]: Correct testcase for LLP64 targets

2014-03-17 Thread Kai Tietz
Hi,

this patch corrects a regression seen in
gcc.c-torture/compile/20010327-1.c for LLP64 targets.

ChangeLog

2014-03-17  Kai Tietz  

* gcc.c-torture/compile/20010327-1.c: Adjust testcase for LLP64 targets.

OK to apply?

Regards,
Kai

Index: gcc.c-torture/compile/20010327-1.c
===
--- gcc.c-torture/compile/20010327-1.c(Revision 208594)
+++ gcc.c-torture/compile/20010327-1.c(Arbeitskopie)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target ptr32plus } */
+/* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { llp64 } } { "*" } { "" } } */

 /* This testcase tests whether GCC can produce static initialized data
that references addresses of size 'unsigned long', even if that's not
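
For readers unfamiliar with the term: LLP64 is the data model in which int
and long are 32 bits wide while pointers are 64 bits wide, so an
'unsigned long' cannot hold an address.  A minimal sketch in C of how the
common models differ; the helper names are hypothetical and are not part
of the testsuite:

```c
#include <stddef.h>

/* Hypothetical helper (not part of the GCC testsuite): name the data
   model implied by the sizes, in bytes, of int, long and a pointer.  */
const char *
data_model (size_t int_size, size_t long_size, size_t ptr_size)
{
  if (int_size == 4 && long_size == 4 && ptr_size == 4)
    return "ILP32";
  if (int_size == 4 && long_size == 8 && ptr_size == 8)
    return "LP64";
  if (int_size == 4 && long_size == 4 && ptr_size == 8)
    return "LLP64";
  return "other";
}

/* Nonzero if 'unsigned long' is at least as wide as a pointer, which is
   what a static initializer that casts an address to 'unsigned long'
   relies on.  */
int
ulong_holds_pointer (size_t long_size, size_t ptr_size)
{
  return long_size >= ptr_size;
}
```

Calling data_model (sizeof (int), sizeof (long), sizeof (void *)) on the
target of interest shows which case applies there.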


[PATCH] Fix gfortran.dg/unlimited_polymorphic_13.f90

2014-03-17 Thread Andreas Schwab
Tested on {x86_64,m68k}-suse-linux and installed as obvious.

Andreas.

PR testsuite/58851
* gfortran.dg/unlimited_polymorphic_13.f90: Properly compute
storage size.
---
 gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90 | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90 
b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90
index 0e27b17..8225738 100644
--- a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90
+++ b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90
@@ -23,18 +23,24 @@ contains
 integer :: k
 integer :: sz
 
+sz = 0
 select case (k)
  case (4)
   sz = storage_size(r1)*2
+end select
+select case (k)
  case (8)
   sz = storage_size(r2)*2
- case (10)
+end select
+select case (k)
+ case (real_kinds(size(real_kinds)-1))
   sz = storage_size(r3)*2
- case (16)
+end select
+select case (k)
+ case (real_kinds(size(real_kinds)))
   sz = storage_size(r4)*2
- case default
-   call abort()
 end select
+if (sz .eq. 0) call abort()
 
 if (storage_size(o) /= sz) call abort()
 
-- 
1.9.0

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[C++ Patch] PR 59571

2014-03-17 Thread Paolo Carlini

Hi,

noticed this issue, which looks simple to fix. The ICE happens in
cxx_eval_constant_expression, because it cannot handle a CAST_EXPR (or
any other *_CAST, for that matter). In fact check_narrowing calls
maybe_constant_value and, because we are in a template, the latter
faces the unfolded CAST_EXPR. Thus it seems easy to just use
fold_non_dependent_expr_sfinae. Tested x86_64-linux.


Thanks,
Paolo.

///

PS: looking forward, I'm wondering if some semantics/typeck functions 
shouldn't try harder before building a tree node and returning, eg, 
instead of just checking processing_template_decl, actually checking if 
type and expr are dependent? Does this kind of audit make sense for next 
Stage 1?
/cp
2014-03-17  Paolo Carlini  

PR c++/59571
* typeck2.c (check_narrowing): Use fold_non_dependent_expr_sfinae.

/testsuite
2014-03-17  Paolo Carlini  

PR c++/59571
* g++.dg/cpp0x/constexpr-ice13.C: New.
Index: cp/typeck2.c
===
--- cp/typeck2.c(revision 208605)
+++ cp/typeck2.c(working copy)
@@ -861,7 +861,7 @@ check_narrowing (tree type, tree init)
   return;
 }
 
-  init = maybe_constant_value (init);
+  init = maybe_constant_value (fold_non_dependent_expr_sfinae (init, tf_none));
 
   if (TREE_CODE (type) == INTEGER_TYPE
   && TREE_CODE (ftype) == REAL_TYPE)
Index: testsuite/g++.dg/cpp0x/constexpr-ice13.C
===
--- testsuite/g++.dg/cpp0x/constexpr-ice13.C(revision 0)
+++ testsuite/g++.dg/cpp0x/constexpr-ice13.C(working copy)
@@ -0,0 +1,8 @@
+// PR c++/59571
+// { dg-do compile { target c++11 } }
+
+template 
+struct foo
+{
+  static constexpr int bar{(int)-1};
+};


[PATCH, 5/n] Handle CCMP in ifcvt to make it work with cmov

2014-03-17 Thread Zhenqiang Chen
Hi,

This patch enhances ifcvt to handle the conditional compare instruction
(ccmp) so that it works with cmov. For ccmp, ALLOW_CC_MODE is set to
TRUE when calling canonicalize_condition, and the backend does not
need to generate an additional "compare (CC, 0)" for it.
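
As an illustration only, this is the kind of source where a conditional
compare followed by a conditional select could be used on AArch64.  The
function below is a hypothetical example, not one of the new testcases,
and whether a given compiler actually emits ccmp and csel for it depends
on the compiler version and options:

```c
/* Returns C when both comparisons hold and D otherwise.  On a target
   with conditional compare, the second comparison can be made
   conditional on the first, and the result chosen with a conditional
   select instead of a branch.  */
int
select_if_both (int a, int b, int c, int d)
{
  return (a == 17 && b != 33) ? c : d;
}
```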

Bootstrapped with no make check regressions on x86-64, ARM Chromebook and qemu-aarch64.

Is it OK for next stage1?

Thanks!
-Zhenqiang

ChangeLog:
2014-03-17  Zhenqiang Chen  

* ifcvt.c (struct noce_if_info): Add a new field ccmp_p.
(noce_emit_cmove): Allow ccmp condition.
(noce_get_alt_condition): Call canonicalize_condition with ccmp_p.
(noce_get_condition): Set ALLOW_CC_MODE to TRUE for ccmp.
(noce_process_if_block): Set ccmp_p for ccmp.
* recog.h (ccmp_insn_p): New prototype.
* recog.c (ccmp_insn_p): Make it global.
* config/aarch64/aarch64.md (movcc): Handle ccmp_cc.

testsuite/ChangeLog:
2014-03-17  Zhenqiang Chen  

* gcc.target/aarch64/ccmn-csel-1.c: New testcase.
* gcc.target/aarch64/ccmn-csel-2.c: New testcase.
* gcc.target/aarch64/ccmn-csel-3.c: New testcase.
* gcc.target/aarch64/ccmp-csel-1.c: New testcase.
* gcc.target/aarch64/ccmp-csel-2.c: New testcase.
* gcc.target/aarch64/ccmp-csel-3.c: New testcase.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 79aa2f3..4e18bb2 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -786,6 +786,9 @@ struct noce_if_info

   /* Estimated cost of the particular branch instruction.  */
   int branch_cost;
+
+  /* The COND is a conditional compare.  */
+  bool ccmp_p;
 };

 static rtx noce_emit_store_flag (struct noce_if_info *, rtx, int, int);
@@ -1407,9 +1410,16 @@ noce_emit_cmove (struct noce_if_info *if_info,
rtx x, enum rtx_code code,
   end_sequence ();
 }

-  /* Don't even try if the comparison operands are weird.  */
-  if (! general_operand (cmp_a, GET_MODE (cmp_a))
-  || ! general_operand (cmp_b, GET_MODE (cmp_b)))
+  /* Don't even try if the comparison operands are weird
+ except conditional compare.  */
+  if (if_info->ccmp_p)
+{
+  if (!(GET_MODE_CLASS (GET_MODE (cmp_a)) == MODE_CC
+|| GET_MODE_CLASS (GET_MODE (cmp_b)) == MODE_CC))
+return NULL_RTX;
+}
+  else if (! general_operand (cmp_a, GET_MODE (cmp_a))
+   || ! general_operand (cmp_b, GET_MODE (cmp_b)))
 return NULL_RTX;

 #if HAVE_conditional_move
@@ -1849,7 +1859,7 @@ noce_get_alt_condition (struct noce_if_info
*if_info, rtx target,
 }

   cond = canonicalize_condition (if_info->jump, cond, reverse,
- earliest, target, false, true);
+ earliest, target, if_info->ccmp_p, true);
   if (! cond || ! reg_mentioned_p (target, cond))
 return NULL;

@@ -2300,6 +2310,7 @@ noce_get_condition (rtx jump, rtx *earliest,
bool then_else_reversed)
 {
   rtx cond, set, tmp;
   bool reverse;
+  int allow_cc_mode = false;

   if (! any_condjump_p (jump))
 return NULL_RTX;
@@ -2333,10 +2344,21 @@ noce_get_condition (rtx jump, rtx *earliest,
bool then_else_reversed)
   return cond;
 }

+  /* For conditional compare, set ALLOW_CC_MODE to TRUE.  */
+  if (targetm.gen_ccmp_first)
+{
+  rtx prev = prev_nonnote_nondebug_insn (jump);
+  if (prev
+  && NONJUMP_INSN_P (prev)
+  && BLOCK_FOR_INSN (prev) == BLOCK_FOR_INSN (jump)
+  && ccmp_insn_p (prev))
+allow_cc_mode = true;
+}
+
   /* Otherwise, fall back on canonicalize_condition to do the dirty
  work of manipulating MODE_CC values and COMPARE rtx codes.  */
   tmp = canonicalize_condition (jump, cond, reverse, earliest,
-NULL_RTX, false, true);
+NULL_RTX, allow_cc_mode, true);

   /* We don't handle side-effects in the condition, like handling
  REG_INC notes and making sure no duplicate conditions are emitted.  */
@@ -2577,6 +2599,11 @@ noce_process_if_block (struct noce_if_info *if_info)
   if_info->a = a;
   if_info->b = b;

+  if (targetm.gen_ccmp_first)
+if (GET_MODE_CLASS (GET_MODE (XEXP (if_info->cond, 0))) == MODE_CC
+|| GET_MODE_CLASS (GET_MODE (XEXP (if_info->cond, 1))) == MODE_CC)
+  if_info->ccmp_p = true;
+
   /* Try optimizations in some approximation of a useful order.  */
   /* ??? Should first look to see if X is live incoming at all.  If it
  isn't, we don't need anything but an unconditional set.  */
diff -aru gcc/gcc/recog.c ccmp-all/gcc/recog.c
--- a/gcc/recog.c2014-03-13 16:45:00.524945484 +0800
+++ b/gcc/recog.c2014-03-13 15:30:22.468912473 +0800
@@ -556,7 +556,7 @@
 #define CODE_FOR_extzvCODE_FOR_nothing
 #endif

-static bool
+bool
 ccmp_insn_p (rtx object)
 {
   rtx x = PATTERN (object);
diff -aru gcc/gcc/recog.h ccmp-all/gcc/recog.h
--- a/gcc/recog.h2014-03-13 16:44:42.284945350 +0800
+++ b/gcc/recog.h2014-03-13 15:30:22.348912472 +0800
@@ -360,5 +360,6 @@

 extern const struct insn_data_d insn_data[];
 extern int peep2_current_count;
+bool ccmp_insn_p (rtx);

 #endif /* GCC_RECOG_H */
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aa

Re: [patch testsuite]: Correct testcase for LLP64 targets

2014-03-17 Thread Rainer Orth
Hi Kai,

> Index: gcc.c-torture/compile/20010327-1.c
> ===
> --- gcc.c-torture/compile/20010327-1.c(Revision 208594)
> +++ gcc.c-torture/compile/20010327-1.c(Arbeitskopie)
> @@ -1,4 +1,5 @@
>  /* { dg-require-effective-target ptr32plus } */
> +/* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { llp64 } } { "*" } { "" } } 
> */

the usual comments apply:

* add a comment/PR reference as the first arg to dg-skip-if explaining
  the skip

* omit the default args { "*" } { "" }

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [patch testsuite]: Correct testcase for LLP64 targets

2014-03-17 Thread Jakub Jelinek
On Mon, Mar 17, 2014 at 10:50:35AM +0100, Rainer Orth wrote:
> Hi Kai,
> 
> > Index: gcc.c-torture/compile/20010327-1.c
> > ===
> > --- gcc.c-torture/compile/20010327-1.c(Revision 208594)
> > +++ gcc.c-torture/compile/20010327-1.c(Arbeitskopie)
> > @@ -1,4 +1,5 @@
> >  /* { dg-require-effective-target ptr32plus } */
> > +/* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { llp64 } } { "*" } { "" } 
> > } */
> 
> the usual comments apply:
> 
> * add a comment/PR reference as the first arg to dg-skip-if explaining
>   the skip
> 
> * omit the default args { "*" } { "" }

Or perhaps just drop dg-require-effective-target directive and instead do
/* { dg-do compile { target { ptr32plus && ! llp64 } } } */

Jakub


Re: [patch testsuite]: Correct testcase for LLP64 targets

2014-03-17 Thread Kai Tietz
2014-03-17 10:53 GMT+01:00 Jakub Jelinek :
> On Mon, Mar 17, 2014 at 10:50:35AM +0100, Rainer Orth wrote:
>> Hi Kai,
>>
>> > Index: gcc.c-torture/compile/20010327-1.c
>> > ===
>> > --- gcc.c-torture/compile/20010327-1.c(Revision 208594)
>> > +++ gcc.c-torture/compile/20010327-1.c(Arbeitskopie)
>> > @@ -1,4 +1,5 @@
>> >  /* { dg-require-effective-target ptr32plus } */
>> > +/* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { llp64 } } { "*" } { "" 
>> > } } */
>>
>> the usual comments apply:
>>
>> * add a comment/PR reference as the first arg to dg-skip-if explaining
>>   the skip
>>
>> * omit the default args { "*" } { "" }
>
> Or perhaps just drop dg-require-effective-target directive and instead do
> /* { dg-do compile { target { ptr32plus && ! llp64 } } } */
>
> Jakub

Yeah, omitting the dg-require-effective-target directive looks to me
like the best thing to do.  Adding a skip directive is superfluous.

OK with the following patch?

Kai

Index: 20010327-1.c
===
--- 20010327-1.c(Revision 208594)
+++ 20010327-1.c(Arbeitskopie)
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target ptr32plus } */
+/* { dg-do compile { ptr32plus && !llp64 } } */

 /* This testcase tests whether GCC can produce static initialized data
that references addresses of size 'unsigned long', even if that's not


Re: [PATCH][AARCH64]combine "ubfiz" and "orr" with bfi when certain condition meets.

2014-03-17 Thread Richard Earnshaw
On 16/03/14 12:30, Renlin Li wrote:
> Hi all,
> 
> Thank you for your suggestions, Richard. I have updated the patch 
> accordingly.
> 
> This is an optimization patch which combines "ubfiz" and "orr"
> insns into a single "bfi" when certain conditions are met.
> 
> tmp = (x & m) | ( (y & n) << lsb) can be represented using
> 
>  and tmp, x, m
>  bfi tmp, y, #lsb, #width
> 
> if ((n+1) == 2^width) && (m & n << lsb) == 0.
> 
> A small test case is also added to verify it.
> 
> Is this Okay for stage-1?
> 
> Kind regards,
> Renlin Li
> 

This looks to me more like a 3 into two split operation where combine
needs some help to do the split, since the transformation is
non-trivial.  As such, I think you just need a define_split rather than
a define_insn_and_split (there's also no obvious reason why we would
want to defer this split until after register allocation).

Furthermore, you have an early-clobber situation here: it's important
that y and tmp aren't in the same register.  You appear to try to cater
for this by using an operand tie, but that's unnecessary in general (the
AND operation can write any usable register) and won't work in the
specific case where x = y.

R.
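
The identity quoted at the top of this message can be checked with a
small C model.  This is only a sketch: bfi32 is a hypothetical software
model of a 32-bit bit-field insert, written for illustration, and it
assumes 0 < width and lsb + width <= 32.

```c
#include <stdint.h>

/* Model of "bfi dst, src, #lsb, #width" on 32-bit values: copy the low
   WIDTH bits of SRC into DST starting at bit LSB and leave the other
   bits of DST unchanged.  */
uint32_t
bfi32 (uint32_t dst, uint32_t src, unsigned lsb, unsigned width)
{
  uint32_t field = (width == 32) ? UINT32_MAX
                                 : ((UINT32_C (1) << width) - 1);
  return (dst & ~(field << lsb)) | ((src & field) << lsb);
}

/* The original expression: (x & m) | ((y & n) << lsb).  */
uint32_t
combined_direct (uint32_t x, uint32_t y, uint32_t m, uint32_t n,
                 unsigned lsb)
{
  return (x & m) | ((y & n) << lsb);
}

/* The proposed sequence: "and tmp, x, m" followed by
   "bfi tmp, y, #lsb, #width".  It matches combined_direct when
   n + 1 == 2^width and (m & (n << lsb)) == 0.  */
uint32_t
combined_and_bfi (uint32_t x, uint32_t y, uint32_t m, unsigned lsb,
                  unsigned width)
{
  uint32_t tmp = x & m;
  return bfi32 (tmp, y, lsb, width);
}
```

With the constants from the testcase (m = 0xff or 0x1f, n = 0xfff,
lsb = 8, width = 12) both functions return the same value for any x and y.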

> 
> gcc/ChangeLog:
> 
> 2014-03-14  Renlin Li  
> 
>  * config/aarch64/aarch64.md (*combine_bfi2, 
> *combine_bfi3): New.
> 
> gcc/testsuite:
> 
> 2014-03-14  Renlin Li  
> 
>  * gcc.target/aarch64/combine_and_orr_1.c: New.
> 
> 
> patch.diff
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 99a6ac8..6c2798b 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -3115,6 +3115,53 @@
>[(set_attr "type" "bfm")]
>  )
>  
> +(define_insn_and_split "*combine_bfi2"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +(ior:GPI (and:GPI (ashift:GPI (match_operand:GPI 1 
> "register_operand" "r")
> +  (match_operand 2 "const_int_operand" 
> "n"))
> +  (match_operand 3 "const_int_operand" "n"))
> + (zero_extend:GPI (match_operand:SHORT 4 "register_operand" 
> "0"))))]
> +  "exact_log2 ((INTVAL (operands[3]) >> INTVAL (operands[2])) + 1) >= 0
> +   &&  <= INTVAL (operands[2])"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 0)
> +(zero_extend:GPI (match_dup 4)))
> +   (set (zero_extract:GPI (match_dup 0)
> +   (match_dup 3)
> +   (match_dup 2))
> + (match_dup 1))]
> +  {
> +  int tmp = (INTVAL (operands[3]) >> INTVAL (operands[2])) + 1;
> +  operands[3] = GEN_INT (exact_log2 (tmp));
> +  }
> +  [(set_attr "type" "multiple")]
> +)
> +
> +(define_insn_and_split "*combine_bfi3"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +(ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0")
> +  (match_operand 2 "aarch64_logical_immediate" "n"))
> + (and:GPI (ashift:GPI (match_operand:GPI 3 
> "register_operand" "r")
> +  (match_operand 4 "const_int_operand" 
> "n"))
> +  (match_operand 5 "const_int_operand" "n"))))]
> +  "exact_log2 ((INTVAL (operands[5]) >> INTVAL (operands[4])) + 1) >= 0
> +   && (INTVAL (operands[2]) & INTVAL (operands[5])) == 0"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 0)
> +(and:GPI (match_dup 1) (match_dup 2)))
> +   (set (zero_extract:GPI (match_dup 0)
> +   (match_dup 5)
> +   (match_dup 4))
> + (match_dup 3))]
> +  {
> +  int tmp = (INTVAL (operands[5]) >> INTVAL (operands[4])) + 1;
> +  operands[5] = GEN_INT (exact_log2 (tmp));
> +  }
> +  [(set_attr "type" "multiple")]
> +)
> +
>  (define_insn "*extr_insv_lower_reg"
>[(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
> (match_operand 1 "const_int_operand" "n")
> diff --git a/gcc/testsuite/gcc.target/aarch64/combine_and_orr_1.c 
> b/gcc/testsuite/gcc.target/aarch64/combine_and_orr_1.c
> new file mode 100644
> index 000..b2c0194
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/combine_and_orr_1.c
> @@ -0,0 +1,51 @@
> +/* { dg-do run } */
> +/* { dg-options "-save-temps -O2" }  */
> +
> +extern void abort (void);
> +
> +unsigned int __attribute__ ((noinline))
> +foo1 (unsigned int major, unsigned int minor)
> +{
> +  unsigned int tmp = (minor & 0xff) | ((major & 0xfff) << 8);
> +  return tmp;
> +}
> +
> +unsigned int __attribute__ ((noinline))
> +foo2 (unsigned int major, unsigned int minor)
> +{
> +  unsigned int tmp = (minor & 0x1f) | ((major & 0xfff) << 8);
> +  return tmp;
> +}
> +
> +int
> +main (void)
> +{
> +  unsigned int major[10] = {1947662, 484254, 193508, 4219233, 2211215,
> +  3998162, 4240676, 1034099, 54412, 3195572};
> +  unsigned int minor[10] = {1027568, 21481, 2746675, 3121857, 2471080,
> +  3158801, 237587, 813307, 4073168, 1503494};
> +
> +  unsigned in

Re: Try to catch up _GLIBCXX_RESOLVE_LIB_DEFECTS comments and documentation.

2014-03-17 Thread Jonathan Wakely
On 16 March 2014 16:09, Ed Smith-Rowland wrote:
> OK, thinking further on it I actually agree with not mentioning DRs on a
> partially baked standard.  We advertise that support for new standards is
> experimental.

I don't think it does any harm to add comments during the C++1y/C++1z
process to note that we've incorporated a particular DR against an
earlier working paper, because it's not always obvious which draft our
work-in-progress follows, but once the standard is finished I'd be in
favour of removing those comments. Implementing those DRs is implied
by implementing the finished standard.


Re: [patch testsuite]: Correct testcase for LLP64 targets

2014-03-17 Thread Kai Tietz
Sorry, I am reposting the last patch with a small correction to the dg-do
directive.  The ! in there needs to be enclosed in braces, and I had missed
the target keyword.

Regards,
Kai

Index: 20010327-1.c
===
--- 20010327-1.c(Revision 208594)
+++ 20010327-1.c(Arbeitskopie)
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target ptr32plus } */
+/* { dg-do compile { target { ptr32plus  && { ! llp64 } } } } */

 /* This testcase tests whether GCC can produce static initialized data
that references addresses of size 'unsigned long', even if that's not


[PATCH] Expand OpenMP SIMD even with -fno-tree-loop-optimize (PR middle-end/60534)

2014-03-17 Thread Marek Polacek
This patch ensures that we properly expand gomp SIMD builtins even with
-fno-tree-loop-optimize.  The problem was that we didn't run
loop vectorization at all.  -fno-tree-loop-vectorize is already handled by
a similar hack.

Regtested/bootstrapped on x86_64-linux, ok for trunk (or for 5.0?)?

2014-03-17  Marek Polacek  

PR middle-end/60534
* omp-low.c (omp_max_vf): Treat -fno-tree-loop-optimize the same
as -fno-tree-loop-vectorize.
(expand_omp_simd): Likewise.
testsuite/
* gcc.dg/gomp/pr60534.c: New test.

diff --git gcc/omp-low.c gcc/omp-low.c
index 91c8656..fdf3367 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -2931,7 +2931,8 @@ omp_max_vf (void)
   || optimize_debug
   || (!flag_tree_loop_vectorize
  && (global_options_set.x_flag_tree_loop_vectorize
-  || global_options_set.x_flag_tree_vectorize)))
+ || global_options_set.x_flag_tree_vectorize
+ || global_options_set.x_flag_tree_loop_optimize)))
 return 1;
 
   int vs = targetm.vectorize.autovectorize_vector_sizes ();
@@ -6834,11 +6835,12 @@ expand_omp_simd (struct omp_region *region, struct 
omp_for_data *fd)
  loop->simduid = OMP_CLAUSE__SIMDUID__DECL (simduid);
  cfun->has_simduid_loops = true;
}
-  /* If not -fno-tree-loop-vectorize, hint that we want to vectorize
-the loop.  */
+  /* If not -fno-tree-loop-vectorize of -fno-tree-loop-optimize,
+ hint that we want to vectorize the loop.  */
   if ((flag_tree_loop_vectorize
   || (!global_options_set.x_flag_tree_loop_vectorize
-   && !global_options_set.x_flag_tree_vectorize))
+  && !global_options_set.x_flag_tree_vectorize
+  && !global_options_set.x_flag_tree_loop_optimize))
  && loop->safelen > 1)
{
  loop->force_vect = true;
diff --git gcc/testsuite/gcc.dg/gomp/pr60534.c 
gcc/testsuite/gcc.dg/gomp/pr60534.c
index e69de29..f8a6bdc 100644
--- gcc/testsuite/gcc.dg/gomp/pr60534.c
+++ gcc/testsuite/gcc.dg/gomp/pr60534.c
@@ -0,0 +1,16 @@
+/* PR middle-end/60534 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -O -fno-tree-loop-optimize" } */
+
+extern int d[];
+
+int
+foo (int a)
+{
+  int c = 0;
+  int l;
+#pragma omp simd reduction(+: c)
+  for (l = 0; l < a; ++l)
+c += d[l];
+  return c;
+}

Marek


Re: [patch testsuite]: Correct testcase for LLP64 targets

2014-03-17 Thread Jakub Jelinek
On Mon, Mar 17, 2014 at 12:01:41PM +0100, Kai Tietz wrote:
> Sorry,  I repost last patch with small correction in dg-do directive.
> The ! in there needs additional framing, and I missed the target
> keyword.
> 
> Regards,
> Kai
> 
> Index: 20010327-1.c
> ===
> --- 20010327-1.c(Revision 208594)
> +++ 20010327-1.c(Arbeitskopie)
> @@ -1,4 +1,4 @@
> -/* { dg-require-effective-target ptr32plus } */
> +/* { dg-do compile { target { ptr32plus  && { ! llp64 } } } } */

Ok with proper ChangeLog entry and the double space before && replaced with
a single space.

Jakub


Re: [PATCH] Expand OpenMP SIMD even with -fno-tree-loop-optimize (PR middle-end/60534)

2014-03-17 Thread Jakub Jelinek
On Mon, Mar 17, 2014 at 12:01:54PM +0100, Marek Polacek wrote:
> This patch ensures that we properly expand gomp SIMD builtins even with
> -fno-tree-loop-optimize.  The problem was that we didn't run the 
> loop vectorization at all.  -fno-tree-loop-vectorize already contains
> similar hack.
> 
> Regtested/bootstrapped on x86_64-linux, ok for trunk (or for 5.0?)?
> 
> 2014-03-17  Marek Polacek  
> 
>   PR middle-end/60534
>   * omp-low.c (omp_max_vf): Treat -fno-tree-loop-optimize the same
>   as -fno-tree-loop-vectorize.
>   (expand_omp_simd): Likewise.
> testsuite/
>   * gcc.dg/gomp/pr60534.c: New test.
> 
> diff --git gcc/omp-low.c gcc/omp-low.c
> index 91c8656..fdf3367 100644
> --- gcc/omp-low.c
> +++ gcc/omp-low.c
> @@ -2931,7 +2931,8 @@ omp_max_vf (void)
>|| optimize_debug
>|| (!flag_tree_loop_vectorize
> && (global_options_set.x_flag_tree_loop_vectorize
> -  || global_options_set.x_flag_tree_vectorize)))
> +   || global_options_set.x_flag_tree_vectorize
> +   || global_options_set.x_flag_tree_loop_optimize)))

No.  IMHO this needs to be:

|| optimize_debug
+   || !flag_tree_loop_optimize
|| (!flag_tree_loop_vectorize
&& (global_options_set.x_flag_tree_loop_vectorize

> @@ -6834,11 +6835,12 @@ expand_omp_simd (struct omp_region *region, struct 
> omp_for_data *fd)
> loop->simduid = OMP_CLAUSE__SIMDUID__DECL (simduid);
> cfun->has_simduid_loops = true;
>   }
> -  /* If not -fno-tree-loop-vectorize, hint that we want to vectorize
> -  the loop.  */
> +  /* If not -fno-tree-loop-vectorize of -fno-tree-loop-optimize,
> + hint that we want to vectorize the loop.  */
>if ((flag_tree_loop_vectorize
>  || (!global_options_set.x_flag_tree_loop_vectorize
> -   && !global_options_set.x_flag_tree_vectorize))
> +&& !global_options_set.x_flag_tree_vectorize
> +&& !global_options_set.x_flag_tree_loop_optimize))

Similarly, here it should be added as

+ && flag_tree_loop_optimize
> && loop->safelen > 1)

The thing is, if -fno-tree-loop-optimize is in effect (whether explicitly
added by the user or implicitly through other options), then the loop will
never be vectorized.  It doesn't matter whether -ftree-vectorize was on or
not in that case.

The magic with global_options_set is there to make the loop vectorized
if either -ftree-loop-vectorize is on (implicitly or explicitly), or we are
at least optimizing and vectorization is not disabled explicitly
(-fno-tree-vectorize); we then force vectorization on for the specific loops.

But -fno-tree-loop-optimize means the whole loop optimization pipeline is
not performed; at that point, forcing it on while disabling all other loop
optimizations might be too problematic/error-prone.

E.g. you could try -fopenmp -O -fno-tree-loop-optimize -ftree-vectorize
or -fopenmp -O3 -fno-tree-loop-optimize etc.
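
The condition suggested above for expand_omp_simd can be modelled outside
the compiler like this.  It is only a sketch: struct vect_flags and
want_force_vect are hypothetical names, and the fields merely stand in for
the GCC option variables named in the comments.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant GCC option state.  */
struct vect_flags
{
  bool tree_loop_optimize;   /* models flag_tree_loop_optimize */
  bool tree_loop_vectorize;  /* models flag_tree_loop_vectorize */
  bool loop_vectorize_set;   /* models global_options_set.x_flag_tree_loop_vectorize */
  bool vectorize_set;        /* models global_options_set.x_flag_tree_vectorize */
};

/* True when the simd loop should be marked for forced vectorization:
   vectorization is on, or at least not explicitly configured, the loop
   optimization pipeline is enabled, and safelen allows more than one
   lane.  */
bool
want_force_vect (const struct vect_flags *f, int safelen)
{
  return (f->tree_loop_vectorize
          || (!f->loop_vectorize_set && !f->vectorize_set))
         && f->tree_loop_optimize
         && safelen > 1;
}
```

In this model, the -fno-tree-loop-optimize -ftree-vectorize combination
mentioned above yields false, because tree_loop_optimize is false
regardless of the vectorizer flags.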

Jakub


Re: [PATCH] x86: Define _mm*_undefined_*

2014-03-17 Thread Ilya Tocar
On 16 Mar 07:12, Ulrich Drepper wrote:
> [This patch is so far really meant for commenting.  I haven't tested it
> at all yet.]
> 
> Intel's intrinsic specification includes one set which currently is not
> defined in gcc's headers: the _mm*_undefined_* intrinsics.
What specification are you talking about? As far as I know they are present
in ICC headers, but not in manuals such as:
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
> The purpose of these intrinsics (currently three classes, three formats
> each) is to create a pseudo-value the compiler does not assume is
> uninitialized without incurring any code doing so.  The purpose is to
> use these intrinsics in places where it is known the value of a register
> is never used.  This is already important with AVX2 and becomes really
> crucial with AVX512.
> 
> Currently three different techniques are used:
> 
> - _mm*_setzero_*() is used.  Even though the XOR operation does not
>   cost anything it still messes with the instruction scheduling and
>   more code is generated.
> 
> - another parameter is duplicated.  This leads most of the time to
>   one additional move instruction.
> 
> - uninitialized variables are used (this is in new AVX512 code).  The
>   compiler should generate warnings for these headers.  I haven't
>   tried it.
Uninitialized variables certainly are bad. Replacing them with
setzero/undefined is a good idea.
Also in most AVX512 cases those values shouldn't be present in code.
They are either optimized away in case of -1 mask or result in
zero-masking being applied. Do you know of any cases where xor is
generated (except for the destination in gather/scatter)?
> 
> Using the _mm*_undefined_*() intrinsics is much cleaner and also
> potentially allows to generate better code.
> 
> For now the implementation uses an inline asm to suggest to the compiler
> that the variable is initialized.  This does not prevent a real register
> to be allocated for this purpose but it saves the XOR instruction.
> 
> The correct and optimal implementation will require a compiler built-in
> which will do something different based on how the value is used:
> 
> - if the value is never modified then any register should be picked.
>   In function/intrinsic calls the parameter simply need not be loaded at
>   all.
> 
> - if the value is modified (and allocated to a register or memory
>   location) no initialization for the variable is needed (equivalent
>   to the asm now).
> 
> 
> The questions are:
> 
> - is there interest in adding the necessary compiler built-in?
> 
> - if yes, anyone interested in working on this?
> 
> - and: is it worth adding a patch like the one here in the meantime?
> 
> As it stands now gcc's intrinsics are not complete and programs following
> Intel's manuals can fail to compile.
>
Compatibility with ICC is certainly good. I tried your patch, and
undefined is similar in behavior to setzero, but it also clobbers
flags. Maybe just define it to setzero for now?
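
The two interim options discussed here can be sketched as follows. This is an illustration for x86 with SSE2 only: the `sketch_` names are hypothetical, and the empty-asm form is an assumption about how "an inline asm to suggest to the compiler that the variable is initialized" might look, not a copy of the posted patch.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; x86 targets only */

/* Option 1 (assumed shape): an empty asm with an "=x" output makes the
   compiler treat the variable as written, so no zeroing instruction is
   emitted, but a real register is still allocated.  */
static inline __m128i sketch_undefined_si128(void)
{
  __m128i y;
  __asm__ ("" : "=x" (y));
  return y;
}

/* Option 2: the conservative fallback suggested in the reply.  */
static inline __m128i sketch_undefined_as_setzero(void)
{
  return _mm_setzero_si128();
}

/* Typical use: the second operand is a don't-care value.  Lane 0 of the
   unpack result comes only from the first operand.  */
static inline int sketch_low_lane_with_dont_care(int v)
{
  __m128i a = _mm_cvtsi32_si128(v);
  __m128i r = _mm_unpacklo_epi32(a, sketch_undefined_si128());
  return _mm_cvtsi128_si32(r);
}

/* Quick self-check of the usage pattern.  */
static void sketch_self_check(void)
{
  assert(sketch_low_lane_with_dont_care(42) == 42);
}
```

As far as I understand, GCC treats every asm statement on x86 as clobbering the flags register, which would explain the flag clobber observed with the asm-based variant.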
> 
> 
> 2014-03-16  Ulrich Drepper  
> 
>   * config/i386/avxintrin.h (_mm256_undefined_si256): Define.
>   (_mm256_undefined_ps): Define.
>   (_mm256_undefined_pd): Define.
>   * config/i386/emmintrin.h (_mm_undefined_si128): Define.
>   (_mm_undefined_pd): Define.
>   * config/i386/xmmintrin.h (_mm_undefined_ps): Define.
>   * config/i386/avx512fintrin.h (_mm512_undefined_si512): Define.
>   (_mm512_undefined_ps): Define.
>   (_mm512_undefined_pd): Define.
>   Use _mm*_undefined_*.
>   * config/i386/avx2intrin.h: Use _mm*_undefined_*.
> 


Re: [PATCH,GCC/Thumb1] Correctly reset the variable after_arm_reorg for Thumb1 target

2014-03-17 Thread Richard Earnshaw
On 17/03/14 02:51, Terry Guo wrote:
> Hi,
> 
> I am working on another patch and found this per-function variable isn't
> correctly reset for Thumb1 target. Currently no ICE will be triggered
> because we don't call function arm_split_constants for Thumb1 target. This
> patch intends to define this variable in machine_function struct in arm.h.
> In this way, the variable will be correctly reset and ready for being used
> for Thumb1 target in future.
> 
> Tested with gcc regression test for Thumb1 target cortex-m0. No new
> regressions.
> 
> Is it ok to trunk?
> 
> BR,
> Terry
> 
> 2014-03-17  Terry Guo  
> 
> * config/arm/arm.h (machine_function): Define variable
> after_arm_reorg here.
> * config/arm/arm.c (after_arm_reorg): Remove the definition.
> (arm_split_constant): Update the way to access variable
> after_arm_reorg.
> (arm_reorg): Ditto.
> (arm_output_function_epilogue): Remove the reset of after_arm_reorg.
> 
> 
> reset-after_arm_reorg-thumb1-v3.txt
> 
> 
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 7ca47a7..982ed48 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -1543,6 +1543,9 @@ typedef struct GTY(()) machine_function
>rtx thumb1_cc_op1;
>/* Also record the CC mode that is supported.  */
>enum machine_mode thumb1_cc_mode;
> +  /* Set to 1 after arm_reorg has started.  Reset to 0 at the start of
> + the next function.  */

The reset comment is no longer relevant.  Please remove it.

Ok for stage1 with that change.
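
To illustrate why the explicit reset (and the comment about it) becomes unnecessary, here is a minimal sketch of per-function state that starts out zeroed. All names are hypothetical simplifications; GCC's real `machine_function` is garbage-collected and allocated through a target hook, which this sketch replaces with `calloc`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced model of per-function backend state.  */
struct sketch_machine_function {
  int after_arm_reorg;   /* set to 1 once the reorg pass has started */
};

struct sketch_function {
  struct sketch_machine_function *machine;
};

/* Each function gets a fresh, zero-initialized state block, so the flag
   is 0 without any reset code in the epilogue output.  */
static void sketch_begin_function(struct sketch_function *f)
{
  assert(f != NULL);
  f->machine = calloc(1, sizeof *f->machine);
  if (f->machine == NULL)
    abort();
}

static void sketch_reorg(struct sketch_function *f)
{
  f->machine->after_arm_reorg = 1;
}

static void sketch_end_function(struct sketch_function *f)
{
  free(f->machine);
  f->machine = NULL;
}
```

Because the flag lives and dies with the function it describes, a later function can never observe a stale value left by an earlier one.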

R.




[PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.

2014-03-17 Thread Kirill Yukhin
Hello,
The patch at the bottom allows using ymmXX and zmmXX
register names in inline asm statements as well as
in `register` variable definitions.

New tests pass.
Bootstrap passes.

Is it ok for trunk?
Do we need to backport it to 4.8?

gcc/
* config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add
ymm and zmm register names.

testsuite/
* gcc.target/i386/avx-additional-reg-names.c: New.
* gcc.target/i386/avx512f-additional-reg-names.c: Ditto.

--
Thanks, K

commit c3884af93c105115bc1e4d02fa824d24420c5bbf
Author: Kirill Yukhin 
Date:   Mon Mar 17 14:56:06 2014 +0400

[AVX, AVX-512]. Extend ADDITIONAL_REGISTER_NAMES to Ymms and Zmms.
---
 gcc/config/i386/i386.h | 28 +-
 .../gcc.target/i386/avx-additional-reg-names.c |  9 +++
 .../gcc.target/i386/avx512f-additional-reg-names.c |  9 +++
 3 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index c80878b..c5c1d58 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2016,12 +2016,28 @@ do { \
 /* Table of additional register names to use in user input.  */
 
 #define ADDITIONAL_REGISTER_NAMES \
-{ { "eax", 0 }, { "edx", 1 }, { "ecx", 2 }, { "ebx", 3 },  \
-  { "esi", 4 }, { "edi", 5 }, { "ebp", 6 }, { "esp", 7 },  \
-  { "rax", 0 }, { "rdx", 1 }, { "rcx", 2 }, { "rbx", 3 },  \
-  { "rsi", 4 }, { "rdi", 5 }, { "rbp", 6 }, { "rsp", 7 },  \
-  { "al", 0 }, { "dl", 1 }, { "cl", 2 }, { "bl", 3 },  \
-  { "ah", 0 }, { "dh", 1 }, { "ch", 2 }, { "bh", 3 } }
+{ { "eax", 0 }, { "edx", 1 }, { "ecx", 2 }, { "ebx", 3 },  \
+  { "esi", 4 }, { "edi", 5 }, { "ebp", 6 }, { "esp", 7 },  \
+  { "rax", 0 }, { "rdx", 1 }, { "rcx", 2 }, { "rbx", 3 },  \
+  { "rsi", 4 }, { "rdi", 5 }, { "rbp", 6 }, { "rsp", 7 },  \
+  { "al", 0 }, { "dl", 1 }, { "cl", 2 }, { "bl", 3 },  \
+  { "ah", 0 }, { "dh", 1 }, { "ch", 2 }, { "bh", 3 },  \
+  { "ymm0", 21}, { "ymm1", 22}, { "ymm2", 23}, { "ymm3", 24},  \
+  { "ymm4", 25}, { "ymm5", 26}, { "ymm6", 27}, { "ymm7", 28},  \
+  { "ymm8", 45}, { "ymm9", 46}, { "ymm10", 47}, { "ymm11", 48},  \
+  { "ymm12", 49}, { "ymm13", 50}, { "ymm14", 51}, { "ymm15", 52},  \
+  { "ymm16", 53}, { "ymm17", 54}, { "ymm18", 55}, { "ymm19", 56},  \
+  { "ymm20", 57}, { "ymm21", 58}, { "ymm22", 59}, { "ymm23", 60},  \
+  { "ymm24", 61}, { "ymm25", 62}, { "ymm26", 63}, { "ymm27", 64},  \
+  { "ymm28", 65}, { "ymm29", 66}, { "ymm30", 67}, { "ymm31", 68},  \
+  { "zmm0", 21}, { "zmm1", 22}, { "zmm2", 23}, { "zmm3", 24},  \
+  { "zmm4", 25}, { "zmm5", 26}, { "zmm6", 27}, { "zmm7", 28},  \
+  { "zmm8", 45}, { "zmm9", 46}, { "zmm10", 47}, { "zmm11", 48},  \
+  { "zmm12", 49}, { "zmm13", 50}, { "zmm14", 51}, { "zmm15", 52},  \
+  { "zmm16", 53}, { "zmm17", 54}, { "zmm18", 55}, { "zmm19", 56},  \
+  { "zmm20", 57}, { "zmm21", 58}, { "zmm22", 59}, { "zmm23", 60},  \
+  { "zmm24", 61}, { "zmm25", 62}, { "zmm26", 63}, { "zmm27", 64},  \
+  { "zmm28", 65}, { "zmm29", 66}, { "zmm30", 67}, { "zmm31", 68} }
 
 /* Note we are omitting these since currently I don't know how
 to get gcc to use these, since they want the same but different
diff --git a/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c
new file mode 100644
index 000..d984bff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx" } */
+
+void foo ()
+{
+  register int ymm_var asm ("ymm4");
+
+  __asm__ __volatile__("vxorpd %%ymm0, %%ymm0, %%ymm7\n" : : : "ymm7" );
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
new file mode 100644
index 000..1bd428a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512f" } */
+
+void foo ()
+{
+  register int zmm_var asm ("zmm9");
+
+  __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" );
+}
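
The effect of the extended table can be sketched as a plain name-to-number lookup. The struct, array and function names below are hypothetical, and only a handful of entries from the patch are reproduced; the point is that ymmN and zmmN alias the same hard register number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of one ADDITIONAL_REGISTER_NAMES entry.  */
struct sketch_reg_name {
  const char *name;
  int number;
};

/* A few entries copied from the table in the patch above.  */
static const struct sketch_reg_name sketch_additional_names[] = {
  { "eax", 0 }, { "rax", 0 }, { "al", 0 },
  { "ymm4", 25 }, { "zmm4", 25 },
  { "ymm9", 46 }, { "zmm9", 46 },
};

/* Returns the hard register number for NAME, or -1 if it is unknown.  */
static int sketch_decode_reg_name(const char *name)
{
  size_t n = sizeof sketch_additional_names / sizeof sketch_additional_names[0];
  assert(name != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp(name, sketch_additional_names[i].name) == 0)
      return sketch_additional_names[i].number;
  return -1;
}
```

A clobber such as "ymm7" or a `register ... asm ("zmm9")` declaration would then resolve through a lookup of this kind instead of being rejected as an unknown register name.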


Re: [RFC][gomp4] Offloading: Add device initialization and host->target function mapping

2014-03-17 Thread Ilya Verbin
Ping.

2014-03-12 21:56 GMT+04:00 Ilya Verbin :
> Hi Thomas,
>
> Here is a new version of this patch (it was discussed in another thread:
> http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00573.html ) with ChangeLog.
> Bootstrap and make check passed.
> Ok to commit?

  -- Ilya


Consolidate GCC web pages documentation (4/3)

2014-03-17 Thread Gerald Pfeifer
This nearly brings us to the goal of having just one page covering
this and simplifies language in about.html a bit on the way.

Applied.

Gerald

Index: about.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/about.html,v
retrieving revision 1.20
diff -u -r1.20 about.html
--- about.html  17 Feb 2014 01:03:10 -  1.20
+++ about.html  15 Mar 2014 11:43:01 -
@@ -14,14 +14,14 @@
 contributors.
 
 The web effort was originally led by Jeff Law.  For the last decade
-or so Gerald Pfeifer has been leading the effort, but again, there are
+or so Gerald Pfeifer has been leading the effort, but there are
 lots of people who contribute.
 
 The web pages are under CVS control and you
 can browse (at http://gcc.gnu.org/cgi-bin/cvsweb.cgi/wwwdocs/)
 the repository online.
-The pages on gcc.gnu.org are updated "live" (that is, directly after a
-change has been made); www.gnu.org is updated once a day at 4:00 -0700
+The pages on gcc.gnu.org are updated "live" directly after a
+change has been made; www.gnu.org is updated once a day at 4:00 -0700
 (PDT).
 
 Please send feedback, problem reports and patches to our
@@ -83,6 +83,13 @@
 list.
 
 
+As changes are checked in, the respective pages are preprocessed
+via the script wwwdocs/bin/preprocess which in turn
+uses a tool called MetaHTML.  Among others, this preprocessing
+adds CSS style sheets, XML and HTML headers, and our standard
+footer.  The MetaHTML style sheet is in
+wwwdocs/htdocs/style.mhtml.
+
 
 The host system
 
Index: projects/web.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/web.html,v
retrieving revision 1.15
diff -u -r1.15 web.html
--- projects/web.html   17 Feb 2014 01:03:10 -  1.15
+++ projects/web.html   15 Mar 2014 11:43:01 -
@@ -11,12 +11,5 @@
 Contributing changes
 to our web pages is simple.
 
-As changes are checked in, the respective pages are preprocessed
-via the script wwwdocs/bin/preprocess which in turn
-uses a tool called MetaHTML.  Among others, this preprocessing
-adds CSS style sheets, XML and HTML headers, and our standard
-footer.  The MetaHTML style sheet is in
-wwwdocs/htdocs/style.mhtml.
-
 
 


Re: [patch, libgfortran] PR46800 Handle CTRL-D correctly with STDIN

2014-03-17 Thread Janne Blomqvist
On Mon, Mar 17, 2014 at 12:50 AM, Jerry DeLisle  wrote:
> Hi all.
>
> The problem here was that when reading a value from STDIN and the user just
> entered an empty entry (LF),
> we would end up getting nested into a second read (via next_char) and the user
> would have to press CTRL-D twice to get out of the read. (The correct behavior
> is to only hit CTRL-D once, which sends us the EOF.)
>
> This was caused by a call to eat_separator right after we did the initial read.
> The eat_separator function then tries to read again and we get a condition of
> waiting for user input on that read.  The patch eliminates this call to
> eat_separator. This requires explicitly checking for the comma and end-of-line
> conditions which are also done in eat_separator.
>
> Regression tested on x86-64-gnu.  No test case can be done since it requires
> terminal input to read.
>
> OK for trunk?
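
The idea of checking the separator conditions explicitly, instead of calling a helper that reads again, can be sketched like this. It is not libgfortran code: the input source, the names and the classification enum are all hypothetical, and a string with a read counter stands in for the terminal.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_EOF (-1)

/* Hypothetical input source: a string plus a count of read calls, so
   the number of reads issued can be observed.  */
struct sketch_src {
  const char *data;
  size_t pos;
  int reads;
};

static int sketch_next_char(struct sketch_src *s)
{
  assert(s != NULL);
  s->reads++;
  if (s->data[s->pos] == '\0')
    return SKETCH_EOF;
  return (unsigned char) s->data[s->pos++];
}

enum sketch_kind { SK_EOF, SK_EOL, SK_COMMA, SK_VALUE };

/* Classify the first character after exactly one read.  Comma and
   end-of-line are tested here directly; no second read is started, so
   an interactive user is never left waiting inside a nested read.  */
static enum sketch_kind sketch_classify_first(struct sketch_src *s)
{
  int c = sketch_next_char(s);
  if (c == SKETCH_EOF)
    return SK_EOF;
  if (c == '\n' || c == '\r')
    return SK_EOL;
  if (c == ',')
    return SK_COMMA;
  return SK_VALUE;
}
```

With an empty entry (a lone line feed) the classifier returns after a single read, which is the behavior the patch description aims for.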

Ok, thanks for the patch.

I wonder whether it would be possible to set up some dejagnu testcases with
multiple programs communicating via pipes or similar.  We occasionally
seem to have regressions in handling non-seekable files and terminals
that go undetected for a long time, since we're not regularly
testing that.

-- 
Janne Blomqvist


Re: [C++ Patch] PR 59571

2014-03-17 Thread Jason Merrill

On 03/17/2014 05:38 AM, Paolo Carlini wrote:

noticed this issue, which looks simple to fix. The ICE happens in
cxx_eval_constant_expression, because it cannot handle a CAST_EXPR (or
any other *_CAST, for that matter). In fact check_narrowing calls
maybe_constant_value, and, because we are in a template, the latter
faces the unfolded CAST_EXPR. Thus it seems easy to just use
fold_non_dependent_expr_sfinae. Tested x86_64-linux.


OK.


PS: looking forward, I'm wondering if some semantics/typeck functions
shouldn't try harder before building a tree node and returning, eg,
instead of just checking processing_template_decl, actually checking if
type and expr are dependent? Does this kind of audit make sense for next
Stage 1?


In general adding fold_non_dependent_expr where it's needed is the right 
answer, because normal operation creates tree patterns that tsubst 
doesn't understand how to deal with.


I suppose it might work to always fully build 
non-instantiation-dependent expressions, wrap them in 
NON_DEPENDENT_EXPR, and then unshare its operand at instantiation time. 
 But that would be a significant change with unclear benefit.


Jason



Re: [PATCH] Fix PR60505

2014-03-17 Thread Richard Biener
On Fri, 14 Mar 2014, Cong Hou wrote:

> On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener  wrote:
> > On Fri, 14 Mar 2014, Jakub Jelinek wrote:
> >
> >> On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote:
> >> > > Consider this fact and if there are alias checks, we can safely remove
> >> > > the epilogue if the maximum trip count of the loop is less than or
> >> > > equal to the calculated threshold.
> >> >
> >> > You have to consider n % vf != 0, so an argument on only maximum
> >> > trip count or threshold cannot work.
> >>
> >> Well, if you only check if maximum trip count is <= vf and you know
> >> that for n < vf the vectorized loop + its epilogue path will not be taken,
> >> then perhaps you could, but it is a very special case.
> >> Now, the question is when we are guaranteed we enter the scalar versioned
> >> loop instead for n < vf, is that in case of versioning for alias or
> >> versioning for alignment?
> >
> > I think neither - I have plans to do the cost model check together
> > with the versioning condition but didn't get around to implement that.
> > That would allow stronger max bounds for the epilogue loop.
> 
> In vect_transform_loop(), check_profitability will be set to true if
> th >= VF-1 and the number of iteration is unknown (we only consider
> unknown trip count here), where th is calculated based on the
> parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum
> value VF-1. If the loop needs to be versioned, then
> check_profitability with true value will be passed to
> vect_loop_versioning(), in which an enhanced loop bound check
> (considering cost) will be built. So I think if the loop is versioned
> and n < VF, then we must enter the scalar version, and in this case
> removing epilogue should be safe when the maximum trip count <= th+1.

You mean exactly in the case where the profitability check ensures
that n % vf == 0?  Thus effectively if n == maximum trip count?
That's quite a special case, no?

Richard.

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


Re: [PATCH] Fix PR60505

2014-03-17 Thread Jakub Jelinek
On Mon, Mar 17, 2014 at 02:44:29PM +0100, Richard Biener wrote:
> You mean exactly in the case where the profitability check ensures
> that n % vf == 0?  Thus effectively if n == maximum trip count?
> That's quite a special case, no?

Indeed it is.  But I guess that is pretty much the only case where
the following optimizers can fold the array accesses in the (unneeded)
epilogue loop from some non-constant indexes to constant ones (because they
know that the vector loop will iterate exactly once in that case).
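
The arithmetic behind this special case can be sketched as follows. The function names are hypothetical, and the model assumes that the vector path is entered only when n > th; the exact comparison used by the versioning check is an assumption here, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Iterations executed by the vector loop for a trip count of N.  */
static unsigned sketch_vector_iters(unsigned n, unsigned vf)
{
  assert(vf > 0);
  return n / vf;
}

/* Scalar iterations left over for the epilogue loop.  */
static unsigned sketch_epilogue_iters(unsigned n, unsigned vf)
{
  assert(vf > 0);
  return n % vf;
}

/* True if every trip count that can reach the vector path
   (th < n <= max_trip, by assumption) leaves nothing for the epilogue.  */
static bool sketch_epilogue_unneeded(unsigned th, unsigned max_trip,
                                     unsigned vf)
{
  for (unsigned n = th; n < max_trip; n++)
    if (sketch_epilogue_iters(n + 1, vf) != 0)
      return false;
  return true;
}
```

For VF = 4 and th = VF-1 = 3, a maximum trip count of 4 leaves only n = 4 on the vector path: one vector iteration and no epilogue work. Raising the maximum to 5, or the threshold to 4, already admits n = 5 with one leftover scalar iteration, which is why an argument based on the maximum trip count or the threshold alone does not work.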

Jakub


Re: [PATCH] Expand OpenMP SIMD even with -fno-tree-loop-optimize (PR middle-end/60534)

2014-03-17 Thread Marek Polacek
On Mon, Mar 17, 2014 at 12:16:08PM +0100, Jakub Jelinek wrote:
> No.  IMHO this needs to be:
> || optimize_debug
> + || !flag_no_tree_loop_optimize
> || (!flag_tree_loop_vectorize
>   && (global_options_set.x_flag_tree_loop_vectorize

I presume you mean !flag_tree_loop_optimize.
 
> > @@ -6834,11 +6835,12 @@ expand_omp_simd (struct omp_region *region, struct omp_for_data *fd)
> >   loop->simduid = OMP_CLAUSE__SIMDUID__DECL (simduid);
> >   cfun->has_simduid_loops = true;
> > }
> > -  /* If not -fno-tree-loop-vectorize, hint that we want to vectorize
> > -the loop.  */
> > +  /* If not -fno-tree-loop-vectorize or -fno-tree-loop-optimize,
> > + hint that we want to vectorize the loop.  */
> >if ((flag_tree_loop_vectorize
> >|| (!global_options_set.x_flag_tree_loop_vectorize
> > -   && !global_options_set.x_flag_tree_vectorize))
> > +  && !global_options_set.x_flag_tree_vectorize
> > +  && !global_options_set.x_flag_tree_loop_optimize))
> 
> Similarly, here it should be added as
> 
> +   && flag_tree_loop_optimize
> >   && loop->safelen > 1)
> 
> The thing is, if -fno-tree-loop-optimize is in effect (whether explicitly added by the user
> or implicitly through other options), then the loop will never be vectorized.
> It doesn't matter if -ftree-vectorize was on or not in that case.
> 
> The magic with global_options_set is there to make the loop vectorized
> if either -ftree-loop-vectorize is on (implicitly or explicitly), or
> at least optimizing and not disabled explicitly (-fno-tree-vectorize),
> we then force the vectorization on for the specific loops.
> 
> But -fno-tree-loop-optimize means the whole loop optimization pipeline is
> not performed, at that point forcing it on and disabling all other loop
> optimizations might be too problematic/error prone.
> 
> E.g. you could try -fopenmp -O -fno-tree-loop-optimize -ftree-vectorize
> or -fopenmp -O3 -fno-tree-loop-optimize etc.

:( sorry, fixed.  No ICE with these options.

Regtested on x86_64-linux, ok now?

2014-03-17  Marek Polacek  

PR middle-end/60534
* omp-low.c (omp_max_vf): Treat -fno-tree-loop-optimize the same
as -fno-tree-loop-vectorize.
(expand_omp_simd): Likewise.
testsuite/
* gcc.dg/gomp/pr60534.c: New test.

diff --git gcc/omp-low.c gcc/omp-low.c
index 91c8656..24ef3c8 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -2929,6 +2929,7 @@ omp_max_vf (void)
 {
   if (!optimize
   || optimize_debug
+  || !flag_tree_loop_optimize
   || (!flag_tree_loop_vectorize
  && (global_options_set.x_flag_tree_loop_vectorize
   || global_options_set.x_flag_tree_vectorize)))
@@ -6839,6 +6840,7 @@ expand_omp_simd (struct omp_region *region, struct omp_for_data *fd)
   if ((flag_tree_loop_vectorize
   || (!global_options_set.x_flag_tree_loop_vectorize
&& !global_options_set.x_flag_tree_vectorize))
+ && flag_tree_loop_optimize
  && loop->safelen > 1)
{
  loop->force_vect = true;
diff --git gcc/testsuite/gcc.dg/gomp/pr60534.c gcc/testsuite/gcc.dg/gomp/pr60534.c
index e69de29..f8a6bdc 100644
--- gcc/testsuite/gcc.dg/gomp/pr60534.c
+++ gcc/testsuite/gcc.dg/gomp/pr60534.c
@@ -0,0 +1,16 @@
+/* PR middle-end/60534 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -O -fno-tree-loop-optimize" } */
+
+extern int d[];
+
+int
+foo (int a)
+{
+  int c = 0;
+  int l;
+#pragma omp simd reduction(+: c)
+  for (l = 0; l < a; ++l)
+    c += d[l];
+  return c;
+}

Marek


Re: [PATCH] Expand OpenMP SIMD even with -fno-tree-loop-optimize (PR middle-end/60534)

2014-03-17 Thread Jakub Jelinek
On Mon, Mar 17, 2014 at 02:49:41PM +0100, Marek Polacek wrote:
> 2014-03-17  Marek Polacek  
> 
>   PR middle-end/60534
>   * omp-low.c (omp_max_vf): Treat -fno-tree-loop-optimize the same
>   as -fno-tree-loop-vectorize.
>   (expand_omp_simd): Likewise.
> testsuite/
>   * gcc.dg/gomp/pr60534.c: New test.

Ok, thanks.

Jakub


Re: C++ PATCH for c++/58678 (devirt vs. KDE)

2014-03-17 Thread Jason Merrill

On 03/17/2014 04:39 AM, Jan Hubicka wrote:

Thank you!  I would prefer a different marker than __cxa_pure_virtual in the vtable,
most probably simply NULL.

The reason is that __cxa_pure_virtual will appear as a possible target in the
list and it will prevent devirtualization to happen when we end up with
__cxa_pure_virtual and real destructor in the list of possible targets.


Hmm?  __cxa_pure_virtual is not considered likely, so why wouldn't 
devirtualization choose the real function instead?



gimple_get_virt_method_for_vtable knows that lookups in the vtable that do not
result in a FUNCTION_DECL should be translated to BUILTIN_UNREACHABLE, and
ipa-devirt drops these from the list of targets, unlike __cxa_pure_virtual, which
stays.


I don't see the reason for that distinction; either way you get 
undefined behavior.  The only purpose of __cxa_pure_virtual is to give a 
friendly diagnostic before terminating the program.



Another problem with __cxa_pure_virtual is that it needs an external relocation.
I wondered whether we want to produce a hidden comdat wrapper for
it, so C++ programs are easier to relocate.


Sure, that would make sense.


What do you think of the following patch, which makes ipa-devirt conclude
that destructor calls are never made on types under construction?
If the effect of doing so is undefined, I think it is safe to drop them from
the list of targets, and that really helps to shorten the lists.


That looks good to me.

Jason



Re: [PATCH] Fix PR c++/60391

2014-03-17 Thread Jason Merrill

OK.

Jason


Re: [PATCH] Fix PR c++/60390

2014-03-17 Thread Jason Merrill

On 03/16/2014 04:44 PM, Adam Butcher wrote:

+ if (parser->num_classes_being_defined == 0)
+   while (scope->kind == sk_class)
+ {
+   parent_scope = scope;
+   scope = scope->level_chain;
+ }
+ else
+   while (scope->kind == sk_class
+  && !TYPE_BEING_DEFINED (scope->this_entity))
+ {
+   parent_scope = scope;
+   scope = scope->level_chain;
+ }


The special case for 0 seems like an unnecessary optimization.  OK 
without it.


Jason



Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-03-17 Thread Thomas Schwinge
Hi!

On Sat, 8 Mar 2014 18:50:15 +0400, Ilya Verbin  wrote:
> --- a/libgomp/libgomp.map
> +++ b/libgomp/libgomp.map
> @@ -208,6 +208,7 @@ GOMP_3.0 {
>  
>  GOMP_4.0 {
>global:
> + GOMP_offload_register;
>   GOMP_barrier_cancel;
>   GOMP_cancel;
>   GOMP_cancellation_point;

Now that the GOMP_4.0 symbol version is being used in GCC trunk, and will
be in the GCC 4.9 release, can we still add new symbols to it here?
(Jakub?)

> --- a/libgomp/plugin-host.c
> +++ b/libgomp/plugin-host.c

> +const int TARGET_TYPE_HOST = 0;

We'll have to see whether this (that is, libgomp/target.c:enum
target_type) should live in a shared header file, but OK for the moment.

> +void
> +device_run (void *fn_ptr, void *vars)
> +{
> +#ifdef DEBUG
> +  printf ("libgomp plugin: %s:%s (%p, %p)\n", __FILE__, __FUNCTION__, fn_ptr,
> +   vars);
> +#endif
> +
> +  void (*fn)(void *) = (void (*)(void *)) fn_ptr;
> +
> +  fn (vars);
> +}

Why not make fn_ptr a proper function pointer?  Ah, because of
GOMP_target passing (void *) tgt_fn->tgt->tgt_start for the
!TARGET_TYPE_HOST case...

Would it make sense to have device_run return a value to make it able to
indicate to libgomp that the function cannot be run on the device (for
whatever reason), and libgomp should use host-fallback execution?
(Probably that needs more thought and discussion, OK to defer.)
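
One possible shape for such a fallback is sketched below. Everything here is hypothetical: the names, the boolean return convention and the `device_usable` parameter are illustration only and do not reflect the libgomp plugin interface in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*sketch_fn)(void *);

/* Hypothetical plugin entry point: reports failure instead of assuming
   the function can always be run on the device.  */
static bool sketch_device_run(sketch_fn fn, void *vars, bool device_usable)
{
  if (!device_usable || fn == NULL)
    return false;
  fn(vars);
  return true;
}

/* Hypothetical caller: try the device first, then fall back to the
   host version of the same region.  */
static void sketch_target(sketch_fn device_fn, sketch_fn host_fn,
                          void *vars, bool device_usable)
{
  assert(host_fn != NULL);
  if (!sketch_device_run(device_fn, vars, device_usable))
    host_fn(vars);
}

/* Example bodies that make the chosen path observable.  */
static void sketch_add_one(void *vars)     { *(int *) vars += 1; }
static void sketch_add_hundred(void *vars) { *(int *) vars += 100; }
```

Calling `sketch_target (sketch_add_one, sketch_add_hundred, &x, false)` runs the host body, while passing `true` runs the device body; a status return of this kind keeps the fallback decision in the caller rather than in each plugin.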

> --- a/libgomp/target.c
> +++ b/libgomp/target.c

> +enum target_type {
> +  TARGET_TYPE_HOST,
> +  TARGET_TYPE_INTEL_MIC
> +};

(As discussed above, but OK to defer.)

> @@ -120,15 +140,26 @@ struct gomp_device_descr
>   TARGET construct.  */
>int id;
>  
> +  /* This is the TYPE of device.  */
> +  int type;

Use enum target_type instead of int?

> +/* This function should be called from every offload image.  It gets the
> +   descriptor of the host func and var tables HOST_TABLE, TYPE of the target,
> +   and TARGET_DATA needed by target plugin (target tables, etc.)  */
> +void
> +GOMP_offload_register (void *host_table, int type, void *target_data)
> +{
> +  offload_images = realloc (offload_images,
> + (num_offload_images + 1)
> + * sizeof (struct offload_image_descr));
> +
> +  if (offload_images == NULL)
> +return;

Fail silently, or use gomp_realloc to fail loudly?
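
For comparison, a fail-loudly variant might look like the sketch below. The `sketch_` names are hypothetical and plain `realloc` plus `abort` stands in for libgomp's own allocation wrapper; note that assigning the result of `realloc` straight back to the only pointer, as in the quoted hunk, also loses the old block when the call fails.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical descriptor for one registered offload image.  */
struct sketch_image {
  void *host_table;
  int type;
  void *target_data;
};

static struct sketch_image *sketch_images;
static size_t sketch_num_images;

/* Reallocate or terminate with a diagnostic; never returns NULL.  */
static void *sketch_xrealloc(void *old, size_t size)
{
  assert(size > 0);
  void *p = realloc(old, size);
  if (p == NULL)
    {
      fprintf(stderr, "out of memory allocating %zu bytes\n", size);
      abort();
    }
  return p;
}

/* Append one image descriptor to the global table.  */
static void sketch_offload_register(void *host_table, int type,
                                    void *target_data)
{
  sketch_images = sketch_xrealloc(sketch_images,
                                  (sketch_num_images + 1)
                                  * sizeof *sketch_images);
  sketch_images[sketch_num_images].host_table = host_table;
  sketch_images[sketch_num_images].type = type;
  sketch_images[sketch_num_images].target_data = target_data;
  sketch_num_images++;
}
```

Failing loudly avoids a state where an image silently goes missing and the problem only surfaces much later, at the first offloaded region.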

> @@ -701,16 +836,25 @@ gomp_find_available_plugins (void)

> - out:
> +out:

Emacs wants the space to be there, so I assume that's the coding standard
to use.  ;-)

>if (dir)
>  closedir (dir);
> +  free (offload_images);

I suggest to set offload_images = NULL, for clarity.

> +  num_offload_images = 0;
>  }

We may need to revisit this later: currently it's not possible to
register additional plugins after libgomp has initialized
(gomp_target_init, gomp_find_available_plugins just executed once), but
should that ever be made possible, we'd need to preserve offload_images.


OK to commit, thanks!


Grüße,
 Thomas




Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.

2014-03-17 Thread H.J. Lu
On Mon, Mar 17, 2014 at 4:53 AM, Kirill Yukhin  wrote:
> Hello,
> The patch at the bottom allows using ymmXX and zmmXX
> register names in inline asm statements as well as
> in `register` variable definitions.
>
> New tests pass.
> Bootstrap passes.
>
> Is it ok for trunk?
> Do we need to backport it to 4.8?
>
> gcc/
> * config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add
> ymm and zmm register names.
>
> testsuite/
> * gcc.target/i386/avx-additional-reg-names.c: New.
> * gcc.target/i386/avx512f-additional-reg-names.c: Ditto.
>
> --
> Thanks, K
>
> commit c3884af93c105115bc1e4d02fa824d24420c5bbf
> Author: Kirill Yukhin 
> Date:   Mon Mar 17 14:56:06 2014 +0400
>
> [AVX, AVX-512]. Extend ADDITIONAL_REGISTER_NAMES to Ymms and Zmms.
> ---
>  gcc/config/i386/i386.h | 28 +-
>  .../gcc.target/i386/avx-additional-reg-names.c |  9 +++
>  .../gcc.target/i386/avx512f-additional-reg-names.c |  9 +++
>  3 files changed, 40 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index c80878b..c5c1d58 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -2016,12 +2016,28 @@ do { \
>  /* Table of additional register names to use in user input.  */
>
>  #define ADDITIONAL_REGISTER_NAMES \
> -{ { "eax", 0 }, { "edx", 1 }, { "ecx", 2 }, { "ebx", 3 },  \
> -  { "esi", 4 }, { "edi", 5 }, { "ebp", 6 }, { "esp", 7 },  \
> -  { "rax", 0 }, { "rdx", 1 }, { "rcx", 2 }, { "rbx", 3 },  \
> -  { "rsi", 4 }, { "rdi", 5 }, { "rbp", 6 }, { "rsp", 7 },  \
> -  { "al", 0 }, { "dl", 1 }, { "cl", 2 }, { "bl", 3 },  \
> -  { "ah", 0 }, { "dh", 1 }, { "ch", 2 }, { "bh", 3 } }
> +{ { "eax", 0 }, { "edx", 1 }, { "ecx", 2 }, { "ebx", 3 },  \
> +  { "esi", 4 }, { "edi", 5 }, { "ebp", 6 }, { "esp", 7 },  \
> +  { "rax", 0 }, { "rdx", 1 }, { "rcx", 2 }, { "rbx", 3 },  \
> +  { "rsi", 4 }, { "rdi", 5 }, { "rbp", 6 }, { "rsp", 7 },  \
> +  { "al", 0 }, { "dl", 1 }, { "cl", 2 }, { "bl", 3 },  \
> +  { "ah", 0 }, { "dh", 1 }, { "ch", 2 }, { "bh", 3 },  \
> +  { "ymm0", 21}, { "ymm1", 22}, { "ymm2", 23}, { "ymm3", 24},  \
> +  { "ymm4", 25}, { "ymm5", 26}, { "ymm6", 27}, { "ymm7", 28},  \
> +  { "ymm8", 45}, { "ymm9", 46}, { "ymm10", 47}, { "ymm11", 48},  \
> +  { "ymm12", 49}, { "ymm13", 50}, { "ymm14", 51}, { "ymm15", 52},  \
> +  { "ymm16", 53}, { "ymm17", 54}, { "ymm18", 55}, { "ymm19", 56},  \
> +  { "ymm20", 57}, { "ymm21", 58}, { "ymm22", 59}, { "ymm23", 60},  \
> +  { "ymm24", 61}, { "ymm25", 62}, { "ymm26", 63}, { "ymm27", 64},  \
> +  { "ymm28", 65}, { "ymm29", 66}, { "ymm30", 67}, { "ymm31", 68},  \
> +  { "zmm0", 21}, { "zmm1", 22}, { "zmm2", 23}, { "zmm3", 24},  \
> +  { "zmm4", 25}, { "zmm5", 26}, { "zmm6", 27}, { "zmm7", 28},  \
> +  { "zmm8", 45}, { "zmm9", 46}, { "zmm10", 47}, { "zmm11", 48},  \
> +  { "zmm12", 49}, { "zmm13", 50}, { "zmm14", 51}, { "zmm15", 52},  \
> +  { "zmm16", 53}, { "zmm17", 54}, { "zmm18", 55}, { "zmm19", 56},  \
> +  { "zmm20", 57}, { "zmm21", 58}, { "zmm22", 59}, { "zmm23", 60},  \
> +  { "zmm24", 61}, { "zmm25", 62}, { "zmm26", 63}, { "zmm27", 64},  \
> +  { "zmm28", 65}, { "zmm29", 66}, { "zmm30", 67}, { "zmm31", 68} }
>
>  /* Note we are omitting these since currently I don't know how
>  to get gcc to use these, since they want the same but different
> diff --git a/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c
> new file mode 100644
> index 000..d984bff
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx" } */
> +
> +void foo ()
> +{
> +  register int ymm_var asm ("ymm4");
> +
> +  __asm__ __volatile__("vxorpd %%ymm0, %%ymm0, %%ymm7\n" : : : "ymm7" );
> +}

Doesn't GCC generate the same code with xmm?

> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
> new file mode 100644
> index 000..1bd428a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512f" } */
> +
> +void foo ()
> +{
> +  register int zmm_var asm ("zmm9");
> +
> +  __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" );
> +}

Doesn't GCC generate the same code with xmm?


-- 
H.J.


Re: [PATCH] Fix PR60505

2014-03-17 Thread Cong Hou
On Mon, Mar 17, 2014 at 6:44 AM, Richard Biener  wrote:
> On Fri, 14 Mar 2014, Cong Hou wrote:
>
>> On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener  wrote:
>> > On Fri, 14 Mar 2014, Jakub Jelinek wrote:
>> >
>> >> On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote:
>> >> > > Consider this fact and if there are alias checks, we can safely remove
>> >> > > the epilogue if the maximum trip count of the loop is less than or
>> >> > > equal to the calculated threshold.
>> >> >
>> >> > You have to consider n % vf != 0, so an argument on only maximum
>> >> > trip count or threshold cannot work.
>> >>
>> >> Well, if you only check if maximum trip count is <= vf and you know
>> >> that for n < vf the vectorized loop + its epilogue path will not be taken,
>> >> then perhaps you could, but it is a very special case.
>> >> Now, the question is when we are guaranteed we enter the scalar versioned
>> >> loop instead for n < vf, is that in case of versioning for alias or
>> >> versioning for alignment?
>> >
>> > I think neither - I have plans to do the cost model check together
>> > with the versioning condition but didn't get around to implement that.
>> > That would allow stronger max bounds for the epilogue loop.
>>
>> In vect_transform_loop(), check_profitability will be set to true if
>> th >= VF-1 and the number of iteration is unknown (we only consider
>> unknown trip count here), where th is calculated based on the
>> parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum
>> value VF-1. If the loop needs to be versioned, then
>> check_profitability with true value will be passed to
>> vect_loop_versioning(), in which an enhanced loop bound check
>> (considering cost) will be built. So I think if the loop is versioned
>> and n < VF, then we must enter the scalar version, and in this case
>> removing epilogue should be safe when the maximum trip count <= th+1.
>
> You mean exactly in the case where the profitability check ensures
> that n % vf == 0?  Thus effectively if n == maximum trip count?
> That's quite a special case, no?


Yes, it is a special case. But it is in this special case that those
warnings are emitted. Also, I think declaring an array with VF*N as
length is not unusual.


thanks,
Cong


>
> Richard.
>
> --
> Richard Biener 
> SUSE / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
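
The n % vf argument in the exchange above can be made concrete with a
small sketch.  It only splits a scalar trip count n into full vector
iterations and leftover epilogue iterations for a vectorization factor
vf; it deliberately ignores the cost-model threshold and loop versioning
discussed in the thread.  The names split_trip_count and struct
trip_split are hypothetical and are not part of GCC.

```c
#include <assert.h>

/* Hypothetical helper type, not GCC code: how a scalar trip count
   divides into vector iterations and scalar epilogue iterations.  */
struct trip_split
{
  unsigned vector_iters;    /* iterations of the vectorized loop */
  unsigned epilogue_iters;  /* scalar iterations left for the epilogue */
};

/* Split N scalar iterations for a vectorization factor VF.  The epilogue
   has no work to do exactly when N % VF == 0.  */
static struct trip_split
split_trip_count (unsigned n, unsigned vf)
{
  struct trip_split s;

  assert (vf > 0);
  s.vector_iters = n / vf;
  s.epilogue_iters = n % vf;
  return s;
}
```

With vf = 4, an array of length 16 leaves no epilogue iterations, while a
length of 10 leaves two, which is why a bound on the maximum trip count
alone says nothing about whether the epilogue runs.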


Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-03-17 Thread Jakub Jelinek
On Mon, Mar 17, 2014 at 04:00:11PM +0100, Thomas Schwinge wrote:
> Hi!
> 
> On Sat, 8 Mar 2014 18:50:15 +0400, Ilya Verbin  wrote:
> > --- a/libgomp/libgomp.map
> > +++ b/libgomp/libgomp.map
> > @@ -208,6 +208,7 @@ GOMP_3.0 {
> >  
> >  GOMP_4.0 {
> >global:
> > +   GOMP_offload_register;
> > GOMP_barrier_cancel;
> > GOMP_cancel;
> > GOMP_cancellation_point;
> 
> Now that the GOMP_4.0 symbol version is being used in GCC trunk, and will
> be in the GCC 4.9 release, can we still add new symbols to it here?
> (Jakub?)

If the GCC 4.9 release will not include that symbol, then it must be in a new
symbol version, e.g. GOMP_4.1 (note that the GOMP_ symbol version matching
the OpenMP standard version, as it does now, wasn't always the case and might
not always be in the future either; alternatively we could use a GOMP_4.0.1
symver).

Jakub


Re: [PATCH][AArch64] vqneg and vqabs intrinsics implementation

2014-03-17 Thread Marcus Shawcroft
On 12 February 2014 10:54, Alex Velenko  wrote:
> Hi,
>
> This patch implements vqneg_s64, vqnegd_s64, vqabs_s64 and
> vqabsd_s64 AArch64 intrinsics. Regression tests added.
> Run full regression with no regressions.
>
> Is patch OK?
>
> Thanks,
> Alex
>
> gcc/
>
> 2014-02-12  Alex Velenko  
>
> * gcc/config/aarch64/aarch64-simd.md (aarch64_s):
> Pattern extended.
> * config/aarch64/aarch64-simd-builtins.def (sqneg): Iterator
> extended.
> (sqabs): Likewise.
> * config/aarch64/arm_neon.h (vqneg_s64): New intrinsic.
> (vqnegd_s64): Likewise.
> (vqabs_s64): Likewise.
> (vqabsd_s64): Likewise.
>
> gcc/testsuite/
>
> 2014-02-12  Alex Velenko  
>
> *gcc.target/aarch64/vqneg_s64_1.c: New testcase.
> *gcc.target/aarch64/vqabs_s64_1.c: New testcase.

OK for stage-1
/Marcus


Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.

2014-03-17 Thread Uros Bizjak
On Mon, Mar 17, 2014 at 4:12 PM, H.J. Lu  wrote:

>> Patch in the bottom allows to use ymmXX and zmmXX
>> register names in inline asm statements as well as
>> in `register` variables definitions.
>>
>> New tests pass.
>> Bootstrap pass.
>>
>> Is it ok for trunk?
>> Do we need to backport it to 4.8?
>>
>> gcc/
>> * config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add
>> ymm and zmm register names.
>>
>> testsuite/
>> * gcc.target/i386/avx-additional-reg-names.c: New.
>> * gcc.target/i386/avx512f-additional-reg-names.c: Ditto.

> Doesn't GCC generate the same code with xmm?
>
>> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c 
>> b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
>> new file mode 100644
>> index 000..1bd428a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
>> @@ -0,0 +1,9 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-mavx512f" } */
>> +
>> +void foo ()
>> +{
>> +  register int zmm_var asm ("zmm9");
>> +
>> +  __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" );
>> +}
>
> Doesn't GCC generate the same code with xmm?

It does, but the situation is the same as with %eax vs. %rax names.
So, I think the patch is OK for mainline, and similar patch involving
only %ymm names for AVX-enabled branches.

Uros.


Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2014-03-17 Thread Jakub Jelinek
On Fri, Mar 07, 2014 at 09:21:48PM +0100, Thomas Schwinge wrote:
> Maybe it's just too late on a Friday evening, but I don't understand this
> change, part of r204863.  GF_OMP_FOR_KIND_FOR has the value zero;
> shouldn't this comparison have remained unchanged?  Is the following
> (untested) patch OK for trunk?  Does this need a test case?
> 
> commit f3c7834ecbedc50e04223d24b1b671fc8a62c169
> Author: Thomas Schwinge 
> Date:   Fri Mar 7 21:11:43 2014 +0100
> 
> Restore check for OpenMP for construct.
> 
>   gcc/
>   * omp-low.c (lower_rec_input_clauses) : Restore
>   check for GF_OMP_FOR_KIND_FOR.

Ok for trunk, sorry for the delay.

> diff --git gcc/omp-low.c gcc/omp-low.c
> index 4dc3956..713a4ae 100644
> --- gcc/omp-low.c
> +++ gcc/omp-low.c
> @@ -3915,7 +3915,7 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
>/* Don't add any barrier for #pragma omp simd or
>#pragma omp distribute.  */
>if (gimple_code (ctx->stmt) != GIMPLE_OMP_FOR
> -   || gimple_omp_for_kind (ctx->stmt) & GF_OMP_FOR_KIND_FOR)
> +   || gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_FOR)
>   gimple_seq_add_stmt (ilist, build_omp_barrier (NULL_TREE));
>  }
>  

Jakub
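
Why the `&` form of the check was wrong can be shown in isolation.  The
sketch below is not GCC code: the enum and function names are
hypothetical, and only the fact that the "for" kind has the value zero is
taken from the message above (the other values are assumptions).  Masking
with a zero-valued constant always yields zero, so the test can never be
true, whereas the equality test selects exactly the zero-valued kind.

```c
#include <assert.h>

/* Hypothetical stand-in for the loop kinds.  Only KIND_FOR == 0 comes
   from the thread; the remaining values are assumed for illustration.  */
enum for_kind
{
  KIND_FOR = 0,
  KIND_SIMD = 1,
  KIND_DISTRIBUTE = 2
};

/* The broken form: anything ANDed with zero is zero, so this returns 0
   for every kind, including KIND_FOR itself.  */
static int
is_for_by_mask (enum for_kind k)
{
  return (k & KIND_FOR) != 0;
}

/* The restored form: true only for the zero-valued kind.  */
static int
is_for_by_compare (enum for_kind k)
{
  return k == KIND_FOR;
}
```

In the patch this means the barrier was never added for a plain OpenMP
for construct while the `&` test was in place.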


Re: [PATCH][AARCH64]Amend AArch64 frame layout comment.

2014-03-17 Thread Richard Earnshaw
On 16/03/14 11:25, Renlin Li wrote:
> Hi all,
> 
> This is  a simple patch to update the AArch64 frame layout comment in 
> the source code.
> frame_pointer should point above the local_variables section as we 
> define FRAME_GROWS_DOWNWARD = 1.
> 
> Is this Okay for stage-4?
> 

OK.


R.




Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.

2014-03-17 Thread H.J. Lu
On Mon, Mar 17, 2014 at 9:52 AM, Uros Bizjak  wrote:
> On Mon, Mar 17, 2014 at 4:12 PM, H.J. Lu  wrote:
>
>>> Patch in the bottom allows to use ymmXX and zmmXX
>>> register names in inline asm statements as well as
>>> in `register` variables definitions.
>>>
>>> New tests pass.
>>> Bootstrap pass.
>>>
>>> Is it ok for trunk?
>>> Do we need to backport it to 4.8?
>>>
>>> gcc/
>>> * config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add
>>> ymm and zmm register names.
>>>
>>> testsuite/
>>> * gcc.target/i386/avx-additional-reg-names.c: New.
>>> * gcc.target/i386/avx512f-additional-reg-names.c: Ditto.
>
>> Doesn't GCC generate the same code with xmm?
>>
>>> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c 
>>> b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
>>> new file mode 100644
>>> index 000..1bd428a
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
>>> @@ -0,0 +1,9 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-mavx512f" } */
>>> +
>>> +void foo ()
>>> +{
>>> +  register int zmm_var asm ("zmm9");
>>> +
>>> +  __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" );
>>> +}
>>
>> Doesn't GCC generate the same code with xmm?
>
> It does, but the situation is the same as with %eax vs. %rax names.
> So, I think the patch is OK for mainline, and similar patch involving
> only %ymm names for AVX-enabled branches.
>

If I want to write code with asm statements which can
be compiled with GCC 4.6 and above, I will use xmm
instead of ymm.  It makes ymm less attractive.

-- 
H.J.


Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.

2014-03-17 Thread H.J. Lu
On Mon, Mar 17, 2014 at 10:11 AM, H.J. Lu  wrote:
> On Mon, Mar 17, 2014 at 9:52 AM, Uros Bizjak  wrote:
>> On Mon, Mar 17, 2014 at 4:12 PM, H.J. Lu  wrote:
>>
>>>> Patch in the bottom allows to use ymmXX and zmmXX
>>>> register names in inline asm statements as well as
>>>> in `register` variables definitions.
>>>>
>>>> New tests pass.
>>>> Bootstrap pass.
>>>>
>>>> Is it ok for trunk?
>>>> Do we need to backport it to 4.8?
>>>>
>>>> gcc/
>>>> * config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add
>>>> ymm and zmm register names.
>>>>
>>>> testsuite/
>>>> * gcc.target/i386/avx-additional-reg-names.c: New.
>>>> * gcc.target/i386/avx512f-additional-reg-names.c: Ditto.
>>
>>> Doesn't GCC generate the same code with xmm?
>>>
>>>> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
>>>> b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
>>>> new file mode 100644
>>>> index 000..1bd428a
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
>>>> @@ -0,0 +1,9 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-mavx512f" } */
>>>> +
>>>> +void foo ()
>>>> +{
>>>> +  register int zmm_var asm ("zmm9");
>>>> +
>>>> +  __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" );
>>>> +}
>>>
>>> Doesn't GCC generate the same code with xmm?
>>
>> It does, but the situation is the same as with %eax vs. %rax names.
>> So, I think the patch is OK for mainline, and similar patch involving
>> only %ymm names for AVX-enabled branches.
>>
>
> If I want to write codes with asm statements which can
> be compiled with GCC 4.6 and above, I will use xmm
> instead of ymm.  It makes ymm less attractive.
>

BTW, in glibc, there are

asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "xmm0" );

and

asm volatile ("vmovdqa %0, %%ymm0" : : "x" (ymm) : "xmm0" );

-- 
H.J.


Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.

2014-03-17 Thread Kirill Yukhin
On 17 Mar 17:52, Uros Bizjak wrote:
> On Mon, Mar 17, 2014 at 4:12 PM, H.J. Lu  wrote:
> 
> >> Is it ok for trunk?
> >> Do we need to backport it to 4.8?
> It does, but the situation is the same as with %eax vs. %rax names.
> So, I think the patch is OK for mainline, and similar patch involving
> only %ymm names for AVX-enabled branches.

Thanks, Uroš!

A couple of questions.  Are the AVX-enabled branches 4.8 and 4.7?  I suspect
that 4.6 is out of support.

Second, I didn't understand HJ's point at all.  Did you?  (I'll try to reach
him via internal IM.)

--
Thanks, K


Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.

2014-03-17 Thread Kirill Yukhin
On 17 Mar 10:16, H.J. Lu wrote:
> BTW, in glibc, there are
> 
> asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "xmm0" );
Maybe. But I believe it is much clearer to have this instead:
   asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "zmm0" );

--
Thanks, K


Re: C++ PATCH for c++/58678 (devirt vs. KDE)

2014-03-17 Thread Jan Hubicka
> On 03/17/2014 04:39 AM, Jan Hubicka wrote:
> >Thank you!  I would prefer a different marker than __cxa_pure_virtual in the
> >vtable, most probably simply NULL.
> >
> >The reason is that __cxa_pure_virtual will appear as a possible target in the
> >list and it will prevent devirtualization to happen when we end up with
> >__cxa_pure_virtual and real destructor in the list of possible targets.
> 
> Hmm?  __cxa_pure_virtual is not considered likely, so why wouldn't
> devirtualization choose the real function instead?

If you get a list like ~foo(), __cxa_pure_virtual, you will get speculative
devirtualization to ~foo.
If you get ~foo(), NULL, the NULL will get translated to BUILTIN_UNREACHABLE and
that will be dropped from the list, so you will end up with an unconditional
call of ~foo().

I think in general we cannot skip __cxa_pure_virtual, since people want friendly
diagnostics on broken programs instead of getting a devirtualized call to a
random other function. I was under the impression that in this case we know that
the virtual table entry won't be used, so full devirtualization would be
possible.
> 
> >gimple_get_virt_method_for_vtable knows that lookup in vtable that do not
> >result in FUNCTION_DECL should be translated to BUILTIN_UNREACHABLE and
> >ipa-devirt drops these from list of targets, unlike __cxa_pure_virtual that
> >stays.
> 
> I don't see the reason for that distinction; either way you get
> undefined behavior.  The only purpose of __cxa_pure_virtual is to
> give a friendly diagnostic before terminating the program.

I can drop the handling of __cxa_pure_virtual if unconditional devirtualization
is desirable, or perhaps do it under some switch.
Target lists containing one __cxa_pure_virtual and one extra function are common.
> 
> >Other problem with cxa_pure_virtual is that it needs external relocation.
> >I sort of wondered if we don't want to produce hidden comdat wrapper for
> >it, so C++ programs are easier to relocate.
> 
> Sure, that would make sense.
> 
> >What do you think of the following patch that makes ipa-devirt to conclude
> >that destructor calls are never done on types in construction.
> >If effect of doing so is undefined, I think it is safe to drop them from
> >list of targets and that really helps to reduce lists down.
> 
> That looks good to me.

Thanks.  I am away for the next 4 days at an Alaska hut w/o electricity and
will check my email afterwards.
> 
> Jason


Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.

2014-03-17 Thread H.J. Lu
On Mon, Mar 17, 2014 at 10:37 AM, Kirill Yukhin  wrote:
> On 17 Mar 10:16, H.J. Lu wrote:
>> BTW, in glibc, there are
>>
>> asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "xmm0" );
> Maybe. But I believe it is much clearer to have this instead:
>asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "zmm0" );
>

My issue is this is a user-visible change.  Code using ymm which
works with GCC 4.9 won't work with the installed GCC 4.6/4.7/4.8.
This change introduces GCC portability issues without significant
benefit.

-- 
H.J.


Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.

2014-03-17 Thread Jakub Jelinek
On Mon, Mar 17, 2014 at 11:26:58AM -0700, H.J. Lu wrote:
> On Mon, Mar 17, 2014 at 10:37 AM, Kirill Yukhin  
> wrote:
> > On 17 Mar 10:16, H.J. Lu wrote:
> >> BTW, in glibc, there are
> >>
> >> asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "xmm0" );
> > Maybe. But I believe it is much clearer to have this instead:
> >asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "zmm0" );
> >
> 
> My issue is this is a user-visible change.  Code using ymm which
> works with GCC 4.9 won't work with the installed GCC 4.6/4.7/4.8.
> This change introduces GCC portability issues without significant
> benefit.

It is up to the user to decide if they want to be portable to older
compilers or not.  But it is useful and more intuitive if we allow
specifying also the ymm and zmm forms.

Jakub
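
The portability trade-off debated in this thread can be sketched as
follows.  This is an illustration under stated assumptions, not code from
the patch or from glibc: the macro YMM7_CLOBBER and the function
clear_ymm7 are hypothetical, and the "GCC 4.9 or later" cutoff for the
ymm spelling is taken from the discussion above rather than verified
against a release.  The xmm spelling is used as the fallback because, as
noted in the thread, GCC generates the same code for it.

```c
#include <assert.h>

/* Assumption from the thread: GCC accepts "ymmN" in clobber lists from
   4.9 on; older releases (and, to be conservative, other compilers)
   get the "xmmN" spelling, which names the same register.  */
#if defined(__GNUC__) && !defined(__clang__) \
    && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
# define YMM7_CLOBBER "ymm7"
#else
# define YMM7_CLOBBER "xmm7"
#endif

/* Zero ymm7 when built for x86-64 with AVX enabled (e.g. -mavx).
   Returns 1 if the asm statement was compiled in, 0 otherwise.  */
static int
clear_ymm7 (void)
{
#if defined(__x86_64__) && defined(__AVX__)
  __asm__ __volatile__ ("vxorpd %%ymm7, %%ymm7, %%ymm7"
                        : : : YMM7_CLOBBER);
  return 1;
#else
  return 0;
#endif
}
```

Code that must build with the older compilers can simply always use the
xmm spelling, which is H.J.'s point; the ymm spelling is the more
readable choice when only newer compilers matter, which is Jakub's.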


[PATCH] Fix -fsanitize=undefined -flto (PR sanitizer/60535)

2014-03-17 Thread Jakub Jelinek
Hi!

Apparently rest_of_decl_compilation only calls varpool_finalize_decl
if not in_lto_p, so this patch calls it explicitly after that call to
make sure with -flto we register the newly created vars with varpool as
well.

Additionally, the patch gives name to a few further builtin types, so that
the null-4.c and overflow-int128.c tests don't fail with -flto (without the
lto-lang.c change they printed  as type name).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-03-17  Jakub Jelinek  

PR sanitizer/60535
* ubsan.c (ubsan_type_descriptor, ubsan_create_data): Call
varpool_finalize_decl after rest_of_decl_compilation.
lto/
* lto-lang.c (lto_init): Add NAME_TYPE for int128_integer_type_node
and complex_{float,{,long_}double}_type_node.
testsuite/
* c-c++-common/ubsan/null-1.c: Don't skip if -flto.
* c-c++-common/ubsan/null-2.c: Likewise.
* c-c++-common/ubsan/null-3.c: Likewise.
* c-c++-common/ubsan/null-4.c: Likewise.
* c-c++-common/ubsan/null-5.c: Likewise.
* c-c++-common/ubsan/null-6.c: Likewise.
* c-c++-common/ubsan/null-7.c: Likewise.
* c-c++-common/ubsan/null-8.c: Likewise.
* c-c++-common/ubsan/null-9.c: Likewise.
* c-c++-common/ubsan/null-10.c: Likewise.
* c-c++-common/ubsan/null-11.c: Likewise.
* c-c++-common/ubsan/overflow-1.c: Likewise.
* c-c++-common/ubsan/overflow-2.c: Likewise.
* c-c++-common/ubsan/overflow-add-1.c: Likewise.
* c-c++-common/ubsan/overflow-add-2.c: Likewise.
* c-c++-common/ubsan/overflow-int128.c: Likewise.
* c-c++-common/ubsan/overflow-mul-1.c: Likewise.
* c-c++-common/ubsan/overflow-mul-2.c: Likewise.
* c-c++-common/ubsan/overflow-mul-3.c: Likewise.
* c-c++-common/ubsan/overflow-mul-4.c: Likewise.
* c-c++-common/ubsan/overflow-negate-1.c: Likewise.
* c-c++-common/ubsan/overflow-negate-2.c: Likewise.
* c-c++-common/ubsan/overflow-sub-1.c: Likewise.
* c-c++-common/ubsan/overflow-sub-2.c: Likewise.
* c-c++-common/ubsan/pr59333.c: Likewise.
* c-c++-common/ubsan/pr59503.c: Likewise.
* c-c++-common/ubsan/pr59667.c: Likewise.
* c-c++-common/ubsan/undefined-1.c: Likewise.
* g++.dg/ubsan/pr59250.C: Likewise.
* g++.dg/ubsan/pr59306.C: Likewise.

--- gcc/ubsan.c.jj  2014-01-08 17:45:06.0 +0100
+++ gcc/ubsan.c 2014-03-17 14:09:40.280376415 +0100
@@ -391,6 +391,7 @@ ubsan_type_descriptor (tree type, bool w
   TREE_STATIC (ctor) = 1;
   DECL_INITIAL (decl) = ctor;
   rest_of_decl_compilation (decl, 1, 0);
+  varpool_finalize_decl (decl);
 
   /* Save the VAR_DECL into the hash table.  */
   decl_for_type_insert (type, decl);
@@ -502,6 +503,7 @@ ubsan_create_data (const char *name, loc
   TREE_STATIC (ctor) = 1;
   DECL_INITIAL (var) = ctor;
   rest_of_decl_compilation (var, 1, 0);
+  varpool_finalize_decl (var);
 
   return var;
 }
--- gcc/lto/lto-lang.c.jj   2014-03-10 10:50:15.0 +0100
+++ gcc/lto/lto-lang.c  2014-03-17 15:49:10.592371589 +0100
@@ -1222,6 +1222,13 @@ lto_init (void)
   NAME_TYPE (long_double_type_node, "long double");
   NAME_TYPE (void_type_node, "void");
   NAME_TYPE (boolean_type_node, "bool");
+  NAME_TYPE (complex_float_type_node, "complex float");
+  NAME_TYPE (complex_double_type_node, "complex double");
+  NAME_TYPE (complex_long_double_type_node, "complex long double");
+#if HOST_BITS_PER_WIDE_INT >= 64
+  if (targetm.scalar_mode_supported_p (TImode))
+NAME_TYPE (int128_integer_type_node, "__int128");
+#endif
 #undef NAME_TYPE
 
   /* Initialize LTO-specific data structures.  */
--- gcc/testsuite/c-c++-common/ubsan/null-1.c.jj  2013-11-19 21:56:24.566416519 +0100
+++ gcc/testsuite/c-c++-common/ubsan/null-1.c  2014-03-17 13:23:46.057000209 +0100
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-fsanitize=null -w" } */
 /* { dg-shouldfail "ubsan" } */
-/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
 
 int
 main (void)
--- gcc/testsuite/c-c++-common/ubsan/null-2.c.jj  2013-11-19 21:56:24.566416519 +0100
+++ gcc/testsuite/c-c++-common/ubsan/null-2.c  2014-03-17 13:23:46.06592 +0100
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-fsanitize=null -w" } */
 /* { dg-shouldfail "ubsan" } */
-/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
 
 int
 main (void)
--- gcc/testsuite/c-c++-common/ubsan/null-3.c.jj  2013-11-19 21:56:24.567416516 +0100
+++ gcc/testsuite/c-c++-common/ubsan/null-3.c  2014-03-17 13:23:46.063000958 +0100
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-fsanitize=null -w" } */
 /* { dg-shouldfail "ubsan" } */
-/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
 
 int
 foo (int *p)
--- gcc/testsuite/c-c++-common/ubsan/null-4.c.jj  2013-11-19 21:56:24.567416516 +0100
+++ gcc/testsuite/c-c++-common/ubsan/null-4.c  2014-03-17 15:37:15.977422737 +0100
@@ -1,7 

Re: [PATCH] Fix up REG_CFA_ADJUST_CFA note creation in epilogue (PR target/60516)

2014-03-17 Thread Richard Henderson
On 03/17/2014 11:47 AM, Jakub Jelinek wrote:
> 2014-03-17  Jakub Jelinek  
> 
>   PR target/60516
>   * config/i386/i386.c (ix86_expand_epilogue): Adjust REG_CFA_ADJUST_CFA
>   note creation for the 2010-08-31 changes.
> 
>   * gcc.target/i386/pr60516.c: New test.

Ok.


r~


[PATCH] Fix up REG_CFA_ADJUST_CFA note creation in epilogue (PR target/60516)

2014-03-17 Thread Jakub Jelinek
Hi!

Since r163679 the pop pattern is no longer a PARALLEL, but uses POST_INC.
That commit fixed another spot where REG_CFA_ADJUST_CFA note has been
created from the pop insn pattern, but missed this spot which is rarely used
(requires popping > 64KB arguments by callee).

Bootstrapped/regtested on x86_64-linux and i686-linux, Kai has tested this
on some mingw32 or what.  Ok for trunk?

2014-03-17  Jakub Jelinek  

PR target/60516
* config/i386/i386.c (ix86_expand_epilogue): Adjust REG_CFA_ADJUST_CFA
note creation for the 2010-08-31 changes.

* gcc.target/i386/pr60516.c: New test.

--- gcc/config/i386/i386.c.jj   2014-03-13 21:54:53.0 +0100
+++ gcc/config/i386/i386.c  2014-03-17 07:19:28.461411964 +0100
@@ -11708,8 +11708,9 @@ ix86_expand_epilogue (int style)
  m->fs.cfa_offset -= UNITS_PER_WORD;
  m->fs.sp_offset -= UNITS_PER_WORD;
 
- add_reg_note (insn, REG_CFA_ADJUST_CFA,
-   copy_rtx (XVECEXP (PATTERN (insn), 0, 1)));
+ rtx x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
+ x = gen_rtx_SET (VOIDmode, stack_pointer_rtx, x);
+ add_reg_note (insn, REG_CFA_ADJUST_CFA, x);
  add_reg_note (insn, REG_CFA_REGISTER,
gen_rtx_SET (VOIDmode, ecx, pc_rtx));
  RTX_FRAME_RELATED_P (insn) = 1;
--- gcc/testsuite/gcc.target/i386/pr60516.c.jj  2014-03-17 07:17:14.165158703 +0100
+++ gcc/testsuite/gcc.target/i386/pr60516.c  2014-03-17 07:16:59.343275735 +0100
@@ -0,0 +1,20 @@
+/* PR target/60516 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct S { char c[65536]; };
+
+__attribute__((ms_abi, thiscall)) void
+foo (void *x, struct S y)
+{
+}
+
+__attribute__((ms_abi, fastcall)) void
+bar (void *x, void *y, struct S z)
+{
+}
+
+__attribute__((ms_abi, stdcall)) void
+baz (struct S x)
+{
+}

Jakub


Re: [PATCH] BZ60501: Add addptr optab

2014-03-17 Thread Vladimir Makarov

On 2014-03-13, 7:37 AM, Andreas Krebbel wrote:
> On 13/03/14 12:25, Richard Biener wrote:
>> On Thu, Mar 13, 2014 at 12:16 PM, Eric Botcazou  wrote:
>>>> --- a/gcc/doc/md.texi
>>>> +++ b/gcc/doc/md.texi
>>>> @@ -4720,6 +4720,17 @@ Add operand 2 and operand 1, storing the result in
>>>> operand 0.  All operands must have mode @var{m}.  This can be used even on
>>>> two-address machines, by means of constraints requiring operands 1 and 0 to
>>>> be the same location.
>>>>
>>>> +@cindex @code{addptr@var{m}3} instruction pattern
>>>> +@item @samp{addptr@var{m}3}
>>>> +Like @code{add@var{m}3} but does never clobber the condition code.
>>>> +It only needs to be defined if @code{add@var{m}3} either sets the
>>>> +condition code or address calculations cannot be performed with the
>>>> +normal add instructions due to other reasons.  If adds used for
>>>> +address calculations and normal adds are not compatible it is required
>>>> +to expand a distinct pattern (e.g. using an unspec).  The pattern is
>>>> +used by LRA to emit address calculations.  @code{add@var{m}3} is used
>>>> +if @code{addptr@var{m}3} is not defined.
>>>
>>> I'm a bit skeptical of the "address calculations cannot be performed with the
>>> normal add instructions due to other reasons" part.  Surely they can be
>>> performed on all architectures supported by GCC as of this writing, otherwise
>>> how would the compiler even work?  And if it's really like @code{add@var{m}3},
>>> why restricting it to addresses, i.e. why calling it @code{addptr@var{m}3}?
>>> Does that come from an implementation constraint on s390 that supports it only
>>> for a subset of the cases supported by @code{add@var{m}3}?
>>
>> Yeah, isn't it that you want a named pattern like add_nocc for an add
>> that doesn't clobber flags?
>
> This would suggest that you can use the pattern also for performing a normal
> add in case the condition code is not needed afterwards, but this isn't
> correct for s390 31 bit where an address calculation is actually something
> different.  addptr is better I think because it is a pattern which is
> supposed to be implemented with a load address instruction and the
> middle-end guarantees to use it only on addresses.  (I hope LRA is actually
> behaving that way.)  Perhaps loadptr or la or loadaddress would be a better
> name?



It is complicated.  There is no guarantee that it is used only for 
addresses.  I need some time to think how to fix it.


Meanwhile, you *should* commit the patch into the trunk because it
solves the real problem.  And I can work from this to make changes so that
the new pattern is only used for addresses.


The patch is absolutely safe for all targets but s390.  There is still a
tiny possibility that it might result in some problems for s390 (now I
see only one situation when a pseudo in a subreg changed by equiv plus
expr needs a reload).  In any case your patch solves numerous real
failures and can be used as a base for further work.


Thanks for working on this problem, Andreas.  Sorry that I missed the 
PR60501.




[patch testsuite]: Fix some mingw testcases in gcc.dg

2014-03-17 Thread Kai Tietz
Hello,

this patch fixes some regressions introduced by default-option
-fms-extensions for mingw-targets.

ChangeLog

2014-03-17  Kai Tietz  

* anon-struct-1.c: Add -fno-ms-extensions option for mingw targets.
* anon-struct-11.c: Likewise.
* anon-struct-2.c: Likewise.
* c11-anon-struct-2.c: Likewise.
* c11-anon-struct-3.c: Likewise.

Tested for i686-w64-mingw32, and x86_64-unknown-linux-gnu.  Ok for apply?

Regards,
Kai

Index: anon-struct-1.c
===
--- anon-struct-1.c(Revision 208594)
+++ anon-struct-1.c(Arbeitskopie)
@@ -1,4 +1,5 @@
 /* { dg-options "-std=iso9899:1990 -pedantic" } */
+/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */
 /* In strict ISO C mode, we don't recognize the anonymous struct/union
extension or any Microsoft extensions.  */

Index: anon-struct-11.c
===
--- anon-struct-11.c(Revision 208594)
+++ anon-struct-11.c(Arbeitskopie)
@@ -3,6 +3,7 @@
 /* No special options--in particular, turn off the default
-pedantic-errors option.  */
 /* { dg-options "" } */
+/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */

 /* When not using -fplan9-extensions, we don't support automatic
conversion of pointer types, and we don't support referring to a
Index: anon-struct-2.c
===
--- anon-struct-2.c(Revision 208594)
+++ anon-struct-2.c(Arbeitskopie)
@@ -1,4 +1,5 @@
 /* { dg-options "-std=gnu89" } */
+/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */
 /* In GNU C mode, we recognize the anonymous struct/union extension,
but not Microsoft extensions.  */

Index: c11-anon-struct-2.c
===
--- c11-anon-struct-2.c(Revision 208594)
+++ c11-anon-struct-2.c(Arbeitskopie)
@@ -2,6 +2,7 @@
cases.  */
 /* { dg-do compile } */
 /* { dg-options "-std=c11 -pedantic-errors" } */
+/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */

 typedef struct s0
 {
Index: c11-anon-struct-3.c
===
--- c11-anon-struct-3.c(Revision 208594)
+++ c11-anon-struct-3.c(Arbeitskopie)
@@ -2,6 +2,7 @@
cases: typedefs disallowed by N1549.  */
 /* { dg-do compile } */
 /* { dg-options "-std=c11 -pedantic-errors" } */
+/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */

 typedef struct
 {


Re: [patch testsuite]: Fix some mingw testcases in gcc.dg

2014-03-17 Thread Rainer Orth
Hi Kai,

> this patch fixes some regressions introduced by default-option
> -fms-extensions for mingw-targets.

you should state in your submissions *which* regressions were
introduced/*which* problem you are fixing.  While this may be obvious to
you, it's often not so to reviewers.

> ChangeLog
>
> 2014-03-17  Kai Tietz  
>
> * anon-struct-1.c: Add -fno-ms-extensions option for mingw targets.
> * anon-struct-11.c: Likewise.
> * anon-struct-2.c: Likewise.
> * c11-anon-struct-2.c: Likewise.
> * c11-anon-struct-3.c: Likewise.

gcc.dg/ prefix missing in ChangeLog entries.

> Tested for i686-w64-mingw32, and x86_64-unknown-linux-gnu.  Ok for apply?

Ok with that change.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [patch testsuite]: Fix some mingw testcases in gcc.dg

2014-03-17 Thread Kai Tietz
2014-03-17 21:50 GMT+01:00 Rainer Orth :
> Hi Kai,
>
>> this patch fixes some regressions introduced by default-option
>> -fms-extensions for mingw-targets.
>
> you should state in your submissions *which* regressions were
> introduced/*which* problem you are fixing.  While this may be obvious to
> you, it's often not so to reviewers.

I did.  "The regressions in testsuite are introduced by turning on the
state of -fms-extensions."  That's all, and there is no more to tell.

>> ChangeLog
>>
>> 2014-03-17  Kai Tietz  
>>
>> * anon-struct-1.c: Add -fno-ms-extensions option for mingw targets.
>> * anon-struct-11.c: Likewise.
>> * anon-struct-2.c: Likewise.
>> * c11-anon-struct-2.c: Likewise.
>> * c11-anon-struct-3.c: Likewise.
>
> gcc.dg/ prefix missing in ChangeLog entries.
>
>> Tested for i686-w64-mingw32, and x86_64-unknown-linux-gnu.  Ok for apply?
>
> Ok with that change.
>
> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [patch testsuite]: Fix some mingw testcases in gcc.dg

2014-03-17 Thread Rainer Orth
Kai Tietz  writes:

> 2014-03-17 21:50 GMT+01:00 Rainer Orth :
>> Hi Kai,
>>
>>> this patch fixes some regressions introduced by default-option
>>> -fms-extensions for mingw-targets.
>>
>> you should state in your submissions *which* regressions were
>> introduced/*which* problem you are fixing.  While this may be obvious to
>> you, it's often not so to reviewers.
>
> I did.  "The regressions in testsuite are introduced by turning on the
> state of -fms-extensions."  That's all, and there is no more to tell.

You didn't.  *Which* regressions?  What happens?  I had to infer it from
a comment in one of the changed testcases:

 /* In strict ISO C mode, we don't recognize the anonymous struct/union
extension or any Microsoft extensions.  */

If you'd cited the compiler error you get for one of the testcases,
everything would have been clear.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University
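
For readers without the testcases at hand, the distinction behind the
comment Rainer cites can be sketched in a few lines.  This is an
illustration only: the type and function names are hypothetical, and the
reading that the affected tests expect diagnostics for the tagged form
is inferred from that comment, not quoted from the tests.  An untagged
anonymous struct member is standard C11; a tagged struct used as an
anonymous member is the Microsoft extension that -fms-extensions
accepts, so it is shown in a comment only.

```c
#include <assert.h>

struct point
{
  int x;
  int y;
};

/* Portable in every C dialect: the member has a name.  */
struct with_named_member
{
  struct point p;
  int z;
};

/* C11 (and GNU C) anonymous struct: no tag, no member name; x and y
   are accessed as if they were direct members.  */
struct with_c11_anon
{
  struct { int x; int y; };
  int z;
};

/* Microsoft extension, accepted by GCC only with -fms-extensions:

     struct with_ms_anon { struct point; int z; };

   Without the option GCC diagnoses the tagged "struct point;" member
   as not declaring anything.  */

static int
sum_named (int x, int y, int z)
{
  struct with_named_member v;

  v.p.x = x;
  v.p.y = y;
  v.z = z;
  return v.p.x + v.p.y + v.z;
}

static int
sum_anon (int x, int y, int z)
{
  struct with_c11_anon v;

  v.x = x;
  v.y = y;
  v.z = z;
  return v.x + v.y + v.z;
}
```

If -fms-extensions is on by default, as the patch description says it is
for mingw targets, the commented-out form is presumably accepted
silently, which would explain why tests expecting a diagnostic need
-fno-ms-extensions there.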


Re: [PATCH] x86: Define _mm*_undefined_*

2014-03-17 Thread Ulrich Drepper
On Mon, Mar 17, 2014 at 7:39 AM, Ilya Tocar  wrote:
> Do you know of any cases where xor is
> generated (except for destination in gather/scatter)

I don't have any code exhibiting this handy right now.  I'll keep an eye out.


>  but it also clobbers
> flags. Maybe just define it to setzero for now?

What do you mean by "clobbers flags"?  Do you have an example?


extending constants in rtl

2014-03-17 Thread Mike Stump
So, to support things like this:

(define_constants
   (C1_TEMP_REGNUM  PROLOGUE_SCRATCH_1)
   (C1_TEMP2_REGNUM PROLOGUE_SCRATCH_2)

I need the rtl reader to do less checking.  When we turn off int validation,
this then works, and we get:

  #define C1_TEMP_REGNUM PROLOGUE_SCRATCH_1

in insn-constants.h, which is what I wanted.  The problem is that I choose a
different scratch register based upon the cpu and this is then used in a
clobber in the rtl of a define_insn.

I’d be happy to do this some other way, but I didn’t see another way to do
it.

Absent a better solution, I’d like to pursue this.  The only question I have:
should we remove the checking, or allow the target to state that it doesn’t
want the checking?


diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c
index c198b5b..ceef96c 100644
--- a/gcc/read-rtl.c
+++ b/gcc/read-rtl.c
@@ -807,8 +807,12 @@ validate_const_int (const char *string)
 valid = 0;
break;
   }
+#if 0
+  /* In order to support defining the md constants in terms of CPP constants from tm.h, we
+ can't check this.  */
   if (!valid)
 fatal_with_file_and_line ("invalid decimal constant \"%s\"\n", string);
+#endif
 }
 
 static void
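
A middle ground between the current check and disabling it with #if 0
would be to accept either a decimal constant or something that looks
like a C identifier.  The sketch below shows that idea in isolation.  It
is not the code in read-rtl.c: all three function names are
hypothetical, and the decimal check only approximates what
validate_const_int does (optional leading whitespace, optional minus
sign, then digits).

```c
#include <ctype.h>

/* Approximation of the existing check: optional whitespace, optional
   '-', then at least one digit and nothing else.  */
static int
is_decimal_constant (const char *s)
{
  while (isspace ((unsigned char) *s))
    s++;
  if (*s == '-')
    s++;
  if (*s == '\0')
    return 0;
  for (; *s != '\0'; s++)
    if (!isdigit ((unsigned char) *s))
      return 0;
  return 1;
}

/* A C identifier such as a CPP macro name from tm.h.  */
static int
is_c_identifier (const char *s)
{
  if (!(isalpha ((unsigned char) *s) || *s == '_'))
    return 0;
  for (s++; *s != '\0'; s++)
    if (!(isalnum ((unsigned char) *s) || *s == '_'))
      return 0;
  return 1;
}

/* Hypothetical relaxed validator: numbers and macro names pass,
   everything else is still rejected.  */
static int
is_acceptable_md_constant (const char *s)
{
  return is_decimal_constant (s) || is_c_identifier (s);
}
```

With this, "42" and "PROLOGUE_SCRATCH_1" would both be accepted while a
typo such as "4x" would still be caught, at the cost of no longer being
able to tell a misspelled macro name from a valid one at this stage.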



Ping^3 GCC trunk 4.9: documentation patch on plugins

2014-03-17 Thread Basile Starynkevitch
On Sat, 2014-03-08 at 11:15 +0100, Basile Starynkevitch wrote:
> I am pinging again this documentation patch
> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00074.html
> (pinged at http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01002.html on
> Feb. 17th 2014)
and also pinged at
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00387.html on March 8th
2014
 gcc/ChangeLog entry

2014-03-18  Basile Starynkevitch  

* doc/plugins.texi (Plugin callbacks): Mention
PLUGIN_INCLUDE_FILE.
Italicize plugin event names in description.  Explain that 
PLUGIN_PRAGMAS has no sense for lto1. Explain
PLUGIN_INCLUDE_FILE. 
Remind that no GCC functions should be called after
PLUGIN_FINISH.
Explain what pragmas with expansion are.

 the patch:
Index: gcc/doc/plugins.texi
===
--- gcc/doc/plugins.texi(revision 207422)
+++ gcc/doc/plugins.texi(working copy)
@@ -209,6 +209,10 @@
   PLUGIN_EARLY_GIMPLE_PASSES_END,
   /* Called when a pass is first instantiated.  */
   PLUGIN_NEW_PASS,
+/* Called when a file is #include-d or given thru #line directive.
+   Could happen many times.  The event data is the included file path,
+   as a const char* pointer.  */
+  PLUGIN_INCLUDE_FILE,
 
   PLUGIN_EVENT_FIRST_DYNAMIC    /* Dummy event used for indexing callback
array.  */
@@ -229,15 +233,27 @@
 @item @code{void *user_data}: Pointer to plugin-specific data.
 @end itemize
 
-For the PLUGIN_PASS_MANAGER_SETUP, PLUGIN_INFO, PLUGIN_REGISTER_GGC_ROOTS
-and PLUGIN_REGISTER_GGC_CACHES pseudo-events the @code{callback} should be
-null, and the @code{user_data} is specific.
+For the @i{PLUGIN_PASS_MANAGER_SETUP}, @i{PLUGIN_INFO},
+@i{PLUGIN_REGISTER_GGC_ROOTS} and @i{PLUGIN_REGISTER_GGC_CACHES}
+pseudo-events the @code{callback} should be null, and the
+@code{user_data} is specific.
 
-When the PLUGIN_PRAGMAS event is triggered (with a null
-pointer as data from GCC), plugins may register their own pragmas
-using functions like @code{c_register_pragma} or
-@code{c_register_pragma_with_expansion}.
+When the @i{PLUGIN_PRAGMAS} event is triggered (with a null pointer as
+data from GCC), plugins may register their own pragmas.  Note that
+pragmas are not available from @file{lto1}, so plugins used with the
+@code{-flto} option to GCC during link-time optimization cannot use
+pragmas and do not even see functions like @code{c_register_pragma} or
+@code{pragma_lex}.
 
+The @i{PLUGIN_INCLUDE_FILE} event, with a @code{const char*} file path as
+GCC data, is triggered for processing of @code{#include} or
+@code{#line} directives.
+
+The @i{PLUGIN_FINISH} event is the last time that plugins can call GCC
+functions, notably emit diagnostics with @code{warning}, @code{error}
+etc.
+
+
 @node Plugins pass
 @section Interacting with the pass manager
 
@@ -376,10 +392,13 @@
 @end smallexample
 
 
-The @code{PLUGIN_PRAGMAS} callback is called during pragmas
-registration. Use the @code{c_register_pragma} or
-@code{c_register_pragma_with_expansion} functions to register custom
-pragmas.
+The @i{PLUGIN_PRAGMAS} callback is called once during pragmas
+registration. Use the @code{c_register_pragma},
+@code{c_register_pragma_with_data},
+@code{c_register_pragma_with_expansion},
+@code{c_register_pragma_with_expansion_and_data} functions to register
+custom pragmas and their handlers (which often want to call
+@code{pragma_lex}) from @file{c-family/c-pragma.h}.
 
 @smallexample
 /* Plugin callback called during pragmas registration. Registered with
@@ -397,7 +416,15 @@
 It is suggested to pass @code{"GCCPLUGIN"} (or a short name identifying
 your plugin) as the ``space'' argument of your pragma.
 
+Pragmas registered with @code{c_register_pragma_with_expansion} or
+@code{c_register_pragma_with_expansion_and_data} allow
+preprocessor expansions, for example:
 
+@smallexample
+#define NUMBER 10
+#pragma GCCPLUGIN foothreshold (NUMBER)
+@end smallexample
+
 @node Plugins recording
 @section Recording information about pass execution
 
#
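
To illustrate the PLUGIN_INCLUDE_FILE description in the patch above, here is a minimal sketch of what a callback for that event might look like. It is written to compile without the GCC plugin headers, so it only mirrors the callback shape (a void * for the GCC data and a void * for the user data). The struct include_stats type and the function name are hypothetical and not part of GCC.

```c
#include <stdio.h>

/* Hypothetical per-plugin state, handed over as user_data when the
   callback is registered.  */
struct include_stats
{
  unsigned count;        /* number of PLUGIN_INCLUDE_FILE events seen */
  char last_path[256];   /* most recent path, truncated if too long */
};

/* Same shape as a plugin callback: (void *gcc_data, void *user_data).
   For PLUGIN_INCLUDE_FILE the patch documents gcc_data as the included
   file path, as a const char * pointer.  */
void
on_include_file (void *gcc_data, void *user_data)
{
  const char *path = (const char *) gcc_data;
  struct include_stats *stats = (struct include_stats *) user_data;

  if (path == NULL || stats == NULL)
    return;
  stats->count++;
  /* snprintf always NUL-terminates and truncates overlong paths.  */
  snprintf (stats->last_path, sizeof stats->last_path, "%s", path);
}
```

In a real plugin this would presumably be hooked up from plugin_init with a call along the lines of register_callback (plugin_info->base_name, PLUGIN_INCLUDE_FILE, on_include_file, &stats), where stats is a struct include_stats owned by the plugin. Since the event can fire many times, the callback keeps its work small.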

Ok for 4.9?

Regards