date:20120613

Re: [PATCH, GCC][AArch64] Use Enums for code models option selection

2012-06-13 Thread Marcus Shawcroft


On 13/06/12 14:38, Sofiane Naci wrote:

Hi,

I discovered a bug in my previous patch, so I attach a new one.
The ChangeLog hasn't changed.
OK to commit?

Thanks
Sofiane


-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On
Behalf Of Sofiane Naci
Sent: 31 May 2012 10:55
To: gcc-patches@gcc.gnu.org
Subject: [PATCH, GCC][AArch64] Use Enums for code models option selection

Hi,

This patch refactors code model option selection in the AArch64 port:

  . Renaming variables such as mem_model to cmodel, for better clarity.
  . Using the generic support for enumerated option arguments.
  . Fixing touched code layout and formatting issues.

Thanks
Sofiane

-

ChangeLog:

2012-05-31  Sofiane Naci

[AArch64] Use Enums for code models option selection.

* config/aarch64/aarch64-elf-raw.h (AARCH64_DEFAULT_MEM_MODEL):
Delete.
* config/aarch64/aarch64-linux.h (AARCH64_DEFAULT_MEM_MODEL):
Delete.
* config/aarch64/aarch64-opts.h (enum aarch64_code_model): New.
* config/aarch64/aarch64-protos.h: Update comments.
* config/aarch64/aarch64.c: Update comments.
(aarch64_default_mem_model): Rename to aarch64_code_model.
(aarch64_expand_mov_immediate): Remove error message.
(aarch64_select_rtx_section): Remove assertion and update comment.
(aarch64_override_options): Move memory model initialization from
here.
(struct aarch64_mem_model): Delete.
(aarch64_memory_models[]): Delete.
(initialize_aarch64_memory_model): Rename to
initialize_aarch64_code_model
and update.
(aarch64_classify_symbol): Handle AARCH64_CMODEL_TINY and
AARCH64_CMODEL_TINY_PIC
* config/aarch64/aarch64.h
(enum aarch64_memory_model): Delete.
(aarch64_default_mem_model): Rename to aarch64_cmodel.
(HAS_LONG_COND_BRANCH): Update.
(HAS_LONG_UNCOND_BRANCH): Update.
* config/aarch64/aarch64.opt
(cmodel): New.
(mcmodel): Update.


OK

Re: Make timevar phases mutually exclusive. (issue6302064)

2012-06-13 Thread Diego Novillo


On 12-06-13 08:46 , Diego Novillo wrote:


The LTO bits are fine. I would prefer if an FE maintainer takes a second
look over the other bits. Jason, Joseph?


Incidentally, could you please test it with an LTO-enabled bootstrap?

$ ../src/configure --with-build-config=bootstrap-lto --enable-languages=c++,fortran

$ make profiledbootstrap

If you cut and paste one of the -flto compiles you see during bootstrap,
you can add -ftime-report to it to make sure that the LTO timers are
properly set up.



Diego.

Re: C++ PATCH for c++/42603 and c++/6709 (DR 743/950, allow decltype as base-specifier and nested-name-specifier)

2012-06-13 Thread H.J. Lu

On Wed, Jul 20, 2011 at 7:29 AM, Jason Merrill  wrote:
> DRs 743 and 950 allow decltype to be used as the scope in a
> nested-name-specifier and as a base-specifier.  This patch implements that
> functionality.  In order to deal with the ambiguity when we first encounter
> "decltype" as to whether it will be a nested-name-specifier or its own
> simple-type-specifier, we now cache the result of parsing in a new
> CPP_DECLTYPE token type so that we don't need to parse it a second time if
> it turns out not to be a nested-name-specifier.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53651


-- 
H.J.

Re: [arm] Remove obsolete FPA support (1/n): obsolete target removal

2012-06-13 Thread Richard Earnshaw

On 13/06/12 14:19, Sebastian Huber wrote:
> On 06/13/2012 02:51 PM, Richard Earnshaw wrote:
>>  (arm*-*-rtems*): Remove.
> 
> For RTEMS the intention was to rename arm*-*-rtemseabi* into arm*-*-rtems*
> and provide an arm*-*-rtemself* legacy target.  My personal opinion is to
> avoid an arm*-*-rtemself* legacy target, but it was decided otherwise by the
> RTEMS community.
> 
> See also:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53325
> http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00939.html
> 

I've taken no position on that in this patch.  All that's happened is
that the old rtems (ie pre-eabi) configuration has now gone.

R.

PATCH: PR target/53647: Set proper cache values when needed

2012-06-13 Thread H.J. Lu

Hi,

On i386, ix86_size_cost will be used for -Os, which has zero for
simultaneous_prefetches, prefetch_block, l1_cache_size and l2_cache_size.
This patch adds ix86_tune_cost and uses it for simultaneous_prefetches,
prefetch_block, l1_cache_size and l2_cache_size if ones from ix86_cost
are zero.  OK to install?

Thanks.


H.J.
---
2012-06-13  H.J. Lu  

PR target/53647
* config/i386/i386.c (ix86_tune_cost): New variable.
(ix86_option_override_internal): Set ix86_tune_cost.  Use
ix86_tune_cost for simultaneous_prefetches, prefetch_block,
l1_cache_size and l2_cache_size if ones from ix86_cost are
zero.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 13755f4..2e64d55 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -1874,6 +1874,10 @@ struct processor_costs generic32_cost = {
   1,   /* cond_not_taken_branch_cost.  */
 };
 
+/* Set by -mtune.  */
+const struct processor_costs *ix86_tune_cost = &pentium_cost;
+
+/* Set by -mtune or -Os.  */
 const struct processor_costs *ix86_cost = &pentium_cost;
 
 /* Processor feature/optimization bitmasks.  */
@@ -3546,6 +3550,7 @@ ix86_option_override_internal (bool main_args_p)
flag_pcc_struct_return = DEFAULT_PCC_STRUCT_RETURN;
 }
 
+  ix86_tune_cost = processor_target_table[ix86_tune].cost;
   if (optimize_size)
 ix86_cost = &ix86_size_cost;
   else
@@ -3794,16 +3799,27 @@ ix86_option_override_internal (bool main_args_p)
 flag_schedule_insns_after_reload = flag_schedule_insns = 0;
 
   maybe_set_param_value (PARAM_SIMULTANEOUS_PREFETCHES,
-ix86_cost->simultaneous_prefetches,
+ix86_cost->simultaneous_prefetches
+? ix86_cost->simultaneous_prefetches
+: ix86_tune_cost->simultaneous_prefetches,
 global_options.x_param_values,
 global_options_set.x_param_values);
-  maybe_set_param_value (PARAM_L1_CACHE_LINE_SIZE, ix86_cost->prefetch_block,
+  maybe_set_param_value (PARAM_L1_CACHE_LINE_SIZE,
+ix86_cost->prefetch_block
+? ix86_cost->prefetch_block
+: ix86_tune_cost->prefetch_block,
 global_options.x_param_values,
 global_options_set.x_param_values);
-  maybe_set_param_value (PARAM_L1_CACHE_SIZE, ix86_cost->l1_cache_size,
+  maybe_set_param_value (PARAM_L1_CACHE_SIZE,
+ix86_cost->l1_cache_size
+? ix86_cost->l1_cache_size
+: ix86_tune_cost->l1_cache_size,
 global_options.x_param_values,
 global_options_set.x_param_values);
-  maybe_set_param_value (PARAM_L2_CACHE_SIZE, ix86_cost->l2_cache_size,
+  maybe_set_param_value (PARAM_L2_CACHE_SIZE,
+ix86_cost->l2_cache_size
+? ix86_cost->l2_cache_size
+: ix86_tune_cost->l2_cache_size,
 global_options.x_param_values,
 global_options_set.x_param_values);
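
The fallback described in this message can be sketched in isolation.  This is
a minimal illustration, not the code in i386.c: struct costs, pick and
effective_cache_params are hypothetical names, and the struct keeps only the
four fields the patch description mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-in for GCC's struct processor_costs:
   only the four fields named in the patch description are kept.  */
struct costs
{
  int simultaneous_prefetches;
  int prefetch_block;
  int l1_cache_size;
  int l2_cache_size;
};

/* Return ACTIVE unless it is zero (as in a size-cost table), in which
   case fall back to the value from the tuning table.  */
static int
pick (int active, int tune)
{
  return active ? active : tune;
}

/* Fill *OUT with the effective parameters for ACTIVE, falling back to
   TUNE field by field.  */
static void
effective_cache_params (const struct costs *active,
                        const struct costs *tune,
                        struct costs *out)
{
  assert (active != NULL && tune != NULL && out != NULL);
  out->simultaneous_prefetches
    = pick (active->simultaneous_prefetches, tune->simultaneous_prefetches);
  out->prefetch_block = pick (active->prefetch_block, tune->prefetch_block);
  out->l1_cache_size = pick (active->l1_cache_size, tune->l1_cache_size);
  out->l2_cache_size = pick (active->l2_cache_size, tune->l2_cache_size);
}
```

With an all-zero "size" table as ACTIVE, every field of *OUT comes from TUNE;
any numbers fed to the sketch are made up and carry no meaning for a real
processor.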

[PATCH][1/n] VRP and anti-range handling

2012-06-13 Thread Richard Guenther


I am trying to refresh my patches for PR30318 and while doing so
I am about to re-organize how to deal with them in general.
Basically reduce operations to primitives and combine them
where necessary instead of open-coding all possibilities.  I've
started some of this last year (handling NEGATE_EXPR X as 0 - X, etc.)
and this will continue it for the handling of anti-ranges.
The simple idea is that X op ~[] is the same as (X op []') U (X op []'')
with two suitable ranges []' and []'' derived from the anti-range ~[].

Thus, as a starter, the following improves the range primitive vrp_meet
(which computes the union of ranges).
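
As a rough illustration of why meeting two VR_RANGEs may yield a
VR_ANTI_RANGE, here is a minimal sketch over plain int bounds.  It is not the
tree-vrp.c code: struct vr, meet_ranges and the use of INT_MIN/INT_MAX as the
type bounds are assumptions, and equivalence sets and symbolic bounds are
ignored.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical, simplified value range: either [min, max] or ~[min, max].  */
enum kind { RANGE, ANTI_RANGE };
struct vr { enum kind kind; int min, max; };

/* Meet (union) of two plain ranges.  If one range starts at the type
   minimum, the other ends at the type maximum and there is a gap between
   them, the union is exactly the anti-range of the gap.  Otherwise fall
   back to the convex hull, which over-approximates the union.  */
static struct vr
meet_ranges (struct vr a, struct vr b)
{
  struct vr lo, hi, r;

  assert (a.kind == RANGE && b.kind == RANGE);
  assert (a.min <= a.max && b.min <= b.max);

  /* Order the operands by lower bound.  */
  lo = a.min <= b.min ? a : b;
  hi = a.min <= b.min ? b : a;

  /* The subtraction is done in long long to avoid int overflow.  */
  if (lo.min == INT_MIN && hi.max == INT_MAX
      && (long long) hi.min - lo.max > 1)
    {
      r.kind = ANTI_RANGE;
      r.min = lo.max + 1;
      r.max = hi.min - 1;
      return r;
    }

  r.kind = RANGE;
  r.min = lo.min;
  r.max = lo.max > hi.max ? lo.max : hi.max;
  return r;
}
```

For instance, meeting [INT_MIN, 5] with [10, INT_MAX] gives ~[6, 9] in this
sketch, while [0, 5] and [3, 10] give the hull [0, 10].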

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2012-06-13  Richard Guenther  

* tree-vrp.c (vrp_meet): Properly meet equivalent ranges.
Handle meeting two VR_RANGE to a VR_ANTI_RANGE.  Implement
all possible meetings of VR_RANGE with VR_ANTI_RANGE and
VR_ANTI_RANGE with VR_ANTI_RANGE.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c.orig 2012-06-13 15:41:59.0 +0200
--- gcc/tree-vrp.c  2012-06-13 15:52:26.504640952 +0200
*** vrp_meet (value_range_t *vr0, value_rang
*** 6914,7000 
return;
  }
  
!   if (vr0->type == VR_RANGE && vr1->type == VR_RANGE)
  {
int cmp;
tree min, max;
  
!   /* Compute the convex hull of the ranges.  The lower limit of
!  the new range is the minimum of the two ranges.  If they
 cannot be compared, then give up.  */
-   cmp = compare_values (vr0->min, vr1->min);
-   if (cmp == 0 || cmp == 1)
- min = vr1->min;
-   else if (cmp == -1)
- min = vr0->min;
-   else
-   goto give_up;
- 
-   /* Similarly, the upper limit of the new range is the maximum
-  of the two ranges.  If they cannot be compared, then
-give up.  */
-   cmp = compare_values (vr0->max, vr1->max);
-   if (cmp == 0 || cmp == -1)
- max = vr1->max;
-   else if (cmp == 1)
- max = vr0->max;
else
!   goto give_up;
! 
!   /* Check for useless ranges.  */
!   if (INTEGRAL_TYPE_P (TREE_TYPE (min))
! && ((vrp_val_is_min (min) || is_overflow_infinity (min))
! && (vrp_val_is_max (max) || is_overflow_infinity (max))))
!   goto give_up;
! 
!   /* The resulting set of equivalences is the intersection of
!the two sets.  */
!   if (vr0->equiv && vr1->equiv && vr0->equiv != vr1->equiv)
! bitmap_and_into (vr0->equiv, vr1->equiv);
!   else if (vr0->equiv && !vr1->equiv)
! bitmap_clear (vr0->equiv);
! 
!   set_value_range (vr0, vr0->type, min, max, vr0->equiv);
! }
!   else if (vr0->type == VR_ANTI_RANGE && vr1->type == VR_ANTI_RANGE)
! {
!   /* Two anti-ranges meet only if their complements intersect.
!  Only handle the case of identical ranges.  */
!   if (compare_values (vr0->min, vr1->min) == 0
! && compare_values (vr0->max, vr1->max) == 0
! && compare_values (vr0->min, vr0->max) == 0)
!   {
! /* The resulting set of equivalences is the intersection of
!the two sets.  */
! if (vr0->equiv && vr1->equiv && vr0->equiv != vr1->equiv)
!   bitmap_and_into (vr0->equiv, vr1->equiv);
! else if (vr0->equiv && !vr1->equiv)
!   bitmap_clear (vr0->equiv);
}
!   else
!   goto give_up;
  }
else if (vr0->type == VR_ANTI_RANGE || vr1->type == VR_ANTI_RANGE)
  {
!   /* For a numeric range [VAL1, VAL2] and an anti-range ~[VAL3, VAL4],
!  only handle the case where the ranges have an empty intersection.
!The result of the meet operation is the anti-range.  */
!   if (!symbolic_range_p (vr0)
! && !symbolic_range_p (vr1)
! && !value_ranges_intersect_p (vr0, vr1))
!   {
! /* Copy most of VR1 into VR0.  Don't copy VR1's equivalence
!set.  We need to compute the intersection of the two
!equivalence sets.  */
! if (vr1->type == VR_ANTI_RANGE)
!   set_value_range (vr0, vr1->type, vr1->min, vr1->max, vr0->equiv);
  
! /* The resulting set of equivalences is the intersection of
!the two sets.  */
! if (vr0->equiv && vr1->equiv && vr0->equiv != vr1->equiv)
!   bitmap_and_into (vr0->equiv, vr1->equiv);
! else if (vr0->equiv && !vr1->equiv)
!   bitmap_clear (vr0->equiv);
}
else
goto give_up;
--- 6914,7066 
return;
  }
  
!   if (vr0->type == vr1->type
!   && compare_values (vr0->min, vr1->min) == 0
!   && compare_values (vr0->max, vr1->max) == 0)
! {
!   /* If the value-ranges are identical just intersect
!their equivalencies.  */
! }
!   else if (vr0->type == VR_RANGE && vr1->type == VR_RANGE)
  {
int cmp;
tree min, max;
  
!

[PATCH][2/n] VRP and anti-range handling

2012-06-13 Thread Richard Guenther


This adds proper VR_ANTI_RANGE handling (for constant ranges)
to all unary and binary operations by decomposing anti-ranges
into VR_RANGEs and vrp_meet-ing the results.
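
The decomposition step can be sketched on plain ints as follows.  This is a
simplified stand-in for the idea, not the function from the patch:
struct range, split_anti_range and the INT_MIN/INT_MAX type bounds are
assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical plain range [min, max]; VALID is 0 for "undefined".  */
struct range { int valid; int min, max; };

/* Split the anti-range ~[AR_MIN, AR_MAX] into at most two ranges whose
   union is the anti-range.  The pieces are stored in *R0 and then *R1;
   the return value is the number of pieces (0, 1 or 2).  */
static int
split_anti_range (int ar_min, int ar_max,
                  struct range *r0, struct range *r1)
{
  struct range *out[2];
  int n = 0;

  assert (r0 != NULL && r1 != NULL);
  assert (ar_min <= ar_max);

  out[0] = r0;
  out[1] = r1;
  r0->valid = r1->valid = 0;

  /* Piece below the excluded interval: [INT_MIN, AR_MIN - 1].  */
  if (ar_min != INT_MIN)
    {
      out[n]->valid = 1;
      out[n]->min = INT_MIN;
      out[n]->max = ar_min - 1;
      n++;
    }

  /* Piece above the excluded interval: [AR_MAX + 1, INT_MAX].  */
  if (ar_max != INT_MAX)
    {
      out[n]->valid = 1;
      out[n]->min = ar_max + 1;
      out[n]->max = INT_MAX;
      n++;
    }

  return n;
}
```

An operation on the anti-range is then carried out once per piece and the
partial results are met; when the excluded interval touches a type bound only
one piece remains and no meet is needed.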

Bootstrapped and tested on x86_64-unknown-linux-gnu with some
minor fallout that prompted me to improve vrp_meet.  Re-bootstrap & test
scheduled after that patch is in.

Richard.

2012-06-13  Richard Guenther  

* tree-vrp.c (VR_INITIALIZER): New define.
(ranges_from_anti_range): New function.
(extract_range_from_binary_expr_1): Decompose operations on
VR_ANTI_RANGEs to operations on VR_RANGE.
(extract_range_from_unary_expr_1): Likewise.
(extract_range_from_binary_expr_1, extract_range_from_binary_expr,
extract_range_from_unary_expr_1, extract_range_from_unary_expr,
extract_range_from_cond_expr, adjust_range_with_scev,
vrp_visit_assignment_or_call, vrp_visit_phi_node,
simplify_bit_ops_using_ranges): Use VR_INITIALIZER.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c.orig 2012-06-13 12:08:27.0 +0200
--- gcc/tree-vrp.c  2012-06-13 12:46:24.177027500 +0200
*** struct value_range_d
*** 76,81 
--- 76,83 
  
  typedef struct value_range_d value_range_t;
  
+ #define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL }
+ 
  /* Set of SSA names found live during the RPO traversal of the function
 for still active basic-blocks.  */
  static sbitmap *live;
*** zero_nonzero_bits_from_vr (value_range_t
*** 2216,2221 
--- 2218,2271 
return true;
  }
  
+ /* Create two value-ranges in *VR0 and *VR1 from the anti-range *AR
+so that *VR0 U *VR1 == *AR.  Returns true if that is possible,
+false otherwise.  If *AR can be represented with a single range
+*VR1 will be VR_UNDEFINED.  */
+ 
+ static bool
+ ranges_from_anti_range (value_range_t *ar,
+   value_range_t *vr0, value_range_t *vr1)
+ {
+   tree type = TREE_TYPE (ar->min);
+ 
+   vr0->type = VR_UNDEFINED;
+   vr1->type = VR_UNDEFINED;
+ 
+   if (ar->type != VR_ANTI_RANGE
+   || TREE_CODE (ar->min) != INTEGER_CST
+   || TREE_CODE (ar->max) != INTEGER_CST
+   || !vrp_val_min (type)
+   || !vrp_val_max (type))
+ return false;
+ 
+   if (!vrp_val_is_min (ar->min))
+ {
+   vr0->type = VR_RANGE;
+   vr0->min = vrp_val_min (type);
+   vr0->max
+   = double_int_to_tree (type,
+ double_int_sub (tree_to_double_int (ar->min),
+ double_int_one));
+ }
+   if (!vrp_val_is_max (ar->max))
+ {
+   vr1->type = VR_RANGE;
+   vr1->min
+   = double_int_to_tree (type,
+ double_int_add (tree_to_double_int (ar->max),
+ double_int_one));
+   vr1->max = vrp_val_max (type);
+ }
+   if (vr0->type == VR_UNDEFINED)
+ {
+   *vr0 = *vr1;
+   vr1->type = VR_UNDEFINED;
+ }
+ 
+   return vr0->type != VR_UNDEFINED;
+ }
+ 
  /* Helper to extract a value-range *VR for a multiplicative operation
 *VR0 CODE *VR1.  */
  
*** extract_range_from_binary_expr_1 (value_
*** 2379,2384 
--- 2429,2435 
  value_range_t *vr0_, value_range_t *vr1_)
  {
value_range_t vr0 = *vr0_, vr1 = *vr1_;
+   value_range_t vrtem0 = VR_INITIALIZER, vrtem1 = VR_INITIALIZER;
enum value_range_type type;
tree min = NULL_TREE, max = NULL_TREE;
int cmp;
*** extract_range_from_binary_expr_1 (value_
*** 2429,2434 
--- 2480,2515 
else if (vr1.type == VR_UNDEFINED)
  set_value_range_to_varying (&vr1);
  
+   /* Now canonicalize anti-ranges to ranges when they are not symbolic
+  and express ~[] op X as ([]' op X) U ([]'' op X).  */
+   if (vr0.type == VR_ANTI_RANGE
+   && ranges_from_anti_range (&vr0, &vrtem0, &vrtem1))
+ {
+   extract_range_from_binary_expr_1 (vr, code, expr_type, &vrtem0, vr1_);
+   if (vrtem1.type != VR_UNDEFINED)
+   {
+ value_range_t vrres = VR_INITIALIZER;
+ extract_range_from_binary_expr_1 (&vrres, code, expr_type,
+   &vrtem1, vr1_);
+ vrp_meet (vr, &vrres);
+   }
+   return;
+ }
+   /* Likewise for X op ~[].  */
+   if (vr1.type == VR_ANTI_RANGE
+   && ranges_from_anti_range (&vr1, &vrtem0, &vrtem1))
+ {
+   extract_range_from_binary_expr_1 (vr, code, expr_type, vr0_, &vrtem0);
+   if (vrtem1.type != VR_UNDEFINED)
+   {
+ value_range_t vrres = VR_INITIALIZER;
+ extract_range_from_binary_expr_1 (&vrres, code, expr_type,
+   vr0_, &vrtem1);
+ vrp_meet (vr, &vrres);
+   }
+   return;
+ }
+ 
/* The type of the resulting value range defaults to VR0.TYPE.  */
type = vr0.type;
  
*** extr

Re: Committed: atomic support for CRIS

2012-06-13 Thread Mike Stump

On Jun 12, 2012, at 4:47 PM, Hans-Peter Nilsson wrote:
>> From: Richard Henderson 
>> Date: Tue, 12 Jun 2012 23:04:02 +0200
> Putting a lot of trust onto users and libraries there, to choose
> the right model...

My take: very, very few people will actually write code that plays with models
and barriers...  Just like in the past.  Instead, people will layer prettier
APIs on top of it and explain the world through that lens, and people will use
those.  In the compiler and compiler libraries, we merely provide all the
primitives necessary for an API implementor to squeeze out all the performance
they need.  We fail when we leave performance on the table in terms of having
extra operations that aren't necessary, as that limits the utility of the
library.

Re: [PATCH 3/3] rs6000: Rewrite sync patterns for atomic; expand early.

2012-06-13 Thread Richard Henderson

On 2012-06-12 16:16, David Edelsohn wrote:
> Should Altivec and SSE be used for TImode, and AVX for OImode?

I dunno about Altivec, but SSE/AVX loads are not guaranteed atomic, so, no.


r~

[PATCH, i386]: Some more soft-fp cleanups

2012-06-13 Thread Uros Bizjak

Hello!

A couple of #defines can be moved to the shared header, there is no need to
mask _fex with the exception mask, and fnstcw should be marked volatile, since
it depends on the hidden FP control word.

2012-06-13  Uros Bizjak  

* config/i386/32/sfp-machine.h (_FP_NANSIGN_S, _FP_NANSIGN_D,
_FP_NANSIGN_E, _FP_NANSIGN_Q): Move ...
* config/i386/64/sfp-machine.h: ... (delete here) ...
* config/i386/sfp-machine.h: ... to here.
(FP_EX_MASK): Remove.
(FP_RND_MASK): New.
(FP_INIT_ROUNDMODE): Declare asm as volatile.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
Index: config/i386/sfp-machine.h
===
--- config/i386/sfp-machine.h   (revision 188515)
+++ config/i386/sfp-machine.h   (working copy)
@@ -14,8 +14,13 @@
 #include "config/i386/32/sfp-machine.h"
 #endif
 
-#define _FP_KEEPNANFRACP 1
+#define _FP_KEEPNANFRACP   1
 
+#define _FP_NANSIGN_S  1
+#define _FP_NANSIGN_D  1
+#define _FP_NANSIGN_E  1
+#define _FP_NANSIGN_Q  1
+
 /* Here is something Intel misdesigned: the specs don't define
the case where we have two NaNs with same mantissas, but
different sign. Different operations pick up different NaNs.  */
@@ -42,13 +47,11 @@
 #define FP_EX_UNDERFLOW  0x10
 #define FP_EX_INEXACT  0x20
 
-#define FP_EX_MASK 0x3f
-
 void __sfp_handle_exceptions (int);
 
 #define FP_HANDLE_EXCEPTIONS   \
   do { \
-if (_fex & FP_EX_MASK) \
+if (_fex)  \
   __sfp_handle_exceptions (_fex);  \
   } while (0);
 
@@ -57,15 +60,17 @@
 #define FP_RND_PINF  0x800
 #define FP_RND_MINF  0x400
 
+#define FP_RND_MASK  0xc00
+
 #define _FP_DECL_EX \
   unsigned short _fcw __attribute__ ((unused)) = FP_RND_NEAREST
 
-#define FP_INIT_ROUNDMODE  \
-  do { \
-__asm__ ("fnstcw %0" : "=m" (_fcw));   \
+#define FP_INIT_ROUNDMODE  \
+  do { \
+__asm__ __volatile__ ("fnstcw\t%0" : "=m" (_fcw)); \
   } while (0)
 
-#define FP_ROUNDMODE   (_fcw & 0xc00)
+#define FP_ROUNDMODE   (_fcw & FP_RND_MASK)
 
 #define __LITTLE_ENDIAN  1234
 #define __BIG_ENDIAN  4321
Index: config/i386/32/sfp-machine.h
===
--- config/i386/32/sfp-machine.h(revision 188515)
+++ config/i386/32/sfp-machine.h(working copy)
@@ -76,7 +76,3 @@
16byte since soft-fp emulation is done in 16byte.  */
 #define _FP_NANFRAC_E  _FP_QNANBIT_E, 0, 0, 0
 #define _FP_NANFRAC_Q  _FP_QNANBIT_Q, 0, 0, 0
-#define _FP_NANSIGN_S  1
-#define _FP_NANSIGN_D  1
-#define _FP_NANSIGN_E  1
-#define _FP_NANSIGN_Q  1
Index: config/i386/64/sfp-machine.h
===
--- config/i386/64/sfp-machine.h(revision 188515)
+++ config/i386/64/sfp-machine.h(working copy)
@@ -17,7 +17,3 @@
 #define _FP_NANFRAC_D  _FP_QNANBIT_D
 #define _FP_NANFRAC_E  _FP_QNANBIT_E, 0
 #define _FP_NANFRAC_Q  _FP_QNANBIT_Q, 0
-#define _FP_NANSIGN_S  1
-#define _FP_NANSIGN_D  1
-#define _FP_NANSIGN_E  1
-#define _FP_NANSIGN_Q  1

Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support

2012-06-13 Thread Matt Turner

On Wed, Jun 13, 2012 at 3:26 AM, nick clifton  wrote:
> Hi Matt, Hi Xinyu,
>
>
>> This series was written by Marvell and sent by Xinyu Qi
>> a number of times in the last year.
>
>
> Sorry for the long delay in reviewing these patches.  Overall they were
> fine, with only a few, very minor, formatting issues.  I have committed the
> entire series of patches to the mainline.

Great! Thank you so much! Thanks to Ramana for the reviews!

>> For 4.7 and 4.6 please consider committing my patch
>> "[PATCH] arm: Fix iwmmxt shift and logical intrinsics (PR 35294)."
>> which only fixes the logical and shift intrinsics.

Sounds good.

There's also a trivial documentation fix:

[PATCH 1/2] doc: Correct __builtin_arm_tinsr prototype documentation

and a test to exercise the intrinsics:

[PATCH 2/2] arm: add iwMMXt mmx-2.c test

Thanks a lot!

Matt

Re: [PATCH 3/3] rs6000: Rewrite sync patterns for atomic; expand early.

2012-06-13 Thread Richard Henderson

On 2012-06-13 01:33, Richard Guenther wrote:
> If you are sure it won't break anything go ahead (sooner than later please).

Done.


r~

Re: [PATCH 3/7] Add stdint.h wrapper for VxWorks.

2012-06-13 Thread rbmj


On 06/12/2012 04:20 PM, Joseph S. Myers wrote:
> On Tue, 12 Jun 2012, rbmj wrote:
>> On 06/12/2012 11:47 AM, Joseph S. Myers wrote:
>>> On Wed, 6 Jun 2012, rbmj wrote:
>>>> The stdint.h doesn't have all the typedefs needed for standards
>>>> compliance, so add a hack that adds all of the needed typedefs
>>>> to be fully compliant to the standard.  Fixes broken libstdc++.
>>>
>>> If you're touching VxWorks stdint.h perhaps you could also define the
>>> relevant target macros for GCC to have built-in knowledge of the types?
>>> This is needed for the Fortran C bindings to work correctly, at least, and
>>> ensures char16_t and char32_t (C11/C++11) are correct as well.  (You could
>>> then set use_gcc_stdint to "wrap" in config.gcc if you want to use GCC's
>>> stdint.h for freestanding compilations.)
>>
>> I would be happy to, but I'm not aware of what macros those are.  If you could
>> point me to some documentation or explain to me what macros I need to define
>> I'll update the patch.
>
>   was my original
> announcement for target OS maintainers.  You should define the same set of
> macros as in gcc/config/glibc-stdint.h (but, obviously, to values
> appropriate to VxWorks), plus INTMAX_TYPE and UINTMAX_TYPE if the default
> values of those macros are wrong for VxWorks, and make sure all the
> c99-stdint-*.c tests pass.

Since u?int.*_t are already defined, would this work?  Or should I use 
the non-typedef'd versions?  Also, I'm not exactly sure how to run the 
regression tests with a cross compiler.  I'm still new to everything, 
bear with me :-)


#define SIG_ATOMIC_TYPE "int"

#define INT8_TYPE "int8_t"
#define INT16_TYPE "int16_t"
#define INT32_TYPE "int32_t"
#define INT64_TYPE "int64_t"
#define UINT8_TYPE "uint8_t"
#define UINT16_TYPE "uint16_t"
#define UINT32_TYPE "uint32_t"
#define UINT64_TYPE "uint64_t"

#define INT_LEAST8_TYPE "int_least8_t"
#define INT_LEAST16_TYPE "int_least16_t"
#define INT_LEAST32_TYPE "int_least32_t"
#define INT_LEAST64_TYPE "int_least64_t"
#define UINT_LEAST8_TYPE "uint_least8_t"
#define UINT_LEAST16_TYPE "uint_least16_t"
#define UINT_LEAST32_TYPE "uint_least32_t"
#define UINT_LEAST64_TYPE "uint_least64_t"

#define INT_FAST8_TYPE "int_fast8_t"
#define INT_FAST16_TYPE "int_fast16_t"
#define INT_FAST32_TYPE "int_fast32_t"
#define INT_FAST64_TYPE "int_fast64_t"
#define UINT_FAST8_TYPE "uint_fast8_t"
#define UINT_FAST16_TYPE "uint_fast16_t"
#define UINT_FAST32_TYPE "uint_fast32_t"
#define UINT_FAST64_TYPE "uint_fast64_t"

#define INTPTR_TYPE "intptr_t"
#define UINTPTR_TYPE "uintptr_t"

Robert

Re: Committed: atomic support for CRIS

2012-06-13 Thread Hans-Peter Nilsson

> From: Mike Stump 
> CC: "r...@redhat.com" , "gcc-patches@gcc.gnu.org"
>   
> Date: Wed, 13 Jun 2012 17:06:39 +0200
> 
> On Jun 12, 2012, at 4:47 PM, Hans-Peter Nilsson wrote:
> >> From: Richard Henderson 
> >> Date: Tue, 12 Jun 2012 23:04:02 +0200
> > Putting a lot of trust onto users and libraries there, to choose
> > the right model...
> 
> My take: very, very few people will actually write code that
> plays with models and barriers...  Just like in the past.
> Instead, people will layer prettier APIs on top of it and
> explain the world through that lens, and people will use those.
> In the compiler and compiler libraries, we merely provide all
> the primitives necessary for an API implementor to squeeze out
> all the performance they need.  We fail when we leave
> performance on the table in terms of having extra operations
> that aren't necessary, as that limits the utility of the
> library.

Much ado about nothing.  No news. :)

My take is that I won't add useless things to the port that have
no practical value, like another copy-paste instance of
_{pre,post}_atomic_barrier (model) (all alike).  But, if
there had been a generic version or helper function, I'd
probably have used it (say, bool generate_barrier_p (enum
memmodel, bool pre)).  A barrier should be created on either but
maybe-not both sides of the atomic operation.  I took the safe
route to do both (when at all), so no memory accesses can sneak
in "behind" the atomic operation.  If this causes an extra
barrier to be added, it will have zero performance effect for
this architecture; there are no scheduling opportunities
(blessed "sneaking") between the atomic operation and the
barrier anyway.  If one barrier too little is added for one
reason or another: welcome rare bugs.

On the other hand, I should have added a comment with the more
diplomatic version of the above near the code rth quoted. :)

brgds, H-P
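
A generic helper of the kind suggested above could look roughly like this.
It is a sketch only: enum memmodel is spelled out locally with hypothetical
MM_* names instead of using GCC's own definition, and treating "consume" like
"acquire" is a conservative assumption rather than anything stated in this
thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Local, hypothetical stand-in for the compiler's memory model enum.  */
enum memmodel
{
  MM_RELAXED,
  MM_CONSUME,
  MM_ACQUIRE,
  MM_RELEASE,
  MM_ACQ_REL,
  MM_SEQ_CST
};

/* Return true if a barrier is wanted before (PRE) or after (!PRE) an
   atomic operation performed with memory model MODEL.  */
static bool
generate_barrier_p (enum memmodel model, bool pre)
{
  assert ((int) model >= MM_RELAXED && (int) model <= MM_SEQ_CST);

  switch (model)
    {
    case MM_RELAXED:
      /* No ordering requested.  */
      return false;
    case MM_CONSUME:
    case MM_ACQUIRE:
      /* Later accesses must not move before the operation.  */
      return !pre;
    case MM_RELEASE:
      /* Earlier accesses must not move after the operation.  */
      return pre;
    case MM_ACQ_REL:
    case MM_SEQ_CST:
      /* Ordering is needed on both sides.  */
      return true;
    }

  /* Unknown model: take the safe route and emit the barrier.  */
  return true;
}
```

A port that prefers the safe route described above would simply emit both
barriers whenever the model is not relaxed and ignore the PRE distinction.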

[RFA 4.7 PATCH, ia64]: Change ior attribute to "or" in sync.md

2012-06-13 Thread Uros Bizjak

Hello!

This patch fixes a trivial oversight in the name of the "or" family of
sync functions.

2012-06-12  Uros Bizjak  

* config/ia64/sync.md (fetchop_name): Change ior attribute to "or".

Tested on ia64-unknown-linux-gnu.  Is it still OK to squeeze this
trivial one-liner to 4.7 branch?

Uros.

Index: config/ia64/sync.md
===
--- config/ia64/sync.md (revision 188475)
+++ config/ia64/sync.md (working copy)
@@ -28,7 +28,7 @@

 (define_code_iterator FETCHOP [plus minus ior xor and])
 (define_code_attr fetchop_name
-  [(plus "add") (minus "sub") (ior "ior") (xor "xor") (and "and")])
+  [(plus "add") (minus "sub") (ior "or") (xor "xor") (and "and")])

 (define_expand "mem_thread_fence"
   [(match_operand:SI 0 "const_int_operand" "")];; model

Re: PATCH: PR target/53647: Set proper cache values when needed

2012-06-13 Thread Uros Bizjak

On Wed, Jun 13, 2012 at 4:47 PM, H.J. Lu  wrote:

> On i386, ix86_size_cost will be used for -Os, which has zero for
> simultaneous_prefetches, prefetch_block, l1_cache_size and l2_cache_size.
> This patch adds ix86_tune_cost and uses it for simultaneous_prefetches,
> prefetch_block, l1_cache_size and l2_cache_size if ones from ix86_cost
> are zero.  OK to install?
>
> 2012-06-13  H.J. Lu  
>
>        PR target/53647
>        * config/i386/i386.c (ix86_tune_cost): New variable.
>        (ix86_option_override_internal): Set ix86_tune_cost.  Use
>        ix86_tune_cost for simultaneous_prefetches, prefetch_block,
>        l1_cache_size and l2_cache_size if ones from ix86_cost are
>        zero.
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 13755f4..2e64d55 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -1874,6 +1874,10 @@ struct processor_costs generic32_cost = {
>   1,                                   /* cond_not_taken_branch_cost.  */
>  };
>
> +/* Set by -mtune.  */
> +const struct processor_costs *ix86_tune_cost = &pentium_cost;
> +
> +/* Set by -mtune or -Os.  */

/* Set by -mtune, overridden by -Os.  */

>  const struct processor_costs *ix86_cost = &pentium_cost;

We probably don't need to initialize these variables, but it won't hurt.

>  /* Processor feature/optimization bitmasks.  */
> @@ -3546,6 +3550,7 @@ ix86_option_override_internal (bool main_args_p)
>        flag_pcc_struct_return = DEFAULT_PCC_STRUCT_RETURN;
>     }
>
> +  ix86_tune_cost = processor_target_table[ix86_tune].cost;
>   if (optimize_size)
>     ix86_cost = &ix86_size_cost;
>   else
> @@ -3794,16 +3799,27 @@ ix86_option_override_internal (bool main_args_p)
>     flag_schedule_insns_after_reload = flag_schedule_insns = 0;
>
>   maybe_set_param_value (PARAM_SIMULTANEOUS_PREFETCHES,
> -                        ix86_cost->simultaneous_prefetches,
> +                        ix86_cost->simultaneous_prefetches
> +                        ? ix86_cost->simultaneous_prefetches
> +                        : ix86_tune_cost->simultaneous_prefetches,
>                         global_options.x_param_values,
>                         global_options_set.x_param_values);
> -  maybe_set_param_value (PARAM_L1_CACHE_LINE_SIZE, ix86_cost->prefetch_block,
> +  maybe_set_param_value (PARAM_L1_CACHE_LINE_SIZE,
> +                        ix86_cost->prefetch_block
> +                        ? ix86_cost->prefetch_block
> +                        : ix86_tune_cost->prefetch_block,
>                         global_options.x_param_values,
>                         global_options_set.x_param_values);
> -  maybe_set_param_value (PARAM_L1_CACHE_SIZE, ix86_cost->l1_cache_size,
> +  maybe_set_param_value (PARAM_L1_CACHE_SIZE,
> +                        ix86_cost->l1_cache_size
> +                        ? ix86_cost->l1_cache_size
> +                        : ix86_tune_cost->l1_cache_size,
>                         global_options.x_param_values,
>                         global_options_set.x_param_values);
> -  maybe_set_param_value (PARAM_L2_CACHE_SIZE, ix86_cost->l2_cache_size,
> +  maybe_set_param_value (PARAM_L2_CACHE_SIZE,
> +                        ix86_cost->l2_cache_size
> +                        ? ix86_cost->l2_cache_size
> +                        : ix86_tune_cost->l2_cache_size,
>                         global_options.x_param_values,
>                         global_options_set.x_param_values);
>

Just set these params directly from ix86_tune_cost. We know these are
the same, unless -Os clears them.

OK with these changes.

Thanks,
Uros.

Re: [RFA 4.7 PATCH, ia64]: Change ior attribute to "or" in sync.md

2012-06-13 Thread Jakub Jelinek

On Wed, Jun 13, 2012 at 07:04:54PM +0200, Uros Bizjak wrote:
> This patch fixes a trivial oversight in the name of "or" family of
> sync functions.
> 
> 2012-06-12  Uros Bizjak  
> 
>   * config/ia64/sync.md (fetchop_name): Change ior attribute to "or".
> 
> Tested on ia64-unknown-linux-gnu.  Is it still OK to squeeze this
> trivial one-liner to 4.7 branch?

Ok.

> --- config/ia64/sync.md (revision 188475)
> +++ config/ia64/sync.md (working copy)
> @@ -28,7 +28,7 @@
> 
>  (define_code_iterator FETCHOP [plus minus ior xor and])
>  (define_code_attr fetchop_name
> -  [(plus "add") (minus "sub") (ior "ior") (xor "xor") (and "and")])
> +  [(plus "add") (minus "sub") (ior "or") (xor "xor") (and "and")])
> 
>  (define_expand "mem_thread_fence"
>[(match_operand:SI 0 "const_int_operand" "")];; model

Jakub

Re: PATCH: PR target/53647: Set proper cache values when needed

2012-06-13 Thread H.J. Lu

On Wed, Jun 13, 2012 at 10:40 AM, Uros Bizjak  wrote:
> On Wed, Jun 13, 2012 at 4:47 PM, H.J. Lu  wrote:
>
>> On i386, ix86_size_cost will be used for -Os, which has zero for
>> simultaneous_prefetches, prefetch_block, l1_cache_size and l2_cache_size.
>> This patch adds ix86_tune_cost and uses it for simultaneous_prefetches,
>> prefetch_block, l1_cache_size and l2_cache_size if ones from ix86_cost
>> are zero.  OK to install?
>>
>> 2012-06-13  H.J. Lu  
>>
>>        PR target/53647
>>        * config/i386/i386.c (ix86_tune_cost): New variable.
>>        (ix86_option_override_internal): Set ix86_tune_cost.  Use
>>        ix86_tune_cost for simultaneous_prefetches, prefetch_block,
>>        l1_cache_size and l2_cache_size if ones from ix86_cost are
>>        zero.
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 13755f4..2e64d55 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -1874,6 +1874,10 @@ struct processor_costs generic32_cost = {
>>   1,                                   /* cond_not_taken_branch_cost.  */
>>  };
>>
>> +/* Set by -mtune.  */
>> +const struct processor_costs *ix86_tune_cost = &pentium_cost;
>> +
>> +/* Set by -mtune or -Os.  */
>
> /* Set by -mtune, overridden by -Os.  */
>
>>  const struct processor_costs *ix86_cost = &pentium_cost;
>
> We probably don't need to initialize these variables, but won't hurt.
>
> Just set these params directly from ix86_tune_cost. We know these are
> the same, unless -Os clears them.
>
> OK with these changes.
>
> Thanks,
> Uros.

This is the patch I checked in.

Thanks.

-- 
H.J.
--
PR target/53647
* config/i386/i386.c (ix86_tune_cost): New variable.
(ix86_option_override_internal): Set ix86_tune_cost.  Use
ix86_tune_cost for simultaneous_prefetches, prefetch_block,
l1_cache_size and l2_cache_size.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 13755f4..04a5edc 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -1874,6 +1874,10 @@ struct processor_costs generic32_cost = {
   1,   /* cond_not_taken_branch_cost.  */
 };

+/* Set by -mtune.  */
+const struct processor_costs *ix86_tune_cost = &pentium_cost;
+
+/* Set by -mtune or -Os.  */
 const struct processor_costs *ix86_cost = &pentium_cost;

 /* Processor feature/optimization bitmasks.  */
@@ -3546,10 +3550,11 @@ ix86_option_override_internal (bool main_args_p)
flag_pcc_struct_return = DEFAULT_PCC_STRUCT_RETURN;
 }

+  ix86_tune_cost = processor_target_table[ix86_tune].cost;
   if (optimize_size)
 ix86_cost = &ix86_size_cost;
   else
-ix86_cost = processor_target_table[ix86_tune].cost;
+ix86_cost = ix86_tune_cost;

   /* Arrange to set up i386_stack_locals for all functions.  */
   init_machine_status = ix86_init_machine_status;
@@ -3794,16 +3799,19 @@ ix86_option_override_internal (bool main_args_p)
 flag_schedule_insns_after_reload = flag_schedule_insns = 0;

   maybe_set_param_value (PARAM_SIMULTANEOUS_PREFETCHES,
-ix86_cost->simultaneous_prefetches,
+ix86_tune_cost->simultaneous_prefetches,
 global_options.x_param_values,
 global_options_set.x_param_values);
-  maybe_set_param_value (PARAM_L1_CACHE_LINE_SIZE, ix86_cost->prefetch_block,
+  maybe_set_param_value (PARAM_L1_CACHE_LINE_SIZE,
+ix86_tune_cost->prefetch_block,
 global_options.x_param_values,
 global_options_set.x_param_values);
-  maybe_set_param_value (PARAM_L1_CACHE_SIZE, ix86_cost->l1_cache_size,
+  maybe_set_param_value (PARAM_L1_CACHE_SIZE,
+ix86_tune_cost->l1_cache_size,
 global_options.x_param_values,
 global_options_set.x_param_values);
-  maybe_set_param_value (PARAM_L2_CACHE_SIZE, ix86_cost->l2_cache_size,
+  maybe_set_param_value (PARAM_L2_CACHE_SIZE,
+ix86_tune_cost->l2_cache_size,
 global_options.x_param_values,
 global_options_set.x_param_values);

Re: Make timevar phases mutually exclusive. (issue6302064)

2012-06-13 Thread Joseph S. Myers

On Wed, 13 Jun 2012, Diego Novillo wrote:

> The LTO bits are fine.  I would prefer if an FE maintainer takes a second look
> over the other bits.  Jason, Joseph?

The c-decl.c changes are fine with me.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: MIPS testsuite patch for --with-synci configurations

2012-06-13 Thread Richard Sandiford

Steve Ellcey  writes:
> On Mon, 2012-06-11 at 18:24 -0700, David Daney wrote:
>
>> > This patch adds the -mno-synci flag to MIPS tests that specify an
>> > architecture that does not support synci, thus getting rid of the
>> > warning message and making the tests pass.  Tested with the mips-linux-gnu
>> > and mips-sde-elf targets both with and without --with-synci on the
>> > GCC configuration.
>> >
>> > OK to checkin?
>> 
>> I wonder if it would make more sense to modify the testsuite driver to 
>> take care of this.  It seems like the set of files with the -mno-synci 
>> annotation could easily become different than the set that requires it.
>> 
>> David Daney
>
> I did think about that, but the number of flags that I would have to
> check for in the testsuite driver to decide whether or not to turn on
> -mno-synci was large enough to make me not want to do it that way.
>
> I would need to check the isa= and isa_rev= flags that are currently
> handled in the driver, the -mabi flag, the -march flag, and the -mips
> flag.  The number of values that each flag could have (particularly
> -march) is rather large.  I can imagine people adding a new test and
> forgetting to add -mno-synci but that would be easy to fix and no worse
> than adding a new -march value and not handling it in the test suite
> driver.

I agree with David.  This is really the same kind of situation as we
already have for -m(no-)dsp, etc.  There too we need to override an
explicit -mdsp (which might be added using --target_board, etc.)
when testing a target that doesn't support the DSP ASE.

I think the patch below should be enough.  Spot-checked on mips64-elf
using --target_board mips-sim-idt64/-mips64r2/-msynci.  Could you
give it a go with your target?

Thanks,
Richard


gcc/testsuite/
* gcc.target/mips/mips.exp (mips-dg-options): Handle -msynci.

Index: gcc/testsuite/gcc.target/mips/mips.exp
===
--- gcc/testsuite/gcc.target/mips/mips.exp  2012-02-07 19:23:25.0 +
+++ gcc/testsuite/gcc.target/mips/mips.exp  2012-06-13 18:54:03.507449114 +0100
@@ -839,6 +839,8 @@ proc mips-dg-finish {} {
 #|   |
 # -mdsp   -mno-dsp
 #|   |
+# -msynci -mno-synci
+#|   |
 #+-- gp, abi & arch -+
 #
 # For these purposes, the "gp", "abi" & "arch" option groups are treated
@@ -987,6 +989,7 @@ proc mips-dg-options { args } {
#   - the DSP ASE
if { $isa_rev < 2
 && (($gp_size == 32 && [mips_have_test_option_p options "-mfp64"])
+|| [mips_have_test_option_p options "-msynci"]
 || [mips_have_test_option_p options "-mdsp"]
 || [mips_have_test_option_p options "-mdspr2"]) } {
if { $gp_size == 32 } {
@@ -1150,6 +1153,7 @@ proc mips-dg-options { args } {
mips_make_test_option options "-mfp32"
}
mips_make_test_option options "-mno-dsp"
+   mips_make_test_option options "-mno-synci"
}
unset arch
unset isa

Re: [PATCH 3/7] Add stdint.h wrapper for VxWorks.

2012-06-13 Thread Joseph S. Myers

On Wed, 13 Jun 2012, rbmj wrote:

> Since u?int.*_t are already defined, would this work?  Or should I use the
> non-typedef'd versions?  Also, I'm not exactly sure how to run the regression
> tests with a cross compiler.  I'm still new to everything, bear with me :-)

You have to use the non-typedef versions, with the keywords in the correct 
order - see the documentation of SIZE_TYPE in tm.texi for details.
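
As an illustration of the keyword-order rule, the sketch below checks a
type-name string against a list of canonical spellings (length keyword
first, then "unsigned" if needed, then "int").  The list is written from
that rule for the example and is not GCC's actual table; the EXAMPLE_*
macros are hypothetical and do not describe what VxWorks really needs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative list of spellings that follow the documented order.  */
static const char *const canonical_names[] = {
  "signed char", "unsigned char",
  "short int", "short unsigned int",
  "int", "unsigned int",
  "long int", "long unsigned int",
  "long long int", "long long unsigned int",
};

/* Return true if NAME is spelled exactly like one of the entries.  */
static bool
is_canonical_type_name (const char *name)
{
  size_t i;
  for (i = 0; i < sizeof canonical_names / sizeof canonical_names[0]; i++)
    if (strcmp (name, canonical_names[i]) == 0)
      return true;
  return false;
}

/* Hypothetical target-header choices, shown only for their spelling.  */
#define EXAMPLE_UINT32_TYPE "unsigned int"
#define EXAMPLE_UINT64_TYPE "long long unsigned int"
```

Strings such as "unsigned long long" or "uint32_t" fail this check: the
first has the keywords in the wrong order and omits "int", the second is a
typedef name.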

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH, ARM] New CPU support for Marvell PJ4 cores

2012-06-13 Thread Ramana Radhakrishnan

On 29 May 2012 10:07, Yi-Hsiu Hsu  wrote:
> Hi,
>
> This patch adds a pipeline description for Marvell PJ4 cores.
> I ran the ARM testsuite on arm-linux-gnueabi and found no new regressions.
>
>        * config/arm/marvell-pj4.md: New marvell-pj4 pipeline description.
>        * config/arm/arm.c (arm_issue_rate): Add marvell_pj4.
>        * config/arm/arm-cores.def: Add core marvell-pj4.
>        * config/arm/arm-tune.md: Regenerated.
>        * config/arm/arm-tables.opt: Regenerated.
>        * doc/invoke.texi: Added entry for marvell-pj4.

This command line option should also be added to BE8_LINK_SPEC similar
to what's done for the other v7-a cores.

Ok with that change.

regards,
Ramana



>
>
> Thanks!
>
> P.S. I created the patch from revision 187308, but that revision fails to
> build, so I applied the patch to revision 187623, which builds
> successfully and passes the testsuite.
>

Re: [PATCH 1/3] Add atomic_compare_and_swap, atomic_exchange and atomic_fetch_add patterns.

2012-06-13 Thread Richard Sandiford

Looks good, thanks.  Just a couple of silly comments...

Maxim Kuvyrkov  writes:
> +/* Subroutines of the mips_process_sync_loop.
> +   Emit barriers as needed for the memory MODEL.  */
> +
> +static bool
> +mips_emit_pre_atomic_barrier_p (enum memmodel model)
> +{
> +  switch (model)
> +{
> +case MEMMODEL_RELAXED:
> +case MEMMODEL_CONSUME:
> +case MEMMODEL_ACQUIRE:
> +  return false;
> +case MEMMODEL_RELEASE:
> +case MEMMODEL_ACQ_REL:
> +case MEMMODEL_SEQ_CST:
> +  return true;
> +default:
> +  gcc_unreachable ();
> +}
> +}

Comment is a bit misleading because we don't emit anything here.
How about:

/* Subroutine of mips_process_sync_loop.  Return true if memory
   model MODEL requires a pre-loop (release-style) barrier.  */

> +static bool
> +mips_emit_post_atomic_barrier_p (enum memmodel model)
> +{
> +  switch (model)
> +{
> +case MEMMODEL_RELAXED:
> +case MEMMODEL_CONSUME:
> +case MEMMODEL_RELEASE:
> +  return false;
> +case MEMMODEL_ACQUIRE:
> +case MEMMODEL_ACQ_REL:
> +case MEMMODEL_SEQ_CST:
> +  return true;
> +default:
> +  gcc_unreachable ();
> +}
> +}

/* Subroutine of mips_process_sync_loop.  Return true if memory
   model MODEL requires a post-SC (acquire-style) barrier.  */
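
Put side by side, the two predicates amount to the following table.  This
is a standalone restatement of the quoted functions with a local enum
(the *_sketch names are made up so the example compiles on its own); it
is not a proposed replacement.

```c
#include <stdbool.h>

/* Local copy of the six memory models, for illustration only.  */
enum memmodel_sketch
{
  MM_RELAXED, MM_CONSUME, MM_ACQUIRE, MM_RELEASE, MM_ACQ_REL, MM_SEQ_CST
};

/* Release-style ordering: a barrier is needed before the loop.  */
static bool
needs_pre_barrier_sketch (enum memmodel_sketch m)
{
  return m == MM_RELEASE || m == MM_ACQ_REL || m == MM_SEQ_CST;
}

/* Acquire-style ordering: a barrier is needed after the SC.  */
static bool
needs_post_barrier_sketch (enum memmodel_sketch m)
{
  return m == MM_ACQUIRE || m == MM_ACQ_REL || m == MM_SEQ_CST;
}
```

Only ACQ_REL and SEQ_CST need both barriers; RELAXED and CONSUME need
neither.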

> +  /* CMP  = 0 [delay slot].  */
> +  if (cmp)
> +mips_multi_add_insn ("li\t%0,0", cmp, NULL);

Nitlet, but only one space after "CMP" (here and elsewhere).

> +(define_expand "atomic_exchange"
> +  [(match_operand:GPR 0 "register_operand")
> +   (match_operand:GPR 1 "memory_operand")
> +   (match_operand:GPR 2 "arith_operand")
> +   (match_operand:SI 3 "const_int_operand")]
> +  "GENERATE_LL_SC"
> +{
> +emit_insn (gen_atomic_exchange_llsc (operands[0], operands[1],
> +operands[2], operands[3]));
> +  DONE;
> +})

Excess indentation on "emit_insn" call.  Same for atomic_fetch_add.

Richard

Re: [PATCH 2/3] Add XLP-specific atomic instructions and tweaks.

2012-06-13 Thread Richard Sandiford

Maxim Kuvyrkov  writes:
> diff --git a/gcc/config/mips/sync.md b/gcc/config/mips/sync.md
> index 604aefa..ac953b5 100644
> --- a/gcc/config/mips/sync.md
> +++ b/gcc/config/mips/sync.md
> @@ -607,10 +607,32 @@
> (match_operand:GPR 1 "memory_operand")
> (match_operand:GPR 2 "arith_operand")
> (match_operand:SI 3 "const_int_operand")]
> -  "GENERATE_LL_SC"
> +  "GENERATE_LL_SC || ISA_HAS_SWAP"
>  {
> +  if (!ISA_HAS_SWAP)
>  emit_insn (gen_atomic_exchange_llsc (operands[0], operands[1],
>  operands[2], operands[3]));
> +  else

Please swap this round so that the ISA_HAS_SWAP stuff comes first.
The LLSC code then remains the fallback if new variations are added.
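
The ordering being asked for can be sketched as a plain dispatch
function: test the specialised instruction first and let LL/SC be the
final, unconditional case.  The struct and enum are hypothetical stand-ins
for the ISA_HAS_SWAP and GENERATE_LL_SC conditions, not code from sync.md.

```c
#include <stdbool.h>

enum exchange_impl { IMPL_SWAP, IMPL_LLSC };

/* Hypothetical feature flags standing in for the target macros.  */
struct target_features
{
  bool has_swap;
  bool has_ll_sc;
};

/* Assumes the caller has already checked has_swap || has_ll_sc,
   mirroring the expander condition.  */
static enum exchange_impl
choose_exchange_impl (const struct target_features *t)
{
  /* Specialised instructions are tested first.  */
  if (t->has_swap)
    return IMPL_SWAP;

  /* A future variant would get its own test here.  */

  /* LL/SC stays the generic fallback.  */
  return IMPL_LLSC;
}
```

Written this way, adding another special case never requires touching the
fallback branch.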

> +{
> +  rtx addr;
> +
> +  gcc_assert (MEM_P (operands[1]));
> +  addr = XEXP (operands[1], 0);
> +  if (!REG_P (addr) && can_create_pseudo_p ())
> +/* Workaround a reload bug that hits (lo_sum (reg) (symbol_ref))
> +addresses.  Spill the address to a register upfront to simplify
> +reload's job.  */
> +addr = force_reg (GET_MODE (addr), addr);

If there's a reload bug here, let's fix it.  But...

> @@ -631,15 +653,52 @@
> (set_attr "sync_insn1_op2" "2")
> (set_attr "sync_memmodel" "3")])
>  
> +;; Swap/ldadd instruction accepts only register, no offset, for the address.

["SWAP accepts only register addresses."]

> +;; Therefore, we spell out the MEM verbatim and constrain its address to "d".
> +;; XLP issues implicit sync for swap/ldadd, so no need for an explicit one.

...we should instead define a memory predicate/constraint pair that only
allows register addresses.  Then the expand code would be:

if (!mem_reg_operand (operands[1], mode))
  {
addr = force_reg (Pmode, XEXP (operands[1], 0));
operands[1] = replace_equiv_address (operands[1], addr);
  }

> +(define_insn "atomic_exchange_swap_"
> +  [(set (match_operand:GPR 0 "register_operand" "=d")
> + (unspec_volatile:GPR
> +  [(mem:GPR (match_operand:P 1 "address_operand" "d"))]
> +  UNSPEC_ATOMIC_EXCHANGE))
> +   (set (mem:GPR (match_dup 1))
> + (unspec_volatile:GPR [(match_operand:GPR 2 "arith_operand" "0")]

Should be "register_operand", with a force_reg in the expander.
Same kind of comments for LDADD.

Richard

Re: [PATCH 2/3] Add XLP-specific atomic instructions and tweaks.

2012-06-13 Thread Maxim Kuvyrkov

On 14/06/2012, at 6:50 AM, Richard Sandiford wrote:

> Maxim Kuvyrkov  writes:
>> diff --git a/gcc/config/mips/sync.md b/gcc/config/mips/sync.md
>> index 604aefa..ac953b5 100644
>> --- a/gcc/config/mips/sync.md
>> +++ b/gcc/config/mips/sync.md
>> @@ -607,10 +607,32 @@
>>(match_operand:GPR 1 "memory_operand")
>>(match_operand:GPR 2 "arith_operand")
>>(match_operand:SI 3 "const_int_operand")]
>> -  "GENERATE_LL_SC"
>> +  "GENERATE_LL_SC || ISA_HAS_SWAP"
>> {
>> +  if (!ISA_HAS_SWAP)
>> emit_insn (gen_atomic_exchange_llsc (operands[0], operands[1],
>> operands[2], operands[3]));
>> +  else
> 
> Please swap this round so that the ISA_HAS_SWAP stuff comes first.
> The LLSC code then remains the fallback if new variations are added.

OK.

> 
>> +{
>> +  rtx addr;
>> +
>> +  gcc_assert (MEM_P (operands[1]));
>> +  addr = XEXP (operands[1], 0);
>> +  if (!REG_P (addr) && can_create_pseudo_p ())
>> +/* Workaround a reload bug that hits (lo_sum (reg) (symbol_ref))
>> +   addresses.  Spill the address to a register upfront to simplify
>> +   reload's job.  */
>> +addr = force_reg (GET_MODE (addr), addr);
> 
> If there's a reload bug here, let's fix it.  But...

After a chat with Bernd Schmidt, this is not a bug.  I've already fixed the 
patch per your and Bernd's instructions.  Do you want to look through an 
updated patch or should I just commit it after retesting?

Thanks,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics

Re: [PATCH 1/3] Add atomic_compare_and_swap, atomic_exchange and atomic_fetch_add patterns.

2012-06-13 Thread Maxim Kuvyrkov

On 14/06/2012, at 6:33 AM, Richard Sandiford wrote:

> Looks good, thanks.  Just a couple of silly comments...
> 
> Maxim Kuvyrkov  writes:
>> +/* Subroutines of the mips_process_sync_loop.
>> +   Emit barriers as needed for the memory MODEL.  */
>> +
>> +static bool
>> +mips_emit_pre_atomic_barrier_p (enum memmodel model)
>> +{
>> +  switch (model)
>> +{
>> +case MEMMODEL_RELAXED:
>> +case MEMMODEL_CONSUME:
>> +case MEMMODEL_ACQUIRE:
>> +  return false;
>> +case MEMMODEL_RELEASE:
>> +case MEMMODEL_ACQ_REL:
>> +case MEMMODEL_SEQ_CST:
>> +  return true;
>> +default:
>> +  gcc_unreachable ();
>> +}
>> +}
> 
> Comment is a bit misleading because we don't emit anything here.

Yeap, forgot to update the comment after stealing the code from Alpha's port.

> How about:
> 
> /* Subroutine of mips_process_sync_loop.  Return true if memory
>   model MODEL requires a pre-loop (release-style) barrier.  */
> 
>> +static bool
>> +mips_emit_post_atomic_barrier_p (enum memmodel model)
>> +{
>> +  switch (model)
>> +{
>> +case MEMMODEL_RELAXED:
>> +case MEMMODEL_CONSUME:
>> +case MEMMODEL_RELEASE:
>> +  return false;
>> +case MEMMODEL_ACQUIRE:
>> +case MEMMODEL_ACQ_REL:
>> +case MEMMODEL_SEQ_CST:
>> +  return true;
>> +default:
>> +  gcc_unreachable ();
>> +}
>> +}
> 
> /* Subroutine of mips_process_sync_loop.  Return true if memory
>   model MODEL requires a post-SC (acquire-style) barrier.  */
> 
>> +  /* CMP  = 0 [delay slot].  */
>> +  if (cmp)
>> +mips_multi_add_insn ("li\t%0,0", cmp, NULL);
> 
> Nitlet, but only one space after "CMP" (here and elsewhere).

OK.

> 
>> +(define_expand "atomic_exchange"
>> +  [(match_operand:GPR 0 "register_operand")
>> +   (match_operand:GPR 1 "memory_operand")
>> +   (match_operand:GPR 2 "arith_operand")
>> +   (match_operand:SI 3 "const_int_operand")]
>> +  "GENERATE_LL_SC"
>> +{
>> +emit_insn (gen_atomic_exchange_llsc (operands[0], operands[1],
>> +   operands[2], operands[3]));
>> +  DONE;
>> +})
> 
> Excess indentation on "emit_insn" call.  Same for atomic_fetch_add.

Not if you consider it together with the second patch ;-).

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics

Re: [PATCH 3/3] Avoid emitting useless instructions in mips_process_sync_loop.

2012-06-13 Thread Richard Sandiford

Maxim Kuvyrkov  writes:
> +  /* Don't bother setting returns that are never used.  */
> +  if (cmp && find_reg_note (insn, REG_UNUSED, cmp))
> +cmp = 0;
> +  if (required_oldval && find_reg_note (insn, REG_UNUSED, required_oldval))
> +required_oldval = 0;

required_oldval is an important input (not output).  We can't drop it.
I suppose we could replace oldval with AT if the non-AT register isn't used,
but I'm not sure it's worth it.

The CMP part is OK though.
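
To make the input/output distinction concrete, here is a single-threaded
C model of a compare-and-swap loop body.  It is only a model (no real
LL/SC, no atomicity) and the function name is made up; it shows that the
old-value and success results are optional outputs, while required_oldval
feeds the comparison and so cannot be dropped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Model of one compare-and-swap attempt.  OLDVAL_OUT may be NULL if
   the caller does not care about the old value.  */
static bool
cas_model (int *mem, int required_oldval, int newval, int *oldval_out)
{
  int oldval = *mem;               /* models the load-linked */

  if (oldval_out != NULL)
    *oldval_out = oldval;          /* optional output */

  if (oldval != required_oldval)   /* the comparison always needs this input */
    return false;

  *mem = newval;                   /* models the store-conditional */
  return true;                     /* optional output: success flag */
}
```

A caller may ignore the return value or pass NULL for oldval_out, but
there is no way to run the comparison without required_oldval.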

Richard

Re: [PATCH 2/3] Add XLP-specific atomic instructions and tweaks.

2012-06-13 Thread Richard Sandiford

Maxim Kuvyrkov  writes:
> On 14/06/2012, at 6:50 AM, Richard Sandiford wrote:
>> Maxim Kuvyrkov  writes:
>>> diff --git a/gcc/config/mips/sync.md b/gcc/config/mips/sync.md
>>> index 604aefa..ac953b5 100644
>>> --- a/gcc/config/mips/sync.md
>>> +++ b/gcc/config/mips/sync.md
>>> @@ -607,10 +607,32 @@
>>>(match_operand:GPR 1 "memory_operand")
>>>(match_operand:GPR 2 "arith_operand")
>>>(match_operand:SI 3 "const_int_operand")]
>>> -  "GENERATE_LL_SC"
>>> +  "GENERATE_LL_SC || ISA_HAS_SWAP"
>>> {
>>> +  if (!ISA_HAS_SWAP)
>>> emit_insn (gen_atomic_exchange_llsc (operands[0], operands[1],
>>>operands[2], operands[3]));
>>> +  else
>> 
>> Please swap this round so that the ISA_HAS_SWAP stuff comes first.
>> The LLSC code then remains the fallback if new variations are added.
>
> OK.
>
>> 
>>> +{
>>> +  rtx addr;
>>> +
>>> +  gcc_assert (MEM_P (operands[1]));
>>> +  addr = XEXP (operands[1], 0);
>>> +  if (!REG_P (addr) && can_create_pseudo_p ())
>>> +/* Workaround a reload bug that hits (lo_sum (reg) (symbol_ref))
>>> +  addresses.  Spill the address to a register upfront to simplify
>>> +  reload's job.  */
>>> +addr = force_reg (GET_MODE (addr), addr);
>> 
>> If there's a reload bug here, let's fix it.  But...
>
> After a chat with Bernd Schmidt, this is not a bug.  I've already fixed the 
> patch per your and Bernd's instructions.  Do you want to look through an 
> updated patch or should I just commit it after retesting?

Yeah, please send the revised patch.

Thanks,
Richard

Re: [PATCH 3/3] Avoid emitting useless instructions in mips_process_sync_loop.

2012-06-13 Thread Maxim Kuvyrkov

On 14/06/2012, at 6:59 AM, Richard Sandiford wrote:

> Maxim Kuvyrkov  writes:
>> +  /* Don't bother setting returns that are never used.  */
>> +  if (cmp && find_reg_note (insn, REG_UNUSED, cmp))
>> +cmp = 0;
>> +  if (required_oldval && find_reg_note (insn, REG_UNUSED, required_oldval))
>> +required_oldval = 0;
> 
> required_oldval is an important input (not output).  We can't drop it.
> I suppose we could replace oldval with AT if the non-AT register isn't used,
> but I'm not sure it's worth it.
> 
> The CMP part is OK though.

Thanks for pointing this out.  The optimization was motivated by the CMP part, 
as this is what's needed to generate the same assembly as glibc has.

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics

Re: [PR debug/47624] improve value tracking in non-VTA locations

2012-06-13 Thread Richard Henderson

On 2012-06-13 01:02, Alexandre Oliva wrote:
> Ping?  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01320.html

Ok.


r~

Re: [RFC, ivopts] fix bugs in ivopts address cost computation

2012-06-13 Thread Sandra Loosemore


On 06/06/2012 02:29 AM, Richard Guenther wrote:


> Pre-computing and caching things is to avoid creating RTXen over and over.
> As you have discarded this completely did you try to measure the cost
> of doing so in terms of produced garbage and compile-time cost?  Did you
> consider changing the target interface of IVOPTs to a (bunch of) new
> target hooks that avoid the RTX generation (which in fact we are not sure
> that we'll end up producing anyways in exactly that form due to subsequent
> optimizations)?


Since there seemed to be resistance to removing the pre-computing of 
costs, I've spent much of the last week trying to glue fixes on the 
existing code while preserving the caching, and just am not happy with 
the result.  It makes the code too complicated, adds additional overhead 
by precomputing more things, and still does not fix the lurking bugs WRT 
differing costs for different values of constant offsets and the like. 
Basically, I don't want to put my name on anything that ugly.  :-P  So, 
I went back and did some compile time benchmarking on my previously 
posted patch instead.
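
To illustrate the kind of bug I mean, here is a toy example in which a
cost that depends on the constant offset is cached per mode only.  The
cost function, the cache layout and all names are invented for the
example; this is not the ivopts code.

```c
#include <stdbool.h>

enum { N_MODES = 4 };

/* Toy cost model: small offsets fit in the instruction, larger ones
   cost two extra units.  The numbers are invented.  */
static int
address_cost (int mode, long offset)
{
  int base = 1 + mode;
  return (offset >= -128 && offset < 128) ? base : base + 2;
}

/* Cache keyed by mode only, ignoring the offset.  */
struct mode_cache
{
  bool valid[N_MODES];
  int cost[N_MODES];
};

static int
cached_address_cost (struct mode_cache *c, int mode, long offset)
{
  if (!c->valid[mode])
    {
      /* Whatever offset is seen first decides the cached value.  */
      c->cost[mode] = address_cost (mode, offset);
      c->valid[mode] = true;
    }
  return c->cost[mode];
}
```

After a first query with a small offset, a later query with offset 4096
gets the stale small-offset cost back, although address_cost would give a
larger one.  Making the key precise enough to avoid this is what makes the
cached version complicated.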


I used the bzip2 and gcc test programs available here:

http://people.csail.mit.edu/smcc/projects/single-file-programs/

These are respectively large and gigantic single-file programs, so you 
would expect the performance effects of caching to be particularly 
evident here as there would likely be many loops involving the same 
modes in each compilation unit.  I compiled them using a native x86_64 
build with "time gcc -c -O3", and ran each set of timings 3 times with 
an unmodified build and with my previously-posted patch.  And, it turns 
out there is no obvious difference in the results.


bzip2 base: 5.82, 5.82, 5.75
bzip2 patched: 5.73, 5.71, 5.85

gcc base: 4m44.390, 4m44.270, 4m44.060
gcc patched: 4m44.210, 4m44.530, 4m44.040
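
For anyone who wants to redo the arithmetic, a small sketch that averages
the three runs and reports the relative change.  The timings are copied
from above and converted to seconds (4m44.390 = 284.390); the helper names
are arbitrary.

```c
#include <stddef.h>

/* Arithmetic mean of N samples.  */
static double
mean (const double *v, size_t n)
{
  double sum = 0.0;
  size_t i;
  for (i = 0; i < n; i++)
    sum += v[i];
  return sum / (double) n;
}

/* Timings from this message, in seconds.  */
static const double bzip2_base[]    = { 5.82, 5.82, 5.75 };
static const double bzip2_patched[] = { 5.73, 5.71, 5.85 };
static const double gcc_base[]      = { 284.390, 284.270, 284.060 };
static const double gcc_patched[]   = { 284.210, 284.530, 284.040 };

/* Change of PATCHED relative to BASE, in percent.  */
static double
relative_change_percent (double base, double patched)
{
  return 100.0 * (patched - base) / base;
}
```

The means come out at roughly 5.80 s versus 5.76 s for bzip2 and 284.24 s
versus 284.26 s for gcc.  Both differences are smaller than the spread
between runs of the same build.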


> Can you split the patch into pieces fixing the above bugs separately?
> Removing the pre-compute and caching is the most questionable change,
> the others look like real bugs (the symbol cost might be questionable as
> well).


Given that most of the bugs were in the same function and were fixed by 
rewriting it completely, trying to split up the patch seems kind of 
pointless, to me.



> CC-ing Zdenek for his opinions (disclaimer: I didn't look at the actual patch).


Might somebody be willing to review the patch as posted?

FWIW, I was doing some digging around in the mail archives to see if 
there was any discussion that would help me understand the rationale for 
the current cost model better.  What I found was that when the ivopts 
pass was originally added back in 2004, the costs computation was one of 
the things that was specifically mentioned as needing work at least to 
document it better, but the patch was rushed through and approved 
without any thorough review because the Stage 3 deadline was looming. 
Anyway, given that get_address_cost has had a big FIXME on it all these 
years, it seems to me like maybe it's time to try to fix it?


-Sandra

Re: [PATCH, MIPS] Add most common atomic patterns

2012-06-13 Thread Richard Henderson

On 2012-06-12 22:50, Maxim Kuvyrkov wrote:
> The third patch is a small optimization to alleviate
> __atomic_compare_exchange[_n] builtins being one-size-fits-all
> solutions.  These builtins return both boolean "success" and "oldval"
> results.  As most cases use only one of the results, this
> optimization looks at REG_UNUSED notes to determine if instructions
> to set these results can be omitted.

If you split the pattern, similar to how it's handled on Alpha,
normal dead-code elimination will handle this.
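
For reference, the two usage styles in question look roughly like this
at the source level.  __atomic_compare_exchange_n is the real GCC builtin;
the wrapper names are made up for the example.

```c
#include <stdbool.h>

/* Caller that only uses the boolean success result.  */
static bool
try_claim (int *flag)
{
  int expected = 0;
  return __atomic_compare_exchange_n (flag, &expected, 1, false,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Caller that only uses the old value.  On success EXPECTED is left
   unchanged (and equals the old value); on failure the builtin writes
   the observed value into it.  */
static int
swap_if_equal (int *p, int required, int newval)
{
  int expected = required;
  __atomic_compare_exchange_n (p, &expected, newval, false,
                               __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;
}
```

In try_claim the old value is dead and in swap_if_equal the success flag
is dead; once the pattern is split into separate instructions, the
instructions that compute the dead result have no remaining users.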


r~

New Serbian PO file for 'cpplib' (version 4.7.0)

2012-06-13 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Serbian team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/sr.po

(This file, 'cpplib-4.7.0.sr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Contents of PO file 'cpplib-4.7.0.sr.po'

2012-06-13 Thread Translation Project Robot



cpplib-4.7.0.sr.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

Re: [Fortran, DRAFT patch] PR 46321 - [OOP] Polymorphic deallocation

2012-06-13 Thread Alessandro Fanfarillo

Dear all,

Attached is the new draft, which also supports polymorphic
deallocation via INTENT(OUT). Tomorrow I'll try to produce a draft for
deallocation at the end of the scope.

Regards

2012/6/12 Alessandro Fanfarillo :
> I don't know if there's already a PR but I get an ICE compiling this
> with a non-patched version. If x is not an array everything goes ok.
>
> 2012/6/11 Tobias Burnus :
>> On 06/11/2012 11:24 AM, Alessandro Fanfarillo wrote:
>>>
>>> gfortran.dg/coarray/poly_run_3.f90
>>
>>
>> That one fails because I forgot that se.expr in gfc_trans_deallocate
>> contains the descriptor and not the pointer to the data. That's fixed by:
>>
>>          tmp = se.expr;
>>          if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)))
>>            {
>>              tmp = gfc_conv_descriptor_data_get (tmp);
>>              STRIP_NOPS (tmp);
>>
>>            }
>>          tmp =  fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
>>                                  tmp, build_int_cst (TREE_TYPE (tmp), 0));
>>
>> However, it still fails for the
>>
>> type t
>>  integer, allocatable :: comp
>> end type t
>> contains
>>  subroutine foo(x)
>>    class(t), allocatable, intent(out) :: x(:)
>>  end subroutine
>> end
>>
>> (The intent(out) causes automatic deallocation.) The backtrace does not
>> really point to some code which the patch touched; it shouldn't be affected
>> by the class.c changes and gfc_trans_deallocate does not seem to be entered.
>>
>> While I do not immediately see why it fails, I wonder whether it is due to
>> the removed "else if ... BT_CLASS)" case in
>> gfc_deallocate_scalar_with_status. In any case, the change to
>> gfc_trans_deallocate might be also needed for
>> gfc_deallocate_scalar_with_status. At least, automatic deallocation (with
>> intent(out) or when leaving the scope) does not seem to go through
>> gfc_trans_deallocate but only through gfc_deallocate_scalar_with_status.
>>
>> Tobias
Index: gcc/fortran/trans-decl.c
===
--- gcc/fortran/trans-decl.c (revision 188511)
+++ gcc/fortran/trans-decl.c (working copy)
@@ -3423,6 +3423,63 @@ init_intent_out_dt (gfc_symbol * proc_sym, gfc_wra
   gfc_init_block (&init);
   for (f = proc_sym->formal; f; f = f->next)
 if (f->sym && f->sym->attr.intent == INTENT_OUT
+   && f->sym->ts.type == BT_CLASS
+   && !CLASS_DATA (f->sym)->attr.class_pointer
+   && CLASS_DATA (f->sym)->ts.u.derived->attr.alloc_comp)
+  {
+   gfc_expr *expr, *ppc;
+   gfc_se se, free_se;
+   gfc_code *ppc_code;
+   gfc_actual_arglist *actual;
+   tree cond;
+   f->sym->attr.referenced = 1;
+   expr = gfc_lval_expr_from_sym(f->sym);
+   gcc_assert (expr->expr_type == EXPR_VARIABLE);
+
+   if (expr->ts.type == BT_CLASS)
+ gfc_add_data_component (expr);
+
+   gfc_init_se (&se, NULL);
+   gfc_start_block (&se.pre);
+   se.want_pointer = 1;
+   se.descriptor_only = 1;
+   gfc_conv_expr (&se, expr);
+   ppc = gfc_lval_expr_from_sym(f->sym);
+   gfc_add_vptr_component (ppc);
+   gfc_add_component_ref (ppc, "_free");
+   gfc_init_se (&free_se, NULL);
+   free_se.want_pointer = 1;
+   gfc_conv_expr (&free_se, ppc);
+   tmp = se.expr;
+   if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)))
+ {
+   tmp = gfc_conv_descriptor_data_get (tmp);
+   STRIP_NOPS (tmp);
+ }
+   cond = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
+   free_se.expr,
+   build_int_cst (TREE_TYPE (free_se.expr), 0));
+   tmp =  fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
+   tmp, build_int_cst (TREE_TYPE (tmp), 0));
+   cond = fold_build2_loc (input_location, TRUTH_AND_EXPR,
+   boolean_type_node, cond, tmp);
+
+   actual = gfc_get_actual_arglist ();
+   actual->expr = gfc_copy_expr (expr);
+
+   ppc_code = gfc_get_code ();
+   ppc_code->resolved_sym = ppc->symtree->n.sym;
+   ppc_code->resolved_sym->attr.elemental = 1;
+   ppc_code->ext.actual = actual;
+   ppc_code->expr1 = ppc;
+   ppc_code->op = EXEC_CALL;
+   tmp = gfc_trans_call (ppc_code, true, NULL, NULL, false);
+   tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
+   cond, tmp, build_empty_stmt (input_location));
+gfc_add_expr_to_block (&init, tmp);
+gfc_free_statements (ppc_code);
+  }
+else if (f->sym && f->sym->attr.intent == INTENT_OUT
&& !f->sym->attr.pointer
&& f->sym->ts.type == BT_DERIVED)
   {
@@ -3446,7 +3503,7 @@ init_intent_out_dt (gfc_symbol * proc_sym, gfc_wra
else if (f->sym->value)
  gfc_init_default_dt (f->sym, &init, true);
   }
-else if (f->sym && f->sym->attr.intent == INTENT_OUT
+/*else if (f->sym && f->sy

Re: constant that doesn't fit in 32bits in alpha.c

2012-06-13 Thread Richard Henderson

On 2012-06-12 12:44, Joseph S. Myers wrote:
> I'd rather have a macro HOST_WIDE_INT_C in hwint.h (like INTMAX_C etc. in 
> stdint.h).  HOST_WIDE_INT_1 is already defined in hwint.h to either 1L or 
> 1LL; I'd suggest defining HOST_WIDE_INT_C to concatenate with either L or 
> LL (and then HOST_WIDE_INT_1 can be HOST_WIDE_INT_C (1), unconditionally).
> 

Are you happy with this version?
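
For context, the idea follows the INTMAX_C style: paste the right suffix
onto the literal so the constant has the wide type regardless of host.
Below is a standalone sketch using made-up *_SKETCH names and LONG_MAX
from <limits.h> in place of GCC's host configuration macros, so it does
not mirror hwint.h exactly.

```c
#include <limits.h>

/* Pick a type of at least 64 bits and the matching literal suffix.  */
#if LONG_MAX >= 0x7fffffffffffffff
typedef long wide_int_sketch;
# define WIDE_INT_SKETCH_C(X) X ## L
#else
typedef long long wide_int_sketch;
# define WIDE_INT_SKETCH_C(X) X ## LL
#endif

/* The constant 1 in the wide type, defined unconditionally.  */
#define WIDE_INT_SKETCH_1 WIDE_INT_SKETCH_C (1)

/* Shifting by more than 31 is only safe because the operand is wide.  */
static wide_int_sketch
bit_40 (void)
{
  return WIDE_INT_SKETCH_1 << 40;
}
```

A 64-bit hex constant can then be written in one piece, for example
WIDE_INT_SKETCH_C (0x100000000), instead of being assembled from two
32-bit halves.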


r~
diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 2177288..36f7306 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -5451,8 +5451,6 @@ alpha_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   chain_value = convert_memory_address (Pmode, chain_value);
 #endif
 
-#define HWI_HEX2(X,Y)  (((HOST_WIDE_INT)0x ## X ## u << 32) | 0x ## Y ## u)
-
   if (TARGET_ABI_OPEN_VMS)
 {
   const char *fnname;
@@ -5471,7 +5469,8 @@ alpha_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 the VMS calling standard. This is stored in the first quadword.  */
   word1 = force_reg (DImode, gen_const_mem (DImode, fnaddr));
   word1 = expand_and (DImode, word1,
- GEN_INT (HWI_HEX2(0fff,fff0)), NULL);
+ GEN_INT (HOST_WIDE_INT_C (0x0ffffff0)),
+ NULL);
 }
   else
 {
@@ -5482,12 +5481,10 @@ alpha_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
nop
 We don't bother setting the HINT field of the jump; the nop
 is merely there for padding.  */
-  word1 = GEN_INT (HWI_HEX2 (a77b0010,a43b0018));
-  word2 = GEN_INT (HWI_HEX2 (47ff041f,6bfb));
+  word1 = GEN_INT (HOST_WIDE_INT_C (0xa77b0010a43b0018));
+  word2 = GEN_INT (HOST_WIDE_INT_C (0x47ff041f6bfb));
 }
 
-#undef HWI_HEX2
-
   /* Store the first two words, as computed above.  */
   mem = adjust_address (m_tramp, DImode, 0);
   emit_move_insn (mem, word1);
diff --git a/gcc/hwint.h b/gcc/hwint.h
index 9885911..1734639 100644
--- a/gcc/hwint.h
+++ b/gcc/hwint.h
@@ -1,5 +1,5 @@
 /* HOST_WIDE_INT definitions for the GNU compiler.
-   Copyright (C) 1998, 2002, 2004, 2008, 2009, 2010
+   Copyright (C) 1998, 2002, 2004, 2008, 2009, 2010, 2012
Free Software Foundation, Inc.
 
This file is part of GCC.
@@ -60,20 +60,25 @@ extern char sizeof_long_long_must_be_8[sizeof(long long) == 8 ? 1 : -1];
 #if HOST_BITS_PER_LONG >= 64 || !defined NEED_64BIT_HOST_WIDE_INT
 #   define HOST_BITS_PER_WIDE_INT HOST_BITS_PER_LONG
 #   define HOST_WIDE_INT long
+#   define HOST_WIDE_INT_C(X) X ## L
 #else
 # if HOST_BITS_PER_LONGLONG >= 64
 #   define HOST_BITS_PER_WIDE_INT HOST_BITS_PER_LONGLONG
 #   define HOST_WIDE_INT long long
+#   define HOST_WIDE_INT_C(X) X ## LL
 # else
 #  if HOST_BITS_PER___INT64 >= 64
 #   define HOST_BITS_PER_WIDE_INT HOST_BITS_PER___INT64
 #   define HOST_WIDE_INT __int64
+#   define HOST_WIDE_INT_C(X) X ## i64
 #  else
 #error "Unable to find a suitable type for HOST_WIDE_INT"
 #  endif
 # endif
 #endif
 
+#define HOST_WIDE_INT_1 HOST_WIDE_INT_C(1)
+
 /* This is a magic identifier which allows GCC to figure out the type
of HOST_WIDE_INT for %wd specifier checks.  You must issue this
typedef before using the __asm_fprintf__ format attribute.  */
@@ -84,7 +89,6 @@ typedef HOST_WIDE_INT __gcc_host_wide_int__;
 #if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG
 # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT
 # define HOST_WIDE_INT_PRINT_C "L"
-# define HOST_WIDE_INT_1 1L
   /* 'long' might be 32 or 64 bits, and the number of leading zeroes
  must be tweaked accordingly.  */
 # if HOST_BITS_PER_WIDE_INT == 64
@@ -97,7 +101,6 @@ typedef HOST_WIDE_INT __gcc_host_wide_int__;
 #else
 # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT
 # define HOST_WIDE_INT_PRINT_C "LL"
-# define HOST_WIDE_INT_1 1LL
   /* We can assume that 'long long' is at least 64 bits.  */
 # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
 "0x%" HOST_LONG_LONG_FORMAT "x%016" HOST_LONG_LONG_FORMAT "x"
@@ -122,14 +125,17 @@ typedef HOST_WIDE_INT __gcc_host_wide_int__;
 # define HOST_WIDEST_INT_PRINT_UNSIGNED  
HOST_WIDE_INT_PRINT_UNSIGNED
 # define HOST_WIDEST_INT_PRINT_HEX   HOST_WIDE_INT_PRINT_HEX
 # define HOST_WIDEST_INT_PRINT_DOUBLE_HEX HOST_WIDE_INT_PRINT_DOUBLE_HEX
+# define HOST_WIDEST_INT_C(X) HOST_WIDE_INT_C(X)
 #else
 # if HOST_BITS_PER_LONGLONG >= 64
 #  define HOST_BITS_PER_WIDEST_INT   HOST_BITS_PER_LONGLONG
 #  define HOST_WIDEST_INT   long long
+#  define HOST_WIDEST_INT_C(X)   X ## LL
 # else
 #  if HOST_BITS_PER___INT64 >= 64
 #   define HOST_BITS_PER_WIDEST_INT  HOST_BITS_PER___INT64
 #   define HOST_WIDEST_INT   __int64
+#   define HOST_WIDEST_INT_C(X)  X ## i64
 #  else
 #error "This line should be impossible to reach"
 #  endif

[testsuite] more gcc.dg tests: add comments to dg-error and friends

2012-06-13 Thread Janis Johnson

This patch modifies miscellaneous tests in gcc/testsuite/gcc.dg to
specify comments in dg-message/dg-warning/dg-error test directives for
checks for multiple messages for the same line of source code.

Tested on i686-pc-linux-gnu.  OK for mainline?

Janis
2012-06-13  Janis Johnson  

* gcc.dg/di-longlong64-sync-1.c: Add comments to checks for multiple
messages reported for one line of source code.
* gcc.dg/format/few-1.c: Likewise.
* gcc.dg/ia64-sync-2.c: Likewise.
* gcc.dg/sync-2.c: Likewise.
* gcc.dg/noncompile/pr44517.c: Likewise.
* gcc.dg/noncompile/pr52290.c: Likewise.

Index: gcc.dg/di-longlong64-sync-1.c
===
--- gcc.dg/di-longlong64-sync-1.c   (revision 188482)
+++ gcc.dg/di-longlong64-sync-1.c   (working copy)
@@ -3,8 +3,8 @@
 /* { dg-options "-std=gnu99" } */
 /* { dg-additional-options "-march=pentium" { target { { i?86-*-* x86_64-*-* } 
&& ia32 } } } */
 
-/* { dg-message "note: '__sync_fetch_and_nand' changed semantics in GCC 4.4" 
"" { target *-*-* } 0 } */
-/* { dg-message "note: '__sync_nand_and_fetch' changed semantics in GCC 4.4" 
"" { target *-*-* } 0 } */
+/* { dg-message "note: '__sync_fetch_and_nand' changed semantics in GCC 4.4" 
"fetch_and_nand" { target *-*-* } 0 } */
+/* { dg-message "note: '__sync_nand_and_fetch' changed semantics in GCC 4.4" 
"nand_and_fetch" { target *-*-* } 0 } */
 
 
 /* Test basic functionality of the intrinsics.  The operations should
Index: gcc.dg/format/few-1.c
===
--- gcc.dg/format/few-1.c   (revision 188482)
+++ gcc.dg/format/few-1.c   (working copy)
@@ -4,15 +4,15 @@
 int f(int *ip, char *cp)
 {
__builtin_printf ("%*.*s");
-/* { dg-warning "field width specifier '\\*' expects a matching 'int' 
argument" "" { target *-*-* } 6 } */
-/* { dg-warning "field precision specifier '\\.\\*' expects a matching 'int' 
argument" "" { target *-*-* } 6 } */
-/* { dg-warning "format '%s' expects a matching 'char \\*' argument" "" { 
target *-*-* } 6 } */
+/* { dg-warning "field width specifier '\\*' expects a matching 'int' 
argument" "width" { target *-*-* } 6 } */
+/* { dg-warning "field precision specifier '\\.\\*' expects a matching 'int' 
argument" "precision" { target *-*-* } 6 } */
+/* { dg-warning "format '%s' expects a matching 'char \\*' argument" "format" 
{ target *-*-* } 6 } */
__builtin_printf ("%*.*s", ip, *cp);
-/* { dg-warning "field width specifier '\\*' expects argument of type 'int'" 
"" { target *-*-* } 10 } */
-/* { dg-warning "format '%s' expects a matching 'char \\*' argument" "" { 
target *-*-* } 10 } */
+/* { dg-warning "field width specifier '\\*' expects argument of type 'int'" 
"width" { target *-*-* } 10 } */
+/* { dg-warning "format '%s' expects a matching 'char \\*' argument" "format" 
{ target *-*-* } 10 } */
__builtin_printf ("%s %i", ip, ip);
-/* { dg-warning "format '%s' expects argument of type 'char \\*'" "" { target 
*-*-* } 13 } */
-/* { dg-warning "format '%i' expects argument of type 'int'" "" { target *-*-* 
} 13 } */
+/* { dg-warning "format '%s' expects argument of type 'char \\*'" "char" { 
target *-*-* } 13 } */
+/* { dg-warning "format '%i' expects argument of type 'int'" "int" { target 
*-*-* } 13 } */
__builtin_printf ("%s %i", cp);
 /* { dg-warning "format '%i' expects a matching 'int' argument" "" { target 
*-*-* } 16 } */
__builtin_printf ("%lc");
Index: gcc.dg/ia64-sync-2.c
===
--- gcc.dg/ia64-sync-2.c   (revision 188482)
+++ gcc.dg/ia64-sync-2.c   (working copy)
@@ -4,8 +4,8 @@
 /* { dg-options "-march=i486" { target { { i?86-*-* x86_64-*-* } && ia32 } } } 
*/
 /* { dg-options "-mcpu=v9" { target sparc*-*-* } } */
 
-/* { dg-message "note: '__sync_fetch_and_nand' changed semantics in GCC 4.4" 
"" { target *-*-* } 0 } */
-/* { dg-message "note: '__sync_nand_and_fetch' changed semantics in GCC 4.4" 
"" { target *-*-* } 0 } */
+/* { dg-message "note: '__sync_fetch_and_nand' changed semantics in GCC 4.4" 
"fetch_and_nand" { target *-*-* } 0 } */
+/* { dg-message "note: '__sync_nand_and_fetch' changed semantics in GCC 4.4" 
"nand_and_fetch" { target *-*-* } 0 } */
 
 /* Test basic functionality of the intrinsics.  */
 
Index: gcc.dg/sync-2.c
===
--- gcc.dg/sync-2.c (revision 188482)
+++ gcc.dg/sync-2.c (working copy)
@@ -4,8 +4,8 @@
 /* { dg-options "-march=i486" { target { { i?86-*-* x86_64-*-* } && ia32 } } } 
*/
 /* { dg-options "-mcpu=v9" { target sparc*-*-* } } */
 
-/* { dg-message "note: '__sync_fetch_and_nand' changed semantics in GCC 4.4" 
"" { target *-*-* } 0 } */
-/* { dg-message "note: '__sync_nand_and_fetch' changed semantics in GCC 4.4" 
"" { target *-*-* } 0 } */
+/* { dg-message "note: '__sync_fetch_and_nand' changed semant

[testsuite] scandump.exp: use printable version of regexp

2012-06-13 Thread Janis Johnson

Most scan-* procedures used in dg-final test directives use a printable
version of the regular expression in the test summary.  This patch makes
scan-*-dump-times do that as well, which eliminates several duplicates
in test summaries for expressions that contain a newline.

Tested on i686-pc-linux-gnu and arm-none-eabi.  OK for mainline?

Janis
2012-06-13  Janis Johnson  

* lib/scandump.exp (scan-dump-times): Use printable version of
regexp in test summary line.

Index: lib/scandump.exp
===
--- lib/scandump.exp   (revision 188482)
+++ lib/scandump.exp   (working copy)
@@ -94,7 +94,8 @@
 upvar 3 name testcase
 
 set suf [dump-suffix [lindex $args 3]]
-set testname "$testcase scan-[lindex $args 0]-dump-times $suf \"[lindex 
$args 1]\" [lindex $args 2]"
+set printable_pattern [make_pattern_printable [lindex $args 1]]
+set testname "$testcase scan-[lindex $args 0]-dump-times $suf 
\"$printable_pattern\" [lindex $args 2]"
 set src [file tail [lindex $testcase 0]]
 set output_file "[glob -nocomplain $src.[lindex $args 3]]"
 if { $output_file == "" } {

[testsuite] scanasm: don't strip torture args from testname in summary

2012-06-13 Thread Janis Johnson

Test results in a summary file usually include the torture options used
for the test run, but those options are stripped for pass/fail reports
for most scan-* procedures used in dg-final test directives.  This patch
refrains from stripping them and adds an extra space between those
options and the rest of the summary line to make it slightly more
readable. 

Tested on i686-pc-linux-gnu and arm-none-eabi.  OK for mainline?

Janis
2012-06-13  Janis Johnson  

* lib/scanasm.exp (scan-assembler, scan-assembler-not, scan-hidden,
scan-not-hidden, scan-file, scan-file-not, scan-stack-usage,
scan-stack-usage-not): Don't strip torture options from test name.

Index: lib/scanasm.exp
===
--- lib/scanasm.exp (revision 188482)
+++ lib/scanasm.exp (working copy)
@@ -79,9 +79,10 @@
 
 proc scan-assembler { args } {
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 set output_file "[file rootname [file tail $testcase]].s"
-
 dg-scan "scan-assembler" 1 $testcase $output_file $args
 }
 
@@ -95,7 +96,9 @@
 
 proc scan-assembler-not { args } {
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 set output_file "[file rootname [file tail $testcase]].s"
 
 dg-scan "scan-assembler-not" 0 $testcase $output_file $args
@@ -126,7 +129,9 @@
 
 proc scan-hidden { args } {
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 set output_file "[file rootname [file tail $testcase]].s"
 
 set symbol [lindex $args 0]
@@ -143,7 +148,9 @@
 
 proc scan-not-hidden { args } {
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 set output_file "[file rootname [file tail $testcase]].s"
 
 set symbol [lindex $args 0]
@@ -158,7 +165,9 @@
 
 proc scan-file { output_file args } {
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 dg-scan "scan-file" 1 $testcase $output_file $args
 }
 
@@ -167,7 +176,9 @@
 
 proc scan-file-not { output_file args } {
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 dg-scan "scan-file-not" 0 $testcase $output_file $args
 }
 
@@ -176,7 +187,9 @@
 
 proc scan-stack-usage { args } {
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 set output_file "[file rootname [file tail $testcase]].su"
 
 dg-scan "scan-file" 1 $testcase $output_file $args
@@ -187,7 +200,9 @@
 
 proc scan-stack-usage-not { args } {
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 set output_file "[file rootname [file tail $testcase]].su"
 
 dg-scan "scan-file-not" 0 $testcase $output_file $args
@@ -216,7 +231,9 @@
 # it still stores the filename of the testcase in a local variable "name".
 # A cleaner solution would require a new dejagnu release.
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 
 set pattern [lindex $args 0]
 set pp_pattern [make_pattern_printable $pattern]
@@ -276,7 +293,9 @@
 }
 
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 set pattern [lindex $args 0]
 set pp_pattern [make_pattern_printable $pattern]
 set output_file "[file rootname [file tail $testcase]].s"
@@ -331,7 +350,9 @@
 }
 
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 set pattern [lindex $args 0]
 set pp_pattern [make_pattern_printable $pattern]
 set output_file "[file rootname [file tail $testcase]].s"
@@ -387,7 +408,9 @@
 }
 
 upvar 2 name testcase
-set testcase [lindex $testcase 0]
+if { [llength $testcase] > 1 } {
+   set testcase "$testcase "
+}
 
 set what [lindex $args 0]
 set where [lsearch { text data bss total } $what]

[testsuite] lib/dg-pch.exp: distinguish between two compilations in summary

2012-06-13 Thread Janis Johnson

PCH test infrastructure compiles each test twice, with identical results
in the test summary file (assuming they both pass or both fail).  This
patch adds an extra flag to each compile, "-Dcompile1" or "-Dcompile2",
to make the summary lines unique.  This is a rather lame fix but at
least I added a comment.

Tested on i686-pc-linux-gnu and arm-none-eabi.  OK for mainline?

Janis
2012-06-13  Janis Johnson  

* lib/dg-pch.exp (dg-flags-pch): Add flags to make compile lines in
test summary unique.

Index: lib/dg-pch.exp
===
--- lib/dg-pch.exp  (revision 188482)
+++ lib/dg-pch.exp  (working copy)
@@ -50,14 +50,16 @@
# Ensure that the PCH file is used, not the original header.
file_on_host delete "$bname$suffix"
 
-   dg-test -keep-output $test "$otherflags $flags -I." ""
+   # The flags "-Dcompile1" and "-Dcompile2" are to distinguish the
+   # two compiles in test summary lines.
+   dg-test -keep-output $test "$otherflags $flags -I. -Dcompile1" ""
file_on_host delete "$bname$suffix.gch"
if { !$have_errs } {
if { [ file_on_host exists "$bname.s" ] } {
remote_upload host "$bname.s" "$bname.s-gch"
remote_download host "$bname.s-gch"
gcc_copy_files "[file rootname $test]${suffix}s" 
"$bname$suffix"
-   dg-test -keep-output $test "$otherflags $flags -I." ""
+   dg-test -keep-output $test "$otherflags $flags -I. 
-Dcompile2" ""
remote_upload host "$bname.s"
set tmp [ diff "$bname.s" "$bname.s-gch" ]
if { $tmp == 0 } {
@@ -89,4 +91,4 @@
 
 proc dg-pch { subdir test options suffix } {
   return [dg-flags-pch $subdir $test "" $options $suffix]
-}
\ No newline at end of file
+}

Re: MIPS testsuite patch for --with-synci configurations

2012-06-13 Thread Steve Ellcey

On Wed, 2012-06-13 at 18:58 +0100, Richard Sandiford wrote:

> I agree with David.  This is really the same kind of situation as we
> already have for -m(no-)dsp, etc.  There too we need to override an
> explicit -mdsp (which might be added using --target_board, etc.)
> when testing a target that doesn't support the DSP ASE.
> 
> I think the patch below should be enough.  Spot-checked on mips64-elf
> using --target_board mips-sim-idt64/-mips64r2/-msynci.  Could you
> give it a go with your target?
> 
> Thanks,
> Richard

Excellent, this patch works great and fixed all the unexpected failures
I was getting due to --with-synci.  I guess I need to study mips.exp a
bit more to understand all of what it can do.

Steve Ellcey
sell...@mips.com

Re: constant that doesn't fit in 32bits in alpha.c

2012-06-13 Thread Joseph S. Myers

On Wed, 13 Jun 2012, Richard Henderson wrote:

> On 2012-06-12 12:44, Joseph S. Myers wrote:
> > I'd rather have a macro HOST_WIDE_INT_C in hwint.h (like INTMAX_C etc. in 
> > stdint.h).  HOST_WIDE_INT_1 is already defined in hwint.h to either 1L or 
> > 1LL; I'd suggest defining HOST_WIDE_INT_C to concatenate with either L or 
> > LL (and then HOST_WIDE_INT_1 can be HOST_WIDE_INT_C (1), unconditionally).
> > 
> 
> Are you happy with this version?

Yes.
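
For readers following the thread, here is a minimal sketch of the suffix-pasting idea behind HOST_WIDE_INT_C, modelled on INTMAX_C from stdint.h.  The MY_* names and the LONG_MAX test are assumptions made for illustration only; the real hwint.h selects the suffix from HOST_BITS_PER_LONG and related configure-time macros, as the patch shows.

```c
#include <limits.h>

/* Hypothetical stand-ins for the hwint.h macros.  The wide type is
   'long' when it holds 64 bits and 'long long' otherwise.  */
#if LONG_MAX > 0x7fffffffL
# define MY_WIDE_INT long
# define MY_WIDE_INT_C(X) X ## L
#else
# define MY_WIDE_INT long long
# define MY_WIDE_INT_C(X) X ## LL
#endif

/* The constant 1 can then be defined once, unconditionally.  */
#define MY_WIDE_INT_1 MY_WIDE_INT_C (1)

/* Return the low 32 bits of one of the 64-bit trampoline words from
   the alpha.c hunk.  Without the pasted suffix the literal would not
   be portable to hosts where plain 'long' is 32 bits.  */
static unsigned MY_WIDE_INT
low_word (void)
{
  return (unsigned MY_WIDE_INT) MY_WIDE_INT_C (0x47ff041f6bfb) & 0xffffffff;
}
```

Because the suffix is pasted onto the literal itself, the constant always has the type chosen for the wide integer, and no cast or two-halves helper such as the removed HWI_HEX2 is needed.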

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: constant that doesn't fit in 32bits in alpha.c

2012-06-13 Thread Richard Henderson

On 2012-06-13 02:13, Pedro Alves wrote:
> Related, does gcc forbid "long long" / ULL ?


Normally, yes.  The vmsdbgout.c file seems to use it all over though.
Cleaning that up is independent of this thread though.



r~

Re: [testsuite] lib/dg-pch.exp: distinguish between two compilations in summary

2012-06-13 Thread Joseph S. Myers

On Wed, 13 Jun 2012, Janis Johnson wrote:

> PCH test infrastructure compiles each test twice, with identical results
> in the test summary file (assuming they both pass or both fail).  This
> patch adds an extra flag to each compile, "-Dcompile1" or "-Dcompile2",
> to make the summary lines unique.  This is a rather lame fix but at
> least I added a comment.

I would have said this was bug 20771, except that you committed a fix for 
that in 2008, so either it's a reappearance or the reason you didn't close 
the bug in 2008 was that your fix was known incomplete.  Anyway, check for 
open "testsuite" bugs for the issues you are fixing.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Fix ICE in expand_cse_reciprocals (PR tree-optimization/42078)

2012-06-13 Thread Alexandre Oliva

Sorry I dropped the ball on this one.  Context is here:
http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00416.html

On Apr 12, 2012, Richard Guenther  wrote:

> +  /* If the conditions in which this function uses VALUE change,
> + adjust gimple_replace_lhs_wants_value().  */
> +  gcc_assert (gimple_replace_lhs_wants_value ()
> + == MAY_HAVE_DEBUG_STMTS);
> +

> that looks ... odd.  But I see you want to conditionally compute value.

> +static inline bool
> +gimple_replace_lhs_wants_value (void)
> +{
> +  return MAY_HAVE_DEBUG_STMTS;
> +}

> but should this not depend on the old stmt / lhs?  For example do we really
> want to do this for artificial variables?  I suppose not.

I think we do.  We want to preserve the value of the expression when it
is referenced in other debug expressions.  For these other uses, it
doesn't matter whether this value happened to be computed and stored in
a temporary, an artificial variable or a user-defined variable.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

[PATCH] Some vector cost model cleanup

2012-06-13 Thread William J. Schmidt

This is just some general maintenance to the vectorizer's cost model
code:

 * Corrects a typo in a function name;
 * Eliminates an unnecessary function;
 * Combines some duplicate inline functions.

Bootstrapped and tested on powerpc64-unknown-linux-gnu, no new
regressions.  Ok for trunk?

Thanks,
Bill


2012-06-13  Bill Schmidt  

* tree-vectorizer.h (vect_get_stmt_cost): Move from tree-vect-stmts.c.
(cost_for_stmt): Remove decl.
(vect_get_single_scalar_iteration_cost): Correct typo in name.
* tree-vect-loop.c (vect_get_cost): Remove.
(vect_get_single_scalar_iteration_cost): Correct typo in name; use
vect_get_stmt_cost rather than vect_get_cost.
(vect_get_known_peeling_cost): Use vect_get_stmt_cost rather than
vect_get_cost.
(vect_estimate_min_profitable_iters): Correct typo in call to
vect_get_single_scalar_iteration_cost; use vect_get_stmt_cost rather
than vect_get_cost.
(vect_model_reduction_cost): Use vect_get_stmt_cost rather than
vect_get_cost.
(vect_model_induction_cost): Likewise.
* tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost): Correct
typo in call to vect_get_single_scalar_iteration_cost.
* tree-vect-stmts.c (vect_get_stmt_cost): Move to tree-vectorizer.h.
(cost_for_stmt): Remove unnecessary function.
* Makefile.in (TREE_VECTORIZER_H): Update dependencies.


Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   (revision 188507)
+++ gcc/tree-vectorizer.h   (working copy)
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_TREE_VECTORIZER_H
 
 #include "tree-data-ref.h"
+#include "target.h"
 
 typedef source_location LOC;
 #define UNKNOWN_LOC UNKNOWN_LOCATION
@@ -769,6 +770,18 @@ vect_pow2 (int x)
   return res;
 }
 
+/* Get cost by calling cost target builtin.  */
+
+static inline
+int vect_get_stmt_cost (enum vect_cost_for_stmt type_of_cost)
+{
+  tree dummy_type = NULL;
+  int dummy = 0;
+
+  return targetm.vectorize.builtin_vectorization_cost (type_of_cost,
+   dummy_type, dummy);
+}
+
 /*-*/
 /* Info on data references alignment.  */
 /*-*/
@@ -843,7 +856,6 @@ extern void vect_model_load_cost (stmt_vec_info, i
 extern void vect_finish_stmt_generation (gimple, gimple,
  gimple_stmt_iterator *);
 extern bool vect_mark_stmts_to_be_vectorized (loop_vec_info);
-extern int cost_for_stmt (gimple);
 extern tree vect_get_vec_def_for_operand (tree, gimple, tree *);
 extern tree vect_init_vector (gimple, tree, tree,
   gimple_stmt_iterator *);
@@ -919,7 +931,7 @@ extern int vect_estimate_min_profitable_iters (loo
 extern tree get_initial_def_for_reduction (gimple, tree, tree *);
 extern int vect_min_worthwhile_factor (enum tree_code);
 extern int vect_get_known_peeling_cost (loop_vec_info, int, int *, int);
-extern int vect_get_single_scalar_iteraion_cost (loop_vec_info);
+extern int vect_get_single_scalar_iteration_cost (loop_vec_info);
 
 /* In tree-vect-slp.c.  */
 extern void vect_free_slp_instance (slp_instance);
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c   (revision 188507)
+++ gcc/tree-vect-loop.c   (working copy)
@@ -1201,19 +1201,6 @@ vect_analyze_loop_form (struct loop *loop)
 }
 
 
-/* Get cost by calling cost target builtin.  */
-
-static inline int
-vect_get_cost (enum vect_cost_for_stmt type_of_cost)
-{
-  tree dummy_type = NULL;
-  int dummy = 0;
-
-  return targetm.vectorize.builtin_vectorization_cost (type_of_cost,
-   dummy_type, dummy);
-}
-
- 
 /* Function vect_analyze_loop_operations.
 
Scan the loop stmts and make sure they are all vectorizable.  */
@@ -2385,7 +2372,7 @@ vect_force_simple_reduction (loop_vec_info loop_in
 
 /* Calculate the cost of one scalar iteration of the loop.  */
 int
-vect_get_single_scalar_iteraion_cost (loop_vec_info loop_vinfo)
+vect_get_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
@@ -2434,12 +2421,12 @@ int
   if (STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt)))
 {
   if (DR_IS_READ (STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt))))
-   stmt_cost = vect_get_cost (scalar_load);
+   stmt_cost = vect_get_stmt_cost (scalar_load);
  else
-   stmt_cost = vect_get_cost (scalar_store);
+   stmt_cost = vect_get_stmt_cost (scalar_store);
 }
   else
-st

Re: [testsuite] lib/dg-pch.exp: distinguish between two compilations in summary

2012-06-13 Thread Janis Johnson

On 06/13/2012 02:37 PM, Joseph S. Myers wrote:
> On Wed, 13 Jun 2012, Janis Johnson wrote:
> 
>> PCH test infrastructure compiles each test twice, with identical results
>> in the test summary file (assuming they both pass or both fail).  This
>> patch adds an extra flag to each compile, "-Dcompile1" or "-Dcompile2",
>> to make the summary lines unique.  This is a rather lame fix but at
>> least I added a comment.
> 
> I would have said this was bug 20771, except that you committed a fix for 
> that in 2008, so either it's a reappearance or the reason you didn't close 
> the bug in 2008 was that your fix was known incomplete.  Anyway, check for 
> open "testsuite" bugs for the issues you are fixing.
> 

The patch for PR20771 did fix the problem (so I should have closed it)
but it was undone as part of a later change to the same lines.  I could
either do that again, with comments this time explaining why they are
slightly different, or use "-Dwith_PCH" and "-Dwithout_PCH" in the flags
that will appear in the test summary.  Do you have a preference?

Janis

Re: [testsuite] lib/dg-pch.exp: distinguish between two compilations in summary

2012-06-13 Thread Joseph S. Myers

On Wed, 13 Jun 2012, Janis Johnson wrote:

> The patch for PR20771 did fix the problem (so I should have closed it)
> but it was undone as part of a later change to the same lines.  I could
> either do that again, with comments this time explaining why they are
> slightly different, or use "-Dwith_PCH" and "-Dwithout_PCH" in the flags
> that will appear in the test summary.  Do you have a preference?

I have no preference here.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: debug insns in SMS

2012-06-13 Thread Alexandre Oliva

Apologies for the duplicate ping, this one is now properly addressed to
the pass maintainer.

On Apr  9, 2012, Alexandre Oliva  wrote:

> On May  4, 2011, Revital1 Eres  wrote:
>> Hello Alexandre
>>> I think this will restore proper functioning to SMS in the presence of
>>> debug insns.  A while ago, we'd never generate deps of non-debug insns
>>> on debug insns.  I introduced them to enable sched to adjust (reset)
>>> debug insns when non-debug insns were moved before them.  I believe it
>>> is safe to leave them out of the SCCs.  Even though this will end up
>>> causing some loss of debug info, that's probably unavoidable, and the
>>> end result after this change is probably the best we can hope for.  Your
>>> thoughts?

>> Thanks for the patch!

>> I actually discussed this issue with Ayal yesterday.
>> Ayal also suggested to reconsider the edges that are created in
>> the DDG between real instructions and debug_insns. Currently, we create
>> bidirectional anti deps edges between them. This leads to the problem you
>> were trying to solve in the current patch (described below) where these
>> extra edges influence the construction of the strongly connected component
>> and the code generated with and without -g. Your patch seems to solve this
>> problem.
>> However I cannot approve it as I'm not the maintainer (Ayal is).

> Ping?

> (Retested on x86_64-linux-gnu and i686-pc-linux-gnu)


> for  gcc/ChangeLog
> from  Alexandre Oliva  

>   * ddg.c (build_intra_loop_deps): Discard deps of nondebug on debug.

Ping?  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00419.html

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: [testsuite] more gcc.dg tests: add comments to dg-error and friends

2012-06-13 Thread Mike Stump

On Jun 13, 2012, at 2:12 PM, Janis Johnson wrote:
> This patch modifies miscellaneous tests in gcc/testsuite/gcc.dg to
> specify comments in dg-message/dg-warning/dg-error test directives for
> checks for multiple messages for the same line of source code.

> OK for mainline?

Ok.  I'm fine with you considering all such patches as obvious (or 
pre-approved, if you prefer).

Re: [testsuite] scandump.exp: use printable version of regexp

2012-06-13 Thread Mike Stump

On Jun 13, 2012, at 2:13 PM, Janis Johnson wrote:
> Most scan-* procedures used in dg-final test directives use a printable
> version of the regular expression in the test summary.  This patch makes
> scan-*-dump-times do that as well, which eliminates several duplicates
> in test summaries for expressions that contain a newline.

Ok.

Re: [testsuite] scanasm: don't strip torture args from testname in summary

2012-06-13 Thread Mike Stump

On Jun 13, 2012, at 2:14 PM, Janis Johnson wrote:
> Test results in a summary file usually include the torture options used
> for the test run, but those options are stripped for pass/fail reports
> for most scan-* procedures used in dg-final test directives.  This patch
> refrains from stripping

> OK for mainline?

Ok.

[SH] PR 53568 - Add support for bswap built-ins

2012-06-13 Thread Oleg Endo

Hello,

The attached patch improves code generated for byte swap expressions
such as ((x & 0xFF) << 8) | ((x >> 8) & 0xFF).
It seems that currently the tree optimizers only detect bswap32 and
bswap64 but not bswap16 patterns.  The patch adds detection for bswap16
patterns by playing along with the combine pass.
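
As a plain-C illustration of the expressions under discussion (a sketch only; the function names are made up here and are not part of the patch), the partial byte swap and the two shift-and-mask variants that the new intermediate pattern also covers can be written as:

```c
#include <stdint.h>

/* The bswap16-style idiom from the message: exchange the two low
   bytes of X and leave the upper 16 bits of the result clear.  */
static uint32_t
swap_low_bytes (uint32_t x)
{
  return ((x & 0xFF) << 8) | ((x >> 8) & 0xFF);
}

/* Two spellings of "move the low byte into bits 8..15".  They compute
   the same value, so either could be matched by the same pattern.  */
static uint32_t
shl8_mask_first (uint32_t x)
{
  return (x & 0xFF) << 8;
}

static uint32_t
shl8_shift_first (uint32_t x)
{
  return (x << 8) & 0xFF00;
}
```

For example, swap_low_bytes (0x1234) yields 0x3412.  Whether a given compiler turns this into a single swap.b depends on the target and on the combine pass, which is the subject of the patch.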

Tested with 
make -k -j8 check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,
-m4/-mb,-m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,
-m4a-single/-mb}"

and no new failures.
Test cases for this patch and the previous bswap32 patch will follow
shortly.

Cheers,
Oleg

ChangeLog:

PR target/53568
* config/sh/sh.md: Add peephole for swapbsi2. 
(*swapbisi2_and_shl8, *swapbhisi2): New insns and splits.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 188525)
+++ gcc/config/sh/sh.md	(working copy)
@@ -4561,6 +4561,81 @@
   "swap.b	%1,%0"
   [(set_attr "type" "arith")])
 
+;; The *swapbisi2_and_shl8 pattern helps the combine pass simplify
+;; partial byte swap expressions such as...
+;;   ((x & 0xFF) << 8) | ((x >> 8) & 0xFF).
+;; ...which are currently not handled by the tree optimizers.
+;; The combine pass will not initially try to combine the full expression,
+;; but only some sub-expressions.  In such a case the *swapbisi2_and_shl8
+;; pattern acts as an intermediate pattern that will eventually lead combine
+;; to the swapbsi2 pattern above.
+;; As a side effect this also improves code that does (x & 0xFF) << 8
+;; or (x << 8) & 0xFF00.
+(define_insn_and_split "*swapbisi2_and_shl8"
+  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
+	(ior:SI (and:SI (ashift:SI (match_operand:SI 1 "arith_reg_operand" "r")
+   (const_int 8))
+			(const_int 65280))
+		(match_operand:SI 2 "arith_reg_operand" "r")))]
+  "TARGET_SH1 && ! reload_in_progress && ! reload_completed"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(const_int 0)]
+{
+  rtx tmp0 = gen_reg_rtx (SImode);
+  rtx tmp1 = gen_reg_rtx (SImode);
+
+  emit_insn (gen_zero_extendqisi2 (tmp0, gen_lowpart (QImode, operands[1])));
+  emit_insn (gen_swapbsi2 (tmp1, tmp0));
+  emit_insn (gen_iorsi3 (operands[0], tmp1, operands[2]));
+  DONE;
+})
+
+;; The *swapbhisi2 pattern is, like the *swapbisi2_and_shl8 pattern, another
+;; intermediate pattern that will help the combine pass arrive at swapbsi2.
+(define_insn_and_split "*swapbhisi2"
+  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
+	(ior:SI (and:SI (ashift:SI (match_operand:SI 1 "arith_reg_operand" "r")
+   (const_int 8))
+			(const_int 65280))
+		(zero_extract:SI (match_dup 1) (const_int 8) (const_int 8))))]
+  "TARGET_SH1 && ! reload_in_progress && ! reload_completed"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(const_int 0)]
+{
+  rtx tmp = gen_reg_rtx (SImode);
+
+  emit_insn (gen_zero_extendhisi2 (tmp, gen_lowpart (HImode, operands[1])));
+  emit_insn (gen_swapbsi2 (operands[0], tmp));
+  DONE;
+})
+
+;; In some cases the swapbsi2 pattern might leave a sequence such as...
+;;   swap.b  r4,r4
+;;   mov r4,r0
+;;
+;; which can be simplified to...
+;;   swap.b  r4,r0
+(define_peephole2
+  [(set (match_operand:SI 0 "arith_reg_dest" "")
+	(ior:SI (and:SI (match_operand:SI 1 "arith_reg_operand" "")
+			(const_int 4294901760))
+		(ior:SI (and:SI (ashift:SI (match_dup 1) (const_int 8))
+(const_int 65280))
+			(and:SI (ashiftrt:SI (match_dup 1) (const_int 8))
+(const_int 255)))))
+   (set (match_operand:SI 2 "arith_reg_dest" "")
+	(match_dup 0))]
+  "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 2)
+	(ior:SI (and:SI (match_operand:SI 1 "arith_reg_operand" "")
+			(const_int 4294901760))
+		(ior:SI (and:SI (ashift:SI (match_dup 1) (const_int 8))
+(const_int 65280))
+			(and:SI (ashiftrt:SI (match_dup 1) (const_int 8))
+(const_int 255)))))])
+
 
 ;; -
 ;; Zero extension instructions

Re: [testsuite] lib/dg-pch.exp: distinguish between two compilations in summary

2012-06-13 Thread Mike Stump

On Jun 13, 2012, at 2:59 PM, Joseph S. Myers wrote:
> On Wed, 13 Jun 2012, Janis Johnson wrote:
> 
>> The patch for PR20771 did fix the problem (so I should have closed it)
>> but it was undone as part of a later change to the same lines.  I could
>> either do that again, with comments this time explaining why they are
>> slightly different, or use "-Dwith_PCH" and "-Dwithout_PCH" in the flags
>> that will appear in the test summary.  Do you have a preference?
> 
> I have no preference here.

Ok for mainline.  My only preference was to have a comment above the line that 
has the -D flag telling why.  You have that in your patch, so I'm happy.  If 
you want to use -Dwith_PCH, I'm fine with either spelling.

Re: Updated to respond to various email comments from Jason, Diego and Cary (issue6197069)

2012-06-13 Thread Sterling Augustine

> I lean toward -g myself, since there doesn't seem to be a strong rule one
> way or the other.

Unless there are further comments, I'll stick with -g then.

I think that covers all the comments, so I think I will commit this
Friday morning unless I hear anything further.

Sterling

Go patch committed: Avoid some unnecessary interface conversions

2012-06-13 Thread Ian Lance Taylor

This patch to the Go compiler avoids some unnecessary interface
conversions.  I was comparing Type* pointers for equality, but that only
works if the types were defined before the code being compiled.  If they
were defined afterward, the types will normally be forward declarations.
They will wind up pointing to the same type, but the pointers won't be
equal.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r 43610a429d23 go/expressions.cc
--- a/go/expressions.cc	Tue Jun 12 22:54:42 2012 -0700
+++ b/go/expressions.cc	Wed Jun 13 17:37:20 2012 -0700
@@ -168,7 +168,8 @@
   if (lhs_type_tree == error_mark_node)
 return error_mark_node;
 
-  if (lhs_type != rhs_type && lhs_type->interface_type() != NULL)
+  if (lhs_type->forwarded() != rhs_type->forwarded()
+  && lhs_type->interface_type() != NULL)
 {
   if (rhs_type->interface_type() == NULL)
 	return Expression::convert_type_to_interface(context, lhs_type,
@@ -179,7 +180,8 @@
 			  rhs_type, rhs_tree,
 			  false, location);
 }
-  else if (lhs_type != rhs_type && rhs_type->interface_type() != NULL)
+  else if (lhs_type->forwarded() != rhs_type->forwarded()
+	   && rhs_type->interface_type() != NULL)
 return Expression::convert_interface_to_type(context, lhs_type, rhs_type,
 		 rhs_tree, location);
   else if (lhs_type->is_slice_type() && rhs_type->is_nil_type())
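
The pointer-equality problem described above can be illustrated with a small C sketch. This is not gccgo's code: `struct type`, `forward_to`, `forwarded` and `same_type` are hypothetical stand-ins, assuming only that a forward declaration is a separate node that links to the real definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a compiler type node; not gccgo's Type class. */
struct type {
  struct type *forward_to; /* non-NULL while this node is a forward declaration */
  const char *name;
};

/* Follow forwarding links until the real definition is reached. */
static const struct type *forwarded(const struct type *t) {
  assert(t != NULL);
  while (t->forward_to != NULL)
    t = t->forward_to;
  return t;
}

/* Compare the resolved nodes rather than the raw pointers. */
static int same_type(const struct type *a, const struct type *b) {
  return forwarded(a) == forwarded(b);
}
```

With a forward node that points at a real node, the raw pointers differ while `same_type` still reports equality, which is the situation the patch handles by calling forwarded() on both sides.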

Re: [SH] PR 53568 - Add support for bswap built-ins

2012-06-13 Thread Kaz Kojima

Oleg Endo  wrote:
> The attached patch improves code generated for byte swap expressions
> such as ((x & 0xFF) << 8) | ((x >> 8) & 0xFF).
> It seems that currently the tree optimizers only detect bswap32 and
> bswap64 but not bswap16 patterns.  The patch adds detection for bswap16
> patterns by playing along with the combine pass.
> 
> Tested with 
> make -k -j8 check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,
> -m4/-mb,-m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,
> -m4a-single/-mb}"
> 
> and no new failures.
> Test cases for this patch and the previous bswap32 patch will follow
> shortly.

The patch looks fine to me.  OK for trunk.

I guess that the tree optimizers handle bswap32/64, with the cost that
entails, because those are more frequent and more critical in real
workloads such as networking code than bswap16, though I could be wrong
about it.

Regards,
kaz
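
For reference, the 16-bit byte swap expression quoted above can be written as a standalone C function. This is only a sketch of the source-level pattern; the name `bswap16_expr` is made up for illustration and is not part of the patch.

```c
#include <stdint.h>

/* The expression from the quoted mail, applied to a 16-bit value:
   the low byte moves up and the high byte moves down. */
static uint16_t bswap16_expr(uint16_t x) {
  return (uint16_t)(((x & 0xFF) << 8) | ((x >> 8) & 0xFF));
}
```

For example, `bswap16_expr(0x1234)` yields `0x3412`.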

Re: [Patch][ARM] Add arm-linux-gnueabihf triplet support.

2012-06-13 Thread Zhenqiang Chen

Ping. OK for trunk?

Thanks!
-Zhenqiang

On 7 June 2012 16:12, Zhenqiang Chen  wrote:
> Hi,
>
> The patch adds arm-linux-gnueabihf triplet support.
>
> No regression for arm-linux-gnueabi tests.
>
> There are some differences between testsuite results on softfp Natty
> builders and the new hard float Precise builders. But none are due to
> the change in triplet.
>
> Thanks!
> -Zhenqiang
>
> gcc/ada/ChangeLog:
> 2012-06-07  Zhenqiang Chen  
>
>        * gcc-interface/Makefile.in: Update linux-gnueabi to linux-gnueabi%.
>
> gcc/ChangeLog:
> 2012-06-07  Zhenqiang Chen  
>
>        * config.gcc: Update arm*-*-linux-*eabi to arm*-*-linux-*eabi*.
>        Add hard-float as default for arm*-*-*eabihf.
>
> libgcc/ChangeLog:
> 2012-06-07  Zhenqiang Chen  
>
>        * config.host: Update arm*-*-linux-*eabi to arm*-*-linux-*eabi*.
>
> libjava/ChangeLog:
> 2012-06-07  Zhenqiang Chen  
>
>        * configure: Update arm*linux*eabi to arm*linux*eabi*.
>        * configure.ac: Likewise.
>
> libstdc++-v3/ChangeLog:
> 2012-06-07  Zhenqiang Chen  
>
>        * configure.host: Update arm*-*-linux-*eabi to arm*-*-linux-*eabi*.
>
>
> diff --git a/gcc/ada/gcc-interface/Makefile.in
> b/gcc/ada/gcc-interface/Makefile.in
> index 21c2471..bae126d 100644
> --- a/gcc/ada/gcc-interface/Makefile.in
> +++ b/gcc/ada/gcc-interface/Makefile.in
> @@ -1828,7 +1828,7 @@ ifeq ($(strip $(filter-out powerpc% e500%
> linux%,$(arch) $(osys))),)
>   LIBRARY_VERSION := $(LIB_VERSION)
>  endif
>
> -ifeq ($(strip $(filter-out arm% linux-gnueabi,$(arch) $(osys)-$(word
> 4,$(targ,)
> +ifeq ($(strip $(filter-out arm% linux-gnueabi%,$(arch) $(osys)-$(word
> 4,$(targ,)
>   LIBGNAT_TARGET_PAIRS = \
>   a-intnam.ads   s-inmaop.adb diff --git a/gcc/config.gcc b/gcc/config.gcc
> index f2b0936..05d669f 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -835,7 +835,7 @@ arm*-*-linux*)                      # ARM GNU/Linux with 
> ELF
>        esac
>        tmake_file="${tmake_file} arm/t-arm"
>        case ${target} in
> -       arm*-*-linux-*eabi)
> +       arm*-*-linux-*eabi*)
>            tm_file="$tm_file arm/bpabi.h arm/linux-eabi.h"
>            tmake_file="$tmake_file arm/t-arm-elf arm/t-bpabi arm/t-linux-eabi"
>            # Define multilib configuration for arm-linux-androideabi.
> @@ -844,6 +844,11 @@ arm*-*-linux*)                     # ARM GNU/Linux with 
> ELF
>                tmake_file="$tmake_file arm/t-linux-androideabi"
>                ;;
>            esac
> +           case ${target} in
> +           arm*-*-*eabihf)
> +               with_float=${with_float:-hard}
> +               ;;
> +           esac
>            # The BPABI long long divmod functions return a 128-bit value in
>            # registers r0-r3.  Correctly modeling that requires the use of
>            # TImode.
> diff --git a/libgcc/config.host b/libgcc/config.host
> index 14c705b..fd952ff 100644
> --- a/libgcc/config.host
> +++ b/libgcc/config.host
> @@ -316,7 +316,7 @@ arm*-*-netbsdelf*)
>  arm*-*-linux*)                 # ARM GNU/Linux with ELF
>        tmake_file="${tmake_file} arm/t-arm t-fixedpoint-gnu-prefix"
>        case ${host} in
> -       arm*-*-linux-*eabi)
> +       arm*-*-linux-*eabi*)
>          tmake_file="${tmake_file} arm/t-elf arm/t-bpabi arm/t-linux-eabi
> t-slibgcc-libgcc"
>          tm_file="$tm_file arm/bpabi-lib.h"
>          unwind_header=config/arm/unwind-arm.h
> diff --git a/libjava/configure b/libjava/configure
> index 0bd423d..5067f3b 100755
> --- a/libjava/configure
> +++ b/libjava/configure
> @@ -20548,7 +20548,7 @@ case "${host}" in
>     # on Darwin -single_module speeds up loading of the dynamic libraries.
>     extra_ldflags_libjava=-Wl,-single_module
>     ;;
> -arm*linux*eabi)
> +arm*linux*eabi*)
>     # Some of the ARM unwinder code is actually in libstdc++.  We
>     # could in principle replicate it in libgcj, but it's better to
>     # have a dependency on libstdc++.
> diff --git a/libjava/configure.ac b/libjava/configure.ac
> index 62c5000..736608c 100644
> --- a/libjava/configure.ac
> +++ b/libjava/configure.ac
> @@ -931,7 +931,7 @@ case "${host}" in
>     # on Darwin -single_module speeds up loading of the dynamic libraries.
>     extra_ldflags_libjava=-Wl,-single_module
>     ;;
> -arm*linux*eabi)
> +arm*linux*eabi*)
>     # Some of the ARM unwinder code is actually in libstdc++.  We
>     # could in principle replicate it in libgcj, but it's better to
>     # have a dependency on libstdc++.
> diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
> index 8f29bc2..c29d9d4 100644
> --- a/libstdc++-v3/configure.host
> +++ b/libstdc++-v3/configure.host
> @@ -324,7 +324,7 @@ case "${host}" in
>         fi
>     esac
>     case "${host}" in
> -      arm*-*-linux-*eabi)
> +      arm*-*-linux-*eabi*)
>        
> port_specific_symbol_files="\$(srcdir)/../config/os/gnu-linux/arm-eabi-extra.ver"
>        ;;
>     esac
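
The effect of widening the pattern can be checked with a short C sketch. It uses POSIX fnmatch() as an approximation of shell `case` matching, and it assumes the canonical triplet form `arm-unknown-linux-gnueabihf`; the helper name `triplet_matches` is hypothetical.

```c
#include <fnmatch.h>

/* Old and new patterns from the patch. */
static const char *const OLD_PATTERN = "arm*-*-linux-*eabi";
static const char *const NEW_PATTERN = "arm*-*-linux-*eabi*";

/* Return 1 if TRIPLET matches PATTERN, 0 otherwise. */
static int triplet_matches(const char *pattern, const char *triplet) {
  return fnmatch(pattern, triplet, 0) == 0;
}
```

Under these assumptions the old pattern rejects `arm-unknown-linux-gnueabihf` because of the trailing `hf`, while the new pattern accepts both it and `arm-unknown-linux-gnueabi`.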

Re: [PATCH] Fix libgo testsuite

2012-06-13 Thread Ian Lance Taylor

On Fri, Jun 8, 2012 at 7:38 AM, Andreas Schwab  wrote:
> This fixes most of the libgo testsuite failures.

Thanks.  Committed to mainline.

Ian

Go patch committed: Quote package paths with tabs

2012-06-13 Thread Ian Lance Taylor

In my patch yesterday to quote package paths in reflection strings, I
managed to overlook the fact that quotes are already used for struct
field tags in reflection strings.  This patch changes the compiler and
the reflect package to quote using tabs instead, since tabs will never
appear in struct field tags.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 2f9880d1acae go/types.cc
--- a/go/types.cc	Wed Jun 13 21:46:56 2012 -0700
+++ b/go/types.cc	Wed Jun 13 22:01:58 2012 -0700
@@ -8340,16 +8340,16 @@
   // -fgo-pkgpath was introduced.  When -fgo-pkgpath is specified,
   // we use it to make a unique reflection string, so that the
   // type canonicalization in the reflect package will work.  In
-  // order to be compatible with the gc compiler, we quote the
-  // package path, so that the reflect methods can discard it.
+  // order to be compatible with the gc compiler, we put tabs into
+  // the package path, so that the reflect methods can discard it.
   const Package* package = this->named_object_->package();
   if (gogo->pkgpath_from_option())
 	{
-	  ret->push_back('"');
+	  ret->push_back('\t');
 	  ret->append(package != NULL
 		  ? package->pkgpath_symbol()
 		  : gogo->pkgpath_symbol());
-	  ret->push_back('"');
+	  ret->push_back('\t');
 	}
   ret->append(package != NULL
 		  ? package->package_name()
diff -r 2f9880d1acae libgo/go/reflect/type.go
--- a/libgo/go/reflect/type.go	Wed Jun 13 21:46:56 2012 -0700
+++ b/libgo/go/reflect/type.go	Wed Jun 13 22:01:58 2012 -0700
@@ -444,7 +444,7 @@
 	r := make([]byte, len(s))
 	j := 0
 	for i := 0; i < len(s); i++ {
-		if s[i] == '"' {
+		if s[i] == '\t' {
 			q = !q
 		} else if !q {
 			r[j] = s[i]
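
The stripping loop in the reflect hunk can be sketched in C as follows. This mirrors only the toggle-on-tab logic shown in the diff; the function name `strip_tab_quoted` and the sample package path in the note below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Copy SRC to DST, dropping every tab and all text between a pair of tabs.
   DST must have room for strlen(SRC) + 1 bytes.  Returns the output length. */
static size_t strip_tab_quoted(const char *src, char *dst) {
  assert(src != NULL && dst != NULL);
  int in_quote = 0;
  size_t j = 0;
  for (size_t i = 0; src[i] != '\0'; i++) {
    if (src[i] == '\t')
      in_quote = !in_quote;   /* a tab opens or closes the hidden segment */
    else if (!in_quote)
      dst[j++] = src[i];      /* keep everything outside the tabs */
  }
  dst[j] = '\0';
  return j;
}
```

Given the hypothetical input `"\tlibgo_net\tnet.Conn"` the output is `"net.Conn"`, and a string containing double quotes but no tabs passes through unchanged, which is why tabs do not collide with struct field tags.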

Merge several revisions to gccgo branch

2012-06-13 Thread Ian Lance Taylor

I've merged the following revisions to the gccgo branch:

188548
188547
188545
188496
188494
188482

These are intended to be committed to gcc-4_7-branch when it is open for
bug fixes.

Ian

Re: Updated to respond to various email comments from Jason, Diego and Cary (issue6197069)

2012-06-13 Thread Jason Merrill


On 06/13/2012 04:26 PM, Sterling Augustine wrote:
>> I lean toward -g myself, since there doesn't seem to be a strong rule one
>> way or the other.
>
> Unless there are further comments, I'll stick with -g then.
>
> I think that covers all the comments, so I think I will commit this
> Friday morning unless I hear anything further.

Weren't you going to repost the patch first?  :)

Jason

Re: Make timevar phases mutually exclusive. (issue6302064)

2012-06-13 Thread Jason Merrill


OK.

Jason

Re: [PATCH] Fix ICE in expand_cse_reciprocals (PR tree-optimization/42078)

2012-06-13 Thread Alexandre Oliva

On Apr 12, 2012, Richard Guenther  wrote:

> +  /* If the conditions in which this function uses VALUE change,
> + adjust gimple_replace_lhs_wants_value().  */
> +  gcc_assert (gimple_replace_lhs_wants_value ()
> + == MAY_HAVE_DEBUG_STMTS);
> +
 if (MAY_HAVE_DEBUG_STMTS)
   {

> that looks ... odd.

Indeed, it does.  Does this look any better?

  bool save_value = MAY_HAVE_DEBUG_STMTS;
  /* If the condition above, in which this function uses VALUE change,
 adjust gimple_replace_lhs_wants_value() to match.  The assert
 below helps enforce this.  */
  gcc_checking_assert (gimple_replace_lhs_wants_value () == save_value);

  if (save_value)
{

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: [PATCH, ARM] New CPU support for Marvell PJ4 cores

2012-06-13 Thread Chung-Lin Tang

On 2012/6/14 02:18 AM, Ramana Radhakrishnan wrote:
> On 29 May 2012 10:07, Yi-Hsiu Hsu  wrote:
>> Hi,
>>
>> This patch maintains Marvell PJ4 cores pipeline description.
>> Run arm testsuite on arm-linux-gnueabi and no extra regressions are found.
>>
>>* config/arm/marvell-pj4.md: New marvell-pj4 pipeline description.
>>* config/arm/arm.c (arm_issue_rate): Add marvell_pj4.
>>* config/arm/arm-cores.def: Add core marvell-pj4.
>>* config/arm/arm-tune.md: Regenerated.
>>* config/arm/arm-tables.opt: Regenerated.
>>* doc/invoke.texi: Added entry for marvell-pj4.
> 
> This command line option should also be added to BE8_LINK_SPEC similar
> to what's done for the other v7-a cores.
> 
> Ok with that change.

I take the blame for not doing this back then, but I suggest the
resource names be properly qualified, similar to most recently added
pipeline descriptions, e.g. prepend resource/reservation names with
"pj4_"  ("is" to "pj4_is", etc.)

Chung-Lin

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-06-13 Thread Sharad Singhai

Thanks for your comments. Responses inline.

On Wed, Jun 13, 2012 at 4:48 AM, Richard Guenther
 wrote:
> On Fri, Jun 8, 2012 at 7:16 AM, Sharad Singhai  wrote:
>> Okay, I have updated the attached patch so that the output from
>> -ftree-vectorizer-verbose is considered diagnostic information and is
>> always
>> sent to stderr. Other functionality remains unchanged. Here is some
>> more context about this patch.
>>
>> This patch improves the dump infrastructure and public interfaces so
>> that the existing private pass-specific dump stream is separated from
>> the diagnostic dump stream (typically stderr).  The optimization
>> passes can output information on the two streams independently.
>>
>> The newly defined interfaces are:
>>
>> Individual passes do not need to access the dump file directly. Thus, instead
>> of doing
>>
>>   if (dump_file && (flags & dump_flags))
>>      fprintf (dump_file, ...);
>>
>> they can do
>>
>>     dump_printf (flags, ...);
>>
>> If the current pass has FLAGS enabled, the information is printed
>> into the dump file; otherwise it is not.
>>
>> Similar to the dump_printf (), another function is defined, called
>>
>>        diag_printf (dump_flags, ...)
>>
>> This prints information only onto the diagnostic stream, typically
>> standard error. It is useful for separating pass-specific dump
>> information from
>> the diagnostic information.
>>
>> Currently, as a proof of concept, I have converted vectorizer passes
>> to use the new dump format. For this, I have considered
>> information printed in vect_dump file as diagnostic. Thus 'fprintf'
>> calls are changed to 'diag_printf'. Some other information printed to
>> dump_file is sent to the regular dump file via 'dump_printf ()'. It
>> helps to separate the two streams because they might serve different
>> purposes and might have different formatting requirements.
>>
>> For example, using the trunk compiler, the following invocation
>>
>> g++ -S v.cc -ftree-vectorize -fdump-tree-vect -ftree-vectorizer-verbose=2
>>
>> prints tree vectorizer dump into a file named 'v.cc.113t.vect'.
>> However, the verbose diagnostic output is silently
>> ignored. This is not desirable as the two types of dump should not interfere.
>>
>> After this patch, the vectorizer dump is available in 'v.cc.113t.vect'
>> as before, but the verbose vectorizer diagnostic is additionally
>> printed on stderr. Thus both types of dump information are output.
>>
>> An additional feature of this patch is that individual passes can
>> print dump information into command-line named files instead of auto
>> numbered filename. For example,
>
> I'd wish you'd leave out this part for a followup.

I thought you wanted all parts together. Anyway, I can remove this part.
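
For readers following the quoted description, the dump_printf idea can be sketched in plain C. This is not the code from the patch: `dump_file` and `dump_flags` here are local stand-ins for the pass state, and the return value is an addition made for this sketch.

```c
#include <stdarg.h>
#include <stdio.h>

static FILE *dump_file;  /* stand-in for the current pass's dump stream */
static int dump_flags;   /* stand-in for the flags enabled for the pass */

/* Print to the dump stream only if one is open and FLAGS is enabled.
   Returns the number of characters written, or 0 when nothing is printed. */
static int dump_printf(int flags, const char *fmt, ...) {
  if (dump_file == NULL || (flags & dump_flags) == 0)
    return 0;
  va_list ap;
  va_start(ap, fmt);
  int n = vfprintf(dump_file, fmt, ap);
  va_end(ap);
  return n;
}
```

The point of the wrapper is that callers no longer repeat the `if (dump_file && (flags & dump_flags))` test at every call site.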

>
>>
>> g++ -S -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.vect
>>     -ftree-vectorizer-verbose=2
>>     -fdump-tree-pre=foo.pre
>>
>> This prints the tree vectorizer dump into 'foo.vect', PRE dump into
>> 'foo.pre', and the vectorizer verbose diagnostic dump onto stderr.
>>
>> Please take another look.
>
> --- tree-vect-loop-manip.c      (revision 188325)
> +++ tree-vect-loop-manip.c      (working copy)
> @@ -789,14 +789,11 @@ slpeel_make_loop_iterate_ntimes (struct loop *loop
>   gsi_remove (&loop_cond_gsi, true);
>
>   loop_loc = find_loop_location (loop);
> -  if (dump_file && (dump_flags & TDF_DETAILS))
> -    {
> -      if (loop_loc != UNKNOWN_LOC)
> -        fprintf (dump_file, "\nloop at %s:%d: ",
> +  if (loop_loc != UNKNOWN_LOC)
> +    dump_printf (TDF_DETAILS, "\nloop at %s:%d: ",
>                  LOC_FILE (loop_loc), LOC_LINE (loop_loc));
> -      print_gimple_stmt (dump_file, cond_stmt, 0, TDF_SLIM);
> -    }
> -
> +  if (dump_flags & TDF_DETAILS)
> +    dump_gimple_stmt (TDF_SLIM, cond_stmt, 0);
>   loop->nb_iterations = niters;
>
> I'm confused by this.  Why is this not simply
>
>  if (loop_loc != UNKNOWN_LOC)
>    dump_printf (dump_flags, "\nloop at %s:%d: ",
>                       LOC_FILE (loop_loc), LOC_LINE (loop_loc));
>  dump_gimple_stmt (dump_flags | TDF_SLIM, cond_stmt, 0);
>
> for example.  I notice that you may have misunderstood the message
> classification I asked you to add (maybe I confused you by mentioning
> eventually re-using the TDF_* flags).  I think you basically provided this
> message classification by adding two classes, providing both
> dump_gimple_stmt and diag_gimple_stmt.  But still, in the above you keep a
> dump_flags test _and_ you pass (altered) dump_flags to the
> dump/diag_gimple_stmt routines.  Let me quote them:
>
> +void
> +dump_gimple_stmt (int flags, gimple gs, int spc)
> +{
> +  if (dump_file)
> +    print_gimple_stmt (dump_file, gs, spc, flags);
> +}
>
> +void
> +diag_gimple_stmt (int flags, gimple gs, int spc)
> +{
> +  if (alt_dump_file)
> +    print_gimple_stmt (alt_dump_file, gs, spc, flags);
> +}
>
> I'd say it should have been a single function:
>
> void
> dump_gimple_stmt (enum msg_classification, int additional_flags,
> gimple gs, i

Re: [PATCH] Option to build bare-metal ARM cross-compiler for arm-none-eabi target without libunwind

2012-06-13 Thread Fredrik Hederstierna

Joseph Myers wrote:
>You need to provide a self-contained explanation of what the problem
>is that your patch is fixing and why you chose that approach to
>fixing it - with reference to the ARM EABI documents (RTABI etc.)
>for why your approach is valid according to the ARM EABI.  libunwind
>is a library separate from libgcc that is used by libgcc for
>unwinding on ia64-linux-gnu only (whether built by GCC or separately
>installed).  There is also a separate libunwind project that may be
>used on GNU/Linux platforms but I am not aware of it being used for
>bare-metal at all.

Our experience is that when simple integer division is used in the code,
the libgcc division routine includes div-zero handling (exception support?),
which brings in dependencies on libunwind. This dependency problem
is the background of the patch.

Since libunwind uses e.g. memcpy() and abort(), we cannot link, because we
have our own custom versions of all libc functions (a real bare-metal
toolchain).

>Certainly it would be unusual to use it for ARM
>EABI and the ARM EABI libgcc works fine without it.  So referring to
>libunwind in the ARM EABI context seems rather confusing; if you
>don't want it, just do the same as almost all other ARM EABI users
>and don't use it; it's *using* libunwind that requires special
>action, not avoiding it.

We don't use it, and we do not want to use it.
But we use the libc division functions, and the libgcc division functions are
compiled with -fexceptions, so we get libunwind dependencies anyway.
The patch makes this dependency optional.

If you want, I can send you some example scripts showing how we build our
toolchain and some simple code that demonstrates the problem.
(You can also check the Rockbox project, which has the same problem. See
previous posts from other people.)

Best Regards
Fredrik Hederstierna

Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support

2012-06-13 Thread nick clifton


Hi Matt, Hi Xinyu,


> This series was written by Marvell and sent by Xinyu Qi
> a number of times in the last year.


Sorry for the long delay in reviewing these patches.  Overall they were 
fine, with only a few, very minor, formatting issues.  I have committed 
the entire series of patches to the mainline.



> For 4.7 and 4.6 please consider committing my patch
> "[PATCH] arm: Fix iwmmxt shift and logical intrinsics (PR 35294)."
> which only fixes the logical and shift intrinsics.


I will look at this and post separately about it.

Cheers
  Nick

Re: [PATCH] Option to build bare-metal ARM cross-compiler for arm-none-eabi target without libunwind

2012-06-13 Thread Fredrik Hederstierna

This is the link to the original patch.
It contains some background information and more links.

http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00720.html

The patch is now updated and improved by Larry Doolittle.
/Fredrik

[SH] PR 53568 - Add support for bswap built-ins

2012-06-13 Thread Oleg Endo

Hello,

The attached patch adds support for the bswap32 built-in on SH.

Tested with 
make -k -j8 check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,
-m4/-mb,-m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,
-m4a-single/-mb}"

and no new failures.


Cheers,
Oleg


ChangeLog:

PR target/53568
* config/sh/sh.md (bswapsi2): New expander.
(swapbsi2): New insn.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 188425)
+++ gcc/config/sh/sh.md	(working copy)
@@ -4529,6 +4529,38 @@
   emit_label_after (skip_neg_label, emit_insn (gen_negc (high_dst, high_src)));
   DONE;
 })
+
+(define_expand "bswapsi2"
+  [(set (match_operand:SI 0 "arith_reg_dest" "")
+	(bswap:SI (match_operand:SI 1 "arith_reg_operand" "")))]
+  "TARGET_SH1"
+{
+  if (! can_create_pseudo_p ())
+FAIL;
+  else
+{
+  rtx tmp0 = gen_reg_rtx (SImode);
+  rtx tmp1 = gen_reg_rtx (SImode);
+
+  emit_insn (gen_swapbsi2 (tmp0, operands[1]));
+  emit_insn (gen_rotlsi3_16 (tmp1, tmp0));
+  emit_insn (gen_swapbsi2 (operands[0], tmp1));
+  DONE;
+}
+})
+
+(define_insn "swapbsi2"
+  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
+	(ior:SI (and:SI (match_operand:SI 1 "arith_reg_operand" "r")
+			(const_int 4294901760))
+		(ior:SI (and:SI (ashift:SI (match_dup 1) (const_int 8))
+(const_int 65280))
+			(and:SI (ashiftrt:SI (match_dup 1) (const_int 8))
+(const_int 255)]
+  "TARGET_SH1"
+  "swap.b	%1,%0"
+  [(set_attr "type" "arith")])
+
 
 ;; -
 ;; Zero extension instructions
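
The three-instruction sequence emitted by the bswapsi2 expander can be modelled in C to see why it produces a full 32-bit byte swap. This is a sketch, not GCC code: the function names are made up, and `swap_b` simply transcribes the RTL of the swapbsi2 insn (the constants 4294901760, 65280 and 255 are 0xFFFF0000, 0xFF00 and 0xFF).

```c
#include <stdint.h>

/* Models swap.b: exchange the two low bytes, keep the upper 16 bits. */
static uint32_t swap_b(uint32_t x) {
  return (x & 0xFFFF0000u) | ((x << 8) & 0xFF00u) | ((x >> 8) & 0xFFu);
}

/* Rotate left by 16 bits, i.e. exchange the two halfwords. */
static uint32_t rotl16(uint32_t x) {
  return (x << 16) | (x >> 16);
}

/* swap.b, rotate by 16, swap.b: the order used by the expander. */
static uint32_t bswap32_via_swap_b(uint32_t x) {
  return swap_b(rotl16(swap_b(x)));
}
```

Tracing 0x11223344 through the steps gives 0x11224433, then 0x44331122, then 0x44332211.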

[Patch, Fortran] PR53597 re-add SAVE constraint for modules with -std=f2003

2012-06-13 Thread Tobias Burnus

Given the very slow patch review, I intend to commit this patch in a 
couple of days as obvious.* Nevertheless, I wouldn't mind a patch review.


The constraint check is actually present in resolve.c, it just doesn't 
trigger.


Built and regtested on x86-64-gnu-linux.
OK for the trunk - and for the 4.6/4.7 branch?*

Patches pending review:
- http://gcc.gnu.org/ml/fortran/2012-05/msg00171.html
- http://gcc.gnu.org/ml/fortran/2012-05/msg00173.html

Tobias

2012-06-13  Tobias Burnus  

	PR fortran/53597
	* decl.c (match_attr_spec): Only mark module variables
	as SAVE_IMPLICIT for Fortran 2008 and later.

2012-06-13  Tobias Burnus  

	PR fortran/53597
	* gfortran.dg/save_4.f90: New.

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index a760331..26b5059 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -3810,8 +3810,9 @@ match_attr_spec (void)
 	}
 }
 
-  /* Module variables implicitly have the SAVE attribute.  */
-  if (gfc_current_state () == COMP_MODULE && !current_attr.save)
+  /* Since Fortran 2008 module variables implicitly have the SAVE attribute.  */
+  if (gfc_current_state () == COMP_MODULE && !current_attr.save
+  && (gfc_option.allow_std & GFC_STD_F2008) != 0)
 current_attr.save = SAVE_IMPLICIT;
 
   colon_seen = 1;
--- /dev/null	2012-06-12 08:13:11.079779038 +0200
+++ gcc/gcc/testsuite/gfortran.dg/save_4.f90	2012-06-13 09:16:20.0 +0200
@@ -0,0 +1,13 @@
+! { dg-do compile }
+! { dg-options "-std=f2003" }
+!
+! PR fortran/53597
+!
+MODULE somemodule
+  IMPLICIT NONE
+  TYPE sometype
+INTEGER :: i
+DOUBLE PRECISION, POINTER, DIMENSION(:,:) :: coef => NULL()
+  END TYPE sometype
+  TYPE(sometype) :: somevariable ! { dg-error "Fortran 2008: Implied SAVE for module variable 'somevariable' at .1., needed due to the default initialization" }
+END MODULE somemodule

[Patch, Fortran] PR53643 Fix INTENT(OUT) for class arrays

2012-06-13 Thread Tobias Burnus

gfortran had an ICE with intent(out) class arrays - and with 
(polymorphic) scalar coarrays.


Built and regtested on x86-64-linux.
OK for the trunk?

Tobias
2012-06-12  Tobias Burnus  

	PR fortran/53643
	* trans-decl.c (init_intent_out_dt): Fix for polymorphic arrays.
	* trans-array.c (structure_alloc_comps): Don't loop for
	scalar coarrays.

2012-06-12  Tobias Burnus  

	PR fortran/53643
	* gfortran.dg/intent_out_7.f90: New.

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 2534462..0e78210 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -7318,9 +7318,7 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
 
   if ((POINTER_TYPE_P (decl_type) && rank != 0)
 	|| (TREE_CODE (decl_type) == REFERENCE_TYPE && rank == 0))
-
-decl = build_fold_indirect_ref_loc (input_location,
-decl);
+decl = build_fold_indirect_ref_loc (input_location, decl);
 
   /* Just in case in gets dereferenced.  */
   decl_type = TREE_TYPE (decl);
@@ -7328,7 +7326,7 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
   /* If this an array of derived types with allocatable components
  build a loop and recursively call this function.  */
   if (TREE_CODE (decl_type) == ARRAY_TYPE
-	|| GFC_DESCRIPTOR_TYPE_P (decl_type))
+  || (GFC_DESCRIPTOR_TYPE_P (decl_type) && rank != 0))
 {
   tmp = gfc_conv_array_data (decl);
   var = build_fold_indirect_ref_loc (input_location,
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 637376b..75a2160 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -3451,12 +3451,9 @@ init_intent_out_dt (gfc_symbol * proc_sym, gfc_wrapped_block * block)
 	 && !CLASS_DATA (f->sym)->attr.class_pointer
 	 && CLASS_DATA (f->sym)->ts.u.derived->attr.alloc_comp)
   {
-	tree decl = build_fold_indirect_ref_loc (input_location,
-		 f->sym->backend_decl);
-	tmp = CLASS_DATA (f->sym)->backend_decl;
-	tmp = fold_build3_loc (input_location, COMPONENT_REF,
-			   TREE_TYPE (tmp), decl, tmp, NULL_TREE);
-	tmp = build_fold_indirect_ref_loc (input_location, tmp);
+	tmp = gfc_class_data_get (f->sym->backend_decl);
+	if (CLASS_DATA (f->sym)->as == NULL)
+	  tmp = build_fold_indirect_ref_loc (input_location, tmp);
 	tmp = gfc_deallocate_alloc_comp (CLASS_DATA (f->sym)->ts.u.derived,
 	 tmp,
 	 CLASS_DATA (f->sym)->as ?
--- /dev/null	2012-06-12 08:13:11.079779038 +0200
+++ gcc/gcc/testsuite/gfortran.dg/intent_out_7.f90	2012-06-13 09:45:19.0 +0200
@@ -0,0 +1,26 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=single" }
+!
+! PR fortran/53643
+!
+type t
+ integer, allocatable :: comp
+end type t
+contains
+ subroutine foo(x,y)
+   class(t), allocatable, intent(out) :: x(:)
+   class(t), intent(out) :: y(:)
+ end subroutine
+ subroutine foo2(x,y)
+   class(t), allocatable, intent(out) :: x
+   class(t), intent(out) :: y
+ end subroutine
+ subroutine bar(x,y)
+   class(t), intent(out) :: x(:)[*]
+   class(t), intent(out) :: y[*]
+ end subroutine
+ subroutine bar2(x,y)
+   type(t), intent(out) :: x(:)[*]
+   type(t), intent(out) :: y[*]
+ end subroutine
+end

Re: [PR52983] eliminate autoinc from debug_insn locs

2012-06-13 Thread Alexandre Oliva

On May  3, 2012, Alexandre Oliva  wrote:

> Here are the 3 patches that, together, are equivalent to the one posted
> before, except for the visibility of cleanup_auto_inc_dec.

Ping?

http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00300.html

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

[Patch, Fortran] PR53642/45170c24 Deferred-length string fixes

2012-06-13 Thread Tobias Burnus

This patch fixes issues with deferred length strings, where the new 
string length (= RHS len) is evaluated too late. That's fixed by calling 
gfc_add_block_to_block. I have no idea whether the condition makes sense 
or whether that function could always be called.


Additionally, in the FE optimization, it avoids the removal of trim in 
"lhs = trim(rhs)" if the lhs has a deferred length.


Built and regtested on x86-64-linux.
OK for the trunk?

Tobias
2012-06-13  Tobias Burnus  

	PR fortran/53642
	PR fortran/45170
	* frontend-passes.c (optimize_assignment): Don't remove RHS's
	trim when assigning to a deferred-length string.
	* trans-expr.c (gfc_trans_assignment_1): Ensure that the RHS string
	length is evaluated before the deferred-length LHS is reallocated.

2012-06-13  Tobias Burnus  

	PR fortran/53642
	PR fortran/45170
	* gfortran.dg/deferred_type_param_8.f90: New.

diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index bcc1bdc..fc32e56 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -735,15 +735,13 @@ optimize_assignment (gfc_code * c)
   lhs = c->expr1;
   rhs = c->expr2;
 
-  if (lhs->ts.type == BT_CHARACTER)
+  if (lhs->ts.type == BT_CHARACTER && !lhs->ts.deferred)
 {
-  /* Optimize away a = trim(b), where a is a character variable.  */
+  /* Optimize  a = trim(b)  to  a = b.  */
   remove_trim (rhs);
 
-  /* Replace a = '   ' by a = '' to optimize away a memcpy, but only
-	 for strings with non-deferred length (otherwise we would
-	 reallocate the length.  */
-  if (empty_string(rhs) && ! lhs->ts.deferred)
+  /* Replace a = '   ' by a = '' to optimize away a memcpy.  */
+  if (empty_string(rhs))
 	rhs->value.character.length = 0;
 }
 
@@ -1171,7 +1169,7 @@ optimize_trim (gfc_expr *e)
 
   ref->u.ss.start = gfc_get_int_expr (gfc_default_integer_kind, NULL, 1);
 
-  /* Build the function call to len_trim(x, gfc_defaul_integer_kind).  */
+  /* Build the function call to len_trim(x, gfc_default_integer_kind).  */
 
   fcn = get_len_trim_call (gfc_copy_expr (e), gfc_default_integer_kind);
 
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 9d48a09..7d1a6d4 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -6891,7 +6891,6 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
   stmtblock_t body;
   bool l_is_temp;
   bool scalar_to_array;
-  bool def_clen_func;
   tree string_length;
   int n;
 
@@ -7010,13 +7009,8 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
  otherwise the character length of the result is not known.
  NOTE: This relies on having the exact dependence of the length type
  parameter available to the caller; gfortran saves it in the .mod files. */
-  def_clen_func = (expr2->expr_type == EXPR_FUNCTION
-		   || expr2->expr_type == EXPR_COMPCALL
-		   || expr2->expr_type == EXPR_PPC);
-  if (gfc_option.flag_realloc_lhs
-	&& expr2->ts.type == BT_CHARACTER
-	&& (def_clen_func || expr2->expr_type == EXPR_OP)
-	&& expr1->ts.deferred)
+  if (gfc_option.flag_realloc_lhs && expr2->ts.type == BT_CHARACTER
+  && expr1->ts.deferred)
 gfc_add_block_to_block (&block, &rse.pre);
 
   tmp = gfc_trans_scalar_assign (&lse, &rse, expr1->ts,
--- /dev/null	2012-06-12 08:13:11.079779038 +0200
+++ gcc/gcc/testsuite/gfortran.dg/deferred_type_param_8.f90	2012-06-13 09:30:31.0 +0200
@@ -0,0 +1,54 @@
+! { dg-do run }
+!
+! PR fortran/53642
+! PR fortran/45170 (comments 24, 34, 37)
+!
+
+PROGRAM helloworld
+  implicit none
+  character(:),allocatable::string
+  character(11), parameter :: cmp = "hello world"
+  real::rnd
+  integer :: n, i
+  do i = 1, 10
+ call random_number(rnd)
+ n = ceiling(11*rnd)
+ call hello(n, string)
+! print '(A,1X,I0)', '>' // string // '<', len(string)
+ if (n /= len (string) .or. string /= cmp(1:n)) call abort ()
+  end do
+
+  call test_PR53642()
+
+contains
+
+  subroutine hello (n,string)
+character(:), allocatable, intent(out) :: string
+integer,intent(in) :: n
+character(11) :: helloworld="hello world"
+
+string=helloworld(:n)   ! Didn't  work
+!string=(helloworld(:n))! Works.
+!allocate(string, source=helloworld(:n))! Fixed for allocate_with_source_2.f90
+!allocate(string, source=(helloworld(:n)))  ! Works.
+  end subroutine hello
+
+  subroutine test_PR53642()
+character(len=4) :: string="123 "
+character(:), allocatable :: trimmed
+
+trimmed = trim(string)
+if (len_trim(string) /= len(trimmed)) call abort ()
+if (len(trimmed) /= 3) call abort ()
+if (trimmed /= "123") call abort ()
+!print *,len_trim(string),len(trimmed)
+
+! Clear
+trimmed = "XX"
+if (trimmed /= "XX" .or. len(trimmed) /= 6) call abort ()
+
+trimmed = string(1:len_trim(string))
+if (len_trim(trimmed) /= 3) call abort ()
+if (trimmed /= "1

Re: [PR debug/47624] improve value tracking in non-VTA locations

2012-06-13 Thread Alexandre Oliva

On Apr 22, 2012, Alexandre Oliva  wrote:

> for  gcc/ChangeLog
> from  Alexandre Oliva  

>   PR debug/47624
>   * var-tracking.c (loc_exp_dep_pool): New.
>   (vt_emit_notes): Create and release the pool.
>   (compute_bb_dataflow): Use value-based locations in MO_VAL_SET.
>   (emit_notes_in_bb): Likewise.
>   (loc_exp_dep_insert): Deal with NOT_ONEPART vars.
>   (notify_dependents_of_changed_value): Likewise.
>   (notify_dependents_of_resolved_value): Check that NOT_ONEPART
>   variables don't have a VAR_LOC_DEP_LST.
>   (emit_note_insn_var_location): Expand NOT_ONEPART locs that are
>   VALUEs or MEMs of VALUEs.

Ping?  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01320.html

RTH, the patch for PR49888 that you reviewed and approved depends on
this one.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: debug insns in SMS

2012-06-13 Thread Alexandre Oliva

On Apr  9, 2012, Alexandre Oliva  wrote:

>>> I think this will restore proper functioning to SMS in the presence of
>>> debug insns.  A while ago, we'd never generate deps of non-debug insns
>>> on debug insns.  I introduced them to enable sched to adjust (reset)
>>> debug insns when non-debug insns were moved before them.  I believe it
>>> is safe to leave them out of the SCCs.  Even though this will end up
>>> causing some loss of debug info, that's probably unavoidable, and the
>>> end result after this change is probably the best we can hope for.  Your
>>> thoughts?

> for  gcc/ChangeLog
> from  Alexandre Oliva  

>   * ddg.c (build_intra_loop_deps): Discard deps of nondebug on debug.

> Ping?

http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00419.html

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: don't force debug insns after their PREV_INSNs

2012-06-13 Thread Alexandre Oliva

On Apr  9, 2012, Alexandre Oliva  wrote:

> The problem here is that a nondebug insn may be moved ahead of a useful
> debug insn and clobber one of its inputs, rendering it useless, when
> there's no good reason for the debug insn to be kept in place, other
> than an accidental dependency on the previous insn when it happens to be
> unrelated with the debug insn.

> Removing the extraneous dependency, that was thought to be a way to
> reduce movement of debug insns, improves on this problem.  It's not
> clear that this artificial dependency really does any good, since odds
> are that that previous insn may be pulled ahead anyway, in which case so
> will debug insn (unless that would fail other of its deps, of course)

> Retested.  Ok?

> for  gcc/ChangeLog
> from  Alexandre Oliva  

>   * sched-deps.c (sched_analyze_insn): Don't force debug insns
>   to follow their original predecessors.

Ping?  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00418.html

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: [PR48866] three alternative fixes

2012-06-13 Thread Alexandre Oliva

On Apr  9, 2012, Alexandre Oliva  wrote:

> On Jun  2, 2011, Alexandre Oliva  wrote:
>> On May 30, 2011, Alexandre Oliva  wrote:
>>> On May 30, 2011, Alexandre Oliva  wrote:

>>>> I have 3 different, mutually exclusive patches that fix PR 48866.  The
>>>> problem is exponential time while dealing with an expression that
>>>> resulted from a long chain of replaceable insns with memory accesses
>>>> moved past the debug insns referring to their results.

>>>> 1. emit debug temps for replaceable DEFs that end up being referenced in
>>>> debug insns.  We already have some code to try to deal with this, but it
>>>> emits the huge expressions we'd rather avoid, and it may create
>>>> unnecessary duplication.  This new approach emits a placeholder instead
>>>> of skipping replaceable DEFs altogether, and then, if the DEF is
>>>> referenced in a debug insn (perhaps during the late debug re-expansion of
>>>> some other placeholder), it is expanded.  Placeholders that end up not
>>>> being referenced are then thrown away.

>>> This is my favorite option, for it's safest: it doesn't change
>>> executable code at all (or should I say it *shouldn't* change it, for I
>>> haven't verified that it doesn't), retaining any register pressure
>>> benefits from TER.

>> This revised and retested version records expansions in an array indexed
>> on SSA version rather than a pointer_map, as suggested by Matz.

> Updated to deal with debug source bind stmts, added an assertion in
> var-tracking to make sure we don't get unexpected kinds of decls in
> VAR_LOCATION insns.  Regstrapped on x86_64-linux-gnu and i686-linux-gnu.
> Ok to install?

> for  gcc/ChangeLog
> from  Alexandre Oliva  

>   PR debug/48866
>   * cfgexpand.c (DEBUG_INSN_TOEXPAND): New.
>   (def_expansions): New.
>   (def_expansions_init): New.
>   (def_expansions_remove_placeholder, def_expansions_fini): New.
>   (def_get_expansion_ptr): New.
>   (expand_debug_expr): Create debug temps as needed.
>   (expand_debug_insn): New, split out of...
>   (expand_debug_locations): ... this.
>   (gen_emit_debug_insn): New, split out of...
>   (expand_gimple_basic_block): ... this.  Simplify expansion of
>   debug stmts.  Emit placeholders for replaceable DEFs, rather
>   than debug temps at last non-debug uses.
>   (gimple_expand_cfg): Initialize and finalize expansions cache.
>   * var-tracking.c (use_type): Check for acceptable var decls in
>   var_locations.

Ping?  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00413.html

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: [trunk<-vta] Re: [vtab] Permit coalescing of user variables

2012-06-13 Thread Alexandre Oliva

On Apr  9, 2012, Alexandre Oliva  wrote:

> On Jun  4, 2011, Alexandre Oliva  wrote:

>> On Oct 13, 2009, Alexandre Oliva  wrote:
>>> On Jun  1, 2009, Alexandre Oliva  wrote:
>>>> A long time ago, when variable tracking at assignments was just a
>>>> distant dream, we ran into one of the first contentious points, which
>>>> had to do with coalescing SSA names on copyrename.

>>>> On the one hand, coalescing unrelated SSA names made for better code (at
>>>> least in theory) but poorer debug information; on the other hand,
>>>> refraining from coalescing them exploded compile-time memory use.

>>>> We currently implement a trade-off by which variables inlined from other
>>>> functions can be coalesced, so as to save compile-time memory, reduce
>>>> abstraction penalties and retain debug information for out-of-line
>>>> functions.

>>>> The patch below (ping) implements two other possibilities: refraining
>>>> from coalescing even inlined SSA names, which might enable better debug
>>>> information to be generated, and enabling coalescing of all related
>>>> variables, for better code at the expense of debug information.

>>>> VTA doesn't really care which of the 3 possibilities is used, it works
>>>> equally well with all of them.

>>> On Jun  1, 2009, Alexandre Oliva  also wrote:

>>>> And the patch below changes the default so that we can optimize more.

>>> This patch combines the two patches described above, now that VTA is
>>> enabled by default.

>> This is an updated version of the patch, adjusting the testcases that
>> didn't expect this kind of variable coalescing.

> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok?

> for  gcc/ChangeLog
> from  Alexandre Oliva  

>   * common.opt (ftree-coalesce-inlined-vars): New.
>   (ftree-coalesce-vars): New.
>   * doc/invoke.texi: Document them.
>   * tree-ssa-copyrename.c (copy_rename_partition_coalesce):
>   Implement them.

> for  gcc/testsuite/ChangeLog
> from  Alexandre Oliva  

>   * g++.dg/tree-ssa/ivopts-2.C: Adjust for coalescing.
>   * gcc.dg/tree-ssa/forwprop-11.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-fre-1.c: Likewise.

> Ping?  (Updated with improved docs; should the options be renamed to
> -ftree-copyrename-* to match the option that covers the entire pass?)

Ping?  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00412.html

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: [PATCH 3/3] rs6000: Rewrite sync patterns for atomic; expand early.

2012-06-13 Thread Richard Guenther

On Tue, 12 Jun 2012, Richard Henderson wrote:

> On 2012-06-11 18:40, David Edelsohn wrote:
> >> > Nope.  I do see the obvious mistake in the atomic_load pattern though:
> >> > The mode iterator should have been INT1 not INT.
> > Did you want to commit the fix for the iterator?
> > 
> 
> Applied the following to mainline.
> 
> It ought to go onto the 4.7 branch as well, as it's a wrong-code bug.
> Are we at a place in the 4.7.1 release process where that's possible?

If you are sure it won't break anything go ahead (sooner than later
please).

Thanks,
Richard.

Re: [PR52983, PR48866] eliminate autoinc from debug_insn locs

2012-06-13 Thread Alexandre Oliva

On May  3, 2012, Alexandre Oliva  wrote:

> On May  3, 2012, Alexandre Oliva  wrote:
>> My recent patch for PR48866

> ... had some inconsistency in behavior between dce and word_dce, as you
> pointed out.  I couldn't find any reason for that, so I made them the
> same.

> Regstrapped on x86_64-linux-gnu and i686-linux-gnu

> for  gcc/ChangeLog
> from  Alexandre Oliva  

>   PR debug/52983
>   PR debug/48866
>   * dce.c (word_dce_process_block): Insert debug temps only if the
>   insn is not marked.
>   (dce_process_block): Likewise, and if debug.used is not empty,
>   and only after iterating over all DEFs that might mark the insn.

Ping?  http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00301.html

-- 
Alexandre Oliva   home: +55 19 32435233  cell: +55 19 97143658
OS Tools Compiler Engineer I am Red Hat Brazil
You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- FSFLA.org  FSF Latin America board member

Re: [PR52983, PR48866] eliminate autoinc from debug_insn locs

2012-06-13 Thread Jakub Jelinek

On Wed, Jun 13, 2012 at 05:46:47AM -0300, Alexandre Oliva wrote:
> On May  3, 2012, Alexandre Oliva  wrote:
> 
> > On May  3, 2012, Alexandre Oliva  wrote:
> >> My recent patch for PR48866
> 
> > ... had some inconsistency in behavior between dce and word_dce, as you
> > pointed out.  I couldn't find any reason for that, so I made them the
> > same.
> 
> > Regstrapped on x86_64-linux-gnu and i686-linux-gnu
> 
> > for  gcc/ChangeLog
> > from  Alexandre Oliva  
> 
> > PR debug/52983
> > PR debug/48866
> > * dce.c (word_dce_process_block): Insert debug temps only if the
> > insn is not marked.
> > (dce_process_block): Likewise, and if debug.used is not empty,
> > and only after iterating over all DEFs that might mark the insn.
> 
> Ping?  http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00301.html

Ok.  Thanks.

Jakub

Re: constant that doesn't fit in 32bits in alpha.c

2012-06-13 Thread Pedro Alves

On 06/12/2012 08:44 PM, Joseph S. Myers wrote:

> I'd rather have a macro HOST_WIDE_INT_C in hwint.h (like INTMAX_C etc. in 
> stdint.h).  HOST_WIDE_INT_1 is already defined in hwint.h to either 1L or 
> 1LL; I'd suggest defining HOST_WIDE_INT_C to concatenate with either L or 
> LL (and then HOST_WIDE_INT_1 can be HOST_WIDE_INT_C (1), unconditionally).

Related, does gcc forbid "long long" / ULL?  In a recent similar GDB discussion,
I noticed that libdecnumber seems to use both unconditionally (for UINT64, and
e.g., the initialization of reciprocals10_128).  So if libdecnumber is always
built with gcc, gcc already depends on "long long" / ULL being available too.

-- 
Pedro Alves
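
For illustration, the HOST_WIDE_INT_C suggestion quoted above could look
roughly like the sketch below.  The LONG_MAX test used to pick the suffix and
the helper hwi_bit are assumptions made for this example only; the real
hwint.h selects the type with its own logic.

```c
#include <limits.h>

/* Assumption for this sketch: use long when it is at least 64 bits wide,
   otherwise fall back to long long.  */
#if LONG_MAX > 0x7fffffffL
# define HOST_WIDE_INT long
# define HOST_WIDE_INT_C(X) X ## L
#else
# define HOST_WIDE_INT long long
# define HOST_WIDE_INT_C(X) X ## LL
#endif

/* With the suffix macro in place, the constant 1 needs no conditional.  */
#define HOST_WIDE_INT_1 HOST_WIDE_INT_C (1)

/* Hypothetical helper: a value with only bit N set (0 <= N <= 62).  */
HOST_WIDE_INT
hwi_bit (int n)
{
  return HOST_WIDE_INT_1 << n;
}
```

A constant that does not fit in 32 bits would then be written as
HOST_WIDE_INT_C (0x10000000000) instead of carrying a hard-coded L or LL
suffix.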

Re: [PATCH, RFC] First cut at using vec_construct for strided loads

2012-06-13 Thread Richard Guenther

On Tue, 12 Jun 2012, William J. Schmidt wrote:

> This patch is a follow-up to the discussion generated by
> http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00546.html.  I've added
> vec_construct to the cost model for use in vect_model_load_cost, and
> implemented a cost calculation that makes sense to me for PowerPC.  I'm
> less certain about the default, i386, and spu implementations.  I took a
> guess at i386 from the discussions we had, and used the same calculation
> for the default and for spu.  I'm hoping you or others can fill in the
> blanks if I guessed badly.
> 
> The i386 cost for vec_construct is different from all the others, which
> are parameterized for each processor description.  This should probably
> be parameterized in some way as well, but thought you'd know better than
> I how that should be.  Perhaps instead of
> 
>   elements / 2 + 1
> 
> it should be
> 
>   (elements / 2) * X + Y
> 
> where X and Y are taken from the processor description, and represent
> the cost of a merge and a permute, respectively.  Let me know what you
> think.

Looks good to me with the gcc_asserts removed - TYPE_VECTOR_SUBPARTS
might be 1 for V1TImode for example (heh, not that the vectorizer would
vectorize to that).  But I don't see any possible breakage with
elements == 1, do you?

Target maintainers can improve on the cost calculation if they wish,
the default looks sensible to me.
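
As a rough sketch of the two cost formulas discussed above (not the code from
the patch): the function names are made up for this example, and merge_cost
and permute_cost stand in for the per-processor X and Y values, which do not
exist in any processor description yet.

```c
/* Default cost from the patch: elements / 2 + 1.  With elements == 1 this
   yields 1, so the formula itself does not need elements > 1.  */
int
vec_construct_cost_default (unsigned elements)
{
  return (int) (elements / 2 + 1);
}

/* Parameterized variant: (elements / 2) * X + Y, where X is the assumed
   cost of a merge and Y the assumed cost of a permute.  */
int
vec_construct_cost_param (unsigned elements, int merge_cost, int permute_cost)
{
  return (int) (elements / 2) * merge_cost + permute_cost;
}
```

With merge_cost and permute_cost both set to 1 the parameterized form reduces
to the default one, so a target that supplies no values would keep the
current behaviour.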

Thanks,
Richard.

> Thanks,
> Bill
> 
> 
> 2012-06-12  Bill Schmidt  
> 
>   * targhooks.c (default_builtin_vectorization_cost): Handle
>   vec_construct, using vectype to base cost on subparts.
>   * target.h (enum vect_cost_for_stmt): Add vec_construct.
>   * tree-vect-stmts.c (vect_model_load_cost): Use vec_construct
>   instead of scalar_to_vec.
>   * config/spu/spu.c (spu_builtin_vectorization_cost): Handle
>   vec_construct in same way as default for now.
>   * config/i386/i386.c (ix86_builtin_vectorization_cost): Likewise.
>   * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost):
>   Handle vec_construct, including special case for 32-bit loads.
>   
> 
> Index: gcc/targhooks.c
> ===
> --- gcc/targhooks.c   (revision 188482)
> +++ gcc/targhooks.c   (working copy)
> @@ -499,9 +499,11 @@ default_builtin_vectorized_conversion (unsigned in
>  
>  int
>  default_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
> -tree vectype ATTRIBUTE_UNUSED,
> +tree vectype,
>  int misalign ATTRIBUTE_UNUSED)
>  {
> +  unsigned elements;
> +
>switch (type_of_cost)
>  {
>case scalar_stmt:
> @@ -524,6 +526,11 @@ default_builtin_vectorization_cost (enum vect_cost
>case cond_branch_taken:
>  return 3;
>  
> +  case vec_construct:
> + elements = TYPE_VECTOR_SUBPARTS (vectype);
> + gcc_assert (elements > 1);
> + return elements / 2 + 1;
> +
>default:
>  gcc_unreachable ();
>  }
> Index: gcc/target.h
> ===
> --- gcc/target.h  (revision 188482)
> +++ gcc/target.h  (working copy)
> @@ -146,7 +146,8 @@ enum vect_cost_for_stmt
>cond_branch_not_taken,
>cond_branch_taken,
>vec_perm,
> -  vec_promote_demote
> +  vec_promote_demote,
> +  vec_construct
>  };
>  
>  /* The target structure.  This holds all the backend hooks.  */
> Index: gcc/tree-vect-stmts.c
> ===
> --- gcc/tree-vect-stmts.c (revision 188482)
> +++ gcc/tree-vect-stmts.c (working copy)
> @@ -1031,11 +1031,13 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
>/* The loads themselves.  */
>if (STMT_VINFO_STRIDE_LOAD_P (stmt_info))
>  {
> -  /* N scalar loads plus gathering them into a vector.
> - ???  scalar_to_vec isn't the cost for that.  */
> +  /* N scalar loads plus gathering them into a vector.  */
> +  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>inside_cost += (vect_get_stmt_cost (scalar_load) * ncopies
> -   * TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)));
> -  inside_cost += ncopies * vect_get_stmt_cost (scalar_to_vec);
> +   * TYPE_VECTOR_SUBPARTS (vectype));
> +  inside_cost += ncopies
> + * targetm.vectorize.builtin_vectorization_cost (vec_construct,
> + vectype, 0);
>  }
>else
>  vect_get_load_cost (first_dr, ncopies,
> Index: gcc/config/spu/spu.c
> ===
> --- gcc/config/spu/spu.c  (revision 188482)
> +++ gcc/config/spu/spu.c  (working copy)
> @@ -6908,9 +6908,11 @@ spu_builtin_mask_for_load (void)
>  /* Implement targetm.vectorize.builtin_vectorization_cost.  */
>  static int 
>

Re: [trunk<-vta] Re: [vtab] Permit coalescing of user variables

2012-06-13 Thread Richard Guenther

On Wed, Jun 13, 2012 at 10:09 AM, Alexandre Oliva  wrote:
> On Apr  9, 2012, Alexandre Oliva  wrote:
>
>> On Jun  4, 2011, Alexandre Oliva  wrote:
>
>>> On Oct 13, 2009, Alexandre Oliva  wrote:
>>>> On Jun  1, 2009, Alexandre Oliva  wrote:
>>>>> A long time ago, when variable tracking at assignments was just a
>>>>> distant dream, we ran into one of the first contentious points, which
>>>>> had to do with coalescing SSA names on copyrename.
>
>>>>> On the one hand, coalescing unrelated SSA names made for better code (at
>>>>> least in theory) but poorer debug information; on the other hand,
>>>>> refraining from coalescing them exploded compile-time memory use.
>
>>>>> We currently implement a trade-off by which variables inlined from other
>>>>> functions can be coalesced, so as to save compile-time memory, reduce
>>>>> abstraction penalties and retain debug information for out-of-line
>>>>> functions.
>
>>>>> The patch below (ping) implements two other possibilities: refraining
>>>>> from coalescing even inlined SSA names, which might enable better debug
>>>>> information to be generated, and enabling coalescing of all related
>>>>> variables, for better code at the expense of debug information.
>
>>>>> VTA doesn't really care which of the 3 possibilities is used, it works
>>>>> equally well with all of them.
>
>>>> On Jun  1, 2009, Alexandre Oliva  also wrote:
>
>>>>> And the patch below changes the default so that we can optimize more.
>
>>>> This patch combines the two patches described above, now that VTA is
>>>> enabled by default.
>
>>> This is an updated version of the patch, adjusting the testcases that
>>> didn't expect this kind of variable coalescing.
>
>> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok?

Ok.

Thanks,
Richard.

>
>> for  gcc/ChangeLog
>> from  Alexandre Oliva  
>
>>       * common.opt (ftree-coalesce-inlined-vars): New.
>>       (ftree-coalesce-vars): New.
>>       * doc/invoke.texi: Document them.
>>       * tree-ssa-copyrename.c (copy_rename_partition_coalesce):
>>       Implement them.
>
>> for  gcc/testsuite/ChangeLog
>> from  Alexandre Oliva  
>
>>       * g++.dg/tree-ssa/ivopts-2.C: Adjust for coalescing.
>>       * gcc.dg/tree-ssa/forwprop-11.c: Likewise.
>>       * gcc.dg/tree-ssa/ssa-fre-1.c: Likewise.
>
>> Ping?  (Updated with improved docs; should the options be renamed to
>> -ftree-copyrename-* to match the option that covers the entire pass?)
>
> Ping?  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00412.html
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer

RFC/RFA: Allow targets to override the definition of FLOAT_BIT_ORDER_MISMATCH

2012-06-13 Thread Nick Clifton

Hi Guys,

  The RX port currently has a problem with the software implementation
  of 64-bit doubles - the libgcc functions produce bogus results.  I
  tracked this down to this definition in libgcc/fp-bit.h:

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define FLOAT_BIT_ORDER_MISMATCH
#endif

  For the RX __BYTE_ORDER__ is __ORDER_LITTLE_ENDIAN__, but the bits in
  the doubles are in little endian order, so FLOAT_BIT_ORDER_MISMATCH
  should not be defined.

  I have produced a patch (see below) that fixes this problem for me by
  allowing the target to specify FLOAT_BIT_ORDER_MISMATCH explicitly, if
  it wants to.  This matches the old behaviour (in gcc 4.6 and earlier)
  before fp-bit.h was moved to the libgcc directory.  I suspect however
  that this may not be the correct solution, so I am asking for comments
  as to how the problem should be solved.  If my approach is correct
  however, then please may I apply the patch ?

Cheers
  Nick

libgcc/ChangeLog
2012-06-13  Nick Clifton  

* fp-bit.h (FLOAT_BIT_ORDER_MISMATCH): If
LIBGCC2_FLOAT_BIT_ORDER_MISMATCH is defined then use this to
determine if FLOAT_BIT_ORDER_MISMATCH should be defined.
* config/rx/rx-lib.h (LIBGCC2_FLOAT_BIT_ORDER_MISMATCH): Define
to false.

Index: libgcc/fp-bit.h
===
--- libgcc/fp-bit.h (revision 188497)
+++ libgcc/fp-bit.h (working copy)
@@ -129,9 +129,21 @@
 #define NO_DI_MODE
 #endif
 
+/* Allow the target the chance to specify whether
+   the bit order matches the byte order.  */
+#if defined LIBGCC2_FLOAT_BIT_ORDER_MISMATCH
+/* Evaluate the expression - it might be zero.  */
+#if LIBGCC2_FLOAT_BIT_ORDER_MISMATCH
+#define FLOAT_BIT_ORDER_MISMATCH
+#else
+#undef  FLOAT_BIT_ORDER_MISMATCH
+#endif
+#else
+/* Otherwise assume that the bits within a byte are in big endian order.  */
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
 #define FLOAT_BIT_ORDER_MISMATCH
 #endif
+#endif
 
 #if __BYTE_ORDER__ != __FLOAT_WORD_ORDER__
 #define FLOAT_WORD_ORDER_MISMATCH
Index: libgcc/config/rx/rx-lib.h
===
--- libgcc/config/rx/rx-lib.h   (revision 188497)
+++ libgcc/config/rx/rx-lib.h   (working copy)
@@ -1,5 +1,10 @@
-#ifndef __RX_64BIT_DOUBLES__
+#if   defined __RX_64BIT_DOUBLES__
+#undef FLOAT_ONLY
+#elif defined __RX_32BIT_DOUBLES__
 #define DF SF
 #define FLOAT_ONLY
+#else
+#error "RX double size not defined"
 #endif
 
+#define LIBGCC2_FLOAT_BIT_ORDER_MISMATCH 0
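
The selection logic in the fp-bit.h hunk above can be tried out on its own
with a sketch like the following.  It assumes a GCC-compatible compiler that
predefines __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__; the function
float_bit_order_mismatch is a hypothetical helper added only so the result is
observable, and the first #define plays the role of the RX target header.

```c
/* Stand-in for the target header (rx-lib.h in the patch): the target
   states explicitly that there is no mismatch.  */
#define LIBGCC2_FLOAT_BIT_ORDER_MISMATCH 0

#if defined LIBGCC2_FLOAT_BIT_ORDER_MISMATCH
/* The target has spoken: evaluate its expression, which may be zero.  */
# if LIBGCC2_FLOAT_BIT_ORDER_MISMATCH
#  define FLOAT_BIT_ORDER_MISMATCH
# endif
#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
/* No override: fall back to the byte-order based default.  */
# define FLOAT_BIT_ORDER_MISMATCH
#endif

/* Hypothetical helper: report which way the selection went.  */
int
float_bit_order_mismatch (void)
{
#ifdef FLOAT_BIT_ORDER_MISMATCH
  return 1;
#else
  return 0;
#endif
}
```

Removing the first #define makes the helper follow the host byte order again,
which is the behaviour every other target would keep under the proposed
patch.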

[patch] Fix PR middle-end/53590

2012-06-13 Thread Eric Botcazou

This PR is about straight-line code not being as much vectorized in Ada as in C 
or C++.  The problem stems from the very conservative semantics implemented 
under -fnon-call-exceptions for the sake of Java.  We don't need it in Ada.
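
A small illustration of the kind of statement at issue, written in C purely
for convenience (the PR itself concerns Ada) and using a made-up function
name.  The quotient is never used, but under -fnon-call-exceptions the
division is treated as something that may throw, and as described in the
ChangeLog below the affected passes currently refuse to delete such
instructions.

```c
/* Hypothetical example: the quotient is computed but never used.  */
int
sum_ignoring_quotient (int a, int b)
{
  int q = a / b;   /* may trap when b == 0; otherwise dead */
  (void) q;
  return a + b;
}
```

The new switch would let a front end declare that dropping the division,
together with its potential exception, is acceptable.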

Tested on x86_64-suse-linux, OK for the mainline?


2012-06-13  Eric Botcazou  

PR middle-end/53590
* common.opt (-fdelete-dead-exceptions): New switch.
* doc/invoke.texi (Optimization Options): Document it.
* cse.c (insn_live_p): Do not return true for an insn that could throw
if dead exceptions can be deleted.
* dce.c (can_alter_cfg): New flag.
(deletable_insn_p): Do not return false for an insn that can throw if
the CFG can be altered and dead exceptions can be deleted.
(init_dce): Set can_alter_cfg to false for fast DCE, true otherwise.
* dse.c (scan_insn): Do not preserve an insn that could throw if dead
exceptions can be deleted.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Do not mark a
statement that could throw as necessary if dead exceptions can be
deleted.
ada/
* gcc-interface/misc.c (gnat_init_options_struct): Set
opts->x_flag_delete_dead_exceptions to 1.


-- 
Eric Botcazou
Index: common.opt
===
--- common.opt	(revision 188445)
+++ common.opt	(working copy)
@@ -979,6 +987,10 @@ fdelayed-branch
 Common Report Var(flag_delayed_branch) Optimization
 Attempt to fill delay slots of branch instructions
 
+fdelete-dead-exceptions
+Common Report Var(flag_delete_dead_exceptions) Init(0) Optimization
+Delete dead statements that may raise exceptions
+
 fdelete-null-pointer-checks
 Common Report Var(flag_delete_null_pointer_checks) Init(1) Optimization
 Delete useless null pointer checks
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 188445)
+++ doc/invoke.texi	(working copy)
@@ -359,7 +359,7 @@ Objective-C and Objective-C++ Dialects}.
 -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol
 -fcx-limited-range @gol
 -fdata-sections -fdce -fdelayed-branch @gol
--fdelete-null-pointer-checks -fdevirtualize -fdse @gol
+-fdelete-dead-exceptions -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 -fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
 -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
 -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
@@ -6774,6 +6786,16 @@ branch-less equivalents.
 
 Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}.
 
+@item -fdelete-dead-exceptions
+@opindex fdelete-dead-exceptions
+Assume that statements that may raise exceptions but don't otherwise contribute
+to the execution of the program can be optimized away.
+
+Most languages supporting exceptions disable this option at all levels.
+Otherwise it is enabled at all levels: @option{-O0}, @option{-O1},
+@option{-O2}, @option{-O3}, @option{-Os}.  Passes that use the information
+are enabled independently at different optimization levels.
+
 @item -fdelete-null-pointer-checks
 @opindex fdelete-null-pointer-checks
 Assume that programs cannot safely dereference null pointers, and that
Index: cse.c
===
--- cse.c	(revision 188445)
+++ cse.c	(working copy)
@@ -6800,7 +6800,7 @@ static bool
 insn_live_p (rtx insn, int *counts)
 {
   int i;
-  if (insn_could_throw_p (insn))
+  if (!flag_delete_dead_exceptions && insn_could_throw_p (insn))
 return true;
   else if (GET_CODE (PATTERN (insn)) == SET)
 return set_live_p (PATTERN (insn), insn, counts);
Index: dce.c
===
--- dce.c	(revision 188445)
+++ dce.c	(working copy)
@@ -47,6 +47,9 @@ along with GCC; see the file COPYING3.
we don't want to reenter it.  */
 static bool df_in_progress = false;
 
+/* True if we are allowed to alter the CFG in this pass.  */
+static bool can_alter_cfg = false;
+
 /* Instructions that have been marked but whose dependencies have not
yet been processed.  */
 static VEC(rtx,heap) *worklist;
@@ -113,8 +116,10 @@ deletable_insn_p (rtx insn, bool fast, b
   if (!NONJUMP_INSN_P (insn))
 return false;
 
-  /* Don't delete insns that can throw.  */
-  if (!insn_nothrow_p (insn))
+  /* Don't delete insns that can throw if we need to preserve the CFG or
+ statements that may raise exceptions.  */
+  if ((!can_alter_cfg || !flag_delete_dead_exceptions)
+  && !insn_nothrow_p (insn))
 return false;
 
   body = PATTERN (insn);
@@ -711,7 +716,10 @@ init_dce (bool fast)
 {
   bitmap_obstack_initialize (&dce_blocks_bitmap_obstack);
   bitmap_obstack_initialize (&dce_tmp_bitmap_obstack);
+  can_alter_cfg = false;
 }
+  else
+can_alter_cfg = true;
 
   marked = sbitmap_alloc (get_max_ui

Re: [PATCH] Option to build bare-metal ARM cross-compiler for arm-none-eabi target without libunwind

2012-06-13 Thread Julian Brown

On Wed, 13 Jun 2012 09:35:42 +0200
Fredrik Hederstierna  wrote:

> This is the link to the original patch.
> It contains some background information and more links.
> 
> http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00720.html
> 
> The patch is now updated and improved by Larry Doolittle.
> /Fredrik

Related:

http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01618.html

That one only handles 64-bit division (I don't know if there are other
things which pull in the unwinder), and is probably quite bitrotten by
now -- it never got approved/committed. Sorry about that!

Cheers,

Julian

Re: [PATCH] Option to build bare-metal ARM cross-compiler for arm-none-eabi target without libunwind

2012-06-13 Thread Joseph S. Myers

On Wed, 13 Jun 2012, Fredrik Hederstierna wrote:

> >You need to provide a self-contained explanation of what the problem
> >is  that your patch is fixing and why you chose that approach to
> >fixing it -  with reference to the ARM EABI documents (RTABI etc.)
> >for why your  approach is valid according to the ARM EABI.  libunwind
> >is a library separate from libgcc that is used by libgcc for
> >unwinding on ia64-linux-gnu only (whether built by GCC or separately
> >installed).  There is also a separate libunwind project that may be
> >used  on GNU/Linux platforms but I am not aware of being used for
> >bare-metal at  all.
> 
> Our experience is that when using simple integer division in your code,
> the libgcc division routine will include div-zero handling (exception
> support?), which will include dependencies to libunwind. This dependency
> problem is the background of the patch.

What do you mean by "dependencies to libunwind"?

As I explained, libunwind is a separate library that GCC should never be 
building or using for platforms other than IA64 GNU/Linux.  The _Unwind_*
functions are *not* libunwind functions on ARM EABI, they are functions 
from libgcc_eh.  The __aeabi_unwind_cpp_pr* personality routines are *not* 
libunwind functions either, they are also functions from libgcc_eh.  
Anything related to those functions should not use the term "libunwind", 
whether in configure option names, or patch submissions, since it is not 
libunwind.

Linking with libgcc_eh should work fine by default; no special action 
should be needed to link in whatever unwind functions your code 
references.  But if you don't want to link them in at all, and if defining 
your own versions of __aeabi_div0 / __aeabi_ldiv0 doesn't suffice, then as 
long as your code doesn't raise exceptions it should be safe for you to 
stub out the __aeabi_unwind_cpp_pr* functions.

-- 
Joseph S. Myers
jos...@codesourcery.com
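
The last suggestion above could be sketched as follows for a bare-metal ARM
EABI program that never raises exceptions.  This is an illustration under
stated assumptions, not part of any patch in this thread: the
division-by-zero hooks use the names and prototypes given in the ARM RTABI
(__aeabi_idiv0 and __aeabi_ldiv0), and the personality routines are stubbed
with a simplified void signature, which is only tolerable because they must
never be called.

```c
/* Division-by-zero hooks reached from the integer division helpers.
   Returning the argument unchanged is one of the behaviours the RTABI
   allows; a real system might trap or log here instead.  */
int
__aeabi_idiv0 (int return_value)
{
  return return_value;
}

long long
__aeabi_ldiv0 (long long return_value)
{
  return return_value;
}

/* Stub personality routines.  Assumption: no exception is ever raised,
   so these are never called and the simplified signature is harmless.  */
void __aeabi_unwind_cpp_pr0 (void) {}
void __aeabi_unwind_cpp_pr1 (void) {}
void __aeabi_unwind_cpp_pr2 (void) {}
```

Providing these definitions in the application means the linker has no reason
to pull the corresponding objects, and the unwinder behind them, out of the
GCC runtime libraries.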

Re: [patch] Fix PR middle-end/53590

2012-06-13 Thread Richard Guenther

On Wed, Jun 13, 2012 at 12:06 PM, Eric Botcazou  wrote:
> This PR is about straight-line code not being as much vectorized in Ada as in C
> or C++.  The problem stems from the very conservative semantics implemented
> under -fnon-call-exceptions for the sake of Java.  We don't need it in Ada.
>
> Tested on x86_64-suse-linux, OK for the mainline?

+@item -fdelete-dead-exceptions
+@opindex fdelete-dead-exceptions
+Assume that statements that may raise exceptions but don't otherwise contribute
+to the execution of the program can be optimized away.
+
+Most languages supporting exceptions disable this option at all levels.
+Otherwise it is enabled at all levels: @option{-O0}, @option{-O1},
+@option{-O2}, @option{-O3}, @option{-Os}.  Passes that use the information
+are enabled independently at different optimization levels.

I would not iterate all optimization levels here (you miss -Ofast), nor mention
them.  Thus, simply

+@item -fdelete-dead-exceptions
+@opindex fdelete-dead-exceptions
+Assume that statements that may raise exceptions but don't otherwise contribute
+to the execution of the program can be optimized away.
+
This flag is set by the frontends according to their language specification.
Passes that cause dead exceptions to be removed are enabled
independently at different optimization levels.

and be done with that.  Btw, what is "doesn't otherwise contribute to
the execution"?
That is, is the Ada equivalent of

 try { a / maybe_zero; } catch (...) { printf ("maybe_zero was zero!"); }

dead code?

Btw, I suppose you need to arrange to properly transfer the flag for LTO
in some way (well, or conservatively not remove dead exceptions at link-time).

Thanks,
Richard.

>
> 2012-06-13  Eric Botcazou  
>
>        PR middle-end/53590
>        * common.opt (-fdelete-dead-exceptions): New switch.
>        * doc/invoke.texi (Optimization Options): Document it.
>        * cse.c (insn_live_p): Do not return true for an insn that could throw
>        if dead exceptions can be deleted.
>        * dce.c (can_alter_cfg): New flag.
>        (deletable_insn_p): Do not return false for an insn that can throw if
>        the CFG can be altered and dead exceptions can be deleted.
>        (init_dce): Set can_alter_cfg to false for fast DCE, true otherwise.
>        * dse.c (scan_insn): Do not preserve an insn that could throw if dead
>        exceptions can be deleted.
>        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Do not mark a
>        statement that could throw as necessary if dead exceptions can be
>        deleted.
> ada/
>        * gcc-interface/misc.c (gnat_init_options_struct): Set
>        opts->x_flag_delete_dead_exceptions to 1.
>
>
> --
> Eric Botcazou

Re: [PATCH] Option to build bare-metal ARM cross-compiler for arm-none-eabi target without libunwind

2012-06-13 Thread Fredrik Hederstierna



-Joseph Myers  wrote: -
> On Wed, 13 Jun 2012, Fredrik Hederstierna wrote:
> > > You need to provide a self-contained explanation of what the problem
> > > is that your patch is fixing and why you chose that approach to
> > > fixing it - with reference to the ARM EABI documents (RTABI etc.)
> > > for why your approach is valid according to the ARM EABI.  libunwind
> > > is a library separate from libgcc that is used by libgcc for
> > > unwinding on ia64-linux-gnu only (whether built by GCC or separately
> > > installed).  There is also a separate libunwind project that may be
> > > used on GNU/Linux platforms but I am not aware of being used for
> > > bare-metal at all.
> >
> > Our experience is that when using simple integer division in your
> > code, the libgcc division routine will include div-zero handling
> > (exception support?), which will include dependencies to libunwind.
> > This dependency problem is the background of the patch.

> What do you mean by "dependencies to libunwind"?
> As I explained, libunwind is a separate library that GCC
> should never be  building or using for platforms other than IA64
> GNU/Linux.  The _Uwind_*  functions are *not* libunwind functions on
> ARM EABI, they are functions  from libgcc_eh.

Ah, ok, now I understand. The problem is rather the new dependency
from libgcc to libgcc_eh then. My mistake, I thought the UnWind calls were
related to libunwind, but as you have now explained, it's libgcc_eh.
Sorry for my ignorance.

So, since we build the toolchain with --disable-shared, libgcc_eh.a
will not be built. This makes linking against libgcc_eh impossible.
Somehow, building libgcc with -fexceptions also makes it
reference abort() and memcpy(), but this may be related.

> The __aeabi_unwind_cpp_pr* personality routines are *not* libunwind
> functions either, they are also functions from libgcc_eh.  Anything
> related to those functions should not use the term "libunwind",
> whether in configure option names, or patch submissions, since it is
> not libunwind.  Linking with libgcc_eh should work fine by default;
> no special action should be needed to link in whatever unwind
> functions your code references.  But if you don't want to link them
> in at all, and if defining your own versions of __aeabi_div0 /
> __aeabi_ldiv0 doesn't suffice, then as long as your code doesn't
> raise exceptions it should be safe for you to stub out the
> __aeabi_unwind_cpp_pr* functions.

Ok, so the solution for us is to stub out the __aeabi_unwind_cpp_pr* functions
then.
This seems like a hack to me: since we build just a plain C bare-metal GCC,
I would rather see a solution where the dependencies on libgcc_eh could be
removed completely.
Something like the suggested patches, but with 'libunwind' renamed to
'libgcc_eh'.

Thanks and Best Regards,
Fredrik

Re: [patch] Fix PR middle-end/53590

2012-06-13 Thread Eric Botcazou

> +@item -fdelete-dead-exceptions
> +@opindex fdelete-dead-exceptions
> +Assume that statements that may raise exceptions but don't otherwise
> contribute +to the execution of the program can be optimized away.
> +
> This flag is set by the frontends according to their language
> specification. Passes that cause dead exceptions to be removed are enabled
> independently at different optimization levels.
>
> and be done with that.

OK, will make the change.

> Btw, what is "doesn't otherwise contribute to the execution"?
> That is, is the Ada equivalent of
>
>  try { a / maybe_zero; } catch (...) { printf ("maybe_zero was zero!"); }
>
> dead code?

Essentially, yes.  There is a special permission to optimize it away if the 
result of the division isn't used.

> Btw, I suppose you need to arrange to properly transfer the flag for LTO
> in some way (well, or conservatively not remove dead exceptions at
> link-time).

I can add a flag in 'struct function' next to can_throw_non_call_exceptions.

-- 
Eric Botcazou

Re: [patch] Fix PR middle-end/53590

2012-06-13 Thread Richard Guenther

On Wed, Jun 13, 2012 at 1:08 PM, Eric Botcazou  wrote:
>> +@item -fdelete-dead-exceptions
>> +@opindex fdelete-dead-exceptions
>> +Assume that statements that may raise exceptions but don't otherwise
>> contribute +to the execution of the program can be optimized away.
>> +
>> This flag is set by the frontends according to their language
>> specification. Passes that cause dead exceptions to be removed are enabled
>> independently at different optimization levels.
>>
>> and be done with that.
>
> OK, will make the change.
>
>> Btw, what is "doesn't otherwise contribute to the execution"?
>> That is, is the Ada equivalent of
>>
>>  try { a / maybe_zero; } catch (...) { printf ("maybe_zero was zero!"); }
>>
>> dead code?
>
> Essentially, yes.  There is a special permission to optimize it away if the
> result of the division isn't used.
>
>> Btw, I suppose you need to arrange to properly transfer the flag for LTO
>> in some way (well, or conservatively not remove dead exceptions at
>> link-time).
>
> I can add a flag in 'struct function' next to can_throw_non_call_exceptions.

Not sure if it's worth it though, is it?  This can be done as a followup anyway.

The patch is otherwise ok.

Thanks,
Richard.

> --
> Eric Botcazou

Re: [SH] PR 53568 - Add support for bswap built-ins

2012-06-13 Thread Kaz Kojima

Oleg Endo  wrote:
> The attached patch adds support for the bswap32 built-in on SH.
> 
> Tested with 
> make -k -j8 check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,
> -m4/-mb,-m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,
> -m4a-single/-mb}"
> 
> and no new failures.

OK.

Regards,
kaz

Re: [patch] Fix PR middle-end/53590

2012-06-13 Thread Eric Botcazou

> Not sure if it's worth it though, is it?  This can be done as a followup
> anyway.

If we don't do it, we'll get another PR saying "this works in LTO mode with 
other versions of the Ada compiler" (which is true) so I'll proceed.

> The patch is otherwise ok.

Thanks.

-- 
Eric Botcazou

Re: [PATCH, RFC] First cut at using vec_construct for strided loads

2012-06-13 Thread William J. Schmidt

On Wed, 2012-06-13 at 11:26 +0200, Richard Guenther wrote:
> On Tue, 12 Jun 2012, William J. Schmidt wrote:
> 
> > This patch is a follow-up to the discussion generated by
> > http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00546.html.  I've added
> > vec_construct to the cost model for use in vect_model_load_cost, and
> > implemented a cost calculation that makes sense to me for PowerPC.  I'm
> > less certain about the default, i386, and spu implementations.  I took a
> > guess at i386 from the discussions we had, and used the same calculation
> > for the default and for spu.  I'm hoping you or others can fill in the
> > blanks if I guessed badly.
> > 
> > The i386 cost for vec_construct is different from all the others, which
> > are parameterized for each processor description.  This should probably
> > be parameterized in some way as well, but thought you'd know better than
> > I how that should be.  Perhaps instead of
> > 
> > elements / 2 + 1
> > 
> > it should be
> > 
> > (elements / 2) * X + Y
> > 
> > where X and Y are taken from the processor description, and represent
> > the cost of a merge and a permute, respectively.  Let me know what you
> > think.
> 
> Looks good to me with the gcc_asserts removed - TYPE_VECTOR_SUBPARTS
> might be 1 for V1TImode for example (heh, not that the vectorizer would
> vectorize to that).  But I don't see any possible breakage with
> elements == 1, do you?

No, that was some unnecessary sanity testing I was doing for my own
curiosity.  I'll pull them out and pop this in today.  Thanks for the
review!

Bill
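
The two candidate formulas discussed in the quoted text can be compared with a small standalone sketch. This is not GCC code: the function names are hypothetical, and merge_cost / permute_cost are stand-ins for the X and Y of the proposal, which in a real port would presumably come from the processor description.

```c
/* Standalone sketch of the vec_construct cost formulas; not GCC code.
   All names below are hypothetical.  */

/* Default cost as posted: elements / 2 + 1 (integer division).  */
unsigned
vec_construct_cost_default (unsigned elements)
{
  return elements / 2 + 1;
}

/* Parameterized variant: (elements / 2) * X + Y, where X is the assumed
   cost of a merge and Y the assumed cost of a permute.  */
unsigned
vec_construct_cost_param (unsigned elements, unsigned merge_cost,
                          unsigned permute_cost)
{
  return (elements / 2) * merge_cost + permute_cost;
}
```

With X = Y = 1 the two formulas agree, and for elements == 1 the default simply evaluates to 1, which is consistent with dropping the gcc_assert.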

> 
> Target maintainers can improve on the cost calculation if they wish,
> the default looks sensible to me.
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Bill
> > 
> > 
> > 2012-06-12  Bill Schmidt  
> > 
> > * targhooks.c (default_builtin_vectorization_cost): Handle
> > vec_construct, using vectype to base cost on subparts.
> > * target.h (enum vect_cost_for_stmt): Add vec_construct.
> > * tree-vect-stmts.c (vect_model_load_cost): Use vec_construct
> > instead of scalar_to_vec.
> > * config/spu/spu.c (spu_builtin_vectorization_cost): Handle
> > vec_construct in same way as default for now.
> > * config/i386/i386.c (ix86_builtin_vectorization_cost): Likewise.
> > * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost):
> > Handle vec_construct, including special case for 32-bit loads.
> > 
> > 
> > Index: gcc/targhooks.c
> > ===
> > --- gcc/targhooks.c (revision 188482)
> > +++ gcc/targhooks.c (working copy)
> > @@ -499,9 +499,11 @@ default_builtin_vectorized_conversion (unsigned in
> >  
> >  int
> >  default_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
> > -tree vectype ATTRIBUTE_UNUSED,
> > +tree vectype,
> >  int misalign ATTRIBUTE_UNUSED)
> >  {
> > +  unsigned elements;
> > +
> >switch (type_of_cost)
> >  {
> >case scalar_stmt:
> > @@ -524,6 +526,11 @@ default_builtin_vectorization_cost (enum vect_cost
> >case cond_branch_taken:
> >  return 3;
> >  
> > +  case vec_construct:
> > +   elements = TYPE_VECTOR_SUBPARTS (vectype);
> > +   gcc_assert (elements > 1);
> > +   return elements / 2 + 1;
> > +
> >default:
> >  gcc_unreachable ();
> >  }
> > Index: gcc/target.h
> > ===
> > --- gcc/target.h(revision 188482)
> > +++ gcc/target.h(working copy)
> > @@ -146,7 +146,8 @@ enum vect_cost_for_stmt
> >cond_branch_not_taken,
> >cond_branch_taken,
> >vec_perm,
> > -  vec_promote_demote
> > +  vec_promote_demote,
> > +  vec_construct
> >  };
> >  
> >  /* The target structure.  This holds all the backend hooks.  */
> > Index: gcc/tree-vect-stmts.c
> > ===
> > --- gcc/tree-vect-stmts.c   (revision 188482)
> > +++ gcc/tree-vect-stmts.c   (working copy)
> > @@ -1031,11 +1031,13 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
> >/* The loads themselves.  */
> >if (STMT_VINFO_STRIDE_LOAD_P (stmt_info))
> >  {
> > -  /* N scalar loads plus gathering them into a vector.
> > - ???  scalar_to_vec isn't the cost for that.  */
> > +  /* N scalar loads plus gathering them into a vector.  */
> > +  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> >inside_cost += (vect_get_stmt_cost (scalar_load) * ncopies
> > - * TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)));
> > -  inside_cost += ncopies * vect_get_stmt_cost (scalar_to_vec);
> > + * TYPE_VECTOR_SUBPARTS (vectype));
> > +  inside_cost += ncopies
> > +   * targetm.vectorize.builtin_vectorization_cost (vec_construct,
> > +   vectype, 0);
> >

RFA: better gimplification of compound literals

2012-06-13 Thread Michael Matz

Hi,

On Tue, 12 Jun 2012, Richard Guenther wrote:

> > Ok, I see the C frontend hands us this as
> >
> >  return  VEC_PERM_EXPR < a , b , <<< Unknown tree: compound_literal_expr
> >    v4si D.1712 = { 0, 4, 1, 5 }; >>> > ;
> >
> > and gimplification in some way fails to gimplify it to { 0, 4, 1, 5 }.

Was a non-implemented optimization.  If the compound literal value isn't 
used as lvalue and doesn't have its address taken (and generally fits the 
current predicate) we can as well subst it in place instead of going over 
an intermediate statement.
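
As a rough illustration of the distinction drawn here, the following C sketch contrasts a compound literal that is only read as a value with one whose address is taken and which is then written to. The type and function names are made up for the example; whether GCC actually substitutes the initializer in a given case depends on the gimple predicate mentioned above, so this shows only the language-level difference.

```c
/* Hypothetical example type; not taken from the patch.  */
struct pair { int first; int second; };

/* Value use only: the literal is never addressed or assigned to, so in
   principle its initializer could be used directly.  */
int
sum_of_literal (void)
{
  struct pair p = (struct pair){ 3, 4 };
  return p.first + p.second;
}

/* Address taken and used as an lvalue: the literal has to exist as an
   object, so it cannot simply be replaced by its initializer.  */
int
store_into_literal (void)
{
  struct pair *p = &(struct pair){ 1, 2 };
  p->second = 10;
  return p->first + p->second;
}
```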

Regstrapping on x86_64-linux in progress.  Okay if that passes?


Ciao,
Michael.
-
* gimplify.c (gimplify_compound_literal_expr): Take gimple_test_f
argument, don't emit assign statement if value is directly usable.
(gimplify_expr): Adjust.

testsuite/
* gcc.dg/tree-ssa/vector-4.c: New test.

Index: gimplify.c
===
--- gimplify.c  (revision 188500)
+++ gimplify.c  (working copy)
@@ -3796,15 +3796,29 @@ rhs_predicate_for (tree lhs)
 
 static enum gimplify_status
 gimplify_compound_literal_expr (tree *expr_p, gimple_seq *pre_p,
+   bool (*gimple_test_f) (tree),
fallback_t fallback)
 {
   tree decl_s = COMPOUND_LITERAL_EXPR_DECL_EXPR (*expr_p);
   tree decl = DECL_EXPR_DECL (decl_s);
+  tree init = DECL_INITIAL (decl);
   /* Mark the decl as addressable if the compound literal
  expression is addressable now, otherwise it is marked too late
  after we gimplify the initialization expression.  */
   if (TREE_ADDRESSABLE (*expr_p))
 TREE_ADDRESSABLE (decl) = 1;
+  /* Otherwise, if we don't need an lvalue and have a literal directly
+ substitute it.  Check if it matches the gimple predicate, as
+ otherwise we'd generate a new temporary, and we can as well just
+ use the decl we already have.  */
+  else if (!TREE_ADDRESSABLE (decl)
+  && init
+  && (fallback & fb_lvalue) == 0
+  && gimple_test_f (init))
+{
+  *expr_p = init;
+  return GS_OK;
+}
 
   /* Preliminarily mark non-addressed complex variables as eligible
  for promotion to gimple registers.  We'll transform their uses
@@ -7118,7 +7132,8 @@ gimplify_expr (tree *expr_p, gimple_seq
  break;
 
case COMPOUND_LITERAL_EXPR:
- ret = gimplify_compound_literal_expr (expr_p, pre_p, fallback);
+ ret = gimplify_compound_literal_expr (expr_p, pre_p,
+   gimple_test_f, fallback);
  break;
 
case MODIFY_EXPR:
Index: testsuite/gcc.dg/tree-ssa/vector-4.c
===
--- testsuite/gcc.dg/tree-ssa/vector-4.c(revision 0)
+++ testsuite/gcc.dg/tree-ssa/vector-4.c(revision 0)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-w -O1 -fdump-tree-gimple" } */
+
+typedef int v4si __attribute__ ((vector_size (16)));
+
+v4si vs (v4si a, v4si b)
+{
+  return __builtin_shuffle (a, b, (v4si) {0, 4, 1, 5});
+}
+
+/* The compound literal should be placed directly in the vec_perm.  */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR ;" 1 "gimple"} } */
+
+/* { dg-final { cleanup-tree-dump "gimple" } } */

Re: [Patch, Fortran] PR53597 re-add SAVE constraint for modules with -std=f2003

2012-06-13 Thread Paul Richard Thomas

Dear Tobias,

This one is indeed obvious!  OK for trunk.

Cheers

Paul

On 13 June 2012 09:50, Tobias Burnus  wrote:
> Given the very slow patch review, I intend to commit this patch in a couple
> of days as obvious.* Nevertheless, I wouldn't mind a patch review.
>
> The constraint check is actually present in resolve.c, it just doesn't
> trigger.
>
> Built and regtested on x86-64-gnu-linux.
> OK for the trunk - and for the 4.6/4.7 branch?*
>
> Patches pending review:
> - http://gcc.gnu.org/ml/fortran/2012-05/msg00171.html
> - http://gcc.gnu.org/ml/fortran/2012-05/msg00173.html
>
> Tobias
>



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
       --Hitchhikers Guide to the Galaxy

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-06-13 Thread Richard Guenther

On Fri, Jun 8, 2012 at 7:16 AM, Sharad Singhai  wrote:
> Okay, I have updated the attached patch so that the output from
> -ftree-vectorizer-verbose is considered diagnostic information and is
> always sent to stderr. Other functionality remains unchanged. Here is
> some more context about this patch.
>
> This patch improves the dump infrastructure and public interfaces so
> that the existing private pass-specific dump stream is separated from
> the diagnostic dump stream (typically stderr).  The optimization
> passes can output information on the two streams independently.
>
> The newly defined interfaces are:
>
> Individual passes do not need to access the dump file directly. Thus,
> instead of doing
>
>   if (dump_file && (flags & dump_flags))
>      fprintf (dump_file, ...);
>
> they can do
>
>     dump_printf (flags, ...);
>
> If the current pass has FLAGS enabled, the information is printed
> into the dump file; otherwise it is not.
>
> Similar to the dump_printf (), another function is defined, called
>
>        diag_printf (dump_flags, ...)
>
> This prints information only onto the diagnostic stream, typically
> standard error. It is useful for separating pass-specific dump
> information from the diagnostic information.
>
> Currently, as a proof of concept, I have converted vectorizer passes
> to use the new dump format. For this, I have considered
> information printed in vect_dump file as diagnostic. Thus 'fprintf'
> calls are changed to 'diag_printf'. Some other information printed to
> dump_file is sent to the regular dump file via 'dump_printf ()'. It
> helps to separate the two streams because they might serve different
> purposes and might have different formatting requirements.
>
> For example, using the trunk compiler, the following invocation
>
> g++ -S v.cc -ftree-vectorize -fdump-tree-vect -ftree-vectorizer-verbose=2
>
> prints tree vectorizer dump into a file named 'v.cc.113t.vect'.
> However, the verbose diagnostic output is silently
> ignored. This is not desirable as the two types of dump should not interfere.
>
> After this patch, the vectorizer dump is available in 'v.cc.113t.vect'
> as before, but the verbose vectorizer diagnostic is additionally
> printed on stderr. Thus both types of dump information are output.
>
> An additional feature of this patch is that individual passes can
> print dump information into command-line named files instead of auto
> numbered filename. For example,

I'd wish you'd leave out this part for a followup.

>
> g++ -S -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.vect
>     -ftree-vectorizer-verbose=2
>     -fdump-tree-pre=foo.pre
>
> This prints the tree vectorizer dump into 'foo.vect', PRE dump into
> 'foo.pre', and the vectorizer verbose diagnostic dump onto stderr.
>
> Please take another look.

--- tree-vect-loop-manip.c  (revision 188325)
+++ tree-vect-loop-manip.c  (working copy)
@@ -789,14 +789,11 @@ slpeel_make_loop_iterate_ntimes (struct loop *loop
   gsi_remove (&loop_cond_gsi, true);

   loop_loc = find_loop_location (loop);
-  if (dump_file && (dump_flags & TDF_DETAILS))
-{
-  if (loop_loc != UNKNOWN_LOC)
-fprintf (dump_file, "\nloop at %s:%d: ",
+  if (loop_loc != UNKNOWN_LOC)
+dump_printf (TDF_DETAILS, "\nloop at %s:%d: ",
  LOC_FILE (loop_loc), LOC_LINE (loop_loc));
-  print_gimple_stmt (dump_file, cond_stmt, 0, TDF_SLIM);
-}
-
+  if (dump_flags & TDF_DETAILS)
+dump_gimple_stmt (TDF_SLIM, cond_stmt, 0);
   loop->nb_iterations = niters;

I'm confused by this.  Why is this not simply

  if (loop_loc != UNKNOWN_LOC)
dump_printf (dump_flags, "\nloop at %s:%d: ",
   LOC_FILE (loop_loc), LOC_LINE (loop_loc));
  dump_gimple_stmt (dump_flags | TDF_SLIM, cond_stmt, 0);

for example.  I notice that you maybe mis-understood the message classification
I asked you to add (maybe I confused you by mentioning to eventually re-use
the TDF_* flags).  I think you basically provided this message classification
by adding two classes by providing both dump_gimple_stmt and diag_gimple_stmt.
But still in the above you keep a dump_flags test _and_ you pass in
(altered) dump_flags to the dump/diag_gimple_stmt routines.  Let me quote them:

+void
+dump_gimple_stmt (int flags, gimple gs, int spc)
+{
+  if (dump_file)
+print_gimple_stmt (dump_file, gs, spc, flags);
+}

+void
+diag_gimple_stmt (int flags, gimple gs, int spc)
+{
+  if (alt_dump_file)
+print_gimple_stmt (alt_dump_file, gs, spc, flags);
+}

I'd say it should have been a single function:

void
dump_gimple_stmt (enum msg_classification, int additional_flags,
gimple gs, int spc)
{
  if (msg_classification & go-to-dumpfile
  && dump_file)
print_gimple_stmt (dump_file, gs, spc, dump_flags | additional_flags);
  if (msg_classification & go-to-alt-dump-file
  && alt_dump_file && (alt_dump_flags & msg_classification))
print_gimple_stmt (alt_dump_file, gs, spc, alt_dump_flags |
additional_flags);
}

whe
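
The single-entry-point idea sketched above (one function, with a message classification choosing the streams) can be illustrated outside GCC roughly as follows. Everything here is an assumption made for the illustration: the enum values, the two stream variables and classified_printf are hypothetical names, not part of the patch or of GCC's dump infrastructure.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message classes; not names from the patch.  */
enum msg_class
{
  MSG_TO_DUMP = 1 << 0,   /* pass-specific dump file */
  MSG_TO_DIAG = 1 << 1    /* diagnostic stream, typically stderr */
};

/* Both streams stay NULL (disabled) until the caller sets them.  */
FILE *sketch_dump_stream;
FILE *sketch_diag_stream;

/* Print to every enabled stream selected by CLASSES and return the
   number of streams that were written to.  */
int
classified_printf (unsigned classes, const char *format, ...)
{
  int streams = 0;
  va_list ap;

  if ((classes & MSG_TO_DUMP) && sketch_dump_stream)
    {
      va_start (ap, format);
      vfprintf (sketch_dump_stream, format, ap);
      va_end (ap);
      streams++;
    }
  if ((classes & MSG_TO_DIAG) && sketch_diag_stream)
    {
      va_start (ap, format);
      vfprintf (sketch_diag_stream, format, ap);
      va_end (ap);
      streams++;
    }
  return streams;
}
```

A caller would then write something like classified_printf (MSG_TO_DUMP | MSG_TO_DIAG, "loop at %s:%d\n", file, line) once, instead of testing the dump flags and choosing between two functions at each call site.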

Re: [PATCH] Option to build bare-metal ARM cross-compiler for arm-none-eabi target without libunwind

2012-06-13 Thread Fredrik Hederstierna


> The __aeabi_unwind_cpp_pr* personality routines are *not* libunwind
> functions either, they are also functions from libgcc_eh.  Anything
> related to those functions should not use the term "libunwind",
> whether in configure option names, or patch submissions, since it is
> not libunwind.  Linking with libgcc_eh should work fine by default;
> no special action should be needed to link in whatever unwind
> functions your code references.  But if you don't want to link them
> in at all, and if defining your own versions of __aeabi_div0 /
> __aeabi_ldiv0 doesn't suffice, then as long as your code doesn't
> raise exceptions it should be safe for you to stub out the
> __aeabi_unwind_cpp_pr* functions.
>
> --
> Joseph S. Myers

Ok, just read the "Exception Handling ABI for the ARM Architecture".

* From Section 6.2

"Bits 24-27 select one of 16 personality routines defined by the run-time 
support code. Remaining bytes are data
for that personality routine.
...
ARM has allocated index numbers 0, 1 and 2 for use by C and C++.
...
Object producers must emit an R_ARM_NONE relocation from an exception-handling 
table section to the required
personality routine to indicate the dependency to the linker."

I think you are right, we need to stub out the __aeabi_unwind_cpp_pr functions.
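
A minimal sketch of such stubs, for a C-only bare-metal build, might look like the following. It rests on the assumption stated earlier in the thread that the code never raises exceptions, so the routines are never actually called and only need to exist for the linker. The empty signatures are a simplification: the real personality routines take arguments and return a status code, and these definitions would clash if libgcc_eh were linked as well.

```c
/* Stub personality routines for a C-only bare-metal build.
   Assumption: no exceptions are ever raised or propagated, so these
   are never called; they exist only to satisfy the R_ARM_NONE
   dependency at link time.  The empty signatures are a
   simplification of the real routines.  */
void __aeabi_unwind_cpp_pr0 (void) {}
void __aeabi_unwind_cpp_pr1 (void) {}
void __aeabi_unwind_cpp_pr2 (void) {}
```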

Thanks a lot & Best Regards
Fredrik
