Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-08 Thread Sharad Singhai
That is certainly a possibility. The original motivation was to
implement -fopt-info correctly. If there are other use cases, then I
can enhance the patch.

Thanks,
Sharad


On Mon, May 7, 2012 at 3:02 PM, Gabriel Dos Reis
 wrote:
> On Mon, May 7, 2012 at 4:58 PM, Sharad Singhai  wrote:
>> This is the first patch for planned improvements to dump
>> infrastructure.  Please reference the discussion in
>> http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02088.html.
>>
>> The following small patch allows selective tree and rtl dumps on
>> stderr instead of named files.  Later -fopt-info can be implemented in
>> form of -fdump-xxx-stderr.
>
> Instead of -fdump-xxx-stderr, it will be better to have
>
>    -fdump-xxx=yyy
>
> where yyy is a path to a file, with "stderr" and "stdout" having
> special meaning.
>
> -- Gaby
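
For illustration only, the -fdump-xxx=yyy idea quoted above could be handled along the following lines. This is a minimal sketch, not code from the posted patch: open_dump_stream and close_dump_stream are hypothetical names, and the only behaviour assumed is what the suggestion states, namely that "stderr" and "stdout" are special and anything else is a file path.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the posted patch: map the "yyy" in
   -fdump-xxx=yyy to a stream.  "stderr" and "stdout" are treated as
   special names; anything else is opened as a file path.  */
static FILE *
open_dump_stream (const char *dest)
{
  if (strcmp (dest, "stderr") == 0)
    return stderr;
  if (strcmp (dest, "stdout") == 0)
    return stdout;
  return fopen (dest, "w");	/* NULL on failure; caller must check.  */
}

/* Close only streams that open_dump_stream opened itself.  */
static void
close_dump_stream (FILE *f)
{
  if (f && f != stderr && f != stdout)
    fclose (f);
}
```

One consequence of this design is that a file literally named "stderr" cannot be selected without a path prefix such as ./stderr.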


[SH] PR 51244 - Supplementary patch

2012-05-08 Thread Oleg Endo
Hello,

The attached patch is the same as in the PR comment #37.
Tested with 
make -k check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4/-mb,
-m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}"

and no new failures.
OK?

Cheers,
Oleg

ChangeLog:

PR target/51244
* config/sh/sh.md (*branch_true, *branch_false): New insns.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 187217)
+++ gcc/config/sh/sh.md	(working copy)
@@ -7097,6 +7097,29 @@
 }
   [(set_attr "type" "cbranch")])
 
+;; The *branch_true patterns help combine when trying to invert conditions.
+(define_insn "*branch_true"
+  [(set (pc) (if_then_else (ne (zero_extend:SI (subreg:QI (reg:SI T_REG) 0))
+			   (const_int 0))
+			   (label_ref (match_operand 0 "" ""))
+			   (pc)))]
+  "TARGET_SH1 && TARGET_LITTLE_ENDIAN"
+{
+  return output_branch (1, insn, operands);
+}
+  [(set_attr "type" "cbranch")])
+
+(define_insn "*branch_true"
+  [(set (pc) (if_then_else (ne (zero_extend:SI (subreg:QI (reg:SI T_REG) 3))
+			   (const_int 0))
+			   (label_ref (match_operand 0 "" ""))
+			   (pc)))]
+  "TARGET_SH1 && ! TARGET_LITTLE_ENDIAN"
+{
+  return output_branch (1, insn, operands);
+}
+  [(set_attr "type" "cbranch")])
+
 (define_insn "branch_false"
   [(set (pc) (if_then_else (eq (reg:SI T_REG) (const_int 0))
 			   (label_ref (match_operand 0 "" ""))
@@ -7107,6 +7130,29 @@
 }
   [(set_attr "type" "cbranch")])
 
+;; The *branch_false patterns help combine when trying to invert conditions.
+(define_insn "*branch_false"
+  [(set (pc) (if_then_else (eq (zero_extend:SI (subreg:QI (reg:SI T_REG) 0))
+			   (const_int 0))
+			   (label_ref (match_operand 0 "" ""))
+			   (pc)))]
+  "TARGET_SH1 && TARGET_LITTLE_ENDIAN"
+{
+  return output_branch (0, insn, operands);
+}
+  [(set_attr "type" "cbranch")])
+
+(define_insn "*branch_false"
+  [(set (pc) (if_then_else (eq (zero_extend:SI (subreg:QI (reg:SI T_REG) 3))
+			   (const_int 0))
+			   (label_ref (match_operand 0 "" ""))
+			   (pc)))]
+  "TARGET_SH1 && ! TARGET_LITTLE_ENDIAN"
+{
+  return output_branch (0, insn, operands);
+}
+  [(set_attr "type" "cbranch")])
+
 ;; Patterns to prevent reorg from re-combining a condbranch with a branch
 ;; which destination is too far away.
 ;; The const_int_operand is distinct for each branch target; it avoids
@@ -9721,7 +9767,7 @@
   ""
   [(const_int 0)])
 
-;; The *movtt patterns improve code at -O1.
+;; The *movtt patterns eliminate redundant T bit to T bit moves / tests.
 (define_insn_and_split "*movtt"
   [(set (reg:SI T_REG)
 	(eq:SI (zero_extend:SI (subreg:QI (reg:SI T_REG) 3))


Re: [SH] PR 51244 - Supplementary patch

2012-05-08 Thread Kaz Kojima
Oleg Endo  wrote:
> The attached patch is the same as in the PR comment #37.
> Tested with 
> make -k check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4/-mb,
> -m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}"
> 
> and no new failures.
> OK?

OK.

Regards,
kaz


[ping] 3 pending patches

2012-05-08 Thread Eric Botcazou
Fix debug info of nested inline functions:
  http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00161.html

Emit variable as size attribute in debug info:
  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00422.html

Implement static stack checking on IA-64:
  http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00452.html

Thanks in advance.

-- 
Eric Botcazou


Re: [PATCH] MIPS16: Fix truncated DWARF-2 line information

2012-05-08 Thread Maciej W. Rozycki
Hi Richard,

 Resurrecting the issue now that I have new data.

On Wed, 14 Dec 2011, Richard Sandiford wrote:

> >  After some thinking I decided the simplest approach will be just emitting 
> > the missing location directive in the context of the MIPS16 thunk being 
> > built that will apply to the actual function prologue.  The resulting 
> > change is included below -- this just repeats the record originally output 
> > before the thunk (and which applies to .mips16.fn.sinfrob16 section).
> 
> I think I'd prefer to change where the thunk is emitted.  We shouldn't
> really have the thunk coming between the MIPS16 code and its .cfi_startproc
> either.  And the thunk should probably have CFI info itself.
> 
> I'll try to look at it sometime if you don't beat me to it.

 OK, whatever you prefer.  I hope that CFI data won't confuse GDB, any 
thunks should really be skipped over in regular debugging (i.e. unless you 
single-step by the machine instruction).

> >  Regression-testing this change turned out to be quite tricky as current 
> > trunk does not appear to build for the mips-sde-elf target:
> 
> Gah.  mipsisa64-elfoabi is another option FWIW.

 I'm not sure if that's a configuration I could test without tremendous 
effort.  Thanks for fixing mips-sde-elf support though.

> > and the mips-linux-gnu configuration is not ready yet for MIPS16 testing.
> 
> Out of interest, what goes wrong?  I've been testing -mabi=32/-mips16 on
> mips64-linux-gnu for some time without difficulty.

 I've thought some pieces are missing upstream, but perhaps I've been 
confused.  I reckon there was a nasty issue with GCC confusing the symbols 
used (using the wrong symbol alias or failing to use one) in the context 
of using MIPS16 thunks and PLT (that we discovered as soon as or shortly 
after we started using such a setup, so that wasn't anything particularly 
obscure), but perhaps the fix for that issue has been actually submitted 
and included upstream already.

 Are you using a hard-float multilib for your -mabi=32/-mips16 Linux 
testing?

> Anyway, whatever does end up going in to trunk really does need to be
> tested against trunk first.

 I did that testing now, and filed PR target/53276 so that this issue 
isn't lost.  I'll continue using the fix I proposed until you have 
implemented your suggestions; it's unlikely I'll be able to find an extra 
time slot to look into it any further given that I have a working solution 
and lots of other issues to deal with.  I can't guarantee I'll keep that 
promise though. ;)

 I have some small improvements to how some of these thunks are generated 
outstanding; I'll try to push them through testing and offer them to you 
as time permits now that I've got a reliable configuration for upstream 
GCC testing.

  Maciej


Re: PATCH: Update longlong.h from GLIBC

2012-05-08 Thread Richard Guenther
On Mon, May 7, 2012 at 11:11 PM, H.J. Lu  wrote:
> Hi,
>
> I am preparing to update GLIBC longlong.h from GCC.  This patch updates
> GCC longlong.h to use a URL instead of an FSF postal address and  replace
> spaces with tab.  OK to install?
>
> Since I'd like to simply copy longlong.h from GCC release branch to GLIBC,
> Is this also OK for 4.7 branch?

Why?  Does it fix anything there?

Richard.

> Thanks.
>
>
> H.J.
> ---
> 2012-05-07  H.J. Lu  
>
>        * longlong.h: Use a URL instead of an FSF postal address.
>        Replace spaces with tab.
>
> diff --git a/libgcc/longlong.h b/libgcc/longlong.h
> index 2026377..4fa9d46 100644
> --- a/libgcc/longlong.h
> +++ b/libgcc/longlong.h
> @@ -25,9 +25,8 @@
>    Lesser General Public License for more details.
>
>    You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, write to the Free
> -   Software Foundation, 51 Franklin Street, Fifth Floor, Boston,
> -   MA 02110-1301, USA.  */
> +   License along with the GNU C Library; if not, see
> +   .  */
>
>  /* You have to define the following before including this file:
>
> @@ -383,21 +382,21 @@ UDItype __umulsidi3 (USItype, USItype);
>   do {                                                                  \
>     register SItype __r0 __asm__ ("0");                                       \
>     register SItype __r1 __asm__ ("1") = (m0);                         \
> -                                                                        \
> +                                                                       \
>     __asm__ ("mr\t%%r0,%3"                                              \
> -             : "=r" (__r0), "=r" (__r1)                                      \
> -             : "r"  (__r1),  "r" (m1));                                      \
> +            : "=r" (__r0), "=r" (__r1)                                 \
> +            : "r"  (__r1),  "r" (m1));                                 \
>     (xh) = __r0; (xl) = __r1;                                          \
>   } while (0)
>
>  #define sdiv_qrnnd(q, r, n1, n0, d) \
> -  do {                                                                  \
> +  do {                                                                 \
>     register SItype __r0 __asm__ ("0") = (n1);                         \
>     register SItype __r1 __asm__ ("1") = (n0);                         \
> -                                                                        \
> +                                                                       \
>     __asm__ ("dr\t%%r0,%4"                                              \
> -             : "=r" (__r0), "=r" (__r1)                                      \
> -             : "r" (__r0), "r" (__r1), "r" (d));                       \
> +            : "=r" (__r0), "=r" (__r1)                                 \
> +            : "r" (__r0), "r" (__r1), "r" (d));                        \
>     (q) = __r1; (r) = __r0;                                            \
>   } while (0)
>  #endif /* __zarch__ */
> @@ -840,9 +839,9 @@ UDItype __umulsidi3 (USItype, USItype);
>  #define count_trailing_zeros(count,x) \
>   do {                                                                 \
>     __asm__ ("ffsd     %2,%0"                                          \
> -            : "=r" ((USItype) (count))                                 \
> -            : "0" ((USItype) 0),                                       \
> -              "r" ((USItype) (x)));                                    \
> +           : "=r" ((USItype) (count))                                  \
> +           : "0" ((USItype) 0),                                        \
> +             "r" ((USItype) (x)));                                     \
>   } while (0)
>  #endif /* __ns32000__ */
>
> @@ -858,7 +857,7 @@ UDItype __umulsidi3 (USItype, USItype);
>      || defined (__ppc__)      /* Darwin */                            \
>      || (defined (PPC) && ! defined (CPU_FAMILY)) /* gcc 2.7.x GNU&SysV */    \
>      || (defined (PPC) && defined (CPU_FAMILY)    /* VxWorks */               \
> -         && CPU_FAMILY == PPC)                                               \
> +        && CPU_FAMILY == PPC)                                                \
>      ) && W_TYPE_SIZE == 32
>  #define add_ss(sh, sl, ah, al, bh, bl) \
>   do {                                                                 \
> @@ -899,7 +898,7 @@ UDItype __umulsidi3 (USItype, USItype);
>   || defined (__ppc__)                                                    \
>   || (defined (PPC) && ! defined (CPU_FAMILY)) /* gcc 2.7.x GNU&SysV */       \
>   || (defined (PPC) && defined (CPU_FAMILY)    /* VxWorks */                  \
> -         && CPU_FAMILY == PPC)
> +        && CPU_FAMILY == PPC)
>  #define umul_ppmm(ph, pl, m0, m1) \
>   do {                                                         

Re: [patch] don't check for execute bits of the liblto plugin

2012-05-08 Thread Richard Guenther
On Tue, May 8, 2012 at 1:07 AM, Matthias Klose  wrote:
> The lto plugin is installed without x bits set, but gcc-ar.c still checks for
> the execute bits. There is no need to have the lto plugin to have the x bits
> set, so just check that it is readable.
>
> Ok for the trunk and the 4.7 branch?

Ok.

Thanks,
Richard.

>  Matthias
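
The change described above amounts to testing for read permission rather than execute permission. A minimal sketch, assuming a POSIX host: plugin_is_usable is a hypothetical name, and the actual check in gcc-ar.c is not shown in this message.

```c
#include <unistd.h>

/* Hypothetical helper; the real check lives in gcc-ar.c and is not shown
   in this message.  The plugin is loaded as a shared object rather than
   run as a program, so read permission is what matters.  */
static int
plugin_is_usable (const char *path)
{
  /* R_OK instead of X_OK: do not require the execute bits.  */
  return access (path, R_OK) == 0;
}
```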


Re: Heads-up, PR53273: testsuite separation and dilution problem. Fix for PR53272

2012-05-08 Thread Richard Guenther
On Tue, May 8, 2012 at 5:39 AM, Hans-Peter Nilsson
 wrote:
> The problem was spotted while fixing PR53272, a target bug with
> crisv32-* involving the error-prone notice_update_cc function.
>
> When wrapping up the test-case to use as a run-test, adding main
> and auxiliary functions to the reduced test-case unexpectedly
> made the bug go away.  This despite all functions (except main)
> being decorated with noinline, noclone and the special marker
> asm ("") ad finitum.  See below: putting the two-file test-case
> in a single file causes different code for the
> rtc_update_irq_enable function in the .expand stage already.
> That REALLY shouldn't happen.  I hope this is just a bug and not
> as it's supposed to be, but this is not the first time I notice
> this general problem, hence this rant and PR53273:
>
> It is, and you don't understand how this can be a problem, or
> think I should just add the brand new function attribute
> nofrobnicate?  Well, having to add two files for each test-case
> is enough of a problem on its own.  The bigger problem is the
> integrity of test-cases: there's no way to know that a future
> optimization doesn't see through those separated files (like, an
> LTO that is enabled always).
>
> There *must* be a future-proof way to write test-cases marking
> where cross-function optimizations should not happen.  If it's
> implemented as function-attribute-nofrobnicate so be it, but it
> must not be limited to only the optimizations in place today.
> Otherwise, some new middle-end generic optimization will
> optimize away the test-case (and always return success), most
> likely eliminating the point of the test.  When the point of the
> test-case is to cover code in a port or lower levels, optimizing
> away the test-cases opens up for bugs to silently creep in; the
> original bug or bugs in the functionality being covered.  This
> optimization-limiting mechanism really should, almost-must, work
> within a single file.  In a (semi-)perfect world, someone would
> iterate over the testsuite, adding such attributes to the
> existing tests; I fear a lot of those that don't use "noclone"
> are silently already just eliminated to "exit (0)".  We're on a
> slippery slope here: it started with having to add "noinline"
> attributes, then "noclone" attributes, then the asm("") marker.
> Now that doesn't work anymore either.  Can we just have a way to
> limit those pesky cross-function optimizations and all their kin
> once and for all?

You don't say what actually is different when you add these functions.
There should be no IPA optimizations possible unless you tell GCC
that it sees the whole program (which means using -flto with the
linker plugin).  That is, marking functions noclone and noinline and
avoiding declaring them static should be enough.

Still some pieces of GCC may expose different code generation due to
DECL uid differences - which, as DECL uids are global, makes extra
functions possibly result in different code for unchanged functions.  That's
generally not wanted but it can happen (similar for other such kinds of
numbers).

Richard.

> Ok, enough ranting.  If anyone knows off-hand why the code would
> differ, feel free to add to PR53273, else I'll eventually
> analyze it.  It might just be a minor bug after all; like some
> static branch prediction not being cleared when seeing a
> noinline-marked function.
>
> The test-case below and patch will be committed to trunk and the
> 4.7 branch as soon as testing for crisv32-elf finishes.
>
> gcc/testsuite:
>        PR target/53272
>        * gcc.dg/torture/pr53272-1.c, gcc.dg/torture/pr53272-2.c: New test.
>
> gcc:
>        PR target/53272
>        * config/cris/cris.c (cris_normal_notice_update_cc): For TARGET_V32,
>        when a constant source operand matches an "I" constraint, the "no
>        CC0 change" applies to a register-destination only, not a
>        strict_low_part-destination.
>
>
> --- /dev/null   Tue Oct 29 15:57:07 2002
> +++ gcc/testsuite/gcc.dg/torture/pr53272-1.c    Tue May  8 03:07:52 2012
> @@ -0,0 +1,39 @@
> +/* { dg-do run } */
> +/* { dg-additional-sources "pr53272-2.c" } */
> +struct rtc_class_ops {
> + int (*f)(void *, unsigned int enabled);
> +};
> +
> +struct rtc_device
> +{
> + void *owner;
> + const struct rtc_class_ops *ops;
> + int ops_lock;
> +};
> +
> +__attribute__ ((__noinline__, __noclone__))
> +extern int foo(void *);
> +__attribute__ ((__noinline__, __noclone__))
> +extern void foobar(void *);
> +
> +__attribute__ ((__noinline__, __noclone__))
> +int rtc_update_irq_enable(struct rtc_device *rtc, unsigned int enabled)
> +{
> + int err;
> + asm volatile ("");
> +
> + err = foo(&rtc->ops_lock);
> +
> + if (err)
> +  return err;
> +
> + if (!rtc->ops)
> +  err = -19;
> + else if (!rtc->ops->f)
> +  err = -22;
> + else
> +  err = rtc->ops->f(rtc->owner, enabled);
> +
> + foobar(&rtc->ops_lock);
> + return err;
> +}
> --- /dev/null   Tue Oct 29 15:57:07 2002
> +++ gcc/testsuite/gcc.dg/torture/pr

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-08 Thread Richard Guenther
On Tue, May 8, 2012 at 12:02 AM, Gabriel Dos Reis
 wrote:
> On Mon, May 7, 2012 at 4:58 PM, Sharad Singhai  wrote:
>> This is the first patch for planned improvements to dump
>> infrastructure.  Please reference the discussion in
>> http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02088.html.
>>
>> The following small patch allows selective tree and rtl dumps on
>> stderr instead of named files.  Later -fopt-info can be implemented in
>> form of -fdump-xxx-stderr.
>
> Instead of -fdump-xxx-stderr, it will be better to have
>
>    -fdump-xxx=yyy
>
> where yyy is a path to a file, with "stderr" and "stdout" having
> special meaning.

Yeah, that looks better.

Thanks,
Richard.

> -- Gaby


Re: PATCH: Update longlong.h from GLIBC

2012-05-08 Thread Andreas Jaeger
On Tuesday, May 08, 2012 10:43:14 Richard Guenther wrote:
> On Mon, May 7, 2012 at 11:11 PM, H.J. Lu  wrote:
> > Hi,
> > 
> > I am preparing to update GLIBC longlong.h from GCC.  This patch
> > updates GCC longlong.h to use a URL instead of an FSF postal address
> > and  replace spaces with tab.  OK to install?
> > 
> > Since I'd like to simply copy longlong.h from GCC release branch to
> > GLIBC, Is this also OK for 4.7 branch?
> 
> Why?  Does it fix anything there?

It makes sharing the file between gcc and glibc easier,

Andreas
-- 
 Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
  SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
   GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126


Re: [rtl, patch] combine concat+shuffle

2012-05-08 Thread Richard Sandiford
Looks like a good idea.

Marc Glisse  writes:
> For the testsuite, since the patch is not in a particular target, it would 
> be better to have a generic test (in gcc.dg?), but I don't really know how 
> to write a generic one, so would a test in gcc.target/i386 that scans 
> the asm for shuf or perm be ok?

Tree-level vectorisation tests tend to go in gcc.dg/vector, but it's
hard to generalise rtl-level transforms like these.  An x86-only test
sounds good to me FWIW.

> Index: simplify-rtx.c
> ===
> --- simplify-rtx.c	(revision 187228)
> +++ simplify-rtx.c	(working copy)
> @@ -3268,10 +3268,32 @@ simplify_binary_operation_1 (enum rtx_co
>
> if (GET_MODE (vec) == mode)
>   return vec;
>   }
> 
> +  /* If we build {a,b} then permute it, build the result directly.  */
> +  if (XVECLEN (trueop1, 0) == 2
> +   && CONST_INT_P (XVECEXP (trueop1, 0, 0))
> +   && CONST_INT_P (XVECEXP (trueop1, 0, 1))
> +   && GET_CODE (trueop0) == VEC_CONCAT
> +   && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop0, 1))
> +   && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
> +   && GET_MODE (XEXP (trueop0, 0)) == mode)
> + {
> +   int offset0 = INTVAL (XVECEXP (trueop1, 0, 0)) % 2;
> +   int offset1 = INTVAL (XVECEXP (trueop1, 0, 1)) % 2;
> +   rtx baseop  = XEXP (trueop0, 0);
> +   rtx baseop0 = XEXP (baseop , 0);
> +   rtx baseop1 = XEXP (baseop , 1);
> +   baseop0 = avoid_constant_pool_reference (baseop0);
> +   baseop1 = avoid_constant_pool_reference (baseop1);
> +
> +   return simplify_gen_binary (VEC_CONCAT, mode,
> +  offset0 ? baseop1 : baseop0,
> +  offset1 ? baseop1 : baseop0);
> + }
> +

I know you said that generalising it could be done later,
and that's fine, but it looks in some ways like it would
be easier to go straight for the more general:

  && GET_CODE (trueop0) == VEC_CONCAT
  && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
  && GET_MODE (XEXP (trueop0, 0)) == mode
  && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
  && GET_MODE (XEXP (trueop0, 1)) == mode)
{
  unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
  unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
  rtx op0, op1;

  gcc_assert (i0 < 4 && i1 < 4);
  op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
  op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);

  return simplify_gen_binary (VEC_CONCAT, mode, op0, op1);
}

(completely untested).  avoid_constant_pool_reference shouldn't be
called here.

Very minor, but this code probably belongs in the else part of the
if (!VECTOR_MODE_P (mode)) block.

Richard
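
The index arithmetic in the generalised version above (i / 2 picks the half, i % 2 picks the element within it) can be tried outside GCC with a toy model. This is only a sketch under stated assumptions: struct pair, struct concat and select2 are made-up names standing in for nested VEC_CONCAT rtxes, and nothing here uses GCC's RTL API.

```c
#include <assert.h>

/* Toy model, not GCC's RTL: a "concat" of two 2-element halves.  */
struct pair { int elt[2]; };
struct concat { struct pair half[2]; };

/* Select elements I0 and I1 (each in 0..3) of the 4-element value V and
   return them as a new pair, mirroring
   op = XEXP (XEXP (trueop0, i / 2), i % 2).  */
static struct pair
select2 (struct concat v, unsigned int i0, unsigned int i1)
{
  struct pair r;

  assert (i0 < 4 && i1 < 4);
  r.elt[0] = v.half[i0 / 2].elt[i0 % 2];
  r.elt[1] = v.half[i1 / 2].elt[i1 % 2];
  return r;
}
```

With v holding {10, 11} and {12, 13}, select2 (v, 3, 0) picks element 1 of the second half and element 0 of the first, giving {13, 10}.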


Re: PATCH: Update longlong.h from GLIBC

2012-05-08 Thread Richard Earnshaw
On 08/05/12 10:04, Andreas Jaeger wrote:
> On Tuesday, May 08, 2012 10:43:14 Richard Guenther wrote:
>> On Mon, May 7, 2012 at 11:11 PM, H.J. Lu  wrote:
>>> Hi,
>>>
>>> I am preparing to update GLIBC longlong.h from GCC.  This patch
>>> updates GCC longlong.h to use a URL instead of an FSF postal address
>>> and  replace spaces with tab.  OK to install?
>>>
>>> Since I'd like to simply copy longlong.h from GCC release branch to
>>> GLIBC, Is this also OK for 4.7 branch?
>>
>> Why?  Does it fix anything there?
> 
> It makes sharing the file between gcc and glibc easier,
> 
> Andreas

Why should glibc be depending on the GCC release branch?  Sounds like
the tail wagging the dog.

Changing this file has quite a high potential for introducing
regressions.  I don't think we should risk that on the release branch.

R.



[PATCH] MIPS16: Remove DWARF-2 location information from GP accesses

2012-05-08 Thread Maciej W. Rozycki
Hi,

 I have been investigating some failures in the MIPS16 GDB test suite that 
appeared to me to be caused by some weird out-of-place line 
information.

 For example we get this in gdb.base/break.exp, where a breakpoint is set 
at main and the beginning of the function (after removing some 
"decorations") looks like this:

int
main (int argc, char **argv, char **envp)
{
if (argc == 12345) {  /* an unlikely value < 2^16, in case uninited */ /* set breakpoint 6 here */
fprintf (stderr, "usage:  factorial \n");
return 1;
}
[...]

The test case quite reasonably expects the breakpoint to hit at:

if (argc == 12345) { [...]

which indeed happens with standard MIPS testing (at -O0).  However with 
MIPS16 testing (also at -O0) the breakpoint instead hits here:

fprintf (stderr, "usage:  factorial \n");

which of course scores as a test case failure and would confuse any human 
being in a real debug session too.

 So what happens in this case turned out to be this:

[...]
.text
.align  2
.globl  main
.LFB1 = .
.loc 1 88 0
.cfi_startproc
.set    mips16
.ent    main
.type   main, @function
main:
.frame  $17,8,$31   # vars= 0, regs= 2/0, args= 16, gp= 0
.mask   0x8002,-4
.fmask  0x,0
save    $4-$6,24,$17,$31 # 85   *mips16e_save_restore   [length = 4]
.cfi_def_cfa_offset 24
.cfi_offset 31, -4
.cfi_offset 17, -8
addiu   $17,$sp,16   # 87   *addsi3_mips16/2        [length = 2]
.cfi_def_cfa 17, 8
.loc 1 94 0
move    $2,$28       # 12   *movsi_mips16/6         [length = 2]
.loc 1 93 0
lw      $3,8($17)    # 69   *movsi_mips16/8         [length = 2]
move    $24,$3       # 8    *movsi_mips16/2         [length = 2]
move    $3,$24       # 70   *movsi_mips16/3         [length = 2]
cmpi    $3,12345     # 9    *mips.md:2835/2         [length = 4]
btnez   .L4          # 10   *branch_equalitysi_mips16/2 [length = 4]
.loc 1 94 0
lw      $3,%gprel(_impure_ptr)($2)   # 71   *movsi_mips16/8 [length = 4]
move    $24,$3       # 13   *movsi_mips16/2         [length = 2]
move    $2,$24       # 72   *movsi_mips16/3         [length = 2]
lw      $2,12($2)    # 73   *movsi_mips16/8         [length = 2]
move    $24,$2       # 14   *movsi_mips16/2         [length = 2]
lw      $4,.L6       # 15   *movsi_mips16/8         [length = 4]
li      $5,1         # 16   *movsi_mips16/4         [length = 2]
li      $6,27        # 17   *movsi_mips16/4         [length = 2]
move    $7,$24       # 18   *movsi_mips16/3         [length = 2]
jal     fwrite       # 19   call_value_internal/2   [length = 6]
.loc 1 95 0
li      $3,1         # 74   *movsi_mips16/4         [length = 2]
move    $24,$3       # 20   *movsi_mips16/2         [length = 2]
b       .L5          # 65   *jump_mips16            [length = 6]
.L4:
.loc 1 97 0
[...]

-- notice the ".loc 1 94 0" directive just after the prologue, spanning 
just a single instruction that copies $gp to a directly-accessible 
register, followed by ".loc 1 93 0" that corresponds to the first actual 
line of main.  The conditional block then follows, marked with another 
".loc 1 94 0" annotation.

 I have tracked it down to mips16_gp_pseudo_reg -- an instruction is 
generated there to access the real $gp and moved backwards as far as 
possible (IIUC).  I think this instruction is not really associated with 
anything specific line 94 does as far as the high-level language is 
concerned and is merely an ABI implementation detail.  It's also going to 
be reused for any other accesses to GP throughout as long as the auxiliary 
register is live.  I think it should really be treated as a part of the 
prologue or similarly to the "lw $gp, GPSLOT($sp)" o32 subroutine call GP 
restoration step.

 I propose therefore to remove line annotation associated with this 
instruction so that it's merged with the prologue or any adjacent source 
line so as not to confuse the GDB test suite and, more importantly, the 
user.  As a result of this change the extra ".loc 1 94 0" directive at the 
beginning of main is not produced anymore.

 This change fixes 32 regressions in the GDB test suite for the MIPS16 
multilibs (either endianness) and the mips-sde-elf target:

PASS: gdb.base/break.exp: Temporary breakpoint info
PASS: gdb.base/break.exp: breakpoint info
PASS: gdb.base/break.exp: run until function breakpoint
PASS: gdb.base/callfuncs.exp: next to t_double_values
PASS: gdb.base/callfuncs.exp: next to t_structs_c
PASS: gdb.base/condbreak.exp: breakpoint info
PASS: gdb.base/define.exp: use hook-stop command
PASS: gdb.base/define.exp: use user command: nextwhere
PASS: gdb.base/hbreak2.exp: hardware breakpoint insertion
PASS: gdb.base/hbreak2.exp: run until function breakpoint
PASS: gdb.base/maint.exp: maint info breakpoints
PASS: gdb.base/pointers.exp: and post-increment
PASS: gdb.base/poi

Re: Speed up insn-attrtab.c compilation

2012-05-08 Thread Michael Matz
Hi,

On Mon, 7 May 2012, Mike Stump wrote:

> On May 7, 2012, at 6:11 AM, Michael Matz wrote:
> > I'd like to retain the #if 0 code therein,
> 
> Can you structure this code as
> 
> #define DEBUG 0
> 
>   if (DEBUG) ...
> 
> ?
> 
> If so, that would be a preferable way to structure the code.

Sure, consider the patch so amended.


Ciao,
Michael.
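
As a small illustration of the structure agreed above: code guarded by if (DEBUG) is still parsed and type-checked on every build, whereas code under #if 0 is not seen by the compiler at all. The sketch below is a stand-alone example; sum_to is a made-up function and has nothing to do with the insn-attrtab.c generator itself.

```c
#include <stdio.h>

#define DEBUG 0

/* Illustrative function, unrelated to the patch under discussion.  */
static int
sum_to (int n)
{
  int i, sum = 0;

  for (i = 1; i <= n; i++)
    {
      sum += i;
      /* Unlike "#if 0", this call is always parsed and type-checked, so
	 it cannot silently go stale; with DEBUG defined to 0 it is dead
	 code that the compiler is free to drop.  */
      if (DEBUG)
	fprintf (stderr, "i=%d sum=%d\n", i, sum);
    }
  return sum;
}
```

Changing the #define to 1 turns the tracing on without touching the guarded statements.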


Re: [rtl, patch] combine concat+shuffle

2012-05-08 Thread Marc Glisse

On Tue, 8 May 2012, Richard Sandiford wrote:


> I know you said that generalising it could be done later,
> and that's fine, but it looks in some ways like it would
> be easier to go straight for the more general:
>
>   && GET_CODE (trueop0) == VEC_CONCAT
>   && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
>   && GET_MODE (XEXP (trueop0, 0)) == mode
>   && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
>   && GET_MODE (XEXP (trueop0, 1)) == mode)
> {
>   unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
>   unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
>   rtx op0, op1;
>
>   gcc_assert (i0 < 4 && i1 < 4);
>   op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
>   op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);
>
>   return simplify_gen_binary (VEC_CONCAT, mode, op0, op1);
> }


Yes, I hesitated.


> (completely untested).  avoid_constant_pool_reference shouldn't be
> called here.


I wasn't quite sure what it was for, but it looked safer to call it for 
nothing than to forget it ;-)



> Very minor, but this code probably belongs in the else part of the
> if (!VECTOR_MODE_P (mode)) block.


Thanks, I'll update the patch with your comments, add a testcase and 
ChangeLog and re-send it here.


--
Marc Glisse


Re: Patches to enable -ftrack-macro-expansion by default

2012-05-08 Thread Andreas Krebbel
On 04/30/2012 01:46 PM, Dodji Seketeli wrote:
> Dodji Seketeli  writes:
> 
>> I am proposing a series of patches which is supposed to address the
>> remaining issues (I am aware of) preventing us from enabling the
>> -ftrack-macro-expansion by default.
>>
>> The idea is to address each issue I notice in the course of trying to
>> bootstrap the compiler and running the tests with
>> -ftrack-macro-expansion enabled.
>>
>> Beside the fixes, I ended up disabling the -ftrack-macro-expansion for
>> many test cases (sometimes globally in the dg-*.exp files, or on a
>> case by case basis), because that option changes the compiler output
>> and so requires that I either adapt the test case or disable the
>> option.  For other tests, I chose to adapt the test case.
> 
> I have finally applied this series of 14 patches to the mainline today.
> The SVN revisions are from r186965 to r186978.

s390 (31 bit) C++ bootstrap broke with revision 186977:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53280

Bye,

-Andreas-



[RS6000] Fix PR53271 powerpc ports

2012-05-08 Thread Alan Modra
On Mon, May 07, 2012 at 05:23:33PM +0200, Olivier Hainque wrote:
>   (emit_frame_save): Don't handle reg+reg addressing.
>  
>  introduces an assert on which we now trip compiling unwind-dw2.c for SPE
>  configurations. We now fall into the TARGET_SPE_ABI part of
> 
>/* Some cases that need register indexed addressing.  */
>gcc_checking_assert (!((TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))
>|| (TARGET_VSX && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
>|| (TARGET_E500_DOUBLE && mode == DFmode)
>|| (TARGET_SPE_ABI
>&& SPE_VECTOR_MODE (mode)
>&& !SPE_CONST_OFFSET_OK (offset))));
> 
>  in emit_frame_save while compiling uw_install_context.
> 
>  The call comes from this part of rs6000_emit_prologue: 
> 
>   /* ??? There's no need to emit actual instructions here, but it's the
>  easiest way to get the frame unwind information emitted.  */

OK, the assert is doing its job.  I wanted to minimize the number of
places that need temporary hard regs, so that tracking of which hard
reg is in use can all be done in rs6000_emit_prologue.

The problem is that the insns here need reg+reg addressing on SPE,
but as the ??? comment says we really don't need insns, just the eh
unwind reg info.  So that is what the following patch does, attaching
the eh info to a blockage.

I also make use of gen_frame_store and siblings that I invented for
generating the eh info, elsewhere in rs6000.c where doing so is
blindingly obvious.  We could probably use them in other places too,
but I'll leave that for later.  Bootstrapped and regression tested
powerpc-linux.  OK to apply?

PR target/53271
* config/rs6000/rs6000.c (gen_frame_set): New function.
(gen_frame_load, gen_frame_store): New functions.
(rs6000_savres_rtx): Use the above.
(rs6000_emit_epilogue, rs6000_emit_prologue): Here too.
Correct mode used for CR2 in save/restore_world patterns.
Don't emit instructions for eh_return frame unwind reg info.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 187275)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -18961,6 +18961,28 @@
   return insn;
 }
 
+static rtx
+gen_frame_set (rtx reg, rtx frame_reg, int offset, bool store)
+{
+  rtx addr, mem;
+
+  addr = gen_rtx_PLUS (Pmode, frame_reg, GEN_INT (offset));
+  mem = gen_frame_mem (GET_MODE (reg), addr);
+  return gen_rtx_SET (VOIDmode, store ? mem : reg, store ? reg : mem);
+}
+
+static rtx
+gen_frame_load (rtx reg, rtx frame_reg, int offset)
+{
+  return gen_frame_set (reg, frame_reg, offset, false);
+}
+
+static rtx
+gen_frame_store (rtx reg, rtx frame_reg, int offset)
+{
+  return gen_frame_set (reg, frame_reg, offset, true);
+}
+
 /* Save a register into the frame, and emit RTX_FRAME_RELATED_P notes.
Save REGNO into [FRAME_REG + OFFSET] in mode MODE.  */
 
@@ -19301,27 +19323,14 @@
   = gen_rtx_USE (VOIDmode, gen_rtx_REG (Pmode, use_reg));
 
   for (i = 0; i < end_reg - start_reg; i++)
-{
-  rtx addr, reg, mem;
-  reg = gen_rtx_REG (reg_mode, start_reg + i);
-  addr = gen_rtx_PLUS (Pmode, frame_reg_rtx,
-  GEN_INT (save_area_offset + reg_size * i));
-  mem = gen_frame_mem (reg_mode, addr);
+RTVEC_ELT (p, i + offset)
+  = gen_frame_set (gen_rtx_REG (reg_mode, start_reg + i),
+  frame_reg_rtx, save_area_offset + reg_size * i,
+  (sel & SAVRES_SAVE) != 0);
 
-  RTVEC_ELT (p, i + offset) = gen_rtx_SET (VOIDmode,
-  (sel & SAVRES_SAVE) ? mem : reg,
-  (sel & SAVRES_SAVE) ? reg : mem);
-}
-
   if ((sel & SAVRES_SAVE) && (sel & SAVRES_LR))
-{
-  rtx addr, reg, mem;
-  reg = gen_rtx_REG (Pmode, 0);
-  addr = gen_rtx_PLUS (Pmode, frame_reg_rtx,
-  GEN_INT (lr_offset));
-  mem = gen_frame_mem (Pmode, addr);
-  RTVEC_ELT (p, i + offset) = gen_rtx_SET (VOIDmode, mem, reg);
-}
+RTVEC_ELT (p, i + offset)
+  = gen_frame_store (gen_rtx_REG (Pmode, 0), frame_reg_rtx, lr_offset);
 
   par = gen_rtx_PARALLEL (VOIDmode, p);
 
@@ -19479,59 +19488,33 @@
   /* We do floats first so that the instruction pattern matches
 properly.  */
   for (i = 0; i < 64 - info->first_fp_reg_save; i++)
-   {
- rtx reg = gen_rtx_REG ((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT
- ? DFmode : SFmode),
-info->first_fp_reg_save + i);
- rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx,
-  GEN_INT (info->fp_save_offset
-   + frame_off + 8 * i));
- rtx mem = gen_frame_mem ((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT
-   

Re: PATCH: Update longlong.h from GLIBC

2012-05-08 Thread Andreas Jaeger
On Tuesday, May 08, 2012 11:59:34 Richard Earnshaw wrote:
> On 08/05/12 10:04, Andreas Jaeger wrote:
> > On Tuesday, May 08, 2012 10:43:14 Richard Guenther wrote:
> >> On Mon, May 7, 2012 at 11:11 PM, H.J. Lu  wrote:
> >>> Hi,
> >>> 
> >>> I am preparing to update GLIBC longlong.h from GCC.  This patch
> >>> updates GCC longlong.h to use a URL instead of an FSF postal address
> >>> and  replace spaces with tab.  OK to install?
> >>> 
> >>> Since I'd like to simply copy longlong.h from GCC release branch to
> >>> GLIBC, Is this also OK for 4.7 branch?
> >> 
> >> Why?  Does it fix anything there?
> > 
> > It makes sharing the file between gcc and glibc easier,
> > 
> > Andreas
> 
> Why should glibc be depending on the GCC release branch?  Sounds like
> the tail wagging the dog.

Ah, you discuss the release branch ;) Let HJ defend that one.


> Changing this file has quite a high potential for introducing
> regressions.  I don't think we should risk that on the release branch.

It's only whitespace IMO. I'm arguing for the trunk to take the change,

Andreas
-- 
 Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
  SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
   GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126


Re: [rtl, patch] combine concat+shuffle

2012-05-08 Thread Richard Sandiford
Marc Glisse  writes:
> On Tue, 8 May 2012, Richard Sandiford wrote:
>> I know you said that generalising it could be done later,
>> and that's fine, but it looks in some ways like it would
>> be easier to go straight for the more general:
>>
>>&& GET_CODE (trueop0) == VEC_CONCAT
>>&& GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
>>&& GET_MODE (XEXP (trueop0, 0)) == mode
>>&& GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
>>&& GET_MODE (XEXP (trueop0, 1)) == mode)
>>  {
>>unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
>>unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
>>rtx op0, op1;
>>
>>gcc_assert (i0 < 4 && i1 < 4);
>>op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
>>op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);
>>
>>return simplify_gen_binary (VEC_CONCAT, mode, op0, op1);
>>  }
>
> Yes, I hesitated.

Realised afterwards that both versions need to check
GET_MODE_NUNITS (mode) == 2, because we're requiring OP0 and OP1
to be scalar.  Sorry for not noticing first time.
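
The index arithmetic in the quoted snippet can be checked with a small stand-alone model. This is a sketch under assumptions: `nested_concat`, `select_scalar` and `select_pair` are hypothetical names, and plain ints stand in for the scalar operands.

```c
#include <assert.h>

/* Hypothetical stand-in for
     (vec_concat (vec_concat a b) (vec_concat c d)):
   halves[h][k] plays the role of XEXP (XEXP (trueop0, h), k).  */
struct nested_concat
{
  int halves[2][2];
};

/* Pick scalar number I (0..3) out of the nested concat, using the same
   i / 2 and i % 2 split as the quoted snippet.  */
static int
select_scalar (const struct nested_concat *v, unsigned int i)
{
  assert (i < 4);
  return v->halves[i / 2][i % 2];
}

/* Model of the simplified result: a two-element concat of the two
   selected scalars.  */
static void
select_pair (const struct nested_concat *v, unsigned int i0,
	     unsigned int i1, int out[2])
{
  out[0] = select_scalar (v, i0);
  out[1] = select_scalar (v, i1);
}
```

Selecting indices 3 and 0 from the concat of {10, 11} and {12, 13} gives {13, 10}: i / 2 picks the half and i % 2 the element within it.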

Richard


[PATCH] Fold (X * CST1) & CST2

2012-05-08 Thread Richard Guenther

This makes us fold (X * 8) & 5 to zero and (X * 6) & 5 to (X * 6) & 4.
It amends the fix for PR52134 in that I noticed we fail to do some
simplifications on size expressions (later on SSA, CCP does the above
transform, though it does not (yet) change the constant).
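
The arithmetic behind the fold can be sketched on plain 32-bit integers. This is an illustration only, not the fold-const.c code: `trailing_zeros` and `reduce_and_mask` are hypothetical helpers, and the fixed 32-bit width is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of trailing zero bits in V.  V must be nonzero.  */
static unsigned int
trailing_zeros (uint32_t v)
{
  unsigned int n = 0;

  while ((v & 1u) == 0)
    {
      v >>= 1;
      n++;
    }
  return n;
}

/* X * CST1 always has at least as many trailing zero bits as CST1, so
   those bits can be dropped from CST2 without changing
   (X * CST1) & CST2.  Returns the reduced mask; zero means the whole
   expression folds to zero.  */
static uint32_t
reduce_and_mask (uint32_t cst1, uint32_t cst2)
{
  uint32_t known_zero;

  if (cst1 == 0)
    return 0;	/* X * 0 is 0, so no bit of CST2 can survive.  */
  known_zero = (UINT32_C (1) << trailing_zeros (cst1)) - 1;
  return cst2 & ~known_zero;
}
```

With CST1 = 8 the low three bits are known zero, so a mask of 5 reduces to 0; with CST1 = 6 only the lowest bit is known zero, so 5 reduces to 4.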

Bootstrap and regtest ongoing on x86_64-unknown-linux-gnu.

Richard.

2012-05-08  Richard Guenther  

* fold-const.c (fold_binary_loc): Fold (X * CST1) & CST2
to zero or to (X * CST1) & CST2' when CST1 has trailing zeros.

* gcc.dg/fold-bitand-4.c: New testcase.

Index: gcc/fold-const.c
===
*** gcc/fold-const.c(revision 187276)
--- gcc/fold-const.c(working copy)
*** fold_binary_loc (location_t loc,
*** 11449,11454 
--- 11449,11478 
return fold_convert_loc (loc, type, arg0);
}
  
+   /* Fold (X * CST1) & CST2 to zero if we can, or drop known zero
+  bits from CST2.  */
+   if (TREE_CODE (arg1) == INTEGER_CST
+ && TREE_CODE (arg0) == MULT_EXPR
+ && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST)
+   {
+ int arg1tz
+   = double_int_ctz (tree_to_double_int (TREE_OPERAND (arg0, 1)));
+ if (arg1tz > 0)
+   {
+ double_int arg1mask, masked;
+ arg1mask = double_int_not (double_int_mask (arg1tz));
+ arg1mask = double_int_ext (arg1mask, TYPE_PRECISION (type),
+TYPE_UNSIGNED (type));
+ masked = double_int_and (arg1mask, tree_to_double_int (arg1));
+ if (double_int_zero_p (masked))
+   return omit_two_operands_loc (loc, type, build_zero_cst (type),
+ arg0, arg1);
+ else if (!double_int_equal_p (masked, tree_to_double_int (arg1)))
+   return fold_build2_loc (loc, code, type, op0,
+   double_int_to_tree (type, masked));
+   }
+   }
+ 
/* For constants M and N, if M == (1LL << cst) - 1 && (N & M) == M,
 ((A & N) + B) & M -> (A + B) & M
 Similarly if (N & M) == 0,
Index: gcc/testsuite/gcc.dg/fold-bitand-4.c
===
*** gcc/testsuite/gcc.dg/fold-bitand-4.c(revision 0)
--- gcc/testsuite/gcc.dg/fold-bitand-4.c(revision 0)
***
*** 0 
--- 1,16 
+ /* { dg-do compile } */
+ /* { dg-options "-O -fdump-tree-original" } */
+ 
+ int foo (int i)
+ {
+   return (i * 8) & 5;
+ }
+ 
+ unsigned bar (unsigned i)
+ {
+   return (i * 6) & 5;
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "\\\&" 1 "original" } } */
+ /* { dg-final { scan-tree-dump-times "\\\& 4;" 1 "original" } } */
+ /* { dg-final { cleanup-tree-dump "original" } } */


[PATCH] Optimize byte_from_pos, pos_from_bit

2012-05-08 Thread Richard Guenther

This optimizes byte_from_pos and pos_from_bit by noting that we operate
on sizes whose computations have no intermediate (or final) overflow.
This is the single patch necessary to get Ada to bootstrap and test
with TYPE_IS_SIZETYPE removed.  Rather than amending size_binop
(my original plan) I chose to optimize the above two commonly used
accessors.
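
A stripped-down model of the byte_from_pos shortcut, assuming 8-bit units. `bitpos_expr` and `byte_from_pos_sketch` are hypothetical; the real code inspects a tree for a MULT_EXPR by bitsize_unit_node rather than a flag.

```c
#include <stdint.h>

#define SKETCH_BITS_PER_UNIT 8	/* assumed unit size for this sketch */

/* Hypothetical, much simplified stand-in for a tree expression: either
   a plain bit count, or the product OPERAND * SKETCH_BITS_PER_UNIT.  */
struct bitpos_expr
{
  int is_mult_by_unit;
  uint64_t operand;
};

/* Combine byte OFFSET and bit position BITPOS into a byte position.
   When BITPOS is known to be N * 8 and sizes do not overflow,
   (N * 8) / 8 is simply N, so the division can be skipped.  */
static uint64_t
byte_from_pos_sketch (uint64_t offset, struct bitpos_expr bitpos)
{
  uint64_t bytepos;

  if (bitpos.is_mult_by_unit)
    bytepos = bitpos.operand;
  else
    bytepos = bitpos.operand / SKETCH_BITS_PER_UNIT;
  return offset + bytepos;
}
```

The shortcut is only valid because the product is assumed not to overflow; with wrapping arithmetic (N * 8) / 8 need not equal N.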

Conveniently normalize_offset can be re-written to use pos_from_bit
instead of inlining it.  I also took the liberty of documenting the
functions (sic).

The patch already passed bootstrap & regtest on x86_64-unknown-linux-gnu
with TYPE_IS_SIZETYPE removed, now re-testing without that change.

Any comments?  Would you like different factoring of the optimization
(I considered adding a byte_from_bitpos)?  Any idea why
byte_from_pos is using TRUNC_DIV_EXPR (only positive offsets?) and
pos_from_bit FLOOR_DIV_EXPR (also negative offsets?) - that seems
inconsistent, and we fold FLOOR_DIV_EXPR of unsigned types (sizetype)
to TRUNC_DIV_EXPR anyways.
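
For reference, the difference between the two rounding modes only shows up for negative operands, which is why they coincide on unsigned sizes. A small sketch in plain C (the helper names are made up for illustration; C's `/` truncates toward zero since C99):

```c
/* Truncating division: rounds the quotient toward zero.  */
static long
trunc_div (long a, long b)
{
  return a / b;
}

/* Flooring division: rounds the quotient toward negative infinity.  */
static long
floor_div (long a, long b)
{
  long q = a / b;

  /* Adjust when there is a remainder and the signs differ.  */
  if (a % b != 0 && (a < 0) != (b < 0))
    q--;
  return q;
}

/* Remainder matching floor_div; it takes the sign of the divisor.  */
static long
floor_mod (long a, long b)
{
  return a - floor_div (a, b) * b;
}
```

For a bit position of -9 and an alignment of 8, truncation gives a quotient of -1 while flooring gives -2 with a remainder of 7; for non-negative positions both give the same result.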

Thanks,
Richard.

2012-05-08  Richard Guenther  

* stor-layout.c (bit_from_pos): Document.
(byte_from_pos): Likewise.  Optimize.
(pos_from_bit): Likewise.
(normalize_offset): Use pos_from_bit instead of replicating it.

Index: gcc/stor-layout.c
===
--- gcc/stor-layout.c   (revision 187276)
+++ gcc/stor-layout.c   (working copy)
@@ -785,8 +785,8 @@ start_record_layout (tree t)
   return rli;
 }
 
-/* These four routines perform computations that convert between
-   the offset/bitpos forms and byte and bit offsets.  */
+/* Return the combined bit position for the byte offset OFFSET and the
+   bit position BITPOS.  */
 
 tree
 bit_from_pos (tree offset, tree bitpos)
@@ -797,25 +797,46 @@ bit_from_pos (tree offset, tree bitpos)
 bitsize_unit_node));
 }
 
+/* Return the combined truncated byte position for the byte offset OFFSET and
+   the bit position BITPOS.  */
+
 tree
 byte_from_pos (tree offset, tree bitpos)
 {
-  return size_binop (PLUS_EXPR, offset,
-fold_convert (sizetype,
-  size_binop (TRUNC_DIV_EXPR, bitpos,
-  bitsize_unit_node)));
+  tree bytepos;
+  if (TREE_CODE (bitpos) == MULT_EXPR
+  && tree_int_cst_equal (TREE_OPERAND (bitpos, 1), bitsize_unit_node))
+bytepos = TREE_OPERAND (bitpos, 0);
+  else
+bytepos = size_binop (TRUNC_DIV_EXPR, bitpos, bitsize_unit_node);
+  return size_binop (PLUS_EXPR, offset, fold_convert (sizetype, bytepos));
 }
 
+/* Split the bit position POS into a byte offset *POFFSET and a bit
+   position *PBITPOS with the byte offset aligned to OFF_ALIGN bits.  */
+
 void
 pos_from_bit (tree *poffset, tree *pbitpos, unsigned int off_align,
  tree pos)
 {
-  *poffset = size_binop (MULT_EXPR,
-fold_convert (sizetype,
-  size_binop (FLOOR_DIV_EXPR, pos,
-  bitsize_int (off_align))),
-size_int (off_align / BITS_PER_UNIT));
-  *pbitpos = size_binop (FLOOR_MOD_EXPR, pos, bitsize_int (off_align));
+  tree toff_align = bitsize_int (off_align);
+  if (TREE_CODE (pos) == MULT_EXPR
+  && tree_int_cst_equal (TREE_OPERAND (pos, 1), toff_align))
+{
+  *poffset = size_binop (MULT_EXPR,
+fold_convert (sizetype, TREE_OPERAND (pos, 0)),
+size_int (off_align / BITS_PER_UNIT));
+  *pbitpos = bitsize_zero_node;
+}
+  else
+{
+  *poffset = size_binop (MULT_EXPR,
+fold_convert (sizetype,
+  size_binop (FLOOR_DIV_EXPR, pos,
+  toff_align)),
+size_int (off_align / BITS_PER_UNIT));
+  *pbitpos = size_binop (FLOOR_MOD_EXPR, pos, toff_align);
+}
 }
 
 /* Given a pointer to bit and byte offsets and an offset alignment,
@@ -828,17 +849,10 @@ normalize_offset (tree *poffset, tree *p
  downwards.  */
   if (compare_tree_int (*pbitpos, off_align) >= 0)
 {
-  tree extra_aligns = size_binop (FLOOR_DIV_EXPR, *pbitpos,
- bitsize_int (off_align));
-
-  *poffset
-   = size_binop (PLUS_EXPR, *poffset,
- size_binop (MULT_EXPR,
- fold_convert (sizetype, extra_aligns),
- size_int (off_align / BITS_PER_UNIT)));
-
-  *pbitpos
-   = size_binop (FLOOR_MOD_EXPR, *pbitpos, bitsize_int (off_align));
+  tree offset, bitpos;
+  pos_from_bit (&offset, &bitpos, off_align, *pbitpos);
+  *poffset = size_binop (PLUS_EXPR, *poffset, offset);
+  *pbitpos = bitpos;
 }
 }
 


[PATCH] Remove TYPE_IS_SIZETYPE

2012-05-08 Thread Richard Guenther

This removes the TYPE_IS_SIZETYPE macro and all its uses (by
assuming it returns zero and applying trivial folding).  Sizes
and bitsizes can still be treated specially by means of knowing
what the values represent and by means of using helper functions
that assume you are dealing with "sizes" (in particular size_binop
and friends and bit_from_pos, byte_from_pos or pos_from_bit).

Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages
including Ada with the patch optimizing byte_from_pos and pos_from_bit
with the following as yet uninvestigated regression:

=== gnat tests ===

Running target unix/
FAIL: gnat.dg/specs/alignment1.ads (test for excess errors)

Is the Ada change ok for trunk?  In any case I'll wait a bit for
discussion.

Thanks,
Richard.

2012-05-08  Richard Guenther  

ada/
* gcc-interface/cuintp.c (UI_From_gnu): Remove TYPE_IS_SIZETYPE use.

c-family/
* c-common.c (c_sizeof_or_alignof_type): Remove assert and
adjust commentary about TYPE_IS_SIZETYPE types.

* tree.h (TYPE_IS_SIZETYPE): Remove.
* fold-const.c (int_const_binop_1): Remove TYPE_IS_SIZETYPE use.
(extract_muldiv_1): Likewise.
* gimple.c (gtc_visit): Likewise.
(gimple_types_compatible_p): Likewise.
(iterative_hash_canonical_type): Likewise.
(gimple_canonical_types_compatible_p): Likewise.
* gimplify.c (gimplify_one_sizepos): Likewise.
* print-tree.c (print_node): Likewise.
* stor-layout.c (initialize_sizetypes): Do not set TYPE_IS_SIZETYPE.

Index: trunk/gcc/ada/gcc-interface/cuintp.c
===
*** trunk.orig/gcc/ada/gcc-interface/cuintp.c   2011-04-11 17:01:30.0 +0200
--- trunk/gcc/ada/gcc-interface/cuintp.c        2012-05-07 16:43:43.497218058 +0200
*** UI_From_gnu (tree Input)
*** 178,186 
if (host_integerp (Input, 0))
  return UI_From_Int (TREE_INT_CST_LOW (Input));
else if (TREE_INT_CST_HIGH (Input) < 0
!  && TYPE_UNSIGNED (gnu_type)
!  && !(TREE_CODE (gnu_type) == INTEGER_TYPE
!   && TYPE_IS_SIZETYPE (gnu_type)))
  return No_Uint;
  #endif
  
--- 178,184 
if (host_integerp (Input, 0))
  return UI_From_Int (TREE_INT_CST_LOW (Input));
else if (TREE_INT_CST_HIGH (Input) < 0
!  && TYPE_UNSIGNED (gnu_type))
  return No_Uint;
  #endif
  
Index: trunk/gcc/c-family/c-common.c
===
*** trunk.orig/gcc/c-family/c-common.c  2012-05-07 10:48:20.0 +0200
--- trunk/gcc/c-family/c-common.c   2012-05-07 16:41:42.36353 +0200
*** c_sizeof_or_alignof_type (location_t loc
*** 4539,4550 
value = size_int (TYPE_ALIGN_UNIT (type));
  }
  
!   /* VALUE will have an integer type with TYPE_IS_SIZETYPE set.
!  TYPE_IS_SIZETYPE means that certain things (like overflow) will
!  never happen.  However, this node should really have type
!  `size_t', which is just a typedef for an ordinary integer type.  */
value = fold_convert_loc (loc, size_type_node, value);
-   gcc_assert (!TYPE_IS_SIZETYPE (TREE_TYPE (value)));
  
return value;
  }
--- 4539,4548 
value = size_int (TYPE_ALIGN_UNIT (type));
  }
  
!   /* VALUE will have the middle-end integer type sizetype.
!  However, we should really return a value of type `size_t',
!  which is just a typedef for an ordinary integer type.  */
value = fold_convert_loc (loc, size_type_node, value);
  
return value;
  }
Index: trunk/gcc/fold-const.c
===
*** trunk.orig/gcc/fold-const.c 2012-05-04 10:44:44.0 +0200
--- trunk/gcc/fold-const.c  2012-05-07 16:59:27.728185367 +0200
*** int_const_binop_1 (enum tree_code code,
*** 940,947 
tree t;
tree type = TREE_TYPE (arg1);
bool uns = TYPE_UNSIGNED (type);
-   bool is_sizetype
- = (TREE_CODE (type) == INTEGER_TYPE && TYPE_IS_SIZETYPE (type));
bool overflow = false;
  
op1 = tree_to_double_int (arg1);
--- 940,945 
*** int_const_binop_1 (enum tree_code code,
*** 1077,1083 
  }
  
t = force_fit_type_double (TREE_TYPE (arg1), res, overflowable,
!((!uns || is_sizetype) && overflow)
 | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2));
  
return t;
--- 1075,1081 
  }
  
t = force_fit_type_double (TREE_TYPE (arg1), res, overflowable,
!(!uns && overflow)
 | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2));
  
return t;
*** extract_muldiv_1 (tree t, tree c, enum t
*** 5639,5646 
  /* ... and has wrapping overflow, and its type is smaller
 than ctype, then we cannot pass through as widening.  */
  && ((TYPE_OVERFLOW_WRAPS (TREE_TYPE (

Re: [PATCH] Remove TYPE_IS_SIZETYPE

2012-05-08 Thread Richard Guenther
On Tue, 8 May 2012, Richard Guenther wrote:

> 
> This removes the TYPE_IS_SIZETYPE macro and all its uses (by
> assuming it returns zero and applying trivial folding).  Sizes
> and bitsizes can still be treated specially by means of knowing
> what the values represent and by means of using helper functions
> that assume you are dealing with "sizes" (in particular size_binop
> and friends and bit_from_pos, byte_from_pos or pos_from_bit).
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages
> including Ada with the patch optimizing byte_from_pos and pos_from_bit
> with the following as yet uninvestigated regression:
> 
> === gnat tests ===
> 
> Running target unix/
> FAIL: gnat.dg/specs/alignment1.ads (test for excess errors)

Actually that is fixed by 
http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00543.html

Thus there are no regressions.

Richard.


Re: [rtl, patch] combine concat+shuffle

2012-05-08 Thread Marc Glisse

On Tue, 8 May 2012, Richard Sandiford wrote:

> Marc Glisse  writes:
>> On Tue, 8 May 2012, Richard Sandiford wrote:
>>> I know you said that generalising it could be done later,
>>> and that's fine, but it looks in some ways like it would
>>> be easier to go straight for the more general:
>>>
>>>   && GET_CODE (trueop0) == VEC_CONCAT
>>>   && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
>>>   && GET_MODE (XEXP (trueop0, 0)) == mode
>>>   && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
>>>   && GET_MODE (XEXP (trueop0, 1)) == mode)
>>> {
>>>   unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
>>>   unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
>>>   rtx op0, op1;
>>>
>>>   gcc_assert (i0 < 4 && i1 < 4);
>>>   op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
>>>   op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);
>>>
>>>   return simplify_gen_binary (VEC_CONCAT, mode, op0, op1);
>>> }
>>
>> Yes, I hesitated.
>
> Realised afterwards that both versions need to check
> GET_MODE_NUNITS (mode) == 2, because we're requiring OP0 and OP1
> to be scalar.  Sorry for not noticing first time.

I thought that was a consequence of

XVECLEN (trueop1, 0) == 2

(in the lines before your first &&)
ie the result (which has mode "mode") is a vec_select of 2 objects.

--
Marc Glisse


Re: [rtl, patch] combine concat+shuffle

2012-05-08 Thread Richard Sandiford
Marc Glisse  writes:
>> Realised afterwards that both versions need to check
>> GET_MODE_NUNITS (mode) == 2, because we're requiring OP0 and OP1
>> to be scalar.  Sorry for not noticing first time.
>
> I thought that was a consequence of
>
> XVECLEN (trueop1, 0) == 2
>
> (in the lines before your first &&)
> ie the result (which has mode "mode") is a vec_select of 2 objects.

Yeah, you're right of course.

Richard


Re: [C++ Patch] PR 53158

2012-05-08 Thread Jason Merrill

On 05/07/2012 11:28 PM, Paolo Carlini wrote:
> error: could not convert ‘b.main()::()’ from ‘void’ to ‘bool’

It wouldn't say "operator()"?

I think I'd leave that alone; it is somewhat more informative (about 
what b() expands to) and we're moving toward replacing %qE with caret 
anyway.


The patch to ocp_convert is OK.

Jason


Re: [patch] support for multiarch systems

2012-05-08 Thread Joseph S. Myers
On Tue, 8 May 2012, Matthias Klose wrote:

> On 20.08.2011 21:51, Matthias Klose wrote:
> > Multiarch [1] is the term being used to refer to the capability of a
> > system to install and run applications of multiple different binary
> > targets on the same system.
> 
> please find attached an updated patch for the trunk (2012-05-08). The
> multiarch triplets are now defined in the Debian Wiki [1], and progress is
> made to get the triplet definitions into Debian Policy [2].

This still seems to suffer in some cases from the problem of previous
versions that it does not ensure triplets are never used for non-matching ABIs.  
For example, a compiler for powerpc-linux-gnu can be configured 
--with-float=soft but this patch will still use powerpc-linux-gnu as the 
multiarch triplet.

For MIPS, I see you allowed for soft-float in setting the triplets - but 
the specification you point to doesn't mention the soft-float triplets.  
Likewise you allowed for powerpc-linux-gnuspe being e500v1 or e500v2 but 
haven't documented the e500v1 triplet.  Likewise for big-endian ARM.

I again suggest starting with a patch that does just one architecture - 
but makes sure to cover all the ABIs applicable to that architecture.  
For example, you could start with a patch for x86 (indeed, just x86 
GNU/Linux) - and assign a multiarch triplet for x32 even if you're not 
building an x32 distribution with multiarch.  Then, once the generic 
support has been reviewed by build system maintainers, and the x86 support 
by x86 maintainers and people familiar with all the applicable x86 ABIs, 
send patches for each other architecture (or architecture/OS combination), 
and the relevant architecture experts can review them to make sure the 
relevant ABIs are properly distinguished.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [C++ Patch] PR 53158

2012-05-08 Thread Paolo Carlini

On 05/08/2012 03:00 PM, Jason Merrill wrote:
> On 05/07/2012 11:28 PM, Paolo Carlini wrote:
>> error: could not convert ‘b.main()::()’ from ‘void’ to ‘bool’
>
> It wouldn't say "operator()"?

Nope, it says exactly the above. If you tell me what would be more 
sensible while remaining informative, I can see if I can quickly prepare 
something... like b.operator()? Would it be better? That would be very 
quick to implement: b is for free, and the arguments too, if there is 
something better we want to print in the middle, just let me know.

> I think I'd leave that alone; it is somewhat more informative (about 
> what b() expands to) and we're moving toward replacing %qE with caret 
> anyway.

... or anyway.

> The patch to ocp_convert is OK.

Great, thanks.

Paolo.


GCC: Add microblaze-*-rtems*

2012-05-08 Thread Joel Sherrill

This patch adds the microblaze-*-rtems* target to gcc.
OK to apply?

2012-05-07  Joel Sherrill 

* config.gcc (microblaze-*-rtems*): New target.
* config/microblaze/rtems.h: New file.

--
Joel Sherrill, Ph.D. Director of Research&   Development
joel.sherr...@oarcorp.comOn-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available (256) 722-9985




GCC: Add microblaze-*-rtems* (w/ diff)

2012-05-08 Thread Joel Sherrill

Sorry.. missed the attachment

This patch adds the microblaze-*-rtems* target to gcc.
OK to apply?

2012-05-07 Joel Sherrill 

* config.gcc (microblaze-*-rtems*): New target.
* config/microblaze/rtems.h: New file.

--
Joel Sherrill, Ph.D. Director of Research&   Development
joel.sherr...@oarcorp.comOn-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available (256) 722-9985


Index: gcc/config.gcc
===
--- gcc/config.gcc	(revision 187223)
+++ gcc/config.gcc	(working copy)
@@ -1700,6 +1700,12 @@
 	c_target_objs="${c_target_objs} microblaze-c.o"
 	cxx_target_objs="${cxx_target_objs} microblaze-c.o"
 	;;
+microblaze-*-rtems*)
+	tm_file="${tm_file} dbxelf.h microblaze/rtems.h rtems.h newlib-stdint.h"
+	c_target_objs="${c_target_objs} microblaze-c.o"
+	cxx_target_objs="${cxx_target_objs} microblaze-c.o"
+tmake_file="${tmake_file} microblaze/t-microblaze t-rtems"
+	;;
 microblaze*-*-*)
 tm_file="${tm_file} dbxelf.h"
 	c_target_objs="${c_target_objs} microblaze-c.o"
Index: gcc/config/microblaze/rtems.h
===
--- gcc/config/microblaze/rtems.h	(revision 0)
+++ gcc/config/microblaze/rtems.h	(revision 0)
@@ -0,0 +1,42 @@
+/* Definitions for rtems targeting a Microblaze using ELF.
+   Copyright (C) 2012 Free Software Foundation, Inc.
+   Contributed by Joel Sherrill (j...@oarcorp.com).
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+/* Target OS builtins.  */
+#undef TARGET_OS_CPP_BUILTINS
+#define TARGET_OS_CPP_BUILTINS()		\
+  do		\
+{		\
+	builtin_define ("__rtems__");		\
+	builtin_assert ("system=rtems");	\
+}		\
+  while (0)
+
+/* Use the default */
+#undef LINK_GCC_C_SEQUENCE_SPEC
+
+/* Extra switches sometimes passed to the linker.  */
+/* -xl-mode-xmdstub translated to -Zxl-mode-xmdstub -- deprecated.  */
+/* RTEMS: Remove use of xilinx.ld but keep other parts for compatibility */
+#undef LINK_SPEC
+#define LINK_SPEC "%{shared:-shared} -N -relax \
+  %{Zxl-mode-xmdstub:-defsym _TEXT_START_ADDR=0x800} \
+  %{mxl-mode-xmdstub:-defsym _TEXT_START_ADDR=0x800} \
+  %{mxl-gp-opt:%{G*}} %{!mxl-gp-opt: -G 0}"
+


[PATCH] libgcov support for profile collection in region of interest (issue6186044)

2012-05-08 Thread Teresa Johnson
Hi Honza,

I added L_gcov_reset and L_gcov_dump for the new interfaces, and also
added a description into the gcov man page. Let me know if it looks
ok now.
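
The intended interplay of reset, dump and the exit handler can be modelled in a few lines. This is a toy sketch, not libgcov: the `toy_*` names and the in-memory "file" are hypothetical, and the assumption that a reset re-enables dumping is mine, since it is not visible in the part of the patch shown here.

```c
#include <string.h>

#define N_COUNTERS 4	/* arbitrary size for the sketch */

static long counters[N_COUNTERS];	/* stand-in for the profile counters */
static long dumped[N_COUNTERS];		/* stand-in for the written profile */
static int dump_count;			/* how many times a profile was written */
static int dump_complete;		/* plays the role of gcov_dump_complete */

/* Zero all counters, like gcov_clear.  */
static void
toy_clear (void)
{
  memset (counters, 0, sizeof counters);
}

/* Exit-time dump, like gcov_exit: skipped if a dump was already done.  */
static void
toy_exit (void)
{
  if (dump_complete)
    return;
  memcpy (dumped, counters, sizeof counters);
  dump_count++;
}

/* Start of a region of interest.  ASSUMPTION: also re-enables dumping.  */
static void
toy_reset (void)
{
  toy_clear ();
  dump_complete = 0;
}

/* End of a region of interest: write the profile once and remember it.  */
static void
toy_dump (void)
{
  toy_exit ();
  dump_complete = 1;
}
```

An application would bracket the code of interest with the reset and dump calls; the flag then keeps the exit handler from overwriting the profile with counts accumulated after the region.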

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Thanks,
Teresa

2012-05-08   Teresa Johnson  

* libgcc/libgcov.c (gcov_clear, __gcov_reset): New functions.
(__gcov_dump): Ditto.
(gcov_dump_complete): New global variable.
(__gcov_flush): Outline functionality now in gcov_clear.
* gcc/gcov-io.h (__gcov_reset, __gcov_dump): Declare.
* libgcc/Makefile.in (L_gcov_reset, L_gcov_dump): Define.
* gcc/doc/gcov.texi: Add note on using __gcov_reset and __gcov_dump.

Index: libgcc/Makefile.in
===
--- libgcc/Makefile.in  (revision 187048)
+++ libgcc/Makefile.in  (working copy)
@@ -849,7 +849,7 @@ include $(iterator)
 # Defined in libgcov.c, included only in gcov library
 LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
 _gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
-_gcov_execv _gcov_execvp _gcov_execve \
+_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
 _gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \
 _gcov_indirect_call_profiler _gcov_average_profiler _gcov_ior_profiler \
 _gcov_merge_ior
Index: libgcc/libgcov.c
===
--- libgcc/libgcov.c(revision 187048)
+++ libgcc/libgcov.c(working copy)
@@ -50,6 +50,14 @@ void __gcov_init (struct gcov_info *p __attribute_
 void __gcov_flush (void) {}
 #endif
 
+#ifdef L_gcov_reset
+void __gcov_reset (void) {}
+#endif
+
+#ifdef L_gcov_dump
+void __gcov_dump (void) {}
+#endif
+
 #ifdef L_gcov_merge_add
 void __gcov_merge_add (gcov_type *counters  __attribute__ ((unused)),
   unsigned n_counters __attribute__ ((unused))) {}
@@ -74,7 +82,7 @@ void __gcov_merge_delta (gcov_type *counters  __at
 #include 
 #endif
 
-#ifdef L_gcov
+#if defined(L_gcov) || defined(L_gcov_reset) || defined(L_gcov_dump)
 #include "gcov-io.c"
 
 struct gcov_fn_buffer
@@ -91,6 +99,9 @@ static struct gcov_info *gcov_list;
 /* Size of the longest file name. */
 static size_t gcov_max_filename = 0;
 
+/* Flag when the profile has already been dumped via __gcov_dump().  */
+static int gcov_dump_complete = 0;
+
 /* Make sure path component of the given FILENAME exists, create
missing directories. FILENAME must be writable.
Returns zero on success, or -1 if an error occurred.  */
@@ -286,6 +297,11 @@ gcov_exit (void)
   char *gi_filename, *gi_filename_up;
   gcov_unsigned_t crc32 = 0;
 
+  /* Prevent the counters from being dumped a second time on exit when the
+ application already wrote out the profile using __gcov_dump().  */
+  if (gcov_dump_complete)
+return;
+
   memset (&all_prg, 0, sizeof (all_prg));
   /* Find the totals for this execution.  */
   memset (&this_prg, 0, sizeof (this_prg));
@@ -679,6 +695,37 @@ gcov_exit (void)
 }
 }
 
+/* Reset all counters to zero.  */
+
+static void
+gcov_clear (void)
+{
+  const struct gcov_info *gi_ptr;
+
+  for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next)
+{
+  unsigned f_ix;
+
+  for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
+   {
+ unsigned t_ix;
+ const struct gcov_fn_info *gfi_ptr = gi_ptr->functions[f_ix];
+
+ if (!gfi_ptr || gfi_ptr->key != gi_ptr)
+   continue;
+ const struct gcov_ctr_info *ci_ptr = gfi_ptr->ctrs;
+ for (t_ix = 0; t_ix != GCOV_COUNTERS; t_ix++)
+   {
+ if (!gi_ptr->merge[t_ix])
+   continue;
+ 
+ memset (ci_ptr->values, 0, sizeof (gcov_type) * ci_ptr->num);
+ ci_ptr++;
+   }
+   }
+}
+}
+
 /* Add a new object file onto the bb chain.  Invoked automatically
when running an object file's global ctors.  */
 
@@ -730,40 +777,48 @@ init_mx_once (void)
 void
 __gcov_flush (void)
 {
-  const struct gcov_info *gi_ptr;
-
   init_mx_once ();
   __gthread_mutex_lock (&__gcov_flush_mx);
 
   gcov_exit ();
-  for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next)
-{
-  unsigned f_ix;
+  gcov_clear ();
 
-  for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
-   {
- unsigned t_ix;
- const struct gcov_fn_info *gfi_ptr = gi_ptr->functions[f_ix];
-
- if (!gfi_ptr || gfi_ptr->key != gi_ptr)
-   continue;
- const struct gcov_ctr_info *ci_ptr = gfi_ptr->ctrs;
- for (t_ix = 0; t_ix != GCOV_COUNTERS; t_ix++)
-   {
- if (!gi_ptr->merge[t_ix])
-   continue;
- 
- memset (ci_ptr->values, 0, sizeof (gcov_type) * ci_ptr->num);
- ci_ptr++;
-   }
-   }
-}
-
   __gthread_mutex_unlock (&__gcov_flush_mx);
 }
 
 #endif /* L_gcov */
 
+#ifdef L_gcov_reset
+
+/* Function that can be called from applica

[PATCH] gcc/config/freebsd-spec.h: Fix building PIE executables. Link them with crt{begin,end}S.o and Scrt1.o which are PIC instead of crt{begin,end}.o and crt1.o which are not. Spec synced from gnu-u

2012-05-08 Thread Alexis Ballier
gcc/config/i386/freebsd.h: Likewise.
---
 gcc/config/freebsd-spec.h |9 +++--
 gcc/config/i386/freebsd.h |9 +++--
 2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/gcc/config/freebsd-spec.h b/gcc/config/freebsd-spec.h
index 770a3d1..2808582 100644
--- a/gcc/config/freebsd-spec.h
+++ b/gcc/config/freebsd-spec.h
@@ -64,11 +64,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
before entering `main'.  */

 #define FBSD_STARTFILE_SPEC \
-  "%{!shared: \
- %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \
-  %{!p:%{profile:gcrt1.o%s} \
-%{!profile:crt1.o%s \
-   crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}"
+  "%{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} \
+   crti.o%s %{shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
 
 /* Provide a ENDFILE_SPEC appropriate for FreeBSD.  Here we tack on
the magical crtend.o file (see crtstuff.c) which provides part of 
@@ -77,7 +74,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
`crtn.o'.  */
 
 #define FBSD_ENDFILE_SPEC \
-  "%{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s"
+  "%{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s"
 
 /* Provide a LIB_SPEC appropriate for FreeBSD as configured and as
required by the user-land thread model.  Before __FreeBSD_version
diff --git a/gcc/config/i386/freebsd.h b/gcc/config/i386/freebsd.h
index 649274d..dd69e43 100644
--- a/gcc/config/i386/freebsd.h
+++ b/gcc/config/i386/freebsd.h
@@ -67,11 +67,8 @@ along with GCC; see the file COPYING3.  If not see

 #undef STARTFILE_SPEC
 #define STARTFILE_SPEC \
-  "%{!shared: \
- %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \
-  %{!p:%{profile:gcrt1.o%s} \
-%{!profile:crt1.o%s \
-   crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}"
+  "%{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} \
+   crti.o%s %{shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
 
 /* Provide a ENDFILE_SPEC appropriate for FreeBSD.  Here we tack on
the magical crtend.o file (see crtstuff.c) which provides part of 
@@ -81,7 +78,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #undef ENDFILE_SPEC
 #define ENDFILE_SPEC \
-  "%{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s"
+  "%{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s"
 
 /* Provide a LINK_SPEC appropriate for FreeBSD.  Here we provide support
for the special GCC options -static and -shared, which allow us to
-- 
1.7.8.6



Re: [PATCH] gcc/config/freebsd-spec.h: Fix building PIE executables. Link them with crt{begin,end}S.o and Scrt1.o which are PIC instead of crt{begin,end}.o and crt1.o which are not. Spec synced from g

2012-05-08 Thread Alexis Ballier
For the record, there's a similar logic in FreeBSD's gcc:
http://svnweb.freebsd.org/base/head/contrib/gcc/config/freebsd-spec.h?revision=200038&view=markup
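
For readers less familiar with spec syntax, the selection logic of the new
specs can be sketched in C.  This is only an illustration of how the
%{a|b:X;c:Y;:Z} alternatives are read (the first matching alternative wins
and the empty-switch alternative is the default).  struct link_opts and the
pick_* helpers are made-up names, not anything in GCC or FreeBSD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option set; the field names are illustrative only.  */
struct link_opts
{
  bool shared, pie, pg, p, profile;
};

/* Mirrors "%{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}}".
   Returns NULL for -shared, where no crt1 variant is linked.  */
static const char *
pick_crt1 (const struct link_opts *o)
{
  assert (o != NULL);
  if (o->shared)
    return NULL;
  if (o->pg || o->p || o->profile)
    return "gcrt1.o";
  if (o->pie)
    return "Scrt1.o";
  return "crt1.o";
}

/* Mirrors "%{shared|pie:crtbeginS.o%s;:crtbegin.o%s}".  */
static const char *
pick_crtbegin (const struct link_opts *o)
{
  assert (o != NULL);
  return (o->shared || o->pie) ? "crtbeginS.o" : "crtbegin.o";
}

/* Mirrors "%{shared|pie:crtendS.o%s;:crtend.o%s}".  */
static const char *
pick_crtend (const struct link_opts *o)
{
  assert (o != NULL);
  return (o->shared || o->pie) ? "crtendS.o" : "crtend.o";
}
```

Note that in this reading a profiling option takes precedence over -pie for
the crt1 choice, because it is listed first in the spec.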

Regards,

Alexis.


Re: [RS6000] Fix PR53271 powerpc ports

2012-05-08 Thread David Edelsohn
On Tue, May 8, 2012 at 6:32 AM, Alan Modra  wrote:

> OK, the assert is doing its job.  I wanted to minimize the number of
> places that need temporary hard regs, so that tracking of which hard
> reg is in use can all be done in rs6000_emit_prologue.
>
> The problem is that the insns here need reg+reg addressing on SPE,
> but as the ??? comment says we really don't need insns, just the eh
> unwind reg info.  So that is what the following patch does, attaching
> the eh info to a blockage.
>
> I also make use of gen_frame_store and siblings that I invented for
> generating the eh info, elsewhere in rs6000.c where doing so is
> blindingly obvious.  We could probably use them in other places too,
> but I'll leave that for later.  Bootstrapped and regression tested
> powerpc-linux.  OK to apply?
>
>        PR target/53271
>        * config/rs6000/rs6000.c (gen_frame_set): New function.
>        (gen_frame_load, gen_frame_store): New functions.
>        (rs6000_savres_rtx): Use the above.
>        (rs6000_emit_epilogue, rs6000_emit_prologue): Here too.
>        Correct mode used for CR2 in save/restore_world patterns.
>        Don't emit instructions for eh_return frame unwind reg info.

Okay.

Thanks, David


Re: PR 53249: Multiple address modes for same address space

2012-05-08 Thread H.J. Lu
On Sun, May 6, 2012 at 11:41 AM, Richard Sandiford
 wrote:
> x32 uses a mixture of MEM address modes for the same address space.
> Some MEMs have SImode addresses, some have DImode.  This means that
> the currently common idiom:
>
>    targetm.addr_space.address_mode (MEM_ADDR_SPACE (mem))
>
> isn't trustworthy.  We have to use the mode of the address if it has one,
> and only fall back on the above for VOIDmode (CONST_INT) addresses.
>
> We actually already have two (identical) functions to calculate
> such a mode.  The patch below puts the function in a more general place
> and uses it instead of the above for rtl-level stuff.
>
> I'm not sure whether what x32 is doing is a good thing, but I like the
> patch anyway because (a) it removes a duplicated function and (b) it at
> least abstracts the concept away.
>
> Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested to
> make sure that there were no differences for cc1 .ii files for MIPS
> n32, o32 and n64.  (I used MIPS to get LO_SUM coverage.)  OK to install?
>
> Richard
>
>
> gcc/
>        PR middle-end/53249
>        * dwarf2out.h (get_address_mode): Move declaration to...
>        * rtl.h: ...here.
>        * dwarf2out.c (get_address_mode): Move definition to...
>        * rtlanal.c: ...here.
>        * var-tracking.c (get_address_mode): Delete.
>        * combine.c (find_split_point): Use get_address_mode instead of
>        targetm.addr_space.address_mode.
>        * cselib.c (cselib_record_sets): Likewise.
>        * dse.c (canon_address, record_store): Likewise.
>        * emit-rtl.c (adjust_address_1, offset_address): Likewise.
>        * expr.c (move_by_pieces, emit_block_move_via_loop, store_by_pieces)
>        (store_by_pieces_1, expand_assignment, store_expr, store_constructor)
>        (expand_expr_real_1): Likewise.
>        * ifcvt.c (noce_try_cmove_arith): Likewise.
>        * optabs.c (maybe_legitimize_operand_same_code): Likewise.
>        * reload.c (find_reloads): Likewise.
>        * sched-deps.c (sched_analyze_1, sched_analyze_2): Likewise.
>        * sel-sched-dump.c (debug_mem_addr_value): Likewise.
>
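
The fallback rule described in the quoted message (use the mode of the
address if it has one, otherwise ask the target) can be sketched as a
standalone toy.  The types below are stand-ins, not GCC's rtx or
machine_mode, and default_address_mode only imitates the role of
targetm.addr_space.address_mode; returning DImode there is an arbitrary
assumption made for the example.

```c
#include <assert.h>

/* Stand-ins for GCC's machine modes; only the three needed here.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };

/* A toy MEM: the mode of its address expression and its address space.  */
struct toy_mem
{
  enum toy_mode addr_mode;  /* TOY_VOIDmode models a CONST_INT address.  */
  int addr_space;
};

/* Imitates the target hook: the default address mode for an address
   space.  The value returned is an assumption for this sketch.  */
static enum toy_mode
default_address_mode (int addr_space)
{
  (void) addr_space;
  return TOY_DImode;
}

/* Prefer the mode carried by the address itself; fall back on the
   target default only when the address has no mode.  */
static enum toy_mode
toy_get_address_mode (const struct toy_mem *mem)
{
  assert (mem != 0);
  if (mem->addr_mode != TOY_VOIDmode)
    return mem->addr_mode;
  return default_address_mode (mem->addr_space);
}
```

With this shape, an SImode address keeps SImode even though the toy target
default is DImode, which is the x32 situation the message describes.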

Hi Richard H.

Can you take a look at this patch? It will restore x32 bootstrap.

Thanks.

H.J.


[PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-08 Thread Dehao Chen
Hello,

This patch adds a flag to guard the optimization that optimizes the
following code away:

free (malloc (4));

In some cases, we'd like this type of malloc/free pairs to remain in
the optimized code.

Tested with bootstrap, and no regression in the gcc testsuite.

Is it ok for mainline?

Thanks,
Dehao

gcc/ChangeLog
2012-05-08  Dehao Chen  

* common.opt (feliminate-malloc): New.
* doc/invoke.texi: Document it.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

gcc/testsuite/ChangeLog
2012-05-08  Dehao Chen  

* gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
as expected.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 187277)
+++ gcc/doc/invoke.texi (working copy)
@@ -360,7 +360,8 @@
 -fcx-limited-range @gol
 -fdata-sections -fdce -fdelayed-branch @gol
 -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
--fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
+-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations @gol
+-ffat-lto-objects @gol
 -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
 -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
 -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
@@ -6238,6 +6239,7 @@
 -fdefer-pop @gol
 -fdelayed-branch @gol
 -fdse @gol
+-feliminate-malloc @gol
 -fguess-branch-probability @gol
 -fif-conversion2 @gol
 -fif-conversion @gol
@@ -6762,6 +6764,11 @@
 Perform dead store elimination (DSE) on RTL@.
 Enabled by default at @option{-O} and higher.

+@item -feliminate-malloc
+@opindex feliminate-malloc
+Eliminate unnecessary malloc/free pairs.
+Enabled by default at @option{-O} and higher.
+
 @item -fif-conversion
 @opindex fif-conversion
 Attempt to transform conditional jumps into branch-less equivalents.  This
Index: gcc/testsuite/gcc.dg/free-malloc.c
===
--- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
+++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-eliminate-malloc" } */
+/* { dg-final { scan-assembler-times "malloc" 2} } */
+/* { dg-final { scan-assembler-times "free" 2} } */
+
+extern void * malloc (unsigned long);
+extern void free (void *);
+
+void test ()
+{
+  free (malloc (10));
+}
Index: gcc/common.opt
===
--- gcc/common.opt  (revision 187277)
+++ gcc/common.opt  (working copy)
@@ -1474,6 +1474,10 @@
 Common Var(flag_dce) Init(1) Optimization
 Use the RTL dead code elimination pass

+feliminate-malloc
+Common Var(flag_eliminate_malloc) Init(1) Optimization
+Eliminate unnecessary malloc/free pairs
+
 fdse
 Common Var(flag_dse) Init(1) Optimization
 Use the RTL dead store elimination pass
Index: gcc/tree-ssa-dce.c
===
--- gcc/tree-ssa-dce.c  (revision 187277)
+++ gcc/tree-ssa-dce.c  (working copy)
@@ -309,6 +309,8 @@
case BUILT_IN_CALLOC:
case BUILT_IN_ALLOCA:
case BUILT_IN_ALLOCA_WITH_ALIGN:
+ if (!flag_eliminate_malloc)
+   mark_stmt_necessary (stmt, true);
  return;

default:;


[PATCH][Cilkplus] Handling elemental function for C Compiler.

2012-05-08 Thread Iyer, Balaji V
Hello Everyone,
This patch is for the Cilkplus branch affecting mostly the C compiler. This 
patch will insert elemental functions for C programs.

Thanks,

Balaji V. Iyer.

diff --git a/gcc/ChangeLog.cilk b/gcc/ChangeLog.cilk
index d6b28e2..4f01d31 100644
--- a/gcc/ChangeLog.cilk
+++ b/gcc/ChangeLog.cilk
@@ -1,3 +1,17 @@
+2012-05-07  Balaji V. Iyer  
+
+   * c-parser.c (c_parser_declaration_or_fndef): Saved function arguments
+   for an elemental function.
+   * elem-function.c (find_elem_fn_parm_type_1): Added step-size parameter.
+   (find_elem_fn_parm_type): Likewise.
+   * tree-vect-stmts.c (vect_get_vec_def_for_operand): Added step-size
+   support for linear clause.  Also called elem_fn_linear_init_vector for
+   linear clause.
+   (vectorizable_call): When linear clause is set, set the vector type to
+   constant_def.
+   (elem_fn_linear_init_vector): New function.
+   * tree.c (build_elem_fn_linear_vector_from_val): Likewise.
+
 2012-04-24  Balaji V. Iyer  
 
* elem-function.c (find_elem_fn_param_type_1): New function.
diff --git a/gcc/c-parser.c b/gcc/c-parser.c
index 6af434f..04c8eef 100644
--- a/gcc/c-parser.c
+++ b/gcc/c-parser.c
@@ -1640,7 +1640,7 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
   specs->attrs = NULL_TREE;
   while (true)
 {
-  struct c_declarator *declarator;
+  struct c_declarator *declarator = NULL;
   bool dummy = false;
   timevar_id_t tv;
   tree fnbody;
@@ -1705,6 +1705,11 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
  tree d = start_decl (declarator, specs, false,
   chainon (postfix_attrs,
all_prefix_attrs));
+ if (d && TREE_CODE (d) == FUNCTION_DECL
+ && declarator->kind == cdk_function
+ && lookup_attribute ("vector", all_prefix_attrs)
+ && declarator && declarator->u.arg_info)
+   DECL_ARGUMENTS (d) = declarator->u.arg_info->parms;
  if (d)
finish_decl (d, UNKNOWN_LOCATION, NULL_TREE,
 NULL_TREE, asm_name);
diff --git a/gcc/elem-function.c b/gcc/elem-function.c
index e6e3650..f3e4985 100644
--- a/gcc/elem-function.c
+++ b/gcc/elem-function.c
@@ -85,7 +85,7 @@ static tree create_processor_attribute (elem_fn_info *, tree *);
 
 /* this is an helper function for find_elem_fn_param_type */
 static enum elem_fn_parm_type
-find_elem_fn_parm_type_1 (tree fndecl, int parm_no)
+find_elem_fn_parm_type_1 (tree fndecl, int parm_no, tree *step_size)
 {
   int ii = 0;
   elem_fn_info *elem_fn_values;
@@ -96,7 +96,12 @@ find_elem_fn_parm_type_1 (tree fndecl, int parm_no)
 
   for (ii = 0; ii < elem_fn_values->no_lvars; ii++)
 if (elem_fn_values->linear_location[ii] == parm_no)
-  return TYPE_LINEAR;
+  {
+   if (step_size != NULL)
+ *step_size = build_int_cst (integer_type_node,
+ elem_fn_values->linear_steps[ii]);
+   return TYPE_LINEAR;
+  }
 
   for (ii = 0; ii < elem_fn_values->no_uvars; ii++)
 if (elem_fn_values->uniform_location[ii] == parm_no)
@@ -109,7 +114,7 @@ find_elem_fn_parm_type_1 (tree fndecl, int parm_no)
 /* this function will return the type of a parameter in elemental function.
The choices are UNIFORM or LINEAR. */
 enum elem_fn_parm_type
-find_elem_fn_parm_type (gimple stmt, tree op)
+find_elem_fn_parm_type (gimple stmt, tree op, tree *step_size)
 {
   tree fndecl, parm = NULL_TREE;
   int ii, nargs;
@@ -128,7 +133,7 @@ find_elem_fn_parm_type (gimple stmt, tree op)
   parm = gimple_call_arg (stmt, ii);
   if (op == parm)
{
- return_type = find_elem_fn_parm_type_1 (fndecl, 1);
+ return_type = find_elem_fn_parm_type_1 (fndecl, ii, step_size);
  return return_type;
}
 }
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 738a5a7..46dd4a8 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "langhooks.h"
 #include "cilk.h"
 
+extern enum elem_fn_parm_type find_elem_fn_parm_type (gimple, tree, tree *);
 /* Return a variable of type ELEM_TYPE[NELEMS].  */
 
 static tree
@@ -1259,8 +1260,8 @@ vect_get_vec_def_for_operand (tree op, gimple stmt, tree *scalar_def)
   enum vect_def_type dt;
   bool is_simple_use;
   tree vector_type;
-
-  extern enum elem_fn_parm_type find_elem_fn_parm_type (gimple, tree);
+  enum elem_fn_parm_type parm_type;
+  tree step_size = NULL_TREE;
 
   if (vect_print_dump_info (REPORT_DETAILS))
 {
@@ -1289,14 +1290,12 @@ vect_get_vec_def_for_operand (tree op, gimple stmt, tree *scalar_def)
   && gimple_code (stmt) == GIMPLE_CALL
   && is_elem_fn (gimple_call_fndecl (stmt)))
 {
-  enum elem_fn_parm_type parm_type = find_elem_fn_parm_type (stmt, op);
-  if (pa

Re: [committed] Fix lower-subreg cost calculation

2012-05-08 Thread Richard Earnshaw
On 06/05/12 19:55, Richard Sandiford wrote:
> Georg-Johann Lay  writes:
>> TARGET_RTX_COSTS gets called with x = (const_int 1) and outer = SET
>> for example. How do I get SET_DEST from that information?
>>
>> I don't know if lower-subreg.c ever emits such cost requests, but several
>> passes definitely do.
> 
> Gah!  I really should have remembered that insn_rtx_cost happily ignores
> both SETs and SET_DESTs, and skips straight to the SET_SRC.  This caught
> me out when looking at the auto-inc-dec rewrite last year too.  (The problem
> in that case was that insn_rtx_cost ignored the cost of MEMs in stores,
> and only took into account the cost of MEMs in loads.)
> 
> While that probably ought to change, I felt like I was going down a
> rathole last time I looked at it, so this patch does what I should
> have done originally.
> 
> For the record: I wondered whether rtlanal.c should base the default
> register-to-register copy cost for mode M on the lowest move_cost[M][c][c].
> The problem is that move_cost has traditionally been used to choose
> between different classes in the same mode, rather than between modes,
> with 2 as the base cost.  So I don't think it's suitable.
> 
> Tested on x86_64-linux-gnu and with the upcoming MIPS costs.  Installed.
> 
> Sorry for the breakage.
> 
> Richard
> 
> 
> gcc/
>   * lower-subreg.c (shift_cost): Use set_src_cost, avoiding the SET.
>   (compute_costs): Likewise for the zero extension.  Use set_rtx_cost
>   to compute the cost of moves.  Set the mode of the target register.
> 

FTR, this caused
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53278

R.



Re: [PATCH libcpp]: Avoid crash in interpret_float_suffix

2012-05-08 Thread Tom Tromey
> "Tristan" == Tristan Gingold  writes:

Tristan> 2012-05-04  Tristan Gingold  
Tristan>* expr.c (interpret_float_suffix): Add a guard.

Ok.

Tom


Re: rfa: shrink rtl_bb_info

2012-05-08 Thread Michael Matz
Hi,

On Fri, 4 May 2012, Richard Guenther wrote:

> Well, as you touch all places that refer to header/footer why not
> introduce macros to access them ...
> 
> > Currently regstrapping on x86_64-linux, okay if that passes?
> 
> Ok with the above change.

For completeness, this is what I checked in (r187288).  Only change from 
submission is the intro of BB_HEADER/BB_FOOTER macros.


Ciao,
Michael.
---
2012-05-03  Michael Matz  

* basic-block.h (struct rtl_bb_info): Remove visited member and
move head_ member to ...
(struct basic_block_def.basic_block_il_dependent): ... the new
member x, replacing but containing old member rtl.
(enum bb_flags): New BB_VISITED flag.
(BB_HEADER, BB_FOOTER): New macros.

* jump.c (mark_all_labels): Adjust.
* cfgcleanup.c (try_optimize_cfg): Adjust.
* cfglayout.c (record_effective_endpoints): Adjust.
(relink_block_chain): Ditto (and don't fiddle with visited).
(fixup_reorder_chain): Adjust.
(fixup_fallthru_exit_predecessor): Ditto.
(cfg_layout_duplicate_bb): Ditto.
* combine.c (update_cfg_for_uncondjump): Adjust.
* bb-reorder.c (struct bbro_basic_block_data_def): Add visited
member.
(bb_visited_trace): New accessor.
(mark_bb_visited): Move in front.
(rotate_loop): Use bb_visited_trace.
(find_traces_1_round): Ditto.
(emit_barrier_after): Ditto.
(copy_bb): Ditto, and initialize visited on resize.
(reorder_basic_blocks): Initialize visited member.
(duplicate_computed_gotos): Clear bb flags at start, use
BB_VISITED flags.

* cfgrtl.c (try_redirect_by_replacing_jump): Adjust.
(rtl_verify_flow_info_1): Ditto.
(cfg_layout_split_block): Ditto.
(cfg_layout_delete_block): Ditto.
(cfg_layout_merge_blocks): Ditto.
(init_rtl_bb_info): Adjust and initialize il.x.head_ member.

Index: basic-block.h
===
--- basic-block.h   (revision 187099)
+++ basic-block.h   (working copy)
@@ -102,17 +102,14 @@ extern const struct gcov_ctr_summary *pr
 struct loop;
 
 struct GTY(()) rtl_bb_info {
-  /* The first and last insns of the block.  */
-  rtx head_;
+  /* The first insn of the block is embedded into bb->il.x.  */
+  /* The last insn of the block.  */
   rtx end_;
 
   /* In CFGlayout mode points to insn notes/jumptables to be placed just before
  and after the block.   */
-  rtx header;
-  rtx footer;
-
-  /* This field is used by the bb-reorder pass.  */
-  int visited;
+  rtx header_;
+  rtx footer_;
 };
 
 struct GTY(()) gimple_bb_info {
@@ -169,7 +166,10 @@ struct GTY((chain_next ("%h.next_bb"), c
 
   union basic_block_il_dependent {
   struct gimple_bb_info GTY ((tag ("0"))) gimple;
-  struct rtl_bb_info * GTY ((tag ("1"))) rtl;
+  struct {
+rtx head_;
+struct rtl_bb_info * rtl;
+  } GTY ((tag ("1"))) x;
 } GTY ((desc ("((%1.flags & BB_RTL) != 0)"))) il;
 
   /* Expected number of executions: calculated in profile.c.  */
@@ -260,7 +260,10 @@ enum bb_flags
  df_set_bb_dirty, but not cleared by df_analyze, so it can be used
  to test whether a block has been modified prior to a df_analyze
  call.  */
-  BB_MODIFIED = 1 << 12
+  BB_MODIFIED = 1 << 12,
+
+  /* A general visited flag for passes to use.  */
+  BB_VISITED = 1 << 13
 };
 
 /* Dummy flag for convenience in the hot/cold partitioning code.  */
@@ -415,8 +418,10 @@ struct GTY(()) control_flow_graph {
 
 /* Stuff for recording basic block info.  */
 
-#define BB_HEAD(B)  (B)->il.rtl->head_
-#define BB_END(B)   (B)->il.rtl->end_
+#define BB_HEAD(B)  (B)->il.x.head_
+#define BB_END(B)   (B)->il.x.rtl->end_
+#define BB_HEADER(B)(B)->il.x.rtl->header_
+#define BB_FOOTER(B)(B)->il.x.rtl->footer_
 
 /* Special block numbers [markers] for entry and exit.
Neither of them is supposed to hold actual statements.  */
Index: jump.c
===
--- jump.c  (revision 187098)
+++ jump.c  (working copy)
@@ -275,13 +275,13 @@ mark_all_labels (rtx f)
  /* In cfglayout mode, there may be non-insns between the
 basic blocks.  If those non-insns represent tablejump data,
 they contain label references that we must record.  */
- for (insn = bb->il.rtl->header; insn; insn = NEXT_INSN (insn))
+ for (insn = BB_HEADER (bb); insn; insn = NEXT_INSN (insn))
if (INSN_P (insn))
  {
gcc_assert (JUMP_TABLE_DATA_P (insn));
mark_jump_label (PATTERN (insn), insn, 0);
  }
- for (insn = bb->il.rtl->footer; insn; insn = NEXT_INSN (insn))
+ for (insn = BB_FOOTER (bb); insn; insn = NEXT_INSN (insn))
if (INSN_P (insn))
  {
gcc_assert (JUMP_TA

Re: [RFC PATCH, i386]: Implement ix86_set_reg_reg_cost, fix PR 53250 "Splitting reg" failure.

2012-05-08 Thread Uros Bizjak
On Mon, May 7, 2012 at 10:40 PM, Uros Bizjak  wrote:

> 2012-05-07  Uros Bizjak  
>
>        PR target/53250
>        * config/i386/i386.c (ix86_set_reg_reg_cost): New function.
>        (ix86_rtx_costs): Handle SET.
>
>> Patch was tested on x86_64-pc-linux-gnu {,-m32}.
>>
>> I have also #define LOG_COST 1 temporarily and looked at generated
>> cost calculations for various -msse* settings.
>>
>> I will wait for a day for possible comments on the implementation
>> before the patch will be committed to mainline SVN.

Committed to mainline SVN at r187289.

Uros.


Re: RFA: PR target/53120, constraint modifier "+" on operand tied by matching-constraint, "0".

2012-05-08 Thread nick clifton

Hi DJ,

Make sure a match_dup will still match the generated pattern later,
I've had problems with match_dup not matching two rtx that
rtx_equals() says are "the same" but not physically the same.


I have tried, but failed, to find a way to trigger the use of the 
bset_qi pattern. :-(  I tried rebuilding the toolchain and running the 
GCC testsuite, but neither of these worked.  Do you have a test case 
that triggers it ?


Cheers
  Nick




Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-08 Thread Xinliang David Li
To be clear, this flag is for malloc implementation (such as tcmalloc)
with side effect unknown to the compiler. Using -fno-builtin-xxx is
too conservative for that purpose.

David

On Tue, May 8, 2012 at 7:43 AM, Dehao Chen  wrote:
> Hello,
>
> This patch adds a flag to guard the optimization that optimizes the
> following code away:
>
> free (malloc (4));
>
> In some cases, we'd like this type of malloc/free pairs to remain in
> the optimized code.
>
> Tested with bootstrap, and no regression in the gcc testsuite.
>
> Is it ok for mainline?
>
> Thanks,
> Dehao
>
> gcc/ChangeLog
> 2012-05-08  Dehao Chen  
>
>        * common.opt (feliminate-malloc): New.
>        * doc/invoke.texi: Document it.
>        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.
>
> gcc/testsuite/ChangeLog
> 2012-05-08  Dehao Chen  
>
>        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
>        as expected.
>
> Index: gcc/doc/invoke.texi
> ===
> --- gcc/doc/invoke.texi (revision 187277)
> +++ gcc/doc/invoke.texi (working copy)
> @@ -360,7 +360,8 @@
>  -fcx-limited-range @gol
>  -fdata-sections -fdce -fdelayed-branch @gol
>  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
> --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
> +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations @gol
> +-ffat-lto-objects @gol
>  -ffast-math -ffinite-math-only -ffloat-store
> -fexcess-precision=@var{style} @gol
>  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
>  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
> @@ -6238,6 +6239,7 @@
>  -fdefer-pop @gol
>  -fdelayed-branch @gol
>  -fdse @gol
> +-feliminate-malloc @gol
>  -fguess-branch-probability @gol
>  -fif-conversion2 @gol
>  -fif-conversion @gol
> @@ -6762,6 +6764,11 @@
>  Perform dead store elimination (DSE) on RTL@.
>  Enabled by default at @option{-O} and higher.
>
> +@item -feliminate-malloc
> +@opindex feliminate-malloc
> +Eliminate unnecessary malloc/free pairs.
> +Enabled by default at @option{-O} and higher.
> +
>  @item -fif-conversion
>  @opindex fif-conversion
>  Attempt to transform conditional jumps into branch-less equivalents.  This
> Index: gcc/testsuite/gcc.dg/free-malloc.c
> ===
> --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
> +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-eliminate-malloc" } */
> +/* { dg-final { scan-assembler-times "malloc" 2} } */
> +/* { dg-final { scan-assembler-times "free" 2} } */
> +
> +extern void * malloc (unsigned long);
> +extern void free (void *);
> +
> +void test ()
> +{
> +  free (malloc (10));
> +}
> Index: gcc/common.opt
> ===
> --- gcc/common.opt      (revision 187277)
> +++ gcc/common.opt      (working copy)
> @@ -1474,6 +1474,10 @@
>  Common Var(flag_dce) Init(1) Optimization
>  Use the RTL dead code elimination pass
>
> +feliminate-malloc
> +Common Var(flag_eliminate_malloc) Init(1) Optimization
> +Eliminate unnecessary malloc/free pairs
> +
>  fdse
>  Common Var(flag_dse) Init(1) Optimization
>  Use the RTL dead store elimination pass
> Index: gcc/tree-ssa-dce.c
> ===
> --- gcc/tree-ssa-dce.c  (revision 187277)
> +++ gcc/tree-ssa-dce.c  (working copy)
> @@ -309,6 +309,8 @@
>            case BUILT_IN_CALLOC:
>            case BUILT_IN_ALLOCA:
>            case BUILT_IN_ALLOCA_WITH_ALIGN:
> +             if (!flag_eliminate_malloc)
> +               mark_stmt_necessary (stmt, true);
>              return;
>
>            default:;


Re: [C++ Patch and pubnames 2/2] Adjust c decl pretty printer to match demangler (issue6195056)

2012-05-08 Thread Sterling Augustine
On Mon, May 7, 2012 at 6:44 PM, Gabriel Dos Reis
 wrote:
> On Mon, May 7, 2012 at 7:10 PM, Sterling Augustine
>  wrote:
>> This is the second in the series of patches to make c decl pretty printing
>> more closely match the demangler. A full explanation is here:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00512.html
>>
>> OK for mainline?
>
> Now I realize something that is wrong with the previous patch.
> Writing 'const T*' in C++ is a very widespread style; it is also the
> style used in the C++ standard, TC++PL (the de facto popular reference),
> our own C++ standard library implementation, and many popular modern
> C++ textbooks.
> It is a strongly well established style.
> Changing the pretty printer to satisfy the demangler as opposed to users
> looks wrong-headed.

I'm most definitely not trying to satisfy the demangler--I'm trying to
make GCC's naming consistent with the rest of the tool chain, which
will be good for users. Consider the C++ function:

int foo(const char *bar);

The problem is that the toolchain disagrees on the canonical
pretty-name of this function. If you use nm, objdump, or readelf, or
anything that must recover the name from the binary, you will see:

int foo(char const*)

The demangler follows the documented gnu_v3 demangling convention. As
far as I can tell, the C++ front end is ad-hoc. This disagreement
creates inconsistencies in the debugging information that are confusing
to users.

Do you have a suggestion for fixing the disagreement? I would love to
add this as a parameter somewhere, but the decision is very deep in
the internals of the pretty printer.

Sterling


Re: [C++ Patch and pubnames 2/2] Adjust c decl pretty printer to match demangler (issue6195056)

2012-05-08 Thread Gabriel Dos Reis
On Tue, May 8, 2012 at 11:20 AM, Sterling Augustine
 wrote:

> Do you have a suggestion for fixing the disagreement? I would love to
> add this as a parameter somewhere, but the decision is very deep in
> the internals of the pretty printer.

I disagree with your characterization that the pretty-printer is ad-hoc.
Since its inception, I chose to follow closely the established C++ style
as previously explained.  The pretty printer was designed as a tool
for diagnostics to users, not for the demangler.  If the demangler happens
to find some of its functionalities useful, the proper action would be
to customize it as opposed to a whole change to satisfy its internals.

A way of doing this is to have the pretty-printer object initialized
with a style
(e.g. standard C++ style, gdb style, etc.).  The places you want to
change should
be predicated on the current style matching gdb style, etc.
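
A minimal sketch of that idea, in plain C and not taken from GCC's
pretty-printer sources: the printer carries a style chosen at
initialization, and only the spot that orders the cv-qualifier consults it.
The names toy_printer, TOY_STYLE_STANDARD and TOY_STYLE_GDB are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical output styles for the toy printer.  */
enum toy_style { TOY_STYLE_STANDARD, TOY_STYLE_GDB };

/* Hypothetical printer object; the style is fixed when it is set up.  */
struct toy_printer
{
  enum toy_style style;
};

/* Write "pointer to const BASE" into BUF (at most LEN bytes).
   Standard style gives "const char*", gdb style gives "char const*".
   Returns the value of snprintf.  */
static int
toy_print_const_ptr (const struct toy_printer *pp, const char *base,
                     char *buf, size_t len)
{
  assert (pp != NULL && base != NULL && buf != NULL);
  if (pp->style == TOY_STYLE_GDB)
    return snprintf (buf, len, "%s const*", base);
  return snprintf (buf, len, "const %s*", base);
}
```

Diagnostics would keep using the standard style, while a pubnames or
debug-info consumer would set up its printer with the other one.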

-- Gaby


Re: [patch, fortran] PR fortran/52537 Optimize string comparisons against empty strings

2012-05-08 Thread Tobias Burnus
Hello Thomas,

below a very timely review - your patch is not even a month old
and was never pinged, besides, you have chosen an unlucky day.
(In other words: Sorry for the slow review.)

Thomas Koenig wrote on Fri, 13 Apr 2012:
> this patch replaces  a != '' with len_trim(a) != 0, to
> speed up the comparison.

I wonder how much it helps - especially for the real world
code. Let's see whether the bug reporter will report back.
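
As a reminder of why the replacement is valid: Fortran pads the shorter
operand of a character comparison with blanks, so a string equals '' exactly
when it consists only of blanks, i.e. when its trimmed length is zero.  A
small C model of that equivalence follows; it is a sketch for illustration,
not code from the patch, and the toy_* names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Length of S[0..LEN) with trailing blanks removed, modelling LEN_TRIM.  */
static size_t
toy_len_trim (const char *s, size_t len)
{
  assert (s != NULL);
  while (len > 0 && s[len - 1] == ' ')
    len--;
  return len;
}

/* Blank-padded comparison of S[0..LEN) against the empty string:
   equal exactly when every character is a blank.  */
static int
toy_equals_empty (const char *s, size_t len)
{
  assert (s != NULL);
  for (size_t i = 0; i < len; i++)
    if (s[i] != ' ')
      return 0;
  return 1;
}
```

For any input the two agree: toy_equals_empty is true if and only if
toy_len_trim returns 0.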


Can you also check kind=4 string in the test case?
I think your patch should simply work, but having a test
surely cannot harm.


> +  /* Only use new-style comparisions.  */
> +  switch(op)
> +{
> +case INTRINSIC_EQ_OS:
> +  op = INTRINSIC_EQ;
> +  break;

I have to admit that I do not like that part. At least for
this patch, I think it neither makes the code clearer nor
shorter. The only hypothetical advantage I see is that it
avoids some issues related to forgetting the _OS version in
the switch statements. Thus, my answer whether I like the
change is a (very weak) NO. But my answer to whether you
may do the change is YES.


>  }
>
> +/* Return true if a constant string contains spaces only.  */

Nit: Missing line break. (Two empty lines separate functions.)
I would use "only spaces" or even "only blanks" instead of
"spaces only".


> +  for (i=0; i<e->value.character.length; i++)

Missing space around the "<", i.e. "i < e->value".

> +}
> +
> +/* Insert a call to the intrinsic len_trim. Use a different name for

Empty line missing.

> +}
> +
>  /* Optimize expressions for equality.  */

Ditto.

> +  gfc_get_sym_tree("__internal_len_trim", current_ns, &fcn->symtree, false);

Blank missing before "(".


> +{
> +
> +  bool empty_op1, empty_op2;

Spurious empty line.


> +  empty_op1 = empty_string(op1);
> +  empty_op2 = empty_string(op2);

Blank missing before "(".


Otherwise, the patch is OK.

Tobias


Re: [rtl, patch] combine concat+shuffle

2012-05-08 Thread Marc Glisse

Here is a new version.

gcc/ChangeLog
2012-05-08  Marc Glisse  

* simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
of concatenations.

gcc/testsuite/ChangeLog
2012-05-08  Marc Glisse  

* gcc.target/i386/shuf-concat.c: New test.


On Tue, 8 May 2012, Richard Sandiford wrote:


Very minor, but this code probably belongs in the else part of the
if (!VECTOR_MODE_P (mode)) block.


I moved it in the block. Note that the piece of code right below, that 
starts with:


  if (XVECLEN (trueop1, 0) == 1
  && CONST_INT_P (XVECEXP (trueop1, 0, 0))
  && GET_CODE (trueop0) == VEC_CONCAT)

could probably move too. I had put mine right below because they do 
similar things. By the way, reusing that piece of code and applying it to 
each of the 2 selected parts of the vector would be one way to generalize 
my patch so it also applies to the vpermilpd case.
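
To illustrate the indexing used in the patch, here is a toy model in plain
C, not RTL: the first operand is a concatenation of two two-element halves,
and a selector index i in 0..3 picks half i / 2, element i % 2.  The function
name and the array representation are made up for this sketch.

```c
#include <assert.h>

/* OP0 models the concatenation {{a, b}, {c, d}} as op0[half][element].
   Selecting elements I0 and I1 of the four-element concatenation reads
   them straight from the halves, the way the patch uses
   XEXP (XEXP (trueop0, i / 2), i % 2).  */
static void
toy_select_from_concat (double op0[2][2], unsigned int i0, unsigned int i1,
                        double out[2])
{
  assert (i0 < 4 && i1 < 4);
  out[0] = op0[i0 / 2][i0 % 2];
  out[1] = op0[i1 / 2][i1 % 2];
}
```

For example, with halves {a, b} and {c, d}, the selector (1, 2) yields
{b, c} without ever forming the four-element vector.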


Note to self: if you want to grep for "shuf" in the asm, don't put "shuf" 
in the name of the file...


--
Marc Glisse

Index: testsuite/gcc.target/i386/shuf-concat.c
===
--- testsuite/gcc.target/i386/shuf-concat.c (revision 0)
+++ testsuite/gcc.target/i386/shuf-concat.c (revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+typedef double v2df __attribute__ ((__vector_size__ (16)));
+
+v2df f(double d,double e){
+  v2df x={-d,d};
+  v2df y={-e,e};
+  return __builtin_ia32_shufpd(x,y,1);
+}
+
+/* { dg-final { scan-assembler-not "shufpd" } } */
+/* { dg-final { scan-assembler-times "unpck" 1 } } */

Property changes on: testsuite/gcc.target/i386/shuf-concat.c
___
Added: svn:eol-style
   + native
Added: svn:keywords
   + Author Date Id Revision URL

Index: simplify-rtx.c
===
--- simplify-rtx.c  (revision 187276)
+++ simplify-rtx.c  (working copy)
@@ -1,10 +1,10 @@
 /* RTL simplification functions for GNU compiler.
Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
-   2011  Free Software Foundation, Inc.
+   2011, 2012  Free Software Foundation, Inc.
 
 This file is part of GCC.
 
 GCC is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free
 Software Foundation; either version 3, or (at your option) any later
@@ -3239,12 +3239,33 @@ simplify_binary_operation_1 (enum rtx_co
  RTVEC_ELT (v, i) = CONST_VECTOR_ELT (trueop0,
   INTVAL (x));
}
 
  return gen_rtx_CONST_VECTOR (mode, v);
}
+
+ /* If we build {a,b} then permute it, build the result directly.  */
+ if (XVECLEN (trueop1, 0) == 2
+ && CONST_INT_P (XVECEXP (trueop1, 0, 0))
+ && CONST_INT_P (XVECEXP (trueop1, 0, 1))
+ && GET_CODE (trueop0) == VEC_CONCAT
+ && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
+ && GET_MODE (XEXP (trueop0, 0)) == mode
+ && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
+ && GET_MODE (XEXP (trueop0, 1)) == mode)
+   {
+ unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
+ unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
+ rtx subop0, subop1;
+
+ gcc_assert (i0 < 4 && i1 < 4);
+ subop0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
+ subop1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);
+
+ return simplify_gen_binary (VEC_CONCAT, mode, subop0, subop1);
+   }
}
 
   if (XVECLEN (trueop1, 0) == 1
  && CONST_INT_P (XVECEXP (trueop1, 0, 0))
  && GET_CODE (trueop0) == VEC_CONCAT)
{


[google-4_6] fix profile mismatch in stream LIPO (issue6194059)

2012-05-08 Thread Rong Xu
Hi,

This patch is for google-4_6 branch only.

It fixes a profile mismatch in streaming LIPO and recovers the performance loss
due to this.

Tested with google internal benchmarks.

Thanks,

2012-05-08   Rong Xu  

* ipa-inline.c (fixed_arg_function_p):
(better_inline_comdat_function_p): match lipo_gen in stream LIPO.

Index: ipa-inline.c
===
--- ipa-inline.c        (revision 187275)
+++ ipa-inline.c        (working copy)
@@ -658,7 +658,7 @@ fixed_arg_function_p (tree fndecl)
 static bool
 better_inline_comdat_function_p (struct cgraph_node *node)
 {
-  return (profile_arc_flag && flag_dyn_ipa
+  return (profile_arc_flag && (flag_dyn_ipa || flag_ripa_stream)
   && DECL_COMDAT (node->decl)
   && node->global.size <= PARAM_VALUE (PARAM_MAX_INLINE_INSNS_SINGLE)
   && fixed_arg_function_p (node->decl));

--
This patch is available for review at http://codereview.appspot.com/6194059


Re: [google-4_6] fix profile mismatch in stream LIPO (issue6194059)

2012-05-08 Thread Xinliang David Li
Ok.

David

On Tue, May 8, 2012 at 9:53 AM, Rong Xu  wrote:
> Hi,
>
> This patch is for google-4_6 branch only.
>
> It fixes a profile mismatch in streaming LIPO and recovers the performance 
> loss
> due to this.
>
> Tested with google internal benchmarks.
>
> Thanks,
>
> 2012-05-08   Rong Xu  
>
>        * ipa-inline.c (fixed_arg_function_p):
>        (better_inline_comdat_function_p): match lipo_gen in stream LIPO.
>
> Index: ipa-inline.c
> ===
> --- ipa-inline.c        (revision 187275)
> +++ ipa-inline.c        (working copy)
> @@ -658,7 +658,7 @@ fixed_arg_function_p (tree fndecl)
>  static bool
>  better_inline_comdat_function_p (struct cgraph_node *node)
>  {
> -  return (profile_arc_flag && flag_dyn_ipa
> +  return (profile_arc_flag && (flag_dyn_ipa || flag_ripa_stream)
>           && DECL_COMDAT (node->decl)
>           && node->global.size <= PARAM_VALUE (PARAM_MAX_INLINE_INSNS_SINGLE)
>           && fixed_arg_function_p (node->decl));
>
> --
> This patch is available for review at http://codereview.appspot.com/6194059


[testsuite] Fix gcc.target/i386/hle-* testcases with Sun as

2012-05-08 Thread Rainer Orth
Several /gcc.target/i386/hle-*.c tests are currently failing on Solaris
9/x86 with Sun as:

FAIL: gcc.target/i386/hle-add-acq-1.c scan-assembler lock[ 
\\n\\t]+(xacquire|.byte[ \\t]+0xf2)[ \\t\\n]+add

The .s file has

lock;
.byte   0xf2

but the scan-assembler regex currently doesn't allow for the ; (which is
not present with gas 2.22).

The patch below does just that.  Tested with the appropriate runtest
invocation on i386-pc-solaris2.9 configured with as and gas
respectively.
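
To illustrate the regex change described above, here is a minimal sketch that transcribes the old and new scan-assembler patterns into POSIX extended regex syntax and checks them with regcomp/regexec.  The testsuite itself uses Tcl regular expressions, so this is only an approximation; the helper name `matches` and the two sample assembler strings are made up for illustration and are not taken from real compiler output.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if PATTERN (POSIX extended syntax) matches anywhere in TEXT,
   0 otherwise.  Hypothetical helper, not part of the testsuite.  */
static int
matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* The pattern before and after the patch, transcribed to POSIX ERE.  */
static const char old_re[]
  = "lock[ \n\t]+(xacquire|\\.byte[ \t]+0xf2)[ \t\n]+add";
static const char new_re[]
  = "lock;?[ \n\t]+(xacquire|\\.byte[ \t]+0xf2)[ \t\n]+add";

/* Made-up assembler fragments in the two styles discussed: one with
   "lock;" followed by a .byte prefix, one with a plain "lock".  */
static const char semicolon_style[]
  = "\tlock;\n\t.byte\t0xf2\n\taddl\t%esi, (%rdi)\n";
static const char plain_style[]
  = "\tlock xacquire addl\t%esi, (%rdi)\n";
```

Under these assumptions the old pattern rejects the "lock;" form because the character after "lock" must be whitespace, while the optional ";" in the new pattern accepts both forms.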

Ok for mainline?

Rainer


2012-05-08  Rainer Orth  

* gcc.target/i386/hle-add-acq-1.c: Allow for ; after lock.
* gcc.target/i386/hle-add-rel-1.c: Likewise.
* gcc.target/i386/hle-and-acq-1.c: Likewise.
* gcc.target/i386/hle-and-rel-1.c: Likewise.
* gcc.target/i386/hle-cmpxchg-acq-1.c: Likewise.
* gcc.target/i386/hle-cmpxchg-rel-1.c: Likewise.
* gcc.target/i386/hle-or-acq-1.c: Likewise.
* gcc.target/i386/hle-or-rel-1.c: Likewise.
* gcc.target/i386/hle-sub-acq-1.c: Likewise.
* gcc.target/i386/hle-sub-rel-1.c: Likewise.
* gcc.target/i386/hle-xadd-acq-1.c: Likewise.
* gcc.target/i386/hle-xadd-rel-1.c: Likewise.
* gcc.target/i386/hle-xor-acq-1.c: Likewise.
* gcc.target/i386/hle-xor-rel-1.c: Likewise.

# HG changeset patch
# Parent 0b93bf7b6bda6f699d288070fb3e186fcde95cf8
Fix gcc.target/i386/hle-* testcases with Sun as

diff --git a/gcc/testsuite/gcc.target/i386/hle-add-acq-1.c b/gcc/testsuite/gcc.target/i386/hle-add-acq-1.c
--- a/gcc/testsuite/gcc.target/i386/hle-add-acq-1.c
+++ b/gcc/testsuite/gcc.target/i386/hle-add-acq-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mhle" } */
-/* { dg-final { scan-assembler "lock\[ \n\t\]+\(xacquire\|\.byte\[ \t\]+0xf2\)\[ \t\n\]+add" } } */
+/* { dg-final { scan-assembler "lock;?\[ \n\t\]+\(xacquire\|\.byte\[ \t\]+0xf2\)\[ \t\n\]+add" } } */
 
 void
 hle_add (int *p, int v)
diff --git a/gcc/testsuite/gcc.target/i386/hle-add-rel-1.c b/gcc/testsuite/gcc.target/i386/hle-add-rel-1.c
--- a/gcc/testsuite/gcc.target/i386/hle-add-rel-1.c
+++ b/gcc/testsuite/gcc.target/i386/hle-add-rel-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mhle" } */
-/* { dg-final { scan-assembler "lock\[ \n\t\]+\(xrelease\|\.byte\[ \t\]+0xf3\)\[ \t\n\]+add" } } */
+/* { dg-final { scan-assembler "lock;?\[ \n\t\]+\(xrelease\|\.byte\[ \t\]+0xf3\)\[ \t\n\]+add" } } */
 
 void
 hle_add (int *p, int v)
diff --git a/gcc/testsuite/gcc.target/i386/hle-and-acq-1.c b/gcc/testsuite/gcc.target/i386/hle-and-acq-1.c
--- a/gcc/testsuite/gcc.target/i386/hle-and-acq-1.c
+++ b/gcc/testsuite/gcc.target/i386/hle-and-acq-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mhle" } */
-/* { dg-final { scan-assembler "lock\[ \n\t\]+\(xacquire\|\.byte\[ \t\]+0xf2\)\[ \t\n\]+and" } } */
+/* { dg-final { scan-assembler "lock;?\[ \n\t\]+\(xacquire\|\.byte\[ \t\]+0xf2\)\[ \t\n\]+and" } } */
 
 void
 hle_and (int *p, int v)
diff --git a/gcc/testsuite/gcc.target/i386/hle-and-rel-1.c b/gcc/testsuite/gcc.target/i386/hle-and-rel-1.c
--- a/gcc/testsuite/gcc.target/i386/hle-and-rel-1.c
+++ b/gcc/testsuite/gcc.target/i386/hle-and-rel-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mhle" } */
-/* { dg-final { scan-assembler "lock\[ \n\t\]+\(xrelease\|\.byte\[ \t\]+0xf3\)\[ \t\n\]+and" } } */
+/* { dg-final { scan-assembler "lock;?\[ \n\t\]+\(xrelease\|\.byte\[ \t\]+0xf3\)\[ \t\n\]+and" } } */
 
 void
 hle_and (int *p, int v)
diff --git a/gcc/testsuite/gcc.target/i386/hle-cmpxchg-acq-1.c b/gcc/testsuite/gcc.target/i386/hle-cmpxchg-acq-1.c
--- a/gcc/testsuite/gcc.target/i386/hle-cmpxchg-acq-1.c
+++ b/gcc/testsuite/gcc.target/i386/hle-cmpxchg-acq-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-march=x86-64 -mhle" } */
-/* { dg-final { scan-assembler "lock\[ \n\t\]+\(xacquire\|\.byte\[ \t\]+0xf2\)\[ \t\n\]+cmpxchg" } } */
+/* { dg-final { scan-assembler "lock;?\[ \n\t\]+\(xacquire\|\.byte\[ \t\]+0xf2\)\[ \t\n\]+cmpxchg" } } */
 
 int
 hle_cmpxchg (int *p, int oldv, int newv)
diff --git a/gcc/testsuite/gcc.target/i386/hle-cmpxchg-rel-1.c b/gcc/testsuite/gcc.target/i386/hle-cmpxchg-rel-1.c
--- a/gcc/testsuite/gcc.target/i386/hle-cmpxchg-rel-1.c
+++ b/gcc/testsuite/gcc.target/i386/hle-cmpxchg-rel-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-march=x86-64 -mhle" } */
-/* { dg-final { scan-assembler "lock\[ \n\t\]+\(xrelease\|\.byte\[ \t\]+0xf3\)\[ \t\n\]+cmpxchg" } } */
+/* { dg-final { scan-assembler "lock;?\[ \n\t\]+\(xrelease\|\.byte\[ \t\]+0xf3\)\[ \t\n\]+cmpxchg" } } */
 
 int
 hle_cmpxchg (int *p, int oldv, int newv)
diff --git a/gcc/testsuite/gcc.target/i386/hle-or-acq-1.c b/gcc/testsuite/gcc.target/i386/hle-or-acq-1.c
--- a/gcc/testsuite/gcc.target/i386/hle-or-acq-1.c
+++ b/gcc/testsuite/gcc.target/i386/hle-or-acq-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mhle" } */
-/* { dg-final { scan-assembler "lock\[ \n\t\]+\(xa

Re: [PATCH] libgcov support for profile collection in region of interest (issue6186044)

2012-05-08 Thread Jan Hubicka
> Hi Honza,
> 
> I added L_gcov_reset and L_gcov_dump for the new interfaces, and also
> added a description into the gcov man page. Let me know if it looks
> ok now.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> Thanks,
> Teresa
> 
> 2012-05-08   Teresa Johnson  
> 
>   * libgcc/libgcov.c (gcov_clear, __gcov_reset): New functions.
>   (__gcov_dump): Ditto.
>   (gcov_dump_complete): New global variable.
>   (__gcov_flush): Outline functionality now in gcov_clear.
>   * gcc/gcov-io.h (__gcov_reset, __gcov_dump): Declare.
>   * libgcc/Makefile.in (L_gcov_reset, L_gcov_dump): Define.
>   * gcc/doc/gcov.texi: Add note on using __gcov_reset and __gcov_dump.

It seems OK now, though gcov_clear will end up being in both gcov_reset and
gcov_flush, but I suppose it is short enough to make this issue moot
(otherwise we could make it hidden exported from gcov_flush psection)

Honza


Re: [PATCH] MIPS16: Fix truncated DWARF-2 line information

2012-05-08 Thread Richard Sandiford
"Maciej W. Rozycki"  writes:
>> > and the mips-linux-gnu configuration is not ready yet for MIPS16 testing.
>> 
>> Out of interest, what goes wrong?  I've been testing -mabi=32/-mips16 on
>> mips64-linux-gnu for some time without difficulty.
>
>  I've thought some pieces are missing upstream, but perhaps I've been 
> confused.  I reckon there was a nasty issue with GCC confusing the symbols 
> used (using the wrong symbol alias or failing to use one) in the context 
> of using MIPS16 thunks and PLT (that we discovered as soon as or shortly 
> after we started using such a setup, so that wasn't anything particularly 
> obscure), but perhaps the fix for that issue has been actually submitted 
> and included upstream already.
>
>  Are you using a hard-float multilib for your -mabi=32/-mips16 Linux 
> testing?

Yeah.  As an example:

   http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg00393.html

which doesn't look too bad.  Clean fortran results, which I expect
would test the FP interworking fairly heavily.  (It's certainly
been a source of bug fixes in the past, although I don't remember
the results ever being terrible.)

FAOD, this is with normal MIPS libraries and mips16 executables.
There's still no way of building mips16 multilibs out of the box.

Richard


Re: [PATCH] MIPS16: Remove DWARF-2 location information from GP accesses

2012-05-08 Thread Richard Sandiford
"Maciej W. Rozycki"  writes:
> gcc-mips16-gp-pseudo-loc.patch
> Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.c
> ===
> --- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.c   2012-05-02 
> 23:42:46.185566469 +0100
> +++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.c2012-05-03 
> 18:55:28.775580939 +0100
> @@ -2622,7 +2622,8 @@ mips16_gp_pseudo_reg (void)
>   scan = NEXT_INSN (scan);
>  
>insn = gen_load_const_gp (cfun->machine->mips16_gp_pseudo_rtx);
> -  emit_insn_after (insn, scan);
> +  insn = emit_insn_after (insn, scan);
> +  INSN_LOCATOR (insn) = 0;
>  
>pop_topmost_sequence ();
>  }

An alternative would be to use prologue_locator, like ARM does.
I'm not sure whether that's an improvement though, so the patch
is OK as-is, thanks.

Richard


Re: [rtl, patch] combine concat+shuffle

2012-05-08 Thread Richard Sandiford
Marc Glisse  writes:
> Here is a new version.
>
> gcc/ChangeLog
> 2012-05-08  Marc Glisse  
>
>   * simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
>   of concatenations.

OK, thanks.  I'll leave an x86 maintainer to review the testcase,
but it looks like it'll need some markup to ensure an SSE target.

> Note to self: if you want to grep for "shuf" in the asm, don't put "shuf" 
> in the name of the file...

Yeah :-)  For MIPS tests I tend to add "\t" to the beginning of the regexp.
(And to the end if possible.)

Richard


Re: RFA: PR target/53120, constraint modifier "+" on operand tied by matching-constraint, "0".

2012-05-08 Thread DJ Delorie

#define q ((char *)0x1234)

foo(int x)
{
  *q |= (1 << (char)x);
}

$ m32c-elf-gcc -S -O3 nick.c

.global _foo
_foo:
mov.w   r1,a0           ; 20    movhi_op/3
bset    4660[a0]        ; 11    bset_qi
rts                     ; 23    epilogue_rts


Re: patch ping: Add static branch predict heuristic of comparing IV to loop_bound variable

2012-05-08 Thread Jakub Jelinek
On Fri, May 04, 2012 at 09:46:22PM +0800, Dehao Chen wrote:
> Thanks for the prompt response. Attached is the updated patch.
> 
> Passed bootstrap and all regression tests.

All the new testcases fail for me, on both x86_64-linux and i686-linux,
apparently because of incorrectly committed patch (each testcase source
contains the test twice).

Also, the ChangeLog entries are missing dot at end of each change
description (New instead of New. etc.).

> Index: gcc/testsuite/gcc.dg/predict-3.c
> ===
> --- gcc/testsuite/gcc.dg/predict-3.c  (revision 0)
> +++ gcc/testsuite/gcc.dg/predict-3.c  (revision 0)
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> +
> +extern int global;
> +
> +int bar(int);
> +
> +void foo (int bound)
> +{
> +  int i, ret = 0;
> +  for (i = 0; i <= bound; i++)
> +{
> +  if (i < bound - 2)
> + global += bar (i);
> +  if (i <= bound)
> + global += bar (i);
> +  if (i + 1 < bound)
> + global += bar (i);
> +  if (i != bound)
> + global += bar (i);
> +}
> +}
> +
> +/* { dg-final { scan-tree-dump-times "loop iv compare heuristics: 100.0%" 4 "profile_estimate"} } */
> +/* { dg-final { cleanup-tree-dump "profile_estimate" } } */

Jakub


[PATCH] Fix endless loop in forwprop (PR tree-optimization/53226)

2012-05-08 Thread Jakub Jelinek
Hi!

The attached testcase loops endlessly, using more and more memory.
The problem is that the prev stmt iterator sometimes references stmts that
remove_prop_source_from_use decides to remove, and since Michael's
gimple seq changes that seems to be fatal.

Fixed by not keeping an iterator, but instead marking stmts that don't need
revisiting and restarting with first stmt that needs revisiting.  This
assumes that new stmts will have uid 0, but I believe that is the case.
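
The restart rule described above can be sketched outside of GCC as follows.  This is only a stand-alone model under stated assumptions: `struct stmt`, `restart_point` and `make_block` are hypothetical names, a plain doubly linked list stands in for the gimple sequence, and the `uid` field plays the role of the mark (0 means "may need revisiting", 1 means "done").

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement in a basic block.  */
struct stmt
{
  struct stmt *prev, *next;
  unsigned uid;   /* 0: may need revisiting, nonzero: done.  */
};

/* After the statement at CUR changed, find where to resume: walk back
   to the nearest statement already marked done and continue right
   after it; if there is none, restart at HEAD.  */
static struct stmt *
restart_point (struct stmt *head, struct stmt *cur)
{
  struct stmt *s;

  for (s = cur; s != NULL; s = s->prev)
    if (s->uid != 0)
      return s->next;
  return head;
}

/* Link the N statements in ARR into a list, all marked as needing a
   visit, and return the first one.  */
static struct stmt *
make_block (struct stmt *arr, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      arr[i].prev = i > 0 ? &arr[i - 1] : NULL;
      arr[i].next = i + 1 < n ? &arr[i + 1] : NULL;
      arr[i].uid = 0;
    }
  return n > 0 ? &arr[0] : NULL;
}
```

Because the resume position is recomputed from marks stored on the statements themselves, no saved iterator can be left pointing at a statement that was removed in the meantime.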

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-05-08  Jakub Jelinek  

PR tree-optimization/53226
* tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Remove
prev and prev_initialized vars, gimple_set_uid (stmt, 0) before
processing it and gimple_set_uid (stmt, 1) if it doesn't need to be
revisited, look for earliest stmt with uid 0 if something changed.

* gcc.c-torture/compile/pr53226.c: New test.

--- gcc/tree-ssa-forwprop.c.jj  2012-05-03 08:35:52.0 +0200
+++ gcc/tree-ssa-forwprop.c 2012-05-08 18:10:19.662061709 +0200
@@ -2677,8 +2677,7 @@ ssa_forward_propagate_and_combine (void)
 
   FOR_EACH_BB (bb)
 {
-  gimple_stmt_iterator gsi, prev;
-  bool prev_initialized;
+  gimple_stmt_iterator gsi;
 
   /* Apply forward propagation to all stmts in the basic-block.
 Note we update GSI within the loop as necessary.  */
@@ -2771,12 +2770,14 @@ ssa_forward_propagate_and_combine (void)
 
   /* Combine stmts with the stmts defining their operands.
 Note we update GSI within the loop as necessary.  */
-  prev_initialized = false;
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
{
  gimple stmt = gsi_stmt (gsi);
  bool changed = false;
 
+ /* Mark stmt as potentially needing revisiting.  */
+ gimple_set_uid (stmt, 0);
+
  switch (gimple_code (stmt))
{
case GIMPLE_ASSIGN:
@@ -2856,18 +2857,18 @@ ssa_forward_propagate_and_combine (void)
{
  /* If the stmt changed then re-visit it and the statements
 inserted before it.  */
- if (!prev_initialized)
+ for (; !gsi_end_p (gsi); gsi_prev (&gsi))
+   if (gimple_uid (gsi_stmt (gsi)))
+ break;
+ if (gsi_end_p (gsi))
gsi = gsi_start_bb (bb);
  else
-   {
- gsi = prev;
- gsi_next (&gsi);
-   }
+   gsi_next (&gsi);
}
  else
{
- prev = gsi;
- prev_initialized = true;
+ /* Stmt no longer needs to be revisited.  */
+ gimple_set_uid (stmt, 1);
  gsi_next (&gsi);
}
}
--- gcc/testsuite/gcc.c-torture/compile/pr53226.c.jj    2012-05-08 18:07:40.007000510 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr53226.c       2012-05-08 18:07:35.071029578 +0200
@@ -0,0 +1,13 @@
+/* PR tree-optimization/53226 */
+
+void
+foo (unsigned long *x, char y, char z)
+{
+  int i;
+  for (i = y; i < z; ++i)
+{
+  unsigned long a = ((unsigned char) i) & 63UL;
+  unsigned long b = 1ULL << a;
+  *x |= b;
+}
+}

Jakub


[patch] HP-PA: use define_c_enum for "unspec" and "unspecv"

2012-05-08 Thread Steven Bosscher
Hello,

This patch makes pa.md use define_c_enum, so that UNSPECs are printed
with nice strings.

Unfortunately I can't bootstrap on gcc61 right now, so this is only
tested with a normal "make" up to the failure point.

Dave, could you please test this for me, and commit it if it is OK?

Ciao!
Steven


pa_define_c_enum.diff
Description: Binary data


Re: [rtl, patch] combine concat+shuffle

2012-05-08 Thread Marc Glisse

On Tue, 8 May 2012, Richard Sandiford wrote:

> Marc Glisse  writes:
> > Here is a new version.
> >
> > gcc/ChangeLog
> > 2012-05-08  Marc Glisse  
> >
> >   * simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
> >   of concatenations.
>
> OK, thanks.  I'll leave an x86 maintainer to review the testcase,
> but it looks like it'll need some markup to ensure an SSE target.


Oops, I'd thought about that, then completely forgot. For 64 bits, it 
always works. For 32 bits, it requires -msse2 -mfpmath=sse (without 
-mfpmath=sse we can still test for shufpd, but apparently not unpcklpd, I 
could remove that second test if people prefer, as it isn't important). 
Since this is a compile-only test, I think this would be enough:


/* { dg-options "-O -msse2 -mfpmath=sse" } */


> > Note to self: if you want to grep for "shuf" in the asm, don't put "shuf"
> > in the name of the file...
>
> Yeah :-)  For MIPS tests I tend to add "\t" to the beginning of the regexp.
> (And to the end if possible.)


Good idea. I was trying to make the check as wide as possible, but that's 
not so useful. Attached a new version of the testcase.


--
Marc Glisse

Index: gcc.target/i386/shuf-concat.c
===
--- gcc.target/i386/shuf-concat.c   (revision 0)
+++ gcc.target/i386/shuf-concat.c   (revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -msse2 -mfpmath=sse" } */
+
+typedef double v2df __attribute__ ((__vector_size__ (16)));
+
+v2df f(double d,double e){
+  v2df x={-d,d};
+  v2df y={-e,e};
+  return __builtin_ia32_shufpd(x,y,1);
+}
+
+/* { dg-final { scan-assembler-not "\tv?shufpd\t" } } */
+/* { dg-final { scan-assembler-times "\tv?unpcklpd\t" 1 } } */

Property changes on: gcc.target/i386/shuf-concat.c
___
Added: svn:eol-style
   + native
Added: svn:keywords
   + Author Date Id Revision URL
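
For readers following the thread, the index arithmetic used by the simplification (element i of a concatenation of two two-element vectors is element i % 2 of inner vector i / 2) can be modelled in plain C.  This is a sketch only: `select_from_concat` is a hypothetical name, doubles stand in for the RTL operands, and the indices 1 and 2 are what a shufpd with immediate 1 selects from the concatenation of x and y in the testcase above.

```c
#include <assert.h>

/* Model of selecting two elements from the concatenation of two
   two-element vectors.  OPS[k] is the k-th inner vector; an index I in
   0..3 names element I % 2 of inner vector I / 2.  */
static void
select_from_concat (double ops[2][2], unsigned i0, unsigned i1,
                    double out[2])
{
  assert (i0 < 4 && i1 < 4);
  out[0] = ops[i0 / 2][i0 % 2];
  out[1] = ops[i1 / 2][i1 % 2];
}
```

With x = {-d, d} and y = {-e, e}, selecting indices 1 and 2 yields {d, -e}, which can be built directly from the scalar operands without a shuffle.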



Re: [PATCH] libgcov support for profile collection in region of interest (issue6186044)

2012-05-08 Thread Teresa Johnson
On Tue, May 8, 2012 at 10:37 AM, Jan Hubicka  wrote:
>
> > Hi Honza,
> >
> > I added L_gcov_reset and L_gcov_dump for the new interfaces, and also
> > added a description into the gcov man page. Let me know if it looks
> > ok now.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > Thanks,
> > Teresa
> >
> > 2012-05-08   Teresa Johnson  
> >
> >       * libgcc/libgcov.c (gcov_clear, __gcov_reset): New functions.
> >       (__gcov_dump): Ditto.
> >       (gcov_dump_complete): New global variable.
> >       (__gcov_flush): Outline functionality now in gcov_clear.
> >       * gcc/gcov-io.h (__gcov_reset, __gcov_dump): Declare.
> >       * libgcc/Makefile.in (L_gcov_reset, L_gcov_dump): Define.
> >       * gcc/doc/gcov.texi: Add note on using __gcov_reset and __gcov_dump.
>
> It seems OK now, though gcov_clear will end up being in both gcov_reset and 
> gcov_flush,
> but I suppose it is short enough to make this issue moot
> (otherwise we could make it hidden exported from gcov_flush psection)
>
> Honza

The same issue is going to exist with gcov_exit and gcov_dump/gcov_flush, and
with gcov_dump_complete. I agree that it seems cleaner to export them hidden
so I have done that. Bootstrapped and re-testing is in progress. Ok for trunk
assuming regression tests pass? New patch is below.
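
As a rough illustration of how the dump-complete flag is meant to interact with the exit-time writer, here is a small stand-alone model.  It does not use libgcov: `model_reset`, `model_dump` and `model_exit` are hypothetical stand-ins for __gcov_reset, __gcov_dump and gcov_exit, and the assumption that the dump routine calls the exit-time writer and then sets the flag is mine, since that part of the patch is not shown in full here.

```c
#include <assert.h>
#include <string.h>

#define N_COUNTERS 4

static long counters[N_COUNTERS];   /* stand-in for the arc counters */
static int dump_complete;           /* set once the profile was written */
static int n_dumps;                 /* times a profile was "written" */

/* Stand-in for the exit-time writer: do nothing if the application
   already dumped the profile itself.  */
static void
model_exit (void)
{
  if (dump_complete)
    return;
  n_dumps++;
}

/* Stand-in for the reset interface: zero all counters.  */
static void
model_reset (void)
{
  memset (counters, 0, sizeof counters);
}

/* Stand-in for the dump interface (assumed behaviour): write the
   profile once, then mark it as complete.  */
static void
model_dump (void)
{
  model_exit ();
  dump_complete = 1;
}
```

A region of interest would then be bracketed by a reset before it and a dump after it, and the later exit-time call becomes a no-op.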

Thanks,
Teresa

2012-05-08   Teresa Johnson  

* libgcc/libgcov.c (gcov_clear, __gcov_reset): New functions.
(__gcov_dump): Ditto.
(gcov_dump_complete): New global variable.
(gcov_exit): Export hidden to enable use in L_gcov_dump.
(__gcov_flush): Outline functionality now in gcov_clear.
* gcc/gcov-io.h (__gcov_reset, __gcov_dump): Declare.
* libgcc/Makefile.in (L_gcov_reset, L_gcov_dump): Define.
* gcc/doc/gcov.texi: Add note on using __gcov_reset and __gcov_dump.

Index: libgcc/Makefile.in
===
--- libgcc/Makefile.in  (revision 187048)
+++ libgcc/Makefile.in  (working copy)
@@ -849,7 +849,7 @@ include $(iterator)
 # Defined in libgcov.c, included only in gcov library
 LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
 _gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
-_gcov_execv _gcov_execvp _gcov_execve \
+_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
 _gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \
 _gcov_indirect_call_profiler _gcov_average_profiler _gcov_ior_profiler \
 _gcov_merge_ior
Index: libgcc/libgcov.c
===
--- libgcc/libgcov.c    (revision 187048)
+++ libgcc/libgcov.c    (working copy)
@@ -50,6 +50,14 @@ void __gcov_init (struct gcov_info *p __attribute_
 void __gcov_flush (void) {}
 #endif

+#ifdef L_gcov_reset
+void __gcov_reset (void) {}
+#endif
+
+#ifdef L_gcov_dump
+void __gcov_dump (void) {}
+#endif
+
 #ifdef L_gcov_merge_add
 void __gcov_merge_add (gcov_type *counters  __attribute__ ((unused)),
   unsigned n_counters __attribute__ ((unused))) {}
@@ -74,6 +82,10 @@ void __gcov_merge_delta (gcov_type *counters  __at
 #include 
 #endif

+extern void gcov_clear (void) ATTRIBUTE_HIDDEN;
+extern void gcov_exit (void) ATTRIBUTE_HIDDEN;
+extern int gcov_dump_complete ATTRIBUTE_HIDDEN;
+
 #ifdef L_gcov
 #include "gcov-io.c"

@@ -91,6 +103,9 @@ static struct gcov_info *gcov_list;
 /* Size of the longest file name. */
 static size_t gcov_max_filename = 0;

+/* Flag when the profile has already been dumped via __gcov_dump().  */
+int gcov_dump_complete = 0;
+
 /* Make sure path component of the given FILENAME exists, create
missing directories. FILENAME must be writable.
Returns zero on success, or -1 if an error occurred.  */
@@ -268,7 +283,7 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned
in two separate programs, and we must keep the two program
summaries separate.  */

-static void
+void
 gcov_exit (void)
 {
   struct gcov_info *gi_ptr;
@@ -286,6 +301,11 @@ gcov_exit (void)
   char *gi_filename, *gi_filename_up;
   gcov_unsigned_t crc32 = 0;

+  /* Prevent the counters from being dumped a second time on exit when the
+ application already wrote out the profile using __gcov_dump().  */
+  if (gcov_dump_complete)
+return;
+
   memset (&all_prg, 0, sizeof (all_prg));
   /* Find the totals for this execution.  */
   memset (&this_prg, 0, sizeof (this_prg));
@@ -679,6 +699,37 @@ gcov_exit (void)
 }
 }

+/* Reset all counters to zero.  */
+
+void
+gcov_clear (void)
+{
+  const struct gcov_info *gi_ptr;
+
+  for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next)
+{
+  unsigned f_ix;
+
+  for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
+   {
+ unsigned t_ix;
+ const struct gcov_fn_info *gfi_ptr = gi_ptr->functions[f_ix];
+
+ if (!gfi_ptr || gfi_ptr->key != gi_ptr)
+   continue;
+ const struct gcov_ctr_info *ci_ptr = gfi_ptr->

Re: [patch] HP-PA: use define_c_enum for "unspec" and "unspecv"

2012-05-08 Thread John David Anglin

On 5/8/2012 2:33 PM, Steven Bosscher wrote:

> Dave, could you please test this for me, and commit it if it is OK?

Will do.

Thanks,
Dave

--
John David Anglin    dave.ang...@bell.net



Re: [testsuite] Fix gcc.target/i386/hle-* testcases with Sun as

2012-05-08 Thread Mike Stump
On May 8, 2012, at 10:19 AM, Rainer Orth wrote:
> Several /gcc.target/i386/hle-*.c tests are currently failing on Solaris
> 9/x86 with Sun as:
> 
> FAIL: gcc.target/i386/hle-add-acq-1.c scan-assembler lock[ 
> \\n\\t]+(xacquire|.byte[ \\t]+0xf2)[ \\t\\n]+add
> 
> The .s file has
> 
>lock;
>.byte   0xf2
> 
> but the scan-assembler regex currently doesn't allow for the ; (which is
> not present with gas 2.22).
> 
> The patch below does just that.  Tested with the appropriate runtest
> invocation on i386-pc-solaris2.9 configured with as and gas
> respectively.
> 
> Ok for mainline?

Ok, assuming that the ; has to be there.  If it doesn't have to be there, an 
alternative patch might be to remove it from the port now instead of the patch.


Re: [PATCH] libgcov support for profile collection in region of interest (issue6186044)

2012-05-08 Thread Jan Hubicka
> On Tue, May 8, 2012 at 10:37 AM, Jan Hubicka  wrote:
> >
> > > Hi Honza,
> > >
> > > I added L_gcov_reset and L_gcov_dump for the new interfaces, and also
> > > added a description into the gcov man page. Let me know if it looks
> > > ok now.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > >
> > > Thanks,
> > > Teresa
> > >
> > > 2012-05-08   Teresa Johnson  
> > >
> > >       * libgcc/libgcov.c (gcov_clear, __gcov_reset): New functions.
> > >       (__gcov_dump): Ditto.
> > >       (gcov_dump_complete): New global variable.
> > >       (__gcov_flush): Outline functionality now in gcov_clear.
> > >       * gcc/gcov-io.h (__gcov_reset, __gcov_dump): Declare.
> > >       * libgcc/Makefile.in (L_gcov_reset, L_gcov_dump): Define.
> > >       * gcc/doc/gcov.texi: Add note on using __gcov_reset and __gcov_dump.
> >
> > It seems OK now, though gcov_clear will end up being in both gcov_reset and 
> > gcov_flush,
> > but I suppose it is short enough to make this issue moot
> > (otherwise we could make it hidden exported from gcov_flush psection)
> >
> > Honza
> 
> The same issue is going to exist with gcov_exit and gcov_dump/gcov_flush, and
> with gcov_dump_complete. I agree that it seems cleaner to export them hidden
> so I have done that. Bootstrapped and re-testing is in progress. Ok for trunk
> assuming regression tests pass? New patch is below.
OK,
thanks
Honza


Re: [PR tree-optimization/52558]: RFC: questions on store data race

2012-05-08 Thread Aldy Hernandez

On 05/07/12 19:11, Andrew MacLeod wrote:
> On 05/07/2012 07:04 PM, Aldy Hernandez wrote:
>
> > Andrew suggested the correct fix was to add a new pass that was able
> > to do some "flow sensitive data flow analysis" that could discover
> > these unreachable paths and insert the 0 phis at the start of the
> > blocks automatically. But that seemed like far too much work,
> > considering how long it's taken me to get this far ;-).
>
> Wait, no. I didn't suggest writing an entire generic pass was the
> correct fix for you... I said that this was a technique that a generic
> pass which identified this sort of flow sensitive data flow info could
> use to work within the CFG. Simply zeroing out uses of the load in PHI
> nodes on paths it has determined are not actually reachable by that
> value, and you were supposed to integrate just the bits of that technique
> that you required.. just replace any uses of your LSM temporary with 0.
> I never intended you should write an entire pass...


Ah, well in that case, I've already done that.  Well I don't do it on 
any path, just in the loop header, and things propagate down from there 
quite nicely :).


Aldy


Re: [C++ Patch and pubnames 2/2] Adjust c decl pretty printer to match demangler (issue6195056)

2012-05-08 Thread Manuel López-Ibáñez
On 8 May 2012 18:20, Sterling Augustine  wrote:
> On Mon, May 7, 2012 at 6:44 PM, Gabriel Dos Reis
>  wrote:
>> On Mon, May 7, 2012 at 7:10 PM, Sterling Augustine
>>  wrote:
>>> This is the second in the series of patches to make c decl pretty printing
>>> more closely match the demangler. A full explanation is here:
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00512.html
>>>
>>> OK for mainline?
>>
>> Now I realize something that is wrong with the previous patch.
>> Writing 'const T*' in C++ is a very widespread style; it is also the
>> style used in the C++ standard, TC++PL (the de facto popular reference),
>> our own C++ standard library implementation, and many popular modern
>> C++ textbooks.
>> It is a strongly well established style.
>> Changing the pretty printer to satisfy the demangler as opposed to users
>> looks wrong headed.
>
> I'm most definitely not trying to satisfy the demangler--I'm trying to
> make GCC's naming consistent with the rest of the tool chain, which
> will be good for users. Consider the C++ function:
>
> int foo(const char *bar);
>
> The problem is that the toolchain disagrees on the canonical
> pretty-name of this function. If you use nm, objdump, or readelf, or
> anything that must recover the name from the binary, you will see:
>
> int foo(char const*)
>
> The demangler follows the documented gnu_v3 demangling convention. As
> far as I can tell, the C++ front end is ad-hoc. This disagreement
> creates inconsistencies in the debugging information that is confusing
> to users.
>
> Do you have a suggestion for fixing the disagreement? I would love to
> add this as a parameter somewhere, but the decision is very deep in
> the internals of the pretty printer.

A suggestion: make dwarf_name call the demangler, and then a (new?)
function that converts a mangled decl to a human-readable string. In
any case, the pretty-printer does a lot of stuff that is mostly
useless for just printing a declaration (translation, wrapping, etc.).

Bonus points if all GNU toolchain programs use the same functions for
demangling and undemangling (because I guess they actually don't, no?)

I guess it cannot be so easy, so I apologize in advance for saying nonsense.

Cheers,

Manuel.


Re: [committed] Fix lower-subreg cost calculation

2012-05-08 Thread Richard Sandiford
Richard Earnshaw  writes:
> FTR, this caused
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53278

Well, this really has been a brown-paper-bag patch.  Fixed as below.
Tested on x86_64-linux-gnu and applied as obvious.

Richard


gcc/
PR rtl-optimization/53278
* lower-subreg.c (decompose_multiword_subregs): Remove left-over
speed_p code from earlier patch.

Index: gcc/lower-subreg.c
===
--- gcc/lower-subreg.c  2012-05-08 19:45:31.0 +0100
+++ gcc/lower-subreg.c  2012-05-08 19:45:31.793855523 +0100
@@ -1487,9 +1487,7 @@ decompose_multiword_subregs (void)
   FOR_EACH_BB (bb)
{
  rtx insn;
- bool speed_p;
 
- speed_p = optimize_bb_for_speed_p (bb);
  FOR_BB_INSNS (bb, insn)
{
  rtx pat;


[committed] Fix PR c++/52361 ICE in tree_strip_nop_conversions

2012-05-08 Thread Manuel López-Ibáñez
The return value of build_range_check can be NULL and integer_zerop
cannot handle that. Bootstrapped and regression tested. John David
Anglin tested that it fixes the ICE in hppa. Committed to mainline as
obvious.
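
The shape of the fix can be shown with a small stand-alone sketch.  Nothing here is GCC code: `struct node`, `is_zero`, `build_check` and `known_false` are hypothetical stand-ins for a tree, integer_zerop, build_range_check and the guarded test in warn_logical_operator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node holding an integer constant.  */
struct node
{
  long value;
};

/* Stand-in for a predicate that dereferences its argument
   unconditionally, so it must never be handed a null pointer.  */
static int
is_zero (const struct node *n)
{
  return n->value == 0;
}

/* Stand-in for a builder that may fail and return NULL.  */
static struct node *
build_check (struct node *storage, int can_build, long value)
{
  if (!can_build)
    return NULL;
  storage->value = value;
  return storage;
}

/* Return 1 if the check is known to be always false.  The NULL test
   comes first, so is_zero is only reached with a valid node.  */
static int
known_false (const struct node *tem)
{
  return tem != NULL && is_zero (tem);
}
```

The short-circuit evaluation of && is what keeps the predicate from ever seeing the NULL result.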

2012-05-09  Manuel López-Ibáñez  

PR c++/53261
* c-common.c (warn_logical_operator): Check that argument of
integer_zerop is not NULL.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 187257)
+++ gcc/c-family/c-common.c (working copy)
@@ -1627,11 +1627,11 @@ warn_logical_operator (location_t locati
  should be always false to get a warning.  */
   if (or_op)
 in0_p = !in0_p;

   tem = build_range_check (UNKNOWN_LOCATION, type, lhs, in0_p, low0, high0);
-  if (integer_zerop (tem))
+  if (tem && integer_zerop (tem))
 return;

   rhs = make_range (op_right, &in1_p, &low1, &high1, &strict_overflow_p);
   if (!rhs)
 return;
@@ -1642,11 +1642,11 @@ warn_logical_operator (location_t locati
  should be always false to get a warning.  */
   if (or_op)
 in1_p = !in1_p;

   tem = build_range_check (UNKNOWN_LOCATION, type, rhs, in1_p, low1, high1);
-  if (integer_zerop (tem))
+  if (tem && integer_zerop (tem))
 return;

   /* If both expressions have the same operand, if we can merge the
  ranges, and if the range test is always false, then warn.  */
   if (operand_equal_p (lhs, rhs, 0)


Re: [RFC] PR 53063 encode group options in .opt files

2012-05-08 Thread Manuel López-Ibáñez
On 6 May 2012 20:45, Joseph S. Myers  wrote:
>>
>> One idea could be to have an additional auto_handle_option() that is
>> generated from the awk scripts and called after all other
>> handle_option functions. This function will populate a switch with
>> group options and the respective calls to handle_option_generated for
>> sub-options.
>>
>>  Is this a good idea? Where would be the best place to call this function?
>
> That certainly seems one reasonable way to handle implications.

OK, so I implemented this in the patch below, and it generates code like:

bool
common_handle_option_auto (struct gcc_options *opts,
                           struct gcc_options *opts_set,
                           const struct cl_decoded_option *decoded,
                           unsigned int lang_mask, int kind,
                           location_t loc,
                           const struct cl_option_handlers *handlers,
                           diagnostic_context *dc)
{
  size_t scode = decoded->opt_index;
  int value = decoded->value;
  enum opt_code code = (enum opt_code) scode;

  gcc_assert (decoded->canonical_option_num_elements <= 2);

  switch (code)
    {
    case OPT_Wuninitialized:
      if (!opts_set->x_warn_maybe_uninitialized)
        handle_generated_option (opts, opts_set,
                                 OPT_Wmaybe_uninitialized, NULL, value,
                                 lang_mask, kind, loc,
                                 handlers, dc);
      break;

    case OPT_Wextra:
      if (!opts_set->x_warn_uninitialized)
        handle_generated_option (opts, opts_set,
                                 OPT_Wuninitialized, NULL, value,
                                 lang_mask, kind, loc,
                                 handlers, dc);
      break;

    case OPT_Wunused:
      if (!opts_set->x_warn_unused_but_set_variable)
        handle_generated_option (opts, opts_set,
                                 OPT_Wunused_but_set_variable, NULL, value,
                                 lang_mask, kind, loc,
                                 handlers, dc);
      if (!opts_set->x_warn_unused_function)
        handle_generated_option (opts, opts_set,
                                 OPT_Wunused_function, NULL, value,
                                 lang_mask, kind, loc,
                                 handlers, dc);
      if (!opts_set->x_warn_unused_label)
        handle_generated_option (opts, opts_set,
                                 OPT_Wunused_label, NULL, value,
                                 lang_mask, kind, loc,
                                 handlers, dc);
      if (!opts_set->x_warn_unused_value)
        handle_generated_option (opts, opts_set,
                                 OPT_Wunused_value, NULL, value,
                                 lang_mask, kind, loc,
                                 handlers, dc);
      if (!opts_set->x_warn_unused_variable)
        handle_generated_option (opts, opts_set,
                                 OPT_Wunused_variable, NULL, value,
                                 lang_mask, kind, loc,
                                 handlers, dc);
      break;

    default:
      break;
    }
  return true;
}

which looks correct to me. However, the build fails because now
options.h requires input.h which requires line-map.h, which is not
included when building for example libgcc. options.h is included by
tm.h, so it basically appears everywhere.

Any suggestions how to fix this?

Cheers,

Manuel.



>
> --
> Joseph S. Myers
> jos...@codesourcery.com


group-options-2.diff
Description: Binary data


Re: [RFC] PR 53063 encode group options in .opt files

2012-05-08 Thread Joseph S. Myers
On Wed, 9 May 2012, Manuel López-Ibáñez wrote:

> which looks correct to me. However, the build fails because now
> options.h requires input.h which requires line-map.h, which is not
> included when building for example libgcc. options.h is included by
> tm.h, so it basically appears everywhere.
> 
> Any suggestions how to fix this?

options.h already has some #if !defined(IN_LIBGCC2) && 
!defined(IN_TARGET_LIBS) && !defined(IN_RTS) conditionals, so you could 
arrange for some more such conditionals to be generated.

-- 
Joseph S. Myers
jos...@codesourcery.com
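
The suggestion above can be sketched as follows. This is only an illustration of the guard pattern: the three macro names are the ones quoted in the message, while the typedef, the `sketch_` function names and the feature macro are hypothetical stand-ins and not what the awk scripts actually generate.

```c
/* Sketch of a generated header fragment.  Only the three guard macros
   come from the message; every sketch_ name is hypothetical.  */
#if !defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)

/* Host-side build: declarations that need location-style types can be
   emitted here, because the headers defining them are available.  */
typedef unsigned int sketch_location_t;   /* stand-in for location_t */
#define SKETCH_HAVE_OPTION_HANDLERS 1

int
sketch_handle_option_auto (sketch_location_t loc)
{
  /* Placeholder body: report whether a real location was passed.  */
  return loc != 0;
}

#else

/* Target-library build (libgcc and friends): nothing that depends on
   the location types is declared at all.  */
#define SKETCH_HAVE_OPTION_HANDLERS 0

#endif

/* Lets a caller check which branch of the guard was taken.  */
int
sketch_have_option_handlers (void)
{
  return SKETCH_HAVE_OPTION_HANDLERS;
}
```

Compiled without any of the three macros defined, the guarded declarations are visible; defining one of them (as a target-library build would) hides them, so the extra header dependency never arises there.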

trans-mem: functions making indirect calls are not transformed (issue6194061)

2012-05-08 Thread Dave Boutcher
 gcc/trans-mem.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c
index 2badf25..24073fa 100644
--- a/gcc/trans-mem.c
+++ b/gcc/trans-mem.c
@@ -4721,7 +4721,7 @@ ipa_tm_transform_clone (struct cgraph_node *node)
   /* If this function makes no calls and has no irrevocable blocks,
  then there's nothing to do.  */
   /* ??? Remove non-aborting top-level transactions.  */
-  if (!node->callees && !d->irrevocable_blocks_clone)
+  if (!node->callees && !node->indirect_calls && !d->irrevocable_blocks_clone)
 return;
 
   current_function_decl = d->clone->decl;
-- 
1.7.9.5


--
This patch is available for review at http://codereview.appspot.com/6194061


Use C++ in COMPILER_FOR_BUILD if needed (issue6191056)

2012-05-08 Thread Diego Novillo

Found this while testing the C++ conversion for vec.[ch] on the
cxx-conversion branch.  We do not build the build/*.o files with g++,
so I was getting lots of syntax errors while compiling build/vec.o.

I am not completely sure if the changes are correct.  But it works for
me.

Tested on x86_64.  OK for trunk?

2012-05-08   Diego Novillo  

* Makefile.in (CXX_FOR_BUILD): Define.
(BUILD_CXXFLAGS): Define.
(COMPILER_FOR_BUILD): Set to CXX_FOR_BUILD if building with C++.
(LINKER_FOR_BUILD): Likewise.
(BUILD_COMPILERFLAGS): Set to BUILD_CXXFLAGS if building with C++.
(BUILD_LINKERFLAGS): Likewise.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ec27f88..1aa9dad 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -728,15 +728,27 @@ DIR = ../gcc
 
 # Native compiler for the build machine and its switches.
 CC_FOR_BUILD = @CC_FOR_BUILD@
+CXX_FOR_BUILD = @CXX_FOR_BUILD@
 BUILD_CFLAGS= @BUILD_CFLAGS@ -DGENERATOR_FILE
+BUILD_CXXFLAGS = $(INTERNAL_CFLAGS) $(CXXFLAGS) -DGENERATOR_FILE
 
 # Native compiler that we use.  This may be C++ some day.
+ifneq ($(ENABLE_BUILD_WITH_CXX),yes)
 COMPILER_FOR_BUILD = $(CC_FOR_BUILD)
 BUILD_COMPILERFLAGS = $(BUILD_CFLAGS)
+else
+COMPILER_FOR_BUILD = $(CXX_FOR_BUILD)
+BUILD_COMPILERFLAGS = $(BUILD_CXXFLAGS)
+endif
 
 # Native linker that we use.
+ifneq ($(ENABLE_BUILD_WITH_CXX),yes)
 LINKER_FOR_BUILD = $(CC_FOR_BUILD)
 BUILD_LINKERFLAGS = $(BUILD_CFLAGS)
+else
+LINKER_FOR_BUILD = $(CXX_FOR_BUILD)
+BUILD_LINKERFLAGS = $(BUILD_CXXFLAGS)
+endif
 
 # Native linker and preprocessor flags.  For x-fragment overrides.
 BUILD_LDFLAGS=@BUILD_LDFLAGS@

--
This patch is available for review at http://codereview.appspot.com/6191056


Symbol table 19/many: cleanup varpool/front-end/varasm interactions

2012-05-08 Thread Jan Hubicka

Hi,
this patch puts varpool more in control of what is output.  Traditionally,
frontends output variables either as they parsed them, via
tree_rest_of_compilation, or at the end of compilation, via
wrapup_global_declarations, which went through all the decls, output unused
warnings and called tree_rest_of_decl_compilation on those that were not
compiled yet.

The difference was that in the first case the variable was always output,
while in the second case it was output only if needed.

I eventually hooked varpool into tree_rest_of_decl_compilation and collected
variables that way.  This, however, does not collect all variables in all
frontends.  The C and C++ frontends call tree_rest_of_decl_compilation, while
for example the Fortran frontend often relies on wrapup_global_decls.

Moreover, there are not only variables but other kinds of weirdos flying
around.  First are VAR_DECLs with HAS_VALUE_EXPR_SET.  Those are not
real variables (and are not output to assembly) but they matter for debugging.
Moreover, emutls uses this mechanism to replace TLS vars by their emulation
code.

Next, there are constant pool decls that are not output from varpool but via
special code handling the constant pool.

This patch regularizes things somewhat.  First, wrapup of global declarations
now happens consistently before finalizing the compilation unit.  This makes
Fortran behave like C and makes varpool complete.

Next, I avoid adding variables with VALUE_EXPR into varpool, since they
are not real symbols.
There is a problem with emutls, where the variables are inserted but later
replaced.  Emutls should update reference lists but does not; I will do that
in a followup patch.

Finally, it informs varpool that constant pool decls go into assembly
automatically behind the scenes (this is used on some funnier targets to
interleave code with constants).

The patch also adds a bunch of sanity checking that things work as expected
and that varpool is finally in control of variable output.

This avoids the need for frontends to mess with the TREE_ASM_WRITTEN flag to
convince the backend to skip variables that are just fake variables.  I removed
this from the Fortran frontend, where it is easy.  C++ is less easy, since it
checks the flag itself.  I will look into that as a followup, too.
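
As a rough illustration of the rule that variables with a VALUE_EXPR stay out of varpool, here is a toy model of the test the patch applies in build_cgraph_edges.  The struct and function names are hypothetical; real GCC works on tree nodes through macros such as TREE_STATIC and DECL_HAS_VALUE_EXPR_P, which this sketch only imitates with plain flags.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few decl properties the patch tests.  */
struct toy_decl
{
  bool is_var;          /* models TREE_CODE (decl) == VAR_DECL */
  bool is_static;       /* models TREE_STATIC (decl) */
  bool is_external;     /* models DECL_EXTERNAL (decl) */
  bool has_value_expr;  /* models DECL_HAS_VALUE_EXPR_P (decl) */
};

/* A local decl is handed to varpool only if it is a static, non-external
   variable that is not merely an alias for a value expression.  */
bool
toy_should_finalize (const struct toy_decl *d)
{
  return d->is_var
         && d->is_static
         && !d->is_external
         && !d->has_value_expr;
}
```

With this model, an ordinary static variable is finalized, while the same variable with a value expression attached is skipped, which matches the intent stated above that such decls are not real symbols.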

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

* cgraphbuild.c (build_cgraph_edges): Do not finalize vars
with VALUE_EXPR.
* cgraph.h (varpool_can_remove_if_no_refs): Vars with VALUE_EXPR
are removable.
* toplev.c (wrapup_global_declaration_2): Vars with VALUE_EXPR
need to wrapup.
(compile_file): Do not output variables.
* cgraphbuild.c (varpool_finalize_decl): When var is finalized late,
output it.
* langhooks.c: Include timevar.h
(write_global_declarations): Finalize compilation unit after wrapup;
set timevars correctly.
* passes.c (rest_of_decl_compilation): Decls with VALUE_EXPR needs
not to be added to varpool.
* varpool.c (varpool_assemble_decl): Sanity check that we are called
only on cases where it makes sense; skip constant pool and value expr
vars.

* lto.c (do_whole_program_analysis): Set timevars correctly.
(lto_main): Likewise.

* trans-common.c (create_common): Do not fake TREE_ASM_WRITTEN.
* trans-decl.c (gfc_finish_cray_pointee): Likewise.
Index: cgraphbuild.c
===
*** cgraphbuild.c   (revision 187296)
--- cgraphbuild.c   (working copy)
*** build_cgraph_edges (void)
*** 356,362 
/* Look for initializers of constant variables and private statics.  */
FOR_EACH_LOCAL_DECL (cfun, ix, decl)
  if (TREE_CODE (decl) == VAR_DECL
!   && (TREE_STATIC (decl) && !DECL_EXTERNAL (decl)))
varpool_finalize_decl (decl);
record_eh_tables (node, cfun);
  
--- 356,363 
/* Look for initializers of constant variables and private statics.  */
FOR_EACH_LOCAL_DECL (cfun, ix, decl)
  if (TREE_CODE (decl) == VAR_DECL
!   && (TREE_STATIC (decl) && !DECL_EXTERNAL (decl))
!   && !DECL_HAS_VALUE_EXPR_P (decl))
varpool_finalize_decl (decl);
record_eh_tables (node, cfun);
  
Index: cgraph.h
===
*** cgraph.h(revision 187296)
--- cgraph.h(working copy)
*** varpool_can_remove_if_no_refs (struct va
*** 1126,1131 
--- 1126,1132 
return (!node->symbol.force_output && 
!node->symbol.used_from_other_partition
  && (DECL_COMDAT (node->symbol.decl)
  || !node->symbol.externally_visible
+ || DECL_HAS_VALUE_EXPR_P (node->symbol.decl)
  || DECL_EXTERNAL (node->symbol.decl)));
  }
  
Index: toplev.c
===
*** toplev.c(revision 187296)
--- toplev.c(working copy)
*** wrapup_global_declaration_1 (tree decl)
*** 364,370 

trans-mem: make sure clones for functions referenced indirectly are marked as needed (issue6201064)

2012-05-08 Thread Dave Boutcher
Without this patch we generate calls to TM_GETTMCLONE for functions
called indirectly, but we don't actually store the clone mapping in
the clone table because we think the functions are not "needed".
Compiles fine, dies at runtime.  See GCC Bugzilla – Bug 53008

 gcc/trans-mem.c |   14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c
index 24073fa..20ed5a0 100644
--- a/gcc/trans-mem.c
+++ b/gcc/trans-mem.c
@@ -4319,6 +4319,9 @@ ipa_tm_create_version_alias (struct cgraph_node *node, void *data)
 
   record_tm_clone_pair (old_decl, new_decl);
 
+  /* If someone refers to this function indirectly, mark it needed */
+  if (ipa_ref_list_first_refering (&info->old_node->ref_list)) 
+  ipa_tm_mark_needed_node (info->old_node);
   if (info->old_node->needed)
 ipa_tm_mark_needed_node (new_node);
   return false;
@@ -4372,6 +4375,10 @@ ipa_tm_create_version (struct cgraph_node *old_node)
   record_tm_clone_pair (old_decl, new_decl);
 
   cgraph_call_function_insertion_hooks (new_node);
+  /* If someone refers to this function indirectly, mark it needed */
+  if (ipa_ref_list_first_refering (&old_node->ref_list)) 
+  ipa_tm_mark_needed_node (old_node);
+
   if (old_node->needed)
 ipa_tm_mark_needed_node (new_node);
 
@@ -4778,8 +4785,13 @@ ipa_tm_execute (void)
   No need to do this if the function's address can't be taken.  */
if (is_tm_pure (node->decl))
  {
-   if (!node->local.local)
+   if (!node->local.local) {
  record_tm_clone_pair (node->decl, node->decl);
+ /* if someone refers to this function other than as a call,
+mark it needed */
+ if (ipa_ref_list_first_refering (&node->ref_list)) 
+ ipa_tm_mark_needed_node (node);
+   }
continue;
  }
 
-- 
1.7.9.5


--
This patch is available for review at http://codereview.appspot.com/6201064


gnu-tm: Dont allow assigning transaction_unsafe functions to transaction_safe function pointers (issue6198054)

2012-05-08 Thread Dave Boutcher
Without this patch it is perfectly fine to assign non-transaction_safe
functions to function pointers marked as transaction_safe.  Unpleasantness
happens at run time.

e.g. 

 __attribute__((transaction_safe)) long (*compare)(int, int); 

compare = my_funky_random_function;


 gcc/c-typeck.c |7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/c-typeck.c b/gcc/c-typeck.c
index fc01a79..69687d6 100644
--- a/gcc/c-typeck.c
+++ b/gcc/c-typeck.c
@@ -5608,6 +5608,13 @@ convert_for_assignment (location_t location, tree type, tree rhs,
  }
}
 
+  /* Check for assignment to transaction safe */
+  if (is_tm_safe(type) && !is_tm_safe_or_pure (rhs)) {
+ warning_at (location, 0,
+ "Assigning unsafe function to transaction_safe "
+ "function pointer"); 
+  }
+
   /* Any non-function converts to a [const][volatile] void *
 and vice versa; otherwise, targets must be the same.
 Meanwhile, the lhs target must have all the qualifiers of the rhs.  */
-- 
1.7.9.5


--
This patch is available for review at http://codereview.appspot.com/6198054


Re: Use C++ in COMPILER_FOR_BUILD if needed (issue6191056)

2012-05-08 Thread Diego Novillo

On 12-05-08 15:46 , Diego Novillo wrote:

Found this while testing the C++ conversion for vec.[ch] on the
cxx-conversion branch.  We do not build the build/*.o files with g++,
so I was getting lots of syntax errors while compiling build/vec.o.

I am not completely sure if the changes are correct.  But it works for
me.

Tested on x86_64.  OK for trunk?

2012-05-08   Diego Novillo

* Makefile.in (CXX_FOR_BUILD): Define.
(BUILD_CXXFLAGS): Define.
(COMPILER_FOR_BUILD): Set to CXX_FOR_BUILD if building with C++.
(LINKER_FOR_BUILD): Likewise.
(BUILD_COMPILERFLAGS): Set to BUILD_CXXFLAGS if building with C++.
(BUILD_LINKERFLAGS): Likewise.



I forgot to include the changes needed in configure.ac to export 
CXX_FOR_BUILD.


Without this, incremental builds from /gcc will fail because
the value of CXX_FOR_BUILD will not be set.

Tested on x86_64.  OK for trunk?


2012-05-08   Diego Novillo  

* configure.ac (CXX_FOR_BUILD): Define and substitute.
* configure: Regenerate.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index b3cfed4..a05f4f9 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1848,6 +1848,7 @@ AC_SUBST(inhibit_libc)

 # These are the normal (build=host) settings:
 CC_FOR_BUILD='$(CC)'   AC_SUBST(CC_FOR_BUILD)
+CXX_FOR_BUILD='$(CXX)' AC_SUBST(CXX_FOR_BUILD)
 BUILD_CFLAGS='$(ALL_CFLAGS)'   AC_SUBST(BUILD_CFLAGS)
 BUILD_LDFLAGS='$(LDFLAGS)' AC_SUBST(BUILD_LDFLAGS)
 STMP_FIXINC=stmp-fixinc	AC_SUBST(STMP_FIXINC)


Re: PR 53249: Multiple address modes for same address space

2012-05-08 Thread Richard Henderson

On 05/06/2012 11:41 AM, Richard Sandiford wrote:

PR middle-end/53249
* dwarf2out.h (get_address_mode): Move declaration to...
* rtl.h: ...here.
* dwarf2out.c (get_address_mode): Move definition to...
* rtlanal.c: ...here.
* var-tracking.c (get_address_mode): Delete.
* combine.c (find_split_point): Use get_address_mode instead of
targetm.addr_space.address_mode.
* cselib.c (cselib_record_sets): Likewise.
* dse.c (canon_address, record_store): Likewise.
* emit-rtl.c (adjust_address_1, offset_address): Likewise.
* expr.c (move_by_pieces, emit_block_move_via_loop, store_by_pieces)
(store_by_pieces_1, expand_assignment, store_expr, store_constructor)
(expand_expr_real_1): Likewise.
* ifcvt.c (noce_try_cmove_arith): Likewise.
* optabs.c (maybe_legitimize_operand_same_code): Likewise.
* reload.c (find_reloads): Likewise.
* sched-deps.c (sched_analyze_1, sched_analyze_2): Likewise.
* sel-sched-dump.c (debug_mem_addr_value): Likewise.


ok.


r~


Re: [C++ Patch and pubnames 2/2] Adjust c decl pretty printer to match demangler (issue6195056)

2012-05-08 Thread Gabriel Dos Reis
On Tue, May 8, 2012 at 4:38 PM, Manuel López-Ibáñez
 wrote:
> A suggestion: Make dwarf_name call the demangler, and then a (new?)
> function that converts a mangled decl to a human-readable string. In
> any case, the pretty-printer does a lot of stuff that is mostly
> useless for just printing a declaration (translation, wrapping, etc.).
>
> Bonus point if all GNU toolchain programs use the same functions for
> mangling and demangling (because I guess they actually don't, no?)
>
> I guess it cannot be so easy, so I apologize in advance for saying nonsense.

It makes sense; and I don't think it is as complicated as it might sound.

-- Gaby


Re: [C++ Patch and pubnames 2/2] Adjust c decl pretty printer to match demangler (issue6195056)

2012-05-08 Thread Cary Coutant
>> A suggestion: Make dwarf_name call the demangler, and then a (new?)
>> function that converts a mangled decl to a human-readable string. In
>> any case, the pretty-printer does a lot of stuff that is mostly
>> useless for just printing a declaration (translation, wrapping, etc.).
>>
>> Bonus point if all GNU toolchain programs use the same functions for
>> mangling and demangling (because I guess they actually don't, no?)
>>
>> I guess it cannot be so easy, so I apologize in advance for saying nonsense.
>
> It makes sense; and I don't think it is as complicated as it might sound.

dwarf_name takes a tree; the demangler takes a mangled name. We don't
have mangled names for many of the names we want to enter into the
pubnames table.

-cary


Re: Heads-up, PR53273: testsuite separation and dilution problem. Fix for PR53272

2012-05-08 Thread Hans-Peter Nilsson
> From: Richard Guenther 
> Date: Tue, 8 May 2012 10:50:43 +0200

> On Tue, May 8, 2012 at 5:39 AM, Hans-Peter Nilsson
>  wrote:
> > The problem was spotted while fixing PR53272, a target bug with
> > crisv32-* involving the error-prone notice_update_cc function.
> >
> > When wrapping up the test-case to use as a run-test, adding main
> > and auxiliary functions to the reduced test-case unexpectedly
> > made the bug go away.  This despite all functions (except main)
> > being decorated with noinline, noclone and the special marker
> > asm ("") ad finitum.
> > ...
> > Can we just have a way to
> > limit those pesky cross-function optimizations and all their kin
> > once and for all?
> 
> You don't say what actually is different when you add these functions.

I've looked further and added details in PR53273.  Basically,
GCC sneaked a peek at a neighboring function that was
unconditionally called, spotted a noreturning function and
optimized according to that.  While this was an erroneous
partial edit (unconditional aborting), I can probably make a
(re)throwing test-case with the similar unwanted difference in
generated code.

> There should be no IPA optimizations possible unless you tell GCC
> that it sees the whole program (which means using -flto with the
> linker plugin).  That is, marking functions noclone and noinline and
> avoiding declaring them static should be enough.
> 
> Still some pieces of GCC may expose different code generation due to
> DECL uid differences - which, as DECL uids are global, makes extra
> functions possibly result in different code for unchanged functions.  That's
> generally not wanted but it can happen (similar for other such kinds of
> numbers).

Nope, not here; just a noreturn optimization unexpectedly (and
generally when wrapping test-cases unwantedly) applied.  How do
we stop that?  If you say "just add -fno-ipa" (or whatever
option) I say "will that stop *all* future cross-function
optimizations"?  (And to keep test-case integrity, better add it
to existing test-cases.)

brgds, H-P
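
For readers unfamiliar with the decoration being discussed, a wrapped test-case typically looks something like the sketch below.  The function names are hypothetical; the attributes and the empty asm are the ones named in the thread.  As the thread points out, this decoration does not by itself rule out every cross-function effect, so the sketch shows the convention, not a guarantee.

```c
/* Hypothetical helper: kept out of line and uncloned so the function
   under test is compiled the same way it was in the reduced case.  */
__attribute__ ((noinline, noclone)) int
toy_callee (int x)
{
  __asm__ ("");   /* same as asm (""), spelled to work in strict ISO mode */
  return x + 1;
}

/* Hypothetical function under test, decorated the same way and left
   non-static, as suggested in the quoted reply.  */
__attribute__ ((noinline, noclone)) int
toy_function_under_test (int x)
{
  __asm__ ("");
  return toy_callee (x) * 2;
}
```

A run-test would then add a main that calls toy_function_under_test and aborts on a wrong result; the open question in the thread is how to keep that added code from influencing the code generated for the function under test.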


Re: [patch] support for multiarch systems

2012-05-08 Thread Matthias Klose
On 08.05.2012 15:20, Joseph S. Myers wrote:
> On Tue, 8 May 2012, Matthias Klose wrote:
> 
>> On 20.08.2011 21:51, Matthias Klose wrote:
>>> Multiarch [1] is the term being used to refer to the capability of a
>>> system to install and run applications of multiple different binary
>>> targets on the same system.
>>
>> please find attached an updated patch for the trunk (2012-05-08). The
>> multiarch triplets are now defined in the Debian Wiki [1], and progress
>> is made to get the triplet definitions into Debian Policy [2].
> 
> This still seems to suffer in some cases the problem of previous versions 
> that it does not ensure triplets are never used for non-matching ABIs.  
> For example, a compiler for powerpc-linux-gnu can be configured 
> --with-float=soft but this patch will still use powerpc-linux-gnu as the 
> multiarch triplet.
> 
> For MIPS, I see you allowed for soft-float in setting the triplets - but 
> the specification you point to doesn't mention the soft-float triplets.  
> Likewise you allowed for powerpc-linux-gnuspe being e500v1 or e500v2 but 
> haven't documented the e500v1 triplet.  Likewise for big-endian ARM.
> 
> I again suggest starting with a patch that does just one architecture - 
> but makes sure to cover all the ABIs applicable to that architecture.  
> For example, you could start with a patch for x86 (indeed, just x86 
> GNU/Linux) - and assign a multiarch triplet for x32 even if you're not 
> building an x32 distribution with multiarch.  Then, once the generic 
> support has been reviewed by build system maintainers, and the x86 support 
> by x86 maintainers and people familiar with all the applicable x86 ABIs, 
> send patches for each other architecture (or architecture/OS combination), 
> and the relevant architecture experts can review them to make sure the 
> relevant ABIs are properly distinguished.

ok, the attached patch includes just the support for the x86 targets, including
the kfreebsd and the hurd systems. The x32 multiarch tuple isn't yet defined, so
I'd like to keep it out of the first version.

  Matthias
gcc/

2012-05-08  Matthias Klose  

* doc/invoke.texi: Document -print-multiarch.
* doc/install.texi: Document --enable-multiarch.
* doc/fragments.texi: Document MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME.
* configure.ac: Add --enable-multiarch option.
* configure.in: Regenerate.
* Makefile.in (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib.
enable_multiarch, with_float: New macros.
if_multiarch: New macro, define in terms of enable_multiarch.
* genmultilib: Add new argument for the multiarch name.
* gcc.c (multiarch_dir): Define.
(for_each_path): Search for multiarch suffixes.
(driver_handle_option): Handle multiarch option.
(do_spec_1): Pass -imultiarch if defined.
(main): Print multiarch.
(set_multilib_dir): Separate multilib and multiarch names
from multilib_select.
(print_multilib_info): Ignore multiarch names in multilib_select.
* incpath.c (add_standard_paths): Search the multiarch include dirs.
* cppdefault.h (default_include): Document multiarch in multilib
member.
* cppdefault.c: [LOCAL_INCLUDE_DIR, STANDARD_INCLUDE_DIR] Add an
include directory for multiarch directories.
* common.opt: New options -print-multiarch and -imultiarch.
* config.gcc: Add tmake fragments to tmake_file ( i386/t-kfreebsd
for i[34567]86-*-kfreebsd*-gnu and x86_64-*-kfreebsd*-gnu, i386/t-gnu
for i[34567]86-*-gnu*).
* config/i386/t-kfreebsd: Add multiarch names in MULTILIB_OSDIRNAMES,
Define MULTIARCH_DIRNAME.
* config/i386/t-linux64: Likewise.
* config/i386/t-gnu: Likewise.
* config/i386/t-linux: Likewise.

Index: gcc/common.opt
===
--- gcc/common.opt  (revision 187271)
+++ gcc/common.opt  (working copy)
@@ -345,6 +345,9 @@
 -print-multi-os-directory
 Driver Alias(print-multi-os-directory)
 
+-print-multiarch
+Driver Alias(print-multiarch)
+
 -print-prog-name
 Driver Separate Alias(print-prog-name=)
 
@@ -2286,6 +2289,10 @@
 Common Joined Var(plugindir_string) Init(0)
 -iplugindir=  Set  to be the default plugin directory
 
+imultiarch
+Common Joined Separate RejectDriver Var(imultiarch) Init(0)
+-imultiarch   Set  to be the multiarch include subdirectory
+
 l
 Driver Joined Separate
 
@@ -2342,6 +2349,9 @@
 
 print-multi-os-directory
 Driver Var(print_multi_os_directory)
+ 
+print-multiarch
+Driver Var(print_multiarch)
 
 print-prog-name=
 Driver JoinedOrMissing Var(print_prog_name)
Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 187271)
+++ gcc/Makefile.in (working copy)
@@ -350,6 +350,17 @@
 
 enable_plugin = @enable_plugin@
 
+# Multiarch support
+enable_multiarch =

Re: patch ping: Add static branch predict heuristic of comparing IV to loop_bound variable

2012-05-08 Thread Dehao Chen
Sorry for the error. Here is a new patch to fix them:

gcc/testsuite/ChangeLog:
2012-05-08  Dehao Chen  

* gcc.dg/predict-1.c: Remove the replicated text in this test.
* gcc.dg/predict-2.c: Likewise.
* gcc.dg/predict-3.c: Likewise.
* gcc.dg/predict-4.c: Likewise.
* gcc.dg/predict-5.c: Likewise.
* gcc.dg/predict-6.c: Likewise.

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 187307)
+++ gcc/ChangeLog   (working copy)
@@ -110,15 +110,15 @@

 2012-05-08  Dehao Chen  

-   * predict.c (find_qualified_ssa_name): New
-   (find_ssa_name_in_expr): New
-   (find_ssa_name_in_assign_stmt): New
-   (is_comparison_with_loop_invariant_p): New
-   (is_bound_expr_similar): New
-   (predict_iv_comparison): New
+   * predict.c (find_qualified_ssa_name): New.
+   (find_ssa_name_in_expr): New.
+   (find_ssa_name_in_assign_stmt): New.
+   (is_comparison_with_loop_invariant_p): New.
+   (is_bound_expr_similar): New.
+   (predict_iv_comparison): New.
(predict_loops): Add heuristic for loop-nested branches that compare an
induction variable to a loop bound variable.
-   * predict.def (PRED_LOOP_IV_COMPARE): New macro
+   * predict.def (PRED_LOOP_IV_COMPARE): New macro.

 2012-05-08  Uros Bizjak  

Index: gcc/testsuite/gcc.dg/predict-3.c
===
--- gcc/testsuite/gcc.dg/predict-3.c(revision 187307)
+++ gcc/testsuite/gcc.dg/predict-3.c(working copy)
@@ -23,28 +23,3 @@

 /* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
100.0%" 4 "profile_estimate"} } */
 /* { dg-final { cleanup-tree-dump "profile_estimate" } } */
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
-
-extern int global;
-
-int bar(int);
-
-void foo (int bound)
-{
-  int i, ret = 0;
-  for (i = 0; i <= bound; i++)
-{
-  if (i < bound - 2)
-   global += bar (i);
-  if (i <= bound)
-   global += bar (i);
-  if (i + 1 < bound)
-   global += bar (i);
-  if (i != bound)
-   global += bar (i);
-}
-}
-
-/* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
100.0%" 4 "profile_estimate"} } */
-/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
Index: gcc/testsuite/gcc.dg/predict-4.c
===
--- gcc/testsuite/gcc.dg/predict-4.c(revision 187307)
+++ gcc/testsuite/gcc.dg/predict-4.c(working copy)
@@ -17,22 +17,3 @@

 /* { dg-final { scan-tree-dump "loop iv compare heuristics: 50.0%"
"profile_estimate"} } */
 /* { dg-final { cleanup-tree-dump "profile_estimate" } } */
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
-
-extern int global;
-
-int bar(int);
-
-void foo (int bound)
-{
-  int i, ret = 0;
-  for (i = 0; i < 10; i++)
-{
-  if (i < 5)
-   global += bar (i);
-}
-}
-
-/* { dg-final { scan-tree-dump "loop iv compare heuristics: 50.0%"
"profile_estimate"} } */
-/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
Index: gcc/testsuite/gcc.dg/predict-1.c
===
--- gcc/testsuite/gcc.dg/predict-1.c(revision 187307)
+++ gcc/testsuite/gcc.dg/predict-1.c(working copy)
@@ -25,30 +25,3 @@

 /* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
0.0%" 5 "profile_estimate"} } */
 /* { dg-final { cleanup-tree-dump "profile_estimate" } } */
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
-
-extern int global;
-
-int bar(int);
-
-void foo (int bound)
-{
-  int i, ret = 0;
-  for (i = 0; i < bound; i++)
-{
-  if (i > bound)
-   global += bar (i);
-  if (i >= bound + 2)
-   global += bar (i);
-  if (i > bound - 2)
-   global += bar (i);
-  if (i + 2 > bound)
-   global += bar (i);
-  if (i == 10)
-   global += bar (i);
-}
-}
-
-/* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
0.0%" 5 "profile_estimate"} } */
-/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
Index: gcc/testsuite/gcc.dg/predict-5.c
===
--- gcc/testsuite/gcc.dg/predict-5.c(revision 187307)
+++ gcc/testsuite/gcc.dg/predict-5.c(working copy)
@@ -23,28 +23,3 @@

 /* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
100.0%" 4 "profile_estimate"} } */
 /* { dg-final { cleanup-tree-dump "profile_estimate" } } */
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
-
-extern int global;
-
-int bar (int);
-
-void foo (int base, int bound)
-{
-  int i, ret = 0;
-  for (i = base; i <= bound; i++)
-{
-  if (i > base)
-   global += bar (i);
-  if (i > base + 1)
-   global += bar (i);
-  if (i >= base + 3)
-   global += bar

Re: Continue strict-volatile-bitfields fixes

2012-05-08 Thread Thomas Schwinge
Hi!

On Fri, 27 Apr 2012 10:29:06 +0200, Jakub Jelinek  wrote:
> On Fri, Apr 27, 2012 at 12:42:41PM +0800, Thomas Schwinge wrote:
> > > > GET_MODE_BITSIZE (lmode)« (8 bits).  (With the current sources, lmode is
> > > > VOIDmode.)
> > > >
> > > > Is emitting »BIT_FIELD_REF <*common, 32, 0> & 255« wrong in this case,
> > > > or should a later optimization pass be able to figure out that
> > > > »BIT_FIELD_REF <*common, 32, 0> & 255« is in fact the same as
> > > > common->code, and then be able to conflate these?  Any suggestions
> > > > where/how to tackle this?
> > > 
> > > The BIT_FIELD_REF is somewhat of a red-herring.  It is created by
> > > fold-const.c in optimize_bit_field_compare, code that I think should
> > > be removed completely.  Or it needs to be made aware of
> > > strict-volatile bitfield and C++ memory model details.
> 
> I'd actually very much prefer the latter, just disable
> optimize_bit_field_compare for strict-volatile bitfield mode and when
> avoiding load data races in C++ memory model (that isn't going to be
> default, right?).  This optimization is useful, and it is solely about
> loads, so even C++ memory model usually shouldn't care.

I can't comment on the C++ memory model bits, but I have now tested the
following patch (fixes the issue for SH, no regressions for ARM, x86):

gcc/
* fold-const.c (optimize_bit_field_compare): Abort early in the strict
volatile bitfields case.

Index: fold-const.c
===
--- fold-const.c(revision 186856)
+++ fold-const.c(working copy)
@@ -3342,6 +3342,11 @@ optimize_bit_field_compare (location_t loc, enum t
   tree mask;
   tree offset;
 
+  /* In the strict volatile bitfields case, doing code changes here may prevent
+ other optimizations, in particular in a SLOW_BYTE_ACCESS setting.  */
+  if (flag_strict_volatile_bitfields > 0)
+return 0;
+
   /* Get all the information about the extractions being done.  If the bit size
  if the same as the size of the underlying object, we aren't doing an
  extraction at all and so can do nothing.  We also don't want to


Grüße,
 Thomas
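
To make the two access shapes in this thread concrete, here is a small sketch, assuming a hypothetical struct with an 8-bit field inside a 32-bit container.  The first function reads just the field, as common->code would; the second loads the whole 32-bit word and masks it, which is the shape of »BIT_FIELD_REF <*common, 32, 0> & 255«.  The layout and names are illustrative assumptions, and the two reads agree only on ABIs that place the field in the low-order 8 bits of the word (typical on little-endian targets).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: an 8-bit code field in a 32-bit container.  */
struct toy_common
{
  unsigned int code : 8;
  unsigned int rest : 24;
};

/* Narrow access: read only the bit-field.  */
unsigned int
toy_read_code_field (const struct toy_common *c)
{
  return c->code;
}

/* Wide access: load the whole 32-bit container, then mask out the low
   8 bits.  Equivalent to the field read only where the ABI puts the
   field in the low-order byte of the word.  */
unsigned int
toy_read_code_wide (const struct toy_common *c)
{
  uint32_t word;
  memcpy (&word, c, sizeof word);
  return word & 255u;
}
```

Seen this way, the patch simply declines to rewrite the first form into the second when strict volatile bitfields are requested.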




[PATCH] Fix PR53217

2012-05-08 Thread William J. Schmidt
This fixes another statement-placement issue when reassociating
expressions with repeated factors.  Multiplies feeding into
__builtin_powi calls were not getting placed properly ahead of them in
some cases.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
regressions.  I've also run SPEC cpu2006 with no build or correctness
issues.  OK for trunk?

Thanks,
Bill
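
A rough sketch of the kind of rewrite involved, written as plain C so the ordering constraint is visible.  toy_powi is a hypothetical stand-in for __builtin_powi, and the reassociated function only approximates what the pass produces; the point is that the multiply feeding the power call has to be emitted before it.

```c
/* Hypothetical stand-in for __builtin_powi: raise BASE to a
   non-negative integer power by repeated multiplication.  */
static double
toy_powi (double base, int n)
{
  double result = 1.0;
  while (n-- > 0)
    result *= base;
  return result;
}

/* Expression with repeated factors, as written in the source.  */
double
toy_product_plain (double x, double y)
{
  return x * x * x * y * y * y;
}

/* Reassociated shape: factors with the same repeat count are multiplied
   together first, and the product is then raised to that power.  The
   multiply T must be placed ahead of the power call that uses it.  */
double
toy_product_reassociated (double x, double y)
{
  double t = x * y;
  return toy_powi (t, 3);
}
```

If the statement computing t were placed after the call that consumes it, the result would use an undefined value, which is the sort of misplacement the verify_ssa failure in the new test guards against.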


gcc:

2012-05-08  Bill Schmidt  

PR tree-optimization/53217
* tree-ssa-reassoc.c (bip_map): New static variable.
(possibly_move_powi): Move feeding multiplies with __builtin_powi call.
(attempt_builtin_powi): Save feeding multiplies on a stack.
(reassociate_bb): Create and destroy bip_map.

gcc/testsuite:

2012-05-08  Bill Schmidt  

PR tree-optimization/53217
* gfortran.dg/pr53217.f90: New test.


Index: gcc/testsuite/gfortran.dg/pr53217.f90
===
--- gcc/testsuite/gfortran.dg/pr53217.f90   (revision 0)
+++ gcc/testsuite/gfortran.dg/pr53217.f90   (revision 0)
@@ -0,0 +1,28 @@
+! { dg-do compile }
+! { dg-options "-O1 -ffast-math" }
+
+! This tests only for compile-time failure, which formerly occurred
+! when statements were emitted out of order, failing verify_ssa.
+
+MODULE xc_cs1
+  INTEGER, PARAMETER :: dp=KIND(0.0D0)
+  REAL(KIND=dp), PARAMETER :: a = 0.04918_dp, &
+  c = 0.2533_dp, &
+  d = 0.349_dp
+CONTAINS
+  SUBROUTINE cs1_u_2 ( rho, grho, r13, e_rho_rho, e_rho_ndrho, e_ndrho_ndrho,&
+   npoints, error)
+REAL(KIND=dp), DIMENSION(*), &
+  INTENT(INOUT)  :: e_rho_rho, e_rho_ndrho, &
+e_ndrho_ndrho
+DO ip = 1, npoints
+  IF ( rho(ip) > eps_rho ) THEN
+ oc = 1.0_dp/(r*r*r3*r3 + c*g*g)
+ d2rF4 = c4p*f13*f23*g**4*r3/r * (193*d*r**5*r3*r3+90*d*d*r**5*r3 &
+ -88*g*g*c*r**3*r3-100*d*d*c*g*g*r*r*r3*r3 &
+ +104*r**6)*od**3*oc**4
+ e_rho_rho(ip) = e_rho_rho(ip) + d2F1 + d2rF2 + d2F3 + d2rF4
+  END IF
+END DO
+  END SUBROUTINE cs1_u_2
+END MODULE xc_cs1
Index: gcc/tree-ssa-reassoc.c
===
--- gcc/tree-ssa-reassoc.c  (revision 187117)
+++ gcc/tree-ssa-reassoc.c  (working copy)
@@ -200,6 +200,10 @@ static long *bb_rank;
 /* Operand->rank hashtable.  */
 static struct pointer_map_t *operand_rank;
 
+/* Map from inserted __builtin_powi calls to multiply chains that
+   feed them.  */
+static struct pointer_map_t *bip_map;
+
 /* Forward decls.  */
 static long get_rank (tree);
 
@@ -2249,7 +2253,7 @@ remove_visited_stmt_chain (tree var)
 static void
 possibly_move_powi (gimple stmt, tree op)
 {
-  gimple stmt2;
+  gimple stmt2, *mpy;
   tree fndecl;
   gimple_stmt_iterator gsi1, gsi2;
 
@@ -2278,9 +2282,39 @@ possibly_move_powi (gimple stmt, tree op)
   return;
 }
 
+  /* Move the __builtin_powi.  */
   gsi1 = gsi_for_stmt (stmt);
   gsi2 = gsi_for_stmt (stmt2);
   gsi_move_before (&gsi2, &gsi1);
+
+  /* See if there are multiplies feeding the __builtin_powi base
+ argument that must also be moved.  */
+  while ((mpy = (gimple *) pointer_map_contains (bip_map, stmt2)) != NULL)
+{
+  /* If we've already moved this statement, we're done.  This is
+ identified by a NULL entry for the statement in bip_map.  */
+  gimple *next = (gimple *) pointer_map_contains (bip_map, *mpy);
+  if (next && !*next)
+   return;
+
+  stmt = stmt2;
+  stmt2 = *mpy;
+  gsi1 = gsi_for_stmt (stmt);
+  gsi2 = gsi_for_stmt (stmt2);
+  gsi_move_before (&gsi2, &gsi1);
+
+  /* The moved multiply may be DAG'd from multiple calls if it
+was the result of a cached multiply.  Only move it once.
+Rank order ensures we move it to the right place the first
+time.  */
+  if (next)
+   *next = NULL;
+  else
+   {
+ next = (gimple *) pointer_map_insert (bip_map, *mpy);
+ *next = NULL;
+   }
+}
 }
 
 /* This function checks three consequtive operands in
@@ -3281,6 +3315,7 @@ attempt_builtin_powi (gimple stmt, VEC(operand_ent
   while (true)
 {
   HOST_WIDE_INT power;
+  gimple last_mul = NULL;
 
   /* First look for the largest cached product of factors from
 preceding iterations.  If found, create a builtin_powi for
@@ -3318,16 +3353,25 @@ attempt_builtin_powi (gimple stmt, VEC(operand_ent
}
  else
{
+ gimple *value;
+
  iter_result = get_reassoc_pow_ssa_name (target, type);
  pow_stmt = gimple_build_call (powi_fndecl, 2, rf1->repr, 
build_int_cst (integer_type_node,
   power));
  gimple_call_set_lhs (pow_stmt, iter_result);
  gimple_set_l

[PATCH] Remove -Y option from linker command line on Linux/Sparc.

2012-05-08 Thread David Miller

We never really should have been passing this Solaris compatibility
option for specifying library paths in the first place.  Let them
get set the right way when the compiler passes in "-L" options.

Committed to master.

gcc/

* config/sparc/linux.h (LINK_SPEC): Don't pass "-Y" option.
* config/sparc/linux64.h (LINK_ARCH32_SPEC): Likewise.
* config/sparc/linux64.h (LINK_ARCH64_SPEC): Likewise.
---
 gcc/ChangeLog  |6 ++
 gcc/config/sparc/linux.h   |2 +-
 gcc/config/sparc/linux64.h |4 ++--
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5891094..986f2c1 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2012-05-08  David S. Miller  
+
+   * config/sparc/linux.h (LINK_SPEC): Don't pass "-Y" option.
+   * config/sparc/linux64.h (LINK_ARCH32_SPEC): Likewise.
+   * config/sparc/linux64.h (LINK_ARCH64_SPEC): Likewise.
+
 2012-05-08  Richard Sandiford  
 
PR rtl-optimization/53278
diff --git a/gcc/config/sparc/linux.h b/gcc/config/sparc/linux.h
index 60dc869..ac6c537 100644
--- a/gcc/config/sparc/linux.h
+++ b/gcc/config/sparc/linux.h
@@ -87,7 +87,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 #define GLIBC_DYNAMIC_LINKER "/lib/ld-linux.so.2"
 
 #undef  LINK_SPEC
-#define LINK_SPEC "-m elf32_sparc -Y P,/usr/lib %{shared:-shared} \
+#define LINK_SPEC "-m elf32_sparc %{shared:-shared} \
   %{!mno-relax:%{!r:-relax}} \
   %{!shared: \
 %{!static: \
diff --git a/gcc/config/sparc/linux64.h b/gcc/config/sparc/linux64.h
index 14966b9..f932e98 100644
--- a/gcc/config/sparc/linux64.h
+++ b/gcc/config/sparc/linux64.h
@@ -105,7 +105,7 @@ along with GCC; see the file COPYING3.  If not see
   { "link_arch_default", LINK_ARCH_DEFAULT_SPEC },   \
   { "link_arch",LINK_ARCH_SPEC },
 
-#define LINK_ARCH32_SPEC "-m elf32_sparc -Y P,%R/usr/lib %{shared:-shared} \
+#define LINK_ARCH32_SPEC "-m elf32_sparc %{shared:-shared} \
   %{!shared: \
 %{!static: \
   %{rdynamic:-export-dynamic} \
@@ -113,7 +113,7 @@ along with GCC; see the file COPYING3.  If not see
   %{static:-static}} \
 "
 
-#define LINK_ARCH64_SPEC "-m elf64_sparc -Y P,%R/usr/lib64 %{shared:-shared} \
+#define LINK_ARCH64_SPEC "-m elf64_sparc %{shared:-shared} \
   %{!shared: \
 %{!static: \
   %{rdynamic:-export-dynamic} \
-- 
1.7.10



Fix gcc.dg/lower-subreg-1.c failure (was: [C Patch]: pr52543)

2012-05-08 Thread Hans-Peter Nilsson
> From: Richard Sandiford 
> Date: Tue, 1 May 2012 16:46:38 +0200

> To repeat: as things stand, very few targets define proper rtx costs
> for SET.

IMHO it's wrong to start blaming targets when rtx_cost doesn't
take the mode into account in the first place, for the default
cost.  (Well, except for the modes-tieable subreg special-case.)
Targets where an operation in N * word_mode costs no more than
one in word_mode are a minority, if there are any at all, so
let's adjust the defaults to that.

>  This patch is therefore expected to prevent lower-subreg
> from running in cases where it's actually beneficial.  If you see that
> happening, please check whether the rtx_costs are defined properly.

Well, for CRIS (one of the targets of the PR53176 fallout) they
are sane, basically.  Where cris_rtx_costs returns true, it
returns mostly(*) ballparkly-correct costs *where it's passed an
rtx for which there's a corresponding insn*, otherwise falling
back to the defaults.  It shouldn't have to check for validity
of the rtx asked about; core GCC already knows which insns there
are and can gate that in rtx_cost or its callers.

(*) I see a bug in that cris_rtx_costs doesn't check the mode
for extendsidi2, to return COSTS_N_INSNS (3) instead of 0
(because a sign-extending move to SImode doesn't cost more than
a move; a sign- or zero-extension is also free in an operand for
addition and multiplication).  But this isn't on the path that
lower-subreg.c takes, so only an incidental observation...

> Of course, if the costs are defined properly and lower-subreg still
> makes the wrong choice, we need to look at why.

By the way, regarding validity of rtx_cost calls:

> +++ gcc/lower-subreg.c  2012-05-01 09:46:48.473830772 +0100

> +/* Return the cost of a CODE shift in mode MODE by OP1 bits, using the
> +   rtxes in RTXES.  SPEED_P selects between the speed and size cost.  */
> +
> +static int
> +shift_cost (bool speed_p, struct cost_rtxes *rtxes, enum rtx_code code,
> +   enum machine_mode mode, int op1)
> +{
> +  PUT_MODE (rtxes->target, mode);
> +  PUT_CODE (rtxes->shift, code);
> +  PUT_MODE (rtxes->shift, mode);
> +  PUT_MODE (rtxes->source, mode);
> +  XEXP (rtxes->shift, 1) = GEN_INT (op1);
> +  SET_SRC (rtxes->set) = rtxes->shift;
> +  return insn_rtx_cost (rtxes->set, speed_p);
> +}
> +
> +/* For each X in the range [0, BITS_PER_WORD), set SPLITTING[X]
> +   to true if it is profitable to split a double-word CODE shift
> +   of X + BITS_PER_WORD bits.  SPEED_P says whether we are testing
> +   for speed or size profitability.
> +
> +   Use the rtxes in RTXES to calculate costs.  WORD_MOVE_ZERO_COST is
> +   the cost of moving zero into a word-mode register.  WORD_MOVE_COST
> +   is the cost of moving between word registers.  */
> +
> +static void
> +compute_splitting_shift (bool speed_p, struct cost_rtxes *rtxes,
> +bool *splitting, enum rtx_code code,
> +int word_move_zero_cost, int word_move_cost)
> +{

I think there should be a gating check whether the target
implements that kind of shift in that mode at all, before
checking the cost.  Not sure whether it's generally best to put
that test here, or to make the rtx_cost function return the cost
of a libcall for that mode when that happens.  Similar for the
other insns.

Isn't the below better than doing virtually the same in each
target's rtx_costs?  Not tested yet besides "make cc1" and
checking that lower-subreg.c yields sane costs and that
gcc.dg/lower-subreg-1.c passes for cris-elf.  Note that
untieable SUBREGs still get a higher cost than tieable ones.

I'll test this for cris-elf, please tell me if/what other tests
and targets are required (simulator or compilefarm targets only,
please).

* rtlanal.c (rtx_cost): Adjust default cost for X with a
UNITS_PER_WORD factor for all X according to the size of
its mode, not just for SUBREGs with untieable modes.
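
To make the arithmetic concrete, here is a standalone sketch of the scaling (not the patch itself).  UNITS_PER_WORD is fixed at 4 purely as an assumption for a 32-bit target, the helper names are invented for the example, and the scaling by 4 in COSTS_N_INSNS follows GCC's rtl.h.

```c
#include <assert.h>

#define COSTS_N_INSNS(N) ((N) * 4)  /* same scaling as GCC's rtl.h */
#define UNITS_PER_WORD 4            /* assumption: 32-bit target */

/* Hypothetical stand-in for the factor computed at the top of rtx_cost:
   how many words the mode spans, but never less than one.  */
int size_factor(int mode_size_bytes)
{
    int factor = mode_size_bytes / UNITS_PER_WORD;
    return factor == 0 ? 1 : factor;
}

/* Default MULT cost scaled by the number of words.  */
int default_mult_cost(int mode_size_bytes)
{
    return size_factor(mode_size_bytes) * COSTS_N_INSNS(5);
}

/* Default DIV/UDIV/MOD/UMOD cost scaled the same way.  */
int default_div_cost(int mode_size_bytes)
{
    return size_factor(mode_size_bytes) * COSTS_N_INSNS(7);
}

/* Sub-word and word-sized modes keep the old defaults; a double-word
   multiply is costed at twice the word-sized one.  */
void check_default_costs(void)
{
    assert(default_mult_cost(1) == COSTS_N_INSNS(5));
    assert(default_mult_cost(4) == COSTS_N_INSNS(5));
    assert(default_mult_cost(8) == 2 * COSTS_N_INSNS(5));
}
```

With these assumptions a DImode multiply (8 bytes) gets a default cost of 40 rather than 20, which is the effect the ChangeLog entry describes.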

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 187308)
+++ gcc/rtlanal.c   (working copy)
@@ -3755,10 +3755,17 @@ rtx_cost (rtx x, enum rtx_code outer_cod
   enum rtx_code code;
   const char *fmt;
   int total;
+  int factor;
 
   if (x == 0)
 return 0;
 
+  /* A size N times larger than UNITS_PER_WORD likely needs N times as
+ many insns, taking N times as long.  */
+  factor = GET_MODE_SIZE (GET_MODE (x)) / UNITS_PER_WORD;
+  if (factor == 0)
+factor = 1;
+
   /* Compute the default costs of certain things.
  Note that targetm.rtx_costs can override the defaults.  */
 
@@ -3766,20 +3773,27 @@ rtx_cost (rtx x, enum rtx_code outer_cod
   switch (code)
 {
 case MULT:
-  total = COSTS_N_INSNS (5);
+  total = factor * COSTS_N_INSNS (5);
   break;
 case DIV:
 case UDIV:
 case MOD:
 case UMOD:
-  total = COSTS_N_INSNS (7);
+  total = factor * COSTS_N_INSNS (7);
   break;
 case USE:
   /* 

Re: [PATCH] MIPS16: Fix truncated DWARF-2 line information

2012-05-08 Thread Maciej W. Rozycki
On Tue, 8 May 2012, Richard Sandiford wrote:

> >  Are you using a hard-float multilib for your -mabi=32/-mips16 Linux 
> > testing?
> 
> Yeah.  As an example:
> 
>http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg00393.html
> 
> which doesn't look too bad.  Clean fortran results, which I expect
> would test the FP interworking fairly heavily.  (It's certainly
> been a source of bug fixes in the past, although I don't remember
> the results ever being terrible.)

 Yes, these look very good indeed.  Especially with QEMU that I do not 
feel terribly confident about as far as MIPS16 emulation is concerned (I 
still need to track down a single piece of real silicon supporting both 
MIPS64 and MIPS16 code at a time).

> FAOD, this is with normal MIPS libraries and mips16 executables.
> There's still no way of building mips16 multilibs out of the box.

 I've checked some notes and the issue was with MIPS16 FP PIC code (and 
therefore obviously SVR4 stubs rather than PLT) indeed.  I hope these 
pieces will get submitted eventually.

  Maciej


[PATCH] Add option for dumping to stderr (issue6190057)

2012-05-08 Thread Sharad Singhai
In response to comments, I have updated the patch to support dumps in
user provided files via the option -fdump-xxx=<filename>. The
filenames stdout/stderr are treated specially, and are considered
standard streams.
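
A minimal sketch of the behaviour described above, assuming nothing about how tree-dump.c actually implements it: the helper names open_dump_stream and close_dump_stream are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a user-supplied dump file name to a stream.
   "stdout" and "stderr" select the standard streams; anything else is
   treated as a path and opened for writing.  */
FILE *open_dump_stream(const char *user_filename)
{
    if (strcmp(user_filename, "stdout") == 0)
        return stdout;
    if (strcmp(user_filename, "stderr") == 0)
        return stderr;
    return fopen(user_filename, "w");  /* NULL if the open fails */
}

/* Close only streams opened above; the standard streams stay open.  */
void close_dump_stream(FILE *stream)
{
    if (stream != NULL && stream != stdout && stream != stderr)
        fclose(stream);
}

/* Self-check that the special names resolve to the standard streams.  */
void check_dump_streams(void)
{
    assert(open_dump_stream("stdout") == stdout);
    assert(open_dump_stream("stderr") == stderr);
}
```

One consequence of special-casing the names is that a regular file literally called "stderr" can only be selected through a path such as ./stderr.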

Also updated documentation and a testcase. Okay for trunk?

Thanks,
Sharad

2012-05-08   Sharad Singhai  

* doc/invoke.texi: Add documentation for new option.
* tree-dump.c (dump_stream_p): New function.
(dump_files): Update for new field.
(dump_switch_p_1): Handle user provided filenames.
(dump_begin): Likewise.
(get_dump_file_name): Likewise.
(dump_enable_all): Add new parameter USER_FILENAME.
All callers updated.
(dump_end): Remove attribute.
* tree-pass.h (enum tree_dump_index): Add new constant.
(struct dump_file_info): Add new field USER_FILENAME.
* testsuite/g++.dg/other/dump-userfile-1.C: New test.

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 187265)
+++ doc/invoke.texi (working copy)
@@ -5322,20 +5322,24 @@ Here are some examples showing uses of these optio
 
 @item -d@var{letters}
 @itemx -fdump-rtl-@var{pass}
+@itemx -fdump-rtl-@var{pass}=@var{filename}
 @opindex d
 Says to make debugging dumps during compilation at times specified by
 @var{letters}.  This is used for debugging the RTL-based passes of the
 compiler.  The file names for most of the dumps are made by appending
 a pass number and a word to the @var{dumpname}, and the files are
-created in the directory of the output file.  Note that the pass
-number is computed statically as passes get registered into the pass
-manager.  Thus the numbering is not related to the dynamic order of
-execution of passes.  In particular, a pass installed by a plugin
-could have a number over 200 even if it executed quite early.
-@var{dumpname} is generated from the name of the output file, if
-explicitly specified and it is not an executable, otherwise it is the
-basename of the source file. These switches may have different effects
-when @option{-E} is used for preprocessing.
+created in the directory of the output file. If the
+@option{=@var{filename}} is appended to the longer form of the dump
+option then the dump is done on that file instead of numbered
+files. The filenames stdout and stderr are treated specially. Note
+that the pass number is computed statically as passes get registered
+into the pass manager.  Thus the numbering is not related to the
+dynamic order of execution of passes.  In particular, a pass installed
+by a plugin could have a number over 200 even if it executed quite
+early.  @var{dumpname} is generated from the name of the output file,
+if explicitly specified and it is not an executable, otherwise it is
+the basename of the source file. These switches may have different
+effects when @option{-E} is used for preprocessing.
 
 Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
 @option{-d} option @var{letters}.  Here are the possible
@@ -5599,6 +5603,10 @@ These dumps are defined but always produce empty f
 @opindex fdump-rtl-all
 Produce all the dumps listed above.
 
+@item -fdump-rtl-all=stderr
+@opindex fdump-rtl-all=stderr
+Produce all RTL dumps on stderr.
+
 @item -dA
 @opindex dA
 Annotate the assembler output with miscellaneous debugging information.
@@ -5719,15 +5727,19 @@ counters for each function compiled.
 
 @item -fdump-tree-@var{switch}
 @itemx -fdump-tree-@var{switch}-@var{options}
+@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename}
 @opindex fdump-tree
 Control the dumping at various stages of processing the intermediate
 language tree to a file.  The file name is generated by appending a
 switch specific suffix to the source file name, and the file is
-created in the same directory as the output file.  If the
-@samp{-@var{options}} form is used, @var{options} is a list of
-@samp{-} separated options which control the details of the dump.  Not
-all options are applicable to all dumps; those that are not
-meaningful are ignored.  The following options are available
+created in the same directory as the output file. In case of
+@option{=@var{filename}}, the dump output is on the given file. Note
+that the filenames stdout and stderr are treated specially and dumps
+are done on standard streams. If the @samp{-@var{options}} form is
+used, @var{options} is a list of @samp{-} separated options which
+control the details or location of the dump.  Not all options are
+applicable to all dumps; those that are not meaningful are ignored.
+The following options are available
 
 @table @samp
 @item address
@@ -5765,9 +5777,56 @@ Enable showing the tree dump for each statement.
 Enable showing the EH region number holding each statement.
 @item scev
 Enable showing scalar evolution analysis details.
+@item slim
+Inhibit dumping of members of a scope or body of a function merely
+because that scope has been reached

Re: patch ping: Add static branch predict heuristic of comparing IV to loop_bound variable

2012-05-08 Thread Jakub Jelinek
On Wed, May 09, 2012 at 09:02:14AM +0800, Dehao Chen wrote:
> Sorry for the error. Here is a new patch to fix them:
> 
> gcc/testsuite/ChangeLog:
> 2012-05-08  Dehao Chen  
> 
>   * gcc.dg/predict-1.c: Remove the replicated text in this text.
>   * gcc.dg/predict-2.c: Likewise.
>   * gcc.dg/predict-3.c: Likewise.
>   * gcc.dg/predict-4.c: Likewise.
>   * gcc.dg/predict-5.c: Likewise.
>   * gcc.dg/predict-6.c: Likewise.

Ok (you could have committed it as obvious even).

> --- gcc/ChangeLog (revision 187307)
> +++ gcc/ChangeLog (working copy)
> @@ -110,15 +110,15 @@
> 
>  2012-05-08  Dehao Chen  
> 
> - * predict.c (find_qualified_ssa_name): New
> - (find_ssa_name_in_expr): New
> - (find_ssa_name_in_assign_stmt): New
> - (is_comparison_with_loop_invariant_p): New
> - (is_bound_expr_similar): New
> - (predict_iv_comparison): New
> + * predict.c (find_qualified_ssa_name): New.
> + (find_ssa_name_in_expr): New.
> + (find_ssa_name_in_assign_stmt): New.
> + (is_comparison_with_loop_invariant_p): New.
> + (is_bound_expr_similar): New.
> + (predict_iv_comparison): New.
>   (predict_loops): Add heuristic for loop-nested branches that compare an
>   induction variable to a loop bound variable.
> - * predict.def (PRED_LOOP_IV_COMPARE): New macro
> + * predict.def (PRED_LOOP_IV_COMPARE): New macro.
> 
>  2012-05-08  Uros Bizjak  
> 
> Index: gcc/testsuite/gcc.dg/predict-3.c
> ===
> --- gcc/testsuite/gcc.dg/predict-3.c  (revision 187307)
> +++ gcc/testsuite/gcc.dg/predict-3.c  (working copy)
> @@ -23,28 +23,3 @@
> 
>  /* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
> 100.0%" 4 "profile_estimate"} } */
>  /* { dg-final { cleanup-tree-dump "profile_estimate" } } */
> -/* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> -
> -extern int global;
> -
> -int bar(int);
> -
> -void foo (int bound)
> -{
> -  int i, ret = 0;
> -  for (i = 0; i <= bound; i++)
> -{
> -  if (i < bound - 2)
> - global += bar (i);
> -  if (i <= bound)
> - global += bar (i);
> -  if (i + 1 < bound)
> - global += bar (i);
> -  if (i != bound)
> - global += bar (i);
> -}
> -}
> -
> -/* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
> 100.0%" 4 "profile_estimate"} } */
> -/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
> Index: gcc/testsuite/gcc.dg/predict-4.c
> ===
> --- gcc/testsuite/gcc.dg/predict-4.c  (revision 187307)
> +++ gcc/testsuite/gcc.dg/predict-4.c  (working copy)
> @@ -17,22 +17,3 @@
> 
>  /* { dg-final { scan-tree-dump "loop iv compare heuristics: 50.0%"
> "profile_estimate"} } */
>  /* { dg-final { cleanup-tree-dump "profile_estimate" } } */
> -/* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> -
> -extern int global;
> -
> -int bar(int);
> -
> -void foo (int bound)
> -{
> -  int i, ret = 0;
> -  for (i = 0; i < 10; i++)
> -{
> -  if (i < 5)
> - global += bar (i);
> -}
> -}
> -
> -/* { dg-final { scan-tree-dump "loop iv compare heuristics: 50.0%"
> "profile_estimate"} } */
> -/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
> Index: gcc/testsuite/gcc.dg/predict-1.c
> ===
> --- gcc/testsuite/gcc.dg/predict-1.c  (revision 187307)
> +++ gcc/testsuite/gcc.dg/predict-1.c  (working copy)
> @@ -25,30 +25,3 @@
> 
>  /* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
> 0.0%" 5 "profile_estimate"} } */
>  /* { dg-final { cleanup-tree-dump "profile_estimate" } } */
> -/* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> -
> -extern int global;
> -
> -int bar(int);
> -
> -void foo (int bound)
> -{
> -  int i, ret = 0;
> -  for (i = 0; i < bound; i++)
> -{
> -  if (i > bound)
> - global += bar (i);
> -  if (i >= bound + 2)
> - global += bar (i);
> -  if (i > bound - 2)
> - global += bar (i);
> -  if (i + 2 > bound)
> - global += bar (i);
> -  if (i == 10)
> - global += bar (i);
> -}
> -}
> -
> -/* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
> 0.0%" 5 "profile_estimate"} } */
> -/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
> Index: gcc/testsuite/gcc.dg/predict-5.c
> ===
> --- gcc/testsuite/gcc.dg/predict-5.c  (revision 187307)
> +++ gcc/testsuite/gcc.dg/predict-5.c  (working copy)
> @@ -23,28 +23,3 @@
> 
>  /* { dg-final { scan-tree-dump-times "loop iv compare heuristics:
> 100.0%" 4 "profile_estimate"} } */
>  /* { dg-final { cleanup-tree-dump "profile_estimate" } } */
> -/* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> -
> -extern int global;
> -
> -int bar (int);
> -

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-08 Thread Andrew Pinski
On Tue, May 8, 2012 at 11:46 PM, Sharad Singhai  wrote:
> In response to comments, I have updated the patch to support dumps in
> user provided files via the option -fdump-xxx=<filename>. The
> filenames stdout/stderr are treated specially, and are considered
> standard streams.

I think - should also be treated as special (or maybe the only one
which should be treated as special).

Thanks,
Andrew Pinski


Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-08 Thread Gabriel Dos Reis
On Wed, May 9, 2012 at 1:51 AM, Andrew Pinski  wrote:
> On Tue, May 8, 2012 at 11:46 PM, Sharad Singhai  wrote:
>> In response to comments, I have updated the patch to support dumps in
>> user provided files via the option -fdump-xxx=<filename>. The
>> filenames stdout/stderr are treated specially, and are considered
>> standard streams.
>
> I think - should also be treated as special (or maybe the only one
> which should be treated as special).

He originally wanted only "stderr", so treating only "-" specially would
not be in line with his original goal.  "-" is equivalent to "stdout", so
it is not as if we lack that functionality with his revised patch.
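
For illustration only, the two conventions can coexist.  The sketch below classifies a dump name with "-" accepted as an alias for "stdout"; the enum and function names are invented here, and nothing in the posted patch says it handles "-" this way.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of a user-supplied dump name.  */
enum dump_target { DUMP_TO_FILE, DUMP_TO_STDOUT, DUMP_TO_STDERR };

enum dump_target classify_dump_name(const char *name)
{
    /* "-" is the conventional spelling for standard output.  */
    if (strcmp(name, "-") == 0 || strcmp(name, "stdout") == 0)
        return DUMP_TO_STDOUT;
    if (strcmp(name, "stderr") == 0)
        return DUMP_TO_STDERR;
    return DUMP_TO_FILE;  /* anything else is an ordinary path */
}

/* Self-check: the alias and the explicit name agree.  */
void check_dump_names(void)
{
    assert(classify_dump_name("-") == classify_dump_name("stdout"));
    assert(classify_dump_name("stderr") == DUMP_TO_STDERR);
}
```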

>
> Thanks,
> Andrew Pinski
>
>>
>> Also updated documentation and a testcase. Okay for trunk?
>>
>> Thanks,
>> Sharad
>>
>> 2012-05-08   Sharad Singhai  
>>
>>        * doc/invoke.texi: Add documentation for new option.
>>        * tree-dump.c (dump_stream_p): New function.
>>        (dump_files): Update for new field.
>>        (dump_switch_p_1): Handle user provided filenames.
>>        (dump_begin): Likewise.
>>        (get_dump_file_name): Likewise.
>>        (dump_enable_all): Add new parameter USER_FILENAME.
>>        All callers updated.
>>        (dump_end): Remove attribute.
>>        * tree-pass.h (enum tree_dump_index): Add new constant.
>>        (struct dump_file_info): Add new field USER_FILENAME.
>>        * testsuite/g++.dg/other/dump-userfile-1.C: New test.
>>
>> Index: doc/invoke.texi
>> ===
>> --- doc/invoke.texi     (revision 187265)
>> +++ doc/invoke.texi     (working copy)
>> @@ -5322,20 +5322,24 @@ Here are some examples showing uses of these optio
>>
>>  @item -d@var{letters}
>>  @itemx -fdump-rtl-@var{pass}
>> +@itemx -fdump-rtl-@var{pass}=@var{filename}
>>  @opindex d
>>  Says to make debugging dumps during compilation at times specified by
>>  @var{letters}.  This is used for debugging the RTL-based passes of the
>>  compiler.  The file names for most of the dumps are made by appending
>>  a pass number and a word to the @var{dumpname}, and the files are
>> -created in the directory of the output file.  Note that the pass
>> -number is computed statically as passes get registered into the pass
>> -manager.  Thus the numbering is not related to the dynamic order of
>> -execution of passes.  In particular, a pass installed by a plugin
>> -could have a number over 200 even if it executed quite early.
>> -@var{dumpname} is generated from the name of the output file, if
>> -explicitly specified and it is not an executable, otherwise it is the
>> -basename of the source file. These switches may have different effects
>> -when @option{-E} is used for preprocessing.
>> +created in the directory of the output file. If the
>> +@option{=@var{filename}} is appended to the longer form of the dump
>> +option then the dump is done on that file instead of numbered
>> +files. The filenames stdout and stderr are treated specially. Note
>> +that the pass number is computed statically as passes get registered
>> +into the pass manager.  Thus the numbering is not related to the
>> +dynamic order of execution of passes.  In particular, a pass installed
>> +by a plugin could have a number over 200 even if it executed quite
>> +early.  @var{dumpname} is generated from the name of the output file,
>> +if explicitly specified and it is not an executable, otherwise it is
>> +the basename of the source file. These switches may have different
>> +effects when @option{-E} is used for preprocessing.
>>
>>  Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
>>  @option{-d} option @var{letters}.  Here are the possible
>> @@ -5599,6 +5603,10 @@ These dumps are defined but always produce empty f
>>  @opindex fdump-rtl-all
>>  Produce all the dumps listed above.
>>
>> +@item -fdump-rtl-all=stderr
>> +@opindex fdump-rtl-all=stderr
>> +Produce all RTL dumps on stderr.
>> +
>>  @item -dA
>>  @opindex dA
>>  Annotate the assembler output with miscellaneous debugging information.
>> @@ -5719,15 +5727,19 @@ counters for each function compiled.
>>
>>  @item -fdump-tree-@var{switch}
>>  @itemx -fdump-tree-@var{switch}-@var{options}
>> +@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename}
>>  @opindex fdump-tree
>>  Control the dumping at various stages of processing the intermediate
>>  language tree to a file.  The file name is generated by appending a
>>  switch specific suffix to the source file name, and the file is
>> -created in the same directory as the output file.  If the
>> -@samp{-@var{options}} form is used, @var{options} is a list of
>> -@samp{-} separated options which control the details of the dump.  Not
>> -all options are applicable to all dumps; those that are not
>> -meaningful are ignored.  The following options are available
>> +created in the same directory as the output file. In case of
>> +@option{=@var{filename}}, the dump output is on the given file. Note
>