Re: [patch RFA middle-end] Fix PR target/41993

2012-11-06 Thread Eric Botcazou
> 2012-11-05  Uros Bizjak  
>   Kaz Kojima  
> 
>   PR target/41993
>   * mode-switching.c (create_pre_exit): Set return_copy to
>   last_insn when copy_start is a pseudo reg.

OK, thanks.  The number of special cases dealt with in the function is on the 
verge of making it barely understandable though.  Why couldn't a backward scan 
based only on:

/* If the return register is not likely spilled, - as is
   the case for floating point on SH4 - then it might
   be set by an arithmetic operation that needs a
   different mode than the exit block.  */
for (j = n_entities - 1; j >= 0; j--)
  {
int e = entity_map[j];
int mode = MODE_NEEDED (e, return_copy);

if (mode != num_modes[e] && mode != MODE_EXIT (e))
  break;
  }

(with a few shortcuts to speed it up) be sufficient?

-- 
Eric Botcazou


Re: [Patch, Fortran, OOP] PR 54917: [4.7/4.8 Regression] TRANSFER on polymorphic variable causes ICE

2012-11-06 Thread Tobias Burnus

Janus Weil wrote:

the attached patch implements support for polymorphic arguments to
TRANSFER. For details and discussion see the PR.

Btw, as part of the PR is a regression in 4.7 and trunk, I would like
to backport the target-memory.c part and the first test case to the
4.7 branch. Ok?


Looks OK. Thanks for the patch. Nit:

+  result->ts = result->ts.u.derived->components->ts;

can be written as:

+  result->ts = CLASS_DATA (result)->ts;


which is shorter and clearer.
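For readers unfamiliar with the accessor, here is a toy C model of why the two spellings are interchangeable. The struct names and TOY_CLASS_DATA are invented for illustration and are much simpler than the real declarations in gfortran.h. The only assumption is that the accessor names the first component of the class container.

```c
#include <stddef.h>

/* Toy stand-ins for gfortran's type structures.  These are NOT the real
   declarations; they only keep the fields needed for the comparison.  */
struct toy_symbol;

struct toy_typespec {
    int type;                        /* stands in for the basic type code */
    struct toy_symbol *derived;      /* stands in for ts.u.derived */
};

struct toy_component {
    struct toy_typespec ts;
};

struct toy_symbol {
    struct toy_component *components;  /* first component of the container */
};

struct toy_expr {
    struct toy_typespec ts;
};

/* Assumed shape of the accessor: first component of the class container.  */
#define TOY_CLASS_DATA(e) ((e)->ts.derived->components)

/* Long form, as written in the patch.  */
struct toy_typespec long_form(const struct toy_expr *e)
{
    return e->ts.derived->components->ts;
}

/* Macro form, as suggested in the review.  */
struct toy_typespec macro_form(const struct toy_expr *e)
{
    return TOY_CLASS_DATA(e)->ts;
}
```

Both functions walk the same pointer chain, so the macro form only hides the container layout behind a name.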

Tobias


2012-11-05  Janus Weil  

 PR fortran/54917
 * target-memory.c (gfc_target_expr_size,gfc_target_interpret_expr):
 Handle BT_CLASS.
 * trans-intrinsic.c (gfc_conv_intrinsic_transfer): Add support for
 polymorphic arguments.

2012-11-05  Janus Weil  

 PR fortran/54917
 * gfortran.dg/transfer_class_1.f90: New.
 * gfortran.dg/transfer_class_2.f90: New.


Re: [Patch] libitm: add HTM fastpath

2012-11-06 Thread Uros Bizjak
Hello!

> This patch adds support for using strongly-isolated HTMs with
> serial-irrevocable mode as fallback.  Such HTMs can execute
> uninstrumented code transactionally, and eventually aborted transactions
> will cause no visible side effects.  Data conflicts with
> nontransactional accesses lead to transactions being aborted.

+static inline bool
+htm_available ()
+{
+  const unsigned cpuid_rtm = (1 << 11);
+  if (__get_cpuid_max (0, NULL) >= 7)
+{
+  unsigned a, b, c, d;
+  __cpuid_count (7, 0, a, b, c, d);
+  if (b & cpuid_rtm)
+   return true;
+}
+  return false;

You can use bit_RTM from cpuid.h instead of cpuid_rtm here.
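A minimal sketch of the suggested form, assuming an x86 target and a cpuid.h that may or may not already define bit_RTM (the fallback definition uses the same bit 11 as the quoted patch). The non-x86 stub is my own addition so that the fragment compiles elsewhere; it is not part of the libitm patch.

```c
#include <stdbool.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Older cpuid.h headers may lack bit_RTM.  Bit 11 of EBX in CPUID
   leaf 7, subleaf 0 is the value the quoted patch hard-codes.  */
#ifndef bit_RTM
#define bit_RTM (1 << 11)
#endif

bool htm_available(void)
{
    /* Leaf 7 must exist before its feature bits can be queried.  */
    if (__get_cpuid_max(0, NULL) >= 7) {
        unsigned a, b, c, d;
        __cpuid_count(7, 0, a, b, c, d);
        if (b & bit_RTM)
            return true;
    }
    return false;
}
#else
/* Assumption for illustration: no RTM outside x86.  */
bool htm_available(void)
{
    return false;
}
#endif
```

Using the header constant keeps the feature bit in one place with the other CPUID bits instead of repeating a magic number locally.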

Uros.


Re: [patch][RFC] Filename based shared library versioning on AIX

2012-11-06 Thread Michael Haubenwallner

On 11/05/2012 07:31 PM, David Edelsohn wrote:
> On Mon, Nov 5, 2012 at 1:10 PM, Michael Haubenwallner
>  wrote:
> 
>> Well, as long as the old sharedlibs were not created as standalone shared
>> objects (lib.so), this is similar to a normal "soname"-bump on AIX, in that
>> it is still possible for the package manager to transfer the old shared
>> objects (with F_LOADONLY flag set) into the new archives.
> 
> Yes, the old shared objects can be placed in the new archive, but one
> also needs to ensure that the archive has the correct name, e.g.
> libfoo.a not libfoo.so.x.y and libfoo.so.

Actually, AIX does symlink some of its system libraries (including libc), so
it seems to be fine when libfoo.a is a symlink to libfoo.so.x.y containing
the correct shared objects.

>> As far as I can see, gcc does not provide this libtool-option (environment
>> variable LDFLAGS=-brtl) at all for its libraries (for good reason).
> 
> I do not understand what you mean by gcc does not provide this libtool
> option.  GCC does link libstdc++ with -G option, for use with -brtl,
> but does not automatically link applications with -brtl.

When vanilla libtool detects "-brtl" in LDFLAGS, it does:
  * Link the Shared Object with -G flag.
  * The Shared Object's filename is libNAME.so.1.2.3
  * Create symlink libNAME.so.1 -> libNAME.so.1.2.3
  * Create symlink libNAME.so -> libNAME.so.1.2.3
  * Create libNAME.a from static objects.

When "-brtl" is not in LDFLAGS, it does it the traditional way:
  * Link the Shared Object with -bM:SRE flag.
  * The Shared Object's filename is libNAME.so.1
  * Put the Shared Object into libNAME.a
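The two naming schemes above can be summarised in a small C helper. This is only a sketch: the function name, the struct and the buffer sizes are made up for illustration, and libtool itself is a shell script that contains nothing like this.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the file names one library build produces.  */
struct aix_lib_names {
    char shared_object[128];  /* file name given to the shared object */
    char link_major[128];     /* symlink libNAME.so.MAJOR, "-brtl" mode only */
    char link_plain[128];     /* symlink libNAME.so, "-brtl" mode only */
    char archive[128];        /* libNAME.a */
};

/* brtl != 0: standalone libNAME.so.X.Y.Z plus two symlinks, and libNAME.a
   holds only static objects.
   brtl == 0: the shared object is named libNAME.so.MAJOR and is placed
   inside libNAME.a.  */
void aix_lib_names_for(const char *name, int major, int minor, int patch,
                       int brtl, struct aix_lib_names *out)
{
    memset(out, 0, sizeof *out);
    snprintf(out->archive, sizeof out->archive, "lib%s.a", name);
    if (brtl) {
        snprintf(out->shared_object, sizeof out->shared_object,
                 "lib%s.so.%d.%d.%d", name, major, minor, patch);
        snprintf(out->link_major, sizeof out->link_major,
                 "lib%s.so.%d", name, major);
        snprintf(out->link_plain, sizeof out->link_plain,
                 "lib%s.so", name);
    } else {
        snprintf(out->shared_object, sizeof out->shared_object,
                 "lib%s.so.%d", name, major);
    }
}
```

For NAME = foo and version 1.2.3, the first mode yields libfoo.so.1.2.3 with links libfoo.so.1 and libfoo.so, while the second yields an archive member libfoo.so.1 inside libfoo.a.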

The former is what is incompatible with this "aix-soname" proposal, but I've
never seen any gcc library (libstdc++) being created that way as standalone
Shared Object.

Instead, libstdc++ is created the traditional way, but with -G flag - not
listening to -brtl in LDFLAGS as vanilla libtool does.

/haubi/


[Ada] clean ups in Makefiles

2012-11-06 Thread Arnaud Charlet
This patch cleans up makefile targets related to the generation of
s-oscons.ads. We had duplicate rules between Make-generated.in and
gcc-interface/Makefile.in which caused confusion recently and are now removed.
The rules are now handled in Make-generated.in only, as was intended when
Make-generated.in was introduced. A few minor clean ups also done while
reviewing these files.

Tested on x86_64-pc-linux-gnu, committed on trunk.

libada/ 
* Makefile.in (osconstool): Fix target. 
ada/
* gcc-interface/Makefile.in, gcc-interface/Make-lang.in: Remove
duplicate rules handled by Make-generated.in.
--
Index: libada/Makefile.in
===
--- libada/Makefile.in  (revision 193034)
+++ libada/Makefile.in  (working copy)
@@ -113,7 +113,7 @@ gnatlib-sjlj gnatlib-zcx gnatlib-shared:
$(LN_S) $(ADA_RTS_DIR) adalib
 
 osconstool:
-   $(MAKE) -C $(GCC_DIR)/ada $(LIBADA_FLAGS_TO_PASS) ./bldtools/oscons/xoscons
+   $(MAKE) -C $(GCC_DIR) $(LIBADA_FLAGS_TO_PASS) ada/s-oscons.ads
 
 install-gnatlib: $(GCC_DIR)/ada/Makefile
$(MAKE) -C $(GCC_DIR)/ada $(LIBADA_FLAGS_TO_PASS) install-gnatlib
Index: gcc/ada/gcc-interface/Makefile.in
===
--- gcc/ada/gcc-interface/Makefile.in   (revision 193208)
+++ gcc/ada/gcc-interface/Makefile.in   (working copy)
@@ -2577,48 +2577,13 @@ install-gnatlib: ../stamp-gnatlib-$(RTSD
$(RTSDIR)/$(word 1,$(subst <, ,$(PAIR)));)
 # Copy tsystem.h
$(CP) $(srcdir)/tsystem.h $(RTSDIR)
+# Copy generated target dependent sources
+   $(RM) $(RTSDIR)/s-oscons.ads
+   (cd $(RTSDIR); $(LN_S) ../s-oscons.ads s-oscons.ads)
$(RM) ../stamp-gnatlib-$(RTSDIR)
touch ../stamp-gnatlib1-$(RTSDIR)
 
-ifeq ($(strip $(filter-out alpha64 ia64 dec hp vms% openvms% alphavms%,$(subst 
-, ,$(host,)
-OSCONS_CPP=../../$(DECC) -E /comment=as_is -DNATIVE \
- -DTARGET='""$(target)""' $(fsrcpfx)ada/s-oscons-tmplt.c
-
-OSCONS_EXTRACT=../../$(DECC) -DNATIVE \
- -DTARGET='""$(target)""' $(fsrcpfx)ada/s-oscons-tmplt.c ; \
-  ld -o s-oscons-tmplt.exe s-oscons-tmplt.obj; \
-  ./s-oscons-tmplt.exe > s-oscons-tmplt.s
-
-else
-# GCC_FOR_TARGET has paths relative to the gcc directory, so we need to adjust
-# for running it from $(RTSDIR)
-OSCONS_CC=`echo "$(GCC_FOR_TARGET)" \
-  | sed -e 's^\./xgcc^../../xgcc^' -e 's^-B./^-B../../^'`
-OSCONS_CPP=$(OSCONS_CC) $(GNATLIBCFLAGS) -E -C \
-  -DTARGET=\"$(target)\" $(fsrcpfx)ada/s-oscons-tmplt.c > s-oscons-tmplt.i
-OSCONS_EXTRACT=$(OSCONS_CC) $(GNATLIBCFLAGS) -S s-oscons-tmplt.i
-endif
-
-./bldtools/oscons/xoscons: xoscons.adb xutil.ads xutil.adb
-   -$(MKDIR) ./bldtools/oscons
-   $(RM) $(addprefix ./bldtools/oscons/,$(notdir $^))
-   $(CP) $^ ./bldtools/oscons
-   (cd ./bldtools/oscons ; gnatmake -q xoscons)
-
-$(RTSDIR)/s-oscons.ads: ../stamp-gnatlib1-$(RTSDIR) s-oscons-tmplt.c gsocket.h ./bldtools/oscons/xoscons
-   $(RM) $(RTSDIR)/s-oscons-tmplt.i $(RTSDIR)/s-oscons-tmplt.s
-   (cd $(RTSDIR) ; \
-   $(OSCONS_CPP) ; \
-   $(OSCONS_EXTRACT) ; \
-   ../bldtools/oscons/xoscons s-oscons)
-
-# Don't use semicolon separated shell commands that involve list expansions.
-# The semicolon triggers a call to DCL on VMS and DCL can't handle command
-# line lengths in excess of 256 characters.
-# Example: cd $(RTSDIR); ar rc libfoo.a $(LONG_LIST_OF_OBJS)
-# is guaranteed to overflow the buffer.
-
-gnatlib: ../stamp-gnatlib1-$(RTSDIR) ../stamp-gnatlib2-$(RTSDIR) $(RTSDIR)/s-oscons.ads
+gnatlib: ../stamp-gnatlib1-$(RTSDIR) ../stamp-gnatlib2-$(RTSDIR)
 # C files
$(MAKE) -C $(RTSDIR) \
CC="`echo \"$(GCC_FOR_TARGET)\" \
Index: gcc/ada/gcc-interface/Make-lang.in
===
--- gcc/ada/gcc-interface/Make-lang.in  (revision 193208)
+++ gcc/ada/gcc-interface/Make-lang.in  (working copy)
@@ -122,7 +122,7 @@ ifeq ($(build), $(host))
 
 # put the host RTS dir first in the PATH to hide the default runtime
 # files that are among the sources
-RTS_DIR=$(strip $(subst \,/,$(shell gnatls -v | grep adalib )))
+RTS_DIR:=$(strip $(subst \,/,$(shell gnatls -v | grep adalib )))
 
 ADA_TOOLS_FLAGS_TO_PASS=\
 CC="$(CC)" \
@@ -157,7 +157,7 @@ else
   else
 # This is a canadian cross. We should use a toolchain running on the
 # build platform and targeting the host platform.
-RTS_DIR=$(strip $(subst \,/,$(shell $(GNATLS_FOR_HOST) -v | grep adalib )))
+RTS_DIR:=$(strip $(subst \,/,$(shell $(GNATLS_FOR_HOST) -v | grep adalib )))
 ADA_TOOLS_FLAGS_TO_PASS=\
 CC="$(CC)" \
 $(COMMON_FLAGS_TO_PASS) $(ADA_FLAGS_TO_PASS)  \
@@ -574,7 +574,7 @@ canadian-gnattools: force
$(MAKE) -C ada $(ADA_TOOLS_FLAGS_TO_PASS) gnattools2
$(MAKE) -C ada $(ADA

Re: [patch RFA middle-end] Fix PR target/41993

2012-11-06 Thread Kaz Kojima
Eric Botcazou  wrote:
> OK, thanks.  The number of special cases dealt with in the function is on the
> verge of making it barely understandable though.  Why couldn't a backward scan
> based only on:
> 
>   /* If the return register is not likely spilled, - as is
>  the case for floating point on SH4 - then it might
>  be set by an arithmetic operation that needs a
>  different mode than the exit block.  */
>   for (j = n_entities - 1; j >= 0; j--)
> {
>   int e = entity_map[j];
>   int mode = MODE_NEEDED (e, return_copy);
> 
>   if (mode != num_modes[e] && mode != MODE_EXIT (e))
> break;
> }
> 
> (with a few shortcuts to speed it up) be sufficient?

Although I might be getting you wrong, the current code does a scan
based on those lines, but builtin_return, functions with no return
value and exceptions require special treatment and make things
complex.

Regards,
kaz


Re: [patch RFA middle-end] Fix PR target/41993

2012-11-06 Thread Eric Botcazou
> Although I might be getting you wrong, the current code does a scan
> based on those lines, but builtin_return, functions with no return
> value and exceptions require special treatment and make things
> complex.

I was referring to the ret_start/ret_end/nregs business: why is it necessary?

-- 
Eric Botcazou


Re: [patch RFA middle-end] Fix PR target/41993

2012-11-06 Thread Kaz Kojima
Eric Botcazou  wrote:
> I was referring to the ret_start/ret_end/nregs business: why is it necessary?

I thought that they are for the return value which requires
multiple hard registers.

Regards,
kaz


[Ada] Support both concurrent and sequential partition elaboration policies

2012-11-06 Thread Arnaud Charlet
In the restricted profile, both policies are now supported. The default is
concurrent, but sequential can be selected too.

The following should now compile:

pragma Profile (Ravenscar);
pragma Partition_Elaboration_Policy (Concurrent);

package p is
  task t;
end;

package body p is
  task body t is
  begin
loop
  null;
end loop;
  end;
end; 

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Tristan Gingold  

* exp_ch9.adb (Build_Activation_Chain_Entity): Return immediately if
partition elaboration policy is sequential.
(Build_Task_Activation_Call): Likewise. Use
Activate_Restricted_Tasks on restricted profile.
(Make_Task_Create_Call): Do not use the _Chain
parameter if elaboration policy is sequential. Call
Create_Restricted_Task_Sequential in that case.
* exp_ch3.adb (Build_Initialization_Call): Change condition to
support concurrent elaboration policy.
(Build_Record_Init_Proc): Likewise.
(Init_Formals): Likewise.
* bindgen.adb (Gen_Adainit): Declare Partition_Elaboration_Policy
and set it in generated code if the elaboration policy is
sequential. The procedure called to activate all tasks is now
named __gnat_activate_all_tasks.
* rtsfind.adb (RE_Activate_Restricted_Task,
RE_Create_Restricted_Task_Sequential): New RE_Id literals.
* s-tarest.adb (Create_Restricted_Task): Added to create a task without
adding it on an activation chain.
(Activate_Tasks): Has now a Chain parameter.
(Activate_All_Tasks_Sequential): Added. Called by the binder to
activate all tasks.
(Activate_Restricted_Tasks): Added. Called during elaboration to
activate tasks of the units.
* s-tarest.ads: Remove pragma Partition_Elaboration_Policy.
(Partition_Elaboration_Policy): New variable (set by the binder).
(Create_Restricted_Task): Revert removal of the chain parameter.
(Create_Restricted_Task_Sequential): New procedure.
(Activate_Restricted_Tasks): Revert removal.
(Activate_All_Tasks_Sequential): New procedure.

Index: exp_ch9.adb
===
--- exp_ch9.adb (revision 193208)
+++ exp_ch9.adb (working copy)
@@ -911,10 +911,10 @@
--  Start of processing for Build_Activation_Chain_Entity
 
begin
-  --  Activation chain is never used in restricted profile, see comment
-  --  for Create_Restricted_Task in s-tarest.ads.
+  --  Activation chain is never used for sequential elaboration policy, see
+  --  comment for Create_Restricted_Task_Sequential in s-tarest.ads).
 
-  if Restricted_Profile then
+  if Partition_Elaboration_Policy = 'S' then
  return;
   end if;
 
@@ -4900,10 +4900,10 @@
   P : Node_Id;
 
begin
-  --  On restricted profile, all the tasks will be activated at the end
-  --  of the elaboration (Sequential elaboration policy).
+  --  For sequential elaboration policy, all the tasks will be activated at
+  --  the end of the elaboration.
 
-  if Restricted_Profile then
+  if Partition_Elaboration_Policy = 'S' then
  return;
   end if;
 
@@ -4925,7 +4925,11 @@
   end if;
 
   if Present (Chain) then
- Name := New_Reference_To (RTE (RE_Activate_Tasks), Loc);
+ if Restricted_Profile then
+Name := New_Reference_To (RTE (RE_Activate_Restricted_Tasks), Loc);
+ else
+Name := New_Reference_To (RTE (RE_Activate_Tasks), Loc);
+ end if;
 
  Call :=
Make_Procedure_Call_Statement (Loc,
@@ -13980,10 +13984,10 @@
   Prefix => Make_Identifier (Loc, New_External_Name (Tnam, 'E')),
   Attribute_Name => Name_Unchecked_Access));
 
-  --  Chain parameter. This is a reference to the Chain parameter of the
-  --  initialization procedure. There is no chain in restricted profile.
+  --  Add Chain parameter (not done for sequential elaboration policy, see
+  --  comment for Create_Restricted_Task_Sequential in s-tarest.ads).
 
-  if not Restricted_Profile then
+  if Partition_Elaboration_Policy /= 'S' then
  Append_To (Args, Make_Identifier (Loc, Name_uChain));
   end if;
 
@@ -14015,11 +14019,20 @@
   Prefix=> Make_Identifier (Loc, Name_uInit),
   Selector_Name => Make_Identifier (Loc, Name_uTask_Id)));
 
-  if Restricted_Profile then
- Name := New_Reference_To (RTE (RE_Create_Restricted_Task), Loc);
-  else
- Name := New_Reference_To (RTE (RE_Create_Task), Loc);
-  end if;
+  declare
+ Create_RE : RE_Id;
+  begin
+ if Restricted_Profile then
+if Partition_Elaboration_Policy = 'S' then
+   Create_RE := RE_Create_Restricted_Task_Sequential;
+else
+   Create_RE := RE_Create_Restrict

[Ada] Correct possible double qualification of names in formal mode

2012-11-06 Thread Arnaud Charlet
In the special mode for formal verification, the entity was not marked as
having a qualified name after being passed to Qualify_Entity_Name, which led
to multiple qualification when Qualify_Entity_Name was called multiple times
on the same entity. The assumption that it is called only once on each entity
is wrong, because scopes (containing entities) may be put more than once on
the qualification stack, once every time they are expanded, and expansion may
be called multiple times on the same node. The fact that names are marked as
qualified does not break the behavior of Unique_Name to fully qualify names
based on scope names, because this function looks at the other flag
Has_Fully_Qualified_Name, not set in formal verification mode.
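The failure mode described above is the usual one for a non-idempotent renaming step, and the fix is to record that the step has run. A rough C miniature follows. The toy_entity type, the toy_qualify function and the "__N" suffix format are hypothetical; they are not taken from exp_dbug.adb.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical miniature of an entity: a name plus two flags.  */
struct toy_entity {
    char name[64];
    bool has_homonym;
    bool has_qualified_name;
};

/* Appends a homonym suffix at most once.  Without the final assignment,
   every further call would append the suffix again, which mirrors the
   double qualification described above.  */
void toy_qualify(struct toy_entity *ent, int homonym_number)
{
    if (ent->has_qualified_name)
        return;                       /* already processed: nothing to do */

    if (ent->has_homonym) {
        size_t len = strlen(ent->name);
        snprintf(ent->name + len, sizeof ent->name - len,
                 "__%d", homonym_number);
    }

    ent->has_qualified_name = true;   /* makes repeated calls harmless */
}
```

Calling toy_qualify twice on the same entity leaves the name unchanged after the first call, which is the property the patch restores for formal verification mode.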

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Yannick Moy  

* exp_dbug.adb (Qualify_Entity_Name): Mark entity as having a qualified
name after being treated, in formal verification mode.

Index: exp_dbug.adb
===
--- exp_dbug.adb(revision 193208)
+++ exp_dbug.adb(working copy)
@@ -1307,12 +1307,13 @@
   if Has_Qualified_Name (Ent) then
  return;
 
-  --  In formal verification mode, simply append a suffix for homonyms, but
-  --  do not mark the name as being qualified. We used to qualify entity
-  --  names as full expansion does, but this was removed as this prevents
-  --  the verification back-end from using a short name for debugging and
-  --  user interaction. The verification back-end already takes care of
-  --  qualifying names when needed.
+  --  In formal verification mode, simply append a suffix for homonyms.
+  --  We used to qualify entity names as full expansion does, but this was
+  --  removed as this prevents the verification back-end from using a short
+  --  name for debugging and user interaction. The verification back-end
+  --  already takes care of qualifying names when needed. Still mark the
+  --  name as being qualified, as Qualify_Entity_Name may be called more
+  --  than once on the same entity.
 
   elsif Alfa_Mode then
  if Has_Homonym (Ent) then
@@ -1322,6 +1323,7 @@
 Set_Chars (Ent, Name_Enter);
  end if;
 
+ Set_Has_Qualified_Name (Ent);
  return;
 
   --  If the entity is a variable encoding the debug name for an object


[Ada] Fix error in multi-precision division used in ELIMINATED mode

2012-11-06 Thread Arnaud Charlet
The ELIMINATED overflow mode uses a multi-precision integer arithmetic
package that depends on algorithm D from Knuth for multi-precision
division. In very rare cases, this algorithm has a bug causing an
internal overflow resulting in an incorrect result. This patch uses
the fixed version of this algorithm (see code of Div_Rem in s-bignum
for details).

The following program:

pragma Overflow_Checks (Eliminated);
with Ada.Text_IO;
procedure BadBigNum is
   type U32 is mod 2**32;
   type U64 is mod 2**64;
   subtype LLI is Long_Long_Integer;

   procedure Q (Y : LLI) is  --  Y is 2**32 - 1
      Ym1 : LLI := Y - 1;    --  Ym1 is 2**32 - 2
      Z : U64 := U64 ((((Ym1 * 2**32) + Ym1) * 2**32)
                      / (Ym1 * 2**32 + Y));
      --  Z is 2**32 - 1 = 4_294_967_295
   begin
      Ada.Text_IO.Put_Line ("z =" & U64'Image (Z));
   end Q;

   X : U32 := -1;       --  2**32 - 1
   Y : LLI := LLI (X);  --  2**32 - 1
begin
   Q (Y);
end BadBigNum;

should generate output of

 4294967295
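The expected value can be cross-checked independently of the bignum package. The sketch below reads the expression as ((Ym1 * 2**32 + Ym1) * 2**32) / (Ym1 * 2**32 + Y) and evaluates it with unsigned __int128, a GCC/Clang extension on 64-bit targets, since the numerator is close to 2**96. The function name is made up for this check.

```c
#include <stdint.h>

/* Evaluates the quotient from the test program in 128-bit arithmetic.
   Assumes the compiler provides unsigned __int128.  */
uint64_t expected_z(void)
{
    const unsigned __int128 two32 = (unsigned __int128)1 << 32;
    const unsigned __int128 y = two32 - 1;      /* 2**32 - 1 */
    const unsigned __int128 ym1 = y - 1;        /* 2**32 - 2 */

    unsigned __int128 num = ((ym1 * two32) + ym1) * two32;
    unsigned __int128 den = ym1 * two32 + y;

    return (uint64_t)(num / den);
}
```

With b = 2**32 and D = Ym1 * b + Y, the numerator equals b * (D - 1), so the truncated quotient is b - 1, in agreement with the value quoted above.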

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Robert Dewar  

* s-bignum.adb (Div_Rem): Fix bug in step D3.
* uintp.adb (UI_Div_Rem): Add comment on bug in step D3.

Index: uintp.adb
===
--- uintp.adb   (revision 193215)
+++ uintp.adb   (working copy)
@@ -1216,6 +1216,15 @@
 
--  [ CALCULATE Q (hat) ] (step D3 in the algorithm)
 
+   --  Note: this version of step D3 is from the original published
+   --  algorithm, which is known to have a bug causing overflows.
+   --  See: http://www-cs-faculty.stanford.edu/~uno/err2-2e.ps.gz.
+   --  In this code we are safe since our representation of double
+   --  length numbers allows an expanded range.
+
+   --  We don't have a proof of this claim, but the only cases we
+   --  have found that show the bug in step D3 work fine here.
+
Tmp_Int := Dividend (J) * Base + Dividend (J + 1);
 
--  Initial guess
@@ -1230,7 +1239,7 @@
 
while Divisor_Dig2 * Q_Guess >
  (Tmp_Int - Q_Guess * Divisor_Dig1) * Base +
-  Dividend (J + 2)
+Dividend (J + 2)
loop
   Q_Guess := Q_Guess - 1;
end loop;
Index: s-bignum.adb
===
--- s-bignum.adb(revision 193215)
+++ s-bignum.adb(working copy)
@@ -776,7 +776,9 @@
 
  d: DD;
  j: Length;
- qhat : SD;
+ qhat : DD;
+ rhat : DD;
+ temp : DD;
 
   begin
  --  Initialize data of left and right operands
@@ -847,26 +849,37 @@
  --  Loop through digits
 
  loop
+--  Note: In the original printing, step D3 was as follows:
+
 --  D3. [Calculate qhat.] If uj = v1, set qhat to b-l; otherwise
---  set qhat to (uj,uj+1)/v1.
+--  set qhat to (uj,uj+1)/v1. Now test if v2 * qhat is greater than
+--  (uj*b + uj+1 - qhat*v1)*b + uj+2. If so, decrease qhat by 1 and
+--  repeat this test
 
-if u (j) = v1 then
-   qhat := -1;
-else
-   qhat := SD ((u (j) & u (j + 1)) / DD (v1));
-end if;
+--  This had a bug not discovered till 1995, see Vol 2 errata:
+--  http://www-cs-faculty.stanford.edu/~uno/err2-2e.ps.gz. Under
+--  rare circumstances the expression in the test could overflow.
+--  The code below is the fixed version of this step.
 
---  D3 (continued). Now test if v2 * qhat is greater than (uj*b +
---  uj+1 - qhat*v1)*b + uj+2. If so, decrease qhat by 1 and repeat
---  this test, which determines at high speed most of the cases in
---  which the trial value qhat is one too large, and it eliminates
---  all cases where qhat is two too large.
+--  D3. [Calculate qhat.] Set qhat to (uj,uj+1)/v1 and rhat to
+--  to (uj,uj+1) mod v1.
 
-while DD (v2) * DD (qhat) >
-   ((u (j) & u (j + 1)) -
- DD (qhat) * DD (v1)) * b + DD (u (j + 2))
+temp := u (j) & u (j + 1);
+qhat := temp / DD (v1);
+rhat := temp mod DD (v1);
+
+--  D3 (continued). Now test if qhat = b or v2*qhat > (rhat,uj+2):
+--  if so, decrease qhat by 1, increase rhat by v1, and repeat this
+--  test if rhat < b. [The test on v2 determines at at high speed
+--  most of the cases in which the trial value qhat is one 

[Ada] Expansion of renamings of unconstrained objects

2012-11-06 Thread Arnaud Charlet
When the subtype in an object renaming declaration is unconstrained, the
compiler builds an actual subtype using the bounds of the renamed object.
The actual subtype is not needed when the renamed object is a limited record.
This is a useful optimization, in particular for the expansion of iterators
where discriminated types with implicit dereference appear. It also solves
subtyping problems in the back-end, when the expansion of the renamed object
itself involves function calls with unconstrained actuals.

The following must compile quietly:

   gcc -c -gnat12a essai.adb

---
with Variants; use Variants;
with Variants.Iterators; use Variants.Iterators;
procedure Essai is
   function Count_Length_C(V : Variant) return Natural is
  Res : Natural := 0;
   begin
  for III of Text_Iteraton(V) loop
 Res := Res + III.S_Access.all'Length;
  end loop;
  return Res;
   end Count_Length_C;

   function Make_Huge_Text(N : Natural) return Variant is 
  Res : Variant := Make_Text ("YES", N);
   begin
  for I in 1..N loop
 Text_Append(Res, Natural'Image(I));
  end loop;
  return Res;
   end Make_Huge_Text;

   V : constant Variant := Make_Huge_Text(10);
begin
   null;
end Essai;
---
with Ada.Finalization; use Ada.Finalization;
with Ada.Strings; with Ada.Streams;
with Ada.Strings.Unbounded;
package Variants is

   type Variant is private;
   type Variant_Kind is (VK_Null, VK_Num, VK_String, VK_Vector, VK_Text);

   Null_Variant : constant Variant;

   Initial_Max_Text_Size   : constant := 16;
   Initial_Max_Vector_Size : constant := 16;

   procedure Text_Append (V : in out Variant; X : in String);
   function Make_Text (S : String; N : Positive) return Variant;

private
   package Internal is
  use Ada.Strings.Unbounded; -- only for String_Access

  Initial_Reference_Count : constant := 1;

  type String_Value (Size : Natural) is record
 Reference_Count : Integer := Initial_Reference_Count;
 Value   : String (1 .. Size);
  end record;
  type String_Value_Ptr is access all String_Value;

  type Vector_Value (Size : Natural) is record
 Reference_Count : Integer := Initial_Reference_Count;
 Current_Vector_Size : Natural := 0;
  end record;
  type Vector_Value_Ptr is access all Vector_Value;

  type String_Access_Vector is
  array (Positive range <>) of Ada.Strings.Unbounded.String_Access;
  type Text_Value (Size : Natural) is record
 Reference_Count : Integer := Initial_Reference_Count;
 Current_Text_Size : Natural := 0;
 Value : String_Access_Vector (1 .. Size);
  end record;
  type Text_Value_Ptr is access all Text_Value;

  procedure String_Value_Ptr_Read
(Stream : not null access Ada.Streams.Root_Stream_Type'Class;
 Item   : out String_Value_Ptr);

  procedure String_Value_Ptr_Write
(Stream : not null access Ada.Streams.Root_Stream_Type'Class;
 Item   : in String_Value_Ptr);

  procedure Vector_Value_Ptr_Read
(Stream : not null access Ada.Streams.Root_Stream_Type'Class;
 Item : out Vector_Value_Ptr);

  procedure Vector_Value_Ptr_Write
(Stream : not null access Ada.Streams.Root_Stream_Type'Class;
 Item   : in Vector_Value_Ptr);

  procedure Text_Value_Ptr_Read
(Stream : not null access Ada.Streams.Root_Stream_Type'Class;
 Item   : out Text_Value_Ptr);

  procedure Text_Value_Ptr_Write
(Stream : not null access Ada.Streams.Root_Stream_Type'Class;
 Item   : in Text_Value_Ptr);

  for String_Value_Ptr'Read  use String_Value_Ptr_Read;
  for String_Value_Ptr'Write use String_Value_Ptr_Write;
  for Vector_Value_Ptr'Read  use Vector_Value_Ptr_Read;
  for Vector_Value_Ptr'Write use Vector_Value_Ptr_Write;
  for Text_Value_Ptr'Readuse Text_Value_Ptr_Read;
  for Text_Value_Ptr'Write   use Text_Value_Ptr_Write;

  procedure Free (S : in out String_Value_Ptr);
  procedure Free (S : in out Text_Value_Ptr);
  procedure Free (S : in out Vector_Value_Ptr);
  procedure Free (S : in out String_Access);

   end Internal;
   use Internal;

   type Variant_Internal (Kind : Variant_Kind := VK_Null) is record
  case Kind is
 when VK_Null =>
null;
 when VK_Num =>
Num_Value : Float := 0.0;
 when VK_String =>
String_Value : String_Value_Ptr;
 when VK_Vector =>
Vector_Value : Vector_Value_Ptr;
 when VK_Text =>
Text_Value : Text_Value_Ptr;
  end case;
   end record;

   type Variant is new Ada.Finalization.Controlled with record
  V : Variant_Internal;
   end record;

   overriding procedure Adjust   (X : in out Variant);
   overriding procedure Finalize (X : in out Variant);

   procedure Finalize_Internal (V : in out Variant_Internal);
   procedure Adjust_Internal (V : in out Variant_Internal);

[Ada] Fix another error in multi-precision division used in ELIMINATED mode

2012-11-06 Thread Arnaud Charlet
The ELIMINATED overflow mode uses a multi-precision integer arithmetic package
that depends on algorithm D from Knuth for multi-precision division. This
algorithm was recently corrected for an overflow problem, applying a patch from
1995. This version still had a bug, which was corrected in 2005. This patch
applies this correction (see code of Div_Rem in s-bignum for details).
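For reference, the corrected step D3 can be written out in C roughly as follows. This is a sketch with 32-bit digits and 64-bit double digits, not a translation of s-bignum.adb. The function name and argument order are my own, and the caller is assumed to guarantee the usual preconditions of algorithm D (v1 is normalized, so its top bit is set, and uj <= v1).

```c
#include <stdint.h>

/* Trial quotient digit for Knuth's algorithm D, step D3, in the corrected
   form: the loop test uses qhat >= b and the running remainder rhat rather
   than recomputing (uj*b + uj1 - qhat*v1)*b, which could overflow.
   Digits are 32 bits wide, so b = 2**32.  */
uint32_t estimate_qhat(uint32_t uj, uint32_t uj1, uint32_t uj2,
                       uint32_t v1, uint32_t v2)
{
    const uint64_t b = (uint64_t)1 << 32;
    uint64_t temp = ((uint64_t)uj << 32) | uj1;   /* (uj, uj1) */
    uint64_t qhat = temp / v1;
    uint64_t rhat = temp % v1;

    /* qhat >= b is tested first, so qhat * v2 is only formed when
       qhat < b and therefore fits in 64 bits.  rhat < b holds whenever
       the test is evaluated, so (rhat, uj2) fits as well.  */
    while (qhat >= b || qhat * v2 > ((rhat << 32) | uj2)) {
        qhat -= 1;
        rhat += v1;
        if (rhat >= b)
            break;            /* repeat the test only while rhat < b */
    }
    return (uint32_t)qhat;
}
```

For example, with uj = v1 = 0x80000000 and all other digits zero, the initial quotient is exactly b, which the qhat >= b test reduces to b - 1.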

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Yannick Moy  

* s-bignum.adb (Div_Rem): Fix another bug in step D3.

Index: s-bignum.adb
===
--- s-bignum.adb(revision 193217)
+++ s-bignum.adb(working copy)
@@ -859,6 +859,8 @@
 --  This had a bug not discovered till 1995, see Vol 2 errata:
 --  http://www-cs-faculty.stanford.edu/~uno/err2-2e.ps.gz. Under
 --  rare circumstances the expression in the test could overflow.
+--  This version was further corrected in 2005, see Vol 2 errata:
+--  http://www-cs-faculty.stanford.edu/~uno/all2-pre.ps.gz.
 --  The code below is the fixed version of this step.
 
 --  D3. [Calculate qhat.] Set qhat to (uj,uj+1)/v1 and rhat to
@@ -868,13 +870,13 @@
 qhat := temp / DD (v1);
 rhat := temp mod DD (v1);
 
---  D3 (continued). Now test if qhat = b or v2*qhat > (rhat,uj+2):
+--  D3 (continued). Now test if qhat >= b or v2*qhat > (rhat,uj+2):
 --  if so, decrease qhat by 1, increase rhat by v1, and repeat this
 --  test if rhat < b. [The test on v2 determines at at high speed
 --  most of the cases in which the trial value qhat is one too
 --  large, and eliminates all cases where qhat is two too large.]
 
-while qhat = b
+while qhat >= b
   or else DD (v2) * qhat > LSD (rhat) & u (j + 2)
 loop
qhat := qhat - 1;


[Ada] Document vax float point representation

2012-11-06 Thread Arnaud Charlet
Comments added.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Tristan Gingold  

* exp_vfpt.adb: Document VAX float point layout.

Index: exp_vfpt.adb
===
--- exp_vfpt.adb(revision 193215)
+++ exp_vfpt.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1997-2010, Free Software Foundation, Inc. --
+--  Copyright (C) 1997-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -37,6 +37,78 @@
 
 package body Exp_VFpt is
 
+   --  Vax floating point format (from Vax Architecture Reference Manual
+   --  version 6):
+   --
+   --  Float F:
+   --  --------
+   --
+   --   1 1
+   --   5 4             7 6           0
+   --  +-+---------------+-------------+
+   --  |S|      exp      |  fraction   |  A
+   --  +-+---------------+-------------+
+   --  |           fraction            |  A + 2
+   --  +-------------------------------+
+   --
+   --  bit 15 is the sign bit,
+   --  bits 14:7 is the excess 128 binary exponent,
+   --  bits 6:0 and 31:16 the normalized 24-bit fraction with the redundant
+   --most significant fraction bit not represented.
+   --
+   --  An exponent value of 0 together with a sign bit of 0, is taken to
+   --  indicate that the datum has a value of 0. Exponent values of 1 through
+   --  255 indicate true binary exponents of -127 to +127. An exponent value
+   --  of 0, together with a sign bit of 1, is taken as reserved.
+   --
+   --  Note that fraction bits are not continuous in memory, VAX is little
+   --  endian (LSB first).
+   --
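+   --  Float D:
+   --  --------
+   --
+   --   1 1
+   --   5 4             7 6           0
+   --  +-+---------------+-------------+
+   --  |S|      exp      |  fraction   |  A
+   --  +-+---------------+-------------+
+   --  |           fraction            |  A + 2
+   --  +-------------------------------+
+   --  |           fraction            |  A + 4
+   --  +-------------------------------+
+   --  |           fraction            |  A + 6
+   --  +-------------------------------+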
+   --
+   --  Like Float F but with 55 bits for the fraction.
+   --
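+   --  Float G:
+   --  --------
+   --
+   --   1 1
+   --   5 4                   4 3     0
+   --  +-+---------------------+-------+
+   --  |S|         exp         | fract |  A
+   --  +-+---------------------+-------+
+   --  |           fraction            |  A + 2
+   --  +-------------------------------+
+   --  |           fraction            |  A + 4
+   --  +-------------------------------+
+   --  |           fraction            |  A + 6
+   --  +-------------------------------+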
+   --
+   --  Exponent values of 1 through 2047 indicate trye binary exponents of
+   --  -1023 to +1023.
+   --
+   --  Main differences compared to IEEE 754:
+   --
+   --  * No denormalized numbers
+   --  * No infinity
+   --  * No NaN
+   --  * No -0.0
+   --  * Reserved values (exp = 0, sign = 1)
+   --  * Vax mantissa represent values [0.5, 1)
+   --  * Bias is shifted by 1 (for single float: 128 on Vax, 127 on IEEE)
+
VAXFF_Digits : constant := 6;
VAXDF_Digits : constant := 9;
VAXGF_Digits : constant := 15;


[Ada] Illegal calls in entry call alternatives

2012-11-06 Thread Arnaud Charlet
Entry call alternatives in timed or conditional entry calls, as well as in
asynchronous transfers of control, must be entry calls (looking through
renamings) or dispatching calls to interface primitives. This patch rejects
an illegal case that was not previously diagnosed, namely an indirect call
to a parameterless procedure.

Compiling aa-synchronized_calls.adb must be rejected with the message:

   aa-synchronized_calls.adb:9:13:
  entry call or dispatching primitive of interface required

package AA is
end;
--
package AA.Synchronized_Calls is

   type Procedure_Ptr_T is access procedure;

   protected Protected_Calls is
  procedure Call
 (Process : Procedure_Ptr_T;
  Timeout : Duration);
   end;
end;
--
package body AA.Synchronized_Calls is

   protected body Protected_Calls is
  procedure Call
 (Process : Procedure_Ptr_T;
  Timeout : Duration) is
  begin
 select
Process.all;
 or
delay Timeout;
 end select;
  end;
   end;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Ed Schonberg  

* sem_ch9.adb (Analyze_Entry_Call_Alternative,
Check_Triggering_Statement): Reject properly an indirect call.

Index: sem_ch9.adb
===
--- sem_ch9.adb (revision 193215)
+++ sem_ch9.adb (working copy)
@@ -1470,6 +1470,15 @@
 
   Analyze (Call);
 
+  --  An indirect call in this context  is illegal. A procedure call that
+  --  does not involve a renaming of an entry is illegal as well, but this
+  --  and other semantic errors are caught during resolution.
+
+  if Nkind (Call) = N_Explicit_Dereference then
+ Error_Msg_N
+   ("entry call or dispatching primitive of interface required ", N);
+  end if;
+
   if Is_Non_Empty_List (Statements (N)) then
  Analyze_Statements (Statements (N));
   end if;
@@ -3304,6 +3313,11 @@
  ("dispatching operation of limited or synchronized " &
   "interface required (RM 9.7.2(3))!", Error_Node);
 end if;
+
+ elsif Nkind (Trigger) = N_Explicit_Dereference then
+Error_Msg_N
+  ("entry call or dispatching primitive of interface required ",
+Trigger);
  end if;
   end if;
end Check_Triggering_Statement;


[Ada] Directly emit binary representation of Vax float

2012-11-06 Thread Arnaud Charlet
Code generation for emitting a VAX float is improved: instead of calling a
runtime routine, the binary representation is emitted directly.
No functional change (and VMS-specific anyway).

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Tristan Gingold  

* fe.h (Get_Vax_Real_Literal_As_Signed): Declare.
* eval_fat.adb, eval_fat.ads (Decompose_Int): Move spec in package spec.
* exp_vfpt.adb, exp_vfpt.ads (Vax_Real_Literal_As_Signed): New function.
(Expand_Vax_Real_Literal): Remove.
* exp_ch2.adb (Expand_N_Real_Literal): Do nothing.
* sem_eval.adb (Expr_Value_R): Remove special Vax float case,
as this is not anymore a special case.

Index: fe.h
===
--- fe.h(revision 193215)
+++ fe.h(working copy)
@@ -156,6 +156,11 @@
 
 extern Boolean Is_Fully_Repped_Tagged_Type  (Entity_Id);
 
+/* exp_vfpt: */
+
+#define Get_Vax_Real_Literal_As_Signed exp_vfpt__get_vax_real_literal_as_signed
+extern Ureal Get_Vax_Real_Literal_As_Signed (Node_Id);
+
 /* lib: */
 
 #define Cunit  lib__cunit
Index: eval_fat.adb
===
--- eval_fat.adb(revision 193222)
+++ eval_fat.adb(working copy)
@@ -57,20 +57,6 @@
--  parts. The fraction is in the interval 1.0 / Radix .. T'Pred (1.0) and
--  uses Rbase = Radix. The result is rounded to a nearest machine number.
 
-   procedure Decompose_Int
- (RT   : R;
-  X: T;
-  Fraction : out UI;
-  Exponent : out UI;
-  Mode : Rounding_Mode);
-   --  This is similar to Decompose, except that the Fraction value returned
-   --  is an integer representing the value Fraction * Scale, where Scale is
-   --  the value (Machine_Radix_Value (RT) ** Machine_Mantissa_Value (RT)). The
-   --  value is obtained by using biased rounding (halfway cases round away
-   --  from zero), round to even, a floor operation or a ceiling operation
-   --  depending on the setting of Mode (see corresponding descriptions in
-   --  Urealp).
-
--
-- Adjacent --
--
Index: exp_vfpt.adb
===
--- exp_vfpt.adb(revision 193223)
+++ exp_vfpt.adb(working copy)
@@ -32,8 +32,8 @@
 with Sinfo;use Sinfo;
 with Stand;use Stand;
 with Tbuild;   use Tbuild;
-with Uintp;use Uintp;
 with Urealp;   use Urealp;
+with Eval_Fat; use Eval_Fat;
 
 package body Exp_VFpt is
 
@@ -76,9 +76,13 @@
--  ++
--  | fraction   |  A + 4
--  ++
-   --  | fraction   |  A + 6
+   --  | fraction (low) |  A + 6
--  ++
 
+   --  Note that the fraction bits are not continuous in memory. Bytes in a
+   --  words are stored using little endianness, but words are stored using
+   --  big endianness (PDP endian)
+
--  Like Float F but with 55 bits for the fraction.
 
--  Float G:
@@ -93,10 +97,10 @@
--  ++
--  | fraction   |  A + 4
--  ++
-   --  | fraction   |  A + 6
+   --  | fraction (low) |  A + 6
--  ++
 
-   --  Exponent values of 1 through 2047 indicate trye binary exponents of
+   --  Exponent values of 1 through 2047 indicate true binary exponents of
--  -1023 to +1023.
 
--  Main differences compared to IEEE 754:
@@ -553,94 +557,102 @@
   Analyze_And_Resolve (N, Typ, Suppress => All_Checks);
end Expand_Vax_Foreign_Return;
 
-   -
-   -- Expand_Vax_Real_Literal --
-   -
+   
+   -- Vax_Real_Literal_As_Signed --
+   
 
-   procedure Expand_Vax_Real_Literal (N : Node_Id) is
-  Loc  : constant Source_Ptr := Sloc (N);
-  Typ  : constant Entity_Id  := Etype (N);
-  Btyp : constant Entity_Id  := Base_Type (Typ);
-  Stat : constant Boolean:= Is_Static_Expression (N);
-  Nod  : Node_Id;
+   function Get_Vax_Real_Literal_As_Signed (N : Node_Id) return Uint is
+  Btyp : constant Entity_Id :=
+   Base_Type (Underlying_Type (Etype (N)));
 
-  RE_Source : RE_Id;
-  RE_Target : RE_Id;
-  RE_Fncall : RE_Id;
-  --  Entities for source, target and function call in conversion
+  Value: constant Ureal := Realval (N);
+  Negative : Boolean;
+  Fraction : UI;
+  Exponent : UI;
+  Res  : UI;
 
+  Exponent_Size : Uint;
+  --  Number of bits for the exponent
+
+  Fraction_Size : Uint;
+  --  Number of bits for the fraction
+
+  Uintp_Mark : constant Uintp.Save_Mark := Mark;
+  --  Use 

[Ada] Legality rules for pragma Unchecked_Union

2012-11-06 Thread Arnaud Charlet
GNAT implemented this pragma before it was part of the standard, and a
leftover of this early implementation is the requirement that all
discriminants of an unchecked_union type must have defaults.  This is not a
legality rule imposed by the Ada RM, and this patch removes it.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Ed Schonberg  

* sem_prag.adb (Analyze_Pragma, case Unchecked_Union): remove
requirement that discriminants of an unchecked_union must have
defaults.  Uncovered by discussion on LA12-042.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 193217)
+++ sem_prag.adb(working copy)
@@ -14495,7 +14495,6 @@
 Assoc   : constant Node_Id := Arg1;
 Type_Id : constant Node_Id := Get_Pragma_Arg (Assoc);
 Typ : Entity_Id;
-Discr   : Entity_Id;
 Tdef: Node_Id;
 Clist   : Node_Id;
 Vpart   : Node_Id;
@@ -14546,21 +14545,12 @@
 --  Note: in previous versions of GNAT we used to check for limited
 --  types and give an error, but in fact the standard does allow
 --  Unchecked_Union on limited types, so this check was removed.
+--  Similarly, GNAT used to require that all discriminants have
+--  default values, but this is not mandated by the RM.
 
 --  Proceed with basic error checks completed
 
 else
-   Discr := First_Discriminant (Typ);
-   while Present (Discr) loop
-  if No (Discriminant_Default_Value (Discr)) then
- Error_Msg_N
-   ("unchecked union discriminant must have default value",
-Discr);
-  end if;
-
-  Next_Discriminant (Discr);
-   end loop;
-
Tdef  := Type_Definition (Declaration_Node (Typ));
Clist := Component_List (Tdef);
 


[Ada] Support target with both VAX and IEEE float

2012-11-06 Thread Arnaud Charlet
This patch allows the Ada front end to properly support static
evaluation of both VAX and IEEE floating point attributes on a single target.
Previously we used a global setting from system.ads to determine whether a
floating point type supported denormals and signed zeros, but in order to
properly support static evaluation of VAX float literals, we need to
allow type-specific values.

On VMS, the following should compile quietly:

package vms is
   type f is digits 6;
   pragma Float_Representation (Vax_Float, f);

   subtype Truth is Boolean range True .. True;

   T : Truth := not F'Signed_Zeros;
end;
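
The type-specific predicates described above can be sketched as follows. This is an illustrative C++ model only: `FloatType` and `Target` are made-up stand-ins for the front end's entity and Targparm data, and the two functions merely mirror the logic of Has_Denormals and Has_Signed_Zeros as they appear in the patch.

```cpp
#include <cassert>

// Hypothetical stand-ins for the front-end data consulted by the patch.
struct FloatType {
    bool is_floating_point;  // models Is_Floating_Point_Type (E)
    bool is_vax_float;       // models Vax_Float (E)
};

struct Target {
    bool denorm_on_target;        // models Targparm.Denorm_On_Target
    bool signed_zeros_on_target;  // models Targparm.Signed_Zeros_On_Target
};

// A type has denormals only if the target supports them AND the type itself
// is not a VAX float; non floating-point types always yield false.
bool has_denormals(const FloatType &t, const Target &target)
{
    assert(!t.is_vax_float || t.is_floating_point);  // VAX floats are floats
    return t.is_floating_point && target.denorm_on_target && !t.is_vax_float;
}

// Same shape for signed zeros.
bool has_signed_zeros(const FloatType &t, const Target &target)
{
    assert(!t.is_vax_float || t.is_floating_point);
    return t.is_floating_point && target.signed_zeros_on_target
           && !t.is_vax_float;
}
```

On a target whose global flags are both set, an IEEE type answers true to both queries while a type declared with Vax_Float answers false, which is why the package above is expected to compile quietly on VMS.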

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Geert Bosch  

* eval_fat.adb (Machine, Succ): Fix front end to support static
evaluation of attributes on targets with both VAX and IEEE float.
* sem_util.ads, sem_util.adb (Has_Denormals, Has_Signed_Zeros):
New type-specific functions. Previously we used Denorm_On_Target
and Signed_Zeros_On_Target directly, but that doesn't work well
for OpenVMS where a single target supports both floating point
with and without signed zeros.
* sem_attr.adb (Attribute_Denorm, Attribute_Signed_Zeros): Use
new Has_Denormals and Has_Signed_Zeros functions to support both
IEEE and VAX floating point on a single target.

Index: eval_fat.adb
===
--- eval_fat.adb(revision 193224)
+++ eval_fat.adb(working copy)
@@ -25,7 +25,7 @@
 
 with Einfo;use Einfo;
 with Errout;   use Errout;
-with Targparm; use Targparm;
+with Sem_Util; use Sem_Util;
 
 package body Eval_Fat is
 
@@ -505,8 +505,8 @@
 Emin_Den : constant UI := Machine_Emin_Value (RT)
 - Machine_Mantissa_Value (RT) + Uint_1;
  begin
-if X_Exp < Emin_Den or not Denorm_On_Target then
-   if Signed_Zeros_On_Target and then UR_Is_Negative (X) then
+if X_Exp < Emin_Den or not Has_Denormals (RT) then
+   if Has_Signed_Zeros (RT) and then UR_Is_Negative (X) then
   Error_Msg_N
 ("floating-point value underflows to -0.0?", Enode);
   return Ureal_M_0;
@@ -517,7 +517,7 @@
   return Ureal_0;
end if;
 
-elsif Denorm_On_Target then
+elsif Has_Denormals (RT) then
 
--  Emin - Mant <= X_Exp < Emin, so result is denormal. Handle
--  gradual underflow by first computing the number of
@@ -718,7 +718,7 @@
   --  Set exponent such that the radix point will be directly following the
   --  mantissa after scaling.
 
-  if Denorm_On_Target or Exp /= Emin then
+  if Has_Denormals (RT) or Exp /= Emin then
  Exp := Exp - Mantissa;
   else
  Exp := Exp - 1;
Index: sem_util.adb
===
--- sem_util.adb(revision 193215)
+++ sem_util.adb(working copy)
@@ -5398,6 +5398,17 @@
   N_Package_Specification);
end Has_Declarations;
 
+   ---
+   -- Has_Denormals --
+   ---
+
+   function Has_Denormals (E : Entity_Id) return Boolean is
+   begin
+  return Is_Floating_Point_Type (E)
+and then Denorm_On_Target
+and then not Vax_Float (E);
+   end Has_Denormals;
+
---
-- Has_Discriminant_Dependent_Constraint --
---
@@ -6076,6 +6087,17 @@
   end if;
end Has_Private_Component;
 
+   --
+   -- Has_Signed_Zeros --
+   --
+
+   function Has_Signed_Zeros (E : Entity_Id) return Boolean is
+   begin
+  return Is_Floating_Point_Type (E)
+and then Signed_Zeros_On_Target
+and then not Vax_Float (E);
+   end Has_Signed_Zeros;
+
-
-- Has_Static_Array_Bounds --
-
Index: sem_util.ads
===
--- sem_util.ads(revision 193215)
+++ sem_util.ads(working copy)
@@ -674,6 +674,10 @@
function Has_Declarations (N : Node_Id) return Boolean;
--  Determines if the node can have declarations
 
+   function Has_Denormals (E : Entity_Id) return Boolean;
+   --  Determines if the floating-point type E supports denormal numbers.
+   --  Returns False if E is not a floating-point type.
+
function Has_Discriminant_Dependent_Constraint
  (Comp : Entity_Id) return Boolean;
--  Returns True if and only if Comp has a constrained subtype that depends
@@ -708,6 +712,10 @@
--  Check if a type has a (sub)component of a private type that has not
--  yet received a full declaration.
 
+   function Has_Signed_Zeros (E : Entity_Id) return Boolean;
+   --  Determines if the floati

[Ada] Attribute Loop_Entry

2012-11-06 Thread Arnaud Charlet
This patch provides the initial implementation of attribute Loop_Entry. This
attribute is intended for formal verification proofs.

The syntax of the attribute is as follows:

   Prefix'Loop_Entry (Target_Loop_Name)

The brief semantic rules for this attribute are:

The prefix must denote a non-limited object and the only attribute association
allowed must be a loop name. Attribute Loop_Entry can only appear inside pragma
Loop_Assertion.

For each prefix of a Loop_Entry attribute, a constant is implicitly declared at
the beginning of the associated loop statement. The constant's type is that of
the prefix. It is initialized to the result of evaluating the prefix as an
expression at the point of the constant declaration. The constant declaration
is not elaborated if the loop statement had a null iteration scheme.

The value of Prefix'Loop_Entry (Target_Loop_Name) is the value of the constant
and the type of Prefix'Loop_Entry (Target_Loop_Name) is that of the constant.

Example of usage:

procedure Main is
   Counter : Natural := 1;

begin
   Target : for Index in 1 .. 3 loop
  pragma Loop_Assertion
(Invariant => Counter'Loop_Entry (Target) = Counter);

  Counter := Counter + 1;
   end loop Target;
end Main;

In the above example, the value of Counter at the point of entry into the loop
is 1. As a result, the Loop_Assertion should fail on the second iteration.
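
The semantics can be mimicked in C++ to check the claim about the second iteration. The sketch below is only an analogy, not what the expander generates: `counter_at_entry` plays the role of the implicitly declared constant, and `first_failing_iteration` is a hypothetical name.

```cpp
#include <cassert>

// Models the Ada example: a constant snapshot of Counter is taken when the
// loop is entered, and the invariant compares it with the current value.
// Returns the 1-based iteration at which the invariant first fails, or 0 if
// it holds on every iteration.
int first_failing_iteration(int iterations)
{
    assert(iterations >= 0);

    int counter = 1;
    const int counter_at_entry = counter;  // stands in for Counter'Loop_Entry

    for (int index = 1; index <= iterations; ++index) {
        // pragma Loop_Assertion (Invariant => Counter'Loop_Entry = Counter)
        if (counter_at_entry != counter)
            return index;

        counter = counter + 1;
    }
    return 0;
}
```

For the three-iteration loop of the example, the check passes on iteration 1 (1 = 1) and fails on iteration 2 (1 /= 2).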

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-11-06  Hristian Kirtchev  

* einfo.adb: Include Loop_Entry_Attributes to the list of
Node/List/Elist10 usage.
(Loop_Entry_Attributes): New routine.
(Set_Loop_Entry_Attributes): New routine.
(Write_Field10_Name): Add an output string for Loop_Entry_Attributes.
* einfo.ads: Define new attribute Loop_Entry_Attributes along
with its usage in nodes.
(Loop_Entry_Attributes): New routine and dedicated pragma Inline.
(Set_Loop_Entry_Attributes): New routine and dedicated pragma Inline.
* exp_attr.adb (Expand_N_Attribute_Reference): Do not expand
Attribute_Loop_Entry here.
* exp_ch5.adb: Add with and use clause for Elists;
(Expand_Loop_Entry_Attributes): New routine.
(Expand_N_Loop_Statement): Add a call to Expand_Loop_Entry_Attributes.
* exp_prag.adb (Expand_Pragma_Loop_Assertion): Specialize the
search to include multiple nested loops produced by the expansion
of Ada 2012 array iterator.
* sem_attr.adb: Add with and use clause for Elists.
(Analyze_Attribute): Check the legality of attribute Loop_Entry.
(Resolve_Attribute): Nothing to do for Loop_Entry.
(S14_Attribute): New routine.
* snames.ads-tmpl: Add a comment on entries marked with
HiLite. Add new name Name_Loop_Entry. Add new attribute
Attribute_Loop_Entry.

Index: exp_ch5.adb
===
--- exp_ch5.adb (revision 193215)
+++ exp_ch5.adb (working copy)
@@ -28,6 +28,7 @@
 with Checks;   use Checks;
 with Debug;use Debug;
 with Einfo;use Einfo;
+with Elists;   use Elists;
 with Errout;   use Errout;
 with Exp_Aggr; use Exp_Aggr;
 with Exp_Ch6;  use Exp_Ch6;
@@ -110,6 +111,10 @@
procedure Expand_Iterator_Loop_Over_Array (N : Node_Id);
--  Expand loop over arrays that uses the form "for X of C"
 
+   procedure Expand_Loop_Entry_Attributes (N : Node_Id);
+   --  Given a loop statement subject to at least one Loop_Entry attribute,
+   --  expand both the loop and all related Loop_Entry references.
+
procedure Expand_Predicated_Loop (N : Node_Id);
--  Expand for loop over predicated subtype
 
@@ -1522,6 +1527,324 @@
   end;
end Expand_Assign_Record;
 
+   --
+   -- Expand_Loop_Entry_Attributes --
+   --
+
+   procedure Expand_Loop_Entry_Attributes (N : Node_Id) is
+  procedure Build_Conditional_Block
+(Loc  : Source_Ptr;
+ Cond : Node_Id;
+ Stmt : Node_Id;
+ If_Stmt  : out Node_Id;
+ Blk_Stmt : out Node_Id);
+  --  Create a block Blk_Stmt with an empty declarative list and a single
+  --  statement Stmt. The block is encased in an if statement If_Stmt with
+  --  condition Cond. If_Stmt is Empty when there is no condition provided.
+
+  function Is_Array_Iteration (N : Node_Id) return Boolean;
+  --  Determine whether loop statement N denotes an Ada 2012 iteration over
+  --  an array object.
+
+  -
+  -- Build_Conditional_Block --
+  -
+
+  procedure Build_Conditional_Block
+(Loc  : Source_Ptr;
+ Cond : Node_Id;
+ Stmt : Node_Id;
+ If_Stmt  : out Node_Id;
+ Blk_Stmt : out Node_Id)
+  is
+  begin
+ Blk_Stmt :=
+   Make_Block_Statement (Loc,
+ Declarations  

Re: [Patch, Fortran, OOP] PR 54917: [4.7/4.8 Regression] TRANSFER on polymorphic variable causes ICE

2012-11-06 Thread Janus Weil
Hi,

>> the attached patch implements support for polymorphic arguments to
>> TRANSFER. For details and discussion see the PR.
>>
>> Btw, as part of the PR is a regression in 4.7 and trunk, I would like
>> to backport the target-memory.c part and the first test case to the
>> 4.7 branch. Ok?
>
>
> Looks OK. Thanks for the patch. Nit:
>
> +  result->ts = result->ts.u.derived->components->ts;
>
> can be written as:
>
> +  result->ts = CLASS_DATA (result)->ts;
>
>
> which is shorter and clearer.

thanks for the review. I corrected the above and have committed the
patch to trunk as r193226.

Cheers,
Janus

PS: Upon re-reading your email it occurred to me that it might only
have been intended as an ok for the regression fix (and not the
runtime fixes). Was it meant like this?


Re: [PATCH] Vzeroupper placement/47440

2012-11-06 Thread Kirill Yukhin
Hello,
> OK for mainline SVN, please commit.
Checked into GCC trunk: http://gcc.gnu.org/ml/gcc-cvs/2012-11/msg00176.html

Thanks, K


Re: [SH] PR 54089 - Add support for rotcl instruction

2012-11-06 Thread Kaz Kojima
Oleg Endo  wrote:
> This patch adds support for SH's rotcl instruction.
> While working on it, I've noticed that the DImode left shift by one insn
> was not used anymore, and instead ended up as 'x + x'.  This
> transformation was happening before/during RTL expansion.  The fix for
> it was to adjust the costs for DImode plus/minus.
> 
> Tested on rev 193135 with
> make -k check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> 
> and no new failures.
> OK?

OK.

Regards,
kaz
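
As background to the quoted description: a DImode left shift by one on a 32-bit machine can be done as a shift of the low word followed by a rotate-through-carry of the high word, and the result equals 'x + x'. The sketch below models this in portable C++. The names are hypothetical and the `carry` variable only stands in for the SH T bit; it is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value held as two 32-bit halves, as on a 32-bit target.
struct Halves {
    std::uint32_t hi;
    std::uint32_t lo;
};

Halves split(std::uint64_t v)
{
    return Halves{static_cast<std::uint32_t>(v >> 32),
                  static_cast<std::uint32_t>(v)};
}

std::uint64_t join(Halves h)
{
    return (static_cast<std::uint64_t>(h.hi) << 32) | h.lo;
}

// Shift the 64-bit value left by one bit using only 32-bit operations.
Halves shift_left_by_one(Halves x)
{
    // Like a shift-left of the low word: the bit shifted out becomes the carry.
    const std::uint32_t carry = x.lo >> 31;

    Halves r;
    r.lo = x.lo << 1;
    // Like a rotate-left-through-carry of the high word: the carry enters at
    // bit 0 (the bit leaving at the top is simply dropped here).
    r.hi = (x.hi << 1) | carry;
    return r;
}

// True when the two-instruction style shift agrees with x + x (mod 2^64).
bool matches_addition(std::uint64_t x)
{
    return join(shift_left_by_one(split(x))) == x + x;
}
```

Both forms compute the same value, so which one the compiler picks comes down to the cost model, as the quoted message explains.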


[v3] A couple of small fixes

2012-11-06 Thread Paolo Carlini

Hi,

for issues I noticed while working on debug-mode for std::array: avoid 
the inclusion of  in two places (in v3 we never include it); 
fix the new check added by Florian vs -fno-exceptions.
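
The second fix can be sketched in isolation as follows. This is a simplified illustration, not the libsupc++ source: `size_overflows` and `report_alloc_failure` are hypothetical helpers, and the example relies on `__EXCEPTIONS` being predefined by GCC when exceptions are enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical helper: throw when exceptions are available, abort otherwise,
// which is the shape of the -fno-exceptions fix in the patch.
[[noreturn]] inline void report_alloc_failure()
{
#ifdef __EXCEPTIONS
    throw std::bad_alloc();
#else
    std::abort();
#endif
}

// True if element_count * element_size + padding_size does not fit in size_t.
bool size_overflows(std::size_t element_count, std::size_t element_size,
                    std::size_t padding_size)
{
    // The multiplication would wrap around.
    if (element_size && element_count > std::size_t(-1) / element_size)
        return true;

    // The addition of the padding would wrap around.
    const std::size_t size = element_count * element_size;
    return size + padding_size < size;
}

// Overflow-checked size computation in the spirit of compute_size.
std::size_t checked_size(std::size_t element_count, std::size_t element_size,
                         std::size_t padding_size)
{
    if (size_overflows(element_count, element_size, padding_size))
        report_alloc_failure();
    return element_count * element_size + padding_size;
}
```

Keeping the overflow predicate separate from the failure policy means the check itself can be exercised the same way whether or not exceptions are enabled.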


Tested x86_64-linux.

Thanks,
Paolo.

///
2012-11-06  Paolo Carlini  
  
* include/bits/atomic_base.h: Don't include , use nullptr.
* include/std/atomic: Likewise.
* include/tr2/dynamic_bitset: Likewise.

* libsupc++/vec.cc (compute_size(std::size_t, std::size_t,
std::size_t)): Fix for -fno-exceptions.
Index: include/bits/atomic_base.h
===
--- include/bits/atomic_base.h  (revision 193210)
+++ include/bits/atomic_base.h  (working copy)
@@ -1,6 +1,6 @@
 // -*- C++ -*- header.
 
-// Copyright (C) 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
+// Copyright (C) 2008-2012 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -35,7 +35,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
@@ -423,11 +422,11 @@
 
   bool
   is_lock_free() const noexcept
-  { return __atomic_is_lock_free(sizeof(_M_i), NULL); }
+  { return __atomic_is_lock_free(sizeof(_M_i), nullptr); }
 
   bool
   is_lock_free() const volatile noexcept
-  { return __atomic_is_lock_free(sizeof(_M_i), NULL); }
+  { return __atomic_is_lock_free(sizeof(_M_i), nullptr); }
 
   void
   store(__int_type __i, memory_order __m = memory_order_seq_cst) noexcept
@@ -717,11 +716,11 @@
 
   bool
   is_lock_free() const noexcept
-  { return __atomic_is_lock_free(_M_type_size(1), NULL); }
+  { return __atomic_is_lock_free(_M_type_size(1), nullptr); }
 
   bool
   is_lock_free() const volatile noexcept
-  { return __atomic_is_lock_free(_M_type_size(1), NULL); }
+  { return __atomic_is_lock_free(_M_type_size(1), nullptr); }
 
   void
   store(__pointer_type __p,
Index: include/std/atomic
===
--- include/std/atomic  (revision 193210)
+++ include/std/atomic  (working copy)
@@ -1,6 +1,6 @@
 // -*- C++ -*- header.
 
-// Copyright (C) 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
+// Copyright (C) 2008-2012 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -184,11 +184,11 @@
 
   bool
   is_lock_free() const noexcept
-  { return __atomic_is_lock_free(sizeof(_M_i), NULL); }
+  { return __atomic_is_lock_free(sizeof(_M_i), nullptr); }
 
   bool
   is_lock_free() const volatile noexcept
-  { return __atomic_is_lock_free(sizeof(_M_i), NULL); }
+  { return __atomic_is_lock_free(sizeof(_M_i), nullptr); }
 
   void
   store(_Tp __i, memory_order _m = memory_order_seq_cst) noexcept
Index: include/tr2/dynamic_bitset
===
--- include/tr2/dynamic_bitset  (revision 193210)
+++ include/tr2/dynamic_bitset  (working copy)
@@ -33,7 +33,6 @@
 
 #include 
 #include 
-#include  // For size_t
 #include 
 #include  // For std::allocator
 #include// For invalid_argument, out_of_range,
Index: libsupc++/vec.cc
===
--- libsupc++/vec.cc(revision 193210)
+++ libsupc++/vec.cc(working copy)
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "unwind-cxx.h"
 
@@ -65,10 +66,18 @@
 std::size_t padding_size)
 {
   if (element_size && element_count > std::size_t(-1) / element_size)
+#ifdef __EXCEPTIONS
throw std::bad_alloc();
+#else
+std::abort();
+#endif
   std::size_t size = element_count * element_size;
   if (size + padding_size < size)
+#ifdef __EXCEPTIONS
throw std::bad_alloc();
+#else
+std::abort();
+#endif
   return size + padding_size;
 }
   }


[PATCH] Use propagate_threaded_block_debug_into even in loop header copying pass (PR debug/54693)

2012-11-06 Thread Jakub Jelinek
Hi!

This patch fixes
-FAIL: gcc.dg/guality/pr54693-2.c  -O1  line 21 i == v + 1
-FAIL: gcc.dg/guality/pr54693-2.c  -O2  line 21 i == v + 1
-FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer  line 21 i == v + 1
-FAIL: gcc.dg/guality/pr54693-2.c  -O3 -g  line 21 i == v + 1
-FAIL: gcc.dg/guality/pr54693.c  -O1  line 22 i == c - 48
on both x86_64-linux and i686-linux (and the x/y/z tests in the new testcase
from UNSUPPORTED to PASS) by copying the debug stmt in ch pass similarly to
how jump threading does that.

Ok for trunk?

2012-11-06  Jakub Jelinek  

PR debug/54693
* tree-flow.h (propagate_threaded_block_debug_into): New prototype.
* tree-ssa-threadedge.c (propagate_threaded_block_debug_into): No
longer static.
* tree-ssa-loop-ch.c (copy_loop_headers): Use it.

* gcc.dg/guality/pr54693-2.c: New test.

--- gcc/tree-flow.h.jj  2012-10-30 18:48:57.0 +0100
+++ gcc/tree-flow.h 2012-11-06 11:10:44.996516737 +0100
@@ -1,6 +1,6 @@
 /* Data and Control Flow Analysis for Trees.
-   Copyright (C) 2001, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
-   Free Software Foundation, Inc.
+   Copyright (C) 2001, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
+   2012 Free Software Foundation, Inc.
Contributed by Diego Novillo 
 
 This file is part of GCC.
@@ -689,6 +689,7 @@ extern void set_ssa_name_value (tree, tr
 extern bool potentially_threadable_block (basic_block);
 extern void thread_across_edge (gimple, edge, bool,
VEC(tree, heap) **, tree (*) (gimple, gimple));
+extern void propagate_threaded_block_debug_into (basic_block, basic_block);
 
 /* In tree-ssa-loop-im.c  */
 /* The possibilities of statement movement.  */
--- gcc/tree-ssa-threadedge.c.jj2012-11-05 08:55:21.0 +0100
+++ gcc/tree-ssa-threadedge.c   2012-11-06 11:10:35.694570819 +0100
@@ -1,5 +1,5 @@
 /* SSA Jump Threading
-   Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011
+   Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012
Free Software Foundation, Inc.
Contributed by Jeff Law  
 
@@ -613,7 +613,7 @@ cond_arg_set_in_bb (edge e, basic_block
 /* Copy debug stmts from DEST's chain of single predecessors up to
SRC, so that we don't lose the bindings as PHI nodes are introduced
when DEST gains new predecessors.  */
-static void
+void
 propagate_threaded_block_debug_into (basic_block dest, basic_block src)
 {
   if (!MAY_HAVE_DEBUG_STMTS)
--- gcc/tree-ssa-loop-ch.c.jj   2012-11-01 09:33:25.0 +0100
+++ gcc/tree-ssa-loop-ch.c  2012-11-06 11:11:58.176089941 +0100
@@ -1,5 +1,5 @@
 /* Loop header copying on trees.
-   Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010
+   Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2012
Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -197,6 +197,7 @@ copy_loop_headers (void)
 
   entry = loop_preheader_edge (loop);
 
+  propagate_threaded_block_debug_into (exit->dest, entry->dest);
   if (!gimple_duplicate_sese_region (entry, exit, bbs, n_bbs, copied_bbs))
{
  fprintf (dump_file, "Duplication failed.\n");
--- gcc/testsuite/gcc.dg/guality/pr54693-2.c.jj 2012-11-06 11:13:07.141687111 +0100
+++ gcc/testsuite/gcc.dg/guality/pr54693-2.c 2012-11-06 11:33:50.640589407 +0100
@@ -0,0 +1,33 @@
+/* PR debug/54693 */
+/* { dg-do run } */
+/* { dg-options "-g" } */
+
+int v;
+
+__attribute__((noinline, noclone)) void
+bar (int i)
+{
+  v = i;
+  asm volatile ("" : : "r" (i) : "memory");
+}
+
+__attribute__((noinline, noclone)) void
+foo (int x, int y, int z)
+{
+  int i = 0;
+  while (x > 3 && y > 3 && z > 3)
+{  /* { dg-final { gdb-test 21 "i" "v + 1" } } */
+   /* { dg-final { gdb-test 21 "x" "10 - i" } } */
+  bar (i); /* { dg-final { gdb-test 21 "y" "20 - 2 * i" } } */
+   /* { dg-final { gdb-test 21 "z" "30 - 3 * i" } } */
+  i++, x--, y -= 2, z -= 3;
+}
+}
+
+int
+main ()
+{
+  v = -1;
+  foo (10, 20, 30);
+  return 0;
+}

Jakub


GCC 4.8.0 Status Report (2012-11-06), Stage 1 is over, Stage 3 in effect immediately

2012-11-06 Thread Jakub Jelinek
Status
======

The GCC trunk is now in stage3.  Patches submitted during stage1
may still be accepted if they don't need significant rewrites,
but please try to get them in soon.  There are a lot of them outstanding,
so please also help review them.  Otherwise only bugfixes
and documentation fixes are allowed on the trunk.
If all goes well, stage3 will, as last time, go on for roughly
two months, with the aim of releasing between the end of
March and mid April.

We have accumulated quite a lot of bugs during the 8-month-long
stage 1; please help with analysing and fixing them.

Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1               23
P2               77
P3               91   +  6
--------        ---   -----------------------
Total           191   +  6

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2012-10/msg00434.html

The next report will be sent by Richard.


Re: RFC: PATCH to add abi_tag attribute

2012-11-06 Thread Jakub Jelinek
On Mon, Nov 05, 2012 at 11:03:37PM -0500, Jason Merrill wrote:
> As discussed at the Cauldron in Prague, this patch introduces a C++
> abi_tag attribute which can be attached to a function or class to
> modify its mangled name and avoid name collisions with earlier
> versions with a different ABI.  It also adds a -Wabi-tag warning
> option to make the compiler suggest adding ABI tags to classes with
> subobjects that have tags.

Couldn't there be auto-propagation at least for classes that aren't forward
declared first?

Also, perhaps the documentation should reserve some names for the
implementation or for uses compatible with it (say, names starting with an
underscore or whatever), so that we could use abi tag names in libstdc++
without fear that they are already used by others for something else.
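
For readers who have not seen the attribute under discussion, here is a minimal sketch of how it is spelled in GCC releases that provide it. The tag string "v2" and the names `Widget`, `make_widget` and `answer` are made up for illustration, and the example assumes a compiler that understands the GNU `abi_tag` attribute (GCC, or Clang in GNU mode).

```cpp
#include <cassert>

// A class carrying an ABI tag: the tag becomes part of the mangled name of
// the class, so two incompatible layouts can coexist under different tags.
struct __attribute__((abi_tag("v2"))) Widget {
    int value;
};

// A function carrying the same tag explicitly.
__attribute__((abi_tag("v2"))) int answer()
{
    return 42;
}

// An ordinary function that returns the tagged type by value.
Widget make_widget(int v)
{
    Widget w;
    w.value = v;
    return w;
}
```

The attribute changes only symbol names, not behaviour, so the effect is visible in the object file's symbol table (for example with nm) rather than at run time.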

Jakub


Re: [Ada] clean ups in Makefiles

2012-11-06 Thread Arnaud Charlet
The previous patch is further cleaned up by removing the osconstool
target in libada/Makefile.in which was hard to maintain.

Tested on x86_64-pc-linux-gnu, committed on trunk.

libada/ 
* Makefile.in (osconstool): Removed, no longer needed.

ada/
* gcc-interface/Makefile.in: Improve handling of s-oscons.ads.

--
Index: libada/Makefile.in
===
--- libada/Makefile.in  (revision 193215)
+++ libada/Makefile.in  (working copy)
@@ -94,7 +94,7 @@ LIBADA_FLAGS_TO_PASS = \
 .PHONY: gnatlib gnatlib-plain gnatlib-sjlj gnatlib-zcx gnatlib-shared osconstool
 gnatlib: @default_gnatlib_target@
 
-gnatlib-plain: osconstool $(GCC_DIR)/ada/Makefile
+gnatlib-plain: $(GCC_DIR)/ada/Makefile
test -f stamp-libada || \
$(MAKE) -C $(GCC_DIR)/ada $(LIBADA_FLAGS_TO_PASS) gnatlib \
&& touch stamp-libada
@@ -103,7 +103,7 @@ gnatlib-plain: osconstool $(GCC_DIR)/ada
$(LN_S) $(ADA_RTS_DIR) adainclude
$(LN_S) $(ADA_RTS_DIR) adalib
 
-gnatlib-sjlj gnatlib-zcx gnatlib-shared: osconstool $(GCC_DIR)/ada/Makefile
+gnatlib-sjlj gnatlib-zcx gnatlib-shared: $(GCC_DIR)/ada/Makefile
test -f stamp-libada || \
$(MAKE) -C $(GCC_DIR)/ada $(LIBADA_FLAGS_TO_PASS) $@ \
&& touch stamp-libada
@@ -112,9 +112,6 @@ gnatlib-sjlj gnatlib-zcx gnatlib-shared:
$(LN_S) $(ADA_RTS_DIR) adainclude
$(LN_S) $(ADA_RTS_DIR) adalib
 
-osconstool:
-   $(MAKE) -C $(GCC_DIR) $(LIBADA_FLAGS_TO_PASS) ada/s-oscons.ads
-
 install-gnatlib: $(GCC_DIR)/ada/Makefile
$(MAKE) -C $(GCC_DIR)/ada $(LIBADA_FLAGS_TO_PASS) install-gnatlib
 
Index: gcc-interface/Makefile.in
===
--- gcc-interface/Makefile.in   (revision 193215)
+++ gcc-interface/Makefile.in   (working copy)
@@ -2577,13 +2604,14 @@
$(RTSDIR)/$(word 1,$(subst <, ,$(PAIR)));)
 # Copy tsystem.h
$(CP) $(srcdir)/tsystem.h $(RTSDIR)
-# Copy generated target dependent sources
-   $(RM) $(RTSDIR)/s-oscons.ads
-   (cd $(RTSDIR); $(LN_S) ../s-oscons.ads s-oscons.ads)
$(RM) ../stamp-gnatlib-$(RTSDIR)
touch ../stamp-gnatlib1-$(RTSDIR)
 
-gnatlib: ../stamp-gnatlib1-$(RTSDIR) ../stamp-gnatlib2-$(RTSDIR)
+$(RTSDIR)/s-oscons.ads: ../stamp-gnatlib1-$(RTSDIR)
+   $(RM) $(RTSDIR)/s-oscons.ads
+   (cd $(RTSDIR); $(LN_S) ../s-oscons.ads s-oscons.ads)
+
+gnatlib: ../stamp-gnatlib1-$(RTSDIR) ../stamp-gnatlib2-$(RTSDIR) $(RTSDIR)/s-oscons.ads
 # C files
$(MAKE) -C $(RTSDIR) \
CC="`echo \"$(GCC_FOR_TARGET)\" \


[Ada] New port: arm android

2012-11-06 Thread Arnaud Charlet
This change adds the necessary runtime configuration to build an Ada
runtime for Android/ARM.

Tested manually on android, committed on trunk.

2012-11-06  Arnaud Charlet  

* gcc-interface/Makefile.in: Add runtime pairs for Android.
* s-osinte-android.ads, s-osinte-android.adb: New files.

Index: s-osinte-android.ads
===
--- s-osinte-android.ads(revision 0)
+++ s-osinte-android.ads(revision 0)
@@ -0,0 +1,643 @@
+--
+--  --
+-- GNAT RUN-TIME LIBRARY (GNARL) COMPONENTS --
+--  --
+--   S Y S T E M . O S _ I N T E R F A C E  --
+--  --
+--  S p e c --
+--  --
+--  Copyright (C) 1995-2012, Free Software Foundation, Inc. --
+--  --
+-- GNAT is free software;  you can  redistribute it  and/or modify it under --
+-- terms of the  GNU General Public License as published  by the Free Soft- --
+-- ware  Foundation;  either version 3,  or (at your option) any later ver- --
+-- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
+-- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
+-- or FITNESS FOR A PARTICULAR PURPOSE. --
+--  --
+-- As a special exception under Section 7 of GPL version 3, you are granted --
+-- additional permissions described in the GCC Runtime Library Exception,   --
+-- version 3.1, as published by the Free Software Foundation.   --
+--  --
+-- You should have received a copy of the GNU General Public License and--
+-- a copy of the GCC Runtime Library Exception along with this program; --
+-- see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see--
+-- .  --
+--  --
+-- GNARL was developed by the GNARL team at Florida State University.   --
+-- Extensive contributions were provided by Ada Core Technologies, Inc. --
+--  --
+--
+
+--  This is an Android version of this package which is based on the
+--  GNU/Linux version
+
+--  This package encapsulates all direct interfaces to OS services
+--  that are needed by the tasking run-time (libgnarl).
+
+--  PLEASE DO NOT add any with-clauses to this package or remove the pragma
+--  Preelaborate. This package is designed to be a bottom-level (leaf) package.
+
+with Ada.Unchecked_Conversion;
+with Interfaces.C;
+with System.Linux;
+with System.OS_Constants;
+
+package System.OS_Interface is
+   pragma Preelaborate;
+
+   subtype intis Interfaces.C.int;
+   subtype char   is Interfaces.C.char;
+   subtype short  is Interfaces.C.short;
+   subtype long   is Interfaces.C.long;
+   subtype unsigned   is Interfaces.C.unsigned;
+   subtype unsigned_short is Interfaces.C.unsigned_short;
+   subtype unsigned_long  is Interfaces.C.unsigned_long;
+   subtype unsigned_char  is Interfaces.C.unsigned_char;
+   subtype plain_char is Interfaces.C.plain_char;
+   subtype size_t is Interfaces.C.size_t;
+
+   ---
+   -- Errno --
+   ---
+
+   function errno return int;
+   pragma Import (C, errno, "__get_errno");
+
+   EAGAIN: constant := System.Linux.EAGAIN;
+   EINTR : constant := System.Linux.EINTR;
+   EINVAL: constant := System.Linux.EINVAL;
+   ENOMEM: constant := System.Linux.ENOMEM;
+   EPERM : constant := System.Linux.EPERM;
+   ETIMEDOUT : constant := System.Linux.ETIMEDOUT;
+
+   -------------
+   -- Signals --
+   -------------
+
+   Max_Interrupt : constant := 63;
+   type Signal is new int range 0 .. Max_Interrupt;
+   for Signal'Size use int'Size;
+
+   SIGHUP : constant := System.Linux.SIGHUP;
+   SIGINT : constant := System.Linux.SIGINT;
+   SIGQUIT: constant := System.Linux.SIGQUIT;
+   SIGILL : constant := System.Linux.SIGILL;
+   SIGTRAP: constant := System.Linux.SIGTRAP;
+   SIGIOT : constant := System.Linux.SIGIOT;
+   SIGABRT: constant := System.Linux.SIGABRT;
+   SIGFPE : constant := System.Linux.SIGFPE;
+ 

Re: [PATCH 1/7] s390: Constraints, predicates, and op letters for contiguous bitmasks

2012-11-06 Thread Andreas Krebbel
Hi,

thanks for your patch. I've refreshed it to the latest revision and
have added patterns for risbgn (risbg without clobbering CC) which has
been added with zEC12.

I've tested the patch on s390x with -march=z196.  I think it is safe
for EC12 as well. However I'll run some tests there later on.
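
For readers following along, the "contiguous bitmask" property that the new
constraints and predicate test can be sketched in plain C.  This is only an
illustration under stated assumptions: is_contiguous_bitmask is a
hypothetical helper, not GCC's s390_contiguous_bitmask_p, and it assumes
that only the low WIDTH bits matter and that a run wrapping around the word
is not accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not GCC's s390_contiguous_bitmask_p: return true
   if the low WIDTH bits of VALUE contain exactly one unbroken run of
   1 bits.  Bits above WIDTH are ignored (an assumption of this sketch),
   and a run that wraps around the word is not accepted.  */
bool
is_contiguous_bitmask (uint64_t value, int width)
{
  assert (width >= 1 && width <= 64);

  /* Keep only the bits that belong to the mode being tested.  */
  if (width < 64)
    value &= (UINT64_C (1) << width) - 1;

  /* An empty mask has no run at all.  */
  if (value == 0)
    return false;

  /* Adding the lowest set bit carries through the lowest run of ones
     and clears it.  If any bit of VALUE survives the addition, there
     was a second, separate run.  */
  uint64_t lowest = value & (~value + 1);
  return ((value + lowest) & value) == 0;
}
```

With this sketch, 0x0ff0 is accepted for either width, while 0x0f0f is
rejected because it contains two separate runs.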

Feel free to apply.

Bye,

-Andreas-

 gcc/config/s390/constraints.md |   11 ++!!
 gcc/config/s390/predicates.md  |6 ++
 gcc/config/s390/s390.c |   92 ++-!
 gcc/config/s390/s390.md|   74 +-!
 4 files changed, 21 insertions(+), 5 deletions(-), 157 modifications(!)

Index: gcc/config/s390/constraints.md
===
*** gcc/config/s390/constraints.md.orig
--- gcc/config/s390/constraints.md
***
*** 45,50 
--- 45,52 
  ;; H,Q: mode of the part
  ;; D,S,H:   mode of the containing operand
  ;; 0,F: value of the other parts (F - all bits set)
+ ;; --
+ ;; xx[DS]q  satisfies s390_contiguous_bitmask_p for DImode or SImode
  ;;
  ;; The constraint matches if the specified part of a constant
  ;; has a value different from its other parts.  If the letter x
***
*** 330,337 
(and (match_code "const_int")
 (match_test "s390_N_constraint_str (\"xQH0\", ival)")))
  
  
! 
  
  ;;
  ;; Double-letter constraints starting with O follow.
--- 332,346 
(and (match_code "const_int")
 (match_test "s390_N_constraint_str (\"xQH0\", ival)")))
  
+ (define_constraint "NxxDq"
+   "@internal"
+   (and (match_code "const_int")
+(match_test "s390_contiguous_bitmask_p (ival, 64, NULL, NULL)")))
  
! (define_constraint "NxxSq"
!   "@internal"
!   (and (match_code "const_int")
!(match_test "s390_contiguous_bitmask_p (ival, 32, NULL, NULL)")))
  
  ;;
  ;; Double-letter constraints starting with O follow.
Index: gcc/config/s390/predicates.md
===
*** gcc/config/s390/predicates.md.orig
--- gcc/config/s390/predicates.md
***
*** 154,159 
--- 154,165 
return false;
  })
  
+ (define_predicate "contiguous_bitmask_operand"
+   (match_code "const_int")
+ {
+   return s390_contiguous_bitmask_p (INTVAL (op), GET_MODE_BITSIZE (mode), NULL, NULL);
+ })
+ 
  ;; operators --
  
  ;; Return nonzero if OP is a valid comparison operator
Index: gcc/config/s390/s390.c
===
*** gcc/config/s390/s390.c.orig
--- gcc/config/s390/s390.c
*** print_operand_address (FILE *file, rtx a
*** 5361,5388 
  'C': print opcode suffix for branch condition.
  'D': print opcode suffix for inverse branch condition.
  'E': print opcode suffix for branch on index instruction.
- 'J': print tls_load/tls_gdcall/tls_ldcall suffix
  'G': print the size of the operand in bytes.
  'O': print only the displacement of a memory reference.
  'R': print only the base register of a memory reference.
  'S': print S-type memory reference (base+displacement).
- 'N': print the second word of a DImode operand.
- 'M': print the second word of a TImode operand.
  'Y': print shift count operand.
  
  'b': print integer X as if it's an unsigned byte.
  'c': print integer X as if it's an signed byte.
! 'x': print integer X as if it's an unsigned halfword.
  'h': print integer X as if it's a signed halfword.
  'i': print the first nonzero HImode part of X.
  'j': print the first HImode part unequal to -1 of X.
  'k': print the first nonzero SImode part of X.
  'm': print the first SImode part unequal to -1 of X.
! 'o': print integer X as if it's an unsigned 32bit word.  */
  
  void
  print_operand (FILE *file, rtx x, int code)
  {
switch (code)
  {
  case 'C':
--- 5361,5395 
  'C': print opcode suffix for branch condition.
  'D': print opcode suffix for inverse branch condition.
  'E': print opcode suffix for branch on index instruction.
  'G': print the size of the operand in bytes.
+ 'J': print tls_load/tls_gdcall/tls_ldcall suffix
+ 'M': print the second word of a TImode operand.
+ 'N': print the second word of a DImode operand.
  'O': print only the displacement of a memory reference.
  'R': print only the base register of a memory reference.
  'S': print S-type memory reference (base+displacement).
  'Y': print shift count operand.
  
  'b': print integer X as if it's an unsigned byte.
  'c': print integer X as if it's an signed byte.
! 'e': "end" of DImode contiguous bitmask X.
! 'f': "end" of SImode contiguous bitmask X.
  'h': print integer X as if it's a signed halfword.
  'i': print the first nonzero HImode part of X.
  'j': print the 

Re: [PATCH 2/7] s390: Only use lhs zero_extract in word_mode

2012-11-06 Thread Andreas Krebbel
Hi,

I had to remove the insv pattern changes from that patch.  I
understand that you simplified the patterns since the generic RTL
expander only generates word mode zero extracts. However, we still
need the SImode variant for atomic operations so we cannot remove it.

The insv_z10 definition can still be simplified after adding the
bitsize mode attribute with one of your later patches so that change
will be moved to a later patch.

No regressions with that patch on s390x with -march=z196.

Feel free to apply.

Thanks!

Bye,

-Andreas-


 gcc/config/s390/s390.md |   17 !
 1 file changed, 17 modifications(!)

Index: gcc/config/s390/s390.md
===
*** gcc/config/s390/s390.md.orig
--- gcc/config/s390/s390.md
***
*** 3525,3539 
[(set_attr "op_type" "RIL")
 (set_attr "z10prop" "z10_fwd_E1")])
  
! ; Update the right-most 32 bit of a DI, or the whole of a SI.
! (define_insn "*insv_l_reg_extimm"
!   [(set (zero_extract:P (match_operand:P 0 "register_operand" "+d")
!   (const_int 32)
!   (match_operand 1 "const_int_operand" "n"))
!   (match_operand:P 2 "const_int_operand" "n"))]
!   "TARGET_EXTIMM
!&& BITS_PER_WORD - INTVAL (operands[1]) == 32"
!   "iilf\t%0,%o2"
[(set_attr "op_type" "RIL")
 (set_attr "z10prop" "z10_fwd_A1")])
  
--- 3525,3538 
[(set_attr "op_type" "RIL")
 (set_attr "z10prop" "z10_fwd_E1")])
  
! ; Update the right-most 32 bit of a DI.
! (define_insn "*insv_l_di_reg_extimm"
!   [(set (zero_extract:DI (match_operand:DI 0 "register_operand" "+d")
!(const_int 32)
!(const_int 32))
!   (match_operand:DI 1 "const_int_operand" "n"))]
!   "TARGET_EXTIMM"
!   "iilf\t%0,%o1"
[(set_attr "op_type" "RIL")
 (set_attr "z10prop" "z10_fwd_A1")])
  



RE: [PATCH Version 2][RFA]Improving register pressure directed hoist

2012-11-06 Thread Bin Cheng


> -----Original Message-----
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Tuesday, November 06, 2012 4:51 AM
> To: Bin Cheng
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH Version 2][RFA]Improving register pressure directed hoist
> 
> On 11/02/2012 02:34 AM, Bin Cheng wrote:
> >>
> > Also I don't understand why the bogus patch can catch more hoist
> > opportunities and improve code size, so please help if you have any
> > idea about this.
> Well, perturbing the code, particularly in a way which is supposed to change
> the amount of register pressure is certainly going to affect the allocators
> and reload.
> 
> It shouldn't be that hard to look at results which differ between the two
> patches and analyze why they're different.
I will try to investigate this issue.

> 
> >
> > 2012-11-02  Bin Cheng
> >
> > * gcse.c: (struct bb_data): Add new fields, old_pressure, live_in
> > and backup.
> > (calculate_bb_reg_pressure): Initialize live_in and backup.
> > (update_bb_reg_pressure): New.
> > (should_hoist_expr_to_dom): Add new parameter from.
> > Monitor the change of reg pressure and use it to drive hoisting.
> > (hoist_code): Update LIVE and reg pressure information.
> >
> > gcc/testsuite/ChangeLog
> > 2012-11-02  Bin Cheng
> >
> > * gcc.dg/hoist-register-pressure-3.c: New test.
> >
> >
> > +/* Update register pressure for BB when hoisting an expression from
> > +   instruction FROM, if live ranges of inputs are shrunk.  Also
> > +   maintain live_in information if live range of register referred
> > +   in FROM is shrunk.
> > +
> > +   Return 0 if register pressure doesn't change, otherwise return
> > +   the number by which register pressure is decreased.
> > +
> > +   NOTE: Register pressure won't be increased in this function.  */
> > +
> > +static int
> > +update_bb_reg_pressure (basic_block bb, rtx from,
> > +   enum reg_class pressure_class, int nregs) {
> > +  rtx dreg, insn;
> > +  basic_block succ_bb;
> > +  df_ref *op, op_ref;
> > +  edge succ;
> > +  edge_iterator ei;
> > +  int decreased_pressure = 0;
> > +
> > +  for (op = DF_INSN_USES (from); *op; op++)
> > +{
> > +  dreg = DF_REF_REAL_REG (*op);
> > +  /* The live range of register is shrunk only if it isn't:
> > +1. referred on any path from the end of this block to EXIT, or
> > +2. referred by insns other than FROM in this block.  */
> > +  FOR_EACH_EDGE (succ, ei, bb->succs)
> > +   {
> > + succ_bb = succ->dest;
> > + if (succ_bb == EXIT_BLOCK_PTR)
> > +   continue;
> > +
> > + if (bitmap_bit_p (BB_DATA (succ_bb)->live_in, REGNO (dreg)))
> > +   break;
> > +   }
> > +  if (succ != NULL)
> > +   continue;
> > +
> > +  op_ref = DF_REG_USE_CHAIN (REGNO (dreg));
> > +  for (; op_ref; op_ref = DF_REF_NEXT_REG (op_ref))
> > +   {
> > + if (!DF_REF_INSN_INFO (op_ref))
> > +   continue;
> > +
> > + insn = DF_REF_INSN (op_ref);
> > + if (BLOCK_FOR_INSN (insn) == bb
> > + && NONDEBUG_INSN_P (insn) && insn != from)
> > +   break;
> > +   }
> > +
> > +  /* Decrease register pressure and update live_in information for
> > +this block.  */
> > +  if (!op_ref)
> > +   {
> > + decreased_pressure += nregs;
> > + BB_DATA (bb)->max_reg_pressure[pressure_class] -= nregs;
> > + bitmap_clear_bit (BB_DATA (bb)->live_in, REGNO (dreg));
> > +   }
> > +}
> > +  return decreased_pressure;
> So we're looking to see if any of the registers used in FROM are used after
> FROM.  If none are used, then we decrease the register pressure by nregs, which
> appears to be a property of the registers *set* in FROM.
> It seems like there's some inconsistency here.  Or have I misunderstood
> something?
> 
> I'm not sure how much it matters in practice, except perhaps for conversions
> and the like where the source and destination operands are different modes.

Agreed, I missed the inconsistency of register class/number between input
and output. I will fix this issue and measure the effect once I get back to
the office.

Thanks for your comments.






Re: [PATCH] Make IPA-CP work on aggregates

2012-11-06 Thread Jakub Jelinek
On Tue, Nov 06, 2012 at 12:58:07AM +0100, Martin Jambor wrote:
> 2012-11-05  Martin Jambor  
> 
>   PR tree-optimization/53787
>   * ipa-cp.c (ipcp_value_source): New field offset.
...

Is this supposed to do something about Fortran array descriptors
where some fields in the descriptors have known constant values in the
caller?

Say
subroutine bar (a, b, n)
  integer :: a(n), b(n)
  call foo (a, b)
contains
subroutine foo (a, b)
  integer :: a(:), b(:)
  a = b
end subroutine
end
-O2 -fno-inline (there could be thousands of better testcases though, this
one doesn't look at too many fields).
With your patch
foo.1899.constprop.0 is created, but I don't see any immediate other
effects.  Certainly e.g.
  _2 = a_1(D)->dim[0].stride;
  if (_2 != 0)
remains till *.optimized dump, even when in the caller it is set to 1.
I guess for Fortran being able to optimize on constant (or even better
constant one) stride would be very worthwhile.

Jakub


Fix loop bounds computed by vectorizer

2012-11-06 Thread Jan Hubicka
Hi,
three of the remaining false positives of -Warray-bounds in the -O3 bootstrap
turned out to be a bug in the vectorizer.  It sets the loop bound on the
prologue/epilogue to be vectorization_factor - 1.
This is not correct: when vectorization_factor is 2, the epilogue/prologue
never loops, so the number of iterations should be 0.  Fixing this, however,
led to wrong code that turned out to be a bug in vect_do_peeling_for_loop_bound.
With PEELING_FOR_GAPS we actually skip the last iteration of the vectorized
loop and thus the bound is vectorization_factor * 2 - 2.
With that fixed, I also made the vectorizer properly update the loop bound
on the vectorized loop so it can be fully unrolled afterwards.
This triggers on the testcase I derived from ira.c, where the loop iterates 27
times before vectorizing, while after vectorizing it can be fully unrolled
(our bound is 16), leading to a quite nice code sequence.
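
The bound arithmetic described above can be restated as a small standalone
C sketch.  The helper names below are hypothetical and do not exist in GCC;
they only mirror the formulas used in the patch (N - 2 or 2N - 2 loopback
edge executions for the epilogue, and a floor division by the vectorization
factor for the vectorized loop, minus one when peeling for gaps).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers that mirror the arithmetic in the patch; these
   names do not exist in GCC.  */

/* Upper bound on the number of loopback edge executions of the epilogue
   loop for vectorization factor VF.  Without peeling for gaps the
   epilogue handles at most VF - 1 values, i.e. VF - 2 loopbacks; with
   peeling for gaps the patch uses VF * 2 - 2.  */
uint64_t
epilogue_max_latch_executions (uint64_t vf, bool peeling_for_gaps)
{
  assert (vf >= 2);
  return (peeling_for_gaps ? vf * 2 : vf) - 2;
}

/* Upper bound for the vectorized loop, derived from the bound of the
   scalar loop: floor-divide by VF, then drop one more iteration when
   peeling for gaps (unless the bound is already zero).  */
uint64_t
vectorized_upper_bound (uint64_t scalar_bound, uint64_t vf,
                        bool peeling_for_gaps)
{
  assert (vf >= 1);
  uint64_t bound = scalar_bound / vf;
  if (peeling_for_gaps && bound != 0)
    bound -= 1;
  return bound;
}
```

For vectorization_factor 2 without peeling for gaps the first helper
returns 0, which matches the observation that the epilogue never loops.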

Bootstrapped/regtested x86_64-linux, committed as obvious.

Honza

* gcc.target/i386/l_fma_float_?.c: Update.
* gcc.target/i386/l_fma_double_?.c: Update.
* tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound,
vect_do_peeling_for_alignment): Fix loop bound computation.
* tree-vect-loop.c (vect_transform_loop): Maintain loop bounds.

Index: tree-vect-loop.c
===
*** tree-vect-loop.c(revision 193174)
--- tree-vect-loop.c(working copy)
*** vect_transform_loop (loop_vec_info loop_
*** 5448,5457 
--- 5448,5463 
bool transform_pattern_stmt = false;
bool check_profitability = false;
int th;
+   /* Record number of iterations before we started tampering with the profile. */
+   gcov_type expected_iterations = expected_loop_iterations_unbounded (loop);
  
if (dump_enabled_p ())
  dump_printf_loc (MSG_NOTE, vect_location, "=== vec_transform_loop ===");
  
+   /* If profile is inprecise, we have chance to fix it up.  */
+   if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+ expected_iterations = LOOP_VINFO_INT_NITERS (loop_vinfo);
+ 
/* Use the more conservative vectorization threshold.  If the number
   of iterations is constant assume the cost check has been performed
   by our caller.  If the threshold makes all loops profitable that
*** vect_transform_loop (loop_vec_info loop_
*** 5735,5740 
--- 5741,5765 
  
slpeel_make_loop_iterate_ntimes (loop, ratio);
  
+   /* Reduce loop iterations by the vectorization factor.  */
+   scale_loop_profile (loop, RDIV (REG_BR_PROB_BASE , vectorization_factor),
+ expected_iterations / vectorization_factor);
+   loop->nb_iterations_upper_bound
+ = loop->nb_iterations_upper_bound.udiv (double_int::from_uhwi (vectorization_factor),
+   FLOOR_DIV_EXPR);
+   if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
+   && loop->nb_iterations_upper_bound != double_int_zero)
+ loop->nb_iterations_upper_bound = loop->nb_iterations_upper_bound - double_int_one;
+   if (loop->any_estimate)
+ {
+   loop->nb_iterations_estimate
+ = loop->nb_iterations_estimate.udiv (double_int::from_uhwi (vectorization_factor),
+FLOOR_DIV_EXPR);
+if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
+  && loop->nb_iterations_estimate != double_int_zero)
+loop->nb_iterations_estimate = loop->nb_iterations_estimate - double_int_one;
+ }
+ 
/* The memory tags and pointers in vectorized statements need to
   have their SSA forms updated.  FIXME, why can't this be delayed
   until all the loops have been transformed?  */
Index: tree-vect-loop-manip.c
===
*** tree-vect-loop-manip.c  (revision 193174)
--- tree-vect-loop-manip.c  (working copy)
*** vect_do_peeling_for_loop_bound (loop_vec
*** 1954,1962 
   by ratio_mult_vf_name steps.  */
vect_update_ivs_after_vectorizer (loop_vinfo, ratio_mult_vf_name, update_e);
  
!   max_iter = LOOP_VINFO_VECT_FACTOR (loop_vinfo) - 1;
if (check_profitability)
! max_iter = MAX (max_iter, (int) th);
    record_niter_bound (new_loop, double_int::from_shwi (max_iter), false, true);
dump_printf (MSG_OPTIMIZED_LOCATIONS,
 "Setting upper bound of nb iterations for epilogue "
--- 1954,1969 
   by ratio_mult_vf_name steps.  */
vect_update_ivs_after_vectorizer (loop_vinfo, ratio_mult_vf_name, update_e);
  
!   /* For vectorization factor N, we need to copy last N-1 values in epilogue
!  and this means N-2 loopback edge executions.
! 
!  PEELING_FOR_GAPS works by subtracting last iteration and thus the epilogue
!  will execute at least LOOP_VINFO_VECT_FACTOR times.  */
!   max_iter = (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
! ? LOOP_VINFO_VECT_FACTOR (loop_vinfo) * 2
! : LOOP_VINFO_VECT_FACTOR (loop_vinfo)) - 2;

Re: [PATCH 3/7] s390: Use risbgz for AND.

2012-11-06 Thread Andreas Krebbel
I didn't make any changes to that one, so it is only a refresh to the latest
GCC head.

Bootstrapped on s390x with -march=z196. No regressions.

Feel free to apply.

Thanks!

Bye,

-Andreas-

 gcc/config/s390/s390.md |  109 +!!!
 1 file changed, 4 insertions(+), 105 modifications(!)

Index: gcc/config/s390/s390.md
===
*** gcc/config/s390/s390.md.orig
--- gcc/config/s390/s390.md
***
*** 6000,6043 
  
  (define_insn "*anddi3_cc"
[(set (reg CC_REGNUM)
! (compare (and:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d, 0")
!  (match_operand:DI 2 "general_operand"  " d,d,RT"))
!  (const_int 0)))
!(set (match_operand:DI 0 "register_operand"  "=d,d, d")
  (and:DI (match_dup 1) (match_dup 2)))]
!   "s390_match_ccmode(insn, CCTmode) && TARGET_ZARCH"
"@
 ngr\t%0,%2
 ngrk\t%0,%1,%2
!ng\t%0,%2"
!   [(set_attr "op_type"  "RRE,RRF,RXY")
!(set_attr "cpu_facility" "*,z196,*")
!(set_attr "z10prop" "z10_super_E1,*,z10_super_E1")])
  
  (define_insn "*anddi3_cconly"
[(set (reg CC_REGNUM)
! (compare (and:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d, 0")
!  (match_operand:DI 2 "general_operand"  " d,d,RT"))
!  (const_int 0)))
!(clobber (match_scratch:DI 0 "=d,d, d"))]
!   "s390_match_ccmode(insn, CCTmode) && TARGET_ZARCH
 /* Do not steal TM patterns.  */
 && s390_single_part (operands[2], DImode, HImode, 0) < 0"
"@
 ngr\t%0,%2
 ngrk\t%0,%1,%2
!ng\t%0,%2"
!   [(set_attr "op_type"  "RRE,RRF,RXY")
!(set_attr "cpu_facility" "*,z196,*")
!(set_attr "z10prop" "z10_super_E1,*,z10_super_E1")])
  
  (define_insn "*anddi3"
[(set (match_operand:DI 0 "nonimmediate_operand"
! "=d,d,d,d,d,d,d,d,d,d, d, 
  AQ,Q")
! (and:DI (match_operand:DI 1 "nonimmediate_operand"
! "%d,o,0,0,0,0,0,0,0,d, 0, 
   0,0")
! (match_operand:DI 2 "general_operand"
! "M, 
M,N0HDF,N1HDF,N2HDF,N3HDF,N0SDF,N1SDF,d,d,RT,NxQDF,Q")))
 (clobber (reg:CC CC_REGNUM))]
"TARGET_ZARCH && s390_logical_operator_ok_p (operands)"
"@
--- 6000,6049 
  
  (define_insn "*anddi3_cc"
[(set (reg CC_REGNUM)
! (compare
! (and:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d, 0,d")
!   (match_operand:DI 2 "general_operand"  " d,d,RT,NxxDq"))
!   (const_int 0)))
!(set (match_operand:DI 0 "register_operand"   "=d,d, d,d")
  (and:DI (match_dup 1) (match_dup 2)))]
!   "TARGET_ZARCH && s390_match_ccmode(insn, CCTmode)"
"@
 ngr\t%0,%2
 ngrk\t%0,%1,%2
!ng\t%0,%2
!risbg\t%0,%1,%s2,128+%e2,0"
!   [(set_attr "op_type"  "RRE,RRF,RXY,RIE")
!(set_attr "cpu_facility" "*,z196,*,z10")
!(set_attr "z10prop" "z10_super_E1,*,z10_super_E1,z10_super_E1")])
  
  (define_insn "*anddi3_cconly"
[(set (reg CC_REGNUM)
! (compare
! (and:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d, 0,d")
!   (match_operand:DI 2 "general_operand"  " d,d,RT,NxxDq"))
!  (const_int 0)))
!(clobber (match_scratch:DI 0  "=d,d, d,d"))]
!   "TARGET_ZARCH
!&& s390_match_ccmode(insn, CCTmode)
 /* Do not steal TM patterns.  */
 && s390_single_part (operands[2], DImode, HImode, 0) < 0"
"@
 ngr\t%0,%2
 ngrk\t%0,%1,%2
!ng\t%0,%2
!risbg\t%0,%1,%s2,128+%e2,0"
!   [(set_attr "op_type"  "RRE,RRF,RXY,RIE")
!(set_attr "cpu_facility" "*,z196,*,z10")
!(set_attr "z10prop" "z10_super_E1,*,z10_super_E1,z10_super_E1")])
  
  (define_insn "*anddi3"
[(set (match_operand:DI 0 "nonimmediate_operand"
! "=d,d,d,d,d,d,d,d,d,d, d,d,   AQ,Q")
! (and:DI
! (match_operand:DI 1 "nonimmediate_operand"
! "%d,o,0,0,0,0,0,0,0,d, 0,d,0,0")
!   (match_operand:DI 2 "general_operand"
! "M, M,N0HDF,N1HDF,N2HDF,N3HDF,N0SDF,N1SDF,d,d,RT,NxxDq,NxQDF,Q")))
 (clobber (reg:CC CC_REGNUM))]
"TARGET_ZARCH && s390_logical_operator_ok_p (operands)"
"@
***
*** 6052,6061 
 ngr\t%0,%2
 ngrk\t%0,%1,%2
 ng\t%0,%2
 #
 #"
!   [(set_attr "op_type" "RRE,RXE,RI,RI,RI,RI,RIL,RIL,RRE,RRF,RXY,SI,SS")
!(set_attr "cpu_facility" "*,*,*,*,*,*,extimm,extimm,*,z196,*,*,*")
 (set_attr "z10prop" "*,
  *,
  z10_super_E1,
--- 6058,6068 
 ngr\t%0,%2
 ngrk\t%0,%1,%2
 ng\t%0,%2
+risbg\t%0,%1,%s2,128+%e2,0
 #
 #"
!   [(set_attr "op_type" "RRE,RXE,RI,RI,RI,RI,RIL,RIL,RRE,RRF,RXY,RIE,SI,SS

Re: [patch RFA middle-end] Fix PR target/41993

2012-11-06 Thread Uros Bizjak
On Mon, Nov 5, 2012 at 11:58 PM, Kaz Kojima  wrote:

> The attached patch is to solve PR target/41993 which will affect
> targets using MODE_EXIT.
> Without it, we can't find all return registers for __builtin_return
> in mode-switching.c:create_pre_exit.  See the trail #4 by Uros in
> the PR for the details.  The patch is tested with bootstrap and
> regtested on i686-pc-linux-gnu with no new failures.  It's also
> tested on cross sh4-unknown-linux-gnu.

The attached patch adds the testcase from the PR to the testsuite.

2012-11-06  Uros Bizjak  

PR middle-end/41993
* gcc.dg/torture/pr41993.c: New test.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

/* { dg-do compile } */
/* { dg-options "-mavx -mvzeroupper" { target { i?86-*-* x86_64-*-* } } } */

short retframe_short (void *rframe)
{
  __builtin_return (rframe);
}


[PATCH,RX] Support Bit Manipulation on Memory Operands

2012-11-06 Thread Naveen H. S
Hi,

Please find attached the patch "rx_bit_insn.patch" which supports bit
operations on memory operand. Please review the same and let me know
if there should be any modifications in it.

Tested with rx-elf. No new Regressions.

ChangeLog
2012-11-06  Naveen H.S  

* config/rx/constraints.md (Uint03, Intu1, Intu0, Intsz, Intso): 
New Constraints.
* config/rx/predicates.md (rx_constbit_operand): New Predicates
that allows value from 0 to 7.
* gcc/config/rx/rx.c (print_operand): Add %D and %E operand codes
for bit manipulations.
* gcc/config/rx/rx.md (iorbset_mem, iorbset_reg, bset): New 
instructions for setting a memory bit.
(xorbnot_mem, xorbnot_reg): New instructions for inverting a memory
bit.
(andbclr_mem, andbclr_reg, bclr, insv_mem_imm): New instructions 
for clearing a memory bit.
(insv): Modify to support bit manipulation operations on memory 
directly.

Thanks & Regards,
Naveen



rx_bit_insn.patch
Description: rx_bit_insn.patch


[PATCH] Less restrictive regex in const-uniq-1.c

2012-11-06 Thread David Edelsohn
The regex in const-uniq-1.c assumes ELF label format, which does not
match AIX XCOFF.  The following patch broadens the regex so that it
also correctly matches on AIX.
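
As an illustration of the difference, here is a minimal C sketch that runs
both patterns through POSIX extended regular expressions (the testsuite
itself uses Tcl regexps; the backslash escapes involved behave the same way
here).  It assumes, hypothetically, that the AIX XCOFF label is spelled
"LC..0"; label_matches, old_pattern and new_pattern are names invented for
this example.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if the POSIX extended regular
   expression PATTERN matches somewhere in TEXT.  */
bool
label_matches (const char *pattern, const char *text)
{
  assert (pattern != NULL && text != NULL);

  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;

  bool matched = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}

/* The two patterns from the patch with the Tcl quoting removed.  */
const char old_pattern[] = "L\\$?C0";
const char new_pattern[] = "L\\$?C\\.*0";
```

Under that assumption, both patterns match "LC0" and "L$C0", but only the
new one matches "LC..0", because it allows any number of literal dots
between the "C" and the "0".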

* const-uniq-1.c: Expand regex to match AIX XCOFF labels.

Index: const-uniq-1.c
===
--- const-uniq-1.c  (revision 193203)
+++ const-uniq-1.c  (working copy)
@@ -20,5 +20,5 @@
   return a[i+1];
 }

-/* { dg-final { scan-tree-dump-times "L\\\$?C0" 2 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "L\\\$?C\\\.*0" 2 "gimple" } } */
 /* { dg-final { cleanup-tree-dump "gimple" } } */


Re: [PATCH] New configuration options to enable additional executable/startfile/shared library prefixes

2012-11-06 Thread David Edelsohn
On Mon, Nov 5, 2012 at 12:27 PM, Michael Meissner
 wrote:

> Yes, obviously I should have included powerpc-linux as well as powerpc64-linux
> in the documentation.  Thanks.  If it is approved, I will update the
> documentation.

The rs6000 parts of the patch are okay with that change.

Thanks, David


Re: [PATCH] Make IPA-CP work on aggregates

2012-11-06 Thread Jan Hubicka
> 
> 2012-11-05  Martin Jambor  
> 
>   PR tree-optimization/53787
>   * ipa-cp.c (ipcp_value_source): New field offset.
>   (ipcp_agg_lattice): New type.
>   (ipcp_param_lattices): Likewise, move virt_call from ipcp_lattice here.
>   (ipcp_agg_lattice_pool): New variable.
>   (ipa_get_parm_lattices): New function.
>   (ipa_get_lattice): Turned into ipa_get_scalar_lat, use the above.
>   Adjusted all callers.
>   (print_lattice): New function.
>   (print_all_lattices): Use the above, also print aggregate lattices.
>   (set_agg_lats_to_bottom): New function.
>   (set_agg_lats_contain_variable): Likewise.
>   (set_all_contains_variable): Likewise.
>   (initialize_node_lattices): Also handle aggregate lattices, set
>   virt_call in ipcp_param_lattices.
>   (add_value_source): Handle offsets.
>   (add_value_to_lattice): Likewise.
>   (add_scalar_value_to_lattice): New function.
>   (propagate_vals_accross_pass_through): Use add_scalar_value_to_lattice.
>   (propagate_vals_accross_ancestor): Likewise.
>   (propagate_accross_jump_function): Renamed to
>   propagate_scalar_accross_jump_function, use
>   add_scalar_value_to_lattice.
>   (set_check_aggs_by_ref): New function.
>   (merge_agg_lats_step): Likewise.
>   (set_chain_of_aglats_contains_variable): Likewise.
>   (merge_aggregate_lattices): Likewise.
>   (propagate_constants_accross_call): Also handle aggregate lattices.
>   (hint_time_bonus): New function.
>   (context_independent_aggregate_values): Likewise.
>   (gather_context_independent_values): Also handle agggregate values.
>   (agg_jmp_p_vec_for_t_vec): New function.
>   (estimate_local_effects): Also handle agggregate values.
>   (add_all_node_vals_to_toposort): Likewise.
>   (ipcp_propagate_stage): Use struct ipcp_param_lattices.
>   (get_clone_agg_value): New function.
>   (cgraph_edge_brings_value_p): Also handle agggregate values.
>   (create_specialized_node): Likewise.
>   (find_more_values_for_callers_subset): Rename to
>   find_more_scalar_values_for_callers_subset.  Modify dump.
>   (copy_plats_to_inter): New function.
>   (intersect_with_plats): Likewise.
>   (agg_replacements_to_vector): Likewise.
>   (intersect_with_agg_replacements): Likewise.
>   (find_aggregate_values_for_callers_subset): Likewise.
>   (known_aggs_to_agg_replacement_list): Likewise.
>   (cgraph_edge_brings_all_scalars_for_node): Likewise.
>   (cgraph_edge_brings_all_agg_vals_for_node): Likewise.
>   (perhaps_add_new_callers): Old functionality moved to
>   cgraph_edge_brings_all_scalars_for_node, call it and
>   cgraph_edge_brings_all_agg_vals_for_node.
>   (ipcp_val_in_agg_replacements_p): New function.
>   (decide_about_value): New function.
>   (decide_whether_version_node): A lot of functionality moved to
>   decide_about_value.  Also handle agggregate values.
>   (ipcp_driver): Also allocate ipcp_agg_lattice_pool.
>   (pass_ipa_cp): Fill in new entries.
>   * ipa-prop.c (ipa_node_agg_replacements): New variable.
>   (free_parms_ainfo): New function.
>   (ipa_analyze_node): Use free_parms_ainfo to free stuff.
>   (ipa_find_agg_cst_for_param): Do not rely on offset ordering.
>   (ipa_set_node_agg_value_chain): New function.
>   (ipa_node_removal_hook): Also handle ipa_node_agg_replacements.
>   (ipa_node_duplication_hook): Likewise.
>   (ipa_free_all_structures_after_ipa_cp): Also free ipcp_agg_lattice_pool.
>   (ipa_free_all_structures_after_iinln): Likewise.
>   (ipa_dump_agg_replacement_values): New function.
>   (write_agg_replacement_chain): Likewise.
>   (read_agg_replacement_chain): Likewise.
>   (ipa_prop_write_all_agg_replacement): Likewise.
>   (read_replacements_section): Likewise.
>   (ipa_prop_read_all_agg_replacement): Likewise.
>   (adjust_agg_replacement_values): Likewise.
>   (ipcp_transform_function): Likewise.
>   * ipa-prop.h: Also define heap vector of ipa_agg_jf_item_t and of
>   ipa_agg_jump_function_t.
>   (ipa_node_params): Make lattices an array of ipcp_param_lattices.
>   (ipa_agg_replacement_value): New type and its vector.
>   (ipa_set_node_agg_value_chain) Declare.
>   (ipa_node_agg_replacements): Likewise.
>   (ipa_get_agg_replacements_for_node): New function.
>   (ipcp_agg_lattice_pool): Declare.
>   (ipa_dump_agg_replacement_values): Likewise.
>   (ipa_prop_write_all_agg_replacement): Likewise.
>   (ipa_prop_read_all_agg_replacement): Likewise.
>   (ipcp_transform_function): Likewise.
>   * ipa-inline-analysis.c (estimate_ipcp_clone_size_and_time): Pass around
>   known aggregates and hints.
>   * ipa-inline.h: include ipa-prop.h.
>   (estimate_ipcp_clone_size_and_time): Adjust declaration.
>   * lto-streamer.h (lto_section_t

Re: RFC: PATCH to add abi_tag attribute

2012-11-06 Thread Jason Merrill

On 11/06/2012 06:20 AM, Jakub Jelinek wrote:

On Mon, Nov 05, 2012 at 11:03:37PM -0500, Jason Merrill wrote:

As discussed at the Cauldron in Prague, this patch introduces a C++
abi_tag attribute which can be attached to a function or class to
modify its mangled name and avoid name collisions with earlier
versions with a different ABI.  It also adds a -Wabi-tag warning
option to make the compiler suggest adding ABI tags to classes with
subobjects that have tags.


Couldn't there be auto-propagation at least for classes that aren't forward
declared first?


There could, but then there would be a silent difference in behavior 
based on whether or not a class has a forward declaration.  It would 
also mean we would need to instantiate templates in more situations in 
order to collect tags.  I think this is the best solution, even though 
it isn't as comprehensive as we would like.



Also, perhaps the documentation should reserve some names for the
implementation or uses compatible with that (say, names starting with an
underscore or whatever), so that we could use abi tag names in libstdc++
without fear that they are already used by others for something else.


Sure, that makes sense.

Jason



Re: [PATCH] Enable -mcpu=power8 for PowerPC

2012-11-06 Thread David Edelsohn
On Mon, Nov 5, 2012 at 11:54 PM, Peter Bergner  wrote:
> This patch enables new -mcpu and -mtune options for POWER8.  The -mcpu=power8
> option currently is just an alias for power7.  The effect of these options
> will be expanded when more technical details are released by IBM.
>
> Bootstrapped and regtested on powerpc64-linux.  Ok for mainline or should
> we wait to commit this after stage1 (since Jakub said it was ok in another
> thread) when we commit our base power8 patches?
>
> Peter
>
>
> * doc/invoke.texi (-mcpu=power8): Document.
> * config.in (HAVE_AS_POWER8): New.
> * config.gcc: Add cpu_type power8.
> * configure.ac: (HAVE_AS_POWER8): Check for assembler support for the
> POWER8 instructions.
> * configure: Regenerate.
> * config/rs6000/rs6000.h: (ASM_CPU_POWER8_SPEC): Define.
> (ASM_CPU_SPEC): Pass %(asm_cpu_power8) for -mcpu=power8.
> (EXTRA_SPECS): Add asm_cpu_power8 spec string.
> * config/rs6000/rs6000-cpus.def (processor_target_table): Alias
> POWER8 to POWER7.
> * config/rs6000/driver-rs6000.c (ASM_CPU_SPEC): For -mcpu=power8,
> pass %(asm_cpu_power8)/-mpwr8.
> * config/rs6000/aix53.h: Likewise.
> * config/rs6000/aix61.h: Likewise.

This patch is okay.

Thanks, David


Re: User directed Function Multiversioning via Function Overloading (issue5752064)

2012-11-06 Thread Jason Merrill

On 11/05/2012 09:38 PM, Sriraman Tallam wrote:

+  /* For multi-versioned functions, more than one match is just fine.
+Call decls_match to make sure they are different because they are
+versioned.  */
+  if (DECL_FUNCTION_VERSIONED (fn))
+   {
+  for (match = TREE_CHAIN (matches); match; match = TREE_CHAIN (match))
+   if (!DECL_FUNCTION_VERSIONED (TREE_PURPOSE (match))
+   || decls_match (fn, TREE_PURPOSE (match)))
+ break;
+   }


I still don't understand what this code is supposed to be doing.  Please 
remove it and instead modify the other loop to allow mismatches that are 
versions of the same function.



+  /* If the olddecl is a version, so is the newdecl.  */
+  if (TREE_CODE (newdecl) == FUNCTION_DECL
+  && DECL_FUNCTION_VERSIONED (olddecl))
+{
+  DECL_FUNCTION_VERSIONED (newdecl) = 1;
+  /* newdecl will be purged and is no longer a version.  */
+  delete_function_version (newdecl);
+}


Please make the comment clearer that the reason we're setting the flag 
on the newdecl is so that it'll be copied back into the olddecl; 
otherwise it seems odd to say it's a version and then it isn't a version.



+  /* If a pointer to a function that is multi-versioned is requested, the
+ pointer to the dispatcher function is returned instead.  This works
+ well because indirectly calling the function will dispatch the right
+ function version at run-time.  */
+  if (DECL_FUNCTION_VERSIONED (fn))
+{
+  tree dispatcher_decl = NULL;
+  gcc_assert (targetm.get_function_versions_dispatcher);
+  dispatcher_decl = targetm.get_function_versions_dispatcher (fn);
+  if (!dispatcher_decl)
+   {
+ error_at (input_location, "Pointer to a multiversioned function"
+   " without a default is not allowed");
+ return error_mark_node;
+   }
+  retrofit_lang_decl (dispatcher_decl);
+  fn = dispatcher_decl;


This code should use the get_function_version_dispatcher function in 
cp/call.c.


Jason



Re: [PING^2] [C++ PATCH] Add overflow checking to __cxa_vec_new[23]

2012-11-06 Thread Jason Merrill

On 11/05/2012 12:52 PM, Florian Weimer wrote:

+// Avoid use of none-overridable new/delete operators in shared


Typo: that should be "non-overridable"

Jason



Re: [PING^2] [C++ PATCH] Add overflow checking to __cxa_vec_new[23]

2012-11-06 Thread Florian Weimer

On 11/06/2012 04:55 PM, Jason Merrill wrote:

On 11/05/2012 12:52 PM, Florian Weimer wrote:

+// Avoid use of none-overridable new/delete operators in shared


Typo: that should be "non-overridable"

Jason


Thanks, this patch fixes both instances.

--
Florian Weimer / Red Hat Product Security Team
gcc/testsuite/ChangeLog:	(revision 193243)

2012-11-06  Florian Weimer  

	* g++.old-deja/g++.abi/cxa_vec.C: Fix typo in comment.

libstdc++-v3/ChangeLog:

2012-11-06  Florian Weimer  

	* testsuite/18_support/cxa_vec.cc: Fix typo in comment.

Index: gcc/testsuite/g++.old-deja/g++.abi/cxa_vec.C
===
--- gcc/testsuite/g++.old-deja/g++.abi/cxa_vec.C	(revision 193243)
+++ gcc/testsuite/g++.old-deja/g++.abi/cxa_vec.C	(working copy)
@@ -5,7 +5,7 @@
 // are resolved when the kernel is linked.
 // { dg-do run { xfail { powerpc-ibm-aix* || vxworks_kernel } } }
 // { dg-options "-flat_namespace" { target *-*-darwin[67]* } }
-// Avoid use of none-overridable new/delete operators in shared
+// Avoid use of non-overridable new/delete operators in shared
 // { dg-options "-static" { target *-*-mingw* } }
 // Test __cxa_vec routines
 // Copyright (C) 2000, 2005 Free Software Foundation, Inc.
Index: libstdc++-v3/testsuite/18_support/cxa_vec.cc
===
--- libstdc++-v3/testsuite/18_support/cxa_vec.cc	(revision 193243)
+++ libstdc++-v3/testsuite/18_support/cxa_vec.cc	(working copy)
@@ -1,5 +1,5 @@
 // { dg-do run }
-// Avoid use of none-overridable new/delete operators in shared
+// Avoid use of non-overridable new/delete operators in shared
 // { dg-options "-static" { target *-*-mingw* } }
 // Test __cxa_vec routines
 // Copyright (C) 2000-2012 Free Software Foundation, Inc.


Fix debug dump formatting in ipa-pure-const

2012-11-06 Thread Jan Hubicka
Hi,
there are missing linebreaks in the debug info.  Fixed thus.

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 193244)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2012-11-06  Jan Hubicka  
+
+   * ipa-pure-const.c (check_stmt): Fix debug info formatting.
+
 2012-11-06  Uros Bizjak  
 
* config/i386/i386.c (TARGET_INSTANTIATE_DECLS): New define.
Index: ipa-pure-const.c
===
--- ipa-pure-const.c(revision 193244)
+++ ipa-pure-const.c(working copy)
@@ -671,15 +671,18 @@ check_stmt (gimple_stmt_iterator *gsip,
   if (cfun->can_throw_non_call_exceptions)
{
  if (dump_file)
-   fprintf (dump_file, "can throw; looping");
+   fprintf (dump_file, "can throw; looping\n");
  local->looping = true;
}
   if (stmt_can_throw_external (stmt))
{
  if (dump_file)
-   fprintf (dump_file, "can throw externally");
+   fprintf (dump_file, "can throw externally\n");
  local->can_throw = true;
}
+  else
+   if (dump_file)
+ fprintf (dump_file, "can throw\n");
 }
   switch (gimple_code (stmt))
 {
@@ -691,7 +694,7 @@ check_stmt (gimple_stmt_iterator *gsip,
/* Target of long jump. */
{
   if (dump_file)
-fprintf (dump_file, "nonlocal label is not const/pure");
+fprintf (dump_file, "nonlocal label is not const/pure\n");
  local->pure_const_state = IPA_NEITHER;
}
   break;
@@ -699,14 +702,14 @@ check_stmt (gimple_stmt_iterator *gsip,
   if (gimple_asm_clobbers_memory_p (stmt))
{
  if (dump_file)
-   fprintf (dump_file, "memory asm clobber is not const/pure");
+   fprintf (dump_file, "memory asm clobber is not const/pure\n");
  /* Abandon all hope, ye who enter here. */
  local->pure_const_state = IPA_NEITHER;
}
   if (gimple_asm_volatile_p (stmt))
{
  if (dump_file)
-   fprintf (dump_file, "volatile is not const/pure");
+   fprintf (dump_file, "volatile is not const/pure\n");
  /* Abandon all hope, ye who enter here. */
  local->pure_const_state = IPA_NEITHER;
   local->looping = true;


Re: [PATCH] Less restrictive regex in const-uniq-1.c

2012-11-06 Thread Andrew Pinski
On Tue, Nov 6, 2012 at 6:54 AM, David Edelsohn  wrote:
> The regex in const-uniq-1.c assumes ELF label format, which does not
> match AIX XCOFF.  The following patch broadens the regex so that it
> also correctly matches on AIX.
>
> * const-uniq-1.c: Expand regex to match AIX XCOFF labels.
>
> Index: const-uniq-1.c
> ===
> --- const-uniq-1.c  (revision 193203)
> +++ const-uniq-1.c  (working copy)
> @@ -20,5 +20,5 @@
>return a[i+1];
>  }
>
> -/* { dg-final { scan-tree-dump-times "L\\\$?C0" 2 "gimple" } } */
> +/* { dg-final { scan-tree-dump-times "L\\\$?C\\\.*0" 2 "gimple" } } */

I think .* will match too much as it will match newlines too.  I think
[^\n\]* is better.

Thanks,
Andrew Pinski


>  /* { dg-final { cleanup-tree-dump "gimple" } } */


[PATCH, i386]: Remove SLOT_VIRTUAL from enum ix86_stack_slot

2012-11-06 Thread Uros Bizjak
Hello!

Attached patch removes SLOT_VIRTUAL and introduces
TARGET_INSTANTIATE_DECLS that takes care of instantiating registers in
ix86_stack_locals array. The patch enables some more stack slot
sharing.

2012-11-06  Uros Bizjak  

* config/i386/i386.c (TARGET_INSTANTIATE_DECLS): New define.
(ix86_instantiate_decls): New function.
(ix86_expand_builtin) <case IX86_BUILTIN_LDMXCSR>: Use SLOT_TEMP
stack slot instead of SLOT_VIRTUAL.
<case IX86_BUILTIN_STMXCSR>: Ditto.
(assign_386_stack_local): Do not assert when virtual slot is valid.
* config/i386/i386.h (enum ix86_stack_slot): Remove SLOT_VIRTUAL.
* config/i386/i386.md (truncdfsf2): Do not use SLOT_VIRTUAL stack slot.
(truncxf2): Ditto.
(floatunssi2): Ditto.
(isinf2): Ditto.
* config/i386/sync.md (atomic_load): Ditto.
(atomic_store): Ditto.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 193180)
+++ config/i386/i386.c  (working copy)
@@ -23751,9 +23751,6 @@ assign_386_stack_local (enum machine_mode mode, en
 
   gcc_assert (n < MAX_386_STACK_LOCALS);
 
-  /* Virtual slot is valid only before vregs are instantiated.  */
-  gcc_assert ((n == SLOT_VIRTUAL) == !virtuals_instantiated);
-
   for (s = ix86_stack_locals; s; s = s->next)
 if (s->mode == mode && s->n == n)
   return validize_mem (copy_rtx (s->rtl));
@@ -23767,6 +23764,16 @@ assign_386_stack_local (enum machine_mode mode, en
   ix86_stack_locals = s;
   return validize_mem (s->rtl);
 }
+
+static void
+ix86_instantiate_decls (void)
+{
+  struct stack_local_entry *s;
+
+  for (s = ix86_stack_locals; s; s = s->next)
+if (s->rtl != NULL_RTX)
+  instantiate_decl_rtl (s->rtl);
+}
 
 /* Calculate the length of the memory address in the instruction encoding.
Includes addr32 prefix, does not include the one-byte modrm, opcode,
@@ -30586,13 +30593,13 @@ ix86_expand_builtin (tree exp, rtx target, rtx sub
 
 case IX86_BUILTIN_LDMXCSR:
   op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
-  target = assign_386_stack_local (SImode, SLOT_VIRTUAL);
+  target = assign_386_stack_local (SImode, SLOT_TEMP);
   emit_move_insn (target, op0);
   emit_insn (gen_sse_ldmxcsr (target));
   return 0;
 
 case IX86_BUILTIN_STMXCSR:
-  target = assign_386_stack_local (SImode, SLOT_VIRTUAL);
+  target = assign_386_stack_local (SImode, SLOT_TEMP);
   emit_insn (gen_sse_stmxcsr (target));
   return copy_to_mode_reg (SImode, target);
 
@@ -41402,6 +41409,9 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)
 #undef TARGET_MEMBER_TYPE_FORCES_BLK
 #define TARGET_MEMBER_TYPE_FORCES_BLK ix86_member_type_forces_blk
 
+#undef TARGET_INSTANTIATE_DECLS
+#define TARGET_INSTANTIATE_DECLS ix86_instantiate_decls
+
 #undef TARGET_SECONDARY_RELOAD
 #define TARGET_SECONDARY_RELOAD ix86_secondary_reload
 
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 193180)
+++ config/i386/i386.h  (working copy)
@@ -2150,8 +2150,7 @@ enum ix86_entity
 
 enum ix86_stack_slot
 {
-  SLOT_VIRTUAL = 0,
-  SLOT_TEMP,
+  SLOT_TEMP = 0,
   SLOT_CW_STORED,
   SLOT_CW_TRUNC,
   SLOT_CW_FLOOR,
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 193180)
+++ config/i386/i386.md (working copy)
@@ -4071,10 +4071,7 @@
 ;
   else
 {
-  enum ix86_stack_slot slot = (virtuals_instantiated
-  ? SLOT_TEMP
-  : SLOT_VIRTUAL);
-  rtx temp = assign_386_stack_local (SFmode, slot);
+  rtx temp = assign_386_stack_local (SFmode, SLOT_TEMP);
   emit_insn (gen_truncdfsf2_with_temp (operands[0], operands[1], temp));
   DONE;
 }
@@ -4255,12 +4252,7 @@
   DONE;
 }
   else
-{
-  enum ix86_stack_slot slot = (virtuals_instantiated
-  ? SLOT_TEMP
-  : SLOT_VIRTUAL);
-  operands[2] = assign_386_stack_local (mode, slot);
-}
+operands[2] = assign_386_stack_local (mode, SLOT_TEMP);
 })
 
 (define_insn "*truncxfsf2_mixed"
@@ -5468,12 +5460,7 @@
   DONE;
 }
   else
-{
-  enum ix86_stack_slot slot = (virtuals_instantiated
-  ? SLOT_TEMP
-  : SLOT_VIRTUAL);
-  operands[2] = assign_386_stack_local (DImode, slot);
-}
+operands[2] = assign_386_stack_local (DImode, SLOT_TEMP);
 })
 
 (define_expand "floatunsdisf2"
@@ -15563,10 +15550,7 @@
 emit_insn (gen_fxam2_i387_with_temp (scratch, operands[1]));
   else
 {
-  enum ix86_stack_slot slot = (virtuals_instantiated
-  ? SLOT_TEMP
-  : SLOT_VIRTUAL);
-  rtx temp = assign_386_stack_local (mode, slo

Re: [Patch] libitm: add HTM fastpath

2012-11-06 Thread Richard Henderson
On 2012-11-05 17:09, Torvald Riegel wrote:
> commit 4f2b3c78ac7ae3fb2b639ce32ad197a12ba7d66a
> Author: Torvald Riegel 
> Date:   Tue Oct 23 00:25:50 2012 +0200
> 
> Add HTM fastpath and use Intel RTM for it on x86.

This is ok, modulo the cpuid.h bit that Uros pointed out.


r~


Re: [Patch] libitm: add HTM fastpath

2012-11-06 Thread Richard Henderson
On 2012-11-05 17:09, Torvald Riegel wrote:
> +  if (likely(htm_fastpath && (prop & pr_hasNoAbort)))

For reference, could the NoAbort clause be relaxed with an htm check
in abortTransaction, and the use of an xabort insn with an appropriate
code to indicate user abort?

Just wondering what the current rationale for this is.


r~


Re: [RFC] Heuristics to throttle the complette unrolling

2012-11-06 Thread Jan Hubicka
> On Tue, 30 Oct 2012, Jan Hubicka wrote:
> 
> > Hi,
> > for the past week or two I was playing with ways to throttle down the
> > complete unrolling heuristics.  I made the complete unroller use the
> > tree-ssa-loop-niter upper bound and unroll even in non-trivial cases, and
> > this has turned out to increase the number of completely unrolled loops
> > by a great amount, so one can see it as considerable code size growth in
> > the -O3 SPEC build.
> > 
> > http://gcc.opensuse.org/SPEC/CFP/sb-vangelis-head-64/Total-size_big.png
> > it is the largest jump on the right hand side in both peak and base runs.
> > There are also performance improvements, most importantly 11% on applu.
> > 
> > The intuition is that complete unrolling is most profitable when IV tests
> > are eliminated and a single basic block is created.  When conditionals stay
> > in the code it is not that good an idea, and functions containing calls
> > are also less interesting for unrolling since the calls are slow and the
> > optimization opportunities are not so great.
> > 
> > This patch reduces unrolling on loops having many branches or calls on the
> > hot path.  I found that for the applu speedup the number of branches needs
> > to be pretty high - about 32.
> > 
> > The patch saves about half of the growth introduced (but on different
> > benchmarks) and I think I can move all peeling to trees and reduce peeling
> > limits a bit, too.
> > 
> > Does this sound sane? Any ideas?
> 
> Yes, this sounds ok (beware of the unrelated PARAM_MAX_ONCE_PEELED_INSNS
> removal in the patch below).

Hi,
this is the somewhat polished version I committed today.  The main change is to
test inexpensive_builtlin_p when deciding whether to count a builtin call as a call.

Bootstrapped/regtested x86_64-linux.
Honza

* cfgloopanal.c (get_loop_hot_path): New function.
* tree-ssa-loop-ivcanon.c (struct loop_size): Add CONSTANT_IV,
NUM_NON_PURE_CALLS_ON_HOT_PATH, NUM_PURE_CALLS_ON_HOT_PATH,
NUM_BRANCHES_ON_HOT_PATH.
(tree_estimate_loop_size): Compute the new values.
(try_unroll_loop_completely): Disable unrolling of loops with only
calls or too many branches.
(tree_unroll_loops_completely): Deal also with outer loops of hot loops.
* cfgloop.h (get_loop_hot_path): Declare.
* params.def (PARAM_MAX_PEEL_BRANCHES): New parameters.
* invoke.texi (max-peel-branches): Document.

* gcc.dg/tree-ssa/loop-1.c: Make it still look like a good unrolling
candidate.
* gcc.dg/tree-ssa/loop-23.c: Likewise.
* gcc.dg/tree-ssa/cunroll-1.c: Unrolling now happens early.
* gcc.dg/tree-prof/unroll-1.c: Remove confused dg-options.

Index: cfgloopanal.c
===
--- cfgloopanal.c   (revision 193240)
+++ cfgloopanal.c   (working copy)
@@ -483,3 +483,36 @@ single_likely_exit (struct loop *loop)
   VEC_free (edge, heap, exits);
   return found;
 }
+
+
+/* Gets basic blocks of a LOOP.  Header is the 0-th block, rest is in dfs
+   order against direction of edges from latch.  Specially, if
+   header != latch, latch is the 1-st block.  */
+
+VEC (basic_block, heap) *
+get_loop_hot_path (const struct loop *loop)
+{
+  basic_block bb = loop->header;
+  VEC (basic_block, heap) *path = NULL;
+  bitmap visited = BITMAP_ALLOC (NULL);
+
+  while (true)
+{
+  edge_iterator ei;
+  edge e;
+  edge best = NULL;
+
+  VEC_safe_push (basic_block, heap, path, bb);
+  bitmap_set_bit (visited, bb->index);
+  FOR_EACH_EDGE (e, ei, bb->succs)
+if ((!best || e->probability > best->probability)
+   && !loop_exit_edge_p (loop, e)
+   && !bitmap_bit_p (visited, e->dest->index))
+ best = e;
+  if (!best || best->dest == loop->header)
+   break;
+  bb = best->dest;
+}
+  BITMAP_FREE (visited);
+  return path;
+}
Index: testsuite/gcc.dg/tree-ssa/loop-1.c
===
--- testsuite/gcc.dg/tree-ssa/loop-1.c  (revision 193240)
+++ testsuite/gcc.dg/tree-ssa/loop-1.c  (working copy)
@@ -17,13 +17,16 @@
to the load from the GOT this also contains the name of the funtion so for
each call the function name would appear twice.  */
 /* { dg-options "-O1 -ftree-loop-ivcanon -funroll-loops 
-fdump-tree-ivcanon-details -fdump-tree-cunroll-details -fdump-tree-optimized 
-mno-relax-pic-calls" { target mips*-*-* } } */
-
-void xxx(void)
+__attribute__ ((pure))
+int foo (int x);
+int xxx(void)
 {
   int x = 45;
+  int sum;
 
   while (x >>= 1)
-foo ();
+sum += foo (x) * 2;
+  return sum;
 }
 
 /* We should be able to find out that the loop iterates four times and unroll 
it completely.  */
Index: testsuite/gcc.dg/tree-ssa/cunroll-1.c
===
--- testsuite/gcc.dg/tree-ssa/cunroll-1.c   (revision 193240)
+++ testsuite/gcc.dg/tree-ssa/cunroll-1.c   (working copy)
@@ -1,5
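
The walk that get_loop_hot_path performs in the patch above can be sketched
outside of GCC.  This is only an illustration under stated assumptions: the
types cfg_edge and cfg_block, the function hot_path, and the example_loop
data are hypothetical stand-ins for GCC's edge, basic_block, VEC and bitmap
machinery.  Starting at the header, the sketch repeatedly follows the most
probable successor edge that stays inside the loop and leads to a block not
yet visited.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MAX_SUCCS = 2, MAX_BLOCKS = 64 };

/* Hypothetical stand-ins for GCC's edge and basic_block.  */
struct cfg_edge
{
  int dest;          /* Index of the destination block.  */
  int probability;   /* Larger means more likely to be taken.  */
  bool exits_loop;   /* True if the edge leaves the loop.  */
};

struct cfg_block
{
  int n_succs;
  struct cfg_edge succs[MAX_SUCCS];
};

/* Store the greedy hot path starting at HEADER into PATH and return its
   length.  PATH must have room for N_BLOCKS entries.  */
size_t
hot_path (const struct cfg_block *blocks, size_t n_blocks, int header,
          int *path)
{
  bool visited[MAX_BLOCKS] = { false };
  size_t len = 0;
  int bb = header;

  if (n_blocks > MAX_BLOCKS)
    return 0;

  for (;;)
    {
      const struct cfg_edge *best = NULL;

      path[len++] = bb;
      visited[bb] = true;
      for (int i = 0; i < blocks[bb].n_succs; i++)
        {
          const struct cfg_edge *e = &blocks[bb].succs[i];

          /* Same test as in the patch: more probable than the current
             choice, not a loop exit, destination not yet on the path.  */
          if ((!best || e->probability > best->probability)
              && !e->exits_loop
              && !visited[e->dest])
            best = e;
        }
      if (!best || best->dest == header)
        break;
      bb = best->dest;
    }
  return len;
}

/* Example loop: block 0 is the header, 1 the hot arm, 2 the cold arm,
   3 the latch.  Block 4 lies outside the loop.  */
const struct cfg_block example_loop[] = {
  { 2, { { 1, 90, false }, { 2, 10, false } } },
  { 1, { { 3, 100, false } } },
  { 1, { { 3, 100, false } } },
  { 2, { { 0, 95, false }, { 4, 5, true } } },
};
```

Calling hot_path (example_loop, 4, 0, path) on this data visits the header,
the hot arm and the latch, and never the cold arm.  Branches and calls would
then be counted only on the blocks of that path.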

Re: [PATCH] Less restrictive regex in const-uniq-1.c

2012-11-06 Thread Andrew Pinski
On Tue, Nov 6, 2012 at 8:07 AM, Andrew Pinski  wrote:
> On Tue, Nov 6, 2012 at 6:54 AM, David Edelsohn  wrote:
>> The regex in const-uniq-1.c assumes ELF label format, which does not
>> match AIX XCOFF.  The following patch broadens the regex so that it
>> also correctly matches on AIX.
>>
>> * const-uniq-1.c: Expand regex to match AIX XCOFF labels.
>>
>> Index: const-uniq-1.c
>> ===
>> --- const-uniq-1.c  (revision 193203)
>> +++ const-uniq-1.c  (working copy)
>> @@ -20,5 +20,5 @@
>>return a[i+1];
>>  }
>>
>> -/* { dg-final { scan-tree-dump-times "L\\\$?C0" 2 "gimple" } } */
>> +/* { dg-final { scan-tree-dump-times "L\\\$?C\\\.*0" 2 "gimple" } } */
>
> I think .* will match too much as it will match newlines too.  I think
> [^\n\]* is better.

Just to correct myself: I missed all of the \ before the '.', so this
change is fine with respect to not matching too much.

Thanks,
Andrew Pinski


>
> Thanks,
> Andrew Pinski
>
>
>>  /* { dg-final { cleanup-tree-dump "gimple" } } */
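
The point of the correction can be checked with a small standalone program.
This is a sketch under assumptions: it uses POSIX extended regular
expressions rather than the Tcl engine that DejaGnu runs, it assumes the
dg-final strings reduce to the patterns L\$?C0 and L\$?C\.*0 once Tcl has
processed the backslashes, and it assumes an XCOFF-style label spelled
LC..0.  The names regex_matches, old_label_re and new_label_re are
hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Patterns as assumed after Tcl has removed one level of escaping.  */
const char old_label_re[] = "L\\$?C0";
const char new_label_re[] = "L\\$?C\\.*0";

/* Return true if TEXT contains a match for the POSIX extended regular
   expression PATTERN.  A pattern that fails to compile counts as no
   match.  */
bool
regex_matches (const char *pattern, const char *text)
{
  regex_t re;
  bool matched;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  matched = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

Because the dot is escaped, the star only repeats literal '.' characters.
The new pattern therefore accepts LC0, L$C0 and LC..0, while a string such
as "LC\n0" is still rejected.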


Re: [PATCH] Use propagate_threaded_block_debug_into even in loop header copying pass (PR debug/54693)

2012-11-06 Thread Jeff Law

On 11/06/2012 04:00 AM, Jakub Jelinek wrote:

Hi!

This patch fixes
-FAIL: gcc.dg/guality/pr54693-2.c  -O1  line 21 i == v + 1
-FAIL: gcc.dg/guality/pr54693-2.c  -O2  line 21 i == v + 1
-FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer  line 21 i == v + 1
-FAIL: gcc.dg/guality/pr54693-2.c  -O3 -g  line 21 i == v + 1
-FAIL: gcc.dg/guality/pr54693.c  -O1  line 22 i == c - 48
on both x86_64-linux and i686-linux (and the x/y/z tests in the new testcase
from UNSUPPORTED to PASS) by copying the debug stmt in ch pass similarly to
how jump threading does that.

Ok for trunk?

2012-11-06  Jakub Jelinek  

PR debug/54693
* tree-flow.h (propagate_threaded_block_debug_into): New prototype.
* tree-ssa-threadedge.c (propagate_threaded_block_debug_into): No
longer static.
* tree-ssa-loop-ch.c (copy_loop_headers): Use it.

* gcc.dg/guality/pr54693-2.c: New test.

OK.
jeff



Re: [PATCH 02/10] Initial asan cleanups

2012-11-06 Thread Diego Novillo

On 2012-11-02 15:57 , Dodji Seketeli wrote:


  /* AddressSanitizer, a fast memory error detector.
-   Copyright (C) 2011 Free Software Foundation, Inc.
+   Copyright (C) 2011, 2012 Free Software Foundation, Inc.


I *think* we should only mention 2012, but I don't know if code in 
branches counts for the copyright years.




+  /* Address Sanitizer needs porting to each target architecture.  */
+  if (flag_asan && targetm.asan_shadow_offset == NULL)
+{
+  warning (0, "-fasan not supported for this target");


Hm, ASAN's flag is now -fsanitizer=[asan,tsan,memory] or some such.  We 
will need to make that change.  But it can wait until after the initial 
port is in trunk.


This patch is OK.


Diego.


Re: [PATCH 01/10] Initial import of asan from the Google branch into trunk

2012-11-06 Thread Diego Novillo

On 2012-11-02 15:56 , Dodji Seketeli wrote:

This patch imports the initial state of asan as it was in the
Google branch.

It provides basic infrastructure for asan to instrument memory
accesses on the heap, at -O3.  Note that it supports neither stack nor
global variable protection.

The rest of the patches of the set is intended to further improve this
base.

* Makefile.in: Add asan.c and its dependencies.
* common.opt: Add -fasan option.
* invoke.texi: Document the new flag.
* passes.c: Add the asan pass.
* toplev.c (compile_file): Call asan_finish_file.
* asan.c: New file.
* asan.h: New file.
* tree-pass.h: Declare pass_asan.


OK.


Diego.


Re: [PATCH 03/10] Emit GIMPLE directly instead of gimplifying GENERIC.

2012-11-06 Thread Diego Novillo

On 2012-11-02 15:57 , Dodji Seketeli wrote:


* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
(asan.o): Update the dependencies of asan.o.
* asan.c (tm.h, tree.h, tm_p.h, basic-block.h, flags.h
function.h, tree-inline.h, tree-dump.h, diagnostic.h, demangle.h,
langhooks.h, ggc.h, cgraph.h, gimple.h): Remove these unused but
included headers.
(shadow_ptr_types): New variable.
(report_error_func): Change is_store argument to bool, don't append
newline to function name.
(PROB_VERY_UNLIKELY, PROB_ALWAYS): Define.
(build_check_stmt): Change is_store argument to bool.  Emit GIMPLE
directly instead of creating trees and gimplifying them.  Mark
the error reporting function as very unlikely.
(instrument_derefs): Change is_store argument to bool.  Use
int_size_in_bytes to compute size_in_bytes, simplify size check.
Use build_fold_addr_expr instead of build_addr.
(transform_statements): Adjust instrument_derefs caller.
Use gimple_assign_single_p as stmt test.  Don't look at MEM refs
in rhs2.
(asan_init_shadow_ptr_types): New function.
(asan_instrument): Don't push/pop gimplify context.
Call asan_init_shadow_ptr_types if not yet initialized.
* asan.h (ASAN_SHADOW_SHIFT): Adjust comment.


OK.


Diego.



Re: [PATCH 04/10] Allow asan at -O0

2012-11-06 Thread Diego Novillo

On 2012-11-02 15:58 , Dodji Seketeli wrote:

This patch defines a new asan pass gate that is activated at -O0, in
addition to the pass that was initially activated at the -O3 level.  The
patch also does some comment cleanups here and there.

* asan.c (build_check_stmt): Rename join_bb variable to else_bb.
(gate_asan_O0): New function.
(pass_asan_O0): New variable.
* passes.c (init_optimization_passes): Add pass_asan_O0.
* tree-pass.h (pass_asan_O0): New declaration.


OK.


Diego.



Re: [PATCH 05/10] Implement protection of stack variables

2012-11-06 Thread Diego Novillo

On 2012-11-02 16:00 , Dodji Seketeli wrote:

This patch implements the protection of stack variables.

To understand how this works, let's look at this example on x86_64
where the stack grows downward:

  int
  foo ()
  {
char a[23] = {0};
int b[2] = {0};

a[5] = 1;
b[1] = 2;

return a[5] + b[1];
  }

For this function, the stack protected by asan will be organized as
follows, from the top of the stack to the bottom:

Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']

Slot 2/ [24 bytes for variable 'a']

Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make
  the next slot be 32 bytes aligned; this one is called Partial
  Redzone; this 32 bytes alignment is an asan constraint]

Slot 4/ [red zone of 32 bytes called 'Middle RedZone']

Slot 5/ [8 bytes for variable 'b']

Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3]

Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
  RedZone']

[A cultural question I've kept asking myself is why the address
  sanitizer authors have called these red zones (LEFT, MIDDLE, RIGHT)
  instead of e.g. (BOTTOM, MIDDLE, TOP).  Maybe they can step up and
  educate me so that I get less confused in the future.  :-)]


I believe they lay out the stack from right to left (top is to the 
right).  Feels like reading a Middle-earth map.  Kostya, is my 
recollection correct?



The 32 bytes of LEFT red zone at the bottom of the stack can be
decomposed as such:

 1/ The first 8 bytes contain a magical asan number that is always
 0x41B58AB3.

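 2/ The following 8 bytes contain a pointer to a string (to be
 parsed at runtime by the runtime asan library), whose format is
 the following:

  "<function-name> <space> <num-of-variables-on-the-stack>
  (<32-bytes-aligned-offset-in-bytes-of-variable> <space>
  <length-of-var-in-bytes> <space> <length-of-var-name> <space>
  <name-of-var>){n} "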
where '(...){n}' means the content inside the parenthesis occurs 'n'
times, with 'n' being the number of variables on the stack.

  3/ The following 16 bytes of the red zone have no particular
  format.

The shadow memory for that stack layout is going to look like this:

 - content of shadow memory 8 bytes for slot 7: 0xF1F1F1F1.
   The F1 byte pattern is a magic number called
   ASAN_STACK_MAGIC_LEFT and is a way for the runtime to know that
   the memory for that shadow byte is part of the LEFT red zone
   intended to sit at the bottom of the variables on the stack.

 - content of shadow memory 8 bytes for slots 6 and 5:
   0xF4F4F400.  The F4 byte pattern is a magic number
   called ASAN_STACK_MAGIC_PARTIAL.  It flags the fact that the
   memory region for this shadow byte is a PARTIAL red zone
   intended to pad a variable A, so that the slot following
   {A,padding} is 32 bytes aligned.

   Note that the fact that the least significant byte of this
   shadow memory content is 00 means that 8 bytes of its
   corresponding memory (which corresponds to the memory of
   variable 'b') are addressable.

 - content of shadow memory 8 bytes for slot 4: 0xF2F2F2F2.
   The F2 byte pattern is a magic number called
   ASAN_STACK_MAGIC_MIDDLE.  It flags the fact that the memory
   region for this shadow byte is a MIDDLE red zone intended to
   sit between two 32 aligned slots of {variable,padding}.

 - content of shadow memory 8 bytes for slot 3 and 2:
   0xF4000000.  This represents the concatenation of
   variable 'a' and the partial red zone following it, like what we
   had for variable 'b'.  The least significant 3 bytes being 00
   means that the 24 bytes of variable 'a' are addressable.

 - content of shadow memory 8 bytes for slot 1: 0xF3F3F3F3.
   The F3 byte pattern is a magic number called
   ASAN_STACK_MAGIC_RIGHT.  It flags the fact that the memory
   region for this shadow byte is a RIGHT red zone intended to sit
   at the top of the variables of the stack.



This is a great summary.  Please put it at the top of asan.c or in some 
other prominent place.
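
The shadow words quoted in the summary can be reproduced with a short
standalone sketch.  It rests on assumptions taken from the description
rather than from asan.c itself: one shadow byte covers 8 bytes of stack,
every slot size is a multiple of 8, and four consecutive shadow bytes are
read as one little-endian 32-bit word.  The F1 to F4 values come from the
summary; the names frame_slot, fill_shadow, shadow_word and example_frame
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum
{
  SHADOW_GRANULE = 8,      /* Assumed: stack bytes per shadow byte.  */
  MAGIC_ADDRESSABLE = 0x00,
  MAGIC_LEFT = 0xF1,       /* ASAN_STACK_MAGIC_LEFT */
  MAGIC_MIDDLE = 0xF2,     /* ASAN_STACK_MAGIC_MIDDLE */
  MAGIC_RIGHT = 0xF3,      /* ASAN_STACK_MAGIC_RIGHT */
  MAGIC_PARTIAL = 0xF4     /* ASAN_STACK_MAGIC_PARTIAL */
};

/* One slot of the frame: its size in bytes and the shadow value that
   describes every 8-byte granule of it.  */
struct frame_slot
{
  size_t size;
  uint8_t shadow;
};

/* The example frame from the summary, from the bottom of the stack
   (slot 7) to the top (slot 1).  */
const struct frame_slot example_frame[] = {
  { 32, MAGIC_LEFT },          /* slot 7: LEFT red zone */
  { 8, MAGIC_ADDRESSABLE },    /* slot 5: variable 'b' */
  { 24, MAGIC_PARTIAL },       /* slot 6: partial red zone */
  { 32, MAGIC_MIDDLE },        /* slot 4: MIDDLE red zone */
  { 24, MAGIC_ADDRESSABLE },   /* slot 2: variable 'a' */
  { 8, MAGIC_PARTIAL },        /* slot 3: partial red zone */
  { 32, MAGIC_RIGHT },         /* slot 1: RIGHT red zone */
};

/* Write one shadow byte per 8-byte granule of each slot into SHADOW,
   never more than CAPACITY bytes.  Return the number of bytes written.  */
size_t
fill_shadow (const struct frame_slot *slots, size_t n_slots,
             uint8_t *shadow, size_t capacity)
{
  size_t used = 0;

  for (size_t i = 0; i < n_slots; i++)
    for (size_t g = 0; g < slots[i].size / SHADOW_GRANULE; g++)
      {
        if (used == capacity)
          return used;
        shadow[used++] = slots[i].shadow;
      }
  return used;
}

/* Read the INDEX-th group of four shadow bytes as a little-endian word.  */
uint32_t
shadow_word (const uint8_t *shadow, size_t index)
{
  const uint8_t *p = shadow + 4 * index;

  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}
```

Filling a 20-byte buffer from example_frame and reading it with shadow_word
gives 0xF1F1F1F1, 0xF4F4F400, 0xF2F2F2F2, 0xF4000000 and 0xF3F3F3F3 for
words 0 to 4 under these assumptions.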




- offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
+ if (flag_asan && pred)
+   {
+ HOST_WIDE_INT prev_offset = frame_offset;
+ tree repr_decl = NULL_TREE;
+
+ offset
+   = alloc_stack_frame_space (stack_vars[i].size
+  + ASAN_RED_ZONE_SIZE,
+  MAX (alignb, ASAN_RED_ZONE_SIZE));
+ VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
+prev_offset);
+ VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
+offset + stack_vars[i].size);


Oh, gee, thanks.  More VEC() code for me to convert ;)


The patch is OK.


Diego.


Re: [PATCH 06/10] Implement protection of global variables

2012-11-06 Thread Diego Novillo

On 2012-11-02 16:01 , Dodji Seketeli wrote:


* varasm.c: Include asan.h.
(assemble_noswitch_variable): Grow size by asan_red_zone_size
if decl is asan protected.
(place_block_symbol): Likewise.
(assemble_variable): If decl is asan protected, increase
DECL_ALIGN if needed, and for decls emitted using
assemble_variable_contents append padding zeros after it.
* Makefile.in (varasm.o): Depend on asan.h.
* asan.c: Include output.h.
(asan_pp, asan_pp_initialized, asan_ctor_statements): New variables.
(asan_pp_initialize, asan_pp_string): New functions.
(asan_emit_stack_protection): Use asan_pp{,_initialized}
instead of local pp{,_initialized} vars, use asan_pp_initialize
and asan_pp_string helpers.
(asan_needs_local_alias, asan_protect_global,
asan_global_struct, asan_add_global): New functions.
(asan_finish_file): Protect global vars that can be protected. Use
asan_ctor_statements instead of ctor_statements
* asan.h (asan_protect_global): New prototype.
(asan_red_zone_size): New inline function.


OK.

Please, also put the high-level description in asan.c's documentation.


Diego.



Re: [PATCH 07/10] Make build_check_stmt accept an SSA_NAME for its base

2012-11-06 Thread Diego Novillo

On 2012-11-02 16:02 , Dodji Seketeli wrote:


* asan.c (build_check_stmt): Accept the memory access to be
represented by an SSA_NAME.


OK.


Diego.


Re: [PATCH 08/10] Factorize condition insertion code out of build_check_stmt

2012-11-06 Thread Diego Novillo

On 2012-11-02 16:03 , Dodji Seketeli wrote:


* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
(build_check_stmt): ... here.


OK.


Diego.



Re: [PATCH 09/10] Instrument built-in memory access function calls

2012-11-06 Thread Diego Novillo

On 2012-11-02 16:05 , Dodji Seketeli wrote:


+static bool
+maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  location_t loc = gimple_location (call);
+
+  if (!is_gimple_call (call))
+return false;


Nit.  Why not factor this out and change the caller to:

if (is_builtin_call (stmt))
   instrument_builtin_call (stmt);

I don't much like functions that do many combined things.


OK, otherwise.


Diego.


Re: [PATCH 10/10] Import the asan runtime library into GCC tree

2012-11-06 Thread Diego Novillo

On 2012-11-02 16:10 , Dodji Seketeli wrote:


 * configure.ac: Add libsanitizer to target_libraries.
* Makefile.def: Ditto.
* configure: Regenerate.
* Makefile.in: Regenerate.
* libsanitizer: New directory for asan runtime.  Contains an empty
tsan directory.

gcc:
* gcc.c (LINK_COMMAND_SPEC): Add -lasan to link command
if -faddress-sanitizer is on.


OK with Jakub's comments addressed.

References to -fasan in diagnostics should be replaced.  But there's 
been another flag name change upstream, so let's do it together with the 
new flag names.



Diego.



Re: [PATCH] Use propagate_threaded_block_debug_into even in loop header copying pass (PR debug/54693)

2012-11-06 Thread Alexandre Oliva
On Nov  6, 2012, Jakub Jelinek  wrote:

> 2012-11-06  Jakub Jelinek  

>   PR debug/54693
>   * tree-flow.h (propagate_threaded_block_debug_into): New prototype.
>   * tree-ssa-threadedge.c (propagate_threaded_block_debug_into): No
>   longer static.
>   * tree-ssa-loop-ch.c (copy_loop_headers): Use it.

>   * gcc.dg/guality/pr54693-2.c: New test.

Nice!  I'd approve this if I was entitled to ;-)

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [PATCH] Stream cgraph_node.ipa_transforms_to_apply

2012-11-06 Thread Martin Jambor
Hi,

On Mon, Nov 05, 2012 at 12:15:50PM +0100, Jan Hubicka wrote:
> > Hi,
> > 
> > the following patch adds streaming of cgraph_node.ipa_transforms_to_apply
> > so that transformation phases of IPA passes are run in LTO too.  It is
> > done by simple streaming of pass.static_pass_number and then looking
> > it up among all_regular_ipa_passes.
> > 
> > Bootstrapped and tested on x86_64-linux, required to make aggregate
> > IPA-CP work in LTO.
> > 
> > OK for trunk?
> > 
> > Thanks,
> > 
> > Martin
> > 
> > 
> > 2012-11-03  Martin Jambor  
> > 
> > * lto-cgraph.c: Include tree-pass.h.
> > (lto_output_node): Stream node->ipa_transforms_to_apply.
> > (input_node): Likewise.
> > * Makefile.in (lto-cgraph.o): Add TREE_PASS_H to dependencies.
> > +  count = streamer_read_hwi (ib);
> > +  node->ipa_transforms_to_apply = NULL;
> > +  for (i = 0; i < count; i++)
> > +{
> > +  struct opt_pass *pass;
> > +  int pi = streamer_read_hwi (ib);
> > +
> > +  for (pass = all_regular_ipa_passes; pass; pass = pass->next)
> > +   if (pass->static_pass_number == pi)
> 
> passes.c computes a vector translating IDs to pass structures; please export
> it and use it here.
> OK with this change.
> 

The following passes bootstrap and testsuite run on
x86_64-linux.  I will commit it tomorrow morning unless somebody
objects.

Thanks,

Martin
 

2012-11-06  Martin Jambor  

* lto-cgraph.c: Include tree-pass.h.
(lto_output_node): Stream node->ipa_transforms_to_apply.
(input_node): Likewise.
* tree-pass.h (passes_by_id): Declare.
(passes_by_id_size): Likewise.
* Makefile.in (lto-cgraph.o): Add TREE_PASS_H to dependencies.

Index: src/gcc/lto-cgraph.c
===
--- src.orig/gcc/lto-cgraph.c
+++ src/gcc/lto-cgraph.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.
 #include "data-streamer.h"
 #include "tree-streamer.h"
 #include "gcov-io.h"
+#include "tree-pass.h"
 
 static void output_cgraph_opt_summary (void);
 static void input_cgraph_opt_summary (VEC (symtab_node, heap) * nodes);
@@ -377,6 +378,8 @@ lto_output_node (struct lto_simple_outpu
   intptr_t ref;
   bool in_other_partition = false;
   struct cgraph_node *clone_of;
+  struct ipa_opt_pass_d *pass;
+  int i;
 
   boundary_p = !lto_symtab_encoder_in_partition_p (encoder, (symtab_node)node);
 
@@ -432,6 +435,12 @@ lto_output_node (struct lto_simple_outpu
   streamer_write_hwi_stream (ob->main_stream, node->count);
   streamer_write_hwi_stream (ob->main_stream, 
node->count_materialization_scale);
 
+  streamer_write_hwi_stream (ob->main_stream,
+VEC_length (ipa_opt_pass,
+node->ipa_transforms_to_apply));
+  FOR_EACH_VEC_ELT (ipa_opt_pass, node->ipa_transforms_to_apply, i, pass)
+streamer_write_hwi_stream (ob->main_stream, pass->pass.static_pass_number);
+
   if (tag == LTO_symtab_analyzed_node)
 {
   if (node->global.inlined_to)
@@ -897,6 +906,7 @@ input_node (struct lto_file_decl_data *f
   int ref = LCC_NOT_FOUND, ref2 = LCC_NOT_FOUND;
   int clone_ref;
   int order;
+  int i, count;
 
   order = streamer_read_hwi (ib) + order_base;
   clone_ref = streamer_read_hwi (ib);
@@ -919,6 +929,19 @@ input_node (struct lto_file_decl_data *f
   node->count = streamer_read_hwi (ib);
   node->count_materialization_scale = streamer_read_hwi (ib);
 
+  count = streamer_read_hwi (ib);
+  node->ipa_transforms_to_apply = NULL;
+  for (i = 0; i < count; i++)
+{
+  struct opt_pass *pass;
+  int pid = streamer_read_hwi (ib);
+
+  gcc_assert (pid < passes_by_id_size);
+  pass = passes_by_id[pid];
+  VEC_safe_push (ipa_opt_pass, heap, node->ipa_transforms_to_apply,
+(struct ipa_opt_pass_d *) pass);
+}
+
   if (tag == LTO_symtab_analyzed_node)
 ref = streamer_read_hwi (ib);
 
Index: src/gcc/Makefile.in
===
--- src.orig/gcc/Makefile.in
+++ src/gcc/Makefile.in
@@ -2143,7 +2143,7 @@ lto-cgraph.o: lto-cgraph.c $(CONFIG_H) $
$(HASHTAB_H) langhooks.h $(BASIC_BLOCK_H) \
$(TREE_FLOW_H) $(CGRAPH_H) $(FUNCTION_H) $(GGC_H) $(DIAGNOSTIC_CORE_H) \
$(EXCEPT_H) $(TIMEVAR_H) pointer-set.h $(LTO_STREAMER_H) \
-   $(GCOV_IO_H) $(DATA_STREAMER_H) $(TREE_STREAMER_H)
+   $(GCOV_IO_H) $(DATA_STREAMER_H) $(TREE_STREAMER_H) $(TREE_PASS_H)
 lto-streamer-in.o: lto-streamer-in.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) toplev.h $(DIAGNOSTIC_CORE_H) $(EXPR_H) $(FLAGS_H) $(PARAMS_H) \
input.h $(HASHTAB_H) $(BASIC_BLOCK_H) $(TREE_FLOW_H) $(TREE_PASS_H) \
Index: src/gcc/tree-pass.h
===
--- src.orig/gcc/tree-pass.h
+++ src/gcc/tree-pass.h
@@ -544,6 +544,9 @@ extern void register_pass (struct regist
directly in jump threading, and avoid peeling them next time.  */
 extern bool first_pass_instance;
 
+extern str

Re: [AArch64] fix missing Dwarf call frame information in the epilogue

2012-11-06 Thread Yufeng Zhang

Hi,

Many thanks for reviewing.  Please find the updated patch.  The explicit 
calls to gen_rtx_PLUS and GEN_INT have been replaced by plus_constant, 
and the call to aarch64_set_frame_expr has been replaced with 
add_reg_note (REG_CFA_ADJUST_CFA).


I'll clean up other cases in aarch64.c in a separate patch.

OK to commit?

Thanks,
Yufeng


gcc/ChangeLog

2012-11-06  Yufeng Zhang  

 * config/aarch64/aarch64.c (aarch64_expand_prologue): For the
 load-pair with writeback instruction, replace
 aarch64_set_frame_expr with add_reg_note (REG_CFA_ADJUST_CFA);
 add new local variable 'cfa_reg' and use it.

gcc/testsuite/ChangeLog

2012-11-06  Yufeng Zhang  

 * gcc.target/aarch64/dwarf-cfa-reg.c: New file.


On 09/12/12 19:37, Richard Henderson wrote:

On 09/12/2012 09:10 AM, Yufeng Zhang wrote:

aarch64_set_frame_expr (gen_rtx_SET
  (Pmode,
   stack_pointer_rtx,
-  gen_rtx_PLUS (Pmode, stack_pointer_rtx,
+  gen_rtx_PLUS (Pmode, cfa_reg,
 GEN_INT (offset))));


We'd prefer to use

   plus_constant (Pmode, cfa_reg, offset)

instead of the explicit call to gen_rtx_PLUS and GEN_INT.
It would appear that the entire aarch64.c file ought to
be audited for that.

Also, use of the REG_CFA_* notes is strongly encouraged over
use of REG_FRAME_RELATED_EXPR.

There's all sorts of work involved in turning R_F_R_E into
R_CFA_* notes, depending on a rather large state machine.
This state machine was developed when only prologues were
annotated for unwinding, and therefore one cannot expect it
to work reliably for epilogues.

A long-term goal is to convert all targets to use R_CFA_*
exclusively, as that preserves much more information present
in the structure of the code of the prologue generator.  It
means less work within the compiler, and eventually being able
to remove a rather large hunk of state-machine code.
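
As a rough illustration of what an "adjust CFA" style note conveys, the
following toy model tracks the rule "CFA = register + offset" and applies a
note of the form "dest = src + adjust".  It is a simplified sketch, not the
code in dwarf2cfi.c; the struct, the function, and the register numbers are
assumptions made for the example.

```c
/* Toy model of a CFA rule: CFA = value of REG + OFFSET.
   This is not GCC's real data structure.  */
struct cfa_sketch
{
  int reg;      /* hypothetical register number used as the base */
  long offset;  /* distance from that register to the CFA */
};

/* Register numbers used only for this example.  */
enum { SKETCH_FP = 29, SKETCH_SP = 31 };

/* Apply a note that says "dest = src + adjust".  The CFA itself does not
   move, so the offset shrinks by ADJUST and the base becomes DEST.
   Returns 0 on success, -1 if SRC is not the current CFA base register.  */
int
cfa_adjust (struct cfa_sketch *cfa, int dest, int src, long adjust)
{
  if (src != cfa->reg)
    return -1;
  cfa->offset -= adjust;
  cfa->reg = dest;
  return 0;
}
```

With CFA = SP + 32, a note for "SP = SP + 32" leaves CFA = SP + 0.  With
CFA = FP + 16, a note for "SP = FP - 8" leaves CFA = SP + 24.  The note
states the arithmetic directly, so nothing has to be inferred from the shape
of the insn.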


r~

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b36be90..8a2d7ba 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1777,7 +1777,7 @@ aarch64_expand_prologue (void)
   - original_frame_size
   - cfun->machine->frame.saved_regs_size);
 
-  /* Store pairs and load pairs have a range only of +/- 512.  */
+  /* Store pairs and load pairs have a range only -512 to 504.  */
   if (offset >= 512)
 {
   /* When the frame has a large size, an initial decrease is done on
@@ -1923,6 +1923,7 @@ aarch64_expand_epilogue (bool for_sibcall)
   HOST_WIDE_INT original_frame_size, frame_size, offset;
   HOST_WIDE_INT fp_offset;
   rtx insn;
+  rtx cfa_reg;
 
   aarch64_layout_frame ();
   original_frame_size = get_frame_size () + cfun->machine->saved_varargs_size;
@@ -1935,7 +1936,9 @@ aarch64_expand_epilogue (bool for_sibcall)
   - original_frame_size
   - cfun->machine->frame.saved_regs_size);
 
-  /* Store pairs and load pairs have a range only of +/- 512.  */
+  cfa_reg = frame_pointer_needed ? hard_frame_pointer_rtx : stack_pointer_rtx;
+
+  /* Store pairs and load pairs have a range only -512 to 504.  */
   if (offset >= 512)
 {
   offset = original_frame_size + cfun->machine->frame.saved_regs_size;
@@ -1966,6 +1969,10 @@ aarch64_expand_epilogue (bool for_sibcall)
   hard_frame_pointer_rtx,
   GEN_INT (- fp_offset)));
   RTX_FRAME_RELATED_P (insn) = 1;
+  /* As SP is set to (FP - fp_offset), according to the rules in
+dwarf2cfi.c:dwarf2out_frame_debug_expr, CFA should be calculated
+from the value of SP from now on.  */
+  cfa_reg = stack_pointer_rtx;
 }
 
   aarch64_save_or_restore_callee_save_registers
@@ -2003,11 +2010,9 @@ aarch64_expand_epilogue (bool for_sibcall)
 GEN_INT (offset),
 GEN_INT (GET_MODE_SIZE (DImode) + offset)));
  RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0, 2)) = 1;
- aarch64_set_frame_expr (gen_rtx_SET
- (Pmode,
-  stack_pointer_rtx,
-  gen_rtx_PLUS (Pmode, stack_pointer_rtx,
-GEN_INT (offset))));
+ add_reg_note (insn, REG_CFA_ADJUST_CFA,
+   (gen_rtx_SET (Pmode, stack_pointer_rtx,
+ plus_constant (cfa_reg, offset))));
}
 
  /* The first part of a frame-related parallel insn
@@ -2027,7 +2032,6 @@ aarch64_expand_epilogue (bool for_sibcall)
  RTX_FRAME_RELATED_P (insn) = 1;
}
}
-
   else
{
  insn = emit_insn (gen_add2_insn (stack_pointer_rtx,
dif

Go patch committed: The Go runtime memcmp needs to return intgo

2012-11-06 Thread Ian Lance Taylor
This patch to the Go frontend and libgo adds a Go-specific memcmp
routine, which returns intgo rather than int.  This is a step toward
using 64-bit int.  The memcmp routine is only used for struct and array
equality comparisons, so it is not really a performance issue.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.
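
For readers who want to see the idea in isolation, here is a minimal sketch
of such a wrapper in portable C.  It assumes a pointer-sized result type,
which is where intgo is heading; the names intgo_sketch and go_memcmp_sketch
are made up for the example and are not the libgo symbols.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for libgo's intgo, assumed to be pointer-sized here.  */
typedef intptr_t intgo_sketch;

/* Compare LEN bytes and return the result in the Go-sized integer type.
   memcmp returns a C int; the return statement converts it to the wider
   type, preserving the sign.  */
intgo_sketch
go_memcmp_sketch (const void *p1, const void *p2, size_t len)
{
  return memcmp (p1, p2, len);
}
```

Callers only rely on the sign of the result (negative, zero, positive), so
the widening does not change behaviour; it only makes the return type match
what the compiler expects for a Go int.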

Ian

diff -r c1a6ccf93d67 go/runtime.def
--- a/go/runtime.def	Mon Nov 05 09:40:48 2012 -0800
+++ b/go/runtime.def	Tue Nov 06 09:53:54 2012 -0800
@@ -29,7 +29,7 @@
 // result types.
 
 // The standard C memcmp function, used for struct comparisons.
-DEF_GO_RUNTIME(MEMCMP, "memcmp", P3(POINTER, POINTER, UINTPTR), R1(INT))
+DEF_GO_RUNTIME(MEMCMP, "__go_memcmp", P3(POINTER, POINTER, UINTPTR), R1(INT))
 
 // Range over a string, returning the next index.
 DEF_GO_RUNTIME(STRINGITER, "runtime.stringiter", P2(STRING, INT), R1(INT))
diff -r c1a6ccf93d67 libgo/Makefile.am
--- a/libgo/Makefile.am	Mon Nov 05 09:40:48 2012 -0800
+++ b/libgo/Makefile.am	Tue Nov 06 09:53:54 2012 -0800
@@ -462,6 +462,7 @@
 	runtime/go-map-len.c \
 	runtime/go-map-range.c \
 	runtime/go-matherr.c \
+	runtime/go-memcmp.c \
 	runtime/go-nanotime.c \
 	runtime/go-now.c \
 	runtime/go-new-map.c \
diff -r c1a6ccf93d67 libgo/runtime/go-memcmp.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +
+++ b/libgo/runtime/go-memcmp.c	Tue Nov 06 09:53:54 2012 -0800
@@ -0,0 +1,13 @@
+/* go-memcmp.c -- the go memory comparison function.
+
+   Copyright 2012 The Go Authors. All rights reserved.
+   Use of this source code is governed by a BSD-style
+   license that can be found in the LICENSE file.  */
+
+#include "runtime.h"
+
+intgo
+__go_memcmp (const void *p1, const void *p2, uintptr len)
+{
+  return __builtin_memcmp (p1, p2, len);
+}


Re: User directed Function Multiversioning via Function Overloading (issue5752064)

2012-11-06 Thread Sriraman Tallam
On Tue, Nov 6, 2012 at 7:52 AM, Jason Merrill  wrote:
> On 11/05/2012 09:38 PM, Sriraman Tallam wrote:
>>
>> +  /* For multi-versioned functions, more than one match is just fine.
>>
>> +Call decls_match to make sure they are different because they are
>> +versioned.  */
>> +  if (DECL_FUNCTION_VERSIONED (fn))
>> +   {
>> +  for (match = TREE_CHAIN (matches); match; match = TREE_CHAIN
>> (match))
>> +   if (!DECL_FUNCTION_VERSIONED (TREE_PURPOSE (match))
>> +   || decls_match (fn, TREE_PURPOSE (match)))
>> + break;
>> +   }
>
>
> I still don't understand what this code is supposed to be doing.  Please
> remove it and instead modify the other loop to allow mismatches that are
> versions of the same function.

Ok, will do. I was trying to do for versioned functions what the other
loop was doing, though I could not come up with a test case to
exercise this code.


I will make all the other changes and get back asap.

Thanks,
-Sri.

>
>> +  /* If the olddecl is a version, so is the newdecl.  */
>> +  if (TREE_CODE (newdecl) == FUNCTION_DECL
>> +  && DECL_FUNCTION_VERSIONED (olddecl))
>> +{
>> +  DECL_FUNCTION_VERSIONED (newdecl) = 1;
>> +  /* newdecl will be purged and is no longer a version.  */
>> +  delete_function_version (newdecl);
>> +}
>
>
> Please make the comment clearer that the reason we're setting the flag on
> the newdecl is so that it'll be copied back into the olddecl; otherwise it
> seems odd to say it's a version and then it isn't a version.
>
>> +  /* If a pointer to a function that is multi-versioned is requested, the
>> + pointer to the dispatcher function is returned instead.  This works
>> + well because indirectly calling the function will dispatch the right
>> + function version at run-time.  */
>>
>> +  if (DECL_FUNCTION_VERSIONED (fn))
>> +{
>> +  tree dispatcher_decl = NULL;
>> +  gcc_assert (targetm.get_function_versions_dispatcher);
>> +  dispatcher_decl = targetm.get_function_versions_dispatcher (fn);
>> +  if (!dispatcher_decl)
>> +   {
>> + error_at (input_location, "Pointer to a multiversioned function"
>> +   " without a default is not allowed");
>> + return error_mark_node;
>> +   }
>> +  retrofit_lang_decl (dispatcher_decl);
>> +  fn = dispatcher_decl;
>
>
> This code should use the get_function_version_dispatcher function in
> cp/call.c.
>
> Jason
>


[v3] libstdc++/51850

2012-11-06 Thread Paolo Carlini

Hi,

this is what I'm finishing testing to add a debug-mode std::array. 
Things work pretty well, but I'm not fiddling for now with the 
iterators: std::array must remain an aggregate, thus we can't have base 
classes, thus we can't really use the standard debug-mode infrastructure 
for those. Based on the various email exchanges over the last years, I 
understand people will anyway like having operator[], and also front and 
back, checked (consistently with, e.g., std::vector). In any case the rest 
is an enhancement beyond the PR.


Thanks,
Paolo.


2012-11-06  Paolo Carlini  

PR libstdc++/51850
* include/debug/array: New, debug-mode implementation.
* include/profile/array: New.
* include/std/array: Adjust.
* include/std/tuple: Just include .
* include/Makefile.am: Add.
* include/Makefile.in: Regenerate.
* testsuite/23_containers/array/debug/front1_neg.cc: New.
* testsuite/23_containers/array/debug/
square_brackets_operator1_neg.cc: Likewise.
* testsuite/23_containers/array/debug/front2_neg.cc: Likewise.
* testsuite/23_containers/array/debug/
square_brackets_operator2_neg.cc: Likewise.
* testsuite/23_containers/array/debug/back1_neg.cc: Likewise.
* testsuite/23_containers/array/debug/back2_neg.cc: Likewise.
* testsuite/23_containers/array/tuple_interface/get_neg.cc: Tweak
to run only in normal-mode.
* testsuite/23_containers/array/tuple_interface/tuple_element_neg.cc:
Likewise.
* testsuite/23_containers/array/tuple_interface/get_debug_neg.cc: New.
* testsuite/23_containers/array/tuple_interface/
tuple_element_debug_neg.cc: Likewise.
Index: include/debug/array
===
--- include/debug/array (revision 0)
+++ include/debug/array (working copy)
@@ -0,0 +1,321 @@
+// Debugging array implementation -*- C++ -*-
+
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file debug/array
+ *  This is a Standard C++ Library header.
+ */
+
+#ifndef _GLIBCXX_DEBUG_ARRAY
+#define _GLIBCXX_DEBUG_ARRAY 1
+
+#pragma GCC system_header
+
+#include 
+
+#ifndef _GLIBCXX_THROW_OR_ABORT
+# if __EXCEPTIONS
+#  define _GLIBCXX_THROW_OR_ABORT(_Exc) (throw (_Exc))
+# else
+#  define _GLIBCXX_THROW_OR_ABORT(_Exc) (__builtin_abort())
+# endif
+#endif
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+namespace __debug
+{
+  template<typename _Tp, std::size_t _Nm>
+struct array
+{
+  typedef _Tp                                   value_type;
+  typedef value_type*                           pointer;
+  typedef const value_type*                     const_pointer;
+  typedef value_type&                           reference;
+  typedef const value_type&                     const_reference;
+  typedef value_type*                           iterator;
+  typedef const value_type*                     const_iterator;
+  typedef std::size_t                           size_type;
+  typedef std::ptrdiff_t                        difference_type;
+  typedef std::reverse_iterator<iterator>       reverse_iterator;
+  typedef std::reverse_iterator<const_iterator> const_reverse_iterator;
+
+  // Support for zero-sized arrays mandatory.
+  typedef _GLIBCXX_STD_C::__array_traits<_Tp, _Nm> _AT_Type;
+  typename _AT_Type::_Type _M_elems;
+
+  template<std::size_t _Size>
+   struct _Array_check_subscript
+   {
+ std::size_t size() { return _Size; }
+
+ _Array_check_subscript(std::size_t __index)
+ { __glibcxx_check_subscript(__index); }
+};
+
+  template<std::size_t _Size>
+   struct _Array_check_nonempty
+   {
+ bool empty() { return _Size == 0; }
+
+ _Array_check_nonempty()
+ { __glibcxx_check_nonempty(); }
+};
+
+  // No explicit construct/copy/destroy for aggregate type.
+
+  // DR 776.
+  void
+  f

Re: New badness metric for inliner

2012-11-06 Thread Jan Hubicka
> 
> This broke the bootstrap on sparc:
> 
> /home/davem/src/GIT/GCC/build-sparc32-linux/./prev-gcc/g++ 
> -B/home/davem/src/GIT/GCC/build-sparc32\
> -linux/./prev-gcc/ -B/usr/local/sparc-unknown-linux-gnu/bin/ -nostdinc++ 
> -B/home/davem/src/GIT/GCC\
> /build-sparc32-linux/prev-sparc-unknown-linux-gnu/libstdc++-v3/src/.libs 
> -B/home/davem/src/GIT/GCC\
> /build-sparc32-linux/prev-sparc-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
>  -I/home/davem/src/G\
> IT/GCC/build-sparc32-linux/prev-sparc-unknown-linux-gnu/libstdc++-v3/include/sparc-unknown-linux-g\
> nu 
> -I/home/davem/src/GIT/GCC/build-sparc32-linux/prev-sparc-unknown-linux-gnu/libstdc++-v3/include\
>  -I/home/davem/src/GIT/GCC/gcc/libstdc++-v3/libsupc++ 
> -L/home/davem/src/GIT/GCC/build-sparc32-linu\
> x/prev-sparc-unknown-linux-gnu/libstdc++-v3/src/.libs 
> -L/home/davem/src/GIT/GCC/build-sparc32-linu\
> x/prev-sparc-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -c   -g -O2 
> -gtoggle -DIN_GCC   -fno-e\
> xceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
> -Wwrite-strings -Wcast-qu\
> al -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros 
> -Wno-overlength-string\
> s -Werror -fno-common  -DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc 
> -I../../gcc/gcc/. -I../../gcc/gcc/.\
> ./include -I../../gcc/gcc/../libcpp/include  -I../../gcc/gcc/../libdecnumber 
> -I../../gcc/gcc/../li\
> bdecnumber/dpd -I../libdecnumber -I../../gcc/gcc/../libbacktrace 
> -DCLOOG_INT_GMP../../gcc/gcc/\
> graphite-interchange.c -o graphite-interchange.o
> ../../gcc/gcc/graphite-interchange.c:645:1: internal compiler error: in 
> relative_time_benefit, at \
> ipa-inline.c:784
The problem here is really that MAX_TIME * MAX_FREQ does not fit into a 32-bit 
integer.  Fixed thus.
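
The arithmetic issue can be shown outside the compiler.  The sketch below
widens to 64 bits before multiplying and then divides with rounding, in the
spirit of the RDIV macro.  The function names are hypothetical, and the
values used to exercise it are illustrative, not the real MAX_TIME or
frequency limits.

```c
#include <stdint.h>

/* Rounded division in the spirit of GCC's RDIV macro.
   Assumes x >= 0 and y > 0.  */
int64_t
rdiv64 (int64_t x, int64_t y)
{
  return (x + y / 2) / y;
}

/* Scale TIME by FREQUENCY / BASE.  The cast widens the first operand so
   the multiplication is carried out in 64 bits.  */
int64_t
scaled_time (int32_t time, int32_t frequency, int32_t base)
{
  return rdiv64 ((int64_t) time * frequency, base);
}

/* Nonzero if TIME * FREQUENCY would not fit in a 32-bit signed integer.  */
int
product_overflows_int32 (int32_t time, int32_t frequency)
{
  int64_t wide = (int64_t) time * frequency;

  return wide > INT32_MAX || wide < INT32_MIN;
}
```

For example, a time of 1000000 and a frequency of 100000 give a product of
10^11, which is far outside the 32-bit range, while the scaled result with a
base of 1000 is 100000000 and fits comfortably.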

* ipa-inline.c (compute_uninlined_call_time): Return gcov_type.
(compute_inlined_call_time): Watch overflows.
(relative_time_benefit): Compute in gcov_type.
Index: ipa-inline.c
===
--- ipa-inline.c(revision 193246)
+++ ipa-inline.c(working copy)
@@ -459,16 +459,16 @@ want_early_inline_function_p (struct cgr
 /* Compute time of the edge->caller + edge->callee execution when inlining
does not happen.  */
 
-inline int
+inline gcov_type
 compute_uninlined_call_time (struct inline_summary *callee_info,
 struct cgraph_edge *edge)
 {
-  int uninlined_call_time =
+  gcov_type uninlined_call_time =
 RDIV ((gcov_type)callee_info->time * MAX (edge->frequency, 1),
  CGRAPH_FREQ_BASE);
-  int caller_time = inline_summary (edge->caller->global.inlined_to
-   ? edge->caller->global.inlined_to
-   : edge->caller)->time;
+  gcov_type caller_time = inline_summary (edge->caller->global.inlined_to
+ ? edge->caller->global.inlined_to
+ : edge->caller)->time;
   return uninlined_call_time + caller_time;
 }
 
@@ -479,12 +479,13 @@ inline gcov_type
 compute_inlined_call_time (struct cgraph_edge *edge,
   int edge_time)
 {
-  int caller_time = inline_summary (edge->caller->global.inlined_to
-   ? edge->caller->global.inlined_to
-   : edge->caller)->time;
-  int time = caller_time + RDIV ((edge_time - inline_edge_summary 
(edge)->call_stmt_time)
-* MAX (edge->frequency, 1),
-CGRAPH_FREQ_BASE);
+  gcov_type caller_time = inline_summary (edge->caller->global.inlined_to
+ ? edge->caller->global.inlined_to
+ : edge->caller)->time;
+  gcov_type time = (caller_time
+   + RDIV (((gcov_type) edge_time
+- inline_edge_summary (edge)->call_stmt_time)
+   * MAX (edge->frequency, 1), CGRAPH_FREQ_BASE));
   /* Possible one roundoff error, but watch for overflows.  */
   gcc_checking_assert (time >= INT_MIN / 2);
   if (time < 0)
@@ -770,9 +771,9 @@ relative_time_benefit (struct inline_sum
   struct cgraph_edge *edge,
   int edge_time)
 {
-  int relbenefit;
-  int uninlined_call_time = compute_uninlined_call_time (callee_info, edge);
-  int inlined_call_time = compute_inlined_call_time (edge, edge_time);
+  gcov_type relbenefit;
+  gcov_type uninlined_call_time = compute_uninlined_call_time (callee_info, 
edge);
+  gcov_type inlined_call_time = compute_inlined_call_time (edge, edge_time);
 
   /* Inlining into extern inline function is not a win.  */
   if (DECL_EXTERNAL (edge->caller->global.inlined_to
@@ -918,7 +919,7 @@ edge_badness (struct cgraph_edge *edge, 
   (int) badness, (double)edge->frequency / CGRAPH_FREQ_BASE,
   relative_time_benefit (calle

Re: [AArch64] fix missing Dwarf call frame information in the epilogue

2012-11-06 Thread Richard Henderson
On 2012-11-06 09:56, Yufeng Zhang wrote:
> 2012-11-06  Yufeng Zhang  
> 
>  * config/aarch64/aarch64.c (aarch64_expand_prologue): For the
>  load-pair with writeback instruction, replace
>  aarch64_set_frame_expr with add_reg_note (REG_CFA_ADJUST_CFA);
>  add new local variable 'cfa_reg' and use it.
> 
> gcc/testsuite/ChangeLog
> 
> 2012-11-06  Yufeng Zhang  
> 
>  * gcc.target/aarch64/dwarf-cfa-reg.c: New file.

Looks good.


r~


Re: [AArch64] fix missing Dwarf call frame information in the epilogue

2012-11-06 Thread Marcus Shawcroft

On 06/11/12 18:21, Richard Henderson wrote:

On 2012-11-06 09:56, Yufeng Zhang wrote:

2012-11-06  Yufeng Zhang

  * config/aarch64/aarch64.c (aarch64_expand_prologue): For the
  load-pair with writeback instruction, replace
  aarch64_set_frame_expr with add_reg_note (REG_CFA_ADJUST_CFA);
  add new local variable 'cfa_reg' and use it.

gcc/testsuite/ChangeLog

2012-11-06  Yufeng Zhang

  * gcc.target/aarch64/dwarf-cfa-reg.c: New file.


Looks good.


r~



OK.  Yufeng, please back port this onto ARM/aarch64-4.7-branch.

Cheers
/Marcus



Re: New badness metric for inliner

2012-11-06 Thread David Miller
From: Jan Hubicka 
Date: Tue, 6 Nov 2012 19:21:46 +0100

> The problem here is really that MAX_TIME * MAX_FREQ do not fit into 32bit 
> integer. Fixed thus.
> 
>   * ipa-inline.c (compute_uninlined_call_time): Return gcov_type.
>   (compute_inlined_call_time): Watch overflows.
>   (relative_time_benefit): Compute in gcov_type.

Thanks Jan, I'll test this right now.


Go patch committed: Size of int is now 64 bits on x86_64

2012-11-06 Thread Ian Lance Taylor
This patch to the Go compiler and library changes the size of the Go
type "int" to be the same as the size of a pointer.  This means that on
x86_64 the size of int will be 64 bits.  This matches the new behaviour
of the other Go compiler, and is the intended implementation for the
future Go 1.1 release.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.
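
The sizing rule applied in gogo.cc below can be restated as a small
standalone C function.  The name go_int_bits is made up for this sketch;
only the rule itself (pointer width, but never less than 32 bits) comes from
the patch.

```c
/* Width in bits of Go's "int" for a target with the given pointer width:
   the same as a pointer, but at least 32 bits.  */
int
go_int_bits (int pointer_bits)
{
  int int_type_size = pointer_bits;

  if (int_type_size < 32)
    int_type_size = 32;
  return int_type_size;
}
```

So a 64-bit target gets a 64-bit int, a 32-bit target keeps a 32-bit int,
and a hypothetical 16-bit target would still get 32 bits.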

Ian

diff -r 530c277d39c4 go/gogo.cc
--- a/go/gogo.cc	Tue Nov 06 10:26:29 2012 -0800
+++ b/go/gogo.cc	Tue Nov 06 10:43:54 2012 -0800
@@ -23,8 +23,7 @@
 
 // Class Gogo.
 
-Gogo::Gogo(Backend* backend, Linemap* linemap, int int_type_size,
-   int pointer_size)
+Gogo::Gogo(Backend* backend, Linemap* linemap, int, int pointer_size)
   : backend_(backend),
 linemap_(linemap),
 package_(NULL),
@@ -83,6 +82,7 @@
   this->add_named_type(Type::make_complex_type("complex128", 128,
 	   RUNTIME_TYPE_KIND_COMPLEX128));
 
+  int int_type_size = pointer_size;
   if (int_type_size < 32)
 int_type_size = 32;
   this->add_named_type(Type::make_integer_type("uint", true,
diff -r 530c277d39c4 libgo/runtime/runtime.h
--- a/libgo/runtime/runtime.h	Tue Nov 06 10:26:29 2012 -0800
+++ b/libgo/runtime/runtime.h	Tue Nov 06 10:43:54 2012 -0800
@@ -41,8 +41,8 @@
 typedef signed int   intptr __attribute__ ((mode (pointer)));
 typedef unsigned int uintptr __attribute__ ((mode (pointer)));
 
-typedef int		intgo; // Go's int
-typedef unsigned int	uintgo; // Go's uint
+typedef intptr		intgo; // Go's int
+typedef uintptr		uintgo; // Go's uint
 
 /* Defined types.  */
 


[PATCH,AArch64] Optimise comparison where intermediate result not used

2012-11-06 Thread Ian Bolton
Hi all,

When we perform an addition but only use the result for a comparison,
we can save an instruction.

Consider this function:

int foo (int a, int b) {
  return ((a + b) == 0) ? 1 : 7;
}


Here is the original output:

foo:
add w0, w0, w1
cmp w0, wzr
mov w1, 7
mov w0, 1
csel w0, w1, w0, ne
ret

Now we get this:

foo:
cmn w0, w1
mov w1, 7
mov w0, 1
csel w0, w1, w0, ne
ret

:)
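
To spell out why the add can be dropped: the comparison only needs the
condition flags of the addition, not the sum itself.  The following C model
is a sketch of that idea.  The helper names are hypothetical, and unsigned
arithmetic is used so that the model wraps the way the hardware does.

```c
#include <stdint.h>

/* Zero flag that a flag-setting 32-bit add of A and B would produce:
   set when the wrapped sum is zero.  The sum is never returned, which
   mirrors what cmn does.  */
int
add_sets_zero_flag (int32_t a, int32_t b)
{
  uint32_t sum = (uint32_t) a + (uint32_t) b;

  return sum == 0;
}

/* The function from this mail, written so that it consumes only the flag.  */
int
foo_via_flags (int32_t a, int32_t b)
{
  return add_sets_zero_flag (a, b) ? 1 : 7;
}
```

foo_via_flags (3, -3) yields 1 and foo_via_flags (1, 2) yields 7, matching
the original foo for inputs whose sum does not overflow.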


I added other testcases for this and also some for adds and subs, which
were investigated as part of this work.


OK for trunk?

Cheers,
Ian


2012-11-06  Ian Bolton  

  * gcc/config/aarch64/aarch64.md (*compare_neg): New pattern.
  * gcc/testsuite/gcc.target/aarch64/cmn.c: New test.
  * gcc/testsuite/gcc.target/aarch64/adds.c: New test.
  * gcc/testsuite/gcc.target/aarch64/subs.c: New test.




diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index e6086a9..6935192 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1310,6 +1310,17 @@
(set_attr "mode" "")]
 )
 
+(define_insn "*compare_neg"
+  [(set (reg:CC CC_REGNUM)
+   (compare:CC
+(match_operand:GPI 0 "register_operand" "r")
+(neg:GPI (match_operand:GPI 1 "register_operand" "r"))))]
+  ""
+  "cmn\\t%0, %1"
+  [(set_attr "v8type" "alus")
+   (set_attr "mode" "")]
+)
+
 (define_insn "*add__"
   [(set (match_operand:GPI 0 "register_operand" "=rk")
(plus:GPI (ASHIFT:GPI (match_operand:GPI 1 "register_operand" "r")
diff --git a/gcc/testsuite/gcc.target/aarch64/adds.c
b/gcc/testsuite/gcc.target/aarch64/adds.c
new file mode 100644
index 000..aa42321
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/adds.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int z;
+int
+foo (int x, int y)
+{
+  int l = x + y;
+  if (l == 0)
+return 5;
+
+  /* { dg-final { scan-assembler "adds\tw\[0-9\]" } } */
+  z = l ;
+  return 25;
+}
+
+typedef long long s64;
+
+s64 zz;
+s64
+foo2 (s64 x, s64 y)
+{
+  s64 l = x + y;
+  if (l < 0)
+return 5;
+
+  /* { dg-final { scan-assembler "adds\tx\[0-9\]" } } */
+  zz = l ;
+  return 25;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/cmn.c
b/gcc/testsuite/gcc.target/aarch64/cmn.c
new file mode 100644
index 000..1f06f57
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cmn.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int a, int b)
+{
+  if (a + b)
+return 5;
+  else
+return 2;
+  /* { dg-final { scan-assembler "cmn\tw\[0-9\]" } } */
+}
+
+typedef long long s64;
+
+s64
+foo2 (s64 a, s64 b)
+{
+  if (a + b)
+return 5;
+  else
+return 2;
+  /* { dg-final { scan-assembler "cmn\tx\[0-9\]" } } */
+}  
diff --git a/gcc/testsuite/gcc.target/aarch64/subs.c
b/gcc/testsuite/gcc.target/aarch64/subs.c
new file mode 100644
index 000..2bf1975
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/subs.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int z;
+int
+foo (int x, int y)
+{
+  int l = x - y;
+  if (l == 0)
+return 5;
+
+  /* { dg-final { scan-assembler "subs\tw\[0-9\]" } } */
+  z = l ;
+  return 25;
+}
+
+typedef long long s64;
+
+s64 zz;
+s64
+foo2 (s64 x, s64 y)
+{
+  s64 l = x - y;
+  if (l < 0)
+return 5;
+
+  /* { dg-final { scan-assembler "subs\tx\[0-9\]" } } */
+  zz = l ;
+  return 25;
+}





Re: New badness metric for inliner

2012-11-06 Thread David Miller
From: David Miller 
Date: Tue, 06 Nov 2012 13:26:53 -0500 (EST)

> From: Jan Hubicka 
> Date: Tue, 6 Nov 2012 19:21:46 +0100
> 
>> The problem here is really that MAX_TIME * MAX_FREQ do not fit into 32bit 
>> integer. Fixed thus.
>> 
>>  * ipa-inline.c (compute_uninlined_call_time): Return gcov_type.
>>  (compute_inlined_call_time): Watch overflows.
>>  (relative_time_benefit): Compute in gcov_type.
> 
> Thanks Jan, I'll test this right now.

Bootstrap still fails with this change installed:

../../gcc/gcc/graphite-interchange.c:645:1: internal compiler error: in 
relative_time_benefit, at \
ipa-inline.c:785
 }
 ^
0x108289f relative_time_benefit
../../gcc/gcc/ipa-inline.c:785
0x1082fcb edge_badness
../../gcc/gcc/ipa-inline.c:895
0x108372f update_edge_key
../../gcc/gcc/ipa-inline.c:963
0x10840db update_callee_keys
../../gcc/gcc/ipa-inline.c:1142
0x1085b47 inline_small_functions
../../gcc/gcc/ipa-inline.c:1595
0x10864f7 ipa_inline
../../gcc/gcc/ipa-inline.c:1770
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
make[3]: *** [graphite-interchange.o] Error 1
make[3]: *** Waiting for unfinished jobs
make[2]: *** [all-stage2-gcc] Error 2
make[1]: *** [stage2-bubble] Error 2
make: *** [all] Error 2

And as Toon pointed out, even x86-64 is seeing this problem, so something
other than the size of the datum holding the value is at work here.


[PATCH,AArch64] Use CSINC instead of CSEL to return 1

2012-11-06 Thread Ian Bolton
Where a CSEL can return the value 1 as one of the alternatives,
it is usually more efficient to use a CSINC than a CSEL (and
never less efficient), since the value of 1 can be derived from
wzr, rather than needing to set it up in a register first.

This patch enables this capability.
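
For reference, here is a small C model of the three conditional-select forms
involved.  It is only a sketch of the instruction semantics; the function
names are made up, and the zero register is modelled as the constant 0.

```c
#include <stdint.h>

/* CSEL: result = cond ? rn : rm.  */
int32_t
csel_model (int cond, int32_t rn, int32_t rm)
{
  return cond ? rn : rm;
}

/* CSINC: result = cond ? rn : rm + 1 (wrapping).  */
int32_t
csinc_model (int cond, int32_t rn, int32_t rm)
{
  return cond ? rn : (int32_t) ((uint32_t) rm + 1u);
}

/* CSINV: result = cond ? rn : ~rm.  */
int32_t
csinv_model (int cond, int32_t rn, int32_t rm)
{
  return cond ? rn : ~rm;
}

/* (a < b) ? 1 : 7 using CSINC with the zero register: select 7 when
   a >= b holds, otherwise produce 0 + 1.  */
int32_t
one_or_seven (int32_t a, int32_t b)
{
  const int32_t zero_reg = 0;

  return csinc_model (a >= b, 7, zero_reg);
}
```

Only the constant 7 has to be materialised in a register; the 1 comes from
the zero register, in the same way that CSINV with the zero register
produces -1.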

It has been regression tested on trunk.

OK for commit?

Cheers,
Ian



2012-11-06  Ian Bolton  

* gcc/config/aarch64/aarch64.md (cmov_insn): Emit
CSINC when one of the alternatives is constant 1.
* gcc/config/aarch64/constraints.md: New constraint.
* gcc/config/aarch64/predicates.md: Rename predicate
aarch64_reg_zero_or_m1 to aarch64_reg_zero_or_m1_or_1.

* gcc/testsuite/gcc.target/aarch64/csinc-2.c: New test.



-

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 6935192..038465e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1877,19 +1877,23 @@
 )
 
 (define_insn "*cmov_insn"
-  [(set (match_operand:ALLI 0 "register_operand" "=r,r,r,r")
+  [(set (match_operand:ALLI 0 "register_operand" "=r,r,r,r,r,r,r")
(if_then_else:ALLI
 (match_operator 1 "aarch64_comparison_operator"
  [(match_operand 2 "cc_register" "") (const_int 0)])
-(match_operand:ALLI 3 "aarch64_reg_zero_or_m1" "rZ,rZ,UsM,UsM")
-(match_operand:ALLI 4 "aarch64_reg_zero_or_m1" "rZ,UsM,rZ,UsM")))]
-  ""
-  ;; Final alternative should be unreachable, but included for completeness
+(match_operand:ALLI 3 "aarch64_reg_zero_or_m1_or_1"
"rZ,rZ,UsM,rZ,Ui1,UsM,Ui1")
+(match_operand:ALLI 4 "aarch64_reg_zero_or_m1_or_1"
"rZ,UsM,rZ,Ui1,rZ,UsM,Ui1")))]
+  "!((operands[3] == const1_rtx && operands[4] == constm1_rtx)
+ || (operands[3] == constm1_rtx && operands[4] == const1_rtx))"
+  ;; Final two alternatives should be unreachable, but included for
completeness
   "@
csel\\t%0, %3, %4, %m1
csinv\\t%0, %3, zr, %m1
csinv\\t%0, %4, zr, %M1
-   mov\\t%0, -1"
+   csinc\\t%0, %3, zr, %m1
+   csinc\\t%0, %4, zr, %M1
+   mov\\t%0, -1
+   mov\\t%0, 1"
   [(set_attr "v8type" "csel")
(set_attr "mode" "")]
 )
diff --git a/gcc/config/aarch64/constraints.md
b/gcc/config/aarch64/constraints.md
index da50a47..780faaa 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -102,6 +102,11 @@
   A constraint that matches the immediate constant -1."
   (match_test "op == constm1_rtx"))
 
+(define_constraint "Ui1"
+  "@internal
+  A constraint that matches the immediate constant +1."
+  (match_test "op == const1_rtx"))
+
 (define_constraint "Ui3"
   "@internal
   A constraint that matches the integers 0...4."
diff --git a/gcc/config/aarch64/predicates.md
b/gcc/config/aarch64/predicates.md
index 328e5cf..aae71c1 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -31,11 +31,12 @@
(ior (match_operand 0 "register_operand")
(match_test "op == const0_rtx"
 
-(define_predicate "aarch64_reg_zero_or_m1"
+(define_predicate "aarch64_reg_zero_or_m1_or_1"
   (and (match_code "reg,subreg,const_int")
(ior (match_operand 0 "register_operand")
(ior (match_test "op == const0_rtx")
-(match_test "op == constm1_rtx")
+(ior (match_test "op == constm1_rtx")
+ (match_test "op == const1_rtx"))
 
 (define_predicate "aarch64_fp_compare_operand"
   (ior (match_operand 0 "register_operand")
diff --git a/gcc/testsuite/gcc.target/aarch64/csinc-2.c
b/gcc/testsuite/gcc.target/aarch64/csinc-2.c
new file mode 100644
index 000..6ed9080
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/csinc-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int a, int b)
+{
+  return (a < b) ? 1 : 7;
+  /* { dg-final { scan-assembler "csinc\tw\[0-9\].*wzr" } } */
+}
+
+typedef long long s64;
+
+s64
+foo2 (s64 a, s64 b)
+{
+  return (a == b) ? 7 : 1;
+  /* { dg-final { scan-assembler "csinc\tx\[0-9\].*xzr" } } */
+}





[PATCH, i386]: Mark AVX maskstore memory operand as read-written

2012-11-06 Thread Uros Bizjak
Hello!

We don't mark memory operand 0 of the AVX maskstore insn as read-written; the
attached patch fixes this.
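
A scalar model shows why the destination counts as an input as well as an
output: elements whose mask bit is clear keep their previous contents.  This
is a sketch only; maskstore_model is a made-up name, and it processes 32-bit
elements one at a time rather than as a vector.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a masked store: src[i] is written only where the sign
   bit of mask[i] is set; every other element of dest keeps its old value.  */
void
maskstore_model (int32_t *dest, const int32_t *mask, const int32_t *src,
                 size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (mask[i] < 0)
      dest[i] = src[i];
}
```

Starting from dest = {1, 2, 3, 4} with mask = {-1, 0, -1, 0} and
src = {10, 20, 30, 40}, the result is {10, 2, 30, 4}.  If the memory were
treated as write-only, an earlier store of the 2 and the 4 could be
considered dead.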

2012-11-06  Uros Bizjak  

* config/i386/sse.md
(_maskstore): Mark operand 0
as read and written by the instruction.

Tested on x86_64-pc-linux-gnu, committed to mainline SVN, will be
backported to 4.7 branch.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 193243)
+++ config/i386/sse.md  (working copy)
@@ -11067,7 +11067,7 @@
(set_attr "mode" "")])
 
 (define_insn "_maskstore"
-  [(set (match_operand:V48_AVX2 0 "memory_operand" "=m")
+  [(set (match_operand:V48_AVX2 0 "memory_operand" "+m")
(unspec:V48_AVX2
  [(match_operand: 1 "register_operand" "x")
   (match_operand:V48_AVX2 2 "register_operand" "x")


Re: [C++11] PR54413 Option for turning off compiler extensions for numeric literals.

2012-11-06 Thread Jason Merrill

Why three separate flags?

The flag(s) need(s) to be documented in doc/invoke.texi.


@@ -721,7 +733,12 @@
 case OPT_std_c__1y:
 case OPT_std_gnu__1y:
   if (!preprocessing_asm_p)
-   set_std_cxx1y (code == OPT_std_c__11 /* ISO */);
+   {
+ set_std_cxx1y (code == OPT_std_c__11 /* ISO */);
+ cpp_opts->imaginary_literals = 0;
+ cpp_opts->fixed_point_literals = 0;
+ cpp_opts->machine_defined_literals = 0;
+   }


I think I would disable the built-in extension in both C++11 and C++1y 
if we're in ISO mode, and leave it enabled if we're in GNU mode.


I think the ideal behavior for these suffixes would be to treat them as 
user-defined literals if a corresponding literal operator is available, 
or use the built-in extension if not.  But that doesn't need to happen now.


Jason


[PATCH, i386]: Remove superfluous clear in ix86_init_machine_status

2012-11-06 Thread Uros Bizjak
Hello!

The allocator function name does mention "cleared", so the explicit clear is redundant.

2012-11-06  Uros Bizjak  

* config/i386/i386.c (ix86_init_machine_status): Do not
explicitly clear tls_descriptor_call_expanded_p again.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN, will
be backported to 4.7.

Uros.

Index: i386.c
===
--- i386.c  (revision 193244)
+++ i386.c  (working copy)
@@ -23415,7 +23417,6 @@ ix86_init_machine_status (void)

   f = ggc_alloc_cleared_machine_function ();
   f->use_fast_prologue_epilogue_nregs = -1;
-  f->tls_descriptor_call_expanded_p = 0;
   f->call_abi = ix86_abi;
   f->optimize_mode_switching[AVX_U128] = TARGET_VZEROUPPER;


Re: [patch] Contribute performance comparison script.

2012-11-06 Thread Diego Novillo
On Mon, Nov 5, 2012 at 9:38 PM, Lawrence Crowl  wrote:

> 2012-11-05  Lawrence Crowl  
>
> * compare_two_ftime_report_sets: New.

OK.  Thanks.


Diego.


Re: [C++] Omit overflow check for new char[n]

2012-11-06 Thread Jason Merrill

OK.

Jason


Re: New badness metric for inliner

2012-11-06 Thread David Miller
From: David Miller 
Date: Tue, 06 Nov 2012 13:54:01 -0500 (EST)

> From: David Miller 
> Date: Tue, 06 Nov 2012 13:26:53 -0500 (EST)
> 
>> From: Jan Hubicka 
>> Date: Tue, 6 Nov 2012 19:21:46 +0100
>> 
>>> The problem here is really that MAX_TIME * MAX_FREQ does not fit into a
>>> 32-bit integer. Fixed thus.
>>> 
>>> * ipa-inline.c (compute_uninlined_call_time): Return gcov_type.
>>> (compute_inlined_call_time): Watch overflows.
>>> (relative_time_benefit): Compute in gcov_type.
>> 
>> Thanks Jan, I'll test this right now.
> 
> Bootstrap still fails with this change installed:
> 
> ../../gcc/gcc/graphite-interchange.c:645:1: internal compiler error: in 
> relative_time_benefit, at \
> ipa-inline.c:785
>  }

The problem appears to be that inline_summary (edge->caller)->time
is negative.

#1  0x010828a0 in relative_time_benefit (callee_info=0xf76fcb10, 
edge=0xf598a980, edge_time=3861) \
at ../../gcc/gcc/ipa-inline.c:785
(gdb) p callee_info->time
$1200 = 3864
(gdb) p edge->frequency
$1201 = 263
(gdb) p (callee_info->time * edge->frequency)
$1202 = 1016232
(gdb) p edge->caller->global.inlined_to
$1203 = (cgraph_node *) 0x0
(gdb) p edge->caller
$1204 = (cgraph_node *) 0xf589ed10
(gdb) p p inline_summary (edge->caller)->time
No symbol "p" in current context.
(gdb) p inline_summary (edge->caller)->time
$1205 = -1044761
(gdb) 


Re: [PATCH] PR c++/54466 - ICE with alias template which type-id is const qualified

2012-11-06 Thread Jason Merrill

OK.  You could also use TYPE_MAIN_DECL instead of TYPE_STUB_DECL.

Jason


Re: PR c/51294 spurious warning from -Wconversion in C and C++ in conditional expressions

2012-11-06 Thread Jason Merrill

OK.

Jason


Re: [C++ PATCH] Fix cplus_decl_attributes (PR c++/54988)

2012-11-06 Thread Jason Merrill

Please add a comment.  OK with that change.

Jason


Re: [Dwarf Fission] Implement Fission Proposal (issue6305113)

2012-11-06 Thread Sterling Augustine
On Mon, Nov 5, 2012 at 3:18 PM, Cary Coutant  wrote:
>> +/* %:replace-extension spec function.  Replaces the extension of the
>> +   first argument with the second argument.  */
>> +
>> +const char *
>> +replace_extension_spec_func (int argc, const char **argv)
>> +{
>> +  char *name;
>> +  char *p;
>> +  char *result;
>> +
>> +  if (argc != 2)
>> +fatal_error ("too few arguments to %%:replace-extension");
>> +
>> +  name = xstrdup (argv[0]);
>> +  p = strrchr (name, '.');
>> +  if (p != NULL)
>> +  *p = '\0';
>> +
>> +  result = concat (name, argv[1], NULL);
>> +
>> +  free (name);
>> +  return result;
>> +}
>
> This doesn't do the right thing when there is no '.' in the last
> component of the path. It should look for the last DIR_SEPARATOR,
> then search for the last '.' after that.

Good catch. Fixed.
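
For reference, a minimal sketch of the behaviour described in the review comment, not the code that went into the patch. The name `replace_extension_sketch`, the use of plain `malloc` instead of GCC's `concat`/`xstrdup`, and the assumption that '/' is the only directory separator are all choices made for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: '/' is the only directory separator.  */
#define SKETCH_DIR_SEPARATOR '/'

/* Return a newly allocated copy of NAME with its extension replaced by
   NEW_EXT.  Only a '.' in the last path component counts as the start
   of an extension; if there is none, NEW_EXT is simply appended.
   Returns NULL on allocation failure.  */
static char *
replace_extension_sketch (const char *name, const char *new_ext)
{
  const char *base = strrchr (name, SKETCH_DIR_SEPARATOR);
  const char *dot;
  size_t stem_len;
  char *result;

  /* Start the search for '.' after the last separator, if any.  */
  base = base ? base + 1 : name;
  dot = strrchr (base, '.');
  stem_len = dot ? (size_t) (dot - name) : strlen (name);

  result = malloc (stem_len + strlen (new_ext) + 1);
  if (result == NULL)
    return NULL;
  memcpy (result, name, stem_len);
  strcpy (result + stem_len, new_ext);
  return result;
}
```

With this shape, a path such as "dir.d/foo" keeps its directory part intact and only gains the new extension.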

>> +/* Describe an entry into the .debug_addr section.  */
>> +
>> +enum ate_kind {
>> +  ate_kind_rtx,
>> +  ate_kind_rtx_dtprel,
>> +  ate_kind_label
>> +};
>> +
>> +typedef struct GTY(()) addr_table_entry_struct {
>> +  enum ate_kind kind;
>> +  unsigned int refcount;
>> +  unsigned int index;
>> +  union addr_table_entry_struct_union
>> +{
>> +  rtx GTY ((tag ("ate_kind_rtx"))) rtl;
>> +  char * GTY ((tag ("ate_kind_label"))) label;
>> +}
>> +  GTY ((desc ("%1.kind"))) addr;
>
> When kind == ate_kind_rtx_dtprel, we use the rtl field. I think this needs
> to be covered for GC to work. As far as I know, gengtype doesn't support
> multiple tags for one union member, so I think it needs to be something
> like this:
>
>   union addr_table_entry_struct_union
> {
>   rtx GTY ((tag ("0"))) rtl;
>   char * GTY ((tag ("1"))) label;
> }
>   GTY ((desc ("(%1.kind == ate_kind_label)"))) addr;
>

Done.

>> +static void add_AT_lbl_id (dw_die_ref, enum dwarf_attribute, const char *,
>> +   bool);
>
> It turns out we never call add_AT_lbl_id with force_direct == true.
> I don't think it's necessary to add this parameter here.

Done--this actually cleans up the patch quite a bit.

>> +/* enum for tracking thread-local variables whose address is really an offset
>> +   relative to the TLS pointer, which will need link-time relocation, but will
>> +   not need relocation by the DWARF consumer.  */
>> +
>> +enum dtprel_bool
>> +  {
>> +dtprel_false = 0,
>> +dtprel_true = 1
>> +  };
>
> Extra indentation here.

Fixed.

>> +static inline enum dwarf_location_atom
>> +dw_addr_op (enum dtprel_bool dtprel)
>> +{
>> +  if (dtprel == dtprel_true)
>> +return (dwarf_split_debug_info ? DW_OP_GNU_const_index
>> +: (DWARF2_ADDR_SIZE == 4 ? DW_OP_const4u : DW_OP_const8u));
>> +  else
>> +return (dwarf_split_debug_info ? DW_OP_GNU_addr_index : DW_OP_addr);
>
> Unnecessary parentheses here.

Removed.

>> +/* Return the index for any attribute that will be referenced with a
>> +   DW_FORM_GNU_addr_index.  Strings have their indices handled differently to
>> +   account for reference counting pruning.  */
>> +
>> +static inline unsigned int
>> +AT_index (dw_attr_ref a)
>> +{
>> +  if (AT_class (a) == dw_val_class_str)
>> +return a->dw_attr_val.v.val_str->index;
>> +  else if (a->dw_attr_val.val_entry != NULL)
>> +return a->dw_attr_val.val_entry->index;
>> +  return NOT_INDEXED;
>> +}
>
> The comment seems out of date. DW_FORM_GNU_str_index should also be
> mentioned, and it doesn't look like strings have their indices handled
> differently (at least not here).

Updated.

>> +static void
>> +remove_addr_table_entry (addr_table_entry *entry)
>> +{
>> +  addr_table_entry *node;
>> +
>> +  gcc_assert (dwarf_split_debug_info && addr_index_table);
>> +  node = (addr_table_entry *) htab_find (addr_index_table, entry);
>> +  node->refcount--;
>> +  /* After an index is assigned, the table is frozen.  */
>> +  gcc_assert (node->refcount > 0 || node->index == NO_INDEX_ASSIGNED);
>
> This shouldn't ever be called after we've assigned any indexes at all,
> so I think it's always safe to assert that node->index == NO_INDEX_ASSIGNED.
> We can also assert that the ref count should never go negative, so I think
> you can rewrite this assert as:
>
>   gcc_assert (node->refcount >= 0 && node->index == NO_INDEX_ASSIGNED);
>

Done.

>> @@ -21215,7 +22086,7 @@ prune_unused_types_update_strings (dw_die_ref die)
>>   *slot = s;
>> }
>>}
>> -}
>> + }
>
> Accidental extra space?

Yes. Fixed.

>> +static void
>> +index_location_lists (dw_die_ref die)
>> +{
>> +  dw_die_ref c;
>> +  dw_attr_ref a;
>> +  unsigned ix;
>> +
>> +  FOR_EACH_VEC_ELT (dw_attr_node, die->die_attr, ix, a)
>> +if (AT_class (a) == dw_val_class_loc_list)
>> +  {
>> +dw_loc_list_ref list = AT_loc_list (a);
>> +dw_loc_list_ref curr;
>> +for (curr = list; curr != NULL; curr = curr->dw_loc_next)
>> +  {
>> +/* Don't index an entry that has already been indexed
>> +   or won't be output.  */
>> +   

Re: New badness metric for inliner

2012-11-06 Thread David Miller
From: David Miller 
Date: Tue, 06 Nov 2012 14:16:32 -0500 (EST)

> (gdb) p inline_summary (edge->caller)->time
> $1205 = -1044761

This negative value is computed by inline_update_overall_summary().

I added some debugging to dump the entry->time values processed when
info->time goes negative:

davem@patience:~/src/GIT/GCC/build-sparc32-linux/prev-gcc$ ./cc1plus -quiet -g -O2 -o x.s graphite-interchange.i
e[19]: time[0]
e[19]: time[3996]
e[19]: time[4000]
e[19]: time[1960]
e[19]: time[7840]
e[19]: time[980]
e[19]: time[4900]
e[19]: time[382]
e[19]: time[382]
e[19]: time[2292]
e[19]: time[10073452]
e[19]: time[6644]
e[19]: time[10865]
e[19]: time[726004281]
e[19]: time[10865]
e[19]: time[726004281]
e[19]: time[10865]
e[19]: time[726004281]
e[19]: time[3916]

My initial impression is that we'll need to use gcov_type all over the
place, which is unfortunate because that's going to make the inliner
more expensive on 32-bit builds.

Or perhaps we can get away with only using gcov_type for info->time, I'll
give that a try.
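
As a rough check of the arithmetic, the sketch below sums the dumped entry times in a 64-bit accumulator. The helper names are made up for this illustration and the code is not part of GCC. The three 726004281 entries alone add up to 2178012843, which is already above the signed 32-bit maximum of 2147483647, so a 32-bit accumulator for these values has to overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* The entry times from the debugging dump above.  */
static const int64_t dumped_times[] = {
  0, 3996, 4000, 1960, 7840, 980, 4900, 382, 382, 2292,
  10073452, 6644, 10865, 726004281, 10865, 726004281,
  10865, 726004281, 3916
};

/* Sum the dumped times in a 64-bit accumulator so the sum itself
   cannot overflow.  */
static int64_t
sum_times_64 (void)
{
  int64_t total = 0;
  size_t n = sizeof dumped_times / sizeof dumped_times[0];

  for (size_t i = 0; i < n; i++)
    total += dumped_times[i];
  return total;
}

/* Nonzero if TOTAL cannot be represented in a signed 32-bit int.  */
static int
exceeds_int32 (int64_t total)
{
  return total > INT32_MAX;
}
```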


Re: New badness metric for inliner

2012-11-06 Thread David Miller
From: David Miller 
Date: Tue, 06 Nov 2012 14:28:19 -0500 (EST)

> Or perhaps we can get away with only using gcov_type for info->time, I'll
> give that a try.

That gets things further, but if the edge times add up to such large
values it seems we have lots of other potential problems.

With info->time converted to gcov_type, the next assertion I hit is:

gcc_assert (cached_badness == current_badness);

in inline_small_functions().

Both badness values are negative.

(gdb) p cached_badness
$1 = -91472
(gdb) p current_badness
$2 = -11434

This is starting to look like a very deep rabbit hole, and I'm really
surprised that you hit none of these problems.  Especially since even
x86-64 is getting fortran testsuite failure regressions due to these
changes.



Go patch committed: More 64-bit int preparation

2012-11-06 Thread Ian Lance Taylor
This patch to the compiler and libgo fixes a number of minor issues in
preparation for converting to 64-bit int.  This is mostly changing from
int to intgo, with a couple of exceptions: we limit new map entries to
int32, and we handle the Go type rune as the C type int32.  Bootstrapped
and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to
mainline.
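
The int32 limit on map entries relies on a round-trip range check, which can be sketched in isolation as follows. The function name `fits_in_int32` is made up for this illustration; the condition mirrors the one in `__go_new_map` in the patch below. Note that the narrowing conversion is implementation-defined in C; the sketch assumes the wrapping behaviour that GCC provides.

```c
#include <stdint.h>

/* Return 1 if ENTRIES survives a round trip through int32_t as a
   non-negative value, 0 otherwise.  Assumes the narrowing conversion
   wraps modulo 2^32, as it does with GCC.  */
static int
fits_in_int32 (uintptr_t entries)
{
  int32_t ientries = (int32_t) entries;

  /* A negative result or a changed value means ENTRIES was too big.  */
  return !(ientries < 0 || (uintptr_t) ientries != entries);
}
```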

Ian

Index: gcc/go/gofrontend/types.cc
===
--- gcc/go/gofrontend/types.cc	(revision 193253)
+++ gcc/go/gofrontend/types.cc	(revision 193254)
@@ -2568,8 +2568,12 @@ Integer_type::create_abstract_integer_ty
 {
   static Integer_type* abstract_type;
   if (abstract_type == NULL)
-abstract_type = new Integer_type(true, false, INT_TYPE_SIZE,
- RUNTIME_TYPE_KIND_INT);
+{
+  Type* int_type = Type::lookup_integer_type("int");
+  abstract_type = new Integer_type(true, false,
+   int_type->integer_type()->bits(),
+   RUNTIME_TYPE_KIND_INT);
+}
   return abstract_type;
 }
 
Index: libgo/runtime/go-new-map.c
===
--- libgo/runtime/go-new-map.c	(revision 193253)
+++ libgo/runtime/go-new-map.c	(revision 193254)
@@ -106,10 +106,11 @@ __go_map_next_prime (uintptr_t n)
 struct __go_map *
 __go_new_map (const struct __go_map_descriptor *descriptor, uintptr_t entries)
 {
-  intgo ientries;
+  int32 ientries;
   struct __go_map *ret;
 
-  ientries = (intgo) entries;
+  /* The master library limits map entries to int32, so we do too.  */
+  ientries = (int32) entries;
   if (ientries < 0 || (uintptr_t) ientries != entries)
 runtime_panicstring ("map size out of range");
 
Index: libgo/runtime/go-rune.c
===
--- libgo/runtime/go-rune.c	(revision 193253)
+++ libgo/runtime/go-rune.c	(revision 193254)
@@ -14,7 +14,7 @@
characters used from STR.  */
 
 int
-__go_get_rune (const unsigned char *str, size_t len, int *rune)
+__go_get_rune (const unsigned char *str, size_t len, int32 *rune)
 {
   int c, c1, c2, c3, l;
 
Index: libgo/runtime/cpuprof.c
===
--- libgo/runtime/cpuprof.c	(revision 193253)
+++ libgo/runtime/cpuprof.c	(revision 193254)
@@ -124,7 +124,7 @@ static uintptr eod[3] = {0, 1, 0};
 static void LostProfileData(void) {
 }
 
-extern void runtime_SetCPUProfileRate(int32)
+extern void runtime_SetCPUProfileRate(intgo)
  __asm__("runtime.SetCPUProfileRate");
 
 // SetCPUProfileRate sets the CPU profiling rate.
Index: libgo/runtime/runtime.h
===
--- libgo/runtime/runtime.h	(revision 193253)
+++ libgo/runtime/runtime.h	(revision 193254)
@@ -341,7 +341,7 @@ int32	runtime_ncpu;
 /*
  * common functions and data
  */
-int32	runtime_findnull(const byte*);
+intgo	runtime_findnull(const byte*);
 void	runtime_dump(byte*, int32);
 
 /*
@@ -614,7 +614,7 @@ extern uintptr runtime_stacks_sys;
 
 struct backtrace_state;
 extern struct backtrace_state *__go_get_backtrace_state(void);
-extern _Bool __go_file_line(uintptr, String*, String*, int *);
+extern _Bool __go_file_line(uintptr, String*, String*, intgo *);
 extern byte* runtime_progname();
 
 int32 getproccount(void);
Index: libgo/runtime/proc.c
===
--- libgo/runtime/proc.c	(revision 193253)
+++ libgo/runtime/proc.c	(revision 193254)
@@ -610,11 +610,11 @@ runtime_goroutinetrailer(G *g)
 	if(g != nil && g->gopc != 0 && g->goid != 1) {
 		String fn;
 		String file;
-		int line;
+		intgo line;
 
 		if(__go_file_line(g->gopc - 1, &fn, &file, &line)) {
 			runtime_printf("created by %S\n", fn);
-			runtime_printf("\t%S:%d\n", file, line);
+			runtime_printf("\t%S:%D\n", file, (int64) line);
 		}
 	}
 }
Index: libgo/runtime/go-string.h
===
--- libgo/runtime/go-string.h	(revision 193253)
+++ libgo/runtime/go-string.h	(revision 193254)
@@ -26,6 +26,6 @@ __go_ptr_strings_equal (const String *ps
   return __go_strings_equal (*ps1, *ps2);
 }
 
-extern int __go_get_rune (const unsigned char *, size_t, int *);
+extern int __go_get_rune (const unsigned char *, size_t, int32 *);
 
 #endif /* !defined(LIBGO_GO_STRING_H) */
Index: libgo/runtime/go-traceback.c
===
--- libgo/runtime/go-traceback.c	(revision 193253)
+++ libgo/runtime/go-traceback.c	(revision 193254)
@@ -29,13 +29,13 @@ runtime_printtrace (uintptr *pcbuf, int3
 {
   String fn;
   String file;
-  int line;
+  intgo line;
 
   if (__go_file_line (pcbuf[i], &fn, &file, &line)
 	  && runtime_showframe (fn.str))
 	{
 	  runtime_printf ("%S\n", fn);
-	  runtime_printf ("\t%S:%d\n", file, line);
+	  runtime_printf ("\t%S:%D\n", file, (int64) line);
 	}
 }
 }
Index: libgo/runtime/string.goc
===

Go testsuite patch committed: Update for 64-bit int

2012-11-06 Thread Ian Lance Taylor
This patch updates the Go testsuite for the 64-bit int type.  This is
copied from the master copy of the testsuite.  Ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

Index: gcc/testsuite/go.test/test/index.go
===
--- gcc/testsuite/go.test/test/index.go	(revision 192508)
+++ gcc/testsuite/go.test/test/index.go	(working copy)
@@ -21,6 +21,7 @@ import (
 	"flag"
 	"fmt"
 	"os"
+	"runtime"
 )
 
 const prolog = `
@@ -224,6 +225,10 @@ func main() {
 // the next pass from running.
 // So run it as a separate check.
 thisPass = 1
+			} else if a == "s" && n == "" && (i == "i64big" || i == "i64bigger") && runtime.GOARCH == "amd64" {
+// On amd64, these huge numbers do fit in an int, so they are not
+// rejected at compile time.
+thisPass = 0
 			} else {
 thisPass = 2
 			}


Re: [C++ Patch] for c++/11750

2012-11-06 Thread Fabien Chêne
Jason, could you please have a look at this (rather old) one?
Thanks.

2012/8/13 Fabien Chêne :
> Hi,
>
> Here, we were setting the LOOKUP_NONVIRTUAL flag wrongly. Actually, we
> need to check whether the function context is the same as the instance
> type -- they can differ in the presence of using-declarations.
>
> It happened to work if the call was made through a pointer, because
> we were failing to determine the dynamic type (in
> resolved_fixed_type_p). In contrast, it did not work if the call was
> made through a reference, because there we manage to determine the
> dynamic type thanks to a special case in fixed_type_or_null. There is
> probably room for improvement here, though I'm not sure the C++ front
> end is the best place to devirtualize.
>
> Tested x86_64-unknown-linux-gnu without regressions. OK to commit?
>
> gcc/testsuite/ChangeLog
>
> 2012-08-12  Fabien Chêne  
>
> PR c++/11750
> * g++.dg/inherit/vitual9.C: New.
>
> gcc/cp/ChangeLog
>
> 2012-08-12  Fabien Chêne  
>
> PR c++/11750
> * call.c (build_new_method_call_1): Check that the instance type
> and the function context are the same before setting the flag
> LOOKUP_NONVIRTUAL.
>
>
> --
> Fabien

-- 
Fabien


sched-deps patch: Fix PR54580

2012-11-06 Thread Bernd Schmidt
If we have

i1: [r1 + 24] = x
i2: r1 = r1 + 24;
i3: y = [r1]

then, if not using cselib, we do not generate a dependency between i3
and i1, since we compare memory addresses [r1] and [r1 + 24]. This is
somewhat lame, but safe since i2 depends on i1 and i3 depends on i2.
However, it breaks with the new optimization I've recently added, which
allows us to switch i3 and i2 by modifying the address in i3. We can end
up with
i3: y = [r1 + 24]
i1: [r1 + 24] = x
i2: r1 = r1 + 24
which is incorrect.
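
To make the failure concrete, here is a small simulation of the two instruction orders in plain C. It is only an illustration: the function names, the array standing in for memory, the use of word indices instead of byte addresses, and the STALE marker value are all assumptions of this sketch, not scheduler code.

```c
enum { MEM_WORDS = 64, STALE = -1 };

/* Program order: i1, i2, i3.  Returns the value loaded into y.  */
static int
run_program_order (int x)
{
  int mem[MEM_WORDS];
  int r1 = 0, y;

  for (int i = 0; i < MEM_WORDS; i++)
    mem[i] = STALE;

  mem[r1 + 24] = x;   /* i1: [r1 + 24] = x */
  r1 = r1 + 24;       /* i2: r1 = r1 + 24 */
  y = mem[r1];        /* i3: y = [r1] */
  return y;
}

/* Bad schedule: i3 (with its address rewritten) is moved above i1.  */
static int
run_bad_schedule (int x)
{
  int mem[MEM_WORDS];
  int r1 = 0, y;

  for (int i = 0; i < MEM_WORDS; i++)
    mem[i] = STALE;

  y = mem[r1 + 24];   /* i3: y = [r1 + 24], reads before the store */
  mem[r1 + 24] = x;   /* i1: [r1 + 24] = x */
  r1 = r1 + 24;       /* i2: r1 = r1 + 24 */
  (void) r1;
  return y;
}
```

In program order the load sees the value stored by i1, whereas in the bad schedule it sees whatever was in memory beforehand.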

The following patch is a conservative way of fixing this by simply
transferring all backwards dependencies from i2 to i3. I thought about
trying to do better when using cselib for a while, but I wasn't quite
certain how flush_pending_lists would interact with this, and it almost
certainly doesn't really matter.

Bootstrapped and tested on x86_64-linux (boehm-gc.c/gctest.c appears to
fail randomly, otherwise no changes). Ok?


Bernd
	* sched-deps.c (find_inc): Add all dependencies from the inc_insn
	to the mem_insn.

Index: sched-deps.c
===
--- sched-deps.c	(revision 191838)
+++ sched-deps.c	(working copy)
@@ -4706,16 +4706,14 @@ find_inc (struct mem_inc_info *mii, bool
 	  if (backwards)
 	{
 	  FOR_EACH_DEP (mii->inc_insn, SD_LIST_BACK, sd_it, dep)
-		if (modified_in_p (mii->inc_input, DEP_PRO (dep)))
-		  add_dependence_1 (mii->mem_insn, DEP_PRO (dep),
-REG_DEP_TRUE);
+		add_dependence_1 (mii->mem_insn, DEP_PRO (dep),
+  REG_DEP_TRUE);
 	}
 	  else
 	{
 	  FOR_EACH_DEP (mii->inc_insn, SD_LIST_FORW, sd_it, dep)
-		if (modified_in_p (mii->inc_input, DEP_CON (dep)))
-		  add_dependence_1 (DEP_CON (dep), mii->mem_insn,
-REG_DEP_ANTI);
+		add_dependence_1 (DEP_CON (dep), mii->mem_insn,
+  REG_DEP_ANTI);
 	}
 	  return true;
 	}


Re: Go patch committed: Size of int is now 64 bits on x86_64

2012-11-06 Thread Ian Lance Taylor
On Tue, Nov 6, 2012 at 10:46 AM, Ian Lance Taylor  wrote:
> This patch to the Go compiler and library changes the size of the Go
> type "int" to be the same as the size of a pointer.  This means that on
> x86_64 the size of int will be 64 bits.  This matches the new behaviour
> of the other Go compiler, and is the intended implementation for the
> future Go 1.1 release.  Bootstrapped and ran Go testsuite on
> x86_64-unknown-linux-gnu.  Committed to mainline.

By the way, if you have an existing working directory with
--enable-languages=go, make sure to remove your TARGET/libgo directory
before you build after updating to this patch.  This change requires
rebuilding all the object files, but there is no dependency that will
force that to happen.

Ian


Re: New badness metric for inliner

2012-11-06 Thread Jan Hubicka
> From: David Miller 
> Date: Tue, 06 Nov 2012 13:54:01 -0500 (EST)
> 
> > From: David Miller 
> > Date: Tue, 06 Nov 2012 13:26:53 -0500 (EST)
> > 
> >> From: Jan Hubicka 
> >> Date: Tue, 6 Nov 2012 19:21:46 +0100
> >> 
> >>> The problem here is really that MAX_TIME * MAX_FREQ does not fit into a
> >>> 32-bit integer. Fixed thus.
> >>> 
> >>>   * ipa-inline.c (compute_uninlined_call_time): Return gcov_type.
> >>>   (compute_inlined_call_time): Watch overflows.
> >>>   (relative_time_benefit): Compute in gcov_type.
> >> 
> >> Thanks Jan, I'll test this right now.
> > 
> > Bootstrap still fails with this change installed:
> > 
> > ../../gcc/gcc/graphite-interchange.c:645:1: internal compiler error: in 
> > relative_time_benefit, at \
> > ipa-inline.c:785
> >  }
> 
> The problem appears to be that inline_summary (edge->caller)->time
> is negative.
> 
> #1  0x010828a0 in relative_time_benefit (callee_info=0xf76fcb10, 
> edge=0xf598a980, edge_time=3861) \
> at ../../gcc/gcc/ipa-inline.c:785
> (gdb) p callee_info->time
> $1200 = 3864
> (gdb) p edge->frequency
> $1201 = 263
> (gdb) p (callee_info->time * edge->frequency)
> $1202 = 1016232
> (gdb) p edge->caller->global.inlined_to
> $1203 = (cgraph_node *) 0x0
> (gdb) p edge->caller
> $1204 = (cgraph_node *) 0xf589ed10
> (gdb) p p inline_summary (edge->caller)->time
> No symbol "p" in current context.
> (gdb) p inline_summary (edge->caller)->time
> $1205 = -1044761

Hmm, this is obviously wrong.  All the caller time computation should be capped
to MAX_TIME that should be safe from overflows.  I will dig into it tonight or
tomorrow. Sorry for the trouble.

Honza


Re: New badness metric for inliner

2012-11-06 Thread David Miller
From: Jan Hubicka 
Date: Tue, 6 Nov 2012 22:01:27 +0100

> Hmm, this is obviously wrong.  All the caller time computation should be capped
> to MAX_TIME that should be safe from overflows.

They are not capped to MAX_TIME.

They are capped to MAX_TIME * INLINE_TIME_SCALE which is
10.


Re: New badness metric for inliner

2012-11-06 Thread Jan Hubicka
> From: Jan Hubicka 
> Date: Tue, 6 Nov 2012 22:01:27 +0100
> 
> > Hmm, this is obviously wrong.  All the caller time computation should be capped
> > to MAX_TIME that should be safe from overflows.
> 
> They are not capped to MAX_TIME.
> 
> They are capped to MAX_TIME * INLINE_TIME_SCALE which is
> 10.

Right, and that is why they need to be capped after every addition.
(When I wrote the code it did not have the INLINE_TIME_SCALE
factor yet, and I concluded that I did not need to do the capping because
there are at most 32 additions.)
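
The idea of capping after every addition can be sketched as a saturating accumulation. This is an illustration only: `SKETCH_TIME_CAP` is a made-up value standing in for MAX_TIME * INLINE_TIME_SCALE, and `sum_capped` is a hypothetical helper, not a GCC function.

```c
#include <stddef.h>

/* Hypothetical cap for this sketch; the patch uses
   MAX_TIME * INLINE_TIME_SCALE instead.  */
#define SKETCH_TIME_CAP 1000000

/* Sum N non-negative TIMES, clamping the running total to the cap
   after every addition.  Because the total never exceeds the cap
   before an addition, it cannot overflow as long as each term is at
   most INT_MAX - SKETCH_TIME_CAP.  */
static int
sum_capped (const int *times, size_t n)
{
  int total = 0;

  for (size_t i = 0; i < n; i++)
    {
      total += times[i];
      if (total > SKETCH_TIME_CAP)
        total = SKETCH_TIME_CAP;
    }
  return total;
}
```

Capping only once at the end would not help, since the intermediate sum could already have wrapped by then.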

I noticed there is one extra place with this problem, so I fixed it, too.

The attached patch fixes the testcase, so I committed it as obvious.  Hope it
will fix the bootstrap for you. I did not hit this because my bootstrap did not
have graphite enabled due to lack of proper support libraries.

Committed as obvious.

* ipa-inline-analysis.c (estimate_function_body_sizes,
inline_update_overall_summary): Cap time calculations.
Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 193246)
+++ ipa-inline-analysis.c   (working copy)
@@ -2442,6 +2442,8 @@ estimate_function_body_sizes (struct cgr
{
  time += this_time;
  size += this_size;
+ if (time > MAX_TIME * INLINE_TIME_SCALE)
+   time = MAX_TIME * INLINE_TIME_SCALE;
}
 
  /* We account everything but the calls.  Calls have their own
@@ -3323,7 +3325,11 @@ inline_update_overall_summary (struct cg
   info->size = 0;
   info->time = 0;
   for (i = 0; VEC_iterate (size_time_entry, info->entry, i, e); i++)
-info->size += e->size, info->time += e->time;
+{
+  info->size += e->size, info->time += e->time;
+  if (info->time > MAX_TIME * INLINE_TIME_SCALE)
+info->time = MAX_TIME * INLINE_TIME_SCALE;
+}
   estimate_calls_size_and_time (node, &info->size, &info->time, NULL,
~(clause_t)(1 << predicate_false_condition),
NULL, NULL, NULL);


Re: [patch, testsuite, mips] Fix gcc.dg/torture/mips-sdata-1.c

2012-11-06 Thread Steve Ellcey
On Mon, 2012-11-05 at 23:22 +, Richard Sandiford wrote:

> No, same here: I don't use --with-sysroot for the newlib targets.
> Do you build a unified gcc+newlib tree?  If not, I don't think
> the above boilerplate works; you'll have to use something else
> instead.  E.g. install newlib first, change your board files
> to match your build tree setup, or add links from the gcc build
> directory to the newlib one.  But unified trees are simpler really
> (i.e. a newlib and libgloss symlink in the gcc tree).

> Richard

It looks like using --with-sysroot and --with-build-sysroot took care of
the problem so my patch is not needed.

Steve Ellcey
sell...@mips.com



Re: sched-deps patch: Fix PR54580

2012-11-06 Thread Richard Henderson
On 2012-11-06 12:58, Bernd Schmidt wrote:
>   * sched-deps.c (find_inc): Add all dependencies from the inc_insn
>   to the mem_insn.

Ok.


r~


[AARCH64/Committed] Fix g++.dg/abi/aarch64_guard1.C

2012-11-06 Thread Andrew Pinski
Hi,
  The problem here is that with section anchors turned on, we generate a
BSS symbol rather than a local common symbol, so we no longer match the
pattern: "_ZGVZ3foovE1x,8,8".  This fixes the testcase by just adding
-fno-section-anchors.

Thanks,
Andrew Pinski

2012-11-06  Andrew Pinski  

* g++.dg/abi/aarch64_guard1.C: Add -fno-section-anchors.
Index: g++.dg/abi/aarch64_guard1.C
===
--- g++.dg/abi/aarch64_guard1.C (revision 193259)
+++ g++.dg/abi/aarch64_guard1.C (working copy)
@@ -2,7 +2,7 @@
 // 8-byte doubleword and that only the least significant bit is used
 // for initialization guard variables.
 // { dg-do compile { target aarch64*-*-* } }
-// { dg-options "-O -fdump-tree-original" }
+// { dg-options "-O -fdump-tree-original -fno-section-anchors" }
 
 int bar();
 


Re: [PATCH] Vzeroupper placement/47440

2012-11-06 Thread H.J. Lu
On Tue, Nov 6, 2012 at 2:30 AM, Kirill Yukhin  wrote:
> Hello,
>> OK for mainline SVN, please commit.
> Checked into GCC trunk: http://gcc.gnu.org/ml/gcc-cvs/2012-11/msg00176.html
>
> Thanks, K

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55224

-- 
H.J.

