[patch] Improve peeling heuristic in the vectorizer

2011-06-12 Thread Ira Rosen
Hi,

gcc.dg/vect/vect-72.c is not expected to use loop peeling for
alignment, but it does on ARM with double-word vectors. The loop
contains two data-refs of type char: one is aligned and the other is
misaligned by 1. When the cost model is disabled the peeling heuristic
chooses to peel a number of scalar iterations that aligns the highest
number of data-refs in the loop. When there are two (with different
alignment), the decision is arbitrary, which causes the vectorizer to
peel 7 iterations in vect-72.c, while not peeling at all is an
obviously better option (at least since the cost model is disabled).
This patch takes into account the required number of iterations to
peel.

Bootstrapped and tested on powerpc64-suse-linux, and tested the
vectorizer testsuite on arm-linux-gnueabi.
Committed.

Ira

ChangeLog:

  * tree-vect-data-refs.c (vect_peeling_hash_get_most_frequent):
  Take number of iterations to peel into account for equally frequent
  misalignment values.

Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c   (revision 174964)
+++ tree-vect-data-refs.c   (working copy)
@@ -1248,7 +1248,9 @@ vect_peeling_hash_get_most_frequent (void **slot,
   vect_peel_info elem = (vect_peel_info) *slot;
   vect_peel_extended_info max = (vect_peel_extended_info) data;

-  if (elem->count > max->peel_info.count)
+  if (elem->count > max->peel_info.count
+  || (elem->count == max->peel_info.count
+  && max->peel_info.npeel > elem->npeel))
 {
   max->peel_info.npeel = elem->npeel;
   max->peel_info.count = elem->count;


Re: [Patch, AVR]: Fix PR46779

2011-06-12 Thread Denis Chertykov
2011/6/10 Georg-Johann Lay :
> Then I have a question on spill failures:
>
> There is PR46278, an optimization flaw that goes as follows:
>
> The avr BE defines fake addressing mode X+const that has to be written
> down in asm as
>  X += const
>  a = *X
>  X -= const
>
> The comment says that this is just to cover a corner case, however the
> new register allocator uses this case more often or even greedily.
> There is no way to describe the cost for such an access, and as X has
> lower prologue/epilogue cost than Y, X is preferred over Y often.
>
> In 4.7, I see that flaw less often than in 4.5.  However, I think the
> best way is not to lie at the register allocator and not to supply a
> fake instruction like that.
>
> So I started to work on a fix. The changes in avr.h are:
>
>        * config/avr/avr.h (BASE_REG_CLASS): Remove.
>        (REG_OK_FOR_BASE_NOSTRICT_P): Remove.
>        (REG_OK_FOR_BASE_STRICT_P): Remove.
>        (MODE_CODE_BASE_REG_CLASS): New Define.
>        (REGNO_MODE_CODE_OK_FOR_BASE_P): New Define.
>
> The macros REGNO_MODE_CODE_OK_FOR_BASE_P and MODE_CODE_BASE_REG_CLASS
> allow a better, fine grained control over addressing modes for each
> hard register and allow to support X without fake instructions. The
> code quality actually improves, but I see a new spill failure for the
> test case
>
> * gcc.c-torture/compile/950612-1.c
>
> On the one hand, that test case has heavy long long use and is no
> "real world code" to run on AVR. On the other hand, I don't think
> trading code quality here against ICE there is a good idea.
>
> What do you think about that? As I have no idea how to fix a spill
> failure in the backend, I stopped working on a patch.

Ohhh. It's a most complicated case for GCC for AVR.
Look carefully at `out_movqi_r_mr'.
There are even two fake addressing modes:
1. [Y + infinite-dslacement];
2. [X + (0...63)].
I have spent a many hours (days, months) to debug GCC (especially avr port
and reload) for right addressing modes.
I have stopped on this code.
AVR have a limited memory addressing and GCC can't handle it in native form.
Because of that I have supported a fake adddressing modes.
(Richard Henderson have a different oppinion: GCC can, AVR port can't)
IMHO that three limited pointer registers is not enough for C compiler.
Even more with frame pointer it's only two and X is a very limited.

1. [Y + infinite-dslacement] it's helper for reload addressing.
For addressing too long local/spilled variable. (Y + more-than-63)

2. [X + (0...63)] for another one "normal" pointer address.

> Then I observed trouble with DI patterns during libgcc build and had
> to remove
>
> * "zero_extendqidi2"
> * "zero_extendhidi2"
> * "zero_extendsidi2"
>
> These are "orphan" insns: they deal with DI without having movdi
> support so I removed them.

This seems strange for me.

Denis.


[patch, moxie] Fix comparison regression

2011-06-12 Thread Anthony Green
I've just committed this patch.  It makes match_operator modeless for
moxie comparisons.  This fixes a regression introduced in March by this
patch:  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01344.html

2011-06-12  Anthony Green  

* config/moxie/moxie.md (cbranchsi4): Remove mode from
comparison.


Index: gcc/config/moxie/moxie.md
===
--- gcc/config/moxie/moxie.md   (revision 174933)
+++ gcc/config/moxie/moxie.md   (working copy)
@@ -308,7 +308,7 @@
  (match_operand:SI 1 "general_operand" "")
  (match_operand:SI 2 "general_operand" "")))
(set (pc)
-(if_then_else (match_operator:CC 0 "comparison_operator"
+(if_then_else (match_operator 0 "comparison_operator"
[(reg:CC CC_REG) (const_int 0)])
   (label_ref (match_operand 3 "" ""))
   (pc)))]




Re: Cgraph alias reorg 13/14 (disable inlining functions called once at -O0

2011-06-12 Thread Richard Guenther
On Fri, Jun 10, 2011 at 9:15 PM, Jan Hubicka  wrote:
>> On Fri, Jun 10, 2011 at 8:42 PM, Jan Hubicka  wrote:
>> > Hi,
>> > by some mistake we enable functions called once at -O0 and it actually 
>> > happens from
>> > time to time.
>>
>> Why do it for -O1?  It definitely makes debugging less reliable.  I'd say do 
>> it
>> for -O[23s] only.
>
> Well, that is what we did before.  I tought -O1 is mostly for optimizations
> that are not too expensive and usually win (rahter than about debugability of
> the output) and I think this one counts here.  But I don't have very strong
> optinions either way.

I think we also suggested at some point that -O1 optimizations
shouldn't interfere
with debugging too much.  But if it is what we did before it's certainly fine.

Richard.

> Honza
>


Re: PATCH [1/n]: Prepare x32: PR middle-end/47364: internal compiler error: in emit_move_insn, at expr.c:3355

2011-06-12 Thread Richard Guenther
On Sat, Jun 11, 2011 at 5:09 PM, H.J. Lu  wrote:
> Hi,
>
> expand_builtin_strlen has
>
> src_reg = gen_reg_rtx (Pmode);
> ...
> pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
> if (pat != src_reg)
>  emit_move_insn (src_reg, pat);
>
> But src_reg may be in ptr_mode, wich may not be the same as Pmode.
> This patch checks it.  OK for trunk?
>
> Thanks.
>
>
> H.J.
> ---
> 2011-06-11  H.J. Lu  
>
>        PR middle-end/47364
>        * builtins.c (expand_builtin_strlen): Properly handle target
>        not in Pmode.
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 7b24a0c..4e2cf31 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -2941,7 +2941,11 @@ expand_builtin_strlen (tree exp, rtx target,
>       start_sequence ();
>       pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
>       if (pat != src_reg)
> -       emit_move_insn (src_reg, pat);
> +       {
> +         if (GET_MODE (pat) != Pmode)
> +           pat = convert_to_mode (Pmode, pat, 1);

Shouldn't this be POINTERS_EXTEND_UNSIGNED instead of 1?

> +         emit_move_insn (src_reg, pat);

Why not use convert_move unconditionally?

Or, why not expand src in Pmode from the start?  After all, src_reg is
created as Pmode reg.

Richard.

> +       }
>       pat = get_insns ();
>       end_sequence ();
>
>


Re: RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10))

2011-06-12 Thread Richard Guenther
On Sat, Jun 11, 2011 at 7:45 PM, Mike Stump  wrote:
> On Jun 10, 2011, at 7:20 AM, Richard Guenther wrote:
>> On Fri, 10 Jun 2011, Jason Merrill wrote:
>>
>>> On 06/10/2011 10:03 AM, Richard Guenther wrote:
>> *((volatile int *)&a[0] + 1)
>
> It would be correct to fold it to
>
> VIEW_CONVERT_EXPR

 No, it wouldn't be correct.  It isn't correct to fold it to an array-ref
 that isn't volatile.
>>>
>>> Hmm?  The C expression produces a volatile int lvalue referring to the 
>>> second
>>> element of a, as does the VIEW_CONVERT_EXPR.  They seem equivalent to me.
>>
>> no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
>> would turn the above to (volatile int) a[1]).
>
> Curious.  We have built up a built-in infrastructure that allows for lvalue 
> register references.  I noticed that for vector types, vectors with different 
> type names but the same in every other respect come out different, and a 
> VIEW_CONVERT_EXPR is placed on it to get the types to match.  Presently I'm 
> treating VIEW_CONVERT_EXPR as an lvalue.  For me not to, I'd need either for 
> the same type to be used, or, for another conversion node to be used that can 
> preserve the lvalueness of registers.
>
> Now, if people want to know why, lvalue for registers, it is to support 
> in/out and output only parameters to built-ins.
>
> Thoughts?

In almost all cases(*) the need for a lvalue VIEW_CONVERT_EXPR can be avoided
by moving the VIEW_CONVERT_EXPR to the rvalue assigned too it.  Remember that
VIEW_CONVERT_EXPR always conver the full object and are not allowed to
change sizes.

So, do you have an example?

Richard.

(*) Ada uses lvalue component-refs on VIEW_CONVERT_EXPRs of aggregate types.
While I don't like it too much it's probably not too convenient (even
if it is always
possible) to move these to the RHS of assignments.


Re: Is VIEW_CONVERT_EXPR an lvalue? (was Re: RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10)))

2011-06-12 Thread Richard Guenther
On Sat, Jun 11, 2011 at 10:01 PM, Jason Merrill  wrote:
> On 06/10/2011 10:20 AM, Richard Guenther wrote:
>>
>> no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
>> would turn the above to (volatile int) a[1]).
>
> The gimplifier seems to consider it an lvalue: gimplify_expr uses
> gimplify_compound_lval for it, and gimplify_addr_expr handles taking its
> address.  And get_inner_reference handles it.  So I think fold should be
> changed, and we should clarify that VIEW_CONVERT_EXPR is an lvalue.
>
> If not, we need a new tree code for treating an lvalue as an lvalue of a
> different type without having to take its address; that's what I thought
> VIEW_CONVERT_EXPR was for.

The please provide a specification on what a VIEW_CONVERT_EXPR does
to type-based alias analysis.  We are trying to avoid that by the rvalue rule.
Also you can always avoid VIEW_CONVERT_EXPRs for lvalues by simply
moving the conversion to the rvalue side.

Yes, we do handle lvalue VIEW_CONVERT_EXPRs, but that is for Ada
which uses it for aggregates.  I don't want us to add more lvalue
VIEW_CONVERT_EXPR
cases, especially not for register types.

Richard.

> Jason
>


Re: Is VIEW_CONVERT_EXPR an lvalue? (was Re: RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10)))

2011-06-12 Thread Richard Guenther
On Sun, Jun 12, 2011 at 12:59 PM, Richard Guenther
 wrote:
> On Sat, Jun 11, 2011 at 10:01 PM, Jason Merrill  wrote:
>> On 06/10/2011 10:20 AM, Richard Guenther wrote:
>>>
>>> no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
>>> would turn the above to (volatile int) a[1]).
>>
>> The gimplifier seems to consider it an lvalue: gimplify_expr uses
>> gimplify_compound_lval for it, and gimplify_addr_expr handles taking its
>> address.  And get_inner_reference handles it.  So I think fold should be
>> changed, and we should clarify that VIEW_CONVERT_EXPR is an lvalue.
>>
>> If not, we need a new tree code for treating an lvalue as an lvalue of a
>> different type without having to take its address; that's what I thought
>> VIEW_CONVERT_EXPR was for.

Btw, see tree.def which says

/* Represents viewing something of one type as being of a second type.
   This corresponds to an "Unchecked Conversion" in Ada and roughly to
   the idiom *(type2 *)&X in C.  The only operand is the value to be
   viewed as being of another type.  It is undefined if the type of the
   input and of the expression have different sizes.

   This code may also be used within the LHS of a MODIFY_EXPR, in which
   case no actual data motion may occur.  TREE_ADDRESSABLE will be set in
   this case and GCC must abort if it could not do the operation without
   generating insns.  */
DEFTREECODE (VIEW_CONVERT_EXPR, "view_convert_expr", tcc_reference, 1)

> The please provide a specification on what a VIEW_CONVERT_EXPR does
> to type-based alias analysis.  We are trying to avoid that by the rvalue rule.
> Also you can always avoid VIEW_CONVERT_EXPRs for lvalues by simply
> moving the conversion to the rvalue side.
>
> Yes, we do handle lvalue VIEW_CONVERT_EXPRs, but that is for Ada
> which uses it for aggregates.  I don't want us to add more lvalue
> VIEW_CONVERT_EXPR
> cases, especially not for register types.
>
> Richard.
>
>> Jason
>>
>


Re: PATCH [1/n]: Prepare x32: PR middle-end/47364: internal compiler error: in emit_move_insn, at expr.c:3355

2011-06-12 Thread H.J. Lu
On Sun, Jun 12, 2011 at 3:48 AM, Richard Guenther
 wrote:
> On Sat, Jun 11, 2011 at 5:09 PM, H.J. Lu  wrote:
>> Hi,
>>
>> expand_builtin_strlen has
>>
>> src_reg = gen_reg_rtx (Pmode);
>> ...
>> pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
>> if (pat != src_reg)
>>  emit_move_insn (src_reg, pat);
>>
>> But src_reg may be in ptr_mode, wich may not be the same as Pmode.
>> This patch checks it.  OK for trunk?
>>
>> Thanks.
>>
>>
>> H.J.
>> ---
>> 2011-06-11  H.J. Lu  
>>
>>        PR middle-end/47364
>>        * builtins.c (expand_builtin_strlen): Properly handle target
>>        not in Pmode.
>>
>> diff --git a/gcc/builtins.c b/gcc/builtins.c
>> index 7b24a0c..4e2cf31 100644
>> --- a/gcc/builtins.c
>> +++ b/gcc/builtins.c
>> @@ -2941,7 +2941,11 @@ expand_builtin_strlen (tree exp, rtx target,
>>       start_sequence ();
>>       pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
>>       if (pat != src_reg)
>> -       emit_move_insn (src_reg, pat);
>> +       {
>> +         if (GET_MODE (pat) != Pmode)
>> +           pat = convert_to_mode (Pmode, pat, 1);
>
> Shouldn't this be POINTERS_EXTEND_UNSIGNED instead of 1?
>
>> +         emit_move_insn (src_reg, pat);
>
> Why not use convert_move unconditionally?
>
> Or, why not expand src in Pmode from the start?  After all, src_reg is
> created as Pmode reg.
>

This patch works for my testcase.  OK for trunk?

Thanks.

-- 
H.J.

2011-06-12  H.J. Lu  

PR middle-end/47364
* builtins.c (expand_builtin_strlen): Expand strlen to Pmode.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 7b24a0c..7db4e6d 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2939,7 +2939,7 @@ expand_builtin_strlen (tree exp, rtx target,

   /* Now that we are assured of success, expand the source.  */
   start_sequence ();
-  pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
+  pat = expand_expr (src, src_reg, Pmode, EXPAND_NORMAL);
   if (pat != src_reg)
emit_move_insn (src_reg, pat);
   pat = get_insns ();


Re: [PATCH] Un-obsolete Interix

2011-06-12 Thread Gerald Pfeifer

On Thu, 9 Jun 2011, Douglas B Rupp wrote:

Comments welcome.


--- gcc.orig/gcc/doc/install.texi   2011-06-08 14:05:22.0 -0700
+++ gcc/gcc/doc/install.texi2011-06-09 10:04:38.0 -0700
@@ -4499,7 +4499,7 @@
The Interix target is used by OpenNT, Interix, Services For UNIX (SFU),
and Subsystem for UNIX-based Applications (SUA).  Applications compiled
with this target run in the Interix subsystem, which is separate from
-the Win32 subsystem.  This target was last known to work in GCC 3.3.
+the Win32 subsystem.  Gcc supports Interix version 3 and above.

GCC (not Gcc).

And I assume you'll also be updating the release notes at
http://gcc.gnu.org/gcc-4.7/changes.html . ;-)

Gerald


GCC mainline is broken on x86

2011-06-12 Thread H.J. Lu
Hi,

Revision 174952:

http://gcc.gnu.org/ml/gcc-cvs/2011-06/msg00441.html

totally breaks C++ on x86:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49378
http://gcc.gnu.org/ml/gcc-regression/2011-06/msg00159.html

One symptom is we are using uninitialized registers, which leads
to writing random memory location.  The same ia32 binary works on
Fedora 14 under kernel 2.6.35 and failed under Fedora 15 under
kernel 2.6.38.

We also use uninitialized registers on x86-64, but the program
doesn't crash.  This bug may also affects C and other languages.

-- 
H.J.


Re: PATCH [1/n]: Prepare x32: PR middle-end/47364: internal compiler error: in emit_move_insn, at expr.c:3355

2011-06-12 Thread Richard Guenther
On Sun, Jun 12, 2011 at 3:18 PM, H.J. Lu  wrote:
> On Sun, Jun 12, 2011 at 3:48 AM, Richard Guenther
>  wrote:
>> On Sat, Jun 11, 2011 at 5:09 PM, H.J. Lu  wrote:
>>> Hi,
>>>
>>> expand_builtin_strlen has
>>>
>>> src_reg = gen_reg_rtx (Pmode);
>>> ...
>>> pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
>>> if (pat != src_reg)
>>>  emit_move_insn (src_reg, pat);
>>>
>>> But src_reg may be in ptr_mode, wich may not be the same as Pmode.
>>> This patch checks it.  OK for trunk?
>>>
>>> Thanks.
>>>
>>>
>>> H.J.
>>> ---
>>> 2011-06-11  H.J. Lu  
>>>
>>>        PR middle-end/47364
>>>        * builtins.c (expand_builtin_strlen): Properly handle target
>>>        not in Pmode.
>>>
>>> diff --git a/gcc/builtins.c b/gcc/builtins.c
>>> index 7b24a0c..4e2cf31 100644
>>> --- a/gcc/builtins.c
>>> +++ b/gcc/builtins.c
>>> @@ -2941,7 +2941,11 @@ expand_builtin_strlen (tree exp, rtx target,
>>>       start_sequence ();
>>>       pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
>>>       if (pat != src_reg)
>>> -       emit_move_insn (src_reg, pat);
>>> +       {
>>> +         if (GET_MODE (pat) != Pmode)
>>> +           pat = convert_to_mode (Pmode, pat, 1);
>>
>> Shouldn't this be POINTERS_EXTEND_UNSIGNED instead of 1?
>>
>>> +         emit_move_insn (src_reg, pat);
>>
>> Why not use convert_move unconditionally?
>>
>> Or, why not expand src in Pmode from the start?  After all, src_reg is
>> created as Pmode reg.
>>
>
> This patch works for my testcase.  OK for trunk?

Ok if it passes bootstrap & regtest on a ptr_mode != Pmode target.

Thanks,
Richard.

> Thanks.
>
> --
> H.J.
> 
> 2011-06-12  H.J. Lu  
>
>        PR middle-end/47364
>        * builtins.c (expand_builtin_strlen): Expand strlen to Pmode.
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 7b24a0c..7db4e6d 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -2939,7 +2939,7 @@ expand_builtin_strlen (tree exp, rtx target,
>
>       /* Now that we are assured of success, expand the source.  */
>       start_sequence ();
> -      pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
> +      pat = expand_expr (src, src_reg, Pmode, EXPAND_NORMAL);
>       if (pat != src_reg)
>        emit_move_insn (src_reg, pat);
>       pat = get_insns ();
>


Re: PATCH [1/n]: Prepare x32: PR middle-end/47364: internal compiler error: in emit_move_insn, at expr.c:3355

2011-06-12 Thread H.J. Lu
On Sun, Jun 12, 2011 at 6:50 AM, Richard Guenther
 wrote:
> On Sun, Jun 12, 2011 at 3:18 PM, H.J. Lu  wrote:
>> On Sun, Jun 12, 2011 at 3:48 AM, Richard Guenther
>>  wrote:
>>> On Sat, Jun 11, 2011 at 5:09 PM, H.J. Lu  wrote:
 Hi,

 expand_builtin_strlen has

 src_reg = gen_reg_rtx (Pmode);
 ...
 pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
 if (pat != src_reg)
  emit_move_insn (src_reg, pat);

 But src_reg may be in ptr_mode, wich may not be the same as Pmode.
 This patch checks it.  OK for trunk?

 Thanks.


 H.J.
 ---
 2011-06-11  H.J. Lu  

        PR middle-end/47364
        * builtins.c (expand_builtin_strlen): Properly handle target
        not in Pmode.

 diff --git a/gcc/builtins.c b/gcc/builtins.c
 index 7b24a0c..4e2cf31 100644
 --- a/gcc/builtins.c
 +++ b/gcc/builtins.c
 @@ -2941,7 +2941,11 @@ expand_builtin_strlen (tree exp, rtx target,
       start_sequence ();
       pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
       if (pat != src_reg)
 -       emit_move_insn (src_reg, pat);
 +       {
 +         if (GET_MODE (pat) != Pmode)
 +           pat = convert_to_mode (Pmode, pat, 1);
>>>
>>> Shouldn't this be POINTERS_EXTEND_UNSIGNED instead of 1?
>>>
 +         emit_move_insn (src_reg, pat);
>>>
>>> Why not use convert_move unconditionally?
>>>
>>> Or, why not expand src in Pmode from the start?  After all, src_reg is
>>> created as Pmode reg.
>>>
>>
>> This patch works for my testcase.  OK for trunk?
>
> Ok if it passes bootstrap & regtest on a ptr_mode != Pmode target.
>

Only the following targets expand strlen:

avr/avr.md:(define_expand "strlenhi"
avr/avr.md:(define_insn "*strlenhi"
i386/i386.md:(define_expand "strlen"
i386/i386.md: if (ix86_expand_strlen (operands[0], operands[1],
operands[2], operands[3]))
i386/i386.md:(define_expand "strlenqi_1"
i386/i386.md:(define_insn "*strlenqi_1"
rs6000/rs6000.md:(define_expand "strlensi"
s390/s390.md:; strlenM instruction pattern(s).
s390/s390.md:(define_expand "strlen"
s390/s390.md:(define_insn "*strlen"

None of them, except for my x32 port, are ptr_mode != Pmode targets.
I will bootstrap and test it on my x32 branch.

Thanks.

-- 
H.J.


Re: PATCH [1/n]: Prepare x32: PR middle-end/47364: internal compiler error: in emit_move_insn, at expr.c:3355

2011-06-12 Thread H.J. Lu
On Sun, Jun 12, 2011 at 7:00 AM, H.J. Lu  wrote:
> On Sun, Jun 12, 2011 at 6:50 AM, Richard Guenther
>  wrote:
>> On Sun, Jun 12, 2011 at 3:18 PM, H.J. Lu  wrote:
>>> On Sun, Jun 12, 2011 at 3:48 AM, Richard Guenther
>>>  wrote:
 On Sat, Jun 11, 2011 at 5:09 PM, H.J. Lu  wrote:
> Hi,
>
> expand_builtin_strlen has
>
> src_reg = gen_reg_rtx (Pmode);
> ...
> pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
> if (pat != src_reg)
>  emit_move_insn (src_reg, pat);
>
> But src_reg may be in ptr_mode, wich may not be the same as Pmode.
> This patch checks it.  OK for trunk?
>
> Thanks.
>
>
> H.J.
> ---
> 2011-06-11  H.J. Lu  
>
>        PR middle-end/47364
>        * builtins.c (expand_builtin_strlen): Properly handle target
>        not in Pmode.
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 7b24a0c..4e2cf31 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -2941,7 +2941,11 @@ expand_builtin_strlen (tree exp, rtx target,
>       start_sequence ();
>       pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
>       if (pat != src_reg)
> -       emit_move_insn (src_reg, pat);
> +       {
> +         if (GET_MODE (pat) != Pmode)
> +           pat = convert_to_mode (Pmode, pat, 1);

 Shouldn't this be POINTERS_EXTEND_UNSIGNED instead of 1?

> +         emit_move_insn (src_reg, pat);

 Why not use convert_move unconditionally?

 Or, why not expand src in Pmode from the start?  After all, src_reg is
 created as Pmode reg.

>>>
>>> This patch works for my testcase.  OK for trunk?
>>
>> Ok if it passes bootstrap & regtest on a ptr_mode != Pmode target.
>>
>
> Only the following targets expand strlen:
>
> avr/avr.md:(define_expand "strlenhi"
> avr/avr.md:(define_insn "*strlenhi"
> i386/i386.md:(define_expand "strlen"
> i386/i386.md: if (ix86_expand_strlen (operands[0], operands[1],
> operands[2], operands[3]))
> i386/i386.md:(define_expand "strlenqi_1"
> i386/i386.md:(define_insn "*strlenqi_1"
> rs6000/rs6000.md:(define_expand "strlensi"
> s390/s390.md:; strlenM instruction pattern(s).
> s390/s390.md:(define_expand "strlen"
> s390/s390.md:(define_insn "*strlen"
>
> None of them, except for my x32 port, are ptr_mode != Pmode targets.
> I will bootstrap and test it on my x32 branch.
>

It doesn't work on x32. I got

/export/gnu/import/git/gcc-x32/libssp/gets-chk.c:74:14: internal
compiler error: in emit_move_insn, at expr.c:3319
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

How about this patch?

Thanks.

-- 
H.J.
---
2011-06-12  H.J. Lu  

PR middle-end/47364
* builtins.c (expand_builtin_strlen): Properly handle target
not in Pmode.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 7b24a0c..a2f175d 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2941,7 +2941,14 @@ expand_builtin_strlen (tree exp, rtx target,
   start_sequence ();
   pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
   if (pat != src_reg)
-   emit_move_insn (src_reg, pat);
+   {
+#ifdef POINTERS_EXTEND_UNSIGNED
+ if (GET_MODE (pat) != Pmode)
+   pat = convert_to_mode (Pmode, pat,
+  POINTERS_EXTEND_UNSIGNED);
+#endif
+ emit_move_insn (src_reg, pat);
+   }
   pat = get_insns ();
   end_sequence ();


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread Jan Hubicka
> 
> This also pretty much destroyed C++ for ia32:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49378
> http://gcc.gnu.org/ml/gcc-regression/2011-06/msg00159.html

Hi,
It seems somewhat amazing that we hit kernel sensitive miscompilation here.
The problem most probably is the fact that thunks and functions with thunks can 
become
local. This is correct since thunks are represented as direct calls now, but 
this
makes i386 to use local ABI when calling or compiling them.

Does the following patch help? We may also need to look for the presence of 
thunk
callers.

Index: ipa.c
===
--- ipa.c   (revision 174958)
+++ ipa.c   (working copy)
@@ -120,6 +120,7 @@ static bool
 cgraph_non_local_node_p_1 (struct cgraph_node *node, void *data 
ATTRIBUTE_UNUSED)
 {
return !(cgraph_only_called_directly_or_aliased_p (node)
+   && !ipa_ref_has_aliases_p (&node->ref_list)
&& node->analyzed
&& !DECL_EXTERNAL (node->decl)
&& !node->local.externally_visible
@@ -132,7 +133,11 @@ cgraph_non_local_node_p_1 (struct cgraph
 static bool
 cgraph_local_node_p (struct cgraph_node *node)
 {
-   return !cgraph_for_node_and_aliases (cgraph_function_or_thunk_node (node, 
NULL),
+   struct cgraph_node *n = cgraph_function_or_thunk_node (node, NULL);
+   
+   if (n->thunk.thunk_p)
+ return false;
+   return !cgraph_for_node_and_aliases (n,
cgraph_non_local_node_p_1, NULL, true);

 }


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread H.J. Lu
On Sun, Jun 12, 2011 at 7:54 AM, Jan Hubicka  wrote:
>>
>> This also pretty much destroyed C++ for ia32:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49378
>> http://gcc.gnu.org/ml/gcc-regression/2011-06/msg00159.html
>
> Hi,
> It seems somewhat amazing that we hit kernel sensitive miscompilation here.
> The problem most probably is the fact that thunks and functions with thunks 
> can become
> local. This is correct since thunks are represented as direct calls now, but 
> this
> makes i386 to use local ABI when calling or compiling them.
>
> Does the following patch help? We may also need to look for the presence of 
> thunk
> callers.
>
> Index: ipa.c
> ===
> --- ipa.c       (revision 174958)
> +++ ipa.c       (working copy)
> @@ -120,6 +120,7 @@ static bool
>  cgraph_non_local_node_p_1 (struct cgraph_node *node, void *data 
> ATTRIBUTE_UNUSED)
>  {
>    return !(cgraph_only_called_directly_or_aliased_p (node)
> +           && !ipa_ref_has_aliases_p (&node->ref_list)
>            && node->analyzed
>            && !DECL_EXTERNAL (node->decl)
>            && !node->local.externally_visible
> @@ -132,7 +133,11 @@ cgraph_non_local_node_p_1 (struct cgraph
>  static bool
>  cgraph_local_node_p (struct cgraph_node *node)
>  {
> -   return !cgraph_for_node_and_aliases (cgraph_function_or_thunk_node (node, 
> NULL),
> +   struct cgraph_node *n = cgraph_function_or_thunk_node (node, NULL);
> +
> +   if (n->thunk.thunk_p)
> +     return false;
> +   return !cgraph_for_node_and_aliases (n,
>                                        cgraph_non_local_node_p_1, NULL, true);
>
>  }
>

I am testing it now.  Will know the results in 2 hours.

Thanks.

-- 
H.J.


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread Jan Hubicka
> I am testing it now.  Will know the results in 2 hours.
Thanks.
Could you also send me the preprocessed source for future.o?  The object file I
am getting don't have the bug you report, that is most probably due to glibc
difference.

Honza
> 
> Thanks.
> 
> -- 
> H.J.


Re: Have a boehm-gc patch for gcj/rtems

2011-06-12 Thread Jie Liu
Hi,

Java HelloWorld compiled by cross gcj for RTEMS can run without
problem now, I think it's time for get the boehm-gc patch reviewed and
merged.  :)  Here is the patch:

Index: mach_dep.c
===
--- mach_dep.c  (revision 172224)
+++ mach_dep.c  (working copy)
@@ -411,7 +411,8 @@
 word dummy;

 #   if defined(USE_GENERIC_PUSH_REGS)
-# ifdef HAVE_BUILTIN_UNWIND_INIT
+# if defined(HAVE_BUILTIN_UNWIND_INIT)\
+ &&!(defined(I386) && defined(RTEMS))
 /* This was suggested by Richard Henderson as the way to   */
 /* force callee-save registers and register windows onto   */
 /* the stack.  */
@@ -437,8 +438,8 @@
for (; (char *)i < lim; i++) {
*i = 0;
}
-#   if defined(MSWIN32) || defined(MSWINCE) \
-  || defined(UTS4) || defined(LINUX) || defined(EWS4800)
+#   if defined(MSWIN32) || defined(MSWINCE) || defined(UTS4) \
+   || defined(LINUX) || defined(EWS4800) || defined(RTEMS)
  (void) setjmp(regs);
 #   else
 (void) _setjmp(regs);
Index: include/private/gcconfig.h
===
--- include/private/gcconfig.h  (revision 172224)
+++ include/private/gcconfig.h  (working copy)
@@ -315,6 +315,11 @@
 #define mach_type_known
 #   endif
 # endif
+# if defined(__rtems__) && (defined(i386) || defined(__i386__))
+#   define I386
+#   define RTEMS
+#   define mach_type_known
+# endif
 # if defined(NeXT) && defined(mc68000)
 #   define M68K
 #   define NEXT
@@ -1297,6 +1302,17 @@
 #  define STACKBOTTOM ((ptr_t)0xc000)
 #  define DATAEND  /* not needed */
 #   endif
+#   ifdef RTEMS
+#   define OS_TYPE "RTEMS"
+#   include 
+extern int etext[];
+extern int end[];
+void *rtems_get_stack_bottom();
+#   define InitStackBottom rtems_get_stack_bottom()
+#   define DATASTART ((ptr_t)etext)
+#   define DATAEND ((ptr_t)end)
+#   define STACKBOTTOM ((ptr_t)InitStackBottom)
+#   endif
 #   ifdef DOS4GW
 # define OS_TYPE "DOS4GW"
   extern long __nullarea;
@@ -2370,7 +2386,8 @@
 #   else
 # if defined(NEXT) || defined(DOS4GW) || \
 (defined(AMIGA) && !defined(GC_AMIGA_FASTALLOC)) || \
-(defined(SUNOS5) && !defined(USE_MMAP))
+(defined(SUNOS5) && !defined(USE_MMAP)) || \
+ defined(RTEMS)
 #   define GET_MEM(bytes) HBLKPTR((size_t) \
  calloc(1, (size_t)bytes
+ GC_page_size) \
  + GC_page_size-1)
Index: configure
===
--- configure   (revision 172224)
+++ configure   (working copy)
@@ -14867,6 +14867,8 @@
 $as_echo "#define _REENTRANT 1" >>confdefs.h

 ;;
+ rtems)
+;;
  decosf1 | irix | mach | os2 | dce | vxworks)
 as_fn_error "thread package $THREADS not yet supported" "$LINENO" 5
 ;;
Index: os_dep.c
===
--- os_dep.c(revision 172224)
+++ os_dep.c(working copy)
@@ -1507,7 +1507,7 @@

 # if !defined(OS2) && !defined(PCR) && !defined(AMIGA) \
&& !defined(MSWIN32) && !defined(MSWINCE) \
-   && !defined(MACOS) && !defined(DOS4GW)
+   && !defined(MACOS) && !defined(DOS4GW) && !defined(RTEMS)

 # ifdef SUNOS4
 extern caddr_t sbrk();

Best Regards,
Jie

2011/6/4 Jie Liu :
> Hi,
>
> I am working on porting gcj to rtems now, it's a project of GSoC2011.[1]
> And now, the first step: boehm-gc have been ported,  so I want to get
> this patch reviewed and merged.
>
> And I have filed the FSF Paperwork, the patch has been attached.
> Thank you for your time.
>
> Best Regards,
> Jie
>
> [1]http://socghop.appspot.com/gsoc/org/google/gsoc2011/rtems
>


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread H.J. Lu
On Sun, Jun 12, 2011 at 7:54 AM, Jan Hubicka  wrote:
>>
>> This also pretty much destroyed C++ for ia32:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49378
>> http://gcc.gnu.org/ml/gcc-regression/2011-06/msg00159.html
>
> Hi,
> It seems somewhat amazing that we hit kernel sensitive miscompilation here.
> The problem most probably is the fact that thunks and functions with thunks 
> can become
> local. This is correct since thunks are represented as direct calls now, but 
> this
> makes i386 to use local ABI when calling or compiling them.
>

For x86-64, we use the same ABI for local and global. But RAX seems
used and uninitialized in thunk.

-- 
H.J.


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread H.J. Lu
On Sun, Jun 12, 2011 at 8:01 AM, H.J. Lu  wrote:
> On Sun, Jun 12, 2011 at 7:54 AM, Jan Hubicka  wrote:
>>>
>>> This also pretty much destroyed C++ for ia32:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49378
>>> http://gcc.gnu.org/ml/gcc-regression/2011-06/msg00159.html
>>
>> Hi,
>> It seems somewhat amazing that we hit kernel sensitive miscompilation here.
>> The problem most probably is the fact that thunks and functions with thunks 
>> can become
>> local. This is correct since thunks are represented as direct calls now, but 
>> this
>> makes i386 to use local ABI when calling or compiling them.
>>
>> Does the following patch help? We may also need to look for the presence of 
>> thunk
>> callers.
>>
>> Index: ipa.c
>> ===
>> --- ipa.c       (revision 174958)
>> +++ ipa.c       (working copy)
>> @@ -120,6 +120,7 @@ static bool
>>  cgraph_non_local_node_p_1 (struct cgraph_node *node, void *data 
>> ATTRIBUTE_UNUSED)
>>  {
>>    return !(cgraph_only_called_directly_or_aliased_p (node)
>> +           && !ipa_ref_has_aliases_p (&node->ref_list)
>>            && node->analyzed
>>            && !DECL_EXTERNAL (node->decl)
>>            && !node->local.externally_visible
>> @@ -132,7 +133,11 @@ cgraph_non_local_node_p_1 (struct cgraph
>>  static bool
>>  cgraph_local_node_p (struct cgraph_node *node)
>>  {
>> -   return !cgraph_for_node_and_aliases (cgraph_function_or_thunk_node 
>> (node, NULL),
>> +   struct cgraph_node *n = cgraph_function_or_thunk_node (node, NULL);
>> +
>> +   if (n->thunk.thunk_p)
>> +     return false;
>> +   return !cgraph_for_node_and_aliases (n,
>>                                        cgraph_non_local_node_p_1, NULL, 
>> true);
>>
>>  }
>>
>
> I am testing it now.  Will know the results in 2 hours.
>

Yes, it fixes C++ regressions.

Thanks.

-- 
H.J.


Re: PATCH [1/n]: Prepare x32: PR middle-end/47364: internal compiler error: in emit_move_insn, at expr.c:3355

2011-06-12 Thread H.J. Lu
On Sun, Jun 12, 2011 at 7:33 AM, H.J. Lu  wrote:
> On Sun, Jun 12, 2011 at 7:00 AM, H.J. Lu  wrote:
>> On Sun, Jun 12, 2011 at 6:50 AM, Richard Guenther
>>  wrote:
>>> On Sun, Jun 12, 2011 at 3:18 PM, H.J. Lu  wrote:
 On Sun, Jun 12, 2011 at 3:48 AM, Richard Guenther
  wrote:
> On Sat, Jun 11, 2011 at 5:09 PM, H.J. Lu  wrote:
>> Hi,
>>
>> expand_builtin_strlen has
>>
>> src_reg = gen_reg_rtx (Pmode);
>> ...
>> pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
>> if (pat != src_reg)
>>  emit_move_insn (src_reg, pat);
>>
>> But src_reg may be in ptr_mode, wich may not be the same as Pmode.
>> This patch checks it.  OK for trunk?
>>
>> Thanks.
>>
>>
>> H.J.
>> ---
>> 2011-06-11  H.J. Lu  
>>
>>        PR middle-end/47364
>>        * builtins.c (expand_builtin_strlen): Properly handle target
>>        not in Pmode.
>>
>> diff --git a/gcc/builtins.c b/gcc/builtins.c
>> index 7b24a0c..4e2cf31 100644
>> --- a/gcc/builtins.c
>> +++ b/gcc/builtins.c
>> @@ -2941,7 +2941,11 @@ expand_builtin_strlen (tree exp, rtx target,
>>       start_sequence ();
>>       pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
>>       if (pat != src_reg)
>> -       emit_move_insn (src_reg, pat);
>> +       {
>> +         if (GET_MODE (pat) != Pmode)
>> +           pat = convert_to_mode (Pmode, pat, 1);
>
> Shouldn't this be POINTERS_EXTEND_UNSIGNED instead of 1?
>
>> +         emit_move_insn (src_reg, pat);
>
> Why not use convert_move unconditionally?
>
> Or, why not expand src in Pmode from the start?  After all, src_reg is
> created as Pmode reg.
>

 This patch works for my testcase.  OK for trunk?
>>>
>>> Ok if it passes bootstrap & regtest on a ptr_mode != Pmode target.
>>>
>>
>> Only the following targets expand strlen:
>>
>> avr/avr.md:(define_expand "strlenhi"
>> avr/avr.md:(define_insn "*strlenhi"
>> i386/i386.md:(define_expand "strlen"
>> i386/i386.md: if (ix86_expand_strlen (operands[0], operands[1],
>> operands[2], operands[3]))
>> i386/i386.md:(define_expand "strlenqi_1"
>> i386/i386.md:(define_insn "*strlenqi_1"
>> rs6000/rs6000.md:(define_expand "strlensi"
>> s390/s390.md:; strlenM instruction pattern(s).
>> s390/s390.md:(define_expand "strlen"
>> s390/s390.md:(define_insn "*strlen"
>>
>> None of them, except for my x32 port, are ptr_mode != Pmode targets.
>> I will bootstrap and test it on my x32 branch.
>>
>
> It doesn't work on x32. I got
>
> /export/gnu/import/git/gcc-x32/libssp/gets-chk.c:74:14: internal
> compiler error: in emit_move_insn, at expr.c:3319
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See  for instructions.
>
> How about this patch?
>
> Thanks.

No regressions on x32 branch.  OK for trunk?

Thanks.

> --
> H.J.
> ---
> 2011-06-12  H.J. Lu  
>
>        PR middle-end/47364
>        * builtins.c (expand_builtin_strlen): Properly handle target
>        not in Pmode.
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 7b24a0c..a2f175d 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -2941,7 +2941,14 @@ expand_builtin_strlen (tree exp, rtx target,
>       start_sequence ();
>       pat = expand_expr (src, src_reg, ptr_mode, EXPAND_NORMAL);
>       if (pat != src_reg)
> -       emit_move_insn (src_reg, pat);
> +       {
> +#ifdef POINTERS_EXTEND_UNSIGNED
> +         if (GET_MODE (pat) != Pmode)
> +           pat = convert_to_mode (Pmode, pat,
> +                                  POINTERS_EXTEND_UNSIGNED);
> +#endif
> +         emit_move_insn (src_reg, pat);
> +       }
>       pat = get_insns ();
>       end_sequence ();
>



-- 
H.J.


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread Jan Hubicka
> On Sun, Jun 12, 2011 at 7:54 AM, Jan Hubicka  wrote:
> >>
> >> This also pretty much destroyed C++ for ia32:
> >>
> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49378
> >> http://gcc.gnu.org/ml/gcc-regression/2011-06/msg00159.html
> >
> > Hi,
> > It seems somewhat amazing that we hit kernel sensitive miscompilation here.
> > The problem most probably is the fact that thunks and functions with thunks 
> > can become
> > local. This is correct since thunks are represented as direct calls now, 
> > but this
> > makes i386 to use local ABI when calling or compiling them.
> >
> 
> For x86-64, we use the same ABI for local and global. But RAX seems
> used and uninitialized in thunk.
0006d270 <_ZN12_GLOBAL__N_121system_error_categoryD0Ev>:
   6d270:   48 8d 05 79 d4 27 00lea0x27d479(%rip),%rax#
2ea6f0 <_ZTVN12_GLOBAL__N_121system_error_categoryE+0x10>
   6d277:   53  push   %rbx
   6d278:   48 89 fbmov%rdi,%rbx
   6d27b:   48 89 07mov%rax,(%rdi)
   6d27e:   e8 55 a0 fe ff  callq  572d8
<_ZNSt14error_categoryD2Ev@plt>
   6d283:   48 89 dfmov%rbx,%rdi
   6d286:   5b  pop%rbx
   6d287:   e9 2c 9d fe ff  jmpq   56fb8 <_ZdlPv@plt>
   6d28c:   90  nop
   6d28d:   90  nop
   6d28e:   90  nop
   6d28f:   90  nop

I don't see uinitialized RAX here.  It is set by the first LEA
I will commit the patch now to unbreak x86 and work on more proper solution 
after debugging
the plugin usses.

Thanks for testing!
Honza
> 
> -- 
> H.J.


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread Jan Hubicka
Hi,
this is the fix (or rather a workaround) i comitted.  Thanks!

Index: ChangeLog
===
--- ChangeLog   (revision 174968)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2011-06-11  Jan Hubicka  
+
+   PR middle-end/49378
+   * ipa.c (cgraph_non_local_node_p_1, cgraph_local_node_p): Rule out
+   aliases and thunks.
+
 2011-06-12  Ira Rosen  
 
* tree-vect-data-refs.c (vect_peeling_hash_get_most_frequent):
Index: ipa.c
===
--- ipa.c   (revision 174958)
+++ ipa.c   (working copy)
@@ -119,7 +119,9 @@ process_references (struct ipa_ref_list 
 static bool
 cgraph_non_local_node_p_1 (struct cgraph_node *node, void *data 
ATTRIBUTE_UNUSED)
 {
+   /* FIXME: Aliases can be local, but i386 gets thunks wrong then.  */
return !(cgraph_only_called_directly_or_aliased_p (node)
+   && !ipa_ref_has_aliases_p (&node->ref_list)
&& node->analyzed
&& !DECL_EXTERNAL (node->decl)
&& !node->local.externally_visible
@@ -132,7 +134,13 @@ cgraph_non_local_node_p_1 (struct cgraph
 static bool
 cgraph_local_node_p (struct cgraph_node *node)
 {
-   return !cgraph_for_node_and_aliases (cgraph_function_or_thunk_node (node, 
NULL),
+   struct cgraph_node *n = cgraph_function_or_thunk_node (node, NULL);
+
+   /* FIXME: thunks can be considered local, but we need prevent i386
+  from attempting to change calling convention of them.  */
+   if (n->thunk.thunk_p)
+ return false;
+   return !cgraph_for_node_and_aliases (n,
cgraph_non_local_node_p_1, NULL, true);

 }


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread H.J. Lu
On Sun, Jun 12, 2011 at 9:33 AM, Jan Hubicka  wrote:
>> On Sun, Jun 12, 2011 at 7:54 AM, Jan Hubicka  wrote:
>> >>
>> >> This also pretty much destroyed C++ for ia32:
>> >>
>> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49378
>> >> http://gcc.gnu.org/ml/gcc-regression/2011-06/msg00159.html
>> >
>> > Hi,
>> > It seems somewhat amazing that we hit kernel sensitive miscompilation here.
>> > The problem most probably is the fact that thunks and functions with 
>> > thunks can become
>> > local. This is correct since thunks are represented as direct calls now, 
>> > but this
>> > makes i386 to use local ABI when calling or compiling them.
>> >
>>
>> For x86-64, we use the same ABI for local and global. But RAX seems
>> used and uninitialized in thunk.
> 0006d270 <_ZN12_GLOBAL__N_121system_error_categoryD0Ev>:
>   6d270:       48 8d 05 79 d4 27 00    lea    0x27d479(%rip),%rax        #
> 2ea6f0 <_ZTVN12_GLOBAL__N_121system_error_categoryE+0x10>
>   6d277:       53                      push   %rbx
>   6d278:       48 89 fb                mov    %rdi,%rbx
>   6d27b:       48 89 07                mov    %rax,(%rdi)
>   6d27e:       e8 55 a0 fe ff          callq  572d8
> <_ZNSt14error_categoryD2Ev@plt>
>   6d283:       48 89 df                mov    %rbx,%rdi
>   6d286:       5b                      pop    %rbx
>   6d287:       e9 2c 9d fe ff          jmpq   56fb8 <_ZdlPv@plt>
>   6d28c:       90                      nop
>   6d28d:       90                      nop
>   6d28e:       90                      nop
>   6d28f:       90                      nop
>
> I don't see uinitialized RAX here.  It is set by the first LEA

You are right.

-- 
H.J.


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread H.J. Lu
On Sun, Jun 12, 2011 at 9:40 AM, Jan Hubicka  wrote:
> Hi,
> this is the fix (or rather a workaround) i comitted.  Thanks!
>
> Index: ChangeLog
> ===
> --- ChangeLog   (revision 174968)
> +++ ChangeLog   (working copy)
> @@ -1,3 +1,9 @@
> +2011-06-11  Jan Hubicka  
> +
> +       PR middle-end/49378
> +       * ipa.c (cgraph_non_local_node_p_1, cgraph_local_node_p): Rule out
> +       aliases and thunks.
> +
>  2011-06-12  Ira Rosen  
>
>        * tree-vect-data-refs.c (vect_peeling_hash_get_most_frequent):
> Index: ipa.c
> ===
> --- ipa.c       (revision 174958)
> +++ ipa.c       (working copy)
> @@ -119,7 +119,9 @@ process_references (struct ipa_ref_list
>  static bool
>  cgraph_non_local_node_p_1 (struct cgraph_node *node, void *data 
> ATTRIBUTE_UNUSED)
>  {
> +   /* FIXME: Aliases can be local, but i386 gets thunks wrong then.  */
>    return !(cgraph_only_called_directly_or_aliased_p (node)
> +           && !ipa_ref_has_aliases_p (&node->ref_list)
>            && node->analyzed
>            && !DECL_EXTERNAL (node->decl)
>            && !node->local.externally_visible
> @@ -132,7 +134,13 @@ cgraph_non_local_node_p_1 (struct cgraph
>  static bool
>  cgraph_local_node_p (struct cgraph_node *node)
>  {
> -   return !cgraph_for_node_and_aliases (cgraph_function_or_thunk_node (node, 
> NULL),
> +   struct cgraph_node *n = cgraph_function_or_thunk_node (node, NULL);
> +
> +   /* FIXME: thunks can be considered local, but we need prevent i386
> +      from attempting to change calling convention of them.  */
> +   if (n->thunk.thunk_p)
> +     return false;
> +   return !cgraph_for_node_and_aliases (n,
>                                        cgraph_non_local_node_p_1, NULL, true);
>
>  }
>

I think the thunk local binding is another problem related to:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35513


-- 
H.J.


Re: Cgraph alias reorg 15/14 (New infrastructure for same body aliases)

2011-06-12 Thread Jan Hubicka
Hi,
this patch solves the bultin/strlen-3.c LTO linker plugin problem.
While removing alias code I was bit overactive and removed the check that makes 
us to implicitly
do -fwhole-program when resolution info is around.

It is not quite clear to me why in LTO we need -fwhole-program to get the 
testcase right.
Bootstrap/regtest in progress, will commit it once it passes.

* ipa.c (cgraph_exernally_visible_p): Return accidentally removed check
for LDPR_PREVAILING_DEF_IRONLY.
Index: ipa.c
===
--- ipa.c   (revision 174955)
+++ ipa.c   (working copy)
@@ -614,6 +614,8 @@
   /* If linker counts on us, we must preserve the function.  */
   if (cgraph_used_from_object_file_p (node))
 return true;
+  if (node->resolution == LDPR_PREVAILING_DEF_IRONLY)
+return false;
   if (DECL_PRESERVE_P (node->decl))
 return true;
   if (lookup_attribute ("externally_visible", DECL_ATTRIBUTES (node->decl)))


PR tree-optimize/48836 (internal compiler error: in execute_todo, at passes.c:1261)

2011-06-12 Thread Jan Hubicka
Hi,
I had this patch sitting in my tree for a while, but somehow forgot to commit 
it.
The issue here is that edge redirection affects vops and thus imply need for
SSA update in some cases.

Bootstrapped/regtested x86_64-linux, will commit it shortly.
PR middle-end/48836
* ipa-inline-transform.c: Include tree-pass.h
(inline_transform): Set TODO_update_ssa_only_virtuals.
* Makefile.in (ipa-inline-transform.o): Add tree-pass.h.
Index: ipa-inline-transform.c
===
--- ipa-inline-transform.c  (revision 174958)
+++ ipa-inline-transform.c  (working copy)
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  
 #include "ipa-prop.h"
 #include "ipa-inline.h"
 #include "tree-inline.h"
+#include "tree-pass.h"
 
 int ncalls_inlined;
 int nfunctions_inlined;
@@ -338,6 +339,8 @@ inline_transform (struct cgraph_node *no
   cgraph_redirect_edge_call_stmt_to_callee (e);
   if (!e->inline_failed || warn_inline)
 inline_p = true;
+  /* Redirecting edges might lead to a need for vops to be recomputed.  */
+  todo |= TODO_update_ssa_only_virtuals;
 }
 
   if (inline_p)
Index: Makefile.in
===
--- Makefile.in (revision 174958)
+++ Makefile.in (working copy)
@@ -3087,7 +3087,7 @@ ipa-inline-transform.o : ipa-inline-tran
$(TREE_H) langhooks.h $(TREE_INLINE_H) $(FLAGS_H) $(CGRAPH_H) intl.h \
$(DIAGNOSTIC_H) $(PARAMS_H) $(TIMEVAR_H) $(TREE_PASS_H) \
$(HASHTAB_H) $(COVERAGE_H) $(GGC_H) $(TREE_FLOW_H) $(IPA_PROP_H) \
-   gimple-pretty-print.h ipa-inline.h $(LTO_STREAMER_H)
+   gimple-pretty-print.h ipa-inline.h $(LTO_STREAMER_H) tree-pass.h
 ipa-utils.o : ipa-utils.c $(IPA_UTILS_H) $(CONFIG_H) $(SYSTEM_H) \
coretypes.h $(TM_H) $(TREE_H) $(TREE_FLOW_H) $(TREE_INLINE_H) langhooks.h \
pointer-set.h $(GGC_H) $(GIMPLE_H) $(SPLAY_TREE_H) \


Re: Is VIEW_CONVERT_EXPR an lvalue? (was Re: RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10)))

2011-06-12 Thread Jason Merrill

On 06/12/2011 06:59 AM, Richard Guenther wrote:

The please provide a specification on what a VIEW_CONVERT_EXPR does
to type-based alias analysis.


If the alias set of the VIEW_CONVERT_EXPR type the same as the set for 
the operand, ignore it; if it's a subset, handle it like a 
COMPONENT_REF; otherwise ignore the operand for TBAA.


It seems like get_alias_set currently gets this backwards; it's ignoring 
outer COMPONENT_REFs instead of the inner structure.



Yes, we do handle lvalue VIEW_CONVERT_EXPRs, but that is for Ada
which uses it for aggregates.


It also seems to be widely used for vectors, but perhaps that's only for 
rvalues.



I don't want us to add more lvalue
VIEW_CONVERT_EXPR cases, especially not for register types.


Then how do we convert an int lvalue to a volatile int lvalue?


/* Represents viewing something of one type as being of a second type.
   This corresponds to an "Unchecked Conversion" in Ada and roughly to
   the idiom *(type2 *)&X in C.


Right, that's why I thought it was an lvalue.


   This code may also be used within the LHS of a MODIFY_EXPR


And this.

Jason


[patch, fortran] Final TRIM optimizations

2011-06-12 Thread Thomas Koenig

Hello world,

this is the last round of TRIM optimizations.  This patch extends the
treatment of trailing TRIMs in concatenations to comparisions.  It also
does a bit of code cleanup by removing some duplication, and by not
changing the rhs in optimize_assignment.

OK for trunk?

Thomas

2011-06-13  Thomas Koenig  

* frontend-passes.c (remove_trim):  New function.
(optimize_assignment):  Use it.
(optimize_comparison):  Likewise.  Return correct status
for previous change.

2011-06-13  Thomas Koenig  

* gfortran.dg/trim_optimize_8.f90:  New test case.
Index: frontend-passes.c
===
--- frontend-passes.c	(Revision 174958)
+++ frontend-passes.c	(Arbeitskopie)
@@ -486,6 +486,35 @@ optimize_binop_array_assignment (gfc_code *c, gfc_
   return false;
 }
 
+/* Remove unneeded TRIMs at the end of expressions.  */
+
+static bool
+remove_trim (gfc_expr *rhs)
+{
+  bool ret;
+
+  ret = false;
+
+  /* Check for a // b // trim(c).  Looping is probably not
+ necessary because the parser usually generates
+ (// (// a b ) trim(c) ) , but better safe than sorry.  */
+
+  while (rhs->expr_type == EXPR_OP
+	 && rhs->value.op.op == INTRINSIC_CONCAT)
+rhs = rhs->value.op.op2;
+
+  while (rhs->expr_type == EXPR_FUNCTION && rhs->value.function.isym
+	 && rhs->value.function.isym->id == GFC_ISYM_TRIM)
+{
+  strip_function_call (rhs);
+  /* Recursive call to catch silly stuff like trim ( a // trim(b)).  */
+  remove_trim (rhs);
+  ret = true;
+}
+
+  return ret;
+}
+
 /* Optimizations for an assignment.  */
 
 static void
@@ -499,25 +528,8 @@ optimize_assignment (gfc_code * c)
   /* Optimize away a = trim(b), where a is a character variable.  */
 
   if (lhs->ts.type == BT_CHARACTER)
-{
-  /* Check for a // b // trim(c).  Looping is probably not
-	 necessary because the parser usually generates
-	 (// (// a b ) trim(c) ) , but better safe than sorry.  */
+remove_trim (rhs);
 
-  while (rhs->expr_type == EXPR_OP
-	 && rhs->value.op.op == INTRINSIC_CONCAT)
-	rhs = rhs->value.op.op2;
-
-  if (rhs->expr_type == EXPR_FUNCTION &&
-	  rhs->value.function.isym &&
-	  rhs->value.function.isym->id == GFC_ISYM_TRIM)
-	{
-	  strip_function_call (rhs);
-	  optimize_assignment (c);
-	  return;
-	}
-}
-
   if (lhs->rank > 0 && gfc_check_dependency (lhs, rhs, true) == 0)
 optimize_binop_array_assignment (c, &rhs, false);
 }
@@ -639,36 +651,17 @@ optimize_comparison (gfc_expr *e, gfc_intrinsic_op
 
   /* Strip off unneeded TRIM calls from string comparisons.  */
 
-  change = false;
+  change = remove_trim (op1);
 
-  if (op1->expr_type == EXPR_FUNCTION 
-  && op1->value.function.isym
-  && op1->value.function.isym->id == GFC_ISYM_TRIM)
-{
-  strip_function_call (op1);
-  change = true;
-}
+  if (remove_trim (op2))
+change = true;
 
-  if (op2->expr_type == EXPR_FUNCTION 
-  && op2->value.function.isym
-  && op2->value.function.isym->id == GFC_ISYM_TRIM)
-{
-  strip_function_call (op2);
-  change = true;
-}
-
-  if (change)
-{
-  optimize_comparison (e, op);
-  return true;
-}
-
   /* An expression of type EXPR_CONSTANT is only valid for scalars.  */
   /* TODO: A scalar constant may be acceptable in some cases (the scalarizer
  handles them well). However, there are also cases that need a non-scalar
  argument. For example the any intrinsic. See PR 45380.  */
   if (e->rank > 0)
-return false;
+return change;
 
   /* Don't compare REAL or COMPLEX expressions when honoring NaNs.  */
 
@@ -698,7 +691,7 @@ optimize_comparison (gfc_expr *e, gfc_intrinsic_op
 			&& op2_left->expr_type == EXPR_CONSTANT
 			&& op1_left->value.character.length
 			   != op2_left->value.character.length)
-		return false;
+		return change;
 		  else
 		{
 		  free (op1_left);
@@ -787,7 +780,7 @@ optimize_comparison (gfc_expr *e, gfc_intrinsic_op
 	}
 }
 
-  return false;
+  return change;
 }
 
 /* Optimize a trim function by replacing it with an equivalent substring
! { dg-do compile }
! { dg-options "-O -fdump-tree-original" }
! Check that trailing trims are also removed from assignment of
! expressions involving concatenations of strings .
program main
  character(2) :: a,b
  character(8) :: d
  a = 'a '
  b = 'b '
  if (trim(a // trim(b)) /= 'a b ') call abort
  if (trim (trim(a) // trim(b)) /= 'ab ') call abort
end
! { dg-final { scan-tree-dump-times "string_len_trim" 1 "original" } }
! { dg-final { cleanup-tree-dump "original" } }


Re: [patch, fortran] Final TRIM optimizations

2011-06-12 Thread Steve Kargl
On Mon, Jun 13, 2011 at 12:24:02AM +0200, Thomas Koenig wrote:
> Hello world,
> 
> this is the last round of TRIM optimizations.  This patch extends the
> treatment of trailing TRIMs in concatenations to comparisions.  It also
> does a bit of code cleanup by removing some duplication, and by not
> changing the rhs in optimize_assignment.
> 
> OK for trunk?
> 
>   Thomas
> 
> 2011-06-13  Thomas Koenig  
> 
> * frontend-passes.c (remove_trim):  New function.
> (optimize_assignment):  Use it.
> (optimize_comparison):  Likewise.  Return correct status
> for previous change.
> 
> 2011-06-13  Thomas Koenig  
> 
> * gfortran.dg/trim_optimize_8.f90:  New test case.

OK.

-- 
Steve


Re: Is VIEW_CONVERT_EXPR an lvalue? (was Re: RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10)))

2011-06-12 Thread Mike Stump
On Jun 12, 2011, at 4:03 AM, Richard Guenther wrote:
> Btw, see tree.def which says
> 
> /* Represents viewing something of one type as being of a second type.
>   This corresponds to an "Unchecked Conversion" in Ada and roughly to
>   the idiom *(type2 *)&X in C.  The only operand is the value to be
>   viewed as being of another type.  It is undefined if the type of the
>   input and of the expression have different sizes.
> 
>   This code may also be used within the LHS of a MODIFY_EXPR, in which
>   case no actual data motion may occur.  TREE_ADDRESSABLE will be set in
>   this case and GCC must abort if it could not do the operation without
>   generating insns.  */

I wasn't able to follow what this was trying to say.  :-(  No actual data 
motion may occur?  The wording is weasely.  Does it mean: Data motion does not 
occur when used on the LHS of a MODIFY_EXPR?  If so, it should just directly 
state it.


Re: RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10))

2011-06-12 Thread Mike Stump
On Jun 12, 2011, at 3:55 AM, Richard Guenther wrote:
> In almost all cases(*) the need for a lvalue VIEW_CONVERT_EXPR can be avoided
> by moving the VIEW_CONVERT_EXPR to the rvalue assigned too it.  Remember that
> VIEW_CONVERT_EXPR always conver the full object and are not allowed to
> change sizes.
> 
> So, do you have an example?

Sure, take a divmod type of instruction.[1]  It has two outputs, a div and a 
mod.  The instruction operates on registers, and produces two completely 
independent values.  But really, it is a general features of the built-in 
mechanism that Kenny and I are working on that some people would like to reuse 
for other ports.  The general feature is that one can declare any argument to a 
built-in to be input only, output only or input/output.  This nicely matches 
what I think quite of lot of machines do for assembly language semantics.  The 
support isn't to match my machine, but rather to support building a port, in 
which 1 or more instructions have such parameters.  Requiring memory is a 
non-starter, and in fact, we already have a scheme for memory_operand type 
things for the instructions that take memory.  The scheme used for them is 
borrowed from C++, where we just declare that the built-in takes a reference 
type.  This reuses most of the code paths from C++ and it seems to work nicely. 
 I'd be tempting to use it for register references, but, in my current scheme 
for registers, I support data flow in, out and in/out at the tree level and at 
the rtl level.  We believe this is nice from an optimizing perspective, and 
probably required to get the warnings about using uninitialized variables 
correct.


[patch] Fix PR tree-optimization/49352

2011-06-12 Thread Ira Rosen
Hi,

This patch fixes PR 49352 by ignoring debug uses in SLP reduction detection.
While fixing it Jakub also discovered that an incorrect statement may
be analyzed and operands of not commutative operation may be swapped.
The patch fixes those as well.

Bootstrapped and tested on powerpc64-suse-linux.
Committed.

Ira

ChangeLog:

2011-06-13 Jakub Jelinek  
   Ira Rosen  

PR tree-optimization/49352
* tree-vect-loop.c (vect_is_slp_reduction): Don't count debug uses at
all, make sure loop_use_stmt after the loop is a def stmt of a used
SSA_NAME that is the only one defined inside of the loop.  Don't
check for COND_EXPR and GIMPLE_BINARY_RHS.
(vect_is_simple_reduction_1): Call vect_is_slp_reduction only if
check_reduction is true.



2011-06-13 Jakub Jelinek  
 Ira Rosen  

PR tree-optimization/49352
* gcc.dg/vect/pr49352.c: New test.
Index: testsuite/gcc.dg/vect/pr49352.c
===
--- testsuite/gcc.dg/vect/pr49352.c (revision 0)
+++ testsuite/gcc.dg/vect/pr49352.c (revision 0)
@@ -0,0 +1,14 @@
+/* PR tree-optimization/49352 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -fcompare-debug" } */
+
+int
+foo (int *x, int *y, int n)
+{
+  int i, j;
+  int dot = 0;
+  for (i = 0; i < n; i++)
+for (j = 0; j < 2; j++)
+  dot += *(x++) * *(y++);
+  return dot;
+}
Index: tree-vect-loop.c
===
--- tree-vect-loop.c(revision 174981)
+++ tree-vect-loop.c(working copy)
@@ -1710,12 +1710,12 @@ vect_is_slp_reduction (loop_vec_info loop_info, gi
   struct loop *loop = (gimple_bb (phi))->loop_father;
   struct loop *vect_loop = LOOP_VINFO_LOOP (loop_info);
   enum tree_code code;
-  gimple current_stmt = NULL, use_stmt = NULL, first, next_stmt;
+  gimple current_stmt = NULL, loop_use_stmt = NULL, first, next_stmt;
   stmt_vec_info use_stmt_info, current_stmt_info;
   tree lhs;
   imm_use_iterator imm_iter;
   use_operand_p use_p;
-  int nloop_uses, size = 0, nuses;
+  int nloop_uses, size = 0;
   bool found = false;
 
   if (loop != vect_loop)
@@ -1726,66 +1726,68 @@ vect_is_slp_reduction (loop_vec_info loop_info, gi
   while (1)
 {
   nloop_uses = 0;
-  nuses = 0;
   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
 {
-  use_stmt = USE_STMT (use_p);
-  nuses++;
+ gimple use_stmt = USE_STMT (use_p);
   if (is_gimple_debug (use_stmt))
 continue;
 
+ use_stmt = USE_STMT (use_p);
+
   /* Check if we got back to the reduction phi.  */
-  if (gimple_code (use_stmt) == GIMPLE_PHI
-  && use_stmt == phi)
+ if (use_stmt == phi)
 {
+ loop_use_stmt = use_stmt;
   found = true;
   break;
 }
 
   if (flow_bb_inside_loop_p (loop, gimple_bb (use_stmt))
   && vinfo_for_stmt (use_stmt)
-  && !is_pattern_stmt_p (vinfo_for_stmt (use_stmt))
-  && use_stmt != first_stmt)
-nloop_uses++;
+ && !STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (use_stmt)))
+   {
+ loop_use_stmt = use_stmt;
+ nloop_uses++;
+   }
 
   if (nloop_uses > 1)
 return false;
 }
 
-  /* We reached a statement with no uses.  */
-  if (nuses == 0)
-   return false;
-
   if (found)
 break;
 
+  /* We reached a statement with no loop uses.  */
+  if (nloop_uses == 0)
+   return false;
+
   /* This is a loop exit phi, and we haven't reached the reduction phi.  */
-  if (gimple_code (use_stmt) == GIMPLE_PHI)
+  if (gimple_code (loop_use_stmt) == GIMPLE_PHI)
 return false;
 
-  if (!is_gimple_assign (use_stmt)
-  || code != gimple_assign_rhs_code (use_stmt)
-  || !flow_bb_inside_loop_p (loop, gimple_bb (use_stmt)))
+  if (!is_gimple_assign (loop_use_stmt)
+ || code != gimple_assign_rhs_code (loop_use_stmt)
+ || !flow_bb_inside_loop_p (loop, gimple_bb (loop_use_stmt)))
 return false;
 
   /* Insert USE_STMT into reduction chain.  */
-  use_stmt_info = vinfo_for_stmt (use_stmt);
+  use_stmt_info = vinfo_for_stmt (loop_use_stmt);
   if (current_stmt)
 {
   current_stmt_info = vinfo_for_stmt (current_stmt);
-  GROUP_NEXT_ELEMENT (current_stmt_info) = use_stmt;
+ GROUP_NEXT_ELEMENT (current_stmt_info) = loop_use_stmt;
   GROUP_FIRST_ELEMENT (use_stmt_info)
 = GROUP_FIRST_ELEMENT (current_stmt_info);
 }
   else
-  GROUP_FIRST_ELEMENT (use_stmt_info) = use_stmt;
+   GROUP_FIRST_ELEMENT (use_stmt_info) = loop_use_stmt;
 
-  lhs = gimple_assign_lhs (use_stmt);
-  current_stmt = use_stmt;
+  lhs = gimple_assign_lhs (loop_use_stmt);
+  curre