Re: Bogus gcc.c-torture/execute/20071018-1.c testcase?
On Sat, 31 Dec 2011, Mark Kettenis wrote:
> Execution of the test randomly fails for me on OpenBSD/amd64.  Looking
> at the code, it seems it is doing an out-of-bounds array access.  For
> reference I've copied the code of the testcase below.  As you can see
> there's a foo(0) call in main().  Therefore
>
>   struct foo **upper = &as->x[rank * 8 - 1];
>
> becomes
>
>   struct foo **upper = &as->x[-1];
>
> so upper points to an address before the malloc()'ed memory.  Then
> when the code does
>
>   *upper = 0;
>
> this generates a SIGSEGV, if the malloc()'ed memory happens to lie
> right at the start of a page.  I suppose that may never happen on some
> platforms (Linux?) since many malloc() implementations will use the
> start of a page for their own bookkeeping.
>
> I don't really understand what the testcase is testing.  Richard, can
> you perhaps shed some light on this?

The testcase was added when trying to fix PR32921; I do not remember
whether it was relevant that an out-of-bounds array access occurred
(but I suppose it was).  It was a miscompile, so a runtime testcase is
required.  I suppose changing the testcase to do &as->x[rank * 8 - 8]
and pass in rank == 1 would make it fail similarly on rev. 129484
(or better, make it &as->x[rank * 2 - 2]).

Richard.

> Thanks,
>
> Mark
>
> ---
>
> extern void abort(void);
>
> struct foo {
>   int rank;
>   char *name;
> };
>
> struct mem {
>   struct foo *x[4];
> };
>
> void __attribute__((noinline)) bar(struct foo **f)
> {
>   *f = __builtin_malloc(sizeof(struct foo));
> }
>
> struct foo * foo(int rank)
> {
>   void *x = __builtin_malloc(sizeof(struct mem));
>   struct mem *as = x;
>   struct foo **upper = &as->x[rank * 8 - 1];
>   *upper = 0;
>   bar(upper);
>   return *upper;
> }
>
> int main()
> {
>   if (foo(0) == 0)
>     abort ();
>   return 0;
> }

--
Richard Guenther
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
Re: Lingering tbaa in anti_dependence?
On Fri, Dec 30, 2011 at 6:53 PM, Jakub Jelinek wrote:
> On Thu, Dec 29, 2011 at 04:24:31PM +, Richard Sandiford wrote:
>> AIUI, the outcome of PR38964 was that we can't use TBAA for testing an
>> anti_dependence between a load X and store Y because Y might be defining
>> a new object in the same space as the object that was being read by X.
>> But it looks like we still use component-based disambiguation
>> (nonoverlapping_component_refs_p) in this case.  Is it true that
>> that's also a problem?  E.g. for:
>>
>>   struct s { int f; float g; };
>>   struct t { int header; struct s s; };
>>
>>   float foo (struct t *newt, struct s *olds, int x, int y)
>>   {
>>     float ret = olds[x * y].g;
>>     newt->header = 0;
>>     newt->s.f = 1;
>>     newt->s.g = 1.0;
>>     return ret;
>>   }
>>
>> we can (and on ARM Cortex A8, do) move the store to newt->s.f above
>> the load from olds[...].g.  If we view the assignment to newt
>> as defining a new object in the same space as the now-defunct olds,
>> and if x * y happens to be zero, then the accesses might well be
>> to the same address.
>
> You would need a placement new in between the read from olds[x * y].g
> and the newt->* stores; without that it is definitely valid to move it
> ahead of the store.
> IMHO we should keep placement new in the IL in some form, and decide how
> conservative we need to be based on whether the current function
> or current loop etc. contains (or could contain) a placement new or not.

I disagree.  Any representation of placement new would pessimize code
more than it is pessimized now.  You also will lose the advantage that
the current implementation of our TBAA semantics integrates nicely with
the type-punning-through-union stuff, which you'd otherwise have to make
explicit (thus, lower to the proposed placement-new style), too.  Also
consider placement new in a function (it would be pure/const); also
consider that placement new is simply an inline function returning its
argument.

> E.g.
> if a loop doesn't contain placement new anywhere, we could be less
> conservative in optimizing that loop, do more vectorization on it etc.

For loops we can just do more proper dependence analysis.

Richard.

> Jakub
RFC: Handle conditional expression in sccvn/fre/pre
Hi,

Since SCCVN operates on the SSA graph instead of the control flow graph
for the sake of efficiency, it does not handle or value-number the
conditional expression of a GIMPLE_COND statement.  As a result, FRE/PRE
does not simplify conditional expressions, as reported in bug 30997.

Since it would be complicated and difficult to process conditional
expressions in the current SCCVN algorithm, how about the following
method?

STEP 1: Before starting FRE/PRE, factor out the conditional expression,
changing code like:

  if (cond_expr)
    goto label_a
  else
    goto label_b

into:

  tmp = cond_expr
  if (tmp == 1)
    goto label_a
  else
    goto label_b

STEP 2: Let SCCVN/FRE/PRE do its job of value numbering cond_expr and
eliminating redundancy.

STEP 3: After FRE/PRE, for those "tmp = cond_expr" not used in any
redundancy elimination, forward the expression back into the
corresponding GIMPLE_COND statement, just as tree-ssa-forwprop.c does.

In this way, the conditional expression is handled like any other
expression and no redundant assignment is generated.  Most importantly,
this does not affect the current implementation of SCCVN/FRE/PRE.

The only problem is that the method cannot handle the reversed form of a
conditional expression.  For example:

  x = a > 2;
  if (a <= 2)
    goto label_a
  else
    goto label_b

could be optimized into:

  x = a > 2
  if (x == 0)
    goto label_a
  else
    goto label_b

I have worked out a draft patch to do this and would like to hear your
comments on it.  Thanks very much.

--
Best Regards.
Re: RFC: Handle conditional expression in sccvn/fre/pre
On Mon, Jan 2, 2012 at 12:37 PM, Amker.Cheng wrote:
> Hi,
> Since SCCVN operates on the SSA graph instead of the control flow graph
> for the sake of efficiency, it does not handle or value-number the
> conditional expression of a GIMPLE_COND statement.  As a result,
> FRE/PRE does not simplify conditional expressions, as reported in bug
> 30997.
> [...]
> I have worked out a draft patch to do this and would like to hear your
> comments on it.

I've previously worked on changing GIMPLE_COND to no longer embed the
comparison but carry a predicate SSA_NAME only (this is effectively what
you do as pre-processing before SCCVN).  It had some non-trivial fallout
(for example PRE gets quite confused and ends up separating conditionals
and jumps too far ...) so I didn't finish it.
A subset of all cases can be caught by simply looking up the N-ary at
eliminate () time and re-writing the GIMPLE_COND to use the predicate -
which might not actually be beneficial (but forwprop will hopefully undo
the non-beneficial cases).

In the end I'd rather go the way of changing the GIMPLE IL to not embed
the comparison in the GIMPLE_COND - that reduces the number of redundant
ways we can express the same thing.

Richard.

> Thanks very much.
> --
> Best Regards.
Re: A case exposing code sink issue
On Thu, Dec 29, 2011 at 9:50 AM, Jiangning Liu wrote:
>> -Original Message-
>> From: Jiangning Liu
>> Sent: Wednesday, December 28, 2011 5:38 PM
>> To: Jiangning Liu; 'Richard Guenther'
>> Cc: Michael Matz; gcc@gcc.gnu.org
>> Subject: RE: A case exposing code sink issue
>>
>> > -Original Message-
>> > From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
>> > Jiangning Liu
>> > Sent: Tuesday, December 27, 2011 5:10 PM
>> > To: 'Richard Guenther'
>> > Cc: Michael Matz; gcc@gcc.gnu.org
>> > Subject: RE: A case exposing code sink issue
>> >
>> > > The job to do this is final value replacement, not sinking (we do not
>> > > sink non-invariant expressions - you'd have to translate them through
>> > > the loop-closed SSA exit PHI node, certainly doable, patches
>> > > welcome ;)).
>> >
>> > Richard,
>> >
>> > In final value replacement, the expression "&a + D." can be figured
>> > out, while "&a[i_xxx]" fails to be CHRECed, so I'm wondering if we
>> > should lower &a[i_xxx] to "&a + unitsize(a) * i_xxx" first?  It seems
>> > GCC intends to keep &a[i_xxx] until the cfgexpand pass.  Or do we have
>> > to directly modify the CHREC algorithm to get it calculated?
>> >
>> > Appreciate your kindly help in advance!
>>
>> Richard,
>>
>> Now I have a patch working for the case of step "i++", by directly
>> modifying the scalar evolution algorithm.  The following code would be
>> generated after SCCP:
>>
>>   # i_13 = PHI
>>   a_p.0_4 = &a[i_13];
>>   MEM[(int *)&a][i_13] = 100;
>>   i_6 = i_13 + 1;
>>   if (i_6 <= 999)
>>     goto ;
>>   else
>>     goto ;
>>
>>   :
>>   a_p_lsm.5_11 = &MEM[(void *)&a + 3996B];
>>   a_p = a_p_lsm.5_11;
>>   goto ;
>>
>> It looks good, but I still have a problem when the case has step "i+=k".
>>
>> For this case the value of variable i exiting the loop isn't invariant,
>> so the algorithm below in scalar evolution doesn't work on it:
>>
>> compute_overall_effect_of_inner_loop()
>> {
>> ...
>>   tree nb_iter = number_of_latch_executions (inner_loop);
>>
>>   if (nb_iter == chrec_dont_know)
>>     return chrec_dont_know;
>>   else
>>     {
>>       tree res;
>>
>>       /* evolution_fn is the evolution function in LOOP.  Get
>>          its value in the nb_iter-th iteration.  */
>>       res = chrec_apply (inner_loop->num, evolution_fn, nb_iter);
>>
>>       if (chrec_contains_symbols_defined_in_loop (res, loop->num))
>>         res = instantiate_parameters (loop, res);
>>
>>       /* Continue the computation until ending on a parent of LOOP.  */
>>       return compute_overall_effect_of_inner_loop (loop, res);
>>     }
>> }
>>
>> In theory, we can still have the transformation like below even if the
>> step is "i+=k":
>>
>>   # i_13 = PHI
>>   i_14 = i_13;
>>   a_p.0_4 = &a[i_13];
>>   MEM[(int *)&a][i_13] = 100;
>>   i_6 = i_13 + k_2(D);  // i+=k
>>   if (i_6 <= 999)
>>     goto ;
>>   else
>>     goto ;
>>
>>   :
>>   a_p_lsm.5_11 = &a[i_14];
>>   a_p = a_p_lsm.5_11;
>>   goto ;
>>
>> But I realize this is not a loop-closed SSA form at all, because i_14
>> is being used out of the loop.  Where could we extend the live range of
>> variable i in the GCC infrastructure and finally solve this problem?
>
> It seems many people are still enjoying the upcoming 2012 New Year. :-)
>
> Following my story, I find the following code in tree-ssa-copy.c:
>
>   /* Avoid copy propagation from an inner into an outer loop.
>      Otherwise, this may move loop variant variables outside of
>      their loops and prevent coalescing opportunities.  If the
>      value was loop invariant, it will be hoisted by LICM and
>      exposed for copy propagation.
>      ??? The value will be always loop invariant.
>      In loop-closed SSA form do not copy-propagate through
>      PHI nodes in blocks with a loop exit edge predecessor.
>      */
>   if (current_loops
>       && TREE_CODE (arg_value) == SSA_NAME
>       && (loop_depth_of_name (arg_value) > loop_depth_of_name (lhs)
>           || (loops_state_satisfies_p (LOOP_CLOSED_SSA)
>               && loop_exit_edge_p (e->src->loop_father, e))))
>     {
>       phi_val.value = lhs;
>       break;
>     }
>
> Here http://gcc.gnu.org/ml/gcc-patches/2006-12/msg00066.html, Dan said
>
> "The original check was not because of coalescing, but because we would
> copy prop in-loop variables outside the loop, causing *more*
> invariantness in nested loops."
>
> Can anybody give me a concrete example to explain this statement?

I can neither make sense of the code nor of Dan's comment.  But of
course the part about loop-closed SSA is still true.  I _suppose_ that
the situation is like

  tem = stuff;
  for ()
    x = tem;
  foo = x;

and Dan expects LIM to hoist x = tem (instead of "sinking" it by
copyprop).  Then we'd coalesce tem and x (but it would be still live
across
Re: RFC: Handle conditional expression in sccvn/fre/pre
Thanks Richard,

On Mon, Jan 2, 2012 at 8:33 PM, Richard Guenther wrote:
>
> I've previously worked on changing GIMPLE_COND to no longer embed
> the comparison but carry a predicate SSA_NAME only (this is effectively
> what you do as pre-processing before SCCVN).  It had some non-trivial
> fallout (for example PRE gets quite confused and ends up separating
> conditionals and jumps too far ...) so I didn't finish it.

Here, by changing GIMPLE_COND to no longer embed the comparison, do you
mean this only in the fre/pre passes or in general?  If only in fre/pre,
when and how are the changed GIMPLE_CONDs turned back into normal ones?
If in general, won't this affect passes working on GIMPLE_COND, like
tree-ssa-forwprop.c?

>
> A subset of all cases can be caught by simply looking up the
> N-ary at eliminate () time and re-writing the GIMPLE_COND to use
> the predicate - which might not actually be beneficial (but forwprop
> will hopefully undo the non-beneficial cases).
>
> In the end I'd rather go the way of changing the GIMPLE IL to not
> embed the comparison in the GIMPLE_COND - that reduces the number of
> redundant ways we can express the same thing.

Will you try to handle the reversed-comparison case mentioned in my
previous message?  I guess this needs work in both sccvn and fre/pre.
It would be great to hear your thoughts on this.

Thanks

--
Best Regards.
-fstack-protector, __stack_chk_fail_local and TARGET_LIBC_PROVIDES_SSP
Hi,

I ran into an issue with -fstack-protector on FreeBSD/i386.  GCC
generates calls to __stack_chk_fail_local that the linker complains are
undefined.  The following test case shows it:

  % cat test.c
  int main( void )
  {
          return( 0 );
  }
  % gcc46 -o test test.c -fstack-protector-all -fPIE
  /var/tmp//ccjYQxKu.o: In function `main':
  test.c:(.text+0x37): undefined reference to `__stack_chk_fail_local'
  /usr/local/bin/ld: test: hidden symbol `__stack_chk_fail_local' isn't defined
  /usr/local/bin/ld: final link failed: Bad value
  collect2: ld returned 1 exit status

I managed to fix the problem with the following one line change:

  --- gcc/gcc.c.orig
  +++ gcc/gcc.c
  @@ -601,7 +601,7 @@
   #ifndef LINK_SSP_SPEC
   #ifdef TARGET_LIBC_PROVIDES_SSP
  -#define LINK_SSP_SPEC "%{fstack-protector:}"
  +#define LINK_SSP_SPEC "%{fstack-protector|fstack-protector-all:-lssp_nonshared}"
   #else
   #define LINK_SSP_SPEC "%{fstack-protector|fstack-protector-all:-lssp_nonshared -lssp}"
   #endif

The configure script detects __stack_chk_fail in FreeBSD libc and
defines TARGET_LIBC_PROVIDES_SSP, but __stack_chk_fail_local should be
statically linked in, so (a dynamic) libc should not provide it.

My question is now whether the problem is with FreeBSD's SSP
implementation (where exactly does GCC expect __stack_chk_fail_local to
be defined?) or with GCC (should GCC just always link in ssp_nonshared
as in the patch above?).
Re: RFC: Handle conditional expression in sccvn/fre/pre
On Mon, Jan 2, 2012 at 2:11 PM, Amker.Cheng wrote:
> Thanks Richard,
>
> On Mon, Jan 2, 2012 at 8:33 PM, Richard Guenther wrote:
>>
>> I've previously worked on changing GIMPLE_COND to no longer embed
>> the comparison but carry a predicate SSA_NAME only (this is effectively
>> what you do as pre-processing before SCCVN).  It had some non-trivial
>> fallout (for example PRE gets quite confused and ends up separating
>> conditionals and jumps too far ...) so I didn't finish it.
>
> Here, by changing GIMPLE_COND to no longer embed the comparison, do you
> mean this only in the fre/pre passes or in general?  If only in
> fre/pre, when and how are the changed GIMPLE_CONDs turned back into
> normal ones?  If in general, won't this affect passes working on
> GIMPLE_COND, like tree-ssa-forwprop.c?

Everywhere.  I posted a patch early last year and changed COND_EXPRs on
the RHS of an assignment later.

>> A subset of all cases can be caught by simply looking up the
>> N-ary at eliminate () time and re-writing the GIMPLE_COND to use
>> the predicate - which might not actually be beneficial (but forwprop
>> will hopefully undo the non-beneficial cases).
>>
>> In the end I'd rather go the way of changing the GIMPLE IL to not
>> embed the comparison in the GIMPLE_COND - that reduces the number of
>> redundant ways we can express the same thing.
>
> Will you try to handle the reversed-comparison case mentioned in my
> previous message?  I guess this needs work in both sccvn and fre/pre.
> It would be great to hear your thoughts on this.

Well, with

Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c  (revision 182784)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -4335,16 +4335,23 @@ eliminate (void)
              available value-numbers.  */
           else if (gimple_code (stmt) == GIMPLE_COND)
             {
-              tree op0 = gimple_cond_lhs (stmt);
-              tree op1 = gimple_cond_rhs (stmt);
+              tree op[2];
               tree result;
+              vn_nary_op_t nary;

-              if (TREE_CODE (op0) == SSA_NAME)
-                op0 = VN_INFO (op0)->valnum;
-              if (TREE_CODE (op1) == SSA_NAME)
-                op1 = VN_INFO (op1)->valnum;
+              op[0] = gimple_cond_lhs (stmt);
+              op[1] = gimple_cond_rhs (stmt);
+              if (TREE_CODE (op[0]) == SSA_NAME)
+                op[0] = VN_INFO (op[0])->valnum;
+              if (TREE_CODE (op[1]) == SSA_NAME)
+                op[1] = VN_INFO (op[1])->valnum;
               result = fold_binary (gimple_cond_code (stmt), boolean_type_node,
-                                    op0, op1);
+                                    op[0], op[1]);
+              if (!result)
+                result = vn_nary_op_lookup_pieces (2, gimple_cond_code (stmt),
+                                                   boolean_type_node,
+                                                   op, &nary);
+
               if (result && TREE_CODE (result) == INTEGER_CST)
                 {
                   if (integer_zerop (result))
@@ -4354,6 +4361,13 @@ eliminate (void)
                   update_stmt (stmt);
                   todo = TODO_cleanup_cfg;
                 }
+              else if (result && TREE_CODE (result) == SSA_NAME)
+                {
+                  gimple_cond_set_code (stmt, NE_EXPR);
+                  gimple_cond_set_lhs (stmt, result);
+                  gimple_cond_set_rhs (stmt, boolean_false_node);
+                  update_stmt (stmt);
+                }
             }
           /* Visit indirect calls and turn them into direct calls if
              possible.  */

you get the CSE (too-simple patch; you need to check leaders properly).
You can then add similar lookups for an inverted conditional.

Richard.

> Thanks
>
> --
> Best Regards.
Re: RFC: Handle conditional expression in sccvn/fre/pre
On Mon, Jan 2, 2012 at 9:37 PM, Richard Guenther wrote:
> Well, with
>
> Index: gcc/tree-ssa-pre.c
> ===================================================================
> --- gcc/tree-ssa-pre.c  (revision 182784)
> +++ gcc/tree-ssa-pre.c  (working copy)
> [...]
>
> you get the CSE (too-simple patch; you need to check leaders properly).
> You can then add similar lookups for an inverted conditional.

Thanks for your explanation.  One shortcoming of this method is that it
cannot find/take the cond_expr (and the implicitly defined variable) as
the leader in PRE.  I guess this is why you said it can handle a subset
of all cases in your previous message?

On the other hand, I like this method, given its simplicity especially. :)

--
Best Regards.
Re: fixed_scalar_and_varying_struct_p and varies_p
Thanks for both replies.

Richard Guenther writes:
> On Thu, Dec 29, 2011 at 8:48 PM, Eric Botcazou wrote:
>>> fixed_scalar_and_varying_struct_p passes an _address_ rather than a MEM.
>>> So in these cases fixed_scalar_and_varying_struct_p effectively becomes
>>> a no-op on targets that don't allow MEMs in addresses and takes on
>>> suspicious semantics for those that do.  In the former case, every
>>> address is treated as "unvarying" and f_s_a_v_s_p always returns null.
>>> In the latter case, things like REG addresses are (wrongly) treated as
>>> unvarying while a MEM address might correctly be treated as varying,
>>> leading to false positives.
>>>
>>> It looks like this goes back to when fixed_scalar_and_varying_struct_p
>>> was added in r24759 (1999).
>>
>> Does this mean that MEM_IN_STRUCT_P and MEM_SCALAR_P have also been
>> effectively disabled since then?

Some important callers (cse.c and sched-deps.c) do use the proper
varies_p routine, so it probably isn't quite that extreme.  But...

>>> AIUI, the true_dependence varies_p parameter exists for the benefit
>>> of CSE, so that it can use its local cse_rtx_varies_p function.
>>> All other callers should be using rtx_varies_p instead.  Question is,
>>> should I make that change, or is it time to get rid of
>>> fixed_scalar_and_varying_struct_p instead?
>>
>> I'd vote for the latter (and for eliminating MEM_IN_STRUCT_P and
>> MEM_SCALAR_P in the process, if the answer to the above question is
>> positive), there is no point in resurrecting this now IMO.
>
> I agree.  The tree level routines should be able to figure out most,
> if not all, cases on their own via rtx_refs_may_alias_p (similar to
> the nonoverlapping_component_refs case, which we could simply delete
> as well).

...that's 2 votes for and none so far against. :-)

I compiled the cc1 .ii files on x86_64-linux-gnu with and without
fixed_scalar_and_varying_struct_p.
There were 19 changes in total, all of them cases where sched2 was able
to reorder two memory accesses because of f_s_a_v_s_p.  I've attached
the diff below.

A good example is:

  if (bit_obstack)
    {
      element = bit_obstack->elements;

      if (element)
        /* Use up the inner list first before looking at the next
           element of the outer list.  */
        if (element->next)
          {
            bit_obstack->elements = element->next;
            bit_obstack->elements->prev = element->prev;
          }
        else
          /* Inner list was just a singleton.  */
          bit_obstack->elements = element->prev;
      else
        element = XOBNEW (&bit_obstack->obstack, bitmap_element);
    }
  else
    {
      element = bitmap_ggc_free;
      if (element)
        /* Use up the inner list first before looking at the next
           element of the outer list.  */
        if (element->next)
          {
            bitmap_ggc_free = element->next;
            bitmap_ggc_free->prev = element->prev;
          }
        else
          /* Inner list was just a singleton.  */
          bitmap_ggc_free = element->prev;
      else
        element = ggc_alloc_bitmap_element_def ();
    }

from bitmap.c, specifically:

  bitmap_ggc_free = element->next;
  bitmap_ggc_free->prev = element->prev;

Without f_s_a_v_s_p, sched2 couldn't tell that element->prev didn't
alias bitmap_ggc_free.  And this in turn is because cfgcleanup
considered trying to merge this block with:

  bit_obstack->elements = element->next;
  bit_obstack->elements->prev = element->prev;

It called merge_memattrs for each pair of instructions that it was
thinking of merging, and because the element->next and element->prev
MEMs were based on different SSA names, we lost the MEM_EXPR completely.

As it happens, we decided not to merge the blocks after all.  So an
obvious first observation is that query functions like
flow_find_cross_jump and flow_find_head_matching_sequence shouldn't
change the rtl.  We should only do that once we've decided which
instructions we're actually going to merge.

Of course, that's not a trivial change.  It's easy to make
try_head_merge_bb call merge_memattrs during merging, but less easy for
try_crossjump_to_edge and cond_exec_process_if_block.  (Note that the
latter, like try_head_merge_bb, can end up merging fewer instructions
than flow_find_* saw.)

But does the choice of SSA name actually count for anything this late
on?  Should we consider MEM_EXPRs node_X->prev and node_Y->prev to be
"similar enough", if node_X and node_Y have equal types?

I've attached a patch to remove fixed_scalar_and_varying_struct_p just
in case it's OK.  Tested on mips64-linux-gnu.

Also, as Eric says, this is really the only middle-end use of MEM_SCALAR
and MEM_IN_STRUCT_P.  It looks like the only other use is in
config/m32c/m32c.c:m32c_immd_dbl_mov.  TBH, I don't really understand
what that function is trying to test, so I can't tell whether it should
be using MEM_EXPR instead.  I've attached a patch
Re: RFC: Handle conditional expression in sccvn/fre/pre
On Mon, Jan 2, 2012 at 3:09 PM, Amker.Cheng wrote:
> On Mon, Jan 2, 2012 at 9:37 PM, Richard Guenther wrote:
>
>> Well, with
>>
>> Index: gcc/tree-ssa-pre.c
>> ===================================================================
>> --- gcc/tree-ssa-pre.c  (revision 182784)
>> +++ gcc/tree-ssa-pre.c  (working copy)
>> [...]
>>
>> you get the CSE (too-simple patch; you need to check leaders properly).
>> You can then add similar lookups for an inverted conditional.
>
> Thanks for your explanation.  One shortcoming of this method is that it
> cannot find/take the cond_expr (and the implicitly defined variable) as
> the leader in PRE.
> I guess this is why you said it can handle a subset of all cases in
> your previous message?

Yes.  It won't handle

  if (x > 1)
    ...
  tem = x > 1;

or

  if (x > 1)
    ...
  if (x > 1)

though maybe we could teach PRE to do the insertion by properly putting
x > 1 into EXP_GEN in compute_avail (but not into AVAIL_OUT).  Not sure
about this though.  Currently we don't do anything to GIMPLE_COND
operands (which seems odd anyway; we should at least add the operands to
EXP_GEN).

> on the other hand, I like this method, given its simplicity especially. :)

Yeah.

> --
> Best Regards.
Re: fixed_scalar_and_varying_struct_p and varies_p
On Mon, Jan 2, 2012 at 3:42 PM, Richard Sandiford wrote:
> Thanks for both replies.
> [...]
> It called merge_memattrs for each pair of instructions that it was
> thinking of merging, and because the element->next and element->prev
> MEMs were based on different SSA names, we lost the MEM_EXPR completely.
> [...]
> But does the choice of SSA name actually count for anything this
> late on?  Should we consider MEM_EXPRs node_X->prev and node_Y->prev
> to be "similar enough", if node_X and node_Y have equal types?

It's a conservative way of merging alias info.  A more precise variant
would union the points-to information of both bases (and stick it to a
new SSA name); conveniently, pt_solution_ior_into exists.  And no, we
can't just pick either
Re: LTO multiple definition failures
On 01/02/2012 12:22 AM, Andi Kleen wrote:
> Sandra Loosemore writes:
>> I'm still finding my way around LTO; can anyone who's more familiar
>> with this help narrow down where to look for the cause of this?  I
>> don't even know if this is a compiler or ld bug at this point.  I'm
>
> I would look into the interaction between the LTO plugin and your ld
> (and also try gold if you can)
>
> Generally there are still various issues in these areas which need
> workarounds in the LTOed programs; for some things (like ld -r and
> some ar) you also need the latest version of HJ Lu's binutils which
> implement http://gcc.gnu.org/ml/gcc/2010-12/msg00229.html
>
> A lot of older binutils lds also tended to mishandle mixed LTOed ar
> archives.

For avoidance of doubt, I was using mainline HEAD for both gcc and
binutils for my powerpc-none-eabi experiments last week.  The port for
our new target is based on GCC 4.6 and a binutils branch from a few
months ago, but it looked to me like the ld/gcc interaction was
basically the same -- the link error being triggered by a difference in
the startup code on the two targets, instead.

Anyway, the problem here isn't that I particularly care about coming up
with some workaround to make LTO work, but rather that tests from the
gcc testsuite are failing on this target because of what looks like
buggy LTO behavior instead of bugs in the target support, and I wanted
to be sure this was being tracked somewhere.  I didn't see a relevant
issue in either the gcc or binutils bugzillas, but if it's a known
consequence of the ld -r problem, I'll shut up and go away again. ;-)

-Sandra
Re: fixed_scalar_and_varying_struct_p and varies_p
> I'd say open a missed optimization bug with the testcase and go ahead
> with both patches.  Let's see if Eric has some comments first though.

None, but the m32c maintainer may have some.  DJ, do you happen to know
the rationale for the use of the MEM_SCALAR_P and MEM_IN_STRUCT_P flags
in m32c_immd_dbl_mov?  What condition do these tests try to model
exactly?

--
Eric Botcazou
Re: LTO multiple definition failures
> Anyway, the problem here isn't that I particularly care about coming up
> with some workaround to make LTO work, but rather that tests from the
> gcc testsuite are failing on this target because of what looks like
> buggy LTO behavior instead of bugs in the target support, and I wanted
> to be sure this was being tracked somewhere.  I didn't see a relevant
> issue in either the gcc or binutils bugzillas, but if it's a known
> consequence of the ld -r problem, I'll shut up and go away again. ;-)

AFAIK none of the test suite tests the ld -r problem, at least not on
x86-linux.  So it may be something else and still worth tracking down.

-Andi

--
a...@linux.intel.com -- Speaking for myself only.