Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-30 Thread Tom de Vries
Hi,

On 07/28/2011 08:20 PM, Tom de Vries wrote:
> On 07/28/2011 06:25 PM, Richard Guenther wrote:
>> On Thu, 28 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/28/2011 12:22 PM, Richard Guenther wrote:
 On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 05:27 PM, Richard Guenther wrote:
>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/27/2011 02:12 PM, Richard Guenther wrote:
 On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 01:50 PM, Tom de Vries wrote:
>> Hi Richard,
>>
>> I have a patch set for bug 43513 - The stack pointer is adjusted 
>> twice.
>>
>> 01_pr43513.3.patch
>> 02_pr43513.3.test.patch
>> 03_pr43513.3.mudflap.patch
>>
>> The patch set has been bootstrapped and reg-tested on x86_64.
>>
>> I will send out the patches individually.
>>
>
> The patch replaces a vla __builtin_alloca that has a constant 
> argument with an
> array declaration.
>
> OK for trunk?

 I don't think it is safe to try to get at the VLA type the way you do.
>>>
>>> I don't understand in what way it's not safe. Do you mean I don't 
>>> manage to find
>>> the type always, or that I find the wrong type, or something else?
>>
>> I think you might get the wrong type,
>
> Ok, I'll review that code one more time.
>
>> you also do not transform code
>> like
>>
>>   int *p = alloca(4);
>>   *p = 3;
>>
>> as there is no array type involved here.
>>
>
> I was trying to stay away from non-vla allocas.  A source-declared alloca has
> function lifetime, so we could have a single alloca in a loop, called 10 times,
> with all 10 instances live at the same time. This patch does not detect such
> cases, and thus stays away from non-vla allocas. A vla decl does not have such
> problems; its lifetime ends when it goes out of scope.

 Yes indeed - that probably would require more detailed analysis.

 In fact I would simply do sth like

   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
   n_elem = size * 8 / BITS_PER_UNIT;
   array_type = build_array_type_nelts (elem_type, n_elem);
   var = create_tmp_var (array_type, NULL);
   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));

>>>
>>> I tried this code on the example, and it works, but the newly declared type
>>> has an 8-bit alignment, while the vla base type has a 32-bit alignment.  This
>>> makes the memory access in the example potentially unaligned, which prohibits
>>> an ivopts optimization, so the resulting text size is 68 instead of the 64
>>> achieved with my current patch.
>>
>> Ok, so then set DECL_ALIGN of the variable to something reasonable
>> like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
>> alignment that the target's alloca function would guarantee.
>>
>
> I tried that, but that doesn't help. It's the alignment of the type that
> matters, not of the decl.

 It shouldn't.  All accesses are performed with the original types and
 alignment comes from that (plus the underlying decl).

>>>
>>> I managed to get it all working by using build_aligned_type rather than
>>> DECL_ALIGN.
>>
>> That's really odd, DECL_ALIGN should just work - nothing refers to the
>> type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to 
>> 1 maybe?
>>
> 
> This doesn't work either.
> 
>   /* Declare array.  */
>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>   n_elem = size * 8 / BITS_PER_UNIT;
>   align = MIN (size * 8, GET_MODE_PRECISION (word_mode));
>   array_type = build_array_type_nelts (elem_type, n_elem);
>   var = create_tmp_var (array_type, NULL);
>   DECL_ALIGN (var) = align;
>   DECL_USER_ALIGN (var) = 1;
> 
> Maybe this clarifies it:
> 
> Breakpoint 1, may_be_unaligned_p (ref=0xf7d9d410, step=0xf7d3d578) at
> /home/vries/local/google/src/gcc-mainline/gcc/tree-ssa-loop-ivopts.c:1621
> (gdb) call debug_generic_expr (ref)
> MEM[(int[0:D.2579] *)&D.2595][0]
> (gdb) call debug_generic_expr (step)
> 4
> 
> 1627   base = get_inner_reference (ref, &bitsize, &bitpos, &toffset, &mode,
> (gdb) call debug_generic_expr (base)
> D.2595
> 
> 1629   base_type = TREE_TYPE (base);
> (gdb) call debug_generic_expr (base_type)
> [40]
> 
> 1630   base_align = TYPE_ALIGN (base_type);
> (gdb) p base_align
> $1 = 8
> 
> So the alignment is 8 bits, and we return true here:
> 
> (gdb) n
> 1632   if (mode != BLKmode)
> (gdb) n
> 1634       unsigned mode_align = GET_MODE_ALIGNMENT (mode);
> (gdb)
> 1636       if (base_align < mode_align
> (gdb)
> 1639         return true;
> 
> 
> Here we can see tha
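
As an illustration of the check traced in the gdb session above, here is a minimal standalone sketch in C. It is not the GCC source: base_may_be_unaligned and suggested_align_bits are hypothetical helpers, only the first comparison visible at line 1636 is modelled, and the 32-bit mode alignment used in the note below is an assumption for an int access.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the comparison seen in the gdb session:
   an access is treated as possibly unaligned when the alignment known
   for the base object is smaller than the alignment the access mode
   wants.  Both values are in bits.  This is not the GCC code.  */
bool
base_may_be_unaligned (unsigned base_align_bits, unsigned mode_align_bits)
{
  return base_align_bits < mode_align_bits;
}

/* Alignment suggested in the thread for the replacement array:
   MIN (size * 8, word precision), with SIZE_BYTES in bytes and
   WORD_BITS standing in for GET_MODE_PRECISION (word_mode).  */
unsigned
suggested_align_bits (unsigned size_bytes, unsigned word_bits)
{
  unsigned size_bits = size_bytes * 8;
  return size_bits < word_bits ? size_bits : word_bits;
}
```

With the values from the session (a type alignment of 8 bits against an assumed 32-bit mode alignment), base_may_be_unaligned reports true, which is consistent with the "return true" reached at line 1639. For the 40-byte array and an assumed 32-bit word, suggested_align_bits gives 32.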

Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-30 Thread Tom de Vries
On 07/30/2011 10:21 AM, Tom de Vries wrote:
> Hi,
> 
> On 07/28/2011 08:20 PM, Tom de Vries wrote:
>> On 07/28/2011 06:25 PM, Richard Guenther wrote:
>>> On Thu, 28 Jul 2011, Tom de Vries wrote:
>>>
 On 07/28/2011 12:22 PM, Richard Guenther wrote:
> On Wed, 27 Jul 2011, Tom de Vries wrote:
>
>> On 07/27/2011 05:27 PM, Richard Guenther wrote:
>>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>>
 On 07/27/2011 02:12 PM, Richard Guenther wrote:
> On Wed, 27 Jul 2011, Tom de Vries wrote:
>
>> On 07/27/2011 01:50 PM, Tom de Vries wrote:
>>> Hi Richard,
>>>
>>> I have a patch set for bug 43513 - The stack pointer is adjusted 
>>> twice.
>>>
>>> 01_pr43513.3.patch
>>> 02_pr43513.3.test.patch
>>> 03_pr43513.3.mudflap.patch
>>>
>>> The patch set has been bootstrapped and reg-tested on x86_64.
>>>
>>> I will send out the patches individually.
>>>
>>
>> The patch replaces a vla __builtin_alloca that has a constant 
>> argument with an
>> array declaration.
>>
>> OK for trunk?
>
> I don't think it is safe to try to get at the VLA type the way you do.

 I don't understand in what way it's not safe. Do you mean I don't 
 manage to find
 the type always, or that I find the wrong type, or something else?
>>>
>>> I think you might get the wrong type,
>>
>> Ok, I'll review that code one more time.
>>
>>> you also do not transform code
>>> like
>>>
>>>   int *p = alloca(4);
>>>   *p = 3;
>>>
>>> as there is no array type involved here.
>>>
>>
>> I was trying to stay away from non-vla allocas.  A source-declared alloca has
>> function lifetime, so we could have a single alloca in a loop, called 10 times,
>> with all 10 instances live at the same time. This patch does not detect such
>> cases, and thus stays away from non-vla allocas. A vla decl does not have such
>> problems; its lifetime ends when it goes out of scope.
>
> Yes indeed - that probably would require more detailed analysis.
>
> In fact I would simply do sth like
>
>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>   n_elem = size * 8 / BITS_PER_UNIT;
>   array_type = build_array_type_nelts (elem_type, n_elem);
>   var = create_tmp_var (array_type, NULL);
>   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
>

 I tried this code on the example, and it works, but the newly declared type
 has an 8-bit alignment, while the vla base type has a 32-bit alignment.  This
 makes the memory access in the example potentially unaligned, which prohibits
 an ivopts optimization, so the resulting text size is 68 instead of the 64
 achieved with my current patch.
>>>
>>> Ok, so then set DECL_ALIGN of the variable to something reasonable
>>> like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
>>> alignment that the target's alloca function would guarantee.
>>>
>>
>> I tried that, but that doesn't help. It's the alignment of the type that
>> matters, not of the decl.
>
> It shouldn't.  All accesses are performed with the original types and
> alignment comes from that (plus the underlying decl).
>

 I managed to get it all working by using build_aligned_type rather than
 DECL_ALIGN.
>>>
>>> That's really odd, DECL_ALIGN should just work - nothing refers to the
>>> type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to 
>>> 1 maybe?
>>>
>>
>> This doesn't work either.
>>
>>   /* Declare array.  */
>>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>>   n_elem = size * 8 / BITS_PER_UNIT;
>>   align = MIN (size * 8, GET_MODE_PRECISION (word_mode));
>>   array_type = build_array_type_nelts (elem_type, n_elem);
>>   var = create_tmp_var (array_type, NULL);
>>   DECL_ALIGN (var) = align;
>>   DECL_USER_ALIGN (var) = 1;
>>
>> Maybe this clarifies it:
>>
>> Breakpoint 1, may_be_unaligned_p (ref=0xf7d9d410, step=0xf7d3d578) at
>> /home/vries/local/google/src/gcc-mainline/gcc/tree-ssa-loop-ivopts.c:1621
>> (gdb) call debug_generic_expr (ref)
>> MEM[(int[0:D.2579] *)&D.2595][0]
>> (gdb) call debug_generic_expr (step)
>> 4
>>
>> 1627   base = get_inner_reference (ref, &bitsize, &bitpos, &toffset, &mode,
>> (gdb) call debug_generic_expr (base)
>> D.2595
>>
>> 1629   base_type = TREE_TYPE (base);
>> (gdb) call debug_generic_expr (base_type)
>> [40]
>>
>> 1630   base_align = TYPE_ALIGN (base_type);
>> (gdb) p base_align
>> $1 = 8
>>
>> So the align is 8-bits, and we return true here:
>>
>> (gdb) n
>> 1632   if (mode != BLKmode)
>> (gdb) n
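
The lifetime argument quoted in this thread (a source-level alloca lives until the function returns, while a VLA dies at the end of its scope) can be sketched in C as follows. This is only an illustration under assumptions: the function names are made up, and <alloca.h> is a non-standard header that is assumed to be available (as it is with glibc).

```c
#include <alloca.h>

/* Ten separate alloca blocks, one per iteration.  None of them is
   released before the function returns, so all ten pointers are
   still valid in the second loop.  */
int
sum_alloca_blocks (void)
{
  int *blocks[10];
  int total = 0;
  int i;

  for (i = 0; i < 10; i++)
    {
      blocks[i] = alloca (sizeof (int));
      *blocks[i] = i;
    }
  for (i = 0; i < 10; i++)
    total += *blocks[i];
  return total;
}

/* A VLA declared inside the loop body.  Its lifetime ends at the
   closing brace of each iteration, so at most one instance is live
   at a time.  N must be positive.  */
int
sum_vla_blocks (int n)
{
  int total = 0;
  int i;

  for (i = 0; i < 10; i++)
    {
      int vla[n];
      vla[0] = i;
      total += vla[0];
    }
  return total;
}
```

Replacing the alloca in the first function with one fixed array would be wrong, because ten distinct blocks are needed at once. The VLA in the second function has no such problem, which is the case the patch restricts itself to.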

Re: [PATCH PR43513, 2/3] Replace vla with array - Test case.

2011-07-30 Thread Tom de Vries
On 07/27/2011 01:55 PM, Tom de Vries wrote:
> On 07/27/2011 01:54 PM, Tom de Vries wrote:
>> On 07/27/2011 01:50 PM, Tom de Vries wrote:
>>> Hi Richard,
>>>
>>> I have a patch set for bug 43513 - The stack pointer is adjusted twice.
>>>
>>> 01_pr43513.3.patch
>>> 02_pr43513.3.test.patch
>>> 03_pr43513.3.mudflap.patch
>>>
>>> The patch set has been bootstrapped and reg-tested on x86_64.
>>>
>>> I will send out the patches individually.
>>>
> 
> Sorry, with patch this time.
> 
>>
>> This patch adds the testcase from the bug report, modified not to
>> need includes.
>>

Updated test case; the transformation should happen during ccp2.

OK for trunk?

Thanks,
- Tom

2011-07-30  Tom de Vries  

PR middle-end/43513
* gcc.dg/pr43513.c: New test.
Index: gcc/testsuite/gcc.dg/pr43513.c
===
--- gcc/testsuite/gcc.dg/pr43513.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr43513.c	(revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp2" } */
+
+void bar (int *);
+void foo (char *, int);
+
+void
+foo3 ()
+{
+  const int kIterations = 10;
+  int results[kIterations];
+  int i;
+  bar (results);
+  for (i = 0; i < kIterations; i++)
+    foo ("%d ", results[i]);
+}
+
+/* { dg-final { scan-tree-dump-times "alloca" 0 "ccp2"} } */
+/* { dg-final { cleanup-tree-dump "ccp2" } } */
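
For readers wondering why the test case involves a VLA at all: in C, a const int variable is not an integer constant expression, so an array sized by it is formally a variable-length array even though the size is known. A small sketch, with hypothetical function names and assuming a compiler that supports VLAs (they are optional since C11):

```c
#include <stddef.h>

/* The array bound is a const-qualified variable, not a constant
   expression, so in C this declares a VLA.  */
size_t
vla_bytes (void)
{
  const int kIterations = 10;
  int results[kIterations];
  return sizeof results;
}

/* The fixed-size declaration that the VLA is equivalent to once the
   bound is known to be 10.  */
size_t
array_bytes (void)
{
  int results[10];
  return sizeof results;
}
```

Both functions report the same size, which is the equivalence the test relies on when it checks that no alloca call remains in the ccp2 dump.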


Re: [PATCH] unbreak attribute((optimize(...))) on m68k (PR47908)

2011-07-30 Thread Andreas Schwab
Mikael Pettersson  writes:

> 2011-07-28  Mikael Pettersson  
>
>   PR target/47908
>   * config/m68k/m68k.c (m68k_override_options_after_change): New function.
>   Disable instruction scheduling for non-ColdFire targets.
>   (TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): Define.

Ok for all active branches.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH, RFC] PR49749 biased reassociation for accumulator patterns

2011-07-30 Thread Richard Guenther
On Fri, Jul 29, 2011 at 7:11 PM, William J. Schmidt
 wrote:
> Here is the final version of the reassociation patch.  There are two
> differences from the version I published on 7/27.  I removed the
> function call from within the MAX macro per Michael's comment, and I
> changed the propagation of the rank of loop-carried phis to be zero.
> This involved a small change to propagate_rank, and re-casting
> phi_propagation_rank to a predicate function loop_carried_phi.
>
> Performance numbers look good, with some nice gains and no significant
> regressions for CPU2006 on powerpc64-linux.  Bootstrapped and regression
> tested on powerpc64-linux with no regressions.
>
> Ok for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Bill
>
>
> 2011-07-29  Bill Schmidt  
>
>        PR tree-optimization/49749
>        * tree-ssa-reassoc.c (get_rank): New forward declaration.
>        (PHI_LOOP_BIAS): New macro.
>        (phi_rank): New function.
>        (loop_carried_phi): Likewise.
>        (propagate_rank): Likewise.
>        (get_rank): Add calls to phi_rank and propagate_rank.
>
> Index: gcc/tree-ssa-reassoc.c
> ===
> --- gcc/tree-ssa-reassoc.c      (revision 176585)
> +++ gcc/tree-ssa-reassoc.c      (working copy)
> @@ -190,7 +190,115 @@ static long *bb_rank;
>  /* Operand->rank hashtable.  */
>  static struct pointer_map_t *operand_rank;
>
> +/* Forward decls.  */
> +static long get_rank (tree);
>
> +
> +/* Bias amount for loop-carried phis.  We want this to be larger than
> +   the depth of any reassociation tree we can see, but not larger than
> +   the rank difference between two blocks.  */
> +#define PHI_LOOP_BIAS (1 << 15)
> +
> +/* Rank assigned to a phi statement.  If STMT is a loop-carried phi of
> +   an innermost loop, and the phi has only a single use which is inside
> +   the loop, then the rank is the block rank of the loop latch plus an
> +   extra bias for the loop-carried dependence.  This causes expressions
> +   calculated into an accumulator variable to be independent for each
> +   iteration of the loop.  If STMT is some other phi, the rank is the
> +   block rank of its containing block.  */
> +static long
> +phi_rank (gimple stmt)
> +{
> +  basic_block bb = gimple_bb (stmt);
> +  struct loop *father = bb->loop_father;
> +  tree res;
> +  unsigned i;
> +  use_operand_p use;
> +  gimple use_stmt;
> +
> +  /* We only care about real loops (those with a latch).  */
> +  if (!father->latch)
> +    return bb_rank[bb->index];
> +
> +  /* Interesting phis must be in headers of innermost loops.  */
> +  if (bb != father->header
> +      || father->inner)
> +    return bb_rank[bb->index];
> +
> +  /* Ignore virtual SSA_NAMEs.  */
> +  res = gimple_phi_result (stmt);
> +  if (!is_gimple_reg (SSA_NAME_VAR (res)))
> +    return bb_rank[bb->index];
> +
> +  /* The phi definition must have a single use, and that use must be
> +     within the loop.  Otherwise this isn't an accumulator pattern.  */
> +  if (!single_imm_use (res, &use, &use_stmt)
> +      || gimple_bb (use_stmt)->loop_father != father)
> +    return bb_rank[bb->index];
> +
> +  /* Look for phi arguments from within the loop.  If found, bias this phi.  */
> +  for (i = 0; i < gimple_phi_num_args (stmt); i++)
> +    {
> +      tree arg = gimple_phi_arg_def (stmt, i);
> +      if (TREE_CODE (arg) == SSA_NAME
> +         && !SSA_NAME_IS_DEFAULT_DEF (arg))
> +       {
> +         gimple def_stmt = SSA_NAME_DEF_STMT (arg);
> +         if (gimple_bb (def_stmt)->loop_father == father)
> +           return bb_rank[father->latch->index] + PHI_LOOP_BIAS;
> +       }
> +    }
> +
> +  /* Must be an uninteresting phi.  */
> +  return bb_rank[bb->index];
> +}
> +
> +/* If EXP is an SSA_NAME defined by a PHI statement that represents a
> +   loop-carried dependence of an innermost loop, return TRUE; else
> +   return FALSE.  */
> +static bool
> +loop_carried_phi (tree exp)
> +{
> +  gimple phi_stmt;
> +  long block_rank;
> +
> +  if (TREE_CODE (exp) != SSA_NAME
> +      || SSA_NAME_IS_DEFAULT_DEF (exp))
> +    return false;
> +
> +  phi_stmt = SSA_NAME_DEF_STMT (exp);
> +
> +  if (gimple_code (SSA_NAME_DEF_STMT (exp)) != GIMPLE_PHI)
> +    return false;
> +
> +  /* Non-loop-carried phis have block rank.  Loop-carried phis have
> +     an additional bias added in.  If this phi doesn't have block rank,
> +     it's biased and should not be propagated.  */
> +  block_rank = bb_rank[gimple_bb (phi_stmt)->index];
> +
> +  if (phi_rank (phi_stmt) != block_rank)
> +    return true;
> +
> +  return false;
> +}
> +
> +/* Return the maximum of RANK and the rank that should be propagated
> +   from expression OP.  For most operands, this is just the rank of OP.
> +   For loop-carried phis, the value is zero to avoid undoing the bias
> +   in favor of the phi.  */
> +static long
> +propagate_rank (long rank, tree op)
> +{
> +  long op_rank;
> +
> +  if (loop_carried_phi (op))
> +    return rank;
> +
> +  op_r
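
A rough C illustration of the accumulator pattern that the phi_rank comment above describes. This is a sketch under assumptions, not output of the pass: the function names are hypothetical, and it assumes the intent of the bias is that the loop-carried value ends up being added last. Unsigned arithmetic is used so both groupings give identical results.

```c
#include <stddef.h>

/* Grouping in which every addition depends on the value carried
   over from the previous iteration.  */
unsigned
sum_chained (const unsigned *a, const unsigned *b, size_t n)
{
  unsigned sum = 0;
  size_t i;

  for (i = 0; i < n; i++)
    sum = (sum + a[i]) + b[i];
  return sum;
}

/* Grouping in which a[i] + b[i] is computed independently of the
   previous iteration, and the accumulator is added last.  */
unsigned
sum_phi_last (const unsigned *a, const unsigned *b, size_t n)
{
  unsigned sum = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      unsigned t = a[i] + b[i];
      sum = sum + t;
    }
  return sum;
}
```

In the second form only one addition per iteration sits on the loop-carried dependence chain, which is the property the comment refers to as the expressions being independent for each iteration.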

Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-30 Thread Richard Guenther
On Sat, Jul 30, 2011 at 9:34 AM, Tom de Vries  wrote:
> On 07/30/2011 10:21 AM, Tom de Vries wrote:
>> Hi,
>>
>> On 07/28/2011 08:20 PM, Tom de Vries wrote:
>>> On 07/28/2011 06:25 PM, Richard Guenther wrote:
 On Thu, 28 Jul 2011, Tom de Vries wrote:

> On 07/28/2011 12:22 PM, Richard Guenther wrote:
>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/27/2011 05:27 PM, Richard Guenther wrote:
 On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 02:12 PM, Richard Guenther wrote:
>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/27/2011 01:50 PM, Tom de Vries wrote:
 Hi Richard,

 I have a patch set for bug 43513 - The stack pointer is adjusted 
 twice.

 01_pr43513.3.patch
 02_pr43513.3.test.patch
 03_pr43513.3.mudflap.patch

 The patch set has been bootstrapped and reg-tested on x86_64.

 I will send out the patches individually.

>>>
>>> The patch replaces a vla __builtin_alloca that has a constant 
>>> argument with an
>>> array declaration.
>>>
>>> OK for trunk?
>>
>> I don't think it is safe to try to get at the VLA type the way you 
>> do.
>
> I don't understand in what way it's not safe. Do you mean I don't 
> manage to find
> the type always, or that I find the wrong type, or something else?

 I think you might get the wrong type,
>>>
>>> Ok, I'll review that code one more time.
>>>
 you also do not transform code
 like

   int *p = alloca(4);
   *p = 3;

 as there is no array type involved here.

>>>
>>> I was trying to stay away from non-vla allocas.  A source-declared alloca
>>> has function lifetime, so we could have a single alloca in a loop, called
>>> 10 times, with all 10 instances live at the same time. This patch does not
>>> detect such cases, and thus stays away from non-vla allocas. A vla decl does
>>> not have such problems; its lifetime ends when it goes out of scope.
>>
>> Yes indeed - that probably would require more detailed analysis.
>>
>> In fact I would simply do sth like
>>
>>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>>   n_elem = size * 8 / BITS_PER_UNIT;
>>   array_type = build_array_type_nelts (elem_type, n_elem);
>>   var = create_tmp_var (array_type, NULL);
>>   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
>>
>
> I tried this code on the example, and it works, but the newly declared type
> has an 8-bit alignment, while the vla base type has a 32-bit alignment.  This
> makes the memory access in the example potentially unaligned, which prohibits
> an ivopts optimization, so the resulting text size is 68 instead of the 64
> achieved with my current patch.

 Ok, so then set DECL_ALIGN of the variable to something reasonable
 like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
 alignment that the target's alloca function would guarantee.

>>>
>>> I tried that, but that doesn't help. It's the alignment of the type that
>>> matters, not of the decl.
>>
>> It shouldn't.  All accesses are performed with the original types and
>> alignment comes from that (plus the underlying decl).
>>
>
> I managed to get it all working by using build_aligned_type rather than
> DECL_ALIGN.

 That's really odd, DECL_ALIGN should just work - nothing refers to the
 type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to
 1 maybe?

>>>
>>> This doesn't work either.
>>>
>>>   /* Declare array.  */
>>>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>>>   n_elem = size * 8 / BITS_PER_UNIT;
>>>   align = MIN (size * 8, GET_MODE_PRECISION (word_mode));
>>>   array_type = build_array_type_nelts (elem_type, n_elem);
>>>   var = create_tmp_var (array_type, NULL);
>>>   DECL_ALIGN (var) = align;
>>>   DECL_USER_ALIGN (var) = 1;
>>>
>>> Maybe this clarifies it:
>>>
>>> Breakpoint 1, may_be_unaligned_p (ref=0xf7d9d410, step=0xf7d3d578) at
>>> /home/vries/local/google/src/gcc-mainline/gcc/tree-ssa-loop-ivopts.c:1621
>>> (gdb) call debug_generic_expr (ref)
>>> MEM[(int[0:D.2579] *)&D.2595][0]
>>> (gdb) call debug_generic_expr (step)
>>> 4
>>>
>>> 1627   base = get_inner_reference (ref, &bitsize, &bitpos, &toffset, &mode,
>>> (gdb) call debug_generic_expr (base)
>>> D.2595
>>>
>>> 1629   base_type = TREE_TYPE (base);
>>> (gdb) call debug_generic_expr (base_type)
>>>

Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-30 Thread Richard Guenther
On Sat, Jul 30, 2011 at 9:24 AM, Tom de Vries  wrote:
> On 07/30/2011 10:21 AM, Tom de Vries wrote:
>> Hi,
>>
>> On 07/28/2011 08:20 PM, Tom de Vries wrote:
>>> On 07/28/2011 06:25 PM, Richard Guenther wrote:
 On Thu, 28 Jul 2011, Tom de Vries wrote:

> On 07/28/2011 12:22 PM, Richard Guenther wrote:
>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/27/2011 05:27 PM, Richard Guenther wrote:
 On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 02:12 PM, Richard Guenther wrote:
>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/27/2011 01:50 PM, Tom de Vries wrote:
 Hi Richard,

 I have a patch set for bug 43513 - The stack pointer is adjusted 
 twice.

 01_pr43513.3.patch
 02_pr43513.3.test.patch
 03_pr43513.3.mudflap.patch

 The patch set has been bootstrapped and reg-tested on x86_64.

 I will send out the patches individually.

>>>
>>> The patch replaces a vla __builtin_alloca that has a constant 
>>> argument with an
>>> array declaration.
>>>
>>> OK for trunk?
>>
>> I don't think it is safe to try to get at the VLA type the way you 
>> do.
>
> I don't understand in what way it's not safe. Do you mean I don't 
> manage to find
> the type always, or that I find the wrong type, or something else?

 I think you might get the wrong type,
>>>
>>> Ok, I'll review that code one more time.
>>>
 you also do not transform code
 like

   int *p = alloca(4);
   *p = 3;

 as there is no array type involved here.

>>>
>>> I was trying to stay away from non-vla allocas.  A source-declared alloca
>>> has function lifetime, so we could have a single alloca in a loop, called
>>> 10 times, with all 10 instances live at the same time. This patch does not
>>> detect such cases, and thus stays away from non-vla allocas. A vla decl does
>>> not have such problems; its lifetime ends when it goes out of scope.
>>
>> Yes indeed - that probably would require more detailed analysis.
>>
>> In fact I would simply do sth like
>>
>>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>>   n_elem = size * 8 / BITS_PER_UNIT;
>>   array_type = build_array_type_nelts (elem_type, n_elem);
>>   var = create_tmp_var (array_type, NULL);
>>   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
>>
>
> I tried this code on the example, and it works, but the newly declared type
> has an 8-bit alignment, while the vla base type has a 32-bit alignment.  This
> makes the memory access in the example potentially unaligned, which prohibits
> an ivopts optimization, so the resulting text size is 68 instead of the 64
> achieved with my current patch.

 Ok, so then set DECL_ALIGN of the variable to something reasonable
 like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
 alignment that the target's alloca function would guarantee.

>>>
>>> I tried that, but that doesn't help. It's the alignment of the type that
>>> matters, not of the decl.
>>
>> It shouldn't.  All accesses are performed with the original types and
>> alignment comes from that (plus the underlying decl).
>>
>
> I managed to get it all working by using build_aligned_type rather than
> DECL_ALIGN.

 That's really odd, DECL_ALIGN should just work - nothing refers to the
 type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to
 1 maybe?

>>>
>>> This doesn't work either.
>>>
>>>   /* Declare array.  */
>>>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>>>   n_elem = size * 8 / BITS_PER_UNIT;
>>>   align = MIN (size * 8, GET_MODE_PRECISION (word_mode));
>>>   array_type = build_array_type_nelts (elem_type, n_elem);
>>>   var = create_tmp_var (array_type, NULL);
>>>   DECL_ALIGN (var) = align;
>>>   DECL_USER_ALIGN (var) = 1;
>>>
>>> Maybe this clarifies it:
>>>
>>> Breakpoint 1, may_be_unaligned_p (ref=0xf7d9d410, step=0xf7d3d578) at
>>> /home/vries/local/google/src/gcc-mainline/gcc/tree-ssa-loop-ivopts.c:1621
>>> (gdb) call debug_generic_expr (ref)
>>> MEM[(int[0:D.2579] *)&D.2595][0]
>>> (gdb) call debug_generic_expr (step)
>>> 4
>>>
>>> 1627   base = get_inner_reference (ref, &bitsize, &bitpos, &toffset, &mode,
>>> (gdb) call debug_generic_expr (base)
>>> D.2595
>>>
>>> 1629   base_type = TREE_TYPE (base);
>>> (gdb) call debug_generic_expr (base_type)
>>>

Re: [PATCH] unbreak attribute((optimize(...))) on m68k (PR47908)

2011-07-30 Thread Mikael Pettersson
Andreas Schwab writes:
 > Mikael Pettersson  writes:
 > 
 > > 2011-07-28  Mikael Pettersson  
 > >
 > >   PR target/47908
 > >   * config/m68k/m68k.c (m68k_override_options_after_change): New function.
 > >   Disable instruction scheduling for non-ColdFire targets.
 > >   (TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): Define.
 > 
 > Ok for all active branches.
 > 
 > Andreas.

Thanks.

I will need help from someone with svn commit rights to apply this.
Specifically, the patch posted in

should go into trunk and 4.6, while the patch attached to PR47908

(same approach, different hook) should go into 4.5 and 4.4.

/Mikael


[patch, fortran] Fix ice-on-valid PR 48876

2011-07-30 Thread Thomas Koenig

Hello world,

the attached, rather self-explanatory patch fixes PR 48876.

OK for trunk?

Thomas

2011-07-30  Thomas Koenig  

PR fortran/48876
* expr.c (gfc_simplify_expr):  If end of a string is less
than zero, set it to zero.

2011-07-30  Thomas Koenig  

PR fortran/48876
* gfortran.dg/string_5.f90:  New test.
Index: expr.c
===
--- expr.c	(revision 176933)
+++ expr.c	(working copy)
@@ -1839,6 +1839,9 @@ gfc_simplify_expr (gfc_expr *p, int type)
 	  if (p->ref && p->ref->u.ss.end)
 	    gfc_extract_int (p->ref->u.ss.end, &end);
 
+	  if (end < 0)
+	    end = 0;
+
 	  s = gfc_get_wide_string (end - start + 2);
 	  memcpy (s, p->value.character.string + start,
 		  (end - start) * sizeof (gfc_char_t));
! { dg-do compile }
! PR fortran/48876 - this used to segfault.
! Test case contributed by mhp77 (a) gmx.at.
program test
  character ::  string =  "string"( : -1 )
end program test
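
To see why a negative end value matters for the memcpy in the hunk above, here is a small standalone sketch in C. It is not the gfortran code: the helper names and the 4-byte element size are assumptions for illustration, and start is assumed to be 0 for a substring written as ( : -1 ).

```c
#include <stddef.h>

/* Assumed element size, standing in for sizeof (gfc_char_t).  */
#define ELEM_SIZE ((size_t) 4)

/* Byte count as computed without the clamp.  A negative difference
   is converted to size_t and becomes a huge value.  */
size_t
copy_bytes_unclamped (int start, int end)
{
  return (end - start) * ELEM_SIZE;
}

/* Byte count with the clamp added by the patch: a negative END is
   treated as zero before the length is computed.  */
size_t
copy_bytes_clamped (int start, int end)
{
  if (end < 0)
    end = 0;
  return (end - start) * ELEM_SIZE;
}
```

For start 0 and end -1 the unclamped count wraps around to an enormous size, while the clamped count is zero. The sketch covers only the end < 0 case that the patch addresses.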



Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-30 Thread Paolo Bonzini
On Sat, Jul 30, 2011 at 00:32, H.J. Lu  wrote:
> The whole approach doesn't work. The testcase at
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721#c1
>
> shows GCC depends on transforming:
>
> (zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))
>
> to
>
> (plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))
>
> Otherwise we get either a compiler crash or wrong code.

Please explain how/why here or in the BZ.  Compiler crashes can be
fixed; wrong code is often a symptom of latent bugs.

Paolo
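
As background for the two RTL forms quoted above, the following C sketch shows that extending after a 32-bit add and adding after extending are not interchangeable in general. The function names are hypothetical, and the constant is modelled as an unsigned 32-bit value for simplicity.

```c
#include <stdint.h>

/* Models (zero_extend:DI (plus:SI foo y)): the add happens in 32
   bits and may wrap before the result is widened.  */
uint64_t
extend_after_add (uint32_t foo, uint32_t y)
{
  return (uint64_t) (uint32_t) (foo + y);
}

/* Models (plus:DI (zero_extend:DI foo) y): the operand is widened
   first, so the add cannot wrap at 32 bits.  */
uint64_t
add_after_extend (uint32_t foo, uint32_t y)
{
  return (uint64_t) foo + (uint64_t) y;
}
```

The two agree whenever the 32-bit sum does not overflow (for example 5 + 3), and differ when it does: for foo = 0xFFFFFFFF and y = 1 the first form gives 0 and the second gives 0x100000000.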


Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-30 Thread Joseph S. Myers
On Fri, 29 Jul 2011, Jason Merrill wrote:

> > Looking a bit further, it looks like the C FE uses cfun->language only
> > to store the context of the outer function when faced with a nested
> > function.  This is done by c_push_function_context, called by
> > c_parser_declaration_or_fndef.  Otherwise, cfun->language is not
> > allocated.  Is it appropriate that -Wunused-local-typedefs allocates it
> > as well?
> 
> I think so.  Joseph?

Seems reasonable.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch, fortran] Fix ice-on-valid PR 48876

2011-07-30 Thread Steve Kargl
On Sat, Jul 30, 2011 at 01:49:42PM +0200, Thomas Koenig wrote:
> Hello world,
> 
> the attached, rather self-explanatory patch fixes PR 48876.
> 
> OK for trunk?
> 

Yes.  If the problem exists on 4.6, can you apply the patch
to 4.6 as well?

-- 
Steve


Re: [PATCH] Fix PR47594: Sign extend constants while translating to Graphite

2011-07-30 Thread Sebastian Pop
On Sat, Jul 30, 2011 at 08:33, Jack Howarth  wrote:
>    These patches fail to bootstrap on current gcc trunk (r176957) with...
>

The attached patch adds one extra line to convert the step to
unsigned.  It passes bootstrap and has the following extra FAILS:

FAIL: gcc.c-torture/execute/pr45034.c execution,  -O2
FAIL: gcc.c-torture/execute/pr45034.c execution,  -O3 -fomit-frame-pointer
FAIL: gcc.c-torture/execute/pr45034.c execution,  -O3 -fomit-frame-pointer -funroll-loops
FAIL: gfortran.dg/do_3.F90  -O1  execution test
FAIL: gcc.c-torture/execute/pr45034.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
FAIL: gfortran.dg/do_3.F90  -O2  execution test
FAIL: gcc.c-torture/execute/pr45034.c execution,  -O3 -g
FAIL: gfortran.dg/do_3.F90  -O3 -fomit-frame-pointer  execution test
FAIL: gcc.c-torture/execute/pr45034.c execution,  -Os
FAIL: gfortran.dg/do_3.F90  -O3 -fomit-frame-pointer -funroll-loops  execution test
FAIL: gcc.c-torture/execute/pr45034.c execution,  -O2 -flto -flto-partition=none
FAIL: gfortran.dg/do_3.F90  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  execution test
FAIL: gcc.c-torture/execute/pr45034.c execution,  -O2 -flto
FAIL: gfortran.dg/do_3.F90  -O3 -g  execution test
FAIL: gfortran.dg/do_3.F90  -Os  execution test

I'm investigating why these fail.

Sebastian
From 0abb53c250949cd22cacbf3ed0b69604665c2949 Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Fri, 29 Jul 2011 14:35:56 -0500
Subject: [PATCH 2/2] Fix PR47594: Build signed niter expressions

2011-07-23  Sebastian Pop  

	PR middle-end/47594
	* graphite-scop-detection.c (graphite_can_represent_scev): Return false
	on TYPE_UNSIGNED.
	* graphite-sese-to-poly.c (scan_tree_for_params_int): Do not call
	double_int_sext.
	* tree-ssa-loop-niter.c (number_of_iterations_ne): Use the signed types
	for the trivial case, then convert to unsigned.
	(number_of_iterations_lt): Use the original signed types.
	(number_of_iterations_cond): Same.
	(find_loop_niter): Build signed integer constant.
	(loop_niter_by_eval): Same.
---
 gcc/ChangeLog  |   14 
 gcc/graphite-scop-detection.c  |6 +++
 gcc/graphite-sese-to-poly.c|7 +---
 gcc/testsuite/ChangeLog|5 +++
 .../gfortran.dg/graphite/run-id-pr47594.f90|   35 
 gcc/tree-ssa-loop-niter.c  |   30 +++--
 6 files changed, 80 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/graphite/run-id-pr47594.f90

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 738144d..9b22eed 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@
+2011-07-23  Sebastian Pop  
+
+	PR middle-end/47594
+	* graphite-scop-detection.c (graphite_can_represent_scev): Return false
+	on TYPE_UNSIGNED.
+	* graphite-sese-to-poly.c (scan_tree_for_params_int): Do not call
+	double_int_sext.
+	* tree-ssa-loop-niter.c (number_of_iterations_ne): Use the signed types
+	for the trivial case, then convert to unsigned.
+	(number_of_iterations_lt): Use the original signed types.
+	(number_of_iterations_cond): Same.
+	(find_loop_niter): Build signed integer constant.
+	(loop_niter_by_eval): Same.
+
 2011-07-29  Uros Bizjak  
 
 	* config/i386/predicates.md (tp_or_register_operand): Remove predicate.
diff --git a/gcc/graphite-scop-detection.c b/gcc/graphite-scop-detection.c
index 3460568..403ff23 100644
--- a/gcc/graphite-scop-detection.c
+++ b/gcc/graphite-scop-detection.c
@@ -196,6 +196,12 @@ graphite_can_represent_scev (tree scev)
   if (chrec_contains_undetermined (scev))
 return false;
 
+  /* FIXME: As long as Graphite cannot handle wrap around effects of
+ induction variables, we discard them.  */
+  if (TYPE_UNSIGNED (TREE_TYPE (scev))
+  && !POINTER_TYPE_P (TREE_TYPE (scev)))
+return false;
+
   switch (TREE_CODE (scev))
 {
 case PLUS_EXPR:
diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index 7e23c9d..d15f0b3 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -647,14 +647,9 @@ scan_tree_for_params_int (tree cst, ppl_Linear_Expression_t expr, mpz_t k)
 {
   mpz_t val;
   ppl_Coefficient_t coef;
-  tree type = TREE_TYPE (cst);
 
   mpz_init (val);
-
-  /* Necessary to not get "-1 = 2^n - 1". */
-  mpz_set_double_int (val, double_int_sext (tree_to_double_int (cst),
-	TYPE_PRECISION (type)), false);
-
+  tree_int_to_gmp (cst, val);
   mpz_mul (val, val, k);
   ppl_new_Coefficient (&coef);
   ppl_assign_Coefficient_from_mpz_t (coef, val);
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index cf5ee2b..55f2e91 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2011-07-23  Sebastian Pop  
+
+	PR middle-end/47594
+	* gfortran.dg/graphite/run-id-pr47594.f90: New.
+
 2011-07-29  Rainer Orth  
 
 	PR tree-optimization/47407
diff --git a/gcc/testsuite/gfortran.d

[Patch, Fortran, OOP] PR 49112: [4.6/4.7 Regression] Missing type-bound procedure, "duplicate save" warnings and internal compiler error

2011-07-30 Thread Janus Weil
Hi all,

the PR in the subject line contains several issues, and with the
"duplicate save" part fixed, the attached patch takes care of the
"missing type-bound procedure" regression (comment #6).

The problem is the following: When parsing a structure constructor, we
have to resolve the derived type first. However, this will also
trigger the construction of the vtab for this type (if it has
type-bound procedures), which in turn will be incomplete if we're in
the middle of a module and the type-bound procedures have not been
parsed fully.

To solve this dilemma, I have split off from 'resolve_fl_derived' a
part which only concerns the data components etc
('resolve_fl_derived0'). This can be called whenever we encounter a
structure constructor. The full 'resolve_fl_derived' will call this
split-off part and in addition resolve the typebound procedures,
thereby constructing the vtab.

The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk and 4.6?

Cheers,
Janus


2011-07-30  Janus Weil  

PR fortran/49112
* resolve.c (resolve_structure_cons): Don't do the full dt resolution,
only call 'resolve_fl_derived0'.
(resolve_typebound_procedures): Resolve typebound procedures of
parent type.
(resolve_fl_derived0): New function, which does a part of the work
for 'resolve_fl_derived'.
(resolve_fl_derived): Call 'resolve_fl_derived0' and do some additional
things.


2011-07-30  Janus Weil  

PR fortran/49112
* gfortran.dg/abstract_type_6.f03: Modified.
* gfortran.dg/typebound_proc_24.f03: New.
Index: gcc/testsuite/gfortran.dg/abstract_type_6.f03
===
--- gcc/testsuite/gfortran.dg/abstract_type_6.f03	(revision 176950)
+++ gcc/testsuite/gfortran.dg/abstract_type_6.f03	(working copy)
@@ -31,7 +31,7 @@ TYPE, EXTENDS(middle) :: bottom
 CONTAINS
! useful proc to satisfy deferred procedure in top. Because we've
! extended middle we wouldn't get told off if we forgot this.
-   PROCEDURE :: proc_a => bottom_a
+   PROCEDURE :: proc_a => bottom_a  ! { dg-error "must be a module procedure" }
! calls middle%proc_b and then provides extra behaviour
PROCEDURE :: proc_b => bottom_b
! calls top_c and then provides extra behaviour
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(revision 176950)
+++ gcc/fortran/resolve.c	(working copy)
@@ -950,6 +950,9 @@ resolve_contained_functions (gfc_namespace *ns)
 }
 
 
+static gfc_try resolve_fl_derived0 (gfc_symbol *sym);
+
+
 /* Resolve all of the elements of a structure constructor and make sure that
the types are correct. The 'init' flag indicates that the given
constructor is an initializer.  */
@@ -965,7 +968,7 @@ resolve_structure_cons (gfc_expr *expr, int init)
   t = SUCCESS;
 
   if (expr->ts.type == BT_DERIVED)
-resolve_symbol (expr->ts.u.derived);
+resolve_fl_derived0 (expr->ts.u.derived);
 
   cons = gfc_constructor_first (expr->value.constructor);
   /* A constructor may have references if it is the result of substituting a
@@ -11361,9 +11364,14 @@ static gfc_try
 resolve_typebound_procedures (gfc_symbol* derived)
 {
   int op;
+  gfc_symbol* super_type;
 
   if (!derived->f2k_derived || !derived->f2k_derived->tb_sym_root)
 return SUCCESS;
+  
+  super_type = gfc_get_derived_super_type (derived);
+  if (super_type)
+resolve_typebound_procedures (super_type);
 
   resolve_bindings_derived = derived;
   resolve_bindings_result = SUCCESS;
@@ -11475,28 +11483,17 @@ ensure_not_abstract (gfc_symbol* sub, gfc_symbol*
 }
 
 
-/* Resolve the components of a derived type.  */
+/* Resolve the components of a derived type. This does not have to wait until
+   resolution stage, but can be done as soon as the dt declaration has been
+   parsed.  */
 
 static gfc_try
-resolve_fl_derived (gfc_symbol *sym)
+resolve_fl_derived0 (gfc_symbol *sym)
 {
   gfc_symbol* super_type;
   gfc_component *c;
 
   super_type = gfc_get_derived_super_type (sym);
-  
-  if (sym->attr.is_class && sym->ts.u.derived == NULL)
-{
-  /* Fix up incomplete CLASS symbols.  */
-  gfc_component *data = gfc_find_component (sym, "_data", true, true);
-  gfc_component *vptr = gfc_find_component (sym, "_vptr", true, true);
-  if (vptr->ts.u.derived == NULL)
-	{
-	  gfc_symbol *vtab = gfc_find_derived_vtab (data->ts.u.derived);
-	  gcc_assert (vtab);
-	  vptr->ts.u.derived = vtab->ts.u.derived;
-	}
-}
 
   /* F2008, C432. */
   if (super_type && sym->attr.coarray_comp && !super_type->attr.coarray_comp)
@@ -11508,7 +11505,7 @@ static gfc_try
 }
 
   /* Ensure the extended type gets resolved before we do.  */
-  if (super_type && resolve_fl_derived (super_type) == FAILURE)
+  if (super_type && resolve_fl_derived0 (super_type) == FAILURE)
 return FAILURE;
 
   /* An ABSTRACT type must be extensible.  */
@@ -11861,14 +11858

Re: [DF] Replace various bitmaps with HARD_REG_SETs

2011-07-30 Thread Jakub Jelinek
On Wed, Jul 27, 2011 at 10:29:43PM +0200, Paolo Bonzini wrote:
> On 07/27/2011 06:17 PM, Joseph S. Myers wrote:
> >>>  --- gcc/target.h 2011-04-06 11:08:17 +
> >>>  +++ gcc/target.h 2011-07-27 10:27:56 +
> >>>  @@ -50,6 +50,7 @@
> >>>#define GCC_TARGET_H
> >>>
> >>>#include "tm.h"
> >>>  +#include "hard-reg-set.h"
> >>>#include "insn-modes.h"
> >Please send a patch against current trunk.  target.h hasn't included tm.h
> >for over a month.  Since hard-reg-set.h depends on tm.h, you won't be able
> >to include hard-reg-set.h in target.h any more, so you'll need to find
> >another solution for that.
> 
> For example you can make HARD_REG_SET always a struct, so that you
> can add a forward declaration in target.h.  GCC is able to optimize
> the struct away, we rely on that.

I don't think it is a good idea.  A single long HARD_REG_SET is actually
the common case, at least with 64-bit host, and while we can SRA a struct
often, several ABIs pass structures less efficiently than plain longs.

Jakub


[gomp3.1] #pragma omp atomic updates and fixes

2011-07-30 Thread Jakub Jelinek
Hi!

This patch implements the remaining changes from in between 3.1 draft and
3.1 final, in the light of the
http://www.openmp.org/forum/viewtopic.php?f=10&t=1199
clarification.
  #pragma omp atomic
x = x + 6 + 2;
is now allowed, as well as
  #pragma omp atomic capture
  { x = x | 7 + 1; v = x; }
etc.  In the C FE I had to change c_parser_binary_expression slightly, as
I already changed cp_parser_binary_expression a few years ago for OpenMP
3.0, because we don't want to parse
  #pragma omp atomic
x = x * 6 + 2;
as if it was x = x * (6 + 2);.  Regtested on x86_64-linux, committed to
gomp-3_1-branch.
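
To make the grouping issue concrete, here is a small plain-C sketch (no OpenMP pragmas, so it builds without -fopenmp). The function names are invented for illustration; the point is only that regrouping "x + 6 + 2" as x + (6 + 2) is harmless, while treating "x * 6 + 2" as x * (6 + 2) changes the result.

```c
#include <assert.h>

/* "x = x * 6 + 2" as C precedence reads it: (x * 6) + 2.  */
int
update_mul_as_written (int x)
{
  return x * 6 + 2;
}

/* The reading the parser must avoid: x * (6 + 2).  */
int
update_mul_misparsed (int x)
{
  return x * (6 + 2);
}

/* "x = x + 6 + 2" as written: (x + 6) + 2.  */
int
update_add_as_written (int x)
{
  return x + 6 + 2;
}

/* The "x = x binop expr" view of the same statement: x + (6 + 2).  */
int
update_add_regrouped (int x)
{
  return x + (6 + 2);
}

/* Self-check with a small input where no overflow can occur.  */
void
check_groupings (void)
{
  assert (update_add_as_written (2) == update_add_regrouped (2));
  assert (update_mul_as_written (2) != update_mul_misparsed (2));
}
```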

2011-07-30  Jakub Jelinek  

* c-common.h (c_finish_omp_atomic): Add rhs1 argument.
* c-omp.c (c_finish_omp_atomic): Add rhs1 argument.  If it
has side-effects, evaluate those too in the right spot,
if it is a decl and lhs is also a decl, error out if they
aren't the same.  Fix order of omit_two_operands_loc arguments.

* c-parser.c (enum c_parser_prec): New enum, moved from within
c_parser_binary_expression.
(c_parser_binary_expression): Add PREC argument.  Stop parsing
if operator has lower or equal precedence than PREC.
(c_parser_conditional_expression, c_parser_omp_for_loop): Adjust
callers.
(c_parser_omp_atomic): Parse x = x binop expr; stmts in
#pragma omp atomic update and in #pragma omp atomic capture
structured block forms.  Adjust c_finish_omp_atomic caller.
Fix up error handling in capture structured blocks.

* cp-tree.h (finish_omp_atomic): Add rhs1 argument.
* parser.c (cp_parser_omp_atomic): Parse x = x binop expr; stmts in
#pragma omp atomic update and in #pragma omp atomic capture
structured block forms.  Adjust finish_omp_atomic caller.
Fix up error handling in capture structured blocks.
* pt.c (tsubst_expr) : Find saved rhs1 value if any
and pass it to finish_omp_atomic.
* semantics.c (finish_omp_atomic): Add rhs1 argument.  Adjust
c_finish_omp_atomic caller and store rhs1 inside of OMP_ATOMIC
arguments.

* gcc.dg/gomp/atomic-5.c: Adjust expected diagnostics.
* gcc.dg/gomp/atomic-15.c: New test.
* g++.dg/gomp/atomic-5.C: Adjust expected diagnostics.
* g++.dg/gomp/atomic-15.C: New test.

* testsuite/libgomp.c/atomic-14.c: New test.
* testsuite/libgomp.c++/atomic-8.C: New test.
* testsuite/libgomp.c++/atomic-9.C: New test.

--- gcc/c-family/c-common.h.jj  2011-07-11 17:33:51.0 +0200
+++ gcc/c-family/c-common.h 2011-07-30 15:00:38.0 +0200
@@ -1005,7 +1005,7 @@ extern tree c_finish_omp_critical (locat
 extern tree c_finish_omp_ordered (location_t, tree);
 extern void c_finish_omp_barrier (location_t);
 extern tree c_finish_omp_atomic (location_t, enum tree_code, enum tree_code,
-tree, tree, tree, tree);
+tree, tree, tree, tree, tree);
 extern void c_finish_omp_flush (location_t);
 extern void c_finish_omp_taskwait (location_t);
 extern void c_finish_omp_taskyield (location_t);
--- gcc/c-family/c-omp.c.jj 2011-07-11 19:57:49.0 +0200
+++ gcc/c-family/c-omp.c2011-07-30 14:59:03.0 +0200
@@ -1,7 +1,8 @@
 /* This file contains routines to construct GNU OpenMP constructs,
called from parsing in the C and C++ front ends.
 
-   Copyright (C) 2005, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+   Copyright (C) 2005, 2007, 2008, 2009, 2010, 2011
+   Free Software Foundation, Inc.
Contributed by Richard Henderson ,
  Diego Novillo .
 
@@ -122,12 +123,13 @@ c_finish_omp_taskyield (location_t loc)
 tree
 c_finish_omp_atomic (location_t loc, enum tree_code code,
 enum tree_code opcode, tree lhs, tree rhs,
-tree v, tree lhs1)
+tree v, tree lhs1, tree rhs1)
 {
   tree x, type, addr;
 
   if (lhs == error_mark_node || rhs == error_mark_node
-  || v == error_mark_node || lhs1 == error_mark_node)
+  || v == error_mark_node || lhs1 == error_mark_node
+  || rhs1 == error_mark_node)
 return error_mark_node;
 
   /* ??? According to one reading of the OpenMP spec, complex type are
@@ -188,6 +190,20 @@ c_finish_omp_atomic (location_t loc, enu
   x = build2 (code, type, addr, rhs);
   SET_EXPR_LOCATION (x, loc);
 
+  /* Generally it is hard to prove lhs1 and lhs are the same memory
+ location, just diagnose different variables.  */
+  if (rhs1
+  && TREE_CODE (rhs1) == VAR_DECL
+  && TREE_CODE (lhs) == VAR_DECL
+  && rhs1 != lhs)
+{
+  if (code == OMP_ATOMIC)
+   error_at (loc, "%<#pragma omp atomic update%> uses two different variables for memory");
+  else
+   error_at (loc, "%<#pragma omp atomic capture%> uses two different variables for memory");
+  return error_mark_node;
+}
+
   if (code !=

Re: [patch] Fix PR tree-optimization/49471

2011-07-30 Thread Zdenek Dvorak
Hi,

> > This patch fixes the build failure of cactusADM and dealII spec2006
> > benchmarks when autopar is enabled.
> > (for powerpc they fail only when -m32 is additionally enabled)
> >
> > The problem originated in canonicalize_loop_ivs, where we iterate the
> > header's phis in order to base all
> > the induction variables on a single control variable.
> > We use the largest precision of the loop's ivs in order to determine the
> > type of the control variable.
> >
> > Since iterating the loop's phis takes into account not only the loop's
> > ivs, but also reduction variables,
> > we got precision values like 80 for x86, or 128 for ppc.
> > The compilers failed to create proper types for these sizes
> > (respectively).
> >
> > The proper behavior for determining the control variable's type is to take
> > into account only the loop's ivs,
> > which is what this patch does.
> >
> > Bootstrap and testsuite pass successfully (as autopar is not enabled by
> > default).
> > No new regressions when the testsuite is run with autopar enabled.
> > No new regressions for the run of spec2006 with autopar enabled,
> >
> > cactusADM and dealII benchmarks now pass successfully with autopar on
> > powerpc and x86.
> >
> > Thanks to Zdenek who helped me figure out the failure/fix.
> > OK for trunk?
> 
> It'll collide with Sebastian's patch in that area.  I suggested an
> INTEGRAL_TYPE_P check instead of the simple_iv one, it
> should be cheaper.  Zdenek, do you think it will be "incorrect"
> in some cases?

well, it does not make much sense -- reductions of integral type would
be taken into consideration for determining the size of the canonical
variable.  However, it is not a big issue (the choice of the type is more
or less arbitrary, as long as the number of iterations fits into it; selecting
the type based on another existing iv is just to avoid unnecessary extensions).

Zdenek

> Thanks,
> Richard.
> 
> > Thanks,
> > Razya
> >
> > ChangeLog:
> >
> >   PR tree-optimization/49471
> >   * tree-vect-loop-manip.c (canonicalize_loop_ivs): Add condition to
> >   ignore reduction variables when iterating the loop header's phis.
> >
> >
> >
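
The selection rule discussed in this thread can be sketched as follows. This is a simplified model, not the code in canonicalize_loop_ivs: struct header_phi and control_var_precision are hypothetical, and starting from the precision of the iteration-count type is an assumption of the sketch. Only phis flagged as induction variables contribute, so an 80-bit or 128-bit reduction no longer widens the control variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one loop-header phi; not a GCC structure.  */
struct header_phi
{
  unsigned precision;  /* precision in bits of the phi result type */
  bool is_simple_iv;   /* true for an induction variable, false for
                          anything else, e.g. a reduction */
};

/* Return the precision to use for the canonical control variable:
   the largest precision among the induction variables, but never
   less than NITER_PRECISION.  */
unsigned
control_var_precision (const struct header_phi *phis, size_t n,
                       unsigned niter_precision)
{
  unsigned precision = niter_precision;
  size_t i;

  assert (phis != NULL || n == 0);

  for (i = 0; i < n; i++)
    if (phis[i].is_simple_iv && phis[i].precision > precision)
      precision = phis[i].precision;

  return precision;
}
```

With phis of precision 32 (iv), 80 (reduction) and 64 (iv), the sketch picks 64 rather than 80.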


Re: [DF] Replace various bitmaps with HARD_REG_SETs

2011-07-30 Thread Jakub Jelinek
On Sat, Jul 30, 2011 at 06:34:34PM +0200, Jakub Jelinek wrote:
> I don't think it is a good idea.  A single long HARD_REG_SET is actually
> the common case, at least with 64-bit host, and while we can SRA a struct
> often, several ABIs pass structures less efficiently than plain longs.

And for bigger HARD_REG_SET, making it a struct would mean that all
functions that use HARD_REG_SET in a read-only way and thus are passed
a HARD_REG_SET argument rather than HARD_REG_SET * suddenly copy the whole
bitset, while previously for single long just the long has been passed and
for larger ones a pointer to the array.  You'd want to change such functions
to pass HARD_REG_SET *, but that would penalize the single long hard reg
sets (i386 native is an example of array HARD_REG_SET, x86_64 native is
an example of single long HARD_REG_SET).

So, if you really want a target hook to return or fill up a HARD_REG_SET,
I think it is best to pass void * around if we don't want to include tm.h
back in target hook headers.

Jakub
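
To illustrate the trade-off, here is a minimal sketch of a struct-wrapped register set with one read-only accessor taking the set by value and one taking it by pointer. Everything in it is hypothetical (NUM_REGS, struct reg_set and the function names are not GCC's); it only shows the two calling styles, not their cost on any particular ABI.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NUM_REGS 76                      /* hypothetical register count */
typedef unsigned long elt_type;
#define ELT_BITS ((unsigned) (sizeof (elt_type) * CHAR_BIT))
#define SET_LONGS ((NUM_REGS + ELT_BITS - 1) / ELT_BITS)

/* The set is always a struct, whatever the number of words.  */
struct reg_set
{
  elt_type elems[SET_LONGS];
};

/* Set bit BIT in *S.  */
void
reg_set_set_bit (struct reg_set *s, unsigned bit)
{
  assert (bit < NUM_REGS);
  s->elems[bit / ELT_BITS] |= (elt_type) 1 << (bit % ELT_BITS);
}

/* Read-only test, set passed by value: the whole struct is copied
   at the call unless the compiler optimizes the copy away.  */
bool
reg_set_bit_p (struct reg_set s, unsigned bit)
{
  assert (bit < NUM_REGS);
  return (s.elems[bit / ELT_BITS] >> (bit % ELT_BITS)) & 1;
}

/* Read-only test, set passed by pointer: no copy, but an extra
   indirection even when the set fits in a single word.  */
bool
reg_set_bit_p_ptr (const struct reg_set *s, unsigned bit)
{
  assert (bit < NUM_REGS);
  return (s->elems[bit / ELT_BITS] >> (bit % ELT_BITS)) & 1;
}
```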


Re: PATCH: [x32]: Check TARGET_LP64 for SIZE_TYPE/PTRDIFF_TYPE

2011-07-30 Thread Uros Bizjak
On Sat, Jul 30, 2011 at 12:41 AM, H.J. Lu  wrote:

> X32 is 32bit.  This patch checks TARGET_LP64 for SIZE_TYPE/PTRDIFF_TYPE.
> OK for trunk?

OK, if tested on x32. You didn't say how the patch was tested.

Thanks,
Uros.


[RFC] hard-reg-set.h refactoring

2011-07-30 Thread Dimitrios Apostolou

Hello list,

the attached patch changes hard-reg-set.h in the following areas:

1) HARD_REG_SET is now always a struct so that it can be used in files 
where we don't want to include tm.h. Many thanks to Paolo for providing 
the idea and the original patch.


2) Code for specific HARD_REG_SET_LONG values is deleted and only generic 
code is left, making the file much more readable/maintainable. I was 
expecting gcc would unroll, even at -O2, loops with 2-3 iterations, so 
performance should have been the same.



I don't intend for this to go mainline; Jakub has explained on IRC that 
certain ABIs make it slower to pass structs and we wouldn't want that. 
Nevertheless I'd appreciate comments on whether any part of this patch is 
worth keeping. FWIW I've profiled this on i386 to be about 4 M instr 
slower out of ~1.5 G instr. I'll now be checking the profiler to see where 
exactly the overhead is.



Thanks,
Dimitris
=== modified file 'gcc/hard-reg-set.h'
--- gcc/hard-reg-set.h  2011-01-03 20:52:22 +
+++ gcc/hard-reg-set.h  2011-07-29 22:32:27 +
@@ -24,35 +24,31 @@ along with GCC; see the file COPYING3.  
 /* Define the type of a set of hard registers.  */
 
 /* HARD_REG_ELT_TYPE is a typedef of the unsigned integral type which
-   will be used for hard reg sets, either alone or in an array.
-
-   If HARD_REG_SET is a macro, its definition is HARD_REG_ELT_TYPE,
-   and it has enough bits to represent all the target machine's hard
-   registers.  Otherwise, it is a typedef for a suitably sized array
-   of HARD_REG_ELT_TYPEs.  HARD_REG_SET_LONGS is defined as how many.
+   will be used for hard reg sets.  An HARD_REG_ELT_TYPE, or an
+   array of them is wrapped in a struct.
 
Note that lots of code assumes that the first part of a regset is
the same format as a HARD_REG_SET.  To help make sure this is true,
we only try the widest fast integer mode (HOST_WIDEST_FAST_INT)
-   instead of all the smaller types.  This approach loses only if
-   there are very few registers and then only in the few cases where
-   we have an array of HARD_REG_SETs, so it needn't be as complex as
-   it used to be.  */
-
-typedef unsigned HOST_WIDEST_FAST_INT HARD_REG_ELT_TYPE;
-
-#if FIRST_PSEUDO_REGISTER <= HOST_BITS_PER_WIDEST_FAST_INT
-
-#define HARD_REG_SET HARD_REG_ELT_TYPE
+   instead of all the smaller types. */
 
+#ifdef ENABLE_RTL_CHECKING
+#define gcc_rtl_assert(EXPR) gcc_assert (EXPR)
 #else
+#define gcc_rtl_assert(EXPR) ((void)(0 && (EXPR)))
+#endif
+
+typedef unsigned HOST_WIDEST_FAST_INT HARD_REG_ELT_TYPE;
 
 #define HARD_REG_SET_LONGS \
  ((FIRST_PSEUDO_REGISTER + HOST_BITS_PER_WIDEST_FAST_INT - 1)  \
   / HOST_BITS_PER_WIDEST_FAST_INT)
-typedef HARD_REG_ELT_TYPE HARD_REG_SET[HARD_REG_SET_LONGS];
 
-#endif
+#define HARD_REG_SET struct hard_reg_set
+
+struct hard_reg_set {
+  HARD_REG_ELT_TYPE elems[HARD_REG_SET_LONGS];
+};
 
 /* HARD_CONST is used to cast a constant to the appropriate type
for use with a HARD_REG_SET.  */
@@ -89,343 +85,108 @@ typedef HARD_REG_ELT_TYPE HARD_REG_SET[H
hard_reg_set_intersect_p (X, Y), which returns true if X and Y intersect.
hard_reg_set_empty_p (X), which returns true if X is empty.  */
 
-#define UHOST_BITS_PER_WIDE_INT ((unsigned) HOST_BITS_PER_WIDEST_FAST_INT)
 
-#ifdef HARD_REG_SET
+#define HARD_REG_ELT_BITS ((unsigned) HOST_BITS_PER_WIDEST_FAST_INT)
 
 #define SET_HARD_REG_BIT(SET, BIT)  \
- ((SET) |= HARD_CONST (1) << (BIT))
+  hard_reg_set_set_bit (&(SET), (BIT))
 #define CLEAR_HARD_REG_BIT(SET, BIT)  \
- ((SET) &= ~(HARD_CONST (1) << (BIT)))
+  hard_reg_set_clear_bit(&(SET), (BIT))
 #define TEST_HARD_REG_BIT(SET, BIT)  \
- (!!((SET) & (HARD_CONST (1) << (BIT))))
-
-#define CLEAR_HARD_REG_SET(TO) ((TO) = HARD_CONST (0))
-#define SET_HARD_REG_SET(TO) ((TO) = ~ HARD_CONST (0))
-
-#define COPY_HARD_REG_SET(TO, FROM) ((TO) = (FROM))
-#define COMPL_HARD_REG_SET(TO, FROM) ((TO) = ~(FROM))
-
-#define IOR_HARD_REG_SET(TO, FROM) ((TO) |= (FROM))
-#define IOR_COMPL_HARD_REG_SET(TO, FROM) ((TO) |= ~ (FROM))
-#define AND_HARD_REG_SET(TO, FROM) ((TO) &= (FROM))
-#define AND_COMPL_HARD_REG_SET(TO, FROM) ((TO) &= ~ (FROM))
-
-static inline bool
-hard_reg_set_subset_p (const HARD_REG_SET x, const HARD_REG_SET y)
-{
-  return (x & ~y) == HARD_CONST (0);
-}
+  hard_reg_set_bit_p((SET), (BIT))
 
-static inline bool
-hard_reg_set_equal_p (const HARD_REG_SET x, const HARD_REG_SET y)
-{
-  return x == y;
-}
-
-static inline bool
-hard_reg_set_intersect_p (const HARD_REG_SET x, const HARD_REG_SET y)
-{
-  return (x & y) != HARD_CONST (0);
-}
-
-static inline bool
-hard_reg_set_empty_p (const HARD_REG_SET x)
+static inline void
+hard_reg_set_set_bit (HARD_REG_SET *s, unsigned int bit)
 {
-  return x == HARD_CONST (0);
-}
-
+#if HARD_REG_SET_LONGS > 1
+  int word = bit / HARD_REG_ELT_BITS;
+  int bitpos = bit % HARD_REG_ELT_BITS;
 #else
-
-#define SET_HARD_REG_BIT(SET, BIT) \
-  ((SET)[(BIT) / UHOST_BITS_PER_WIDE_INT]  \
-   |= HARD_CONST (1) << ((BIT) %

[C++ testsuite patch, committed as obvious] PR 49917

2011-07-30 Thread Paolo Carlini

Hi,

I'm committing as obvious a fix to this testcase, which is meant to test 
a runtime issue (c++/13865).


Thanks,
Paolo.

2011-07-30  Paolo Carlini  

PR testsuite/49917
* g++.dg/init/for1.C: Fix.
Index: g++.dg/init/for1.C
===
--- g++.dg/init/for1.C  (revision 176961)
+++ g++.dg/init/for1.C  (working copy)
@@ -1,6 +1,8 @@
 // PR c++/13865
 // Bug: We were destroying 'a' before executing the loop.
 
+// { dg-do run }
+
 #include 
 
 int i;
@@ -13,7 +15,7 @@ class A
   ~A()
   {
 printf("A dtor\n");
-if (i != 1)
+if (i != 2)
   r = 1;
   }
 };


Re: [Patch, Fortran, OOP] PR 49112: [4.6/4.7 Regression] Missing type-bound procedure, "duplicate save" warnings and internal compiler error

2011-07-30 Thread Mikael Morin
On Saturday 30 July 2011 17:43:03 Janus Weil wrote:
> Hi all,
> 
> the PR in the subject line contains several issues, and with the
> "duplicate save" part fixed, the attached patch takes care of the
> "missing type-bound procedure" regression (comment #6).
> 
> The problem is the following: When parsing a structure constructor, we
> have to resolve the derived type first. However, this will also
> trigger the construction of the vtab for this type (if it has
> type-bound procedures), which in turn will be incomplete if we're in
> the middle of a module and the type-bound procedures have not been
> parsed fully.
> 
> To solve this dilemma, I have split off from 'resolve_fl_derived' a
> part which only concerns the data components etc
> ('resolve_fl_derived0'). This can be called whenever we encounter a
> structure constructor. The full 'resolve_fl_derived' will call this
> split-off part and in addition resolve the typebound procedures,
> thereby constructing the vtab.
> 
> The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk and 4.6?
> 
OK, Thanks.

Mikael



Re: [patch, fortran] Fix ice-on-valid PR 48876

2011-07-30 Thread Thomas Koenig

Hello Steve,


> On Sat, Jul 30, 2011 at 01:49:42PM +0200, Thomas Koenig wrote:
>> Hello world,
>>
>> the attached, rather self-explanatory patch fixes PR 48876.
>>
>> OK for trunk?
>
> Yes.  If the problem exists on 4.6, can you apply the patch
> to 4.6 as well.


Applied to trunk and 4.6 (this was not a regression).  Thanks for the 
review!


Thomas


Re: [patch tree-optimization]: Fix regression about vrp47.c (and co)

2011-07-30 Thread NightStrike
Ping

On Thu, Jul 21, 2011 at 9:20 AM, Kai Tietz  wrote:
> Hello,
>
> this patch adds the ability for bitwise-truth operations to sink into
> their use-statement, if that statement is a cast and its type is compatible.
>
> By this we can sink cases like
>
> _Bool D1, D2, D3;
> int R, x, y;
>
> D1 = (bool) x;
> D2 = (bool) y;
> D3 = D1 & D2
> R = (int) D3;
>
> into R-statement as
> R = x & y;
>
> This fixes known vrp47.c regression.
>
> ChangeLog gcc
>
> 2011-07-21  Kai Tietz  
>
>        * tree-vrp.c (ssa_name_get_inner_ssa_name_p): New helper.
>        (ssa_name_get_cast_to_p): Likewise.
>        (simplify_truth_ops_using_ranges): Try to use type-cast
>        for simplification of bitwise-binary expressions.
>        (simplify_stmt_using_ranges): Try to sink into cast for
>        bitwise-truth operations.
>
> 2011-07-21  Kai Tietz  
>
>        * gcc.dg/tree-ssa/vrp47.c: Adjust testcase.
>
> Bootstrapped and regression tested for all standard languages
> (including Ada and Obj-C++) on
> host x86_64-pc-linux-gnu.  OK to apply?
>
>
> Regards,
> Kai
>
> Index: gcc-head/gcc/tree-vrp.c
> ===
> --- gcc-head.orig/gcc/tree-vrp.c
> +++ gcc-head/gcc/tree-vrp.c
> @@ -6747,19 +6746,92 @@ varying:
>   return SSA_PROP_VARYING;
>  }
>
> +/* Returns operand1 of ssa-name with SSA_NAME as code, Otherwise it
> +   returns NULL_TREE.  */
> +static tree
> +ssa_name_get_inner_ssa_name_p (tree op)
> +{
> +  gimple stmt;
> +
> +  if (TREE_CODE (op) != SSA_NAME
> +      || !is_gimple_assign (SSA_NAME_DEF_STMT (op)))
> +    return NULL_TREE;
> +  stmt = SSA_NAME_DEF_STMT (op);
> +  if (gimple_assign_rhs_code (stmt) != SSA_NAME)
> +    return NULL_TREE;
> +  return gimple_assign_rhs1 (stmt);
> +}
> +
> +/* Returns operand of cast operation, if OP is a type-conversion. Otherwise
> +   return NULL_TREE.  */
> +static tree
> +ssa_name_get_cast_to_p (tree op)
> +{
> +  gimple stmt;
> +
> +  if (TREE_CODE (op) != SSA_NAME
> +      || !is_gimple_assign (SSA_NAME_DEF_STMT (op)))
> +    return NULL_TREE;
> +  stmt = SSA_NAME_DEF_STMT (op);
> +  if (!CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt)))
> +    return NULL_TREE;
> +  return gimple_assign_rhs1 (stmt);
> +}
> +
>  /* Simplify boolean operations if the source is known
>    to be already a boolean.  */
>  static bool
>  simplify_truth_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
>  {
>   enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
> +  gimple stmt2 = stmt;
>   tree val = NULL;
>   tree op0, op1;
>   value_range_t *vr;
>   bool sop = false;
>   bool need_conversion;
> +  location_t loc = gimple_location (stmt);
>
>   op0 = gimple_assign_rhs1 (stmt);
> +  op1 = NULL_TREE;
> +
> +  /* Handle cases with prefixed type-cast.  */
> +  if (CONVERT_EXPR_CODE_P (rhs_code)
> +      && INTEGRAL_TYPE_P (TREE_TYPE (op0))
> +      && TREE_CODE (op0) == SSA_NAME
> +      && is_gimple_assign (SSA_NAME_DEF_STMT (op0))
> +      && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (stmt))))
> +    {
> +      stmt2 = SSA_NAME_DEF_STMT (op0);
> +      op0 = gimple_assign_rhs1 (stmt2);
> +      if (!INTEGRAL_TYPE_P (TREE_TYPE (op0)))
> +       return false;
> +      rhs_code = gimple_assign_rhs_code (stmt2);
> +      if (rhs_code != BIT_NOT_EXPR
> +         && rhs_code != BIT_AND_EXPR
> +         && rhs_code != BIT_IOR_EXPR
> +         && rhs_code != BIT_XOR_EXPR
> +         && rhs_code != NE_EXPR && rhs_code != EQ_EXPR)
> +       return false;
> +
> +      if (rhs_code != BIT_NOT_EXPR)
> +       op1 = gimple_assign_rhs2 (stmt2);
> +
> +      if (gimple_has_location (stmt2))
> +        loc = gimple_location (stmt2);
> +    }
> +  else if (CONVERT_EXPR_CODE_P (rhs_code))
> +    return false;
> +  else if (rhs_code != BIT_NOT_EXPR)
> +    op1 = gimple_assign_rhs2 (stmt);
> +
> +  /* ~X is only equivalent to !X, if type-precision is one and X has
> +     an integral type.  */
> +  if (rhs_code == BIT_NOT_EXPR
> +      && (!INTEGRAL_TYPE_P (TREE_TYPE (op0))
> +         || TYPE_PRECISION (TREE_TYPE (op0)) != 1))
> +    return false;
> +
>   if (TYPE_PRECISION (TREE_TYPE (op0)) != 1)
>     {
>       if (TREE_CODE (op0) != SSA_NAME)
> @@ -6775,19 +6847,83 @@ simplify_truth_ops_using_ranges (gimple_
>         return false;
>     }
>
> -  if (rhs_code == BIT_NOT_EXPR && TYPE_PRECISION (TREE_TYPE (op0)) == 1)
> +  need_conversion =
> +    !useless_type_conversion_p (TREE_TYPE (gimple_assign_lhs (stmt)),
> +                               TREE_TYPE (op0));
> +  /* As comparisons X != 0 getting folded to (bool) X by VRP,
> +     but X == 0 might be not folded for none boolean type of X
> +     to (bool) (X ^ 1), we need to handle this case special
> +     to simplify this.
> +     For bitwise-binary operations we have three cases to handle:
> +     a) ((bool) X) op ((bool) Y)
> +     b) ((bool) X) op (Y == 0) -OR- (X == 0) op ((bool) Y)
> +     c) (X == 0) op (Y == 0)
> +     The latter two cases can't be handled for now, as we would need to
> +