Re: [RFC] Add new flag to specify output constraint in match.pd

2020-09-02 Thread Marc Glisse

On Wed, 2 Sep 2020, Richard Biener via Gcc wrote:


On Mon, Aug 24, 2020 at 8:20 AM Feng Xue OS via Gcc  wrote:



  There is a match-folding issue derived from pr94234.  A piece of code like:

  int foo (int n)
  {
 int t1 = 8 * n;
 int t2 = 8 * (n - 1);

 return t1 - t2;
  }

 It can be perfectly caught by the rule "(A * C) +- (B * C) -> (A +- B) * C", 
and
 be folded to constant "8". But this folding will fail if both v1 and v2 have
 multiple uses, as the following code.

  int foo (int n)
  {
 int t1 = 8 * n;
 int t2 = 8 * (n - 1);

 use_fn (t1, t2);
 return t1 - t2;
  }

 Given an expression with non-single-use operands, folding it will introduce
 duplicated computation in most situations, and is deemed to be unprofitable.
 But it is always beneficial if final result is a constant or existing SSA 
value.

 And the rule is:
  (simplify
   (plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
   (if ((!ANY_INTEGRAL_TYPE_P (type)
|| TYPE_OVERFLOW_WRAPS (type)
|| (INTEGRAL_TYPE_P (type)
&& tree_expr_nonzero_p (@0)
&& expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION (type)
   /* If @1 +- @2 is constant require a hard single-use on either
  original operand (but not on both).  */
   && (single_use (@3) || single_use (@4)))   <- control whether match 
or not
(mult (plusminus @1 @2) @0)))

 Current matcher only provides a way to check something before folding,
 but no mechanism to affect decision after folding. If has, for the above
 case, we can let it go when we find result is a constant.


:s already has a counter-measure where it still folds if the output is at
most one operation. So this transformation has a counter-counter-measure
of checking single_use explicitly. And now we want a counter^3-measure...


Counter-measure is key factor to matching-cost.  ":s" seems to be somewhat
coarse-grained. And here we do need more control over it.

But ideally, we could decouple these counter-measures from definitions of
match-rule, and let gimple-matcher get a more reasonable match-or-not
decision based on these counters. Anyway, it is another story.


 Like the way to describe input operand using flags, we could also add
 a new flag to specify this kind of constraint on output that we expect
 it is a simple gimple value.

 Proposed syntax is

  (opcode:v{ condition } )

 The char "v" stands for gimple value, if more descriptive, other char is
 preferred. "condition" enclosed by { } is an optional c-syntax condition
 expression. If present, only when "condition" is met, matcher will check
 whether folding result is a gimple value using
 gimple_simplified_result_is_gimple_val ().

 Since there is no SSA concept in GENERIC, this is only for GIMPLE-match,
 not GENERIC-match.

 With this syntax, the rule is changed to

 #Form 1:
  (simplify
   (plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
   (if ((!ANY_INTEGRAL_TYPE_P (type)
|| TYPE_OVERFLOW_WRAPS (type)
|| (INTEGRAL_TYPE_P (type)
&& tree_expr_nonzero_p (@0)
&& expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION (type))
   ( if (!single_use (@3) && !single_use (@4))
  (mult:v (plusminus @1 @2) @0)))
  (mult (plusminus @1 @2) @0)


That seems to match what you can do with '!' now (that's very recent).


It's also what :s does but a slight bit more "local".  When any operand is
marked :s and it has more than a single-use we only allow simplifications
that do not require insertion of extra stmts.  So basically the above pattern
doesn't behave any different than if you omit your :v.  Only if you'd
place :v on an inner expression there would be a difference.  Correlating
the inner expression we'd not want to insert new expressions for with
a specific :s (or multiple ones) would be a more natural extension of what
:s provides.

Thus, for the above case (Form 1), you do not need :v at all and :s works.


Let's consider that multiplication is expensive. We have code like 
5*X-3*X, which can be simplified to 2*X. However, if both 5*X and 3*X have 
other uses, that would increase the number of multiplications. :s would 
not block a simplification to 2*X, which is a single stmt. So the existing 
transformation has extra explicit checks for single_use. And those extra 
checks block the transformation even for 5*X-4*X -> X which does not 
increase the number of multiplications. Which is where '!' (or :v here) 
comes in.


Or we could decide that the extra multiplication is not that bad if it 
saves an addition, simplifies the expression, possibly gains more insn 
parallelism, etc, in which case we could just drop the existing hard 
single_use check...


--
Marc Glisse


Re: [RFC] Add new flag to specify output constraint in match.pd

2020-09-02 Thread Feng Xue OS via Gcc
>
>>
>> >>   There is a match-folding issue derived from pr94234.  A piece of code 
>> >> like:
>> >>
>> >>   int foo (int n)
>> >>   {
>> >>  int t1 = 8 * n;
>> >>  int t2 = 8 * (n - 1);
>> >>
>> >>  return t1 - t2;
>> >>   }
>> >>
>> >>  It can be perfectly caught by the rule "(A * C) +- (B * C) -> (A +- B) * 
>> >> C", and
>> >>  be folded to constant "8". But this folding will fail if both v1 and v2 
>> >> have
>> >>  multiple uses, as the following code.
>> >>
>> >>   int foo (int n)
>> >>   {
>> >>  int t1 = 8 * n;
>> >>  int t2 = 8 * (n - 1);
>> >>
>> >>  use_fn (t1, t2);
>> >>  return t1 - t2;
>> >>   }
>> >>
>> >>  Given an expression with non-single-use operands, folding it will 
>> >> introduce
>> >>  duplicated computation in most situations, and is deemed to be 
>> >> unprofitable.
>> >>  But it is always beneficial if final result is a constant or existing 
>> >> SSA value.
>> >>
>> >>  And the rule is:
>> >>   (simplify
>> >>(plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
>> >>(if ((!ANY_INTEGRAL_TYPE_P (type)
>> >> || TYPE_OVERFLOW_WRAPS (type)
>> >> || (INTEGRAL_TYPE_P (type)
>> >> && tree_expr_nonzero_p (@0)
>> >> && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION 
>> >> (type)
>> >>/* If @1 +- @2 is constant require a hard single-use on either
>> >>   original operand (but not on both).  */
>> >>&& (single_use (@3) || single_use (@4)))   <- control whether 
>> >> match or not
>> >> (mult (plusminus @1 @2) @0)))
>> >>
>> >>  Current matcher only provides a way to check something before folding,
>> >>  but no mechanism to affect decision after folding. If has, for the above
>> >>  case, we can let it go when we find result is a constant.
>> >
>> > :s already has a counter-measure where it still folds if the output is at
>> > most one operation. So this transformation has a counter-counter-measure
>> > of checking single_use explicitly. And now we want a counter^3-measure...
>> >
>> Counter-measure is key factor to matching-cost.  ":s" seems to be somewhat
>> coarse-grained. And here we do need more control over it.
>>
>> But ideally, we could decouple these counter-measures from definitions of
>> match-rule, and let gimple-matcher get a more reasonable match-or-not
>> decision based on these counters. Anyway, it is another story.
>>
>> >>  Like the way to describe input operand using flags, we could also add
>> >>  a new flag to specify this kind of constraint on output that we expect
>> >>  it is a simple gimple value.
>> >>
>> >>  Proposed syntax is
>> >>
>> >>   (opcode:v{ condition } )
>> >>
>> >>  The char "v" stands for gimple value, if more descriptive, other char is
>> >>  preferred. "condition" enclosed by { } is an optional c-syntax condition
>> >>  expression. If present, only when "condition" is met, matcher will check
>> >>  whether folding result is a gimple value using
>> >>  gimple_simplified_result_is_gimple_val ().
>> >>
>> >>  Since there is no SSA concept in GENERIC, this is only for GIMPLE-match,
>> >>  not GENERIC-match.
>> >>
>> >>  With this syntax, the rule is changed to
>> >>
>> >>  #Form 1:
>> >>   (simplify
>> >>(plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
>> >>(if ((!ANY_INTEGRAL_TYPE_P (type)
>> >> || TYPE_OVERFLOW_WRAPS (type)
>> >> || (INTEGRAL_TYPE_P (type)
>> >> && tree_expr_nonzero_p (@0)
>> >> && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION 
>> >> (type))
>> >>( if (!single_use (@3) && !single_use (@4))
>> >>   (mult:v (plusminus @1 @2) @0)))
>> >>   (mult (plusminus @1 @2) @0)
>> >
>> > That seems to match what you can do with '!' now (that's very recent).
>
> It's also what :s does but a slight bit more "local".  When any operand is
> marked :s and it has more than a single-use we only allow simplifications
> that do not require insertion of extra stmts.  So basically the above pattern
> doesn't behave any different than if you omit your :v.  Only if you'd
> place :v on an inner expression there would be a difference.  Correlating
> the inner expression we'd not want to insert new expressions for with
> a specific :s (or multiple ones) would be a more natural extension of what
> :s provides.
>
> Thus, for the above case (Form 1), you do not need :v at all and :s works.

Between ":s" and ":v", there is a subtle difference. ":s" only ensures interior
transform does not insert any new stmts, but this is not true for final one.

Code snippet generated for (A * C) +- (B * C) -> (A+-B) * C:
 
  gimple_seq *lseq = seq;
  if (lseq
  && (!single_use (captures[0])
  || !single_use (captures[3])))
lseq = NULL;
  if (__builtin_expect (!dbg_cnt (match), 0)) goto next_after_fail621;
  if (__builtin_expect (dump_file && (dump_flags & TDF_FOLDING), 0))  
fprintf (dump_file, "Applying patte

Re: [RFC] Add new flag to specify output constraint in match.pd

2020-09-02 Thread Richard Biener via Gcc
On Wed, Sep 2, 2020 at 9:27 AM Marc Glisse  wrote:
>
> On Wed, 2 Sep 2020, Richard Biener via Gcc wrote:
>
> > On Mon, Aug 24, 2020 at 8:20 AM Feng Xue OS via Gcc  wrote:
> >>
>    There is a match-folding issue derived from pr94234.  A piece of code 
>  like:
> 
>    int foo (int n)
>    {
>   int t1 = 8 * n;
>   int t2 = 8 * (n - 1);
> 
>   return t1 - t2;
>    }
> 
>   It can be perfectly caught by the rule "(A * C) +- (B * C) -> (A +- B) 
>  * C", and
>   be folded to constant "8". But this folding will fail if both v1 and v2 
>  have
>   multiple uses, as the following code.
> 
>    int foo (int n)
>    {
>   int t1 = 8 * n;
>   int t2 = 8 * (n - 1);
> 
>   use_fn (t1, t2);
>   return t1 - t2;
>    }
> 
>   Given an expression with non-single-use operands, folding it will 
>  introduce
>   duplicated computation in most situations, and is deemed to be 
>  unprofitable.
>   But it is always beneficial if final result is a constant or existing 
>  SSA value.
> 
>   And the rule is:
>    (simplify
> (plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
> (if ((!ANY_INTEGRAL_TYPE_P (type)
>  || TYPE_OVERFLOW_WRAPS (type)
>  || (INTEGRAL_TYPE_P (type)
>  && tree_expr_nonzero_p (@0)
>  && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION 
>  (type)
> /* If @1 +- @2 is constant require a hard single-use on either
>    original operand (but not on both).  */
> && (single_use (@3) || single_use (@4)))   <- control whether 
>  match or not
>  (mult (plusminus @1 @2) @0)))
> 
>   Current matcher only provides a way to check something before folding,
>   but no mechanism to affect decision after folding. If has, for the above
>   case, we can let it go when we find result is a constant.
> >>>
> >>> :s already has a counter-measure where it still folds if the output is at
> >>> most one operation. So this transformation has a counter-counter-measure
> >>> of checking single_use explicitly. And now we want a counter^3-measure...
> >>>
> >> Counter-measure is key factor to matching-cost.  ":s" seems to be somewhat
> >> coarse-grained. And here we do need more control over it.
> >>
> >> But ideally, we could decouple these counter-measures from definitions of
> >> match-rule, and let gimple-matcher get a more reasonable match-or-not
> >> decision based on these counters. Anyway, it is another story.
> >>
>   Like the way to describe input operand using flags, we could also add
>   a new flag to specify this kind of constraint on output that we expect
>   it is a simple gimple value.
> 
>   Proposed syntax is
> 
>    (opcode:v{ condition } )
> 
>   The char "v" stands for gimple value, if more descriptive, other char is
>   preferred. "condition" enclosed by { } is an optional c-syntax condition
>   expression. If present, only when "condition" is met, matcher will check
>   whether folding result is a gimple value using
>   gimple_simplified_result_is_gimple_val ().
> 
>   Since there is no SSA concept in GENERIC, this is only for GIMPLE-match,
>   not GENERIC-match.
> 
>   With this syntax, the rule is changed to
> 
>   #Form 1:
>    (simplify
> (plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
> (if ((!ANY_INTEGRAL_TYPE_P (type)
>  || TYPE_OVERFLOW_WRAPS (type)
>  || (INTEGRAL_TYPE_P (type)
>  && tree_expr_nonzero_p (@0)
>  && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION 
>  (type))
> ( if (!single_use (@3) && !single_use (@4))
>    (mult:v (plusminus @1 @2) @0)))
>    (mult (plusminus @1 @2) @0)
> >>>
> >>> That seems to match what you can do with '!' now (that's very recent).
> >
> > It's also what :s does but a slight bit more "local".  When any operand is
> > marked :s and it has more than a single-use we only allow simplifications
> > that do not require insertion of extra stmts.  So basically the above 
> > pattern
> > doesn't behave any different than if you omit your :v.  Only if you'd
> > place :v on an inner expression there would be a difference.  Correlating
> > the inner expression we'd not want to insert new expressions for with
> > a specific :s (or multiple ones) would be a more natural extension of what
> > :s provides.
> >
> > Thus, for the above case (Form 1), you do not need :v at all and :s works.
>
> Let's consider that multiplication is expensive. We have code like
> 5*X-3*X, which can be simplified to 2*X. However, if both 5*X and 3*X have
> other uses, that would increase the number of multiplications. :s would
> not block a simplification to 2*X, which is a single stmt. So

Re: [RFC] Add new flag to specify output constraint in match.pd

2020-09-02 Thread Richard Biener via Gcc
On Wed, Sep 2, 2020 at 9:35 AM Feng Xue OS  wrote:
>
> >
> >>
> >> >>   There is a match-folding issue derived from pr94234.  A piece of code 
> >> >> like:
> >> >>
> >> >>   int foo (int n)
> >> >>   {
> >> >>  int t1 = 8 * n;
> >> >>  int t2 = 8 * (n - 1);
> >> >>
> >> >>  return t1 - t2;
> >> >>   }
> >> >>
> >> >>  It can be perfectly caught by the rule "(A * C) +- (B * C) -> (A +- B) 
> >> >> * C", and
> >> >>  be folded to constant "8". But this folding will fail if both v1 and 
> >> >> v2 have
> >> >>  multiple uses, as the following code.
> >> >>
> >> >>   int foo (int n)
> >> >>   {
> >> >>  int t1 = 8 * n;
> >> >>  int t2 = 8 * (n - 1);
> >> >>
> >> >>  use_fn (t1, t2);
> >> >>  return t1 - t2;
> >> >>   }
> >> >>
> >> >>  Given an expression with non-single-use operands, folding it will 
> >> >> introduce
> >> >>  duplicated computation in most situations, and is deemed to be 
> >> >> unprofitable.
> >> >>  But it is always beneficial if final result is a constant or existing 
> >> >> SSA value.
> >> >>
> >> >>  And the rule is:
> >> >>   (simplify
> >> >>(plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
> >> >>(if ((!ANY_INTEGRAL_TYPE_P (type)
> >> >> || TYPE_OVERFLOW_WRAPS (type)
> >> >> || (INTEGRAL_TYPE_P (type)
> >> >> && tree_expr_nonzero_p (@0)
> >> >> && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION 
> >> >> (type)
> >> >>/* If @1 +- @2 is constant require a hard single-use on either
> >> >>   original operand (but not on both).  */
> >> >>&& (single_use (@3) || single_use (@4)))   <- control 
> >> >> whether match or not
> >> >> (mult (plusminus @1 @2) @0)))
> >> >>
> >> >>  Current matcher only provides a way to check something before folding,
> >> >>  but no mechanism to affect decision after folding. If has, for the 
> >> >> above
> >> >>  case, we can let it go when we find result is a constant.
> >> >
> >> > :s already has a counter-measure where it still folds if the output is at
> >> > most one operation. So this transformation has a counter-counter-measure
> >> > of checking single_use explicitly. And now we want a counter^3-measure...
> >> >
> >> Counter-measure is key factor to matching-cost.  ":s" seems to be somewhat
> >> coarse-grained. And here we do need more control over it.
> >>
> >> But ideally, we could decouple these counter-measures from definitions of
> >> match-rule, and let gimple-matcher get a more reasonable match-or-not
> >> decision based on these counters. Anyway, it is another story.
> >>
> >> >>  Like the way to describe input operand using flags, we could also add
> >> >>  a new flag to specify this kind of constraint on output that we expect
> >> >>  it is a simple gimple value.
> >> >>
> >> >>  Proposed syntax is
> >> >>
> >> >>   (opcode:v{ condition } )
> >> >>
> >> >>  The char "v" stands for gimple value, if more descriptive, other char 
> >> >> is
> >> >>  preferred. "condition" enclosed by { } is an optional c-syntax 
> >> >> condition
> >> >>  expression. If present, only when "condition" is met, matcher will 
> >> >> check
> >> >>  whether folding result is a gimple value using
> >> >>  gimple_simplified_result_is_gimple_val ().
> >> >>
> >> >>  Since there is no SSA concept in GENERIC, this is only for 
> >> >> GIMPLE-match,
> >> >>  not GENERIC-match.
> >> >>
> >> >>  With this syntax, the rule is changed to
> >> >>
> >> >>  #Form 1:
> >> >>   (simplify
> >> >>(plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
> >> >>(if ((!ANY_INTEGRAL_TYPE_P (type)
> >> >> || TYPE_OVERFLOW_WRAPS (type)
> >> >> || (INTEGRAL_TYPE_P (type)
> >> >> && tree_expr_nonzero_p (@0)
> >> >> && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION 
> >> >> (type))
> >> >>( if (!single_use (@3) && !single_use (@4))
> >> >>   (mult:v (plusminus @1 @2) @0)))
> >> >>   (mult (plusminus @1 @2) @0)
> >> >
> >> > That seems to match what you can do with '!' now (that's very recent).
> >
> > It's also what :s does but a slight bit more "local".  When any operand is
> > marked :s and it has more than a single-use we only allow simplifications
> > that do not require insertion of extra stmts.  So basically the above 
> > pattern
> > doesn't behave any different than if you omit your :v.  Only if you'd
> > place :v on an inner expression there would be a difference.  Correlating
> > the inner expression we'd not want to insert new expressions for with
> > a specific :s (or multiple ones) would be a more natural extension of what
> > :s provides.
> >
> > Thus, for the above case (Form 1), you do not need :v at all and :s works.
>
> Between ":s" and ":v", there is a subtle difference. ":s" only ensures 
> interior
> transform does not insert any new stmts, but this is not true for final one.
>
> Code snippet generated for (A * C) +- (B * C) -> (A+-B) * C:
>
>   gimple_seq *lseq = s

Re: [RFC] Add new flag to specify output constraint in match.pd

2020-09-02 Thread Feng Xue OS via Gcc
>>
>> >
>> >>
>> >> >>   There is a match-folding issue derived from pr94234.  A piece of 
>> >> >> code like:
>> >> >>
>> >> >>   int foo (int n)
>> >> >>   {
>> >> >>  int t1 = 8 * n;
>> >> >>  int t2 = 8 * (n - 1);
>> >> >>
>> >> >>  return t1 - t2;
>> >> >>   }
>> >> >>
>> >> >>  It can be perfectly caught by the rule "(A * C) +- (B * C) -> (A +- 
>> >> >> B) * C", and
>> >> >>  be folded to constant "8". But this folding will fail if both v1 and 
>> >> >> v2 have
>> >> >>  multiple uses, as the following code.
>> >> >>
>> >> >>   int foo (int n)
>> >> >>   {
>> >> >>  int t1 = 8 * n;
>> >> >>  int t2 = 8 * (n - 1);
>> >> >>
>> >> >>  use_fn (t1, t2);
>> >> >>  return t1 - t2;
>> >> >>   }
>> >> >>
>> >> >>  Given an expression with non-single-use operands, folding it will 
>> >> >> introduce
>> >> >>  duplicated computation in most situations, and is deemed to be 
>> >> >> unprofitable.
>> >> >>  But it is always beneficial if final result is a constant or existing 
>> >> >> SSA value.
>> >> >>
>> >> >>  And the rule is:
>> >> >>   (simplify
>> >> >>(plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
>> >> >>(if ((!ANY_INTEGRAL_TYPE_P (type)
>> >> >> || TYPE_OVERFLOW_WRAPS (type)
>> >> >> || (INTEGRAL_TYPE_P (type)
>> >> >> && tree_expr_nonzero_p (@0)
>> >> >> && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION 
>> >> >> (type)
>> >> >>/* If @1 +- @2 is constant require a hard single-use on either
>> >> >>   original operand (but not on both).  */
>> >> >>&& (single_use (@3) || single_use (@4)))   <- control 
>> >> >> whether match or not
>> >> >> (mult (plusminus @1 @2) @0)))
>> >> >>
>> >> >>  Current matcher only provides a way to check something before folding,
>> >> >>  but no mechanism to affect decision after folding. If has, for the 
>> >> >> above
>> >> >>  case, we can let it go when we find result is a constant.
>> >> >
>> >> > :s already has a counter-measure where it still folds if the output is 
>> >> > at
>> >> > most one operation. So this transformation has a counter-counter-measure
>> >> > of checking single_use explicitly. And now we want a 
>> >> > counter^3-measure...
>> >> >
>> >> Counter-measure is key factor to matching-cost.  ":s" seems to be somewhat
>> >> coarse-grained. And here we do need more control over it.
>> >>
>> >> But ideally, we could decouple these counter-measures from definitions of
>> >> match-rule, and let gimple-matcher get a more reasonable match-or-not
>> >> decision based on these counters. Anyway, it is another story.
>> >>
>> >> >>  Like the way to describe input operand using flags, we could also add
>> >> >>  a new flag to specify this kind of constraint on output that we expect
>> >> >>  it is a simple gimple value.
>> >> >>
>> >> >>  Proposed syntax is
>> >> >>
>> >> >>   (opcode:v{ condition } )
>> >> >>
>> >> >>  The char "v" stands for gimple value, if more descriptive, other char 
>> >> >> is
>> >> >>  preferred. "condition" enclosed by { } is an optional c-syntax 
>> >> >> condition
>> >> >>  expression. If present, only when "condition" is met, matcher will 
>> >> >> check
>> >> >>  whether folding result is a gimple value using
>> >> >>  gimple_simplified_result_is_gimple_val ().
>> >> >>
>> >> >>  Since there is no SSA concept in GENERIC, this is only for 
>> >> >> GIMPLE-match,
>> >> >>  not GENERIC-match.
>> >> >>
>> >> >>  With this syntax, the rule is changed to
>> >> >>
>> >> >>  #Form 1:
>> >> >>   (simplify
>> >> >>(plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
>> >> >>(if ((!ANY_INTEGRAL_TYPE_P (type)
>> >> >> || TYPE_OVERFLOW_WRAPS (type)
>> >> >> || (INTEGRAL_TYPE_P (type)
>> >> >> && tree_expr_nonzero_p (@0)
>> >> >> && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION 
>> >> >> (type))
>> >> >>( if (!single_use (@3) && !single_use (@4))
>> >> >>   (mult:v (plusminus @1 @2) @0)))
>> >> >>   (mult (plusminus @1 @2) @0)
>> >> >
>> >> > That seems to match what you can do with '!' now (that's very recent).
>> >
>> > It's also what :s does but a slight bit more "local".  When any operand is
>> > marked :s and it has more than a single-use we only allow simplifications
>> > that do not require insertion of extra stmts.  So basically the above 
>> > pattern
>> > doesn't behave any different than if you omit your :v.  Only if you'd
>> > place :v on an inner expression there would be a difference.  Correlating
>> > the inner expression we'd not want to insert new expressions for with
>> > a specific :s (or multiple ones) would be a more natural extension of what
>> > :s provides.
>> >
>> > Thus, for the above case (Form 1), you do not need :v at all and :s works.
>>
>> Between ":s" and ":v", there is a subtle difference. ":s" only ensures 
>> interior
>> transform does not insert any new stmts, but this is not true for final one.
>>
>> Code s

Re: [RFC] Add new flag to specify output constraint in match.pd

2020-09-02 Thread Alexander Monakov via Gcc
On Wed, 2 Sep 2020, Richard Biener via Gcc wrote:

> > Or we could decide that the extra multiplication is not that bad if it
> > saves an addition, simplifies the expression, possibly gains more insn
> > parallelism, etc, in which case we could just drop the existing hard
> > single_use check...
> 
> ;)
> 
> it's all going to be heuristics because the decision is made
> locally.  If you consider the other two uses might be
> 5*X + 3*X then converting both to 2*X and 8*X should be
> a win because we save one addition and one subtraction.
> But that would require folding everything tentatively and
> then applying some global costing ...

Arguably simplification on GIMPLE should fold towards what is

1) declared "canonical form"

2) cheaper in the C abstract machine ;)

and let machine-specific passes "expand" the result as needed towards
higher ILP or whatever the concrete machine prefers :)

Alexander


Question about exporting omputing alias sets

2020-09-02 Thread Erick Ochoa

Hello,

I am trying to find out all pointers which alias a pointer and place 
them in a set.


I am using `ptr_derefs_may_alias_p` to find out if two pointers may 
point to the same memory location. I think this yields conservative 
results (i.e., when it cannot be proven that to pointers may alias, it 
will give the result "true"). This is what I need.


To collect all pointers which alias pointer "p", this means is that I 
have to collect all pointer variables and all pointer expressions in the 
program and then call `ptr_derefs_may_alias_p` O(n) times to find the 
set of pointers which alias pointer "p". If I have to do this for all 
pointers in the program, that would mean O(n^2).


This also implies that in order for my sets to be correct I need to 
collect all pointer variables and all pointer expressions. I think that 
a better idea would be to have a list of all bitmap solutions and for 
every variable with a bitmap if position j is set, assigned to alias set 
j. In pseudocode:


compute_alias_sets(bitmap set)
{
  // create array of sets of same length as varmap
  // call it alias_map
  for (i = 0; i < varmap.length (); i++)
  {
bitmap set = get_varinfo(i)->solution;
EXECUTE_IF_SET_IN_BITMAP (set, 0, j, bi)
{
  varinfo_t v = get_varinfo (j);
  alias_map[j].insert(v);
}
  }
}

I think this is a better implementation. I don't need to worry about 
collecting all pointer expressions and I think varmap also contains 
declarations. (I think IPA-PTA already has "collected" all pointer 
expressions/declarations in varmap and created constraint variables for 
them.) *Would there be an issue with exporting varmap and delaying its 
deletion to this pass I'm working on?* This pass is intended to be run 
immediately after IPA-PTA. After this, I think I would still need to do 
some refining on the "alias_map" since (I think) there are more 
constraint variables than gimple variables in the partition, but the 
information I look for should be able to be derived from here.


Thanks!


Re: Question about exporting omputing alias sets

2020-09-02 Thread Richard Biener via Gcc
On Wed, Sep 2, 2020 at 10:04 AM Erick Ochoa
 wrote:
>
> Hello,
>
> I am trying to find out all pointers which alias a pointer and place
> them in a set.
>
> I am using `ptr_derefs_may_alias_p` to find out if two pointers may
> point to the same memory location. I think this yields conservative
> results (i.e., when it cannot be proven that to pointers may alias, it
> will give the result "true"). This is what I need.
>
> To collect all pointers which alias pointer "p", this means is that I
> have to collect all pointer variables and all pointer expressions in the
> program and then call `ptr_derefs_may_alias_p` O(n) times to find the
> set of pointers which alias pointer "p". If I have to do this for all
> pointers in the program, that would mean O(n^2).
>
> This also implies that in order for my sets to be correct I need to
> collect all pointer variables and all pointer expressions. I think that
> a better idea would be to have a list of all bitmap solutions and for
> every variable with a bitmap if position j is set, assigned to alias set
> j. In pseudocode:
>
> compute_alias_sets(bitmap set)
> {
>// create array of sets of same length as varmap
>// call it alias_map
>for (i = 0; i < varmap.length (); i++)
>{
>  bitmap set = get_varinfo(i)->solution;
>  EXECUTE_IF_SET_IN_BITMAP (set, 0, j, bi)
>  {
>varinfo_t v = get_varinfo (j);
>alias_map[j].insert(v);
>  }
>}
> }
>
> I think this is a better implementation. I don't need to worry about
> collecting all pointer expressions and I think varmap also contains
> declarations. (I think IPA-PTA already has "collected" all pointer
> expressions/declarations in varmap and created constraint variables for
> them.) *Would there be an issue with exporting varmap and delaying its
> deletion to this pass I'm working on?* This pass is intended to be run
> immediately after IPA-PTA. After this, I think I would still need to do
> some refining on the "alias_map" since (I think) there are more
> constraint variables than gimple variables in the partition, but the
> information I look for should be able to be derived from here.

I think you should add the code to compute your "alias sets" to
IPA-PTA if struct-reorg is enabled and export those like
we "export" the IPA escaped set (ipa_escaped_pt) to later
passes.

I agree that you want to work on the points-to solutions rather
than start from pointers and expressions in GIMPLE.

Richard.

> Thanks!


Re: Future debug options: -f* or -g*?

2020-09-02 Thread Mark Wielaard
Hi David,

On Mon, 2020-08-31 at 20:10 -0700, David Blaikie wrote:
> Hey Mark - saw a little of/bits about your presentation at LPC 2020 GNU
> Tools Track (& your thread on on the gdb list about debug_names). Wondering
> if you (or anyone else you know who's contributing to debug info in GCC)
> have some thoughts on this flag naming issue. It'd be great to get some
> alignment between GCC and Clang here, so as we both add new flags going
> forward, at least they're at least categorically consistent for users, even
> if we aren't necessarily implementing the exact same flags/flag names
> (though in the -gsplit-dwarf case, it'd be good for any new semantics/name
> to match exactly).

To be honest I don't have any deep insights here. Sadly it is just a
bit of a mess. Not helped by the fact that some debugging formats, like
DWARF have a number in their name (dwarf2) that might not reflect
actually reflext the version of that format (except for it being not-
dwarf1). That is why I was suggesting to use -fdwarf(32|64) instead of
-gdwarf(32|64) to use for 32 bits vs 64 bits offsets in dwarf(2|3|4|5),
because there are already too many -gdwarf[number] options.

The idea of having any of the -fs not imply -g and having
any of the -gs imply -g seems a good one.

But in general I think these suboptions are all a bit nonsense though.
They are great for tinkering and playing with corner cases of the
debugging format. But I don't think we are helping real developers with
them. To make any of them useful they should be in one of the standard
options that people actually use. Nobody has time (or the specialized
knowledge) to really care about all these suboptions. Once people feel
they need to use any of these suboptions we already lost or the user
has such specialized knowledge that they will know about specific
changes in these options or between compilers.

So just like we have -Wall or -Wextra has groups of warning options
people are likely to use and we have -O1, -O2 and -O3 (I personally
believe -flto should just be in -O3 instead of being a separate
option), we should simply focus on making -g1, -g2 and -g3 have good
defaults. Where -g1 is the minimum needed to do some
profiling/tracing/debugging without needing to map all addresses back
to source completely but can leave out full type trees for example, -g2
is the default for being able to do a useful interactive debugging
session and -g3 adds some (expensive) extras like .debug_macros and
maybe .debug_names, etc.

With respect to -gsplit-dwarf I think it was (in hindsight) a mistake
to put things in a separate (.dwo) file. It is a bit confusing to users
and complicates build systems immensely. I am hoping to implement
-gsplit-dwarf=single using SHF_EXCLUDED for gcc and then make that the
default.

Cheers,

Mark


Avoiding truncate/sign-extend of POImode on ppc target

2020-09-02 Thread Aaron Sawdey via Gcc


PR96791 is happening because DSE is trying to truncate a
POImode reg down to DImode. The POImode is created by a
structure copy that gets inline expanded using lxvp/stxvp
which we have defined using POImode. DSE recognizes that a
following load overlaps with the stxvp and can be satisfied
by a truncation of the POImode reg used for the stxvp.

The intention of using POImode was to avoid having to fully
define OImode and it seems like defining these truncations
would be the first step along that path.  If I haven't
misrepresented it, I believe that is Segher's position on
this.

The function extract_low_bits() in expmed.c does some
querying of the target hook modes_tieable_p but not in any
way that allows us to tell it not to allow truncation of
POImode to a smaller mode.  As a result we run into the ICE
because the patterns are not provided to do this.

So, I put it to the community: Is there an existing way to do
this?  Or, do we need another target hook of some kind to
check this sort of thing?

Thanks,
Aaron


Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain
 



Re: Avoiding truncate/sign-extend of POImode on ppc target

2020-09-02 Thread Aaron Sawdey via Gcc
Meant to CC a few people, oops.

Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain
 

> On Sep 2, 2020, at 9:22 AM, Aaron Sawdey via Gcc  wrote:
> 
> 
> PR96791 is happening because DSE is trying to truncate a
> POImode reg down to DImode. The POImode is created by a
> structure copy that gets inline expanded using lxvp/stxvp
> which we have defined using POImode. DSE recognizes that a
> following load overlaps with the stxvp and can be satisfied
> by a truncation of the POImode reg used for the stxvp.
> 
> The intention of using POImode was to avoid having to fully
> define OImode and it seems like defining these truncations
> would be the first step along that path.  If I haven't
> misrepresented it, I believe that is Segher's position on
> this.
> 
> The function extract_low_bits() in expmed.c does some
> querying of the target hook modes_tieable_p but not in any
> way that allows us to tell it not to allow truncation of
> POImode to a smaller mode.  As a result we run into the ICE
> because the patterns are not provided to do this.
> 
> So, I put it to the community: Is there an existing way to do
> this?  Or, do we need another target hook of some kind to
> check this sort of thing?
> 
> Thanks,
>Aaron
> 
> 
> Aaron Sawdey, Ph.D. saw...@linux.ibm.com
> IBM Linux on POWER Toolchain
> 
> 



Types are confused in inlining

2020-09-02 Thread Gary Oblock via Gcc
I'm not accusing inlining of having problems but I really
need to understand what's going on in this situation so I can
fix my optimization.

The error given is:
main.c: In function ‘main’:
main.c:5:1: error: non-trivial conversion in ‘ssa_name’
5 | main(void)
  | ^
struct type_t *
unsigned long
_101 = dedangled_97;
during GIMPLE pass: fixup_cfg
etc.
etc.

I put a conditional breakpoint in gdb where both
_101 and dedangled_97 were created and low
and behold they were both set to "unsigned long".
Does anybody have a clue as to how "_101" got
changed from "unsigned long" to "struct type_t *"?
Note, the later is a meaningful type in my program.
I'm trying to replace all instances of the former as
part of structure reorganization optimization.) I should
mention that this GIMPLE stmt is the one that moves
the value computed in an inlined function into the body
of code where the inling took place.

Thanks,

Gary Oblock





CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is for 
the sole use of the intended recipient(s) and contains information that is 
confidential and proprietary to Ampere Computing or its subsidiaries. It is to 
be used solely for the purpose of furthering the parties' business 
relationship. Any unauthorized review, copying, or distribution of this email 
(or any attachments thereto) is strictly prohibited. If you are not the 
intended recipient, please contact the sender immediately and permanently 
delete the original and any copies of this email and any attachments thereto.


Re: Future debug options: -f* or -g*?

2020-09-02 Thread David Blaikie via Gcc
On Wed, Sep 2, 2020 at 4:12 AM Mark Wielaard  wrote:

> Hi David,
>
> On Mon, 2020-08-31 at 20:10 -0700, David Blaikie wrote:
> > Hey Mark - saw a little of/bits about your presentation at LPC 2020 GNU
> > Tools Track (& your thread on on the gdb list about debug_names).
> Wondering
> > if you (or anyone else you know who's contributing to debug info in GCC)
> > have some thoughts on this flag naming issue. It'd be great to get some
> > alignment between GCC and Clang here, so as we both add new flags going
> > forward, at least they're at least categorically consistent for users,
> even
> > if we aren't necessarily implementing the exact same flags/flag names
> > (though in the -gsplit-dwarf case, it'd be good for any new
> semantics/name
> > to match exactly).
>
> To be honest I don't have any deep insights here. Sadly it is just a
> bit of a mess.


Happen to know any other interested parties (other GCC DWARF contributors
who might have an opinion on flag naming) who might want to weigh in/help
us converge here?


> Not helped by the fact that some debugging formats, like
> DWARF have a number in their name (dwarf2) that might not reflect
> actually reflext the version of that format (except for it being not-
> dwarf1). That is why I was suggesting to use -fdwarf(32|64) instead of
> -gdwarf(32|64) to use for 32 bits vs 64 bits offsets in dwarf(2|3|4|5),
> because there are already too many -gdwarf[number] options.
>
> The idea of having any of the -fs not imply -g and having
> any of the -gs imply -g seems a good one.
>

Cool cool.


> But in general I think these suboptions are all a bit nonsense though.
> They are great for tinkering and playing with corner cases of the
> debugging format. But I don't think we are helping real developers with
> them. To make any of them useful they should be in one of the standard
> options that people actually use. Nobody has time (or the specialized
> knowledge) to really care about all these suboptions. Once people feel
> they need to use any of these suboptions we already lost or the user
> has such specialized knowledge that they will know about specific
> changes in these options or between compilers.
>

Perhaps - I think there are different use cases. A big one in LLVM we call
-fstandalone-debug: Are you building everything with debug info, or only
one or two files/can the compiler assume other files will be built with
debug info. (this affects basically the same things as
-femit-class-debug-always). Split DWARF for users with distributed build
systems. Several options can weight object size compared to linked
executable size - such as using base address selection entries in range and
location lists (reducing relocations). Obviously I'm biased here - I live
in this stuff & have users/use cases that are... not the norm.


> So just like we have -Wall or -Wextra has groups of warning options
> people are likely to use and we have -O1, -O2 and -O3 (I personally
> believe -flto should just be in -O3 instead of being a separate
> option),


I imagine quite a few users build with optimization for particular CPUs -
though, again, partly my bias of working on a homogenous/controlled
environment in data centers, rather than trying to build a binary to send
to random users in the world. But I'd guess games and things that want to
eek out performance probably want to use modern CPU features if available.


> we should simply focus on making -g1, -g2 and -g3 have good
> defaults. Where -g1 is the minimum needed to do some
> profiling/tracing/debugging without needing to map all addresses back
> to source completely but can leave out full type trees for example, -g2
> is the default for being able to do a useful interactive debugging
> session and -g3 adds some (expensive) extras like .debug_macros and
> maybe .debug_names, etc.
>

Maybe. I'd guess a fair few users (especially in their continuous
integration systems, etc) might have some different tradeoffs in terms of
compile/link time and debugging/debugger startup time (indexes are a
particularly potent example of this - pay some link time for some debugger
startup improvement - but if I don't use a debugger very often, I may not
want to pay that link time cost all the time).


> With respect to -gsplit-dwarf I think it was (in hindsight) a mistake
> to put things in a separate (.dwo) file. It is a bit confusing to users
> and complicates build systems immensely. I am hoping to implement
> -gsplit-dwarf=single using SHF_EXCLUDED for gcc and then make that the
> default.
>

Yeah, we've got support for that model in Clang - but to be honest, one of
the major motivations for implementing Split DWARF was Google's internal
use (big distributed build system) & the split mode isn't a mistake in this
context - not having to ship all the DWARF bytes to the machine doing the
link is a significant savings. (even if we ship all those bytes to another
machine to make a dwp) and smaller files to pull down when a user wants to
debug fr

git hooks update

2020-09-02 Thread Joseph Myers
I've updated the gcc-reposurgeon-8 test repository 
(git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-8.git) to use the 
same copy of the git hooks as used by binutils-gdb, glibc and other 
repositories on sourceware.

All the features from local hook changes are now handled, with this new 
version, through new configuration variables (in project.config on 
refs/meta/config) and three new scripts referenced in project.config: 
commit_checker, commit_email_formatter and update_hook in 
~gccadmin/hooks-bin.  This version also has various upstream improvements.  
In particular, the email-new-commits-only variable means that no commit 
emails should be sent for commits added to refs under refs/heads/devel/, 
refs/users/ or refs/vendors/ that were already in the repository (so with 
the new hooks, merges to and rebases of such branches should not generate 
floods of commit emails).

Please test whether the hooks are working as expected with the expected 
checks properly implemented in gcc-reposurgeon-8.  If they're working OK, 
we can switch over in the production repository by changing the 
/git/gcc.git/hooks symlink to point to 
/sourceware/projects/src-home/git-hooks/hooks instead of 
/home/gccadmin/git-hooks/hooks, and uncommenting the 
allow-non-fast-forward settings in project.config in the production 
repository.

Once we've switched over, we'll be using a copy of the hooks that is 
shared by other projects, rather than a GCC-specific copy, and so should 
avoid making local changes there.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Types are confused in inlining

2020-09-02 Thread Richard Biener via Gcc
On Wed, Sep 2, 2020 at 10:19 PM Gary Oblock via Gcc  wrote:
>
> I'm not accusing inlining of having problems but I really
> need to understand what's going on in this situation so I can
> fix my optimization.
>
> The error given is:
> main.c: In function ‘main’:
> main.c:5:1: error: non-trivial conversion in ‘ssa_name’
> 5 | main(void)
>   | ^
> struct type_t *
> unsigned long
> _101 = dedangled_97;
> during GIMPLE pass: fixup_cfg
> etc.
> etc.
>
> I put a conditional breakpoint in gdb where both
> _101 and dedangled_97 were created and low
> and behold they were both set to "unsigned long".
> Does anybody have a clue as to how "_101" got
> changed from "unsigned long" to "struct type_t *"?
> Note, the later is a meaningful type in my program.
> I'm trying to replace all instances of the former as
> part of structure reorganization optimization.) I should
> mention that this GIMPLE stmt is the one that moves
> the value computed in an inlined function into the body
> of code where the inling took place.

This is absolutely not enough information to guess at the
issue ;)

I suggest you break at the return stmt of make_ssa_name_fn
looking for t->base.u.version == 101 to see where and with
which type _101 is created, from there watch *&t->typed.type
in case something adjusts the type.

> Thanks,
>
> Gary Oblock
>
>
>
>
>
> CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is 
> for the sole use of the intended recipient(s) and contains information that 
> is confidential and proprietary to Ampere Computing or its subsidiaries. It 
> is to be used solely for the purpose of furthering the parties' business 
> relationship. Any unauthorized review, copying, or distribution of this email 
> (or any attachments thereto) is strictly prohibited. If you are not the 
> intended recipient, please contact the sender immediately and permanently 
> delete the original and any copies of this email and any attachments thereto.