Richard Biener via Gcc-patches <[email protected]> writes:
> niter analysis uses multiple_of_p which currently assumes
> operations like MULT_EXPR do not wrap. We've got to rely on this
> for optimizing size expressions like those in DECL_SIZE and those
> generally use unsigned arithmetic with no indication that they
> are not expected to wrap. To preserve that the following adds
> a parameter to multiple_of_p, defaulted to true, indicating that
> the TOP expression is not expected to wrap for outer computations
> in TYPE. This mostly follows a patch proposed by Bin last year
> with the conversion behavior added.
>
> Applying to all users the new effect is that upon type conversions
> in the TOP expression the behavior will switch to honor
> TYPE_OVERFLOW_UNDEFINED for the converted sub-expressions.
>
> The patch also changes the occurance in niter analysis that we
> know is problematic and we have testcases for to pass false
> to multiple_of_p. The patch also contains a change to the
> PR72817 fix from Bin to avoid regressing gcc.dg/tree-ssa/loop-42.c.
>
> The intent for stage1 is to introduce a size_multiple_of_p and
> internalize the added parameter so all multiple_of_p users will
> honor TYPE_OVERFLOW_UNDEFINED and users dealing with size expressions
> need to be switched to size_multiple_of_p.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu with all languages
> and {,-m32} testing.
>
> The patch applies ontop of the three earlier posted ones that touch
> multiple_of_p but have not yet been reviewed/pushed.
>
> OK?
>
> Thanks,
> Richard.
>
> 2022-01-26 Richard Biener <[email protected]>
>
> PR tree-optimization/100499
> * fold-const.h (multiple_of_p): Add nowrap parameter, defaulted
> to true.
> * fold-const.cc (multiple_of_p): Likewise. Honor it for
> MULT_EXPR, PLUS_EXPR and MINUS_EXPR and pass it along,
> switching to false for conversions.
> * tree-ssa-loop-niter.cc (number_of_iterations_ne): Do not
> claim the outermost expression does not wrap when calling
> multiple_of_p. Refactor the check done to check the
> original IV, avoiding a bias that might wrap.
>
> * gcc.dg/torture/pr100499-1.c: New testcase.
> * gcc.dg/torture/pr100499-2.c: Likewise.
> * gcc.dg/torture/pr100499-3.c: Likewise.
>
> Co-authored-by: Bin Cheng <[email protected]>
> ---
> gcc/fold-const.cc | 81 +++++++++++++++--------
> gcc/fold-const.h | 2 +-
> gcc/testsuite/gcc.dg/torture/pr100499-1.c | 27 ++++++++
> gcc/testsuite/gcc.dg/torture/pr100499-2.c | 16 +++++
> gcc/testsuite/gcc.dg/torture/pr100499-3.c | 14 ++++
> gcc/tree-ssa-loop-niter.cc | 52 ++++++---------
> 6 files changed, 131 insertions(+), 61 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-1.c
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-2.c
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-3.c
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index a0a4913c45e..7c204fb6265 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -14062,10 +14062,16 @@ fold_binary_initializer_loc (location_t loc,
> tree_code code, tree type,
> SAVE_EXPR (I) * SAVE_EXPR (J)
>
> (where the same SAVE_EXPR (J) is used in the original and the
> - transformed version). */
> + transformed version).
> +
> + NOWRAP specifies whether all outer operations in TYPE should
> + be considered not wrapping. Any type conversion within TOP acts
> + as a barrier and we will fall back to NOWRAP being false.
> + NOWRAP is mostly used to treat expressions in TYPE_SIZE and friends
> + as not wrapping even though they are generally using unsigned arithmetic.
> */
>
> int
> -multiple_of_p (tree type, const_tree top, const_tree bottom)
> +multiple_of_p (tree type, const_tree top, const_tree bottom, bool nowrap)
> {
> gimple *stmt;
> tree op1, op2;
> @@ -14083,10 +14089,17 @@ multiple_of_p (tree type, const_tree top,
> const_tree bottom)
> a multiple of BOTTOM then TOP is a multiple of BOTTOM. */
> if (!integer_pow2p (bottom))
> return 0;
> - return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
> - || multiple_of_p (type, TREE_OPERAND (top, 0), bottom));
> + return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom, nowrap)
> + || multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap));
>
> case MULT_EXPR:
> + /* If the multiplication can wrap we cannot recurse further unless
> + the second operand is a power of two which is where wrapping
> + does not matter. */
> + if (!nowrap
> + && !TYPE_OVERFLOW_UNDEFINED (type)
> + && !integer_pow2p (TREE_OPERAND (top, 1)))
> + return 0;
I think I'm missing something, but isn't the key thing whether bottom
is a power of 2? E.g. as it stands it looks like we'd still say that a
wrapping x * 2 is a multiple of 3 based purely on x being a multiple of 3,
thanks to…
> if (TREE_CODE (bottom) == INTEGER_CST)
> {
> op1 = TREE_OPERAND (top, 0);
> @@ -14095,24 +14108,24 @@ multiple_of_p (tree type, const_tree top,
> const_tree bottom)
> std::swap (op1, op2);
> if (TREE_CODE (op2) == INTEGER_CST)
> {
> - if (multiple_of_p (type, op2, bottom))
> + if (multiple_of_p (type, op2, bottom, nowrap))
> return 1;
> /* Handle multiple_of_p ((x * 2 + 2) * 4, 8). */
> - if (multiple_of_p (type, bottom, op2))
> + if (multiple_of_p (type, bottom, op2, nowrap))
> {
> widest_int w = wi::sdiv_trunc (wi::to_widest (bottom),
> wi::to_widest (op2));
> if (wi::fits_to_tree_p (w, TREE_TYPE (bottom)))
> {
> op2 = wide_int_to_tree (TREE_TYPE (bottom), w);
> - return multiple_of_p (type, op1, op2);
> + return multiple_of_p (type, op1, op2, nowrap);
> }
> }
> - return multiple_of_p (type, op1, bottom);
> + return multiple_of_p (type, op1, bottom, nowrap);
> }
> }
> - return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
> - || multiple_of_p (type, TREE_OPERAND (top, 0), bottom));
> + return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom, nowrap)
> + || multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap));
…the second arm of the || here.
On the other hand, a wrapping x * 6 is a multiple of 2 even though 6
isn't a power of 2.
Thanks,
Richard
>
> case LSHIFT_EXPR:
> /* Handle X << CST as X * (1 << CST) and only process the constant. */
> @@ -14124,29 +14137,39 @@ multiple_of_p (tree type, const_tree top,
> const_tree bottom)
> wide_int mul_op
> = wi::one (TYPE_PRECISION (type)) << wi::to_wide (op1);
> return multiple_of_p (type,
> - wide_int_to_tree (type, mul_op), bottom);
> + wide_int_to_tree (type, mul_op), bottom,
> + nowrap);
> }
> }
> return 0;
>
> case MINUS_EXPR:
> - /* It is impossible to prove if op0 - op1 is multiple of bottom
> - precisely, so be conservative here checking if both op0 and op1
> - are multiple of bottom. Note we check the second operand first
> - since it's usually simpler. */
> - return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
> - && multiple_of_p (type, TREE_OPERAND (top, 0), bottom));
> -
> case PLUS_EXPR:
> - /* The same as MINUS_EXPR, but handle cases like op0 + 0xfffffffd
> - as op0 - 3 if the expression has unsigned type. For example,
> - (X / 3) + 0xfffffffd is multiple of 3, but 0xfffffffd is not. */
> op1 = TREE_OPERAND (top, 1);
> - if (TYPE_UNSIGNED (type)
> +
> + /* If the addition or subtraction can wrap we cannot recurse further
> + unless the second operand is a power of two which is where wrapping
> + does not matter. */
> + if (!nowrap
> + && !TYPE_OVERFLOW_UNDEFINED (type)
> + && !integer_pow2p (op1))
> + return 0;
> +
> + /* Handle cases like op0 + 0xfffffffd as op0 - 3 if the expression has
> + unsigned type. For example, (X / 3) + 0xfffffffd is multiple of 3,
> + but 0xfffffffd is not. */
> + if (TREE_CODE (top) == PLUS_EXPR
> + && nowrap
> + && TYPE_UNSIGNED (type)
> && TREE_CODE (op1) == INTEGER_CST && tree_int_cst_sign_bit (op1))
> op1 = fold_build1 (NEGATE_EXPR, type, op1);
> - return (multiple_of_p (type, op1, bottom)
> - && multiple_of_p (type, TREE_OPERAND (top, 0), bottom));
> +
> + /* It is impossible to prove if op0 +- op1 is multiple of bottom
> + precisely, so be conservative here checking if both op0 and op1
> + are multiple of bottom. Note we check the second operand first
> + since it's usually simpler. */
> + return (multiple_of_p (type, op1, bottom, nowrap)
> + && multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap));
>
> CASE_CONVERT:
> /* Can't handle conversions from non-integral or wider integral type.
> */
> @@ -14154,15 +14177,17 @@ multiple_of_p (tree type, const_tree top,
> const_tree bottom)
> || (TYPE_PRECISION (type)
> < TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (top, 0)))))
> return 0;
> + /* NOWRAP only extends to operations in the outermost type so
> + make sure to strip it off here. */
> return multiple_of_p (TREE_TYPE (TREE_OPERAND (top, 0)),
> - TREE_OPERAND (top, 0), bottom);
> + TREE_OPERAND (top, 0), bottom, false);
>
> case SAVE_EXPR:
> - return multiple_of_p (type, TREE_OPERAND (top, 0), bottom);
> + return multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap);
>
> case COND_EXPR:
> - return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
> - && multiple_of_p (type, TREE_OPERAND (top, 2), bottom));
> + return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom, nowrap)
> + && multiple_of_p (type, TREE_OPERAND (top, 2), bottom, nowrap));
>
> case INTEGER_CST:
> if (TREE_CODE (bottom) != INTEGER_CST
> diff --git a/gcc/fold-const.h b/gcc/fold-const.h
> index a9a3062e4f6..03a0de3b4de 100644
> --- a/gcc/fold-const.h
> +++ b/gcc/fold-const.h
> @@ -96,7 +96,7 @@ extern void fold_overflow_warning (const char*, enum
> warn_strict_overflow_code);
> extern enum tree_code fold_div_compare (enum tree_code, tree, tree,
> tree *, tree *, bool *);
> extern bool operand_equal_p (const_tree, const_tree, unsigned int flags = 0);
> -extern int multiple_of_p (tree, const_tree, const_tree);
> +extern int multiple_of_p (tree, const_tree, const_tree, bool = true);
> #define omit_one_operand(T1,T2,T3)\
> omit_one_operand_loc (UNKNOWN_LOCATION, T1, T2, T3)
> extern tree omit_one_operand_loc (location_t, tree, tree, tree);
> diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-1.c
> b/gcc/testsuite/gcc.dg/torture/pr100499-1.c
> new file mode 100644
> index 00000000000..9511c323505
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr100499-1.c
> @@ -0,0 +1,27 @@
> +/* { dg-do run } */
> +
> +typedef __UINT16_TYPE__ uint16_t;
> +typedef __INT32_TYPE__ int32_t;
> +static uint16_t g_2823 = 0xEC75L;
> +static uint16_t g_116 = 0xBC07L;
> +
> +static uint16_t
> +safe_mul_func_uint16_t_u_u(uint16_t ui1, uint16_t ui2)
> +{
> + return ((unsigned int)ui1) * ((unsigned int)ui2);
> +}
> +
> +int main ()
> +{
> + uint16_t l_2815 = 0xffff;
> + uint16_t *l_2821 = &g_116;
> + uint16_t *l_2822 = &g_2823;
> +
> +lbl_2826:
> + l_2815 &= 0x1eae;
> + if (safe_mul_func_uint16_t_u_u(((*l_2821) = l_2815), (--(*l_2822))))
> + goto lbl_2826;
> + if (g_2823 != 32768)
> + __builtin_abort ();
> + return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-2.c
> b/gcc/testsuite/gcc.dg/torture/pr100499-2.c
> new file mode 100644
> index 00000000000..999f931806a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr100499-2.c
> @@ -0,0 +1,16 @@
> +/* { dg-do run } */
> +
> +unsigned char ag = 55;
> +unsigned i;
> +int main()
> +{
> + unsigned char c;
> + unsigned char a = ag;
> +d:
> + c = a-- * 52;
> + if (c)
> + goto d;
> + if (a != 255)
> + __builtin_abort ();
> + return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-3.c
> b/gcc/testsuite/gcc.dg/torture/pr100499-3.c
> new file mode 100644
> index 00000000000..01be1e50690
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr100499-3.c
> @@ -0,0 +1,14 @@
> +/* { dg-do run } */
> +
> +int a = 0;
> +unsigned char b = 0;
> +
> +int main() {
> + a - 6;
> + for (; a >= -13; a = a - 8)
> + while((unsigned char)(b-- * 6))
> + ;
> + if (b != 127)
> + __builtin_abort();
> + return 0;
> +}
> diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
> index d33095b8e03..318d10c8fac 100644
> --- a/gcc/tree-ssa-loop-niter.cc
> +++ b/gcc/tree-ssa-loop-niter.cc
> @@ -1042,17 +1042,21 @@ number_of_iterations_ne (class loop *loop, tree type,
> affine_iv *iv,
> new_base = base - step > FINAL ; step < 0
> && base - step doesn't overflow
>
> - 2') |FINAL - new_base| is an exact multiple of step.
> -
> - Please refer to PR34114 as an example of loop-ch's impact, also refer
> - to PR72817 as an example why condition 2') is necessary.
> + Please refer to PR34114 as an example of loop-ch's impact.
>
> Note, for NE_EXPR, base equals to FINAL is a special case, in
> - which the loop exits immediately, and the iv does not overflow. */
> + which the loop exits immediately, and the iv does not overflow.
> +
> + Also note, we prove condition 2) by checking base and final seperately
> + along with condition 1) or 1'). */
> if (!niter->control.no_overflow
> - && (integer_onep (s) || multiple_of_p (type, c, s)))
> + && (integer_onep (s)
> + || (multiple_of_p (type, fold_convert (niter_type, iv->base), s,
> + false)
> + && multiple_of_p (type, fold_convert (niter_type, final), s,
> + false))))
> {
> - tree t, cond, new_c, relaxed_cond = boolean_false_node;
> + tree t, cond, relaxed_cond = boolean_false_node;
>
> if (tree_int_cst_sign_bit (iv->step))
> {
> @@ -1066,12 +1070,8 @@ number_of_iterations_ne (class loop *loop, tree type,
> affine_iv *iv,
> if (integer_nonzerop (t))
> {
> t = fold_build2 (MINUS_EXPR, type, iv->base, iv->step);
> - new_c = fold_build2 (MINUS_EXPR, niter_type,
> - fold_convert (niter_type, t),
> - fold_convert (niter_type, final));
> - if (multiple_of_p (type, new_c, s))
> - relaxed_cond = fold_build2 (GT_EXPR, boolean_type_node,
> - t, final);
> + relaxed_cond = fold_build2 (GT_EXPR, boolean_type_node, t,
> + final);
> }
> }
> }
> @@ -1087,12 +1087,8 @@ number_of_iterations_ne (class loop *loop, tree type,
> affine_iv *iv,
> if (integer_nonzerop (t))
> {
> t = fold_build2 (MINUS_EXPR, type, iv->base, iv->step);
> - new_c = fold_build2 (MINUS_EXPR, niter_type,
> - fold_convert (niter_type, final),
> - fold_convert (niter_type, t));
> - if (multiple_of_p (type, new_c, s))
> - relaxed_cond = fold_build2 (LT_EXPR, boolean_type_node,
> - t, final);
> + relaxed_cond = fold_build2 (LT_EXPR, boolean_type_node, t,
> + final);
> }
> }
> }
> @@ -1102,19 +1098,11 @@ number_of_iterations_ne (class loop *loop, tree type,
> affine_iv *iv,
> t = simplify_using_initial_conditions (loop, relaxed_cond);
>
> if (t && integer_onep (t))
> - niter->control.no_overflow = true;
> - }
> -
> - /* First the trivial cases -- when the step is 1. */
> - if (integer_onep (s))
> - {
> - niter->niter = c;
> - return true;
> - }
> - if (niter->control.no_overflow && multiple_of_p (type, c, s))
> - {
> - niter->niter = fold_build2 (FLOOR_DIV_EXPR, niter_type, c, s);
> - return true;
> + {
> + niter->control.no_overflow = true;
> + niter->niter = fold_build2 (EXACT_DIV_EXPR, niter_type, c, s);
> + return true;
> + }
> }
>
> /* Let nsd (step, size of mode) = d. If d does not divide c, the loop