On Thu, Mar 30, 2017 at 2:34 PM, Richard Biener <richard.guent...@gmail.com> wrote: > On Thu, Mar 30, 2017 at 3:20 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >> On Thu, Mar 30, 2017 at 2:18 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>> On Thu, Mar 30, 2017 at 1:44 PM, Richard Biener >>> <richard.guent...@gmail.com> wrote: >>>> On Thu, Mar 30, 2017 at 2:03 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>>>> On Thu, Mar 30, 2017 at 11:37 AM, Richard Biener >>>>> <richard.guent...@gmail.com> wrote: >>>>>> On Wed, Mar 29, 2017 at 5:22 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>>>>>> On Tue, Mar 28, 2017 at 1:34 PM, Richard Biener >>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>> On Tue, Mar 28, 2017 at 2:01 PM, Bin Cheng <bin.ch...@arm.com> wrote: >>>>>>>>> Hi, >>>>>>>>> This patch is to fix PR80153. As analyzed in the PR, root cause is >>>>>>>>> tree_affine lacks >>>>>>>>> ability differentiating (unsigned)(ptr + offset) and (unsigned)ptr + >>>>>>>>> (unsigned)offset, >>>>>>>>> even worse, it always returns the former expression in >>>>>>>>> aff_combination_tree, which >>>>>>>>> is wrong if the original expression has the latter form. The patch >>>>>>>>> resolves the issue >>>>>>>>> by always returning the latter form expression, i.e, always trying to >>>>>>>>> generate folded >>>>>>>>> expression. Also as analyzed in comment, I think this change won't >>>>>>>>> result in substantial >>>>>>>>> code gen difference. >>>>>>>>> I also need to adjust get_computation_aff for test case >>>>>>>>> gcc.dg/tree-ssa/reassoc-19.c. >>>>>>>>> Well, I think the changed behavior is correct, but for case the >>>>>>>>> original pointer candidate >>>>>>>>> is chosen, it should be unnecessary to compute in uutype. Also this >>>>>>>>> adjustment only >>>>>>>>> generates (unsigned)(pointer + offset) which is generated by >>>>>>>>> tree-affine.c. >>>>>>>>> Bootstrap and test on x86_64 and AArch64. Is it OK? >>>>>>>> >>>>>>> Thanks for reviewing. >>>>>>>> Hmm. What is the desired goal? To have all elts added have >>>>>>>> comb->type as type? Then >>>>>>>> the type passed to add_elt_to_tree is redundant with comb->type. It >>>>>>>> looks like it >>>>>>>> is always passed comb->type now. >>>>>>> Yes, except pointer type comb->type, elts are converted to comb->type >>>>>>> with this patch. >>>>>>> The redundant type is removed in updated patch. >>>>>>> >>>>>>>> >>>>>>>> ISTR from past work in this area that it was important for pointer >>>>>>>> combinations to allow >>>>>>>> both pointer and sizetype elts at least. >>>>>>> Yes, It's still important to allow different types for pointer and >>>>>>> offset in pointer type comb. >>>>>>> I missed a pointer type check condition in the patch, fixed in updated >>>>>>> patch. >>>>>>>> >>>>>>>> Your change is incomplete I think, for the scale == -1 and >>>>>>>> POINTER_TYPE_P case >>>>>>>> elt is sizetype now, not of pointer type. As said above, we are >>>>>>>> trying to maintain >>>>>>>> both pointer and sizetype elts with like: >>>>>>>> >>>>>>>> if (scale == 1) >>>>>>>> { >>>>>>>> if (!expr) >>>>>>>> { >>>>>>>> if (POINTER_TYPE_P (TREE_TYPE (elt))) >>>>>>>> return elt; >>>>>>>> else >>>>>>>> return fold_convert (type1, elt); >>>>>>>> } >>>>>>>> >>>>>>>> where your earilier fold to type would result in not all cases handled >>>>>>>> the same >>>>>>>> (depending whether scale was -1 for example). >>>>>>> IIUC, it doesn't matter. For comb->type being pointer type, the >>>>>>> behavior remains the same. >>>>>>> For comb->type being unsigned T, this elt is converted to ptr_offtype, >>>>>>> rather than unsigned T, >>>>>>> this doesn't matter because ptr_offtype and unsigned T are equal to >>>>>>> each other, otherwise >>>>>>> tree_to_aff_combination shouldn't distribute it as a single elt. >>>>>>> Anyway, this is addressed in updated patch by checking pointer >>>>>>> comb->type additionally. >>>>>>> BTW, I think "scale==-1" case is a simple heuristic differentiating >>>>>>> pointer_base and offset. >>>>>>> >>>>>>>> >>>>>>>> Thus - shouldn't we simply drop the type argument (or rather the comb >>>>>>>> one? >>>>>>>> that wide_int_ext_for_comb looks weird given we get a widest_int as >>>>>>>> input >>>>>>>> and all the other wide_int_ext_for_comb calls around). >>>>>>>> >>>>>>>> And unconditionally convert to type, simplifying the rest of the code? >>>>>>> As said, for pointer type comb, we need to keep current behavior; for >>>>>>> other cases, >>>>>>> unconditionally convert to comb->type is the goal. >>>>>>> >>>>>>> Bootstrap and test on x86_64 and AArch64. Is this version OK? >>>>>> >>>>>> @@ -399,22 +400,20 @@ add_elt_to_tree (tree expr, tree type, tree elt, >>>>>> const widest_int &scale_in, >>>>>> if (POINTER_TYPE_P (TREE_TYPE (elt))) >>>>>> return elt; >>>>>> else >>>>>> - return fold_convert (type1, elt); >>>>>> + return fold_convert (type, elt); >>>>>> } >>>>>> >>>>>> the conversion should already have been done. For non-pointer comb->type >>>>>> it has been converted to type by your patch. For pointer-type comb->type >>>>>> it should be either pointer type or ptrofftype ('type') already as well. >>>>>> >>>>>> That said, can we do sth like >>>>>> >>>>>> @@ -384,6 +395,12 @@ add_elt_to_tree (tree expr, tree type, t >>>>>> >>>>>> widest_int scale = wide_int_ext_for_comb (scale_in, comb); >>>>>> >>>>>> + if (! POINTER_TYPE_P (comb->type)) >>>>>> + elt = fold_convert (comb->type, elt); >>>>>> + else >>>>>> + gcc_assert (POINTER_TYPE_P (TREE_TYPE (elt)) >>>>>> + || types_compatible_p (TREE_TYPE (elt), type1)); >>>>> Hmm, this assert can be broken since we do STRIP_NOPS converting to >>>>> aff_tree. It's not compatible for signed and unsigned integer types. >>>>> Also, with this patch, we can even support elt of short type in a >>>>> unsigned long comb, though this is useless. >>>>> >>>>>> + >>>>>> if (scale == -1 >>>>>> && POINTER_TYPE_P (TREE_TYPE (elt))) >>>>>> { >>>>>> >>>>>> that is clearly do the conversion at the start in a way the state >>>>>> of elt is more clear? >>>>> Yes, thanks. V3 patch attached (with gcc_assert removed). Is it ok >>>>> after bootstrap/test? >>>> >>>> - return fold_build2 (PLUS_EXPR, type1, >>>> - expr, fold_convert (type1, elt)); >>>> + return fold_build2 (PLUS_EXPR, type, expr, fold_convert (type, >>>> elt)); >>>> >>>> folding not needed(?) >>>> >>>> - return fold_build1 (NEGATE_EXPR, type1, >>>> - fold_convert (type1, elt)); >>>> + return fold_build1 (NEGATE_EXPR, type, fold_convert (type, elt)); >>>> >>>> likewise. >>>> >>>> - return fold_build2 (MINUS_EXPR, type1, >>>> - expr, fold_convert (type1, elt)); >>>> + return fold_build2 (MINUS_EXPR, type, expr, fold_convert (type, >>>> elt)); >>>> >>>> likewise. >>>> >>>> Ok with removing those and re-testing. >>> Hmm, I thought twice about the simplification, there are cases not >>> properly handled: >>>>>> + if (! POINTER_TYPE_P (comb->type)) >>>>>> + elt = fold_convert (comb->type, elt); >>>>>> + else >>>>>> + gcc_assert (POINTER_TYPE_P (TREE_TYPE (elt)) >>>>>> + || types_compatible_p (TREE_TYPE (elt), type1)); >>> This is not enough, for pointer type comb, if elt is the offset part, >>> we could return signed integer type elt without folding. Though this >>> shouldn't be an issue because it's always converted to ptr_offtype in >>> building pointer_plus, it's better not to create such expressions in >>> the first place. Check condition for unconditionally converting elt >>> should be improved as: >>>>>> + if (! POINTER_TYPE_P (comb->type) || !POINTER_TYPE_P (TREE_TYPE >>>>>> (elt))) >>>>>> + elt = fold_convert (comb->type, elt); >> >> Hmm, precisely as: >>>>>> + if (! POINTER_TYPE_P (comb->type) || !POINTER_TYPE_P (TREE_TYPE >>>>>> (elt))) >>>>>> + elt = fold_convert (type, elt); > > Yeah, that looks good to me. > Turned out it's more subtle than expected. Here is the latest version patch which I think makes aff_tree's type semantics more clear. Detailed comment is added in tree-affine.h describing its semantics.
/* This aff_tree represents fully folded expression in a distributed way. For example, tree expression: (unsigned long)(A + ((sizetype)((integer)B + C) + (sizetype)D * 2) * 4) can be represented as aff_tree like: { type = unsigned long offset = 0 elts[0] = A * 1 elts[1] = B * 4 elts[2] = C * 4 elts[3] = D * 8 } Note aff_tree has (root) type which is type of the original expression, elements can have their own types which are different to aff_tree's. In general, elements' type is type of folded sub-expression, and with NOP type conversion stripped. For example, elts[0] has type of A, which is type of STRIP_NOPS ((sizetype) A). Given aff_tree represents folded form of the original tree expression, it lacks ability to track whether the original form is of folded form or non-folded form. For example, both tree expressions: (unsigned)((int)A + (int)B) (unsigned)(int)A + (unsigned)(int)B have the same aff_tree repsentation. This imposes restrictions on this facility, i.e, we need to be conservative and always generate the latter form when converting aff_tree back to tree expression. This implies all elements need to be converted to aff_tree's type before converting. Always generating folded expr could lead to information loss because we can no longer know that (int)A + (int)B doesn't overflow. As a result, we should avoid using aff_tree in code generation directly. It should be used when we want to explore CSE opportunities by breaking most associations. It can be then used in code generation if there will be benefit. It's possible to represent POINTER_PLUS_EXPR in aff_tree, the aff_tree has pointer type accordingly. Such aff_tree is special in two ways: 1) It has a base element which is the original base pointer. Other elements belong to offset part of the original expression. When converting back to tree, other elements need to be converted to ptr_offtype, rather than pointer type. 2) In aff_tree computation, base element can be eliminated, it's the user's responsibility to convert the rest aff_tree to ptr_offtype. The rest aff_tree stands for offset part expression, no longer the POINTER_PLUS_EXPR. */ As an real example, use of aff_tree in add_iv_candidate_for_use breaks above semantics. It needs to convert aff_tree to ptr_offtype after removing pointer element. Here I simply choose not to use aff_tree since it's unnecessary. Bootstrap and test on x86_64 and AArch64. Is this version OK? Thanks, bin 2017-04-04 Bin Cheng <bin.ch...@arm.com> PR tree-optimization/80153 * tree-affine.h (struct aff_tree): Add comment. * tree-affine.c (add_elt_to_tree): Remove parameter TYPE, and use parameter COMB's type instead. Preserve elt's pointer type if it is the base pointer of a pointer type COMB. (aff_combination_to_tree): Update calls to add_elt_to_tree. * tree-ssa-loop-ivopts.c (alloc_iv): Pass in consistent types. (add_iv_candidate_for_use): Check and remove POINTER_PLUS_EXPR's base part directly, rather than through aff_tree. (get_computation_aff): Use utype directly for original candidate. gcc/testsuite/ChangeLog 2017-04-04 Bin Cheng <bin.ch...@arm.com> PR tree-optimization/80153 * gcc.c-torture/execute/pr80153.c: New.