On Wed, 26 Feb 2020, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled in 8+.
> The problem is that check_no_overlap has a special case for INTEGER_CST
> marked stores (i.e. stores of constants), if both all currenly merged stores
> and the one under consideration for merging with them are marked that way,
> it anticipates that other INTEGER_CST marked stores that overlap with those
> and precede those (have smaller info->order) could be merged with those and
> doesn't punt for them.
> In PR86844 and PR87859 fixes I've then added quite large code that is
> performed after check_no_overlap and tries to find out if we need and can
> merge further INTEGER_CST marked stores, or need to punt.
> Unfortunately, that code is there only in the overlapping case code and
> the testcase below shows that we really need it even in the adjacent store
> case.  After sort_by_bitpos we have:
> bitpos        width   order   rhs_code
> 96    32      3       INTEGER_CST
> 128   32      1       INTEGER_CST
> 128   128     2       INTEGER_CST
> 192   32      0       MEM_REF
> Because of the missing PR86844/PR87859-ish code in the adjacent store
> case, we merge the adjacent (memory wise) stores 96/32/3 and 128/32/1,
> and then we consider the 128-bit store which is in program-order in between
> them, but in this case we punt, because the merging would extend the
> merged store region from bitpos 96 and 64-bits to bitpos 96 and 160-bits
> and that has an overlap with an incompatible store (the MEM_REF one).
> The problem is that we can't really punt this way, because the 128-bit
> store is in between those two we've merged already, so either we manage
> to merge even that one together with the others, or would need to avoid
> already merging the 96/32/3 and 128/32/1 stores together.
> Now, rather than copying around the PR86844/PR87859 code to the other spot,
> we can actually just use the overlapping code, merge_overlapping is really
> a superset of merge_into, so that is what the patch does.  If doing
> adjacent store merge for rhs_code other than INTEGER_CST, I believe the
> current code is already fine, check_no_overlap in that case doesn't make
> the exception and will punt if there is some earlier (smaller order)
> non-mergeable overlapping store.  There is just one case that could be
> problematic, if the merged_store has BIT_INSERT_EXPRs in them and the
> new store is a constant store (INTEGER_CST rhs_code), then check_no_overlap
> would do the exception and still would allow the special case.  But we
> really shouldn't have the special case in that case, so this patch also
> changes check_no_overlap to just have a bool whether we should have the
> special case or not.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux on the trunk as well
> as coprresponding backport to 8 branch, ok for trunk and release branches?

OK.

> Note, as I said in the PR, for GCC11 we could consider performing some kind
> of cheap DSE during the store merging (perhaps guarded with flag_tree_dse).
> And another thing to consider is only consider as problematic non-mergeable
> stores that not only have order smaller than last_order as currently, but
> also have order larger than first_order, as in this testcase if we actually
> ignored (not merged with anything at all) the 192/32/0 store, because it is
> not in between the other stores we'd merge, it would be fine to merge the
> other 3 stores, though of course the testcase can be easily adjusted by
> putting the 192/32 store after the 128/32 store and then this patch would be
> still needed.  Though, I think I'd need more time thinking this over.

When looking at the PR I thought that the issue must be with the sorting
since if there's no intermediate store breaking the chain we should be
able to "merge" all INTEGER_CSTs by simply native encoding them in
program order?  (similar to what VN does with partial defs)

Richard.

> 2020-02-25  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/93820
>       * gimple-ssa-store-merging.c (check_no_overlap): Change RHS_CODE
>       argument to ALL_INTEGER_CST_P boolean.
>       (imm_store_chain_info::try_coalesce_bswap): Adjust caller.
>       (imm_store_chain_info::coalesce_immediate_stores): Likewise.  Handle
>       adjacent INTEGER_CST store into merged_store->only_constants like
>       overlapping one.
> 
>       * gcc.dg/pr93820.c: New test.
> 
> --- gcc/gimple-ssa-store-merging.c.jj 2020-02-13 10:03:39.000000000 +0100
> +++ gcc/gimple-ssa-store-merging.c    2020-02-25 18:47:40.303092315 +0100
> @@ -2370,8 +2370,9 @@ gather_bswap_load_refs (vec<tree> *refs,
>  /* Check if there are any stores in M_STORE_INFO after index I
>     (where M_STORE_INFO must be sorted by sort_by_bitpos) that overlap
>     a potential group ending with END that have their order
> -   smaller than LAST_ORDER.  RHS_CODE is the kind of store in the
> -   group.  Return true if there are no such stores.
> +   smaller than LAST_ORDER.  ALL_INTEGER_CST_P is true if
> +   all the stores already merged and the one under consideration
> +   have rhs_code of INTEGER_CST.  Return true if there are no such stores.
>     Consider:
>       MEM[(long long int *)p_28] = 0;
>       MEM[(long long int *)p_28 + 8B] = 0;
> @@ -2394,13 +2395,13 @@ gather_bswap_load_refs (vec<tree> *refs,
>     the MEM[(long long int *)p_28 + 8B] = 0; would now be before it,
>     so we need to refuse merging MEM[(long long int *)p_28 + 8B] = 0;
>     into the group.  That way it will be its own store group and will
> -   not be touched.  If RHS_CODE is INTEGER_CST and there are overlapping
> +   not be touched.  If ALL_INTEGER_CST_P and there are overlapping
>     INTEGER_CST stores, those are mergeable using merge_overlapping,
>     so don't return false for those.  */
>  
>  static bool
>  check_no_overlap (vec<store_immediate_info *> m_store_info, unsigned int i,
> -               enum tree_code rhs_code, unsigned int last_order,
> +               bool all_integer_cst_p, unsigned int last_order,
>                 unsigned HOST_WIDE_INT end)
>  {
>    unsigned int len = m_store_info.length ();
> @@ -2410,7 +2411,7 @@ check_no_overlap (vec<store_immediate_in
>        if (info->bitpos >= end)
>       break;
>        if (info->order < last_order
> -       && (rhs_code != INTEGER_CST || info->rhs_code != INTEGER_CST))
> +       && (!all_integer_cst_p || info->rhs_code != INTEGER_CST))
>       return false;
>      }
>    return true;
> @@ -2563,7 +2564,7 @@ imm_store_chain_info::try_coalesce_bswap
>    if (n.base_addr == NULL_TREE && !is_gimple_val (n.src))
>      return false;
>  
> -  if (!check_no_overlap (m_store_info, last, LROTATE_EXPR, last_order, end))
> +  if (!check_no_overlap (m_store_info, last, false, last_order, end))
>      return false;
>  
>    /* Don't handle memory copy this way if normal non-bswap processing
> @@ -2713,7 +2714,14 @@ imm_store_chain_info::coalesce_immediate
>              |---store 2---|
>        Overlapping stores.  */
>        else if (IN_RANGE (info->bitpos, merged_store->start,
> -                      merged_store->start + merged_store->width - 1))
> +                      merged_store->start + merged_store->width - 1)
> +            /* |---store 1---||---store 2---|
> +               Handle also the consecutive INTEGER_CST stores case here,
> +               as we have here the code to deal with overlaps.  */
> +            || (info->bitregion_start <= merged_store->bitregion_end
> +                && info->rhs_code == INTEGER_CST
> +                && merged_store->only_constants
> +                && merged_store->can_be_merged_into (info)))
>       {
>         /* Only allow overlapping stores of constants.  */
>         if (info->rhs_code == INTEGER_CST
> @@ -2725,8 +2733,7 @@ imm_store_chain_info::coalesce_immediate
>             unsigned HOST_WIDE_INT end
>               = MAX (merged_store->start + merged_store->width,
>                      info->bitpos + info->bitsize);
> -           if (check_no_overlap (m_store_info, i, INTEGER_CST,
> -                                 last_order, end))
> +           if (check_no_overlap (m_store_info, i, true, last_order, end))
>               {
>                 /* check_no_overlap call above made sure there are no
>                    overlapping stores with non-INTEGER_CST rhs_code
> @@ -2879,7 +2886,7 @@ imm_store_chain_info::coalesce_immediate
>             std::swap (info->ops[0], info->ops[1]);
>             info->ops_swapped_p = true;
>           }
> -       if (check_no_overlap (m_store_info, i, info->rhs_code,
> +       if (check_no_overlap (m_store_info, i, false,
>                               MAX (merged_store->last_order, info->order),
>                               MAX (merged_store->start + merged_store->width,
>                                    info->bitpos + info->bitsize)))
> --- gcc/testsuite/gcc.dg/pr93820.c.jj 2020-02-25 18:34:01.535154614 +0100
> +++ gcc/testsuite/gcc.dg/pr93820.c    2020-02-25 14:32:31.600207611 +0100
> @@ -0,0 +1,26 @@
> +/* PR tree-optimization/93820 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fno-tree-dse" } */
> +
> +typedef int v4si __attribute__((vector_size(4 * sizeof (int))));
> +int a[10];
> +
> +__attribute__((noipa)) int
> +foo (int *p)
> +{
> +  a[6] = *p;
> +  a[4] = 1;
> +  *(((v4si *)&a[0]) + 1) = (v4si) { 0, 0, 0, 0 };
> +  a[3] = 0;
> +}
> +
> +int
> +main ()
> +{
> +  int i = 0;
> +  foo (&i);
> +  for (i = 0; i < 10; i++)
> +    if (a[i])
> +      __builtin_abort ();
> +  return 0;
> +}
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Reply via email to