On Mon, 20 Nov 2017, Jakub Jelinek wrote:

> Hi!
> 
> This is something the bswap pass has already been doing, but not the
> new load handling code in store-merging.  Even if all the loads have
> the same vuse, they aren't necessarily in the same basic block.
> If they are in multiple bbs, previously we've picked one arbitrarily
> (the load corresponding to the first store in the group); now we pick
> at least the last basic block.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

> It might not be good enough: if some program has
>   int x = q[0]; foo (x); int y = q[1]; p[0] = x; p[1] = y;
> and foo conditionally exits or aborts or externally throws or loops
> forever in case q[1] might not be mapped, then even when both loads
> are in the same bb we might put the larger load at the first load
> rather than the second.  I think we'd need to compute uids (perhaps
> lazily) and compare which stmt comes last.  Thoughts on that?

What we need to compute is whether we can hoist/sink loads to the
insert location.  Ideally we'd start by hoisting loads upwards
and if we hit a road-block try sinking the other loads (might work
in case of EH / looping forever).

One slight complication is that AFAIK an externally throwing or
endlessly looping (or memory unmapping) function doesn't necessarily
have a VDEF, not even a VUSE.  So we might not catch all stmts that
serve as a barrier by walking the VUSE -> VDEF chain.

We _might_ want to change rules here when to force a VDEF for
simplicity.

With -fnon-call-exceptions even a division might throw externally,
so this rule adjustment might not hold.  But OTOH I don't easily see
how that serves as a barrier (unless the combined load is unaligned,
in which case later loads might access unmapped memory?)

Richard.

> 2017-11-20  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/83047
>       * gimple-ssa-store-merging.c
>       (imm_store_chain_info::output_merged_store): If the loads with the
>       same vuse are in different basic blocks, for load_gsi pick a load
>       location that is dominated by the other loads.
> 
>       * gcc.dg/pr83047.c: New test.
> 
> --- gcc/gimple-ssa-store-merging.c.jj 2017-11-17 08:40:25.000000000 +0100
> +++ gcc/gimple-ssa-store-merging.c    2017-11-20 10:37:40.429859947 +0100
> @@ -1857,7 +1857,30 @@ imm_store_chain_info::output_merged_stor
>        store_immediate_info *infol = group->stores.last ();
>        if (gimple_vuse (op.stmt) == gimple_vuse (infol->ops[j].stmt))
>       {
> -       load_gsi[j] = gsi_for_stmt (op.stmt);
> +       /* We can't pick the location randomly; while we've verified
> +          all the loads have the same vuse, they can be still in different
> +          basic blocks and we need to pick the one from the last bb:
> +          int x = q[0];
> +          if (x == N) return;
> +          int y = q[1];
> +          p[0] = x;
> +          p[1] = y;
> +          otherwise if we put the wider load at the q[0] load, we might
> +          segfault if q[1] is not mapped.  */
> +       basic_block bb = gimple_bb (op.stmt);
> +       gimple *ostmt = op.stmt;
> +       store_immediate_info *info;
> +       FOR_EACH_VEC_ELT (group->stores, i, info)
> +         {
> +           gimple *tstmt = info->ops[j].stmt;
> +           basic_block tbb = gimple_bb (tstmt);
> +           if (dominated_by_p (CDI_DOMINATORS, tbb, bb))
> +             {
> +               ostmt = tstmt;
> +               bb = tbb;
> +             }
> +         }
> +       load_gsi[j] = gsi_for_stmt (ostmt);
>         load_addr[j]
>           = force_gimple_operand_1 (unshare_expr (op.base_addr),
>                                     &load_seq[j], is_gimple_mem_ref_addr,
> --- gcc/testsuite/gcc.dg/pr83047.c.jj 2017-11-20 10:19:26.612657065 +0100
> +++ gcc/testsuite/gcc.dg/pr83047.c    2017-11-20 10:24:15.000000000 +0100
> @@ -0,0 +1,58 @@
> +/* PR tree-optimization/83047 */
> +/* { dg-do run { target mmap } } */
> +/* { dg-options "-O2" } */
> +
> +#include <stddef.h>
> +#include <stdio.h>
> +#include <sys/mman.h>
> +#ifndef MAP_ANONYMOUS
> +#define MAP_ANONYMOUS MAP_ANON
> +#endif
> +#ifndef MAP_ANON
> +#define MAP_ANON 0
> +#endif
> +#ifndef MAP_FAILED
> +#define MAP_FAILED ((void *)-1)
> +#endif
> +#include <stdlib.h>
> +
> +__attribute__((noipa)) void
> +foo (char *p, char *q, int r)
> +{
> +  char a = q[0];
> +  if (r || a == '\0')
> +    return;
> +  char b = q[1];
> +  p[0] = a;
> +  p[1] = b;
> +}
> +
> +int
> +main ()
> +{
> +  char *p = mmap (NULL, 131072, PROT_READ | PROT_WRITE,
> +               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +  if (p == MAP_FAILED)
> +    return 0;
> +  if (munmap (p + 65536, 65536) < 0)
> +    return 0;
> +  p[0] = 'c';
> +  p[1] = 'd';
> +  p[65536 - 2] = 'a';
> +  p[65536 - 1] = 'b';
> +  volatile int r = 1;
> +  foo (p, p + 65536 - 2, r);
> +  if (p[0] != 'c' || p[1] != 'd')
> +    abort ();
> +  r = 0;
> +  foo (p, p + 65536 - 2, r);
> +  if (p[0] != 'a' || p[1] != 'b')
> +    abort ();
> +  p[0] = 'e';
> +  p[1] = 'f';
> +  r = 1;
> +  foo (p, p + 65536 - 1, r);
> +  if (p[0] != 'e' || p[1] != 'f')
> +    abort ();
> +  return 0;
> +}
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
