On Mon, 20 Nov 2017, Jakub Jelinek wrote:

> Hi!
>
> This is something the bswap pass has been already doing, but not the
> new load handling code in store-merging.  If all the loads have the
> same vuse, it still doesn't mean they are necessarily in the same
> basic block.  If they are in multiple bbs, previously we've chosen
> randomly (well, from the load corresponding to the first store in the
> group), now we pick at least the last basic block.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Ok.

> It might not be good enough, if some program has
>
>   int x = q[0];
>   foo (x);
>   int y = q[1];
>   p[0] = x;
>   p[1] = y;
>
> and foo conditionally exits or aborts or externally throws or loops
> forever in case q[1] might not be mapped, then even when both loads
> are in the same bb we might put the larger load on the first load
> rather than the second.  I think we'd need to compute uids (perhaps
> lazily) and compare which stmt comes last.  Thoughts on that?

What we need to compute is whether we can hoist/sink loads to the
insert location.  Ideally we'd start by hoisting loads upwards and, if
we hit a road-block, try sinking the other loads instead (that might
work in the EH / looping-forever case).

One slight complication is that AFAIK an externally throwing or
endlessly looping (or memory unmapping) function doesn't necessarily
have a VDEF, or even a VUSE.  So we might not catch all stmts that
serve as a barrier by walking the VUSE -> VDEF chain.  We _might_ want
to change the rules for when to force a VDEF here, for simplicity.

With -fnon-call-exceptions even a division might throw externally, so
this rule adjustment might not hold there.  But OTOH I don't easily see
how that serves as a barrier (unless the combined load is unaligned, in
which case later loads might access unmapped memory?).
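For concreteness, a hypothetical same-bb variant of the example (baz
and the x/y operands are made up, not from the patch or testcase),
compiled with -fnon-call-exceptions:

  /* The division carries neither a VDEF nor a VUSE, so walking the
     VUSE -> VDEF chain between the two loads does not see it, yet
     with -fnon-call-exceptions it may throw externally.  Merging the
     q[0] and q[1] loads into one wider load at the q[0] site would
     read q[1] before the potential throw.  */
  int
  baz (char *p, const char *q, int x, int y)
  {
    char a = q[0];
    int t = x / y;	/* may throw externally; no virtual operands */
    char b = q[1];
    p[0] = a;
    p[1] = b;
    return t;
  }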
Richard.

> 2017-11-20  Jakub Jelinek  <ja...@redhat.com>
>
> 	PR tree-optimization/83047
> 	* gimple-ssa-store-merging.c
> 	(imm_store_chain_info::output_merged_store): If the loads with the
> 	same vuse are in different basic blocks, for load_gsi pick a load
> 	location that is dominated by the other loads.
>
> 	* gcc.dg/pr83047.c: New test.
>
> --- gcc/gimple-ssa-store-merging.c.jj	2017-11-17 08:40:25.000000000 +0100
> +++ gcc/gimple-ssa-store-merging.c	2017-11-20 10:37:40.429859947 +0100
> @@ -1857,7 +1857,30 @@ imm_store_chain_info::output_merged_stor
>        store_immediate_info *infol = group->stores.last ();
>        if (gimple_vuse (op.stmt) == gimple_vuse (infol->ops[j].stmt))
>  	{
> -	  load_gsi[j] = gsi_for_stmt (op.stmt);
> +	  /* We can't pick the location randomly; while we've verified
> +	     all the loads have the same vuse, they can be still in
> +	     different basic blocks and we need to pick the one from
> +	     the last bb:
> +	       int x = q[0];
> +	       if (x == N) return;
> +	       int y = q[1];
> +	       p[0] = x;
> +	       p[1] = y;
> +	     otherwise if we put the wider load at the q[0] load, we
> +	     might segfault if q[1] is not mapped.  */
> +	  basic_block bb = gimple_bb (op.stmt);
> +	  gimple *ostmt = op.stmt;
> +	  store_immediate_info *info;
> +	  FOR_EACH_VEC_ELT (group->stores, i, info)
> +	    {
> +	      gimple *tstmt = info->ops[j].stmt;
> +	      basic_block tbb = gimple_bb (tstmt);
> +	      if (dominated_by_p (CDI_DOMINATORS, tbb, bb))
> +		{
> +		  ostmt = tstmt;
> +		  bb = tbb;
> +		}
> +	    }
> +	  load_gsi[j] = gsi_for_stmt (ostmt);
>  	  load_addr[j]
>  	    = force_gimple_operand_1 (unshare_expr (op.base_addr),
>  				      &load_seq[j], is_gimple_mem_ref_addr,
> --- gcc/testsuite/gcc.dg/pr83047.c.jj	2017-11-20 10:19:26.612657065 +0100
> +++ gcc/testsuite/gcc.dg/pr83047.c	2017-11-20 10:24:15.000000000 +0100
> @@ -0,0 +1,58 @@
> +/* PR tree-optimization/83047 */
> +/* { dg-do run { target mmap } } */
> +/* { dg-options "-O2" } */
> +
> +#include <stddef.h>
> +#include <stdio.h>
> +#include <sys/mman.h>
> +#ifndef MAP_ANONYMOUS
> +#define MAP_ANONYMOUS MAP_ANON
> +#endif
> +#ifndef MAP_ANON
> +#define MAP_ANON 0
> +#endif
> +#ifndef MAP_FAILED
> +#define MAP_FAILED ((void *)-1)
> +#endif
> +#include <stdlib.h>
> +
> +__attribute__((noipa)) void
> +foo (char *p, char *q, int r)
> +{
> +  char a = q[0];
> +  if (r || a == '\0')
> +    return;
> +  char b = q[1];
> +  p[0] = a;
> +  p[1] = b;
> +}
> +
> +int
> +main ()
> +{
> +  char *p = mmap (NULL, 131072, PROT_READ | PROT_WRITE,
> +		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +  if (p == MAP_FAILED)
> +    return 0;
> +  if (munmap (p + 65536, 65536) < 0)
> +    return 0;
> +  p[0] = 'c';
> +  p[1] = 'd';
> +  p[65536 - 2] = 'a';
> +  p[65536 - 1] = 'b';
> +  volatile int r = 1;
> +  foo (p, p + 65536 - 2, r);
> +  if (p[0] != 'c' || p[1] != 'd')
> +    abort ();
> +  r = 0;
> +  foo (p, p + 65536 - 2, r);
> +  if (p[0] != 'a' || p[1] != 'b')
> +    abort ();
> +  p[0] = 'e';
> +  p[1] = 'f';
> +  r = 1;
> +  foo (p, p + 65536 - 1, r);
> +  if (p[0] != 'e' || p[1] != 'f')
> +    abort ();
> +  return 0;
> +}
>
> 	Jakub

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nuernberg)