On Tue, Jan 28, 2020 at 10:49 AM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Richard Sandiford <richard.sandif...@arm.com> writes: > > predcom has the following code to stop one rogue load from > > interfering with other store-load opportunities: > > > > /* If A is read and B write or vice versa and there is unsuitable > > dependence, instead of merging both components into a component > > that will certainly not pass suitable_component_p, just put the > > read into bad component, perhaps at least the write together with > > all the other data refs in it's component will be optimizable. */ > > > > But when store-store commoning was added later, this had the effect > > of ignoring loads that occur between two candidate stores. > > > > There is code further up to handle loads and stores with unknown > > dependences: > > > > /* Don't do store elimination if there is any unknown dependence for > > any store data reference. */ > > if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb)) > > && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know > > || DDR_NUM_DIST_VECTS (ddr) == 0)) > > eliminate_store_p = false; > > > > But the store-load code above skips loads for *known* dependences > > if (a) the load has already been marked "bad" or (b) the data-ref > > machinery knows the dependence distance, but determine_offsets > > can't handle the combination. > > > > (a) happens to be the problem in the testcase, but a different > > sequence could have given (b) instead. We have writes to individual > > fields of a structure and reads from the whole structure. Since > > determine_offsets requires the types to be the same, it returns false > > for each such read/write combination. > > > > This patch records which components have had loads removed and > > prevents store-store commoning for them. It's a bit too pessimistic, > > since there shouldn't be a problem if a "bad" load dominates all stores > > in a component. But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p > > here and (b) the handling for that case would probably need to be > > removed again if we handled more exotic cases in future. > > Forgot to add: > > Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK for master > and eventually for backports?
OK for master and backports. Richard. > Thanks, > Richard > > > > > 2020-01-28 Richard Sandiford <richard.sandif...@arm.com> > > > > gcc/ > > PR tree-optimization/93434 > > * tree-predcom.c (split_data_refs_to_components): Record which > > components have had aliasing loads removed. Prevent store-store > > commoning for all such components. > > > > gcc/testsuite/ > > PR tree-optimization/93434 > > * gcc.c-torture/execute/pr93434.c: New test. > > --- > > gcc/testsuite/gcc.c-torture/execute/pr93434.c | 36 +++++++++++++++++++ > > gcc/tree-predcom.c | 24 +++++++++++-- > > 2 files changed, 58 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93434.c > > > > diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c > > index 047c7fa9583..d2dcfe7f42d 100644 > > --- a/gcc/tree-predcom.c > > +++ b/gcc/tree-predcom.c > > @@ -767,6 +767,7 @@ split_data_refs_to_components (class loop *loop, > > /* Don't do store elimination if loop has multiple exit edges. */ > > bool eliminate_store_p = single_exit (loop) != NULL; > > basic_block last_always_executed = last_always_executed_block (loop); > > + auto_bitmap no_store_store_comps; > > > > FOR_EACH_VEC_ELT (datarefs, i, dr) > > { > > @@ -838,9 +839,13 @@ split_data_refs_to_components (class loop *loop, > > else if (DR_IS_READ (dra) && ib != bad) > > { > > if (ia == bad) > > - continue; > > + { > > + bitmap_set_bit (no_store_store_comps, ib); > > + continue; > > + } > > else if (!determine_offset (dra, drb, &dummy_off)) > > { > > + bitmap_set_bit (no_store_store_comps, ib); > > merge_comps (comp_father, comp_size, bad, ia); > > continue; > > } > > @@ -848,9 +853,13 @@ split_data_refs_to_components (class loop *loop, > > else if (DR_IS_READ (drb) && ia != bad) > > { > > if (ib == bad) > > - continue; > > + { > > + bitmap_set_bit (no_store_store_comps, ia); > > + continue; > > + } > > else if (!determine_offset (dra, drb, &dummy_off)) > > { > > + bitmap_set_bit (no_store_store_comps, ia); > > merge_comps (comp_father, comp_size, bad, ib); > > continue; > > } > > @@ -908,6 +917,17 @@ split_data_refs_to_components (class loop *loop, > > comp->refs.quick_push (dataref); > > } > > > > + if (eliminate_store_p) > > + { > > + bitmap_iterator bi; > > + EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi) > > + { > > + ca = component_of (comp_father, ia); > > + if (ca != bad) > > + comps[ca]->eliminate_store_p = false; > > + } > > + } > > + > > for (i = 0; i < n; i++) > > { > > comp = comps[i]; > > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr93434.c > > b/gcc/testsuite/gcc.c-torture/execute/pr93434.c > > new file mode 100644 > > index 00000000000..e786252794b > > --- /dev/null > > +++ b/gcc/testsuite/gcc.c-torture/execute/pr93434.c > > @@ -0,0 +1,36 @@ > > +typedef struct creal_T { > > + double re; > > + double im; > > +} creal_T; > > + > > +#define N 16 > > +int main() { > > + int k; > > + int i; > > + int j; > > + creal_T t2[N]; > > + double inval; > > + > > + inval = 1.0; > > + for (j = 0; j < N; ++j) { > > + t2[j].re = 0; > > + t2[j].im = 0; > > + } > > + > > + for (j = 0; j < N/4; j++) { > > + i = j * 4; > > + t2[i].re = inval; > > + t2[i].im = inval; > > + k = i + 3; > > + t2[k].re = inval; > > + t2[k].im = inval; > > + t2[i] = t2[k]; > > + t2[k].re = inval; > > + } > > + > > + for (i = 0; i < 2; ++i) > > + if (t2[i].re != !i || t2[i].im != !i) > > + __builtin_abort (); > > + > > + return 0; > > +}