On Wed, 22 Sep 2021, Maxim Kuvyrkov wrote:

> Hi Richard,
> 
> We have improved reporting of our benchmarking CI.  Any feedback on the 
> below report format?

Interestingly I saw no -fopt-info-loop report change for 482.sphinx3
on x86_64.

The change improved infrastructure, so please file a bug report and try
to point to a specific bad optimization decision. The infrastructure
is (hopefully) not to blame here; more likely a consumer of it is
making a cost decision that is off.

Richard.

> Regards,
> 
> --
> Maxim Kuvyrkov
> https://www.linaro.org
> 
> > On 22 Sep 2021, at 01:58, ci_not...@linaro.org wrote:
> > 
> > After gcc commit f92901a508305f291fcf2acae0825379477724de
> > Author: Richard Biener <rguent...@suse.de>
> > 
> >    tree-optimization/65206 - dependence analysis on mixed pointer/array
> > 
> > the following benchmarks slowed down by more than 2%:
> > - 482.sphinx3 slowed down by 4% from 20816 to 21661 perf samples
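
As a quick sanity check of the 4% figure, here is a minimal sketch in C
that computes the relative change between two perf sample counts. The
function name is made up for illustration and is not part of the CI
scripts.

```c
/* Relative change from `before` to `after`, in percent.
   A positive result means more perf samples, i.e. a slowdown.
   Hypothetical helper, not taken from the Linaro CI scripts.  */
static double slowdown_percent(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}
```

With the counts from the report, slowdown_percent(20816, 21661) comes
out at roughly 4.06, which is above the 2% reporting threshold.
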
> > 
> > Below reproducer instructions can be used to re-build both "first_bad" and 
> > "last_good" cross-toolchains used in this bisection.  Naturally, the 
> > scripts will fail when triggering benchmarking jobs if you don't have 
> > access to Linaro TCWG CI.
> > 
> > For your convenience, we have uploaded tarballs with pre-processed source 
> > and assembly files at:
> > - First_bad save-temps: 
> > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-f92901a508305f291fcf2acae0825379477724de/save-temps/
> > - Last_good save-temps: 
> > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-abdf63d782cba82b5ecf264248518cbb065650ed/save-temps/
> > - Baseline save-temps: 
> > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-baseline/save-temps/
> > 
> > Configuration:
> > - Benchmark: SPEC CPU2006
> > - Toolchain: GCC + Glibc + GNU Linker
> > - Version: all components were built from their tip of trunk
> > - Target: aarch64-linux-gnu
> > - Compiler flags: -O3
> > - Hardware: NVidia TX1 4x Cortex-A57
> > 
> > This benchmarking CI is work-in-progress, and we welcome feedback and 
> > suggestions at linaro-toolchain@lists.linaro.org.  Our improvement 
> > plans include adding support for SPEC CPU2017 benchmarks and providing 
> > "perf report/annotate" data behind these reports.
> > 
> > THIS IS THE END OF INTERESTING STUFF.  BELOW ARE LINKS TO BUILDS, 
> > REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
> > 
> > This commit has regressed these CI configurations:
> > - tcwg_bmk_gnu_tx1/gnu-master-aarch64-spec2k6-O3
> > 
> > First_bad build: 
> > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-f92901a508305f291fcf2acae0825379477724de/
> > Last_good build: 
> > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-abdf63d782cba82b5ecf264248518cbb065650ed/
> > Baseline build: 
> > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-baseline/
> > Even more details: 
> > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/
> > 
> > Reproduce builds:
> > <cut>
> > mkdir investigate-gcc-f92901a508305f291fcf2acae0825379477724de
> > cd investigate-gcc-f92901a508305f291fcf2acae0825379477724de
> > 
> > # Fetch scripts
> > git clone https://git.linaro.org/toolchain/jenkins-scripts
> > 
> > # Fetch manifests and test.sh script
> > mkdir -p artifacts/manifests
> > curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/manifests/build-baseline.sh --fail
> > curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/manifests/build-parameters.sh --fail
> > curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/test.sh --fail
> > chmod +x artifacts/test.sh
> > 
> > # Reproduce the baseline build (build all pre-requisites)
> > ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
> > 
> > # Save baseline build state (which is then restored in artifacts/test.sh)
> > mkdir -p ./bisect
> > rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
> > 
> > cd gcc
> > 
> > # Reproduce first_bad build
> > git checkout --detach f92901a508305f291fcf2acae0825379477724de
> > ../artifacts/test.sh
> > 
> > # Reproduce last_good build
> > git checkout --detach abdf63d782cba82b5ecf264248518cbb065650ed
> > ../artifacts/test.sh
> > 
> > cd ..
> > </cut>
> > 
> > Full commit (up to 1000 lines):
> > <cut>
> > commit f92901a508305f291fcf2acae0825379477724de
> > Author: Richard Biener <rguent...@suse.de>
> > Date:   Wed Sep 8 14:42:31 2021 +0200
> > 
> >    tree-optimization/65206 - dependence analysis on mixed pointer/array
> > 
> >    This adds the capability to analyze the dependence of mixed
> >    pointer/array accesses.  The example is from where using a masked
> >    load/store creates the pointer-based access when an otherwise
> >    unconditional access is array based.  Other examples would include
> >    accesses to an array mixed with accesses from inlined helpers
> >    that work on pointers.
> > 
> >    The idea is quite simple and old - analyze the data-ref indices
> >    as if the reference was pointer-based.  The following change does
> >    this by changing dr_analyze_indices to work on the indices
> >    sub-structure and storing an alternate indices substructure in
> >    each data reference.  That alternate set of indices is analyzed
> >    lazily by initialize_data_dependence_relation when it fails to
> >    match-up the main set of indices of two data references.
> >    initialize_data_dependence_relation is refactored into a head
> >    and a tail worker and changed to work on one of the indices
> >    structures and thus away from using DR_* access macros which
> >    continue to reference the main indices substructure.
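
To make the "mixed pointer/array" case concrete, here is a small sketch
of the second kind of example mentioned above: an array access mixed
with an access from an inlined helper that works on a pointer. The
helper and function names are hypothetical, and whether the store
actually stays pointer-based after inlining depends on how the compiler
folds the address, so treat this as an illustration of the source
pattern only, not as a reproducer for the regression.

```c
#define N 1024

double a[N], b[N];

/* Hypothetical helper that works on a pointer.  Once inlined into the
   loop below, its store is expressed through P rather than as a[i].  */
static inline void store_through(double *p, double v)
{
    *p = v;
}

/* The loads of a[i] and b[i] are array-based, while the store to a[i]
   goes through the pointer-based helper.  Proving that the load and
   the store of a[i] touch the same element in each iteration requires
   relating an array access to a pointer access.  */
void accumulate_below(double limit)
{
    for (int i = 0; i < N; ++i)
        if (b[i] < limit)
            store_through(&a[i], a[i] + b[i]);
}
```

Calling accumulate_below(3.) on this sketch performs the same
conditional update as the loop in the new gcc.dg/vect/pr65206.c
testcase, only with the store routed through a helper.
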
> > 
> >    There are quite some vectorization and loop distribution opportunities
> >    unleashed in SPEC CPU 2017, notably 520.omnetpp_r, 548.exchange2_r,
> >    510.parest_r, 511.povray_r, 521.wrf_r, 526.blender_r, 527.cam4_r and
> >    544.nab_r see amendments in what they report with -fopt-info-loop while
> >    the rest of the specrate set sees no changes there.  Measuring runtime
> >    for the set where changes were reported reveals nothing off-noise
> >    besides 511.povray_r which seems to regress slightly for me
> >    (on a Zen2 machine with -Ofast -march=native).
> > 
> >    2021-09-08  Richard Biener  <rguent...@suse.de>
> > 
> >            PR tree-optimization/65206
> >            * tree-data-ref.h (struct data_reference): Add alt_indices,
> >            order it last.
> >            * tree-data-ref.c (free_data_ref): Release alt_indices.
> >            (dr_analyze_indices): Work on struct indices and get DR_REF as tree.
> >            (create_data_ref): Adjust.
> >            (initialize_data_dependence_relation): Split into head
> >            and tail.  When the base objects fail to match up try
> >            again with pointer-based analysis of indices.
> >            * tree-vectorizer.c (vec_info_shared::check_datarefs): Do
> >            not compare the lazily computed alternate set of indices.
> > 
> >            * gcc.dg/torture/20210916.c: New testcase.
> >            * gcc.dg/vect/pr65206.c: Likewise.
> > ---
> > gcc/testsuite/gcc.dg/torture/20210916.c |  20 ++++
> > gcc/testsuite/gcc.dg/vect/pr65206.c     |  22 ++++
> > gcc/tree-data-ref.c                     | 174 +++++++++++++++++++++-----------
> > gcc/tree-data-ref.h                     |   9 +-
> > gcc/tree-vectorizer.c                   |   3 +-
> > 5 files changed, 168 insertions(+), 60 deletions(-)
> > 
> > diff --git a/gcc/testsuite/gcc.dg/torture/20210916.c b/gcc/testsuite/gcc.dg/torture/20210916.c
> > new file mode 100644
> > index 00000000000..0ea6d45e463
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/torture/20210916.c
> > @@ -0,0 +1,20 @@
> > +/* { dg-do compile } */
> > +
> > +typedef union tree_node *tree;
> > +struct tree_base {
> > +  unsigned : 1;
> > +  unsigned lang_flag_2 : 1;
> > +};
> > +struct tree_type {
> > +  tree main_variant;
> > +};
> > +union tree_node {
> > +  struct tree_base base;
> > +  struct tree_type type;
> > +};
> > +tree finish_struct_t, finish_struct_x;
> > +void finish_struct()
> > +{
> > +  for (; finish_struct_t->type.main_variant;)
> > +    finish_struct_x->base.lang_flag_2 = 0;
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/vect/pr65206.c b/gcc/testsuite/gcc.dg/vect/pr65206.c
> > new file mode 100644
> > index 00000000000..3b6262622c0
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/pr65206.c
> > @@ -0,0 +1,22 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target vect_double } */
> > +/* { dg-additional-options "-fno-trapping-math -fno-allow-store-data-races" } */
> > +/* { dg-additional-options "-mavx" { target avx } } */
> > +
> > +#define N 1024
> > +
> > +double a[N], b[N];
> > +
> > +void foo ()
> > +{
> > +  for (int i = 0; i < N; ++i)
> > +    if (b[i] < 3.)
> > +      a[i] += b[i];
> > +}
> > +
> > +/* We get a .MASK_STORE because while the load of a[i] does not trap
> > +   the store would introduce store data races.  Make sure we still
> > +   can handle the data dependence with zero distance.  */
> > +
> > +/* { dg-final { scan-tree-dump-not "versioning for alias required" "vect" { target { vect_masked_store || avx } } } } */
> > +/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" { target { vect_masked_store || avx } } } } */
> > diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
> > index e061baa7c20..18307a554fc 100644
> > --- a/gcc/tree-data-ref.c
> > +++ b/gcc/tree-data-ref.c
> > @@ -99,6 +99,7 @@ along with GCC; see the file COPYING3.  If not see
> > #include "internal-fn.h"
> > #include "vr-values.h"
> > #include "range-op.h"
> > +#include "tree-ssa-loop-ivopts.h"
> > 
> > static struct datadep_stats
> > {
> > @@ -1300,22 +1301,18 @@ base_supports_access_fn_components_p (tree base)
> >    DR, analyzed in LOOP and instantiated before NEST.  */
> > 
> > static void
> > -dr_analyze_indices (struct data_reference *dr, edge nest, loop_p loop)
> > +dr_analyze_indices (struct indices *dri, tree ref, edge nest, loop_p loop)
> > {
> > -  vec<tree> access_fns = vNULL;
> > -  tree ref, op;
> > -  tree base, off, access_fn;
> > -
> >   /* If analyzing a basic-block there are no indices to analyze
> >      and thus no access functions.  */
> >   if (!nest)
> >     {
> > -      DR_BASE_OBJECT (dr) = DR_REF (dr);
> > -      DR_ACCESS_FNS (dr).create (0);
> > +      dri->base_object = ref;
> > +      dri->access_fns.create (0);
> >       return;
> >     }
> > 
> > -  ref = DR_REF (dr);
> > +  vec<tree> access_fns = vNULL;
> > 
> >   /* REALPART_EXPR and IMAGPART_EXPR can be handled like accesses
> >      into a two element array with a constant index.  The base is
> > @@ -1338,8 +1335,8 @@ dr_analyze_indices (struct data_reference *dr, edge nest, loop_p loop)
> >     {
> >       if (TREE_CODE (ref) == ARRAY_REF)
> >     {
> > -     op = TREE_OPERAND (ref, 1);
> > -     access_fn = analyze_scalar_evolution (loop, op);
> > +     tree op = TREE_OPERAND (ref, 1);
> > +     tree access_fn = analyze_scalar_evolution (loop, op);
> >       access_fn = instantiate_scev (nest, loop, access_fn);
> >       access_fns.safe_push (access_fn);
> >     }
> > @@ -1370,16 +1367,16 @@ dr_analyze_indices (struct data_reference *dr, edge nest, loop_p loop)
> >      analyzed nest, add it as an additional independent access-function.  */
> >   if (TREE_CODE (ref) == MEM_REF)
> >     {
> > -      op = TREE_OPERAND (ref, 0);
> > -      access_fn = analyze_scalar_evolution (loop, op);
> > +      tree op = TREE_OPERAND (ref, 0);
> > +      tree access_fn = analyze_scalar_evolution (loop, op);
> >       access_fn = instantiate_scev (nest, loop, access_fn);
> >       if (TREE_CODE (access_fn) == POLYNOMIAL_CHREC)
> >     {
> > -     tree orig_type;
> >       tree memoff = TREE_OPERAND (ref, 1);
> > -     base = initial_condition (access_fn);
> > -     orig_type = TREE_TYPE (base);
> > +     tree base = initial_condition (access_fn);
> > +     tree orig_type = TREE_TYPE (base);
> >       STRIP_USELESS_TYPE_CONVERSION (base);
> > +     tree off;
> >       split_constant_offset (base, &base, &off);
> >       STRIP_USELESS_TYPE_CONVERSION (base);
> >       /* Fold the MEM_REF offset into the evolutions initial
> > @@ -1424,7 +1421,7 @@ dr_analyze_indices (struct data_reference *dr, edge nest, loop_p loop)
> >                              base, memoff);
> >       MR_DEPENDENCE_CLIQUE (ref) = MR_DEPENDENCE_CLIQUE (old);
> >       MR_DEPENDENCE_BASE (ref) = MR_DEPENDENCE_BASE (old);
> > -     DR_UNCONSTRAINED_BASE (dr) = true;
> > +     dri->unconstrained_base = true;
> >       access_fns.safe_push (access_fn);
> >     }
> >     }
> > @@ -1436,8 +1433,8 @@ dr_analyze_indices (struct data_reference *dr, edge nest, loop_p loop)
> >                 build_int_cst (reference_alias_ptr_type (ref), 0));
> >     }
> > 
> > -  DR_BASE_OBJECT (dr) = ref;
> > -  DR_ACCESS_FNS (dr) = access_fns;
> > +  dri->base_object = ref;
> > +  dri->access_fns = access_fns;
> > }
> > 
> > /* Extracts the alias analysis information from the memory reference DR.  */
> > @@ -1463,6 +1460,8 @@ void
> > free_data_ref (data_reference_p dr)
> > {
> >   DR_ACCESS_FNS (dr).release ();
> > +  if (dr->alt_indices.base_object)
> > +    dr->alt_indices.access_fns.release ();
> >   free (dr);
> > }
> > 
> > @@ -1497,7 +1496,7 @@ create_data_ref (edge nest, loop_p loop, tree memref, gimple *stmt,
> > 
> >   dr_analyze_innermost (&DR_INNERMOST (dr), memref,
> >                     nest != NULL ? loop : NULL, stmt);
> > -  dr_analyze_indices (dr, nest, loop);
> > +  dr_analyze_indices (&dr->indices, DR_REF (dr), nest, loop);
> >   dr_analyze_alias (dr);
> > 
> >   if (dump_file && (dump_flags & TDF_DETAILS))
> > @@ -3066,41 +3065,30 @@ access_fn_components_comparable_p (tree ref_a, tree ref_b)
> >                          TREE_TYPE (TREE_OPERAND (ref_b, 0)));
> > }
> > 
> > -/* Initialize a data dependence relation between data accesses A and
> > -   B.  NB_LOOPS is the number of loops surrounding the references: the
> > -   size of the classic distance/direction vectors.  */
> > +/* Initialize a data dependence relation RES in LOOP_NEST.  USE_ALT_INDICES
> > +   is true when the main indices of A and B were not comparable so we try again
> > +   with alternate indices computed on an indirect reference.  */
> > 
> > struct data_dependence_relation *
> > -initialize_data_dependence_relation (struct data_reference *a,
> > -                                struct data_reference *b,
> > -                                vec<loop_p> loop_nest)
> > +initialize_data_dependence_relation (struct data_dependence_relation *res,
> > +                                vec<loop_p> loop_nest,
> > +                                bool use_alt_indices)
> > {
> > -  struct data_dependence_relation *res;
> > +  struct data_reference *a = DDR_A (res);
> > +  struct data_reference *b = DDR_B (res);
> >   unsigned int i;
> > 
> > -  res = XCNEW (struct data_dependence_relation);
> > -  DDR_A (res) = a;
> > -  DDR_B (res) = b;
> > -  DDR_LOOP_NEST (res).create (0);
> > -  DDR_SUBSCRIPTS (res).create (0);
> > -  DDR_DIR_VECTS (res).create (0);
> > -  DDR_DIST_VECTS (res).create (0);
> > -
> > -  if (a == NULL || b == NULL)
> > +  struct indices *indices_a = &a->indices;
> > +  struct indices *indices_b = &b->indices;
> > +  if (use_alt_indices)
> >     {
> > -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> > -      return res;
> > +      if (TREE_CODE (DR_REF (a)) != MEM_REF)
> > +   indices_a = &a->alt_indices;
> > +      if (TREE_CODE (DR_REF (b)) != MEM_REF)
> > +   indices_b = &b->alt_indices;
> >     }
> > -
> > -  /* If the data references do not alias, then they are independent.  */
> > -  if (!dr_may_alias_p (a, b, loop_nest.exists () ? loop_nest[0] : NULL))
> > -    {
> > -      DDR_ARE_DEPENDENT (res) = chrec_known;
> > -      return res;
> > -    }
> > -
> > -  unsigned int num_dimensions_a = DR_NUM_DIMENSIONS (a);
> > -  unsigned int num_dimensions_b = DR_NUM_DIMENSIONS (b);
> > +  unsigned int num_dimensions_a = indices_a->access_fns.length ();
> > +  unsigned int num_dimensions_b = indices_b->access_fns.length ();
> >   if (num_dimensions_a == 0 || num_dimensions_b == 0)
> >     {
> >       DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> > @@ -3125,9 +3113,9 @@ initialize_data_dependence_relation (struct data_reference *a,
> > 
> >      the a and b accesses have a single ARRAY_REF component reference [0]
> >      but have two subscripts.  */
> > -  if (DR_UNCONSTRAINED_BASE (a))
> > +  if (indices_a->unconstrained_base)
> >     num_dimensions_a -= 1;
> > -  if (DR_UNCONSTRAINED_BASE (b))
> > +  if (indices_b->unconstrained_base)
> >     num_dimensions_b -= 1;
> > 
> >   /* These structures describe sequences of component references in
> > @@ -3210,6 +3198,10 @@ initialize_data_dependence_relation (struct data_reference *a,
> >         B: [3, 4]  (i.e. s.e)  */
> >   while (index_a < num_dimensions_a && index_b < num_dimensions_b)
> >     {
> > +      /* The alternate indices form always has a single dimension
> > +    with unconstrained base.  */
> > +      gcc_assert (!use_alt_indices);
> > +
> >       /* REF_A and REF_B must be one of the component access types
> >      allowed by dr_analyze_indices.  */
> >       gcc_checking_assert (access_fn_component_p (ref_a));
> > @@ -3280,11 +3272,12 @@ initialize_data_dependence_relation (struct data_reference *a,
> >   /* See whether FULL_SEQ ends at the base and whether the two bases
> >      are equal.  We do not care about TBAA or alignment info so we can
> >      use OEP_ADDRESS_OF to avoid false negatives.  */
> > -  tree base_a = DR_BASE_OBJECT (a);
> > -  tree base_b = DR_BASE_OBJECT (b);
> > +  tree base_a = indices_a->base_object;
> > +  tree base_b = indices_b->base_object;
> >   bool same_base_p = (full_seq.start_a + full_seq.length == num_dimensions_a
> >                   && full_seq.start_b + full_seq.length == num_dimensions_b
> > -                 && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b)
> > +                 && (indices_a->unconstrained_base
> > +                     == indices_b->unconstrained_base)
> >                   && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF)
> >                   && (types_compatible_p (TREE_TYPE (base_a),
> >                                           TREE_TYPE (base_b))
> > @@ -3323,7 +3316,7 @@ initialize_data_dependence_relation (struct data_reference *a,
> >      both lvalues are distinct from the object's declared type.  */
> >   if (same_base_p)
> >     {
> > -      if (DR_UNCONSTRAINED_BASE (a))
> > +      if (indices_a->unconstrained_base)
> >     full_seq.length += 1;
> >     }
> >   else
> > @@ -3332,8 +3325,41 @@ initialize_data_dependence_relation (struct data_reference *a,
> >   /* Punt if we didn't find a suitable sequence.  */
> >   if (full_seq.length == 0)
> >     {
> > -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> > -      return res;
> > +      if (use_alt_indices
> > +     || (TREE_CODE (DR_REF (a)) == MEM_REF
> > +         && TREE_CODE (DR_REF (b)) == MEM_REF)
> > +     || may_be_nonaddressable_p (DR_REF (a))
> > +     || may_be_nonaddressable_p (DR_REF (b)))
> > +   {
> > +     /* Fully exhausted possibilities.  */
> > +     DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> > +     return res;
> > +   }
> > +
> > +      /* Try evaluating both DRs as dereferences of pointers.  */
> > +      if (!a->alt_indices.base_object
> > +     && TREE_CODE (DR_REF (a)) != MEM_REF)
> > +   {
> > +     tree alt_ref = build2 (MEM_REF, TREE_TYPE (DR_REF (a)),
> > +                            build1 (ADDR_EXPR, ptr_type_node, DR_REF (a)),
> > +                            build_int_cst
> > +                              (reference_alias_ptr_type (DR_REF (a)), 0));
> > +     dr_analyze_indices (&a->alt_indices, alt_ref,
> > +                         loop_preheader_edge (loop_nest[0]),
> > +                         loop_containing_stmt (DR_STMT (a)));
> > +   }
> > +      if (!b->alt_indices.base_object
> > +     && TREE_CODE (DR_REF (b)) != MEM_REF)
> > +   {
> > +     tree alt_ref = build2 (MEM_REF, TREE_TYPE (DR_REF (b)),
> > +                            build1 (ADDR_EXPR, ptr_type_node, DR_REF (b)),
> > +                            build_int_cst
> > +                              (reference_alias_ptr_type (DR_REF (b)), 0));
> > +     dr_analyze_indices (&b->alt_indices, alt_ref,
> > +                         loop_preheader_edge (loop_nest[0]),
> > +                         loop_containing_stmt (DR_STMT (b)));
> > +   }
> > +      return initialize_data_dependence_relation (res, loop_nest, true);
> >     }
> > 
> >   if (!same_base_p)
> > @@ -3381,8 +3407,8 @@ initialize_data_dependence_relation (struct data_reference *a,
> >       struct subscript *subscript;
> > 
> >       subscript = XNEW (struct subscript);
> > -      SUB_ACCESS_FN (subscript, 0) = DR_ACCESS_FN (a, full_seq.start_a + i);
> > -      SUB_ACCESS_FN (subscript, 1) = DR_ACCESS_FN (b, full_seq.start_b + i);
> > +      SUB_ACCESS_FN (subscript, 0) = indices_a->access_fns[full_seq.start_a + i];
> > +      SUB_ACCESS_FN (subscript, 1) = indices_b->access_fns[full_seq.start_b + i];
> >       SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
> >       SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
> >       SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
> > @@ -3393,6 +3419,40 @@ initialize_data_dependence_relation (struct data_reference *a,
> >   return res;
> > }
> > 
> > +/* Initialize a data dependence relation between data accesses A and
> > +   B.  NB_LOOPS is the number of loops surrounding the references: the
> > +   size of the classic distance/direction vectors.  */
> > +
> > +struct data_dependence_relation *
> > +initialize_data_dependence_relation (struct data_reference *a,
> > +                                struct data_reference *b,
> > +                                vec<loop_p> loop_nest)
> > +{
> > +  data_dependence_relation *res = XCNEW (struct data_dependence_relation);
> > +  DDR_A (res) = a;
> > +  DDR_B (res) = b;
> > +  DDR_LOOP_NEST (res).create (0);
> > +  DDR_SUBSCRIPTS (res).create (0);
> > +  DDR_DIR_VECTS (res).create (0);
> > +  DDR_DIST_VECTS (res).create (0);
> > +
> > +  if (a == NULL || b == NULL)
> > +    {
> > +      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> > +      return res;
> > +    }
> > +
> > +  /* If the data references do not alias, then they are independent.  */
> > +  if (!dr_may_alias_p (a, b, loop_nest.exists () ? loop_nest[0] : NULL))
> > +    {
> > +      DDR_ARE_DEPENDENT (res) = chrec_known;
> > +      return res;
> > +    }
> > +
> > +  return initialize_data_dependence_relation (res, loop_nest, false);
> > +}
> > +
> > +
> > /* Frees memory used by the conflict function F.  */
> > 
> > static void
> > diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h
> > index 685f33d85ae..74f579c9f3f 100644
> > --- a/gcc/tree-data-ref.h
> > +++ b/gcc/tree-data-ref.h
> > @@ -166,14 +166,19 @@ struct data_reference
> >      and runs to completion.  */
> >   bool is_conditional_in_stmt;
> > 
> > +  /* Alias information for the data reference.  */
> > +  struct dr_alias alias;
> > +
> >   /* Behavior of the memory reference in the innermost loop.  */
> >   struct innermost_loop_behavior innermost;
> > 
> >   /* Subscripts of this data reference.  */
> >   struct indices indices;
> > 
> > -  /* Alias information for the data reference.  */
> > -  struct dr_alias alias;
> > +  /* Alternate subscripts initialized lazily and used by data-dependence
> > +     analysis only when the main indices of two DRs are not comparable.
> > +     Keep last to keep vec_info_shared::check_datarefs happy.  */
> > +  struct indices alt_indices;
> > };
> > 
> > #define DR_STMT(DR)                (DR)->stmt
> > diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
> > index 3aa3e2a6783..20daa31187d 100644
> > --- a/gcc/tree-vectorizer.c
> > +++ b/gcc/tree-vectorizer.c
> > @@ -507,7 +507,8 @@ vec_info_shared::check_datarefs ()
> >     return;
> >   gcc_assert (datarefs.length () == datarefs_copy.length ());
> >   for (unsigned i = 0; i < datarefs.length (); ++i)
> > -    if (memcmp (&datarefs_copy[i], datarefs[i], sizeof (data_reference)) != 0)
> > +    if (memcmp (&datarefs_copy[i], datarefs[i],
> > +           offsetof (data_reference, alt_indices)) != 0)
> >       gcc_unreachable ();
> > }
> > 
> > </cut>
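
The check_datarefs hunk relies on alt_indices being the last member of
struct data_reference, so that a memcmp limited to
offsetof (data_reference, alt_indices) skips the lazily computed part.
A minimal standalone sketch of that prefix-comparison pattern follows.
The struct, its fields and the function name are made up for
illustration and do not come from GCC.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a lazily filled member kept last.  */
struct record {
    int id;
    int flags;
    /* Filled on demand; a prefix comparison must ignore it.  */
    long cached_hash;
};

/* Compare only the bytes that precede `cached_hash`.
   Returns nonzero when those bytes are identical.  */
static int same_ignoring_cache(const struct record *x,
                               const struct record *y)
{
    return memcmp(x, y, offsetof(struct record, cached_hash)) == 0;
}
```

Note that memcmp also compares any padding bytes inside the compared
prefix, so the sketch assumes both objects were zero-initialized (for
example with memset) before their fields were set.
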
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-toolchain
