On Wed, 22 Sep 2021, Maxim Kuvyrkov wrote: > Hi Richard, > > We have improved reporting of our benchmarking CI. Any feedback on the > below report format?
Interestingly I saw no -fopt-info-loop report change for 482.sphinx3 on x86_64. The change improved infrastructure, please file a bugreport and try pointing to a specific bad optimization decision - the infrastructure is (hopefully) not to blame here but a consumer which might have an off cost decision. Richard. > Regards, > > -- > Maxim Kuvyrkov > https://www.linaro.org > > > On 22 Sep 2021, at 01:58, ci_not...@linaro.org wrote: > > > > After gcc commit f92901a508305f291fcf2acae0825379477724de > > Author: Richard Biener <rguent...@suse.de> > > > > tree-optimization/65206 - dependence analysis on mixed pointer/array > > > > the following benchmarks slowed down by more than 2%: > > - 482.sphinx3 slowed down by 4% from 20816 to 21661 perf samples > > > > Below reproducer instructions can be used to re-build both "first_bad" and > > "last_good" cross-toolchains used in this bisection. Naturally, the > > scripts will fail when triggerring benchmarking jobs if you don't have > > access to Linaro TCWG CI. > > > > For your convenience, we have uploaded tarballs with pre-processed source > > and assembly files at: > > - First_bad save-temps: > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-f92901a508305f291fcf2acae0825379477724de/save-temps/ > > - Last_good save-temps: > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-abdf63d782cba82b5ecf264248518cbb065650ed/save-temps/ > > - Baseline save-temps: > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-baseline/save-temps/ > > > > Configuration: > > - Benchmark: SPEC CPU2006 > > - Toolchain: GCC + Glibc + GNU Linker > > - Version: all components were built from their tip of trunk > > - Target: aarch64-linux-gnu > > - Compiler flags: -O3 > > - Hardware: NVidia TX1 4x Cortex-A57 > > > > This benchmarking CI is work-in-progress, and we welcome feedback and > > suggestions at linaro-toolchain@lists.linaro.org . In our improvement > > plans is to add support for SPEC CPU2017 benchmarks and provide "perf > > report/annotate" data behind these reports. > > > > THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, > > REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT. > > > > This commit has regressed these CI configurations: > > - tcwg_bmk_gnu_tx1/gnu-master-aarch64-spec2k6-O3 > > > > First_bad build: > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-f92901a508305f291fcf2acae0825379477724de/ > > Last_good build: > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-abdf63d782cba82b5ecf264248518cbb065650ed/ > > Baseline build: > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/build-baseline/ > > Even more details: > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/ > > > > Reproduce builds: > > <cut> > > mkdir investigate-gcc-f92901a508305f291fcf2acae0825379477724de > > cd investigate-gcc-f92901a508305f291fcf2acae0825379477724de > > > > # Fetch scripts > > git clone https://git.linaro.org/toolchain/jenkins-scripts > > > > # Fetch manifests and test.sh script > > mkdir -p artifacts/manifests > > curl -o artifacts/manifests/build-baseline.sh > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/manifests/build-baseline.sh > > --fail > > curl -o artifacts/manifests/build-parameters.sh > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/manifests/build-parameters.sh > > --fail > > curl -o artifacts/test.sh > > https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aarch64-spec2k6-O3/34/artifact/artifacts/test.sh > > --fail > > chmod +x artifacts/test.sh > > > > # Reproduce the baseline build (build all pre-requisites) > > ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh > > > > # Save baseline build state (which is then restored in artifacts/test.sh) > > mkdir -p ./bisect > > rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ > > --exclude /gcc/ ./ ./bisect/baseline/ > > > > cd gcc > > > > # Reproduce first_bad build > > git checkout --detach f92901a508305f291fcf2acae0825379477724de > > ../artifacts/test.sh > > > > # Reproduce last_good build > > git checkout --detach abdf63d782cba82b5ecf264248518cbb065650ed > > ../artifacts/test.sh > > > > cd .. > > </cut> > > > > Full commit (up to 1000 lines): > > <cut> > > commit f92901a508305f291fcf2acae0825379477724de > > Author: Richard Biener <rguent...@suse.de> > > Date: Wed Sep 8 14:42:31 2021 +0200 > > > > tree-optimization/65206 - dependence analysis on mixed pointer/array > > > > This adds the capability to analyze the dependence of mixed > > pointer/array accesses. The example is from where using a masked > > load/store creates the pointer-based access when an otherwise > > unconditional access is array based. Other examples would include > > accesses to an array mixed with accesses from inlined helpers > > that work on pointers. > > > > The idea is quite simple and old - analyze the data-ref indices > > as if the reference was pointer-based. The following change does > > this by changing dr_analyze_indices to work on the indices > > sub-structure and storing an alternate indices substructure in > > each data reference. That alternate set of indices is analyzed > > lazily by initialize_data_dependence_relation when it fails to > > match-up the main set of indices of two data references. > > initialize_data_dependence_relation is refactored into a head > > and a tail worker and changed to work on one of the indices > > structures and thus away from using DR_* access macros which > > continue to reference the main indices substructure. > > > > There are quite some vectorization and loop distribution opportunities > > unleashed in SPEC CPU 2017, notably 520.omnetpp_r, 548.exchange2_r, > > 510.parest_r, 511.povray_r, 521.wrf_r, 526.blender_r, 527.cam4_r and > > 544.nab_r see amendments in what they report with -fopt-info-loop while > > the rest of the specrate set sees no changes there. Measuring runtime > > for the set where changes were reported reveals nothing off-noise > > besides 511.povray_r which seems to regress slightly for me > > (on a Zen2 machine with -Ofast -march=native). > > > > 2021-09-08 Richard Biener <rguent...@suse.de> > > > > PR tree-optimization/65206 > > * tree-data-ref.h (struct data_reference): Add alt_indices, > > order it last. > > * tree-data-ref.c (free_data_ref): Release alt_indices. > > (dr_analyze_indices): Work on struct indices and get DR_REF as > > tree. > > (create_data_ref): Adjust. > > (initialize_data_dependence_relation): Split into head > > and tail. When the base objects fail to match up try > > again with pointer-based analysis of indices. > > * tree-vectorizer.c (vec_info_shared::check_datarefs): Do > > not compare the lazily computed alternate set of indices. > > > > * gcc.dg/torture/20210916.c: New testcase. > > * gcc.dg/vect/pr65206.c: Likewise. > > --- > > gcc/testsuite/gcc.dg/torture/20210916.c | 20 ++++ > > gcc/testsuite/gcc.dg/vect/pr65206.c | 22 ++++ > > gcc/tree-data-ref.c | 174 > > +++++++++++++++++++++----------- > > gcc/tree-data-ref.h | 9 +- > > gcc/tree-vectorizer.c | 3 +- > > 5 files changed, 168 insertions(+), 60 deletions(-) > > > > diff --git a/gcc/testsuite/gcc.dg/torture/20210916.c > > b/gcc/testsuite/gcc.dg/torture/20210916.c > > new file mode 100644 > > index 00000000000..0ea6d45e463 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/torture/20210916.c > > @@ -0,0 +1,20 @@ > > +/* { dg-do compile } */ > > + > > +typedef union tree_node *tree; > > +struct tree_base { > > + unsigned : 1; > > + unsigned lang_flag_2 : 1; > > +}; > > +struct tree_type { > > + tree main_variant; > > +}; > > +union tree_node { > > + struct tree_base base; > > + struct tree_type type; > > +}; > > +tree finish_struct_t, finish_struct_x; > > +void finish_struct() > > +{ > > + for (; finish_struct_t->type.main_variant;) > > + finish_struct_x->base.lang_flag_2 = 0; > > +} > > diff --git a/gcc/testsuite/gcc.dg/vect/pr65206.c > > b/gcc/testsuite/gcc.dg/vect/pr65206.c > > new file mode 100644 > > index 00000000000..3b6262622c0 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/vect/pr65206.c > > @@ -0,0 +1,22 @@ > > +/* { dg-do compile } */ > > +/* { dg-require-effective-target vect_double } */ > > +/* { dg-additional-options "-fno-trapping-math > > -fno-allow-store-data-races" } */ > > +/* { dg-additional-options "-mavx" { target avx } } */ > > + > > +#define N 1024 > > + > > +double a[N], b[N]; > > + > > +void foo () > > +{ > > + for (int i = 0; i < N; ++i) > > + if (b[i] < 3.) > > + a[i] += b[i]; > > +} > > + > > +/* We get a .MASK_STORE because while the load of a[i] does not trap > > + the store would introduce store data races. Make sure we still > > + can handle the data dependence with zero distance. */ > > + > > +/* { dg-final { scan-tree-dump-not "versioning for alias required" "vect" > > { target { vect_masked_store || avx } } } } */ > > +/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" { > > target { vect_masked_store || avx } } } } */ > > diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c > > index e061baa7c20..18307a554fc 100644 > > --- a/gcc/tree-data-ref.c > > +++ b/gcc/tree-data-ref.c > > @@ -99,6 +99,7 @@ along with GCC; see the file COPYING3. If not see > > #include "internal-fn.h" > > #include "vr-values.h" > > #include "range-op.h" > > +#include "tree-ssa-loop-ivopts.h" > > > > static struct datadep_stats > > { > > @@ -1300,22 +1301,18 @@ base_supports_access_fn_components_p (tree base) > > DR, analyzed in LOOP and instantiated before NEST. */ > > > > static void > > -dr_analyze_indices (struct data_reference *dr, edge nest, loop_p loop) > > +dr_analyze_indices (struct indices *dri, tree ref, edge nest, loop_p loop) > > { > > - vec<tree> access_fns = vNULL; > > - tree ref, op; > > - tree base, off, access_fn; > > - > > /* If analyzing a basic-block there are no indices to analyze > > and thus no access functions. */ > > if (!nest) > > { > > - DR_BASE_OBJECT (dr) = DR_REF (dr); > > - DR_ACCESS_FNS (dr).create (0); > > + dri->base_object = ref; > > + dri->access_fns.create (0); > > return; > > } > > > > - ref = DR_REF (dr); > > + vec<tree> access_fns = vNULL; > > > > /* REALPART_EXPR and IMAGPART_EXPR can be handled like accesses > > into a two element array with a constant index. The base is > > @@ -1338,8 +1335,8 @@ dr_analyze_indices (struct data_reference *dr, edge > > nest, loop_p loop) > > { > > if (TREE_CODE (ref) == ARRAY_REF) > > { > > - op = TREE_OPERAND (ref, 1); > > - access_fn = analyze_scalar_evolution (loop, op); > > + tree op = TREE_OPERAND (ref, 1); > > + tree access_fn = analyze_scalar_evolution (loop, op); > > access_fn = instantiate_scev (nest, loop, access_fn); > > access_fns.safe_push (access_fn); > > } > > @@ -1370,16 +1367,16 @@ dr_analyze_indices (struct data_reference *dr, edge > > nest, loop_p loop) > > analyzed nest, add it as an additional independent access-function. */ > > if (TREE_CODE (ref) == MEM_REF) > > { > > - op = TREE_OPERAND (ref, 0); > > - access_fn = analyze_scalar_evolution (loop, op); > > + tree op = TREE_OPERAND (ref, 0); > > + tree access_fn = analyze_scalar_evolution (loop, op); > > access_fn = instantiate_scev (nest, loop, access_fn); > > if (TREE_CODE (access_fn) == POLYNOMIAL_CHREC) > > { > > - tree orig_type; > > tree memoff = TREE_OPERAND (ref, 1); > > - base = initial_condition (access_fn); > > - orig_type = TREE_TYPE (base); > > + tree base = initial_condition (access_fn); > > + tree orig_type = TREE_TYPE (base); > > STRIP_USELESS_TYPE_CONVERSION (base); > > + tree off; > > split_constant_offset (base, &base, &off); > > STRIP_USELESS_TYPE_CONVERSION (base); > > /* Fold the MEM_REF offset into the evolutions initial > > @@ -1424,7 +1421,7 @@ dr_analyze_indices (struct data_reference *dr, edge > > nest, loop_p loop) > > base, memoff); > > MR_DEPENDENCE_CLIQUE (ref) = MR_DEPENDENCE_CLIQUE (old); > > MR_DEPENDENCE_BASE (ref) = MR_DEPENDENCE_BASE (old); > > - DR_UNCONSTRAINED_BASE (dr) = true; > > + dri->unconstrained_base = true; > > access_fns.safe_push (access_fn); > > } > > } > > @@ -1436,8 +1433,8 @@ dr_analyze_indices (struct data_reference *dr, edge > > nest, loop_p loop) > > build_int_cst (reference_alias_ptr_type (ref), 0)); > > } > > > > - DR_BASE_OBJECT (dr) = ref; > > - DR_ACCESS_FNS (dr) = access_fns; > > + dri->base_object = ref; > > + dri->access_fns = access_fns; > > } > > > > /* Extracts the alias analysis information from the memory reference DR. */ > > @@ -1463,6 +1460,8 @@ void > > free_data_ref (data_reference_p dr) > > { > > DR_ACCESS_FNS (dr).release (); > > + if (dr->alt_indices.base_object) > > + dr->alt_indices.access_fns.release (); > > free (dr); > > } > > > > @@ -1497,7 +1496,7 @@ create_data_ref (edge nest, loop_p loop, tree memref, > > gimple *stmt, > > > > dr_analyze_innermost (&DR_INNERMOST (dr), memref, > > nest != NULL ? loop : NULL, stmt); > > - dr_analyze_indices (dr, nest, loop); > > + dr_analyze_indices (&dr->indices, DR_REF (dr), nest, loop); > > dr_analyze_alias (dr); > > > > if (dump_file && (dump_flags & TDF_DETAILS)) > > @@ -3066,41 +3065,30 @@ access_fn_components_comparable_p (tree ref_a, tree > > ref_b) > > TREE_TYPE (TREE_OPERAND (ref_b, 0))); > > } > > > > -/* Initialize a data dependence relation between data accesses A and > > - B. NB_LOOPS is the number of loops surrounding the references: the > > - size of the classic distance/direction vectors. */ > > +/* Initialize a data dependence relation RES in LOOP_NEST. USE_ALT_INDICES > > + is true when the main indices of A and B were not comparable so we try > > again > > + with alternate indices computed on an indirect reference. */ > > > > struct data_dependence_relation * > > -initialize_data_dependence_relation (struct data_reference *a, > > - struct data_reference *b, > > - vec<loop_p> loop_nest) > > +initialize_data_dependence_relation (struct data_dependence_relation *res, > > + vec<loop_p> loop_nest, > > + bool use_alt_indices) > > { > > - struct data_dependence_relation *res; > > + struct data_reference *a = DDR_A (res); > > + struct data_reference *b = DDR_B (res); > > unsigned int i; > > > > - res = XCNEW (struct data_dependence_relation); > > - DDR_A (res) = a; > > - DDR_B (res) = b; > > - DDR_LOOP_NEST (res).create (0); > > - DDR_SUBSCRIPTS (res).create (0); > > - DDR_DIR_VECTS (res).create (0); > > - DDR_DIST_VECTS (res).create (0); > > - > > - if (a == NULL || b == NULL) > > + struct indices *indices_a = &a->indices; > > + struct indices *indices_b = &b->indices; > > + if (use_alt_indices) > > { > > - DDR_ARE_DEPENDENT (res) = chrec_dont_know; > > - return res; > > + if (TREE_CODE (DR_REF (a)) != MEM_REF) > > + indices_a = &a->alt_indices; > > + if (TREE_CODE (DR_REF (b)) != MEM_REF) > > + indices_b = &b->alt_indices; > > } > > - > > - /* If the data references do not alias, then they are independent. */ > > - if (!dr_may_alias_p (a, b, loop_nest.exists () ? loop_nest[0] : NULL)) > > - { > > - DDR_ARE_DEPENDENT (res) = chrec_known; > > - return res; > > - } > > - > > - unsigned int num_dimensions_a = DR_NUM_DIMENSIONS (a); > > - unsigned int num_dimensions_b = DR_NUM_DIMENSIONS (b); > > + unsigned int num_dimensions_a = indices_a->access_fns.length (); > > + unsigned int num_dimensions_b = indices_b->access_fns.length (); > > if (num_dimensions_a == 0 || num_dimensions_b == 0) > > { > > DDR_ARE_DEPENDENT (res) = chrec_dont_know; > > @@ -3125,9 +3113,9 @@ initialize_data_dependence_relation (struct > > data_reference *a, > > > > the a and b accesses have a single ARRAY_REF component reference [0] > > but have two subscripts. */ > > - if (DR_UNCONSTRAINED_BASE (a)) > > + if (indices_a->unconstrained_base) > > num_dimensions_a -= 1; > > - if (DR_UNCONSTRAINED_BASE (b)) > > + if (indices_b->unconstrained_base) > > num_dimensions_b -= 1; > > > > /* These structures describe sequences of component references in > > @@ -3210,6 +3198,10 @@ initialize_data_dependence_relation (struct > > data_reference *a, > > B: [3, 4] (i.e. s.e) */ > > while (index_a < num_dimensions_a && index_b < num_dimensions_b) > > { > > + /* The alternate indices form always has a single dimension > > + with unconstrained base. */ > > + gcc_assert (!use_alt_indices); > > + > > /* REF_A and REF_B must be one of the component access types > > allowed by dr_analyze_indices. */ > > gcc_checking_assert (access_fn_component_p (ref_a)); > > @@ -3280,11 +3272,12 @@ initialize_data_dependence_relation (struct > > data_reference *a, > > /* See whether FULL_SEQ ends at the base and whether the two bases > > are equal. We do not care about TBAA or alignment info so we can > > use OEP_ADDRESS_OF to avoid false negatives. */ > > - tree base_a = DR_BASE_OBJECT (a); > > - tree base_b = DR_BASE_OBJECT (b); > > + tree base_a = indices_a->base_object; > > + tree base_b = indices_b->base_object; > > bool same_base_p = (full_seq.start_a + full_seq.length == num_dimensions_a > > && full_seq.start_b + full_seq.length == num_dimensions_b > > - && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b) > > + && (indices_a->unconstrained_base > > + == indices_b->unconstrained_base) > > && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF) > > && (types_compatible_p (TREE_TYPE (base_a), > > TREE_TYPE (base_b)) > > @@ -3323,7 +3316,7 @@ initialize_data_dependence_relation (struct > > data_reference *a, > > both lvalues are distinct from the object's declared type. */ > > if (same_base_p) > > { > > - if (DR_UNCONSTRAINED_BASE (a)) > > + if (indices_a->unconstrained_base) > > full_seq.length += 1; > > } > > else > > @@ -3332,8 +3325,41 @@ initialize_data_dependence_relation (struct > > data_reference *a, > > /* Punt if we didn't find a suitable sequence. */ > > if (full_seq.length == 0) > > { > > - DDR_ARE_DEPENDENT (res) = chrec_dont_know; > > - return res; > > + if (use_alt_indices > > + || (TREE_CODE (DR_REF (a)) == MEM_REF > > + && TREE_CODE (DR_REF (b)) == MEM_REF) > > + || may_be_nonaddressable_p (DR_REF (a)) > > + || may_be_nonaddressable_p (DR_REF (b))) > > + { > > + /* Fully exhausted possibilities. */ > > + DDR_ARE_DEPENDENT (res) = chrec_dont_know; > > + return res; > > + } > > + > > + /* Try evaluating both DRs as dereferences of pointers. */ > > + if (!a->alt_indices.base_object > > + && TREE_CODE (DR_REF (a)) != MEM_REF) > > + { > > + tree alt_ref = build2 (MEM_REF, TREE_TYPE (DR_REF (a)), > > + build1 (ADDR_EXPR, ptr_type_node, DR_REF (a)), > > + build_int_cst > > + (reference_alias_ptr_type (DR_REF (a)), 0)); > > + dr_analyze_indices (&a->alt_indices, alt_ref, > > + loop_preheader_edge (loop_nest[0]), > > + loop_containing_stmt (DR_STMT (a))); > > + } > > + if (!b->alt_indices.base_object > > + && TREE_CODE (DR_REF (b)) != MEM_REF) > > + { > > + tree alt_ref = build2 (MEM_REF, TREE_TYPE (DR_REF (b)), > > + build1 (ADDR_EXPR, ptr_type_node, DR_REF (b)), > > + build_int_cst > > + (reference_alias_ptr_type (DR_REF (b)), 0)); > > + dr_analyze_indices (&b->alt_indices, alt_ref, > > + loop_preheader_edge (loop_nest[0]), > > + loop_containing_stmt (DR_STMT (b))); > > + } > > + return initialize_data_dependence_relation (res, loop_nest, true); > > } > > > > if (!same_base_p) > > @@ -3381,8 +3407,8 @@ initialize_data_dependence_relation (struct > > data_reference *a, > > struct subscript *subscript; > > > > subscript = XNEW (struct subscript); > > - SUB_ACCESS_FN (subscript, 0) = DR_ACCESS_FN (a, full_seq.start_a + > > i); > > - SUB_ACCESS_FN (subscript, 1) = DR_ACCESS_FN (b, full_seq.start_b + > > i); > > + SUB_ACCESS_FN (subscript, 0) = > > indices_a->access_fns[full_seq.start_a + i]; > > + SUB_ACCESS_FN (subscript, 1) = > > indices_b->access_fns[full_seq.start_b + i]; > > SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known (); > > SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known (); > > SUB_LAST_CONFLICT (subscript) = chrec_dont_know; > > @@ -3393,6 +3419,40 @@ initialize_data_dependence_relation (struct > > data_reference *a, > > return res; > > } > > > > +/* Initialize a data dependence relation between data accesses A and > > + B. NB_LOOPS is the number of loops surrounding the references: the > > + size of the classic distance/direction vectors. */ > > + > > +struct data_dependence_relation * > > +initialize_data_dependence_relation (struct data_reference *a, > > + struct data_reference *b, > > + vec<loop_p> loop_nest) > > +{ > > + data_dependence_relation *res = XCNEW (struct data_dependence_relation); > > + DDR_A (res) = a; > > + DDR_B (res) = b; > > + DDR_LOOP_NEST (res).create (0); > > + DDR_SUBSCRIPTS (res).create (0); > > + DDR_DIR_VECTS (res).create (0); > > + DDR_DIST_VECTS (res).create (0); > > + > > + if (a == NULL || b == NULL) > > + { > > + DDR_ARE_DEPENDENT (res) = chrec_dont_know; > > + return res; > > + } > > + > > + /* If the data references do not alias, then they are independent. */ > > + if (!dr_may_alias_p (a, b, loop_nest.exists () ? loop_nest[0] : NULL)) > > + { > > + DDR_ARE_DEPENDENT (res) = chrec_known; > > + return res; > > + } > > + > > + return initialize_data_dependence_relation (res, loop_nest, false); > > +} > > + > > + > > /* Frees memory used by the conflict function F. */ > > > > static void > > diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h > > index 685f33d85ae..74f579c9f3f 100644 > > --- a/gcc/tree-data-ref.h > > +++ b/gcc/tree-data-ref.h > > @@ -166,14 +166,19 @@ struct data_reference > > and runs to completion. */ > > bool is_conditional_in_stmt; > > > > + /* Alias information for the data reference. */ > > + struct dr_alias alias; > > + > > /* Behavior of the memory reference in the innermost loop. */ > > struct innermost_loop_behavior innermost; > > > > /* Subscripts of this data reference. */ > > struct indices indices; > > > > - /* Alias information for the data reference. */ > > - struct dr_alias alias; > > + /* Alternate subscripts initialized lazily and used by data-dependence > > + analysis only when the main indices of two DRs are not comparable. > > + Keep last to keep vec_info_shared::check_datarefs happy. */ > > + struct indices alt_indices; > > }; > > > > #define DR_STMT(DR) (DR)->stmt > > diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c > > index 3aa3e2a6783..20daa31187d 100644 > > --- a/gcc/tree-vectorizer.c > > +++ b/gcc/tree-vectorizer.c > > @@ -507,7 +507,8 @@ vec_info_shared::check_datarefs () > > return; > > gcc_assert (datarefs.length () == datarefs_copy.length ()); > > for (unsigned i = 0; i < datarefs.length (); ++i) > > - if (memcmp (&datarefs_copy[i], datarefs[i], sizeof (data_reference)) > > != 0) > > + if (memcmp (&datarefs_copy[i], datarefs[i], > > + offsetof (data_reference, alt_indices)) != 0) > > gcc_unreachable (); > > } > > > > </cut> > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg) _______________________________________________ linaro-toolchain mailing list linaro-toolchain@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-toolchain