On June 14, 2016 4:14:20 PM GMT+02:00, Alan Hayward <alan.hayw...@arm.com> wrote:
>In the given testcase, g++ splits a live operation into two scalar statements
>and four vector statements.
>
>  _5 = _4 >> 2;
>  _7 = (short int) _5;
>
>Is turned into:
>
>  vect__5.32_80 = vect__4.31_76 >> 2;
>  vect__5.32_81 = vect__4.31_77 >> 2;
>  vect__5.32_82 = vect__4.31_78 >> 2;
>  vect__5.32_83 = vect__4.31_79 >> 2;
>  vect__7.33_86 = VEC_PACK_TRUNC_EXPR <vect__5.32_80, vect__5.32_81>;
>  vect__7.33_87 = VEC_PACK_TRUNC_EXPR <vect__5.32_82, vect__5.32_83>;
>
>_5 is then accessed outside the loop.
>
>This patch ensures that vectorizable_live_operation picks the correct scalar
>statement.
>I removed the "three possibilites" comment because it was no longer accurate
>(it's also possible to have more vector statements than scalar statements)
>and the calculation is now much simpler.
>
>Tested on x86 and aarch64.
>Ok to commit?

OK.

Thanks,
Richard.

>gcc/
>    PR tree-optimization/71483
>    * tree-vect-loop.c (vectorizable_live_operation): Pick correct index
>    for slp
>
>testsuite/g++.dg/vect
>    PR tree-optimization/71483
>    * pr71483.c: New
>
>
>Alan.
>
>
>diff --git a/gcc/testsuite/g++.dg/vect/pr71483.c b/gcc/testsuite/g++.dg/vect/pr71483.c
>new file mode 100644
>index 0000000000000000000000000000000000000000..77f879c9a89b8b41ef9dde3c3435918572dc8d01
>--- /dev/null
>+++ b/gcc/testsuite/g++.dg/vect/pr71483.c
>@@ -0,0 +1,11 @@
>+/* { dg-do compile } */
>+int b, c, d;
>+short *e;
>+void fn1() {
>+  for (; b; b--) {
>+    d = *e >> 2;
>+    *e++ = d;
>+    c = *e;
>+    *e++ = d;
>+  }
>+}
>diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>index 4c8678505df6ec572b69fd7d12ac55cf4619ece6..a2413bf9c678d11cc2ffd22bc7d984e911831804 100644
>--- a/gcc/tree-vect-loop.c
>+++ b/gcc/tree-vect-loop.c
>@@ -6368,24 +6368,20 @@ vectorizable_live_operation (gimple *stmt,
> 
>       int num_scalar = SLP_TREE_SCALAR_STMTS (slp_node).length ();
>       int num_vec = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
>-      int scalar_per_vec = num_scalar / num_vec;
> 
>-      /* There are three possibilites here:
>-         1: All scalar stmts fit in a single vector.
>-         2: All scalar stmts fit multiple times into a single vector.
>-            We must choose the last occurence of stmt in the vector.
>-         3: Scalar stmts are split across multiple vectors.
>-            We must choose the correct vector and mod the lane accordingly.  */
>+      /* Get the last occurrence of the scalar index from the concatenation of
>+         all the slp vectors. Calculate which slp vector it is and the index
>+         within.  */
>+      int pos = (num_vec * nunits) - num_scalar + slp_index;
>+      int vec_entry = pos / nunits;
>+      int vec_index = pos % nunits;
> 
>       /* Get the correct slp vectorized stmt.  */
>-      int vec_entry = slp_index / scalar_per_vec;
>       vec_lhs = gimple_get_lhs (SLP_TREE_VEC_STMTS (slp_node)[vec_entry]);
> 
>       /* Get entry to use.  */
>-      bitstart = build_int_cst (unsigned_type_node,
>-                                scalar_per_vec - (slp_index % scalar_per_vec));
>+      bitstart = build_int_cst (unsigned_type_node, vec_index);
>       bitstart = int_const_binop (MULT_EXPR, bitsize, bitstart);
>-      bitstart = int_const_binop (MINUS_EXPR, vec_bitsize, bitstart);
>     }
>   else
>     {
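
For readers following the index arithmetic, here is a minimal standalone sketch of the new calculation, outside of GCC. The struct lane_pos and the function live_lane are hypothetical names invented for illustration; only the three expressions for pos, vec_entry and vec_index come from the patch. The sample values assume the testcase's shift has num_scalar = 2 and num_vec = 4 as described above, and additionally assume nunits = 4 (for example, 128-bit vectors of 32-bit int), which the mail does not state.

```c
#include <assert.h>

/* Hypothetical container for the result; not a GCC type.  */
struct lane_pos
{
  int vec_entry;  /* Which vector statement holds the live value.  */
  int vec_index;  /* Which lane inside that vector.  */
};

/* Mirror of the patched arithmetic: treat all vector statements as one
   concatenated array of num_vec * nunits lanes and pick the last
   occurrence of scalar statement slp_index.  */
static struct lane_pos
live_lane (int num_vec, int nunits, int num_scalar, int slp_index)
{
  /* Sanity checks added for this sketch only.  */
  assert (nunits > 0 && num_vec > 0);
  assert (slp_index >= 0 && slp_index < num_scalar);
  assert (num_vec * nunits >= num_scalar);

  int pos = (num_vec * nunits) - num_scalar + slp_index;
  struct lane_pos result = { pos / nunits, pos % nunits };
  return result;
}
```

Under those assumptions, live_lane (4, 4, 2, 0) gives pos = 14, so vec_entry = 3 and vec_index = 2: the live value comes from the last of the four vector statements. With the same numbers, the removed code would have computed scalar_per_vec = 2 / 4 = 0 in integer arithmetic and then divided by it, which appears to be why the case of more vector statements than scalar statements needed separate handling.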