On Tue, Apr 24, 2018 at 09:42:03AM +0200, Richard Biener wrote:
>
> The following patch restricts the previous fix for PR84037 to the case
> of strided loads with non-constant step to avoid regression nbench
> LU decomposition test on Haswell where the change causes us to use
> AVX128 instead of AVX256 in the two critical loops.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.  SPEC CPU 2006
> results are in the noise, so is SPEC CPU 2000 (200.sixtrack seems
> to be awfully jumpy for me - it goes up and down by almost 50%!),
> nbench LU factorization performance is back up.
>
> OK for trunk?
>
> Thanks,
> Richard.
>
> 2018-04-24  Richard Biener  <rguent...@suse.de>
>
>	PR target/85491
>	* config/i386/i386.c (ix86_add_stmt_cost): Restrict strided
>	load cost increase to the case of non-constant step.

LGTM.

> --- gcc/config/i386/i386.c	(revision 259556)
> +++ gcc/config/i386/i386.c	(working copy)
> @@ -50550,8 +50550,9 @@ ix86_add_stmt_cost (void *data, int coun
>       construction cost by the number of elements involved.  */
>    if (kind == vec_construct
>        && stmt_info
> -      && stmt_info->type == load_vec_info_type
> -      && stmt_info->memory_access_type == VMAT_ELEMENTWISE)
> +      && STMT_VINFO_TYPE (stmt_info) == load_vec_info_type
> +      && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_ELEMENTWISE
> +      && TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF (stmt_info))) != INTEGER_CST)
>      {
>        stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>        stmt_cost *= TYPE_VECTOR_SUBPARTS (vectype);

	Jakub
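
For readers without the GCC tree at hand, the effect of the added condition
can be modelled outside the compiler.  The following is only a sketch under
stated assumptions: the enums, struct stmt_model and model_stmt_cost are
hypothetical stand-ins invented for illustration, not GCC's types or
functions.  It shows that the per-element scaling of the vec_construct cost
now applies only to elementwise loads whose step is not a compile-time
constant.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the vectorizer's enums; not GCC's definitions.  */
enum cost_kind { KIND_VEC_CONSTRUCT, KIND_OTHER };
enum access_kind { ACCESS_ELEMENTWISE, ACCESS_OTHER };

/* Hypothetical summary of the statement properties the patch inspects.  */
struct stmt_model
{
  bool is_load;              /* models STMT_VINFO_TYPE == load_vec_info_type */
  enum access_kind access;   /* models STMT_VINFO_MEMORY_ACCESS_TYPE */
  bool step_is_constant;     /* models TREE_CODE (DR_STEP (...)) == INTEGER_CST */
};

/* Return BASE_COST scaled by NELTS only for an elementwise load with a
   non-constant step; every other case keeps BASE_COST unchanged.  */
static int
model_stmt_cost (enum cost_kind kind, const struct stmt_model *stmt,
                 int base_cost, int nelts)
{
  if (kind == KIND_VEC_CONSTRUCT
      && stmt != NULL
      && stmt->is_load
      && stmt->access == ACCESS_ELEMENTWISE
      && !stmt->step_is_constant)
    return base_cost * nelts;

  return base_cost;
}
```

In this model a constant-step strided load keeps its unscaled construction
cost, which corresponds to the nbench LU case described in the quoted
message; the numbers passed in are arbitrary and carry no meaning for real
i386 cost tables.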