On Tue, Apr 24, 2018 at 09:42:03AM +0200, Richard Biener wrote:
> 
> The following patch restricts the previous fix for PR84037 to the case
> of strided loads with non-constant step to avoid regressing the nbench
> LU decomposition test on Haswell where the change causes us to use
> AVX128 instead of AVX256 in the two critical loops.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.  SPEC CPU 2006
> results are in the noise, so is SPEC CPU 2000 (200.sixtrack seems
> to be awfully jumpy for me - it goes up and down by almost 50%!),
> nbench LU factorization performance is back up.
> 
> OK for trunk?
> 
> Thanks,
> Richard.
> 
> 2018-04-24  Richard Biener  <rguent...@suse.de>
> 
>       PR target/85491
>       * config/i386/i386.c (ix86_add_stmt_cost): Restrict strided
>       load cost increase to the case of non-constant step.

LGTM.

> --- gcc/config/i386/i386.c    (revision 259556)
> +++ gcc/config/i386/i386.c    (working copy)
> @@ -50550,8 +50550,9 @@ ix86_add_stmt_cost (void *data, int coun
>       construction cost by the number of elements involved.  */
>    if (kind == vec_construct
>        && stmt_info
> -      && stmt_info->type == load_vec_info_type
> -      && stmt_info->memory_access_type == VMAT_ELEMENTWISE)
> +      && STMT_VINFO_TYPE (stmt_info) == load_vec_info_type
> +      && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_ELEMENTWISE
> +      && TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF (stmt_info))) != INTEGER_CST)
>      {
>        stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>        stmt_cost *= TYPE_VECTOR_SUBPARTS (vectype);
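
For illustration only, a minimal sketch of the two kinds of strided load that
the new DR_STEP check tells apart.  The function names are made up for this
example and are not from the patch or from nbench.  Whether the vectorizer
actually uses VMAT_ELEMENTWISE for either loop depends on the target and the
options, so treat the comments as an assumption about the typical case.

```c
#include <stddef.h>

/* Constant step: the stride (2 elements) is known at compile time, so
   the step of the load's data reference would be an INTEGER_CST.  With
   the patch, the extra vec_construct cost would not apply here.  */
void
scale_const_stride (double *restrict out, const double *restrict a,
                    size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = 2.0 * a[2 * i];
}

/* Non-constant step: the stride is only known at run time, so the step
   is not an INTEGER_CST.  This is the case the increased cost is
   restricted to.  */
void
scale_runtime_stride (double *restrict out, const double *restrict a,
                      size_t n, size_t stride)
{
  for (size_t i = 0; i < n; i++)
    out[i] = 2.0 * a[i * stride];
}
```

Both functions compute the same result when stride is 2; only what the
compiler knows about the step differs.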

        Jakub
