This patch enables vectorization of the complex multiplication testcase
in PR37021 by extending strided load support to array or pointer loads
that are wrapped in a REALPART_EXPR or IMAGPART_EXPR.  With that, the
loop can be vectorized via strided loads.
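
To make the access pattern concrete, here is a minimal scalar sketch in C of what the inner loop of the testcase does per element: it reads the real and imaginary parts of a(i,j) separately, with a distance between consecutive j that is only known at run time.  The function name cmul_acc_strided, the interleaved (re, im) layout and the parameter names are assumptions made for illustration; this is not code from GCC or from the testcase.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of "self(i) += a(i,j) * b(j)" over j.
   A points at a(i,0) in interleaved (re, im) storage.  STRIDE is the
   distance between a(i,j) and a(i,j+1), counted in complex elements,
   and is only known at run time.  B is contiguous.  */
static void
cmul_acc_strided (double acc[2], const double *a, size_t stride,
                  const double *b, size_t n)
{
  assert (stride > 0);
  for (size_t j = 0; j < n; j++)
    {
      double ar = a[2 * j * stride];      /* like REALPART_EXPR <a(i,j)> */
      double ai = a[2 * j * stride + 1];  /* like IMAGPART_EXPR <a(i,j)> */
      double br = b[2 * j];
      double bi = b[2 * j + 1];
      acc[0] += ar * br - ai * bi;        /* real part of the product */
      acc[1] += ar * bi + ai * br;        /* imaginary part of the product */
    }
}
```

The two loads from A are the REALPART/IMAGPART-wrapped references that the strided load check previously rejected.
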

The testcase requires -fno-tree-pre unless the tree-ssa-structalias.c
hunk is applied because PTA (run as part of PRE) thinks that
COMPLEX_EXPR <fn-param, 0.0> may point to global memory.  While
technically this is conservatively correct it is too pessimizing, so
the patch makes us assume that we do not transfer pointer values
through floating-point values (similar to how we assume we don't do
that through truth values).

More "proper" vectorizing of the testcase is underway (using SLP on
groups with unknown gaps with unknown stride and mixed operations).
From that disclaimer you can see it's a lot harder to do properly.

Cost model issues still disable vectorizing the loop in the testcase:

t.f90:8: note: cost model: the vector iteration cost = 24 divided by the scalar iteration cost = 12 is greater or equal to the vectorization factor = 2.
t.f90:8: note: not vectorized: vectorization not profitable.
t.f90:8: note: not vectorized: vector version will never be profitable.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2013-03-26  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/37021
        * tree-vect-data-refs.c (vect_check_strided_load): Allow
        REALPART/IMAGPART_EXPRs around the supported refs.
        * tree-ssa-structalias.c (find_func_aliases): Assume that
        floating-point values are not used to transfer pointers.

        * gfortran.dg/vect/fast-math-pr37021.f90: New testcase.

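
The arithmetic behind the cost model note quoted in this message can be restated as a small check.  This is only a sketch of the comparison as the dump words it, not the vectorizer's actual implementation; the function name vect_never_profitable and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical restatement of the dump message: the vector version is
   rejected when (vector iteration cost / scalar iteration cost) is
   greater than or equal to the vectorization factor.  */
static bool
vect_never_profitable (int vec_iter_cost, int scalar_iter_cost, int vf)
{
  assert (scalar_iter_cost > 0 && vf > 0);
  /* Cross-multiplied form; for positive integers this is equivalent to
     comparing the truncated quotient against VF.  */
  return vec_iter_cost >= vf * scalar_iter_cost;
}
```

With the numbers from the dump, one vector iteration costs 24 and replaces 2 scalar iterations costing 12 each, so vect_never_profitable (24, 12, 2) holds and nothing is gained.
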
Index: gcc/tree-vect-data-refs.c
===================================================================
*** gcc/tree-vect-data-refs.c.orig 2013-03-26 14:27:19.000000000 +0100
--- gcc/tree-vect-data-refs.c 2013-03-26 14:29:21.519094419 +0100
*************** vect_check_strided_load (gimple stmt, lo
*** 2798,2803 ****
--- 2798,2807 ----
  
    base = DR_REF (dr);
  
+   if (TREE_CODE (base) == REALPART_EXPR
+       || TREE_CODE (base) == IMAGPART_EXPR)
+     base = TREE_OPERAND (base, 0);
+ 
    if (TREE_CODE (base) == ARRAY_REF)
      {
        off = TREE_OPERAND (base, 1);
Index: gcc/testsuite/gfortran.dg/vect/fast-math-pr37021.f90
===================================================================
*** /dev/null 1970-01-01 00:00:00.000000000 +0000
--- gcc/testsuite/gfortran.dg/vect/fast-math-pr37021.f90 2013-03-26 14:34:38.546568933 +0100
***************
*** 0 ****
--- 1,17 ----
+ ! { dg-do compile }
+ 
+ subroutine to_product_of(self,a,b,a1,a2)
+   complex(kind=8) :: self (:)
+   complex(kind=8), intent(in) :: a(:,:)
+   complex(kind=8), intent(in) :: b(:)
+   integer a1,a2
+   self = ZERO
+   do i = 1,a1
+     do j = 1,a2
+       self(i) = self(i) + a(i,j)*b(j)
+     end do
+   end do
+ end subroutine
+ 
+ ! { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } }
+ ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/tree-ssa-structalias.c
===================================================================
*** gcc/tree-ssa-structalias.c.orig 2013-03-26 15:14:37.000000000 +0100
--- gcc/tree-ssa-structalias.c 2013-03-26 15:14:45.411861483 +0100
*************** find_func_aliases (gimple origt)
*** 4631,4637 ****
  
            get_constraint_for (lhsop, &lhsc);
  
!           if (code == POINTER_PLUS_EXPR)
              get_constraint_for_ptr_offset (gimple_assign_rhs1 (t),
                                             gimple_assign_rhs2 (t), &rhsc);
            else if (code == BIT_AND_EXPR
--- 4631,4641 ----
  
            get_constraint_for (lhsop, &lhsc);
  
!           if (FLOAT_TYPE_P (TREE_TYPE (lhsop)))
!             /* If the operation produces a floating point result then
!                assume the value is not produced to transfer a pointer.  */
!             ;
!           else if (code == POINTER_PLUS_EXPR)
              get_constraint_for_ptr_offset (gimple_assign_rhs1 (t),
                                             gimple_assign_rhs2 (t), &rhsc);
            else if (code == BIT_AND_EXPR
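
For illustration only, this is the kind of code that makes the old PTA answer "technically conservatively correct" and that the new assumption declares out of scope: a pointer whose bits are carried inside a double.  The sketch assumes a target where sizeof (double) equals the pointer size, as on typical 64-bit systems; hide_pointer and recover_pointer are hypothetical names, not anything in GCC.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical: copy the object representation of a pointer into a
   double.  Assumes sizeof (double) == sizeof (int *).  */
static double
hide_pointer (int *p)
{
  double d;
  assert (sizeof d == sizeof p);
  memcpy (&d, &p, sizeof d);
  return d;
}

/* Hypothetical: copy the bits back out of the double.  */
static int *
recover_pointer (double d)
{
  int *p;
  memcpy (&p, &d, sizeof p);
  return p;
}
```

With the structalias change, points-to analysis no longer generates constraints for assignments whose result has floating-point type, so the analysis does not account for a pointer routed through such an assignment.
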