Hi,

I've noticed that GCC doesn't like to vectorize my loop.

1. When the loop has a non-unit stride, I get a 'complicated access pattern'
message. Are non-unit strides supported?

res(1:nS) = grid(1:(43-1)*7+1:7)*Dummy ! COMPLICATED ACCESS PATTERN

2. When the stride is not a compile-time constant, I get a 'data ref
analysis failed' message instead of 'complicated access pattern'.

res(1:nS) = grid(1: (nS-1)*iNew+1 : iNew)*Dummy !NOT VECTORIZED DATA REF

This is part of a more complicated loop that is otherwise fully
vectorizable. I wonder whether this is a minor issue for which a fix could
be tried, or something fundamental that would never allow the loop to
be vectorized.
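
To make the access pattern explicit, here is a minimal sketch of what I assume the strided assignment amounts to when written as a scalar loop: element i of the result reads grid(1 + (i-1)*iNew). The program name stride_equiv, the arrays a and b, and the factor 1.5 are made up for illustration. This only shows the Fortran-level equivalence and says nothing about how GCC represents the loop internally.

```fortran
! Illustrative sketch only: compares the array-section form with an
! explicit loop. All names here are hypothetical, not from the test case.
PROGRAM stride_equiv
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 1000, nS = 43
  REAL(4) :: grid(N), a(nS), b(nS)
  INTEGER :: i, iNew

  ! Fill the source array with known values
  DO i = 1, N
    grid(i) = REAL(i)
  END DO

  iNew = 7   ! run-time stride; (nS-1)*iNew+1 = 295 stays within grid

  ! Array-section form, as in the test case
  a(1:nS) = grid(1:(nS-1)*iNew+1:iNew)*1.5

  ! Explicit loop: consecutive results read elements iNew apart
  DO i = 1, nS
    b(i) = grid(1 + (i-1)*iNew)*1.5
  END DO

  ! Report whether the two forms produced identical results
  IF (ALL(a == b)) THEN
    PRINT *, 'match'
  ELSE
    PRINT *, 'mismatch'
  END IF
END PROGRAM stride_equiv
```

Both forms store to res contiguously and load from grid with a gap of iNew elements, so the only non-contiguous access is the load.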

---------------------------------------------------------------------------
$ cat -> test_nice.f90
PROGRAM prog
  INTEGER, PARAMETER :: N = 1000, nS = 43
  INTEGER :: iS

  REAL(4) :: grid(N)
  REAL(4) :: res(nS)

  EXTERNAL test

  res = 0
  DO iS = 1, N
    grid(iS) = SIN(REAL(is)/N)
  END DO
  
  DO iS = 2, 7
    CALL test(iS, grid, 1.2*iS, res)
    PRINT *, res
  END DO

END PROGRAM

  SUBROUTINE test(iNew, grid, Dummy, res)
    INTEGER, PARAMETER   :: nS = 43
    INTEGER, INTENT(in)  :: iNew
    REAL(4), INTENT(in)  :: grid(*)
    REAL(4), INTENT(in)  :: Dummy
    REAL(4), INTENT(out) :: res(*)

    res(1:nS) = grid(1: (nS-1)*iNew+1 : iNew)*Dummy ! NOT VECTORIZED DATA REF
    res(1:nS) = grid(1:(43-1)*7+1:7)*Dummy       ! COMPLICATED ACCESS PATTERN
    res(1:nS) = grid(1:nS:1)*Dummy               ! VECTORIZED
  END SUBROUTINE



$ gfortran -O2 -fno-backslash -ftree-vectorize \
    -ftree-vectorizer-verbose=2 -ffast-math -msse4 test_nice.f90 -o test_nice.exe

test_nice.f90:31: note: LOOP VECTORIZED.
test_nice.f90:30: note: not vectorized: complicated access pattern.
test_nice.f90:29: note: not vectorized: data ref analysis failed
D.1037_18 = (*grid_17(D))[D.1036_16]
test_nice.f90:22: note: vectorized 1 loops in function.

test_nice.f90:11: note: not vectorized: relevant stmt not supported:
D.1008_7 = __builtin_sinf (D.1007_6)
test_nice.f90:1: note: vectorized 0 loops in function.
--
  Cheers
    Michal
