https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70849
Bug ID: 70849
Summary: Loop can be vectorized through gathers on AVX2 platforms.
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
The simple test case (to be attached) is not vectorized because the cost model considers it unprofitable:
test.c:11:5: note: cost model: the vector iteration cost = 2061 divided by the
scalar iteration cost = 9 is greater or equal to the vectorization factor = 8.
test.c:11:5: note: not vectorized: vectorization not profitable.
test.c:11:5: note: not vectorized: vector version will never be profitable.
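The comparison quoted in the dump above can be checked by hand. The sketch below is not GCC code: `never_profitable` is a hypothetical helper that reproduces only the arithmetic stated in the message, assuming integer division. GCC's real profitability analysis takes more inputs than these three numbers.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of GCC): mirrors only the comparison
   quoted in the vectorizer dump, i.e.
   "vector iteration cost divided by scalar iteration cost
    is greater or equal to the vectorization factor". */
static bool never_profitable(int vec_iter_cost, int scalar_iter_cost, int vf)
{
    /* Assumption: integer division, as the wording "divided by" suggests. */
    return vec_iter_cost / scalar_iter_cost >= vf;
}
```

With the values from the dump, `never_profitable(2061, 9, 8)` evaluates 2061 / 9 = 229, and 229 >= 8, so the loop is rejected. Under this comparison, the per-iteration vector cost would have to drop below 8 * 9 = 72 before the check could pass.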
However, the loop can be vectorized using gathers, as icc does:
LOOP BEGIN at test.c(11,5)
remark #15388: vectorization support: reference c1[j] has aligned access
[ test.c(12,7) ]
remark #15388: vectorization support: reference c2[j] has aligned access
[ test.c(13,7) ]
remark #15388: vectorization support: reference c1[j] has aligned access
[ test.c(12,7) ]
remark #15388: vectorization support: reference c2[j] has aligned access
[ test.c(13,7) ]
remark #15415: vectorization support: gather was generated for the
variable <f[j+base]>, strided by 256 [ test.c(12,16) ]
remark #15415: vectorization support: gather was generated for the
variable <f[j+base+1]>, strided by 256 [ test.c(13,16) ]
remark #15415: vectorization support: gather was generated for the
variable <f[j+base]>, strided by 256 [ test.c(12,16) ]
remark #15415: vectorization support: gather was generated for the
variable <f[j+base+1]>, strided by 256 [ test.c(13,16) ]
remark #15305: vectorization support: vector length 8
remark #15300: LOOP WAS VECTORIZED
remark #15449: unmasked aligned unit stride stores: 4
remark #15460: masked strided loads: 4
remark #15475: --- begin vector loop cost summary ---
remark #15476: scalar loop cost: 18
remark #15477: vector loop cost: 12.000
remark #15478: estimated potential speedup: 1.500
remark #15488: --- end vector loop cost summary ---
LOOP END
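For readers unfamiliar with the term, the following is a portable scalar model of what a gather does for a strided load. It is not the attached test case, which is not reproduced here. The names `gather8` and `strided_copy_gather`, and the requirement that `n` be a multiple of 8, are assumptions made for illustration only.

```c
enum { VL = 8 };  /* vector length, matching the "vector length 8" remark */

/* Scalar model of an 8-lane gather: out[i] = base[idx[i]].
   On AVX2 hardware one gather instruction does this for 8 floats
   (exposed in C as the intrinsic _mm256_i32gather_ps). */
static void gather8(const float *base, const int idx[VL], float out[VL])
{
    for (int i = 0; i < VL; i++)
        out[i] = base[idx[i]];
}

/* Hypothetical strided loop, dst[j] = src[j * stride], processed in
   blocks of VL lanes. Assumes n is a multiple of VL. */
static void strided_copy_gather(float *dst, const float *src, int n, int stride)
{
    for (int j = 0; j < n; j += VL) {
        int idx[VL];
        for (int i = 0; i < VL; i++)
            idx[i] = (j + i) * stride;  /* per-lane element offsets */
        gather8(src, idx, &dst[j]);     /* one gather, then a unit-stride store */
    }
}
```

Calling `strided_copy_gather(dst, src, 16, 64)` fills `dst[j]` with `src[j * 64]` for `j` in 0..15, the same result as the plain scalar loop. The point of the sketch is that each block of 8 strided loads becomes a single gather, followed by a contiguous store.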