https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68576
Bug ID: 68576
Summary: scev failed for loop auto parallelize
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: majun4950646 at 163 dot com
Target Milestone: ---
scev fails to analyze the 1st and 2nd loops in the following example (taken from
the testsuite file interchange-2.f, which is a kernel extracted from bwaves):
---
      subroutine foo(f1,f2,f3,f4,f5,f6,f7,f8,f9,f0,g1,g2,g3)
      implicit none
      integer f4,f3,f2,f1
      integer g4,g5,g6,g7,g8,g9
      integer i1,i2,i3,i4,i5
      real*8 g1(f4,f3,f2,f1),g2(f4,f4,f3,f2,f1),g3(f4,f3,f2,f1)
      real*8 f0(f4,f4,f3,f2,f1),f9(f4,f4,f3,f2,f1),f8(f4,f4,f3,f2,f1)
      real*8 f7(f4,f4,f3,f2,f1),f6(f4,f4,f3,f2,f1),f5(f4,f4,f3,f2,f1)
      do i3=1,f1
         g8=mod(i3+f1-2,f1)+1
         g9=mod(i3,f1)+1
         do i4=1,f2
            g6=mod(i4+f2-2,f2)+1
            g7=mod(i4,f2)+1
            do i5=1,f3
               g4=mod(i5+f3-2,f3)+1
               g5=mod(i5,f3)+1
               do i1=1,f4
                  g3(i1,i5,i4,i3)=0.0d0
                  do i2=1,f4
                     g3(i1,i5,i4,i3)=g3(i1,i5,i4,i3)+
     1                  g2(i1,i2,i5,i4,i3)*g1(i2,i5,i4,i3)+
     2                  f0(i1,i2,i5,i4,i3)*g1(i2,g5,i4,i3)+
     3                  f9(i1,i2,i5,i4,i3)*g1(i2,i5,g7,i3)+
     4                  f8(i1,i2,i5,i4,i3)*g1(i2,i5,i4,g9)+
     5                  f7(i1,i2,i5,i4,i3)*g1(i2,g4,i4,i3)+
     6                  f6(i1,i2,i5,i4,i3)*g1(i2,i5,g6,i3)+
     7                  f5(i1,i2,i5,i4,i3)*g1(i2,i5,i4,g8)
                  enddo
               enddo
            enddo
         enddo
      enddo
      return
      end
---
For the first loop, scev analysis leaves the access function of some data
references as scev_not_known. The dump looks like this:
---
Creating dr for *g3_77(D)[_76]
base_address: g3_77(D)
offset from base address: (ssizetype) ((sizetype) (_370 + _372) * 8)
constant offset from base address: 8
step: 8
aligned to: 8
base_object: *g3_77(D)
Access function 0: scev_not_known;
Creating dr for *g2_90(D)[_89]
base_address: g2_90(D)
offset from base address: (ssizetype) ((sizetype) (((_378 + _379) + stride.88_19) + (integer(kind=8)) i1_1) * 8)
constant offset from base address: 0
step: (ssizetype) ((sizetype) stride.88_19 * 8)
aligned to: 8
base_object: *g2_90(D)
Access function 0: scev_not_known;
---
Because these access functions are neither affine nor constant, the data
dependence check fails and the loop cannot be parallelized.
The 2nd loop fails in the same way as the 1st, so the nest can only be
parallelized at the 3rd loop, although it could apparently be parallelized at
the 1st loop.
So this looks like a scev issue?
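For reference, here is a sketch of the linearized subscripts one would expect for the g3 and g2 references. It assumes column-major storage with lower bound 1 in every dimension, as the declarations give, and ignores any clamping of negative extents. The symbols o_{g3} and o_{g2} are notation introduced here for the element offset; they are not names from the dump.

```latex
\begin{align*}
o_{g3}(i_1,i_5,i_4,i_3) &= (i_1-1) + f_4\,(i_5-1) + f_4 f_3\,(i_4-1)
                           + f_4 f_3 f_2\,(i_3-1) \\
o_{g2}(i_1,i_2,i_5,i_4,i_3) &= (i_1-1) + f_4\,(i_2-1) + f_4^{2}\,(i_5-1)
                           + f_4^{2} f_3\,(i_4-1) + f_4^{2} f_3 f_2\,(i_3-1)
\end{align*}
```

Both offsets are linear in every loop index, with strides that are loop-invariant but symbolic (for example, f4*f3*f2 elements per iteration of the i3 loop for g3). That is why an affine access function would be expected here rather than scev_not_known. It also seems to agree with the steps in the dump: 8 bytes per i1 iteration for g3 and, if stride.88_19 holds the extent f4, stride.88_19 * 8 bytes per i2 iteration for g2.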
---
Thanks
Jun