https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68576
Bug ID: 68576
Summary: scev failed for loop auto parallelize
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: majun4950646 at 163 dot com
Target Milestone: ---

scev fails to analyze the 1st and 2nd loops in the following example (it comes
from the testsuite file interchange-2.f, which is the kernel extracted from
bwaves):
---
      subroutine foo(f1,f2,f3,f4,f5,f6,f7,f8,f9,f0,g1,g2,g3)
      implicit none
      integer f4,f3,f2,f1
      integer g4,g5,g6,g7,g8,g9
      integer i1,i2,i3,i4,i5

      real*8 g1(f4,f3,f2,f1),g2(f4,f4,f3,f2,f1),g3(f4,f3,f2,f1)
      real*8 f0(f4,f4,f3,f2,f1),f9(f4,f4,f3,f2,f1),f8(f4,f4,f3,f2,f1)
      real*8 f7(f4,f4,f3,f2,f1),f6(f4,f4,f3,f2,f1),f5(f4,f4,f3,f2,f1)

      do i3=1,f1
         g8=mod(i3+f1-2,f1)+1
         g9=mod(i3,f1)+1
         do i4=1,f2
            g6=mod(i4+f2-2,f2)+1
            g7=mod(i4,f2)+1
            do i5=1,f3
               g4=mod(i5+f3-2,f3)+1
               g5=mod(i5,f3)+1
               do i1=1,f4
                  g3(i1,i5,i4,i3)=0.0d0
                  do i2=1,f4
                     g3(i1,i5,i4,i3)=g3(i1,i5,i4,i3)+
     1               g2(i1,i2,i5,i4,i3)*g1(i2,i5,i4,i3)+
     2               f0(i1,i2,i5,i4,i3)*g1(i2,g5,i4,i3)+
     3               f9(i1,i2,i5,i4,i3)*g1(i2,i5,g7,i3)+
     4               f8(i1,i2,i5,i4,i3)*g1(i2,i5,i4,g9)+
     5               f7(i1,i2,i5,i4,i3)*g1(i2,g4,i4,i3)+
     6               f6(i1,i2,i5,i4,i3)*g1(i2,i5,g6,i3)+
     7               f5(i1,i2,i5,i4,i3)*g1(i2,i5,i4,g8)
                  enddo
               enddo
            enddo
         enddo
      enddo
      return
      end
---
scev analysis of the first loop fails because the access function of some data
refs is scev_not_known. The dump info looks like this:
---
Creating dr for *g3_77(D)[_76]
        base_address: g3_77(D)
        offset from base address: (ssizetype) ((sizetype) (_370 + _372) * 8)
        constant offset from base address: 8
        step: 8
        aligned to: 8
        base_object: *g3_77(D)
        Access function 0: scev_not_known;

Creating dr for *g2_90(D)[_89]
        base_address: g2_90(D)
        offset from base address: (ssizetype) ((sizetype) (((_378 + _379) + stride.88_19) + (integer(kind=8)) i1_1) * 8)
        constant offset from base address: 0
        step: (ssizetype) ((sizetype) stride.88_19 * 8)
        aligned to: 8
        base_object: *g2_90(D)
        Access function 0: scev_not_known;
---
This makes the data dependence check fail, because the access function is
neither affine nor constant, so this loop cannot be parallelized.

The 2nd loop fails in the same way as the 1st. Parallelization only succeeds on
the 3rd loop, although the 1st loop can apparently be parallelized.

So it looks like a scev issue?
---
Thanks
Jun