[Bug tree-optimization/49955] Fails to do partial basic-block SLP

irar at il dot ibm.com Fri, 05 Aug 2011 03:51:46 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49955


--- Comment #3 from Ira Rosen <irar at il dot ibm.com> 2011-08-05 10:50:27 UTC ---
(In reply to comment #1)
> The loop that remains after fixing PR49957 in 410.bwaves is the following,
> which loop SLP does not handle (well, I'm not exactly sure) because
> 
> t.f:18: note: ==> examining statement: t1_62 = *q_61(D)[D.1645_60];
> 
> t.f:18: note: num. args = 4 (not unary/binary/ternary op).
> t.f:18: note: vect_is_simple_use: operand *q_61(D)[D.1645_60]
> t.f:18: note: not ssa-name.
> t.f:18: note: use not simple.
> t.f:18: note: no array mode for V2DF[5]
> t.f:18: note: the size of the group of strided accesses is not a power of 2
> t.f:18: note: not vectorized: relevant stmt not supported: t1_62 =
> *q_61(D)[D.1645_60];
> 
> t.f:18: note: bad operation or unsupported loop bound.
> t.f:1: note: vectorized 0 loops in function.
> 
> probably the issue that we can't handle this kind of "invariants" in the
> SLP group?  Thus, the SLP group should be q(2,..), q(3,...) ... q(5, ...)
> which is size 4, q(1,..) should be treated as invariant. 
> 

This loop is not SLPed because there is no SLP opportunity here apart from the
loads. The only isomorphic statements after the loads are
               t2=q(2,i,j,k)/t1
               t3=q(3,i,j,k)/t1
               t4=q(4,i,j,k)/t1

and, to some extent, here
              t7=((dabs(t2)+t6)/dx+mu/dx**2)**2 +
     1            ((dabs(t3)+t6)/dy+mu/dy**2)**2 +
     2            ((dabs(t4)+t6)/dz+mu/dz**2)**2

but these are groups of 3.
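
To make the group-of-3 point concrete, here is a minimal C sketch of the three
divisions. It is only an illustration, not the bwaves source: the `cell` type,
the function names and the `out` array are hypothetical (in the Fortran loop
t2..t4 stay scalar temporaries and nothing is stored), and t1 = q(1,i,j,k) is
assumed from the dump quoted above. One way to read the remark is that three
isomorphic statements do not pack evenly into two-element V2DF vectors.

```c
/* Hypothetical stand-in for one (i,j,k) cell of q(1:5,i,j,k):
   five contiguous doubles.  */
typedef struct { double v[5]; } cell;

/* The three isomorphic divisions: same operation, adjacent loads
   q(2..4), one shared divisor.  The `out` array exists only so the
   results can be inspected; the original loop has no such stores.  */
void divide_group(const cell *q, double out[3])
{
    double t1 = q->v[0];      /* assumed: t1 = q(1,i,j,k) */
    out[0] = q->v[1] / t1;    /* t2 = q(2,i,j,k)/t1 */
    out[1] = q->v[2] / t1;    /* t3 = q(3,i,j,k)/t1 */
    out[2] = q->v[3] / t1;    /* t4 = q(4,i,j,k)/t1 */
}

/* Number of statements left over once a group has been packed into
   full vectors of `lanes` elements each.  */
int leftover_after_packing(int group_size, int lanes)
{
    return group_size % lanes;
}
```

With two doubles per vector, `leftover_after_packing(3, 2)` is 1: one full
vector and one statement left over.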

Moreover, the current implementation starts building the SLP tree from a group
of strided stores, a group of reductions, or a reduction chain. None of these
exists here.
But, again, even if we could start from a group of loads, it wouldn't help
much here anyway.
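
As a rough illustration of these starting points, the C sketch below contrasts
two loops that contain such a seed with one that does not. All function names
and array layouts are made up for the example. The last loop is only assumed
to resemble the shape of the bwaves loop (adjacent loads feeding a single
scalar maximum), and whether a given GCC version actually vectorizes any of
these loops is not claimed here.

```c
/* Seed: four adjacent stores per iteration form a group of stores.  */
void store_group(double *restrict a, const double *restrict b, int n)
{
    for (int i = 0; i < n; i++) {
        a[4 * i + 0] = b[4 * i + 0] * 2.0;
        a[4 * i + 1] = b[4 * i + 1] * 2.0;
        a[4 * i + 2] = b[4 * i + 2] * 2.0;
        a[4 * i + 3] = b[4 * i + 3] * 2.0;
    }
}

/* Seed: two independent reductions updated in the same iteration
   form a group of reductions.  */
void reduction_group(const double *b, int n, double *s0, double *s1)
{
    double x = 0.0, y = 0.0;
    for (int i = 0; i < n; i++) {
        x += b[2 * i + 0];
        y += b[2 * i + 1];
    }
    *s0 = x;
    *s1 = y;
}

/* No seed of the kinds listed: adjacent loads, no stores inside the
   loop, and a single scalar maximum (assumed shape, not the bwaves
   source).  */
double loads_only(const double *q, int n)
{
    double m = 0.0;
    for (int i = 0; i < n; i++) {
        double t1 = q[5 * i + 0];
        double t2 = q[5 * i + 1] / t1;
        double t3 = q[5 * i + 2] / t1;
        double t4 = q[5 * i + 3] / t1;
        double t = t2 * t2 + t3 * t3 + t4 * t4;
        if (t > m)
            m = t;
    }
    return m;
}
```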

Ira
