------- Comment #3 from rguenth at gcc dot gnu dot org  2008-07-23 09:37 -------
The problem is that SLP and the reduction operation do not mix and that the vectorizer doesn't understand the complex component accessors.
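
As a rough source-level illustration (a sketch, not taken from the PR's testcase): after complex lowering, the loop body amounts to an in-place element-wise complex multiply in which the real part is formed with a subtraction and the imaginary part with an addition, and the two are loaded and stored as adjacent, interleaved components. The function name `cmul_inplace`, the `float` element type, and the explicit interleaved layout below are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical helper: in-place element-wise complex multiply, iy[i] *= ix[i].
   Both arrays hold interleaved (real, imag) pairs, so complex element i
   occupies [2*i] (real part) and [2*i + 1] (imaginary part).
   The float element type is an assumption; the dump does not show it. */
void cmul_inplace(float *iy, const float *ix, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Component loads, corresponding to REALPART_EXPR / IMAGPART_EXPR. */
        float yr = iy[2 * i], yi = iy[2 * i + 1];
        float xr = ix[2 * i], xi = ix[2 * i + 1];

        /* Real part uses a subtraction, imaginary part an addition:
           the two lanes of one complex element perform different operations. */
        iy[2 * i]     = yr * xr - yi * xi;
        iy[2 * i + 1] = yr * xi + yi * xr;
    }
}
```

Written this way, the difficulty is visible: the two stores of one iteration are adjacent in memory, but they are computed by different operations from cross-wise combinations of the loaded components.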

<bb 4>:
  # S.4_40 = PHI <S.4_33(5), 1(3)>
  D.1045_26 = S.4_40 + -1;
  CR.27_44 = REALPART_EXPR <(*iy_27(D))[D.1045_26]>;
  CI.28_45 = IMAGPART_EXPR <(*iy_27(D))[D.1045_26]>;
  CR.29_46 = REALPART_EXPR <(*ix_30(D))[D.1045_26]>;
  CI.30_47 = IMAGPART_EXPR <(*ix_30(D))[D.1045_26]>;
  D.1076_48 = CR.27_44 * CR.29_46;
  D.1077_49 = CI.28_45 * CI.30_47;
  D.1078_50 = CR.27_44 * CI.30_47;
  D.1079_51 = CI.28_45 * CR.29_46;
  CR.31_52 = D.1076_48 - D.1077_49;
  CI.32_53 = D.1078_50 + D.1079_51;
  REALPART_EXPR <(*iy_27(D))[D.1045_26]> = CR.31_52;
  IMAGPART_EXPR <(*iy_27(D))[D.1045_26]> = CI.32_53;
  S.4_33 = S.4_40 + 1;
  if (size.3_4 < S.4_33)

There are bugs about this already. For the specific task of complex operations we can possibly change the complex-lowering code to directly generate the vectorized equivalent, though of course teaching the vectorizer to handle the above would be more useful.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |irar at il dot ibm dot com
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-07-23 09:37:26
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36840