https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99746
Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #7 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
reduced to

      SUBROUTINE CLAREF(A)
      LOGICAL BLOCK
      COMPLEX T1 , V2
      COMPLEX A(LDA, *) , SUM
      LOGICAL LSAME
      IF (LSAME) THEN
         IF (BLOCK) THEN
            DO 130 J = ITMP1, ITMP2
               SUM = T1 * A(J, ICOL1) * A0 +
     $               V2 * A(J, 2)
               A(J, ICOL1) = -SUM
               A(J, 2) = SUM
  130       CONTINUE
         END IF
      END IF
      END

which produces the following SLP tree,

  node 0x4e150c0 (max_nunits=2, refcnt=1)
  op template: REALPART_EXPR <(*a_29(D))[_12]> = sum$real_60;
        stmt 0 REALPART_EXPR <(*a_29(D))[_12]> = sum$real_60;
        stmt 1 IMAGPART_EXPR <(*a_29(D))[_12]> = sum$imag_61;
        children 0x4e15720
  node 0x4e15720 (max_nunits=2, refcnt=1)
  op template: slp_patt_69 = .COMPLEX_FMA (sum$real_60, sum$real_60, sum$real_60);
        stmt 0 sum$real_60 = _48 + _58;
        stmt 1 sum$imag_61 = _49 + _59;
        children 0x4e15500 0x4e15e08 0x4e15ad8
  node 0x4e15500 (max_nunits=2, refcnt=1)
  op template: _48 = a0_31(D) * _46;
        stmt 0 _48 = a0_31(D) * _46;
        stmt 1 _49 = a0_31(D) * _47;
        children 0x4e15588 0x4e151d0
  node (external) 0x4e15588 (max_nunits=1, refcnt=1)
        { a0_31(D), a0_31(D) }
  node 0x4e151d0 (max_nunits=2, refcnt=1)
  op template: slp_patt_71 = .COMPLEX_MUL (_46, _46);
        stmt 0 _46 = _42 - _43;
        stmt 1 _47 = _44 + _45;
        children 0x4e15038 0x4e15d80
  node (external) 0x4e15038 (max_nunits=1, refcnt=1)
        { t1$real_38(D), t1$imag_41(D) }
  node 0x4e15d80 (max_nunits=2, refcnt=2)
  op template: _17 = REALPART_EXPR <(*a_29(D))[_5]>;
        stmt 0 _17 = REALPART_EXPR <(*a_29(D))[_5]>;
        stmt 1 _16 = IMAGPART_EXPR <(*a_29(D))[_5]>;
        load permutation { 0 1 }
  node 0x4e15e08 (max_nunits=2, refcnt=2)
  op template: _50 = REALPART_EXPR <(*a_29(D))[_12]>;
        stmt 0 _50 = REALPART_EXPR <(*a_29(D))[_12]>;
        stmt 1 _51 = IMAGPART_EXPR <(*a_29(D))[_12]>;
        load permutation { 0 1 }
  node (external) 0x4e15ad8 (max_nunits=1, refcnt=1)
        { v2$real_52(D), v2$imag_53(D) }

which is correct, but vect_detect_hybrid_slp determines

  marking hybrid: slp_patt_71 = .COMPLEX_MUL (_46, _46);

which is a problem, since the patterns are only valid in SLP. I don't quite
see why the sub-tree is hybrid, though. It determines

  marking hybrid: _50 = REALPART_EXPR <(*a_29(D))[_12]>;
  marking hybrid: _51 = IMAGPART_EXPR <(*a_29(D))[_12]>;
  marking hybrid: _48 = a0_31(D) * _46;
  marking hybrid: slp_patt_71 = .COMPLEX_MUL (_46, _46);
  marking hybrid: sum$imag_61 = _49 + _59;
  marking hybrid: _49 = a0_31(D) * _47;
  marking hybrid: _59 = _56 + _57;
  marking hybrid: _56 = _50 * v2$imag_53(D);
  marking hybrid: _57 = _51 * v2$real_52(D);
  marking hybrid: _47 = _44 + _45;
  marking hybrid: _44 = _17 * t1$imag_41(D);
  marking hybrid: _45 = _16 * t1$real_38(D);
  marking hybrid: _16 = IMAGPART_EXPR <(*a_29(D))[_5]>;
  marking hybrid: _17 = REALPART_EXPR <(*a_29(D))[_5]>;

So either vect_detect_hybrid_slp is correct, in which case SLP should be
aborted, or it's not right and this should have been pure.

The problem starts because it marks _50 as hybrid, but I don't see why it
thinks that...