https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99746

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #7 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
Reduced to:

      SUBROUTINE CLAREF(A)
      LOGICAL            BLOCK
      COMPLEX            T1 , V2
      COMPLEX            A(LDA, *) , SUM
      LOGICAL            LSAME
      IF (LSAME) THEN
         IF (BLOCK) THEN
            DO 130 J = ITMP1, ITMP2
               SUM = T1 * A(J, ICOL1) * A0 +
     $               V2 * A(J, 2)
               A(J, ICOL1) = -SUM
               A(J, 2) = SUM
  130       CONTINUE
         END IF
      END IF
      END

which produces the following SLP tree:

   node 0x4e150c0 (max_nunits=2, refcnt=1)
   op template: REALPART_EXPR <(*a_29(D))[_12]> = sum$real_60;
         stmt 0 REALPART_EXPR <(*a_29(D))[_12]> = sum$real_60;
         stmt 1 IMAGPART_EXPR <(*a_29(D))[_12]> = sum$imag_61;
         children 0x4e15720
   node 0x4e15720 (max_nunits=2, refcnt=1)
   op template: slp_patt_69 = .COMPLEX_FMA (sum$real_60, sum$real_60, sum$real_60);
         stmt 0 sum$real_60 = _48 + _58;
         stmt 1 sum$imag_61 = _49 + _59;
         children 0x4e15500 0x4e15e08 0x4e15ad8
   node 0x4e15500 (max_nunits=2, refcnt=1)
   op template: _48 = a0_31(D) * _46;
         stmt 0 _48 = a0_31(D) * _46;
         stmt 1 _49 = a0_31(D) * _47;
         children 0x4e15588 0x4e151d0
   node (external) 0x4e15588 (max_nunits=1, refcnt=1)
         { a0_31(D), a0_31(D) }
   node 0x4e151d0 (max_nunits=2, refcnt=1)
   op template: slp_patt_71 = .COMPLEX_MUL (_46, _46);
         stmt 0 _46 = _42 - _43;
         stmt 1 _47 = _44 + _45;
         children 0x4e15038 0x4e15d80
   node (external) 0x4e15038 (max_nunits=1, refcnt=1)
         { t1$real_38(D), t1$imag_41(D) }
   node 0x4e15d80 (max_nunits=2, refcnt=2)
   op template: _17 = REALPART_EXPR <(*a_29(D))[_5]>;
         stmt 0 _17 = REALPART_EXPR <(*a_29(D))[_5]>;
         stmt 1 _16 = IMAGPART_EXPR <(*a_29(D))[_5]>;
         load permutation { 0 1 }
   node 0x4e15e08 (max_nunits=2, refcnt=2)
   op template: _50 = REALPART_EXPR <(*a_29(D))[_12]>;
         stmt 0 _50 = REALPART_EXPR <(*a_29(D))[_12]>;
         stmt 1 _51 = IMAGPART_EXPR <(*a_29(D))[_12]>;
         load permutation { 0 1 }
   node (external) 0x4e15ad8 (max_nunits=1, refcnt=1)
         { v2$real_52(D), v2$imag_53(D) }

The tree is correct, but vect_detect_hybrid_slp then reports:

   marking hybrid: slp_patt_71 = .COMPLEX_MUL (_46, _46);

This is a problem, since the patterns are only valid in SLP.
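
For reference, the two lanes matched by .COMPLEX_MUL have the shape of the
usual scalar expansion of a complex product: a difference of two products in
the real lane (_46 = _42 - _43) and a sum of two products in the imaginary
lane (_47 = _44 + _45). A minimal Python sketch follows. The function name is
hypothetical, and which products _42 and _43 hold is an assumption, since the
dump only shows the definitions of _44 and _45.

```python
def complex_mul_lanes(a_re, a_im, t_re, t_im):
    """Scalar expansion of (a_re + i*a_im) * (t_re + i*t_im)."""
    # Lane 0 (real part): difference of two products, like _46 = _42 - _43.
    # Which products _42 and _43 are is assumed; the dump does not show them.
    real = a_re * t_re - a_im * t_im
    # Lane 1 (imaginary part): sum of two products, like _47 = _44 + _45,
    # with _44 = _17 * t1$imag_41(D) and _45 = _16 * t1$real_38(D) in the dump.
    imag = a_re * t_im + a_im * t_re
    return real, imag


if __name__ == "__main__":
    a, t = complex(1.0, 2.0), complex(3.0, -4.0)
    # Compare the lane-wise result with Python's built-in complex product.
    print(complex_mul_lanes(a.real, a.imag, t.real, t.imag))
    print(a * t)
```

Both lanes have to be produced together for the pattern to make sense, which
is why a pattern statement cannot be vectorized lane by lane outside SLP.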

I don't quite see why the sub-tree is hybrid, though. It determines:

   marking hybrid: _50 = REALPART_EXPR <(*a_29(D))[_12]>;
   marking hybrid: _51 = IMAGPART_EXPR <(*a_29(D))[_12]>;
   marking hybrid: _48 = a0_31(D) * _46;
   marking hybrid: slp_patt_71 = .COMPLEX_MUL (_46, _46);
   marking hybrid: sum$imag_61 = _49 + _59;
   marking hybrid: _49 = a0_31(D) * _47;
   marking hybrid: _59 = _56 + _57;
   marking hybrid: _56 = _50 * v2$imag_53(D);
   marking hybrid: _57 = _51 * v2$real_52(D);
   marking hybrid: _47 = _44 + _45;
   marking hybrid: _44 = _17 * t1$imag_41(D);
   marking hybrid: _45 = _16 * t1$real_38(D);
   marking hybrid: _16 = IMAGPART_EXPR <(*a_29(D))[_5]>;
   marking hybrid: _17 = REALPART_EXPR <(*a_29(D))[_5]>;

So either vect_detect_hybrid_slp is correct, in which case SLP should be
aborted, or it is wrong and this should have been pure SLP.

The problem starts when it marks _50 as hybrid, but I don't see why it thinks
that...
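
One way to reason about a load such as _50 becoming hybrid is a toy model of
use-to-def propagation: a scalar statement that is still relevant but is not
covered by any SLP node needs its operands as loop-vectorized values, so the
pure-SLP definitions of those operands become hybrid. The Python sketch below
is such a model under stated assumptions. It is not GCC's implementation:
`mark_hybrid` and the contents of `pure_slp` and `relevant` are hypothetical,
and only the operand edges are copied from the dump. It does not establish
that this is what happens in this PR.

```python
from collections import deque


def mark_hybrid(operands, pure_slp, relevant):
    """Toy use->def propagation; NOT GCC's vect_detect_hybrid_slp.

    operands: dict mapping a statement name to the names it reads.
    pure_slp: set of statements covered by the SLP tree.
    relevant: set of statements the loop vectorizer must handle.
    Returns the pure-SLP statements that end up marked hybrid.
    """
    hybrid = set()
    # Seeds: relevant statements that are not part of any SLP node.
    worklist = deque(s for s in relevant if s not in pure_slp)
    while worklist:
        stmt = worklist.popleft()
        for op in operands.get(stmt, ()):
            # A pure-SLP definition used from outside SLP becomes hybrid,
            # and its own operands are then visited in turn.
            if op in pure_slp and op not in hybrid:
                hybrid.add(op)
                worklist.append(op)
    return hybrid


if __name__ == "__main__":
    # Operand edges copied from the dump; set membership is hypothetical.
    operands = {
        "_56": ["_50"],          # _56 = _50 * v2$imag_53(D)
        "_57": ["_51"],          # _57 = _51 * v2$real_52(D)
        "_59": ["_56", "_57"],   # _59 = _56 + _57
    }
    pure_slp = {"_50", "_51"}          # the load node in the SLP tree
    relevant = {"_56", "_57", "_59"}   # assumed left outside the tree
    print(sorted(mark_hybrid(operands, pure_slp, relevant)))
```

In this model, a statement absent from the SLP tree but still marked relevant
is enough to pull the loads it reads into the hybrid set.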
