https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100173

            Bug ID: 100173
           Summary: telecom/viterb00data_1 has 16.92% regression compared
                    O2 -ftree-vectorize -fvect-cost-model=very-cheap to O2
                     on CLX/ICX, 9% regression on znver3
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
                CC: hjl.tools at gmail dot com
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-*-* i?86-*-*

Created attachment 50647
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50647&action=edit
ACS.cpp

cat testcase

void
__attribute__ ((noipa))
ACS(e_s16 *pBranchMetric)
{
  n_int i;
  e_s16 esMetricIn, esMetric1, esMetric2;

  StatePathMetricData *pIn1 = BufPtr[BufSelector];
  StatePathMetricData *pIn2 = pIn1 + (1<<5)/2;
  StatePathMetricData *pOut = BufPtr[1 - BufSelector];

  BufSelector ^= 1;

  for (i = 0; i < (1<<5)/2; i++) {

    esMetricIn = *pBranchMetric++;

    esMetric1 = pIn1->m_esPathMetric - esMetricIn;
    esMetric2 = pIn2->m_esPathMetric + esMetricIn;

    if (esMetric1 >= esMetric2) {
      pOut->m_esPathMetric = esMetric1;
      pOut->m_esState = (pIn1->m_esState << 1);
    }
    else {
      pOut->m_esPathMetric = esMetric2;
      pOut->m_esState = (pIn2->m_esState << 1);
    }
    pOut++;

    esMetric1 = pIn1->m_esPathMetric + esMetricIn;
    esMetric2 = pIn2->m_esPathMetric - esMetricIn;

    if (esMetric1 >= esMetric2) {
      pOut->m_esPathMetric = esMetric1;
      pOut->m_esState = (pIn1->m_esState << 1) | 1;
    }
    else {
      pOut->m_esPathMetric = esMetric2;
      pOut->m_esState = (pIn2->m_esState << 1) | 1;
    }
    pOut++;

    pIn1++;
    pIn2++;
  }
}

Conditional store replacement is at play here: it sinks the 2 stores from IF_BB
and ELSE_BB into JOIN_BB because they have the same address. With
-fvect-cost-model=very-cheap the sunk stores are then not vectorized, and the
consecutive stores in JOIN_BB give worse IPC on both ICX and znver3. With
-fvect-cost-model=cheap, the loop can be vectorized and is 2.6x faster than O2.
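
To make the transformation concrete, here is a minimal source-level sketch of
what sinking the stores amounts to. It is not GCC output: the Entry type, its
field types, and the function names are assumptions made for illustration, and
the real pass works on GIMPLE basic blocks rather than on C source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the benchmark's StatePathMetricData. */
typedef struct {
  int16_t metric;
  uint16_t state;
} Entry;

/* Before: each arm of the branch (IF_BB / ELSE_BB) stores to the same
   two addresses. */
static void select_branchy(Entry *out, const Entry *a, const Entry *b,
                           int16_t m)
{
  int16_t m1 = (int16_t)(a->metric - m);
  int16_t m2 = (int16_t)(b->metric + m);

  if (m1 >= m2) {
    out->metric = m1;
    out->state = (uint16_t)(a->state << 1);
  } else {
    out->metric = m2;
    out->state = (uint16_t)(b->state << 1);
  }
}

/* After: the arms only select values; the two stores are sunk to the
   join point (JOIN_BB) and become unconditional, back-to-back stores. */
static void select_sunk(Entry *out, const Entry *a, const Entry *b,
                        int16_t m)
{
  int16_t m1 = (int16_t)(a->metric - m);
  int16_t m2 = (int16_t)(b->metric + m);
  int take_a = m1 >= m2;

  int16_t metric = take_a ? m1 : m2;
  uint16_t state = (uint16_t)((take_a ? a->state : b->state) << 1);

  out->metric = metric;
  out->state = state;
}

/* Checks that both forms agree on a few inputs covering both arms. */
void check_equivalence(void)
{
  const int16_t metrics[] = { -20, 0, 5, 20 };
  const Entry a = { 100, 3 };
  const Entry b = { 90, 5 };

  for (size_t i = 0; i < sizeof metrics / sizeof metrics[0]; i++) {
    Entry x, y;
    select_branchy(&x, &a, &b, metrics[i]);
    select_sunk(&y, &a, &b, metrics[i]);
    assert(x.metric == y.metric && x.state == y.state);
  }
}
```

The sunk form is the shape the vectorizer is expected to pick up; when it is
not vectorized, only the back-to-back scalar stores remain.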

So I think we should either vectorize this loop or not sink conditional stores
when the cost model is very-cheap.

The related code is here:

  /* If either vectorization or if-conversion is disabled then do
     not sink any stores.  */
  if (param_max_stores_to_sink == 0
      || (!flag_tree_loop_vectorize && !flag_tree_slp_vectorize)
      || !flag_tree_loop_if_convert)
    return false;
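
As a rough illustration of the second option (not sinking stores when the cost
model is very-cheap), the check could be extended along the following lines.
This is a standalone sketch, not a GCC patch: the struct, the enum, and every
name in it are hypothetical mirrors of the options tested above, and returning
true when no condition matches is an assumption about the surrounding function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cost-model levels; not GCC's internal declarations. */
enum cost_model { COST_UNLIMITED, COST_DYNAMIC, COST_CHEAP, COST_VERY_CHEAP };

/* Hypothetical bundle of the options the quoted check consults. */
struct sink_options {
  int max_stores_to_sink;
  bool loop_vectorize;
  bool slp_vectorize;
  bool loop_if_convert;
  enum cost_model vect_cost_model;
};

static bool may_sink_stores(const struct sink_options *o)
{
  /* Same conditions as the quoted check. */
  if (o->max_stores_to_sink == 0
      || (!o->loop_vectorize && !o->slp_vectorize)
      || !o->loop_if_convert)
    return false;

  /* Assumed extra condition: under very-cheap the sunk stores may not be
     vectorized, so leave them in their original blocks. */
  if (o->vect_cost_model == COST_VERY_CHEAP)
    return false;

  return true;
}

/* A few sample configurations showing how the gate behaves. */
void gate_examples(void)
{
  struct sink_options o = {
    .max_stores_to_sink = 2,
    .loop_vectorize = true,
    .slp_vectorize = true,
    .loop_if_convert = true,
    .vect_cost_model = COST_CHEAP,
  };
  assert(may_sink_stores(&o));

  o.vect_cost_model = COST_VERY_CHEAP;
  assert(!may_sink_stores(&o));

  o.vect_cost_model = COST_CHEAP;
  o.loop_if_convert = false;
  assert(!may_sink_stores(&o));
}
```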
