https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100173
            Bug ID: 100173
           Summary: telecom/viterb00data_1 has 16.92% regression compared O2 -ftree-vectorize -fvect-cost-model=very-cheap to O2 on CLX/ICX, 9% regression on znver3
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
                CC: hjl.tools at gmail dot com
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-*-* i?86-*-*

Created attachment 50647
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50647&action=edit
ACS.cpp

cat testcase

void
__attribute__ ((noipa))
ACS(e_s16 *pBranchMetric)
{
  n_int i;
  e_s16 esMetricIn, esMetric1, esMetric2;
  StatePathMetricData *pIn1 = BufPtr[BufSelector];
  StatePathMetricData *pIn2 = pIn1 + (1<<5)/2;
  StatePathMetricData *pOut = BufPtr[1 - BufSelector];

  BufSelector ^= 1;

  for (i = 0; i < (1<<5)/2; i++)
    {
      esMetricIn = *pBranchMetric++;

      esMetric1 = pIn1->m_esPathMetric - esMetricIn;
      esMetric2 = pIn2->m_esPathMetric + esMetricIn;
      if (esMetric1 >= esMetric2)
        {
          pOut->m_esPathMetric = esMetric1;
          pOut->m_esState = (pIn1->m_esState << 1);
        }
      else
        {
          pOut->m_esPathMetric = esMetric2;
          pOut->m_esState = (pIn2->m_esState << 1);
        }
      pOut++;

      esMetric1 = pIn1->m_esPathMetric + esMetricIn;
      esMetric2 = pIn2->m_esPathMetric - esMetricIn;
      if (esMetric1 >= esMetric2)
        {
          pOut->m_esPathMetric = esMetric1;
          pOut->m_esState = (pIn1->m_esState << 1) | 1;
        }
      else
        {
          pOut->m_esPathMetric = esMetric2;
          pOut->m_esState = (pIn2->m_esState << 1) | 1;
        }
      pOut++;

      pIn1++;
      pIn2++;
    }
}

Conditional store replacement is what plays here: it sinks the 2 stores from
IF_BB and ELSE_BB to JOIN_BB since they have the same address. But the loop
then fails to vectorize with -fvect-cost-model=very-cheap, which causes worse
IPC for the consecutive stores in JOIN_BB on both ICX and znver3.

With -fvect-cost-model=cheap, the loop can be vectorized and is 2.6x faster
than O2.

So I think we should either vectorize this loop or not sink conditional stores
when the cost model is very-cheap.
The related code is here:

  /* If either vectorization or if-conversion is disabled then do
     not sink any stores.  */
  if (param_max_stores_to_sink == 0
      || (!flag_tree_loop_vectorize && !flag_tree_slp_vectorize)
      || !flag_tree_loop_if_convert)
    return false;
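
To make the store sinking described in this report concrete, here is a minimal source-level sketch of one half of the loop body before and after the transformation. It is only an illustration, not GCC's actual GIMPLE output: PathState, select_branchy, select_sunk and count_mismatches are hypothetical names, and PathState merely stands in for the benchmark's StatePathMetricData.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the benchmark's StatePathMetricData. */
typedef struct {
    int16_t metric;
    int16_t state;
} PathState;

/* Form as written in the source: each arm of the branch stores to the
   same two addresses (the stores live in IF_BB and ELSE_BB). */
static void select_branchy(PathState *out, const PathState *a,
                           const PathState *b, int16_t bm)
{
    assert(out && a && b);
    int16_t m1 = (int16_t)(a->metric - bm);
    int16_t m2 = (int16_t)(b->metric + bm);
    if (m1 >= m2) {
        out->metric = m1;
        out->state = (int16_t)(a->state << 1);
    } else {
        out->metric = m2;
        out->state = (int16_t)(b->state << 1);
    }
}

/* Source-level picture of the sunk form: the branch only selects values,
   and the two stores execute unconditionally, back to back, in the join
   block (JOIN_BB in the report). */
static void select_sunk(PathState *out, const PathState *a,
                        const PathState *b, int16_t bm)
{
    assert(out && a && b);
    int16_t m1 = (int16_t)(a->metric - bm);
    int16_t m2 = (int16_t)(b->metric + bm);
    int16_t metric, state;
    if (m1 >= m2) {
        metric = m1;
        state = (int16_t)(a->state << 1);
    } else {
        metric = m2;
        state = (int16_t)(b->state << 1);
    }
    out->metric = metric;   /* sunk store 1 */
    out->state = state;     /* sunk store 2 */
}

/* Runs both variants over the same made-up inputs and returns the number
   of outputs that differ. */
int count_mismatches(void)
{
    enum { N = 16 };
    int mismatches = 0;
    for (int i = 0; i < N; i++) {
        PathState a = { (int16_t)(i * 7 % 11), (int16_t)i };
        PathState b = { (int16_t)(i * 5 % 13), (int16_t)(N - i) };
        int16_t bm = (int16_t)(i % 4 - 2);
        PathState out1, out2;
        select_branchy(&out1, &a, &b, bm);
        select_sunk(&out2, &a, &b, bm);
        if (out1.metric != out2.metric || out1.state != out2.state)
            mismatches++;
    }
    return mismatches;
}
```

Both functions compute the same result; the difference is only where the stores sit. In the sunk form the stores are straight-line code that the vectorizer could combine, but if vectorization is then rejected by the cost model, what remains is the consecutive scalar stores in the join block that the report measures as slower.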