https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101190
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> the issue is that likely (is that prerequisite patch in yet?)
> vect_recog_over_widening_pattern is not detecting that the shift could be
> done in smaller than int precision.  C promotion rules gives us
>
>   _4 = *_3;
>   _5 = (int) _4;
>   _7 = *_6;
>   _8 = (int) _7;
>   _9 = _5 << _8;
>   _10 = (short unsigned int) _9;
>   *_3 = _10;
>
> where promotion of the shift amount is a GCC choice.  The first reason is
> that we hit
>
>   /* See whether we have found that this operation can be done on a
>      narrower type without changing its semantics.  */
>   unsigned int new_precision = last_stmt_info->operation_precision;
>   if (!new_precision)
>     return NULL;
>
> which is because the analysis code seems to bail for non-constant shift
> amounts with the fear to introduce shifts that are undefined (out-of-bounds).

That makes sense for a variable shift count whose range we don't know. But
the testcase below, where the count is known to be in [0, 15], should
generate vpsllvw just like icx does.

https://godbolt.org/z/h9PWfbP7K

void
foo (unsigned short* __restrict pdst, unsigned short* psrc)
{
  for (int i = 0; i != 8; i++)
    pdst[i] <<= (psrc[i] % 16);
}
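
To illustrate why narrowing should be safe once the count is bounded, here is a minimal sketch (not GCC code) that compares the promoted-to-int shift with a shift that never leaves 16 bits. It assumes int is at least 32 bits wide. The helper names shift_promoted, shift_lane16 and count_mismatches are made up for this illustration, and shift_lane16 is only a model of a 16-bit lane, not a description of what the vectorizer emits.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar semantics after C promotion: shift in a wide type, then
   truncate on the store.  The cast to unsigned int (instead of int)
   is only there to keep this sketch free of signed overflow.  */
uint16_t shift_promoted(uint16_t x, uint16_t count)
{
    unsigned int wide = (unsigned int)x << (count % 16);
    return (uint16_t)wide;
}

/* Hypothetical model of one 16-bit lane: the value is truncated to
   16 bits after every single-bit step, so bits pushed past bit 15
   are dropped immediately.  */
uint16_t shift_lane16(uint16_t x, uint16_t count)
{
    uint16_t lane = x;
    for (unsigned int k = 0; k < count % 16u; k++)
        lane = (uint16_t)(lane << 1);
    return lane;
}

/* Exhaustively compare both versions for every 16-bit value and a
   range of raw counts (masked to [0, 15] inside the helpers).
   Returns the number of mismatches found.  */
unsigned long count_mismatches(void)
{
    unsigned long mismatches = 0;
    for (unsigned int x = 0; x <= 0xFFFFu; x++)
        for (unsigned int count = 0; count < 64; count++)
            if (shift_promoted((uint16_t)x, (uint16_t)count)
                != shift_lane16((uint16_t)x, (uint16_t)count))
                mismatches++;
    return mismatches;
}

/* Convenience wrapper: aborts if the two versions ever disagree.  */
void self_check(void)
{
    assert(count_mismatches() == 0);
}
```

Calling self_check() from a test driver exercises all 65536 values. The point of the sketch is only that, with the count masked to [0, 15], the low 16 bits of the int shift depend on nothing but the low 16 bits of the input, so the out-of-bounds concern quoted above does not apply to this testcase.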