[Bug tree-optimization/101190] vectorizer failed to generate vashlv8hi, but extend to int and use vashlv4si instead

crazylht at gmail dot com via Gcc-bugs Thu, 24 Jun 2021 23:28:05 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101190


--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> the issue is that likely (is that prerequisite patch in yet?)
> vect_recog_over_widening_pattern is not detecting that the shift could be
> done in smaller than int precision.  C promotion rules gives us
> 
>   _4 = *_3;
>   _5 = (int) _4;
>   _7 = *_6;
>   _8 = (int) _7;
>   _9 = _5 << _8;
>   _10 = (short unsigned int) _9;
>   *_3 = _10;
> 
> where promotion of the shift amount is a GCC choice.  The first reason is
> that we hit
> 
>   /* See whether we have found that this operation can be done on a
>      narrower type without changing its semantics.  */
>   unsigned int new_precision = last_stmt_info->operation_precision;
>   if (!new_precision)
>     return NULL;
> 
> which is because the analysis code seems to bail for non-constant shift
> amounts with the fear to introduce shifts that are undefined (out-of-bounds).

That makes sense for a variable shift count whose range we don't know.

But the testcase below, where the count is masked to 0..15, should generate
vpsllvw, just like icx does.

https://godbolt.org/z/h9PWfbP7K

void
foo (unsigned short* __restrict pdst, unsigned short* psrc)
{
  for (int i = 0; i != 8; i++)
    pdst[i] <<= (psrc[i] % 16);
}
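
To illustrate why narrowing should be safe here, below is a minimal sketch (not
GCC code) that compares the int-promoted shift from the GIMPLE above with a
shift carried out entirely in 16 bits, for counts reduced by % 16. The helper
names shl_via_int, shl_in_16_bits and count_mismatches are made up for this
example, and the 16-bit shift is emulated by repeated doubling because C itself
always promotes the operands.

```c
#include <assert.h>
#include <stdint.h>

/* Shift as the C source expresses it: promote to int, shift in 32-bit
   precision, truncate back to 16 bits.  The count is reduced with % 16,
   as in the testcase, so the int shift never overflows.  */
static uint16_t
shl_via_int (uint16_t x, uint16_t count)
{
  int wide = (int) x;
  int amount = (int) (count % 16);
  return (uint16_t) (wide << amount);
}

/* Emulated 16-bit shift: each doubling is truncated to 16 bits, so no
   intermediate value is ever wider than the element type.  Only valid
   for in-range counts, hence the assert.  */
static uint16_t
shl_in_16_bits (uint16_t x, uint16_t count)
{
  assert (count < 16);
  uint16_t r = x;
  for (uint16_t i = 0; i < count; i++)
    r = (uint16_t) (r * 2u);
  return r;
}

/* Count how often the two forms disagree over all 16-bit values and a
   sample of raw (unreduced) shift counts.  */
static unsigned long
count_mismatches (void)
{
  unsigned long mismatches = 0;
  for (uint32_t x = 0; x <= 0xFFFF; x++)
    for (uint16_t c = 0; c < 64; c++)
      if (shl_via_int ((uint16_t) x, c)
	  != shl_in_16_bits ((uint16_t) x, (uint16_t) (c % 16)))
	mismatches++;
  return mismatches;
}
```

If count_mismatches () returns 0, the two forms agree whenever the count is
known to be below 16, which is the range information the % 16 provides.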
