https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98537

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to prathamesh3492 from comment #4)
> Hi,
> It seems to work on my machine for x86_64.
> Compiling with -O3 (or -O2),
> .optimized dump shows:
> 
> v4si foo (v4si b, v4si a)
> {
>   v4si c;
>   vector(4) <signed-boolean:32> _1;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = a_2(D) == b_3(D);
>   c_4 = VIEW_CONVERT_EXPR<v4si>(_1);
>   return c_4;
> 
> }
> 
> I tried on top of af362af18f405c34840d820143aa3a94f72fce4d.
> 
> Btw, on ARM it seems to "scalarize" the code,
> .optimized dump shows:
> 
>   _6 = BIT_FIELD_REF <a_2(D), 32, 0>;
>   _7 = BIT_FIELD_REF <b_3(D), 32, 0>;
>   _8 = _6 == _7 ? -1 : 0;
>   _9 = BIT_FIELD_REF <a_2(D), 32, 32>;
>   _10 = BIT_FIELD_REF <b_3(D), 32, 32>;
>   _11 = _9 == _10 ? -1 : 0;
>   _12 = BIT_FIELD_REF <a_2(D), 32, 64>;
>   _13 = BIT_FIELD_REF <b_3(D), 32, 64>;
>   _14 = _12 == _13 ? -1 : 0;
>   _15 = BIT_FIELD_REF <a_2(D), 32, 96>;
>   _16 = BIT_FIELD_REF <b_3(D), 32, 96>;
>   _17 = _15 == _16 ? -1 : 0;
>   c_4 = {_8, _11, _14, _17};
>   return c_4;
> 
> Thanks,
> Prathamesh

It needs AVX512; try -march=skylake-avx512.
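
For reference, a minimal sketch of the kind of test case being discussed. The typedef and function body are assumptions inferred from the quoted .optimized dump (`_1 = a_2(D) == b_3(D)`), not copied from the original report:

```c
/* Hypothetical reconstruction: v4si is assumed to be a 16-byte vector of
   four ints, declared with the GCC vector extension. */
typedef int v4si __attribute__ ((vector_size (16)));

v4si
foo (v4si b, v4si a)
{
  /* Element-wise comparison: each lane is -1 (all bits set) where the
     inputs are equal and 0 where they differ. */
  v4si c = a == b;
  return c;
}
```

Compiling this with something like `gcc -O2 -march=skylake-avx512 -fdump-tree-optimized -c` should make it possible to compare the resulting dump against the one quoted above.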
