https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98537
--- Comment #4 from prathamesh3492 at gcc dot gnu.org ---
Hi,
It seems to work for me on x86_64.
Compiling with -O3 (or -O2), the .optimized dump shows:
v4si foo (v4si b, v4si a)
{
  v4si c;
  vector(4) <signed-boolean:32> _1;

  <bb 2> [local count: 1073741824]:
  _1 = a_2(D) == b_3(D);
  c_4 = VIEW_CONVERT_EXPR<v4si>(_1);
  return c_4;
}
I tested this on top of commit af362af18f405c34840d820143aa3a94f72fce4d.
Btw, on ARM the comparison seems to get "scalarized" into per-element
compares; the .optimized dump shows:
  _6 = BIT_FIELD_REF <a_2(D), 32, 0>;
  _7 = BIT_FIELD_REF <b_3(D), 32, 0>;
  _8 = _6 == _7 ? -1 : 0;
  _9 = BIT_FIELD_REF <a_2(D), 32, 32>;
  _10 = BIT_FIELD_REF <b_3(D), 32, 32>;
  _11 = _9 == _10 ? -1 : 0;
  _12 = BIT_FIELD_REF <a_2(D), 32, 64>;
  _13 = BIT_FIELD_REF <b_3(D), 32, 64>;
  _14 = _12 == _13 ? -1 : 0;
  _15 = BIT_FIELD_REF <a_2(D), 32, 96>;
  _16 = BIT_FIELD_REF <b_3(D), 32, 96>;
  _17 = _15 == _16 ? -1 : 0;
  c_4 = {_8, _11, _14, _17};
  return c_4;
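For reference, here is a minimal sketch of what the two dumps compute. The
test case source is not quoted in this comment, so the body of foo below is
my reconstruction from the dump and may differ from the actual test;
foo_scalarized is a hypothetical helper that just spells out the lane-by-lane
form seen in the ARM dump.

```c
/* Assumption: v4si is a 4 x 32-bit int vector, as the dumps suggest.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Vector form: a single element-wise compare.  With GCC's vector
   extensions each result lane is -1 where the operands are equal and 0
   otherwise, which matches the x86_64 dump.  */
v4si
foo (v4si b, v4si a)
{
  v4si c = (a == b);
  return c;
}

/* Scalarized form (hypothetical name): one compare per 32-bit lane,
   mirroring the BIT_FIELD_REF sequence in the ARM dump.  */
v4si
foo_scalarized (v4si b, v4si a)
{
  v4si c;
  for (int i = 0; i < 4; i++)
    c[i] = (a[i] == b[i]) ? -1 : 0;
  return c;
}
```

Both functions should return the same value for any input, so the difference
between the two dumps is in the generated code rather than in the result.
Assuming the sketch is saved as foo.c, the dump can be produced with
something like gcc -O2 -fdump-tree-optimized -c foo.c.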
Thanks,
Prathamesh