https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98537
--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to prathamesh3492 from comment #4)
> Hi,
> It seems to work on my machine for x86_64.
> Compiling with -O3 (or -O2),
> .optimized dump shows:
>
> v4si foo (v4si b, v4si a)
> {
>   v4si c;
>   vector(4) <signed-boolean:32> _1;
>
>   <bb 2> [local count: 1073741824]:
>   _1 = a_2(D) == b_3(D);
>   c_4 = VIEW_CONVERT_EXPR<v4si>(_1);
>   return c_4;
>
> }
>
> I tried on top of af362af18f405c34840d820143aa3a94f72fce4d.
>
> Btw, on ARM it seems to "scalarize" the code,
> .optimized dump shows:
>
>   _6 = BIT_FIELD_REF <a_2(D), 32, 0>;
>   _7 = BIT_FIELD_REF <b_3(D), 32, 0>;
>   _8 = _6 == _7 ? -1 : 0;
>   _9 = BIT_FIELD_REF <a_2(D), 32, 32>;
>   _10 = BIT_FIELD_REF <b_3(D), 32, 32>;
>   _11 = _9 == _10 ? -1 : 0;
>   _12 = BIT_FIELD_REF <a_2(D), 32, 64>;
>   _13 = BIT_FIELD_REF <b_3(D), 32, 64>;
>   _14 = _12 == _13 ? -1 : 0;
>   _15 = BIT_FIELD_REF <a_2(D), 32, 96>;
>   _16 = BIT_FIELD_REF <b_3(D), 32, 96>;
>   _17 = _15 == _16 ? -1 : 0;
>   c_4 = {_8, _11, _14, _17};
>   return c_4;
>
> Thanks,
> Prathamesh

It needs AVX512; try -march=skylake-avx512.
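
For readers who want to reproduce the dumps, here is a minimal sketch of a source function that matches the signature shown in the quoted .optimized dump. The original testcase is not part of this comment, so the typedef, the function body, and the check_foo helper are assumptions reconstructed from the dump, not the reporter's code.

```c
#include <assert.h>

/* Four 32-bit signed lanes in a 16-byte vector (GNU vector extension).
   Assumed definition; the dump only shows the type name v4si.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Lane-wise equality: each result lane is -1 (all bits set) where the
   inputs are equal and 0 where they differ.  */
v4si
foo (v4si b, v4si a)
{
  v4si c = a == b;
  return c;
}

/* Hypothetical self-check of the lane semantics; not from the bug report.  */
void
check_foo (void)
{
  v4si a = { 1, 2, 3, 4 };
  v4si b = { 1, 0, 3, 0 };
  v4si c = foo (b, a);
  assert (c[0] == -1 && c[1] == 0 && c[2] == -1 && c[3] == 0);
}
```

Compiling this with -O2 -fdump-tree-optimized, once without and once with -march=skylake-avx512, should allow the two .optimized dumps to be compared.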