https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103771
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Tamar Christina from comment #1)
> Looks like the change causes the simpler conditional to be detected by the
> vectorizer as a masked operation, which in principle makes sense:
>
> note: vect_recog_mask_conversion_pattern: detected: iftmp.0_21 = x.1_14 > 255 ? iftmp.0_19 : iftmp.0_20;
> note: mask_conversion pattern recognized: patt_43 = patt_42 ? iftmp.0_19 : iftmp.0_20;
> note: extra pattern stmt: patt_40 = x.1_14 > 255;
> note: extra pattern stmt: patt_42 = (<signed-boolean:8>) patt_40;
>
> However not quite sure how the masking works on x86. The additional
> statement generated for patt_42 causes it to fail during vectorization:
>
> note: ==> examining pattern def statement: patt_42 = (<signed-boolean:8>) patt_40;
> note: ==> examining statement: patt_42 = (<signed-boolean:8>) patt_40;
> note: vect_is_simple_use: operand x.1_14 > 255, type of def: internal
> note: vect_is_simple_use: vectype vector(8) <signed-boolean:1>
> missed: conversion not supported by target.
> note: vect_is_simple_use: operand x.1_14 > 255, type of def: internal
> note: vect_is_simple_use: vectype vector(8) <signed-boolean:1>
> note: vect_is_simple_use: operand x.1_14 > 255, type of def: internal
> note: vect_is_simple_use: vectype vector(8) <signed-boolean:1>
> missed: not vectorized: relevant stmt not supported: patt_42 = (<signed-boolean:8>) patt_40;
> missed: bad operation or unsupported loop bound.
> note: ***** Analysis failed with vector mode V32QI
>
> as there's no conversion patterns for `VEC_UNPACK_LO_EXPR` between bool and
> a mask.

With AVX512 we use a scalar mode for the mask, so can we use
VEC_UNPACKS_SBOOL_LO_ here? We have vec_unpacks_sbool_lo/hi_qi, which should
be used for the conversion from vector<8> <signed-boolean:1> to
vector<4> <signed-boolean:1>.

>
> which explains why it works for AVX2 and AVX512BW. AVX512F doesn't seem to
> allow any QI mode conversions [1] so it fails..
>
> Not sure why it's doing the replacement without checking to see that the
> target is able to vectorize the statements it generates later. Specifically
> it doesn't check if what's returned by build_mask_conversion is supported or
> not.
>
> My guess is because vectorizable_condition will fail anyway without the type
> of the conditional being a vector boolean.
>
> With -mavx512vl V32QI seems to generate in the pattern mask conversions
> between vector (8) <signed-boolean:1> and without it vector(32)
> <signed-boolean:8>. I think some x86 person needs to give a hint here :)
>
> [1] https://www.felixcloutier.com/x86/kunpckbw:kunpckwd:kunpckdq
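
To make the VEC_UNPACKS_SBOOL_LO_ question above more concrete, here is a rough C model of what unpacking a scalar-mode mask amounts to. It assumes the AVX512 convention that a mask is an integer with one bit per lane, so a vector<8> <signed-boolean:1> fits in a QImode value. The function names are made up for illustration, and this is only a sketch of the intended semantics, not what the vectorizer or the i386 backend actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: an 8-lane mask is a scalar byte, bit i = lane i. */

/* Low half: lanes 0..3 already sit in bits 0..3, so just drop the rest. */
static uint8_t model_unpack_sbool_lo(uint8_t mask8)
{
    return (uint8_t)(mask8 & 0x0f);
}

/* High half: lanes 4..7 are shifted down into bits 0..3. */
static uint8_t model_unpack_sbool_hi(uint8_t mask8)
{
    return (uint8_t)((mask8 >> 4) & 0x0f);
}

/* Small self-check: 0xA5 = 1010 0101b has lanes 0, 2, 5 and 7 set. */
static void model_unpack_self_check(void)
{
    assert(model_unpack_sbool_lo(0xA5) == 0x05); /* lanes 0 and 2 */
    assert(model_unpack_sbool_hi(0xA5) == 0x0A); /* lanes 5 and 7 -> bits 1 and 3 */
}
```

In this model neither half needs a real vector unpack: the low half is a truncation and the high half is a right shift by half the lane count, which is why a dedicated sbool unpack looks like a better fit for scalar masks than `VEC_UNPACK_LO_EXPR`.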