https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105668
--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 20 May 2022, roger at nextmovesoftware dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105668
>
> Roger Sayle <roger at nextmovesoftware dot com> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |roger at nextmovesoftware
>                    |                            |dot com
>
> --- Comment #3 from Roger Sayle <roger at nextmovesoftware dot com> ---
> I suspect that the middle-end could be a bit more forgiving whilst expanding
> vcond_mask.  If a target doesn't provide V1TImode, GCC can fall back to using
> V2DImode, and if that isn't supported V4SImode, then V8HImode then V16QImode.
> On x86-64, these all use the same vblendvb instruction or pand,pandn,por logic
> (also known as V128BImode :-).

That assumes that selecting V4SImode is the same as selecting from
V2DImode, that is, vcond works on the "bitlevel" and V4SI and V2DI
"naturally overlap" in a vector register (and they can be aliased).
I'm not sure that's something we generally assume in GCC.  IIRC
with RISC-V for example that would not be the case since they
represent vectors in a way to have V4{QI,HI,SI} with zero-extended
lanes.  I'm not sure how RISC-V represents this to the middle-end
and how the middle-end could distinguish those cases.

(looks like none of it is on trunk?)
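
To make the "bitlevel" assumption concrete, here is a small sketch (not GCC code) that models a 128-bit register as a Python integer and applies the pand,pandn,por select at several lane widths. The helper names (split_lanes, join_lanes, bit_select) are made up for illustration, and the model assumes the packed layout where narrower lanes are simply adjacent bit ranges of the wider ones. Under that assumption every lane width gives the same result; it says nothing about targets that store lanes differently.

```python
# Model of a bitwise vector select on a packed 128-bit register.
# Assumption: lanes are adjacent bit ranges, lowest lane in the low bits.

def split_lanes(value, lane_bits):
    """Split a 128-bit integer into lanes of lane_bits, lowest lane first."""
    lane_mask = (1 << lane_bits) - 1
    return [(value >> (i * lane_bits)) & lane_mask
            for i in range(128 // lane_bits)]

def join_lanes(lanes, lane_bits):
    """Inverse of split_lanes: pack lanes back into one 128-bit integer."""
    value = 0
    for i, lane in enumerate(lanes):
        value |= lane << (i * lane_bits)
    return value

def bit_select(mask, a, b, lane_bits):
    """Per-lane (mask & a) | (~mask & b), i.e. the pand/pandn/por sequence."""
    lane_mask = (1 << lane_bits) - 1
    out = [(m & x) | (~m & lane_mask & y)
           for m, x, y in zip(split_lanes(mask, lane_bits),
                              split_lanes(a, lane_bits),
                              split_lanes(b, lane_bits))]
    return join_lanes(out, lane_bits)

a = 0x00112233445566778899AABBCCDDEEFF
b = 0xFFEEDDCCBBAA99887766554433221100
# V2DI-style mask: high 64-bit lane all-ones (take a), low lane zero (take b).
mask = 0xFFFFFFFFFFFFFFFF0000000000000000

# Lane widths standing in for V1TI, V2DI, V4SI, V8HI and V16QI.
results = {bits: bit_select(mask, a, b, bits) for bits in (128, 64, 32, 16, 8)}
```

All five entries of results come out identical here, which is the property the fallback would rely on. If a target kept each narrow element in its own wider, zero-extended container, reinterpreting the mask at a different lane width would no longer line up with the data, and this equivalence would not follow.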