https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #5 from Steven Munroe <munroesj at gcc dot gnu.org> ---
I expected compiling for -mcpu=power9 to do a better job generating splats for small constants.

Given the new instructions like VSX Vector Splat Immediate Byte (xxspltib) and Vector Extend Sign Byte To Word/Doubleword, the compiler should have an easier time generating vec_splats(). It would seem that Vector Splat Immediate Byte would be the perfect way to generate a constant shift quadword left/right.

But that is not what I am seeing. First note there is no direct intrinsic for xxspltib. It is sometimes generated for vec_splat_u8(0-15) and vec_splats((vector unsigned char) x). But sometimes it gets weird. For example:

vui128_t
test_slqi_char_18_V3 (vui128_t vra)
{
  vui8_t result;
  vui8_t tmp = vec_splats((unsigned char)18);
  result = vec_vslo ((vui8_t) vra, tmp);
  return (vui128_t) vec_vsl (result, tmp);
}

Which I would expect to generate:

        xxspltib 34,18
        vslo 2,2,0
        vsl 2,2,0

But generates:

        vspltisb 0,9
        vadduwm 0,0,0
        vslo 2,2,0
        vsl 2,2,0

It recognizes that it can't generate 18 with vspltisb and uses the 18 = 9 * 2 pattern. It also erroneously generates vector add word. It seems like GCC is reusing the old pattern and ignoring the new instructions.

This is weird because:

vui8_t
test_splat6_char_18 ()
{
  vui8_t tmp = vec_splat_u8(9);
  return vec_add (tmp, tmp);
}

Generates:

        xxspltib 34,9
        vaddubm 2,2,2

But:

vui8_t
test_splat6_char_31 ()
{
  // 31 = (16+15) = (15 - (-16))
  vui8_t v16 = vec_splat_u8(-16);
  vui8_t tmp = vec_splat_u8(15);
  return vec_sub (tmp, v16);
}

Generates:

        xxspltib 34,31

Which seems like a miracle. Is this constant propagation?
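
For reference, the lane arithmetic behind the 18 = 9 * 2 and 31 = 15 - (-16) patterns can be checked with a minimal scalar sketch in portable C. This models single byte and word lanes only; it does not use the AltiVec/VSX intrinsics, and every function name in it is hypothetical. It assumes the documented 5-bit signed immediate range of vspltisb (-16..15) and modulo lane arithmetic. It also shows that adding word lanes that each hold four copies of 9 gives the same bytes as a byte add, since no lane carries. That matches the folded xxspltib 34,31 result arithmetically, but it says nothing about what GCC actually does internally.

```c
#include <assert.h>
#include <stdint.h>

/* vspltisb takes a 5-bit signed immediate, so only -16..15 fit directly. */
int fits_vspltisb(int value)
{
    return value >= -16 && value <= 15;
}

/* One byte lane of a modulo subtract: the result wraps to 8 bits. */
uint8_t byte_sub_modulo(uint8_t a, uint8_t b)
{
    return (uint8_t)(a - b);
}

/* One 32-bit word lane holding four copies of the same byte. */
uint32_t splat_byte_in_word(uint8_t b)
{
    return 0x01010101u * b;
}

/* One word lane of a modulo add: wraps at 2^32, carries cross byte lanes. */
uint32_t word_add_modulo(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Hypothetical self-check tying the helpers back to the constants above. */
void check_splat_patterns(void)
{
    /* 18 is out of range for vspltisb, 9 is in range. */
    assert(!fits_vspltisb(18));
    assert(fits_vspltisb(9));

    /* 15 - (-16) wraps to 31 in an unsigned byte lane. */
    assert(byte_sub_modulo(15, (uint8_t)-16) == 31);

    /* 9 + 9 never carries out of a byte, so a word add matches a byte add. */
    assert(word_add_modulo(splat_byte_in_word(9), splat_byte_in_word(9))
           == splat_byte_in_word(18));
}
```

Calling check_splat_patterns() from a test driver passes silently if the arithmetic holds. The word-add equivalence only holds while every byte sum stays below 256.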

But:

vui8_t
test_slqi_char_31_V0 (vui8_t vra)
{
  vui8_t result;
  // 31 = (16+15) = (15 - (-16))
  vui8_t v16 = vec_splat_u8(-16);
  vui8_t tmp = vec_splat_u8(15);
  tmp = vec_sub (tmp, v16);
  result = vec_slo (vra, tmp);
  return vec_sll (result, tmp);
}

Generates:

        addis 9,2,.LC0@toc@ha
        addi 9,9,.LC0@toc@l
        lxv 32,0(9)
        vslo 2,2,0
        vsl 2,2,0

OK, I think I can fix this with:

vui8_t
test_slqi_char_31_V3 (vui8_t vra)
{
  vui8_t result;
  vui8_t tmp = vec_splats((unsigned char)31);
  result = vec_slo (vra, tmp);
  return vec_sll (result, tmp);
}

But no, it still generated:

        addis 9,2,.LC0@toc@ha
        addi 9,9,.LC0@toc@l
        lxv 32,0(9)
        vslo 2,2,0
        vsl 2,2,0

Which is all very confusing.
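
To make clear why a single splatted byte serves as the shift count for both instructions, here is a minimal scalar model of the vslo + vsl pair in portable C. It is a sketch under stated assumptions, not the intrinsics themselves: the type quad_t and the functions model_vslo, model_vsl and model_slq are hypothetical names, byte 0 is taken as the most significant byte, and the shift counts are assumed to come from the low 7 bits of the splatted byte (bits 3..6 as the octet count for vslo, bits 0..2 as the bit count for vsl).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit value; b[0] is the most significant byte. */
typedef struct { uint8_t b[16]; } quad_t;

/* Model of vslo: shift left by whole octets, count = (shift_byte >> 3) & 0xF. */
quad_t model_vslo(quad_t v, uint8_t shift_byte)
{
    unsigned octets = (shift_byte >> 3) & 0xF;
    quad_t r;
    memset(&r, 0, sizeof r);
    for (unsigned i = 0; i + octets < 16; i++)
        r.b[i] = v.b[i + octets];
    return r;
}

/* Model of vsl: shift left by 0..7 bits, count = shift_byte & 7. */
quad_t model_vsl(quad_t v, uint8_t shift_byte)
{
    unsigned bits = shift_byte & 7;
    quad_t r;
    for (unsigned i = 0; i < 16; i++) {
        /* Bits shifted in come from the next less significant byte. */
        unsigned next = (i + 1 < 16) ? v.b[i + 1] : 0;
        r.b[i] = (uint8_t)((v.b[i] << bits) | (next >> (8 - bits)));
    }
    return r;
}

/* Quadword shift left by n bits (0..127) using the same count for both steps. */
quad_t model_slq(quad_t v, uint8_t n)
{
    assert(n < 128);
    return model_vsl(model_vslo(v, n), n);
}
```

With the value 1 as input, model_slq(v, 18) moves the set bit two octets and two bits left (b[13] == 0x04), and model_slq(v, 31) moves it three octets and seven bits left (b[12] == 0x80). This is why a byte splat of 18 or 31 is all the shift sequences above need.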