[Bug target/120141] [RVV] Noop are not removed

wojciech_mula at poczta dot onet.pl via Gcc-bugs Thu, 21 Aug 2025 16:23:33 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120141


--- Comment #3 from Wojciech Mula <wojciech_mula at poczta dot onet.pl> ---
Thank you for looking at this issue! I'm not going to argue, but let me show the
perspective of a programmer who wrote a whole lot of x86 intrinsics and now uses
RVV ones.

> The consensus was this is just how the intrinsics work.

This is not true for intrinsics from the x86 world (I'm not sure about AltiVec
ones). And this is why I asked why RVV ones do not behave similarly to x86
functions. Consider this example:

---test.c---
#include <tuple>
#include <iostream>
#include <immintrin.h>

__m128i shift_by_zero(__m128i x) {
    return _mm_srli_epi32(x, 0);
}

__m128i add_zero(__m128i x) {
    return _mm_add_epi32(x, _mm_setzero_si128());
}

__m128i mul_zero(__m128i x) {
    return _mm_mul_epi32(x, _mm_setzero_si128());
}
---eof---

It is compiled by `gcc -std=c++20 -march=tigerlake -O3` into
(https://godbolt.org/z/PxaGcexn9):

"shift_by_zero(long long __vector(2))":
        ret
"add_zero(long long __vector(2))":
        ret
"mul_zero(long long __vector(2))":
        vpxor   xmm0, xmm0, xmm0
        ret

> The intrinsic interface is working as designed.  If you want to avoid nop 
> codes, then don't pass arguments that result in nop operations to the 
> intrinsics interfaces.

The problem is that you don't always write intrinsics directly. In C++ programs
we use templates. For example, `_mm_srli_epi32` mentioned above accepts only a
compile-time constant, so in C++ you'd have a template like:

template <size_t K>
__m128i shift_right_epi32(__m128i x) {
    return _mm_srli_epi32(x, K);
}

If we cannot assume that a compiler will simplify `shift_right_epi32<0>`, then
the implementation of the template must be aware of that special case:

template <size_t K>
__m128i shift_right_epi32(__m128i x) {
    if constexpr (K == 0) {
        return x;
    } else if constexpr (K >= 32) {
        return _mm_setzero_si128();
    } else {
        return _mm_srli_epi32(x, K);
    }
}

It's obviously doable, but it puts more burden on the programmer.
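
A minimal sketch of how the special-cased wrapper could be exercised, assuming
an x86 target with SSE2 and a C++17 compiler. The return type is written as
`__m128i`, which is what `_mm_srli_epi32` returns; `same_lanes` and
`check_shift_wrapper` are hypothetical helper names used only for illustration:

```cpp
#include <cstddef>
#include <immintrin.h>

// Wrapper with explicit handling of the no-op (K == 0) and
// all-bits-shifted-out (K >= 32) cases.
template <std::size_t K>
__m128i shift_right_epi32(__m128i x) {
    if constexpr (K == 0) {
        return x;                     // shifting by zero is a no-op
    } else if constexpr (K >= 32) {
        return _mm_setzero_si128();   // every bit is shifted out
    } else {
        return _mm_srli_epi32(x, K);  // regular logical right shift
    }
}

// Hypothetical helper: true when all four 32-bit lanes of a and b match.
inline bool same_lanes(__m128i a, __m128i b) {
    return _mm_movemask_epi8(_mm_cmpeq_epi32(a, b)) == 0xFFFF;
}

// Hypothetical check covering the three branches of the wrapper.
inline bool check_shift_wrapper() {
    const __m128i x = _mm_set_epi32(-1, 0x100, 0x10, 1);
    return same_lanes(shift_right_epi32<0>(x), x)
        && same_lanes(shift_right_epi32<32>(x), _mm_setzero_si128())
        && same_lanes(shift_right_epi32<4>(x),
                      _mm_set_epi32(0x0FFFFFFF, 0x10, 0x1, 0));
}
```

Every wrapper of this kind would need the same three-way split, which is the
burden described above.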
