https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115921

Jovan Vukic <jovan.vu...@rt-rk.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jovan.vu...@rt-rk.com

--- Comment #2 from Jovan Vukic <jovan.vu...@rt-rk.com> ---
I have looked into this issue, specifically for the RISC-V target, as I find
the difference between the 32-bit and 64-bit results intriguing.

Here are my conclusions:
1. Neither the RISC-V 32-bit nor the 64-bit target performs the optimization
in question in releases 14.1.0 and 14.2.0.
2. GCC trunk does perform the optimization, for both RISC-V 32-bit and 64-bit,
and for AND, OR, and XOR operations alike.
3. The optimization is carried out by the pattern <optab>_shift_reverse<X:mode>
on line 2929 of riscv.md; a sketch of the rewrite it performs follows below.
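
My reading of that pattern is that it rewrites the canonical combined form
(x << n) & mask back into (x & (mask >> n)) << n, so that the small mask fits
into the signed 12-bit immediate of an andi instruction. Here is a minimal
check of the underlying identity (my own sketch, not code from riscv.md):

#include <assert.h>
#include <stdint.h>

int main(void) {
    uint64_t x = 0x123456789abcdef0;
    /* 0x3e000 == 0x3e << 12: masking after the shift gives the same
       result as masking before it, but only 0x3e fits into a signed
       12-bit immediate, so the and-first form can use andi. */
    assert(((x << 12) & 0x3e000) == ((x & 0x3e) << 12));
    return 0;
}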


The optimization itself works, as shown by the example C code below, for which
GCC generates the optimized assembly on both the 32-bit and 64-bit targets:

typedef unsigned long target_wide_uint_t;

target_wide_uint_t test_ashift_and(target_wide_uint_t x) {
    return (x & 0x3e) << 12;
}


test_ashift_and:
        andi    a0,a0,62
        slli    a0,a0,12
        ret

However, the problem arises with the test case from this report (repeated
below): it is optimized for 32-bit but not for 64-bit. Here is the result for
the 64-bit target (https://godbolt.org/z/6a7x9zTjs):

typedef unsigned long target_wide_uint_t;

target_wide_uint_t test_ashift_and(target_wide_uint_t x) {
    return (x & 0x3f) << 12;
}


test_ashift_and:
        li      a5,258048
        slli    a0,a0,12
        and     a0,a0,a5
        ret

In this example, we have 0x3f == 2^6 - 1. If we examine the pattern on line
2929 of riscv.md, there is a special condition:

    (!TARGET_64BIT
     || (exact_log2((INTVAL(operands[3]) >> INTVAL(operands[2])) + 1) == -1))

This condition means that, on the 64-bit target, the optimization is skipped
whenever the value 0x3f (or whatever mask takes its place) is of the form
2^x - 1. The condition, introduced in commit
[https://github.com/gcc-mirror/gcc/commit/236116068151bbc72aaaf53d0f223fe06f7e3bac],
seems to be the root of the issue for RISC-V 64-bit.
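
The effect of the condition on the two masks above is easy to check. Below is
my own sketch, assuming operands[3] holds the already-shifted mask and
operands[2] the shift amount, as the pattern's condition suggests; exact_log2
here is a stand-in for GCC's helper of the same name, which returns -1 when
its argument is not a power of two:

#include <stdio.h>
#include <stdint.h>

/* Stand-in for GCC's exact_log2: log2 of v if v is a power of two,
   otherwise -1. */
static int exact_log2(uint64_t v) {
    return (v != 0 && (v & (v - 1)) == 0) ? __builtin_ctzll(v) : -1;
}

int main(void) {
    /* 0x3e case: operands[3] == 0x3e000, operands[2] == 12.
       (0x3e000 >> 12) + 1 == 0x3f, not a power of two, so exact_log2
       returns -1 and the pattern is allowed to fire. */
    printf("%d\n", exact_log2((0x3e000 >> 12) + 1));  /* prints -1 */

    /* 0x3f case: (0x3f000 >> 12) + 1 == 0x40 == 2^6, so exact_log2
       returns 6 and the condition blocks the split on TARGET_64BIT. */
    printf("%d\n", exact_log2((0x3f000 >> 12) + 1));  /* prints 6 */
    return 0;
}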


It would be helpful if someone could clarify the rationale behind the
exact_log2 condition: it disables the optimization only for masks of the form
2^x - 1, and only for the 64-bit target, and its purpose is not immediately
clear to me.
