[Bug target/121591] x86 optimization: isless doesn't reuse EFLAGS result of other floating point comparisons with same operands

Explorer09 at gmail dot com via Gcc-bugs Thu, 21 Aug 2025 10:22:05 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121591


--- Comment #3 from Kang-Che Sung <Explorer09 at gmail dot com> ---
(In reply to ak from comment #2)
> Many x86 targets have limits on how many branches their branch predictor can
> track per 16 byte line so what you are asking for is likely slower. On
> others there are also similar limits what the decoded icache can cache per
> 32 bytes.

Is that really a reason not to optimize this?

My testing shows that when isless(a, b) and isgreater(a, b) are used
together, the UCOMISD instructions are not merged.

This happens even when the branch instructions are less than 16 bytes away
from their float compare instructions. As a result, a simple float compare
function such as this one compiles to more code than necessary:

```c
// It is expected that this can be used as a compare function in qsort()
int float_compare2(const double *a, const double *b) {
    if (*a > *b)
        return 1;
    if (*a < *b)
        return -1;
    return 0;
}
```
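For reference, here is a minimal sketch of the same comparison written with the isless()/isgreater() macros from <math.h>, which is the combination where I see the UCOMISD not being merged. The names float_compare_macros and float_compare_qsort are hypothetical and only for illustration; the second one is an adapter I am assuming you would need, since qsort() wants a comparator taking const void * arguments.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical name. Same logic as float_compare2, but using the quiet
 * comparison macros, which do not raise FE_INVALID for quiet NaN operands. */
int float_compare_macros(const double *a, const double *b) {
    if (isgreater(*a, *b))
        return 1;
    if (isless(*a, *b))
        return -1;
    return 0; /* equal, or at least one operand is NaN */
}

/* Hypothetical adapter: qsort() expects const void * parameters. */
int float_compare_qsort(const void *a, const void *b) {
    return float_compare_macros((const double *)a, (const double *)b);
}
```

Both comparisons use the same two operands, so in principle a single compare could feed both branches. Note that returning 0 for NaN does not give a consistent ordering, so this is only suitable for qsort() on arrays without NaN values.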

I tested this even with the '-Oz' option, which optimizes for size and so
should disregard performance considerations like these.
