https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119418

            Bug ID: 119418
           Summary: Minimum of 3 float values produces worse code-gen with
                    ternary than repeated minimum of 2 function calls
           Product: gcc
           Version: 14.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mwinkler at blizzard dot com
  Target Milestone: ---

Created attachment 60848
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60848&action=edit
Contains 3 different implementations of min of 3 float values

Godbolt link for those who don't want to build locally with the attached source
file: https://godbolt.org/z/hnYba38r7.

Compile the attached source file with `-msse2` or `-mavx2` and `-O3` or `-O2`.

Notice that the ternary implementation of
```
template<class T>
constexpr const T& (min)(const T& a, const T& b, const T& c)
{
    return b < a ? (c < b ? c : b) : (c < a ? c : a);
}
```
is not reduced to two branchless `minss` instructions; the generated code
contains a branch.

However, doing
```
min(min(a, b), c);
```
where `min` is implemented as
```
template<class T>
constexpr const T& (min)(const T& a, const T& b)
{
    return b < a ? b : a;
}
```
results in a branchless min of 3 using two `minss` instructions.
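
For reference, the two 3-way forms select the same operand for every input,
NaNs and signed zeros included: each arm of the single-expression version
corresponds to one outcome of the inner 2-way call. The following is a minimal
sketch that checks this over a few ordinary and special values. The names
`min2`, `min3_ternary`, `min3_nested`, `same_result` and `forms_agree` are
hypothetical and are not taken from the attachment.
```cpp
// Sketch only: all names here are hypothetical, chosen for illustration.
#include <cmath>
#include <limits>

template<class T>
constexpr const T& min2(const T& a, const T& b)
{
    return b < a ? b : a;
}

// Single-expression 3-way form from the report.
template<class T>
constexpr const T& min3_ternary(const T& a, const T& b, const T& c)
{
    return b < a ? (c < b ? c : b) : (c < a ? c : a);
}

// Nested 2-way form from the report.
template<class T>
constexpr const T& min3_nested(const T& a, const T& b, const T& c)
{
    return min2(min2(a, b), c);
}

// Returns true when both forms pick the same value for (a, b, c).
// Two NaN results count as matching; zeros must also match in sign.
inline bool same_result(float a, float b, float c)
{
    const float x = min3_ternary(a, b, c);
    const float y = min3_nested(a, b, c);
    if (std::isnan(x) || std::isnan(y))
        return std::isnan(x) && std::isnan(y);
    return x == y && std::signbit(x) == std::signbit(y);
}

// Checks every ordered triple drawn from a small set of values.
inline bool forms_agree()
{
    const float vals[] = {
        -1.0f, -0.0f, 0.0f, 2.5f,
        std::numeric_limits<float>::infinity(),
        std::numeric_limits<float>::quiet_NaN()
    };
    for (float a : vals)
        for (float b : vals)
            for (float c : vals)
                if (!same_result(a, b, c))
                    return false;
    return true;
}
```
This only samples a handful of inputs, so it illustrates the equivalence
rather than proving it.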

It is also curious that `min` implemented as
```
template<class T>
constexpr const T& (min)(const T& a, const T& b)
{
    if (b < a)
        return b;
    return a;
}
```
results in different instruction scheduling than the ternary-based 2-way `min`
shown above.
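
To compare the two 2-way variants side by side, a minimal sketch along these
lines may help. It assumes the comparison is made through the nested 3-way
call; the names `min2_ternary`, `min2_if`, `min3_via_ternary` and
`min3_via_if` are hypothetical and are not taken from the attachment.
```cpp
// Sketch only: all names here are hypothetical, chosen for illustration.
template<class T>
constexpr const T& min2_ternary(const T& a, const T& b)
{
    return b < a ? b : a;
}

template<class T>
constexpr const T& min2_if(const T& a, const T& b)
{
    if (b < a)
        return b;
    return a;
}

// Non-template entry points with unmangled names, so each variant is easy
// to locate in an assembly listing (for example from `g++ -O2 -msse2 -S`).
extern "C" float min3_via_ternary(float a, float b, float c)
{
    return min2_ternary(min2_ternary(a, b), c);
}

extern "C" float min3_via_if(float a, float b, float c)
{
    return min2_if(min2_if(a, b), c);
}
```
Both wrappers compute the same value; only the generated instruction order is
expected to differ.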