[Bug c++/91645] New: Missed optimization with sqrt(x*x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91645

            Bug ID: 91645
           Summary: Missed optimization with sqrt(x*x)
           Product: gcc
           Version: 9.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lisyarus at gmail dot com
  Target Milestone: ---

Based on a discussion on stackoverflow:
https://stackoverflow.com/questions/57673825/how-to-force-gcc-to-assume-that-a-floating-point-expression-is-non-negative

With gcc-trunk and '-std=c++17 -O3', the function

    float test (float x)
    {
        return std::sqrt(x*x);
    }

produces the following assembly:

    test(float):
            mulss   xmm0, xmm0
            pxor    xmm2, xmm2
            ucomiss xmm2, xmm0
            movaps  xmm1, xmm0
            sqrtss  xmm1, xmm1
            ja      .L8
            movaps  xmm0, xmm1
            ret
    .L8:
            sub     rsp, 24
            movss   DWORD PTR [rsp+12], xmm1
            call    sqrtf
            movss   xmm1, DWORD PTR [rsp+12]
            add     rsp, 24
            movaps  xmm0, xmm1
            ret

As far as I can tell, it falls back to calling sqrtf unless the argument to
sqrt is >= 0, so that negative/NaN arguments are checked and the appropriate
errno is set. This behavior is reasonable and expected. Adding
'-fno-math-errno', '-ffast-math', or '-ffinite-math-only' removes all the
clutter and compiles the same code into the neat

    test(float):
            mulss   xmm0, xmm0
            sqrtss  xmm0, xmm0
            ret

Now, the problem is that GCC doesn't seem to optimize away the call to sqrtf
based on the surrounding code. As an example, it would be neat to have this
(or something similar) compiled into the same mulss-sqrtss-ret:

    float test (float x)
    {
        float y = x*x;
        if (y >= 0.f)
            return std::sqrt(y);
        __builtin_unreachable();
    }

If I understand it correctly, 'y >= 0.f' excludes 'y' being NaN and 'y' being
negative (though the latter is already excluded by 'y = x*x'), so there is no
need to check whether the argument to 'std::sqrt' is invalid, and the compiler
could just emit 'sqrtss' and return.

Furthermore, adding e.g. '#pragma GCC optimize ("no-math-errno")' before the
'test' function doesn't lead to this optimization either, though I'm not sure
whether this is expected to work and/or requires a separate bugtracker issue.
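The errno behavior that the slow path above preserves can be observed directly. The following is a minimal sketch, not code from the report: `SqrtOutcome` and `checked_sqrt` are hypothetical names introduced for illustration, and it assumes a C library that reports math errors through errno (as glibc does by default) and a build without '-fno-math-errno' or '-ffast-math'.

```cpp
#include <cerrno>
#include <cmath>

// Hypothetical helper type: the value returned by std::sqrt together with
// the errno value observed right after the call.
struct SqrtOutcome {
    float result;
    int error;
};

SqrtOutcome checked_sqrt(float value)
{
    // The volatile copy keeps the compiler from folding the call at compile time.
    volatile float input = value;
    errno = 0;
    float result = std::sqrt(input);
    // On implementations where (math_errhandling & MATH_ERRNO) is nonzero,
    // a negative input leaves EDOM in errno; otherwise errno stays 0.
    return {result, errno};
}
```

Calling `checked_sqrt(-1.0f)` yields a NaN result and, on an errno-reporting implementation, `error == EDOM`. That side effect is why the compiler cannot replace the library call with a bare 'sqrtss' unless it can prove the argument is non-negative or errno tracking is disabled.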
[Bug tree-optimization/91645] Missed optimization with sqrt(x*x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91645

--- Comment #2 from Nikita Lisitsa ---
If by 'isless(y, 0.0)' you mean 'y < 0.f', then no, it doesn't change
anything; it produces the same 'ucomiss ... call sqrtf' boilerplate. Have I
perhaps misunderstood you?

By the way, what about '#pragma GCC optimize ("no-math-errno")'? Is it
supposed to work? Should I file another bug on that matter?
[Bug tree-optimization/91645] Missed optimization with sqrt(x*x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91645

--- Comment #5 from Nikita Lisitsa ---
Oh, thank you a lot! Indeed, this version compiles to just mulss & sqrtss:

    float test (float x)
    {
        float y = x*x;
        if (std::isless(y, 0.f))
            __builtin_unreachable();
        return std::sqrt(y);
    }

Yet, I still don't quite understand what is happening here. Is it because the
standard '<' operator is still subject to FE_* ?

Concerning pragmas, the code

    #pragma GCC optimize ("no-math-errno")
    float test (float x)
    {
        return std::sqrt(x*x);
    }

produces the following assembly:

    std::sqrt(float):
            pxor    xmm2, xmm2
            movaps  xmm1, xmm0
            ucomiss xmm2, xmm0
            sqrtss  xmm1, xmm1
            ja      .L8
            movaps  xmm0, xmm1
            ret
    .L8:
            sub     rsp, 24
            movss   DWORD PTR [rsp+12], xmm1
            call    sqrtf
            movss   xmm1, DWORD PTR [rsp+12]
            add     rsp, 24
            movaps  xmm0, xmm1
            ret
    test(float):
            mulss   xmm0, xmm0
            jmp     std::sqrt(float)

So, the only notable difference is that now 'std::sqrt(float)' is not inlined,
but is tail-called instead. Thus, the pragma seems not to work?
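One way to probe the question about '<' and the FE_* flags is to compare the two forms on a quiet NaN and inspect the floating-point exception flags afterwards. The sketch below is illustrative only: `CompareOutcome`, `quiet_less` and `builtin_less` are hypothetical names, not part of the report. The C and C++ standards specify that `isless` does not raise FE_INVALID for quiet NaN operands; whether the built-in '<' raises it in practice depends on the compiler and target, so the sketch reports that flag without assuming its value.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper type: the comparison result plus whether FE_INVALID
// was set after the comparison.
struct CompareOutcome {
    bool less;
    bool raised_invalid;
};

// Comparison through std::isless, which is specified to be "quiet":
// it must not raise FE_INVALID when an operand is a quiet NaN.
CompareOutcome quiet_less(float a, float b)
{
    volatile float lhs = a, rhs = b;  // volatile prevents constant folding
    std::feclearexcept(FE_ALL_EXCEPT);
    bool less = std::isless(lhs, rhs);
    return {less, std::fetestexcept(FE_INVALID) != 0};
}

// Comparison through the built-in operator. IEEE 754 treats this as a
// signaling comparison; the observed flag is implementation-dependent.
CompareOutcome builtin_less(float a, float b)
{
    volatile float lhs = a, rhs = b;
    std::feclearexcept(FE_ALL_EXCEPT);
    bool less = lhs < rhs;
    return {less, std::fetestexcept(FE_INVALID) != 0};
}
```

Both functions return `less == false` when either operand is NaN. The difference, if any shows up on a given toolchain, is confined to `raised_invalid`, which would be consistent with the compiler treating the two forms differently when deciding whether the NaN case can be assumed away.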
[Bug c++/89062] class template argument deduction failure with parentheses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89062

Nikita Lisitsa changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lisyarus at gmail dot com

--- Comment #5 from Nikita Lisitsa ---
Any updates on this? Can confirm this on all gcc versions from 7.1 to 10.2
(see https://godbolt.org/z/4qnfea). Interestingly, for a variadic-template
constructor only the first argument triggers the error, i.e.

    template <typename T>
    struct A
    {
        A(T) {}
    };

    template <typename T>
    A(T) -> A<T>;

    template struct B
    {
        template <typename ... Args>
        B(Args const & ...){}
    };

    void test()
    {
        B b1(A{0}, A{0}, A{0}); // error
        B b2(A{0}, A{0}, A{0}); // ok
    }