[Bug c++/96065] New: Move elision of returned automatic variable doesn't happen when the variable is enclosed in a block
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96065

Bug ID: 96065
Summary: Move elision of returned automatic variable doesn't happen when the variable is enclosed in a block
Product: gcc
Version: 10.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following code (at Godbolt's: https://gcc.godbolt.org/z/CyqPF9 ):

```
struct A
{
    A();
    A(A&&);
    A(A const&);
    A& operator=(A&&);
    A& operator=(A const&);
};

A getA()
{
    {
        A a;
        return a;
    }
}

int main()
{
    const A a=getA();
}
```

Here we get an A::A() call followed by an A::A(A&&) call. If we remove the inner braces in getA(), move elision happens, so only A::A() is called. I'd expect move elision to happen without removing the braces as well. The missing move elision also affects the case where the block belongs to an if statement. The same pattern occurs with copy elision if we comment out the move constructor. For comparison, MSVC 19.24 (with /O2) and Clang 10.0 (by default) both elide the move.
[Bug sanitizer/86022] New: TCB size calculated in ThreadDescriptorSize() is wrong for glibc-2.14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86022

Bug ID: 86022
Summary: TCB size calculated in ThreadDescriptorSize() is wrong for glibc-2.14
Product: gcc
Version: 8.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: sanitizer
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at gcc dot gnu.org
Target Milestone: ---

In ThreadDescriptorSize(), I currently see:

```
else if (minor <= 13)
  val = FIRST_32_SECOND_64(1168, 2304);
else
  val = FIRST_32_SECOND_64(1216, 2304);
```

This leads to an assertion failure on glibc-2.14, with the same message as in bug 60038. The actual values for glibc 2.14 are the same as for 2.13: 1168 for i386 and 2304 for x86_64. I checked this by appending the following to glibc-2.14.1/nptl/descr.h:

```
typedef int TCB_SIZE_2304[sizeof(struct pthread)==2304 ? -1 : 1];
typedef int TCB_SIZE_1168[sizeof(struct pthread)==1168 ? -1 : 1];
```

and getting the corresponding error when compiling glibc on a 32-bit and on a 64-bit x86 Kubuntu machine. I suppose the fix should be to change "minor <= 13" to "minor <= 14".
[Bug c++/91990] New: Too slow compilation of recursively-nested template class with two instances of its template parent
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91990

Bug ID: 91990
Summary: Too slow compilation of recursively-nested template class with two instances of its template parent
Product: gcc
Version: 9.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following code:

```
template<int N> class A
{
    typedef A<N-1> B;
    B x, y;
};
template<> class A<0>
{
    char m;
};
int main()
{
    A<LEVEL> a;  // LEVEL defined on the command line, e.g. -DLEVEL=20
}
```

Depending on the value of `LEVEL`, g++ compilation takes exponential time. But if you replace `x, y` with `x[2]`, compilation completes in negligible time regardless of `LEVEL`. I've tested this on g++ 6.5.0, 8.3.0 and 9.1.0, and the slow compilation reproduces in all these versions. For comparison, clang++ 6.0 compiles both versions (with `x, y` and with `x[2]`) in negligible time regardless of `LEVEL` (tested up to `LEVEL=906`; on 907 it crashes).
[Bug libstdc++/83566] New: cyl_bessel_j returns wrong result for x>1000 for high orders.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83566

Bug ID: 83566
Summary: cyl_bessel_j returns wrong result for x>1000 for high orders.
Product: gcc
Version: 7.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

The following test program prints the result of C++17 std::cyl_bessel_j(100,1000.0001) and the corresponding result given by GSL:

```
#include <cmath>
#include <iostream>
#include <limits>
#include <gsl/gsl_sf_bessel.h>

int main()
{
    const double volatile n = 100;
    const double volatile x = 1000.0001;
    std::cout.precision(std::numeric_limits<double>::digits10);
    const auto valueCXX17 = std::cyl_bessel_j(n,x);
    const auto valueGSL = gsl_sf_bessel_Jn(n,x);
    std::cout << "C++17: " << valueCXX17 << "\n"
              << "GSL : " << valueGSL << "\n";
}
```

I get the following output:

```
C++17: 0.433818396252946
GSL : 0.0116783669817645
```

Comparison with Boost.Math and Wolfram Mathematica shows that GSL is right, while libstdc++ is wrong. For x<=1000 there's no such problem. As n decreases, the imprecision gradually gets smaller.
[Bug libstdc++/83566] cyl_bessel_j returns wrong result for x>1000 for high orders.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83566 --- Comment #1 from Ruslan --- > As n decreases, the imprecision gradually gets smaller. To avoid confusion: this statement is for fixed x>1000.
[Bug c++/90971] New: Suboptimal diagnostic for is_same_v requirement for std::array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90971

Bug ID: 90971
Summary: Suboptimal diagnostic for is_same_v requirement for std::array
Product: gcc
Version: 9.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following code:

```
#include <array>

int main()
{
    std::array arr={1.32,5,45.3463,4.674,-94.463,34.634};
}
```

GCC 9.1 (9.1.0-2ubuntu2~18.04) gives the following diagnostic with the -std=c++17 option:

```
test.cpp: In function ‘int main()’:
test.cpp:5:53: error: class template argument deduction failed:
    5 |     std::array arr={1.32,5,45.3463,4.674,-94.463,34.634};
      |                                                     ^
test.cpp:5:53: error: no matching function for call to ‘array(double, int, double, double, double, double)’
In file included from test.cpp:1:
/usr/include/c++/9/array:244:5: note: candidate: ‘template<class _Tp, class ... _Up> std::array(_Tp, _Up ...)-> std::array<typename std::enable_if<(is_same_v<_Tp, _Up> && ...), _Tp>::type, (1 + sizeof... (_Up))>’
  244 |     array(_Tp, _Up...)
      |     ^
/usr/include/c++/9/array:244:5: note: template argument deduction/substitution failed:
/usr/include/c++/9/array: In substitution of ‘template<class _Tp, class ... _Up> std::array(_Tp, _Up ...)-> std::array<typename std::enable_if<(is_same_v<_Tp, _Up> && ...), _Tp>::type, (1 + sizeof... (_Up))> [with _Tp = double; _Up = {int, double, double, double, double}]’:
test.cpp:5:53:   required from here
/usr/include/c++/9/array:244:5: error: no type named ‘type’ in ‘struct std::enable_if<false, double>’
```

This error message "error: no type named ‘type’ in ‘struct std::enable_if<false, double>’" is not too useful. Yes, it is technically correct, but compare it to what clang 6.0 (6.0.0-1ubuntu2) prints instead:

```
test.cpp:5:13: error: no viable constructor or deduction guide for deduction of template arguments of 'array'
    std::array arr={1.32,5,45.3463,4.674,-94.463,34.634};
                ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/array:244:5: note: candidate template ignored: requirement 'is_same_v<double, int>' was not satisfied [with _Tp = double, _Up = <int, double, double, double, double>]
    array(_Tp, _Up...)
    ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/array:94:12: note: candidate function template not viable: requires 0 arguments, but 6 were provided
    struct array
           ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/array:94:12: note: candidate function template not viable: requires 1 argument, but 6 were provided
1 error generated.
```

Note this: "requirement 'is_same_v<double, int>' was not satisfied". It's much better than what GCC says.
[Bug libstdc++/86409] New: std::stod fails for denormal numbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86409

Bug ID: 86409
Summary: std::stod fails for denormal numbers
Product: gcc
Version: 8.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following test program:

```
// BEGIN
#include <iostream>
#include <sstream>
#include <string>

int main()
{
    const char str[]="3.23534634e-320";
    try
    {
        const auto value=std::stod(str);
        std::cout << "stod returned " << value << '\n';
    }
    catch(std::exception&)
    {
        std::cerr << "stod failed\n";
    }

    std::istringstream ss(str);
    double value;
    ss >> value;
    if(ss)
        std::cout << "istringstream gave " << value << '\n';
    else
        std::cerr << "istringstream failed\n";
}
// END
```

Here std::stod throws a std::out_of_range exception, although the number can be represented as a denormal in double. std::istringstream works as expected, reading the denormal into the variable.
[Bug libstdc++/86409] std::stod fails for denormal numbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86409 --- Comment #1 from Ruslan --- I was testing this on Kubuntu 14.04 x86_64 with g++ 8.1.0-5ubuntu1~14.04.
[Bug c++/87293] New: An object with invalid type is treated as if it were of type int when reporting errors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87293

Bug ID: 87293
Summary: An object with invalid type is treated as if it were of type int when reporting errors
Product: gcc
Version: 8.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following C++ code:

```
#include <memory>

int main()
{
    std::shared_ptr<dbl> p(new double{5.3});
}
```

Here, g++ emits the following messages:

```
test.cpp: In function ‘int main()’:
test.cpp:4:21: error: ‘dbl’ was not declared in this scope
     std::shared_ptr<dbl> p(new double{5.3});
                     ^~~
test.cpp:4:24: error: template argument 1 is invalid
     std::shared_ptr<dbl> p(new double{5.3});
                        ^
test.cpp:4:43: error: invalid conversion from ‘double*’ to ‘int’ [-fpermissive]
     std::shared_ptr<dbl> p(new double{5.3});
                                           ^
```

The first error is correct: there's no dbl type. But the last error makes no sense at all. There's nothing in the code which could imply that the type of `p` could be `int`: even if no type were present, C++ is not C89, which implied `int` by default. Moreover, if we add a line which uses `p` in another erroneous way, e.g. "struct S{}s=p;", g++ again thinks that `p` is of type `int` ("error: conversion from ‘int’ to non-scalar type ‘main()::S’ requested").
[Bug libstdc++/84666] New: ostringstream prints floats 2x slower than snprintf, when precision>=37
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84666

Bug ID: 84666
Summary: ostringstream prints floats 2x slower than snprintf, when precision>=37
Product: gcc
Version: 7.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Created attachment 43541
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43541&action=edit
Test program

If you compile the attached test program and run it, you'll notice that ostringstream performance becomes 2x slower at precision>=38 (and 1.5x slower on average at precision==37). I've traced it to _M_insert_float using too small an initial buffer, regardless of the precision requested, and thus having to call std::__convert_from_v a second time. The offending line is:

```
// First try a buffer perhaps big enough (most probably sufficient
// for non-ios_base::fixed outputs)
int __cs_size = __max_digits * 3;
```

Here __max_digits is a numeric trait of _ValueT and doesn't depend on __prec. It seems more correct to use __prec instead of (or in addition to) __max_digits here. Interestingly, a few lines below, in the #else branch of #if _GLIBCXX_USE_C99_STDIO, we can see that __prec is taken into account in the calculation of __cs_size. Apparently, on Kubuntu 14.04 amd64, _GLIBCXX_USE_C99_STDIO was set to 1.
[Bug target/84756] New: Multiplication done twice just to get upper and lower parts of product
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84756

Bug ID: 84756
Summary: Multiplication done twice just to get upper and lower parts of product
Product: gcc
Version: 7.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following C code, valid for both x86 and amd64 targets:

```
#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

Shorter mul(Shorter a, Shorter b, Shorter* upper)
{
    *upper=(Longer)a*b >> 8*sizeof(Shorter);
    return (Longer)a*b;
}
Longer lmul(Shorter a, Shorter b)
{
    return (Longer)a*b;
}
```

From the lmul function I get the expected good assembly:

```
lmul:
    mov  eax, DWORD PTR [esp+8]
    mul  DWORD PTR [esp+4]
    ret
```

But for mul, gcc generates two multiplications instead of one:

```
mul:
    push ebx
    mov  ecx, DWORD PTR [esp+8]
    mov  ebx, DWORD PTR [esp+12]
    mov  eax, ecx
    mul  ebx
    mov  eax, DWORD PTR [esp+16]
    mov  DWORD PTR [eax], edx
    mov  eax, ecx
    imul eax, ebx
    pop  ebx
    ret
```

Here `mul ebx` is used to get the upper part of the result, and `imul eax, ebx` is supposed to get the lower part, although it was already present in the eax register right after `mul ebx`. A similar problem happens when I use the -m64 option for gcc to get amd64 code.
[Bug target/84757] New: Useless MOVs and PUSHes to store results of MUL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84757

Bug ID: 84757
Summary: Useless MOVs and PUSHes to store results of MUL
Product: gcc
Version: 7.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following C code:

```
#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

Shorter mulSmarter(Shorter a, Shorter b, Shorter* upper)
{
    const Longer ab=(Longer)a*b;
    *upper=ab >> 8*sizeof(Shorter);
    return ab;
}
```

On amd64 with the -m64 option I get identical assembly on both gcc 7.x and 6.3. But on x86 (or amd64 with -m32) the assembly is different, and gcc 7.x's is less efficient. Compare:

```
# gcc 6.3
mulSmarter:
    mov  eax, DWORD PTR [esp+8]
    mul  DWORD PTR [esp+4]
    mov  ecx, edx
    mov  edx, DWORD PTR [esp+12]
    mov  DWORD PTR [edx], ecx
    ret

# gcc 7.3
mulSmarter:
    push esi
    push ebx
    mov  eax, DWORD PTR [esp+16]
    mul  DWORD PTR [esp+12]
    mov  esi, edx
    mov  edx, DWORD PTR [esp+20]
    mov  ebx, eax
    mov  eax, ebx
    mov  DWORD PTR [edx], esi
    pop  ebx
    pop  esi
    ret
```

The gcc 6.3 version is already not perfect, but it's much better than that of 7.3.
[Bug middle-end/54183] Generate __udivmoddi4 instead of __udivdi3 plus __umoddi3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54183

Ruslan changed:

           What    |Removed     |Added
----------------------------------------------
                 CC|            |b7.10110111 at gmail dot com

--- Comment #1 from Ruslan --- This seems to be fixed in GCC 7: see https://godbolt.org/g/Mz3Qi6 for example.
[Bug target/84759] New: Calculation of quotient and remainder with constant denominator uses __umoddi3+__udivdi3 instead of __udivmoddi4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84759

Bug ID: 84759
Summary: Calculation of quotient and remainder with constant denominator uses __umoddi3+__udivdi3 instead of __udivmoddi4
Product: gcc
Version: 7.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Starting from GCC 7, code calculating both the quotient and the remainder of a long division calls a single __udivmodti4. But this only happens for general values of the denominator; for specific constants, for some reason, GCC still generates calls to __umodti3 and __udivti3. See the following code (the picture is the same for x86 and amd64 targets):

```
#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

Shorter divmod(Longer numerator, Shorter denominator, Shorter* remainder)
{
    *remainder = numerator%denominator;
    return numerator/denominator;
}
Shorter divmodConst(Longer numerator, Shorter* remainder)
{
    const Shorter denominator = 100;
    *remainder = numerator%denominator;
    return numerator/denominator;
}
```

Here divmod is optimized, while divmodConst is not:

```
divmod:
    sub  esp, 28
    xor  edx, edx
    mov  eax, DWORD PTR [esp+40]
    lea  ecx, [esp+8]
    sub  esp, 12
    push ecx
    push edx
    push eax
    push DWORD PTR [esp+60]
    push DWORD PTR [esp+60]
    call __udivmoddi4
    mov  edx, DWORD PTR [esp+76]
    mov  ecx, DWORD PTR [esp+40]
    mov  DWORD PTR [edx], ecx
    add  esp, 60
    ret

divmodConst:
    push edi
    push esi
    sub  esp, 4
    mov  esi, DWORD PTR [esp+16]
    mov  edi, DWORD PTR [esp+20]
    push 0
    push 100
    push edi
    push esi
    call __umoddi3
    add  esp, 16
    mov  edx, DWORD PTR [esp+24]
    mov  DWORD PTR [edx], eax
    push 0
    push 100
    push edi
    push esi
    call __udivdi3
    add  esp, 20
    pop  esi
    pop  edi
    ret
```
[Bug middle-end/54183] Generate __udivmoddi4 instead of __udivdi3 plus __umoddi3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54183 --- Comment #3 from Ruslan --- Ah, actually your problem is with a constant divisor. I reported it as bug 84759. If you change 10 to e.g. a function parameter, then you'll get __udivmoddi4.
[Bug middle-end/54183] Generate __udivmoddi4 instead of __udivdi3 plus __umoddi3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54183 --- Comment #5 from Ruslan --- Yes, this is exactly the problem: the generic case is optimized while the special case, where the divisor is a compile-time constant, isn't.
[Bug c++/70299] New: pow(long double, int) gives more imprecise result than pow(long double,long double) in c++03 mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70299

Bug ID: 70299
Summary: pow(long double, int) gives more imprecise result than pow(long double,long double) in c++03 mode
Product: gcc
Version: 5.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

The following example program gets different results for different overloads of std::pow(), even though none of the parameters lose precision on the call:

```
#include <cmath>
#include <iostream>
#include <limits>

int main()
{
    std::cout.precision(std::numeric_limits<long double>::digits10+3); // =max_digits10
    std::cout << "pow(long double, int) :" << std::pow(10.L,-4823) << "\n";
    std::cout << "pow(long double, long double): " << std::pow(10.L,-4823.L) << "\n";
}
```

Its output in -std=c++03 mode:

```
pow(long double, int) :1.0288e-4823
pow(long double, long double): 1.0005e-4823
```

And in -std=c++11 and -std=c++14 modes it's correct:

```
pow(long double, int) :1.0005e-4823
pow(long double, long double): 1.0005e-4823
```
[Bug c++/70299] pow(long double, int) gives more imprecise result than pow(long double,long double) in c++03 mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70299 --- Comment #1 from Ruslan --- The machine I tested on was running Ubuntu 15.10; uname -a gives "Linux integral3-amd64 4.2.0-22-generic #27-Ubuntu SMP Thu Dec 17 22:57:08 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux", and libc is Ubuntu GLIBC 2.21-0ubuntu4.
[Bug c++/70441] New: vector<__float128> crashes on two push_back calls with -mavx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70441

Bug ID: 70441
Summary: vector<__float128> crashes on two push_back calls with -mavx
Product: gcc
Version: 5.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

The following simple program reproduces the bug:

```
#include <vector>

int main()
{
    std::vector<__float128> tests;
    tests.push_back(0);
    tests.push_back(0);
}
```

I compiled it with this command line:

g++ test.cpp -o test -g -mavx

On an attempt to run, it reliably crashes on a `vmovaps XMMWORD PTR [eax],xmm2` instruction, where eax==0x804fa38, i.e. not aligned on a 16-byte boundary. The calling code is the second push_back(). Here's the full backtrace:

```
0x08048d30 in __gnu_cxx::new_allocator<__float128>::construct (this=0xd444, __p=0x804fa38, __val=@0xd460: )
    at /opt/gcc-5.2/include/c++/5.2.0/ext/new_allocator.h:130
130         { ::new((void *)__p) _Tp(__val); }
(gdb) bt
#0  0x08048d30 in __gnu_cxx::new_allocator<__float128>::construct (this=0xd444, __p=0x804fa38, __val=@0xd460: )
    at /opt/gcc-5.2/include/c++/5.2.0/ext/new_allocator.h:130
#1  0x080489bd in __gnu_cxx::__alloc_traits<std::allocator<__float128> >::construct<__float128> (__a=..., __p=0x804fa38, __arg=@0xd460: )
    at /opt/gcc-5.2/include/c++/5.2.0/ext/alloc_traits.h:189
#2  0x08048ae1 in std::vector<__float128, std::allocator<__float128> >::_M_insert_aux (this=0xd444, __position=, __x=@0xd460: )
    at /opt/gcc-5.2/include/c++/5.2.0/bits/vector.tcc:361
#3  0x080488e9 in std::vector<__float128, std::allocator<__float128> >::push_back (this=0xd444, __x=@0xd460: )
    at /opt/gcc-5.2/include/c++/5.2.0/bits/stl_vector.h:925
#4  0x080487c2 in main () at test.cpp:7
```
[Bug rtl-optimization/70467] New: Useless "and [esp],-1" emitted on AND with uint64_t variable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70467

Bug ID: 70467
Summary: Useless "and [esp],-1" emitted on AND with uint64_t variable
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following C code:

```
#include <string.h>

long double __attribute__((noinline)) test() { return 0; }

long double doStuff()
{
    long double value=test();
    unsigned long long v;
    memcpy(&v,&value,sizeof v);
    v&=~(1ull<<63);
    memcpy(&value,&v,sizeof v);
    return value;
}

int main(){}
```

I get the following output for the doStuff() function when I compile this code with `gcc -O3 -fomit-frame-pointer -m32`:

```
doStuff:
    sub  esp, 28
    call test                          ; OK, I asked to avoid inlining it
    fstp TBYTE PTR [esp]
    and  DWORD PTR [esp], -1           ; DO NOTHING!!!
    and  DWORD PTR [esp+4], 2147483647 ; Clear highest bit
    fld  TBYTE PTR [esp]
    add  esp, 28
    ret
```

The instruction marked with `DO NOTHING!!!` is a no-op here (the flags are not tested) and should have been eliminated. This useless instruction is generated across generations of GCC, starting at least with 4.4.7 and ending at 6.0.0 20160221 (the snapshot testable at gcc.godbolt.org).
[Bug rtl-optimization/70467] Useless "and [esp],-1" emitted on AND with uint64_t variable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70467 --- Comment #4 from Ruslan --- (In reply to Jakub Jelinek from comment #3) > ... > nothing there is able to optimize & -1 (and similarly | or ^ 0, or & 0, or | > -1). Just a note: the same happens for arithmetic operations, not just bitwise. E.g. if you change `v&=~(1ull<<63)` in the OP to `v+=1ull<<32`, GCC generates `add dword [esp],0` followed by `adc dword [esp+4],1`.
[Bug rtl-optimization/70504] New: FLD, FLD, FXCH emitted instead of FLD, FLD in the needed order
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70504

Bug ID: 70504
Summary: FLD, FLD, FXCH emitted instead of FLD, FLD in the needed order
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

The following code demonstrates the bug:

```
long double inl_scalbn(long double mant, long double exp)
{
    long double result;
    asm("fscale" : "=&t"(result) : "%0"(mant), "u"(exp));
    return result;
}
```

With the `-O3` option GCC generates the following assembly:

```
inl_scalbn:
    fld  TBYTE PTR [esp+4]
    fld  TBYTE PTR [esp+16]
    fxch st(1)
    fscale
    fstp st(1)
    ret
```

What's even stranger, I thought it was somehow related to the order of function arguments, but if I switch `mant` and `exp`, the code just switches the `fld` instructions instead of removing `fxch`. It's clear that in both cases the code could have just loaded the parameters in the correct order in the first place.
[Bug rtl-optimization/70976] New: Useless vectorization leads to degradation of performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70976

Bug ID: 70976
Summary: Useless vectorization leads to degradation of performance
Product: gcc
Version: 6.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

See the following code:

```
#include <stdio.h>

int main()
{
    unsigned long u = 13;
    for(unsigned long i = 0; i < 1UL<<30; i++)
        u += 23442*u;
    if (u == 0)
        printf("0\n");
}
```

Compiling it on an AMD64 system with -O2, I get normal assembly for the loop:

```
.L2:
    imul rdx, rdx, 23443
    sub  rax, 1
    jne  .L2
```

But if I use -O3, the loop looks like this:

```
.L2:
    movdqa  xmm3, xmm1
    add     eax, 1
    movdqa  xmm0, xmm1
    pmuludq xmm1, xmm4
    cmp     eax, 536870912
    pmuludq xmm3, xmm2
    psrlq   xmm0, 32
    pmuludq xmm0, xmm2
    paddq   xmm0, xmm1
    movdqa  xmm1, xmm3
    psllq   xmm0, 32
    paddq   xmm1, xmm0
    jne     .L2
```

Not only does it become longer, but it also needlessly does calculations on pairs of identical numbers. On my CPU (Intel(R) Xeon(R) CPU E3-1226 v3 @ 3.30GHz) the -O2 version is almost two times faster than the -O3 one. This happens with gcc 4.7.3 and newer, but doesn't with 4.6.4 and older.
[Bug c++/71238] New: Undeclared function message imprecisely points to error column
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71238

Bug ID: 71238
Summary: Undeclared function message imprecisely points to error column
Product: gcc
Version: 6.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

The following program

```
int main()
{
    int x=myFunc(3234);
}
```

gives me the error:

```
test.cpp:3:22: error: ‘myFunc’ was not declared in this scope
     int x=myFunc(3234);
                      ^
```

Here the "^" symbol points to the closing parenthesis (and the parenthesis itself is even colored red). But the error is not at that column; it is at the `myFunc` identifier. Similar code but without the function call parentheses leads to a much more precise error message:

```
test.cpp:3:11: error: ‘myFunc’ was not declared in this scope
     int x=myFunc/*(3234)*/;
           ^~
```
[Bug c++/71469] New: Print possible override candidates when a method is marked override but doesn't override
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71469

Bug ID: 71469
Summary: Print possible override candidates when a method is marked override but doesn't override
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Currently for this code

```
struct Base
{
    virtual int funct(int);
};
struct Der : Base
{
    int func(int) override;
    int funct() override;
};
int main(){}
```

g++ gives the following error:

```
test.cpp:7:9: error: ‘int Der::func(int)’ marked ‘override’, but does not override
     int func(int) override;
         ^
test.cpp:8:9: error: ‘int Der::funct()’ marked ‘override’, but does not override
     int funct() override;
         ^
```

Now one has to look into the declaration of Base to find out what's actually wrong. It'd be nice if g++ suggested possible candidates:

1. For the first case in the above example, int func(int), do something similar to the "no such member" error (i.e. suggest a function name correction) if the supposed override matches in parameter types;
2. For the second case, int funct(), just list the name(s) of the virtual functions in the base class matching the name of the supposed override.

This will make it much simpler to immediately see trivial errors like omitting a parameter or using int instead of long.
[Bug target/77457] Print intended value of constants in assembly output
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77457 --- Comment #1 from Ruslan --- Same for version "GCC: (Ubuntu 6.1.1-3ubuntu11~14.04.1) 6.1.1 20160511"
[Bug target/77457] New: Print intended value of constants in assembly output
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77457

Bug ID: 77457
Summary: Print intended value of constants in assembly output
Product: gcc
Version: 6.1.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following simple program:

```
void f()
{
    volatile double x=0.352;
}
```

I compile it with `gcc test.c -S -masm=intel -fverbose-asm` and get the following for the value of `x`:

```
.LC0:
    .long 34359738
    .long 1071023915
    .ident "GCC: (Ubuntu 5.3.0-3ubuntu1~14.04) 5.3.0 20151204"
```

To decipher it while reading the listing, one has to manually concatenate the hexadecimal forms of these two numbers and then transform the result to floating-point form. Not too handy. For comparison, this is what I get from clang:

```
.LCPI0_0:
    .quad 4600012688193243578 # double 0.35198
<...skipped some code...>
    .ident "Ubuntu clang version 3.8.0-svn257311-1~exp1 (trunk) (based on LLVM 3.8.0)"
```

It would be really useful if GCC also printed the intended values of the constants it emits. Namely, this should be done for float, double and long double.
[Bug middle-end/77457] Print intended value of constants in assembly output
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77457 --- Comment #5 from Ruslan --- (In reply to Andrew Pinski from comment #2) > Note also should be shown in C99 hex floats because that is 100% exactly > representable of the number in binary :). Not sure if exactness is worth it. It'll make it harder to see what the decimal value is (and decimal is the most commonly used radix by humans), while decimal form, if printed with `max_digits10` digits, is enough to reproduce the hex/binary form when needed.
[Bug c++/60962] New: b+(-2.f)*a generates multiplication instruction while b-2.f*a simplifies to addition&subtraction
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60962

Bug ID: 60962
Summary: b+(-2.f)*a generates multiplication instruction while b-2.f*a simplifies to addition&subtraction
Product: gcc
Version: 4.8.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com

Created attachment 32681
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32681&action=edit
A procedure compilable into assembly to reproduce the bug

I've tried the following with the -O3 -ffast-math -fassociative-math options (here all operands are floats):

```
float lap0= point[-1]+point[1] + (-2.f)*point[0];
```

This part of the code (compilable into the attached assembly version) generates mulss/addss code (adding a constant in .rodata and reading it beforehand), which leads to a 6% slowdown compared to this version:

```
float lap0= point[-1]+point[1] - 2.f*point[0];
```

which generates addss/subss code. My g++ version is g++ (Ubuntu 4.8.1-2ubuntu1~12.04) 4.8.1. The full command line is:

g++ -O3 -ffast-math -fassociative-math -o test1.s -S -masm=intel test.cpp

The problem also reproduces with g++ 4.5.
[Bug c++/66346] New: GCC computes log10(2.L) constant wrongly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66346

Bug ID: 66346
Summary: GCC computes log10(2.L) constant wrongly
Product: gcc
Version: 5.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: b7.10110111 at gmail dot com
Target Milestone: ---

Consider the following code:

```
#include <cmath>
#include <iostream>

int main()
{
    volatile long double two=2.L;
    long double vol=log10(two);
    long double con=log10(2.L);
    long double ref=0.3010299956639811952L;
    std::cout.precision(19);
    std::cout << "computed constant: " << con << '\n';
    std::cout << "computed volatile: " << vol << '\n';
    std::cout << "reference value : " << ref << "\n";
}
```

Here the reference value was computed using Wolfram Mathematica via the N[Log10[2],19] command. On an x86 system, this code, compiled with gcc 4.5, 4.8, 4.9 and 5.1, gives me this output:

```
computed constant: 0.301029995663981198
computed volatile: 0.3010299956639811952
reference value : 0.3010299956639811952
```

On an x86_64 system it prints this:

```
computed constant: 0.301029995663981198
computed volatile: 0.301029995663981198
reference value : 0.3010299956639811952
```

This appears to be a wrong result. The same code compiled by clang++ 3.0.6ubuntu3 gives all values equal to the reference value.
[Bug libstdc++/66346] GCC computes log10(2.L) constant wrongly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66346

Ruslan changed:

           What    |Removed        |Added
----------------------------------------------
             Status|UNCONFIRMED    |RESOLVED
         Resolution|---            |INVALID

--- Comment #3 from Ruslan --- Ah, that's what I'm doing wrong... Thanks, this bug is invalid then.