[Bug c++/96065] New: Move elision of returned automatic variable doesn't happen the variable is enclosed in a block

2020-07-05 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96065

Bug ID: 96065
   Summary: Move elision of returned automatic variable doesn't
happen the variable is enclosed in a block
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following code (at Godbolt's: https://gcc.godbolt.org/z/CyqPF9 ):

```
struct A
{
A();
A(A&&);
A(A const&);
A& operator=(A&&);
A& operator=(A const&);
};

A getA()
{
{
A a;
return a;
}
}

int main()
{
const A a=getA();
}
```

Here we get a call to A::A() followed by a call to A::A(A&&). If we remove the inner
braces in getA(), move elision happens, so only A::A() is called. I'd expect
move elision to happen with the braces in place too.

The move is also not elided when the block belongs to an if statement. The same
pattern occurs with copy elision if we comment out the move constructor.

For comparison, MSVC 19.24 (with /O2 flag) and Clang 10.0 (by default) both
elide the move.

[Bug sanitizer/86022] New: TCB size calculated in ThreadDescriptorSize() is wrong for glibc-2.14

2018-06-01 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86022

Bug ID: 86022
   Summary: TCB size calculated in ThreadDescriptorSize() is wrong
for glibc-2.14
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

In ThreadDescriptorSize(), I currently see:

  else if (minor <= 13)
val = FIRST_32_SECOND_64(1168, 2304);
  else
val = FIRST_32_SECOND_64(1216, 2304);

This leads to an assertion failure on glibc-2.14, with the same message as in bug
60038. The actual values for glibc 2.14 are the same as for 2.13: 1168 for i386 and
2304 for x86_64.

I checked this by appending the following to glibc-2.14.1/nptl/descr.h:

typedef int TCB_SIZE_2304[sizeof(struct pthread)==2304 ? -1 : 1];
typedef int TCB_SIZE_1168[sizeof(struct pthread)==1168 ? -1 : 1];

and getting the corresponding error when compiling glibc on both a 32-bit and a
64-bit x86 Kubuntu machine.


I suppose the fix should be to change "minor <= 13" to "minor <= 14".

[Bug c++/91990] New: Too slow compilation of recursively-nested template class with two instances of its template parent

2019-10-04 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91990

Bug ID: 91990
   Summary: Too slow compilation of recursively-nested template
class with two instances of its template parent
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following code:
```
template<int N> class A
{
    typedef A<N-1> B;
    B x, y;
};
template<> class A<0> { char m; };
int main()
{
    A<LEVEL> a;
}
```
g++ compilation time grows exponentially with the value of `LEVEL`. But
if you replace `x, y` with `x[2]`, compilation takes constant (negligibly
small) time. I've tested this on g++ 6.5.0, 8.3.0 and 9.1.0, and the slow
compilation reproduces in all of them.

For comparison, clang++ 6.0 compiles both versions (with `x, y` and `x[2]`) in
negligibly small time regardless of `LEVEL` (tested up to `LEVEL=906`, on 907
it crashes).

[Bug libstdc++/83566] New: cyl_bessel_j returns wrong result for x>1000 for high orders.

2017-12-23 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83566

Bug ID: 83566
   Summary: cyl_bessel_j returns wrong result for x>1000 for high
orders.
   Product: gcc
   Version: 7.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

The following test program prints results of C++17
std::cyl_bessel_j(100,1000.0001) and corresponding result given by GSL:

#include <cmath>
#include <iostream>
#include <limits>
#include <gsl/gsl_sf_bessel.h>

int main()
{
    const double volatile n = 100;
    const double volatile x = 1000.0001;
    std::cout.precision(std::numeric_limits<double>::digits10);
    const auto valueCXX17 = std::cyl_bessel_j(n,x);
    const auto valueGSL   = gsl_sf_bessel_Jn (n,x);
    std::cout << "C++17: " << valueCXX17 << "\n"
              << "GSL  : " << valueGSL << "\n";
}

I get the following output:

C++17: 0.433818396252946
GSL  : 0.0116783669817645

Comparison with Boost.Math and Wolfram Mathematica shows that GSL is right,
while stdc++ is wrong.

For x<=1000 there's no such problem. As n decreases, the imprecision gradually
gets smaller.

[Bug libstdc++/83566] cyl_bessel_j returns wrong result for x>1000 for high orders.

2017-12-23 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83566

--- Comment #1 from Ruslan  ---
> As n decreases, the imprecision gradually gets smaller.
To avoid confusion: this statement is for fixed x>1000.

[Bug c++/90971] New: Suboptimal diagnostic for is_same_v requirement for std::array

2019-06-24 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90971

Bug ID: 90971
   Summary: Suboptimal diagnostic for is_same_v requirement for
std::array
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following code:


```
#include <array>

int main()
{
std::array arr={1.32,5,45.3463,4.674,-94.463,34.634};
}
```


GCC 9.1 (9.1.0-2ubuntu2~18.04) gives the following diagnostic with -std=c++17
option:


```
test.cpp: In function ‘int main()’:
test.cpp:5:53: error: class template argument deduction failed:
5 |  std::array arr={1.32,5,45.3463,4.674,-94.463,34.634};
  | ^
test.cpp:5:53: error: no matching function for call to ‘array(double, int,
double, double, double, double)’
In file included from test.cpp:1:
/usr/include/c++/9/array:244:5: note: candidate: ‘template<class _Tp, class ... _Up> std::array(_Tp, _Up ...)-> std::array<typename std::enable_if<(is_same_v<_Tp, _Up> && ...), _Tp>::type, (1 + sizeof...
(_Up))>’
  244 | array(_Tp, _Up...)
  | ^
/usr/include/c++/9/array:244:5: note:   template argument
deduction/substitution failed:
/usr/include/c++/9/array: In substitution of ‘template<class _Tp, class ... _Up> std::array(_Tp, _Up ...)-> std::array<typename std::enable_if<(is_same_v<_Tp, _Up> && ...), _Tp>::type, (1 + sizeof... (_Up))>
[with _Tp = double; _Up = {int, double, double, double, double}]’:
test.cpp:5:53:   required from here
/usr/include/c++/9/array:244:5: error: no type named ‘type’ in ‘struct
std::enable_if<false, double>’
```


This error message "error: no type named ‘type’ in ‘struct
std::enable_if<false, double>’" is not very useful. Yes, it is technically
correct, but compare it to what clang 6.0 (6.0.0-1ubuntu2) prints instead:


```
test.cpp:5:13: error: no viable constructor or deduction guide for deduction of
template arguments of 'array'
std::array arr={1.32,5,45.3463,4.674,-94.463,34.634};
   ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/array:244:5:
note: candidate template ignored: requirement 'is_same_v<double, int>' was not
satisfied [with _Tp = double,
  _Up = <int, double, double, double, double>]
array(_Tp, _Up...)
^
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/array:94:12:
note: candidate function template not viable: requires 0 arguments, but 6 were
provided
struct array
   ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/array:94:12:
note: candidate function template not viable: requires 1 argument, but 6 were
provided
1 error generated.
```


Note this: "requirement 'is_same_v<double, int>' was not satisfied". It's much
better than what GCC says.

[Bug libstdc++/86409] New: std::stod fails for denormal numbers

2018-07-05 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86409

Bug ID: 86409
   Summary: std::stod fails for denormal numbers
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following test program:

// BEGIN
#include <iostream>
#include <sstream>
#include <string>

int main()
{
const char str[]="3.23534634e-320";
try
{
const auto value=std::stod(str);
std::cout << "stod returned " << value << '\n';
}
catch(std::exception&)
{
std::cerr << "stod failed\n";
}

std::istringstream ss(str);
double value;
ss >> value;
if(ss) std::cout << "istringstream gave " << value << '\n';
else std::cerr << "istringstream failed\n";
}
// END

Here std::stod throws std::out_of_range exception, although the number can be
represented as a denormal in double. std::istringstream works as expected,
reading the denormal into the variable.

[Bug libstdc++/86409] std::stod fails for denormal numbers

2018-07-05 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86409

--- Comment #1 from Ruslan  ---
I was testing this on Kubuntu 14.04 x86_64 with g++ 8.1.0-5ubuntu1~14.04.

[Bug c++/87293] New: An object with invalid type is treated as if it were of type int when reporting errors

2018-09-13 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87293

Bug ID: 87293
   Summary: An object with invalid type is treated as if it were
of type int when reporting errors
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following C++ code:

//
#include <memory>
int main()
{
    std::shared_ptr<dbl> p(new double{5.3});
}
//

Here, g++ emits the following messages:
--BEGIN-
test.cpp: In function ‘int main()’:
test.cpp:4:21: error: ‘dbl’ was not declared in this scope
     std::shared_ptr<dbl> p(new double{5.3});
                     ^~~
test.cpp:4:24: error: template argument 1 is invalid
     std::shared_ptr<dbl> p(new double{5.3});
                        ^
test.cpp:4:43: error: invalid conversion from ‘double*’ to ‘int’ [-fpermissive]
     std::shared_ptr<dbl> p(new double{5.3});
                                           ^
-END---

The first error is correct: there's no dbl type. But the last error makes no
sense at all. There's nothing in the code which could imply that the type of
`p` could be `int`: even if there were no type present, C++ is not C89 to imply
`int` by default.

Moreover, if we add a line which uses `p` in another erroneous way, e.g.
"struct S{}s=p;", g++ again thinks that `p` is of type `int` ("error:
conversion from ‘int’ to non-scalar type ‘main()::S’ requested").

[Bug libstdc++/84666] New: ostringstream prints floats 2x slower than snprintf, when precision>=37

2018-03-02 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84666

Bug ID: 84666
   Summary: ostringstream prints floats 2x slower than snprintf,
when precision>=37
   Product: gcc
   Version: 7.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Created attachment 43541
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43541&action=edit
Test program

If you compile and run the attached test program, you'll notice that
ostringstream becomes 2x slower at precision>=38 (and 1.5x slower
on average at precision==37). I've traced it to _M_insert_float using too small an
initial buffer, regardless of the precision requested, and thus having to call
std::__convert_from_v a second time.

The offending line is:

// First try a buffer perhaps big enough (most probably sufficient
// for non-ios_base::fixed outputs)
int __cs_size = __max_digits * 3;

Here __max_digits is a numeric trait of _ValueT, and doesn't depend on __prec.
It seems more correct to use __prec instead of (or in addition to) __max_digits
here.

Interestingly, a few lines below, in the #else branch of #if
_GLIBCXX_USE_C99_STDIO, we can see that __prec is taken into account in
calculation of __cs_size. Apparently, on Kubuntu 14.04 amd64,
_GLIBCXX_USE_C99_STDIO was set to 1.

[Bug target/84756] New: Multiplication done twice just to get upper and lower parts of product

2018-03-07 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84756

Bug ID: 84756
   Summary: Multiplication done twice just to get upper and lower
parts of product
   Product: gcc
   Version: 7.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following C code valid for both x86 and amd64 targets:

#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

Shorter mul(Shorter a, Shorter b, Shorter* upper)
{
*upper=(Longer)a*b >> 8*sizeof(Shorter);
return (Longer)a*b;
}

Longer lmul(Shorter a, Shorter b)
{
return (Longer)a*b;
}

From lmul function I get the expected good assembly:

lmul:
mov eax, DWORD PTR [esp+8]
mul DWORD PTR [esp+4]
ret

But for mul gcc generates two multiplications instead of one:

mul:
push ebx
mov ecx, DWORD PTR [esp+8]
mov ebx, DWORD PTR [esp+12]
mov eax, ecx
mul ebx
mov eax, DWORD PTR [esp+16]
mov DWORD PTR [eax], edx
mov eax, ecx
imul eax, ebx
pop ebx
ret

Here `mul ebx` is used to get the upper part of the result, and `imul eax, ebx`
is supposed to get the lower part, although that was already available in the
eax register right after `mul ebx`.

Similar problem happens when I use -m64 option for gcc to get amd64 code.

[Bug target/84757] New: Useless MOVs and PUSHes to store results of MUL

2018-03-07 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84757

Bug ID: 84757
   Summary: Useless MOVs and PUSHes to store results of MUL
   Product: gcc
   Version: 7.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following C code:

#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

Shorter mulSmarter(Shorter a, Shorter b, Shorter* upper)
{
const Longer ab=(Longer)a*b;
*upper=ab >> 8*sizeof(Shorter);
return ab;
}

On amd64 with the -m64 option I get identical assembly from both gcc 7.x and 6.3. But
on x86 (or amd64 with -m32) the assembly differs, and the gcc 7.x output is less
efficient. Compare:

# gcc 6.3
mulSmarter:
  mov eax, DWORD PTR [esp+8]
  mul DWORD PTR [esp+4]
  mov ecx, edx
  mov edx, DWORD PTR [esp+12]
  mov DWORD PTR [edx], ecx
  ret

# gcc 7.3
mulSmarter:
  push esi
  push ebx
  mov eax, DWORD PTR [esp+16]
  mul DWORD PTR [esp+12]
  mov esi, edx
  mov edx, DWORD PTR [esp+20]
  mov ebx, eax
  mov eax, ebx
  mov DWORD PTR [edx], esi
  pop ebx
  pop esi
  ret

The gcc 6.3 version is already not perfect, but it's much better than that of
7.3.

[Bug middle-end/54183] Generate __udivmoddi4 instead of __udivdi3 plus __umoddi3

2018-03-08 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54183

Ruslan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |b7.10110111 at gmail dot com

--- Comment #1 from Ruslan  ---
This seems to be fixed in GCC 7: see https://godbolt.org/g/Mz3Qi6 for example.

[Bug target/84759] New: Calculation of quotient and remainder with constant denominator uses __umoddi3+__udivdi3 instead of __udivmoddi4

2018-03-08 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84759

Bug ID: 84759
   Summary: Calculation of quotient and remainder with constant
denominator uses __umoddi3+__udivdi3 instead of
__udivmoddi4
   Product: gcc
   Version: 7.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Starting from GCC 7, code calculating both the quotient and the remainder of a long
division calls a single __udivmodti4. But this only happens for general values
of the denominator; for specific constants GCC for some reason still
generates calls to __umodti3 and __udivti3. See the following code (the picture
is the same for x86 and amd64 targets):

#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

Shorter divmod(Longer numerator, Shorter denominator, Shorter* remainder)
{
*remainder = numerator%denominator;
return numerator/denominator;
}

Shorter divmodConst(Longer numerator, Shorter* remainder)
{
const Shorter denominator = 100;
*remainder = numerator%denominator;
return numerator/denominator;
}

Here divmod is optimized, while divmodConst appears not optimized:

divmod:
sub esp, 28
xor edx, edx
mov eax, DWORD PTR [esp+40]
lea ecx, [esp+8]
sub esp, 12
push ecx
push edx
push eax
push DWORD PTR [esp+60]
push DWORD PTR [esp+60]
call __udivmoddi4
mov edx, DWORD PTR [esp+76]
mov ecx, DWORD PTR [esp+40]
mov DWORD PTR [edx], ecx
add esp, 60
ret
divmodConst:
push edi
push esi
sub esp, 4
mov esi, DWORD PTR [esp+16]
mov edi, DWORD PTR [esp+20]
push 0
push 100
push edi
push esi
call __umoddi3
add esp, 16
mov edx, DWORD PTR [esp+24]
mov DWORD PTR [edx], eax
push 0
push 100
push edi
push esi
call __udivdi3
add esp, 20
pop esi
pop edi
ret

[Bug middle-end/54183] Generate __udivmoddi4 instead of __udivdi3 plus __umoddi3

2018-03-16 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54183

--- Comment #3 from Ruslan  ---
Ah, actually your problem is with a constant divisor. I reported it as bug
84759. If you change 10 to e.g. a function parameter, then you'll get
__udivmoddi4.

[Bug middle-end/54183] Generate __udivmoddi4 instead of __udivdi3 plus __umoddi3

2018-03-16 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54183

--- Comment #5 from Ruslan  ---
Yes, this is exactly the problem: the generic case is optimized while the
special case, where the divisor is a compile-time constant, isn't.

[Bug c++/70299] New: pow(long double, int) gives more imprecise result than pow(long double,long double) in c++03 mode

2016-03-19 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70299

Bug ID: 70299
   Summary: pow(long double, int) gives more imprecise result than
pow(long double,long double) in c++03 mode
   Product: gcc
   Version: 5.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

The following example program appears to get different results for different
overloads of std::pow(), even though all the parameters don't lose precision on
call:

#include <cmath>
#include <iostream>
#include <limits>

int main()
{
    std::cout.precision(std::numeric_limits<long double>::digits10+3); // =max_digits10
    std::cout << "pow(long double, int) :" << std::pow(10.L,-4823) << "\n";
    std::cout << "pow(long double, long double): " << std::pow(10.L,-4823.L) << "\n";
}

Its output in -std=c++03 mode:
pow(long double, int) :1.0288e-4823
pow(long double, long double): 1.0005e-4823

And in -std=c++11 and -std=c++14 mode it's correct:
pow(long double, int) :1.0005e-4823
pow(long double, long double): 1.0005e-4823

[Bug c++/70299] pow(long double, int) gives more imprecise result than pow(long double,long double) in c++03 mode

2016-03-19 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70299

--- Comment #1 from Ruslan  ---
The machine I tested it on was Ubuntu 15.10, uname -a gives
Linux integral3-amd64 4.2.0-22-generic #27-Ubuntu SMP Thu Dec 17 22:57:08 UTC
2015 x86_64 x86_64 x86_64 GNU/Linux
, libc is Ubuntu GLIBC 2.21-0ubuntu4.

[Bug c++/70441] New: vector<__float128> crashes on two push_back calls with -mavx

2016-03-29 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70441

Bug ID: 70441
   Summary: vector<__float128> crashes on two push_back calls with
-mavx
   Product: gcc
   Version: 5.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

The following simple program reproduces the bug:


#include <vector>

int main()
{
std::vector<__float128> tests;
tests.push_back(0);
tests.push_back(0);
}


I compiled it with this command line:

g++ test.cpp -o test -g -mavx

When run, it reliably crashes on the `vmovaps XMMWORD PTR [eax],xmm2`
instruction, where eax==0x804fa38, i.e. not aligned on a 16-byte boundary. The
calling code is the second push_back(). Here's the full backtrace:

0x08048d30 in __gnu_cxx::new_allocator<__float128>::construct (this=0xd444,
__p=0x804fa38, __val=@0xd460: ) at
/opt/gcc-5.2/include/c++/5.2.0/ext/new_allocator.h:130
130   { ::new((void *)__p) _Tp(__val); }
(gdb) bt
#0  0x08048d30 in __gnu_cxx::new_allocator<__float128>::construct
(this=0xd444, __p=0x804fa38, __val=@0xd460: )
at /opt/gcc-5.2/include/c++/5.2.0/ext/new_allocator.h:130
#1  0x080489bd in __gnu_cxx::__alloc_traits
>::construct<__float128> (__a=..., __p=0x804fa38, __arg=@0xd460: )
at /opt/gcc-5.2/include/c++/5.2.0/ext/alloc_traits.h:189
#2  0x08048ae1 in std::vector<__float128, std::allocator<__float128>
>::_M_insert_aux (this=0xd444, __position=,
__x=@0xd460: )
at /opt/gcc-5.2/include/c++/5.2.0/bits/vector.tcc:361
#3  0x080488e9 in std::vector<__float128, std::allocator<__float128>
>::push_back (this=0xd444, __x=@0xd460: ) at
/opt/gcc-5.2/include/c++/5.2.0/bits/stl_vector.h:925
#4  0x080487c2 in main () at test.cpp:7

[Bug rtl-optimization/70467] New: Useless "and [esp],-1" emitted on AND with uint64_t variable

2016-03-30 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70467

Bug ID: 70467
   Summary: Useless "and [esp],-1" emitted on AND with uint64_t
variable
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following C code:

#include <string.h>
long double __attribute__((noinline)) test() { return 0; }

long double doStuff()
{
long double value=test();
unsigned long long v;
memcpy(&v,&value,sizeof v);
v&=~(1ull<<63);
memcpy(&value,&v,sizeof v);
return value;
}
int main(){}

I get the following output for the doStuff() function when I compile this code with
`gcc -O3 -fomit-frame-pointer -m32`:


doStuff:
sub esp, 28
call test   ; OK, I asked to avoid inlining it
fstp TBYTE PTR [esp]
and DWORD PTR [esp], -1   ; DO NOTHING!!!
and DWORD PTR [esp+4], 2147483647 ; Clear highest bit
fld TBYTE PTR [esp]
add esp, 28
ret


The instruction marked with `DO NOTHING!!!` is a no-op here (flags are not
tested) and should have been eliminated.

This useless instruction is generated across generations of GCC starting at
least with 4.4.7 and ending at 6.0.0 20160221 (the snapshot testable at
gcc.godbolt.org).

[Bug rtl-optimization/70467] Useless "and [esp],-1" emitted on AND with uint64_t variable

2016-03-31 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70467

--- Comment #4 from Ruslan  ---
(In reply to Jakub Jelinek from comment #3)
> ...
> nothing there is able to optimize & -1 (and similarly | or ^ 0, or & 0, or |
> -1).

Just a note: the same happens for arithmetic operations, not just bitwise. E.g.
if you change `v&=~(1ull<<63)` in the OP to `v+=1ull<<32`, GCC generates `add
dword [esp],0` followed by `adc dword [esp+4],1`.

[Bug rtl-optimization/70504] New: FLD, FLD, FXCH emitted instead of FLD, FLD in the needed order

2016-04-01 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70504

Bug ID: 70504
   Summary: FLD, FLD, FXCH emitted instead of FLD, FLD in the
needed order
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

The following code demonstrates the bug:


long double inl_scalbn(long double mant, long double exp)
{
long double result;
asm("fscale"
: "=&t"(result)
: "%0"(mant),
  "u"(exp)
   );
return result;
}


With `-O3` option GCC generates the following assembly:


inl_scalbn:
fld TBYTE PTR [esp+4]
fld TBYTE PTR [esp+16]
fxch st(1)
fscale
fstp st(1)
ret


Stranger still: I suspected this was related to the order of the function
arguments, but if I swap `mant` and `exp`, the code just swaps the `fld`
instructions instead of dropping the `fxch`.

It's clear that in both cases the code could have just loaded the parameters in
the correct order in the first place.

[Bug rtl-optimization/70976] New: Useless vectorization leads to degradation of performance

2016-05-06 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70976

Bug ID: 70976
   Summary: Useless vectorization leads to degradation of
performance
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

See the following code:

#include <stdio.h>
int main()
{
unsigned long u = 13;
for(unsigned long i = 0; i < 1UL<<30; i++)
u += 23442*u;
if (u == 0) printf("0\n");
}

Compiling it on an AMD64 system with -O2, I get normal assembly for the loop:

.L2:
imul rdx, rdx, 23443
sub rax, 1
jne .L2

But if I use -O3, the loop looks like this:

.L2:
movdqa  xmm3, xmm1
add eax, 1
movdqa  xmm0, xmm1
pmuludq xmm1, xmm4
cmp eax, 536870912
pmuludq xmm3, xmm2
psrlq   xmm0, 32
pmuludq xmm0, xmm2
paddq   xmm0, xmm1
movdqa  xmm1, xmm3
psllq   xmm0, 32
paddq   xmm1, xmm0
jne .L2

Not only does it become longer, but also it needlessly does calculations on
pairs of identical numbers. On my CPU (Intel(R) Xeon(R) CPU E3-1226 v3 @
3.30GHz) the -O2 version is almost two times faster than -O3 one.

This happens with gcc 4.7.3 and newer, but doesn't with 4.6.4 and older.

[Bug c++/71238] New: Undeclared function message imprecisely points to error column

2016-05-23 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71238

Bug ID: 71238
   Summary: Undeclared function message imprecisely points to
error column
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

The following program

int main()
{
int x=myFunc(3234);
}

gives me the error:

test.cpp:3:22: error: ‘myFunc’ was not declared in this scope
     int x=myFunc(3234);
                      ^

Here the "^" symbol points to the closing parenthesis (and the parenthesis
itself is even colored red). But the error is not at that column, but rather at
`myFunc` identifier.

Similar code but without function call parentheses leads to much more precise
error message:

test.cpp:3:11: error: ‘myFunc’ was not declared in this scope
 int x=myFunc/*(3234)*/;
   ^~

[Bug c++/71469] New: Print possible override candidates when a method is marked override but doesn't override

2016-06-09 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71469

Bug ID: 71469
   Summary: Print possible override candidates when a method is
marked override but doesn't override
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Currently for this code

struct Base
{
virtual int funct(int);
};
struct Der : Base
{
int func(int) override;
int funct() override;
};
int main(){}

g++ gives the following error:

test.cpp:7:9: error: ‘int Der::func(int)’ marked ‘override’, but does not
override
 int func(int) override;
 ^
test.cpp:8:9: error: ‘int Der::funct()’ marked ‘override’, but does not
override
 int funct() override;
 ^

Now one has to look into the declaration of Base to find out what's actually
wrong.

It'd be nice if g++ suggested possible candidates:

1. For the first case in the above example, int func(int), do something similar
to "no such member" error (i.e. suggest function name correction) if the
supposed override matches in parameter types;
2. For the second case, int funct(), just list the name(s) of virtual functions
in the base class matching the name of the supposed override.

This will make it much simpler to immediately see some trivial errors like e.g.
omitting a parameter or using int instead of long.

[Bug target/77457] Print intended value of constants in assembly output

2016-09-02 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77457

--- Comment #1 from Ruslan  ---
Same for version "GCC: (Ubuntu 6.1.1-3ubuntu11~14.04.1) 6.1.1 20160511"

[Bug target/77457] New: Print intended value of constants in assembly output

2016-09-02 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77457

Bug ID: 77457
   Summary: Print intended value of constants in assembly output
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following simple program:

void f()
{
volatile double x=0.352;
}

I compile it with `gcc test.c -S -masm=intel -fverbose-asm` and get the
following for the value of `x`:

.LC0:
.long   34359738
.long   1071023915
.ident  "GCC: (Ubuntu 5.3.0-3ubuntu1~14.04) 5.3.0 20151204"

To decipher it while reading the listing, one has to manually concatenate the
hexadecimal forms of these two numbers and then convert the result to
floating-point form. Not too handy.

For comparison, this is what I get from clang:

.LCPI0_0:
.quad   4600012688193243578 # double 0.35198
<...skipped some code...>
.ident  "Ubuntu clang version 3.8.0-svn257311-1~exp1 (trunk) (based on
LLVM 3.8.0)"


It would be really useful if GCC also printed the intended values of the
constants it emits. Namely, this should be done for float, double and long
double.

[Bug middle-end/77457] Print intended value of constants in assembly output

2016-09-03 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77457

--- Comment #5 from Ruslan  ---
(In reply to Andrew Pinski from comment #2)
> Note also should be shown in C99 hex floats because that is 100% exactly
> representable of the number in binary :).

Not sure if exactness is worth it. It'll make it harder to see what the decimal
value is (and decimal is the most commonly used radix by humans), while decimal
form, if printed with `max_digits10` digits, is enough to reproduce the
hex/binary form when needed.

[Bug c++/60962] New: b+(-2.f)*a generates multiplication instruction while b-2.f*a simplifies to addition&subtraction

2014-04-25 Thread b7.10110111 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60962

Bug ID: 60962
   Summary: b+(-2.f)*a generates multiplication instruction while
b-2.f*a simplifies to addition&subtraction
   Product: gcc
   Version: 4.8.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com

Created attachment 32681
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32681&action=edit
A procedure compilable into assembly to reproduce the bug

I've tried the following with -O3 -ffast-math -fassociative-math options (here
all operands are floats):

float lap0= point[-1]+point[1] + (-2.f)*point[0];

This code (a compilable version is attached) generates mulss/addss instructions,
with the constant stored in .rodata and loaded before use. That is about 6%
slower than this version:

float lap0= point[-1]+point[1] - 2.f*point[0];

which generates addss/subss code.

My g++ version is g++ (Ubuntu 4.8.1-2ubuntu1~12.04) 4.8.1. The full command
line is:

g++ -O3 -ffast-math -fassociative-math -o test1.s -S -masm=intel test.cpp

The problem also reproduces with g++ 4.5.


[Bug c++/66346] New: GCC computes log10(2.L) constant wrongly

2015-05-30 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66346

Bug ID: 66346
   Summary: GCC computes log10(2.L) constant wrongly
   Product: gcc
   Version: 5.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following code:

#include <iostream>
#include <cmath>
int main()
{
volatile long double two=2.L;
long double vol=log10(two);
long double con=log10(2.L);
long double ref=0.3010299956639811952L;
std::cout.precision(19);
std::cout << "computed constant: " << con << '\n';
std::cout << "computed volatile: " << vol << '\n';
std::cout << "reference value  : " << ref << "\n";
}

Here reference value was computed using Wolfram Mathematica via N[Log10[2],19]
command.
On an x86 system this code, compiled with gcc 4.5, 4.8, 4.9 and 5.1, gives me
this output:

computed constant: 0.301029995663981198
computed volatile: 0.3010299956639811952
reference value  : 0.3010299956639811952

On x86_64 system it prints this:

computed constant: 0.301029995663981198
computed volatile: 0.301029995663981198
reference value  : 0.3010299956639811952

This appears to be a wrong result. The same code compiled by clang++ 3.0.6ubuntu3
gives all values equal to the reference value.


[Bug libstdc++/66346] GCC computes log10(2.L) constant wrongly

2015-05-30 Thread b7.10110111 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66346

Ruslan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #3 from Ruslan  ---
Ah, that's what I'm doing wrong... Thanks, this bug is invalid then.