[Bug c++/100224] incorrect result when doing double vectorized

2021-04-23 Thread zhaoc at apache dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100224

--- Comment #2 from Zhao Chun  ---
(In reply to Richard Biener from comment #1)
> You are accessing 'double' via a pointer to uint64_t * here:
> 
> k = *((uint64_t*)data);
> 
> that violates type based aliasing rules.  You can use -fno-strict-aliasing
> to work around your bug or use
> 
> typedef uint64_t aliasing_uint64_t __attribute__((may_alias));
> k = *((aliasing_uint64_t*)data);

Thanks for your answer, it works for me.

However I don't quite understand why it works after such a change. Could you
please explain it more clearly?

[Bug libfortran/98301] random_init() is broken

2021-04-23 Thread vehre at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98301

Andre Vehreschild  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||vehre at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |vehre at gcc dot gnu.org

--- Comment #9 from Andre Vehreschild  ---
Going to implement the coarray part.

[Bug inline-asm/100178] Should the “short” be promoted to “int” when use inline asm?

2021-04-23 Thread gengqi at linux dot alibaba.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100178

--- Comment #3 from GengQi  ---
Thanks for your replies, I have taken enough information from them.

I hope this is made clear in the documentation soon.

[Bug rtl-optimization/100225] New: [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291

2021-04-23 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100225

Bug ID: 100225
   Summary: [8/9/10/11/12 Regression] ICE in
add_cross_iteration_register_deps, at ddg.c:291
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---
Target: aarch64-linux-gnu

gcc-11.0.1-alpha20210418 snapshot (g:b412ce8e961052e6becea3bc783a53e1d5feaa0f)
ICEs when compiling the following testcase w/ -O1 -fmodulo-sched:

void
vorbis_synthesis_lapout (void);

void
ov_info (int **lappcm, int ov_info_i)
{
  while (ov_info_i < 1)
    lappcm[ov_info_i++] = __builtin_alloca (1);

  vorbis_synthesis_lapout ();
}

% aarch64-linux-gnu-gcc-11.0.1 -O1 -fmodulo-sched -c oacjgazv.c
during RTL pass: sms
oacjgazv.c: In function 'ov_info':
oacjgazv.c:11:1: internal compiler error: in add_cross_iteration_register_deps,
at ddg.c:291
   11 | }
  | ^
0x8913fa add_cross_iteration_register_deps
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-11.0.1_alpha20210418/work/gcc-11-20210418/gcc/ddg.c:291
0x8913fa build_inter_loop_deps
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-11.0.1_alpha20210418/work/gcc-11-20210418/gcc/ddg.c:360
0x8913fa create_ddg(basic_block_def*, int)
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-11.0.1_alpha20210418/work/gcc-11-20210418/gcc/ddg.c:605
0x1a90489 sms_schedule
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-11.0.1_alpha20210418/work/gcc-11-20210418/gcc/modulo-sched.c:1513
0x1a9066f execute
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-11.0.1_alpha20210418/work/gcc-11-20210418/gcc/modulo-sched.c:3345

[Bug libstdc++/100226] New: [11/12 Regression] c++/11/bits/stl_tree.h:770:8: error: static assertion failed: comparison object must be invocable as const

2021-04-23 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100226

Bug ID: 100226
   Summary: [11/12 Regression] c++/11/bits/stl_tree.h:770:8:
error: static assertion failed: comparison object must
be invocable as const
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: marxin at gcc dot gnu.org
  Target Milestone: ---

It's taken from the ncurses package, which can be built with GCC 10.
It's likely caused by changes in the libstdc++ header files. I can build with
g++-10 -E && g++-11 ncurses.ii, but g++-11 ... fails.
The error message is also very difficult to decode.

$ g++ ncurses.ii -c
In file included from /usr/include/c++/11/set:60,
 from /usr/include/zypp/Arch.h:17,
 from /usr/include/zypp/sat/Solvable.h:22,
 from /usr/include/zypp/sat/SolvIterMixin.h:21,
 from /usr/include/zypp/sat/LocaleSupport.h:18,
 from
/home/abuild/rpmbuild/BUILD/libyui-4.2.1/libyui-ncurses-pkg/src/NCPkgFilterPattern.cc:44:
/usr/include/c++/11/bits/stl_tree.h: In instantiation of ‘static const _Key&
std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare,
_Alloc>::_S_key(std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare,
_Alloc>::_Const_Link_type) [with _Key =
std::pair, std::__cxx11::basic_string >;
_Val = std::pair,
std::__cxx11::basic_string >; _KeyOfValue =
std::_Identity,
std::__cxx11::basic_string > >; _Compare = paircmp; _Alloc =
std::allocator,
std::__cxx11::basic_string > >; std::_Rb_tree<_Key, _Val, _KeyOfValue,
_Compare, _Alloc>::_Const_Link_type = const
std::_Rb_tree_node,
std::__cxx11::basic_string > >*]’:
/usr/include/c++/11/bits/stl_tree.h:2069:47:   required from
‘std::pair
std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare,
_Alloc>::_M_get_insert_unique_pos(const key_type&) [with _Key =
std::pair, std::__cxx11::basic_string >;
_Val = std::pair,
std::__cxx11::basic_string >; _KeyOfValue =
std::_Identity,
std::__cxx11::basic_string > >; _Compare = paircmp; _Alloc =
std::allocator,
std::__cxx11::basic_string > >; std::_Rb_tree<_Key, _Val, _KeyOfValue,
_Compare, _Alloc>::key_type = std::pair,
std::__cxx11::basic_string >]’
/usr/include/c++/11/bits/stl_tree.h:2122:4:   required from
‘std::pair, bool> std::_Rb_tree<_Key, _Val,
_KeyOfValue, _Compare, _Alloc>::_M_insert_unique(_Arg&&) [with _Arg =
std::pair, std::__cxx11::basic_string >;
_Key = std::pair,
std::__cxx11::basic_string >; _Val =
std::pair, std::__cxx11::basic_string >;
_KeyOfValue = std::_Identity,
std::__cxx11::basic_string > >; _Compare = paircmp; _Alloc =
std::allocator,
std::__cxx11::basic_string > >]’
/usr/include/c++/11/bits/stl_set.h:521:25:   required from ‘std::pair, _Compare, typename
__gnu_cxx::__alloc_traits<_Alloc>::rebind<_Key>::other>::const_iterator, bool>
std::set<_Key, _Compare, _Alloc>::insert(std::set<_Key, _Compare,
_Alloc>::value_type&&) [with _Key = std::pair,
std::__cxx11::basic_string >; _Compare = paircmp; _Alloc =
std::allocator,
std::__cxx11::basic_string > >; typename std::_Rb_tree<_Key, _Key,
std::_Identity<_Tp>, _Compare, typename
__gnu_cxx::__alloc_traits<_Alloc>::rebind<_Key>::other>::const_iterator =
std::_Rb_tree,
std::__cxx11::basic_string >, std::pair,
std::__cxx11::basic_string >,
std::_Identity,
std::__cxx11::basic_string > >, paircmp,
std::allocator,
std::__cxx11::basic_string > > >::const_iterator; typename
__gnu_cxx::__alloc_traits<_Alloc>::rebind<_Key>::other =
std::allocator,
std::__cxx11::basic_string > >; typename
__gnu_cxx::__alloc_traits<_Alloc>::rebind<_Key> =
__gnu_cxx::__alloc_traits,
std::__cxx11::basic_string > >,
std::pair, std::__cxx11::basic_string >
>::rebind,
std::__cxx11::basic_string > >; typename _Alloc::value_type =
std::pair, std::__cxx11::basic_string >;
std::set<_Key, _Compare, _Alloc>::value_type =
std::pair, std::__cxx11::basic_string
>]’
/home/abuild/rpmbuild/BUILD/libyui-4.2.1/libyui-ncurses-pkg/src/NCPkgFilterPattern.cc:343:28:
  required from here
/usr/include/c++/11/bits/stl_tree.h:770:8: error: static assertion failed:
comparison object must be invocable as const
  770 | 
  |^
/usr/include/c++/11/bits/stl_tree.h:770:8: note: ‘std::is_invocable_v, std::allocator >,
std::__cxx11::basic_string, std::allocator >
>&, const std::pair,
std::allocator >, std::__cxx11::basic_string, std::allocator > >&>’ evaluates to false

[Bug libstdc++/100226] [11/12 Regression] c++/11/bits/stl_tree.h:770:8: error: static assertion failed: comparison object must be invocable as const

2021-04-23 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100226

--- Comment #1 from Martin Liška  ---
Created attachment 50656
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50656&action=edit
test-case

[Bug libstdc++/100226] [11/12 Regression] c++/11/bits/stl_tree.h:770:8: error: static assertion failed: comparison object must be invocable as const

2021-04-23 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100226

Martin Liška  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||redi at gcc dot gnu.org
   Last reconfirmed||2021-04-23
 Status|UNCONFIRMED |NEW

[Bug c++/100224] incorrect result when doing double vectorized

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100224

--- Comment #3 from Richard Biener  ---
(In reply to Zhao Chun from comment #2)
> (In reply to Richard Biener from comment #1)
> > You are accessing 'double' via a pointer to uint64_t * here:
> > 
> > k = *((uint64_t*)data);
> > 
> > that violates type based aliasing rules.  You can use -fno-strict-aliasing
> > to work around your bug or use
> > 
> > typedef uint64_t aliasing_uint64_t __attribute__((may_alias));
> > k = *((aliasing_uint64_t*)data);
> 
> Thanks for your answer, it works for me.
> 
> However I don't quite understand why it works after such a change. Could you
> please explain it more clearly?

Using a may_alias attributed type tells GCC to treat it like a 'character type'
in terms of what the C/C++ standards allow.  Note that this is not portable.
A solution that works with all compilers I know is doing

   memcpy (&k, data, sizeof (uint64_t));
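
As an illustration of the memcpy approach above, here is a minimal sketch. The
function names double_bits and double_from_bits are hypothetical and are not
taken from the reporter's code, and the sketch assumes that double and uint64_t
have the same size (checked by the static_assert).

```cpp
#include <cstdint>
#include <cstring>

static_assert (sizeof (double) == sizeof (std::uint64_t),
               "this sketch assumes a 64-bit double");

// Read the object representation of a double without violating
// type-based aliasing rules: copy the bytes instead of casting the pointer.
inline std::uint64_t double_bits (const double *data)
{
  std::uint64_t k;
  std::memcpy (&k, data, sizeof k);
  return k;
}

// The reverse direction, again via memcpy (hypothetical helper).
inline double double_from_bits (std::uint64_t k)
{
  double d;
  std::memcpy (&d, &k, sizeof d);
  return d;
}
```

Compilers commonly turn a fixed-size memcpy like this into a plain load or
move, so the portable form usually need not cost more than the pointer cast.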

[Bug libstdc++/100226] [11/12 Regression] c++/11/bits/stl_tree.h:770:8: error: static assertion failed: comparison object must be invocable as const

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100226

Richard Biener  changed:

   What|Removed |Added

  Known to work||10.3.0
   Target Milestone|--- |11.0
   Keywords||needs-reduction, rejects-valid

--- Comment #2 from Richard Biener  ---
I guess you want to uninclude it and reduce it w/o expanding the std library
headers.

[Bug rtl-optimization/100225] [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100225

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug tree-optimization/99971] GCC generates partially vectorized and scalar code at once

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99971

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:700e542971251b11623cce877075567815f72965

commit r12-79-g700e542971251b11623cce877075567815f72965
Author: Richard Biener 
Date:   Fri Apr 9 09:35:51 2021 +0200

tree-optimization/99971 - improve BB vect dependence analysis

We can use TBAA even when we have a DR, do so.  For the testcase
that means fully vectorizing it instead of only vectorizing
the first store group resulting in suboptimal code.

2021-04-09  Richard Biener  

PR tree-optimization/99971
* tree-vect-data-refs.c (vect_slp_analyze_node_dependences):
Always use TBAA for loads.

* g++.dg/vect/slp-pr99971.cc: New testcase.

[Bug tree-optimization/99971] GCC generates partially vectorized and scalar code at once

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99971

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
  Known to work||12.0

--- Comment #9 from Richard Biener  ---
Fixed for GCC 12.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #12 from Jakub Jelinek  ---
They do.  Though, in the combined patch I'm still a little bit worried about
the first 4 modified peephole2s, the last 4 look good to me.
The last 4 are where the original insn did a normal DFmode store and your patch
restores those DFmode stores.
But the first 4 had an atomic store followed by a DFmode read, shouldn't those
preserve an atomic store instead of the DFmode store?  A non-atomic DFmode read
is one thing, but it could be followed later by atomic loads, both into DFmode
and ones into DImode that would check the whole bit pattern.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #13 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #12)
> They do.  Though, in the combined patch I'm still a little bit worried about
> the first 4 modified peephole2s, the last 4 look good to me.
> The last 4 are where the original insn did a normal DFmode store and your
> patch restores those DFmode stores.
> But the first 4 had an atomic store followed by a DFmode read, shouldn't
> those preserve an atomic store instead of the DFmode store?  A non-atomic
> DFmode read is one thing, but it could be followed later by atomic loads,
> both into DFmode and ones into DImode that would check the whole bit pattern.

DFmode loads and stores *are* atomic, this is what the optimization is based
on.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #14 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #13)

> DFmode loads and stores *are* atomic, this is what the optimization is based
> on.

Loads and stores to/from x87 and SSE registers, to be clear.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #15 from Jakub Jelinek  ---
Yes, but do they preserve all the bits and never modify any bit patterns,
including qNaNs and sNaNs?  I thought the point of using the fistp was that it
preserves everything.

[Bug fortran/100227] New: write with implicit loop

2021-04-23 Thread priv123 at hotmail dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100227

Bug ID: 100227
   Summary: write with implicit loop
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: priv123 at hotmail dot fr
  Target Milestone: ---

Created attachment 50657
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50657&action=edit
example code

The code in attachment runs properly with gfortran 6.3.0 with -O1 and with
gfortran 8.3.0 with -O0, but fails with gfortran 8.3.0 with -O1:

+ gfortran --version
GNU Fortran (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

+ gfortran -Wall -Wextra -O1 pb_write.F90
+ ./a.out
 KO if -O1/gcc8   4.57776182E-41  -773.585938   3.06225753E-41
 OK: add index1   1.   5   5.   9   9.
 OK: rearranged   1.   5.   9.
 OK: two vars 1.   5.   9.

+ gfortran -Wall -Wextra -O0 pb_write.F90
+ ./a.out
 KO if -O1/gcc8   1.   5.   9.
 OK: add index1   1.   5   5.   9   9.
 OK: rearranged   1.   5.   9.
 OK: two vars 1.   5.   9.

As seen in bug 86837, it is also OK with -O1 -fno-frontend-optimize.

I'm filing a new bug since that one should have been fixed in 8.2.1, yet the
failure still seems to occur with 8.3.0.

I will check and post results with the latest available docker images to see
if it is fixed after 8.3.0...

Thanks for reading!

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #16 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #15)
> Yes, but do they preserve all the bits and never modify any bit patterns,
> including qNaNs and sNaNs?  I thought the point of using the fistp was that
> it preserves everything.

Hm, they don't...

[Bug other/100174] Binary floating-point conversion under source-gcc/gcc/real.[c\h] test on x86-64

2021-04-23 Thread 608410104 at alum dot ccu.edu.tw via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100174

--- Comment #2 from LinoPeng <608410104 at alum dot ccu.edu.tw> ---
Hi Andrew Pinski,

I am new here. I am sorry if I have offended you.

Float a = 0.. Its 23 fraction bits are "01010101010011001001100".
I traced the gcc-9.2.0 source code starting from real.c. I found that the
other 41 bits of sig[SIGSZ-1] are cleared to zero (sig[SIGSZ-1] has 64 bits ->
64 - 23 = 41). The same happens for double.
Before clear sig[SIGSZ-1] = 01010101 01001100 10011000 0101 0110
0110 10010100 0100011 
After clear sig[SIGSZ-1] = 01010101 01001100 10011000  
  000
The statement below is from the real.c source code; r is a const REAL_VALUE_TYPE.
"sig = (r->sig[SIGSZ-1] >> (HOST_BITS_PER_LONG - 24)) & 0x7f;"
What I mean is: why not just truncate to 23 bits in sig[SIGSZ-1]? Whether or
not the function "clear_significand_below" is used, it does not affect the
fraction bits recorded in sig.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #17 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #16)
> (In reply to Jakub Jelinek from comment #15)
> > Yes, but do they preserve all the bits and never modify any bit patterns,
> > including qNaNs and sNaNs?  I thought the point of using the fistp was that
> > it preserves everything.
> 
> Hm, they don't...

This probably means we have to remove x87 peepholes, where an atomic store is
followed by a DFmode read. x87 can't load and store DFmode untouched without
fild/fistp pair.

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
Summary|[10.3, 11, 12 Regression] used caller-saved register not preserved across a call. |[10/11/12 Regression] used caller-saved register not preserved across a call.
   Keywords||ra

--- Comment #30 from Richard Biener  ---
(In reply to Iain Sandoe from comment #29)
> what is also somewhat peculiar is that replacing the first function in the
> reduced test case with "extern void ___UTF_8_put(char *a, int b);" changes
> the code-gen for the second function.

That might hint at IPA RA which you can try disabling via -fno-ipa-ra which
in turn hints at a target issue.  I'm seeing whether a cross reproduces the
issue on your reduced testcase.

Btw, the GIMPLE optimization change just exposes the issue - it can have no
influence on the used registers.

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #2 from Jakub Jelinek  ---
Created attachment 50658
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50658&action=edit
gcc11-pr100217.patch

Untested fix.  IMHO when we have a hard reg in the inline asm, we just need to
honor it, trying to force it into a pseudo and then subreg would just mean the
user chosen reg is not guaranteed anymore.

[Bug libstdc++/100223] Missing early return in std::partial_sort

2021-04-23 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100223

--- Comment #1 from Jonathan Wakely  ---
Arguably, the caller can do this check if they think it can occur in their
code. That way all calls to the algorithm don't pay for the check.

But it's probably cheap enough to check anyway.
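
A caller-side check of the kind described above could look like the following
sketch. It assumes the case in question is an empty sorted range (first ==
middle), which is a reading of the bug title and is not spelled out in this
thread. The names partial_sort_checked and smallest are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: skip the call when the range to be sorted is empty.
template <typename RandomIt>
void partial_sort_checked (RandomIt first, RandomIt middle, RandomIt last)
{
  if (first == middle)
    return;  // nothing to sort, leave [first, last) untouched
  std::partial_sort (first, middle, last);
}

// Illustration helper: return the k smallest elements of v in sorted order.
inline std::vector<int> smallest (std::vector<int> v, std::size_t k)
{
  k = std::min (k, v.size ());
  partial_sort_checked (v.begin (),
                        v.begin () + static_cast<std::ptrdiff_t> (k),
                        v.end ());
  v.resize (k);
  return v;
}
```

Only callers that can actually pass an empty sorted range pay for the extra
comparison this way, which is the trade-off mentioned above.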

[Bug libstdc++/100226] [11/12 Regression] c++/11/bits/stl_tree.h:770:8: error: static assertion failed: comparison object must be invocable as const

2021-04-23 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100226

--- Comment #3 from Jonathan Wakely  ---
The static assert was added intentionally, the comparison function used with
the container must have a const-qualified operator().

I would check that in the ncurses code first.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #18 from Jakub Jelinek  ---
Indeed.

[Bug target/100228] New: repeated std::atomic<double>::load() misoptimized by x87 peephole

2021-04-23 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100228

Bug ID: 100228
   Summary: repeated std::atomic<double>::load() misoptimized by
x87 peephole
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: aoliva at gcc dot gnu.org
  Target Milestone: ---
Target: i686-pc-linux-gnu

compile this with -O2 -mfpmath=387 -mno-sse

#include <atomic>

int main() {
  std::atomic<double> a0;
  std::atomic<double> a1(1.0);
  a0 = a1.load();
  if (a0.load() != a1.load())
    __builtin_abort ();
}

it aborts because the first a1.load() is optimized by sync.md:398:

(define_peephole2
  [(set (match_operand:DF 0 "memory_operand")
(match_operand:DF 1 "any_fp_register_operand"))
   (set (mem:BLK (scratch:SI))
(unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE))
   (set (match_operand:DF 2 "fp_register_operand")
(unspec:DF [(match_operand:DI 3 "memory_operand")]
   UNSPEC_FILD_ATOMIC))
   (set (match_operand:DI 4 "memory_operand")
(unspec:DI [(match_dup 2)]
   UNSPEC_FIST_ATOMIC))]
  "!TARGET_64BIT
   && peep2_reg_dead_p (4, operands[2])
   && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
  [(const_int 0)]
{
  emit_insn (gen_memory_blockage ());
  emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
  DONE;
})

the memory location operands[0] stored into by the first instruction is reused
and loaded again in the second a1.load(), but after this peephole, there's no
store before the load.

I don't think we have infrastructure in peephole to test whether there are any
other uses of a store, so I think we have to keep it.  There are other
variations of this peephole around it, that appear to have the same problem.

[Bug libstdc++/100226] [11/12 Regression] c++/11/bits/stl_tree.h:770:8: error: static assertion failed: comparison object must be invocable as const

2021-04-23 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100226

--- Comment #4 from Jonathan Wakely  ---
The template argument '_Compare = paircmp' shows the type used as the
comparison object.

So paircmp::operator() needs to be const.
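
A minimal sketch of a comparison object that satisfies the static assertion
follows. PairCmp is a hypothetical stand-in for the paircmp type named in the
diagnostic (the real definition is not shown in this thread), and the key type
is assumed to be a pair of strings.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

using StringPair = std::pair<std::string, std::string>;

// Hypothetical stand-in for the 'paircmp' type named in the diagnostic.
struct PairCmp
{
  // The trailing 'const' is the important part: the container may call
  // the comparison object through a const reference.
  bool operator() (const StringPair &a, const StringPair &b) const
  {
    return a < b;
  }
};

// Helper for illustration only: insert three pairs and report the set size.
inline std::size_t count_unique ()
{
  std::set<StringPair, PairCmp> s;
  s.insert (StringPair ("a", "b"));
  s.insert (StringPair ("a", "b"));  // duplicate, not inserted again
  s.insert (StringPair ("a", "c"));
  return s.size ();
}
```

Dropping the const qualifier from operator() in this sketch should give the
same kind of "comparison object must be invocable as const" error with the
GCC 11 headers.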

[Bug libstdc++/100226] [11/12 Regression] c++/11/bits/stl_tree.h:770:8: error: static assertion failed: comparison object must be invocable as const

2021-04-23 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100226

Martin Liška  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|NEW |RESOLVED

--- Comment #5 from Martin Liška  ---
(In reply to Jonathan Wakely from comment #4)
> The template argument '_Compare = paircmp' shows the type used as the
> comparison object.
> 
> So paircmp::operator() needs to be const.

I can confirm that it works, thank you for the help!

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #31 from Iain Sandoe  ---
(In reply to Richard Biener from comment #30)
> (In reply to Iain Sandoe from comment #29)
> > what is also somewhat peculiar is that replacing the first function in the
> > reduced test case with "extern void ___UTF_8_put(char *a, int b);" changes
> > the code-gen for the second function.
> 
> That might hint at IPA RA which you can try disabling via -fno-ipa-ra which
> in turn hints at a target issue.  

yeah, it does switch back to using rbx, at least on the reduced test case.

> Btw, the GIMPLE optimization change just exposes the issue - it can have no
> influence on the used registers.

indeed, it seemed more likely to be "exposed by".

[Bug target/99932] OpenACC/nvptx offloading execution regressions starting with CUDA 11.2-era Nvidia Driver 460.27.04

2021-04-23 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99932

--- Comment #2 from Tom de Vries  ---
Minimal example:
...
$ cat libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
int
main (void)
{
  int vectors_max = -1;
#pragma acc parallel \
  num_gangs (1) \
  num_workers (1) \
  vector_length (32) \
  copy (vectors_max)
  {
#pragma acc loop gang reduction (max: vectors_max)
    for (int i = 0; i < 2; i++)
#pragma acc loop worker reduction (max: vectors_max)
      for (int j = 0; j < 2; j++)
#pragma acc loop vector reduction (max: vectors_max)
        for (int k = 0; k < 32; k++)
          vectors_max = k;
  }

  if (vectors_max != 31)
    __builtin_abort ();

  return 0;
}
...

Passes with GOMP_NVPTX_JIT=-O0, starts failing at GOMP_NVPTX_JIT=-O1.

[Bug libstdc++/100226] [11/12 Regression] c++/11/bits/stl_tree.h:770:8: error: static assertion failed: comparison object must be invocable as const

2021-04-23 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100226

--- Comment #6 from Jonathan Wakely  ---
Bug 89370 would really help simplify this diagnostic.

The last three lines would be:

.../src/NCPkgFilterPattern.cc:343:28:   required from here
/usr/include/c++/11/bits/stl_tree.h:770:8: error: static assertion failed:
comparison object must be invocable as const
  770 | 
  |^
/usr/include/c++/11/bits/stl_tree.h:770:8: note: ‘std::is_invocable_v’ evaluates to false

Which is pretty clear, I think.

[Bug libstdc++/100179] [12 regression] xtreme-header-2_a.H fails on arm-eabi

2021-04-23 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100179

--- Comment #6 from Christophe Lyon  ---
Yes, I confirm it's now fixed, thanks!

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #19 from Jakub Jelinek  ---
Perhaps the best approach would be to construct a testcase for each of the
peephole2s, try some bit pattern that isn't preserved through the FPU except by
fild/fistp, and see what enabling/disabling each of the peephole2s does to it.
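
The bit-pattern check that such a testcase needs could be sketched as below.
This is only the comparison part: a real testcase would go through
std::atomic<double> operations so that the peephole2 patterns can match. The
name survives_double_copy and the sample pattern are hypothetical, and the
volatile copy encourages but does not guarantee a round trip through an FPU
register.

```cpp
#include <cstdint>
#include <cstring>

// Copy a 64-bit pattern into a double, pass it through a volatile double
// so that a store and a load of type double must happen, and report
// whether the pattern came back unchanged.  Whether the copy actually goes
// through an x87 register depends on the target and the compiler options.
inline bool survives_double_copy (std::uint64_t pattern)
{
  double in;
  std::memcpy (&in, &pattern, sizeof in);
  volatile double slot = in;   // store as double
  double out = slot;           // load as double
  std::uint64_t result;
  std::memcpy (&result, &out, sizeof result);
  return result == pattern;
}

// A signalling-NaN pattern (exponent all ones, quiet bit clear, non-zero
// payload): a candidate for a value that a plain x87 load/store may alter.
const std::uint64_t snan_pattern = 0x7FF0000000000001ULL;
```

Calling survives_double_copy (snan_pattern) with -O2 -mfpmath=387 -mno-sse on
i686, with and without each peephole2, would be one way to see which patterns
are affected. The result is deliberately not stated here because it depends on
the generated code.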

[Bug fortran/100227] write with implicit loop

2021-04-23 Thread priv123 at hotmail dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100227

Mathieu  changed:

   What|Removed |Added

  Known to work||6.3.0
  Known to fail||10.3.0, 9.3.0

--- Comment #1 from Mathieu  ---
same problem reproduced with 9.3.0 and 10.3.0 (from docker)

[Bug target/100228] repeated std::atomic<double>::load() misoptimized by x87 peephole

2021-04-23 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100228

--- Comment #1 from Jonathan Wakely  ---
Is this the same cause as bug 100182?

[Bug libstdc++/100179] [12 regression] xtreme-header-2_a.H fails on arm-eabi

2021-04-23 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100179

--- Comment #7 from Jonathan Wakely  ---
Great, thanks for the report, which allowed this to be fixed for gcc-11.

[Bug target/100228] repeated std::atomic<double>::load() misoptimized by x87 peephole

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100228

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org
 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Jakub Jelinek  ---
Yes.

*** This bug has been marked as a duplicate of bug 100182 ***

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Jakub Jelinek  changed:

   What|Removed |Added

 CC||aoliva at gcc dot gnu.org

--- Comment #20 from Jakub Jelinek  ---
*** Bug 100228 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/99971] GCC generates partially vectorized and scalar code at once

2021-04-23 Thread andysem at mail dot ru via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99971

--- Comment #10 from andysem at mail dot ru ---
Thanks. Will this be backported to 10 and 11 branches?

[Bug libstdc++/100223] Missing early return in std::partial_sort

2021-04-23 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100223

--- Comment #2 from 康桓瑋  ---
(In reply to Jonathan Wakely from comment #1)
> Arguably, the caller can do this check if they think it can occur in their
> code. That way all calls to the algorithm don't pay for the check.
> 
> But it's probably cheap enough to check anyway.

Exactly, since <algorithm> is full of such checks, I think there is nothing
wrong with adding one for partial_sort.

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #32 from Iain Sandoe  ---
(In reply to Iain Sandoe from comment #31)
> (In reply to Richard Biener from comment #30)
> > (In reply to Iain Sandoe from comment #29)
> > > what is also somewhat peculiar is that replacing the first function in the
> > > reduced test case with "extern void ___UTF_8_put(char *a, int b);" changes
> > > the code-gen for the second function.
> > 
> > That might hint at IPA RA which you can try disabling via -fno-ipa-ra which
> > in turn hints at a target issue.  
> 
> yeah, it does switch back to using rbx, at least on the reduced test case.

(also on the original).

I wonder if the problem is that IPA can't "see" the lazy symbol resolver, so it
just sees a call to ___UTF_8_put and doesn't know that this will be resolved
indirectly.

.. but something similar must apply to PLT and targets with linker veneers ?

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

Richard Biener  changed:

   What|Removed |Added

 CC||vmakarov at gcc dot gnu.org

--- Comment #33 from Richard Biener  ---
(In reply to Iain Sandoe from comment #32)
> (In reply to Iain Sandoe from comment #31)
> > (In reply to Richard Biener from comment #30)
> > > (In reply to Iain Sandoe from comment #29)
> > > > what is also somewhat peculiar is that replacing the first function in
> > > > the reduced test case with "extern void ___UTF_8_put(char *a, int b);"
> > > > changes the code-gen for the second function.
> > > 
> > > That might hint at IPA RA which you can try disabling via -fno-ipa-ra
> > > which in turn hints at a target issue.  
> > 
> > yeah, it does switch back to using rbx, at least on the reduced test case.
> 
> (also on the original).
> 
> I wonder if the problem is that IPA can't "see" the lazy symbol resolver, so
> it just sees a call to ___UTF_8_put and doesn't know that this will be
> resolved indirectly.
> 
> .. but something similar must apply to PLT and targets with linker veneers ?

I don't know how IPA RA works in detail but obviously the target has to
expose this detail.  It looks like IPA RA causes us to add some notes to
call insns which are supposed to describe those details and there's
collect_fn_hard_reg_usage which looks at the target function (but likely
does not include the ABI details of the call itself, in this case the
resolver).

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #34 from Richard Biener  ---
(In reply to Richard Biener from comment #33)
> (In reply to Iain Sandoe from comment #32)
> > (In reply to Iain Sandoe from comment #31)
> > > (In reply to Richard Biener from comment #30)
> > > > (In reply to Iain Sandoe from comment #29)
> > > > > what is also somewhat peculiar is that replacing the first function 
> > > > > in the
> > > > > reduced test case with "extern void ___UTF_8_put(char *a, int b);" 
> > > > > changes
> > > > > the code-gen for the second function.
> > > > 
> > > > That might hint at IPA RA which you can try disabling via -fno-ipa-ra 
> > > > which
> > > > in turn hints at a target issue.  
> > > 
> > > yeah, it does switch back to using rbx, at least on the reduced test case.
> > 
> > (also on the original).
> > 
> > I wonder if the problem is that IPA can't "see" the lazy symbol resolver, so
> > it just sees a call to ___UTF_8_put and doesn't know that this will be
> > resolved indirectly.
> > 
> > .. but something similar must apply to PLT and targets with linker veneers ?
> 
> I don't know how IPA RA works in detail but obviously the target has to
> expose this detail.  It looks like IPA RA causes us to add some notes to
> call insns which are supposed to describe those details and there's
> collect_fn_hard_reg_usage which looks at the target function (but likely
> does not include the ABI details of the call itself, in this case the
> resolver).

@deftypevr {Target Hook} bool TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS
Set to true if each call that binds to a local definition explicitly
clobbers or sets all non-fixed registers modified by performing the call.
That is, by the call pattern itself, or by code that might be inserted by the
linker (e.g.@: stubs, veneers, branch islands), but not including those
modifiable by the callee.  The affected registers may be mentioned explicitly
in the call pattern, or included as clobbers in CALL_INSN_FUNCTION_USAGE.
The default version of this hook is set to false.  The purpose of this hook
is to enable the fipa-ra optimization.
@end deftypevr

might be relevant - though when compiling for a shared library the call
to ___UTF_8_put does not bind locally (but then IPA RA shouldn't apply
either I guess).  So, does ___UTF_8_put bind locally?

[Bug tree-optimization/99971] GCC generates partially vectorized and scalar code at once

2021-04-23 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99971

--- Comment #11 from rguenther at suse dot de  ---
On Fri, 23 Apr 2021, andysem at mail dot ru wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99971
> 
> --- Comment #10 from andysem at mail dot ru ---
> Thanks. Will this be backported to 10 and 11 branches?

I don't plan to since it isn't a regression as far as I know, it
doesn't apply to GCC 10 so definitely not there.  I'll consider
for GCC 11.

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #35 from Richard Biener  ---
Which means another possible candidate for the "bug" is darwin_binds_local_p

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #36 from Iain Sandoe  ---
(In reply to Richard Biener from comment #35)
> Which means another possible candidate for the "bug" is darwin_binds_local_p

yeah... see below.

> > > .. but something similar must apply to PLT and targets with linker 
> > > veneers ?
> > 
> > I don't know how IPA RA works in detail but obviously the target has to
> > expose this detail.  It looks like IPA RA causes us to add some notes to
> > call insns which are supposed to describe those details and there's
> > collect_fn_hard_reg_usage which looks at the target function (but likely
> > does not include the ABI details of the call itself, in this case the
> > resolver).


> @deftypevr {Target Hook} bool TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS
> Set to true if each call that binds to a local definition explicitly
> clobbers or sets all non-fixed registers modified by performing the call.
> That is, by the call pattern itself, or by code that might be inserted by the
> linker (e.g.@: stubs, veneers, branch islands), but not including those
> modifiable by the callee.  The affected registers may be mentioned explicitly
> in the call pattern, or included as clobbers in CALL_INSN_FUNCTION_USAGE.
> The default version of this hook is set to false.  The purpose of this hook
> is to enable the fipa-ra optimization.
> @end deftypevr

thanks for the pointer, I'll take a look at that when I have some cycles.
I guess it was never added at the time the IPA stuff was done... and somehow we
"got away with it" mostly.

> might be relevant - though when compiling for a shared library the call
> to ___UTF_8_put does not bind locally (but then IPA RA shouldn't apply
> either I guess).  So, does ___UTF_8_put bind locally?

extern void ___UTF_8_put
   (char* *ptr, unsigned int c)

If it does, then that's also a bug :), will have to check (sometime later).

(we are always building with fPIC for x86_64, and don't specifically identify
that the result will be a shlib [all Darwin exes are DSOs too] -  although
Linux does identify shlibs as something special).

[Bug tree-optimization/99726] [10 Regression] ICE in create_intersect_range_checks_index, at tree-data-ref.c:1855 since r10-4762-gf9d6338bd15ce1fae36bf25d3a0545e9678ddc58

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99726

--- Comment #8 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:7e2db68a77fb211898a024c5a7ad7c4449c7e355

commit r10-9749-g7e2db68a77fb211898a024c5a7ad7c4449c7e355
Author: Richard Sandiford 
Date:   Fri Apr 23 10:09:38 2021 +0100

data-ref: Tighten index-based alias checks [PR99726]

create_intersect_range_checks_index tries to create a runtime
alias check based on index comparisons.  It looks through the
access functions for the two DRs to find a SCEV for the loop
that is being versioned and converts a DR_STEP-based check
into an index-based check.

However, there isn't any reliable sign information in the types,
so the code expects the value of the IV step (when interpreted as
signed) to be negative iff the DR_STEP (when interpreted as signed)
is negative.

r10-4762 added another assert related to this assumption and the
assert fired for the testcase in the PR.  The sign of the IV step
didn't match the sign of the DR_STEP.

I think this is actually showing what was previously a wrong-code bug.
The signs didn't match because the DRs contained *two* access function
SCEVs for the loop being versioned.  It doesn't look like the code
is set up to deal with this, since it checks each access function
independently and treats it as the sole source of DR_STEP.

The patch therefore moves the main condition out of the loop.
This also has the advantage of not building a tree for one access
function only to throw it away if we find an inner function that
makes the comparison invalid.

gcc/
PR tree-optimization/99726
* tree-data-ref.c (create_intersect_range_checks_index): Bail
out if there is more than one access function SCEV for the loop
being versioned.

gcc/testsuite/
PR tree-optimization/99726
* gcc.target/i386/pr99726.c: New test.

(cherry picked from commit b5c7accfb56a7347008f629be4c7344dd849b1b1)
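
As a rough illustration of the kind of runtime alias check the commit message describes (this is not GCC's implementation), a versioned loop can compare the index ranges touched by two accesses and take the vectorizable path only when they are disjoint. The names index_ranges_disjoint and add_one_shifted, and the inclusive-range convention, are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the inclusive index ranges
   [lo_a, hi_a] and [lo_b, hi_b] do not overlap.  */
static bool
index_ranges_disjoint (size_t lo_a, size_t hi_a, size_t lo_b, size_t hi_b)
{
  return hi_a < lo_b || hi_b < lo_a;
}

/* Hypothetical versioned loop: x[dst + i] = x[src + i] + 1 for i in [0, n).  */
static void
add_one_shifted (int *x, size_t dst, size_t src, size_t n)
{
  if (n == 0)
    return;

  if (index_ranges_disjoint (dst, dst + n - 1, src, src + n - 1))
    {
      /* The accesses cannot alias, so the iterations are independent
         and a compiler would be free to vectorize this copy.  */
      for (size_t i = 0; i < n; i++)
        x[dst + i] = x[src + i] + 1;
    }
  else
    {
      /* Possible overlap: keep the original scalar iteration order.  */
      for (size_t i = 0; i < n; i++)
        x[dst + i] = x[src + i] + 1;
    }
}
```

The check is only valid if each access really is described by a single index range, which is the assumption the patch says breaks down when a data reference has two access function SCEVs for the same loop.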

[Bug tree-optimization/98268] [10 Regression] ICE: verify_gimple failed with LTO and SVE

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98268

--- Comment #11 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:18a190c3ee32548de3888b7a64f701999893727b

commit r10-9750-g18a190c3ee32548de3888b7a64f701999893727b
Author: Richard Sandiford 
Date:   Fri Apr 23 10:09:39 2021 +0100

gimple-fold: Recompute ADDR_EXPR flags after folding a TMR [PR98268]

The gimple verifier picked up that an ADDR_EXPR of a MEM_REF was not
marked TREE_CONSTANT even though the address was in fact invariant.
This came from folding a &TARGET_MEM_REF with constant operands to
a &MEM_REF; &TARGET_MEM_REF is never treated as TREE_CONSTANT
but &MEM_REF can be.

gcc/
PR tree-optimization/98268
* gimple-fold.c (maybe_canonicalize_mem_ref_addr): Call
recompute_tree_invariant_for_addr_expr after successfully
folding a TARGET_MEM_REF that occurs inside an ADDR_EXPR.

gcc/testsuite/
PR tree-optimization/98268
* gcc.target/aarch64/sve/pr98268-1.c: New test.
* gcc.target/aarch64/sve/pr98268-2.c: Likewise.

(cherry picked from commit c778968339afd140380a46edbade054667c7dce2)

[Bug tree-optimization/98726] [10/11 Regression] SVE: tree check: expected integer_cst, have poly_int_cst in to_wide, at tree.h:5984

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98726

--- Comment #13 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:8849e4a94550ffc9a564c105f0cefed5f42b3a7d

commit r10-9752-g8849e4a94550ffc9a564c105f0cefed5f42b3a7d
Author: Richard Biener 
Date:   Fri Apr 23 10:09:40 2021 +0100

middle-end/98726 - fix VECTOR_CST element access

This fixes VECTOR_CST element access with POLY_INT elements and
allows to produce dump files of the PR98726 testcase without
ICEing.

2021-04-23  Richard Biener  

PR middle-end/98726
* tree.h (vector_cst_int_elt): Remove.
* tree.c (vector_cst_int_elt): Use poly_wide_int for computations,
make static.

(cherry picked from commit 4b59dbb5d6759e43bfa23161a8d3feb9ae969e1a)

[Bug target/98136] [8/9/10 Regression] [aarch64] Internal compiler error with large classes and virtual methods since r8-5967-gf5470a77425a54efebfe1732488c40f05ef176d0

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98136

--- Comment #6 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:63da018de828b4792e95d1431118fd10efef87d1

commit r10-9751-g63da018de828b4792e95d1431118fd10efef87d1
Author: Richard Sandiford 
Date:   Fri Apr 23 10:09:40 2021 +0100

aarch64: Tweak post-RA handling of CONST_INT moves [PR98136]

This PR is a regression caused by r8-5967, where we replaced
a call to aarch64_internal_mov_immediate in aarch64_add_offset
with a call to aarch64_force_temporary, which in turn uses the
normal emit_move_insn{,_1} routines.

The problem is that aarch64_add_offset can be called while
outputting a thunk, where we require all instructions to be
valid without splitting.  However, the move expanders were
not splitting CONST_INT moves themselves.

I think the right fix is to make the move expanders work
even in this scenario, rather than require callers to handle
it as a special case.

gcc/
PR target/98136
* config/aarch64/aarch64.md (mov): Pass multi-instruction
CONST_INTs to aarch64_expand_mov_immediate when called after RA.

gcc/testsuite/
PR target/98136
* g++.dg/pr98136.C: New test.

(cherry picked from commit 48c79f054bf435051c95ee093c45a0f8c9de5b4e)

[Bug tree-optimization/98726] [10/11 Regression] SVE: tree check: expected integer_cst, have poly_int_cst in to_wide, at tree.h:5984

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98726

--- Comment #14 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:dc9233a4f65a67ca280903d60d57c5fd5d95303e

commit r10-9753-gdc9233a4f65a67ca280903d60d57c5fd5d95303e
Author: Richard Sandiford 
Date:   Fri Apr 23 10:09:41 2021 +0100

Handle CONST_POLY_INTs in CONST_VECTORs [PR97141, PR98726]

This PR is caused by POLY_INT_CSTs being (necessarily) valid
in tree-level VECTOR_CSTs but CONST_POLY_INTs not being valid
in RTL CONST_VECTORs.  I can't tell/remember how deliberate
that was, but I'm guessing not very.  In particular,
valid_for_const_vector_p was added to guard against symbolic
constants rather than CONST_POLY_INTs.

I did briefly consider whether we should maintain the current
status anyway.  However, that would then require a way of
constructing variable-length vectors from individual elements
if, say, we have:

   { [2, 2], [3, 2], [4, 2], … }

So I'm chalking this up to an oversight.  I think the intention
(and certainly the natural thing) is to have the same rules for
both trees and RTL.

The SVE CONST_VECTOR code should already be set up to handle
CONST_POLY_INTs.  However, we need to add support for Advanced SIMD
CONST_VECTORs that happen to contain SVE-based values.  The patch does
that by expanding such CONST_VECTORs in the same way as variable vectors.

gcc/
PR rtl-optimization/97141
PR rtl-optimization/98726
* emit-rtl.c (valid_for_const_vector_p): Return true for
CONST_POLY_INT_P.
* rtx-vector-builder.h (rtx_vector_builder::step): Return a
poly_wide_int instead of a wide_int.
(rtx_vector_builder::apply_set): Take a poly_wide_int instead
of a wide_int.
* rtx-vector-builder.c (rtx_vector_builder::apply_set): Likewise.
* config/aarch64/aarch64.c (aarch64_legitimate_constant_p): Return
false for CONST_VECTORs that cannot be forced to memory.
* config/aarch64/aarch64-simd.md (mov): If a CONST_VECTOR
is too complex to force to memory, build it up from individual
elements instead.

gcc/testsuite/
PR rtl-optimization/97141
PR rtl-optimization/98726
* gcc.c-torture/compile/pr97141.c: New test.
* gcc.c-torture/compile/pr98726.c: Likewise.
* gcc.target/aarch64/sve/pr97141.c: Likewise.
* gcc.target/aarch64/sve/pr98726.c: Likewise.

(cherry picked from commit 1b5f74e8be4dd7abe5624ff60adceff19ca71bda)
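
For readers unfamiliar with the notation, my understanding is that [a, b] in the vector quoted above denotes a polynomial value a + b*x, where x is a runtime quantity that is not known at compile time. The sketch below uses a made-up poly2 struct (not GCC's poly_int classes) to show why such an element has no single compile-time integer value; all names in it are hypothetical.

```c
#include <stdint.h>

/* Hypothetical two-coefficient polynomial: value = c0 + c1 * x.  */
struct poly2
{
  int64_t c0;
  int64_t c1;
};

/* Element I of the series { [2, 2], [3, 2], [4, 2], ... }.  */
static struct poly2
series_elt (int64_t i)
{
  struct poly2 p = { 2 + i, 2 };
  return p;
}

/* Evaluate P once the runtime parameter X is known.  */
static int64_t
poly2_eval (struct poly2 p, int64_t x)
{
  return p.c0 + p.c1 * x;
}
```

Only the coefficients are known when the constant is built; the actual element values depend on x, so a constant vector that is restricted to plain integers cannot hold them.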

[Bug target/97141] [10 Regression] aarch64, SVE: ICE in decompose, at rtl.h (during expand) since r10-4676-g9c437a108a

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97141

--- Comment #8 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:dc9233a4f65a67ca280903d60d57c5fd5d95303e

commit r10-9753-gdc9233a4f65a67ca280903d60d57c5fd5d95303e
Author: Richard Sandiford 
Date:   Fri Apr 23 10:09:41 2021 +0100

Handle CONST_POLY_INTs in CONST_VECTORs [PR97141, PR98726]

This PR is caused by POLY_INT_CSTs being (necessarily) valid
in tree-level VECTOR_CSTs but CONST_POLY_INTs not being valid
in RTL CONST_VECTORs.  I can't tell/remember how deliberate
that was, but I'm guessing not very.  In particular,
valid_for_const_vector_p was added to guard against symbolic
constants rather than CONST_POLY_INTs.

I did briefly consider whether we should maintain the current
status anyway.  However, that would then require a way of
constructing variable-length vectors from individual elements
if, say, we have:

   { [2, 2], [3, 2], [4, 2], … }

So I'm chalking this up to an oversight.  I think the intention
(and certainly the natural thing) is to have the same rules for
both trees and RTL.

The SVE CONST_VECTOR code should already be set up to handle
CONST_POLY_INTs.  However, we need to add support for Advanced SIMD
CONST_VECTORs that happen to contain SVE-based values.  The patch does
that by expanding such CONST_VECTORs in the same way as variable vectors.

gcc/
PR rtl-optimization/97141
PR rtl-optimization/98726
* emit-rtl.c (valid_for_const_vector_p): Return true for
CONST_POLY_INT_P.
* rtx-vector-builder.h (rtx_vector_builder::step): Return a
poly_wide_int instead of a wide_int.
(rtx_vector_builder::apply_set): Take a poly_wide_int instead
of a wide_int.
* rtx-vector-builder.c (rtx_vector_builder::apply_set): Likewise.
* config/aarch64/aarch64.c (aarch64_legitimate_constant_p): Return
false for CONST_VECTORs that cannot be forced to memory.
* config/aarch64/aarch64-simd.md (mov): If a CONST_VECTOR
is too complex to force to memory, build it up from individual
elements instead.

gcc/testsuite/
PR rtl-optimization/97141
PR rtl-optimization/98726
* gcc.c-torture/compile/pr97141.c: New test.
* gcc.c-torture/compile/pr98726.c: Likewise.
* gcc.target/aarch64/sve/pr97141.c: Likewise.
* gcc.target/aarch64/sve/pr98726.c: Likewise.

(cherry picked from commit 1b5f74e8be4dd7abe5624ff60adceff19ca71bda)

[Bug target/99249] [8/9/10 Backport] SVE: ICE in aarch64_expand_sve_const_vector (during RTL pass: early_remat)

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99249

--- Comment #5 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:690aa217cf2882e58a0572171a3dd8e346f616cf

commit r10-9754-g690aa217cf2882e58a0572171a3dd8e346f616cf
Author: Richard Sandiford 
Date:   Fri Apr 23 10:09:42 2021 +0100

aarch64: Handle more SVE vector constants [PR99246]

PR99246 is about a case in which we failed to handle a CONST_VECTOR
with NELTS_PER_PATTERN==2, i.e. a vector with a “foreground” sequence
of N vectors followed by a repeating “background” sequence of N
vectors.

At the moment, it's difficult to produce these vectors directly,
but I'm hoping that for GCC 12 we'll do more folding, which will
in turn make this easier to test and easier to optimise.  Until then,
the patch simply relies on the testcase in the PR.

gcc/
PR target/99249
* config/aarch64/aarch64.c (aarch64_expand_sve_const_vector_sel):
New function.
(aarch64_expand_sve_const_vector): Use it for nelts_per_pattern==2.

gcc/testsuite/
PR target/99249
* gcc.target/aarch64/sve/acle/general/pr99246.c: New test.

(cherry picked from commit a065e0bb092a010664777394530ab1a52bb5293b)
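
A sketch of how I understand the encoding mentioned above: with NPATTERNS interleaved patterns and NELTS_PER_PATTERN == 2, each pattern stores one leading ("foreground") element followed by one element that repeats for the rest of the vector ("background"). The helper below is illustrative only; its name and the flat-array layout are assumptions, not code from aarch64.c.

```c
#include <stddef.h>

/* Hypothetical decoder.  ENCODED holds 2 * NPATTERNS values: first the
   NPATTERNS foreground elements, then the NPATTERNS background elements.
   Returns element I of the full vector, however long it turns out to be.  */
static int
decode_two_elt_pattern (const int *encoded, size_t npatterns, size_t i)
{
  /* Elements are interleaved across the patterns.  */
  size_t pattern = i % npatterns;

  /* Position 0 within a pattern is the foreground element; every later
     position repeats the background element.  */
  size_t which = (i / npatterns) == 0 ? 0 : 1;

  return encoded[which * npatterns + pattern];
}
```

With encoded = { 1, 2, 9, 8 } and npatterns = 2, the decoded vector reads 1, 2, 9, 8, 9, 8, and so on for as many elements as the runtime vector length requires.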

[Bug tree-optimization/99726] [10 Regression] ICE in create_intersect_range_checks_index, at tree-data-ref.c:1855 since r10-4762-gf9d6338bd15ce1fae36bf25d3a0545e9678ddc58

2021-04-23 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99726

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from rsandifo at gcc dot gnu.org  ---
Fixed.

[Bug tree-optimization/98268] [10 Regression] ICE: verify_gimple failed with LTO and SVE

2021-04-23 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98268

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #12 from rsandifo at gcc dot gnu.org  ---
Fixed.

[Bug fortran/100227] [8/9/10/11/12 Regression] write with implicit loop

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100227

Richard Biener  changed:

   What|Removed |Added

Summary|write with implicit loop|[8/9/10/11/12 Regression]
   ||write with implicit loop
   Keywords||wrong-code
   Target Milestone|--- |8.5
   Priority|P3  |P4

[Bug tree-optimization/100222] Redundant mark_irreducible_loops () in predicate.c

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100222

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2021-04-23
 Status|UNCONFIRMED |ASSIGNED
Version|tree-ssa|12.0

--- Comment #1 from Richard Biener  ---
Mine.

[Bug target/97141] [10 Regression] aarch64, SVE: ICE in decompose, at rtl.h (during expand) since r10-4676-g9c437a108a

2021-04-23 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97141

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #9 from rsandifo at gcc dot gnu.org  ---
Fixed.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Uroš Bizjak  changed:

   What|Removed |Added

  Attachment #50649|0   |1
is obsolete||

--- Comment #21 from Uroš Bizjak  ---
Created attachment 50659
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50659&action=edit
Proposed patch

Here is the complete proposed patch.

We can retain problematic atomic store followed by a DFmode load peepholes as
long as we have a load to the SSE register. Load to the SSE register uses
movlps/movq moves that preserve all bits, so we are sure the store to a memory
location is unchanged from the original.

However, "load to the SSE register" requirement makes the peephole ineffective
for -mfpmath=387, so XFAILs are added to affected testcases.

[Bug tree-optimization/98069] [8/9/10 Regression] Miscompilation with -O3 since r8-2380-g2d7744d4ef93bfff

2021-04-23 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98069

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

   Target Milestone|8.5 |10.4

--- Comment #7 from rsandifo at gcc dot gnu.org  ---
As discussed on irc, the fix was quite invasive, so it seems a bit
dangerous to backport further than GCC 10.  Will backport to GCC 10 in
the GCC 11.2 timeframe, once we've had more chance to see if there's
any fallout.

[Bug tree-optimization/97960] [8/9/10 Regression] Wrong code at -O3 since r8-6511-g3ae129323d

2021-04-23 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97960

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #12 from rsandifo at gcc dot gnu.org  ---
Tracking backports in PR98069

*** This bug has been marked as a duplicate of bug 98069 ***

[Bug tree-optimization/98069] [8/9/10 Regression] Miscompilation with -O3 since r8-2380-g2d7744d4ef93bfff

2021-04-23 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98069

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 CC||acoplan at gcc dot gnu.org

--- Comment #8 from rsandifo at gcc dot gnu.org  ---
*** Bug 97960 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/95396] [8/9/10 Regression] GCC produces incorrect code with -O3 for loops since r8-6511-g3ae129323d150621

2021-04-23 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95396

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #14 from rsandifo at gcc dot gnu.org  ---
Tracking backports in PR98069

*** This bug has been marked as a duplicate of bug 98069 ***

[Bug tree-optimization/98069] [8/9/10 Regression] Miscompilation with -O3 since r8-2380-g2d7744d4ef93bfff

2021-04-23 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98069

--- Comment #9 from rsandifo at gcc dot gnu.org  ---
*** Bug 95396 has been marked as a duplicate of this bug. ***

[Bug target/100229] New: arm: UB in arm_block_set_aligned_non_vect (shift exponent 32 is too large for 32-bit type)

2021-04-23 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100229

Bug ID: 100229
   Summary: arm: UB in arm_block_set_aligned_non_vect (shift
exponent 32 is too large for 32-bit type)
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

$ cat test.c
int a[1];
__attribute((always_inline)) void g(void *c, int d, int e) {
  __builtin_memset(c, d, e);
}
void f() { g(a, 0, 0); }
$ gcc/xgcc -B gcc -c test.c -ftree-ter
test.c:2:35: warning: ‘always_inline’ function might not be inlinable
[-Wattributes]
2 | __attribute((always_inline)) void g(void *c, int d, int e) {
  |   ^
/data_sdb/toolchain/src/gcc/gcc/config/arm/arm.c:32358:22: runtime error: shift
exponent 32 is too large for 32-bit type 'unsigned int'
#0 0x2427a05 in arm_block_set_aligned_non_vect(rtx_def*, unsigned long,
unsigned long, unsigned long)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x2427a05)
#1 0x2428ce6 in arm_gen_setmem(rtx_def**)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x2428ce6)
#2 0x2c32189 in gen_setmemsi(rtx_def*, rtx_def*, rtx_def*, rtx_def*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x2c32189)
#3 0x16ddcf1 in rtx_insn* insn_gen_fn::operator()(rtx_def*, rtx_def*, rtx_def*, rtx_def*) const
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x16ddcf1)
#4 0x16dcd35 in maybe_gen_insn(insn_code, unsigned int, expand_operand*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x16dcd35)
#5 0x16dd513 in maybe_expand_insn(insn_code, unsigned int, expand_operand*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x16dd513)
#6 0x10561d0 in set_storage_via_setmem(rtx_def*, rtx_def*, rtx_def*,
unsigned int, unsigned int, long, unsigned long, unsigned long, unsigned long)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x10561d0)
#7 0xc7df36 in expand_builtin_memset_args(tree_node*, tree_node*,
tree_node*, rtx_def*, machine_mode, tree_node*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xc7df36)
#8 0xc7db6f in expand_builtin_memset(tree_node*, rtx_def*, machine_mode)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xc7db6f)
#9 0xc8ccc8 in expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode,
int) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xc8ccc8)
#10 0x108ff58 in expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x108ff58)
#11 0x1079c93 in expand_expr_real(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1079c93)
#12 0xd0c350 in expand_expr(tree_node*, rtx_def*, machine_mode,
expand_modifier) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xd0c350)
#13 0xd1b56d in expand_call_stmt(gcall*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xd1b56d)
#14 0xd211e3 in expand_gimple_stmt_1(gimple*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xd211e3)
#15 0xd21b56 in expand_gimple_stmt(gimple*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xd21b56)
#16 0xd32062 in expand_gimple_basic_block(basic_block_def*, bool)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xd32062)
#17 0xd35db8 in (anonymous namespace)::pass_expand::execute(function*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xd35db8)
#18 0x17d2355 in execute_one_pass(opt_pass*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2355)
#19 0x17d2b6e in execute_pass_list_1(opt_pass*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2b6e)
#20 0x17d2c65 in execute_pass_list(function*, opt_pass*)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2c65)
#21 0xdf27ac in cgraph_node::expand()
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf27ac)
#22 0xdf3a23 in cgraph_order_sort::process()
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf3a23)
#23 0xdf4135 in output_in_order()
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf4135)
#24 0xdf4e01 in symbol_table::compile()
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf4e01)
#25 0xdf55ea in symbol_table::finalize_compilation_unit()
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf55ea)
#26 0x1ac3baa in compile_file()
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1ac3baa)
#27 0x1ac8a15 in do_compile()
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1ac8a15)
#28 0x1ac8f10 in toplev::main(int, char**)
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1ac8f10)
#29 0x36a5ee7 in main
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x36a5ee7)
#30 0x75ca1bf6 in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)
#31 0x980249 in _start
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x980249)
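
The failure mode UBSan reports above is a shift whose count equals the width of the type, which is undefined behaviour in C and C++. I have not checked what the expression at arm.c:32358 computes, so the following is only a minimal sketch of one common way to guard such a shift; low_bits_mask is a hypothetical helper, not the code in arm.c.

```c
#include <stdint.h>

/* Return a mask with the low NBITS bits set, for NBITS in [0, 32].
   A plain ((uint32_t) 1 << nbits) - 1 would be undefined for nbits == 32,
   so that case is handled separately.  */
static uint32_t
low_bits_mask (unsigned nbits)
{
  if (nbits >= 32)
    return UINT32_MAX;
  return ((uint32_t) 1 << nbits) - 1;
}
```

The reproducer calls memset with a length of 0, so a width derived from the length could plausibly hit exactly this boundary case.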

[Bug target/100214] UB in arm.c:optimal_immediate_sequence_1 (left shift of 255 by 30 places cannot be represented in type 'int')

2021-04-23 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100214

Richard Earnshaw  changed:

   What|Removed |Added

   Last reconfirmed||2021-04-23
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Richard Earnshaw  ---
Confirmed by visual inspection of source.  There look to be a number of
signed/unsigned confusions in this function.
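
The usual remedy for this class of report is to perform the shift in an unsigned type, where bits shifted out of the top are discarded instead of overflowing a signed int. A hedged sketch (the helper name byte_at is invented for illustration; this is not the arm.c code):

```c
#include <stdint.h>

/* Shift an 8-bit field into position within a 32-bit word.
   SHIFT must be less than 32.  */
static uint32_t
byte_at (unsigned shift)
{
  /* The literal 0xff has type int, so 0xff << 30 overflows a 32-bit int.
     Converting to uint32_t first makes the shift well defined.  */
  return (uint32_t) 0xff << shift;
}
```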

[Bug c++/98297] [8/9/10/11 Regression] ICE in cp_parser_elaborated_type_specifier, at cp/parser.c:19653

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98297

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
Note, the testcase FAILs on the 8 branch, the emitted error is different.
$ gcc-9/obj28/gcc/cc1plus -quiet -std=c++11 /tmp/pr98297.C -o /tmp/pr98297.s
/tmp/pr98297.C:5:1: warning: ‘b’ attribute directive ignored [-Wattributes]
5 | a ; // { dg-error "does not declare anything" }
  | ^~~
/tmp/pr98297.C:5:1: error: declaration does not declare anything [-fpermissive]
$ gcc-8/obj32/gcc/cc1plus -quiet -std=c++11 /tmp/pr98297.C -o /tmp/pr98297.s
/tmp/pr98297.C:5:1: warning: ‘b’ attribute directive ignored [-Wattributes]
 a ; // { dg-error "does not declare anything" }
 ^~~
/tmp/pr98297.C:5:1: error: name of class shadows template template parameter
‘a’
$ gcc-8/obj30/gcc/cc1plus -quiet -std=c++11 /tmp/pr98297.C -o /tmp/pr98297.s
/tmp/pr98297.C:5:1: internal compiler error: Segmentation fault
 a ; // { dg-error "does not declare anything" }
 ^~~
gcc-8/obj30 is a 5-month-old snapshot which expectedly ICEs, but the middle
error is different from what the test expects.

[Bug target/100216] arm: UB in arm_canonicalize_comparison (shift exponent 127 is too large for 64-bit type)

2021-04-23 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100216

Richard Earnshaw  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-04-23
 Status|UNCONFIRMED |NEW

--- Comment #2 from Richard Earnshaw  ---
Confirmed by visual inspection.  Clearly this code was written at a time when
the largest integral mode on Arm was DImode.  It won't work for wider modes and
it won't do anything for non-integral modes.

Needs an overhaul.

[Bug rtl-optimization/100230] New: ASan: alloc-dealloc-mismatch in early-remat.c

2021-04-23 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100230

Bug ID: 100230
   Summary: ASan: alloc-dealloc-mismatch in early-remat.c
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

Bootstrapping on aarch64 --with-build-config=bootstrap-asan and running the
testsuite shows the following issue:

$ cat test.c
int a, b;
void c() {
  while (b)
a += b++;
}
$ gcc/xgcc -B gcc -c test.c -march=armv8.2-a+sve -O2 -ftree-vectorize
=
==22323==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs
operator delete) on 0x92f0d900
#0 0x75ed5c in operator delete(void*, unsigned long)
/home/alecop01/toolchain/src/gcc/libsanitizer/asan/asan_new_delete.cpp:172
#1 0x33b033c in sort_candidates
/home/alecop01/toolchain/src/gcc/gcc/early-remat.c:1062
#2 0x33b033c in run /home/alecop01/toolchain/src/gcc/gcc/early-remat.c:2567
#3 0x33b033c in execute
/home/alecop01/toolchain/src/gcc/gcc/early-remat.c:2629
#4 0x151ebd4 in execute_one_pass(opt_pass*)
/home/alecop01/toolchain/src/gcc/gcc/passes.c:2567
#5 0x15201a0 in execute_pass_list_1
/home/alecop01/toolchain/src/gcc/gcc/passes.c:2656
#6 0x15201c4 in execute_pass_list_1
/home/alecop01/toolchain/src/gcc/gcc/passes.c:2657
#7 0x1520270 in execute_pass_list(function*, opt_pass*)
/home/alecop01/toolchain/src/gcc/gcc/passes.c:2667
#8 0xbb7c34 in cgraph_node::expand()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1830
#9 0xbb7c34 in cgraph_node::expand()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1783
#10 0xbba6d4 in expand_all_functions
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1994
#11 0xbba6d4 in symbol_table::compile()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2358
#12 0xbc18a8 in symbol_table::compile()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2271
#13 0xbc18a8 in symbol_table::finalize_compilation_unit()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2539
#14 0x1793f44 in compile_file
/home/alecop01/toolchain/src/gcc/gcc/toplev.c:482
#15 0x6d4ffc in do_compile
/home/alecop01/toolchain/src/gcc/gcc/toplev.c:2201
#16 0x6d4ffc in toplev::main(int, char**)
/home/alecop01/toolchain/src/gcc/gcc/toplev.c:2340
#17 0x6df804 in main /home/alecop01/toolchain/src/gcc/gcc/main.c:39
#18 0x973276dc in __libc_start_main
(/lib/aarch64-linux-gnu/libc.so.6+0x206dc)
#19 0x6e271c  (/data/alecop01/builds/gcc11-bstrap-asan/gcc/cc1+0x6e271c)

0x92f0d900 is located 0 bytes inside of 28-byte region
[0x92f0d900,0x92f0d91c)
allocated by thread T0 here:
#0 0x75e16c in operator new[](unsigned long)
/home/alecop01/toolchain/src/gcc/libsanitizer/asan/asan_new_delete.cpp:102
#1 0x33b027c in sort_candidates
/home/alecop01/toolchain/src/gcc/gcc/early-remat.c:1056
#2 0x33b027c in run /home/alecop01/toolchain/src/gcc/gcc/early-remat.c:2567
#3 0x33b027c in execute
/home/alecop01/toolchain/src/gcc/gcc/early-remat.c:2629
#4 0x151ebd4 in execute_one_pass(opt_pass*)
/home/alecop01/toolchain/src/gcc/gcc/passes.c:2567
#5 0x15201a0 in execute_pass_list_1
/home/alecop01/toolchain/src/gcc/gcc/passes.c:2656
#6 0x15201c4 in execute_pass_list_1
/home/alecop01/toolchain/src/gcc/gcc/passes.c:2657
#7 0x1520270 in execute_pass_list(function*, opt_pass*)
/home/alecop01/toolchain/src/gcc/gcc/passes.c:2667
#8 0xbb7c34 in cgraph_node::expand()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1830
#9 0xbb7c34 in cgraph_node::expand()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1783
#10 0xbba6d4 in expand_all_functions
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1994
#11 0xbba6d4 in symbol_table::compile()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2358
#12 0xbc18a8 in symbol_table::compile()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2271
#13 0xbc18a8 in symbol_table::finalize_compilation_unit()
/home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2539
#14 0x1793f44 in compile_file
/home/alecop01/toolchain/src/gcc/gcc/toplev.c:482
#15 0x6d4ffc in do_compile
/home/alecop01/toolchain/src/gcc/gcc/toplev.c:2201
#16 0x6d4ffc in toplev::main(int, char**)
/home/alecop01/toolchain/src/gcc/gcc/toplev.c:2340
#17 0x6df804 in main /home/alecop01/toolchain/src/gcc/gcc/main.c:39
#18 0x973276dc in __libc_start_main
(/lib/aarch64-linux-gnu/libc.so.6+0x206dc)
#19 0x6e271c  (/data/alecop01/builds/gcc11-bstrap-asan/gcc/cc1+0x6e271c)

The fix looks obvious.
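
For readers unfamiliar with this ASan report: memory obtained with operator new[] has to be released with operator delete[], not scalar delete. The following is only a sketch of one way to keep the two matched, using a hypothetical function `sum_scratch`; it is not the code from early-remat.c and not necessarily the fix that was applied.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a function that needs a temporary array.
int sum_scratch(std::size_t n)
{
    // std::unique_ptr<int[]> calls delete[] on destruction, so the release
    // always matches the new[] below.
    std::unique_ptr<int[]> scratch(new int[n]);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        scratch[i] = static_cast<int>(i);
        total += scratch[i];
    }
    return total;
}
```

Writing `delete[] ptr;` by hand in place of `delete ptr;` would address the mismatch equally well; the array form of unique_ptr just makes it hard to get wrong.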

[Bug rtl-optimization/100230] ASan: alloc-dealloc-mismatch in early-remat.c

2021-04-23 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100230

Alex Coplan  changed:

   What|Removed |Added

   Last reconfirmed||2021-04-23
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |acoplan at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

[Bug rtl-optimization/100230] ASan: alloc-dealloc-mismatch in early-remat.c

2021-04-23 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100230

--- Comment #1 from Alex Coplan  ---
Testing a fix.

[Bug rtl-optimization/100225] [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291

2021-04-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100225

Alexander Monakov  changed:

   What|Removed |Added

 Blocks|85099   |
 CC||amonakov at gcc dot gnu.org,
   ||zhroma at gcc dot gnu.org

--- Comment #1 from Alexander Monakov  ---
Hi Martin, this is a modulo-scheduling bug; I think you added "Blocks:
sel-sched" by mistake — removing, and Cc'ing Roman.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85099
[Bug 85099] [meta-bug] selective scheduling issues

[Bug target/99488] dwz: /usr/lib/gcc/mips64el-linux-gnuabi64/11/go1: Found two copies of .debug_line_str section

2021-04-23 Thread syq at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99488

--- Comment #12 from YunQiang Su  ---
This problem disappears if we build gcc 11 with binutils 2.36.

[Bug rtl-optimization/100225] [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291

2021-04-23 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100225

--- Comment #2 from Martin Liška  ---
Ah, you are right, sorry.

[Bug fortran/100227] [8/9/10/11/12 Regression] write with implicit loop

2021-04-23 Thread dominiq at lps dot ens.fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100227

Dominique d'Humieres  changed:

   What|Removed |Added

 CC||tkoenig at gcc dot gnu.org
  Known to fail||11.0, 12.0
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-04-23
 Ever confirmed|0   |1

--- Comment #2 from Dominique d'Humieres  ---
Workaround: use -fno-frontend-optimize.

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread iii at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

Ilya Leoshkevich  changed:

   What|Removed |Added

 CC||iii at linux dot ibm.com

--- Comment #3 from Ilya Leoshkevich  ---
The main problem here is that `register long double f0 asm ("f0")` does not
make sense on z14 anymore. long doubles are stored in vector registers now, not
in floating-point register pairs. If we skip the hard reg, the code will end up
having the following semantics:

vr0[0:128] = 1.0L;
asm("/* expect the value in vr0[0:64] . vr2[0:64] */");

and fail during the run time. So I think it's better to use the "best effort"
approach and force it into a pseudo, even if this would mean that the
user-specified register is not honored:

--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16814,6 +16814,12 @@ s390_md_asm_adjust (vec<rtx> &outputs, vec<rtx> &inputs,
   gcc_assert (allows_reg);
   /* Copy input value from a vector register into a FPR pair.  */
   rtx fprx2 = gen_reg_rtx (FPRX2mode);
+  if (REG_P (inputs[i]) && HARD_REGISTER_P (inputs[i]))
+   {
+ rtx orig_input = inputs[i];
+ inputs[i] = gen_reg_rtx (TFmode);
+ emit_move_insn (inputs[i], orig_input);
+   }
   emit_insn (gen_tf_to_fprx2 (fprx2, inputs[i]));
   inputs[i] = fprx2;
   input_modes[i] = FPRX2mode;

I need to check whether we can keep the output logic as is.

Ideally the code should be adapted and use the __LONG_DOUBLE_VX__ macro like
this:

#ifdef __LONG_DOUBLE_VX__
  register long double f0 asm ("v0");
#else
  register long double f0 asm ("f0");
#endif

  f0 = 1.0L;

#ifdef __LONG_DOUBLE_VX__
  asm("" : : "v" (f0));
#else
  asm("" : : "f" (f0));
#endif

Maybe a warning recommending to do this should be printed.

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #4 from Jakub Jelinek  ---
That seems like quite an undesirable API change.
Can't the backend when it sees long double register vars for the fN registers
change the mode from TFmode to that new FPRX2mode, so that old code keeps
working?

[Bug tree-optimization/100222] Redundant mark_irreducible_loops () in predicate.c

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100222

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:500305a92ef85e6b87ad428a35221c62f4037b93

commit r12-82-g500305a92ef85e6b87ad428a35221c62f4037b93
Author: Richard Biener 
Date:   Fri Apr 23 11:16:52 2021 +0200

tree-optimization/100222 - remove redundant mark_irreducible_loops calls

loop_optimizer_init (LOOPS_NORMAL) already performs this (quite
expensive) marking.

2021-04-23  Richard Biener  

PR tree-optimization/100222
* predict.c (pass_profile::execute): Remove redundant call to
mark_irreducible_loops.
(report_predictor_hitrates): Likewise.

[Bug tree-optimization/100222] Redundant mark_irreducible_loops () in predicate.c

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100222

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Richard Biener  ---
Fixed.

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug rtl-optimization/100225] [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291

2021-04-23 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100225

Alex Coplan  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
 CC||acoplan at gcc dot gnu.org
   Last reconfirmed||2021-04-23

--- Comment #3 from Alex Coplan  ---
Confirmed.

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #37 from Richard Biener  ---
Oh, and FYI a cc1 cross from x86_64 to x86_64-apple-darwin19.6.0 doesn't seem
to reproduce the issue with the reduced testcase (I see no call to
___UTF_8_put remaining with -O3 -fPIC -fno-strict-aliasing -fwrapv).

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread iii at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #5 from Ilya Leoshkevich  ---
That would be an ideal solution, but I wonder how to implement it? Suppose we
find a way to convince expand to pick FPRX2mode for such a long double. What if
the following comes up?

register long double x asm ("v0");  /* FPRX2mode */
long double y;  /* TFmode */
x += y; /* convert? */

Would it be feasible to also teach expand to do the mode conversions?



One other alternative might be to detect `register long double asm("fN")`
declarations and go back to using floating point register pairs for functions
that contain them.

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #6 from Jakub Jelinek  ---
(In reply to Ilya Leoshkevich from comment #5)
> That would be an ideal solution, but I wonder how to implement it? Suppose
> we find a way to convince expand to pick FPRX2mode for such a long double.
> What if the following comes up?
> 
> register long double x asm ("v0");  /* FPRX2mode */
> long double y;  /* TFmode */
> x += y; /* convert? */
> 
> Would it be feasible to also teach expand to do the mode conversions?

It is certainly doable, but perhaps with extra target hooks or something
similar.
Types have their TYPE_MODE and decls have DECL_MODE, though the question is
what breaks if TYPE_MODE != DECL_MODE, at least the comment in tree.h says that
they can only differ for FIELD_DECLs.  Anyway, in GIMPLE register vars are
non-SSA, so apart from inline asm one needs separate loads and stores to them,
so if we could expand those as having FPRX2 hard reg and loads from it convert
to TFmode and stores into it convert from TFmode, ...

> One other alternative might be to detect `register long double asm("fN")`
> declarations and go back to using floating point register pairs for
> functions that contain them.

But this might actually be the best short-term solution (for GCC 11.x).

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #7 from Jakub Jelinek  ---
That said, I'm afraid I don't really understand what goes wrong with the
patch I've attached.
Trying something like:
long double
foo (void)
{
  register long double f0 asm ("f0");
  f0 = 1.0L;
  f0 += 127.L;
  f0 *= 32.L;
  return f0;
}
with -O0 -march=z14 -mlong-double-128 so that it is not all folded immediately
shows in the end the computations are done in vector registers.
And another thing to try is intermix that with inline asm expecting those in
"+f" so that intermediate results are pushed to the floating point register
pair.

[Bug target/99748] MVE: Wrong code at -O0 with float to integer conversion

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99748

--- Comment #6 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Alex Coplan:

https://gcc.gnu.org/g:283367662c25057fd7c9c98257cca858f85b75fc

commit r10-9755-g283367662c25057fd7c9c98257cca858f85b75fc
Author: Alex Coplan 
Date:   Tue Apr 6 09:06:27 2021 +0100

arm: Fix PCS for SFmode -> SImode libcalls [PR99748]

This patch fixes PR99748 which shows us trying to pass the argument to
__aeabi_f2iz in the VFP register s0 when the library function is
expecting to use the GPR r0. It also fixes the __aeabi_f2uiz case which
was broken in the same way.

For the testcase in the PR, here is the code we generate before the
patch (with -mfloat-abi=hard -march=armv8.1-m.main+mve -O0):

main:
push    {r7, lr}
sub     sp, sp, #8
add     r7, sp, #0
mov     r3, #1065353216
str     r3, [r7, #4]    @ float
vldr.32 s0, [r7, #4]
bl  __aeabi_f2iz
mov r3, r0
cmp r3, #1
[...]

This becomes:

main:
push    {r7, lr}
sub     sp, sp, #8
add     r7, sp, #0
mov     r3, #1065353216
str     r3, [r7, #4]    @ float
ldr     r0, [r7, #4]    @ float
bl  __aeabi_f2iz
mov r3, r0
cmp r3, #1
[...]

after the patch. We see a similar change for the same testcase with a
cast to unsigned instead of int.

gcc/ChangeLog:

PR target/99748
* config/arm/arm.c (arm_libcall_uses_aapcs_base): Also use base
PCS for [su]fix_optab.

(cherry picked from commit 16ea7f57891d3fe885ee55b2917208695e184714)

[Bug target/99748] MVE: Wrong code at -O0 with float to integer conversion

2021-04-23 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99748

Alex Coplan  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Alex Coplan  ---
Fixed for 10.4, so fixed everywhere.

[Bug c++/98297] [8/9/10/11 Regression] ICE in cp_parser_elaborated_type_specifier, at cp/parser.c:19653

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98297

--- Comment #6 from Jakub Jelinek  ---
Ah, tracked already in PR98358.

[Bug c/69558] [8 Regression] glib2 warning pragmas stopped working

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69558

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P1  |P2

--- Comment #31 from Jakub Jelinek  ---
A 5-year-old bug can't be P1.

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread iii at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #8 from Ilya Leoshkevich  ---
Yeah, inline asm seems to be problematic:

/home/iii/gcc/build/gcc/xgcc -B/home/iii/gcc/build/gcc/
/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c   
-fdiagnostics-plain-output   -O2 -march=z14 -mzarch -S -o
long-double-asm-hardreg.s

with the patch from comment 2 produces:

foo:
.LFB0:
.cfi_startproc
larl    %r5,.L4
vl  %v0,.L5-.L4(%r5),3
#APP
# 10
"/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c"
1
# %v0
# 0 "" 2
#NO_APP
br  %r14

`vl  %v0,.L5-.L4(%r5),3` loads 1.0L into %v0[0:128]. However, it should be
loaded into %v0[0:64] . %v2[0:64].

With the patch from comment 3 I get:

foo:
.LFB0:
.cfi_startproc
larl    %r5,.L4
ld  %f0,.L5-.L4(%r5)
ld  %f2,.L5-.L4+8(%r5)
#APP
# 10
"/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c"
1
# %f0
# 0 "" 2
#NO_APP
br  %r14

which is correct, but in the general case the exact reg that the user requested
is not honored.

[Bug libstdc++/99402] [10 Regression] std::copy creates _GLIBCXX_DEBUG false positive for attempt to subscript a dereferenceable (start-of-sequence) iterator

2021-04-23 Thread fdumont at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99402

--- Comment #13 from François Dumont  ---
Fixed on gcc-10 branch by this commit
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ab83ce42ea0b2fbc09d51b7bd5e69905dcaa2041.

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #38 from Iain Sandoe  ---
(In reply to Richard Biener from comment #37)
> Oh, and FYI a cc1 cross from x86_64 to x86_64-apple-darwin19.6.0 doesn't seem
> to reproduce the issue with the reduced testcase (I see no call to
> ___UTF_8_put remaining with -O3 -fPIC -fno-strict-aliasing -fwrapv).

I think my interestingness test isn't strict enough - the creduced code
resulting doesn't have an extern for ___UTF_8_put and only seems to not inline
that fn because the interface has been mangled. [ so that the fn is
legitimately binds_localP as the pasted case ].

if you still have the build around, out of curiosity, does it fail on the
original .i file attached here?

and with -fno-trapping-math -fno-math-errno -fschedule-insns2
-fomit-frame-pointer

( I only need O2 to get a fail ).

[Bug c++/100210] [[nodiscard]] constructor causes warning on arm-linux-gnueabihf

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100210

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|FIXED   |DUPLICATE
 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Closing as dup.

*** This bug has been marked as a duplicate of bug 99362 ***

[Bug c++/99362] [10 Regression] invalid unused result

2021-04-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99362

Jakub Jelinek  changed:

   What|Removed |Added

 CC||georg.schwab at emocean dot io

--- Comment #10 from Jakub Jelinek  ---
*** Bug 100210 has been marked as a duplicate of this bug. ***

[Bug c++/98767] Function signature lost in concept diagnostic message

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98767

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:87fc34a461cf362947a430d8a241f653fd83bc7b

commit r12-86-g87fc34a461cf362947a430d8a241f653fd83bc7b
Author: Patrick Palka 
Date:   Fri Apr 23 08:47:02 2021 -0400

c++: Fix pretty printing pointer to function type [PR98767]

When pretty printing a pointer to function type,
pp_cxx_parameter_declaration_clause ends up always outputting an empty
function parameter list because the loop that outputs the list iterates
over 'args' instead of 'types', and 'args' is empty when a FUNCTION_TYPE
is passed to this routine (as opposed to a FUNCTION_DECL).

This patch fixes this by making the loop iterate over 'types' instead.
This patch also moves the retrofitted chain-of-PARM_DECLs printing from
here to pp_cxx_requires_expr, the only caller that uses it.  Doing so
lets us easily output the trailing '...' in the parameter list of a
variadic function, which this patch also implements.

gcc/cp/ChangeLog:

PR c++/98767
* cxx-pretty-print.c (pp_cxx_parameter_declaration_clause):
Adjust parameter list loop to iterate over 'types' instead of
'args'.  Output the trailing '...' for a variadic function.
Remove PARM_DECL support.
(pp_cxx_requires_expr): Pretty print the parameter list directly
instead of going through pp_cxx_parameter_declaration_clause.

gcc/testsuite/ChangeLog:

PR c++/98767
* g++.dg/concepts/diagnostic17.C: New test.
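
As an illustration of the idea in the commit message (print the parameter list from the list of types, which is all a bare function type carries, and append '...' for a variadic function), here is a small standalone sketch. `FunctionType` and `format_params` are hypothetical names invented for this example; the real code works on GCC trees in cxx-pretty-print.c and looks nothing like this.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a function type: parameter types plus a variadic flag.
struct FunctionType {
    std::vector<std::string> param_types;
    bool variadic = false;
};

// Format "(T1, T2, ...)" from the type list alone; no parameter declarations
// (names) are needed.
std::string format_params(const FunctionType &fn)
{
    std::string out = "(";
    for (std::size_t i = 0; i < fn.param_types.size(); ++i) {
        if (i != 0)
            out += ", ";
        out += fn.param_types[i];
    }
    // A variadic function gets a trailing ellipsis after its named types.
    if (fn.variadic)
        out += fn.param_types.empty() ? "..." : ", ...";
    out += ")";
    return out;
}
```

Iterating over a separate list of parameter declarations would print an empty list whenever only the type is available, which is the symptom described above.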

[Bug target/99932] OpenACC/nvptx offloading execution regressions starting with CUDA 11.2-era Nvidia Driver 460.27.04

2021-04-23 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99932

--- Comment #3 from Tom de Vries  ---
Created attachment 50660
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50660&action=edit
Cuda reproducer

[Bug c++/100231] New: [C++17] Variable template specialization inside a class gives compilation error

2021-04-23 Thread krzyk240 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100231

Bug ID: 100231
   Summary: [C++17] Variable template specialization inside a
class gives compilation error
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krzyk240 at gmail dot com
  Target Milestone: ---

On the following code:
```
template<class>
struct X {};

class Foo {
    template<class T>
    static constexpr inline bool bar = false;
    template<class T>
    static constexpr inline bool bar<X<T>> = true;
};
```
GCC gives error:
:8:34: error: explicit template argument list not allowed
    8 |     static constexpr inline bool bar<X<T>> = true;
      |                                  ^

But Clang, ICC and MSVC compile it correctly.

Defining variable template bar outside of Foo class produces no compile errors.

Compilation command: g++ example.cpp -std=c++17

Live example: https://godbolt.org/z/54hqYxe4P
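
For reference, a sketch of the namespace-scope variant mentioned above, which the report says compiles without errors. The template parameter name `T` is an assumption made for this example.

```cpp
#include <cassert>

template <class>
struct X {};

// Primary variable template, declared at namespace scope instead of inside
// a class.
template <class T>
inline constexpr bool bar = false;

// Partial specialization for any X<T>.
template <class T>
inline constexpr bool bar<X<T>> = true;

static_assert(!bar<int>, "primary template is selected for int");
static_assert(bar<X<int>>, "partial specialization is selected for X<int>");
```

This only works around the error by moving the declarations; it does not give the class-scoped name `Foo::bar` that the original code asks for.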

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #39 from Richard Biener  ---
(In reply to Iain Sandoe from comment #38)
> (In reply to Richard Biener from comment #37)
> > Oh, and FYI a cc1 cross from x86_64 to x86_64-apple-darwin19.6.0 doesn't seem
> > to reproduce the issue with the reduced testcase (I see no call to
> > ___UTF_8_put remaining with -O3 -fPIC -fno-strict-aliasing -fwrapv).
> 
> I think my interestingness test isn't strict enough - the creduced code
> resulting doesn't have an extern for ___UTF_8_put and only seems to not
> inline that fn because the interface has been mangled. [ so that the fn is
> legitimately binds_localP as the pasted case ].
> 
> if you still have the build around, out of curiosity, does it fail on the
> original .i file attached here?
> 
> and with -fno-trapping-math -fno-math-errno -fschedule-insns2
> -fomit-frame-pointer
> 
> ( I only need O2 to get a fail ).

Yes, with -O2 -fno-trapping-math -fno-math-errno -fschedule-insns2
-fomit-frame-pointer it produces the problematical

.align 4,0x90
L945:
movl    0(%rbp,%r10,4), %esi
call    UTF_8_put
movq    %r10, %rax
addq    $1, %r10
cmpq    %rax, %r12
jne     L945

code.  But then ___UTF_8_put isn't interposable so I wonder why the linker
even has to resolve anything.  Adding -fPIC OTOH should definitely make the
symbol interposable but the same code is still generated ...

Note the 'extern' declaration shouldn't change anything, only that we
see a definition is relevant.

breaking on darwin_binds_local_p I see ___UTF_8_put is considered binding
local even with -fPIC.  So GCC thinks there will be no linker stub involved.

Note 'shlib' is passed as false to default_binds_local_p_3 computed as

3140 on earlier system versions, and with a TODO to complete.  */
3141  bool force_overridable = TARGET_KEXTABI && DARWIN_VTABLE_P (decl);
3142  return default_binds_local_p_3 (decl, force_overridable /* shlib */,
3143  false /* weak dominate */,

and default_binds_local_p_3 would do

  /* If PIC, then assume that any global name can be overridden by
 symbols resolved from other modules.  */
  if (shlib)
return false;

ix86_binds_local_p simply passes flag_shlib != 0 as this argument.

[Bug libstdc++/100180] experimental/net/internet/address/v6/members.cc fails on arm-eabi

2021-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100180

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:0e1e7b77904f1fe2a6dbfe84bb4fc026584ba480

commit r12-89-g0e1e7b77904f1fe2a6dbfe84bb4fc026584ba480
Author: Jonathan Wakely 
Date:   Fri Apr 23 13:38:05 2021 +0100

libstdc++: Allow net::io_context to compile without <poll.h> [PR 100180]

This adds dummy placeholders to net::io_context so that it can still be
compiled on targets without <poll.h>.

libstdc++-v3/ChangeLog:

PR libstdc++/100180
* include/experimental/io_context (io_context): Define
dummy_pollfd type so that most member functions still compile
without <poll.h> and struct pollfd.
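
The general technique, a placeholder type that stands in for struct pollfd when the header is unavailable, can be sketched as follows. This is an illustration only: the use of `__has_include` and the alias name `pollfd_type` are assumptions made for this example, and the commit's actual detection mechanism is not shown in this message.

```cpp
#include <cassert>

#if __has_include(<poll.h>)
# include <poll.h>
// The real type is available; use it directly.
using pollfd_type = ::pollfd;
#else
// Placeholder with the same member names, so code that only stores or
// copies these objects still compiles on targets without the header.
struct dummy_pollfd {
    int fd;
    short events;
    short revents;
};
using pollfd_type = dummy_pollfd;
#endif

// Code written against the alias compiles either way.
inline pollfd_type make_entry(int fd)
{
    pollfd_type p{};
    p.fd = fd;
    return p;
}
```

Functions that actually need to call poll() still cannot work on such targets; the placeholder only keeps the rest of the class compilable.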

[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.

2021-04-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #40 from Richard Biener  ---
(In reply to Richard Biener from comment #39)
> (In reply to Iain Sandoe from comment #38)
> > (In reply to Richard Biener from comment #37)
> > > Oh, and FYI a cc1 cross from x86_64 to x86_64-apple-darwin19.6.0 doesn't seem
> > > to reproduce the issue with the reduced testcase (I see no call to
> > > ___UTF_8_put remaining with -O3 -fPIC -fno-strict-aliasing -fwrapv).
> > 
> > I think my interestingness test isn't strict enough - the creduced code
> > resulting doesn't have an extern for ___UTF_8_put and only seems to not
> > inline that fn because the interface has been mangled. [ so that the fn is
> > legitimately binds_localP as the pasted case ].
> > 
> > if you still have the build around, out of curiosity, does it fail on the
> > original .i file attached here?
> > 
> > and with -fno-trapping-math -fno-math-errno -fschedule-insns2
> > -fomit-frame-pointer
> > 
> > ( I only need O2 to get a fail ).
> 
> Yes, with -O2 -fno-trapping-math -fno-math-errno -fschedule-insns2
> -fomit-frame-pointer it produces the problematical
> 
> .align 4,0x90
> L945:
> movl    0(%rbp,%r10,4), %esi
> call    UTF_8_put
> movq    %r10, %rax
> addq    $1, %r10
> cmpq    %rax, %r12
> jne     L945
> 
> code.  But then ___UTF_8_put isn't interposable so I wonder why the linker
> even has to resolve anything.  Adding -fPIC OTOH should definitely make the
> symbol interposable but the same code is still generated ...
> 
> Note the 'extern' declaration shouldn't change anything, only that we
> see a definition is relevant.
> 
> breaking on darwin_binds_local_p I see ___UTF_8_put is considered binding
> local even with -fPIC.  So GCC thinks there will be no linker stub involved.
> 
> Note 'shlib' is passed as false to default_binds_local_p_3 computed as
> 
> 3140 on earlier system versions, and with a TODO to complete.  */
> 3141  bool force_overridable = TARGET_KEXTABI && DARWIN_VTABLE_P (decl);
> 3142  return default_binds_local_p_3 (decl, force_overridable /* shlib */,
> 3143  false /* weak dominate */,
> 
> and default_binds_local_p_3 would do
> 
>   /* If PIC, then assume that any global name can be overridden by
>  symbols resolved from other modules.  */
>   if (shlib)
> return false;
> 
> ix86_binds_local_p simply passes flag_shlib != 0 as this argument.

So it looks like darwin should pass

  flag_shlib != 0 || force_overridable

instead?
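
A toy model of the suggestion, to make the effect of the extra operand concrete. This is a deliberately simplified sketch and not GCC's implementation: `binds_local` and `darwin_like_binds_local` are hypothetical names, and the real default_binds_local_p_3 considers much more than the two inputs modelled here.

```cpp
#include <cassert>

// Toy model: can a reference to a global symbol be assumed to bind to the
// definition in the current module?
bool binds_local(bool defined_here, bool shlib)
{
    if (!defined_here)
        return false;
    // When building a shared library, assume any global name can be
    // overridden by symbols resolved from other modules.
    if (shlib)
        return false;
    return true;
}

// Hypothetical wrapper mirroring the suggested argument:
// flag_shlib != 0 || force_overridable.
bool darwin_like_binds_local(bool defined_here, bool flag_shlib,
                             bool force_overridable)
{
    return binds_local(defined_here, flag_shlib || force_overridable);
}
```

In this model, passing only `force_overridable` makes every locally defined global bind locally even when `flag_shlib` is set, which corresponds to the behaviour observed with -fPIC above.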
