[Bug gcov-profile/113101] compilation error with --coverage option

2023-12-21 Thread jiahaoxiang.hust at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113101

--- Comment #3 from Haoxiang Jia  ---
I tried to add the -mcmodel=large option, but the compilation error still
exists.
# g++ --coverage -mcmodel=large -o test test.cpp
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function
`gcov_write_block':
(.text+0x17): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function
`gcov_read_words':
(.text+0x59): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function `gcov_error':
(.text+0x1f6): failed to convert GOTPCREL relocation against
'__gcov_error_file'; relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function
`gcov_write_words':
(.text+0x36b): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function
`__gcov_rewrite':
(.text+0x3d7): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function `__gcov_open':
(.text+0x430): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function
`__gcov_close':
(.text+0x4e8): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function
`gcov_do_dump':
(.text+0x67c): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0x822): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x8c5): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x930): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0xa5e): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0xaad): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0xc18): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0xc75): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0xecd): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0x1377): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x1514): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0x151d): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0x154f): relocation truncated to fit: R_X86_64_PC32 against `.bss'
(.text+0x155e): additional relocation overflows omitted from the output
(.text+0x1668): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x167f): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x1686): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x16b1): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x16b8): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x1716): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x1724): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x172b): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x1783): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x1791): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
(.text+0x1798): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function
`__gcov_write_unsigned':
(.text+0x17ef): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function
`__gcov_write_counter':
(.text+0x1860): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function
`__gcov_write_tag_length':
(.text+0x18e0): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function `__gcov_seek':
(.text+0x1b0f): failed to convert GOTPCREL relocation against '__gcov_var';
relink with --no-relax
/usr/lib/gcc/x86_64-linux-gnu/11/libgcov.a(_gcov.o): in function `__gcov_exit':
(.text+0x1bd5): failed to convert GOTPCREL relocation against
'__gcov_error_file'; relink with --no-relax
collect2: error: ld returned 1 exit status

[Bug tree-optimization/113104] New: Suboptimal loop-based slp node splicing across iterations

2023-12-21 Thread fxue at os dot amperecomputing.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113104

Bug ID: 113104
   Summary: Suboptimal loop-based slp node splicing across
iterations
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fxue at os dot amperecomputing.com
  Target Milestone: ---

Given a partial vector-sized SLP node in a loop, code generation utilizes
inter-iteration parallelism to achieve full vectorization by splicing the defs
of the node from multiple iterations into one vector. This strategy is not
always good and could be refined in some situations. To be specific, we had
better not splice a node if it participates in a full-vector-sized operation;
otherwise a permute and a vextract that are not really needed would be
introduced.

Suppose the target vector size is 128 bits, and an SLP node is mapped to VEC_OP
in an iteration. Depending on whether the backend supports LO/HI versions of
the operation, there are two kinds of code sequence for splicing.

  // Isolated 2 iterations
  res_v128_I0 = VEC_OP(opnd_v64_I0, ...)   // iteration #0
  res_v128_I1 = VEC_OP(opnd_v64_I1, ...)   // iteration #1 


  // Spliced (1)
  opnd_v128_I0_I1 = { opnd_v64_I0, opnd_v64_I1 }  // extra permute
  opnd_v64_lo = [vec_unpack_lo_expr] opnd_v128_I0_I1; // extra vextract
  opnd_v64_hi = [vec_unpack_hi_expr] opnd_v128_I0_I1; // extra vextract
  res_v128_I0 = VEC_OP(opnd_v64_lo, ...)
  res_v128_I1 = VEC_OP(opnd_v64_hi, ...)

  // Spliced (2)
  opnd_v128_I0_I1 = { opnd_v64_I0, opnd_v64_I1 }  // extra permute
  res_v128_I0 = VEC_OP_LO(opnd_v128_I0_I1, ...)   // similar or same as VEC_OP
  res_v128_I1 = VEC_OP_HI(opnd_v128_I0_I1, ...)   // similar or same as VEC_OP

Sometimes such a permute and vextract can be optimized away by backend passes,
but sometimes they cannot. Here is a case on aarch64.

  int test(unsigned array[4][4]);

  int foo(unsigned short *a, unsigned long n)
  {
    unsigned array[4][4];

    for (unsigned i = 0; i < 4; i++, a += n)
      {
        array[i][0] = a[0] << 6;
        array[i][1] = a[1] << 6;
        array[i][2] = a[2] << 6;
        array[i][3] = a[3] << 6;
      }

    return test(array);
  }


// Current code generation
mov x2, x0
stp x29, x30, [sp, -80]!
add x3, x2, x1, lsl 1
lsl x1, x1, 1
mov x29, sp
add x4, x3, x1
ldr d0, [x2]
movi v30.4s, 0
add x0, sp, 16
ldr d31, [x2, x1]
ldr d29, [x3, x1]
ldr d28, [x4, x1]
ins v0.d[1], v31.d[0]        //
ins v29.d[1], v28.d[0]       //
zip1 v1.8h, v0.8h, v30.8h    // superfluous
zip2 v0.8h, v0.8h, v30.8h    //
zip1 v31.8h, v29.8h, v30.8h  //
zip2 v29.8h, v29.8h, v30.8h  //
shl v1.4s, v1.4s, 6
shl v0.4s, v0.4s, 6
shl v31.4s, v31.4s, 6
shl v29.4s, v29.4s, 6
stp q1, q0, [sp, 16]
stp q31, q29, [sp, 48]
bl  test
ldp x29, x30, [sp], 80
ret


// May be optimized to:
stp x29, x30, [sp, -80]!
mov x29, sp
mov x2, x0
add x0, sp, 16
lsl x3, x1, 1
add x1, x2, x1, lsl 1
add x4, x1, x3
ldr d31, [x2, x3]
ushll   v31.4s, v31.4h, 6
ldr d30, [x2]
ushll   v30.4s, v30.4h, 6
str q30, [sp, 16]
ldr d30, [x1, x3]
ushll   v30.4s, v30.4h, 6
str q31, [sp, 32]
ldr d31, [x4, x3]
ushll   v31.4s, v31.4h, 6
stp q30, q31, [sp, 48]
bl  test
ldp x29, x30, [sp], 80
ret
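
The two listings above differ mainly in how the 16-bit lanes are widened: zip1/zip2 against an all-zero vector followed by a 32-bit shl, versus a single ushll. A minimal scalar sketch of why these agree, assuming little-endian lane order so that each zipped pair forms a 32-bit lane with a zero upper half. The function names are hypothetical and only model one lane; this is not the vectorizer's actual logic.

```c
#include <assert.h>
#include <stdint.h>

/* Model of zip1/zip2 with a zero vector followed by a 32-bit shl:
   the 16-bit lane is paired with a zero upper half (assumption:
   little-endian lane order), then the 32-bit lane is shifted.  */
uint32_t widen_via_zip_then_shl(uint16_t lane, unsigned shift)
{
    uint16_t zero = 0;
    uint32_t zipped = (uint32_t)lane | ((uint32_t)zero << 16);
    return zipped << shift;
}

/* Model of ushll: zero-extend the 16-bit lane and shift in one step.  */
uint32_t widen_via_ushll(uint16_t lane, unsigned shift)
{
    return (uint32_t)lane << shift;
}

/* Hypothetical checker: compare both models over every 16-bit input.
   Returns 1 if they always agree, 0 otherwise.  */
int models_agree(unsigned shift)
{
    for (uint32_t v = 0; v <= UINT16_MAX; v++)
        if (widen_via_zip_then_shl((uint16_t)v, shift)
            != widen_via_ushll((uint16_t)v, shift))
            return 0;
    return 1;
}

/* Self-check for the shift amount used in the testcase.  */
void check_widen_models(void)
{
    assert(models_agree(6));
}
```

Calling check_widen_models() exercises the shift amount from the testcase; the same comparison can be repeated for any shift below 16.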

[Bug rtl-optimization/108412] RISC-V: Negative optimization of GCSE && LOOP INVARIANTS

2023-12-21 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108412

--- Comment #4 from JuzheZhong  ---
This issue is fixed when we use -mtune=sifive-u74 so it won't be a problem.

[Bug rtl-optimization/108412] RISC-V: Negative optimization of GCSE && LOOP INVARIANTS

2023-12-21 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108412

JuzheZhong  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #5 from JuzheZhong  ---
This issue is fixed when we use -mtune=sifive-u74 so it won't be a problem.

[Bug target/108271] Missed RVV cost model

2023-12-21 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108271

JuzheZhong  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from JuzheZhong  ---
This issue is fixed when we use -mtune=sifive-u74 so it won't be a problem.

[Bug tree-optimization/113105] New: Missing optimzation: fold `div(v, a) * b + rem(v, a)` to `div(v, a) * (b - a) + v`

2023-12-21 Thread xxs_chy at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113105

Bug ID: 113105
   Summary: Missing optimzation: fold `div(v, a) * b + rem(v, a)`
to `div(v, a) * (b - a) + v`
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xxs_chy at outlook dot com
  Target Milestone: ---

Godbolt example: https://godbolt.org/z/b5va37Tzx

For example:

unsigned char _bin2bcd(unsigned val)
{
    return ((val / 10) << 4) + val % 10;
}

can be folded to:

unsigned char new_bin2bcd(unsigned val)
{
    return val / 10 * 6 + val;
}

This C snippet is extracted from
https://github.com/torvalds/linux/blob/master/lib/bcd.c

Both GCC and LLVM missed it.
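
The fold follows from rem(v, a) = v - div(v, a) * a, so div(v, a) * b + rem(v, a) = div(v, a) * (b - a) + v; here a = 10 and b = 16. A minimal sketch that compares the two forms over a range of inputs. The names bin2bcd_orig, bin2bcd_folded and count_mismatches are hypothetical, chosen for illustration only.

```c
#include <assert.h>

/* Original form from the report: (val / 10) * 16 + val % 10.  */
unsigned char bin2bcd_orig(unsigned val)
{
    return (unsigned char)(((val / 10) << 4) + val % 10);
}

/* Folded form: val / 10 * (16 - 10) + val.  */
unsigned char bin2bcd_folded(unsigned val)
{
    return (unsigned char)(val / 10 * 6 + val);
}

/* Hypothetical helper: count inputs in [0, limit] where the forms differ.  */
unsigned long count_mismatches(unsigned limit)
{
    unsigned long bad = 0;
    for (unsigned v = 0; ; v++) {
        if (bin2bcd_orig(v) != bin2bcd_folded(v))
            bad++;
        if (v == limit)
            break;
    }
    return bad;
}

/* Self-check on a known value: decimal 42 encodes as BCD 0x42.  */
void check_bin2bcd(void)
{
    assert(bin2bcd_orig(42) == 0x42);
    assert(bin2bcd_folded(42) == 0x42);
}
```

Because unsigned arithmetic wraps modulo a power of two, the identity is expected to hold for every input, not only for values whose BCD encoding fits in a byte.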

[Bug c++/103183] [11/12/13/14 Regression] ind[arr] produces an lvalue when arr is an array xvalue

2023-12-21 Thread de34 at live dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103183

--- Comment #5 from Jiang An  ---
Seems fixed together by commit
r14-6753-g8dfc52a75d4d6c8be1c61b4aa831b1812b14a10e.

https://godbolt.org/z/on3K451a5

[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5

2023-12-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-12-21

--- Comment #2 from Kewen Lin  ---
Confirmed, but it needs an explicit cpu type like -mcpu=power9 for
reproduction; otherwise it could pass on power10, as that can work with pcrel
(so no toc base r2 is needed). The change can extend the end of scrubbing, so
it unexpectedly cleans the saved toc base.

I noticed that there is one macro, SPARC_STACK_BOUNDARY_HACK, which aims to
indicate this SPARC64-specific behavior. Could we leverage this macro (guarding
the biasing with it)? For example:

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 125ea158ebf..9bad1e962b4 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -5450,6 +5450,7 @@ expand_builtin_stack_address ()
   rtx ret = convert_to_mode (ptr_mode, copy_to_reg (stack_pointer_rtx),
  STACK_UNSIGNED);

+#ifdef SPARC_STACK_BOUNDARY_HACK
   /* Unbias the stack pointer, bringing it to the boundary between the
  stack area claimed by the active function calling this builtin,
  and stack ranges that could get clobbered if it called another
@@ -5476,7 +5477,9 @@ expand_builtin_stack_address ()
  (caller) function's active area as well, whereas those pushed or
  allocated temporarily for a call are regarded as part of the
  callee's stack range, rather than the caller's.  */
-  ret = plus_constant (ptr_mode, ret, STACK_POINTER_OFFSET);
+  if (SPARC_STACK_BOUNDARY_HACK)
+ret = plus_constant (ptr_mode, ret, STACK_POINTER_OFFSET);
+#endif

   return force_reg (ptr_mode, ret);
 }

[Bug rtl-optimization/113097] [14 Regression] LRA ICE building glibc for arc

2023-12-21 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113097

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug rtl-optimization/113098] [14 Regression] LRA ICE building glibc for mips

2023-12-21 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113098

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug libstdc++/113099] locale without RTTI uses dynamic_cast before gcc 13.2 or has ODR violation since gcc 13.2

2023-12-21 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113099

Richard Biener  changed:

   What|Removed |Added

   Keywords||ABI, documentation

--- Comment #1 from Richard Biener  ---
It seems to me that -fno-rtti is an ABI changing option, at least to the
runtime.  Maybe we should clarify that in its documentation?  -fno-exceptions
could possibly have similar effects if not used consistently across all TUs in
a project.

[Bug bootstrap/112534] [14 regression] build failure after r14-5424-gdb50aea6259545 using gcc 4.8.5

2023-12-21 Thread arsen at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112534

--- Comment #10 from Arsen Arsenović  ---
Created attachment 56915
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56915&action=edit
[PATCH] toplevel: don't override gettext-runtime/configure-discovered build
args

Here's a preliminary patch; I'm currently trying it on cfarm112.

[Bug tree-optimization/113104] Suboptimal loop-based slp node splicing across iterations

2023-12-21 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113104

Richard Biener  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
See my proposal on the mailing list to lift the restriction of sticking to a
single vector size, I think this is another example showing this.  If you
use BB level vectorization by disabling loop vectorization but not SLP
vectorization the code should improve?

[Bug tree-optimization/113104] Suboptimal loop-based slp node splicing across iterations

2023-12-21 Thread fxue at os dot amperecomputing.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113104

--- Comment #2 from Feng Xue  ---
(In reply to Richard Biener from comment #1)
> See my proposal on the mailing list to lift the restriction of sticking to a
> single vector size, I think this is another example showing this.  If you
> use BB level vectorization by disabling loop vectorization but not SLP
> vectorization the code should improve?

Yes, the loop is fully unrolled, and BB SLP would.

I could not find the proposal; could you share a link? Thanks.

[Bug tree-optimization/113104] Suboptimal loop-based slp node splicing across iterations

2023-12-21 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113104

--- Comment #3 from rguenther at suse dot de  ---
On Thu, 21 Dec 2023, fxue at os dot amperecomputing.com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113104
> 
> --- Comment #2 from Feng Xue  ---
> (In reply to Richard Biener from comment #1)
> > See my proposal on the mailing list to lift the restriction of sticking to a
> > single vector size, I think this is another example showing this.  If you
> > use BB level vectorization by disabling loop vectorization but not SLP
> > vectorization the code should improve?
> 
> Yes, the loop is fully unrolled, and BB SLP would.

I suspect even when the loop isn't unrolled (just increase iteration
count) the code would improve

> I could not find the proposal, would you share me a link? Thanks

https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640476.html

[Bug tree-optimization/112941] during GIMPLE pass: bitintlower ICE: in handle_operand_addr, at gimple-lower-bitint.cc:2126 (gimple-lower-bitint.cc:2134) at -O with _BitInt()

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112941

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:3d1bdbf64c2ed5be70fbff687b2927e328297b81

commit r14-6777-g3d1bdbf64c2ed5be70fbff687b2927e328297b81
Author: Jakub Jelinek 
Date:   Thu Dec 21 11:13:42 2023 +0100

lower-bitint: Avoid nested casts in muldiv/float operands [PR112941]

Multiplication/division/modulo/float operands are handled by libgcc calls
and so need to be passed as array of limbs with precision argument,
using handle_operand_addr.  That code can't deal with more than one cast,
so the following patch avoids merging those cases.
.MUL_OVERFLOW calls use the same code, but we don't actually try to merge
the operands in that case already.

2023-12-21  Jakub Jelinek  

PR tree-optimization/112941
* gimple-lower-bitint.cc (gimple_lower_bitint): Disallow merging
a cast with multiplication, division or conversion to floating
point if rhs1 of the cast is result of another single use cast in
the same bb.

* gcc.dg/bitint-56.c: New test.
* gcc.dg/bitint-57.c: New test.

[Bug sanitizer/113092] _BitInt of shift vs ubsan

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113092

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:0e7f5039c52a020c3ed5f18a2b3ee1fb42b78f62

commit r14-6778-g0e7f5039c52a020c3ed5f18a2b3ee1fb42b78f62
Author: Jakub Jelinek 
Date:   Thu Dec 21 11:14:55 2023 +0100

ubsan: Add workaround for missing bitint libubsan support for shifts
[PR113092]

libubsan still doesn't support bitints, so ubsan contains a workaround and
emits value 0 and TK_Unknown kind for those.  If shift second operand has
the large/huge _BitInt type, this results in internal errors in libubsan
though, so the following patch provides a temporary workaround for that
- in the rare case where the last operand has _BitInt type wider than
__int128 (or long long on 32-bit arches), it will pretend the shift count
has that type saturated to its range.  IMHO better than crashing in
the library.  If the value fits into the __int128 (or long long) range,
it will be printed correctly (just print that it has __int128/long long
type rather than say _BitInt(255)), if it doesn't, user will at least
know that it is a very large negative or very large positive value.

2023-12-21  Jakub Jelinek  

PR sanitizer/113092
* c-ubsan.cc (ubsan_instrument_shift): Workaround for missing
ubsan _BitInt support for the shift count.

* gcc.dg/ubsan/bitint-4.c: New test.
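
The saturation described in the commit message can be pictured with a small model. This is a sketch only: int64_t stands in for the wide _BitInt shift count and int32_t for the type libubsan can print (both are assumptions made for portability), and saturate_to_narrow is a hypothetical name, not a function from c-ubsan.cc.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins (assumptions): int64_t plays the role of the wide _BitInt
   shift count, int32_t the role of the type the library can print.
   Values outside the narrow range are clamped to its bounds.  */
int32_t saturate_to_narrow(int64_t wide)
{
    if (wide > INT32_MAX)
        return INT32_MAX;
    if (wide < INT32_MIN)
        return INT32_MIN;
    return (int32_t)wide;
}

/* Self-check: in-range values pass through, out-of-range ones clamp.  */
void check_saturation(void)
{
    assert(saturate_to_narrow(5) == 5);
    assert(saturate_to_narrow(INT64_MAX) == INT32_MAX);
    assert(saturate_to_narrow(INT64_MIN) == INT32_MIN);
}
```

In-range values are reported exactly, while out-of-range ones still show up as a very large positive or very large negative value, which matches the behavior the commit message describes.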

[Bug target/112948] gcc/config/aarch64/aarch64-early-ra.cc:1953: possible cut'n'paste error ?

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112948

--- Comment #2 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:81dfa84e35659bc8adff945e61b02bc76c4c7f1e

commit r14-6780-g81dfa84e35659bc8adff945e61b02bc76c4c7f1e
Author: Richard Sandiford 
Date:   Thu Dec 21 10:20:19 2023 +

aarch64: Fix cut-&-pasto in early RA pass [PR112948]

As the PR notes, there was a cut-&-pasto in find_strided_accesses.
I've not been able to find a testcase that shows the problem.

gcc/
PR target/112948
* config/aarch64/aarch64-early-ra.cc (find_strided_accesses): Fix
cut-&-pasto.

[Bug target/113094] [14 Regression][aarch64] ICE in extract_constrain_insn, at recog.cc:2713 since r14-6290-g9f0f7d802482a8

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113094

--- Comment #5 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:803d222e533efbc385411d4b5a2d0ec0551b9f16

commit r14-6781-g803d222e533efbc385411d4b5a2d0ec0551b9f16
Author: Richard Sandiford 
Date:   Thu Dec 21 10:20:19 2023 +

aarch64: Fix early RA handling of deleted insns [PR113094]

The testcase constructs a sequence of insns that are fully dead
and yet (due to forced options) are not removed as such.  This
triggered a case where we would emit a meaningless reload for a
to-be-deleted insn.

We can't delete the insns first because that might disrupt the
iteration ranges.  So this patch turns them into notes before
the walk and then continues to delete them properly afterwards.

gcc/
PR target/113094
* config/aarch64/aarch64-early-ra.cc (apply_allocation): Stub
out instructions that are going to be deleted before iterating
over the rest.

gcc/testsuite/
PR target/113094
* gcc.target/aarch64/pr113094.c: New test.

[Bug target/112948] gcc/config/aarch64/aarch64-early-ra.cc:1953: possible cut'n'paste error ?

2023-12-21 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112948

Richard Sandiford  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Richard Sandiford  ---
Fixed.  Thanks for the report.

[Bug target/113094] [14 Regression][aarch64] ICE in extract_constrain_insn, at recog.cc:2713 since r14-6290-g9f0f7d802482a8

2023-12-21 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113094

Richard Sandiford  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Richard Sandiford  ---
Fixed.

[Bug sanitizer/113092] _BitInt of shift vs ubsan

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113092

--- Comment #4 from Jakub Jelinek  ---
Worked around for now (till libubsan has proper _BitInt support).

[Bug c/113106] New: Missing CSE with cast to volatile

2023-12-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113106

Bug ID: 113106
   Summary: Missing CSE with cast to volatile
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcase:

--cut here--
int a;

int foo(void)
{
  return *(volatile int *) &a + a;
}
--cut here--

compiles with -O2 to:

movl    a(%rip), %eax
addl    a(%rip), %eax
ret

with more detail:

#(insn:TI 6 7 8 2 (set (reg:SI 0 ax [orig:98 _1 ] [98])
#(mem/v/c:SI (symbol_ref:DI ("a") [flags 0x2] ) [1 MEM[(volatile int *)&a]+0 S4 A32])) "vol.c":5:10 85
{*movsi_internal}
# (nil))
movl    a(%rip), %eax   # 6 [c=5 l=6]  *movsi_internal/0
#(insn 8 6 14 2 (parallel [
#(set (reg:SI 0 ax [102])
#(plus:SI (reg:SI 0 ax [orig:98 _1 ] [98])
#(mem/c:SI (symbol_ref:DI ("a") [flags 0x2] ) [1 a+0 S4 A32])))
#(clobber (reg:CC 17 flags))
#]) "vol.c":5:31 271 {*addsi_1}
# (expr_list:REG_UNUSED (reg:CC 17 flags)
#(nil)))
addl    a(%rip), %eax   # 8 [c=9 l=6]  *addsi_1/1

This may be compiled to:

movl    a(%rip), %eax
addl    %eax, %eax
ret

since only one read uses volatile.
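
At the source level, the requested code generation corresponds roughly to reusing the value of the single volatile read. A minimal sketch, assuming the reuse is permitted as the report argues; foo_single_load and same_result_for are hypothetical names added for illustration.

```c
#include <assert.h>

int a;

/* As written in the report: one volatile read and one ordinary read.  */
int foo(void)
{
    return *(volatile int *) &a + a;
}

/* Source-level picture of the requested code: keep the single volatile
   read and reuse its value in place of the ordinary read.  */
int foo_single_load(void)
{
    int t = *(volatile int *) &a;
    return t + t;
}

/* Hypothetical helper: with no concurrent writer, both forms agree.  */
int same_result_for(int value)
{
    a = value;
    return foo() == foo_single_load();
}

/* Self-check on a small value.  */
void check_single_load(void)
{
    assert(same_result_for(21));
}
```

The comparison only shows that the results match when nothing else modifies a between the reads; whether the compiler may merge the loads is the question the report raises.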

[Bug c/113106] Missing CSE with cast to volatile

2023-12-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113106

--- Comment #1 from Uroš Bizjak  ---
Perhaps related,

--cut here--
int a;

int foo(void)
{
  return *(volatile int *) &a + *(volatile int *) &a;
}
--cut here--

compiles with -O2 to:

movl    a(%rip), %eax
movl    a(%rip), %edx
addl    %edx, %eax
ret

but may be compiled to:

movl    a(%rip), %eax
addl    a(%rip), %eax
ret

(the memory read may propagate to the insn)

[Bug sanitizer/113092] _BitInt of shift vs ubsan

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113092

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #5 from Jakub Jelinek  ---
.

[Bug c/113106] Missing CSE with cast to volatile

2023-12-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113106

--- Comment #2 from Uroš Bizjak  ---
For reference, the same optimization should be applied with address spaces:

--cut here--
int __seg_gs b;

int bar(void)
{
  return *(volatile __seg_gs int *) &b + b;
}
--cut here--

the above testcase currently compiles to:

movl    %gs:b(%rip), %eax
addl    %gs:b(%rip), %eax
ret

but can be compiled to:

movl    %gs:b(%rip), %eax
addl    %eax, %eax
ret

[Bug target/113093] [14 Regression][aarch64] ICE in rtl_verify_bb_insn, at cfgrtl.cc:2796 since r14-6605-gc0911c6b357ba9

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113093

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:aca1f9d7cab3dc1a374a7dc0ec6f7a8d02d2869a

commit r14-6784-gaca1f9d7cab3dc1a374a7dc0ec6f7a8d02d2869a
Author: Alex Coplan 
Date:   Thu Dec 21 10:52:44 2023 +

aarch64: Prevent moving throwing accesses in ldp/stp pass [PR113093]

As the PR shows, there was nothing to prevent the ldp/stp pass from
trying to move throwing insns, which led to an RTL verification
failure.

This patch fixes that.

gcc/ChangeLog:

PR target/113093
* config/aarch64/aarch64-ldp-fusion.cc (latest_hazard_before):
If the insn is throwing, record the previous insn as a hazard to
prevent moving it from the end of the BB.

gcc/testsuite/ChangeLog:

PR target/113093
* gcc.dg/pr113093.c: New test.

[Bug target/113093] [14 Regression][aarch64] ICE in rtl_verify_bb_insn, at cfgrtl.cc:2796 since r14-6605-gc0911c6b357ba9

2023-12-21 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113093

Alex Coplan  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Alex Coplan  ---
Should be fixed, thanks for the report.

[Bug tree-optimization/113091] Over-estimate SLP vector-to-scalar cost for non-live pattern statement

2023-12-21 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113091

--- Comment #5 from Richard Sandiford  ---
> The issue here is that because the "outer" pattern consumes
> patt_64 = (int) patt_63 it should have adjusted _2 = (int) _1 
> stmt-to-vectorize
> as being the outer pattern root stmt for all this logic to work correctly.

I don't think it can though, at least not in general.  The final pattern
stmt has to compute the same value as the original scalar stmt.

[Bug rtl-optimization/113107] New: miss optimization of an unmerged load operation

2023-12-21 Thread absoler at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113107

Bug ID: 113107
   Summary: miss optimization of an unmerged load operation
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: absoler at smail dot nju.edu.cn
  Target Milestone: ---

Created attachment 56916
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56916&action=edit
source file

I found a redundant load introduced by -O2 optimization. This behavior is a
regression, confirmed on gcc 7.2.0, 12.2.0 and 13.2.0 (versions from 4.2 to
13.2 were tested).

Given this code:

```
#include <stdint.h>

int g_30 = 0L;
int g_56[7] = {(-1L),(-1L),(-1L),(-1L),(-1L),(-1L),(-1L)};
unsigned long long g_118 = 18446744073709551607UL;

int  func_65(unsigned  p_67);
int * func_69(int  p_74);

int g, arg;
unsigned func_1() { func_65(arg); }

int func_65(unsigned a) {
  func_69(a);
  g = g_56[4] < (g_56[5] || a);
}

int *func_69(int d) {
  if (d)
    return (void*)0;
  int64_t f[109] = {};
  --g_118;
  g_30 = 0;
  return (void*)0;
}

```

The disassembly of `func_65` is (dumped with objdump):

```
004015e0 :
func_65():
/root/myCSmith/test/output2.c:39
  4015e0:   mov    0x2a9a(%rip),%eax        # 404080  first
/root/myCSmith/test/output2.c:39 (discriminator 3)
  4015e6:   mov    $0x1,%edx
func_69():
/root/myCSmith/test/output2.c:42
  4015eb:   test   %edi,%edi
  4015ed:   jne    401607
  4015ef:   callq  4015b0
func_65():
/root/myCSmith/test/output2.c:39
  4015f4:   mov    0x2a8a(%rip),%ecx        # 404084
  4015fa:   xor    %edx,%edx
  4015fc:   mov    0x2a7e(%rip),%eax        # 404080  second
  401602:   test   %ecx,%ecx
  401604:   setne  %dl
/root/myCSmith/test/output2.c:39 (discriminator 6)
  401607:   cmp    %eax,%edx
  401609:   setg   %al
  40160c:   movzbl %al,%eax
  40160f:   mov    %eax,0x105b7(%rip)        # 411bcc
/root/myCSmith/test/output2.c:40
  401615:   retq
  401616:   nopw   %cs:0x0(%rax,%rax,1)
```

g_56[4] is loaded twice when `a` equals 0, at both 0x4015e0 and 0x4015fc.

[Bug target/113044] [14 Regression] wrong code with vector shift at -O1 since r14-5254

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113044

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||uros at gcc dot gnu.org
   Keywords|needs-bisection |
Summary|[14 Regression] wrong code with vector shift at -O1 |[14 Regression] wrong code with vector shift at -O1 since r14-5254

--- Comment #2 from Jakub Jelinek  ---
Started with r14-5254-gdced5ae64703507a7159972316a1dde48e5f7470

[Bug testsuite/113005] 'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test timeouts

2023-12-21 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113005

Thomas Schwinge  changed:

   What|Removed |Added

  Component|libfortran  |testsuite
   Last reconfirmed||2023-12-21
 Target|powerpc64le-linux-gnu   |
 CC||burnus at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Thomas Schwinge  ---
Turns out, this isn't actually specific to powerpc64le-linux-gnu, but rather
the following: my testing where I saw the timeouts was not build-tree 'make
check' testing, but instead "installed" testing (where you invoke 'runtest' on
a 'make install'ed GCC tree).  In that case, r266482 "Tweak libgomp env vars in
parallel make check (take 2)" is not in effect, that is, there's no limiting to
'OMP_NUM_THREADS=8'.

For example, manually running the '-O0' variant of
'libgomp.fortran/rwlock_1.f90' on a "big-iron" x86_64-pc-linux-gnu system:

$ grep ^model\ name < /proc/cpuinfo | uniq -c
256 model name  : AMD EPYC 7V13 64-Core Processor
$ \time env OMP_NUM_THREADS=[...] LD_LIBRARY_PATH=[...] ./rwlock_1.exe

..., I produce the following data on an idle system:

'OMP_NUM_THREADS=8':

0.16user 0.56system 0:02.36elapsed 31%CPU (0avgtext+0avgdata
4452maxresident)k
0.17user 0.54system 0:02.30elapsed 30%CPU (0avgtext+0avgdata
4532maxresident)k

'OMP_NUM_THREADS=16':

0.40user 1.03system 0:04.52elapsed 31%CPU (0avgtext+0avgdata
5832maxresident)k
0.49user 0.99system 0:04.39elapsed 33%CPU (0avgtext+0avgdata
5876maxresident)k

'OMP_NUM_THREADS=32':

0.98user 2.36system 0:09.33elapsed 35%CPU (0avgtext+0avgdata
8528maxresident)k
0.98user 2.25system 0:09.02elapsed 35%CPU (0avgtext+0avgdata
8548maxresident)k

'OMP_NUM_THREADS=64':

1.82user 5.83system 0:18.44elapsed 41%CPU (0avgtext+0avgdata
13952maxresident)k
1.54user 6.03system 0:18.22elapsed 41%CPU (0avgtext+0avgdata
13996maxresident)k

'OMP_NUM_THREADS=128':

3.71user 12.41system 0:38.02elapsed 42%CPU (0avgtext+0avgdata
24376maxresident)k
3.96user 12.52system 0:39.34elapsed 41%CPU (0avgtext+0avgdata
24476maxresident)k

'OMP_NUM_THREADS=256' (or not set, for that matter):

9.65user 25.19system 1:20.93elapsed 43%CPU (0avgtext+0avgdata
45816maxresident)k
8.99user 25.82system 1:19.40elapsed 43%CPU (0avgtext+0avgdata
45636maxresident)k

For comparison, if I remove 'LD_LIBRARY_PATH', such that the system-wide GCC 10
libraries are used, I get for the latter case:

9.28user 24.54system 1:22.09elapsed 41%CPU (0avgtext+0avgdata
45588maxresident)k
11.26user 24.51system 1:24.32elapsed 42%CPU (0avgtext+0avgdata
45712maxresident)k

..., so only a little bit of an improvement of the new "rwlock" libgfortran vs.
old "mutex" GCC 10 one, curiously.  (But supposedly that depends on the
hardware or other factors?)

Anyway: should these test cases be limiting themselves to some lower
'OMP_NUM_THREADS', for example via 'num_threads' clauses?

The powerpc64le-linux-gnu systems:

$ grep ^cpu < /proc/cpuinfo | uniq -c

160 cpu : POWER8 (raw), altivec supported

152 cpu : POWER8NVL (raw), altivec supported

128 cpu : POWER9, altivec supported

[Bug testsuite/113005] 'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test timeouts

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113005

--- Comment #2 from Jakub Jelinek  ---
Yeah, perhaps just num_threads(8) to all of those?
Why is there
total_threads = omp_get_max_threads ()
btw, when nothing uses it?
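
For illustration, the effect of such a clause can be sketched in C++ (the test
cases themselves are Fortran, so this is only an analogy). The function name
`capped_team_size` and the constant `kMaxTestThreads` are hypothetical. When
built without OpenMP support the region is skipped and the function reports a
single thread.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical cap, mirroring the num_threads(8) suggested above.
constexpr int kMaxTestThreads = 8;

// Returns the size of the team that actually ran the parallel region.
// The clause caps the team no matter how many CPUs the machine has.
int capped_team_size()
{
  int team_size = 1;  // value reported when OpenMP is not enabled
#ifdef _OPENMP
  #pragma omp parallel num_threads(kMaxTestThreads)
  {
    // One thread records the team size; the others wait at the implicit barrier.
    #pragma omp single
    team_size = omp_get_num_threads();
  }
#endif
  return team_size;
}
```

On a 256-CPU machine the returned value should stay at or below 8, which is
the point of capping inside the test rather than relying on the environment.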

[Bug testsuite/113005] 'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test timeouts

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113005

--- Comment #3 from Jakub Jelinek  ---
BTW, even in non-installed testing there is no OMP_NUM_THREADS cap if one just
uses make check and not -jN.  Or when OMP_NUM_THREADS is set in the environment
to some value.

[Bug target/113044] [14 Regression] wrong code with vector shift at -O1 since r14-5254

2023-12-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113044

Uroš Bizjak  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ubizjak at gmail dot com
 Status|NEW |ASSIGNED

--- Comment #3 from Uroš Bizjak  ---
Created attachment 56917
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56917&action=edit
Proposed patch

Patch in testing.

[Bug preprocessor/80755] __has_include_next: internal compiler error: NULL directory in find_file

2023-12-21 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80755

Lewis Hyatt  changed:

   What|Removed |Added

   Last reconfirmed||2023-12-21
 Ever confirmed|0   |1
URL||https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
 CC||lhyatt at gcc dot gnu.org
   Keywords||patch
 Status|UNCONFIRMED |NEW

--- Comment #4 from Lewis Hyatt  ---
Submitted a patch for review:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html

[Bug go/86535] FreeBSD/PowerPC64 - Building Go Frontend support for gcc 7.3.0 fails

2023-12-21 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86535

--- Comment #38 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #37 from Ian Lance Taylor  ---
> Search for this comment in the top-level configure.ac file.
>
> # Disable libgo for some systems where it is known to not work.
> # For testing, you can easily override this with --enable-libgo.

Ah, I'd missed that, being more used to the various lib*/configure.tgt
files.  The disadvantage of having this in the toplevel configure.ac is
that this file is shared with binutils-gdb.

> That said if you don't configure with --enable-languages=go then you shouldn't
> get libgo.

True.  However, I did configure with --enable-languages=all (which
expanded to c,ada,c++,d,fortran,go,lto,m2,objc,obj-c++,rust).
Unfortunately, we lack a --disable-languages=, so you need to use
--enable-languages=, which is guaranteed to
break (or rather miss new languages) as has happened to me for m2 and
rust.

It's best if GCC self-defends against configurations which are known not
to work instead of having developers run into the trap first ;-)

[Bug rtl-optimization/113107] miss optimization of an unmerged load operation

2023-12-21 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113107

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2023-12-21
   Keywords||alias, missed-optimization
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
On gimple this is a partial redundancy at most but the problem is that
func_69.part.0 can modify the contents of g_56[4], so it's not safe to CSE
without IPA.
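
The general constraint can be sketched as follows. This is a minimal C++
illustration, not a model of GCC's internals: `table`, `clobber`,
`leave_alone` and `reads_agree` are hypothetical names standing in for g_56
and the outlined callee. Seen from inside `reads_agree` alone, the callee
might store to the array, so the second read cannot reuse the first.

```cpp
// Hypothetical stand-in for g_56.
int table[7] = {-1, -1, -1, -1, -1, -1, -1};

// The callee is passed as a pointer so that, looking at reads_agree in
// isolation, nothing is known about what it writes.
using callee_t = void (*)();

void clobber() { table[4] = 42; }  // a callee that does modify table[4]
void leave_alone() {}              // a callee that does not

// Reads table[4] before and after the call; true if both reads agree.
bool reads_agree(callee_t callee)
{
  int before = table[4];
  callee();
  int after = table[4];  // must be reloaded: the call may have stored here
  return before == after;
}
```

Merging the two loads would only be valid with knowledge of the callee's
side effects, which is the interprocedural information mentioned above.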

[Bug c/113106] Missing CSE with cast to volatile

2023-12-21 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113106

Richard Biener  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
The situation with address-spaces isn't valid as we need to preserve the second
load because it's volatile.  I think we simply refuse to combine
volatile loads out of caution in the first case.

[Bug c/113106] Missing CSE with cast to volatile

2023-12-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113106

--- Comment #4 from Uroš Bizjak  ---
(In reply to Richard Biener from comment #3)
> The situation with address-spaces isn't valid as we need to preserve the
> second load because it's volatile.  I think we simply refuse to combine
> volatile loads out of caution in the first case.

int __seg_gs b;
return *(volatile __seg_gs int *) &b + b;

But the above is the same w.r.t. volatile as:

int a;
return *(volatile int *) &a + a;

?

BTW: I also checked with clang, and it creates expected code in all cases.

[Bug c/113106] Missing CSE with cast to volatile

2023-12-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113106

--- Comment #5 from Uroš Bizjak  ---
The issue in comment #2 happens in a couple of places when compiling linux
kernel (with named address spaces enabled). However, the issue is not specific
to named AS, I was just more attentive to moves from %gs: prefixed locations.

[Bug rtl-optimization/113106] Missing CSE with cast to volatile

2023-12-21 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113106

Richard Biener  changed:

   What|Removed |Added

  Component|c   |rtl-optimization
 Ever confirmed|0   |1
   Keywords||missed-optimization
   Last reconfirmed||2023-12-21
 Status|UNCONFIRMED |NEW

--- Comment #6 from Richard Biener  ---
(In reply to Uroš Bizjak from comment #4)
> (In reply to Richard Biener from comment #3)
> > The situation with address-spaces isn't valid as we need to preserve the
> > second load because it's volatile.  I think we simply refuse to combine
> > volatile loads out of caution in the first case.
> 
> int __seg_gs b;
> return *(volatile __seg_gs int *) &b + b;
> 
> But the above is the same w.r.t. volatile as:
> 
> int a;
> return *(volatile int *) &a + a;
> 
> ?
> 
> BTW: I also checked with clang, and it creates expected code in all cases.

But you don't get

   movl    %gs:b(%rip), %eax
   addl    %eax, %eax

or

   movl    b(%rip), %eax
   addl    %eax, %eax

which I think would be wrong.  The volatile access doesn't need to yield
the same value as the non-volatile one so we can't value-number them the
same.

The combine issue remains of course.  But GCC's point was always that
trying to optimize volatile is wasted time.
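
For reference, the non-address-space testcase can be written out in full
roughly as follows. This is a sketch: the function name `volatile_then_plain`
is hypothetical, and the two reads are split into separate statements only to
make the order of the accesses explicit.

```cpp
int a;  // ordinary global, as in the snippet quoted above

// One read through a volatile-qualified lvalue, then one ordinary read.
int volatile_then_plain()
{
  int first = *(volatile int *) &a;  // volatile access: must be performed
  int second = a;                    // ordinary access to the same object
  return first + second;
}
```

In a single-threaded run both reads see the same value, but the compiler may
not assume that when numbering the two loads.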

[Bug tree-optimization/113102] during GIMPLE pass: bitintlower ICE: SIGSEGV with _BitInt() at -O1 or -O2

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113102

Jakub Jelinek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-12-21
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Created attachment 56918
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56918&action=edit
gcc14-pr113102-1.patch

So far only lightly tested patch for the #c0 issue.

[Bug tree-optimization/113102] during GIMPLE pass: bitintlower ICE: SIGSEGV with _BitInt() at -O1 or -O2

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113102

--- Comment #3 from Jakub Jelinek  ---
Created attachment 56919
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56919&action=edit
gcc14-pr113102-2.patch

So far only lightly tested patch for the #c1 issue.

[Bug tree-optimization/113069] gimple-ssa-sccopy.cc:143:12: warning: private field 'curr_generation' is not used [-Wunused-private-field]

2023-12-21 Thread fkastl at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113069

--- Comment #4 from Filip Kastl  ---
It's a statement I forgot to remove. Thanks for the fix!

[Bug rtl-optimization/113097] [14 Regression] LRA ICE building glibc for arc

2023-12-21 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113097

--- Comment #1 from Vladimir Makarov  ---
Joseph, thank you for reporting this.  I've just reverted the patch causing
this.

I'll use this report for work on another version of the patch.

[Bug rtl-optimization/113098] [14 Regression] LRA ICE building glibc for mips

2023-12-21 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113098

--- Comment #1 from Vladimir Makarov  ---
The patch causing this was reverted.

[Bug rtl-optimization/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-21 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

--- Comment #15 from Vladimir Makarov  ---
The patch resulted in 2 new PRs about ICE when building glibc.  So I reverted
the patch.

I'll continue work on this PR right after the winter holidays.

[Bug target/113044] [14 Regression] wrong code with vector shift at -O1 since r14-5254

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113044

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:2766b83759a02572b7b303aae3d4b54a351f8f96

commit r14-6787-g2766b83759a02572b7b303aae3d4b54a351f8f96
Author: Uros Bizjak 
Date:   Thu Dec 21 13:50:26 2023 +0100

i386: Fix shifts with high register input operand [PR113044]

The move to the output operand should use high register input operand.

PR target/113044

gcc/ChangeLog:

* config/i386/i386.md (*ashlqi_ext_1): Move from the
high register of the input operand.
(*qi_ext_1): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113044.c: New test.

[Bug target/113044] [14 Regression] wrong code with vector shift at -O1 since r14-5254

2023-12-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113044

Uroš Bizjak  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Uroš Bizjak  ---
Fixed.

[Bug rtl-optimization/113106] Missing CSE with cast to volatile

2023-12-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113106

--- Comment #7 from Uroš Bizjak  ---
(In reply to Richard Biener from comment #6)

> > BTW: I also checked with clang, and it creates expected code in all cases.
> 
> But you don't get
> 
>    movl    %gs:b(%rip), %eax
>    addl    %eax, %eax
> 
> or
> 
>    movl    b(%rip), %eax
>    addl    %eax, %eax
> 
> which I think would be wrong.  The volatile access doesn't need to yield
> the same value as the non-volatile one so we can't value-number them the
> same.

The above is the code that clang produces for the testcases in Comment #2 and
Comment #0.

clang version 15.0.7 (Fedora 15.0.7-2.fc37)

[Bug rtl-optimization/113106] Missing CSE with cast to volatile

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113106

--- Comment #8 from Andrew Pinski  ---
(In reply to Uroš Bizjak from comment #1)
> Perhaps related,
> 
> --cut here--
> int a;
> 
> int foo(void)
> {
>   return *(volatile int *) &a + *(volatile int *) &a;
> }
> --cut here--
> 
> compiles with -O2 to:
> 
>     movl    a(%rip), %eax
>     movl    a(%rip), %edx
>     addl    %edx, %eax
>     ret
> 
> but may be compiled to:
> 
>     movl    a(%rip), %eax
>     addl    a(%rip), %eax
>     ret
> 
> (the memory read may propagate to the insn)

That is pr 3506

[Bug rtl-optimization/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-21 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

Thomas Schwinge  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113097,
   ||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113098
   Assignee|unassigned at gcc dot gnu.org  |vmakarov at gcc dot gnu.org

[Bug c++/113108] New: Internal compiler error when choosing overload for operator=

2023-12-21 Thread kyrylo.bohdanenko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113108

Bug ID: 113108
   Summary: Internal compiler error when choosing overload for
operator=
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyrylo.bohdanenko at gmail dot com
  Target Milestone: ---

The following code causes GCC internal error:

template 
struct Foo {
Foo& operator=(Foo&&) = default;
T data;
};

template 
void consume(Foo& (Foo::*)(Foo&&) ) {}

template 
void consume(Foo& (Foo::*)(Foo&&) noexcept) {}

int main() {
consume(&Foo::operator=);
}

Output:
: In substitution of 'template void consume(Foo&
(Foo::*)(Foo&&) noexcept) [with T = ]':
:26:12:   required from here
   26 | consume(&Foo::operator=);
  | ~~~^~
:26:12: internal compiler error: in nothrow_spec_p, at
cp/except.cc:1201
0x262b4bc internal_error(char const*, ...)
???:0
0xa4d583 fancy_abort(char const*, int, char const*)
???:0
0xca6a7f fn_type_unification(tree_node*, tree_node*, tree_node*, tree_node*
const*, unsigned int, tree_node*, unification_kind_t, int, conversion**, bool,
bool)
???:0
0xa7cfd9 build_new_function_call(tree_node*, vec**, int)
???:0
0xcc6c76 finish_call_expr(tree_node*, vec**, bool,
bool, int)
???:0
0xc4b8ba c_parse_file()
???:0
0xd9cda9 c_common_parse_file()
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug c++/113108] Internal compiler error when choosing overload for operator=

2023-12-21 Thread kyrylo.bohdanenko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113108

--- Comment #1 from Kyrylo Bohdanenko  ---
Compiled with -std=c++17 -O2 -Wall -Wextra -Wpedantic

[Bug c++/113108] Internal compiler error when choosing overload for operator=

2023-12-21 Thread kyrylo.bohdanenko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113108

--- Comment #2 from Kyrylo Bohdanenko  ---
Godbolt link (not the original example): https://godbolt.org/z/E1veMxcdx

[Bug c++/113108] Internal compiler error when choosing overload for operator=

2023-12-21 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113108

--- Comment #3 from Marek Polacek  ---
Seems to have started r7-4383.

[Bug c++/113108] Internal compiler error when choosing overload for operator=

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113108

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code

--- Comment #4 from Andrew Pinski  ---
(In reply to Marek Polacek from comment #3)
> Seems to have started r7-4383.

That is where it started to ICE; before that, it was rejected.

[Bug c++/113108] Internal compiler error when choosing overload pointer to member function and default'ed operator=

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113108

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-12-21
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #5 from Andrew Pinski  ---
Confirmed.

[Bug rtl-optimization/113098] [14 Regression] LRA ICE building glibc for mips

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113098

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Patch was reverted so fixed.

[Bug rtl-optimization/113097] [14 Regression] LRA ICE building glibc for arc

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113097

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Patch was reverted so fixed.

[Bug tree-optimization/113105] Missing optimzation: fold `div(v, a) * b + rem(v, a)` to `div(v, a) * (b - a) + v`

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113105

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization
   Severity|normal  |enhancement

[Bug tree-optimization/113105] Missing optimzation: fold `div(v, a) * b + rem(v, a)` to `div(v, a) * (b - a) + v`

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113105

--- Comment #1 from Jakub Jelinek  ---
When it is signed v / a * b + v % a, I think it can introduce UB which wasn't
there originally.
E.g. for v = 0, a = INT_MIN and b = 3.  So, if it isn't done just for unsigned
types, parts of it need to be done in unsigned.
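
To make the example concrete: the original expression is well defined for
those inputs (0 / INT_MIN * 3 + 0 % INT_MIN is 0), but the folded form needs
b - a, which does not fit in int. A small sketch that checks this by doing
the subtraction in a wider type; the helper name `difference_fits_in_int` is
hypothetical.

```cpp
#include <climits>

// Computes b - a in a wider type and reports whether the result fits in int.
// If it does not, evaluating 'b - a' directly in int would overflow.
bool difference_fits_in_int(int b, int a)
{
  long long wide = static_cast<long long>(b) - static_cast<long long>(a);
  return wide >= INT_MIN && wide <= INT_MAX;
}
```

With b = 3 and a = INT_MIN the difference is INT_MAX + 4 on targets with
32-bit int, so the check fails.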

[Bug tree-optimization/112941] during GIMPLE pass: bitintlower ICE: in handle_operand_addr, at gimple-lower-bitint.cc:2126 (gimple-lower-bitint.cc:2134) at -O with _BitInt()

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112941

--- Comment #11 from Jakub Jelinek  ---
Created attachment 56920
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56920&action=edit
gcc14-pr112941-thunk.patch

Untested patch for the #c6 ICE.

[Bug tree-optimization/113105] Missing optimzation: fold `div(v, a) * b + rem(v, a)` to `div(v, a) * (b - a) + v`

2023-12-21 Thread xxs_chy at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113105

--- Comment #2 from XChy  ---
(In reply to Jakub Jelinek from comment #1)
> When it is signed v / a * b + v % a, I think it can introduce UB which
> wasn't there originally.
> E.g. for v = 0, a = INT_MIN and b = 3.  So, if it isn't done just for
> unsigned types,
> parts of it need to be done in unsigned.

Yes, this fold is valid if there is no nooverflow/nowrap constraint. For those
with a nooverflow/nowrap constraint, it remains unclear to me when to fold.

For your reference, LLVM expands "v % a" to "v - (v / a) * a", and then
reassociates "(v / a) * b - (v / a) * a + v" to "(v / a) * (b - a) + v" to
solve this issue.

[Bug tree-optimization/113105] Missing optimzation: fold `div(v, a) * b + rem(v, a)` to `div(v, a) * (b - a) + v`

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113105

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
I think with int v, a, b; ... v / a * b + v % a can be simplified into
(int) (v / a * ((unsigned) b - a) + v), i.e. perform just the division in
signed and everything else in corresponding unsigned type.
Also, a question is if this is a useful optimization on targets where one
instruction can compute both v / a and v % a together, because then the
original has roughly one divmod insn, one multiplication and one addition,
compared to the divmod insn from which only division is used, subtraction,
multiplication and addition.
Of course, if b - a can fold into a constant, it is different (but
multiplication by constant is often done using shifts and additions and
multiplication by b might be cheaper than by b - a).
When v % a needs to be computed separately and especially when it is expensive,
it can be obviously a win.

From the usual GIMPLE IL rules, both forms are 4 statements so equally good,
but for the case where casts are needed, the replacement is more expensive.
So, perhaps this shouldn't be done in match.pd, but during expansion or
immediately before expansion, expanding to RTL both forms and comparing the
costs.

[Bug tree-optimization/113054] [14 regressions] ODR warnings when building new SCCP pass

2023-12-21 Thread fkastl at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113054

--- Comment #7 from Filip Kastl  ---
Thanks for the fix! Will be careful not to trigger ODR with my future patches.

(In reply to Andrew Pinski from comment #2)
> Note I also don't like how dead_stmts is a static variable either but that
> would be for another change.

How should I get rid of the static variable? Maybe wrap the copy propagation
algorithm in a class as I wrapped the SCC finding algorithm?

[Bug tree-optimization/113054] [14 regressions] ODR warnings when building new SCCP pass

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113054

--- Comment #8 from Andrew Pinski  ---
(In reply to Filip Kastl from comment #7)
> (In reply to Andrew Pinski from comment #2)
> > Note I also don't like how dead_stmts is a static variable either but that
> > would be for another change.
> 
> How should I get rid of the static variable? Maybe wrap the copy propagation
> algorithm in a class as I wrapped the SCC finding algorithm?

Yes, that would most likely be the way I would do it, and then
init_sccopy/finalize_sccopy become the constructor/destructor ...
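
A rough sketch of that shape, under loud assumptions: the class name
`copy_prop_state`, its members, and `run_once` are hypothetical, `int` stands
in for the real statement type, and std::vector stands in for whatever
container the pass actually uses. Only `dead_stmts` and the init/finalize
pairing come from the discussion above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-run state that replaces a file-scope static.
class copy_prop_state
{
public:
  // Takes over the role of an init function: set up per-run state.
  copy_prop_state() { dead_stmts.reserve(16); }

  // Takes over the role of a finalize function: release per-run state.
  ~copy_prop_state() { dead_stmts.clear(); }

  void mark_dead(int stmt_id) { dead_stmts.push_back(stmt_id); }
  std::size_t dead_count() const { return dead_stmts.size(); }

private:
  std::vector<int> dead_stmts;  // owned by the object, not a static
};

// Each run builds its own state object, so nothing leaks between runs.
std::size_t run_once(int n_dead)
{
  copy_prop_state state;
  for (int i = 0; i < n_dead; ++i)
    state.mark_dead(i);
  return state.dead_count();
}
```

Because the state lives in an object, setup and teardown are tied to its
lifetime and cannot be skipped on an early return.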

[Bug tree-optimization/113054] [14 regressions] ODR warnings when building new SCCP pass

2023-12-21 Thread fkastl at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113054

--- Comment #9 from Filip Kastl  ---
Alright. I suppose this change wouldn't be appropriate in stage 3 nor stage 4,
so I'll wait for the next stage 1 and modify sccopy to use a class.

[Bug middle-end/113040] [14 Regression] libmvec test failures

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113040

--- Comment #5 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:135bb9e37167ef70501a888bd3db195b11b37ae3

commit r14-6788-g135bb9e37167ef70501a888bd3db195b11b37ae3
Author: Andre Vieira (lists) 
Date:   Wed Dec 20 15:17:09 2023 +

omp: Fix simdclone arguments with veclen lower than simdlen [PR113040]

This patch fixes an issue introduced by:
commit ea4a3d08f11a59319df7b750a955ac613a3f438a
Author: Andre Vieira 
Date:   Wed Nov 1 17:02:41 2023 +

 omp: Reorder call for TARGET_SIMD_CLONE_ADJUST

The problem was that after this patch we no longer added multiple
arguments for vector arguments where the veclen was lower than the simdlen.

Bootstrapped and regression tested on x86_64-pc-linux-gnu and
aarch64-unknown-linux-gnu.

gcc/ChangeLog:

PR middle-end/113040
* omp-simd-clone.cc (simd_clone_adjust_argument_types): Add multiple
vector arguments where simdlen is larger than veclen.
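
As a rough illustration of the arithmetic involved (not the code in
omp-simd-clone.cc): if one vector argument carries at most veclen lanes and
the clone processes simdlen lanes, several vector arguments are needed per
original parameter. The helper name `vector_args_needed` is hypothetical, and
the sketch assumes both values are positive and that veclen divides simdlen
when it is smaller.

```cpp
// Number of vector arguments needed to carry 'simdlen' lanes when each
// vector argument holds at most 'veclen' lanes.
int vector_args_needed(int simdlen, int veclen)
{
  if (veclen >= simdlen)
    return 1;               // one vector covers every lane
  return simdlen / veclen;  // otherwise split across several vectors
}
```

For example, simdlen 8 with veclen 2 needs four vector arguments; this is the
case that lost its extra arguments after the reordering.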

[Bug middle-end/113040] [14 Regression] libmvec test failures

2023-12-21 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113040

H.J. Lu  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from H.J. Lu  ---
Fixed.

[Bug c++/70413] Class template names in anonymous namespaces are not globally unique

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70413

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:7226f825db049517b64442a40a6387513febb8f9

commit r14-6789-g7226f825db049517b64442a40a6387513febb8f9
Author: Patrick Palka 
Date:   Thu Dec 21 13:53:43 2023 -0500

c++: visibility wrt template and ptrmem targs [PR70413]

When constraining the visibility of an instantiation, we weren't
properly considering the visibility of PTRMEM_CST and TEMPLATE_DECL
template arguments.

This patch fixes this.  It turns out we don't maintain the relevant
visibility flags for alias templates (e.g. TREE_PUBLIC is never set),
so continue to ignore alias template template arguments for now.

PR c++/70413
PR c++/107906

gcc/cp/ChangeLog:

* decl2.cc (min_vis_expr_r): Handle PTRMEM_CST and TEMPLATE_DECL
other than those for alias templates.

gcc/testsuite/ChangeLog:

* g++.dg/template/linkage2.C: New test.
* g++.dg/template/linkage3.C: New test.
* g++.dg/template/linkage4.C: New test.
* g++.dg/template/linkage4a.C: New test.

[Bug c++/107906] linkage of template not taken into account

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107906

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:7226f825db049517b64442a40a6387513febb8f9

commit r14-6789-g7226f825db049517b64442a40a6387513febb8f9
Author: Patrick Palka 
Date:   Thu Dec 21 13:53:43 2023 -0500

c++: visibility wrt template and ptrmem targs [PR70413]

When constraining the visibility of an instantiation, we weren't
properly considering the visibility of PTRMEM_CST and TEMPLATE_DECL
template arguments.

This patch fixes this.  It turns out we don't maintain the relevant
visibility flags for alias templates (e.g. TREE_PUBLIC is never set),
so continue to ignore alias template template arguments for now.

PR c++/70413
PR c++/107906

gcc/cp/ChangeLog:

* decl2.cc (min_vis_expr_r): Handle PTRMEM_CST and TEMPLATE_DECL
other than those for alias templates.

gcc/testsuite/ChangeLog:

* g++.dg/template/linkage2.C: New test.
* g++.dg/template/linkage3.C: New test.
* g++.dg/template/linkage4.C: New test.
* g++.dg/template/linkage4a.C: New test.

[Bug c++/70413] Class template names in anonymous namespaces are not globally unique

2023-12-21 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70413

Patrick Palka  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #9 from Patrick Palka  ---
Fixed for GCC 14 (other than the alias template template argument case which
PR107906 tracks).

[Bug c++/107906] linkage of alias template not taken into account

2023-12-21 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107906

Patrick Palka  changed:

   What|Removed |Added

 Resolution|DUPLICATE   |---
 Status|RESOLVED|NEW
 CC||ppalka at gcc dot gnu.org
Summary|linkage of template not taken into account |linkage of alias template not taken into account

--- Comment #7 from Patrick Palka  ---
Reopening to track the specific case of alias template template argument
linkage

[Bug c++/113031] [14 Regression] ICE in cxx_fold_indirect_ref_1 starting with r14-6508

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113031

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
So fixed?

[Bug target/112470] [11/12/13/14 regression] [AARCH64] stack-protector vulnerability fixing solution impact code size and performance

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112470

--- Comment #9 from Andrew Pinski  ---
Created attachment 56921
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56921&action=edit
Simple testcase

[Bug target/113087] [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc

2023-12-21 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087

--- Comment #11 from Patrick O'Neill  ---
(In reply to Patrick O'Neill from comment #10)
> I've kicked off 2 spec runs (zvl 128 and 256) using r14-6765-g4d9e0f3f211.
> I'll let you know the results when they finish.

My terminal crashed - so these are partial results:
zvl256: 3 runtime failures
531.deepsjeng
???
???

zvl128: 1 runtime failure
527.cam4_r

If I had to guess I would say the 2 ??? fails are the existing 521/549.

[Bug c++/84542] missing -Wdeprecated-declarations on a redeclared function template

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84542

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:9a65c8ee659042babdb05ef15fea9910fa8d6e62

commit r14-6790-g9a65c8ee659042babdb05ef15fea9910fa8d6e62
Author: Patrick Palka 
Date:   Thu Dec 21 14:33:56 2023 -0500

c++: [[deprecated]] on template redecl [PR84542]

The deprecated and unavailable attributes weren't working when used on
a template redeclaration ultimately because we weren't merging the
corresponding tree flags in duplicate_decls.

PR c++/84542

gcc/cp/ChangeLog:

* decl.cc (merge_attribute_bits): Merge TREE_DEPRECATED
and TREE_UNAVAILABLE.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-deprecated-2.C: No longer XFAIL.
* g++.dg/ext/attr-unavailable-12.C: New test.
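
A minimal sketch of the pattern the commit message describes, assuming a function template whose attribute only appears on a later redeclaration. The names twice and call_twice are hypothetical and are not taken from the testsuite files listed above.

```cpp
// First declaration: no attribute yet.
template <class T> T twice(T value);

// The redeclaration adds the attribute. Per the commit message, the
// corresponding flag was previously not merged in duplicate_decls, so
// later uses of the template were not diagnosed.
template <class T> [[deprecated("use a replacement instead")]] T twice(T value);

template <class T> T twice(T value) { return value + value; }

// Hypothetical caller: with the fix, -Wdeprecated-declarations is expected
// to warn on this use of twice<int>.
int call_twice(int n) { return twice(n); }
```

The code behaves the same before and after the fix; only the diagnostic differs.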

[Bug c++/84542] missing -Wdeprecated-declarations on a redeclared function template

2023-12-21 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84542

Patrick Palka  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |ppalka at gcc dot gnu.org
             Status|NEW                           |RESOLVED
   Target Milestone|---                           |14.0
           Assignee|unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org
         Resolution|---                           |FIXED

--- Comment #8 from Patrick Palka  ---
Fixed for GCC 14.

[Bug middle-end/112951] [14 Regression] cond_copysign, cond_len_copysign optab not documented (added by r14-5285-gf30ecd8050444f)

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112951

Andrew Pinski  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
I have a simple patch.

[Bug rtl-optimization/112758] [13/14 Regression] Inconsistent Bitwise AND Operation Result between int and long long int

2023-12-21 Thread gkm at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112758

--- Comment #15 from Greg McGary  ---
I have a simple patch for this which I will submit soon. The idea is to do
nothing in expand_compound_operation() when the pattern is
(sign_extend (mem ...)).

[Bug rtl-optimization/112758] [13/14 Regression] Inconsistent Bitwise AND Operation Result between int and long long int

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112758

--- Comment #16 from Jakub Jelinek  ---
Here is what I'd propose, but I can't really test it on any
WORD_REGISTER_OPERATIONS target.

2023-12-21  Jakub Jelinek  

PR rtl-optimization/112758
* combine.cc (make_compound_operation_int): Optimize AND of a SUBREG
based on nonzero_bits of SUBREG_REG and constant mask on
WORD_REGISTER_OPERATIONS targets only if it is a zero extending
MEM load.

* gcc.c-torture/execute/pr112758.c: New test.

--- gcc/combine.cc.jj   2023-12-11 23:52:03.528513943 +0100
+++ gcc/combine.cc  2023-12-21 20:25:45.461737423 +0100
@@ -8227,12 +8227,20 @@ make_compound_operation_int (scalar_int_
  int sub_width;
  if ((REG_P (sub) || MEM_P (sub))
  && GET_MODE_PRECISION (sub_mode).is_constant (&sub_width)
- && sub_width < mode_width)
+ && sub_width < mode_width
+ && (!WORD_REGISTER_OPERATIONS
+ || sub_width >= BITS_PER_WORD
+ /* On WORD_REGISTER_OPERATIONS targets the bits
+beyond sub_mode aren't considered undefined,
+so optimize only if it is a MEM load when MEM loads
+zero extend, because then the upper bits are all zero.  */
+ || (MEM_P (sub)
+ && load_extend_op (sub_mode) == ZERO_EXTEND)))
{
  unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode);
  unsigned HOST_WIDE_INT mask;

- /* original AND constant with all the known zero bits set */
+ /* Original AND constant with all the known zero bits set.  */
  mask = UINTVAL (XEXP (x, 1)) | (~nonzero_bits (sub, sub_mode));
  if ((mask & mode_mask) == mode_mask)
{
--- gcc/testsuite/gcc.c-torture/execute/pr112758.c.jj   2023-12-21 21:01:43.780755959 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr112758.c  2023-12-21 21:01:30.521940358 +0100
@@ -0,0 +1,15 @@
+/* PR rtl-optimization/112758 */
+
+int a = -__INT_MAX__ - 1;
+
+int
+main ()
+{
+  if (-__INT_MAX__ - 1U == 0x8000ULL)
+{
+  unsigned long long b = 0x00ffULL;
+  if ((b & a) != 0x00ff8000ULL)
+   __builtin_abort ();
+}
+  return 0;
+}
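
The test above relies on an int being sign-extended when it is ANDed with an unsigned long long. The sketch below spells that conversion out with fixed-width types. The function name and the constants in the note that follows are illustrative assumptions, not values taken from the patch.

```cpp
#include <cstdint>

// Mirrors what `b & a` does when b is a 64-bit unsigned value and a is a
// 32-bit signed int: a is first sign-extended to 64 bits, then ANDed.
std::uint64_t and_with_sign_extended(std::uint64_t wide, std::int32_t narrow) {
    const std::int64_t extended = narrow;  // sign extension happens here
    return wide & static_cast<std::uint64_t>(extended);
}
```

For instance, and_with_sign_extended(0x00ffffffffffffffULL, INT32_MIN) keeps bit 31 and the sign-extended bits 32 to 55, giving 0x00ffffff80000000.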

[Bug middle-end/112951] [14 Regression] cond_copysign, cond_len_copysign optab not documented (added by r14-5285-gf30ecd8050444f)

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112951

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641270.html
           Keywords|                            |patch

--- Comment #3 from Andrew Pinski  ---
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641270.html

[Bug middle-end/112951] [14 Regression] cond_copysign, cond_len_copysign optab not documented (added by r14-5285-gf30ecd8050444f)

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112951

--- Comment #4 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:df5df10355089c9c92529c222100722cea170877

commit r14-6792-gdf5df10355089c9c92529c222100722cea170877
Author: Andrew Pinski 
Date:   Thu Dec 21 11:41:18 2023 -0800

Document cond_copysign and cond_len_copysign optabs [PR112951]

This adds the documentation for cond_copysign and cond_len_copysign optabs.
Also reorders optabs.def to follow a similar order to the one used for the
internal functions.

gcc/ChangeLog:

PR middle-end/112951
* doc/md.texi (cond_copysign): Document.
(cond_len_copysign): Likewise.
* optabs.def: Reorder cond_copysign to be before
cond_fmin. Likewise for cond_len_copysign.

Signed-off-by: Andrew Pinski 

[Bug middle-end/112951] [14 Regression] cond_copysign, cond_len_copysign optab not documented (added by r14-5285-gf30ecd8050444f)

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112951

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Andrew Pinski  ---
Fixed.

[Bug middle-end/101852] [meta-bug] some standard RTL names are not documented

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101852
Bug 101852 depends on bug 112951, which changed state.

Bug 112951 Summary: [14 Regression] cond_copysign, cond_len_copysign optab not documented (added by r14-5285-gf30ecd8050444f)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112951

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

[Bug middle-end/112581] [14 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu (generated code hangs) since r14-4661 due to reassoc not handling maybe_undefs

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112581

Andrew Pinski  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
   Last reconfirmed|2023-11-17 00:00:00           |2023-12-21
             Status|NEW                           |ASSIGNED

--- Comment #11 from Andrew Pinski  ---
I am going to take a stab at fixing this.

[Bug libstdc++/113099] locale without RTTI uses dynamic_cast before gcc 13.2 or has ODR violation since gcc 13.2

2023-12-21 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113099

--- Comment #2 from Jonathan Wakely  ---
It's mostly OK to mix code with -frtti and -fno-rtti, but sometimes it bites
you.

The crash with older releases suggests that __dynamic_cast should handle
missing RTTI gracefully and just fail, not segfault.

The ODR violation in current releases is a non-issue. The code has the same
semantics either way.

[Bug c++/106213] -Werror=deprecated-copy-dtor does not trigger warning and error

2023-12-21 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106213

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |14.0
             Status|ASSIGNED                    |RESOLVED

--- Comment #5 from Jason Merrill  ---
Fixed for GCC 14.

[Bug target/113087] [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc

2023-12-21 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087

--- Comment #12 from JuzheZhong  ---
(In reply to Patrick O'Neill from comment #11)
> (In reply to Patrick O'Neill from comment #10)
> > I've kicked off 2 spec runs (zvl 128 and 256) using r14-6765-g4d9e0f3f211.
> > I'll let you know the results when they finish.
> 
> My terminal crashed - so these are partial results:
> zvl256: 3 runtime failures
> 531.deepsjeng
> ???
> ???
> 
> zvl128: 1 runtime failure
> 527.cam4_r
> 
> If I had to guess I would say the 2 ??? fails are the existing 521/549.

You mean those 2 cases are still failing?
Do you have any idea how to locate those failures and extract them into a
simple test case?

[Bug tree-optimization/113105] Missing optimzation: fold `div(v, a) * b + rem(v, a)` to `div(v, a) * (b - a) + v`

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113105

--- Comment #4 from Jakub Jelinek  ---
So, e.g. on x86_64,
unsigned int
f1 (unsigned val)
{
  return val / 10 * 16 + val % 10;
}

unsigned int
f2 (unsigned val)
{
  return val / 10 * 6 + val;
}

unsigned int
f3 (unsigned val, unsigned a, unsigned b)
{
  return val / a * b + val % a;
}

unsigned int
f4 (unsigned val, unsigned a, unsigned b)
{
  return val / a * (b - a) + val % a;
}

unsigned int
f5 (unsigned val)
{
  return val / 93 * 127 + val % 93;
}

unsigned int
f6 (unsigned val)
{
  return val / 93 * (127 - 93) + val;
}

f2, f3 and f5 are shorter compared to f1, f4 and f6 at -O2.
With -Os, f3 is shorter than f4, while f1/f2 and f5/f6 are the same size (and
also same number of insns there, perhaps f1 better than f2 as it uses shift
rather than imul).
So this really needs to take the machine-specific expansion etc. into
account; neither form is a clear winner all the time.
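
As a sanity check on the fold named in the bug title, the sketch below compares the two forms by brute force. The identity follows from v == v / a * a + v % a, and because unsigned arithmetic wraps modulo 2^N it also holds when the intermediate products overflow. The function names and the search grid are assumptions made for illustration; the check says nothing about which form is cheaper on a given target.

```cpp
// Original form from the bug title: div(v, a) * b + rem(v, a).
unsigned div_rem_form(unsigned v, unsigned a, unsigned b) {
    return v / a * b + v % a;
}

// Folded form from the bug title: div(v, a) * (b - a) + v.
unsigned folded_form(unsigned v, unsigned a, unsigned b) {
    return v / a * (b - a) + v;
}

// Hypothetical brute-force check over small a and b, using both small
// values of v and values near the top of the unsigned range.
bool forms_agree(unsigned limit) {
    for (unsigned a = 1; a <= limit; ++a) {
        for (unsigned b = 0; b <= limit; ++b) {
            for (unsigned v = 0; v <= limit; ++v) {
                if (div_rem_form(v, a, b) != folded_form(v, a, b)) return false;
                if (div_rem_form(~v, a, b) != folded_form(~v, a, b)) return false;
            }
        }
    }
    return true;
}
```

With val = 1234, a = 10 and b = 16, both forms give 123 * 16 + 4 = 123 * 6 + 1234 = 1972.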

[Bug c++/95298] [11/12/13/14 Regression] sorry, unimplemented: mangling record_type

2023-12-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95298

--- Comment #8 from GCC Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:d26f589e61a178e898d8b247042b487287ffe121

commit r14-6797-gd26f589e61a178e898d8b247042b487287ffe121
Author: Jason Merrill 
Date:   Sat Nov 18 14:35:22 2023 -0500

c++: sizeof... mangling with alias template [PR95298]

We were getting sizeof... mangling wrong when the argument after
substitution was a pack expansion that is not a simple T..., such as
list... in variadic-mangle4.C or (A+1)... in variadic-mangle5.C.  In the
former case we ICEd; in the latter case we wrongly mangled it as sZ
.

PR c++/95298

gcc/cp/ChangeLog:

* mangle.cc (write_expression): Handle v18 sizeof... bug.
* pt.cc (tsubst_pack_expansion): Keep TREE_VEC for sizeof...
(tsubst_expr): Don't strip TREE_VEC here.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic-mangle2.C: Add non-member.
* g++.dg/cpp0x/variadic-mangle4.C: New test.
* g++.dg/cpp0x/variadic-mangle5.C: New test.
* g++.dg/cpp0x/variadic-mangle5a.C: New test.

[Bug c++/113031] [14 Regression] ICE in cxx_fold_indirect_ref_1 starting with r14-6508

2023-12-21 Thread nathanieloshead at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113031

--- Comment #6 from Nathaniel Shead  ---
Yes, fixed as far as I'm aware.

[Bug c++/113031] [14 Regression] ICE in cxx_fold_indirect_ref_1 starting with r14-6508

2023-12-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113031

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #7 from Jakub Jelinek  ---
.

[Bug middle-end/113109] New: [14 Regression] g++ EH tests fail at execution time for cris-elf after r14-6674-g4759383245ac97

2023-12-21 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113109

Bug ID: 113109
   Summary: [14 Regression] g++ EH tests fail at execution time
for cris-elf after r14-6674-g4759383245ac97
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hp at gcc dot gnu.org
CC: guojiufu at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: cris-elf

After r14-6674-g4759383245ac97, (at least) all tests that "throw", fail for
cris-elf at execution time: g++ tests as well as libstdc++ tests.  I don't see
any other clues from g++.log than execution failing for those tests.

Complete before/after example reports at
https://gcc.gnu.org/pipermail/gcc-testresults/2023-December/803815.html and
https://gcc.gnu.org/pipermail/gcc-testresults/2023-December/803816.html (for
r14-6672-g605d21f8ef1f and r14-6750-gf9be3d8faa47; same failures as 6674).

An example of a small, hopefully minimal test that fails is
gcc/testsuite/g++.old-deja/g++.mike/eh6.C.  As with seemingly all the others,
executing the test fails, and there's no output from the printf.  That
printf-statement is likely not reached, but the output *could* possibly still
be in an output-buffer (I don't remember how that works in newlib; that could
happen for glibc when execution is aborted).

I'm initially setting component to "middle-end" because that's what the commit
touched and also, I'm biased, but visiting the gcc-testresults archives I don't
see other targets fail in the same manner, so it could still be that "target"
fits better.  Further analysis will show; I'll dig a little deeper.  (Commit
author CC:ed.)

[Bug middle-end/113109] [14 Regression] g++ EH tests fail at execution time for cris-elf after r14-6674-g4759383245ac97

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113109

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0

[Bug middle-end/113109] [14 Regression] g++ EH tests fail at execution time for cris-elf after r14-6674-g4759383245ac97

2023-12-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113109

--- Comment #1 from Andrew Pinski  ---
I wonder if this is similar to what I saw years earlier, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30271#c11 . Jeff was worried about
a similar thing when he was reviewing the patch, too.

[Bug middle-end/113109] [14 Regression] g++ EH tests fail at execution time for cris-elf after r14-6674-g4759383245ac97

2023-12-21 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113109

--- Comment #2 from Hans-Peter Nilsson  ---
(In reply to Hans-Peter Nilsson from comment #0)
> That
> printf-statement is likely not reached,

Now confirmed.  The assembly output for eh6.s is identical (before/after), but
apparently the support libraries (likely the unwind machinery) are miscompiled.
