[Bug target/110709] how to handle the initialization of global struct data for position independent executable application.

2023-07-18 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110709

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #5 from Xi Ruoyao  ---
Anyway Bugzilla is not a place for asking questions.

[Bug rtl-optimization/105715] [13/14 Regression] missed RTL if-conversion with COND_EXPR change

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105715

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #5 from Richard Biener  ---
I'm testing a patch.

[Bug rtl-optimization/110587] [14 regression] 96% pr28071.c compile time regression since r14-2337-g37a231cc7594d1

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110587

--- Comment #12 from Richard Biener  ---
This code block has a rich history with many fixes for many issues :/  (I
thought of just scrapping it ...), still regno_in_use_p is badly engineered in
this context.  Of course we're quite unlucky that the return REG is in use that
much for this large BB.

In the end the reason why this code exists and also some of the fallout
observed in the history point at issues that might be worth fixing elsewhere as
well.

[Bug lto/110710] LTO linker on Windows creates an invalid Makefile

2023-07-18 Thread cz.finn.cz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110710

--- Comment #5 from Jan Nárovec  ---
We are using GNU/make (which is a superset of POSIX make; I don't know whether
that makes any difference) with SHELL=cmd.exe. If GCC intends to support only
POSIX shells, that is OK for us (we will consider using WSL) and you can close
this issue as invalid.

[Bug libgomp/110663] [OpenMP] Use 'affinity' clause for node placement for the 'task' construct

2023-07-18 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110663

Tobias Burnus  changed:

   What|Removed |Added

  Attachment #55541|0   |1
is obsolete||

--- Comment #2 from Tobias Burnus  ---
Created attachment 55566
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55566&action=edit
Partial libgomp/task.c patch, showing how this could be implemented

[Bug c/101090] incorrect -Wunused-value warning on remquo with constant values

2023-07-18 Thread vincent-gcc at vinc17 dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101090

--- Comment #2 from Vincent Lefèvre  ---
On Debian, I get a warning from GCC 9 to GCC 12 (Debian 12.3.0-6), but neither
with GCC 13 (Debian 13.1.0-8) nor with 14.0.0 20230612 (Debian 20230613-1).

So, has this bug been fixed (and where)?

[Bug c/106264] [10/11/12/13 Regression] spurious -Wunused-value on a folded frexp, modf, and remquo calls with unused result since r9-1295-g781ff3d80e88d7d0

2023-07-18 Thread vincent-gcc at vinc17 dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106264

Vincent Lefèvre  changed:

   What|Removed |Added

 CC||vincent-gcc at vinc17 dot net

--- Comment #8 from Vincent Lefèvre  ---
This seems to be the same issue as PR101090 ("incorrect -Wunused-value warning
on remquo with constant values"), which I had reported in 2021 and was present
in GCC 9 too.

[Bug c/101090] incorrect -Wunused-value warning on remquo with constant values

2023-07-18 Thread vincent-gcc at vinc17 dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101090

--- Comment #3 from Vincent Lefèvre  ---
(In reply to Vincent Lefèvre from comment #2)
> So, has this bug been fixed (and where)?

This seems to be a particular case of PR106264, which was fixed in commit
r13-1741-g40f6e5912288256ee8ac41474f2dce7b6881c111.

[Bug middle-end/110711] New: possible missed optimization for std::max with -march=znver2

2023-07-18 Thread mrks2023 at proton dot me via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711

Bug ID: 110711
   Summary: possible missed optimization for std::max with
-march=znver2
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mrks2023 at proton dot me
  Target Milestone: ---

I think I found a missed optimization involving std::max() for -march=znver2
(sorry if it was already reported, but I didn't find anything related in the
bug tracker).

I have two functions that compute the maximum element of an array:

- function k_std_max uses std::max() and is never vectorized
- function k_max uses conditional assignment and is vectorized, when the
optimization flags allow for it

The code (also https://godbolt.org/z/hW49nbqMY):

#include <algorithm>
#include <cassert>

double k_std_max(size_t n_els, double * a)
{
assert(n_els > 0);
double m = a[0];
#ifdef _OPENMP
#pragma omp simd reduction(max:m)
#endif
for (size_t i = 1; i < n_els; ++i) {
m = std::max(m, a[i]);
}

return m;
}

double k_max(size_t n_els, double * a)
{
assert(n_els > 0);
double m = a[0];
#ifdef _OPENMP
#pragma omp simd reduction(max:m)
#endif
for (size_t i = 1; i < n_els; ++i) {
m = m < a[i] ? a[i] : m;
}

return m;
}

Compiling with "-O3 -fopenmp -march=znver2 -Wall -Wextra -DNDEBUG" vectorizes
k_max:

.L19:
vmovupd ymm3, YMMWORD PTR [rax+8]
add rax, 32
vmaxpd  ymm1, ymm3, ymm1
cmp rax, rdx
jne .L19

but for k_std_max still scalar instructions are used:

.L3:
vmovsd  xmm0, QWORD PTR [rax]
add rax, 8
vmaxsd  xmm0, xmm0, xmm1
cmp rdx, rax
jne .L5

Note that I had to use -fopenmp as using only -fopenmp-simd did not vectorize
k_max.

Even when I use "-Ofast" or "-Ofast -fopenmp" instead of "-O3" k_std_max is not
vectorized:

.L3:
vmaxsd  xmm0, xmm0, QWORD PTR [rax]
add rax, 8
cmp rdx, rax
jne .L3
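
For reference, std::max(m, x) is specified to return x when m < x and m otherwise, so the two loop bodies in the report compute the same value. A minimal sketch that checks this on a small array follows; the names reduce_std_max and reduce_ternary are illustrative and not part of the report, and the OpenMP pragmas are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Reduction written with std::max, mirroring k_std_max.
double reduce_std_max(std::size_t n_els, const double *a)
{
    assert(n_els > 0);
    double m = a[0];
    for (std::size_t i = 1; i < n_els; ++i)
        m = std::max(m, a[i]);
    return m;
}

// Reduction written with a conditional, mirroring k_max.
double reduce_ternary(std::size_t n_els, const double *a)
{
    assert(n_els > 0);
    double m = a[0];
    for (std::size_t i = 1; i < n_els; ++i)
        m = m < a[i] ? a[i] : m;
    return m;
}
```

Since both forms agree element by element, any difference in the generated code comes from how the compiler handles them, not from the source semantics.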

gcc-bugs@gcc.gnu.org

2023-07-18 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110586

Jan Hubicka  changed:

   What|Removed |Added

Summary|[14 Regression] 10% fatigue2 regression on zen since r14-2369-g3a61ca1b925653 |[14 Regression] 10% fatigue2 regression on zen since r14-2369-g3a61ca1b925653 (bad LRA&scheduling)

--- Comment #4 from Jan Hubicka  ---
Aha, sphinx3 is indeed same patch.
The patch corrects profile here. It is LRA/scheduler interaction that causes
the difference

With older trunk I get:
 Performance counter stats for './b.out':

         28,536.75 msec task-clock:u              #    1.000 CPUs utilized
                 0      context-switches:u        #    0.000 /sec
                 0      cpu-migrations:u          #    0.000 /sec
               138      page-faults:u             #    4.836 /sec
   134,747,380,473      cycles:u                  #    4.722 GHz                      (83.33%)
       714,193,718      stalled-cycles-frontend:u #    0.53% frontend cycles idle     (83.33%)
         3,510,378      stalled-cycles-backend:u  #    0.00% backend cycles idle      (83.33%)
   243,176,910,654      instructions:u            #    1.80  insn per cycle
                                                  #    0.00  stalled cycles per insn  (83.33%)
    13,541,807,472      branches:u                #  474.539 M/sec                    (83.33%)
        13,829,858      branch-misses:u           #    0.10% of all branches          (83.33%)

      28.537620889 seconds time elapsed

      28.536941000 seconds user
       0.0 seconds sys

and with current trunk:
 Performance counter stats for './a.out':

          31933.51 msec task-clock:u              #    1.000 CPUs utilized
                 0      context-switches:u        #    0.000 /sec
                 0      cpu-migrations:u          #    0.000 /sec
               138      page-faults:u             #    4.321 /sec
      150448312691      cycles:u                  #    4.711 GHz                      (83.33%)
         760763745      stalled-cycles-frontend:u #    0.51% frontend cycles idle     (83.33%)
           1918238      stalled-cycles-backend:u  #    0.00% backend cycles idle      (83.33%)
      242823668283      instructions:u            #    1.61  insn per cycle
                                                  #    0.00  stalled cycles per insn  (83.34%)
       13541981288      branches:u                #  424.068 M/sec                    (83.34%)
          14583703      branch-misses:u           #    0.11% of all branches          (83.33%)

      31.933986770 seconds time elapsed

      31.933701000 seconds user
       0.0 seconds sys

So same instruction and branch count, but they execute slower. IPC goes down
from 1.8 to 1.6. Perf thinks the difference is
__perdida_m_MOD_generalized_hookes_law.constprop.0.
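
The IPC figures quoted here follow directly from the instructions:u and cycles:u counters in the two runs. A small sketch of that arithmetic is below; the helper name ipc is illustrative, and the counter values are copied from the perf stat output above.

```cpp
#include <cassert>

// Instructions per cycle computed from two perf counters.
double ipc(double instructions, double cycles)
{
    assert(cycles > 0.0);
    return instructions / cycles;
}

// Counter values copied from the two perf stat runs.
const double ipc_before = ipc(243176910654.0, 134747380473.0); // b.out, older trunk
const double ipc_after  = ipc(242823668283.0, 150448312691.0); // a.out, current trunk
```

These evaluate to roughly 1.80 and 1.61, matching the "insn per cycle" column, while the instruction counts differ by well under 1%.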

  27.45%  b.out  b.out      [.] MAIN__
  27.07%  a.out  a.out      [.] MAIN__
  21.72%  a.out  a.out      [.] __perdida_m_MOD_generalized_hookes_law.constprop.0.
  16.60%  b.out  b.out      [.] __perdida_m_MOD_generalized_hookes_law.constprop.0.
   2.22%  a.out  a.out      [.] __perdida_m_MOD_generalized_hookes_law.constprop.1.
   1.64%  b.out  b.out      [.] __perdida_m_MOD_generalized_hookes_law.constprop.1.
   1.55%  b.out  libc.so.6  [.] __memset_avx2_unaligned_erms
   1.54%  a.out  libc.so.6  [.] __memset_avx2_unaligned_erms
   0.06%  a.out  libm.so.6  [.] __sincos_fma
   0.04%  b.out  libm.so.6  [.] __sincos_fma

b.out is before patch and a.out is after. The difference seems to be relocated
load.  Before patch:

Percent│ 00401860 <__perdida_m_MOD_generalized_hookes_▒
   │ __perdida_m_MOD_generalized_hookes_law.constprop.0.is▒
  0.10 │   push %rbp  ▒
  0.02 │   mov  %r8,%rax  ▒
   │   vmovddup %xmm0,%xmm5   ▒
   │   mov  %rsp,%rbp ▒
  1.22 │   push %r15  ▒
  0.04 │   push %r14  ▒
  0.03 │   push %r13  ▒
  0.09 │   push %r12  ▒
  0.05 │   push %rbx  ▒
  0.03 │   not  

[Bug target/110170] Sub-optimal conditional jumps in conditional-swap with floating point

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110170

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
   Target Milestone|--- |14.0
 Resolution|--- |FIXED

--- Comment #16 from Richard Biener  ---
This is fixed now.

[Bug target/110170] Sub-optimal conditional jumps in conditional-swap with floating point

2023-07-18 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110170

--- Comment #17 from Hongtao.liu  ---
(In reply to Richard Biener from comment #16)
> This is fixed now.

The original issue is for sse2; my patch only fixed the misoptimization for
sse4.1.

[Bug d/110712] New: d: ICE: verify_gimple_failed (conversion of register to a different size in 'view_convert_expr')

2023-07-18 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110712

Bug ID: 110712
   Summary: d: ICE: verify_gimple_failed (conversion of register
to a different size in 'view_convert_expr')
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: d
  Assignee: ibuclaw at gdcproject dot org
  Reporter: ibuclaw at gdcproject dot org
  Target Milestone: ---

Seen with gdc-12 with --enable-checking turned on; I assume it is still the
case with mainline.

d21: error: conversion of register to a different size in ‘view_convert_expr’
VIEW_CONVERT_EXPR(ap_3(D));

# .MEM_4 = VDEF <.MEM_1(D)>
this_2(D)->ap = VIEW_CONVERT_EXPR(ap_3(D));
during GIMPLE pass: ssa
d21: internal compiler error: verify_gimple failed
0x11aad62 verify_gimple_in_cfg(function*, bool)
../../gcc/tree-cfg.cc:5561
0x1073dae execute_function_todo
../../gcc/passes.cc:2085
0x107431a execute_todo
../../gcc/passes.cc:2139

---

import gcc.builtins : va_list = __builtin_va_list;
struct S
{
this(va_list ap)
{
this.ap = ap;
}
va_list ap;
}

---

Assigning a va_list parameter (static array saturated to a pointer on x86_64)
to a field (static array on x86_64) fails miserably to pass tree checks.

[Bug middle-end/110711] possible missed optimization for std::max with -march=znver2

2023-07-18 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711

Hongtao.liu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #1 from Hongtao.liu  ---
You need to use -ffast-math, w/o it, operands order matters for floating point
max/min, they're not commutative.
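
As general background on the operand-order remark (independent of whether it applies to this report): std::max(a, b) returns b when a < b and a otherwise, and every ordered comparison with a NaN is false, so swapping the operands can change the result. A minimal sketch, with illustrative function names:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// NaN as the first operand: (nan < 1.0) is false, so the first operand
// (the NaN) is returned.
inline double max_nan_first()
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    return std::max(nan, 1.0);
}

// NaN as the second operand: (1.0 < nan) is false, so the first operand
// (1.0) is returned.
inline double max_nan_second()
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    return std::max(1.0, nan);
}
```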

[Bug middle-end/110711] possible missed optimization for std::max with -march=znver2

2023-07-18 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711

--- Comment #2 from Hongtao.liu  ---
(In reply to Hongtao.liu from comment #1)
> You need to use -ffast-math, w/o it, operands order matters for floating
> point max/min, they're not commutative.

Sorry, too fast to reply, ignore this comment.

[Bug target/106952] Missed optimization: x < y ? x : y not lowered to minss

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106952

--- Comment #4 from Richard Biener  ---
With the proposed patches for PR88540 and PR105715 I get with -O3 -msse4.1

intersection:
.LFB2:
.cfi_startproc
movss    .LC0(%rip), %xmm5
pxor     %xmm2, %xmm2
movss    (%rdi), %xmm4
movss    12(%rsi), %xmm1
movss    12(%rdi), %xmm0
divss    %xmm2, %xmm5
movss    (%rsi), %xmm3
subss    %xmm4, %xmm1
subss    %xmm4, %xmm3
pxor     %xmm4, %xmm4
mulss    %xmm0, %xmm1
mulss    %xmm0, %xmm3
movaps   %xmm1, %xmm0
cmpnltss %xmm2, %xmm0
blendvps %xmm0, %xmm1, %xmm4
movaps   %xmm3, %xmm0
cmpnltss %xmm2, %xmm0
pxor     %xmm2, %xmm2
blendvps %xmm0, %xmm3, %xmm2
movss    16(%rsi), %xmm0
minss    %xmm5, %xmm3
minss    %xmm5, %xmm1
movss    4(%rdi), %xmm5
minss    %xmm4, %xmm2
movss    16(%rdi), %xmm4
subss    %xmm5, %xmm0
maxss    %xmm3, %xmm1
movss    4(%rsi), %xmm3
subss    %xmm5, %xmm3
mulss    %xmm4, %xmm0
movss    8(%rdi), %xmm5
mulss    %xmm4, %xmm3
movaps   %xmm2, %xmm4
maxss    %xmm0, %xmm4
minss    %xmm1, %xmm0
maxss    %xmm3, %xmm2
minss    %xmm1, %xmm3
movss    8(%rsi), %xmm1
subss    %xmm5, %xmm1
maxss    %xmm3, %xmm0
movss    20(%rsi), %xmm3
minss    %xmm4, %xmm2
movss    20(%rdi), %xmm4
subss    %xmm5, %xmm3
mulss    %xmm4, %xmm1
movaps   %xmm2, %xmm5
mulss    %xmm4, %xmm3
movaps   %xmm2, %xmm4
maxss    %xmm1, %xmm4
minss    %xmm0, %xmm1
movaps   %xmm3, %xmm2
maxss    %xmm3, %xmm5
minss    %xmm0, %xmm2
minss    %xmm5, %xmm4
maxss    %xmm1, %xmm2
comiss   %xmm4, %xmm2
seta     %al
ret

there's the existing issue that RTL conditional move expansion doesn't
preserve the equality of constants for

  _33 = t2_34 < 0.0;
  _12 = _33 ? 0.0 : t2_34;

but it emits two loads from the constant pool for 0.0 here which in the x86
backend fail to be recognized as min/max.
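
For orientation, the two GIMPLE statements correspond to source of roughly the following shape. This is a sketch only: the function name is hypothetical and the parameter simply mirrors t2_34. At source level both uses of 0.0 are the same constant, and that is the equality the expansion loses.

```cpp
#include <cassert>

// Compare against 0.0, then select either 0.0 or the input: a max with zero.
float clamp_below_at_zero(float t2)
{
    return t2 < 0.0f ? 0.0f : t2;
}
```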

[Bug target/110625] [AArch64] Vect: SLP fails to vectorize a loop as the reduction_latency calculated by new costs is too large

2023-07-18 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110625

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 CC||rsandifo at gcc dot gnu.org

--- Comment #4 from rsandifo at gcc dot gnu.org ---
Sorry, didn't see this PR until now.

On:

>   general operations = 15   <-- Too large

Are you sure this is too large?  The vector code seems to be:

ldr     q31, [x3], 16
ldr     q29, [x4], -16
rev64   v31.8h, v31.8h
uxtl    v30.4s, v31.4h
uxtl2   v31.4s, v31.8h
sxtl    v27.2d, v30.2s
sxtl    v28.2d, v31.2s
sxtl2   v30.2d, v30.4s
sxtl2   v31.2d, v31.4s
scvtf   v27.2d, v27.2d
scvtf   v28.2d, v28.2d
scvtf   v30.2d, v30.2d
scvtf   v31.2d, v31.2d
fmla    v26.2d, v27.2d, v29.d[1]
fmla    v24.2d, v30.2d, v29.d[1]
fmla    v23.2d, v28.2d, v29.d[0]
fmla    v25.2d, v31.2d, v29.d[0]

Discounting the loads, we do have 15 general operations.

On the reduction latency, the:

>  /* ??? Ideally we'd do COUNT reductions in parallel, but unfortunately
>that's not yet the case.  */

is referring to the single_defuse_cycle code in vectorizable_reduction.  That's
always seemed like a misfeature to me, since it serialises a multi-vector
reduction through a single accumulator.  I guess it's finally time to opt out
of that for aarch64.

If we did opt out, then removing the “* count” should be correct for all cases.
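
As a scalar analogy for the serialisation point (not the vectorizer's code, and with illustrative function names): a reduction through one accumulator forms a single dependency chain, while several accumulators give independent chains that are only combined after the loop.

```cpp
#include <cassert>
#include <cstddef>

// One accumulator: every addition depends on the previous one.
double sum_single_accumulator(const double *a, std::size_t n)
{
    assert(a != nullptr || n == 0);
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        acc += a[i];
    return acc;
}

// Four accumulators: four independent dependency chains, combined at the end.
double sum_four_accumulators(const double *a, std::size_t n)
{
    assert(a != nullptr || n == 0);
    double acc[4] = {0.0, 0.0, 0.0, 0.0};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc[0] += a[i];
        acc[1] += a[i + 1];
        acc[2] += a[i + 2];
        acc[3] += a[i + 3];
    }
    for (; i < n; ++i)   // leftover elements
        acc[0] += a[i];
    return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```

With one chain, the loop's latency scales with the number of additions; with independent chains it is closer to that of a single addition per iteration, which is the case where dropping the "* count" factor would apply.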

[Bug libstdc++/110574] --enable-cstdio=stdio_pure is incompatible with LFS

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110574

--- Comment #7 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:5342e3cc446d9ba0017c167aa3ff9d3c08c11f0f

commit r13-7578-g5342e3cc446d9ba0017c167aa3ff9d3c08c11f0f
Author: Jonathan Wakely 
Date:   Thu Jul 6 17:10:41 2023 +0100

libstdc++: Fix --enable-cstdio=stdio_pure [PR110574]

When configured with --enable-cstdio=stdio_pure we need to consistently
use fseek and not mix seeks on the file descriptor with reads and writes
on the FILE stream.

There are also a number of bugs related to error handling and return
values, because fread and fwrite return 0 on error, not -1, and fseek
returns 0 on success, not the file offset.

libstdc++-v3/ChangeLog:

PR libstdc++/110574
* acinclude.m4 (GLIBCXX_CHECK_LFS): Check for fseeko and ftello
and define _GLIBCXX_USE_FSEEKO_FTELLO.
* config.h.in: Regenerate.
* configure: Regenerate.
* config/io/basic_file_stdio.cc (xwrite) [_GLIBCXX_USE_STDIO_PURE]:
Check for fwrite error correctly.
(__basic_file::xsgetn) [_GLIBCXX_USE_STDIO_PURE]: Check for
fread error correctly.
(get_file_offset): New function.
(__basic_file::seekoff) [_GLIBCXX_USE_STDIO_PURE]: Use
fseeko if available. Use get_file_offset instead of return value
of fseek.
(__basic_file::showmanyc): Use get_file_offset.

(cherry picked from commit 2f6bbc9a7d9a62423c576e13dc46323fe16ba5aa)
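
The return-value conventions mentioned in the commit message can be illustrated with plain C stdio calls. This is a sketch, not the libstdc++ implementation: the helper write_then_seek is hypothetical, and fseeko/ftello are left out to stay within standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: writes len bytes, seeks to pos from the start of the
// stream, and returns the resulting offset, or -1 on any failure.
long write_then_seek(std::FILE *f, const char *data, std::size_t len, long pos)
{
    assert(f != nullptr);
    // fwrite returns the number of items written; a short count signals an
    // error, and the value is never -1.
    if (std::fwrite(data, 1, len, f) != len)
        return -1;
    // fseek returns 0 on success and nonzero on failure, not the new offset.
    if (std::fseek(f, pos, SEEK_SET) != 0)
        return -1;
    // The offset has to be queried separately.
    return std::ftell(f);
}
```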

[Bug libstdc++/110542] use of allocated storage after deallocation in a constant expression: std::array of std::vector

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110542

--- Comment #9 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:a32d4a34da72087c9f9bdfe3f987b808be814cd7

commit r13-7580-ga32d4a34da72087c9f9bdfe3f987b808be814cd7
Author: Jonathan Wakely 
Date:   Tue Jul 4 16:03:45 2023 +0100

libstdc++: Fix std::__uninitialized_default_n for constant evaluation [PR110542]

libstdc++-v3/ChangeLog:

PR libstdc++/110542
* include/bits/stl_uninitialized.h (__uninitialized_default_n):
Do not use std::fill_n during constant evaluation.

(cherry picked from commit 83cae6c4b788544635a71748e1881c150f42efef)

[Bug middle-end/110713] New: Fatigue2 runs twice as fast with increased inlining limits

2023-07-18 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110713

Bug ID: 110713
   Summary: Fatigue2 runs twice as fast with increased inlining
limits
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

jh@ryzen3:~/pb11/lin/source> ~/trunk-histogram/bin/gfortran fatigue2.f90 -Ofast -march=native -fdump-tree-all-details-blocks -fdump-rtl-all-details -fdump-ipa-all-details --param max-inline-insns-auto=110 ; perf stat ./a.out >/dev/null

 Performance counter stats for './a.out':

          13937.07 msec task-clock:u              #    1.000 CPUs utilized
                 0      context-switches:u        #    0.000 /sec
                 0      cpu-migrations:u          #    0.000 /sec
               138      page-faults:u             #    9.902 /sec
       67489472294      cycles:u                  #    4.842 GHz                      (83.33%)
          38791427      stalled-cycles-frontend:u #    0.06% frontend cycles idle     (83.33%)
           2351353      stalled-cycles-backend:u  #    0.00% backend cycles idle      (83.33%)
      147268347462      instructions:u            #    2.18  insn per cycle
                                                  #    0.00  stalled cycles per insn  (83.33%)
        5705431257      branches:u                #  409.371 M/sec                    (83.35%)
          13638274      branch-misses:u           #    0.24% of all branches          (83.35%)

      13.941876147 seconds time elapsed

      13.933226000 seconds user
       0.003999000 seconds sys


jh@ryzen3:~/pb11/lin/source> ~/trunk-histogram/bin/gfortran fatigue2.f90 -Ofast -march=native -fdump-tree-all-details-blocks -fdump-rtl-all-details -fdump-ipa-all-details ; perf stat ./a.out >/dev/null

 Performance counter stats for './a.out':

          31300.68 msec task-clock:u              #    1.000 CPUs utilized
                 0      context-switches:u        #    0.000 /sec
                 0      cpu-migrations:u          #    0.000 /sec
               138      page-faults:u             #    4.409 /sec
      150619261261      cycles:u                  #    4.812 GHz                      (83.32%)
         779861463      stalled-cycles-frontend:u #    0.52% frontend cycles idle     (83.33%)
           4695025      stalled-cycles-backend:u  #    0.00% backend cycles idle      (83.34%)
      242822794319      instructions:u            #    1.61  insn per cycle
                                                  #    0.00  stalled cycles per insn  (83.34%)
       13542051898      branches:u                #  432.644 M/sec                    (83.34%)
          14587945      branch-misses:u           #    0.11% of all branches          (83.34%)

      31.301169341 seconds time elapsed

      31.296826000 seconds user
       0.003999000 seconds sys

The main difference is inlining generalized_hookes_law. While it looks quite big
at release_ssa time, after vectorization it gets loopless and inlining is a big
win.

  function generalized_hookes_law (strain_tensor, lambda, mu) result (stress_tensor)
!
!  Author:       Dr. John K. Prentice
!  Affiliation:  Quetzal Computational Associates, Inc.
!  Dates:        28 November 1997
!
!  Purpose:      Apply the generalized Hooke's law for elasticity to the strain tensor
!                (or strain rate tensor) to compute the stress tensor (or stress rate
!                tensor)
!
!
!
!  Input:
!
!    strain_tensor   [selected_real_kind(15,90), dimension(3,3)]
!                    stress tensor
!
!    lambda          [selected_real_kind(15,90)]
!                    Lame constant Lambda
!
!    mu              [selected_real_kind(15,90)]
!                    Lame constant mu
!
!  Output:
!
!    stress_tensor   [selected_real_kind(15,90), dimension(3,3)]
!                    stress tensor
!
!
!
!
!=== formal variables =
!
  real (kind = LONGreal), dimension(:,:), intent(in) :: strain_tensor
  real (kind = LONGreal), intent(in) :: lambda, mu
  real (kind = LONGreal), dimension(3,3) :: stress_tensor
!
!== internal variables 
!
  real (kind = LONGreal), dimensio

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55562|0   |1
is obsolete||

--- Comment #84 from Jakub Jelinek  ---
Created attachment 55567
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55567&action=edit
gcc14-bitint-wip.patch

Actually implemented support for switches first.  The switchlower pass already
has most of the support, so all we need to do when we detect a switch indexed
by a large/huge _BitInt is to lower it at the start of the bitintlower pass,
with a small tweak in the switchlower pass to transform jump tables from ones
indexed by large/huge _BitInt into ones indexed by unsigned long long;
switchlower never creates clusters with a range which doesn't fit into 64 bits,
which makes this possible.

[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #7 from Richard Biener  ---
The cases with constant arguments fail to be recognized by the x86 conditional
move expansion because RTL expansion makes it too difficult to see they are
equal where required.  That is emit_conditional_move forcing the constant
to two different regs via prepare_cmp_insn.

I'm testing a patch for this.

[Bug libstdc++/110708] std::format("{:%EEC %OOd}", std::chrono::system_clock::now()) should be rejected

2023-07-18 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110708

--- Comment #1 from Jonathan Wakely  ---
And similarly for %OEy and %EOy

[Bug c++/110714] New: constexpr lifetime error: base class this pointer

2023-07-18 Thread pkeir at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110714

Bug ID: 110714
   Summary: constexpr lifetime error: base class this pointer
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pkeir at outlook dot com
  Target Milestone: ---

Compiling the C++20 MRE code below fails:

struct Base
{
  constexpr virtual ~Base() {}
  constexpr Base* get_this() { return this; }
  int x;
};

struct Derived : public Base {};

constexpr bool test()
{
  Derived* pf = new Derived;
  delete pf->get_this();
  return true;
}

constexpr bool b = test();

...with the following error message:

2$ /opt/gcc-latest/bin/g++ -std=c++20 -c ce_base_alloc2.cpp 
ce_base_alloc2.cpp:17:24:   in ‘constexpr’ expansion of ‘test()’
ce_base_alloc2.cpp:13:23:   in ‘constexpr’ expansion of
‘pf->Derived::.Base::get_this()->Base::~Base()’
ce_base_alloc2.cpp:8:8: error: deallocation of storage that was not previously
allocated
8 | struct Derived : public Base {};
  |^~~

I have tried with GCC trunk (14.0.0) and also version 12.2.0.

I suspect that the this pointer in the base class is not tracking the constexpr
dynamic allocation. Clang and MSVC both compile successfully. Clang requires
the virtual destructor.

[Bug ipa/110705] [11/12 Regression] ICE at -O2 and above: in gimplify_modify_expr, at gimplify.cc:6255 (on GCC-12.x)

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110705

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2023-07-18
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
Confirmed.  Bisection of what fixed it would be useful.

[Bug ipa/110705] [11/12 Regression] ICE at -O2 and above: in gimplify_modify_expr, at gimplify.cc:6255 (on GCC-12.x)

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110705

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug d/110712] d: ICE: verify_gimple_failed (conversion of register to a different size in 'view_convert_expr')

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110712

--- Comment #1 from Richard Biener  ---
this_2(D)->ap = VIEW_CONVERT_EXPR(ap_3(D));

It looks odd, since ap_3(D) is an is_gimple_reg but a struct[1] definitely
would not be.  Maybe you are missing a dereference here?  In C
struct[1] would decay to a pointer so

 this.ap = ap;

wouldn't work (besides that va_list copying requires va_copy).
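
For the va_copy remark, a minimal C-style sketch (written as C++ here, with an illustrative function name) that duplicates a va_list with va_copy instead of assigning it:

```cpp
#include <cassert>
#include <cstdarg>

// Sums count int arguments twice, using va_copy to get an independent
// second traversal instead of assigning one va_list to another.
int sum_twice(int count, ...)
{
    assert(count >= 0);
    va_list ap;
    va_start(ap, count);

    va_list ap2;
    va_copy(ap2, ap);          // portable way to duplicate a va_list

    int total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(ap, int);
    for (int i = 0; i < count; ++i)
        total += va_arg(ap2, int);

    va_end(ap2);
    va_end(ap);
    return total;
}
```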

[Bug middle-end/110711] possible missed optimization for std::max with -march=znver2

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization, openmp

--- Comment #3 from Richard Biener  ---
Both are vectorized but somehow the OMP simd setup discards the vectorized
variant of k_std_max.  The .GOMP_SIMD_VF (simduid.3_12(D)) seems to be
statically zero?!

[Bug target/110625] [AArch64] Vect: SLP fails to vectorize a loop as the reduction_latency calculated by new costs is too large

2023-07-18 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110625

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rsandifo at gcc dot gnu.org
   Last reconfirmed||2023-07-18
 Ever confirmed|0   |1

--- Comment #5 from rsandifo at gcc dot gnu.org ---
Taking for the single_defuse_cycle part.

[Bug middle-end/110711] possible missed optimization for std::max with -march=znver2

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-07-18
 Status|UNCONFIRMED |NEW

[Bug libstdc++/110512] C++20 random access iterators run sequentially with PSTL

2023-07-18 Thread gonzalo.gadeschi at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110512

--- Comment #5 from gnzlbg  ---
The patch for this bug in libc++ has been reviewed:
https://reviews.llvm.org/D154305 

I've submitted a patch for the same issue to libstdc++: 
https://gcc.gnu.org/pipermail/libstdc++/2023-July/056266.html

[Bug c/106264] [10/11/12/13 Regression] spurious -Wunused-value on a folded frexp, modf, and remquo calls with unused result since r9-1295-g781ff3d80e88d7d0

2023-07-18 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106264

--- Comment #9 from Roger Sayle  ---
*** Bug 101090 has been marked as a duplicate of this bug. ***

[Bug c/101090] incorrect -Wunused-value warning on remquo with constant values

2023-07-18 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101090

Roger Sayle  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 CC||roger at nextmovesoftware dot com
 Status|NEW |RESOLVED

--- Comment #4 from Roger Sayle  ---
Many thanks to Vincent for spotting/confirming that his bug report is a
duplicate of PR 106264, which was fixed in GCC 13.

*** This bug has been marked as a duplicate of bug 106264 ***

[Bug c/89180] [meta-bug] bogus/missing -Wunused warnings

2023-07-18 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89180
Bug 89180 depends on bug 101090, which changed state.

Bug 101090 Summary: incorrect -Wunused-value warning on remquo with constant values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101090

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

[Bug c++/110714] constexpr lifetime error: base class this pointer

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110714

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-07-18

--- Comment #1 from Andrew Pinski  ---
Note this works on the trunk:
```
struct Base
{
  constexpr virtual ~Base() {}
//  constexpr Base* get_this() { return this; }
  int x;
};

struct Derived : public Base {};

constexpr bool test()
{
  Derived* pf = new Derived;
  Base* t = pf;
  delete t;
  return true;
}

constexpr bool b = test();
```

[Bug c++/110715] New: Static thread_local unique_ptrs must be defined in the same order as they were declared when using -ftest-coverage else get error 'function starts on a higher line number than it

2023-07-18 Thread obi.phil+gcc at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110715

Bug ID: 110715
   Summary: Static thread_local unique_ptrs must be defined in the
same order as they were declared when using
-ftest-coverage else get error 'function starts on a
higher line number than it ends'
   Product: gcc
   Version: 13.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: obi.phil+gcc at googlemail dot com
  Target Milestone: ---

Created attachment 55568
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55568&action=edit
Output from g++ -v --save-temps

Consider the following minimum reproducible example:

---
#include <memory>

template <class T>
class Base
{
 public:
  Base()
  {
    count = std::make_unique<int>(0);  // element type int is assumed; template arguments were lost in the archived copy
    another = std::make_unique<int>(0);
  }

  virtual ~Base() { }

  static thread_local std::unique_ptr<int> count;
  static thread_local std::unique_ptr<int> another;
};

#ifdef BREAK
template <class T> thread_local std::unique_ptr<int> Base<T>::another(nullptr);
#endif
template <class T> thread_local std::unique_ptr<int> Base<T>::count(nullptr);
#ifndef BREAK
template <class T> thread_local std::unique_ptr<int> Base<T>::another(nullptr);
#endif

class Foo : public Base<Foo>
{
 public:
  Foo() { }
  virtual ~Foo() { }
};

int main()
{
  Foo a;
  return 0;
}
---

When compiled with my usual compiler options (with either -DBREAK or without
it) it compiles fine on OEL9 x86_64 (kernel 5.14.0-162.6.1.el9_1.x86_64).
However when compiled with -ftest-coverage (and -DBREAK) I get the error:

---
TemplateTest.cpp: In function ‘void __tls_init()’:
TemplateTest.cpp:22:54: error: function starts on a higher line number than it
ends [-Werror=coverage-invalid-line-number]
   22 | template  thread_local std::unique_ptr
Base::count(nullptr);
  |  ^~~
TemplateTest.cpp: In function ‘std::unique_ptr&
_ZTWN4BaseI3FooE5countE()’:
TemplateTest.cpp:22:54: error: function starts on a higher line number than it
ends [-Werror=coverage-invalid-line-number]
---

The newer GCC versions appear to require the static thread_local member
variables to be defined in the same order they are declared. However I've also
seen other instances when trying to create the MRE whereby using the arguments
in a different order to the 'declared' order causes the same error. I have so
far been unable to recreate these issues in an MRE however.
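
For reference, a compilable sketch of the pattern with the definitions in
declaration order. The pointee type `int` and the `Base<Foo>` base class are
assumptions made for illustration; `Base<Foo>` is at least consistent with the
mangled name `_ZTWN4BaseI3FooE5countE` in the diagnostic above. The helper
`construct_and_sum` is hypothetical and exists only so the sketch can be
exercised. Whether this ordering avoids the diagnostic on a given GCC version
is what the report claims, not something the sketch proves.

```cpp
#include <memory>

template <typename T>
class Base
{
 public:
  Base()
  {
    count = std::make_unique<int>(0);
    another = std::make_unique<int>(0);
  }

  virtual ~Base() { }

  // Declaration order: count first, then another.
  static thread_local std::unique_ptr<int> count;
  static thread_local std::unique_ptr<int> another;
};

// Definitions in the same order as the declarations above.
template <typename T> thread_local std::unique_ptr<int> Base<T>::count(nullptr);
template <typename T> thread_local std::unique_ptr<int> Base<T>::another(nullptr);

class Foo : public Base<Foo>
{
 public:
  Foo() { }
  virtual ~Foo() { }
};

// Hypothetical helper: constructs a Foo and reads both members back.
int construct_and_sum()
{
  Foo a;
  return *Base<Foo>::count + *Base<Foo>::another;
}
```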

I have validated that this occurs with GCC 13.1.1, 12.2.1, and 12.1.1. It does
not occur when using GCC 11.3.1 or 8.3.1.

The full compilation arguments I've used are:
c++ -v --save-temps -DBREAK -Werror -Wall -Wextra -fno-strict-aliasing -fwrapv
-fno-aggressive-loop-optimizations -fprofile-arcs -ftest-coverage -o a.out
TemplateTest.cpp
And I have uploaded the .ii output file it created as an attachment, but please
let me know if you want more information. Here is the output from the command
above:

---
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/opt/rh/gcc-toolset-13/root/usr/libexec/gcc/x86_64-redhat-linux/13/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,fortran,lto --prefix=/opt/rh/gcc-toolset-13/root/usr
--mandir=/opt/rh/gcc-toolset-13/root/usr/share/man
--infodir=/opt/rh/gcc-toolset-13/root/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-gcc-major-version-only --enable-libstdcxx-backtrace
--with-libstdcxx-zoneinfo=/opt/rh/gcc-toolset-13/root/usr/share/zoneinfo
--with-linker-hash-style=gnu --enable-plugin --enable-initfini-array
--without-isl --enable-offload-targets=nvptx-none --without-cuda-driver
--enable-offload-defaulted --enable-gnu-indirect-function --enable-cet
--with-tune=generic --with-arch_64=x86-64-v2 --with-arch_32=x86-64
--build=x86_64-redhat-linux --with-build-config=bootstrap-lto
--enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.1.1 20230614 (Red Hat 13.1.1-4) (GCC) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-D' 'BREAK' '-Werror' '-Wall' '-Wextra'
'-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations'
'-fprofile-arcs' '-ftest-coverage' '-o' 'a.out' '-shared-libgcc'
'-mtune=generic' '-march=x86-64-v2' '-dumpdir' 'a-'
 /opt/rh/gcc-toolset-13/r

[Bug c++/110714] constexpr lifetime error: base class this pointer

2023-07-18 Thread pkeir at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110714

--- Comment #2 from Paul Keir  ---
I know. `delete pf` also works. The issue seems to be with the use of the this
pointer within the member function. This is just the MRE - I've come across
this issue twice now in our code base.

[Bug bootstrap/110716] New: failed to build cross gcc 10.5 with host gcc 4.6.3

2023-07-18 Thread anmin_deng at yahoo dot com.tw via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110716

Bug ID: 110716
   Summary: failed to build cross gcc 10.5 with host gcc 4.6.3
   Product: gcc
   Version: 10.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: anmin_deng at yahoo dot com.tw
  Target Milestone: ---

I tried to build a cross gcc-10.5 for ARM target on a linux host/build with
gcc-4.6.3.

Previously in May, I successfully built cross gcc-10.4 for ARM target on the
same linux host/build, but now in July the very same procedure and options
failed for building gcc-10.5.

(Possibly the gcc-10.5 source added some C++ features that are not supported
by gcc-4.6.3?)

The building error messages==>
In file included from ../../gcc-10.5.0/gcc/opts-common.c:29:0:
../../gcc-10.5.0/gcc/opts-jobserver.h:33:22: error: ISO C++ forbids
initialization of member 'error_msg' [-fpermissive]
../../gcc-10.5.0/gcc/opts-jobserver.h:33:22: error: making 'error_msg' static
[-fpermissive]
../../gcc-10.5.0/gcc/opts-jobserver.h:33:22: error: invalid in-class
initialization of static data member of non-integral type 'std::string {aka
std::basic_string<char>}'
../../gcc-10.5.0/gcc/opts-jobserver.h:35:30: error: ISO C++ forbids
initialization of member 'skipped_makeflags' [-fpermissive]
../../gcc-10.5.0/gcc/opts-jobserver.h:35:30: error: making 'skipped_makeflags'
static [-fpermissive]
...
...
<==


My configure options are ==>
../gcc-10.4.0/configure LDFLAGS=--static --prefix=/home/tool_chain
--target=arm-none-eabi --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--with-sysroot=/home/tool_chain --with-gnu-as --with-gnu-ld --disable-threads
--with-stabs --disable-nls --disable-host-shared --disable-shared
--with-tls=gnu2 --with-newlib --without-headers --disable-biendian
--enable-version-specific-runtime-libs --enable-languages=c,c++
--disable-libssp --disable-libquadmath --disable-libgomp --disable-libvtv
--disable-lto --disable-add-ons --enable-target-optspace
--with-multilib-list=aprofile,rmprofile --disable-tm-clone-registry
--disable-newlib-iconv --disable-newlib-mb --disable-newlib-wide-orient
--enable-newlib-long-time_t --disable-profile --disable-nss-crypt --disable-nss
--with-host-libstdcxx=-Wl,-Bstatic,-lstdc++,-lm
--with-mpc=/home/tool_chain/prerequisites-20230504
--with-mpfr=/home/tool_chain/prerequisites-20230504
--with-gmp=/home/tool_chain/prerequisites-20230504
--with-isl=/home/tool_chain/prerequisites-20230504
<==

[Bug bootstrap/110716] failed to build cross gcc 10.5 with host gcc 4.6.3

2023-07-18 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110716

--- Comment #1 from Xi Ruoyao  ---
GCC 10 branch has been closed so this is unlikely to be fixed.

[Bug bootstrap/110716] failed to build cross gcc 10.5 with host gcc 4.6.3

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110716

--- Comment #2 from Andrew Pinski  ---
Some c++11ism slipped into the last gcc 10 release it seems. Since gcc 10.5 is
the last release of the gcc 10 series, there is not much to be done there.

Now I can't remember if gcc 11 requires c++11 compiler or not.
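
The diagnostics in the report are what a pre-C++11 compiler prints for default
member initializers. A minimal sketch of the two forms follows. The struct
names `InfoCxx11` and `InfoCxx98` and the `int rfd` member are hypothetical and
not taken from opts-jobserver.h; only the member name `error_msg` comes from
the error output.

```cpp
#include <string>

// C++11 form: default member initializers. A C++98-only compiler rejects
// this with "ISO C++ forbids initialization of member".
struct InfoCxx11
{
  std::string error_msg = "";
  int rfd = -1;
};

// C++98-compatible form: the same defaults, set in a constructor instead.
struct InfoCxx98
{
  InfoCxx98() : error_msg(""), rfd(-1) {}

  std::string error_msg;
  int rfd;
};
```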

[Bug c++/110714] constexpr lifetime error: base class this pointer

2023-07-18 Thread pkeir at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110714

--- Comment #3 from Paul Keir  ---
Actually, there's no need here to delete through the base pointer; so this is
perhaps simpler:

struct Base
{
  constexpr Base* get_this() { return this; }
  int x;
};

struct Derived : public Base {};

constexpr bool test()
{
  Derived* pf = new Derived;

  delete static_cast<Derived*>(pf->get_this());

  return true;
}

constexpr bool b = test();

[Bug bootstrap/110716] failed to build cross gcc 10.5 with host gcc 4.6.3

2023-07-18 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110716

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org

--- Comment #3 from Xi Ruoyao  ---
(In reply to Andrew Pinski from comment #2)
> Some c++11ism slipped into the last gcc 10 release it seems. Since gcc 10.5
> is the last release of the gcc 10 series, there is not much to be done there.
> 
> Now I can't remember if gcc 11 requires c++11 compiler or not.

It does, since r11-462.

[Bug c++/110535] Internal error when performing a surrogate call with unsatisfied constraints

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110535

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:1e0f37df1b12cd91a6dbb523f5c722f9a961edaa

commit r14-2618-g1e0f37df1b12cd91a6dbb523f5c722f9a961edaa
Author: Patrick Palka 
Date:   Tue Jul 18 09:21:40 2023 -0400

c++: constrained surrogate call functions [PR110535]

We weren't checking constraints on pointer/reference-to-function conversion
functions during overload resolution, which caused us to ICE on the first
testcase and reject the second testcase.

PR c++/110535

gcc/cp/ChangeLog:

* call.cc (add_conv_candidate): Check constraints.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-surrogate1.C: New test.
* g++.dg/cpp2a/concepts-surrogate2.C: New test.

[Bug bootstrap/110716] failed to build cross gcc 10.5 with host gcc 4.6.3

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110716

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #4 from Andrew Pinski  ---
Won't fix.
Build gcc 10.4.0 and then build gcc 10.5.0 and then build the cross compiler.

[Bug c++/110535] Internal error when performing a surrogate call with unsatisfied constraints

2023-07-18 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110535

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||ppalka at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot 
gnu.org
   Target Milestone|--- |13.2

--- Comment #3 from Patrick Palka  ---
Fixed on trunk so far.

[Bug target/110709] how to handle the initialization of global struct data for position independent executable application.

2023-07-18 Thread wangwen at microsoft dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110709

--- Comment #6 from wangwen at microsoft dot com ---
would anyone guide me any place to ask such question?

[Bug rtl-optimization/105715] [13/14 Regression] missed RTL if-conversion with COND_EXPR change

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105715

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:cbe5f6859a73b2acf203bd7d13f9fb245d63cbd4

commit r14-2620-gcbe5f6859a73b2acf203bd7d13f9fb245d63cbd4
Author: Richard Biener 
Date:   Tue Jul 18 10:02:52 2023 +0200

middle-end/105715 - missed RTL if-conversion with COND_EXPR expansion

When the COND_EXPR condition operand was split out to a separate stmt
it became subject to CSE with other condition evaluations.  This
unfortunately leads to TER no longer applying and in turn RTL
expansion of COND_EXPRs no longer seeing the condition and thus
failing to try conditional move expansion.  This can be seen with
gcc.target/i386/pr45685.c when built with -march=cascadelake which
then FAILs to produce the expected number of cmovs.

It can also be seen when we create more COND_EXPRs early like for
instruction selection of MIN/MAX operations that map to IEEE
a > b ? a : b expression semantics.

PR middle-end/105715
* gimple-isel.cc (gimple_expand_vec_exprs): Merge into...
(pass_gimple_isel::execute): ... this.  Duplicate
comparison defs of COND_EXPRs.
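
To illustrate what "IEEE a > b ? a : b expression semantics" means for a MAX
operation: the comparison is false whenever either operand is NaN, so the
expression yields b in that case, which differs from std::fmax. A small
sketch; the function name `max_cond` is hypothetical and this is not code from
the patch.

```cpp
#include <cmath>

// The COND_EXPR form of MAX: yields b whenever (a > b) is false,
// which includes every comparison involving NaN.
double max_cond(double a, double b)
{
  return a > b ? a : b;
}
```

So max_cond(NAN, 1.0) yields 1.0 while max_cond(1.0, NAN) yields NaN, whereas
std::fmax returns the non-NaN operand in both cases.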

[Bug rtl-optimization/105715] [13 Regression] missed RTL if-conversion with COND_EXPR change

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105715

Richard Biener  changed:

   What|Removed |Added

Summary|[13/14 Regression] missed   |[13 Regression] missed RTL
   |RTL if-conversion with  |if-conversion with
   |COND_EXPR change|COND_EXPR change
  Known to fail||13.1.0
  Known to work||14.0

--- Comment #7 from Richard Biener  ---
Fixed on trunk.

[Bug target/110170] Sub-optimal conditional jumps in conditional-swap with floating point

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110170

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|14.0|---
 Resolution|FIXED   |---
 Status|RESOLVED|REOPENED

--- Comment #18 from Richard Biener  ---
Huh, right.  Somehow I thought minss/maxss is SSE 4.1.  I do have a patch
series that fixes this, the PR88540 is missing for this but it has some fallout
still.

[Bug c++/109241] [12/13/14 Regression] ICE Segmentation fault for statement expression with a local type inside inside a generic lambda inside a generic lambda since r13-6722-gb323f52ccf966800

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109241

Richard Biener  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from Richard Biener  ---
Closing again.

[Bug libgcc/109712] [13 Regression] Segmentation fault in linear_search_fdes

2023-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #34 from Richard Biener  ---
You have until Thursday for the backport which is when we want to do 13.2 RC1

[Bug target/110709] how to handle the initialization of global struct data for position independent executable application.

2023-07-18 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110709

--- Comment #7 from Xi Ruoyao  ---
(In reply to wangwen from comment #6)
> would anyone guide me any place to ask such question?

You are building the .o files with -fpie, but have you linked the executable
with -pie?  Note that -fpie and -pie are two different options, -fpie is for
the compiler but -pie is for the linker.  W/o -pie the linker will just produce
a non-position-independent executable which can be only loaded at a fixed
address, even if you've built all .o files with -fpie.

With -pie the linker will emit some relative relocation entries in a section of
the outputted PIE.  And when you load the executable, either the loader or the
executable itself must "resolve" the relocation, i. e. read each relocation
entry and use the info recorded in the entry to fix up the addresses of global
objects.

Andrew has already explained this.  If you still don't understand, it indicates
you lack the knowledge about how PIE works in general.  Then Google will find
some nice articles explaining PIE in detail.  Any project-specific support
channel won't be proper for asking such a question about a general concept.
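
A small sketch of why load-time relocation matters for global data. The
initializers of `g_config` below are addresses of other objects, which are not
known until load time in a PIE, so the linker has to record relocation entries
for them. All names are hypothetical; the build commands in the comment use
only the -fpie and -pie options discussed above.

```cpp
// Build as a PIE (compile step, then link step); main.o stands for
// whatever object defines main:
//   g++ -fpie -c demo.cpp
//   g++ -pie demo.o main.o -o demo
struct Config
{
  const char *name;   // address of a string literal
  const int *value;   // address of another global
};

static const int g_value = 42;

// Both pointer fields hold absolute addresses, so in a PIE they must be
// fixed up by the loader (or by the program's own startup code) before use.
Config g_config = { "demo", &g_value };

int read_through_pointer()
{
  return *g_config.value;
}
```

If the relocation entries are never processed, `g_config.value` still holds a
link-time value and the read above goes through a wrong address.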

[Bug middle-end/110702] [12/13/14 Regression] Wrong code at -O1 on x86_64-linux-gnu (regression since GCC-12.2)

2023-07-18 Thread mikpelinux at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110702

Mikael Pettersson  changed:

   What|Removed |Added

 CC||mikpelinux at gmail dot com

--- Comment #2 from Mikael Pettersson  ---
Masked at -Os/O2/O3. The regression at -O1 started with

374cee99d01fceb89d0929da8b38051e6c9768f0 is the first new commit
commit 374cee99d01fceb89d0929da8b38051e6c9768f0
Author: Richard Biener 
Date:   Tue May 17 09:45:02 2022 +0200

tree-optimization/105618 - restore load sinking

[Bug bootstrap/110716] failed to build cross gcc 10.5 with host gcc 4.6.3

2023-07-18 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110716

--- Comment #5 from Xi Ruoyao  ---
Should we change install.texi?

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index e099cd0b568..dd4f74fbd78 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -231,7 +231,7 @@ Necessary to bootstrap GCC.  GCC 4.8.3 or newer has sufficient
 support for used C++11 features, with earlier GCC versions you
 might run into implementation bugs.

-Versions of GCC prior to 11 also allow bootstrapping with an ISO C++98
+Versions of GCC prior to 10.5 also allow bootstrapping with an ISO C++98
 compiler, versions of GCC prior to 4.8 also allow bootstrapping with a
 ISO C89 compiler, and versions of GCC prior to 3.4 also allow
 bootstrapping with a traditional (K&R) C compiler.

[Bug libgcc/109712] [13 Regression] Segmentation fault in linear_search_fdes

2023-07-18 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #35 from Florian Weimer  ---
Backport posted, along with the warning fix:

[PATCH releases/gcc-13 1/2] libgcc: Fix eh_frame fast path in find_fde_tail


[PATCH releases/gcc-13 2/2] libgcc: Fix -Wint-conversion warning in
find_fde_tail


[Bug rtl-optimization/71923] return instruction emitted twice with branch target inbetween

2023-07-18 Thread javier.martinez.bugzilla at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71923

Javier Martinez  changed:

   What|Removed |Added

 CC||javier.martinez.bugzilla@gm
   ||ail.com

--- Comment #2 from Javier Martinez ---
Also reproducible with:

extern void s1(void);
extern void s2(void);

void foo(int i) {
    switch (i) {
        case 1:  return s1();
        case 2:  return s1();
        case 3:  return s2();
    }
}


On Trunk and with -O2 or higher:

foo(int):
  cmp edi, 2
  jg .L2
  test edi, edi
  jle .L7
  jmp s1 #tailcall
.LVL1:
  .p2align 4,,10
  .p2align 3
.L2:
  cmp edi, 3
  jne .L8
  jmp s2 #tailcall
.LVL2:
  .p2align 4,,10
  .p2align 3
.L7:
  ret    # <--- ret
  .p2align 4,,10
  .p2align 3
.L8:
  ret    # <--- ret

[Bug rtl-optimization/110717] New: Double-word sign-extension missed-optimization

2023-07-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

Bug ID: 110717
   Summary: Double-word sign-extension missed-optimization
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

While working on _BitInt, I've noticed that we don't emit very good code at
least on x86_64 -m64/-m32 -O2 for:
#ifdef __SIZEOF_INT128__
unsigned __int128
foo (unsigned __int128 x)
{
  x <<= 59;
  return ((__int128) x) >> 59;
}
#else
unsigned long long
foo (unsigned long long x)
{
  x <<= 27;
  return ((long long) x) >> 27;
}
#endif

The sign-extension from 69 resp. 37 bits could be limited solely to the upper
word, but we uselessly shift the lower word with it as well:
        movq    %rdi, %rax
        movq    %rsi, %rdx
        shldq   $59, %rdi, %rdx
        salq    $59, %rax
        shrdq   $59, %rdx, %rax
        sarq    $59, %rdx
        ret
for -m64 and
        movl    4(%esp), %eax
        movl    8(%esp), %edx
        shldl   $27, %eax, %edx
        sall    $27, %eax
        shrdl   $27, %edx, %eax
        sarl    $27, %edx
        ret
for -m32.
LLVM emits even more horrible code for -m64, but
        movl    4(%esp), %eax
        movl    8(%esp), %edx
        shll    $27, %edx
        sarl    $27, %edx
        retl
for -m32, which looks to me like what we want.
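
The claim that only the upper word needs work can be checked with a small
sketch of the non-__int128 variant. The function names are hypothetical. The
sketch relies on GCC's documented behaviour for signed right shifts
(arithmetic) and for out-of-range unsigned-to-signed conversion (modular),
both of which are also guaranteed from C++20 on.

```cpp
#include <cstdint>

// Reference: sign-extend from 37 bits using full-width shifts, as in the
// unsigned long long variant of foo.
std::uint64_t sext37_full(std::uint64_t x)
{
  x <<= 27;
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(x) >> 27);
}

// Same result touching only the upper 32-bit word: the lower word is
// copied through, the upper word is sign-extended from its low 5 bits.
std::uint64_t sext37_upper_only(std::uint64_t x)
{
  std::uint32_t lo = static_cast<std::uint32_t>(x);
  std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
  std::int32_t hi_ext = static_cast<std::int32_t>(hi << 27) >> 27;
  return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(hi_ext)) << 32) | lo;
}
```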

[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization

2023-07-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

Jakub Jelinek  changed:

   What|Removed |Added

 CC||uros at gcc dot gnu.org
 Target||x86_64-linux
   Keywords||missed-optimization

--- Comment #1 from Jakub Jelinek  ---
Haven't tried other targets.

[Bug target/110709] how to handle the initialization of global struct data for position independent executable application.

2023-07-18 Thread wangwen at microsoft dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110709

--- Comment #8 from wangwen at microsoft dot com ---
I posted it in the wrong place, please just delete it.
thank you.

[Bug fortran/110718] New: [OpenMP] Wrong value for 'lastprivate' for zero-trip loops

2023-07-18 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110718

Bug ID: 110718
   Summary: [OpenMP] Wrong value for 'lastprivate' for zero-trip
loops
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp, wrong-code
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

In Fortran, a loop
  do i = m1, m2, m3
is a zero-trip loop if '(m2 - m1 + m3)/m3' <= 0 (the LHS is how the loop
count is calculated).

The second step for the loop initialization is:
"(2) The DO variable becomes defined with the value of the initial parameter
m1."

(For both, see e.g. F2023, "11.1.7.4.1 Loop initiation".)

Hence, the expected value of 'i' after a zero-trip loop is m1.

However, with OpenMP, this fails with zero-trip loops - having some odd value.
(The equivalent C program works.)

Example: The following program prints "-2" (instead of 3 [= n])
and, hence, fails then with "STOP 2".

implicit none
integer :: i, n, m

n = 3
m = 10
i = 99
!$omp parallel do lastprivate(i)
do i = n,m,-2
  stop 1 ! should not run
end do
print *, i
if (i /= 3) stop 2
end
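
The loop-count rule quoted above can be sketched in C++ (integer division
truncates toward zero in both languages). The function names are hypothetical;
the code assumes m3 != 0 and ignores overflow.

```cpp
#include <algorithm>

// Iteration count of "do i = m1, m2, m3": MAX((m2 - m1 + m3) / m3, 0).
int trip_count(int m1, int m2, int m3)
{
  return std::max((m2 - m1 + m3) / m3, 0);
}

// Value of the DO variable after the loop: it starts at m1 and is
// incremented by m3 once per executed iteration.
int final_do_variable(int m1, int m2, int m3)
{
  return m1 + trip_count(m1, m2, m3) * m3;
}
```

For the program above (n = 3, m = 10, step -2) the count is MAX(5 / -2, 0) = 0,
so the expected final value of i is 3.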

[Bug libstdc++/110719] New: Should chrono formatters always use std::time_put for locale's representation?

2023-07-18 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110719

Bug ID: 110719
   Summary: Should chrono formatters always use std::time_put for
locale's representation?
   Product: gcc
   Version: 13.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

auto t = std::chrono::system_clock::now();
  auto loc = std::locale::classic();
  std::cout << std::format(loc, "{:%EX %OS}\n", t);

This prints something like:

14:30:46.809059031 46

The %EX output is produced by calling std::format again with a format string
based on the locale's D_FMT, which for the C locale is something like:
%H:%M:%S. And using std::format("{:%H:%M:%S}", t) prints fractional seconds for
the %S part.

The %OS output is produced by calling std::time_put::put with the %OS format
string and a struct tm with tm_sec set to the integer number of seconds. This
doesn't print the fractional part.

If chrono::parse("%EX", t) uses std::time_get then this presents a problem for
round-tripping, as the formatted output will have fractional seconds, but the
parsed input will not consume that fractional part.

Should we consistently use std::time_put for all locale-specific output?
Alternatively, we could use time_point_cast and duration_cast to round to
seconds. None of the locale-specific formats print fractional seconds.

It would be useful to profile std::format with and without std::time_get, to
see if reusing std::format performs better. If it doesn't, using std::time_put
might be simpler.
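
The time_point_cast alternative mentioned above could look roughly like this.
A sketch only: the function name `to_whole_seconds` is hypothetical, and
time_point_cast truncates toward zero rather than rounding to nearest.

```cpp
#include <chrono>

// Drop the sub-second part of a system_clock time point so that a later
// "%S" conversion has no fractional digits to print.
std::chrono::time_point<std::chrono::system_clock, std::chrono::seconds>
to_whole_seconds(std::chrono::system_clock::time_point t)
{
  return std::chrono::time_point_cast<std::chrono::seconds>(t);
}
```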

[Bug target/110649] [14 Regression] 25% sphinx3 spec2006 regression on Ice Lake and zen between g:acaa441a98bebc52 (2023-07-06 11:36) and g:55900189ab517906 (2023-07-07 00:23)

2023-07-18 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110649

--- Comment #14 from Jan Hubicka  ---
Chasing profile update bugs out of the hottest two functions did not solve the
regression. Moreover the weekly testers confirm it was not noise on zens
either.

Before the change we get:

  34.58%  sphinx_livepret  [.] mgau_eval  ◆
  26.61%  sphinx_livepret  [.] vector_gautbl_eval_logs3   ▒
   8.94%  sphinx_livepret  [.] subvq_mgau_shortlist   ▒
   7.36%  sphinx_livepret  [.] logs3_add  ▒
   5.66%  sphinx_livepret  [.] approx_cont_mgau_frame_eval▒
   4.68%  sphinx_livepret  [.] mdef_sseq2sen_active   ▒
   3.38%  sphinx_livepret  [.] dict2pid_comsenscr ▒
   1.66%  sphinx_livepret  [.] hmm_vit_eval_3st   ▒
   0.90%  sphinx_livepret  [.] lextree_hmm_eval   ▒
   0.73%  sphinx_livepret  [.] lextree_hmm_propagate  ▒
   0.71%  sphinx_livepret  [.] lextree_enter  ▒
   0.68%  sphinx_livepret  [.] fe_fft ▒
   0.49%  sphinx_livepret  [.] dict2pid_comsseq2sen_active▒
   0.35%  sphinx_livepret  [.] lextree_ssid_active▒
   0.20%  sphinx_livepret  [.] vithist_rescore▒

So the difference seems to be in mgau_eval.
Both versions of mgau_eval have almost the same code layout. The main
difference is register allocation.  In the old version we do more spilling
around the call:

 0.01 │   and  $0xffe0,%rsp  ▒
  0.14 │   mov  %rcx,%rbx ▒
  0.00 │   sub  $0xa0,%rsp▒
  0.04 │   mov  0x10(%rdi),%rax   ▒
  0.13 │   mov  0x8(%rdi),%r15d   ▒
  0.01 │   vmovaps  %xmm3,0x80(%rsp)  ▒
  0.22 │   vmovaps  %xmm2,0x90(%rsp)  ▒
  0.03 │   mov  %rdi,0x70(%rsp)   ▒
  0.05 │   lea  (%rax,%rdx,8),%r14▒
  0.01 │   call log_to_logs3_factor   ▒
  1.00 │   test %r13,%r13 ▒
  0.00 │   vxorps   %xmm4,%xmm4,%xmm4 ▒
  0.02 │   vmovsd   %xmm0,0x78(%rsp)  ▒
  0.00 │   je   433   ▒
  0.01 │   movslq   0x0(%r13),%rax▒
  0.02 │   mov  $0xc800,%edi  ▒
  0.01 │   vmovaps  0x90(%rsp),%xmm2  ▒
  0.23 │   vmovaps  0x80(%rsp),%xmm3  ▒
  0.09 │   test %eax,%eax ▒
  0.00 │   js   3f9   ▒

The new version is missing the spill of xmm2/3:

  0.02 │   and  $0xffe0,%rsp  ▒
  0.03 │   mov  %rcx,%rbx ▒
  0.01 │   add  $0xff80,%rsp  ▒
  0.03 │   mov  0x10(%rdi),%rax   ▒
  0.16 │   mov  0x8(%rdi),%r15d   ▒
  0.06 │   mov  %rdi,0x50(%rsp)   ▒
  0.12 │   lea  (%rax,%rdx,8),%r14▒
  0.01 │   call log_to_logs3_factor   ▒
  0.75 │   test %r12,%r12 ▒
  0.00 │   vxorps   %xmm3,%xmm3,%xmm3 ▒
  0.01 │   vmovsd   %xmm0,0x58(%rsp)  ▒
  0.01 │   je   3f2   ▒
  0.01 │   movslq   (%r12),%rcx   ▒
  0.00 │   mov  $0xc800,%edi  ▒
   │   test %ecx,%ecx ▒
  0.14 │   js   3b8   ▒

Which looks better. log_to_logs3_factor just returns a constant:

Percent│ vmovsd invlogB,%xmm0  
   │ ret   

I wonder why we no longer need to spill. log_to_logs3_factor is from another
translation unit and this is a non-LTO build. Maybe there are undefined
variables.

New version does:
  0.29 │   vmovhps  %xmm4,0x70(%rsp)  ▒
  0.11 │   vmovaps  0x70(%rsp),%xmm7  ▒
and this looks odd.

[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization

2023-07-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

--- Comment #2 from Jakub Jelinek  ---
Improved testcase which shows similar behavior also with bitfields:

#ifdef __SIZEOF_INT128__
#define type __int128
#define N 59
#else
#define type long long
#define N 27
#endif

struct S { type a : sizeof (type) * __CHAR_BIT__ - N; };

unsigned type
foo (unsigned type x)
{
  x <<= N;
  return ((type) x) >> N;
}

unsigned type
bar (struct S *p)
{
  return p->a;
}

[Bug fortran/110720] New: Internal compiler error (segmentation fault) in gfc_expression_rank

2023-07-18 Thread adrien.morison at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110720

Bug ID: 110720
   Summary: Internal compiler error (segmentation fault) in
gfc_expression_rank
   Product: gcc
   Version: 13.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: adrien.morison at gmail dot com
  Target Milestone: ---

Created attachment 55569
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55569&action=edit
Minimal test case triggering the problem, foo_mod.f90 in the above command

This is with a development snapshot on commit
bef95ba085b0ae9bf3eb79a8eed685236d773116

Compiling the attached code leads to an internal compiler error. The attached
code is a self-contained example that triggers the problem, and is a stripped
down version of the actual code (which does a bunch of computations instead of
setting things to 0).

Command:
gfortran -v -save-temps -freport-bug -c foo_mod.f90

Output:
Using built-in specs.
COLLECT_GCC=gfortran
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure
--enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-bootstrap
--prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/
--with-build-config=bootstrap-lto --with-linker-hash-style=gnu
--with-system-zlib --enable-__cxa_atexit --enable-cet=auto
--enable-checking=release --enable-clocale=gnu --enable-default-pie
--enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object
--enable-libstdcxx-backtrace --enable-link-serialization=1
--enable-linker-build-id --enable-lto --enable-multilib --enable-plugin
--enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch
--disable-werror
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.1.1 20230714 (GCC)
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-freport-bug' '-c' '-mtune=generic'
'-march=x86-64'
 /usr/lib/gcc/x86_64-pc-linux-gnu/13.1.1/f951 foo_mod.f90 -quiet -dumpbase
foo_mod.f90 -dumpbase-ext .f90 -mtune=generic -march=x86-64 -version
-freport-bug -fintrinsic-modules-path
/usr/lib/gcc/x86_64-pc-linux-gnu/13.1.1/finclude
-fpre-include=/usr/include/finclude/math-vector-fortran.h -o foo_mod.s
GNU Fortran (GCC) version 13.1.1 20230714 (x86_64-pc-linux-gnu)
compiled by GNU C version 13.1.1 20230714, GMP version 6.2.1, MPFR
version 4.2.0-p9, MPC version 1.3.1, isl version isl-0.26-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
f951: internal compiler error: Segmentation fault
0x1a10354 internal_error(char const*, ...)
???:0
0x77d345 gfc_expression_rank(gfc_expr*)
???:0
0x7392cc gfc_match_expr(gfc_expr**)
???:0
0x7792e0 gfc_match_actual_arglist(int, gfc_actual_arglist**, bool)
???:0
0x784cd3 gfc_match_rvalue(gfc_expr**)
???:0
0x7392cc gfc_match_expr(gfc_expr**)
???:0
0x739778 gfc_match(char const*, ...)
???:0
0x739f29 gfc_match_assignment()
???:0
0x793ac4 gfc_parse_file()
???:0
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug libgcc/110179] unwind-dw2-fde-dip.c:406: assignment makes integer from pointer without a cast

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110179

--- Comment #5 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Florian Weimer 
:

https://gcc.gnu.org/g:6f9dfb4d759146eebf7f88ad519010ea2191bf3a

commit r13-7583-g6f9dfb4d759146eebf7f88ad519010ea2191bf3a
Author: Florian Weimer 
Date:   Tue Jul 11 06:19:39 2023 +0200

libgcc: Fix -Wint-conversion warning in find_fde_tail

Fixes commit r14-1614-g49310a99330849 ("libgcc: Fix eh_frame fast path
in find_fde_tail").

libgcc/

PR libgcc/110179
* unwind-dw2-fde-dip.c (find_fde_tail): Add cast to avoid
implicit conversion of pointer value to integer.

(cherry picked from commit 104b09005229ef48a79a33511ea192bb3ec3c415)

[Bug libgcc/109712] [13 Regression] Segmentation fault in linear_search_fdes

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #36 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Florian Weimer 
:

https://gcc.gnu.org/g:7302f8a2fa2f95252b32de2dc826591e75230662

commit r13-7582-g7302f8a2fa2f95252b32de2dc826591e75230662
Author: Florian Weimer 
Date:   Tue Jun 6 11:01:07 2023 +0200

libgcc: Fix eh_frame fast path in find_fde_tail

The eh_frame value is only used by linear_search_fdes, not the binary
search directly in find_fde_tail, so the bug is not immediately
apparent with most programs.

Fixes commit e724b0480bfa5ec04f39be8c7290330b495c59de ("libgcc:
Special-case BFD ld unwind table encodings in find_fde_tail").

libgcc/

PR libgcc/109712
* unwind-dw2-fde-dip.c (find_fde_tail): Correct fast path for
parsing eh_frame.

(cherry picked from commit 49310a993308492348119f4033e4db0bda4fe46a)

[Bug c/110721] New: Segmentation fault with '-O3 -fno-dce -fno-ipa-cp -fno-tree-dce -fno-tree-sink'

2023-07-18 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110721

Bug ID: 110721
   Summary: Segmentation fault with '-O3 -fno-dce -fno-ipa-cp
-fno-tree-dce -fno-tree-sink'
   Product: gcc
   Version: 11.4.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 19373742 at buaa dot edu.cn
  Target Milestone: ---

Created attachment 55570
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55570&action=edit
The preprocessed file

***
OS and Platform:
CentOS Linux release 7.9.2009 (Core), x86_64 GNU/Linux
***
gcc version:
gcc -v
Using built-in specs.
COLLECT_GCC=/home/new-gcc/gcc-11-0713/bin/gcc
COLLECT_LTO_WRAPPER=/home/new-gcc/gcc-11-0713/libexec/gcc/x86_64-pc-linux-gnu/11.4.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=/home/new-gcc/gcc-11-0713/
--disable-multilib --enable-languae=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.1 20230713 (GCC)
***
Command Lines:
# /home/new-gcc/gcc-11-0713/bin/gcc -I /home/csmith/include/csmith-2.3.0/ -O3
-fno-dce -fno-ipa-cp -fno-tree-dce -fno-tree-sink ecn.c -o ecn.o
#/home/new-gcc/gcc-11-0713/bin/gcc -I /home/csmith/include/csmith-2.3.0/ ecn.c
-o ecn2.o
# ./ecn.o
Segmentation fault
# ./ecn2.o
checksum = AF78526F

[Bug c/110721] Segmentation fault with '-O3 -fno-dce -fno-ipa-cp -fno-tree-dce -fno-tree-sink'

2023-07-18 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110721

--- Comment #1 from CTC <19373742 at buaa dot edu.cn> ---
Created attachment 55571
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55571&action=edit
The compiler output

[Bug libgcc/109712] [13 Regression] Segmentation fault in linear_search_fdes

2023-07-18 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

Florian Weimer  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #37 from Florian Weimer  ---
Backported to 13, the only impacted release.

[Bug gcov-profile/110561] gcov counts closing bracket in a function as executable, lowering coverage statistics

2023-07-18 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110561

--- Comment #5 from Carlos Galvez  ---
@Andrew Pinski ping in case you missed my last message. If this were a
duplicate, though, wouldn't it also happen in GCC 7.5.0?

[Bug target/110722] New: FP is Saved/Restored around inline assembly

2023-07-18 Thread palmer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110722

Bug ID: 110722
   Summary: FP is Saved/Restored around inline assembly
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: palmer at gcc dot gnu.org
  Target Milestone: ---

I'm not sure if this is some ABI-related requirement that I've managed to
forget about, but it looks like we're saving/restoring FP around inline
assembly.

long fp_asm(long arg0)
{
asm volatile ("addi a0, a0, 1" : "+r"(arg0));
return arg0;
}

produces

fp_asm(long):
        addi    sp,sp,-16
        sd      s0,8(sp)
        addi    s0,sp,16
        addi a0, a0, 1
        ld      s0,8(sp)
        addi    sp,sp,16
        jr      ra

We've got a ton of inline assembly in Linux these days and defconfig has
`-fno-omit-frame-pointer`, so this probably manifests as a performance issue
for someone somewhere -- though Clement just ran into it because he was
curious, so I don't have anything concrete.

[Bug fortran/110720] [13 Regression] Internal compiler error (segmentation fault) in gfc_expression_rank

2023-07-18 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110720

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 CC||anlauf at gcc dot gnu.org,
   ||pault at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
Summary|Internal compiler error |[13 Regression] Internal
   |(segmentation fault) in |compiler error
   |gfc_expression_rank |(segmentation fault) in
   ||gfc_expression_rank
   Last reconfirmed||2023-07-18
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109948

--- Comment #1 from anlauf at gcc dot gnu.org ---
Looks like a dup of pr109948, which is fixed on 14-trunk.

@Paul: do you think your patch (commit r14-1487-g3c2eba4) is
backportable to 13-branch?

[Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu

2023-07-18 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701

Roger Sayle  changed:

   What|Removed |Added

 CC||roger at nextmovesoftware dot com

--- Comment #5 from Roger Sayle  ---
nonzero_bits ((reg:DI 92),SImode) is returning 340, so combine (or more
specifically simplify_and_const_int_1) believes that the AND (ZERO_EXTEND)
is unnecessary.  So it's the same nonzero_bits information that allows us to
turn the XOR into IOR (in insn 16) that's incorrectly telling us the AND 340
(or AND 343, or ZERO_EXTEND) is unnecessary (in insn 17).
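
The inference at issue can be illustrated outside of GCC: if every bit that may be set in a value is already covered by a mask, ANDing with that mask cannot change the value. The following is only a sketch in plain C, not combine's actual code; `and_is_redundant` and `check_examples` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC function: given a conservative mask of
   the bits that may be nonzero in a value, report whether (value & mask)
   is guaranteed to equal value, i.e. whether the AND can be dropped.  */
bool and_is_redundant(uint64_t may_be_nonzero, uint64_t mask)
{
    return (may_be_nonzero & ~mask) == 0;
}

/* The numbers from the comment above: with nonzero bits reported as 340,
   AND 340, AND 343 and a zero-extension from 32 bits all look redundant.  */
void check_examples(void)
{
    assert(and_is_redundant(340, 340));
    assert(and_is_redundant(340, 343));
    assert(and_is_redundant(340, UINT64_C(0xffffffff)));
    assert(!and_is_redundant(0x1ff, 340));
}
```

The conclusion is only as good as the nonzero-bits information it starts from; if that information is stale or wrong, dropping the AND produces wrong code.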

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55567|0   |1
is obsolete||

--- Comment #85 from Jakub Jelinek  ---
Created attachment 55572
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55572&action=edit
gcc14-bitint-wip.patch

At least the x86-64 _BitInt psABI says that the padding bits are undefined and
the various other psABI proposals do that as well.
Though, when looking at RTL expansion, we were doing REDUCE_BIT_FIELD after
operations, meaning that we effectively relied on those bits at least for
small/middle _BitInt to be sign or zero extended.
This change tries to force sign/zero extensions when reading _BitInt from
memory, parameters etc.
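
As an illustration of what forcing an extension on read means, here is a minimal sketch in C that treats the low `bits` bits of a 64-bit container as the value and discards whatever is in the padding. It is not taken from the patch; `zext_bits` and `sext_bits` are hypothetical helper names, and the sketch assumes 1 <= bits <= 64 and a two's complement target.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low `bits` bits of `raw`, ignoring the padding bits.
   Assumes 1 <= bits <= 64.  */
uint64_t zext_bits(uint64_t raw, unsigned bits)
{
    if (bits >= 64)
        return raw;
    return raw & ((UINT64_C(1) << bits) - 1);
}

/* Sign-extend the low `bits` bits of `raw`, ignoring the padding bits.
   Assumes 1 <= bits <= 64.  */
int64_t sext_bits(uint64_t raw, unsigned bits)
{
    if (bits >= 64)
        return (int64_t)raw;
    uint64_t low = zext_bits(raw, bits);
    uint64_t sign = UINT64_C(1) << (bits - 1);
    /* Flipping the sign bit and subtracting it again propagates it
       into all higher bits.  */
    return (int64_t)((low ^ sign) - sign);
}

void check_examples(void)
{
    /* A 5-bit value stored with garbage in the upper bits.  */
    assert(zext_bits(0xFFFF, 5) == 31);
    assert(sext_bits(0xFFFF, 5) == -1);
    assert(sext_bits(0xABCF, 5) == 15);
    assert(sext_bits(0x10, 5) == -16);
}
```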

[Bug target/110649] [14 Regression] 25% sphinx3 spec2006 regression on Ice Lake and zen since g:r14-2369-g3a61ca1b925653 (2023-07-06)

2023-07-18 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110649

Jan Hubicka  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110586

--- Comment #15 from Jan Hubicka  ---
It seems that both this PR and PR110586 boil down to worse IRA and scheduling
due to corrected profile.  I wonder if the artificially increased frequency of
former bodies of vectorized loops does not suggest that IRA may take into
account that spilling in code with long latency instructions is worse than
spilling elsewhere.

[Bug c/110721] Segmentation fault with '-O3 -fno-dce -fno-ipa-cp -fno-tree-dce -fno-tree-sink'

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110721

--- Comment #2 from Andrew Pinski  ---
It seems to work on the trunk ...

[Bug fortran/110723] New: ICE with allocatable character lhs and parenthesized array with vector subscript

2023-07-18 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110723

Bug ID: 110723
   Summary: ICE with allocatable character lhs and parenthesized
array with vector subscript
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: anlauf at gcc dot gnu.org
  Target Milestone: ---

Found while analyzing pr95947:

program p
  implicit none
  character(len=1)  :: m(3) = ['a','b','c']
  character(len=:), allocatable :: n(:)
  n =   m([2])
  n =   m([2,3])
  n =  (m([2]))   ! ICE
  n =  (m([2,3])) ! ICE
end

We end up with EXPR_OP in the assignment for the parenthesized variants,
leading to bad gimple.

Tentative fix:

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index ef3e6d08f78..63d39c2516e 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -11954,6 +11954,12 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr *
expr2, bool init_flag,

   rss = NULL;

+  /* Strip left-over parentheses around rhs variables (not expressions).  */
+  while (expr2->expr_type == EXPR_OP
+&& expr2->value.op.op == INTRINSIC_PARENTHESES
+&& expr2->value.op.op1->expr_type == EXPR_VARIABLE)
+expr2 = expr2->value.op.op1;
+
   if (expr2->expr_type != EXPR_VARIABLE
   && expr2->expr_type != EXPR_CONSTANT
   && (expr2->ts.type == BT_CLASS || gfc_may_be_finalized (expr2->ts)))

Not sure if this is the right place.

[Bug fortran/110723] ICE with allocatable character lhs and parenthesized array with vector subscript

2023-07-18 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110723

--- Comment #1 from anlauf at gcc dot gnu.org ---
(In reply to anlauf from comment #0)
> Not sure if this is the right place.

Actually, the following still fails:

  n =  (m([2])//"")   ! ICE

:-(

Generally stripping parentheses generates a failure in
gfortran.dg/reassoc_2.f90, which is bad...

[Bug rtl-optimization/110724] New: Unnecessary alignment on branch to unconditional branch targets

2023-07-18 Thread javier.martinez.bugzilla at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110724

Bug ID: 110724
   Summary: Unnecessary alignment on branch to unconditional
branch targets
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: javier.martinez.bugzilla at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/f7qMxxfMj

void duff(int * __restrict to, const int * __restrict from, const int count)
{
int n = (count+7) / 8;
switch(count%8)
{
   case 0: do { *to++ = *from++;
   case 7:  *to++ = *from++;
   case 6:  *to++ = *from++;
   case 5:  *to++ = *from++;
   case 4:  *to++ = *from++;
   case 3:  *to++ = *from++;
   case 2:  *to++ = *from++;
   [[likely]] case 1:  *to++ = *from++;
} while (--n>0);
}
}

Trunk with O3:
jle .L1
[...]
lea rax, [rax+4]
jmp .L5# <-- no fall-through to ret
.p2align 4,,7  # <-- unnecessary alignment
.p2align 3
.L1:
ret


I believe this 16-byte alignment is done to put the branch target at the
beginning of a front-end instruction fetch block. That however seems
unnecessary when the branch target is itself an unconditional branch, as the
instructions to follow will not retire.

In this example the degradation is in code size / instruction caching only, as there
is no possible fall-through to .L1 that would cause nop's to be consumed.
Changing the C++ attribute to [[unlikely]] introduces fall-through, and GCC
seems to remove the padding, which is great.

[Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu

2023-07-18 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701

Roger Sayle  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |roger at nextmovesoftware dot com
 Status|NEW |ASSIGNED

--- Comment #6 from Roger Sayle  ---
I have a fix (to combine.cc's record_dead_and_set_regs_1).  Bootstrapping and
regression testing.

[Bug fortran/110725] New: internal compiler error: in expand_expr_real_1, at expr.cc:10897

2023-07-18 Thread tonycurtis32 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110725

Bug ID: 110725
   Summary: internal compiler error: in expand_expr_real_1, at
expr.cc:10897
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tonycurtis32 at gmail dot com
  Target Milestone: ---

Created attachment 55573
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55573&action=edit
preprocessed source

arch = a64fx (arm 8.2 + sve)

==

Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/lustre/software/gcc/13.1.0/libexec/gcc/aarch64-unknown-linux-gnu/13.1.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../gcc-13.1.0/configure --with-mpfr-lib=/lib64
--with-gmp-lib=/lib64 --with-gmp-include=/usr/include
--with-mpfr-include=/usr/include --with-mpc-lib=/lib64
--with-mpc-include=/usr/include --prefix=/lustre/software/gcc/13.1.0
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.1.0 (GCC)

==

command-line = gfortran -ggdb -fopenmp -mcmodel=large -mcpu=a64fx -O3 -o swim
swim.F
(-O0 compiles ok, gcc 12.2 works fine too with optimization)

==

swim.F:107:72:

  107 |   DO 50 I=1,MP1
  |   
1
Warning: Fortran 2018 deleted feature: Shared DO termination label 50 at (1)
swim.F:129:72:

  129 |   DO 60 I=1,M
  |   
1
Warning: Fortran 2018 deleted feature: Shared DO termination label 60 at (1)
swim.F:173:72:

  173 |   DO 86 I=1,MP1
  |   
1
Warning: Fortran 2018 deleted feature: Shared DO termination label 86 at (1)
swim.F:206:72:

  206 |   DO 100 I=1,M
  |   
1
Warning: Fortran 2018 deleted feature: Shared DO termination label 100 at (1)
swim.F:278:72:

  278 |   DO 200 I=1,M
  |   
1
Warning: Fortran 2018 deleted feature: Shared DO termination label 200 at (1)
swim.F:340:72:

  340 |   DO 400 I=1,MP1
  |   
1
Warning: Fortran 2018 deleted feature: Shared DO termination label 400 at (1)
swim.F:376:72:

  376 |   DO 300 I=1,M
  |   
1
Warning: Fortran 2018 deleted feature: Shared DO termination label 300 at (1)
during RTL pass: expand
swim.F:404:72:

  404 | !$omp target
  |   
^
internal compiler error: in expand_expr_real_1, at expr.cc:10897
0x9d1067 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:10897
0x9cce9b expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:9000
0x9cce9b expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
../../gcc-13.1.0/gcc/expr.h:310
0x9cce9b expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:11234
0x9ce88f expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:9000
0x9ce88f expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:11463
0x9cff3b expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:9000
0x9cff3b expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:10805
0x9cce9b expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:9000
0x9cce9b expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
../../gcc-13.1.0/gcc/expr.h:310
0x9cce9b expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:11234
0x9cff3b expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:9000
0x9cff3b expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:10805
0x9da10f expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc-13.1.0/gcc/expr.cc:9000
0x9da10f store_expr(tree_node*, rtx_def*, int, bool, bool)
../../gcc-13.1.0/gcc/expr.cc:6330
0x9db73b expand_assignment(tree_node*, tree_node*, bool)
../../gcc-13.1.

[Bug c++/110340] [C++26] P2621R2 - Remove undefined behavior from lexing

2023-07-18 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110340

Marek Polacek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Marek Polacek  ---
Done.

[Bug c++/110340] [C++26] P2621R2 - Remove undefined behavior from lexing

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110340

--- Comment #3 from CVS Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:fca089e8a47314a40ad93527ba9f9d0d374b3afb

commit r14-2626-gfca089e8a47314a40ad93527ba9f9d0d374b3afb
Author: Marek Polacek 
Date:   Tue Jul 18 13:26:39 2023 -0400

c++: Add tests for P2621, no UB in lexer [PR110340]

C++26 P2621 removes UB in the lexer and either makes the construct valid
or ill-formed.  We're already handling this correctly so this patch only
adds tests.

PR c++/110340

gcc/testsuite/ChangeLog:

* g++.dg/cpp/string-4.C: New test.
* g++.dg/cpp/ucn-2.C: New test.

[Bug c++/110338] Implement C++26 language features

2023-07-18 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110338
Bug 110338 depends on bug 110340, which changed state.

Bug 110340 Summary: [C++26] P2621R2 - Remove undefined behavior from lexing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110340

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug middle-end/110724] Unnecessary alignment on branch to unconditional branch targets

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110724

--- Comment #1 from Andrew Pinski  ---
I think this is by design.

Adding -fno-align-jumps makes the alignment go away.

[Bug middle-end/110724] Unnecessary alignment on branch to unconditional branch targets

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110724

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2023-07-18
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
The generic tuning is:
  "16:11:8",/* Loop alignment.  */
  "16:11:8",/* Jump alignment.  */
  "0:0:8",  /* Label alignment.  */

The operands are described as:
https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/Optimize-Options.html#index-falign-functions
```
-falign-functions=n:m:n2
...
Examples: -falign-functions=32 aligns functions to the next 32-byte boundary,
-falign-functions=24 aligns to the next 32-byte boundary only if this can be
done by skipping 23 bytes or less, -falign-functions=32:7 aligns to the next
32-byte boundary only if this can be done by skipping 6 bytes or less.

The second pair of n2:m2 values allows you to specify a secondary alignment:
-falign-functions=64:7:32:3 aligns to the next 64-byte boundary if this can be
done by skipping 6 bytes or less, otherwise aligns to the next 32-byte boundary
if this can be done by skipping 2 bytes or less. If m2 is not specified, it
defaults to n2.
```

So jumps are aligned to 16 bytes if 11 or fewer bytes can be used, or else to
8 bytes, which is exactly what this does:
.p2align 4,,7  # <-- unnecessary alignment
.p2align 3
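
The effect of such a directive pair can be worked through numerically. The sketch below assumes GNU as semantics, where the third operand of `.p2align` is the maximum number of bytes to skip and the alignment is not done at all if more would be needed; the function names are made up for illustration.

```c
#include <assert.h>

/* Padding inserted by ".p2align log2_align,,max_skip" at address addr:
   the distance to the next 2^log2_align boundary, or 0 if that distance
   exceeds max_skip (the directive is then ignored).  */
unsigned p2align_pad(unsigned long addr, unsigned log2_align, unsigned max_skip)
{
    unsigned long align = 1ul << log2_align;
    unsigned pad = (unsigned)((align - (addr & (align - 1))) & (align - 1));
    return pad <= max_skip ? pad : 0;
}

/* Address of the label after ".p2align 4,,7" followed by ".p2align 3".
   A plain ".p2align 3" has no limit, which equals a max skip of 7.  */
unsigned long label_after_pair(unsigned long addr)
{
    addr += p2align_pad(addr, 4, 7);
    addr += p2align_pad(addr, 3, 7);
    return addr;
}

void check_examples(void)
{
    assert(label_after_pair(25) == 32);  /* 7 bytes reach the 16-byte boundary */
    assert(label_after_pair(18) == 24);  /* 14 bytes is too many, fall back to 8 */
    assert(label_after_pair(16) == 16);  /* already aligned, no padding */
}
```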

[Bug fortran/110725] [13/14 Regression] internal compiler error: in expand_expr_real_1, at expr.cc:10897

2023-07-18 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110725

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
Summary|internal compiler error: in |[13/14 Regression] internal
   |expand_expr_real_1, at  |compiler error: in
   |expr.cc:10897   |expand_expr_real_1, at
   ||expr.cc:10897
   Target Milestone|--- |13.2
   Keywords||ice-on-valid-code, openmp
   Last reconfirmed||2023-07-18
   Priority|P3  |P4

--- Comment #1 from anlauf at gcc dot gnu.org ---
The code is accepted by Nvidia.

No ICE at -O0, but ICE at -Og, -O1, and higher.

ICE confirmed also on x86_64, so likely not target-specific.

[Bug c++/110197] [13/14 Regression] Empty constexpr object constructor erronously claims out of range access

2023-07-18 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110197

--- Comment #2 from Marek Polacek  ---
// PR c++/110197

namespace std {
constexpr bool __is_constant_evaluated() {
  return __builtin_is_constant_evaluated();
}
template  using enable_if_t = _Tp;
template  struct __array_traits {
  typedef _Tp _Type[_Nm];
};
template  struct array {
  typename __array_traits<_Tp, _Nm>::_Type _M_elems;
};
template  array(_Tp) -> array, 1>;
struct char_traits {
  static constexpr unsigned length() {
__is_constant_evaluated();
return 0;
  }
};
struct basic_string_view {
  using traits_type = char_traits;
  constexpr basic_string_view(const char *)
  : _M_len{traits_type::length()}, _M_str{} {}
  long _M_len;
  char _M_str;
};
} // namespace std
struct Currency {
  constexpr Currency(std::basic_string_view) {}
};
void get_c() { constexpr std::array c{Currency{""}}; }

[Bug fortran/95947] PACK intrinsic returns blank strings when an allocatable character array with allocatable length is used

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95947

--- Comment #8 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Harald Anlauf:

https://gcc.gnu.org/g:ccf94ab2abb6969c04d51c7879f07edfbb97ae55

commit r13-7584-gccf94ab2abb6969c04d51c7879f07edfbb97ae55
Author: Harald Anlauf 
Date:   Sun Jul 16 22:17:27 2023 +0200

Fortran: intrinsics and deferred-length character arguments
[PR95947,PR110658]

gcc/fortran/ChangeLog:

PR fortran/95947
PR fortran/110658
* trans-expr.cc (gfc_conv_procedure_call): For intrinsic procedures
whose result characteristics depends on the first argument and which
can be of type character, the character length will not be deferred.

gcc/testsuite/ChangeLog:

PR fortran/95947
PR fortran/110658
* gfortran.dg/deferred_character_37.f90: New test.

(cherry picked from commit 95ddd2659849a904509067ec3a2770135149a722)

[Bug fortran/110658] MINVAL/MAXVAL and deferred-length character arrays

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110658

--- Comment #4 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Harald Anlauf:

https://gcc.gnu.org/g:ccf94ab2abb6969c04d51c7879f07edfbb97ae55

commit r13-7584-gccf94ab2abb6969c04d51c7879f07edfbb97ae55
Author: Harald Anlauf 
Date:   Sun Jul 16 22:17:27 2023 +0200

Fortran: intrinsics and deferred-length character arguments
[PR95947,PR110658]

gcc/fortran/ChangeLog:

PR fortran/95947
PR fortran/110658
* trans-expr.cc (gfc_conv_procedure_call): For intrinsic procedures
whose result characteristics depends on the first argument and which
can be of type character, the character length will not be deferred.

gcc/testsuite/ChangeLog:

PR fortran/95947
PR fortran/110658
* gfortran.dg/deferred_character_37.f90: New test.

(cherry picked from commit 95ddd2659849a904509067ec3a2770135149a722)

[Bug fortran/95947] PACK intrinsic returns blank strings when an allocatable character array with allocatable length is used

2023-07-18 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95947

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||anlauf at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #9 from anlauf at gcc dot gnu.org ---
Fixed on mainline for gcc-14 so far, and on 13-branch.
Might be backported further after waiting some time.

Thanks for the report!

[Bug fortran/110360] ABI issue with character,value dummy argument

2023-07-18 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110360

--- Comment #26 from anlauf at gcc dot gnu.org ---
(In reply to David Edelsohn from comment #25)
> The problem on big endian systems is that GFortran is passing the character
> with the wrong padding.
[...]
> GFortran is not taking account of endianness for the layout of values in
> memory compared to constants loaded into registers.  This isn't an ABI issue
> of the target, this is a memory layout and register layout issue of GFortran.

Frankly speaking, this is a place where I have zero knowledge.

> Let me know if you need more information or tests.

There is pr110419 which tracks the testsuite regression on BE systems.

Mikael added some info there.  Maybe you can have a look, too.

[Bug fortran/110725] [13/14 Regression,openmp] internal compiler error: in expand_expr_real_1, at expr.cc:10897

2023-07-18 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110725

kargl at gcc dot gnu.org changed:

   What|Removed |Added

Summary|[13/14 Regression] internal |[13/14 Regression,openmp]
   |compiler error: in  |internal compiler error: in
   |expand_expr_real_1, at  |expand_expr_real_1, at
   |expr.cc:10897   |expr.cc:10897
 CC||kargl at gcc dot gnu.org

--- Comment #2 from kargl at gcc dot gnu.org ---
Reduced testcase.  The '!$omp end teams' line in subroutine initial appears to
be out-of-place.


  module swim_mod

  INTEGER, PARAMETER :: N1=7702, N2=7702

  DOUBLE PRECISION, ALLOCATABLE, DIMENSION(:,:) :: U, V

  INTEGER :: M, N, MP1, NP1

!$omp declare target(U, V)
!$omp declare target(M,N,MP1,NP1)

  CONTAINS

  SUBROUTINE ALLOC
 IMPLICIT NONE
!$omp target update to(M,N,MP1,NP1)
!$omp&
 ALLOCATE(U(NP1,MP1), V(NP1,MP1))
  END SUBROUTINE

  SUBROUTINE INITAL
  INTEGER I
!$omp target
!$omp teams
!$omp distribute parallel do simd
  DO 75 I=1,M
 U(I+1,N+1) = U(I+1,1)
 V(I,1) = V(I,N+1)
   75 CONTINUE
!$omp end teams
  U(1,N+1) = U(M+1,1)
  V(M+1,1) = V(1,N+1)
!$omp end target
  END SUBROUTINE

  end module swim_mod

[Bug middle-end/110724] Unnecessary alignment on branch to unconditional branch targets

2023-07-18 Thread javier.martinez.bugzilla at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110724

--- Comment #3 from Javier Martinez ---
The generic tuning of 16:11:8 looks reasonable to me, I do not argue against
it.

From Agner Fog’s Optimizing subroutines in assembly language:

> Most microprocessors fetch code in aligned 16-byte or 32-byte blocks.
> If an important subroutine entry or jump label happens to be near the
> end of a 16-byte block then the microprocessor will only get a few 
> useful bytes of code when fetching that block of code. It may have
> to fetch the next 16 bytes too before it can decode the first instructions
> after the label. This can be avoided by aligning important subroutine
> entries and loop entries by 16. Aligning by 8 will assure that at least 8
> bytes of code can be loaded with the first instruction fetch, which may
> be sufficient if the instructions are small.

This looks like the reason behind the alignment. That section of the book
goes on to explain the inconvenience (execution of nops) of alignment on labels
reachable by other means than branching - which I presume led to the :m and
:m2 tuning values, the distinction between -falign-labels and -falign-jumps,
and the reason padding is removed when my label is reachable by fall-through
with [[unlikely]].

All this is fine.

My thesis is that this alignment strategy is always unnecessary in one specific
circumstance - when the branch target is itself an unconditional branch of size
1, as in:

.L1:
  ret

Because the ret instruction will never cross a block boundary, and the
instructions following the ret must not execute, so there is no front-end stall
to avoid.
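
The claim that a one-byte instruction can never straddle a fetch block is simple arithmetic, sketched below in C. The 16-byte block size is the one from the quoted text; `crosses_fetch_block` is a hypothetical helper name and the model ignores everything about a real front end except block boundaries.

```c
#include <assert.h>
#include <stdbool.h>

/* Does an instruction of insn_len bytes starting at addr extend past the
   end of the aligned fetch block (of `block` bytes) it starts in?  */
bool crosses_fetch_block(unsigned long addr, unsigned insn_len, unsigned block)
{
    return (addr % block) + insn_len > block;
}

void check_examples(void)
{
    /* A 1-byte instruction such as ret fits at every offset.  */
    for (unsigned long addr = 0; addr < 64; addr++)
        assert(!crosses_fetch_block(addr, 1, 16));

    /* A 2-byte instruction crosses only when it starts on the last byte.  */
    assert(crosses_fetch_block(15, 2, 16));
    assert(!crosses_fetch_block(14, 2, 16));
}
```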

[Bug middle-end/110726] New: [14 Regression] wrong code on llvm-16 around 'a |= a == 0'

2023-07-18 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110726

Bug ID: 110726
   Summary: [14 Regression] wrong code on llvm-16 around 'a |= a
== 0'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slyfox at gcc dot gnu.org
  Target Milestone: ---

This week's gcc r14-2600-g3a407070b610a8 fails llvm-16 test suite as:

  Failed Tests (1):
LLVM-Unit ::
Support/./SupportTests/BlockFrequencyTest.SaturatingRightShift

It looks like the reproducer is trivial and happens even on -O0:

// $ cat bug.cc
int main(void) {
  unsigned long long freq = 0x10080ULL;

  freq >>= 2;
  freq |= freq == 0;

  if (freq != 0x4020ULL)
  __builtin_trap();
}

$ gcc-14 bug.cc -o a && ./a
Illegal instruction (core dumped)

$ gcc bug.cc -o a && ./a


$ gcc -v
Using built-in specs.
COLLECT_GCC=/<>/gcc-14.0.0/bin/gcc
COLLECT_LTO_WRAPPER=/<>/gcc-14.0.0/libexec/gcc/x86_64-unknown-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../source/configure --prefix=/<>/gcc-14.0.0
--with-gmp-include=/<>/gmp-6.2.1-dev/include
--with-gmp-lib=/<>/gmp-6.2.1/lib
--with-mpfr-include=/<>/mpfr-4.2.0-dev/include
--with-mpfr-lib=/<>/mpfr-4.2.0/lib --with-mpc=/<>/libmpc-1.3.1
--with-native-system-header-dir=/<>/glibc-2.37-8-dev/include
--with-build-sysroot=/ --program-prefix= --enable-lto --disable-libstdcxx-pch
--without-included-gettext --with-system-zlib --enable-checking=release
--enable-static --enable-languages=c,c++ --disable-multilib --enable-plugin
--disable-libcc1 --with-isl=/<>/isl-0.20 --disable-bootstrap
--build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu
--target=x86_64-unknown-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0  (experimental) (GCC)
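
For reference, the value the reproducer expects can be checked by hand: 0x10080 >> 2 is 0x4020, the comparison `freq == 0` yields 0, and ORing in 0 leaves 0x4020. The sketch below wraps the two statements in a function so the cases can be compared; `shift_saturating` is a made-up name, not the LLVM function from the failing test.

```c
#include <assert.h>

/* Shift right, but never let a value collapse to zero: == binds tighter
   than |=, so 1 is ORed in only when the shifted value is 0.  */
unsigned long long shift_saturating(unsigned long long freq, unsigned shift)
{
    freq >>= shift;
    freq |= (freq == 0);
    return freq;
}

void check_examples(void)
{
    assert(shift_saturating(0x10080ULL, 2) == 0x4020ULL);  /* the reported case */
    assert(shift_saturating(1ULL, 2) == 1ULL);             /* shifted to 0, bumped to 1 */
    assert(shift_saturating(0ULL, 2) == 1ULL);
}
```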

[Bug target/110727] New: gcc.target/aarch64/sve/aarch64-sve.exp has two new failures since commit 061f74c0673

2023-07-18 Thread thiago.bauermann at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110727

Bug ID: 110727
   Summary: gcc.target/aarch64/sve/aarch64-sve.exp has two new
failures since commit 061f74c0673
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: thiago.bauermann at linaro dot org
  Target Milestone: ---

Created attachment 55574
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55574&action=edit
Tarball containing testsuite log files for first bad and last good commits.

Our CI detected that commit 061f74c06735 "Fix profile update in
scale_profile_for_vect_loop" introduced these testsuite failures on
aarch64-linux:

Running gcc:gcc.target/aarch64/sve/aarch64-sve.exp ...
FAIL: gcc.target/aarch64/sve/live_1.c scan-assembler-times
\\twhilelo\\tp[0-7].b,  2
FAIL: gcc.target/aarch64/sve/live_1.c scan-assembler-times
\\twhilelo\\tp[0-7].h,  4

I confirmed that they are still present in trunk as of commit c11a3aedec26
"tree-ssa-loop-ch improvements, part 3" from today.

Tested on Ubuntu 22.04 with:

$ ~/src/gcc/configure --disable-bootstrap --disable-multilib && make -j 60
$ make -C gcc check-c RUNTESTFLAGS=gcc.target/aarch64/sve/aarch64-sve.exp

I'll attach gcc.sum and gcc.log from commit c11a3aedec26 as well as gcc.sum and
gcc.log from its parent, which was the last commit where the test passed.

[Bug middle-end/110726] [14 Regression] wrong code on llvm-16 around 'a |= a == 0'

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110726

--- Comment #1 from Andrew Pinski  ---
I think this will be fixed with -momit-leaf-frame-pointer patch at :
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624752.html

[Bug middle-end/110726] [14 Regression] wrong code on llvm-16 around 'a |= a == 0'

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110726

--- Comment #2 from Andrew Pinski  ---
Whoops wrong bug report.

[Bug target/110722] FP is Saved/Restored around inline assembly

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110722

--- Comment #1 from Andrew Pinski  ---
I think this will be fixed with -momit-leaf-frame-pointer patch at :
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624752.html

[Bug middle-end/110726] [14 Regression] wrong code on llvm-16 around 'a |= a == 0'

2023-07-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110726

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||wrong-code
   Target Milestone|--- |14.0
