[Bug tree-optimization/105883] Memcmp folded only when size is a power of two

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105883

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2022-06-14
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
memcmp is folded when it can be turned into two loads and a comparison - that
doesn't work for non-power-of-two sizes.  Fully constant folding the loads
isn't attempted - in theory value-numbering could use partial def tracking to
prune equal prefixes and fold on different known bytes, but I think this is
all hardly worth the trouble?

Who will write such code anyway?
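
For illustration only, the size dependence described above can be sketched with
two comparisons against constant data.  The helper names eq4 and eq3 are
hypothetical and not taken from the report; whether a given GCC version
actually folds either of them is not something this sketch asserts.

```cpp
#include <cstring>

// Hypothetical helpers, not from the bug report.

// Power-of-two size: per the comment above, this shape can be turned into
// two 4-byte loads and a single comparison.
bool eq4 (const char *p)
{
  return std::memcmp (p, "abcd", 4) == 0;
}

// Non-power-of-two size: no single load covers exactly 3 bytes, so the
// same folding does not apply directly.
bool eq3 (const char *p)
{
  return std::memcmp (p, "abc", 3) == 0;
}
```

Both functions return the same results whether or not the call is folded; only
the generated code differs.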

[Bug c++/105885] [12/13 Regression] the address of 'template argument' will never be NULL warning

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105885

--- Comment #2 from Richard Biener  ---
We diagnose only after template substitution, where we cannot distinguish a
literal if (nullptr == nullptr) from if (ARG == nullptr), I think.

I guess the reporter's reasoning is that ARG is defaulted to nullptr and that's
the reason the diagnostic is unwanted?

[Bug middle-end/101836] __builtin_object_size(P->M, 1) where M is an array and the last member of a struct fails

2022-06-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #24 from Jakub Jelinek  ---
For the default, a complication is that standard C++ allows neither
flexible array members nor zero-sized arrays, so unless one uses extensions one
can only write [1].
I think differentiating between only allowing [] as flex, or [] and [0],
or [], [0] and [1], or any trailing array is useful.
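
A minimal sketch of the [1] spelling mentioned above, assuming a hypothetical
Packet type that over-allocates storage behind a one-element trailing array.
The names Packet, make_packet and reported_size are made up for illustration,
and the value that __builtin_object_size returns here depends on the compiler
version and options, so it is deliberately not stated.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type: the only trailing-array spelling standard C++ accepts.
struct Packet
{
  std::size_t len;
  char data[1];   // used as if it were a flexible array member
};

// Allocate a Packet with room for LEN extra payload bytes after the struct.
Packet *make_packet (std::size_t len)
{
  void *raw = std::malloc (sizeof (Packet) + len);
  if (!raw)
    return nullptr;
  Packet *p = new (raw) Packet;
  p->len = len;
  return p;
}

// Ask the compiler how large it believes the trailing member is (mode 1,
// i.e. the closest enclosing subobject).
std::size_t reported_size (Packet *p)
{
  return __builtin_object_size (p->data, 1);
}
```

Whether the compiler treats data as a one-byte array or as an array of unknown
size is exactly the policy choice being discussed.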

[Bug c++/105967] New: Forming a pointer to ref-qualified member function using a function typedef ignores the qualifier

2022-06-14 Thread iamsupermouse at mail dot ru via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105967

Bug ID: 105967
   Summary: Forming a pointer to ref-qualified member function
using a function typedef ignores the qualifier
   Product: gcc
   Version: 12.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iamsupermouse at mail dot ru
  Target Milestone: ---

Consider the following code:

#include <type_traits>

struct A {};
using F = void() &;
static_assert(std::is_same_v<F A::*, void (A::*)() &>);

GCC fails the static_assert, while Clang and MSVC accept it.

Apparently `F A::*` becomes `void(A::*)()`, without `&`. Same happens for `&&`.
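
For comparison, a small sketch that spells the pointer-to-member types out
directly instead of going through a function typedef.  The alias names and the
member functions g and h are hypothetical; the point is only that the
ref-qualifier is part of the type, so the three aliases below are distinct.

```cpp
#include <type_traits>

struct A
{
  void g () &;   // lvalue-ref-qualified
  void h ();     // unqualified
};

// Pointer-to-member-function types written without a function typedef.
using PlainPMF = void (A::*) ();
using LvaluePMF = void (A::*) () &;
using RvaluePMF = void (A::*) () &&;

// The ref-qualifier distinguishes the types.
static_assert (!std::is_same_v<PlainPMF, LvaluePMF>);
static_assert (!std::is_same_v<LvaluePMF, RvaluePMF>);

// Taking the address of a member function keeps its qualifier.
static_assert (std::is_same_v<decltype (&A::g), LvaluePMF>);
static_assert (std::is_same_v<decltype (&A::h), PlainPMF>);
```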

[Bug ipa/105917] [10/11/12/13 regression] Missed passthru jump function

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105917

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |10.4
Version|unknown |13.0
   Keywords||missed-optimization

[Bug c++/105967] Forming a pointer to ref-qualified member function using a function typedef ignores the qualifier

2022-06-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105967

--- Comment #1 from Andrew Pinski  ---
Note it looks like the pointer to member function type is where it loses the
ref-qualifier and not earlier.
That is, GCC correctly rejects:
using F = void() &;
F t;

[Bug c++/105967] Forming a pointer to ref-qualified member function using a function typedef ignores the qualifier

2022-06-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105967

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
  Known to fail||4.8.1, 5.1.0
   Last reconfirmed||2022-06-14

--- Comment #2 from Andrew Pinski  ---
Here is a testcase without using static_assert:
struct A {void g()&;};
using F = void() &; 
F A::* t = &A::g;

Confirmed. Not a regression.

[Bug target/105922] autovectorizer does not handle fp exceptions correctly for SVE

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105922

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code

--- Comment #2 from Richard Biener  ---
(In reply to Andrew Pinski from comment #1)
> Confirmed. The division should have been predicated on the same as the
> load/store but currently GCC does not do that.
> 
> GCC does not really support looking into fpu status bits or exceptions while
> vectorizing either.

It effectively "supports" it by failing to vectorize when exception state
builtins are used in the vectorized region and otherwise it just accumulates
exception bits (but it doesn't support in-order traps if you enable exceptions
to trap).

Note there's a bit of confusion as to what exactly controls FP exception
bit correctness and the documentation should probably be clarified.
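
As a rough illustration of what "accumulating exception bits" means for user
code, here is a sketch that clears the FP status flags, runs a division loop,
and then tests for FE_DIVBYZERO.  The helper name is hypothetical, and the
operands are volatile so that the divisions are really executed at run time;
this shows the observable flag behaviour, not how the vectorizer handles it.

```cpp
#include <cfenv>

// Hypothetical helper: divide element-wise and report whether any of the
// divisions set the divide-by-zero status flag.
bool div_raises_divbyzero (const volatile double *num,
                           const volatile double *den,
                           volatile double *out, int n)
{
  std::feclearexcept (FE_ALL_EXCEPT);
  for (int i = 0; i < n; ++i)
    out[i] = num[i] / den[i];
  // The flag is sticky: it stays set no matter which iteration raised it.
  return std::fetestexcept (FE_DIVBYZERO) != 0;
}
```

Because only the accumulated flag is inspected after the loop, the order in
which the individual divisions execute does not matter for this check, which is
different from trapping in order.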

[Bug c/105923] unsupported return type ‘complex double’ for simd

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105923

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Keywords||ABI
   Last reconfirmed||2022-06-14
 Target|x86-64  |x86_64-*-*

--- Comment #2 from Richard Biener  ---
Confirmed.

[Bug tree-optimization/105736] [12/13 Regression] ICE in force_gimple_operand_1, at gimplify-me.cc:79 since r13-222-g28896b38fabce818

2022-06-14 Thread siddhesh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105736

--- Comment #3 from Siddhesh Poyarekar  ---
Here we go, I'll put it into builtin-dynamic-object-size-0.c, bootstrap and
post a patch.

struct TV4
{
  __attribute__((vector_size (sizeof (int) * 4))) int v;
};

struct TV4 val3;
int *
f1 (struct TV4 *a)
{
  return &a->v[0];
}

int
f2 (void)
{
  int *t = f1 (&val3);
  if (__builtin_dynamic_object_size (t, 0) != -1)
    __builtin_abort ();

  return 0;
}

[Bug c/105923] unsupported return type ‘complex double’ for simd

2022-06-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105923

--- Comment #3 from Hongtao.liu  ---
An alternative is to treat a vector of complex values as a vector of length
2*N (just as the vectorizer does).

But __attribute__ ((__simd__ ("notinbranch"))) would need to be extended for
that.
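
The "2*N length vector" view can be sketched in plain C++: an array of N
std::complex<double> values may be accessed as 2*N doubles with interleaved
real and imaginary parts, which is a layout the C++ standard guarantees.  The
function name add_complex_flat is hypothetical, and this is not how a SIMD
clone would be declared; it only illustrates the data layout.

```cpp
#include <complex>
#include <cstddef>

// Hypothetical helper: add two arrays of N complex values by walking them
// as 2*N doubles (re0, im0, re1, im1, ...).
void add_complex_flat (const std::complex<double> *a,
                       const std::complex<double> *b,
                       std::complex<double> *r, std::size_t n)
{
  const double *fa = reinterpret_cast<const double *> (a);
  const double *fb = reinterpret_cast<const double *> (b);
  double *fr = reinterpret_cast<double *> (r);
  for (std::size_t i = 0; i < 2 * n; ++i)
    fr[i] = fa[i] + fb[i];
}
```

Addition works element-wise on the flat view; operations such as complex
multiplication would need to pair up neighbouring lanes.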

[Bug rust/105913] gccrs doesn't compile on 32-bit targets

2022-06-14 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105913

Thomas Schwinge  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |dkm at gcc dot gnu.org
   See Also||https://github.com/Rust-GCC/gccrs/pull/1308

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #15 from Richard Biener  ---
So we feed DImode rotates into RA, which constrains register allocation
enough to require spills (all 4 DImode vals are live across the kernel,
not even -fschedule-insns can do anything here).  I wonder if it ever makes
sense to not split wide ops before reload.

[Bug target/105932] Small structures returned incorrectly in i386 Microsoft ABI

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105932

--- Comment #2 from Richard Biener  ---
(In reply to Andrew Pinski from comment #1)
> I suspect this is a dup of bug 81943.

That's for a 64bit target though.

[Bug lto/105933] LTO ltrans object files does not have proper st_bind and st_visibility

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105933

Richard Biener  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
You are referring to the LTRANS objects created from the LTRANS compile phase.
These should be perfectly valid to link and have correct st_bind/visibility,
but not necessarily the same as originally since link optimization combines
multiple TUs and distributes them to multiple LTRANS units, requiring former
local symbols to refer to each other from different LTRANS units.

Do you have a testcase that shows linking not-through-the-plugin doesn't work?

[Bug libstdc++/105934] [10/11/12/13 Regression] C++11 pointer versions of atomic_fetch_add missing because of P0558

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105934

Richard Biener  changed:

   What|Removed |Added

Summary|[9/10/11/12/13 Regression]  |[10/11/12/13 Regression]
   |C++11 pointer versions of   |C++11 pointer versions of
   |atomic_fetch_add missing|atomic_fetch_add missing
   |because of P0558|because of P0558
   Target Milestone|--- |10.4

[Bug target/105938] [12/13 Regression] ICE in get_insn_temp late, at final.cc:2050 on nvptx-none

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105938

Richard Biener  changed:

   What|Removed |Added

   Keywords||needs-bisection
   Target Milestone|--- |12.2
Summary|[12 Regression] ICE in  |[12/13 Regression] ICE in
   |get_insn_temp late, at  |get_insn_temp late, at
   |final.cc:2050 on nvptx-none |final.cc:2050 on nvptx-none

[Bug target/105938] [12/13 Regression] ICE in get_insn_temp late, at final.cc:2050 on nvptx-none

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105938

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug d/105942] [12/13 Regression] d: internal compiler error: in visit, at d/expr.cc:945

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105942

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |12.2
   Priority|P3  |P4
Summary|[12 Regression] d: internal |[12/13 Regression] d:
   |compiler error: in visit,   |internal compiler error: in
   |at d/expr.cc:945|visit, at d/expr.cc:945

[Bug tree-optimization/105943] [12/13 Regression] ICE in expand_LOOP_VECTORIZED, at internal-fn.cc:2640

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105943

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Richard Biener  ---
Yeah, that's not expected to work.  Use -fdisable-tree-vect with care.

[Bug c/105944] [10/11/12/13 Regression] ICE in expand_LOOP_DIST_ALIAS, at internal-fn.cc:2648

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105944

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Richard Biener  ---
Similar.

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #9 from Richard Biener  ---
Note that GCC 9 is no longer supported.  Note one common error resulting in
SIGILL is when you fall through to an unreachable place which could be padding
(like when there's a missing return in a function).
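
A hypothetical example of the missing-return pattern mentioned above (not from
the report).  The buggy shape is shown only as a comment, because calling it on
the path without a return is undefined behaviour; the compiled function is the
corrected version.

```cpp
// Hypothetical example, not taken from the bug report.
enum class Kind { Small, Large };

// Buggy shape (calling it with Kind::Large would be undefined behaviour,
// since control flows off the end of a non-void function):
//
//   int weight (Kind k) { if (k == Kind::Small) return 1; }
//
// Fixed shape: every path returns a value.
int weight (Kind k)
{
  if (k == Kind::Small)
    return 1;
  return 10;
}
```

Building with -Wall enables -Wreturn-type, which diagnoses the buggy shape.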

[Bug c/105945] [12/13 Regression] ICE in maybe_gen_insn, at optabs.cc:7956

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105945

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Richard Biener  ---
Likewise - ifcvt creates masked load ops expected to be elided by the
vectorizer.

[Bug middle-end/105951] [12/13 Regression] ICE in emit_store_flag, at expmed.cc:6027

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105951

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug rtl-optimization/105952] [12/13 Regression] ICE in sel_redirect_edge_and_branch, at sel-sched-ir.cc:5680

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105952

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |12.2

[Bug target/105953] [12/13 Regression] ICE in extract_insn, at recog.cc:2791

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105953

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug target/105965] x86: single-element vectors don't have scalar FMA insns used anymore

2022-06-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105965

Hongtao.liu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #1 from Hongtao.liu  ---
It looks like a regression since GCC9

typedef float v1sf __attribute__((vector_size(4)));

v1sf
foo43 (v1sf a, v1sf b, v1sf c)
{
  return a * b + c;
}

GCC 9 also doesn't generate vfmaddXXXss.

        pushq   %rbp
        movq    %rdi, %rax
        movq    %rsp, %rbp
        andq    $-32, %rsp
        vmovss  24(%rbp), %xmm0
        vmulss  16(%rbp), %xmm0, %xmm0
        vaddss  32(%rbp), %xmm0, %xmm0
        vmovss  %xmm0, -64(%rsp)
        movl    -64(%rsp), %edx
        movl    %edx, (%rdi)
        leave
        ret

[Bug tree-optimization/97185] inconsistent builtin elimination for impossible range

2022-06-14 Thread siddhesh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97185

--- Comment #1 from Siddhesh Poyarekar  ---
While the missed optimization ought to be fixed, what's the value of
-Wstringop-* warning on an impossible range, i.e. when low > high?  Shouldn't
it just bail out silently if it detects an impossible range?

[Bug d/105942] [12/13 Regression] d: internal compiler error: in visit, at d/expr.cc:945

2022-06-14 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105942

Iain Buclaw  changed:

   What|Removed |Added

URL||https://github.com/dlang/dmd/pull/14210

--- Comment #1 from Iain Buclaw  ---
Fix landed in upstream.

[Bug target/105960] Crash in 32-bit mode

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105960

Richard Biener  changed:

   What|Removed |Added

 Target|x86_64  |i?86-*-*
   Last reconfirmed||2022-06-14
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||hjl.tools at gmail dot com

--- Comment #5 from Richard Biener  ---
Confirmed.  Something is wrong with either ld.so or GCC.  We end up with

        .globl  exp_ref
        .type   exp_ref, @function
exp_ref:
.LFB1:
        .cfi_startproc
        pushl   %ebx
        .cfi_def_cfa_offset 8
        .cfi_offset 3, -8
        popl    %ebx
        .cfi_restore 3
        .cfi_def_cfa_offset 4
        jmp     expfull_ref@PLT

^^^ this crashes

        .type   expfull_ref, @gnu_indirect_function
        .set    expfull_ref,expfull_ref.resolver

        .type   expfull_ref.resolver, @function
expfull_ref.resolver:
.LFB4:
        .cfi_startproc
        pushl   %ebx

but expfull_ref isn't .globl!?


#define TARGET_CLONES  __attribute__((target_clones("default","fma")))
TARGET_CLONES
static inline double
expfull_ref(double x)
{
  return __builtin_pow(x, 0.1234);
}

double
exp_ref(double x)
{
  return expfull_ref(x);
}

[Bug target/105960] [12/13 Regression] Crash in 32-bit mode

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105960

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
   Keywords||needs-bisection, wrong-code
Summary|Crash in 32-bit mode|[12/13 Regression] Crash in
   ||32-bit mode
  Known to work||11.3.0
  Known to fail||12.1.0
   Target Milestone|--- |12.2

[Bug other/105819] GCC 12.1.0 Make failed - Compiled with GCC 4.9.4 and under Mac OS X lion - I

2022-06-14 Thread bug-reports.delphin at laposte dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105819

--- Comment #10 from bug-reports.delphin at laposte dot net ---
Hi :

Ah, OK, maybe a typo on my part. I will look at this.

Kind regards!

PS Please note that my spectacles were too old, and I have had new ones since
last Friday. Progressive lenses, and not easy to see the computer screen...
An adaptation period is needed.

[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code

2022-06-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

Hongtao.liu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #1 from Hongtao.liu  ---
What's interesting is extending slp vectorizer to handle non-pow2p elements
with vector mask.

[Bug target/105965] [10/11/12/13 Regression] x86: single-element vectors don't have scalar FMA insns used anymore

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105965

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Target Milestone|--- |10.4
   Priority|P3  |P2
 Status|UNCONFIRMED |ASSIGNED
Summary|x86: single-element vectors |[10/11/12/13 Regression]
   |don't have scalar FMA insns |x86: single-element vectors
   |used anymore|don't have scalar FMA insns
   ||used anymore
   Last reconfirmed||2022-06-14
 Ever confirmed|0   |1
   Keywords||missed-optimization

--- Comment #2 from Richard Biener  ---
The widen-mul pass now sees

   [local count: 1073741824]:
  _8 = VIEW_CONVERT_EXPR(a_3(D));
  _9 = VIEW_CONVERT_EXPR(b_4(D));
  _10 = _8 * _9;
  _1 = {_10};
  _11 = VIEW_CONVERT_EXPR(_1);
  _12 = VIEW_CONVERT_EXPR(c_5(D));
  _13 = _11 + _12;
  BIT_FIELD_REF <, 32, 0> = _13;
  return ;

which confuses it.  The above is the result from vector lowering, which
presumably sees that V1SFmode isn't supported.  In GCC 8 the above is instead

   [local count: 1073741825]:
  _8 = BIT_FIELD_REF ;
  _9 = BIT_FIELD_REF ;
  _10 = _8 * _9;
  _11 = BIT_FIELD_REF ;
  _12 = _10 + _11;
  _2 = {_12};
   = _2;

that means we are at least missing a match.pd pattern to simplify

  _1 = {_10};
  _11 = VIEW_CONVERT_EXPR(_1);

[Bug tree-optimization/105940] suggested_unroll_factor applying place looks wrong

2022-06-14 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105940

--- Comment #4 from Kewen Lin  ---
(In reply to Richard Biener from comment #2)
> (In reply to Kewen Lin from comment #1)
> > Created attachment 53126 [details]
> > move_applying
> 
> LGTM (maybe the suggested unroll factor should be only applied if the
> suggestion was from a matching with/without SLP analysis, or in fact
> vect_analyze_loop_1 should communicate that down - disabling SLP when
> the one suggesting unrolling did the re-analysis).

Oops, just noticed the nice suggestion.  Will make a follow-up patch for this.
It would look like:
  when working out the suggested unroll factor, save the SLP decision into a
variable passed down from vect_analyze_loop_1.
  when applying the suggested unroll factor, if the saved SLP decision is
false, skip the SLP handling entirely; otherwise, take the normal SLP path but
don't start over with SLP off.

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #16 from Jakub Jelinek  ---
Though, ix86_rot{l,r}di3_doubleword define_insn_and_split patterns were split
only after reload both before and after Roger's change, so somehow whether we
emit it as SImode from the beginning or only split it before reload affects the
RA decisions.
unsigned long long foo (unsigned long long x, int y, unsigned long long z)
{ x ^= z; return (x << 24) | (x >> (-24 & 63)); }
is too simplified, the difference with that is just that we used to emit
setting of the DImode pseudo to 0 before setting its halves with xor, while now
we don't, so it must be something else.

I believe the doubleword rotates were already introduced as post-reload
splitters in PR17886.
Rewriting them from post-reload splitters into pre-reload splitters would
certainly be possible; I will try that.  The question is whether it would cure
this and what effects it would have on other code.
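
To make "splitting a doubleword rotate" concrete, here is a sketch of a 64-bit
rotate-left computed on two 32-bit halves, next to the plain 64-bit form used
in the reduced testcase.  The function names are hypothetical, the halves
version only handles counts 0 < n < 32, and it does not reproduce the actual
instruction sequence that the i386.md patterns emit.

```cpp
#include <cstdint>

// Reference: the 64-bit rotate as written in the reduced testcase,
// valid for 0 < n < 64.
std::uint64_t rotl64 (std::uint64_t x, unsigned n)
{
  return (x << n) | (x >> (-n & 63));
}

// The same rotate expressed on 32-bit halves, valid for 0 < n < 32.
// Each new half takes its high bits from one old half and its low bits
// from the other.
std::uint64_t rotl64_halves (std::uint64_t x, unsigned n)
{
  std::uint32_t lo = static_cast<std::uint32_t> (x);
  std::uint32_t hi = static_cast<std::uint32_t> (x >> 32);
  std::uint32_t new_hi = (hi << n) | (lo >> (32 - n));
  std::uint32_t new_lo = (lo << n) | (hi >> (32 - n));
  return (static_cast<std::uint64_t> (new_hi) << 32) | new_lo;
}
```

On a 32-bit target, both halves and the intermediate values need registers,
which is why the point at which the split happens can influence register
allocation.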

[Bug c++/105968] New: GCC vectorizes but reports that it did not vectorize

2022-06-14 Thread steveire at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105968

Bug ID: 105968
   Summary: GCC vectorizes but reports that it did not vectorize
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: steveire at gmail dot com
  Target Milestone: ---

I'm looking for a way to know if GCC autovectorizes some code.

Starting with this testcase which I picked up somewhere:

```
#define N 1
#define NTIMES 10

double a[N] __attribute__ ((aligned (16)));
double b[N] __attribute__ ((aligned (16)));
double c[N] __attribute__ ((aligned (16)));
double r[N] __attribute__ ((aligned (16)));

int muladd (void) {
  int i, times;
  for (times = 0; times < NTIMES; times++) {
#if 1
    // count up
    for (i = 0; i < N; ++i)
      r[i] = (a[i] + b[i]) * c[i];
#else
    // count down (old gcc won't auto-vectorize)
    for (i = N-1; i >= 0; --i)
      r[i] = (a[i] + b[i]) * c[i];
#endif
  }
  return 0;
}
```

the command

```
g++ -O2 -ftree-vectorize -fno-verbose-asm -mavx2 -fopt-info-vec-all -c test.cpp
```

reports 

```
test.cpp:9:5: note: vectorized 1 loops in function.
```

However, with -O3, GCC reports that it did not vectorize:

```
g++ -O3 -ftree-vectorize -fno-verbose-asm -mavx2 -fopt-info-vec-all -c test.cpp
```

output:

```
test.cpp:9:5: note: vectorized 0 loops in function.
```

even though vector instructions are generated.

Demo https://godbolt.org/z/3o41r7jWc

[Bug c++/105968] GCC vectorizes but reports that it did not vectorize

2022-06-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105968

Hongtao.liu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #1 from Hongtao.liu  ---

> even though vector instructions are generated.

No, those are scalar instructions, but the issue here is why the vectorizer
succeeds with -O2 -ftree-vectorize -mavx2 but not with -O3 -ftree-vectorize
-mavx2.


muladd():
        xor     eax, eax
.L2:
        vmovsd  xmm0, QWORD PTR a[rax]
        vaddsd  xmm0, xmm0, QWORD PTR b[rax]
        add     rax, 8
        vmulsd  xmm0, xmm0, QWORD PTR c[rax-8]
        vmovsd  QWORD PTR r[rax-8], xmm0
        cmp     rax, 8
        jne     .L2
        xor     eax, eax
        ret

[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code

2022-06-14 Thread jbeulich at suse dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

--- Comment #2 from jbeulich at suse dot com ---
(In reply to Hongtao.liu from comment #1)
> What's interesting is extending slp vectorizer to handle non-pow2p elements
> with vector mask.

Well, for starters I think proper pow2 element counts (and especially "native"
vector widths like 128- or 256-bit ones) want dealing with efficiently. But I
agree the principle can be extended to non-pow2 ones.

[Bug c++/105946] [12/13 Regression] ICE in maybe_warn_pass_by_reference, at tree-ssa-uninit.cc:843

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105946

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2022-06-14
   Priority|P3  |P2
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Keywords||needs-reduction
   Target Milestone|--- |12.2
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
Confirmed.

(gdb) p debug_gimple_stmt (stmt)
# .MEM_8 = VDEF <.MEM_7(D)>
_2 = std::__new_allocator >::allocate (__n_1(D), 0B);

842   tree arg = gimple_call_arg (stmt, argno - 1);
(gdb) p argno
$2 = 3
(gdb) p debug_generic_expr (fntype)
struct vector * __new_allocator:: (struct __new_allocator *, size_type,
const void *)

so the number of actual arguments does not match the function type of the call.

I have a simple patch.

[Bug c/105923] unsupported return type ‘complex double’ for simd

2022-06-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105923

--- Comment #4 from Hongtao.liu  ---
Hmm, it's in i386.cc


/* Set CLONEI->vecsize_mangle, CLONEI->mask_mode, CLONEI->vecsize_int,
   CLONEI->vecsize_float and if CLONEI->simdlen is 0, also
   CLONEI->simdlen.  Return 0 if SIMD clones shouldn't be emitted,
   or number of vecsize_mangle variants that should be emitted.  */

static int
ix86_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
                                             struct cgraph_simd_clone *clonei,
                                             tree base_type, int num)
...
          case E_QImode:
          case E_HImode:
          case E_SImode:
          case E_DImode:
          case E_SFmode:
          case E_DFmode:
          /* case E_SCmode: */
          /* case E_DCmode: */
            if (!AGGREGATE_TYPE_P (arg_type))
              break;
            /* FALLTHRU */
          default:
            if (clonei->args[i].arg_type == SIMD_CLONE_ARG_TYPE_UNIFORM)
              break;
            warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
                        "unsupported argument type %qT for simd", arg_type);
            return 0;
          }
      }

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #10 from Jonathan Wakely  ---
(In reply to John Kanapes from comment #8)
> I hope, I have a couple of days before closing this ticket:)

Yes, we usually let a bug sit in WAITING status for a couple of months before
closing it, so you have plenty of time.

[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

Richard Biener  changed:

   What|Removed |Added

 Blocks||88670

--- Comment #3 from Richard Biener  ---
Is not having AVX512VL relevant in the real world?  Some operations (division)
require different handling than zero-extending, masking might be a way out
but that might turn out to be (way?) more expensive.

I agree that it might be interesting to support SLP with not power-of-two
or generally not fully populated lanes.  The load/store side requires
masking support for this (and the missed optimization would be that we do
not define the contents of the masked elements for loads).

Likewise vector lowering could avoid splitting vector ops into scalars when
there's a wider supported vector mode by means of zero-extending.  It might
need a cost model for this.  I think the reporter is aiming at this.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670
[Bug 88670] [meta-bug] generic vector extension issues

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jkanapes at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #11 from John Kanapes  ---
(In reply to Richard Biener from comment #9)
> Note that GCC 9 is no longer supported.  Note one common error resulting in
> SIGILL is when you fall through to an unreachable place which could be
> padding (like when there's a missing return in a function).

Hmmm.
gcc 9.40 is the distro gcc for Ubuntu 20.04, which is LTS and still supported.
Does this mean that no action will be taken upon resolving this ticket?
I am trying to recreate this bug in a smaller, more concise context.
It is not an obvious bug. This is valid code, and it takes a large chain of
previous steps to get it wrong at runtime. It used to work with previous gccs,
but it now seems broken:(
It will remain broken in future releases unless we stop it here:(
I hope you reconsider:)
If that was a problem of a missing return in a function, it would have to be an
internal function. The spot it happens is in main initialization, before it had
a chance to call any of my functions.

[Bug libstdc++/105957] __n * sizeof(_Tp) might overflow under consteval context for std::allocator

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105957

Jonathan Wakely  changed:

   What|Removed |Added

   Keywords||accepts-invalid
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-06-14

--- Comment #1 from Jonathan Wakely  ---
Testcase:

#include <memory>

constexpr auto f()
{
  std::allocator<long long> a;
  auto n = std::size_t(-1) / (sizeof(long long) - 1);
  auto p = a.allocate(n);
  a.deallocate(p, n);
  return n;
}
static_assert( f() );


In practice if the arithmetic wraps around and a smaller buffer is allocated,
any attempt to write beyond the allocated size would be detected in constant
evaluation anyway. So you'd still get a compilation error in most cases.
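
The kind of check at issue can be sketched as a division-based overflow test
performed before the multiplication.  The helper alloc_size_overflows and the
constant testcase_count are hypothetical names, and this is not the libstdc++
implementation; it only shows that the count used in the testcase above would
wrap when multiplied by sizeof(long long).

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper, not libstdc++ code: true if count * elem_size cannot
// be represented in std::size_t.
constexpr bool alloc_size_overflows (std::size_t count, std::size_t elem_size)
{
  return elem_size != 0
         && count > std::numeric_limits<std::size_t>::max () / elem_size;
}

// The element count from the testcase above.
constexpr std::size_t testcase_count
  = std::size_t (-1) / (sizeof (long long) - 1);

static_assert (alloc_size_overflows (testcase_count, sizeof (long long)));
static_assert (!alloc_size_overflows (16, sizeof (long long)));
```

Because the test divides instead of multiplying, it cannot itself wrap around,
and it is usable in constant evaluation as well as at run time.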

[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code

2022-06-14 Thread jbeulich at suse dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

--- Comment #4 from jbeulich at suse dot com ---
(In reply to Richard Biener from comment #3)
> Is not having AVX512VL relevant in the real world?

Wasn't the Xeon-Phi line of processors lacking VL? I have no idea how
widespread their use (still) is, though.

>  Some operations (division) require different handling than zero-extending,
> masking might be a way out but that might turn out to be (way?) more 
> expensive.

By expensive, you mean in terms of compiler changes? I wouldn't expect
execution to be severely affected by using masking, especially when it's
zeroing-masking. Or if it is, then likely because there was not enough pressure
to make this mode work efficiently (after all there were various performance
quirks when AVX and AVX512F were first introduced).

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #12 from Jakub Jelinek  ---
(In reply to John Kanapes from comment #11)
> (In reply to Richard Biener from comment #9)
> > Note that GCC 9 is no longer supported.  Note one common error resulting in
> > SIGILL is when you fall through to an unreachable place which could be
> > padding (like when there's a missing return in a function).
> 
> Hmmm.
> gcc 9.40 is the distro gcc for Ubuntu 20.04, which is LTS and still
> supported.

In this case it is Ubuntu that supports it, so you'd need to ask Ubuntu to fix
it (if it is a compiler bug of course), because upstream GCC 9.5 was the last
release and there won't be any changes for the GCC 9 series.  If it reproduces
with a newer compiler, it can be fixed upstream in the still supported releases
and perhaps Ubuntu could backport it if you ask them to.

> Does this mean that no action will be taken upon resolving this ticket?

Depends on if it is reproducible with a supported compiler.

> I am trying to recreate this bug in a smaller, more concise context.
> It is not an obvious bug. This is valid code, and it takes a large chain of
> previous steps to get it wrong at runtime. It used to work with previous
> gccs, but it now seems broken:(

Claiming it is valid code until it is analyzed is premature.  It can very well
be undefined behavior in the code.

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

Jakub Jelinek  changed:

   What|Removed |Added

   Keywords|needs-bisection |

--- Comment #17 from Jakub Jelinek  ---
So, I've tried:
--- gcc/config/i386/i386.md.jj  2022-06-13 10:53:26.739290704 +0200
+++ gcc/config/i386/i386.md 2022-06-14 11:09:24.467024047 +0200
@@ -13734,14 +13734,13 @@
 ;; shift instructions and a scratch register.

 (define_insn_and_split "ix86_rotl3_doubleword"
- [(set (match_operand: 0 "register_operand" "=r")
-   (rotate: (match_operand: 1 "register_operand" "0")
-(match_operand:QI 2 "" "")))
-  (clobber (reg:CC FLAGS_REG))
-  (clobber (match_scratch:DWIH 3 "=&r"))]
- ""
+ [(set (match_operand: 0 "register_operand")
+   (rotate: (match_operand: 1 "register_operand")
+(match_operand:QI 2 "")))
+  (clobber (reg:CC FLAGS_REG))]
+ "ix86_pre_reload_split ()"
  "#"
- "reload_completed"
+ "&& 1"
  [(set (match_dup 3) (match_dup 4))
   (parallel
[(set (match_dup 4)
@@ -13764,6 +13763,7 @@
   (match_dup 6 0)))
 (clobber (reg:CC FLAGS_REG))])]
 {
+  operands[3] = gen_reg_rtx (mode);
   operands[6] = GEN_INT (GET_MODE_BITSIZE (mode) - 1);
   operands[7] = GEN_INT (GET_MODE_BITSIZE (mode));

@@ -13771,14 +13771,13 @@
 })

 (define_insn_and_split "ix86_rotr3_doubleword"
- [(set (match_operand: 0 "register_operand" "=r")
-   (rotatert: (match_operand: 1 "register_operand" "0")
-  (match_operand:QI 2 "" "")))
-  (clobber (reg:CC FLAGS_REG))
-  (clobber (match_scratch:DWIH 3 "=&r"))]
- ""
+ [(set (match_operand: 0 "register_operand")
+   (rotatert: (match_operand: 1 "register_operand")
+  (match_operand:QI 2 "")))
+  (clobber (reg:CC FLAGS_REG))]
+ "ix86_pre_reload_split ()"
  "#"
- "reload_completed"
+ "&& 1"
  [(set (match_dup 3) (match_dup 4))
   (parallel
[(set (match_dup 4)
@@ -13801,6 +13800,7 @@
     (match_dup 6)))) 0)))
 (clobber (reg:CC FLAGS_REG))])]
 {
+  operands[3] = gen_reg_rtx (mode);
   operands[6] = GEN_INT (GET_MODE_BITSIZE (mode) - 1);
   operands[7] = GEN_INT (GET_MODE_BITSIZE (mode));


On the #c0 test with -O2 -m32 -mno-mmx -mno-sse it makes some difference, but
not as much as one would hope for:
Numbers from gcc 11.3.1 20220614, 11.3.1 20220614 with the patch, 13.0.0
20220610, 13.0.0 20220614 with the patch:
               11.3.1  11.3.1+patch  13.0.0  13.0.0+patch
sub on %esp       428          2556    2620          2556
fn size in B    21657         23186   28413         23534
.s lines         6199          3942    7260          4198
So, trunk patched with the above patch results in significantly fewer
instructions, but larger (more of them use 32-bit immediates, mostly in form of
whatever(%esp) memory source operand).
And the stack usage is high.

I think the patch is still a good idea, it gives the RA more options, but we
should investigate why it consumes so much more stack and results in larger
code.

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #18 from Jakub Jelinek  ---
Of course, size comparisons of -O2 code aren't the most important, for -O2 it
is more important how fast the code is.
When comparing -Os -m32 -mno-mmx -mno-sse, the numbers are
               11.3.1  11.3.1+patch  13.0.0  13.0.0+patch
sub on %esp       412          2564    2620          2564
fn size in B    27535         20508   35036         20416
.s lines         5816          3590    7251          3544
So in the -Os case, the patched functions are both smaller and have fewer
instructions (significantly so), but compared to gcc 11 they still have
significantly higher stack usage.

[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2022-06-14
 CC||rguenth at gcc dot gnu.org

--- Comment #5 from Richard Biener  ---
So "lowering" would turn

 _1 = _2 * _3;

into

 _2' = { _2, {}, ... }; // vector-of-vector CTOR with zero filling
 _3' = { _3, {}, ... };
 _1' = _2' * _3';
 _1 = BIT_FIELD_REF <_1', 0, bitsizeof(_1)>; // lowpart

Little/big-endian needs some thought here.  We currently require all
elements to be explicitly specified for vector-of-vector CTORs; for
scalar element CTORs we allow automatic zero-filling, which would be
convenient here as well.  For division we'd use a vector of ones.

Since lowering is on a per-stmt basis we have to optimize the glue away,
thus

 _2 = BIT_FIELD_REF <_3, 0, bitsizeof(_3)>;
 _1 = { _2, {}, ... };

should ideally become just _3 but then we have to know _3 is zero-filled
or decide we can also have arbitrary values in the upper halves (signed
integer overflow issues, FP with NaNs might be slow, etc.).  The vector
lowering process lacks something like a lattice so it doesn't re-use
previously lowered intermediate results (boo).
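
The lowering sketched above can be written out by hand at the source level
with GCC's vector extension.  This is only an illustration under assumptions:
the element type, the vector sizes and the function name are made up here, and
the real transform would happen on GIMPLE, not in user code.

```c
/* Two-element and four-element float vectors (GCC vector extension).  */
typedef float v2sf __attribute__ ((vector_size (8)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* Multiply two v2sf values by widening them to v4sf first.  */
static v2sf mul_via_wider_vector (v2sf a, v2sf b)
{
  v4sf wide_a = { a[0], a[1], 0.0f, 0.0f };  /* zero-filled upper half */
  v4sf wide_b = { b[0], b[1], 0.0f, 0.0f };
  v4sf wide_r = wide_a * wide_b;             /* operation in the wider mode */
  v2sf r = { wide_r[0], wide_r[1] };         /* take the low part */
  return r;
}
```

For a division the upper half of the divisor would be filled with ones instead
of zeros, as noted above, so that the unused lanes do not divide by zero.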

[Bug other/12081] Gcc can't be compiled with -mregparm=3

2022-06-14 Thread oyvind.harboe at zylin dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12081

--- Comment #35 from oyvind.harboe at zylin dot com ---
SPEC 2017 added SPEC_GCC_VARIADIC_FUNCTIONS_MISMATCH_WORKAROUND to cope with
this error.

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jkanapes at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #13 from John Kanapes  ---
(In reply to Jakub Jelinek from comment #12)
> (In reply to John Kanapes from comment #11)
> > (In reply to Richard Biener from comment #9)

> 
> > I am trying to recreate this bug in a smaller, more concise context.
> > It is not an obvious bug. This is valid code, and it takes a large chain of
> > previous steps to get it wrong at runtime. It used to work with previous
> > gccs, but it now seems broken:(
> 
> Claiming it is valid code until it is analyzed is premature.  It can very
> well be undefined behavior in the code.

True. Except that I have already analyzed it with my own tools. That means that
the offending code, as reported by gdb, compiles and runs fine with -O6
optimization with a simpler code. I am not claiming anything, just stressing
that this is not an obvious issue as reported by gdb, and requires a lot of
previous steps to reproduce:(

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jkanapes at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #14 from John Kanapes  ---
(In reply to Jakub Jelinek from comment #12)
> (In reply to John Kanapes from comment #11)
> > (In reply to Richard Biener from comment #9)
> > > Note that GCC 9 is no longer supported.  Note one common error resulting 
> > > in
> > > SIGILL is when you fall through to an unreachable place which could be
> > > padding
> > > (like when there's a missing return in a function).
> > 
> > Hmmm.
> > gcc 9.40 is the distro gcc for Ubuntu 20.04, which is LTS and still
> > supported.
> 
> In this case it is Ubuntu that supports it, so you'd need to ask Ubuntu to
> fix it
> (if it is a compiler bug of course), because upstream GCC 9.5 was the last
> release and there won't be any changes for the GCC 9 series.  If it
> reproduces with a newer compiler, it can be fixed upstream in the still
> supported releases and perhaps Ubuntu could backport it if you ask them to.
> 
> > Does this mean that no action will be taken upon resolving this ticket?
> 
> Depends on if it is reproducible with a supported compiler.

That works for me. Both places could use my sources. My work won't be in vain:)

[Bug c++/105968] GCC vectorizes but reports that it did not vectorize

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105968

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #2 from Richard Biener  ---
> ./cc1 -quiet t.c -O3 -mavx2 -fopt-info
t.c:11:25: optimized: loops interchanged in loop nest
> ./cc1 -quiet t.c -O2 -mavx2 -fopt-info
t.c:14:19: optimized: loop vectorized using 32 byte vectors

so we interchange the loop to

for (i = 0; i < N; ++i)
  for (times = 0; times < NTIMES; times++)
r[i] = (a[i] + b[i]) * c[i];

which is indeed good for memory locality (now, we should then eliminate
the inner loop completely but we have no such facility - only unrolling
and DSE/DCE would do this but nothing on the high-level loop form).

"Benchmark" issue.  The outer loop should have a memory clobber.

Oh, and we should in theory be able to vectorize the outer loop if
N is a multiple of the vector element count.  But:

t.c:11:25: note:   === vect_analyze_data_ref_accesses ===
t.c:11:25: note:   zero step in inner loop of nest
t.c:11:25: missed:   not vectorized: complicated access pattern.
t.c:15:14: missed:   not vectorized: complicated access pattern.
t.c:11:25: missed:  bad data access.

so we don't handle this exact issue (maybe the offending check can
simply be elided - assuming dependence checking handles zero steps
correctly).

Putting

__asm__ volatile ("" : : : "memory");

at the end of the outer loop vectorizes with -O3 as well (but doesn't
interchange).

Not a bug I think unless you want to make it a bug about not vectorizing
the outer loop after interchange.
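
For reference, a minimal sketch of the benchmark loop with the clobber in
place.  The array names and the loop body follow the snippet above; the values
of N and NTIMES, the element type and the function name kernel are assumptions,
since the original t.c is not quoted here.

```c
#define N 1024      /* assumed array size */
#define NTIMES 100  /* assumed repetition count */

static float a[N], b[N], c[N], r[N];

void kernel (void)
{
  for (int times = 0; times < NTIMES; times++)
    {
      for (int i = 0; i < N; ++i)
        r[i] = (a[i] + b[i]) * c[i];
      /* Compiler barrier: tells GCC that memory may have been read or
         written here, so the repeated iterations cannot be merged.  */
      __asm__ volatile ("" : : : "memory");
    }
}
```

The empty asm emits no instructions; it only constrains what the optimizers
may assume about memory across iterations of the outer loop.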

[Bug c/105969] New: [12/13 Regression] ICE in Floating point exception

2022-06-14 Thread gscfq--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105969

Bug ID: 105969
   Summary: [12/13 Regression] ICE in Floating point exception
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Started between 20220522 and 20220529 :


$ cat z1.c
#include <stdio.h>
struct A
{
  char a[0][0][0];
};
extern struct A b[][2];
void f (void)
{
  sprintf (b[0][0].a[1][0], "%s", b[0][0].a[1][0]);
}


$ gcc-13-20220612 -c z1.c -Wall
z1.c: In function 'f':
during GIMPLE pass: warn-printf
z1.c:9:1: internal compiler error: Floating point exception
9 | }
  | ^
0xc2a33f crash_signal
../../gcc/toplev.cc:322
0x184c71e get_origin_and_offset_r
../../gcc/gimple-ssa-sprintf.cc:2322
0x184c749 get_origin_and_offset_r
../../gcc/gimple-ssa-sprintf.cc:2385
0x185267f get_origin_and_offset
../../gcc/gimple-ssa-sprintf.cc:2447
0x185267f handle_printf_call(gimple_stmt_iterator*, pointer_query&)
../../gcc/gimple-ssa-sprintf.cc:4714
0xdfd21d strlen_pass::check_and_optimize_call(bool*)
../../gcc/tree-ssa-strlen.cc:5461
0xdfdbe1 strlen_pass::check_and_optimize_stmt(bool*)
../../gcc/tree-ssa-strlen.cc:5665
0xdfdfb4 strlen_pass::before_dom_children(basic_block_def*)
../../gcc/tree-ssa-strlen.cc:5849
0x17e9284 dom_walker::walk(basic_block_def*)
../../gcc/domwalk.cc:309
0xdfe420 printf_strlen_execute
../../gcc/tree-ssa-strlen.cc:5908

[Bug c/105970] New: ICE in ix86_function_arg, at config/i386/i386.cc:3351

2022-06-14 Thread gscfq--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105970

Bug ID: 105970
   Summary: ICE in ix86_function_arg, at config/i386/i386.cc:3351
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Affects versions down to r7, with files gcc.dg/torture/pr68037-*.c :


$ gcc-13-20220612 -c pr68037-1.c -mx32 -mgeneral-regs-only
$
$ gcc-13-20220612 -c pr68037-1.c -mx32 -mgeneral-regs-only -maddress-mode=long
during RTL pass: expand
pr68037-1.c: In function 'fn':
pr68037-1.c:32:1: internal compiler error: in ix86_function_arg, at
config/i386/i386.cc:3351
   32 | fn (struct interrupt_frame *frame, uword_t error)
  | ^~
0xf2c309 ix86_function_arg
../../gcc/config/i386/i386.cc:3351
0x9313c8 assign_parm_find_entry_rtl
../../gcc/function.cc:2535
0x9313c8 assign_parms
../../gcc/function.cc:3673
0x933607 expand_function_start(tree_node*)
../../gcc/function.cc:5161
0x7d7c21 execute
../../gcc/cfgexpand.cc:6695

[Bug c/105971] New: [12/13 Regression] ICE in bitmap_check_index, at sbitmap.h:104

2022-06-14 Thread gscfq--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105971

Bug ID: 105971
   Summary: [12/13 Regression] ICE in bitmap_check_index, at
sbitmap.h:104
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Started between 20211121 and 20211128, at -O1+ :
(gcc configured with --enable-checking=yes)


$ cat z1.c
void a()
{
  int b;
  int c;
  int d = a;
  _Complex float *e = a;
  for (;;) {
(*e += d) / b ?: 0;
  }
}


$ gcc-13-20220612 -c z1.c -O2
z1.c: In function 'a':
z1.c:5:11: warning: initialization of 'int' from 'void (*)()' makes integer
from pointer without a cast [-Wint-conversion]
5 |   int d = a;
  |   ^
z1.c:6:23: warning: initialization of '_Complex float *' from incompatible
pointer type 'void (*)()' [-Wincompatible-pointer-types]
6 |   _Complex float *e = a;
  |   ^
during GIMPLE pass: dse
z1.c:10:1: internal compiler error: in bitmap_check_index, at sbitmap.h:104
   10 | }
  | ^
0x1e915d1 bitmap_check_index
../../gcc/sbitmap.h:104
0x1e915d1 bitmap_bit_in_range_p(simple_bitmap_def const*, unsigned int,
unsigned int)
../../gcc/sbitmap.cc:336
0xf7f37c live_bytes_read
../../gcc/tree-ssa-dse.cc:786
0xf7f37c dse_classify_store(ao_ref*, gimple*, bool, simple_bitmap_def*, bool*,
tree_node*)
../../gcc/tree-ssa-dse.cc:1007
0xf827f8 dse_optimize_stmt
../../gcc/tree-ssa-dse.cc:1421
0xf827f8 execute
../../gcc/tree-ssa-dse.cc:1527

[Bug c/105972] New: [12/13 Regression] ICE in lower_stmt, at gimple-low.cc:312

2022-06-14 Thread gscfq--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105972

Bug ID: 105972
   Summary: [12/13 Regression] ICE in lower_stmt, at
gimple-low.cc:312
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Started between 20211017 and 20211024 :
(gcc configured with --enable-checking=yes)


$ cat z1.c
__attribute__((optimize(0)))
int f ()
{
  int g ()
}


$ gcc-13-20220612 -c z1.c -g -O2
z1.c: In function 'g':
z1.c:5:1: error: expected declaration specifiers before '}' token
5 | }
  | ^
z1.c:6: error: expected '{' at end of input
z1.c: In function 'f':
z1.c:5:1: error: expected declaration or statement at end of input
5 | }
  | ^
during GIMPLE pass: lower
z1.c:2:5: internal compiler error: in lower_stmt, at gimple-low.cc:312
2 | int f ()
  | ^
0x1c2901d lower_stmt
../../gcc/gimple-low.cc:312
0x1c2901d lower_sequence
../../gcc/gimple-low.cc:217
0x1c27d79 lower_gimple_bind
../../gcc/gimple-low.cc:475
0x1c291a8 lower_function_body
../../gcc/gimple-low.cc:110
0x1c291a8 execute
../../gcc/gimple-low.cc:195

[Bug target/105965] [10/11/12/13 Regression] x86: single-element vectors don't have scalar FMA insns used anymore

2022-06-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105965

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:90467f0ad649d0817f9e034596a0fb85605b55af

commit r13-1085-g90467f0ad649d0817f9e034596a0fb85605b55af
Author: Richard Biener 
Date:   Tue Jun 14 10:59:49 2022 +0200

middle-end/105965 - add missing v_c_e <{ el }> simplification

When we got the simplification of bit-field-ref to view-convert
we lost the ability to detect FMAs since we cannot look through

  _1 = {_10};
  _11 = VIEW_CONVERT_EXPR(_1);

the following amends the (view_convert CONSTRUCTOR) pattern
to handle this case.

2022-06-14  Richard Biener  

PR middle-end/105965
* match.pd (view_convert CONSTRUCTOR): Handle single-element
CTOR case.

* gcc.target/i386/pr105965.c: New testcase.

[Bug c++/105946] [12/13 Regression] ICE in maybe_warn_pass_by_reference, at tree-ssa-uninit.cc:843

2022-06-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105946

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:e07a876c07601e1f3a27420f7d055d20193c362c

commit r13-1086-ge07a876c07601e1f3a27420f7d055d20193c362c
Author: Richard Biener 
Date:   Tue Jun 14 11:10:13 2022 +0200

tree-optimization/105946 - avoid accessing excess args from uninit diag

uninit diagnostics uses passing via reference and access attributes
but that iterates over function type arguments which can in some
cases apparently outrun the actual arguments leading to ICEs.
The following simply ignores not present arguments.

2022-06-14  Richard Biener  

PR tree-optimization/105946
* tree-ssa-uninit.cc (maybe_warn_pass_by_reference):
Do not look at arguments not specified in the function call.

[Bug target/105965] [10/11/12 Regression] x86: single-element vectors don't have scalar FMA insns used anymore

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105965

Richard Biener  changed:

   What|Removed |Added

Summary|[10/11/12/13 Regression]|[10/11/12 Regression] x86:
   |x86: single-element vectors |single-element vectors
   |don't have scalar FMA insns |don't have scalar FMA insns
   |used anymore|used anymore
  Known to work||13.0

--- Comment #4 from Richard Biener  ---
Fixed on trunk so far.

[Bug c++/105946] [12 Regression] ICE in maybe_warn_pass_by_reference, at tree-ssa-uninit.cc:843

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105946

Richard Biener  changed:

   What|Removed |Added

Summary|[12/13 Regression] ICE in   |[12 Regression] ICE in
   |maybe_warn_pass_by_referenc |maybe_warn_pass_by_referenc
   |e, at   |e, at
   |tree-ssa-uninit.cc:843  |tree-ssa-uninit.cc:843
  Known to work||13.0
  Known to fail||12.1.0

--- Comment #3 from Richard Biener  ---
Fixed on trunk so far.

[Bug tree-optimization/105832] [13 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 12.1.0)

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105832

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
Investigating.

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #15 from Jonathan Wakely  ---
Just running in GDB doesn't find bugs (and there is no -O6 level, -O3 is the
highest).

Did you try it with -fsanitize=undefined yet?

[Bug tree-optimization/105973] New: Wrong branch prediction for if (COND) { if(x) noreturn1(); else noreturn2(); }

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105973

Bug ID: 105973
   Summary: Wrong branch prediction for if (COND) { if(x)
noreturn1(); else noreturn2(); }
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

Given this code:

__attribute__((noreturn)) void throw1();
__attribute__((noreturn)) void throw2();

typedef decltype(sizeof(0)) size_t;

#if defined LIKELY
# define PREDICT(C) __builtin_expect(C,1)
#elif defined UNLIKELY
# define PREDICT(C) __builtin_expect(C,0)
#else
# define PREDICT(C) (C)
#endif

template<typename T>
T* allocate(size_t n)
{
  if (PREDICT(n > (__PTRDIFF_MAX__ / sizeof(T))))
  {
    if (n > (__SIZE_MAX__ / sizeof(T)))
      throw1();
    throw2();
  }

  return (T*) ::operator new(n * sizeof(T));
}

int* alloc_int(size_t n)
{
  return allocate<int>(n);
}


The condition decorated with PREDICT is compiled to different code with
-DLIKELY and -DUNLIKELY, as expected.

However with neither macro defined, the result is the same as -DLIKELY (for any
optimization level > -O0), i.e. the calls to throw1 and throw2 come first and
the return statement requires a branch:

_Z9alloc_intm:
.LFB1:
        .cfi_startproc
        movq    %rdi, %rax
        shrq    $61, %rax
        je      .L2
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        shrq    $62, %rdi
        je      .L3
        call    _Z6throw1v
        .p2align 4,,10
        .p2align 3
.L3:
        call    _Z6throw2v
        .p2align 4,,10
        .p2align 3
.L2:
        .cfi_def_cfa_offset 8
        salq    $2, %rdi
        jmp     _Znwm
        .cfi_endproc


Surely this is wrong?

If calling a noreturn function is considered unlikely, then surely entering a
block that always calls a noreturn function should also be unlikely?

Clang gets this right, generating the same code as UNLIKELY by default, and
only requiring a branch for the return value when LIKELY is defined.


This code is reduced from std::allocator in libstdc++ and I thought I should be
able to remove a redundant __builtin_expect, but it's needed due to this.
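
A rough C analogue of the workaround currently needed, i.e. keeping the
explicit hint on the outer condition.  This is a sketch, not the libstdc++
code: fail_too_large and fail_bad_alloc are hypothetical stand-ins for throw1
and throw2, and malloc stands in for operator new.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical noreturn error paths (stand-ins for throw1 and throw2).  */
__attribute__ ((noreturn)) static void fail_too_large (void) { abort (); }
__attribute__ ((noreturn)) static void fail_bad_alloc (void) { abort (); }

int *alloc_ints (size_t n)
{
  /* The second argument 0 marks the whole error block as unlikely, so the
     allocation below can stay on the fall-through path.  */
  if (__builtin_expect (n > PTRDIFF_MAX / sizeof (int), 0))
    {
      if (n > SIZE_MAX / sizeof (int))
        fail_too_large ();
      fail_bad_alloc ();
    }

  return malloc (n * sizeof (int));
}
```

If the predictor treated a block that always ends in a noreturn call as cold,
the __builtin_expect here would be redundant, which is the point of this
report.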

[Bug tree-optimization/105973] Wrong branch prediction for if (COND) { if(x) noreturn1(); else noreturn2(); }

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105973

--- Comment #1 from Jonathan Wakely  ---
In fact we get it wrong even if both branches call the same noreturn function:

  if (PREDICT(n > (__PTRDIFF_MAX__ / sizeof(T))))
  {
    if (n > (__SIZE_MAX__ / sizeof(T)))
      throw1();
    throw1();
  }


This is not compiled to the same code as:

  if (PREDICT(n > (__PTRDIFF_MAX__ / sizeof(T))))
  {
    throw1();
  }

even though it has identical effects.

[Bug tree-optimization/105832] [13 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 12.1.0)

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105832

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
So the difference boils down to GCC 12 ending up with

if (iftmp.0_9 == 1)
 {
   if (iftmp.1_10 != 0)
 {
   loop with call to foo ();
 }
 }

while the new unswitching code swaps these and ends up with

if (iftmp.1_10 != 0)
 {
   if (iftmp.0_9 == 1)
 {
   loop with call to foo ();
 }
 }

the old code also created one pointless unreachable loop copy.  GCC 12
manages to elide the loop calling foo() in thread2 after fre5.

There's nothing wrong with unswitching here I think - we're at most unlucky
with the order of unswitchings (but that might change from current 'random'
to a cost based order).
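
To make the two shapes concrete, here is a hand-unswitched sketch in C.  It is
a hypothetical reduction, not the testcase from this PR: the parameters c0 and
c1 play the role of iftmp.0_9 and iftmp.1_10, and foo just counts its calls.

```c
static int foo_calls;
static void foo (void) { ++foo_calls; }

/* Before unswitching: both conditions are loop-invariant.  */
void loop_original (int c0, int c1, int n)
{
  for (int i = 0; i < n; ++i)
    if (c0 == 1 && c1 != 0)
      foo ();
}

/* Unswitched with the c0 test outermost (the GCC 12 shape above).  */
void loop_c0_outer (int c0, int c1, int n)
{
  if (c0 == 1)
    {
      if (c1 != 0)
        for (int i = 0; i < n; ++i)
          foo ();
    }
}

/* Unswitched with the c1 test outermost (the new shape above).  */
void loop_c1_outer (int c0, int c1, int n)
{
  if (c1 != 0)
    {
      if (c0 == 1)
        for (int i = 0; i < n; ++i)
          foo ();
    }
}
```

All three call foo the same number of times; only the nesting order of the
guards differs, and that order is what later passes see when they try to
prove the loop dead.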

[Bug tree-optimization/105973] Wrong branch prediction for if (COND) { if(x) noreturn1(); else noreturn2(); }

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105973

--- Comment #2 from Jonathan Wakely  ---
https://godbolt.org/z/asecWe6KK

[Bug c/105970] ICE in ix86_function_arg, at config/i386/i386.cc:3351

2022-06-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105970

Uroš Bizjak  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-06-14
 CC||hjl.tools at gmail dot com
 Ever confirmed|0   |1

--- Comment #1 from Uroš Bizjak  ---
Probably something like:

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 3d189e124e4..f158cc3aaea 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -3348,7 +3348,7 @@ ix86_function_arg (cumulative_args_t cum_v, const
function_arg_info &arg)
   if (POINTER_TYPE_P (arg.type))
{
  /* This is the pointer argument.  */
- gcc_assert (TYPE_MODE (arg.type) == Pmode);
+ gcc_assert (TYPE_MODE (arg.type) == ptr_mode);
  /* It is at -WORD(AP) in the current frame in interrupt and
 exception handlers.  */
  reg = plus_constant (Pmode, arg_pointer_rtx, -UNITS_PER_WORD);

Pointer mode and Pmode can be distinct for the x32 target.  However, I have no
idea what goes into the interrupt frame for x32.  Let's ask HJ.

[Bug c++/105838] g++ 12.1.0 runs out of memory or time when building const std::vector of std::strings

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105838

--- Comment #3 from Richard Biener  ---
Created attachment 53133
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53133&action=edit
unincluded, and reduced

This "reduced" testcase peaks at 3.8GB memory.

> /usr/bin/time /space/rguenther/install/gcc-12.1/bin/g++ -S -O /tmp/t.C
8.68user 1.13system 0:10.03elapsed 97%CPU (0avgtext+0avgdata
3813480maxresident)k
17328inputs+2104outputs (28major+961476minor)pagefaults 0swaps

simply doubling the initializer grows it to 14.8GB

> /usr/bin/time /space/rguenther/install/gcc-12.1/bin/g++ -S -O /tmp/t.C
43.02user 4.49system 0:47.51elapsed 99%CPU (0avgtext+0avgdata
14861052maxresident)k
0inputs+4088outputs (0major+3727738minor)pagefaults 0swaps

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jkanapes at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #16 from John Kanapes  ---
Good to know (O3).
I have posted my -fsanitize=undefined results.
It doesn't compile with it, but I need help to fix that, because I don't know
what it means:(

[Bug tree-optimization/105739] [10 Regression] Miscompilation of Linux kernel update.c

2022-06-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105739

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:8f6c317b3a16350698f3c9e0accb43a9b4acb4ae

commit r13-1089-g8f6c317b3a16350698f3c9e0accb43a9b4acb4ae
Author: Jan Hubicka 
Date:   Tue Jun 14 14:05:53 2022 +0200

Fix ipa-cp wrt volatile loads

Check for volatile flag to ipa_load_from_parm_agg.

gcc/ChangeLog:

2022-06-10  Jan Hubicka  

PR ipa/105739
* ipa-prop.cc (ipa_load_from_parm_agg): Punt on volatile loads.

gcc/testsuite/ChangeLog:

2022-06-10  Jan Hubicka  

* gcc.dg/ipa/pr105739.c: New test.

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #17 from Jakub Jelinek  ---
If you mean https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950#c2 , no, you
have just posted what is a user error in using the sanitizers and we've told
you how to fix that.  The -fsanitize=undefined option must be added not only to
the gcc command line where you compile object files (e.g. by adding it to the
CFLAGS or CXXFLAGS vars), but also where you link the program (or shared
library), so e.g. in LDFLAGS, because when linking it takes care of adding
-lubsan to the linker command line.
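
A small sketch of what that looks like in practice.  The file names scale.c,
main.c and prog and the function shift_left are hypothetical; the commands in
the comment pass the option to every compile step and to the link step.

```c
/* Hypothetical file scale.c.  Build with the option on every step:

     gcc -O2 -fsanitize=undefined -c scale.c
     gcc -O2 -fsanitize=undefined -c main.c
     gcc -fsanitize=undefined scale.o main.o -o prog

   The last command links through the gcc driver, which adds the
   sanitizer runtime itself; no explicit -lubsan is needed.  */

/* Shifts value left by count bits.  A count that is negative or not
   smaller than the width of int is undefined behavior, which the
   sanitizer reports at run time.  */
int shift_left (int value, int count)
{
  return value << count;
}
```

With Makefile-style builds this usually means adding the option to both CFLAGS
and LDFLAGS, as described above.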

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #18 from Jonathan Wakely  ---
Two of us have already explained that (comment 3 and comment 6, and now comment
17).

[Bug tree-optimization/105739] [10 Regression] Miscompilation of Linux kernel update.c

2022-06-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105739

--- Comment #12 from Jakub Jelinek  ---
Thanks, I have verified that on the #c0 testcase on 10 branch it makes both
__builtin_unreachable calls go away.

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jkanapes at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #19 from John Kanapes  ---
Aaaah. So it's different than the other gcc flags...
I just linked libubsan...
No compilation errors. At runtime it SIGILLS at the same gdb point as before...
Same as the rest of the recommended flags.

BTW since -O3 is the highest gcc optimization, gcc could print a warning:
Warning -Ox is deprecated. Downgrading to -O3;-)
Otherwise in a few years you will find code compiled with -O20 and then it is
the sky. It just takes 1 coder to use it in open source, and since gcc seems to
take it, all the other coders will copy it:(

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jkanapes at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #20 from John Kanapes  ---
(In reply to Jonathan Wakely from comment #18)
> Two of us have already explained that (comment 3 and comment 6, and now
> comment 17).

I couldn't understand what you were talking about. It is listed with the other
-f gcc flags:( To avoid confusion, you could update your description to say
that this flag is special, needs to be linked with -lubsan, and does that
itself...

[Bug gcov-profile/101487] [GCOV] Wrong coverage of "switch" inside "while" loop

2022-06-14 Thread njuwy at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101487

Yang Wang  changed:

   What|Removed |Added

 Resolution|INVALID |FIXED

[Bug gcov-profile/101487] [GCOV] Wrong coverage of "switch" inside "while" loop

2022-06-14 Thread njuwy at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101487

Yang Wang  changed:

   What|Removed |Added

 Status|RESOLVED|UNCONFIRMED
 Resolution|FIXED   |---

--- Comment #2 from Yang Wang  ---
it still exists in the latest version

[Bug gcov-profile/100980] [GCOV]The assignment statement in the “for” structure caused the wrong coverage

2022-06-14 Thread njuwy at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100980

Yang Wang  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Yang Wang  ---
fixed in later version

[Bug libstdc++/105934] [10/11/12/13 Regression] C++11 pointer versions of atomic_fetch_add missing because of P0558

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105934

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #5 from Jonathan Wakely  ---
LWG consensus was that the breakage is OK for C++17, and there was no desire to
support this code even for C++11 and C++14 modes.

So I'm closing this as WONTFIX.

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #21 from Jonathan Wakely  ---
What we said is to use -fsanitize=undefined when linking, not add -lubsan
manually. I don't know how I could have said that more clearly than comment 6.

This is not different to other flags, there are plenty of other flags that are
needed both when compiling and linking.

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jkanapes at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #22 from John Kanapes  ---
OK. Removed -lubsan. Added -fsanitize=undefined to linking
Same result as all the other flags.

It took you 4 posts to explain to me what to do.
It took me 4 posts to understand what you were talking about.
You should explain better.

[Bug gcov-profile/101618] [GCOV] Wrong coverage caused by call site in a "for" statement

2022-06-14 Thread njuwy at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101618

Yang Wang  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Yang Wang  ---
fixed in later version

[Bug c++/105838] [10/11/12/13 Regression] g++ 12.1.0 runs out of memory or time when building const std::vector of std::strings

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105838

Richard Biener  changed:

   What|Removed |Added

Summary|g++ 12.1.0 runs out of  |[10/11/12/13 Regression]
   |memory or time when |g++ 12.1.0 runs out of
   |building const std::vector  |memory or time when
   |of std::strings |building const std::vector
   ||of std::strings
 Blocks||93199
   Target Milestone|--- |10.4
   Priority|P3  |P2
 CC||ebotcazou at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener  ---
Memory usage is from cleanup_empty_eh_merge_phis which deals with a very large
number of incoming edges, recording the edge/var mappings.  This likely runs
into

  /* The post-order traversal may lead to quadraticness in the redirection
 of incoming EH edges from inner LPs, so first try to walk the region
 tree from inner to outer LPs in order to eliminate these edges.  */

where we end up re-directing more and more edges again and again.  Still the
peak memory use is odd, but it might be simply GC garbage piling up in the
CFG manipulation odyssey.

It's removal of MNT regions - with just 3 elements we go in ehcleanup1 from

Before removal of unreachable regions:
Eh tree:
   25 must_not_throw
   1 cleanup land:{12,}
 24 cleanup
 23 must_not_throw
 2 cleanup land:{11,}
   22 must_not_throw
   3 cleanup land:{10,}
 21 must_not_throw
 4 cleanup land:{9,}
   20 must_not_throw
   5 cleanup land:{1,}
 19 must_not_throw
 6 cleanup land:{8,}
   18 must_not_throw
   7 cleanup land:{2,}
 17 must_not_throw
 8 cleanup land:{7,}
   16 must_not_throw
   9 cleanup land:{3,}
 15 must_not_throw
 10 cleanup land:{6,}
   14 must_not_throw
   11 cleanup land:{5,}
 13 must_not_throw
 12 cleanup land:{4,}

to

After removal of unreachable regions:
Eh tree:
   1 cleanup land:{12,}
 2 cleanup land:{11,}
   3 cleanup land:{10,}
 4 cleanup land:{9,}
   5 cleanup land:{1,}
 6 cleanup land:{8,}
   7 cleanup land:{2,}
 8 cleanup land:{7,}
   9 cleanup land:{3,}
 10 cleanup land:{6,}
   11 cleanup land:{5,}
 12 cleanup land:{4,}

but we do this in a sub-optimal order.  Axing the first walk:

  for (i = vec_safe_length (cfun->eh->lp_array) - 1; i >= 1; --i)
    {
      lp = (*cfun->eh->lp_array)[i];
      if (lp)
        changed |= cleanup_empty_eh (lp);
    }

fixes this but it will go against the PR93199 fix in r10-5868-g5eaf0c498f718f,
which the followup r11-3234-gaab6194d0898f5 preserved.  I fear the
optimal order is different for the clobber optimizations and the edge
redirection overhead.

In any case a fix should be evaluated against the PR93199 testcase as well.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93199
[Bug 93199] [9 Regression] Compile time hog in sink_clobbers

[Bug c++/105838] [10/11/12/13 Regression] g++ 12.1.0 runs out of memory or time when building const std::vector of std::strings

2022-06-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105838

Richard Biener  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #5 from Richard Biener  ---
btw, the unincluded testcase ended up too small, not matching the posted
numbers (I had to hit reload and cut it further at that point ...).

[Bug c++/105838] [10/11/12/13 Regression] g++ 12.1.0 runs out of memory or time when building const std::vector of std::strings

2022-06-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105838

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org,
   ||redi at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek  ---
Note, for say
#include <string>
#include <vector>

void foo (const std::vector<std::string> &);
int main ()
{
  const std::vector<std::string> lst = {
    "aahing", "aaliis", "aarrgh", "abacas", "abacus", "abakas", "abamps",
    "abands", "abased", "abaser", "abases", "abasia" };
  foo (lst);
}
one gets terrible code from both g++ and clang++, in both cases it is serial
code calling many std::string ctors with the string literal arguments
that perhaps later on are inlined.  Over 21000 times in a row.  That also means
over 21000 memory allocations etc.
For your game, the obvious first question would be if you really need
std::vector of std::string in this case and if a normal array of const char *
strings wouldn't be better, that can be initialized at compile time.
Or, if you really need std::vector, if it wouldn't be better to
use array of const char * and build the
vector from it (sizeof (arr) / sizeof (arr[0]) to reserve that many elts in the
vector, then a loop that will construct
the std::string objects and move them into the list).
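
A minimal sketch of that second suggestion, assuming a hypothetical table named
words and a helper named make_word_list (neither is from the reporter's code).
The literals stay in a constant array of pointers, and the vector is filled in
a loop instead of from one huge initializer list. It uses emplace_back, which
constructs each string in place rather than constructing and then moving it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative word table; the real list would have many more entries.
static const char *const words[] = {
  "aahing", "aaliis", "aarrgh", "abacas", "abacus", "abakas", "abamps",
  "abands", "abased", "abaser", "abases", "abasia" };

// Build the vector from the table at run time.
std::vector<std::string> make_word_list ()
{
  const std::size_t count = sizeof (words) / sizeof (words[0]);
  std::vector<std::string> lst;
  lst.reserve (count);            // one allocation for the element storage
  for (std::size_t i = 0; i < count; ++i)
    lst.emplace_back (words[i]);  // construct each std::string in place
  return lst;
}
```

The caller would then write something like
const std::vector<std::string> lst = make_word_list ();
so the source stays a short loop however long the table grows.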

On the compiler side, a question is if we shouldn't detect such kind of
initializers and if they have over some param determined number of elements
which have the same type / kind (or at least a large sequence of such), don't
emit those
std::allocator::allocator (&D.37541);
try
  {
std::__cxx11::basic_string::basic_string<> (_4,
"aahing", &D.37541);
D.37581 = D.37581 + 32;
D.37582 = D.37582 + -1;
_5 = D.37581;
try
  {
std::allocator::allocator (&D.37543);
try
  {
   
std::__cxx11::basic_string::basic_string<> (_5, "aaliis", &D.37543);
D.37581 = D.37581 + 32;
D.37582 = D.37582 + -1;
_6 = D.37581;
try
  {
...
but a loop.  Doesn't have to be just for the STL types, if we have
struct S { S (int); ... };
  const S s[] = { 1, 3, 22, 42, 132, -12, 18, 19, 32, 0, 25, ... };
then again there should be some upper limit over which we'd just emit:
  const S s[count];
  static const int stemp[count] = { 1, 3, 22, 42, 132, -12, 18, 19, 32, 0, 25,
... };
  for (size_t x = 0; x < count; ++x) S (&s[x], stemp[x]);
or so (of course, with destruction possibility if some ctor may throw).
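
Written out by hand, the idea could look roughly like the sketch below. It is
only an approximation of what the compiler might emit, not a description of any
existing GCC code; S, stemp, storage, elem and construct_all are illustrative
names, and the array is kept in raw storage so that the elements can be
constructed one by one.

```cpp
#include <cstddef>
#include <new>

// Illustrative type with a converting constructor, standing in for S above.
struct S
{
  int v;
  S (int x) : v (x) {}
};

// The constructor arguments live in a plain constant table.
static const int stemp[] = { 1, 3, 22, 42, 132, -12, 18, 19, 32, 0, 25 };
constexpr std::size_t count = sizeof (stemp) / sizeof (stemp[0]);

// Raw, suitably aligned storage for count objects of type S.
alignas (S) static unsigned char storage[count * sizeof (S)];

// Pointer to the x-th element; meaningful only once it has been constructed.
S *elem (std::size_t x)
{
  return reinterpret_cast<S *> (storage + x * sizeof (S));
}

// Construct all elements in a loop.  If a constructor throws, destroy the
// elements built so far in reverse order and rethrow.
std::size_t construct_all ()
{
  std::size_t done = 0;
  try
    {
      for (; done < count; ++done)
        ::new (static_cast<void *> (storage + done * sizeof (S)))
          S (stemp[done]);
    }
  catch (...)
    {
      while (done > 0)
        elem (--done)->~S ();
      throw;
    }
  return count;
}
```

The code size is then independent of the number of elements; only the constant
table grows.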

[Bug target/105920] __builtin_cpu_supports ("f16c") should check AVX

2022-06-14 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105920

H.J. Lu  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
   Target Milestone|--- |11.4

--- Comment #2 from H.J. Lu  ---
Fixed for GCC 13 by

https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=751f306688508b08842d0ab967dee8e6c3b91351

Fixed for GCC 12.2 by:

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=4b06b7304066fb1016e017d15e189f2e745dceae

Fixed for GCC 11.4 by

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=30c1cde3adec938606cd49b1b4a262590b496719

[Bug middle-end/105638] Redundant stores aren't removed by DSE

2022-06-14 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105638

H.J. Lu  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
   Target Milestone|--- |13.0

--- Comment #3 from H.J. Lu  ---
Fixed.

[Bug target/105960] [12/13 Regression] Crash in 32-bit mode

2022-06-14 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105960

--- Comment #6 from H.J. Lu  ---
This is caused by r12-5771.

[Bug target/105974] New: [13 Regression] ICE: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in arm_bfi_1_p, at config/arm/arm.cc:10214

2022-06-14 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105974

Bug ID: 105974
   Summary: [13 Regression] ICE: RTL check: expected elt 0 type
'i' or 'n', have 'w' (rtx const_int) in arm_bfi_1_p,
at config/arm/arm.cc:10214
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: build, ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: armv7a-hardfloat-linux-gnueabi

Created attachment 53134
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53134&action=edit
reduced testcase

This currently breaks build with RTL checking enabled.

Compiler output:
$ /repo/build-gcc-trunk-armv7a-hardfloat/./gcc/cc1 -O2 -march=armv7-a+vfpv4
testcase.c 
 __gnu_fractqquda
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> {heap 932k}  {heap 932k} 
{heap 932k}  {heap 1212k}  {heap 1684k}
 {heap 1684k}  {heap 1684k} 
{heap 1684k}Streaming LTO
  {heap 1684k}  {heap 1684k}  {heap
1684k}  {heap 1684k}  {heap 1684k}  {heap 1684k} 
{heap 1684k}  {heap 1684k}  {heap 1684k}  {heap
1684k}  {heap 1684k}  {heap 1684k} 
{heap 1684k}  {heap 1684k}Assembling functions:
 __gnu_fractqqudaduring RTL pass: combine

testcase.c: In function '__gnu_fractqquda':
testcase.c:9:1: internal compiler error: RTL check: expected elt 0 type 'i' or
'n', have 'w' (rtx const_int) in arm_bfi_1_p, at config/arm/arm.cc:10214
9 | }
  | ^
0x71d11e rtl_check_failed_type2(rtx_def const*, int, int, int, char const*,
int, char const*)
/repo/gcc-trunk/gcc/rtl.cc:907
0x7d01d3 arm_bfi_1_p
/repo/gcc-trunk/gcc/config/arm/arm.cc:10214
0x14406d6 arm_bfi_p
/repo/gcc-trunk/gcc/config/arm/arm.cc:10255
0x14406d6 arm_rtx_costs_internal
/repo/gcc-trunk/gcc/config/arm/arm.cc:11027
0x14406d6 arm_rtx_costs
/repo/gcc-trunk/gcc/config/arm/arm.cc:12058
0x102c33e rtx_cost(rtx_def*, machine_mode, rtx_code, int, bool)
/repo/gcc-trunk/gcc/rtlanal.cc:4629
0x1b69e98 set_src_cost
/repo/gcc-trunk/gcc/rtl.h:2943
0x1b69e98 distribute_and_simplify_rtx
/repo/gcc-trunk/gcc/combine.cc:10013
0x1b77941 simplify_logical
/repo/gcc-trunk/gcc/combine.cc:7103
0x1b77941 combine_simplify_rtx
/repo/gcc-trunk/gcc/combine.cc:6330
0x1b79d19 subst
/repo/gcc-trunk/gcc/combine.cc:5605
0x1b7d3d7 try_combine
/repo/gcc-trunk/gcc/combine.cc:3288
0x1b85dd5 combine_instructions
/repo/gcc-trunk/gcc/combine.cc:1266
0x1b85dd5 rest_of_handle_combine
/repo/gcc-trunk/gcc/combine.cc:14976
0x1b85dd5 execute
/repo/gcc-trunk/gcc/combine.cc:15021
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ /repo/build-gcc-trunk-armv7a-hardfloat/./gcc/xgcc -v
Using built-in specs.
COLLECT_GCC=/repo/build-gcc-trunk-armv7a-hardfloat/./gcc/xgcc
Target: armv7a-hardfloat-linux-gnueabi
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --with-float=hard --with-fpu=vfpv4
--with-arch=armv7-a --with-sysroot=/usr/armv7a-hardfloat-linux-gnueabi
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=armv7a-hardfloat-linux-gnueabi
--with-ld=/usr/bin/armv7a-hardfloat-linux-gnueabi-ld
--with-as=/usr/bin/armv7a-hardfloat-linux-gnueabi-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r13-1089-20220614140553-g8f6c317b3a1-checking-yes-rtl-df-extra-armv7a-hardfloat
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.0 20220614 (experimental) (GCC)

[Bug target/105975] New: OpenMP/nvptx offloading: 'internal compiler error: in maybe_legitimize_operand, at optabs.cc:7785'

2022-06-14 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105975

Bug ID: 105975
   Summary: OpenMP/nvptx offloading: 'internal compiler error: in
maybe_legitimize_operand, at optabs.cc:7785'
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, rsandifo at gcc dot gnu.org,
vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

The recent commit r13-1068-g1d205dbac1e1754c01c22a31bd1688126545401e "Factor
out common internal-fn idiom" causes a class of ICEs in OpenMP/nvptx offloading
compilation: 'during RTL pass: expand', 'internal compiler error: in
maybe_legitimize_operand, at optabs.cc:7785', seen for a lot of libgomp
OpenMP/nvptx offloading test cases (with '-O1' and higher).

0xb1b0b3 maybe_legitimize_operand
[...]/source-gcc/gcc/optabs.cc:7785
0xb1b0b3 maybe_legitimize_operands(insn_code, unsigned int, unsigned int,
expand_operand*)
[...]/source-gcc/gcc/optabs.cc:7936
0xb1b139 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
[...]/source-gcc/gcc/optabs.cc:7955
0xb1a8b8 maybe_expand_insn(insn_code, unsigned int, expand_operand*)
[...]/source-gcc/gcc/optabs.cc:7998
0xb1a8b8 expand_insn(insn_code, unsigned int, expand_operand*)
[...]/source-gcc/gcc/optabs.cc:8029
0x95dcb3 expand_fn_using_insn
[...]/source-gcc/gcc/internal-fn.cc:193
0x6d3ee7 expand_call_stmt
[...]/source-gcc/gcc/cfgexpand.cc:2737
0x6d3ee7 expand_gimple_stmt_1
[...]/source-gcc/gcc/cfgexpand.cc:3869

For extra entertainment: when running with '-wrapper "$GDB",-q,--args', we get
'[Inferior 1 (process [...]) exited normally]'...  (Maybe Valgrind could help? 
Unless someone directly pinpoints the issue, of course.)

I've not yet determined whether it's a latent problem just exposed by this
commit, or whether the commit itself has an issue.  It's not magically fixed by
the related subsequent commit
r13-1069-gf8baf4004ef965ce7a9edf6d2f5eb99adb15803a "Add a general mapping from
internal fns to target insns".

'gcc/internal-fn.cc':

193   expand_insn (icode, opno, ops);

'gcc/optabs.cc':

8026  expand_insn (enum insn_code icode, unsigned int nops,
8027               class expand_operand *ops)
8028  {
8029    if (!maybe_expand_insn (icode, nops, ops))

7995  maybe_expand_insn (enum insn_code icode, unsigned int nops,
7996                     class expand_operand *ops)
7997  {
7998    rtx_insn *pat = maybe_gen_insn (icode, nops, ops);

7951  maybe_gen_insn (enum insn_code icode, unsigned int nops,
7952                  class expand_operand *ops)
7953  {
7954    gcc_assert (nops == (unsigned int) insn_data[(int) icode].n_generator_args);
7955    if (!maybe_legitimize_operands (icode, 0, nops, ops))

7935    /* Otherwise try legitimizing the operand on its own.  */
7936    if (j == i && !maybe_legitimize_operand (icode, opno + i, &ops[i]))

7784  case EXPAND_OUTPUT:
7785    gcc_assert (mode != VOIDmode);

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread jkanapes at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #23 from John Kanapes  ---
Hi,

I have not been able to recreate the issue with simpler programs that use the
same resources. I will need to upload my sources. Is it OK to upload a tar.gz
archive with a test directory with the sources and a makefile? What do you do
with the sources after the ticket?

TIA

[Bug libstdc++/62187] std::string==const char* could compare sizes first

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62187

Jonathan Wakely  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #7 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #5)
> I've also created an LWG issue about this,

Rather than a new issue, this was added to https://wg21.link/lwg2852 

The resolution was to confirm that operator== doesn't need to call compare if
it can determine the result another way. That means we can do the length check
unconditionally.
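
A minimal sketch of such a length check, written as a hypothetical free
function equal_to_cstr rather than as the actual libstdc++ operator==; it only
illustrates the idea of comparing sizes before comparing characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: compare a std::string with a C string, checking the
// lengths first so that strings of different length never reach the
// character-by-character comparison.
bool equal_to_cstr (const std::string &lhs, const char *rhs)
{
  typedef std::char_traits<char> traits;
  const std::size_t len = traits::length (rhs);
  if (lhs.size () != len)
    return false;
  return traits::compare (lhs.data (), rhs, len) == 0;
}
```

Note that the length of the C string still has to be computed, so the saving
is in the comparison itself, not in the scan for the terminator.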

[Bug middle-end/101836] __builtin_object_size(P->M, 1) where M is an array and the last member of a struct fails

2022-06-14 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

--- Comment #25 from qinzhao at gcc dot gnu.org ---
So, based on all the discussion so far, how about the following:

** add the following gcc option:

-fstrict-flex-arrays=[0|1|2|3]

when -fstrict-flex-arrays=0:
treat all trailing arrays as flexible arrays; this is the default behavior.

when -fstrict-flex-arrays=1:
treat only [], [0], and [1] as flexible arrays;

when -fstrict-flex-arrays=2:
treat only [] and [0] as flexible arrays;

when -fstrict-flex-arrays=3:
treat only [] as a flexible array; this is the strictest level.

any comments?
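
To make the four proposed levels easier to compare, here is a small sketch that
encodes the list above as a lookup function. It restates the proposal only and
is not GCC code; the enum trailing_array_kind and the function
treated_as_flexible are names invented for this illustration.

```cpp
// Kinds of trailing array member, named here only for illustration.
enum trailing_array_kind
{
  UNSIZED,      // T a[];
  ZERO_SIZED,   // T a[0];
  ONE_ELEMENT,  // T a[1];
  LARGER        // T a[N] with N > 1
};

// Returns true if, under the proposed -fstrict-flex-arrays=LEVEL, a trailing
// array of the given kind would still be treated as a flexible array.
bool treated_as_flexible (int level, trailing_array_kind kind)
{
  switch (level)
    {
    case 0:
      return true;                                   // any trailing array
    case 1:
      return kind != LARGER;                         // [], [0] and [1]
    case 2:
      return kind == UNSIZED || kind == ZERO_SIZED;  // [] and [0]
    case 3:
      return kind == UNSIZED;                        // [] only
    default:
      return false;  // levels outside 0..3 are not part of the proposal
    }
}
```

Each higher level accepts a strict subset of what the level below it accepts.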

[Bug target/105960] [12/13 Regression] Crash in 32-bit mode

2022-06-14 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105960

H.J. Lu  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com

--- Comment #7 from H.J. Lu  ---
Created attachment 53135
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53135&action=edit
A patch

Try this.

[Bug c/105970] ICE in ix86_function_arg, at config/i386/i386.cc:3351

2022-06-14 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105970

--- Comment #2 from H.J. Lu  ---
(In reply to Uroš Bizjak from comment #1)
> Probably something like:
> 
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 3d189e124e4..f158cc3aaea 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -3348,7 +3348,7 @@ ix86_function_arg (cumulative_args_t cum_v, const
> function_arg_info &arg)
>if (POINTER_TYPE_P (arg.type))
> {
>   /* This is the pointer argument.  */
> - gcc_assert (TYPE_MODE (arg.type) == Pmode);
> + gcc_assert (TYPE_MODE (arg.type) == ptr_mode);

This looks reasonable since pointer mode should be ptr_mode.

>   /* It is at -WORD(AP) in the current frame in interrupt and
>  exception handlers.  */
>   reg = plus_constant (Pmode, arg_pointer_rtx, -UNITS_PER_WORD);
> 
> Pointer mode and Pmode can be distinct for x32 target.  However, I have no
> idea what goes into interrupt frame for x32. Let's ask HJ.

[Bug libstdc++/59048] operator== between std::string and const char* slower than strcmp

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59048

Jonathan Wakely  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread sam at gentoo dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

Sam James  changed:

   What|Removed |Added

 CC||sam at gentoo dot org

--- Comment #24 from Sam James  ---
Please be polite on these bugs. There's a lot of documentation online about how
to use UBsan.

It's not ideal to upload a tarball with all of the bits, but if it's what's
needed, then I guess so be it. Some build systems make it easier to enable
sanitizers like Meson.

GCC's bug tracker isn't for general support on how to use build systems and
flags. 

The bug tracker is public and I don't think one can delete their own
attachments.

Are you saying that when you use -fsanitize=undefined and run your program, it
gets SIGILL'd?

[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization

2022-06-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #25 from Jonathan Wakely  ---
(In reply to John Kanapes from comment #22)
> It took you 4 posts to explain me what to do.
> It took me 4 posts to understand what you were talking about.
> You should explain better.

You should read better. Comment 3 is perfectly clear.

"For UBSan, you can't just compile with -fsanitize=undefined, you need to link
with that flag as well."


(In reply to John Kanapes from comment #23)
> What do you do with the sources after the ticket?

They will stay attached here. If you don't want them to be public, you need to
reduce it to something smaller that still shows the bug (which you've said you
can't) or put them somewhere online and persuade somebody here to download them
and try to reproduce and reduce it for you.
