[Bug target/112919] LoongArch: Alignments in tune parameters are not precise and they regress performance

2024-03-01 Thread chenglulu at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112919

--- Comment #10 from chenglulu  ---
(In reply to Xi Ruoyao from comment #9)
> (In reply to chenglulu from comment #8)
> > (In reply to Xi Ruoyao from comment #7)
> > > Any update? :)
> > 
> > Well, I haven't run it yet. Since this does not have a big impact on the
> > spec score, I am currently testing it on a single-channel machine, so the
> > test time will be longer.
> > I will reply here as soon as the results are available.
> 
> Can we determine on LA664 if the current default alignment is better than
> not aligning at all?  Coremarks results suggest the current default is even
> worse than not aligning, but arguably Coremarks is far different from real
> workloads. However if the current default is not better than not aligning
> (or the difference is only marginal and is likely covered up by some random
> fluctuation) we can disable the aligning for LA664.
> 
> (Maybe we and the HW engineers have done some repetitive work or even some
> work cancelling each other out :(. )
On March 8th I should be able to get the test results on the 3A6000 machine. I
need to judge the fluctuation of the SPEC scores first, and then we can decide
whether a default alignment should be set.
In addition, I am also testing it again on the 3A5000, and those results will be
available around March 15th.
The conclusion on CoreMark from our team leader Xu Chenghua is that
'-falign-labels' has a regular effect on CoreMark performance: when the value
of '-falign-labels' is greater than 4 bytes, the performance decreases
significantly.

[Bug middle-end/111523] Unexpected performance regression with -ftrivial-auto-var-init=zero for e.g. systemctl unmask

2024-03-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111523

--- Comment #7 from Andrew Pinski  ---
One thing I noticed is that malloc_usable_size is used for greedy_realloc, but
that should not change based on whether -ftrivial-auto-var-init is used or not.

[Bug target/112919] LoongArch: Alignments in tune parameters are not precise and they regress performance

2024-03-01 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112919

--- Comment #11 from Xi Ruoyao  ---
(In reply to chenglulu from comment #10)
> (In reply to Xi Ruoyao from comment #9)
> > (In reply to chenglulu from comment #8)
> > > (In reply to Xi Ruoyao from comment #7)
> > > > Any update? :)
> > > 
> > > Well, I haven't run it yet. Since this does not have a big impact on the
> > > spec score, I am currently testing it on a single-channel machine, so the
> > > test time will be longer.
> > > I will reply here as soon as the results are available.
> > 
> > Can we determine on LA664 if the current default alignment is better than
> > not aligning at all?  Coremarks results suggest the current default is even
> > worse than not aligning, but arguably Coremarks is far different from real
> > workloads. However if the current default is not better than not aligning
> > (or the difference is only marginal and is likely covered up by some random
> > fluctuation) we can disable the aligning for LA664.
> > 
> > (Maybe we and the HW engineers have done some repetitive work or even some
> > work cancelling each other out :(. )
> On March 8th I should be able to get the test results on the 3A6000 machine,
> I need to judge the fluctuation of the spec and then let's see if the
> default alignment is set?

I just mean that if we cannot get a decisive result before GCC 14 we may just
turn off alignment.  But if we can get a decisive result in March as expected,
we can just use the best setting we find.

[Bug middle-end/111523] Unexpected performance regression with -ftrivial-auto-var-init=zero for e.g. systemctl unmask

2024-03-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111523

--- Comment #8 from Andrew Pinski  ---
I also looked into extracting greedy_realloc into its own file and I don't see
anything that would make -ftrivial-auto-var-init=zero cause any difference.

[Bug d/114171] [13/14 Regression] gdc -O2 -mavx generates misaligned vmovdqa instruction

2024-03-01 Thread a.horodniceanu at proton dot me via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114171

--- Comment #3 from Andrei Horodniceanu  ---
(In reply to Hongtao Liu from comment #2)
> on rtl level,we get
> 
> (insn 7 6 8 2 (set (reg:CCZ 17 flags)
> (compare:CCZ (mem:TI (plus:DI (reg/f:DI 100 [ _5 ])
> (const_int 24 [0x18])) [0 MEM[(ucent *)_5 + 24B]+0 S16
> A128])
> (const_int 0 [0]))) "test.d":15:16 30 {*cmpti_doubleword}
>  (nil))
> 
> It's 16-byte aligned.

I'm not sure I understand this notation, but I think I can pick out the mem
expression: a memory reference at the address held in the _5 register plus 24,
of 16-byte size and 128-bit alignment. This does match up, as the size of the
BreakStatement class is 40 (24 + 16). I don't see how it can say that that
memory is 16-byte aligned, as in D code the alignment of the class is 8, but it
may just be that, internally, gcc has a more accurate record.

Could you check the rtl for the same snippet of code but instead of `interface
ASTNode` do `abstract class ASTNode`? This would make BreakStatement be of size
32 instead of 40 and change the 24 offset to 16 which happens to produce a
16-byte aligned memory location. This is the only other interesting thing I hit
when trying to reduce the failing code.
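
The offset arithmetic in this comment can be sketched in C. This is only an
illustration: the helper names are made up, and the base addresses in the usage
note are hypothetical, since the report does not say how the object itself is
aligned at run time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of align.  align must be a power of two. */
bool is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: is the 16-byte field located `offset` bytes into an
   object starting at `base` suitably aligned for an aligned 16-byte access? */
bool field_is_16_aligned(uintptr_t base, uintptr_t offset)
{
    return is_aligned(base + offset, 16);
}
```

With a base that happens to be 16-byte aligned (say 0x1000), offset 16 gives an
aligned field and offset 24 does not. With a base that is only 8-byte aligned
(say 0x1008) it is the other way round, so an alignment of 8 on the class alone
cannot guarantee A128 for either offset.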

[Bug target/112919] LoongArch: Alignments in tune parameters are not precise and they regress performance

2024-03-01 Thread chenglulu at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112919

--- Comment #12 from chenglulu  ---
(In reply to Xi Ruoyao from comment #11)
> (In reply to chenglulu from comment #10)
> > (In reply to Xi Ruoyao from comment #9)
> > > (In reply to chenglulu from comment #8)
> > > > (In reply to Xi Ruoyao from comment #7)
> > > > > Any update? :)
> > > > 
> > > > Well, I haven't run it yet. Since this does not have a big impact on the
> > > > spec score, I am currently testing it on a single-channel machine, so 
> > > > the
> > > > test time will be longer.
> > > > I will reply here as soon as the results are available.
> > > 
> > > Can we determine on LA664 if the current default alignment is better than
> > > not aligning at all?  Coremarks results suggest the current default is 
> > > even
> > > worse than not aligning, but arguably Coremarks is far different from real
> > > workloads. However if the current default is not better than not aligning
> > > (or the difference is only marginal and is likely covered up by some 
> > > random
> > > fluctuation) we can disable the aligning for LA664.
> > > 
> > > (Maybe we and the HW engineers have done some repetitive work or even some
> > > work cancelling each other out :(. )
> > On March 8th I should be able to get the test results on the 3A6000 machine,
> > I need to judge the fluctuation of the spec and then let's see if the
> > default alignment is set?
> 
> I just mean if we cannot get a decisive result before GCC 14 we may just
> turn off alignment.  But if we can get a decisive result as expected in Mar
> we can just use the best we'll find.

Well, the results should be available before GCC 14 is released. It also seems
that the settings for the 3A5000 need to be changed, because the value of
'-falign-labels' was affected by the macro ASM_OUTPUT_ALIGN_WITH_NOP in the
previous test.

[Bug target/114100] [avr] Inefficient indirect addressing on Reduced Tiny

2024-03-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114100

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Georg-Johann Lay  ---
Improved in v14

[Bug fortran/114141] ASSOCIATE and complex part ref when associate target is a function

2024-03-01 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114141

--- Comment #12 from Paul Thomas  ---
(In reply to Steve Kargl from comment #11)
...snip...
> I know you had some ASSOCIATE patches in the works, and
> certainly do not want to interfere.  Do you want to
> incorporate my patch or some variation into your work?
> I'm hoping to take a stab at the issue Jerry raised 
> with parentheses this weekend.

Hi Steve,

Interference was not what I had in mind :-)

I was thinking of breaking the patch
https://gcc.gnu.org/pipermail/fortran/2024-January/060092.html in two; the
first to deal with derived type functions and the second for class functions.
Your patch for this PR would sit nicely in the first.

Cheers

Paul

[Bug rtl-optimization/114187] New: [14 regression] bizarre register dance on x86_64 for pass-by-value struct

2024-03-01 Thread matteo at mitalia dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114187

Bug ID: 114187
   Summary: [14 regression] bizarre register dance on x86_64 for
pass-by-value struct
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: matteo at mitalia dot net
  Target Milestone: ---

Sample code (+ godbolt link https://godbolt.org/z/zf6e16Wcq )

```
struct P2d {
double x, y;
};

double sumxy(double x, double y) {
return x + y;
}

double sumxy_p(P2d p) {
return p.x + p.y;
}

double sumxy_p_ref(const P2d& p) {
return p.x + p.y;
}
```

with g++ 13.2 -O3 generates a perfectly reasonable

```
sumxy(double, double):
addsd   xmm0, xmm1
ret
sumxy_p(P2d):
addsd   xmm0, xmm1
ret
sumxy_p_ref(P2d const&):
movsd   xmm0, QWORD PTR [rdi]
addsd   xmm0, QWORD PTR [rdi+8]
ret
```

instead with g++ 14 (g++
(Compiler-Explorer-Build-gcc-b05f474c8f7768dad50a99a2d676660ee4db09c6-binutils-2.40)
14.0.1 20240301 (experimental)) we get

```
sumxy(double, double):
addsd   xmm0, xmm1
ret
sumxy_p(P2d):
movq    rax, xmm1
movq    rdx, xmm0
xchg    rdx, rax
movq    xmm0, rax
movq    xmm2, rdx
addsd   xmm0, xmm2
ret
sumxy_p_ref(P2d const&):
movsd   xmm0, QWORD PTR [rdi]
addsd   xmm0, QWORD PTR [rdi+8]
ret
```

Notice the bizarre register dance for sumxy_p(P2d) (p.x goes through xmm0 →
rdx → rax → xmm0; p.y in turn xmm1 → rax → rdx → xmm2; then they finally get
summed); sumxy(double, double) which, register-wise, should be the same, is
unaffected.

This exact same code (both for gcc 13 and gcc 14) is generated at all
optimization levels I tested (-Og, -O1, -O2, -O3) except -O0 of course, so it
doesn't seem to depend on particular optimization passes enabled only at high
optimization levels. Also (as is reasonable) it doesn't seem to depend on the
C++ frontend, as compiling this with plain gcc (adding a typedef for the struct
and changing the reference to a pointer) yields the exact same results.
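
For reference, a C translation along the lines described above might look like
the following. It is a reconstruction from the description (typedef for the
struct, pointer instead of reference), not the exact file that was compiled.

```c
/* C version of the sample: the struct gets a typedef and the
   reference parameter becomes a pointer. */
typedef struct {
    double x, y;
} P2d;

double sumxy(double x, double y)
{
    return x + y;
}

/* Pass-by-value struct: the case that shows the register dance. */
double sumxy_p(P2d p)
{
    return p.x + p.y;
}

/* Pointer variant, standing in for the C++ const reference. */
double sumxy_p_ref(const P2d *p)
{
    return p->x + p->y;
}
```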

Most importantly, it seems to be something target-specific, as ARM64 builds
don't exhibit any particular problems, and produce pretty much the same
(reasonable) code on both 14.0 and 13.2

```
sumxy(double, double):
fadd    d0, d0, d1
ret
sumxy_p(P2d):
fadd    d0, d0, d1
ret
sumxy_p_ref(P2d const&):
ldp     d0, d31, [x0]
fadd    d0, d0, d31
ret
```

(gcc 13.2 generates slightly different code for sumxy_p_ref, but in a very
minor way)

Fiddling around, with -march=nocona (that leaves gcc 13.2 unaffected) I get a
more compact but still absurd dance:

```
sumxy_p(P2d):
movsd   QWORD PTR [rsp-8], xmm1
mov     rdx, QWORD PTR [rsp-8]
movq    xmm2, rdx
addsd   xmm0, xmm2
ret
```

Here p.x is left in xmm0 where it should be, but xmm1 goes through the stack (!),
a GP register (rdx) and finally to xmm2. It feels like in general it wants to
launder xmm1 through a 64 bit GP register before summing it, a bit like a light
version of -ffloat-store.

[Bug target/114187] [14 regression] bizarre register dance on x86_64 for pass-by-value struct

2024-03-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114187

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 CC||roger at nextmovesoftware dot com
  Component|rtl-optimization|target
 Target||X86_64

[Bug target/114187] [14 regression] bizarre register dance on x86_64 for pass-by-value struct

2024-03-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114187

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization

--- Comment #1 from Andrew Pinski  ---
Most likely related to the TImode changes...

[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b

2024-03-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151

--- Comment #6 from Richard Biener  ---
(In reply to Andrew Macleod from comment #5)
> (In reply to rguent...@suse.de from comment #4)
> 
> > 
> > What was definitely missing is consideration of POLY_INT_CSTs (and
> > variable polys, as I think there's no range info for those).
> > 
> Ranger doesn't do anything with POLY_INTs, mostly because I didn't
> understand them.  
> 
> > We do eventually want to improve how ranger behaves here.  I'm not sure
> > why when we do not provide a context 'stmt' it can't seem to compute
> > a range valid at the SSA names point of definition?  (so basically
> > compute the global range)
> 
> The call looks like it doesn't provide the stmt.  Without the stmt, all
> ranger will ever provide is global ranges.
> 
> I think you are asking why, If there is no global range, it doesn't try to
> compute one from the ssa_name_def_stmt?  Ranger does when it is active.  

I tried with an active ranger but that doesn't make a difference.  Basically
I added enable_ranger () / disable_ranger () around the pass and thought
that would "activate" it.  But looking at range_for_expr I don't see how
that would make a difference without a provided stmt.

But maybe I'm doing it wrong?

[Bug tree-optimization/114164] simdclone vectorization creates unsupported IL

2024-03-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114164

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2024-03-01

--- Comment #4 from Richard Biener  ---
Ah, yes - we're not trying to do anything special if the mask and arg mask
types have different sizes - we can handle a different number of lanes but we
don't try any widening/shortening tricks to make the sizes match.

That said, it looks like we definitely should verify in the vectorizer that
we can handle the VEC_COND_EXPR.

As for the missing trick, we could do v4si ? v4sf : v4sf and then widen to v4df
or widen v4si to v4di.  Possibly the target could know what's more efficient
here.

Mine for adding the missing support check.

[Bug fortran/114188] New: Overloading assignment does not invalidate intrinsic assignment

2024-03-01 Thread Bader at lrz dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114188

Bug ID: 114188
   Summary: Overloading assignment does not invalidate intrinsic
assignment
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Bader at lrz dot de
  Target Milestone: ---

Created attachment 57583
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57583&action=edit
test case for invalid use of assignment overloading

The attached reproducer overloads the assignment operator with a version that
requires the left hand side to be a pointer.

The overload conforms to the requirements for defining the assignment according
to 10.2.1.4 of the Fortran standard. Therefore, the intrinsic assignment should
become unavailable (last sentence of 10.2.1.1).

However, gfortran accepts invocations that use nonpointer arguments.

(NAG Fortran, Intel Fortran and NVidia Fortran issue appropriate error
messages).

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-03-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #32 from Richard Biener  ---
(In reply to Richard Sandiford from comment #31)
> (In reply to Tamar Christina from comment #29)
> > This works fine for normal gather and scatters but doesn't work for widening
> > gathers and narrowing scatters which only the pattern seems to handle.
> I'm supposedly on holiday, so didn't see the IRC discussion, but: as I
> remember it, there is no narrowing or widening for IFN gathers or scatters
> as such, even for patterns.  One vector's worth of offsets corresponds to
> one vector's worth of data.  But the widths of the data elements and the
> offset elements can be different.  Any sign or zero extension of a loaded
> vector, or any operation to double or halve the number of vectors, is done
> separately.

Yep.  The emulated gather/scatter and builtin paths do this widening/shortening
of the offset operand to what we expect on-the-fly.  This support is missing
from the IFN path which relies on patterns doing this.

Having widening/shortening explicitly represented is of course better but
using patterns for this has the unfortunate all-or-nothing effect (right now).

I do hope with SLP only, where it's easier to insert/remove "stmts", we can
delay "pattern recognition" in these cases eventually even up to
vectorizable_* which would "simply" insert a widening/shortening operation
into the SLP graph to make itself happy.

In the meantime I think making the IFN path work the same way as the
emulated/builtin paths would make sense.  It's already half-way there.

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-03-01 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #33 from Richard Sandiford  ---
Can you give me a chance to look at it a bit when I'm back?  This doesn't feel
like the way to go to me.

[Bug target/114184] [12/13/14 Regression] ICE: in extract_insn, at recog.cc:2812 (unrecognizable insn ) with _Complex long double and vector VCE

2024-03-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114184

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P2
Summary|ICE: in extract_insn, at|[12/13/14 Regression] ICE:
   |recog.cc:2812   |in extract_insn, at
   |(unrecognizable insn ) with |recog.cc:2812
   |-Og -mavx512f and   |(unrecognizable insn ) with
   |__builtin_memmove() |_Complex long double and
   |_BitInt(256)|vector VCE
   Target Milestone|--- |12.4

--- Comment #1 from Jakub Jelinek  ---
Seems unrelated to _BitInt.
E.g. following testcase ICEs with -O2 -mavx2 since
r14-4537-g70b5c6981fcdff246f90e57e91f3e1667eab2eb3
typedef unsigned char V __attribute__((vector_size (32)));
_Complex long double
foo (void)
{
  _Complex long double d;
  *(V *)&d = (V) { 149, 136, 89, 42, 38, 240, 196, 194 };
  return d;
}
and the same with -Og -mavx2 since
r12-7240-g2801f23fb82a5ef51c8b460a500786797943e1e9
I don't see bugs in either of those commits.

What I see happening is that expand_assignment, because the destination is
complex, does
  emit_move_insn (XEXP (to_rtx, 0),
                  read_complex_part (from_rtx, false));
  emit_move_insn (XEXP (to_rtx, 1),
                  read_complex_part (from_rtx, true));
where from_rtx at that point is
(subreg:XC (const_vector:V32QI [
(const_int -107 [0xff95])
(const_int -120 [0xff88])
(const_int 89 [0x59])
(const_int 42 [0x2a])
(const_int 38 [0x26])
(const_int -16 [0xfff0])
(const_int -60 [0xffc4])
(const_int -62 [0xffc2])
(const_int 0 [0]) repeated x24
]) 0)
which is supposedly a valid subreg, reinterpretation of a vector as complex
extended double, which is not foldable to constant because it isn't a valid
IEEE value which the compiler can express.  Or should this have been a concat?
Anyway, read_complex_part returns for that
(const_double:XF 0.0 [0x0.0p+0])
for the imag part and
(subreg:XF (const_vector:V32QI [
(const_int -107 [0xff95])
(const_int -120 [0xff88])
(const_int 89 [0x59])
(const_int 42 [0x2a])
(const_int 38 [0x26])
(const_int -16 [0xfff0])
(const_int -60 [0xffc4])
(const_int -62 [0xffc2])
(const_int 0 [0]) repeated x24
]) 0)
for the real part.  The latter makes it through validate_subreg due to the
  /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
 is the culprit here, and not the backends.  */
  else if (known_ge (osize, regsize) && known_ge (isize, osize))
;
- osize is 16, isize is 32 and regsize is 8.  If it weren't for that rule,
there would be the
  /* Subregs involving floating point modes are not allowed to
 change size unless it's an insert into a complex mode.
 Therefore (subreg:DI (reg:DF) 0) and (subreg:CS (reg:SF) 0) are fine, but
 (subreg:SI (reg:DF) 0) isn't.  */
rule which would reject it.

[Bug target/114184] [12/13/14 Regression] ICE: in extract_insn, at recog.cc:2812 (unrecognizable insn ) with _Complex long double and vector VCE

2024-03-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114184

--- Comment #2 from Jakub Jelinek  ---
BTW,
typedef unsigned char V __attribute__((vector_size (16)));
long double
foo (void)
{
  long double d;
  *(V *)&d = (V) { 149, 136, 89, 42, 38, 240, 196, 194 };
  return d;
}
ICEs at -O2 too since r14-4537-g70b5c6981fcdff246f90e57e91f3e1667eab2eb3
so I'm afraid even trying to optimize the
XFmode lowpart SUBREG of the CONST_VECTOR into same size SUBREG from half sized
CONST_VECTOR to XFmode wouldn't help, because that is exactly what we have in
this testcase,
(insn 5 2 6 2 (set (reg/v:XF 98 [ d ])
(subreg:XF (const_vector:V16QI [
(const_int -107 [0xff95])
(const_int -120 [0xff88])
(const_int 89 [0x59])
(const_int 42 [0x2a])
(const_int 38 [0x26])
(const_int -16 [0xfff0])
(const_int -60 [0xffc4])
(const_int -62 [0xffc2])
(const_int 0 [0]) repeated x8
]) 0)) "pr114184-4.c":6:12 -1
 (nil))

[Bug middle-end/114156] during GIMPLE pass: bitintlower ICE: in lower_stmt, at gimple-lower-bitint.cc:5335 with -O -m32 and _BitInt(128) __builtin_memmove()

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114156

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:d3d0fcb652748191714e4c0b2541e977a7fc7bd7

commit r14-9248-gd3d0fcb652748191714e4c0b2541e977a7fc7bd7
Author: Jakub Jelinek 
Date:   Fri Mar 1 11:04:51 2024 +0100

bitint: Handle VCE from large/huge _BitInt SSA_NAME from load [PR114156]

When adding checks in which case not to merge a VIEW_CONVERT_EXPR from
large/huge _BitInt to vector/complex etc., I missed the case of loads.
Those are handled differently later.
Anyway, I think the load case is something we can handle just fine,
so the following patch does that instead of preventing the merging
gimple_lower_bitint; we'd then copy from memory to memory and do the
vce only on the second one, it is just better to vce the first one.

2024-03-01  Jakub Jelinek  

PR middle-end/114156
* gimple-lower-bitint.cc (bitint_large_huge::lower_stmt): Allow
rhs1 of a VCE to have no underlying variable if it is a load and
handle that case.

* gcc.dg/bitint-96.c: New test.

[Bug tree-optimization/114164] simdclone vectorization creates unsupported IL

2024-03-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114164

--- Comment #5 from Richard Biener  ---
The following fixes this, it also shows that even with -mavx2 we don't support
this (as was expected after the analysis).  Note since we emit
mask ? {true,..} : {false,...} we only support in-branch clones when the
target has corresponding vcond_mask expanders.  For vcondeq we'd need to
emit a redundant mask != mask_false_cst compare

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index be0e1a9c69d..14a3ffb5f02 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4210,6 +4210,16 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 " supported for mismatched vector sizes.\n");
  return false;
}
+ if (!expand_vec_cond_expr_p (clone_arg_vectype,
+  arginfo[i].vectype, ERROR_MARK))
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION,
+vect_location,
+"cannot compute mask argument for"
+" in-branch vector clones.\n");
+ return false;
+   }
}
  else if (SCALAR_INT_MODE_P (bestn->simdclone->mask_mode))
{

[Bug middle-end/114156] during GIMPLE pass: bitintlower ICE: in lower_stmt, at gimple-lower-bitint.cc:5335 with -O -m32 and _BitInt(128) __builtin_memmove()

2024-03-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114156

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Jakub Jelinek  ---
Fixed.

[Bug target/114189] New: Target implements obsolete vcond{,u,eq} expanders

2024-03-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114189

Bug ID: 114189
   Summary: Target implements obsolete vcond{,u,eq} expanders
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

The listed targets implement any of vcond, vcondu or vcondeq expanders where
the modern way of exposing vector conditional instructions is using
mask generating vec_cmp{,u,eq} and vcond_mask expanders.

vcond, vcondu and vcondeq optabs will be retired in the future.

[Bug target/114189] Target implements obsolete vcond{,u,eq} expanders

2024-03-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114189

--- Comment #1 from Richard Biener  ---
The following shows them all:

grep 'vcond[

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-03-01 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #34 from rguenther at suse dot de  ---
On Fri, 1 Mar 2024, rsandifo at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
> 
> --- Comment #33 from Richard Sandiford  ---
> Can you give me a chance to look at it a bit when I'm back?  This doesn't feel
> like the way to go to me.

Sure.

[Bug libstdc++/114152] Wrong exception specifiers for LFTSv3 scope guard destructors

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114152

--- Comment #5 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:89e7b77df1aefb01aadfacae83170b24a4c4d274

commit r13-8371-g89e7b77df1aefb01aadfacae83170b24a4c4d274
Author: Jonathan Wakely 
Date:   Wed Feb 28 14:45:18 2024 +

libstdc++: Fix noexcept on dtors in <experimental/scope> [PR114152]

The PR points out that the destructors all have incorrect
noexcept-specifiers.

libstdc++-v3/ChangeLog:

PR libstdc++/114152
* include/experimental/scope (scope_exit, scope_fail): Make
destructor unconditionally noexcept.
(scope_success): Fix noexcept-specifier.
* testsuite/experimental/scopeguard/114152.cc: New test.

(cherry picked from commit 80c386cb20d38ebc55f30a79418fabfbed904b87)

[Bug target/113960] [11/12/13 Regression] std::map with std::vector as input overwrites itself with c++20, on s390x platform

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960

--- Comment #16 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:3a16060d605a087fb4cf0bc3b53ed93b5875cd62

commit r13-8377-g3a16060d605a087fb4cf0bc3b53ed93b5875cd62
Author: Jonathan Wakely 
Date:   Tue Feb 27 17:50:34 2024 +

libstdc++: Fix conditions for using memcmp in
std::lexicographical_compare_three_way [PR113960]

The change in r11-2981-g2f983fa69005b6 meant that
std::lexicographical_compare_three_way started to use memcmp for
unsigned integers on big endian targets, but for that to be valid we
need the two value types to have the same size and we need to use that
size to compute the length passed to memcmp.

I already defined a __is_memcmp_ordered_with trait that does the right
checks, std::lexicographical_compare_three_way just needs to use it.

libstdc++-v3/ChangeLog:

PR libstdc++/113960
* include/bits/stl_algobase.h (__is_byte_iter): Replace with ...
(__memcmp_ordered_with): New concept.
(lexicographical_compare_three_way): Use __memcmp_ordered_with
instead of __is_byte_iter. Use correct length for memcmp.
* testsuite/25_algorithms/lexicographical_compare_three_way/113960.cc:
New test.

(cherry picked from commit f5cdda8acb06c20335855ed353ab9a441c12128a)
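
To illustrate why the memcmp shortcut is tied to byte order, here is a small C
sketch (not the libstdc++ code; all function names are made up for this
example). It writes 32-bit values out in an explicit byte order and compares
the encodings with memcmp. It covers only the endianness aspect; the
equal-size requirement mentioned in the commit message is not modelled.

```c
#include <stdint.h>
#include <string.h>

/* Store v most-significant byte first (big-endian layout). */
void store_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Store v least-significant byte first (little-endian layout). */
void store_le32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)v;
    out[1] = (unsigned char)(v >> 8);
    out[2] = (unsigned char)(v >> 16);
    out[3] = (unsigned char)(v >> 24);
}

/* Sign (-1, 0 or 1) of memcmp over the big-endian encodings. */
int cmp_be32(uint32_t a, uint32_t b)
{
    unsigned char x[4], y[4];
    store_be32(x, a);
    store_be32(y, b);
    int r = memcmp(x, y, sizeof x);
    return (r > 0) - (r < 0);
}

/* Sign (-1, 0 or 1) of memcmp over the little-endian encodings. */
int cmp_le32(uint32_t a, uint32_t b)
{
    unsigned char x[4], y[4];
    store_le32(x, a);
    store_le32(y, b);
    int r = memcmp(x, y, sizeof x);
    return (r > 0) - (r < 0);
}
```

For the pair 1 and 256, the big-endian comparison orders them numerically,
while the little-endian comparison reports the opposite order, because memcmp
looks at the least significant byte first.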

[Bug libstdc++/114152] Wrong exception specifiers for LFTSv3 scope guard destructors

2024-03-01 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114152

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Jonathan Wakely  ---
Fixed for 13.3 - thanks for the report.

[Bug target/113960] [11/12 Regression] std::map with std::vector as input overwrites itself with c++20, on s390x platform

2024-03-01 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960

Jonathan Wakely  changed:

   What|Removed |Added

Summary|[11/12/13 Regression]   |[11/12 Regression] std::map
   |std::map with std::vector   |with std::vector as input
   |as input overwrites itself  |overwrites itself with
   |with c++20, on s390x|c++20, on s390x platform
   |platform|

--- Comment #17 from Jonathan Wakely  ---
Fixed for 13.3 and 14.1 so far ...

[Bug c/114181] issubnormal is a macro

2024-03-01 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114181

--- Comment #14 from Jonathan Wakely  ---
Libc headers define lots of macros, we don't #undef them in C++ headers unless
they use the name of a function defined by the C++ library (because C++ says
its library functions must be real functions and must not be hidden by macros).

If you want to provide an implementation of issubnormal (e.g. as
boost::issubnormal) then you need to do:

#include 
#undef issubnormal

I don't think it's a libstdc++ bug that we don't #undef non-standard macros.

It's certainly not a GCC component=c bug that Glibc defines a macro in its
.

So there's no GCC bug here.

Like Andrew said, iff we need to define std::issubnormal then we'll #undef it,
but we're not going to do that until we need to.

[Bug c/114181] issubnormal is a macro

2024-03-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114181

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #15 from Jakub Jelinek  ---
You could as well use namespace std { bool (issubnormal) (...); } if you don't
want macro expansion for it.
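
A minimal sketch of that idea, assuming a hypothetical function-like macro
demo_classify as a stand-in for a libc macro (none of these names come from
glibc or libstdc++). A name wrapped in parentheses is not directly followed by
'(' and so is not treated as a macro invocation:

```cpp
#include <cassert>

// Hypothetical stand-in for a function-like macro defined by a C library header.
#define demo_classify(x) (-1)

namespace demo
{
  // The parenthesized name is not macro-expanded, so this declares a real function.
  inline int (demo_classify) (double v) { return v < 0 ? 1 : 0; }
}

// Plain call syntax hits the macro; the parenthesized call reaches the function.
inline int via_macro () { return demo_classify (-2.0); }
inline int via_function () { return (demo::demo_classify) (-2.0); }

inline void demo_check ()
{
  assert (via_macro () == -1);
  assert (via_function () == 1);
}
```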

[Bug target/114184] [12/13/14 Regression] ICE: in extract_insn, at recog.cc:2812 (unrecognizable insn ) with _Complex long double and vector VCE

2024-03-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114184

Jakub Jelinek  changed:

   What|Removed |Added

 CC||uros at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
So, one way to fix this or work around that would be to
--- gcc/config/i386/i386-expand.cc.jj   2024-02-26 07:29:27.695974161 +0100
+++ gcc/config/i386/i386-expand.cc  2024-03-01 12:48:59.678574710 +0100
@@ -451,6 +451,12 @@ ix86_expand_move (machine_mode mode, rtx
  && GET_MODE (SUBREG_REG (op1)) == DImode
  && SUBREG_BYTE (op1) == 0)
op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
+  /* As not all values in XFmode are representable in real_value,
+we might be called with non-general_operand SUBREGs.  */
+  if (mode == XFmode
+ && !general_operand (op1, XFmode)
+ && can_create_pseudo_p ())
+   op1 = force_reg (XFmode, op1);
   break;
 }

Though, e.g. md.texi suggests against using force_reg in the mov optab (though,
perhaps it is meant just that it shouldn't be used during reload).
Of course, one could argue it is middle-end's fault for not honoring the mov
optab predicate, and that emit_move_insn_1's
  code = optab_handler (mov_optab, mode);
  if (code != CODE_FOR_nothing)
return emit_insn (GEN_FCN (code) (x, y));
should verify the predicate there.  Changing that feels very risky in stage4 or
on release branches, moves are really quite special.  Or check it in
emit_move_insn callers where there are risks the predicates aren't satisfied
(though, there are almost 500 callers of that just in middle-end code).

[Bug target/112868] GCC passes -many to the assembler for --enable-checking=release builds

2024-03-01 Thread jeevitha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112868

Jeevitha  changed:

   What|Removed |Added

 CC||jeevitha at gcc dot gnu.org

--- Comment #8 from Jeevitha  ---
Created attachment 57584
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57584&action=edit
Removed -many from the options passed by default to the assembler.

Sam James, can you do a practice distro build using this patch?

[Bug middle-end/114070] [12/13 regression] ICE when building git-2.43.2 with -mcpu=niagara4 -fno-vect-cost-model

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114070

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:f9c30ea737b806caac917d8f501305151a2cbd57

commit r14-9252-gf9c30ea737b806caac917d8f501305151a2cbd57
Author: Richard Biener 
Date:   Thu Feb 29 09:22:19 2024 +0100

middle-end/114070 - VEC_COND_EXPR folding

The following amends the PR114070 fix to optimistically allow
the folding when we cannot expand the current vec_cond using
vcond_mask and we're still before vector lowering.  This leaves
a small window between vectorization and lowering where we could
break vec_conds that can be expanded via vcond{,u,eq}, most
susceptible is the loop unrolling pass which applies VN and thus
possibly folding to the unrolled body of a vectorized loop.

This gets back the folding for targets that cannot do vectorization.
It doesn't get back the folding for x86 with AVX512 for example
since that can handle the original IL but not the folded since
it misses some vcond_mask expanders.

PR middle-end/114070
* match.pd ((c ? a : b) op d  -->  c ? (a op d) : (b op d)):
Allow the folding if before lowering and the current IL
isn't supported with vcond_mask.
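
For illustration only, the identity behind this fold can be checked lane by
lane in plain C++. The Vec and Mask types and both function names are
hypothetical; nothing here models GIMPLE, match.pd or the vcond_mask
expanders.

```cpp
#include <array>
#include <cstddef>

using Vec = std::array<int, 4>;
using Mask = std::array<bool, 4>;

// (c ? a : b) op d, with op chosen as '+'.
inline Vec select_then_add (const Mask &c, const Vec &a, const Vec &b, const Vec &d)
{
  Vec r{};
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = (c[i] ? a[i] : b[i]) + d[i];
  return r;
}

// c ? (a op d) : (b op d), the folded form.
inline Vec add_then_select (const Mask &c, const Vec &a, const Vec &b, const Vec &d)
{
  Vec r{};
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = c[i] ? (a[i] + d[i]) : (b[i] + d[i]);
  return r;
}
```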

[Bug tree-optimization/114190] New: wrong code with -O2 -fno-dce -fharden-compares -mvpclmulqdq --param=max-rtl-if-conversion-unpredictable-cost=136

2024-03-01 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114190

Bug ID: 114190
   Summary: wrong code with -O2 -fno-dce -fharden-compares
-mvpclmulqdq
--param=max-rtl-if-conversion-unpredictable-cost=136
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57585
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57585&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc -O2 -fno-dce -fharden-compares -mvpclmulqdq
--param=max-rtl-if-conversion-unpredictable-cost=136 testcase.c -Wno-psabi
$ ./a.out 
Aborted

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-9248-20240301110451-gd3d0fcb6527-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-9248-20240301110451-gd3d0fcb6527-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240301 (experimental) (GCC)

[Bug testsuite/109549] [14 Regression] Conditional move regressions after r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a

2024-03-01 Thread stefansf at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109549

--- Comment #11 from Stefan Schulze Frielinghaus  ---
I will have a look at those s390x failures and come up with a
TARGET_NOCE_CONVERSION_PROFITABLE_P implementation.

[Bug libstdc++/77776] C++17 std::hypot implementation is poor

2024-03-01 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77776

--- Comment #12 from Jonathan Wakely  ---
(In reply to g.peterhoff from comment #11)
> Would this be a good implementation for hypot3 in cmath?

Thanks for the suggestion!

> #define GCC_UNLIKELY(x) __builtin_expect(x, 0)
> #define GCC_LIKELY(x) __builtin_expect(x, 1)
> 
> namespace __detail
> {
>   template 
>   inline _GLIBCXX_CONSTEXPR typename 
> enable_if::value,
> bool>::type   __isinf3(const _Tp __x, const _Tp __y, const _Tp __z)   noexcept
>   {
>   return bool(int(std::isinf(__x)) | int(std::isinf(__y)) |
> int(std::isinf(__z)));

The casts are redundant and just make it harder to read IMHO:

  return std::isinf(__x) | std::isinf(__y) | std::isinf(__z);



>   }
> 
>   template 
>   inline _GLIBCXX_CONSTEXPR typename 
> enable_if::value,
> _Tp>::type__hypot3(_Tp __x, _Tp __y, _Tp __z) noexcept
>   {
>   __x = std::fabs(__x);
>   __y = std::fabs(__y);
>   __z = std::fabs(__z);
> 
>   const _Tp
>   __max = std::fmax(std::fmax(__x, __y), __z);
> 
>   if (GCC_UNLIKELY(__max == _Tp{}))
>   {
>   return __max;
>   }
>   else
>   {
>   __x /= __max;
>   __y /= __max;
>   __z /= __max;
>   return std::sqrt(__x*__x + __y*__y + __z*__z) * __max;
>   }
>   }
> } //  __detail
> 
> 
>   template 
>   inline _GLIBCXX_CONSTEXPR typename 
> enable_if::value,
> _Tp>::type__hypot3(const _Tp __x, const _Tp __y, const _Tp __z)   noexcept

This is a C++17 function, you can use enable_if_t>,
but see below.

>   {
>   return (GCC_UNLIKELY(__detail::__isinf3(__x, __y, __z))) ?
> numeric_limits<_Tp>::infinity() : __detail::__hypot3(__x, __y, __z);
>   }
> 
> #undef GCC_UNLIKELY
> #undef GCC_LIKELY
> 
> How does it work?
> * Basically, I first pull out the special case INFINITY (see
> https://en.cppreference.com/w/cpp/numeric/math/hypot).
> * As an additional safety measure (to prevent misuse) the functions are
> defined by enable_if.

I don't think we want to slow down compilation like that. If users decide to
misuse std::__detail::__isinf3 then they get what they deserve.


> constexpr
> * The hypot3 functions can thus be defined as _GLIBCXX_CONSTEXPR.

Just use 'constexpr' because this function isn't compiled as C++98. Then you
don't need the 'inline'. Although the standard doesn't allow std::hypot3 to be
constexpr.

> Questions
> * To get a better runtime behavior I define GCC_(UN)LIKELY. Are there
> already such macros (which I have overlooked)?

No, but you can do:


  if (__isinf3(__x, __y, __z)) [[__unlikely__]]
...


> * The functions are noexcept. Does that make sense? If yes: why are the math
> functions not noexcept?

I think it's just because nobody bothered to add them, and I doubt much code
ever needs to check whether they are noexcept. The compiler should already know
the standard libm functions don't throw. For this function (which isn't in libm
and the compiler doesn't know about) it seems worth adding 'noexcept'.


Does splitting it into three functions matter? It seems simpler as a single
function:

  template
constexpr _Tp
__hypot3(_Tp __x, _Tp __y, _Tp __z) noexcept
{
  if (std::isinf(__x) | std::isinf(__y) | std::isinf(__z)) [[__unlikely__]]
return numeric_limits<_Tp>::infinity();
  __x = std::fabs(__x);
  __y = std::fabs(__y);
  __z = std::fabs(__z);
  const _Tp __max = std::fmax(std::fmax(__x, __y), __z);
  if (__max == _Tp{}) [[__unlikely__]]
return __max;
  else
{
  __x /= __max;
  __y /= __max;
  __z /= __max;
  return std::sqrt(__x*__x + __y*__y + __z*__z) * __max;
}
}

This would add a dependency on  to , which isn't currently
there. Maybe we could just return (_Tp)__builtin_huge_vall().
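
A standalone sketch of the single-function version, for experimenting outside
the library. The name demo_hypot3 is hypothetical, the [[unlikely]] hints are
dropped so it also builds as C++17, and it uses std::numeric_limits directly,
so it does not address the header dependency question above.

```cpp
#include <cmath>
#include <limits>

// Hypothetical free function, not the libstdc++ implementation.
template<typename T>
T demo_hypot3 (T x, T y, T z) noexcept
{
  // An infinite argument wins even if another argument is NaN.
  if (std::isinf (x) | std::isinf (y) | std::isinf (z))
    return std::numeric_limits<T>::infinity ();
  x = std::fabs (x);
  y = std::fabs (y);
  z = std::fabs (z);
  const T max = std::fmax (std::fmax (x, y), z);
  if (max == T{})
    return max;
  // Scale by the largest magnitude so the squares cannot overflow.
  x /= max;
  y /= max;
  z /= max;
  return std::sqrt (x*x + y*y + z*z) * max;
}
```

With double arguments (3, 4, 12) this yields a value within rounding error of
13, and three arguments of 1e200 give a finite result where the naive
sqrt(x*x + y*y + z*z) would overflow.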

[Bug c++/111224] modules: xtreme-header-1_a.H etc. ICE (in core_vals, at cp/module.cc:6108) on AArch64 with SVE types

2024-03-01 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111224

Nathaniel Shead  changed:

   What|Removed |Added

 CC||nshead at gcc dot gnu.org

--- Comment #7 from Nathaniel Shead  ---
Created attachment 57586
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57586&action=edit
Untested patch to implement POLY_INT_CST in modules

Here's a potential fix for this issue. But I only have access to an x86_64
machine currently, so this is completely untested.

[Bug target/114187] [14 regression] bizarre register dance on x86_64 for pass-by-value struct since r14-2526

2024-03-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114187

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[14 regression] bizarre |[14 regression] bizarre
   |register dance on x86_64|register dance on x86_64
   |for pass-by-value struct|for pass-by-value struct
   ||since r14-2526
 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Indeed, r14-2526-g8911879415d6c2a7baad88235554a912887a1c5c

[Bug d/114171] [13/14 Regression] gdc -O2 -mavx generates misaligned vmovdqa instruction

2024-03-01 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114171

--- Comment #4 from Iain Buclaw  ---
Removed druntime dependency.
---
import gcc.builtins;
struct Token {
string label;
}
struct BreakStatement {
ulong pad;
Token label;
}

pragma(inline, false)
auto newclass()
{
void *p = __builtin_malloc(BreakStatement.sizeof);
__builtin_memset(p, 0, BreakStatement.sizeof);
return cast(BreakStatement*) p;
}

int main ()
{
auto bn = newclass();
return bn.label is Token.init;
}
---



Roughly the equivalent C++
---
struct Token {
struct {
__SIZE_TYPE__ length;
const char *ptr;
} label;
};
struct BreakStatement {
__UINT64_TYPE__ pad;
Token label;
};

__attribute__((noinline))
BreakStatement *newclass()
{
void *p = __builtin_malloc(sizeof(BreakStatement));
__builtin_memset(p, 0, sizeof(BreakStatement));
return (BreakStatement*) p;
}

int main ()
{
auto bn = newclass();
auto init = Token();
return *(__uint128_t*)&bn->label == *(__uint128_t*)&init;
}
---

[Bug d/114171] [13/14 Regression] gdc -O2 -mavx generates misaligned vmovdqa instruction

2024-03-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114171

--- Comment #5 from Richard Biener  ---
(In reply to Iain Buclaw from comment #4)
> Removed druntime dependency.
> ---
> import gcc.builtins;
> struct Token {
> string label;
> }
> struct BreakStatement {
> ulong pad;
> Token label;
> }
> 
> pragma(inline, false)
> auto newclass()
> {
> void *p = __builtin_malloc(BreakStatement.sizeof);
> __builtin_memset(p, 0, BreakStatement.sizeof);
> return cast(BreakStatement*) p;
> }
> 
> int main ()
> {
> auto bn = newclass();
> return bn.label is Token.init;
> }
> ---
> 
> 
> 
> Roughly the equivalent C++
> ---
> struct Token {
> struct {
> __SIZE_TYPE__ length;
> const char *ptr;
> } label;
> };
> struct BreakStatement {
> __UINT64_TYPE__ pad;
> Token label;
> };
> 
> __attribute__((noinline))
> BreakStatement *newclass()
> {
> void *p = __builtin_malloc(sizeof(BreakStatement));
> __builtin_memset(p, 0, sizeof(BreakStatement));
> return (BreakStatement*) p;
> }
> 
> int main ()
> {
> auto bn = newclass();
> auto init = Token();
> return *(__uint128_t*)&bn->label == *(__uint128_t*)&init;
> }
> ---

Unless gdc somehow guarantees bn->label and init are 128bit aligned
then "casting" this way is broken.  You can of course use
build_aligned_type to build a properly (mis-)aligned type to use
to dereference to.
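
As a user-level illustration of the same point (not what gdc's lowering does),
the C++ reduction can avoid the alignment assumption by comparing bytes
instead of dereferencing a __uint128_t pointer. The helper names below are
hypothetical, calloc stands in for the malloc plus memset pair, and the sketch
assumes Token has no padding, which holds on LP64 targets.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

struct Token {
    struct {
        std::size_t length;
        const char *ptr;
    } label;
};
struct BreakStatement {
    std::uint64_t pad;
    Token label;   // at offset 8: 8-byte aligned, not necessarily 16-byte aligned
};

// memcmp makes no alignment assumption about either operand.
inline bool label_is_init (const BreakStatement *bn)
{
    Token init{};
    return std::memcmp (&bn->label, &init, sizeof (Token)) == 0;
}

// Hypothetical helper: allocate a zeroed object, set one field, compare, free.
inline bool demo_roundtrip (std::size_t length)
{
    auto *bn = static_cast<BreakStatement *> (std::calloc (1, sizeof (BreakStatement)));
    if (!bn)
        return false;
    bn->label.label.length = length;
    const bool same = label_is_init (bn);
    std::free (bn);
    return same;
}
```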

[Bug tree-optimization/110221] With AVX512 fully masking gfortran.dg/pr68146.f ICEs

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

--- Comment #7 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Andre Simoes Dias Vieira
:

https://gcc.gnu.org/g:9d033155254ac6df5f47ab32896dbf336f991589

commit r12-10186-g9d033155254ac6df5f47ab32896dbf336f991589
Author: Richard Biener 
Date:   Fri Nov 10 12:39:11 2023 +0100

tree-optimization/110221 - SLP and loop mask/len

The following fixes the issue that when SLP stmts are internal defs
but appear invariant because they end up only using invariant defs
then they get scheduled outside of the loop.  This nice optimization
breaks down when loop masks or lens are applied since those are not
explicitly tracked as dependences.  The following makes sure to never
schedule internal defs outside of the vectorized loop when the
loop uses masks/lens.

PR tree-optimization/110221
* tree-vect-slp.cc (vect_schedule_slp_node): When loop
masking / len is applied make sure to not schedule
internal defs outside of the loop.

* gfortran.dg/pr110221.f: New testcase.

(cherry picked from commit 7c67939ec384425a3d7383dfb4fb39aa7e9ad20a)

[Bug tree-optimization/111478] [12 Regression] aarch64 SVE ICE: in compute_live_loop_exits, at tree-ssa-loop-manip.cc:250

2024-03-01 Thread avieira at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111478

avieira at gcc dot gnu.org changed:

   What|Removed |Added

 CC||avieira at gcc dot gnu.org

--- Comment #10 from avieira at gcc dot gnu.org ---
This has now been backported to gcc-13 and gcc-12, so I think we should close,
will leave that to Richard.

[Bug tree-optimization/111478] [12 Regression] aarch64 SVE ICE: in compute_live_loop_exits, at tree-ssa-loop-manip.cc:250

2024-03-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111478

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Richard Biener  ---
Fixed.

[Bug debug/114015] ICE: in build_abbrev_table, at dwarf2out.cc:9266 with -g -fvar-tracking-assignments -fdebug-types-section

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114015

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:5b1fb8f8b4fe60745dece9b2f83c155c772ca66d

commit r14-9254-g5b1fb8f8b4fe60745dece9b2f83c155c772ca66d
Author: Jakub Jelinek 
Date:   Fri Mar 1 14:57:15 2024 +0100

dwarf2out: Don't move variable sized aggregates to comdat [PR114015]

The following testcase ICEs, because we decide to move that
struct { char a[n]; } DW_TAG_structure_type into .debug_types section
/ DW_UT_type DWARF5 unit, but refer from there to a DW_TAG_variable
(created artificially for the array bounds).
Even with non-bitint, I think it is just wrong to use .debug_types
section / DW_UT_type for something that uses DW_OP_fbreg and similar
in it, things clearly dependent on a particular function.
In most cases, is_nested_in_subprogram (die) check results in such
aggregates not being moved, but in the function parameter type case
that is not the case.

The following patch fixes it by returning false from
should_move_die_to_comdat
for non-constant sized aggregate types, i.e. when either we gave up on
adding DW_AT_byte_size for it because it wasn't expressible, or when
it is something non-constant (location description, reference, ...).

2024-03-01  Jakub Jelinek  

PR debug/114015
* dwarf2out.cc (should_move_die_to_comdat): Return false for
aggregates without DW_AT_byte_size attribute or with non-constant
DW_AT_byte_size.

* gcc.dg/debug/dwarf2/pr114015.c: New test.

[Bug debug/114015] ICE: in build_abbrev_table, at dwarf2out.cc:9266 with -g -fvar-tracking-assignments -fdebug-types-section

2024-03-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114015

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Jakub Jelinek  ---
Fixed.

[Bug d/114171] [13/14 Regression] gdc -O2 -mavx generates misaligned vmovdqa instruction

2024-03-01 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114171

--- Comment #6 from Iain Buclaw  ---
(In reply to Richard Biener from comment #5)
> Unless gdc somehow guarantees bn->label and init are 128bit aligned
> then "casting" this way is broken.  You can of course use
> build_aligned_type to build a properly (mis-)aligned type to use
> to dereference to.
Right, it looks like the lowering for struct comparisons wasn't taking the
original alignment into account when doing identity comparisons of struct-like
fields.

---
--- a/gcc/d/d-codegen.cc
+++ b/gcc/d/d-codegen.cc
@@ -1006,6 +1006,7 @@ lower_struct_comparison (tree_code code, StructDeclaration *sd,
  if (tmode == NULL_TREE)
tmode = make_unsigned_type (GET_MODE_BITSIZE (mode.require ()));

+ tmode = build_aligned_type (tmode, TYPE_ALIGN (stype));
  t1ref = build_vconvert (tmode, t1ref);
  t2ref = build_vconvert (tmode, t2ref);

---


The above change gets reflected in the generated assembly.
---
@@ -326,7 +326,7 @@ _Dmain:
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    _D8pr1141718newclassFNbNiZPSQBa14BreakStatement
-       vmovdqa 8(%rax), %xmm0
+       vmovdqu 8(%rax), %xmm0
        xorl    %eax, %eax
        vptest  %xmm0, %xmm0
        sete    %al
---

[Bug c++/113976] [11/12/13 Regression] explicit instantiation of const variable template following implicit instantiation is assembled in .rodata instead of .bss since r8-2857-g2ec399d8a6c9c2

2024-03-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113976

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[11/12/13/14 Regression]|[11/12/13 Regression]
   |explicit instantiation of   |explicit instantiation of
   |const variable template |const variable template
   |following implicit  |following implicit
   |instantiation is assembled  |instantiation is assembled
   |in .rodata instead of .bss  |in .rodata instead of .bss
   |since   |since
   |r8-2857-g2ec399d8a6c9c2 |r8-2857-g2ec399d8a6c9c2

--- Comment #12 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug middle-end/114136] wrong code for c23 fully anonymous arg lists on arm

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114136

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:b5377928a2a5cd2a79eda59e2eba7d0511bf7566

commit r14-9255-gb5377928a2a5cd2a79eda59e2eba7d0511bf7566
Author: Jakub Jelinek 
Date:   Fri Mar 1 15:42:52 2024 +0100

calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR114136]

On Tue, Feb 27, 2024 at 04:41:32PM +, Richard Earnshaw wrote:
> On Arm the PR107453 change is causing all anonymous arguments to be passed on
> the stack, which is incorrect per the ABI.  On a target that uses
> 'pretend_outgoing_vararg_named', why is it correct to set n_named_args to
> zero?  Is it enough to guard both the statements you've added with
> !targetm.calls.pretend_outgoing_args_named?

The TYPE_NO_NAMED_ARGS_STDARG_P functions (C23 fns like void foo (...) {})
have NULL type_arg_types, so the list_length (type_arg_types) isn't done for
it, but it should be handled as if it was non-NULL but list length was 0.

So, for the
  if (type_arg_types != 0)
n_named_args
  = (list_length (type_arg_types)
 /* Count the struct value address, if it is passed as a parm.  */
 + structure_value_addr_parm);
  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
n_named_args = 0;
  else
/* If we know nothing, treat all args as named.  */
n_named_args = num_actuals;
case, I think guarding it by any target hooks is wrong, although
I guess it should have been
n_named_args = structure_value_addr_parm;
instead of
n_named_args = 0;

For the second
  if (type_arg_types != 0
  && targetm.calls.strict_argument_naming (args_so_far))
;
  else if (type_arg_types != 0
   && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
/* Don't include the last named arg.  */
--n_named_args;
  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
n_named_args = 0;
  else
/* Treat all args as named.  */
n_named_args = num_actuals;
I think we should treat those as if type_arg_types was non-NULL
with 0 elements in the list, except the --n_named_args would for
!structure_value_addr_parm lead to n_named_args = -1, I think we want
0 for that case.

2024-03-01  Jakub Jelinek  

PR middle-end/114136
* calls.cc (expand_call): For TYPE_NO_NAMED_ARGS_STDARG_P set
n_named_args initially before INIT_CUMULATIVE_ARGS to
structure_value_addr_parm rather than 0, after it don't modify
it if strict_argument_naming and clear only if
!pretend_outgoing_varargs_named.
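
A toy model of the selection logic as the ChangeLog entry above describes it,
for readers who want to try the cases by hand. It is a reading of the
description, not a copy of calls.cc: CallInfo and model_n_named_args are
hypothetical names, and the target hooks are reduced to plain booleans.

```cpp
// Hypothetical stand-ins for the state that expand_call consults.
struct CallInfo
{
  bool has_arg_types;     // type_arg_types != 0
  int arg_list_length;    // list_length (type_arg_types)
  bool no_named_stdarg;   // TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
  int struct_value_parm;  // structure_value_addr_parm, 0 or 1
  int num_actuals;
  bool strict_naming;     // strict_argument_naming hook result
  bool pretend_named;     // pretend_outgoing_varargs_named hook result
};

inline int model_n_named_args (const CallInfo &ci)
{
  int n;
  // Initial value, before INIT_CUMULATIVE_ARGS.
  if (ci.has_arg_types)
    n = ci.arg_list_length + ci.struct_value_parm;
  else if (ci.no_named_stdarg)
    n = ci.struct_value_parm;   // was 0 before this change
  else
    n = ci.num_actuals;         // know nothing: treat all args as named

  // Adjustment afterwards.
  if ((ci.has_arg_types || ci.no_named_stdarg) && ci.strict_naming)
    ;                           // keep n unmodified
  else if (ci.has_arg_types && !ci.pretend_named)
    --n;                        // don't include the last named arg
  else if (ci.no_named_stdarg && !ci.pretend_named)
    n = 0;
  else
    n = ci.num_actuals;         // treat all args as named
  return n;
}
```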

[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since r14-9193

2024-03-01 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151

--- Comment #7 from Andrew Macleod  ---
(In reply to Richard Biener from comment #6)
> (In reply to Andrew Macleod from comment #5)
> > (In reply to rguent...@suse.de from comment #4)
> > 
> > > 
> > > What was definitely missing is consideration of POLY_INT_CSTs (and
> > > variable polys, as I think there's no range info for those).
> > > 
> > Ranger doesn't do anything with POLY_INTs, mostly because I didn't
> > understand them.  
> > 
> > > We do eventually want to improve how ranger behaves here.  I'm not sure
> > > why when we do not provide a context 'stmt' it can't see to compute
> > > a range valid at the SSA names point of definition?  (so basically
> > > compute the global range)
> > 
> > The call looks like it doesn't provide the stmt.  Without the stmt, all
> > ranger will ever provide is global ranges.
> > 
> > I think you are asking why, If there is no global range, it doesn't try to
> > compute one from the ssa_name_def_stmt?  Ranger does when it is active.  
> 
> I tried with an active ranger but that doesn't make a difference.  Basically
> I added enable_ranger () / disable_ranger () around the pass and thought
> that would "activate" it.  But looking at range_for_expr I don't see how
> that would make a difference without a provided stmt.
> 

It wouldn't. Why isn't a context stmt being provided?

range_of_expr with no context stmt makes no attempt to calculate anything. This
is because one can get into a lot of trouble as it doesn't know whether the
expression you are calculating is even in the IL or just some detached tree
expression.

If you have an SSA NAME and want to actually calculate the value, you can use
range_of_stmt (range, SSA_NAME_DEF_STMT (name))  instead of range_of_expr ().

If you pass in a stmt as context, and the SSA_NAME you are asking about is the
LHS of the stmt, then range_of_expr will call range_of_stmt itself... but
again, it needs a context stmt in order to know it's safe to do so.

In general, range_of_expr with no context will not calculate anything... When a
stmt for location context is provided, then it's free to go and do whatever
calculations are required.

[Bug libstdc++/114147] [11/12/13/14 Regression] tuple allocator-extended constructor requires non-explicit default constructor

2024-03-01 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114147

--- Comment #7 from Jonathan Wakely  ---
Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-March/646989.html

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-03-01 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Peter Bergner  changed:

   What|Removed |Added

 CC||aagarwa at gcc dot gnu.org

--- Comment #32 from Peter Bergner  ---
(In reply to Peter Bergner from comment #31)
> Ok, I think that gives us some idea what needs to be done.  I'll look for
> someone in the team to have a look at implementing this workaround.  Thanks.

Ajit has agreed to try and implement the workaround.

[Bug other/114191] New: Flags "Warning" and "Target" don't mix well in target.opt files

2024-03-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114191

Bug ID: 114191
   Summary: Flags "Warning" and "Target" don't mix well in
target.opt files
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

In an .opt file, a backend can define target-specific diagnostic options, for
example gcc/config/avr/avr.opt has:

Wmisspelled-isr
Warning C C++ Var(avr_warn_misspelled_isr) Init(1)
Warn if the ISR is misspelled, ...

This is a "Target" option, however, so it should be listed with --help=target,
which it currently is not. Yet specifying the "Target" flag in avr.opt makes
the option unrecognized:

$ avr-gcc main.c -c -Wall -Wmisspelled-isr
cc1: error: unrecognized command-line option '-Wmisspelled-isr'

I can reproduce this for target avr, but it likely affects all other targets as
well.

I set the component to "other", as there appears to be no Bugzilla component
for such internal problems.

[Bug target/114187] [14 regression] bizarre register dance on x86_64 for pass-by-value struct since r14-2526

2024-03-01 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114187

Roger Sayle  changed:

   What|Removed |Added

   Last reconfirmed||2024-03-01
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #3 from Roger Sayle  ---
There's a missing simplification in combine:

Trying 6 -> 11:
6: r102:TI=zero_extend(r109:DF#0)<<0x40|zero_extend(r108:DF#0)
  REG_DEAD r108:DF
  REG_DEAD r109:DF
   11: r105:DF=r102:TI#0+r102:TI#8
  REG_DEAD r102:TI
Failed to match this instruction:
(set (reg:DF 105 [ _4 ])
(plus:DF (subreg:DF (ior:TI (ashift:TI (zero_extend:TI (subreg:DI (reg:DF
109) 0))
(const_int 64 [0x40]))
(zero_extend:TI (subreg:DI (reg:DF 108) 0))) 8)
(reg:DF 108)))

where the lowpart is getting simplified to reg:DF 108, but the highpart isn't
getting simplified to reg:DF 109.  i.e.

(subreg:DF (ior:TI (ashift:TI (zero_extend:TI (subreg:DI (reg:DF 109) 0))
  (const_int 64 [0x40]))
  (zero_extend:TI (subreg:DI (reg:DF 108) 0))) 8)
can be simplified to just (reg:DF 109).

I'm looking into why this isn't happening.
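
For reference, the arithmetic identity that the missing simplification relies
on can be sketched with GCC's unsigned __int128 extension. The helper names
are hypothetical, and the sketch assumes a little-endian 64-bit target such as
x86_64, where subreg byte 0 is the low half and byte 8 the high half.

```cpp
#include <cstdint>

// (ior:TI (ashift:TI (zero_extend:TI hi) 64) (zero_extend:TI lo))
inline unsigned __int128 concat_ti (std::uint64_t hi, std::uint64_t lo)
{
  return ((unsigned __int128) hi << 64) | (unsigned __int128) lo;
}

// Analogue of taking the subreg at byte 0.
inline std::uint64_t lowpart (unsigned __int128 v)
{
  return (std::uint64_t) v;
}

// Analogue of taking the subreg at byte 8.
inline std::uint64_t highpart (unsigned __int128 v)
{
  return (std::uint64_t) (v >> 64);
}
```

Both lowpart (concat_ti (hi, lo)) == lo and highpart (concat_ti (hi, lo)) == hi
hold for all inputs, which is why the second operand of the plus above could
collapse to a plain register.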

[Bug c++/110347] [OpenMP] private/firstprivate of a C++ member variable mishandled

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110347

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Tobias Burnus :

https://gcc.gnu.org/g:4f82d5a95a244d0aa4f8b2541b47a21bce8a191b

commit r14-9257-g4f82d5a95a244d0aa4f8b2541b47a21bce8a191b
Author: Jakub Jelinek 
Date:   Fri Mar 1 17:26:42 2024 +0100

OpenMP/C++: Fix (first)private clause with member variables [PR110347]

OpenMP permits '(first)private' for C++ member variables, which GCC handles
by tagging those by DECL_OMP_PRIVATIZED_MEMBER, adding a temporary VAR_DECL
and DECL_VALUE_EXPR pointing to the 'this->member_var' in the C++ front
end.

The idea is that in omp-low.cc, the DECL_VALUE_EXPR is used before the
region (for 'firstprivate'; ignored for 'private') while in the region,
the DECL itself is used.

In gimplify, the value expansion is suppressed and deferred if the
  lang_hooks.decls.omp_disregard_value_expr (decl, shared)
returns true - which is never the case if 'shared' is true. In OpenMP 4.5,
only 'map' and 'use_device_ptr' were permitted for the 'target' directive.
And when OpenMP 5.0's 'private'/'firstprivate' clauses were added, the
update that the 'shared' argument could now be false was missed. The
respective check has now been added.

2024-03-01  Jakub Jelinek  
Tobias Burnus  

PR c++/110347

gcc/ChangeLog:

* gimplify.cc (omp_notice_variable): Fix 'shared' arg to
lang_hooks.decls.omp_disregard_value_expr for
(first)private in target regions.

libgomp/ChangeLog:

* testsuite/libgomp.c++/target-lambda-3.C: Moved from
gcc/testsuite/g++.dg/gomp/ and fixed is-mapped handling.
* testsuite/libgomp.c++/target-lambda-1.C: Modify to also
work without offloading.
* testsuite/libgomp.c++/firstprivate-1.C: New test.
* testsuite/libgomp.c++/firstprivate-2.C: New test.
* testsuite/libgomp.c++/private-1.C: New test.
* testsuite/libgomp.c++/private-2.C: New test.
* testsuite/libgomp.c++/target-lambda-4.C: New test.
* testsuite/libgomp.c++/use_device_ptr-1.C: New test.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/target-lambda-1.C: Moved to become a
run-time test under testsuite/libgomp.c++.

Co-authored-by: Tobias Burnus 

[Bug c++/92687] decltype of a structured binding to a tuple component is a reference type inside a template function

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92687

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:867cbadb912ab75d0eaf919a3f992595e508482b

commit r14-9258-g867cbadb912ab75d0eaf919a3f992595e508482b
Author: Jakub Jelinek 
Date:   Fri Mar 1 16:59:08 2024 +0100

c++: Fix up decltype of non-dependent structured binding decl in template
[PR92687]

finish_decltype_type uses DECL_HAS_VALUE_EXPR_P (expr) check for
DECL_DECOMPOSITION_P (expr) to determine if it is
array/struct/vector/complex etc. subobject proxy case vs. structured
binding using std::tuple_{size,element}.
For non-templates or when templates are already instantiated, that works
correctly, finalized DECL_DECOMPOSITION_P non-base vars indeed have
DECL_VALUE_EXPR in the former case and don't have it in the latter.
It works fine for dependent structured bindings as well, cp_finish_decomp in
that case creates DECLTYPE_TYPE tree and defers the handling until
instantiation.
As the testcase shows, this doesn't work for the non-dependent structured
binding case in templates, because DECL_HAS_VALUE_EXPR_P is set in that case
always; cp_finish_decomp ends with:
  if (processing_template_decl)
    {
      for (unsigned int i = 0; i < count; i++)
        if (!DECL_HAS_VALUE_EXPR_P (v[i]))
          {
            tree a = build_nt (ARRAY_REF, decl, size_int (i),
                               NULL_TREE, NULL_TREE);
            SET_DECL_VALUE_EXPR (v[i], a);
            DECL_HAS_VALUE_EXPR_P (v[i]) = 1;
          }
    }
and those artificial ARRAY_REFs are used in various places during
instantiation to find out what base the DECL_DECOMPOSITION_P VAR_DECLs
have and their positions.

The following patch fixes that by changing lookup_decomp_type, such that
it doesn't ICE when called on a DECL_DECOMPOSITION_P var which isn't in a
hash table, but returns NULL_TREE in that case, and for
processing_template_decl asserts DECL_HAS_VALUE_EXPR_P is non-NULL and just
calls lookup_decomp_type.
If it returns non-NULL, it is a structured binding using tuple and its result
is returned, otherwise it falls through to returning unlowered_expr_type (expr)
because it is an array, structure etc. subobject proxy.
For !processing_template_decl it keeps doing what it did before,
DECL_HAS_VALUE_EXPR_P meaning it is an array/structure etc. subobject proxy,
otherwise the tuple case.

2024-03-01  Jakub Jelinek  

PR c++/92687
* decl.cc (lookup_decomp_type): Return NULL_TREE if decomp_type_table
doesn't have entry for V.
* semantics.cc (finish_decltype_type): If ptds.saved, assert
DECL_HAS_VALUE_EXPR_P is true and decide on tuple vs. non-tuple based
on if lookup_decomp_type is NULL or not.

* g++.dg/cpp1z/decomp59.C: New test.
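
The behaviour described in the commit message above can be sketched with a
small example.  This is only an illustration, not the decomp59.C testcase:
the function names are hypothetical, and the expectation that the template
variant reports a non-reference type assumes a compiler that contains this
fix.  Both functions inspect decltype of a non-dependent structured binding
to a std::tuple; per the bug title, affected compilers may yield a reference
type inside the template.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Outside a template: decltype of a tuple-like structured binding is the
// std::tuple_element type, i.e. plain int and long here.
bool binding_types_outside_template()
{
  std::tuple<int, long> t{1, 2L};
  auto [x, y] = t;
  (void) x;
  (void) y;
  return std::is_same_v<decltype(x), int>
         && std::is_same_v<decltype(y), long>;
}

// Inside a template, with a binding that does not depend on the template
// parameter.  This is the non-dependent case PR92687 is about; the result
// is expected to be true only with the fix applied.
template <typename>
bool binding_types_inside_template()
{
  std::tuple<int, long> t{1, 2L};
  auto [x, y] = t;
  (void) x;
  (void) y;
  return std::is_same_v<decltype(x), int>
         && std::is_same_v<decltype(y), long>;
}

// Hypothetical helper: only the non-template case is asserted, since the
// template case depends on the compiler version.
void check_binding_types()
{
  assert(binding_types_outside_template());
  (void) binding_types_inside_template<void>();
}
```

Comparing the two return values on a given compiler is a quick way to see
whether it is affected.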

[Bug middle-end/113436] [OpenMP] 'allocate' clause has no effect for (first)private on 'target' directives

2024-03-01 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113436

--- Comment #2 from Tobias Burnus  ---
As mentioned in comment 0, PR110347's testcase (r14-9257-g4f82d5a95a244d)
contains '#if 0' code which has to be enabled once this bug is fixed.
Please remember to take care of:

* libgomp/testsuite/libgomp.c++/firstprivate-1.C's and
* libgomp/testsuite/libgomp.c++/private-1.C's

#if 0  /* FIXME: The following is disabled because of PR middle-end/113436.  */

[Bug c++/110347] [OpenMP] private/firstprivate of a C++ member variable mishandled

2024-03-01 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110347

Tobias Burnus  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Tobias Burnus  ---
FIXED on mainline (GCC 14).

[Bug tree-optimization/114192] New: scalar code left around following early break vectorization of reduction

2024-03-01 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192

Bug ID: 114192
   Summary: scalar code left around following early break
vectorization of reduction
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

For the following testcase:

int a[1024];
int f4(int *x, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
    {
        sum += a[i];
        if (a[i] == 42)
            break;
    }
    return sum;
}

at -O3 on aarch64 we vectorize it and get the following vector loop:

.L4:
        cmp     x7, x2
        beq     .L23
.L6:
        ubfiz   x3, x2, 4, 32
        ldr     w6, [x4, x2, lsl 2]    // scalar load
        mov     v27.16b, v30.16b
        mov     w0, w5
        add     v30.4s, v30.4s, v25.4s
        add     w5, w5, w6             // scalar add
        ldr     q29, [x4, x3]
        add     x2, x2, 1
        cmeq    v31.4s, v29.4s, v26.4s
        add     v28.4s, v28.4s, v29.4s
        umaxp   v31.4s, v31.4s, v31.4s
        fmov    x3, d31
        cbz     x3, .L4

but here the old scalar code has been left around.  If we remove the early exit
from the loop, then although we still leave the scalar code around in the
vectorizer, it gets optimized away immediately by the following DCE pass.

Without the early exit, in the vectorizer dump we have:

   [local count: 860067200]:
  # sum_10 = PHI 
  # i_12 = PHI 
  # vect_sum_10.8_25 = PHI 
  # vectp_a.9_26 = PHI 
  # ivtmp_32 = PHI 
  vect__1.11_28 = MEM  [(int *)vectp_a.9_26];
  _1 = a[i_12]; // scalar load
  vect_sum_6.12_29 = vect__1.11_28 + vect_sum_10.8_25;
  sum_6 = _1 + sum_10;
  i_7 = i_12 + 1;
  vectp_a.9_27 = vectp_a.9_26 + 16;
  ivtmp_33 = ivtmp_32 + 1;
  if (ivtmp_33 < bnd.5_22)
goto ; [89.00%]
  else
goto ; [11.00%]

i.e. the scalar load is left around, but it seems to get cleaned up by the
(immediately following) dce pass:

   [local count: 860067200]:
  # vect_sum_10.8_25 = PHI 
  # vectp_a.9_26 = PHI 
  # ivtmp_32 = PHI 
  vect__1.11_28 = MEM  [(int *)vectp_a.9_26];
  vect_sum_6.12_29 = vect__1.11_28 + vect_sum_10.8_25;
  vectp_a.9_27 = vectp_a.9_26 + 16;
  ivtmp_33 = ivtmp_32 + 1;
  if (ivtmp_33 < bnd.5_22)
goto ; [89.00%]
  else
goto ; [11.00%]

perhaps the dce needs improving to clean up the dead scalar code in the early
exit case, too.

[Bug tree-optimization/114192] scalar code left around following early break vectorization of reduction

2024-03-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192

Tamar Christina  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-03-01

--- Comment #1 from Tamar Christina  ---
Confirmed.

It looks like DCE6 no longer thinks:

  # sum_10 = PHI 

  _1 = aD.4432[i_12];
  sum_7 = _1 + sum_11;

is dead after vectorization.

it removes the only dead consumer of sum_7,
a PHI node left over in the guard block which becomes unused after the
reduction is vectorized.

DCE says:

marking necessary through sum_11 stmt sum_11 = PHI 
processing: sum_11 = PHI 

marking necessary through sum_7 stmt sum_7 = _1 + sum_11;
processing: sum_7 = _1 + sum_11;

marking necessary through _1 stmt _1 = a[i_12];
processing: _1 = a[i_12];

so it thinks the closed definition is needed?

This seems to only happen with reductions, other live operations look fine:

extern int a[1024];
int f4(int *x, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
    {
        sum = a[i];
        if (a[i] == 42)
            break;
    }
    return sum;
}

[Bug tree-optimization/114193] New: missed early break vectorization of reduction

2024-03-01 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114193

Bug ID: 114193
   Summary: missed early break vectorization of reduction
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

For the following loop:

int a[1024];
int f(int *x, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
    {
        if (a[i] == 42)
            break;
        sum += a[i];
    }
    return sum;
}

at -O3 on aarch64 we miss vectorizing it.  It works if I move the early exit
down below the update of sum.  It looks like vect_analyze_scalar_cycles fails
to detect this as a reduction:

/app/example.c:5:23: note:   Analyze phi: sum_10 = PHI 
/app/example.c:5:23: missed:   intermediate value used outside loop.
/app/example.c:5:23: missed:   Unknown def-use cycle pattern.
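
For comparison, here is a minimal sketch of the two loop shapes side by side.
The names (values, sum_exit_before_update, sum_exit_after_update,
check_loop_shapes) are hypothetical and not taken from the report.  Note that
moving the early exit below the update is not a semantics-preserving
workaround: the second shape also adds the terminating 42 to the sum.

```cpp
#include <cassert>

int values[1024];  // stands in for the global array `a` in the report

// Early exit tested before the accumulation: the shape reported above as
// not being vectorized.
int sum_exit_before_update(int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      if (values[i] == 42)
        break;
      sum += values[i];
    }
  return sum;
}

// Early exit tested after the accumulation: the shape from PR114192, which
// is reported as vectorized.  The terminating element is included in the sum.
int sum_exit_after_update(int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      sum += values[i];
      if (values[i] == 42)
        break;
    }
  return sum;
}

// Hypothetical helper showing that the two shapes return different results.
void check_loop_shapes()
{
  values[0] = 1;
  values[1] = 2;
  values[2] = 42;
  values[3] = 7;
  assert(sum_exit_before_update(4) == 3);   // 1 + 2
  assert(sum_exit_after_update(4) == 45);   // 1 + 2 + 42
}
```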

[Bug target/114194] New: ICE when using std::unique_ptr with xtheadvector

2024-03-01 Thread camel-cdr at protonmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114194

Bug ID: 114194
   Summary: ICE when using std::unique_ptr with xtheadvector
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: camel-cdr at protonmail dot com
  Target Milestone: ---

Using std::unique_ptr with xtheadvector enabled causes an ICE:

#include 
extern void use(std::unique_ptr &x);
void test(size_t n) {
std::unique_ptr x;
use(x);
}

See also: https://godbolt.org/z/6nbhxKdfd

I've managed to reduce the problem to the following set of templates, but I
have no idea how this could cause the ICE.

struct S1 { int x; };
struct S2 { constexpr S2() { } template S2(T&); };
struct S3 : S1, S2 { constexpr S3() : S1() { } S3(S3&); };
void f(S3 &) { S3 x; f(x); }


See also: https://godbolt.org/z/5YxM6jd3s

It's extremely brittle: the ICE goes away if you remove constexpr, the
reference, or any other part I could think of.

[Bug c++/84414] miscompile due to assuming that object returned by value cannot alias its own member pointer values

2024-03-01 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84414

--- Comment #8 from Jonathan Wakely  ---
This is now https://cplusplus.github.io/CWG/issues/2868.html

[Bug tree-optimization/109945] Escape analysis hates copy elision: different result with -O1 vs -O2

2024-03-01 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109945

--- Comment #33 from Jonathan Wakely  ---
This is now https://cplusplus.github.io/CWG/issues/2868.html

[Bug fortran/114188] Overloading assignment does not invalidate intrinsic assignment

2024-03-01 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114188

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---
(In reply to ba...@lrz.de from comment #0)
> Created attachment 57583 [details]
> test case for invalid use of assignment overloading
> 
> The attached reproducer overloads the assignment operator with a version
> that requires the left hand side to be a pointer.
> 
> The overload conforms to the requirements for defining the assignment
> according to 10.2.1.4 of the Fortran standard. Therefore, the intrinsic
> assignment should become unavailable (last sentence of 10.2.1.1).
> 
> However, gfortran accepts invocations that use nonpointer arguments.
> 
> (NAG Fortran, Intel Fortran and NVidia Fortran issue appropriate error
> messages).

Can you provide a bit more detail in your interpretation of F2023?

The last sentence in 10.2.1.1 is 

   1 An assignment-stmt shall meet the requirements of either a defined
 assignment statement or an intrinsic assignment statement.

If I comment out your 'interface assignment(=)' block, then 'b = a'
is an intrinsic assignment.  If I replace 'b = a' with the direct
call to 'ass', I see

a.f90:26:12:

   26 |call ass(b, a)
  |1
Error: Actual argument for ‘to’ at (1) must be a pointer or a valid
target for the dummy pointer in a pointer assignment statement

which seems to be the error that you want.  The question is then
if the source of this error can be interpreted such that 'b = a' in
your original code is in fact not a defined assignment, and therefore,
it is an intrinsic assignment (last sentence in 10.2.1.1).

10.2.1.5 has

  1 The interpretation of a defined assignment is provided by the subroutine
that defines it.

and the NOTE in this section contains 

   The rules of defined assignment (15.4.3.4.3), ...

15.4.3.4.3 goes into some detail about argument association.  These rules
seem to be the source of the above error when 'ass' is called directly.
Unfortunately, the five requirements in 10.2.1.4 for defined assignment
do not say anything about argument association.

[Bug analyzer/111802] [14 Regression] New analyser diagram failures since commit b365e9d57ad4

2024-03-01 Thread thiago.bauermann at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111802

--- Comment #5 from Thiago Jung Bauermann  ---
I can confirm this is fixed in our setup. Thank you!

[Bug c++/111224] modules: xtreme-header-1_a.H etc. ICE (in core_vals, at cp/module.cc:6108) on AArch64 with SVE types

2024-03-01 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111224

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org

--- Comment #8 from Patrick Palka  ---
(In reply to Nathaniel Shead from comment #7)
> Created attachment 57586 [details]
> Untested patch to implement POLY_INT_CST in modules
> 
> Here's a potential fix for this issue. But I only have access to an x86_64
> machine currently, so this is completely untested.
The compile farm might have a suitable machine for testing:
https://gcc.gnu.org/wiki/CompileFarm

[Bug c++/110025] [C++23] ICE with default-argument for template-param with decltype and auto.

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110025

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:a6a1920b592b58c38137c5c891b3bbb02b084f38

commit r14-9260-ga6a1920b592b58c38137c5c891b3bbb02b084f38
Author: Patrick Palka 
Date:   Fri Mar 1 12:50:18 2024 -0500

c++: auto(x) partial substitution [PR110025, PR114138]

In r12-6773-g09845ad7569bac we gave CTAD placeholders a level of 0 and
ensured we never replaced them via tsubst.  It turns out that autos
representing an explicit cast need the same treatment and for the same
reason: such autos appear in an expression context and so their level
gets easily messed up after partial substitution, leading to premature
replacement via an incidental tsubst instead of via do_auto_deduction.

This patch fixes this by extending the r12-6773 approach to auto(x).

PR c++/110025
PR c++/114138

gcc/cp/ChangeLog:

* cp-tree.h (make_cast_auto): Declare.
* parser.cc (cp_parser_functional_cast): If the type is an auto,
replace it with a level-less one via make_cast_auto.
* pt.cc (find_parameter_packs_r): Don't treat level-less auto
as a type parameter pack.
(tsubst) : Generalize CTAD placeholder
auto handling to all level-less autos.
(make_cast_auto): Define.
(do_auto_deduction): Handle replacement of a level-less auto.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast16.C: New test.
* g++.dg/cpp23/auto-fncast17.C: New test.
* g++.dg/cpp23/auto-fncast18.C: New test.

Reviewed-by: Jason Merrill 

[Bug c++/114138] [c++2b] ICE on valid code using `auto(expr)` DECAY-COPY

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114138

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:a6a1920b592b58c38137c5c891b3bbb02b084f38

commit r14-9260-ga6a1920b592b58c38137c5c891b3bbb02b084f38
Author: Patrick Palka 
Date:   Fri Mar 1 12:50:18 2024 -0500

c++: auto(x) partial substitution [PR110025, PR114138]

In r12-6773-g09845ad7569bac we gave CTAD placeholders a level of 0 and
ensured we never replaced them via tsubst.  It turns out that autos
representing an explicit cast need the same treatment and for the same
reason: such autos appear in an expression context and so their level
gets easily messed up after partial substitution, leading to premature
replacement via an incidental tsubst instead of via do_auto_deduction.

This patch fixes this by extending the r12-6773 approach to auto(x).

PR c++/110025
PR c++/114138

gcc/cp/ChangeLog:

* cp-tree.h (make_cast_auto): Declare.
* parser.cc (cp_parser_functional_cast): If the type is an auto,
replace it with a level-less one via make_cast_auto.
* pt.cc (find_parameter_packs_r): Don't treat level-less auto
as a type parameter pack.
(tsubst) : Generalize CTAD placeholder
auto handling to all level-less autos.
(make_cast_auto): Define.
(do_auto_deduction): Handle replacement of a level-less auto.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast16.C: New test.
* g++.dg/cpp23/auto-fncast17.C: New test.
* g++.dg/cpp23/auto-fncast18.C: New test.

Reviewed-by: Jason Merrill 

[Bug fortran/114188] Overloading assignment does not invalidate intrinsic assignment

2024-03-01 Thread Bader at lrz dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114188

--- Comment #2 from Bader at lrz dot de  ---
You note that

> Unfortunately, the five requirements in 10.2.1.4 for defined assignment
> do not say anything about argument association.

Hmm, one could see this as "intentionally" instead of "unfortunately": If
the requirements in 10.2.1.4 are fulfilled, then a defined assignment exists.

The consequences are:

(1) the intrinsic assignment becomes unavailable (because the last sentence in
10.2.1.1 establishes a mutual exclusion).

(2) Any further details on how the subroutine is set up must be appropriately
handled by the programmer (e.g., supplying POINTER objects in my example's
LHS) - this is what is meant by "The interpretation of a defined assignment is
provided by the subroutine that defines it". The NOTE appearing later
to me does not seem germane to the question at hand.

While my starting assumption may be wrong, the other compilers' behaviour is 
consistent with it.

Cheers
Reinhold

[Bug fortran/104819] Reject NULL without MOLD as actual to an assumed-rank dummy

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104819

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:db0b6746be075e43c8142585968483e125bb52d0

commit r14-9261-gdb0b6746be075e43c8142585968483e125bb52d0
Author: Harald Anlauf 
Date:   Fri Mar 1 19:21:27 2024 +0100

Fortran: improve checks of NULL without MOLD as actual argument [PR104819]

gcc/fortran/ChangeLog:

PR fortran/104819
* check.cc (gfc_check_null): Handle nested NULL()s.
(is_c_interoperable): Check for MOLD argument of NULL() as part of
the interoperability check.
* interface.cc (gfc_compare_actual_formal): Extend checks for
NULL()
actual arguments for presence of MOLD argument when required by
Interp J3/22-146.

gcc/testsuite/ChangeLog:

PR fortran/104819
* gfortran.dg/assumed_rank_9.f90: Adjust testcase use of NULL().
* gfortran.dg/pr101329.f90: Adjust testcase to conform to interp.
* gfortran.dg/null_actual_4.f90: New test.

[Bug libstdc++/113841] Can't swap two std::hash

2024-03-01 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841

--- Comment #10 from Jonathan Wakely  ---
This one's much harder to fix:

#include 

template
struct Alloc
{
  using value_type = T;

  Alloc(int) { }

  template Alloc(const Alloc&) { }

  T* allocate(std::size_t n) { return std::allocator().allocate(n); }
  void deallocate(T* p, std::size_t n) { std::allocator().deallocate(p, n); }
};

template struct wrap { T t; };

template void do_adl(T&) { }

void test_pr113841()
{
  using Tr = std::char_traits;
  using test_type = std::basic_string>;
  std::pair>* h = nullptr;
  do_adl(h);
}

[Bug debug/114186] Incorrect CTF generated for multidimensional array

2024-03-01 Thread ibhagat at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114186

--- Comment #1 from Indu Bhagat  ---
This in turn affects BTF generation too, because GCC internally uses the CTFC
(CTF container) to create BTF info.

 $ bpftool btf dump file array.o 
[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
[3] ARRAY '(anon)' type_id=1 index_type_id=2 nr_elems=5
[4] ARRAY '(anon)' type_id=3 index_type_id=2 nr_elems=9
[5] ARRAY '(anon)' type_id=4 index_type_id=2 nr_elems=3
[6] VAR 'a' type_id=5, linkage=global
[7] DATASEC '.bss' size=0 vlen=1
type_id=6 offset=0 size=540 (VAR 'a')

[Bug c++/114165] &scalar+1 and array+1 rejected as template parameters

2024-03-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114165

--- Comment #1 from Andrew Pinski  ---
:17:32: error: '((& scalar) + 4)' is not a valid template argument for
'int*' because it is not the address of a variable

[Bug c++/114165] &scalar+1 and array+1 rejected as template parameters

2024-03-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114165

--- Comment #2 from Andrew Pinski  ---
Reduced testcase:
```
template
void withP() {}
int array[2];
int main() {
  withP();
  withP();
}
```

GCC and MSVC both reject `array+1`.
All three accept `&array[1]` though in C++20+.

[Bug fortran/114188] Overloading assignment does not invalidate intrinsic assignment

2024-03-01 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114188

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Priority|P3  |P4
   Last reconfirmed||2024-03-01
   Keywords||accepts-invalid, wrong-code
 Ever confirmed|0   |1

--- Comment #3 from kargl at gcc dot gnu.org ---
(In reply to ba...@lrz.de from comment #2)
> You note that
> 
> > Unfortunately, the five requirements in 10.2.1.4 for defined assignment
> > do not say anything about argument association.
> 
> Hmm, one could see this as "intentionally" instead of "unfortunately": If
> the requirements in 10.2.1.4 are fulfilled, then a defined assignment exists.
> 
> The consequences are:
> 
> (1) the intrinsic assignment becomes unavailable (because the last sentence
> in 10.2.1.1 establishes a mutual exclusion).
> 
> (2) Any further details on how the subroutine is set up must be appropriately
> handled by the programmer (e.g., supplying POINTER objects in my example's
> LHS) - this is what is meant by "The interpretation of a defined assignment
> is provided by the subroutine that defines it". The NOTE appearing later
> to me does not seem germane to the question at hand.
> 
> While my starting assumption may be wrong, the other compilers' behaviour is 
> consistent with it.
> 

I wasn't assuming that you were wrong and I've read enough of
your posts in J3 mailing list to trust your interpretation.
You've confirmed a few of my suspicions on how you were reading
the standard.  Hopefully, the clarity will help whoever jumps
down the rabbit hole to fix the bug.

[Bug target/114187] [14 regression] bizarre register dance on x86_64 for pass-by-value struct since r14-2526

2024-03-01 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114187

Roger Sayle  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |roger at nextmovesoftware dot com

--- Comment #4 from Roger Sayle  ---
Created attachment 57587
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57587&action=edit
proposed patch

Proposed fix attached.  Currently bootstrapping and regression testing.  The
problematic code (from March 2023) has an interesting history.

[Bug fortran/114188] Overloading assignment does not invalidate intrinsic assignment

2024-03-01 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114188

--- Comment #4 from anlauf at gcc dot gnu.org ---
(In reply to ba...@lrz.de from comment #0)
> (NAG Fortran, Intel Fortran and NVidia Fortran issue appropriate error
> messages).

NVidia has a different issue: if one imports only the type declaration via

  use mod_supp, only : supp

all compilers should more or less compile the code, but that brand gives:

NVFORTRAN-S-0188-Argument number 1 (non-POINTER) to ass: type mismatch
(pr114188.f90: 22)
NVFORTRAN-S-0188-Argument number 1 (non-POINTER) to ass: type mismatch
(pr114188.f90: 23)

i.e. it uses the (unavailable) defined assignment although it shouldn't...

[Bug c++/109642] False Positive -Wdangling-reference with std::span-like classes

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109642

--- Comment #19 from GCC Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:c7607c4cf18986025430ca8626abfe56bfe87106

commit r14-9263-gc7607c4cf18986025430ca8626abfe56bfe87106
Author: Marek Polacek 
Date:   Thu Jan 25 16:38:51 2024 -0500

c++: implement [[gnu::no_dangling]] [PR110358]

Since -Wdangling-reference has false positives that can't be
prevented, we should offer an easy way to suppress the warning.
Currently, that is only possible by using a #pragma, either around the
enclosing class or around the call site.  But #pragma GCC diagnostic tends
to be onerous.  A better solution would be to have an attribute.

To that end, this patch adds a new attribute, [[gnu::no_dangling]].
This attribute takes an optional bool argument to support cases like:

  template 
  struct [[gnu::no_dangling(std::is_reference_v)]] S {
 // ...
  };

PR c++/110358
PR c++/109642

gcc/cp/ChangeLog:

* call.cc (no_dangling_p): New.
(reference_like_class_p): Use it.
(do_warn_dangling_reference): Use it.  Don't warn when the function
or its enclosing class has attribute gnu::no_dangling.
* tree.cc (cxx_gnu_attributes): Add gnu::no_dangling.
(handle_no_dangling_attribute): New.

gcc/ChangeLog:

* doc/extend.texi: Document gnu::no_dangling.
* doc/invoke.texi: Mention that gnu::no_dangling disables
-Wdangling-reference.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-no-dangling1.C: New test.
* g++.dg/ext/attr-no-dangling2.C: New test.
* g++.dg/ext/attr-no-dangling3.C: New test.
* g++.dg/ext/attr-no-dangling4.C: New test.
* g++.dg/ext/attr-no-dangling5.C: New test.
* g++.dg/ext/attr-no-dangling6.C: New test.
* g++.dg/ext/attr-no-dangling7.C: New test.
* g++.dg/ext/attr-no-dangling8.C: New test.
* g++.dg/ext/attr-no-dangling9.C: New test.
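
As a rough illustration of the class-level use described in the commit
message above, here is a minimal sketch.  View, first_element and
check_first_element are hypothetical names, not taken from the patch or its
tests, and the sketch assumes a compiler that knows the attribute (GCC 14
according to this thread); other compilers are expected to ignore an unknown
scoped attribute, typically with a warning.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical span-like class: it only refers to storage owned elsewhere,
// so references obtained through a temporary View do not dangle.  The
// attribute tells -Wdangling-reference not to warn about such uses.
template <typename T>
struct [[gnu::no_dangling]] View
{
  T *ptr;
  std::size_t len;

  View(T *p, std::size_t n) : ptr(p), len(n) {}

  T &operator[](std::size_t i) const { return ptr[i]; }
};

int first_element(int *storage, std::size_t n)
{
  // The View is a temporary, but the reference points into `storage`,
  // which outlives this statement.
  const int &r = View<int>(storage, n)[0];
  return r;
}

// Hypothetical helper exercising the function above.
void check_first_element()
{
  int storage[3] = {7, 8, 9};
  assert(first_element(storage, 3) == 7);
}
```

Per the commit message, the attribute can also take an optional bool
argument, which is not shown here.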

[Bug c++/110358] requesting nicer suppression for Wdangling-reference

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110358

--- Comment #6 from GCC Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:c7607c4cf18986025430ca8626abfe56bfe87106

commit r14-9263-gc7607c4cf18986025430ca8626abfe56bfe87106
Author: Marek Polacek 
Date:   Thu Jan 25 16:38:51 2024 -0500

c++: implement [[gnu::no_dangling]] [PR110358]

Since -Wdangling-reference has false positives that can't be
prevented, we should offer an easy way to suppress the warning.
Currently, that is only possible by using a #pragma, either around the
enclosing class or around the call site.  But #pragma GCC diagnostic tends
to be onerous.  A better solution would be to have an attribute.

To that end, this patch adds a new attribute, [[gnu::no_dangling]].
This attribute takes an optional bool argument to support cases like:

  template 
  struct [[gnu::no_dangling(std::is_reference_v)]] S {
 // ...
  };

PR c++/110358
PR c++/109642

gcc/cp/ChangeLog:

* call.cc (no_dangling_p): New.
(reference_like_class_p): Use it.
(do_warn_dangling_reference): Use it.  Don't warn when the function
or its enclosing class has attribute gnu::no_dangling.
* tree.cc (cxx_gnu_attributes): Add gnu::no_dangling.
(handle_no_dangling_attribute): New.

gcc/ChangeLog:

* doc/extend.texi: Document gnu::no_dangling.
* doc/invoke.texi: Mention that gnu::no_dangling disables
-Wdangling-reference.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-no-dangling1.C: New test.
* g++.dg/ext/attr-no-dangling2.C: New test.
* g++.dg/ext/attr-no-dangling3.C: New test.
* g++.dg/ext/attr-no-dangling4.C: New test.
* g++.dg/ext/attr-no-dangling5.C: New test.
* g++.dg/ext/attr-no-dangling6.C: New test.
* g++.dg/ext/attr-no-dangling7.C: New test.
* g++.dg/ext/attr-no-dangling8.C: New test.
* g++.dg/ext/attr-no-dangling9.C: New test.

[Bug c++/110358] requesting nicer suppression for Wdangling-reference

2024-03-01 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110358

Marek Polacek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Marek Polacek  ---
[[gnu::no_dangling]] implemented in GCC 14.

[Bug target/114190] wrong code with -O2 -fno-dce -fharden-compares -mvpclmulqdq --param=max-rtl-if-conversion-unpredictable-cost=136

2024-03-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114190

--- Comment #1 from Andrew Pinski  ---
> test    esi, esi


is missing with `-fno-dce`

[Bug target/114190] wrong code with -O2 -fno-dce -fharden-compares -mvpclmulqdq --param=max-rtl-if-conversion-unpredictable-cost=136

2024-03-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114190

--- Comment #2 from Andrew Pinski  ---
So after reload, it looks ok:
(insn 22 21 380 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 4 si [orig:111 _21+4 ] [111])
(const_int 0 [0]))) "/app/example.cpp":8:8 discrim 1 7
{*cmpsi_ccno_1}
 (nil))
(insn 380 22 381 2 (parallel [
(set (reg:DI 1 dx [353])
(plus:DI (reg/f:DI 7 sp)
(const_int 212 [0xd4])))
(clobber (reg:CC 17 flags))
]) "/app/example.cpp":8:65 discrim 1 272 {*adddi_1}
 (expr_list:REG_EQUIV (plus:DI (reg/f:DI 16 argp)
(const_int 76 [0x4c]))
(nil)))
(insn 381 380 466 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 4 si [orig:111 _21+4 ] [111])
(const_int 0 [0]))) "/app/example.cpp":8:65 discrim 1 7
{*cmpsi_ccno_1}
 (nil))

Note without -fno-dce, `insn 22` is not in the IR and that is the only
difference so far. This is ok at this point.

[Bug target/114195] New: [14] RISC-V vector ICE: in vectorizable_store, at tree-vect-stmts.cc:8690

2024-03-01 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114195

Bug ID: 114195
   Summary: [14] RISC-V vector ICE: in vectorizable_store, at
tree-vect-stmts.cc:8690
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Created attachment 57588
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57588&action=edit
-freport-bug output

Testcase:
long a, b;
extern short c[];
void d() {
  for (int e = 0; e < 5; e += 2) {
a = ({ a < 0 ? a : 0; });
b = ({ b < 0 ? b : 0; });
c[e] = 0;
  }
}

Backtrace:
> /scratch/tc-testing/tc-feb-20/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc 
> -march=rv64gcv -O3 red.c
during GIMPLE pass: vect
red.c: In function 'd':
red.c:3:6: internal compiler error: in vectorizable_store, at
tree-vect-stmts.cc:8690
3 | void d() {
  |  ^
0xbe592a vectorizable_store
../../../gcc/gcc/tree-vect-stmts.cc:8690
0x274ff35 vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*, _slp_tree*,
_slp_instance*, vec*)
../../../gcc/gcc/tree-vect-stmts.cc:13241
0x15de0e2 vect_analyze_loop_operations
../../../gcc/gcc/tree-vect-loop.cc:2208
0x15de0e2 vect_analyze_loop_2
../../../gcc/gcc/tree-vect-loop.cc:3041
0x15dfdf0 vect_analyze_loop_1
../../../gcc/gcc/tree-vect-loop.cc:3481
0x15e0589 vect_analyze_loop(loop*, vec_info_shared*)
../../../gcc/gcc/tree-vect-loop.cc:3639
0x1627ed4 try_vectorize_loop_1
../../../gcc/gcc/tree-vectorizer.cc:1066
0x1627ed4 try_vectorize_loop
../../../gcc/gcc/tree-vectorizer.cc:1182
0x16287fc execute
../../../gcc/gcc/tree-vectorizer.cc:1298
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

Godbolt: https://godbolt.org/z/s9YbYhK8e

Tested/found using r14-9084-g61ab046a327 (not bisected)

Found via fuzzer.

[Bug rtl-optimization/114190] wrong code with -O2 -fno-dce -fharden-compares -mvpclmulqdq --param=max-rtl-if-conversion-unpredictable-cost=136

2024-03-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114190

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-03-01
  Component|target  |rtl-optimization

--- Comment #3 from Andrew Pinski  ---
cmpelim produces:
```
(insn 22 21 486 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 4 si [orig:111 _21+4 ] [111])
(const_int 0 [0]))) "/app/example.cpp":8:8 discrim 1 7
{*cmpsi_ccno_1}
 (expr_list:REG_UNUSED (reg:CCZ 17 flags)
(nil)))
(insn 486 22 466 2 (set (reg:DI 1 dx [353])
(plus:DI (reg/f:DI 7 sp)
(const_int 212 [0xd4]))) "/app/example.cpp":8:65 discrim 1 254
{*leadi}
 (nil))
(insn 466 486 383 2 (set (reg:DI 4 si [354])
(const:DI (plus:DI (symbol_ref:DI ("u") [flags 0x2]  )
(const_int 1 [0x1])))) "/app/example.cpp":8:65 discrim 1 84
{*movdi_internal}
 (expr_list:REG_EQUIV (const:DI (plus:DI (symbol_ref:DI ("u") [flags 0x2] 
)
(const_int 1 [0x1])))
(nil)))
(insn 383 466 384 2 (set (reg:DI 1 dx [348])
(if_then_else:DI (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(reg:DI 1 dx [353])
(reg:DI 4 si [354]))) "/app/example.cpp":8:65 discrim 1 1451
{*movdicc_noc}
 (expr_list:REG_DEAD (reg:CCZ 17 flags)
(expr_list:REG_DEAD (reg:DI 4 si [354])
(nil))))
```

Which is fine (but note the REG_UNUSED note which was not updated).

From pro_and_epilogue's dump:
deleting insn with uid = 22.

Which is totally bogus; I think someone didn't redo REG_UNUSED notes again.

[Bug c++/110213] Bogus (as opposed to false positive) -Wdangling-reference warning

2024-03-01 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110213

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org
 Resolution|--- |WORKSFORME
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Marek Polacek  ---
You can now disable the warning by using the [[gnu::no_dangling]] attribute:

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=c7607c4cf18986025430ca8626abfe56bfe87106

[Bug c++/110075] Bogus -Wdangling-reference

2024-03-01 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110075

Marek Polacek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Marek Polacek  ---
In r14-9263-gc7607c4cf18986 I added [[gnu::no_dangling]] which can be used to
suppress the warning in cases where the compiler can't detect that the
reference doesn't actually dangle.

Hopefully that'll work.
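
For readers who have not used the attribute yet, here is a minimal sketch of the kind of function it is meant for. The names `Registry`, `entry_for` and `read_third` are made up for illustration and do not come from either bug report. The sketch assumes GCC 14 or later; an older compiler would most likely just warn about an unknown attribute.

```cpp
#include <cassert>

// Hypothetical container, used only for illustration.
struct Registry {
  int slots[4] = {10, 20, 30, 40};
};

// Returns a reference into `reg`, never into `index`. Because the return
// type matches the type of the const-reference parameter `index`,
// -Wdangling-reference may warn when a temporary is passed for it.
// [[gnu::no_dangling]] tells the compiler that the result does not dangle.
[[gnu::no_dangling]]
const int& entry_for(const Registry& reg, const int& index) {
  return reg.slots[index];
}

// The literal 2 becomes a temporary bound to `index`; the returned
// reference still points into `reg`, which outlives that temporary.
inline int read_third(const Registry& reg) {
  const int& ref = entry_for(reg, 2);
  assert(&ref == &reg.slots[2]);
  return ref;
}
```

The attribute only silences the diagnostic. It is up to the author to make sure the returned reference really does not refer to a temporary argument.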

[Bug middle-end/114196] New: [13/14 Regression] Fixed length vector ICE: in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9454

2024-03-01 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114196

Bug ID: 114196
   Summary: [13/14 Regression] Fixed length vector ICE: in
vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9454
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Created attachment 57589
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57589&action=edit
-freport-bug output

Testcase:
unsigned a;
int b;
long *c;
int main() {
  for (short d = 1; d < (short)5443215699099219 - 15932; d += 94 - 90) {
b = ({
  __typeof__(0) e = c[d];
  e;
})
?: -c[d];
a *= 3;
  }
}

Backtrace:
> /scratch/tc-testing/tc-feb-20/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc 
> -march=rv64gcv_zvl256b -O3 --param=riscv-autovec-preference=fixed-vlmax red.c 
> -o red.out
during GIMPLE pass: vect
./red.c: In function 'main':
./red.c:4:5: internal compiler error: in vect_peel_nonlinear_iv_init, at
tree-vect-loop.cc:9454
4 | int main() {
  | ^~~~
0xb04d35 vect_peel_nonlinear_iv_init(gimple**, tree_node*, tree_node*,
tree_node*, vect_induction_op_type)
../../../gcc/gcc/tree-vect-loop.cc:9454
0x15e6c54 vect_update_ivs_after_vectorizer
../../../gcc/gcc/tree-vect-loop-manip.cc:2368
0x15f3ca7 vect_do_peeling(_loop_vec_info*, tree_node*, tree_node*, tree_node**,
tree_node**, tree_node**, int, bool, bool, tree_node**)
../../../gcc/gcc/tree-vect-loop-manip.cc:3501
0x15e198e vect_transform_loop(_loop_vec_info*, gimple*)
../../../gcc/gcc/tree-vect-loop.cc:11934
0x1627a01 vect_transform_loops
../../../gcc/gcc/tree-vectorizer.cc:1006
0x1628173 try_vectorize_loop_1
../../../gcc/gcc/tree-vectorizer.cc:1152
0x1628173 try_vectorize_loop
../../../gcc/gcc/tree-vectorizer.cc:1182
0x16287fc execute
../../../gcc/gcc/tree-vectorizer.cc:1298
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

Godbolt: https://godbolt.org/z/eeEan1hvv

Tested/found using r14-9084-g61ab046a327 (not bisected)

Found via fuzzer.

[Bug middle-end/114197] New: [14] middle-end: ICE in verify_dominators

2024-03-01 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114197

Bug ID: 114197
   Summary: [14] middle-end: ICE in verify_dominators
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ewlu at rivosinc dot com
  Target Milestone: ---

Tested with riscv and x86_64

godbolt: https://godbolt.org/z/EvWj99d4b

#pragma pack(push)
struct a {
  volatile signed b : 8;
};
#pragma pack(pop)
int c;
static struct a d = {5};
void e() {
f:
  for (c = 8; c < 55; ++c)
if (!d.b)
  goto f;
}

testcase.c: In function 'e':
testcase.c:8:6: error: dominator of 7 should be 3, not 9
8 | void e() {
  |  ^
during GIMPLE pass: vect
dump file: testcase.c.179t.vect
testcase.c:8:6: internal compiler error: in verify_dominators, at
dominance.cc:1194
0xa0954c verify_dominators(cdi_direction)
../../../gcc/gcc/dominance.cc:1194
0x15e7ed1 checking_verify_dominators(cdi_direction)
../../../gcc/gcc/dominance.h:76
0x15e7ed1 slpeel_tree_duplicate_loop_to_edge_cfg(loop*, edge_def*, loop*,
edge_def*, edge_def*, edge_def**, bool, vec*)
../../../gcc/gcc/tree-vect-loop-manip.cc:1945
0x15ea20a vect_do_peeling(_loop_vec_info*, tree_node*, tree_node*, tree_node**,
tree_node**, tree_node**, int, bool, bool, tree_node**)
../../../gcc/gcc/tree-vect-loop-manip.cc:3426
0x15d982e vect_transform_loop(_loop_vec_info*, gimple*)
../../../gcc/gcc/tree-vect-loop.cc:11935
0x161f981 vect_transform_loops
../../../gcc/gcc/tree-vectorizer.cc:1006
0x16200f3 try_vectorize_loop_1
../../../gcc/gcc/tree-vectorizer.cc:1152
0x16200f3 try_vectorize_loop
../../../gcc/gcc/tree-vectorizer.cc:1182
0x162077c execute
../../../gcc/gcc/tree-vectorizer.cc:1298

[Bug middle-end/114197] [14] middle-end: ICE in verify_dominators

2024-03-01 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114197

Edwin Lu  changed:

   What|Removed |Added

 CC||ewlu at rivosinc dot com,
   ||patrick at rivosinc dot com,
   ||vineetg at rivosinc dot com

--- Comment #1 from Edwin Lu  ---
Found an issue where the dominators in duplicated loops were not properly
updated. Will be sending a fix soon
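
As background on the check that fired: a block D dominates a block B when every path from the function entry to B passes through D, and the immediate dominator is the closest such D. The sketch below recomputes immediate dominators for a small control-flow graph using the textbook iterative set intersection. It is not GCC's algorithm or data structure; `immediate_dominators` and the predecessor-list encoding are assumptions made purely for illustration, and the bitset limits it to 64 blocks.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// preds[v] lists the predecessors of block v; block 0 is the entry.
// Returns idom[v] for every block, with -1 for the entry block.
std::vector<int> immediate_dominators(
    const std::vector<std::vector<int>>& preds) {
  const int n = static_cast<int>(preds.size());
  assert(n <= 64);  // one bit per block in a 64-bit mask
  if (n == 0) return {};

  const std::uint64_t all = (n == 64) ? ~0ull : ((1ull << n) - 1);
  std::vector<std::uint64_t> dom(n, all);  // start from "everything"
  dom[0] = 1ull;                           // the entry dominates itself only

  // dom(v) = {v} united with the intersection of dom(p) over all preds p.
  for (bool changed = true; changed;) {
    changed = false;
    for (int v = 1; v < n; ++v) {
      std::uint64_t meet = all;
      for (int p : preds[v]) meet &= dom[p];
      const std::uint64_t next = meet | (1ull << v);
      if (next != dom[v]) {
        dom[v] = next;
        changed = true;
      }
    }
  }

  // The immediate dominator of v is the strict dominator whose own
  // dominator set equals the set of strict dominators of v.
  std::vector<int> idom(n, -1);
  for (int v = 1; v < n; ++v) {
    const std::uint64_t strict = dom[v] & ~(1ull << v);
    for (int d = 0; d < n; ++d)
      if (((strict >> d) & 1ull) != 0 && dom[d] == strict) idom[v] = d;
  }
  return idom;
}
```

When a transformation such as loop duplication adds or redirects edges, any cached immediate dominators for the affected blocks have to be refreshed. A message like "dominator of 7 should be 3, not 9" presumably reports a cached value that disagrees with a fresh computation of this kind.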

[Bug middle-end/114197] [14] middle-end: ICE in verify_dominators

2024-03-01 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114197

--- Comment #2 from Patrick O'Neill  ---
found via fuzzer

[Bug middle-end/114198] New: [14] RISC-V fixed-length vector -flto ICE: in vectorizable_load, at tree-vect-stmts.cc:10570

2024-03-01 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114198

Bug ID: 114198
   Summary: [14] RISC-V fixed-length vector -flto ICE: in
vectorizable_load, at tree-vect-stmts.cc:10570
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Testcase:
extern int a;
extern long b[];
char c = -61;
int d = 1198809473;
int e = -1766414329;
short f[1][13];
int g[1][13][13];
signed char h[1][13][13];
long long i[1][13][13][13], j[1][13][13][13];
unsigned k[1][13][13][13][13];
void l(char q, long r, int s, short t[][13], int u[][13][13],
   signed char v[][13][13], long long w[][13][13][13],
   long long x[][13][13][13], unsigned y[][13][13][13][13]) {
  for (int m = 0; m < 3; m = 31)
for (unsigned n = 0; n < ({ v ? v[m][4][4] : 0; }); n += 3)
  for (int o = 0; o < s - 18446744071943137274ULL;
   o += r - 9893399894145917309ULL)
for (unsigned p = 0; p < (unsigned char)q - 182; p += 2) {
  b[o] &= x[m][4][n][4];
  a = y[p][o][n][4][m];
}
}
int main() { l(c, d, e, f, g, h, i, j, k); }

Backtrace:
> /scratch/tc-testing/tc-feb-20/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc 
> -march=rv64gcv -flto -O3 --param=riscv-autovec-preference=fixed-vlmax red.c 
> -o red.out
during GIMPLE pass: vect
./red.c: In function 'main':
./red.c:23:5: internal compiler error: in vectorizable_load, at
tree-vect-stmts.cc:10570
   23 | int main() { l(c, d, e, f, g, h, i, j, k); }
  | ^
0xb9a8a9 vectorizable_load
../../../gcc/gcc/tree-vect-stmts.cc:10570
0x2601257 vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*, _slp_tree*,
_slp_instance*, vec*)
../../../gcc/gcc/tree-vect-stmts.cc:13240
0x1488db2 vect_analyze_loop_operations
../../../gcc/gcc/tree-vect-loop.cc:2208
0x1488db2 vect_analyze_loop_2
../../../gcc/gcc/tree-vect-loop.cc:3041
0x148aac0 vect_analyze_loop_1
../../../gcc/gcc/tree-vect-loop.cc:3481
0x148b259 vect_analyze_loop(loop*, vec_info_shared*)
../../../gcc/gcc/tree-vect-loop.cc:3639
0x14d2ce4 try_vectorize_loop_1
../../../gcc/gcc/tree-vectorizer.cc:1066
0x14d2ce4 try_vectorize_loop
../../../gcc/gcc/tree-vectorizer.cc:1182
0x14d360c execute
../../../gcc/gcc/tree-vectorizer.cc:1298
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
lto-wrapper: fatal error:
/scratch/tc-testing/tc-feb-20/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc
returned 1 exit status
compilation terminated.
/scratch/tc-testing/tc-feb-20/build-rv64gcv/lib/gcc/riscv64-unknown-linux-gnu/14.0.1/../../../../riscv64-unknown-linux-gnu/bin/ld:
error: lto-wrapper failed
collect2: error: ld returned 1 exit status

Godbolt: https://godbolt.org/z/M7obrj3vK

Tested/found using r14-9084-g61ab046a327 (not bisected)

Found via fuzzer.

[Bug c++/114005] Constructing a constexpr std::initializer_list ICEs GCC when using C++ modules

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114005

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Nathaniel Shead :

https://gcc.gnu.org/g:2823b4d96d9ec4ad4e67e5e8edaa1b060a467491

commit r14-9266-g2823b4d96d9ec4ad4e67e5e8edaa1b060a467491
Author: Nathaniel Shead 
Date:   Thu Feb 29 22:49:13 2024 +1100

c++: Ensure DECL_CONTEXT is set for temporary vars [PR114005]

Modules streaming requires DECL_CONTEXT to be set for anything streamed.
This patch ensures that 'create_temporary_var' does set a DECL_CONTEXT
for these variables (such as the backing storage for initializer_lists)
even if not inside a function declaration.

PR c++/114005

gcc/cp/ChangeLog:

* init.cc (create_temporary_var): Use current_scope instead of
current_function_decl.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr114005_a.C: New test.
* g++.dg/modules/pr114005_b.C: New test.

Signed-off-by: Nathaniel Shead 
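
To illustrate what "not inside a function declaration" means here, the following sketch shows a namespace-scope `constexpr std::initializer_list` whose backing array is a temporary created outside any function. It is plain C++ without modules, so it does not reproduce the ICE; the names `values`, `sum_values` and `first_value` are invented for the example.

```cpp
#include <cassert>
#include <initializer_list>

// Namespace-scope constexpr initializer_list: the backing array {1, 2, 3}
// is a temporary that is created outside of any function body.
constexpr std::initializer_list<int> values = {1, 2, 3};

// Sums the elements; usable in constant expressions (C++14 or later).
constexpr int sum_values() {
  int total = 0;
  for (int v : values) total += v;
  return total;
}
static_assert(sum_values() == 6, "backing array is readable at compile time");

// Runtime access to the same backing storage.
inline int first_value() {
  assert(values.size() == 3);
  return *values.begin();
}
```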

[Bug c++/114170] [modules] error streaming in header unit with instantiated variable template with non-trivial initializer

2024-03-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114170

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Nathaniel Shead :

https://gcc.gnu.org/g:852b58552991099141f9df5782e1f28d8606af9d

commit r14-9267-g852b58552991099141f9df5782e1f28d8606af9d
Author: Nathaniel Shead 
Date:   Fri Mar 1 11:08:23 2024 +1100

c++: Stream definitions for implicit instantiations [PR114170]

An implicit instantiation has an initializer depending on whether
DECL_INITIALIZED_P is set (like normal VAR_DECLs) which needs to be
written to ensure that consumers of header modules properly emit
definitions for these instantiations. This patch ensures that we
correctly fallback to checking this flag when DECL_INITIAL is not set
for a template instantiation.

For variables with non-trivial dynamic initialization, DECL_INITIAL can
be empty after 'split_nonconstant_init' but DECL_INITIALIZED_P is still
set; we need to check the latter to determine if we need to go looking
for a definition to emit (often in 'static_aggregates' here). This is
the case in the linked testcase.

However, for template specialisations (not instantiations?) we primarily
care about DECL_INITIAL; if the variable has initialization depending on
a template parameter then we'll need to emit that definition even though
it doesn't yet have DECL_INITIALIZED_P set; this is the case in e.g.

  template <int N> int value = N;

As a drive-by fix, also ensures that the count of initializers matches
the actual number of initializers written. This doesn't seem to be
necessary for correctness in the current testsuite, but feels wrong and
makes debugging harder when initializers aren't properly written for
other reasons.

PR c++/114170

gcc/cp/ChangeLog:

* module.cc (has_definition): Fall back to DECL_INITIALIZED_P
when DECL_INITIAL is not set on a template.
(module_state::write_inits): Only increment count when
initializers are actually written.

gcc/testsuite/ChangeLog:

* g++.dg/modules/var-tpl-2_a.H: New test.
* g++.dg/modules/var-tpl-2_b.C: New test.

Signed-off-by: Nathaniel Shead 
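
A small sketch of the two kinds of variable template the commit message distinguishes, written as ordinary C++ rather than as a header unit, so it does not exercise the streaming code itself. `make_label`, `label` and `demo_length` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <string>

// Initializer is a constant that depends on the template parameter.
template <int N> int value = N;

// Hypothetical helper whose result is not a compile-time constant.
inline std::string make_label(int n) { return "item-" + std::to_string(n); }

// Non-trivial initializer: each instantiation is initialized dynamically.
template <int N> std::string label = make_label(N);

inline int demo_length() {
  // Naming value<7> and label<7> implicitly instantiates both variables.
  assert(value<7> == 7);
  return static_cast<int>(label<7>.size());
}
```

If such templates lived in a header unit, an importer that uses `label<7>` would need the definition, including its dynamic initializer, to be available, which is the situation the patch description is concerned with.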

[Bug c++/114170] [modules] error streaming in header unit with instantiated variable template with non-trivial initializer

2024-03-01 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114170

Nathaniel Shead  changed:

   What|Removed |Added

 CC||nshead at gcc dot gnu.org
 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |nshead at gcc dot gnu.org
   Target Milestone|--- |14.0
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Nathaniel Shead  ---
Fixed for GCC 14.

[Bug c++/103524] [meta-bug] modules issue

2024-03-01 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 114170, which changed state.

Bug 114170 Summary: [modules] error streaming in header unit with instantiated variable template with non-trivial initializer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114170

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug c++/103524] [meta-bug] modules issue

2024-03-01 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 114005, which changed state.

Bug 114005 Summary: Constructing a constexpr std::initializer_list ICEs GCC when using C++ modules
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114005

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug c++/114005] Constructing a constexpr std::initializer_list ICEs GCC when using C++ modules

2024-03-01 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114005

Nathaniel Shead  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
   Target Milestone|--- |14.0
 Resolution|--- |FIXED
 CC||nshead at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |nshead at gcc dot gnu.org

--- Comment #2 from Nathaniel Shead  ---
Fixed for GCC 14.

[Bug libstdc++/114103] FAIL: 29_atomics/atomic/lock_free_aliases.cc -std=gnu++20 (test for excess errors)

2024-03-01 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103

--- Comment #11 from dave.anglin at bell dot net ---
On 2024-02-29 12:44 p.m., redi at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103
>
> --- Comment #10 from Jonathan Wakely  ---
> This additional change should fix that:
>
> --- a/libstdc++-v3/src/c++20/tzdb.cc
> +++ b/libstdc++-v3/src/c++20/tzdb.cc
> @@ -643,6 +643,7 @@ namespace std::chrono
>  void unlock() { infos_mutex.unlock(); }
> };
>
> +#if __cpp_lib_atomic_lock_free_type_aliases
>   #if defined __GTHREADS && __cpp_lib_atomic_wait
>   // Atomic count of unexpanded ZoneInfo objects in the infos vector.
>   // Concurrent access is allowed when all objects have been expanded.
> @@ -704,6 +705,7 @@ namespace std::chrono
>   #endif // __GTHREADS && __cpp_lib_atomic_wait
>
>   RulesCounter rules_counter;
> +#endif // __cpp_lib_atomic_lock_free_type_aliases
>   #else // TZDB_DISABLED
>   _Impl(weak_ptr) { }
>   struct {
Now we get:

libtool: compile:  /home/dave/gnu/gcc/objdir64/./gcc/xgcc -shared-libgcc
-B/home/dave/gnu/gcc/objdir64/./gcc -nostdinc++ 
-L/home/dave/gnu/gcc/objdir64/hppa64-hp-hpux11.11/libstdc++-v3/src
-L/home/dave/gnu/gcc/objdir64/hppa64-hp-hpux11.11/libstdc++-v3/src/.libs 
-L/home/dave/gnu/gcc/objdir64/hppa64-hp-hpux11.11/libstdc++-v3/libsupc++/.libs
-B/opt/gnu64/gcc/gcc-14/hppa64-hp-hpux11.11/bin/ 
-B/opt/gnu64/gcc/gcc-14/hppa64-hp-hpux11.11/lib/ -isystem
/opt/gnu64/gcc/gcc-14/hppa64-hp-hpux11.11/include -isystem 
/opt/gnu64/gcc/gcc-14/hppa64-hp-hpux11.11/sys-include -fno-checking
-I/home/dave/gnu/gcc/gcc/libstdc++-v3/../libgcc 
-I/home/dave/gnu/gcc/objdir64/hppa64-hp-hpux11.11/libstdc++-v3/include/hppa64-hp-hpux11.11
 
-I/home/dave/gnu/gcc/objdir64/hppa64-hp-hpux11.11/libstdc++-v3/include
-I/home/dave/gnu/gcc/gcc/libstdc++-v3/libsupc++ -std=gnu++20 
-D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra -Wwrite-strings
-Wcast-qual -Wabi=2 -fdiagnostics-show-location=once -ffunction-sections 
-fdata-sections -frandom-seed=tzdb.lo -fimplicit-templates -O2 -g -I. -c
../../../../../gcc/libstdc++-v3/src/c++20/tzdb.cc  -DPIC 
-D_GLIBCXX_SHARED -o tzdb.o
../../../../../gcc/libstdc++-v3/src/c++20/tzdb.cc: In member function
'std::chrono::sys_info 
std::chrono::time_zone::_M_get_sys_info(std::chrono::sys_seconds) const':
../../../../../gcc/libstdc++-v3/src/c++20/tzdb.cc:781:30: error: 'struct
std::chrono::time_zone::_Impl' has no member named 'rules_counter'; did 
you mean 'RulesCounter'?
   781 | lock_guard lock(_M_impl->rules_counter);
   |  ^
   |  RulesCounter
../../../../../gcc/libstdc++-v3/src/c++20/tzdb.cc:971:18: error: 'struct
std::chrono::time_zone::_Impl' has no member named 'rules_counter'; did 
you mean 'RulesCounter'?
   971 | _M_impl->rules_counter.decrement();
   |  ^
   |  RulesCounter
../../../../../gcc/libstdc++-v3/src/c++20/tzdb.cc: In function 'const
std::chrono::tzdb& std::chrono::reload_tzdb()':
../../../../../gcc/libstdc++-v3/src/c++20/tzdb.cc:1490:24: error: 'struct
std::chrono::time_zone::_Impl' has no member named 'rules_counter'; 
did you mean 'RulesCounter'?
  1490 |   impl.rules_counter.increment();
   |    ^
   |    RulesCounter
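
The errors above show that the member `rules_counter` is now declared only inside the new `#if`, while its uses at lines 781, 971 and 1490 remain unguarded. One common way to avoid that mismatch, sketched below, is to keep the member declared unconditionally and swap in a no-op type when the feature is missing. This is only an illustration of the pattern, not the change that went into libstdc++; the macro `DEMO_HAVE_COUNTER`, the struct `Impl` and the function `bump` are all hypothetical.

```cpp
#include <cassert>

// Hypothetical feature-test macro standing in for a real one.
#ifndef DEMO_HAVE_COUNTER
#define DEMO_HAVE_COUNTER 0
#endif

struct Impl {
#if DEMO_HAVE_COUNTER
  // Real counter, compiled only when the feature is available.
  struct RulesCounter {
    int pending = 0;
    void increment() { ++pending; }
    void decrement() { --pending; }
    int count() const { return pending; }
  };
#else
  // No-op stand-in that offers the same interface.
  struct RulesCounter {
    void increment() {}
    void decrement() {}
    int count() const { return 0; }
  };
#endif
  // Declared unconditionally, so unguarded uses elsewhere still compile.
  RulesCounter rules_counter;
};

// A caller that needs no #if of its own.
inline int bump(Impl& impl) {
  const int before = impl.rules_counter.count();
  impl.rules_counter.increment();
  assert(impl.rules_counter.count() >= before);
  return impl.rules_counter.count();
}
```

The alternative is to wrap every use in the same condition as the declaration, which keeps the struct smaller but spreads the preprocessor checks across the file.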

[Bug middle-end/114197] [14] middle-end: ICE in verify_dominators

2024-03-01 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114197

--- Comment #3 from Edwin Lu  ---
Patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647031.html
