[Bug target/113625] Interesting behavior with and without -mcpu=generic

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113625

--- Comment #2 from Andrew Pinski  ---
Just FYI this is how I configured GCC:
```
Configured with: ../configure --target=aarch64-linux-gnu
--prefix=/home/apinski/src/upstream-full-cross/install
--enable-languages=c,c++,fortran,go
--with-sysroot=/home/apinski/src/upstream-full-cross/install//sysroot
```

Nothing special even.

[Bug tree-optimization/113630] [11/12/13/14 Regression] -fno-strict-aliasing introduces out-of-bounds memory access

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113630

--- Comment #4 from Richard Biener  ---
(In reply to Andrew Pinski from comment #3)
> Note LLVM produces decent code here by only using one load:
> ```
> xor     eax, eax
> test    esi, esi
> sete    al
> mov     eax, dword ptr [rdi + 4*rax]
> ```
> 
> Maybe GCC could do the same ...

IIRC there's duplicate bugs about this - phiprop does kind-of the reverse.
The sink pass can now sink two exactly same stores but doesn't try sinking
a "compatible" store by introducing a PHI for the address.

  /* ??? We could handle differing SSA uses in the LHS by inserting
     PHIs for them.  */
  else if (! operand_equal_p (gimple_assign_lhs (first_store),
                              gimple_assign_lhs (def), 0)
           || (gimple_clobber_p (first_store)
               != gimple_clobber_p (def)))
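
As a rough source-level picture of what "introducing a PHI for the address" could mean, here is a hand-written C sketch. The function names are made up for illustration, and this is not how the sink pass represents things internally; it only shows the shape of the transform.

```c
/* Illustration only: plain C, not GCC's GIMPLE.  */

/* Before: two stores to different addresses, one per branch.  */
void store_branchy (int *p, int c, int v)
{
  if (c)
    p[0] = v;
  else
    p[1] = v;
}

/* After: the address is selected first (the role a PHI node would play),
   and a single store is sunk below the branch.  */
void store_sunk (int *p, int c, int v)
{
  int *addr = c ? &p[0] : &p[1];
  *addr = v;
}
```

Both functions write the same element for every value of `c`; the second form has only one store, which is what the current `operand_equal_p` check prevents the pass from producing.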

[Bug c++/113649] ICE: nested template class template argument deduction

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113649

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-01-29
           Keywords|                            |ice-on-valid-code
     Ever confirmed|0                           |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.

Related (not ICEing, but still incorrectly rejected) testcase:
```
template
struct params {
template 
struct return_type {
constexpr return_type(Return (*p1)()){}
};

template 
return_type(Return (*)()) -> 
return_type;


template
struct addr {};
};

void x();

template struct params<1>::addr<&x>;
```

[Bug tree-optimization/113630] [11/12/13/14 Regression] -fno-strict-aliasing introduces out-of-bounds memory access

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113630

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #5 from Richard Biener  ---
(In reply to Andrew Pinski from comment #2)
> Confirmed.
> 
> I really think what PRE does is correct here since we have an aliasing set
> of 0 for both. Now what is incorrect is hoist_adjacent_loads which cannot do
> either of any of the aliasing sets are 0 ...
> 
> 
> 
> I think even the function below is valid for non-strict aliasing:
> ```
> int __attribute__((noipa,noinline))
> f(struct S *p, int c, int d)
> {
>   int r;
>   if (c)
>     {
>       r = ((struct M*)p)->a;
>     }
>   else
>     r = ((struct M*)p)->b;
>   return r;
> }
> ```
> 
> That is hoist_adjacent_loads is broken for non-strict-aliasing in general
> and has been since 4.8.0 when it was added (r0-117275-g372a6eb8d991eb).

It looks like it relies on

  /* The zeroth operand of the two component references must be
     identical.  It is not sufficient to compare get_base_address of
     the two references, because this could allow for different
     elements of the same array in the two trees.  It is not safe to
     assume that the existence of one array element implies the
     existence of a different one.  */
  if (!operand_equal_p (TREE_OPERAND (ref1, 0), TREE_OPERAND (ref2, 0), 0))
    continue;

for the correctness test.  Note the MEM accesses are of size sizeof (struct M).

With -fno-strict-aliasing we're not wiping that detail so I think it _is_
a bug in PRE that it merges the two accesses.

I'll have a more detailed look.

[Bug c++/113644] [14 regression] ICE when building libcxxabi-16.0.6 since r14-6520

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113644

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1

[Bug c/113631] FAIL: gcc.dg/pr7356.c, fix still fails with #pragma

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113631

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-01-29
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
            Version|unknown                     |14.0

--- Comment #1 from Richard Biener  ---
:1:2: error: expected ';' before 'typedef'
1 | a

a.h:1:9: error: expected '=', ',', ';', 'asm' or '__attribute__' before
'#pragma'
1 | #pragma message "foo"
  | ^~~

as it's a different message it's likely using a different location to
highlight the issue.  In general it's difficult to tell whether pointing
to the first token sequence in the #included file or the last token
before the #include directive is better here.

Of course the pragma location should underline either #pragma or the whole
#pragma, not just 'message'.

Btw, same issue without the #include:

a
#pragma message "foo"

vs.

a
typedef int b;

I'm not sure it makes sense to special case the situation we've switched
files?

[Bug gcov-profile/113646] PGO hurts run-time of 538.imagick_r as much as 68% at -Ofast -march=native

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113646

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener  ---
Did you try with -fprofile-partial-training (is that default on?  it probably
should ...).  Can you please try training with the rate data instead of train
to rule out a mismatch?

[Bug tree-optimization/113622] [11/12/13/14 Regression] ICE with vectors in named registers

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #10 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #8)
> Guess for an rvalue (if even that crashes) we want to expand it to some
> permutation or whole vector shift which moves the indexed elements first and
> then extract it, for lvalue we need to insert it similarly.

If we can we should match this up with .VEC_SET / .VEC_EXTRACT, otherwise
we should go "simple" and spill.

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 7e2392ecd38..e94f292dd38 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -104,7 +104,8 @@ gimple_expand_vec_set_extract_expr (struct function *fun,
   machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
   machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));

-  if (auto_var_in_fn_p (view_op0, fun->decl)
+  if ((auto_var_in_fn_p (view_op0, fun->decl)
+  || DECL_HARD_REGISTER (view_op0))
  && !TREE_ADDRESSABLE (view_op0)
  && ((!is_extract && can_vec_set_var_idx_p (outermode))
  || (is_extract

ensures the former and fixes the ICE on x86_64 on trunk.  The comment#5
testcase then results in the following loop:

.L3:
        movslq  %eax, %rdx
        vmovaps %zmm2, -56(%rsp)
        vmovaps %zmm0, -120(%rsp)
        vmovss  -120(%rsp,%rdx,4), %xmm4
        vmovss  -56(%rsp,%rdx,4), %xmm3
        vcmpltss        %xmm4, %xmm3, %xmm3
        vpbroadcastd    %eax, %zmm4
        addl    $1, %eax
        vpcmpd  $0, %zmm7, %zmm4, %k1
        vblendvps       %xmm3, %xmm5, %xmm6, %xmm3
        vbroadcastss    %xmm3, %zmm1{%k1}
        cmpl    $8, %eax
        jne     .L3

this isn't optimal of course, for optimality we need vectorization.  But
we still need to avoid the ICEs since vectorization can be disabled.  That
said, I'm quite sure in code using hard registers people are not doing
such stupid things so I wonder how important it is to avoid "regressing"
the vectorization here.
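
The stack traffic in that loop is the "simple" spill strategy at work. A source-level sketch of the same idea, using the GNU vector extension (function names are illustrative only, and this says nothing about how the expander implements it):

```c
#include <string.h>

typedef float v4sf __attribute__ ((vector_size (16)));

/* Variable-index read written directly with the GNU vector extension.  */
float extract_direct (v4sf v, int i)
{
  return v[i];
}

/* The "simple" strategy spelled out by hand: spill the vector to memory,
   then do an ordinary indexed load from the spill slot.  */
float extract_spill (v4sf v, int i)
{
  float slot[4];
  memcpy (slot, &v, sizeof slot);
  return slot[i];
}
```

Both functions return the same element for any in-range index; the second just makes the store-then-load explicit.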

[Bug target/113648] Cross compiler cannot find cross binutils on macOS

2024-01-29 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113648

--- Comment #3 from Andreas Schwab  ---
The preferred location is
/Volumes/Toolchain/openwrt/lib/gcc/aarch64-linux-gnu/13.2.0/../../../../aarch64-linux-gnu/bin/aarch64-linux-gnu/13.2.0/ld
(known as gcc_tooldir in the makefile)

[Bug target/113648] Cross compiler cannot find cross binutils on macOS

2024-01-29 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113648

--- Comment #4 from Andreas Schwab  ---
Correction: the preferred location is
/Volumes/Toolchain/openwrt/lib/gcc/aarch64-linux-gnu/13.2.0/../../../../aarch64-linux-gnu/bin/ld

[Bug c++/113624] FAIL: g++.dg/ext/dllimport4.C, ICE on windows targets

2024-01-29 Thread nightstrike at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113624

nightstrike  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |11.3.0, 12.2.0, 13.0, 14.0

--- Comment #1 from nightstrike  ---
To be clear, the ICE happens on 11, 12, 13, and 14.  I don't have other
versions handy to test.

[Bug tree-optimization/113622] [11/12/13/14 Regression] ICE with vectors in named registers

2024-01-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #11 from Jakub Jelinek  ---
I think it is most important we don't ICE and generate correct code.  I doubt
this is used too much in real-world code, otherwise it would have been reported
years ago, so how efficient it will be is less important.

[Bug target/113616] [14 Regression] ICE in process_uses_of_deleted_def, at rtl-ssa/changes.cc:252

2024-01-29 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113616

Alex Coplan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644167.html
           Keywords|                            |patch

--- Comment #4 from Alex Coplan  ---
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644167.html

[Bug tree-optimization/113622] [11/12/13/14 Regression] ICE with vectors in named registers

2024-01-29 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #12 from Xi Ruoyao  ---
(In reply to Jakub Jelinek from comment #11)
> I think it is most important we don't ICE and generate correct code.  I
> doubt this is used too much in real-world code, otherwise it would have been
> reported years ago, so how efficient it will be is less important.

Hmm, but for another test case (LoongArch):

typedef double __attribute__ ((vector_size (32))) vec;
register vec a asm("f25"), b asm("f26"), c asm("f27");

void
test (void)
{
  for (int i = 0; i < 4; i++)
    c[i] = __builtin_isless (a[i], b[i]) ? 0.1 : 0.2;
}

I'll have to write a loop (because __builtin_isless does not work on vectors). 
Or is there a vector built-in I'm missing?
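
One loop-free possibility, offered only as a sketch: the GNU vector extension allows `<` directly on vector operands, producing an integer mask vector that can drive a bitwise select. Note the assumption that plain `<` is acceptable; unlike `__builtin_isless` it may raise an invalid-operation exception for quiet NaN operands. The names `mask` and `select_less` are made up here.

```c
typedef double vec __attribute__ ((vector_size (32)));
typedef long long mask __attribute__ ((vector_size (32)));

/* c[i] = a[i] < b[i] ? 0.1 : 0.2 for all four lanes, without a loop.
   Vectors are passed by pointer so no vector calling convention is needed.  */
void select_less (const vec *a, const vec *b, vec *c)
{
  const vec yes = { 0.1, 0.1, 0.1, 0.1 };
  const vec no = { 0.2, 0.2, 0.2, 0.2 };
  /* All-ones in each lane where a[i] < b[i], zero elsewhere.  */
  mask m = (mask) (*a < *b);
  /* Bitwise select: take YES where the mask is set, NO elsewhere.  */
  *c = (vec) (((mask) yes & m) | ((mask) no & ~m));
}
```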

[Bug c++/113649] ICE: nested template class template argument deduction

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113649

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |12.3.0

--- Comment #2 from Andrew Pinski  ---
This was rejected before GCC 12.3.0 with an error message about the deduction
guide not being at the namespace level.

[Bug tree-optimization/113622] [11/12/13/14 Regression] ICE with vectors in named registers

2024-01-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #13 from Jakub Jelinek  ---
(In reply to Xi Ruoyao from comment #12)
> (In reply to Jakub Jelinek from comment #11)
> > I think it is most important we don't ICE and generate correct code.  I
> > doubt this is used too much in real-world code, otherwise it would have been
> > reported years ago, so how efficient it will be is less important.
> 
> Hmm, but for another test case (LoongArch):
> 
> typedef double __attribute__ ((vector_size (32))) vec;
> register vec a asm("f25"), b asm("f26"), c asm("f27");
> 
> void
> test (void)
> {
>   for (int i = 0; i < 4; i++)
> c[i] = __builtin_isless (a[i], b[i]) ? 0.1 : 0.2;
> }
> 
> I'll have to write a loop (because __builtin_isless does not work on
> vectors).  Or is there a vector built-in I'm missing?

Why are you doing that?
Normally tests would do
vec
test (vec a, vec b)
{
  vec c = {};
  for (int i = 0; i < 4; i++)
    c[i] = __builtin_isless (a[i], b[i]) ? 0.1 : 0.2;
  return c;
}
or something similar.

[Bug tree-optimization/113622] [11/12/13/14 Regression] ICE with vectors in named registers

2024-01-29 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #14 from Xi Ruoyao  ---
(In reply to Jakub Jelinek from comment #13)
> (In reply to Xi Ruoyao from comment #12)
> > (In reply to Jakub Jelinek from comment #11)
> > > I think it is most important we don't ICE and generate correct code.  I
> > > doubt this is used too much in real-world code, otherwise it would have been
> > > reported years ago, so how efficient it will be is less important.
> > 
> > Hmm, but for another test case (LoongArch):
> > 
> > typedef double __attribute__ ((vector_size (32))) vec;
> > register vec a asm("f25"), b asm("f26"), c asm("f27");
> > 
> > void
> > test (void)
> > {
> >   for (int i = 0; i < 4; i++)
> > c[i] = __builtin_isless (a[i], b[i]) ? 0.1 : 0.2;
> > }
> > 
> > I'll have to write a loop (because __builtin_isless does not work on
> > vectors).  Or is there a vector built-in I'm missing?
> 
> Why are you doing that?
> Normally tests would do
> vec
> test (vec a, vec b)
> {
>   vec c = {};
>   for (int i = 0; i < 4; i++)
> c[i] = __builtin_isless (a[i], b[i]) ? 0.1 : 0.2;
>   return c;
> }
> or something similar.

Because we are lacking a calling convention passing vectors in vector registers
(it will be added in the future but not before GCC 14 release), thus I cannot
test if the register operands are showing up in a correct order in the
generated asm.

[Bug middle-end/101195] ICE: in tree_to_uhwi, at tree.c:6324

2024-01-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101195

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
Created attachment 57249
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57249&action=edit
gcc14-pr101195.patch

Initially I thought we should emit some error message, like that the argument
must be a non-negative constant, but given the
  iwhich = EH_RETURN_DATA_REGNO (iwhich);
  if (iwhich == INVALID_REGNUM)
    return constm1_rtx;
a few lines later where it will on pretty much all architectures return
INVALID_REGNUM for all but 2-4 values and so silently expand to -1 at runtime
I think silently expanding the negative values to that as well is just fine.
There is no fundamental difference between -1, -42, 42 or 237 in this regard,
all of them most likely aren't eh regnos.
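
A toy model of that behaviour, purely for illustration: the mapping below pretends that only arguments 0 and 1 name EH data registers, which is an assumption and not any particular target's definition. All `toy_` names are hypothetical.

```c
#include <limits.h>

/* Hypothetical stand-ins; real targets define their own mapping.  */
#define TOY_INVALID_REGNUM UINT_MAX

static unsigned toy_eh_return_data_regno (unsigned long long n)
{
  /* Pretend only 0 and 1 are valid EH data register indices.  */
  return n < 2 ? (unsigned) n : TOY_INVALID_REGNUM;
}

/* Toy model of the expansion: anything that is not an EH data register,
   negative arguments included, quietly becomes -1.  */
int toy_expand_eh_return_data_regno (long long arg)
{
  if (arg < 0)
    return -1;
  unsigned regno = toy_eh_return_data_regno ((unsigned long long) arg);
  return regno == TOY_INVALID_REGNUM ? -1 : (int) regno;
}
```

In this model -1, -42, 42 and 237 all yield -1, matching the point that there is no fundamental difference between them.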

[Bug tree-optimization/110603] [14 Regression] GCC, ICE: internal compiler error: in verify_range, at value-range.cc:1104 since r14-255

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110603

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:b338fdbc2b74f25c07da263a1f5983421fac1a53

commit r14-8487-gb338fdbc2b74f25c07da263a1f5983421fac1a53
Author: Jakub Jelinek 
Date:   Mon Jan 29 10:20:32 2024 +0100

tree-ssa-strlen: Fix pdata->maxlen computation [PR110603]

On the following testcase we emit an invalid range of [2, 1] due to
UB in the source.  Older VRP code silently swapped the boundaries and
made [1, 2] range out of it, but newer code just ICEs on it.

The reason for pdata->minlen 2 is that we see a memcpy in this case
setting both elements of the array to non-zero value, so strlen (a)
can't be smaller than 2.  The reason for pdata->maxlen 1 is that in
char a[2] array without UB there can be at most 1 non-zero character
because there needs to be '\0' termination in the buffer too.

IMHO we shouldn't create invalid ranges like that and even creating
for that case a range [1, 2] looks wrong to me, so the following patch
just doesn't set maxlen in that case to the array size - 1, matching
what will really happen at runtime when triggering such UB (strlen will
be at least 2, perhaps more or will crash).
This is what the second hunk of the patch does.

The first hunk fixes a fortunately harmless thinko.
If the strlen pass knows the string length (i.e. get_string_length
function returns non-NULL), we take a different path, we get to this
only if all we know is that there are certain number of non-zero
characters but we don't know what it is followed with, whether further
non-zero characters or zero termination or either of that.
If we know exactly how many non-zero characters it is, such as
char a[42];
...
  memcpy (a, "01234567890123456789", 20);
then we take an earlier if for the INTEGER_CST case and set correctly
just pdata->minlen to 20 in that case, but if we have something like
  int len;
  ...
  if (len < 15 || len > 32) return;
  memcpy (a, "0123456789012345678901234567890123456789", len);
then we have [15, 32] range for the nonzero_chars and we set pdata->minlen
correctly to 15, but incorrectly set also pdata->maxlen to 32.  That is
not what the above implies, it just means that in some cases we know that
there are at least 32 non-zero characters, followed by something we don't
know.  There is no guarantee that there is '\0' right after it, so it
means nothing.
The reason this is harmless, just confusing, is that the code a few lines
later fortunately overwrites this incorrect pdata->maxlen value with
something different (either array length - 1 or all ones etc.).

2024-01-29  Jakub Jelinek  

PR tree-optimization/110603
* tree-ssa-strlen.cc (get_range_strlen_dynamic): Remove incorrect
setting of pdata->maxlen to vr.upper_bound (which is unconditionally
overwritten anyway).  Avoid creating invalid range with minlen
larger than maxlen.  Formatting fix.

* gcc.c-torture/compile/pr110603.c: New test.
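
The point in the commit message about the [15, 32] range can be checked with a small standalone program. This is a sketch built around the snippet quoted above; the function name and the pre-filling of the buffer are additions made for the demonstration.

```c
#include <string.h>

size_t strlen_after_partial_copy (int len)
{
  char a[42];
  /* Fill the buffer with non-zero bytes and terminate it at the very end.  */
  memset (a, 'x', sizeof a - 1);
  a[sizeof a - 1] = '\0';

  if (len < 15 || len > 32)
    return 0;
  /* Overwrites only the first LEN bytes; no '\0' is written.  */
  memcpy (a, "0123456789012345678901234567890123456789", (size_t) len);
  return strlen (a);
}
```

For any `len` in [15, 32] the result here is 41, not something bounded by 32: the copy only guarantees a minimum number of non-zero characters, and what follows them decides the actual length.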

[Bug tree-optimization/113622] [11/12/13/14 Regression] ICE with vectors in named registers

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #15 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #11)
> I think it is most important we don't ICE and generate correct code.  I
> doubt this is used too much in real-world code, otherwise it would have been
> reported years ago, so how efficient it will be is less important.

We do spill on the read side already.  On the write side the ICE is because
of r0-71337-g1e188d1e130034.  Note we're spilling parts of bitpos to offset:

  /* Otherwise, split it up.  */
  if (offset)
    {
      /* Avoid returning a negative bitpos as this may wreak havoc later.  */
      if (!bit_offset.to_shwi (pbitpos) || maybe_lt (*pbitpos, 0))
        {
          *pbitpos = num_trailing_bits (bit_offset.force_shwi ());
          poly_offset_int bytes = bits_to_bytes_round_down (bit_offset);
          offset = size_binop (PLUS_EXPR, offset,
                               build_int_cst (sizetype, bytes.force_shwi ()));
        }

      *poffset = offset;

but it can also be large positive when the bit amount doesn't fit a HWI.
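
The arithmetic of that split, written out in plain C as a sketch: it assumes two's complement integers and a power-of-two `CHAR_BIT`, and `split_bit_offset` is a made-up name, not a GCC function.

```c
#include <limits.h>

/* Split a possibly negative bit offset into whole bytes (rounded toward
   negative infinity) plus a remainder in the range [0, CHAR_BIT).  */
void split_bit_offset (long long bit_offset, long long *bytes, int *bitpos)
{
  /* Trailing bits; never negative, even for a negative offset.  */
  *bitpos = (int) (bit_offset & (CHAR_BIT - 1));
  /* What is left is an exact multiple of CHAR_BIT.  */
  *bytes = (bit_offset - *bitpos) / CHAR_BIT;
}
```

For example, with 8-bit bytes a bit offset of -3 becomes -1 bytes plus 5 bits, so the bit position handed back stays non-negative.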

The flow of 'to' expansion is a bit awkward, but the following properly
spills in case of variable offset and non-MEM_P:

diff --git a/gcc/expr.cc b/gcc/expr.cc
index ee822c11dce..f54d0b1474e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6061,6 +6061,7 @@ expand_assignment (tree to, tree from, bool nontemporal)
to_rtx = adjust_address (to_rtx, BLKmode, 0);
}

+  rtx stemp = NULL_RTX, old_to_rtx = NULL_RTX;
   if (offset != 0)
{
  machine_mode address_mode;
@@ -6070,9 +6071,24 @@ expand_assignment (tree to, tree from, bool nontemporal)
{
  /* We can get constant negative offsets into arrays with broken
 user code.  Translate this to a trap instead of ICEing.  */
- gcc_assert (TREE_CODE (offset) == INTEGER_CST);
- expand_builtin_trap ();
- to_rtx = gen_rtx_MEM (BLKmode, const0_rtx);
+ if (TREE_CODE (offset) == INTEGER_CST)
+   {
+ expand_builtin_trap ();
+ to_rtx = gen_rtx_MEM (BLKmode, const0_rtx);
+   }
+ /* Else spill for variable offset to the destination.  */
+ else
+   {
+ gcc_assert (!TREE_CODE (from) == CALL_EXPR
+ && COMPLETE_TYPE_P (TREE_TYPE (from))
+ && (TREE_CODE (TYPE_SIZE (TREE_TYPE (from)))
+ != INTEGER_CST));
+ stemp = assign_stack_temp (GET_MODE (to_rtx),
+GET_MODE_SIZE (GET_MODE (to_rtx)));
+ emit_move_insn (stemp, to_rtx);
+ old_to_rtx = to_rtx;
+ to_rtx = stemp;
+   }
}

  offset_rtx = expand_expr (offset, NULL_RTX, VOIDmode, EXPAND_SUM);
@@ -6305,6 +6321,9 @@ expand_assignment (tree to, tree from, bool nontemporal)
  bitregion_start, bitregion_end,
  mode1, from, get_alias_set (to),
  nontemporal, reversep);
+ /* Move the temporary storage back to the non-MEM_P.  */
+ if (stemp)
+   emit_move_insn (old_to_rtx, stemp);
}

   if (result)

[Bug tree-optimization/110603] [14 Regression] GCC, ICE: internal compiler error: in verify_range, at value-range.cc:1104 since r14-255

2024-01-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110603

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #8 from Jakub Jelinek  ---
Fixed.

[Bug tree-optimization/113622] [11/12/13/14 Regression] ICE with vectors in named registers

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #16 from Richard Biener  ---
typedef double __attribute__ ((vector_size (16))) vec;

void
test (void)
{
  register vec a asm("xmm1"), b asm("xmm2"), c asm("xmm3");
  for (int i = 0; i < 2; i++)
    c[i] = a[i] < b[i] ? 0.1 : 0.2;
}

also ICEs with -O0 -msse.

[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812

2024-01-29 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615

--- Comment #3 from Andrew Stubbs  ---
I did see these, but I hadn't had time to chase them up.

The proposed patch is exactly the sort of solution I was expecting to find,
short term. Have you confirmed that it fixes all the cases?

A proper solution is to find out how to implement reductions with the RDNA ISA,
of course, but that's probably non-trivial (as in, I'm pretty sure it's more
than renaming a few mnemonics), and low-priority as GCC does a reasonably good 
job without them.

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Tobias Burnus :

https://gcc.gnu.org/g:ef5ccdbbc60c230a483898afbf0c053a9f8bb176

commit r14-8489-gef5ccdbbc60c230a483898afbf0c053a9f8bb176
Author: Tobias Burnus 
Date:   Mon Jan 29 11:10:33 2024 +0100

gcn/mkoffload.cc: Fix SRAM_ECC and XNACK handling [PR111966]

Some more '-g' fixes as the .mkoffload.dbg.o debug file has elf flags
which did not match those generated for the compilation, leading to linker
errors.  For .mkoffload.dbg.o, the elf flags are generated by mkoffload
itself - while for the other .o files, that's done by the compiler via
setting default and mainly via the ASM_SPEC.

This is a follow up to r14-8332-g13127dac106724 which fixed an issue
caused by the default arch.  In this patch, it is mainly for gfx1100
and gfx1030 which always failed.  It also affects gfx906 and possibly
gfx900 but only when using the -mxnack/-msram-ecc flags explicitly.

What happens on the compiler side is mainly determined by gcn-hsa.h's
and otherwise by some default setting. In particular for xnack and
sram_ecc, there is:

For gfx1100 and gfx1030, neither xnack nor sram_ecc is set (only
'+wavefrontsize64').

For fiji, gfx900, gfx906 and gfx908 there is always -mattr=-xnack and
for all but gfx908 also -msram-ecc=no - independent of what has been
passed to the compiler. However, on the elf flags, the result differs:
For fiji, due to the HSACOv3, it is always set to 0 via
copy_early_debug_info; for gfx900, gfx906 and gfx908, xnack is OFF.
For sram-ecc, it is 'unset' for gfx900, 'any' for gfx906 and for
gfx908 it is 'any' unless overridden.

For gfx90a, the -msram-ecc= and -mxnack= are passed on, or if not present,
...=any is passed on.  Note that this "any" is different from the argument
not being present at elf flag level:
For XNACK: unset/unsupported is 0, any = 0x100, off = 0x200, on = 0x300.
For SRAMECC: unset/unsupported is 0, any = 0x400, off = 0x800, on = 0xc00.

The obstack_ptr_grow changes are more to avoid confusion than having an
actual effect as they would otherwise be filtered out via the ASM_SPEC.

gcc/ChangeLog:

PR other/111966
* config/gcn/mkoffload.cc (SET_XNACK_UNSET, TEST_SRAM_ECC_UNSET):
New.
(SET_SRAM_ECC_UNSUPPORTED): Renamed to ...
(SET_SRAM_ECC_UNSET): ... this.
(copy_early_debug_info): Remove gfx900 special case, now handled as
part of the generic handling.
(main): Update SRAM_ECC and XNACK for the -march as done in
gcn-hsa.h.

Signed-off-by: Tobias Burnus 
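
The XNACK and SRAMECC values listed in the commit message each occupy a two-bit field, which a small decoder makes easy to see. This is a sketch only: the field values come from the message above, while the mask and function names are illustrative and not the ones used in mkoffload.cc.

```c
/* Two-bit fields implied by the listed values:
   XNACK uses bits 8-9, SRAMECC uses bits 10-11.  */
enum { XNACK_MASK = 0x300, SRAM_ECC_MASK = 0xc00 };

/* Field value 0..3 maps to unset/unsupported, any, off, on.  */
static const char *const setting_name[] = { "unset", "any", "off", "on" };

const char *xnack_setting (unsigned flags)
{
  return setting_name[(flags & XNACK_MASK) >> 8];
}

const char *sram_ecc_setting (unsigned flags)
{
  return setting_name[(flags & SRAM_ECC_MASK) >> 10];
}
```

With these helpers, a flag word of `0x100 | 0x800` decodes as XNACK "any" and SRAMECC "off", while 0 decodes as "unset" for both, which is the distinction between "any" and "not present" that the message stresses.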

[Bug c/113650] New: __builtin_nonlocal_goto ICEs when passed 0 as arguments

2024-01-29 Thread gabravier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113650

Bug ID: 113650
   Summary: __builtin_nonlocal_goto ICEs when passed 0 as
arguments
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

void f() {
__builtin_nonlocal_goto(0, 0);
}

This crashes GCC with the following error:

during RTL pass: expand
: In function 'f':
:2:9: internal compiler error: in int_mode_for_mode, at
stor-layout.cc:407
2 | __builtin_nonlocal_goto(0, 0);
  | ^
0x23382dc internal_error(char const*, ...)
???:0
0x96bd77 fancy_abort(char const*, int, char const*)
???:0
0xc569ae emit_move_insn_1(rtx_def*, rtx_def*)
???:0
0xc56d40 emit_move_insn(rtx_def*, rtx_def*)
???:0
0xc2c376 copy_to_reg(rtx_def*)
???:0
0xaf9911 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
???:0
0xc53d5c expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug target/113623] [14 Regression] ICE in aarch64_pair_mem_from_base since r14-6605

2024-01-29 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113623

--- Comment #3 from Alex Coplan  ---
I think ldp_fusion is exposing a latent issue here.  We trip the assert:

gcc_assert (aarch64_mem_pair_lanes_operand (mem, pair_mode));

on the RTL:

(rr) pr mem
(mem/f:V2x8QI (reg:DI 63 v31) [0 +0 S16 A64])

because v31 isn't a valid base register according to
aarch64_regno_ok_for_base_p.  This comes from the following RTL in sched1,
where we already have:

   30: x0:DI=[v31:DI]
   29: x1:DI=[v31:DI+0x8]

but again these mems look invalid as per aarch64_regno_ok_for_base_p.

[Bug target/113623] [14 Regression] ICE in aarch64_pair_mem_from_base since r14-6605

2024-01-29 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113623

Alex Coplan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #4 from Alex Coplan  ---
I think this is an early RA problem.  In asmcons (in function qux), we have:

   29: x1:DI=[r122:DI+0x8]
   30: x0:DI=[r122:DI]

and then in early_ra, we get:

   29: x1:DI=[v31:DI+0x8]
   30: x0:DI=[v31:DI]

CCing Richard S for an opinion.

[Bug c/113650] __builtin_nonlocal_goto ICEs when passed 0 as arguments

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113650

--- Comment #1 from Richard Biener  ---
I don't think these are supposed to be used by the user ...

[Bug target/113623] [14 Regression] ICE in aarch64_pair_mem_from_base since r14-6605

2024-01-29 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113623

Alex Coplan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
           Assignee|acoplan at gcc dot gnu.org  |unassigned at gcc dot gnu.org

[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812

2024-01-29 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615

--- Comment #4 from Tobias Burnus  ---
Patch:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644181.html

It fixes this issue, but I still see two other kinds of issues for gfx1100.

[Bug target/113623] [14 Regression] ICE in aarch64_pair_mem_from_base since r14-6605

2024-01-29 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113623

--- Comment #5 from Alex Coplan  ---
Indeed passing -mearly-ra=none makes the ICE go away as well.

[Bug c++/113651] New: The GCC optimizer performs poorly on a very simple code snippet.

2024-01-29 Thread cuking998244353 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113651

Bug ID: 113651
   Summary: The GCC optimizer performs poorly on a very simple
code snippet.
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cuking998244353 at gmail dot com
  Target Milestone: ---

I would like to report an issue with the GCC optimizer related to a specific
code snippet. The optimizer exhibits suboptimal behavior when handling a very
simple code segment. Below is the core code snippet:


#include 

constexpr uint32_t M = 0x04c11db7;

uint32_t calc_CRC1(const uint8_t *data, int len, uint32_t init_value = 0)
{
uint32_t r = init_value;
for (int i = 0; i < len; i++)
{
for (int j = 0; j < 8; j++)
{
// Issue in Code1:
r = (r << 1) ^ (data[i] >> (7 - j) & 1) ^ (r >> 31 ? M : 0);

// The following Code2 works as expected:
// uint32_t flag = r >> 31 ? M : 0;
// r = (r << 1) ^ (data[i] >> (7 - j) & 1);
// r ^= flag;
}
}
for (int i = 0; i < 32; i++)
{
r = (r << 1) ^ (r >> 31 ? M : 0);
}
return r;
}


When I replace the segment with Code2 instead of Code1, there is a noticeable
improvement in performance.

When I read the assembly code, I noticed that in the first approach, GCC
chooses to compute the result of (r << 1) ^ (data[i] >> (7 - j) & 1) and then
XOR this result with M, obtaining the desired value through cmov.

However, it is evident that there is no dependency among the three
sub-expressions here. Each sub-expression can be computed independently and
then XORed together. This optimization approach, on the contrary, results in a
lengthening of the dependency chain.

If the second code snippet is used, this issue does not arise. Such a simple
modification leads to significant differences in results, and when I switch to
compiling with Clang, there is no noticeable performance difference between the
two. I believe this may indicate the presence of a bug in the GCC optimizer.

GCC version:
gcc.exe (Rev2, Built by MSYS2 project) 13.2.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

system:
windows11 (it can still be reproduced on Linux.)

cmd:
-O2 -Wall -Wextra

You can view the issue directly through this link:
https://godbolt.org/z/PG6b7xveo
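
For anyone who wants to compare the two variants side by side, here they are as separate functions. This is a C rendering of the snippet in the report (the default argument is dropped and the function names are made up); it only demonstrates that Code1 and Code2 compute the same value, not the performance difference.

```c
#include <stdint.h>

static const uint32_t M = 0x04c11db7;

/* Final 32 shift steps shared by both variants.  */
static uint32_t crc_finish (uint32_t r)
{
  for (int i = 0; i < 32; i++)
    r = (r << 1) ^ (r >> 31 ? M : 0);
  return r;
}

/* Code1: everything in one expression.  */
uint32_t crc_code1 (const uint8_t *data, int len, uint32_t init_value)
{
  uint32_t r = init_value;
  for (int i = 0; i < len; i++)
    for (int j = 0; j < 8; j++)
      r = (r << 1) ^ (data[i] >> (7 - j) & 1) ^ (r >> 31 ? M : 0);
  return crc_finish (r);
}

/* Code2: the conditional term is computed first and XORed in last.  */
uint32_t crc_code2 (const uint8_t *data, int len, uint32_t init_value)
{
  uint32_t r = init_value;
  for (int i = 0; i < len; i++)
    for (int j = 0; j < 8; j++)
      {
        uint32_t flag = r >> 31 ? M : 0;
        r = (r << 1) ^ (data[i] >> (7 - j) & 1);
        r ^= flag;
      }
  return crc_finish (r);
}
```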

[Bug target/113618] [14 Regression] AArch64: memmove idiom regression

2024-01-29 Thread wilco at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113618

--- Comment #3 from Wilco  ---
(In reply to Richard Biener from comment #2)
> It might be good to recognize this pattern in strlenopt or a related pass.
> 
> A purely local transform would turn it into
> 
> memcpy (temp, a, 64);
> memmove (b, a, 64);
> 
> relying on DSE to eliminate the copy to temp if possible.  Not sure if
> that possibly would be a bad transform if copying to temp is required.

This would only be beneficial if you know memmove is inlined if memcpy is - on
almost all targets memmove becomes a library call, so the transformation would
be worse if memcpy can be inlined.
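
To make the two shapes concrete, here is a small sketch, assuming the idiom in
question is the usual copy through a 64-byte temporary. The function names are
made up for illustration, and the second function only spells out the local
transform quoted above; it is not what any pass emits today.

```c
#include <string.h>

/* The idiom: copy through a temporary so overlapping buffers are safe. */
void copy64_via_temp(char *b, const char *a)
{
    char temp[64];
    memcpy(temp, a, 64);
    memcpy(b, temp, 64);
}

/* The suggested local transform: the copy to temp becomes dead and is
   left for DSE to remove. */
void copy64_via_memmove(char *b, const char *a)
{
    char temp[64];
    memcpy(temp, a, 64);
    memmove(b, a, 64);
    (void) temp;
}

/* Returns 1 if both forms give the same bytes for overlapping regions. */
int copy64_forms_agree(void)
{
    char x[96], y[96];
    for (int i = 0; i < 96; i++)
        x[i] = y[i] = (char) i;
    copy64_via_temp(x + 16, x);      /* source and destination overlap */
    copy64_via_memmove(y + 16, y);
    return memcmp(x, y, sizeof x) == 0;
}
```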

> stp q30, q31, [sp]
> ldp q30, q31, [sp]
> 
> why is CSE not able to catch this?

The new RTL now has UNSPECs in them, so CSE doesn't know it is a plain
load/store:

STP: 

(insn 12 11 13 2 (set (mem/c:V2x16QI (reg:DI 102) [0 +0 S32 A128])
(unspec:V2x16QI [
(reg:V4SI 104)
(reg:V4SI 105)
] UNSPEC_STP)) "/app/example.c":5:5 -1
 (nil))

LDP:

(insn 16 15 17 2 (parallel [
(set (reg:V4SI 108)
(unspec:V4SI [
(mem/c:V2x16QI (reg:DI 107) [0 +0 S32 A128])
] UNSPEC_LDP_FST))
(set (reg:V4SI 109)
(unspec:V4SI [
(mem/c:V2x16QI (reg:DI 107) [0 +0 S32 A128])
] UNSPEC_LDP_SND))
]) "/app/example.c":6:5 -1
 (nil))

[Bug debug/113636] [14 Regression] internal compiler error: in dead_debug_global_find, at valtrack.cc:275

2024-01-29 Thread claudio.bantaloukas at arm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113636

Claudio Bantaloukas  changed:

   What|Removed |Added

 CC||claudio.bantaloukas at arm dot com

--- Comment #8 from Claudio Bantaloukas  ---
Created attachment 57250
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57250&action=edit
Further reduction

Further reduced test case

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-29 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607

--- Comment #16 from Robin Dapp  ---
Disabling vec_extract makes us operate on non-partial vectors, though, so there
are a lot of differences in codegen.  I'm going to have a look.

[Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #23 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:1a8261e047f7a2c2b0afb95716f7615cba718cd1

commit r14-8492-g1a8261e047f7a2c2b0afb95716f7615cba718cd1
Author: Richard Sandiford 
Date:   Mon Jan 29 12:33:08 2024 +

vect: Tighten vect_determine_precisions_from_range [PR113281]

This was another PR caused by the way that
vect_determine_precisions_from_range handles shifts.  We tried to
narrow 32768 >> x to a 16-bit shift based on range information for
the inputs and outputs, with vect_recog_over_widening_pattern
(after PR110828) adjusting the shift amount.  But this doesn't
work for the case where x is in [16, 31], since then 32-bit
32768 >> x is a well-defined zero, whereas no well-defined
16-bit 32768 >> y will produce 0.

We could perhaps generate x < 16 ? 32768 >> x : 0 instead,
but since vect_determine_precisions_from_range was never really
supposed to rely on fix-ups, it seems better to fix that instead.
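
    As a scalar illustration only (the vectorizer works on vector types,
    and the function names below are hypothetical), the guarded form
    mentioned above can be modelled like this:

```c
#include <stdint.h>

/* 32-bit shift: well defined for x in [0, 31], and 0 once x >= 16. */
uint32_t shift32(unsigned x)
{
    return UINT32_C(32768) >> x;
}

/* Scalar model of the guarded 16-bit form "x < 16 ? 32768 >> x : 0":
   shift amounts of 16 or more never reach the narrow shift. */
uint16_t shift16_guarded(unsigned x)
{
    return x < 16 ? (uint16_t) ((uint16_t) 32768 >> x) : 0;
}

/* Returns 1 if both forms agree for every shift amount in [0, 31]. */
int shifts_agree(void)
{
    for (unsigned x = 0; x < 32; x++)
        if (shift32(x) != shift16_guarded(x))
            return 0;
    return 1;
}
```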

The patch also makes the code more selective about which codes
can be narrowed based on input and output ranges.  This showed
that vect_truncatable_operation_p was missing cases for
BIT_NOT_EXPR (equivalent to BIT_XOR_EXPR of -1) and NEGATE_EXPR
(equivalent to BIT_NOT_EXPR followed by a PLUS_EXPR of 1).
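
    The two equivalences, and the fact that both operations commute with
    truncation, can be checked on scalars.  This is a sketch with a
    made-up function name, not code from the patch:

```c
#include <stdint.h>

/* Returns 1 if, for every 16-bit low half, the identities hold and
   truncating to 16 bits before or after the operation gives the same
   result. */
int not_and_negate_truncate(void)
{
    for (uint32_t i = 0; i <= 0xFFFF; i++) {
        uint32_t wide = i | UINT32_C(0xABCD0000);  /* arbitrary upper bits */
        uint16_t narrow = (uint16_t) wide;
        uint32_t neg = (uint32_t) 0 - wide;        /* modular negation */

        if (~wide != (wide ^ UINT32_MAX))          /* NOT == XOR with -1 */
            return 0;
        if (neg != ~wide + 1)                      /* NEGATE == NOT, then +1 */
            return 0;
        if ((uint16_t) ~wide != (uint16_t) ~narrow)
            return 0;
        if ((uint16_t) neg != (uint16_t) -narrow)
            return 0;
    }
    return 1;
}
```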

pr113281-1.c is the original testcase.  pr113281-[23].c failed
before the patch due to overly optimistic narrowing.  pr113281-[45].c
previously passed and are meant to protect against accidental
optimisation regressions.

gcc/
PR target/113281
* tree-vect-patterns.cc (vect_recog_over_widening_pattern): Remove
workaround for right shifts.
(vect_truncatable_operation_p): Handle NEGATE_EXPR and
BIT_NOT_EXPR.
(vect_determine_precisions_from_range): Be more selective about
which codes can be narrowed based on their input and output ranges.
For shifts, require at least one more bit of precision than the
maximum shift amount.

gcc/testsuite/
PR target/113281
* gcc.dg/vect/pr113281-1.c: New test.
* gcc.dg/vect/pr113281-2.c: Likewise.
* gcc.dg/vect/pr113281-3.c: Likewise.
* gcc.dg/vect/pr113281-4.c: Likewise.
* gcc.dg/vect/pr113281-5.c: Likewise.

[Bug tree-optimization/113281] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027

2024-01-29 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #24 from Richard Sandiford  ---
Fixed on trunk so far, but it's latent on branches.  I'll see what
the trunk fallout is like before asking about backports.

[Bug middle-end/113651] The GCC optimizer performs poorly on a very simple code snippet.

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113651

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-29
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed.   This is a missed phiopt (or operation sinking) of

  if (r.1_90 < 0)
goto ; [41.00%]
  else
goto ; [59.00%]

   [local count: 391324129]:
  _91 = _89 ^ 79764919;

   [local count: 954449104]:
  # prephitmp_92 = PHI <_91(6), _89(5)>

to something like

  if (r.1_90 < 0)
goto ; [41.00%]
  else
goto ; [59.00%]

   [local count: 391324129]:

   [local count: 954449104]:
  # prephitmp_91 = PHI <79764919(6), 0(5)>
  _92 = _89 ^ prephitmp_91;

on some archs the conditional constant might be generated by a
conditional add of 79764919 to zero.

Whether this is better suited for GIMPLE or RTL if-conversion remains to be
seen.

That splitting the expression helps is just luck.
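
At the source level the two GIMPLE shapes correspond roughly to the following
sketch. The function names are hypothetical, and the sign test in the dump is
written here as a test of the top bit:

```c
#include <stdint.h>

#define POLY UINT32_C(79764919)  /* the constant from the dump (0x04C11DB7) */

/* Shape before: the PHI selects between t ^ POLY and t. */
uint32_t select_result(uint32_t t, uint32_t r)
{
    return (r >> 31) ? (t ^ POLY) : t;
}

/* Shape after: the PHI selects the constant, the XOR is unconditional. */
uint32_t select_constant(uint32_t t, uint32_t r)
{
    uint32_t c = (r >> 31) ? POLY : 0;
    return t ^ c;
}

/* Returns 1 if both shapes agree on a few sample inputs. */
int phi_shapes_agree(void)
{
    static const uint32_t samples[] = {
        0, 1, UINT32_C(0x7FFFFFFF), UINT32_C(0x80000000), UINT32_MAX
    };
    for (unsigned i = 0; i < 5; i++)
        for (unsigned j = 0; j < 5; j++)
            if (select_result(samples[i], samples[j])
                != select_constant(samples[i], samples[j]))
                return 0;
    return 1;
}
```

In the second shape the XOR no longer sits behind the condition, so only the
choice of constant depends on r.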

[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Tobias Burnus :

https://gcc.gnu.org/g:7cc2262ec9a410dc56d1c1c6b950c922e14f621d

commit r14-8493-g7cc2262ec9a410dc56d1c1c6b950c922e14f621d
Author: Tobias Burnus 
Date:   Mon Jan 29 13:51:25 2024 +0100

gcn/gcn-valu.md: Disable fold_left_plus for TARGET_RDNA2_PLUS [PR113615]

gcc/ChangeLog:

PR target/113615
* config/gcn/gcn-valu.md (fold_left_plus_): Only
define for !TARGET_RDNA2_PLUS.

Signed-off-by: Tobias Burnus 

[Bug target/113652] New: ppc: unrecognized opcode: `lfiwzx'

2024-01-29 Thread csfore at posteo dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Bug ID: 113652
   Summary: ppc: unrecognized opcode: `lfiwzx'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: csfore at posteo dot net
  Target Milestone: ---

Created attachment 57251
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57251&action=edit
original preprocessed file

Steps to reproduce:
1. Attempt to build GCC 14 (latest snapshot attempted is Gentoo's 20240128)
2. Fails to assemble with:
/tmp/ccP8ev2f.s: Assembler messages:
/tmp/ccP8ev2f.s:85: Error: unrecognized opcode: `lfiwzx'

Originally reported downstream at: https://bugs.gentoo.org/921621

Command to reproduce:
gcc -mcpu=7450 -O1 -mvsx -c _kf_to_sd.i

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/powerpc-unknown-linux-gnu/14/lto-wrapper
Target: powerpc-unknown-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-14.0.0_pre20231119/work/gcc-14-20231119/configure
--host=powerpc-unknown-linux-gnu --build=powerpc-unknown-linux-gnu
--prefix=/usr --bindir=/usr/powerpc-unknown-linux-gnu/gcc-bin/14
--includedir=/usr/lib/gcc/powerpc-unknown-linux-gnu/14/include
--datadir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/14
--mandir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/14/man
--infodir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/14/info
--with-gxx-include-dir=/usr/lib/gcc/powerpc-unknown-linux-gnu/14/include/g++-v14
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/powerpc-unknown-linux-gnu/14/python
--enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-included-gettext
--disable-libunwind-exceptions --enable-checking=yes,extra
--with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo
14.0.0_pre20231119 p9' --with-gcc-major-version-only --enable-libstdcxx-time
--enable-lto --disable-libstdcxx-pch --enable-shared --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --disable-multilib
--disable-fixed-point --enable-targets=all --enable-libgomp --disable-libssp
--disable-libada --disable-cet --disable-systemtap
--disable-valgrind-annotations --disable-vtable-verify --disable-libvtv
--without-zstd --without-isl --enable-default-pie --enable-host-pie
--disable-host-bind-now --enable-default-ssp
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20231119 (experimental) (Gentoo 14.0.0_pre20231119 p9)

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Richard Biener  changed:

   What|Removed |Added

 Target||powerpc
   Target Milestone|--- |14.0

--- Comment #1 from Richard Biener  ---
What's the version of binutils you are using?

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread csfore at posteo dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #2 from Christopher Fore  ---
I've tried with both 2.40 and 2.41

[Bug target/112950] gcc.target/aarch64/sve/acle/general/dupq_5.c fails on aarch64_be-linux-gnu

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112950

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Prathamesh Kulkarni:

https://gcc.gnu.org/g:8a48723daca911f8cb16a459933772989173aa73

commit r14-8494-g8a48723daca911f8cb16a459933772989173aa73
Author: Prathamesh Kulkarni 
Date:   Mon Jan 29 18:42:44 2024 +0530

PR112950: Use #pragma GCC for including arm_sve.h.

gcc/testsuite/ChangeLog:
PR target/112950
* gcc.target/aarch64/sve/acle/general/dupq_5.c: Remove include directive
and instead use #pragma GCC for including arm_sve.h.

[Bug target/112950] gcc.target/aarch64/sve/acle/general/dupq_5.c fails on aarch64_be-linux-gnu

2024-01-29 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112950

prathamesh3492 at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Fixed.

[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534

--- Comment #24 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:291f75fa1bc6a23c6184bb99c726074b13f2f18e

commit r14-8495-g291f75fa1bc6a23c6184bb99c726074b13f2f18e
Author: H.J. Lu 
Date:   Sat Jan 27 05:48:45 2024 -0800

x86: Save callee-saved registers in noreturn functions for -O0/-Og

Save callee-saved registers in noreturn functions for -O0/-Og so that
debugger can restore callee-saved registers in caller's frame.

Also add the TREE_THIS_VOLATILE check to minimize noreturn attribute
lookup.

gcc/

PR target/38534
* config/i386/i386-options.cc (ix86_set_func_type): Save
callee-saved registers in noreturn functions for -O0/-Og.

gcc/testsuite/

PR target/38534
* gcc.target/i386/pr38534-5.c: New file.
* gcc.target/i386/pr38534-6.c: Likewise.

[Bug target/113616] [14 Regression] ICE in process_uses_of_deleted_def, at rtl-ssa/changes.cc:252

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113616

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:d41a1873f334cf29b9a595bb03c27bff2be17319

commit r14-8496-gd41a1873f334cf29b9a595bb03c27bff2be17319
Author: Alex Coplan 
Date:   Mon Jan 29 13:28:04 2024 +

aarch64: Ensure iterator validity when updating debug uses [PR113616]

The fix for PR113089 introduced range-based for loops over the
debug_insn_uses of an RTL-SSA set_info, but in the case that we reset a
debug insn, the use would get removed from the use list, and thus we
would end up using an invalidated iterator in the next iteration of the
loop.  In practice this means we end up terminating the loop
prematurely, and hence ICE as in PR113089 since there are debug uses
that we failed to fix up.

This patch fixes that by introducing a general mechanism to avoid this
sort of problem.  We introduce a safe_iterator to iterator-utils.h which
wraps an iterator, and also holds the end iterator value.  It then
pre-computes the next iterator value at all iterations, so it doesn't
matter if the original iterator got invalidated during the loop body, we
can still move safely to the next iteration.

We introduce an iterate_safely helper which effectively adapts a
container such as iterator_range into a container of safe_iterators over
the original iterator type.
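
    The underlying idea, fetching the successor before the loop body
    runs, can be sketched in plain C on an intrusive list.  This is only
    an analogue with made-up names; the patch itself is C++ and wraps
    iterators rather than list nodes.

```c
#include <stddef.h>

struct node {
    int value;
    struct node *prev, *next;
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_append(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlinking clears the node's links, so reading n->next afterwards would
   be a bug; this stands in for an invalidated iterator. */
static void list_unlink(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Remove negative values; the successor is saved before the body runs. */
int drop_negative(struct node *head)
{
    int removed = 0;
    struct node *cur = head->next;
    while (cur != head) {
        struct node *next = cur->next;  /* pre-computed next position */
        if (cur->value < 0) {
            list_unlink(cur);
            removed++;
        }
        cur = next;
    }
    return removed;
}

/* Hypothetical driver: returns how many nodes are left after the sweep,
   or -1 if the number of removed nodes is unexpected. */
int demo_remaining(void)
{
    struct node head, n[5];
    int values[5] = { 1, -2, 3, -4, 5 };
    int remaining = 0;

    list_init(&head);
    for (int i = 0; i < 5; i++) {
        n[i].value = values[i];
        list_append(&head, &n[i]);
    }
    if (drop_negative(&head) != 2)
        return -1;
    for (struct node *p = head.next; p != &head; p = p->next)
        remaining++;
    return remaining;
}
```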

We then use iterate_safely around all loops over debug_insn_uses () in
the aarch64 ldp/stp pass to fix PR113616.  While doing this, I
remembered that cleanup_tombstones () had the same problem.  I
previously worked around this locally by manually maintaining the next
nondebug insn, so this patch also refactors that loop to use the new
iterate_safely helper.

While doing that I noticed that a couple of cases in cleanup_tombstones
could be converted from using dyn_cast to as_a,
which should be safe because there are no clobbers of mem in RTL-SSA, so
all defs of memory should be set_infos.

gcc/ChangeLog:

PR target/113616
* config/aarch64/aarch64-ldp-fusion.cc
(fixup_debug_uses_trailing_add):
Use iterate_safely when iterating over debug uses.
(fixup_debug_uses): Likewise.
(ldp_bb_info::cleanup_tombstones): Use iterate_safely to iterate
over nondebug insns instead of manually maintaining the next insn.
* iterator-utils.h (class safe_iterator): New.
(iterate_safely): New.

gcc/testsuite/ChangeLog:

PR target/113616
* gcc.c-torture/compile/pr113616.c: New test.

[Bug tree-optimization/113622] [11/12/13/14 Regression] ICE with vectors in named registers

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #17 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:96bc048d78f804bac0fa7b2ca3b6dd3a04c68217

commit r14-8497-g96bc048d78f804bac0fa7b2ca3b6dd3a04c68217
Author: Richard Biener 
Date:   Mon Jan 29 09:47:31 2024 +0100

middle-end/113622 - allow .VEC_SET and .VEC_EXTRACT for global hard regs

The following expands .VEC_SET and .VEC_EXTRACT instruction selection
to global hard registers, not only automatic variables (possibly)
promoted to registers.  This can avoid some ICEs later and create
better code.

PR middle-end/113622
* gimple-isel.cc (gimple_expand_vec_set_extract_expr):
Also allow DECL_HARD_REGISTER variables.

* gcc.target/i386/pr113622-1.c: New testcase.

[Bug tree-optimization/113622] [11/12/13/14 Regression] ICE with vectors in named registers

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #18 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:0f7945417f913c85bd556904c0c4e7bf77793488

commit r14-8498-g0f7945417f913c85bd556904c0c4e7bf77793488
Author: Richard Biener 
Date:   Mon Jan 29 10:24:39 2024 +0100

middle-end/113622 - handle store with variable index to register

The following implements storing to a non-MEM_P with a variable
offset.  We usually avoid this by forcing expansion to memory but
this doesn't work for hard register variables.  The solution is
to spill and operate on the stack.

PR middle-end/113622
* expr.cc (expand_assignment): Spill hard registers if
we index them with a variable offset.

* gcc.target/i386/pr113622-2.c: New testcase.
* gcc.target/i386/pr113622-3.c: Likewise.

[Bug target/113616] [14 Regression] ICE in process_uses_of_deleted_def, at rtl-ssa/changes.cc:252

2024-01-29 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113616

Alex Coplan  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Alex Coplan  ---
Should be fixed, thanks for the report.

[Bug tree-optimization/113622] [11/12/13 Regression] ICE with vectors in named registers

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

Richard Biener  changed:

   What|Removed |Added

  Known to work||14.0
Summary|[11/12/13/14 Regression]|[11/12/13 Regression] ICE
   |ICE with vectors in named   |with vectors in named
   |registers   |registers

--- Comment #19 from Richard Biener  ---
Should be fixed on trunk, not sure to what extent backporting is suitable.

[Bug debug/113636] [14 Regression] internal compiler error: in dead_debug_global_find, at valtrack.cc:275

2024-01-29 Thread claudio.bantaloukas at arm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113636

--- Comment #9 from Claudio Bantaloukas  ---
git bisect "gccgo -g -O3 -mtune=thunderxt88 -c -o __case.o case.go" starting
with $(git merge-base origin/releases/gcc-13 origin/trunk) as a first good
commit points at [9f0f7d802482a8958d6cdc72f1fe0c8549db2182] aarch64: Add an
early RA for strided registers

[Bug rtl-optimization/113597] [14 Regression] aarch64: Significant code quality regression since r14-8346-ga98d5130a6dcff

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113597

Richard Biener  changed:

   What|Removed |Added

  Attachment #57214|0   |1
is obsolete||

--- Comment #13 from Richard Biener  ---
Created attachment 57252
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57252&action=edit
prototype fix

Note when I extended the patch to also cover a PARM_DECL base to extend
coverage I see

FAIL: gcc.dg/torture/pr70421.c   -O1  execution test
FAIL: gcc.dg/torture/pr70421.c   -O2  execution test
FAIL: gcc.dg/torture/pr70421.c   -O3 -g  execution test
FAIL: gcc.dg/torture/pr70421.c   -Os  execution test
FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test

on x86_64.  It seems that arg_base_value isn't the correct thing to use;
perhaps it should have been unique_base_value (UNIQUE_BASE_VALUE_ARGP)?
I'm not sure whether all the different unique base values mean we'll not
be able to derive exactly those classes from MEM_EXPRs.

[Bug debug/113636] [14 Regression] internal compiler error: in dead_debug_global_find, at valtrack.cc:275 since r14-6290-g9f0f7d802482a8

2024-01-29 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113636

Richard Sandiford  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |rsandifo at gcc dot gnu.org
   Last reconfirmed||2024-01-29

--- Comment #10 from Richard Sandiford  ---
Mine.

[Bug c/113653] New: Failure to diagnose use of (non-constant-expr) const objects in static initializers

2024-01-29 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113653

Bug ID: 113653
   Summary: Failure to diagnose use of (non-constant-expr) const
objects in static initializers
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bugdal at aerifal dot cx
  Target Milestone: ---

The following is a constraint violation:

int foo()
{
static const int x = 1;
static const int y = x; // not a constant expression
return y;
}

However, gcc does not diagnose it as such, even with -Wall -Wextra.

This appears to have been a regression somewhere between the gcc 4 era and now.

I'm not sure what component this should be assigned to. I chose "c" because
it's C-specific that this is not a constant expression; it would be in C++.
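
For contrast, here is a minimal sketch of a form that is a constant expression
in C: an enumeration constant in place of the const object. The name X is only
for illustration, and this says nothing about how the missing diagnostic should
be implemented.

```c
/* An enumeration constant is an integer constant expression in C. */
enum { X = 1 };

int foo(void)
{
    static const int y = X;  /* valid: X is a constant expression */
    return y;
}
```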

[Bug c/113653] Failure to diagnose use of (non-constant-expr) const objects in static initializers

2024-01-29 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113653

--- Comment #1 from Rich Felker  ---
FWIW -pedantic also does not help.

[Bug c++/113644] [14 regression] ICE when building libcxxabi-16.0.6 since r14-6520

2024-01-29 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113644

Patrick Palka  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #5 from Patrick Palka  ---
non-diagnostic ICE-on-invalid testcase:

template struct A { };

struct B { static const int value = 42; };

template void f(A<42>);
template void f(A);

int main() {
  f(A<42>{});
}

[Bug target/113623] [14 Regression] ICE in aarch64_pair_mem_from_base since r14-6605

2024-01-29 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113623

Richard Sandiford  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rsandifo at gcc dot gnu.org

--- Comment #6 from Richard Sandiford  ---
Mine.

[Bug debug/113562] [14 Regression] FAIL: gcc.dg/guality/pr54796.c

2024-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113562

--- Comment #3 from Richard Biener  ---
Just to put it somewhere I ran dwlocstat on cc1plus before/after the offending
change and it looks almost the same.  We go from

cov%samples cumul
0..10   1280217/38% 1280217/38%
11..20  55668/1%1335885/40%
21..30  68004/2%1403889/42%
31..40  70774/2%1474663/44%
41..50  75554/2%1550217/46%
51..60  91816/2%1642033/49%
61..70  101139/3%   1743172/52%
71..80  135281/4%   1878453/56%
81..90  198470/5%   2076923/62%
91..100 1233822/37% 3310745/100%

to

cov%samples cumul
0..10   1280197/38% 1280197/38%
11..20  55669/1%1335866/40%
21..30  68014/2%1403880/42%
31..40  70773/2%1474653/44%
41..50  75542/2%1550195/46%
51..60  91800/2%1641995/49%
61..70  101133/3%   1743128/52%
71..80  135259/4%   1878387/56%
81..90  198496/5%   2076883/62%
91..100 1233844/37% 3310727/100%

[Bug analyzer/113654] New: [14 Regression] -Wanalyzer-allocation-size false positive seen on Linux kernel's drivers/gpu/drm/i915/display/intel_bios.c

2024-01-29 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113654

Bug ID: 113654
   Summary: [14 Regression] -Wanalyzer-allocation-size false
positive seen on Linux kernel's
drivers/gpu/drm/i915/display/intel_bios.c
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: dmalcolm at gcc dot gnu.org
Blocks: 106358
  Target Milestone: ---

Trunk: https://godbolt.org/z/Y7jYxxhe7
Doesn't seem to affect 13.2

/* Adapted from include/linux/math.h  */
#define __round_mask(x, y) ((__typeof__(x))((y)-1))
#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1)

/* Reduced from Linux kernel's drivers/gpu/drm/i915/display/intel_bios.c  */
typedef unsigned short u16;
typedef unsigned int u32;
typedef unsigned long __kernel_size_t;
typedef __kernel_size_t size_t;

extern __attribute__((__alloc_size__(1))) __attribute__((__malloc__))
void* kzalloc(size_t size);

typedef struct
{
  u32 reg;
} i915_reg_t;
struct intel_uncore;
struct intel_uncore_funcs
{
  u32 (*mmio_readl)(struct intel_uncore* uncore, i915_reg_t r);
};
struct intel_uncore
{
  void* regs;
  struct intel_uncore_funcs funcs;
};
static inline __attribute__((__gnu_inline__)) __attribute__((__unused__))
__attribute__((no_instrument_function)) u32
intel_uncore_read(struct intel_uncore* uncore, i915_reg_t reg)
{
  return uncore->funcs.mmio_readl(uncore, reg);
}
struct drm_i915_private
{
  struct intel_uncore uncore;
};
struct vbt_header*
spi_oprom_get_vbt(struct drm_i915_private* i915)
{
  u16 vbt_size;
  u32* vbt;
  vbt_size =
intel_uncore_read(&i915->uncore, ((const i915_reg_t){ .reg = (0x102040)
}));
  vbt_size &= 0x;
  vbt = kzalloc(round_up (vbt_size, 4));
  if (!vbt)
goto err_not_found;
  return (struct vbt_header*)vbt;
err_not_found:
  return ((void*)0);
}


: In function 'spi_oprom_get_vbt':
:46:9: warning: allocated buffer size is not a multiple of the
pointee's size [CWE-131] [-Wanalyzer-allocation-size]
   46 |   vbt = kzalloc(round_up (vbt_size, 4));
  | ^~~
  'spi_oprom_get_vbt': event 1
|
|   46 |   vbt = kzalloc(round_up (vbt_size, 4));
|  | ^~~
|  | |
|  | (1) allocated and assigned to 'u32 *' {aka 'unsigned int
*'} here; 'sizeof (u32 {aka unsigned int})' is '4'
|
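
As a sanity check on why this looks like a false positive, the following sketch
verifies exhaustively that the rounded size is a multiple of 4 for every 16-bit
input. It assumes the kernel's round_up definition (written out in full below)
and relies on the GNU __typeof__ extension; the checking function is
hypothetical.

```c
/* Assumed to match include/linux/math.h in the Linux kernel. */
#define __round_mask(x, y) ((__typeof__(x))((y)-1))
#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1)

/* Returns 1 if round_up(v, 4) is a multiple of 4, and not smaller than v,
   for every possible 16-bit v. */
int round_up_always_multiple_of_4(void)
{
    for (unsigned i = 0; i <= 0xFFFF; i++) {
        unsigned short v = (unsigned short) i;
        unsigned long size = round_up(v, 4);
        if (size % 4 != 0 || size < v)
            return 0;
    }
    return 1;
}
```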


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106358
[Bug 106358] [meta-bug] tracker bug for building the Linux kernel with
-fanalyzer

[Bug target/113655] New: Cross compiling to mips64-elf fails because "MIPS_EXPLICIT_RELOCS was not declared" after r14-8386-g58af788d1d0825

2024-01-29 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113655

Bug ID: 113655
   Summary: Cross compiling to mips64-elf fails because
"MIPS_EXPLICIT_RELOCS was not declared" after
r14-8386-g58af788d1d0825
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jamborm at gcc dot gnu.org
CC: syq at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-linux
Target: mips64-elf

Starting with r14-8386-g58af788d1d0825 (MIPS: Accept arguments for
-mexplicit-relocs), when I try to test that cross compilation from
x86_64-linux to target mips64-elf still works by configuring gcc with:

../src/configure --prefix=/home/mjambor/gcc/mine/inst --enable-languages=c,c++
--enable-checking=yes --disable-bootstrap --disable-multilib --enable-obsolete
--target=mips64-elf

and then building just the compiler with make -j64 all-host,

the compilation fails with:

options.cc:3474:3: error: ‘MIPS_EXPLICIT_RELOCS’ was not declared in this
scope; did you mean ‘MIPS_EXPLICIT_RELOCS_NONE’?
 3474 |   MIPS_EXPLICIT_RELOCS, /* mips_opt_explicit_relocs */
  |   ^~~~
  |   MIPS_EXPLICIT_RELOCS_NONE


Our buildbot reports failures when building a cross-compiler for
mips64el-st-linux-gnu, mips64octeon-linux, mipsisa64r2-linux,
mipsisa32r2-linux-gnu, mipsisa64r2-sde-elf, mipsisa32-elfoabi,
mipsisa64-elfoabi, mipsisa64r2el-elf, mipsisa64sr71k-elf,
mipsisa64sb1-elf, mips64-elf, mipsel-elf, mips64vr-elf,
mips64orion-elf, mips-rtems, mips-wrs-vxworks, mipstx39-elf and I
suspect the problem is the same or similar.

[Bug tree-optimization/113603] [12/13/14 Regression] ICE Segfault during GIMPLE pass: strlen at -O3 since r12-145

2024-01-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113603

Jakub Jelinek  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-29
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Jakub Jelinek  ---
Created attachment 57253
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57253&action=edit
gcc14-pr113603.patch

Untested fix.

[Bug gcov-profile/113646] PGO hurts run-time of 538.imagick_r as much as 68% at -Ofast -march=native

2024-01-29 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113646

--- Comment #2 from Jan Hubicka  ---
> Did you try with -fprofile-partial-training (is that default on?  it probably
> should ...).  Can you please try training with the rate data instead of train

It is not on by default - the problem of partial training is that it
mostly nullifies any code size benefits from profile-use and that is
relatively noticeable aspect of it in real-world situations (like
for GCC itself or Firefox the overall size of binary matters).

I need to work on this more, but now we have two-state optimize_size
predicates and with level 1 we can turn off those -Os optimizations that
make large tradeoffs of performance for size optimization.

Honza
> to rule out a mismatch?

[Bug target/113656] New: [x86] ICE in simplify_const_unary_operation, at simplify-rtx.cc:1954 with new -mavx10.1

2024-01-29 Thread mjires at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113656

Bug ID: 113656
   Summary: [x86] ICE in simplify_const_unary_operation, at
simplify-rtx.cc:1954 with new -mavx10.1
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mjires at suse dot cz
CC: haochen.jiang at intel dot com
  Target Milestone: ---
Target: x86_64-pc-linux-gnu

Compiling reduced testcase gcc.target/i386/avx512fp16-broadcast-2.c results in
ICE. Bisected to initial AVX10.1 patch r14-5607-g2f8f7ee2db82a3.

$ cat avx512fp16-broadcast-2.c
_Float16 test_256_src[8];
void test_256() {
  for (int i = 0; i < 8; i++)
test_256_src[i] = i - 8.4;
}


$ gcc avx512fp16-broadcast-2.c -frounding-math -O3 -mavx10.1
-funsafe-math-optimizations
during RTL pass: combine
avx512fp16-broadcast-2.c: In function ‘test_256’:
avx512fp16-broadcast-2.c:5:1: internal compiler error: in
simplify_const_unary_operation, at simplify-rtx.cc:1954
5 | }
  | ^
0x169ccd4 simplify_const_unary_operation(rtx_code, machine_mode, rtx_def*,
machine_mode)
/home/mjires/git/GCC/master/gcc/simplify-rtx.cc:1954
0x1698783 simplify_context::simplify_unary_operation(rtx_code, machine_mode,
rtx_def*, machine_mode)
/home/mjires/git/GCC/master/gcc/simplify-rtx.cc:889
0x1696b93 simplify_context::simplify_gen_unary(rtx_code, machine_mode,
rtx_def*, machine_mode)
/home/mjires/git/GCC/master/gcc/simplify-rtx.cc:360
0x169a0bb simplify_context::simplify_unary_operation_1(rtx_code, machine_mode,
rtx_def*)
/home/mjires/git/GCC/master/gcc/simplify-rtx.cc:1304
0x16987aa simplify_context::simplify_unary_operation(rtx_code, machine_mode,
rtx_def*, machine_mode)
/home/mjires/git/GCC/master/gcc/simplify-rtx.cc:893
0x101ade9 simplify_unary_operation(rtx_code, machine_mode, rtx_def*,
machine_mode)
/home/mjires/git/GCC/master/gcc/rtl.h:3486
0x2e56f58 combine_simplify_rtx
/home/mjires/git/GCC/master/gcc/combine.cc:5690
0x2e56c13 subst
/home/mjires/git/GCC/master/gcc/combine.cc:5609
0x2e56946 subst
/home/mjires/git/GCC/master/gcc/combine.cc:5536
0x2e56946 subst
/home/mjires/git/GCC/master/gcc/combine.cc:5536
0x2e4f6f1 try_combine
/home/mjires/git/GCC/master/gcc/combine.cc:3302
0x2e49be5 combine_instructions
/home/mjires/git/GCC/master/gcc/combine.cc:1264
0x2e71cf2 rest_of_handle_combine
/home/mjires/git/GCC/master/gcc/combine.cc:15091
0x2e71dae execute
/home/mjires/git/GCC/master/gcc/combine.cc:15135
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/mjires/built/master/libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/mjires/git/GCC/master/configure
--prefix=/home/mjires/built/master --disable-bootstrap
--enable-languages=c,c++,fortran,lto --disable-multilib --disable-libsanitizer
--enable-checking : (reconfigured) /home/mjires/git/GCC/master/configure
--prefix=/home/mjires/built/master --disable-bootstrap
--enable-languages=c,c++,fortran,lto --disable-multilib --disable-libsanitizer
--enable-checking
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240129 (experimental) (GCC)

[Bug target/108933] [11/12/13/14 Regression] Missing rev16 detection

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Richard Earnshaw :

https://gcc.gnu.org/g:bad991a1c5960e90c4686a9362a1258ef29e195b

commit r14-8499-gbad991a1c5960e90c4686a9362a1258ef29e195b
Author: Matthieu Longo 
Date:   Mon Jan 29 15:54:35 2024 +

arm: Add pattern for bswap + rotate -> rev16 [Bug 108933]

The rev16 pattern was not recognised anymore as a change in the bswap
tree pass was introducing a new GIMPLE form, not recognized by the
assembly final transformation pass.

Also, fix the output patterns for arm_rev16si_alt[12] to correctly
handle the instructions being made conditional.

More details in the PR.

gcc/ChangeLog:

PR target/108933
* config/arm/arm.md (arm_rev16si2): Convert to define_insn.
Correct generated RTL.
(arm_rev16si2_alt1): Correctly handle conditional execution.
(arm_rev16si2_alt2): Likewise.

gcc/testsuite/ChangeLog:

PR target/108933
* gcc.target/arm/rev16.c: Moved to...
* gcc.target/arm/rev16_1.c: ...here.
* gcc.target/arm/rev16_2.c: New test to check that rev16 is
emitted.
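
For readers following along, here is a minimal sketch (not taken from the PR
or from rev16_1.c/rev16_2.c) of two source shapes that compute the same
rev16 result: the classic mask-and-shift form, and a byte swap followed by a
16-bit rotate, which is the "bswap + rotate" shape the commit title refers
to. The function names are hypothetical.

```c
#include <stdint.h>

/* Swap the two bytes inside each 16-bit half of x (mask-and-shift form). */
static uint32_t rev16_masks(uint32_t x)
{
    return ((x & 0xff00ff00u) >> 8) | ((x & 0x00ff00ffu) << 8);
}

/* Same result written as a full byte swap followed by a rotate by 16. */
static uint32_t rev16_bswap_rotate(uint32_t x)
{
    uint32_t swapped = __builtin_bswap32(x);   /* reverse all four bytes */
    return (swapped << 16) | (swapped >> 16);  /* exchange the halfwords */
}
```

Both functions map 0xAABBCCDD to 0xBBAADDCC; whether a given compiler
version emits a single rev16 for either form is exactly what the PR is about.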

[Bug c++/113638] [13/14 Regression] Array bounds of variable templates are not correctly deduced from initializers since GCC13 inside a decltype/sizeof

2024-01-29 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113638

Patrick Palka  changed:

   What|Removed |Added

   Keywords|needs-bisection |
 CC||ppalka at gcc dot gnu.org

--- Comment #3 from Patrick Palka  ---
Started with r13-2540-g4db3cb781c3553.  If we remove the constexpr then it
seems we never accepted the testcase, so this is only a regression for
constexpr variable templates, but a proper fix will likely cover the
non-constexpr case too.

[Bug target/108933] [11/12/13 Regression] Missing rev16 detection

2024-01-29 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933

Richard Earnshaw  changed:

   What|Removed |Added

Summary|[11/12/13/14 Regression]|[11/12/13 Regression]
   |Missing rev16 detection |Missing rev16 detection
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |Matthieu.Longo at arm dot com

--- Comment #6 from Richard Earnshaw  ---
Fixed on trunk so far.

[Bug target/113657] New: [14 Regression] ICE Segmentation fault since r14-1187-gd6b756447cd58b

2024-01-29 Thread mjires at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113657

Bug ID: 113657
   Summary: [14 Regression] ICE Segmentation fault since
r14-1187-gd6b756447cd58b
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mjires at suse dot cz
CC: aoliva at gcc dot gnu.org
  Target Milestone: ---

Compiling reduced testcase gcc.target/aarch64/pr113550.c results in ICE since
r14-1187-gd6b756447cd58b.

$ cat pr113550.c
#pragma GCC target "+ls64"
#pragma GCC aarch64 "arm_acle.h"
__arm_data512_t foo(__arm_data512_t* ptr) { return *ptr; }


$ aarch64-linux-gnu-gcc pr113550.c -mstrict-align
during RTL pass: split5
pr113550.c: In function ‘foo’:
pr113550.c:3:58: internal compiler error: Segmentation fault
3 | __arm_data512_t foo(__arm_data512_t* ptr) { return *ptr; }
  |  ^
0x178f9de crash_signal
/home/mjires/git/GCC/master/gcc/toplev.cc:317
0x16f7566 reg_overlap_mentioned_p(rtx_def const*, rtx_def const*)
/home/mjires/git/GCC/master/gcc/rtlanal.cc:1857
0x267687e gen_split_300(rtx_insn*, rtx_def**)
/home/mjires/git/GCC/master/gcc/config/aarch64/aarch64-simd.md:8232
0x2a7ffb2 split_71
/home/mjires/git/GCC/master/gcc/config/aarch64/aarch64-simd.md:8212
0x2a99edf split_131
/home/mjires/git/GCC/master/gcc/config/aarch64/aarch64.md:875
0x2a9dcf4 split_insns(rtx_def*, rtx_insn*)
/home/mjires/git/GCC/master/gcc/config/aarch64/aarch64-sve.md:10993
0x117c8db try_split(rtx_def*, rtx_insn*, int)
/home/mjires/git/GCC/master/gcc/emit-rtl.cc:3941
0x16a3a44 split_insn
/home/mjires/git/GCC/master/gcc/recog.cc:3405
0x16a3ea6 split_all_insns_noflow()
/home/mjires/git/GCC/master/gcc/recog.cc:3567
0x16a5ee4 execute
/home/mjires/git/GCC/master/gcc/recog.cc:4641
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


$ aarch64-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=aarch64-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/home/mjires/built/master/libexec/gcc/aarch64-linux-gnu/14.0.1/lto-wrapper
Target: aarch64-linux-gnu
Configured with: /home/mjires/git/GCC/master/configure
--prefix=/home/mjires/built/master --target=aarch64-linux-gnu
--disable-bootstrap --enable-languages=c,c++,fortran --disable-multilib
--disable-libsanitizer --enable-checking : (reconfigured)
/home/mjires/git/GCC/master/configure --prefix=/home/mjires/built/master
--target=aarch64-linux-gnu --disable-bootstrap --enable-languages=c,c++,fortran
--disable-multilib --disable-libsanitizer --enable-checking
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240129 (experimental) (GCC)

[Bug target/113655] Cross compiling to mips64-elf fails because "MIPS_EXPLICIT_RELOCS was not declared" after r14-8386-g58af788d1d0825

2024-01-29 Thread syq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113655

--- Comment #1 from YunQiang Su  ---
Thanks for your report. It's due to a typo.

I will fix it now.

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-29 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607

--- Comment #17 from Robin Dapp  ---
Grasping for straws by blaming qemu ;)

At some point we do the vector shift

vsll.vv v1,v2,v2,v0.t

but the mask v0 is all zeros:
gdb:
   b = {0 }

According to the mask-undisturbed policy set before
vsetvli zero,zero,e32,mf2,ta,mu

all elements should be unchanged.  I'm seeing an all-zeros result in v1,
though.
v1 is used as 'j', is zero and therefore 'q' is not incremented and we don't
assign c = d causing the wrong result.

Before the shift I see v2 in gdb as:
  w = {4294967295, 4294967295, 0, 0}
(That's also a bit dubious because we load 2 elements from 'g' of which only
one should be -1.  This doesn't change the end result, though.)

After the shift gdb shows v1 as:
   w = {0, 0, 0, 0},

when it should be w = {-1, -1, 0, 0}.

Does this make sense?
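
A scalar model of the expectation described above, as a rough sketch only:
it assumes the usual reading of the mask-undisturbed (mu) policy, where
destination elements whose mask bit is clear keep their previous value. The
function name and the one-byte-per-element mask layout are hypothetical
simplifications, and tail handling is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Models "vsll.vv vd, vs2, vs1, v0.t" for 32-bit elements under the
   mask-undisturbed policy: only elements with a set mask bit are written. */
static void masked_sll_mu(uint32_t *vd, const uint32_t *vs2,
                          const uint32_t *vs1, const uint8_t *mask,
                          size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        if (mask[i])
            vd[i] = vs2[i] << (vs1[i] & 31u);  /* shift amount uses low 5 bits */
        /* mask bit clear: vd[i] is left exactly as it was */
    }
}
```

With an all-zeros mask this model never writes vd, so a destination that
held {-1, -1, 0, 0} before the call still holds it afterwards.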

[Bug target/113655] Cross compiling to mips64-elf fails because "MIPS_EXPLICIT_RELOCS was not declared" after r14-8386-g58af788d1d0825

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113655

--- Comment #2 from GCC Commits  ---
The master branch has been updated by YunQiang Su :

https://gcc.gnu.org/g:8e84b4fad149b9b9544c7b1fc61a45cf6139176e

commit r14-8500-g8e84b4fad149b9b9544c7b1fc61a45cf6139176e
Author: YunQiang Su 
Date:   Tue Jan 30 00:26:28 2024 +0800

MIPS: Fix typo in gcc/configure.ac: gcc_cv_as_mips_explicit

gcc_cv_as_mips_explicit should be gcc_cv_as_mips_explicit_relocs.
This was introduced in commit
58af788d1d0825187def434c95cab35a690a31b0.

gcc
PR target/113655
* configure.ac: Fix typo gcc_cv_as_mips_explicit should be
gcc_cv_as_mips_explicit_relocs.
* configure: Regenerated.

[Bug target/113655] Cross compiling to mips64-elf fails because "MIPS_EXPLICIT_RELOCS was not declared" after r14-8386-g58af788d1d0825

2024-01-29 Thread syq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113655

YunQiang Su  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from YunQiang Su  ---
I think that this problem has been resolved.

If not, you can just reopen this report.

[Bug c/113653] Failure to diagnose use of (non-constant-expr) const objects in static initializers

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113653

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Andrew Pinski  ---
This was done on purpose, see PR 83222 (and PR 69960).

*** This bug has been marked as a duplicate of bug 83222 ***

[Bug c/83222] [8 regression] Inconsistent "initializer element is not constant" error

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83222

Andrew Pinski  changed:

   What|Removed |Added

 CC||bugdal at aerifal dot cx

--- Comment #6 from Andrew Pinski  ---
*** Bug 113653 has been marked as a duplicate of this bug. ***

[Bug c/113653] Failure to diagnose use of (non-constant-expr) const objects in static initializers

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113653

--- Comment #3 from Andrew Pinski  ---
Specifically https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69960#c24 .

[Bug c++/113658] New: GCC 14 has incomplete impl for declared feature "cxx_constexpr_string_builtins"

2024-01-29 Thread berrange at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113658

Bug ID: 113658
   Summary: GCC 14 has incomplete impl for declared feature
"cxx_constexpr_string_builtins"
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: berrange at redhat dot com
  Target Milestone: ---

In GCC 14 there was support for the Clang __has_feature() language extension
added by

  commit 06280a906cb3dc80cf5e07cf3335b758848d488d
  Author: Alex Coplan 
  Date:   Fri Mar 17 16:30:51 2023 +

c-family: Implement __has_feature and __has_extension [PR60512]

This patch implements clang's __has_feature and __has_extension in GCC.
Currently the patch aims to implement all documented features (and some
undocumented ones) following the documentation at
https://clang.llvm.org/docs/LanguageExtensions.html with the exception
of the legacy features for C++ type traits. 

One of the features declared as implemented is "cxx_constexpr_string_builtins"
which was documented by Clang as indicating support for built-ins for memchr,
memcmp, strchr, strcmp, strlen, strncmp,  wcschr, wcscmp, wcslen, wcsncmp,
wmemchr, wmemcmp, and one extra special case for memchr.

Except GCC 14 does not provide built-ins for all those functions, so GCC is
claiming support for a feature it cannot fully support. 

As a result, code that was written against this Clang feature extension now gets
enabled with GCC 14 and then fails to build.

$ cat >> demo.cpp <<EOF

#ifndef __has_feature
#warning "no __has_feature"
#define __has_feature(a) 0
#endif

char *foo(const char *s, int c, size_t n) {
#if __has_feature(cxx_constexpr_string_builtins)
return __builtin_char_memchr(s, c, n);
#else
#warning "no __builtin_char_memchr"
return NULL;
#endif
}
EOF
$ g++ b.cpp
b.cpp: In function ‘char* foo(const char*, int, size_t)’:
b.cpp:10:12: error: ‘__builtin_char_memchr’ was not declared in this scope; did
you mean ‘__builtin_memchr’?
   10 | return __builtin_char_memchr(s, c, n);
  |^
  |__builtin_memchr


gcc version 14.0.1 20240125 (Red Hat 14.0.1-0) (GCC)

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread erhard_f at mailbox dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #3 from Erhard F.  ---
(In reply to Christopher Fore from comment #0)
> Created attachment 57251 [details]
> original preprocessed file
> 
> Steps to reproduce:
> 1. Attempt to build GCC 14 (latest snapshot attempted is Gentoo's 20240128)
> 2. Fails to assemble with:
> /tmp/ccP8ev2f.s: Assembler messages:
> /tmp/ccP8ev2f.s:85: Error: unrecognized opcode: `lfiwzx'
> 
> Originally reported downstream at: https://bugs.gentoo.org/921621
> 
> Command to reproduce:
> gcc -mcpu=7450 -O1 -mvsx -c _kf_to_sd.i
I did the Gentoo downstream bugreport where I tried to build GCC 14 with GCC 11
+ binutils 2.41. Building was done on a 32 bit partition on my Talos II via
'setarch ppc32 emerge -1 gcc'.

As the PowerPC 7450 (aka PowerPC G4) is only capable of -maltivec but not -mvsx
I only passed "-O2 -mcpu=7450 -mtune=7450" as CFLAGS.

Still my GCC 14 build spills this "unrecognized opcode: `lfiwzx'" which is a
Power ISA 2.06 (POWER 7) opcode. GCC 11 builds fine on the same system.

[Bug c++/113658] GCC 14 has incomplete impl for declared feature "cxx_constexpr_string_builtins"

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113658

--- Comment #1 from Andrew Pinski  ---
Hmm, most other conditional uses of builtins use __has_builtin instead.
Interesting that one project just conditionalized it on
cxx_constexpr_string_builtins.
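
For illustration, a minimal sketch of the per-builtin check mentioned above.
It probes for the specific builtin with __has_builtin rather than relying on
a coarse feature name, and falls back to the library function otherwise. The
helper name find_byte is hypothetical, and the fallback #define mirrors the
pattern used in the report for compilers that lack __has_builtin.

```c
#include <stddef.h>
#include <string.h>

/* Compilers without __has_builtin simply take the library fallback. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Return a pointer to the first byte equal to c in s[0..n), or NULL. */
static const void *find_byte(const void *s, int c, size_t n)
{
#if __has_builtin(__builtin_memchr)
    return __builtin_memchr(s, c, n);
#else
    return memchr(s, c, n);
#endif
}
```

Checking each builtin by name avoids depending on every compiler that
advertises a feature also providing every builtin associated with it.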

[Bug c/113653] Failure to diagnose use of (non-constant-expr) const objects in static initializers

2024-01-29 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113653

Rich Felker  changed:

   What|Removed |Added

 Resolution|DUPLICATE   |---
 Status|RESOLVED|UNCONFIRMED

--- Comment #4 from Rich Felker  ---
This is NOT a duplicate of the marked bug - that bug was complaining that
invalid code didn't compile.

This bug is that GCC accepts invalid code, even with -pedantic, with no
diagnostic, making it impossible to catch invalid C. This bug bit me in the
wild - I accepted code that should have been rejected as a constraint
violation, and thereby made the project impossible to compile with other
compilers for a couple releases.

In standards-conforming and/or pedantic mode, the code should be rejected.

[Bug c++/113658] GCC 14 has incomplete impl for declared feature "cxx_constexpr_string_builtins"

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113658

--- Comment #2 from Andrew Pinski  ---
Newer libc++ does the "correct" thing even:
https://github.com/llvm/llvm-project/blob/430c1fd50d774dc30a9628bcf60ce243f74ff376/libcxx/include/__string/constexpr_c_functions.h#L121

[Bug c/113653] Failure to diagnose use of (non-constant-expr) const objects in static initializers

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113653

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #5 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #3)
> Specifically https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69960#c24 .

See that comment. It explains why this is not exactly invalid code.

[Bug tree-optimization/113659] New: [14 Regression] ICE Segmentation fault since r14-8355-g02e683894942da

2024-01-29 Thread mjires at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113659

Bug ID: 113659
   Summary: [14 Regression] ICE Segmentation fault since
r14-8355-g02e683894942da
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mjires at suse dot cz
CC: rguenther at suse dot de
  Target Milestone: ---

Compiling reduced testcase g++.dg/torture/predcom-1.C results in ICE since
r14-8355-g02e683894942da. Reintroduced PR113442.

$ cat predcom-1.C
struct Foo {
  int *ptr;
};
bool Baz(Foo first) {
  while (first.ptr)
if (*first.ptr++)
  return false;
}


$ g++ predcom-1.C -O3 -fno-tree-sra -mavx512bf16
predcom-1.C: In function ‘bool Baz(Foo)’:
predcom-1.C:8:1: warning: control reaches end of non-void function
[-Wreturn-type]
8 | }
  | ^
during GIMPLE pass: vect
predcom-1.C:4:6: internal compiler error: Segmentation fault
4 | bool Baz(Foo first) {
  |  ^~~
0x1b06530 crash_signal
/home/mjires/git/GCC/master/gcc/toplev.cc:317
0x13c5e6a gimple_phi_result(gphi const*)
/home/mjires/git/GCC/master/gcc/gimple.h:4608
0x1e6a0a4 slpeel_tree_duplicate_loop_to_edge_cfg(loop*, edge_def*, loop*,
edge_def*, edge_def*, edge_def**, bool, vec*)
/home/mjires/git/GCC/master/gcc/tree-vect-loop-manip.cc:1713
0x1e6f49e vect_do_peeling(_loop_vec_info*, tree_node*, tree_node*, tree_node**,
tree_node**, tree_node**, int, bool, bool, tree_node**)
/home/mjires/git/GCC/master/gcc/tree-vect-loop-manip.cc:3395
0x1e597a8 vect_transform_loop(_loop_vec_info*, gimple*)
/home/mjires/git/GCC/master/gcc/tree-vect-loop.cc:11914
0x1eb23de vect_transform_loops
/home/mjires/git/GCC/master/gcc/tree-vectorizer.cc:1006
0x1eb2b3e try_vectorize_loop_1
/home/mjires/git/GCC/master/gcc/tree-vectorizer.cc:1152
0x1eb2c77 try_vectorize_loop
/home/mjires/git/GCC/master/gcc/tree-vectorizer.cc:1182
0x1eb2f3e execute
/home/mjires/git/GCC/master/gcc/tree-vectorizer.cc:1298
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/home/mjires/built/master/libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/mjires/git/GCC/master/configure
--prefix=/home/mjires/built/master --disable-bootstrap
--enable-languages=c,c++,fortran,lto --disable-multilib --disable-libsanitizer
--enable-checking : (reconfigured) /home/mjires/git/GCC/master/configure
--prefix=/home/mjires/built/master --disable-bootstrap
--enable-languages=c,c++,fortran,lto --disable-multilib --disable-libsanitizer
--enable-checking
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240129 (experimental) (GCC)

[Bug target/111677] [12/13 Regression] darktable build on aarch64 fails with unrecognizable insn due to -fstack-protector changes

2024-01-29 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677

--- Comment #21 from Richard Sandiford  ---
(In reply to Alex Coplan from comment #13)
> The problem seems to be this code in aarch64_process_components:
> 
>   while (regno != last_regno)
> {
>   bool frame_related_p = aarch64_emit_cfi_for_reg_p (regno);
>   machine_mode mode = aarch64_reg_save_mode (regno);
> 
>   rtx reg = gen_rtx_REG (mode, regno);
>   poly_int64 offset = frame.reg_offset[regno];
>   if (frame_pointer_needed)
> offset -= frame.bytes_below_hard_fp;
> 
>   rtx addr = plus_constant (Pmode, ptr_reg, offset);
>   rtx mem = gen_frame_mem (mode, addr);
> 
> which emits a TFmode mem with offset 512, which is out of range for TFmode
> (so we later ICE with an unrecognisable insn).  Presumably this just needs
> tweaking to emit a new base anchor in the case of large offsets like this. 
> It looks like the code in aarch64_save_callee_saves already does this.
We shouldn't emit new anchor registers here, since unlike in the prologue,
we don't have any guarantee that certain registers are free.

aarch64_get_separate_components is supposed to vet shrink-wrappable
offsets, but in this case the offset looks valid, since:

str q22, [sp, #512]

is a valid instruction.  Perhaps the constraints are too narrow?
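
As a rough sketch of why #512 looks fine for a Q-register store, here are
the architectural immediate ranges for three addressing forms. The helper
names are hypothetical. The third check (the 64-bit LDP/STP pair range) is
included only as a guess at a narrower constraint that 512 would fail; the
thread does not say that this is the check GCC applies to TFmode.

```c
#include <stdbool.h>
#include <stdint.h>

/* STR/LDR Qn, [base, #imm]: unsigned 12-bit immediate scaled by 16. */
static bool str_q_uimm_ok(int64_t off)
{
    return off >= 0 && off <= 65520 && off % 16 == 0;
}

/* STP/LDP Qn, Qm, [base, #imm]: signed 7-bit immediate scaled by 16. */
static bool stp_q_imm_ok(int64_t off)
{
    return off >= -1024 && off <= 1008 && off % 16 == 0;
}

/* STP/LDP Xn, Xm, [base, #imm]: signed 7-bit immediate scaled by 8. */
static bool ldp_x_imm_ok(int64_t off)
{
    return off >= -512 && off <= 504 && off % 8 == 0;
}
```

An offset of 512 passes both Q-register checks but falls just outside the
X-register pair range, whose largest encodable offset is 504.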

[Bug c/113653] Failure to diagnose use of (non-constant-expr) const objects in static initializers

2024-01-29 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113653

--- Comment #6 from Rich Felker  ---
I'm aware of the allowance to accept "other forms". It's unfortunately
underspecified (does the implementation need to be specific in what forms?
document them per the normal rules for implementation-defined behavior? etc.)
but indeed it exists.

Regardless, at least -pedantic should diagnose this, because it's a big footgun
for writing code that is not valid C, that only works with certain compilers
that implement C++-like behavior in C. I would also be happy with a separate
warning option controlling it, named something like
-Wextended-constant-expressions.

[Bug rtl-optimization/113660] New: ICE: verify_flow_info failed: missing REG_EH_REGION note at the end of bb 2 with -fnon-call-exceptions -fharden-control-flow-redundancy on invalid

2024-01-29 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113660

Bug ID: 113660
   Summary: ICE: verify_flow_info failed: missing REG_EH_REGION
note at the end of bb 2 with -fnon-call-exceptions
-fharden-control-flow-redundancy on invalid
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-invalid-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57254
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57254&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -fnon-call-exceptions
-fharden-control-flow-redundancy testcase.c
testcase.c: In function 'foo':
testcase.c:5:3: error: impossible constraint in 'asm'
5 |   __asm__ volatile("" : "=a"(i));
  |   ^~~
testcase.c:6:1: error: missing REG_EH_REGION note at the end of bb 2
6 | }
  | ^
during RTL pass: into_cfglayout
testcase.c:6:1: internal compiler error: verify_flow_info failed
0xf904ae verify_flow_info()
/repo/gcc-trunk/gcc/cfghooks.cc:287
0x268381b checking_verify_flow_info()
/repo/gcc-trunk/gcc/cfghooks.h:214
0x268381b try_optimize_cfg
/repo/gcc-trunk/gcc/cfgcleanup.cc:2980
0x268381b cleanup_cfg(int)
/repo/gcc-trunk/gcc/cfgcleanup.cc:3143
0xfaaf8a execute
/repo/gcc-trunk/gcc/cfgrtl.cc:3713
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8487-20240129102032-gb338fdbc2b7-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-8487-20240129102032-gb338fdbc2b7-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240129 (experimental) (GCC)

[Bug tree-optimization/113661] New: [14 Regression] xalancbmk miscompiled on aarch64 since r14-7194-g6cb155a6cf3142

2024-01-29 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113661

Bug ID: 113661
   Summary: [14 Regression] xalancbmk miscompiled on aarch64 since
r14-7194-g6cb155a6cf3142
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

xalancbmk (both from SPEC 2006 and SPEC 2017) seems to be miscompiled on
aarch64 since r14-7194-g6cb155a6cf314232248a12bdd395ed4151ae5a28 i.e.

commit 6cb155a6cf314232248a12bdd395ed4151ae5a28 (refs/bisect/bad)
Author: Tamar Christina 
Date:   Fri Jan 12 15:24:49 2024 +

middle-end: make memory analysis for early break more deterministic
[PR113135]

I see:

*** Miscompare of ref-t5.out

with the options -Ofast -fomit-frame-pointer -mcpu=neoverse-v1 -flto=auto .

[Bug tree-optimization/113661] [14 Regression] xalancbmk miscompiled on aarch64 since r14-7194-g6cb155a6cf3142

2024-01-29 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113661

Tamar Christina  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Tamar Christina  ---
duplicate

*** This bug has been marked as a duplicate of bug 112644 ***

[Bug sanitizer/112644] [14 Regression] Some of the hwasan testcase fail after the recent merge

2024-01-29 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112644

Tamar Christina  changed:

   What|Removed |Added

 CC||acoplan at gcc dot gnu.org

--- Comment #8 from Tamar Christina  ---
*** Bug 113661 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/113661] [14 Regression] xalancbmk miscompiled on aarch64 since r14-7194-g6cb155a6cf3142

2024-01-29 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113661

--- Comment #2 from Tamar Christina  ---


*** This bug has been marked as a duplicate of bug 113576 ***

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-29 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576

Tamar Christina  changed:

   What|Removed |Added

 CC||acoplan at gcc dot gnu.org

--- Comment #23 from Tamar Christina  ---
*** Bug 113661 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/113588] [14 Regression] The vectorizer is introducing out-of-bounds memory access

2024-01-29 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113588

Tamar Christina  changed:

   What|Removed |Added

 CC||acoplan at gcc dot gnu.org

--- Comment #5 from Tamar Christina  ---
*** Bug 113661 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/113661] [14 Regression] xalancbmk miscompiled on aarch64 since r14-7194-g6cb155a6cf3142

2024-01-29 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113661

--- Comment #3 from Tamar Christina  ---
Argh, wrong one again. Anyway, this is a duplicate.

*** This bug has been marked as a duplicate of bug 113588 ***

[Bug c++/113662] New: [13/14 Regression] Wrong code for std::sort with fancy pointer

2024-01-29 Thread ostash at ostash dot kiev.ua via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113662

Bug ID: 113662
   Summary: [13/14 Regression] Wrong code for std::sort with fancy
pointer
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ostash at ostash dot kiev.ua
  Target Milestone: ---

Hello,

For the following snippet compiled with "-O3 -std=c++20"
---
#include 
#include 

#include 
#include 

struct Foo
{
public:
  uint32_t m1; 
  uint32_t m2;
  uint8_t m3;
};

bool operator<(const Foo& lhs, const Foo& rhs)
{
  return lhs.m1 < rhs.m1;
}

template <typename T>
class MyAllocator 
{
public:
  using value_type = T;
  using pointer = boost::interprocess::offset_ptr<T>;

  boost::interprocess::offset_ptr<T> allocate( std::size_t n ) {
  return boost::interprocess::offset_ptr<T>(a.allocate(n));
  }
  void deallocate(  boost::interprocess::offset_ptr<T> p, std::size_t n ) {
  a.deallocate(p.get(), n);
  }

  std::allocator<T> a;
};

int main()
{
boost::container::vector<Foo, MyAllocator<Foo>> vec;
vec.emplace_back().m1 = 4748;
vec.emplace_back().m1 = 4687;
vec.emplace_back().m1 = 4717;
vec.emplace_back().m1 = 4779;

std::cout << "before: " <<  vec.size() << '\n';
for (const auto& x : vec)
std::cout << std::to_string(x.m1) << '\n';

std::sort(vec.begin(), vec.end());

std::cout << "after: " <<  vec.size() << '\n';
for (const auto& x : vec)
std::cout << std::to_string(x.m1) << '\n';
}
---

we receive the following output:
---
before: 4
4748
4687
4717
4779
after: 4
4687
4717
4717
4779
---

I've managed to bisect this issue to the following commit:
429a7a88438cc80e7c58d9f63d44838089899b12 is the first bad commit
commit 429a7a88438cc80e7c58d9f63d44838089899b12
Author: Andrew MacLeod 
Date:   Tue Mar 28 12:16:34 2023 -0400
   Add recursive GORI recompuations with a depth limit.
   PR tree-optimization/109154
   gcc/
   * gimple-range-gori.cc (gori_compute::may_recompute_p): Add depth
limit.
   * gimple-range-gori.h (may_recompute_p): Add depth param.
   * params.opt (ranger-recompute-depth): New param.
   gcc/testsuite/
   * gcc.dg/Walloca-13.c: Remove bogus warning that is now fixed.
gcc/gimple-range-gori.cc  | 30 ++
gcc/gimple-range-gori.h   |  4 ++--
gcc/params.opt|  5 +
gcc/testsuite/gcc.dg/Walloca-13.c |  2 +-
4 files changed, 30 insertions(+), 11 deletions(-)

I tried different versions of Boost to ensure that the problem is not coming
from offset_ptr. It looks like it is possible to reproduce the issue with "-O2
-ftree-partial-pre".

Everything works fine with std::vector or std::allocator.

I'd be glad to perform other tests if needed.

[Bug c++/113544] [14 Regression] bogus incomplete type error with dependent data member in local class in generic lambda since r14-278

2024-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113544

--- Comment #2 from GCC Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:77d3fb39c62558838c0e905df717903b5393dfc9

commit r14-8502-g77d3fb39c62558838c0e905df717903b5393dfc9
Author: Jason Merrill 
Date:   Fri Jan 26 17:33:51 2024 -0500

c++: local class in generic lambda [PR113544]

My earlier commit r14-278-gd60cbbfaa9a3ad was a start toward better
handling of local classes in generic lambdas, but isn't actually useful by
itself and breaks this testcase, so let's revert it for now.

PR c++/113544

gcc/cp/ChangeLog:

* pt.cc (instantiate_class_template): Don't partially instantiate.
(tsubst_stmt): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/lambda-generic-nested3.C: New test.

[Bug libstdc++/97813] std::filesystem::equivalent returning incorrect results on MinGW due to symlinks

2024-01-29 Thread lennoxhoe at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97813

Lennox Ho  changed:

   What|Removed |Added

 CC||lennoxhoe at gmail dot com

--- Comment #2 from Lennox Ho  ---
> We don't currently support symlinks on Windows.
> There's a lot more than just equivalent that needs to change to support them.

Hi Jonathan, can you elaborate on this?
Is this simply a matter of nobody having bothered to implement this, or are
there blocking issues?

[Bug c++/113544] [14 Regression] bogus incomplete type error with dependent data member in local class in generic lambda since r14-278

2024-01-29 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113544

Jason Merrill  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org
 Status|NEW |RESOLVED

--- Comment #3 from Jason Merrill  ---
Fixed.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread csfore at posteo dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #4 from Christopher Fore  ---
Created attachment 57255
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57255&action=edit
minimized preprocessed file

Here's the minimized file (still errors)

[Bug c++/113662] [13/14 Regression] Wrong code for std::sort with fancy pointer

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113662

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||wrong-code

--- Comment #1 from Andrew Pinski  ---
One thing I noticed is that neither -fwrapv nor -fno-strict-aliasing changes
the code generation.

[Bug tree-optimization/113662] [13/14 Regression] Wrong code for std::sort with fancy pointer

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113662

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |13.3
  Component|c++ |tree-optimization

[Bug tree-optimization/113662] [13/14 Regression] Wrong code for std::sort with fancy pointer

2024-01-29 Thread ostash at ostash dot kiev.ua via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113662

--- Comment #2 from Viktor Ostashevskyi  ---
Adding --param=ranger-recompute-depth=1 or --param=ranger-recompute-depth=2
also fixes the issue. Higher values behave wrongly.

[Bug tree-optimization/113662] [13/14 Regression] Wrong code for std::sort with fancy pointer

2024-01-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113662

--- Comment #3 from Andrew Pinski  ---
Adding -fno-ivopts also fixes the issue ...
