[Bug tree-optimization/113664] False positive warnings with -fno-strict-overflow (-Warray-bounds, -Wstringop-overflow)

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113664

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-01-30
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #4 from Richard Biener  ---
Confirmed.  As usual it's jump-threading related where we isolate, in the
-Warray-bounds case

MEM[(char *)1B] = 48;

we inline 'f' and then, when s == dot == NULL your code dereferences both
NULL and NULL + 1.

So the diagnostic messages leave a lot to be desired but in the end they
point to a problem in your code, which is a missing guard against a NULL 's'.

The jump threading is different with -fwrapv-pointer, in particular without
it we just get the NULL dereference which we seem to ignore during
array-bound diagnostics.

We later isolate the paths as unreachable but that happens after the
diagnostic.
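
For illustration only (this is not the reporter's testcase), here is a
minimal sketch of the kind of guard the comment refers to.  The function
name prefix_zero and its behaviour are hypothetical; the point is that
returning early on a NULL 's' removes the path on which both NULL and
NULL + 1 would be dereferenced.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if the string starts with '.', shift it right by
   one and put '0' in front (".5" -> "0.5").  CAP is the buffer size.
   Returns 0 on success, -1 if nothing could be done.  */
static int
prefix_zero (char *s, size_t cap)
{
  if (s == NULL)                /* the guard: no NULL or NULL + 1 access */
    return -1;

  char *dot = strchr (s, '.');
  if (dot != s)                 /* no leading dot, nothing to do */
    return 0;

  size_t len = strlen (s);
  if (len + 2 > cap)            /* need room for '0' and the terminator */
    return -1;

  memmove (s + 1, s, len + 1);  /* includes the terminating NUL */
  s[0] = '0';
  return 0;
}
```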

[Bug tree-optimization/113467] [14 regression] libgcrypt-1.10.3 is miscompiled

2024-01-30 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113467

--- Comment #28 from Sam James  ---
(In reply to Sam James from comment #13)
> this also fixes mpfr + gmp tests, thank you!

just ftr: the mpfr/gmp issue might actually be PR113576

[Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

--- Comment #5 from Richard Biener  ---
Well, ICF figures out that the other parts of the partially inlined test()
are equal, and I think they are.  The

if (i >= S){
return false;
}

tests are inlined and eliminated (I think correctly so).  -fno-partial-inlining
also avoids the issue.

The issue is that ICF doesn't wipe (or compare) range info so we get after
inlining:

   [local count: 10737416]:
  goto ; [100.00%]

   [local count: 1063004409]:
  # RANGE [irange] long unsigned int [0, 591] NONZERO 0x3ff
  _5 = (long unsigned int) i_2;
  # RANGE [irange] unsigned int [0, 287] NONZERO 0x1ff
  _11 = (unsigned int) _5;

[Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
           Priority|P3                          |P2

--- Comment #6 from Richard Biener  ---
Honza - ICF seems to fix up points-to sets when merging variables, so there
should be a way to kill off flow-sensitive info inside prevailing bodies
as well.  But would that happen before inlining the body?  Can you work
on that?  I think comparing ranges would weaken ICF unnecessarily?
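
A rough model of the concern, not GCC code: when two bodies are unified,
any value range recorded on the surviving body has to hold for callers of
both.  The struct range type and both helpers below are hypothetical.
They sketch two conservative options, dropping the range entirely (what
the comment asks about) or widening it to cover both originals (my own
addition, for comparison only).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a recorded value range [lo, hi].  */
struct range
{
  unsigned long lo, hi;
  bool known;                   /* false means "no range info" */
};

/* Option 1: wipe the flow-sensitive info on the prevailing body.  */
static struct range
range_wipe (void)
{
  struct range r = { 0, 0, false };
  return r;
}

/* Option 2: keep a range that is valid for both merged bodies.  */
static struct range
range_merge (struct range a, struct range b)
{
  if (!a.known || !b.known)
    return range_wipe ();
  struct range r = { a.lo < b.lo ? a.lo : b.lo,
                     a.hi > b.hi ? a.hi : b.hi, true };
  return r;
}
```

With the two ranges from the dump in comment #5, [0, 287] and [0, 591],
range_merge would keep [0, 591].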

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576

--- Comment #27 from Richard Biener  ---
(In reply to Hongtao Liu from comment #25)
> (In reply to Tamar Christina from comment #24)
> > Just to avoid confusion, are you still working on this one Richi?
> 
> I'm working on a patch to add a target hook as #c18 mentioned.

Not sure a target hook was suggested - I think it was suggested that
do_compare_and_jump always masks excess bits for integer mode vector masks?

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576

--- Comment #28 from Hongtao Liu  ---
I saw that we already mask off integral modes for vector masks in
store_constructor:

/* Use sign-extension for uniform boolean vectors with
   integer modes and single-bit mask entries.
   Effectively "vec_duplicate" for bitmasks.  */
if (elt_size == 1
    && !TREE_SIDE_EFFECTS (exp)
    && VECTOR_BOOLEAN_TYPE_P (type)
    && SCALAR_INT_MODE_P (TYPE_MODE (type))
    && (elt = uniform_vector_p (exp))
    && !VECTOR_TYPE_P (TREE_TYPE (elt)))
  {
    rtx op0 = force_reg (TYPE_MODE (TREE_TYPE (elt)),
                         expand_normal (elt));
    rtx tmp = gen_reg_rtx (mode);
    convert_move (tmp, op0, 0);

    /* Ensure no excess bits are set.
       GCN needs this for nunits < 64.
       x86 needs this for nunits < 8.  */
    auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
    if (maybe_ne (GET_MODE_PRECISION (mode), nunits))
      tmp = expand_binop (mode, and_optab, tmp,
                          GEN_INT ((1 << nunits) - 1), target,
                          true, OPTAB_WIDEN);
    if (tmp != target)
      emit_move_insn (target, tmp);
    break;
  }
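
The effect of the final AND can be modelled on plain integers.  This is a
sketch, not GCC code: mask_excess_bits is a hypothetical name, and an
8-bit mask mode is assumed for the example.

```c
#include <stdint.h>

/* Keep only the low NUNITS bits of an 8-bit mask value, mirroring the
   (1 << nunits) - 1 operand in the snippet above.  Assumes
   0 < nunits <= 8.  */
static uint8_t
mask_excess_bits (uint8_t value, unsigned nunits)
{
  return value & (uint8_t) ((1u << nunits) - 1);
}
```

A uniform "true" element sign-extends to 0xff in an 8-bit mode; with four
elements the AND leaves 0x0f.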

[Bug tree-optimization/113622] [11/12/13 Regression] ICE with vectors in named registers

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #20 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:47b81161c98cf2ff5495d4aa6386cc3c87f9d27b

commit r14-8515-g47b81161c98cf2ff5495d4aa6386cc3c87f9d27b
Author: Jakub Jelinek 
Date:   Tue Jan 30 09:31:22 2024 +0100

testsuite: Fix up pr113622-{2,3}.c for i686-linux [PR113622]

The 2 new tests FAIL for me on i686-linux:
.../gcc/testsuite/gcc.target/i386/pr113622-2.c:5:14: error: data type of
'a' isn't suitable for a register
.../gcc/testsuite/gcc.target/i386/pr113622-2.c:5:29: error: data type of
'b' isn't suitable for a register
.../gcc/testsuite/gcc.target/i386/pr113622-2.c:5:44: error: data type of
'c' isn't suitable for a register
The problem is that the tests use vectors of double, something added
only in SSE2, while the testcases ask for just -msse which only provides
vectors of floats.

So, either it should be using floats instead of doubles, or we need
to add -msse2 to dg-options.

I've done the latter.

2024-01-30  Jakub Jelinek  

PR middle-end/113622
* gcc.target/i386/pr113622-2.c: Use -msse2 instead of -msse in
dg-options.
* gcc.target/i386/pr113622-3.c: Likewise.

[Bug tree-optimization/113659] [14 Regression] ICE Segmentation fault since r14-8355-g02e683894942da

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113659

--- Comment #3 from Richard Biener  ---
So the issue is similar to gcc.c-torture/execute/20150611-1.c, this time
the main exit ends in a path without a virtual use (__builtin_unreachable ()).
We can do the same as we do for the alternate exits here.

[Bug target/113657] [14 Regression] ICE Segmentation fault with -mstrict-align and __arm_data512_t since r14-1187-gd6b756447cd58b

2024-01-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113657

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
I have a patch.

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576

--- Comment #29 from Richard Biener  ---
(In reply to Hongtao Liu from comment #28)
> I saw we already maskoff integral modes for vector mask in store_constructor
> 
>   /* Use sign-extension for uniform boolean vectors with
>  integer modes and single-bit mask entries.
>  Effectively "vec_duplicate" for bitmasks.  */
>   if (elt_size == 1
>   && !TREE_SIDE_EFFECTS (exp)
>   && VECTOR_BOOLEAN_TYPE_P (type)
>   && SCALAR_INT_MODE_P (TYPE_MODE (type))
>   && (elt = uniform_vector_p (exp))
>   && !VECTOR_TYPE_P (TREE_TYPE (elt)))
> {
>   rtx op0 = force_reg (TYPE_MODE (TREE_TYPE (elt)),
>expand_normal (elt));
>   rtx tmp = gen_reg_rtx (mode);
>   convert_move (tmp, op0, 0);
> 
>   /* Ensure no excess bits are set.
>  GCN needs this for nunits < 64.
>  x86 needs this for nunits < 8.  */
>   auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
>   if (maybe_ne (GET_MODE_PRECISION (mode), nunits))
> tmp = expand_binop (mode, and_optab, tmp,
> GEN_INT ((1 << nunits) - 1), target,
> true, OPTAB_WIDEN);
>   if (tmp != target)
> emit_move_insn (target, tmp);
>   break;
> }

But that's just for CONSTRUCTORs, we got the VIEW_CONVERT_EXPR path for
VECTOR_CSTs.  But yeah, that _might_ argue we should perform the same
masking for VECTOR_CST expansion as well, instead of trying to fixup
in do_compare_and_jump?

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-30 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-01-30

--- Comment #13 from Kewen Lin  ---
One more finding: without an explicit cpu type but with -mvsx, gcc already
passes -mpower7 to the assembler, but if a cpu type is specified explicitly
it does not.  I think the reason it is not always passed is that only the
last cpu type wins, so passing it could unexpectedly override a higher cpu
type.

The candidate fixes seem to be:

diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128
index b09b5664af0..47b06d3c30d 100644
--- a/libgcc/config/rs6000/t-float128
+++ b/libgcc/config/rs6000/t-float128
@@ -74,7 +74,7 @@ fp128_includes = $(srcdir)/soft-fp/double.h \
   $(srcdir)/soft-fp/soft-fp.h

 # Build the emulator without ISA 3.0 hardware support.
-FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 \
+FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 -mcpu=power7 \
 			-mno-float128-hardware -mno-gnu-attribute \
 			-I$(srcdir)/soft-fp \
 			-I$(srcdir)/config/rs6000 \

Or

diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128
index b09b5664af0..bf4a5e6aaf0 100644
--- a/libgcc/config/rs6000/t-float128
+++ b/libgcc/config/rs6000/t-float128
@@ -74,7 +74,7 @@ fp128_includes = $(srcdir)/soft-fp/double.h \
   $(srcdir)/soft-fp/soft-fp.h

 # Build the emulator without ISA 3.0 hardware support.
-FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 \
+FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 -Wa,-many \
 			-mno-float128-hardware -mno-gnu-attribute \
 			-I$(srcdir)/soft-fp \
 			-I$(srcdir)/config/rs6000 \

This is because gcc considers -mvsx to imply -mcpu=power7 (appended to the
currently specified cpu type, if there is one), while the assembler does not.

[Bug c++/113658] GCC 14 has incomplete impl for declared feature "cxx_constexpr_string_builtins"

2024-01-30 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113658

Alex Coplan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-01-30
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org |acoplan at gcc dot gnu.org

--- Comment #5 from Alex Coplan  ---
(In reply to Jakub Jelinek from comment #3)
> Obviously using __has_builtin is much better than using the really badly
> designed __has_feature/__has_extension.
> That said, wcs{chr,cmp,len,ncmp} and wmem{chr,cmp} aren't builtins in gcc
> either, so I guess we shouldn't announce this "feature".

Mine, then.  I can prepare a patch to stop advertising the feature.
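
A minimal sketch of the __has_builtin style of check mentioned in the
quoted comment, as user code might write it.  The wrapper name my_strlen
is hypothetical, and the fallback definition of __has_builtin is only
there for compilers that do not provide it.

```c
#include <stddef.h>

/* Older compilers may not provide __has_builtin at all.  */
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

/* Hypothetical wrapper: use the builtin when the compiler reports it,
   otherwise fall back to a plain loop.  */
static size_t
my_strlen (const char *s)
{
#if __has_builtin (__builtin_strlen)
  return __builtin_strlen (s);
#else
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
#endif
}
```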

[Bug middle-end/101195] ICE: in tree_to_uhwi, at tree.c:6324

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101195

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:26c9b95b9f712ff1f813351b5d45371620085221

commit r14-8516-g26c9b95b9f712ff1f813351b5d45371620085221
Author: Jakub Jelinek 
Date:   Tue Jan 30 09:57:21 2024 +0100

except: Fix __builtin_eh_return_data_regno (-42) expansion [PR101195]

The expansion of this builtin emits an error if the argument is not
INTEGER_CST, otherwise uses tree_to_uhwi on the argument (which is declared
int) and then uses EH_RETURN_DATA_REGNO macro which on most targets returns
INVALID_REGNUM for all values but some small number (2 or 4); if it returns
INVALID_REGNUM, we silently expand to -1.

Now, I think the error for non-INTEGER_CST makes sense to catch when people
unintentionally don't call it with a constant (but, users shouldn't really
use this builtin anyway, it is for the unwinder only).  Initially I thought
about emitting an error for the negative values as well on which
tree_to_uhwi otherwise ICEs, but given that the function will silently
expand to -1 for INT_MAX - 1 or INT_MAX - 3 other values, I think treating
the negatives the same silently is fine too.

2024-01-30  Jakub Jelinek  

PR middle-end/101195
* except.cc (expand_builtin_eh_return_data_regno): If which doesn't
fit into unsigned HOST_WIDE_INT, return constm1_rtx.

* gcc.dg/pr101195.c: New test.
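
The behaviour described above can be modelled on plain integers.  This is
a sketch under stated assumptions, not the code in except.cc: the target
is assumed to have two EH data registers, and eh_return_data_regno and
expand_regno are hypothetical stand-ins for the macro and the expander.

```c
#define INVALID_REGNUM (~0u)

/* Hypothetical target: only data registers 0 and 1 exist.  */
static unsigned
eh_return_data_regno (unsigned long n)
{
  return n < 2 ? (unsigned) n : INVALID_REGNUM;
}

/* Model of the expansion: negative arguments and out-of-range
   arguments both quietly yield -1.  */
static long
expand_regno (int which)
{
  if (which < 0)                /* would not fit an unsigned value */
    return -1;
  unsigned regno = eh_return_data_regno ((unsigned long) which);
  return regno == INVALID_REGNUM ? -1 : (long) regno;
}
```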

[Bug tree-optimization/113603] [12/13/14 Regression] ICE Segfault during GIMPLE pass: strlen at -O3 since r12-145

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113603

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:d7250c1e02478586a0cd6d5cb67bf4d17249a7e7

commit r14-8517-gd7250c1e02478586a0cd6d5cb67bf4d17249a7e7
Author: Jakub Jelinek 
Date:   Tue Jan 30 09:58:05 2024 +0100

tree-ssa-strlen: Fix up handle_store [PR113603]

Since r10-2101-gb631bdb3c16e85f35d3 handle_store uses
count_nonzero_bytes{,_addr} which (more recently limited to statements
with the same vuse) can walk earlier statements feeding the rhs
of the store and call get_stridx on it.
Unlike most of the other functions where get_stridx is called first on
rhs and only later on lhs, handle_store calls get_stridx on the lhs before
the count_nonzero_bytes* call and does some si->nonzero_bytes comparison
on it.
Now, strinfo structures are refcounted and it is important not to screw
it up.
What happens on the following testcase is that we call get_strinfo on the
destination idx's base (g), which returns a strinfo at that moment
with refcount of 2, one copy referenced in bb 2 final strinfos, one in bb 3
(the vector of strinfos was unshared from the dominator there because some
other strinfo was added) and finally we process a store in bb 6.
Now, count_nonzero_bytes is called and that sees &g[1] in a PHI and
calls get_stridx on it, which in turn calls get_stridx_plus_constant
because &g + 1 address doesn't have stridx yet.  This creates a new
strinfo for it:
  si = new_strinfo (ptr, idx, build_int_cst (size_type_node,
                                             nonzero_chars),
                    basesi->full_string_p);
  set_strinfo (idx, si);
and the latter call, because it is the first one in bb 6 that needs it,
unshares the stridx_to_strinfo vector (so refcount of the g strinfo becomes
3).
Now, get_stridx_plus_constant needs to chain the new strinfo of &g[1] in
between the related strinfos, so after the g record.  Because the strinfo
is now shared between the current bb and 2 other bbs, it needs to
unshare_strinfo it (creating a new strinfo which can be modified as a copy
of the old one, decrementing refcount of the old shared one and setting
refcount of the new one to 1):
  if (strinfo *nextsi = get_strinfo (chainsi->next))
    {
      nextsi = unshare_strinfo (nextsi);
      si->next = nextsi->idx;
      nextsi->prev = idx;
    }
  chainsi = unshare_strinfo (chainsi);
  if (chainsi->first == 0)
    chainsi->first = chainsi->idx;
  chainsi->next = idx;
Now, the bug is that the caller of this a couple of frames above,
handle_store, holds on to a pointer to this g strinfo (but doesn't know
about the unsharing, so the pointer is to the old strinfo with refcount
of 2), and later needs to update it, so it does
  si = unshare_strinfo (si);
and modifies some fields in it.
This creates a new strinfo (with refcount of 1 which is stored into
the vector of the current bb) based on the old strinfo for g and
decrements refcount of the old one to 1.  So, now we are in an inconsistent
state, because the old strinfo for g is referenced in bb 2 and bb 3
vectors, but has just refcount of 1, and then we have one strinfo (the one
created by unshare_strinfo (chainsi) in get_stridx_plus_constant) which
has refcount of 1 but isn't referenced from anywhere anymore.
Later on when we free one of the bb 2 or bb 3 vectors (forgot which)
that decrements refcount from 1 to 0 and poisons the strinfo/returns it to
the pool, but then maybe_invalidate when looking at the other bb's pointer
to it ICEs.

The following patch fixes it by calling get_strinfo again; it is guaranteed
to return non-NULL, but could return an unshared copy instead of the
originally fetched shared one.

I believe we only need to do this refetching for the case where get_strinfo
is called on the lhs before get_stridx is called on other operands, because
we should be always modifying (apart from the chaining changes) the strinfo
for the destination of the statements, not other strinfos just consumed in
there.

2024-01-30  Jakub Jelinek  

PR tree-optimization/113603
* tree-ssa-strlen.cc (strlen_pass::handle_store): After
count_nonzero_bytes call refetch si using get_strinfo in case it
has been unshared in the meantime.

* gcc.c-torture/compile/pr113603.c: New test.
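
The refetch pattern can be shown on a toy copy-on-write structure.  This
is a much simplified sketch, not the strlen pass: struct info, slot,
unshare, callee_that_may_unshare and update_after_call are all
hypothetical names, and a single slot stands in for the current block's
vector of strinfos.

```c
#include <stdlib.h>

/* Hypothetical, much simplified stand-in for a refcounted strinfo.  */
struct info
{
  int refcount;
  int value;
};

/* The current block's slot; other blocks may share the same object.  */
static struct info *slot;

/* Copy-on-write: return an object that only the current slot owns.  */
static struct info *
unshare (struct info *si)
{
  if (si->refcount == 1)
    return si;
  struct info *copy = malloc (sizeof *copy);
  if (copy == NULL)
    abort ();
  *copy = *si;
  copy->refcount = 1;
  si->refcount--;
  slot = copy;
  return copy;
}

/* Stand-in for a callee that may unshare the slot behind our back.  */
static void
callee_that_may_unshare (void)
{
  slot = unshare (slot);
}

/* The pattern of the fix: refetch from the slot after the call
   instead of reusing a pointer obtained before it.  */
static struct info *
update_after_call (int new_value)
{
  struct info *si = slot;       /* fetched early, may go stale */
  callee_that_may_unshare ();
  si = slot;                    /* refetch after the call */
  si = unshare (si);            /* a no-op if already unshared */
  si->value = new_value;
  return si;
}
```

Without the refetch, the second unshare would act on the stale pointer,
whose refcount no longer matches the number of places referencing it.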

[Bug c++/113658] GCC 14 has incomplete impl for declared feature "cxx_constexpr_string_builtins"

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113658

--- Comment #6 from Jakub Jelinek  ---
It doesn't help that __has_feature/__has_extension is very badly documented.
Obviously the best meaning for a feature check would be that the selected
builtins are usable in constexpr expressions if they are implemented, but it
was added before __has_builtin was introduced/standardized.  And for a
__builtin_wcs* etc. implementation we have the long-standing problem of which
APIs to use to access the wchar_ts; I think that is the reason why we don't
implement the format attribute for *wprintf etc.

[Bug middle-end/101195] ICE: in tree_to_uhwi, at tree.c:6324

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101195

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |14.0

--- Comment #7 from Jakub Jelinek  ---
Fixed for GCC 14.  Not important to backport IMHO, this builtin is meant to be
used just in the unwinder, not in normal user code.

[Bug tree-optimization/113603] [12/13 Regression] ICE Segfault during GIMPLE pass: strlen at -O3 since r12-145

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113603

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[12/13/14 Regression] ICE   |[12/13 Regression] ICE
                   |Segfault during GIMPLE      |Segfault during GIMPLE
                   |pass: strlen at -O3 since   |pass: strlen at -O3 since
                   |r12-145                     |r12-145

--- Comment #4 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug target/113656] [x86] ICE in simplify_const_unary_operation, at simplify-rtx.cc:1954 with new -mavx10.1

2024-01-30 Thread haochen.jiang at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113656

Haochen Jiang  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #4 from Haochen Jiang  ---
From my bisect, it seems that the guilty commit is gcc-14-1707-ge52be6034fa.

[Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a

2024-01-30 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

Sam James  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11/12/13/14 regression]    |[11/12/13/14 regression]
                   |Regular for Loop results in |Regular for Loop results in
                   |Endless Loop with -O2       |Endless Loop with -O2 since
                   |                            |r11-4987-g602c6cfc79ce4a

--- Comment #7 from Sam James  ---
(In reply to Andrew Pinski from comment #4)
> (In reply to Sam James from comment #3)
> > I will try bisect.
> 
> Most likely r11-5094-gafa6adbd6c83ee or r11-4987-g602c6cfc79ce4a or
> r11-4986-ga1fdc16da34118 .

r11-4987-g602c6cfc79ce4a is the first bad commit
commit r11-4987-g602c6cfc79ce4a
Author: Jan Hubicka 
Date:   Fri Nov 13 15:58:41 2020 +0100

Improve handling of memory operands in ipa-icf 2/4

[Bug target/113059] [14 regression] fftw fails tests for -O3 -m32 -march=znver2 since r14-6210-ge44ed92dbbe9d4

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113059

--- Comment #16 from Jakub Jelinek  ---
The question is revert what exactly?
If we revert r14-6210, we get back the other P1.  Or do you mean revert
r14-5355?
I guess another option is to move the vzeroupper pass one pass later, i.e.
after pass_gcse.

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #5 from Hongtao Liu  ---
It looks like x264_pixel_satd_16x16 consumes more time after my commit.  An
extracted case is below.  Note that there is no __attribute__((always_inline))
on the original x264_pixel_satd_8x4; it is added here to force inlining (under
PGO the function is hot and gets inlined).

typedef unsigned char uint8_t;
typedef unsigned uint32_t;
typedef unsigned short uint16_t;

static inline uint32_t abs2( uint32_t a )
{
    uint32_t s = ((a>>15)&0x10001)*0xffff;
    return (a+s)^s;
}

int
__attribute__((always_inline))
x264_pixel_satd_8x4( uint8_t *pix1, int i_pix1, uint8_t *pix2, int i_pix2 )
{
  uint32_t tmp[4][4];
  uint32_t a0, a1, a2, a3;
  int sum = 0;
  for( int i = 0; i < 4; i++, pix1 += i_pix1, pix2 += i_pix2 )
    {
      a0 = (pix1[0] - pix2[0]) + ((pix1[4] - pix2[4]) << 16);
      a1 = (pix1[1] - pix2[1]) + ((pix1[5] - pix2[5]) << 16);
      a2 = (pix1[2] - pix2[2]) + ((pix1[6] - pix2[6]) << 16);
      a3 = (pix1[3] - pix2[3]) + ((pix1[7] - pix2[7]) << 16);
      { int t0 = a0 + a1; int t1 = a0 - a1; int t2 = a2 + a3; int t3 = a2 - a3;
        tmp[i][0] = t0 + t2; tmp[i][2] = t0 - t2;
        tmp[i][1] = t1 + t3; tmp[i][3] = t1 - t3;};
    }
  for( int i = 0; i < 4; i++ )
    {
      { int t0 = tmp[0][i] + tmp[1][i]; int t1 = tmp[0][i] - tmp[1][i];
        int t2 = tmp[2][i] + tmp[3][i]; int t3 = tmp[2][i] - tmp[3][i];
        a0 = t0 + t2; a2 = t0 - t2; a1 = t1 + t3; a3 = t1 - t3;};
      sum += abs2(a0) + abs2(a1) + abs2(a2) + abs2(a3);
    }
  return (((uint16_t)sum) + ((uint32_t)sum>>16)) >> 1;
}

int x264_pixel_satd_16x16( uint8_t *pix1, int i_pix1, uint8_t *pix2, int i_pix2 )
{
  int sum = x264_pixel_satd_8x4( pix1, i_pix1, pix2, i_pix2 )
    + x264_pixel_satd_8x4( pix1+4*i_pix1, i_pix1, pix2+4*i_pix2, i_pix2 );
  sum+= x264_pixel_satd_8x4( pix1+8, i_pix1, pix2+8, i_pix2 )
    + x264_pixel_satd_8x4( pix1+8+4*i_pix1, i_pix1, pix2+8+4*i_pix2, i_pix2 );
  sum+= x264_pixel_satd_8x4( pix1+8*i_pix1, i_pix1, pix2+8*i_pix2, i_pix2 )
    + x264_pixel_satd_8x4( pix1+12*i_pix1, i_pix1, pix2+12*i_pix2, i_pix2 );
  sum+= x264_pixel_satd_8x4( pix1+8+8*i_pix1, i_pix1, pix2+8+8*i_pix2, i_pix2 )
    + x264_pixel_satd_8x4( pix1+8+12*i_pix1, i_pix1, pix2+8+12*i_pix2, i_pix2 );
  return sum;
}


After my commit, SLP fails to split the group of size 16 (vector(16) int)
into smaller groups of 4 + 12 and misses vectorization for the cases below.

  vect_t2_2445.784_8503 = VIEW_CONVERT_EXPR(_8502);
  vect__2457.786_8505 = vect_t0_2441.783_8501 - vect_t2_2445.784_8503;
  vect__2448.785_8504 = vect_t0_2441.783_8501 + vect_t2_2445.784_8503;
  _8506 = VEC_PERM_EXPR ;
  vect__2449.787_8507 = VIEW_CONVERT_EXPR(_8506);
  t3_2447 = (int) _2446;
  _2448 = t0_2441 + t2_2445;
  _2449 = (unsigned int) _2448;
  _2451 = t0_2441 - t2_2445;
  _2452 = (unsigned int) _2451;
  _2454 = t1_2443 + t3_2447;
  _2455 = (unsigned int) _2454;
  _2457 = t1_2443 - t3_2447;
  _2458 = (unsigned int) _2457;
  MEM  [(unsigned int *)&tmp + 16B] =
vect__2449.787_8507;


The vector store would be optimized away together with the later vector load,
so the bad case has STLF (store-to-load forwarding) issues.
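
As background for the extracted case: abs2 takes the absolute value of
two 16-bit lanes packed into one 32-bit word.  A standalone sketch
follows.  The multiplier 0xffff is my reading of the trick (it turns each
lane's sign bit into a full-lane mask), and pack2 is a hypothetical
helper that mirrors the "x + (y << 16)" packing used in the loop.

```c
#include <stdint.h>

/* Pack two small signed values the way the loop does: lo + (hi << 16).
   Unsigned arithmetic is used so the wraparound is well defined.  */
static uint32_t
pack2 (int lo, int hi)
{
  return (uint32_t) lo + ((uint32_t) hi << 16);
}

/* Per-lane absolute value: S is 0xffff in every lane whose sign bit is
   set, and (a + s) ^ s negates exactly those lanes.  */
static uint32_t
abs2 (uint32_t a)
{
  uint32_t s = ((a >> 15) & 0x10001) * 0xffff;
  return (a + s) ^ s;
}
```

For example, the lanes (-3, 5) pack to 0x0004fffd because of the borrow
out of the low lane, and abs2 turns that into 0x00050003, that is, the
lanes (3, 5).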

[Bug target/111677] [12/13 Regression] darktable build on aarch64 fails with unrecognizable insn due to -fstack-protector changes

2024-01-30 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677

Alex Coplan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |acoplan at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #22 from Alex Coplan  ---
(In reply to Richard Sandiford from comment #21)
> 
> aarch64_get_separate_components is supposed to vet shrink-wrappable
> offsets, but in this case the offset looks valid, since:
> 
> str q22, [sp, #512]
> 
> is a valid instruction.  Perhaps the constraints are too narrow?

Yeah, as discussed offline, for T{I,F}mode we deliberately restrict the range
to the ldp x-reg range, since at least for TImode we don't know pre-RA how it
will be allocated (a single q reg or a pair of x regs).

We could look at using a different mode for the save that doesn't have those
restrictions, I'll try to do that.
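
The mismatch can be seen from the immediate ranges of the two addressing
forms.  The predicates below are a sketch based on my understanding of
the A64 encodings (a signed 7-bit immediate scaled by 8 for an X-register
LDP/STP, an unsigned 12-bit immediate scaled by 16 for a Q-register
LDR/STR).  The function names are hypothetical and this is not the check
GCC itself performs.

```c
#include <stdbool.h>

/* LDP/STP of two X registers: signed 7-bit immediate, scaled by 8.  */
static bool
ldp_x_offset_ok (long offset)
{
  return offset % 8 == 0 && offset >= -512 && offset <= 504;
}

/* LDR/STR of a Q register, unsigned-offset form: 12-bit immediate,
   scaled by 16.  */
static bool
str_q_offset_ok (long offset)
{
  return offset % 16 == 0 && offset >= 0 && offset <= 65520;
}
```

Under these assumptions the offset 512 from the quoted instruction is
accepted by str_q_offset_ok but rejected by ldp_x_offset_ok.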

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #6 from Hongtao Liu  ---
My guess is that using an explicit .REDUC_PLUS instead of the original
VEC_PERM_EXPR somehow affects the store split decision.

[Bug debug/113636] [14 Regression] internal compiler error: in dead_debug_global_find, at valtrack.cc:275 since r14-6290-g9f0f7d802482a8

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113636

--- Comment #11 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:aeec7d87a28ac80c64ebfa88cef3dccee3ba8efc

commit r14-8518-gaeec7d87a28ac80c64ebfa88cef3dccee3ba8efc
Author: Richard Sandiford 
Date:   Tue Jan 30 09:30:35 2024 +

aarch64: Handle debug references to removed registers [PR113636]

In this PR, we entered early-ra with quite a bit of dead code.
The code was duly removed (to avoid wasting registers), but there
was a dangling reference in debug instructions, which caused an
ICE later.

Fixed by resetting a debug instruction if it references a register
that is no longer needed by non-debug instructions.

gcc/
PR target/113636
* config/aarch64/aarch64-early-ra.cc (early_ra::replace_regs): Take
the containing insn as an extra parameter.  Reset debug instructions
if they reference a register that is no longer used by real insns.
(early_ra::apply_allocation): Update calls accordingly.

gcc/testsuite/
PR target/113636
* go.dg/pr113636.go: New test.

[Bug target/113623] [14 Regression] ICE in aarch64_pair_mem_from_base since r14-6605

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113623

--- Comment #7 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:fa2739ac1b74769d97fba34db9b9a8aa8786539e

commit r14-8519-gfa2739ac1b74769d97fba34db9b9a8aa8786539e
Author: Richard Sandiford 
Date:   Tue Jan 30 09:30:35 2024 +

aarch64: Avoid allocating FPRs to address registers [PR113623]

For something like:

void
foo (void)
{
  int *ptr;
  asm volatile ("%0" : "=w" (ptr));
  asm volatile ("%0" :: "m" (*ptr));
}

early-ra would allocate ptr to an FPR for the first asm, thus
leaving an FPR address in the second asm.  The address was then
reloaded by LRA to make it valid.

But early-ra shouldn't be allocating at all in that kind of
situation.  Doing so caused the ICE in the PR (with LDP fusion).

Fixed by making sure that we record address references as
GPR references.

gcc/
PR target/113623
* config/aarch64/aarch64-early-ra.cc (early_ra::preprocess_insns):
Mark all registers that occur in addresses as needing a GPR.

gcc/testsuite/
PR target/113623
* gcc.c-torture/compile/pr113623.c: New test.

[Bug libgcc/113403] [14 Regression] __builtin_nested_func_ptr_created, __builtin_nested_func_ptr should be dynamically linked by default

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113403

--- Comment #15 from GCC Commits  ---
The master branch has been updated by Iain D Sandoe :

https://gcc.gnu.org/g:7b3b3788c579856abcfdc6eed589c64dc7e88cdb

commit r14-8520-g7b3b3788c579856abcfdc6eed589c64dc7e88cdb
Author: Iain Sandoe 
Date:   Fri Jan 19 15:57:04 2024 +

libgcc: Make heap trampoline support dynamic [PR113403].

This removes the heap trampoline support functions from libgcc.a and
adds them to libgcc_eh.a.  They are also present in libgcc_s.

PR libgcc/113403

libgcc/ChangeLog:

* config/aarch64/t-heap-trampoline: Move the heap trampoline
support functions from libgcc.a to libgcc_eh.a.
* config/i386/t-heap-trampoline: Likewise.

[Bug libgcc/113403] [14 Regression] __builtin_nested_func_ptr_created, __builtin_nested_func_ptr should be dynamically linked by default

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113403

--- Comment #16 from GCC Commits  ---
The master branch has been updated by Iain D Sandoe :

https://gcc.gnu.org/g:506e74f53a5e4f607284d3c41da17cdd3eca4fb8

commit r14-8521-g506e74f53a5e4f607284d3c41da17cdd3eca4fb8
Author: Iain Sandoe 
Date:   Sun Jan 28 13:31:56 2024 +

libgcc: Make heap trampoline support dynamic [PR113403].

This handles system security constraints during GCC build and test, and
the fact that most platform versions cannot link to libgcc_eh since the
unwinder there is incompatible with the system one.

1. We make the support functions weak definitions.
2. We include them as a CRT for platform conditions that do not
   allow libgcc_eh.
3. We ensure that the weak symbols are exported from DSOs (which
   includes exes on Darwin) so that the dynamic linker will
   pick one instance (which avoids duplication of trampoline
   caches).

PR libgcc/113403

gcc/ChangeLog:

* config/darwin.h (DARWIN_SHARED_WEAK_ADDS, DARWIN_WEAK_CRTS): New.
(REAL_LIBGCC_SPEC): Move weak CRT handling to separate spec.
* config/i386/darwin.h (DARWIN_HEAP_T_LIB): New.
* config/i386/darwin32-biarch.h (DARWIN_HEAP_T_LIB): New.
* config/i386/darwin64-biarch.h (DARWIN_HEAP_T_LIB): New.
* config/rs6000/darwin.h (DARWIN_HEAP_T_LIB): New.

libgcc/ChangeLog:

* config.host: Build libheap_t.a for i686/x86_64 Darwin.
* config/aarch64/heap-trampoline.c (HEAP_T_ATTR): New.
(allocate_tramp_ctrl): Allow a target to build this as a weak def.
(__gcc_nested_func_ptr_created): Likewise.
* config/i386/heap-trampoline.c (HEAP_T_ATTR): New.
(allocate_tramp_ctrl): Allow a target to build this as a weak def.
(__gcc_nested_func_ptr_created): Likewise.
* config/t-darwin: Build libheap_t.a (a CRT with heap trampoline
support).
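
A minimal sketch of what building a support function as a weak definition
can look like in C.  The macro body shown for HEAP_T_ATTR is an
assumption (the real definition may differ), and heap_t_support_version
is a hypothetical function standing in for the trampoline helpers.

```c
/* Assumption: a target that wants weak definitions defines the macro
   along these lines; the real definition may differ.  */
#ifndef HEAP_T_ATTR
# define HEAP_T_ATTR __attribute__ ((weak))
#endif

/* Hypothetical support function; being weak, duplicate definitions in
   several objects can be resolved to a single instance.  */
HEAP_T_ATTR int
heap_t_support_version (void)
{
  return 1;
}
```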

[Bug target/112861] [14 regression] Most gdc tests FAIL on macOS 12+

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112861

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Iain D Sandoe :

https://gcc.gnu.org/g:d1d144d80f27f7a027ec8a05758555e7aa45462f

commit r14-8522-gd1d144d80f27f7a027ec8a05758555e7aa45462f
Author: Iain Sandoe 
Date:   Wed Jan 24 08:05:01 2024 +

testsuite, GDC: Update link flags [PR112861].

The regressions here are because we do not generate a runpath for
the uninstalled libstdc++.  This patch updates the link flags handling
to simplify it.

We need to add options to locate both libgphobos and libstdc++.
Usually '-L' options are added to point to the relevant directories for
the uninstalled libraries.

In cases where libraries are available as both shared and convenience
libraries, some additional checks are made.

For some targets -static- options are handled by specs substitution
and need a '-B' option rather than '-L'.  For Darwin, when embedded
runpaths are in use (the default for all versions after macOS 10.11),
'-B' is also needed to provide the runpath.

When '-B' is used, this results in a '-L' for each path that exists (so
that appending a '-L' as well is a needless duplicate).  There are also
cases where tools warn for duplicates, leading to spurious fails.
Therefore the objective is to add a single -B/-L option for each needed
path.

PR target/112861

gcc/testsuite/ChangeLog:

* lib/gdc.exp: Decide on whether to present -B or -L to reference
the paths to uninstalled libphobos and libstdc++ and use that to
generate the link flags.

[Bug target/113623] [14 Regression] ICE in aarch64_pair_mem_from_base since r14-6605

2024-01-30 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113623

Richard Sandiford  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #8 from Richard Sandiford  ---
Fixed.

[Bug debug/113636] [14 Regression] internal compiler error: in dead_debug_global_find, at valtrack.cc:275 since r14-6290-g9f0f7d802482a8

2024-01-30 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113636

Richard Sandiford  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from Richard Sandiford  ---
Fixed.  Thanks for the report and help with reproducing.

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-30 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576

--- Comment #30 from Richard Sandiford  ---
(In reply to Richard Biener from comment #29)
> But that's just for CONSTRUCTORs, we got the VIEW_CONVERT_EXPR path for
> VECTOR_CSTs.  But yeah, that _might_ argue we should perform the same
> masking for VECTOR_CST expansion as well, instead of trying to fixup
> in do_compare_and_jump?
But then how would ~ be implemented for things like 4-bit masks?
If we use notqi2 then I assume the upper bits could be 1 rather than 0.

[Bug target/113059] [14 regression] fftw fails tests for -O3 -m32 -march=znver2 since r14-6210-ge44ed92dbbe9d4

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113059

--- Comment #17 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #16)
> The question is revert what exactly?
> If we revert r14-6210, we get back the other P1.  Or do you mean revert
> r14-5355?
> I guess another option is move the vzeroupper pass one pass later, i.e.
> after pass_gcse.

I think moving mdreorg passes as late as possible esp. when they don't play
well with DF/notes is a good thing.  Maybe even after pass_rtl_dse2 and
thus after shrink-wrapping?

[Bug target/113059] [14 regression] fftw fails tests for -O3 -m32 -march=znver2 since r14-6210-ge44ed92dbbe9d4

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113059

--- Comment #18 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #17)
> (In reply to Jakub Jelinek from comment #16)
> > The question is revert what exactly?
> > If we revert r14-6210, we get back the other P1.  Or do you mean revert
> > r14-5355?
> > I guess another option is move the vzeroupper pass one pass later, i.e.
> > after pass_gcse.
> 
> I think moving mdreorg passes as late as possible esp. when they don't play
> well with DF/notes is a good thing.  Maybe even after pass_rtl_dse2 and
> thus after shrink-wrapping?

The thing is that the vzeroupper pass actually plays well with DF notes, the
problem is that it now (in GCC 14) asks for them to be computed.
The first issue was that vzeroupper before postreload_cse computed the notes,
then
postreload_cse CSEd something and made the REG_UNUSED invalid without killing
them and then later passes went wrong because of the incorrect notes.
This issue is that vzeroupper now after postreload_cse but before gcse2
computes notes, then gcse2 CSEs something and makes REG_UNUSED invalid, rest is
the same.
But, I believe gcse2 is the last CSE-ish pass.
I wouldn't move it too much further, because I don't remember the interactions
between vzeroupper, splitting and peepholes.

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-30 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576

--- Comment #31 from rguenther at suse dot de  ---
On Tue, 30 Jan 2024, rsandifo at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
> 
> --- Comment #30 from Richard Sandiford  ---
> (In reply to Richard Biener from comment #29)
> > But that's just for CONSTRUCTORs, we got the VIEW_CONVERT_EXPR path for
> > VECTOR_CSTs.  But yeah, that _might_ argue we should perform the same
> > masking for VECTOR_CST expansion as well, instead of trying to fixup
> > in do_compare_and_jump?
> But then how would ~ be implemented for things like 4-bit masks?
> If we use notqi2 then I assume the upper bits could be 1 rather than 0.

Yeah, I guess it's similar to expand_expr_real_1 'reduce_bit_field'
handling - we'd need to insert fixup code in strategic places
(or for ~ use xor with the proper mask).

The difficulty is that we can't make the backend do this unless
there are insn operands that allow it to infer the real precision
of the mode.  And for most insns the excess bits are irrelevant
anyway.

Still the CTOR case showed wrong-code issues with GCN, which possibly
means it has the same issue with VECTOR_CSTs as well.  IIRC that
was that all vectors are 1024bits, and its "fake" V4SImode insns
rely on accurate masked out upper bits.  That might hint that
compares are not enough here (but for non-compares the backend
might have a chance to fixup by inferring the max. number of
active elements).

If we think that compares (but that would also be compares without
jump, aka a == b | c == d) are the only problematical case we can
also fixup at the uses rather than at the defs as 'reduce_bit_field'
tries to do.

[Bug target/113059] [14 regression] fftw fails tests for -O3 -m32 -march=znver2 since r14-6210-ge44ed92dbbe9d4

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113059

--- Comment #19 from Jakub Jelinek  ---
Created attachment 57258
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57258&action=edit
gcc14-pr113059.patch

So in patch form like this.  Untested so far.

[Bug target/113059] [14 regression] fftw fails tests for -O3 -m32 -march=znver2 since r14-6210-ge44ed92dbbe9d4

2024-01-30 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113059

--- Comment #20 from rguenther at suse dot de  ---
On Tue, 30 Jan 2024, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113059
> 
> --- Comment #18 from Jakub Jelinek  ---
> (In reply to Richard Biener from comment #17)
> > (In reply to Jakub Jelinek from comment #16)
> > > The question is revert what exactly?
> > > If we revert r14-6210, we get back the other P1.  Or do you mean revert
> > > r14-5355?
> > > I guess another option is move the vzeroupper pass one pass later, i.e.
> > > after pass_gcse.
> > 
> > I think moving mdreorg passes as late as possible esp. when they don't play
> > well with DF/notes is a good thing.  Maybe even after pass_rtl_dse2 and
> > thus after shrink-wrapping?
> 
> The thing is that the vzeroupper pass actually plays well with DF notes, the
> problem is that it now (in GCC 14) asks for them to be computed.
> The first issue was that vzeroupper before postreload_cse computed the notes,
> then
> postreload_cse CSEd something and made the REG_UNUSED invalid without killing
> them and then later passes went wrong because of the incorrect notes.
> This issue is that vzeroupper now after postreload_cse but before gcse2
> computes notes, then gcse2 CSEs something and makes REG_UNUSED invalid, rest 
> is
> the same.
> But, I believe gcse2 is the last CSE-ish pass.
> I wouldn't move it too much further, because I don't remember the interactions
> between vzeroupper, splitting and peepholes.

OK, so the "real" revert would then simply kill the notes actively
again after vzeroupper?  Btw, DSE also uses cselib, but I'm not sure
whether it uses REG_UNUSED but IIRC it does "redundant store" removal
and it does replace reads in some cases ... (not sure if after reload
though).

So for maximum safety if we'd have a way to kill off REG_UNUSED maybe
we should do that instead?  OTOH any "stray" valid REG_UNUSED
notes not causing issues with gcse or postreload_cse might not be
preserved and cause missed optimizations later ...

[Bug target/113059] [14 regression] fftw fails tests for -O3 -m32 -march=znver2 since r14-6210-ge44ed92dbbe9d4

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113059

--- Comment #21 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #19)
> Created attachment 57258 [details]
> gcc14-pr113059.patch
> 
> So in patch form like this.  Untested so far.

LGTM.

[Bug tree-optimization/113659] [14 Regression] ICE Segmentation fault since r14-8355-g02e683894942da

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113659

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:4c2169d2f4061e72e1e61e9a175d16f7ff50f5c0

commit r14-8524-g4c2169d2f4061e72e1e61e9a175d16f7ff50f5c0
Author: Richard Biener 
Date:   Tue Jan 30 09:42:08 2024 +0100

tree-optimization/113659 - early exit vectorization and missing VUSE

The following handles the case of the main exit going to a path without
virtual use and handles it similar to the alternate exit handling.

PR tree-optimization/113659
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Handle main exit without virtual use.

* gcc.dg/pr113659.c: New testcase.

[Bug tree-optimization/113659] [14 Regression] ICE Segmentation fault since r14-8355-g02e683894942da

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113659

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Richard Biener  ---
Fixed.

[Bug middle-end/113166] RISC-V: Redundant move instructions in RVV intrinsic codes

2024-01-30 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113166

--- Comment #3 from JuzheZhong  ---
#include 
#include 

template 
inline vuint8m1_t tail_load(void const* data);

template<>
inline vuint8m1_t tail_load(void const* data) {
uint64_t const* ptr64 = reinterpret_cast(data);
#if 1
const vuint64m1_t zero = __riscv_vmv_v_x_u64m1(0,
__riscv_vsetvlmax_e64m1());
vuint64m1_t v64 = __riscv_vslide1up(zero, *ptr64,
__riscv_vsetvlmax_e64m1());
return __riscv_vreinterpret_u8m1(v64);
#elif 1
vuint64m1_t v64 = __riscv_vmv_s_x_u64m1(*ptr64, 1);
const vuint64m1_t zero = __riscv_vmv_v_x_u64m1(0,
__riscv_vsetvlmax_e64m1());
v64 = __riscv_vslideup(v64, zero, 1, __riscv_vsetvlmax_e8m1());
return __riscv_vreinterpret_u8m1(v64);
#elif 1
vuint64m1_t v64 = __riscv_vle64_v_u64m1(ptr64, 1);
const vuint64m1_t zero = __riscv_vmv_v_x_u64m1(0,
__riscv_vsetvlmax_e64m1());
v64 = __riscv_vslideup(v64, zero, 1, __riscv_vsetvlmax_e8m1());
return __riscv_vreinterpret_u8m1(v64);
#else
vuint8m1_t v = __riscv_vreinterpret_u8m1(__riscv_vle64_v_u64m1(ptr64, 1));
const vuint8m1_t zero = __riscv_vmv_v_x_u8m1(0, __riscv_vsetvlmax_e8m1());
return __riscv_vslideup(v, zero, sizeof(uint64_t),
__riscv_vsetvlmax_e8m1());
#endif
}

vuint8m1_t test2(uint64_t data) {
return tail_load(&data);
}

GCC ASM:

test2(unsigned long):
vsetvli a5,zero,e64,m1,ta,ma
vmv.v.i v8,0
vmv1r.v v9,v8   
vslide1up.vx v8,v9,a0
ret

LLVM ASM:

test2(unsigned long):  # @test2(unsigned long)
vsetvli a1, zero, e64, m1, ta, ma
vmv.v.i v9, 0
vslide1up.vx v8, v9, a0
ret

[Bug tree-optimization/113664] False positive warnings with -fno-strict-overflow (-Warray-bounds, -Wstringop-overflow)

2024-01-30 Thread stefan at bytereef dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113664

--- Comment #5 from Stefan Krah  ---
> So the diagnostic messages leave a lot to be desired but in the end
> they point to a problem in your code which is a guard against a NULL 's'.

Hmm, the real code is used to print floating point numbers and integers.
Integers get dot==NULL. It is fine (and desired!) in that case to optimize
away the if clause.

As far as I can see, it is compliant with the C standard.


Even with -fno-strict-overflow one could make the case that the warning
is strange. If "s" wraps around, the allocated output string is too small,
and you have bigger problems.

It is impossible for gcc to detect whether the string size is sufficient,
so IMHO it should not warn.


In essence, since gcc-10 (12?) idioms that were warning-free for 10 years
tend to receive false positive warnings now.

This also applies to -Warray-bounds. I think the Linux kernel disables at
least -Warray-bounds and -Wmaybe-uninitialized.

I think this is becoming a problem, because most projects do not report
false positives but just silently disable the warnings.

[Bug target/113059] [14 regression] fftw fails tests for -O3 -m32 -march=znver2 since r14-6210-ge44ed92dbbe9d4

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113059

--- Comment #22 from Jakub Jelinek  ---
BTW, I have quickly looked at REG_UNUSED notes on insn-recog.cc (as a randomly
picked large GCC object).  Ignoring REG_UNUSED notes for the flags register
(which are extremely common in lots of passes), it seems we have just a few
such notes before RA, and the combiner probably kills them?
insn-recog.cc.292r.cse2:(expr_list:REG_UNUSED (reg:SI 207)
insn-recog.cc.293r.dse1:(expr_list:REG_UNUSED (reg:SI 207)
insn-recog.cc.294r.fwprop2:(expr_list:REG_UNUSED (reg:SI 207)
insn-recog.cc.296r.init-regs:(expr_list:REG_UNUSED (reg:SI 207)
insn-recog.cc.297r.ud_dce:(expr_list:REG_UNUSED (reg:SI 207)
insn-recog.cc.298r.combine:  REG_UNUSED r207:SI
insn-recog.cc.298r.combine:  REG_UNUSED r207:SI
Then I see tons of REG_UNUSED notes for non-flags in IRA dump (all new), but
they don't appear after it, so most likely LRA removes them.  Even the flags
related REG_UNUSED don't appear in *.reload dump (except some details
comments), nor *.postreload etc.
And (note, this isn't a build with -mavx*, so no vzeroupper), they reappear
only in dse2 dump and keep appearing from that point onwards.
Similarly REG_DEAD notes don't appear in the IL between *.reload and
*.pro_and_epilogue inclusive.
Seems LRA does this in update_inc_notes, reload did that in reload function.

So, maybe safer than the above patch would be to simply do that too in the
vzeroupper pass (I guess we don't need to update the REG_INC notes).
Let me write another patch.

[Bug target/113656] [x86] ICE in simplify_const_unary_operation, at simplify-rtx.cc:1954 with new -mavx10.1

2024-01-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113656

Hongtao Liu  changed:

   What|Removed |Added

 CC||liuhongt at gcc dot gnu.org,
   ||rsandifo at gcc dot gnu.org

--- Comment #5 from Hongtao Liu  ---
It hits the gcc_assert in simplify_const_unary_operation when combine tries to
simplify

(float_truncate:V4HF
  (float_truncate:V4SF
(mem/u/c:V4DF (symbol_ref/u:DI ("*.LC4") [flags 0x2]) [0  S32 A256])))

to

(float_truncate:V4HF
(const_vector:V4DF [
(const_double:DF
-8.4003552713678800500929355621337890625e+0
[-0x0.88p+4])
(const_double:DF
-7.4003552713678800500929355621337890625e+0
[-0x0.ecccdp+3])
(const_double:DF
-6.4003552713678800500929355621337890625e+0
[-0x0.dp+3])
(const_double:DF
-5.4003552713678800500929355621337890625e+0
[-0x0.acccdp+3])
]))


  if (VECTOR_MODE_P (mode)
  && GET_CODE (op) == CONST_VECTOR
  && known_eq (GET_MODE_NUNITS (mode), CONST_VECTOR_NUNITS (op)))
{
  gcc_assert (GET_MODE (op) == op_mode); --- hit assert here.

  rtx_vector_builder builder;
  if (!builder.new_unary_operation (mode, op, false))
return 0;

  unsigned int count = builder.encoded_nelts ();
  for (unsigned int i = 0; i < count; i++)
{
  rtx x = simplify_unary_operation (code, GET_MODE_INNER (mode),
CONST_VECTOR_ELT (op, i),
GET_MODE_INNER (op_mode));
  if (!x || !valid_for_const_vector_p (mode, x))
return 0;
  builder.quick_push (x);
}
  return builder.build ();
}

The gcc_assert is added by r10-2139-g4ce6ab68894469

Author: Richard Sandiford 
Date:   Mon Jul 29 08:40:21 2019 +

Implement more rtx vector folds on variable-length vectors

This patch extends the tree-level folding of variable-length vectors
so that it can also be used on rtxes.  The first step is to move
the tree_vector_builder new_unary/binary_operator routines to the
parent vector_builder class (which in turn means adding a new
template parameter).  The second step is to make simplify-rtx.c
use a direct rtx analogue of the VECTOR_CST handling in fold-const.c.

2019-07-29  Richard Sandiford  

gcc/
* vector-builder.h (vector_builder): Add a shape template
parameter.
(vector_builder::new_unary_operation): New function, generalizing
the old tree_vector_builder function.
(vector_builder::new_binary_operation): Likewise.
(vector_builder::binary_encoded_nelts): Likewise.
* int-vector-builder.h (int_vector_builder): Update template
parameters to vector_builder.
(int_vector_builder::shape_nelts): New function.
* rtx-vector-builder.h (rtx_vector_builder): Update template
parameters to vector_builder.
(rtx_vector_builder::shape_nelts): New function.
(rtx_vector_builder::nelts_of): Likewise.
(rtx_vector_builder::npatterns_of): Likewise.
(rtx_vector_builder::nelts_per_pattern_of): Likewise.
* tree-vector-builder.h (tree_vector_builder): Update template
parameters to vector_builder.
(tree_vector_builder::shape_nelts): New function.
(tree_vector_builder::nelts_of): Likewise.
(tree_vector_builder::npatterns_of): Likewise.
(tree_vector_builder::nelts_per_pattern_of): Likewise.
* tree-vector-builder.c (tree_vector_builder::new_unary_operation)
(tree_vector_builder::new_binary_operation): Delete.
(tree_vector_builder::binary_encoded_nelts): Likewise.
* simplify-rtx.c: Include rtx-vector-builder.h.
(distributes_over_addition_p): New function.
(simplify_const_unary_operation)
(simplify_const_binary_operation): Generalize handling of vector
constants to include variable-length vectors.
(test_vector_ops_series): Add more tests.


Before that commit, it only asserted on GET_MODE_NUNITS:


 /* Simplification and canonicalization of RTL.  */

@@ -1753,27 +1754,23 @@ simplify_const_unary_operation (enum rtx_code code,
machine_mode mode,

   if (VECTOR_MODE_P (mode) && GET_CODE (op) == CONST_VECTOR)
 {
-  unsigned int n_elts;
-  if (!CONST_VECTOR_NUNITS (op).is_constant (&n_elts))
-   return NULL_RTX;
-
-  machine_mode opmode = GET_MODE (op);
-  gcc_assert (known_eq (GET_MODE_NUNITS (mode), n_elts));
-  gcc_assert (known_eq (GET_MODE_NUNITS (opmode), n_elts));
+  gcc_assert (GET_MODE (op) == op_mode);

-  rtvec v = rtvec_alloc (n_elts);

[Bug tree-optimization/113664] False positive warnings with -fno-strict-overflow (-Warray-bounds, -Wstringop-overflow)

2024-01-30 Thread stefan at bytereef dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113664

--- Comment #6 from Stefan Krah  ---
Sometimes you hear "code should be rewritten" because squashing the warnings
makes it better.

I disagree. I've seen many segfaults introduced in projects that rush
to squash warnings.

Sometimes, analyzers just cannot cope with established idioms. clang-analyzer
for instance hates Knuth's algorithm D (long division). It would be strange to
change that for an analyzer.

[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc

2024-01-30 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395

--- Comment #8 from JuzheZhong  ---
Hi, Richard.

I have now found time to look at GCC vectorization optimization.

I found this case:

  _2 = a[_1];
  ...
  a[i_16] = _4;
  ...
  _7 = a[_1];  ---> This load should be eliminated, re-using _2.

Am I right?

Could you point me to the pass that should do this CSE optimization?

Thanks.

[Bug target/113059] [14 regression] fftw fails tests for -O3 -m32 -march=znver2 since r14-6210-ge44ed92dbbe9d4

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113059

--- Comment #23 from Jakub Jelinek  ---
Created attachment 57259
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57259&action=edit
gcc14-pr113059-2.patch

Untested patch to remove the notes instead of moving the pass around further.

[Bug libstdc++/113663] [MinGW] std::filesystem::hard_link_count always returns 1

2024-01-30 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113663

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-30
 Ever confirmed|0   |1

[Bug libstdc++/113522] std::swap cannot be called with explicit template argument std::array

2024-01-30 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113522

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-30
 Ever confirmed|0   |1
 Status|UNCONFIRMED |SUSPENDED

--- Comment #8 from Jonathan Wakely  ---
https://cplusplus.github.io/LWG/issue4047

[Bug target/111677] [12/13/14 Regression] darktable build on aarch64 fails with unrecognizable insn due to -fstack-protector changes

2024-01-30 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677

Alex Coplan  changed:

   What|Removed |Added

   Keywords|needs-bisection |
  Known to fail|13.2.1  |14.0
  Known to work|14.0|
Version|13.2.0  |13.2.1
Summary|[12/13 Regression]  |[12/13/14 Regression]
   |darktable build on aarch64  |darktable build on aarch64
   |fails with unrecognizable   |fails with unrecognizable
   |insn due to |insn due to
   |-fstack-protector changes   |-fstack-protector changes

--- Comment #23 from Alex Coplan  ---
Discovered by accident while working on a patch for trunk, but adding
-funroll-loops to the testcase in #c20 is enough to make the ICE trigger on the
trunk, too.

Testing a fix for trunk and a backport to 13 (to start with).

To reproduce on the trunk (t.c as in #c20):

$ gcc/xgcc -B gcc -c t.c -O3 -ffast-math -fopenmp -fstack-protector-strong
-funroll-loops
t.c: In function ‘dt_bilateral_splat.simdclone.1’:
t.c:25:1: error: unrecognizable insn:
   25 | }
  | ^
(insn 2182 2181 406 85 (set (mem/c:TF (plus:DI (reg/f:DI 31 sp)
(const_int 512 [0x200])) [7  S16 A8])
(reg:TF 55 v23)) -1
 (expr_list:REG_DEAD (reg:TF 55 v23)
(nil)))
during RTL pass: sched_fusion
t.c:25:1: internal compiler error: in get_attr_type, at
config/aarch64/aarch64.md:29678
0x74a68f _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/home/alecop01/toolchain/src/gcc/gcc/rtl-error.cc:108
0x74a6c3 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/alecop01/toolchain/src/gcc/gcc/rtl-error.cc:116
0x18cf03b get_attr_type(rtx_insn*)
/home/alecop01/toolchain/src/gcc/gcc/config/aarch64/aarch64.md:29678
0x13278b7 aarch64_sched_variable_issue
/home/alecop01/toolchain/src/gcc/gcc/config/aarch64/aarch64.cc:15827
0x13278b7 aarch64_sched_variable_issue
/home/alecop01/toolchain/src/gcc/gcc/config/aarch64/aarch64.cc:15818
0x1e25057 schedule_block(basic_block_def**, void*)
/home/alecop01/toolchain/src/gcc/gcc/haifa-sched.cc:6912
0xeb307f schedule_region
/home/alecop01/toolchain/src/gcc/gcc/sched-rgn.cc:3203
0xeb307f schedule_insns()
/home/alecop01/toolchain/src/gcc/gcc/sched-rgn.cc:3525
0xeb34a3 schedule_insns()
/home/alecop01/toolchain/src/gcc/gcc/sched-rgn.cc:3511
0xeb34a3 rest_of_handle_sched_fusion
/home/alecop01/toolchain/src/gcc/gcc/sched-rgn.cc:3760
0xeb34a3 execute
/home/alecop01/toolchain/src/gcc/gcc/sched-rgn.cc:3938
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc

2024-01-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395

--- Comment #9 from Richard Biener  ---
(In reply to JuzheZhong from comment #8)
> Hi, Richard.
> 
> I have now found time to look at GCC vectorization optimization.
> 
> I found this case:
> 
>   _2 = a[_1];
>   ...
>   a[i_16] = _4;
>   ...
>   _7 = a[_1];  ---> This load should be eliminated, re-using _2.
> 
> Am I right?
> 
> Could you point me to the pass that should do this CSE optimization?
> 
> Thanks.

In principle it's value-numbering.  The reason it doesn't do this is
compile-time cost of doing full data-ref analysis.  In principle it's
as "easy" as hooking that up into vn_reference_lookup_3 as part of the
early work therein to disambiguate more defs.

Iff we chose to refrain from valueizing any of the SSA uses we could
cache both the data references and the dependence resolution.

One could also think of doing very simple recognition of these
single index expressions and / or integrating this with other cases.
IIRC there's some warranting SCEV processing / niter analysis as well
for example to figure that

 for (int i = 0; i < 128; ++i)
   a[i] = 1;
 return a[5];

returns 1.

[Bug rtl-optimization/113617] [14 Regression] Symbol ... referenced in section `.data.rel.ro.local' of ...: defined in discarded section ... since r14-4944

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113617

--- Comment #14 from Jakub Jelinek  ---
The huge names make the assembly quite unreadable, so I've tried to reduce it a
little bit further:
pr113617.h:
namespace {
template  struct J { static constexpr int value = V; };
template  using K = J;
using M = K;
template  struct L { template  using type = _Tp;
};
template  using N = typename
L<_Cond>::type<_If, _Else>;
M k;
template  struct O { using type = _Tp; };
template 
struct P : N, _Up> {};
template  struct Q { using type = typename P<_Tp>::type; };
}
namespace R {
struct H;
enum G {};
template  class S;
struct T { using U = bool (*) (H &, const H &, G); U F; };
template  class B;
template 
struct B<_R(_A...), _F> {
  static bool F(H &, const H &, G) { return false; }
  __attribute__((noipa)) static _R bar(const H &) {}
};
template 
struct S<_R(_A...)> : T {
  template  using AH = B<_R(), _F>;
  template  S(_F) {
using AG = AH<_F>;
barr = AG::bar;
F = AG::F;
  }
  using AF = _R (*)(const H &);
  AF barr;
};
template  class I;
template 
struct I<_F(_B...)> {};
template  using W = decltype(k);
template  struct V {
  typedef I::type(typename Q<_B>::type...)> type;
};
template 
__attribute__((noipa)) typename V::value, _F, _B...>::type
baz(_F, _B...) { return typename V::value, _F, _B...>::type (); }
template  struct AJ {
  template  struct _Ptr { using type = _Up *; };
  using AI = typename _Ptr<_Tp>::type;
};
template  struct Y {
  using AI = typename AJ<_Tp>::AI;
  AI operator->();
};
}
extern int z;
namespace N1 {
namespace N2 {
namespace N3 {
enum Z { Z1, Z2 };
template  struct X {
  template 
  __attribute__((noipa)) void boo(long long, long long, long long, _F &) {}
};
struct AC {
  AC(int);
  void m1(R::S);
};
template 
__attribute__((noipa)) void garply(void *, long long, long long, long long) {}
template <>
template 
void X::boo(long long, long long x, long long y, _F &fi) {
  AC pool(z);
  for (;;) {
auto job = R::baz(garply<_F>, &fi, y, y, x);
pool.m1(job);
  }
}
struct AB {
  static AB &bleh();
  template 
  void boo(long first, long x, long y, _F fi) {
switch (ab1) {
case Z1:
  ab2->boo(first, x, y, fi);
case Z2:
  ab3->boo(first, x, y, fi);
}
  }
  Z ab1;
  R::Y> ab2;
  R::Y> ab3;
};
template  struct C;
template  struct C<_F, false> {
  __attribute__((noipa)) C(_F) {}
  void boo(long first, long x, long y) {
auto u = AB::bleh();
u.boo(first, x, y, *this);
  }
};
template  struct AA { typedef C<_F, 0> type; };
}
}
}
struct AD {
  template 
  static void boo(long first, long x, long y, _F f) {
typename N1::N2::N3::AA<_F>::type fi(f);
fi.boo(first, x, y);
  }
  template 
  static void boo(long first, long x, _F f) {
boo(first, x, 0, f);
  }
};
template  struct A {
  void foo(long long, long long);
  int *c;
};
namespace {
template  struct D { __attribute__((noipa)) D(int *) {} };
}
template 
void A::foo(long long x, long long y)
{
  int e;
  D d(&e);
  AD::boo(0, y, d);
  long p;
  for (p = 0; p < x; p++)
c[p] = c[p - 1];
}
pr113617.C:
#include "pr113617.h"
int z;
long xx1;
void corge() {
  A a;
  a.foo(xx1, 0);
}
pr113617-aux.cc:
#include "pr113617.h"
void qux() {
  A a;
  a.foo(0, 0);
}

From what I can see, there is just one comdat group, _ZN1AIxE3fooExx, in the
whole testcase, which contains everything instantiated because of the foo
instantiation.

[Bug target/112470] [11/12/13/14 regression] [AARCH64] stack-protector vulnerability fixing solution impact code size and performance

2024-01-30 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112470

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org

--- Comment #12 from Xi Ruoyao  ---
I'm wondering why this is a P1...  To me this would make every missed
optimization with a potential 0.5% performance gain a candidate for P1.

[Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too high.

2024-01-30 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

--- Comment #28 from Jerry DeLisle  ---
Created attachment 57260
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57260&action=edit
A final patch

This patch provides the necessary changes with only minor adjustment to
existing gfortran test cases. (This took insanely longer than I had hoped and
such is life.) I am preparing one or two additional test cases and will submit
this for approval on the list.

[Bug analyzer/113654] [14 Regression] -Wanalyzer-allocation-size false positive seen on Linux kernel's drivers/gpu/drm/i915/display/intel_bios.c

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113654

--- Comment #1 from GCC Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:9f382376660069e49290fdb51861abdec63519c7

commit r14-8627-g9f382376660069e49290fdb51861abdec63519c7
Author: David Malcolm 
Date:   Tue Jan 30 08:17:47 2024 -0500

analyzer: fix -Wanalyzer-allocation-size false +ve on Linux kernel's
round_up macro [PR113654]

gcc/analyzer/ChangeLog:
PR analyzer/113654
* region-model.cc (is_round_up): New.
(is_multiple_p): New.
(is_dubious_capacity): New.
(region_model::check_region_size): Move usage of size_visitor into
is_dubious_capacity.

gcc/testsuite/ChangeLog:
PR analyzer/113654
* c-c++-common/analyzer/allocation-size-pr113654-1.c: New test.

Signed-off-by: David Malcolm 

[Bug tree-optimization/113622] [11/12/13 Regression] ICE with vectors in named registers

2024-01-30 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #21 from Xi Ruoyao  ---
This still blows up on LoongArch even after r14-8498:

typedef float __attribute__ ((vector_size (16))) vec;
typedef int __attribute__ ((vector_size (16))) ivec;
register vec a asm("f25"), b asm("f26");
register ivec c asm("f27");

void
test (void)
{
  for (int i = 0; i < 4; i++)
c[i] = a[i] < b[i] ? -1 : 1;
}

$ gcc/cc1 -msimd=lsx t.c -O2 -fno-vect-cost-model -nostdinc
t.c: In function ‘test’:
t.c:7:1: internal compiler error: in expand_expr_addr_expr_1, at expr.cc:9139
7 | test (void)
  | ^~~~
0x102d7cb expand_expr_addr_expr_1
../../gcc/gcc/expr.cc:9139
0x102e13e expand_expr_addr_expr
../../gcc/gcc/expr.cc:9252
0x103df07 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc/gcc/expr.cc:12585
0x102e824 expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc/gcc/expr.cc:9440
0xe45096 expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
../../gcc/gcc/expr.h:316
0x102fc56 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
../../gcc/gcc/expr.cc:9762
0x10361c9 expand_expr_real_gassign(gassign*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc/gcc/expr.cc:11096
0xe798de expand_gimple_stmt_1
../../gcc/gcc/cfgexpand.cc:4010
0xe79b81 expand_gimple_stmt
../../gcc/gcc/cfgexpand.cc:4071
0xe825e3 expand_gimple_basic_block
../../gcc/gcc/cfgexpand.cc:6127
0xe84b95 execute
../../gcc/gcc/cfgexpand.cc:6866

Interestingly -fno-vect-cost-model is needed to trigger the ICE.

[Bug tree-optimization/113622] [11/12/13 Regression] ICE with vectors in named registers

2024-01-30 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #22 from Xi Ruoyao  ---
On x86_64:

$ cat t.c
typedef float __attribute__ ((vector_size (16))) vec;
typedef int __attribute__ ((vector_size (16))) ivec;
register vec a asm("xmm0"), b asm("xmm1");
register ivec c asm("xmm2");

void
test (void)
{
  for (int i = 0; i < 4; i++)
    c[i] = a[i] < b[i] ? -1 : 1;
}
$ gcc/cc1 -msse2 t.c -O2 -fno-vect-cost-model -nostdinc -ffixed-xmm{0,1,2}
t.c: In function 'test':
t.c:7:1: internal compiler error: in expand_expr_addr_expr_1, at expr.cc:9139
7 | test (void)
  | ^~~~
0x10e6d6e expand_expr_addr_expr_1
../../gcc/gcc/expr.cc:9139
0x10e76e2 expand_expr_addr_expr
../../gcc/gcc/expr.cc:9252
0x10f73a7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc/gcc/expr.cc:12585
0x10e7dc8 expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc/gcc/expr.cc:9440
0xef7346 expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
../../gcc/gcc/expr.h:316
0x10e91fa expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
../../gcc/gcc/expr.cc:9762
0x10ef77d expand_expr_real_gassign(gassign*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc/gcc/expr.cc:11096
0xf2db31 expand_gimple_stmt_1
../../gcc/gcc/cfgexpand.cc:4010
0xf2ddd4 expand_gimple_stmt
../../gcc/gcc/cfgexpand.cc:4071
0xf36844 expand_gimple_basic_block
../../gcc/gcc/cfgexpand.cc:6127
0xf38ff8 execute
../../gcc/gcc/cfgexpand.cc:6866

Should I open a new ticket or add back 14 Regression to the subject?

[Bug libstdc++/108323] combine does not change the locale name

2024-01-30 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108323

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed|2023-01-07 00:00:00 |2024-01-30

--- Comment #1 from Jonathan Wakely  ---
This should be fixed at the same time as
https://cplusplus.github.io/LWG/issue2295

And we should also add a static_assert(__is_facet<_Facet>::value, "") for
https://cplusplus.github.io/LWG/issue436

[Bug analyzer/106358] [meta-bug] tracker bug for building the Linux kernel with -fanalyzer

2024-01-30 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106358
Bug 106358 depends on bug 113654, which changed state.

Bug 113654 Summary: [14 Regression] -Wanalyzer-allocation-size false positive 
seen on Linux kernel's drivers/gpu/drm/i915/display/intel_bios.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113654

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug analyzer/113654] [14 Regression] -Wanalyzer-allocation-size false positive seen on Linux kernel's drivers/gpu/drm/i915/display/intel_bios.c

2024-01-30 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113654

David Malcolm  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from David Malcolm  ---
Should be fixed by above patch; marking as resolved.

[Bug preprocessor/105608] [11/12/13/14 Regression] ICE: in linemap_add with a really long defined macro on the command line r11-338-g2a0225e47868fbfc

2024-01-30 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608

Rainer Orth  changed:

   What|Removed |Added

 CC||ro at gcc dot gnu.org

--- Comment #10 from Rainer Orth  ---
Something very weird is going on here: the new g++.dg/pch/line-map-3.C test
FAILs on i386-pc-solaris2.11 only:

XPASS: g++.dg/pch/line-map-3.C  -O2 -I. -Dwith_PCH  (test for bogus messages,
line 2)
+XPASS: g++.dg/pch/line-map-3.C  -O2 -I. -Dwith_PCH  at line 17 (test for bogus
messages, line 2)
+FAIL: g++.dg/pch/line-map-3.C  -O2 -I. -Dwith_PCH (test for excess errors)
+XPASS: g++.dg/pch/line-map-3.C  -O2 -g -I. -Dwith_PCH  (test for bogus
messages, line 2)
+XPASS: g++.dg/pch/line-map-3.C  -O2 -g -I. -Dwith_PCH  at line 17 (test for
bogus messages, line 2)
+FAIL: g++.dg/pch/line-map-3.C  -O2 -g -I. -Dwith_PCH (test for excess errors)
+XPASS: g++.dg/pch/line-map-3.C  -g -I. -Dwith_PCH  (test for bogus messages,
line 2)
+XPASS: g++.dg/pch/line-map-3.C  -g -I. -Dwith_PCH  at line 17 (test for bogus
messages, line 2)
+FAIL: g++.dg/pch/line-map-3.C  -g -I. -Dwith_PCH (test for excess errors)

both 32 and 64-bit.  The excess error is

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/pch/line-map-3.C:3:9: error:
macro "UNUSED_MACRO" is not used [-Werror=unused-macros]
/vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/pch/line-map-3.C:3:9: error:
macro "with_PCH" is not used [-Werror=unused-macros]

When checking sparc-sun-solaris2.11 for comparison, the output is different:

./line-map-3.H:2: error: macro "UNUSED_MACRO" is not used
[-Werror=unused-macros]
./line-map-3.H:2: error: macro "with_PCH" is not used [-Werror=unused-macros]

This is totally strange since the setup of both systems is identical; they're
even building from a shared source tree.

[Bug c++/109867] -Wswitch-default reports missing default in coroutine

2024-01-30 Thread piotrwn1 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109867

Piotr Nycz  changed:

   What|Removed |Added

 CC||piotrwn1 at gmail dot com

--- Comment #1 from Piotr Nycz  ---
This also happens for a void coroutine (return_void()).
Quite annoying.

[Bug c++/81271] gcc/cp/lex.c:116: wrong condition ?

2024-01-30 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81271

Jason Merrill  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org

--- Comment #4 from Jason Merrill  ---
What tool did this warning come from?

[Bug preprocessor/105608] [11/12/13/14 Regression] ICE: in linemap_add with a really long defined macro on the command line r11-338-g2a0225e47868fbfc

2024-01-30 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608

--- Comment #11 from Lewis Hyatt  ---
Oh interesting. So the purpose of this test was just to record that GCC outputs
incorrect locations for this case, I wanted to xfail it and then fix it
properly for GCC 15. I did not consider that it might output different wrong
locations for different platforms, but I could buy that it may happen, for a
similar reason why this switched from being silently broken to ICEing since
r11-338 which was seemingly unrelated. It seems like in one case the wrong
location is inside the header file and in the other case, the wrong location is
the line just following the include. It may have to do with line endings or
some other issue with the treatment of EOF? If this test is causing problems we
could just skip it on some architectures maybe? Once the underlying issue is
fixed, the location (line 2 of the .C file) will be correct everywhere. I am
curious why it gets a different wrong output though, if there is a compile farm
machine with this architecture I could look into it.

[Bug d/113667] New: [14 Regression] libgphobos symbols missing

2024-01-30 Thread doko at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113667

Bug ID: 113667
   Summary: [14 Regression] libgphobos symbols missing
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: d
  Assignee: ibuclaw at gdcproject dot org
  Reporter: doko at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57261
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57261&action=edit
symbols diff

libgphobos is dropping a bunch of symbols in GCC 14. Is a soname bump required
for the release?

[Bug tree-optimization/111268] [11/12/13/14 Regression] internal compiler error: in to_constant, at poly-int.h:504

2024-01-30 Thread ricbal02 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111268

Richard Ball  changed:

   What|Removed |Added

 CC||ricbal02 at gcc dot gnu.org

--- Comment #15 from Richard Ball  ---
The following patch fixes the neon-sve-bridge.c regression around this bug.
https://sourceware.org/pipermail/gcc-patches/2024-January/644432.html

[Bug go/113668] New: [14 Regression] libgo soname bump needed for the GCC 14 release?

2024-01-30 Thread doko at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113668

Bug ID: 113668
   Summary: [14 Regression] libgo soname bump needed for the GCC
14 release?
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: doko at gcc dot gnu.org
  Target Milestone: ---

Is a libgo soname bump needed for the GCC 14 release? Or can the current libgo
also be used by GCC 13?

[Bug tree-optimization/113622] [11/12/13 Regression] ICE with vectors in named registers

2024-01-30 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #23 from rguenther at suse dot de  ---
On Tue, 30 Jan 2024, xry111 at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622
> 
> --- Comment #22 from Xi Ruoyao  ---
> On x86_64:
> 
> $ cat t.c
> typedef float __attribute__ ((vector_size (16))) vec;
> typedef int __attribute__ ((vector_size (16))) ivec;
> register vec a asm("xmm0"), b asm("xmm1");
> register ivec c asm("xmm2");
> 
> void
> test (void)
> {
>   for (int i = 0; i < 4; i++)
>     c[i] = a[i] < b[i] ? -1 : 1;
> }
> $ gcc/cc1 -msse2 t.c -O2 -fno-vect-cost-model -nostdinc -ffixed-xmm{0,1,2}
> t.c: In function 'test':
> t.c:7:1: internal compiler error: in expand_expr_addr_expr_1, at expr.cc:9139
> 7 | test (void)
>   | ^~~~
> 0x10e6d6e expand_expr_addr_expr_1
> ../../gcc/gcc/expr.cc:9139
> 0x10e76e2 expand_expr_addr_expr
> ../../gcc/gcc/expr.cc:9252
> 0x10f73a7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> ../../gcc/gcc/expr.cc:12585
> 0x10e7dc8 expand_expr_real(tree_node*, rtx_def*, machine_mode, 
> expand_modifier,
> rtx_def**, bool)
> ../../gcc/gcc/expr.cc:9440
> 0xef7346 expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
> ../../gcc/gcc/expr.h:316
> 0x10e91fa expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
> expand_modifier)
> ../../gcc/gcc/expr.cc:9762
> 0x10ef77d expand_expr_real_gassign(gassign*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> ../../gcc/gcc/expr.cc:11096
> 0xf2db31 expand_gimple_stmt_1
> ../../gcc/gcc/cfgexpand.cc:4010
> 0xf2ddd4 expand_gimple_stmt
> ../../gcc/gcc/cfgexpand.cc:4071
> 0xf36844 expand_gimple_basic_block
> ../../gcc/gcc/cfgexpand.cc:6127
> 0xf38ff8 execute
> ../../gcc/gcc/cfgexpand.cc:6866
> 
> Should I open a new ticket or add back 14 Regression to the subject?

Please open a new ticket - this seems to be another vectorizer issue.

We end up with the invalid

_28 = (sizetype) &a;

[Bug preprocessor/105608] [11/12/13/14 Regression] ICE: in linemap_add with a really long defined macro on the command line r11-338-g2a0225e47868fbfc

2024-01-30 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608

--- Comment #12 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #11 from Lewis Hyatt  ---
> Oh interesting. So the purpose of this test was just to record that GCC 
> outputs
> incorrect locations for this case, I wanted to xfail it and then fix it
> properly for GCC 15. I did not consider that it might output different wrong
> locations for different platforms, but I could buy that it may happen, for a
> similar reason why this switched from being silently broken to ICEing since
> r11-338 which was seemingly unrelated. It seems like in one case the wrong
> location is inside the header file and in the other case, the wrong location 
> is
> the line just following the include. It may have to do with line endings or
> some other issue with the treatment of EOF? If this test is causing problems 
> we

It's still weird given that it's exactly the same version of Solaris on
both SPARC and x86.

> could just skip it on some architectures maybe? Once the underlying issue is

I guess that's best for now.  I'll check if the test behaves differently
for a 64-bit-default (amd64-pc-solaris2.11) compiler.

> fixed, the location (line 2 of the .C file) will be correct everywhere. I am
> curious why it gets a different wrong output though, if there is a compile 
> farm
> machine with this architecture I could look into it.

There's no Solaris/x86 system in the cfarm right now, unfortunately.
The only one runs Solaris 11.3/SPARC, where the test works just like
everywhere else.

That said, I've acquired systems to add to the cfarm that will both be
running current Solaris 11.4 (SPARC and x86).  I'm working on installing
and integrating them as we speak, but I don't have an ETA yet.

[Bug libstdc++/113512] Incorrect results for std::format("{:#.3g}", flt)

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113512

--- Comment #2 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely
:

https://gcc.gnu.org/g:1bb2e52cc69902c7bf00fbdd094e948803222946

commit r13-8261-g1bb2e52cc69902c7bf00fbdd094e948803222946
Author: Jonathan Wakely 
Date:   Sat Jan 20 00:44:12 2024 +

libstdc++: Fix std::format floating-point alternate forms [PR113512]

The logic for handling '#' forms was ... not good. The count of
significant figures just counted digits, instead of ignoring leading
zeros. And when moving the result from the stack buffer to a dynamic
string the exponent could get lost in some cases.

libstdc++-v3/ChangeLog:

PR libstdc++/113512
* include/std/format (__formatter_fp::format): Fix logic for
alternate forms.
* testsuite/std/format/functions/format.cc: Check buggy cases of
alternate forms with g presentation type.

(cherry picked from commit a57439d61937925cec48df6166b2a805ae7054d5)
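
As a reference point for what "significant figures" means here, C's printf
family supports the same '#' and 'g' combination. The sketch below uses
snprintf with "%#.3g"; the wrapper name format_alt_g3 is made up, and the
assumption that std::format("{:#.3g}", x) is meant to agree with printf for
ordinary finite values is mine, not something stated in the commit.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: format value with three significant digits in
   the alternate 'g' form, which keeps trailing zeros.  Returns what
   snprintf returns.  */
int
format_alt_g3 (char *buf, size_t size, double value)
{
  return snprintf (buf, size, "%#.3g", value);
}
```

With a conforming C library, 1.0 formats as "1.00" and 0.001 as "0.00100". In
the second case the zeros before the 1 are not counted toward the three
significant digits, which is the distinction the commit message describes.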

[Bug libstdc++/113500] Using std::format with float or double based std::chrono::time_point causes error: no match for 'operator<<'

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113500

--- Comment #12 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely
:

https://gcc.gnu.org/g:a5aca83ca9c7fac895d10eb7b3e14b1927ec1eac

commit r13-8263-ga5aca83ca9c7fac895d10eb7b3e14b1927ec1eac
Author: Jonathan Wakely 
Date:   Sun Jan 21 18:16:14 2024 +

libstdc++: Fix std::format for floating-point chrono::time_point [PR113500]

Currently trying to use std::format with certain specializations of
std::chrono::time_point is ill-formed, due to one member function of the
__formatter_chrono type which tries to write a time_point to an ostream.
For sys_time or sys_time with a period greater than days
there is no operator<< that can be used.

That operator<< is only needed when using an empty chrono-specs in the
format string, like "{}", but the ill-formed expression gives an error
even if not actually used. This means it's not possible to format some
other specializations of chrono::time_point, even when using a non-empty
chrono-specs.

This fixes it by avoiding using 'os << t' for all chrono::time_point
specializations, and instead using std::format("{:L%F %T}", t). So that
we continue to reject std::format("{}", sys_time{1.0s}) a check for
empty chrono-specs is added to the formatter, C>
specialization.

While testing this I noticed that the output for %S with a
floating-point duration was incorrect, as the subseconds part was being
appended to the seconds without a decimal point, and without the correct
number of leading zeros.

libstdc++-v3/ChangeLog:

PR libstdc++/113500
* include/bits/chrono_io.h (__formatter_chrono::_M_S): Fix
printing of subseconds with floating-point rep.
(__formatter_chrono::_M_format_to_ostream): Do not write
time_point specializations directly to the ostream.
(formatter, C>::parse): Do not allow an
empty chrono-spec if the type fails to meet the constraints for
writing to an ostream with operator<<.
* testsuite/std/time/clock/file/io.cc: Check formatting
non-integral times with empty chrono-specs.
* testsuite/std/time/clock/gps/io.cc: Likewise.
* testsuite/std/time/clock/utc/io.cc: Likewise.
* testsuite/std/time/hh_mm_ss/io.cc: Likewise.

(cherry picked from commit 7431fcea6b72beb54abb1932c254ac0e76bd0bde)

[Bug sanitizer/113669] New: -fsanitize=undefined failed to check a signed integer overflow

2024-01-30 Thread jiajing_zheng at 163 dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113669

Bug ID: 113669
   Summary: -fsanitize=undefined failed to check a signed integer
overflow
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jiajing_zheng at 163 dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org
  Target Milestone: ---

I applied loop-invariant code motion to an expression in source.c to obtain
mutation.c.
Both files contain a signed integer overflow.
I checked both files using -fsanitize=undefined at the -O0, -O1, -O2, -O3 and
-Os optimization levels. For mutation.c, 'signed integer overflow' was reported
at -O0, -O1, -O3 and -Os but was missing at -O2. For source.c, the message was
missing at all of these optimization levels.

jing@jing-ubuntu:~$ cat source.c 

static int g_B = -66265337;
static unsigned char g_A[2] = {0b00110110, 0b0010};

static void func_1(void);

static void func_1(void) {
  char *arr[4];
  char ch = '1';
  int i;
  for (i = 0; i < 4; i++) {
    // source statement:
    g_A[0] += ((int)(g_B * g_A[1])) & (g_A[1] & g_A[0]) | g_A[0];
    arr[i] = &ch;
  }
}

int main(void) {
  func_1();
  return 0;
}

jing@jing-ubuntu:~$ cat mutation.c 

static int g_B = -66265337;
static unsigned char g_A[2] = {0b00110110, 0b0010};

static void func_1(void);

static void func_1(void) {
  char *arr[4];
  char ch = '1';
  int i;
  // loop-invariant expression motion:
  int temp = (int)(g_B * g_A[1]);
  for (i = 0; i < 4; i++) {
    // mutation statement:
    g_A[0] += temp & (g_A[1] & g_A[0]) | g_A[0];
    arr[i] = &ch;
  }
}

int main(void) {
  func_1();
  return 0;
}


results for source.c:
jing@jing-ubuntu:~$ gcc source.c -fsanitize=undefined,address -O0 && ./a.out
jing@jing-ubuntu:~$ gcc source.c -fsanitize=undefined,address -O1 && ./a.out
jing@jing-ubuntu:~$ gcc source.c -fsanitize=undefined,address -O2 && ./a.out
jing@jing-ubuntu:~$ gcc source.c -fsanitize=undefined,address -O3 && ./a.out
jing@jing-ubuntu:~$ gcc source.c -fsanitize=undefined,address -Os && ./a.out

result for mutation.c at -O2:
jing@jing-ubuntu:~$ gcc mutation.c -fsanitize=undefined,address -O2 && ./a.out

results for mutation.c at -O0,-O1,-O3,-Os:
jing@jing-ubuntu:~$ gcc mutation.c -fsanitize=undefined,address -O0 && ./a.out
mutation.c:12:7: runtime error: signed integer overflow: 122 * -66265337 cannot
be represented in type 'int'
jing@jing-ubuntu:~$ gcc mutation.c -fsanitize=undefined,address -O1 && ./a.out
mutation.c:12:7: runtime error: signed integer overflow: 122 * -66265337 cannot
be represented in type 'int'
jing@jing-ubuntu:~$ gcc mutation.c -fsanitize=undefined,address -O3 && ./a.out
mutation.c:12:7: runtime error: signed integer overflow: 122 * -66265337 cannot
be represented in type 'int'
jing@jing-ubuntu:~$ gcc mutation.c -fsanitize=undefined,address -Os && ./a.out
mutation.c:12:7: runtime error: signed integer overflow: 122 * -66265337 cannot
be represented in type 'int'


jing@jing-ubuntu:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/jing/gcc-12.2.0/usr/local/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure -enable-checking=release -enable-languages=c,c++
-disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.2.0 (GCC)
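
The overflow reported for mutation.c can be confirmed independently of the
sanitizer. The sketch below is not part of the report: the function name
int_mul_overflows is made up, and it assumes long long is at least twice as
wide as int (true on x86_64-pc-linux-gnu), so the widened product cannot
itself overflow.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: returns true if a * b does not fit in int.
   Assumes long long is at least twice as wide as int.  */
bool
int_mul_overflows (int a, int b)
{
  /* Multiply in the wider type, then compare against the int range.  */
  long long wide = (long long) a * (long long) b;
  return wide > INT_MAX || wide < INT_MIN;
}
```

For the operands in the runtime error, 122 * -66265337 is -8084371114, which is
below INT_MIN for a 32-bit int, so int_mul_overflows (122, -66265337) is true.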

[Bug c++/113640] 'deducing this' lambda invoked multiple times unexpectedly

2024-01-30 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113640

Patrick Palka  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org
   Last reconfirmed||2024-01-30
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
 CC||ppalka at gcc dot gnu.org

[Bug c++/113644] [14 regression] ICE when building libcxxabi-16.0.6 since r14-6520

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113644

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:af37bef86199e50368cbfbc97befe0622a07f12f

commit r14-8630-gaf37bef86199e50368cbfbc97befe0622a07f12f
Author: Patrick Palka 
Date:   Tue Jan 30 10:13:41 2024 -0500

c++: unifying integer parm with type-dep arg [PR113644]

Here when trying to unify P=42 A=T::value we ICE due to the latter's
empty type, which same_type_p dislikes.

PR c++/113644

gcc/cp/ChangeLog:

* pt.cc (unify) : Handle NULL_TREE type.

gcc/testsuite/ChangeLog:

* g++.dg/template/nontype30.C: New test.

[Bug c++/113644] [14 regression] ICE when building libcxxabi-16.0.6 since r14-6520

2024-01-30 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113644

Patrick Palka  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Patrick Palka  ---
Fixed.

[Bug libstdc++/113500] Using std::format with float or double based std::chrono::time_point causes error: no match for 'operator<<'

2024-01-30 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113500

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #13 from Jonathan Wakely  ---
Fixed for 13.3 now too. Thanks for the report.

[Bug target/111677] [12/13/14 Regression] darktable build on aarch64 fails with unrecognizable insn due to -fstack-protector changes

2024-01-30 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677

Alex Coplan  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #24 from Alex Coplan  ---
Proposed fix for trunk:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/61.html

[Bug sanitizer/113669] -fsanitize=undefined failed to check a signed integer overflow

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113669

--- Comment #1 from Jakub Jelinek  ---
This is because the FE already optimizes it: when it sees that
((int)(g_B * g_A[1])) & (g_A[1] & g_A[0]) | g_A[0]
is just being added to an unsigned char element, the upper bits of it aren't
needed, so the multiplication and & and | are all performed in unsigned char
rather than wider types.
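
The narrowing described above can be checked with a small sketch. Both function
names are made up, and this is not what the front end literally emits. The
wide variant multiplies in unsigned int rather than int, so that the sketch
itself has no undefined behaviour. Because only the low bits of the result are
stored back into an unsigned char, both variants return the same value.

```c
/* Hypothetical model of the statement from source.c, evaluated in a
   wide type.  Unsigned arithmetic wraps, so nothing here is undefined.  */
unsigned char
update_wide (unsigned char a0, unsigned char a1, int b)
{
  unsigned int prod = (unsigned int) b * a1;
  unsigned int rhs = (prod & (unsigned int) (a1 & a0)) | a0;
  return (unsigned char) (a0 + rhs);
}

/* The same statement with every intermediate result truncated to
   unsigned char, as in the narrowed form.  */
unsigned char
update_narrow (unsigned char a0, unsigned char a1, int b)
{
  /* After promotion the product is at most 255 * 255 and cannot
     overflow int.  */
  unsigned char prod = (unsigned char) ((unsigned char) b * a1);
  unsigned char rhs = (unsigned char) ((prod & (a1 & a0)) | a0);
  return (unsigned char) (a0 + rhs);
}
```

Since the narrowed multiplication never leaves the range of int, there is no
signed overflow left for -fsanitize=undefined to report in source.c.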

[Bug c++/113640] 'deducing this' lambda invoked multiple times unexpectedly

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113640

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:0857a00fe3226db8801384743b6d44353dcac9da

commit r14-8631-g0857a00fe3226db8801384743b6d44353dcac9da
Author: Patrick Palka 
Date:   Tue Jan 30 10:44:56 2024 -0500

c++: duplicated side effects of xobj arg [PR113640]

We miscompile the below testcase because keep_unused_object_arg thinks
the object argument of an xobj member function is unused, and so it ends
up duplicating the argument's side effects.

PR c++/113640

gcc/cp/ChangeLog:

* call.cc (keep_unused_object_arg): Punt for an xobj member
function.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/explicit-obj-lambda14.C: New test.

Reviewed-by: Jason Merrill 

[Bug c++/113640] 'deducing this' lambda invoked multiple times unexpectedly

2024-01-30 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113640

Patrick Palka  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Keywords||wrong-code
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Patrick Palka  ---
Fixed, thanks for the report.

[Bug tree-optimization/113670] New: ICE with vectors in named registers and -fno-vect-cost-model

2024-01-30 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113670

Bug ID: 113670
   Summary: ICE with vectors in named registers and
-fno-vect-cost-model
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xry111 at gcc dot gnu.org
  Target Milestone: ---

$ cat t.c
typedef float __attribute__ ((vector_size (16))) vec;
typedef int __attribute__ ((vector_size (16))) ivec;
register vec a asm("xmm0"), b asm("xmm1");
register ivec c asm("xmm2");

void
test (void)
{
  for (int i = 0; i < 4; i++)
    c[i] = a[i] < b[i] ? -1 : 1;
}
$ gcc/cc1 -msse2 t.c -O2 -fno-vect-cost-model -nostdinc -ffixed-xmm{0,1,2}
t.c: In function 'test':
t.c:7:1: internal compiler error: in expand_expr_addr_expr_1, at expr.cc:9139
7 | test (void)
  | ^~~~
0x10e6d6e expand_expr_addr_expr_1
../../gcc/gcc/expr.cc:9139
0x10e76e2 expand_expr_addr_expr
../../gcc/gcc/expr.cc:9252
0x10f73a7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc/gcc/expr.cc:12585
0x10e7dc8 expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc/gcc/expr.cc:9440
0xef7346 expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
../../gcc/gcc/expr.h:316
0x10e91fa expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
../../gcc/gcc/expr.cc:9762
0x10ef77d expand_expr_real_gassign(gassign*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc/gcc/expr.cc:11096
0xf2db31 expand_gimple_stmt_1
../../gcc/gcc/cfgexpand.cc:4010
0xf2ddd4 expand_gimple_stmt
../../gcc/gcc/cfgexpand.cc:4071
0xf36844 expand_gimple_basic_block
../../gcc/gcc/cfgexpand.cc:6127
0xf38ff8 execute
../../gcc/gcc/cfgexpand.cc:6866

[Bug tree-optimization/113670] ICE with vectors in named registers and -fno-vect-cost-model

2024-01-30 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113670

Xi Ruoyao  changed:

   What|Removed |Added

  Known to fail|13.2.0  |

--- Comment #1 from Xi Ruoyao  ---
It's difficult to say when this started because in previous releases another
ICE (PR113622) happens anyway.

[Bug tree-optimization/113670] ICE with vectors in named registers and -fno-vect-cost-model

2024-01-30 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113670

--- Comment #2 from Xi Ruoyao  ---
Quoting the observation from Richard:

> We end up with the invalid
> 
> _28 = (sizetype) &a;

[Bug c++/113582] incorrect warning about unused label with `pragma GCC diagnostic` around the unused label

2024-01-30 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113582

--- Comment #6 from Marek Polacek  ---
Patch approved for GCC 15:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643999.html

[Bug c++/110358] requesting nicer suppression for Wdangling-reference

2024-01-30 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110358

--- Comment #4 from Marek Polacek  ---
Patch for [[gnu::non_owning]] posted:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643998.html

Not sure how important it is to accept the optional argument.

[Bug c++/81271] gcc/cp/lex.c:116: wrong condition ?

2024-01-30 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81271

--- Comment #5 from David Binderman  ---
(In reply to Jason Merrill from comment #4)
> What tool did this warning come from?

Looks like cppcheck to me.

[Bug c++/113300] GCC rejects valid program involving copy list initialization A a = {} of a class with explicit and non explicit default constructors

2024-01-30 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113300

--- Comment #5 from Marek Polacek  ---
Not a regression so I think it has to wait until GCC 15.  I'd like to take a
look then.

I've updated https://gcc.gnu.org/projects/cxx-dr-status.html

[Bug fortran/113671] New: Passing allocatable character(:) slices with negative stride: invalid memory access / segfault

2024-01-30 Thread orbisvicis+gcc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113671

Bug ID: 113671
   Summary: Passing allocatable character(:) slices with negative
stride: invalid memory access / segfault
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: orbisvicis+gcc at gmail dot com
  Target Milestone: ---

Created attachment 57262
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57262&action=edit
Various conditions to explore the invalid memory access

When passing an allocatable character(:) slice with negative stride, gfortran
either segfaults or accesses invalid memory.

* GNU Fortran (GCC) 11.4.0 on Cygwin 3.4.7-1.x86_64 segfaults (invalid memory
access)
* x86_64 GFortran 13.2|trunk on godbolt.org accesses invalid memory

Actual output:

```
testing:1   3   5
   one  

   !
```

Expected output:

```
testing:1   3   5
   three
   two  
   one
```
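
To make the expected behaviour concrete, here is a sketch in C of what an array
section with a negative stride selects. It is not derived from the attachment,
which is not reproduced here. The function names are made up, and the element
count follows the usual Fortran rule max(0, (last - first + stride) / stride)
for 1-based bounds.

```c
/* Hypothetical helper: number of elements in the section
   first:last:stride.  stride must be nonzero.  */
long
section_count (long first, long last, long stride)
{
  long n = (last - first + stride) / stride;
  return n > 0 ? n : 0;
}

/* Hypothetical helper: copy the selected element pointers of the 1-based
   array src into dst and return how many were copied.  dst must have
   room for section_count (first, last, stride) entries.  */
long
copy_section (const char *const *src, long first, long last, long stride,
              const char **dst)
{
  long n = section_count (first, last, stride);
  for (long k = 0; k < n; k++)
    dst[k] = src[first - 1 + k * stride];
  return n;
}
```

Applied to the three strings "one", "two", "three" with bounds 3:1:-1, this
yields "three", "two", "one", the same order as in the expected output above.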

[Bug c++/113451] [14 regression] 32-bit g++.dg/abi/mangle-regparm1a.C FAILs

2024-01-30 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113451

Jason Merrill  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org
   Last reconfirmed||2024-01-30

[Bug c++/113443] GCC rejects valid program involving parameter packs with in between class type

2024-01-30 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113443

Marek Polacek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-30
 Ever confirmed|0   |1

--- Comment #5 from Marek Polacek  ---
Confirmed, then.

[Bug c++/113451] [14 regression] 32-bit g++.dg/abi/mangle-regparm1a.C FAILs

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113451

--- Comment #2 from GCC Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:dd7aa986fd12fc24e9d2efb8a8b267acb2bf19ea

commit r14-8632-gdd7aa986fd12fc24e9d2efb8a8b267acb2bf19ea
Author: Jason Merrill 
Date:   Tue Jan 30 11:36:53 2024 -0500

testsuite: mangle-reparm1a options [PR113451]

When I added -fabi-compat-version=8 to avoid mangling aliases it also
suppressed the -Wabi warning.

PR c++/113451

gcc/testsuite/ChangeLog:

* g++.dg/abi/mangle-regparm1a.C: Use -Wabi=0.

[Bug c++/113451] [14 regression] 32-bit g++.dg/abi/mangle-regparm1a.C FAILs

2024-01-30 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113451

Jason Merrill  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Jason Merrill  ---
Fixed.

[Bug c++/110075] Bogus -Wdangling-reference

2024-01-30 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110075

--- Comment #7 from Marek Polacek  ---
It's hard to decide.  It seems that once we've covered std::span-like classes,
in practice the warning points out real issues.  I would hope that code like
you posted is actually fairly rare.  But it's difficult to be sure.

[Bug c++/112846] [14 Regression] nvptx: 'FAIL: g++.dg/abi/anon6.C -std=c++20 scan-assembler _Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv'

2024-01-30 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112846

Jason Merrill  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-30
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug c++/113649] ICE: nested template class template argument deduction

2024-01-30 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113649

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org,
   ||ppalka at gcc dot gnu.org

--- Comment #3 from Marek Polacek  ---
Seems the ICE started with r13-4876-gbd1fc4a219d8c0:

commit bd1fc4a219d8c0fad0ec41002e895b49e384c1c2
Author: Patrick Palka 
Date:   Fri Dec 23 09:18:37 2022 -0500

c++: template friend with variadic constraints [PR107853]

[Bug libstdc++/113512] Incorrect results for std::format("{:#.3g}", flt)

2024-01-30 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113512

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Jonathan Wakely  ---
Fixed for 13.3

[Bug tree-optimization/113639] ICE: in handle_operand_addr, at gimple-lower-bitint.cc:2265 at -O with _BitInt() in a struct

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113639

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Created attachment 57263
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57263&action=edit
gcc14-pr113639.patch

Untested fix.

[Bug debug/113637] ICE: in as_a, at machmode.h:381 with extern function declaration and _BitInt() used as VLA size

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113637

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #3 from Jakub Jelinek  ---
Created attachment 57264
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57264&action=edit
gcc14-pr113637.patch

Types with BLKmode TYPE_MODE will certainly be larger than DWARF2_ADDR_SIZE...

[Bug c++/112846] [14 Regression] nvptx: 'FAIL: g++.dg/abi/anon6.C -std=c++20 scan-assembler _Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv'

2024-01-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112846

--- Comment #2 from GCC Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:209fc1e5f6c67e55e579b69f617b0b678b1bfdf0

commit r14-8633-g209fc1e5f6c67e55e579b69f617b0b678b1bfdf0
Author: Jason Merrill 
Date:   Tue Jan 30 12:07:21 2024 -0500

testsuite: fix anon6 mangling [PR112846]

As with r14-6796-g2fa122cae50cd8, avoid mangling compatibility aliases in
mangling tests, and test the new mangling.

PR c++/112846

gcc/testsuite/ChangeLog:

* g++.dg/abi/anon6.C: Specify ABI v18.
* g++.dg/abi/anon6a.C: New test for ABI v19.

[Bug c++/112846] [14 Regression] nvptx: 'FAIL: g++.dg/abi/anon6.C -std=c++20 scan-assembler _Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv'

2024-01-30 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112846

Jason Merrill  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Jason Merrill  ---
Fixed.  I assume you were seeing the issue on nvptx because it doesn't use
mangling aliases.

[Bug ipa/111444] [14 Regression] Wrong code at -O2/3/s on x86_64-gnu since r14-3226-gd073e2d75d9

2024-01-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111444

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Slightly reduced testcase:
int a = 3, d, e;
const int **g;

static void
foo (int **i, int **j)
{
  const int *k[46];
  const int **l = &k[5];
  *j = &e;
  for (g = l; d; d = d + 1)
    ;
  **i = 0;
}

int
main ()
{
  int *m = &a;
  foo (&m, &m);
  if (a != 3)
    __builtin_abort ();
}

This goes wrong during PRE.
