[Bug tree-optimization/113551] [13 Regression] Miscompilation with -O1 -funswitch-loops -fno-strict-overflow

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113551

Richard Biener  changed:

           What    |Removed                     |Added

           Assignee|rguenth at gcc dot gnu.org  |unassigned at gcc dot gnu.org
             Status|ASSIGNED                    |NEW
           Priority|P3                          |P2
      Known to work|                            |14.0
            Summary|[13/14 Regression]          |[13 Regression]
                   |Miscompilation with -O1     |Miscompilation with -O1
                   |-funswitch-loops            |-funswitch-loops
                   |-fno-strict-overflow        |-fno-strict-overflow
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #7 from Richard Biener  ---
The IL after unswitching looks OK, but we assume that when &dso->i is NULL
then dso == NULL and when &dso->i is not NULL then dso also isn't.

I think this is a ranger bug that has been fixed on trunk
but possibly not yet backported, so we likely have a duplicate somewhere.

Bisection will tell.
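
A rough way to see why that inference is fragile, under assumptions that are
not in the report: the struct layout below (dso_like, with i at a non-zero
offset) is hypothetical, and addresses are modelled as wrapping uintptr_t
arithmetic, which is roughly what -fno-strict-overflow allows for pointers as
far as I understand it.  This is a model of the address computation only, not
the testcase from the bug.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout; the real struct from the testcase is not shown here.
struct dso_like {
  long pad;  // pushes `i` to a non-zero offset
  int i;
};

// Model the address computation "&p->i" as wrapping unsigned arithmetic.
constexpr std::uintptr_t member_addr(std::uintptr_t base) {
  return base + offsetof(dso_like, i);
}

// The (non-null) base address for which the member address wraps to zero.
constexpr std::uintptr_t base_for_null_member() {
  return std::uintptr_t{0} - offsetof(dso_like, i);
}

// In this model neither implication holds once the offset is non-zero.
static_assert(member_addr(0) != 0, "null base, non-null member address");
static_assert(member_addr(base_for_null_member()) == 0,
              "non-null base, null member address");
```

So a null check on &dso->i and a null check on dso are only interchangeable
when the member sits at offset zero or when wrapping can be ruled out.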

[Bug tree-optimization/113462] ICE: in handle_cast, at gimple-lower-bitint.cc:1539 at -O with _BitInt() in a struct

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113462

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:5015015ae6b29b3f1734c7693ba25b88cdd531a1

commit r14-8349-g5015015ae6b29b3f1734c7693ba25b88cdd531a1
Author: Jakub Jelinek 
Date:   Tue Jan 23 09:02:48 2024 +0100

fold-const: Fold larger VIEW_CONVERT_EXPRs [PR113462]

On Mon, Jan 22, 2024 at 11:27:52AM +0100, Richard Biener wrote:
> We run into
>
> static tree
> native_interpret_int (tree type, const unsigned char *ptr, int len)
> {
> ...
>   if (total_bytes > len
>       || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT)
>     return NULL_TREE;
>
> OTOH using a V_C_E to "truncate" a _BitInt looks wrong?  OTOH the
> check doesn't really handle native_encode_expr using the "proper"
> wide_int encoding however that's exactly handled.  So it might be
> a pre-existing issue that's only uncovered by large _BitInts
> (__int128 might show similar issues?)

I guess the || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT
condition makes no sense; all we care about is whether the value fits in
the buffer or not.
But then there is fold_view_convert_expr (and other spots), which use
  /* We support up to 1024-bit values (for GCN/RISC-V V128QImode).  */
  unsigned char buffer[128];
or something similar.
This patch fixes even that by using an XALLOCAVEC-allocated buffer
if the type size is 129 .. 8192 bytes.

2024-01-22  Jakub Jelinek  

PR tree-optimization/113462
* fold-const.cc (native_interpret_int): Don't punt if total_bytes
is larger than HOST_BITS_PER_DOUBLE_INT / BITS_PER_UNIT.
(fold_view_convert_expr): Use XALLOCAVEC buffers for types with
sizes between 129 and 8192 bytes.
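
The buffer selection described above can be sketched roughly as follows.  This
is not the code from fold-const.cc: stage_bytes is a hypothetical helper, and
__builtin_alloca (a GCC/Clang builtin) stands in for XALLOCAVEC.  Only the
128-byte and 8192-byte thresholds are taken from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies `len` bytes from src to dst through a temporary staging buffer.
// Returns false ("punt") when the value is too large to stage.
bool stage_bytes(const unsigned char *src, std::size_t len,
                 unsigned char *dst) {
  assert(src != nullptr && dst != nullptr);
  unsigned char fixed[128];  // covers values of up to 1024 bits
  unsigned char *buf;
  if (len <= sizeof fixed)
    buf = fixed;
  else if (len <= 8192)
    // Stack allocation, released automatically when this function returns.
    buf = static_cast<unsigned char *>(__builtin_alloca(len));
  else
    return false;
  std::memcpy(buf, src, len);
  std::memcpy(dst, buf, len);
  return true;
}
```

The upper bound keeps the stack usage predictable; anything larger is simply
not handled rather than risking a huge alloca.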

[Bug tree-optimization/113551] [13 Regression] Miscompilation with -O1 -funswitch-loops -fno-strict-overflow

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113551

Richard Biener  changed:

   What|Removed |Added

 CC||amacleod at redhat dot com

--- Comment #8 from Richard Biener  ---
Note on trunk we have jump-threaded this to move dso == (void *)0 || dso ==
this out of the loop so there's nothing to unswitch and the bad circumstances
likely do not trigger.

I still remember the ranger bug though.

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

Richard Biener  changed:

           What    |Removed                     |Added

     Ever confirmed|0                           |1
           Priority|P1                          |P2
                 CC|                            |rguenth at gcc dot gnu.org
   Last reconfirmed|                            |2024-01-23
             Status|UNCONFIRMED                 |WAITING
   Target Milestone|14.0                        |11.5

--- Comment #1 from Richard Biener  ---
Hum, the vectorizer looks at the simd specs and if it says 1-lane variants
(simdlen == 1) are available it will happily create them.

Can you provide the testcase amended with the used SIMD "declarations"
(as with the fortran syntax or with a C testcase)?

[Bug rust/113553] New: rust fails to build on sparc64-linux-gnu

2024-01-23 Thread doko at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113553

Bug ID: 113553
   Summary: rust fails to build on sparc64-linux-gnu
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rust
  Assignee: unassigned at gcc dot gnu.org
  Reporter: doko at gcc dot gnu.org
CC: dkm at gcc dot gnu.org, gcc-rust at gcc dot gnu.org
  Target Milestone: ---

seen with trunk 20240121 on sparc64-linux-gnu:

[...]
../../../../src/libgrust/libproc_macro_internal/literal.cc: In static member
function 'static ProcMacro::Literal ProcMacro::Literal::make_f32(float,
bool)':
../../../../src/libgrust/libproc_macro_internal/literal.cc:155:57: error: call
of overloaded 'to_string(float&)' is ambiguous
  155 |   auto text = FFIString::make_ffistring (std::to_string (value));
  |  ~~~^~~
In file included from
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/string:54,
 from
../../../../src/libgrust/libproc_macro_internal/literal.h:27,
 from
../../../../src/libgrust/libproc_macro_internal/literal.cc:23:
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4240:3:
note: candidate: 'std::string std::__cxx11::to_string(int)'
 4240 |   to_string(int __val)
  |   ^
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4259:3:
note: candidate: 'std::string std::__cxx11::to_string(unsigned int)'
 4259 |   to_string(unsigned __val)
  |   ^
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4275:3:
note: candidate: 'std::string std::__cxx11::to_string(long int)'
 4275 |   to_string(long __val)
  |   ^
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4294:3:
note: candidate: 'std::string std::__cxx11::to_string(long unsigned int)'
 4294 |   to_string(unsigned long __val)
  |   ^
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4310:3:
note: candidate: 'std::string std::__cxx11::to_string(long long int)'
 4310 |   to_string(long long __val)
  |   ^
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4327:3:
note: candidate: 'std::string std::__cxx11::to_string(long long unsigned int)'
 4327 |   to_string(unsigned long long __val)
  |   ^
../../../../src/libgrust/libproc_macro_internal/literal.cc:157:70: error: could
not convert '{ProcMacro::LitKind::make_float(), text, suffix,
ProcMacro::Span::make_unknown()}' from '<brace-enclosed initializer list>' to
'ProcMacro::Literal'
  157 |   return {LitKind::make_float (), text, suffix, Span::make_unknown ()};
  |  ^
  |  |
  |  <brace-enclosed initializer list>

../../../../src/libgrust/libproc_macro_internal/literal.cc: In static member
function 'static ProcMacro::Literal ProcMacro::Literal::make_f64(double,
bool)':
../../../../src/libgrust/libproc_macro_internal/literal.cc:163:57: error: call
of overloaded 'to_string(double&)' is ambiguous
  163 |   auto text = FFIString::make_ffistring (std::to_string (value));
  |  ~~~^~~
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4240:3:
note: candidate: 'std::string std::__cxx11::to_string(int)'
 4240 |   to_string(int __val)
  |   ^
checking if /<>/build/./gcc/xgcc -B/<>/build/./gcc/
-B/usr/sparc64-linux-gnu/bin/ -B/usr/sparc64-linux-gnu/lib/ -isystem
/usr/sparc64-linux-gnu/include -isystem /usr/sparc64-linux-gnu/sys-include
-isystem /<>/build/sys-includesupports -fno-rtti
-fno-exceptions...
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4259:3:
note: candidate: 'std::string std::__cxx11::to_string(unsigned int)'
 4259 |   to_string(unsigned __val)
  |   ^
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4275:3:
note: candidate: 'std::string std::__cxx11::to_string(long int)'
 4275 |   to_string(long __val)
  |   ^
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4294:3:
note: candidate: 'std::string std::__cxx11::to_string(long unsigned int)'
 4294 |   to_string(unsigned long __val)
  |   ^
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4310:3:
note: candidate: 'std::string std::__cxx11::to_string(long long int)'
 4310 |   to_string(long long __val)
  |   ^
/<>/build/sparc64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4327:3:
note: candidate: 'std::string std::__cxx11::to_string(long long unsigned int)'
 4327 |   to_string(unsigned long long __val)
  |   ^
../../../../src/libgrust/libproc_macro_internal/literal.cc:165:70: error: could
not convert '{ProcMacro::LitK
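
The candidate lists in the log contain only integer overloads, which suggests
the floating-point std::to_string overloads were not declared in this
libstdc++ configuration; the report does not say why.  Purely as an
illustration (float_to_text is a hypothetical helper, not part of libgrust and
not necessarily what was done upstream), the same text can be produced without
those overloads by using snprintf with "%f", which is the format std::to_string
is specified to use for float and double:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a floating-point value the way
// std::to_string (double) is specified to, i.e. with "%f".
std::string float_to_text(double value) {
  const int n = std::snprintf(nullptr, 0, "%f", value);
  assert(n >= 0);  // a negative result signals an encoding error
  std::string out(static_cast<std::size_t>(n) + 1, '\0');
  std::snprintf(&out[0], out.size(), "%f", value);
  out.resize(static_cast<std::size_t>(n));  // drop the terminating NUL
  return out;
}
```

A float argument is promoted to double on the way in, so one helper covers
both make_f32-style and make_f64-style callers.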

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #12 from JuzheZhong  ---
(In reply to Richard Biener from comment #11)
> (In reply to Tamar Christina from comment #9)
> > There is a weird costing going on in the PHI nodes though:
> > 
> > m_108 = PHI  1 times vector_stmt costs 0 in body 
> > m_108 = PHI  2 times scalar_to_vec costs 0 in prologue
> > 
> > they have collapsed to 0. which can't be right..
> 
> Note this is likely because of the backend going wrong.
> 
> bool
> vectorizable_phi (vec_info *,
>                   stmt_vec_info stmt_info, gimple **vec_stmt,
>                   slp_tree slp_node, stmt_vector_for_cost *cost_vec)
> {
> ..
> 
>   /* For single-argument PHIs assume coalescing which means zero cost
>      for the scalar and the vector PHIs.  This avoids artificially
>      favoring the vector path (but may pessimize it in some cases).  */
>   if (gimple_phi_num_args (as_a <gphi *> (stmt_info->stmt)) > 1)
>     record_stmt_cost (cost_vec, SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node),
>                       vector_stmt, stmt_info, vectype, 0, vect_body);
> 
> You could check if we call this with sane values.

Do you mean it's a RISC-V backend cost model issue?

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #13 from rguenther at suse dot de  ---
On Tue, 23 Jan 2024, juzhe.zhong at rivai dot ai wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
> 
> --- Comment #12 from JuzheZhong  ---
> (In reply to Richard Biener from comment #11)
> > (In reply to Tamar Christina from comment #9)
> > > There is a weird costing going on in the PHI nodes though:
> > > 
> > > m_108 = PHI  1 times vector_stmt costs 0 in body 
> > > m_108 = PHI  2 times scalar_to_vec costs 0 in prologue
> > > 
> > > they have collapsed to 0. which can't be right..
> > 
> > Note this is likely because of the backend going wrong.
> > 
> > bool
> > vectorizable_phi (vec_info *,
> >                   stmt_vec_info stmt_info, gimple **vec_stmt,
> >                   slp_tree slp_node, stmt_vector_for_cost *cost_vec)
> > {
> > ..
> > 
> >   /* For single-argument PHIs assume coalescing which means zero cost
> >      for the scalar and the vector PHIs.  This avoids artificially
> >      favoring the vector path (but may pessimize it in some cases).  */
> >   if (gimple_phi_num_args (as_a <gphi *> (stmt_info->stmt)) > 1)
> >     record_stmt_cost (cost_vec, SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node),
> >                       vector_stmt, stmt_info, vectype, 0, vect_body);
> > 
> > You could check if we call this with sane values.
> 
> Do you mean it's a RISC-V backend cost model issue?

I responded to Tamar, which means an aarch64 cost model issue - the
specific issue being that the PHIs appear to have no cost.  I didn't look
at any of the rest.

[Bug modula2/113554] New: [14 Regression] m2 fails to build on x86_64-linux-gnux32

2024-01-23 Thread doko at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113554

Bug ID: 113554
   Summary: [14 Regression] m2 fails to build on
x86_64-linux-gnux32
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: doko at gcc dot gnu.org
  Target Milestone: ---

seen with trunk 20240121 on x86_64-linux-gnux32:

[...]
../../src/gcc/m2/mc/mc.flex:32:9: warning: "alloca" redefined
   32 | #define alloca __builtin_alloca
  | ^~
In file included from /usr/include/stdlib.h:587,
 from :22:
/usr/include/alloca.h:35:10: note: this is the location of the previous
definition
   35 | # define alloca(size)   __builtin_alloca (size)
  |  ^~
../../src/gcc/m2/mc/mc.flex: In function 'handleDate':
../../src/gcc/m2/mc/mc.flex:333:25: error: passing argument 1 of 'time' from
incompatible pointer type [-Wincompatible-pointer-types]
  333 |   time_t  clock = time ((long *)0);
  | ^
  | |
  | long int *
In file included from ../../src/gcc/m2/mc/mc.flex:28:
/usr/include/time.h:76:29: note: expected 'time_t *' {aka 'long long int *'}
but argument is of type 'long int *'
   76 | extern time_t time (time_t *__timer) __THROW;
  | ^~~
test -d m2/gm2-libs-boot/.deps || /bin/bash ../../src/gcc/../mkinstalldirs
m2/gm2-libs-boot/.deps

make[5]: *** [../../src/gcc/m2/Make-lang.in:1421: mcflex.o] Error 1
make[5]: *** Waiting for unfinished jobs
rm gfdl.pod gcc.pod gfortran.pod gpl.pod lto-dump.pod gdc.pod
make[5]: Leaving directory '/<>/build/gcc'
make[4]: *** [Makefile:5099: all-stage2-gcc] Error 2
make[4]: Leaving directory '/<>/build'
make[3]: *** [Makefile:32444: stage2-bubble] Error 2
make[3]: Leaving directory '/<>/build'
make[2]: *** [Makefile:32656: bootstrap] Error 2
make[2]: Leaving directory '/<>/build'
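
The diagnostic shows the mismatch: on this target time_t is long long int,
while the call casts the null pointer to long *.  A minimal sketch of a call
that does not depend on what time_t is underneath (current_time is a
hypothetical helper, and this is not necessarily how mc.flex was or will be
fixed; in C the equivalent would be time (NULL) or time ((time_t *) 0)):

```cpp
#include <cassert>
#include <ctime>

// Hypothetical helper: a null pointer constant converts to `time_t *`
// whatever integer type time_t happens to be on the target.
std::time_t current_time() {
  const std::time_t now = std::time(nullptr);
  assert(now != static_cast<std::time_t>(-1));  // time() returns -1 on failure
  return now;
}
```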

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #14 from JuzheZhong  ---
I just tried again with both GCC 13.2 and GCC 14, using -fno-vect-cost-model.

https://godbolt.org/z/enEG3qf5K

GCC 14 requires a scalar epilogue loop, whereas GCC 13.2 doesn't.

I believe it's not a cost model issue.

[Bug middle-end/113364] [14 regression] ICE verify_ssa: `definition in block N does not dominate use in block` with `-O3 -march=znver2`

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113364

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed|2024-01-13 00:00:00 |2024-1-23

--- Comment #12 from Richard Biener  ---
Re-confirmed after my fixes btw.

[Bug middle-end/113364] [14 regression] ICE verify_ssa: `definition in block N does not dominate use in block` with `-O3 -march=znver2`

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113364

--- Comment #13 from Tamar Christina  ---
Yes I had to rerun my baseline after updating trunk. Will post patch once peak
finishes

[Bug ipa/107931] [12/13/14 Regression] -Og causes always_inline to fail since r12-6677-gc952126870c92cf2

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107931

--- Comment #25 from Richard Biener  ---
Btw, I'd rather go the opposite way and make the testcase at hand always
invalid and diagnosed, which means diagnosing taking the address of
always-inline declared functions and never emitting an out-of-line body for
them.

[Bug c/113555] New: Yet another failure in verify_ssa

2024-01-23 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113555

Bug ID: 113555
   Summary: Yet another failure in verify_ssa
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

Today's gcc trunk says:

$ ~/gcc/results.20240123.asan.ubsan/bin/gcc -c -O3 -w bug1000.c
$ ~/gcc/results.20240123.asan.ubsan/bin/gcc -c -O3 -w -march=znver3 bug1000.c
bug1000.c: In function ‘net_roa_check_ip4_trie_tab’:
bug1000.c:20:6: error: definition in block 4 does not dominate use in block 5
   20 | void net_roa_check_ip4_trie_tab() {
  |  ^~
for SSA_NAME: vect__14.32_111 in statement:
vect__14.32_112 = PHI 
PHI argument
vect__14.32_111
for PHI node
vect__14.32_112 = PHI 
during GIMPLE pass: vect
bug1000.c:20:6: internal compiler error: verify_ssa failed

Reduced source code is

int ip4_getbit_a, ip4_getbit_pos, ip4_clrbit_pos;
void ip4_clrbit(int *a) { *a &= ip4_clrbit_pos; }
typedef struct {
  char pxlen;
  int prefix
} net_addr_ip4;
void fib_get_chain();
int trie_match_longest_ip4();
int trie_match_next_longest_ip4(net_addr_ip4 *n) {
  int __trans_tmp_1;
  while (n->pxlen) {
    n->pxlen--;
    ip4_clrbit(&n->prefix);
    __trans_tmp_1 = ip4_getbit_a >> ip4_getbit_pos;
    if (__trans_tmp_1)
      return 1;
  }
  return 0;
}
void net_roa_check_ip4_trie_tab() {
  net_addr_ip4 px0;
  for (int _n = trie_match_longest_ip4(&px0); _n;
       _n = trie_match_next_longest_ip4(&px0))
    fib_get_chain();
}

The bug seems to have existed since at least 20231227.

On a side note, 1000 bug reports and enhancement requests in 11 years.

[Bug ipa/107931] [12/13/14 Regression] -Og causes always_inline to fail since r12-6677-gc952126870c92cf2

2024-01-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107931

--- Comment #26 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #25)
> Btw, I'd rather go the opposite and make the testcase at hand always invalid
> and diagnosed which means diagnose taking the address of always-inline
> declared functions and never emit an out-of-line body for them.

That would need an exception at least for gnu extern inline always_inline
functions, because the way they are used in glibc requires &open etc. to be
valid (and then uses the out-of-line open as a fallback).
But with the exception we can still see those indirect uses optimized later
into direct calls.

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread nsz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

nsz at gcc dot gnu.org changed:

   What|Removed |Added

 CC||nsz at gcc dot gnu.org

--- Comment #2 from nsz at gcc dot gnu.org ---
is this fortran only?

glibc release is in a week, we can still do something (or backport a fix).

the vector abi does not allow 1 lane in this case
https://github.com/ARM-software/abi-aa/blob/main/vfabia64/vfabia64.rst#L867

c annotation:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/aarch64/fpu/bits/math-vector.h;h=04837bdcd7c0d0ce91192e09fc2d6614cae289c2;hb=HEAD
fortran annotation:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/aarch64/fpu/finclude/math-vector-fortran.h;h=92e15f0d6a758258f5728e628bbb2422b176fa95;hb=HEAD

i think the bug can be reproduced with older glibc by adding

!GCC$ builtin (cos) attributes simd (notinbranch)

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

--- Comment #3 from Tamar Christina  ---
(In reply to Richard Biener from comment #1)
> Hum, the vectorizer looks at the simd specs and if it says 1-lane variants
> (simdlen == 1) are available it will happily create them.
>

My understanding is that the spec just says "All SIMD variants are available"
but technically V1DF is FP not SIMD. 

> Can you provide the testcase amended with the used SIMD "declarations"
> (as with the fortran syntax or with a C testcase)?

fair point:

!GCC$ builtin (cos) attributes simd (notinbranch)

      SUBROUTINE a(b)
      DIMENSION b(3,0)
      COMMON c
      DO 4 m=1,c
         DO 4 d=1,3
         b(d,m)=b(d,m)+COS(5.0D00*m)
    4 CONTINUE
      END
      DIMENSION e(53)
      DIMENSION f(6,91),g(6,91),h(6,91),
     *  i(6,91),j(6,91),k(6,86)
      DIMENSION l(107)
      END

where just

aarch64-unknown-linux-gnu-gfortran -S -o - -Ofast -w cosmo.fppized3.f

is enough.

[Bug c++/107058] [11/12/13/14 Regression] ICE in dwarf2out_die_ref_for_decl, at dwarf2out.cc:6038 since r11-5003-gd50310408f54e380

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107058

Richard Biener  changed:

   What|Removed |Added

  Component|debug   |c++

--- Comment #5 from Richard Biener  ---
The particular reason we even stream the CONST_DECL is that it appears in

int __attribute__((aligned (A))) foo;

via the attribute list:

 
unit-size 
align:32 warn_if_not_align:0 symtab:-157035984 alias-set 2
canonical-type 0x76a305e8 precision:32 min  max 
pointer_to_this >
addressable public static SI pr50459.c:13:34 size  unit-size 
user align:1024 warn_if_not_align:0 context 
attributes 
value >>
chain >

where we "failed" to replace the CONST_DECL with its value.  When handling
the attribute we're doing

  align_expr = TREE_VALUE (args);
  if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE
      && TREE_CODE (align_expr) != FUNCTION_DECL)
    align_expr = default_conversion (align_expr);

and that resolves it to an INTEGER_CST for further processing.

I'll note that streaming out debug references from certain contexts like
attribute arguments is also unnecessary but it's difficult to selectively
disable it.

IMO the correct thing to do is for the C++ frontend to, like the C frontend,
resolve the enumerators before calling common_handle_aligned_attribute.

It's also possible to handle the assert more gracefully, though it's still a
bug.  I'm going to handle it gracefully when not checking.
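
For reference, a source-level construct of the kind discussed above, as a
sketch only.  The value 128 for A is an assumption inferred from the
"user align:1024" (bits) in the dump, and the static_assert relies on the GNU
__alignof__ extension applied to a variable:

```cpp
enum { A = 128 };  // assumed value: 128 bytes == 1024 bits

// The attribute argument is an enumerator rather than a literal constant.
int __attribute__((aligned (A))) foo;

// By the time the alignment is applied, the enumerator has been resolved
// to its integer value.
static_assert(__alignof__(foo) == 128, "enumerator resolved to its value");
```

The alignment itself is computed from the constant; only the attribute list
kept for later streaming still refers to the enumerator.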

[Bug middle-end/113540] missing -Warray-bounds warning with malloc and a simple loop

2024-01-23 Thread vincent-gcc at vinc17 dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113540

--- Comment #2 from Vincent Lefèvre  ---
Thanks for the explanations, but why in the following case

void foo (void)
{
  volatile char t[4];
  for (int i = 0; i <= 4; i++)
    t[i] = 0;
  return;
}

does one get the warning (contrary to the use of malloc)?

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

--- Comment #4 from Tamar Christina  ---
(In reply to nsz from comment #2)
> is this fortran only?
> 

No it should be C as well, I was just reducing from a Fortran workload that
failed so I can see what the vectorizer was doing.

[Bug target/113556] New: gcc.dg/vect/vect-simd-clone-16c.c etc. FAIL

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113556

Bug ID: 113556
   Summary: gcc.dg/vect/vect-simd-clone-16c.c etc. FAIL
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
CC: ams at gcc dot gnu.org
  Target Milestone: ---
Target: i386-pc-solaris2.11

Since their introduction on 20230222, several tests FAIL on Solaris/x86 (both
32 and 64-bit) when using a 32-bit compiler only:

FAIL: gcc.dg/vect/vect-simd-clone-16c.c scan-tree-dump-times vect "[nr]
[^n]* = foo.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-16d.c scan-tree-dump-times vect "[nr]
[^n]* = foo.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-17c.c scan-tree-dump-times vect "[nr]
[^n]* = foo.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-17d.c scan-tree-dump-times vect "[nr]
[^n]* = foo.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-18c.c scan-tree-dump-times vect "[nr]
[^n]* = foo.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-18d.c scan-tree-dump-times vect "[nr]
[^n]* = foo.simdclone" 2

The same tests PASS (again, 32 and 64-bit) when using a 64-bit compiler
instead.

I noticed that those tests have something like

/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect"
{ target { ! { x86_64*-*-* || { i686*-*-* || aarch64*-*-* } } } } } } */

while the 32-bit Solaris/x86 triple still is i386-pc-solaris2.11. However, that
compiler defaults to -mpentium4.

Is there anything in the tests that requires specific ISA extensions only
available on i686, or is this just an oversight, so we can switch them to use
i?86 instead?

[Bug target/113556] gcc.dg/vect/vect-simd-clone-16c.c etc. FAIL

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113556

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug tree-optimization/113524] FAIL: gcc.dg/torture/pr113026-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for bogus messages, line 10)

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113524

Rainer Orth  changed:

   What|Removed |Added

 Target|hppa*-*-linux*  |hppa*-*-linux*,
   ||sparc-sun-solaris2.11,
   ||s390x-ibm-linux-gnu,
   ||m68k-unknown-linux-gnu,
   ||i686-pc-linux-gnu,
   ||arm-unknown-linux-gnueabihf
   ||-
   Last reconfirmed||2024-01-23
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||ro at gcc dot gnu.org

--- Comment #1 from Rainer Orth  ---
I also see this on 32-bit Solaris/SPARC.  There are testresult reports for a
couple of other targets, too.  This seems to be at -O3 -fomit-frame-pointer
only, not plain -O3.

[Bug tree-optimization/113524] FAIL: gcc.dg/torture/pr113026-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for bogus messages, line 10)

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113524

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug tree-optimization/113557] gcc.dg/vect/vect-multi-peel-gaps.c FAILs

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113557

--- Comment #1 from Rainer Orth  ---
Created attachment 57192
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57192&action=edit
32-bit sparc-sun-solaris2.11 vect-multi-peel-gaps.c.179t.vect

[Bug tree-optimization/113557] gcc.dg/vect/vect-multi-peel-gaps.c FAILs

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113557

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug tree-optimization/113557] New: gcc.dg/vect/vect-multi-peel-gaps.c FAILs

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113557

Bug ID: 113557
   Summary: gcc.dg/vect/vect-multi-peel-gaps.c FAILs
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
CC: matmal01 at gcc dot gnu.org
  Target Milestone: ---
Target: sparc*-sun-solaris2.11,
mips64el-unknown-linux-gnuabi64

The gcc.dg/vect/vect-multi-peel-gaps.c test FAILs on Solaris/SPARC (both 32
and 64-bit) since it was introduced on 20230721.  I'm also seeing a report
for Linux/MIPS64EL.

FAIL: gcc.dg/vect/vect-multi-peel-gaps.c -flto -ffat-lto-objects 
scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-multi-peel-gaps.c scan-tree-dump vect "LOOP VECTORIZED"

[Bug c++/112820] vtable not emitted correctly from module when compiling with -g

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112820

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Nathaniel Shead :

https://gcc.gnu.org/g:affef534b0335592336c82918f15242576e2ab8f

commit r14-8350-gaffef534b0335592336c82918f15242576e2ab8f
Author: Nathaniel Shead 
Date:   Wed Jan 17 16:50:39 2024 +1100

c++: Fix handling of extern templates in modules [PR112820]

Currently, extern templates are detected by looking for the
DECL_EXTERNAL flag on a TYPE_DECL. However, this is incorrect:
TYPE_DECLs don't actually set this flag, and it happens to work by
coincidence due to TYPE_DECL_SUPPRESS_DEBUG happening to use the same
underlying bit. This however causes issues with other TYPE_DECLs that
also happen to have suppressed debug information.

Instead, this patch reworks the logic so CLASSTYPE_INTERFACE_ONLY is
always emitted into the module BMI and can then be used to check for an
extern template correctly.

Otherwise, for other declarations we always want to redetermine this:
even for declarations from the GMF, we may change our mind on whether to
import or export depending on decisions made later in the TU after
importing so we shouldn't decide this now, or necessarily reuse what the
module we'd imported had decided.

Some of this may need to change in the future to account for
https://github.com/itanium-cxx-abi/cxx-abi/issues/170.

PR c++/112820
PR c++/102607

gcc/cp/ChangeLog:

* module.cc (trees_out::lang_type_bools): Write interface_only
and interface_unknown.
(trees_in::lang_type_bools): Read the above flags.
(trees_in::decl_value): Reset CLASSTYPE_INTERFACE_* except for
extern templates.
(trees_in::read_class_def): Remove buggy extern template
handling.

gcc/testsuite/ChangeLog:

* g++.dg/modules/debug-2_a.C: New test.
* g++.dg/modules/debug-2_b.C: New test.
* g++.dg/modules/debug-2_c.C: New test.
* g++.dg/modules/debug-3_a.C: New test.
* g++.dg/modules/debug-3_b.C: New test.

Signed-off-by: Nathaniel Shead 
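
The failure mode described in the commit message, two accessors that happen to
read the same storage bit, can be illustrated with a toy model.  Everything
below is hypothetical (decl_like and its fields are not GCC's tree layout, and
the function names only echo the macros mentioned above); it just shows why a
dedicated flag is the sturdier check.

```cpp
#include <cassert>

// Hypothetical miniature of a declaration node; not GCC's real tree layout.
struct decl_like {
  bool is_type_decl;
  unsigned shared_flag : 1;     // one storage bit with two meanings
  unsigned interface_only : 1;  // dedicated bit for the property we want
};

// Two accessors that alias the same bit, depending on the kind of node.
bool decl_external(const decl_like &d) {
  assert(!d.is_type_decl);  // only meaningful for non-type declarations
  return d.shared_flag;
}
bool type_decl_suppress_debug(const decl_like &d) {
  assert(d.is_type_decl);
  return d.shared_flag;
}

// Fragile check: reads the shared bit on a type declaration, so it also
// fires for any type whose debug info happens to be suppressed.
bool is_extern_template_fragile(const decl_like &d) { return d.shared_flag; }

// Sturdier check: consults a bit that means exactly one thing.
bool is_extern_template(const decl_like &d) { return d.interface_only; }
```

For a type declaration with suppressed debug info that is not an extern
template, the fragile check reports true while the dedicated-bit check
reports false.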

[Bug c++/102607] [modules] option -g results in undefined reference to `typeinfo for type`

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102607

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Nathaniel Shead :

https://gcc.gnu.org/g:affef534b0335592336c82918f15242576e2ab8f

commit r14-8350-gaffef534b0335592336c82918f15242576e2ab8f
Author: Nathaniel Shead 
Date:   Wed Jan 17 16:50:39 2024 +1100

c++: Fix handling of extern templates in modules [PR112820]

[Bug testsuite/113558] New: [14 regression] gcc.dg/vect/vect-outer-4c-big-array.c etc. FAIL

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113558

Bug ID: 113558
   Summary: [14 regression] gcc.dg/vect/vect-outer-4c-big-array.c
etc. FAIL
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
CC: rdapp at gcc dot gnu.org
  Target Milestone: ---
Target: sparc*-sun-solaris2.11

Since 20230905, quite a number of tests regressed on 32 and 64-bit
Solaris/SPARC:

FAIL: gcc.dg/vect/vect-outer-4c-big-array.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "zero step in outer
loop.(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-outer-4c-big-array.c scan-tree-dump-times vect "zero
step in outer loop.(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-s16a.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_dot_prod_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-s16a.c scan-tree-dump-times vect
"vect_recog_dot_prod_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-s8a.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_dot_prod_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-s8a.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_widen_mult_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-s8a.c scan-tree-dump-times vect
"vect_recog_dot_prod_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-s8a.c scan-tree-dump-times vect
"vect_recog_widen_mult_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded"
1
FAIL: gcc.dg/vect/vect-reduc-dot-s8b.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_widen_mult_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-s8b.c scan-tree-dump-times vect
"vect_recog_widen_mult_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded"
1
FAIL: gcc.dg/vect/vect-reduc-dot-u16b.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_dot_prod_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-u16b.c scan-tree-dump-times vect
"vect_recog_dot_prod_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-u8a.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_dot_prod_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-u8a.c scan-tree-dump-times vect
"vect_recog_dot_prod_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-u8b.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_dot_prod_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-dot-u8b.c scan-tree-dump-times vect
"vect_recog_dot_prod_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-pattern-1a.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_widen_sum_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-pattern-1b-big-array.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_widen_sum_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-pattern-1b-big-array.c scan-tree-dump-times vect
"vect_recog_widen_sum_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded"
1
FAIL: gcc.dg/vect/vect-reduc-pattern-1c-big-array.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_widen_sum_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/vect-reduc-pattern-1c-big-array.c scan-tree-dump-times vect
"vect_recog_widen_sum_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded"
1
FAIL: gcc.dg/vect/vect-reduc-pattern-2a.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vect_recog_widen_sum_pattern:
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c scan-tree-dump-times vect
"vect_recog_dot_prod_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded" 1
FAIL: gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c scan-tree-dump-times vect
"vect_recog_widen_mult_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded"
1

This is no doubt due to

commit e40edf6499576993862801640227e076b868241b
Author: Robin Dapp 
Date:   Thu Aug 31 09:16:35 2023 +0200

testsuite/vect: Make match patterns more accurate.

The first scan (for "OUTER LOOP VECTORIZED") isn't run.  Before your patch,
the second scan matched

/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-outer-4c-big-array.c:17:17:
note:   zero step in outer loop.

but there's no line matching "succeeded" on this target.
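
A small C sketch of what the tightened pattern requires may help here. It hand-rolls the check instead of using a regex engine, the function name `scan_matches` is made up for illustration, and it assumes that `.` in the scan pattern may cross newlines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if DUMP contains HEAD followed later by "succeeded" with
   neither "failed" nor "Re-trying" starting in between.  This mirrors
   HEAD(?:(?!failed)(?!Re-trying).)*succeeded for a literal HEAD.  */
bool
scan_matches (const char *dump, const char *head)
{
  size_t head_len = strlen (head);

  /* An empty HEAD is not meaningful for this sketch.  */
  if (head_len == 0)
    return false;

  for (const char *p = strstr (dump, head); p != NULL;
       p = strstr (p + 1, head))
    {
      const char *rest = p + head_len;
      const char *ok = strstr (rest, "succeeded");
      const char *bad1 = strstr (rest, "failed");
      const char *bad2 = strstr (rest, "Re-trying");

      if (ok != NULL
          && (bad1 == NULL || bad1 > ok)
          && (bad2 == NULL || bad2 > ok))
        return true;
    }
  return false;
}
```

For a dump that contains the "zero step in outer loop." note but no later "succeeded" line, as reported for this target, `scan_matches` returns false, which corresponds to the FAILs listed above.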

[Bug testsuite/113558] [14 regression] gcc.dg/vect/vect-outer-4c-big-array.c etc. FAIL

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113558

--- Comment #1 from Rainer Orth  ---
Created attachment 57193
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57193&action=edit
32-bit sparc-sun-solaris2.11 vect-outer-4c-big-array.c.179t.vect

[Bug testsuite/113558] [14 regression] gcc.dg/vect/vect-outer-4c-big-array.c etc. FAIL

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113558

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

Tamar Christina  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #5 from Tamar Christina  ---
__attribute__ ((__simd__ ("notinbranch"), const))
double cos (double);

void foo (float *a, double *b)
{
    for (int i = 0; i < 12; i+=3)
      {
        b[i] = cos (5.0 * a[i]);
        b[i+1] = cos (5.0 * a[i+1]);
        b[i+2] = cos (5.0 * a[i+2]);
      }
}

Simple C example that shows the problem.

This seems to happen when SLP succeeds and the group size is a non power of
two.
The vectorizer then unrolls to make it a power of two and during vectorization
it seems to destroy the vector, make the call and reconstruct it.

So this seems like an SLP vectorization bug.  I can't seem to trigger it
however on GCC < 14 since SLP consistently fails for all my examples because it
tries a mode that's larger than the vector size.

So it may be a GCC 14 only regression, but I think it's latent in the
vectorizer.

[Bug c++/107058] [11/12/13/14 Regression] ICE in dwarf2out_die_ref_for_decl, at dwarf2out.cc:6038 since r11-5003-gd50310408f54e380

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107058

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:e2f3057fc911f9f55986b3de237b0155c0e09fe8

commit r14-8351-ge2f3057fc911f9f55986b3de237b0155c0e09fe8
Author: Richard Biener 
Date:   Tue Jan 23 09:51:00 2024 +0100

debug/107058 - gracefully handle unexpected DIE contexts

While the bug is persisting that LTO streaming picks up a CONST_DECL
from an attribute argument on a VAR_DECL which with -fdebug-type-section
refers to a DIE in a type unit we can handle this gracefully, at least
with -fno-checking.  Do so.  The C++ frontend nevertheless should resolve
the CONST_DECL attribute argument to a constant.

PR debug/107058
* dwarf2out.cc (dwarf2out_die_ref_for_decl): Gracefully
handle unexpected but bogus DIE contexts when not checking
enabled.

* c-c++-common/pr107058.c: New testcase.

[Bug c++/107058] [11/12/13/14 Regression] ICE in dwarf2out_die_ref_for_decl, at dwarf2out.cc:6038 since r11-5003-gd50310408f54e380

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107058

Richard Biener  changed:

   What|Removed |Added

   Keywords||ice-checking

--- Comment #7 from Richard Biener  ---
Improved to ice-checking.

[Bug modula2/113559] New: gm2/isolib/run/pass/seqappend.mod FAILs

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113559

Bug ID: 113559
   Summary: gm2/isolib/run/pass/seqappend.mod FAILs
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
  Target Milestone: ---
Target: i386-pc-solaris2.11, sparc-sun-solaris2.11,
s390x-ibm-linux-gnu, m68k-unknown-linux-gnu

Since 20230515, the gm2/isolib/run/pass/seqappend.mod test FAILs on 32-bit
Solaris/SPARC and x86:

FAIL: gm2/isolib/run/pass/seqappend.mod execution,  -O 
FAIL: gm2/isolib/run/pass/seqappend.mod execution,  -O -g 
FAIL: gm2/isolib/run/pass/seqappend.mod execution,  -O3 -fomit-frame-pointer 
FAIL: gm2/isolib/run/pass/seqappend.mod execution,  -O3 -fomit-frame-pointer
-finline-functions 
FAIL: gm2/isolib/run/pass/seqappend.mod execution,  -Os 
FAIL: gm2/isolib/run/pass/seqappend.mod execution,  -g 

There are also reports for Linux/s390x and Linux/m68k.

The failure is like

short read occurred: 10...
append test failed

[Bug c/113555] Yet another failure in verify_ssa

2024-01-23 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113555

--- Comment #1 from David Binderman  ---
Second test case:

char LZ4_decompress_generic_source;
void LZ4_decompress_generic_endOnInput() {
  char *ip = &LZ4_decompress_generic_source;
  while (1) {
    long length;
    if (length) {
      unsigned s;
      do {
        if (ip > -5)
          goto _output_error;
        s = *ip++;
        length += s;
      } while (s);
    }
  }
_output_error:
}

cvise $ ~/gcc/results/bin/gcc -c -w -O3 bug1000B.c
cvise $ ~/gcc/results/bin/gcc -c -w -O3 -march=znver3 bug1000B.c
bug1000B.c: In function ‘LZ4_decompress_generic_endOnInput’:
bug1000B.c:2:6: error: definition in block 7 does not dominate use in block 5
2 | void LZ4_decompress_generic_endOnInput() {
  |  ^

[Bug modula2/113559] gm2/isolib/run/pass/seqappend.mod FAILs

2024-01-23 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113559

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug debug/7853] gcc reports multiple symbol definitions on the wrong line

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7853

Richard Biener  changed:

   What|Removed |Added

  Known to work||13.2.1, 4.3.4
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #21 from Richard Biener  ---
Fixed.  I can't reproduce, not even with stabs (where I don't see any line info
but probably due to too recent gdb).  With dwarf it's also correct with GCC
4.3.

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #15 from rguenther at suse dot de  ---
On Tue, 23 Jan 2024, juzhe.zhong at rivai dot ai wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
> 
> --- Comment #14 from JuzheZhong  ---
> I just tried again both GCC-13.2 and GCC-14 with -fno-vect-cost-model.
> 
> https://godbolt.org/z/enEG3qf5K
> 
> GCC-14 requires scalar epilogue loop, whereas GCC-13.2 doesn't.
> 
> I believe it's not cost model issue.

As said, please try to bisect to the point where we started to require
the epilogue.

[Bug ipa/107931] [12/13/14 Regression] -Og causes always_inline to fail since r12-6677-gc952126870c92cf2

2024-01-23 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107931

--- Comment #27 from rguenther at suse dot de  ---
On Tue, 23 Jan 2024, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107931
> 
> --- Comment #26 from Jakub Jelinek  ---
> (In reply to Richard Biener from comment #25)
> > Btw, I'd rather go the opposite and make the testcase at hand always invalid
> > and diagnosed which means diagnose taking the address of always-inline
> > declared functions and never emit an out-of-line body for them.
> 
> That would need an exception at least for gnu extern inline always_inline
> functions,
> because the way they are used in glibc requires &open etc. to be valid (and 
> use
> then as fallback the out of line open).

Sure, already the C frontend should resolve to the out-of-line open call
there, we shouldn't do this in the middle-end.  Yes, indirect 'open' will
then not be fortified, but so what.
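
For readers unfamiliar with the construct under discussion, this is a minimal C sketch of taking the address of an always_inline function, assuming GNU attribute syntax. The function names are made up and this is not the testcase from the PR.

```c
/* An always_inline function whose address is taken.  The volatile
   pointer keeps the call indirect, so no inlining is requested
   through it.  */
static inline __attribute__ ((always_inline)) int
twice (int x)
{
  return 2 * x;
}

int
call_indirect (int x)
{
  int (*volatile fp) (int) = twice;  /* address taken here */
  return fp (x);
}

int
call_direct (int x)
{
  return twice (x);  /* direct call, expected to be inlined */
}
```

Under the stricter rule suggested in comment #25, the initialization of `fp` is the kind of use that would be diagnosed, with the exception discussed above for gnu extern inline functions.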

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #16 from Tamar Christina  ---
(In reply to rguent...@suse.de from comment #13)
> > > You could check if we call this with sane values.
> > 
> > Do you mean it's RISC-V backend cost model issue ?
> 
> I responded to Tamar which means a aarch64 cost model issue - the
> specific issue that the PHIs appear to have no cost.  I didn't look
> at any of the rest.

Yeah, I'll be checking this separately and make a different issue if need be.

(In reply to JuzheZhong from comment #14)
> I just tried again both GCC-13.2 and GCC-14 with -fno-vect-cost-model.
> 
> https://godbolt.org/z/enEG3qf5K
> 
> GCC-14 requires scalar epilogue loop, whereas GCC-13.2 doesn't.
> 
> I believe it's not cost model issue.

Yes, my bisect originally stopped because of the costing change.  I've started
a new one with -fno-vect-cost-model but am having trouble with the condition to
check for.  Will be back in a bit.

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

Richard Biener  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #6 from Richard Biener  ---
(In reply to Tamar Christina from comment #5)
> __attribute__ ((__simd__ ("notinbranch"), const))
> double cos (double);

So here the backend is then probably responsible to parse this into a valid
list of simdlen cases.

> void foo (float *a, double *b)
> {
> for (int i = 0; i < 12; i+=3)
>   {
> b[i] = cos (5.0 * a[i]);
> b[i+1] = cos (5.0 * a[i+1]);
> b[i+2] = cos (5.0 * a[i+2]);
>   }
> }
> 
> Simple C example that shows the problem.
> 
> This seems to happen when SLP succeeds and the group size is a non power of
> two.
> The vectorizer then unrolls to make it a power of two and during
> vectorization
> it seems to destroy the vector, make the call and reconstruct it.
> 
> So this seems like an SLP vectorization bug.  I can't seem to trigger it
> however on GCC < 14 since SLP consistently fails for all my examples because
> it tries a mode that's larger than the vector size.

On the 13 branch and x86_64 the above results in a large VF and using
_ZGVbN2v_cos, same on trunk.

> So it may be a GCC 14 only regression, but I think it's latent in the
> vectorizer.

I think there's sth odd with the backend here, but I can confirm the
behavior.  Note it analyzes and costs VF == 4 and V2DF resulting in
6 calls but then code generation comes along doing sth different!?

[Bug modula2/113511] lack of libm2 ABI compatibility on powerpc platforms

2024-01-23 Thread gaiusmod2 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113511

--- Comment #2 from gaiusmod2 at gmail dot com ---
"rguenth at gcc dot gnu.org"  writes:

> There's also the question on compatibility to libgm2 from GCC 13.

indeed - I guess the -mabi could be retained for backward compatibility.

> I suppose the frontend could simply not allow changing the M2 language
> "long double" (however it is called) with -mabi=... (which really only
> change the C language ABI!).  Of course calls to libm are subject to the
> C language ABI.

ok yes.  So m2's longreal data type uses ieeelongdouble throughout by
default on powerpc - that would be clean.

In principle, could all the C interfaces from m2 code convert the longreal
representation to glibc long double and vice versa?

So for example in the case of libm.def

change libm.def from a DEFINITION FOR "C" to an ordinary m2 definition
module.  Introduce libm.c which for non power platforms just passes
calls through to C.  On powerpc (without an IEEE128 glibc) it will
convert __float128 onto the underlying long double representation.

> Does the language standard have anything to say here?  I suppose there's
> no ABI documents for M2 for various targets, so eventually C interoperability
> language in the standard directs at the common sense?

It leaves much to be implementation defined :-)

The gcc/m2/gm2-libs-iso/LowLong.def provides setMode which could be used
to control the behaviour of the above conversions.  The size of the set
Modes and their meaning is implementation defined.

Possibly it might be implemented:

Bit 0:  issue an error and abort if the underlying long double support
        in glibc does not match the longreal in m2.
Bit 1:  issue a single warning if the underlying long double support
        in glibc does not match the longreal in m2 and then attempt
        conversion.
Bit 2:  raise an exception if the underlying long double support
        in glibc does not match the longreal in m2.
Bit 3:  raise an exception if the conversion between longreal and
        glibc C long double representations exceeds range.

Bit 4.. More bits to control conversion behaviour.

[ as a start ]
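
The proposed bit assignments could be written down as C constants, for example in the libm.c wrapper mentioned above. This is only a sketch of the proposal: the enumerator names and the helper `m2_mode_has` are invented here, and nothing like this is known to exist in libgm2.

```c
#include <stdbool.h>

/* Hypothetical names for the mode bits sketched in the list above.  */
enum m2_longreal_mode
{
  M2_LR_ABORT_ON_MISMATCH = 1u << 0, /* Bit 0: error and abort.  */
  M2_LR_WARN_ONCE_CONVERT = 1u << 1, /* Bit 1: warn once, then convert.  */
  M2_LR_RAISE_ON_MISMATCH = 1u << 2, /* Bit 2: raise on mismatch.  */
  M2_LR_RAISE_ON_RANGE    = 1u << 3  /* Bit 3: raise on range overflow.  */
};

/* Test whether a single mode bit is set in MODE.  */
bool
m2_mode_has (unsigned mode, enum m2_longreal_mode bit)
{
  return (mode & (unsigned) bit) != 0;
}
```

A caller wanting a single warning plus range checking would pass `M2_LR_WARN_ONCE_CONVERT | M2_LR_RAISE_ON_RANGE`; how the set of modes is actually encoded remains implementation defined, as noted above.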

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

--- Comment #7 from Richard Biener  ---
OK, maybe the costing is simply not taking into account that we chose the
simdlen == 1 variant which _does_ exist!  It's the chosen one:

4052          bestn = cgraph_node::get (simd_clone_info[0]);
(gdb) p bestn
$5 = 
(gdb) p bestn->simdclone->simdlen 
$6 = {coeffs = {1, 0}}

and it's usable

4077            int target_badness = targetm.simd_clone.usable (n);
4078            if (target_badness < 0)

(returns 0)

But note we do

4073            if (num_calls != 1)
4074              this_badness += exact_log2 (num_calls) * 4096;

which of course is quite bogus since we have 12 calls and exact_log2 will
return -1 here.  Maybe we want ceil_log2 here.

When we try the simdlen == 2 variant that also turns out usable but
the calculated badness is the same so we stick to the simdlen == 1 one.

So - the target should reject this clone or not generate it in the first
place.  And of course the cost thing should be fixed which will likely mask
the issue in the target.
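
To see why the exact_log2 term cannot separate the two clones, here is a self-contained C sketch. The `toy_*` functions are local stand-ins written for this note, not GCC's implementations, and the call counts are illustrative: 12 calls for simdlen == 1 as stated above, and an assumed 6 calls for simdlen == 2, in line with the "6 calls" remark in comment #6.

```c
/* Position of the highest set bit, or -1 for zero.  */
int
toy_floor_log2 (unsigned x)
{
  int n = -1;
  while (x != 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* log2 of X if X is a power of two, otherwise -1.  */
int
toy_exact_log2 (unsigned x)
{
  return (x != 0 && (x & (x - 1)) == 0) ? toy_floor_log2 (x) : -1;
}

/* Smallest N with 2**N >= X, for X >= 1.  */
int
toy_ceil_log2 (unsigned x)
{
  return toy_floor_log2 (x - 1) + 1;
}

/* Badness term as currently written.  */
int
toy_badness_exact (unsigned num_calls)
{
  return num_calls != 1 ? toy_exact_log2 (num_calls) * 4096 : 0;
}

/* Badness term with the ceil_log2 idea floated above.  */
int
toy_badness_ceil (unsigned num_calls)
{
  return num_calls != 1 ? toy_ceil_log2 (num_calls) * 4096 : 0;
}
```

With these stand-ins, both 12 and 6 calls give -4096 under the exact_log2 form, a tie, while the ceil_log2 form gives 16384 and 12288, so the clone needing fewer calls would score better.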

[Bug target/113114] [14 Regression] ICE compiling gcc.c-torture/execute/pr59643.c with -mabi=ilp32; in try_promote_writeback aarch64-ldp-fusion.cc

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113114

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:20e18106fac2d11ee43683291ff11d76da41d50b

commit r14-8353-g20e18106fac2d11ee43683291ff11d76da41d50b
Author: Alex Coplan 
Date:   Thu Jan 18 17:53:01 2024 +

aarch64: Don't assert recog success in ldp/stp pass [PR113114]

The PR shows two different cases where try_promote_writeback produces an
RTL pattern which isn't recognized.  Currently this leads to an ICE, as
we assert recog success, but I think it's better just to back out of the
changes gracefully if recog fails (as we do in the main fuse_pair case).

In theory since we check the ranges here recog shouldn't fail (which is
why I had the assert in the first place), but the PR shows an edge case
in the patterns where if we form a pre-writeback pair where the
writeback offset is exactly -S, where S is the size in bytes of one
transfer register, we fail to match the expected pattern as the patterns
look explicitly for plus operands in the mems.  I think fixing this
would require adding at least four new special-case patterns to
aarch64.md for what doesn't seem to be a particularly useful variant of
the insns.  Even if we were to do that, I think it would be GCC 15
material, and it's better to just punt for GCC 14.

The ILP32 case in the PR is a bit different, as that shows us trying to
combine a pair with DImode base register operands in the mems together
with an SImode trailing update of the base register.  This leads to us
forming an RTL pattern which references the base register in both SImode
and DImode, which also fails to recog.  Again, I think it's best just to
take the missed optimization for now.  If we really want to make this
(try_promote_writeback) work for ILP32, we can try to do it for GCC 15.

gcc/ChangeLog:

PR target/113114
* config/aarch64/aarch64-ldp-fusion.cc (try_promote_writeback):
Don't assert recog success, just punt if the writeback pair
isn't recognized.

gcc/testsuite/ChangeLog:

PR target/113114
* gcc.c-torture/compile/pr113114.c: New test.
* gcc.target/aarch64/pr113114.c: New test.

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

--- Comment #8 from Richard Biener  ---
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 09749ae3817..1ddbe7a2f6b 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4071,7 +4071,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
                || (nargs != simd_nargs))
              continue;
            if (num_calls != 1)
-             this_badness += exact_log2 (num_calls) * 4096;
+             this_badness += floor_log2 (num_calls) * 4096 + num_calls;
            if (n->simdclone->inbranch)
              this_badness += 8192;
            int target_badness = targetm.simd_clone.usable (n);


"fixes" it

[Bug target/113114] [14 Regression] ICE compiling gcc.c-torture/execute/pr59643.c with -mabi=ilp32; in try_promote_writeback aarch64-ldp-fusion.cc

2024-01-23 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113114

Alex Coplan  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Alex Coplan  ---
Should be fixed, thanks for the report.

[Bug target/112989] [14 Regression] GC ICE with C++, `#include ` and `-fsanitize=address`

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112989

--- Comment #16 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:659a5a908edd84894c2aa7f6f89468217d6894ca

commit r14-8354-g659a5a908edd84894c2aa7f6f89468217d6894ca
Author: Richard Sandiford 
Date:   Tue Jan 23 11:10:41 2024 +

aarch64: Avoid registering duplicate C++ overloads [PR112989]

In the original fix for this PR, I'd made sure that
including  didn't reach the final return in
simulate_builtin_function_decl (which would indicate duplicate
function definitions).  But it seems I forgot to do the same
thing for C++, which defines all of its overloads directly.

This patch fixes a case where we still recorded duplicate
functions for C++.  Thanks to Iain for reporting the resulting
GC ICE and for help with reproducing it.

gcc/
PR target/112989
* config/aarch64/aarch64-sve-builtins-shapes.cc (build_one): Skip
MODE_single variants of functions that don't take tuple arguments.

[Bug tree-optimization/110603] [14 Regression] GCC, ICE: internal compiler error: in verify_range, at value-range.cc:1104 since r14-255

2024-01-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110603

Jakub Jelinek  changed:

   What|Removed |Added

 CC||law at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
(In reply to Aldy Hernandez from comment #4)
> Now the reason we're passing swapped endpoints seems to originate in
> get_range_strlen_dynamic().  It is setting a min of 2, courtesy of the
> nonzero characters in the memcpy:
> 
> memcpy(a, "12", sizeof("12") - 1);

Guess in a program without UB both the bounds are valid, a zero terminated
string in char[2] array can't have strlen longer than 1 and when '1' and '2'
characters are memcpyed at the start of some buffer then the string length will
be at least 2.
But the program would invoke UB if this code is reached, so the question is how
to resolve it.
The old behavior of VRP/ranger with swapping the boundaries avoided the ICE but
wasn't right, this case isn't that the string length will be in [1, 2] range,
but that the argument will never be a valid zero terminated string.
So, guess either we shouldn't set minlen or maxlen (whatever is found second)
if it violates the other bound, or check it after the fact and pick just one of
them or set the other to one of them.
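
The first option, not setting whichever bound is found second when it violates the other, might look like the following C sketch. `toy_len_range` and `toy_set_minlen` are invented for illustration and do not correspond to the real strlen pass code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical length range; not GCC's actual data structure.  */
struct toy_len_range
{
  size_t minlen;
  size_t maxlen;
};

/* Raise the lower bound unless that would cross the upper bound.
   Returns false when the update was dropped.  */
bool
toy_set_minlen (struct toy_len_range *r, size_t min)
{
  if (min > r->maxlen)
    return false;
  if (min > r->minlen)
    r->minlen = min;
  return true;
}
```

For the char[2] case above, a range of [0, 1] would refuse a new minimum of 2 and stay well-formed, so the endpoints are never handed over swapped.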

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

Tamar Christina  changed:

   What|Removed |Added

           Assignee|unassigned at gcc dot gnu.org  |tnfchris at gcc dot gnu.org
             Status|NEW                           |ASSIGNED

--- Comment #9 from Tamar Christina  ---
(In reply to Richard Biener from comment #7)
> So - the target should reject this clone or not generate it in the first
> place.  And of course the cost thing should be fixed which will likely mask
> the issue in the target.

Yeah, looks like there's a bug in
aarch64_simd_clone_compute_vecsize_and_simdlen that's also present on the
branches.  I'll submit a patch.

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

--- Comment #10 from Richard Biener  ---
I'll fix the exact_log2 issue.

[Bug testsuite/113418] Use of vect_* target selectors in tests out of vect directories

2024-01-23 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113418

--- Comment #5 from Xi Ruoyao  ---
For pr104992.c:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643697.html

[Bug tree-optimization/104992] [missed optimization] x / y * y == x not optimized to x % y == 0

2024-01-23 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104992

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org

--- Comment #5 from Xi Ruoyao  ---
Hmm, shouldn't we close this as fixed now?

[Bug debug/8108] Problem in the code generator for C and the linker is extremelly slow

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8108

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #16 from Richard Biener  ---
It has been fixed in the linker now.

[Bug debug/8188] DW_AT_containing_type incorrectly emitted

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8188

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed|2005-12-28 06:14:46 |2024-1-23
 CC||jason at gcc dot gnu.org,
   ||tromey at gcc dot gnu.org

--- Comment #4 from Richard Biener  ---
Reconfirmed.  With modern dwarf we now emit even

 <1><4c>: Abbrev Number: 11 (DW_TAG_class_type)
<4d>   DW_AT_name: B
<4f>   DW_AT_byte_size   : 16
<50>   DW_AT_decl_file   : 1
<51>   DW_AT_decl_line   : 9
<52>   DW_AT_decl_column : 7
<53>   DW_AT_containing_type: <0x4c>
<57>   DW_AT_sibling : <0x106>

which is a self-reference ...

The code adding this reads

  /* GNU extension: Record what type our vtable lives in.  */
  if (TYPE_VFIELD (type))
    {
      tree vtype = DECL_FCONTEXT (TYPE_VFIELD (type));

      gen_type_die (vtype, context_die);
      add_AT_die_ref (type_die, DW_AT_containing_type,
                      lookup_type_die (vtype));

(there are more "GNU extension" uses of DW_AT_containing_type)

Jason added this in 1996 with the commit message

x

(sic)

What does gdb do with all those "extension" uses of DW_AT_containing_type?
Can we just drop them all?

[Bug debug/8354] Incorrect DWARF-2/3 emitted for const + array

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8354

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed|2005-12-28 06:14:12 |2024-1-23

--- Comment #14 from Richard Biener  ---
Re-confirmed for both C and C++

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #17 from Tamar Christina  ---
Ok, bisected to

g:2efe3a7de0107618397264017fb045f237764cc7 is the first bad commit
commit 2efe3a7de0107618397264017fb045f237764cc7
Author: Hao Liu 
Date:   Wed Dec 6 14:52:19 2023 +0800

tree-optimization/112774: extend the SCEV CHREC tree with a nonwrapping
flag

Before this commit we were unable to analyse the stride of the access.
After this niters seems to estimate the loop trip count at 4 and after that the
logs diverge enormously.

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #18 from Richard Biener  ---
(In reply to Tamar Christina from comment #17)
> Ok, bisected to
> 
> g:2efe3a7de0107618397264017fb045f237764cc7 is the first bad commit
> commit 2efe3a7de0107618397264017fb045f237764cc7
> Author: Hao Liu 
> Date:   Wed Dec 6 14:52:19 2023 +0800
> 
> tree-optimization/112774: extend the SCEV CHREC tree with a nonwrapping
> flag
> 
> Before this commit we were unable to analyse the stride of the access.
> After this niters seems to estimate the loop trip count at 4 and after that
> the logs diverge enormously.

Hum, but that's backward and would match to what I said in comment#2 - we
should get better code with that.

Juzhe - when you revert the above on top of trunk does the generated code
look better for RISC-V?

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #19 from Richard Biener  ---
(In reply to Richard Biener from comment #18)
> (In reply to Tamar Christina from comment #17)
> > Ok, bisected to
> > 
> > g:2efe3a7de0107618397264017fb045f237764cc7 is the first bad commit
> > commit 2efe3a7de0107618397264017fb045f237764cc7
> > Author: Hao Liu 
> > Date:   Wed Dec 6 14:52:19 2023 +0800
> > 
> > tree-optimization/112774: extend the SCEV CHREC tree with a nonwrapping
> > flag
> > 
> > Before this commit we were unable to analyse the stride of the access.
> > After this niters seems to estimate the loop trip count at 4 and after that
> > the logs diverge enormously.
> 
> Hum, but that's backward and would match to what I said in comment#2 - we
> should get better code with that.
> 
> Juzhe - when you revert the above ontop of trunk does the generated code
> look better for Risc-V?

It doesn't revert but you can do

diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index 25e3130e2f1..7870c8d76fb 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -2054,7 +2054,7 @@ analyze_scalar_evolution (class loop *loop, tree var)

 void record_nonwrapping_chrec (tree chrec)
 {
-  CHREC_NOWRAP(chrec) = 1;
+  CHREC_NOWRAP(chrec) = 0;

   if (dump_file && (dump_flags & TDF_SCEV))
 {

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #20 from Richard Biener  ---
(In reply to Richard Biener from comment #19)
> (In reply to Richard Biener from comment #18)
> > (In reply to Tamar Christina from comment #17)
> > > Ok, bisected to
> > > 
> > > g:2efe3a7de0107618397264017fb045f237764cc7 is the first bad commit
> > > commit 2efe3a7de0107618397264017fb045f237764cc7
> > > Author: Hao Liu 
> > > Date:   Wed Dec 6 14:52:19 2023 +0800
> > > 
> > > tree-optimization/112774: extend the SCEV CHREC tree with a 
> > > nonwrapping
> > > flag
> > > 
> > > Before this commit we were unable to analyse the stride of the access.
> > > After this niters seems to estimate the loop trip count at 4 and after 
> > > that
> > > the logs diverge enormously.
> > 
> > Hum, but that's backward and would match to what I said in comment#2 - we
> > should get better code with that.
> > 
> > Juzhe - when you revert the above ontop of trunk does the generated code
> > look better for Risc-V?
> 
> It doesn't revert but you can do
> 
> diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> index 25e3130e2f1..7870c8d76fb 100644
> --- a/gcc/tree-scalar-evolution.cc
> +++ b/gcc/tree-scalar-evolution.cc
> @@ -2054,7 +2054,7 @@ analyze_scalar_evolution (class loop *loop, tree var)
>  
>  void record_nonwrapping_chrec (tree chrec)
>  {
> -  CHREC_NOWRAP(chrec) = 1;
> +  CHREC_NOWRAP(chrec) = 0;
>  
>if (dump_file && (dump_flags & TDF_SCEV))
>  {

For me with this, on x86-64 we do not vectorize the loop at all.  With
-fno-vect-cost-model we vectorize some of the stores as part of BB
vectorization.

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #21 from Richard Biener  ---
On aarch64 I can see already GCC 13.2 looking very much different from 12.3,
but I can't decipher the code to decide whether 12.3 vectorizes the loop or
not.
trunk looks similar to 13.2 here, so the bisected change can't really be
responsible here.

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #22 from Tamar Christina  ---
For me, with `-fno-vect-cost-model` and without this commit, we generate
https://gist.github.com/Mistuke/d9252bfcb2aa766327c5f377e162f5b7 for the loop.
With the commit, well, it doesn't fit on the screen, but the codegen is
pretty horrible, with

smlal2  v24.4s, v13.8h, v5.8h
smull   v31.4s, v30.4h, v17.4h
add v20.4s, v20.4s, v11.4s
smlal2  v29.4s, v3.8h, v6.8h
smull2  v25.4s, v25.8h, v15.8h
add v22.4s, v28.4s, v22.4s
shrn    v21.4h, v21.4s, 15
add v20.4s, v20.4s, v26.4s
add v29.4s, v29.4s, v24.4s
smlal2  v25.4s, v16.8h, v7.8h
smlal   v31.4s, v18.4h, v8.4h
smull2  v27.4s, v27.8h, v17.8h
shrn2   v21.8h, v22.4s, 15
add v29.4s, v29.4s, v25.4s
add v31.4s, v31.4s, v20.4s
smlal2  v27.4s, v18.8h, v8.8h
str h21, [x5, x9]
add x9, x9, 32
add x9, x5, x9
shrn    v31.4h, v31.4s, 15
st1 {v21.h}[1], [x10]
add v27.4s, v27.4s, v29.4s
st1 {v21.h}[2], [x6]
add x6, x7, 20
add x10, x1, x21
st1 {v21.h}[3], [x2]
add x2, x7, 24
add x7, x7, 28
st1 {v21.h}[4], [x8]
shrn2   v31.8h, v27.4s, 15
st1 {v21.h}[5], [x6]
lsl x6, x10, 1
add x10, x5, x10, lsl 1
st1 {v21.h}[6], [x2]
add x2, x10, 4
st1 {v21.h}[7], [x7]
add x7, x10, 8
str h31, [x5, x6]
add x8, x10, 12
lsl x1, x1, 1
add x6, x6, 32
st1 {v31.h}[1], [x2]
add x2, x10, 16
st1 {v31.h}[2], [x7]
add x7, x10, 20
st1 {v31.h}[3], [x8]
add x8, x10, 24
add x10, x10, 28
st1 {v31.h}[4], [x2]
st1 {v31.h}[5], [x7]
add x11, x1, 32
st1 {v31.h}[6], [x8]
add x11, x0, x11
st1 {v31.h}[7], [x10]
add x10, x1, x25
ld1h    z31.s, p5/z, [x11]

This goes on for a while, i.e. single-element lane stores.  So with the cost
model disabled, it definitely does get worse with that commit.  With the cost
model on there's no difference.

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

--- Comment #11 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:d5d43dc399bb0f15084827c59a025189c630afdd

commit r14-8357-gd5d43dc399bb0f15084827c59a025189c630afdd
Author: Richard Biener 
Date:   Tue Jan 23 12:53:04 2024 +0100

tree-optimization/113552 - fix num_call accounting in simd clone
vectorization

The following avoids using exact_log2 on the number of SIMD clone calls
to be emitted when vectorizing calls since that can easily be not
a power of two in which case it will return -1.  For different simd
clones the number of calls will differ by a multiply with a power of two
only so using floor_log2 is good enough here.

PR tree-optimization/113552
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Use
floor_log2 instead of exact_log2 on the number of calls.
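
As a rough illustration of the difference described above, here is a minimal
sketch in C.  The sketch_* functions are hypothetical stand-ins written for
this note, not GCC's own helpers; they only assume the behaviour the commit
message states, namely that an exact log2 yields -1 for a value that is not a
power of two while a floor log2 rounds down.

```c
/* Hypothetical stand-in for a floor log2: index of the highest set bit,
   or -1 when x is zero.  */
int sketch_floor_log2(unsigned long x)
{
  int n = -1;
  while (x != 0) {
    x >>= 1;
    n++;
  }
  return n;
}

/* Hypothetical stand-in for an exact log2: only defined for powers of
   two, -1 for everything else (including zero).  */
int sketch_exact_log2(unsigned long x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  return sketch_floor_log2(x);
}
```

For a call count such as 6, the exact variant gives -1 while the floor variant
gives 2; for a power of two such as 8, both give 3.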

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #23 from Tamar Christina  ---
tamar:~/gcc-dsg/test$ extract-toolchain gcc 2efe3a7de01
A   1514 files
D   0 files
M   0 files
Extracted 'origin/manygcc-basepoints-gcc-14-6292-g2f512f6fcdd:2efe3a7de01'

> ./bin/gcc -S -o ../wlo-bad.s -march=armv8-a+sve -O3 -msve-vector-bits=512 
> -fno-vect-cost-model -g0 ../wlo.c -fdump-tree-vect-all

tamar:~/gcc-dsg/test$ extract-toolchain gcc 9f7ad5eff3b
A   1514 files
D   0 files
M   0 files
Extracted 'origin/manygcc-basepoints-gcc-14-6292-g2f512f6fcdd:9f7ad5eff3b'

> ./bin/gcc -S -o ../wlo-good.s -march=armv8-a+sve -O3 -msve-vector-bits=512 
> -fno-vect-cost-model -g0 ../wlo.c -fdump-tree-vect-all

> diff ../wlo-bad.s ../wlo-good.s  | wc -l
537

and for the record the bisect was scanning for  "requires scalar epilogue loop"
and that's the first commit they appear on.

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #24 from JuzheZhong  ---
(In reply to Richard Biener from comment #19)
> (In reply to Richard Biener from comment #18)
> > (In reply to Tamar Christina from comment #17)
> > > Ok, bisected to
> > > 
> > > g:2efe3a7de0107618397264017fb045f237764cc7 is the first bad commit
> > > commit 2efe3a7de0107618397264017fb045f237764cc7
> > > Author: Hao Liu 
> > > Date:   Wed Dec 6 14:52:19 2023 +0800
> > > 
> > > tree-optimization/112774: extend the SCEV CHREC tree with a 
> > > nonwrapping
> > > flag
> > > 
> > > Before this commit we were unable to analyse the stride of the access.
> > > After this niters seems to estimate the loop trip count at 4 and after 
> > > that
> > > the logs diverge enormously.
> > 
> > Hum, but that's backward and would match to what I said in comment#2 - we
> > should get better code with that.
> > 
> > Juzhe - when you revert the above ontop of trunk does the generated code
> > look better for Risc-V?
> 
> It doesn't revert but you can do
> 
> diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> index 25e3130e2f1..7870c8d76fb 100644
> --- a/gcc/tree-scalar-evolution.cc
> +++ b/gcc/tree-scalar-evolution.cc
> @@ -2054,7 +2054,7 @@ analyze_scalar_evolution (class loop *loop, tree var)
>  
>  void record_nonwrapping_chrec (tree chrec)
>  {
> -  CHREC_NOWRAP(chrec) = 1;
> +  CHREC_NOWRAP(chrec) = 0;
>  
>if (dump_file && (dump_flags & TDF_SCEV))
>  {

Hmm. In my experiments, the codegen looks slightly better but still hasn't
recovered to GCC 12 levels.


Btw, I compared the ARM SVE codegen, even with the cost model:

https://godbolt.org/z/cKc1PG3dv

I think the GCC 13.2 codegen is better than GCC trunk with the cost model.

[Bug target/113070] [14 regression] [AArch64] [PGO/LTO] Miscompilation of go compiler

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113070

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:e0374b028a665a2ea8d6eb2b4e5862774e9e85c2

commit r14-8358-ge0374b028a665a2ea8d6eb2b4e5862774e9e85c2
Author: Alex Coplan 
Date:   Thu Jan 11 16:17:37 2024 +

rtl-ssa: Run finalize_new_accesses forwards [PR113070]

The next patch in this series exposes an interface for creating new uses
in RTL-SSA.  The intent is that new user-created uses can consume new
user-created defs in the same change group.  This is so that we can
correctly update uses of memory when inserting a new store pair insn in
the aarch64 load/store pair fusion pass (the affected uses need to
consume the new store pair insn).

As it stands, finalize_new_accesses is called as part of the backwards
insn placement loop within change_insns, but if we want new uses to be
able to depend on new defs in the same change group, we need
finalize_new_accesses to be called on earlier insns first.  This is so
that when we process temporary uses and turn them into permanent uses,
we can follow the last_def link on the temporary def to ensure we end up
with a permanent use consuming a permanent def.

gcc/ChangeLog:

PR target/113070
* rtl-ssa/changes.cc (function_info::change_insns): Split out the call
to finalize_new_accesses from the backwards placement loop, run it
forwards in a separate loop.

[Bug target/113070] [14 regression] [AArch64] [PGO/LTO] Miscompilation of go compiler

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113070

--- Comment #12 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:ef86659da9de59896fb0128eef418224299267e9

commit r14-8361-gef86659da9de59896fb0128eef418224299267e9
Author: Alex Coplan 
Date:   Fri Jan 12 10:15:36 2024 +

aarch64: Fix up uses of mem following stp insert [PR113070]

As the PR shows (specifically #c7) we are missing updating uses of mem
when inserting an stp in the aarch64 load/store pair fusion pass.  This
patch fixes that.

RTL-SSA has a simple view of memory and by default doesn't allow stores
to be re-ordered w.r.t. other stores.  In the ldp fusion pass, we do our
own alias analysis and so can re-order stores over other accesses when
we deem this is safe.  If neither store can be re-purposed (moved into
the required position to form the stp while respecting the RTL-SSA
constraints), then we turn both the candidate stores into "tombstone"
insns (logically delete them) and insert a new stp insn.

As it stands, we implement the insert case separately (after dealing
with the candidate stores) in fuse_pair by inserting into the middle of
the vector of changes.  This is OK when we only have to insert one
change, but with this fix we would need to insert the change for the new
stp plus multiple changes to fix up uses of mem (note the number of
fix-ups is naturally bounded by the alias limit param to prevent
quadratic behaviour).  If we kept the code structured as is and inserted
into the middle of the vector, that would lead to repeated moving of
elements in the vector which seems inefficient.  The structure of the
code would also be a little unwieldy.

To improve on that situation, this patch introduces a helper class,
stp_change_builder, which implements a state machine that helps to build
the required changes directly in program order.  That state machine is
responsible for deciding what changes need to be made in what order, and
the code in fuse_pair then simply follows those steps.

Together with the fix in the previous patch for installing new defs
correctly in RTL-SSA, this fixes PR113070.

We take the opportunity to rename the function decide_stp_strategy to
try_repurpose_store, as that seems more descriptive of what it actually
does, since stp_change_builder is now responsible for the overall change
strategy.

gcc/ChangeLog:

PR target/113070
* config/aarch64/aarch64-ldp-fusion.cc
(struct stp_change_builder): New.
(decide_stp_strategy): Rename to ...
(try_repurpose_store): ... this.
(ldp_bb_info::fuse_pair): Refactor to use stp_change_builder to
construct stp changes.  Fix up uses when inserting new stp insns.

[Bug target/113070] [14 regression] [AArch64] [PGO/LTO] Miscompilation of go compiler

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113070

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:fce3994d04fc5d7d1c91f6db5a1f144aa291439a

commit r14-8359-gfce3994d04fc5d7d1c91f6db5a1f144aa291439a
Author: Alex Coplan 
Date:   Fri Jan 12 10:14:33 2024 +

rtl-ssa: Support for creating new uses [PR113070]

This exposes an interface for users to create new uses in RTL-SSA.
This is needed for updating uses after inserting a new store pair insn
in the aarch64 load/store pair fusion pass.

gcc/ChangeLog:

PR target/113070
* rtl-ssa/accesses.cc (function_info::create_use): New.
* rtl-ssa/changes.cc (function_info::finalize_new_accesses):
Ensure new uses end up referring to permanent defs.
* rtl-ssa/functions.h (function_info::create_use): Declare.

[Bug target/113070] [14 regression] [AArch64] [PGO/LTO] Miscompilation of go compiler

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113070

--- Comment #11 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:6dd613df59060fb54c4e3f66f39cf59bc44d118a

commit r14-8360-g6dd613df59060fb54c4e3f66f39cf59bc44d118a
Author: Alex Coplan 
Date:   Fri Jan 12 09:09:10 2024 +

rtl-ssa: Ensure new defs get inserted [PR113070]

In r14-5820-ga49befbd2c783e751dc2110b544fe540eb7e33eb I added support to
RTL-SSA for inserting new insns, which included support for users
creating new defs.

However, I missed that apply_changes_to_insn needed updating to ensure
that the new defs actually got inserted into the main def chain.  This
meant that when the aarch64 ldp/stp pass inserted a new stp insn, the
stp would just get skipped over during subsequent alias analysis, as its
def never got inserted into the memory def chain.  This (unsurprisingly)
led to wrong code.

This patch fixes the issue by ensuring new user-created defs get
inserted.  I would have preferred to have used a flag internal to the
defs instead of a separate data structure to keep track of them, but since
machine_mode increased to 16 bits we're already at 64 bits in access_info,
and we can't really reuse m_is_temp as the logic in finalize_new_accesses
requires it to get cleared.

gcc/ChangeLog:

PR target/113070
* rtl-ssa.h: Include hash-set.h.
* rtl-ssa/changes.cc (function_info::finalize_new_accesses): Add
new_sets parameter and use it to keep track of new user-created
sets.
(function_info::apply_changes_to_insn): Also call add_def on new
sets.
(function_info::change_insns): Add hash_set to keep track of new
user-created defs.  Plumb it through.
* rtl-ssa/functions.h: Add hash_set parameter to
finalize_new_accesses and apply_changes_to_insn.

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #25 from Tamar Christina  ---
> >  void record_nonwrapping_chrec (tree chrec)
> >  {
> > -  CHREC_NOWRAP(chrec) = 1;
> > +  CHREC_NOWRAP(chrec) = 0;
> >  
> >if (dump_file && (dump_flags & TDF_SCEV))
> >  {
> 
> Hmmm. With experiments. The codegen looks slightly better but still didn't
> recover back to GCC-12.
> 
> 
> Btw, I compare ARM SVE codegen, even with cost model:
> 
> https://godbolt.org/z/cKc1PG3dv
> 
> I think GCC 13.2 codegen is better than GCC trunk with cost model.

If you have the cost model enabled you hit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441#c9 which is just a target
bug I need to look into separately.

[Bug target/113070] [14 regression] [AArch64] [PGO/LTO] Miscompilation of go compiler

2024-01-23 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113070

Alex Coplan  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Alex Coplan  ---
Should be fixed, sorry for the delay, and thanks for the report.

[Bug c++/113560] New: Strange code generated when optimizing a multiplication on x86_64

2024-01-23 Thread accelerator0099 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560

Bug ID: 113560
   Summary: Strange code generated when optimizing a
multiplication on x86_64
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: accelerator0099 at gmail dot com
  Target Milestone: ---

Code:
#include <immintrin.h>
auto f(char *buf, unsigned long long in) noexcept
{
unsigned long long hi{};
auto lo{_mulx_u64(in, 0x2af31dc462ull, &hi)};
lo = _mulx_u64(lo, 100, &hi);
__builtin_memcpy(buf + 2, &hi, 2);
return buf + 10;
}
auto g(char *buf, unsigned long long in) noexcept
{
unsigned long long hi{};
_mulx_u64(in, 100, &hi);
__builtin_memcpy(buf + 2, &hi, 2);
return buf + 10;
}

Compile with:
-Ofast -std=c++23 -march=znver4

GCC 13.2 and trunk generate:
f(char*, unsigned long long):
movabs  rdx, 184467440738
mov rax, rdi
mulx    r9, r8, rsi
xor r9d, r9d
mov rsi, r8
mov rdi, r9
add rsi, r8
shld    rdi, r8, 1
add rsi, r8
adc rdi, r9
shld    rdi, rsi, 3
sal rsi, 3
add rsi, r8
adc rdi, r9
add rax, 10
shld    rdi, rsi, 2
mov WORD PTR [rax-8], di
ret
g(char*, unsigned long long):
mov eax, 100
mul rsi
lea rax, [rdi+10]
mov WORD PTR [rdi+2], dx
ret

GCC 12 generates:
f(char*, unsigned long long):
movabs  rdx, 184467440738
mov rax, rsi
imul    rax, rdx
mov edx, 100
mulx    rdx, rax, rax
lea rax, [rdi+10]
mov WORD PTR [rdi+2], dx
ret
g(char*, unsigned long long):
mov eax, 100
mul rsi
lea rax, [rdi+10]
mov WORD PTR [rdi+2], dx
ret

Clang:
f(char*, unsigned long long):
movabs  rdx, 184467440738
mov eax, 100
imul    rdx, rsi
mulx    rax, rax, rax
mov word ptr [rdi + 2], ax
lea rax, [rdi + 10]
ret
g(char*, unsigned long long):
mov eax, 100
mov rdx, rsi
mulx    rax, rax, rax
mov word ptr [rdi + 2], ax
lea rax, [rdi + 10]
ret

See also:
https://gcc.godbolt.org/z/df7Gr1MKo

[Bug modula2/113554] [14 Regression] m2 fails to build on x86_64-linux-gnux32

2024-01-23 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113554

H.J. Lu  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Ever confirmed|0   |1
   Last reconfirmed||2024-01-23
 Status|UNCONFIRMED |NEW

--- Comment #1 from H.J. Lu  ---
A patch is posted at

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643707.html

[Bug c/113438] ICE (segfault) in dwarf2out_decl with -g -std=c23 on c23-tag-composite-2.c

2024-01-23 Thread uecker at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113438

--- Comment #4 from uecker at gcc dot gnu.org ---
Created attachment 57194
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57194&action=edit
preliminary patch

[Bug modula2/113554] [14 Regression] m2 fails to build on x86_64-linux-gnux32

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113554

--- Comment #2 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:2bdf138a0d0141065fa9a8efa2ff86f211888a59

commit r14-8362-g2bdf138a0d0141065fa9a8efa2ff86f211888a59
Author: H.J. Lu 
Date:   Tue Jan 23 05:55:07 2024 -0800

m2: Use time_t in time and don't redefine alloca

Fix the m2 build warning and error:

[...]
../../src/gcc/m2/mc/mc.flex:32:9: warning: "alloca" redefined
   32 | #define alloca __builtin_alloca
  | ^~
In file included from /usr/include/stdlib.h:587,
 from :22:
/usr/include/alloca.h:35:10: note: this is the location of the previous
definition
   35 | # define alloca(size)   __builtin_alloca (size)
  |  ^~
../../src/gcc/m2/mc/mc.flex: In function 'handleDate':
../../src/gcc/m2/mc/mc.flex:333:25: error: passing argument 1 of 'time'
from incompatible pointer type [-Wincompatible-pointer-types]
  333 |   time_t  clock = time ((long *)0);
  | ^
  | |
  | long int *
In file included from ../../src/gcc/m2/mc/mc.flex:28:
/usr/include/time.h:76:29: note: expected 'time_t *' {aka 'long long int *'}
but argument is of type 'long int *'
   76 | extern time_t time (time_t *__timer) __THROW;

PR bootstrap/113554
* mc/mc.flex (alloca): Don't redefine.
(handleDate): Replace (long *)0 with (time_t *)0 when calling
time.
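
The handleDate part of the fix can be sketched as follows.  This is only a
minimal illustration, not the actual mc.flex code: sketch_current_time is a
hypothetical helper name.  The point, taken from the diagnostic above, is that
on this target time_t is 'long long int', so a 'long *' argument has the wrong
pointer type, whereas a null pointer of type 'time_t *' matches the prototype
on every target.

```c
#include <time.h>

/* Hypothetical helper mirroring the corrected call: the null pointer is
   cast to the parameter type that time() declares, not to 'long *'.  */
time_t sketch_current_time(void)
{
  return time((time_t *)0);
}
```

Passing a plain NULL would presumably work just as well; the cast simply
mirrors the replacement described in the ChangeLog entry.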

[Bug modula2/113554] [14 Regression] m2 fails to build on x86_64-linux-gnux32

2024-01-23 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113554

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from H.J. Lu  ---
Fixed.

[Bug middle-end/113364] [14 regression] ICE verify_ssa: `definition in block N does not dominate use in block` with `-O3 -march=znver2`

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113364

--- Comment #15 from Tamar Christina  ---
Ok, the fix fixes the ICE, but after rebasing to trunk I get a miscompile
during bootstrap, which miscompiles the x86 backend.

This is likely related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113539
so tracking it down...

[Bug c/100789] [11/12/13/14 Regression] ICE with __transaction_relaxed and left shift signed overflow

2024-01-23 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100789

Marek Polacek  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|mpolacek at gcc dot gnu.org|unassigned at gcc dot 
gnu.org

[Bug target/113255] [11/12/13 Regression] wrong code with -O2 -mtune=k8

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113255

--- Comment #14 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:3936c8709c25c8bc72be0c1b2cc3ae7a25dc90ec

commit r14-8363-g3936c8709c25c8bc72be0c1b2cc3ae7a25dc90ec
Author: H.J. Lu <(no_default)>
Date:   Tue Jan 23 06:34:43 2024 -0800

gcc.dg/torture/pr113255.c: Fix ia32 test failure

Fix ia32 test failure:

FAIL: gcc.dg/torture/pr113255.c   -O1  (test for excess errors)
Excess errors:
cc1: error: '-mstringop-strategy=rep_8byte' not supported for 32-bit code

PR rtl-optimization/113255
* gcc.dg/torture/pr113255.c (dg-additional-options): Add only
if not ia32.

[Bug c++/113544] [14 Regression] bogus incomplete type error with dependent data member in local class in generic lambda since r14-278

2024-01-23 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113544

Patrick Palka  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
 CC||jason at gcc dot gnu.org

--- Comment #1 from Patrick Palka  ---
And we ICE when the local class has a dependent base:

template
void f() {
  [](auto parm) {
struct type : decltype(parm) { };
  };
}

template void f();

[Bug c/113555] Yet another failure in verify_ssa

2024-01-23 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113555

--- Comment #2 from David Binderman  ---
For the first source code, the bug seems to first appear sometime between
20231119 and 20231227.

Git hashes are g:eaeaad3fcac4d7a3 and g:f19ceb2d49afdfa5

Please ignore the second source code - it is a separate bug.

[Bug testsuite/113558] [14 regression] gcc.dg/vect/vect-outer-4c-big-array.c etc. FAIL

2024-01-23 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113558

--- Comment #2 from Robin Dapp  ---
Created attachment 57195
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57195&action=edit
Tentative patch

Ah, it looks like nothing is being vectorized at all and the second check just
happened to match as part of the unsuccessful vectorization attempt.  It would
seem that we need the same condition as for the first check as well.

Would you mind giving the attached patch a try?  I ran it on riscv and power10
so far, x86 and aarch64 are still in progress.

[Bug c/113561] New: yet more verify_ssa fails

2024-01-23 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113561

Bug ID: 113561
   Summary: yet more verify_ssa fails
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

For this C source code:

char LZ4_decompress_generic_source;
void LZ4_decompress_generic_endOnInput() {
  char *ip = &LZ4_decompress_generic_source;
  while (1) {
long length;
if (length) {
  unsigned s;
  do {
if (ip > -5)
  goto _output_error;
s = *ip++;
length += s;
  } while (s);
}
  }
_output_error:
}

cvise $ /home/dcb38/gcc/results.20240119.asan.ubsan/bin/gcc -c -w -O3 bug1001.c
cvise $ /home/dcb38/gcc/results.20240119.asan.ubsan/bin/gcc -c -w -O3
-march=znver3 bug1001.c
bug1001.c: In function ‘LZ4_decompress_generic_endOnInput’:
bug1001.c:2:6: error: definition in block 7 does not dominate use in block 5
2 | void LZ4_decompress_generic_endOnInput() {
  |  ^

The bug seems to first appear sometime between g:484f48f03cf9a382
and g:5a22bb250d8f4ad2

[Bug tree-optimization/113555] Yet another failure in verify_ssa

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113555

Richard Biener  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
 CC||rguenth at gcc dot gnu.org,
   ||tnfchris at gcc dot gnu.org
  Component|c   |tree-optimization

--- Comment #3 from Richard Biener  ---
possibly a duplicate

[Bug target/113560] Strange code generated when optimizing a multiplication on x86_64

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
  Component|c++ |target
 Ever confirmed|0   |1
   Last reconfirmed||2024-01-23
 Target||x86_64-*-*

--- Comment #1 from Richard Biener  ---
GCC thinks the multiplication by the constant is cheaper this way - are you
sure otherwise?

I see g using a highpart multiply while f uses a widening multiply.
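
To make the terminology concrete, here is a minimal sketch in C of what the
two functions in the report compute, written without the intrinsic.  It
assumes a 64-bit target where GCC or Clang provide the 'unsigned __int128'
extension; all sketch_* names are made up for illustration and do not come
from the report.  It says nothing about which instruction sequence is
cheaper, only which halves of the products are needed.

```c
#include <stdint.h>

/* High 64 bits of the full 128-bit product a * b (a "highpart" multiply).  */
uint64_t sketch_mulhi_u64(uint64_t a, uint64_t b)
{
  return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/* What f stores: the low half of in * 0x2af31dc462 is multiplied by 100
   and only the high half of that second product is kept.  */
uint64_t sketch_f_hi(uint64_t in)
{
  uint64_t lo = in * 0x2af31dc462ull;   /* low 64 bits, wraps modulo 2^64 */
  return sketch_mulhi_u64(lo, 100);
}

/* What g stores: just the high half of in * 100.  */
uint64_t sketch_g_hi(uint64_t in)
{
  return sketch_mulhi_u64(in, 100);
}
```

In both functions only the low two bytes of the result end up in the buffer,
via the two-byte memcpy in the original code.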

[Bug tree-optimization/113561] yet more verify_ssa fails

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113561

Richard Biener  changed:

   What|Removed |Added

  Component|c   |tree-optimization
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=113364
   Keywords||ice-on-valid-code

--- Comment #1 from Richard Biener  ---
The proposed fix for PR113364 resolves the ICE.

[Bug tree-optimization/113555] Yet another failure in verify_ssa

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113555

Richard Biener  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=113364

--- Comment #4 from Richard Biener  ---
The proposed fix for PR113364 resolves the ICE

[Bug debug/10466] Can't debug the first function in a C file that is included in another C file

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10466

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #6 from Richard Biener  ---
Reading symbols from ./a.out...
(gdb) b bar
Breakpoint 1 at 0x400545: file b.c, line 4.
(gdb) b bar2
Breakpoint 2 at 0x40055a: file b.c, line 12.
(gdb) r
Starting program: /tmp/a.out 
Missing separate debuginfos, use: zypper install
glibc-debuginfo-2.31-150300.63.1.x86_64
foo

Breakpoint 1, bar () at b.c:4
4 printf( "bar\n" );
(gdb) l
1   int
2   bar()
3   {
4 printf( "bar\n" );
5
6 return 0;
7   }
8
9   int
10  bar2()

Works for me (named the file b.c), with both gcc 7.5 and gcc 13.2.  I'm
using quite recent gdb though, gdb 12.1

Either fixed or a gdb bug (which was also fixed meanwhile).

[Bug debug/10499] Debug information for some C++ headers is missing classes (template specific?)

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10499

Richard Biener  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org
   Last reconfirmed|2005-12-28 06:11:53 |2024-1-23

--- Comment #9 from Richard Biener  ---
With gcc 7.5 and gcc 13.2 I see

(gdb) p/r s
$3 = 
(gdb) ptype s
type = std::stringstream

and the python pretty-printers "ICE" like

(gdb) p s
Python Exception : list index out of range

The DW_TAG_typedef is still as reported, to a DW_AT_declaration, and the
variable DIE only refers to the typedef DIE.

So - reconfirmed.  Might be also a C++ frontend representation issue.

[Bug debug/12385] Full debug info not emitted for C++ classes with external virtual functions

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12385

Richard Biener  changed:

   What|Removed |Added

  Known to work||13.2.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #5 from Richard Biener  ---
I see now with GCC 13.2 (with GCC 7.5 'test' is missing):

 <1><29>: Abbrev Number: 2 (DW_TAG_class_type)
<2a>   DW_AT_name: (indirect string, offset: 0xe4): ObjectBase
<2e>   DW_AT_declaration : 1
<2e>   DW_AT_sibling : <0x85>
 <2><32>: Abbrev Number: 3 (DW_TAG_subprogram)
<33>   DW_AT_external: 1
<33>   DW_AT_name: (indirect string, offset: 0xa4): test
<37>   DW_AT_decl_file   : 1
<38>   DW_AT_decl_line   : 4
<39>   DW_AT_decl_column : 16
<3a>   DW_AT_linkage_name: (indirect string, offset: 0x88):
_ZN10ObjectBase4testEv
<3e>   DW_AT_virtuality  : 1(virtual)
<3f>   DW_AT_vtable_elem_location: 2 byte block: 10 0   (DW_OP_constu:
0)
<42>   DW_AT_containing_type: <0x29>
<46>   DW_AT_accessibility: 1   (public)
<47>   DW_AT_declaration : 1
<47>   DW_AT_object_pointer: <0x4f>
<4b>   DW_AT_sibling : <0x55>
 <3><4f>: Abbrev Number: 4 (DW_TAG_formal_parameter)
<50>   DW_AT_type: <0x85>
<54>   DW_AT_artificial  : 1
 <3><54>: Abbrev Number: 0
 <2><55>: Abbrev Number: 5 (DW_TAG_subprogram)
<56>   DW_AT_external: 1
<56>   DW_AT_name: (indirect string, offset: 0xe4): ObjectBase
<5a>   DW_AT_linkage_name: (indirect string, offset: 0xae):
_ZN10ObjectBaseC4Ev
<5e>   DW_AT_artificial  : 1
<5e>   DW_AT_accessibility: 1   (public)
<5f>   DW_AT_declaration : 1
<5f>   DW_AT_object_pointer: <0x67>
<63>   DW_AT_sibling : <0x6d>
 <3><67>: Abbrev Number: 4 (DW_TAG_formal_parameter)
<68>   DW_AT_type: <0x85>
<6c>   DW_AT_artificial  : 1
 <3><6c>: Abbrev Number: 0
 <2><6d>: Abbrev Number: 6 (DW_TAG_subprogram)
<6e>   DW_AT_external: 1
<6e>   DW_AT_name: (indirect string, offset: 0x14): test2
<72>   DW_AT_decl_file   : 1
<73>   DW_AT_decl_line   : 9
<74>   DW_AT_decl_column : 6
<75>   DW_AT_linkage_name: (indirect string, offset: 0xcc):
_ZN10ObjectBase5test2Ev
<79>   DW_AT_accessibility: 1   (public)
<7a>   DW_AT_declaration : 1
<7a>   DW_AT_object_pointer: <0x7e>
 <3><7e>: Abbrev Number: 4 (DW_TAG_formal_parameter)
<7f>   DW_AT_type: <0x85>
<83>   DW_AT_artificial  : 1
 <3><83>: Abbrev Number: 0
 <2><84>: Abbrev Number: 0

so both functions are there and so is the CTOR.  Thus fixed.

[Bug debug/10499] Debug information for some C++ headers is missing classes (template specific?)

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10499
Bug 10499 depends on bug 12385, which changed state.

Bug 12385 Summary: Full debug info not emitted for C++ classes with external virtual functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12385

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug debug/19954] Compiler emits incomplete structure type

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19954
Bug 19954 depends on bug 12385, which changed state.

Bug 12385 Summary: Full debug info not emitted for C++ classes with external virtual functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12385

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug debug/14022] asm() should start a new line table entry

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14022

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed|2005-12-20 18:28:59 |2024-1-23

--- Comment #4 from Richard Biener  ---
It's still as reported AFAICS.

.loc 1 6 0
popq    %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size   main, .-main
#APP
.text
 _sub:
 rts

.Letext0:
.section        .debug_info,"",@progbits

[Bug middle-end/113364] [14 regression] ICE verify_ssa: `definition in block N does not dominate use in block` with `-O3 -march=znver2`

2024-01-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113364

--- Comment #16 from Tamar Christina  ---
Ok, I've submitted the patch since the ICE and miscompare are unrelated.

I'll keep this ticket open in any case.  The miscompares didn't happen with
commits from ~2 weeks ago, so this will give me a place to start.

Hopefully I'll send a patch for those tomorrow.

[Bug debug/14168] Unneeded DIEs output for imported declarations

2024-01-23 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14168

Richard Biener  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org
   Last reconfirmed|2005-11-02 02:12:19 |2024-1-23

--- Comment #3 from Richard Biener  ---
I think the debug info reflects the source, which is what it should first and
foremost do, so there is no bug.

-feliminate-unused-debug-{symbols,types}, which are enabled by default, should
maybe eliminate everything, but the using declaration
(DW_TAG_imported_declaration) isn't handled explicitly in
prune_unused_types_walk, which means we keep it (and referenced things, and
also its context DIE, the namespace).

The question is whether we should handle DW_TAG_imported_declaration based
on the imported DIE (typedef or function) or on its own merit.
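
For illustration only, a minimal source of the kind being discussed might look
like the sketch below.  It is not the testcase from the PR; the namespace and
all names in it are hypothetical.  Each using declaration is the sort of
construct that would be described by a DW_TAG_imported_declaration DIE whose
import refers either to a typedef DIE or to a function DIE.

```cpp
// Hypothetical reproducer sketch; names are illustrative, not from the PR.
namespace ns
{
  typedef int counter_t;                      // a typedef that gets imported
  inline int twice (int x) { return 2 * x; }  // a function that gets imported
}

// Each using declaration below would typically be represented by a
// DW_TAG_imported_declaration DIE pointing at the DIE of the imported entity.
using ns::counter_t;
using ns::twice;

// A user of both imported names, so that they are actually referenced.
counter_t use_imported (counter_t v)
{
  return twice (v);
}
```

Compiling such a file with -g and inspecting the output with
readelf --debug-dump=info is one way to see which DIEs survive pruning.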

[Bug debug/113562] New: [14 Regression] FAIL: gcc.dg/guality/pr54796.c

2024-01-23 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113562

Bug ID: 113562
   Summary: [14 Regression] FAIL: gcc.dg/guality/pr54796.c
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: rguenth at gcc dot gnu.org
  Target Milestone: ---

On x86-64, r14-8346 caused:

FAIL: gcc.dg/guality/pr54796.c   -O1  -DPREVENT_OPTIMIZATION  line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -O1  -DPREVENT_OPTIMIZATION  line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -O1  -DPREVENT_OPTIMIZATION  line 17 c == 5
FAIL: gcc.dg/guality/pr54796.c   -O2  -DPREVENT_OPTIMIZATION  line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -O2  -DPREVENT_OPTIMIZATION  line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -O2  -DPREVENT_OPTIMIZATION  line 17 c == 5
FAIL: gcc.dg/guality/pr54796.c   -O3 -g  -DPREVENT_OPTIMIZATION  line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -O3 -g  -DPREVENT_OPTIMIZATION  line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -O3 -g  -DPREVENT_OPTIMIZATION  line 17 c == 5
FAIL: gcc.dg/guality/pr54796.c   -Os  -DPREVENT_OPTIMIZATION  line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -Os  -DPREVENT_OPTIMIZATION  line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -Os  -DPREVENT_OPTIMIZATION  line 17 c == 5
FAIL: gcc.dg/guality/pr54796.c  -Og -DPREVENT_OPTIMIZATION  line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c  -Og -DPREVENT_OPTIMIZATION  line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c  -Og -DPREVENT_OPTIMIZATION  line 17 c == 5
FAIL: gcc.dg/guality/pr54796.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line 17 c == 5

[Bug tree-optimization/113467] [14 regression] libgcrypt-1.10.3 is miscompiled

2024-01-23 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113467

--- Comment #17 from Sam James  ---
it was educational ;)

fwiw, no more miscompilations found yet with this

[Bug libstdc++/113294] constexpr error from accessing inactive union member in basic_string after move assignment

2024-01-23 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113294

Patrick Palka  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-23
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||ppalka at gcc dot gnu.org

--- Comment #2 from Patrick Palka  ---
The front end started rejecting the testcase (with libstdc++ 12, 13 and 14
headers) after the active union member change tracking improvements r14-4771. 
Clang also rejects the testcase with libstdc++ 13 headers, which points to this
being a library bug unlike the other related PRs about constexpr std::string. 
Your proposed fix looks good to me.

[Bug target/113356] [14 Regression][aarch64] ICE in try_fuse_pair, at config/aarch64/aarch64-ldp-fusion.cc:2203 since r14-6947-g4b67ec7ff5b1aa

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113356

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:639ae543449a8d6525303458497bd4b897660ec3

commit r14-8367-g639ae543449a8d6525303458497bd4b897660ec3
Author: Alex Coplan 
Date:   Mon Jan 15 11:07:48 2024 +

aarch64: Don't record hazards against paired insns [PR113356]

For the testcase in the PR, we try to pair insns where the first has
writeback and the second uses the updated base register.  This causes us
to record a hazard against the second insn, thus narrowing the move
range away from the end of the BB.

However, it isn't meaningful to record hazards against the other insn
in the pair, as this doesn't change which pairs can be formed, and also
doesn't change where the pair is formed (from the perspective of
nondebug insns).

To see why this is the case, consider the two cases:

 - Suppose we are finding hazards for insns[0].  If we record a hazard
   against insns[1], then range.last becomes
   insns[1]->prev_nondebug_insn (), but note that this is equivalent to
   inserting after insns[1] (since insns[1] is being changed).
 - Now consider finding hazards for insns[1].  Suppose we record
   insns[0] as a hazard.  Then we set range.first = insns[0], which is a
   no-op.

As such, it seems better to never record hazards against the other insn
in the pair, as we check whether the insns themselves are suitable for
combination separately (e.g. for ldp checking that they use distinct
transfer registers).  Avoiding unnecessarily narrowing the move range
avoids unnecessarily re-ordering over debug insns.

This should also mean that we can only narrow the move range away from
the end of the BB in the case that we record a hazard for insns[0]
against insns[1]->prev_nondebug_insn () or earlier.  This means that for
the non-call-exceptions case, either the move range includes insns[1],
or we reject the pair (thus the assert tripped in the PR should always
hold).

gcc/ChangeLog:

PR target/113356
* config/aarch64/aarch64-ldp-fusion.cc
(ldp_bb_info::try_fuse_pair):
Don't record hazards against the opposite insn in the pair.

gcc/testsuite/ChangeLog:

PR target/113356
* gcc.target/aarch64/pr113356.C: New test.
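
The two cases argued in the commit message can be sketched with a toy model.
This is not the code from aarch64-ldp-fusion.cc: insns are modelled as integer
positions in a basic block with no debug insns (so "h - 1" stands in for
prev_nondebug_insn ()), and move_range, compute_range and ignore_partner are
hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <vector>

// Toy model of a move range: the pair may be formed after any insn whose
// position lies in [first, last].
struct move_range { int first; int last; };

// i0, i1:    positions of insns[0] and insns[1] (i0 < i1).
// hazards0:  positions of insns recorded as hazards for insns[0].
// hazards1:  positions of insns recorded as hazards for insns[1].
// ignore_partner: if true, never record the other insn of the pair as a
//                 hazard (the behaviour the commit message argues for).
move_range
compute_range (int i0, int i1, const std::vector<int> &hazards0,
               const std::vector<int> &hazards1, bool ignore_partner)
{
  move_range r = { i0, i1 };
  for (int h : hazards0)
    {
      if (ignore_partner && h == i1)
        continue;
      // A hazard for insns[0] caps the range just before the hazard.
      r.last = std::min (r.last, h - 1);
    }
  for (int h : hazards1)
    {
      if (ignore_partner && h == i0)
        continue;
      // A hazard for insns[1] raises the start of the range to the hazard.
      r.first = std::max (r.first, h);
    }
  return r;
}
```

In this model, with insns[0] at position 2 and insns[1] at position 5 and each
recorded as a hazard for the other, the old behaviour gives the range [2, 4]
and the new behaviour gives [2, 5].  Recording insns[0] as a hazard leaves
range.first unchanged in both cases, and only the new behaviour keeps insns[1]
inside the range.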
