date:20240417

[Bug libstdc++/114750] New: converting load/store of simd fails compilation on ARM

2024-04-17 Thread mkretz at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114750

Bug ID: 114750
   Summary: converting load/store of simd fails compilation on ARM
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkretz at gcc dot gnu.org
  Target Milestone: ---

https://compiler-explorer.com/z/jh5YeqzEe

Testcase:

#include <experimental/simd>

namespace stdx = std::experimental;

void f(long double* ptr, double* dptr, float* fptr)
{
  stdx::simd x;
  //stdx::fixed_size_simd x;
  x.copy_from(ptr, stdx::element_aligned);
  x.copy_to(dptr, stdx::element_aligned);
  x.copy_from(dptr, stdx::element_aligned);
  x.copy_to(fptr, stdx::element_aligned);
  x.copy_from(fptr, stdx::element_aligned);
  x.copy_to(ptr, stdx::element_aligned);
}

This fails when using 'std::experimental::parallelism_v2::__intrinsic_type' in the load and store implementations. The same failure can
occur for '__intrinsic_type' and '__intrinsic_type'.

A condition that checks whether the type is actually vectorizable for the
target is missing.
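
A minimal sketch of the kind of guard the report asks for, not the libstdc++ implementation. The trait is_vectorizable_sketch_v and the function converting_load are hypothetical names, and the "8 bytes or smaller" rule is only an assumed stand-in for the real per-target check. The point is the dispatch: the vector path is only entered when the memory type passes the check, otherwise elements are copied one by one.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical trait: pretend that only arithmetic types of at most 8 bytes
// can live in a vector register on the target. The real check differs.
template <typename T>
inline constexpr bool is_vectorizable_sketch_v =
    std::is_arithmetic_v<T> && sizeof(T) <= 8;

// Converting load of n elements from src into dst.
// Returns true if the (pretend) vector path was eligible, false if the
// scalar fallback had to be used.
template <typename To, typename From>
bool converting_load(To* dst, const From* src, std::size_t n)
{
  assert(dst != nullptr && src != nullptr);
  if constexpr (is_vectorizable_sketch_v<From>)
    {
      // A real implementation would form a vector of From here and convert
      // it; this sketch just copies, because only the dispatch matters.
      for (std::size_t i = 0; i < n; ++i)
        dst[i] = static_cast<To>(src[i]);
      return true;
    }
  else
    {
      // Fallback: never name a vector-of-From type, copy element by element.
      for (std::size_t i = 0; i < n; ++i)
        dst[i] = static_cast<To>(src[i]);
      return false;
    }
}
```

Because the branch is selected with if constexpr, the discarded branch is never instantiated, so an ill-formed vector type would not be named for a non-vectorizable From.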

[Bug preprocessor/114748] [14 Regression] libcpp aclocal.m4 and configure incorrectly regenerated

2024-04-17 Thread clyon at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114748

--- Comment #6 from Christophe Lyon  ---
(In reply to Andrew Pinski from comment #1)
> The last time aclocal.m4 had an include for override.m4 was
> r9-3776-g22e052725189a4 . 

IIUC that commit actually removed the include for override.m4 ?

> Are you sure you are using the correct autoconf/automake version?
Yes, autoconf-2.69 and automake-1.15.1. I'm updating autoregen.py in the
sourceware buildbot.

[Bug libstdc++/114750] converting load/store of simd fails compilation on ARM

2024-04-17 Thread mkretz at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114750

Matthias Kretz (Vir)  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |mkretz at gcc dot gnu.org
   Target Milestone|--- |14.0
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-04-17

[Bug fortran/103312] [11/12/13/14 Regression] ICE in gfc_find_component since r9-1098-g3cf89a7b992d483e

2024-04-17 Thread pault at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103312

Paul Thomas  changed:

   What|Removed |Added

 CC||pault at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org

--- Comment #6 from Paul Thomas  ---
Created attachment 57969
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57969&action=edit
Partial fix for the PR

The supplied testcase generates completely blank derived type symbols for the
_vptr component of 'this' in 'func'. The chunk in resolve.cc fixes that.

The rest of the patch allows the full testcase below to blast through to
translation, where it dies in trans-decl.cc, again with blank symbols, this
time in the default initializer.

Of the compilers to which I have access, only NAG succeeds with the full
testcase. If this%size() is replaced with a constant expression or an integer
dummy argument, all compilers succeed, including current versions of gfortran.

I have taken it but need to get on with daytime work for a few days.

Paul

module example

  type, abstract :: foo
    integer :: i
  contains
    procedure(foo_size), deferred :: size
    procedure(foo_func), deferred :: func
  end type

  interface
    function foo_func (this) result (string)
      import :: foo
      class(foo) :: this
      character(this%size()) :: string
    end function
    pure integer function foo_size (this)
      import foo
      class(foo), intent(in) :: this
    end function
  end interface

end module

module extension
  use example
  implicit none
  type, extends(foo) :: bar
    !integer :: i
  contains
    procedure :: size
    procedure :: func
  end type

contains
  pure integer function size (this)
    class(bar), intent(in) :: this
    size = this%i
  end function
  function func (this) result (string)
    class(bar) :: this
    character(this%size()) :: string
    string = repeat ("x", len (string))
  end function

end module

  use example
  use extension
  type(bar) :: a
  a%i = 5
  print *, a%func()
end

[Bug c++/103696] pragma optimization is not applying to Lambdas

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103696

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug preprocessor/114748] [14 Regression] libcpp aclocal.m4 and configure incorrectly regenerated

2024-04-17 Thread clyon at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114748

--- Comment #7 from Christophe Lyon  ---
So yes indeed at r14-5423-gfbe4e64365ec7f, autoreconf will generate the same
contents, but starting at r14-5424-gdb50aea6259545 we get this discrepancy.

We can probably commit the "fixed" version, but should we investigate why
override.m4 is needed again?

[Bug gcov-profile/114751] New: .gcda:stamp mismatch with notes file

2024-04-17 Thread gejoed at rediffmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114751

Bug ID: 114751
   Summary: .gcda:stamp mismatch with notes file
   Product: gcc
   Version: 11.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gejoed at rediffmail dot com
  Target Milestone: ---

Hi team,

In my organisation's project we use a gcc/gcov that is customised by another
team. With version 11.4.0 we see a '.gcda:stamp mismatch with notes file'
message while running the gcov tool. The same command works fine on the
previous branch, which uses gcc 10.3.0.

For example:
/gcov -l src-file.c -o ./src-file-obj-dir-path

src-file.gcda:stamp mismatch with notes file
File 'src-file.c'
Lines executed:0.00% of 36
Creating 'src-file.c.gcov'

File '../../abcd/ef/hijk.h'
Lines executed:0.00% of 2
Creating 'src-file.c##hijk.h.gcov'

File '../include/lmno.h'
Lines executed:0.00% of 2
Creating 'src-file.c##lmno.h.gcov'

Lines executed:0.00% of 40

This is seen for several files in the new branch of code where gcc v11.4.0 is
used and not seen with older branch where the gcc version is 10.3.0.

We generate the gcov based image (compiled with --coverage in CFLAGS) , load
the image on the device, do testing, collect gcda files from device and put it
back in the same obj-dir location and run the gcov tool to get the gcov files
(eg :src-file.c.gcov).

We checked the time stamp of gcda vs gcno files and even then the issue was
seen. Was there any specific enhancement with 11.x version which would be
causing the issue or is there any further check to be done during the
compilation with --coverage or during the gcov tool run ?

Due to my organisation restriction, I'm not able to give more info on the file
names used and the make file info but I can try the best to get the info for
your understanding. 
Awaiting valuable reply. 

Thank you team for your support always !

[Bug middle-end/17951] Dominance info is incorrect for entry and exit blocks

2024-04-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17951

--- Comment #2 from Richard Biener  ---
Created attachment 57970
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57970&action=edit
not quite working patch

Some TLC to all this might make fixing easier.  This is a start (at fixing, not
TLC).  At least CDI_DOMINATORS for EXIT and CDI_POST_DOMINATORS for ENTRY
should be fixed.  Or asserts should be put in so that we don't query the bogus
info.

[Bug middle-end/23096] Wrong folding for FLOOR_MOD_EXPR

2024-04-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23096

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |WAITING

--- Comment #2 from Richard Biener  ---
Writing a self-test for this shows we do not fold (51 - 7) %[fl] 3.  But
I can't spot substantial differences in the PLUS/MINUS_EXPR handling.  In
particular we still do MINUS->PLUS by negating op1.

So I'm not sure the original issue was really folding with those constants?
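
For reference, a small sketch of what a floor modulus computes, written on top of C++'s truncating % operator. The function name floor_mod is an illustrative choice and this is not GCC's folding code; it only shows the value a correct fold of (51 - 7) %[fl] 3 would have to produce.

```cpp
#include <cassert>

// Remainder with the sign of the divisor (floor-modulus semantics),
// built on top of C++'s truncating % operator.
constexpr long floor_mod(long a, long b)
{
  assert(b != 0);
  long r = a % b;                      // truncating remainder, sign follows a
  if (r != 0 && ((r < 0) != (b < 0)))
    r += b;                            // shift into the divisor's sign range
  return r;
}

static_assert(floor_mod(51 - 7, 3) == 2, "same as 44 % 3");
static_assert(floor_mod(-7, 3) == 2, "whereas -7 % 3 == -1");
static_assert(floor_mod(7, -3) == -2, "whereas 7 % -3 == 1");
```

The two flavours only differ when the operands have different signs, which is why a test with two positive constants cannot distinguish them.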

[Bug target/114752] New: AVR: internal compiler error. Unknown mode: const_double:DF

2024-04-17 Thread gjl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114752

Bug ID: 114752
   Summary: AVR: internal compiler error. Unknown mode: const_double:DF
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Inline asm does not accept 64-bit float constants:

void func (void)
{
__asm ("; %0" :: "EF" (1.0L));
}


foo.c: In function 'func':
foo.c: error: internal compiler error.  Unknown mode:
  | }
  | ^
(const_double:DF 1.0e+0 [0x0.8p+1])
during RTL pass: final
foo.c:4:1: internal compiler error: in avr_print_operand, at config/avr/avr.cc:3937

[Bug target/114752] AVR: internal compiler error. Unknown mode: const_double:DF

2024-04-17 Thread gjl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114752

Georg-Johann Lay  changed:

   What|Removed |Added

   Priority|P3  |P5
   Keywords||ice-on-valid-code
 Target||avr

[Bug sanitizer/114743] ICE in build_check_stmt at asan.cc:2707 while compiling gcc.dg/ubsan/pr112709-2.c with -fsanitize=address

2024-04-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114743

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:299d14a54672a4d12c1abbe4031a732bb56cddaa

commit r14-1-g299d14a54672a4d12c1abbe4031a732bb56cddaa
Author: Jakub Jelinek 
Date:   Wed Apr 17 10:24:18 2024 +0200

asan: Don't instrument .ABNORMAL_DISPATCHER [PR114743]

.ABNORMAL_DISPATCHER is currently the only internal function with
ECF_NORETURN, and asan likes to instrument ECF_NORETURN calls by adding
some builtin call before them, which breaks the .ABNORMAL_DISPATCHER
discovery added in gsi_safe_*.

The following patch fixes asan not to instrument .ABNORMAL_DISPATCHER
calls, like it doesn't instrument a couple of specific builtin calls
as well.

2024-04-17  Jakub Jelinek  

PR sanitizer/114743
* asan.cc (maybe_instrument_call): Don't instrument calls to
.ABNORMAL_DISPATCHER.

* gcc.dg/asan/pr112709-2.c (freddy): New function from
gcc.dg/ubsan/pr112709-2.c version of the test.

[Bug sanitizer/114743] ICE in build_check_stmt at asan.cc:2707 while compiling gcc.dg/ubsan/pr112709-2.c with -fsanitize=address

2024-04-17 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114743

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Jakub Jelinek  ---
Fixed.

[Bug libstdc++/114750] converting load/store of simd fails compilation on ARM

2024-04-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114750

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Matthias Kretz :

https://gcc.gnu.org/g:0fc7f3c6adc8543f55ec35b309016d9d9c4ddd35

commit r14-10001-g0fc7f3c6adc8543f55ec35b309016d9d9c4ddd35
Author: Matthias Kretz 
Date:   Wed Apr 17 09:11:25 2024 +0200

libstdc++: Avoid ill-formed types on ARM

This resolves failing tests in check-simd.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/114750
* include/experimental/bits/simd_builtin.h
(_SimdImplBuiltin::_S_load, _S_store): Fall back to copying
scalars if the memory type cannot be vectorized for the target.

[Bug target/114752] AVR: internal compiler error. Unknown mode: const_double:DF

2024-04-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114752

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Georg-Johann Lay :

https://gcc.gnu.org/g:909c6faf2c726178d115726e56304eac91cff6e9

commit r14-10003-g909c6faf2c726178d115726e56304eac91cff6e9
Author: Georg-Johann Lay 
Date:   Wed Apr 17 10:26:05 2024 +0200

AVR: target/114752 - Fix ICE on inline asm const 64-bit float operand

gcc/
PR target/114752
* config/avr/avr.cc (avr_print_operand) [CONST_DOUBLE_P]: Handle
DFmode.

[Bug tree-optimization/114749] [14] RISC-V rv64gcv ICE: in vectorizable_load, at tree-vect-stmts.cc

2024-04-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114749

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-17
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
Confirmed.  We end up with a load using VMAT_ELEMENTWISE and using partial
vectors for this grouped load with single element interleaving and a
gap of 1802.

I have a fix, the issue seems latent.

[Bug target/114432] [13 Regression] ICE in connect_traces, at dwarf2cfi.cc:3079 on s390x-linux-gnu

2024-04-17 Thread stefansf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114432

Stefan Schulze Frielinghaus  changed:

   What|Removed |Added

   Last reconfirmed||2024-4-17
 CC||stefansf at gcc dot gnu.org

--- Comment #1 from Stefan Schulze Frielinghaus ---
Reproducible via

$ wget https://www.codelabs.ch/download/libalog-0.6.2.tar.bz2
$ tar xf libalog-0.6.2.tar.bz2
$ gcc -c -x ada -gnatA -gnatygAdISuxo -gnatVa -gnatwal -gnatf -fstack-check -gnato -g -O2 -fno-omit-frame-pointer -mbackchain -gnatwe -fPIC libalog-0.6.2/src/alog-active_logger.adb

Started with r12-4926-g79fe28d2c4b785
Confirmed on trunk r14--g9c7cf5d71f0716

@Doko does Debian use -mbackchain? I can reproduce this on Ubuntu even for z196
but -mbackchain and -fstack-check are required in order to fail. For me 12.3 is
failing, too, which matches with my bisect. How did you test 12.3?

[Bug rtl-optimization/96865] ICE in hash_rtx_cb, at cse.c:2548

2024-04-17 Thread jeevitha at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96865

Jeevitha  changed:

   What|Removed |Added

 CC||jeevitha at gcc dot gnu.org

--- Comment #1 from Jeevitha  ---
The issue was in the following RTL,
(insn 6 2 21 2 (parallel [
(set (reg:SI 96 lr)
(unspec:SI [
(symbol_ref:SI ("_GLOBAL_OFFSET_TABLE_") [flags 0x42])
(label_ref 0)
] UNSPEC_TOCPTR))
(code_label 0 0 0 2 (nil) [2 uses])
]) "test.c":6:6 798 {load_toc_v4_PIC_1b_normal}

The above RTL will generate the following assembly:

bcl 20,31,$+8

There is no need for 'label_ref 0' and 'code_label' in the RTL instruction, as
the assembly does not require them. Is my understanding correct?

Currently, we are encountering an ICE due to the following reason: In the
'hash_rtx' function [cse.cc], handling for 'LABEL_REF' is present but not for
'code_label', causing it to not return a hash directly. Instead, it searches
for the format corresponding to '(code_label 0 0 0 4 (nil))', which is
"uuB00is". In the provided code snippet below, 'B' and 'u' do not have case
handling, leading to a 'gcc_unreachable'. This is the reason for the ICE
currently.


  fmt = GET_RTX_FORMAT (code);  // code -> code_label
  for (; i >= 0; i--)
    {
      switch (fmt[i])
        {
        case 'e':
          /* If we are about to do the last recursive call
             needed at this level, change it into iteration.
             This function is called enough to be worth it.  */
          if (i == 0)
            {
              x = XEXP (x, i);
              goto repeat;
            }

          hash += hash_rtx (XEXP (x, i), VOIDmode, do_not_record_p,
                            hash_arg_in_memory_p,
                            have_reg_qty, cb);
          break;

        case 'E':
          for (j = 0; j < XVECLEN (x, i); j++)
            hash += hash_rtx (XVECEXP (x, i, j), VOIDmode, do_not_record_p,
                              hash_arg_in_memory_p,
                              have_reg_qty, cb);
          break;

        case 's':
          hash += hash_rtx_string (XSTR (x, i));
          break;

        case 'i':
          hash += (unsigned int) XINT (x, i);
          break;

        case 'p':
          hash += constant_lower_bound (SUBREG_BYTE (x));
          break;

        case '0': case 't':
          /* Unused.  */
          break;

        default:
          gcc_unreachable ();
        }
    }
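
The failure mode described above can be modelled outside the compiler. The following is only a sketch: has_case and first_unhandled are hypothetical helpers, and the set of handled letters is copied from the quoted switch. It walks a format string from the last operand to the first, as the quoted loop does, and reports the first letter that would fall into the default case.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Format letters that the quoted switch has a case for.
constexpr bool has_case(char c)
{
  switch (c)
    {
    case 'e': case 'E': case 's': case 'i': case 'p': case '0': case 't':
      return true;
    default:
      return false;
    }
}

// Walk the format string from the last operand to the first and return the
// first letter that would reach the default case, if any.
std::optional<char> first_unhandled(std::string_view fmt)
{
  assert(!fmt.empty());
  for (std::size_t i = fmt.size(); i-- > 0;)
    if (!has_case(fmt[i]))
      return fmt[i];
  return std::nullopt;
}
```

For the format "uuB00is" quoted above, the walk passes 's', 'i' and both '0' operands and then stops at 'B', which matches the analysis that 'B' and 'u' have no case.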

To address this, I've removed 'operand1' and adjusted the respective
'match_dup' to 'operand0' in the pattern below.

(define_insn "load_toc_v4_PIC_1b_normal"
  [(set (reg:SI LR_REGNO)
        (unspec:SI [(match_operand:SI 0 "immediate_operand" "s")
                    (label_ref (match_operand 1 "" ""))]
                   UNSPEC_TOCPTR))
   (match_dup 1)]

After implementing the mentioned change, there were no ICEs or regressions. Is
this approach correct?

[Bug target/82731] _mm256_set_epi8(array[offset[0]], array[offset[1]], ...) byte gather makes slow code, trying to zero-extend all the uint16_t offsets first and spilling them.

2024-04-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731

Hongtao Liu  changed:

   What|Removed |Added

 CC||liuhongt at gcc dot gnu.org

--- Comment #3 from Hongtao Liu  ---
Looks like ix86_vect_estimate_reg_pressure doesn't work here, taking a look.

[Bug gcov-profile/114751] .gcda:stamp mismatch with notes file

2024-04-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114751

--- Comment #1 from Richard Biener  ---
From reading what the gcov code does it somehow means that the gcda and gcno
files were not created consistently.

You can use gcov-dump to check the stamp, for an example pair I have around
I see consistent stamps:

> gcov-dump-13 t.gcda 
t.gcda:data:magic `gcda':version `B32*'
t.gcda:stamp 3774884159
t.gcda:checksum 2538568055
...

> gcov-dump-13 t.gcno
t.gcno:note:magic `gcno':version `B32*'
t.gcno:stamp 3774884159
t.gcno:checksum 0
...
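
The same comparison can be sketched in code. This is an illustration under an assumption, not a replacement for gcov-dump: it assumes that both file kinds start with three 32-bit words in host byte order (magic, version, stamp), and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Assumed header layout (three 32-bit words in host byte order):
//   word 0: magic, word 1: version, word 2: stamp.
std::optional<std::uint32_t> read_stamp(const std::vector<unsigned char>& file)
{
  constexpr std::size_t stamp_offset = 2 * sizeof(std::uint32_t);
  if (file.size() < stamp_offset + sizeof(std::uint32_t))
    return std::nullopt;              // too short to contain a stamp
  std::uint32_t stamp;
  std::memcpy(&stamp, file.data() + stamp_offset, sizeof stamp);
  return stamp;
}

// True only when both images carry a stamp and the stamps are equal.
bool stamps_match(const std::vector<unsigned char>& gcda,
                  const std::vector<unsigned char>& gcno)
{
  const auto a = read_stamp(gcda);
  const auto b = read_stamp(gcno);
  return a && b && *a == *b;
}

// Helper for experiments: build a header-only image from three words.
std::vector<unsigned char> make_header(std::uint32_t magic,
                                       std::uint32_t version,
                                       std::uint32_t stamp)
{
  const std::uint32_t words[3] = {magic, version, stamp};
  std::vector<unsigned char> bytes(sizeof words);
  assert(bytes.size() == 3 * sizeof(std::uint32_t));
  std::memcpy(bytes.data(), words, sizeof words);
  return bytes;
}
```

With the stamp 3774884159 in both images, as in the pair shown above, stamps_match returns true; two different stamps make it return false, which is the situation gcov reports as a mismatch.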

[Bug target/114752] AVR: internal compiler error. Unknown mode: const_double:DF

2024-04-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114752

--- Comment #2 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Georg-Johann Lay:

https://gcc.gnu.org/g:ca7d454804045a39d10a9b1f691a940aeacdf25b

commit r13-8616-gca7d454804045a39d10a9b1f691a940aeacdf25b
Author: Georg-Johann Lay 
Date:   Wed Apr 17 10:26:05 2024 +0200

AVR: target/114752 - Fix ICE on inline asm const 64-bit float operand

gcc/
PR target/114752
* config/avr/avr.cc (avr_print_operand) [CONST_DOUBLE_P]: Handle
DFmode.

(cherry picked from commit 909c6faf2c726178d115726e56304eac91cff6e9)

[Bug target/114752] AVR: internal compiler error. Unknown mode: const_double:DF

2024-04-17 Thread gjl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114752

Georg-Johann Lay  changed:

   What|Removed |Added

   Target Milestone|--- |13.3
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Georg-Johann Lay  ---
Fixed in v13.3+

[Bug tree-optimization/114749] [14] RISC-V rv64gcv ICE: in vectorizable_load, at tree-vect-stmts.cc

2024-04-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114749

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:bf2b5231312e1cea45732cb8df6ffa2b2c9115b6

commit r14-10005-gbf2b5231312e1cea45732cb8df6ffa2b2c9115b6
Author: Richard Biener 
Date:   Wed Apr 17 10:40:04 2024 +0200

tree-optimization/114749 - reset partial vector decision for no-SLP retry

The following makes sure to reset LOOP_VINFO_USING_PARTIAL_VECTORS_P
to its default of false when re-trying without SLP as otherwise
analysis may run into bogus asserts.

PR tree-optimization/114749
* tree-vect-loop.cc (vect_analyze_loop_2): Reset
LOOP_VINFO_USING_PARTIAL_VECTORS_P when re-trying without SLP.

[Bug target/82731] _mm256_set_epi8(array[offset[0]], array[offset[1]], ...) byte gather makes slow code, trying to zero-extend all the uint16_t offsets first and spilling them.

2024-04-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731

--- Comment #4 from Hongtao Liu  ---
(In reply to Hongtao Liu from comment #3)
> Looks like ix86_vect_estimate_reg_pressure doesn't work here, taking a look.

Oh, ix86_vect_estimate_reg_pressure is only for loop, BB vectorizer only use
ix86_builtin_vectorization_cost, but not add_stmt_cost/finish_cost.

[Bug tree-optimization/114749] [13 Regression] RISC-V rv64gcv ICE: in vectorizable_load, at tree-vect-stmts.cc

2024-04-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114749

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.3
Summary|[14] RISC-V rv64gcv ICE: in vectorizable_load, at tree-vect-stmts.cc |[13 Regression] RISC-V rv64gcv ICE: in vectorizable_load, at tree-vect-stmts.cc
   Priority|P3  |P2

--- Comment #3 from Richard Biener  ---
I'm going to at least backport to 13.

[Bug target/82731] _mm256_set_epi8(array[offset[0]], array[offset[1]], ...) byte gather makes slow code, trying to zero-extend all the uint16_t offsets first and spilling them.

2024-04-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731

--- Comment #5 from Richard Biener  ---
We do not BB vectorize gathers I think (ISTR some "loop" uses in the
infrastructure, not too difficult to fix I guess).

In the end the problem is RTL expansion of the CTOR and then lack of
combine?

Look at how we RTL expand

typedef char __v32qi __attribute__((vector_size(32)));

__v32qi
_mm256_set_epi8  (char __q31, char __q30, char __q29, char __q28,
  char __q27, char __q26, char __q25, char __q24,
  char __q23, char __q22, char __q21, char __q20,
  char __q19, char __q18, char __q17, char __q16,
  char __q15, char __q14, char __q13, char __q12,
  char __q11, char __q10, char __q09, char __q08,
  char __q07, char __q06, char __q05, char __q04,
  char __q03, char __q02, char __q01, char __q00)
{
  return __extension__ (__v32qi){
__q00, __q01, __q02, __q03, __q04, __q05, __q06, __q07,
__q08, __q09, __q10, __q11, __q12, __q13, __q14, __q15,
__q16, __q17, __q18, __q19, __q20, __q21, __q22, __q23,
__q24, __q25, __q26, __q27, __q28, __q29, __q30, __q31
  };
}

[Bug gcov-profile/114751] .gcda:stamp mismatch with notes file

2024-04-17 Thread gejoed at rediffmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114751

--- Comment #2 from Gejoe  ---

For me, it is like this (just keeping the sample filenames as such but the
values are real ones shown while checking with gcov-dump) :

$gcov-dump ./obj-dir-path/src-file.gcda
./obj-dir-path/src-file.gcda:data:magic `gcda':version `B14*'
./obj-dir-path/src-file.gcda:stamp 2912455990
:


$ gcov-dump ./obj-dir-path/src-file.gcno
./obj-dir-path/src-file.gcno:note:magic `gcno':version `B14*'
./obj-dir-path/src-file.gcno:stamp 2912494680
:


Does this indicate something more to be checked ?

In the previous branch, where gcc 10.3.0 is used, I see the same stamp value
in the gcov-dump output of the corresponding gcda and gcno files: 3176078538.

Awaiting reply.

Thanks.

[Bug modula2/114745] const cast causes ICE

2024-04-17 Thread gaius at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114745

Gaius Mulley  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Gaius Mulley  ---
Closing now that the patch has been applied.

[Bug c++/114753] New: from_chars aborts with -m32 -ftrapv when passed -9223372036854775808

2024-04-17 Thread gnu.ojxq8 at dralias dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114753

Bug ID: 114753
   Summary: from_chars aborts with -m32 -ftrapv when passed -9223372036854775808
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gnu.ojxq8 at dralias dot com
  Target Milestone: ---

std::from_chars may abort when used with -m32 -ftrapv on some values.

Without -m32, or without -ftrapv, or using clang, the code works correctly.

To reproduce:

$ cat a.cpp 
#include 
#include 
#include 
int main() {
  int64_t result{};
  std::string_view str{"-9223372036854775808"};
  (void)std::from_chars(str.begin(), str.end(), result);
  return result != -9223372036854775807 - 1;
}


$  g++ -m32 -ftrapv -std=c++17 ./a.cpp && ./a.out 
Aborted (core dumped)

[Bug target/82731] _mm256_set_epi8(array[offset[0]], array[offset[1]], ...) byte gather makes slow code, trying to zero-extend all the uint16_t offsets first and spilling them.

2024-04-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731

--- Comment #6 from Richard Biener  ---
That's ix86_expand_vector_init_interleave which for QI inner_mode extends
to SImode, likely because it tries to work with just SSE2?

[Bug c++/114753] from_chars aborts with -m32 -ftrapv when passed -9223372036854775808

2024-04-17 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114753

Jonathan Wakely  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-04-17
 Status|UNCONFIRMED |NEW

--- Comment #1 from Jonathan Wakely  ---
The abort happens here:

  if (__builtin_mul_overflow(__val, __sign, &__tmp))

With __val = 9223372036854775808LL __sign = -1LL

The libgcc2.c:__mulvdi3 function reaches the abort() on line 375

[Bug c++/114753] from_chars aborts with -m32 -ftrapv when passed -9223372036854775808

2024-04-17 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114753

--- Comment #2 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #1)
> With __val = 9223372036854775808LL __sign = -1LL

Oops, that should be __val = 9223372036854775808ULL and __sign = -1

i.e.

int main()
{
  unsigned long long val = 9223372036854775808ULL;
  int sign = -1;
  long long res;
  return __builtin_mul_overflow(val, sign, &res);
}

This shouldn't trap.
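
To make the expected behaviour concrete, here is a small sketch built on the same GCC/Clang builtin. The wrapper name negate_fits is hypothetical. __builtin_mul_overflow computes the product as if in infinite precision and returns true only when it does not fit the result type, so 2^63 times -1 fits in long long exactly.

```cpp
#include <cassert>
#include <climits>

// Multiply an unsigned 64-bit magnitude by a sign and report whether the
// infinite-precision product fits in long long.
// __builtin_mul_overflow is a GCC/Clang builtin; it returns true on overflow.
bool negate_fits(unsigned long long magnitude, int sign, long long& out)
{
  assert(sign == 1 || sign == -1);
  return !__builtin_mul_overflow(magnitude, sign, &out);
}
```

Called with 9223372036854775808ULL and -1, the wrapper is expected to return true and store LLONG_MIN; with a sign of 1 the same magnitude does not fit and it returns false. Neither case should abort.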

[Bug middle-end/114753] from_chars aborts with -m32 -ftrapv when passed -9223372036854775808

2024-04-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114753

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code
  Component|c++ |middle-end
Version|unknown |14.0

--- Comment #3 from Richard Biener  ---
Confirmed.  .MUL_OVERFLOW expansion likely ends up calling expand_binop
which decides for itself whether the op traps based on the result type.

[Bug target/82731] _mm256_set_epi8(array[offset[0]], array[offset[1]], ...) byte gather makes slow code, trying to zero-extend all the uint16_t offsets first and spilling them.

2024-04-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731

--- Comment #7 from Hongtao Liu  ---
(In reply to Hongtao Liu from comment #4)
> (In reply to Hongtao Liu from comment #3)
> > Looks like ix86_vect_estimate_reg_pressure doesn't work here, taking a look.
> 
> Oh, ix86_vect_estimate_reg_pressure is only for loop, BB vectorizer only use
> ix86_builtin_vectorization_cost, but not add_stmt_cost/finish_cost.

Oh, the CTOR comes from the source code, not from the vectorizer.
Then why are those loads from offset not moved to just before their consumers
(the loads from array), so that the live ranges of those values can be
shortened? (The loads from array are moved to just before the CTOR insns.)

[Bug middle-end/114753] from_chars aborts with -m32 -ftrapv when passed -9223372036854775808

2024-04-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114753

Richard Biener  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #4 from Richard Biener  ---
#0  expand_binop (mode=E_DImode, binoptab=smulv_optab, op0=0x771b53f0, 
op1=0x771b85a0, target=0x0, unsignedp=0, methods=OPTAB_LIB_WIDEN)
at /space/rguenther/src/gcc/gcc/optabs.cc:1497
#1  0x0116edf5 in expand_mult (mode=E_DImode, op0=0x771b53f0, 
op1=0x771b85a0, target=0x0, unsignedp=0, no_libcall=false)
at /space/rguenther/src/gcc/gcc/expmed.cc:3610
#2  0x011a38e0 in expand_expr_real_2 (ops=0x7fffcef0, target=0x0, 
tmode=E_DImode, modifier=EXPAND_NORMAL)
at /space/rguenther/src/gcc/gcc/expr.cc:10275
#3  0x01359608 in expand_mul_overflow (loc=2147483653, 
lhs=, arg0=, 
arg1=, unsr_p=false, uns0_p=false, uns1_p=true, 
is_ubsan=false, datap=0x0)
at /space/rguenther/src/gcc/gcc/internal-fn.cc:2359
#4  0x0135b7f7 in expand_arith_overflow (code=MULT_EXPR, 
stmt=)
at /space/rguenther/src/gcc/gcc/internal-fn.cc:2827
#5  0x0135b9c3 in expand_MUL_OVERFLOW (stmt=0x77019f30)
at /space/rguenther/src/gcc/gcc/internal-fn.cc:2876

so a hack would be to reset flag_trapv around

2354 For unsigned multiplication when high parts are both non-zero
2355 this overflows always.  */
2356  ops.code = MULT_EXPR;
2357  ops.op0 = make_tree (type, op0);
2358  ops.op1 = make_tree (type, op1);
2359  tem = expand_expr_real_2 (&ops, NULL_RTX, mode, EXPAND_NORMAL);

in expand_mul_overflow.

[Bug target/114432] [13 Regression] ICE in connect_traces, at dwarf2cfi.cc:3079 on s390x-linux-gnu

2024-04-17 Thread stefansf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114432

--- Comment #2 from Stefan Schulze Frielinghaus ---
Fails for function alog.active_logger.logging_taskT and trace 2 whose heads are

(gdb) call debug(ti->head)
(code_label 48 573 49 152 (nil) [2 uses])

(gdb) call debug(ti->eh_head)
(insn 57 765 58 (set (reg/f:DI 14 %r14 [orig:74 _39 ] [74])
(mem/f:DI (reg/f:DI 10 %r10 [orig:123 _task ] [123]) [0
_task_25(D)->parent+0 S8 A64]))
"libalog-0.6.2/src/alog-active_logger.adb":252:33 discrim 2 1477 {*movdi_64}
 (expr_list:REG_EH_REGION (const_int 6 [0x6])
(nil)))

Looking at the trace there exists no insn with a ARGS_SIZE note which is why

gcc_assert (!ti->args_size_undefined || ti->args_size_defined_for_eh);

fails.

[Bug target/114432] [13 Regression] ICE in connect_traces, at dwarf2cfi.cc:3079 on s390x-linux-gnu

2024-04-17 Thread stefansf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114432

--- Comment #3 from Stefan Schulze Frielinghaus ---
Created attachment 57971
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57971&action=edit
dwarf2cfi dump for alog-active_logger.adb

[Bug middle-end/114753] from_chars aborts with -m32 -ftrapv when passed -9223372036854775808

2024-04-17 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114753

--- Comment #5 from Jakub Jelinek  ---
I wouldn't call it a hack, I'd say it is the right fix.
Though, we have tons of those in internal-fn.cc
  ops.code = MULT_EXPR;
  ops.code = MULT_EXPR;
  ops.code = MULT_EXPR;
  ops.code = MULT_EXPR;
  ops.code = MULT_EXPR;
  ops.code = WIDEN_MULT_EXPR;
  ops.code = MULT_HIGHPART_EXPR;
  ops.code = MULT_EXPR;
  ops.code = WIDEN_MULT_EXPR;
  ops.code = MULT_EXPR;
  ops.code = MULT_EXPR;
  ops.code = PLUS_EXPR;
  ops.code = TRUNC_DIV_EXPR;
  ops.code = TRUNC_MOD_EXPR;
plus some which reuse an earlier-set ops.code, plus some which use just some
variable as ops.code.  Sure, I think neither WIDEN_MULT_EXPR/MULT_HIGHPART_EXPR
nor the div/mod probably consider flag_trapv, but MULT_EXPR/PLUS_EXPR do
(though perhaps the PLUS_EXPR is for vectors only).
So, the question is where to put that.
To cover everything, I bet the best would be to put it into expand_mul_overflow,
expand_arith_overflow and expand_vector_ubsan_overflow.
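
One way to cover a whole function body with such a reset is a scope guard. The sketch below uses stand-in names only: flag_trapv_sketch is not the real GCC variable, and trapv_disabler and flag_seen_during_expansion are hypothetical. It is not the attached patch, just an illustration of saving the flag on entry and restoring it on every exit path.

```cpp
#include <cassert>

// Stand-in for the compiler's global option flag; NOT the real GCC variable.
int flag_trapv_sketch = 1;

// Hypothetical RAII guard: clears the flag on construction and restores the
// previous value on destruction, including on early returns.
class trapv_disabler
{
public:
  trapv_disabler() : saved_(flag_trapv_sketch) { flag_trapv_sketch = 0; }
  ~trapv_disabler() { flag_trapv_sketch = saved_; }
  trapv_disabler(const trapv_disabler&) = delete;
  trapv_disabler& operator=(const trapv_disabler&) = delete;

private:
  int saved_;
};

// Stand-in for an expander: everything it expands internally sees the flag
// cleared, and the caller sees the original value again afterwards.
int flag_seen_during_expansion()
{
  trapv_disabler guard;
  assert(flag_trapv_sketch == 0);
  return flag_trapv_sketch;
}
```

Placing one guard at the top of each of the three functions named above would cover every ops.code assignment inside them without touching the individual call sites.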

[Bug middle-end/114753] from_chars aborts with -m32 -ftrapv when passed -9223372036854775808

2024-04-17 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114753

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek  ---
Created attachment 57972
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57972&action=edit
gcc14-pr114753.patch

Untested fix.

[Bug middle-end/114754] New: [OpenMP] Missing 'uses_allocators' diagnostic

2024-04-17 Thread burnus at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114754

Bug ID: 114754
   Summary: [OpenMP] Missing 'uses_allocators' diagnostic
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: accepts-invalid, diagnostic, openmp
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

Cf. https://github.com/SOLLVE/sollve_vv/pull/802

"LLVM error:
 error: allocator must be specified in the 'uses_allocators' clause
adding uses_allocators(omp_default_mem_alloc) fixes the problem"

OpenMP spec has under 'Restrictions to the *target* construct are as follows:'

"Memory allocators that do not appear in a *uses_allocators* clause cannot
appear as an allocator in an *allocate* clause or be used in the *target*
region unless a *requires* directive with the *dynamic_allocators* clause is
present in the same compilation unit."

Example snippets, based on the sollve_vv testcase.

The OG13 patch only diagnoses the issue in the last (4th) directive;
clang diagnoses both 'allocate' clause variants (2nd + 4th), but neither
diagnoses the 1st/3rd ones.

* * *

   omp_allocator_handle_t al = omp_init_allocator(omp_default_mem_space, 0, NULL);
   #pragma omp target
   {
     int *y = omp_alloc(omp_default_mem_alloc, sizeof(1));
   }

   #pragma omp target allocate(omp_default_mem_alloc:x) firstprivate(x) map(from: device_result)
   {
     for (int i = 0; i < N; i++)
       x += i;
     device_result = x;
   }

   #pragma omp target firstprivate(al)
   {
     int *y = omp_alloc(al, sizeof(1));
   }

   #pragma omp target allocate(al:x) firstprivate(x) map(from: device_result)
   {
     for (int i = 0; i < N; i++)
       x += i;
     device_result = x;
   }

[Bug c/90181] Feature request: provide a way to explicitly select specific named registers in constraints

2024-04-17 Thread pskocik at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90181

Petr Skocik  changed:

   What|Removed |Added

 CC||pskocik at gmail dot com

--- Comment #16 from Petr Skocik  ---
The current way of loading stuff into regs that don't have a specific
constraint for them also breaks on gcc (but not on clang) if the variable is
marked const.
https://godbolt.org/z/1PvYsrqG9

[Bug c++/105841] [12 Regression] Change in behavior of CTAD for alias templates

2024-04-17 Thread hokein.wu at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105841

Haojian Wu  changed:

   What|Removed |Added

 CC||hokein.wu at gmail dot com

--- Comment #15 from Haojian Wu  ---
Hi, I noticed that __is_deducible was hidden in the commit
https://github.com/gcc-mirror/gcc/commit/30556bf81f4385c2a9c449948865dbcf35664764.

Is there any reason behind the change?

(Context: I'm implementing the is_deducible part for alias CTAD in clang, and
I'm considering adding a similar builtin to clang).

[Bug preprocessor/114748] [14 Regression] libcpp aclocal.m4 and configure incorrectly regenerated

2024-04-17 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114748

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek  ---
(In reply to Christophe Lyon from comment #7)
> So yes indeed at r14-5423-gfbe4e64365ec7f, autoreconf will generate the same
> contents, but starting at r14-5424-gdb50aea6259545 we get this discrepancy.
> 
> We can probably commit the "fixed" version, but should we investigate why
> override.m4 is needed again?

The reason why r14-5424 now requires override.m4 can be seen from aclocal -I
../config --verbose.
iconv.m4 added use of AC_PREREQ macro and override.m4 (re)defines that.
Before that AC_PREREQ wasn't used and so nothing required override.m4.

Can you just post the #c0 patch with a ChangeLog entry
* aclocal.m4: Regenerate.
* configure: Regenerate.
?

[Bug c++/114634] [11/12/13 Regression] Crash Issue Encountered in GCC Compilation of Template Code with Aligned Attribute since r9-1745

2024-04-17 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114634

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[11/12/13/14 Regression]|[11/12/13 Regression] Crash
   |Crash Issue Encountered in  |Issue Encountered in GCC
   |GCC Compilation of Template |Compilation of Template
   |Code with Aligned Attribute |Code with Aligned Attribute
   |since r9-1745   |since r9-1745

--- Comment #5 from Jakub Jelinek  ---
commit r14-9962-g7ec54f5fdfec298812a749699874db4d6a7246bb   
Author: Jakub Jelinek 
Date:   Mon Apr 15 10:25:22 2024 +0200  

attribs: Don't crash on NULL TREE_TYPE in diag_attr_exclusions [PR114634]   

The enumerator still doesn't have TREE_TYPE set but diag_attr_exclusions
assumes that all decls must have types. 
I think it is better in something as unimportant as diag_attr_exclusions
to be more robust, if there is no type, it can just diagnose exclusions 
on the DECL_ATTRIBUTES, like for types it only diagnoses it on  
TYPE_ATTRIBUTES.

2024-04-15  Jakub Jelinek 

PR c++/114634   
* attribs.cc (diag_attr_exclusions): Set attrs[1] to NULL_TREE for  
decls with NULL TREE_TYPE.  

* g++.dg/ext/attrib68.C: New test.

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-17 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741

Jeffrey A. Law  changed:

   What|Removed |Added

 CC||law at gcc dot gnu.org
   Priority|P3  |P2

[Bug c++/114709] [12/13/14 Regression] Incorrect handling of inactive union member access via pointer to member in constant evaluated context

2024-04-17 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114709

Jeffrey A. Law  changed:

   What|Removed |Added

   Priority|P3  |P2
 CC||law at gcc dot gnu.org

[Bug analyzer/114677] [13/14 Regression] -Wanalyzer-fd-leak false positive writing to int * param

2024-04-17 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114677

Jeffrey A. Law  changed:

   What|Removed |Added

   Priority|P3  |P2
 CC||law at gcc dot gnu.org

[Bug fortran/113956] [13/14 Regression] ice in gfc_trans_pointer_assignment, at fortran/trans-expr.cc:10524

2024-04-17 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113956

Jeffrey A. Law  changed:

   What|Removed |Added

 CC||law at gcc dot gnu.org
   Priority|P3  |P4

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-17 Thread wilco at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741

--- Comment #7 from Wilco  ---
(In reply to Tamar Christina from comment #6)
> and the exact armv9-a cost model you quoted, also does the right codegen.
> https://godbolt.org/z/obafoT6cj
> 
> There is just an inexplicable penalty being applied to the r->r alternative.

Indeed it is not related to cost model - building SPEC shows a significant
regression (~1%) with -mcpu=neoverse-v1 due to AND immediate being quite common
in scalar code. The '^' incorrectly forces many cases to use the SVE
alternative.

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741

Tamar Christina  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |tnfchris at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #8 from Tamar Christina  ---
Indeed, Regtesting a patch that fixes it, so mine...

It looks like ^ doesn't work well when there's a choice of multiple register
files.

[Bug target/114416] calling convention incompatibility with vendor compiler for V9

2024-04-17 Thread jakub.kulik at oracle dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114416

--- Comment #10 from Jakub Kulik  ---
Sorry for the delayed response.

I asked internally again and was told by a colleague who was in the room when
the spec was created, that: "the intent was (and is) that the individual
elements/atoms/fundamental types that make up a small structure, no matter how
those elements/atoms/fundamental types are aggregated within the structure, are
passed in registers appropriate for the fundamental type in question. (That is,
pointers and integral types are passed in the %o registers, and floating point
types are passed in the floating-point registers.) So a structure that contains
an array of two floats is treated the same as a structure that contains two
floats."

That said, he agreed that the spec could perhaps be better written. He also
added:

Page 3P-11 says this, under "Function Argument Passing":

"It is convenient to describe parameter linkage in terms of an array.
Conceptually, parameters are assigned into an array of extended words,
left-to-right, with an occasional “hole” to satisfy alignment restrictions.
Typically, most parameter values will be “promoted” from their memory locations
into registers, and most calls are expected to execute this way with less
overhead."

There is then a diagram that shows the correspondence between parameter
registers and the parameter array.

On page 3P-12, under "Structure and Union arguments", it says this:

"Structure or union types up to eight bytes in size are assigned to one
parameter array word, and align to eight-byte boundaries.

"Structure or union types larger than eight bytes, and up to sixteen bytes in
size are assigned to two consecutive parameter array words, and align according
to the alignment requirements of the structure or at least to an eight-byte
boundary."

So perhaps instead of saying "The individual fields of a structure ... are
subject to promotion into registers based on their type using the same rules as
apply to scalar values" the spec should have said "The individual parameter
array words assigned to a structure ... are subject to promotion into registers
based on their type using the same rules as apply to scalar values."
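
To make the case under discussion concrete, here is a small sketch of the two
structure shapes that the quoted intent says should be passed identically. The
type and function names are made up for illustration; the code only shows the
C-level declarations, not how any particular compiler assigns registers.

```c
/* Two floats as separate members.  */
struct two_floats
{
  float a;
  float b;
};

/* The same two floats aggregated as an array member.  */
struct float_array2
{
  float v[2];
};

/* Both are small structures passed by value.  Under the intent quoted
   above, each float would be promoted to a floating-point register,
   no matter how the floats are aggregated.  */
static float
sum_fields (struct two_floats s)
{
  return s.a + s.b;
}

static float
sum_array (struct float_array2 s)
{
  return s.v[0] + s.v[1];
}
```

Calling sum_array from code built by one compiler into a library built by
another is the situation where a disagreement about register assignment would
show up.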

[Bug target/114416] calling convention incompatibility with vendor compiler for V9

2024-04-17 Thread jakub.kulik at oracle dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114416

--- Comment #11 from Jakub Kulik  ---
> This is a bit of circular reasoning but, if the rule had been crystal clear,
> GCC would have implemented it at some point during the last quarter of
> century.

I see. I guess it's also not a common enough use case to pass small structs
with float arrays between programs and libraries (potentially compiled with
different compilers).

That said, for example libffi implements the ABI as was intended (it's how I
originally found this issue).

[Bug preprocessor/114748] [14 Regression] libcpp aclocal.m4 and configure incorrectly regenerated

2024-04-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114748

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Christophe Lyon :

https://gcc.gnu.org/g:a9fefbf71726bb0ce89c79e547ab3319af3227a8

commit r14-10006-ga9fefbf71726bb0ce89c79e547ab3319af3227a8
Author: Christophe Lyon 
Date:   Wed Apr 17 13:56:19 2024 +

libcpp: Regenerate aclocal.m4 and configure [PR 114748]

As discussed in the PR, aclocal.m4 and configure were incorrectly
regenerated at some point.

2024-04-17  Christophe Lyon  

PR preprocessor/114748
libcpp/
* aclocal.m4: Regenerate.
* configure: Regenerate.

[Bug preprocessor/114748] [14 Regression] libcpp aclocal.m4 and configure incorrectly regenerated

2024-04-17 Thread clyon at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114748

Christophe Lyon  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #10 from Christophe Lyon  ---
Fixed on trunk

[Bug other/114738] [14 Regression] Default DOCUMENTATION_ROOT_URL vs. release branches

2024-04-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114738

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:57056146f4ffc5ea347c03e37e1e2c7cd99261d0

commit r14-10007-g57056146f4ffc5ea347c03e37e1e2c7cd99261d0
Author: Jakub Jelinek 
Date:   Wed Apr 17 16:17:22 2024 +0200

DOCUMENTATION_ROOT_URL vs. release branches [PR114738]

Starting with GCC 14 we have the nice URLification of the options printed
in diagnostics, say for in
test.c:4:23: warning: format ‘%d’ expects argument of type ‘int’,
but argument 2 has type ‘long int’ [-Wformat=]
the -Wformat= is underlined in some terminals and hovering on it shows
https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wformat
link.

This works nicely on the GCC trunk, where the online documentation is
regenerated every day from a cron job and more importantly, people rarely
use the trunk snapshots for too long, so it is unlikely that further changes
in the documentation will make too many links stale, because users will
simply regularly update to newer snapshots.

I think it doesn't work properly on release branches though.
Some users only use the released versions (i.e. MAJOR.MINOR.0) from tarballs
but can use them for a couple of years, others use snapshots from the
release branches, but again they could be in use for months or years and
the above mentioned online docs which represent just the GCC trunk might
diverge significantly.

Now, for the releases we always publish also online docs for the release,
which unlike the trunk online docs will not change further, under e.g.
   https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gcc/Warning-Options.html#index-Wformat
or
   https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Warning-Options.html#index-Wformat
etc.

So, I think at least for the MAJOR.MINOR.0 releases we want to use
URLs like above rather than the trunk ones and we can use the same process
of updating *.opt.urls as well for that.

For the snapshots from release branches, we don't have such docs.
One option (implemented in the patch below for the URL printing side) is
to point to the MAJOR.MINOR.0 docs even for MAJOR.MINOR.1 snapshots.
Most of the links will work fine; options newly added on the release
branches (a rare thing, but it still happens) will have no URLs until the
next point release.
The question is what to do about make regenerate-opt-urls for the release
branch snapshots.  Either just document that users shouldn't
make regenerate-opt-urls on release branches (and filter out *.opt.urls
changes from their commits), add make regenerate-opt-urls task be RM
responsibility before making first release candidate from a branch and
adjust the autoregen CI to know about that.  Or add a separate goal
which instead of relying on make html created files would download
copy of the html files from the last release from web (kind of web
mirroring the https://gcc.gnu.org/onlinedocs/gcc-14.1.0/ subtree locally)
and doing regenerate-opt-urls on top of that?  But how to catch the
point when first release candidate is made and we want to update to
what will be the URLs once the release is made (but will be stale URLs
for a week or so)?

Another option would be to add to cron daily regeneration of the online
docs for the release branches.  I don't think that is a good idea though,
because as I wrote earlier, not all users update to the latest snapshot
frequently, so there can be users that use gcc 13.1.1 20230525 for months
or years, and other users which use gcc 13.1.1 20230615 for years etc.

Another question is what is most sensible for users who want to override
the default root and use the --with-documentation-root-url= configure
option.  Do we expect them to grab the whole onlinedocs tree or for release
branches at least include gcc-14.1.0/ subdirectory under the root?
If so, the patch below deals with that.  Or should we just change the
default documentation root url, so if user doesn't specify
--with-documentation-root-url= and we are on a release branch, default that
to https://gcc.gnu.org/onlinedocs/gcc-14.1.0/ or
https://gcc.gnu.org/onlinedocs/gcc-14.2.0/ etc. and don't add any infix in
get_option_url/make_doc_url, but when people supply their own, let them
point to the root of the tree which contains the right docs?
Then such changes would go into gcc/configure.ac, some case based on
"$gcc_version", from that decide if it is a release branch or trunk.

2024-04-17  Jakub Jelinek  

PR other/114738
* opts.cc (get_option_url): On release branches append
gcc-MAJOR.MINOR.0/ after DOCUMENTATION_ROOT_URL.
* gcc-urlifier.cc (gcc_urlifier::make_doc_url): Likewise.
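
As an illustration of the URL shape the ChangeLog describes, here is a minimal
sketch that inserts a gcc-MAJOR.MINOR.0/ component after the documentation root
on release branches. The function build_doc_url and its on_release_branch flag
are hypothetical; this is not the code in opts.cc or gcc-urlifier.cc, and how
GCC itself decides that it is on a release branch is left out.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes ROOT, an optional "gcc-MAJOR.MINOR.0/"
   infix, and PAGE into BUF.  ROOT is assumed to end with '/'.
   Returns the snprintf result.  */
static int
build_doc_url (char *buf, size_t len, const char *root,
               int major, int minor, int on_release_branch,
               const char *page)
{
  if (on_release_branch)
    return snprintf (buf, len, "%sgcc-%d.%d.0/%s", root, major, minor, page);
  return snprintf (buf, len, "%s%s", root, page);
}
```

With root "https://gcc.gnu.org/onlinedocs/", major 14, minor 1 and the page
"gcc/Warning-Options.html#index-Wformat", the release-branch form matches the
gcc-14.1.0 URL quoted in the commit message, and the trunk form matches the
URL without the infix.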

[Bug rtl-optimization/96865] ICE in hash_rtx_cb, at cse.c:2548

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96865

Peter Bergner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

[Bug tree-optimization/113964] [11/12/13/14/15 Regression] repeat copy of struct

2024-04-17 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113964

--- Comment #5 from Martin Jambor  ---
(In reply to Richard Biener from comment #2)
> No, I think the issue is that ESRA leaves e.f0 alone:
> 
>   e$f3_7 = e.f3;
>   e$f0$f4_8 = e.f0.f4;
>   _1 = e$f0$f4_8;
>   _2 = (unsigned char) _1;
>   e$f3_9 = _2;
>   e.f0 = g_50;
>   e$f3_10 = MEM  [(struct S1 *)&g_50];
>   e$f0$f4_11 = MEM  [(struct S1 *)&g_50 + 24B];
>   MEM  [(union U8 *)&e] = e$f3_10;
>   MEM  [(union U8 *)&e + 24B] = e$f0$f4_11;
>   g_16 = e.f0;
> 
> it looks like it materializes the e.f0 = g_15 copy but fails to elide that
> (maybe assuming sth else will?)?  And then for some reason the final
> g_16 = e.f90 copy isn't replaced?!
> 
> So somehow SRAs heuristics go off.
> 
> Martin?

I am afraid this is just another example of what flow-insensitive SRA cannot
optimize well.  I'll keep it in the list of testcases to hopefully one day
improve on when we make it flow sensitive.

[Bug rtl-optimization/96865] ICE in hash_rtx_cb, at cse.c:2548

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96865

Peter Bergner  changed:

   What|Removed |Added

  Known to fail||12.0, 13.0, 14.0

--- Comment #2 from Peter Bergner  ---
Fails on trunk and basically all earlier versions.

[Bug middle-end/114509] [11/12/13/14 Regression] an infinite loop or ICE with openmp after errors in some cases

2024-04-17 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114509

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Reduced testcase:
void bar (void *);

void
foo (int x)
{
  struct S { int a[x]; int b; } s;
#pragma omp parallel for
  for (int i = 0; i < 10; i++)
    bar (&s);
}

We don't really support variable length structures/unions in OpenMP/OpenACC
lowering/expansion, and I don't see why we should spend time on that, variable
length structures/unions just shouldn't be used in C (they are already invalid
in C++ and not present in Fortran either), I think that extension exists just
because Ada needs to support that.  Though Ada on the other side doesn't
support OpenMP/OpenACC.
So, I think we should just sorry if something attempts to privatize/map
variable length structure/union; making it shared should be fine.

[Bug target/69031] ICE: in hash_rtx_cb, at cse.c:2533 with -fPIC -fselective-scheduling and __builtin_setjmp()

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69031

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #3 from Peter Bergner  ---
Maybe already fixed?  Marking as resolved for now and we can reopen if someone
can actually recreate the ICE.  I could not.

[Bug rtl-optimization/85099] [meta-bug] selective scheduling issues

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85099
Bug 85099 depends on bug 69031, which changed state.

Bug 69031 Summary: ICE: in hash_rtx_cb, at cse.c:2533 with -fPIC -fselective-scheduling and __builtin_setjmp()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69031

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME

[Bug rtl-optimization/96865] ICE in hash_rtx_cb, at cse.c:2548

2024-04-17 Thread segher at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96865

Segher Boessenkool  changed:

   What|Removed |Added

 CC||abel at ispras dot ru

--- Comment #3 from Segher Boessenkool  ---
Yup.  I thought there would be missing options needed for this to fail (-mcpu=
for example), but it fails with plain trunk.

Something with sel-sched.  It works fine without that.

Putting the maintainers of selective scheduling on Cc:.

[Bug libgcc/114755] New: wrong code with _BitInt() modulo at -O0 on aarch64

2024-04-17 Thread zsojka at seznam dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114755

Bug ID: 114755
   Summary: wrong code with _BitInt() modulo at -O0 on aarch64
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
CC: jakub at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: aarch64-unknown-linux-gnu

Created attachment 57973
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57973&action=edit
reduced testcase

Output:
$ aarch64-unknown-linux-gnu-gcc testcase.c -O0 -static
$ qemu-aarch64 -- ./a.out 
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

$ aarch64-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-aarch64/bin/aarch64-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-10007-20240417161722-g57056146f4f-checking-yes-rtl-df-extra-aarch64/bin/../libexec/gcc/aarch64-unknown-linux-gnu/14.0.1/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl
--with-sysroot=/usr/aarch64-unknown-linux-gnu --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=aarch64-unknown-linux-gnu
--with-ld=/usr/bin/aarch64-unknown-linux-gnu-ld
--with-as=/usr/bin/aarch64-unknown-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-10007-20240417161722-g57056146f4f-checking-yes-rtl-df-extra-aarch64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240417 (experimental) (GCC)

[Bug rtl-optimization/96865] ICE in hash_rtx_cb, at cse.c:2548

2024-04-17 Thread segher at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96865

--- Comment #4 from Segher Boessenkool  ---
Well, I wanted to add Alex as well, but BZ does not allow that?  Says he does
not exist?

Is there some other mail address than that mentioned in MAINTAINERS, the one he
usually uses, that works, maybe @gcc.gnu.org?

[Bug fortran/114739] [14 Regression] ice in gfc_find_derived_types, at fortran/symbol.cc:2458

2024-04-17 Thread pault at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114739

Paul Thomas  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org

--- Comment #6 from Paul Thomas  ---
Hi David and Harald,

Thanks for the heads up.

I am within minutes of posting a fix on the list.

Paul

[Bug target/114676] [12/13/14 Regression] DSE removes assignment that is used later

2024-04-17 Thread krebbel at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114676

--- Comment #13 from Andreas Krebbel  ---
We will go and fix PyTorch instead. Although it is not clearly documented, the
way PyTorch uses the builtin right now is probably not what was intended. It is
pretty clear that the element type pointer needs to alias vectors of the same
element type, but there is no saying about aliasing everything.

I'm just wondering how to improve the diagnostics in our backend to catch this.
The example below is similar to what PyTorch does today. Casting mem to
(float*) prevents our builtin code from complaining about the type mismatch and
by that opens the door for the much harder to debug TBAA problem.

#include 

void __attribute__((noinline)) foo (int *mem)
{
  vec_xst ((vector float){ 1.0f, 2.0f, 3.0f, 4.0f }, 0, (float*)mem);
}

int
main ()
{
  int m[4] = { 0 };
  foo (m);
  if (m[3] == 0)
    __builtin_abort ();
  return 0;
}

[Bug middle-end/23096] Wrong folding for FLOOR_MOD_EXPR

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23096

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |WORKSFORME
 Status|WAITING |RESOLVED

--- Comment #3 from Andrew Pinski  ---
I only filed this bug to keep track of what was described as the underlying
issue for PR 22348, but since that has proved to maybe not be so, let's close
this as works for me.

[Bug tree-optimization/22348] [4.0 Regression] Execution continues past end of for loop end condition with optimisation enabled

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=22348
Bug 22348 depends on bug 23096, which changed state.

Bug 23096 Summary: Wrong folding for FLOOR_MOD_EXPR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23096

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WORKSFORME

[Bug target/114676] [12/13/14 Regression] DSE removes assignment that is used later

2024-04-17 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114676

--- Comment #14 from Jakub Jelinek  ---
(In reply to Andreas Krebbel from comment #13)
> We will go and fix PyTorch instead. Although it is not clearly documented,
> the way PyTorch uses the builtin right now is probably not what was
> intended. It is pretty clear that the element type pointer needs to alias
> vectors of the same element type, but there is no saying about aliasing
> everything.
> 
> I'm just wondering how to improve the diagnostics in our backend to catch
> this. The example below is similar to what PyTorch does today. Casting mem
> to (float*) prevents our builtin code from complaining about the type
> mismatch and by that opens the door for the much harder to debug TBAA
> problem.

We need a TBAA analyzer among sanitizers (but writing it is really hard).

> #include 
> 
> void __attribute__((noinline)) foo (int *mem)
> {
>   vec_xst ((vector float){ 1.0f, 2.0f, 3.0f, 4.0f }, 0, (float*)mem);

So use
  *(vector float __attribute__((__may_alias__)) *)mem = (vector float){ 1.0f,
2.0f, 3.0f, 4.0f };
instead?  Sure, GCC extension, not an intrinsic in that case...
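
A target-independent sketch of the same idea, using GCC's generic vector
extension instead of the target's "vector float" keyword. The typedef name
v4sf_any and the helper functions are made up for illustration; may_alias
permits the store to alias the int array, and aligned (4) is added here only
because the int array is not guaranteed to be 16-byte aligned.

```c
#include <string.h>

/* Four floats; may_alias lets accesses through this type alias any
   object, aligned (4) lowers the required alignment to that of int.  */
typedef float v4sf_any
  __attribute__ ((vector_size (16), may_alias, aligned (4)));

static void __attribute__ ((noinline))
store_floats (int *mem)
{
  *(v4sf_any *) mem = (v4sf_any){ 1.0f, 2.0f, 3.0f, 4.0f };
}

/* Stores into an int array and reads back the last lane as a float.  */
static float
last_lane_after_store (void)
{
  int m[4] = { 0 };
  float last;
  store_floats (m);
  memcpy (&last, &m[3], sizeof last);
  return last;
}
```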

[Bug libgcc/114755] wrong code with _BitInt() modulo at -O0 on aarch64

2024-04-17 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114755

Jakub Jelinek  changed:

   What|Removed |Added

   Last reconfirmed||2024-04-17
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
Created attachment 57974
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57974&action=edit
gcc14-pr114755.patch

Lightly tested fix so far.
In these problematic cases, we ended up after negation with 0 as most
significant limb of v2 and used __builtin_clz* on that.  On x86_64 that
returned 64 and kind of worked right after triggering UB several times, but on
aarch64 it returned 63 and misbehaved with the result.
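
A minimal sketch of the hazard described here, assuming 64-bit limbs stored
least significant first. The helper names are hypothetical and this is not the
libgcc fix; it only shows that __builtin_clzll has an undefined result for 0,
so a zero limb has to be handled before calling it.

```c
#include <stdint.h>

/* Leading-zero count that is defined for every input: returns 64
   for a zero limb instead of calling __builtin_clzll (0).  */
static int
clz64_defined (uint64_t limb)
{
  return limb ? __builtin_clzll (limb) : 64;
}

/* Index of the most significant non-zero limb, or -1 if all limbs
   are zero.  Limbs are ordered least significant first.  */
static int
top_nonzero_limb (const uint64_t *limbs, int n)
{
  for (int i = n - 1; i >= 0; i--)
    if (limbs[i] != 0)
      return i;
  return -1;
}
```

Normalizing to the top non-zero limb first means the leading-zero count is
only ever taken on a non-zero value.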

[Bug target/114756] New: [14] RISC-V rv32imc miscompile with -fdata-sections

2024-04-17 Thread patrick at rivosinc dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114756

Bug ID: 114756
   Summary: [14] RISC-V rv32imc miscompile with -fdata-sections
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Testcase:
#pragma pack(push)
#pragma pack(1)
struct a
{
long long b;
char c;
long long d;
int e;
int f;
long long g;
char h;
};
#pragma pack(pop)
struct i
{
unsigned c;
unsigned d;
unsigned h;
};
int j;
int k;
short l;
int m;
static struct a n = {0xBBCB58D824AE4D28};
static long long *s = &n.b;
struct a o = {1};
struct a p[32] = {{4}};
struct a aa = {6};
struct i ab[3] = {{108}};
int ac[8] = {4};
short ad;
short *ae[] = {&ad, &ad, &ad, &ad, &ad, &ad};
struct a q = {9};
int *r[24] = {&m};
struct a t = {4073709551615};
int *u[3] = {&j};
short v[64] = {6};
struct a x = {4073709551611};
struct a y = {4073709551607};
char z[60] = {4};
struct i af = {5};
struct i ag = {11};
short *ai[] = {&l, &l, &l, &l, &l};
int *aj[32] = {&k};
struct a ak = {4073709551615};
struct a al = {1};
struct a am = {12};
struct i an = {5};
struct a ao[216] = {{4073709551615}};
struct i ap = {3};
long long aq[] = {6, 0, 6, 6, 0, 6, 6};
struct a ar = {4};
struct a as[10] = {{13}};
struct a at = {4073709551615};
struct a au = {3};
struct a av = {6};
struct a aw = {7};
struct i ax = {4};
struct i ay = {4};
struct i az = {4};
struct i ba = {1};
struct a bb[8] = {{4073709551615}};
int main() { __builtin_printf("%llX\n", *s); }

Commands:
> /scratch/tc-testing/tc-apr-9/build-rv32gcv/bin/riscv64-unknown-linux-gnu-gcc 
> -O1 -mabi=ilp32 -march=rv32imc -fdata-sections red.c -o user-config.out 
> -fsigned-char -fno-strict-aliasing -static
> /scratch/tc-testing/tc-apr-15/build-rv64gcv/bin/qemu-riscv32 user-config.out
24AE4D28

without -fdata-sections
> /scratch/tc-testing/tc-apr-9/build-rv32gcv/bin/riscv64-unknown-linux-gnu-gcc 
> -O1 -mabi=ilp32 -march=rv32imc red.c -o user-config.out -fsigned-char 
> -fno-strict-aliasing -static
> /scratch/tc-testing/tc-apr-15/build-rv64gcv/bin/qemu-riscv32 user-config.out
BBCB58D824AE4D28

Found via fuzzer.

[Bug libstdc++/108760] ranges::iota is not included in <numeric>

2024-04-17 Thread mlevine55 at bloomberg dot net via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108760

--- Comment #5 from Michael Levine  ---
I submitted a patch for this today, see either the gcc-patches or the libstdc++
mailing lists for the subject:  [PATCH] libstdc++: Fix std::ranges::iota is not
included in numeric [PR108760]

[Bug target/114756] [14] RISC-V rv32imc miscompile with -fdata-sections

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114756

--- Comment #1 from Andrew Pinski  ---
lui a5,%hi(n)
lw  a2,%lo(n)(a5)
lw  a3,%lo(n+4)(a5)

vs:
lui a5,%hi(.LANCHOR0)
addi a5,a5,%lo(.LANCHOR0)
lw  a2,0(a5)
lw  a3,4(a5)

I have seen that issue before ...

[Bug middle-end/100604] GCC generates invalid LO_SYM for unaligned global

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100604

Andrew Pinski  changed:

   What|Removed |Added

 CC||patrick at rivosinc dot com

--- Comment #7 from Andrew Pinski  ---
*** Bug 114756 has been marked as a duplicate of this bug. ***

[Bug target/114756] [14] RISC-V rv32imc miscompile with -fdata-sections

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114756

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Dup.

*** This bug has been marked as a duplicate of bug 100604 ***

[Bug target/114756] [14] RISC-V rv32imc miscompile with -fdata-sections

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114756

--- Comment #3 from Andrew Pinski  ---
Basically what is happening is the linker relaxation code is turning it into
something which is wrong. But GCC's invalid use of %lo(n+4)(a5) with a
(invalid) corresponding %hi(n) is confusing the relaxation code really.
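
To see why a %lo(n+4) paired with %hi(n) is only safe when alignment rules out a carry, here is a minimal sketch of the usual RISC-V hi20/lo12 split. The helper names are hypothetical and the addresses are made up; it illustrates the arithmetic only, not the exact path the relaxation code takes in this bug.

```cpp
#include <cassert>
#include <cstdint>

// Upper 20 bits as used by lui: rounded so that the signed low part fits.
std::uint32_t hi20(std::uint32_t addr) { return (addr + 0x800u) >> 12; }

// Low 12 bits, sign-extended, as used in the load/store immediate.
std::int32_t lo12(std::uint32_t addr) {
    return static_cast<std::int32_t>((addr & 0xfffu) ^ 0x800u) - 0x800;
}

// Address the hardware forms from a lui result plus a 12-bit immediate.
std::uint32_t combine(std::uint32_t hi, std::int32_t lo) {
    return (hi << 12) + static_cast<std::uint32_t>(lo);
}

// A matching %hi/%lo pair always reconstructs the address.
void check_roundtrip(std::uint32_t addr) {
    assert(combine(hi20(addr), lo12(addr)) == addr);
}

// True when %hi(sym) can be reused for %lo(sym + offset).
bool pair_is_consistent(std::uint32_t sym, std::uint32_t offset) {
    return combine(hi20(sym), lo12(sym + offset)) == sym + offset;
}
```

With a symbol at 0x10000 the pair for offset 4 is consistent; with a symbol whose low 12 bits are 0x7fc, adding 4 crosses the 0x800 boundary and the reused %hi no longer matches.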

[Bug middle-end/100604] GCC generates invalid LO_SYM for unaligned global

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100604

--- Comment #8 from Andrew Pinski  ---
Note that the linker relaxation code could be more forgiving here and not
produce "wrong-code", but GCC should still be fixed.

[Bug testsuite/114177] gcc.target/aarch64/sve/loop_add_6.c needs to be fixed for LLP64 targets

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114177

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
   Last reconfirmed||2024-04-17
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
mine.

[Bug other/114757] New: [ASAN] ASAN miscalculates size of region when building the JDK

2024-04-17 Thread szaldana at redhat dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114757

Bug ID: 114757
   Summary: [ASAN] ASAN miscalculates size of region when building
the JDK
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: szaldana at redhat dot com
  Target Milestone: ---

Hi all, 

I've come across an ASAN bug while building mainline JDK.

System: Linux x86
Gcc version: 13.2.1

Please find the stack trace below:

```
/home/szaldana/jdk/src/hotspot/share/gc/z/zMarkStack.cpp: In constructor
‘ZMarkStripeSet::ZMarkStripeSet(uintptr_t)’:
/home/szaldana/jdk/src/hotspot/share/gc/z/zMarkStack.cpp:43:17: error: writing
80 bytes into a region of size 8 [-Werror=stringop-overflow=]
   43 | _stripes[i] = ZMarkStripe(base);
  | ^~~
In file included from
/home/szaldana/jdk/src/hotspot/share/gc/z/zMarkStack.inline.hpp:27,
 from
/home/szaldana/jdk/src/hotspot/share/gc/z/zMarkStack.cpp:25:
/home/szaldana/jdk/src/hotspot/share/gc/z/zMarkStack.hpp:57:15: note:
destination object ‘ZStackList >::_base’ of size 8
   57 | uintptr_t _base;
  | ^
/home/szaldana/jdk/src/hotspot/share/gc/z/zMarkStack.cpp:43:17: error: writing
80 bytes into a region of size 8 [-Werror=stringop-overflow=]
   43 | _stripes[i] = ZMarkStripe(base);
  | ^~~
/home/szaldana/jdk/src/hotspot/share/gc/z/zMarkStack.hpp:57:15: note:
destination object ‘ZStackList >::_base’ of size 8
   57 | uintptr_t _base;
  | ^
```

The "region of size 8" seems like a bug in ASAN. It is presumably what ASAN
thinks is the size of ```_stripes[i]``` in
[zMarkStack.cpp](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/z/zMarkStack.cpp#L43),
but that's wrong.

[ZMarkStripe](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/z/zMarkStack.hpp#L82)
is made up of two
[ZStackList](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/z/zMarkStack.hpp#L55)
entries. Note how each one of those is 16 bytes. 

Additionally,  note how ```ZStackList``` is 64 byte aligned to make each one
have its own cache line. So the memory layout is something like this: 

```
0 ---
  ZStackList 
16 
  padding
64 
  ZStackList
80 ---
  padding 
128 ---
```

Thus, ```sizeof(ZMarkStripe)``` should be 128. 

On the other hand, the "writing 80 bytes" seems correct, as that is the size of
```ZMarkStripe``` excluding trailing padding. The assignment doesn't need to
copy that trailing padding. 
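
The layout arithmetic above can be checked with a small sketch. The types below are hypothetical stand-ins, not HotSpot's real `ZStackList`/`ZMarkStripe`, and the asserts assume a 64-bit target where `uintptr_t` is 8 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 16-byte list head (two pointer-sized fields).
struct List {
    std::uintptr_t base;
    std::uintptr_t head;
};

// Hypothetical stripe: each list starts on its own 64-byte cache line.
struct Stripe {
    alignas(64) List published;
    alignas(64) List overflowed;
};

// Bytes up to the end of the last member, i.e. without trailing padding.
constexpr std::size_t bytes_without_tail_padding() {
    return offsetof(Stripe, overflowed) + sizeof(List);
}

static_assert(sizeof(List) == 16, "each list is 16 bytes");
static_assert(offsetof(Stripe, overflowed) == 64, "second list on next line");
static_assert(sizeof(Stripe) == 128, "full object including padding");
static_assert(bytes_without_tail_padding() == 80, "payload without tail");
```

Under these assumptions a copy of 80 bytes into a 128-byte element fits, which is why a destination size of 8 looks wrong.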

If you'd like to reproduce the bug, it suffices to [build the
jdk](https://openjdk.org/groups/build/doc/building.html) passing the
```--enable-asan``` flag to the ```bash configure``` arguments. 

Find the bug reported in the JDK
[here](https://bugs.openjdk.org/browse/JDK-8330047). 

I'm also attaching the log file with the commands that trigger the stack trace
above. 


Looking forward to your comments! 

Sonia

[Bug other/114757] [ASAN] ASAN miscalculates size of region when building the JDK

2024-04-17 Thread szaldana at redhat dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114757

--- Comment #1 from Sonia Zaldana Calles  ---
Created attachment 57975
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57975&action=edit
debug log file

Contains a .txt file with the debug log.

[Bug tree-optimization/114757] stringop-overflow warning with -fsanitize=address while building JDK

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114757

Andrew Pinski  changed:

   What|Removed |Added

  Component|other   |tree-optimization
 Blocks||88443
   Last reconfirmed||2024-04-17
 Ever confirmed|0   |1
   Keywords||diagnostic
Summary|[ASAN] ASAN miscalculates   |stringop-overflow warning
   |size of region when |with -fsanitize=address
   |building the JDK|while building JDK
 Status|UNCONFIRMED |WAITING

--- Comment #2 from Andrew Pinski  ---
Note the documentation has the following warning about warnings and sanitizers:
```
Note that sanitizers tend to increase the rate of false positive warnings, most
notably those around -Wmaybe-uninitialized. We recommend against combining
-Werror and [the use of] sanitizers.
```


https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Instrumentation-Options.html#index-fsanitize_003daddress

Can you attach the preprocessed source as requested by
https://gcc.gnu.org/bugs/ ? And the exact options which are being used to
invoke gcc?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88443
[Bug 88443] [meta-bug] bogus/missing -Wstringop-overflow warnings

[Bug target/108678] Windows on ARM64 platform target aarch64-w64-mingw32

2024-04-17 Thread brechtsanders at users dot sourceforge.net via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108678

--- Comment #10 from Brecht Sanders  ---
What is the status of GCC support for aarch64-w64-mingw32 ?

I just tried GCC 14 snapshot 20240414 and it looks like it's still not
supported.

Build fails with:
*** Configuration aarch64-w64-mingw32 not supported

[Bug target/108678] Windows on ARM64 platform target aarch64-w64-mingw32

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108678

--- Comment #11 from Andrew Pinski  ---
(In reply to Brecht Sanders from comment #10)
> What is the status of GCC support for aarch64-w64-mingw32 ?
> 
> I just tried GCC 14 snapshot 20240414 and it looks like it's still not
> supported.
> 
> Build fails with:
> *** Configuration aarch64-w64-mingw32 not supported

Patches have started to be posted but won't be fully reviewed/committed until
after GCC 14 is released due to them coming in late (during stage 4) in the
release cycle (See https://gcc.gnu.org/develop.html for the full gcc
release/development cycle).

[Bug tree-optimization/114757] stringop-overflow warning with -fsanitize=address while building JDK

2024-04-17 Thread szaldana at redhat dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114757

--- Comment #3 from Sonia Zaldana Calles  ---
Created attachment 57976
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57976&action=edit
ZMarkStack.ii

Preprocessed file for ZMarkStack

[Bug tree-optimization/114757] stringop-overflow warning with -fsanitize=address while building JDK

2024-04-17 Thread szaldana at redhat dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114757

--- Comment #4 from Sonia Zaldana Calles  ---
Command to compile zMarkStack.cpp 

( /usr/bin/rm -f
/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/libjvm/objs/zMarkStack.o.log
&& /usr/bin/g++ -MMD -MF
/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/libjvm/objs/zMarkStack.d.tmp
-I/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/libjvm/objs/precompiled
-D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS
-D_GNU_SOURCE -D_REENTRANT -pipe -fno-rtti -fno-exceptions -fvisibility=hidden
-fno-strict-aliasing -fno-omit-frame-pointer -fstack-protector -std=c++14
-DLIBC=gnu -DLINUX -D_FILE_OFFSET_BITS=64 -Wall -Wextra -Wformat=2
-Wpointer-arith -Wsign-compare -Wunused-function -Wundef -Wunused-value
-Wreturn-type -Wtrampolines -Woverloaded-virtual -Wreorder -fPIC
-fmacro-prefix-map=/home/szaldana/jdk/= -DVM_LITTLE_ENDIAN -D_LP64=1
-fno-lifetime-dse -Wno-format-zero-length -Wtype-limits -Wuninitialized -m64
-fsanitize=address -Wno-stringop-truncation -fno-omit-frame-pointer -fno-common
-DADDRESS_SANITIZER -DNDEBUG -DPRODUCT -DTARGET_ARCH_x86
-DINCLUDE_SUFFIX_OS=_linux -DINCLUDE_SUFFIX_CPU=_x86
-DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64
-DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2
-I/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/gensrc/adfiles
-I/home/szaldana/jdk/src/hotspot/share
-I/home/szaldana/jdk/src/hotspot/os/linux
-I/home/szaldana/jdk/src/hotspot/os/posix
-I/home/szaldana/jdk/src/hotspot/cpu/x86
-I/home/szaldana/jdk/src/hotspot/os_cpu/linux_x86
-I/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/gensrc
-I/home/szaldana/jdk/src/hotspot/share/precompiled
-I/home/szaldana/jdk/src/hotspot/share/include
-I/home/szaldana/jdk/src/hotspot/os/posix/include
-I/home/szaldana/jdk/build/linux-x86_64-server-release/support/modules_include/java.base
-I/home/szaldana/jdk/build/linux-x86_64-server-release/support/modules_include/java.base/linux
-I/home/szaldana/jdk/src/java.base/share/native/libjimage -m64
-I/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/gensrc/adfiles
-I/home/szaldana/jdk/src/hotspot/share
-I/home/szaldana/jdk/src/hotspot/os/linux
-I/home/szaldana/jdk/src/hotspot/os/posix
-I/home/szaldana/jdk/src/hotspot/cpu/x86
-I/home/szaldana/jdk/src/hotspot/os_cpu/linux_x86
-I/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/gensrc
-I/home/szaldana/jdk/build/linux-x86_64-server-release/support/modules_include/java.base
-I/home/szaldana/jdk/src/java.base/unix/native/include
-I/home/szaldana/jdk/src/java.base/share/native/include -g -gdwarf-4
-fdebug-prefix-map=/home/szaldana/jdk/=
-fdebug-prefix-map=/usr/include/=/usr/include/
-fdebug-prefix-map=/usr/lib/gcc/x86_64-redhat-linux/13/include/=/usr/local/gcc_include/
-fdebug-prefix-map=/usr/include/c++/13/=/usr/local/gxx_include/
-fdebug-prefix-map=/home/szaldana/jdk/build/linux-x86_64-server-release/=
-Wno-unused-parameter -Wno-unused -Wno-array-bounds -Wno-comment
-Wno-delete-non-virtual-dtor -Wno-empty-body -Wno-implicit-fallthrough
-Wno-int-in-bool-context -Wno-maybe-uninitialized
-Wno-missing-field-initializers -Wno-shift-negative-value -Wno-unknown-pragmas
-Werror -O3 -c -o
/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/libjvm/objs/zMarkStack.o
/home/szaldana/jdk/src/hotspot/share/gc/z/zMarkStack.cpp
-frandom-seed="zMarkStack.cpp" > >(/usr/bin/tee -a
/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/libjvm/objs/zMarkStack.o.log)
2> >(/usr/bin/tee -a
/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/libjvm/objs/zMarkStack.o.log
>&2) || ( exitcode=$? && /usr/bin/cp
/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/libjvm/objs/zMarkStack.o.log
/home/szaldana/jdk/build/linux-x86_64-server-release/make-support/failure-logs/hotspot_variant-server_libjvm_objs_zMarkStack.o.log
&& /usr/bin/cp
/home/szaldana/jdk/build/linux-x86_64-server-release/hotspot/variant-server/libjvm/objs/zMarkStack.o.cmdline
/home/szaldana/jdk/build/linux-x86_64-server-release/make-support/failure-logs/hotspot_variant-server_libjvm_objs_zMarkStack.o.cmdline
&& exit $exitcode ) )

[Bug tree-optimization/114757] stringop-overflow warning with -fsanitize=address while building JDK

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114757

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |UNCONFIRMED
 Ever confirmed|1   |0

[Bug middle-end/112976] expand_gimple_stmt_1 vs gimple_assign_nontemporal_move_p vs SSA_NAME on lhs

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112976

--- Comment #2 from Andrew Pinski  ---
Note gimple_assign_nontemporal_move_p is just for non-temporal stores. There is
no code handling non-temporal loads (which do exist on some targets, aarch64
for example).
I will also add a comment to that effect above gimple_assign_nontemporal_move_p.

Plus I will also add to the verifier that gimple_assign_nontemporal_move_p is
only set for the case where LHS != SSA_NAME

[Bug c++/114758] New: The layout of a std::vector<bool> reports a warning

2024-04-17 Thread clalancette.github at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114758

Bug ID: 114758
   Summary: The layout of a std::vector<bool> reports a warning
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clalancette.github at gmail dot com
  Target Milestone: ---

Consider the following program:

#include <memory>
#include <vector>

struct BoundedPlainSequences
{
  BoundedPlainSequences()
  {
    this->bool_values_default.resize(3);
    this->bool_values_default = {{false, true, false}};

    this->uint16_values_default = {{0, 1, 65535}};
  }

  std::vector<bool> bool_values_default;

  std::vector<uint16_t> uint16_values_default;
};

int main()
{
  BoundedPlainSequences seq{};

  return 0;
}

If I compile this program with the following command-line on Ubuntu 24.04:

$ g++ -O3 bounded-warn.cpp

Then I get the warning:

In file included from /usr/include/c++/13/bits/stl_uninitialized.h:63,
 from /usr/include/c++/13/memory:69,
 from bounded-warn.cpp:1:
In static member function ‘static _Up* std::__copy_move<_IsMove, true,
std::random_access_iterator_tag>::__copy_m(_Tp*, _Tp*, _Up*) [with _Tp = long
unsigned int; _Up = long unsigned int; bool _IsMove = false]’,
inlined from ‘_OI std::__copy_move_a2(_II, _II, _OI) [with bool _IsMove =
false; _II = long unsigned int*; _OI = long unsigned int*]’ at
/usr/include/c++/13/bits/stl_algobase.h:506:30,
inlined from ‘_OI std::__copy_move_a1(_II, _II, _OI) [with bool _IsMove =
false; _II = long unsigned int*; _OI = long unsigned int*]’ at
/usr/include/c++/13/bits/stl_algobase.h:533:42,
inlined from ‘_OI std::__copy_move_a(_II, _II, _OI) [with bool _IsMove =
false; _II = long unsigned int*; _OI = long unsigned int*]’ at
/usr/include/c++/13/bits/stl_algobase.h:540:31,
inlined from ‘_OI std::copy(_II, _II, _OI) [with _II = long unsigned int*;
_OI = long unsigned int*]’ at /usr/include/c++/13/bits/stl_algobase.h:633:7,
inlined from ‘std::vector::iterator std::vector::_M_copy_aligned(const_iterator, const_iterator, iterator) [with _Alloc
= std::allocator]’ at /usr/include/c++/13/bits/stl_bvector.h:1342:28,
inlined from ‘void std::vector::_M_fill_insert(iterator,
size_type, bool) [with _Alloc = std::allocator]’ at
/usr/include/c++/13/bits/vector.tcc:879:34,
inlined from ‘std::vector::iterator std::vector::insert(const_iterator, size_type, const bool&) [with _Alloc =
std::allocator]’ at /usr/include/c++/13/bits/stl_bvector.h:1242:16,
inlined from ‘void std::vector::resize(size_type, bool) [with
_Alloc = std::allocator]’ at
/usr/include/c++/13/bits/stl_bvector.h:1288:10,
inlined from ‘BoundedPlainSequences::BoundedPlainSequences()’ at
bounded-warn.cpp:8:37:
/usr/include/c++/13/bits/stl_algobase.h:437:30: warning: ‘void*
__builtin_memmove(void*, const void*, long unsigned int)’ writing between 9 and
9223372036854775807 bytes into a region of size 8 overflows the destination
[-Wstringop-overflow=]
  437 | __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
  | ~^~~
In file included from
/usr/include/x86_64-linux-gnu/c++/13/bits/c++allocator.h:33,
 from /usr/include/c++/13/bits/allocator.h:46,
 from /usr/include/c++/13/memory:65:
In member function ‘_Tp* std::__new_allocator<_Tp>::allocate(size_type, const
void*) [with _Tp = long unsigned int]’,
inlined from ‘static _Tp* std::allocator_traits
>::allocate(allocator_type&, size_type) [with _Tp = long unsigned int]’ at
/usr/include/c++/13/bits/alloc_traits.h:482:28,
inlined from ‘std::_Bvector_base<_Alloc>::_Bit_pointer
std::_Bvector_base<_Alloc>::_M_allocate(std::size_t) [with _Alloc =
std::allocator]’ at /usr/include/c++/13/bits/stl_bvector.h:679:48,
inlined from ‘void std::vector::_M_fill_insert(iterator,
size_type, bool) [with _Alloc = std::allocator]’ at
/usr/include/c++/13/bits/vector.tcc:877:40,
inlined from ‘std::vector::iterator std::vector::insert(const_iterator, size_type, const bool&) [with _Alloc =
std::allocator]’ at /usr/include/c++/13/bits/stl_bvector.h:1242:16,
inlined from ‘void std::vector::resize(size_type, bool) [with
_Alloc = std::allocator]’ at
/usr/include/c++/13/bits/stl_bvector.h:1288:10,
inlined from ‘BoundedPlainSequences::BoundedPlainSequences()’ at
bounded-warn.cpp:8:37:
/usr/include/c++/13/bits/new_allocator.h:151:55: note: destination object of
size 8 allocated by ‘operator new’
  151 | return static_cast<_Tp*>(_GLIBCXX_OPERATOR_NEW(__n *
sizeof(_Tp)));
  |   ^


If I compile with -O2 or -O0, I don't get the warning.  If I remove the
"this->bool_values_default.resize(3);" line, I don't get the warning.  If I
remove the "this->uint16_values_default = {{0, 1, 65535}};" line, I don't get
the warning.

[Bug target/114759] New: Power: multiple issues with -mrop-protect

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114759

Bug ID: 114759
   Summary: Power: multiple issues with -mrop-protect
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

There are multiple issues with the -mrop-protect option which are all
inter-related.

1. We always define the __ROP_PROTECT__ predefined macro when using
-mrop-protect, even when we've silently disabled ROP protection because of a
too old -mcpu=CPU value.  We should only emit __ROP_PROTECT__ when it's legal
to emit the ROP insns.

2. We always disable shrink-wrapping when -mrop-protect is used, even when
we've silently disabled ROP protection because of a too old -mcpu=CPU value. 
We should not disable shrink-wrapping if we've disabled ROP protection.

3. We silently disable ROP protection for everything other than -mcpu=power10. 
The binutils assembler accepts the ROP insns back to Power8, so we should emit
them for Power8 and later.

4. We give an error when -mrop-protect is used with any -mabi=ABI value not
equal to ELFv2, whereas a too old -mcpu=CPU value only causes us to silently
disable ROP protection.  I think both scenarios should behave similarly, so
either we silently disable ROP protection for both or we give an error for
both.

This is not a regression.  I consider 1. to be a correctness/wrong code bug.
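
To illustrate why item 1 matters: user code can only observe the option through the predefined macro, so a check like the sketch below reports "requested" even when the insns were silently dropped. The function name is hypothetical; only `__ROP_PROTECT__` comes from the report.

```cpp
// Reports whether the compiler predefined __ROP_PROTECT__ for this
// translation unit. With the behaviour described in item 1, this can be
// true even though no ROP insns are emitted for the selected -mcpu.
constexpr bool rop_protect_requested() {
#ifdef __ROP_PROTECT__
    return true;
#else
    return false;
#endif
}
```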

[Bug target/114759] Power: multiple issues with -mrop-protect

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114759

Peter Bergner  changed:

   What|Removed |Added

 CC||dje at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||segher at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |bergner at gcc dot gnu.org
 Target||powerpc64le-linux
   Last reconfirmed||2024-04-17
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #1 from Peter Bergner  ---
Confirmed.

[Bug middle-end/112976] expand_gimple_stmt_1 vs gimple_assign_nontemporal_move_p vs SSA_NAME on lhs

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112976

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=101852

--- Comment #3 from Andrew Pinski  ---
Note storent optab name is not documented.

[Bug tree-optimization/114758] The layout of a std::vector<bool> reports a warning

2024-04-17 Thread clalancette.github at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114758

--- Comment #1 from Chris Lalancette  ---
I should also mention that this doesn't happen with gcc 12.3 or earlier.  It
seems to only have started happening with gcc 13.1 and 13.2.

[Bug tree-optimization/114749] [13 Regression] RISC-V rv64gcv ICE: in vectorizable_load, at tree-vect-stmts.cc

2024-04-17 Thread juzhe.zhong at rivai dot ai via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114749

--- Comment #4 from JuzheZhong  ---
Hi, Patrick.

It seems that Richard didn't append the testcase in the patch.
Could you send a patch to add the testcase for RISC-V port ?

Thanks.

[Bug target/114759] Power: multiple issues with -mrop-protect

2024-04-17 Thread segher at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114759

--- Comment #2 from Segher Boessenkool  ---
> 1. We always define the __ROP_PROTECT__ predefined macro when using 
> -mrop-protect, even when we've silently disabled ROP protection because of a 
> too old -mcpu=CPU value.  We should only emit __ROP_PROTECT__ when it's legal 
> to emit the ROP insns.

No.  Whenever the -mrop-protect option is in effect, we should do that
predefine.

If you want to refuse the option without a -mcpu= that can generate useful code
for it, that's fine, but that is not what we do.  Instead, we generate code that
will do the ROP-protection boogaloo on CPUs that implement support for that, and
does nothing otherwise.

> 2.  We always disable shrink-wrapping when -mrop-protect is used, [...]

Yes, this is problematic, and seems to be completely unnecessary.  When using
SWS at least -- but then we need to define a component for doing the
ROP-protection thing, of course.  After all, it has to be done before anything
else in the function.
By exactly the same argument we should *also* do ROP-protection in all leaf
functions, btw!

> 3.  We silently disable ROP protection for everything other than 
> -mcpu=power10.  The binutils assembler accepts the ROP insns back to Power8, 
> so we should emit them for Power8 and later.

The ISA claims it will work for anything after ISA 2.04, even.

> 4.  We give an error when -mrop-protect is used with any -mabi=ABI value not 
> equal to ELFv2, [...]

Yes, we should make it work everywhere.  Even on -m32.  But it requires
adjusting the ABI as well!

2) should be fixed, and 4) should be fixed by actually implementing it
everywhere!

[Bug tree-optimization/114760] New: traling zero count detection failure

2024-04-17 Thread jiangning.liu at amperecomputing dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114760

Bug ID: 114760
   Summary: traling zero count detection failure
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jiangning.liu at amperecomputing dot com
  Target Milestone: ---

For this small case, GCC fails to detect the trailing zero count calculation,
so the x86 instruction tzcnt is not generated, whereas clang does generate it.

unsigned  ntz32_6a(unsigned x) {
  int n;

  n = 32;
  while (x != 0) {
    n = n - 1;
    x = x + x;
  }
  return n;
}

If we slightly change "x = x + x" to "x = x << 1", the optimization will just
work.

unsigned  ntz32_6a(unsigned x) {
  int n;

  n = 32;
  while (x != 0) {
    n = n - 1;
    x = x << 1;
  }
  return n;
}

It seems that number_of_iterations_cltz/number_of_iterations_cltz_complement in
tree-ssa-loop-niter.cc, or some related code, needs to be enhanced.
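
As a sanity check that both loop forms compute the same trailing zero count, here is a small sketch comparing them against the GCC/Clang builtin. It assumes a 32-bit `unsigned`; the function names other than the builtin are made up for illustration.

```cpp
#include <cassert>
#include <climits>

static_assert(sizeof(unsigned) * CHAR_BIT == 32, "sketch assumes 32-bit unsigned");

// Variant from the report that is not recognised: doubling via x + x.
unsigned ntz_add(unsigned x) {
    unsigned n = 32;
    while (x != 0) { n = n - 1; x = x + x; }
    return n;
}

// Variant that is recognised: doubling via x << 1.
unsigned ntz_shift(unsigned x) {
    unsigned n = 32;
    while (x != 0) { n = n - 1; x = x << 1; }
    return n;
}

// Reference: __builtin_ctz is undefined for 0, so handle that case here.
unsigned ntz_builtin(unsigned x) {
    return x == 0 ? 32u : static_cast<unsigned>(__builtin_ctz(x));
}

// All three must agree for any input.
void check(unsigned x) {
    assert(ntz_add(x) == ntz_shift(x));
    assert(ntz_add(x) == ntz_builtin(x));
}
```

Since unsigned `x + x` and `x << 1` are the same operation modulo 2^32, the two loops have the same iteration count; only the pattern seen by the niter analysis differs.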

[Bug tree-optimization/114758] The layout of a std::vector<bool> reports a warning

2024-04-17 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114758

--- Comment #2 from Jonathan Wakely  ---
It's just yet another occurrence of false positive -Wstringop-overflow
warnings, it has nothing to do with vector<bool> being special.

[Bug c/114746] With FLT_EVAL_METHOD = 2, -fexcess-precision=fast reduces the precision of floating-point constants

2024-04-17 Thread vincent-gcc at vinc17 dot net via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114746

--- Comment #1 from Vincent Lefèvre  ---
There is the same issue with constant floating-point expressions.

Consider the following program given at
  https://github.com/llvm/llvm-project/issues/89128

#include <float.h>
#include <stdio.h>

static double const_init = 1.0 + (DBL_EPSILON/2) + (DBL_EPSILON/2);
int main() {
    double nonconst_init = 1.0;
    nonconst_init = nonconst_init + (DBL_EPSILON/2) + (DBL_EPSILON/2);
    printf("FLT_EVAL_METHOD = %d\n", FLT_EVAL_METHOD);
    printf("const: %g\n", const_init - 1.0);
    printf("nonconst: %g\n", (double)nonconst_init - 1.0);
}

With -m32 -mno-sse, one gets

FLT_EVAL_METHOD = 2
const: 0
nonconst: 2.22045e-16

instead of

FLT_EVAL_METHOD = 2
const: 2.22045e-16
nonconst: 2.22045e-16
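
The two results differ only in where rounding happens. The sketch below reproduces both outcomes explicitly, independent of FLT_EVAL_METHOD: one helper rounds to double after every addition, the other keeps the intermediate sum in long double. The helper names are made up, and the second result assumes long double has more mantissa bits than double (as with x87 extended precision).

```cpp
#include <cassert>
#include <cfloat>

// Round to double after each step: 1 + eps/2 is a tie and rounds back
// to 1.0 (round-to-nearest-even), so the final sum is exactly 1.0.
double sum_rounded_each_step() {
    volatile double t = 1.0;
    t = t + DBL_EPSILON / 2;
    t = t + DBL_EPSILON / 2;
    return t;
}

// Keep the intermediate in long double: with a wider mantissa both
// additions are exact, and the final conversion yields 1.0 + DBL_EPSILON.
double sum_in_long_double() {
    volatile long double t = 1.0L;
    t = t + DBL_EPSILON / 2;
    t = t + DBL_EPSILON / 2;
    return static_cast<double>(t);
}
```

The "const: 0" line corresponds to the first helper and "nonconst: 2.22045e-16" to the second.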

[Bug c/114746] With FLT_EVAL_METHOD = 2, -fexcess-precision=fast reduces the precision of floating-point constants and floating-point constant expressions

2024-04-17 Thread vincent-gcc at vinc17 dot net via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114746

Vincent Lefèvre  changed:

   What|Removed |Added

Summary|With FLT_EVAL_METHOD = 2,   |With FLT_EVAL_METHOD = 2,
   |-fexcess-precision=fast |-fexcess-precision=fast
   |reduces the precision of|reduces the precision of
   |floating-point constants|floating-point constants
   ||and floating-point constant
   ||expressions

--- Comment #2 from Vincent Lefèvre  ---
I've updated the bug title from "With FLT_EVAL_METHOD = 2,
-fexcess-precision=fast reduces the precision of floating-point constants" to
"With FLT_EVAL_METHOD = 2, -fexcess-precision=fast reduces the precision of
floating-point constants and floating-point constant expressions" (I don't
think that this deserves a separate bug).

[Bug tree-optimization/114760] traling zero count detection failure

2024-04-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114760

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug rtl-optimization/114729] RISC-V SPEC2017 507.cactu excessive spillls with -fschedule-insns

2024-04-17 Thread vineetg at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114729

--- Comment #10 from Vineet Gupta  ---
Debug update: -fsched-verbose=99 dumps (they are really verbose).

For the insn/regs under consideration, the canonical pre-scheduled sequence
with ideal live-range (but non-ideal load-to-use delay) is the following:

  ;;   ==
  ;;   -- basic block 3 from 17 to 98 -- before reload
  ;;   ==

  ;;|   35 |   10 | r170#0=[r242+low(`u')] alu
  ;;|   44 |6 | [r230+low(`_Z1sv')]=r170#0 alu

  ;;|   46 |7 | r180=zxt(r170,0x10,0x10)   alu
  ;;|   54 |6 | [r230+low(const(`_Z1sv'+0x2))]=r180#0 alu

  ;;|   55 |7 | r188=r170 0>>0x20  alu
  ;;|   64 |6 | [r230+low(const(`_Z1sv'+0x4))]=r188#0 alu

  ;;|   65 |7 | r197=r170 0>>0x30  alu
  ;;|   73 |6 | [r230+low(const(`_Z1sv'+0x6))]=r197#0 alu

r170 (insn 35) is the central character; its live range has to be the longest
because of dependencies.

 - {46, 55, 65} USE r170 and are the sources which create new pseudos.
 - {54, 64, 73} are where these new pseudos sink.

How these 2 sets are interleaved defines the register pressure.
 - If ordered as above (src1:sink1:src2:sink2:src3:sink3), 1 reg suffices.
 - If ordered src1:src2:src3 first, 3 regs are needed.
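
The effect of the interleaving can be sketched with a toy model that counts how many of the new pseudos are live at once for a given issue order. This is a simplification with a made-up `Event` type; it ignores r170 itself and everything else the real scheduler tracks.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical event: a source defines a pseudo, a sink is its last use.
struct Event {
    bool is_source;
    int pseudo;
};

// Peak number of simultaneously live pseudos for one issue order.
int peak_live(const std::vector<Event>& order) {
    int live = 0;
    int peak = 0;
    for (const Event& e : order) {
        if (e.is_source) {
            ++live;
            peak = std::max(peak, live);
        } else {
            --live;
            assert(live >= 0);  // a sink must follow its source
        }
    }
    return peak;
}
```

Feeding it the src/sink/src/sink order gives a peak of 1, while issuing all three sources before any sink gives 3, matching the two cases above.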

Per sched1 dumps, the "source" set gets inducted into the ready queue together:

  ;;dependencies resolved: insn 65
  ;;tick updated: insn 65 into ready
  ;;dependencies resolved: insn 55
  ;;tick updated: insn 55 into ready
  ;;dependencies resolved: insn 46
  ;;tick updated: insn 46 into ready
  ;;dependencies resolved: insn 44
  ;;tick updated: insn 44 into ready
  ;;+--
  ;;| Pressure costs for ready queue
  ;;|  pressure points GR_REGS:[26->28 at 17:54] FP_REGS:[1->1 at 0:94]
  ;;+--
  ;;|  15   44 |6  +3 | GR_REGS:[0 base cost 0] FP_REGS:[0 base cost 0]
  ;;|  16   46 |7  +3 | GR_REGS:[1 base cost 0] FP_REGS:[0 base cost 0]
   
  ;;|  18   55 |7  +3 | GR_REGS:[1 base cost 1] FP_REGS:[0 base cost 0]
   
  ;;|  20   65 |7  +3 | GR_REGS:[1 base cost 1] FP_REGS:[0 base cost 0]
   
  ;;|  11   76 |   10  +2 | GR_REGS:[1 base cost 0] FP_REGS:[-1 base cost 0]
  ;;|   0   94 |2  +1 | GR_REGS:[0 base cost 0] FP_REGS:[0 base cost 0]
  ;;|  28   92 |5  +1 | GR_REGS:[0 base cost 0] FP_REGS:[1 base cost 0]
  ;;|  26   88 |5  +1 | GR_REGS:[0 base cost 0] FP_REGS:[1 base cost 0]
  ;;|  22   79 |9  +1 | GR_REGS:[0 base cost 0] FP_REGS:[1 base cost 0]
  ;;+--
  ;;  RFS_PRESSURE_DELAY: 7: 44 46 76 94
  ;;RFS_PRIORITY: 6: 92 88 79
  ;;  RFS_PRESSURE_INDEX: 2: 55
  ;;Ready list (t =  10):65:44(cost=1:prio=7:delay=3:idx=20) 
55:42(cost=1:prio=7:delay=3:idx=18)  44:39(cost=0:prio=6:delay=3:idx=15) 
46:40(cost=0:prio=7:delay=3:idx=16)  76:47(cost=0:prio=10:delay=2:idx=11) 
94:58(cost=0:prio=2:delay=1:idx=0)  92:56(cost=0:prio=5:delay=1:idx=28) 
88:54(cost=0:prio=5:delay=1:idx=26)  79:48(cost=0:prio=9:delay=1:idx=22)

As the algorithm converges, they move around a bit, but rarely are the src/sink
considered in the same iteration, and if at all, only 1.

  ;;+--
  ;;| Pressure costs for ready queue
  ;;|  pressure points GR_REGS:[29->29 at 0:94] FP_REGS:[1->1 at 0:94]
  ;;+--

...

  ;;|  19   64 |6  +0 | GR_REGS:[-1 base cost -1] FP_REGS:[0 base cost 0]
  ;;|  17   54 |6  +0 | GR_REGS:[-1 base cost -1] FP_REGS:[0 base cost 0]
  ;;|  20   65 |7  +0 | GR_REGS:[0 base cost 0] FP_REGS:[0 base cos


All of this leads to the pessimistic schedule emitted in the end.

I'm still trying to wrap my head around the humungous dump info.
