[Bug target/112426] sched1 pessimizes codegen on aarch64 by increasing register pressure

2023-11-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112426

--- Comment #2 from Richard Biener  ---
sched1 does not care for register pressure unless you enable -fsched-pressure,
so this isn't a bug unless you have this enabled.

[Bug c++/112427] [14 regression] ICE when building Minetest (internal compiler error: tree check: expected tree that contains ‘decl common’ structure, have ‘identifier_node’ in get_inner_reference, at

2023-11-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112427

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug tree-optimization/112430] [14 Regression] ICE: verify_ssa failed, missing definition since r14-1837-g43a3252c42af12

2023-11-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112430

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1
  Component|middle-end  |tree-optimization

[Bug target/112438] RISC-V: Wrong auto-vectorization on induction variable of RVV

2023-11-08 Thread kito at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438

--- Comment #8 from Kito Cheng  ---
> Oh. I understand it now. I think it's a bug.
> 
> And.. I just take a look at my internal LLVM...
> Also has same issue
> 
> I think we need to adapt the Gimple IR here:
> 
>   _35 = .SELECT_VL (ivtmp_33, POLY_INT_CST [4, 4]);
>   _21 = vect_vec_iv_.6_22 + { POLY_INT_CST [4, 4], ... };
> 
> change it into:
> 
>   _35 = .SELECT_VL (ivtmp_33, POLY_INT_CST [4, 4]);
>   _21 = vect_vec_iv_.6_22 + _35;

Yeah, so... I guess the original report is still valid; it just brings up another
potential bug :P

Personally I really hate that magic constraint for vl but it's just too
late.

[Bug target/112435] [14 regression] GCC generates assembly which gas rejects with AVX when building ncnn (Error: unsupported instruction `vblendps') since r14-96-gc2dac2e5fbbcdd

2023-11-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112435

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1
 Target||x86_64-*-*

[Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN

2023-11-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

Richard Biener  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #3 from Richard Biener  ---
Ah, yes, for lrint we have the builtins - I just looked for lceil here.  So
yeah, where there are DEF_EXT_LIB_FLOATN_NX_BUILTINS we should have
DEF_INTERNAL_FLT_FLOATN_FN.

[Bug target/112374] [14 Regression] `--with-arch=skylake-avx512 --with-cpu=skylake-avx512` causes a comparison failure

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #6 from Robin Dapp  ---
How does the test suite look without bootstrapping?  Are there still new FAILs?

[Bug target/112438] RISC-V: Wrong auto-vectorization on induction variable of RVV

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438

--- Comment #9 from JuzheZhong  ---
I have a draft patch to fix it:

foo:
        ble     a0,zero,.L5
        vsetvli a5,zero,e32,m1,ta,ma
        vid.v   v2
.L3:
        vsetvli a5,a0,e32,m1,ta,ma
        slli    a4,a5,2
        vle32.v v3,0(a1)
        sub     a0,a0,a5
        vadd.vv v1,v2,v3
        vse32.v v1,0(a2)
        add     a1,a1,a4
        add     a2,a2,a4
        vsetvli a4,zero,e32,m1,ta,ma
        vmv.v.x v1,a5
        vadd.vv v2,v2,v1
        bne     a0,zero,.L3
.L5:
        ret

Seems correct ?

[Bug target/112438] RISC-V: Wrong auto-vectorization on induction variable of RVV

2023-11-08 Thread kito at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438

--- Comment #10 from Kito Cheng  ---
(In reply to JuzheZhong from comment #9)
> I have a draft patch to fix it:
> 
> foo:
>   ble a0,zero,.L5
>   vsetvli a5,zero,e32,m1,ta,ma
>   vid.v   v2
> .L3:
>   vsetvli a5,a0,e32,m1,ta,ma
>   slli a4,a5,2
>   vle32.v v3,0(a1)
>   sub a0,a0,a5
>   vadd.vv v1,v2,v3
>   vse32.v v1,0(a2)
>   add a1,a1,a4
>   add a2,a2,a4
>   vsetvli a4,zero,e32,m1,ta,ma
>   vmv.v.x v1,a5

This splat must be under "vsetvli a5,a0,e32,m1,ta,ma" rather than
"vsetvli a4,zero,e32,m1,ta,ma".

>   vadd.vv v2,v2,v1
>   bne a0,zero,.L3
> .L5:
>   ret
> 
> Seems correct ?

[Bug target/112438] RISC-V: Wrong auto-vectorization on induction variable of RVV

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438

--- Comment #11 from JuzheZhong  ---
Why can't the splat be VLMAX?

I think it must be VLMAX; otherwise, it could be wrong.

[Bug target/112438] RISC-V: Wrong auto-vectorization on induction variable of RVV

2023-11-08 Thread kito at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438

--- Comment #12 from Kito Cheng  ---
Oh, yeah, you are right: it already takes a5 for the splat, so it's correct, and
as you said it must be VLMAX, unless AVL propagation is applied to both the splat
and the following vadd.vv.
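
To make the vl-versus-VLMAX point concrete, here is a minimal scalar C model of
the loop from comment #9. It is only a sketch under stated assumptions, not GCC
or simulator code: VLMAX, model_vsetvli and add_index are hypothetical names,
the loop is assumed to compute out[i] = in[i] + i (as the vid.v/vadd.vv sequence
suggests), and model_vsetvli returns ceil(AVL/2) when VLMAX < AVL < 2*VLMAX,
which is one choice the RVV specification permits.

```c
#include <assert.h>
#include <stddef.h>

#define VLMAX 4 /* hypothetical lane count, chosen only for illustration */

/* Simplified model of vsetvli: one behaviour the RVV spec permits is to
   return ceil(avl / 2) when VLMAX < avl < 2 * VLMAX.  */
static size_t model_vsetvli(size_t avl)
{
    if (avl <= VLMAX)
        return avl;
    if (avl < 2 * VLMAX)
        return (avl + 1) / 2;
    return VLMAX;
}

/* Computes out[i] = in[i] + i the way the vectorized loop does.
   step_is_vl != 0: advance the induction vector by the vl actually used.
   step_is_vl == 0: advance it by the constant VLMAX (the reported bug).  */
void add_index(const int *in, int *out, size_t n, int step_is_vl)
{
    int iv[VLMAX];
    for (size_t j = 0; j < VLMAX; j++)      /* vid.v v2 */
        iv[j] = (int)j;

    while (n > 0) {
        size_t vl = model_vsetvli(n);       /* vsetvli a5,a0,... */
        for (size_t j = 0; j < vl; j++)     /* vle32.v, vadd.vv, vse32.v */
            out[j] = iv[j] + in[j];
        in += vl;
        out += vl;
        n -= vl;

        /* Splat the step and add it to every lane, i.e. at VLMAX.  */
        int step = step_is_vl ? (int)vl : VLMAX;
        for (size_t j = 0; j < VLMAX; j++)
            iv[j] += step;
    }
}
```

With n = 6, VLMAX = 4 and an all-zero input, this model runs two iterations
with vl = 3. Stepping by vl yields 0..5, while stepping by the constant VLMAX
yields 4, 5, 6 for the last three elements.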

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread manolis.tsamis at vrull dot eu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #15 from Manolis Tsamis  ---
(In reply to Sam James from comment #13)
> Created attachment 56527 [details]
> compile.c.323r.fold_mem_offsets.bad.xz
> 
> Output from
> ```
> hppa2.0-unknown-linux-gnu-gcc -c  -DNDEBUG -g -fwrapv -O3 -Wall -O2  
> -std=c11 -Werror=implicit-function-declaration -fvisibility=hidden 
> -I/home/sam/git/cpython/Include/internal -IObjects -IInclude -IPython -I.
> -I/home/sam/git/cpython/Include -DPy_BUILD_CORE -o Python/compile.o
> /home/sam/git/cpython/Python/compile.c -fdump-rtl-fold_mem_offsets-all
> ```
> 
> If I instrument certain functions in compile.c with no optimisation
> attribute or build the file with -fno-fold-mem-offsets, Python works, so I'm
> reasonably sure this is the relevant object.

Thanks for the dump file! There are 66 folded/eliminated instructions in this
object file; I did look at each case and there doesn't seem to be anything
strange. In fact most of the transformations are straightforward:

 - All except a couple of cases don't involve any arithmetic, so it's just
moving a constant around.
 - The majority of the transformations are 'trivial' and consist of a single
add and then a memory operation: a sequence like X = Y + Const, R = MEM[X + 0]
is folded to X = Y, R = MEM[X + Const]. I wonder why so many of these exist and
are not optimized elsewhere.
 - There are some cases with negative offsets, but the calculations look
correct.
 - There are a few more complicated cases, but I've worked through these on
paper and they also look correct.

Of course I could be missing some more complicated effect, but what I want to
say is that everything looks sensible in this particular file.
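
For readers unfamiliar with the 'trivial' case above, the following C sketch
shows why the two forms load the same value. It is an illustration only, not
the pass's code: mem_load, load_unfolded and load_folded are hypothetical
names, and the sketch ignores any other uses of X that a real fold would also
have to account for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper standing in for R = MEM[base + offset]:
   loads a 32-bit word from base + offset.  */
static int32_t mem_load(const unsigned char *base, ptrdiff_t offset)
{
    int32_t r;
    memcpy(&r, base + offset, sizeof r);
    return r;
}

/* Before folding: X = Y + Const; R = MEM[X + 0].  */
int32_t load_unfolded(const unsigned char *y, ptrdiff_t c)
{
    const unsigned char *x = y + c;
    return mem_load(x, 0);
}

/* After folding: X = Y; R = MEM[X + Const].  */
int32_t load_folded(const unsigned char *y, ptrdiff_t c)
{
    const unsigned char *x = y;
    return mem_load(x, c);
}
```

Both functions read the same address, for positive and negative constants
alike; the fold only moves the constant from the add into the memory operand.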

> Thanks! You are very welcome to have access to some HPPA machines for
> this kind of work. Please email me an SSH public key + desired username
> if that sounds helpful.

Yes, since I couldn't find anything interesting in the dump, that would
definitely be helpful. Thanks!

Manolis

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with LTO on internal compiler error: in expand_insn, at optabs.cc:8305

2023-11-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406

--- Comment #6 from Tamar Christina  ---
First reduction:

typedef struct {
  int red
} MagickPixelPacket;
GetImageChannelMoments_image, GetImageChannelMoments_image_0,
GetImageChannelMoments___trans_tmp_1, GetImageChannelMoments_M11_0,
GetImageChannelMoments_pixel_3, GetImageChannelMoments_y,
GetImageChannelMoments_p;
double GetImageChannelMoments_M00_0, GetImageChannelMoments_M00_1,
GetImageChannelMoments_M01_1;
MagickPixelPacket GetImageChannelMoments_pixel;
SetMagickPixelPacket(int color, MagickPixelPacket *pixel) {
  pixel->red = color;
}
GetImageChannelMoments() {
  for (; GetImageChannelMoments_y; GetImageChannelMoments_y++) {
    SetMagickPixelPacket(GetImageChannelMoments_p,
                         &GetImageChannelMoments_pixel);
    GetImageChannelMoments_M00_1 += GetImageChannelMoments_pixel.red;
    if (GetImageChannelMoments_image)
      GetImageChannelMoments_M00_1++;
    GetImageChannelMoments_M01_1 +=
        GetImageChannelMoments_y * GetImageChannelMoments_pixel_3;
    if (GetImageChannelMoments_image_0)
      GetImageChannelMoments_M00_0++;
    GetImageChannelMoments_M01_1 +=
        GetImageChannelMoments_y * GetImageChannelMoments_p++;
  }
  GetImageChannelMoments___trans_tmp_1 = atan(GetImageChannelMoments_M11_0);
}

reproduce with:

gcc -march=armv8-a+sve -w -Ofast statistic.i -o statistic.o

bisected to:

01c18f58d37865d5f3bbe93e666183b54ec608c7 is the first bad commit
commit 01c18f58d37865d5f3bbe93e666183b54ec608c7
Author: Robin Dapp 
Date:   Wed Sep 13 22:19:35 2023 +0200

ifcvt/vect: Emit COND_OP for conditional scalar reduction.

As described in PR111401 we currently emit a COND and a PLUS expression
for conditional reductions.  This makes it difficult to combine both
into a masked reduction statement later.
This patch improves that by directly emitting a COND_ADD/COND_OP during
ifcvt and adjusting some vectorizer code to handle it.

It also makes neutral_op_for_reduction return -0 if HONOR_SIGNED_ZEROS
is true.

gcc/ChangeLog:

PR middle-end/111401
* internal-fn.cc (internal_fn_else_index): New function.
* internal-fn.h (internal_fn_else_index): Define.
* tree-if-conv.cc (convert_scalar_cond_reduction): Emit COND_OP
if supported.
(predicate_scalar_phi): Add whitespace.
* tree-vect-loop.cc (fold_left_reduction_fn): Add IFN_COND_OP.
(neutral_op_for_reduction): Return -0 for PLUS.
(check_reduction_path): Don't count else operand in COND_OP.
(vect_is_simple_reduction): Ditto.
(vect_create_epilog_for_reduction): Fix whitespace.
(vectorize_fold_left_reduction): Add COND_OP handling.
(vectorizable_reduction): Don't count else operand in COND_OP.
(vect_transform_reduction): Add COND_OP handling.
* tree-vectorizer.h (neutral_op_for_reduction): Add default
parameter.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c: New test.
* gcc.target/riscv/rvv/autovec/cond/pr111401.c: New test.
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-2.c: Adjust.
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-4.c: Ditto.

--

I'll start on the exchange one now.

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406

Tamar Christina  changed:

   What|Removed |Added

   Priority|P3  |P1
Summary|[14 Regression] Several |[14 Regression] Several
   |SPECCPU 2017 benchmarks |SPECCPU 2017 benchmarks
   |fail with LTO on internal   |fail with on internal
   |compiler error: in  |compiler error: in
   |expand_insn, at |expand_insn, at
   |optabs.cc:8305  |optabs.cc:8305 after
   ||g:01c18f58d37865d5f3bbe93e6
   ||66183b54ec608c7

[Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN

2023-11-08 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

--- Comment #4 from Li Pan  ---
(In reply to Richard Biener from comment #3)
> Ah, yes, for lrint we have the builtins - I just looked for lceil here.  So
> yeah, where there are DEF_EXT_LIB_FLOATN_NX_BUILTINS we should have
> DEF_INTERNAL_FLT_FLOATN_FN.

Thanks Richard, I will have a try for this change.

[Bug target/112426] sched1 pessimizes codegen on aarch64 by increasing register pressure

2023-11-08 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112426

--- Comment #3 from Alex Coplan  ---
Seems to happen with/without -fsched-pressure FWIW.

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406

--- Comment #7 from Robin Dapp  ---
Ah, thanks, I can reproduce this on the cfarm/gcc185.

We don't expand:
vect__ifc__141.81_358 = .COND_ADD (vect_cst__356,
vect_GetImageChannelMoments_M00_0_lsm.74_338, { 1.0e+0, ... },
vect_GetImageChannelMoments_M00_0_lsm.74_338);

because we cannot legitimize the first operand ([1]):
 (reg:VNx16QI 247 [ vect_cst__356 ])
while operand[0] is VNx2DFmode (reg:VNx2DF 248 [ vect__ifc__141.81 ]).

We only check for the lhs type in ifcvt via vectorized_internal_fn_supported_p.
Maybe we need something like ifcvt_can_predicate as well?

[Bug fortran/112407] [13/14 Regression] Fix for PR37336 triggers an ICE in gfc_format_decoder while constructing a vtab

2023-11-08 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112407

Paul Thomas  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org

--- Comment #4 from Paul Thomas  ---
Created attachment 56531
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56531&action=edit
Fix for this PR

The bug comes about because the vtable is being declared in one of the specific
procedures typebound to the derived type, thereby making the procedure
implicitly recursive. The attached fix gives this specific procedure the
recursive attribute.

The patch regression tests OK.

I have yet to understand why the vtable is not being declared in the containing
module namespace. I'll dig around some more after I have done some paid work
:-)

Perhaps you could try a build with this patch and -frecursive removed.

Paul

[Bug target/112438] RISC-V: Wrong auto-vectorization on induction variable of RVV

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438

--- Comment #13 from JuzheZhong  ---
Hi, kito.

https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635688.html 

Candidate patch to fix this.

Could you comment and give more explanation to Richards since I don't think I
can explain it better than you.

Thanks.

[Bug libfortran/112371] Wrong upper bound for the result of reduction intrinsics if the array is empty

2023-11-08 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112371

Mikael Morin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2023-11-08
   Assignee|unassigned at gcc dot gnu.org  |mikael at gcc dot gnu.org
  Component|fortran |libfortran

--- Comment #4 from Mikael Morin  ---
Patches submitted (and accepted):
https://gcc.gnu.org/pipermail/fortran/2023-November/059904.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635518.html

[Bug libfortran/112412] Masked reduction functions return an unallocated array when the result is empty

2023-11-08 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112412

Mikael Morin  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-11-08
   Assignee|unassigned at gcc dot gnu.org  |mikael at gcc dot gnu.org

--- Comment #1 from Mikael Morin  ---
Patches submitted (and accepted):
https://gcc.gnu.org/pipermail/fortran/2023-November/059904.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635518.html

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406

--- Comment #8 from Robin Dapp  ---
Ah of course it's not the first argument but the mask.  During vectorization we
already create

fail1.c:15:10: note:  add new stmt: vect__ifc__141.81_358 = .COND_ADD
(vect_cst__356, vect_GetImageChannelMoments_M00_0_lsm.74_338, vect_cst__357,
vect_GetImageChannelMoments_M00_0_lsm.74_338);

where vect_cst__356 is

vector([16,16]) unsigned char vect_cst__356;

fail1.c:15:10: note:  created new init_stmt: _355 = (unsigned char) _114; 
fail1.c:15:10: note:  created new init_stmt: vect_cst__356 =
[vec_duplicate_expr] _355;

The type comes from vect_get_vec_defs_for_operand.

[Bug c++/111788] g++ DWARF for void foo(...) missing unspecified parameters DIE

2023-11-08 Thread gprocida at google dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111788

--- Comment #4 from Giuliano Procida  ---
Also worth noting that while Clang gives virtual table entries (for struct X)
this type in DWARF:

int(** _vptr.X)(...)

GCC gives them this type:

int(** _vptr$X)()

The different member names are an unfortunate inconsistency.

Clang is strictly better on the typing, stating that the virtual functions take
an unknown number of arguments. Perhaps even better would have been if the
first argument had been specified as type X*.

In any case, this is likely all cosmetic as debuggers have all the special
knowledge needed.

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406

--- Comment #9 from Robin Dapp  ---
I believe the problem is that in

  if (vectype)
    vector_type = vectype;
  else if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op))
           && VECTOR_BOOLEAN_TYPE_P (stmt_vectype))
    vector_type = truth_type_for (stmt_vectype);
  else
    vector_type = get_vectype_for_scalar_type (loop_vinfo, TREE_TYPE (op));

we don't expect a COND_OP and wrongly deduce the vector type from the scalar
type (because !VECTOR_BOOLEAN_TYPE_P (stmt_vectype)).  Maybe we need to check
whether we look at the mask operand of a conditional operation and use the
statement's vectype in that case.

[Bug libfortran/112412] Masked reduction functions return an unallocated array when the result is empty

2023-11-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112412

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Mikael Morin :

https://gcc.gnu.org/g:d56bf419453ad44e53b05a9de22e98f6a80b5efd

commit r14-5244-gd56bf419453ad44e53b05a9de22e98f6a80b5efd
Author: Mikael Morin 
Date:   Tue Nov 7 11:24:02 2023 +0100

libgfortran: Don't skip allocation if size is zero [PR112412]

In the function template of transformational functions doing a reduction
of an array along one dimension, if the passed in result array was
unallocated and the calculated allocation size was zero (this is the case
of empty result arrays), an early return used to skip the allocation.  This
change moves the allocation before the early return, so that empty result
arrays are not seen as unallocated.  This is possible because zero size is
explicitly supported by the allocation function.

The offending code is present in several places, and this updates them all.
More precisely, there is one place in the template for logical reductions,
and there are two places in the templates corresponding to masked
reductions with respectively array mask and scalar mask.  Templates for
unmasked reductions, which already allocate before returning, are not
affected, but unmasked reductions are checked nevertheless in the testcase.
The affected m4 files are ifunction.m4 for regular functions and types,
ifunction-s.m4 for character minloc and maxloc, ifunction-s2.m4 for
character minval and maxval, and ifunction_logical for logical reductions.

PR fortran/112412

libgfortran/ChangeLog:

* m4/ifunction.m4 (START_MASKED_ARRAY_FUNCTION, SCALAR_ARRAY_FUNCTION):
Don't skip allocation if the allocation size is zero.
* m4/ifunction-s.m4 (START_MASKED_ARRAY_FUNCTION,
SCALAR_ARRAY_FUNCTION): Ditto.
* m4/ifunction-s2.m4 (START_MASKED_ARRAY_FUNCTION,
SCALAR_ARRAY_FUNCTION): Ditto.
* m4/ifunction_logical.m4 (START_ARRAY_FUNCTION): Ditto.
* generated/all_l1.c: Regenerate.
* generated/all_l16.c: Regenerate.
* generated/all_l2.c: Regenerate.
* generated/all_l4.c: Regenerate.
* generated/all_l8.c: Regenerate.
* generated/any_l1.c: Regenerate.
* generated/any_l16.c: Regenerate.
* generated/any_l2.c: Regenerate.
* generated/any_l4.c: Regenerate.
* generated/any_l8.c: Regenerate.
* generated/count_16_l.c: Regenerate.
* generated/count_1_l.c: Regenerate.
* generated/count_2_l.c: Regenerate.
* generated/count_4_l.c: Regenerate.
* generated/count_8_l.c: Regenerate.
* generated/iall_i1.c: Regenerate.
* generated/iall_i16.c: Regenerate.
* generated/iall_i2.c: Regenerate.
* generated/iall_i4.c: Regenerate.
* generated/iall_i8.c: Regenerate.
* generated/iany_i1.c: Regenerate.
* generated/iany_i16.c: Regenerate.
* generated/iany_i2.c: Regenerate.
* generated/iany_i4.c: Regenerate.
* generated/iany_i8.c: Regenerate.
* generated/iparity_i1.c: Regenerate.
* generated/iparity_i16.c: Regenerate.
* generated/iparity_i2.c: Regenerate.
* generated/iparity_i4.c: Regenerate.
* generated/iparity_i8.c: Regenerate.
* generated/maxloc1_16_i1.c: Regenerate.
* generated/maxloc1_16_i16.c: Regenerate.
* generated/maxloc1_16_i2.c: Regenerate.
* generated/maxloc1_16_i4.c: Regenerate.
* generated/maxloc1_16_i8.c: Regenerate.
* generated/maxloc1_16_r10.c: Regenerate.
* generated/maxloc1_16_r16.c: Regenerate.
* generated/maxloc1_16_r17.c: Regenerate.
* generated/maxloc1_16_r4.c: Regenerate.
* generated/maxloc1_16_r8.c: Regenerate.
* generated/maxloc1_16_s1.c: Regenerate.
* generated/maxloc1_16_s4.c: Regenerate.
* generated/maxloc1_4_i1.c: Regenerate.
* generated/maxloc1_4_i16.c: Regenerate.
* generated/maxloc1_4_i2.c: Regenerate.
* generated/maxloc1_4_i4.c: Regenerate.
* generated/maxloc1_4_i8.c: Regenerate.
* generated/maxloc1_4_r10.c: Regenerate.
* generated/maxloc1_4_r16.c: Regenerate.
* generated/maxloc1_4_r17.c: Regenerate.
* generated/maxloc1_4_r4.c: Regenerate.
* generated/maxloc1_4_r8.c: Regenerate.
* generated/maxloc1_4_s1.c: Regenerate.
* generated/maxloc1_4_s4.c: Regenerate.
* generated/maxloc1_8_i1.c: Regenerate.
* generated/maxloc1_8_i16.c: Regenerate.
* generated/maxloc1_8_i2.c: Regenerate.
* generated/maxloc1_8_i4.c: Regenerate.
   

[Bug libfortran/112371] Wrong upper bound for the result of reduction intrinsics if the array is empty

2023-11-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112371

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Mikael Morin :

https://gcc.gnu.org/g:85a9688180a5523ae1704119978f3d634493300f

commit r14-5245-g85a9688180a5523ae1704119978f3d634493300f
Author: Mikael Morin 
Date:   Tue Nov 7 11:24:03 2023 +0100

libgfortran: Remove early return if extent is zero [PR112371]

Remove the early return present in function templates for transformational
functions doing a (masked) reduction of an array along a dimension.
This early return, which triggered if the extent in the reduction dimension
was zero, was wrong because even if the reduction operation degenerates to
a constant value in that case, one has to loop anyway along the other
dimensions to initialize every element of the resulting array with that
constant value.  The case of negative extent (not sure whether it may happen
in practice) which was also early returning, is handled by clamping to zero.

The offending piece of code was present in several places, and this removes
them all.  Namely, the impacted m4 files are ifunction.m4 for regular
functions and types, ifunction-s.m4 for character minloc and maxloc, and
ifunction-s2.m4 for character minval and maxval.
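
As a rough illustration of the point above (and not the libgfortran template
itself), the C sketch below sums a column-major two-dimensional array along
its second dimension. The name sum_along_dim2 and its parameters are
hypothetical. When the reduced extent is zero the inner loop does nothing,
but the outer loop must still run so that each result element gets the
neutral value; a negative extent is clamped to zero first.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a column-major array of shape (n_keep, n_reduce) along its second
   dimension, writing n_keep results.  */
void sum_along_dim2(const int *a, ptrdiff_t n_keep, ptrdiff_t n_reduce,
                    int *result)
{
    /* Clamp a negative extent to zero instead of returning early.  */
    if (n_reduce < 0)
        n_reduce = 0;

    /* No early return when n_reduce == 0: every result element must still
       be initialized with the neutral value of the reduction (0 for a sum).  */
    for (ptrdiff_t i = 0; i < n_keep; i++) {
        int acc = 0;
        for (ptrdiff_t j = 0; j < n_reduce; j++)
            acc += a[i + j * n_keep];
        result[i] = acc;
    }
}
```

With an early return on n_reduce == 0, the caller's result array would keep
whatever it held before the call.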

PR fortran/112371

libgfortran/ChangeLog:

* m4/ifunction.m4 (START_MASKED_ARRAY_FUNCTION): Remove early return if
extent is zero or less, and clamp negative value to zero.
* m4/ifunction-s.m4 (START_MASKED_ARRAY_FUNCTION): Ditto.
* m4/ifunction-s2.m4 (START_MASKED_ARRAY_FUNCTION): Ditto.
* generated/iall_i1.c: Regenerate.
* generated/iall_i16.c: Regenerate.
* generated/iall_i2.c: Regenerate.
* generated/iall_i4.c: Regenerate.
* generated/iall_i8.c: Regenerate.
* generated/iany_i1.c: Regenerate.
* generated/iany_i16.c: Regenerate.
* generated/iany_i2.c: Regenerate.
* generated/iany_i4.c: Regenerate.
* generated/iany_i8.c: Regenerate.
* generated/iparity_i1.c: Regenerate.
* generated/iparity_i16.c: Regenerate.
* generated/iparity_i2.c: Regenerate.
* generated/iparity_i4.c: Regenerate.
* generated/iparity_i8.c: Regenerate.
* generated/maxloc1_16_i1.c: Regenerate.
* generated/maxloc1_16_i16.c: Regenerate.
* generated/maxloc1_16_i2.c: Regenerate.
* generated/maxloc1_16_i4.c: Regenerate.
* generated/maxloc1_16_i8.c: Regenerate.
* generated/maxloc1_16_r10.c: Regenerate.
* generated/maxloc1_16_r16.c: Regenerate.
* generated/maxloc1_16_r17.c: Regenerate.
* generated/maxloc1_16_r4.c: Regenerate.
* generated/maxloc1_16_r8.c: Regenerate.
* generated/maxloc1_16_s1.c: Regenerate.
* generated/maxloc1_16_s4.c: Regenerate.
* generated/maxloc1_4_i1.c: Regenerate.
* generated/maxloc1_4_i16.c: Regenerate.
* generated/maxloc1_4_i2.c: Regenerate.
* generated/maxloc1_4_i4.c: Regenerate.
* generated/maxloc1_4_i8.c: Regenerate.
* generated/maxloc1_4_r10.c: Regenerate.
* generated/maxloc1_4_r16.c: Regenerate.
* generated/maxloc1_4_r17.c: Regenerate.
* generated/maxloc1_4_r4.c: Regenerate.
* generated/maxloc1_4_r8.c: Regenerate.
* generated/maxloc1_4_s1.c: Regenerate.
* generated/maxloc1_4_s4.c: Regenerate.
* generated/maxloc1_8_i1.c: Regenerate.
* generated/maxloc1_8_i16.c: Regenerate.
* generated/maxloc1_8_i2.c: Regenerate.
* generated/maxloc1_8_i4.c: Regenerate.
* generated/maxloc1_8_i8.c: Regenerate.
* generated/maxloc1_8_r10.c: Regenerate.
* generated/maxloc1_8_r16.c: Regenerate.
* generated/maxloc1_8_r17.c: Regenerate.
* generated/maxloc1_8_r4.c: Regenerate.
* generated/maxloc1_8_r8.c: Regenerate.
* generated/maxloc1_8_s1.c: Regenerate.
* generated/maxloc1_8_s4.c: Regenerate.
* generated/maxval1_s1.c: Regenerate.
* generated/maxval1_s4.c: Regenerate.
* generated/maxval_i1.c: Regenerate.
* generated/maxval_i16.c: Regenerate.
* generated/maxval_i2.c: Regenerate.
* generated/maxval_i4.c: Regenerate.
* generated/maxval_i8.c: Regenerate.
* generated/maxval_r10.c: Regenerate.
* generated/maxval_r16.c: Regenerate.
* generated/maxval_r17.c: Regenerate.
* generated/maxval_r4.c: Regenerate.
* generated/maxval_r8.c: Regenerate.
* generated/minloc1_16_i1.c: Regenerate.
* generated/minloc1_16_i16.c: Regenerate.
* generated

[Bug libfortran/112371] Wrong upper bound for the result of reduction intrinsics if the array is empty

2023-11-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112371

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Mikael Morin :

https://gcc.gnu.org/g:62715bf891979cfb8c6684fdcd65b06a28bbbf5c

commit r14-5246-g62715bf891979cfb8c6684fdcd65b06a28bbbf5c
Author: Mikael Morin 
Date:   Tue Nov 7 11:24:04 2023 +0100

libgfortran: Remove empty array descriptor first dimension overwrite
[PR112371]

Remove the forced overwrite of the first dimension of the result array
descriptor to set it to zero extent, in the function templates for
transformational functions doing an array reduction along a dimension.
This overwrite, which happened before early returning in case the result
array was empty, was wrong because an array may have a non-zero extent in
the first dimension and still be empty if it has a zero extent in a higher
dimension.  Overwriting the dimension was resulting in wrong array result
upper bound for the first dimension in that case.

The offending piece of code was present in several places, and this removes
them all.  More precisely, there is only one case to fix for logical
reduction functions, and there are three cases for other reduction
functions, corresponding to non-masked reduction, reduction with array mask,
and reduction with scalar mask.  The impacted m4 files are
ifunction_logical.m4 for logical reduction functions, ifunction.m4 for
regular functions and types, ifunction-s.m4 for character minloc and maxloc,
ifunction-s2.m4 for character minval and maxval, and ifindloc1.m4 for
findloc.
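
A small C sketch of why emptiness cannot be read off the first dimension
alone; element_count is a hypothetical helper, not a libgfortran function.
An array is empty when the product of its extents is zero, which can happen
while the first extent is still non-zero.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in an array with the given per-dimension extents.  */
size_t element_count(const size_t *extent, size_t rank)
{
    size_t count = 1;
    for (size_t d = 0; d < rank; d++)
        count *= extent[d];
    return count;
}
```

For extents {3, 0} the count is zero although the first extent is 3, so
forcing the first dimension to zero extent would report the wrong bound.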

PR fortran/112371

libgfortran/ChangeLog:

* m4/ifunction.m4 (START_ARRAY_FUNCTION, START_MASKED_ARRAY_FUNCTION,
SCALAR_ARRAY_FUNCTION): Remove overwrite of the first dimension of the
array descriptor.
* m4/ifunction-s.m4 (START_ARRAY_FUNCTION,
START_MASKED_ARRAY_FUNCTION,
SCALAR_ARRAY_FUNCTION): Ditto.
* m4/ifunction-s2.m4 (START_ARRAY_FUNCTION,
START_MASKED_ARRAY_FUNCTION, SCALAR_ARRAY_FUNCTION): Ditto.
* m4/ifunction_logical.m4 (START_ARRAY_FUNCTION): Ditto.
* m4/ifindloc1.m4: Ditto.
* generated/all_l1.c: Regenerate.
* generated/all_l16.c: Regenerate.
* generated/all_l2.c: Regenerate.
* generated/all_l4.c: Regenerate.
* generated/all_l8.c: Regenerate.
* generated/any_l1.c: Regenerate.
* generated/any_l16.c: Regenerate.
* generated/any_l2.c: Regenerate.
* generated/any_l4.c: Regenerate.
* generated/any_l8.c: Regenerate.
* generated/count_16_l.c: Regenerate.
* generated/count_1_l.c: Regenerate.
* generated/count_2_l.c: Regenerate.
* generated/count_4_l.c: Regenerate.
* generated/count_8_l.c: Regenerate.
* generated/findloc1_c10.c: Regenerate.
* generated/findloc1_c16.c: Regenerate.
* generated/findloc1_c17.c: Regenerate.
* generated/findloc1_c4.c: Regenerate.
* generated/findloc1_c8.c: Regenerate.
* generated/findloc1_i1.c: Regenerate.
* generated/findloc1_i16.c: Regenerate.
* generated/findloc1_i2.c: Regenerate.
* generated/findloc1_i4.c: Regenerate.
* generated/findloc1_i8.c: Regenerate.
* generated/findloc1_r10.c: Regenerate.
* generated/findloc1_r16.c: Regenerate.
* generated/findloc1_r17.c: Regenerate.
* generated/findloc1_r4.c: Regenerate.
* generated/findloc1_r8.c: Regenerate.
* generated/findloc1_s1.c: Regenerate.
* generated/findloc1_s4.c: Regenerate.
* generated/iall_i1.c: Regenerate.
* generated/iall_i16.c: Regenerate.
* generated/iall_i2.c: Regenerate.
* generated/iall_i4.c: Regenerate.
* generated/iall_i8.c: Regenerate.
* generated/iany_i1.c: Regenerate.
* generated/iany_i16.c: Regenerate.
* generated/iany_i2.c: Regenerate.
* generated/iany_i4.c: Regenerate.
* generated/iany_i8.c: Regenerate.
* generated/iparity_i1.c: Regenerate.
* generated/iparity_i16.c: Regenerate.
* generated/iparity_i2.c: Regenerate.
* generated/iparity_i4.c: Regenerate.
* generated/iparity_i8.c: Regenerate.
* generated/maxloc1_16_i1.c: Regenerate.
* generated/maxloc1_16_i16.c: Regenerate.
* generated/maxloc1_16_i2.c: Regenerate.
* generated/maxloc1_16_i4.c: Regenerate.
* generated/maxloc1_16_i8.c: Regenerate.
* generated/maxloc1_16_r10.c: Regenerate.
* generated/maxloc1_16_r16.c: Regenerate.
* generated/maxloc1_16_r17.c: Regenerate.
* generated/maxloc1_16_r4

[Bug c++/112439] New: Modification of a member overlapping with a [[no_unique_address]] member in the constructor is incorrectly rejected

2023-11-08 Thread de34 at live dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112439

Bug ID: 112439
   Summary: Modification of a member overlapping with a
[[no_unique_address]] member in the constructor is
incorrectly rejected
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Keywords: rejects-valid
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: de34 at live dot cn
  Target Milestone: ---

GCC has (incorrectly) rejected the following code snippet since GCC 13
(https://godbolt.org/z/faqMahz34). Emitting an error here is wrong because
c is not const in the constructor body.

```
struct Empty {};

class Foo {
public:
    constexpr Foo(int x, Empty y, int z) : a(x), b(y)
    {
        c = z;
    }

private:
    int a{};
    [[no_unique_address]] Empty b{};
    [[no_unique_address]] int c{};
};

constexpr Foo r{1, {}, 3};

It seems that the code is correctly accepted if b and c are made not to
overlap.
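
For readers unfamiliar with the attribute, here is a minimal sketch of what "overlap" means at the layout level. The two struct names are hypothetical and not part of the report, and the sketch deliberately avoids constant evaluation, so it says nothing about where the GCC 13 regression triggers.

```cpp
#include <cassert>
#include <cstddef>

struct Empty {};

// Hypothetical comparison type: every member needs its own address.
struct PlainLayout {
    int a;
    Empty b;
    int c;
};

// Hypothetical comparison type: the empty member is allowed to share
// storage with a neighbouring member.
struct OverlapLayout {
    int a;
    [[no_unique_address]] Empty b;
    int c;
};

// Letting a member overlap can only shrink the object or leave it unchanged.
constexpr bool overlap_never_grows() {
    return sizeof(OverlapLayout) <= sizeof(PlainLayout);
}
```

On common Itanium-ABI targets one would expect OverlapLayout to be smaller than PlainLayout, but the exact sizes are implementation-specific, so the sketch only checks the inequality.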

[Bug libstdc++/111948] subrange modifies a const size object

2023-11-08 Thread de34 at live dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111948

Jiang An  changed:

   What|Removed |Added

 CC||de34 at live dot cn

--- Comment #6 from Jiang An  ---
This is certainly a compiler bug. I've reduced it and reported Bug 112439.

[Bug libfortran/112371] Wrong upper bound for the result of reduction intrinsics if the array is empty

2023-11-08 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112371

Mikael Morin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Mikael Morin  ---
Fixed for gfortran 14.1, closing.

[Bug libfortran/112412] Masked reduction functions return an unallocated array when the result is empty

2023-11-08 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112412

Mikael Morin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Mikael Morin  ---
Fixed for gfortran 14.1, closing.

[Bug libstdc++/112440] New: Compiler does not grok basic_string::resize and basic_string::reserve if _CharT is char

2023-11-08 Thread antoshkka at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112440

Bug ID: 112440
   Summary: Compiler does not grok basic_string::resize and
basic_string::reserve if _CharT is char
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: antoshkka at gmail dot com
  Target Milestone: ---

Consider the example:

#include <string>
void test1(std::size_t summ) {
    std::string result;
    result.resize(summ);

    if (result.size() > summ) {
        __builtin_abort();
    }
}

The resulting assembly contains `call abort` and code to check the string size:
https://godbolt.org/z/zcj3Pc3G8

Looks like this is due to char* aliasing with string internals, switching to
std::u8string removes the `call abort` related assembly:
https://godbolt.org/z/a6bKaqqn5

I've failed to come up with a generic solution, but looks like adding
__builtin_unreachable() to the end of basic_string::resize and
basic_string::reserve helps: https://godbolt.org/z/vWcjqGK94


P.S.: such hints help to shorten the assembly for reserve+append*n cases
https://godbolt.org/z/nsEGsWdP3 , https://godbolt.org/z/qMf4b7dd8 ,
https://godbolt.org/z/1r6dd6d5M which are quite common.
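
A rough sketch of the kind of hint described above, written as a call-site helper rather than as a change inside libstdc++. The helper name resize_with_hint is hypothetical, the contents of the linked Compiler Explorer pages are not reproduced here, and whether the dead check actually folds away depends on the compiler and version.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: resize, then tell the optimizer what the size is.
inline void resize_with_hint(std::string& s, std::size_t n) {
    s.resize(n);
    // After resize(n) returns normally, size() == n, so this branch is dead.
    if (s.size() != n) {
        __builtin_unreachable();
    }
}

// Same shape as test1 from the report, using the helper instead.
std::size_t test1_hinted(std::size_t summ) {
    std::string result;
    resize_with_hint(result, summ);

    if (result.size() > summ) {
        __builtin_abort();
    }
    return result.size();
}
```

__builtin_unreachable and __builtin_abort are GCC/Clang builtins, so this is not portable to other compilers as written.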

[Bug c++/112437] ICE with throw inside concept sometimes and -std=c++20

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112437

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-11-08

[Bug libstdc++/111948] subrange modifies a const size object

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111948

--- Comment #7 from Jonathan Wakely  ---
Thanks! I briefly tried but failed to reproduce it in a reduced example.

[Bug c++/112439] Modification of a member overlapping with a [[no_unique_address]] member in the constructor is incorrectly rejected in constant evaluation

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112439

Jonathan Wakely  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-11-08

[Bug libstdc++/112440] Compiler does not grok basic_string::resize and basic_string::reserve if _CharT is char

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112440

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed||2023-11-08
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Jonathan Wakely  ---
(In reply to Antony Polukhin from comment #0)
> Consider the example:
> 
> #include <string>
> void test1(std::size_t summ) {
>     std::string result;
>     result.resize(summ);
> 
>     if (result.size() > summ) {
>         __builtin_abort();
>     }
> }
> 
> The resulting assembly contains `call abort` and code to check the string
> size: https://godbolt.org/z/zcj3Pc3G8
> 
> Looks like this is due to char* aliasing with string internals, switching to
> std::u8string removes the `call abort` related assembly:
> https://godbolt.org/z/a6bKaqqn5

Yes, the problem of std::string's data member aliasing itself has been
discussed in bugzilla before. It's presumably also an issue for
std::vector.

I think we discussed a new attribute which would say that none of the members
of *this refer to *this, but that wouldn't be correct for the SSO std::string,
where its data pointer can point to its internal buffer. Maybe it would work
if it meant that the pointer doesn't alias anything of a different type in
*this. That would allow the char* std::string::_M_dataplus._M_p to point to the
char array std::string::_M_local_buf, but would tell the compiler it never
points to _M_string_length or _M_allocated_capacity.


> I've failed to come up with a generic solution, but looks like adding
> __builtin_unreachable() to the end of basic_string::resize and
> basic_string::reserve helps: https://godbolt.org/z/vWcjqGK94
> 
> 
> P.S.: such hints help to shorten the assembly for reserve+append*n cases
> https://godbolt.org/z/nsEGsWdP3 , https://godbolt.org/z/qMf4b7dd8 ,
> https://godbolt.org/z/1r6dd6d5M which are quite common.

That seems worth doing, thanks for the suggestion.
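
To illustrate the aliasing problem discussed in this comment, here is a minimal sketch. StrLike is a hypothetical stand-in and not libstdc++'s actual layout. Because a char lvalue may access any object, a store through the data pointer may modify the length field, so the compiler has to reload the length afterwards.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a string representation.
struct StrLike {
    char* data;
    std::size_t length;
};

// The store through data may alias length, so length must be re-read.
std::size_t store_then_read_length(StrLike& s) {
    s.data[0] = 'x';
    return s.length;
}

// Shows that the aliasing is real: point data at the bytes of length itself.
bool length_changes_through_data() {
    StrLike s;
    s.length = 0;
    s.data = reinterpret_cast<char*>(&s.length);
    return store_then_read_length(s) != 0;
}
```

A real std::string never points its data pointer at its own length field, but the type system alone cannot tell the compiler that, which is what the proposed attribute would express.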

[Bug bootstrap/112441] New: Comparing stages 2 and 3 Bootstrap comparison failure!

2023-11-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112441

Bug ID: 112441
   Summary: Comparing stages 2 and 3 Bootstrap comparison failure!
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: crazylht at gmail dot com
  Target Milestone: ---

I hit a bootstrap comparison failure with r14-5243-g80f466aa1cce27.

My GCC configure options are --with-cpu=native --with-arch=native --disable-libsanitizer
--enable-checking=yes,rtl,extra --enable-clocale

and the machine is Cascade Lake.


make[9]: Leaving directory
'/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap/x86_64-pc-linux-gnu/32/libstdc++-v3'
make[8]: Leaving directory
'/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap/x86_64-pc-linux-gnu/32/libstdc++-v3'
make[7]: Leaving directory
'/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap/x86_64-pc-linux-gnu/32/libstdc++-v3'
make[6]: Leaving directory
'/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap/x86_64-pc-linux-gnu/libstdc++-v3'
make[5]: Leaving directory
'/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap/x86_64-pc-linux-gnu/libstdc++-v3'
make[4]: Leaving directory
'/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap/x86_64-pc-linux-gnu/libstdc++-v3'
make[3]: Leaving directory
'/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap/x86_64-pc-linux-gnu/libstdc++-v3'
make[2]: Leaving directory
'/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap'
make "DESTDIR=" "RPATH_ENVVAR=LD_LIBRARY_PATH"
"TARGET_SUBDIR=x86_64-pc-linux-gnu"
"bindir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/bin"
"datadir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/share"
"exec_prefix=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap"
"includedir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/include"
"datarootdir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/share"
"docdir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/share/doc/"
"infodir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/share/info"
"pdfdir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/share/doc/"
"htmldir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/share/doc/"
"libdir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/lib"
"libexecdir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/libexec"
"lispdir="
"localstatedir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/var"
"mandir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/share/man"
"oldincludedir=/usr/include"
"prefix=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap"
"sbindir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/sbin"
"sharedstatedir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/com"
"sysconfdir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/etc"
"tooldir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/x86_64-pc-linux-gnu"
"build_tooldir=/export/users/liuhongt/install/intel-innersource_master_native_bootstrap/x86_64-pc-linux-gnu"
"target_alias=x86_64-pc-linux-gnu" "AWK=gawk" "BISON=bison" "CC_FOR_BUILD=gcc"
"CFLAGS_FOR_BUILD=-g -O2" "CXX_FOR_BUILD=g++ -std=c++11" "EXPECT=expect"
"FLEX=flex" "INSTALL=/usr/bin/install -c" "INSTALL_DATA=/usr/bin/install -c -m
644" "INSTALL_PROGRAM=/usr/bin/install -c" "INSTALL_SCRIPT=/usr/bin/install -c"
"LDFLAGS_FOR_BUILD=" "LEX=flex" "M4=m4" "MAKE=make" "RUNTEST=runtest"
"RUNTESTFLAGS=" "SED=/usr/bin/sed" "SHELL=/bin/sh" "YACC=bison -y" "`echo
'ADAFLAGS=' | sed -e s'/[^=][^=]*=$/XFOO=/'`" "ADA_CFLAGS=" "AR_FLAGS=rc"
"`echo 'BOOT_ADAFLAGS=-gnatpg' | sed -e s'/[^=][^=]*=$/XFOO=/'`"
"BOOT_CFLAGS=-g -O2" "BOOT_LDFLAGS=" "CFLAGS=-g -O2" "CXXFLAGS=-g -O2"
"LDFLAGS=" "LIBCFLAGS=-g -O2  " "LIBCXXFLAGS=-g -O2   -fno-implicit-templates"
"STAGE1_CHECKING=--enable-checking=yes,rtl,extra,types"
"STAGE1_LANGUAGES=c,c++,lto" "GNATBIND=no" "GNATMAKE=no" "GDC=no" "GDCFLAGS=-g
-O2" "AR_FOR_TARGET=ar --plugin
/usr/libexec/gcc/x86_64-redhat-linux/11/liblto_plugin.so" "AS_FOR_TARGET=as"
"CC_FOR_TARGET=/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap/./gcc/xgcc
-B/export/users/liuhongt/tools-build/build_intel-innersource_master_native_bootstrap/./gcc/"
"CFLAGS_FOR_TARGET=-g -O2" "CPPFLAGS_FOR_TARGET=" "CXXFLAGS_FOR_TARGET=-g -O2
-D_GNU_SOURCE" "DLLTOOL_FOR_TARGET=dlltool"

[Bug tree-optimization/111133] SLP of scatters not implemented

2023-11-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111133

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-11-08
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
With

#pragma GCC ivdep

the dependence issue is gone but we are not grouping gathers/scatters in
vect_analyze_data_ref_accesses.  Technically those are not "groups",
we wouldn't know how to set gap/size.

So to SLP scatters we'd need to optimistically perform SLP discovery on
"all" of them (or likely more successful, start greedy discovery from
the offset or the stored data side, aka from loads).  Eventually handled
when SLP discovery is rewritten.

For now priority #1 is to get single-lane discovery working for scatters.
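
As a rough illustration of the loop shape under discussion, here is a sketch with two scatter stores per iteration. The PR's actual testcase is not quoted in this thread, so the function name and signature are hypothetical. The ivdep pragma asserts that there are no loop-carried dependences, which only holds if the indices are distinct.

```cpp
#include <cassert>

// Hypothetical loop: two indexed (scatter) stores per iteration.
// An SLP vectorizer would have to discover both stores as one group.
void scatter_pairs(int* out, const int* idx, const int* val, int n) {
#pragma GCC ivdep
    for (int i = 0; i < n; ++i) {
        out[idx[2 * i]] = val[2 * i];
        out[idx[2 * i + 1]] = val[2 * i + 1];
    }
}
```

Whether GCC vectorizes such a loop at all depends on the target and on the options used; the sketch only shows the source-level pattern.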

[Bug bootstrap/112441] Comparing stages 2 and 3 Bootstrap comparison failure!

2023-11-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112441

Hongtao.liu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Hongtao.liu  ---
dup

*** This bug has been marked as a duplicate of bug 112374 ***

[Bug target/112374] [14 Regression] `--with-arch=skylake-avx512 --with-cpu=skylake-avx512` causes a comparison failure

2023-11-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

Hongtao.liu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #7 from Hongtao.liu  ---
*** Bug 112441 has been marked as a duplicate of this bug. ***

[Bug libstdc++/19495] basic_string::_M_rep() can produce an unnaturally aligned pointer to _Rep

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19495

--- Comment #27 from Jonathan Wakely  ---
(In reply to Ben Elliston from comment #26)
> This test now passes on powerpc*-linux-gnu.

I wonder how ... the "fix" got reverted, and we still use an allocator of char
to create the storage for the COW string's _Rep, which has stricter alignment
requirements than char.

The array_allocator was eventually removed in 2019, because it was broken. But
the alignment problem still seems to be present in the COW std::string.
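
A small sketch of the alignment concern, using a hypothetical RepLike header. This is not libstdc++'s real _Rep definition, only a struct with similar kinds of fields. Storage obtained as an array of char only promises char alignment, which is weaker than what such a header needs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical header resembling a reference-counted string representation.
struct RepLike {
    std::size_t length;
    std::size_t capacity;
    int refcount;
};

// The header needs stricter alignment than char storage guarantees.
static_assert(alignof(RepLike) > alignof(char),
              "RepLike is expected to need more than byte alignment");

// True if p could legitimately hold a RepLike as far as alignment goes.
bool is_aligned_for_rep(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(RepLike) == 0;
}
```

In practice the default allocator tends to return memory aligned well beyond char, which may be why the test passes; that is an observation about typical implementations, not a guarantee.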

[Bug libstdc++/19495] basic_string::_M_rep() can produce an unnaturally aligned pointer to _Rep

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19495

--- Comment #28 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #27)
> But the alignment problem still seems to be present in the COW std::string.

But I think that's covered by PR 8670, which is still open.

[Bug target/112337] arm: ICE in arm_effective_regno when compiling for MVE

2023-11-08 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337

Vladimir Makarov  changed:

   What|Removed |Added

 CC||vmakarov at gcc dot gnu.org

--- Comment #7 from Vladimir Makarov  ---
(In reply to Alex Coplan from comment #6)
> Confirmed. Here's a slightly cleaned up reproducer that doesn't warn:
> 
> #pragma GCC arm "arm_mve_types.h"
> int32x4_t h(void *p) { return __builtin_mve_vldrwq_sv4si(p); }
> void g(int32x4_t);
> void f(int, int, int, short, int *p) {
>   int *bias = p;
>   for (;;) {
>     int32x4_t d = h(bias);
>     bias += 4;
>     g(d);
>   }
> }
> 
> ICEs with -O2 -march=armv8.1-m.main+mve -mfloat-abi=hard on the trunk.

Looking at the dump, my guess is that the INC/DEC operand is not a reg after
IRA's temporary transformation.  It could be fixed in arm.cc by checking that
the operand is a reg instead of using the assert, but that could be wrong
because the documentation says the operand should be a reg.  Also, such a
solution would not address possible problems on other targets.

Could you provide me with a preprocessed test file?  I'll try to find a
solution as soon as possible.

[Bug target/112337] arm: ICE in arm_effective_regno when compiling for MVE

2023-11-08 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337

--- Comment #8 from Alex Coplan  ---
(In reply to Vladimir Makarov from comment #7)
> (In reply to Alex Coplan from comment #6)
> > Confirmed. Here's a slightly cleaned up reproducer that doesn't warn:
> > 
> > #pragma GCC arm "arm_mve_types.h"
> > int32x4_t h(void *p) { return __builtin_mve_vldrwq_sv4si(p); }
> > void g(int32x4_t);
> > void f(int, int, int, short, int *p) {
> >   int *bias = p;
> >   for (;;) {
> >     int32x4_t d = h(bias);
> >     bias += 4;
> >     g(d);
> >   }
> > }
> > 
> > ICEs with -O2 -march=armv8.1-m.main+mve -mfloat-abi=hard on the trunk.
> 
> Looking at the dump, my guess is that the INC/DEC operand is not a reg after
> IRA's temporary transformation.  It could be fixed in arm.cc by checking that
> the operand is a reg instead of using the assert, but that could be wrong
> because the documentation says the operand should be a reg.  Also, such a
> solution would not address possible problems on other targets.
> 
> Could you provide me with a preprocessed test file?  I'll try to find a
> solution as soon as possible.

The quoted code above is a preprocessed testcase.

FWIW, https://gcc.gnu.org/onlinedocs/gccint/Incdec.html seems to document that
mem inside pre_dec is valid.  If that's no longer the case, we should update
the documentation (and then IRA needs fixing).  If the documentation is
correct, then we need to fix this in arm.cc.

[Bug c/112442] New: Segfault from casting a ptr when using -O2

2023-11-08 Thread adam.andersson at elisapolystar dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

Bug ID: 112442
   Summary: Segfault from casting a ptr when using -O2
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: adam.andersson at elisapolystar dot com
  Target Milestone: ---

Created attachment 56532
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56532&action=edit
Small testprogram that reproduces the issue

I have attached a simple program that segfaults when compiled with GCC 13 and
-O2.

Compiled with: gcc -v -save-temps -O2 gcc-segfault.c

How to reproduce:
$ gcc -O2 a-gcc-segfault.i -o test && ./test
Segmentation fault (core dumped)


This does not happen in GCC 12 or earlier, or if I use -O1 or no optimization.
It also doesn't happen if I remove the cast to unsigned char*, or if I inline the
test function.


My system:
Linux adam1 6.5.9-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 26 Oct 2023 00:52:20
+0000 x86_64 GNU/Linux
gcc version 13.2.1 20230801 (GCC)

[Bug libstdc++/112089] std::shared_lock::unlock should throw operation_not_permitted instead resource_deadlock_would_occur

2023-11-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112089

--- Comment #4 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:e46a07b0e8a2bb074b491e0bb54a5cc8c8051341

commit r13-8012-ge46a07b0e8a2bb074b491e0bb54a5cc8c8051341
Author: Jonathan Wakely 
Date:   Thu Oct 26 16:51:30 2023 +0100

libstdc++: Fix exception thrown by std::shared_lock::unlock() [PR112089]

The incorrect errc constant here looks like a copy&paste error.

libstdc++-v3/ChangeLog:

PR libstdc++/112089
* include/std/shared_mutex (shared_lock::unlock): Change errc
constant to operation_not_permitted.
* testsuite/30_threads/shared_lock/locking/112089.cc: New test.

(cherry picked from commit 0c305f3dec9a992dd775a3b9607b7b1e8c051859)
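
A minimal sketch of how the behaviour touched by this commit can be observed from user code. NullSharedMutex is a hypothetical do-nothing mutex, used only so the example does not depend on a threading library. Calling unlock() on a std::shared_lock that does not own its mutex throws std::system_error; with this fix the code should be operation_not_permitted, while unfixed libstdc++ releases report resource_deadlock_would_occur.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <system_error>

// Hypothetical mutex type satisfying what std::shared_lock needs.
struct NullSharedMutex {
    void lock_shared() {}
    bool try_lock_shared() { return true; }
    void unlock_shared() {}
};

// Returns the error code thrown by unlock() on a lock that owns nothing,
// or a default-constructed error_code if nothing was thrown.
std::error_code unlock_unowned() {
    NullSharedMutex m;
    std::shared_lock<NullSharedMutex> lk(m, std::defer_lock);
    try {
        lk.unlock();
    } catch (const std::system_error& e) {
        return e.code();
    }
    return std::error_code();
}
```

Comparing the returned code against std::errc::operation_not_permitted is one way to tell whether a given standard library build contains the fix.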

[Bug libstdc++/112314] Missing index assertions in basic_string_view

2023-11-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112314

--- Comment #8 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:66d0abdf0ade07228eba4dedcd1a9da09960ef53

commit r13-8014-g66d0abdf0ade07228eba4dedcd1a9da09960ef53
Author: Jonathan Wakely 
Date:   Wed Nov 1 15:01:22 2023 +0000

libstdc++: Add assertion to std::string_view::remove_suffix [PR112314]

libstdc++-v3/ChangeLog:

PR libstdc++/112314
* include/std/string_view (string_view::remove_suffix): Add
debug assertion.
*
testsuite/21_strings/basic_string_view/modifiers/remove_prefix/debug.cc:
New test.
*
testsuite/21_strings/basic_string_view/modifiers/remove_suffix/debug.cc:
New test.

(cherry picked from commit 6afa984f47e16e8bd958646d7407b74e61041f5d)
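
For context, remove_suffix(n) has the precondition n <= size(), and the assertion added by this commit reports violations in debug/assertion builds. Below is a sketch of a caller-side check that avoids relying on it; the helper name try_remove_suffix is hypothetical and not part of libstdc++.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical checked wrapper: refuses to remove more than is there.
bool try_remove_suffix(std::string_view& sv, std::size_t n) {
    if (n > sv.size()) {
        return false;  // would violate the precondition of remove_suffix
    }
    sv.remove_suffix(n);
    return true;
}
```

This requires C++17 for std::string_view.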

[Bug libstdc++/112440] Compiler does not grok basic_string::resize and basic_string::reserve if _CharT is char

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112440

Jonathan Wakely  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=93971

--- Comment #2 from Jonathan Wakely  ---
See PR 103534 comment 8 which refers to PR 93971

PR 98465 also discusses the problem.

[Bug c++/112439] [13/14 Regression] Modification of a member overlapping with a [[no_unique_address]] member in the constructor is incorrectly rejected in constant evaluation

2023-11-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112439

Patrick Palka  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org,
   ||ppalka at gcc dot gnu.org
  Known to work||12.3.0
  Known to fail||13.2.0, 14.0
   Target Milestone|--- |13.3
Summary|Modification of a member|[13/14 Regression]
   |overlapping with a  |Modification of a member
   |[[no_unique_address]]   |overlapping with a
   |member in the constructor   |[[no_unique_address]]
   |is incorrectly rejected in  |member in the constructor
   |constant evaluation |is incorrectly rejected in
   ||constant evaluation

--- Comment #1 from Patrick Palka  ---
Started with r13-160-g967cdbe6629653

[Bug tree-optimization/93971] std::string considered to alias declared objects of incompatible types

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93971

--- Comment #12 from Jonathan Wakely  ---
(In reply to Jason Merrill from comment #11)
> I don't think it's valid to use a plain char array as storage for an object
> of another type; the "provides storage" wording in
> http://eel.is/c++draft/intro.object#3 only applies to unsigned char and
> std::byte.

But that implies the problem would still exist for std::vector<unsigned char>,
and std::basic_string<unsigned char>. But maybe that's OK, strings of unsigned
char are unusual, and a std::vector<unsigned char> might really be used as a
buffer of untyped memory.

If we can optimize std::string and std::vector as though their data
pointers don't alias anything else, that would probably be good enough to
please nearly everybody.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread adam.andersson at elisapolystar dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #1 from Adam Andersson  ---
Disregard my comment about it working in GCC 12. In gcc version 12.3.0 (GCC) it
does not work either.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #2 from Jonathan Wakely  ---
Looks like it doesn't always segfault, but the contents of the tmp buffer are
incorrect (which might segfault, or might fail to print "test!").

[Bug tree-optimization/112443] New: Misoptimization of _mm256_blendv_epi8 intrinsic on avx512bw+avx512vl

2023-11-08 Thread alexander.grund--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112443

Bug ID: 112443
   Summary: Misoptimization of _mm256_blendv_epi8 intrinsic on
avx512bw+avx512vl
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: alexander.gr...@tu-dresden.de
  Target Milestone: ---

Created attachment 56533
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56533&action=edit
Reproducer code extracted from actual source

I came across a piece of code in PyTorch using AVX2 intrinsics that is
misoptimized, producing wrong results when compiled for newer CPUs.
In particular I was able to reproduce this with `-mavx512bw -mavx512vl -O2`

We usually compile with `-march=native` which on the Sapphire Rapids system
enables the above AVX512 flags, but so does `-march=cannonlake` and above.

The piece of code in question is a call to `_mm256_blendv_epi8(a, b, mask)`
that seemingly produces inverted semantics, i.e. for a mask with all bits
set it returns a, and for a mask with all bits unset it returns b.

It is also a bit complicated to reproduce as it seems to require hiding some
details behind a lambda called through `std::function`.
In the attached example a zero vector and a one vector are created once and
copied into the lambda, where they are reused for potentially many iterations
(removing the loop also reproduces the issue).
Any of the following actions causes the bug to disappear:
- Removing either of the 2 `-mavx512` flags
- Reducing to `-O1` or lower
- Moving the zero_vec inside the lambda (moving one_vec makes no difference)
- Not calling through std::function (either run the lambda directly or pass
through as a template param instead of std::function)
- `-DREGEN_MASK` to create a new mask through a (superfluous)
`_mm256_cmpeq_epi8` against all 1 bits

Reproducing:
g++ -std=c++17 -mavx512bw -mavx512vl -O2 bug.cpp && ./a.out

Expected output (last line, first line shows the inverted semantic):
vec[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1]

Actual output:
vec[255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255]
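
For reference, the documented semantics of `_mm256_blendv_epi8(a, b, mask)` can be written as a scalar model: for each of the 32 bytes, the result takes the byte from b when the top bit of the corresponding mask byte is set, and from a otherwise. The sketch below uses plain arrays so it needs no AVX flags; the names Bytes32 and blendv_epi8_model are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes32 = std::array<std::uint8_t, 32>;

// Scalar reference model of the byte-wise variable blend.
Bytes32 blendv_epi8_model(const Bytes32& a, const Bytes32& b,
                          const Bytes32& mask) {
    Bytes32 r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        // Only the most significant bit of each mask byte matters.
        r[i] = (mask[i] & 0x80u) ? b[i] : a[i];
    }
    return r;
}
```

Under this model an all-ones mask yields b and an all-zeros mask yields a, which is the opposite of the behaviour described in the report.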

[Bug middle-end/112444] New: [14 regression] ICE when buliding libqmi (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in useless_type_conversion_p, at gimp

2023-11-08 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444

Bug ID: 112444
   Summary: [14 regression] ICE when buliding libqmi (internal
compiler error: tree check: expected class ‘type’,
have ‘exceptional’ (error_mark) in
useless_type_conversion_p, at gimple-expr.cc:85)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

Created attachment 56534
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56534&action=edit
meson-generated_.._qmi-pbm.c.i.xz

Originally reported downstream in Gentoo by Toralf Förster at
https://bugs.gentoo.org/917037.

Seems to need -ftrivial-auto-var-init=zero.
```
# x86_64-pc-linux-gnu-gcc -Isrc/libqmi-glib/generated/libqmi-glib-generated.a.p
-Isrc/libqmi-glib/generated -I../libqmi-1.32.4/src/libqmi-glib/generated -I.
-I../libqmi-1.32.4 -Isrc/libqmi-glib -I../libqmi-1.32.4/src/libqmi-glib
-I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include
-I/usr/lib64/libffi/include -I/usr/include/libmount -I/usr/include/blkid
-I/usr/include/libqrtr-glib -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64
-Wall -Winvalid-pch -Wextra -std=gnu89 -DHAVE_CONFIG_H -Wno-unused-parameter
-Wno-cast-function-type -Wno-packed -O3 -pipe -march=native
-fno-diagnostics-color -fPIC -pthread
-DGLIB_VERSION_MIN_REQUIRED=GLIB_VERSION_2_56
-DGLIB_VERSION_MAX_ALLOWED=GLIB_VERSION_2_56
-DGLIB_DISABLE_DEPRECATION_WARNINGS -DLIBQMI_GLIB_COMPILATION
'-DG_LOG_DOMAIN="Qmi"' -Wno-unused-function -MD -MQ
src/libqmi-glib/generated/libqmi-glib-generated.a.p/meson-generated_.._qmi-pbm.c.o
-MF
src/libqmi-glib/generated/libqmi-glib-generated.a.p/meson-generated_.._qmi-pbm.c.o.d
-o
src/libqmi-glib/generated/libqmi-glib-generated.a.p/meson-generated_.._qmi-pbm.c.o
-c src/libqmi-glib/generated/qmi-pbm.c -ftrivial-auto-var-init=zero
during GIMPLE pass: fre
src/libqmi-glib/generated/qmi-pbm.c: In function
‘message_get_all_capabilities_get_tlv_printable’:
src/libqmi-glib/generated/qmi-pbm.c:3934:1: internal compiler error: tree
check: expected class ‘type’, have ‘exceptional’ (error_mark) in
useless_type_conversion_p, at gimple-expr.cc:85
 3934 | message_get_all_capabilities_get_tlv_printable (
  | ^~
0x560e0a0b792f tree_class_check_failed(tree_node const*, tree_code_class, char
const*, int, char const*)
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231105/gcc-14-20231105/gcc/tree.cc:8999
0x560e0924e52a tree_class_check(tree_node*, tree_code_class, char const*, int,
char const*)
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231105/gcc-14-20231105/gcc/tree.h:3753
0x560e0924e52a useless_type_conversion_p(tree_node*, tree_node*)
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231105/gcc-14-20231105/gcc/gimple-expr.cc:85
0x560e0a9c8534 verify_gimple_assign_single
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231105/gcc-14-20231105/gcc/tree-cfg.cc:4593
0x560e0aa84d20 verify_gimple_in_cfg(function*, bool, bool)
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231105/gcc-14-20231105/gcc/tree-cfg.cc:5582
0x560e0a8f7f9c execute_function_todo
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231105/gcc-14-20231105/gcc/passes.cc:2088
0x560e0a8f7f9c do_per_function
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231105/gcc-14-20231105/gcc/passes.cc:1687
0x560e0a8f7f9c execute_todo
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231105/gcc-14-20231105/gcc/passes.cc:2142
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
```

'gcc -c meson-generated_.._qmi-pbm.c.i -O3 -ftrivial-auto-var-init=zero' is
enough for me to reproduce.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #3 from Andrew Pinski  ---
I am not 100% sure, but there seems to be some kind of aliasing issue going on.

Basically you have a pointer to `unsigned char` but are writing it via a
pointer to `char`.
Yes, writing to a type via `char` would be valid and well defined, but you are
writing to a pointer to char.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread adam.andersson at elisapolystar dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #4 from Adam Andersson  ---
(In reply to Andrew Pinski from comment #3)
> I am not 100% sure, but there seems to be some kind of aliasing issue going on.
> 
> Basically you have a pointer to `unsigned char` but are writing it via a
> pointer to `char`.
> Yes, writing to a type via `char` would be valid and well defined, but you are
> writing to a pointer to char.

Something weird is going on when casting a char pointer to an unsigned char
pointer. If you replace the unsigned char pointer with a void pointer it works
fine.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread jeffreyalaw at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #16 from Jeffrey A. Law  ---
On 11/8/23 03:09, manolis.tsamis at vrull dot eu wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
> 
> --- Comment #15 from Manolis Tsamis  ---
> (In reply to Sam James from comment #13)
>> Created attachment 56527 [details]
>> compile.c.323r.fold_mem_offsets.bad.xz
>>
>> Output from
>> ```
>> hppa2.0-unknown-linux-gnu-gcc -c  -DNDEBUG -g -fwrapv -O3 -Wall -O2
>> -std=c11 -Werror=implicit-function-declaration -fvisibility=hidden
>> -I/home/sam/git/cpython/Include/internal -IObjects -IInclude -IPython -I.
>> -I/home/sam/git/cpython/Include-DPy_BUILD_CORE -o Python/compile.o
>> /home/sam/git/cpython/Python/compile.c -fdump-rtl-fold_mem_offsets-all
>> ```
>>
>> If I instrument certain functions in compile.c with a no-optimisation
>> attribute or build the file with -fno-fold-mem-offsets, Python works, so I'm
>> reasonably sure this is the relevant object.
> 
> Thanks for the dump file! There are 66 folded/eliminated instructions in this
> object file; I did look at each case and there doesn't seem to be anything
> strange. In fact most of the transformations are straightforward:
> 
>   - All except a couple of cases don't involve any arithmetic, so it's just
> moving a constant around.
>   - The majority of the transformations are 'trivial' and consist of a single
> add and then a memory operation: a sequence like X = Y + Const, R = MEM[X + 0]
> is folded to X = Y, R = MEM[X + Const]. I wonder why so many of these exist
> and are not optimized elsewhere.
>   - There are some cases with negative offsets, but the calculations look
> correct.
>   - There are a few more complicated cases, but I've done these on paper and
> they also look correct.
The PA port is "weird".  Its addressing modes aren't a good match for 
GCC (they're not symmetrical across loads vs stores and across fp vs 
integer) and they have the implicit space register problem.  But I don't 
immediately recall needing to avoid propagation of constants into memory 
references or anything like that.

I'd probably continue with the process of narrowing down what code is
affected using the attributes.  We already know the file; narrowing it
down to a function might help considerably with the evaluation effort.

Note that QEMU has a functional PA port.  So you might be able to just 
take a root filesystem, add the tarball referenced earlier and play 
around to narrow things down further.

I haven't done work on the PA in about 20 years at this point, but I can
probably still grok its code.  Between David and myself I'm sure we can
help interpret what's going on.


Jeff
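
For readers unfamiliar with the pass, the 'trivial' transformation described
in comment #15 can be modeled in plain C. This is only a sketch under stated
assumptions: memory is modeled as a byte array, and the names load_before and
load_after are hypothetical, not anything from the pass itself. It also ignores
the fact that X holds a different value after folding, so a real pass has to
make sure every other use of X is adjusted as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: memory is a byte array, "registers" are size_t values. */

/* Before folding:  X = Y + Const;  R = MEM[X + 0] */
static unsigned char load_before(const unsigned char *mem, size_t mem_size,
                                 size_t y, size_t konst)
{
    size_t x = y + konst;          /* the add that folding tries to remove */
    assert(x < mem_size);
    return mem[x + 0];
}

/* After folding:   X = Y;          R = MEM[X + Const] */
static unsigned char load_after(const unsigned char *mem, size_t mem_size,
                                size_t y, size_t konst)
{
    size_t x = y;                  /* plain copy of the base */
    assert(x + konst < mem_size);
    return mem[x + konst];         /* constant folded into the address */
}
```

Both functions read the same byte for any in-range y and konst, which is the
property one would check by hand for each of the 66 folded instructions.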

[Bug testsuite/111298] time-profiler-2.c flaky on glibc RISC-V

2023-11-08 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111298

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #3 from Jorn Wolfgang Rennecke  ---
(In reply to Patrick O'Neill from comment #0)
> I'm guessing that this is likely due to some conflict between
> time-profiler-1.c and time-profiler-2.c and filing this under testsuite
> framework issue, but feel free to move it if it's likely caused by a
> specific component.

My guess is that the atomic fetch-and-update emitted by
gimple_gen_time_profiler is not actually atomic (at least under RISC-V QEMU).
Note that in time-profiler-2.c, there is a parent and a child process that
access the same gcov data.
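
To make the suspected race easier to picture, here is a rough C11 sketch of a
first-visit counter updated with an atomic fetch-and-add. It is not the code
that gimple_gen_time_profiler emits; global_visit_counter, record_first_visit
and the slot layout are invented for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical global order counter shared by all instrumented functions. */
static atomic_long global_visit_counter;

/* Record the order in which a function was first entered.
   *slot is 0 until the first visit; afterwards it holds the visit order. */
static long record_first_visit(long *slot)
{
    assert(slot != NULL);
    if (*slot == 0) {
        /* atomic_fetch_add returns the old value, so add 1 to get ours. */
        *slot = atomic_fetch_add(&global_visit_counter, 1) + 1;
    }
    return *slot;
}
```

If the fetch-and-add were not really atomic, two concurrent updaters could
observe the same old value and record the same visit order.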

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #5 from Andreas Schwab  ---
warning: dereferencing type-punned pointer will break strict-aliasing rules
[-Wstrict-aliasing]
   15 | test((char **)&ptr, "test!");

[Bug middle-end/112444] [14 regression] ICE when building libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u

2023-11-08 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444

--- Comment #1 from Sam James  ---
Created attachment 56535
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56535&action=edit
reduced.i

cvise popped this out; I haven't tried to prettify it by hand at all, as I'm
heading out now.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

Xi Ruoyao  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED
 CC||xry111 at gcc dot gnu.org

--- Comment #6 from Xi Ruoyao  ---
It's definitely an aliasing rule violation.  And it's still wrong even if you
use a void pointer.  The void pointer "workaround" just happens to work by
luck.

[Bug fortran/112407] [13/14 Regression] Fix for PR37336 triggers an ICE in gfc_format_decoder while constructing a vtab

2023-11-08 Thread trnka at scm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112407

--- Comment #5 from Tomáš Trnka  ---
(In reply to Paul Thomas from comment #4)
> Created attachment 56531 [details]
> Fix for this PR
> 
> The bug comes about because the vtable is being declared in one of the
> specific procedures typebound to the derived type, thereby making the
> procedure implicitly recursive. The attached fix gives this specific
> procedure the recursive attribute.

This fix seems to work great, all of our stuff builds and passes tests without
any new trouble (without -frecursive). Your previous patch in comment 2 also
seems to work (our code builds fine, but I haven't tested that variant
thoroughly).

I'm looking forward to any more information on the root cause.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #7 from Xi Ruoyao  ---
Note that in the "new bug" page, there is a red banner saying:

Before reporting that GCC compiles your code incorrectly, compile it with gcc
-Wall -Wextra and see whether this shows anything wrong with your code.
Similarly, if compiling with -fno-strict-aliasing -fwrapv makes a difference,
your code probably is not correct.

In this case -fno-strict-aliasing makes a difference.  And the code is indeed
incorrect.

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406

--- Comment #10 from Tamar Christina  ---
Just finished second bisect and reduce.  Came out to this commit as well.

---

module brute_force
  integer, parameter :: r=9
  integer sudoku1(1, r)
contains
  subroutine brute
    integer l(r), u(r)
    where(sudoku1(1, :) /= 1)
      l = 1
      u = 1
    end where
    do i1 = 1, u(1)
      do
      end do
    end do
  end
end

---

gfortran -w -c exchange2.f90 -fprofile-generate -march=armv8-a+sve -Ofast -o
exchange2.f90.o

gives:

during GIMPLE pass: vect
exchange2.fppized2.f90:5:18:

5 |   subroutine brute
  |  ^
internal compiler error: in vect_get_vec_defs_for_operand, at
tree-vect-stmts.cc:1257

which is probably related to your last message.

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406

--- Comment #11 from Robin Dapp  ---
Thanks, this is helpful.

I have a patch that I just bootstrapped and ran the testsuite with on aarch64.
Going to post it soon, maybe Richi still has a better idea how to work around
this.

[Bug target/112445] New: [14 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1861 unable to find a register to spill: {*umulditi3_1} with -O -march=cascadelake -fwrapv

2023-11-08 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112445

Bug ID: 112445
   Summary: [14 Regression] ICE: in lra_split_hard_reg_for, at
lra-assigns.cc:1861 unable to find a register to
spill: {*umulditi3_1} with -O -march=cascadelake
-fwrapv
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 56536
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56536&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O -march=cascadelake -fwrapv testcase.c 
testcase.c: In function 'foo':
testcase.c:19:1: error: unable to find a register to spill
   19 | }
  | ^
testcase.c:19:1: error: this is the insn:
(insn 37 142 199 2 (parallel [
            (set (reg:TI 302 [orig:150 _73 ] [150])
                (mult:TI (zero_extend:TI (reg:DI 184 [ cu8_0 ]))
                    (zero_extend:TI (reg:DI 181 [ foo0_s64_0 ]))))
            (clobber (reg:CC 17 flags))
        ]) "testcase.c":10:9 510 {*umulditi3_1}
     (expr_list:REG_DEAD (reg:DI 184 [ cu8_0 ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
during RTL pass: reload
testcase.c:19:1: internal compiler error: in lra_split_hard_reg_for, at
lra-assigns.cc:1861
0x7f3bef _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/repo/gcc-trunk/gcc/rtl-error.cc:108
0x12b9d6d lra_split_hard_reg_for()
/repo/gcc-trunk/gcc/lra-assigns.cc:1861
0x12b3268 lra(_IO_FILE*)
/repo/gcc-trunk/gcc/lra.cc:2495
0x1261999 do_reload
/repo/gcc-trunk/gcc/ira.cc:5973
0x1261999 execute
/repo/gcc-trunk/gcc/ira.cc:6161
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-5250-20231108213319-g8cf7b936d44-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-5250-20231108213319-g8cf7b936d44-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231108 (experimental) (GCC)

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #8 from Jonathan Wakely  ---
The aliasing doesn't happen when writing to the array, it's when reading a
char* value from an object of type unsigned char*.

If you just passed the unsigned char* to memcpy instead of *(char**)&ptr it
would be OK.

memcpy(*&ptr, ...) would also be OK.
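
A minimal C sketch of the "just pass the unsigned char* to memcpy" variant
follows. The reporter's full program is not shown in this thread, so copy_text
and its parameters are hypothetical; the point is only that the pointer keeps
its real type all the way to memcpy, so no char* value is ever read out of an
unsigned char* object.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: takes the buffer pointer by its real type. */
static void copy_text(unsigned char *dst, size_t dst_size, const char *text)
{
    size_t len = strlen(text) + 1;      /* include the terminating NUL */
    assert(dst != NULL && len <= dst_size);
    memcpy(dst, text, len);             /* memcpy takes void *, no cast needed */
}
```

A caller would write copy_text(ptr, size, "test!") instead of going through a
(char **)&ptr cast.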

[Bug libstdc++/8670] Alignment problem in std::basic_string

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8670

--- Comment #22 from Jonathan Wakely  ---
This bug is still present in the COW std::string, which is still supported even
though it's not the default.

There are two problems. The first is the one reported by James Kanze, that the
string contents need to be aligned to alignof(_CharT) but are currently aligned
to 1. The second, stated by Nathan in comment 13, is that the _Rep object needs
to be aligned to alignof(_Rep), not 1. The reference count in the _Rep ends up
misaligned, and the atomic operations on it are undefined.

Here's a C++17 test case that shows both problems:

#define _GLIBCXX_USE_CXX11_ABI 0

#include <string>

template<typename T>
struct alloc
{
  using value_type = T;

  alloc() = default;

  template<typename U>
    alloc(const alloc<U>&) { }

  T* allocate(std::size_t n)
  {
    if constexpr (std::is_same_v<T, char>)
      return next.allocate(n + 1) + 1;
    return next.allocate(n);
  }

  void deallocate(T* p, std::size_t n)
  {
    if constexpr (std::is_same_v<T, char>)
      return next.deallocate(p - 1, n + 1);
    return next.deallocate(p, n);
  }

  [[no_unique_address]] std::allocator<T> next;

  bool operator==(const alloc&) const { return true; }
};

template<typename C>
  using String = std::basic_string<C, std::char_traits<C>, alloc<C>>;

int main()
{
  String<long double> sd(2, 0.0L);
  return sd[1];
}



This results in loads of UBsan errors like:

/usr/include/c++/13/bits/cow_string.h:3604:24: runtime error: member access
within misaligned address 0x006f22b1 for type 'struct _Rep', which requires
8 byte alignment
0x006f22b1: note: pointer points here
 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
00 00  00 00 00 00 00
  ^ 
...
/usr/include/c++/13/bits/char_traits.h:307:15: runtime error: store to
misaligned address 0x006f22c9 for type 'char_type', which requires 16 byte
alignment
0x006f22c9: note: pointer points here
 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
00 00  00 00 00 00 00
  ^ 
...
/usr/include/c++/13/bits/cow_string.h:252:46: runtime error: reference binding
to misaligned address 0x006f22e9 for type 'char_type', which requires 16
byte alignment
0x006f22e9: note: pointer points here
 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
00 00  00 00 00 00 00
  ^ 
...
/usr/include/c++/13/ext/atomicity.h:84:18: runtime error: load of misaligned
address 0x006f22c1 for type '_Atomic_word', which requires 4 byte alignment
0x006f22c1: note: pointer points here
 00 00 00  00 ff ff ff ff 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
00 00  00 00 00 00 00
  ^ 


It seems to me that we can just do:

  struct __attribute__((__aligned__(__alignof__(_CharT)))) _Rep_base
  {
    size_type       _M_length;
    size_type       _M_capacity;
    _Atomic_word    _M_refcount;
  };

And then stop allocating raw bytes (with alignment 1) to place the _Rep into,
and use this allocator type instead:

typedef typename __gnu_cxx::__alloc_traits<_Alloc>::template
  rebind<_Rep>::other _Rep_alloc;

Then of course we need to adjust the size that we (de)allocate, to be in units
of sizeof(_Rep) not bytes.
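
The size adjustment mentioned above amounts to a round-up division. A small C
sketch, assuming a hypothetical helper units_for_bytes in which unit_size
stands in for sizeof(_Rep); the real libstdc++ change may compute this
differently:

```c
#include <assert.h>
#include <stddef.h>

/* Number of unit_size-sized objects needed to cover `bytes` bytes,
   rounding up so a partial unit still gets a whole object. */
static size_t units_for_bytes(size_t bytes, size_t unit_size)
{
    assert(unit_size != 0);
    return bytes / unit_size + (bytes % unit_size != 0);
}
```

Allocation and deallocation have to use the same count, which is exactly what
the ABI concern in the follow-up comment is about.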

[Bug libstdc++/8670] Alignment problem in std::basic_string

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8670

Jonathan Wakely  changed:

   What|Removed |Added

   Assignee|giovannibajo at gmail dot com  |redi at gcc dot gnu.org

--- Comment #23 from Jonathan Wakely  ---
Created attachment 56537
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56537&action=edit
Fix alignment of COW std::string storage

This fixes both problems, so re-assigning to myself.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread adam.andersson at elisapolystar dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #9 from Adam Andersson  ---
I was sure I had tried -fno-strict-aliasing without any difference, but I
guess I messed up somehow. Sorry about that.

Still, is it not strange that -Wall doesn't generate a warning about this then?

[Bug ada/112446] New: Switch -gnatyz included in -gnatyg

2023-11-08 Thread simon at pushface dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112446

Bug ID: 112446
   Summary: Switch -gnatyz included in -gnatyg
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: simon at pushface dot org
CC: dkm at gcc dot gnu.org
  Target Milestone: ---

Created attachment 56538
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56538&action=edit
Demonstrator

"gnatmake --help" states that -gnatyg is equivalent to -gnatydISux, but 
in fact the new switch -gnatyz (check parentheses not required by operator 
precedence rules) is included.

If this is deliberate, the help information should say so.

(Personally, I think that clarifying parens are a valuable help to the 
reader! Are the GNAT Style Rules published?)

Given this (see the attachment),

   procedure P (P1, P2 : Boolean) is
      Dummy : Boolean;
   begin
      Dummy := (P1) or P2;
   end P;

this happens:

   $ /opt/gcc-14.0.0-20231105/bin/gnatmake -gnatyg p.adb
   gcc -c -gnatyg p.adb
   p.adb:4:13: (style) redundant parentheses [-gnatyz]

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #17 from dave.anglin at bell dot net ---
On 2023-11-08 9:42 a.m., jeffreyalaw at gmail dot com wrote:
> I'd probably continue with the process of narrowing down what code is
> affected using the attributes.  We already know the file, narrowing it
> down to a function might help considerably with the evaluation effort.
The problem seems to be in compiler_visit_expr().

-static int compiler_visit_expr(struct compiler *, expr_ty);
+static int compiler_visit_expr(struct compiler *, expr_ty)
+    __attribute__((optimize("no-inline-small-functions")));

Python builds okay if this function is not inlined, if it is compiled at -O1,
or if -fno-inline-small-functions is specified as above.  Can't specify
-fno-fold-mem-offsets as a function attribute.
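
For anyone wanting to repeat this kind of per-function narrowing, the attribute
usage can be sketched as follows. visit_node is a hypothetical stand-in for the
function under suspicion, and NO_INLINE_SMALL is an invented macro name; the
attribute string is the one used in the diff above and is GCC-specific.

```c
#include <assert.h>

#if defined(__GNUC__) && !defined(__clang__)
/* GCC-specific: applies -fno-inline-small-functions to one function only. */
#define NO_INLINE_SMALL __attribute__((optimize("no-inline-small-functions")))
#else
#define NO_INLINE_SMALL /* attribute not available; expands to nothing */
#endif

/* Hypothetical stand-in for the function being isolated. */
static int visit_node(int depth) NO_INLINE_SMALL;

static int visit_node(int depth)
{
    assert(depth >= 0);
    return depth == 0 ? 0 : 1 + visit_node(depth - 1);
}
```

Swapping the string for another option (for example "O1") gives the other
experiments mentioned above without touching the rest of the file.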

[Bug libstdc++/8670] Alignment problem in std::basic_string

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8670

--- Comment #24 from Jonathan Wakely  ---
Oh, but this would be an ABI break. When using the explicit instantiation
definitions in libstdc++.so allocations and deallocations will match because
both will come from the library. But if anything is inlined, code compiled
against older gcc headers might allocate N bytes from _Raw_bytes_alloc, and
other code might deallocate N2 bytes from _Rep_alloc, where N != N2.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #18 from Andrew Pinski  ---
I wonder if -fno-strict-aliasing works around the issue too?
I get the feeling that the `fold mem offset pass` allows the aliasing code to
have a better time with the offset, and that might expose more aliasing issues.

The other thing to try is add `-fno-schedule-insns2 -fno-schedule-insns`
instead of `-fno-strict-aliasing` as the scheduler is normally where the
aliasing issues are exposed on the RTL level ...

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #19 from Jeffrey A. Law  ---
f-m-o runs post-allocation, so the scope of where its behavior can change
things is narrower.  So testing with -fno-schedule-insns isn't going to be
useful, but -fno-schedule-insns2 might.

I'm a bit concerned that we can't turn off f-m-o with an attribute.  That would
indicate something isn't wired up right in the options handling.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #20 from dave.anglin at bell dot net ---
On 2023-11-08 2:07 p.m., pinskia at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>
> --- Comment #18 from Andrew Pinski  ---
> I wonder if -fno-strict-aliasing works around the issue too?
> I get the feeling that the `fold mem offset pass` allows the aliasing code to
> have a better time with the offset, and that might expose more aliasing issues.
>
> The other thing to try is add `-fno-schedule-insns2 -fno-schedule-insns`
> instead of `-fno-strict-aliasing` as the scheduler is normally where the
> aliasing issues are exposed on the RTL level ...
Both -fno-strict-aliasing and -fno-schedule-insns2 applied to
compiler_visit_expr() work around the issue.

[Bug target/112447] New: risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

Bug ID: 112447
   Summary: risc-v regression: FAIL:
gcc.c-torture/execute/memset-3.c   -O3
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: vineetg at gcc dot gnu.org
  Reporter: vineetg at gcc dot gnu.org
CC: jeffreyalaw at gmail dot com, juzhe.zhong at rivai dot ai,
lehua.ding at rivai dot ai, rdapp at gcc dot gnu.org
  Target Milestone: ---

As reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111311#c8

we have the following execution failures on trunk.

=== gcc: Unexpected fails for rv64gcv lp64d medlow ===
FAIL: gcc.c-torture/execute/memset-3.c   -O3 -g  execution test

The issue is an extraneous VSETVLI instruction (with the wrong SEW) being
generated, which creates a wrong fill pattern for memset.

```
main:

[...]

.L36:  ; 2. loop start for @off 0 
vse8.v  v1,0(t3)
vse8.v  v1,0(t6)
vse8.v  v1,0(s1)
vse8.v  v3,0(a5)
...
; loop epilogue
li  a7,15
beq a4,a7,.L171
vsetvli zero,zero,e32,m2,ta,ma   <--- wrong
j   .L36
```

vsetvli pass dumps:

```
Phase 3: Reduce global vsetvl infos. 

  Compute LCM insert and delete data:

  Expr[2]: VALID (insn 2847, bb 3)
Demand fields: demand_sew_lmul demand_avl
SEW=8, VLMUL=mf2, RATIO=16, MAX_SEW=64
TAIL_POLICY=agnostic, MASK_POLICY=agnostic
AVL=(const_int 8 [0x8])
VL=(nil)

VSETVL infos after phase 3

  bb 3:
probability: always (guessed)
Header vsetvl info:VALID (insn 2847, bb 3) (deleted)  <---
  Demand fields: demand_sew_lmul demand_avl
  SEW=8, VLMUL=mf2, RATIO=16, MAX_SEW=64
  TAIL_POLICY=agnostic, MASK_POLICY=agnostic
  AVL=(const_int 8 [0x8])
  VL=(nil)
```

So it seems LCM is deleting the valid VSETVLI insn which later causes Phase 4
to insert a different/incorrect one.

I reverted the following commit and the issue goes away.

 2023-10-18 f0e28d8c1371 RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

This at least tells us the cause of the issue; the next step is to fix it.
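
As a side note on the dump above, the RATIO field is SEW divided by LMUL. The
small C sketch below (sew_lmul_ratio is a hypothetical helper, with LMUL passed
as a fraction) shows that the demanded SEW=8/mf2 configuration and the e32,m2
configuration in the flagged vsetvli both give 16. That may be why the
`vsetvli zero,zero` form, which keeps VL and relies on an unchanged ratio,
looks acceptable to the pass even though the SEW differs; this is a guess, not
something established in this report.

```c
#include <assert.h>

/* SEW/LMUL ratio, with LMUL given as the fraction lmul_num/lmul_den
   (mf2 is 1/2, m2 is 2/1). */
static unsigned sew_lmul_ratio(unsigned sew, unsigned lmul_num,
                               unsigned lmul_den)
{
    assert(lmul_num != 0 && lmul_den != 0);
    return sew * lmul_den / lmul_num;
}
```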

[Bug target/111311] RISC-V regression testsuite errors with --param=riscv-autovec-preference=scalable

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111311

--- Comment #16 from Vineet Gupta  ---
(In reply to Patrick O'Neill from comment #8)
> Updated regression list using r14-5070-g4ea36076d66 on rv64gcv:
> 
> === gcc: Unexpected fails for rv64gcv lp64d medlow ===
> FAIL: gcc.c-torture/execute/memset-3.c   -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> FAIL: gcc.c-torture/execute/memset-3.c   -O3 -g  execution test

memset-3 failure tracked separately:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

[Bug c++/112448] New: Constraint expression b<x.v> rejected

2023-11-08 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112448

Bug ID: 112448
   Summary: Constraint expression b<x.v> rejected
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fchelnokov at gmail dot com
  Target Milestone: ---

This program

struct s { static constexpr bool v = true; };
template<bool> inline constexpr bool b = true;
constexpr bool f(auto x) requires b<x.v> { return true; }
static_assert(f(s{})); // clang ok, gcc nope, msvc ok

is valid per the explanation here https://stackoverflow.com/a/77439003/7325599


but GCC rejects it with the error:

missing template arguments before '<' token
3 | constexpr bool f(auto x) requires b<x.v> { return true; }
  |                                    ^
<source>:3:36: error: expected initializer before '<' token
<source>:4:15: error: 'f' was not declared in this scope
4 | static_assert(f(s{}));

Online demo: https://godbolt.org/z/1Gejvr4cn

[Bug target/82524] [7/8 Regression] expensive-optimizations produces wrong results

2023-11-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82524

--- Comment #22 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:dced5ae64703507a7159972316a1dde48e5f7470

commit r14-5254-gdced5ae64703507a7159972316a1dde48e5f7470
Author: Uros Bizjak 
Date:   Wed Nov 8 21:46:26 2023 +0100

i386: Apply LRA reload workaround to insns with high registers [PR82524]

LRA is not able to reload zero_extracted in-out operand with matched input
operand in the same way as strict_low_part in-out operand.  The patch
applies the strict_low_part workaround, where we allow LRA to generate
an instruction with non-matched input operand, which is split post reload
to the instruction that inserts non-matched input operand to an in-out
operand and the instruction that uses matched operand, also to
zero_extracted in-out operand case.

The generated code from the pr82524.c testcase improves from:

movl%esi, %ecx
movl%edi, %eax
movsbl  %ch, %esi
addl%esi, %edx
movb%dl, %ah

to:
movl%edi, %eax
movl%esi, %ecx
movb%ch, %ah
addb%dl, %ah

The compiler is now also able to handle non-commutative operations:

movl%edi, %eax
movl%esi, %ecx
movb%ch, %ah
subb%dl, %ah

and unary operations:

movl%edi, %eax
movl%esi, %edx
movb%dh, %ah
negb%ah

The patch also robustifies split condition of the splitters to ensure that
only alternatives with unmatched operands are split.

PR target/82524

gcc/ChangeLog:

* config/i386/i386.md (*add_1_slp):
Split insn only for unmatched operand 0.
(*sub_1_slp): Ditto.
(*_1_slp): Merge pattern from
"*and_1_slp"
and "*_1_slp" using any_logic code iterator.
Split insn only for unmatched operand 0.
(*neg1_slp): Split insn only for unmatched operand 0.
(*one_cmpl_1_slp): Ditto.
(*ashl3_1_slp): Ditto.
(*_1_slp): Ditto.
(*_1_slp): Ditto.
(*addqi_ext_1): Redefine as define_insn_and_split.  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*qi_ext_2): Merge pattern from
"*addqi_ext_2" and "*subqi_ext_2" using plusminus code
iterator. Redefine as define_insn_and_split.  Add alternative 1
and split insn after reload for unmatched operand 0.
(*subqi_ext_1): Redefine as define_insn_and_split.  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*qi_ext_0): Merge pattern from
"*andqi_ext_0" and and "*qi_ext_0"
using
any_logic code iterator.
(*qi_ext_1): Merge pattern from
"*andqi_ext_1" and "*qi_ext_1" using
any_logic code iterator. Redefine as define_insn_and_split.  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*qi_ext_1_cc): Merge pattern from
"*andqi_ext_1_cc" and "*xorqi_ext_1_cc" using any_logic
code iterator. Redefine as define_insn_and_split.  Add alternative
1
and split insn after reload for unmatched operand 0.
(*qi_ext_2): Merge pattern from
"*andqi_ext_2" and "*qi_ext_2" using
any_logic code iterator. Redefine as define_insn_and_split.  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*qi_ext_3): Redefine as
define_insn_and_split.
Add alternative 1 and split insn after reload for unmatched operand
0.
(*negqi_ext_1): Rename from "*negqi_ext_2".  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*one_cmplqi_ext_1): Ditto.
(*ashlqi_ext_1): Ditto.
(*qi_ext_1): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr78904-1.c (test_sub): New test.
* gcc.target/i386/pr78904-1a.c (test_sub): Ditto.
* gcc.target/i386/pr78904-1b.c (test_sub): Ditto.
* gcc.target/i386/pr78904-2.c (test_sub): Ditto.
* gcc.target/i386/pr78904-2a.c (test_sub): Ditto.
* gcc.target/i386/pr78904-2b.c (test_sub): Ditto.
* gcc.target/i386/pr78952-4.c (test_sub): Ditto.
* gcc.target/i386/pr82524.c: New test.
* gcc.target/i386/pr82524-1.c: New test.
* gcc.target/i386/pr82524-2.c: New test.
* gcc.target/i386/pr82524-3.c: New test.
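
To help read the improved add sequence in the commit message above (movl
%edi, %eax; movl %esi, %ecx; movb %ch, %ah; addb %dl, %ah), here is a C model
of what those four instructions compute. It is a model of the listed
instructions only, not the pr82524.c testcase; add_into_high_byte and the
parameter names a, b, c (for the incoming %edi, %esi, %edx values) are
hypothetical.

```c
#include <stdint.h>

/* Model of: movl %edi,%eax; movl %esi,%ecx; movb %ch,%ah; addb %dl,%ah */
static uint32_t add_into_high_byte(uint32_t a, uint32_t b, uint32_t c)
{
    uint8_t ch = (uint8_t)(b >> 8);     /* %ch: bits 8..15 of b */
    uint8_t dl = (uint8_t)c;            /* %dl: bits 0..7 of c  */
    uint8_t ah = (uint8_t)(ch + dl);    /* 8-bit add wraps modulo 256 */
    /* %ah replaces bits 8..15 of the result; other bits come from a. */
    return (a & ~UINT32_C(0xFF00)) | ((uint32_t)ah << 8);
}
```

The subtract and negate sequences differ only in the 8-bit operation applied
to the high byte.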

[Bug c/112449] New: Arithmetic operations can produce signaling NaNs

2023-11-08 Thread post+gcc at ralfj dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

Bug ID: 112449
   Summary: Arithmetic operations can produce signaling NaNs
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: post+gcc at ralfj dot de
  Target Milestone: ---

According to the IEEE 754 specification, the output of an arithmetic operation
can never be a signaling NaN. However, GCC performs optimizations that turn `x
* 1.0` into just `x`, and if `x` is a signaling NaN, that means that the
multiplication will now (seem to) return a signaling NaN. (proof of GCC
performing that transformation: https://godbolt.org/z/scPhn1d8s)

It is very common for C compilers to violate this IEEE 754 requirement, but it
does open the door to a great many questions. Since GCC evidently does not seem
to implement the original IEEE 754 semantics, it would be great to have some
documentation on what exactly GCC *does* implement, and in particular under
which conditions operations are allowed to return a signaling NaN.

So currently, GCC is either buggy because it violates the IEEE 754 spec, or
there's a documentation bug in that the actual floating point spec GCC intends
to implement is not documented. At least, all I was able to find is
https://gcc.gnu.org/wiki/FloatingPointMath, which just says "does not care
about signalling NaNs". (I hope this does not mean that any arithmetic
operation may arbitrarily produce signaling NaNs. That would be an issue for
operations which are sensitive to the difference between quiet NaN and
signaling NaN, such as `pow`.)

As a point of comparison, LLVM recently added this to their documentation to
answer these kinds of questions:
https://llvm.org/docs/LangRef.html#behavior-of-floating-point-nan-values. (That
PR was authored by me but received input from a lot of people.) LLVM goes
further than to just document signaling vs quiet NaN there, since in practice
there's some critical code that would break if arithmetic operations returned
NaNs with arbitrary bits in their payload (specifically, that would break NaN
boxing as performed by some JavaScript engines, or at least make it a lot less
efficient since engines would have to re-normalize NaNs after every single
operation -- which to my knowledge, they don't actually do in practice).
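
For experiments along these lines, a bit-level check for a signaling NaN can be
sketched in C as below. It assumes IEEE 754 binary64 with the common encoding
in which a set top mantissa bit means "quiet"; targets that use the opposite
convention would need the test inverted. The helper names
is_signaling_nan_bits and double_bits are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumes binary64 where mantissa bit 51 set means "quiet". */
static int is_signaling_nan_bits(uint64_t bits)
{
    uint64_t exponent = (bits >> 52) & 0x7FF;
    uint64_t mantissa = bits & UINT64_C(0x000FFFFFFFFFFFFF);
    uint64_t quiet_bit = UINT64_C(1) << 51;
    return exponent == 0x7FF && mantissa != 0 && (mantissa & quiet_bit) == 0;
}

/* Reinterpret a double as its bit pattern without breaking aliasing rules. */
static uint64_t double_bits(double x)
{
    uint64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return bits;
}
```

One could pass double_bits(x * 1.0) to the check to see whether a signaling
pattern in x survives the multiplication at a given optimization level.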

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #1 from Andrew Pinski  ---
See
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fsignaling-nans

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #2 from Andrew Pinski  ---
Note mips and sh and a few other targets have the quiet bit meaning the
opposite.

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #3 from Andrew Pinski  ---
GCC does document some of this on
https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Floating-point-implementation.html
but not the signaling nan part.

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #4 from Andrew Pinski  ---
And it documents other parts here:
https://gcc.gnu.org/onlinedocs/gcc-13.2.0/cpp/Common-Predefined-Macros.html


specifically:
It does not indicate whether optimizations respect signaling NaN semantics (the
macro for that is __SUPPORT_SNAN__).
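
A tiny C sketch of testing that macro at compile time follows. As far as I
know GCC defines __SUPPORT_SNAN__ when -fsignaling-nans is in effect; the
function name snan_semantics_requested is made up for this example.

```c
/* Returns 1 when the compiler advertises that optimizations respect
   signaling NaN semantics (__SUPPORT_SNAN__ defined), 0 otherwise. */
static int snan_semantics_requested(void)
{
#ifdef __SUPPORT_SNAN__
    return 1;
#else
    return 0;
#endif
}
```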

[Bug c++/112448] Constraint expression b<x.v> rejected

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112448

--- Comment #1 from Andrew Pinski  ---
Note the error message on the trunk changed to:
<source>:4:40: error: template argument 1 is invalid
4 | constexpr bool f(auto x) requires b<x.v> { return true; }
  |                                        ^

[Bug c++/112448] Constraint expression b<x.v> rejected

2023-11-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112448

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Patrick Palka  ---
dup

*** This bug has been marked as a duplicate of bug 104255 ***

[Bug c++/104255] parsing function signature fails when it uses a function parameter outside of an unevaluated context

2023-11-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104255

Patrick Palka  changed:

   What|Removed |Added

 CC||fchelnokov at gmail dot com

--- Comment #7 from Patrick Palka  ---
*** Bug 112448 has been marked as a duplicate of this bug. ***

[Bug c++/104255] parsing function signature fails when it uses a function parameter outside of an unevaluated context

2023-11-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104255

Patrick Palka  changed:

           What    |Removed     |Added
----------------------------------------------------
     Ever confirmed|0           |1
             Status|UNCONFIRMED |NEW
   Last reconfirmed|            |2023-11-08

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #1 from JuzheZhong  ---
Could you share more of the assembly?

[Bug c/112339] ICE with clang::no_sanitize and -fsanitize=

2023-11-08 Thread s_gccbugzilla at nedprod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112339

--- Comment #3 from Niall Douglas  ---
Thanks for the patch. I've sent it on to the originator of the bug. If they
confirm to me that it fixes their issue, I'll let you know.

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #2 from Vineet Gupta  ---
Created attachment 56539
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56539&action=edit
manually reduced src

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #3 from Vineet Gupta  ---
Created attachment 56540
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56540&action=edit
asm output ok

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #4 from Vineet Gupta  ---
Created attachment 56541
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56541&action=edit
asm output nok

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #5 from JuzheZhong  ---
I don't think VSETVL is wrong.


vsetivli zero,8,e8,mf2,ta,ma
sd  ra,120(sp)
vmv.x.s a1,v1
...
.L36:
vse8.v
...
vsetivli zero,8,e32,m2,ta,ma
j   .L36

Both e8mf2 and e32m2 are valid for vse8.v, since they have the same SEW/LMUL
ratio of 16.
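
For readers following along, the ratio argument above can be checked with a few lines of Python. This is only a sketch of the arithmetic (SEW divided by LMUL, which fixes VLMAX = VLEN / ratio); the function name `sew_lmul_ratio` is made up for illustration and is not taken from GCC's vsetvl pass.

```python
from fractions import Fraction


def sew_lmul_ratio(sew_bits: int, lmul: Fraction) -> Fraction:
    """Return SEW/LMUL; VLMAX equals VLEN divided by this ratio."""
    return Fraction(sew_bits) / lmul


# The two configurations named in the comment above.
e8mf2 = sew_lmul_ratio(8, Fraction(1, 2))   # e8, mf2 (LMUL = 1/2)
e32m2 = sew_lmul_ratio(32, Fraction(2))     # e32, m2 (LMUL = 2)

print(e8mf2, e32m2)
```

Equal ratios mean both settings give the same VLMAX, which is why either one keeps vl valid for the store. It says nothing about instructions whose result depends on SEW itself.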

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #6 from Vineet Gupta  ---
I debugged this by single-stepping in QEMU up to the point where the test
fails (first loop, offset 0, iteration 8).

The last VSETVLI executed before the failure is this one:

   0x10a3e   0d107057  vsetvli  zero,zero,e32,m2,ta,ma
   0x10a42   j  0x10666

We eventually hit a vmv.v.x, which creates an invalid pattern because SEW is
e32.

   (gdb) info reg vtype
   vtype  0xd1  209 # SEW = 010’b / e32, LMUL = 001’b / m2
   (gdb) info reg vl
   vl 0x8   8
   (gdb) info reg a0
   a0 0x41  65

   vmv.v.x  v2,a0

  (gdb) info reg v2
  v2 {q = {0x41004100410041} 
  (gdb) info reg v3
  v2 {q = {0x41004100410041}
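
As a cross-check of the vtype value in the gdb session above, here is a small Python sketch that decodes vtype using the field layout from the RVV 1.0 specification (vlmul in bits 2:0, vsew in bits 5:3, vta in bit 6, vma in bit 7). The helper name `decode_vtype` is hypothetical, and reserved encodings are simply rejected.

```python
# vlmul encodings from the RVV 1.0 spec; 0b100 is reserved.
LMUL_NAMES = {0: "m1", 1: "m2", 2: "m4", 3: "m8", 5: "mf8", 6: "mf4", 7: "mf2"}


def decode_vtype(vtype: int) -> str:
    """Render the low 8 bits of vtype the way the assembler spells them."""
    vlmul = vtype & 0x7
    vsew = (vtype >> 3) & 0x7
    vta = (vtype >> 6) & 0x1
    vma = (vtype >> 7) & 0x1
    if vsew > 3 or vlmul not in LMUL_NAMES:
        raise ValueError(f"reserved vtype encoding: {vtype:#x}")
    sew = 8 << vsew  # 0 -> e8, 1 -> e16, 2 -> e32, 3 -> e64
    tail = "ta" if vta else "tu"
    mask = "ma" if vma else "mu"
    return f"e{sew},{LMUL_NAMES[vlmul]},{tail},{mask}"


print(decode_vtype(0xD1))
```

With 0xd1 this gives the same e32,m2,ta,ma setting that the disassembled vsetvli shows.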

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #7 from JuzheZhong  ---
Oh. I missed it:

vmv.v.x v2,s0
vse8.v  v2,0(a5)

Leave it to me today. It should be a simple fix.

Thanks for reporting it.
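
One reading of why this pair matters: vmv.v.x writes SEW-wide elements, while vse8.v always stores bytes, so the splat has to run with SEW=8 for a memset pattern. The Python sketch below is a deliberately simplified model (little-endian element layout, no masking or tail handling, one flat byte string standing in for the register group). The helper names are hypothetical, and it does not attempt to reproduce the exact register contents from the gdb session.

```python
def vmv_v_x(scalar: int, sew_bits: int, vl: int) -> bytes:
    """Model vmv.v.x: write vl elements of SEW bits, each the truncated scalar."""
    mask = (1 << sew_bits) - 1
    element = (scalar & mask).to_bytes(sew_bits // 8, "little")
    return element * vl


def vse8_v(register_bytes: bytes, vl: int) -> bytes:
    """Model vse8.v: store the first vl bytes (EEW = 8) of the source register."""
    return register_bytes[:vl]


# a0 = 0x41 and vl = 8, as in the gdb session from comment #6.
good = vse8_v(vmv_v_x(0x41, 8, 8), 8)    # splat executed under e8
bad = vse8_v(vmv_v_x(0x41, 32, 8), 8)    # splat executed under e32

print(good.hex())
print(bad.hex())
```

Under this model the e8 splat stores eight 0x41 bytes, while the e32 splat interleaves zero bytes, which would make a memset check fail even though vl is still valid for the store.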

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #8 from JuzheZhong  ---
Could you continue debugging more cases?


FAIL: gcc.c-torture/execute/pr89369.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr89369.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr89369.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/pr89369.c   -O3 -g  execution test
FAIL: gcc.dg/pr96239.c execution test
FAIL: gcc.dg/vshift-5.c execution test
FAIL: gcc.dg/torture/pr61346.c   -O2  execution test
FAIL: gcc.dg/torture/pr61346.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  execution test
FAIL: gcc.dg/torture/pr61346.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
FAIL: gcc.dg/torture/pr61346.c   -O3 -g  execution test

These fail on an RV32 system. I will fix the memset issue soon, today.

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

Vineet Gupta  changed:

           What    |Removed     |Added
----------------------------------------------------
     Ever confirmed|0           |1
   Last reconfirmed|            |2023-11-08
             Status|UNCONFIRMED |ASSIGNED

--- Comment #9 from Vineet Gupta  ---
(In reply to JuzheZhong from comment #7)
> Oh. I missed it:
> 
>   vmv.v.x v2,s0
>   vse8.v  v2,0(a5)
> 
> Leave it to me today. It should be a simple fix.
> 
> Thanks for reporting it.

May I ask you to let me continue debugging and fix it? I want to familiarize
myself with the vsetvl pass, and this seems like a good opportunity to do so,
considering you think the fix is not hard.

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #10 from JuzheZhong  ---
(In reply to Vineet Gupta from comment #9)
> (In reply to JuzheZhong from comment #7)
> > Oh. I missed it:
> > 
> >   vmv.v.x v2,s0
> >   vse8.v  v2,0(a5)
> > 
> > Leave it to me today. It should be a simple fix.
> > 
> > Thanks for reporting it.
> 
> May I ask you to let me continue debugging and fix it? I want to
> familiarize myself with the vsetvl pass, and this seems like a good
> opportunity to do so, considering you think the fix is not hard.

OK. Thanks.
