date:20230825

[Bug tree-optimization/111137] [11/12/13/14 Regression] Wrong code at -O2/3

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111137

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
We BB vectorize as follows

[local count: 118111600]:
   # ivtmp_29 = PHI 
-  b[0][3] = 0;
   _39 = g_21;
+  vectp.26_37 = &b[_39][0];
+  vect__15.27_4 = MEM  [(int *)vectp.26_37];
+  vect__18.28_45 = vect__15.27_4 ^ { 1, 1 };
   _15 = b[_39][0];
   _18 = _15 ^ 1;
-  b[_39][0] = _18;
-  b[0][4] = 0;
   _48 = b[_39][1];
   _49 = _48 ^ 1;
-  b[_39][1] = _49;
-  b[0][1] = 0;
+  vectp.30_46 = &b[_39][0];
+  MEM  [(int *)vectp.30_46] = vect__18.28_45;
   _5 = _42;
+  vectp.20_36 = &b[_5][0];
+  vect__6.21_24 = MEM  [(int *)vectp.20_36];
+  vect__7.22_35 = vect__6.21_24 ^ { 1, 1 };
   _6 = b[_5][0];
   _7 = _6 ^ 1;
-  b[_5][0] = _7;
-  b[0][2] = 0;
+  MEM  [(int *)&b + 4B] = { 0, 0, 0, 0 };
   _10 = b[_5][1];
   _11 = _10 ^ 1;
-  b[_5][1] = _11;
+  vectp.24_31 = &b[_5][0];
+  MEM  [(int *)vectp.24_31] = vect__7.22_35;
   ivtmp_23 = ivtmp_29 - 1;
   if (ivtmp_23 != 0)

and the issue is that we somehow get data dependence wrong.  _39 is 1 and
_5 is 0.  That means we have

  b[0][3] = 0;
  _39 = g_21;
  _15 = b[_39][0];
  _18 = _15 ^ 1;
  b[1 /*_39*/][0] = _18;
  b[0][4] = 0;
  _48 = b[_39][1];
  _49 = _48 ^ 1;
  b[1 /*_39*/][1] = _49;
  b[0][1] = 0;
  _5 = _42;
  _6 = b[0 /*_5*/][0];
  _7 = _6 ^ 1;
  b[0 /*_5*/][0] = _7;
  b[0][2] = 0;
  _10 = b[0 /*_5*/][1];
  _11 = _10 ^ 1;
  b[0 /*_5*/][1] = _11;

I will have a look.
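The ordering problem can be replayed by hand. The following is a minimal sketch, assuming b is a 2x5 int array (only the index b[0][4] is implied by the dump) and using the values _39 == 1 and _5 == 0 given above; the function names are made up for illustration. It models the scalar statement order and the order of memory accesses in the vectorized dump, where the two-element load from b[_5][0] is issued before the four-element zero store at &b + 4B.

```c
#include <string.h>

enum { ROWS = 2, COLS = 5 };  /* assumed shape, not taken from the test case */

/* Statement order of the scalar block with _39 == 1 and _5 == 0. */
static void scalar_order(int b[ROWS][COLS])
{
  b[0][3] = 0;
  b[1][0] ^= 1;
  b[0][4] = 0;
  b[1][1] ^= 1;
  b[0][1] = 0;
  b[0][0] ^= 1;
  b[0][2] = 0;
  b[0][1] ^= 1;               /* reads the 0 stored by b[0][1] = 0 above */
}

/* Order of the memory accesses as they appear in the vectorized dump. */
static void vectorized_order(int b[ROWS][COLS])
{
  int v39[2], v5[2];

  /* vect__18 = b[_39][0..1] ^ {1,1}, stored back right away. */
  v39[0] = b[1][0] ^ 1;
  v39[1] = b[1][1] ^ 1;
  memcpy(&b[1][0], v39, sizeof v39);

  /* vect__7 = b[_5][0..1] ^ {1,1}: loaded before the zero store below. */
  v5[0] = b[0][0] ^ 1;
  v5[1] = b[0][1] ^ 1;

  /* MEM[&b + 4B] = {0,0,0,0} covers b[0][1] .. b[0][4]. */
  memset(&b[0][1], 0, 4 * sizeof(int));

  /* Final vector store to b[_5][0..1]. */
  memcpy(&b[0][0], v5, sizeof v5);
}
```

Starting from an array whose b[0][1] is nonzero, the two functions leave different values in b[0][1], because the vectorized order loads that element before it has been zeroed.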

[Bug tree-optimization/111142] [14 regression] ICE in get_group_load_store_type, at tree-vect-stmts.cc:2121 (AVX512)

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111142

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2023-08-25
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Richard Biener  ---

2120  /* Stores can't yet have gaps.  */
2121  gcc_assert (slp_node || vls_type == VLS_LOAD || gap == 0);
(gdb) p vls_type
$1 = VLS_STORE_INVARIANT

Looks like mine I think.  Reducing.

[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111151

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-08-25

--- Comment #1 from Richard Biener  ---
  a = MAX_EXPR <(long long unsigned int) t + 4503599, 32739>;
  b = a * 18446744073709551606;
  printf ((const char *) "%llu (Split calculation result)\n", b);
  c = MAX_EXPR <(long long unsigned int) t * 18446744073709551606 +
18446744073664515626, 18446744073709224226>;
  printf ((const char *) "%llu (Combine calculation result)\n", c);

smells like extract_muldiv, but

Applying pattern match.pd:4359, generic-match-8.cc:3047
Applying pattern match.pd:4359, generic-match-8.cc:3047
Applying pattern match.pd:5475, generic-match-8.cc:1606
Applying pattern match.pd:4256, generic-match-8.cc:2977
Applying pattern match.pd:5113, generic-match-4.cc:2339
Applying pattern match.pd:4392, generic-match-8.cc:3091
Applying pattern match.pd:4359, generic-match-8.cc:3047
Applying pattern match.pd:4359, generic-match-8.cc:3047
Applying pattern match.pd:5475, generic-match-8.cc:1606
Applying pattern match.pd:4256, generic-match-8.cc:2977
Applying pattern match.pd:5113, generic-match-4.cc:2339
Applying pattern match.pd:4392, generic-match-8.cc:3091

it would be nice to have fold-const.cc report "matches" as well.  Maybe
replace all return <expr>; with return report (<expr>); with a report
macro doing dumping like above if <expr> is non-NULL.
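One possible shape for such a reporting wrapper is sketched below. It is only an illustration: dump_stream, report_fold and REPORT_FOLD are hypothetical names, results are modelled as plain pointers, and none of this is GCC's actual dump machinery. The wrapper is an ordinary function hidden behind a macro, so it needs no compiler extensions.

```c
#include <stdio.h>

/* Hypothetical stand-in for a dump file; NULL disables reporting. */
static FILE *dump_stream;

/* Print the location of a successful fold and pass the result through.
   A NULL result means "no fold happened" and is not reported. */
static void *report_fold(void *result, const char *file, int line)
{
  if (result && dump_stream)
    fprintf(dump_stream, "Applying %s:%d\n", file, line);
  return result;
}

/* Wrap the returned expression; __FILE__/__LINE__ expand at the use site. */
#define REPORT_FOLD(X) report_fold((X), __FILE__, __LINE__)
```

A return site would then read `return REPORT_FOLD (expr);` and the returned value is unchanged whether or not dumping is enabled.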

[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111151

Richard Biener  changed:

   What|Removed |Added

   Keywords||needs-bisection, wrong-code
   Target Milestone|--- |12.4

[Bug tree-optimization/111136] ICE in RISC-V test case since r14-3441-ga1558e9ad85693

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111136

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-08-25
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

--- Comment #2 from Richard Biener  ---
I will have a look.

[Bug tree-optimization/111136] ICE in RISC-V test case since r14-3441-ga1558e9ad85693

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111136

--- Comment #3 from Richard Biener  ---
A similar aarch64 ICE is fixed with the following:

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index ebee8037e02..23c6e8259e7 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -2453,8 +2453,13 @@ vect_dissolve_slp_only_groups (loop_vec_info loop_vinfo)
  DR_GROUP_FIRST_ELEMENT (vinfo) = vinfo;
  DR_GROUP_NEXT_ELEMENT (vinfo) = NULL;
  DR_GROUP_SIZE (vinfo) = 1;
- if (STMT_VINFO_STRIDED_P (first_element))
-   DR_GROUP_GAP (vinfo) = 0;
+ if (STMT_VINFO_STRIDED_P (first_element)
+ /* We cannot handle stores with gaps.  */
+ || DR_IS_WRITE (dr_info->dr))
+   {
+ STMT_VINFO_STRIDED_P (vinfo) = true;
+ DR_GROUP_GAP (vinfo) = 0;
+   }
  else
DR_GROUP_GAP (vinfo) = group_size - 1;
  /* Duplicate and adjust alignment info, it needs to
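The effect of the changed condition can be modelled outside the compiler. The sketch below is a toy under stated assumptions: struct toy_vinfo and toy_dissolve are hypothetical stand-ins for the vectorizer's per-statement info and for the dissolve loop, and only the fields touched by the hunk are modelled.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-statement vectorizer info. */
struct toy_vinfo {
  bool is_write;        /* models DR_IS_WRITE */
  bool strided;         /* models STMT_VINFO_STRIDED_P */
  unsigned group_size;  /* models DR_GROUP_SIZE */
  unsigned group_gap;   /* models DR_GROUP_GAP */
};

/* Turn a group of n accesses into n single-element groups, following the
   patched condition: stores and strided accesses get no gap and are marked
   strided, other loads keep a gap of n - 1. */
static void toy_dissolve(struct toy_vinfo *elts, unsigned n, bool first_strided)
{
  for (unsigned i = 0; i < n; i++) {
    elts[i].group_size = 1;
    if (first_strided || elts[i].is_write) {
      elts[i].strided = true;
      elts[i].group_gap = 0;
    } else {
      elts[i].group_gap = n - 1;
    }
  }
}
```

In this model a dissolved store never ends up with a nonzero gap, which is the situation the "Stores can't yet have gaps" assertion guards against.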

[Bug target/108271] Missed RVV cost model

2023-08-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108271

Robin Dapp  changed:

   What|Removed |Added

 CC||rdapp at gcc dot gnu.org

--- Comment #3 from Robin Dapp  ---
This is basically the same problem as PR108412.  As long as loads/stores have a
high(ish) latency and we mostly do load/store, they will tend to lump together
at the end of the function.  Setting vector load/store to a latency of <= 2
helps here and we might want to do this in order to avoid excessive spilling.
I had to deal with this before, e.g. in SPEC2006's calculix.
In the end insn scheduling wouldn't buy us anything and rather caused more
spilling causing performance degradation.

[Bug tree-optimization/111136] ICE in RISC-V test case since r14-3441-ga1558e9ad85693

2023-08-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111136

--- Comment #4 from Robin Dapp  ---
All gather-scatter tests pass for me again (the given example in particular)
after applying this.

[Bug middle-end/111152] New: ~7-9% performance regression on 510.parest_r SPEC 2017 benchmark

2023-08-25 Thread fkastl at suse dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111152

Bug ID: 111152
   Summary: ~7-9% performance regression on 510.parest_r SPEC 2017
benchmark
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: needs-bisection
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fkastl at suse dot cz
CC: mjambor at suse dot cz
  Target Milestone: ---

There is a performance regression of about 7-9% on the 510.parest_r SPEC 2017
benchmark.

Between g:d073e2d75d9ed492 and g:9ade70bb86c8744f

Intel Xeon Gold 5315Y, all with -Ofast native, some with lto and/or pgo
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=801.457.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=796.457.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=792.457.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=797.457.0

With Zen3 -O2 generic lto pgo the regression is less noticeable (only 4%)
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=694.457.0

[Bug tree-optimization/111128] [14 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in useless_type_conversion_p, at gimple-expr.cc:85

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111128

Richard Biener  changed:

   What|Removed |Added

 CC||dcb314 at hotmail dot com

--- Comment #5 from Richard Biener  ---
*** Bug 111130 has been marked as a duplicate of this bug. ***

[Bug c/111130] ice & tree check fail in useless_type_conversion_p, at gimple-expr.cc:85

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111130

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Richard Biener  ---
duplicate then

*** This bug has been marked as a duplicate of bug 111128 ***

[Bug middle-end/111152] ~7-9% performance regression on 510.parest_r SPEC 2017 benchmark

2023-08-25 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111152

Hongtao.liu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #1 from Hongtao.liu  ---
It's PR111064

[Bug middle-end/111152] ~7-9% performance regression on 510.parest_r SPEC 2017 benchmark

2023-08-25 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111152

--- Comment #2 from Hongtao.liu  ---
> With Zen3 -O2 generic lto pgo the regression is less noticeable (only 4%)
> https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=694.457.0

Not sure about this part

[Bug tree-optimization/111142] [14 regression] ICE in get_group_load_store_type, at tree-vect-stmts.cc:2121 (AVX512)

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111142

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Richard Biener  ---
The fix for PR111136 also fixes this.

*** This bug has been marked as a duplicate of bug 111136 ***

[Bug tree-optimization/111136] ICE in RISC-V test case since r14-3441-ga1558e9ad85693

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111136

Richard Biener  changed:

   What|Removed |Added

 CC||manuel.lauss at googlemail dot com

--- Comment #5 from Richard Biener  ---
*** Bug 111142 has been marked as a duplicate of this bug. ***

[Bug middle-end/111152] ~7-9% performance regression on 510.parest_r SPEC 2017 benchmark

2023-08-25 Thread fkastl at suse dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111152

--- Comment #3 from Filip Kastl  ---
Ah, sorry. Didn't notice I'm making a duplicate.

The Zen3 regression isn't that big and could just be noise.

[Bug target/111064] 5-10% regression of parest on icelake between g:d073e2d75d9ed492de9a8dc6970e5b69fae20e5a (Aug 15 2023) and g:9ade70bb86c8744f4416a48bb69cf4705f00905a (Aug 16)

2023-08-25 Thread fkastl at suse dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111064

Filip Kastl  changed:

   What|Removed |Added

 CC||fkastl at suse dot cz

--- Comment #5 from Filip Kastl  ---
*** Bug 111152 has been marked as a duplicate of this bug. ***

[Bug middle-end/111152] ~7-9% performance regression on 510.parest_r SPEC 2017 benchmark

2023-08-25 Thread fkastl at suse dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111152

Filip Kastl  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #4 from Filip Kastl  ---
Marking as duplicate.

*** This bug has been marked as a duplicate of bug 111064 ***

[Bug tree-optimization/111136] ICE in RISC-V test case since r14-3441-ga1558e9ad85693

2023-08-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111136

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:a739bac402ea5a583e43dbd01c14ebaff317c885

commit r14-3477-ga739bac402ea5a583e43dbd01c14ebaff317c885
Author: Richard Biener 
Date:   Fri Aug 25 09:42:16 2023 +0200

tree-optimization/111136 - STMT_VINFO_SLP_VECT_ONLY and stores

vect_dissolve_slp_only_groups currently only expects loads, for stores
we have to make sure to mark the dissolved "groups" strided.

PR tree-optimization/111136
* tree-vect-loop.cc (vect_dissolve_slp_only_groups): For
stores force STMT_VINFO_STRIDED_P and also duplicate that
to all elements.

[Bug tree-optimization/111136] ICE in RISC-V test case since r14-3441-ga1558e9ad85693

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111136

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Richard Biener  ---
Fixed.

[Bug middle-end/110832] [14 Regression] 14% capacita -O2 regression between g:9fdbd7d6fa5e0a76 (2023-07-26 01:45) and g:ca912a39cccdd990 (2023-07-27 03:44) on zen3 and core

2023-08-25 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832

Uroš Bizjak  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Uroš Bizjak  ---
Let's keep this patch to gcc-14+. The runtime regression is now due to strict
IEEE compliance, where the compiler sanitizes every partial vector input to
potentially trapping instructions. OTOH, -fno-trapping-math removes
sanitization fixups (and the documentation documents possible issues with
assembler and builtins passing non-conformant FP values), and the
-m[no-]partial-vector-fp-math option is introduced to completely disable
potentially trapping instructions for partial vectors.

So, fixed for gcc-14+.

[Bug target/110762] [11/12/13 Regression] inappropriate use of SSE (or AVX) insns for v2sf mode operations

2023-08-25 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762

Uroš Bizjak  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   Target Milestone|13.3|14.0
 Resolution|--- |FIXED

--- Comment #25 from Uroš Bizjak  ---
Let's keep this patch to gcc-14+. The compiler now sanitizes every partial
vector input to potentially trapping instructions. OTOH, the patch introduced
a noticeable runtime regression, so in a follow-up patch (PR110832)
-fno-trapping-math removes sanitization fixups (and the documentation documents
possible issues with assembler and builtins passing non-conformant FP values),
and the -m[no-]partial-vector-fp-math option is introduced to completely disable
potentially trapping instructions for partial vectors.

So, fixed for gcc-14+.

[Bug c++/110345] [C++26] P2552R3 - On the ignorability of standard attributes

2023-08-25 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110345

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Created attachment 55791
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55791&action=edit
gcc14-pr110345-wip.patch

So, parts of this paper have already been fixed in r13-3848-g05119c345797bc04c .
The fallthrough stuff is still outstanding I think and I'll have a look.
The largest nightmare is diagnostics on standard attributes which appertain to
something that shouldn't allow them, I'm lost there.
I went through the grammar (except for modules) and tried to find all spots
where standard allows attribute-specifier-seq and tried to write that into a
testcase in the attached patch, for now for deprecated attribute and marked all
-pedantic-errors diagnostics.  I've added lots of FIXMEs in there where I don't
really know.  There are
even 4 spots where we don't parse the attributes at all (and clang++ does), on
the other side there are 4 spots where we do parse them and clang++ doesn't.
I think to mark this paper fully resolved we'd need to resolve the FIXMEs and
copy/adjust the test for all other standard attributes and verify we emit at
least some diagnostics with -pedantic-errors where the standard attributes
aren't allowed to appertain to.

[Bug c/111153] New: RISC-V: Incorrect Vector cost model for reduction

2023-08-25 Thread juzhe.zhong at rivai dot ai via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111153

Bug ID: 111153
   Summary: RISC-V: Incorrect Vector cost model for reduction
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

Consider the following case:

#include <stdint.h>

#define DEF_REDUC_PLUS(TYPE)\
TYPE __attribute__ ((noinline, noclone))\
reduc_plus_##TYPE (TYPE * __restrict a, int n)  \
{   \
  TYPE r = 0;   \
  for (int i = 0; i < n; ++i)   \
r += a[i];  \
  return r; \
}

#define TEST_PLUS(T)\
  T (int32_t)   \

TEST_PLUS (DEF_REDUC_PLUS)

 -O3 --param=riscv-autovec-preference=scalable:

reduc_plus_int32_t:
        ble         a1,zero,.L8
        addiw       a5,a1,-1
        li          a4,4
        addi        sp,sp,-16
        mv          a2,a0
        sext.w      a3,a1
        bleu        a5,a4,.L9
        srliw       a4,a3,2
        slli        a4,a4,4
        mv          a5,a0
        add         a4,a4,a0
        vsetivli    zero,4,e32,m1,ta,ma
        vmv.v.i     v1,0
        vse32.v     v1,0(sp)
.L4:
        vle32.v     v1,0(a5)
        vle32.v     v2,0(sp)
        addi        a5,a5,16
        vadd.vv     v1,v2,v1
        vse32.v     v1,0(sp)
        bne         a4,a5,.L4
        ld          a5,0(sp)
        lw          a4,0(sp)
        andi        a1,a1,-4
        srai        a5,a5,32
        addw        a5,a4,a5
        lw          a4,8(sp)
        addw        a5,a5,a4
        ld          a4,8(sp)
        srai        a4,a4,32
        addw        a0,a5,a4
        beq         a3,a1,.L15
.L3:
        subw        a3,a3,a1
        slli        a5,a1,32
        slli        a3,a3,32
        srli        a3,a3,32
        srli        a5,a5,30
        add         a2,a2,a5
        vsetvli     a5,a3,e8,mf4,tu,mu
        vsetvli     a4,zero,e32,m1,ta,ma
        sub         a1,a3,a5
        vmv.v.i     v1,0
        vsetvli     zero,a3,e32,m1,tu,ma
        vle32.v     v2,0(a2)
        vmv.v.v     v1,v2
        bne         a3,a5,.L21
.L7:
        vsetvli     a4,zero,e32,m1,ta,ma
        vmv.s.x     v2,zero
        vredsum.vs  v1,v1,v2
        vmv.x.s     a5,v1
        addw        a0,a0,a5
.L15:
        addi        sp,sp,16
        jr          ra
.L21:
        slli        a5,a5,2
        add         a2,a2,a5
        vsetvli     zero,a1,e32,m1,tu,ma
        vle32.v     v2,0(a2)
        vadd.vv     v1,v1,v2
        j           .L7
.L8:
        li          a0,0
        ret
.L9:
        li          a1,0
        li          a0,0
        j           .L3

-O3 --param=riscv-autovec-preference=scalable -fno-vect-cost-model:
reduc_plus_int32_t:
        ble         a1,zero,.L4
        vsetvli     a3,zero,e32,m1,ta,ma
        vmv.v.i     v1,0
.L3:
        vsetvli     a5,a1,e32,m1,tu,ma
        slli        a4,a5,2
        sub         a1,a1,a5
        vle32.v     v2,0(a0)
        add         a0,a0,a4
        vadd.vv     v1,v2,v1
        bne         a1,zero,.L3
        vsetvli     a3,zero,e32,m1,ta,ma
        vmv.s.x     v2,zero
        vredsum.vs  v1,v1,v2
        vmv.x.s     a0,v1
        ret
.L4:
        li          a0,0
        ret

The current vector cost model generates inferior codegen.

[Bug c/111153] RISC-V: Incorrect Vector cost model for reduction

2023-08-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111153

--- Comment #1 from Robin Dapp  ---
We seem to decide that a slightly more expensive loop (one instruction more)
without an epilogue is better than a loop with an epilogue.  This looks
intentional in the vectorizer cost estimation and is not specific to our lack
of a costing model.  Hmm..

The main loops are (VLA):
.L3:
        vsetvli     a5,a1,e32,m1,tu,ma
        slli        a4,a5,2
        sub         a1,a1,a5
        vle32.v     v2,0(a0)
        add         a0,a0,a4
        vadd.vv     v1,v2,v1
        bne         a1,zero,.L3

vs (VLS):
.L4:
        vle32.v     v1,0(a5)
        vle32.v     v2,0(sp)
        addi        a5,a5,16
        vadd.vv     v1,v2,v1
        vse32.v     v1,0(sp)
        bne         a4,a5,.L4

This is doubly weird because of the spill of the accumulator.  We shouldn't be
generating this sequence but even if so, it should be more expensive.  This can
be achieved e.g. by the following example vectorizer cost function:

static int
riscv_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
 tree vectype,
 int misalign ATTRIBUTE_UNUSED)
{
  unsigned elements;

  switch (type_of_cost)
{
  case scalar_stmt:
  case scalar_load:
  case scalar_store:
  case vector_stmt:
  case vector_gather_load:
  case vector_scatter_store:
  case vec_to_scalar:
  case scalar_to_vec:
  case cond_branch_not_taken:
  case vec_perm:
  case vec_promote_demote:
  case unaligned_load:
  case unaligned_store:
return 1;

  case vector_load:
  case vector_store:
return 3;

  case cond_branch_taken:
return 3;

  case vec_construct:
elements = estimated_poly_value (TYPE_VECTOR_SUBPARTS (vectype));
return elements / 2 + 1;

  default:
gcc_unreachable ();
}
}
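A rough way to see how these per-statement costs rank the loops in this comment is to tally the instructions of each loop body. The sketch below is only arithmetic on the final assembly: the vectorizer costs GIMPLE statements rather than instructions, counting vsetvli as a scalar statement is an assumption, and the toy_* names are hypothetical.

```c
/* Cost kinds used in the tally; values follow the example cost function. */
enum toy_cost_kind {
  TOY_SCALAR_STMT,
  TOY_VECTOR_STMT,
  TOY_VECTOR_LOAD,
  TOY_VECTOR_STORE,
  TOY_BRANCH_TAKEN,
  TOY_KIND_COUNT
};

static int toy_cost(enum toy_cost_kind kind)
{
  switch (kind) {
  case TOY_SCALAR_STMT:
  case TOY_VECTOR_STMT:
    return 1;
  case TOY_VECTOR_LOAD:
  case TOY_VECTOR_STORE:
  case TOY_BRANCH_TAKEN:
    return 3;
  default:
    return 0;
  }
}

/* Sum count * cost over all kinds for one loop body. */
static int toy_loop_cost(const int counts[TOY_KIND_COUNT])
{
  int total = 0;
  for (int k = 0; k < TOY_KIND_COUNT; k++)
    total += counts[k] * toy_cost((enum toy_cost_kind)k);
  return total;
}

/* Counts per kind: scalar, vector stmt, vector load, vector store, branch. */
static const int toy_vla_loop[TOY_KIND_COUNT]         = {4, 1, 1, 0, 1};
static const int toy_vls_spill_loop[TOY_KIND_COUNT]   = {1, 1, 2, 1, 1};
static const int toy_vls_hoisted_loop[TOY_KIND_COUNT] = {1, 1, 1, 0, 1};
```

With these counts the VLA loop tallies 11 per iteration, the VLS loop that spills the accumulator 14, and a VLS loop with the accumulator kept in a register 8. The iterations do not necessarily process the same number of elements, so the totals are not directly comparable across VLA and VLS.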

For a proper loop like
        vle32.v     v2,0(sp)
.L4:
        vle32.v     v1,0(a5)
        addi        a5,a5,16
        vadd.vv     v1,v2,v1
        bne         a4,a5,.L4
        vse32.v     v1,0(sp)
I'm not so sure anymore.  For large n this could be preferable depending on the
vectorization factor and other things.

[Bug c/111154] New: vect-cost-model=dynamic triggers false warning on array operation

2023-08-25 Thread changyp6 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111154

Bug ID: 111154
   Summary: vect-cost-model=dynamic triggers false warning on
array operation
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: changyp6 at gmail dot com
  Target Milestone: ---

Created attachment 55792
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55792&action=edit
Test case of vect-cost-model

I have a program that operates on an array whose size is defined by a macro.

When compiling this code, GCC always reports the following warnings:
```
test.c:35:18: warning: writing 1 byte into a region of size 0
[-Wstringop-overflow=]
   35 | desta[i] = src[i];
  | ~^~~~
test.c:7:16: note: at offset 8 into destination object 'desta' of size 8
7 | static uint8_t desta[ARRAY_SIZE];
  |^
test.c:35:18: warning: writing 1 byte into a region of size 0
[-Wstringop-overflow=]
   35 | desta[i] = src[i];
  | ~^~~~
test.c:7:16: note: at offset 9 into destination object 'desta' of size 8
7 | static uint8_t desta[ARRAY_SIZE];
  |^
test.c:35:18: warning: writing 1 byte into a region of size 0
[-Wstringop-overflow=]
   35 | desta[i] = src[i];
  | ~^~~~
```
After digging into this issue, I found that this warning is triggered by
`-fvect-cost-model=dynamic` at -O3;
`-fvect-cost-model=very-cheap` does not trigger it.

Attached is the test case program, just run
`make bugs`
to produce this false warning

This source file has 3 functions; each has a for-loop that operates on the
array.
The upper bound of the for-loop is determined by a global static variable,
which is set by another function.

If I remove one of the functions that operate on the array, the warning does
not show up. It seems that this issue is related to loop vectorization: if
there are enough loops, loop vectorization is triggered, and because the
array's bound is a variable that GCC cannot determine directly, the vectorized
loop appears to access regions outside of the array.

[Bug c++/102609] [C++23] P0847R7 - Deducing this

2023-08-25 Thread waffl3x at protonmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102609

--- Comment #15 from waffl3x  ---
Created attachment 55793
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55793&action=edit
inital support for P0847 explicit-object-parameter

Alright, I finalized something that I hope is worthy of criticism. I haven't
run tests on it yet but I think it should be relatively stable. My first time
around I made the mistake of having hard failing TREE_CHECKs in if conditions,
and I'm pretty sure that was causing problems with the tests (I saw a lot of
segfaults), but I'm fairly sure it should be good this time (not that I shared
the first one).

I will probably start tests first thing tomorrow, hopefully I can figure out
how to make it take less than 4-8 hours.

I'm pretty happy with where I put most things, I didn't add anything to the
core tree nodes, I instead used tree_decl_common::decl_flag_3 for PARM_DECL,
but I added a member to lang_decl_base, hopefully this is satisfactory.

As for what I know works right now, the below program outputs 15, 25, 35, 45 as
you would expect. I haven't tried lambdas but I am sure they don't work. I
have not tried anything with inheritance, I wouldn't bet on it working but I
wouldn't bet against it. I have not tried implicit conversions, but I have a
feeling they probably work. I was planning on implementing rejection of
qualifiers on xobj member functions but I forgot, so that will come tomorrow. I
also have to implement errors when trying to declare a xobj parameter in a
function type. I haven't tried taking the address of an xobj member function,
but I have a hunch it will work, they are almost entirely treated as static
member functions at the moment. Speaking of that, the function declaration gets
pretty printed as a static function at the moment too.

So as you can see, there's still lots to do, but it shouldn't be as hard now
that I am more familiar with the code base.

Something I'm not especially happy with is how the error checking is strewn
around. I tried to put it where it is most relevant, but prioritized putting it
where I could get the best diagnostics, but I'm probably missing some things I
could be doing to better group it together. I am also not happy with the
quality of all of the diagnostics, I want to improve on that as I learn the
quirks of the utilities.

Please criticize, I am certain I am still doing some stuff wrong, so I would
appreciate any input so I can correct those mistakes.

Here is the aforementioned program that I know to work on my system. Surely
nothing can go wrong, but who knows.

#include <stdio.h>
struct S {
    int _a;
    int my_func(this S& s) {
        return s._a + 5;
    }
    int my_func(this S const& s) {
        return s._a + 15;
    }
    int my_func(this S&& s) {
        return s._a + 25;
    }
    int my_func(this S const&& s) {
        return s._a + 35;
    }
    template <typename Self>
    int my_func_dispatch(this Self&& self) {
        return static_cast<Self&&>(self).my_func();
    }
};

int main()
{
    S s{10};

    printf("%d\n", s.my_func_dispatch());
    printf("%d\n", static_cast<S const&>(s).my_func_dispatch());
    printf("%d\n", static_cast<S&&>(s).my_func_dispatch());
    printf("%d\n", static_cast<S const&&>(s).my_func_dispatch());
}

[Bug c/111154] vect-cost-model=dynamic triggers false warning on array operation

2023-08-25 Thread changyp6 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111154

--- Comment #1 from Tomas Chang  ---
Forgot to mention: this bug can only be reproduced on the AArch64 platform.

Natively compiling the attached test case on an aarch64 machine, or
cross-compiling the source code with an aarch64 toolchain, triggers it.

[Bug middle-end/54202] Overeager warning about freeing non-heap objects

2023-08-25 Thread charlechaud at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54202

Charles Blake  changed:

   What|Removed |Added

 CC||charlechaud at gmail dot com

--- Comment #11 from Charles Blake  ---
Here is another example (no external includes/etc.):

typedef long unsigned int size_t;

extern void *malloc (size_t __size) __attribute__ ((__nothrow__ , __leaf__))
__attribute__ ((__malloc__))
 __attribute__ ((__alloc_size__ (1))) __attribute__
((__warn_unused_result__));

extern void free (void *__ptr) __attribute__ ((__nothrow__ , __leaf__));

typedef struct { long count, flags; } Foo;

Foo *G(Foo *x, long flags, int unused) {
    if (x) { // originally a file descriptor
        x->count = 0; flags = 0;
    } else {
        if (!(x = (Foo *)malloc(sizeof *x))) return (Foo *)0;
        flags = 1;
    }
    x->flags = flags;
    return x;
}

Foo *F(Foo *x, long flags) {
    if (x) { // originally a pathname path
        x->count = 0; flags = 0;
    } else {
        if (!(x = (Foo *)malloc(sizeof *x))) return (Foo *)0;
        flags = 1;
    }
    x->flags = flags;
    return G(x, flags, 123);
}

void release(Foo *x) {
    if (x && x->flags) free(x);
}

int main(void) {
    Foo x;
    if (F(&x, 0))
        release(&x);
    return 0;
}

In the above, variations due to compiler optimization levels make it even more
confusing: -O1 does not warn, -O2 does warn, -O3 again does not warn (on both
gcc-12.3 and gcc-13.2 on Gentoo Linux, anyway). O3 optimizes the whole main
away.  clang never warns on this at any -O level, nor with clang --analyze.

What is particularly bad about this warning is that AFAICT there is no way to
massage the code to reliably silence it while also preserving the intended
effect.  Now it's even on by default.  This actively discourages an otherwise
clean arrangement in the C programming language when working with anyone else
who takes warnings too seriously.  Any such warning should have "may be" language,
not language that sounds like the compiler has "proved an actuality".

Might the same argument apply to many warnings?  Maybe.  Two words is not so
bad, though.  Other warnings may have workarounds like annotations to silence them
reliably without sacrificing functionality.  Maybe I'm missing one here?

[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu

2023-08-25 Thread mikpelinux at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111151

Mikael Pettersson  changed:

   What|Removed |Added

 CC||mikpelinux at gmail dot com

--- Comment #2 from Mikael Pettersson  ---
On x86_64-linux-gnu I can reproduce this all the way back to gcc-3.2.3 with
both -m64 and -m32. gcc-2.95.3 compiling for i686-linux-gnu gets it right.

[Bug middle-end/110973] 9% 444.namd regression between g:c2a447d840476dbd (2023-08-03 18:47) and g:73da34a538ddc2ad (2023-08-09 20:17)

2023-08-25 Thread fkastl at suse dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110973

--- Comment #4 from Filip Kastl  ---
(In reply to Martin Jambor from comment #3)
> There was also a 7.7% regression on zen3 with generic march (these
> measurements are without LTO):
> 
> https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=466.120.0

I think this slowdown is caused by the r14-3078-gd9f3ea61fe36e2 commit that
Richard mentioned. I measured it on another zen3 machine and got ~7% slowdown.

[Bug tree-optimization/111154] vect-cost-model=dynamic triggers false warning on array operation

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111154

Richard Biener  changed:

   What|Removed |Added

 Blocks||88443
 Ever confirmed|0   |1
   Last reconfirmed||2023-08-25
 Status|UNCONFIRMED |NEW
  Component|c   |tree-optimization
 Target||aarch64
   Keywords||diagnostic

--- Comment #2 from Richard Biener  ---
Confirmed.  Reproduces with -O3 -march=armv8.3-a and

unsigned char desta[8];
void copya(unsigned char *src, int size)
{
  for (int i = 0; i < size; i++)
    desta[i] = src[i];
}

we end up with a peeled epilogue:

  if (_12 == niters_vector_mult_vf.6_39)
goto ; [12.50%]
  else
goto ; [87.50%]

   [local count: 73583527]:
  _96 = MEM[(unsigned char *)src_8(D) + 8B];
  desta[_36] = _96;
  if (size_7(D) > 9)
goto ; [85.71%]
  else
goto ; [14.29%]

   [local count: 10511932]:
  goto ; [100.00%]

   [local count: 63071596]:
  _103 = MEM[(unsigned char *)src_8(D) + 9B];
  desta[9] = _103;
  if (size_7(D) != 10)
goto ; [85.71%]
  else
goto ; [14.29%]

   [local count: 9010228]:
  goto ; [100.00%]

   [local count: 54061368]:
  _110 = MEM[(unsigned char *)src_8(D) + 10B];
  desta[10] = _110;
  if (size_7(D) != 11)
goto ; [85.71%]
  else
goto ; [14.29%]
...


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88443
[Bug 88443] [meta-bug] bogus/missing -Wstringop-overflow warnings

[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111151

Richard Biener  changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
instrumenting extract_muldiv shows

...
Applying pattern match.pd:5113, generic-match-4.cc:2339
Applying fold-const.c:6892
Applying fold-const.c:7101
Applying fold-const.c:6892
Applying fold-const.c:6985
Applying pattern match.pd:4392, generic-match-8.cc:3091

and commenting

    case MIN_EXPR:  case MAX_EXPR:
      /* If widening the type changes the signedness, then we can't perform
         this optimization as that changes the result.  */
      if (TYPE_UNSIGNED (ctype) != TYPE_UNSIGNED (type))
        break;

      /* MIN (a, b) / 5 -> MIN (a / 5, b / 5)  */
      sub_strict_overflow_p = false;
      if ((t1 = extract_muldiv (op0, c, code, wide_type,
                                &sub_strict_overflow_p)) != 0
          && (t2 = extract_muldiv (op1, c, code, wide_type,
                                   &sub_strict_overflow_p)) != 0)
        {
          if (tree_int_cst_sgn (c) < 0)
            tcode = (tcode == MIN_EXPR ? MAX_EXPR : MIN_EXPR);
          if (sub_strict_overflow_p)
            *strict_overflow_p = true;
          return DUMP_FOLD (fold_build2 (tcode, ctype,
                                         fold_convert (ctype, t1),
                                         fold_convert (ctype, t2)));
        }
      break;

fixes the testcase.  We turn

  MAX ((long long unsigned int) t + 4503599, 32739) * 18446744073709551606

to

  MIN ((long long unsigned int) t * 18446744073709551606 +
18446744073664515626,
   18446744073709224226)

I think when overflow wraps we cannot do this transform at all, independent
of the "sign" of 'c'.
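
The mismatch can be reproduced outside the compiler.  The sketch below
evaluates both forms with 64-bit wrapping arithmetic.  Here `tu` stands for the
already converted `(long long unsigned int) t`; the function names and the
input that triggers the difference are illustrative assumptions, not taken
from the testcase.

```c
#include <stdint.h>

/* The multiplier from the expression above; it equals -10 modulo 2^64.  */
#define C_MUL UINT64_C (18446744073709551606)

/* MAX (tu + 4503599, 32739) * C_MUL, the form before folding.  */
static uint64_t
before_fold (uint64_t tu)
{
  uint64_t a = tu + UINT64_C (4503599);
  uint64_t m = a > UINT64_C (32739) ? a : UINT64_C (32739);
  return m * C_MUL;
}

/* MIN (tu * C_MUL + 18446744073664515626, 18446744073709224226),
   the form after folding.  */
static uint64_t
after_fold (uint64_t tu)
{
  uint64_t a = tu * C_MUL + UINT64_C (18446744073664515626);
  uint64_t b = UINT64_C (18446744073709224226);
  return a < b ? a : b;
}
```

For `tu == 0` both functions agree.  For `tu == (uint64_t) 0 - 4503599`, where
the addition wraps to 0, `before_fold` yields 18446744073709224226 while
`after_fold` yields 0.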

Maybe

@@ -6970,8 +6972,11 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code,
tree wide_type,

   /* MIN (a, b) / 5 -> MIN (a / 5, b / 5)  */
   sub_strict_overflow_p = false;
-  if ((t1 = extract_muldiv (op0, c, code, wide_type,
-   &sub_strict_overflow_p)) != 0
+  if ((wide_type
+  ? TYPE_OVERFLOW_UNDEFINED (wide_type)
+  : TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))) 
+ && (t1 = extract_muldiv (op0, c, code, wide_type,
+  &sub_strict_overflow_p)) != 0
  && (t2 = extract_muldiv (op1, c, code, wide_type,
   &sub_strict_overflow_p)) != 0)
{

[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111151

--- Comment #4 from Richard Biener  ---
Created attachment 55794
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55794&action=edit
fold "dumping"

This adds DUMP_FOLD, but I get

/home/rguenther/src/trunk/gcc/fold-const.cc:6808:22: warning: ISO C++ forbids
braced-groups within expressions [-Wpedantic]
 #define DUMP_FOLD(X) ({ auto x = (X); if (x && dump_file && (dump_flags &
TDF_FOLDING)) fprintf (dump_file, "Applying fold-const.c:%d\n", __LINE__); x;
})
  ^
/home/rguenther/src/trunk/gcc/fold-const.cc:7193:15: note: in expansion of
macro 'DUMP_FOLD'
return DUMP_FOLD (fold_build2 (code, ctype, fold_convert (ctype, op0),
   ^

with

#define DUMP_FOLD(X) ({ auto x = (X); if (x && dump_file && (dump_flags &
TDF_FOLDING)) fprintf (dump_file, "Applying fold-const.c:%d\n", __LINE__); x;
})

so maybe we need to use __extension__ and #if __GNUC__ for this?
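
One possible reading of that suggestion, sketched in C rather than in the C++
of fold-const.cc: guard the statement expression with a `__GNUC__` check and
prefix it with `__extension__`, which suppresses the -Wpedantic diagnostic.
`TRACE_FOLD`, `trace_hits` and `fold_twice` are made-up names for this
illustration, `__typeof__` stands in for the `auto` of the original macro, and
stderr stands in for dump_file.

```c
#include <stdio.h>

/* Hypothetical counter so the effect of the macro can be observed.  */
static int trace_hits;

#if defined (__GNUC__)
/* __extension__ silences -Wpedantic for the braced group.  */
# define TRACE_FOLD(X) \
  __extension__ ({ __typeof__ (X) x_ = (X); \
                   if (x_) \
                     { \
                       trace_hits++; \
                       fprintf (stderr, "Applying %s:%d\n", \
                                __FILE__, __LINE__); \
                     } \
                   x_; })
#else
/* Without statement expressions, evaluate X and skip the tracing.  */
# define TRACE_FOLD(X) (X)
#endif

/* Example user: the result passes through the macro unchanged.  */
static int
fold_twice (int v)
{
  return TRACE_FOLD (v * 2);
}
```

The non-GNU fallback loses the dump output, so whether that trade-off is
acceptable depends on which host compilers still need to be supported.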

[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111151

--- Comment #5 from Richard Biener  ---
(In reply to Richard Biener from comment #3)
> I think when overflow wraps we cannot do this transform at all, independent
> of the "sign" of 'c'.
> 
> Maybe
> 
> @@ -6970,8 +6972,11 @@ extract_muldiv_1 (tree t, tree c, enum tree_code
> code, tree wide_type,
>  
>/* MIN (a, b) / 5 -> MIN (a / 5, b / 5)  */
>sub_strict_overflow_p = false;
> -  if ((t1 = extract_muldiv (op0, c, code, wide_type,
> -   &sub_strict_overflow_p)) != 0
> +  if ((wide_type
> +  ? TYPE_OVERFLOW_UNDEFINED (wide_type)
> +  : TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))) 
> + && (t1 = extract_muldiv (op0, c, code, wide_type,
> +  &sub_strict_overflow_p)) != 0
>   && (t2 = extract_muldiv (op1, c, code, wide_type,
>&sub_strict_overflow_p)) != 0)
> {

but even when overflow is undefined we don't know whether we introduce
additional overflow then.  Consider MAX (INT_MIN, 0) * -1 where we compute
0 * -1 (fine) but after the transform we'd do MIN (INT_MIN * -1, 0)
which isn't valid.

And when overflow wraps consider MAX (UINT_MAX / 2 + 1, 1) * 2 which
will compute (UINT_MAX / 2 + 1) * 2 == 0 while
MAX ((UINT_MAX / 2 + 1) * 2, 1 * 2) will compute 2.

Unless I'm missing something.
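
A small sketch to check the wrapping case by hand, plus a way to confirm that
the signed case really overflows.  The helper names are invented for this
illustration.  The operand `UINT_MAX / 2 + 1` is chosen here because doubling
it wraps to exactly 0.  `__builtin_mul_overflow` is the GCC/Clang builtin.

```c
#include <limits.h>
#include <stdbool.h>

static unsigned
umax (unsigned a, unsigned b)
{
  return a > b ? a : b;
}

/* MAX (a, b) * c with unsigned (wrapping) arithmetic.  */
static unsigned
mul_of_max (unsigned a, unsigned b, unsigned c)
{
  return umax (a, b) * c;
}

/* MAX (a * c, b * c), the distributed form.  */
static unsigned
max_of_mul (unsigned a, unsigned b, unsigned c)
{
  return umax (a * c, b * c);
}

/* True if a * b does not fit in int.  */
static bool
signed_mul_overflows (int a, int b)
{
  int r;
  return __builtin_mul_overflow (a, b, &r);
}
```

`mul_of_max (UINT_MAX / 2 + 1, 1, 2)` is 0 while
`max_of_mul (UINT_MAX / 2 + 1, 1, 2)` is 2.  With UINT_MAX itself as the first
operand both forms yield UINT_MAX - 1, so it is the half-range value that
exposes the difference.  `signed_mul_overflows (INT_MIN, -1)` is true, which is
the overflow the transformed MIN (INT_MIN * -1, 0) would introduce.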

What we'd need to know is whether the inner operations are known to not
overflow/wrap (or whether they change sign consistently).  But without
range info we can't know this unless op0 and op1 are constants.

So - scrap that whole sub-rule?

[Bug tree-optimization/111137] [11/12/13/14 Regression] Wrong code at -O2/3

2023-08-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111137

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:845ee9c7107956845e487cb123fa581d9c70ea1b

commit r14-3480-g845ee9c7107956845e487cb123fa581d9c70ea1b
Author: Richard Biener 
Date:   Fri Aug 25 13:37:30 2023 +0200

tree-optimization/111137 - dependence checking for SLP

The following fixes a mistake with SLP dependence checking.  When
checking whether we can hoist loads to the first load place we
special-case stores of the same instance considering them sunk
to the last store place.  But we fail to consider that stores from
other SLP instances are sunk in a similar way.  This leads us to
miss the dependence between (A) and (B) in

  b[0][1] = 0; (A)
...
  _6 = b[_5 /* 0 */][0];   (B')
  _7 = _6 ^ 1;
  b[_5 /* 0 */][0] = _7;
  b[0][2] = 0; (A')
  _10 = b[_5 /* 0 */][1];  (B)
  _11 = _10 ^ 1;
  b[_5 /* 0 */][1] = _11;

where the zeroing stores are sunk to (A') and the loads hoisted
to (B').  The following fixes this, treating grouped stores from
other instances similar to stores from our own instance.  The
difference is - and this is more conservative than necessary - that
we don't know which stores of a group are in which SLP instance
(though I believe either all of the grouped stores will be in
a single SLP instance or in none at the moment), so we don't
know which stores are sunk where.  We simply assume they are
all sunk to the last store we run into.  Likewise we do not take
into account that an SLP instance might be cancelled (or a grouped
store not actually belong to any instance).

PR tree-optimization/111137
* tree-vect-data-refs.cc (vect_slp_analyze_load_dependences):
Properly handle grouped stores from other SLP instances.

* gcc.dg/torture/pr111137.c: New testcase.
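
The missed dependence between (A) and (B) can be modelled in plain C.  This is
a hand-written sketch of the two schedules for the `_5 == 0` case, not the
vectorizer's output; the function names and the 2x5 array shape are
assumptions made for the illustration.

```c
/* Statement order as written: the zeroing store (A) precedes load (B).  */
static void
scalar_order (int b[2][5])
{
  b[0][1] = 0;            /* (A)  */
  b[0][0] = b[0][0] ^ 1;  /* (B') load, xor, store  */
  b[0][2] = 0;            /* (A') */
  b[0][1] = b[0][1] ^ 1;  /* (B)  reads the zero stored at (A)  */
}

/* Model of the bad schedule: loads hoisted to (B'), zeroing stores
   sunk to (A'), xor results stored last.  */
static void
bad_order (int b[2][5])
{
  int v0 = b[0][0];  /* both loads now happen at (B')  */
  int v1 = b[0][1];  /* stale: the store from (A) has not happened yet  */
  b[0][1] = 0;       /* zeroing stores, now at (A')  */
  b[0][2] = 0;
  b[0][0] = v0 ^ 1;
  b[0][1] = v1 ^ 1;  /* overwrites the zero with stale ^ 1  */
}
```

Starting from `b[0] = { 4, 7, 9, 0, 0 }`, `scalar_order` leaves `b[0][1] == 1`
while `bad_order` leaves `b[0][1] == 6`.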

[Bug analyzer/111155] New: RFE: better diagrams for string operations

2023-08-25 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111155

Bug ID: 111155
   Summary: RFE: better diagrams for string operations
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: dmalcolm at gcc dot gnu.org
  Target Milestone: ---

See
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=99a3fcb8ff0bf27407c525415384372189e2c3cc

The generated diagrams could be improved.

Specifically:

- we should show the index of the insertion point into buf of the strcat
string.  This could be done by looking at hard boundaries, and ensuring that we
show the index on each side of a hard boundary when the index is within the
valid area (with ellipsis cells for other runs)

- we could show the existing content of the valid region, visualizing:
  - the string from the strcpy that is untouched by the strcat, 
  - the existing NUL from the strcpy that is being overwritten by the strcat,
and 
  - the uninitialized bytes that are being overwritten by the strcat

[Bug tree-optimization/111137] [11/12/13 Regression] Wrong code at -O2/3

2023-08-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111137

Richard Biener  changed:

   What|Removed |Added

  Known to work||14.0
Summary|[11/12/13/14 Regression]|[11/12/13 Regression] Wrong
   |Wrong code at -O2/3 |code at -O2/3

--- Comment #4 from Richard Biener  ---
Fixed on trunk so far.

[Bug bootstrap/102665] ELF-specific configure flags should be rejected on non-ELF platforms

2023-08-25 Thread egallager at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102665

--- Comment #6 from Eric Gallager  ---
WIP: I stubbed in a start on this in my autotools-tinkering branch a bit:
https://github.com/gcc-mirror/gcc/commit/c2caa289485edb40eddcd240a7fc655cfd7c38ba
(it's got some unrelated parts in it that I'll need to strip out again when
cherry-picking it over to my branch for this PR, though...)

[Bug target/34422] Bootstrap error with --enable-fixed-point (configure should reject --enable-fixed-point on platforms that don't support it)

2023-08-25 Thread egallager at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34422

Eric Gallager  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|egallager at gcc dot gnu.org   |unassigned at gcc dot 
gnu.org

--- Comment #13 from Eric Gallager  ---
(In reply to Eric Gallager from comment #12)
> (In reply to Eric Gallager from comment #11)
> > Patch posted to gcc-patches:
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596654.html
> 
> So this patch needs some more work; it might take awhile before I get to
> fixing it though... should I unassign? I still intend to get back to this
> later, but I just want to make sure that people are clear that they can give
> it a try themselves in the meantime...

...ok yeah I think I'm going to unassign; if anyone thinks they can address the
feedback from here, feel free to take it:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597046.html

[Bug target/110559] Bad mask_load/mask_store codegen of RVV

2023-08-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110559

--- Comment #3 from Robin Dapp  ---
I got back to this again today, now that pressure-aware scheduling is the
default.  As mentioned before, it helps but doesn't get rid of the spills.

Testing with the "generic ooo" scheduling model it looks like vector load/store
latency of 6 is too high.  Yet, even setting them to 1 is not enough to get rid
of spills entirely.  What helps is additionally lowering the vector alu latency
to 2 (from the default 3).

I'm not really sure how to properly handle this.  As far as I can tell spilling
is always going to happen if we try to "wait" for dependencies and delay the
dependent instructions.  In my experience the hardware does a better job at
live scheduling anyway and we only make things worse in several cases. 
Previously I experimented with setting the latency of most instructions to 1
with few exceptions and instead ensure a proper instruction mix i.e. trying to
keep every execution unit busy.  That's not a panacea either, though.

[Bug c/108310] Some warnings that -Wtraditional-conversion causes to be emitted aren't actually controlled by it

2023-08-25 Thread egallager at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108310

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-08-25
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |egallager at gcc dot 
gnu.org

--- Comment #6 from Eric Gallager  ---
I think I might be able to figure this out myself

[Bug middle-end/111156] New: [14 Regression] aarch64 aarch64/sve/mask_struct_store_4.c failures

2023-08-25 Thread adhemerval.zanella at linaro dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111156

Bug ID: 111156
   Summary: [14 Regression] aarch64
aarch64/sve/mask_struct_store_4.c failures
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: adhemerval.zanella at linaro dot org
  Target Milestone: ---

After
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=a1558e9ad856938f165f838733955b331ebbec09,
I have noticed regressions on aarch64:

Running gcc:gcc.target/aarch64/sve/aarch64-sve.exp ...
FAIL: gcc.target/aarch64/sve/mask_struct_store_4.c (internal compiler error: in
get_group_load_store_type, at tree-vect-stmts.cc:2121)
FAIL: gcc.target/aarch64/sve/mask_struct_store_4.c (test for excess errors)
UNRESOLVED: gcc.target/aarch64/sve/mask_struct_store_4.c scan-assembler-not
\\tst2b\\t.z[0-9]
UNRESOLVED: gcc.target/aarch64/sve/mask_struct_store_4.c scan-assembler-not
\\tst2d\\t.z[0-9]
UNRESOLVED: gcc.target/aarch64/sve/mask_struct_store_4.c scan-assembler-not
\\tst2h\\t.z[0-9]
UNRESOLVED: gcc.target/aarch64/sve/mask_struct_store_4.c scan-assembler-not
\\tst2w\\t.z[0-9]

(As indicated by Linaro CI
https://ci.linaro.org/job/tcwg_gnu_native_check_gcc--master-aarch64-build/570/artifact/artifacts/notify/mail-body.txt/*view*/)

[Bug ipa/111157] New: 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-25 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157

Bug ID: 111157
   Summary: 416.gamess fails with a run-time abort when compiled
with -O2 -flto after r14-3226-gd073e2d75d9ed4
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jamborm at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
Blocks: 26163
  Target Milestone: ---
  Host: x86_64-linux
Target: x86_64-linux

Benchmark 416.gamess from SPEC 2006 fails with a run-time error (STOP IN ABRT),
starting with my very own r14-3226-gd073e2d75d9ed4 (Feed results of IPA-CP into
tree value numbering).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug c++/110341] [C++26] P1854R4 - Making non-encodable string literals ill-formed

2023-08-25 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110341

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Created attachment 55795
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55795&action=edit
gcc14-pr110341.patch

Untested implementation.

[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
Summary|416.gamess fails with a |[14 Regression] 416.gamess
   |run-time abort when |fails with a run-time abort
   |compiled with -O2 -flto |when compiled with -O2
   |after   |-flto after
   |r14-3226-gd073e2d75d9ed4|r14-3226-gd073e2d75d9ed4
   Keywords||wrong-code

[Bug fortran/35095] DATA with implied-do: Improve bounds checking

2023-08-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35095

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:4024ddbe50c2d1cb54c75304c72817d3fc63cdb6

commit r14-3484-g4024ddbe50c2d1cb54c75304c72817d3fc63cdb6
Author: Harald Anlauf 
Date:   Thu Aug 24 23:16:25 2023 +0200

Fortran: improve bounds checking for DATA with implied-do [PR35095]

gcc/fortran/ChangeLog:

PR fortran/35095
* data.cc (get_array_index): Add bounds-checking code and return
error
status.  Overindexing will be allowed as an extension for
-std=legacy
and generate an error in standard-conforming mode.
(gfc_assign_data_value): Use error status from get_array_index for
graceful error recovery.

gcc/testsuite/ChangeLog:

PR fortran/35095
* gfortran.dg/data_bounds_1.f90: Adjust options to disable
warnings.
* gfortran.dg/data_bounds_2.f90: New test.

[Bug fortran/35095] DATA with implied-do: Improve bounds checking

2023-08-25 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35095

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from anlauf at gcc dot gnu.org ---
Fixed after 15 years...

[Bug fortran/33056] [Meta-bug] Data - statement related bugs

2023-08-25 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33056
Bug 33056 depends on bug 35095, which changed state.

Bug 35095 Summary: DATA with implied-do: Improve bounds checking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35095

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/106677] Abstraction overhead with std::views::join

2023-08-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106677

--- Comment #5 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:6df8dcec7196e42ca2eed69e1ae455bae8d0fe93

commit r14-3485-g6df8dcec7196e42ca2eed69e1ae455bae8d0fe93
Author: Andrew Pinski 
Date:   Sat Aug 19 17:56:46 2023 -0700

MATCH: `a | C -> C` when we know that `a & ~C == 0`

Even though this is handled by other code inside both VRP and CCP,
sometimes we want to optimize this outside of VRP and CCP.
An example is given in PR 106677 where phiopt will happen
after VRP (which removes a cast for a comparison) and then
phiopt will optimize the phi to be `a | 1` which can then
be optimized to `1` due to this patch.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Note: similar code already exists in simplify_rtx for the RTL level;
it was moved from combine to simplify_rtx in r0-72539-gbd1ef757767f6d.

gcc/ChangeLog:

* match.pd (`a | C -> C`): New pattern.
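
The identity behind the pattern can be checked exhaustively for small types.
This is a minimal sketch over 8-bit values; the function name is invented for
the illustration and it is not the match.pd implementation.

```c
#include <stdbool.h>

/* Check, for every 8-bit a, that a | c == c whenever a & ~c == 0,
   i.e. whenever a has no bits set outside of c.  */
static bool
or_folds_to_constant (unsigned char c)
{
  for (unsigned a = 0; a <= 0xffu; a++)
    {
      bool subset = (a & ~(unsigned) c & 0xffu) == 0;
      if (subset && (a | c) != c)
        return false;
    }
  return true;
}
```

The precondition matters: with `a == 2` and `C == 1` we have `a & ~C != 0`
and `a | C` is 3, not 1.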

[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-25 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157

--- Comment #1 from Martin Jambor  ---
interestingly, the issue goes away with -flto-partition=one

It is triggered by propagating 0 as the last parameter of point.constprop.isra
which however looks correct, all four calls to the function (in different
partitions) look like this:

  istat = 0;
  _272 = MEM[(double &)&elprop];
  point.constprop.isra (_272, &ipoint, &xyzprp.xp, &xyzprp.yp, &xyzprp.zp,
&istat);
  _46 = istat;

[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-25 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157

Martin Jambor  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |jamborm at gcc dot 
gnu.org
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-08-25

--- Comment #2 from Martin Jambor  ---
With the propagation, PRE performs the following:

 void point.constprop.isra (double ISRA.1740, int & restrict ipoint, double &
restrict x, double & restrict y, double & restrict z, int & restrict istat)
 {
   int iy;
   int ix;
   double & restrict prploc;
   int _5;
-  int _9;
   int _11;
   int _12;
   long int _13;
@@ -13284,7 +15377,6 @@
[local count: 1073741824]:
   # DEBUG D#556 s=> prploc
   # DEBUG prploc => D#556
-  *istat_1(D) = 0;
   if (ISRA.1740_70(D) !=
6.08805302369215843745180305174265706886482420102590225625e-154)
 goto ; [50.00%]
   else
@@ -13301,8 +15393,7 @@
   calcom (x_6(D), y_7(D), z_8(D));

[local count: 536870912]:
-  _9 = *ipoint_4(D);
-  if (_9 > 1)
+  if (_5 > 1)
 goto ; [59.00%]
   else
 goto ; [41.00%]


The write of zero to where we know there is already zero is however
problematic because all callers expect the pointed to value to be
overwritten and dse2 pass does:

@@ -20229,7 +14076,6 @@
   xpt = 0.0;
   ypt = 0.0;
   zpt = 0.0;
-  istat = 0;
   point.constprop.isra (_73, &ipoint, &xpt, &ypt, &zpt, &istat);
   _77 = istat;
   if (_77 < 0)


So I guess this is a nasty interaction with IPA-modref, and indeed
-fno-ipa-modref avoids the issue.

I'll discuss with Honza whether IPA-modref can be modified to kill the
known values in summaries in such cases or whether it is IPA-CP that
should avoid "knowing" the value.

[Bug middle-end/111156] [14 Regression] aarch64 aarch64/sve/mask_struct_store_4.c failures

2023-08-25 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111156

David Binderman  changed:

   What|Removed |Added

 CC||dcb314 at hotmail dot com

--- Comment #1 from David Binderman  ---
I see this also, on x86_64, with -O2 -march=znver1.

I will reduce the code.

[Bug testsuite/96109] [11 Regression] gcc.dg/vect/slp-47.c etc. FAIL

2023-08-25 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96109

David Binderman  changed:

   What|Removed |Added

 CC||dcb314 at hotmail dot com

--- Comment #13 from David Binderman  ---
The bug first seems to appear sometime between g:93f803d53b5ccaab
and g:68f7cb6cf9e8b9f2, some 39 commits.

[Bug testsuite/96109] [11 Regression] gcc.dg/vect/slp-47.c etc. FAIL

2023-08-25 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96109

--- Comment #14 from David Binderman  ---
Sorry wrong bug report.

[Bug middle-end/111156] [14 Regression] aarch64 aarch64/sve/mask_struct_store_4.c failures

2023-08-25 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111156

--- Comment #2 from David Binderman  ---
The bug first seems to appear sometime between g:93f803d53b5ccaab
and g:68f7cb6cf9e8b9f2, some 39 commits.

[Bug middle-end/111156] [14 Regression] aarch64 aarch64/sve/mask_struct_store_4.c failures

2023-08-25 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111156

--- Comment #3 from David Binderman  ---
Reduced C code seems to be:

struct median_estimator {
  long median;
  long step
} median_diff_ts[];
median_estimator_update_data, median_estimator_update_diff,
    median_estimator_update_median, mm_profile_print_i;
median_estimator_update(struct median_estimator *me) {
  if (__builtin_expect(me->step, 0))
    me->median = median_estimator_update_data;
  if (median_estimator_update_diff)
    me->step = median_estimator_update_median;
}
mm_profile_print() {
  mm_profile_print_i = 1;
  for (; mm_profile_print_i; mm_profile_print_i++)
    median_estimator_update(&median_diff_ts[mm_profile_print_i]);
}

[Bug target/111096] Frame pointer is not used even when -fomit-frame-pointer is specified

2023-08-25 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111096

--- Comment #9 from Thomas Koenig  ---
(In reply to Richard Earnshaw from comment #8)
> (In reply to Thomas Koenig from comment #7)
> > Would it make sense to document this somewhere?  Or did I just miss it? :-)
> 
> Possibly, but I've no idea where.  It's too target-specific to put under the
> generic documentation for -fomit-frame-pointer and I don't think there's a
> section in the manual that really documents the target-specific behaviours
> of generic options.

Hm, maybe a chapter "Architecture-specific implementation choices"
to document those cases where the ABI gives some leeway could be a
place to put it.  It could have sections on architecture.

[Bug libstdc++/70472] is_copy_constructible>>::value is true

2023-08-25 Thread safinaskar at mail dot ru via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70472

Askar Safin  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|NEW |RESOLVED

--- Comment #15 from Askar Safin  ---
I agree. Closing as INVALID.

[Bug tree-optimization/106677] Abstraction overhead with std::views::join

2023-08-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106677

--- Comment #6 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:d9a0d692ffc6951c5670f54c3f4f17ec64a58600

commit r14-3486-gd9a0d692ffc6951c5670f54c3f4f17ec64a58600
Author: Andrew Pinski 
Date:   Sat Aug 19 15:30:45 2023 -0700

MATCH: Move `a ? one_zero : one_zero` matching after min/max matching

In PR 106677, I noticed that on the trunk we were producing:
```
  _25 = SR.116_117 == 0;
  _27 = (unsigned char) _25;
  _32 = _27 | SR.116_117;
```
From `SR.115_117 != 0 ? SR.115_117 : 1`
Rather than:
```
  _119 = MAX_EXPR <1, SR.115_117>;
```
Or (rather)
```
  _119 = SR.115_117 | 1;
```
Due to the order of the patterns.

Committed as approved with the new comment and testcase.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* match.pd (`a ? one_zero : one_zero`): Move
below detection of minmax.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/phi-opt-34.c: New test.
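
The three forms mentioned in the commit message can be compared directly.
This is a sketch with invented function names, using `unsigned char` for the
operand.  The `| 1` form is only equivalent when the operand is known to be 0
or 1, which is assumed here to be the case for SR.115_117.

```c
/* x != 0 ? x : 1, the source-level form.  */
static unsigned char
select_form (unsigned char x)
{
  return x != 0 ? x : 1;
}

/* MAX_EXPR <1, x>.  */
static unsigned char
max_form (unsigned char x)
{
  return x > 1 ? x : 1;
}

/* x | 1.  */
static unsigned char
or_form (unsigned char x)
{
  return x | 1;
}
```

`select_form` and `max_form` agree for every `unsigned char` value.  `or_form`
agrees with them for 0 and 1 only; for example `or_form (2)` is 3 while
`select_form (2)` is 2.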

[Bug tree-optimization/111013] [14 Regression] Dead Code Elimination Regression at -O2 since r14-338-g1dd154f6407

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111013

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=110918
 Resolution|--- |FIXED

--- Comment #3 from Andrew Pinski  ---
Fixed by r14-3414-g0cfc9c953d0221ec3971a25.

[Bug c++/111158] New: diagnostics, colors, and std::same_as

2023-08-25 Thread barry.revzin at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111158

Bug ID: 111158
   Summary: diagnostics, colors, and std::same_as
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: barry.revzin at gmail dot com
  Target Milestone: ---

Consider this broken example:

#include <concepts>

template 
struct Optional { };

auto f() -> Optional;

auto g()
#ifdef CONCEPTS
-> std::same_as> auto
#else
-> Optional
#endif
{
return f();
}

The code is wrong, g() is specifying it returns Optional but actually
returns Optional and there's no conversion.

Without concepts (just using a normal trailing return type), we get:

: In function 'Optional g()':
:15:13: error: could not convert 'f()' from 'Optional' to
'Optional'
   15 | return f();
  |~^~
  | |
  | Optional
Compiler returned: 1

Pasting it here like this actually undersells how good the error is, because
it's actually in color, and the "int&" and "int" parts are a different color
from the Optional<...> part, so it really visually stands out. Very nice, gcc
devs!

With concepts (the -> std::same_as> auto approach), we get:

: In function 'auto [requires std::same_as<, Optional
>] g()':
:15:13: error: deduced return type does not satisfy placeholder
constraints
   15 | return f();
  |~^~
:15:13: note: constraints not satisfied
In file included from :1:
/opt/compiler-explorer/gcc-trunk-20230824/include/c++/14.0.0/concepts:57:15:  
required for the satisfaction of '__same_as<_Tp, _Up>' [with _Tp =
Optional; _Up = Optional]
/opt/compiler-explorer/gcc-trunk-20230824/include/c++/14.0.0/concepts:62:13:  
required for the satisfaction of 'same_as, Optional >], Optional >' [with auto
[requires std::same_as<, Optional >] = Optional]
/opt/compiler-explorer/gcc-trunk-20230824/include/c++/14.0.0/concepts:57:32:
note: the expression 'is_same_v<_Tp, _Up> [with _Tp = Optional; _Up =
Optional]' evaluated to 'false'
   57 |   concept __same_as = std::is_same_v<_Tp, _Up>;
  |   ~^~~

This isn't colored as nicely, and in general the diagnostic is I think quite a
bit worse - it contains a lot of information that simply isn't relevant to the
user. Like the fact that std::same_as is specified in this weird way in terms
of this other __same_as thing. That's an implementation detail that pretty much
never matters.

It'd be nice if gcc had dedicated diagnostics handling for std::same_as
and just wrote that T and U are different types and applied the same kind of
color highlighting that it does for the non-concepts case - which helped me a
lot to visually call out that the distinction was int vs int&. 

Link to compiler explorer, where you can see the colors with gcc 13.2:
https://godbolt.org/z/YaKjzvM98

[Bug c++/111158] diagnostics, colors, and std::same_as

2023-08-25 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111158

Marek Polacek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-08-25
 CC||mpolacek at gcc dot gnu.org
   Keywords||diagnostic
 Status|UNCONFIRMED |NEW

[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-25 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157

--- Comment #3 from Martin Jambor  ---
Simple C testcase:

-- pr111157_0.c --
/* { dg-lto-do run } */
/* { dg-lto-options { { -O2 -flto=auto } } } */
/* { dg-extra-ld-options { -flto-partition=1to1 } } */

extern __attribute__((noinline))
void foo (int *p);


void __attribute__((noinline))
bar (void)
{
  int istat;

  istat = 1234;
  foo (&istat);
  if (istat != 1234)
    __builtin_abort ();
}

int main (int argc, char **argv)
{
  bar ();
  return 0;
}
-- pr111157_1.c --
volatile int v = 0;

void __attribute__((noinline))
foo (int *p)
{
  *p = 1234;
  if (v)
    *p = 0;
  return;
}
--

[Bug target/111139] RISC-V: improve scalar constants cost model

2023-08-25 Thread vineetg at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111139

--- Comment #2 from Vineet Gupta  ---
Test case to help drive some of this:

unsigned long long f5(unsigned long long i)
{
  return i * 0x0202020202020202ULL;
}

[Bug preprocessor/90400] _Pragma not always expanded in the right location within macros

2023-08-25 Thread lhyatt at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90400

Lewis Hyatt  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-08-25
 CC||lhyatt at gcc dot gnu.org
 Ever confirmed|0   |1
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=103165,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=102409

--- Comment #8 from Lewis Hyatt  ---
(In reply to Tobias Burnus from comment #7)
> Regarding "-E": Actually, looking at PR103165 comment 2, I note that for the
> example there 'clang -E' outputs:
> 
> "hello; \"\" _Pragma(\"GCC diagnostic pop\") world;"
> 
> that is: While the macros are replaced, the (unknown) _Pragma remains as
> _Pragma - such that it is then later only processed when running the
> compiler.
> 
> No idea whether that makes sense or not - just as an observation.

This issue was indeed fixed by r12-5454, the fix for PR103165.

I will get a testcase added and then close this one. The testcase will be a
tweaked version of the original one from this PR. It needs to use a different
_Pragma, because nowadays, '#pragma GCC diagnostic' is recognized by the
preprocessor. The existing c-c++-common/gomp/pragma-2.c provides coverage for
that case. '#pragma GCC unroll' is a useful new testcase, being another pragma
that is explicitly ignored during preprocess-only modes.

[Bug c++/91483] Poor diagnostic on trying to take constexpr reference to non-static object

2023-08-25 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91483

--- Comment #3 from Marek Polacek  ---
The error comes from verify_constant, which doesn't explain anything. 
verify_constant uses reduced_constant_expression_p which just says yes/no but
doesn't explain anything.  reduced_constant_expression_p uses the middle-end
initializer_constant_valid_p but that's not going to say anything, either.

Either we need a version of reduced_constant_expression_p that actually says
what's wrong, or add a function that, when given an expression that isn't
reduced_constant_expression_p, will look for known problematical cases, like
the one above.

[Bug c++/102409] _pragma ("omp ...") expansion issue - placed in the wrong scope

2023-08-25 Thread lhyatt at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102409

Lewis Hyatt  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
 CC||lhyatt at gcc dot gnu.org

--- Comment #4 from Lewis Hyatt  ---
Fixed by r12-4797.

[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer

2023-08-25 Thread juergen.reuter at desy dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311

--- Comment #52 from Jürgen Reuter  ---
(In reply to Jakub Jelinek from comment #51)
> The easiest would be to bisect gcc in the suspected ranges, that way you'd
> know for sure which git commit introduced the problem and which
> fixed/"fixed" it.
> If it is about what the compiler emits, one doesn't have to build whole gcc
> from scratch each time, but can just --disable-bootstrap build it and during
> bisection
> whenever git is updated just ./config.status --recheck; ./config.status;
> make -jN in libcpp, libiberty and gcc subdirectories and use f951/gfortran
> binaries from that instead of the ones from the initial build to build your
> project.

This was the offending commit by Roger Sayle, on Saturday June 17:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=96c3539f2a38134cb76d8ab2e924e0dc70b2ccbd

=
i386: Two minor tweaks to ix86_expand_move.

This patch splits out two (independent) minor changes to i386-expand.cc's
ix86_expand_move from a larger patch, given that it's better to review
and commit these independent pieces separately from a more complex patch.

The first change is to test for CONST_WIDE_INT_P before calling
ix86_convert_const_wide_int_to_broadcast.  Whilst stepping through
this function in gdb, I was surprised that the code was continually
jumping into this function with operands that obviously weren't
appropriate.

The second change is to generalize the optimization for efficiently
moving a TImode value to V1TImode (via V2DImode), to cover all 128-bit
vector modes.

Hence for the test case:

typedef unsigned long uv2di __attribute__ ((__vector_size__ (16)));
uv2di foo2(__int128 x) { return (uv2di)x; }

we'd previously move via memory with:

foo2:   movq    %rdi, -24(%rsp)
        movq    %rsi, -16(%rsp)
        movdqa  -24(%rsp), %xmm0
        ret

with this patch we now generate with -O2 (the same as V1TImode):

foo2:   movq    %rdi, %xmm0
        movq    %rsi, %xmm1
        punpcklqdq  %xmm1, %xmm0
        ret

and with -O2 -msse4 the even better:

foo2:   movq    %rdi, %xmm0
        pinsrq  $1, %rsi, %xmm0
        ret

The new test case is unimaginatively called sse2-v1ti-mov-2.c given
the original test case just for V1TI mode was called sse2-v1ti-mov-1.c.

2023-06-17  Roger Sayle  

gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_expand_move): Check that OP1 is
CONST_WIDE_INT_P before calling ix86_convert_wide_int_to_broadcast.
Generalize special case for converting TImode to V1TImode to handle
all 128-bit vector conversions.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-v1ti-mov-2.c: New test case.
===

Now the question is: was this commit later reverted, or changed in some other
manner?

[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer

2023-08-25 Thread juergen.reuter at desy dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311

--- Comment #53 from Jürgen Reuter  ---
Additional comment: the commit which fixed/"fixed" this offending commit came
between July 3 and July 10.

[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer

2023-08-25 Thread juergen.reuter at desy dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311

--- Comment #54 from Jürgen Reuter  ---
(In reply to Jürgen Reuter from comment #53)
> Additional comment: the commit which fixed/"fixed" this offending commit
> came between July 3 and July 10.

Wildly speculating, it would be this commit maybe,
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=bdf2737cda53a83332db1a1a021653447b05a7e7
???

[Bug c++/111159] New: [13 Regression] False positive -Wdangling-reference

2023-08-25 Thread daniel at constexpr dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59

Bug ID: 59
   Summary: [13 Regression] False positive -Wdangling-reference
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: daniel at constexpr dot org
  Target Milestone: ---

GCC 13.2.0 as well as git from today report a false positive
-Wdangling-reference warning for the following C++ code:

struct A {
int * i;
int & b() { return *i; }
};

int g = 42;

A a() {
return A{ &g };
}

int main() {
const int & i = a().b();
return i;
}

[Bug c++/111159] [13 Regression] False positive -Wdangling-reference

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109642

--- Comment #1 from Andrew Pinski  ---
I suspect this is just a dup of bug 109642.

[Bug tree-optimization/110768] [14 Regression] Dead Code Elimination Regression since r14-2623-gc11a3aedec2

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110768

--- Comment #2 from Andrew Pinski  ---
Looks to be fixed now.

[Bug tree-optimization/110891] [14 Regression] Dead Code Elimination Regression since r14-2674-gd0de3bf9175

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110891

--- Comment #3 from Andrew Pinski  ---
One thing I noticed (I don't know if it causes the missed optimization) is that
we have before PRE:
```
   [local count: 1073531371]:
  if (a.0_1 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536765686]:
  if (_28 == &d)
goto ; [30.00%]
  else
goto ; [70.00%]

   [local count: 536765685]:
  if (_28 == &d)
goto ; [30.00%]
  else
goto ; [70.00%]
```
Which obviously should just be `if (_28 == &d) goto bb9; else goto bb7;` and not
check `a.0_1` at all.

I tried a reduced testcase but PRE optimizes it:
```
int g();
int h();

int j, l;

int f(int a, int *b)
{
  if (a == 0)
    {
      if (b == &j) goto L9; else goto L7;
    }
  else
    {
      if (b == &j) goto L9; else goto L7;
    }
L7: return g();
L9: return h();
}
```
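
To spell out the claim, here is a small sketch that compares the reduced function against a version that tests `b` only. The stub bodies of `g` and `h`, their return values, and the names `f_original`, `f_merged` and `same_result` are assumptions made for this illustration.

```cpp
#include <cassert>

int j, l;

int g() { return 7; }  // stub: stands in for the L7 path
int h() { return 9; }  // stub: stands in for the L9 path

// Same shape as the reduced testcase: both arms test the same condition.
int f_original(int a, int *b)
{
  if (a == 0)
    {
      if (b == &j) goto L9; else goto L7;
    }
  else
    {
      if (b == &j) goto L9; else goto L7;
    }
L7: return g();
L9: return h();
}

// What the merged form would look like: 'a' is never inspected.
int f_merged(int /*a*/, int *b)
{
  return b == &j ? h() : g();
}

// Returns true when both forms agree for the given inputs.
bool same_result(int a, int *b)
{
  return f_original(a, b) == f_merged(a, b);
}
```

Calling `same_result` with `a` in {0, 1} and `b` in {`&j`, `&l`} should return true in all four cases, since the outcome depends on `b` alone.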

[Bug tree-optimization/110891] [14 Regression] Dead Code Elimination Regression since r14-2674-gd0de3bf9175

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110891

--- Comment #4 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #3)
> One thing I noticed (I don't know if it causes the missed optimization) is that
> we have before PRE:
> ```
>[local count: 1073531371]:
>   if (a.0_1 != 0)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
> 
>[local count: 536765686]:
>   if (_28 == &d)
> goto ; [30.00%]
>   else
> goto ; [70.00%]
> 
>[local count: 536765685]:
>   if (_28 == &d)
> goto ; [30.00%]
>   else
> goto ; [70.00%]
> ```
> Which obviously should just be `if (_28 == &d) goto bb9; else goto bb7;` and
> not check `a.0_1` at all.

I wonder if ifcombine could optimize that instead of requiring PRE. I think
that might even fix the issue too.

[Bug tree-optimization/110503] [13/14 Regression] Dead Code Elimination Regression at -O3 since r13-322-g7f04b0d786e

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110503

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
   Last reconfirmed|2023-06-30 00:00:00 |2023-8-25

--- Comment #6 from Andrew Pinski  ---
This will be fixed with my test_for_singularity patch.

[Bug tree-optimization/110992] [14 Regression] Dead Code Elimination Regression at -O3 since r14-1654-g7ceed7e3e29

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110992

--- Comment #2 from Andrew Pinski  ---
So I think this is because ethread can understand `(a & b)!=0` but not
`(a*c)!=0`; that is, a*c != 0 means that a and c will both be non-zero (especially
since a*c is known not to overflow, as c has a range of [0,1]).
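
The implication can be checked with a short sketch. It assumes `c` is modelled as a `bool`, so its range is [0,1] and `a * c` cannot overflow; the function names are illustrative only.

```cpp
#include <cassert>
#include <climits>

// The form that is reportedly not understood: a product compared with zero.
bool product_nonzero(int a, bool c) { return a * c != 0; }

// The equivalent form: both factors are non-zero.
bool both_nonzero(int a, bool c) { return a != 0 && c; }

// True when the two forms agree for this 'a' and both values of 'c'.
bool forms_agree(int a)
{
  return product_nonzero(a, false) == both_nonzero(a, false)
         && product_nonzero(a, true) == both_nonzero(a, true);
}

// Checks a few representative values, including the extremes of int.
bool agree_on_samples()
{
  const int samples[] = { INT_MIN, -1, 0, 1, INT_MAX };
  for (int a : samples)
    if (!forms_agree(a))
      return false;
  return true;
}
```

Because `c` is 0 or 1, the product is either 0 or `a` itself, which is why no overflow case needs to be considered.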

[Bug rtl-optimization/110939] [14 Regression] 14.0 ICE at rtl.h:2297 while bootstrapping on loongarch64

2023-08-25 Thread xry111 at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110939

Xi Ruoyao  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug rtl-optimization/110939] [14 Regression] 14.0 ICE at rtl.h:2297 while bootstrapping on loongarch64

2023-08-25 Thread xry111 at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110939

Xi Ruoyao  changed:

   What|Removed |Added

   Last reconfirmed|2023-08-08 00:00:00 |2023-08-26
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51

Andrew Pinski  changed:

   What|Removed |Added

  Known to work||2.95.3
   Keywords|needs-bisection |

--- Comment #6 from Andrew Pinski  ---
extract_muldiv seems to be the cause of so many issues.
It was added in r0-25100-g1baa375feaa2, which has been included since GCC 3.0.

The bad MIN/MAX code was added with that same revision, so I am going to remove
the needs-bisection keyword since I think it is obvious that revision broke it.

[Bug rtl-optimization/111143] [missed optimization] unlikely code slows down diffutils x86-64 ASCII processing

2023-08-25 Thread eggert at cs dot ucla.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43

--- Comment #5 from Paul Eggert  ---
(In reply to Alexander Monakov from comment #4)

> To evaluate scheduling aspect, keep 'mov eax, 1' while changing 'add rbx,
> rax' to 'add rbx, 1'.

Adding the (unnecessary) 'mov eax, 1' doesn't affect the timing much, which is
what I would expect on a newer processor.

When I reran the benchmark on the same laptop (Intel i5-1335U), I got 3.289s
for GCC-generated code, 2.256s for the "38% faster" code (now it's 46% faster;
don't know why) and 2.260s for the faster code with the unnecessary 'mov eax,
1' inserted.
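
For the record, the percentages follow from the ratio of the run times, assuming "N% faster" means the old time divided by the new time, minus one. A small sketch (the helper names are made up here):

```cpp
#include <cassert>
#include <cmath>

// Percentage by which 'fast_seconds' beats 'slow_seconds', computed as
// (slow / fast - 1) * 100.
double percent_faster(double slow_seconds, double fast_seconds)
{
  return (slow_seconds / fast_seconds - 1.0) * 100.0;
}

// Same value rounded to the nearest whole percent.
long rounded_percent_faster(double slow_seconds, double fast_seconds)
{
  return std::lround(percent_faster(slow_seconds, fast_seconds));
}
```

With the timings above, 3.289s against 2.256s gives about 45.8%, and 3.289s against 2.260s gives about 45.5%; both round to 46%.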

[Bug c++/111160] New: ICE on assigning volatile through ternary operator

2023-08-25 Thread trinxery at firemail dot cc via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60

Bug ID: 60
   Summary: ICE on assigning volatile through ternary operator
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: trinxery at firemail dot cc
  Target Milestone: ---

Created attachment 55796
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55796&action=edit
-freport-bug

class TheClass {};
void the_func() {
  TheClass x;
  volatile TheClass y;
  (false ? x : x) = y;
}

buggy.cxx: In function ‘void the_func()’:
buggy.cxx:6:21: internal compiler error: in stabilize_expr, at cp/tree.cc:5969
6 |   (false ? x : x) = y;
  | ^
0x9340a9 stabilize_expr(tree_node*, tree_node**)
../../src/gcc/cp/tree.cc:5969
...

I think this is descriptive enough.

[Bug c++/111160] [11/12/13/14 Regression] ICE on assigning volatile through ternary operator

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.5
   Last reconfirmed||2023-08-26
Summary|ICE on assigning volatile   |[11/12/13/14 Regression]
   |through ternary operator|ICE on assigning volatile
   ||through ternary operator
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
  Known to work||5.1.0
  Known to fail||6.1.0
   Keywords||ice-on-invalid-code

--- Comment #1 from Andrew Pinski  ---
GCC 5.1.0 and before for C++11:
: In function 'void the_func()':
:6:19: error: ambiguous overload for 'operator=' (operand types are
'TheClass' and 'volatile TheClass')
   (false ? x : x) = y;
   ^
:2:8: note: candidate: TheClass& TheClass::operator=(const TheClass&)

 struct TheClass {};
^
:2:8: note:   conversion of argument 1 would be ill-formed:
:6:19: error: binding 'volatile TheClass' to reference of type 'const
TheClass&' discards qualifiers
   (false ? x : x) = y;
   ^
:2:8: note: candidate: TheClass& TheClass::operator=(TheClass&&) 
 struct TheClass {};
^
:2:8: note:   conversion of argument 1 would be ill-formed:
:6:19: error: cannot bind 'volatile TheClass' lvalue to 'TheClass&&'
   (false ? x : x) = y;
   ^

For C++98:
: In function 'void the_func()':
:6:19: error: binding 'volatile TheClass' to reference of type 'const
TheClass&' discards qualifiers
   (false ? x : x) = y;
   ^
:2:8: note:   initializing argument 1 of 'TheClass&
TheClass::operator=(const TheClass&)'
 struct TheClass {};
^

Confirmed. This is invalid code.
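
As a sketch of why the assignment is ill-formed: the implicitly declared assignment operators take `const TheClass&` and `TheClass&&`, and neither can bind a volatile lvalue. The `VolatileAware` class below is a hypothetical counterexample added for contrast, not something from the report.

```cpp
#include <cassert>
#include <type_traits>

class TheClass {};

// Hypothetical class whose assignment operator does accept volatile sources.
struct VolatileAware
{
  VolatileAware &operator=(const volatile VolatileAware &) { return *this; }
};

// The implicit operator= overloads of TheClass cannot bind 'volatile TheClass'.
constexpr bool plain_accepts_volatile
  = std::is_assignable<TheClass &, volatile TheClass &>::value;

// A user-declared operator= taking 'const volatile &' makes it well-formed.
constexpr bool aware_accepts_volatile
  = std::is_assignable<VolatileAware &, volatile VolatileAware &>::value;

static_assert(!plain_accepts_volatile, "implicit operator= rejects volatile");
static_assert(aware_accepts_volatile, "explicit operator= accepts volatile");
```

The trait is used here instead of the conditional-expression form from the report so that the sketch does not depend on the code path that triggers the ICE.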

[Bug c++/70744] preincrements possibly double-evaluated in GNU ternaries

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70744

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |7.0

[Bug target/70290] -mavx512vl breaks parsing of C++ vector condition

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70290

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |6.0

[Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too high.

2023-08-25 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

Jerry DeLisle  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #7 from Jerry DeLisle  ---
I don't have ifort installed at the moment. Would someone post the output from
ifort with teste0es0en0.f90 attached to this PR? Much appreciated.

[Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too high.

2023-08-25 Thread john.harper at vuw dot ac.nz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

--- Comment #8 from john.harper at vuw dot ac.nz ---
I couldn't see the program teste0es0en0.f90 that is in your bugzilla but 
I could see that it does have 691 bytes. So does one of the two versions 
that I now have in my own computer. The attachment to this email contains 
that version and what ifort did with it. (Of course E0.0E0 is illegal 
Fortran but ES0.0E0 and EN0.0E0 are OK according to both the f2018 and 
f2023 standards.)

-- John Harper, School of Mathematics and Statistics
Victoria Univ. of Wellington, PO Box 600, Wellington 6140, New Zealand.
e-mail john.har...@vuw.ac.nz

[Bug middle-end/38264] tree_forwarder_block_p says no to first basic block

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38264

--- Comment #6 from Andrew Pinski  ---
I actually just ran into a problem caused by having this check here.

I was modifying tree-ssa-ifcombine to optimize the case where both successor
blocks of a bb test the same condition, like:
```
int g();
int h();

int j, l;

int f(int a, int *b)
{
  if (a == 0)
    {
      if (b == &j) goto L9; else goto L7;
    }
  else
    {
      if (b == &j) goto L9; else goto L7;
    }
L7: return g();
L9: return h();
}
```

I go and try to remove one of the bbs (which have the same condition), and that
should have updated dominators in a reasonable way because the removal should
have made a forwarder from bb 2 to bb 3. But instead of forwarding the bb, we
still end up with bb 3, and the dominators need to be updated in non-trivial
ways.

[Bug tree-optimization/110891] [14 Regression] Dead Code Elimination Regression since r14-2674-gd0de3bf9175

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110891

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #5 from Andrew Pinski  ---
I have a patch for what I laid out in comment #3 and it solves the original
issue too because after ifcombine we have:
  _24 = _28 == &c;
  _18 = _28 == &d;
  _8 = _18 | _24;
  _25 = _28 == &a;
  _20 = _8 | _25;
  if (_20 != 0)
goto ; [99.98%]
  else
goto ; [0.02%]

   [local count: 210453]:
  __assert_fail ("", "", 4, &__PRETTY_FUNCTION__);

   [local count: 536765685]:
  _6 = _28 == &c;
  _13 = _28 == &d;
  _9 = _6 | _13;
  _27 = _28 == &a;
  _17 = _9 | _27;
  if (_17 != 0)
goto ; [99.98%]
  else
goto ; [0.02%]

Which then obviously gets optimized by DOM2.
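
A rough source-level picture of the combined condition, as a sketch: the chain of tests becomes one non-short-circuit OR of the individual comparisons, and since both blocks then compute the same expression, a pass like DOM can reuse the first result. Declaring `a`, `c` and `d` as `int`, the extra variable `other`, and the function names are assumptions for illustration; the original testcase is not shown in this comment.

```cpp
#include <cassert>

int a, c, d, other;

// Branchy form: one conditional jump per comparison.
bool in_set_branchy(const int *p)
{
  if (p == &c) return true;
  if (p == &d) return true;
  if (p == &a) return true;
  return false;
}

// Combined form mirroring the GIMPLE above:
//   _8 = (p == &d) | (p == &c);  _20 = _8 | (p == &a);
bool in_set_combined(const int *p)
{
  bool is_c = p == &c;
  bool is_d = p == &d;
  bool is_a = p == &a;
  return (is_d | is_c) | is_a;
}

// True when both forms give the same answer for 'p'.
bool forms_match(const int *p)
{
  return in_set_branchy(p) == in_set_combined(p);
}
```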

[Bug tree-optimization/110891] [14 Regression] Dead Code Elimination Regression since r14-2674-gd0de3bf9175

2023-08-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110891

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
