[Bug middle-end/110444] [14 Regression] ice in real_can_shorten_arithmetic, at real.cc:1398

2023-06-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110444

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-06-28

--- Comment #6 from Andrew Pinski  ---
Reduced testcase:
```
void f(float *a, float *b, float *c, int size)
{
  float t[2];
  t[0] = b[0] - (float)__builtin_pow(c[0], 2);
  t[1] = b[1] - (float)__builtin_pow(c[1], 2);
  a[0] = t[0];
  a[1] = t[1];
}
```

[Bug middle-end/110444] [14 Regression] ice in real_can_shorten_arithmetic, at real.cc:1398

2023-06-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110444

--- Comment #7 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #5)
> Note this might be already fixed by:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622984.html
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=d915762ea9043da858d388b60b2d8093ff77eeab
> 
> Once I reduce the testcase, I will retest it to see if it has been fixed or
> not.

Yes it was, going to add a testcase and close the bug as fixed.

[Bug middle-end/110444] [14 Regression] ice in real_can_shorten_arithmetic, at real.cc:1398

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110444

--- Comment #8 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:857e1f93ff8e3b93a7a3dcca9e50fe32a4c93950

commit r14-2151-g857e1f93ff8e3b93a7a3dcca9e50fe32a4c93950
Author: Andrew Pinski 
Date:   Wed Jun 28 00:21:08 2023 -0700

Add testcase for PR 110444

This testcase was fixed after r14-2135-gd915762ea9043da85 and
there was no testcase for it before so adding one is a good thing.

Committed as obvious after testing the testcase to make sure it works.

gcc/testsuite/ChangeLog:

PR tree-optimization/110444
* gcc.c-torture/compile/pr110444-1.c: New test.

[Bug middle-end/110444] [14 Regression] ice in real_can_shorten_arithmetic, at real.cc:1398

2023-06-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110444

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Andrew Pinski  ---
Fixed.

[Bug middle-end/110377] Early VRP and IPA-PROP should work out value ranges from __builtin_unreachable

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110377

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:7198573f44fb579843bff8deda695107858d8cff

commit r14-2152-g7198573f44fb579843bff8deda695107858d8cff
Author: Jan Hubicka 
Date:   Wed Jun 28 09:34:53 2023 +0200

Enable ranger for ipa-prop

gcc/ChangeLog:

PR tree-optimization/110377
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Pass statement
to the ranger query.
(ipa_analyze_node): Enable ranger.

gcc/testsuite/ChangeLog:

PR tree-optimization/110377
* gcc.dg/ipa/pr110377.c: New test.

[Bug debug/110439] Missing DW_TAG_typedef for variable with attribute of typedef'd type

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110439

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-06-28
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
it works fine with

foo const attr_foo;

where we generate

 <1><1e>: Abbrev Number: 3 (DW_TAG_typedef)
<1f>   DW_AT_name: foo
<23>   DW_AT_decl_file   : 1
<24>   DW_AT_decl_line   : 1
<25>   DW_AT_decl_column : 13
<26>   DW_AT_type: <0x2f>
 <1><2a>: Abbrev Number: 4 (DW_TAG_const_type)
<2b>   DW_AT_type: <0x1e>
 <1><36>: Abbrev Number: 1 (DW_TAG_variable)
<37>   DW_AT_name: (indirect string, offset: 0xa): attr_foo
<3b>   DW_AT_decl_file   : 1
<3b>   DW_AT_decl_line   : 2
<3c>   DW_AT_decl_column : 11
<3d>   DW_AT_type: <0x2a>
<41>   DW_AT_external: 1
<41>   DW_AT_location: 9 byte block: 3 0 0 0 0 0 0 0 0  (DW_OP_addr: 0)

To GCC the attribute generates a similar type variant, but there is no
way to emit the "qualification" in DWARF.  For consistency, it looks like
we should then fall back to the typedef DIE, not the base type DIE.

Confirmed.

[Bug tree-optimization/110440] [14 regression] ICE when building pixman

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110440

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2023-06-28

--- Comment #4 from Richard Biener  ---
I will have a look.

[Bug c++/110437] SIGILL when return missing in a C++ function with a condition

2023-06-28 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110437

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org

--- Comment #11 from Xi Ruoyao  ---
(In reply to Jan Žižka from comment #3)
> Good thanks for pointer and clarification.
> 
> Is there some reason this cannot be caught during compile time already? I
> mean the warning should be an error maybe? It would be much easier to fix in
> legacy code.

Note that -Werror=return-type cannot be the default because...

int f(int a)
{
  if (a == 1)
    return 0xdead;
  else if (a == 42)
    return 0xbeef;
}

is perfectly legal if the caller doesn't pass anything other than 1 or 42 to f.
So we cannot just reject it at compile time; we can only issue a warning.

And generally there is no way to determine if an "unsupported" value is passed
to f at compile time because doing so will need to solve the halting problem.

[Bug middle-end/110443] [14 Regression] ICE on a52dec-0.7.4: GIMPLE pass: vect SIGSEGV in vect_get_gather_scatter_ops()

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110443

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-06-28
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener  ---
I will have a look.

[Bug tree-optimization/110446] New: [14 Regression] Wrong code at -O1/2/3/s on x86_64-pc-linux-gnu

2023-06-28 Thread jwzeng at nuaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110446

Bug ID: 110446
   Summary: [14 Regression] Wrong code at -O1/2/3/s on
x86_64-pc-linux-gnu
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jwzeng at nuaa dot edu.cn
  Target Milestone: ---

Link to the Compiler Explorer: https://godbolt.org/z/jE9fxsYfE

The following code snippet:

#include <stdio.h>
unsigned int a = 1387579096U;
int main() {
    a = 1 < (~a) ? 1 : (~a);
    printf("%u\n", a);
    return 0;
}

> $ /usr/gcc-trunk/bin/gcc -O0 bug.c; ./a.out
> 1
> $ /usr/gcc-trunk/bin/gcc -O1 bug.c; ./a.out
> 2907388199
> $ /usr/gcc-trunk/bin/gcc -O2 bug.c; ./a.out
> 2907388199
> $ /usr/gcc-trunk/bin/gcc -O3 bug.c; ./a.out
> 2907388199
> $ /usr/gcc-trunk/bin/gcc -Os bug.c; ./a.out
> 2907388199
When compiled with -O1/2/3/s, it prints the wrong result 2907388199 instead of
1. Earlier GCCs do not have this bug.
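
For reference, the expected value can be checked by hand. The following is a minimal sketch, assuming a 32-bit unsigned int; the helper name min_one_or_not is hypothetical and not part of the report. ~1387579096U is 2907388199, and since 1 < 2907388199 the conditional must select 1.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the expression in main():
   a = 1 < (~a) ? 1 : (~a), i.e. the unsigned minimum of 1 and ~a.  */
uint32_t min_one_or_not(uint32_t a)
{
  uint32_t na = ~a;          /* bitwise complement, modulo 2^32 */
  return 1 < na ? 1 : na;    /* yields 1 unless ~a is 0 */
}
```

Note that the wrong output 2907388199 is exactly ~a, which is consistent with the optimized code taking the second arm of the conditional.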

> $ /usr/gcc-trunk/bin/gcc --version
Using built-in specs.
COLLECT_GCC=/home/compilers/gcc/gcc-trunk/bin/gcc
COLLECT_LTO_WRAPPER=/home/compilers/gcc/gcc-trunk/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk-source/gcc/configure
--enable-languages=c,c++,fortran --enable-checking=release
--enable-valgrind-annotations --disable-werror --disable-libstdcxx-pch
--enable-libgomp --enable-lto --enable-gold --with-plugin-ld=gold
--prefix=/usr/local/gcc-trunk
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230619 (experimental) [master r14-1917-gf8e0270272] (GCC)

[Bug c++/110437] SIGILL when return missing in a C++ function with a condition

2023-06-28 Thread jan.zizka at nokia dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110437

--- Comment #12 from Jan Žižka  ---
(In reply to Xi Ruoyao from comment #11)
> is perfectly legal if the caller doesn't pass anything other than 1 or 42 to
> f.  So we cannot just reject it at the compile time, we can only issue a
> warning.

True, but that still doesn't make it a sound software implementation :-)

Another example pointed out by one colleague of mine:

int f(int a)
{
  if (a == 1)
    exit(0);
  else
    return 0xbeef;
}

> And generally there is no way to determine if an "unsupported" value is
> passed to f at compile time because doing so will need to solve the halting
> problem.

That is true :-) but from a software implementation standpoint, if you reuse
such a function or it is part of a library, I'd personally disallow this as bad
coding. But this is a matter of opinion, so better not to dive down such a
rabbit hole.

Thanks for the comments.

[Bug tree-optimization/110446] [14 Regression] Wrong code at -O1/2/3/s on x86_64-pc-linux-gnu

2023-06-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110446

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-06-28
   Keywords||wrong-code
   Target Milestone|--- |14.0
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Mine. This is basically a dup of the other phiopt issue dealing with flow
sensitive information.

[Bug c++/110437] SIGILL when return missing in a C++ function with a condition

2023-06-28 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110437

--- Comment #13 from Jonathan Wakely  ---
(In reply to Jan Žižka from comment #12)
> Another example pointed out by one colleague of mine:
> 
> int f(int a)
> {
>   if (a == 1)
>     exit(0);

exit is marked noreturn so the compiler knows this function never reaches the
end without returning. But a user-defined function like "log_and_exit" might
not be marked noreturn, so the same argument applies there. The code is
correct, but the compiler can't prove it, so it can't give an error by default.


> That is true :-) but from software implementation if you reuse such a
> function or it is a library I'd personally disallow this as this is bad
> coding. But this will be opinionated so better not to dive to such a rabbit
> hole.

Right, the C++ standard isn't based on opinions. If you want to disallow it,
you can use -Werror=return-type. That can't be the default though.
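
The point about noreturn can be illustrated with a small sketch. It assumes a user-defined helper named log_and_exit (hypothetical, taken from the name used above) and GCC's noreturn function attribute. With the attribute, the compiler knows the first branch never falls off the end of f; without it, -Wreturn-type would warn even though the code is correct.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical user-defined helper.  The noreturn attribute tells the
   compiler that control never comes back from this call.  */
static __attribute__((noreturn)) void log_and_exit(const char *msg)
{
  fprintf(stderr, "%s\n", msg);
  exit(1);
}

int f(int a)
{
  if (a == 1)
    log_and_exit("unsupported value");  /* never returns */
  else
    return 0xbeef;
}
```

Projects that want the stricter behaviour can still opt in with -Werror=return-type, as noted above.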

[Bug target/104124] Poor optimization for vector splat DW with small consts

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124

--- Comment #4 from CVS Commits  ---
The master branch has been updated by HaoChen Gui :

https://gcc.gnu.org/g:f3d87219dd502d5c11608ffb83fbe66c79baf784

commit r14-2153-gf3d87219dd502d5c11608ffb83fbe66c79baf784
Author: Haochen Gui 
Date:   Wed Jun 28 16:30:44 2023 +0800

rs6000: Splat vector small V2DI constants with vspltisw and vupkhsw

This patch adds a new insn for vector splat with small V2DI constants on
P8.  If the value of the constant is in RANGE (-16, 15) but not 0 or -1, it
can be loaded with vspltisw and vupkhsw on P8.
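
The range condition can be sketched as a simple predicate. This is a hypothetical stand-in for illustration only, not the code of GCC's vspltisw_vupkhsw_constant_p. It assumes the bounds are inclusive, matching the 5-bit signed immediate that vspltisw accepts.

```c
#include <stdbool.h>

/* Hypothetical predicate: true if VALUE lies in [-16, 15] and is neither
   0 nor -1 (the two values the commit message excludes).  Inclusive bounds
   are an assumption based on the vspltisw immediate field.  */
bool fits_vspltisw_vupkhsw(long long value)
{
  if (value == 0 || value == -1)
    return false;
  return value >= -16 && value <= 15;
}
```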

gcc/
PR target/104124
* config/rs6000/altivec.md (*altivec_vupkhs_direct): Rename to...
(altivec_vupkhs_direct): ...this.
* config/rs6000/predicates.md (vspltisw_vupkhsw_constant_split): New
predicate to test if a constant can be loaded with vspltisw and
vupkhsw.
(easy_vector_constant): Call vspltisw_vupkhsw_constant_p to check if
a vector constant can be synthesized with a vspltisw and a vupkhsw.
* config/rs6000/rs6000-protos.h (vspltisw_vupkhsw_constant_p):
Declare.
* config/rs6000/rs6000.cc (vspltisw_vupkhsw_constant_p): New
function to return true if OP mode is V2DI and can be synthesized
with vupkhsw and vspltisw.
* config/rs6000/vsx.md (*vspltisw_v2di_split): New insn to load up
constants with vspltisw and vupkhsw.

gcc/testsuite/
PR target/104124
* gcc.target/powerpc/pr104124.c: New.

[Bug c++/110447] New: [modules] unexpected attachment of GMF decls to a named module.

2023-06-28 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110447

Bug ID: 110447
   Summary: [modules] unexpected attachment of GMF decls to a
named module.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iains at gcc dot gnu.org
  Target Milestone: ---

While trying to compare behaviour of GCC and clang 



Foo.cpp:

module;

#include <iostream>

export module Foo;

export void
my_hello (const char *str)
{
 std::cout << str << std::endl;
}


pr61465.cpp:

import Foo; // assume module 'foo' contains the declarations from `<iostream>`
#include <iostream>

int main(int argc, char *argv[])
{
    std::cout << "Test\n";
    my_hello ("and we used Foo");
    return 0;
}



g++ -std=c++20 -fmodules-ts Foo.cpp -c
g++ -std=c++20 -fmodules-ts pr6145-main.cpp Foo.o -o t -flang-info-module-cmi

...

In module imported at pr6145-main.cpp:1:1:
Foo: note: reading CMI ‘gcm.cache/Foo.gcm’


/usr/include/runetype.h:111:20: error: conflicting declaration ‘_RuneLocale
_DefaultRuneLocale’


/usr/include/runetype.h:111:20: note: previous declaration as ‘_RuneLocale@Foo
_DefaultRuneLocale@Foo’



NOTE: iostream is _not_ built as a header unit in this case (if we do that the
problem disappears).



What seems wrong here is that the _RuneLocale symbol appears to be attached to
Foo - but it should be part of the GMF of Foo and not visible to importers
(although potentially reachable if used in the module purview of Foo).

Unless I am missing something?

[Bug c++/110447] [modules] unexpected attachment of GMF decls to a named module.

2023-06-28 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110447

Iain Sandoe  changed:

   What|Removed |Added

 CC||nathan at acm dot org

--- Comment #1 from Iain Sandoe  ---
note that if I reverse the include/import order like so...

#include <iostream>
import Foo; // assume module 'foo' contains the declarations from `<iostream>`

all is (or at least appears to be) OK.

[Bug middle-end/110443] [14 Regression] ICE on a52dec-0.7.4: GIMPLE pass: vect SIGSEGV in vect_get_gather_scatter_ops()

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110443

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Richard Biener  ---
Fixed.

[Bug middle-end/110443] [14 Regression] ICE on a52dec-0.7.4: GIMPLE pass: vect SIGSEGV in vect_get_gather_scatter_ops()

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110443

--- Comment #7 from Richard Biener  ---
*** Bug 110440 has been marked as a duplicate of this bug. ***

[Bug middle-end/110443] [14 Regression] ICE on a52dec-0.7.4: GIMPLE pass: vect SIGSEGV in vect_get_gather_scatter_ops()

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110443

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:51c8cbc6bba387f953d9be48c4a4c8b657dd54a5

commit r14-2156-g51c8cbc6bba387f953d9be48c4a4c8b657dd54a5
Author: Richard Biener 
Date:   Wed Jun 28 10:16:57 2023 +0200

tree-optimization/110443 - prevent SLP splat of gathers

The following prevents non-grouped load SLP in case the element
to splat is from a gather operation.  While it should be possible
to support this it is not similar to the single element interleaving
case I was trying to mimic here.

PR tree-optimization/110443
* tree-vect-slp.cc (vect_build_slp_tree_1): Reject non-grouped
gather loads.

* gcc.dg/torture/pr110443.c: New testcase.

[Bug tree-optimization/110440] [14 regression] ICE when building pixman

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110440

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Richard Biener  ---
duplicate

*** This bug has been marked as a duplicate of bug 110443 ***

[Bug target/110448] New: [RISC-V] RVV intrinsic api test error

2023-06-28 Thread mumuxi_ll at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110448

Bug ID: 110448
   Summary: [RISC-V] RVV intrinsic api test error
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mumuxi_ll at outlook dot com
  Target Milestone: ---

Testing the RVV intrinsic API according to
https://github.com/riscv-non-isa/rvv-intrinsic-doc with the gcc-trunk branch
fails with a "too few arguments to function '__riscv_vsadd_vv_i16m8'" error.
Here is the test link: https://godbolt.org/z/nYr6TKPbq.

But it succeeds with gcc-13.1: https://godbolt.org/z/oavWf96o1.

[Bug tree-optimization/103680] Jump threading and switch corrupts profile

2023-06-28 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103680

Jan Hubicka  changed:

   What|Removed |Added

   Last reconfirmed||2023-06-28
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #7 from Jan Hubicka  ---
A simple testcase:
test (int i)
{
  if (__builtin_expect_with_probability (i > 5, 1, 0.6))
    foo ();
}
test2(int i)
{
  test (i);
  if (__builtin_expect_with_probability (i > 4, 1, 0.7))
    foo ();
}
This can be updated quite easily, but we still fail.
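
One possible reading of "updated quite easily", as a sketch: assume the two hints mean P(i > 5) = 0.6 and P(i > 4) = 0.7, and note that i > 5 implies i > 4. After threading, the second test is always true on the path where the first one succeeded, and on the other path its probability becomes (0.7 - 0.6) / (1 - 0.6) = 0.25. The function name below is hypothetical and this is not GCC's profile-update code.

```c
/* Probability that the second (weaker) test holds on the path where the
   first (stronger) test failed.  p_first = P(i > 5), p_second = P(i > 4).
   Assumes the first condition implies the second and p_first < 1.  */
double prob_second_given_first_failed(double p_first, double p_second)
{
  return (p_second - p_first) / (1.0 - p_first);
}
```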

[Bug fortran/110360] ABI issue with character,value dummy argument

2023-06-28 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110360

--- Comment #21 from Mikael Morin  ---
(In reply to anlauf from comment #20)
> Created attachment 55407 [details]
> Third patch set
> 
> Here's a lightly tested 3rd patch that tries to handle the chaos I created...
> 
> Can you have a look?

This looks good.
There is gfc_conv_string_parameter that seems to be used often with
gfc_string_to_single_character and that you could use to generate the address,
but it is basically equivalent to your solution, I think.


[Bug tree-optimization/110449] New: Vect: use a small step to calculate the loop induction if the loop is unrolled during loop vectorization

2023-06-28 Thread hliu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110449

Bug ID: 110449
   Summary: Vect: use a small step to calculate the loop induction
if the loop is unrolled during loop vectorization
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hliu at amperecomputing dot com
  Target Milestone: ---

This is inspired by clang. Compile the following case with "-mcpu=neoverse-n2
-O3":

void foo(int *arr, int val, int step) {
  for (int i = 0; i < 1024; i++) {
    arr[i] = val;
    val += step;
  }
}

It will be unrolled by 2 during vectorization. GCC generates code:
        fmov    s29, w2                 # step
        shl     v27.2s, v29.2s, 3       # 8*step
        shl     v28.2s, v29.2s, 2       # 4*step
        ...
.L2:
        mov     v30.16b, v31.16b
        add     v31.4s, v31.4s, v27.4s  # += 8*step
        add     v29.4s, v30.4s, v28.4s  # += 4*step
        stp     q30, q29, [x0]
        add     x0, x0, 32
        cmp     x1, x0
        bne     .L2

The v27 (i.e. "8*step") is actually not necessary. We can use v29 + v28 (i.e.
"+ 4*step") and generate simpler code:
        fmov    s29, w2                 # step
        shl     v28.2s, v29.2s, 2       # 4*step
        ...
.L2:
        add     v29.4s, v30.4s, v28.4s  # += 4*step
        stp     q30, q29, [x0]
        add     x0, x0, 32
        add     v30.4s, v29.4s, v28.4s  # += 4*step
        cmp     x1, x0
        bne     .L2

This has two benefits:
(1) It saves one vector register and one "mov" instruction.
(2) For floating point, the result with the small step should be closer to the
original scalar result than with the large step. I.e. "A + 4*step + ... + 4*step"
should be closer to "A + step + ... + step" than "A + 8*step + ... + 8*step".

Do you think this is reasonable?

I have a simple patch that enhances "vectorizable_induction()" in
tree-vect-loop.cc to achieve this. I will send out the patch for code
review later.
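
The proposed scheme can be emulated in plain C to check that it produces the same values as the scalar loop. This is only a sketch, assuming 4 int lanes per vector and an unroll factor of 2 as in the assembly above. The function names and the arrays v30/v29 (named after the registers) are illustrative and not taken from the patch, and the inputs are assumed small enough that no signed overflow occurs.

```c
#include <string.h>

enum { N = 1024, LANES = 4 };

/* The scalar loop from the report.  */
void foo_scalar(int *arr, int val, int step)
{
  for (int i = 0; i < N; i++) {
    arr[i] = val;
    val += step;
  }
}

/* Scalar emulation of the proposed vector loop: both vectors in an
   unrolled iteration are advanced by the same 4*step increment.  */
void foo_small_step(int *arr, int val, int step)
{
  int v30[LANES], v29[LANES];
  int step4 = LANES * step;            /* the single "4*step" value */

  for (int l = 0; l < LANES; l++)      /* initial induction vector */
    v30[l] = val + l * step;

  for (int i = 0; i < N; i += 2 * LANES) {
    for (int l = 0; l < LANES; l++)
      v29[l] = v30[l] + step4;         /* second vector: += 4*step */
    memcpy(arr + i, v30, sizeof v30);  /* stp q30, q29, [x0] */
    memcpy(arr + i + LANES, v29, sizeof v29);
    for (int l = 0; l < LANES; l++)
      v30[l] = v29[l] + step4;         /* next iteration: += 4*step */
  }
}
```

For integers both variants compute identical values. The accuracy argument in (2) concerns floating point only and is not exercised by this sketch.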

[Bug target/110448] [RISC-V] RVV intrinsic api test error

2023-06-28 Thread kito at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110448

Kito Cheng  changed:

   What|Removed |Added

 CC||kito at gcc dot gnu.org

--- Comment #1 from Kito Cheng  ---
That's an incompatible change in RVV intrinsic spec land.

see https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222

[Bug tree-optimization/110434] tree-nrv introduces incorrect CLOBBER(eol)

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110434

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2023-06-28

[Bug analyzer/110433] ASAN reports mismatching new/delete when compiling analyzer testcases

2023-06-28 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110433

--- Comment #2 from Martin Jambor  ---
Here is the promised longer trace (from compiling
testsuite/gcc.dg/analyzer/out-of-bounds-diagram-5-unicode.c):

=
==58010==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x50d00a00 in
thread T0:
  object passed to delete has wrong type:
  size of the allocated type:   136 bytes;
  size of the deallocated type: 104 bytes.
#0 0x83eba8 in operator delete(void*, unsigned long)
/home/worker/buildworker/tiber-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:164
#1 0x51e6e45 in
std::default_delete<ana::svalue_spatial_item>::operator()(ana::svalue_spatial_item*)
const
/home/worker/buildworker/tiber-gcc-asan/objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:99
#2 0x51e6e45 in std::unique_ptr<ana::svalue_spatial_item, std::default_delete<ana::svalue_spatial_item> >::~unique_ptr()
/home/worker/buildworker/tiber-gcc-asan/objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:404
#3 0x51e6e45 in ana::access_diagram_impl::~access_diagram_impl()
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/access-diagram.cc:1728
#4 0x51e703c in ana::access_diagram_impl::~access_diagram_impl()
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/access-diagram.cc:1728
#5 0x4e97142 in
std::default_delete<text_art::widget>::operator()(text_art::widget*) const
/home/worker/buildworker/tiber-gcc-asan/objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:99
#6 0x4e97142 in std::unique_ptr<text_art::widget, std::default_delete<text_art::widget> >::~unique_ptr()
/home/worker/buildworker/tiber-gcc-asan/objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:404
#7 0x4e97142 in text_art::wrapper_widget::~wrapper_widget()
/home/worker/buildworker/tiber-gcc-asan/build/gcc/text-art/widget.h:136
#8 0x4e97142 in ana::access_diagram::~access_diagram()
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/access-diagram.h:149
#9 0x4e97142 in
ana::out_of_bounds::make_access_diagram(ana::access_operation const&,
text_art::style_manager&, text_art::theme const&, ana::logger*) const
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/bounds-checking.cc:192
#10 0x4e97142 in ana::out_of_bounds::maybe_show_diagram(ana::logger*) const
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/bounds-checking.cc:169
#11 0x4e9928c in ana::out_of_bounds::maybe_show_notes(unsigned int,
ana::logger*) const
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/bounds-checking.cc:125
#12 0x4e9928c in ana::concrete_buffer_overflow::emit(rich_location*,
ana::logger*)
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/bounds-checking.cc:333
#13 0x4eee9ed in
ana::diagnostic_manager::emit_saved_diagnostic(ana::exploded_graph const&,
ana::saved_diagnostic const&)
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/diagnostic-manager.cc:1424
#14 0x4efca7a in ana::dedupe_winners::emit_best(ana::diagnostic_manager*,
ana::exploded_graph const&)
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/diagnostic-manager.cc:1311
#15 0x4eefd35 in
ana::diagnostic_manager::emit_saved_diagnostics(ana::exploded_graph const&)
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/diagnostic-manager.cc:1363
#16 0x2a647b6 in ana::impl_run_checkers(ana::logger*)
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/engine.cc:6139
#17 0x2a66f59 in ana::run_checkers()
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/engine.cc:6213
#18 0x2a2f8dc in execute
/home/worker/buildworker/tiber-gcc-asan/build/gcc/analyzer/analyzer-pass.cc:87
#19 0x197af8b in execute_one_pass(opt_pass*)
/home/worker/buildworker/tiber-gcc-asan/build/gcc/passes.cc:2651
#20 0x197d6d0 in execute_ipa_pass_list(opt_pass*)
/home/worker/buildworker/tiber-gcc-asan/build/gcc/passes.cc:3100
#21 0xd6f193 in ipa_passes
/home/worker/buildworker/tiber-gcc-asan/build/gcc/cgraphunit.cc:2268
#22 0xd6f193 in symbol_table::compile()
/home/worker/buildworker/tiber-gcc-asan/build/gcc/cgraphunit.cc:2331
#23 0xd6f193 in symbol_table::compile()
/home/worker/buildworker/tiber-gcc-asan/build/gcc/cgraphunit.cc:2309
#24 0xd76f09 in symbol_table::finalize_compilation_unit()
/home/worker/buildworker/tiber-gcc-asan/build/gcc/cgraphunit.cc:2583
#25 0x1d81ed8 in compile_file
/home/worker/buildworker/tiber-gcc-asan/build/gcc/toplev.cc:471
#26 0x77e8fb in do_compile
/home/worker/buildworker/tiber-gcc-asan/build/gcc/toplev.cc:2126
#27 0x77e8fb in toplev::main(int, char**)
/home/worker/buildworker/tiber-gcc-asan/build/gcc/toplev.cc:2282
#28 0x789813 in main
/home/worker/buildworker/tiber-gcc-asan/build/gcc/main.cc:39

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849

--- Comment #18 from CVS Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:45c53768b6fa3d737ae818e31d3c50da62e0ad2b

commit r14-2157-g45c53768b6fa3d737ae818e31d3c50da62e0ad2b
Author: Jan Hubicka 
Date:   Wed Jun 28 11:45:15 2023 +0200

Add cold attribute to throw wrappers and terminate

PR middle-end/109849
* include/bits/c++config (std::__terminate): Mark cold.
* include/bits/functexcept.h: Mark everything as cold.
* libsupc++/exception: Mark terminate and unexpected as cold.

[Bug target/109780] [12/13/14 Regression] csmith: runtime crash with -O2 -march=znver1

2023-06-28 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780

--- Comment #17 from Xi Ruoyao  ---
(In reply to H.J. Lu from comment #16)
> Created attachment 55409 [details]
> A patch
> 
> I am still trying to find a small testcase.

The patch triggers an ICE building Spidermonkey 115b9 (it segfaults with GCC
trunk because of some unaligned vmovdqa):

0x93297b ix86_finalize_stack_frame_flags
../../gcc/gcc/config/i386/i386.cc:8224
0x162064c ix86_expand_epilogue(int)
../../gcc/gcc/config/i386/i386.cc:9405
0x1b2e27f gen_epilogue()
../../gcc/gcc/config/i386/i386.md:17517
0x160a815 target_gen_epilogue
../../gcc/gcc/config/i386/i386.md:17013
0xf15e86 make_epilogue_seq
../../gcc/gcc/function.cc:5964
0xf15f8b thread_prologue_and_epilogue_insns()
../../gcc/gcc/function.cc:6046
0xf166c2 rest_of_handle_thread_prologue_and_epilogue
../../gcc/gcc/function.cc:6544
0xf166c2 execute
../../gcc/gcc/function.cc:6625

The code at i386.cc:8224 reads:

  if (crtl->stack_realign_finalized)
    {
      /* After stack_realign_needed is finalized, we can't no longer
         change it.  */
      gcc_assert (crtl->stack_realign_needed == stack_realign);
      return;
    }

I'm not sure whether the assert should simply be dropped or whether the fix is
more involved.

Or can we just force the use of unaligned vector moves for block operations
until we find a better solution?  That is at least better than leaving the
vectorized block move broken and forcing people to disable the
feature.

[Bug target/109780] [12/13/14 Regression] csmith: runtime crash with -O2 -march=znver1

2023-06-28 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780

--- Comment #18 from Xi Ruoyao  ---
(In reply to Xi Ruoyao from comment #17)
> (In reply to H.J. Lu from comment #16)
> > Created attachment 55409 [details]
> > A patch
> > 
> > I am still trying to find a small testcase.
> 
> The patch triggers an ICE building Spidermonkey 115b9 (it segfaults with GCC
> trunk because of some unaligned vmovdqa):

I mean, "GCC trunk and -O3 -march=tigerlake -mtune=tigerlake".

[Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming

2023-06-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #14 from Jan Hubicka  ---
> 
> why disallow caller->indirect_calls?
See testcase in comment #9
> 
> > +   return false;
> > +  for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee)
> 
> I don't think this flies - it looks quadratic.  Can we compute this
> in the inline summary once instead?

I guess I can place a cache there.  I think this check will become more
global over time, so IMO it fits better here.
> 
> As for indirect calls, can we maybe mark initial direct GIMPLE call
> stmts as "always-inline" and only look at that marking, thus an
> indirect call will never become "always-inline"?  Iff cgraph edges
> prevail during all early inlining we could mark call edges for
> this purpose?

I also think we need call site specific info.
Tagging gimple call statements and copying the info to gimple edges will
probably be needed here.  We want to keep the info from early inlining
to late inlining since we output errors late.
We already have plenty of GF_CALL_ flags, so adding one should be easy?

Honza

[Bug target/78794] [7 Regression] We noticed ~9% regression in 32-bit mode for 462.libquntum on Avoton after r243202

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78794

--- Comment #12 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:c027592d39b2968005aa28bc84a946bab2668db8

commit r14-2158-gc027592d39b2968005aa28bc84a946bab2668db8
Author: Roger Sayle 
Date:   Wed Jun 28 11:07:47 2023 +0100

i386: Fix FAIL of gcc.target/i386/pr78794.c on ia32.

This patch fixes that FAIL of gcc.target/i386/pr78794.c on ia32, which
is caused by minor STV rtx_cost differences with -march=silvermont.
It turns out that generic tuning results in pandn, but the lack of
accurate parameterization for COMPARE in compute_convert_gain combined
with small differences in scalar<->SSE costs on silvermont results in
this DImode chain not being converted.

The solution is to provide more accurate costs/gains for converting
(DImode and SImode) comparisons.

I'd been holding off on doing this as I'd thought it would be possible
to turn pandn;ptestz into ptestc (for an even bigger scalar-to-vector
win) but I've recently realized that these optimizations (as I've
implemented them) occur in the wrong order (stv2 occurs after
combine), so it isn't easy for STV to convert CCZmode into CCCmode.
Doh!  Perhaps something can be done in peephole2.

2023-06-28  Roger Sayle  

gcc/ChangeLog
PR target/78794
* config/i386/i386-features.cc (compute_convert_gain): Provide
more accurate gains for conversion of scalar comparisons to
PTEST.

[Bug target/104610] memcmp () == 0 can be optimized better for avx512f

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104610

--- Comment #20 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:4afbebcdc5780d28e52b7d65643e462c7c3882ce

commit r14-2159-g4afbebcdc5780d28e52b7d65643e462c7c3882ce
Author: Roger Sayle 
Date:   Wed Jun 28 11:11:34 2023 +0100

i386: Add cbranchti4 pattern to i386.md (for -m32 compare_by_pieces).

This patch fixes some very odd (unanticipated) code generation by
compare_by_pieces with -m32 -mavx, since the recent addition of the
cbranchoi4 pattern.  The issue is that cbranchoi4 is available with
TARGET_AVX, but cbranchti4 is currently conditional on TARGET_64BIT
which results in the odd behaviour (thanks to OPTAB_WIDEN) that with
-m32 -mavx, compare_by_pieces ends up (inefficiently) widening 128-bit
comparisons to 256-bits before performing PTEST.

This patch fixes this by providing a cbranchti4 pattern that's available
with either TARGET_64BIT or TARGET_SSE4_1.

For the test case below (again from PR 104610):

int foo(char *a)
{
    static const char t[] = "0123456789012345678901234567890";
    return __builtin_memcmp(a, &t[0], sizeof(t)) == 0;
}

GCC with -m32 -O2 -mavx currently produces the bonkers:

foo:    pushl   %ebp
        movl    %esp, %ebp
        andl    $-32, %esp
        subl    $64, %esp
        movl    8(%ebp), %eax
        vmovdqa .LC0, %xmm4
        movl    $0, 48(%esp)
        vmovdqu (%eax), %xmm2
        movl    $0, 52(%esp)
        movl    $0, 56(%esp)
        movl    $0, 60(%esp)
        movl    $0, 16(%esp)
        movl    $0, 20(%esp)
        movl    $0, 24(%esp)
        movl    $0, 28(%esp)
        vmovdqa %xmm2, 32(%esp)
        vmovdqa %xmm4, (%esp)
        vmovdqa (%esp), %ymm5
        vpxor   32(%esp), %ymm5, %ymm0
        vptest  %ymm0, %ymm0
        jne     .L2
        vmovdqu 16(%eax), %xmm7
        movl    $0, 48(%esp)
        movl    $0, 52(%esp)
        vmovdqa %xmm7, 32(%esp)
        vmovdqa .LC1, %xmm7
        movl    $0, 56(%esp)
        movl    $0, 60(%esp)
        movl    $0, 16(%esp)
        movl    $0, 20(%esp)
        movl    $0, 24(%esp)
        movl    $0, 28(%esp)
        vmovdqa %xmm7, (%esp)
        vmovdqa (%esp), %ymm1
        vpxor   32(%esp), %ymm1, %ymm0
        vptest  %ymm0, %ymm0
        je      .L6
.L2:    movl    $1, %eax
        xorl    $1, %eax
        vzeroupper
        leave
        ret
.L6:    xorl    %eax, %eax
        xorl    $1, %eax
        vzeroupper
        leave
        ret

with this patch, we now generate the (slightly) more sensible:

foo:    vmovdqa .LC0, %xmm0
        movl    4(%esp), %eax
        vpxor   (%eax), %xmm0, %xmm0
        vptest  %xmm0, %xmm0
        jne     .L2
        vmovdqa .LC1, %xmm0
        vpxor   16(%eax), %xmm0, %xmm0
        vptest  %xmm0, %xmm0
        je      .L5
.L2:    movl    $1, %eax
        xorl    $1, %eax
        ret
.L5:    xorl    %eax, %eax
        xorl    $1, %eax
        ret

2023-06-28  Roger Sayle  

gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_expand_branch): Also use ptest
for TImode comparisons on 32-bit architectures.
* config/i386/i386.md (cbranch4): Change from SDWIM to
SWIM1248x to exclude/avoid TImode being conditional on -m64.
(cbranchti4): New define_expand for TImode on both TARGET_64BIT
and/or with TARGET_SSE4_1.
* config/i386/predicates.md (ix86_timode_comparison_operator):
New predicate that depends upon TARGET_64BIT.
(ix86_timode_comparison_operand): Likewise.

gcc/testsuite/ChangeLog
* gcc.target/i386/pieces-memcmp-2.c: New test case.
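
The shorter sequence above compares the 32-byte string in two 16-byte pieces, each by XOR followed by a test for all-zero. A rough C analogue of that shape is sketched below; it is only an illustration, not what compare_by_pieces does internally, and the names `piece_differs` and `equal32` are made up for this sketch.

```c
#include <stdint.h>
#include <string.h>

/* Nonzero if the 16 bytes at p and q differ: XOR the halves and test
   the result against zero, much as vpxor + vptest do on one register. */
static int piece_differs(const unsigned char *p, const unsigned char *q)
{
    uint64_t a[2], b[2];
    memcpy(a, p, sizeof a);
    memcpy(b, q, sizeof b);
    return ((a[0] ^ b[0]) | (a[1] ^ b[1])) != 0;
}

/* Returns 1 if the 32 bytes at p and q are equal, 0 otherwise. */
int equal32(const void *p, const void *q)
{
    const unsigned char *x = p;
    const unsigned char *y = q;
    if (piece_differs(x, y))
        return 0;                       /* first 16 bytes already differ */
    return !piece_differs(x + 16, y + 16);
}
```

Calling `equal32(a, t)` with the 32-byte `t` from the test case corresponds to the `__builtin_memcmp(...) == 0` expression there.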

[Bug tree-optimization/110450] New: [14 Regression] Dead Code Elimination Regression at -O2 since r14-261-g0ef3756adf0

2023-06-28 Thread theodort at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110450

Bug ID: 110450
   Summary: [14 Regression] Dead Code Elimination Regression at
-O2 since r14-261-g0ef3756adf0
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: theodort at inf dot ethz.ch
  Target Milestone: ---

https://godbolt.org/z/KqcYxfevj

Given the following code:

void foo(void);
static int a = 1;
static int *b = &a, *c = &a;
static short d, e;
static char f = 11;
static char(g)(char h, int i) { return h << i; }
int main() {
    if (f) *c = g(0 >= a, 3);
    e = *c;
    d = e % f;
    if (d) {
        __builtin_unreachable();
    } else if (*b)
        foo();
    ;
}

gcc-trunk -O2 does not eliminate the call to foo:

main:
        movl    a(%rip), %eax
        xorl    %edx, %edx
        testl   %eax, %eax
        setle   %dl
        leal    0(,%rdx,8), %edx
        movl    %edx, a(%rip)
        jle     .L8
        xorl    %eax, %eax
        ret
.L8:
        pushq   %rax
        call    foo
        xorl    %eax, %eax
        popq    %rdx
        ret

gcc-13.1.0 -O2 eliminates the call to foo:

main:
        movl    a(%rip), %edx
        xorl    %eax, %eax
        testl   %edx, %edx
        setle   %al
        leal    0(,%rax,8), %eax
        movl    %eax, a(%rip)
        xorl    %eax, %eax
        ret

Bisects to r14-261-g0ef3756adf0
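
One way to see why the call is dead: after the store, a is either 0 or 8; the value 8 makes d nonzero, which is the path marked unreachable, and the value 0 makes *b zero, so foo() is never reached. The sketch below replays that reasoning with ordinary return codes. It is a reading of the test case rather than something taken from the report, and `shift_to_char` and `foo_call_state` are hypothetical names.

```c
/* Mirrors g() from the test case: shift, then truncate to char. */
static char shift_to_char(char h, int i) { return (char)(h << i); }

/* Replays main() for a given initial value of a.
   Returns -1 if the path marked __builtin_unreachable() would be taken,
            1 if foo() would be called,
            0 if foo() would be skipped. */
int foo_call_state(int a_init)
{
    const char f = 11;
    int a = a_init;

    a = shift_to_char(0 >= a, 3);   /* a becomes 0 or 8 */
    short e = (short)a;
    short d = (short)(e % f);       /* 0 % 11 == 0, 8 % 11 == 8 */

    if (d)
        return -1;                  /* only possible when a == 8 */
    return a != 0;                  /* here a == 0, so foo() is skipped */
}
```

No input makes the function return 1, which matches the expectation that the call to foo can be eliminated.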

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-28 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #15 from rguenther at suse dot de  ---
On Wed, 28 Jun 2023, hubicka at ucw dot cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
> 
> --- Comment #14 from Jan Hubicka  ---
> > 
> > why disallow caller->indirect_calls?
> See testcase in comment #9
> > 
> > > +   return false;
> > > +  for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee)
> > 
> > > I don't think this flies - it looks quadratic.  Can we compute this
> > in the inline summary once instead?
> 
> I guess I can place a cache there.  I think this check will become more
> global over time, so IMO it fits better here.
> > 
> > As for indirect calls, can we maybe mark initial direct GIMPLE call
> > stmts as "always-inline" and only look at that marking, thus an
> > indirect call will never become "always-inline"?  Iff cgraph edges
> > prevail during all early inlining we could mark call edges for
> > this purpose?
> 
> I also think we need call site specific info.
> Tagging gimple call statements and copying the info to gimple edges will
> probably be needed here.  We want to keep the info from early inlining
> to late inlining since we output errors late.
> We already have plenty of GF_CALL_ flags, so adding one should be easy?

We have 3 bits left :/  I was hoping that cgraph_edge lives long
enough?  But I suppose we're not keeping them across the early opts
pipeline.

[Bug tree-optimization/109689] [14 Regression] ICE at -O1 with "-ftree-vectorize": in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:645 since r14-301-gf2d6beb7a4ddf1

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109689

--- Comment #9 from Richard Biener  ---
(In reply to Jan Hubicka from comment #8)
> An easy way would be to avoid unlooping if tree_ssa_loop_ch is executed in
> loop closed ssa (which happens from ch_vect pass).
> 
> I wonder how hard it would be, however, to get this right?
> I think this means taking the basic block we turn into unreachable and looking
> at its predecessors that are in the loop, introducing new PHIs and renaming.
> This is a bit involved to do by hand.
> 
> So perhaps simply:
>   rewrite_into_loop_closed_ssa (NULL, 0);
> in case we unlooped in loop closed ssa form (which is not that common).
> Would that be acceptable?

Yes, we do that in other places as well.

[Bug tree-optimization/109689] [14 Regression] ICE at -O1 with "-ftree-vectorize": in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:645 since r14-301-gf2d6beb7a4ddf1

2023-06-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109689

--- Comment #10 from Jan Hubicka  ---
> > So perhaps simply:
> >   rewrite_into_loop_closed_ssa (NULL, 0);
> > in case we unlooped in loop closed ssa form (which is not that common).
> > Would that be acceptable?
> 
> Yes, we do that in other places as well.
OK, I will test the fix.

Thanks
Honza

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #16 from Jan Hubicka  ---
> > We already have plenty of GF_CALL_ flags, so adding one should be easy?
> 
> We have 3 bits left :/  I was hoping that cgraph_edge lives long
> enough?  But I suppose we're not keeping them across the early opts
> pipeline.

Hmm, so we have too many flags.  Indeed, the problem is that we don't want to
keep callgraph edges across all the modifications that gimple optimization
passes make.
Eventually such annotations can probably go into a hash_map, just like we do
for EH regions etc.

Honza
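
The side-table idea can be sketched in plain C: instead of spending a flag bit on every statement, the few statements that carry the annotation are recorded in a table keyed by their address. This is not GCC's hash_map or its gimple types; `call_stmt`, `side_table`, `mark_always_inline` and `is_always_inline` are hypothetical stand-ins, and the fixed-size open-addressing table is an assumption made to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SLOTS 64              /* power of two, hypothetical capacity */

struct call_stmt { int id; };       /* stand-in for a call statement */

/* Set of annotated statements; an all-zero table is empty. */
struct side_table { const struct call_stmt *keys[TABLE_SLOTS]; };

static size_t slot_for(const struct call_stmt *s)
{
    return ((uintptr_t)s >> 4) & (TABLE_SLOTS - 1);
}

/* Record that s carries the annotation.  Returns false if the table is full. */
bool mark_always_inline(struct side_table *t, const struct call_stmt *s)
{
    size_t i = slot_for(s);
    for (size_t probes = 0; probes < TABLE_SLOTS; ++probes) {
        if (t->keys[i] == NULL || t->keys[i] == s) {
            t->keys[i] = s;
            return true;
        }
        i = (i + 1) & (TABLE_SLOTS - 1);    /* linear probing */
    }
    return false;
}

/* Query the annotation without touching the statement itself. */
bool is_always_inline(const struct side_table *t, const struct call_stmt *s)
{
    size_t i = slot_for(s);
    for (size_t probes = 0; probes < TABLE_SLOTS; ++probes) {
        if (t->keys[i] == NULL)
            return false;
        if (t->keys[i] == s)
            return true;
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    return false;
}
```

The trade-off is a lookup on each query in exchange for not consuming one of the remaining flag bits.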

[Bug tree-optimization/110451] New: LIM fails to hoist comparisons

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110451

Bug ID: 110451
   Summary: LIM fails to hoist comparisons
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

For SPEC bwaves or gfortran.dg/pr81303.f we interchange the loop nest which
produces a conditional in the innermost loop like the following to handle
a reduction with invariant initialization:

   [local count: 955630226]:
  # l_93 = PHI 
  # ivtmp_58 = PHI 
  _30 = (integer(kind=8)) l_93;
  _31 = _29 + _30;
  y__I_lsm.120_22 = (*y_139(D))[_31];
  _23 = m_95 != 1;
  y__I_lsm.120_33 = _23 ? y__I_lsm.120_22 : 0.0;
  _42 = _30 + _41;
  _43 = (*a_141(D))[_42];

Currently the LIM cost modeling finds it not profitable to hoist the
invariant m_95 != 1 compare.  This causes vectorization to choose
a larger vectorization factor to accommodate vectorizing this compare.
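
At source level the transformation being asked for looks roughly like the following. This is a hand-written C analogue, not the Fortran from bwaves or pr81303.f; the function names and the reduction body are assumptions chosen only to show a loop-invariant compare feeding a select inside the loop.

```c
#include <stddef.h>

/* The compare m != 1 does not depend on l, but is written inside the loop. */
double reduce_in_loop(const double *y, const double *a, size_t n, int m)
{
    double acc = 0.0;
    for (size_t l = 0; l < n; ++l) {
        double init = (m != 1) ? y[l] : 0.0;
        acc += init + a[l];
    }
    return acc;
}

/* Same computation with the invariant compare evaluated once, up front. */
double reduce_hoisted(const double *y, const double *a, size_t n, int m)
{
    const int keep = (m != 1);
    double acc = 0.0;
    for (size_t l = 0; l < n; ++l) {
        double init = keep ? y[l] : 0.0;
        acc += init + a[l];
    }
    return acc;
}
```

Both functions return the same value for the same inputs; only the placement of the compare differs.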

[Bug tree-optimization/110451] LIM fails to hoist comparisons

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110451

Richard Biener  changed:

               What|Removed                       |Added

     Ever confirmed|0                             |1
           Keywords|                              |missed-optimization
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
   Last reconfirmed|                              |2023-06-28
             Status|UNCONFIRMED                   |ASSIGNED

--- Comment #1 from Richard Biener  ---
Mine.

[Bug target/104610] memcmp () == 0 can be optimized better for avx512f

2023-06-28 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104610

--- Comment #21 from Uroš Bizjak  ---
Just before the patch from Comment #20, the compiler creates (-O2 -mavx):

--cut here--
        vmovdqa .LC1(%rip), %xmm0
        vmovdqa %xmm0, -24(%rsp)
        vmovdqu (%rdi), %xmm0
        vpxor   .LC0(%rip), %xmm0, %xmm0
        vptest  %xmm0, %xmm0
        je      .L5
.L2:
        movl    $1, %eax
        testl   %eax, %eax
        sete    %al
        ret
.L5:
        vmovdqu 16(%rdi), %xmm0
        vpxor   -24(%rsp), %xmm0, %xmm0
        vptest  %xmm0, %xmm0
        jne     .L2
        xorl    %eax, %eax
        testl   %eax, %eax
        sete    %al
        ret
--cut here--

Please note the creative way of returning 0 and 1 ... :

        movl    $1, %eax
        testl   %eax, %eax
        sete    %al
        ret

Even the new code (From comment #20) is unnecessarily convoluted:

.L2:    movl    $1, %eax
        xorl    $1, %eax
        ret
.L5:    xorl    %eax, %eax
        xorl    $1, %eax
        ret
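
To make the point concrete, both idioms just compute the constants 0 and 1 in a roundabout way. The small C model below mimics the two instruction pairs; `test_then_sete` and `xor_one` are illustrative names, and the model only covers the register value, not the other flags.

```c
/* Model of "testl %eax, %eax; sete %al": the low byte becomes 1 when the
   register is zero and 0 otherwise; the upper 24 bits are left alone. */
unsigned test_then_sete(unsigned eax)
{
    unsigned zero_flag = (eax == 0);
    return (eax & ~0xFFu) | zero_flag;
}

/* Model of "xorl $1, %eax". */
unsigned xor_one(unsigned eax)
{
    return eax ^ 1u;
}
```

Feeding in the 1 loaded by `movl $1, %eax` gives 0 from either function, and the zero produced by `xorl %eax, %eax` gives 1, so each return path could load its constant directly.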

[Bug tree-optimization/110452] New: Bad vectorization of invariant masks

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110452

Bug ID: 110452
   Summary: Bad vectorization of invariant masks
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

When we have a loop like

double a[1024], b[1024], c[1024];

void foo (int flag, int n)
{
  _Bool x = flag == 3;
  for (int i = 0; i < n; ++i)
    a[i] = (x ? b[i] : c[i]) * 42.;
}

and build it with -O2 -ftree-vectorize -march=znver4 (to avoid unswitching)
we get

  _55 = _2 ? -1 : 0;
  vect_cst__56 = {_55, _55, _55, _55, _55, _55, _55, _55};

   [local count: 567644343]:
  # i_14 = PHI 
  # vectp_b.10_49 = PHI 
  # vectp_c.13_52 = PHI 
  # vectp_a.18_62 = PHI 
  # ivtmp_65 = PHI 
  vect_iftmp.12_51 = MEM  [(double *)vectp_b.10_49];
  iftmp.0_9 = b[i_14];
  vect_iftmp.15_54 = MEM  [(double *)vectp_c.13_52];
  iftmp.0_8 = c[i_14];
  vect_patt_13.16_59 = VEC_COND_EXPR ;
  iftmp.0_3 = _2 ? iftmp.0_9 : iftmp.0_8;

so the invariant but not constant condition _2 on the COND_EXPR is vectorized
as

  _55 = _2 ? -1 : 0;
  vect_cst__56 = {_55, _55, _55, _55, _55, _55, _55, _55};

unfortunately that leads to very bad generated code

        cmpl    $3, %edi
        sete    %cl
        movl    %ecx, %esi
        leal    (%rsi,%rsi), %eax
        leal    0(,%rsi,4), %r9d
        leal    0(,%rsi,8), %r8d
        orl     %esi, %eax
        orl     %r9d, %eax
        movl    %ecx, %r9d
        orl     %r8d, %eax
        movl    %ecx, %r8d
        sall    $4, %r9d
        sall    $5, %r8d
        sall    $6, %esi
        orl     %r9d, %eax
        orl     %r8d, %eax
        movl    %ecx, %r8d
        orl     %esi, %eax
        sall    $7, %r8d
        orl     %r8d, %eax
        kmovb   %eax, %k1
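
The sequence above appears to build the 8-bit mask by ORing together eight shifted copies of the boolean. In scalar C terms the same mask can be had from a single negation, as the sketch below shows. This only illustrates the arithmetic; it is not a claim about what the vectorizer should emit, and both function names are made up.

```c
#include <stdint.h>

/* Bit-by-bit construction, mirroring the shift-and-OR chain. */
uint8_t mask_by_shifts(int flag)
{
    unsigned x = (flag == 3);
    unsigned m = 0;
    for (int bit = 0; bit < 8; ++bit)
        m |= x << bit;
    return (uint8_t)m;
}

/* Same mask from one negation: 0 - 1 wraps to all ones, 0 - 0 stays zero. */
uint8_t mask_by_negation(int flag)
{
    unsigned x = (flag == 3);
    return (uint8_t)(0u - x);
}
```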

[Bug tree-optimization/110452] Bad vectorization of invariant masks

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110452

Richard Biener  changed:

               What|Removed                       |Added

     Ever confirmed|0                             |1
             Status|UNCONFIRMED                   |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
           Keywords|                              |missed-optimization
   Last reconfirmed|                              |2023-06-28
             Target|                              |x86_64-*-*

[Bug c/110453] New: gcc accepts redefinition of global variable without initializer

2023-06-28 Thread laneast at laneast dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110453

Bug ID: 110453
   Summary: gcc accepts redefinition of global variable without
initializer
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: laneast at laneast dot com
  Target Milestone: ---

gcc accepts the following program:

#include <stdio.h>

int x;
int x = 10;

int main() {
    printf("%d\n", x);

    return 0;
}


Tested gcc version:
* gcc.exe (Rev7, Built by MSYS2 project) 13.1.0
* gcc (Debian 12.2.0-14) 12.2.0

BTW, g++ works fine:
main.c:4:5: error: redefinition of 'int x'
4 | int x = 10;
  | ^
main.c:3:5: note: 'int x' previously declared here
3 | int x;
  | ^

[Bug c/110454] New: [14 Regression] ICE: tree check: expected none of vector_type, have vector_type in convert_argument, at c/c-typeck.cc:3388 with -Wtraditional-conversion

2023-06-28 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110454

Bug ID: 110454
   Summary: [14 Regression] ICE: tree check: expected none of
vector_type, have vector_type in convert_argument, at
c/c-typeck.cc:3388 with -Wtraditional-conversion
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---

Created attachment 55410
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55410&action=edit
reduced testcase

Probably since commit fe48f2651334bc4d96b6df6b2bb6b29fcb732a83

Compiler output:
$ x86_64-pc-linux-gnu-gcc -Wtraditional-conversion testcase.c
testcase.c: In function 'foo':
testcase.c:6:3: internal compiler error: tree check: expected none of
vector_type, have vector_type in convert_argument, at c/c-typeck.cc:3388
6 |   foo (v);
  |   ^~~
0x85e0a3 tree_not_check_failed(tree_node const*, char const*, int, char const*,
...)
/repo/gcc-trunk/gcc/tree.cc:8936
0x6c114e tree_not_check(tree_node*, char const*, int, char const*, tree_code)
/repo/gcc-trunk/gcc/tree.h:3581
0x6c114e convert_argument
/repo/gcc-trunk/gcc/c/c-typeck.cc:3388
0xdde4bc convert_arguments
/repo/gcc-trunk/gcc/c/c-typeck.cc:3743
0xdde4bc build_function_call_vec(unsigned int, vec, tree_node*, vec*, vec*, tree_node*)
/repo/gcc-trunk/gcc/c/c-typeck.cc:3249
0xe05321 c_parser_postfix_expression_after_primary
/repo/gcc-trunk/gcc/c/c-parser.cc:11258
0xdfb83d c_parser_postfix_expression
/repo/gcc-trunk/gcc/c/c-parser.cc:10865
0xdffa6a c_parser_unary_expression
/repo/gcc-trunk/gcc/c/c-parser.cc:8850
0xe0157f c_parser_cast_expression
/repo/gcc-trunk/gcc/c/c-parser.cc:8691
0xe0186f c_parser_binary_expression
/repo/gcc-trunk/gcc/c/c-parser.cc:8459
0xe02cdb c_parser_conditional_expression
/repo/gcc-trunk/gcc/c/c-parser.cc:8157
0xe03494 c_parser_expr_no_commas
/repo/gcc-trunk/gcc/c/c-parser.cc:8071
0xe03751 c_parser_expression
/repo/gcc-trunk/gcc/c/c-parser.cc:11398
0xe03e97 c_parser_expression_conv
/repo/gcc-trunk/gcc/c/c-parser.cc:11438
0xdf8d1f c_parser_statement_after_labels
/repo/gcc-trunk/gcc/c/c-parser.cc:6800
0xdfb32c c_parser_compound_statement_nostart
/repo/gcc-trunk/gcc/c/c-parser.cc:6315
0xe207f4 c_parser_compound_statement
/repo/gcc-trunk/gcc/c/c-parser.cc:6124
0xe22852 c_parser_declaration_or_fndef
/repo/gcc-trunk/gcc/c/c-parser.cc:2844
0xe29f1b c_parser_external_declaration
/repo/gcc-trunk/gcc/c/c-parser.cc:1925
0xe2a8f3 c_parser_translation_unit
/repo/gcc-trunk/gcc/c/c-parser.cc:1779
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-2153-20230628163757-gf3d87219dd5-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-2153-20230628163757-gf3d87219dd5-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20230628 (experimental) (GCC)

[Bug analyzer/110455] New: tree check: expected none of vector_type, have vector_type in get_gassign_result, at analyzer/region-model.cc:870 with -fanalyzer[14 Regression] ICE:

2023-06-28 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110455

Bug ID: 110455
   Summary: tree check: expected none of vector_type, have
vector_type in get_gassign_result, at
analyzer/region-model.cc:870 with -fanalyzer[14
Regression] ICE:
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---

Created attachment 55411
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55411&action=edit
reduced testcase

Probably since commit fe48f2651334bc4d96b6df6b2bb6b29fcb732a83

Compiler output:
$ x86_64-pc-linux-gnu-gcc -fanalyzer testcase.c 
during IPA pass: analyzer
testcase.c: In function 'foo':
testcase.c:6:9: internal compiler error: tree check: expected none of
vector_type, have vector_type in get_gassign_result, at
analyzer/region-model.cc:870
6 |   v | v << 1;
  |   ~~^~~~
0x85e0a3 tree_not_check_failed(tree_node const*, char const*, int, char const*,
...)
/repo/gcc-trunk/gcc/tree.cc:8936
0x893730 tree_not_check(tree_node*, char const*, int, char const*, tree_code)
/repo/gcc-trunk/gcc/tree.h:3581
0x893730 ana::region_model::get_gassign_result(gassign const*,
ana::region_model_context*)
/repo/gcc-trunk/gcc/analyzer/region-model.cc:870
0x1856eac ana::region_model::on_assignment(gassign const*,
ana::region_model_context*)
/repo/gcc-trunk/gcc/analyzer/region-model.cc:1156
0x18257d0 ana::exploded_node::on_stmt(ana::exploded_graph&, ana::supernode
const*, gimple const*, ana::program_state*, ana::uncertainty_t*,
ana::path_context*)
/repo/gcc-trunk/gcc/analyzer/engine.cc:1471
0x182851d ana::exploded_graph::process_node(ana::exploded_node*)
/repo/gcc-trunk/gcc/analyzer/engine.cc:4063
0x18293ba ana::exploded_graph::process_worklist()
/repo/gcc-trunk/gcc/analyzer/engine.cc:3466
0x182ba6f ana::impl_run_checkers(ana::logger*)
/repo/gcc-trunk/gcc/analyzer/engine.cc:6125
0x182c916 ana::run_checkers()
/repo/gcc-trunk/gcc/analyzer/engine.cc:6213
0x181bae8 execute
/repo/gcc-trunk/gcc/analyzer/analyzer-pass.cc:87
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-2153-20230628163757-gf3d87219dd5-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-2153-20230628163757-gf3d87219dd5-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20230628 (experimental) (GCC)

[Bug c/110453] gcc accepts redefinition of global variable without initializer

2023-06-28 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110453

Xi Ruoyao  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID
 CC||xry111 at gcc dot gnu.org

--- Comment #1 from Xi Ruoyao  ---
It's allowed by the C standard, see paragraph 2 and EXAMPLE 1 in section 6.9.2,
C99.  The clause is not changed in C23.
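
A short illustration of the rule being cited: a file-scope declaration without an initializer is a tentative definition in C, and any number of them may coexist with one real definition in the same translation unit. The variable and function names below are invented for the example, and it must be compiled as C, not C++.

```c
int counter;          /* tentative definition */
int counter;          /* a second tentative definition, still valid C */
int counter = 10;     /* the actual definition, because it has an initializer */

/* All three declarations above refer to the same object. */
int read_counter(void)
{
    return counter;
}
```

C++ has no tentative definitions, which is why g++ reports a redefinition for the same source.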

[Bug fortran/49213] [OOP] gfortran rejects structure constructor expression

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49213

--- Comment #36 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:3521768e8e3c448052c5bd3e8fde412e9cf5d70f

commit r14-2160-g3521768e8e3c448052c5bd3e8fde412e9cf5d70f
Author: Paul Thomas 
Date:   Wed Jun 28 12:38:58 2023 +0100

Fortran: Enable class expressions in structure constructors [PR49213]

2023-06-28  Paul Thomas  

gcc/fortran
PR fortran/49213
* expr.cc (gfc_is_ptr_fcn): Remove reference to class_pointer.
* resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow
associate names with pointer function targets to be used in
variable definition context.
* trans-decl.cc (get_symbol_decl): Remove extraneous line.
* trans-expr.cc (alloc_scalar_allocatable_subcomponent): Obtain
size of intrinsic and character expressions.
(gfc_trans_subcomponent_assign): Expand assignment to class
components to include intrinsic and character expressions.

gcc/testsuite/
PR fortran/49213
* gfortran.dg/pr49213.f90: New test.

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #23 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:2aa6135efb2d5fce93578592d91f8ce19a1b983b

commit r13-7493-g2aa6135efb2d5fce93578592d91f8ce19a1b983b
Author: Rainer Orth 
Date:   Thu May 7 13:26:57 2015 +0200

Support parallel testing in libgomp, part I [PR66005]

..., while still hard-coding the number of parallel slots to one.

PR testsuite/66005
libgomp/
* testsuite/Makefile.am (PWD_COMMAND): New variable.
(%/site.exp): New target.
(check_p_numbers0, check_p_numbers1, check_p_numbers2)
(check_p_numbers3, check_p_numbers4, check_p_numbers5)
(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
(check_p_subdirs)
(check_DEJAGNU_libgomp_targets): New variables.
($(check_DEJAGNU_libgomp_targets)): New target.
($(check_DEJAGNU_libgomp_targets)): New dependency.
(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
* testsuite/Makefile.in: Regenerate.
* testsuite/lib/libgomp.exp: For parallel testing,
'load_file ../libgomp-test-support.exp'.

Co-authored-by: Thomas Schwinge 
(cherry picked from commit e797db5c744f7b4e110f23a495fca8e6b8aebe83)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #24 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:3840d5ccf750b6a059258be7faa4a3fce85a6fa6

commit r13-7494-g3840d5ccf750b6a059258be7faa4a3fce85a6fa6
Author: Thomas Schwinge 
Date:   Tue Apr 25 23:53:12 2023 +0200

Support parallel testing in libgomp, part II [PR66005]

..., and enable if 'flock' is available for serializing execution testing.

Regarding the default of 19 parallel slots, this turned out to be a local
minimum for wall time when testing this on:

$ uname -srvi
Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC
2016 x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 32 model name  : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

... in two configurations: case (a) standard configuration, no offloading
configured, case (b) offloading for GCN and nvptx configured but no devices
available.  For both cases, default plus '-m32' variant.

$ \time make check-target-libgomp RUNTESTFLAGS="--target_board=unix\{,-m32\}"

Case (a), baseline:

6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata
505044maxresident)k
6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata
505172maxresident)k

This is what people have been complaining about, rightly so, in
PR66005 "libgomp make check time is excessive" and
elsewhere.

Case (a), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata
505188maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata
505360maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata
505112maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata
505360maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata
505128maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata
505100maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata
505200maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata
505160maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata
505160maxresident)k

Yay!
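
As a quick check on the case (a) figures above: 47:32.28 for the baseline against 5:13.22 at -j19 is roughly a ninefold reduction in wall time. The arithmetic is sketched below; the helper names are made up for illustration.

```c
/* Convert a "minutes:seconds" wall-clock reading into seconds. */
double wall_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* Ratio of baseline wall time to parallelized wall time. */
double speedup(double baseline_s, double parallel_s)
{
    return baseline_s / parallel_s;
}
```

For example, `speedup(wall_seconds(47, 32.28), wall_seconds(5, 13.22))` divides 2852.28 s by 313.22 s.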

Case (b), baseline; 2+ h:

7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata
994264maxresident)k

Case (b), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata
994344maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata
994228maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata
994176maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata
994248maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata
994260maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata
994284maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata
994208maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata
994256maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata
994240maxresident)k

On my Dell Precision 7530 laptop:

$ uname -srvi
Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023
x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
$ nvidia-smi -L
GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)

... in two configurations: case (c) standard configuration, no offloading
configured, case (d) offloading for nvptx configured and device available.
For both cases, only default variant, no '-m32'.

$ \time make check-target-libgomp

Case (c), baseline; roughly half of case (a) (just one variant):

1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata
505148maxresident)k
1

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #25 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:09124b7ed7709721e86556b4083ef40925d7489b

commit r13-7495-g09124b7ed7709721e86556b4083ef40925d7489b
Author: Thomas Schwinge 
Date:   Mon May 15 20:00:07 2023 +0200

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution
testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
> $ uname -srvi
> Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023
x86_64
> $ grep '^model name' < /proc/cpuinfo | uniq -c
>  12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
> $ nvidia-smi -L
> GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

> $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
> 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata
505148maxresident)k
> 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata
505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
> -j12 GCC_TEST_PARALLEL_SLOTS=12
> 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata
505216maxresident)k
> 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata
505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata
505216maxresident)k
2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata
505216maxresident)k

PR testsuite/66005
gcc/
* doc/install.texi: Document (optional) Perl usage for parallel
testing of libgomp.
libgomp/
* testsuite/lib/libgomp.exp: 'flock' through stdout.
* testsuite/flock: New.
* configure.ac (FLOCK): Point to that if no 'flock' available, but
'perl' is.
* configure: Regenerate.

(cherry picked from commit 04abe1944d30eb18a2060cfcd9695d085f7b4752)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #26 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:66df913899d32e7726f986afb61c5c5615eb2a36

commit r12-9737-g66df913899d32e7726f986afb61c5c5615eb2a36
Author: Rainer Orth 
Date:   Thu May 7 13:26:57 2015 +0200

Support parallel testing in libgomp, part I [PR66005]

..., while still hard-coding the number of parallel slots to one.

PR testsuite/66005
libgomp/
* testsuite/Makefile.am (PWD_COMMAND): New variable.
(%/site.exp): New target.
(check_p_numbers0, check_p_numbers1, check_p_numbers2)
(check_p_numbers3, check_p_numbers4, check_p_numbers5)
(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
(check_p_subdirs)
(check_DEJAGNU_libgomp_targets): New variables.
($(check_DEJAGNU_libgomp_targets)): New target.
($(check_DEJAGNU_libgomp_targets)): New dependency.
(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
* testsuite/Makefile.in: Regenerate.
* testsuite/lib/libgomp.exp: For parallel testing,
'load_file ../libgomp-test-support.exp'.

Co-authored-by: Thomas Schwinge 
(cherry picked from commit e797db5c744f7b4e110f23a495fca8e6b8aebe83)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #27 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:5c6515076f2ba55a31149085d3826e975c114fe5

commit r12-9738-g5c6515076f2ba55a31149085d3826e975c114fe5
Author: Thomas Schwinge 
Date:   Tue Apr 25 23:53:12 2023 +0200

Support parallel testing in libgomp, part II [PR66005]

..., and enable if 'flock' is available for serializing execution testing.
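
For readers unfamiliar with the idea, "serializing execution testing" means that tests are built in parallel but run one at a time under a lock. The testsuite itself uses the 'flock' command-line tool; the C below is only a minimal sketch of the same idea using the flock(2) system call, assuming a Linux or BSD-like system. The names run_serialized and add_one and the lock file path are hypothetical and do not come from libgomp.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: call fn(arg) while holding an exclusive advisory
   lock on lock_path, so that concurrent callers using the same path run
   fn one at a time.  Returns fn's result, or -1 if the lock file could
   not be opened or locked (so fn should not itself return -1).  */
int run_serialized(const char *lock_path, int (*fn)(void *), void *arg)
{
    assert(lock_path != NULL && fn != NULL);

    int fd = open(lock_path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* Blocks until no other process holds the lock.  */
    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1;
    }

    int result = fn(arg);

    flock(fd, LOCK_UN);
    close(fd);
    return result;
}

/* Trivial payload standing in for "execute one test".  */
int add_one(void *arg)
{
    int *p = arg;
    return ++*p;
}
```

Calling run_serialized("some.lock", add_one, &n) from several processes would let them compile or prepare concurrently while the payload itself never overlaps.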

Regarding the default of 19 parallel slots, this turned out to be a local
minimum for wall time when testing this on:

$ uname -srvi
Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC
2016 x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 32 model name  : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

... in two configurations: case (a) standard configuration, no offloading
configured, case (b) offloading for GCN and nvptx configured but no devices
available.  For both cases, default plus '-m32' variant.

$ \time make check-target-libgomp
RUNTESTFLAGS="--target_board=unix\{,-m32\}"

Case (a), baseline:

6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata
505044maxresident)k
6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata
505172maxresident)k

This is what people have been complaining about, rightly so, in PR66005
"libgomp make check time is excessive" and elsewhere.

Case (a), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata
505188maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata
505360maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata
505112maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata
505360maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata
505128maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata
505100maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata
505200maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata
505160maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata
505160maxresident)k

Yay!

Case (b), baseline; 2+ h:

7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata
994264maxresident)k

Case (b), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata
994344maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata
994228maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata
994176maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata
994248maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata
994260maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata
994284maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata
994208maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata
994256maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata
994240maxresident)k

On my Dell Precision 7530 laptop:

$ uname -srvi
Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023
x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
$ nvidia-smi -L
GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)

... in two configurations: case (c) standard configuration, no offloading
configured, case (d) offloading for nvptx configured and device available.
For both cases, only default variant, no '-m32'.

$ \time make check-target-libgomp

Case (c), baseline; roughly half of case (a) (just one variant):

1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata
505148maxresident)k

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #28 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:b4561b782427cdfe0fac1a869e79a49187817ffe

commit r12-9739-gb4561b782427cdfe0fac1a869e79a49187817ffe
Author: Thomas Schwinge 
Date:   Mon May 15 20:00:07 2023 +0200

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution
testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
> $ uname -srvi
> Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
> $ grep '^model name' < /proc/cpuinfo | uniq -c
>  12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
> $ nvidia-smi -L
> GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

> $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
> 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
> 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
> -j12 GCC_TEST_PARALLEL_SLOTS=12
> 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
> 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata
505216maxresident)k
2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata
505216maxresident)k

PR testsuite/66005
gcc/
* doc/install.texi: Document (optional) Perl usage for parallel
testing of libgomp.
libgomp/
* testsuite/lib/libgomp.exp: 'flock' through stdout.
* testsuite/flock: New.
* configure.ac (FLOCK): Point to that if no 'flock' available, but
'perl' is.
* configure: Regenerate.

(cherry picked from commit 04abe1944d30eb18a2060cfcd9695d085f7b4752)

[Bug target/108678] Windows on ARM64 platform target aarch64-w64-mingw32

2023-06-28 Thread zac.walker at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108678

--- Comment #8 from Zac Walker  ---
Thanks Jonathan. I am still in test and cleanup mode but hope to start
upstreaming in a few weeks.

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #29 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:e1bd4f5434d7989d723188e9f2b524ce234bc44d

commit r11-10879-ge1bd4f5434d7989d723188e9f2b524ce234bc44d
Author: Rainer Orth 
Date:   Thu May 7 13:26:57 2015 +0200

Support parallel testing in libgomp, part I [PR66005]

..., while still hard-coding the number of parallel slots to one.

PR testsuite/66005
libgomp/
* testsuite/Makefile.am (PWD_COMMAND): New variable.
(%/site.exp): New target.
(check_p_numbers0, check_p_numbers1, check_p_numbers2)
(check_p_numbers3, check_p_numbers4, check_p_numbers5)
(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
(check_p_subdirs)
(check_DEJAGNU_libgomp_targets): New variables.
($(check_DEJAGNU_libgomp_targets)): New target.
($(check_DEJAGNU_libgomp_targets)): New dependency.
(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
* testsuite/Makefile.in: Regenerate.
* testsuite/lib/libgomp.exp: For parallel testing,
'load_file ../libgomp-test-support.exp'.

Co-authored-by: Thomas Schwinge 
(cherry picked from commit e797db5c744f7b4e110f23a495fca8e6b8aebe83)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #30 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:4506b349cf527834239554a03e43ae45237b315c

commit r11-10880-g4506b349cf527834239554a03e43ae45237b315c
Author: Thomas Schwinge 
Date:   Tue Apr 25 23:53:12 2023 +0200

Support parallel testing in libgomp, part II [PR66005]

..., and enable if 'flock' is available for serializing execution testing.

Regarding the default of 19 parallel slots, this turned out to be a local
minimum for wall time when testing this on:

$ uname -srvi
Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC
2016 x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 32 model name  : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

... in two configurations: case (a) standard configuration, no offloading
configured, case (b) offloading for GCN and nvptx configured but no devices
available.  For both cases, default plus '-m32' variant.

$ \time make check-target-libgomp
RUNTESTFLAGS="--target_board=unix\{,-m32\}"

Case (a), baseline:

6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata
505044maxresident)k
6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata
505172maxresident)k

This is what people have been complaining about, rightly so, in PR66005
"libgomp make check time is excessive" and elsewhere.

Case (a), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata
505188maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata
505360maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata
505112maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata
505360maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata
505128maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata
505100maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata
505200maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata
505160maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata
505160maxresident)k

Yay!

Case (b), baseline; 2+ h:

7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata
994264maxresident)k

Case (b), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata
994344maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata
994228maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata
994176maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata
994248maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata
994260maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata
994284maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata
994208maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata
994256maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata
994240maxresident)k

On my Dell Precision 7530 laptop:

$ uname -srvi
Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023
x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
$ nvidia-smi -L
GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)

... in two configurations: case (c) standard configuration, no offloading
configured, case (d) offloading for nvptx configured and device available.
For both cases, only default variant, no '-m32'.

$ \time make check-target-libgomp

Case (c), baseline; roughly half of case (a) (just one variant):

1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata
505148maxresident)k

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #31 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:91955e374e07dc8ee9111eeb49c137c5582ed674

commit r11-10881-g91955e374e07dc8ee9111eeb49c137c5582ed674
Author: Thomas Schwinge 
Date:   Mon May 15 20:00:07 2023 +0200

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution
testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
> $ uname -srvi
> Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
> $ grep '^model name' < /proc/cpuinfo | uniq -c
>  12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
> $ nvidia-smi -L
> GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

> $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
> 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
> 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
> -j12 GCC_TEST_PARALLEL_SLOTS=12
> 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
> 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata
505216maxresident)k
2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata
505216maxresident)k

PR testsuite/66005
gcc/
* doc/install.texi: Document (optional) Perl usage for parallel
testing of libgomp.
libgomp/
* testsuite/lib/libgomp.exp: 'flock' through stdout.
* testsuite/flock: New.
* configure.ac (FLOCK): Point to that if no 'flock' available, but
'perl' is.
* configure: Regenerate.

(cherry picked from commit 04abe1944d30eb18a2060cfcd9695d085f7b4752)

[Bug middle-end/106081] missed vectorization

2023-06-28 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106081

--- Comment #8 from Jan Hubicka  ---
Imagemagick improved by 17% on zen3 and 11% on altra
https://lnt.opensuse.org/db_default/v4/SPEC/37550
https://lnt.opensuse.org/db_default/v4/SPEC/37543
which is cool :)

The loop is now optimized as:

.L2:
        vmovdqu16       (%rax), %zmm0
        vmovupd         (%rdx), %zmm2
        addq            $64, %rax
        subq            $64, %rdx
        vpermpd         %zmm2, %zmm15, %zmm9
        vpermpd         %zmm2, %zmm14, %zmm8
        vpermpd         %zmm2, %zmm13, %zmm7
        vpermpd         %zmm2, %zmm11, %zmm2
        vpshufb         %zmm12, %zmm0, %zmm0
        vpmovsxwd       %ymm0, %zmm1
        vextracti64x4   $0x1, %zmm0, %ymm0
        vpmovsxwd       %ymm0, %zmm0
        vcvtdq2pd       %ymm1, %zmm10
        vextracti32x8   $0x1, %zmm1, %ymm1
        vcvtdq2pd       %ymm1, %zmm1
        vfmadd231pd     %zmm2, %zmm10, %zmm6
        vfmadd231pd     %zmm9, %zmm1, %zmm3
        vcvtdq2pd       %ymm0, %zmm1
        vextracti32x8   $0x1, %zmm0, %ymm0
        vcvtdq2pd       %ymm0, %zmm0
        vfmadd231pd     %zmm8, %zmm1, %zmm5
        vfmadd231pd     %zmm7, %zmm0, %zmm4
        cmpq            %rax, %rcx
        jne             .L2

[Bug c++/110441] c++17: temporary causes static member function call to confuse required copy elision

2023-06-28 Thread gasper.azman at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110441

Gašper Ažman  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gasper.azman at gmail dot com

--- Comment #1 from Gašper Ažman  ---
I hit this in gcc 10 as well when implementing sender/receiver. Was not able to
reduce it this nicely, so I didn't report. Nice work, Eric.

[Bug c++/110441] c++17: temporary causes static member function call to confuse required copy elision

2023-06-28 Thread gasper.azman at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110441

--- Comment #2 from Gašper Ažman  ---
Some more color from twitter, courtesy of @matthewecross:

Interestingly, "return S::f();" and "auto s = S(); return s.f();" both
pass.  It's only when you create a temporary instance of S in the return
statement that it fails.

[Bug target/110456] New: vectorization with loop masking prone to STLF issues

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110456

            Bug ID: 110456
           Summary: vectorization with loop masking prone to STLF issues
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

void __attribute__((noipa))
test (double * __restrict a, double *b, int n, int m)
{
  for (int j = 0; j < m; ++j)
    for (int i = 0; i < n; ++i)
      a[i + j*n] = a[i + j*n /* + 512 */] + b[i + j*n];
}

double a[1024];
double b[1024];

int main(int argc, char **argv)
{
  int m = atoi (argv[1]);
  for (long i = 0; i < 10; ++i)
    test (a + 4, b + 4, 4, m);
}


Shows that when we apply loop masking with --param vect-partial-vector-usage
then masked stores will generally prohibit store-to-load forwarding,
especially when there's only a partial overlap with a following load like
when traversing a multi-dimensional array as above.  The above runs
noticeably slower compared to when the loads are offset
(uncomment the /* + 512 */).

The situation is difficult to avoid in general but there might be easy
heuristics that could be implemented like avoiding loop masking when
there's a read-modify-write operation to the same memory location in
a loop (with or without an immediately visible outer loop).  For
unknown dependences and thus runtime disambiguation a proper distance
of any read/write operation could be ensured as well.

[Bug tree-optimization/110451] LIM fails to hoist comparisons

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110451

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:237e83e2158a3d9b875f8775805d04d97e8b36c1

commit r14-2161-g237e83e2158a3d9b875f8775805d04d97e8b36c1
Author: Richard Biener 
Date:   Wed Jun 28 13:36:59 2023 +0200

tree-optimization/110451 - hoist invariant compare after interchange

The following adjusts the cost model of invariant motion to consider
[VEC_]COND_EXPRs and comparisons producing a data value as expensive.
For 503.bwaves_r this avoids an unnecessarily high vectorization
factor because of an integer comparison besides data operations on
double.

PR tree-optimization/110451
* tree-ssa-loop-im.cc (stmt_cost): [VEC_]COND_EXPR and
tcc_comparison are expensive.

* gfortran.dg/vect/pr110451.f: New testcase.
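
As an illustration of the kind of statement this cost-model change is about, here is a small C sketch (the committed testcase is Fortran and is not reproduced here). The function names are hypothetical. In the first version an integer comparison that produces a data value sits next to operations on double inside the loop even though it is loop-invariant; the second version shows by hand what invariant motion would effectively produce. Whether a given GCC version hoists this particular loop depends on flags and is not claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* As written: the comparison lo < hi is recomputed in every iteration
   although neither operand changes inside the loop.  */
void scale_naive(double *out, const double *in, size_t n, int lo, int hi)
{
    assert(out != NULL && in != NULL);
    for (size_t i = 0; i < n; ++i) {
        int flag = lo < hi;   /* loop-invariant comparison used as data */
        out[i] = in[i] * (double)flag;
    }
}

/* Hand-hoisted equivalent: the comparison and its conversion are
   evaluated once, leaving only double operations in the loop body.  */
void scale_hoisted(double *out, const double *in, size_t n, int lo, int hi)
{
    assert(out != NULL && in != NULL);
    const double factor = (double)(lo < hi);
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] * factor;
}
```

Both functions compute the same result; the difference is only in how much work remains inside the loop for the vectorizer to consider.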

[Bug tree-optimization/110451] LIM fails to hoist comparisons

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110451

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #3 from Richard Biener  ---
Fixed.

[Bug target/110456] vectorization with loop masking prone to STLF issues

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110456

--- Comment #1 from Richard Biener  ---
Correction, the testcase should look like

void __attribute__((noipa))
test (double * __restrict a, double *b, int n, int m)
{
  for (int j = 0; j < m; ++j)
    for (int i = 0; i < n; ++i)
      a[i + j*n] = a[i + j*n /* + 512 */] + b[i + j*n];
}

double a[1024];
double b[1024];

int main(int argc, char **argv)
{
  int m = atoi (argv[1]);
  for (long i = 0; i < m; ++i)
    test (a + 4, b + 4, 4, 1024/4);
  return 0;
}

[Bug driver/110408] gcc 13 crashes with %rename in specs

2023-06-28 Thread brjd_epdjq36 at kygur dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110408

--- Comment #1 from Brjd  ---
Test with specs file



%rename lib old_lib

*lib:
--start-group -lgcc -lc --end-group %(old_lib)





and hello.cpp

g++ hello.cpp -specs=/path-to-specs



g++-13: fatal error: specs file malformed after 36 characters
compilation terminated.

g++-12 no error
g++-4 no error.

[Bug rtl-optimization/110423] Redundant constants not getting eliminated on RISCV.

2023-06-28 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110423

Jeffrey A. Law  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org

--- Comment #2 from Jeffrey A. Law  ---
So there is another broad approach we can take here.

As Vineet mentioned, this isn't really a job for PRE/LCM as those are
formulated around a requirement that they never insert an expression evaluation
in any path that did not have an evaluation before, i.e., no speculative
constant loads.

We could potentially relax that condition.  I'm not sure we'd formulate it as a
PRE/LCM problem, but it gives you a sense of how we could tackle this.  The
difficulty would be in the heuristics for when to apply this transformation
since it will make some codes slower and may increase register pressure.  This
is derived heavily from Click's work in the 90s.  This would happen in gimple
most likely, though I guess one could do it in RTL if they have a high pain
threshold.

The simplest way to think about the placement algorithm is to find the
blocks where all the uses of any given constant C occur.  A trivially correct
placement of the load of that constant would be the entry block as it must
dominate every block in that set.  Of course that would make the placement
quite speculative and lengthen live ranges.  That's usually referred to as an
early placement.

Next find the latest placement for the constant load that covers all the uses. 
That will be the lowest common ancestor in the dominator tree of the set of
blocks that use the constant.

If you were to imagine a path through the dominator tree starting at the early
placement (entry) and ending at the lowest common ancestor, any block on that
path could be selected for generating the constant load and would cover every
use with that single load.  Within the set of blocks on that path, find the set
with the lowest loop nesting, then within that reduced set find those with the
deepest control nesting (or lowest estimated frequency counts).  There may be
more than one block in that final set.  Any are valid and "reasonable" choices.
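
The selection described above can be sketched in C as follows. This is a simplified illustration, not GCC code: struct block and its fields are hypothetical stand-ins for dominator, loop and profile information, and the tie-break uses the "lowest estimated frequency" variant rather than control nesting depth.

```c
#include <assert.h>

/* Hypothetical, simplified per-block data for illustration only.  */
struct block {
    int idom;        /* immediate dominator; the entry block points to itself */
    int dom_depth;   /* depth in the dominator tree, entry block = 0 */
    int loop_depth;  /* loop nesting depth */
    int freq;        /* estimated execution frequency */
};

/* Lowest common ancestor of blocks a and b in the dominator tree.  */
int dom_lca(const struct block *bb, int a, int b)
{
    while (a != b) {
        /* Move the deeper block (or a, on a tie) one step towards entry.  */
        if (bb[a].dom_depth >= bb[b].dom_depth)
            a = bb[a].idom;
        else
            b = bb[b].idom;
    }
    return a;
}

/* Choose one block in which to load a constant used in uses[0..n-1].  */
int place_constant(const struct block *bb, const int *uses, int n)
{
    assert(n > 0);

    /* Latest placement: LCA of all using blocks.  */
    int latest = uses[0];
    for (int i = 1; i < n; ++i)
        latest = dom_lca(bb, latest, uses[i]);

    /* Walk the dominator-tree path from the latest placement up to the
       entry block (the early placement), preferring lower loop depth and
       then lower frequency.  Ties keep the block closer to the uses.  */
    int best = latest;
    for (int cur = latest;; cur = bb[cur].idom) {
        if (bb[cur].loop_depth < bb[best].loop_depth
            || (bb[cur].loop_depth == bb[best].loop_depth
                && bb[cur].freq < bb[best].freq))
            best = cur;
        if (bb[cur].idom == cur)   /* reached the entry block */
            break;
    }
    return best;
}
```

For example, with two uses inside a loop whose header is dominated by a conditionally executed block outside the loop, this picks that outer block: it is outside the loop and runs less often than the entry block.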


Click's paper is much more general, but the same concepts apply.  His paper
doesn't cover anything like bifurcating the graph (thus allowing multiple
constant loads in an effort to reduce undesired speculation or register
allocation conflicts).

We might be able to get away with this precisely because these are constant
loads and thus subject to rematerialization later if register pressure is high.

https://courses.cs.washington.edu/courses/cse501/06wi/reading/click-pldi95.pdf

[Bug tree-optimization/110434] tree-nrv introduces incorrect CLOBBER(eol)

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110434

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:4bf76b5b6db8e68755788ec91012c5a686440720

commit r14-2164-g4bf76b5b6db8e68755788ec91012c5a686440720
Author: Richard Biener 
Date:   Wed Jun 28 11:27:45 2023 +0200

tree-optimization/110434 - avoid <retval> ={v} {CLOBBER} from NRV

When NRV replaces a local variable with <retval> it also replaces
occurrences in clobbers.  This leads to <retval> being clobbered
before the return of it which is strictly invalid but harmless in
practice since there's no pass after NRV which would remove
earlier stores.

The following fixes this nevertheless.

PR tree-optimization/110434
* tree-nrv.cc (pass_nrv::execute): Remove CLOBBERs of
VAR we replace with <retval>.

[Bug middle-end/110443] [14 Regression] ICE on a52dec-0.7.4: GIMPLE pass: vect SIGSEGV in vect_get_gather_scatter_ops()

2023-06-28 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110443

--- Comment #8 from Sergei Trofimovich  ---
I confirm a52dec-0.7.4 is fixed as well now. Thank you!

[Bug tree-optimization/110434] tree-nrv introduces incorrect CLOBBER(eol)

2023-06-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110434

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #2 from Richard Biener  ---
Fixed.  Thanks for reporting.

[Bug debug/110308] [14 Regression] ICE on audiofile-0.3.6: RTL: vartrack: Segmentation fault in mode_to_precision(machine_mode)

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110308

--- Comment #15 from CVS Commits  ---
The master branch has been updated by Philipp Tomsich :

https://gcc.gnu.org/g:893883f2f8f56984209c6ed210ee992ff71a14b0

commit r14-2165-g893883f2f8f56984209c6ed210ee992ff71a14b0
Author: Manolis Tsamis 
Date:   Tue Jun 20 16:23:52 2023 +0200

cprop_hardreg: fix ORIGINAL_REGNO/REG_ATTRS/REG_POINTER handling

Fixes: 6a2e8dcbbd4bab3

Propagation for the stack pointer in regcprop was enabled in
6a2e8dcbbd4bab3, but set ORIGINAL_REGNO/REG_ATTRS/REG_POINTER for
stack_pointer_rtx which caused regressions (e.g., PR 110313, PR 110308).

This fix adds special handling for stack_pointer_rtx in the places
where maybe_mode_change is called. This also adds a check in
maybe_mode_change to return the stack pointer only when the requested
mode matches the mode of stack_pointer_rtx.

PR debug/110308

gcc/ChangeLog:

* regcprop.cc (maybe_mode_change): Check stack_pointer_rtx mode.
(maybe_copy_reg_attrs): New function.
(find_oldest_value_reg): Use maybe_copy_reg_attrs.
(copyprop_hardreg_forward_1): Ditto.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr110308.C: New test.

Signed-off-by: Manolis Tsamis 
Signed-off-by: Philipp Tomsich 

[Bug target/110313] [14 Regression] GCN Fiji reload ICE in 'process_alt_operands'

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313

--- Comment #12 from CVS Commits  ---
The master branch has been updated by Philipp Tomsich :

https://gcc.gnu.org/g:893883f2f8f56984209c6ed210ee992ff71a14b0

commit r14-2165-g893883f2f8f56984209c6ed210ee992ff71a14b0
Author: Manolis Tsamis 
Date:   Tue Jun 20 16:23:52 2023 +0200

cprop_hardreg: fix ORIGINAL_REGNO/REG_ATTRS/REG_POINTER handling

Fixes: 6a2e8dcbbd4bab3

Propagation for the stack pointer in regcprop was enabled in
6a2e8dcbbd4bab3, but set ORIGINAL_REGNO/REG_ATTRS/REG_POINTER for
stack_pointer_rtx which caused regressions (e.g., PR 110313, PR 110308).

This fix adds special handling for stack_pointer_rtx in the places
where maybe_mode_change is called. This also adds a check in
maybe_mode_change to return the stack pointer only when the requested
mode matches the mode of stack_pointer_rtx.

PR debug/110308

gcc/ChangeLog:

* regcprop.cc (maybe_mode_change): Check stack_pointer_rtx mode.
(maybe_copy_reg_attrs): New function.
(find_oldest_value_reg): Use maybe_copy_reg_attrs.
(copyprop_hardreg_forward_1): Ditto.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr110308.C: New test.

Signed-off-by: Manolis Tsamis 
Signed-off-by: Philipp Tomsich 

[Bug c++/110441] c++17: temporary causes static member function call to confuse required copy elision

2023-06-28 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110441

Patrick Palka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-06-28
                 CC|                            |jason at gcc dot gnu.org,
                   |                            |ppalka at gcc dot gnu.org

--- Comment #3 from Patrick Palka  ---
Confirmed, this never worked.  The problem seems to be that because f is
static, 'S().f()' is represented as a COMPOUND_EXPR that evaluates the
otherwise unused object argument S() followed by a TARGET_EXPR for S::f().  And
this COMPOUND_EXPR foils the copy elision check in build_special_member_call
which looks only for an outermost TARGET_EXPR and doesn't look through
COMPOUND_EXPR.

In contrast, '(S(), S::f())' (which should be equivalent) is represented as a
TARGET_EXPR of a COMPOUND_EXPR rather than a COMPOUND_EXPR of a TARGET_EXPR,
and so copy elision is correctly avoided.  So perhaps we could make
keep_unused_object_arg for a TARGET_EXPR result place the COMPOUND_EXPR inside
the TARGET_EXPR_INITIAL instead of around the TARGET_EXPR?

[Bug c++/110441] c++17: temporary causes static member function call to confuse required copy elision

2023-06-28 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110441

--- Comment #4 from Patrick Palka  ---
(In reply to Patrick Palka from comment #3)
> In contrast, '(S(), S::f())' (which should be equivalent) is represented as
> a TARGET_EXPR of a COMPOUND_EXPR rather than a COMPOUND_EXPR of a
> TARGET_EXPR, and so copy elision is correctly avoided.
oops, this should say "is correctly _performed_"

[Bug target/110457] New: Unnecessary movsx eax, dil

2023-06-28 Thread antoshkka at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110457

            Bug ID: 110457
           Summary: Unnecessary movsx   eax, dil
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: antoshkka at gmail dot com
  Target Milestone: ---

For the following code

int sample1(char c) {
  return (c << 4) + (c << 8) + (c << 16) + c;
}


GCC-14 with -O2 generates the assembly:

sample1(char):
 movsx  eax,dil
 imul   eax,eax,0x10111
 ret


However, it could be shortened to just:

sample1(char):
 imul   eax,edi,0x10111


Godbolt playground: https://godbolt.org/z/7GGdedEY8
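
As a side note on where the constant comes from: 1 + (1 << 4) + (1 << 8) + (1 << 16) is 0x10111, so the four terms collapse into one multiplication. The sketch below checks that identity. sample1_mul and forms_agree are hypothetical helper names, and the loop covers only non-negative inputs because left-shifting a negative value was not well defined before C++20.

```cpp
#include <cassert>

// The constant in the imul is the sum of the four shift weights.
static_assert(1 + (1 << 4) + (1 << 8) + (1 << 16) == 0x10111,
              "shift weights add up to the imul constant");

// Same expression as sample1 in the report.
constexpr int sample1(char c) {
  return (c << 4) + (c << 8) + (c << 16) + c;
}

// Hypothetical helper: the single-multiply form.
constexpr int sample1_mul(char c) {
  return static_cast<int>(c) * 0x10111;
}

// Compare both forms for every non-negative char value.
inline bool forms_agree() {
  for (int v = 0; v <= 127; ++v) {
    char c = static_cast<char>(v);
    if (sample1(c) != sample1_mul(c)) return false;
  }
  return true;
}

inline void check_forms() { assert(forms_agree()); }
```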

[Bug target/110457] Unnecessary movsx eax, dil

2023-06-28 Thread antoshkka at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110457

--- Comment #1 from Antony Polukhin  ---
> However, it could be shortened to just:

sample1(char):
 imul   eax,edi,0x10111
 ret; missed in previous message

[Bug c/110458] New: -Warray-bounds=2 new false positive

2023-06-28 Thread sirl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110458

Bug ID: 110458
   Summary: -Warray-bounds=2 new false positive
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sirl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 55412
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55412&action=edit
testcase

Since somewhere between r14-1870 and r14-2097 a new -Warray-bounds=2 false
positive is shown for this little testcase:

typedef struct {
  unsigned arr1[4];
  unsigned arr2[4];
} data;

void f_notok(void *arrayOut) {
  int i;
  unsigned *arr2;
  unsigned *arr1;
  data *dataOut;
  dataOut = (data *)arrayOut;
  arr1 = dataOut[0].arr1;
  arr2 = dataOut[0].arr2;
  i = 0;
  for (; i < 4; i++) {
    arr1[i] = 0;
    arr2[i] = 0;
  }
}

When compiled with trunk@r2097 "gcc -O2 -W -Wall -Warray-bounds=2 -c
bug-Warray-bounds-eq-2.c" the warning is:

bug-Warray-bounds-eq-2.c: In function 'f_notok':
bug-Warray-bounds-eq-2.c:16:13: warning: '__builtin_memset' offset [16, 31]
from the object at 'arrayOut' is out of the bounds of referenced subobject
'arr1' with type 'unsigned int[4]' at offset 0 [-Warray-bounds=]
   16 | arr1[i] = 0;
  | ^~~
bug-Warray-bounds-eq-2.c:2:12: note: subobject 'arr1' declared here
2 |   unsigned arr1[4];
  |^~~~

gcc-13.1.1 and earlier didn't warn here. The attached full testcase also shows
that slight variations in the code silence the warning.
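
A small sketch that may help when reading the warning text: assuming unsigned is 4 bytes wide, arr1 occupies bytes [0, 15] of the struct and arr2 occupies bytes [16, 31], which is the range the warning reports. That is consistent with (but does not prove) the two stores in the loop being merged into one __builtin_memset that starts at arr1. The byte_range type and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as the testcase, written as a named struct.
struct data {
  unsigned arr1[4];
  unsigned arr2[4];
};

// Hypothetical helper type: inclusive byte range of a member.
struct byte_range {
  std::size_t first;
  std::size_t last;
};

inline byte_range range_of_arr1() {
  return {offsetof(data, arr1),
          offsetof(data, arr1) + sizeof(data::arr1) - 1};
}

inline byte_range range_of_arr2() {
  return {offsetof(data, arr2),
          offsetof(data, arr2) + sizeof(data::arr2) - 1};
}

// Only meaningful where unsigned is 4 bytes (an assumption).
inline void check_ranges() {
  if (sizeof(unsigned) == 4) {
    assert(range_of_arr1().first == 0 && range_of_arr1().last == 15);
    assert(range_of_arr2().first == 16 && range_of_arr2().last == 31);
  }
}
```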

[Bug tree-optimization/110459] New: Trivial on stack variable was not optimized away

2023-06-28 Thread antoshkka at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110459

Bug ID: 110459
   Summary: Trivial on stack variable was not optimized away
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: antoshkka at gmail dot com
  Target Milestone: ---

Consider the example:

struct array {
char data[4];
};

auto sample2(char c) {
  array buffer = {c, 0, 0, 0};
  return buffer;
}


With GCC-14 and -O2 it produces the following assembly:

sample2(char):
xor eax, eax
mov BYTE PTR [rsp-22], 0
mov WORD PTR [rsp-24], ax
mov eax, DWORD PTR [rsp-24]
sal eax, 8
mov al, dil
ret


It could be further optimized to just:

sample2(char):
movzx   eax, dil
ret


Godbolt playground: https://godbolt.org/z/nxKhvo3ns
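
To illustrate why the shorter sequence is equivalent, the sketch below compares the four bytes of the returned struct with the zero-extended input byte. It assumes a little-endian target where the 4-byte struct is returned in a 32-bit register; bits_of, zero_extended and is_little_endian are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct array {
  char data[4];
};

// Same function as in the report, with an explicit return type.
inline array sample2(char c) {
  array buffer = {c, 0, 0, 0};
  return buffer;
}

// Hypothetical helper: view the struct's 4 bytes as one 32-bit value.
inline std::uint32_t bits_of(const array& a) {
  static_assert(sizeof(array) == sizeof(std::uint32_t), "array is 4 bytes");
  std::uint32_t v;
  std::memcpy(&v, &a, sizeof(v));
  return v;
}

// What "movzx eax, dil" computes: the low byte, zero-extended.
inline std::uint32_t zero_extended(char c) {
  return static_cast<unsigned char>(c);
}

// Runtime endianness probe, so the check is skipped on big-endian hosts.
inline bool is_little_endian() {
  std::uint32_t one = 1;
  unsigned char first;
  std::memcpy(&first, &one, 1);
  return first == 1;
}

inline void check_sample2() {
  if (!is_little_endian()) return;
  for (int v = -128; v <= 127; ++v) {
    char c = static_cast<char>(v);
    assert(bits_of(sample2(c)) == zero_extended(c));
  }
}
```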

[Bug c++/110441] c++17: temporary causes static member function call to confuse required copy elision

2023-06-28 Thread matt.cross+gcc-bugzilla at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110441

Matt Cross  changed:

   What|Removed |Added

 CC||matt.cross+gcc-bugzilla@gmail.com

--- Comment #5 from Matt Cross  ---

I have also found that
* Making the function f() non-static works.  https://godbolt.org/z/jn6Ms1n5h
* Making a unique_ptr to an S fails: "auto sp = std::make_unique(); return
sp->f();"  https://godbolt.org/z/85e9MW91b

I suspect it is the same root cause, but just in case there's wrinkles here I
thought these additional test cases might be helpful.

[Bug tree-optimization/110460] New: [14 Regression] ft32 ICE on 931110-1.c with new TYPE_PRECISION checking

2023-06-28 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110460

Bug ID: 110460
   Summary: [14 Regression] ft32 ICE on 931110-1.c with new
TYPE_PRECISION checking
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: law at gcc dot gnu.org
  Target Milestone: ---

commit fe48f2651334bc4d96b6df6b2bb6b29fcb732a83
Author: Richard Biener 
Date:   Fri Jun 9 09:31:14 2023 +0200

Prevent TYPE_PRECISION on VECTOR_TYPEs

The following makes sure that using TYPE_PRECISION on VECTOR_TYPE
ICEs when tree checking is enabled.  This should avoid wrong-code
in cases like PR110182 and instead ICE.

It also introduces a TYPE_PRECISION_RAW accessor and adjusts
places I found that are eligible to use that.

* tree.h (TYPE_PRECISION): Check for non-VECTOR_TYPE.
(TYPE_PRECISION_RAW): Provide raw access to the precision
field.
* tree.cc (verify_type_variant): Compare TYPE_PRECISION_RAW.
(gimple_canonical_types_compatible_p): Likewise.
* tree-streamer-out.cc (pack_ts_type_common_value_fields):
Stream TYPE_PRECISION_RAW.
* tree-streamer-in.cc (unpack_ts_type_common_value_fields):
Likewise.
* lto-streamer-out.cc (hash_tree): Hash TYPE_PRECISION_RAW.

gcc/lto/
* lto-common.cc (compare_tree_sccs_1): Use TYPE_PRECISION_RAW.


One example on ft32-elf:

Tests that now fail, but worked before (13 tests):

ft32-sim: gcc.c-torture/execute/931110-1.c   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess
errors)
ft32-sim: gcc.c-torture/execute/931110-1.c   -O3 -g  (test for excess errors)
ft32-sim: gcc.dg/pr108095.c (test for excess errors)

And if you dig into the 931110-1.c failure you find:
ft32-sim: gcc.c-torture/execute/931110-1.c   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  (internal compiler
error: tree check: expected none of vector_type, have vector_type in
type_has_mode_precision_p, at tree.h:6644)
ft32-sim: gcc.c-torture/execute/931110-1.c   -O3 -g  (internal compiler error:
tree check: expected none of vector_type, have vector_type in
type_has_mode_precision_p, at tree.h:6644)

It looks like SCALAR_DEST in vectorizable_operation is actually a vector type
-- meaning that STMT was already vectorized.

This is the patch I'm testing.  There are other failures that don't seem to be
fixed by this patch.  Anyway, the whole point of the change is to find these
lurking bugs.

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d642d3c257f..3dd8a284577 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -6481,6 +6481,10 @@ vectorizable_operation (vec_info *vinfo,
   scalar_dest = gimple_assign_lhs (stmt);
   vectype_out = STMT_VINFO_VECTYPE (stmt_info);

+  /* STMT may have already been vectorized.  */
+  if (VECTOR_TYPE_P (TREE_TYPE (scalar_dest)))
+    return false;
+
   /* Most operations cannot handle bit-precision types without extra
  truncations.  */
   bool mask_op_p = VECTOR_BOOLEAN_TYPE_P (vectype_out);

[Bug target/109780] [12/13/14 Regression] csmith: runtime crash with -O2 -march=znver1

2023-06-28 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780

--- Comment #19 from H.J. Lu  ---
(In reply to Xi Ruoyao from comment #17)
> (In reply to H.J. Lu from comment #16)
> > Created attachment 55409 [details]
> > A patch
> > 
> > I am stilling trying to find a small testcase.
> 
> The patch triggers an ICE building Spidermonkey 115b9 (it segfaults with GCC
> trunk because of some unaligned vmovdqa):
> 
> 0x93297b ix86_finalize_stack_frame_flags
>   ../../gcc/gcc/config/i386/i386.cc:8224
> 0x162064c ix86_expand_epilogue(int)
>   ../../gcc/gcc/config/i386/i386.cc:9405
> 0x1b2e27f gen_epilogue()
>   ../../gcc/gcc/config/i386/i386.md:17517
> 0x160a815 target_gen_epilogue
>   ../../gcc/gcc/config/i386/i386.md:17013
> 0xf15e86 make_epilogue_seq
>   ../../gcc/gcc/function.cc:5964
> 0xf15f8b thread_prologue_and_epilogue_insns()
>   ../../gcc/gcc/function.cc:6046
> 0xf166c2 rest_of_handle_thread_prologue_and_epilogue
>   ../../gcc/gcc/function.cc:6544
> 0xf166c2 execute
>   ../../gcc/gcc/function.cc:6625
> 
> The code at i386.cc:8224 reads:
> 
>   if (crtl->stack_realign_finalized)
> {
>   /* After stack_realign_needed is finalized, we can't no longer
>  change it.  */
>   gcc_assert (crtl->stack_realign_needed == stack_realign);
>   return;
> }
> 
> I'm not sure if the assert should be dropped or it's more difficult.
> 
> Or can we just force to use unaligned vector moves for block operations
> until we can find a better solution?  It's at least better than leaving the
> vectorized block moving broken and forcing people trying to disable the
> feature.

Do you have a testcase?

[Bug target/109780] [12/13/14 Regression] csmith: runtime crash with -O2 -march=znver1

2023-06-28 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780

--- Comment #20 from Xi Ruoyao  ---
(In reply to H.J. Lu from comment #19)
> (In reply to Xi Ruoyao from comment #17)
> > (In reply to H.J. Lu from comment #16)
> > > Created attachment 55409 [details]
> > > A patch
> > > 
> > > I am stilling trying to find a small testcase.
> > 
> > The patch triggers an ICE building Spidermonkey 115b9 (it segfaults with GCC
> > trunk because of some unaligned vmovdqa):
> > 
> > 0x93297b ix86_finalize_stack_frame_flags
> > ../../gcc/gcc/config/i386/i386.cc:8224
> > 0x162064c ix86_expand_epilogue(int)
> > ../../gcc/gcc/config/i386/i386.cc:9405
> > 0x1b2e27f gen_epilogue()
> > ../../gcc/gcc/config/i386/i386.md:17517
> > 0x160a815 target_gen_epilogue
> > ../../gcc/gcc/config/i386/i386.md:17013
> > 0xf15e86 make_epilogue_seq
> > ../../gcc/gcc/function.cc:5964
> > 0xf15f8b thread_prologue_and_epilogue_insns()
> > ../../gcc/gcc/function.cc:6046
> > 0xf166c2 rest_of_handle_thread_prologue_and_epilogue
> > ../../gcc/gcc/function.cc:6544
> > 0xf166c2 execute
> > ../../gcc/gcc/function.cc:6625
> > 
> > The code at i386.cc:8224 reads:
> > 
> >   if (crtl->stack_realign_finalized)
> > {
> >   /* After stack_realign_needed is finalized, we can't no longer
> >  change it.  */
> >   gcc_assert (crtl->stack_realign_needed == stack_realign);
> >   return;
> > }
> > 
> > I'm not sure if the assert should be dropped or it's more difficult.
> > 
> > Or can we just force to use unaligned vector moves for block operations
> > until we can find a better solution?  It's at least better than leaving the
> > vectorized block moving broken and forcing people trying to disable the
> > feature.
> 
> Do you have a testcase?

It's too large and I'm running cvise on it.

[Bug c++/110175] [GCC][Crash] GCC Crash on valid code

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110175

--- Comment #3 from CVS Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:4de22e25918f6fe40184c444ba6d81b19b806e26

commit r14-2167-g4de22e25918f6fe40184c444ba6d81b19b806e26
Author: Marek Polacek 
Date:   Thu Jun 8 14:07:44 2023 -0400

c++: fix error reporting routines re-entered ICE [PR110175]

Here we get the "error reporting routines re-entered" ICE because
of an unguarded use of warning_at.  While at it, I added a check
for a warning_at just above it.

PR c++/110175

gcc/cp/ChangeLog:

* typeck.cc (cp_build_unary_op): Check tf_warning before warning.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype-110175.C: New test.

[Bug c++/110175] [GCC][Crash] GCC Crash on valid code

2023-06-28 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110175

Marek Polacek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Marek Polacek  ---
Fixed in GCC 14.

[Bug d/110193] d_signed_or_unsigned_type is invoked for vector types

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110193

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Iain Buclaw :

https://gcc.gnu.org/g:9757e4440bd8755d327601a60a73d57d712583ed

commit r14-2168-g9757e4440bd8755d327601a60a73d57d712583ed
Author: Iain Buclaw 
Date:   Wed Jun 28 17:38:16 2023 +0200

d: Fix d_signed_or_unsigned_type is invoked for vector types (PR110193)

This function can be invoked on VECTOR_TYPE, but the implementation
assumes it works on integer types only.  To fix, added a check whether
the type passed is any `__vector(T)' or non-integral type, and return
early by calling `signed_or_unsigned_type_for()' instead.

Problem was found by instrumenting TYPE_PRECISION and ICEing when
applied on VECTOR_TYPEs.

PR d/110193

gcc/d/ChangeLog:

* types.cc (d_signed_or_unsigned_type): Handle being called with
any vector or non-integral type.

[Bug analyzer/110426] Missing buffer overflow warning with function pointer that has the alloc_size attribute

2023-06-28 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110426

David Malcolm  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-06-28

--- Comment #1 from David Malcolm  ---
Thanks for filing this; confirmed.

The above reproducer on Compiler Explorer (with x86_64 trunk) is:
  https://godbolt.org/z/Yq5YrhWPa

[Bug target/110457] Unnecessary movsx eax, dil

2023-06-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110457

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ABI

--- Comment #2 from Andrew Pinski  ---
Why do you think that?
Iirc the x86_64 abi says the upper bits are undefined.

[Bug target/110457] Unnecessary movsx eax, dil

2023-06-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110457

Andrew Pinski  changed:

   What|Removed |Added

 Target||X86_64

--- Comment #3 from Andrew Pinski  ---
Note, IIRC there is still a disagreement about what the ABI says. LLVM folks say
it says one thing, while GCC folks have said it leaves the upper bits undefined.
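
A sketch of why the movsx matters under the reading where the upper bits of the incoming register are unspecified. It models the 32-bit register with unsigned arithmetic; with_movsx and without_movsx are hypothetical names, and the sketch does not settle which reading of the ABI is right.

```cpp
#include <cassert>
#include <cstdint>

// Models "movsx eax, dil; imul eax, eax, 0x10111":
// only the low 8 bits of the incoming register are trusted.
inline std::uint32_t with_movsx(std::uint32_t edi) {
  std::uint32_t low = edi & 0xFFu;
  // Sign-extend the low byte without implementation-defined casts.
  std::int32_t c = static_cast<std::int32_t>(low ^ 0x80u) - 0x80;
  return static_cast<std::uint32_t>(c) * 0x10111u;
}

// Models "imul eax, edi, 0x10111": all 32 bits of the register are used.
inline std::uint32_t without_movsx(std::uint32_t edi) {
  return edi * 0x10111u;
}

inline void check_upper_bits() {
  // Caller already sign-extended the argument: both sequences agree.
  assert(with_movsx(0x00000041u) == without_movsx(0x00000041u));
  assert(with_movsx(0xFFFFFF80u) == without_movsx(0xFFFFFF80u));
  // Caller left garbage above the low byte: only the movsx form is stable.
  assert(with_movsx(0xABCDEF41u) == with_movsx(0x00000041u));
  assert(without_movsx(0xABCDEF41u) != with_movsx(0x00000041u));
}
```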

[Bug d/110193] d_signed_or_unsigned_type is invoked for vector types

2023-06-28 Thread ibuclaw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110193

ibuclaw at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from ibuclaw at gcc dot gnu.org ---
Fix committed.

[Bug target/110457] Unnecessary movsx eax, dil

2023-06-28 Thread antoshkka at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110457

--- Comment #4 from Antony Polukhin  ---
Oh, if there's a disagreement I'm fine with closing this issue as
invalid/later/won't_fix

[Bug target/110406] d: Wrong code-gen returning POD structs by value

2023-06-28 Thread ibuclaw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110406

ibuclaw at gcc dot gnu.org changed:

   What|Removed |Added

   Last reconfirmed||2023-06-28
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ibuclaw at gcc dot gnu.org

--- Comment #13 from ibuclaw at gcc dot gnu.org ---
Created attachment 55413
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55413&action=edit
delay calling compute_record_mode until all fields complete

Attaching the full patch that was being tested, related to the above snippet.

[Bug fortran/99065] ASSOCIATE function selector expression "no IMPLICIT type" failure

2023-06-28 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99065

Paul Thomas  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Paul Thomas  ---
I actually worked on this one first but have to give the prior credit to Ian
Harvey.

Paul

*** This bug has been marked as a duplicate of bug 89645 ***

[Bug fortran/87477] [meta-bug] [F03] issues concerning the ASSOCIATE statement

2023-06-28 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87477
Bug 87477 depends on bug 99065, which changed state.

Bug 99065 Summary: ASSOCIATE function selector expression "no IMPLICIT type" 
failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99065

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

[Bug fortran/89645] No IMPLICIT type error with: ASSOCIATE( X => function() )

2023-06-28 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89645

--- Comment #3 from Paul Thomas  ---
*** Bug 99065 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/110461] New: [14 regression] ICE when building openh264 with new vector_type checking

2023-06-28 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110461

Bug ID: 110461
   Summary: [14 regression] ICE when building openh264 with new
vector_type checking
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

Created attachment 55414
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55414&action=edit
encode_mb_aux.ii

g++ -c encode_mb_aux.ii -O2 is enough to repro.

```
aarch64-unknown-linux-gnu-g++ -O3 -pipe -mcpu=native -fdiagnostics-color=always
 -DNDEBUG -DHAVE_NEON_AARCH64 -Wall -fno-strict-aliasing -fPIC -MMD -MP
-fstack-protector-all -march=armv8-a -DGENERATED_VERSION_HEADER -O3 -pipe
-mcpu=native -fdiagnostics-color=always -Wno-class-memaccess -I./codec/api/wels
-I./codec/common/inc -Icodec/common/inc  -I./codec/encoder/core/inc
-I./codec/encoder/plus/inc -I./codec/processing/interface -c -o
codec/encoder/core/src/encode_mb_aux.o codec/encoder/core/src/encode_mb_aux.cpp
during GIMPLE pass: vect
codec/encoder/core/src/encode_mb_aux.cpp: In function ‘void
WelsEnc::WelsQuantFour4x4Max_c(int16_t*, const int16_t*, const int16_t*,
int16_t*)’:
codec/encoder/core/src/encode_mb_aux.cpp:209:6: internal compiler error: tree
check: expected none of vector_type, have vector_type in gimple_simplify_144,
at gimple-match-3.cc:1027
  209 | void WelsQuantFour4x4Max_c (int16_t* pDct, const int16_t* pFF, const
int16_t* pMF, int16_t* pMax) {
  |  ^
0xc959b02b tree_not_check_failed(tree_node const*, char const*, int, char const*, ...)
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree.cc:8936
0xcb1c6c53 tree_not_check(tree_node*, char const*, int, char const*, tree_code)
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree.h:3581
0xcb1c6c53 gimple_simplify_144(gimple_match_op*, gimple**, tree_node* (*)(tree_node*), tree_node*, tree_node**, tree_code)
        /usr/src/debug/sys-devel/gcc-14.0.0./build/gcc/gimple-match-3.cc:1027
0xcb1a21bb gimple_simplify_BIT_XOR_EXPR(gimple_match_op*, gimple**, tree_node* (*)(tree_node*), code_helper, tree_node*, tree_node*, tree_node*)
        /usr/src/debug/sys-devel/gcc-14.0.0./build/gcc/gimple-match-2.cc:9569
0xca74debf gimple_resimplify2
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/gimple-match-exports.cc:967
0xca74e687 gimple_simplify(gimple*, gimple_match_op*, gimple**, tree_node* (*)(tree_node*), tree_node* (*)(tree_node*))
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/gimple-match-exports.cc:834
0xc9c8e5b7 gimple_fold_stmt_to_constant_1(gimple*, tree_node* (*)(tree_node*), tree_node* (*)(tree_node*))
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/gimple-fold.cc:7472
0xca2cfe97 try_to_simplify
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree-ssa-sccvn.cc:6096
0xca2cfe97 visit_stmt
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree-ssa-sccvn.cc:6139
0xca2d0e6f process_bb
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree-ssa-sccvn.cc:7945
0xca2d2823 do_rpo_vn_1
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree-ssa-sccvn.cc:8544
0xca2d448f do_rpo_vn(function*, edge_def*, bitmap_head*, bool, bool, vn_lookup_kind)
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree-ssa-sccvn.cc:8646
0xca3ba1f3 execute
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree-vectorizer.cc:1385
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
```

```
gcc (Gentoo 14.0.0 p, commit 6cb33e2f39e289ec4f25f845d8153053147c5c49) 14.0.0
20230628 (experimental) c7e87e82435b918084f305386b12b8fbcdcf3307
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```

[Bug tree-optimization/103680] Jump threading and switch corrupts profile

2023-06-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103680

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||25623

--- Comment #8 from Andrew Pinski  ---
(In reply to Jan Hubicka from comment #7)
> a simple testcase:
> test (int i)
> {
> if (__builtin_expect_with_probability (i > 5, 1, 0.6))
> foo ();
> }
> test2(int i)
> {
> test (i);
> if (__builtin_expect_with_probability (i > 4, 1, 0.7))
> foo ();
> }
> this is can be updated quite easily, but we still fail.

That is exactly bug 25623 comment #1.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25623
[Bug 25623] jump threading/cfg cleanup messes up "incoming counts" for some BBs

[Bug tree-optimization/110461] [14 regression] ICE when building openh264 with new vector_type checking

2023-06-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110461

--- Comment #1 from Andrew Pinski  ---
I think it is:
/* Try to fold (type) X op CST -> (type) (X op ((type-x) CST))
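
For context, a scalar illustration of the rewrite that the quoted pattern comment describes, (type) X op CST -> (type) (X op ((type-x) CST)), using XOR with int16_t as X's type and int as the wider type. This only shows the arithmetic identity for scalars. It says nothing about the vector-typed operand that triggers the ICE, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side: convert X to the wider type, then apply the op.
constexpr int op_after_convert(std::int16_t x, std::int16_t cst) {
  return static_cast<int>(x) ^ static_cast<int>(cst);
}

// Right-hand side: apply the op in X's own type, then convert.
constexpr int op_before_convert(std::int16_t x, std::int16_t cst) {
  return static_cast<int>(static_cast<std::int16_t>(x ^ cst));
}

// Check the identity over every int16_t value of X for a few constants.
inline bool fold_is_value_preserving() {
  const std::int16_t constants[] = {0, 1, 0x55, 0x7FFF, -1, -32768};
  for (std::int16_t cst : constants) {
    for (int v = -32768; v <= 32767; ++v) {
      std::int16_t x = static_cast<std::int16_t>(v);
      if (op_after_convert(x, cst) != op_before_convert(x, cst)) return false;
    }
  }
  return true;
}

inline void check_fold() { assert(fold_is_value_preserving()); }
```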

[Bug tree-optimization/110461] [14 regression] ICE when building openh264 with new vector_type checking

2023-06-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110461

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Keywords||ice-on-valid-code

[Bug libstdc++/110462] New: [14 regression] Build failure with musl-1.2.4 (filesystem/ops-common.h:377:5: error: 'off64_t' was not declared in this scope; did you mean 'off_t'?)

2023-06-28 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110462

Bug ID: 110462
   Summary: [14 regression] Build failure with musl-1.2.4
(filesystem/ops-common.h:377:5: error: 'off64_t' was
not declared in this scope; did you mean 'off_t'?)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

See PR109533. Pretty sure this was caused by r14-1569-gd87caacf8e2df5.

```
In file included from
/var/tmp/portage/sys-devel/gcc-14.0.0./work/gcc-14.0.0./libstdc++-v3/src/c++17/fs_ops.cc:63:
/var/tmp/portage/sys-devel/gcc-14.0.0./work/gcc-14.0.0./libstdc++-v3/src/c++17/../filesystem/ops-common.h:
In function 'bool std::filesystem::copy_file_copy_file_range(int, int,
std::size_t)':
/var/tmp/portage/sys-devel/gcc-14.0.0./work/gcc-14.0.0./libstdc++-v3/src/c++17/../filesystem/ops-common.h:377:5:
error: 'off64_t' was not declared in this scope; did you mean 'off_t'?
  377 | off64_t off_in = 0, off_out = 0;
  | ^~~
  | off_t
/var/tmp/portage/sys-devel/gcc-14.0.0./work/gcc-14.0.0./libstdc++-v3/src/c++17/../filesystem/ops-common.h:381:50:
error: 'off_in' was not declared in this scope; did you mean 'fd_in'?
  381 | bytes_copied = ::copy_file_range(fd_in, &off_in, fd_out,
&off_out,
  |  ^~
  |  fd_in
/var/tmp/portage/sys-devel/gcc-14.0.0./work/gcc-14.0.0./libstdc++-v3/src/c++17/../filesystem/ops-common.h:381:67:
error: 'off_out' was not declared in this scope; did you mean 'fd_out'?
  381 | bytes_copied = ::copy_file_range(fd_in, &off_in, fd_out,
&off_out,
  |  
^~~
  |  
fd_out
make[6]: *** [Makefile:587: fs_ops.lo] Error 1
```
