[Bug ipa/103405] [12 Regression] c67005c FAILs with -fipa-modref

2021-11-25 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103405

--- Comment #9 from Jan Hubicka  ---
Fixed by g:16e85390507ea92331c9052393b591202007f5ab (forgot to add PR marker)

[Bug tree-optimization/103409] [12 Regression] 18% WRF compile-time regression with -O2 -flto between g:264f061997c0a534 and g:3e09331f6aeaf595

2021-11-25 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409

--- Comment #2 from hubicka at kam dot mff.cuni.cz ---
> The two main changes during that time period was jump threading and modref.
> modref seems might be more likely with wrf being fortran code and even using
> nested functions and such.

Yep, I think both are possible.  There was also a change enabling ipa-sra
on fortran.

There was yet another regression in wrf earlier that I think was related
to improving flags propagation.  I think those are not caused by modref
itself, but by it triggering some other pass.  I will try to look up the
regression range (it was before ranger got in).

[Bug tree-optimization/103423] New: 19% cpu2006 wrf compile time regression with -flto between g:0b7a11874d4eb428 and g:704e8a825c78b9a8

2021-11-25 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103423

Bug ID: 103423
   Summary: 19% cpu2006 wrf compile time regression with -flto
between g:0b7a11874d4eb428 and g:704e8a825c78b9a8
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

This is visible at:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=427.270.8&plot.1=227.270.8&;
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=413.270.8&plot.1=292.270.8&;
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=412.270.8&plot.1=289.270.8&;
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=445.270.8&plot.1=233.270.8&;

Curiously it does not seem to show on 2017 version of wrf.

[Bug tree-optimization/103409] [12 Regression] 18% WRF compile-time regression with -O2 -flto between g:264f061997c0a534 and g:3e09331f6aeaf595

2021-11-25 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409

--- Comment #3 from Jan Hubicka  ---
I filed PR103423. An interesting observation is that both regressions are
about 18% but happen in different time ranges.  This one is spec2017 WRF while
the other is spec2006 WRF, and neither reproduces on the other benchmark.

So perhaps modref improved enough in July to regress compile time of 2006 wrf
while for 2017 it needed a couple more months...

[Bug c++/93259] Unsized temporary array initialization problem

2021-11-25 Thread m.cencora at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93259

--- Comment #4 from m.cencora at gmail dot com ---
This might be related to CWG2487 "Type dependence of function-style cast to
incomplete array type"

[Bug ipa/103405] [12 Regression] c67005c FAILs with -fipa-modref

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103405

Martin Liška  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Martin Liška  ---
Fixed.

[Bug tree-optimization/103423] [12 Regression] 19% cpu2006 wrf compile time regression with -flto since r12-3903-g0288527f47cec669

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103423

Martin Liška  changed:

   What|Removed |Added

   Last reconfirmed||2021-11-25
 Blocks||26163
   Target Milestone|--- |12.0
 Status|UNCONFIRMED |NEW
Summary|19% cpu2006 wrf compile |[12 Regression] 19% cpu2006
   |time regression with -flto  |wrf compile time regression
   |between g:0b7a11874d4eb428  |with -flto since
   |and g:704e8a825c78b9a8  |r12-3903-g0288527f47cec669
 CC||aldyh at gcc dot gnu.org,
   ||amacleod at redhat dot com,
   ||marxin at gcc dot gnu.org
 Ever confirmed|0   |1


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug tree-optimization/103376] [12 Regression] wrong code at -Os and above on x86_64-linux-gnu since r12-5453-ga944b5dec3adb28e

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103376

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:531dae29a67e915a145d908bd2f46d22bc369c11

commit r12-5512-g531dae29a67e915a145d908bd2f46d22bc369c11
Author: Jakub Jelinek 
Date:   Thu Nov 25 10:38:33 2021 +0100

bswap: Improve perform_symbolic_merge [PR103376]

Thinking more about it, perhaps we could do more for BIT_XOR_EXPR.
We could allow masked1 == masked2 case for it, but would need to
do something different than the
  n->n = n1->n | n2->n;
we do on all the bytes together.
In particular, for masked1 == masked2 if masked1 != 0 (well, for 0
both variants are the same) and masked1 != 0xff we would need to
clear corresponding n->n byte instead of setting it to the input
as x ^ x = 0 (but if we don't know what x and y are, the result is
also don't know).  Now, for plus it is much harder, because not only
for non-zero operands we don't know what the result is, but it can
modify upper bytes as well.  So perhaps only if current's byte
masked1 && masked2 set the resulting byte to 0xff (unknown) iff
the byte above it is 0 and 0, and set that resulting byte to 0xff too.
Also, even for | we could instead of return NULL just set the resulting
byte to 0xff if it is different, perhaps it will be masked off later on.

This patch just punts on plus if both corresponding bytes are non-zero,
otherwise implements the above.

2021-11-25  Jakub Jelinek  

PR tree-optimization/103376
* gimple-ssa-store-merging.c (perform_symbolic_merge): For
BIT_IOR_EXPR, if masked1 && masked2 && masked1 != masked2, don't
punt, but set the corresponding result byte to MARKER_BYTE_UNKNOWN.
For BIT_XOR_EXPR similarly and if masked1 == masked2 and the
byte isn't MARKER_BYTE_UNKNOWN, set the corresponding result byte
to 0.

* gcc.dg/optimize-bswapsi-7.c: New test.
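
The per-byte merging rule described in the commit message above can be modelled roughly as follows. This is only a simplified sketch, not the code in gimple-ssa-store-merging.c: the names merge_markers, merge_op, MARKER_BITS and MARKER_UNKNOWN are hypothetical, and it assumes each result byte is described by an 8-bit marker (0 = known zero, a small non-zero value = index of a source byte, 0xff = unknown).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch; they are not GCC identifiers. */
#define MARKER_BITS    8
#define MARKER_MASK    0xffULL
#define MARKER_UNKNOWN 0xffULL

enum merge_op { OP_IOR, OP_XOR, OP_PLUS };

/* Merge two symbolic numbers byte by byte.  Returns false when the
   merge has to give up (the "punt" case), true otherwise with the
   merged markers stored in *out. */
static bool merge_markers(enum merge_op op, uint64_t n1, uint64_t n2,
                          int size, uint64_t *out)
{
    uint64_t res = n1 | n2;

    for (int i = 0; i < size; i++) {
        uint64_t mask = MARKER_MASK << (i * MARKER_BITS);
        uint64_t m1 = n1 & mask;
        uint64_t m2 = n2 & mask;

        if (!m1 || !m2)
            continue;           /* 0 op x == x for |, ^ and + */
        if (op == OP_PLUS)
            return false;       /* a carry could change upper bytes */
        if (op == OP_IOR && m1 == m2)
            continue;           /* x | x == x */
        if (op == OP_XOR && m1 == m2
            && m1 != (MARKER_UNKNOWN << (i * MARKER_BITS))) {
            res &= ~mask;       /* x ^ x == 0 for a known source byte */
            continue;
        }
        res |= mask;            /* otherwise the byte becomes unknown */
    }
    *out = res;
    return true;
}
```

In this model an unknown byte is not an immediate failure; it only matters if a later step fails to mask it off, which is the point the commit message makes for | and ^.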

[Bug libgcc/103424] New: Ignoring -mfpu=sp_full/-mfpu=-sp_lite/-msingle-float

2021-11-25 Thread andrea.bellandi at desy dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103424

Bug ID: 103424
   Summary: Ignoring -mfpu=sp_full/-mfpu=-sp_lite/-msingle-float
   Product: gcc
   Version: 8.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andrea.bellandi at desy dot de
  Target Milestone: ---

When passing "-mfpu=sp_full/-mfpu=-sp_lite/-msingle-float" to CFLAGS with
--target=powerpc-xilinx-eabi/--target=powerpc-xilinx-eabisim, libgcc should be
compiled with hard- single precision fpu code and soft- double precision (see
https://gcc.gnu.org/onlinedocs/gcc-8.5.0/gcc/RS_002f6000-and-PowerPC-Options.html#RS_002f6000-and-PowerPC-Options).

However, these flags are ignored (there is no apparent reference to them in
libgcc).  Therefore the library attempts a compilation with soft floating-point
replacements for both single and double precision.

How to reproduce:

Compilation failure of the whole 'gcc product' with the following parameters

CFLAGS_FOR_TARGET="-g -O2 -mxilinx-fpu -msingle-float" ../gcc/configure
--target=powerpc-xilinx-eabi --prefix="$PREFIX" --disable-nls --with-tune=440
--enable-languages=c,c++ --without-headers

make all-gcc 
make all-target-libgcc

Error: 

/home/bellandi/Projects/build-gcc/./gcc/xgcc
-B/home/bellandi/Projects/build-gcc/./gcc/
-B/home/bellandi/.local/cross/powerpc-xilinx-eabi/bin/
-B/home/bellandi/.local/cross/powerpc-xilinx-eabi/lib/ -isystem
/home/bellandi/.local/cross/powerpc-xilinx-eabi/include -isystem
/home/bellandi/.local/cross/powerpc-xilinx-eabi/sys-include -g -O2
-mfpu=sp_lite -mrelocatable-lib -mno-eabi -mstrict-align -O2  -g -O2 -DIN_GCC 
-DCROSS_DIRECTORY_STRUCTURE  -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition 
-isystem ./include   -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector
-Dinhibit_libc  -I. -I. -I../../.././gcc -I../../../../gcc/libgcc
-I../../../../gcc/libgcc/. -I../../../../gcc/libgcc/../gcc
-I../../../../gcc/libgcc/../include -o _mulsc3.o -MT _mulsc3.o -MD -MP -MF
_mulsc3.dep -DL_mulsc3 -c ../../../../gcc/libgcc/libgcc2.c -fvisibility=hidden
-DHIDE_EXPORTS
during RTL pass: reload 
../../../../gcc/libgcc/libgcc2.c: In function '__mulsc3':   
../../../../gcc/libgcc/libgcc2.c:2036:1: internal compiler error: Max. number
of generated reload insns per insn is achieved (90)

 }  
 ^  
0x8d1519 lra_constraints(bool)  
  ../../gcc/gcc/lra-constraints.c:4836  
0x8bf2e4 lra(_IO_FILE*) 
  ../../gcc/gcc/lra.c:2422  
0x87bd49 do_reload  
  ../../gcc/gcc/ira.c:5472  
0x87bd49 execute
  ../../gcc/gcc/ira.c:5656  
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.  
See  for instructions.   
make[3]: *** [Makefile:494: _mulsc3.o] Error 1  
make[3]: Leaving directory
'/home/bellandi/Projects/build-gcc/powerpc-xilinx-eabi/single/libgcc'
make[2]: *** [Makefile:1201: multi-do] Error 1  
make[2]: Leaving directory
'/home/bellandi/Projects/build-gcc/powerpc-xilinx-eabi/libgcc'
make[1]: *** [Makefile:125: all-multi] Error 2  
make[1]: Leaving directory
'/home/bellandi/Projects/build-gcc/powerpc-xilinx-eabi/libgcc'
make: *** [Makefile:12382: all-target-libgcc] Error 2 


This probably happens because libgcc attempts to generate single precision
code for __mulsc3.

Note that the flags should be implemented at least for powerpc-xilinx-eabi. 
See 
https://china.xilinx.com/support/documentation/ip_documentation/apu_fpu.pdf
page 8 
https://www.xilinx.com/support/documentation/ip_documentation/apu_fpu_virtex5.pdf
page 8


Possible fix: 

Perform a configuration similar to the powerpc*-*-linux* one for the 'e500v1'
option:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgcc/config.host;h=677173eee432efe784dfe6535b931f63df54cc7c;hb=eafe83f2f20ef0c1e7703c361ba314b44574523c#l1083

This results in the inclusion of t-e500v1-fp
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgcc/config/rs6000/t-e500v1-fp;h=ff88acaa8e7af0fadcc4f4

[Bug tree-optimization/103409] [12 Regression] 18% SPEC2017 WRF compile-time regression with -O2 -flto between g:264f061997c0a534 and g:3e09331f6aeaf595

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Blocks||26163
 Ever confirmed|0   |1
 CC||marxin at gcc dot gnu.org
   Last reconfirmed||2021-11-25


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug tree-optimization/103417] [12 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r12-5489

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103417

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:94912212d3d1be0b1c490e9b5f45165ef5f30d8a

commit r12-5513-g94912212d3d1be0b1c490e9b5f45165ef5f30d8a
Author: Jakub Jelinek 
Date:   Thu Nov 25 10:47:24 2021 +0100

match.pd: Fix up the recent bitmask_inv_cst_vector_p simplification
[PR103417]

The following testcase is miscompiled since the r12-5489-g0888d6bbe97e10
changes.
The simplification triggers on
(x & 4294967040U) >= 0U
and turns it into:
x <= 255U
which is incorrect, it should fold to 1 because unsigned >= 0U is always
true and normally the
/* Non-equality compare simplifications from fold_binary  */
 (if (wi::to_wide (cst) == min)
   (if (cmp == GE_EXPR)
{ constant_boolean_node (true, type); })
simplification folds that, but this simplification was done earlier.

The simplification correctly doesn't include lt which has the same
reason why it shouldn't be handled, we'll fold it to 0 elsewhere.

But, IMNSHO while it isn't incorrect to handle le and gt there, it is
unnecessary.  Because (x & cst) <= 0U and (x & cst) > 0U should
never appear, again in
/* Non-equality compare simplifications from fold_binary  */
we have a simplification for it:
   (if (cmp == LE_EXPR)
(eq @2 @1))
   (if (cmp == GT_EXPR)
(ne @2 @1
This is done for
  (cmp (convert?@2 @0) uniform_integer_cst_p@1)
and so should be done for both integers and vectors.
As the bitmask_inv_cst_vector_p simplification only handles
eq and ne for signed types, I think it can be simplified to just
following patch.

2021-11-25  Jakub Jelinek  

PR tree-optimization/103417
* match.pd ((X & Y) CMP 0): Only handle eq and ne.  Commonalize
common tests.

* gcc.c-torture/execute/pr103417.c: New test.
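
The miscompilation described in the commit message above can be seen with a small, self-contained comparison. This is only an illustration; the function names original_cmp and miscompiled_cmp are made up here and do not come from the testcase.

```c
#include <stdint.h>

/* The source-level comparison: an unsigned value is always >= 0U,
   so this must yield 1 for every x. */
static int original_cmp(uint32_t x)
{
    return (x & 4294967040U) >= 0U;
}

/* What the faulty simplification turned it into: true only when
   none of the masked bits are set. */
static int miscompiled_cmp(uint32_t x)
{
    return x <= 255U;
}
```

The two functions agree for x up to 255 and disagree for every larger value, for example x = 256.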

[Bug libgcc/103424] Ignoring -mfpu=sp_full/-mfpu=-sp_lite/-msingle-float

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103424

--- Comment #1 from Andrew Pinski  ---
Try using TFLAGS instead of CFLAGS_FOR_TARGET.

[Bug fortran/103412] [10/11/12 Regression] ICE: Invalid expression in gfc_element_size since r10-2083-g8dc63166e0b85954

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103412

Martin Liška  changed:

   What|Removed |Added

Summary|[10/11/12 Regression] ICE:  |[10/11/12 Regression] ICE:
   |Invalid expression in   |Invalid expression in
   |gfc_element_size|gfc_element_size since
   ||r10-2083-g8dc63166e0b85954
 CC||marxin at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
Started with r10-2083-g8dc63166e0b85954.

[Bug libgcc/103424] Ignoring -mfpu=sp_full/-mfpu=-sp_lite/-msingle-float

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103424

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
   Target Milestone|--- |9.0
 Resolution|--- |WONTFIX

--- Comment #2 from Andrew Pinski  ---
This option was removed in GCC 9 and was deprecated in GCC 8.

Since we don't support any GCC version lower than GCC 9, closing as won't fix.

[Bug fortran/103413] [10/11/12 Regression] ICE: Invalid expression in gfc_element_size since r10-2083-g8dc63166e0b85954

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103413

Martin Liška  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org
Summary|[10/11/12 Regression] ICE:  |[10/11/12 Regression] ICE:
   |Invalid expression in   |Invalid expression in
   |gfc_element_size|gfc_element_size since
   ||r10-2083-g8dc63166e0b85954

--- Comment #3 from Martin Liška  ---
Started with r10-2083-g8dc63166e0b85954.

[Bug fortran/103414] [PDT] ICE in gfc_free_actual_arglist, at fortran/expr.c:547 since r10-2083-g8dc63166e0b85954

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103414

Martin Liška  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org
Summary|[PDT] ICE in|[PDT] ICE in
   |gfc_free_actual_arglist, at |gfc_free_actual_arglist, at
   |fortran/expr.c:547  |fortran/expr.c:547 since
   ||r10-2083-g8dc63166e0b85954

--- Comment #5 from Martin Liška  ---
Started with r10-2083-g8dc63166e0b85954.

[Bug preprocessor/103415] [12 Regression] ICE in cpp_interpret_string_1, at libcpp/charset.c:1739

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103415

Martin Liška  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||marxin at gcc dot gnu.org

--- Comment #3 from Martin Liška  ---
(In reply to Andrew Pinski from comment #2)
> Confirmed.
> 
> 
> (In reply to G. Steinmetz from comment #0)
> > Started between 20210808 and 20210822  :
> 
> Then it is when __VA_OPT__ support was added in r12-2940-gd56599979211266b.

Yes, it started with the revision.

[Bug fortran/103414] [10/11/12 Regression] [PDT] ICE in gfc_free_actual_arglist, at fortran/expr.c:547 since r10-2083-g8dc63166e0b85954

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103414

Andrew Pinski  changed:

   What|Removed |Added

Summary|[PDT] ICE in|[10/11/12 Regression] [PDT]
   |gfc_free_actual_arglist, at |ICE in
   |fortran/expr.c:547 since|gfc_free_actual_arglist, at
   |r10-2083-g8dc63166e0b85954  |fortran/expr.c:547 since
   ||r10-2083-g8dc63166e0b85954
   Target Milestone|--- |10.4

[Bug target/103395] [9/10/11/12 Regression] ICE on qemu in arm create_fix_barrier

2021-11-25 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103395

Florian Weimer  changed:

   What|Removed |Added

 CC||fw at gcc dot gnu.org

--- Comment #13 from Florian Weimer  ---
Maybe it's possible to provide specific, architecture-independent constraints
for Systemtap-like use cases?

[Bug middle-end/103416] [12 Regression][OpenMP] Bogus firstprivate(n) map(to:n [len: 4][implicit])

2021-11-25 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103416

--- Comment #2 from Tobias Burnus  ---
(In reply to Chung-Lin Tang from comment #1)
> Can you see if adding this patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583279.html
> fixes this problem? If so, then it should be another occurrence of PR90030

Yes and no.

That patch FIXES the issue
  libgomp: cuCtxSynchronize error: misaligned address
  libgomp: cuMemFree_v2 error: misaligned address
  libgomp: device finalization failed


BUT I still see
  #pragma omp target map(to:D.4246 [len: 4][implicit]) ...

and when adding an explicit firstprivate(n) I see

 #pragma omp target firstprivate(n) map(to:D.4246 [len: 4][implicit]) ...

which looks wrong. I understand that implicit mapping tries to solve the
problem of explicitly mapping an array section via 'omp target (enter) data'
and then implicitly mapping the whole array. But I think it does not make
sense to add this implicit mapping if there is an implicit or explicit
'firstprivate' for that variable. That just generates pointless code,
obfuscates the dump, adds overhead to libgomp, ...


Actually, it does not only apply to 'firstprivate' - the same can also
happen with 'tofrom' vs. 'to' as in

#pragma omp target ... map(to:D.4217 [len: 4][implicit]) map(tofrom:n [len:
4][implicit])


for the following code:

PROGRAM target_parallel_do
  implicit none
  INTEGER :: i0, N
  COMPLEX(8) :: scalar
  N = 1
  !$OMP TARGET PARALLEL do map(from: scalar) private(i0) defaultmap(tofrom)
  DO i0 = 1, N
scalar%re = n
  END DO
  !$omp end target parallel do
END PROGRAM target_parallel_do


Admittedly, I have not yet managed to construct something which causes an
observable misbehavior.

[Bug target/103395] [9/10/11/12 Regression] ICE on qemu in arm create_fix_barrier

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103395

Jakub Jelinek  changed:

   What|Removed |Added

 CC||scox at redhat dot com,
   ||wcohen at redhat dot com

--- Comment #14 from Jakub Jelinek  ---
If it can be proven that all gcc versions until now treat "nor" constraint as
ignoring the n in there and pushing all constants into constant pool, I think
it could be changed into "or" for arm32.  But it would be IMNSHO unnecessary
pessimization (but it could e.g. be done for GCC < 12 or whenever this would be
fixed).
Another option is to tweak whatever generates those large inline asms.
In the qemu case it is created with
/usr/bin/python3 ../scripts/tracetool.py --backend=dtrace --group=util
--format=h /builddir/build/BUILD/qemu-6.1.0/util/trace-events
trace/trace-util.h   
whatever that is (but that means I haven't actually seen what it generates).
Note, apparently several other packages are affected, so not sure what changed
recently in systemtap-sdt-devel or whatever else that adds up to this.
In the preprocessed source I got I see several blocks of
   ".ascii \"\\x20\""
   "\n"
   "_SDT_SIGN %n[_SDT_S4]"
   "\n"
   "_SDT_SIZE %n[_SDT_S4]"
   "\n"
   "_SDT_TYPE %n[_SDT_S4]"
   "\n"
   ".ascii \"%[_SDT_A4]\""
   "\n"
It already uses assembler macros, perhaps adding a macro to do all 3 at once or
perhaps with something extra could bring the number of newlines down...

[Bug middle-end/103416] [12 Regression][OpenMP] Bogus firstprivate(n) map(to:n [len: 4][implicit])

2021-11-25 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103416

--- Comment #3 from Tobias Burnus  ---
Okay, the
  map(to:D.4217 [len: 4][implicit]) map(tofrom:n [len: 4][implicit])
issue is not new – only the '[implicit]' + the misaligned address one (fixed by
the patch from comment 1).

 * * *

Thus regression → comment 1

***

And extra map(to:D.4217) – this occurs already with GCC 11 and is, thus, not a
regression.
(Only '[implicit]' is new.)

This seems to be due to a scope issue. In gimplify_omp_for

(gdb) p debug(for_stmt)
#pragma omp for nowait
for (i0 = 1; i0 <= D.4217; i0 = i0 + 1)
  REALPART_EXPR  = (real(kind=8)) n;

which then calls at some point
  omp_notice_variable (
debug_tree(decl) which ends up with
7698        nflags |= GOVD_MAP | GOVD_MAP_TO_ONLY;

The problem is that here
integer(kind=4) D.3933;
is generated in the parent scope of '#omp target' instead of in the parent
scope of '#omp parallel do'.

Thus, there are two questions:
* Why is 'map(to:' and not 'firstprivate' used?
* Why is the var generated in the parent scope of 'omp target' instead of
inside 'omp target'?

[Bug tree-optimization/103425] New: 48% tramp3d regression between g:df1a0d526e2e4c75 and g:9e026da720091704 with -Ofast -march=native at Zen

2021-11-25 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103425

Bug ID: 103425
   Summary: 48% tramp3d regression between g:df1a0d526e2e4c75 and
g:9e026da720091704 with  -Ofast -march=native at Zen
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

visible at https://lnt.opensuse.org/db_default/v4/CPP/graph?plot.0=171.576.0
and with LTO https://lnt.opensuse.org/db_default/v4/CPP/graph?plot.0=178.576.0

It has persisted for 2 runs already, so it is probably not noise.  Curiously I
do not see it on other testers (perhaps they have yet to run).

[Bug tree-optimization/103254] [12 Regression] Compile time hog in compare_values_warnv since r12-4790-g4b3a325f07acebf4

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103254

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Aldy Hernandez:

https://gcc.gnu.org/g:8acbd7bef6edbf537e3037174907029b530212f6

commit r12-5514-g8acbd7bef6edbf537e3037174907029b530212f6
Author: Aldy Hernandez 
Date:   Wed Nov 24 09:43:36 2021 +0100

path solver: Compute ranges in path in gimple order.

Andrew's patch for this PR103254 papered over some underlying
performance issues in the path solver that I'd like to address.

We are currently solving the SSA's defined in the current block in
bitmap order, which amounts to random order for all purposes.  This is
causing unnecessary recursion in gori.  This patch changes the order
to gimple order, thus solving dependencies before uses.

There is no change in threadable paths with this change.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

PR tree-optimization/103254
* gimple-range-path.cc (path_range_query::compute_ranges_defined): New.
(path_range_query::compute_ranges_in_block): Move to
compute_ranges_defined.
* gimple-range-path.h (compute_ranges_defined): New.
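
To illustrate why solving definitions before uses avoids recursion, here is a toy model. It is not the path solver or gori: the dependency table, solve and deepest_recursion are hypothetical, and each name is assumed to depend on at most one earlier name.

```c
#include <stdbool.h>

#define N_NAMES 6

/* Toy model: name i is computed from name dep[i]; -1 means no operand. */
static const int dep[N_NAMES] = { -1, 0, 1, 2, 3, 4 };
static bool solved[N_NAMES];
static int max_depth;

/* Solve one name, first recursing into its operand if it is unsolved. */
static void solve(int name, int depth)
{
    if (solved[name])
        return;
    if (depth > max_depth)
        max_depth = depth;
    if (dep[name] >= 0)
        solve(dep[name], depth + 1);
    solved[name] = true;
}

/* Solve all names in the given order; return the deepest recursion seen. */
static int deepest_recursion(const int *order)
{
    for (int i = 0; i < N_NAMES; i++)
        solved[i] = false;
    max_depth = 0;
    for (int i = 0; i < N_NAMES; i++)
        solve(order[i], 0);
    return max_depth;
}
```

Visiting the names in definition order never recurses into an unsolved operand, while visiting them in reverse order walks the whole dependency chain from the first name it touches.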

[Bug tree-optimization/103254] [12 Regression] Compile time hog in compare_values_warnv since r12-4790-g4b3a325f07acebf4

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103254

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Aldy Hernandez:

https://gcc.gnu.org/g:d1c1919ef8a18eea9d5c1741f8c9adaabf5571f2

commit r12-5515-gd1c1919ef8a18eea9d5c1741f8c9adaabf5571f2
Author: Aldy Hernandez 
Date:   Wed Nov 24 17:58:43 2021 +0100

path solver: Move boolean import code to compute_imports.

In a follow-up patch I will be pruning the set of exported ranges
within blocks to avoid unnecessary work.  In order to do this, all the
interesting SSA names must be in the internal import bitmap ahead of
time.  I had already abstracted them out into compute_imports, but I
missed the boolean code.  This fixes the oversight.

There's a net gain of 25 threadable paths, which is unexpected but
welcome.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

PR tree-optimization/103254
* gimple-range-path.cc (path_range_query::compute_ranges): Move
exported boolean code...
(path_range_query::compute_imports): ...here.

[Bug fortran/80330] OpenACC: Unexpected data mapping instead of implicit firstprivate

2021-11-25 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80330

Tobias Burnus  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org

--- Comment #2 from Tobias Burnus  ---
See also PR 103416 comment 3.

[Bug target/103395] [9/10/11/12 Regression] ICE on qemu in arm create_fix_barrier

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103395

--- Comment #15 from Jakub Jelinek  ---
Apparently the change on the systemtap side was:
https://sourceware.org/git/?p=systemtap.git;a=commit;f=includes/sys/sdt.h;h=eaa15b047688175a94e3ae796529785a3a0af208
which indeed adds a lot of newlines to the inline asms.
But when already using gas macros, I wonder if all that
#define _SDT_ASM_TEMPLATE_1 _SDT_ARGFMT(1)
#define _SDT_ASM_TEMPLATE_2 _SDT_ASM_TEMPLATE_1 _SDT_ASM_BLANK _SDT_ARGFMT(2)
#define _SDT_ASM_TEMPLATE_3 _SDT_ASM_TEMPLATE_2 _SDT_ASM_BLANK _SDT_ARGFMT(3)
#define _SDT_ASM_TEMPLATE_4 _SDT_ASM_TEMPLATE_3 _SDT_ASM_BLANK _SDT_ARGFMT(4)
#define _SDT_ASM_TEMPLATE_5 _SDT_ASM_TEMPLATE_4 _SDT_ASM_BLANK _SDT_ARGFMT(5)
#define _SDT_ASM_TEMPLATE_6 _SDT_ASM_TEMPLATE_5 _SDT_ASM_BLANK _SDT_ARGFMT(6)
#define _SDT_ASM_TEMPLATE_7 _SDT_ASM_TEMPLATE_6 _SDT_ASM_BLANK _SDT_ARGFMT(7)
#define _SDT_ASM_TEMPLATE_8 _SDT_ASM_TEMPLATE_7 _SDT_ASM_BLANK _SDT_ARGFMT(8)
#define _SDT_ASM_TEMPLATE_9 _SDT_ASM_TEMPLATE_8 _SDT_ASM_BLANK _SDT_ARGFMT(9)
#define _SDT_ASM_TEMPLATE_10 _SDT_ASM_TEMPLATE_9 _SDT_ASM_BLANK _SDT_ARGFMT(10)
#define _SDT_ASM_TEMPLATE_11 _SDT_ASM_TEMPLATE_10 _SDT_ASM_BLANK _SDT_ARGFMT(11)
#define _SDT_ASM_TEMPLATE_12 _SDT_ASM_TEMPLATE_11 _SDT_ASM_BLANK _SDT_ARGFMT(12)
couldn't be rewritten to use another .macro _SDT_ASM_TEMPLATE that just
emits it all.
See the
 .macro  sum from=0, to=5
 .long   \from
 .if \to-\from
 sum "(\from+1)",\to
 .endif
 .endm
macro from gas documentation for inspiration.

[Bug middle-end/103416] [12 Regression][OpenMP] Bogus firstprivate(n) map(to:n [len: 4][implicit])

2021-11-25 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103416

--- Comment #4 from Tobias Burnus  ---
Created attachment 51872
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51872&action=edit
RFC Patch to avoid the pointless evaluation, see comment 4

(In reply to Tobias Burnus from comment #3)

> * Why is the var generated in the parent scope of 'omp target' instead of
> inside 'omp target'?

The problem is a forced evaluation of the array bounds, which I regard as
pointless if the variable is just a plain variable - no array ref, no struct
ref, ...

Cf. attachment. (The question is when 'force=true' is needed and whether the
DECL_P check is the right one or whether more or less should be permitted.)

This is indeed the same issue as PR80330 (8...)

 * * *

The
  libgomp: cuCtxSynchronize error: misaligned address
is a regression – see comment 1 for a patch which fixes it. This is PR90030
(9...)


 * * *

> * Why is 'map(to:' and not 'firstprivate' used?

Because of:

gfc_omp_predetermined_mapping (tree decl)
{
  if (DECL_ARTIFICIAL (decl)
      && ! GFC_DECL_RESULT (decl)
      && ! (DECL_LANG_SPECIFIC (decl)
            && GFC_DECL_SAVED_DESCRIPTOR (decl)))
    return OMP_CLAUSE_DEFAULTMAP_TO;

I wonder whether OMP_DEFAULTMAP_FIRSTPRIVATE  wouldn't make more sense in this
case – at least for gfc_omp_scalar_target_p ?

Which is also related to PR80330 (8...)

[Bug middle-end/103416] [12 Regression][OpenMP] Bogus firstprivate(n) map(to:n [len: 4][implicit])

2021-11-25 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103416

--- Comment #5 from Tobias Burnus  ---
(In reply to Tobias Burnus from comment #4)
> Created attachment 51872 [details]
> RFC Patch to avoid the pointless evaluation, see comment 4

The default was supposed to be 'false' - to be overridden where needed.

Otherwise, the 'false' has to be added in gfc_trans_omp_do under
  /* Evaluate all the expressions in the iterator.  */

[Bug tree-optimization/103425] 48% tramp3d regression between g:df1a0d526e2e4c75 and g:9e026da720091704 with -Ofast -march=native at Zen

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103425

--- Comment #1 from Andrew Pinski  ---
I hope it was not caused by my patch, as it could in theory cause cost
differences.

[Bug target/103395] [9/10/11/12 Regression] ICE on qemu in arm create_fix_barrier

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103395

--- Comment #16 from Jakub Jelinek  ---
Note, the %n[_SDT_S##no] in there need to stay (dunno about the
_SDT_ASM_SUBSTR(_SDT_ARGTMPL(_SDT_A##no)) stuff), but that could be achieved
by giving the macro from, to, arg, args:vararg arguments and use it like:
  _SDT_ASM_TEMPLATE 1, 4, %n[_SDT_S1], %n[_SDT_S2], %n[_SDT_S3], %n[_SDT_S4]

[Bug ipa/103052] [9/10/11 Regression] Function is found to be pure looping but has a call to a noreturn function in it

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103052

--- Comment #13 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:298a4694f89ecb512be8ecba0512558996961fae

commit r10-10294-g298a4694f89ecb512be8ecba0512558996961fae
Author: Jan Hubicka 
Date:   Sun Nov 21 00:35:22 2021 +0100

Fix looping flag discovery in ipa-pure-const

The testcase shows a situation where there is a non-trivial cycle in the
callgraph involving a noreturn call.  This cycle is important for const
function discovery but not important for pure.  IPA pure const uses the same
strongly connected components for both propagations, which makes it get a
suboptimal result (it does not detect the pure flag).  However, local pure
const gets the situation right because it processes functions in the right
order.  This hits rarely executed code in propagate_pure_const that merges
results with the previously known state and that has a long-standing bug
which makes it throw away the looping flag.

Bootstrapped/regtested x86_64-linux.

gcc/ChangeLog:

2021-11-21  Jan Hubicka  

PR ipa/103052
* ipa-pure-const.c (propagate_pure_const): Fix merging of looping
flag.

gcc/testsuite/ChangeLog:

2021-11-21  Jan Hubicka  

PR ipa/103052
* gcc.c-torture/execute/pr103052.c: New test.

(cherry picked from commit a0e99d5bb741d3db74a67d492f47b28217fbf88a)

[Bug ipa/103052] [9/10/11 Regression] Function is found to be pure looping but has a call to a noreturn function in it

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103052

--- Comment #14 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:6a1358f7ea475e9d46c1535656bdfb2a7904

commit r11-9310-g6a1358f7ea475e9d46c1535656bdfb2a7904
Author: Jan Hubicka 
Date:   Sun Nov 21 00:35:22 2021 +0100

Fix looping flag discovery in ipa-pure-const

The testcase shows a situation where there is a non-trivial cycle in the
callgraph involving a noreturn call.  This cycle is important for const
function discovery but not for pure.  IPA pure const uses the same strongly
connected components for both propagations, which makes it get a suboptimal
result (it does not detect the pure flag).  However, local pure const gets
the situation right because it processes functions in the right order.
This hits rarely executed code in propagate_pure_const that merges results
with the previously known state; that code has a long-standing bug that
makes it throw away the looping flag.

Bootstrapped/regtested x86_64-linux.

gcc/ChangeLog:

2021-11-21  Jan Hubicka  

PR ipa/103052
* ipa-pure-const.c (propagate_pure_const): Fix merging of looping
flag.

gcc/testsuite/ChangeLog:

2021-11-21  Jan Hubicka  

PR ipa/103052
* gcc.c-torture/execute/pr103052.c: New test.

(cherry picked from commit a0e99d5bb741d3db74a67d492f47b28217fbf88a)

[Bug c++/46476] Missing Warning about unreachable code after return [-Wunreachable-code-return]

2021-11-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46476

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #20 from Thomas Schwinge  ---
(In reply to Richard Biener from comment #18)
> /home/rguenther/src/trunk/libgomp/oacc-plugin.c: In function
> 'GOMP_PLUGIN_acc_default_dim':
> /home/rguenther/src/trunk/libgomp/oacc-plugin.c:65:7: error: statement is
> not reachable [-Werror]
>65 |   return -1;
>   |   ^~

(That's correct, and you do address that in the patch posted.)

It feels strange to not have a 'return' in a non-'void' function, but that's
fine, given 'gomp_fatal' being 'noreturn'.
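
As a small illustration of that point: a non-'void' function may end in a call
to a 'noreturn' function without any 'return' after it.  The names below
(fatal, default_dim) are hypothetical stand-ins, not the libgomp code.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a fatal-error helper such as gomp_fatal.
[[noreturn]] static void fatal(const char *msg)
{
  std::fputs(msg, stderr);
  std::abort();
}

static int default_dim(unsigned i)
{
  if (i < 3)
    return static_cast<int>(i) + 1;
  fatal("invalid dimension\n");
  // No 'return' here: control cannot reach the end of the function,
  // so a trailing 'return -1;' would be unreachable.
}
```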


For posterity (only; these "bad" cases have not made it into the patch posted):

> /home/rguenther/src/trunk/libgomp/oacc-profiling.c: In function
> 'acc_prof_register':
> /home/rguenther/src/trunk/libgomp/oacc-profiling.c:354:7: error: statement
> is not reachable [-Werror]
>   354 |   __builtin_unreachable ();
>   |   ^
> /home/rguenther/src/trunk/libgomp/oacc-profiling.c: In function
> 'acc_prof_unregister':
> /home/rguenther/src/trunk/libgomp/oacc-profiling.c:475:7: error: statement
> is not reachable [-Werror]
>   475 |   __builtin_unreachable ();
>   |   ^
> 
> the latter two are an issue with initial CFG construction I think, where
> group_case_labels turns
> 
> void bar (foo x)
> {
>:
>   switch (x) , case 0: , case 1: >
> 
>:
> :
>   goto ;
> 
>:
> :
>   __builtin_unreachable ();
> 
>:
> :
>   return;
> 
> into the following with BB 4 now unreachable.
> 
> void bar (foo x)
> {
>:
>   switch (x) , case 0: >
> 
>:
> :
>   goto ;
> 
>:
> :
>   __builtin_unreachable ();
> 
>:
> :
>   return;

The source-level situation here is:

[...]
   256    /* Special cases.  */
   257    if (reg == acc_toggle)
[...]
   274    else if (reg == acc_toggle_per_thread)
   275      {
[...]
   284        /* Silently ignore.  */
   285        gomp_debug (0, "  ignoring bogus request\n");
   286        return;
   287      }
[...]
   302    switch (reg)
   303      {
[...]
   353      case acc_toggle_per_thread:
   354        __builtin_unreachable ();
   355      }
[...]

..., and similar for the other instance.

Here, the point is to (a) enumerate all possible 'enum' values in the 'switch
(reg)', but (b) make it clear ('__builtin_unreachable') that we're not
expecting 'acc_toggle_per_thread' here, as it has already been handled (plus
early 'return') above.  In my opinion, we shouldn't diagnose these cases (and
you don't, per the patch posted).
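
A reduced, self-contained sketch of this pattern, with made-up names (toggle,
handle) rather than the real libgomp ones; '__builtin_unreachable' is the
GCC/Clang builtin:

```cpp
enum toggle { toggle_on, toggle_off, toggle_per_thread };

static int handle(toggle reg)
{
  // Special case handled up front, with an early return.
  if (reg == toggle_per_thread)
    return -1;

  switch (reg)
    {
    case toggle_on:
      return 1;
    case toggle_off:
      return 0;
    case toggle_per_thread:
      // Already handled above; listed only so that every enumerator
      // appears in the switch.
      __builtin_unreachable ();
    }
  return -2; // value outside the enumerators
}
```

Listing every enumerator keeps -Wswitch useful when a new value is added,
while the builtin documents that this particular case cannot occur here.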

[Bug c++/103426] New: Acceptance of invalid template specialization in a namespace not enclosing the specialized template

2021-11-25 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103426

Bug ID: 103426
   Summary: Acceptance of invalid template specialization in a
namespace not enclosing the specialized template
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fchelnokov at gmail dot com
  Target Milestone: ---

The code as follows is invalid:
```
template<class T>
struct S {
    static int x;
};

namespace {
    template<> int S<int>::x = 0;
}
```
because an explicit specialization shall be declared in a namespace enclosing
the specialized template. Clang rejects it, but not GCC. Demo:
https://gcc.godbolt.org/z/74oxofsaf

Related discussion: https://stackoverflow.com/q/30400814/7325599
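
For contrast, a minimal sketch of a form that should be accepted by any
conforming compiler: the explicit specialization is declared at global
namespace scope, which encloses the primary template.  The generic member
definition and the values 1 and 42 are added here purely for illustration.

```cpp
template<class T>
struct S {
    static int x;
};

// Generic definition of the static data member.
template<class T> int S<T>::x = 1;

// Explicit specialization in a namespace that encloses S (the global one).
template<> int S<int>::x = 42;
```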

[Bug tree-optimization/103425] 48% tramp3d regression between g:df1a0d526e2e4c75 and g:9e026da720091704 with -Ofast -march=native at Zen

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103425

Martin Liška  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
Cannot reproduce that locally with -march=znver2 on a znver1 machine.

[Bug tree-optimization/103427] New: Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

Bug ID: 103427
   Summary: Alignment of C++ references and 'this' pointer not
used by optimizer
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

#include <stdint.h>

void* f(int& i)
{
    // should be no-op, &i is aligned to alignof(int)
    return (void*)((((uintptr_t)&i) + 3) & ~3);
}

struct My
{
    int value;

    void* Test1();
    void* Test2();
};

void* My::Test1()
{
    return this;
}

void* My::Test2()
{
    // should be no-op, 'this' is aligned to alignof(My)
    return (void*)((((uintptr_t)this) + 3) & ~3);
}


GCC fails to optimize away the redundant arithmetic to "fix" the alignment:

f(int&):
        lea     rax, [rdi+3]
        and     rax, -4
        ret
My::Test1():
        mov     rax, rdi
        ret
My::Test2():
        lea     rax, [rdi+3]
        and     rax, -4
        ret

Since https://reviews.llvm.org/D99790 Clang optimizes it:





Although an int* might not actually point to a valid int, and so could be
misaligned, and int& must be bound to a valid object, which means it cannot be
misaligned. It would be undefined to bind a reference to a misaligned object.

Similarly, although a My* could contain an arbitrary address, inside a member
function the 'this' pointer must point to a valid object, which cannot be
misaligned. It would be undefined to 

Pragmas and attributes and -fpack-struct=n can break those rules, but by
default we should be able to assume they're true.

This will break some code which has undefined behaviour (and would be diagnosed
by -fsanitize=alignment) but currently "works"  e.g.
https://github.com/dotnet/runtime/issues/61671 was caused by the Clang change.
So maybe there should be a specific flag to enable/disable this optimization,
like we do for -fdelete-null-pointer-checks etc.
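
A quick way to convince oneself that the rounding really is redundant for a
properly aligned object is to check it at run time.  This is only a sketch:
round_up4 and is_aligned are hypothetical helpers, and it assumes
alignof(int) is 4, as on x86-64.

```cpp
#include <stdint.h>

// Round an address up to the next multiple of 4.
static void* round_up4(void* p)
{
  return (void*)(((uintptr_t)p + 3) & ~(uintptr_t)3);
}

// Hypothetical helper: true when p already meets the alignment of T.
template<class T>
static bool is_aligned(const T* p)
{
  return (uintptr_t)p % alignof(T) == 0;
}
```

For an 'int i', round_up4(&i) returns &i unchanged; only for a pointer that is
not 4-byte aligned (for example one byte into a char buffer) does the
arithmetic change the value.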

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

--- Comment #1 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #0)
> Since https://reviews.llvm.org/D99790 Clang optimizes it:

Oops, I meant to paste this, from clang 13.0.0 at -O1

f(int&): # @f(int&)
        mov     rax, rdi
        ret
My::Test1(): # @My::Test1()
        mov     rax, rdi
        ret
My::Test2(): # @My::Test2()
        mov     rax, rdi
        ret


The int& case has been optimized since Clang 11.0.0, and 13.0.0 started doing
the same for the 'this' pointer.

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

--- Comment #2 from Jonathan Wakely  ---
https://godbolt.org/z/8aMc14qfW

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

Jakub Jelinek  changed:

   What|Removed |Added

 CC||aldyh at gcc dot gnu.org,
   ||amacleod at redhat dot com,
   ||jakub at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
I guess we'd need to use min_align_of_type, because say ia32 long long &
can bind to a long long member inside of struct S { int a; long long b; }
and then it is only 4 byte aligned.  The question is how/where to do this
though,
as POINTER_TYPE_P conversions are considered useless and therefore what has
been a reference or pointer and to what exactly can blur pretty fast.
But at least whether a PARM_DECL has reference type or pointer type and what
the pointed type is shouldn't change arbitrarily.
I'm not sure if we can rely on this for non-C++ FEs though, so perhaps a
langhook that we use during evrp on (D) SSA_NAME of PARM_DECLs and ask the FE
whether it guarantees some alignment for those (aka pretend there is a virtual
__builtin_assume_aligned for those)?
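
To illustrate the member case and what an explicit __builtin_assume_aligned
looks like in user code, here is a rough sketch.  guaranteed_align_b, kAlignB
and load_b are invented names; the computation only models "no more alignment
than the enclosing struct and the member offset allow" and is not what
min_align_of_type does internally.  It needs C++14 for the constexpr loop.

```cpp
#include <cstddef>

struct S { int a; long long b; };

// Largest power of two dividing both alignof(S) and the offset of S::b;
// a reference bound to S::b cannot rely on more than this.
constexpr std::size_t guaranteed_align_b()
{
  std::size_t a = alignof(S);
  while (offsetof(S, b) % a != 0)
    a /= 2;
  return a;
}

constexpr std::size_t kAlignB = guaranteed_align_b();

// The explicit promise that the comment above imagines being implied
// for reference parameters.
static long long load_b(const long long& r)
{
  auto* p = static_cast<const long long*>(
      __builtin_assume_aligned(&r, kAlignB));
  return *p;
}
```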

[Bug c++/103426] Acceptance of invalid template specialization in a namespace not enclosing the specialized template

2021-11-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103426

Jonathan Wakely  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=92598
   Last reconfirmed||2021-11-25
   Keywords||accepts-invalid
 Status|UNCONFIRMED |NEW

[Bug ipa/103052] [9/10/11 Regression] Function is found to be pure looping but has a call to a noreturn function in it

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103052

--- Comment #15 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:3d1f5e86fb4351a109d45fe441b1b00d6e56c277

commit r9-9844-g3d1f5e86fb4351a109d45fe441b1b00d6e56c277
Author: Jan Hubicka 
Date:   Sun Nov 21 00:35:22 2021 +0100

Fix looping flag discovery in ipa-pure-const

The testcase shows a situation where there is a non-trivial cycle in the
callgraph involving a noreturn call.  This cycle is important for const
function discovery but not for pure.  IPA pure const uses the same strongly
connected components for both propagations, which makes it get a suboptimal
result (it does not detect the pure flag).  However, local pure const gets
the situation right because it processes functions in the right order.
This hits rarely executed code in propagate_pure_const that merges results
with the previously known state; that code has a long-standing bug that
makes it throw away the looping flag.

Bootstrapped/regtested x86_64-linux.

gcc/ChangeLog:

2021-11-21  Jan Hubicka  

PR ipa/103052
* ipa-pure-const.c (propagate_pure_const): Fix merging of looping
flag.

gcc/testsuite/ChangeLog:

2021-11-21  Jan Hubicka  

PR ipa/103052
* gcc.c-torture/execute/pr103052.c: New test.

(cherry picked from commit a0e99d5bb741d3db74a67d492f47b28217fbf88a)

[Bug ipa/103052] [9/10/11 Regression] Function is found to be pure looping but has a call to a noreturn function in it

2021-11-25 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103052

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #16 from Jan Hubicka  ---
Fixed on release branches too now.

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

--- Comment #4 from Jonathan Wakely  ---
Oops more slip-ups in the original submission ..

(In reply to Jonathan Wakely from comment #0)
> Although an int* might not actually point to a valid int, and so could be
> misaligned, and int& must be bound to a valid object

s/and int/an int/

> Similarly, although a My* could contain an arbitrary address, inside a
> member function the 'this' pointer must point to a valid object, which
> cannot be misaligned. It would be undefined to 

... call a member function through a misaligned pointer.

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

--- Comment #5 from Jonathan Wakely  ---
(In reply to Jakub Jelinek from comment #3)
> I'm not sure if we can rely on this for non-C++ FEs though, so perhaps a
> langhook that we use during evrp on (D) SSA_NAME of PARM_DECLs and ask the
> FE whether it guarantees some alignment for those (aka pretend there is a
> virtual __builtin_assume_aligned for those)?

Yes, it looks like Clang does it by decorating the 'this' pointer in the FE:

// Apply `nonnull`, `dereferencable(N)` and `align N` to the `this` argument.


Maybe something can be done in the G++ FE when member functions use 'this' and
when binding a reference?

[Bug c++/103408] ICE when requires auto(x) in C++23

2021-11-25 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103408

--- Comment #3 from 康桓瑋  ---
(In reply to Marek Polacek from comment #2)
> Started with r12-5386, obviously.

I don't know if it is caused by the same bug.

template<auto>
concept C = auto([]{});
static_assert(C<0>);

https://godbolt.org/z/nj6qbGxP7

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

--- Comment #6 from Richard Biener  ---
Is the case important enough to worry about?  Actual accesses will be assumed
to be aligned according to the type.

But sure, we could in theory special-case REFERENCE_TYPE in CCP.  Does any
other frontend use REFERENCE_TYPE?

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

--- Comment #7 from Jonathan Wakely  ---
(In reply to Richard Biener from comment #6)
> Is the case important enough to worry about?

I have no idea, I just noticed that clang is doing this and we aren't.

I doubt it's very important.

[Bug tree-optimization/103359] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103359

--- Comment #10 from CVS Commits  ---
The master branch has been updated by Andrew Macleod :

https://gcc.gnu.org/g:661c02e54ea72fb55205df0a717951ff28bb739e

commit r12-5522-g661c02e54ea72fb55205df0a717951ff28bb739e
Author: Andrew MacLeod 
Date:   Tue Nov 23 14:12:29 2021 -0500

Check for equivalences between PHI argument and def.

If a PHI argument on an edge is equivalent with the DEF, then it doesn't
provide any new information, defer processing it unless they are all
equivalences.

PR tree-optimization/103359
gcc/
* gimple-range-fold.cc (fold_using_range::range_of_phi): If arg is
equivalent to def, don't initially include its range.

gcc/testsuite/
* gcc.dg/pr103359.c: New.
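
The idea in the commit message can be modelled with a toy example.  This is
emphatically not the code in gimple-range-fold.cc; Range, PhiArg, join and
phi_range are invented for illustration, ranges are plain closed integer
intervals, and the argument list is assumed to be non-empty.

```cpp
#include <algorithm>
#include <vector>

struct Range { int lo, hi; };                   // closed interval [lo, hi]

struct PhiArg { Range range; bool equiv_to_def; };

static Range join(Range a, Range b)
{
  return { std::min(a.lo, b.lo), std::max(a.hi, b.hi) };
}

// Union the ranges of the arguments, skipping those known to be equivalent
// to the PHI result; if every argument is an equivalence, use them all.
static Range phi_range(const std::vector<PhiArg>& args)
{
  bool all_equiv = std::all_of(args.begin(), args.end(),
                               [](const PhiArg& a) { return a.equiv_to_def; });
  bool seen = false;
  Range result{0, 0};
  for (const PhiArg& a : args)
    {
      if (a.equiv_to_def && !all_equiv)
        continue;                               // adds no new information
      result = seen ? join(result, a.range) : a.range;
      seen = true;
    }
  return result;
}
```

With one ordinary argument in [0, 10] and one equivalence currently known only
as [-100, 100], the model returns [0, 10] instead of the wider union.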

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

--- Comment #8 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #6)
> Is the case important enough to worry about?  Actual accesses will be
> assumed to be aligned according to the type.
> 
> But sure, we could in theory special-case REFERENCE_TYPE in CCP.  Does any
> other frontend use REFERENCE_TYPE?

Fortran and Ada do.
For Fortran it is used for dummy arguments passed by reference (most of them),
but whether that implies the reference must be well aligned or not, I don't
know.
For Ada no idea.
Also, "this" in methods doesn't have a reference type, even though it actually
works more like a reference than pointer.

[Bug tree-optimization/103359] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)

2021-11-25 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103359

Andrew Macleod  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #11 from Andrew Macleod  ---
Fixed.

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

--- Comment #9 from rguenther at suse dot de  ---
On Thu, 25 Nov 2021, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427
> 
> --- Comment #8 from Jakub Jelinek  ---
> (In reply to Richard Biener from comment #6)
> > Is the case important enough to worry about?  Actual accesses will be
> > assumed to be aligned according to the type.
> > 
> > But sure, we could in theory special-case REFERENCE_TYPE in CCP.  Does any
> > other frontend use REFERENCE_TYPE?
> 
> Fortran and Ada do.
> For Fortran it is used for dummy arguments passed by reference (most of them),
> but whether that implies the reference must be well aligned or not, I don't
> know.
> For Ada no idea.
> Also, "this" in methods doesn't have a reference type, even though it actually
> works more like a reference than pointer.

I suppose it's better to have a flag on a PARM_DECL then indicating
the pointed to storage is aligned according to its type.  Can one
have sth like int *& or int && (pointer to reference or reference to
reference?)

[Bug c++/103428] New: Parameter packs not expanded with local struct in lambda

2021-11-25 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103428

Bug ID: 103428
   Summary: Parameter packs not expanded with local struct in
lambda
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

gcc-10 rejects this well-formed code.
gcc-11.1 accepts it, but gcc-11.2 rejects it again.

template<class... Ts>
auto f(Ts... args) {
  ([]() { struct B : decltype(args) { }; }, ...);
};

int main() {
  f([]{});
}

https://godbolt.org/z/rYhnKee7M

[Bug tree-optimization/103425] [12 Regression] 48% tramp3d regression between g:df1a0d526e2e4c75 and g:9e026da720091704 with -Ofast -march=native at Zen

2021-11-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103425

Richard Biener  changed:

   What|Removed |Added

 Target||x86_64-*-*
Summary|48% tramp3d regression  |[12 Regression] 48% tramp3d
   |between g:df1a0d526e2e4c75  |regression between
   |and g:9e026da720091704 with |g:df1a0d526e2e4c75 and
   | -Ofast -march=native at|g:9e026da720091704 with
   |Zen |-Ofast -march=native at Zen
   Keywords||needs-bisection
   Target Milestone|--- |12.0

--- Comment #3 from Richard Biener  ---
Not visible on Haswell (for now).

[Bug target/103421] -march=bogus12323123423452345 -march=skylake-avx512 is accepted as a command line option

2021-11-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103421

--- Comment #2 from Richard Biener  ---
I think that makes sense in some way, not sure we want
-march-for-check=bogus12323123423452345.  Also consider -march=xyz
-moption-not-valid-for-xyz -march=but-for-this

[Bug tree-optimization/102648] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) since r12-2381-g704e8a825c78b9a8

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102648

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Andrew Macleod :

https://gcc.gnu.org/g:1598bd47b2a4a5f12b5a987d16d82634644db4b6

commit r12-5524-g1598bd47b2a4a5f12b5a987d16d82634644db4b6
Author: Andrew MacLeod 
Date:   Thu Nov 25 08:58:19 2021 -0500

Add the testcase for this PR to the testsuite.

Various ranger-enabled patches like threading and VRP2 can do this now, so
add the testcase for posterity.

gcc/testsuite/
PR tree-optimization/102648
* gcc.dg/pr102648.c: New.

[Bug tree-optimization/102648] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) since r12-2381-g704e8a825c78b9a8

2021-11-25 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102648

Andrew Macleod  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Andrew Macleod  ---
Should be fixed.

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

--- Comment #10 from Jonathan Wakely  ---
int*& is a reference to a pointer, and is perfectly valid.

You can't have a pointer to a reference (a reference isn't required to have any
storage, so taking the address of a reference doesn't make sense).

int&& is an rvalue reference, which is just a different type of reference. You
can't have int & & though. Binding another reference to a reference actually
binds to the underlying object, not the reference.

So there are no pointers or references to references.
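
In code, these rules look roughly as follows (a small sketch; Ref, reseat and
consume are names made up for the example):

```cpp
#include <type_traits>

using Ref = int&;

// Spelling "reference to reference" through an alias does not create a new
// kind of type: it collapses to an ordinary reference.
static_assert(std::is_same<Ref&, int&>::value, "Ref& collapses to int&");
static_assert(std::is_same<Ref&&, int&>::value, "Ref&& collapses to int&");

// int*& : a reference to a pointer, perfectly valid.
static void reseat(int*& p, int* target) { p = target; }

// int&& : an rvalue reference, just another kind of reference.
static int consume(int&& v) { return v + 1; }
```

Binding a second reference to an existing one (int& r2 = r1;) simply binds r2
to the object r1 refers to.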

[Bug c++/100465] Overloading operator+= and including filesystem causes conflicting overload compilation error

2021-11-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100465

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed|2021-05-07 00:00:00 |2021-11-25

--- Comment #6 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #2)
> Maybe another case of PR 51577 but I haven't looked into it yet.

The testcase in comment 4 was fixed by the patch for that bug, r12-702.

The original testcase using <filesystem> still fails though.

Patrick, do you think this is just a dup of PR 51577? Do I need to reduce this
again to something that still fails, or do we have a matching testcase already?

[Bug c++/103429] New: Optimization of Auto-generated condition chain is not giving good lookup tables.

2021-11-25 Thread ed at edwardrosten dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429

Bug ID: 103429
   Summary: Optimization of Auto-generated condition chain is not
giving good lookup tables.
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ed at edwardrosten dot com
  Target Milestone: ---

I've got some generated condition chains (using recursive templates) and am
getting some odd/suboptimal optimization results. Code is provided below,
along with a godbolt link.

In the first case (without a force inline), the compiler inlines the functions
but does not perform condition chain optimization. In the second case
(identical code but with force inline), it will optimize condition chains but
only with exactly 5 elements. Otherwise it will end up with an if-else
structure indexing optimized 5-element condition chains, and an if-else chain
for anything spare.

It only attempts the optimization from gcc 11 onwards; I checked on trunk too.


Example:
https://godbolt.org/z/c9xbPqq7r

Here's the code:
template<int I> void f();

constexpr int N=5;

template<int I=0>
static inline void f_dispatch(int i){
    if constexpr (I == N)
        return;
    else if(i == I)
        f<I>();
    else
        f_dispatch<I+1>(i);
}

template<int I=0> __attribute__((always_inline))
static inline void f_dispatch_always_inline(int i){
    if constexpr (I == N)
        return;
    else if(i == I)
        f<I>();
    else
        f_dispatch_always_inline<I+1>(i);
}

void run(int i){
    f_dispatch<>(i);
}

void run_inline(int i){
    f_dispatch_always_inline<>(i);
}
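
For anyone who wants to experiment locally rather than on godbolt, a variant
that also links and runs might look like this.  The definition of f<I> and the
last_called variable are assumptions added for the sketch (the report only
declares f); the template parameter list is likewise a guess at the intent.
It needs C++17 for 'if constexpr'.

```cpp
static int last_called = -1;

// Hypothetical definition of f<I>: it just records which index was hit.
template<int I> void f() { last_called = I; }

constexpr int N = 5;

// Recursive condition chain: compares i against 0, 1, ..., N-1 in turn.
template<int I = 0>
static inline void f_dispatch(int i)
{
  if constexpr (I == N)
    return;                 // end of the chain: no match
  else if (i == I)
    f<I>();
  else
    f_dispatch<I + 1>(i);
}
```

Calling f_dispatch<>(3) ends up in f<3>(), and any value outside 0..N-1 falls
off the end of the chain without calling anything.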

[Bug tree-optimization/103409] [12 Regression] 18% SPEC2017 WRF compile-time regression with -O2 -flto between g:264f061997c0a534 and g:3e09331f6aeaf595

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409

--- Comment #4 from Martin Liška  ---
I'm going to bisect that.

[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.

2021-11-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429

Richard Biener  changed:

   What|Removed |Added

  Component|c++ |tree-optimization
 CC||marxin at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
Sounds like if-to-switch / switch conversion could help.  Note that GCC
converts large switches into binary if trees in some cases as well.

[Bug tree-optimization/103221] evrp removes |SIGN but does not propagate the ssa name

2021-11-25 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103221

--- Comment #3 from Andrew Macleod  ---
And BTW, we do this optimization, just not completely in evrp.  EVRP removes
the extraneous | -128 since that is a range related action.

Constant propagation handles the propagation of the copy into the PHI, I'm not
sure we also need to do it in a VRP pass.

[Bug tree-optimization/103417] [12 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r12-5489

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103417

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Jakub Jelinek  ---
Fixed.

[Bug c++/47256] "--sysroot" option is not passed to COLLECT_GCC_OPTIONS

2021-11-25 Thread richard.purdie at linuxfoundation dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47256

--- Comment #7 from Richard Purdie ---
Thanks for the tip, we'll look into dropping it!

[Bug preprocessor/103415] [12 Regression] ICE in cpp_interpret_string_1, at libcpp/charset.c:1739

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103415

--- Comment #4 from Jakub Jelinek  ---
__VA_OPT__ has been supported for a few more years, my change just added
support for stringification of __VA_OPT__...

[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.

2021-11-25 Thread ed at edwardrosten dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429

--- Comment #2 from Edward Rosten  ---
It is doing if-to-switch, but only really with N=5, and only if force-inline is
set. I think there are two problems; one is that you need to force-inline in
order to trigger if-to-switch.

The other problem is that the if-to-switch conversion, once triggered, only
works well for exactly 5 conditions; otherwise it uses a mix of converted and
unconverted chains. It doesn't appear to be doing binary tree generation, more
like a linear search. Here's the asm for N=7:

run(int):
        ret
run_inline(int):
        test    edi, edi
        je      .L13
        cmp     edi, 1
        je      .L14
        cmp     edi, 6
        ja      .L3
        mov     edi, edi
        jmp     [QWORD PTR .L8[0+rdi*8]]
.L8:
        .quad   .L3
        .quad   .L3
        .quad   .L12
        .quad   .L11
        .quad   .L10
        .quad   .L9
        .quad   .L7
.L13:
        jmp     void f<0>()
.L7:
        jmp     void f<6>()
.L12:
        jmp     void f<2>()
.L11:
        jmp     void f<3>()
.L10:
        jmp     void f<4>()
.L9:
        jmp     void f<5>()
.L14:
        jmp     void f<1>()
.L3:
        ret

Note, it's essentially doing:

if(i==0)
    f<0>();
else if(i==1)
    f<1>();
else if(i > 6)
    return;
else switch(i){
    case 0:
    case 1:
        return;
    case 2: f<2>(); return;
    case 3: f<3>(); return;
    case 4: f<4>(); return;
    case 5: f<5>(); return;
    case 6: f<6>(); return;
}

It's not doing binary searches. For, e.g. N%5 == 1, the structure is more like:

if(i==0)
    f<0>();
else if(i > 5){
    if(i-5 > 4){
        if(i-11>4){
            if(i-16 > 4){
                // and so on, linearly
            }
            else switch(i-16){
                //...
            }
        }
        else switch(i-11){
            //...
        }
    }
    else switch(i-6){
        //...
    }

}
else switch(i){
    case 0:
        return;
    case 1: f<1>(); return;
    case 2: f<2>(); return;
    case 3: f<3>(); return;
    case 4: f<4>(); return;
    case 5: f<5>(); return;
}

[Bug fortran/103414] [10/11/12 Regression] [PDT] ICE in gfc_free_actual_arglist, at fortran/expr.c:547 since r10-2083-g8dc63166e0b85954

2021-11-25 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103414

--- Comment #6 from kargl at gcc dot gnu.org ---
(In reply to Martin Liška from comment #5)
> Started with r10-2083-g8dc63166e0b85954.

Well, no, it did not start with the above commit.
At best, it was exposed by this commit.

[Bug fortran/103412] [10/11/12 Regression] ICE: Invalid expression in gfc_element_size since r10-2083-g8dc63166e0b85954

2021-11-25 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103412

--- Comment #3 from kargl at gcc dot gnu.org ---
(In reply to Martin Liška from comment #2)
> Started with r10-2083-g8dc63166e0b85954.

No, it did not start with this commit.
It was exposed by this commit.

[Bug target/103396] [12 Regression][GCN][BUILD] ICE RTL check: access of elt 4 of vector with last elt 3 in move_callee_saved_registers, at config/gcn/gcn.c:2821

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103396

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Andrew Stubbs :

https://gcc.gnu.org/g:58d50a5dd6344179eebaeb6fd2f895e59463cf74

commit r12-5525-g58d50a5dd6344179eebaeb6fd2f895e59463cf74
Author: Andrew Stubbs 
Date:   Thu Nov 25 15:59:20 2021 +

amdgcn: Fix ICE generating CFI [PR103396]

gcc/ChangeLog:

PR target/103396
* config/gcn/gcn.c (move_callee_saved_registers): Ensure that the
number of spilled registers is counted correctly.

[Bug target/103396] [12 Regression][GCN][BUILD] ICE RTL check: access of elt 4 of vector with last elt 3 in move_callee_saved_registers, at config/gcn/gcn.c:2821

2021-11-25 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103396

Andrew Stubbs  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Andrew Stubbs  ---
This problem should be fixed now.

[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429

Martin Liška  changed:

   What|Removed |Added

   Last reconfirmed||2021-11-25
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

--- Comment #3 from Martin Liška  ---
Looking into it.

[Bug tree-optimization/103425] [12 Regression] 48% tramp3d regression between g:df1a0d526e2e4c75 and g:9e026da720091704 with -Ofast -march=native at Zen

2021-11-25 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103425

--- Comment #4 from Jan Hubicka  ---
In the meantime other testers picked up the revision, and it seems that indeed
only the benzen machine reports this (it is an AMD EPYC 7702).  So it looks
like a microarchitecture-specific issue.

[Bug preprocessor/103415] [12 Regression] ICE in cpp_interpret_string_1, at libcpp/charset.c:1739

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103415

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #5 from Jakub Jelinek  ---
Created attachment 51873
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51873&action=edit
gcc12-pr103415.patch

Untested fix.

[Bug ipa/103227] [12 Regression] 58% exchange2 regression with -Ofast -march=native on zen3 since r12-5223-gecdf414bd89e6ba251f6b3f494407139b4dbae0e

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Martin Jambor :

https://gcc.gnu.org/g:5bc4cb04127a4805b6228b0a6cbfebdbd61314d2

commit r12-5527-g5bc4cb04127a4805b6228b0a6cbfebdbd61314d2
Author: Martin Jambor 
Date:   Thu Nov 25 17:58:12 2021 +0100

ipa: Teach IPA-CP transformation about IPA-SRA modifications (PR 103227)

PR 103227 exposed an issue with ordering of transformations of IPA
passes.  IPA-CP can create clones for constants passed by reference
and at the same time IPA-SRA can also decide that the parameter does
not need to be a pointer (or an aggregate) and plan to convert it
into (a) simple scalar(s).  Because no intermediate clone is created
just for the purpose of ordering the transformations and because
IPA-SRA transformation is implemented as part of clone
materialization, the IPA-CP transformation happens only afterwards,
reversing the order of the transformations compared to the ordering of
analyses.

IPA-CP transformation looks at planned substitutions for values passed
by reference or in aggregates but finds that all the relevant
parameters no longer exist.  Currently it subsequently simply gives
up, leading to clones created for no good purpose (and a huge regression
of 548.exchange2_r).  This patch teaches it to recognize the situation,
look up the new scalarized parameter and perform value substitution on
it.  On my desktop this has recovered the lost exchange2 run-time (and
some more).

I have disabled IPA-SRA in a Fortran testcase so that the dumping from
the transformation phase can still be matched in order to verify that
IPA-CP understands the IL after verifying that it does the right thing
also with IPA-SRA.

gcc/ChangeLog:

2021-11-23  Martin Jambor  

PR ipa/103227
* ipa-prop.h (ipa_get_param): New overload.  Move bits of the existing
one to the new one.
* ipa-param-manipulation.h (ipa_param_adjustments): New member
function get_updated_index_or_split.
* ipa-param-manipulation.c
(ipa_param_adjustments::get_updated_index_or_split): New function.
* ipa-prop.c (adjust_agg_replacement_values): Reimplement, add
capability to identify scalarized parameters and perform substitution
on them.
(ipcp_transform_function): Create descriptors earlier, handle new
return values of adjust_agg_replacement_values.

gcc/testsuite/ChangeLog:

2021-11-23  Martin Jambor  

PR ipa/103227
* gcc.dg/ipa/pr103227-1.c: New test.
* gcc.dg/ipa/pr103227-3.c: Likewise.
* gcc.dg/ipa/pr103227-2.c: Likewise.
* gfortran.dg/pr53787.f90: Disable IPA-SRA.
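
For readers unfamiliar with the two passes, the situation described in the commit message can arise for code shaped roughly like the sketch below. This is a hypothetical illustration, not one of the new pr103227 tests; the names params, sum_strided and sum_with_fixed_params are invented. A constant aggregate is passed by reference (a candidate for an IPA-CP clone), and the callee only reads scalar fields through the pointer (a candidate for IPA-SRA scalarization). Whether either pass actually fires depends on GCC's heuristics and options.

```c
/* Hypothetical illustration only; not taken from the GCC testsuite.  */
struct params
{
  int stride;
  int count;
};

/* Only p->stride and p->count are read, so IPA-SRA may decide to pass
   them as two scalars instead of a pointer.  */
static int __attribute__ ((noinline))
sum_strided (const int *data, const struct params *p)
{
  int sum = 0;
  for (int i = 0; i < p->count; i++)
    sum += data[i * p->stride];
  return sum;
}

/* Every call passes the address of the same constant aggregate, so
   IPA-CP may plan a clone specialized for stride == 2, count == 4.  */
int
sum_with_fixed_params (const int *data)
{
  static const struct params fixed = { 2, 4 };
  return sum_strided (data, &fixed);
}
```

If both transformations are planned for sum_strided, the known constants have to be substituted into the new scalar parameters rather than into loads through the removed pointer, which is the case the patch describes.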

[Bug c++/103428] [11/12 Regression] Parameter packs not expanded with local struct in lambda

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103428

Jakub Jelinek  changed:

   What|Removed |Added

Summary|Parameter packs not |[11/12 Regression]
   |expanded with local struct  |Parameter packs not
   |in lambda   |expanded with local struct
   ||in lambda
   Last reconfirmed||2021-11-25
   Target Milestone|--- |11.3
 Ever confirmed|0   |1
 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org,
   ||ppalka at gcc dot gnu.org
   Priority|P3  |P2
 Status|UNCONFIRMED |NEW

--- Comment #1 from Jakub Jelinek  ---
Started to be rejected again with
r12-392-g2a6fc19e655e696bf0df9b7aaedf9848b23f07f3
11.1 accepts it since
r11-8103-ge89055f90cff9fb6f565b9374e1ab74f805682fb

[Bug target/93453] PPC: rldimi not taken into account to avoid shift+or

2021-11-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93453

--- Comment #9 from Segher Boessenkool  ---
Yeah that looks better already, thanks.  Please get rid of the debug stuff
still in here, and send to gcc-patches@?

[Bug c++/103430] New: ICE in gimplify_var_or_parm_decl, at gimplify.c:2975

2021-11-25 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103430

Bug ID: 103430
   Summary: ICE in gimplify_var_or_parm_decl, at gimplify.c:2975
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
CC: muecker at gwdg dot de
  Target Milestone: ---

g:4e6bf0b9dd5585df1a1472d6a93b9fff72fe2524 fixes the long-standing issue w/
VLAs and statement expressions for the C front-end, but not for the C++ one.
g++ still ICEs when compiling the following testcase, extracted from
gcc/testsuite/gcc.dg/vla-stexp-9.c:

void foo(void)
{
  if (2 * sizeof(int) != sizeof((*({ int N = 2; int (*x)[9][N] = 0; x; })[1])))
    __builtin_abort();
}

% g++-12.0.0 -c mrzd2yqy.c
mrzd2yqy.c: In function 'void foo()':
mrzd2yqy.c:3:29: internal compiler error: in gimplify_var_or_parm_decl, at gimplify.c:2975
3 | if (2 * sizeof(int) != sizeof((*({ int N = 2; int (*x)[9][N] = 0; x; })[1])))
  | ^~~~
0x7a339b gimplify_var_or_parm_decl
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:2975
0xea3e1f gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:15127
0xea98e0 internal_get_tmp_var
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:624
0xeac95e get_initialized_tmp_var(tree_node*, gimple**, gimple**, bool)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:679
0xeac95e gimplify_save_expr
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:6267
0xea4298 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14967
0xea3e7d gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14971
0xea3a51 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:15432
0xea3a51 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:15432
0xebbd7c gimplify_cond_expr
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:4329
0xea446c gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14623
0xea79e6 gimplify_stmt(tree_node**, gimple**)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:7024
0xea82af gimplify_bind_expr
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:1426
0xea444f gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14867
0xea79e6 gimplify_stmt(tree_node**, gimple**)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:7024
0xea82af gimplify_bind_expr
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:1426
0xea444f gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14867
0xebeb96 gimplify_stmt(tree_node**, gimple**)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:7024
0xebeb96 gimplify_body(tree_node*, bool)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:15912
0xebf02d gimplify_function_tree(tree_node*)
        /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:16066

[Bug c++/102454] coroutines: ICE in gimplify_var_or_parm_decl, at gimplify.c:2958

2021-11-25 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102454

--- Comment #6 from Arseny Solokha  ---
Should this PR be closed now?

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #10 from Jakub Jelinek  ---
Alternatively, couldn't we check next to that new
 && have_insn_for (SET, mode)
also that
 && known_le (GET_MODE_SIZE (mode), MOVE_MAX)
?

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393

--- Comment #11 from Jakub Jelinek  ---
Actually no, GET_MODE_SIZE in that case is the size of the whole operation.
To me the previous change looks extremely ARM specific with load lines in mind
which no other target has.  If we want to support more than one SET covering
it, there should be a loop to find out how large each load should be and we
should decide that based on MOVE_MAX.
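
The loop suggested in comment #11 might look roughly like the following standalone sketch. It is not GCC code: split_into_moves and its plain-integer move_max parameter are hypothetical stand-ins for logic that would consult the target's MOVE_MAX, and the sketch assumes each piece should be the largest power of two that fits both the remaining size and the limit.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GCC.  Split an operation of TOTAL
   bytes into pieces of at most MOVE_MAX bytes each, choosing for every
   piece the largest power of two that still fits.  The piece sizes are
   stored in PIECES (capacity MAX_PIECES).  Returns the number of pieces,
   or 0 if TOTAL is 0, MOVE_MAX is 0, or PIECES is too small.  */
size_t
split_into_moves (size_t total, size_t move_max,
                  size_t *pieces, size_t max_pieces)
{
  size_t n = 0;

  if (move_max == 0)
    return 0;

  while (total > 0)
    {
      size_t limit = total < move_max ? total : move_max;
      size_t piece = 1;

      /* Largest power of two not exceeding LIMIT.  */
      while (piece * 2 <= limit)
        piece *= 2;

      if (n == max_pieces)
        return 0;

      pieces[n++] = piece;
      total -= piece;
    }

  return n;
}
```

With a 16-byte limit, a 32-byte operation would be covered by two 16-byte pieces instead of a single 32-byte one, which is the kind of decision the comment argues should depend on MOVE_MAX.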

[Bug tree-optimization/103409] [12 Regression] 18% SPEC2017 WRF compile-time regression with -O2 -flto since r12-3903-g0288527f47cec669

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409

Martin Liška  changed:

   What|Removed |Added

Summary|[12 Regression] 18% |[12 Regression] 18%
   |SPEC2017 WRF compile-time   |SPEC2017 WRF compile-time
   |regression with -O2 -flto   |regression with -O2 -flto
   |between g:264f061997c0a534  |since
   |and g:3e09331f6aeaf595  |r12-3903-g0288527f47cec669
   Keywords|needs-bisection |
 CC||aldyh at gcc dot gnu.org,
   ||amacleod at redhat dot com

--- Comment #5 from Martin Liška  ---
Started with r12-3903-g0288527f47cec669.

[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.

2021-11-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429

--- Comment #4 from Martin Liška  ---
So it's very funny what's happening here.  The iftoswitch pass is called for,
e.g., f_dispatch_always_inline<10>, f_dispatch_always_inline<9> and so on until
f_dispatch_always_inline<5>, which is converted to a switch.
Then all early passes, including einline, are called for
f_dispatch_always_inline<4>, and we end up with:

__attribute__((always_inline))
void f_dispatch_always_inline<4> (int i)
{
   :
  if (i_2(D) == 4)
goto ; [INV]
  else
goto ; [INV]

   :
  f<4> ();
  goto ; [INV]

   :
  switch (i_2(D))  [16.67%], case 5:  [16.67%], case 6: 
[16.67%], case 7:  [16.67%], case 8:  [16.67%], case 9:  [16.67%]>

   :
:
  f<5> ();
  goto ; [100.00%]

   :
:
  f<6> ();
  goto ; [100.00%]

   :
:
  f<7> ();
  goto ; [100.00%]

   :
:
  f<8> ();
  goto ; [100.00%]

   :
:
  f<9> ();

   :
:
  return;

}

which is a mixture of if and switch statements.

So what we basically need is hybrid if-to-switch support for if-else chains
combined with switches.
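
At the source level, the mixed shape in the dump corresponds roughly to the first function in the sketch below, and the desired result to the second. This is a hypothetical C rendering, not the PR's C++ testcase: the function names are invented, and the return values merely stand in for the calls to f<4> () through f<9> ().

```c
/* Hypothetical illustration of the shape in the dump above: one
   remaining if test followed by an already-formed switch.  */
int
dispatch_mixed (int i)
{
  if (i == 4)
    return 40;

  switch (i)
    {
    case 5: return 50;
    case 6: return 60;
    case 7: return 70;
    case 8: return 80;
    case 9: return 90;
    default: return 0;
    }
}

/* The form a hybrid if-to-switch conversion would ideally produce:
   the leading comparison becomes one more case of the same switch.  */
int
dispatch_switch (int i)
{
  switch (i)
    {
    case 4: return 40;
    case 5: return 50;
    case 6: return 60;
    case 7: return 70;
    case 8: return 80;
    case 9: return 90;
    default: return 0;
    }
}
```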

[Bug c++/102213] Incorrect executable produced from valid input code with virtual consteval

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102213

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
  Known to fail||11.1.0, 11.2.0
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-11-25

--- Comment #1 from Andrew Pinski  ---
Confirmed, at -O1 it is constant evaluated in the front-end and it works.

[Bug c++/102213] Incorrect executable produced from valid input code with virtual consteval

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102213

--- Comment #2 from Andrew Pinski  ---
Note that GCC 10 issued a sorry message:
sorry, unimplemented: 'virtual' 'consteval'

[Bug c++/102454] coroutines: ICE in gimplify_var_or_parm_decl, at gimplify.c:2958

2021-11-25 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102454

--- Comment #7 from Iain Sandoe  ---
I was leaving it to check if we needed to back port to 10.x as well.

[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug middle-end/103406] [12 Regression] gcc -O0 behaves differently on "DBL_MAX related operations" than gcc -O1 and above

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103406

--- Comment #12 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:6ea5fb3cc7f3cc9b731d72183c66c23543876f5a

commit r12-5529-g6ea5fb3cc7f3cc9b731d72183c66c23543876f5a
Author: Roger Sayle 
Date:   Thu Nov 25 19:02:06 2021 +

PR middle-end/103406: Check for Inf before simplifying x-x.

This is a simple one line fix to the regression PR middle-end/103406,
where x - x is being folded to 0.0 even when x is +Inf or -Inf.
In GCC 11 and previously, we'd check whether the type honored NaNs
(which implicitly covered the case where the type honors infinities),
but my patch to test whether the operand could potentially be NaN
failed to also check whether the operand could potentially be Inf.

2021-11-25  Roger Sayle  

gcc/ChangeLog
PR middle-end/103406
* match.pd (minus @0 @0): Check tree_expr_maybe_infinite_p.

gcc/testsuite/ChangeLog
PR middle-end/103406
* gcc.dg/pr103406.c: New test case.
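
To see why the fold is invalid, here is a minimal sketch (not the new gcc.dg/pr103406.c test; the function names are invented for illustration). Under IEEE 754 arithmetic, subtracting an infinity from itself yields NaN, so x - x is 0.0 only when x is finite. The volatile temporary is an assumption of this sketch: it keeps the compiler from folding the subtraction at compile time.

```c
#include <math.h>

/* Compute x - x at run time.  The volatile temporary prevents the
   compiler from simplifying the expression.  */
double
subtract_from_itself (double x)
{
  volatile double v = x;
  return v - v;
}

/* Nonzero when x - x is NaN, i.e. when folding it to 0.0 would be wrong.  */
int
self_difference_is_nan (double x)
{
  return isnan (subtract_from_itself (x));
}
```

For finite inputs such as 1.5 the difference is 0.0; for INFINITY, -INFINITY and NaN it is NaN, provided the code is built without options such as -ffast-math that let the compiler assume no infinities or NaNs.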

[Bug tree-optimization/102958] std::u8string suboptimal compared to std::string, triggers warnings

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102958

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-11-25
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #3 from Andrew Pinski  ---
Confirmed, interesting we don't detect this as strlen:
   [local count: 8687547547]:
  # __i_155 = PHI <__i_46(3), 0(2)>
  __i_46 = __i_155 + 1;
  _48 = MEM[(const char_type &)"123456789" + __i_46 * 1];
  if (_48 != 0)
goto ; [89.00%]
  else
goto ; [11.00%]

I thought there was code to do that detection now?
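
For reference, the loop in the GIMPLE above has roughly the shape of the hand-written length loop below. This is a hypothetical C sketch using plain char; the PR itself concerns C++ code (presumably with char8_t elements), and count_until_nul is an invented name.

```c
#include <stddef.h>

/* Walk the string until the terminating NUL and return the number of
   characters before it, which is what strlen computes.  */
size_t
count_until_nul (const char *s)
{
  size_t i = 0;
  while (s[i] != 0)
    ++i;
  return i;
}
```

Recognizing such a loop as a strlen call would let a call on a string literal like "123456789" fold to the constant 9.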

[Bug target/102117] s390: Inefficient code for 64x64=128 signed multiply for <= z13

2021-11-25 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102117

Roger Sayle  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |12.0

--- Comment #4 from Roger Sayle  ---
This should now be fixed on mainline.

[Bug tree-optimization/103332] Spurious -Wstringop-overflow warnings in libstdc++ tests

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103332

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-11-25
 Status|UNCONFIRMED |NEW

--- Comment #4 from Andrew Pinski  ---
.

[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-11-25
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Severity|normal  |enhancement

--- Comment #11 from Andrew Pinski  ---
Confirmed.
I had thought there was another bug about this but I can't find it.

[Bug c++/103426] Acceptance of invalid template specialization in a namespace not enclosing the specialized template

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103426

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56119
 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
This is a dup of bug 56119.

*** This bug has been marked as a duplicate of bug 56119 ***

[Bug c++/56119] Allows static member definition of template class in namespace not enclosing this class

2021-11-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56119

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103426
 CC||fchelnokov at gmail dot com

--- Comment #3 from Andrew Pinski  ---
*** Bug 103426 has been marked as a duplicate of this bug. ***

[Bug middle-end/103406] gcc -O0 behaves differently on "DBL_MAX related operations" than gcc -O1 and above

2021-11-25 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103406

Roger Sayle  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|roger at nextmovesoftware dot com  |unassigned at gcc dot gnu.org
Summary|[12 Regression] gcc -O0 |gcc -O0 behaves differently
   |behaves differently on  |on "DBL_MAX related
   |"DBL_MAX related|operations" than gcc -O1
   |operations" than gcc -O1|and above
   |and above   |
 Target||x86_64

--- Comment #13 from Roger Sayle  ---
The Inf - Inf => 0.0 regression should now be fixed on mainline.

Hmm.  As hinted by Richard Biener's investigation, the underlying problem is
even more pervasive.  It turns out that on x86/IA64 chips, floating point
addition is not commutative, i.e. x+y is not the same as y+x, as demonstrated
by the test program below:

#include <stdio.h>

const double pn = __builtin_nan("");
const double mn = -__builtin_nan("");

__attribute__ ((noinline, noclone))
double plus(double x, double y)
{
  return x + y;
}

int main()
{
  printf("%lf\n",plus(pn,mn));
  printf("%lf\n",plus(mn,pn));
  return 0;
}

Output:
nan
-nan

Unfortunately, GCC assumes almost everywhere that FP addition is commutative
and (as per comments #8 and #9) associative with negation/minus.  This appears
to be a target property, cf. libgcc's _FP_CHOOSENAN, but could in theory be
resolved by a -fstrict-math mode (that implies -ftrapping-math) that disables
commutativity (swapping of operands) throughout the compiler, including
reload/fold-const etc., on affected Intel-like targets.
Perhaps this PR is a duplicate now that the regression has been fixed?

[Bug tree-optimization/103345] missed optimization: add/xor individual bytes to form a word

2021-11-25 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103345

Roger Sayle  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Roger Sayle  ---
This PR should now be fixed (missed optimization implemented) on mainline.

[Bug libstdc++/101608] ranges::fill/fill_n missing std::is_constant_evaluated() condition for __builtin_memset

2021-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101608

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:82c3657dd74896b39937bb0a2aaeba9b8ca105fd

commit r12-5530-g82c3657dd74896b39937bb0a2aaeba9b8ca105fd
Author: Jonathan Wakely 
Date:   Wed Nov 24 13:17:54 2021 +

libstdc++: Do not use memset in constexpr calls to ranges::fill_n
[PR101608]

libstdc++-v3/ChangeLog:

PR libstdc++/101608
* include/bits/ranges_algobase.h (__fill_n_fn): Check for
constant evaluation before using memset.
* testsuite/25_algorithms/fill_n/constrained.cc: Check
byte-sized values as well.

[Bug tree-optimization/98953] Failure to optimize two reads from adjacent addresses into one

2021-11-25 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98953

Roger Sayle  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|roger at nextmovesoftware dot com  |unassigned at gcc dot gnu.org

--- Comment #4 from Roger Sayle  ---
The MULT_EXPR and PLUS_EXPR aspects of this PR are now resolved (i.e. the case
in comment #1), but unfortunately the abs-based indexing used in the original
report still causes problems.  The bswap pass doesn't yet handle memory
accesses of the form read[abs]/read[abs+1] (but does handle read[0]/read[1]).
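
The distinction can be illustrated with the sketch below. It is hypothetical and not the PR's testcase: the function names are invented, and idx stands in for what the comment calls abs, on the assumption that abs there denotes a variable index. Both functions combine two adjacent bytes into one little-endian 16-bit value; only the first uses constant offsets.

```c
#include <stddef.h>
#include <stdint.h>

/* Constant offsets: the read[0]/read[1] form that the comment says the
   bswap pass already handles.  */
uint16_t
read_pair_const (const unsigned char *read)
{
  return (uint16_t) (read[0] | (read[1] << 8));
}

/* Variable index: the read[idx]/read[idx + 1] form, standing in for
   the read[abs]/read[abs+1] accesses mentioned in the comment.  */
uint16_t
read_pair_at (const unsigned char *read, size_t idx)
{
  return (uint16_t) (read[idx] | (read[idx + 1] << 8));
}
```

In both cases the two byte loads are adjacent in memory, so on a little-endian target each could in principle be a single 16-bit load.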

[Bug tree-optimization/99520] Failure to detect bswap pattern

2021-11-25 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520

Roger Sayle  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 CC||roger at nextmovesoftware dot com
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Roger Sayle  ---
This PR is now fixed on mainline.  Thanks to Jakub (my apologies: if I'd seen
comment #2 I wouldn't have accidentally broken things, aka PR
tree-optimization/103376; fortunately Jakub was able to quickly correct my
oversight).

[Bug middle-end/103406] gcc -O0 behaves differently on "DBL_MAX related operations" than gcc -O1 and above

2021-11-25 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103406

--- Comment #14 from joseph at codesourcery dot com  ---
There is no reasonable definition of how operands of binary + map to 
particular operands of a particular instruction and so no -f or -m option 
could sensibly be defined for that.  When the result is a NaN, there is no 
requirement at all on what (quiet) NaN it is (beyond a preference for 
preservation of the payload of a NaN operand if there is at least one NaN 
operand).
