[Bug middle-end/104558] [9/10/11/12 Regression] ICE: in expand_call, at calls.cc:3601 with -fabi-version=9 on pr83487.c

2022-03-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104558

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:e6e6e0a97340068c90fe091482efbaacd6474754

commit r12-7458-ge6e6e0a97340068c90fe091482efbaacd6474754
Author: Jakub Jelinek 
Date:   Thu Mar 3 09:11:09 2022 +0100

calls: When bypassing emit_push_insn for 0 sized arg, emit at least
anti_adjust_stack for alignment pad if needed [PR104558]

The following testcase ICEs on x86_64 when asked to use the pre-GCC 8
ABI where zero sized arguments weren't ignored.
In GCC 7 the emit_push_insn calls in store_one_arg were unconditional,
it is true that they didn't actually push anything because it had zero
size, but because arg->locate.alignment_pad is 8 in this case,
emit_push_insn at the end performs
  if (alignment_pad && args_addr == 0)
    anti_adjust_stack (alignment_pad);
and an assert later on is upset if we don't do it.
The following patch keeps the emit_push_insn conditional but calls
the anti_adjust_stack when needed by hand for the zero sized arguments.
For the new x86_64 ABI where zero sized arguments are ignored
arg->locate.alignment_pad is 0 in this case, so nothing changes
- we in that case really do ignore it.

There is another emit_push_insn call earlier in store_one_arg, also made
conditional on non-zero size by Marek in GCC 8, but that one is for
arguments with non-BLKmode and the only way those can be zero size is
if they are TYPE_EMPTY_P aka when they are completely ignored.  But
I believe arg->locate.alignment_pad should be 0 in that case, so IMHO
there is no need to do anything in the second spot.

2022-03-03  Jakub Jelinek  

PR middle-end/104558
* calls.cc (store_one_arg): When not calling emit_push_insn
because size_rtx is const0_rtx, call at least anti_adjust_stack
on arg->locate.alignment_pad if !argblock and the alignment might
be non-zero.

* gcc.dg/pr104558.c: New test.

[Bug middle-end/104757] [12 Regression][OpenMP] ICE in GIMPLE pass: walloca - in gimple_range_global / segfault as SSA_NAME_DEF_STMT is NULL for 'if' clause arg

2022-03-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104757

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:431414b5d934866af3f6415a56c35bb57b928fef

commit r12-7459-g431414b5d934866af3f6415a56c35bb57b928fef
Author: Jakub Jelinek 
Date:   Thu Mar 3 09:13:32 2022 +0100

openmp: Disable SSA form during gimplification on OMP_SIMD clauses and body
[PR104757]

When offloading to nvptx is enabled, scan_omp_simd duplicates the simd
region including its clauses and body using inliner's
copy_gimple_seq_and_replace_locals.  That works nicely for decls, remaps
only those that are seen in the nested bind expr vars (i.e. local
variables) and doesn't remap other vars.  But for SSA_NAMEs it remaps them
always, doesn't know if their def stmt is outside of the simd (then it
shouldn't be remapped) or inside of it (then it should) and without
cfg/dominators that is pretty hard to figure out (well, we could walk the
region twice, once note SSA_NAMEs defined by each stmt seen there and once
do the remapping of only those visited SSA_NAMEs).

This patch uses a simpler way, disables temporarily into_ssa for the
clauses and body of each simd region; we already disable into_ssa e.g. in
parallel/target/task etc. regions through push_gimplify_context () but for
simd we don't push any gimplification context and apart from into_ssa I
think we don't need it.

2022-03-03  Jakub Jelinek  

PR middle-end/104757
* gimplify.cc (gimplify_omp_loop): Call gimplify_expr rather than
gimplify_omp_for.
(gimplify_expr) : Temporarily disable
gimplify_ctxp->into_ssa around call to gimplify_omp_for.

* gfortran.dg/gomp/pr104757.f90: New test.
* gcc.dg/gomp/pr104757.c: New test.

[Bug middle-end/104558] [9/10/11 Regression] ICE: in expand_call, at calls.cc:3601 with -fabi-version=9 on pr83487.c

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104558

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[9/10/11/12 Regression] |[9/10/11 Regression] ICE:
   |ICE: in expand_call, at |in expand_call, at
   |calls.cc:3601 with  |calls.cc:3601 with
   |-fabi-version=9 on  |-fabi-version=9 on
   |pr83487.c   |pr83487.c

--- Comment #5 from Jakub Jelinek  ---
Fixed on the trunk.  I think for backports we'd need a proof that we always ICE
or at least that if it doesn't ICE the callers are ABI incompatible with
callees.

[Bug middle-end/104757] [12 Regression][OpenMP] ICE in GIMPLE pass: walloca - in gimple_range_global / segfault as SSA_NAME_DEF_STMT is NULL for 'if' clause arg

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104757

--- Comment #12 from Jakub Jelinek  ---
I admit I haven't built an offloading compiler, just hand edited auto-host.h to
ENABLE_OFFLOADING 1 and used the env var to debug those 2 testcases and then
tested on non-offloading compiler.
So, not closing this just yet until confirmed it works for offloading compiler
too.

[Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage

2022-03-03 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #13 from Hongtao.liu  ---
(In reply to H.J. Lu from comment #10)
> Created attachment 52553 [details]
> A patch to always return pseudo register in ix86_gen_scratch_sse_rtx

Please go ahead with this patch, I'll submit an incremental patch for #12

[Bug tree-optimization/104746] [12 Regression] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c

2022-03-03 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746

Martin Liška  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|WAITING |RESOLVED

--- Comment #2 from Martin Liška  ---
(In reply to Martin Sebor from comment #1)
> The warning certainly looks cryptic but seems to actually point out a real
> bug in the code: len is set to 1 less than the number of bytes the sprintf
> call writes to the buffer (the two strings plus the slash character plus the
> terminating nul byte).

Oh, I overlooked that. Thanks for the explanation, I'm going to report that to
the upstream project.
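
The size arithmetic described in the quoted comment can be written out explicitly. The following is only a minimal sketch: the helper name `join_path` is made up for illustration and is not taken from the upstream project. It shows that a "%s/%s" result needs both string lengths, plus one byte for the slash, plus one byte for the terminating nul.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not from the upstream project): joins two path
// components with a '/' into a freshly allocated buffer.  The caller
// releases the result with std::free.
char *join_path(const char *dir, const char *name)
{
  // strlen(dir) + 1 for '/' + strlen(name) + 1 for the terminating nul.
  std::size_t size = std::strlen(dir) + 1 + std::strlen(name) + 1;
  char *buf = static_cast<char *>(std::malloc(size));
  if (!buf)
    return nullptr;
  // snprintf writes at most `size` bytes, so an off-by-one in the size
  // computation would truncate the output instead of overflowing.
  std::snprintf(buf, size, "%s/%s", dir, name);
  return buf;
}
```

Using snprintf with the computed size is a design choice of this sketch, not something the thread prescribes; it simply makes a miscounted length fail safely.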

[Bug c++/96780] debuginfo for std::move and std::forward isn't useful

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96780

--- Comment #9 from Jonathan Wakely  ---
As well as folding move and forward, it probably makes sense to do the same for
as_const and addressof (and our internal __addressof). addressof is
particularly annoying because it's uglier *and* slower than taking the address,
but 100% necessary because of ADL.
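
As an illustration of why plain `&` is not an option in library code, here is a small sketch. The namespace `lib`, the type `Evil` and both wrapper functions are invented for this example; it only demonstrates the case where argument-dependent lookup finds a user-provided `operator&`, which is one of the hazards addressof guards against.

```cpp
#include <memory>

namespace lib {
struct Evil { int value; };
// Non-member operator& that argument-dependent lookup finds for `&e`.
inline Evil *operator&(Evil &) { return nullptr; }
}

// Plain `&` picks up lib::operator& through ADL and yields nullptr.
inline lib::Evil *plain_address(lib::Evil &e) { return &e; }

// std::addressof bypasses any overloaded operator& and returns the
// real address of the object.
inline lib::Evil *real_address(lib::Evil &e) { return std::addressof(e); }
```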

[Bug target/104758] [nvptx] sm_30 support

2022-03-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Tom de Vries :

https://gcc.gnu.org/g:07667c911b1827fb98a1b5da621d51d8fcf0409a

commit r12-7462-g07667c911b1827fb98a1b5da621d51d8fcf0409a
Author: Tom de Vries 
Date:   Wed Mar 2 12:04:39 2022 +0100

[nvptx] Build libraries with misa=sm_30

In gcc-11, when specifying -misa=sm_30, an executable may still contain
sm_35 code (due to libraries being built with the default -misa=sm_35), so
it won't run on an sm_30 board.

Fix this by building libraries with sm_30, as was the case in gcc-5 to
gcc-10.

gcc/ChangeLog:

2022-03-03  Tom de Vries  

PR target/104758
* config/nvptx/t-nvptx (MULTILIB_EXTRA_OPTS): Add misa=sm_30.

[Bug target/104758] [nvptx] sm_30 board support broken

2022-03-03 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758

Tom de Vries  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
   Target Milestone|--- |12.0
Summary|[nvptx] sm_30 support   |[nvptx] sm_30 board support
   ||broken
 Resolution|--- |FIXED

--- Comment #4 from Tom de Vries  ---
Fixed by "[nvptx] Build libraries with misa=sm_30".

[Bug tree-optimization/104746] [12 Regression] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c

2022-03-03 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746

Martin Liška  changed:

   What|Removed |Added

 Status|RESOLVED|NEW
 Resolution|INVALID |---

--- Comment #3 from Martin Liška  ---
> That said, the warning persists even with a buffer of sufficient size, but
> then disappears if the empty definition of systemd_escape2() is removed. 
> Since the function fails to return a result the test case is invalid, I'm
> guessing because it was reduced too far.  Can you provide a valid test case?
> 

Sure, there's a valid test-case that is completely fine and we emit a warning:

$ cat problem.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
  const char *pipefs_path = argv[1];
  const char *dirname = argv[1];
  const char *suffix = ".mount";

  int len = strlen(pipefs_path);
  char *result = malloc(len + strlen(suffix) + 1);
  sprintf(result, "%s", suffix);

  char *path = malloc(strlen(dirname) + strlen(result) + 2);
  sprintf(path, "%s/%s", dirname, result);

  return 0;
}

[Bug c++/103705] [12 Regression] ICE: tree check: expected tree that contains 'decl minimal' structure, have 'array_ref' in finish_omp_clauses, at cp/semantics.c:7928 since r12-5838-g6c0399378e77d029

2022-03-03 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103705

Tobias Burnus  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org

--- Comment #3 from Tobias Burnus  ---
Can this be closed? Or is something in addition still needed?

(It is marked as GCC 12 regression and I see one mainline/GCC12 commit for it.)

[Bug target/104768] New: [nvptx] Exploit Independent Thread Scheduling for sm_70+

2022-03-03 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104768

Bug ID: 104768
   Summary: [nvptx] Exploit Independent Thread Scheduling for
sm_70+
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Starting with sm_70, a fundamental change in the architecture occurred, called
"Independent Thread Scheduling".  It means warp threads are no longer
executing in lock-step.

We could try to exploit this in the port.

F.i., is it still necessary to emit a warp sync after a diverging branch?

[Bug tree-optimization/104746] [12 Regression] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c

2022-03-03 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746

--- Comment #4 from Martin Liška  ---
Btw. upstream applied the following patch:
http://git.linux-nfs.org/?p=steved/nfs-utils.git;a=blobdiff;f=systemd/rpc-pipefs-generator.c;h=7b2bb4f7fe7d27f6339ac7adf9562523b1395aef;hp=c24db567a45d9f544a1a094e0c0fbc5582f80da9;hb=7f8463fe702174bd613df9d308cc899af25ae02e;hpb=475857723dee84a68b675a56ca0ea2e81c2f626e

that seems to me like a papering over.

[Bug ipa/104187] Call site specific attribute to control inliner

2022-03-03 Thread david.bolvansky at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104187

--- Comment #8 from Dávid Bolvanský  ---
So this works in Clang now

int foo(int x, int y) { // any compiler will happily inline this function
return x / y;
}

int test(int x, int y) {
int r = 0;
[[clang::noinline]] r += foo(x, y); // for some reason we don't want any inlining here
return r;
}


foo(int, int): # @foo(int, int)
  mov eax, edi
  cdq
  idiv esi
  ret
test(int, int): # @test(int, int)
  jmp foo(int, int) # TAILCALL


[Bug fortran/104131] ICE in gfc_conv_array_ref, at fortran/trans-array.c:3810

2022-03-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104131

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Kwok Yeung :

https://gcc.gnu.org/g:88c4d85e27e18bf991ab8728b73127a0385f2c27

commit r12-7464-g88c4d85e27e18bf991ab8728b73127a0385f2c27
Author: Kwok Cheung Yeung 
Date:   Thu Mar 3 10:23:26 2022 +

openmp, fortran: Check that the type of an event handle in a detach clause
is suitable [PR104131]

This rejects variables that are array types, array elements or derived type
members when used as the event handle inside a detach clause (in accordance
with the OpenMP specification).  This would previously lead to an ICE.

2022-03-03  Kwok Cheung Yeung  

gcc/fortran/

PR fortran/104131
* openmp.cc (gfc_match_omp_detach): Move check for type of event
handle to...
(resolve_omp_clauses) ...here.  Also check that the event handle is
not an array, or an array access or structure element access.

gcc/testsuite/

PR fortran/104131
* gfortran.dg/gomp/pr104131.f90: New.
* gfortran.dg/gomp/task-detach-1.f90: Update expected error
message.

[Bug c++/103705] [12 Regression] ICE: tree check: expected tree that contains 'decl minimal' structure, have 'array_ref' in finish_omp_clauses, at cp/semantics.c:7928 since r12-5838-g6c0399378e77d029

2022-03-03 Thread cltang at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103705

Chung-Lin Tang  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #4 from Chung-Lin Tang  ---
I don't see more to fix ATM. Closing for now.

[Bug target/104769] New: [nvptx] mptx/misa multilibs

2022-03-03 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104769

Bug ID: 104769
   Summary: [nvptx] mptx/misa multilibs
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

The current situation is:
- default: -misa=sm_35 -mptx=6.0
- libraries: -misa=sm_30 -mptx=3.1

There's an open question on whether we need or want multilibs for different
values of misa and/or mptx.

First, let's look at drivers.  The libraries use the lowest (supported by gcc)
versions, which are supported by the latest production branch driver 510.x.  So
from that point of view, there's no need.

Then let's look at misa.  Potentially, having a multilib for each misa value is
the most optimal solution.  But is it faster in practice?  I'd expect the
performance critical code to reside in the applications, which can be using
their own optimal -misa settings.

Now, let's look at mptx.  Newer ptx isa versions unlock new features, which
could enable code that is currently not compiling.  F.i., alloca, support for
.alias.  Again here, the question is whether we need this only in the
applications or also in the libraries.  We'll assess this each time we start
using a new feature.

Finally, let's look at mgomp/libgomp.a.

This has code like:
...
@ %r73 membar.sys;
@ %r73 atom.add.u64 %r25,[%r35+72],%r32;
@ %r73 membar.sys;
{
.reg .b32 act;
vote.ballot.b32 act,1;
.reg .pred uni;
setp.eq.b32 uni,act,0x;
@ ! uni trap;
@ ! uni exit;
...

Arguably, when compiling with -mptx=6.0, which is the default, we want to use
bar.warp.sync instead.

So, perhaps a default mptx=3.1 and a multilib mptx=6.0?

[Bug c++/104702] [12 Regression] False positive -Wunused-value warning with -fno-exceptions

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104702

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||msebor at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
I think the problem in this case are primarily the warning-control.cc changes.
In GCC 11, build_vec_init did at the end:
  /* Now make the result have the correct type.  */
  if (TREE_CODE (atype) == ARRAY_TYPE)
    {
      atype = build_pointer_type (atype);
      stmt_expr = build1 (NOP_EXPR, atype, stmt_expr);
      stmt_expr = cp_build_fold_indirect_ref (stmt_expr);
      TREE_NO_WARNING (stmt_expr) = 1;
    }
GCC 12 does instead:
  if (TREE_CODE (atype) == ARRAY_TYPE)
    {
      atype = build_pointer_type (atype);
      stmt_expr = build1 (NOP_EXPR, atype, stmt_expr);
      stmt_expr = cp_build_fold_indirect_ref (stmt_expr);
      suppress_warning (stmt_expr /* What warning? */);
    }
Clearly, at least one of the intended warnings that should be suppressed is
OPT_Wunused_value.
But as EXPR_LOCATION (stmt_expr) == UNKNOWN_LOCATION here, all suppress_warning
does is set TREE_NO_WARNING bit on it (i.e. like GCC 11 used).
Later on, add_stmt is called on it and does:
  if (!EXPR_HAS_LOCATION (t))
    SET_EXPR_LOCATION (t, input_location);
There was also even before build_vec_init a call to
suppress_warning with some unrelated expression that had the
284032 {file = 0x3f67310 "pr104702.C", line = 11, column = 31, data = 0x0, sysp
= false}
location (with OPT_Wparentheses) which is the input_location now newly set by
add_stmt on the INDIRECT_REF.
Later on, convert_to_void calls
&& !warning_suppressed_p (expr, OPT_Wunused_value)
on the INDIRECT_REF.  TREE_NO_WARNING bit is set, but at this point it has a
location, so it looks up the location in hash map, finds that OPT_Wparentheses
is the warning to be suppressed there and returns that OPT_Wunused_value isn't
suppressed there, even when we wanted to suppress all warnings on the tree.
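
The sequence of events can be replayed in a toy model. This is a deliberately simplified sketch and not the warning-control.cc code: `Node`, `Registry`, the `Opt` enum and the lookup rule are all invented here, based only on the behaviour described in this comment.

```cpp
#include <map>
#include <set>

enum Opt { Wparentheses, Wunused_value };

struct Node {
  int location = 0;         // 0 stands in for UNKNOWN_LOCATION
  bool no_warning = false;  // stands in for the TREE_NO_WARNING bit
};

struct Registry {
  // Suppressed options are recorded per location, not per node.
  std::map<int, std::set<Opt>> by_location;

  void suppress(Node &n, Opt opt) {
    n.no_warning = true;
    if (n.location != 0)
      by_location[n.location].insert(opt);  // nothing recorded for location 0
  }

  bool suppressed_p(const Node &n, Opt opt) const {
    if (!n.no_warning)
      return false;
    auto it = by_location.find(n.location);
    if (it == by_location.end())
      return true;  // bit set but no record: everything counts as suppressed
    return it->second.count(opt) != 0;
  }
};

// Replays the order of events from the comment and reports whether
// Wunused_value ends up suppressed on the node that had no location.
inline bool replay() {
  Registry r;
  Node other;
  other.location = 284032;
  r.suppress(other, Wparentheses);          // unrelated expression
  Node indirect_ref;                        // no location yet
  r.suppress(indirect_ref, Wunused_value);  // only sets the bit
  indirect_ref.location = 284032;           // add_stmt assigns input_location
  return r.suppressed_p(indirect_ref, Wunused_value);
}
```

In this model `replay()` returns false: the lookup finds the unrelated Wparentheses record at the newly assigned location and concludes that Wunused_value was never suppressed.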

So, I think the options are either to analyze everything where suppress_warning
might be called on trees that don't have EXPR_LOCATION set yet but might get it
changed later on, or perhaps change the warning-control.cc implementation for
the cases where it is called on trees with reserved location (grab another bit
next to TREE_NO_WARNING and set it too if we want to suppress some warnings on
reserved location tree, which would then be sticky and would imply suppress all
warnings on this tree from now on).
Or change warning-control.cc not to base anything on location_t's but on the
trees or gimple * pointers instead.  We'd of course need to copy records in the
hash map if we copy a tree or gimple stmt including the
TREE_NO_WARNING/gimple_no_warning bit (and would need to be deletable so that
when some trees/gimple stmts are garbage collected we remove entries from those
hash tables).  In fact, I thought that was the plan for warning control.
Basing it on location_t has beyond the above mentioned problem also a problem
if ever EXPR_LOCATION changes later on (I admit except for UNKNOWN_LOCATION ->
some other location changes that is quite rare).

Also, unless I'm misreading the warning-control.cc code, it seems it is based
on whatever location_t a tree or especially gimple * has, but we encode in
location_t not just the location, but also the BLOCK starting with
gimple-low.cc, and I don't see any stripping of that BLOCK stuff, i.e. no uses
of LOCATION_LOCUS anywhere.  So again I don't see how it can work well:
suppress_warning before gimple-low.cc sets gimple_no_warning bit and fills in
details for one location, then we change the location_t because we add BLOCK to
it, then some warning pass tests warning_suppressed_p and either will not find
the new location_t in there and thus will imply all warnings are suppressed, or
worse, suppress_warning before gimple-low.cc was for one warning, there is
another suppress_warning after it on the same stmt (that will add a record on
the new location_t) and finally a warning_suppressed_p test on the first
warning will return false.

[Bug libstdc++/97759] Could std::has_single_bit be faster?

2022-03-03 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759

Peter Cordes  changed:

   What|Removed |Added

 CC||peter at cordes dot ca

--- Comment #14 from Peter Cordes  ---
Agreed with the idea of doing the  popcount(x) == 1  peephole replacement in
the compiler, rather than forcing the header to figure out whether we have
efficient popcount or not.

If we have BMI1, its BLSR (Bit Lowest-Set Reset) sets CF=1 if the input was
zero, and ZF=1 if the output is zero.  Unfortunately none of the standard
jcc/setcc conditions check ZF=1 & CF=0, even with CMC to invert CF first.
(If Intel had designed it to produce ZF and CF inverted, it would be
non-intuitive for all other uses but would have allowed blsr / jcc to implement
if(has_single_bit(x)).)

CF=1, ZF=0  impossible: input was zero, output was non-zero
CF=1, ZF=1  input was zero
CF=0, ZF=0  input had multiple bits set
CF=0, ZF=1  input had a single bit set.
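
For reference, the flag table maps onto a portable formulation, since `x & (x - 1)` is the value BLSR computes. This is a sketch with illustrative function names, not the libstdc++ implementation; `__builtin_popcount` is a GCC/Clang builtin.

```cpp
#include <cstdint>

// x & (x - 1) clears the lowest set bit, which is what BLSR computes.
// ZF=1 corresponds to (x & (x - 1)) == 0 and CF=1 corresponds to x == 0,
// so "ZF=1 and CF=0" is the single-bit case.
inline bool single_bit_blsr_style(std::uint32_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

// Population-count formulation, i.e. the  popcount(x) == 1  pattern a
// compiler peephole could recognise.
inline bool single_bit_popcount(std::uint32_t x)
{
  return __builtin_popcount(x) == 1;
}
```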

If we're going to branch on it anyway after inlining, a branchy strategy is
probably good:

singlebit_bmi2_branchy:
   xor    %eax, %eax
   blsr   %edi, %edi        # ZF=1 means we cleared the last bit, or the input was zero
   jc     .Linput_zero      # input was zero, return 0 regardless of ZF
   setz   %al
 .Linput_zero:
   ret


And when we want a boolean in a register, a combination of setz and cmovc can
materialize one.  With clever choice of registers, we can even avoid giving
setcc a false dependency on a register that isn't already part of its dep chain

singlebit_bmi2_cmov:
   blsr    %edi, %eax
   setz    %al             # false dep, but it's ready if FLAGS are ready because we wrote it with BLSR
   cmovc   %edi, %eax      # return 1 only if ZF=1 (setz produces 1) and CF=0 (cmovc doesn't overwrite it with the input 0)
   ret

With xor-zeroing first, we could produce the boolean zero-extended to 32-bit,
instead of here where only the low 8 bits are actually 0 / 1.  (Which is fine
for returning a bool in all the mainstream calling conventions)

(This is the same kind of idea as ARM64 sub/tst / ccmp / cset, where ccmp can
conditionally update flags.)

An evil variation on this uses setnz / dec to invert ZF without affecting CF,
allowing JA:

   blsr   %edi,%eax
   setnz  %al              # AL = !ZF
   dec    %al              # 1-1 -> ZF=1,  0-1 -> ZF=0.  ZF=!ZF without affecting CF
   # seta   %al            # set on CF=0 and ZF=0
   ja     was_single_bit   # only actually useful for branching after inlining

dec/ja can't macro-fuse into a single uop, but on Skylake and later Intel it
doesn't cost any extra partial-FLAGS merging uops, because JA simply has both
parts of FLAGS as separate inputs.  (This is why CMOVA / CMOVBE are still 2
uops on Skylake, unlike all other forms: they need 2 integer inputs and 2 parts
of FLAGS, while others need either CF or SPAZO not both.  Interestingly, Zen1/2
have that effect but not Zen3)

I don't know how AMD handles dec / ja partial-flags shenanigans.  Intel Haswell
would I think have a flags-merging uop; older Intel doesn't support BMI1 so
P6-family is irrelevant.  https://stackoverflow.com/a/49868149/224132

I haven't benchmarked them because they have different use-cases (materializing
a boolean vs. creating a FLAGS condition to branch on, being branchless
itself), so any single benchmark would make one of them look good.  If your
data almost never (or always) has an all-zero input, the JC in the first
version will predict well.  After inlining, if the caller branches on the bool
result, you might want to just branch on both conditions separately.

I don't think this setnz/dec/ja version is ever useful.  Unlike branching
separately on ZF and CF, it's not bad if both 0 and multi-bit inputs are common
while single-bit inputs are rare.  But blsr/setz/cmovc + test/jnz is only 4
uops, same as this on Skylake. (test can macro-fuse with jnz).

The uops are all dependent on each other, so it also has the same latency (to
detect a branch miss) as popcnt / macro-fused cmp/je which is 2 uops.  The only
thing this has going for it is avoiding a port-1-only uop, I think.

It's also possible to blsr / lahf / and  ah, (1<<6) | (1<<0) / cmp  ah, 1<<6   
to directly check that ZF=1 and CF=0.  I doubt that's useful.  Or hmm, can we
branch directly on PF after AND with that 2-bit mask?  CF=1 ZF=0 is impossible,
so the only other odd-parity case is CF=0 ZF=1.  AMD and Intel can macro-fuse
test/jp.

   blsr  %edi, %eax
   lahf
   test  $(1<<6) | (1<<0), %ah    # check ZF and CF.
   jpo   was_single_bit           # ZF != CF means CF=0, ZF=1 because the other way is impossible.

Also possible of course is the straightforward 2x setcc and AND to materialize
a boolean in the bottom byte of EAX.  Good ILP, only 3 cycle latency from input
to result on Intel, but that's the same as setz/cmovc which is fewer uops and
can avoid false dependencie

[Bug middle-end/104763] [12 Regression] Generate wrong assembly code

2022-03-03 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104763

Martin Liška  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #4 from Martin Liška  ---
Likely changed with r9-384-gf1bcb061d172ca7e.

[Bug tree-optimization/104639] [12 Regression] Useless loop not fully optimized anymore

2022-03-03 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104639

Aldy Hernandez  changed:

   What|Removed |Added

 CC||amacleod at redhat dot com

--- Comment #7 from Aldy Hernandez  ---
FWIW, thread2 does thread everything that is threadable.  That is, we thread
5->3->4 from this:

   [local count: 1073741824]:
  # i_1 = PHI 
  if (i_1 == 4)
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 955630224]:
  goto ; [100.00%]

into this:

   [local count: 118111600]:
  if (i_2(D) == 4)
goto ; [97.00%]
  else
goto ; [3.00%]

   [local count: 955630224]:
  # i_5 = PHI <6(2)>

   [local count: 118111600]:
  # i_6 = PHI 
  _3 = i_6 != 0;
  return _3;

There are no more conditionals we can thread in the final form above.

To my untrained eye, it does seem like phiopt should be able to optimize:

   [local count: 118111600]:
  # i_6 = PHI 
  _3 = i_6 != 0;

into:

  _3 = i_2 != 0;

Then BB2 and BB3 can just be cleaned up as dead?

[Bug c++/104669] [11/12 Regression] ICE in is_function_default_version, at attribs.cc:1219

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104669

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
These push_local_extern_decl_alias created aliases are an endless source of
problems :(.
What happens is we create foo as a FUNCTION_DECL local to the function, then
push_local_extern_decl_alias creates another FUNCTION_DECL for it.
Then duplicate_decls is called on the local decl and another new local decl;
during that call, e.g. DECL_ATTRIBUTES and other flags on the local
FUNCTION_DECL are tweaked, but duplicate_decls doesn't update DECL_ATTRIBUTES
etc. on the extern alias.
The only spot in duplicate_decls that deals with this is:
  if (VAR_OR_FUNCTION_DECL_P (newdecl) && DECL_LOCAL_DECL_P (newdecl))
    {
      if (!DECL_LOCAL_DECL_P (olddecl))
        /* This can happen if olddecl was brought in from the
           enclosing namespace via a using-decl.  The new decl is
           then not a block-scope extern at all.  */
        DECL_LOCAL_DECL_P (newdecl) = false;
      else
        {
          retrofit_lang_decl (newdecl);
          DECL_LOCAL_DECL_ALIAS (newdecl) = DECL_LOCAL_DECL_ALIAS (olddecl);
        }
    }
So, how do we synchronize changes from the local FUNCTION_DECL to
DECL_LOCAL_DECL_ALIAS when they are made?

[Bug target/104664] [12 Regression] ICE: in extract_constrain_insn, at recog.cc:2670 (insn does not satisfy its constraints) with -Og -ffinite-math-only

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104664

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 CC||jakub at gcc dot gnu.org
 Status|NEW |RESOLVED

--- Comment #7 from Jakub Jelinek  ---
.

[Bug c++/104770] New: std::any_cast cause SIGBUS instead of throwing std::bad_any_cast

2022-03-03 Thread payload.prom0j at icloud dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104770

Bug ID: 104770
   Summary: std::any_cast cause SIGBUS instead of throwing
std::bad_any_cast
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: payload.prom0j at icloud dot com
  Target Milestone: ---

Created attachment 52555
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52555&action=edit
test code

std::any a1="123.456";

try
{
  std::cout<<"  std::any_cast(a1) =
"<(a1)<(a1) generate exception:
"

[Bug target/104770] std::any_cast cause SIGBUS instead of throwing std::bad_any_cast

2022-03-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104770

Andrew Pinski  changed:

   What|Removed |Added

  Component|c++ |target
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |MOVED
 Target||aarch64-apple-darwin21
   Keywords||EH, wrong-code

--- Comment #1 from Andrew Pinski  ---
Please file it with homebrew as aarch64 darwin support is not upstreamed yet:
https://github.com/Homebrew/homebrew-core/issues

Plus it works on x86_64-linux-gnu and others just fine.

[Bug d/104749] [12 regression] stage1 d21 fails to link on Solaris/x86

2022-03-03 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104749

--- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #4 from ro at CeBiTec dot Uni-Bielefeld.DE  Will try gdc 9.4.0 (which is currently building, switching from gas 2.32
> to 2.38 if that matters) afterwards.

Both gdc 11.1.0 and gdc 9.4.0 work fine.

[Bug d/104749] [12 regression] stage1 d21 fails to link on Solaris/x86

2022-03-03 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104749

--- Comment #6 from Iain Buclaw  ---
Just having a look around, I couldn't see anywhere that `___s` would be created
now.

Then I came across r9-8460, which was fixed for 10.1.0, and backported before
9.4.0 was released.  Linked bug report is pr d/94240.

In that case then, earlier versions of version 9 might not work correctly.

[Bug target/104529] [12 Regression] inefficient codegen around new/delete

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

Jakub Jelinek  changed:

   What|Removed |Added

   Keywords|needs-bisection |
 CC||jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek  ---
The change mentioned in #c3 happened in r12-1529-gd7deee423f993bee8ee44
(but both on aarch64 and x86_64).
I don't see the code mentioned in #c0 on x86_64, I see also loads and stores
like on aarch64.

[Bug target/104770] std::any_cast cause SIGBUS instead of throwing std::bad_any_cast

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104770

--- Comment #2 from Jonathan Wakely  ---
(In reply to Victor Tsang from comment #0)
> --with-bugurl=https://github.com/Homebrew/homebrew-core/issues

As it says there ^^

[Bug target/104529] [12 Regression] inefficient codegen around new/delete

2022-03-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

--- Comment #7 from Tamar Christina  ---
(In reply to Jakub Jelinek from comment #6)
> I don't see the code mentioned in #c0 on x86_64, I see also loads and stores
> like on aarch64.

Yes, that was my mistake, I was accidentally comparing GCC 11 x86_64 with GCC
12 AArch64.  That's how I noticed it was a GCC 12 regression later.  Should
have clarified.

[Bug debug/104771] New: '-fcompare-debug' failure w/ -mno-vsx -O1 -frename-registers

2022-03-03 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104771

Bug ID: 104771
   Summary: '-fcompare-debug' failure w/ -mno-vsx -O1
-frename-registers
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: compare-debug-failure
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---
Target: powerpc-e300c3-linux-gnu

Created attachment 52556
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52556&action=edit
gdk diff

g++ 12.0.1 20220227 snapshot (g:d1574a9b820f17adb9004255e2018967e9be063b) fails
-fcompare-debug check when compiling the following testcase w/ -mno-vsx -O1
-frename-registers:

int n;

void
quux ();

template 
void
bar (long double y)
{
  for (int i = 0; i < n; ++i)
{
  double uninit, a;

  for (int j = 0; j < n; ++j)
{
}

  a = uninit / y + n;
  if (a != 0.0)
quux ();
}
}

void
foo (long double x)
{
  bar (x);
  bar (x);
}

% powerpc-e300c3-linux-gnu-g++-12.0.1 -mno-vsx -O1 -fcompare-debug
-frename-registers -c vrsxlrxu.cpp
powerpc-e300c3-linux-gnu-g++-12.0.1: error: vrsxlrxu.cpp: '-fcompare-debug'
failure

-fno-rename-registers or -mvsx "fixes" it.

gdk diff attached.

[Bug c++/104765] Expression statement with a return in a lambda-parameter-default causes segfault when called in a different function

2022-03-03 Thread aaron at aaronballman dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104765

Aaron Ballman  changed:

   What|Removed |Added

 CC||aaron at aaronballman dot com

--- Comment #3 from Aaron Ballman  ---
Here's another example that causes diagnostics and then ICEs in a different
way: https://godbolt.org/z/nWTGEc1dW

template <typename Callable>
int foo(Callable&& Call) {
  return Call();
}

int main() {
  auto l = [](int a = ({ int x = 12; x; })) {
return a;
  };
  return foo(l);
}


(I was trying to see how well the locally-declared `x` in the lambda's default
argument worked in practice.) Consider me a +1 for prohibiting statement
expressions in lambda default arguments.
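
For comparison, a minimal sketch of how the same default value could be written
without a statement expression, using an immediately-invoked capture-free
lambda.  The names call_with_defaults and default_from_nested_lambda are
made up for illustration; this is not taken from the bug report or from any
proposed patch.

```cpp
#include <utility>

// Hypothetical helper with the same shape as foo() in the example above.
template <typename Callable>
int call_with_defaults(Callable&& call) {
  return std::forward<Callable>(call)();
}

// The default argument is an immediately-invoked, capture-free lambda rather
// than a GNU statement expression, so `x` lives in an ordinary function scope.
inline int default_from_nested_lambda() {
  auto l = [](int a = [] { int x = 12; return x; }()) { return a; };
  return call_with_defaults(l);
}
```

The nested lambda is standard C++, so it does not depend on how the front end
treats ({ }) inside a parameter list.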

[Bug target/104529] [12 Regression] inefficient codegen around new/delete

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

--- Comment #8 from Jakub Jelinek  ---
From what I can see, this setting of TREE_READONLY has been added in
r9-869-g5603790dbf233c31c60 aka PR85873 fix.

[Bug target/104529] [12 Regression] inefficient codegen around new/delete

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

--- Comment #9 from Jakub Jelinek  ---
When I disable the TREE_READONLY (decl) = true; in build_target_expr, on the
array-temp1.C testcase gimple dump changes:
 int f ()
 {
   int D.2491;
-  static const int C.0[10] = {1, 42, 3, 4, 5, 6, 7, 8, 9, 0};
+  const int D.2435[10];
   typedef const int AR[];

-  try
-{
-  D.2491 = C.0[5];
-  return D.2491;
-}
-  finally
-{
-  C.0 = {CLOBBER(eol)};
-}
+  D.2435[0] = 1;
+  D.2435[1] = 42;
+  D.2435[2] = 3;
+  D.2435[3] = 4;
+  D.2435[4] = 5;
+  D.2435[5] = 6;
+  D.2435[6] = 7;
+  D.2435[7] = 8;
+  D.2435[8] = 9;
+  D.2435[9] = 0;
+  D.2491 = D.2435[5];
+  return D.2491;

which seems a quite undesirable change.
The spot that cares about TREE_READONLY is exactly in
gimplify_init_constructor:
/* If a const aggregate variable is being initialized, then it
   should never be a lose to promote the variable to be static.  */
if (valid_const_initializer
&& num_nonzero_elements > 1
&& TREE_READONLY (object)
&& VAR_P (object)
&& !DECL_REGISTER (object)
&& (flag_merge_constants >= 2 || !TREE_ADDRESSABLE (object))
...

So, the #c5 patch looks wrong from this regard too.
Furthermore, we have that notify_temp_creation mode there and I think we really
don't want to clear TREE_READONLY in that case.

So, I think we need something like:
--- gcc/gimplify.cc.jj  2022-03-03 09:13:16.0 +0100
+++ gcc/gimplify.cc 2022-03-03 14:42:00.952959549 +0100
@@ -5120,6 +5120,12 @@ gimplify_init_constructor (tree *expr_p,
  {
if (notify_temp_creation)
  return GS_OK;
+
+   /* The var will be initialized and so appear on lhs of
+  assignment, it can't be TREE_READONLY anymore.  */
+   if (VAR_P (object))
+ TREE_READONLY (object) = 0;
+
is_empty_ctor = true;
break;
  }
@@ -5171,6 +5177,11 @@ gimplify_init_constructor (tree *expr_p,
break;
  }

+   /* The var will be initialized and so appear on lhs of
+  assignment, it can't be TREE_READONLY anymore.  */
+   if (VAR_P (object) && !notify_temp_creation)
+ TREE_READONLY (object) = 0;
+
/* If there are "lots" of initialized elements, even discounting
   those that are not address constants (and thus *must* be
   computed at runtime), then partition the constructor into

or so.
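
To make the promotion discussed above concrete, here is a small sketch of the
kind of source where a const aggregate with a constant initializer can be
promoted to static storage.  It is a hypothetical reduction written for this
note (the name sixth_element is made up), not the actual array-temp1.C
testcase.

```cpp
// Hypothetical reduction; not the actual array-temp1.C source.
// A const local array with a fully constant initializer is the case the
// quoted gimplify_init_constructor comment describes: it is never a loss to
// keep one static copy instead of storing ten elements on every call.
int sixth_element() {
  const int table[10] = {1, 42, 3, 4, 5, 6, 7, 8, 9, 0};
  return table[5];
}
```

Whether a given compiler version actually performs the promotion would need
checking in the gimple dump; the sketch only shows the source shape.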

[Bug tree-optimization/104639] [12 Regression] Useless loop not fully optimized anymore

2022-03-03 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104639

--- Comment #8 from Andrew Macleod  ---
(In reply to Richard Biener from comment #3)
> it's odd that VRP doesn't optimize this though.  VRP2 says
> 
> Exported global range table:
> 
> i_6  : int ~[4, 4]
> bool foo (int i)
> {
>   bool _3;
> 
>[local count: 118111600]:
>   if (i_2(D) == 4)
> goto ; [97.00%]
>   else
> goto ; [3.00%]
>   
>[local count: 955630224]:
>   
>[local count: 118111600]:
>   # i_6 = PHI 
>   _3 = i_6 != 0;
>   return _3;
> 
> but shouldn't ranger figure that i_2(D) != 0 && 6 != 0 is the same as
> i_2(D) != 0?  Alternatively this could be sth for phiopt.  PRE still
> sees the loop (I guess it was previously the one optimizing this)

Ranger itself isn't going to see that.  All we can tell from ranges directly is
that it is ~[4, 4].  You have to look back at the def from the use to come to
this conclusion, so it would be a simplification or PHI-opt thing.

Curious question, if that was an 'if' instead of a return using _3, the
threader would probably thread the PHI away?  ie:
   [local count: 118111600]:
  # i_6 = PHI 
  _3 = i_6 != 0;
  if (_3 != 0)
goto 
  else
goto 

I don't suppose there is any possible future enhancement that would let us
thread into returns like this?  Or maybe it's not common enough?
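
As a rough illustration of what is being asked for, the following sketch pairs
a loop of the shape the dump suggests with the result one would hope to get
after optimization.  The source of foo_loop is inferred from the quoted dump
and is an assumption; foo_folded is a made-up name for the desired form.

```cpp
// Assumed source shape, inferred from the dump: the loop body runs at most
// once, because i becomes 6 as soon as it equals 4.
bool foo_loop(int i) {
  while (i == 4)
    i += 2;
  return i != 0;
}

// The hoped-for result: neither 4 nor 6 is zero, so the answer depends only
// on whether the incoming value is zero.
bool foo_folded(int i) { return i != 0; }
```

Both functions return the same value for every input; the question in the
comment is which pass is best placed to notice that.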

[Bug tree-optimization/104475] [12 Regression] Wstringop-overflow + atomics incorrect warning on dynamic object

2022-03-03 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104475

Aldy Hernandez  changed:

   What|Removed |Added

 CC||amacleod at redhat dot com

--- Comment #3 from Aldy Hernandez  ---
This isn't the threader but VRP/ranger.

What happens is that the threader isolates the path, making it easier for VRP
to see the equivalence, and then CCP4 folds the constant into the problematic
call.  This is from the .ccp4 pass:

Folding statement: __atomic_or_fetch_4 (pretmp_29, 64, 0);
Folded into: __atomic_or_fetch_4 (184B, 64, 0);

In VRP2 the ranger is folding:

Folding statement: pretmp_29 = &MEM[(struct __atomic_base *)_1 + 184B]._M_i;
Folded into: pretmp_29 = 184B;

The ranger is determining that _1 is 0 because it has determined that since _2
is 0 on the 2->3 edge, so is _1, as m_mutex is the first field of _1:

=== BB 2 
Imports: _1  
Exports: _1  _2  
 _2 : _1(I)  
 [local count: 1073741824]:
_1 = this_10(D)->d;
_2 = &_1->m_mutex;
MEM[(struct __as_base  &)&lock] ={v} {CLOBBER};
if (_2 != 0B)
  goto ; [90.00%]
else
  goto ; [10.00%]

2->5  (T) _1 :  struct QFutureInterfaceBasePrivate * [1B, +INF]
2->5  (T) _2 :  struct QMutex * [1B, +INF]
2->3  (F) _1 :  struct QFutureInterfaceBasePrivate * [0B, 0B]
2->3  (F) _2 :  struct QMutex * [0B, 0B]

Andrew, how/where is it that we relate _1 and _2 here?  I can't seem to find it.

My gut feeling is that special casing anything in the ranger for this is wrong.

[Bug tree-optimization/104639] [12 Regression] Useless loop not fully optimized anymore

2022-03-03 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104639

--- Comment #9 from Aldy Hernandez  ---
(In reply to Andrew Macleod from comment #8)
> (In reply to Richard Biener from comment #3)

> Curious question, if that was an 'if' instead of a return using _3, the
> threader would probably thread the PHI away?  ie:
>[local count: 118111600]:
>   # i_6 = PHI 
>   _3 = i_6 != 0;
>   if (_3 != 0)
> goto 
>   else
> goto 
> 
> I don't suppose there is any possible future enhancement that would let us
> thread into returns like this?  Or maybe its not common enough?

Sure, the threader can't thread the current situation because it only does
conditionals not PHIs.  If that were an if, it should be able to thread it.

[Bug analyzer/104680] identical inner condition not detected

2022-03-03 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104680

--- Comment #2 from David Malcolm  ---
> trunk.git/gcc/config/avr/avr.cc:8674:22: warning: Identical inner 'if' 
> condition is always true. [identicalInnerCondition]

In avr_out_fract:

8665   │   /* We need to consider to-be-discarded bits
8666   │  if the value is negative.  */
8667   │   if (sn < s0)
8668   │ {
8669   │   avr_asm_len ("tst %0" CR_TAB
8670   │"brpl 0f",
8671   │&all_regs_rtx[src.regno_msb], plen, 2);
8672   │   /* Test to-be-discarded bytes for any nozero bits.
8673   │  ??? Could use OR or SBIW to test two registers at
once.  */
8674   │   if (sn < s0)
8675   │ avr_asm_len ("cp %0,__zero_reg__", &all_regs_rtx[sn],
plen, 1);
8676   │ 

sn < s0 is checked at line 8667, then again at line 8674.

[Bug analyzer/104680] identical inner condition not detected

2022-03-03 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104680

--- Comment #3 from David Malcolm  ---
> trunk.git/gcc/config/mn10300/mn10300.cc:888:8: warning: Identical inner 'if' 
> condition is always true. [identicalInnerCondition]

In mn10300_expand_prologue:

 877   │   /* Consider alternative save_a0_merge only if we don't need a
 878   │  frame pointer, size is nonzero and the user hasn't
 879   │  changed the calling conventions of a0.  */
 880   │   if (! frame_pointer_needed && size
 881   │   && call_used_regs[FIRST_ADDRESS_REGNUM]
 882   │   && ! fixed_regs[FIRST_ADDRESS_REGNUM])
 883   │ {
 884   │   /* Insn: add -(size + 4 * num_regs_to_save), sp.  */
 885   │   this_strategy_size = SIZE_ADD_SP (-(size + 4 *
num_regs_to_save));
 886   │   /* Insn: mov sp, a0.  */
 887   │   this_strategy_size++;
 888   │   if (size)
 889   │ {
 890   │   /* Insn: add size, a0.  */
 891   │   this_strategy_size += SIZE_ADD_AX (size);
 892   │ }

"size" is checked at line 880, then again at line 888.

[Bug analyzer/104680] identical inner condition not detected

2022-03-03 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104680

--- Comment #4 from David Malcolm  ---
> trunk.git/gcc/d/expr.cc:689:17: warning: Identical inner 'if' condition is 
> always true. [identicalInnerCondition]

In 'void visit (CatExp *e)':
 682   │ if (e->e1->op == EXP::concatenate)
 683   │   {
 684   │ /* Flatten multiple concatenations to an array.
 685   │So the expression ((a ~ b) ~ c) becomes [a, b, c]  */
 686   │ int ndims = 2;
 687   │ 
 688   │ for (Expression *ex = e->e1; ex->op == EXP::concatenate;)
 689   │   {
 690   │ if (ex->op == EXP::concatenate)
 691   │   {
 692   │ ex = ex->isCatExp ()->e1;
 693   │ ndims++;
 694   │   }
 695   │   }

Looks like the ex->op == EXP::concatenate in line 690 is indeed checked by the
loop guard at line 688, so this code does look suspicious to me.

[Bug analyzer/104680] identical inner condition not detected

2022-03-03 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104680

--- Comment #5 from David Malcolm  ---
> trunk.git/libffi/src/m32r/ffi.c:66:15: warning: Identical inner 'if' 
> condition is always true. [identicalInnerCondition]

In ffi_prep_args:

  56   │   for (i = ecif->cif->nargs, p_arg = ecif->cif->arg_types;
  57   │(i != 0) && (avn != 0);
  58   │i--, p_arg++)
  59   │ {
  60   │   size_t z;
  61   │ 
  62   │   /* Align if necessary.  */
  63   │   if (((*p_arg)->alignment - 1) & (unsigned) argp)
  64   │ argp = (char *) FFI_ALIGN (argp, (*p_arg)->alignment);
  65   │ 
  66   │   if (avn != 0) 
  67   │ {

avn != 0 is checked both at line 66 and in the loop guard at line 57.

[Bug analyzer/104680] identical inner condition not detected

2022-03-03 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104680

--- Comment #6 from David Malcolm  ---
> trunk.git/liboffloadmic/runtime/offload_engine.cpp:113:13: warning: Identical 
> inner 'if' condition is always true. [identicalInnerCondition]

 108   │ void Engine::init(void)
 109   │ {
 110   │ if (!m_ready) {
 111   │ mutex_locker_t locker(m_lock);
 112   │ 
 113   │ if (!m_ready) {

Possible false positive? (multithreaded code?)
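
For context, the shape above looks like double-checked locking, where the
second test is deliberate.  Below is a minimal sketch of that idiom with a
std::atomic flag; the class and member names are invented here and this is not
the liboffloadmic code, which may use different types.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for an engine with one-time initialization.
class EngineSketch {
 public:
  void init() {
    // First check, without the lock: cheap fast path once initialized.
    if (!ready_.load(std::memory_order_acquire)) {
      std::lock_guard<std::mutex> locker(lock_);
      // Second check, under the lock: another thread may have finished
      // initialization between the first check and acquiring the mutex.
      if (!ready_.load(std::memory_order_relaxed)) {
        ++init_count_;  // stands in for the real initialization work
        ready_.store(true, std::memory_order_release);
      }
    }
  }

  int init_count() const { return init_count_; }

 private:
  std::atomic<bool> ready_{false};
  std::mutex lock_;
  int init_count_ = 0;  // only written while holding lock_
};
```

In single-threaded terms the inner test is redundant, which is presumably why
a checker flags it; with several threads it is what keeps the initialization
from running twice.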

[Bug analyzer/104680] identical inner condition not detected

2022-03-03 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104680

--- Comment #7 from David Malcolm  ---
> trunk.git/zlib/contrib/minizip/zip.c:1212:26: warning: Identical inner 'if' 
> condition is always true. [identicalInnerCondition]

In zipOpenNewFileInZip4_64:

1206   │ #ifdef HAVE_BZIP2
1207   │ if ((err==ZIP_OK) && (zi->ci.method == Z_DEFLATED || zi->ci.method
== Z_BZIP2ED) && (!zi->ci.raw))
1208   │ #else
1209   │ if ((err==ZIP_OK) && (zi->ci.method == Z_DEFLATED) &&
(!zi->ci.raw))
1210   │ #endif
1211   │ {
1212   │ if(zi->ci.method == Z_DEFLATED)
1213   │ {

Arguably a false positive: it's going to be a repeated test if the preprocessor
uses line 1209, but not if it uses line 1207.

[Bug middle-end/104529] [12 Regression] inefficient codegen around new/delete

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #10 from Jakub Jelinek  ---
Created attachment 52557
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52557&action=edit
gcc12-pr104529.patch

Untested fix.

[Bug c/104680] identical inner condition not detected

2022-03-03 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104680

David Malcolm  changed:

   What|Removed |Added

  Component|analyzer|c
   Assignee|dmalcolm at gcc dot gnu.org|unassigned at gcc dot 
gnu.org

--- Comment #8 from David Malcolm  ---
There are two aspects to this bug:

(a) the code flagged by cppcheck's "identicalInnerCondition" warning when
cppcheck is run on GCC itself.  I've posted comments above giving some details
on each of these.

(b) the fact that GCC's -Wduplicated-branches doesn't flag these issues, when
arguably it should (though note the possible false positives above).


Re (a), should these be opened as bugs against individual subsystems?

Re (b) (and in reply to Andrew Pinski from comment #1):
> Wduplicated-branches only currently works for the a && a case and not the if
> (a) if (a) case.
> 
> Maybe this could be done by the analyzer instead.

...I don't think this is a good fit for the analyzer; it seems much more
appropriate for the frontends to me; reassigning component to "c".

Hope this is constructive.
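
A stripped-down sketch of the pattern under discussion, with made-up function
names, in case it helps whoever picks this up.  It only illustrates the
"if (a) if (a)" shape; it is not a proposed testcase.

```cpp
// Flagged shape: the inner test repeats the outer one, and nothing in between
// can change `a`, so the inner test is always true.
int nested_identical(bool a, int value) {
  if (a) {
    if (a)
      return value;
  }
  return 0;
}

// Equivalent code once the redundant inner test is dropped.
int nested_simplified(bool a, int value) { return a ? value : 0; }
```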

[Bug tree-optimization/104639] [12 Regression] Useless loop not fully optimized anymore

2022-03-03 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104639

--- Comment #10 from Andrew Macleod  ---
I was thinking less about phis and more that it's a "return" instead of an "if"
ending the block preventing the threader from doing anything.

  _3 = i_6 != 0;
  return _3;

I.e., if a return uses a condition, we could act like it's a branch: if that
were viewed as
  if (_3)
return 1
  else
return 0
the threader would do what we are looking for?

I don't know if that would be easy or hard or even worthwhile, but I seem to
vaguely recall another situation a few months ago that was similar... with a
block that ended in a return not triggering threads thru a PHI in the block
that could have been helpful.

[Bug target/104758] [nvptx] sm_30 board support broken

2022-03-03 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758

Tom de Vries  changed:

   What|Removed |Added

 Resolution|FIXED   |---
   Last reconfirmed||2022-03-03
 Status|RESOLVED|REOPENED
 Ever confirmed|0   |1

--- Comment #5 from Tom de Vries  ---
The sm_30 board arrived and I'm running tests, and things break because we
still build, for instance, libgcc with sm_35.  Same for newlib.  Only libgomp
is built with sm_30.

[Bug c/104680] identical inner condition not detected

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104680

--- Comment #9 from Jakub Jelinek  ---
At least right now the analyzer might be too late for this though, no?  There
could be various optimizations before it that make it hard to figure out what
was actually user code and what is something else.
On the other hand, doing this in the FEs would handle only the most trivial
cases, any time there would be some function call or inline asm or whatever
else could have possibly changed any values in the expressions, we'd need to
punt.
So, at least having a SSA form so that we can find out what is the same value
and what is (possibly) different would be helpful, likely with at least a
forwprop.
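
A small sketch of the case where a purely syntactic check would have to punt:
the two conditions are textually identical, but a call in between changes the
value.  All names here are hypothetical.

```cpp
static int g_ready = 1;

// Hypothetical side effect between the two tests.
static void consume() { g_ready = 0; }

// Counts how many of the two textually identical tests are true.
int count_true_checks() {
  g_ready = 1;
  int hits = 0;
  if (g_ready) {
    ++hits;
    consume();     // may change g_ready, so the inner test is not redundant
    if (g_ready)
      ++hits;
  }
  return hits;
}
```

Here the inner test is false even though it looks the same as the outer one,
which is the kind of thing value numbering on SSA form can tell apart.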

[Bug middle-end/104757] [12 Regression][OpenMP] ICE in GIMPLE pass: walloca - in gimple_range_global / segfault as SSA_NAME_DEF_STMT is NULL for 'if' clause arg

2022-03-03 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104757

Thomas Schwinge  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Thomas Schwinge  ---
(In reply to Jakub Jelinek from comment #12)
> not closing this just yet until confirmed it works for offloading compiler 
> too.

ACK, thanks.  New test cases PASS, and additionally the expected:

[-FAIL: gfortran.dg/gomp/clauses-1.f90   -O  (internal compiler error:
Segmentation fault)-]
[-FAIL:-]{+PASS:+} gfortran.dg/gomp/clauses-1.f90   -O  (test for excess
errors)

[Bug c++/104765] Expression statement with a return in a lambda-parameter-default causes segfault when called in a different function

2022-03-03 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104765

--- Comment #4 from Marek Polacek  ---
The attached patch disables ({}) in lambda param list, which fixes the bug, but
also makes things less consistent:

void G() {
  void fn (int i, int = ({ 1; }));  // currently OK
}

void g() {
  auto a = [](){}; // currently OK
  a();
}

void g2() {
  auto a = [](int = ({ 1; })){};  // error with the patch
  a();
}

My preference: prohibit any use of ({}) in default arguments.

--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -5605,11 +5605,15 @@ cp_parser_primary_expression (cp_parser *parser,

 at class or namespace scope.  */
  if (!parser->in_function_body
- || parser->in_template_argument_list_p)
+ || parser->in_template_argument_list_p
+ || (parsing_function_declarator ()
+ && current_class_type
+ && LAMBDA_TYPE_P (current_class_type)))
{
  error_at (token->location,
"statement-expressions are not allowed outside "
-   "functions nor in template-argument lists");
+   "functions nor in template-argument lists or in "
+   "lambda parameter lists");
  cp_parser_skip_to_end_of_block_or_statement (parser);
  if (cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_PAREN))
cp_lexer_consume_token (parser->lexer);

[Bug tree-optimization/104639] [12 Regression] Useless loop not fully optimized anymore

2022-03-03 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104639

--- Comment #11 from Aldy Hernandez  ---
(In reply to Andrew Macleod from comment #10)
> I was thinking less about phis and more that its a "return" instead of an
> "if" ending the block preventing the threader from doing anything.
> 
>   _3 = i_6 != 0;
>   return _3;
> 
> Ie, so if a return uses a condition, we could act like its a branch.. if
> that were viewed as
>   if (_3)
> return 1
>   else
> return 0
> the threader would do what we are looking for?
> 
> I don't know if that would be easy or hard or even worthwhile, but I seem to
> vaguely recall another situation a few months ago that was similar... with a
> block that ended in a return not triggering threads thru a PHI in the block
> that could have been helpful.

I don't know.  I thought returns were special.  Can there be more than one per
function?

[Bug target/104643] gcc/config/rs6000/driver-rs6000.cc: 2 * pointless call ?

2022-03-03 Thread willschm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104643

Will Schmidt  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org,
   ||willschm at gcc dot gnu.org

--- Comment #2 from Will Schmidt  ---
(In reply to David Binderman from comment #0)
> Static analyser cppcheck says:
> 
> 1.
> 
> gcc/config/rs6000/driver-rs6000.cc:578:13: style: Variable 'cache' is
> reassigned a value before the old one has been used. [redundantAssignment]
> 
> Source code is
> 
>   cache = detect_caches_freebsd ();
>   /* FreeBSD PPC does not provide any cache information yet.  */
>   cache = "";
> 
> The function call looks pointless.
> 
> 2.
> 
> gcc/config/rs6000/driver-rs6000.cc:582:13: style: Variable 'cache' is
> reassigned a value before the old one has been used. [redundantAssignment]

There is a similar pattern for the __linux__ if/else path. 

#elif defined (__FreeBSD__)
  cache = detect_caches_freebsd ();
  /* FreeBSD PPC does not provide any cache information yet.  */
  cache = "";
#elif defined (__linux__)
  cache = detect_caches_linux ();
  /* PPC Linux does not provide any cache information yet.  */
  cache = "";
#else

It looks like the __linux__ reassignment has been there for quite a while as
well.
When not overridden, the detect_caches_foo functions call describe_cache, which
builds a string à la
  sprintf (l1size, "--param l1-cache-size=%u", l1_sizekb);
  sprintf (line, "--param l1-cache-line-size=%u", l1_line);
  sprintf (l2size, "--param l2-cache-size=%u", l2_sizekb);
  return concat (l1size, " ", line, " ", l2size, " ", NULL);

We've obviously not noticed that the param values are no longer being set, for
quite a while.  Is there value in re-enabling this, or could this simply be
removed?  The other logic in detect_caches_linux() does set values for
l1_sizekb and friends, based on the detected platform string, which has a
special case for power6* or not.  Possibly would need some touch-ups for
processors later than power6.
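
To show the effect of the reassignment, here is a rough C++ sketch of the
string building and of the overwrite.  It uses std::string instead of the
sprintf/concat calls quoted above, and both function names are invented for
this note; it is not the driver-rs6000.cc code.

```cpp
#include <string>

// Hypothetical equivalent of the quoted describe_cache logic, including the
// trailing space that the concat call appends.
std::string describe_cache_sketch(unsigned l1_sizekb, unsigned l1_line,
                                  unsigned l2_sizekb) {
  std::string l1size = "--param l1-cache-size=" + std::to_string(l1_sizekb);
  std::string line = "--param l1-cache-line-size=" + std::to_string(l1_line);
  std::string l2size = "--param l2-cache-size=" + std::to_string(l2_sizekb);
  return l1size + " " + line + " " + l2size + " ";
}

// Mirrors the flagged pattern: the detected string is computed and then
// immediately replaced by "", so no --param options are ever passed on.
std::string detect_caches_overridden(unsigned l1_sizekb, unsigned l1_line,
                                     unsigned l2_sizekb) {
  std::string cache = describe_cache_sketch(l1_sizekb, l1_line, l2_sizekb);
  cache = "";
  return cache;
}
```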

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #8 from Jonathan Wakely  ---
It looks like trunk now defines __SIZEOF_FLOAT128__ on powerpc-ibm-aix* and
powerpc64*-*-linux-gnu, but it seems to be defined unconditionally, even if the
__float128 type *isn't* available!


On power8-aix (e.g. gcc119 in the cfarm):

./gcc/cc1plus -E -dM /dev/null -quiet -maix64  | fgrep FLOAT128
#define __SIZEOF_FLOAT128__ 16
./gcc/cc1plus - -quiet   <<<  '__float128 f;'
:1:1: error: '__float128' does not name a type; did you mean '__int128'?

You need to add -mfloat128 -mvsx to be able to use the type.

On power8 ppc64le linux (e.g. gcc112 in the cfarm):

./gcc/cc1plus -E -dM /dev/null -quiet   | fgrep FLOAT128
#define __FLOAT128__ 1
#define __SIZEOF_FLOAT128__ 16
#define __FLOAT128_TYPE__ 1

That's fine, the type is usable.

But for power7 ppc64 (BE) linux (e.g. gcc110 in the cfarm):

./gcc/cc1plus -E -dM /dev/null -quiet   | fgrep FLOAT128
#define __SIZEOF_FLOAT128__ 16



That means __SIZEOF_FLOAT128__ still can't be used to detect whether a type
called "__float128" is supported for the current target.

On power it looks like we can use __FLOAT128__ for that purpose, but other
targets such as x86 don't define that one, only __SIZEOF_FLOAT128__. So you
have to use a different macro to detect __float128 depending on the target.
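
A sketch of what that per-target detection might look like in user code today.
The macro SKETCH_HAVE_FLOAT128 and the alias widest_float_sketch are invented
here, and using _ARCH_PPC / __powerpc__ to recognize Power targets is an
assumption that would need checking against each configuration.

```cpp
// Assumption: _ARCH_PPC or __powerpc__ identifies Power targets, where
// __FLOAT128__ (not __SIZEOF_FLOAT128__) says the type is usable.
#if defined(_ARCH_PPC) || defined(__powerpc__)
#  if defined(__FLOAT128__)
#    define SKETCH_HAVE_FLOAT128 1
#  endif
#elif defined(__SIZEOF_FLOAT128__)
// Other targets such as x86 only provide __SIZEOF_FLOAT128__.
#  define SKETCH_HAVE_FLOAT128 1
#endif

#ifdef SKETCH_HAVE_FLOAT128
using widest_float_sketch = __float128;
#else
using widest_float_sketch = long double;  // fallback when __float128 is absent
#endif
```

Having to special-case the target like this is exactly the inconvenience
described above.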

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #9 from Jonathan Wakely  ---
It looks like r12-7271-g687e57d7ac741d added it:

--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -623,7 +623,11 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
   if (TARGET_FRSQRTES)
 builtin_define ("__RSQRTEF__");
   if (TARGET_FLOAT128_TYPE)
-builtin_define ("__FLOAT128_TYPE__");
+  builtin_define ("__FLOAT128_TYPE__");
+  if (ibm128_float_type_node)
+builtin_define ("__SIZEOF_IBM128__=16");
+  if (ieee128_float_type_node)
+builtin_define ("__SIZEOF_FLOAT128__=16");
 #ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
   builtin_define ("__BUILTIN_CPU_SUPPORTS__");
 #endif

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #10 from Jonathan Wakely  ---
Maybe we could do this instead:

--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -623,11 +623,13 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
   if (TARGET_FRSQRTES)
 builtin_define ("__RSQRTEF__");
   if (TARGET_FLOAT128_TYPE)
+{
   builtin_define ("__FLOAT128_TYPE__");
+  if (ieee128_float_type_node)
+   builtin_define ("__SIZEOF_FLOAT128__=16");
+}
   if (ibm128_float_type_node)
 builtin_define ("__SIZEOF_IBM128__=16");
-  if (ieee128_float_type_node)
-builtin_define ("__SIZEOF_FLOAT128__=16");
 #ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
   builtin_define ("__BUILTIN_CPU_SUPPORTS__");
 #endif


It would be nice to add a test to the testsuite along the lines of:

/* { dg-do compile } */
#ifdef __SIZEOF_FLOAT128__
__float128 f;
#endif

[Bug libstdc++/104772] New: std::numeric_limits<__float128> should be specialized

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104772

Bug ID: 104772
   Summary: std::numeric_limits<__float128> should be specialized
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
Depends on: 99708, 98202
  Target Milestone: ---

We should provide a specialization of numeric_limits<__float128> when that type
is valid (even if is_floating_point_v<__float128> is false due to
__STRICT_ANSI__ being defined).

It might also be useful to do the same for __float80 on x86, although that's
less widely used. That could be done fairly easily:

#ifdef __SIZEOF_FLOAT80__
template<>
struct numeric_limits<__float80> : numeric_limits<long double> { };
#endif

For __float128 we need to define it fully. The compiler gives us the info we
need:

#define __FLT128_DECIMAL_DIG__ 36
#define __FLT128_DENORM_MIN__ 6.47517511943802511092443895822764655e-4966F128
#define __FLT128_DIG__ 33
#define __FLT128_EPSILON__ 1.92592994438723585305597794258492732e-34F128
#define __FLT128_HAS_DENORM__ 1
#define __FLT128_HAS_INFINITY__ 1
#define __FLT128_HAS_QUIET_NAN__ 1
#define __FLT128_IS_IEC_60559__ 2
#define __FLT128_MANT_DIG__ 113
#define __FLT128_MAX_10_EXP__ 4932
#define __FLT128_MAX_EXP__ 16384
#define __FLT128_MAX__ 1.18973149535723176508575932662800702e+4932F128
#define __FLT128_MIN_10_EXP__ (-4931)
#define __FLT128_MIN_EXP__ (-16381)
#define __FLT128_MIN__ 3.36210314311209350626267781732175260e-4932F128
#define __FLT128_NORM_MAX__ 1.18973149535723176508575932662800702e+4932F128

The F128 suffixes are a problem though, see PR 98202 comment 7.

Detecting __float128 reliably is blocked by PR 99708 comment 8.
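
As a rough sketch of the integer-valued part only (which does not run into the
F128 suffix problem), something like the following could be derived from the
macros listed above.  The struct name float128_limits_sketch is made up, this
is not a std::numeric_limits specialization, and it assumes __float128 has the
IEEE binary128 properties those macros describe; the #else branch simply
repeats the values from the list.

```cpp
// Hypothetical traits struct; not a std::numeric_limits specialization.
struct float128_limits_sketch {
#ifdef __FLT128_MANT_DIG__
  static constexpr int digits = __FLT128_MANT_DIG__;
  static constexpr int digits10 = __FLT128_DIG__;
  static constexpr int max_digits10 = __FLT128_DECIMAL_DIG__;
  static constexpr int min_exponent = __FLT128_MIN_EXP__;
  static constexpr int min_exponent10 = __FLT128_MIN_10_EXP__;
  static constexpr int max_exponent = __FLT128_MAX_EXP__;
  static constexpr int max_exponent10 = __FLT128_MAX_10_EXP__;
#else
  // Fallback: the same values as in the macro list above.
  static constexpr int digits = 113;
  static constexpr int digits10 = 33;
  static constexpr int max_digits10 = 36;
  static constexpr int min_exponent = -16381;
  static constexpr int min_exponent10 = -4931;
  static constexpr int max_exponent = 16384;
  static constexpr int max_exponent10 = 4932;
#endif
};
```

The floating-point members (min, max, epsilon, denorm_min) are left out on
purpose, since spelling their values is where the suffix issue comes in.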


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98202
[Bug 98202] C++ cannot parse F128 suffix for float128 literals
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708
[Bug 99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

[Bug target/104773] New: compare with 1 not merged with subtract 1

2022-03-03 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104773

Bug ID: 104773
   Summary: compare with 1 not merged with subtract 1
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: peter at cordes dot ca
  Target Milestone: ---
Target: x86_64-*-*, i?86-*-*, arm-*-*

std::bit_ceil(x) involves if(x == 0 || x == 1) return 1;
and 1u << (32-clz(x-1)).

The compare of course compiles to an unsigned <= 1, which can be done with a
sub instead of cmp, producing the value we need as an input for the
leading-zero count.  But GCC does *not* do this.  (Neither does clang for
x86-64).  I trimmed down the libstdc++ <bit> code into something I could
compile even when Godbolt doesn't have working headers for some ISAs:
https://godbolt.org/z/3EE7W5bna

// cut down from libstdc++ for normal integer cases; compiles the same
  template<typename _Tp>
constexpr _Tp
bit_ceil(_Tp __x) noexcept
{
  constexpr auto _Nd = std::numeric_limits<_Tp>::digits;
  if (__x == 0 || __x == 1)
return 1;
  auto __shift_exponent = _Nd - __builtin_clz((_Tp)(__x - 1u));
  // using __promoted_type = decltype(__x << 1); ... // removed check for
x<
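
For readers who want something self-contained, here is a simplified 32-bit
sketch of the same computation, plus a variant that computes x - 1 once and
derives the "x <= 1" test from it, which is the sharing the report asks the
compiler to find.  Both function names are invented; __builtin_clz is the
GCC/Clang builtin used in the report.  Inputs above 2^31 are not handled,
since the shift count would reach 32.

```cpp
#include <cstdint>

// Straightforward form: a compare against 1, then a separate subtract.
// Precondition (assumed): x <= 0x80000000.
constexpr std::uint32_t bit_ceil32_sketch(std::uint32_t x) {
  return x <= 1u ? 1u : 1u << (32 - __builtin_clz(x - 1u));
}

// Same result, but x - 1 is computed once.  For unsigned x, "x <= 1" holds
// exactly when d == 0 (x was 1) or the subtraction wrapped (x was 0, d > x).
inline std::uint32_t bit_ceil32_shared_sub(std::uint32_t x) {
  const std::uint32_t d = x - 1u;
  if (d == 0u || d > x)
    return 1u;
  return 1u << (32 - __builtin_clz(d));
}
```

On x86 the second form corresponds to using the flags from a sub instead of a
separate cmp; whether the compiler emits that is what the report is about.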

[Bug tree-optimization/104475] [12 Regression] Wstringop-overflow + atomics incorrect warning on dynamic object

2022-03-03 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104475

--- Comment #4 from Andrew Macleod  ---
(In reply to Aldy Hernandez from comment #3)
> This isn't the threader but VRP/ranger.
> 
> What happens is that the threader isolates the path, making it easier for
> VRP to see the equivalence, and then CCP4 folds the constant into the
> problematic call.  This is from the .ccp4 pass:
> 
> Folding statement: __atomic_or_fetch_4 (pretmp_29, 64, 0);
> Folded into: __atomic_or_fetch_4 (184B, 64, 0);
> 
> In VRP2 the ranger is folding:
> 
> Folding statement: pretmp_29 = &MEM[(struct __atomic_base *)_1 + 184B]._M_i;
> Folded into: pretmp_29 = 184B;
> 
> The ranger is determining that _1 is 0 because it has determined that since
> _2 is 0 on the 2->3 edge, so is _1, as m_mutex is the first field of _1:
> 
> === BB 2 
> Imports: _1  
> Exports: _1  _2  
>  _2 : _1(I)  
>  [local count: 1073741824]:
> _1 = this_10(D)->d;
> _2 = &_1->m_mutex;
> MEM[(struct __as_base  &)&lock] ={v} {CLOBBER};
> if (_2 != 0B)
>   goto ; [90.00%]
> else
>   goto ; [10.00%]
> 
> 2->5  (T) _1 :struct QFutureInterfaceBasePrivate * [1B, +INF]
> 2->5  (T) _2 :struct QMutex * [1B, +INF]
> 2->3  (F) _1 :struct QFutureInterfaceBasePrivate * [0B, 0B]
> 2->3  (F) _2 :struct QMutex * [0B, 0B]
> 
> Andrew, how/where is that we relate _1 and _2 here?  I can't seem to find it.
> 
> My gut feeling is that special casing anything in the ranger for this is
> wrong.


It's via op1_range for OP_ADDR:  
--param=ranger-debug=tracegori shows:

2120GORI  outgoing_edge for _1 on edge 2->3
2121GORIcompute op 1 (_2) at if (_2 != 0B)
GORI  LHS =bool [0, 0]
GORI  Computes _2 = struct QMutex * [0B, 0B] intersect Known range : struct QMutex * VARYING
GORITRUE : (2121) produces  (_2) struct QMutex * [0B, 0B]
2122GORIcompute op 1 (_1) at _2 = &_1->m_mutex;
GORI  LHS =struct QMutex * [0B, 0B]
GORI  Computes _1 = struct QFutureInterfaceBasePrivate * [0B, 0B] intersect Known range : struct QFutureInterfaceBasePrivate * VARYING
GORITRUE : (2122) produces  (_1) struct QFutureInterfaceBasePrivate * [0B, 0B]
GORI  TRUE : (2120) outgoing_edge (_1) struct QFutureInterfaceBasePrivate * [0B, 0B]

so with _2 == 0, the 2122 trace element is solving for _1 in
_2 = &_1->m_mutex
[0,0] = &_1->m_mutex

Is it possible for _1 to be anything other than 0 in this case?  If so, we need
to adjust range-ops.
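
To make the question concrete, here is a small C sketch of the back-calculation
being described for `_2 = &_1->m_mutex`. It is only an illustration, not
range-ops code: the struct layouts and both helper functions are hypothetical,
and addresses are modelled as plain integers. Purely arithmetically, a member at
offset 0 with address 0 forces a base of 0, while a member at a non-zero offset
(such as the `+ 184B` access in the VRP2 dump) does not.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the types seen in the dump.  */
struct mutex_like { int word; };
struct priv_like
{
  struct mutex_like m_mutex;   /* first field, offset 0 */
  char pad[180];
  int m_i;                     /* a later field, non-zero offset */
};

/* Given the integer value of &base->member and the member's offset,
   recover the integer value of base.  */
uintptr_t
base_from_member_addr (uintptr_t member_addr, size_t offset)
{
  return member_addr - (uintptr_t) offset;
}

/* Non-zero iff "member address is 0" implies "base is 0".  */
int
null_member_implies_null_base (size_t offset)
{
  return base_from_member_addr (0, offset) == 0;
}
```

Whether the language rules allow a non-null `_1` here is a separate question
that this arithmetic does not settle.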

[Bug libstdc++/104772] std::numeric_limits<__float128> should be specialized

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104772

--- Comment #1 from Jonathan Wakely  ---
Strictly speaking, the __FLT128_* macros relate to _Float128 which is not
defined for C++ even when __float128 is (and in theory a target could have a
non-IEEE __float128 which would have different properties from _Float128).

[Bug libstdc++/104772] std::numeric_limits<__float128> should be specialized

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104772

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed||2022-03-03
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from Jonathan Wakely  ---
Oh, and likewise for __gnu_cxx::__numeric_traits_floating

[Bug middle-end/104774] New: OpenACC 'kernels' decomposition: internal compiler error: 'verify_gimple' failed, with 'loop' with explicit 'seq' or 'independent'

2022-03-03 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104774

Bug ID: 104774
   Summary: OpenACC 'kernels' decomposition: internal compiler
error: 'verify_gimple' failed, with 'loop' with
explicit 'seq' or 'independent'
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code, openacc
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: tschwinge at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

Another one in addition to PR104132, PR104133, that is *not* fixed by my WIP
patches for those:

int arr_0;

void
foo (void)
{
#pragma acc kernels
  {
int k;

#pragma acc loop seq
for (k = 0; k < 2; k++)
  arr_0 = k;

#pragma acc loop independent reduction(+: arr_0)
for (k = 0; k < 2; k++)
  arr_0 += k;
  }
}

With '-fopenacc --param openacc-kernels=decompose -O0 -g0' (so, not involving
'GIMPLE_DEBUG's), for both C and C++, we run into:

pr.c: In function ‘foo._omp_fn.0’:
pr.c:18:1: error: non-register as LHS of binary operation
   18 | }
  | ^
# .MEM_21 = VDEF <.MEM_3>
k = 0 + .offset.24_2;
pr.c:18:1: error: invalid RHS for gimple memory store: ‘var_decl’
*_23;

k

# .MEM_24 = VDEF <.MEM_21>
*_23 = k;
pr.c:18:1: error: non-register as LHS of binary operation
# .MEM_27 = VDEF <.MEM_4>
k = 0 + 2;
during GIMPLE pass: ssa
pr.c:18:1: internal compiler error: verify_gimple failed

Only with both 'loop's changed to implicit or explicit 'auto' (that is, both
'seq' and 'independent' removed) does compilation succeed (with my PR104132,
PR104133 WIP patches applied).

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

Segher Boessenkool  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2022-03-03
 Status|UNCONFIRMED |NEW

--- Comment #11 from Segher Boessenkool  ---
(In reply to Jonathan Wakely from comment #8)
> But for power7 ppc64 (BE) linux (e.g. gcc110 in the cfarm):
> 
> ./gcc/cc1plus -E -dM /dev/null -quiet   | fgrep FLOAT128
> #define __SIZEOF_FLOAT128__ 16

It isn't clear what cpu your compiler defaults to, making this not such a great
example.  -mcpu=power7 *does* support __float128, after all.  But indeed
-mcpu=970 defines __SIZEOF_FLOAT128__ as well, which is wrong.

Confirmed.

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #12 from Segher Boessenkool  ---
(In reply to Jonathan Wakely from comment #10)
> Maybe we could do this instead:
> 
> --- a/gcc/config/rs6000/rs6000-c.cc
> +++ b/gcc/config/rs6000/rs6000-c.cc
> @@ -623,11 +623,13 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
>if (TARGET_FRSQRTES)
>  builtin_define ("__RSQRTEF__");
>if (TARGET_FLOAT128_TYPE)
> +{
>builtin_define ("__FLOAT128_TYPE__");
> +  if (ieee128_float_type_node)
> +   builtin_define ("__SIZEOF_FLOAT128__=16");
> +}
>if (ibm128_float_type_node)
>  builtin_define ("__SIZEOF_IBM128__=16");
> -  if (ieee128_float_type_node)
> -builtin_define ("__SIZEOF_FLOAT128__=16");
>  #ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
>builtin_define ("__BUILTIN_CPU_SUPPORTS__");
>  #endif

But why is ieee128_float_type_node not zero then?  That doesn't make
much sense, so this would just be hiding problems.

> It would be nice to add a test to the testsuite along the lines of:
> 
> /* { dg-do compile } */
> #ifdef __SIZEOF_FLOAT128__
> __float128 f;
> #endif

Sure, if we want this to be a generic macro, that is an excellent idea.  Also
for the other types?
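
For context, this is roughly how user code tends to rely on the macro, which is
why defining it without a usable `__float128` is harmful. The sketch below is an
assumption about typical usage, not code from the bug; `widest_float`,
`HAVE_FLOAT128` and `widest_float_size` are hypothetical names, and the
`long double` fallback is just one possible choice.

```c
#include <stddef.h>

#ifdef __SIZEOF_FLOAT128__
/* The macro is taken as a promise that the type exists.  */
typedef __float128 widest_float;
#define HAVE_FLOAT128 1
#else
/* Fallback chosen for this sketch only.  */
typedef long double widest_float;
#define HAVE_FLOAT128 0
#endif

/* Size in bytes of whichever type was selected above.  */
size_t
widest_float_size (void)
{
  return sizeof (widest_float);
}
```

If the macro is defined on a configuration where `__float128` is not accepted,
the first branch fails to compile, which is what the proposed testsuite check
would catch.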

[Bug c++/104642] Add __builtin_trap() for missing return at -O0

2022-03-03 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104642

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-03-03

[Bug target/104643] gcc/config/rs6000/driver-rs6000.cc: 2 * pointless call ?

2022-03-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104643

--- Comment #3 from Segher Boessenkool  ---
Note that the called function is not pure (it writes to some global vars), so
perhaps this was on purpose even?  Andreas?

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #13 from Jakub Jelinek  ---
I see
  if (TARGET_FLOAT128_TYPE)
{
  if (!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
ibm128_float_type_node = long_double_type_node;
  else
{
  ibm128_float_type_node = make_node (REAL_TYPE);
  TYPE_PRECISION (ibm128_float_type_node) = 128;
  SET_TYPE_MODE (ibm128_float_type_node, IFmode);
  layout_type (ibm128_float_type_node);
}
  t = build_qualified_type (ibm128_float_type_node, TYPE_QUAL_CONST);
  ptr_ibm128_float_type_node = build_pointer_type (t);
  lang_hooks.types.register_builtin_type (ibm128_float_type_node,
  "__ibm128");

  if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
ieee128_float_type_node = long_double_type_node;
  else
ieee128_float_type_node = float128_type_node;
  t = build_qualified_type (ieee128_float_type_node, TYPE_QUAL_CONST);
  ptr_ieee128_float_type_node = build_pointer_type (t);
  lang_hooks.types.register_builtin_type (ieee128_float_type_node,
  "__ieee128");
}

  else
ieee128_float_type_node = ibm128_float_type_node = long_double_type_node;

Doesn't this mean that ieee128_float_type_node and ibm128_float_type_node are
always non-NULL?
So, maybe we shouldn't test whether those are non-NULL, but whether the name
of say ieee128_float_type_node is __ieee128 and similarly if
ibm128_float_type_node's name is __ibm128?
Though, __SIZEOF_FLOAT128__ macro talks about __float128 which is on ppc64 a
macro, so
probably it needs to be that plus whether __float128 is defined to __ieee128.

[Bug c++/104642] Add __builtin_trap() for missing return at -O0

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104642

--- Comment #3 from Jonathan Wakely  ---
(In reply to Richard Biener from comment #1)
> Not sure, people will still see the surprising behavior at -O1+ then

Sure, but we can justify that as optimizing away the checks. But giving a
predictable crash at -O0 and confusing nonsense at -O1+ is easier to debug than
confusing nonsense at all levels.


> I'd rather have -funreachable-traps or so, enabled by default at -O0, rather
> than special-casing the return case btw.

That would be even better, yes.
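
As a rough illustration of the behaviour being asked for, the effect can be
written by hand today. This is only a sketch in C with a hypothetical function
`classify`; the explicit `__builtin_trap ()` stands in for what the compiler
would insert at the point where control would otherwise flow off the end.

```c
/* Every input is covered by one of the tests, but the compiler need not
   know that; the trap makes falling off the end a predictable crash.  */
int
classify (int x)
{
  if (x > 0)
    return 1;
  if (x < 0)
    return -1;
  if (x == 0)
    return 0;
  __builtin_trap ();   /* reached only if the logic above is wrong */
}
```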

[Bug c/104680] identical inner condition not detected

2022-03-03 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104680

Eric Gallager  changed:

   What|Removed |Added

 Blocks||89863
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=82100
 CC||egallager at gcc dot gnu.org

--- Comment #10 from Eric Gallager  ---
This seems like basically the opposite of bug 82100


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863
[Bug 89863] [meta-bug] Issues in gcc that other static analyzers (cppcheck,
clang-static-analyzer, PVS-studio) find that gcc misses

[Bug fortran/104696] [OpenMP] Implicit mapping breaks struct mapping

2022-03-03 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104696

Tobias Burnus  changed:

   What|Removed |Added

Summary|[12 Regression][OpenMP] |[OpenMP] Implicit mapping
   |Implicit mapping breaks |breaks struct mapping
   |struct mapping  |
   Keywords||missed-optimization

--- Comment #1 from Tobias Burnus  ---
Looking closer at it, I no longer think it is a regression (to be checked if
deemed important).

But it looks as if there are two problems - one wrong-code one and one
missed-optimization one.

Namely, I think the reason for both issues is that
 map(to:var3.r[1].d [len: 88]) ...
is not turned into
 map(struct:var3 [len: 1]) map(to:var3.r[1].d [len: 88])
but into 'to' + pointer assign. This does not even work when using the 'always'
modifier.

('struct:' appears when using 'Q' instead of 'R(2)'.)

That the struct is not detected seems to be because array and component refs
are mixed – hiding that the var is memory wise in the same struct.

I believe with that fixed, it would work correctly.

  * * *

For C/C++, it "works".

But: it still does not detect that the member is part of the whole struct - and
allocates pointlessly too much memory. To illustrate this, I added a large
'arr' element.

Otherwise, the reason that it works in C/C++ is that ATTACH and not
pointer assign is used.

Namely:
 test.c--
struct s { int *d; };
struct s2 { struct s r[5], q, arr[1024][1024]; };

int main () {
  struct s2 x;
  x.q.d = __builtin_malloc(sizeof(int));
  x.r[1].d = __builtin_malloc(sizeof(int));

  #pragma omp target map(tofrom: x.q.d[:1])
   *x.q.d = 2;
  #pragma omp target map(tofrom: x.r[1].d[:1])
   *x.r[1].d = 3;

  __builtin_printf("%d, %d\n", *x.q.d, *x.r[1].d);
  return 0;
}
 test.c--

gives:

  #pragma omp target num_teams(1) thread_limit(0)
 map(tofrom:x [len: 8388656][implicit]) map(tofrom:*_3 [len: 4])
 map(attach:x.q.d [bias: 0])

  #pragma omp target num_teams(1) thread_limit(0)
 map(tofrom:x [len: 8388656][implicit])
 map(tofrom:*_4 [len: 4]) map(attach:x.r[1].d [bias: 0])
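
The `len: 8388656` in both dumps is simply `sizeof (struct s2)` on a target with
8-byte pointers: (5 + 1 + 1024 * 1024) pointers of 8 bytes each. The sketch
below checks that arithmetic; the two helper functions are hypothetical names
added for illustration, while the struct definitions are the ones from test.c.

```c
#include <stddef.h>

struct s { int *d; };
struct s2 { struct s r[5], q, arr[1024][1024]; };

/* Bytes covered by the implicit map of the whole variable 'x'.  */
size_t
whole_struct_bytes (void)
{
  return sizeof (struct s2);
}

/* Bytes of pointed-to data each target region actually touches.  */
size_t
pointee_bytes (void)
{
  return sizeof (int);
}
```

So each region maps about 8 MiB in order to update a single 4-byte int.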

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #14 from Jakub Jelinek  ---
Unfortunately, checking TYPE_NAME won't work either.
Because for the ibm128_float_type_node = long_double_type_node;
and ieee128_float_type_node = long_double_type_node; cases,
lang_hooks.types.register_builtin_type will not change TYPE_NAME on those which
remains "long double".
As one can't easily name-lookup those __ieee128 and __ibm128 identifiers back,
I think TARGET_FLOAT128_TYPE is the macro that controls it.
The __SIZEOF_*__ macros probably should be defined/undefined in
rs6000_target_modify_macros based on that and for __SIZEOF_FLOAT128__ also
based on (flags & OPTION_MASK_FLOAT128_KEYWORD) != 0.

[Bug target/104758] [nvptx] sm_30 board support broken

2022-03-03 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758

--- Comment #6 from Tom de Vries  ---
I'm now looking at:
...
diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index c83ceb3568b1..fea99c5d4069 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -53,7 +53,7 @@ Generate code for OpenMP offloading: enables -msoft-stack and -muniform-simt.

 ; Default needs to be in sync with default in ASM_SPEC in nvptx.h.
 misa=
-Target RejectNegative ToLower Joined Enum(ptx_isa) Var(ptx_isa_option) Init(PTX_ISA_SM35)
+Target RejectNegative ToLower Joined Enum(ptx_isa) Var(ptx_isa_option) Init(PTX_ISA_SM30)
 Specify the version of the ptx ISA to use.

 Enum
diff --git a/gcc/config/nvptx/t-nvptx b/gcc/config/nvptx/t-nvptx
index 8f67264d1328..b63c4a5a39d5 100644
--- a/gcc/config/nvptx/t-nvptx
+++ b/gcc/config/nvptx/t-nvptx
@@ -30,6 +30,4 @@ s-nvptx-gen-opt: $(srcdir)/config/nvptx/nvptx-sm.def
  tmp-nvptx-gen.opt $(srcdir)/config/nvptx/nvptx-gen.opt
$(STAMP) s-nvptx-gen-opt

-MULTILIB_OPTIONS = mgomp
-
-MULTILIB_EXTRA_OPTS = misa=sm_30 mptx=3.1
+MULTILIB_OPTIONS = mgomp mptx=3.1
...

So, switch default back to sm_30.  And add mptx=3.1 as a multilib option.

[Bug tree-optimization/104475] [12 Regression] Wstringop-overflow + atomics incorrect warning on dynamic object

2022-03-03 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104475

--- Comment #5 from Andrew Macleod  ---
We do have the option of not trying to determine anything about _1 in
situations like this..
I tried removing the op1_range() routine for addr_expr, and we pass all tests
just fine.

We would pick up non-null, if relevant, during normal processing.  It may not
be necessary to try to determine things like this during the outgoing edge
calculation, especially if it's causing issues...

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #15 from Jakub Jelinek  ---
Perhaps another possible test could be
ieee128_float_type_node != ibm128_float_type_node
because whenever those two are different, we know __ieee128 and __ibm128 are
supported (but still need to verify whether __float128 macro is defined).

[Bug tree-optimization/104754] gcc.dg/pr102892-1.c FAILs

2022-03-03 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104754

Aldy Hernandez  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2022-03-03
 CC||amacleod at redhat dot com

--- Comment #1 from Aldy Hernandez  ---
Confirmed on a cross to m68k-unknown-linux-gnu.

Interestingly this may actually be a regression against GCC11, at least on this
target (and possibly the others mentioned though I haven't checked).

The test verifies that there are no calls to foo().  On m68k the gate to foo()
flows through here (threadfull2 dump right before vrp2):

   [local count: 715863673]:
  # ivtmp.9_23 = PHI 
  bar ();
  _2 = (void *) ivtmp.9_23;
  _1 = MEM[(long int *)_2];
  ivtmp.9_24 = ivtmp.9_23 + 4;
  if (_1 == 1)
goto ; [20.24%]
  else
goto ; [79.76%]

   [local count: 144890806]:
  foo ();

ivtmp.9_24 has been set previously in BB9 to:

  ivtmp.9_7 = (unsigned int) &b;

VRP2 can't seem to do anything with the above sequence, since it can't figure
out what _1 is.  I suppose it could, since there is enough information to
get at "b" at -O3.

On x86, where the test passes, we have the following before vrp2:

  [local count: 477266310]:
  # c_4 = PHI 
  bar ();
  _15 = (sizetype) c_4;
  _17 = MEM[(long int *)&b + _15 * 8];
  if (_17 == 1)
goto ; [20.24%]
  else
goto ; [79.76%]

   [local count: 96598701]:
  foo ();
  c_29 = c_4 + 1;
  goto ; [100.00%]

which vrp2 can happily optimize to:

   [local count: 477266310]:
  bar ();
  _17 = 0;
  if (_17 == 1)
goto ; [20.24%]
  else
goto ; [79.76%]
...
...

  [local count: 96598701]:
  foo ();
  goto ; [100.00%]

Thus leading to foo's demise by ccp4.

I haven't dug deep, but this is likely due to the pointer equivalence tracking
we use in evrp/VRP2 not being able to see that this is all funny talk for b[]:

  ivtmp.9_7 = (unsigned int) &b;
...
...
  # ivtmp.9_23 = PHI 
  _2 = (void *) ivtmp.9_23;
  _1 = MEM[(long int *)_2];
  if (_1 == 1)

We have plans for a proper pointer range class for GCC13, though I wonder
whether we'll be able to handle the above gymnastics.

FWIW, the above transformation seems to be ivopts at play.

Whereas on x86 we go from:

  [local count: 715863673]:
  # c_19 = PHI 
  bar ();
  _1 = b[c_19][0];
  if (_1 == 1)

to:

   [local count: 715863673]:
  # c_19 = PHI 
  bar ();
  _23 = (sizetype) c_19;
  _1 = MEM[(long int *)&b + _23 * 8];
  if (_1 == 1)
goto ; [20.24%]

on m68k we transform the sequence to:

   [local count: 715863673]:
  # ivtmp.9_23 = PHI 
  bar ();
  _2 = (void *) ivtmp.9_23;
  _1 = MEM[(long int *)_2];
  ivtmp.9_24 = ivtmp.9_23 + 4;
  if (_1 == 1)

Perhaps someone with more target-foo can opine.
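
At source level, the two shapes correspond roughly to the following pair of C
functions. This is only a sketch: the array bound `ROWS` and both function names
are hypothetical, and the one-long-per-row shape of `b` is inferred from the
`b[c_19][0]` access and the per-iteration stride in the dumps. The first
function keeps the array indexing visible; the second walks an integer copy of
the address, which is the form that hides `b` from the later passes.

```c
#include <stddef.h>
#include <stdint.h>

#define ROWS 2          /* hypothetical bound, for illustration only */
long b[ROWS][1];

/* Array form, as in the loop before ivopts: b[c][0].  */
int
count_ones_indexed (void)
{
  int n = 0;
  for (size_t c = 0; c < ROWS; c++)
    if (b[c][0] == 1)
      n++;
  return n;
}

/* Induction-variable form: an integer holding the address is bumped by
   sizeof (long) each iteration and cast back to a pointer for the load.  */
int
count_ones_strided (void)
{
  int n = 0;
  uintptr_t iv = (uintptr_t) &b;
  for (size_t c = 0; c < ROWS; c++)
    {
      const long *p = (const long *) iv;
      if (*p == 1)
        n++;
      iv += sizeof (long);
    }
  return n;
}
```

Both functions read the same memory; only the second requires the optimizer to
see through the integer-to-pointer round trip to know the loads come from `b`.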

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #16 from Jakub Jelinek  ---
So what about:
--- gcc/config/rs6000/rs6000-c.cc.jj2022-02-17 10:24:16.756113275 +0100
+++ gcc/config/rs6000/rs6000-c.cc   2022-03-03 19:06:25.771981905 +0100
@@ -584,6 +584,10 @@ rs6000_target_modify_macros (bool define
rs6000_define_or_undefine_macro (true, "__float128=__ieee128");
   else
rs6000_define_or_undefine_macro (false, "__float128");
+  if (ibm128_float_type_node != ieee128_float_type_node && define_p)
+   rs6000_define_or_undefine_macro (true, "__SIZEOF_FLOAT128__=16");
+  else
+   rs6000_define_or_undefine_macro (false, "__SIZEOF_FLOAT128__");
 }
   /* OPTION_MASK_FLOAT128_HARDWARE can be turned on if -mcpu=power9 is used or
  via the target attribute/pragma.  */
@@ -623,11 +627,9 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
   if (TARGET_FRSQRTES)
 builtin_define ("__RSQRTEF__");
   if (TARGET_FLOAT128_TYPE)
-  builtin_define ("__FLOAT128_TYPE__");
-  if (ibm128_float_type_node)
+builtin_define ("__FLOAT128_TYPE__");
+  if (ibm128_float_type_node != ieee128_float_type_node)
 builtin_define ("__SIZEOF_IBM128__=16");
-  if (ieee128_float_type_node)
-builtin_define ("__SIZEOF_FLOAT128__=16");
 #ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
   builtin_define ("__BUILTIN_CPU_SUPPORTS__");
 #endif

?

[Bug tree-optimization/104754] gcc.dg/pr102892-1.c FAILs

2022-03-03 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104754

--- Comment #2 from Aldy Hernandez  ---
BTW, on GCC11, ivopts doesn't even get a whack at it.  The whole thing is
optimized away by .fre4:

int main ()
{
  long int a;
  long int c;

   [local count: 44232128]:
  if (a_9(D) <= 0)
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 39366592]:
  bar ();
  bar ();

   [local count: 44232131]:
  return 0;

}

Perhaps this is a regression elsewhere.

[Bug tree-optimization/104746] [12 Regression] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c

2022-03-03 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746

--- Comment #5 from Martin Sebor  ---
(In reply to Martin Liška from comment #3)

This is an example of the "symbolic constraints involving multiple arguments"
that I mentioned in comment #1.  There is no logic to determine from the
complex relationship between the lengths of the two strings that their sum also
constrains the output of the call to avoid the overflow.  A similar example of
the same problem is below.  The conditional guarantees that each of i and j
produces exactly one digit on output, but all the warning logic considers
is the range of the arguments, which is [0, INT_MAX].  Unlike in the string
case, I think here Ranger could actually set the range of i and j to be [0, 9]
on the assumption the sum doesn't overflow, but that would still not avoid the
warning unless the code also checked the range of the sum.

$ cat b.c && gcc -O2 -S -Wall -Wformat-overflow=2 b.c

char a[3];

void f (int i, int j)
{
  if (i < 0 || j < 0 || i + j > 9)
return;

  __builtin_sprintf (a, "%u%u", i, j);
}
b.c: In function ‘f’:
b.c:8:26: warning: ‘%u’ directive writing between 1 and 10 bytes into a region
of size 4 [-Wformat-overflow=]
8 |   __builtin_sprintf (a, "%u%u", i, j);
  |  ^~
b.c:8:25: note: using the range [0, 4294967295] for directive argument
8 |   __builtin_sprintf (a, "%u%u", i, j);
  | ^~
b.c:8:25: note: using the range [0, 4294967295] for directive argument
b.c:8:3: note: ‘__builtin_sprintf’ output between 3 and 21 bytes into a
destination of size 4
8 |   __builtin_sprintf (a, "%u%u", i, j);
  |   ^~~
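
The claim that the guard limits the output to two digits can be checked by brute
force. The sketch below is a hypothetical helper, not part of the test case: it
enumerates every (i, j) pair that passes the guard in f and asks snprintf how
many characters "%u%u" would produce.

```c
#include <stdio.h>

/* Largest number of characters (excluding the terminating NUL) that
   "%u%u" produces for any i, j with i >= 0, j >= 0 and i + j <= 9.  */
int
max_guarded_output_len (void)
{
  int max = 0;
  for (int i = 0; i <= 9; i++)
    for (int j = 0; i + j <= 9; j++)
      {
        /* A NULL buffer with size 0 only computes the length.  */
        int n = snprintf (NULL, 0, "%u%u", (unsigned) i, (unsigned) j);
        if (n > max)
          max = n;
      }
  return max;
}
```

Every surviving pair yields exactly two characters, so three bytes including the
NUL are enough; the warning cannot see this because it looks at each argument's
range separately.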

[Bug tree-optimization/104746] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c

2022-03-03 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746

Martin Sebor  changed:

   What|Removed |Added

Summary|[12 Regression] False   |False positive for
   |positive for|-Wformat-overflow=2 since
   |-Wformat-overflow=2 since   |r12-7033-g3c9f762ad02f398c
   |r12-7033-g3c9f762ad02f398c  |

--- Comment #6 from Martin Sebor  ---
None of these "false positives" is due to a bug in the warning code.  The
warning has been designed and documented to work this way.  What triggers more
instances of these warnings in GCC 12 is the more accurate range info courtesy
of Ranger.  Prior to GCC 12, the ranges were less accurate and sometimes
unavailable at all, and the warning is designed to avoid triggering in the
absence of any range info at all.

So I don't consider this a regression.

[Bug c++/65396] Function template default template arguments not merged

2022-03-03 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65396

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug c++/104702] [12 Regression] False positive -Wunused-value warning with -fno-exceptions

2022-03-03 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104702

--- Comment #3 from Martin Sebor  ---
The warning mapping needs to be updated whenever a location of a tree or
gimple* changes (gimple_set_block() calls gimple_set_location() which calls
copy_warning() so that part at least should work).  I saw code in the C++ front
end that doesn't do that when working on the warning control stuff but had no
good way to find it all; my expectation was that it would get discovered as
bugs like this one cropped up.

[Bug middle-end/104761] [12 Regression] bogus -Wdangling-pointer with cleanup and infinite loop

2022-03-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104761

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Martin Sebor :

https://gcc.gnu.org/g:51149a05b8cc8e4fc5a77a65857894daa371de89

commit r12-7467-g51149a05b8cc8e4fc5a77a65857894daa371de89
Author: Martin Sebor 
Date:   Thu Mar 3 13:58:00 2022 -0700

Call mark_dfs_back_edges before testing EDGE_DFS_BACK [PR104761].

Resolves:
PR middle-end/104761 - bogus -Wdangling-pointer with cleanup and infinite
loop

gcc/ChangeLog:

PR middle-end/104761
* gimple-ssa-warn-access.cc (pass_waccess::execute): Call
mark_dfs_back_edges.

gcc/testsuite/ChangeLog:

PR middle-end/104761
* g++.dg/warn/Wdangling-pointer-4.C: New test.
* gcc.dg/Wdangling-pointer-4.c: New test.

[Bug middle-end/104761] [12 Regression] bogus -Wdangling-pointer with cleanup and infinite loop

2022-03-03 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104761

Martin Sebor  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Martin Sebor  ---
Fixed in r12-7467.

[Bug middle-end/104077] bogus/missing -Wdangling-pointer

2022-03-03 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104077
Bug 104077 depends on bug 104761, which changed state.

Bug 104761 Summary: [12 Regression] bogus -Wdangling-pointer with cleanup and 
infinite loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104761

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/104746] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c

2022-03-03 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746

Andrew Macleod  changed:

   What|Removed |Added

 CC||amacleod at redhat dot com

--- Comment #7 from Andrew Macleod  ---

(In reply to Martin Sebor from comment #6)
> None of these "false positives" is due to a bug in the warning code.  The
> warning has been designed and documented to work this way.  What triggers
> more instances of these warnings in GCC 12 is the more accurate range info
> courtesy of Ranger.  Prior to GCC 12, the ranges were less accurate and
> sometimes unavailable at all, and the warning is designed to avoid
> triggering in the absence of any range info at all.
> 
> So I don't consider this a regression.

"Regression" is defined as didn't cause a problem before, but does now. Making
this a regression.

Besides, according to the warning:

size 4 [-Wformat-overflow=]
8 |   __builtin_sprintf (a, "%u%u", i, j);
  |  ^~
b.c:8:25: note: using the range [0, 4294967295] for directive argument
8 |   __builtin_sprintf (a, "%u%u", i, j);
  | ^~
b.c:8:25: note: using the range [0, 4294967295] for directive argument
b.c:8:3: note: ‘__builtin_sprintf’ output between 3 and 21 bytes into a
destination of size 4
8 |   __builtin_sprintf (a, "%u%u", i, j);

It's using [0, 4294967295] as the range, which is [0, 0xFFFFFFFF] or varying,
so there isn't any new precision of ranges from ranger causing this?  VARYING
implies there is no range at all known.

Wouldn't we be seeing [0,9] if you were getting more precise ranges?
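
The numbers in the notes follow directly from treating each argument as a full
32-bit unsigned range. A small sketch, with hypothetical helper names, that
reproduces the "between 1 and 10 bytes" per directive and "between 3 and 21
bytes" overall:

```c
#include <stdint.h>

/* Number of decimal digits "%u" prints for v.  */
int
decimal_digits (uint32_t v)
{
  int n = 1;
  while (v >= 10)
    {
      v /= 10;
      n++;
    }
  return n;
}

/* Bytes written by sprintf (buf, "%u%u", a, b), including the NUL.  */
int
two_directive_bytes (uint32_t a, uint32_t b)
{
  return decimal_digits (a) + decimal_digits (b) + 1;
}
```

With both arguments at 0 this gives 3, and with both at 4294967295 it gives 21,
matching the bounds the diagnostic reports.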

[Bug target/104775] New: [9/10/11/12 Regression] Failure to assemble on s390x with -fsanitize=undefined

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104775

Bug ID: 104775
   Summary: [9/10/11/12 Regression] Failure to assemble on s390x
with -fsanitize=undefined
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

struct A {
  virtual long foo () const;
  int a;
};
long
A::foo () const
{
  int sec = a / 0;
  return sec;
}

test with -O2 -w -m64 -march=zEC12 -fsanitize=undefined fails to assemble
Music_Emu.s: Assembler messages:
Music_Emu.s:56: Error: syntax error; missing ')' after base register
Music_Emu.s:56: Error: junk at end of line: `%r12)'
The regression appeared some time between r24 and r245002 so during GCC 7
development, haven't bisected when exactly.
The problematic assembly line is:
clgte   %r1,0(%r5,%r12)

[Bug target/104775] [9/10/11/12 Regression] Failure to assemble on s390x with -fsanitize=undefined

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104775

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|--- |9.5
 CC||krebbel at gcc dot gnu.org
   Priority|P3  |P2
 Target||s390x-linux

[Bug target/104775] [9/10/11/12 Regression] Failure to assemble on s390x with -fsanitize=undefined

2022-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104775

--- Comment #1 from Jakub Jelinek  ---
Even simpler testcase, this time C, just -O2 -march=zEC12 is needed:
long a[64];
void bar (void);

void
foo (int x, int y)
{
  if (x != a[y])
bar ();
  __builtin_trap ();
}

[Bug libstdc++/96526] implement std::strong_order total order on floating point types

2022-03-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96526

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:9805965e3551b66b5bd751d6076791d00cdeb137

commit r12-7468-g9805965e3551b66b5bd751d6076791d00cdeb137
Author: Jonathan Wakely 
Date:   Thu Mar 3 12:34:27 2022 +

libstdc++: Implement std::strong_order for floating-point types [PR96526]

This removes a FIXME in <compare>, defining the total order for
floating-point types. I originally opened PR96526 to request a new
compiler built-in to implement this, but now that we have std::bit_cast
it can be done entirely in the library.

The implementation is based on the glibc definitions of totalorder,
totalorderf, totalorderl etc.

I think this works for all the types that satisfy std::floating_point
today, and should also work for the types expected to be added by P1467
except for std::bfloat16_t. It also supports some additional types that
don't currently satisfy std::floating_point, such as __float80, but we
probably do want that to satisfy the concept for non-strict modes.

libstdc++-v3/ChangeLog:

PR libstdc++/96526
* libsupc++/compare (strong_order): Add missing support for
floating-point types.
*
testsuite/18_support/comparisons/algorithms/strong_order_floats.cc:
New test.
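
For readers unfamiliar with the idea, a total order on floating-point values can
be obtained by comparing suitably adjusted bit patterns. The sketch below is not
the libstdc++ or glibc code; it is a well-known bit trick, written in C with
hypothetical function names, and it assumes `float` is a 32-bit IEEE binary32
value. `memcpy` plays the role that `std::bit_cast` plays in the library.

```c
#include <stdint.h>
#include <string.h>

/* Map a float to an unsigned key whose integer order matches the total
   order: negative values have all bits flipped, non-negative values get
   the sign bit set, so -NaN < -inf < ... < -0 < +0 < ... < +inf < +NaN.  */
uint32_t
total_order_key (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
}

/* Returns a negative, zero, or positive value, like strcmp.  */
int
total_order_cmp (float a, float b)
{
  uint32_t ka = total_order_key (a);
  uint32_t kb = total_order_key (b);
  return (ka > kb) - (ka < kb);
}
```

Unlike the built-in `<` operator, this distinguishes -0.0f from +0.0f and gives
NaNs a definite position.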

[Bug libstdc++/104748] debug mode: FAIL: std/ranges/adaptors/all.cc (test for excess errors)

2022-03-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104748

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:5706a5db88a0eeaf82071debe1364f4533896a65

commit r12-7470-g5706a5db88a0eeaf82071debe1364f4533896a65
Author: Jonathan Wakely 
Date:   Thu Mar 3 22:28:48 2022 +

libstdc++: Use non-debug vector in constexpr test [PR104748]

The std::__debug::vector isn't usable in constant expressions, so this
test fails in debug mode. Until the debug vector is fixed we can just
make the test use the non-debug one.

libstdc++-v3/ChangeLog:

PR libstdc++/104748
* testsuite/std/ranges/adaptors/all.cc: Use non-debug vector for
constexpr test.

[Bug libstdc++/104748] debug mode: FAIL: std/ranges/adaptors/all.cc (test for excess errors)

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104748

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed||2022-03-03
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #2 from Jonathan Wakely  ---
The test no longer FAILs, but we should still make the debug vector usable in
constant expressions (by disabling its iterator tracking).

[Bug libstdc++/96526] implement std::strong_order total order on floating point types

2022-03-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96526

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Jonathan Wakely  ---
Fixed for GCC 12. I might backport this to gcc-11 later.

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #17 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #13)
> I see 



> Doesn't this mean that ieee128_float_type_node and ibm128_float_type_node is
> always non-NULL?

No.  All of that code is inside
  if (TARGET_FLOAT128_TYPE)
so none of that is ever run otherwise.

The basic types we have are __ibm128 and __ieee128, all the rest is a
macro maze.

[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux

2022-03-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708

--- Comment #18 from Segher Boessenkool  ---
Ah, I didn't see the
  else
ieee128_float_type_node = ibm128_float_type_node = long_double_type_node;
which looks completely garbage.  If long double is just DP float, we certainly
do not want either __ibm128 or __ieee128 to be the same!

Mike?

[Bug translation/104552] Mistakes in strings to be translated in GCC 12

2022-03-03 Thread roland.illig at gmx dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104552

--- Comment #8 from Roland Illig  ---
From tree-ssa-uninit.c:
> accessing argument %u of a function declared with attribute %<%s%>

The %<%s%> should be the conventional %qs.

[Bug c/99291] maybe_warn_pass_by_reference uses outdated format string

2022-03-03 Thread roland.illig at gmx dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99291

--- Comment #6 from Roland Illig  ---
Still reproducible in GCC 12.

[Bug c++/104734] -isystem hides -Woverloaded-virtual warning

2022-03-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104734

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic
   Last reconfirmed||2022-03-03
 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Can you attach the preprocessed source for both cases?
You can get it by adding the -save-temps option.

[Bug tree-optimization/104746] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c

2022-03-03 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746

--- Comment #8 from Martin Sebor  ---
Andrew, quoting from the documentation for the warning:

  Unknown string arguments whose length cannot be assumed to be bounded either
by the directive’s precision, or by a finite set of string literals they may
evaluate to, or the character array they may point to, are assumed to be 1
character long. 

The length of the second string argument to the second sprintf call is assumed
to be bounded by the size of the array allocated by the first call to malloc. 
The malloc argument is at most 2147483654 (INT_MAX + strlen (".mount") + 1), so
the maximum length of the string is 2147483653.  That's also what the warning
prints.  The ability to determine and use the maximum length was added in
r12-7033 to avoid the warning reported in PR 104119.  Because GCC 11 doesn't
have this ability, it assumes the length of the string argument is 1, and so
the warning doesn't trigger there.  That could be considered a bug or
limitation in GCC 11.
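
The bound arithmetic above can be checked with a small sketch.  The helper names max_alloc_size and max_string_length are made up for illustration and are not part of GCC or the test case, and the values assume a 32-bit int (INT_MAX == 2147483647):

```c
#include <assert.h>   /* only needed for the check in the note below */
#include <limits.h>
#include <string.h>

/* Hypothetical helper: upper bound on the malloc argument,
   INT_MAX + strlen (".mount") + 1, computed in a wider type
   so the addition cannot overflow.  */
long long max_alloc_size (void)
{
  return (long long) INT_MAX + (long long) strlen (".mount") + 1;
}

/* Hypothetical helper: longest string that fits in that allocation,
   leaving one byte for the terminating nul.  */
long long max_string_length (void)
{
  return max_alloc_size () - 1;
}
```

With a 32-bit int, assert (max_alloc_size () == 2147483654LL) and assert (max_string_length () == 2147483653LL) should both hold, matching the numbers quoted above.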

The instance of the warning in the test case in comment #3 is designed to point
out directives whose output may exceed the environmental limit of 4095 bytes
(the minimum the C standard requires implementations to support).  It's working
as designed and as intended.  If you don't like the design and want to propose
a change I suggest you submit a concrete proposal and we can discuss it.

As for my integer test from comment #7, you're right that if each argument was
in the range [0, 9] the warning would be avoided.  I didn't get the limits
quite right.  A test case that should better illustrate the point I was making
about the constraints derived from relationships might go something like this:

char a[4];

void f (int i, int j)
{
  if (i < 0 || j < 0 || i + j > 19)
    return;

  __builtin_sprintf (a, "%u%u", i, j);
}

Here, setting the range of each of i and j on its own to [0, 19] isn't enough
to rule out a possible overflow; we also need to capture the constraint that if
one is two digits the other must be just one.  At the moment there is no logic
to determine that.  I think the corresponding test case for a possible
optimization is this:

void f (int i, int j)
{
  if (i < 0 || j < 0 || i + j > 19)
    return;

  if (i * j > 90)   // fold to false
    __builtin_abort ();
}
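
As a sanity check of the reasoning in the two test cases, here is a brute-force sketch that enumerates the small input space.  The function names max_output_length and max_product are invented for this illustration; nothing here reflects how GCC computes ranges.

```c
#include <assert.h>   /* only needed for the checks in the note below */
#include <stdio.h>

/* Hypothetical helper: longest "%u%u" output (excluding the nul) over
   all i, j in [0, 19].  If sum_limit is nonnegative, only pairs with
   i + j <= sum_limit are considered.  */
int max_output_length (int sum_limit)
{
  int longest = 0;
  for (int i = 0; i <= 19; i++)
    for (int j = 0; j <= 19; j++)
      {
        if (sum_limit >= 0 && i + j > sum_limit)
          continue;
        /* snprintf with a null buffer and zero size only counts.  */
        int n = snprintf (NULL, 0, "%u%u", (unsigned) i, (unsigned) j);
        if (n > longest)
          longest = n;
      }
  return longest;
}

/* Hypothetical helper: largest i * j over nonnegative i, j with
   i + j <= 19.  */
int max_product (void)
{
  int best = 0;
  for (int i = 0; i <= 19; i++)
    for (int j = 0; i + j <= 19; j++)
      if (i * j > best)
        best = i * j;
  return best;
}
```

With the sum constraint, assert (max_output_length (19) == 3) holds, so the output plus the nul fits in a[4].  With only the per-variable ranges, assert (max_output_length (-1) == 4) holds, which would overflow.  Likewise assert (max_product () == 90) holds, so i * j > 90 can never be true.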

[Bug ada/104767] GNAT.Serial_Communications windows package allows/causes multiple closing of the same windows handle.

2022-03-03 Thread tornenvi at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104767

--- Comment #1 from tornenvi at gmail dot com ---
Bug report should be referring to calls to 'Open' instead of
'Connect'.

[Bug testsuite/104724] gcc.target/i386/avx512fp16-vcvtsi2sh-1b.c etc. FAIL

2022-03-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104724

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED
  Component|target  |testsuite
   Keywords||testsuite-fail

--- Comment #5 from Andrew Pinski  ---
Fixed.

[Bug target/104773] compare with 1 not merged with subtract 1

2022-03-03 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104773

--- Comment #1 from Hongtao.liu  ---
It looks like the same issue as PR98977.
