[Bug tree-optimization/110896] [12/13/14 Regression] gcc.dg/ubsan/pr81981.c is xfailed

2023-08-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110896

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-08-04

--- Comment #1 from Richard Biener  ---
We simplify this to t[0] * 2 (by luck, so it could also be u[0] * 2) which
means we lose track of the use of the other variable.

Value numbering stmt = t$0_4 = PHI 
Setting value number of t$0_4 to t$0_4 (changed)
Making available beyond BB4 t$0_4 for value t$0_4
Value numbering stmt = u$0_12 = PHI 
Marking CSEd to PHI node t$0_4 = PHI 
Setting value number of u$0_12 to t$0_4 (changed)
...
Replaced redundant PHI node defining u$0_12 with t$0_4
gimple_simplified to _9 = t$0_4 * 2;

early uninit sees conditional init of the memory and so refrains from
diagnosing this.

I suppose this is kind-of a duplicate of the many missed uninit diagnostics
because of CCP optimistic propagation (and only because we don't do optimistic
copyprop we do not have even more such cases).
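
For readers following along, here is a minimal sketch of the kind of function involved. It is an illustrative reduction written for this note, not the actual contents of gcc.dg/ubsan/pr81981.c, and the name `sum_cond_init` is an assumption. Both arrays are initialized on the same path with the same value, so value numbering can treat the two loads as equal and fold the sum into a single load times two, which drops the reference to one of the variables.

```c
/* Hypothetical reduction: t[0] and u[0] are only initialized when
   i != 0, and both receive the same value.  */
int
sum_cond_init (int i)
{
  int t[1], u[1];

  if (i)
    {
      t[0] = i;
      u[0] = i;
    }

  /* When i == 0 both reads are of uninitialized memory, so a warning
     for each variable would be expected here.  Once the PHIs for t and
     u are CSEd, only one of the two variables is still referenced.  */
  return t[0] + u[0];
}
```

Calling it with a non-zero argument is well defined and returns twice the argument; the interesting path for the diagnostic is the one where `i` is zero.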

[Bug middle-end/101955] (signed<<

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101955

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:9020da78df2854f14f8b1d38b58a6d3b77a4b731

commit r14-2977-g9020da78df2854f14f8b1d38b58a6d3b77a4b731
Author: Drew Ross 
Date:   Fri Aug 4 09:08:05 2023 +0200

match.pd: Canonicalize (signed x << c) >> c [PR101955]

Canonicalizes (signed x << c) >> c into the lowest
precision(type) - c bits of x IF those bits have a mode precision or a
precision of 1. Also combines this rule with (unsigned x << c) >> c -> x &
((unsigned)-1 >> c) to prevent duplicate pattern.

PR middle-end/101955
* match.pd ((signed x << c) >> c): New canonicalization.

* gcc.dg/pr101955.c: New test.
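
As a rough illustration of what the commit message describes, the sketch below compares the shift pairs with their canonical forms on 32-bit integers. The helper names are assumptions made for this note. With c == 24 the remaining 8 bits have a mode precision (QImode), and with c == 16 the remaining 16 bits do (HImode). The signed helper shifts on the unsigned representation to stay clear of signed-overflow undefined behavior in ISO C; the conversion back to a signed type and the arithmetic right shift of a negative value are implementation-defined and are assumed to behave as GCC documents them.

```c
#include <stdint.h>

/* (signed x << c) >> c for a 32-bit x; requires c < 32.  */
int32_t
shl_then_sar (int32_t x, unsigned c)
{
  return (int32_t) ((uint32_t) x << c) >> c;
}

/* Canonical form for c == 24: sign-extend the low 8 bits.  */
int32_t
sext_low8 (int32_t x)
{
  return (int8_t) x;
}

/* Canonical form for c == 16: sign-extend the low 16 bits.  */
int32_t
sext_low16 (int32_t x)
{
  return (int16_t) x;
}

/* (unsigned x << c) >> c; requires c < 32.  */
uint32_t
shl_then_shr (uint32_t x, unsigned c)
{
  return (x << c) >> c;
}

/* Canonical form for the unsigned case: x & ((unsigned) -1 >> c).  */
uint32_t
mask_low (uint32_t x, unsigned c)
{
  return x & (UINT32_MAX >> c);
}
```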

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-08-04
             Target|riscv                       |riscv, x86_64-*-*
                 CC|                            |rsandifo at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
             Blocks|                            |53947
          Component|target                      |tree-optimization

--- Comment #3 from Richard Biener  ---
it looks like you don't support vector short logical shift?  For some reason
vect_recog_over_widening_pattern doesn't check whether the demoted operation
is supported ...

The following helps on x86_64, it disables the demotion.  I think the idea
was that we eventually recognize a widening shift, so the narrow operation
itself doesn't need to be supported, but clearly that doesn't work out
when there is no such shift.

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index e4ab8c2d65b..4e4191652e3 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -3091,6 +3091,11 @@ vect_recog_over_widening_pattern (vec_info *vinfo,
   if (!new_vectype || !op_vectype)
     return NULL;
 
+  optab optab;
+  if (!(optab = optab_for_tree_code (code, op_vectype, optab_vector))
+      || optab_handler (optab, TYPE_MODE (op_vectype)) == CODE_FOR_nothing)
+    return NULL;
+
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location, "demoting %T to %T\n",
                      type, new_type);

with the patch above x86 can vectorize both loops with AVX2 but not without.

Can you confirm this helps on RISC-V as well?

Richard, what was the idea here?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #4 from JuzheZhong  ---
(In reply to Richard Biener from comment #3)
> it looks like you don't support vector short logical shift?  For some reason
> vect_recog_over_widening_pattern doesn't check whether the demoted operation
> is supported ...
> 
> The following helps on x86_64, it disables the demotion.  I think the idea
> was that we eventually recognize a widening shift, so the narrow operation
> itself doesn't need to be supported, but clearly that doesn't work out
> when there is no such shift.
> 


Thanks Richi.


What is the "vector short logical shift" optab?

Could you give me the optab name?

I am going to try to support this in the RISC-V port.


> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index e4ab8c2d65b..4e4191652e3 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -3091,6 +3091,11 @@ vect_recog_over_widening_pattern (vec_info *vinfo,
>if (!new_vectype || !op_vectype)
>  return NULL;
>  
> +  optab optab;
> +  if (!(optab = optab_for_tree_code (code, op_vectype, optab_vector))
> +  || optab_handler (optab, TYPE_MODE (op_vectype)) == CODE_FOR_nothing)
> +return NULL;
> +
>if (dump_enabled_p ())
>  dump_printf_loc (MSG_NOTE, vect_location, "demoting %T to %T\n",
>  type, new_type);
> 
> with the patch above x86 can vectorize both loops with AVX2 but not without.
> 
> Can you confirm this helps on RISC-V as well?
> 
> Richard, what was the idea here?

Yeah. I can try it after I try the "vector short logical shift" pattern.

[Bug middle-end/110874] [14 Regression] ice with -O2 with recent gcc

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110874

--- Comment #15 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:91c963ea6f845a0c59b7523a5330b8d3ed1beb6a

commit r14-2978-g91c963ea6f845a0c59b7523a5330b8d3ed1beb6a
Author: Andrew Pinski 
Date:   Wed Aug 2 14:49:00 2023 -0700

Fix PR 110874: infinite loop in gimple_bitwise_inverted_equal_p with fre

This changes gimple_bitwise_inverted_equal_p to use two different match
patterns to try to match a bit_not wrapped with a possible nop_convert and
a comparison also wrapped with a possible nop_convert.  This is to avoid
being recursive.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/110874
* gimple-match-head.cc (gimple_bit_not_with_nop): New declaration.
(gimple_maybe_cmp): Likewise.
(gimple_bitwise_inverted_equal_p): Rewrite to use gimple_bit_not_with_nop
and gimple_maybe_cmp instead of being recursive.
* match.pd (bit_not_with_nop): New match pattern.
(maybe_cmp): Likewise.

gcc/testsuite/ChangeLog:

PR tree-optimization/110874
* gcc.c-torture/compile/pr110874-a.c: New test.

[Bug middle-end/110874] [14 Regression] ice with -O2 with recent gcc

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110874

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #16 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #5 from JuzheZhong  ---
(In reply to Richard Biener from comment #3)
> it looks like you don't support vector short logical shift?  For some reason
> vect_recog_over_widening_pattern doesn't check whether the demoted operation
> is supported ...
> 
> The following helps on x86_64, it disables the demotion.  I think the idea
> was that we eventually recognize a widening shift, so the narrow operation
> itself doesn't need to be supported, but clearly that doesn't work out
> when there is no such shift.
> 
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index e4ab8c2d65b..4e4191652e3 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -3091,6 +3091,11 @@ vect_recog_over_widening_pattern (vec_info *vinfo,
>if (!new_vectype || !op_vectype)
>  return NULL;
>  
> +  optab optab;
> +  if (!(optab = optab_for_tree_code (code, op_vectype, optab_vector))
> +  || optab_handler (optab, TYPE_MODE (op_vectype)) == CODE_FOR_nothing)
> +return NULL;
> +
>if (dump_enabled_p ())
>  dump_printf_loc (MSG_NOTE, vect_location, "demoting %T to %T\n",
>  type, new_type);
> 
> with the patch above x86 can vectorize both loops with AVX2 but not without.
> 
> Can you confirm this helps on RISC-V as well?
> 
> Richard, what was the idea here?

Hi, Richi.

I guess the "vector short logical shift" pattern you mean is this one:

(define_insn_and_split "v<optab><mode>3"
  [(set (match_operand:VI 0 "register_operand"  "=vr,vr")
    (any_shift:VI
     (match_operand:VI 1 "register_operand" " vr,vr")
     (match_operand:VI 2 "vector_shift_operand" " vr,vk")))]
  "TARGET_VECTOR && can_create_pseudo_p ()"
  "#"
  "&& 1"
  [(const_int 0)]
{
  riscv_vector::emit_vlmax_insn (code_for_pred (<CODE>, <MODE>mode),
                                 riscv_vector::RVV_BINOP, operands);
  DONE;
}
 [(set_attr "type" "vshift")
  (set_attr "mode" "<MODE>")])

(define_code_iterator any_shift [ashift ashiftrt lshiftrt])

VI includes vector short.

So I think the RISC-V port does support vector short logical shift?

[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297

2023-08-04 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869

--- Comment #9 from Rainer Orth  ---
Created attachment 55684
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55684&action=edit
64-bit sparc-sun-solaris2.11 cmp-mem-const-1.c.289r.combine

[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297

2023-08-04 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869

--- Comment #10 from Rainer Orth  ---
The tests still FAIL on Solaris/SPARC:

FAIL: gcc.dg/cmp-mem-const-1.c scan-rtl-dump combine "narrow comparison from mode .I to QI"
FAIL: gcc.dg/cmp-mem-const-2.c scan-rtl-dump combine "narrow comparison from mode .I to QI"
FAIL: gcc.dg/cmp-mem-const-3.c scan-rtl-dump combine "narrow comparison from mode .I to HI"
FAIL: gcc.dg/cmp-mem-const-4.c scan-rtl-dump combine "narrow comparison from mode .I to HI"
FAIL: gcc.dg/cmp-mem-const-5.c scan-rtl-dump combine "narrow comparison from mode .I to SI"
FAIL: gcc.dg/cmp-mem-const-6.c scan-rtl-dump combine "narrow comparison from mode .I to SI"

[Bug tree-optimization/110891] [14 Regression] Dead Code Elimination Regression since r14-2674-gd0de3bf9175

2023-08-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110891

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com

--- Comment #2 from Richard Biener  ---
I didn't anticipate the trick triggering with FRE but we have

Value numbering stmt = _6 = _5 | 64;
Setting value number of _6 to _6 (changed)
...
Value numbering stmt = _9 = _8 & 5;
Setting value number of _9 to _9 (changed)
...
Replaced a with _6 in all uses of _8 = a;
Applying pattern match.pd:184, gimple-match-10.cc:6142
Applying pattern match.pd:1962, gimple-match-6.cc:16850
gimple_simplified to _10 = _5 & 5;
_9 = _10;

that's the old issue that when we are recursively simplifying pattern
results like

  (bit_ior (bit_and @0 @2) (bit_and! @1 @2)))

we need to push operands, but when any outer operation simplifies
away we can't (or rather do not) pop them again (also when asked
to never push we'd fail the pattern before trying to simplify
the outer operation).  That can then result in such stray copies
to appear.
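
To make the dump above concrete, here is a small sketch of the identity being applied, assuming the matched expression is `(_5 | 64) & 5` as the value-numbering trace suggests. The function names are made up for this note. Distributing the AND over the OR gives `(x & 5) | (64 & 5)`; the second operand folds to zero, so the outer OR simplifies away and only the already pushed inner `x & 5` remains, which is where the stray copy comes from.

```c
#include <stdint.h>

/* Original form seen by FRE: (x | 64) & 5.  */
uint32_t
and_of_ior (uint32_t x)
{
  return (x | 64u) & 5u;
}

/* Distributed form: (x & 5) | (64 & 5).  */
uint32_t
distributed (uint32_t x)
{
  uint32_t inner = x & 5u;      /* corresponds to _10 = _5 & 5 */
  uint32_t folded = 64u & 5u;   /* constant-folds to 0 */
  return inner | folded;        /* inner | 0 simplifies to inner */
}
```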

So the first IL difference is

--- a/t.c.114t.fre3 2023-08-04 09:22:55.380428835 +0200
+++ b/t.c.114t.fre3 2023-08-04 09:21:50.455470894 +0200
@@ -159,7 +159,7 @@
   a = _6;
   _10 = _5 & 5;
   _9 = _10;
-  a = _9;
+  a = _10;
   return 0;

 }

but that vanishes in copyprop1.  ifcombine then gets different SSA names
assigned which means different association of bitwise or operations.  For
some reason this causes the divergence in DOM2.

After copyprop2 we have

-FREE_SSANAMES: 12, 21, 3, 4, 7, 16, 15, 19, 22, 26, 17, 27, 13, 6, 20, 25, 8, 18, 24, 9,
+FREE_SSANAMES: 12, 21, 3, 4, 7, 16, 15, 19, 22, 26, 17, 27, 9, 13, 6, 20, 25, 8, 18, 24,

so the same SSA names are in the freelist but as that is unordered we pick
different names when re-using.

In the DOM2 pass you can see that ranger behaves slightly differently when
processing operands in a different order for commutative operations like
bitwise or in this case; that leads to the observed difference in threading.

Tracing ranger reveals too many differences, in the end I'd say "bad luck",
but maybe ranger folks want to investigate as well?

I'm not convinced we need to sort FREE_SSANAMES, solving the slightly
imperfect simplification for match would be nice.

[Bug tree-optimization/82397] [8 Regression] qsort comparator non-negative on sorted output: 1 in vect_analyze_data_ref_accesses

2023-08-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82397
Bug 82397 depends on bug 82446, which changed state.

Bug 82446 Summary: [11/12/13/14 Regression] Missed equalities in dr_group_sort_cmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82446

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

[Bug tree-optimization/82446] [11/12/13/14 Regression] Missed equalities in dr_group_sort_cmp

2023-08-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82446

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #12 from Richard Biener  ---
I don't think anything changed here, but I don't see any actual testcase where
we could verify things so let's close this bug.

[Bug ada/110898] New: compilation of adacl-assert-integer.ads failed

2023-08-04 Thread krischik at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110898

Bug ID: 110898
   Summary: compilation of adacl-assert-integer.ads failed
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krischik at users dot sourceforge.net
CC: dkm at gcc dot gnu.org
  Target Milestone: ---

Created attachment 55685
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55685&action=edit
Source code

# the exact version of GCC, as shown by "gcc -v";

> >alr exec -P1 -- gcc --version
> ⓘ Synchronizing workspace...
> Dependencies automatically updated as follows:   
> 
>+♼ gnat 12.1.2 (new,installed,gnat_native)
> 
> gcc (GCC) 12.1.0
> Copyright (C) 2022 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

# the system type;

macOS 12.6.7

# the options when GCC was configured/built;

> >alr exec -P1 -- gcc -v   
> Using built-in specs.
> COLLECT_GCC=/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/gcc
> COLLECT_LTO_WRAPPER=/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../libexec/gcc/x86_64-apple-darwin19.6.0/12.1.0/lto-wrapper
> Target: x86_64-apple-darwin19.6.0
> Configured with: ../src/configure 
> --prefix=/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/x86_64-darwin/gcc/install
>  --enable-languages=c,ada,c++ --enable-libstdcxx --enable-libstdcxx-threads 
> --enable-libada --disable-nls --without-libiconv-prefix 
> --disable-libstdcxx-pch --enable-lto --disable-multilib --disable-libcilkrts 
> --without-build-config 
> --with-build-sysroot=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
>  
> --with-specs='%{!sysroot=*:--sysroot=%:if-exists-else(/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
>  /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk)}' 
> --with-mpfr=/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/x86_64-darwin/mpfr/install
>  
> --with-gmp=/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/x86_64-darwin/gmp/install
>  
> --with-mpc=/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/x86_64-darwin/mpc/install
>  --build=x86_64-apple-darwin19.6.0
> Thread model: posix
> Supported LTO compression algorithms: zlib
> gcc version 12.1.0 (GCC) 
> COMPILER_PATH=/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../libexec/gcc/x86_64-apple-darwin19.6.0/12.1.0/:/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../libexec/gcc/
> LIBRARY_PATH=/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../lib/gcc/x86_64-apple-darwin19.6.0/12.1.0/:/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../lib/gcc/:/usr/local/lib/:/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../lib/gcc/x86_64-apple-darwin19.6.0/12.1.0/../../../
> COLLECT_GCC_OPTIONS='-P' '-v' '-mmacosx-version-min=12.5.0' 
> '-asm_macosx_version_min=12.5' '-nodefaultexport' '-mtune=core2' 
> '--sysroot=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk'
>  '-dumpdir' 'a.'
>  
> /Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../libexec/gcc/x86_64-apple-darwin19.6.0/12.1.0/collect2
>  -syslibroot 
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/
>  -dynamic -arch x86_64 -macosx_version_min 12.5.0 -o a.out 
> -L/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../lib/gcc/x86_64-apple-darwin19.6.0/12.1.0
>  
> -L/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../lib/gcc
>  -L/usr/local/lib 
> -L/Users/martin/.config/alire/cache/dependencies/gnat_native_12.1.2_587b912f/bin/../lib/gcc/x86_64-apple-darwin19.6.0/12.1.0/../../..
>  adacl.gpr -lemutls_w -lgcc -lSystem -no_compact_unwind
> ld: warning: ignoring file adacl.gpr, building for macOS-x86_64 but 
> attempting to link with file built for unknown-unsupported file format ( 0x2D 
> 0x2D 0x2D 0x2D 0x2D 0x2D 0x2D 0x2D 0x

[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297

2023-08-04 Thread stefansf at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869

--- Comment #11 from Stefan Schulze Frielinghaus  ---
Created attachment 55686
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55686&action=edit
Increase optimization

[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297

2023-08-04 Thread stefansf at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869

--- Comment #12 from Stefan Schulze Frielinghaus  ---
I have done a test with a cross-compiler and it looks to me as if we need -O2
instead of -O1 on Sparc in order to trigger the optimization.  Can you give the
attached patch a try?  Sorry for all the hassle.

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #6 from JuzheZhong  ---
(In reply to Richard Biener from comment #3)
> it looks like you don't support vector short logical shift?  For some reason
> vect_recog_over_widening_pattern doesn't check whether the demoted operation
> is supported ...
> 
> The following helps on x86_64, it disables the demotion.  I think the idea
> was that we eventually recognize a widening shift, so the narrow operation
> itself doesn't need to be supported, but clearly that doesn't work out
> when there is no such shift.
> 
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index e4ab8c2d65b..4e4191652e3 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -3091,6 +3091,11 @@ vect_recog_over_widening_pattern (vec_info *vinfo,
>if (!new_vectype || !op_vectype)
>  return NULL;
>  
> +  optab optab;
> +  if (!(optab = optab_for_tree_code (code, op_vectype, optab_vector))
> +  || optab_handler (optab, TYPE_MODE (op_vectype)) == CODE_FOR_nothing)
> +return NULL;
> +
>if (dump_enabled_p ())
>  dump_printf_loc (MSG_NOTE, vect_location, "demoting %T to %T\n",
>  type, new_type);
> 
> with the patch above x86 can vectorize both loops with AVX2 but not without.
> 
> Can you confirm this helps on RISC-V as well?
> 
> Richard, what was the idea here?

Hi, Richi.

I tried the code you suggested:
  optab optab;
  if (!(optab = optab_for_tree_code (code, op_vectype, optab_vector))
  || optab_handler (optab, TYPE_MODE (op_vectype)) == CODE_FOR_nothing)
return NULL;

[jzzhong@server1:/work/home/jzzhong/work/insn]$~/work/rvv-opensource/output/gcc-rv64/bin/riscv64-rivai-elf-gcc
-march=rv64gcv -O3 --param=riscv-autovec-preference=scalable -S
-fopt-info-vec-missed rvv.c
rvv.c:14:1: missed: couldn't vectorize loop
rvv.c:14:1: missed: not vectorized: no vectype for stmt: _4 = *_3;
 scalar_type: uint16_t


It still cannot be vectorized.

[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297

2023-08-04 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869

--- Comment #13 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #12 from Stefan Schulze Frielinghaus <stefansf at linux dot ibm.com> ---
> I have done a test with a cross-compiler and it looks to me as if we need -O2
> instead of -O1 on Sparc in order to trigger the optimization.  Can you give 
> the
> attached patch a try?  Sorry for all the hassle.

We're getting there.  The -1.c and -2.c tests PASS now, however the rest
still FAILs: they lack "narrow comparison"... completely.

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #7 from Richard Biener  ---
(In reply to JuzheZhong from comment #5)
> (In reply to Richard Biener from comment #3)
> > it looks like you don't support vector short logical shift?  For some reason
> > vect_recog_over_widening_pattern doesn't check whether the demoted operation
> > is supported ...
> > 
> > The following helps on x86_64, it disables the demotion.  I think the idea
> > was that we eventually recognize a widening shift, so the narrow operation
> > itself doesn't need to be supported, but clearly that doesn't work out
> > when there is no such shift.
> > 
> > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> > index e4ab8c2d65b..4e4191652e3 100644
> > --- a/gcc/tree-vect-patterns.cc
> > +++ b/gcc/tree-vect-patterns.cc
> > @@ -3091,6 +3091,11 @@ vect_recog_over_widening_pattern (vec_info *vinfo,
> >if (!new_vectype || !op_vectype)
> >  return NULL;
> >  
> > +  optab optab;
> > +  if (!(optab = optab_for_tree_code (code, op_vectype, optab_vector))
> > +  || optab_handler (optab, TYPE_MODE (op_vectype)) == CODE_FOR_nothing)
> > +return NULL;
> > +
> >if (dump_enabled_p ())
> >  dump_printf_loc (MSG_NOTE, vect_location, "demoting %T to %T\n",
> >  type, new_type);
> > 
> > with the patch above x86 can vectorize both loops with AVX2 but not without.
> > 
> > Can you confirm this helps on RISC-V as well?
> > 
> > Richard, what was the idea here?
> 
> Hi, Richi.
> 
> I guess you mean "vector short logical shift" pattern is this:
> 
> (define_insn_and_split "v<optab><mode>3"
>   [(set (match_operand:VI 0 "register_operand"  "=vr,vr")
>     (any_shift:VI
>      (match_operand:VI 1 "register_operand" " vr,vr")
>      (match_operand:VI 2 "vector_shift_operand" " vr,vk")))]
>   "TARGET_VECTOR && can_create_pseudo_p ()"
>   "#"
>   "&& 1"
>   [(const_int 0)]
> {
>   riscv_vector::emit_vlmax_insn (code_for_pred (<CODE>, <MODE>mode),
>                                  riscv_vector::RVV_BINOP, operands);
>   DONE;
> }
>  [(set_attr "type" "vshift")
>   (set_attr "mode" "<MODE>")])
> 
> (define_code_iterator any_shift [ashift ashiftrt lshiftrt])
> 
> VI includes vector short.
> 
> I think RISCV port support vector short logical shift ?

The optab is vlshr_optab:

OPTAB_VC(vlshr_optab, "vlshr$a3", LSHIFTRT) 

your define_insn maybe produces the wrong names?

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #8 from Richard Biener  ---
(In reply to JuzheZhong from comment #6)
> (In reply to Richard Biener from comment #3)
> > it looks like you don't support vector short logical shift?  For some reason
> > vect_recog_over_widening_pattern doesn't check whether the demoted operation
> > is supported ...
> > 
> > The following helps on x86_64, it disables the demotion.  I think the idea
> > was that we eventually recognize a widening shift, so the narrow operation
> > itself doesn't need to be supported, but clearly that doesn't work out
> > when there is no such shift.
> > 
> > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> > index e4ab8c2d65b..4e4191652e3 100644
> > --- a/gcc/tree-vect-patterns.cc
> > +++ b/gcc/tree-vect-patterns.cc
> > @@ -3091,6 +3091,11 @@ vect_recog_over_widening_pattern (vec_info *vinfo,
> >if (!new_vectype || !op_vectype)
> >  return NULL;
> >  
> > +  optab optab;
> > +  if (!(optab = optab_for_tree_code (code, op_vectype, optab_vector))
> > +  || optab_handler (optab, TYPE_MODE (op_vectype)) == CODE_FOR_nothing)
> > +return NULL;
> > +
> >if (dump_enabled_p ())
> >  dump_printf_loc (MSG_NOTE, vect_location, "demoting %T to %T\n",
> >  type, new_type);
> > 
> > with the patch above x86 can vectorize both loops with AVX2 but not without.
> > 
> > Can you confirm this helps on RISC-V as well?
> > 
> > Richard, what was the idea here?
> 
> Hi, Richi.
> 
> I try this codes as you suggested:
>   optab optab;
>   if (!(optab = optab_for_tree_code (code, op_vectype, optab_vector))
>   || optab_handler (optab, TYPE_MODE (op_vectype)) == CODE_FOR_nothing)
> return NULL;
> 
> [jzzhong@server1:/work/home/jzzhong/work/insn]$~/work/rvv-opensource/output/
> gcc-rv64/bin/riscv64-rivai-elf-gcc -march=rv64gcv -O3
> --param=riscv-autovec-preference=scalable -S -fopt-info-vec-missed rvv.c
> rvv.c:14:1: missed: couldn't vectorize loop
> rvv.c:14:1: missed: not vectorized: no vectype for stmt: _4 = *_3;
>  scalar_type: uint16_t
> 
> 
> Still can not vectorize it.

Well, that means we do not have a vector mode for HImode elements?!

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #9 from JuzheZhong  ---
The name is correct, since the same pattern works for uint32_t but fails to
work for uint16_t.

I checked the build file:
CODE_FOR_vlshrrvvm1hi3 = 10350,


>> Well, that means we do not have a vector mode for HImode elements?!

We do have a vector mode for HImode elements.

In CODE_FOR_vlshrrvvm1hi3, "rvvm1hi" is the vector HImode.
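
As a small aid for checking names like this, the sketch below expands the `$a` placeholder of an optab name template with a mode name, which is how, to my understanding, "vlshr$a3" and the mode "rvvm1hi" end up as "vlshrrvvm1hi3". The helper `expand_optab_name` is hypothetical and written only for this note; it is not GCC's generator code and handles nothing beyond `$a`.

```c
#include <string.h>

/* Hypothetical helper: copy TMPL into OUT, replacing each "$a" with
   MODE.  Returns 0 on success, -1 if OUT (of size OUTSZ) is too small.  */
int
expand_optab_name (const char *tmpl, const char *mode,
                   char *out, size_t outsz)
{
  size_t n = 0;

  if (outsz == 0)
    return -1;

  for (const char *p = tmpl; *p;)
    {
      if (p[0] == '$' && p[1] == 'a')
        {
          size_t len = strlen (mode);
          if (n + len >= outsz)
            return -1;
          memcpy (out + n, mode, len);
          n += len;
          p += 2;
        }
      else
        {
          if (n + 1 >= outsz)
            return -1;
          out[n++] = *p++;
        }
    }

  out[n] = '\0';
  return 0;
}
```

Prefixing the expanded name with `CODE_FOR_` gives the enumerator to look for in the generated build files.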

Consider the following case:

#define TEST2_TYPE(TYPE)\
  __attribute__((noipa))\
  void vshiftr_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)  \
  { \
for (int i = 0; i < n; i++) \
  dst[i] = (a[i]) >> b[i];  \
  }

#define TEST_ALL()  \
 TEST2_TYPE(uint32_t)   \
 TEST2_TYPE(uint16_t)   \

TEST_ALL()

rvv.c:15:1: missed: statement clobbers memory: vect__4.9_52 = .MASK_LEN_LOAD (vectp_a.7_50, 32B, { -1, ... }, _65, 0);
rvv.c:15:1: missed: statement clobbers memory: vect__6.12_56 = .MASK_LEN_LOAD (vectp_b.10_54, 32B, { -1, ... }, _65, 0);
rvv.c:15:1: missed: statement clobbers memory: .MASK_LEN_STORE (vectp_dst.14_59, 32B, { -1, ... }, _65, 0, vect__8.13_57);

rvv.c:15:1: missed: couldn't vectorize loop
rvv.c:15:1: missed: not vectorized: no vectype for stmt: _4 = *_3;
 scalar_type: uint16_t


The uint32_t loop is vectorized but the uint16_t loop is not, even though we
have defined both vector SImode and HImode for the "vlshr$a3" optab.


It seems that we must support a widening shift pattern in the RISC-V port even
though we don't have widening shift instructions?

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #10 from rsandifo at gcc dot gnu.org ---
(In reply to JuzheZhong from comment #9)
> It seems that we must support a widening shift pattern in the RISC-V port
> even though we don't have widening shift instructions?
I doubt it.  Seems like one of those bugs where someone needs to walk through
what's happening in the code, rather than relying on the debug dumps.

[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297

2023-08-04 Thread stefansf at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869

--- Comment #14 from Stefan Schulze Frielinghaus  ---
For -3 and -4 I can confirm that we do not end up with a proper comparison
during combine which means we should just ignore these on Sparc.

I'm currently puzzled that -5 and -6 are actually processed on Sparc (32 or 64
bit) at all.  Shouldn't this:

/* { dg-do compile { target { lp64 } && ! target { sparc*-*-* } } } */

prevent this?

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #11 from JuzheZhong  ---
I debugged vectorizable_shift:


Breakpoint 1, vectorizable_shift (vinfo=0x3fb45d0, stmt_info=0x3fb5ea0,
gsi=0x0, vec_stmt=0x0, slp_node=0x0, cost_vec=0x7fffc648) at
../../../riscv-gcc/gcc/tree-vect-stmts.cc:6028
6028  scalar_dest = gimple_assign_lhs (stmt);
(gdb) n
6029  vectype_out = STMT_VINFO_VECTYPE (stmt_info);
(gdb) p scalar_dest->typed.type->type_common.mode
$7 = E_HImode
(gdb) call print_gimple_stmts(stdout,stmt,0,0)
No symbol "print_gimple_stmts" in current context.
(gdb) call print_gimple_stmt(stdout,stmt,0,0)
patt_33 = _4 >> patt_34;


This is odd: we are supposed to vectorize the following code from the ifcvt dump:
  _5 = (int) _4;
  _8 = (int) _7;
  _9 = _5 >> _8;

You can see "_9 = _5 >> _8;". We should vectorize in SImode instead of HImode.
The correct flow should be to first extend HImode to SImode, then vectorize the
logical shift right in SImode, and finally truncate SImode back to HImode.

Am I right? When I debug tree-vect-stmts.cc, the vectorization flow doesn't
work the way we want.

Thanks.

[Bug ada/110898] compilation of adacl-assert-integer.ads failed

2023-08-04 Thread dkm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110898

--- Comment #1 from Marc Poulhiès  ---
I get the following error when compiling the adacl-assert-integer.ads file:

```
src/adacl-assert-integer.ads:21:10: warning: unit "GNAT.Source_Info" is not
referenced [-gnatwu]
src/adacl-assert-integer.ads:25:34: (style) trailing spaces not permitted
[-gnatyb]
src/adacl-assert-integer.ads:31:01: error: child of a generic package must be a
generic unit
```

I've checked and I also get the same errors with gcc 11.x, so that's not
something new. I think your code should be fixed here.

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #12 from rsandifo at gcc dot gnu.org ---
(In reply to JuzheZhong from comment #11)
> You can see "_9 = _5 >> _8;". We should vectorize in SImode instead of HImode.
> The correct flow should be to first extend HImode to SImode, then vectorize
> the logical shift right in SImode, and finally truncate SImode back to HImode.
The point of vect_recog_over_widening_pattern is to avoid the extension and
truncation.  So this is working as expected.  The question is why doing the
optimisation prevents vectorisation, given that the target apparently provides
HImode shifts right.

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #13 from JuzheZhong  ---
I just checked: ARM SVE has the same behavior as RISC-V:

https://godbolt.org/z/vY6ecY6Mx

See the Compiler Explorer link above. ARM trunk GCC with SVE fails to vectorize
it too, the same as RISC-V, whereas ARM GCC 13.1 can vectorize it.

[Bug tree-optimization/106293] [13 regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2023-08-04 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293

Jan Hubicka  changed:

   What|Removed |Added

Summary|[13/14 Regression]  |[13 regression] 456.hmmer
   |456.hmmer at -Ofast |at -Ofast -march=native
   |-march=native regressed by  |regressed by 19% on zen2
   |19% on zen2 and zen3 in |and zen3 in July 2022
   |July 2022   |

--- Comment #26 from Jan Hubicka  ---
We are finally out of the regression, but there are still several things to fix.
 1) vectorizer produces corrupt profile
 2) loop-split is not able to work out that it splits last iteration
 3) we work way too hard optimizing loops iterating 0 times.

The loop in question really iterates zero times.  It is created by loop split
from the internal loop:

for (k = 1; k <= M; k++) {
  mc[k] = mpp[k-1]   + tpmm[k-1];
  if ((sc = ip[k-1]  + tpim[k-1]) > mc[k])  mc[k] = sc;
  if ((sc = dpp[k-1] + tpdm[k-1]) > mc[k])  mc[k] = sc;
  if ((sc = xmb  + bp[k]) > mc[k])  mc[k] = sc;
  mc[k] += ms[k];
  if (mc[k] < -INFTY) mc[k] = -INFTY;

  dc[k] = dc[k-1] + tpdd[k-1];
  if ((sc = mc[k-1] + tpmd[k-1]) > dc[k]) dc[k] = sc;
  if (dc[k] < -INFTY) dc[k] = -INFTY;

  if (k < M) {
    ic[k] = mpp[k] + tpmi[k];
    if ((sc = ip[k] + tpii[k]) > ic[k]) ic[k] = sc;
    ic[k] += is[k];
    if (ic[k] < -INFTY) ic[k] = -INFTY;
  }
}

It peels off the last iteration. The loop condition is
 if (k <= M)
while we split on
 if (k < M)
M is variable, and nothing seems to be able to optimize out the second loop
after splitting.
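
The situation can be sketched in plain C. This is only an illustration: the
function names and the reduced loop body are invented here, and only the
relation between the loop bound `k <= M` and the split condition `k < M` is
taken from the comment above.

```c
/* Original shape: the guard `k < M` is false only on the last iteration.  */
void
loop_original (int *mc, int *ic, const int *src, int M)
{
  for (int k = 1; k <= M; k++)
    {
      mc[k] = src[k - 1] + 1;
      if (k < M)
        ic[k] = src[k] + 2;
    }
}

/* Hand-written equivalent of splitting on `k < M`.  The first loop runs
   while the guard is true; the second loop handles the remainder.
   Returns the number of times the second loop body ran.  */
int
loop_split (int *mc, int *ic, const int *src, int M)
{
  int tail_iterations = 0;
  int k = 1;

  for (; k < M; k++)
    {
      mc[k] = src[k - 1] + 1;
      ic[k] = src[k] + 2;
    }

  /* Only k == M can reach this loop, so its body runs at most once and
     its back edge is never taken.  */
  for (; k <= M; k++)
    {
      mc[k] = src[k - 1] + 1;
      tail_iterations++;
    }

  return tail_iterations;
}
```

Because M is not a compile-time constant, proving that the second loop runs at
most once requires relating the two conditions, which is the pattern match
mentioned below.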

My plan is to add the pattern match so loop split gets this right and records
upper bound on iteration count, but first want to show other bugs exposed by
this scenario.

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #14 from JuzheZhong  ---
(In reply to rsand...@gcc.gnu.org from comment #12)
> (In reply to JuzheZhong from comment #11)
> > You can see "_9 = _5 >> _8;". We should vectorize SImode instead of HImode.
> > The correct flow should be first extend HI -> SImode, then vectorize
> > logical shift right for SImode, and finally truncate SImode to HImode.
> The point of vect_recog_over_widening_pattern is to avoid the extension and
> truncation.  So this is working as expected.  The question is why doing the
> optimisation prevents vectorisation, given that the target apparently
> provides HImode shifts right.

Oh, thanks Richard.

After deeper analysis, I found that this code makes it fail:

  incompatible_op1_vectype_p
    = (op1_vectype == NULL_TREE
       || maybe_ne (TYPE_VECTOR_SUBPARTS (op1_vectype),
                    TYPE_VECTOR_SUBPARTS (vectype))
       || TYPE_MODE (op1_vectype) != TYPE_MODE (vectype));
  if (incompatible_op1_vectype_p
      && (!slp_node
          || SLP_TREE_DEF_TYPE (slp_op1) != vect_constant_def
          || slp_op1->refcnt != 1))
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "unusable type for last operand in"
                         " vector/vector shift/rotate.\n");
      return false;
    }

incompatible_op1_vectype_p is true.

It becomes true because op1_vectype has a different NUNITS from vectype.

They have different NUNITS because:

op1_vectype = get_vectype_for_scalar_type = RVVM1SImode.
vectype = STMT_VINFO_VECTYPE (stmt_info) = RVVMF2SImode.

This mismatch is what makes vectorization fail.

To make this easier to follow in ARM SVE terms, I believe that for ARM SVE:

op1_vectype = get_vectype_for_scalar_type = VNx4SImode.
vectype = STMT_VINFO_VECTYPE (stmt_info) = VNx2SImode.

So ARM SVE fails as well.

When that commit is reverted, they are the same (both are RVVM1SImode for RISC-V
or VNx4SImode for ARM SVE).

Could you tell me how to fix that?

Thanks.

[Bug middle-end/110857] aarch64-linux-gnu profiledbootstrap broken

2023-08-04 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110857

Jan Hubicka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2023-08-04
 Ever confirmed|0   |1

--- Comment #4 from Jan Hubicka  ---
I hope the fix for x86_64 also cures the arm profiledbootstrap. From the
backtrace it is the same bug.

[Bug tree-optimization/110838] [14 Regression] wrong code on x365-3.5, -O3, sign extraction

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110838

--- Comment #10 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:04aa0edcace22a7815cfc57575f1f7b1f166ac10

commit r14-2985-g04aa0edcace22a7815cfc57575f1f7b1f166ac10
Author: Richard Biener 
Date:   Fri Aug 4 11:24:49 2023 +0200

tree-optimization/110838 - less aggressively fold out-of-bound shifts

The following adjusts the shift simplification patterns to avoid
touching out-of-bound shift value arithmetic right shifts of
possibly negative values.  While simplifying those to zero isn't
wrong it's violating the principle of least surprise.

PR tree-optimization/110838
* match.pd (([rl]shift @0 out-of-bounds) -> zero): Restrict
the arithmetic right-shift case to non-negative operands.

[Bug middle-end/110857] aarch64-linux-gnu profiledbootstrap broken

2023-08-04 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110857

--- Comment #5 from prathamesh3492 at gcc dot gnu.org ---
Hi Honza,
Sorry for the late response, and thanks for the fix! I am currently running
profiledbootstrap on aarch64 with your fix, and will let you know the results
after it completes.

Thanks,
Prathamesh

[Bug middle-end/110316] [11/12/13/14 Regression] g++.dg/ext/timevar1.C and timevar2.C fail erratically

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110316

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Matthew Malcomson :

https://gcc.gnu.org/g:0782b01c9ea43d43648071faa9c65a101f5068a2

commit r14-2986-g0782b01c9ea43d43648071faa9c65a101f5068a2
Author: Matthew Malcomson 
Date:   Fri Aug 4 11:26:47 2023 +0100

mid-end: Use integral time intervals in timevar.cc

On some AArch64 bootstrapped builds, we were getting a flaky test
because the floating point operations in `get_time` were being fused
with the floating point operations in `timevar_accumulate`.

This meant that the rounding behaviour of our multiplication with
`ticks_to_msec` was different when used in `timer::start` and when
performed in `timer::stop`.  These extra inaccuracies led to the
testcase `g++.dg/ext/timevar1.C` being flaky on some hardware.

--
Avoiding the inlining which was agreed to be undesirable.  Three
alternative approaches:
1) Use `-ffp-contract=on` to avoid this particular optimisation.
2) Adjusting the code so that the "tolerance" is always of the order of
   a "tick".
3) Recording times and elapsed differences in integral values.
   - Could be in terms of a standard measurement (e.g. nanoseconds or
 microseconds).
   - Could be in terms of whatever integral value ("ticks" /
 seconds&microseconds / "clock ticks") is returned from the syscall
 chosen at configure time.

While `-ffp-contract=on` removes the problem that I bumped into, there
has been a similar bug on x86 that was to do with a different floating
point problem that also happens after `get_time` and
`timevar_accumulate` both being inlined into the same function.  Hence
it seems worth choosing a different approach.

Of the two other solutions, recording measurements in integral values
seems the most robust against slightly "off" measurements being
presented to the user -- even though it could avoid the ICE that creates
a flaky test.

I considered storing time in whatever units our syscall returns and
normalising them at the time we print out rather than normalising them
to nanoseconds at the point we record our "current time".  The logic
being that normalisation could have some rounding effect (e.g. if
TICKS_PER_SECOND is 3) that would be taken into account in calculations.

I decided against it in order to give the values recorded in
`timevar_time_def` some interpretive value so it's easier to read the
code.  Compared to the small rounding that would represent a tiny amount
of time and AIUI can not trigger the same kind of ICE's as we are
attempting to fix, said interpretive value seems more valuable.
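
The rounding effect mentioned above can be sketched in a few lines of C. This
is an illustration only, not the code in timevar.cc: the function names are
invented here, and the tick rate of 3 is the hypothetical value from the
paragraph above.

```c
#include <stdint.h>

#define NANOSEC_PER_SEC UINT64_C(1000000000)

/* Convert a tick count to nanoseconds with integer arithmetic only.
   Assumes ticks * NANOSEC_PER_SEC does not overflow 64 bits.  */
uint64_t
ticks_to_nanosec (uint64_t ticks, uint64_t ticks_per_second)
{
  return ticks * NANOSEC_PER_SEC / ticks_per_second;
}

/* Normalise each sample when it is recorded, then subtract.  */
uint64_t
elapsed_normalise_early (uint64_t start, uint64_t stop, uint64_t tps)
{
  return ticks_to_nanosec (stop, tps) - ticks_to_nanosec (start, tps);
}

/* Subtract in native ticks, then normalise once.  */
uint64_t
elapsed_normalise_late (uint64_t start, uint64_t stop, uint64_t tps)
{
  return ticks_to_nanosec (stop - start, tps);
}
```

With a tick rate of 3, an interval from tick 1 to tick 3 comes out as
666666667 ns when normalising early and 666666666 ns when normalising late.
Both results are deterministic integers that differ by at most one unit, which
is the small rounding being traded for readable values.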

Recording time in microseconds seemed reasonable since all obvious
values for ticks and `getrusage` are at microsecond granularity or less
precise.  That said, since TICKS_PER_SECOND and CLOCKS_PER_SEC are both
variables given to us by the host system I was not sure of that enough
to make this decision.

--
timer::all_zero is ignoring rows which are inconsequential to the user
and would be printed out as all zeros.  Since upon printing rows we
convert to the same double value and print out the same precision as
before, we return true/false based on the same amount of time as before.

timer::print_row casts to a floating point measurement in units of
seconds as was printed out before.

timer::validate_phases -- I'm printing out nanoseconds here rather than
floating point seconds since this is an error message for when things
have "gone wrong" printing out the actual nanoseconds that have been
recorded seems like the best approach.
N.b. since we now print out nanoseconds instead of floating point value
the padding requirements are different.  Originally we were padding to
24 characters and printing 18 decimal places.  This looked odd with the
now visually smaller values getting printed.  I judged 13 characters
(corresponding to 2 hours) to be a reasonable point at which our
alignment could start to degrade and this provides a more compact output
for the majority of cases (checked by triggering the error case via
GDB).

--
N.b. I use a literal 1000000000 for "NANOSEC_PER_SEC".  I believe this
would fit in an integer on all hosts that GCC supports, but am not
certain there are not strange integer sizes we support hence am pointing
it out for special attention during review.

--
No expected change in generated code.
Bootstrapped and regtested on AArch64 with no regressions.

Hope this is acceptable -- I had originally planned to use
`-ffp-contract` as agreed until I saw mention of the old x86 bug in the
same area which was not to do with flo

[Bug c/9903] [3.2 regression] ICE for legal code

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9903

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Matthew Malcomson :

https://gcc.gnu.org/g:0782b01c9ea43d43648071faa9c65a101f5068a2

commit r14-2986-g0782b01c9ea43d43648071faa9c65a101f5068a2
Author: Matthew Malcomson 
Date:   Fri Aug 4 11:26:47 2023 +0100

mid-end: Use integral time intervals in timevar.cc

On some AArch64 bootstrapped builds, we were getting a flaky test
because the floating point operations in `get_time` were being fused
with the floating point operations in `timevar_accumulate`.

This meant that the rounding behaviour of our multiplication with
`ticks_to_msec` was different when used in `timer::start` and when
performed in `timer::stop`.  These extra inaccuracies led to the
testcase `g++.dg/ext/timevar1.C` being flaky on some hardware.

--
Avoiding the inlining which was agreed to be undesirable.  Three
alternative approaches:
1) Use `-ffp-contract=on` to avoid this particular optimisation.
2) Adjusting the code so that the "tolerance" is always of the order of
   a "tick".
3) Recording times and elapsed differences in integral values.
   - Could be in terms of a standard measurement (e.g. nanoseconds or
 microseconds).
   - Could be in terms of whatever integral value ("ticks" /
 seconds&microseconds / "clock ticks") is returned from the syscall
 chosen at configure time.

While `-ffp-contract=on` removes the problem that I bumped into, there
has been a similar bug on x86 that was to do with a different floating
point problem that also happens after `get_time` and
`timevar_accumulate` both being inlined into the same function.  Hence
it seems worth choosing a different approach.

Of the two other solutions, recording measurements in integral values
seems the most robust against slightly "off" measurements being
presented to the user -- even though it could avoid the ICE that creates
a flaky test.

I considered storing time in whatever units our syscall returns and
normalising them at the time we print out rather than normalising them
to nanoseconds at the point we record our "current time".  The logic
being that normalisation could have some rounding effect (e.g. if
TICKS_PER_SECOND is 3) that would be taken into account in calculations.

I decided against it in order to give the values recorded in
`timevar_time_def` some interpretive value so it's easier to read the
code.  Compared to the small rounding that would represent a tiny amount
of time and AIUI can not trigger the same kind of ICE's as we are
attempting to fix, said interpretive value seems more valuable.

Recording time in microseconds seemed reasonable since all obvious
values for ticks and `getrusage` are at microsecond granularity or less
precise.  That said, since TICKS_PER_SECOND and CLOCKS_PER_SEC are both
variables given to us by the host system I was not sure of that enough
to make this decision.

--
timer::all_zero is ignoring rows which are inconsequential to the user
and would be printed out as all zeros.  Since upon printing rows we
convert to the same double value and print out the same precision as
before, we return true/false based on the same amount of time as before.

timer::print_row casts to a floating point measurement in units of
seconds as was printed out before.

timer::validate_phases -- I'm printing out nanoseconds here rather than
floating point seconds since this is an error message for when things
have "gone wrong" printing out the actual nanoseconds that have been
recorded seems like the best approach.
N.b. since we now print out nanoseconds instead of floating point value
the padding requirements are different.  Originally we were padding to
24 characters and printing 18 decimal places.  This looked odd with the
now visually smaller values getting printed.  I judged 13 characters
(corresponding to 2 hours) to be a reasonable point at which our
alignment could start to degrade and this provides a more compact output
for the majority of cases (checked by triggering the error case via
GDB).

--
N.b. I use a literal 1000000000 for "NANOSEC_PER_SEC".  I believe this
would fit in an integer on all hosts that GCC supports, but am not
certain there are not strange integer sizes we support hence am pointing
it out for special attention during review.

--
No expected change in generated code.
Bootstrapped and regtested on AArch64 with no regressions.

Hope this is acceptable -- I had originally planned to use
`-ffp-contract` as agreed until I saw mention of the old x86 bug in the
same area which was not to do with float

[Bug tree-optimization/110838] [14 Regression] wrong code on x365-3.5, -O3, sign extraction

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110838

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:1a599caab86464006ea8c9501aff6c6638e891eb

commit r14-2987-g1a599caab86464006ea8c9501aff6c6638e891eb
Author: Richard Biener 
Date:   Fri Aug 4 12:11:45 2023 +0200

tree-optimization/110838 - vectorization of widened right shifts

The following fixes a problem with my last attempt at avoiding
out-of-bound shift values for vectorized right shifts of widened
operands.  Instead of truncating the shift amount with a bitwise
AND, we actually need to saturate it to the target precision.

The following does that and adds test coverage for the constant
and invariant but variable case that would previously have failed.
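
A scalar C sketch of why saturating is needed rather than masking. This is not
the vectorizer code: the function names are invented, the element type is
assumed to be a signed 16-bit value widened to 32 bits, the shift amount is
assumed to be in [0, 31], and right shifts of negative values are assumed to
be arithmetic (as GCC implements them).

```c
#include <stdint.h>

/* Reference: the shift as written in the source, done in the wide type.  */
int16_t
shift_widened (int16_t x, unsigned s)
{
  return (int16_t) ((int32_t) x >> s);
}

/* Narrow form with a saturated amount.  C promotes x to int anyway, so the
   16-bit operation is modelled by keeping the amount below 16.  Amounts of
   16 or more only shift in copies of the sign bit, exactly like 15.  */
int16_t
shift_narrow_saturated (int16_t x, unsigned s)
{
  unsigned amount = s < 15 ? s : 15;
  return (int16_t) (x >> amount);
}

/* Narrow form with a masked amount: wraps 16 to 0, 17 to 1, and so on,
   which changes the result.  */
int16_t
shift_narrow_masked (int16_t x, unsigned s)
{
  return (int16_t) (x >> (s & 15));
}
```

For example, with x = 1234 and s = 20 the widened shift gives 0, the saturated
form also gives 0, and the masked form gives 1234 >> 4 = 77.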

PR tree-optimization/110838
* tree-vect-patterns.cc (vect_recog_over_widening_pattern):
Fix right-shift value sanitizing.  Properly emit external
def mangling in the preheader rather than in the pattern
def sequence where it will fail vectorizing.

* gcc.dg/vect/pr110838.c: New testcase.

[Bug target/110066] [13 Regression] [RISC-V] Segment fault if compiled with -static -pg

2023-08-04 Thread aurelien at aurel32 dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110066

--- Comment #25 from Aurelien Jarno  ---
(In reply to Andrew Pinski from comment #23)
> Fixed on the trunk will backport to GCC 13 after 13.2.0 is released (since
> the branch is frozen except for RM approvals).

Now that GCC 13.2.0 has been released, would it be possible to backport the
fix, please?

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #15 from Richard Biener  ---
Well, the question is why we arrive here with the two different vector types.
Can you tell me a relevant cc1 compiler command like for a x86->riscv cross
that exposes the issue?

[Bug sanitizer/81981] [8 Regression] -fsanitize=undefined makes a -Wmaybe-uninitialized warning disappear

2023-08-04 Thread vincent-gcc at vinc17 dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81981

--- Comment #9 from Vincent Lefèvre  ---
Note, however, that there is a small regression in GCC 11: the warning for t is
output as expected, but if -fsanitize=undefined is given, the message for t is
suboptimal, saying "*&t[0]" instead of "t[0]":

zira:~> gcc-11 -Wmaybe-uninitialized -O2 -c tst.c -fsanitize=undefined
tst.c: In function ‘foo’:
tst.c:12:15: warning: ‘*&t[0]’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
   12 |   return t[0] + u[0];
  |  ~^~
tst.c:12:15: warning: ‘u[0]’ may be used uninitialized in this function
[-Wmaybe-uninitialized]

No such issue without -fsanitize=undefined:

zira:~> gcc-11 -Wmaybe-uninitialized -O2 -c tst.c
tst.c: In function ‘foo’:
tst.c:12:15: warning: ‘u[0]’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
   12 |   return t[0] + u[0];
  |  ~^~
tst.c:12:15: warning: ‘t[0]’ may be used uninitialized in this function
[-Wmaybe-uninitialized]

It is impossible to say whether this is fixed in GCC 12 and later, because of
PR 110896, i.e. the warning is always missing.

[Bug modula2/110779] SysClock can not read the clock

2023-08-04 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110779

Gaius Mulley  changed:

   What|Removed |Added

  Attachment #55683|0   |1
is obsolete||

--- Comment #3 from Gaius Mulley  ---
Created attachment 55687
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55687&action=edit
Proposed fix v2

The previous patch was missing some new files.  This has successfully
bootstrapped on x86_64 and aarch64.  I'd like to see it bootstrap on ppc64le,
x86_32 and armv7l before it is git committed (as the libgm2 automake
{Makefile.in, configure, config.h} have been regenerated).

[Bug c++/110848] Consider enabling -Wvla by default in non-GNU C++ modes

2023-08-04 Thread aaron at aaronballman dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110848

--- Comment #12 from Aaron Ballman  ---
(In reply to Eric Gallager from comment #11)
> How about:
> 
> -std=c++XY: enabled by default (as per the proposal)
> -std=gnu++XY: enabled by -Wall and/or -Wextra (in addition to being enabled
> by -pedantic like it already is)

That's a good suggestion -- I'd be quite happy with adding it to -Wall (or
barring that, -Wextra) in GNU++ modes.

[Bug target/106346] [11/12/13/14 Regression] Potential regression on vectorization of left shift with constants since r11-5160-g9fc9573f9a5e94

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106346

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:451391a6477f5b012faeca42cdba1bfb8e6eecc0

commit r14-2991-g451391a6477f5b012faeca42cdba1bfb8e6eecc0
Author: Tamar Christina 
Date:   Fri Aug 4 13:49:23 2023 +0100

AArch64: Undo vec_widen_shiftl optabs [PR106346]

In GCC 11 we implemented the vectorizer optab for widening left shifts,
however this optab is only supported for uniform shift constants.

At the moment GCC still has two loop vectorization strategies (classical loop
and SLP based loop vec) and the optab is implemented as a scalar pattern.

This means that when we apply it to a non-uniform constant inside a loop we
only find out during SLP build that the constants aren't uniform.  At this
point it's too late and we lose SLP entirely.

Over the years I've tried various options but none of them works well:

1. Dissolving patterns during SLP build (problematic, also dissolves them
for non-SLP).
2. Optionally ignoring patterns for SLP build (problematic, ends up
interfering with relevancy detection).
3. Relaxing constraint on SLP build to allow non-constant values and
dissolving them after SLP build using an SLP pattern.  (problematic, ends up
breaking shift reassociation).

As a result we've concluded that for now this pattern should just be removed
and formed during RTL.

The plan is to move this to an SLP only pattern once we remove classical loop
vectorization support from GCC, at which time we can also properly support
SVE's Top and Bottom variants.

This removes the optab and reworks the RTL to recognize both the vector
variant and the intrinsics variant.  Also just simplifies all these patterns.

gcc/ChangeLog:

PR target/106346
* config/aarch64/aarch64-simd.md (vec_widen_shiftl_lo_,
vec_widen_shiftl_hi_): Remove.
(aarch64_shll_internal): Renamed to...
(aarch64_shll): .. This.
(aarch64_shll2_internal): Renamed to...
(aarch64_shll2): .. This.
(aarch64_shll_n, aarch64_shll2_n): Re-use new
optabs.
* config/aarch64/constraints.md (D2, DL): New.
* config/aarch64/predicates.md (aarch64_simd_shll_imm_vec): New.

gcc/testsuite/ChangeLog:

PR target/106346
* gcc.target/aarch64/pr98772.c: Adjust assembly.
* gcc.target/aarch64/vect-widen-shift.c: New test.

[Bug target/110899] New: RFE: Attributes preserve_most and preserve_all

2023-08-04 Thread elver at google dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899

Bug ID: 110899
   Summary: RFE: Attributes preserve_most and preserve_all
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: elver at google dot com
  Target Milestone: ---

Clang/LLVM implements the function attributes "preserve_most" and
"preserve_all":

[1] preserve_most: "On X86-64 and AArch64 targets, this attribute changes the
calling convention of a function. The preserve_most calling convention attempts
to make the code in the caller as unintrusive as possible. This convention
behaves identically to the C calling convention on how arguments and return
values are passed, but it uses a different set of caller/callee-saved
registers. This alleviates the burden of saving and recovering a large register
set before and after the call in the caller. If the arguments are passed in
callee-saved registers, then they will be preserved by the callee across the
call. This doesn’t apply for values returned in callee-saved registers.

- On X86-64 the callee preserves all general purpose registers, except for R11.
R11 can be used as a scratch register. Floating-point registers (XMMs/YMMs) are
not preserved and need to be saved by the caller.

- On AArch64 the callee preserve all general purpose registers, except X0-X8
and X16-X18."

[2] preserve_all: "On X86-64 and AArch64 targets, this attribute changes the
calling convention of a function. The preserve_all calling convention attempts
to make the code in the caller even less intrusive than the preserve_most
calling convention. This calling convention also behaves identical to the C
calling convention on how arguments and return values are passed, but it uses a
different set of caller/callee-saved registers. This removes the burden of
saving and recovering a large register set before and after the call in the
caller. If the arguments are passed in callee-saved registers, then they will
be preserved by the callee across the call. This doesn’t apply for values
returned in callee-saved registers.

- On X86-64 the callee preserves all general purpose registers, except for R11.
R11 can be used as a scratch register. Furthermore it also preserves all
floating-point registers (XMMs/YMMs).

- On AArch64 the callee preserve all general purpose registers, except X0-X8
and X16-X18. Furthermore it also preserves lower 128 bits of V8-V31 SIMD -
floating point registers."

[1] https://clang.llvm.org/docs/AttributeReference.html#preserve-most
[2] https://clang.llvm.org/docs/AttributeReference.html#preserve-all


These attributes, esp. preserve_most, provide a convenient way to optimize the
generated code for calls to rarely taken slow paths, such as error-reporting
functions. Recently, we're looking to make use of this in the Linux kernel [3],
with potentially additional use cases being discussed.
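
A minimal sketch of the intended usage, assuming a compiler that provides
`__has_attribute` and GNU-style attributes. The macro `SLOW_PATH` and the
functions `report_error` and `checked_div` are hypothetical names for this
example and are not taken from the kernel patch. Where preserve_most is not
available, the macro falls back to plain noinline.

```c
/* Use preserve_most only where the compiler reports support for it.  */
#if defined (__has_attribute)
# if __has_attribute (preserve_most)
#  define SLOW_PATH __attribute__ ((preserve_most, noinline))
# endif
#endif
#ifndef SLOW_PATH
# define SLOW_PATH __attribute__ ((noinline))
#endif

int error_count;

/* Rarely called error path.  With preserve_most the callee saves most
   registers, so the caller's hot path does not need to spill around
   the call.  */
SLOW_PATH void
report_error (int code)
{
  error_count += code != 0;
}

int
checked_div (int a, int b)
{
  if (__builtin_expect (b == 0, 0))
    {
      report_error (1);
      return 0;
    }
  return a / b;
}
```

The observable behaviour is the same with or without the attribute; only the
register save and restore work moves from the caller into the slow path.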

[3] https://lkml.kernel.org/r/20230804090621.400-1-el...@google.com

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2023-08-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 106346, which changed state.

Bug 106346 Summary: [11/12/13/14 Regression] Potential regression on 
vectorization of left shift with constants since r11-5160-g9fc9573f9a5e94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106346

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug target/106346] [11/12/13/14 Regression] Potential regression on vectorization of left shift with constants since r11-5160-g9fc9573f9a5e94

2023-08-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106346

Tamar Christina  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #9 from Tamar Christina  ---
Fixed in GCC 14.

[Bug c/108986] [11 Regression] Incorrect warning for [static] array parameter

2023-08-04 Thread muecker at gwdg dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108986

Martin Uecker  changed:

   What|Removed |Added

 CC||muecker at gwdg dot de

--- Comment #10 from Martin Uecker  ---
PATCH: https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625559.html

[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297

2023-08-04 Thread stefansf at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869

--- Comment #15 from Stefan Schulze Frielinghaus  ---
Created attachment 55688
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55688&action=edit
Increase optimization and skip sparc for 4-6

[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297

2023-08-04 Thread stefansf at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869

--- Comment #16 from Stefan Schulze Frielinghaus  ---
Turns out that my dejagnu foo is weak ;-) I came up with a wrong target
selector. Should be fixed in the new attachment.

[Bug target/110899] RFE: Attributes preserve_most and preserve_all

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/110897] RISC-V: Fail to vectorize shift

2023-08-04 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #16 from JuzheZhong  ---
(In reply to Richard Biener from comment #15)
> Well, the question is why we arrive here with the two different vector types.
> Can you tell me a relevant cc1 compiler command like for a x86->riscv cross
> that exposes the issue?

Thanks for taking care of this issue.

The RISC-V cc1 command:

cc1 -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=scalable

For ARM SVE:

-march=armv8-a+sve -O3

This issue is exposed in both RISC-V and ARM.

code:

#include <stdint.h>

#define TEST2_TYPE(TYPE)                                        \
  __attribute__((noipa))                                        \
  void vshiftr_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)      \
  {                                                             \
    for (int i = 0; i < n; i++)                                 \
      dst[i] = (a[i]) >> b[i];                                  \
  }

#define TEST_ALL()                                              \
 TEST2_TYPE(uint16_t)                                           \

TEST_ALL()

[Bug middle-end/88873] missing vectorization for decomposed operations on a vector type

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88873

--- Comment #10 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:faa2202ee7fcf039b2016ce5766a2927526c5f78

commit r14-2997-gfaa2202ee7fcf039b2016ce5766a2927526c5f78
Author: Roger Sayle 
Date:   Fri Aug 4 16:23:38 2023 +0100

i386: Split SUBREGs of SSE vector registers into vec_select insns.

This patch is the final piece in the series to improve the ABI issues
affecting PR 88873.  The previous patches tackled inserting DFmode
values into V2DFmode registers, by introducing insvti_{low,high}part
patterns.  This patch improves the extraction of DFmode values from
V2DFmode registers via TImode intermediates.

I'd initially thought this would require new extvti_{low,high}part
patterns to be defined, but all that's required is to recognize that
the SUBREG idioms produced by combine are equivalent to (forms of)
vec_select patterns.  The target-independent middle-end can't be sure
that the appropriate vec_select instruction exists on the target,
hence doesn't canonicalize a SUBREG of a vector mode as a vec_select,
but the backend can provide a define_split stating where and when
this is useful, for example, considering whether the operand is in
memory, or whether !TARGET_SSE_MATH and the destination is i387.

For pr88873.c, gcc -O2 -march=cascadelake currently generates:

foo:    vpunpcklqdq     %xmm3, %xmm2, %xmm7
        vpunpcklqdq     %xmm1, %xmm0, %xmm6
        vpunpcklqdq     %xmm5, %xmm4, %xmm2
        vmovdqa         %xmm7, -24(%rsp)
        vmovdqa         %xmm6, %xmm1
        movq            -16(%rsp), %rax
        vpinsrq         $1, %rax, %xmm7, %xmm4
        vmovapd         %xmm4, %xmm6
        vfmadd132pd     %xmm1, %xmm2, %xmm6
        vmovapd         %xmm6, -24(%rsp)
        vmovsd          -16(%rsp), %xmm1
        vmovsd          -24(%rsp), %xmm0
        ret

with this patch, we now generate:

foo:    vpunpcklqdq     %xmm1, %xmm0, %xmm6
        vpunpcklqdq     %xmm3, %xmm2, %xmm7
        vpunpcklqdq     %xmm5, %xmm4, %xmm2
        vmovdqa         %xmm6, %xmm1
        vfmadd132pd     %xmm7, %xmm2, %xmm1
        vmovsd          %xmm1, %xmm1, %xmm0
        vunpckhpd       %xmm1, %xmm1, %xmm1
        ret

The improvement is even more dramatic when compared to the original
29 instructions shown in comment #8.  GCC 13, for example, required
12 transfers to/from memory.

2023-08-04  Roger Sayle  

gcc/ChangeLog
* config/i386/sse.md (define_split): Convert highpart:DF extract
from V2DFmode register into a sse2_storehpd instruction.
(define_split): Likewise, convert lowpart:DF extract from V2DF
register into a sse2_storelpd instruction.

gcc/testsuite/ChangeLog
* gcc.target/i386/pr88873.c: Tweak to check for improved code.

[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

--- Comment #15 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:c572f09a751cbd365e2285b30527de5ab9025972

commit r14-2998-gc572f09a751cbd365e2285b30527de5ab9025972
Author: Roger Sayle 
Date:   Fri Aug 4 16:26:06 2023 +0100

Specify signed/unsigned/dontcare in calls to extract_bit_field_1.

This patch is inspired by Jakub's work on PR rtl-optimization/110717.
The bitfield example described in comment #2, looks like:

struct S { __int128 a : 69; };
unsigned __int128 bar (struct S *p) {
  return p->a;
}

which on x86_64 with -O2 currently generates:

bar:    movzbl  8(%rdi), %ecx
        movq    (%rdi), %rax
        andl    $31, %ecx
        movq    %rcx, %rdx
        salq    $59, %rdx
        sarq    $59, %rdx
        ret

The ANDL $31 is interesting: we first extract an unsigned 69-bit bitfield
by masking/clearing the top bits of the most significant word, and then
it gets sign-extended, by left shifting and arithmetic right shifting.
This bit-wise AND is redundant: for signed bit-fields, we don't require
these bits to be cleared if we're about to set them appropriately.

This patch eliminates this redundancy in the middle-end, during RTL
expansion, by extending the extract_bit_field APIs so that the integer
UNSIGNEDP argument takes a special value; 0 indicates the field should
be sign extended, 1 (or any other non-zero value except -1) indicates the
field should be zero extended, but -1 indicates a third option, that we
don't care how or whether the field is extended.  By passing and checking
this sentinel value at the appropriate places we avoid the useless bit
masking (on all targets).
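
To see why the AND is redundant, here is a minimal C sketch.  The helper
names are made up for illustration and are not taken from expmed.cc.  It
assumes GCC's documented behaviour for the implementation-defined
unsigned-to-signed conversion (wrap-around) and for right shifts of negative
values (arithmetic shift).  The left shift already discards every bit above
the field, so clearing those bits first cannot change the result:

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of x (1 <= bits <= 63) by shifting
   left, then shifting right arithmetically.  Relies on GCC's behaviour
   for the conversion to int64_t and for >> on negative values. */
static int64_t sext_low_bits(uint64_t x, unsigned bits)
{
    unsigned shift = 64 - bits;
    return (int64_t)(x << shift) >> shift;
}

/* The same operation, but with the upper bits cleared first, which
   corresponds to the ANDL $31 in the listing above (bits == 5). */
static int64_t sext_low_bits_masked(uint64_t x, unsigned bits)
{
    uint64_t mask = (UINT64_C(1) << bits) - 1;
    return sext_low_bits(x & mask, bits);
}
```

For any x, sext_low_bits(x, 5) and sext_low_bits_masked(x, 5) agree, which is
the property the "don't care" value of UNSIGNEDP exploits.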

For the test case above, with this patch we now generate:

bar:    movzbl  8(%rdi), %ecx
        movq    (%rdi), %rax
        movq    %rcx, %rdx
        salq    $59, %rdx
        sarq    $59, %rdx
        ret

2023-08-04  Roger Sayle  

gcc/ChangeLog
* expmed.cc (extract_bit_field_1): Document that an UNSIGNEDP
value of -1 is equivalent to don't care.
(extract_integral_bit_field): Indicate that we don't require
the most significant word to be zero extended, if we're about
to sign extend it.
(extract_fixed_bit_field_1): Document that an UNSIGNEDP value
of -1 is equivalent to don't care.  Don't clear the most
significant bits with AND mask when UNSIGNEDP is -1.

gcc/testsuite/ChangeLog
* gcc.target/i386/pr110717-2.c: New test case.

[Bug c++/110900] New: std::string initializes SSO object subfield without making the SSO object active in the union

2023-08-04 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110900

Bug ID: 110900
   Summary: std::string initializes SSO object subfield without
making the SSO object active in the union
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: danakj at orodu dot net
  Target Milestone: ---

Specific errors by clang:

note: construction of subobject of member '_M_local_buf' of union with no
active member is not allowed in a constant expression

Specific error by GCC:

error: accessing ‘std::__cxx11::basic_string_M_allocated_capacity’ member instead of initialized
‘std::__cxx11::basic_string_M_local_buf’ member in
constant expression

Full errors:

Here's the clang 17 error:

/usr/include/c++/12/bits/stl_construct.h:97:14: note: construction of subobject
of member '_M_local_buf' of union with no active member is not allowed in a
constant expression
   97 | { return ::new((void*)__location)
_Tp(std::forward<_Args>(__args)...); }
  |  ^
/usr/include/c++/12/bits/char_traits.h:262:6: note: in call to
'construct_at(&[]() {
std::string acc;
sus::Array::with('a', 'b', 'c', 'd',
'e').into_iter().for_each([&](char v) {
acc.push_back(v);
});
return acc;
}().._M_local_buf[0], acc.._M_local_buf[0])'
  262 | std::construct_at(__s1 + __i, __s2[__i]);
  | ^
/usr/include/c++/12/bits/char_traits.h:429:11: note: in call to 'copy(&[]() {
std::string acc;
sus::Array::with('a', 'b', 'c', 'd',
'e').into_iter().for_each([&](char v) {
acc.push_back(v);
});
return acc;
}().._M_local_buf[0], &acc.._M_local_buf[0], 6)'
  429 |   return __gnu_cxx::char_traits::copy(__s1, __s2,
__n);
  |  ^
/usr/include/c++/12/bits/basic_string.h:675:6: note: in call to 'copy(&[]() {
std::string acc;
sus::Array::with('a', 'b', 'c', 'd',
'e').into_iter().for_each([&](char v) {
acc.push_back(v);
});
return acc;
}().._M_local_buf[0], &acc.._M_local_buf[0], 6)'
  675 | traits_type::copy(_M_local_buf, __str._M_local_buf,
  | ^
/home/runner/work/subspace/subspace/sus/iter/iterator_unittest.cc:1864:12:
note: in call to 'basic_string(acc)'
 1864 | return acc;
  |^
/home/runner/work/subspace/subspace/sus/iter/iterator_unittest.cc:1859:17:
note: in call to '[]() {
std::string acc;
sus::Array::with('a', 'b', 'c', 'd',
'e').into_iter().for_each([&](char v) {
acc.push_back(v);
});
return acc;
}.operator()()'
 1859 |   static_assert([]() {
  | ^

Here's the g++ 13 error:
/home/runner/work/subspace/subspace/sus/iter/iterator_unittest.cc:1792:24:
error: non-constant condition for static assertion
 1787 |   static_assert(sus::Array::with('a', 'b', 'c', 'd', 'e')
  | ~~
 1788 | .into_iter()
  | 
 1789 | .fold(std::string(), [](std::string acc, char v) {
  | ~~
 1790 |   acc.push_back(v);
  |   ~
 1791 |   return acc;
  |   ~~~
 1792 | }) == "abcde");
  | ~~~^~
/home/runner/work/subspace/subspace/sus/iter/iterator_unittest.cc:1792:35:   in
‘constexpr’ expansion of ‘sus::iter::IteratorBase::fold(B, F) &&
[with B = std::__cxx11::basic_string; F =
{anonymous}::Iterator_Fold_Test::TestBody()::; Iter
= sus::containers::ArrayIntoIter; ItemT =
char](std::__cxx11::basic_string(), ({anonymous}::Iterator_Fold_Test::TestBody()::(), {anonymous}::Iterator_Fold_Test::TestBody()::()))’
/home/runner/work/subspace/subspace/sus/iter/iterator_unittest.cc:1792:35:   in
‘constexpr’ expansion of ‘sus::fn::call_mut(F&&, Args&& ...) [with F =
{anonymous}::Iterator_Fold_Test::TestBody()::&; Args
= {std::__cxx11::basic_string,
std::allocator >, char}]((* &
sus::mem::move&>(init)), (&
sus::mem::move&>(o))->sus::option::Option::unwrap())’
/home/runner/work/subspace/subspace/sus/iter/iterator_unittest.cc:1792:35:   in
‘constexpr’ expansion of ‘std::invoke(_Callable&&, _Args&& ...) [with _Callable
= {anonymous}::Iterator_Fold_Test::TestBody()::&; _Args =
{__cxx11::basic_string, allocator >, char};
invoke_result_t<_Fn, _Args ...> = __cxx11::basic_string]((* &
sus::mem::forward >((* & args#0))), (* &
sus::mem::forward((* & args#1’
/home/runner/work/subspace/subspace/sus/iter/iterator_unittest.cc:1792:35:   in
‘const

[Bug middle-end/110888] Missing optimization for trivial MATMUL cases, requires -fno-signed-zeros

2023-08-04 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110888

Thomas Koenig  changed:

   What|Removed |Added

  Component|fortran |middle-end

--- Comment #3 from Thomas Koenig  ---
Interesting problem.

For

  _19 = (*x_13(D))[0];
  _20 = (*y_14(D))[0];
  _21 = _19 * _20;
  _22 = _21 + 0.0;

the multiplication cannot produce a signalling NaN, so the addition
of zero should always be a no-op. For this, a simpler test case would
be

double add(double a, double b)
{
  return a*b + 0.0;
}

which gets me, on x86_64, 

mulsd   %xmm1, %xmm0
pxor    %xmm1, %xmm1
addsd   %xmm1, %xmm0
ret

According to godbolt, icc produces

add:
mulsd %xmm1, %xmm0  #3.12
ret   

which should be fine.

So, an issue for tree optimization?

[Bug c++/110900] std::string initializes SSO object subfield without making the SSO object active in the union

2023-08-04 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110900

--- Comment #1 from danakj at orodu dot net ---
I am going to try to work around this by not using std::string in constant
expressions.

So in the meantime I pushed a branch where this bug will continue to reproduce.

With gcc-13:

git clone --recurse-submodules https://github.com/danakj/subspace
cd subspace
git checkout test origin/libstd-bug-sso
CXX=path/to/gcc-13 cmake -B out -DSUBSPACE_BUILD_TESTS=ON
cmake --build out -j 20

[Bug fortran/110888] Missing optimization for trivial MATMUL cases, requires -fno-signed-zeros

2023-08-04 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110888

Thomas Koenig  changed:

   What|Removed |Added

  Component|middle-end  |fortran

--- Comment #4 from Thomas Koenig  ---
Hm, on second thoughts, signed zeros are an issue, resetting to Fortran.

Generally, we are in an intrinsic, so we can do whatever we please
(we certainly do in the library case, and this is expected behavior).

Having -ffast-math applied locally to the BLOCK that the matmul
is executed in would be a possibility.
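
For reference, a small C sketch of why signed zeros matter here.  The
function names are invented for this example and the default
round-to-nearest mode is assumed.  When the product is -0.0, adding +0.0
gives +0.0, so dropping the addition changes the sign of the result:

```c
#include <math.h>

/* The expression from the reduced test case. */
static double mul_add_zero(double a, double b)
{
    return a * b + 0.0;
}

/* What the expression would become if "+ 0.0" were dropped. */
static double mul_only(double a, double b)
{
    return a * b;
}

/* Returns 1 if the two forms produce results of different sign. */
static int sign_differs(double a, double b)
{
    return (signbit(mul_add_zero(a, b)) != 0)
           != (signbit(mul_only(a, b)) != 0);
}
```

With a = -1.0 and b = 0.0 the two forms differ (+0.0 versus -0.0), which is
why the simplification is only valid under -fno-signed-zeros.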

[Bug target/109465] LoongArch: The expansion of memcpy is slow and bloated for some sizes

2023-08-04 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109465

Xi Ruoyao  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug c++/110900] std::string initializes SSO object subfield without making the SSO object active in the union

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110900

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2023-08-04

--- Comment #2 from Andrew Pinski  ---
Can you please read https://gcc.gnu.org/bugs/ on what we need?

[Bug c++/110158] Cannot use union with std::string inside in constant expression

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110158

Andrew Pinski  changed:

   What|Removed |Added

 CC||danakj at orodu dot net

--- Comment #3 from Andrew Pinski  ---
*** Bug 110900 has been marked as a duplicate of this bug. ***

[Bug libstdc++/110900] std::string initializes SSO object subfield without making the SSO object active in the union

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110900

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Andrew Pinski  ---
Dup of bug 110158.

*** This bug has been marked as a duplicate of bug 110158 ***

[Bug libstdc++/110900] std::string initializes SSO object subfield without making the SSO object active in the union

2023-08-04 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110900

--- Comment #4 from danakj at orodu dot net ---
The error message is the same as 110158 but to be clear the std::string is not
in a union. The error message is about the union _inside_ std::string.

[Bug libstdc++/110900] std::string initializes SSO object subfield without making the SSO object active in the union

2023-08-04 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110900

--- Comment #5 from danakj at orodu dot net ---
> Can you please read https://gcc.gnu.org/bugs/ on what we need?

Yeah, sorry I can't reproduce this locally on my Mac or Windows machine. It
reproduces on github Linux CI bots, and I have diagnosed it from there.

https://github.com/chromium/subspace/actions/runs/5758764036/job/15611774084?pr=306

This job is using gcc 13.1.0, and it installs libstdc++-13-dev.

Here's the command that fails:

/usr/bin/g++-13  -I/home/runner/work/subspace/subspace
-I/home/runner/work/subspace/subspace/third_party/googletest
-I/home/runner/work/subspace/subspace/third_party/fmt/include -isystem
/home/runner/work/subspace/subspace/third_party/googletest/googletest/include
-isystem /home/runner/work/subspace/subspace/third_party/googletest/googletest
-isystem /usr/include/c++/13 -isystem /usr/include/x86_64-linux-gnu/c++/13
-isystem /usr/include/c++/13/backward -isystem
/usr/lib/gcc/x86_64-linux-gnu/13/include -isystem /usr/local/include -isystem
/usr/include/x86_64-linux-gnu -isystem /usr/include -O3 -DNDEBUG -std=gnu++20
-fno-rtti -Werror -MD -MT
sus/CMakeFiles/subspace_unittests.dir/iter/iterator_unittest.cc.o -MF
sus/CMakeFiles/subspace_unittests.dir/iter/iterator_unittest.cc.o.d -o
sus/CMakeFiles/subspace_unittests.dir/iter/iterator_unittest.cc.o -c
/home/runner/work/subspace/subspace/sus/iter/iterator_unittest.cc

I think it's simplest to just do a git clone and build that though... as I
can't easily minimize this.

[Bug c++/110158] Cannot use union with std::string inside in constant expression

2023-08-04 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110158

--- Comment #4 from danakj at orodu dot net ---
Here's a repro without the std::string inside a union. It is the SSO union
inside the string that causes the error.

https://gcc.godbolt.org/z/T8oM8vYnq

```
#include <string>

template <class T, class I, class S, class F>
constexpr T fold(T init, I i, S s, F f) {
    while (true) {
        if (i == s)
            return init;
        else
            init = f(std::move(init), *i++);
    }
}

constexpr char v[] = {'a', 'b', 'c'};
static_assert(fold(std::string(), std::begin(v), std::end(v),
                   [](std::string acc, char v) {
                       acc.push_back(v);
                       return acc;
                   }) == "abc");

int main() {}
```

:18:23: error: non-constant condition for static assertion
   14 | static_assert(fold(std::string(), std::begin(v), std::end(v),
  |   ~~~
   15 |[](std::string acc, char v) {
  |~
   16 |acc.push_back(v);
  |~
   17 |return acc;
  |~~~
   18 |}) == "abc");
  |~~~^~~~
:18:32:   in 'constexpr' expansion of 'fold(T, I, S, F) [with T =
std::__cxx11::basic_string; I = const char*; S = const char*; F =
](std::begin(v), std::end(v), ((),
()))'
:18:32:   in 'constexpr' expansion of
'std::__cxx11::basic_string((* &
std::move<__cxx11::basic_string&>(init)))'
:18:23: error: accessing 'std::__cxx11::basic_string_M_allocated_capacity' member instead of initialized
'std::__cxx11::basic_string_M_local_buf' member in
constant expression
ASM generation compiler returned: 1

[Bug libstdc++/110900] std::string initializes SSO object subfield without making the SSO object active in the union

2023-08-04 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110900

--- Comment #6 from danakj at orodu dot net ---
Thanks for the link, I used the godbolt from that bug to set up the right
environment and that let me minimize it. I posted it into the dupe bug.

[Bug driver/110901] New: -march does not override -mcpu on aarch64

2023-08-04 Thread raj.khem at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110901

Bug ID: 110901
   Summary: -march does not override -mcpu on aarch64
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: raj.khem at gmail dot com
  Target Milestone: ---

As per 

https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#index-mcpu

When -march is used, the relevant parts of -mcpu are overridden by it. However,
this does not seem to happen in the following case with GCC 13:

a.s
===
.text
ptrue p0.b

=


aarch64-yoe-linux-gcc  -mcpu=cortex-a72.cortex-a53 -mbranch-protection=standard

--sysroot=/mnt/b/yoe/master/build/tmp/work/cortexa72-cortexa53-crypto-yoe-linux/glibc/2.38-r0/recipe-sysroot
-fuse-ld=bfd -c -march=armv8.2-a+sve a.s -v
Using built-in specs.
COLLECT_GCC=../recipe-sysroot-native/usr/bin/aarch64-yoe-linux/aarch64-yoe-linux-gcc
Target: aarch64-yoe-linux
Configured with:
../../../../../../work-shared/gcc-13.2.0-r0/gcc-13.2.0/configure
--build=x86_64-linux --host=x86_64-linux --target=aarch64-yoe-linux
--prefix=/host-native/usr --exec_prefix=/host-native/usr
--bindir=/host-native/usr/bin/aarch64-yoe-linux
--sbindir=/host-native/usr/bin/aarch64-yoe-linux
--libexecdir=/host-native/usr/libexec/aarch64-yoe-linux
--datadir=/host-native/usr/share --sysconfdir=/host-native/etc
--sharedstatedir=/host-native/com --localstatedir=/host-native/var
--libdir=/host-native/usr/lib/aarch64-yoe-linux
--includedir=/host-native/usr/include --oldincludedir=/host-native/usr/include
--infodir=/host-native/usr/share/info --mandir=/host-native/usr/share/man
--disable-silent-rules --disable-dependency-tracking
--with-libtool-sysroot=/host-native --enable-clocale=generic --with-gnu-ld
--enable-shared --enable-languages=c,c++ --enable-threads=posix
--disable-multilib --enable-default-pie --enable-c99 --enable-long-long
--enable-symvers=gnu --enable-libstdcxx-pch --program-prefix=aarch64-yoe-linux-
--without-local-prefix --disable-install-libiberty --disable-libssp
--enable-libitm --enable-lto --disable-bootstrap --with-system-zlib
--with-linker-hash-style=sysv --enable-linker-build-id --with-ppl=no
--with-cloog=no --enable-checking=release --enable-cheaders=c_global
--without-isl --with-gxx-include-dir=/not/exist/usr/include/c++/13.2.0
--with-sysroot=/not/exist --with-build-sysroot=/host
--enable-poison-system-directories=error --with-system-zlib --disable-static
--disable-nls --with-glibc-version=2.28 --enable-initfini-array
--enable-__cxa_atexit
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (GCC)
COLLECT_GCC_OPTIONS='-mcpu=cortex-a72.cortex-a53'
'-mbranch-protection=standard'
'--sysroot=/mnt/b/yoe/master/build/tmp/work/cortexa72-cortexa53-crypto-yoe-linux/glibc/2.38-r0/recipe-sysroot'
'-fuse-ld=bfd' '-c' '-march=armv8.2-a+sve' '-v' '-mlittle-endian' '-mabi=lp64'

/mnt/b/yoe/master/build/tmp/work/cortexa72-cortexa53-crypto-yoe-linux/glibc/2.38-r0/recipe-sysroot-native/usr/bin/aarch64-yoe-linux/../../libexec/aarch64-yoe-linux/gcc/aarch64-yoe-linux/13.2.0/as
-v -EL -march=armv8.2-a+sve -march=armv8-a+crc -mabi=lp64 -o a.o a.s
GNU assembler version 2.41.0 (aarch64-yoe-linux) using BFD version (GNU
Binutils) 2.41.0.20230731
a.s: Assembler messages:
a.s:2: Error: selected processor does not support `ptrue p0.b'


However, if I remove -mcpu=cortex-a72.cortex-a53 or change it to
-mcpu=cortex-a72.cortex-a53+sve, then it works OK. The interesting part is the
order of the -march values on the assembler command line:

as -v -EL -march=armv8.2-a+sve -march=armv8-a+crc -mabi=lp64 -o a.o a.s

As we can see, the -march computed from -mcpu is specified *after* the -march
passed by the user.

Is this a bug?

[Bug target/110202] _mm512_ternarylogic_epi64 generates unnecessary operations

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110202

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Alexander Monakov :

https://gcc.gnu.org/g:567d06bb357a39ece865cef67ada44124f227e45

commit r14-2999-g567d06bb357a39ece865cef67ada44124f227e45
Author: Yan Simonaytes 
Date:   Tue Jul 25 20:43:19 2023 +0300

i386: eliminate redundant operands of VPTERNLOG

As mentioned in PR 110202, GCC may be presented with input where control
word of the VPTERNLOG intrinsic implies that some of its operands do not
affect the result.  In that case, we can eliminate redundant operands
of the instruction by substituting any other operand in their place.
This removes false dependencies.

For instance, instead of (252 = 0xfc = _MM_TERNLOG_A | _MM_TERNLOG_B)

vpternlogq  $252, %zmm2, %zmm1, %zmm0

emit

vpternlogq  $252, %zmm0, %zmm1, %zmm0

When VPTERNLOG is invariant w.r.t first and second operands, and the
third operand is memory, load memory into the output operand first, i.e.
instead of (85 = 0x55 = ~_MM_TERNLOG_C)

vpternlogq  $85, (%rdi), %zmm1, %zmm0

emit

vmovdqa64   (%rdi), %zmm0
vpternlogq  $85, %zmm0, %zmm0, %zmm0
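
As a rough illustration of how redundancy can be read off the control
word (this is a sketch, not the code added to i386.cc; the function
names and the result encoding are hypothetical): with the truth-table
index formed as (A << 2) | (B << 1) | C, so that _MM_TERNLOG_A = 0xf0,
_MM_TERNLOG_B = 0xcc and _MM_TERNLOG_C = 0xaa, an input is redundant
when flipping it never changes the selected table bit.

```c
/* Returns 1 if the input selected by index_bit (4 for A, 2 for B,
   1 for C) has no effect on the result encoded by imm8. */
static int ternlog_input_is_redundant(unsigned imm8, unsigned index_bit)
{
    for (unsigned i = 0; i < 8; i++) {
        unsigned as_given = (imm8 >> i) & 1;
        unsigned flipped = (imm8 >> (i ^ index_bit)) & 1;
        if (as_given != flipped)
            return 0;
    }
    return 1;
}

/* Hypothetical encoding of the answer: bit 0 set means A is redundant,
   bit 1 means B is redundant, bit 2 means C is redundant. */
static unsigned ternlog_redundant_mask(unsigned imm8)
{
    unsigned mask = 0;
    if (ternlog_input_is_redundant(imm8, 4))
        mask |= 1;
    if (ternlog_input_is_redundant(imm8, 2))
        mask |= 2;
    if (ternlog_input_is_redundant(imm8, 1))
        mask |= 4;
    return mask;
}
```

Under this encoding, 0xfc (A | B) reports only C as redundant and
0x55 (~C) reports A and B as redundant, matching the two cases above.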

gcc/ChangeLog:

PR target/110202
* config/i386/i386-protos.h
(vpternlog_redundant_operand_mask): Declare.
(substitute_vpternlog_operands): Declare.
* config/i386/i386.cc
(vpternlog_redundant_operand_mask): New helper.
(substitute_vpternlog_operands): New function.  Use them...
* config/i386/sse.md: ... here in new VPTERNLOG define_splits.

gcc/testsuite/ChangeLog:

PR target/110202
* gcc.target/i386/invariant-ternlog-1.c: New test.
* gcc.target/i386/invariant-ternlog-2.c: New test.

[Bug analyzer/110902] New: Missing cast in region_model_manager::maybe_fold_binop on MULT_EXPR by 1

2023-08-04 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110902

Bug ID: 110902
   Summary: Missing cast in region_model_manager::maybe_fold_binop
on MULT_EXPR by 1
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: dmalcolm at gcc dot gnu.org
  Target Milestone: ---

Whilst trying to fix PR analyzer/110426, I noticed that
region_model_manager::maybe_fold_binop doesn't always return the correct type;
specifically, it fails to cast to TYPE when folding (VAL * 1) -> VAL:

diff --git a/gcc/analyzer/region-model-manager.cc
b/gcc/analyzer/region-model-manager.cc
index 46d271a295c..010906f1ec0 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -654,7 +654,7 @@ region_model_manager::maybe_fold_binop (tree type, enum
tree_code op,
return get_or_create_constant_svalue (build_int_cst (type, 0));
   /* (VAL * 1) -> VAL.  */
   if (cst1 && integer_onep (cst1))
-   return arg0;
+   return get_or_create_cast (type, arg0);
   break;
 case BIT_AND_EXPR:
   if (cst1)

However, on adding the above cast, various bounds-checking tests fail,
seemingly due to confusion about ptrdiff_t vs size_t, and how to compare such
values:

FAIL: gcc.dg/analyzer/flexible-array-member-1.c  (test for warnings, line 96)

With -m64:
FAIL: gcc.dg/analyzer/out-of-bounds-diagram-3.c  (test for warnings, line 19)
FAIL: gcc.dg/analyzer/out-of-bounds-diagram-3.c  (test for warnings, line 24)
FAIL: gcc.dg/analyzer/out-of-bounds-diagram-3.c expected multiline pattern
lines 29-44

[Bug target/110901] -march does not override -mcpu on aarch64

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110901

--- Comment #1 from Andrew Pinski  ---
Order matters. In this case -march is after -mcpu ...

[Bug target/110901] -march does not override -mcpu on aarch64

2023-08-04 Thread raj.khem at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110901

--- Comment #2 from Khem Raj  ---
(In reply to Andrew Pinski from comment #1)
> Order matters. In this case -march is after -mcpu ...

It does not seem to be effective in this case. I tried specifying -mcpu after
-march and vice versa; the result is the same:


% ../recipe-sysroot-native/usr/bin/aarch64-yoe-linux/aarch64-yoe-linux-gcc
-mbranch-protection=standard 
--sysroot=/mnt/b/yoe/master/build/tmp/work/cortexa72-cortexa53-crypto-yoe-linux/glibc/2.38-r0/recipe-sysroot
-fuse-ld=bfd -c -march=armv8.2-a+sve -mcpu=cortex-a72.cortex-a53  a.s
a.s: Assembler messages:
a.s:2: Error: selected processor does not support `ptrue p0.b'

../recipe-sysroot-native/usr/bin/aarch64-yoe-linux/aarch64-yoe-linux-gcc
-mbranch-protection=standard 
--sysroot=/mnt/b/yoe/master/build/tmp/work/cortexa72-cortexa53-crypto-yoe-linux/glibc/2.38-r0/recipe-sysroot
-fuse-ld=bfd -c -mcpu=cortex-a72.cortex-a53 -march=armv8.2-a+sve  a.s
a.s: Assembler messages:
a.s:2: Error: selected processor does not support `ptrue p0.b'

[Bug target/110901] -march does not override -mcpu (big.little) on aarch64

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110901

--- Comment #3 from Andrew Pinski  ---
With C code, this use of -march and -mcpu would normally even be rejected.

[Bug ada/110898] compilation of adacl-assert-integer.ads failed

2023-08-04 Thread krischik at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110898

Martin Krischik  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Martin Krischik  ---
(In reply to Marc Poulhiès from comment #1)

> I've checked and I also get the same errors with gcc 11.x, so that's not
> something new. I think your code should be fixed here.

Yes, those error messages make sense. Especially the „error: child of a generic
package must be a generic unit“. That is indeed a problem on my side.

Thanks for checking. What confused me was the not at all helpful “compilation of
adacl-assert-integer.ads failed”, with the proper error message nowhere to be
seen.

But that is probably an Alire problem. I'll close the bug.

[Bug middle-end/94442] [11/12/13/14 regression] Redundant loads/stores emitted at -O3

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94442

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
   Target Milestone|11.5|11.0
 Resolution|--- |FIXED

--- Comment #13 from Andrew Pinski  ---
Fixed by r11-6794-g04b472ad0e1dc93abafe .

[Bug target/95958] [meta-bug] Inefficient arm_neon.h code for AArch64

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95958
Bug 95958 depends on bug 94442, which changed state.

Bug 94442 Summary: [11/12/13/14 regression] Redundant loads/stores emitted at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94442

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 95084, which changed state.

Bug 95084 Summary: [11/12/13/14 Regression] code sinking prevents if-conversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95084

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

[Bug tree-optimization/95084] [11/12/13/14 Regression] code sinking prevents if-conversion

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95084

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #6 from Andrew Pinski  ---
This was fixed by the patch which fixed PR 92335. Since that bug is still open
as a regression like this one, and the two are exactly the same issue, I am
going to close this one as a dup of bug 92335.

*** This bug has been marked as a duplicate of bug 92335 ***

[Bug tree-optimization/92335] [11/12/13 Regression] sinking of loads happen too early which causes vectorization not to be done

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92335

--- Comment #10 from Andrew Pinski  ---
*** Bug 95084 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/110903] New: [14 Regression] Dead Code Elimination Regression since r14-1597-g64d90d06d2d

2023-08-04 Thread theodort at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110903

Bug ID: 110903
   Summary: [14 Regression] Dead Code Elimination Regression since
r14-1597-g64d90d06d2d
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: theodort at inf dot ethz.ch
  Target Milestone: ---

https://godbolt.org/z/7of4jjM3K

Given the following code:

void foo(void);
static char b, c;
static short e, f;
static int g = 41317;
static int(a)(int h, int i) { return h + i; }
static int(d)(int h, int i) { return i ? h : 0; }
int main() {
    {
        char j;
        short k;
        for (; g >= 10; g = (short)g) {
            int l = 1, m = 0;
            j = 8 * k;
            k = j <= 0;
            f = c + 3;
            for (; c < 2; c = f) {
                char n = 4073709551615;
                if (!(((m) >= 0) && ((m) <= 0))) {
                    __builtin_unreachable();
                }
                if (g)
                    ;
                else {
                    if ((m = k, (b = a(d(l, k), e) && n) || l) < k) foo();
                    e = l = 0;
                }
            }
        }
    }
}

gcc-trunk -O3 does not eliminate the call to foo:

main:
        movl    g(%rip), %edi
        cmpl    $9, %edi
        jle     .L25
        pushq   %rbp
        movl    %edi, %ecx
        movl    $1, %ebp
        movl    $1, %esi
        pushq   %rbx
        movl    $1, %ebx
        subq    $8, %rsp
        movzbl  c(%rip), %edx
        movsbw  %dl, %ax
        addl    $3, %eax
        movw    %ax, f(%rip)
        cmpb    $1, %dl
        jg      .L12
        .p2align 4,,10
        .p2align 3
.L6:
        testl   %edi, %edi
        je      .L7
        movb    %al, c(%rip)
        movsbw  %al, %dx
        cmpb    $1, %al
        jle     .L6
.L9:
        movswl  %di, %ecx
        movl    %ecx, g(%rip)
        cmpl    $9, %ecx
        jle     .L17
        addl    $3, %edx
        movw    %dx, f(%rip)
.L12:
        movswl  %cx, %eax
        cmpw    $9, %cx
        jle     .L29
.L4:
        jmp     .L4
        .p2align 4,,10
        .p2align 3
.L7:
        movswl  e(%rip), %ecx
        movl    %ebx, %edx
        andl    %esi, %edx
        addl    %ecx, %edx
        orl     %esi, %edx
        jne     .L10
        testb   %bpl, %bpl
        jne     .L30
.L10:
        xorl    %edx, %edx
        movb    %al, c(%rip)
        movw    %dx, e(%rip)
        movsbw  %al, %dx
        cmpb    $1, %al
        jg      .L9
        xorl    %esi, %esi
        jmp     .L6
        .p2align 4,,10
        .p2align 3
.L30:
        call    foo
        movzwl  f(%rip), %eax
        movl    g(%rip), %edi
        jmp     .L10
.L29:
        movl    %eax, g(%rip)
.L17:
        addq    $8, %rsp
        xorl    %eax, %eax
        popq    %rbx
        popq    %rbp
        ret
.L25:
        xorl    %eax, %eax
        ret

gcc-13.2.0 -O3 eliminates the call to foo:

main:
        movl    g(%rip), %esi
        movl    %esi, %ecx
        cmpl    $9, %esi
        jle     .L14
        movzbl  c(%rip), %eax
        movsbw  %al, %dx
        addl    $3, %edx
        movw    %dx, f(%rip)
        cmpb    $1, %al
        jg      .L12
        xorl    %eax, %eax
        testb   %al, %al
        movl    %edx, %eax
        je      .L6
        cmpb    $1, %dl
        jg      .L22
.L7:
        jmp     .L7
        .p2align 4,,10
        .p2align 3
.L22:
        movb    %dl, c(%rip)
.L8:
        movswl  %si, %ecx
        movl    %ecx, g(%rip)
        cmpl    $9, %ecx
        jle     .L14
        addl    $3, %eax
        cbtw
        movw    %ax, f(%rip)
.L12:
        movswl  %cx, %eax
        cmpw    $9, %cx
        jle     .L23
.L4:
        jmp     .L4
        .p2align 4,,10
        .p2align 3
.L6:
        movb    %dl, c(%rip)
        cmpw    $1, %dx
        jg      .L8
        .p2align 4,,10
        .p2align 3
.L9:
        movl    g(%rip), %eax
        testl   %eax, %eax
        jne     .L9
        movw    $0, e(%rip)
        movb    %dl, c(%rip)
.L23:
        movl    %eax, g(%rip)
.L14:
        xorl    %eax, %eax
        ret

Bisects to r14-1597-g64d90d06d2d

[Bug tree-optimization/110903] [14 Regression] Dead Code Elimination Regression since r14-1597-g64d90d06d2d

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110903

--- Comment #1 from Andrew Pinski  ---
Note the original testcase has some obvious use of an uninitialized variable.
Anyways here is a fixed up testcase which does not have that uninitialized
variable and GCC 13 was able to optimize away the call to foo still:
```
void foo(void);
static signed char b, c;
static short e, f;
static int g = 41317;
static int(a)(int h, int i) { return h + i; }
static int(d)(int h, int i) { return i ? h : 0; }
short t = 10;
int main() {
    {
        signed char j;
        short k = t;
        for (; g >= 10; g = (short)g) {
            _Bool l = 1;
            int m = 0;
            j = 8 * k;
            k = j <= 0;
            f = c + 3;
            for (; c < 2; c = f) {
                signed char n = 4073709551615;
                if (!(((m) >= 0) && ((m) <= 0))) {
                    __builtin_unreachable();
                }
                if (g)
                    ;
                else {
                    if ((m = k, (b = a(d(l, k), e) && n) || l) < k) foo();
                    e = l = 0;
                }
            }
        }
    }
}
```

[Bug middle-end/110857] aarch64-linux-gnu profiledbootstrap broken

2023-08-04 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110857

--- Comment #6 from prathamesh3492 at gcc dot gnu.org ---
profiledbootstrap now works on aarch64-linux-gnu, thanks!

[Bug c++/110904] New: __is_convertible incorrectly reports non-referenceable function prototypes as convertible

2023-08-04 Thread nikolasklauser at berlin dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110904

Bug ID: 110904
   Summary: __is_convertible incorrectly reports non-referenceable
 function prototypes as convertible
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nikolasklauser at berlin dot de
  Target Milestone: ---

```
#include <type_traits>

using Function = void();
using ConstFunction = void() const;

static_assert((!std::is_convertible::value), "");
static_assert((!std::is_convertible::value), ""); // convertible
static_assert((!std::is_convertible::value), ""); // convertible
static_assert((!std::is_convertible::value), ""); // convertible
static_assert((!std::is_convertible::value), "");
static_assert((!std::is_convertible::value), "");
static_assert((!std::is_convertible::value), "");
static_assert((!std::is_convertible::value), "");
```
__is_convertible() claims that the cases marked above are convertible, but
AFAICT that shouldn't be true. According to the standard,
```
To test() {
  return declval<From>();
}
```
has to be well formed, but that's never the case for `ConstFunction`.
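
The rule quoted from the standard can be approximated with a small detection idiom. The following is only a sketch under stated assumptions: `accept` and `convertible_sketch` are hypothetical names, and the check tests implicit conversion to a by-value function parameter rather than to a return value, so it is not expected to match the standard's behaviour when `To` is a function or array type.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: the call accept<To>(expr) is well formed only if
// expr converts implicitly to To (taken as a by-value parameter).
template <class To>
void accept(To);

// Primary template: assume "not convertible".
template <class From, class To, class = void>
struct convertible_sketch : std::false_type {};

// Selected when accept<To>(std::declval<From>()) is a valid expression.
// The decltype of that call is void, matching the defaulted third argument.
template <class From, class To>
struct convertible_sketch<From, To,
                          decltype(accept<To>(std::declval<From>()))>
    : std::true_type {};

struct Base {};
struct Derived : Base {};
struct Unrelated {};

static_assert(convertible_sketch<int, double>::value, "int -> double");
static_assert(convertible_sketch<Derived*, Base*>::value, "derived-to-base pointer");
static_assert(!convertible_sketch<Base*, Derived*>::value, "no implicit downcast");
static_assert(!convertible_sketch<Unrelated, int>::value, "no conversion exists");
```

The function-type cases from the report are left out on purpose, since that is exactly where a parameter-based check and the return-based wording of the standard can disagree.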

[Bug c++/110904] __is_convertible incorrectly reports non-referenceable function prototypes as convertible

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110904

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Dup of bug 109680.

*** This bug has been marked as a duplicate of bug 109680 ***

[Bug c++/109680] [13 Regression] is_convertible incorrectly true

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109680

Andrew Pinski  changed:

   What|Removed |Added

 CC||nikolasklauser at berlin dot de

--- Comment #14 from Andrew Pinski  ---
*** Bug 110904 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/110903] [12/13/14 Regression] Dead Code Elimination Regression

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110903

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-08-04
 Status|UNCONFIRMED |NEW
   Target Milestone|--- |12.4
 Ever confirmed|0   |1
   Keywords||needs-bisection
Summary|[14 Regression] Dead Code   |[12/13/14 Regression] Dead
   |Elimination Regression  |Code Elimination Regression
   |since r14-1597-g64d90d06d2d |

--- Comment #2 from Andrew Pinski  ---
Confirmed. Before r14-1597, jump threading was happening with respect
to:
  if (j_32 <= 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 238907556]:

   [local count: 477815112]:
  # iftmp.10_37 = PHI <_11(7), 0(8)>

But after, we change that into
iftmp.10_37 = _11 & (j_32 <= 0);

It just happens we depend on that due to:
  _43 = l_22 | _25;
  _39 = j_32 <= 0;
  _12 = ~_43;
  _44 = _12 & _39;


If we change the code to be:
```
void foo(void);
static signed char b, c;
static short e, f;
static int g = 41317;
static int(a)(int h, int i) { return h + i; }
static int(d)(int h, int i) { return i & h;}//i ? h : 0; }
short t = 10;
int main() {
{
signed char j;
short k = t;
for (; g >= 10; g = (short)g) {
_Bool l = 1;
int m = 0;
j = 8 * k;
k = j <= 0;
f = c + 3;
for (; c < 2; c = f) {
signed char n = 4073709551615;
if (!(((m) >= 0) && ((m) <= 0))) {
__builtin_unreachable();
}
if (g)
;
else {
if ((m = k, (b = a(d(l, k), e) && n) || l) < k) foo();
e = l = 0;
}
}
}
}
}
```

GCC 11 is able to remove the call to foo, but GCC 12 cannot.
The IR for the part that phiopt2 changes on the trunk is similar enough.

So this is instead a regression from GCC 11.
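
The two versions of `d` are interchangeable in this testcase only because both arguments are 0 or 1 at the call site (`l` is a `_Bool`, and `k` has just been assigned `j <= 0`). A minimal sketch of that equivalence follows; the names `d_select`, `d_and` and `helpers_agree` are made up for illustration and are not part of the testcase.

```cpp
// Original helper from the testcase: a select on i.
static int d_select(int h, int i) { return i ? h : 0; }

// Rewritten helper from the modified testcase: a bitwise AND.
static int d_and(int h, int i) { return i & h; }

// Returns true when both helpers agree for every h, i in [0, limit).
static bool helpers_agree(int limit) {
    for (int h = 0; h < limit; ++h)
        for (int i = 0; i < limit; ++i)
            if (d_select(h, i) != d_and(h, i)) return false;
    return true;
}
```

With `limit` set to 2 the helpers agree on all four input pairs; once a value such as 2 is allowed they diverge (for h = 1, i = 2 the select yields 1 and the AND yields 0), so the rewrite is only valid for boolean-valued operands.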

[Bug c++/110905] New: GCC rejects constexpr code that may re-initialize union member

2023-08-04 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110905

Bug ID: 110905
   Summary: GCC rejects constexpr code that may re-initialize
union member
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: danakj at orodu dot net
  Target Milestone: ---

Godbolt: https://gcc.godbolt.org/z/v5anxqnP1

This repro contains a std::optional (which has a union) and it sets the union
in a loop. Doing so causes GCC to reject the code as not being a constant
expression. The error I was getting in my project was far more descriptive,
with it trying to call the deleted constructor of the union.

error: use of deleted function
‘sus::option::__private::Storage,
false>()’

In my more minimal test case the error is more terse and less clear.

:62:59: error: non-constant condition for static assertion
   62 | static_assert(Flatten({{1, 2, 3}, {}, {4, 5}}).sum() == 1 + 2 + 3
+ 4 + 5);
  |  
^~~~
:62:51: error: '(((const std::vector
>*)(&)) != 0)' is not a constant expression
   62 | static_assert(Flatten({{1, 2, 3}, {}, {4, 5}}).sum() == 1 + 2 + 3
+ 4 + 5);
  |   

```cpp
#include <optional>
#include <vector>

template <class T>
struct VectorIter {
    constexpr std::optional<T> next() {
        if (front == back) return std::optional<T>();
        T& item = v[front];
        front += 1u;
        return std::optional<T>(std::move(item));
    }

    constexpr VectorIter(std::vector<T> v2)
        : v(std::move(v2)), front(0u), back(v.size()) {}
    VectorIter(VectorIter&&) = default;
    VectorIter& operator=(VectorIter&&) = default;

    std::vector<T> v;
    size_t front;
    size_t back;
};

template <class T>
struct Flatten {
    constexpr Flatten(std::vector<std::vector<T>> v) : vec(std::move(v)) {}

    constexpr std::optional<T> next() {
        std::optional<T> out;
        while (true) {
            // Take an item off front_iter_ if possible.
            if (front_iter_.has_value()) {
                out = front_iter_.value().next();
                if (out.has_value()) return out;
                front_iter_ = std::nullopt;
            }
            // Otherwise grab the next vector into front_iter_.
            if (!vec.empty()) {
                std::vector<T> v = std::move(vec[0]);
                vec.erase(vec.begin());
                front_iter_.emplace([](auto&& iter) {
                    return VectorIter<T>(std::move(iter));
                }(std::move(v)));
            }
            if (!front_iter_.has_value()) break;
        }
        return out;
    }

    constexpr T sum() && {
        T out = T();
        while (true) {
            std::optional<T> i = next();
            if (!i.has_value()) break;
            out += *i;
        }
        return out;
    }

    std::vector<std::vector<T>> vec;
    std::optional<VectorIter<T>> front_iter_;
};

static_assert(Flatten<int>({{1, 2, 3}, {}, {4, 5}}).sum() == 1 + 2 + 3 + 4 + 5);

int main() {}

```

When the Flatten::next() method is simplified a bit, so that it can see the
union is only initialized once, the GCC compiler no longer rejects the code.
https://gcc.godbolt.org/z/szfGsdxb7

```cpp
#include <optional>
#include <vector>

template <class T>
struct VectorIter {
    constexpr std::optional<T> next() {
        if (front == back) return std::optional<T>();
        T& item = v[front];
        front += 1u;
        return std::optional<T>(std::move(item));
    }

    constexpr VectorIter(std::vector<T> v2)
        : v(std::move(v2)), front(0u), back(v.size()) {}
    VectorIter(VectorIter&&) = default;
    VectorIter& operator=(VectorIter&&) = default;

    std::vector<T> v;
    size_t front;
    size_t back;
};

template <class T>
struct Flatten {
    constexpr Flatten(std::vector<T> v) : vec(std::move(v)) {}

    constexpr std::optional<T> next() {
        std::optional<T> out;
        while (true) {
            // Take an item off front_iter_ if possible.
            if (front_iter_.has_value()) {
                out = front_iter_.value().next();
                if (out.has_value()) return out;
                front_iter_ = std::nullopt;
            }
            // Otherwise grab the next vector into front_iter_.
            if (!moved) {
                std::vector<T> v = std::move(vec);
                moved = true;
                front_iter_.emplace([](auto&& iter) {
                    return VectorIter<T>(std::move(iter));
                }(std::move(v)));
            }
            if (!front_iter_.has_value()) break;
        }
        return out;
    }

    constexpr T sum() && {
        T out = T();
        while (true) {
            std::optional<T> i = next();
            if (!i.has_value()) break;
            out += *i;
        }
        return out;
    }

    bool moved = false;
    std::vector<T> vec;
    std::optional<VectorIter<T>> front_iter_;
};

static_assert(Flatten<int>({1, 2, 3}).sum() == 1 + 2 + 3);

int main() {}
```

Yet in the first example, the GCC com

[Bug c++/110905] GCC rejects constexpr code that may re-initialize union member

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110905

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-08-04
 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
>In my more minimal test case the error is more terse and less clear.


The reduced testcase is a different issue and is a dup of bug 85944.

In the first testcase provided below if we move the static_assert into main
instead of the toplevel, it gets accepted.

I think you need to redo your reduction.

[Bug other/109910] GCC prologue/epilogue saves/restores callee-saved registers that are never changed

2023-08-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109910

Georg-Johann Lay  changed:

   What|Removed |Added

   Last reconfirmed||2023-08-04
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

[Bug tree-optimization/32806] Missing optimization to remove backward dependencies

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32806

--- Comment #2 from Andrew Pinski  ---
Created attachment 55689
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55689&action=edit
compilable testcase

[Bug tree-optimization/30049] Variable-length arrays (VLA) should be converted to normal arrays if possible

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30049

--- Comment #2 from Andrew Pinski  ---
Created attachment 55690
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55690&action=edit
testcase

[apinski@xeond2 upstream-gcc-git]$ ~/upstream-gcc/bin/gcc  t.c -march=opteron
-ffast-math -funroll-loops -ftree-vectorize -msse3 -O3 -g
[apinski@xeond2 upstream-gcc-git]$ time ./a.out
real    0m1.522s
user    0m1.517s
sys     0m0.001s
[apinski@xeond2 upstream-gcc-git]$ ~/upstream-gcc/bin/gcc  t.c -march=opteron
-ffast-math -funroll-loops -ftree-vectorize -msse3 -O3 -g -DNORMAL_ARRAY
[apinski@xeond2 upstream-gcc-git]$ time ./a.out
real    0m0.356s
user    0m0.352s
sys     0m0.002s

[Bug tree-optimization/30049] Variable-length arrays (VLA) should be converted to normal arrays if possible

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30049

--- Comment #3 from Andrew Pinski  ---
The only difference I saw is scheduling and some small IV-OPTs difference ...

[Bug tree-optimization/35224] scalar evolution analysis fails with "evolution of base is not affine"

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35224

--- Comment #1 from Andrew Pinski  ---
Created attachment 55691
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55691&action=edit
testcase

[Bug tree-optimization/49955] Fails to do partial basic-block SLP

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49955

--- Comment #4 from Andrew Pinski  ---
The testcase in comment #0 started to be vectorized in GCC 13.

[Bug tree-optimization/18437] vectorizer failed for matrix multiplication

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437

--- Comment #9 from Andrew Pinski  ---
For the original testcase in comment #0, with `-O3 -fno-vect-cost-model` GCC
can vectorize it on aarch64 but not on x86_64.

[Bug analyzer/110426] Missing buffer overflow warning with function pointer that has the alloc_size attribute

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110426

--- Comment #2 from CVS Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:021077b94741c9300dfff3a24e95b3ffa3f508a7

commit r14-3001-g021077b94741c9300dfff3a24e95b3ffa3f508a7
Author: David Malcolm 
Date:   Fri Aug 4 16:18:40 2023 -0400

analyzer: handle function attribute "alloc_size" [PR110426]

This patch makes -fanalyzer make use of the function attribute
"alloc_size", allowing -fanalyzer to emit -Wanalyzer-allocation-size,
-Wanalyzer-out-of-bounds, and -Wanalyzer-tainted-allocation-size on
execution paths involving allocations using such functions.

gcc/analyzer/ChangeLog:
PR analyzer/110426
* bounds-checking.cc (region_model::check_region_bounds): Handle
symbolic base regions.
* call-details.cc: Include "stringpool.h" and "attribs.h".
(call_details::lookup_function_attribute): New function.
* call-details.h (call_details::lookup_function_attribute): New
function decl.
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Add reference to
PR analyzer/110902.
* region-model-reachability.cc (reachable_regions::handle_sval):
Add symbolic regions for pointers that are conjured svalues for
the LHS of a stmt.
* region-model.cc (region_model::canonicalize): Purge dynamic
extents for regions that aren't referenced.
(get_result_size_in_bytes): New function.
(region_model::on_call_pre): Use get_result_size_in_bytes and
potentially set the dynamic extents of the region pointed to by
the return value.
(region_model::deref_rvalue): Add param "add_nonnull_constraint"
and use it to conditionalize adding the constraint.
(pending_diagnostic_subclass::dubious_allocation_size): Add "stmt"
param to both ctors and use it to initialize new "m_stmt" field.
(pending_diagnostic_subclass::operator==): Use m_stmt; don't use
m_lhs or m_rhs.
(pending_diagnostic_subclass::m_stmt): New field.
(region_model::check_region_size): Generalize to any kind of
pointer svalue by using deref_rvalue rather than checking for
region_svalue.  Pass stmt to dubious_allocation_size ctor.
* region-model.h (region_model::deref_rvalue): Add param
"add_nonnull_constraint".
* svalue.cc (conjured_svalue::lhs_value_p): New function.
* svalue.h (conjured_svalue::lhs_value_p): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/110426
* gcc.dg/analyzer/allocation-size-1.c: Update expected message to
reflect consolidation of size and assignment into a single event.
* gcc.dg/analyzer/allocation-size-2.c: Likewise.
* gcc.dg/analyzer/allocation-size-3.c: Likewise.
* gcc.dg/analyzer/allocation-size-4.c: Likewise.
* gcc.dg/analyzer/allocation-size-multiline-1.c: Likewise.
* gcc.dg/analyzer/allocation-size-multiline-2.c: Likewise.
* gcc.dg/analyzer/allocation-size-multiline-3.c: Likewise.
* gcc.dg/analyzer/attr-alloc_size-1.c: New test.
* gcc.dg/analyzer/attr-alloc_size-2.c: New test.
* gcc.dg/analyzer/attr-alloc_size-3.c: New test.
* gcc.dg/analyzer/explode-4.c: New test.
* gcc.dg/analyzer/taint-size-1.c: Add test coverage for
__attribute__ alloc_size.

Signed-off-by: David Malcolm 
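
For readers unfamiliar with the attribute, here is a hedged sketch of what an `alloc_size`-annotated allocator looks like. `my_alloc` and `make_zeroed` are hypothetical names and do not come from the patch or its tests; the sketch is in C++ for consistency with the other examples in this digest, although the analyzer primarily targets C, so an equivalent C file is the more realistic input for `-fanalyzer`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical allocator wrapper. alloc_size(1) tells the compiler that the
// returned block is as large as the value of the first argument.
__attribute__((alloc_size(1)))
void *my_alloc(std::size_t n) { return std::malloc(n); }

// Allocates n bytes through the wrapper and zero-fills exactly n bytes.
// Writing n + 1 bytes here instead would be an out-of-bounds write, which
// is the category of defect the commit message says the analyzer can now
// track through such functions.
unsigned char *make_zeroed(std::size_t n) {
    void *p = my_alloc(n);
    if (p == nullptr) return nullptr;
    std::memset(p, 0, n);
    return static_cast<unsigned char *>(p);
}
```

The caller owns the returned buffer and releases it with `std::free`.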

[Bug analyzer/110902] Missing cast in region_model_manager::maybe_fold_binop on MULT_EXPR by 1

2023-08-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110902

--- Comment #1 from CVS Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:021077b94741c9300dfff3a24e95b3ffa3f508a7

commit r14-3001-g021077b94741c9300dfff3a24e95b3ffa3f508a7
Author: David Malcolm 
Date:   Fri Aug 4 16:18:40 2023 -0400

analyzer: handle function attribute "alloc_size" [PR110426]


[Bug tree-optimization/18437] vectorizer failed for matrix multiplication

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437

--- Comment #10 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #9)
> For the original testcase in comment #0, with `-O3 -fno-vect-cost-model` GCC
> can vectorize it on aarch64 but not on x86_64.

I should say: starting in GCC 6.

[Bug analyzer/110426] Missing buffer overflow warning with function pointer that has the alloc_size attribute

2023-08-04 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110426

David Malcolm  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from David Malcolm  ---
Should be implemented for gcc 14 by the above patch.

[Bug tree-optimization/21998] (cond ? result1 : result2) is vectorized, where equivalent if-syntax isn't (store)

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21998

--- Comment #7 from Andrew Pinski  ---
We can vectorize test2 using mask stores.

[Bug c++/110905] GCC rejects constexpr code that may re-initialize union member

2023-08-04 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110905

--- Comment #2 from danakj at orodu dot net ---
Ah ok. Here's a big reproduction: https://godbolt.org/z/Kj7Tcd6P4

/opt/compiler-explorer/gcc-trunk-20230804/include/c++/14.0.0/bits/stl_construct.h:97:14:
  in 'constexpr' expansion of
'((sus::containers::VecIntoIter*))->sus::containers::VecIntoIter::VecIntoIter((*
& std::forward >((* & __args#0'
:32895:22: error: use of deleted function
'sus::option::__private::Storage,
false>()'
32895 | struct [[nodiscard]] VecIntoIter final
  |  ^~~
:3015:9: note:
'sus::option::__private::Storage,
false>()' is implicitly deleted because the
default definition would be ill-formed:
 3015 |   union {
  | ^
:3015:9: error: no matching function for call to
'sus::containers::VecIntoIter::VecIntoIter()'
:32953:13: note: candidate: 'constexpr
sus::containers::VecIntoIter::VecIntoIter(sus::containers::Vec&&,
sus::num::usize, sus::num::usize) [with ItemT = sus::num::i32]'
32953 |   constexpr VecIntoIter(Vec&& vec, usize front, usize back)
noexcept
  | ^~~
:32953:13: note:   candidate expects 3 arguments, 0 provided
:32951:13: note: candidate: 'constexpr
sus::containers::VecIntoIter::VecIntoIter(sus::containers::Vec&&)
[with ItemT = sus::num::i32]'
32951 |   constexpr VecIntoIter(Vec&& vec) noexcept :
vec_(::sus::move(vec)) {}
  | ^~~
:32951:13: note:   candidate expects 1 argument, 0 provided
:32895:22: note: candidate: 'constexpr
sus::containers::VecIntoIter::VecIntoIter(sus::containers::VecIntoIter&&)'
32895 | struct [[nodiscard]] VecIntoIter final
  |  ^~~
:32895:22: note:   candidate expects 1 argument, 0 provided
Compiler returned: 1


I will try to shrink it now.

[Bug middle-end/110906] New: __attribute__((optimize("no-math-errno"))) has no effect.

2023-08-04 Thread cassio.neri at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110906

Bug ID: 110906
   Summary: __attribute__((optimize("no-math-errno"))) has no
effect.
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cassio.neri at gmail dot com
  Target Milestone: ---

Consider this C++ code compiled with -O3:

double g(double x) {
  return std::sqrt(x);
}

By default this calls the library function std::sqrt, because x might be
negative and errno then needs to be set accordingly. With -fno-math-errno,
a single sqrtsd instruction is emitted instead. However, annotating g with

__attribute__((optimize("no-math-errno")))

has no effect. This attribute (and #pragma GCC optimize("no-math-errno") ) used
to work up to gcc 5.5.

https://godbolt.org/z/T1nb11bv5
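
A self-contained variant of the report, for comparing the two code paths side by side. This is a sketch, not the reporter's exact file: `sqrt_default` and `sqrt_no_errno` are illustrative names, and the preprocessor guard is an assumption added so that the file also compiles with compilers that do not implement GCC's `optimize` attribute.

```cpp
#include <cmath>

// Baseline: compiled with whatever -f[no-]math-errno setting is in effect.
double sqrt_default(double x) { return std::sqrt(x); }

#if defined(__GNUC__) && !defined(__clang__)
// GCC-specific: request -fno-math-errno for this one function only.
__attribute__((optimize("no-math-errno")))
#endif
double sqrt_no_errno(double x) { return std::sqrt(x); }
```

Both functions return the same values; the difference the report is about is only visible in the generated code. Compiling with -O3 and comparing the assembly of the two functions shows whether the attribute took effect.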

[Bug middle-end/110906] __attribute__((optimize("no-math-errno"))) has no effect.

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110906

--- Comment #1 from Andrew Pinski  ---
Well, std::sqrt is not annotated with no-math-errno after all ...

[Bug middle-end/110906] __attribute__((optimize("no-math-errno"))) has no effect.

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110906

--- Comment #2 from Andrew Pinski  ---
Created attachment 55692
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55692&action=edit
Full testcase

[Bug middle-end/110906] __attribute__((optimize("no-math-errno"))) has no effect.

2023-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110906

--- Comment #3 from Andrew Pinski  ---
But even:
```
__attribute__((optimize("no-math-errno")))
double g(double x) {
  return __builtin_sqrt(x);
}
```
Does not change here ...
