Re: [PATCH 1/5] libstdc++: Import the fast_float library

2021-11-16 Thread Florian Weimer via Gcc-patches
* Patrick Palka via Libstdc:

> This copies the fast_float library[1] into the compiled-in library
> sources.  We're going to use this library in our floating-point
> std::from_chars implementation for faster and more portable parsing of
> binary32/64 decimal strings.
>
> [1]: https://github.com/fastfloat/fast_float
>
> Series tested on x86_64, i686, ppc64, ppc64le and aarch64, does it
> look OK for trunk?

Missing Signed-off-by:?

> diff --git a/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE 
> b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
> new file mode 100644
> index 000..26f4398f249
> --- /dev/null
> +++ b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
> @@ -0,0 +1,190 @@
> + Apache License
> +   Version 2.0, January 2004
> +http://www.apache.org/licenses/

> diff --git a/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT 
> b/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT
> new file mode 100644
> index 000..2fb2a37ad7f
> --- /dev/null
> +++ b/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT
> @@ -0,0 +1,27 @@
> +MIT License
> +
> +Copyright (c) 2021 The fast_float authors

You also need to include the README file, which makes it clear that
recipients can choose between Apache and MIT.  GCC needs to use the MIT
option, I think.

Thanks,
Florian



Re: [PATCH][GCC] arm: add armv9-a architecture to -march

2021-11-16 Thread Christophe Lyon via Gcc-patches
Hi,


On Tue, Nov 9, 2021 at 12:36 PM Przemyslaw Wirkus via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> > > > -Original Message-
> > > > From: Przemyslaw Wirkus
> > > > Sent: 18 October 2021 10:37
> > > > To: gcc-patches@gcc.gnu.org
> > > > Cc: Richard Earnshaw ; Ramana
> > > > Radhakrishnan ; Kyrylo Tkachov
> > > > ; ni...@redhat.com
> > > > Subject: [PATCH][GCC] arm: add armv9-a architecture to -march
> > > >
> > > > Hi,
> > > >
> > > > This patch is adding `armv9-a` to -march in Arm GCC.
> > > >
> > > > In this patch:
> > > >   + Add `armv9-a` to -march.
> > > >   + Update multilib with armv9-a and armv9-a+simd.
> > > >
> > > > After this patch three additional multilib directories are available:
> > > >
> > > > $ arm-none-eabi-gcc --print-multi-lib .; [...vanilla multi-lib
> > > > dirs...] thumb/v9-a/nofp;@mthumb@march=armv9-a@mfloat-abi=soft
> > > > thumb/v9-a+simd/softfp;@mthumb@march=armv9-a+simd@mfloat-
> > > > abi=softfp
> > > > thumb/v9-a+simd/hard;@mthumb@march=armv9-a+simd@mfloat-
> > > > abi=hard
> > > >
>

This is causing a GCC build failure when using "old" binutils (I'm using
2.36.1), because the new -march=armv9-a option is not supported. This breaks
the multilib support.

I don't remember how we handled similar cases in the past. Is that just
"expected", and "current" GCC needs "current" binutils, or should we have a
multilib list dependent on the actual binutils support? (I think this is not
the case, and it sounds like an undesirable extra complication in an already
overcrowded multilib Makefile.)

Christophe

> > > > New multi-lib directories under
> > > > $GCC_INSTALL_DIR/lib/gcc/arm-none-eabi/12.0.0/thumb are created:
> > > >
> > > > thumb/
> > > > +--- v9-a
> > > > ||--- nofp
> > > > |
> > > > +--- v9-a+simd
> > > >  |--- hard
> > > >  |--- softfp
> > > >
> > > > Regtested on arm-none-eabi cross and no issues.
> > > >
> > > > OK for master?
>
> Thanks.
>
> commit 32ba7860ccaddd5219e6dae94a3d0653e124c9dd
>
> > Ok.
> > Thanks,
> > Kyrill
> >
> >
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >   * config/arm/arm-cpus.in (armv9): New define.
> > > >   (ARMv9a): New group.
> > > >   (armv9-a): New arch definition.
> > > >   * config/arm/arm-tables.opt: Regenerate.
> > > >   * config/arm/arm.h (BASE_ARCH_9A): New arch enum value.
> > > >   * config/arm/t-aprofile: Added armv9-a and armv9+simd.
> > > >   * config/arm/t-arm-elf: Added arm9-a, v9_fps and all_v9_archs
> > > >   to MULTILIB_MATCHES.
> > > >   * config/arm/t-multilib: Added v9_a_nosimd_variants and
> > > >   v9_a_simd_variants to MULTILIB_MATCHES.
> > > >   * doc/invoke.texi: Update docs.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >   * gcc.target/arm/multilib.exp: Update test with armv9-a entries.
> > > >   * lib/target-supports.exp (v9a): Add new armflag.
> > > >   (__ARM_ARCH_9A__): Add new armdef.
> > > >
> > > > --
> > > > kind regards,
> > > > Przemyslaw Wirkus
>
>


[PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

2021-11-16 Thread Kong, Lingling via Gcc-patches
Hi,

vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with 
-mf16c. So added define_insn extendhfsf2 and truncsfhf2 for target_f16c.

OK for master?

gcc/ChangeLog:

PR target/102811
* config/i386/i386.md (extendhfsf2): Add extendhfsf2 for f16c.
(extendhfdf2): Split extendhf2 into separate extendhfsf2, 
extendhfdf2.
(truncsfhf2): Likewise.
(truncdfhf2): Likewise.

gcc/testsuite/ChangeLog:

PR target/102811
* gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: New test.
---
 gcc/config/i386/i386.md   | 48 +++
 .../i386/avx512vl-vcvtps2ph-pr102811.c| 10 
 2 files changed, 49 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6eb9de81921..c5415475342 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4574,15 +4574,30 @@
   emit_move_insn (operands[0], CONST0_RTX (V2DFmode));
 })
 
-(define_insn "extendhf2"
-  [(set (match_operand:MODEF 0 "nonimm_ssenomem_operand" "=v")
-(float_extend:MODEF
+(define_insn "extendhfsf2"
+  [(set (match_operand:SF 0 "register_operand" "=v")
+   (float_extend:SF
+ (match_operand:HF 1 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512FP16 || TARGET_F16C || TARGET_AVX512VL"
+{
+  if (TARGET_AVX512FP16)
+return "vcvtsh2ss\t{%1, %0, %0|%0, %0, %1}";
+  else
+return "vcvtph2ps\t{%1, %0|%0, %1}";
+}
+  [(set_attr "type" "ssecvt")
+   (set_attr "prefix" "maybe_evex")
+   (set_attr "mode" "SF")])
+
+(define_insn "extendhfdf2"
+  [(set (match_operand:DF 0 "nonimm_ssenomem_operand" "=v")
+   (float_extend:DF
  (match_operand:HF 1 "nonimmediate_operand" "vm")))]
   "TARGET_AVX512FP16"
-  "vcvtsh2\t{%1, %0, %0|%0, %0, %1}"
+  "vcvtsh2sd\t{%1, %0, %0|%0, %0, %1}"
   [(set_attr "type" "ssecvt")
(set_attr "prefix" "evex")
-   (set_attr "mode" "")])
+   (set_attr "mode" "DF")])
 
 
 (define_expand "extendxf2"
@@ -4766,12 +4781,27 @@
 
 ;; Conversion from {SF,DF}mode to HFmode.
 
-(define_insn "trunchf2"
+(define_insn "truncsfhf2"
+  [(set (match_operand:HF 0 "register_operand" "=v")
+   (float_truncate:HF
+ (match_operand:SF 1 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512FP16 || TARGET_F16C || TARGET_AVX512VL"
+  {
+if (TARGET_AVX512FP16)
+  return "vcvtss2sh\t{%1, %d0|%d0, %1}";
+else
+  return "vcvtps2ph\t{0, %1, %0|%0, %1, 0}";
+  }
+  [(set_attr "type" "ssecvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "HF")])
+
+(define_insn "truncdfhf2"
   [(set (match_operand:HF 0 "register_operand" "=v")
-   (float_truncate:HF
- (match_operand:MODEF 1 "nonimmediate_operand" "vm")))]
+   (float_truncate:HF
+ (match_operand:DF 1 "nonimmediate_operand" "vm")))]
   "TARGET_AVX512FP16"
-  "vcvt2sh\t{%1, %d0|%d0, %1}"
+  "vcvtsd2sh\t{%1, %d0|%d0, %1}"
   [(set_attr "type" "ssecvt")
(set_attr "prefix" "evex")
(set_attr "mode" "HF")])
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c 
b/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c
new file mode 100644
index 000..ab44a304a03
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mf16c -mno-avx512fp16" } */
+/* { dg-final { scan-assembler-times "vcvtph2ps\[ \\t\]" 2 } } */
+/* { dg-final { scan-assembler-times "vcvtps2ph\[ \\t\]" 1 } } */
+/* { dg-final { scan-assembler-not "__truncsfhf2\[ \\t\]"} } */
+/* { dg-final { scan-assembler-not "__extendhfsf2\[ \\t\]"} } */
+_Float16 test (_Float16 a, _Float16 b)
+{
+  return a + b;
+}
--
2.18.1



Re: Basic kill analysis for modref

2021-11-16 Thread Jan Hubicka via Gcc-patches
>    chain_map  isn't initialized.
> 
> This caused:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262
> 

Hi,
this is the patch I committed that moves the misplaced hunk.

gcc/ChangeLog:

PR ipa/103262
* ipa-modref.c (merge_call_side_effects): Fix uninitialized
access.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/modref-dse-5.c: New test.

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index df4612bbff9..4784f68f585 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -964,38 +980,6 @@ merge_call_side_effects (modref_summary *cur_summary,
   if (flags & (ECF_CONST | ECF_NOVOPS))
 return changed;
 
-  if (always_executed
-  && callee_summary->kills.length ()
-  && (!cfun->can_throw_non_call_exceptions
- || !stmt_could_throw_p (cfun, stmt)))
-{
-  /* Watch for self recursive updates.  */
-  auto_vec saved_kills;
-
-  saved_kills.reserve_exact (callee_summary->kills.length ());
-  saved_kills.splice (callee_summary->kills);
-  for (auto kill : saved_kills)
-   {
- if (kill.parm_index >= (int)parm_map.length ())
-   continue;
- modref_parm_map &m
- = kill.parm_index == MODREF_STATIC_CHAIN_PARM
-   ? chain_map
-   : parm_map[kill.parm_index];
- if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
- || m.parm_index == MODREF_UNKNOWN_PARM
- || m.parm_index == MODREF_RETSLOT_PARM
- || !m.parm_offset_known)
-   continue;
- modref_access_node n = kill;
- n.parm_index = m.parm_index;
- n.parm_offset += m.parm_offset;
- if (modref_access_node::insert_kill (cur_summary->kills, n,
-  record_adjustments))
-   changed = true;
-   }
-}
-
   /* We can not safely optimize based on summary of callee if it does
  not always bind to current def: it is possible that memory load
  was optimized out earlier which may not happen in the interposed
@@ -1043,6 +1027,38 @@ merge_call_side_effects (modref_summary *cur_summary,
   if (dump_file)
 fprintf (dump_file, "\n");
 
+  if (always_executed
+  && callee_summary->kills.length ()
+  && (!cfun->can_throw_non_call_exceptions
+ || !stmt_could_throw_p (cfun, stmt)))
+{
+  /* Watch for self recursive updates.  */
+  auto_vec saved_kills;
+
+  saved_kills.reserve_exact (callee_summary->kills.length ());
+  saved_kills.splice (callee_summary->kills);
+  for (auto kill : saved_kills)
+   {
+ if (kill.parm_index >= (int)parm_map.length ())
+   continue;
+ modref_parm_map &m
+ = kill.parm_index == MODREF_STATIC_CHAIN_PARM
+   ? chain_map
+   : parm_map[kill.parm_index];
+ if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
+ || m.parm_index == MODREF_UNKNOWN_PARM
+ || m.parm_index == MODREF_RETSLOT_PARM
+ || !m.parm_offset_known)
+   continue;
+ modref_access_node n = kill;
+ n.parm_index = m.parm_index;
+ n.parm_offset += m.parm_offset;
+ if (modref_access_node::insert_kill (cur_summary->kills, n,
+  record_adjustments))
+   changed = true;
+   }
+}
+
   /* Merge with callee's summary.  */
   changed |= cur_summary->loads->merge (callee_summary->loads, &parm_map,
&chain_map, record_adjustments);
+/* { dg-final { scan-tree-dump "Deleted dead store: kill_me" "dse2" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-5.c 
b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-5.c
new file mode 100644
index 000..ad35b70136f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-5.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse2-details"  } */
+struct a {int a,b,c;};
+__attribute__ ((noinline))
+void
+kill_me (struct a *a)
+{
+  a->a=0;
+  a->b=0;
+  a->c=0;
+}
+__attribute__ ((noinline))
+int
+wrap(int b, struct a *a)
+{
+   kill_me (a);
+   return b;
+}
+__attribute__ ((noinline))
+void
+my_pleasure (struct a *a)
+{
+  a->a=1;
+  a->c=2;
+}
+__attribute__ ((noinline))
+int
+wrap2(int b, struct a *a)
+{
+   my_pleasure (a);
+   return b;
+}
+
+int
+set (struct a *a)
+{
+  wrap (0, a);
+  int ret = wrap2 (0, a);
+  //int ret = my_pleasure (a);
+  a->b=1;
+  return ret;
+}
+/* { dg-final { scan-tree-dump "Deleted dead store: wrap" "dse2" } } */
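
As a rough illustration of why the test expects the call to wrap to be
removed, here is a minimal runnable sketch. It assumes nothing beyond the
test source above; set_with_kill, set_without_kill and same_result are
hypothetical helper names, and the wrappers from the test are flattened for
brevity. Every field written by kill_me is overwritten before it can be
read, so dropping the call does not change the final state.

```c
#include <assert.h>

struct a { int a, b, c; };

/* Same stores as kill_me in the test: all three fields are cleared.  */
static void kill_me(struct a *p) { p->a = 0; p->b = 0; p->c = 0; }

/* Same stores as my_pleasure in the test: fields a and c are rewritten.  */
static void my_pleasure(struct a *p) { p->a = 1; p->c = 2; }

/* Mirrors set() from the test, with the wrappers inlined by hand.  */
static void set_with_kill(struct a *p)
{
  kill_me(p);
  my_pleasure(p);
  p->b = 1;
}

/* The same sequence with the first call dropped (hypothetical helper).  */
static void set_without_kill(struct a *p)
{
  my_pleasure(p);
  p->b = 1;
}

/* Returns 1 when both variants leave the struct in the same state.  */
static int same_result(void)
{
  struct a x = {7, 8, 9}, y = {7, 8, 9};
  set_with_kill(&x);
  set_without_kill(&y);
  assert(x.b == 1 && y.b == 1);
  return x.a == y.a && x.b == y.b && x.c == y.c;
}
```

This only shows the observable equivalence; whether the compiler actually
removes the call depends on the modref kill summaries the patch propagates.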


[PATCH] i386: add alias for f*mul_*ch intrinsics

2021-11-16 Thread Kong, Lingling via Gcc-patches
Hi,

This patch is to add alias for f*mul_*ch intrinsics. 

Ok for master?

gcc/ChangeLog:

* config/i386/avx512fp16intrin.h (_mm512_mul_pch): Add alias for 
_mm512_fmul_pch.
(_mm512_mask_mul_pch): Likewise.
(_mm512_maskz_mul_pch): Likewise.
(_mm512_mul_round_pch): Likewise.
(_mm512_mask_mul_round_pch): Likewise.
(_mm512_maskz_mul_round_pch): Likewise.
(_mm512_cmul_pch): Likewise.
(_mm512_mask_cmul_pch): Likewise.
(_mm512_maskz_cmul_pch): Likewise.
(_mm512_cmul_round_pch): Likewise.
(_mm512_mask_cmul_round_pch): Likewise.
(_mm512_maskz_cmul_round_pch): Likewise.
(_mm_mul_sch): Likewise.
(_mm_mask_mul_sch): Likewise.
(_mm_maskz_mul_sch): Likewise.
(_mm_mul_round_sch): Likewise.
(_mm_mask_mul_round_sch): Likewise.
(_mm_maskz_mul_round_sch): Likewise.
(_mm_cmul_sch): Likewise.
(_mm_mask_cmul_sch): Likewise.
(_mm_maskz_cmul_sch): Likewise.
(_mm_cmul_round_sch): Likewise.
(_mm_mask_cmul_round_sch): Likewise.
(_mm_maskz_cmul_round_sch): Likewise.
* config/i386/avx512fp16vlintrin.h (_mm_mul_pch): Likewise.
(_mm_mask_mul_pch): Likewise.
(_mm_maskz_mul_pch): Likewise.
(_mm256_mul_pch): Likewise.
(_mm256_mask_mul_pch): Likewise.
(_mm256_maskz_mul_pch): Likewise.
(_mm_cmul_pch): Likewise.
(_mm_mask_cmul_pch): Likewise.
(_mm_maskz_cmul_pch): Likewise.
(_mm256_cmul_pch): Likewise.
(_mm256_mask_cmul_pch): Likewise.
(_mm256_maskz_cmul_pch): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-vfcmulcph-1a.c: Add new test for alias.
* gcc.target/i386/avx512fp16-vfcmulcsh-1a.c: Likewise.
* gcc.target/i386/avx512fp16-vfmulcph-1a.c: Likewise.
* gcc.target/i386/avx512fp16-vfmulcsh-1a.c: Likewise.
* gcc.target/i386/avx512fp16vl-vfcmulcph-1a.c: Likewise.
* gcc.target/i386/avx512fp16vl-vfmulcph-1a.c: Likewise.
---
 gcc/config/i386/avx512fp16intrin.h| 39 +++
 gcc/config/i386/avx512fp16vlintrin.h  | 17 
 .../gcc.target/i386/avx512fp16-vfcmulcph-1a.c | 19 ++---
 .../gcc.target/i386/avx512fp16-vfcmulcsh-1a.c | 19 ++---
 .../gcc.target/i386/avx512fp16-vfmulcph-1a.c  | 19 ++---
 .../gcc.target/i386/avx512fp16-vfmulcsh-1a.c  | 19 ++---
 .../i386/avx512fp16vl-vfcmulcph-1a.c  | 20 +++---
 .../i386/avx512fp16vl-vfmulcph-1a.c   | 20 +++---
 8 files changed, 136 insertions(+), 36 deletions(-)

diff --git a/gcc/config/i386/avx512fp16intrin.h 
b/gcc/config/i386/avx512fp16intrin.h
index 44c5e24f234..fe73e693897 100644
--- a/gcc/config/i386/avx512fp16intrin.h
+++ b/gcc/config/i386/avx512fp16intrin.h
@@ -7162,6 +7162,45 @@ _mm512_set1_pch (_Float16 _Complex __A)
   return (__m512h) _mm512_set1_ps (u.b);
 }
 
+// intrinsics below are alias for f*mul_*ch
+#define _mm512_mul_pch(A, B) _mm512_fmul_pch ((A), (B))
+#define _mm512_mask_mul_pch(W, U, A, B)  \
+  _mm512_mask_fmul_pch ((W), (U), (A), (B))
+#define _mm512_maskz_mul_pch(U, A, B) _mm512_maskz_fmul_pch ((U), (A), (B))
+#define _mm512_mul_round_pch(A, B, R) _mm512_fmul_round_pch ((A), (B), (R))
+#define _mm512_mask_mul_round_pch(W, U, A, B, R) \
+  _mm512_mask_fmul_round_pch ((W), (U), (A), (B), (R))
+#define _mm512_maskz_mul_round_pch(U, A, B, R)   \
+  _mm512_maskz_fmul_round_pch ((U), (A), (B), (R))
+
+#define _mm512_cmul_pch(A, B) _mm512_fcmul_pch ((A), (B))
+#define _mm512_mask_cmul_pch(W, U, A, B) \
+  _mm512_mask_fcmul_pch ((W), (U), (A), (B))
+#define _mm512_maskz_cmul_pch(U, A, B) _mm512_maskz_fcmul_pch ((U), (A), (B))
+#define _mm512_cmul_round_pch(A, B, R) _mm512_fcmul_round_pch ((A), (B), (R))
+#define _mm512_mask_cmul_round_pch(W, U, A, B, R) \
+  _mm512_mask_fcmul_round_pch ((W), (U), (A), (B), (R))
+#define _mm512_maskz_cmul_round_pch(U, A, B, R)  \
+  _mm512_maskz_fcmul_round_pch ((U), (A), (B), (R))
+
+#define _mm_mul_sch(A, B) _mm_fmul_sch ((A), (B))
+#define _mm_mask_mul_sch(W, U, A, B) _mm_mask_fmul_sch ((W), (U), (A), (B))
+#define _mm_maskz_mul_sch(U, A, B) _mm_maskz_fmul_sch ((U), (A), (B))
+#define _mm_mul_round_sch(A, B, R) _mm_fmul_round_sch ((A), (B), (R))
+#define _mm_mask_mul_round_sch(W, U, A, B, R) \
+  _mm_mask_fmul_round_sch ((W), (U), (A), (B), (R))
+#define _mm_maskz_mul_round_sch(U, A, B, R)  \
+  _mm_maskz_fmul_round_sch ((U), (A), (B), (R))
+
+#define _mm_cmul_sch(A, B) _mm_fcmul_sch ((A), (B))
+#define _mm_mask_cmul_sch(W, U, A, B) _mm_mask_fcmul_sch ((W), (U), (A), (B))
+#define _mm_maskz_cmul_sch(U, A, B) _mm_maskz_fcmul_sch ((U), (A), (B))
+#define _mm_cmul_round_sch(A, B, R) _mm_fcmul_round_sch ((A), (B), (R))
+#define

Re: [PATCH] i386: add alias for f*mul_*ch intrinsics

2021-11-16 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 16, 2021 at 4:23 PM Kong, Lingling via Gcc-patches
 wrote:
>
> Hi,
>
> This patch is to add alias for f*mul_*ch intrinsics.
>
> Ok for master?
This patch just adds some macro definitions (new aliases for
intrinsics) to the header file, so I think it should be low risk.
Considering that the Intel intrinsics guide has been updated with
those aliases, it would be inconvenient if they were not in the
latest GCC, so I think we should install this.
OK if there are no other objections.
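
For readers unfamiliar with the pattern, the aliasing amounts to a
function-like macro that forwards its arguments to the existing name. A
minimal sketch in plain C follows; demo_fmul and demo_mul are hypothetical
stand-ins, not real intrinsics, and ordinary int arithmetic replaces the
vector types so that it builds without AVX512-FP16 support.

```c
#include <assert.h>

/* Hypothetical stand-in for an existing intrinsic such as _mm512_fmul_pch.  */
static inline int demo_fmul(int a, int b) { return a * b; }

/* The alias: each argument is parenthesized, as in the patch, so that
   expression arguments are forwarded unchanged.  */
#define demo_mul(A, B) demo_fmul ((A), (B))

/* Returns 1 when the alias and the original name give the same result.  */
static int demo_alias_check(void)
{
  int via_alias = demo_mul(2 + 1, 4);
  int direct = demo_fmul(3, 4);
  assert(via_alias == direct);
  return via_alias == 12;
}
```

Because the alias is a macro, it adds no new symbol and no new code path,
which is consistent with the low-risk assessment above.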
>
> gcc/ChangeLog:
>
> * config/i386/avx512fp16intrin.h (_mm512_mul_pch): Add alias for 
> _mm512_fmul_pch.
> (_mm512_mask_mul_pch): Likewise.
> (_mm512_maskz_mul_pch): Likewise.
> (_mm512_mul_round_pch): Likewise.
> (_mm512_mask_mul_round_pch): Likewise.
> (_mm512_maskz_mul_round_pch): Likewise.
> (_mm512_cmul_pch): Likewise.
> (_mm512_mask_cmul_pch): Likewise.
> (_mm512_maskz_cmul_pch): Likewise.
> (_mm512_cmul_round_pch): Likewise.
> (_mm512_mask_cmul_round_pch): Likewise.
> (_mm512_maskz_cmul_round_pch): Likewise.
> (_mm_mul_sch): Likewise.
> (_mm_mask_mul_sch): Likewise.
> (_mm_maskz_mul_sch): Likewise.
> (_mm_mul_round_sch): Likewise.
> (_mm_mask_mul_round_sch): Likewise.
> (_mm_maskz_mul_round_sch): Likewise.
> (_mm_cmul_sch): Likewise.
> (_mm_mask_cmul_sch): Likewise.
> (_mm_maskz_cmul_sch): Likewise.
> (_mm_cmul_round_sch): Likewise.
> (_mm_mask_cmul_round_sch): Likewise.
> (_mm_maskz_cmul_round_sch): Likewise.
> * config/i386/avx512fp16vlintrin.h (_mm_mul_pch): Likewise.
> (_mm_mask_mul_pch): Likewise.
> (_mm_maskz_mul_pch): Likewise.
> (_mm256_mul_pch): Likewise.
> (_mm256_mask_mul_pch): Likewise.
> (_mm256_maskz_mul_pch): Likewise.
> (_mm_cmul_pch): Likewise.
> (_mm_mask_cmul_pch): Likewise.
> (_mm_maskz_cmul_pch): Likewise.
> (_mm256_cmul_pch): Likewise.
> (_mm256_mask_cmul_pch): Likewise.
> (_mm256_maskz_cmul_pch): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512fp16-vfcmulcph-1a.c: Add new test for alias.
> * gcc.target/i386/avx512fp16-vfcmulcsh-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vfmulcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vfmulcsh-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vfcmulcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vfmulcph-1a.c: Likewise.
> ---
>  gcc/config/i386/avx512fp16intrin.h| 39 +++
>  gcc/config/i386/avx512fp16vlintrin.h  | 17 
>  .../gcc.target/i386/avx512fp16-vfcmulcph-1a.c | 19 ++---  
> .../gcc.target/i386/avx512fp16-vfcmulcsh-1a.c | 19 ++---  
> .../gcc.target/i386/avx512fp16-vfmulcph-1a.c  | 19 ++---  
> .../gcc.target/i386/avx512fp16-vfmulcsh-1a.c  | 19 ++---
>  .../i386/avx512fp16vl-vfcmulcph-1a.c  | 20 +++---
>  .../i386/avx512fp16vl-vfmulcph-1a.c   | 20 +++---
>  8 files changed, 136 insertions(+), 36 deletions(-)
>
> diff --git a/gcc/config/i386/avx512fp16intrin.h 
> b/gcc/config/i386/avx512fp16intrin.h
> index 44c5e24f234..fe73e693897 100644
> --- a/gcc/config/i386/avx512fp16intrin.h
> +++ b/gcc/config/i386/avx512fp16intrin.h
> @@ -7162,6 +7162,45 @@ _mm512_set1_pch (_Float16 _Complex __A)
>return (__m512h) _mm512_set1_ps (u.b);  }
>
> +// intrinsics below are alias for f*mul_*ch #define _mm512_mul_pch(A,
> +B) _mm512_fmul_pch ((A), (B))
> +#define _mm512_mask_mul_pch(W, U, A, B)  
> \
> +  _mm512_mask_fmul_pch ((W), (U), (A), (B)) #define
> +_mm512_maskz_mul_pch(U, A, B) _mm512_maskz_fmul_pch ((U), (A), (B))
> +#define _mm512_mul_round_pch(A, B, R) _mm512_fmul_round_pch ((A), (B), (R))
> +#define _mm512_mask_mul_round_pch(W, U, A, B, R) \
> +  _mm512_mask_fmul_round_pch ((W), (U), (A), (B), (R))
> +#define _mm512_maskz_mul_round_pch(U, A, B, R)   \
> +  _mm512_maskz_fmul_round_pch ((U), (A), (B), (R))
> +
> +#define _mm512_cmul_pch(A, B) _mm512_fcmul_pch ((A), (B))
> +#define _mm512_mask_cmul_pch(W, U, A, B) \
> +  _mm512_mask_fcmul_pch ((W), (U), (A), (B)) #define
> +_mm512_maskz_cmul_pch(U, A, B) _mm512_maskz_fcmul_pch ((U), (A), (B))
> +#define _mm512_cmul_round_pch(A, B, R) _mm512_fcmul_round_pch ((A), (B), (R))
> +#define _mm512_mask_cmul_round_pch(W, U, A, B, R)\
> +  _mm512_mask_fcmul_round_pch ((W), (U), (A), (B), (R))
> +#define _mm512_maskz_cmul_round_pch(U, A, B, R)  
> \
> +  _mm512_maskz_fcmul_round_pch ((U), (A), (B), (R))
> +
> +#define _mm_mul_sch(A, B) _mm_fmul_sch ((A), (B)) #define
> +_mm_mask_mul_sch(W, U, A, B) _mm_mask_fmul_sch ((W), (U), (A), (B))
> +#define _mm_maskz_mul_sch(U, A, B) _mm_maskz_fmul_sch ((U), (A), (B))
> +#def

Re: [PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

2021-11-16 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 16, 2021 at 4:15 PM Kong, Lingling via Gcc-patches
 wrote:
>
> Hi,
>
> vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with 
> -mf16c. So added define_insn extendhfsf2 and truncsfhf2 for target_f16c.
>
> OK for master?
>
> gcc/ChangeLog:
>
> PR target/102811
> * config/i386/i386.md (extendhfsf2): Add extendhfsf2 for f16c.
> (extendhfdf2): Split extendhf2 into separate extendhfsf2, 
> extendhfdf2.
> (truncsfhf2): Likewise.
> (truncdfhf2): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> PR target/102811
> * gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: New test.
> ---
>  gcc/config/i386/i386.md   | 48 +++
>  .../i386/avx512vl-vcvtps2ph-pr102811.c| 10 
>  2 files changed, 49 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 6eb9de81921..c5415475342 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -4574,15 +4574,30 @@
>emit_move_insn (operands[0], CONST0_RTX (V2DFmode));
>  })
>
> -(define_insn "extendhf2"
> -  [(set (match_operand:MODEF 0 "nonimm_ssenomem_operand" "=v")
> -(float_extend:MODEF
> +(define_insn "extendhfsf2"
> +  [(set (match_operand:SF 0 "register_operand" "=v")
> +   (float_extend:SF
> + (match_operand:HF 1 "nonimmediate_operand" "vm")))]
> +  "TARGET_AVX512FP16 || TARGET_F16C || TARGET_AVX512VL"
> +{
> +  if (TARGET_AVX512FP16)
> +return "vcvtsh2ss\t{%1, %0, %0|%0, %0, %1}";
> +  else
> +return "vcvtph2ps\t{%1, %0|%0, %1}";
> +}
> +  [(set_attr "type" "ssecvt")
> +   (set_attr "prefix" "maybe_evex")
> +   (set_attr "mode" "SF")])
> +
> +(define_insn "extendhfdf2"
> +  [(set (match_operand:DF 0 "nonimm_ssenomem_operand" "=v")
> +   (float_extend:DF
>   (match_operand:HF 1 "nonimmediate_operand" "vm")))]
>"TARGET_AVX512FP16"
> -  "vcvtsh2\t{%1, %0, %0|%0, %0, %1}"
> +  "vcvtsh2sd\t{%1, %0, %0|%0, %0, %1}"
>[(set_attr "type" "ssecvt")
> (set_attr "prefix" "evex")
> -   (set_attr "mode" "")])
> +   (set_attr "mode" "DF")])
>
>
>  (define_expand "extendxf2"
> @@ -4766,12 +4781,27 @@
>
>  ;; Conversion from {SF,DF}mode to HFmode.
>
> -(define_insn "trunchf2"
> +(define_insn "truncsfhf2"
> +  [(set (match_operand:HF 0 "register_operand" "=v")
> +   (float_truncate:HF
> + (match_operand:SF 1 "nonimmediate_operand" "vm")))]
> +  "TARGET_AVX512FP16 || TARGET_F16C || TARGET_AVX512VL"
> +  {
> +if (TARGET_AVX512FP16)
> +  return "vcvtss2sh\t{%1, %d0|%d0, %1}";
> +else
> +  return "vcvtps2ph\t{0, %1, %0|%0, %1, 0}";
> +  }
> +  [(set_attr "type" "ssecvt")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "HF")])
> +
> +(define_insn "truncdfhf2"
>[(set (match_operand:HF 0 "register_operand" "=v")
> -   (float_truncate:HF
> - (match_operand:MODEF 1 "nonimmediate_operand" "vm")))]
> +   (float_truncate:HF
> + (match_operand:DF 1 "nonimmediate_operand" "vm")))]
>"TARGET_AVX512FP16"
> -  "vcvt2sh\t{%1, %d0|%d0, %1}"
> +  "vcvtsd2sh\t{%1, %d0|%d0, %1}"
>[(set_attr "type" "ssecvt")
> (set_attr "prefix" "evex")
> (set_attr "mode" "HF")])
> diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c 
> b/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c
> new file mode 100644
> index 000..ab44a304a03
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mf16c -mno-avx512fp16" } */
> +/* { dg-final { scan-assembler-times "vcvtph2ps\[ \\t\]" 2 } } */
> +/* { dg-final { scan-assembler-times "vcvtps2ph\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-not "__truncsfhf2\[ \\t\]"} } */
> +/* { dg-final { scan-assembler-not "__extendhfsf2\[ \\t\]"} } */
> +_Float16 test (_Float16 a, _Float16 b)
> +{
> +  return a + b;
> +}
> --
> 2.18.1
>


-- 
BR,
Hongtao


Re: [PATCH] x86_64: Avoid rorx rotation instructions with -Os

2021-11-16 Thread Uros Bizjak via Gcc-patches
On Mon, Nov 15, 2021 at 2:54 PM Roger Sayle  wrote:
>
>
> This patch teaches the i386 backend to avoid using BMI2's rorx
> instructions when optimizing for size.  The benefits are shown
> with the following example:
>
> unsigned int ror1(unsigned int x) { return (x >> 1) | (x << 31); }
> unsigned int ror2(unsigned int x) { return (x >> 2) | (x << 30); }
> unsigned int rol2(unsigned int x) { return (x >> 30) | (x << 2); }
> unsigned int rol1(unsigned int x) { return (x >> 31) | (x << 1); }
>
> which currently with -Os -march=cascadelake generates:
>
> ror1:   rorx$1, %edi, %eax  // 6 bytes
> ret
> ror2:   rorx$2, %edi, %eax  // 6 bytes
> ret
> rol2:   rorx$30, %edi, %eax // 6 bytes
> ret
> rol1:   rorx$31, %edi, %eax // 6 bytes
> ret
>
> but with this patch now generates:
>
> ror1:   movl%edi, %eax  // 2 bytes
> rorl%eax// 2 bytes
> ret
> ror2:   movl%edi, %eax  // 2 bytes
> rorl$2, %eax// 3 bytes
> ret
> rol2:   movl%edi, %eax  // 2 bytes
> roll$2, %eax// 3 bytes
> ret
> rol1:   movl%edi, %eax  // 2 bytes
> roll%eax// 2 bytes
> ret
>
> I've confirmed that this patch is a win on the CSiBE benchmark,
> even though rotations are rare, where for example libmspack/test/md5.o
> shrinks from 5824 bytes to 5632 bytes.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
>
>
> 2021-11-15  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386.md (*bmi2_rorx_1): Make conditional
> on !optimize_function_for_size_p.
> (*3_1): Add preferred_for_size attribute.
> (define_splits): Conditionalize on !optimize_function_for_size_p.
> (*bmi2_rorxsi3_1_zext): Likewise.
> (*si2_1_zext): Add preferred_for_size attribute.
> (define_splits): Conditionalize on !optimize_function_for_size_p.

OK.

Thanks,
Uros.
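
For reference, the rewrite of rol2 from rorx $30 to roll $2 in the quoted
listing relies on a rotate right by n being the same as a rotate left by
32 - n on a 32-bit value. A small sketch that checks this identity; rotr32,
rotl32 and rotations_agree are hypothetical helper names, and the shift
count is assumed to be strictly between 0 and 32, as in the example.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by n bits; valid for 0 < n < 32.  */
static uint32_t rotr32(uint32_t x, unsigned n)
{
  return (x >> n) | (x << (32 - n));
}

/* Rotate left by n bits; valid for 0 < n < 32.  */
static uint32_t rotl32(uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* Returns 1 when rotating right by n matches rotating left by 32 - n.  */
static int rotations_agree(uint32_t x, unsigned n)
{
  assert(n > 0 && n < 32);
  return rotr32(x, n) == rotl32(x, 32 - n);
}
```

By the byte counts quoted above, the mov plus rotate pair takes 4 or 5
bytes against 6 for rorx, which is where the -Os saving comes from.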


Re: [PATCH] gcc: implement AIX-style constructors

2021-11-16 Thread CHIGOT, CLEMENT via Gcc-patches
> Hi David, 
> 
> Here is the new version of the patch. 
> I've moved the startup function in crtcdtors files.
> 
> I'm just wondering if the part dealing with the
> __init_aix_libgcc_cxa_atexit is needed. I'm adding it because
> the destructor created in crtcxa.o is following GCC format and
> thus won't be launched if the flag "-mcdtors=aix" is passed.
> However, as you said, this option might not operate correctly
> if the GCC runtime isn't rebuilt with it.

Gentle Ping. 

Thanks, 
Clément

Re: aix: Add FAT library support for libffi for AIX

2021-11-16 Thread CHIGOT, CLEMENT via Gcc-patches
> Even if GCC64 is able to bootstrap without libffi being a
> FAT library on AIX, the tests for "-maix32" are not working
> without it.
> 
> libffi/ChangeLog:
> 2021-10-21  Clément Chigot  
> 
>       * Makefile.am (tmake_file): Build and install AIX-style FAT
>          libraries.
>       * Makefile.in: Regenerate.
>       * include/Makefile.in: Regenerate.
>       * man/Makefile.in: Regenerate.
>       * testsuite/Makefile.in: Regenerate.
>        * configure (tmake_file): Substitute.
>        * configure.ac: Regenerate.
>        * configure.host (powerpc-*-aix*): Define tmake_file.
>        * src/powerpc/t-aix: New file.
>
> I've already made a PR to libffi itself in order to add the common part of 
> this patch to it. But for now, it's still unmerged: 
> https://github.com/libffi/libffi/pull/661. 

Gentle ping, 

Thanks
Clément


Re: [PATCH] aix: handle 64bit inodes for include directories

2021-11-16 Thread CHIGOT, CLEMENT via Gcc-patches
Hi everyone,

Gentle ping

Thanks,
Clément

From: CHIGOT, CLEMENT 
Sent: Tuesday, October 26, 2021 4:51 PM
To: Jeff Law ; David Malcolm 
Cc: gcc-patches@gcc.gnu.org ; David Edelsohn 

Subject: Re: [PATCH] aix: handle 64bit inodes for include directories

Hi everyone,

Gentle ping on this patch.

Clément

From: CHIGOT, CLEMENT 
Sent: Tuesday, October 12, 2021 10:35 AM
To: Jeff Law ; David Malcolm 
Cc: gcc-patches@gcc.gnu.org ; David Edelsohn 

Subject: Re: [PATCH] aix: handle 64bit inodes for include directories

Hi Jeff,

Any update on this patch?
As it's dealing with configure files, I would like to have it merged
asap before any conflicts appear.

Thanks,
Clément


Re: [PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

2021-11-16 Thread Uros Bizjak via Gcc-patches
On Tue, Nov 16, 2021 at 9:15 AM Kong, Lingling via Gcc-patches
 wrote:
>
> Hi,
>
> vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with 
> -mf16c. So added define_insn extendhfsf2 and truncsfhf2 for target_f16c.
>
> OK for master?

No, this is the wrong approach. There can be invalid values in the
high elements of the vector, so these should be cleared before
conversion.
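
One way to picture the concern, as a minimal sketch at the intrinsics level
rather than in the machine description: vcvtph2ps and vcvtps2ph are packed
instructions, so a scalar conversion should make sure the unused lanes hold
zeros instead of leftover data. The helper names below are hypothetical and
not part of any patch in this thread; the code assumes an x86 CPU with F16C
and GCC or Clang, and uses the target attribute so that -mf16c is not
needed on the command line.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* Extend one binary16 value, given as raw bits, to float.  */
__attribute__((target("f16c")))
static float half_bits_to_float(uint16_t h)
{
  /* _mm_cvtsi32_si128 zeroes everything above the low 32 bits, and the
     zero-extended uint16_t keeps the second 16-bit lane clear, so the
     packed conversion only sees h followed by zeros.  */
  __m128i v = _mm_cvtsi32_si128((int) h);
  return _mm_cvtss_f32(_mm_cvtph_ps(v));
}

/* Truncate one float to binary16 bits; imm8 = 0 is round to nearest even.  */
__attribute__((target("f16c")))
static uint16_t float_to_half_bits(float f)
{
  /* _mm_set_ss puts f in lane 0 and zeroes lanes 1 to 3.  */
  __m128i h = _mm_cvtps_ph(_mm_set_ss(f), 0);
  return (uint16_t) _mm_cvtsi128_si32(h);
}

/* Round trip of 1.0 (0x3C00 in binary16); returns 1 on success.  */
static int half_roundtrip_check(void)
{
  uint16_t one = 0x3C00;
  assert(half_bits_to_float(one) == 1.0f);
  return float_to_half_bits(half_bits_to_float(one)) == one;
}
```

In the machine description the equivalent step would be done on the vector
operand before the conversion insn; the sketch only shows the intent.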

Please see the attached (unfinished) patch and use it as a starting
point. Please note that we can now allow 2-byte values in SSE
registers, so movhi_internal and ix86_can_change_mode_class should be
updated accordingly.

Uros.
>
> gcc/ChangeLog:
>
> PR target/102811
> * config/i386/i386.md (extendhfsf2): Add extendhfsf2 for f16c.
> (extendhfdf2): Split extendhf2 into separate extendhfsf2, 
> extendhfdf2.
> (truncsfhf2): Likewise.
> (truncdfhf2): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> PR target/102811
> * gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: New test.
> ---
>  gcc/config/i386/i386.md   | 48 +++
>  .../i386/avx512vl-vcvtps2ph-pr102811.c| 10 
>  2 files changed, 49 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 6eb9de81921..c5415475342 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -4574,15 +4574,30 @@
>emit_move_insn (operands[0], CONST0_RTX (V2DFmode));
>  })
>
> -(define_insn "extendhf2"
> -  [(set (match_operand:MODEF 0 "nonimm_ssenomem_operand" "=v")
> -(float_extend:MODEF
> +(define_insn "extendhfsf2"
> +  [(set (match_operand:SF 0 "register_operand" "=v")
> +   (float_extend:SF
> + (match_operand:HF 1 "nonimmediate_operand" "vm")))]
> +  "TARGET_AVX512FP16 || TARGET_F16C || TARGET_AVX512VL"
> +{
> +  if (TARGET_AVX512FP16)
> +return "vcvtsh2ss\t{%1, %0, %0|%0, %0, %1}";
> +  else
> +return "vcvtph2ps\t{%1, %0|%0, %1}";
> +}
> +  [(set_attr "type" "ssecvt")
> +   (set_attr "prefix" "maybe_evex")
> +   (set_attr "mode" "SF")])
> +
> +(define_insn "extendhfdf2"
> +  [(set (match_operand:DF 0 "nonimm_ssenomem_operand" "=v")
> +   (float_extend:DF
>   (match_operand:HF 1 "nonimmediate_operand" "vm")))]
>"TARGET_AVX512FP16"
> -  "vcvtsh2\t{%1, %0, %0|%0, %0, %1}"
> +  "vcvtsh2sd\t{%1, %0, %0|%0, %0, %1}"
>[(set_attr "type" "ssecvt")
> (set_attr "prefix" "evex")
> -   (set_attr "mode" "")])
> +   (set_attr "mode" "DF")])
>
>
>  (define_expand "extendxf2"
> @@ -4766,12 +4781,27 @@
>
>  ;; Conversion from {SF,DF}mode to HFmode.
>
> -(define_insn "trunchf2"
> +(define_insn "truncsfhf2"
> +  [(set (match_operand:HF 0 "register_operand" "=v")
> +   (float_truncate:HF
> + (match_operand:SF 1 "nonimmediate_operand" "vm")))]
> +  "TARGET_AVX512FP16 || TARGET_F16C || TARGET_AVX512VL"
> +  {
> +if (TARGET_AVX512FP16)
> +  return "vcvtss2sh\t{%1, %d0|%d0, %1}";
> +else
> +  return "vcvtps2ph\t{0, %1, %0|%0, %1, 0}";
> +  }
> +  [(set_attr "type" "ssecvt")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "HF")])
> +
> +(define_insn "truncdfhf2"
>[(set (match_operand:HF 0 "register_operand" "=v")
> -   (float_truncate:HF
> - (match_operand:MODEF 1 "nonimmediate_operand" "vm")))]
> +   (float_truncate:HF
> + (match_operand:DF 1 "nonimmediate_operand" "vm")))]
>"TARGET_AVX512FP16"
> -  "vcvt2sh\t{%1, %d0|%d0, %1}"
> +  "vcvtsd2sh\t{%1, %d0|%d0, %1}"
>[(set_attr "type" "ssecvt")
> (set_attr "prefix" "evex")
> (set_attr "mode" "HF")])
> diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c 
> b/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c
> new file mode 100644
> index 000..ab44a304a03
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mf16c -mno-avx512fp16" } */
> +/* { dg-final { scan-assembler-times "vcvtph2ps\[ \\t\]" 2 } } */
> +/* { dg-final { scan-assembler-times "vcvtps2ph\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-not "__truncsfhf2\[ \\t\]"} } */
> +/* { dg-final { scan-assembler-not "__extendhfsf2\[ \\t\]"} } */
> +_Float16 test (_Float16 a, _Float16 b)
> +{
> +  return a + b;
> +}
> --
> 2.18.1
>
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9cc903e826b..21a3a45d22c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19462,9 +19462,8 @@ ix86_can_change_mode_class (machine_mode from, machine_mode to,
 disallow a change to these modes, reload will assume it's ok to
 drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
 the vec_dupv4hi pattern.
-NB: AVX512FP16 supports vmovw which can load 16bit data to sse
-register.  */
-  int mov_size = MAYBE_SSE_CLASS_P (regclass) && 

Re: [PATCH] pch: Add support for PCH for relocatable executables

2021-11-16 Thread Jakub Jelinek via Gcc-patches
On Sat, Nov 13, 2021 at 08:32:41PM +, Iain Sandoe wrote:
> IMO both this series
>  - which restores the ability to work with PIE exes but requires a known 
> address for the PCH 
> and the series I posted
>  - which allows a configuration to opt out of PCH anyway
> 
> could be useful - for Darwin I prefer this series.

Yeah, I think we want both and let the users choose.

Finding a hole can be indeed hard on 32-bit VA, but no OS I've seen
randomizes across the whole 44 or 48 or how many bits VA, otherwise e.g.
address sanitizer or thread sanitizer would have no chance to work either.

Having the PCH blob be relocatable would be achievable too, we have all the
information in the GTY for it after all when we are able to relocate it at
PCH saving time, but don't do that currently because it would be more
expensive at PCH restore time.  But perhaps better to do that as a fallback
if we don't manage to get the right slot.
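To make the "relocatable blob" idea concrete, here is a minimal sketch of what fixing up pointers at restore time could look like. It assumes the loader has a flat list of pointer slots to visit; the names rebase_slots and rebase_demo are hypothetical, and this is not how GCC's PCH or GTY machinery is actually organised.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC's PCH code: rebase every recorded pointer
   slot whose value points into the blob's old address range.  Returns the
   number of slots that were adjusted.  */
size_t
rebase_slots (uintptr_t *slots, size_t nslots,
              uintptr_t old_base, uintptr_t new_base, size_t blob_size)
{
  size_t adjusted = 0;
  for (size_t i = 0; i < nslots; i++)
    {
      /* Unsigned subtraction wraps around, so a single comparison covers
         both "below old_base" and "past the end of the blob".  */
      uintptr_t offset = slots[i] - old_base;
      if (offset < blob_size)
        {
          slots[i] = new_base + offset;
          adjusted++;
        }
    }
  return adjusted;
}

/* Small self-check: two slots point into the blob, one points elsewhere.  */
int
rebase_demo (void)
{
  uintptr_t slots[3] = { 0x1000, 0x1ff8, 0x9000 };
  size_t n = rebase_slots (slots, 3, 0x1000, 0x5000, 0x1000);
  return n == 2
         && slots[0] == 0x5000
         && slots[1] == 0x5ff8
         && slots[2] == 0x9000;
}
```

The cost mentioned above shows up in the loop: every pointer in the blob has to be visited on restore, whereas mapping the blob at its saved address needs no per-pointer work.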

Jakub



[PATCH] waccess: Fix up pass_waccess::check_alloc_size_call [PR102009]

2021-11-16 Thread Jakub Jelinek via Gcc-patches
Hi!

This function punts if the builtins have no arguments, but as can be seen
on the testcase, even if it has some arguments but alloc_size attribute's
arguments point to arguments that aren't passed, we get a warning earlier
from the FE but should punt rather than ICE on it.
Other users of alloc_size attribute e.g. in
tree-object-size.c (alloc_object_size) punt similarly and similarly
even in the same TU maybe_warn_nonstring_arg correctly verifies calls have
enough arguments.
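The range check the patch adds can be sketched on its own as below. This is only an illustration: alloc_size_pos_in_range is a made-up name, and the real code works on trees and gimple calls rather than plain integers.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the check in the patch.  POS is the 1-based
   argument position taken from the alloc_size attribute; NARGS is the
   number of arguments the call actually passes.  */
bool
alloc_size_pos_in_range (int pos, unsigned nargs)
{
  int idx = pos - 1;
  /* Casting to unsigned turns a negative index into a huge value, so one
     comparison rejects both "below zero" and "past the last argument".  */
  return (unsigned) idx < nargs;
}
```

For the testcase, realloc carries alloc_size (2) but the call passes a single argument, so the position 2 is out of range and the function would punt.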

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-11-16  Jakub Jelinek  

PR tree-optimization/102009
* gimple-ssa-warn-access.cc (pass_waccess::check_alloc_size_call):
Punt if any of alloc_size arguments is out of bounds vs. number of
call arguments.

* gcc.dg/pr102009.c: New test.

--- gcc/gimple-ssa-warn-access.cc.jj2021-11-09 15:25:15.0 +0100
+++ gcc/gimple-ssa-warn-access.cc   2021-11-15 17:22:44.769580185 +0100
@@ -2335,10 +2335,6 @@ pass_waccess::check_alloca (gcall *stmt)
 void
 pass_waccess::check_alloc_size_call (gcall *stmt)
 {
-  if (gimple_call_num_args (stmt) < 1)
-/* Avoid invalid calls to functions without a prototype.  */
-return;
-
   tree fndecl = gimple_call_fndecl (stmt);
   if (fndecl && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
 {
@@ -2367,13 +2363,19 @@ pass_waccess::check_alloc_size_call (gca
  the actual argument(s) at those indices in ALLOC_ARGS.  */
   int idx[2] = { -1, -1 };
   tree alloc_args[] = { NULL_TREE, NULL_TREE };
+  unsigned nargs = gimple_call_num_args (stmt);
 
   tree args = TREE_VALUE (alloc_size);
   idx[0] = TREE_INT_CST_LOW (TREE_VALUE (args)) - 1;
+  /* Avoid invalid calls to functions without a prototype.  */
+  if ((unsigned) idx[0] >= nargs)
+return;
   alloc_args[0] = call_arg (stmt, idx[0]);
   if (TREE_CHAIN (args))
 {
   idx[1] = TREE_INT_CST_LOW (TREE_VALUE (TREE_CHAIN (args))) - 1;
+  if ((unsigned) idx[1] >= nargs)
+   return;
   alloc_args[1] = call_arg (stmt, idx[1]);
 }
 
--- gcc/testsuite/gcc.dg/pr102009.c.jj  2021-11-15 17:29:19.090162531 +0100
+++ gcc/testsuite/gcc.dg/pr102009.c 2021-11-15 17:30:08.328486037 +0100
@@ -0,0 +1,10 @@
+/* PR tree-optimization/102009 */
+/* { dg-do compile } */
+
+void *realloc ();  /* { dg-message "declared here" } */
+
+void *
+foo (void *p)
+{
+  return realloc (p);  /* { dg-warning "too few arguments to built-in function 'realloc' expecting " } */
+}

Jakub



Re: [PATCH] waccess: Fix up pass_waccess::check_alloc_size_call [PR102009]

2021-11-16 Thread Richard Biener via Gcc-patches
On Tue, 16 Nov 2021, Jakub Jelinek wrote:

> Hi!
> 
> This function punts if the builtins have no arguments, but as can be seen
> on the testcase, even if it has some arguments but alloc_size attribute's
> arguments point to arguments that aren't passed, we get a warning earlier
> from the FE but should punt rather than ICE on it.
> Other users of alloc_size attribute e.g. in
> tree-object-size.c (alloc_object_size) punt similarly and similarly
> even in the same TU maybe_warn_nonstring_arg correctly verifies calls have
> enough arguments.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2021-11-16  Jakub Jelinek  
> 
>   PR tree-optimization/102009
>   * gimple-ssa-warn-access.cc (pass_waccess::check_alloc_size_call):
>   Punt if any of alloc_size arguments is out of bounds vs. number of
>   call arguments.
> 
>   * gcc.dg/pr102009.c: New test.
> 
> --- gcc/gimple-ssa-warn-access.cc.jj  2021-11-09 15:25:15.0 +0100
> +++ gcc/gimple-ssa-warn-access.cc 2021-11-15 17:22:44.769580185 +0100
> @@ -2335,10 +2335,6 @@ pass_waccess::check_alloca (gcall *stmt)
>  void
>  pass_waccess::check_alloc_size_call (gcall *stmt)
>  {
> -  if (gimple_call_num_args (stmt) < 1)
> -/* Avoid invalid calls to functions without a prototype.  */
> -return;
> -
>tree fndecl = gimple_call_fndecl (stmt);
>if (fndecl && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
>  {
> @@ -2367,13 +2363,19 @@ pass_waccess::check_alloc_size_call (gca
>   the actual argument(s) at those indices in ALLOC_ARGS.  */
>int idx[2] = { -1, -1 };
>tree alloc_args[] = { NULL_TREE, NULL_TREE };
> +  unsigned nargs = gimple_call_num_args (stmt);
>  
>tree args = TREE_VALUE (alloc_size);
>idx[0] = TREE_INT_CST_LOW (TREE_VALUE (args)) - 1;
> +  /* Avoid invalid calls to functions without a prototype.  */
> +  if ((unsigned) idx[0] >= nargs)
> +return;
>alloc_args[0] = call_arg (stmt, idx[0]);
>if (TREE_CHAIN (args))
>  {
>idx[1] = TREE_INT_CST_LOW (TREE_VALUE (TREE_CHAIN (args))) - 1;
> +  if ((unsigned) idx[1] >= nargs)
> + return;
>alloc_args[1] = call_arg (stmt, idx[1]);
>  }
>  
> --- gcc/testsuite/gcc.dg/pr102009.c.jj2021-11-15 17:29:19.090162531 +0100
> +++ gcc/testsuite/gcc.dg/pr102009.c   2021-11-15 17:30:08.328486037 +0100
> @@ -0,0 +1,10 @@
> +/* PR tree-optimization/102009 */
> +/* { dg-do compile } */
> +
> +void *realloc ();/* { dg-message "declared here" } */
> +
> +void *
> +foo (void *p)
> +{
> +  return realloc (p);/* { dg-warning "too few arguments to built-in function 'realloc' expecting " } */
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


[committed] openmp: Regimplify operands of GIMPLE_COND in a few more places [PR103208]

2021-11-16 Thread Jakub Jelinek via Gcc-patches
Hi!

As the testcase shows, the non-rectangular loop expansion code didn't
try to regimplify operands of GIMPLE_CONDs it built in some cases.
I have added a helper function which does that and used it in some places
that were regimplifying already to simplify those spots, plus added it
in a couple of other places where it was needed.
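For readers unfamiliar with the term, a non-rectangular loop nest is one where an inner bound depends on an outer iteration variable. The sketch below is not the testcase from the PR (loop-11.c is not shown here); count_triangle is a made-up example, and it assumes a compiler with OpenMP 5.0 non-rectangular loop support when built with -fopenmp. Without -fopenmp the pragma is ignored and the function computes the same result serially.

```c
/* Counts the iterations of a triangular (non-rectangular) loop nest: the
   inner bound J < I depends on the outer iteration variable.  */
long
count_triangle (int n)
{
  long count = 0;
#pragma omp parallel for collapse(2) reduction(+:count)
  for (int i = 0; i < n; i++)
    for (int j = 0; j < i; j++)
      count++;
  return count;
}
```

With collapse(2) the expansion has to compute the total iteration count from bounds like these, which is where the conditions built by the expansion code come from.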

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-11-16  Jakub Jelinek  

PR tree-optimization/103208
* omp-expand.c (expand_omp_build_cond): New function.
(expand_omp_for_init_counts, expand_omp_for_init_vars,
expand_omp_for_static_nochunk, expand_omp_for_static_chunk): Use it.

* c-c++-common/gomp/loop-11.c: New test.

--- gcc/omp-expand.c.jj 2021-11-11 14:35:37.631348121 +0100
+++ gcc/omp-expand.c2021-11-15 20:39:22.666976655 +0100
@@ -1208,6 +1208,28 @@ expand_omp_build_assign (gimple_stmt_ite
 }
 }
 
+/* Prepend or append LHS CODE RHS condition before or after *GSI_P.  */
+
+static gcond *
+expand_omp_build_cond (gimple_stmt_iterator *gsi_p, enum tree_code code,
+  tree lhs, tree rhs, bool after = false)
+{
+  gcond *cond_stmt = gimple_build_cond (code, lhs, rhs, NULL_TREE, NULL_TREE);
+  if (after)
+gsi_insert_after (gsi_p, cond_stmt, GSI_CONTINUE_LINKING);
+  else
+gsi_insert_before (gsi_p, cond_stmt, GSI_SAME_STMT);
+  if (walk_tree (gimple_cond_lhs_ptr (cond_stmt), expand_omp_regimplify_p,
+NULL, NULL)
+  || walk_tree (gimple_cond_rhs_ptr (cond_stmt), expand_omp_regimplify_p,
+   NULL, NULL))
+{
+  gimple_stmt_iterator gsi = gsi_for_stmt (cond_stmt);
+  gimple_regimplify_operands (cond_stmt, &gsi);
+}
+  return cond_stmt;
+}
+
 /* Expand the OpenMP parallel or task directive starting at REGION.  */
 
 static void
@@ -1868,17 +1890,8 @@ expand_omp_for_init_counts (struct omp_f
  n2 = fold_convert (itype, unshare_expr (fd->loops[i].n2));
  n2 = force_gimple_operand_gsi (gsi, n2, true, NULL_TREE,
 true, GSI_SAME_STMT);
- cond_stmt = gimple_build_cond (fd->loops[i].cond_code, n1, n2,
-NULL_TREE, NULL_TREE);
- gsi_insert_before (gsi, cond_stmt, GSI_SAME_STMT);
- if (walk_tree (gimple_cond_lhs_ptr (cond_stmt),
-expand_omp_regimplify_p, NULL, NULL)
- || walk_tree (gimple_cond_rhs_ptr (cond_stmt),
-   expand_omp_regimplify_p, NULL, NULL))
-   {
- *gsi = gsi_for_stmt (cond_stmt);
- gimple_regimplify_operands (cond_stmt, gsi);
-   }
+ cond_stmt = expand_omp_build_cond (gsi, fd->loops[i].cond_code,
+n1, n2);
  e = split_block (entry_bb, cond_stmt);
  basic_block &zero_iter_bb
= i < fd->collapse ? zero_iter1_bb : zero_iter2_bb;
@@ -2075,18 +2088,16 @@ expand_omp_for_init_counts (struct omp_f
  n2e = force_gimple_operand_gsi (&gsi2, n2e, true, NULL_TREE,
  true, GSI_SAME_STMT);
  gcond *cond_stmt
-   = gimple_build_cond (fd->loops[i].cond_code, n1, n2,
-NULL_TREE, NULL_TREE);
- gsi_insert_before (&gsi2, cond_stmt, GSI_SAME_STMT);
+   = expand_omp_build_cond (&gsi2, fd->loops[i].cond_code,
+n1, n2);
  e = split_block (bb1, cond_stmt);
  e->flags = EDGE_TRUE_VALUE;
  e->probability = profile_probability::likely ().guessed ();
  basic_block bb2 = e->dest;
  gsi2 = gsi_after_labels (bb2);
 
- cond_stmt = gimple_build_cond (fd->loops[i].cond_code, n1e, n2e,
-NULL_TREE, NULL_TREE);
- gsi_insert_before (&gsi2, cond_stmt, GSI_SAME_STMT);
+ cond_stmt = expand_omp_build_cond (&gsi2, fd->loops[i].cond_code,
+n1e, n2e);
  e = split_block (bb2, cond_stmt);
  e->flags = EDGE_TRUE_VALUE;
  e->probability = profile_probability::likely ().guessed ();
@@ -2137,9 +2148,8 @@ expand_omp_for_init_counts (struct omp_f
  e->probability = profile_probability::unlikely ().guessed ();
 
  gsi2 = gsi_after_labels (bb3);
- cond_stmt = gimple_build_cond (fd->loops[i].cond_code, n1e, n2e,
-NULL_TREE, NULL_TREE);
- gsi_insert_before (&gsi2, cond_stmt, GSI_SAME_STMT);
+ cond_stmt = expand_omp_build_cond (&gsi2, fd->loops[i].cond_code,
+n1e, n2e);
  e = split_block (bb3, cond_stmt);
  e->flags = EDGE_TRUE_VALUE;
  e->probability = profile_probability::likely ().guessed ();
@@ -2193,9 +2203,8 @@ expand_omp_for_init_counts (struct omp_f
 true, GSI_SAME_STMT);
   

[committed] libgomp: Mark thread_limit clause to target construct as implemented

2021-11-16 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 15, 2021 at 02:00:42PM +0100, Tobias Burnus wrote:
> Fortran: openmp: Add support for thread_limit clause on target
> 
> gcc/fortran/ChangeLog:
> 
>   * openmp.c (OMP_TARGET_CLAUSES): Add thread_limit.
>   * trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to
>   teams.

After the Fortran changes we can mark it as implemented...

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-11-16  Jakub Jelinek  

* libgomp.texi (OpenMP 5.1): Mark thread_limit clause to target
construct as implemented.

--- libgomp/libgomp.texi.jj 2021-10-27 09:24:43.312822017 +0200
+++ libgomp/libgomp.texi2021-11-15 22:29:35.210487522 +0100
@@ -292,7 +292,7 @@ The OpenMP 4.5 specification is fully su
   clauses of the taskloop construct @tab Y @tab
 @item @code{align} clause/modifier in @code{allocate} directive/clause
   and @code{allocator} directive @tab P @tab C/C++ on clause only
-@item @code{thread_limit} clause to @code{target} construct @tab N @tab
+@item @code{thread_limit} clause to @code{target} construct @tab Y @tab
 @item @code{has_device_addr} clause to @code{target} construct @tab N @tab
 @item iterators in @code{target update} motion clauses and @code{map}
   clauses @tab N @tab

Jakub



Re: [PATCH 1/5] libstdc++: Import the fast_float library

2021-11-16 Thread Jonathan Wakely via Gcc-patches
On Tue, 16 Nov 2021 at 08:01, Florian Weimer wrote:
>
> * Patrick Palka via Libstdc:
>
> > This copies the fast_float library[1] into the compiled-in library
> > sources.  We're going to use this library in our floating-point
> > std::from_chars implementation for faster and more portable parsing of
> > binary32/64 decimal strings.
> >
> > [1]: https://github.com/fastfloat/fast_float
> >
> > Series tested on x86_64, i686, ppc64, ppc64le and aarch64, does it
> > look OK for trunk?
>
> Missing Signed-off-by:?

That's not needed if Patrick is still covered by an FSF assignment.

>
> > diff --git a/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE 
> > b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
> > new file mode 100644
> > index 000..26f4398f249
> > --- /dev/null
> > +++ b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
> > @@ -0,0 +1,190 @@
> > + Apache License
> > +   Version 2.0, January 2004
> > +http://www.apache.org/licenses/
>
> > diff --git a/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT 
> > b/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT
> > new file mode 100644
> > index 000..2fb2a37ad7f
> > --- /dev/null
> > +++ b/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT
> > @@ -0,0 +1,27 @@
> > +MIT License
> > +
> > +Copyright (c) 2021 The fast_float authors
>
> You also need to include the README file, which makes it clear that
> recipients can choose between Apache and MIT.  GCC needs to use the MIT
> option, I think.

I think we could use Apache as well, because this code isn't going to
appear in public headers so the problematic clause doesn't apply. But
MIT is simpler.


Re: [PATCH] libstdc++: Merge latest Ryu sources

2021-11-16 Thread Jonathan Wakely via Gcc-patches
On Tue, 16 Nov 2021 at 00:36, Patrick Palka wrote:
>
> The only source change is a speedup to pow5Factor.
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

OK, thanks.


[PATCH] ipa-sra: Testcase that removing a "returns_nonnull" retval works

2021-11-16 Thread Martin Jambor
Hi,

since we can now remove return values of functions with the
returns_nonnull type attribute, I'll feel a bit safer if we test that
this does not ICE when someone attempts to access a non-existent call
LHS.  Eventually we should probably drop the attribute when this happens.

Tested on x86_64-linux, I will push it to master momentarily.

Martin


gcc/testsuite/ChangeLog:

2021-11-15  Martin Jambor  

* gcc.dg/ipa/ipa-sra-ret-nonull.c: New test.
---
 gcc/testsuite/gcc.dg/ipa/ipa-sra-ret-nonull.c | 40 +++
 1 file changed, 40 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipa-sra-ret-nonull.c

diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-ret-nonull.c 
b/gcc/testsuite/gcc.dg/ipa/ipa-sra-ret-nonull.c
new file mode 100644
index 000..18c13efd609
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-ret-nonull.c
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-sra-details"  } */
+
+volatile void *gp;
+volatile void *gq;
+char buf[16];
+
+__attribute__((returns_nonnull, noinline))
+static char *
+foo (char *p, char *q)
+{
+  gq = q;
+  gp = p;
+  return q;
+}
+
+__attribute__((returns_nonnull, noinline))
+static char *
+bar (char *p, char *q)
+{
+  return foo (p, q) + 8;
+}
+
+__attribute__((noipa))
+static char *
+get_charp (void)
+{
+  return &buf[0];
+}
+
+int
+main ()
+{
+  char *r;
+  asm volatile ("" : : : "memory");
+  r = bar (get_charp (), get_charp ());
+  return 0;
+}
+
+/* { dg-final { scan-ipa-dump-times "Will SKIP return." 2 "sra" } } */
-- 
2.33.0



Re: [PATCH 1/5] libstdc++: Import the fast_float library

2021-11-16 Thread Florian Weimer via Gcc-patches
* Jonathan Wakely:

> On Tue, 16 Nov 2021 at 08:01, Florian Weimer wrote:
>>
>> * Patrick Palka via Libstdc:
>>
>> > This copies the fast_float library[1] into the compiled-in library
>> > sources.  We're going to use this library in our floating-point
>> > std::from_chars implementation for faster and more portable parsing of
>> > binary32/64 decimal strings.
>> >
>> > [1]: https://github.com/fastfloat/fast_float
>> >
>> > Series tested on x86_64, i686, ppc64, ppc64le and aarch64, does it
>> > look OK for trunk?
>>
>> Missing Signed-off-by:?
>
> That's not needed if Patrick is still covered by an FSF assignment.

But the submission is not covered by the FSF assignment.

> I think we could use Apache as well, because this code isn't going to
> appear in public headers so the problematic clause doesn't apply. But
> MIT is simpler.

Okay, so you consider dynamic linking only?  I think the historic
libstdc++ license is more permissive than Apache or MIT when used with
GCC.  There aren't any notification or other requirements.

Thanks,
Florian



Re: [PATCH] Fix PR tree-optimization/103228 and 55177: folding of (type) X op CST where type is a nop convert

2021-11-16 Thread Richard Biener via Gcc-patches
On Tue, Nov 16, 2021 at 4:36 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> Currently we fold (type) X op CST into (type) (X op ((type-x) CST)) when the
> conversion widens but not when the conversion is a nop.  For the same reason
> that we move the widening conversion (the possibility of removing an extra
> conversion), we should do the same if the conversion is a nop.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> PR tree-optimization/103228
> PR tree-optimization/55177
>
> gcc/ChangeLog:
>
> * match.pd ((type) X bitop CST): Also do this
> transformation for nop conversions.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr103228-1.c: New test.
> * gcc.dg/tree-ssa/pr55177-1.c: New test.
> ---
>  gcc/match.pd   |  2 +-
>  gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c | 11 +++
>  gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c  | 14 ++
>  3 files changed, 26 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index a0e9a82e4c4..dc3d5054583 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1615,7 +1615,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> && (bitop != BIT_AND_EXPR || GIMPLE)
> && (/* That's a good idea if the conversion widens the operand, thus
>   after hoisting the conversion the operation will be narrower.  */

Can you please adjust the comment?  OK with that change.

> -  TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
> +  TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
>/* It's also a good idea if the conversion is to a non-integer
>   mode.  */
>|| GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
> new file mode 100644
> index 000..a7539819cf2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +int f(int a, int b)
> +{
> +  b|=1u;
> +  b|=2;
> +  return b;
> +}
> +/* { dg-final { scan-tree-dump-times "\\\| 3" 1 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "\\\| 1" 0 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "\\\| 2" 0 "optimized"} } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
> new file mode 100644
> index 000..de1a264345c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +extern int x;
> +
> +void foo(void)
> +{
> +  int a = __builtin_bswap32(x);
> +  a &= 0x5a5b5c5d;
> +  x = __builtin_bswap32(a);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "__builtin_bswap32" 0 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "& 1566333786" 1 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "& 1515936861" 0 "optimized"} } */
> --
> 2.17.1
>
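The two new tests can be checked by hand with a small sketch. The function names below are made up for illustration, and swap32 is a portable stand-in for __builtin_bswap32; the constants are the ones from the tests.

```c
#include <stdint.h>

/* Portable byte swap, standing in for __builtin_bswap32.  */
uint32_t
swap32 (uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Shape of the source in pr55177-1.c: a mask applied between two swaps.  */
uint32_t
mask_between_swaps (uint32_t x)
{
  return swap32 (swap32 (x) & 0x5a5b5c5du);
}

/* What the test expects to remain: one AND with the byte-swapped constant
   (0x5d5c5b5a is 1566333786 in decimal).  */
uint32_t
mask_folded (uint32_t x)
{
  return x & 0x5d5c5b5au;
}

/* Shape of the source in pr103228-1.c: the first OR goes through an
   unsigned (nop) conversion, the second does not.  */
int
or_split (int b)
{
  b |= 1u;
  b |= 2;
  return b;
}
```

The decimal constants in the scan-tree-dump patterns follow from this: 0x5a5b5c5d is 1515936861, and its byte-swapped form is 1566333786.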


[committed] arc: Update arc specific tests

2021-11-16 Thread Claudiu Zissulescu via Gcc-patches
Update the assembly output test patterns.  Also take into account the
platform on which the test is executed (baremetal or linux).

gcc/testsuite/ChangeLog:

* gcc.target/arc/add_n-combine.c: Update test patterns.
* gcc.target/arc/builtin_eh.c: Update test for linux platforms.
* gcc.target/arc/mul64-1.c: Disable this test while running on
linux.
* gcc.target/arc/tls-gd.c: Update matching patterns.
* gcc.target/arc/tls-ie.c: Likewise.
* gcc.target/arc/tls-ld.c: Likewise.
* gcc.target/arc/uncached-8.c: Likewise.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/testsuite/gcc.target/arc/add_n-combine.c | 4 ++--
 gcc/testsuite/gcc.target/arc/builtin_eh.c| 3 ++-
 gcc/testsuite/gcc.target/arc/mul64-1.c   | 2 +-
 gcc/testsuite/gcc.target/arc/tls-gd.c| 4 ++--
 gcc/testsuite/gcc.target/arc/tls-ie.c| 4 ++--
 gcc/testsuite/gcc.target/arc/tls-ld.c| 6 +++---
 gcc/testsuite/gcc.target/arc/uncached-8.c| 5 +++--
 7 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arc/add_n-combine.c 
b/gcc/testsuite/gcc.target/arc/add_n-combine.c
index bc400df669e..84e261ece8f 100644
--- a/gcc/testsuite/gcc.target/arc/add_n-combine.c
+++ b/gcc/testsuite/gcc.target/arc/add_n-combine.c
@@ -45,6 +45,6 @@ void f() {
   a(at3.bn[bu]);
 }
 
-/* { dg-final { scan-assembler "add1" } } */
-/* { dg-final { scan-assembler "add2" } } */
+/* { dg-final { scan-assembler "@at1\\+1" } } */
+/* { dg-final { scan-assembler "@at2\\+2" } } */
 /* { dg-final { scan-assembler "add3" } } */
diff --git a/gcc/testsuite/gcc.target/arc/builtin_eh.c 
b/gcc/testsuite/gcc.target/arc/builtin_eh.c
index 717a54bb084..83f4f1d2ee0 100644
--- a/gcc/testsuite/gcc.target/arc/builtin_eh.c
+++ b/gcc/testsuite/gcc.target/arc/builtin_eh.c
@@ -19,4 +19,5 @@ foo (int x)
 /* { dg-final { scan-assembler "r13" } } */
 /* { dg-final { scan-assembler "r0" } } */
 /* { dg-final { scan-assembler "fp" } } */
-/* { dg-final { scan-assembler "fp,64" } } */
+/* { dg-final { scan-assembler "fp,64" { target { *-elf32-* } } } } */
+/* { dg-final { scan-assembler "fp,60" { target { *-linux-* } } } } */
diff --git a/gcc/testsuite/gcc.target/arc/mul64-1.c 
b/gcc/testsuite/gcc.target/arc/mul64-1.c
index 2543fc33d3f..1a351feee87 100644
--- a/gcc/testsuite/gcc.target/arc/mul64-1.c
+++ b/gcc/testsuite/gcc.target/arc/mul64-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-skip-if "MUL64 is ARC600 extension." { ! { clmcpu } } } */
+/* { dg-skip-if "MUL64 is ARC600 extension." { { ! { clmcpu } } || *-linux-* } } */
 /* { dg-options "-O2 -mmul64 -mbig-endian -mcpu=arc600" } */
 
 /* Check if mlo/mhi registers are correctly layout when we compile for
diff --git a/gcc/testsuite/gcc.target/arc/tls-gd.c 
b/gcc/testsuite/gcc.target/arc/tls-gd.c
index aa1b5429b08..d02af9537f8 100644
--- a/gcc/testsuite/gcc.target/arc/tls-gd.c
+++ b/gcc/testsuite/gcc.target/arc/tls-gd.c
@@ -13,5 +13,5 @@ int *ae2 (void)
   return &e2;
 }
 
-/* { dg-final { scan-assembler "add r0,pcl,@e2@tlsgd" } } */
-/* { dg-final { scan-assembler "bl @__tls_get_addr@plt" } } */
+/* { dg-final { scan-assembler "add\\s+r0,pcl,@e2@tlsgd" } } */
+/* { dg-final { scan-assembler "bl\\s+@__tls_get_addr@plt" } } */
diff --git a/gcc/testsuite/gcc.target/arc/tls-ie.c 
b/gcc/testsuite/gcc.target/arc/tls-ie.c
index 0c981cfbf67..f4ad635c4d3 100644
--- a/gcc/testsuite/gcc.target/arc/tls-ie.c
+++ b/gcc/testsuite/gcc.target/arc/tls-ie.c
@@ -13,5 +13,5 @@ int *ae2 (void)
   return &e2;
 }
 
-/* { dg-final { scan-assembler "ld r0,\\\[pcl,@e2@tlsie\\\]" } } */
-/* { dg-final { scan-assembler "add_s r0,r0,r25" } } */
+/* { dg-final { scan-assembler "ld\\s+r0,\\\[pcl,@e2@tlsie\\\]" } } */
+/* { dg-final { scan-assembler "add_s\\s+r0,r0,r25" } } */
diff --git a/gcc/testsuite/gcc.target/arc/tls-ld.c 
b/gcc/testsuite/gcc.target/arc/tls-ld.c
index 351c3f02abd..68ab9bf809c 100644
--- a/gcc/testsuite/gcc.target/arc/tls-ld.c
+++ b/gcc/testsuite/gcc.target/arc/tls-ld.c
@@ -13,6 +13,6 @@ int *ae2 (void)
   return &e2;
 }
 
-/* { dg-final { scan-assembler "add r0,pcl,@.tbss@tlsgd" } } */
-/* { dg-final { scan-assembler "bl @__tls_get_addr@plt" } } */
-/* { dg-final { scan-assembler "add_s r0,r0,@e2@dtpoff" } } */
+/* { dg-final { scan-assembler "add\\s+r0,pcl,@.tbss@tlsgd" } } */
+/* { dg-final { scan-assembler "bl\\s+@__tls_get_addr@plt" } } */
+/* { dg-final { scan-assembler "add_s\\s+r0,r0,@e2@dtpoff" } } */
diff --git a/gcc/testsuite/gcc.target/arc/uncached-8.c 
b/gcc/testsuite/gcc.target/arc/uncached-8.c
index 060229b11df..b5ea2359a9a 100644
--- a/gcc/testsuite/gcc.target/arc/uncached-8.c
+++ b/gcc/testsuite/gcc.target/arc/uncached-8.c
@@ -29,5 +29,6 @@ void bar (void)
   x.c.b.a = 10;
 }
 
-/* { dg-final { scan-assembler-times "st\.di" 1 } } */
-/* { dg-final { scan-assembler-times "st\.as\.di" 1 } } */
+/* { dg-final { scan-assembler-times "st\.di" 2 { target { *-linux-* } } } } */
+/* { dg-final { scan-assembler-times "st\.di" 1 { 

Re: [GCC-11 PATCH] aarch64: enable Ampere-1 CPU (backport to GCC11)

2021-11-16 Thread Richard Sandiford via Gcc-patches
Philipp Tomsich  writes:
> This adds support and a basic tuning model for the Ampere Computing
> "Ampere-1" CPU.
>
> The Ampere-1 implements the ARMv8.6 architecture in A64 mode and is
> modelled as a 4-wide issue (as with all modern micro-architectures,
> the chosen issue rate is a compromise between the maximum dispatch
> rate and the maximum rate of uops issued to the scheduler).
>
> This adds the -mcpu=ampere1 command-line option and the relevant cost
> information/tuning tables for the Ampere-1.
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-cores.def (AARCH64_CORE): New Ampere-1
>   core.
>   * config/aarch64/aarch64-tune.md: Regenerate.
>   * config/aarch64/aarch64-cost-tables.h: Add extra costs for
>   Ampere-1.
>   * config/aarch64/aarch64.c: Add tuning structures for Ampere-1.
>
> (cherry picked from 67b0d47e20e655c0dd53a76ea88aab60fafb2059)
>
> ---
> This is a backport from master and only affects the AArch64 backend.
>
> OK for GCC-11?

Yes, thanks.

Richard.

>
>  gcc/config/aarch64/aarch64-cores.def |   3 +-
>  gcc/config/aarch64/aarch64-cost-tables.h | 104 +++
>  gcc/config/aarch64/aarch64-tune.md   |   2 +-
>  gcc/config/aarch64/aarch64.c |  78 +
>  gcc/doc/invoke.texi  |   2 +-
>  5 files changed, 186 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index b2aa1670561..4643e0e2795 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -68,7 +68,8 @@ AARCH64_CORE("octeontx83",octeontxt83,   thunderx,  8A, 
>  AARCH64_FL_FOR_ARCH
>  AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a2, -1)
>  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a3, -1)
>  
> -/* Ampere Computing cores. */
> +/* Ampere Computing ('\xC0') cores. */
> +AARCH64_CORE("ampere1", ampere1, cortexa57, 8_6A, AARCH64_FL_FOR_ARCH8_6, 
> ampere1, 0xC0, 0xac3, -1)
>  /* Do not swap around "emag" and "xgene1",
> this order is required to handle variant correctly. */
>  AARCH64_CORE("emag",emag,  xgene1,8A,  AARCH64_FL_FOR_ARCH8 
> | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3)
> diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
> b/gcc/config/aarch64/aarch64-cost-tables.h
> index dd2e7e7cbb1..4b7e4e034a2 100644
> --- a/gcc/config/aarch64/aarch64-cost-tables.h
> +++ b/gcc/config/aarch64/aarch64-cost-tables.h
> @@ -650,4 +650,108 @@ const struct cpu_cost_table a64fx_extra_costs =
>}
>  };
>  
> +const struct cpu_cost_table ampere1_extra_costs =
> +{
> +  /* ALU */
> +  {
> +0, /* arith.  */
> +0, /* logical.  */
> +0, /* shift.  */
> +COSTS_N_INSNS (1), /* shift_reg.  */
> +0, /* arith_shift.  */
> +COSTS_N_INSNS (1), /* arith_shift_reg.  */
> +0, /* log_shift.  */
> +COSTS_N_INSNS (1), /* log_shift_reg.  */
> +0, /* extend.  */
> +COSTS_N_INSNS (1), /* extend_arith.  */
> +0, /* bfi.  */
> +0, /* bfx.  */
> +0, /* clz.  */
> +0, /* rev.  */
> +0, /* non_exec.  */
> +true   /* non_exec_costs_exec.  */
> +  },
> +  {
> +/* MULT SImode */
> +{
> +  COSTS_N_INSNS (3),   /* simple.  */
> +  COSTS_N_INSNS (3),   /* flag_setting.  */
> +  COSTS_N_INSNS (3),   /* extend.  */
> +  COSTS_N_INSNS (4),   /* add.  */
> +  COSTS_N_INSNS (4),   /* extend_add.  */
> +  COSTS_N_INSNS (18)   /* idiv.  */
> +},
> +/* MULT DImode */
> +{
> +  COSTS_N_INSNS (3),   /* simple.  */
> +  0,   /* flag_setting (N/A).  */
> +  COSTS_N_INSNS (3),   /* extend.  */
> +  COSTS_N_INSNS (4),   /* add.  */
> +  COSTS_N_INSNS (4),   /* extend_add.  */
> +  COSTS_N_INSNS (34)   /* idiv.  */
> +}
> +  },
> +  /* LD/ST */
> +  {
> +COSTS_N_INSNS (4), /* load.  */
> +COSTS_N_INSNS (4), /* load_sign_extend.  */
> +0, /* ldrd (n/a).  */
> +0, /* ldm_1st.  */
> +0, /* ldm_regs_per_insn_1st.  */
> +0, /* ldm_regs_per_insn_subsequent.  */
> +COSTS_N_INSNS (5), /* loadf.  */
> +COSTS_N_INSNS (5), /* loadd.  */
> +COSTS_N_INSNS (5), /* load_unaligned.  */
> +0, /* store.  */
> +0, /* strd.  */
> +0, /* stm_1st.  */
> +0, /* stm_regs_per_insn_1st.  */
> +  

Re: [PATCH] tree-optimization: [PR103218] Fold ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit

2021-11-16 Thread Richard Biener via Gcc-patches
On Sat, Nov 13, 2021 at 9:14 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> This folds ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit inside
> match.pd.
> This was already handled in fold-const by:
> /* A < 0 ? <sign bit of A> : 0 is simply (A & <sign bit of A>).  */
> I have not removed it, as we only simplify "a ? POW2 : 0" at the gimple level to
> "a << CST1"
> and fold actually does the reverse of folding "(a<0)<<CST" into "(a<0) ? 1<<CST : 0".
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

Thanks,
Richard.

> PR tree-optimization/103218
>
> gcc/ChangeLog:
>
> * match.pd: New pattern for "((type)(a<0)) << SIGNBITOFA".
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr103218-1.c: New test.
> ---
>  gcc/match.pd   | 10 
>  gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c | 28 ++
>  2 files changed, 38 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index a319aefa808..df31964e02f 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -865,6 +865,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  { tree utype = unsigned_type_for (type); }
>  (convert (rshift (lshift (convert:utype @0) @2) @3))
>
> +/* Fold ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit. */
> +(simplify
> + (lshift (convert (lt @0 integer_zerop@1)) INTEGER_CST@2)
> + (if (TYPE_SIGN (TREE_TYPE (@0)) == SIGNED
> +  && wi::eq_p (wi::to_wide (@2), TYPE_PRECISION (TREE_TYPE (@0)) - 1))
> +  (with { wide_int wone = wi::one (TYPE_PRECISION (type)); }
> +   (bit_and (convert @0)
> +{ wide_int_to_tree (type,
> +   wi::lshift (wone, wi::to_wide (@2))); }
> +
>  /* Fold (-x >> C) into -(x > 0) where C = precision(type) - 1.  */
>  (for cst (INTEGER_CST VECTOR_CST)
>   (simplify
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c
> new file mode 100644
> index 000..f086f073b38
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* PR tree-optimization/103218 */
> +
> +/* These first two are removed during forwprop1 */
> +signed char f(signed char a)
> +{
> +  signed char t = a < 0;
> +  int tt = (unsigned char)(t << 7);
> +  return tt;
> +}
> +signed char f0(signed char a)
> +{
> +  unsigned char t = a < 0;
> +  int tt = (unsigned char)(t << 7);
> +  return tt;
> +}
> +
> +/* This one is removed during phiopt. */
> +signed char  f1(signed char a)
> +{
> +if (a < 0)
> +  return 1u<<7;
> +return 0;
> +}
> +
> +/* These three examples should remove "a < 0" by optimized. */
> +/* { dg-final { scan-tree-dump-times "< 0" 0 "optimized"} } */
> --
> 2.17.1
>
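
As an illustration of the identity the new match.pd pattern relies on, here is
a minimal C sketch for an 8-bit signed input.  The function names are
hypothetical and not part of the patch; the sketch only checks the arithmetic
and says nothing about what GCC actually emits.

```c
#include <stdint.h>

/* Form before the fold: convert (a < 0) and shift it to the sign-bit
   position (precision - 1 == 7 for an 8-bit type).  */
static uint8_t
sign_by_shift (int8_t a)
{
  return (uint8_t) ((uint8_t) (a < 0) << 7);
}

/* Form after the fold: convert a and keep only the sign bit.  */
static uint8_t
sign_by_mask (int8_t a)
{
  return (uint8_t) ((uint8_t) a & 0x80u);
}
```

Both functions should agree for every int8_t value, which is the property the
pattern's precision check (shift count equal to precision - 1) is there to
guarantee.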


[PATCH 0/2][GCC] arm: Define MVE types internally

2021-11-16 Thread Murray Steele via Gcc-patches
Hi all,

This patch series implements the arm MVE ACLE types currently found
under config/arm/arm_mve_types.h internally via a new pragma. Exposing
the MVE ACLE types internally allows for an MVE intrinsics
implementation similar to the current SVE implementation.

Any prefix of the patch series should build and pass regression tests.

Thanks,
Murray

---

Murray Steele (2):
  arm: Move arm_simd_info array declaration into header
  arm: Define MVE types internally via pragma

 gcc/config.gcc|   2 +-
 gcc/config/arm/arm-builtins.c |  87 +---
 gcc/config/arm/arm-builtins.h |  87 
 gcc/config/arm/arm-c.c|  21 ++
 gcc/config/arm/arm-mve-builtins.cc| 192 ++
 gcc/config/arm/arm-mve-builtins.def   |  41 
 gcc/config/arm/arm-mve-builtins.h |  34 
 gcc/config/arm/arm-protos.h   |   5 +
 gcc/config/arm/arm_mve_types.h|  30 +--
 gcc/config/arm/t-arm  |  10 +
 .../arm/mve/general-c/type_redef_1.c  |   7 +
 .../arm/mve/general-c/type_redef_10.c |   7 +
 .../arm/mve/general-c/type_redef_11.c |   7 +
 .../arm/mve/general-c/type_redef_12.c |   7 +
 .../arm/mve/general-c/type_redef_13.c |   7 +
 .../arm/mve/general-c/type_redef_14.c |   7 +
 .../arm/mve/general-c/type_redef_15.c |   7 +
 .../arm/mve/general-c/type_redef_16.c |   7 +
 .../arm/mve/general-c/type_redef_17.c |   7 +
 .../arm/mve/general-c/type_redef_18.c |   7 +
 .../arm/mve/general-c/type_redef_19.c |   7 +
 .../arm/mve/general-c/type_redef_2.c  |   7 +
 .../arm/mve/general-c/type_redef_20.c |   7 +
 .../arm/mve/general-c/type_redef_21.c |   7 +
 .../arm/mve/general-c/type_redef_22.c |   7 +
 .../arm/mve/general-c/type_redef_23.c |   7 +
 .../arm/mve/general-c/type_redef_24.c |   7 +
 .../arm/mve/general-c/type_redef_25.c |   7 +
 .../arm/mve/general-c/type_redef_26.c |   7 +
 .../arm/mve/general-c/type_redef_27.c |   7 +
 .../arm/mve/general-c/type_redef_28.c |   7 +
 .../arm/mve/general-c/type_redef_29.c |   7 +
 .../arm/mve/general-c/type_redef_3.c  |   7 +
 .../arm/mve/general-c/type_redef_30.c |   7 +
 .../arm/mve/general-c/type_redef_31.c |   7 +
 .../arm/mve/general-c/type_redef_4.c  |   7 +
 .../arm/mve/general-c/type_redef_5.c  |   7 +
 .../arm/mve/general-c/type_redef_6.c  |   7 +
 .../arm/mve/general-c/type_redef_7.c  |   7 +
 .../arm/mve/general-c/type_redef_8.c  |   7 +
 .../arm/mve/general-c/type_redef_9.c  |   7 +
 .../arm/mve/general/double_pragmas_1.c|   8 +
 .../gcc.target/arm/mve/general/nomve_1.c  |   3 +
 gcc/testsuite/gcc.target/arm/mve/mve.exp  |   6 +
 44 files changed, 627 insertions(+), 116 deletions(-)
 create mode 100644 gcc/config/arm/arm-mve-builtins.cc
 create mode 100644 gcc/config/arm/arm-mve-builtins.def
 create mode 100644 gcc/config/arm/arm-mve-builtins.h
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_10.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_11.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_12.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_13.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_14.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_15.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_16.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_17.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_18.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_19.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_20.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_21.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_22.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_23.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_24.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_25.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_26.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_27.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_28.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_29.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_3.c
 create mode 100644 gcc

[PATCH 1/2][GCC] arm: Move arm_simd_info array declaration into header

2021-11-16 Thread Murray Steele via Gcc-patches
Hi all,

This patch moves the arm_simd_type and arm_type_qualifiers enums, and
arm_simd_info struct from arm-builtins.c into arm-builtins.h header.

This is a first step towards internalising the type definitions for MVE
predicate, vector, and tuple types.  By moving arm_simd_types into a
header, we allow future patches to use these type trees externally to
arm-builtins.c, which is a crucial step towards developing an MVE
intrinsics framework similar to the current SVE implementation.

Thanks,
Murray

gcc/ChangeLog:

* config/arm/arm-builtins.c (enum arm_type_qualifiers): Move to
arm-builtins.h
(enum arm_simd_type): Move to arm-builtins.h
(struct arm_simd_type_info): Move to arm-builtins.h
* config/arm/arm-builtins.h (enum arm_simd_type): Move from
arm-builtins.c
(enum arm_type_qualifiers): Move from arm-builtins.c
(struct arm_simd_type_info): Move from arm-builtins.c
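
For readers unfamiliar with the qualifier enum being moved, the composite
values in it are bitwise ORs of the single-bit flags.  The following is a
minimal sketch of that encoding; the sketch_* names and the helper function
are hypothetical and only mirror a few of the values from the patch.

```c
/* Hypothetical mirror of a few arm_type_qualifiers values.  */
enum sketch_qualifiers
{
  sketch_none = 0x0,
  sketch_unsigned = 0x1,   /* 1 << 0  */
  sketch_const = 0x2,      /* 1 << 1  */
  sketch_pointer = 0x4,    /* 1 << 2  */
  sketch_immediate = 0x8,  /* 1 << 3  */
  sketch_map_mode = 0x80,  /* 1 << 7  */
  /* Composite values are the bitwise OR of the flags they combine.  */
  sketch_const_pointer = sketch_const | sketch_pointer,
  sketch_unsigned_immediate = sketch_unsigned | sketch_immediate,
  sketch_const_pointer_map_mode = sketch_const_pointer | sketch_map_mode
};

/* Return nonzero if every bit of FLAG is set in Q.  */
static int
sketch_has_qualifier (unsigned q, unsigned flag)
{
  return (q & flag) == flag;
}
```

With this encoding, sketch_const_pointer evaluates to 0x6 and
sketch_const_pointer_map_mode to 0x86, matching the literal values used for the
corresponding qualifiers in the header.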



diff --git a/gcc/config/arm/arm-builtins.h b/gcc/config/arm/arm-builtins.h
index 
bee9f9bb83758820ca7faedf80b7e138026c1ca0..a40fa8950707314d3cc1372fb5c47a8891a18516
 100644
--- a/gcc/config/arm/arm-builtins.h
+++ b/gcc/config/arm/arm-builtins.h
@@ -32,4 +32,91 @@ enum resolver_ident {
 enum resolver_ident arm_describe_resolver (tree);
 unsigned arm_cde_end_args (tree);
 
+#define ENTRY(E, M, Q, S, T, G) E,
+enum arm_simd_type
+{
+#include "arm-simd-builtin-types.def"
+  __TYPE_FINAL
+};
+#undef ENTRY
+
+enum arm_type_qualifiers
+{
+  /* T foo.  */
+  qualifier_none = 0x0,
+  /* unsigned T foo.  */
+  qualifier_unsigned = 0x1, /* 1 << 0  */
+  /* const T foo.  */
+  qualifier_const = 0x2, /* 1 << 1  */
+  /* T *foo.  */
+  qualifier_pointer = 0x4, /* 1 << 2  */
+  /* const T * foo.  */
+  qualifier_const_pointer = 0x6,
+  /* Used when expanding arguments if an operand could
+ be an immediate.  */
+  qualifier_immediate = 0x8, /* 1 << 3  */
+  qualifier_unsigned_immediate = 0x9,
+  qualifier_maybe_immediate = 0x10, /* 1 << 4  */
+  /* void foo (...).  */
+  qualifier_void = 0x20, /* 1 << 5  */
+  /* Some patterns may have internal operands, this qualifier is an
+ instruction to the initialisation code to skip this operand.  */
+  qualifier_internal = 0x40, /* 1 << 6  */
+  /* Some builtins should use the T_*mode* encoded in a simd_builtin_datum
+ rather than using the type of the operand.  */
+  qualifier_map_mode = 0x80, /* 1 << 7  */
+  /* qualifier_pointer | qualifier_map_mode  */
+  qualifier_pointer_map_mode = 0x84,
+  /* qualifier_const_pointer | qualifier_map_mode  */
+  qualifier_const_pointer_map_mode = 0x86,
+  /* Polynomial types.  */
+  qualifier_poly = 0x100,
+  /* Lane indices - must be within range of previous argument = a vector.  */
+  qualifier_lane_index = 0x200,
+  /* Lane indices for single lane structure loads and stores.  */
+  qualifier_struct_load_store_lane_index = 0x400,
+  /* A void pointer.  */
+  qualifier_void_pointer = 0x800,
+  /* A const void pointer.  */
+  qualifier_const_void_pointer = 0x802,
+  /* Lane indices selected in pairs - must be within range of previous
+ argument = a vector.  */
+  qualifier_lane_pair_index = 0x1000,
+  /* Lane indices selected in quadtuplets - must be within range of previous
+ argument = a vector.  */
+  qualifier_lane_quadtup_index = 0x2000
+};
+
+struct arm_simd_type_info
+{
+  enum arm_simd_type type;
+
+  /* Internal type name.  */
+  const char *name;
+
+  /* Internal type name(mangled).  The mangled names conform to the
+ AAPCS (see "Procedure Call Standard for the ARM Architecture",
+ Appendix A).  To qualify for emission with the mangled names defined in
+ that document, a vector type must not only be of the correct mode but also
+ be of the correct internal Neon vector type (e.g. __simd64_int8_t);
+ these types are registered by arm_init_simd_builtin_types ().  In other
+ words, vector types defined in other ways e.g. via vector_size attribute
+ will get default mangled names.  */
+  const char *mangle;
+
+  /* Internal type.  */
+  tree itype;
+
+  /* Element type.  */
+  tree eltype;
+
+  /* Machine mode the internal type maps to.  */
+  machine_mode mode;
+
+  /* Qualifiers.  */
+  enum arm_type_qualifiers q;
+};
+
+extern struct arm_simd_type_info arm_simd_types[];
+
 #endif /* GCC_ARM_BUILTINS_H */
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
3a9ff8f26b8e222c52cb70f7509b714c3e475758..b6bf31349d8f0e996a6c169b061ebe05a2cf9acb
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -48,53 +48,6 @@
 
 #define SIMD_MAX_BUILTIN_ARGS 7
 
-enum arm_type_qualifiers
-{
-  /* T foo.  */
-  qualifier_none = 0x0,
-  /* unsigned T foo.  */
-  qualifier_unsigned = 0x1, /* 1 << 0  */
-  /* const T foo.  */
-  qualifier_const = 0x2, /* 1 << 1  */
-  /* T *foo.  */
-  qualifier_pointer = 0x4, /* 1 << 2  */
-  /* const T * foo.  */
-  qualifier_const_pointer = 0x6,
-  /* Used when expanding arguments if an o

[PATCH 2/2][GCC] arm: Declare MVE types internally via pragma

2021-11-16 Thread Murray Steele via Gcc-patches
Hi all,

This patch moves the implementation of MVE ACLE types from
arm_mve_types.h to inside GCC via a new pragma, which replaces the prior
type definitions. This allows for the types to be used internally for
intrinsic function definitions.

Bootstrapped and regression tested on arm-none-linux-gnuabihf, and
regression tested on arm-eabi -- no issues.

Thanks,
Murray

gcc/ChangeLog:

* config.gcc: Add arm-mve-builtins.o to extra_objs for arm-*-*-*
targets.
* config/arm/arm-c.c (arm_pragma_arm): Handle new pragma.
(arm_register_target_pragmas): Register new pragma.
* config/arm/arm-protos.h: Add arm_mve namespace and declare
arm_handle_mve_types_h.
* config/arm/arm_mve_types.h: Replace MVE type definitions with
new pragma.
* config/arm/t-arm: Add arm-mve-builtins.o target.
* config/arm/arm-mve-builtins.cc: New file.
* config/arm/arm-mve-builtins.def: New file.
* config/arm/arm-mve-builtins.h: New file.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/mve.exp: Add new subdirectories.
* gcc.target/arm/mve/general-c/type_redef_1.c: New test.
* gcc.target/arm/mve/general-c/type_redef_10.c: New test.
* gcc.target/arm/mve/general-c/type_redef_11.c: New test.
* gcc.target/arm/mve/general-c/type_redef_12.c: New test.
* gcc.target/arm/mve/general-c/type_redef_13.c: New test.
* gcc.target/arm/mve/general-c/type_redef_14.c: New test.
* gcc.target/arm/mve/general-c/type_redef_15.c: New test.
* gcc.target/arm/mve/general-c/type_redef_16.c: New test.
* gcc.target/arm/mve/general-c/type_redef_17.c: New test.
* gcc.target/arm/mve/general-c/type_redef_18.c: New test.
* gcc.target/arm/mve/general-c/type_redef_19.c: New test.
* gcc.target/arm/mve/general-c/type_redef_2.c: New test.
* gcc.target/arm/mve/general-c/type_redef_20.c: New test.
* gcc.target/arm/mve/general-c/type_redef_21.c: New test.
* gcc.target/arm/mve/general-c/type_redef_22.c: New test.
* gcc.target/arm/mve/general-c/type_redef_23.c: New test.
* gcc.target/arm/mve/general-c/type_redef_24.c: New test.
* gcc.target/arm/mve/general-c/type_redef_25.c: New test.
* gcc.target/arm/mve/general-c/type_redef_26.c: New test.
* gcc.target/arm/mve/general-c/type_redef_27.c: New test.
* gcc.target/arm/mve/general-c/type_redef_28.c: New test.
* gcc.target/arm/mve/general-c/type_redef_29.c: New test.
* gcc.target/arm/mve/general-c/type_redef_3.c: New test.
* gcc.target/arm/mve/general-c/type_redef_30.c: New test.
* gcc.target/arm/mve/general-c/type_redef_31.c: New test.
* gcc.target/arm/mve/general-c/type_redef_4.c: New test.
* gcc.target/arm/mve/general-c/type_redef_5.c: New test.
* gcc.target/arm/mve/general-c/type_redef_6.c: New test.
* gcc.target/arm/mve/general-c/type_redef_7.c: New test.
* gcc.target/arm/mve/general-c/type_redef_8.c: New test.
* gcc.target/arm/mve/general-c/type_redef_9.c: New test.
* gcc.target/arm/mve/general/double_pragmas_1.c: New test.
* gcc.target/arm/mve/general/nomve_1.c: New test.
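
The pragma handler in this patch dispatches on the string argument of
"#pragma GCC arm".  A standalone sketch of that dispatch, outside of GCC, could
look like the following.  The enum and function names are hypothetical; only
the "arm_mve_types.h" string comes from the patch, and the assumption is that
the header invokes the pragma with that string.

```c
#include <string.h>

/* Hypothetical result codes for the sketch.  */
enum sketch_pragma_action
{
  SKETCH_PRAGMA_MVE_TYPES,
  SKETCH_PRAGMA_UNKNOWN
};

/* Map the pragma's string argument to an action.  The real handler
   registers the MVE types or reports an error instead of returning
   a code.  */
static enum sketch_pragma_action
sketch_classify_arm_pragma (const char *name)
{
  if (strcmp (name, "arm_mve_types.h") == 0)
    return SKETCH_PRAGMA_MVE_TYPES;
  return SKETCH_PRAGMA_UNKNOWN;
}
```

Any other string falls through to the unknown case, which corresponds to the
"unknown #pragma GCC arm option" diagnostic in the handler.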



diff --git a/gcc/config.gcc b/gcc/config.gcc
index 
3675e063a5365ff84854eb5c2c27921216494c69..50d3401e3aa94f077d7e0675ee443a94431dba1e
 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -352,7 +352,7 @@ arc*-*-*)
;;
 arm*-*-*)
cpu_type=arm
-   extra_objs="arm-builtins.o aarch-common.o"
+   extra_objs="arm-builtins.o arm-mve-builtins.o aarch-common.o"
extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_fp16.h arm_cmse.h 
arm_bf16.h arm_mve_types.h arm_mve.h arm_cde.h"
target_type_format_char='%'
c_target_objs="arm-c.o"
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 
cc7901bca8dc9c5c27ed6afc5bc26afd42689e6d..d1414f6e0e1c2bd0a7364b837c16adf493221376
 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -28,6 +28,7 @@
 #include "c-family/c-pragma.h"
 #include "stringpool.h"
 #include "arm-builtins.h"
+#include "arm-protos.h"
 
 tree
 arm_resolve_cde_builtin (location_t loc, tree fndecl, void *arglist)
@@ -129,6 +130,24 @@ arm_resolve_cde_builtin (location_t loc, tree fndecl, void 
*arglist)
   return call_expr;
 }
 
+/* Implement "#pragma GCC arm".  */
+static void
+arm_pragma_arm (cpp_reader *)
+{
+  tree x;
+  if (pragma_lex (&x) != CPP_STRING)
+{
+  error ("%<#pragma GCC arm%> requires a string parameter");
+  return;
+}
+
+  const char *name = TREE_STRING_POINTER (x);
+  if (strcmp (name, "arm_mve_types.h") == 0)
+arm_mve::handle_arm_mve_types_h ();
+  else
+error ("unknown %<#pragma GCC arm%> option %qs", name);
+}
+
 /* Implement TARGET_RESOLVE_OVERLOADED_BUILTIN.  This is currently only
used for the MVE related builtins for the CDE extension.
Here we ensure the type of arguments is such

Re: [PATCH] PR tree-optimization/103216: optimize some A ? (b op CST) : b into b op (A?CST:CST2)

2021-11-16 Thread Richard Biener via Gcc-patches
On Mon, Nov 15, 2021 at 1:09 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> For this PR, we have:
>   if (d_5 < 0)
> goto ; [INV]
>   else
> goto ; [INV]
>
>:
>   v_7 = c_4 | -128;
>
>:
>   # v_1 = PHI 
>
> Which PHI-OPT will try to simplify
> "(d_5 < 0) ? (c_4 | -128) : c_4" which is not handled currently.
> This adds a few patterns which allows to try to see if (a ? CST : CST1)
> where CST1 is either 0, 1 or -1 depending on the operator.
> Note to optimize this case always, we should check to make sure that
> the a?CST:CST1 gets simplified to not include the conditional expression.
> The ! flag does not work as we want to have more simplifications than just
> when we simplify it to a leaf node (SSA_NAME or CONSTANT). This adds a new
> flag ^ to genmatch which says the simplification should happen but not down
> to the same kind of node.
> We could allow this for !GIMPLE and use fold_* rather than fold_buildN but I
> didn't see any use of it for now.
>
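
A minimal C sketch of the rewrite described above, for illustration only.  The
function names are hypothetical, and whether the patch covers each operator
shown here is not visible in this excerpt; the constants 0, 1 and -1 are simply
the neutral elements of |, * and &.

```c
/* Before: the operation sits inside one arm of the conditional.  */
static int
ior_before (int d, int c)
{
  return d < 0 ? (c | -128) : c;
}

/* After: the operation is hoisted and the conditional selects between
   the constant and the neutral element of the operator (0 for |).  */
static int
ior_after (int d, int c)
{
  return c | (d < 0 ? -128 : 0);
}

/* The same shape with other neutral elements: 1 for multiplication
   and -1 (all ones) for bitwise AND.  */
static int
mult_after (int a, int b)
{
  return b * (a ? 3 : 1);
}

static int
and_after (int a, int b)
{
  return b & (a ? 0x0f : -1);
}
```

The hoisted form only pays off when the inner conditional on two constants
simplifies further, which is the condition the proposed ^ flag is meant to
express.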
> Also all of these patterns need to be done late as other optimizations can be
> done without them.
>
> OK? Bootstrapped and tested on x86_64 with no regressions.
>
> gcc/ChangeLog:
>
> * doc/match-and-simplify.texi: Document ^ flag.
> * genmatch.c (expr::expr): Add Setting of force_simplify.
> (expr): Add force_simplify field.
> (expr::gen_transform): Add support for force_simplify field.
> (parser::parse_expr): Add parsing of ^ flag for the expr.
> * match.pd: New patterns to optimize "a ? (b op CST) : b".
> ---
>  gcc/doc/match-and-simplify.texi | 16 +
>  gcc/genmatch.c  | 35 ++--
>  gcc/match.pd| 41 +
>  3 files changed, 90 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/doc/match-and-simplify.texi b/gcc/doc/match-and-simplify.texi
> index e7e5a4f7299..4e3407c0263 100644
> --- a/gcc/doc/match-and-simplify.texi
> +++ b/gcc/doc/match-and-simplify.texi
> @@ -377,6 +377,22 @@ of the @code{vec_cond} expression but only if the actual 
> plus
>  operations both simplify.  Note this is currently only supported
>  for code generation targeting @code{GIMPLE}.
>
> +Another modifier for generated expressions is @code{^} which
> +tells the machinery to only consider the simplification in case
> +the marked expression simplified away from the original code.
> +Consider for example
> +
> +@smallexample
> +(simplify
> + (cond @@0 (plus:s @@1 INTEGER_CST@@2) @@1)
> + (plus @@1 (cond^ @@0 @@2 @{ build_zero_cst (type); @})))
> +@end smallexample
> +
> +which moves the inner @code{plus} operation to the outside of the
> +@code{cond} expression but only if the actual cond operation simplify
> +wayaway from cond.  Note this is currently only supported for code

s/wayaway/away/

> +generation targeting @code{GIMPLE}.
> +
>  As intermediate conversions are often optional there is a way to
>  avoid the need to repeat patterns both with and without such
>  conversions.  Namely you can mark a conversion as being optional
> diff --git a/gcc/genmatch.c b/gcc/genmatch.c
> index 95248455ec5..2dca1141df6 100644
> --- a/gcc/genmatch.c
> +++ b/gcc/genmatch.c
> @@ -698,12 +698,13 @@ public:
>  : operand (OP_EXPR, loc), operation (operation_),
>ops (vNULL), expr_type (NULL), is_commutative (is_commutative_),
>is_generic (false), force_single_use (false), force_leaf (false),
> -  opt_grp (0) {}
> +  force_simplify(false), opt_grp (0) {}
>expr (expr *e)
>  : operand (OP_EXPR, e->location), operation (e->operation),
>ops (vNULL), expr_type (e->expr_type), is_commutative 
> (e->is_commutative),
>is_generic (e->is_generic), force_single_use (e->force_single_use),
> -  force_leaf (e->force_leaf), opt_grp (e->opt_grp) {}
> +  force_leaf (e->force_leaf), force_simplify(e->force_simplify),
> +  opt_grp (e->opt_grp) {}
>void append_op (operand *op) { ops.safe_push (op); }
>/* The operator and its operands.  */
>id_base *operation;
> @@ -721,6 +722,9 @@ public:
>/* Whether in the result expression this should be a leaf node
>   with any children simplified down to simple operands.  */
>bool force_leaf;
> +  /* Whether in the result expression this should be a node
> + with any children simplified down not to use the original operator.  */
> +  bool force_simplify;
>/* If non-zero, the group for optional handling.  */
>unsigned char opt_grp;
>virtual void gen_transform (FILE *f, int, const char *, bool, int,
> @@ -2527,6 +2531,17 @@ expr::gen_transform (FILE *f, int indent, const char 
> *dest, bool gimple,
> fprintf (f, ", _o%d[%u]", depth, i);
>fprintf (f, ");\n");
>fprintf_indent (f, indent, "tem_op.resimplify (lseq, valueize);\n");

I wonder if with force_simplify we should pass NULL as lseq to resimplify?
That is, should we allow (plus^ (convert @0) @1) to simplify to
(convert (plus

RE: [vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2021-11-16 Thread Joel Hutton via Gcc-patches
Updated patch 2, with the explanation included in the commit message and the
requested changes.

Bootstrapped and regression tested on aarch64.
> -Original Message-
> From: Joel Hutton
> Sent: 12 November 2021 11:42
> To: Richard Biener 
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford
> 
> Subject: RE: [vect-patterns] Refactor widen_plus/widen_minus as
> internal_fns
> 
> > please use #define INCLUDE_MAP before the system.h include instead.
Done.

> > Is it really necessary to build a new std::map for each optab lookup?!
> > That looks quite ugly and inefficient.  We'd usually - if necessary at
> > all - build an auto_vec<std::pair<key,data> > and .sort () and .bsearch ()
> > it.
> Ok, I'll rework this part. In the meantime, to address your other comment.
Done.

> > I'm not sure I understand DEF_INTERNAL_OPTAB_MULTI_FN, neither this
> > cover letter nor the patch ChangeLog explains anything.
> 
> I'll attempt to clarify, if this makes things clearer I can include this in 
> the
> commit message of the respun patch:
> 
> DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it
> provides convenience wrappers for defining conversions that require a hi/lo
> split, like widening and narrowing operations.  Each definition for <NAME>
> will require an optab named <OPTAB> and two other optabs that you specify
> for signed and unsigned. The hi/lo pair is necessary because the widening
> operations take n narrow elements as inputs and return n/2 wide elements
> as outputs. The 'lo' operation operates on the first n/2 elements of input.
> The 'hi' operation operates on the second n/2 elements of input. Defining an
> internal_fn along with hi/lo variations allows a single internal function to 
> be
> returned from a vect_recog function that will later be expanded to hi/lo.
> 
> DEF_INTERNAL_OPTAB_MULTI_FN is used in internal-fn.def to register a
> widening internal_fn. It is defined differently in different places and 
> internal-
> fn.def is sourced from those places so the parameters given can be reused.
>   internal-fn.c: defined to expand to hi/lo signed/unsigned optabs, later
> defined to generate the  'expand_' functions for the hi/lo versions of the fn.
>   internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original
> and hi/lo variants of the internal_fn
> 
> For example:
>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
>   for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2
>                IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl
> 
> This gives the same functionality as the previous
> WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into
> VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
> 
> Let me know if I'm not expressing this clearly.
> 
> Thanks,
> Joel
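
To make the hi/lo split described above concrete, here is a scalar C sketch of
a widening add over n narrow elements.  The function names are hypothetical and
the loops only model the semantics as described in the explanation (lo takes
the first n/2 inputs, hi the second n/2); they are not the vectorizer's
implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* 'lo' half: widen and add the first n/2 elements of the inputs.  */
static void
widen_plus_lo (const int8_t *a, const int8_t *b, int16_t *out, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (int16_t) (a[i] + b[i]);
}

/* 'hi' half: widen and add the second n/2 elements of the inputs.  */
static void
widen_plus_hi (const int8_t *a, const int8_t *b, int16_t *out, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (int16_t) (a[n / 2 + i] + b[n / 2 + i]);
}
```

Each call consumes n narrow elements and produces n/2 wide ones, so the pair
together covers the whole input without losing the carries that a narrow add
would drop.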


0001-vect-patterns-Refactor-to-allow-internal_fn-s.patch
Description: 0001-vect-patterns-Refactor-to-allow-internal_fn-s.patch


0002-vect-patterns-Refactor-widen_plus-as-internal_fn.patch
Description: 0002-vect-patterns-Refactor-widen_plus-as-internal_fn.patch


0003-Remove-widen_plus-minus_expr-tree-codes.patch
Description: 0003-Remove-widen_plus-minus_expr-tree-codes.patch


[PATCH] tree-optimization/102880 - improve CD-DCE

2021-11-16 Thread Richard Biener via Gcc-patches
The PR shows a missed control-dependent DCE caused by CFG cleanup
merging a forwarder resulting in a partially degenerate PHI node.
With control-dependent DCE we need to mark control dependences
of incoming edges into PHIs as necessary but that is unnecessarily
conservative for the case when two edges have the same value.
There is no easy way to mark only a subset of control dependences
of both edges necessary so the fix is to produce forwarder blocks
where then the control dependence captures the requirements more
precisely.
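
As a rough illustration of the grouping step, the following C sketch sorts
(edge, hash) pairs by hash so that incoming edges carrying the same PHI
argument value end up adjacent, then counts the groups.  The struct and
function names are hypothetical and the sketch is not the code from the patch;
it only models the idea of finding edges that could share a forwarder block.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an incoming edge and the hash of the PHI
   argument values it carries.  */
struct sketch_phi_arg
{
  int edge_index;
  unsigned value_hash;
};

/* qsort comparator ordering entries by their value hash.  */
static int
sketch_cmp_phi_arg (const void *a_, const void *b_)
{
  const struct sketch_phi_arg *a = a_;
  const struct sketch_phi_arg *b = b_;
  if (a->value_hash < b->value_hash)
    return -1;
  if (a->value_hash > b->value_hash)
    return 1;
  return 0;
}

/* Sort ARGS by hash and return the number of distinct hashes.  Each
   group with more than one member is a candidate for a forwarder.  */
static size_t
sketch_group_phi_args (struct sketch_phi_arg *args, size_t n)
{
  if (n == 0)
    return 0;
  qsort (args, n, sizeof *args, sketch_cmp_phi_arg);
  size_t groups = 1;
  for (size_t i = 1; i < n; i++)
    if (args[i].value_hash != args[i - 1].value_hash)
      groups++;
  return groups;
}
```

Equal hashes are only a hint in such a scheme; an actual implementation would
still need to compare the argument values themselves before merging edges.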

For gcc.dg/tree-ssa/ssa-dom-thread-7.c the number of edges in the
CFG decreases as we have commonized PHI arguments, which in turn
results in different threadings.  The testcase is too complex
and the dump scanning too simple to do anything meaningful here
other than adjust the number of expected threads.

The same CFG massaging could be useful at RTL expansion time to
reduce the number of copies we need to insert on edges.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-11-12  Richard Biener  

PR tree-optimization/102880
* tree-ssa-dce.c (sort_phi_args): New function.
(make_forwarders_with_degenerate_phis): Likewise.
(perform_tree_ssa_dce): Call
make_forwarders_with_degenerate_phis.

* gcc.dg/tree-ssa/pr102880.c: New testcase.
* gcc.dg/tree-ssa/pr69270-3.c: Robustify.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Change the number of
expected threadings.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr102880.c  |  27 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-7.c|   2 +-
 gcc/tree-ssa-dce.c| 171 +-
 4 files changed, 196 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102880.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102880.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr102880.c
new file mode 100644
index 000..0306deedb6c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102880.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void foo(void);
+
+static int b, c, d, e, f, ah;
+static short g, ai, am, aq, as;
+static char an, at, av, ax, ay;
+static char a(char h, char i) { return i == 0 || h && i == 1 ? 0 : h % i; }
+static void ae(int h) {
+  if (a(b, h))
+foo();
+
+}
+int main() {
+  ae(1);
+  ay = a(0, ay);
+  ax = a(g, aq);
+  at = a(0, as);
+  av = a(c, 1);
+  an = a(am, f);
+  int al = e || ((a(1, ah) && b) & d) == 2;
+  ai = al;
+}
+
+/* We should eliminate the call to foo.  */
+/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c
index 89735f67de2..5ffd5f71506 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c
@@ -3,7 +3,7 @@
 
 /* We're looking for a constant argument a PHI node.  There
should only be one if we unpropagate correctly.  */
-/* { dg-final { scan-tree-dump-times ", 1" 1 "uncprop1"} } */
+/* { dg-final { scan-tree-dump-times "<1\|, 1" 1 "uncprop1"} } */
 
 typedef long unsigned int size_t;
 typedef union gimple_statement_d *gimple;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index d40a61fd725..b64e71dae22 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -11,7 +11,7 @@
to change decisions in switch expansion which in turn can expose new
jump threading opportunities.  Skip the later tests on aarch64.  */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom3" { target { ! 
aarch64*-*-* } } } } */
-/* { dg-final { scan-tree-dump "Jumps threaded: 11"  "thread2" { target { ! 
aarch64*-*-* } } } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 7"  "thread2" { target { ! 
aarch64*-*-* } } } } */
 /* { dg-final { scan-tree-dump "Jumps threaded: 18"  "thread2" { target { 
aarch64*-*-* } } } } */
 
 enum STATE {
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 1281e67489c..dbf02c434de 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-scalar-evolution.h"
 #include "tree-ssa-propagate.h"
 #include "gimple-fold.h"
+#include "tree-ssa.h"
 
 static struct stmt_stats
 {
@@ -1612,6 +1613,164 @@ tree_dce_done (bool aggressive)
   worklist.release ();
 }
 
+/* Sort PHI argument values for make_forwarders_with_degenerate_phis.  */
+
+static int
+sort_phi_args (const void *a_, const void *b_)
+{
+  auto *a = (const std::pair<edge, hashval_t> *) a_;
+  auto *b = (const std::pair<edge, hashval_t> *) b_;
+  hashval_t ha = a->second;
+  hashval_t hb = b->second;
+  if (ha < hb)
+return -1;
+  else if (ha > hb)
+return 1;
+  else
+return 0;
+}
+
+/* Look for a non-virtual PHIs and make a forwarder block when all PHIs
+   have the sam

[committed] arc: Update (u)maddhisi4 patterns

2021-11-16 Thread Claudiu Zissulescu via Gcc-patches
The (u)maddhisi4 patterns use ARC's VMAC2H(U) instruction with a
null destination; however, VMAC2H(U) doesn't rewrite the
accumulator.  This patch solves the destination issue of VMAC2H
by replacing it with the DMACH(U) instruction.
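
For reference, the operation these patterns are expected to compute can be
modelled in plain C as below.  This is a sketch of the multiply-accumulate
semantics implied by the RTL (sign or zero extend two 16-bit operands,
multiply, add a 32-bit accumulator); the function names are hypothetical and
the code says nothing about how the ARC instructions behave internally.

```c
#include <stdint.h>

/* Signed form: sign-extend both 16-bit operands, multiply, and add the
   32-bit accumulator.  Assumes the sum does not overflow int32_t.  */
static int32_t
maddhisi4_sketch (int16_t a, int16_t b, int32_t acc)
{
  return acc + (int32_t) a * (int32_t) b;
}

/* Unsigned form: zero-extend both operands first so the product is
   computed in 32-bit unsigned arithmetic.  */
static uint32_t
umaddhisi4_sketch (uint16_t a, uint16_t b, uint32_t acc)
{
  return acc + (uint32_t) a * (uint32_t) b;
}
```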

gcc/

* config/arc/arc.md (maddhisi4): Use a single move to accumulator.
(umaddhisi4): Likewise.
(machi): Update pattern.
(umachi): Likewise.

gcc/testsuite/

* gcc.target/arc/tmac-4.c: New test.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/config/arc/arc.md | 34 +--
 gcc/testsuite/gcc.target/arc/tmac-4.c | 29 +++
 2 files changed, 46 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/tmac-4.c

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 4919d275820..74ec38f1526 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -6023,26 +6023,26 @@ (define_insn "stack_irq_dwarf"
 (define_expand "maddhisi4"
   [(match_operand:SI 0 "register_operand" "")
(match_operand:HI 1 "register_operand" "")
-   (match_operand:HI 2 "extend_operand"   "")
+   (match_operand:HI 2 "register_operand" "")
(match_operand:SI 3 "register_operand" "")]
   "TARGET_PLUS_MACD"
   "{
-   rtx acc_reg = gen_rtx_REG (SImode, ACC_REG_FIRST);
+   rtx acc_reg = gen_rtx_REG (SImode, ACCL_REGNO);
 
emit_move_insn (acc_reg, operands[3]);
-   emit_insn (gen_machi (operands[1], operands[2]));
-   emit_move_insn (operands[0], acc_reg);
+   emit_insn (gen_machi (operands[0], operands[1], operands[2], acc_reg));
DONE;
   }")
 
 (define_insn "machi"
-  [(set (reg:SI ARCV2_ACC)
+  [(set (match_operand:SI 0 "register_operand" "=Ral,r")
(plus:SI
-(mult:SI (sign_extend:SI (match_operand:HI 0 "register_operand" "%r"))
- (sign_extend:SI (match_operand:HI 1 "register_operand" "r")))
-(reg:SI ARCV2_ACC)))]
+(mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" 
"%r,r"))
+ (sign_extend:SI (match_operand:HI 2 "register_operand" 
"r,r")))
+(match_operand:SI 3 "accl_operand" "")))
+   (clobber (reg:DI ARCV2_ACC))]
   "TARGET_PLUS_MACD"
-  "vmac2h\\t0,%0,%1"
+  "dmach\\t%0,%1,%2"
   [(set_attr "length" "4")
(set_attr "type" "multi")
(set_attr "predicable" "no")
@@ -6056,22 +6056,22 @@ (define_expand "umaddhisi4"
(match_operand:SI 3 "register_operand" "")]
   "TARGET_PLUS_MACD"
   "{
-   rtx acc_reg = gen_rtx_REG (SImode, ACC_REG_FIRST);
+   rtx acc_reg = gen_rtx_REG (SImode, ACCL_REGNO);
 
emit_move_insn (acc_reg, operands[3]);
-   emit_insn (gen_umachi (operands[1], operands[2]));
-   emit_move_insn (operands[0], acc_reg);
+   emit_insn (gen_umachi (operands[0], operands[1], operands[2], acc_reg));
DONE;
   }")
 
 (define_insn "umachi"
-  [(set (reg:SI ARCV2_ACC)
+  [(set (match_operand:SI 0 "register_operand" "=Ral,r")
(plus:SI
-(mult:SI (zero_extend:SI (match_operand:HI 0 "register_operand" "%r"))
- (zero_extend:SI (match_operand:HI 1 "register_operand" "r")))
-(reg:SI ARCV2_ACC)))]
+(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "%r,r"))
+ (zero_extend:SI (match_operand:HI 2 "register_operand" "r,r")))
+(match_operand:SI 3 "accl_operand" "")))
+   (clobber (reg:DI ARCV2_ACC))]
   "TARGET_PLUS_MACD"
-  "vmac2hu\\t0,%0,%1"
+  "dmachu\\t%0,%1,%2"
   [(set_attr "length" "4")
(set_attr "type" "multi")
(set_attr "predicable" "no")
diff --git a/gcc/testsuite/gcc.target/arc/tmac-4.c b/gcc/testsuite/gcc.target/arc/tmac-4.c
new file mode 100644
index 000..3c6b99327a7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/tmac-4.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { ! { clmcpu } } } */
+/* { dg-options "-O3 -mbig-endian -mcpu=hs38" } */
+
+struct a {};
+struct b {
+  int c;
+  int d;
+};
+
+struct {
+  struct a e;
+  struct b f[];
+} g;
+short h;
+
+extern void bar (int *);
+
+int foo(void)
+{
+  struct b *a;
+  for (;;)
+{
+  a = &g.f[h];
+  bar(&a->d);
+}
+}
+
+/* { dg-final { scan-assembler "dmach" } } */
-- 
2.31.1



Re: [PATCH 1/5] libstdc++: Import the fast_float library

2021-11-16 Thread Jonathan Wakely via Gcc-patches
On Tue, 16 Nov 2021 at 09:46, Florian Weimer via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> * Jonathan Wakely:
>
> > On Tue, 16 Nov 2021 at 08:01, Florian Weimer wrote:
> >>
> >> * Patrick Palka via Libstdc:
> >>
> >> > This copies the fast_float library[1] into the compiled-in library
> >> > sources.  We're going to use this library in our floating-point
> >> > std::from_chars implementation for faster and more portable parsing of
> >> > binary32/64 decimal strings.
> >> >
> >> > [1]: https://github.com/fastfloat/fast_float
> >> >
> >> > Series tested on x86_64, i686, ppc64, ppc64le and aarch64, does it
> >> > look OK for trunk?
> >>
> >> Missing Signed-off-by:?
> >
> > That's not needed if Patrick is still covered by an FSF assignment.
>
> But the submission is not covered by the FSF assignment.
>

Good point.


> > I think we could use Apache as well, because this code isn't going to
> > appear in public headers so the problematic clause doesn't apply. But
> > MIT is simpler.
>
> Okay, so you consider dynamic linking only?  I think the historic
> libstdc++ license is more permissive than Apache or MIT when used with
> GCC.  There aren't any notification or other requirements.
>
>
Another good point - the Apache license is (once again) problematic here.
So it's good we can choose the MIT one.


[PATCH] regrename: Skip renaming if instruction is noop move.

2021-11-16 Thread Jojo R via Gcc-patches
Skip renaming if the instruction is a no-op move, since it will be
removed later for performance.

gcc/
* regrename.c (find_rename_reg): Return satisfied regno
if instruction is noop move.
---
 gcc/regrename.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/regrename.c b/gcc/regrename.c
index b8a9ca36f22..cb605f5176b 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -394,6 +394,9 @@ find_rename_reg (du_head_p this_head, enum reg_class super_class,
  this_head, *unavailable))
 return this_head->tied_chain->regno;
 
+  if (noop_move_p (this_head->first->insn))
+return best_new_reg;
+
   /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass
  over registers that belong to PREFERRED_CLASS and try to find the
  best register within the class.  If that failed, we iterate in
-- 
2.24.3 (Apple Git-128)



Re: [PATCH][GCC] arm: add armv9-a architecture to -march

2021-11-16 Thread Richard Earnshaw via Gcc-patches
You can't make an omelette without breaking eggs, as they say.  New 
architectures need new assemblers.


However, I wonder if there's anything in v9-a that significantly affects 
the quality of the base multilib code needed for building the libraries. 
 It might be that we can deal with v9-a by just mapping it to the v8-a 
equivalents.  That would then avoid the need for an updated assembler, 
and reduce the build time and install footprint.


R.


On 16/11/2021 08:03, Christophe Lyon via Gcc-patches wrote:

Hi,


On Tue, Nov 9, 2021 at 12:36 PM Przemyslaw Wirkus via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:


-Original Message-
From: Przemyslaw Wirkus
Sent: 18 October 2021 10:37
To: gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw ; Ramana
Radhakrishnan ; Kyrylo Tkachov
; ni...@redhat.com
Subject: [PATCH][GCC] arm: add armv9-a architecture to -march

Hi,

This patch is adding `armv9-a` to -march in Arm GCC.

In this patch:
   + Add `armv9-a` to -march.
   + Update multilib with armv9-a and armv9-a+simd.

After this patch three additional multilib directories are available:

$ arm-none-eabi-gcc --print-multi-lib
.;
[...vanilla multi-lib dirs...]
thumb/v9-a/nofp;@mthumb@march=armv9-a@mfloat-abi=soft
thumb/v9-a+simd/softfp;@mthumb@march=armv9-a+simd@mfloat-abi=softfp
thumb/v9-a+simd/hard;@mthumb@march=armv9-a+simd@mfloat-abi=hard





This is causing a GCC build failure when using "old" binutils (I'm using
2.36.1), because the new -march=armv9-a option is not supported. This
breaks the multilib support.

I don't remember how we handled similar cases in the past. Is that just
"expected", and "current" GCC needs "current" binutils, or should we have
a multilib list dependent on the actual binutils support? (I think this is
not the case, and it sounds like an undesirable extra complication in an
already overcrowded multilib Makefile.)

Christophe


New multi-lib directories under

$GCC_INSTALL_DIR/lib/gcc/arm-none-eabi/12.0.0/thumb are created:

thumb/
+--- v9-a
|    |--- nofp
|
+--- v9-a+simd
     |--- hard
     |--- softfp

Regtested on arm-none-eabi cross and no issues.

OK for master?


Thanks.

commit 32ba7860ccaddd5219e6dae94a3d0653e124c9dd


Ok.
Thanks,
Kyrill




gcc/ChangeLog:

   * config/arm/arm-cpus.in (armv9): New define.
   (ARMv9a): New group.
   (armv9-a): New arch definition.
   * config/arm/arm-tables.opt: Regenerate.
   * config/arm/arm.h (BASE_ARCH_9A): New arch enum value.
   * config/arm/t-aprofile: Added armv9-a and armv9+simd.
   * config/arm/t-arm-elf: Added arm9-a, v9_fps and all_v9_archs
   to MULTILIB_MATCHES.
   * config/arm/t-multilib: Added v9_a_nosimd_variants and
   v9_a_simd_variants to MULTILIB_MATCHES.
   * doc/invoke.texi: Update docs.

gcc/testsuite/ChangeLog:

   * gcc.target/arm/multilib.exp: Update test with armv9-a entries.
   * lib/target-supports.exp (v9a): Add new armflag.
   (__ARM_ARCH_9A__): Add new armdef.

--
kind regards,
Przemyslaw Wirkus





[PATCH] OpenMP: Ensure that offloaded variables are public

2021-11-16 Thread Andrew Stubbs

Hi,

This patch is needed for AMD GCN offloading when we use the assembler 
from LLVM 13+.


The GCN runtime (libgomp+ROCm) requires that the locations of all
variables in the offloaded variables table are discoverable at runtime
(using the "hsa_executable_symbol_get_info" API), and this only works
when the symbols are exported from the binary. Previously we solved this
by having mkoffload insert ".global" directives into the assembler text,
but newer LLVM assemblers emit an error if we do this when the variable
was previously declared ".local" (which happens when a variable is
zero-initialized and placed in the BSS).


Since we can no longer easily fix them up after the fact, this patch 
fixes them up during OMP lowering.


OK?

Andrew

OpenMP: Ensure that offloaded variables are public

The AMD GCN runtime loader requires that variables in the offload table are
exported (public) so that it can locate the load address and do the mapping.

gcc/ChangeLog:

* config/gcn/mkoffload.c (process_asm): Don't add .global directives.
* omp-offload.c (pass_omp_target_link::execute): Make offload_vars
public.

diff --git a/gcc/config/gcn/mkoffload.c b/gcc/config/gcn/mkoffload.c
index b2e71ea5aa00..5b130cc6de71 100644
--- a/gcc/config/gcn/mkoffload.c
+++ b/gcc/config/gcn/mkoffload.c
@@ -573,10 +573,6 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
  abort ();
obstack_int_grow (&varsizes_os, varsize);
var_count++;
-
-   /* The HSA Runtime cannot locate the symbol if it is not
-  exported from the kernel.  */
-   fprintf (out, "\t.global %s\n", varname);
  }
break;
  }
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 833f7ddea58f..c6fb87a5dee2 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -2799,6 +2799,18 @@ pass_omp_target_link::execute (function *fun)
}
 }
 
+  /* Variables in the offload table may need to be public for the runtime
+ loader to be able to locate them.  (This is true for at least amdgcn.)  */
+  if (offload_vars)
+for (auto it = offload_vars->begin (); it != offload_vars->end (); it++)
+if (!TREE_PUBLIC (*it))
+  {
+   TREE_PUBLIC (*it) = 1;
+
+   if (dump_enabled_p () && dump_flags & TDF_DETAILS)
+ dump_printf (MSG_NOTE, "Make offload var public: %T\n", *it);
+  }
+
   return 0;
 }
 


[PATCH] middle-end/103248 - fix RDIV_EXPR handling with fixed point

2021-11-16 Thread Richard Biener via Gcc-patches
This fixes the previous adjustment to operation_could_trap_helper_p
where I failed to realize that RDIV_EXPR is also used for
fixed-point types.  It also fixes that handling by properly
checking for a fixed_zerop divisor.
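
To make the intended decision easier to follow, here is a minimal standalone sketch in C of the logic the patch gives the RDIV_EXPR case.  It is not GCC code: struct divisor_model and rdiv_could_trap are hypothetical stand-ins for the divisor tree and for the relevant branch of operation_could_trap_helper_p, and the flags are passed as plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the divisor operand: whether it is a
   compile-time constant and, if so, whether that constant is zero.  */
struct divisor_model
{
  bool is_constant;
  bool is_zero;
};

/* Model of the patched RDIV_EXPR case.  Floating-point division keeps
   the old behaviour; fixed-point division may trap unless the divisor
   is a known non-zero constant.  */
bool
rdiv_could_trap (bool fp_operation, bool honor_snans, bool trapping_math,
                 struct divisor_model divisor)
{
  /* An is_zero flag only makes sense for a constant divisor.  */
  assert (divisor.is_constant || !divisor.is_zero);

  if (fp_operation)
    {
      if (honor_snans)
        return true;
      return trapping_math;
    }

  /* Fixed-point operations also use RDIV_EXPR.  */
  if (!divisor.is_constant || divisor.is_zero)
    return true;
  return false;
}
```

In this model the testcase below corresponds to a fixed-point division by a non-constant divisor, which is reported as possibly trapping.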

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

2021-11-16  Richard Biener  

PR middle-end/103248
* tree-eh.c (operation_could_trap_helper_p): Properly handle
fixed-point RDIV_EXPR.

* gcc.dg/pr103248.c: New testcase.
---
 gcc/testsuite/gcc.dg/pr103248.c |  8 
 gcc/tree-eh.c   | 12 +---
 2 files changed, 17 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103248.c

diff --git a/gcc/testsuite/gcc.dg/pr103248.c b/gcc/testsuite/gcc.dg/pr103248.c
new file mode 100644
index 000..da6232d21ee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103248.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target fixed_point } */
+/* { dg-options "-fnon-call-exceptions" } */
+
+_Accum sa;
+int c;
+
+void div_csa() { c /= sa; }
diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c
index 3eff07fc8fe..916da85af2e 100644
--- a/gcc/tree-eh.c
+++ b/gcc/tree-eh.c
@@ -2474,10 +2474,16 @@ operation_could_trap_helper_p (enum tree_code op,
   return false;
 
 case RDIV_EXPR:
-  if (honor_snans)
+  if (fp_operation)
+   {
+ if (honor_snans)
+   return true;
+ return flag_trapping_math;
+   }
+  /* Fixed point operations also use RDIV_EXPR.  */
+  if (!TREE_CONSTANT (divisor) || fixed_zerop (divisor))
return true;
-  gcc_assert (fp_operation);
-  return flag_trapping_math;
+  return false;
 
 case LT_EXPR:
 case LE_EXPR:
-- 
2.31.1


Re: Use modref kills in tree-ssa-dse

2021-11-16 Thread Richard Biener via Gcc-patches
On Mon, 15 Nov 2021, Jan Hubicka wrote:

> Hi,
> this patch extends tree-ssa-dse to use modref kill summary to clear
> live_bytes.  This makes it possible to remove calls that are killed
> in parts.
> 
> I noticed that DSE duplicates the logic of tree-ssa-alias that is 
> matching bases of memory accesses.  Here operand_equal_p (base1, base, 
> OEP_ADDRESS_OF) is used. So it won't work with mismatching memref 
> offsets.  We probably want to commonize this and add common function 
> that matches bases and returns offset adjustments. I wonder however if 
> it can catch any cases that the tree-ssa-alias code doesn't?

Not sure, tree-ssa-dse.c doesn't seem to handle MEM_REF with offset?

VN has adjust_offsets_for_equal_base_address for this purpose.  I
agree that some common functionality would be useful, something like

bool
get_relative_extent_of (const ao_ref *base, const ao_ref *ref,
poly_int64 *offset);

that computes [offset, offset + ref->[max_]size] of REF adjusted as to
make ao_ref_base have the same address (or return false if not
possible).  Then [ base->offset, base->offset + base->max_size ]
can be compared against that.
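
As a rough illustration of that comparison, a standalone C sketch follows.  It does not use ao_ref: struct access_model, relative_extent_of and covered_by are hypothetical names, the "object" field stands in for whatever identifies the common base address, and all offsets are assumed to be compile-time constants in bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a memory reference.  */
struct access_model
{
  int object;        /* identity of the underlying pointer or decl */
  long base_adjust;  /* constant offset folded into the base, in bits */
  long offset;       /* offset of the access from its base, in bits */
  long max_size;     /* maximum extent of the access, in bits */
};

/* Express the start of REF relative to the base address of BASE.
   Return false if the two do not share an underlying object.  */
bool
relative_extent_of (const struct access_model *base,
                    const struct access_model *ref, long *offset)
{
  assert (offset != 0);
  if (base->object != ref->object)
    return false;
  *offset = ref->offset + ref->base_adjust - base->base_adjust;
  return true;
}

/* Return true if REF lies entirely inside the extent of BASE.  */
bool
covered_by (const struct access_model *base, const struct access_model *ref)
{
  long rel;
  if (!relative_extent_of (base, ref, &rel))
    return false;
  return rel >= base->offset
         && rel + ref->max_size <= base->offset + base->max_size;
}
```

The point of the sketch is only that once both references are expressed against the same address, the overlap test reduces to plain interval arithmetic.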

> The other check that stmt_kills_ref_p has and tree-ssa-dse lacks is for
> non-call-exceptions.
> 
> Bootstrapped/regtested x86_64-linux, OK?

See below.

> gcc/ChangeLog:
> 
>   * ipa-modref.c (get_modref_function_summary): New function.
>   * ipa-modref.h (get_modref_function_summary): Declare.
>   * tree-ssa-dse.c (clear_live_bytes_for_ref): Break out from ...
>   (clear_bytes_written_by): ... here; add handling of modref summary.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/modref-dse-4.c: New test.
> 
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index df4612bbff9..8966f9fd2a4 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -724,6 +724,22 @@ get_modref_function_summary (cgraph_node *func)
>return r;
>  }
>  
> +/* Get function summary for CALL if it exists, return NULL otherwise.
> +   If INTERPOSED is non-NULL set it to true if call may be interposed.  */
> +
> +modref_summary *
> +get_modref_function_summary (gcall *call, bool *interposed)
> +{
> +  tree callee = gimple_call_fndecl (call);
> +  if (!callee)
> +return NULL;
> +  struct cgraph_node *node = cgraph_node::get (callee);
> +  if (!node)
> +return NULL;
> +  if (interposed)
> +*interposed = !node->binds_to_current_def_p ();
> +  return get_modref_function_summary (node);
> +}
> +
>  namespace {
>  
>  /* Construct modref_access_node from REF.  */
> diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h
> index 9e8a30fd80a..72e608864ce 100644
> --- a/gcc/ipa-modref.h
> +++ b/gcc/ipa-modref.h
> @@ -50,6 +50,7 @@ struct GTY(()) modref_summary
>  };
>  
>  modref_summary *get_modref_function_summary (cgraph_node *func);
> +modref_summary *get_modref_function_summary (gcall *call, bool *interposed);
>  void ipa_modref_c_finalize ();
>  void ipa_merge_modref_summary_after_inlining (cgraph_edge *e);
>  
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-4.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-4.c
> new file mode 100644
> index 000..81aa7dc587c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-4.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-dse2-details"  } */
> +struct a {int a,b,c;};
> +__attribute__ ((noinline))
> +void
> +kill_me (struct a *a)
> +{
> +  a->a=0;
> +  a->b=0;
> +  a->c=0;
> +}
> +__attribute__ ((noinline))
> +void
> +my_pleasure (struct a *a)
> +{
> +  a->a=1;
> +  a->c=2;
> +}
> +void
> +set (struct a *a)
> +{
> +  kill_me (a);
> +  my_pleasure (a);
> +  a->b=1;
> +}
> +/* { dg-final { scan-tree-dump "Deleted dead store: kill_me" "dse2" } } */
> diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
> index ce0083a6dab..d2f54b0faad 100644
> --- a/gcc/tree-ssa-dse.c
> +++ b/gcc/tree-ssa-dse.c
> @@ -209,6 +209,24 @@ normalize_ref (ao_ref *copy, ao_ref *ref)
>return true;
>  }
>  
> +/* Update LIVE_BYTES tracking REF for write to WRITE:
> +   Verify we have the same base memory address, the write
> +   has a known size and overlaps with REF.  */
> +static void
> +clear_live_bytes_for_ref (sbitmap live_bytes, ao_ref *ref, ao_ref *write)
> +{
> +  HOST_WIDE_INT start, size;
> +
> +  if (valid_ao_ref_for_dse (write)
> +  && operand_equal_p (write->base, ref->base, OEP_ADDRESS_OF)
> +  && known_eq (write->size, write->max_size)
> +  && normalize_ref (write, ref)

normalize_ref alters 'write', I think we should work on a local
copy here.  See live_bytes_read which takes a copy of 'use_ref'.
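
A minimal sketch of what working on a local copy could look like, written as standalone C rather than against the real ao_ref API.  struct byte_range, clamp_to_ref and clear_live_bytes_for_write are hypothetical names, ranges are in bytes, and the live-bytes bitmap is modelled as one unsigned char per byte of REF.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the part of a reference we care about.  */
struct byte_range
{
  long offset;
  long size;
};

/* Shrink COPY to the part that overlaps REF.  Returns false if there is
   no overlap.  Like normalize_ref, this modifies COPY.  */
bool
clamp_to_ref (struct byte_range *copy, const struct byte_range *ref)
{
  long start = copy->offset > ref->offset ? copy->offset : ref->offset;
  long copy_end = copy->offset + copy->size;
  long ref_end = ref->offset + ref->size;
  long end = copy_end < ref_end ? copy_end : ref_end;

  if (start >= end)
    return false;
  copy->offset = start;
  copy->size = end - start;
  return true;
}

/* Clear the bytes of REF that WRITE overwrites.  The clamping is done on
   a local copy so the caller's WRITE is left untouched.  */
void
clear_live_bytes_for_write (unsigned char *live, const struct byte_range *ref,
                            const struct byte_range *write)
{
  struct byte_range w = *write;

  assert (live != 0);
  if (clamp_to_ref (&w, ref))
    memset (live + (w.offset - ref->offset), 0, (size_t) w.size);
}
```

Taking the write by const pointer and copying it inside makes it impossible for a later caller to observe the clamped range by accident.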

Otherwise looks good to me.

Thanks,
Richard.

> +  && (write->offset - ref->offset).is_constant (&start)
> +  && write->size.is_constant (&size))
> +bitmap_clear_range (live_bytes, start / BITS_PER_UNIT,
> + size / BITS_PER_UNIT);
> +}
> +
>  /* Clear any bytes written by STMT from the bitmap LIVE_BYTES.  The base
> address written by STMT must mat

Re: [AArch64] Enable generation of FRINTNZ instructions

2021-11-16 Thread Richard Biener via Gcc-patches
On Fri, 12 Nov 2021, Andre Simoes Dias Vieira wrote:

> 
> On 12/11/2021 10:56, Richard Biener wrote:
> > On Thu, 11 Nov 2021, Andre Vieira (lists) wrote:
> >
> >> Hi,
> >>
> >> This patch introduces two IFN's FTRUNC32 and FTRUNC64, the corresponding
> >> optabs and mappings. It also creates a backend pattern to implement them
> >> for
> >> aarch64 and a match.pd pattern to idiom recognize these.
> >> These IFN's (and optabs) represent a truncation towards zero, as if
> >> performed
> >> by first casting it to a signed integer of 32 or 64 bits and then back to
> >> the
> >> same floating point type/mode.
> >>
> >> The match.pd pattern chooses to use these, when supported, regardless of
> >> trapping math, since these new patterns mimic the original behavior of
> >> truncating through an integer.
> >>
> >> I didn't think any of the existing IFN's represented these. I know it's a
> >> bit
> >> late in stage 1, but I thought this might be OK given it's only used by a
> >> single target and should have very little impact on anything else.
> >>
> >> Bootstrapped on aarch64-none-linux.
> >>
> >> OK for trunk?
> > On the RTL side ftrunc32/ftrunc64 would probably be better a conversion
> > optab (with two modes), so not
> >
> > +OPTAB_D (ftrunc32_optab, "ftrunc$asi2")
> > +OPTAB_D (ftrunc64_optab, "ftrunc$adi2")
> >
> > but
> >
> > OPTAB_CD (ftrunc_shrt_optab, "ftrunc$a$I$b2")
> >
> > or so?  I know that gets somewhat awkward for the internal function,
> > but IMHO we shouldn't tie our hands because of that?
> I tried doing this originally, but indeed I couldn't find a way to correctly
> tie the internal function to it.
> 
> direct_optab_supported_p with multiple types expect those to be of the same
> mode. I see convert_optab_supported_p does but I don't know how that is
> used...
> 
> Any ideas?

No "nice" ones.  The "usual" way is to provide fake arguments that
specify the type/mode.  We could use an integer argument directly
specifying the mode (then the IL would look host dependent - ugh),
or specify a constant zero in the intended mode (less visibly
obvious - but at least with -gimple dumping you'd see the type...).

In any case if people think going with two optabs is OK then
please consider using ftruncsi and ftruncdi instead of 32/64.
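
For reference, the source-level idiom that the quoted description refers to (truncation toward zero by casting to a signed 32- or 64-bit integer and back) can be written in plain C as below.  The function names are made up for illustration, and this only shows the C semantics of the idiom, not what the proposed IFNs or the AArch64 instructions do for out-of-range inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Truncate toward zero through a 32-bit signed integer.  In C this is
   only defined when the truncated value fits in int32_t.  */
float
trunc_via_int32 (float x)
{
  return (float) (int32_t) x;
}

/* Same idiom through a 64-bit signed integer.  */
double
trunc_via_int64 (double x)
{
  return (double) (int64_t) x;
}
```

Note that in C the cast form yields +0.0 for inputs strictly between -1 and 0, where truncf/trunc would yield -0.0, and it is undefined when the value does not fit in the integer type.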

Richard.


Re: [PATCH] regrename: Skip renaming if instruction is noop move.

2021-11-16 Thread Richard Biener via Gcc-patches
On Tue, Nov 16, 2021 at 12:45 PM Jojo R via Gcc-patches
 wrote:
>
> Skip renaming if the instruction is a no-op move, since it will be
> removed later for performance.

Is there any (target specific) testcase you can add?  Such commits are
problematic when later bisected to, since the intent isn't clear.

> gcc/
> * regrename.c (find_rename_reg): Return satisfied regno
> if instruction is noop move.
> ---
>  gcc/regrename.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> index b8a9ca36f22..cb605f5176b 100644
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -394,6 +394,9 @@ find_rename_reg (du_head_p this_head, enum reg_class super_class,
>   this_head, *unavailable))
>  return this_head->tied_chain->regno;
>
> +  if (noop_move_p (this_head->first->insn))
> +return best_new_reg;
> +
>/* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass
>   over registers that belong to PREFERRED_CLASS and try to find the
>   best register within the class.  If that failed, we iterate in
> --
> 2.24.3 (Apple Git-128)
>


Re: [PATCH 5/5] vect: Support masked gather loads with SLP

2021-11-16 Thread Richard Biener via Gcc-patches
On Fri, Nov 12, 2021 at 7:06 PM Richard Sandiford via Gcc-patches
 wrote:
>
> This patch extends the previous SLP gather load support so
> that it can handle masked loads too.
>
> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.

> Richard
>
>
> gcc/
> * tree-vect-slp.c (arg1_arg4_map): New variable.
> (vect_get_operand_map): Handle IFN_MASK_GATHER_LOAD.
> (vect_build_slp_tree_1): Likewise.
> (vect_build_slp_tree_2): Likewise.
> * tree-vect-stmts.c (vectorizable_load): Expect the mask to be
> the last SLP child node rather than the first.
>
> gcc/testsuite/
> * gcc.dg/vect/vect-gather-3.c: New test.
> * gcc.dg/vect/vect-gather-4.c: Likewise.
> * gcc.target/aarch64/sve/mask_gather_load_8.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-gather-3.c | 64 ++
>  gcc/testsuite/gcc.dg/vect/vect-gather-4.c | 48 ++
>  .../aarch64/sve/mask_gather_load_8.c  | 65 +++
>  gcc/tree-vect-slp.c   | 15 -
>  gcc/tree-vect-stmts.c | 21 --
>  5 files changed, 203 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-gather-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-gather-4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/mask_gather_load_8.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-gather-3.c b/gcc/testsuite/gcc.dg/vect/vect-gather-3.c
> new file mode 100644
> index 000..738bd3f3106
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-gather-3.c
> @@ -0,0 +1,64 @@
> +#include "tree-vect.h"
> +
> +#define N 16
> +
> +void __attribute__((noipa))
> +f (int *restrict y, int *restrict x, int *restrict indices)
> +{
> +  for (int i = 0; i < N; ++i)
> +{
> +  y[i * 2] = (indices[i * 2] < N * 2
> + ? x[indices[i * 2]] + 1
> + : 1);
> +  y[i * 2 + 1] = (indices[i * 2 + 1] < N * 2
> + ? x[indices[i * 2 + 1]] + 2
> + : 2);
> +}
> +}
> +
> +int y[N * 2];
> +int x[N * 2] = {
> +  72704, 52152, 51301, 96681,
> +  57937, 60490, 34504, 60944,
> +  42225, 28333, 88336, 74300,
> +  29250, 20484, 38852, 91536,
> +  86917, 63941, 31590, 21998,
> +  22419, 26974, 28668, 13968,
> +  3451, 20247, 44089, 85521,
> +  22871, 87362, 50555, 85939
> +};
> +int indices[N * 2] = {
> +  15, 0x1, 0xcafe0, 19,
> +  7, 22, 19, 1,
> +  0x2, 0x7, 15, 30,
> +  5, 12, 11, 11,
> +  10, 25, 5, 20,
> +  22, 24, 32, 28,
> +  30, 19, 6, 0xabcdef,
> +  7, 12, 8, 21
> +};
> +int expected[N * 2] = {
> +  91537, 2, 1, 22000,
> +  60945, 28670, 21999, 52154,
> +  1, 2, 91537, 50557,
> +  60491, 29252, 74301, 74302,
> +  88337, 20249, 60491, 22421,
> +  28669, 3453, 1, 22873,
> +  50556, 22000, 34505, 2,
> +  60945, 29252, 42226, 26976
> +};
> +
> +int
> +main (void)
> +{
> +  check_vect ();
> +
> +  f (y, x, indices);
> +  for (int i = 0; i < 32; ++i)
> +if (y[i] != expected[i])
> +  __builtin_abort ();
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "Loop contains only SLP stmts" vect { target { vect_gather_load_ifn && vect_masked_load } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-gather-4.c b/gcc/testsuite/gcc.dg/vect/vect-gather-4.c
> new file mode 100644
> index 000..ee2e4e4999a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-gather-4.c
> @@ -0,0 +1,48 @@
> +/* { dg-do compile } */
> +
> +#define N 16
> +
> +void
> +f1 (int *restrict y, int *restrict x1, int *restrict x2,
> +int *restrict indices)
> +{
> +  for (int i = 0; i < N; ++i)
> +{
> +  y[i * 2] = (indices[i * 2] < N * 2
> + ? x1[indices[i * 2]] + 1
> + : 1);
> +  y[i * 2 + 1] = (indices[i * 2 + 1] < N * 2
> + ? x2[indices[i * 2 + 1]] + 2
> + : 2);
> +}
> +}
> +
> +void
> +f2 (int *restrict y, int *restrict x, int *restrict indices)
> +{
> +  for (int i = 0; i < N; ++i)
> +{
> +  y[i * 2] = (indices[i * 2] < N * 2
> + ? x[indices[i * 2]] + 1
> + : 1);
> +  y[i * 2 + 1] = (indices[i * 2 + 1] < N * 2
> + ? x[indices[i * 2 + 1] * 2] + 2
> + : 2);
> +}
> +}
> +
> +void
> +f3 (int *restrict y, int *restrict x, int *restrict indices)
> +{
> +  for (int i = 0; i < N; ++i)
> +{
> +  y[i * 2] = (indices[i * 2] < N * 2
> + ? x[indices[i * 2]] + 1
> + : 1);
> +  y[i * 2 + 1] = (indices[i * 2 + 1] < N * 2
> + ? x[(unsigned int) indices[i * 2 + 1]] + 2
> + : 2);
> +}
> +}
> +
> +/* { dg-final { scan-tree-dump-not "Loop contains only SLP stmts" vect { target vect_gather_load_ifn } } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/mask_gather_load_8.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/ma

Re: [PATCH][GCC] arm: add armv9-a architecture to -march

2021-11-16 Thread Ramana Radhakrishnan via Gcc-patches
Hi There,

I think for AArch32 mapping it back to armv8-a sounds sufficient.  Unless we 
have string or math routines in newlib that make use of any ACLE guards that 
are beyond armv8-a …

Ramana


From: Richard Earnshaw 
Date: Tuesday, 16 November 2021 at 11:48
To: Christophe Lyon , Przemyslaw Wirkus 

Cc: Ramana Radhakrishnan , 
gcc-patches@gcc.gnu.org , Richard Earnshaw 

Subject: Re: [PATCH][GCC] arm: add armv9-a architecture to -march
You can't make an omelette without breaking eggs, as they say.  New
architectures need new assemblers.

However, I wonder if there's anything in v9-a that significantly affects
the quality of the base multilib code needed for building the libraries.
  It might be that we can deal with v9-a by just mapping it to the v8-a
equivalents.  That would then avoid the need for an updated assembler,
and reduce the build time and install footprint.

R.


On 16/11/2021 08:03, Christophe Lyon via Gcc-patches wrote:
> Hi,
>
>
> On Tue, Nov 9, 2021 at 12:36 PM Przemyslaw Wirkus via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
> -Original Message-
> From: Przemyslaw Wirkus
> Sent: 18 October 2021 10:37
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Ramana
> Radhakrishnan ; Kyrylo Tkachov
> ; ni...@redhat.com
> Subject: [PATCH][GCC] arm: add armv9-a architecture to -march
>
> Hi,
>
> This patch is adding `armv9-a` to -march in Arm GCC.
>
> In this patch:
>+ Add `armv9-a` to -march.
>+ Update multilib with armv9-a and armv9-a+simd.
>
> After this patch three additional multilib directories are available:
>
> $ arm-none-eabi-gcc --print-multi-lib
> .;
> [...vanilla multi-lib dirs...]
> thumb/v9-a/nofp;@mthumb@march=armv9-a@mfloat-abi=soft
> thumb/v9-a+simd/softfp;@mthumb@march=armv9-a+simd@mfloat-abi=softfp
> thumb/v9-a+simd/hard;@mthumb@march=armv9-a+simd@mfloat-abi=hard
>
>>
>
> This is causing a GCC build failure when using "old" binutils (I'm using
> 2.36.1), because the new -march=armv9-a option is not supported. This
> breaks the multilib support.
>
> I don't remember how we handled similar cases in the past. Is that just
> "expected", and "current" GCC needs "current" binutils, or should we have
> a multilib list dependent on the actual binutils support? (I think this is
> not the case, and it sounds like an undesirable extra complication in an
> already overcrowded multilib Makefile.)
>
> Christophe
>
 New multi-lib directories under
> $GCC_INSTALL_DIR/lib/gcc/arm-none-eabi/12.0.0/thumb are created:
>
> thumb/
> +--- v9-a
> |    |--- nofp
> |
> +--- v9-a+simd
>      |--- hard
>      |--- softfp
>
> Regtested on arm-none-eabi cross and no issues.
>
> OK for master?
>>
>> Thanks.
>>
>> commit 32ba7860ccaddd5219e6dae94a3d0653e124c9dd
>>
>>> Ok.
>>> Thanks,
>>> Kyrill
>>>
>>>
>
> gcc/ChangeLog:
>
>* config/arm/arm-cpus.in (armv9): New define.
>(ARMv9a): New group.
>(armv9-a): New arch definition.
>* config/arm/arm-tables.opt: Regenerate.
>* config/arm/arm.h (BASE_ARCH_9A): New arch enum value.
>* config/arm/t-aprofile: Added armv9-a and armv9+simd.
>* config/arm/t-arm-elf: Added arm9-a, v9_fps and all_v9_archs
>to MULTILIB_MATCHES.
>* config/arm/t-multilib: Added v9_a_nosimd_variants and
>v9_a_simd_variants to MULTILIB_MATCHES.
>* doc/invoke.texi: Update docs.
>
> gcc/testsuite/ChangeLog:
>
>* gcc.target/arm/multilib.exp: Update test with armv9-a entries.
>* lib/target-supports.exp (v9a): Add new armflag.
>(__ARM_ARCH_9A__): Add new armdef.
>
> --
> kind regards,
> Przemyslaw Wirkus
>>
>>


Re: Use modref kills in tree-ssa-dse

2021-11-16 Thread Jan Hubicka via Gcc-patches
> 
> Not sure, tree-ssa-dse.c doesn't seem to handle MEM_REF with offset?
> 
> VN has adjust_offsets_for_equal_base_address for this purpose.  I
> agree that some common functionality like
> 
> bool
> get_relative_extent_of (const ao_ref *base, const ao_ref *ref,
> poly_int64 *offset);
> 
> that computes [offset, offset + ref->[max_]size] of REF adjusted as to
> make ao_ref_base have the same address (or return false if not
> possible).  Then [ base->offset, base->offset + base->max_size ]
> can be compared against that.

OK, I will look into that.
> > +  if (valid_ao_ref_for_dse (write)
> > +  && operand_equal_p (write->base, ref->base, OEP_ADDRESS_OF)
> > +  && known_eq (write->size, write->max_size)
> > +  && normalize_ref (write, ref)
> 
> normalize_ref alters 'write', I think we should work on a local
> copy here.  See live_bytes_read which takes a copy of 'use_ref'.

We never process the same write twice (get_ao_ref always constructs a
fresh copy), so this should be safe.  Or shall I turn the write
parameter into "ao_ref write" instead of "ao_ref *write" just to be sure
we do not break this in the future?

Thank you,
Honza


Re: [PATCH] ivopts: Improve code generated for very simple loops.

2021-11-16 Thread Richard Biener via Gcc-patches
On Mon, Nov 15, 2021 at 2:04 PM Roger Sayle  wrote:
>
>
> This patch tidies up the code that GCC generates for simple loops,
> by selecting/generating a simpler loop bound expression in ivopts.
> The original motivation came from looking at the following loop (from
> gcc.target/i386/pr90178.c)
>
> int *find_ptr (int* mem, int sz, int val)
> {
>   for (int i = 0; i < sz; i++)
> if (mem[i] == val)
>   return &mem[i];
>   return 0;
> }
>
> which GCC currently compiles to:
>
> find_ptr:
> movq%rdi, %rax
> testl   %esi, %esi
> jle .L4
> leal-1(%rsi), %ecx
> leaq4(%rdi,%rcx,4), %rcx
> jmp .L3
> .L7:addq$4, %rax
> cmpq%rcx, %rax
> je  .L4
> .L3:cmpl%edx, (%rax)
> jne .L7
> ret
> .L4:xorl%eax, %eax
> ret
>
> Notice the relatively complex leal/leaq instructions, that result
> from ivopts using the following expression for the loop bound:
> inv_expr 2: ((unsigned long) ((unsigned int) sz_8(D) + 4294967295)
> * 4 + (unsigned long) mem_9(D)) + 4
>
> which results from NITERS being (unsigned int) sz_8(D) + 4294967295,
> i.e. (sz - 1), and the logic in cand_value_at determining the bound
> as BASE + NITERS*STEP at the start of the final iteration and as
> BASE + NITERS*STEP + STEP at the end of the final iteration.
>
> Ideally, we'd like the middle-end optimizers to simplify
> BASE + NITERS*STEP + STEP as BASE + (NITERS+1)*STEP, especially
> when NITERS already has the form BOUND-1, but with type conversions
> and possible overflow to worry about, the above "inv_expr 2" is the
> best that can be done by fold (without additional context information).
>
> This patch improves ivopts' cand_value_at by instead of using just
> the tree expression for NITERS, passing the data structure that
> explains how that expression was derived.  This allows us to peek
> under the surface to check that NITERS+1 doesn't overflow, and in
> this patch to use the SSA_NAME already holding the required value.
>
> In the motivating loop above, inv_expr 2 now becomes:
> (unsigned long) sz_8(D) * 4 + (unsigned long) mem_9(D)
>
> And as a result, on x86_64 we now generate:
>
> find_ptr:
>         movq    %rdi, %rax
>         testl   %esi, %esi
>         jle     .L4
>         movslq  %esi, %rsi
>         leaq    (%rdi,%rsi,4), %rcx
>         jmp     .L3
> .L7:    addq    $4, %rax
>         cmpq    %rcx, %rax
>         je      .L4
> .L3:    cmpl    %edx, (%rax)
>         jne     .L7
>         ret
> .L4:    xorl    %eax, %eax
>         ret
>
>
> This improvement required one minor tweak to GCC's testsuite for
> gcc.dg/wrapped-binop-simplify.c, where we again generate better
> code, and therefore no longer find as many optimization opportunities
> in later passes (vrp2).
>
> Previously:
>
> void v1 (unsigned long *in, unsigned long *out, unsigned int n)
> {
>   int i;
>   for (i = 0; i < n; i++) {
>     out[i] = in[i];
>   }
> }
>
> on x86_64 generated:
> v1:     testl   %edx, %edx
>         je      .L1
>         movl    %edx, %edx
>         xorl    %eax, %eax
> .L3:    movq    (%rdi,%rax,8), %rcx
>         movq    %rcx, (%rsi,%rax,8)
>         addq    $1, %rax
>         cmpq    %rax, %rdx
>         jne     .L3
> .L1:    ret
>
> and now instead generates:
> v1:     testl   %edx, %edx
>         je      .L1
>         movl    %edx, %edx
>         xorl    %eax, %eax
>         leaq    0(,%rdx,8), %rcx
> .L3:    movq    (%rdi,%rax), %rdx
>         movq    %rdx, (%rsi,%rax)
>         addq    $8, %rax
>         cmpq    %rax, %rcx
>         jne     .L3
> .L1:    ret

Is that actually better?  IIRC the addressing modes are both complex
and we now have an extra lea?  For this case I see we generate

  _15 = n_10(D) + 4294967295;
  _8 = (unsigned long) _15;
  _7 = _8 + 1;

where n is unsigned int, so if we know that n is not zero we can simplify the
addition; conveniently, the loop header test provides this guarantee.
IIRC there were some attempts to enhance match.pd for some
cases of such expressions.
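
To make the point about the non-zero guarantee concrete, here is a minimal C
sketch.  It assumes a 32-bit unsigned int and a 64-bit unsigned long (modelled
with uint32_t and uint64_t); the function names are invented for illustration
and are not GCC code.  The first function evaluates the three statements as
dumped, the second is the simplified form that is only valid for n != 0.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluates _15, _8 and _7 in the same order as the dump above.  */
static uint64_t
niter_plus_one (uint32_t n)
{
  uint32_t t15 = n + UINT32_C (4294967295);  /* n - 1; wraps for n == 0 */
  uint64_t t8 = (uint64_t) t15;
  return t8 + 1;
}

/* The simplified form.  It is only equivalent when something (here the
   loop header test) has already excluded n == 0.  */
static uint64_t
niter_plus_one_simplified (uint32_t n)
{
  assert (n != 0);
  return (uint64_t) n;
}
```

For n == 0 the unsimplified form yields 4294967296 rather than 0, because the
32-bit subtraction wraps before the widening; that is the case the loop header
test rules out.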

>
> This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap
> and make -k check with no new failures.  Ok for mainline?

+  /* If AFTER_ADJUST is required, the code below generates the equivalent
+   * of BASE + NITER * STEP + STEP, when ideally we'd prefer the expression
+   * BASE + (NITER + 1) * STEP, especially when NITER is often of the form
+   * SSA_NAME - 1.  Unfortunately, guaranteeing that adding 1 to NITER
+   * doesn't overflow is tricky, so we peek inside the TREE_NITER_DESC
+   * class for common idioms that we know are safe.  */

No leading '* ' on each line.

+  if (after_adjust
+  && desc->control.no_overflow
+  && integer_onep (desc->control.step)
+  && integer_onep (desc->control.base)
+  && desc->cmp == LT_EXPR
+  && TREE_CODE (desc->bound) == SSA_NAME)
+{
+  niter = desc->bound;
+  after_adjust = false;
+}

I wonder if the non-overflo

Re: Use modref kills in tree-ssa-dse

2021-11-16 Thread Richard Biener via Gcc-patches
On Tue, 16 Nov 2021, Jan Hubicka wrote:

> > 
> > Not sure, tree-ssa-dse.c doesn't seem to handle MEM_REF with offset?
> > 
> > VN has adjust_offsets_for_equal_base_address for this purpose.  I
> > agree that some common functionality like
> > 
> > bool
> > get_relative_extent_of (const ao_ref *base, const ao_ref *ref,
> > poly_int64 *offset);
> > 
> > that computes [offset, offset + ref->[max_]size] of REF adjusted as to
> > make ao_ref_base have the same address (or return false if not
> > possible).  Then [ base->offset, base->offset + base->max_size ]
> > can be compared against that.
> 
> OK, I will look into that.
> > > +  if (valid_ao_ref_for_dse (write)
> > > +  && operand_equal_p (write->base, ref->base, OEP_ADDRESS_OF)
> > > +  && known_eq (write->size, write->max_size)
> > > +  && normalize_ref (write, ref)
> > 
> > normalize_ref alters 'write', I think we should work on a local
> > copy here.  See live_bytes_read which takes a copy of 'use_ref'.
> 
> We never process the same write twice (get_ao_ref always constructs a
> fresh copy), so this should be safe.  Or shall I turn the write
> parameter into "ao_ref write" instead of "ao_ref *write" just to be sure
> we do not break this in the future?

Yes.

Thanks,
Richard.
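
The by-value variant agreed on above can be sketched as follows.  This is only
an illustration of the calling convention: struct ref_extent and
write_covers_ref are hypothetical stand-ins, not the real ao_ref or
normalize_ref from tree-ssa-dse.c, and the trimming logic is deliberately
simplified.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for ao_ref; only the fields needed here.  */
struct ref_extent
{
  long offset;
  long size;
};

/* WRITE is taken by value, so trimming it to REF's start never leaks
   back into the caller's object.  */
static bool
write_covers_ref (struct ref_extent write, const struct ref_extent *ref)
{
  assert (write.size >= 0 && ref->size >= 0);

  /* Normalize the local copy: drop the part of WRITE that precedes REF.  */
  if (write.offset < ref->offset)
    {
      long skipped = ref->offset - write.offset;
      if (skipped >= write.size)
        return false;
      write.offset += skipped;
      write.size -= skipped;
    }

  return write.offset == ref->offset && write.size >= ref->size;
}
```

A caller can pass the same object to several such checks without worrying that
an earlier call has already adjusted its offset or size.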


Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-16 Thread Adhemerval Zanella via Gcc-patches



On 03/11/2021 13:28, Florian Weimer via Gcc-patches wrote:
> This function is similar to __gnu_Unwind_Find_exidx as used on arm.
> It can be used to speed up the libgcc unwinder.

Apart from the terse patch description, the design seems OK for accomplishing
the lock-free read and update.  There are some questions and remarks below,
and I still need to review the tests.

However, the code is somewhat complex, and I would like some feedback on
whether GCC would be willing to accept this change (I assume it would require
this code to be merged into glibc beforehand).

> ---
>  NEWS  |   4 +
>  bits/dlfcn_eh_frame.h |  33 +
>  dlfcn/Makefile|   2 +-
>  dlfcn/dlfcn.h |   2 +
>  elf/Makefile  |  31 +-
>  elf/Versions  |   3 +
>  elf/dl-close.c|   4 +
>  elf/dl-find_eh_frame.c| 864 ++
>  elf/dl-find_eh_frame.h|  90 ++
>  elf/dl-find_eh_frame_slow.h   |  55 ++
>  elf/dl-libc_freeres.c |   2 +
>  elf/dl-open.c |   5 +
>  elf/rtld.c|   7 +
>  elf/tst-dl_find_eh_frame-mod1.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod2.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod3.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod4.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod5.c   |  11 +
>  elf/tst-dl_find_eh_frame-mod6.c   |  11 +
>  elf/tst-dl_find_eh_frame-mod7.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod8.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod9.c   |  10 +
>  elf/tst-dl_find_eh_frame-threads.c| 237 +
>  elf/tst-dl_find_eh_frame.c| 179 
>  include/atomic_wide_counter.h |  14 +
>  include/bits/dlfcn_eh_frame.h |   1 +
>  include/link.h|   3 +
>  manual/Makefile   |   2 +-
>  manual/dynlink.texi   |  69 ++
>  manual/libdl.texi |  10 -
>  manual/probes.texi|   2 +-
>  manual/threads.texi   |   2 +-
>  sysdeps/i386/bits/dlfcn_eh_frame.h|  34 +
>  sysdeps/mach/hurd/i386/ld.abilist |   1 +
>  sysdeps/nios2/bits/dlfcn_eh_frame.h   |  34 +
>  sysdeps/unix/sysv/linux/aarch64/ld.abilist|   1 +
>  sysdeps/unix/sysv/linux/alpha/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/arc/ld.abilist|   1 +
>  sysdeps/unix/sysv/linux/arm/be/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/arm/le/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/csky/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/hppa/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/i386/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/ia64/ld.abilist   |   1 +
>  .../unix/sysv/linux/m68k/coldfire/ld.abilist  |   1 +
>  .../unix/sysv/linux/m68k/m680x0/ld.abilist|   1 +
>  sysdeps/unix/sysv/linux/microblaze/ld.abilist |   1 +
>  .../unix/sysv/linux/mips/mips32/ld.abilist|   1 +
>  .../sysv/linux/mips/mips64/n32/ld.abilist |   1 +
>  .../sysv/linux/mips/mips64/n64/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/nios2/ld.abilist  |   1 +
>  .../sysv/linux/powerpc/powerpc32/ld.abilist   |   1 +
>  .../linux/powerpc/powerpc64/be/ld.abilist |   1 +
>  .../linux/powerpc/powerpc64/le/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/riscv/rv32/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/riscv/rv64/ld.abilist |   1 +
>  .../unix/sysv/linux/s390/s390-32/ld.abilist   |   1 +
>  .../unix/sysv/linux/s390/s390-64/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/sh/be/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/sh/le/ld.abilist  |   1 +
>  .../unix/sysv/linux/sparc/sparc32/ld.abilist  |   1 +
>  .../unix/sysv/linux/sparc/sparc64/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/x86_64/64/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist |   1 +
>  64 files changed, 1795 insertions(+), 16 deletions(-)
>  create mode 100644 bits/dlfcn_eh_frame.h
>  create mode 100644 elf/dl-find_eh_frame.c
>  create mode 100644 elf/dl-find_eh_frame.h
>  create mode 100644 elf/dl-find_eh_frame_slow.h
>  create mode 100644 elf/tst-dl_find_eh_frame-mod1.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod2.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod3.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod4.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod5.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod6.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod7.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod8.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod9.c
>  create mode 100644 elf/tst-d

[PATCH, v5, OpenMP 5.0] Improve OpenMP target support for C++ [PR92120 v5]

2021-11-16 Thread Chung-Lin Tang

Hi Jakub,

On 2021/6/24 9:15 PM, Jakub Jelinek wrote:

On Fri, Jun 18, 2021 at 10:25:16PM +0800, Chung-Lin Tang wrote:

Note, you'll need to rebase your patch, it clashes with
r12-1768-g7619d33471c10fe3d149dcbb701d99ed3dd23528.
Sorry for that.  And sorry for patch review delay.


--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -13104,6 +13104,12 @@ handle_omp_array_sections_1 (tree c, tree t, vec 
&types,
  return error_mark_node;
}
  t = TREE_OPERAND (t, 0);
+ if ((ort == C_ORT_ACC || ort == C_ORT_OMP)


Map clauses never appear on declare simd, so
(ort == C_ORT_ACC || ort == C_ORT_OMP)
previously meant always and since the in_reduction change is incorrect
(as C_ORT_OMP_TARGET is used for target construct but not for
e.g. target data* or target update).


+ && TREE_CODE (t) == MEM_REF)


Upon review, it appears that most of these C_ORT_* tests are no longer
needed; they are removed in the new patch.


So please just use if (TREE_CODE (t) == MEM_REF)
or explain when it shouldn't trigger.


@@ -14736,6 +14743,11 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
{
  while (TREE_CODE (t) == COMPONENT_REF)
t = TREE_OPERAND (t, 0);
+ if (TREE_CODE (t) == MEM_REF)
+   {
+ t = TREE_OPERAND (t, 0);
+ STRIP_NOPS (t);
+   }


This doesn't look correct.  At least the parsing (and the spec AFAIK)
doesn't ensure that if there is ->, it must come before all the dots.
So, if one uses map (s->x.y) the above would work, but if map (s->x.y->z) or
map (s.a->b->c->d->e) is used, it wouldn't.  I'd expect a single
while loop that looks through COMPONENT_REFs and MEM_REFs as they appear.
Maybe the handle_omp_array_sections_1 MEM_REF case too?

Or do you want to have it done incrementally, start with supporting only
a single -> first before all the dots and later on add support for the rest?
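
The single loop suggested above, which looks through COMPONENT_REFs and
MEM_REFs in whatever order they appear, might look roughly like this.  The
sketch uses a made-up miniature node type rather than GCC's tree, reuses the
code names purely as labels, and leaves out STRIP_NOPS and everything else the
real front end would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a tree node; not GCC's real representation.  */
enum node_code { DECL, COMPONENT_REF, MEM_REF };

struct node
{
  enum node_code code;
  struct node *operand0;  /* the object being accessed, NULL for DECL */
  const char *name;       /* field or variable name, for illustration */
};

/* Walks down through any mix of "." and "->" accesses to the base.  */
static struct node *
strip_to_base (struct node *t)
{
  assert (t != NULL);
  while (t->code == COMPONENT_REF || t->code == MEM_REF)
    t = t->operand0;
  return t;
}
```

With a node chain modelling s.a->b, the walk passes one COMPONENT_REF, one
MEM_REF and a second COMPONENT_REF before stopping at the declaration of s,
which is the interleaving a "dots first, then one arrow" check would miss.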

I think the 5.0 and especially 5.1 wording basically says that map clause
operand is arbitrary lvalue expression that includes array section support
too, so eventually we should just have somewhere in parsing scope a bool
whether OpenMP array sections are allowed or not, add OMP_ARRAY_REF or
similar tree code for those and after parsing the expression, ensure
array sections appear only where they can appear and for a subset of the
lvalue expressions where we have decl plus series of -> field or . field
or [ index ] or [ array section stuff ] handle those specially.
That arbitrary lvalue can certainly be done incrementally.
map (foo(123)->a.b[3]->c.d[:7]) and the like.


Indeed this kind of modification is sort of "as encountered", so there are
probably many cases that are not completely handled yet; it's not just
the front-end, but also changes in gimplify_scan_omp_clauses().

However, I had another patch that should've plowed a bit further on this:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570075.html
as well as those patch sets that Julian is working on.
(our current plan is to have my sets go in first, and Julian's on top,
to minimize clashing)


  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
  && OMP_CLAUSE_MAP_IMPLICIT (c)
  && (bitmap_bit_p (&map_head, DECL_UID (t))
@@ -14802,6 +14814,15 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
   bias) to zero here, so it is not set erroneously to the pointer
   size later on in gimplify.c.  */
OMP_CLAUSE_SIZE (c) = size_zero_node;
+ indir_component_ref_p = false;
+ if ((ort == C_ORT_ACC || ort == C_ORT_OMP)


Same comment about ort tests.


+ && TREE_CODE (t) == COMPONENT_REF
+ && TREE_CODE (TREE_OPERAND (t, 0)) == MEM_REF)
+   {
+ t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
+ indir_component_ref_p = true;
+ STRIP_NOPS (t);
+   }


Again, this can handle only a single ->


@@ -42330,16 +42328,10 @@ cp_parser_omp_target (cp_parser *parser, cp_token 
*pragma_tok,
cclauses[C_OMP_CLAUSE_SPLIT_TARGET] = tc;
  }
}
- tree stmt = make_node (OMP_TARGET);
- TREE_TYPE (stmt) = void_type_node;
- OMP_TARGET_CLAUSES (stmt) = cclauses[C_OMP_CLAUSE_SPLIT_TARGET];
- c_omp_adjust_map_clauses (OMP_TARGET_CLAUSES (stmt), true);
- OMP_TARGET_BODY (stmt) = body;
- OMP_TARGET_COMBINED (stmt) = 1;
- SET_EXPR_LOCATION (stmt, pragma_tok->location);
- add_stmt (stmt);
- pc = &OMP_TARGET_CLAUSES (stmt);
- goto check_clauses;
+ c_omp_adjust_map_clauses (cclauses[C_OMP_CLAUSE_SPLIT_TARGET], true);
+ finish_omp_target (pragma_tok->location,
+cclauses[C_OMP_CLAUSE_SPLIT_TARGET], body, true

Re: [PATCH, rs6000] Optimization for vec_xl_sext

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi Hao Chen,

I don't understand.  This patch was already approved and you committed it. :-)
I know because I needed to make corresponding adjustments to the new builtins
code.

Thanks,
Bill

On 11/15/21 8:16 PM, HAO CHEN GUI wrote:
> Hi,
>
>    The patch optimizes the code generation for vec_xl_sext builtin. Now all 
> the sign extensions are done on VSX registers directly.
>
>    Bootstrapped and tested on powerpc64le-linux with no regressions. Is this 
> okay for trunk? Any recommendations? Thanks a lot.
>
> ChangeLog
>
> 2021-11-16 Haochen Gui 
>
> gcc/
>     * config/rs6000/rs6000-call.c (altivec_expand_lxvr_builtin): Modify
>     the expansion for sign extension. All extensions are done on VSX
>     registers.
>
> gcc/testsuite/
>     * gcc.target/powerpc/p10_vec_xl_sext.c: New test.
>
> patch.diff
>
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index b4e13af4dc6..587e9fa2a2a 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree 
> exp, rtx target, bool bl
>
>    if (sign_extend)
>  {
> -  rtx discratch = gen_reg_rtx (DImode);
> +  rtx discratch = gen_reg_rtx (V2DImode);
>    rtx tiscratch = gen_reg_rtx (TImode);
>
>    /* Emit the lxvr*x insn.  */
> @@ -9788,20 +9788,31 @@ altivec_expand_lxvr_builtin (enum insn_code icode, 
> tree exp, rtx target, bool bl
>     return 0;
>    emit_insn (pat);
>
> -  /* Emit a sign extension from QI,HI,WI to double (DI).  */
> -  rtx scratch = gen_lowpart (smode, tiscratch);
> +  /* Emit a sign extension from V16QI,V8HI,V4SI to V2DI.  */
> +  rtx temp1, temp2;
>    if (icode == CODE_FOR_vsx_lxvrbx)
> -   emit_insn (gen_extendqidi2 (discratch, scratch));
> +   {
> + temp1  = simplify_gen_subreg (V16QImode, tiscratch, TImode, 0);
> + emit_insn (gen_vsx_sign_extend_qi_v2di (discratch, temp1));
> +   }
>    else if (icode == CODE_FOR_vsx_lxvrhx)
> -   emit_insn (gen_extendhidi2 (discratch, scratch));
> +   {
> + temp1  = simplify_gen_subreg (V8HImode, tiscratch, TImode, 0);
> + emit_insn (gen_vsx_sign_extend_hi_v2di (discratch, temp1));
> +   }
>    else if (icode == CODE_FOR_vsx_lxvrwx)
> -   emit_insn (gen_extendsidi2 (discratch, scratch));
> -  /*  Assign discratch directly if scratch is already DI.  */
> -  if (icode == CODE_FOR_vsx_lxvrdx)
> -   discratch = scratch;
> +   {
> + temp1  = simplify_gen_subreg (V4SImode, tiscratch, TImode, 0);
> + emit_insn (gen_vsx_sign_extend_si_v2di (discratch, temp1));
> +   }
> +  else if (icode == CODE_FOR_vsx_lxvrdx)
> +   discratch = simplify_gen_subreg (V2DImode, tiscratch, TImode, 0);
> +  else
> +   gcc_unreachable ();
>
> -  /* Emit the sign extension from DI (double) to TI (quad).  */
> -  emit_insn (gen_extendditi2 (target, discratch));
> +  /* Emit the sign extension from V2DI (double) to TI (quad).  */
> +  temp2 = simplify_gen_subreg (TImode, discratch, V2DImode, 0);
> +  emit_insn (gen_extendditi2_vector (target, temp2));
>
>    return target;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c 
> b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
> new file mode 100644
> index 000..78e72ac5425
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +#include 
> +
> +vector signed __int128
> +foo1 (signed long a, signed char *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +vector signed __int128
> +foo2 (signed long a, signed short *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +vector signed __int128
> +foo3 (signed long a, signed int *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +vector signed __int128
> +foo4 (signed long a, signed long *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */
>


Re: [PATCH v4] Fix ICE when mixing VLAs and statement expressions [PR91038]

2021-11-16 Thread Uecker, Martin
Am Montag, den 08.11.2021, 19:13 +0100 schrieb Martin Uecker:
> Am Montag, den 08.11.2021, 12:13 -0500 schrieb Jason Merrill:
> > On 11/7/21 01:40, Uecker, Martin wrote:
> > > Am Mittwoch, den 03.11.2021, 10:18 -0400 schrieb Jason Merrill:
> 
> ...
> 
> > > Thank you! I made these changes and ran
> > > bootstrap and tests again.
> > 
> > Hmm, it doesn't look like you made the change to use the save_expr 
> > function instead of build1?
> 
> Oh, sorry. I wanted to change it and then forgot.
> Now also with this change (changelog as before).


Ok with this change?

Best,
Martin



> > > Ok for trunk?
> > > 
> > > 
> > > Any idea how to fix returning structs with
> > > VLA member from statement expressions?
> > 
> > Testcase?
> 
> void foo(void)
> {
>   ({ int N = 3; struct { char x[N]; } x; x; });
> }
> 
> The difference from the tests in this patch (which I
> also forgot to include in the last version) is that
> the object of variable size is returned from the
> statement expression and not a pointer to it.
> This cannot happen with arrays because they decay
> to pointers.
> 
> 
> Martin
> 
> 
> > > Otherwise, I will add an error message to
> > > the FE in another patch.
> > > 
> > > Martin
> > > 
> 
> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index 436df45df68..95083f95442 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -3306,7 +3306,19 @@ pointer_int_sum (location_t loc, enum tree_code 
> resultcode,
>TREE_TYPE (result_type)))
>  size_exp = integer_one_node;
>else
> -size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
> +{
> +  size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
> +  /* Wrap the pointer expression in a SAVE_EXPR to make sure it
> +  is evaluated first when the size expression may depend
> +  on it for VM types.  */
> +  if (TREE_SIDE_EFFECTS (size_exp)
> +   && TREE_SIDE_EFFECTS (ptrop)
> +   && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
> + {
> +   ptrop = save_expr (ptrop);
> +   size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
> + }
> +}
>  
>/* We are manipulating pointer values, so we don't need to warn
>   about relying on undefined signed overflow.  We disable the
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index c2ab96e7e18..84f7dc3c248 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -2964,7 +2964,9 @@ gimplify_var_or_parm_decl (tree *expr_p)
>   declaration, for which we've already issued an error.  It would
>   be really nice if the front end wouldn't leak these at all.
>   Currently the only known culprit is C++ destructors, as seen
> - in g++.old-deja/g++.jason/binding.C.  */
> + in g++.old-deja/g++.jason/binding.C.
> + Another possible culpit are size expressions for variably modified
> + types which are lost in the FE or not gimplified correctly.  */
>if (VAR_P (decl)
>&& !DECL_SEEN_IN_BIND_EXPR_P (decl)
>&& !TREE_STATIC (decl) && !DECL_EXTERNAL (decl)
> @@ -3109,16 +3111,22 @@ gimplify_compound_lval (tree *expr_p, gimple_seq 
> *pre_p, gimple_seq
> *post_p,
>   expression until we deal with any variable bounds, sizes, or
>   positions in order to deal with PLACEHOLDER_EXPRs.
>  
> - So we do this in three steps.  First we deal with the annotations
> - for any variables in the components, then we gimplify the base,
> - then we gimplify any indices, from left to right.  */
> + The base expression may contain a statement expression that
> + has declarations used in size expressions, so has to be
> + gimplified before gimplifying the size expressions.
> +
> + So we do this in three steps.  First we deal with variable
> + bounds, sizes, and positions, then we gimplify the base,
> + then we deal with the annotations for any variables in the
> + components and any indices, from left to right.  */
> +
>for (i = expr_stack.length () - 1; i >= 0; i--)
>  {
>tree t = expr_stack[i];
>  
>if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)
>   {
> -   /* Gimplify the low bound and element type size and put them into
> +   /* Deal with the low bound and element type size and put them into
>the ARRAY_REF.  If these values are set, they have already been
>gimplified.  */
> if (TREE_OPERAND (t, 2) == NULL_TREE)
> @@ -3127,18 +3135,8 @@ gimplify_compound_lval (tree *expr_p, gimple_seq 
> *pre_p, gimple_seq
> *post_p,
> if (!is_gimple_min_invariant (low))
>   {
> TREE_OPERAND (t, 2) = low;
> -   tret = gimplify_expr (&TREE_OPERAND (t, 2), pre_p,
> - post_p, is_gimple_reg,
> - fb_rvalue);
> -   ret = MIN (ret, tret);
>   }
>   }
> -   else

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-16 Thread Martin Liška

On 11/11/21 08:15, Richard Biener wrote:

If you look at simplify_using_entry_checks then this is really really simple,
so I'd try to abstract this, recording sth like a unswitch_predicate where
we store the condition we unswitch on plus maybe cache the constant
range of a VAR cmp CST variable condition on the true/false edge.  We
can then try to simplify each gcond/gswitch based on such an unswitch_predicate
(when we ever scan the loop once to discover all opportunities we'd have a
set of unswitch_predicates to try simplifying against).  As said the integer
range thing would be an improvement over the current state so even that
can be done as followup but I guess for gswitch support that's going to be
the thing to use.


I started working on the unswitch_predicate, where I also record the
true/false-edge irange of the expression we unswitch on.

I noticed one significant problem, let's consider:

  for (int i = 0; i < size; i++)
  {
    double tmp;

    if (order == 1)
      tmp = -8 * a[i];
    else
      {
        if (order == 2)
          tmp = -4 * b[i];
        else
          tmp = a[i];
      }

    r[i] = 3.4f * tmp + d[i];
  }

We can end up with the first unswitching candidate being 'if (order == 2)'
(I have a real benchmark where this happens).
So I collect ranges, and they are [2,2] for the true edge and [-INF, 0],
[3, INF] for the false edge (because we reached the condition through the
order != 1 cond).
Then the loop is cloned and we have

if (order == 2)
   loop_version_1
else
   loop_version_2

but in loop_version_2 we wrongly fold 'if (order == 1)' to false because it's
reflected in the range.

So the question is, can one iterate get_loop_body stmts in some dominator order?
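
The problem can be reproduced by hand.  The following is a sketch in plain C,
not compiler output: kernel_original is the loop from above wrapped in a
function, kernel_unswitched is what a correct unswitch on 'order == 2' has to
look like, and check_order is a small helper added here for comparison.  All
three names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The loop from the message, with both conditions tested per iteration.  */
static void
kernel_original (double *r, const double *a, const double *b,
                 const double *d, size_t size, int order)
{
  for (size_t i = 0; i < size; i++)
    {
      double tmp;
      if (order == 1)
        tmp = -8 * a[i];
      else if (order == 2)
        tmp = -4 * b[i];
      else
        tmp = a[i];
      r[i] = 3.4f * tmp + d[i];
    }
}

/* Hand-unswitched on 'order == 2' first.  */
static void
kernel_unswitched (double *r, const double *a, const double *b,
                   const double *d, size_t size, int order)
{
  if (order == 2)
    {
      /* order == 2 implies order != 1, so both tests fold away.  */
      for (size_t i = 0; i < size; i++)
        r[i] = 3.4f * (-4 * b[i]) + d[i];
    }
  else
    {
      /* Only order != 2 is known here.  order == 1 is still possible,
         so this test must not be folded to false.  */
      for (size_t i = 0; i < size; i++)
        {
          double tmp = (order == 1) ? -8 * a[i] : a[i];
          r[i] = 3.4f * tmp + d[i];
        }
    }
}

/* Checks that both versions agree for one value of ORDER.  */
static void
check_order (int order)
{
  const double a[3] = { 1, 2, 3 }, b[3] = { 4, 5, 6 }, d[3] = { 1, 1, 1 };
  double r1[3], r2[3];

  kernel_original (r1, a, b, d, 3, order);
  kernel_unswitched (r2, a, b, d, 3, order);
  for (size_t i = 0; i < 3; i++)
    assert (r1[i] == r2[i]);
}
```

Calling check_order with 1, 2 and 3 covers all three paths.  If the else
branch used the range [-INF, 0], [3, INF] and dropped the 'order == 1' test,
the order == 1 case would compute a[i] instead of -8 * a[i].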

Thanks,
Martin




[committed] libstdc++: Fix typos in tests

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.


libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/allocator/71964.cc: Fix
typo.
* testsuite/23_containers/set/allocator/71964.cc: Likewise.
---
 .../testsuite/21_strings/basic_string/allocator/71964.cc| 2 +-
 libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc
index c57cb96e971..4196b331aca 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc
@@ -40,7 +40,7 @@ template
   a.moved_from = true;
 }
 
-T* allocate(unsigned n) { return std::allocator{}.allcoate(n); }
+T* allocate(unsigned n) { return std::allocator{}.allocate(n); }
 void deallocate(T* p, unsigned n) { std::allocator{}.deallocate(p, n); }
 
 bool moved_to;
diff --git a/libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc 
b/libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc
index 34a02d85e66..a2c166afd0f 100644
--- a/libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc
+++ b/libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc
@@ -40,7 +40,7 @@ template
   a.moved_from = true;
 }
 
-T* allocate(unsigned n) { return std::allocator{}.allcoate(n); }
+T* allocate(unsigned n) { return std::allocator{}.allocate(n); }
 void deallocate(T* p, unsigned n) { std::allocator{}.deallocate(p, n); }
 
 bool moved_to;
-- 
2.31.1



[committed] libstdc++: Fix out-of-bound array accesses in testsuite

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.


I fixed some undefined behaviour in string tests in r238609, but I only
fixed the narrow char versions. This applies the same fixes to the
wchar_t ones. These problems were found when testing a patch to make
std::basic_string usable in constexpr.
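
For readers unfamiliar with the problem: wmemcmp reads exactly n wide
characters from both arguments, so n must not exceed the shorter array.  A
minimal C sketch of the bounded form follows; bounded_wmemcmp is a
hypothetical helper for illustration, not something the testsuite defines.

```c
#include <assert.h>
#include <wchar.h>

/* Compares only the common prefix, so neither array is read past its
   end.  LEN1 and LEN2 are the string lengths without the terminator.  */
static int
bounded_wmemcmp (const wchar_t *s1, size_t len1,
                 const wchar_t *s2, size_t len2)
{
  assert (s1 != NULL && s2 != NULL);
  size_t n = len1 < len2 ? len1 : len2;
  return wmemcmp (s1, s2, n);
}
```

L"costa rica" has 10 characters plus the terminator, so a count of 14 reads
beyond the array, while a count of 10 still reaches the first difference
('m' versus 'r' at index 6) and gives the same ordering.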

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc:
Fix reads past the end of strings.
* testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc:
Likewise.
* testsuite/experimental/string_view/operations/compare/wchar_t/1.cc:
Likewise.
---
 .../21_strings/basic_string/modifiers/append/wchar_t/1.cc | 2 +-
 .../21_strings/basic_string/operations/compare/wchar_t/1.cc   | 4 ++--
 .../experimental/string_view/operations/compare/wchar_t/1.cc  | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc
index bb2d682de8e..684209f143e 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc
@@ -117,7 +117,7 @@ void test01(void)
   VERIFY( str06 == L"corpus, corpus" );
 
   str06 = str02;
-  str06.append(L"corpus, ", 12);
+  str06.append(L"corpus, ", 9); // n=9 includes null terminator
   VERIFY( str06 != L"corpus, corpus, " );
 
 
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc
 
b/libstdc++-v3/testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc
index 27836f8e6fb..6f2113fb16a 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc
@@ -81,8 +81,8 @@ test01()
   test_value(wcsncmp(str_1.data(), str_0.data(), 6), z);
   test_value(wcsncmp(str_1.data(), str_0.data(), 14), lt);
   test_value(wmemcmp(str_1.data(), str_0.data(), 6), z);
-  test_value(wmemcmp(str_1.data(), str_0.data(), 14), lt);
-  test_value(wmemcmp(L"costa marbella", L"costa rica", 14), lt);
+  test_value(wmemcmp(str_1.data(), str_0.data(), 10), lt);
+  test_value(wmemcmp(L"costa marbella", L"costa rica", 10), lt);
 
   // int compare(const basic_string& str) const;
   test_value(str_0.compare(str_1), gt); //because r>m
diff --git 
a/libstdc++-v3/testsuite/experimental/string_view/operations/compare/wchar_t/1.cc
 
b/libstdc++-v3/testsuite/experimental/string_view/operations/compare/wchar_t/1.cc
index db523e6a83c..20bb030970b 100644
--- 
a/libstdc++-v3/testsuite/experimental/string_view/operations/compare/wchar_t/1.cc
+++ 
b/libstdc++-v3/testsuite/experimental/string_view/operations/compare/wchar_t/1.cc
@@ -81,8 +81,8 @@ test01()
   test_value(wcsncmp(str_1.data(), str_0.data(), 6), z);
   test_value(wcsncmp(str_1.data(), str_0.data(), 14), lt);
   test_value(wmemcmp(str_1.data(), str_0.data(), 6), z);
-  test_value(wmemcmp(str_1.data(), str_0.data(), 14), lt);
-  test_value(wmemcmp(L"costa marbella", L"costa rica", 14), lt);
+  test_value(wmemcmp(str_1.data(), str_0.data(), 10), lt);
+  test_value(wmemcmp(L"costa marbella", L"costa rica", 10), lt);
 
   // int compare(const basic_string_view& str) const;
   test_value(str_0.compare(str_1), gt); //because r>m
-- 
2.31.1



Re: [PATCH v2] configure: define TARGET_LIBC_GNUSTACK on musl

2021-11-16 Thread Dragan Mladjenovic

Hi,

Looks fine to me. If possible, maybe it should even be back-ported to
stable branches.

Not sure if MIPS assembly sources (if any) in musl would need an explicit
.note.GNU-stack to complement this?

Best regards,

Dragan

On 16-Nov-21 06:13, Ilya Lipnitskiy wrote:

musl only uses PT_GNU_STACK to set default thread stack size and has no
executable stack support[0], so there is no reason not to emit the
.note.GNU-stack section on musl builds.

[0]: https://lore.kernel.org/all/20190423192534.gn23...@brightrain.aerifal.cx/T/#u

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: define TARGET_LIBC_GNUSTACK on musl

Signed-off-by: Ilya Lipnitskiy 
---
  gcc/configure| 3 +++
  gcc/configure.ac | 3 +++
  2 files changed, 6 insertions(+)

diff --git a/gcc/configure b/gcc/configure
index 74b9d9be4c85..7091a838aefa 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -31275,6 +31275,9 @@ fi
  # Check if the target LIBC handles PT_GNU_STACK.
  gcc_cv_libc_gnustack=unknown
  case "$target" in
+  mips*-*-linux-musl*)
+gcc_cv_libc_gnustack=yes
+;;
mips*-*-linux*)
  
  if test $glibc_version_major -gt 2 \

diff --git a/gcc/configure.ac b/gcc/configure.ac
index c9ee1fb8919e..8a2d34179a75 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -6961,6 +6961,9 @@ fi
  # Check if the target LIBC handles PT_GNU_STACK.
  gcc_cv_libc_gnustack=unknown
  case "$target" in
+  mips*-*-linux-musl*)
+gcc_cv_libc_gnustack=yes
+;;
mips*-*-linux*)
  GCC_GLIBC_VERSION_GTE_IFELSE([2], [31], [gcc_cv_libc_gnustack=yes], )
  ;;


Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Koning, Paul via Gcc-patches



> On Nov 16, 2021, at 2:03 AM, Aldy Hernandez via Gcc-patches 
>  wrote:
> 
> On Tue, Nov 16, 2021, 03:20 Marek Polacek via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> 
>> On Tue, Nov 16, 2021 at 02:01:47AM +, Koning, Paul via Gcc-patches
>> wrote:
>>> 
>>> 
>>>> On Nov 15, 2021, at 8:48 PM, Marek Polacek via Gcc-patches <
>> gcc-patches@gcc.gnu.org> wrote:
>>>> 
>>>> Nitpicking time.  It's spelled "ones' complement" rather than "one's
>>>> complement".
>>> 
>>> Is that so?  I see Wikipedia claims it is, but there are no sources for
>> that claim.  (There is an assertion that it is "discussed at length on the
>> talk page" of an article about number representation, but in fact there is
>> no discussion there at all.)
>>> 
>>> I have never seen this spelling before, and I very much doubt its
>> validity.  For one thing, why then have "two's complement"?  For another,
>> to pick one random authority, J.E. Thornton in "Design of a computer -- the
>> Control Data 6600" refers to "one's complement" to describe the well known
>> mode used by that machine and its relatives.
>> 
>> Knuth, The Art of Computer Programming Volume 2, page 203-4:
>> 
>> "A two's complement number is complemented with respect to a single
>> power of 2, while a ones' complement number is complemented with respect
>> to a long sequence of 1s."
>> 
> 
> I think you get to do a drop mike when you pull out Knuth.
> 
> :-)

If that were the only source, sure.  But with authoritative sources for both 
terms (with the ones I quoted being the earlier ones) at the very least there 
is an argument that both terms are used.  

Some more: DEC PDP-1 handbook (April 1960), page 9: "Negative numbers are 
represented as the 1's complement of the positive numbers."

Univac 1107 CPU manual, page 2-6: "Next, the adder subtracts the one's 
complement..."

CDC 160 programming manual (1963), page 2-1: "All arithmetic is binary, one's 
complement notation".

Incidentally, these are four of the five machines cited by the Wikipedia 
article.
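
To make the arithmetic in the Knuth quote concrete, here is a minimal sketch for an 8-bit word.  The function names are made up for illustration and the word size is an assumption; nothing here comes from any of the manuals cited above.

```cpp
#include <cstdint>

// Ones' complement negation of an 8-bit word: subtract from a word made of
// all 1s (0xFF), which is the same as flipping every bit.
constexpr std::uint8_t ones_complement_neg(std::uint8_t x) {
    return static_cast<std::uint8_t>(0xFFu - x);
}

// Two's complement negation: subtract from a single power of two (2^8),
// with the result reduced modulo 2^8 by the narrowing cast.
constexpr std::uint8_t twos_complement_neg(std::uint8_t x) {
    return static_cast<std::uint8_t>(0x100u - x);
}

static_assert(ones_complement_neg(5) == 0xFA, "all-ones minus 5");
static_assert(twos_complement_neg(5) == 0xFB, "2^8 minus 5");
static_assert(ones_complement_neg(0) == 0xFF, "ones' complement has a negative zero");
static_assert(twos_complement_neg(0) == 0x00, "two's complement has a single zero");
```

The only difference between the two functions is the constant being subtracted from, which is the distinction the quote draws.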

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-16 Thread Martin Liška

On 11/11/21 08:15, Richard Biener wrote:

So I'd try to do no functional change first, improving the costing and
setting up the transform to simply pick up the stmts to "fold" as discovered
during analysis (as I hinted you possibly can use gimple_uid to mark
the stmts that simplify, IIRC gimple_uid is preserved during copying.
gimple_uid would also scale better than gimple_plf in case we do
the analysis for all candidates at once).


Thinking about the analysis: am I correct that we want to properly calculate
the loop size for the true and false edges of a potential gcond before
actually unswitching?

We can do that by finding a first gcond candidate, evaluating (symbolic +
irange approach) all other gconds in the loop body, and using BB_REACHABLE
discovery, similarly to what we do now at lines 378-446.  Then
tree_num_loop_insns can be adjusted to count only these reachable blocks.
Having that, we can calculate # of insns that will live in true/false loops.
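
A rough sketch of that size estimate, on a toy CFG rather than GCC's data structures.  The `toy_block` type, its fields and `reachable_insns` are all hypothetical; the real code would work on basic_blocks and fold conditions with the ranger.

```cpp
#include <vector>

// Hypothetical toy CFG, not GCC's basic_block.  Each block has an insn
// count and up to two successors.  A block whose condition folds to a known
// value under the candidate predicate keeps only one successor.
struct toy_block {
    int num_insns;
    int succ_true;   // index of the successor on the true edge, or -1
    int succ_false;  // index of the successor on the false edge, or -1
    int folded;      // 1: always true, 0: always false, -1: unknown
};

// Sum the insns of all blocks reachable from ENTRY once folded conditions
// are resolved.  This stands in for a BB_REACHABLE-style walk.
int reachable_insns(const std::vector<toy_block> &cfg, int entry) {
    std::vector<bool> seen(cfg.size(), false);
    std::vector<int> worklist{entry};
    int total = 0;
    while (!worklist.empty()) {
        int i = worklist.back();
        worklist.pop_back();
        if (i < 0 || seen[i])
            continue;
        seen[i] = true;
        total += cfg[i].num_insns;
        if (cfg[i].folded != 0)   // true edge still possible
            worklist.push_back(cfg[i].succ_true);
        if (cfg[i].folded != 1)   // false edge still possible
            worklist.push_back(cfg[i].succ_false);
    }
    return total;
}
```

Running the walk once with the candidate folded to true and once folded to false gives the two sizes to feed into the costing.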

Then we can call tree_unswitch_loop and do the gcond folding as we do in the
versioned loops.

Is it a step in the right direction?  Having that, we can then extend it to
gswitch statements.

Cheers,
Martin


Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Aldy Hernandez via Gcc-patches
On Tue, Nov 16, 2021 at 3:40 PM Koning, Paul  wrote:
>
>
>
> > On Nov 16, 2021, at 2:03 AM, Aldy Hernandez via Gcc-patches 
> >  wrote:
> >
> > On Tue, Nov 16, 2021, 03:20 Marek Polacek via Gcc-patches <
> > gcc-patches@gcc.gnu.org> wrote:
> >
> >> On Tue, Nov 16, 2021 at 02:01:47AM +, Koning, Paul via Gcc-patches
> >> wrote:
> >>>
> >>>
> >>>> On Nov 15, 2021, at 8:48 PM, Marek Polacek via Gcc-patches <
> >> gcc-patches@gcc.gnu.org> wrote:
> >>>>
> >>>> Nitpicking time.  It's spelled "ones' complement" rather than "one's
> >>>> complement".
> >>>
> >>> Is that so?  I see Wikipedia claims it is, but there are no sources for
> >> that claim.  (There is an assertion that it is "discussed at length on the
> >> talk page" of an article about number representation, but in fact there is
> >> no discussion there at all.)
> >>>
> >>> I have never seen this spelling before, and I very much doubt its
> >> validity.  For one thing, why then have "two's complement"?  For another,
> >> to pick one random authority, J.E. Thornton in "Design of a computer -- the
> >> Control Data 6600" refers to "one's complement" to describe the well known
> >> mode used by that machine and its relatives.
> >>
> >> Knuth, The Art of Computer Programming Volume 2, page 203-4:
> >>
> >> "A two's complement number is complemented with respect to a single
> >> power of 2, while a ones' complement number is complemented with respect
> >> to a long sequence of 1s."
> >>
> >
> > I think you get to do a drop mike when you pull out Knuth.
> >
> > :-)
>
> If that were the only source, sure.  But with authoritative sources for both 
> terms (with the ones I quoted being the earlier ones) at the very least there 
> is an argument that both terms are used.
>
> Some more: DEC PDP-1 handbook (April 1960), page 9: "Negative numbers are 
> represented as the 1's complement of the positive numbers."
>
> Univac 1107 CPU manual, page 2-6: "Next, the adder subtracts the one's 
> complement..."
>
> CDC 160 programming manual (1963), page 2-1: "All arithmetic is binary, one's 
> complement notation".
>
> Incidentally, these are four of the five machines cited by the Wikipedia 
> article.

All sources before Knuth are clearly wrong.  How could they not?
Folks living in the pre-Knuth era lived without a deity.

:-P



[PATCH v2] c++: improve print_node of PTRMEM_CST

2021-11-16 Thread Jason Merrill via Gcc-patches

On 11/4/21 16:32, Jakub Jelinek wrote:

On Thu, Nov 04, 2021 at 11:52:34AM -0400, Jason Merrill via Gcc-patches wrote:

It's been inconvenient that pretty-printing of PTRMEM_CST didn't display
what member the constant refers to.

Adding that is complicated by the absence of a langhook for CONSTANT_CLASS_P
nodes; the simplest fix for that is to use the tcc_exceptional hook for
tcc_constant as well.

Tested x86_64-pc-linux-gnu.  OK for trunk, or should I add a new hook for
constants?

gcc/cp/ChangeLog:

* ptree.c (cxx_print_xnode): Handle PTRMEM_CST.

gcc/ChangeLog:

* print-tree.c (print_node): Also call print_xnode hook for
tcc_constant class.


I think using the same langhook is fine, but in that case certainly
   /* Called by print_tree when there is a tree of class tcc_exceptional
  that it doesn't know how to display.  */
should be adjusted so that it mentions also tcc_constant.


Done.


And maybe rename it from print_xnode to print_node?


I think changing the comment is enough, it's still just exceptional and 
constant.


This is what I'm pushing:

From 761b128dbfa2fbc1f1a0138160a39db95db7759a Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Fri, 29 Oct 2021 16:39:01 -0400
Subject: [PATCH] c++: improve print_node of PTRMEM_CST
To: gcc-patches@gcc.gnu.org

It's been inconvenient that pretty-printing of PTRMEM_CST didn't display
what member the constant refers to.

Adding that is complicated by the absence of a langhook for CONSTANT_CLASS_P
nodes; the simplest fix for that is to use the tcc_exceptional hook for
tcc_constant as well.

gcc/cp/ChangeLog:

	* ptree.c (cxx_print_xnode): Handle PTRMEM_CST.

gcc/ChangeLog:

	* langhooks.h (struct lang_hooks): Adjust comment.
	* print-tree.c (print_node): Also call print_xnode hook for
	tcc_constant class.
---
 gcc/langhooks.h  | 2 +-
 gcc/cp/ptree.c   | 3 +++
 gcc/print-tree.c | 3 +--
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 3e89134e8b4..3db8f2a550d 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -477,7 +477,7 @@ struct lang_hooks
   void (*print_statistics) (void);
 
   /* Called by print_tree when there is a tree of class tcc_exceptional
- that it doesn't know how to display.  */
+ or tcc_constant that it doesn't know how to display.  */
   lang_print_tree_hook print_xnode;
 
   /* Called to print language-dependent parts of tcc_decl, tcc_type,
diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index ca7884db39b..d514aa2cad2 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -379,6 +379,9 @@ cxx_print_xnode (FILE *file, tree node, int indent)
   if (tree message = STATIC_ASSERT_MESSAGE (node))
 	print_node (file, "message", message, indent+4);
   break;
+case PTRMEM_CST:
+  print_node (file, "member", PTRMEM_CST_MEMBER (node), indent+4);
+  break;
 default:
   break;
 }
diff --git a/gcc/print-tree.c b/gcc/print-tree.c
index d1fbd044c27..b5dc523fcb1 100644
--- a/gcc/print-tree.c
+++ b/gcc/print-tree.c
@@ -1004,8 +1004,7 @@ print_node (FILE *file, const char *prefix, tree node, int indent,
 	  break;
 
 	default:
-	  if (EXCEPTIONAL_CLASS_P (node))
-	lang_hooks.print_xnode (file, node, indent);
+	  lang_hooks.print_xnode (file, node, indent);
 	  break;
 	}
 
-- 
2.27.0



[committed] analyzer: fix overeager sharing of bounded_range instances [PR102662]

2021-11-16 Thread David Malcolm via Gcc-patches
This was leading to an assertion failure ICE on a switch stmt when using
-fstrict-enums, due to erroneously reusing a range involving one enum
with a range involving a different enum.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5307-ge1c0c908f85816240b685a5be4f0e5a0e6634979.

gcc/analyzer/ChangeLog:
PR analyzer/102662
* constraint-manager.cc (bounded_range::operator==): Require the
types to be the same for equality.

gcc/testsuite/ChangeLog:
PR analyzer/102662
* g++.dg/analyzer/pr102662.C: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/constraint-manager.cc   |  4 ++-
 gcc/testsuite/g++.dg/analyzer/pr102662.C | 39 
 2 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/analyzer/pr102662.C

diff --git a/gcc/analyzer/constraint-manager.cc b/gcc/analyzer/constraint-manager.cc
index 6df23fb477e..ea6b5dc60e0 100644
--- a/gcc/analyzer/constraint-manager.cc
+++ b/gcc/analyzer/constraint-manager.cc
@@ -432,7 +432,9 @@ bounded_range::intersects_p (const bounded_range &other,
 bool
 bounded_range::operator== (const bounded_range &other) const
 {
-  return (tree_int_cst_equal (m_lower, other.m_lower)
+  return (TREE_TYPE (m_lower) == TREE_TYPE (other.m_lower)
+ && TREE_TYPE (m_upper) == TREE_TYPE (other.m_upper)
+ && tree_int_cst_equal (m_lower, other.m_lower)
  && tree_int_cst_equal (m_upper, other.m_upper));
 }
 
diff --git a/gcc/testsuite/g++.dg/analyzer/pr102662.C b/gcc/testsuite/g++.dg/analyzer/pr102662.C
new file mode 100644
index 000..99252c7d109
--- /dev/null
+++ b/gcc/testsuite/g++.dg/analyzer/pr102662.C
@@ -0,0 +1,39 @@
+/* { dg-additional-options "-fstrict-enums" } */
+
+enum OpCode {
+  OP_MOVE,
+  OP_LOADK,
+  OP_LOADBOOL,
+  OP_LOADNIL,
+  OP_GETUPVAL,
+  OP_SETUPVAL
+};
+
+enum OpArg {
+  OpArgN,
+  OpArgU,
+  OpArgR,
+  OpArgK
+};
+
+void
+symbexec_lastpc (enum OpCode symbexec_lastpc_op, enum OpArg luaP_opmodes)
+{
+  switch (luaP_opmodes)
+{
+case OpArgN:
+case OpArgK:
+  {
+switch (symbexec_lastpc_op)
+  {
+  case OP_LOADNIL:
+  case OP_SETUPVAL:
+break;
+  default:
+break;
+  }
+  }
+default:
+  break;
+}
+}
-- 
2.26.3
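
For readers skimming the fix: a simplified model of why value-only equality is not enough.  Everything below (`toy_type`, `toy_cst`, `toy_range`) is hypothetical and only mirrors the shape of the change; it is not the analyzer's code.

```cpp
#include <cstdint>

// Hypothetical, simplified model: a constant carries a type tag and a
// value.  Constants of two different enums can have the same numeric value
// while still being of different types.
enum class toy_type { op_code, op_arg };

struct toy_cst {
    toy_type type;
    std::int64_t value;
};

struct toy_range {
    toy_cst lower;
    toy_cst upper;

    // Value-only comparison: treats [3, 5] of one enum as equal to [3, 5]
    // of another, which is the kind of conflation the fix rules out.
    bool same_values(const toy_range &other) const {
        return lower.value == other.lower.value
               && upper.value == other.upper.value;
    }

    // Equality that also requires matching types on both bounds.
    bool operator==(const toy_range &other) const {
        return lower.type == other.lower.type
               && upper.type == other.upper.type
               && same_values(other);
    }
};
```

With the stricter operator==, a cache keyed on ranges can no longer hand back a range built for one enum when asked about another.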



Re: [PATCH 1/5] libstdc++: Import the fast_float library

2021-11-16 Thread Patrick Palka via Gcc-patches
On Tue, 16 Nov 2021, Florian Weimer wrote:

> * Patrick Palka via Libstdc:
> 
> > This copies the fast_float library[1] into the compiled-in library
> > sources.  We're going to use this library in our floating-point
> > std::from_chars implementation for faster and more portable parsing of
> > binary32/64 decimal strings.
> >
> > [1]: https://github.com/fastfloat/fast_float
> >
> > Series tested on x86_64, i686, ppc64, ppc64le and aarch64, does it
> > look OK for trunk?
> 
> Missing Signed-off-by:?

Oops, fixed in the below patch.

> 
> > diff --git a/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE 
> > b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
> > new file mode 100644
> > index 000..26f4398f249
> > --- /dev/null
> > +++ b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
> > @@ -0,0 +1,190 @@
> > + Apache License
> > +   Version 2.0, January 2004
> > +http://www.apache.org/licenses/
> 
> > diff --git a/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT 
> > b/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT
> > new file mode 100644
> > index 000..2fb2a37ad7f
> > --- /dev/null
> > +++ b/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT
> > @@ -0,0 +1,27 @@
> > +MIT License
> > +
> > +Copyright (c) 2021 The fast_float authors
> 
> You also need to include the README file, which makes it clear that
> recipients can choose between Apache and MIT.  GCC needs to use the MIT
> option, I think.

Also fixed.

I noticed that the source repository contains the script
./script/amalgamate.py that generates a single-file version of the
library for us, complete with an embedded copyright/license banner.
This seems like a simpler way of integrating the library, so the below
patch uses the amalgamation instead.
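
For context, this is the user-facing interface the imported library is meant to sit behind.  A minimal sketch, assuming a standard library that already provides the floating-point std::from_chars overloads; `parse_double` is a made-up helper name.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Parse a decimal string as a double with std::from_chars (C++17).
// Returns true and stores the value only if the whole input was consumed;
// on failure OUT is left unmodified.
bool parse_double(std::string_view text, double &out) {
    const char *first = text.data();
    const char *last = first + text.size();
    std::from_chars_result res = std::from_chars(first, last, out);
    return res.ec == std::errc() && res.ptr == last;
}
```

Nothing in the call site changes with this series; only the implementation behind std::from_chars does.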

-- >8 --

Subject: [PATCH 1/5] libstdc++: Import the fast_float library

We're going to use the fast_float library[1] in our (compiled-in)
floating-point std::from_chars implementation for faster and more
portable parsing of binary32/64 decimal strings.

The single file fast_float.h is an amalgamation of the entire library,
which can be (re)generated with the command

  python3 ./script/amalgamate.py --license=MIT \
> $GCC_SRC/libstdc++-v3/src/c++17/fast_float/fast_float.h

[1]: https://github.com/fastfloat/fast_float

libstdc++-v3/ChangeLog:

* src/c++17/fast_float/LOCAL_PATCHES: New file.
* src/c++17/fast_float/MERGE: New file.
* src/c++17/fast_float/README.md: New file, copied from the
fast_float library sources.
* src/c++17/fast_float/fast_float.h: New file, an amalgamation
of the fast_float library.

Signed-off-by: Patrick Palka 
---
 .../src/c++17/fast_float/LOCAL_PATCHES|0
 libstdc++-v3/src/c++17/fast_float/MERGE   |4 +
 libstdc++-v3/src/c++17/fast_float/README.md   |  218 ++
 .../src/c++17/fast_float/fast_float.h | 2944 +
 4 files changed, 3166 insertions(+)
 create mode 100644 libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
 create mode 100644 libstdc++-v3/src/c++17/fast_float/MERGE
 create mode 100644 libstdc++-v3/src/c++17/fast_float/README.md
 create mode 100644 libstdc++-v3/src/c++17/fast_float/fast_float.h

diff --git a/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES b/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
new file mode 100644
index 000..e69de29bb2d
diff --git a/libstdc++-v3/src/c++17/fast_float/MERGE b/libstdc++-v3/src/c++17/fast_float/MERGE
new file mode 100644
index 000..43bdc3981c8
--- /dev/null
+++ b/libstdc++-v3/src/c++17/fast_float/MERGE
@@ -0,0 +1,4 @@
+d35368cae610b4edeec61cd41e4d2367a4d33f58
+
+The first line of this file holds the git revision number of the
+last merge done from the master library sources.
diff --git a/libstdc++-v3/src/c++17/fast_float/README.md b/libstdc++-v3/src/c++17/fast_float/README.md
new file mode 100644
index 000..1e1c06d0a3e
--- /dev/null
+++ b/libstdc++-v3/src/c++17/fast_float/README.md
@@ -0,0 +1,218 @@
+## fast_float number parsing library: 4x faster than strtod
+
+![Ubuntu 20.04 CI (GCC 9)](https://github.com/lemire/fast_float/workflows/Ubuntu%2020.04%20CI%20(GCC%209)/badge.svg)
+![Ubuntu 18.04 CI (GCC 7)](https://github.com/lemire/fast_float/workflows/Ubuntu%2018.04%20CI%20(GCC%207)/badge.svg)
+![Alpine Linux](https://github.com/lemire/fast_float/workflows/Alpine%20Linux/badge.svg)
+![MSYS2-CI](https://github.com/lemire/fast_float/workflows/MSYS2-CI/badge.svg)
+![VS16-CLANG-CI](https://github.com/lemire/fast_float/workflows/VS16-CLANG-CI/badge.svg)
+[![VS16-CI](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml/badge.svg)](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml)
+
+The fast_float library provides fast header-only implementations for the C++ from_chars
+functions for `float` and `double` types.  These functions convert ASCII strings representing
+decimal values (e.g., `1.3e10`) into binary 

[PATCH]middle-end: Fix FMA detection when inspecting gimple which have no LHS.

2021-11-16 Thread Tamar Christina via Gcc-patches
Hi All,

convert_mult_to_fma assumes that all gimple_assigns have a LHS set.  This
assumption is however not true when an IFN is kept around just for the
side-effects.  In those situations you have just the IFN and lhs will be null.

Since there's no LHS, there also can't be any ADD, and as such it can't be an
FMA, so it's correct to just return early if there is no LHS.
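
As a toy illustration of the guard (not the GCC code): the types and the function below are hypothetical stand-ins, and only the order of the null check and the dereference matters.

```cpp
// Hypothetical stand-ins for a statement and its result value; none of
// these names are GCC's.
struct toy_value { int type_id; };
struct toy_stmt { toy_value *lhs; };  // lhs is null when the result is unused

// Returns true if STMT could feed an addition (and so be fused).  With no
// LHS there is no value for an add to consume, so bail out before touching
// the type.
bool can_fuse_into_add(const toy_stmt &stmt) {
    toy_value *result = stmt.lhs;
    if (!result)
        return false;            // statement kept only for its side effects
    int type = result->type_id;  // safe: result is non-null here
    return type == 1;            // pretend type 1 is a floating-point type
}
```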

Bootstrapped and regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu and no regressions.

Ok for master?

Thanks,
Tamar



gcc/ChangeLog:

PR tree-optimization/103253
* tree-ssa-math-opts.c (convert_mult_to_fma): Check for LHS.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr103253.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/vect/pr103253.c b/gcc/testsuite/gcc.dg/vect/pr103253.c
new file mode 100644
index ..abe3f09f3818d79a53f2aa962c6b6c06855d618e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr103253.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target fopenmp } */
+/* { dg-additional-options "-O2 -fexceptions -fopenmp -fno-delete-dead-exceptions -fno-trapping-math" } */
+
+double
+do_work (double do_work_pri)
+{
+  int i;
+
+#pragma omp simd
+  for (i = 0; i < 17; ++i)
+do_work_pri = (!i ? 0.5 : i) * 2.0;
+
+  return do_work_pri;
+}
+
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index c4a6492b50df25b4cf296a75bd51e5af34eeacc7..cc8496c3c325f3cc303a90b9b9cac383e5a7942d 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -3224,6 +3224,10 @@ convert_mult_to_fma (gimple *mul_stmt, tree op1, tree op2,
 fma_deferring_state *state, tree mul_cond = NULL_TREE)
 {
   tree mul_result = gimple_get_lhs (mul_stmt);
+  /* If there isn't a LHS then this can't be an FMA.  There can be no LHS
+ if the statement was left just for the side-effects.  */
+  if (!mul_result)
+return false;
   tree type = TREE_TYPE (mul_result);
   gimple *use_stmt, *neguse_stmt;
   use_operand_p use_p;





[PATCH][committed]AArch64 shrn-combine-10: update test to current codegen.

2021-11-16 Thread Tamar Christina via Gcc-patches
Hi All,

When the rshrn commit was reverted I missed this testcase.
This now updates it.
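
For anyone unsure about the difference between the two intrinsic families, here is a scalar model of a single lane.  This is a sketch based on my reading of the instruction descriptions, with made-up function names; the shift is assumed to be in [1, 32].

```cpp
#include <cstdint>

// Scalar model of one lane of a narrowing right shift from 64 to 32 bits.
// SHIFT is assumed to be in [1, 32].

// Truncating form (shrn): drop the low bits, keep the low 32 of the rest.
constexpr std::uint32_t shrn_lane(std::uint64_t x, unsigned shift) {
    return static_cast<std::uint32_t>(x >> shift);
}

// Rounding form (rshrn): add half of the discarded range before shifting.
// Written as "shifted value plus the top discarded bit" so that the
// addition cannot overflow 64 bits.
constexpr std::uint32_t rshrn_lane(std::uint64_t x, unsigned shift) {
    return static_cast<std::uint32_t>((x >> shift) + ((x >> (shift - 1)) & 1u));
}

static_assert(shrn_lane(0x180000000ULL, 32) == 1u, "truncates");
static_assert(rshrn_lane(0x180000000ULL, 32) == 2u, "rounds up on a set top bit");
```

With a shift of 32 the truncating form just selects the upper 32-bit half of each lane, which is presumably why the test expects a single uzp2.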

Bootstrapped and regtested on aarch64-none-linux-gnu and no issues.

Committed under the obvious rule.

Thanks,
Tamar

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/shrn-combine-10.c: Use shrn.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.target/aarch64/shrn-combine-10.c b/gcc/testsuite/gcc.target/aarch64/shrn-combine-10.c
index 3a1cfce93e9065e8d5b43a770b0ef24a17586411..dc9e9be94cbe4ba81d936dfaf178674b9da31040 100644
--- a/gcc/testsuite/gcc.target/aarch64/shrn-combine-10.c
+++ b/gcc/testsuite/gcc.target/aarch64/shrn-combine-10.c
@@ -6,7 +6,7 @@
 
 uint32x4_t foo (uint64x2_t a, uint64x2_t b)
 {
-  return vrshrn_high_n_u64 (vrshrn_n_u64 (a, 32), b, 32);
+  return vshrn_high_n_u64 (vshrn_n_u64 (a, 32), b, 32);
 }
 
 /* { dg-final { scan-assembler-times {\tuzp2\t} 1 } } */





[PATCH][committed]middle-end signbit-2: make test check for scalar or vector versions

2021-11-16 Thread Tamar Christina via Gcc-patches
Hi All,

This updates the signbit-2 test to check for
the scalar optimization if the target does not
support vectorization.

Regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu and no regressions.

Committed under the gcc obvious rule.

Thanks,
Tamar

gcc/testsuite/ChangeLog:

* gcc.dg/signbit-2.c: Check vect or scalar.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/signbit-2.c b/gcc/testsuite/gcc.dg/signbit-2.c
index d8501e9b7a2d82b511ad0b3a44c0121d635972c0..b609f67dc9f8a949b86f0ec84144db834b9d531a 100644
--- a/gcc/testsuite/gcc.dg/signbit-2.c
+++ b/gcc/testsuite/gcc.dg/signbit-2.c
@@ -19,5 +19,6 @@ void fun2(int32_t *x, int n)
   x[i] = (-x[i]) >> 30;
 }
 
-/* { dg-final { scan-tree-dump {\s+>\s+\{ 0, 0, 0(, 0)+ \}} optimized } } */
+/* { dg-final { scan-tree-dump {\s+>\s+\{ 0(, 0)+ \}} optimized { target vect_int } } } */
+/* { dg-final { scan-tree-dump {\s+>\s+0} optimized { target { ! vect_int } } } } */
 /* { dg-final { scan-tree-dump-not {\s+>>\s+31} optimized } } */





Re: [PATCH 2/5] gimple-match: Add a gimple_extract_op function

2021-11-16 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Wed, Nov 10, 2021 at 1:46 PM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> code_helper and gimple_match_op seem like generally useful ways
>> of summing up a gimple_assign or gimple_call (or gimple_cond).
>> This patch adds a gimple_extract_op function that can be used
>> for that.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Richard
>>
>>
>> gcc/
>> * gimple-match.h (gimple_extract_op): Declare.
>> * gimple-match.c (gimple_extract): New function, extracted from...
>> (gimple_simplify): ...here.
>> (gimple_extract_op): New function.
>> ---
>>  gcc/gimple-match-head.c | 261 +++-
>>  gcc/gimple-match.h  |   1 +
>>  2 files changed, 149 insertions(+), 113 deletions(-)
>>
>> diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
>> index 9d88b2f8551..4c6e0883ba4 100644
>> --- a/gcc/gimple-match-head.c
>> +++ b/gcc/gimple-match-head.c
>> @@ -890,12 +890,29 @@ try_conditional_simplification (internal_fn ifn, 
>> gimple_match_op *res_op,
>>return true;
>>  }
>>
>> -/* The main STMT based simplification entry.  It is used by the fold_stmt
>> -   and the fold_stmt_to_constant APIs.  */
>> +/* Common subroutine of gimple_extract_op and gimple_simplify.  Try to
>> +   describe STMT in RES_OP.  Return:
>>
>> -bool
>> -gimple_simplify (gimple *stmt, gimple_match_op *res_op, gimple_seq *seq,
>> -tree (*valueize)(tree), tree (*top_valueize)(tree))
>> +   - -1 if extraction failed
>> +   - otherwise, 0 if no simplification should take place
>> +   - otherwise, the number of operands for a GIMPLE_ASSIGN or GIMPLE_COND
>> +   - otherwise, -2 for a GIMPLE_CALL
>> +
>> +   Before recording an operand, call:
>> +
>> +   - VALUEIZE_CONDITION for a COND_EXPR condition
>> +   - VALUEIZE_NAME if the rhs of a GIMPLE_ASSIGN is an SSA_NAME
>
> I think at least VALUEIZE_NAME is unnecessary, see below

Yeah, it's unnecessary.  The idea was to (try to) ensure that
gimple_simplify keeps all the microoptimisations that it had
previously.  This includes open-coding do_valueize for SSA_NAMEs
and jumping straight to the right gimplify_resimplifyN routine
when the number of operands is already known.

(The two calls to gimple_extract<> produce different functions
that ought to get inlined into their single callers.  A lot of the
jumps should then be threaded.)

I can drop all that if you don't think it's worth it though.
Just wanted to double-check first.
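
To show the shape I mean, here is a hypothetical miniature.  None of the names or types below are GCC's; the point is only that one template does the walk and each caller supplies its own callable, so each instantiation can be inlined into its single caller.

```cpp
// Hypothetical miniature of the pattern; names and types are illustrative.
struct toy_stmt { int code; int operand; };
struct toy_op { int code; int operand; };

// Shared walker.  Returns the operand count on success, or -1 if
// extraction failed.
template<typename ValueizeOp>
inline int toy_extract(const toy_stmt &stmt, toy_op *res,
                       ValueizeOp valueize_op) {
    if (stmt.code == 0)
        return -1;
    res->code = stmt.code;
    res->operand = valueize_op(stmt.operand);
    return 1;
}

// Caller 1: plain extraction, operands recorded unchanged.
inline bool toy_extract_op(const toy_stmt &stmt, toy_op *res) {
    auto nop = [](int op) { return op; };
    return toy_extract(stmt, res, nop) >= 0;
}

// Caller 2: simplification, operands pass through a caller-supplied hook.
inline bool toy_simplify(const toy_stmt &stmt, toy_op *res,
                         int (*hook)(int)) {
    auto valueize = [hook](int op) { return hook ? hook(op) : op; };
    return toy_extract(stmt, res, valueize) > 0;
}
```

Dropping the extra callables would collapse this to a single non-template function, at the cost of the open-coded fast paths.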

Thanks,
Richard

>> +   - VALUEIZE_OP for every other top-level operand
>> +
>> +   Each routine takes a tree argument and returns a tree.  */
>> +
>> +template<typename ValueizeOp, typename ValueizeCondition,
>> +typename ValueizeName>
>> +inline int
>> +gimple_extract (gimple *stmt, gimple_match_op *res_op,
>> +   ValueizeOp valueize_op,
>> +   ValueizeCondition valueize_condition,
>> +   ValueizeName valueize_name)
>>  {
>>switch (gimple_code (stmt))
>>  {
>> @@ -911,100 +928,53 @@ gimple_simplify (gimple *stmt, gimple_match_op 
>> *res_op, gimple_seq *seq,
>> || code == VIEW_CONVERT_EXPR)
>>   {
>> tree op0 = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0);
>> -   bool valueized = false;
>> -   op0 = do_valueize (op0, top_valueize, valueized);
>> -   res_op->set_op (code, type, op0);
>> -   return (gimple_resimplify1 (seq, res_op, valueize)
>> -   || valueized);
>> +   res_op->set_op (code, type, valueize_op (op0));
>> +   return 1;
>>   }
>> else if (code == BIT_FIELD_REF)
>>   {
>> tree rhs1 = gimple_assign_rhs1 (stmt);
>> -   tree op0 = TREE_OPERAND (rhs1, 0);
>> -   bool valueized = false;
>> -   op0 = do_valueize (op0, top_valueize, valueized);
>> +   tree op0 = valueize_op (TREE_OPERAND (rhs1, 0));
>> res_op->set_op (code, type, op0,
>> TREE_OPERAND (rhs1, 1),
>> TREE_OPERAND (rhs1, 2),
>> REF_REVERSE_STORAGE_ORDER (rhs1));
>> -   if (res_op->reverse)
>> - return valueized;
>> -   return (gimple_resimplify3 (seq, res_op, valueize)
>> -   || valueized);
>> +   return res_op->reverse ? 0 : 3;
>>   }
>> -   else if (code == SSA_NAME
>> -&& top_valueize)
>> +   else if (code == SSA_NAME)
>>   {
>> tree op0 = gimple_assign_rhs1 (stmt);
>> -   tree valueized = top_valueize (op0);
>> +   tree valueized = valueize_name (op0);
>> if (!valueized || op0 == valueized)
>> - return false;
>> + return -1;
>> res_op->set_op (TREE_CODE (op0), type, valueized);
>> -

Re: [PATCH 1/5] libstdc++: Import the fast_float library

2021-11-16 Thread Daniel Krügler via Gcc-patches
On Tue, 16 Nov 2021 at 16:31, Patrick Palka via Libstdc++ wrote:
>
[..]
> -- >8 --
>
> Subject: [PATCH 1/5] libstdc++: Import the fast_float library
>
[..]
> +## Reference
> +
> +- Daniel Lemire, [Number Parsing at a Gigabyte per 
> Second](https://arxiv.org/abs/2101.11408), Software: Pratice and Experience 
> 51 (8), 2021.

There is a typo in the title at the very end:

s/Pratice/Practice

(See https://arxiv.org/abs/2101.11408)

- Daniel


Re: [PATCH 4/5] vect: Make reduction code handle calls

2021-11-16 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> On Wed, Nov 10, 2021 at 1:48 PM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> This patch extends the reduction code to handle calls.  So far
>> it's a structural change only; a later patch adds support for
>> specific function reductions.
>>
>> Most of the patch consists of using code_helper and gimple_match_op
>> to describe the reduction operations.  The other main change is that
>> vectorizable_call now needs to handle fully-predicated reductions.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Richard
>>
>>
>> gcc/
>> * builtins.h (associated_internal_fn): Declare overload that
>> takes a (combined_cfn, return type) pair.
>> * builtins.c (associated_internal_fn): Split new overload out
>> of original fndecl version.  Also provide an overload that takes
>> a (combined_cfn, return type) pair.
>> * internal-fn.h (commutative_binary_fn_p): Declare.
>> (associative_binary_fn_p): Likewise.
>> * internal-fn.c (commutative_binary_fn_p): New function,
>> split out from...
>> (first_commutative_argument): ...here.
>> (associative_binary_fn_p): New function.
>> * gimple-match.h (code_helper): Add a constructor that takes
>> internal functions.
>> (commutative_binary_op_p): Declare.
>> (associative_binary_op_p): Likewise.
>> (canonicalize_code): Likewise.
>> (directly_supported_p): Likewise.
>> (get_conditional_internal_fn): Likewise.
>> (gimple_build): New overload that takes a code_helper.
>> * gimple-fold.c (gimple_build): Likewise.
>> * gimple-match-head.c (commutative_binary_op_p): New function.
>> (associative_binary_op_p): Likewise.
>> (canonicalize_code): Likewise.
>> (directly_supported_p): Likewise.
>> (get_conditional_internal_fn): Likewise.
>> * tree-vectorizer.h: Include gimple-match.h.
>> (neutral_op_for_reduction): Take a code_helper instead of a 
>> tree_code.
>> (needs_fold_left_reduction_p): Likewise.
>> (reduction_fn_for_scalar_code): Likewise.
>> (vect_can_vectorize_without_simd_p): Declare a new overload that
>> takes a code_helper.
>> * tree-vect-loop.c: Include case-cfn-macros.h.
>> (fold_left_reduction_fn): Take a code_helper instead of a tree_code.
>> (reduction_fn_for_scalar_code): Likewise.
>> (neutral_op_for_reduction): Likewise.
>> (needs_fold_left_reduction_p): Likewise.
>> (use_mask_by_cond_expr_p): Likewise.
>> (build_vect_cond_expr): Likewise.
>> (vect_create_partial_epilog): Likewise.  Use gimple_build rather
>> than gimple_build_assign.
>> (check_reduction_path): Handle calls and operate on code_helpers
>> rather than tree_codes.
>> (vect_is_simple_reduction): Likewise.
>> (vect_model_reduction_cost): Likewise.
>> (vect_find_reusable_accumulator): Likewise.
>> (vect_create_epilog_for_reduction): Likewise.
>> (vect_transform_cycle_phi): Likewise.
>> (vectorizable_reduction): Likewise.  Make more use of
>> lane_reduc_code_p.
>> (vect_transform_reduction): Use gimple_extract_op but expect
>> a tree_code for now.
>> (vect_can_vectorize_without_simd_p): New overload that takes
>> a code_helper.
>> * tree-vect-stmts.c (vectorizable_call): Handle reductions in
>> fully-masked loops.
>> * tree-vect-patterns.c (vect_mark_pattern_stmts): Use
>> gimple_extract_op when updating STMT_VINFO_REDUC_IDX.
>> ---
>>  gcc/builtins.c   |  46 -
>>  gcc/builtins.h   |   1 +
>>  gcc/gimple-fold.c|   9 +
>>  gcc/gimple-match-head.c  |  70 +++
>>  gcc/gimple-match.h   |  20 ++
>>  gcc/internal-fn.c|  46 -
>>  gcc/internal-fn.h|   2 +
>>  gcc/tree-vect-loop.c | 420 +++
>>  gcc/tree-vect-patterns.c |  23 ++-
>>  gcc/tree-vect-stmts.c|  66 --
>>  gcc/tree-vectorizer.h|  10 +-
>>  11 files changed, 455 insertions(+), 258 deletions(-)
>>
>> diff --git a/gcc/builtins.c b/gcc/builtins.c
>> index 384864bfb3a..03829c03a5a 100644
>> --- a/gcc/builtins.c
>> +++ b/gcc/builtins.c
>> @@ -2139,17 +2139,17 @@ mathfn_built_in_type (combined_fn fn)
>>  #undef SEQ_OF_CASE_MATHFN
>>  }
>>
>> -/* If BUILT_IN_NORMAL function FNDECL has an associated internal function,
>> -   return its code, otherwise return IFN_LAST.  Note that this function
>> -   only tests whether the function is defined in internals.def, not whether
>> -   it is actually available on the target.  */
>> +/* Check whether there is an internal function associated with function FN
>> +   and return type RETURN_TYPE.  Return the function if so, otherwise return
>> +   IFN_LAST.
>>
>> -internal_fn
>> -associated_internal_fn (tre

Re: [musl] Re: [PATCH v2] configure: define TARGET_LIBC_GNUSTACK on musl

2021-11-16 Thread Rich Felker
On Tue, Nov 16, 2021 at 03:40:00PM +0100, Dragan Mladjenovic wrote:
> Hi,
> 
> Looks fine to me. If possible, maybe it should even be back-ported
> to stable branches.
> 
> Not sure if MIPS assembly sources (if any) in musl would need an
> explicit .note.GNU-stack to complement this?

What are the actual consequences of making this change, and what is
the goal? I'm concerned that it might produce object files which don't
include annotation that they don't need executable stack, in which
case the final executable file will be marked as executable-stack and
the kernel will load it as such. That would be very bad.

Rich


> On 16-Nov-21 06:13, Ilya Lipnitskiy wrote:
> >musl only uses PT_GNU_STACK to set default thread stack size and has no
> >executable stack support[0], so there is no reason not to emit the
> >.note.GNU-stack section on musl builds.
> >
> >[0]: 
> >https://lore.kernel.org/all/20190423192534.gn23...@brightrain.aerifal.cx/T/#u
> >
> >gcc/ChangeLog:
> >
> > * configure: Regenerate.
> > * configure.ac: define TARGET_LIBC_GNUSTACK on musl
> >
> >Signed-off-by: Ilya Lipnitskiy 
> >---
> >  gcc/configure| 3 +++
> >  gcc/configure.ac | 3 +++
> >  2 files changed, 6 insertions(+)
> >
> >diff --git a/gcc/configure b/gcc/configure
> >index 74b9d9be4c85..7091a838aefa 100755
> >--- a/gcc/configure
> >+++ b/gcc/configure
> >@@ -31275,6 +31275,9 @@ fi
> >  # Check if the target LIBC handles PT_GNU_STACK.
> >  gcc_cv_libc_gnustack=unknown
> >  case "$target" in
> >+  mips*-*-linux-musl*)
> >+gcc_cv_libc_gnustack=yes
> >+;;
> >mips*-*-linux*)
> >  if test $glibc_version_major -gt 2 \
> >diff --git a/gcc/configure.ac b/gcc/configure.ac
> >index c9ee1fb8919e..8a2d34179a75 100644
> >--- a/gcc/configure.ac
> >+++ b/gcc/configure.ac
> >@@ -6961,6 +6961,9 @@ fi
> >  # Check if the target LIBC handles PT_GNU_STACK.
> >  gcc_cv_libc_gnustack=unknown
> >  case "$target" in
> >+  mips*-*-linux-musl*)
> >+gcc_cv_libc_gnustack=yes
> >+;;
> >mips*-*-linux*)
> >  GCC_GLIBC_VERSION_GTE_IFELSE([2], [31], [gcc_cv_libc_gnustack=yes], )
> >  ;;


Re: [PATCH] simplify get_range_strlen interface

2021-11-16 Thread Martin Sebor via Gcc-patches

On 11/15/21 3:05 PM, Martin Sebor wrote:
> The deeply nested PHI handling in get_range_strlen_dynamic makes
> the code bigger and harder to follow than it would be if done in
> its own function.  The attached patch does that.
>
> In addition, the get_range_strlen family of functions uses a bitmap
> to avoid infinite recursion.  Rather than dynamically allocating
> and freeing it on demand, the attached patch simplifies the code
> by using an instance of auto_bitmap.  This avoids the risk of
> neglecting to deallocate the bitmap.

I forgot over the weekend that this change also fixes a bug:
PR 102960.

I have committed the fix in r12-5310 along with a test.

Martin

> Tested on x86_64-linux.
>
> Martin




[committed 1/2] libstdc++: Use hidden friends for vector<bool>::reference swap overloads

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, committed to trunk.

These swap overloads are non-standard, but are needed to make swap work
for vector<bool>::reference rvalues. They don't need to be called
explicitly, only via ADL, so hide them from normal lookup. This is what
I've proposed as the resolution to LWG 3638.
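
The hidden-friend arrangement described above can be sketched as follows.
This is only an illustration, not the libstdc++ code: `demo::bit_ref` and
`swap_low_bits` are hypothetical names, and the proxy is a simplified
stand-in for vector<bool>::reference.

```cpp
#include <cassert>
#include <utility>

namespace demo {

// Hypothetical stand-in for a proxy reference to a single bit.
class bit_ref {
  unsigned* word_;
  unsigned mask_;

public:
  bit_ref(unsigned* word, unsigned mask) : word_(word), mask_(mask) {}

  operator bool() const { return (*word_ & mask_) != 0; }

  bit_ref& operator=(bool b) {
    if (b) *word_ |= mask_;
    else   *word_ &= ~mask_;
    return *this;
  }

  // Hidden friend: it is declared only inside the class, so ordinary
  // lookup cannot find it; argument-dependent lookup on a bit_ref can.
  friend void swap(bit_ref x, bit_ref y) noexcept {
    bool tmp = x;
    // The cast selects operator=(bool); a plain `x = y` would use the
    // implicit copy assignment and rebind the proxy instead.
    x = static_cast<bool>(y);
    y = tmp;
  }
};

} // namespace demo

// Swaps bits 0 and 1 of `word` through two proxy rvalues.
inline unsigned swap_low_bits(unsigned word) {
  using std::swap;  // std::swap(T&, T&) cannot bind rvalues; ADL finds demo's
  swap(demo::bit_ref(&word, 1u), demo::bit_ref(&word, 2u));
  return word;
}
```

Calling `swap_low_bits(1u)` moves the set bit from position 0 to position 1.
The by-value parameters are what let the overload accept proxy rvalues.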

libstdc++-v3/ChangeLog:

* include/bits/stl_bvector.h (swap(_Bit_reference, _Bit_reference))
(swap(_Bit_reference, bool&), swap(bool&, _Bit_reference)):
Define as hidden friends of _Bit_reference.
---
 libstdc++-v3/include/bits/stl_bvector.h | 50 -
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 381c47b6132..68070685baf 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -125,36 +125,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 void
 flip() _GLIBCXX_NOEXCEPT
 { *_M_p ^= _M_mask; }
-  };
 
 #if __cplusplus >= 201103L
-  _GLIBCXX20_CONSTEXPR
-  inline void
-  swap(_Bit_reference __x, _Bit_reference __y) noexcept
-  {
-bool __tmp = __x;
-__x = __y;
-__y = __tmp;
-  }
+_GLIBCXX20_CONSTEXPR
+friend void
+swap(_Bit_reference __x, _Bit_reference __y) noexcept
+{
+  bool __tmp = __x;
+  __x = __y;
+  __y = __tmp;
+}
 
-  _GLIBCXX20_CONSTEXPR
-  inline void
-  swap(_Bit_reference __x, bool& __y) noexcept
-  {
-bool __tmp = __x;
-__x = __y;
-__y = __tmp;
-  }
+_GLIBCXX20_CONSTEXPR
+friend void
+swap(_Bit_reference __x, bool& __y) noexcept
+{
+  bool __tmp = __x;
+  __x = __y;
+  __y = __tmp;
+}
 
-  _GLIBCXX20_CONSTEXPR
-  inline void
-  swap(bool& __x, _Bit_reference __y) noexcept
-  {
-bool __tmp = __x;
-__x = __y;
-__y = __tmp;
-  }
+_GLIBCXX20_CONSTEXPR
+friend void
+swap(bool& __x, _Bit_reference __y) noexcept
+{
+  bool __tmp = __x;
+  __x = __y;
+  __y = __tmp;
+}
 #endif
+  };
 
   struct _Bit_iterator_base
   : public std::iterator
-- 
2.31.1



[committed 2/2] libstdc++: Implement constexpr std::basic_string for C++20

2021-11-16 Thread Jonathan Wakely via Gcc-patches
From: Michael de Lang 

Tested x86_64-linux, committed to trunk.


This is only supported for the cxx11 ABI, not for COW strings.
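
As a rough illustration of what the feature enables, the sketch below
builds a std::string inside a function that is constexpr only when the
library reports P0980R1 support through __cpp_lib_constexpr_string. The
function name `built_length` is hypothetical; elsewhere the same code
simply runs at run time.

```cpp
#include <cstddef>
#include <string>

// Hypothetical shorthand for "the library supports constexpr std::string".
#if defined(__cpp_lib_constexpr_string) && __cpp_lib_constexpr_string >= 201907L
# define DEMO_CONSTEXPR_STRING 1
#else
# define DEMO_CONSTEXPR_STRING 0
#endif

// Builds "constexpr!" and returns its length.
#if DEMO_CONSTEXPR_STRING
constexpr
#else
inline
#endif
std::size_t built_length() {
  std::string s = "constexpr";  // 9 characters
  s += '!';                     // now 10
  return s.size();
}

#if DEMO_CONSTEXPR_STRING
// Only checked where the string operations are usable in constant evaluation.
static_assert(built_length() == 10, "evaluated at compile time");
#endif
```

The string never escapes the constant evaluation, which is the usual
restriction on allocations made while evaluating a constant expression.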

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (basic_string, operator""s): Add
constexpr for C++20.
(basic_string::basic_string(basic_string&&)): Only copy
initialized portion of the buffer.
(basic_string::basic_string(basic_string&&, const Alloc&)):
Likewise.
* include/bits/basic_string.tcc (basic_string): Add constexpr
for C++20.
(basic_string::swap(basic_string&)): Only copy initialized
portions of the buffers.
(basic_string::_M_replace): Add constexpr implementation that
doesn't depend on pointer comparisons.
* include/bits/cow_string.h: Adjust comment.
* include/ext/type_traits.h (__is_null_pointer): Add constexpr.
* include/std/string (erase, erase_if): Add constexpr.
* include/std/version (__cpp_lib_constexpr_string): Update
value.
* testsuite/21_strings/basic_string/cons/char/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/cons/wchar_t/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/literals/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/modifiers/constexpr.cc: New test.
* testsuite/21_strings/basic_string/modifiers/swap/char/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/modifiers/swap/wchar_t/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/version.cc: New test.
---
 libstdc++-v3/include/bits/basic_string.h  | 274 --
 libstdc++-v3/include/bits/basic_string.tcc|  69 -
 libstdc++-v3/include/bits/cow_string.h|   2 +-
 libstdc++-v3/include/ext/type_traits.h|   4 +-
 libstdc++-v3/include/std/string   |   2 +
 libstdc++-v3/include/std/version  |   6 +-
 .../basic_string/cons/char/constexpr.cc   | 174 +++
 .../basic_string/cons/wchar_t/constexpr.cc| 174 +++
 .../basic_string/literals/constexpr.cc|  22 ++
 .../basic_string/modifiers/constexpr.cc   |  52 
 .../modifiers/swap/char/constexpr.cc  |  49 
 .../modifiers/swap/wchar_t/constexpr.cc   |  49 
 .../21_strings/basic_string/version.cc|  25 ++
 13 files changed, 869 insertions(+), 33 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/cons/char/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/literals/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/modifiers/swap/char/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/modifiers/swap/wchar_t/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/version.cc

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index a6575fa9e26..b6945f1cdfb 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -57,12 +57,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
 #ifdef __cpp_lib_is_constant_evaluated
-// Support P1032R1 in C++20 (but not P0980R1 yet).
-# define __cpp_lib_constexpr_string 201811L
+// Support P0980R1 in C++20.
+# define __cpp_lib_constexpr_string 201907L
 #elif __cplusplus >= 201703L && _GLIBCXX_HAVE_BUILTIN_IS_CONSTANT_EVALUATED
 // Support P0426R1 changes to char_traits in C++17.
 # define __cpp_lib_constexpr_string 201611L
-#elif __cplusplus > 201703L
 #endif
 
   /**
@@ -131,6 +130,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
  _Res>;
 
   // Allows an implicit conversion to __sv_type.
+  _GLIBCXX20_CONSTEXPR
   static __sv_type
   _S_to_string_view(__sv_type __svt) noexcept
   { return __svt; }
@@ -141,7 +141,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   // is provided.
   struct __sv_wrapper
   {
-   explicit __sv_wrapper(__sv_type __sv) noexcept : _M_sv(__sv) { }
+   _GLIBCXX20_CONSTEXPR explicit
+   __sv_wrapper(__sv_type __sv) noexcept : _M_sv(__sv) { }
+
__sv_type _M_sv;
   };
 
@@ -151,6 +153,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
*  @param  __svw  string view wrapper.
*  @param  __a  Allocator to use.
*/
+  _GLIBCXX20_CONSTEXPR
   explicit
   basic_string(__sv_wrapper __svw, const _Alloc& __a)
   : basic_string(__svw._M_sv.data(), __svw._M_sv.size(), __a) { }
@@ -163,9 +166,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
_Alloc_hider(pointer __dat, const _Alloc& __a = _Alloc())
: allocator_type(__a), _M_p(__dat) { }
 #else
+   _GLIBCXX20_CONSTEXPR
_Alloc_hider(pointer __dat, const _Al

[PATCH] Do not abort compilation when dump file is /dev/*

2021-11-16 Thread Giuliano Belinassi via Gcc-patches
The `configure` scripts generated by autoconf often test compiler
features with the output set to `/dev/null`.  GCC then uses /dev/ as the
dump folder, and the compilation halts with an error because GCC cannot
create files in /dev/.  This is a problem when configure is testing for
compiler features: it cannot tell whether the failure was due to an
unsupported feature or to some other problem, so it disables the feature
even if it works.

As an example, running configure with CFLAGS="-fdump-ipa-clones" will
result in several compiler features being reported as disabled, because
gcc halts with an error when creating files in /dev/*.

This commit fixes the issue by checking whether the dump folder is
/dev/.  If so, it just informs the user and disables dumping, but does
not halt the compilation, and the compiler returns 0 to the shell.
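
The check described above amounts to a prefix test on the dump file
name.  A minimal standalone sketch, assuming hypothetical helper names
(`is_under_dev`, `classify_dump_failure`) that are not part of GCC:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when `filename` starts with "/dev/".
inline bool is_under_dev(const char* filename) {
  static const char slash_dev[] = "/dev/";
  // sizeof counts the trailing NUL, so compare sizeof - 1 bytes.
  return std::strncmp(slash_dev, filename, sizeof slash_dev - 1) == 0;
}

// Hypothetical severity for a dump file that could not be opened.
enum class dump_failure { note, error };

// A failure under /dev/ is downgraded to a note; anything else stays an error.
inline dump_failure classify_dump_failure(const char* filename) {
  return is_under_dev(filename) ? dump_failure::note : dump_failure::error;
}
```

Note that the test is purely textual: a relative path or a symlink that
resolves into /dev/ would not match.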

gcc/ChangeLog
2021-11-16  Giuliano Belinassi  

* dumpfile.c (dump_open): Do not halt compilation when file
matches /dev/*.

gcc/testsuite/ChangeLog
2021-11-16  Giuliano Belinassi  

* gcc.dg/devnull-dump.c: New.

Signed-off-by: Giuliano Belinassi 
---
 gcc/dumpfile.c  | 17 -
 gcc/testsuite/gcc.dg/devnull-dump.c |  7 +++
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/devnull-dump.c

diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 8169daf7f59..b1dbfb371af 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -378,7 +378,22 @@ dump_open (const char *filename, bool trunc)
   FILE *stream = fopen (filename, trunc ? "w" : "a");
 
   if (!stream)
-error ("could not open dump file %qs: %m", filename);
+{
+  /* Autoconf tests compiler functionalities by setting output to /dev/null.
+In this case, if dumps are enabled, it will try to set the output
+folder to /dev/*, which is of course invalid and the compiler will exit
+with an error, resulting in configure script reporting the tested
+feature as being unavailable. Here we test this case by checking if the
+output file prefix has /dev/ and only inform the user in this case
+rather than refusing to compile.  */
+
+  const char *const slash_dev = "/dev/";
+  if (strncmp(slash_dev, filename, strlen(slash_dev)) == 0)
+   inform (UNKNOWN_LOCATION,
+   "could not open dump file %qs: %m. Dumps are disabled.", filename);
+  else
+   error ("could not open dump file %qs: %m", filename);
+}
   return stream;
 }
 
diff --git a/gcc/testsuite/gcc.dg/devnull-dump.c b/gcc/testsuite/gcc.dg/devnull-dump.c
new file mode 100644
index 000..378e0901c28
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/devnull-dump.c
@@ -0,0 +1,7 @@
+/* { dg-do assemble } */
+/* { dg-options "-fdump-ipa-clones -o /dev/null" } */
+
+int main()
+{
+  return 0;
+}
-- 
2.33.1



[PATCH RFC] c-family: don't cache large vecs

2021-11-16 Thread Jason Merrill via Gcc-patches
Patrick observed recently that an element of the vector cache could be
arbitrarily large.  Let's only cache relatively small vecs.

This has no effect on compiling the libstdc++ stdc++.h, presumably because
nothing in the library requires a vec that large.  I figure that this makes it
more likely that a subsequent long list will reuse the same memory when the
later vec gets expanded.

Does this make sense to others?
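
For readers unfamiliar with the cache, the policy can be sketched outside
GCC with standard containers.  Everything here is an assumption made for
illustration: `small_vector_pool` is a hypothetical class, std::vector
stands in for GCC's vec, and the threshold of 16 mirrors the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical pool that keeps released vectors for reuse, but only
// those whose capacity stayed below a threshold.
class small_vector_pool {
  static constexpr std::size_t max_cached_capacity = 16;
  std::vector<std::unique_ptr<std::vector<int>>> cache_;

public:
  // Hands out a cached vector if one exists, otherwise a fresh one.
  std::unique_ptr<std::vector<int>> acquire() {
    if (cache_.empty())
      return std::unique_ptr<std::vector<int>>(new std::vector<int>());
    std::unique_ptr<std::vector<int>> v = std::move(cache_.back());
    cache_.pop_back();
    return v;
  }

  // Caches small vectors; large ones are freed when `v` goes out of scope.
  void release(std::unique_ptr<std::vector<int>> v) {
    if (!v)
      return;
    if (v->capacity() >= max_cached_capacity)
      return;
    v->clear();
    cache_.push_back(std::move(v));
  }

  std::size_t cached() const { return cache_.size(); }
};
```

The point of the threshold is that one unusually long list no longer
pins a large allocation in the cache for the rest of the compilation.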

gcc/c-family/ChangeLog:

* c-common.c (release_tree_vector): Only cache vecs smaller than
16 elements.
---
 gcc/c-family/c-common.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 436df45df68..90e8ec87b6b 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -8213,8 +8213,16 @@ release_tree_vector (vec<tree, va_gc> *vec)
 {
   if (vec != NULL)
 {
-  vec->truncate (0);
-  vec_safe_push (tree_vector_cache, vec);
+  if (vec->allocated () >= 16)
+   /* Don't cache vecs that have expanded more than once.  On a p64
+  target, vecs double in alloc size with each power of 2 elements, e.g
+  at 16 elements the alloc increases from 128 to 256 bytes.  */
+   vec_free (vec);
+  else
+   {
+ vec->truncate (0);
+ vec_safe_push (tree_vector_cache, vec);
+   }
 }
 }
 

base-commit: 132f1c27770fa6dafdf14591878d301aedd5ae16
-- 
2.27.0



Re: [committed 2/2] libstdc++: Implement constexpr std::basic_string for C++20

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Oops, the subject line was not supposed to say 2/2 for this commit, and I
was not supposed to have Michael de Lang as the author ... I messed up my
git send-email and git cherry-pick commands!

Sorry Michael, I originally tried to use your tests from
https://github.com/Oipo/gcc/ but as noted in https://gcc.gnu.org/PR93989
those tests are incorrect, and so I didn't actually use any of them (nor
the std::string code itself). But apparently the commit still had you as
the author, because I reset the content of the git tree, but not the commit
author. I'll fix that in GCC's ChangeLog file after it regenerates
overnight.



On Tue, 16 Nov 2021 at 16:47, Jonathan Wakely wrote:

> From: Michael de Lang
>
> Tested x86_64-linux, committed to trunk.
>
>
> This is only supported for the cxx11 ABI, not for COW strings.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/basic_string.h (basic_string, operator""s): Add
> constexpr for C++20.
> (basic_string::basic_string(basic_string&&)): Only copy
> initialized portion of the buffer.
> (basic_string::basic_string(basic_string&&, const Alloc&)):
> Likewise.
> * include/bits/basic_string.tcc (basic_string): Add constexpr
> for C++20.
> (basic_string::swap(basic_string&)): Only copy initialized
> portions of the buffers.
> (basic_string::_M_replace): Add constexpr implementation that
> doesn't depend on pointer comparisons.
> * include/bits/cow_string.h: Adjust comment.
> * include/ext/type_traits.h (__is_null_pointer): Add constexpr.
> * include/std/string (erase, erase_if): Add constexpr.
> * include/std/version (__cpp_lib_constexpr_string): Update
> value.
> * testsuite/21_strings/basic_string/cons/char/constexpr.cc:
> New test.
> * testsuite/21_strings/basic_string/cons/wchar_t/constexpr.cc:
> New test.
> * testsuite/21_strings/basic_string/literals/constexpr.cc:
> New test.
> * testsuite/21_strings/basic_string/modifiers/constexpr.cc: New
> test.
> *
> testsuite/21_strings/basic_string/modifiers/swap/char/constexpr.cc:
> New test.
> *
> testsuite/21_strings/basic_string/modifiers/swap/wchar_t/constexpr.cc:
> New test.
> * testsuite/21_strings/basic_string/version.cc: New test.
>
>
>


Re: [PATCH 06/15] visium: Fix non-robust split condition in define_insn_and_split

2021-11-16 Thread Eric Botcazou via Gcc-patches
> gcc/ChangeLog:
> 
>  * config/visium/visium.md (*add3_insn, *addsi3_insn, *addi3_insn,
>   *sub3_insn, *subsi3_insn, *subdi3_insn, *neg2_insn,
>   *negdi2_insn, *and3_insn, *ior3_insn, *xor3_insn,
>   *one_cmpl2_insn, *ashl3_insn, *ashr3_insn,
>   *lshr3_insn, *trunchiqi2_insn, *truncsihi2_insn,
>   *truncdisi2_insn, *extendqihi2_insn, *extendqisi2_insn,
>   *extendhisi2_insn, *extendsidi2_insn, *zero_extendqihi2_insn,
>*zero_extendqisi2_insn, *zero_extendsidi2_insn): Fix split condition.

OK for mainline, thanks.

-- 
Eric Botcazou




Re: [PATCH RFC] c-family: don't cache large vecs

2021-11-16 Thread Marek Polacek via Gcc-patches
On Tue, Nov 16, 2021 at 11:53:14AM -0500, Jason Merrill via Gcc-patches wrote:
> Patrick observed recently that an element of the vector cache could be
> arbitrarily large.  Let's only cache relatively small vecs.
> 
> This has no effect on compiling the libstdc++ stdc++.h, presumably because
> nothing in the library requires a vec that large.  I figure that this makes it
> more likely that a subsequent long list will reuse the same memory when the
> later vec gets expanded.
> 
> Does this make sense to others?

Looks good to me.
 
> gcc/c-family/ChangeLog:
> 
>   * c-common.c (release_tree_vector): Only cache vecs smaller than
>   16 elements.
> ---
>  gcc/c-family/c-common.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index 436df45df68..90e8ec87b6b 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -8213,8 +8213,16 @@ release_tree_vector (vec<tree, va_gc> *vec)
>  {
>if (vec != NULL)
>  {
> -  vec->truncate (0);
> -  vec_safe_push (tree_vector_cache, vec);
> +  if (vec->allocated () >= 16)
> + /* Don't cache vecs that have expanded more than once.  On a p64
> +target, vecs double in alloc size with each power of 2 elements, e.g
> +at 16 elements the alloc increases from 128 to 256 bytes.  */
> + vec_free (vec);
> +  else
> + {
> +   vec->truncate (0);
> +   vec_safe_push (tree_vector_cache, vec);
> + }
>  }
>  }
>  
> 
> base-commit: 132f1c27770fa6dafdf14591878d301aedd5ae16
> -- 
> 2.27.0
> 

Marek



[PATCH] rs6000: Add [power6-64] stanza to new builtin support

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi!  While reviewing the recent 32-bit changes for the new builtin
infrastructure, I realized that I needed another stanza to represent
builtins requiring both -mcpu=power6 and -mpowerpc64.  (There's only one
of these, but nonetheless...)  So this patch adds that support in the
same fashion as [power7-64] and [power9-64].  Bootstrapped and tested on
powerpc64le-linux-gnu, and on powerpc64-linux-gnu with -m32/-m64.  Is
this okay for trunk?
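
In outline, the new stanza just adds one more enable condition that is
the conjunction of two target flags.  A minimal sketch, assuming a
hypothetical `target_flags` struct in place of the real TARGET_CMPB and
TARGET_POWERPC64 macros and a cut-down enum:

```cpp
#include <cassert>

// Hypothetical stand-ins for the target flags named in the patch.
struct target_flags {
  bool cmpb;       // models TARGET_CMPB
  bool powerpc64;  // models TARGET_POWERPC64
};

// Cut-down version of the enable enum; the real one has many more entries.
enum bif_enable { ENB_ALWAYS, ENB_P6, ENB_P6_64 };

// Returns whether a builtin in the given stanza is usable on target `t`.
inline bool builtin_is_supported(bif_enable e, const target_flags& t) {
  switch (e) {
    case ENB_ALWAYS: return true;
    case ENB_P6:     return t.cmpb;
    case ENB_P6_64:  return t.cmpb && t.powerpc64;  // both are required
  }
  return false;
}
```

With this shape, a power6 target without 64-bit GPRs still accepts the
[power6] builtins but rejects the one moved to [power6-64].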

Thanks!
Bill


2021-11-16  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtin-new.def: Add power6-64 stanza.
Move CMPB to power6-64 stanza.
* config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin): Handle
ENB_P6_64 case.
(rs6000_new_builtin_is_supported): Likewise.
(rs6000_expand_new_builtin): Likewise.
(rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-gen-builtins.c (bif_stanza): Add
BSTZ_P6_64.
(stanza_map): Add entry mapping power6-64 to BSTZ_P6_64.
(enable_string): Add "ENB_P6_64".
(write_decls): Add ENB_P6_64 to bif_enable enum.
---
 gcc/config/rs6000/rs6000-builtin-new.def |  9 ++---
 gcc/config/rs6000/rs6000-call.c  | 10 ++
 gcc/config/rs6000/rs6000-gen-builtins.c  |  4 
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 1dd8f6b40b2..58dfce1ca37 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -266,13 +266,16 @@
 
 ; Power6 builtins (ISA 2.05).
 [power6]
-  const signed long __builtin_p6_cmpb (signed long, signed long);
-CMPB cmpbdi3 {}
-
   const signed int __builtin_p6_cmpb_32 (signed int, signed int);
 CMPB_32 cmpbsi3 {}
 
 
+; Power6 builtins requiring 64-bit GPRs (even with 32-bit addressing).
+[power6-64]
+  const signed long __builtin_p6_cmpb (signed long, signed long);
+CMPB cmpbdi3 {}
+
+
 ; AltiVec builtins.
 [altivec]
   const vsc __builtin_altivec_abs_v16qi (vsc);
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 83e1abb6118..822a9736591 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11919,6 +11919,10 @@ rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
 case ENB_P6:
   error ("%qs requires the %qs option", name, "-mcpu=power6");
   break;
+case ENB_P6_64:
+  error ("%qs requires the %qs option and either the %qs or %qs option",
+name, "-mcpu=power6", "-m64", "-mpowerpc64");
+  break;
 case ENB_ALTIVEC:
   error ("%qs requires the %qs option", name, "-maltivec");
   break;
@@ -13346,6 +13350,8 @@ rs6000_new_builtin_is_supported (enum rs6000_gen_builtins fncode)
   return TARGET_POPCNTB;
 case ENB_P6:
   return TARGET_CMPB;
+case ENB_P6_64:
+  return TARGET_CMPB && TARGET_POWERPC64;
 case ENB_P7:
   return TARGET_POPCNTD;
 case ENB_P7_64:
@@ -15697,6 +15703,8 @@ rs6000_expand_new_builtin (tree exp, rtx target,
   if (!(e == ENB_ALWAYS
|| (e == ENB_P5 && TARGET_POPCNTB)
|| (e == ENB_P6 && TARGET_CMPB)
+   || (e == ENB_P6_64  && TARGET_CMPB
+   && TARGET_POWERPC64)
|| (e == ENB_ALTIVEC&& TARGET_ALTIVEC)
|| (e == ENB_CELL   && TARGET_ALTIVEC
&& rs6000_cpu == PROCESSOR_CELL)
@@ -16419,6 +16427,8 @@ rs6000_init_builtins (void)
continue;
  if (e == ENB_P6 && !TARGET_CMPB)
continue;
+ if (e == ENB_P6_64 && !(TARGET_CMPB && TARGET_POWERPC64))
+   continue;
  if (e == ENB_ALTIVEC && !TARGET_ALTIVEC)
continue;
  if (e == ENB_VSX && !TARGET_VSX)
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 1655a2fd765..4ce83bd2290 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -212,6 +212,7 @@ enum bif_stanza
  BSTZ_ALWAYS,
  BSTZ_P5,
  BSTZ_P6,
+ BSTZ_P6_64,
  BSTZ_ALTIVEC,
  BSTZ_CELL,
  BSTZ_VSX,
@@ -245,6 +246,7 @@ static stanza_entry stanza_map[NUMBIFSTANZAS] =
 { "always",BSTZ_ALWAYS },
 { "power5",BSTZ_P5 },
 { "power6",BSTZ_P6 },
+{ "power6-64", BSTZ_P6_64  },
 { "altivec",   BSTZ_ALTIVEC},
 { "cell",  BSTZ_CELL   },
 { "vsx",   BSTZ_VSX},
@@ -269,6 +271,7 @@ static const char *enable_string[NUMBIFSTANZAS] =
 "ENB_ALWAYS",
 "ENB_P5",
 "ENB_P6",
+"ENB_P6_64",
 "ENB_ALTIVEC",
 "ENB_CELL",
 "ENB_VSX",
@@ -2227,6 +2230,7 @@ write_decls (void)
   fprintf (header_file, "  ENB_ALWAYS,\n");
   fprintf (header_file, "  ENB_P5,\n");
   fprintf (header_file, "  ENB_P6,\n");
+  fprintf (header_file, "  ENB_P6_64,\n");
   fprintf (header_file, "  ENB_ALT

Re: [PATCH] rs6000: Add [power6-64] stanza to new builtin support

2021-11-16 Thread Bill Schmidt via Gcc-patches
Sorry, I forgot to CC maintainers on this one.

Thanks!
Bill

On 11/16/21 11:06 AM, Bill Schmidt wrote:
> Hi!  While reviewing the recent 32-bit changes for the new builtin infrastructure,
> I realized that I needed another stanza to represent builtins requiring both
> -mcpu=power6 and -mpowerpc64.  (There's only one of these, but nonetheless...)
> So this patch adds that support in the same fashion as [power7-64] and
> [power9-64].  Bootstrapped and tested on powerpc64le-linux-gnu, and on
> powerpc64-linux-gnu with -m32/-m64.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-16  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-builtin-new.def: Add power6-64 stanza.
>   Move CMPB to power6-64 stanza.
>   * config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin): Handle
>   ENB_P6_64 case.
>   (rs6000_new_builtin_is_supported): Likewise.
>   (rs6000_expand_new_builtin): Likewise.
>   (rs6000_init_builtins): Likewise.
>   * config/rs6000/rs6000-gen-builtins.c (bif_stanza): Add
>   BSTZ_P6_64.
>   (stanza_map): Add entry mapping power6-64 to BSTZ_P6_64.
>   (enable_string): Add "ENB_P6_64".
>   (write_decls): Add ENB_P6_64 to bif_enable enum.
> ---
>  gcc/config/rs6000/rs6000-builtin-new.def |  9 ++---
>  gcc/config/rs6000/rs6000-call.c  | 10 ++
>  gcc/config/rs6000/rs6000-gen-builtins.c  |  4 
>  3 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
> index 1dd8f6b40b2..58dfce1ca37 100644
> --- a/gcc/config/rs6000/rs6000-builtin-new.def
> +++ b/gcc/config/rs6000/rs6000-builtin-new.def
> @@ -266,13 +266,16 @@
>  
>  ; Power6 builtins (ISA 2.05).
>  [power6]
> -  const signed long __builtin_p6_cmpb (signed long, signed long);
> -CMPB cmpbdi3 {}
> -
>const signed int __builtin_p6_cmpb_32 (signed int, signed int);
>  CMPB_32 cmpbsi3 {}
>  
>  
> +; Power6 builtins requiring 64-bit GPRs (even with 32-bit addressing).
> +[power6-64]
> +  const signed long __builtin_p6_cmpb (signed long, signed long);
> +CMPB cmpbdi3 {}
> +
> +
>  ; AltiVec builtins.
>  [altivec]
>const vsc __builtin_altivec_abs_v16qi (vsc);
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 83e1abb6118..822a9736591 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -11919,6 +11919,10 @@ rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
>  case ENB_P6:
>error ("%qs requires the %qs option", name, "-mcpu=power6");
>break;
> +case ENB_P6_64:
> +  error ("%qs requires the %qs option and either the %qs or %qs option",
> +  name, "-mcpu=power6", "-m64", "-mpowerpc64");
> +  break;
>  case ENB_ALTIVEC:
>error ("%qs requires the %qs option", name, "-maltivec");
>break;
> @@ -13346,6 +13350,8 @@ rs6000_new_builtin_is_supported (enum rs6000_gen_builtins fncode)
>return TARGET_POPCNTB;
>  case ENB_P6:
>return TARGET_CMPB;
> +case ENB_P6_64:
> +  return TARGET_CMPB && TARGET_POWERPC64;
>  case ENB_P7:
>return TARGET_POPCNTD;
>  case ENB_P7_64:
> @@ -15697,6 +15703,8 @@ rs6000_expand_new_builtin (tree exp, rtx target,
>if (!(e == ENB_ALWAYS
>   || (e == ENB_P5 && TARGET_POPCNTB)
>   || (e == ENB_P6 && TARGET_CMPB)
> + || (e == ENB_P6_64  && TARGET_CMPB
> + && TARGET_POWERPC64)
>   || (e == ENB_ALTIVEC&& TARGET_ALTIVEC)
>   || (e == ENB_CELL   && TARGET_ALTIVEC
>   && rs6000_cpu == PROCESSOR_CELL)
> @@ -16419,6 +16427,8 @@ rs6000_init_builtins (void)
>   continue;
> if (e == ENB_P6 && !TARGET_CMPB)
>   continue;
> +   if (e == ENB_P6_64 && !(TARGET_CMPB && TARGET_POWERPC64))
> + continue;
> if (e == ENB_ALTIVEC && !TARGET_ALTIVEC)
>   continue;
> if (e == ENB_VSX && !TARGET_VSX)
> diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
> index 1655a2fd765..4ce83bd2290 100644
> --- a/gcc/config/rs6000/rs6000-gen-builtins.c
> +++ b/gcc/config/rs6000/rs6000-gen-builtins.c
> @@ -212,6 +212,7 @@ enum bif_stanza
>   BSTZ_ALWAYS,
>   BSTZ_P5,
>   BSTZ_P6,
> + BSTZ_P6_64,
>   BSTZ_ALTIVEC,
>   BSTZ_CELL,
>   BSTZ_VSX,
> @@ -245,6 +246,7 @@ static stanza_entry stanza_map[NUMBIFSTANZAS] =
>  { "always",  BSTZ_ALWAYS },
>  { "power5",  BSTZ_P5 },
>  { "power6",  BSTZ_P6 },
> +{ "power6-64",   BSTZ_P6_64  },
>  { "altivec", BSTZ_ALTIVEC},
>  { "cell",BSTZ_CELL   },
>  { "vsx", BSTZ_VSX},
> @@ -269,6 +271,7 @@ static const char *enable_string[NUMBIFSTANZAS] =
>  "ENB_ALWAYS",
>  "ENB_P5",
>  "ENB_P6",
> +"ENB_P6_64",
>  

[PATCH] rs6000: Better error messages for power8/9-vector builtins

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi!  During a previous patch review, Segher asked that I provide better
messages when builtins are unavailable because they require both a minimum
CPU and the enablement of VSX instructions.  This patch does just that.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks!
Bill


2021-11-11  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin): Change
error messages for ENB_P8V and ENB_P9V.
---
 gcc/config/rs6000/rs6000-call.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 85fec80c6d7..035266eb001 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11943,7 +11943,8 @@ rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
   error ("%qs requires the %qs option", name, "-mcpu=power8");
   break;
 case ENB_P8V:
-  error ("%qs requires the %qs option", name, "-mpower8-vector");
+  error ("%qs requires the %qs and %qs options", name, "-mcpu=power8",
+"-mvsx");
   break;
 case ENB_P9:
   error ("%qs requires the %qs option", name, "-mcpu=power9");
@@ -11953,7 +11954,8 @@ rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
 name, "-mcpu=power9", "-m64", "-mpowerpc64");
   break;
 case ENB_P9V:
-  error ("%qs requires the %qs option", name, "-mpower9-vector");
+  error ("%qs requires the %qs and %qs options", name, "-mcpu=power9",
+"-mvsx");
   break;
 case ENB_IEEE128_HW:
   error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name);
-- 
2.27.0




Re: [musl] Re: [PATCH v2] configure: define TARGET_LIBC_GNUSTACK on musl

2021-11-16 Thread Ilya Lipnitskiy via Gcc-patches
On Tue, Nov 16, 2021 at 8:41 AM Rich Felker  wrote:
>
> On Tue, Nov 16, 2021 at 03:40:00PM +0100, Dragan Mladjenovic wrote:
> > Hi,
> >
> > Looks fine to me. If possible, maybe it should even be back-ported
> > to stable branches.
The change cherry-picks fine onto 10.x and 11.x branches. Should I
send out separate patches or can the committer of this patch apply it
to 10.x and 11.x?
> >
> > Not sure if MIPS assembly sources (if any) in musl would need an
> > explicit .note.GNU-stack to complement this?
>
> What are the actual consequences of making this change, and what is
> the goal? I'm concerned that it might produce object files which don't
> include annotation that they don't need executable stack, in which
> case the final executable file will be marked as executable-stack and
> the kernel will load it as such. That would be very bad.
It is actually the other way around - for MIPS hard-float targets on
non-glibc (or glibc < 2.31) without this change the .note.GNU-stack
annotation is not emitted by GCC.

Ilya
>
> Rich
>
>
> > On 16-Nov-21 06:13, Ilya Lipnitskiy wrote:
> > >musl only uses PT_GNU_STACK to set default thread stack size and has no
> > >executable stack support[0], so there is no reason not to emit the
> > >.note.GNU-stack section on musl builds.
> > >
> > >[0]: 
> > >https://lore.kernel.org/all/20190423192534.gn23...@brightrain.aerifal.cx/T/#u
> > >
> > >gcc/ChangeLog:
> > >
> > > * configure: Regenerate.
> > > * configure.ac: define TARGET_LIBC_GNUSTACK on musl
> > >
> > >Signed-off-by: Ilya Lipnitskiy 
> > >---
> > >  gcc/configure| 3 +++
> > >  gcc/configure.ac | 3 +++
> > >  2 files changed, 6 insertions(+)
> > >
> > >diff --git a/gcc/configure b/gcc/configure
> > >index 74b9d9be4c85..7091a838aefa 100755
> > >--- a/gcc/configure
> > >+++ b/gcc/configure
> > >@@ -31275,6 +31275,9 @@ fi
> > >  # Check if the target LIBC handles PT_GNU_STACK.
> > >  gcc_cv_libc_gnustack=unknown
> > >  case "$target" in
> > >+  mips*-*-linux-musl*)
> > >+gcc_cv_libc_gnustack=yes
> > >+;;
> > >mips*-*-linux*)
> > >  if test $glibc_version_major -gt 2 \
> > >diff --git a/gcc/configure.ac b/gcc/configure.ac
> > >index c9ee1fb8919e..8a2d34179a75 100644
> > >--- a/gcc/configure.ac
> > >+++ b/gcc/configure.ac
> > >@@ -6961,6 +6961,9 @@ fi
> > >  # Check if the target LIBC handles PT_GNU_STACK.
> > >  gcc_cv_libc_gnustack=unknown
> > >  case "$target" in
> > >+  mips*-*-linux-musl*)
> > >+gcc_cv_libc_gnustack=yes
> > >+;;
> > >mips*-*-linux*)
> > >  GCC_GLIBC_VERSION_GTE_IFELSE([2], [31], [gcc_cv_libc_gnustack=yes], )
> > >  ;;


Re: [PATCH 4/5] if-conv: Apply VN to hoisted conversions

2021-11-16 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> On Mon, Nov 15, 2021 at 3:00 PM Richard Sandiford
>  wrote:
>>
>> Richard Biener via Gcc-patches  writes:
>> > On Fri, Nov 12, 2021 at 7:05 PM Richard Sandiford via Gcc-patches
>> >  wrote:
>> >>
>> >> This patch is a prerequisite for a later one.  At the moment,
>> >> if-conversion converts predicated POINTER_PLUS_EXPRs into
>> >> non-wrapping forms, which for:
>> >>
>> >> … = base + offset
>> >>
>> >> becomes:
>> >>
>> >> tmp = (unsigned long) base
>> >> … = tmp + offset
>> >>
>> >> It then hoists these conversions out of the loop where possible.
>> >>
>> >> However, because “base” is a valid gimple operand, there can be
>> >> multiple POINTER_PLUS_EXPRs with the same base, which can in turn
>> >> lead to multiple instances of the same conversion.  The later VN pass
>> >> is (and I think needs to be) restricted to the new if-converted code,
>> >> whereas here we're deliberately inserting the conversions before the
>> >> .LOOP_VECTORIZED condition:
>> >>
>> >> /* If we versioned loop then make sure to insert invariant
>> >>stmts before the .LOOP_VECTORIZED check since the vectorizer
>> >>will re-use that for things like runtime alias versioning
>> >>whose condition can end up using those invariants.  */
>> >>
>> >> We can therefore enter the vectoriser with redundant conversions.
>> >>
>> >> The easiest fix seemed to be to defer the hoisting until after VN.
>> >> This catches other hoisting opportunities too.
>> >>
>> >> Hoisting the code from the (artificial) loop in pr99102.c means
>> >> that it's no longer worth vectorising.  The patch forces vectorisation
>> >> instead of relying on the cost model.
>> >>
>> >> The patch also reverts pr87007-4.c and pr87007-5.c back to their
>> >> original forms, undoing changes in 783dc66f9ccb0019c3dad.
>> >> The code at the time the tests were added was:
>> >>
>> >> testl   %edi, %edi
>> >> je  .L10
>> >> vxorps  %xmm1, %xmm1, %xmm1
>> >> vsqrtsd d3(%rip), %xmm1, %xmm0
>> >> vsqrtsd d2(%rip), %xmm1, %xmm1
>> >> ...
>> >> .L10:
>> >> ret
>> >>
>> >> with the operations being hoisted, and the vxorps was specifically
>> >> wanted (compared to the previous code).  This patch restores the code
>> >> to that form, with the hoisted operations and the vxorps.
>> >>
>> >> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>> >>
>> >> Richard
>> >>
>> >>
>> >> gcc/
>> >> * tree-if-conv.c: Include tree-eh.h.
>> >> (predicate_statements): Remove pe argument.  Don't hoist
>> >> statements here.
>> >> (combine_blocks): Remove pe argument.
>> >> (ifcvt_can_hoist, ifcvt_can_hoist_further): New functions.
>> >> (ifcvt_hoist_invariants): Likewise.
>> >> (tree_if_conversion): Update call to combine_blocks.  Call
>> >> ifcvt_hoist_invariants after VN.
>> >>
>> >> gcc/testsuite/
>> >> * gcc.dg/vect/pr99102.c: Add -fno-vect-cost-model.
>> >>
>> >> Revert:
>> >>
>> >> 2020-09-09  Richard Biener  
>> >>
>> >> * gcc.target/i386/pr87007-4.c: Adjust.
>> >> * gcc.target/i386/pr87007-5.c: Likewise.
>> >> ---
>> >>  gcc/testsuite/gcc.dg/vect/pr99102.c   |   2 +-
>> >>  gcc/testsuite/gcc.target/i386/pr87007-4.c |   2 +-
>> >>  gcc/testsuite/gcc.target/i386/pr87007-5.c |   2 +-
>> >>  gcc/tree-if-conv.c| 122 --
>> >>  4 files changed, 114 insertions(+), 14 deletions(-)
>> >>
>> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c 
>> >> b/gcc/testsuite/gcc.dg/vect/pr99102.c
>> >> index 6c1a13f0783..0d030d15c86 100644
>> >> --- a/gcc/testsuite/gcc.dg/vect/pr99102.c
>> >> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
>> >> @@ -1,4 +1,4 @@
>> >> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
>> >> +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model 
>> >> -fdump-tree-vect-details" } */
>> >>  /* { dg-additional-options "-msve-vector-bits=256" { target 
>> >> aarch64_sve256_hw } } */
>> >>  long a[44];
>> >>  short d, e = -7;
>> >> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-4.c 
>> >> b/gcc/testsuite/gcc.target/i386/pr87007-4.c
>> >> index 9c4b8005af3..e91bdcbac44 100644
>> >> --- a/gcc/testsuite/gcc.target/i386/pr87007-4.c
>> >> +++ b/gcc/testsuite/gcc.target/i386/pr87007-4.c
>> >> @@ -15,4 +15,4 @@ foo (int n, int k)
>> >>d1 = ceil (d3);
>> >>  }
>> >>
>> >> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } 
>> >> */
>> >> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } 
>> >> */
>> >> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-5.c 
>> >> b/gcc/testsuite/gcc.target/i386/pr87007-5.c
>> >> index e4d956a5d7f..20d13cf650b 100644
>> >> --- a/gcc/testsuite/gcc.target/i386/pr87007-5.c
>> >> +++ b/gcc/testsuite/gcc.target/i386/pr87007-5.c
>> >> @@ -15,4 +15,4 @@ foo (int n, int k)
>> >>  

Re: [PATCH] rs6000: MMA test case emits wrong code when building a vector pair

2021-11-16 Thread Peter Bergner via Gcc-patches
On 11/13/21 7:25 AM, Segher Boessenkool wrote:
> On Wed, Oct 27, 2021 at 08:37:57PM -0500, Peter Bergner wrote:
>> PR102976 shows a test case where we generate wrong code when building
>> a vector pair from 2 vector registers.  The bug here is that with unlucky
>> register assignments, we can clobber one of the input operands before
>> we write both registers of the output operand.  The solution is to use
>> early-clobbers in the assemble pair and accumulator patterns.
> 
> Because of what insns there are after the split.  Aha.
> 
> Please add a comment explaining this, near the earlyclobber itself.

Done for both patterns.



> You can just write this as {\mxxlor \d+,44,44\M} etc., that will be
> simplest I think.

Done and tested that it still works.


> Okay for trunk with comments added near the earlyclobber, and the RE
> improved.  Also fine for 11 after some burn-in.  Thanks!

Ok, I pushed with both changes.  I'll push a change to GCC11 in a few days.
Thanks!

Peter




[PATCH] x86: Add -mharden-sls=[none|all|return|indirect-branch]

2021-11-16 Thread H.J. Lu via Gcc-patches
Add -mharden-sls= to mitigate against straight line speculation (SLS)
for function return and indirect branch by adding an INT3 instruction
after function return and indirect branch.
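
For illustration only (not part of the patch): the option value is a bitmask,
so each output routine tests just the bit it cares about before appending
INT3.  A minimal sketch in C, assuming a stand-in variable sketch_harden_sls
for ix86_harden_sls and two hypothetical helpers that write to a FILE *
rather than to asm_out_file:

```c
#include <assert.h>
#include <stdio.h>

/* Same bit layout as the enum this patch adds to i386-opts.h.  */
enum harden_sls {
  harden_sls_none = 0,
  harden_sls_return = 1 << 0,
  harden_sls_indirect_branch = 1 << 1,
  harden_sls_all = harden_sls_return | harden_sls_indirect_branch
};

/* Hypothetical stand-in for the option variable ix86_harden_sls.  */
static int sketch_harden_sls = harden_sls_none;

/* Hypothetical helper: write a function return to OUT, followed by INT3
   when return hardening is enabled.  Returns the number of instructions
   written.  */
static int
sketch_output_return (FILE *out)
{
  assert (out != NULL);
  fputs ("\tret\n", out);
  if ((sketch_harden_sls & harden_sls_return) == 0)
    return 1;
  fputs ("\tint3\n", out);
  return 2;
}

/* Hypothetical helper: write an indirect jump through register REG to OUT,
   followed by INT3 when indirect-branch hardening is enabled.  Returns the
   number of instructions written.  */
static int
sketch_output_indirect_jmp (FILE *out, const char *reg)
{
  assert (out != NULL && reg != NULL);
  fprintf (out, "\tjmp\t*%%%s\n", reg);
  if ((sketch_harden_sls & harden_sls_indirect_branch) == 0)
    return 1;
  fputs ("\tint3\n", out);
  return 2;
}
```

With sketch_harden_sls set to harden_sls_return, only the return helper
appends INT3; with harden_sls_all, both do.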

gcc/

PR target/102952
* config/i386/i386-opts.h (harden_sls): New enum.
* config/i386/i386.c (output_indirect_thunk): Mitigate against
SLS for function return.
(ix86_output_function_return): Likewise.
(ix86_output_jmp_thunk_or_indirect): Mitigate against indirect
branch.
(ix86_output_indirect_jmp): Likewise.
(ix86_output_call_insn): Likewise.
* config/i386/i386.opt: Add -mharden-sls=.
* doc/invoke.texi: Document -mharden-sls=.

gcc/testsuite/

PR target/102952
* gcc.target/i386/harden-sls-1.c: New test.
* gcc.target/i386/harden-sls-2.c: Likewise.
* gcc.target/i386/harden-sls-3.c: Likewise.
* gcc.target/i386/harden-sls-4.c: Likewise.
---
 gcc/config/i386/i386-opts.h  |  7 +
 gcc/config/i386/i386.c   | 30 
 gcc/config/i386/i386.opt | 20 +
 gcc/doc/invoke.texi  | 10 ++-
 gcc/testsuite/gcc.target/i386/harden-sls-1.c | 14 +
 gcc/testsuite/gcc.target/i386/harden-sls-2.c | 14 +
 gcc/testsuite/gcc.target/i386/harden-sls-3.c | 14 +
 gcc/testsuite/gcc.target/i386/harden-sls-4.c | 14 +
 8 files changed, 116 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-4.c

diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
index 04e4ad608fb..171d3106d0a 100644
--- a/gcc/config/i386/i386-opts.h
+++ b/gcc/config/i386/i386-opts.h
@@ -121,4 +121,11 @@ enum instrument_return {
   instrument_return_nop5
 };
 
+enum harden_sls {
+  harden_sls_none = 0,
+  harden_sls_return = 1 << 0,
+  harden_sls_indirect_branch = 1 << 1,
+  harden_sls_all = harden_sls_return | harden_sls_indirect_branch
+};
+
 #endif
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index cc9f9322fad..0a902d66321 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5914,6 +5914,8 @@ output_indirect_thunk (unsigned int regno)
 }
 
   fputs ("\tret\n", asm_out_file);
+  if ((ix86_harden_sls & harden_sls_return))
+fputs ("\tint3\n", asm_out_file);
 }
 
 /* Output a funtion with a call and return thunk for indirect branch.
@@ -15987,6 +15989,8 @@ ix86_output_jmp_thunk_or_indirect (const char 
*thunk_name, const int regno)
   fprintf (asm_out_file, "\tjmp\t");
   assemble_name (asm_out_file, thunk_name);
   putc ('\n', asm_out_file);
+  if ((ix86_harden_sls & harden_sls_indirect_branch))
+   fputs ("\tint3\n", asm_out_file);
 }
   else
 output_indirect_thunk (regno);
@@ -16212,10 +16216,14 @@ ix86_output_indirect_jmp (rtx call_op)
gcc_unreachable ();
 
   ix86_output_indirect_branch (call_op, "%0", true);
-  return "";
+  if ((ix86_harden_sls & harden_sls_indirect_branch))
+   return "int3";
+  else
+   return "";
 }
   else
-return "%!jmp\t%A0";
+return ((ix86_harden_sls & harden_sls_indirect_branch)
+   ? "%!jmp\t%A0\n\tint3" : "%!jmp\t%A0");
 }
 
 /* Output return instrumentation for current function if needed.  */
@@ -16283,10 +16291,15 @@ ix86_output_function_return (bool long_p)
   return "";
 }
 
-  if (!long_p)
-return "%!ret";
+  if ((ix86_harden_sls & harden_sls_return))
+return "%!ret\n\tint3";
+  else
+{
+  if (!long_p)
+   return "%!ret";
 
-  return "rep%; ret";
+  return "rep%; ret";
+}
 }
 
 /* Output indirect function return.  RET_OP is the function return
@@ -16381,7 +16394,12 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op)
   if (output_indirect_p && !direct_p)
ix86_output_indirect_branch (call_op, xasm, true);
   else
-   output_asm_insn (xasm, &call_op);
+   {
+ output_asm_insn (xasm, &call_op);
+ if (!direct_p
+ && (ix86_harden_sls & harden_sls_indirect_branch))
+   return "int3";
+   }
   return "";
 }
 
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index b38ac13fc91..c5452c49597 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1121,6 +1121,26 @@ mrecord-return
 Target Var(ix86_flag_record_return) Init(0)
 Generate a __return_loc section pointing to all return instrumentation code.
 
+mharden-sls=
+Target RejectNegative Joined Enum(harden_sls) Var(ix86_harden_sls) 
Init(harden_sls_none)
+Generate code to mitigate against straight line speculation.
+
+Enum
+Name(harden_sls) Type(enum harden_sls)
+Known choices for mitigation against straight line speculation with 
-mharden-sls

[PATCH] x86: Add -mindirect-branch-cs-prefix

2021-11-16 Thread H.J. Lu via Gcc-patches
Add -mindirect-branch-cs-prefix to add CS prefix to call and jmp to thunk
via r8-r15 registers when converting indirect call and jump to increase
the instruction length to 6, allowing the non-thunk form to be inlined.
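
For illustration only (not part of the patch): a near jmp or call with a
32-bit displacement is five bytes, so the one-byte CS segment prefix brings
it to six.  A minimal sketch in C of the check the patch adds, assuming a
made-up register numbering (SKETCH_FIRST_REX_INT_REG is hypothetical and is
not GCC's hard register number) and a hypothetical helper that writes to a
FILE * rather than to asm_out_file:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical numbering for this sketch only: 0-7 are the legacy
   registers, 8-15 stand for r8-r15.  GCC's real numbers differ.  */
enum { SKETCH_FIRST_REX_INT_REG = 8 };

/* Hypothetical helper: write a jump to THUNK_NAME to OUT, preceded by a
   "cs" prefix line when REGNO is one of r8-r15 and CS_PREFIX_P is nonzero.
   Returns the number of lines written.  */
static int
sketch_output_thunk_jmp (FILE *out, const char *thunk_name, int regno,
			 int cs_prefix_p)
{
  int lines = 0;

  assert (out != NULL && thunk_name != NULL);
  if (regno >= SKETCH_FIRST_REX_INT_REG && cs_prefix_p)
    {
      fputs ("\tcs\n", out);
      lines++;
    }
  fprintf (out, "\tjmp\t%s\n", thunk_name);
  return lines + 1;
}
```

Only the combination of an r8-r15 thunk and the option being enabled gets
the extra prefix line; thunks for the legacy registers are left unchanged.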

gcc/

PR target/102952
* config/i386/i386.c (ix86_output_jmp_thunk_or_indirect): Emit
CS prefix for -mindirect-branch-cs-prefix.
(ix86_output_indirect_branch_via_reg): Likewise.
* config/i386/i386.opt: Add -mindirect-branch-cs-prefix.
* doc/invoke.texi: Document -mindirect-branch-cs-prefix.

gcc/testsuite/

PR target/102952
* gcc.target/i386/indirect-thunk-cs-prefix-1.c: New test.
* gcc.target/i386/indirect-thunk-cs-prefix-2.c: Likewise.
---
 gcc/config/i386/i386.c|  6 ++
 gcc/config/i386/i386.opt  |  4 
 gcc/doc/invoke.texi   |  8 +++-
 .../gcc.target/i386/indirect-thunk-cs-prefix-1.c  | 14 ++
 .../gcc.target/i386/indirect-thunk-cs-prefix-2.c  | 15 +++
 5 files changed, 46 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-2.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7e9b7bc347f..0a902d66321 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -15983,6 +15983,9 @@ ix86_output_jmp_thunk_or_indirect (const char 
*thunk_name, const int regno)
 {
   if (thunk_name != NULL)
 {
+  if (regno >= FIRST_REX_INT_REG
+ && ix86_indirect_branch_cs_prefix)
+   fprintf (asm_out_file, "\tcs\n");
   fprintf (asm_out_file, "\tjmp\t");
   assemble_name (asm_out_file, thunk_name);
   putc ('\n', asm_out_file);
@@ -16036,6 +16039,9 @@ ix86_output_indirect_branch_via_reg (rtx call_op, bool 
sibcall_p)
 {
   if (thunk_name != NULL)
{
+ if (regno >= FIRST_REX_INT_REG
+ && ix86_indirect_branch_cs_prefix)
+   fprintf (asm_out_file, "\tcs\n");
  fprintf (asm_out_file, "\tcall\t");
  assemble_name (asm_out_file, thunk_name);
  putc ('\n', asm_out_file);
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 8d499a5a4df..c5452c49597 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1076,6 +1076,10 @@ Enum(indirect_branch) String(thunk-inline) 
Value(indirect_branch_thunk_inline)
 EnumValue
 Enum(indirect_branch) String(thunk-extern) Value(indirect_branch_thunk_extern)
 
+mindirect-branch-cs-prefix
+Target Var(ix86_indirect_branch_cs_prefix) Init(0)
+Add CS prefix to call and jmp to thunk when converting indirect call and jump.
+
 mindirect-branch-register
 Target Var(ix86_indirect_branch_register) Init(0)
 Force indirect call and jump via register.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f3b4b467765..c992a7152f5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1425,7 +1425,8 @@ See RS/6000 and PowerPC Options.
 -mstack-protector-guard-symbol=@var{symbol} @gol
 -mgeneral-regs-only  -mcall-ms2sysv-xlogues -mrelax-cmpxchg-loop @gol
 -mindirect-branch=@var{choice}  -mfunction-return=@var{choice} @gol
--mindirect-branch-register -mharden-sls=@var{choice} -mneeded}
+-mindirect-branch-register -mharden-sls=@var{choice} @gol
+-mindirect-branch-cs-prefix -mneeded}
 
 @emph{x86 Windows Options}
 @gccoptlist{-mconsole  -mcygwin  -mno-cygwin  -mdll @gol
@@ -32390,6 +32391,11 @@ hardening.  @samp{return} enables SLS hardening for 
function return.
 @samp{indirect-branch} enables SLS hardening for indirect branch.
 @samp{all} enables all SLS hardening.
 
+@item -mindirect-branch-cs-prefix
+@opindex mindirect-branch-cs-prefix
+Add CS prefix to call and jmp to thunk via r8-r15 registers when
+converting indirect call and jump.
+
 @end table
 
 These @samp{-m} switches are supported in addition to the above
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c 
b/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c
new file mode 100644
index 000..db2f3416823
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -ffixed-rax -ffixed-rbx -ffixed-rcx -ffixed-rdx 
-ffixed-rdi -ffixed-rsi -mindirect-branch-cs-prefix 
-mindirect-branch=thunk-extern" } */
+/* { dg-additional-options "-fno-pic" { target { ! *-*-darwin* } } } */
+
+extern void (*fptr) (void);
+
+void
+foo (void)
+{
+  fptr ();
+}
+
+/* { dg-final { scan-assembler-times "jmp\[ 
\t\]+_?__x86_indirect_thunk_r\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\tcs" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-2.c 
b/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-2.c
new file mode 100644
index 000..adfc39a49d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indir

[PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi!  I previously posted [1] to correct some problems with the new builtins
support targeting 32-bit code gen.  Based on the discussion, I've made some
adjustments and would like to submit this for consideration.

We eventually agreed that the strange behavior for -m32 -mpowerpc64 for certain
HTM builtins should be removed.  All of the registers TEXASR, TEXASRU, TFHAR,
and TFIAR are now accessed using the unsigned long data type in all 
configurations.

Segher didn't like the change in the error message for the cmpb-3.c test case,
but I think this should be fine.  The test case just tests for the error 
message,
but there is also a "note" message that provides additional information.  The
diagnostics that the user sees will look like this:

cmpb-3.c:11:3: error: '__builtin_p6_cmpb' requires the '-mcpu=power6' option 
and either the '-m64' or '-mpowerpc64' option
cmpb-3.c:11:3: note: builtin '__builtin_cmpb' requires builtin 
'__builtin_p6_cmpb'

So it's clear to the user that their use of __builtin_cmpb at line 11 triggered
the error.

Bootstrapped and tested on powerpc64le-linux-gnu, and on powerpc64-linux-gnu
using -m32/-m64.  Is this okay for trunk?

Thanks!
Bill

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583905.html


2021-11-16  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
(BPERMD): Flag as 32bit (needing special handling for 32-bit).
(UNPACK_TD): Return unsigned long long instead of unsigned long.
(GET_TEXASR): Return unsigned long instead of unsigned long long.
(GET_TEXASRU): Likewise.
(GET_TFHAR): Likewise.
(GET_TFIAR): Likewise.
(SET_TEXASR): Pass unsigned long instead of unsigned long long.
(SET_TEXASRU): Likewise.
(SET_TFHAR): Likewise.
(SET_TFIAR): Likewise.
(TABORTDC): Likewise.
(TABORTDCI): Likewise.
* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.

gcc/testsuite/
* gcc.target/powerpc/cmpb-3.c: Adjust error message.
---
 gcc/config/rs6000/rs6000-builtin-new.def  | 30 +++
 gcc/config/rs6000/rs6000-call.c   |  9 ---
 gcc/testsuite/gcc.target/powerpc/cmpb-3.c |  2 +-
 3 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def 
b/gcc/config/rs6000/rs6000-builtin-new.def
index 58dfce1ca37..30556e5c7f2 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -273,7 +273,7 @@
 ; Power6 builtins requiring 64-bit GPRs (even with 32-bit addressing).
 [power6-64]
   const signed long __builtin_p6_cmpb (signed long, signed long);
-CMPB cmpbdi3 {}
+CMPB cmpbdi3 {no32bit}
 
 
 ; AltiVec builtins.
@@ -2018,7 +2018,7 @@
 ADDG6S addg6s {}
 
   const signed long __builtin_bpermd (signed long, signed long);
-BPERMD bpermd_di {}
+BPERMD bpermd_di {32bit}
 
   const unsigned int __builtin_cbcdtd (unsigned int);
 CBCDTD cbcdtd {}
@@ -2971,7 +2971,7 @@
   void __builtin_set_fpscr_drn (const int[0,7]);
 SET_FPSCR_DRN rs6000_set_fpscr_drn {}
 
-  const unsigned long __builtin_unpack_dec128 (_Decimal128, const int<1>);
+  const unsigned long long __builtin_unpack_dec128 (_Decimal128, const int<1>);
 UNPACK_TD unpacktd {}
 
 
@@ -3014,39 +3014,39 @@
 
 
 [htm]
-  unsigned long long __builtin_get_texasr ();
+  unsigned long __builtin_get_texasr ();
 GET_TEXASR nothing {htm,htmspr}
 
-  unsigned long long __builtin_get_texasru ();
+  unsigned long __builtin_get_texasru ();
 GET_TEXASRU nothing {htm,htmspr}
 
-  unsigned long long __builtin_get_tfhar ();
+  unsigned long __builtin_get_tfhar ();
 GET_TFHAR nothing {htm,htmspr}
 
-  unsigned long long __builtin_get_tfiar ();
+  unsigned long __builtin_get_tfiar ();
 GET_TFIAR nothing {htm,htmspr}
 
-  void __builtin_set_texasr (unsigned long long);
+  void __builtin_set_texasr (unsigned long);
 SET_TEXASR nothing {htm,htmspr}
 
-  void __builtin_set_texasru (unsigned long long);
+  void __builtin_set_texasru (unsigned long);
 SET_TEXASRU nothing {htm,htmspr}
 
-  void __builtin_set_tfhar (unsigned long long);
+  void __builtin_set_tfhar (unsigned long);
 SET_TFHAR nothing {htm,htmspr}
 
-  void __builtin_set_tfiar (unsigned long long);
+  void __builtin_set_tfiar (unsigned long);
 SET_TFIAR nothing {htm,htmspr}
 
   unsigned int __builtin_tabort (unsigned int);
 TABORT tabort {htm,htmcr}
 
-  unsigned int __builtin_tabortdc (unsigned long long, unsigned long long, \
-   unsigned long long);
+  unsigned int __builtin_tabortdc (unsigned long, unsigned long, \
+   unsigned long);
 TABORTDC tabortdc {htm,htmcr}
 
-  unsigned int __builtin_tabortdci (unsigned long long, unsigned long long, \
-unsigned long long);
+  unsi

[r12-5292 Regression] FAIL: gcc.dg/tree-ssa/modref-dse-5.c scan-tree-dump dse2 "Deleted dead store: wrap" on Linux/x86_64

2021-11-16 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

e69b7c5779863469479698f863ab25e0d9b4586e is the first bad commit
commit e69b7c5779863469479698f863ab25e0d9b4586e
Author: Jan Hubicka 
Date:   Tue Nov 16 09:15:39 2021 +0100

Fix uninitialized access in merge_call_side_effects

caused

FAIL: gcc.dg/tree-ssa/modref-dse-5.c scan-tree-dump dse2 "Deleted dead store: 
wrap"

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-5292/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-5.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-5.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-5.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-5.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


[r12-5301 Regression] FAIL: gcc.dg/tree-ssa/if-to-switch-3.c scan-tree-dump iftoswitch "Condition chain with [^\n\r]* BBs transformed into a switch statement." on Linux/x86_64

2021-11-16 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

045206450386bcd774db3bde0c696828402361c6 is the first bad commit
commit 045206450386bcd774db3bde0c696828402361c6
Author: Richard Biener 
Date:   Fri Nov 12 10:21:22 2021 +0100

tree-optimization/102880 - improve CD-DCE

caused

FAIL: gcc.dg/tree-ssa/if-to-switch-3.c scan-tree-dump iftoswitch "Condition 
chain with [^\n\r]* BBs transformed into a switch statement."

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-5301/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/if-to-switch-3.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/if-to-switch-3.c 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


[pushed] configure, Darwin: Set appropriate defaults for host-shared.

2021-11-16 Thread Iain Sandoe via Gcc-patches
Darwin x86_64 and aarch64 platforms are PIC (shared) by default,
and user-space code must be built in this mode.  The patch
ensures that this is set correctly and applies a default when
--enable-host-shared is not set.
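
For illustration only (not part of the patch): the decision table can be
sketched in C with POSIX fnmatch, which uses the same glob syntax as the
shell case patterns in the diff.  The helper names are hypothetical, and the
sketch reduces the configure value to a yes/no result; it assumes a POSIX
host.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Hypothetical helper: nonzero if TARGET is one of the Darwin triplets
   the patch treats as PIC-only.  */
static int
sketch_darwin_pic_target_p (const char *target)
{
  assert (target != NULL);
  return (fnmatch ("x86_64-*-darwin*", target, 0) == 0
	  || fnmatch ("aarch64-*-darwin*", target, 0) == 0);
}

/* Hypothetical helper: nonzero if host_shared ends up as "yes".
   ENABLEVAL is the value given to --enable-host-shared, or NULL when the
   option was not given at all.  */
static int
sketch_host_shared_p (const char *target, const char *enableval)
{
  /* On the PIC-only Darwin targets the answer is always yes, whether the
     option was omitted or explicitly set to something else.  */
  if (sketch_darwin_pic_target_p (target))
    return 1;
  /* Elsewhere the default stays "no" and the user's choice is honoured.  */
  return enableval != NULL && strcmp (enableval, "yes") == 0;
}
```

So an x86_64 or aarch64 Darwin target yields yes even for
--disable-host-shared (the real configure also prints a warning in that
case), while other targets keep the previous behaviour.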

tested on *-darwin*, x86_64, powerpc64le-linux-gnu,
pushed to master, thanks
Iain

Signed-off-by: Iain Sandoe 

ChangeLog:

* configure: Regenerate.
* configure.ac: Ensure that PIC (shared) defaults are set
correctly for Darwin.
---
 configure| 16 +++-
 configure.ac | 15 ++-
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 58979d6e3b1..3062495da31 100755
--- a/configure
+++ b/configure
@@ -8447,8 +8447,20 @@ fi
 # Check whether --enable-host-shared was given.
 if test "${enable_host_shared+set}" = set; then :
   enableval=$enable_host_shared; host_shared=$enableval
+ case $target in
+   x86_64-*-darwin* | aarch64-*-darwin*)
+ if test x$host_shared != xyes ; then
+   # PIC is the default, and actually cannot be switched off.
+   echo configure.ac: warning: PIC code is required for the configured 
target, host-shared setting ignored. 1>&2
+   host_shared=yes
+ fi ;;
+  *) ;;
+ esac
 else
-  host_shared=no
+  case $target in
+  x86_64-*-darwin* | aarch64-*-darwin*) host_shared=yes ;;
+  *) host_shared=no ;;
+ esac
 fi
 
 
@@ -10083,6 +10095,8 @@ done
 
 
 
+
+
 # Generate default definitions for YACC, M4, LEX and other programs that run
 # on the build machine.  These are used if the Makefile can't locate these
 # programs in objdir.
diff --git a/configure.ac b/configure.ac
index 550e6993b59..bed60bcaf72 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1859,7 +1859,20 @@ AC_SUBST(extra_linker_plugin_flags)
 AC_ARG_ENABLE(host-shared,
 [AS_HELP_STRING([--enable-host-shared],
[build host code as shared libraries])],
-[host_shared=$enableval], [host_shared=no])
+[host_shared=$enableval
+ case $target in
+   x86_64-*-darwin* | aarch64-*-darwin*)
+ if test x$host_shared != xyes ; then
+   # PIC is the default, and actually cannot be switched off.
+   echo configure.ac: warning: PIC code is required for the configured 
target, host-shared setting ignored. 1>&2
+   host_shared=yes
+ fi ;;
+  *) ;;
+ esac],
+[case $target in
+  x86_64-*-darwin* | aarch64-*-darwin*) host_shared=yes ;;
+  *) host_shared=no ;;
+ esac])
 AC_SUBST(host_shared)
 
 # By default, C and C++ are the only stage 1 languages.
-- 
2.24.3 (Apple Git-128)



[PATCH v2] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-16 Thread Marek Polacek via Gcc-patches
On Mon, Nov 15, 2021 at 06:15:40PM -0500, David Malcolm wrote:
> > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
> > > but changing the name is a trivial operation. 
> > 
> > Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
> > changes.
> 
> Thanks for implementing this.
> 
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > From a link below:
> > "An issue was discovered in the Bidirectional Algorithm in the Unicode
> > Specification through 14.0. It permits the visual reordering of
> > characters via control sequences, which can be used to craft source code
> > that renders different logic than the logical ordering of tokens
> > ingested by compilers and interpreters. Adversaries can leverage this to
> > encode source code for compilers accepting Unicode such that targeted
> > vulnerabilities are introduced invisibly to human reviewers."
> > 
> > More info:
> > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > https://trojansource.codes/
> > 
> > This is not a compiler bug.  However, to mitigate the problem, this patch
> > implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> > misleading Unicode bidirectional characters the preprocessor may encounter.
> > 
> > The default is =unpaired, which warns about improperly terminated
> > bidirectional characters; e.g. a LRE without its appertaining PDF.  The
> 
> I like the default.

Great.

> Wording nit: maybe use "corresponding" rather than "appertaining"; I
> believe the latter has a sense that one is part of the other, when they
> are more like peers.

OK, fixed.

> > level =any warns about any use of bidirectional characters.
> 
> Terminology nit:
> The patch is referring to "bidirectional characters", but I think the
> term "bidirectional control characters" would be better.

Adjusted.
 
> For example, a passage of text containing both numbers and characters
> in a right-to-left script could be considered "bidirectional", since
> the numbers are written from left-to-right.
> 
> Specifically, the patch looks for these specific characters:
>   * U+202A LEFT-TO-RIGHT EMBEDDING
>   * U+202B RIGHT-TO-LEFT EMBEDDING
>   * U+202C POP DIRECTIONAL FORMATTING
>   * U+202D LEFT-TO-RIGHT OVERRIDE
>   * U+202E RIGHT-TO-LEFT OVERRIDE
>   * U+2066 LEFT-TO-RIGHT ISOLATE
>   * U+2067 RIGHT-TO-LEFT ISOLATE
>   * U+2068 FIRST STRONG ISOLATE
>   * U+2069 POP DIRECTIONAL ISOLATE
> 
> However, the following characters could also be considered as
> "bidirectional control characters":
>   * U+200E ‎LEFT-TO-RIGHT MARK (UTF-8: E2 80 8E)
>   * U+200F ‎RIGHT-TO-LEFT MARK (UTF-8: E2 80 8F)
> but aren't checked for in the patch.  Should they be?  I can imagine
> ways in which they could be abused, so I think so.

I'd only intended to check the bidi chars described in the original
trojan source pdf, but I added checking for U+200E/U+200F too, since
it was easy enough.  AFAIK they aren't popped by a PDF/PDI like the
rest, so don't need to go on the vec, and so we only warn with =any.
Tests: Wbidi-chars-16.c + Wbidi-chars-17.c
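
For illustration only (this is not the libcpp code): all of the characters
listed above encode as three UTF-8 bytes starting with E2 80 or E2 81, so a
classifier can be sketched in a few lines of C.  The names below are
hypothetical, and the pairing check is deliberately simplified: it only
counts openers against closers and does not distinguish PDF from PDI.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_bidi_kind {
  SKETCH_BIDI_NONE,	/* Not a bidi control character.  */
  SKETCH_BIDI_OPEN,	/* LRE, RLE, LRO, RLO, LRI, RLI, FSI.  */
  SKETCH_BIDI_CLOSE,	/* PDF, PDI.  */
  SKETCH_BIDI_MARK	/* LRM, RLM: nothing to pop.  */
};

/* Classify the UTF-8 sequence starting at P, with LEN bytes available.  */
static enum sketch_bidi_kind
sketch_classify (const unsigned char *p, size_t len)
{
  if (len < 3 || p[0] != 0xE2)
    return SKETCH_BIDI_NONE;
  if (p[1] == 0x80)
    switch (p[2])
      {
      case 0x8E: case 0x8F:			/* U+200E, U+200F */
	return SKETCH_BIDI_MARK;
      case 0xAA: case 0xAB: case 0xAD: case 0xAE: /* U+202A/B/D/E */
	return SKETCH_BIDI_OPEN;
      case 0xAC:				/* U+202C */
	return SKETCH_BIDI_CLOSE;
      }
  else if (p[1] == 0x81)
    switch (p[2])
      {
      case 0xA6: case 0xA7: case 0xA8:		/* U+2066..U+2068 */
	return SKETCH_BIDI_OPEN;
      case 0xA9:				/* U+2069 */
	return SKETCH_BIDI_CLOSE;
      }
  return SKETCH_BIDI_NONE;
}

/* Roughly the "any" level: nonzero if S contains any bidi control
   character, including the two marks.  */
static int
sketch_has_any (const unsigned char *s, size_t len)
{
  size_t i;

  assert (s != NULL || len == 0);
  for (i = 0; i < len; i++)
    if (sketch_classify (s + i, len - i) != SKETCH_BIDI_NONE)
      return 1;
  return 0;
}

/* Roughly the "unpaired" level, simplified: nonzero if a closer appears
   with nothing open, or something is still open at the end of S.  */
static int
sketch_has_unpaired (const unsigned char *s, size_t len)
{
  int depth = 0;
  size_t i = 0;

  assert (s != NULL || len == 0);
  while (i < len)
    {
      enum sketch_bidi_kind k = sketch_classify (s + i, len - i);
      if (k == SKETCH_BIDI_OPEN)
	depth++;
      else if (k == SKETCH_BIDI_CLOSE)
	{
	  if (depth == 0)
	    return 1;
	  depth--;
	}
      i += (k == SKETCH_BIDI_NONE) ? 1 : 3;
    }
  return depth != 0;
}
```

In this sketch a lone U+200E is reported by sketch_has_any but not by
sketch_has_unpaired, which matches the behaviour described above for the
two marks.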
  
> [...snip...]
> 
> > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> > index 06457ac739e..b047df0f125 100644
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -374,6 +374,30 @@ Wbad-function-cast
> >  C ObjC Var(warn_bad_function_cast) Warning
> >  Warn about casting functions to incompatible types.
> >  
> > +Wbidi-chars
> > +C ObjC C++ ObjC++ Warning Alias(Wbidi-chars=,any,none)
> > +;
> > +
> > +Wbidi-chars=
> > +C ObjC C++ ObjC++ RejectNegative Joined Warning 
> > CPP(cpp_warn_bidirectional) CppReason(CPP_W_BIDIRECTIONAL) 
> > Var(warn_bidirectional) Init(bidirectional_unpaired) 
> > Enum(cpp_bidirectional_level)
> > +-Wbidi-chars=[none|unpaired|any] Warn about UTF-8 bidirectional characters.
> 
> "control characters"
 
Fixed.

> [...snip...]
> 
> >  
> > +@item -Wbidi-chars=@r{[}none@r{|}unpaired@r{|}any@r{]}
> > +@opindex Wbidi-chars=
> > +@opindex Wbidi-chars
> > +@opindex Wno-bidi-chars
> > +Warn about possibly misleading UTF-8 bidirectional characters in comments,
> 
> (and here again)
 
Fixed.

> > +string literals, character constants, and identifiers.  Such characters can
> > +change left-to-right writing direction into right-to-left (and vice versa),
> > +which can cause confusion between the logical order and visual order.  This
> > +may be dangerous; for instance, it may seem that a piece of code is not
> > +commented out, whereas it in fact is.
> > +
> > +There are three levels of warning supported by GCC@.  The default is
> > +@option{-Wbidi-chars=unpaired}, which warns about improperly terminated
> > +bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> > +@option{-Wbidi-chars=any} warns about any use of bidirectional characters.
> 
> (and again)

Fixed.

> [...snip...]
> 
> 
> > diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars

Re: [PATCH v4] Fix ICE when mixing VLAs and statement expressions [PR91038]

2021-11-16 Thread Jason Merrill via Gcc-patches

On 11/16/21 08:48, Uecker, Martin wrote:

Am Montag, den 08.11.2021, 19:13 +0100 schrieb Martin Uecker:

Am Montag, den 08.11.2021, 12:13 -0500 schrieb Jason Merrill:

On 11/7/21 01:40, Uecker, Martin wrote:

Am Mittwoch, den 03.11.2021, 10:18 -0400 schrieb Jason Merrill:


...


Thank you! I made these changes and ran
bootstrap and tests again.


Hmm, it doesn't look like you made the change to use the save_expr
function instead of build1?


Oh, sorry. I wanted to change it and then forgot.
Now also with this change (changelog as before).



OK with this change?


OK.


Best,
Martin




Ok for trunk?


Any idea how to fix returning structs with
VLA member from statement expressions?


Testcase?


void foo(void)
{
   ({ int N = 3; struct { char x[N]; } x; x; });
}

The difference from the tests in this patch (which I
also forgot to include in the last version) is that
the object of variable size is returned from the
statement expression and not a pointer to it.
This can not happen with arrays because they decay
to pointers.


Martin



Otherwise, I will add an error message to
the FE in another patch.

Martin



diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 436df45df68..95083f95442 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -3306,7 +3306,19 @@ pointer_int_sum (location_t loc, enum tree_code 
resultcode,
 TREE_TYPE (result_type)))
  size_exp = integer_one_node;
else
-size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
+{
+  size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
+  /* Wrap the pointer expression in a SAVE_EXPR to make sure it
+is evaluated first when the size expression may depend
+on it for VM types.  */
+  if (TREE_SIDE_EFFECTS (size_exp)
+ && TREE_SIDE_EFFECTS (ptrop)
+ && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
+   {
+ ptrop = save_expr (ptrop);
+ size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
+   }
+}
  
/* We are manipulating pointer values, so we don't need to warn

   about relying on undefined signed overflow.  We disable the
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index c2ab96e7e18..84f7dc3c248 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -2964,7 +2964,9 @@ gimplify_var_or_parm_decl (tree *expr_p)
   declaration, for which we've already issued an error.  It would
   be really nice if the front end wouldn't leak these at all.
   Currently the only known culprit is C++ destructors, as seen
- in g++.old-deja/g++.jason/binding.C.  */
+ in g++.old-deja/g++.jason/binding.C.
+ Another possible culprit is size expressions for variably modified
+ types which are lost in the FE or not gimplified correctly.  */
if (VAR_P (decl)
&& !DECL_SEEN_IN_BIND_EXPR_P (decl)
&& !TREE_STATIC (decl) && !DECL_EXTERNAL (decl)
@@ -3109,16 +3111,22 @@ gimplify_compound_lval (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
   expression until we deal with any variable bounds, sizes, or
   positions in order to deal with PLACEHOLDER_EXPRs.
  
- So we do this in three steps.  First we deal with the annotations
- for any variables in the components, then we gimplify the base,
- then we gimplify any indices, from left to right.  */
+ The base expression may contain a statement expression that
+ has declarations used in size expressions, so has to be
+ gimplified before gimplifying the size expressions.
+
+ So we do this in three steps.  First we deal with variable
+ bounds, sizes, and positions, then we gimplify the base,
+ then we deal with the annotations for any variables in the
+ components and any indices, from left to right.  */
+
for (i = expr_stack.length () - 1; i >= 0; i--)
  {
tree t = expr_stack[i];
  
if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)

{
- /* Gimplify the low bound and element type size and put them into
+ /* Deal with the low bound and element type size and put them into
 the ARRAY_REF.  If these values are set, they have already been
 gimplified.  */
  if (TREE_OPERAND (t, 2) == NULL_TREE)
@@ -3127,18 +3135,8 @@ gimplify_compound_lval (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
  if (!is_gimple_min_invariant (low))
{
  TREE_OPERAND (t, 2) = low;
- tret = gimplify_expr (&TREE_OPERAND (t, 2), pre_p,
-   post_p, is_gimple_reg,
-   fb_rvalue);
- ret = MIN (ret, tret);
}
}
- else
-   {
- tret = gimplify_expr (&TREE_OPERAND (t, 2), pre_p, post_p,
-   is_gimple_reg, fb_rvalue);
- ret = MIN (ret, tre

Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Bernhard Reutner-Fischer via Gcc-patches
On Tue, 16 Nov 2021 15:55:55 +0100
Aldy Hernandez via Gcc-patches  wrote:

> All sources before Knuth are clearly wrong.  How could they not?
> Folks living in the pre-Knuth era lived without a deity.
> 
> :-P

Not sure if this one's a compliment.

Speaking of which:

$ git grep -i "complim"
gcc/ChangeLog-2000: addition over compliments over shifts.
gcc/ada/sem_util.adb:  --  Assume that the main unit does not have a complimentary unit
gcc/ada/sem_util.adb:  --  Obtain the complimentary unit of the main unit
gcc/config/fr30/fr30.c:  /* Convert GCC's comparison operators into the complimentary FR30
gcc/config/mn10300/mn10300.md:  /* Recall that twos-compliment is ones-compliment plus one.  When
gcc/config/nds32/constraints.md:  "A constant whose compliment value is in the range of imm15u
gcc/config/nds32/nds32.md:;; 'ONE_COMPLIMENT' operation
gcc/config/sparc/sparc.h:   compliment of ordered and unordered comparisons, but until generic
gcc/config/visium/visium.h:   compliment of ordered and unordered comparisons, but until generic
gcc/d/expr.cc:  /* Build a compliment expression, where all the bits in the value are
gcc/d/intrinsics.cc:   Variants of `bt' will then update that bit. `btc' compliments the bit, `bts'
gcc/doc/md.texi:A constant whose compliment value is in the range of imm15u
gcc/ipa-reference.c:  /* Create the complimentary sets.  */
libstdc++-v3/testsuite/data/thirty_years_among_the_dead_preproc.txt:compliment

Maybe someone competent should contemplate to complement the fixes
for ones' two's complement in the above, except the first and last... ;)


[PATCH, committed] PR fortran/103286 - ICE in resolve_select, at fortran/resolve.c:8848

2021-11-16 Thread Harald Anlauf via Gcc-patches
Committed to mainline as obvious after regtesting.

When issuing an error on an invalid range in a SELECT CASE statement
with a logical case expression, we need to be careful to use the
right locus information.
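
The idea behind the fix can be sketched outside the compiler. The following is a minimal illustration, assuming hypothetical `Expr` and `CaseRange` types that stand in for the front end's real structures (they are not gfortran's definitions): an open-ended range such as `(:0)` has no low bound, so the location has to come from whichever bound is present.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the front end's types; these are NOT the
// real gfortran structures, only a model of "a bound with a location".
struct Expr { std::string where; };
struct CaseRange { const Expr* low; const Expr* high; };

// Return the location to report for an invalid logical range.
// "(:0)" has no low bound and "(2:)" has no high bound, so fall back
// to the high bound whenever the low bound is missing.
const std::string& range_locus(const CaseRange& cp)
{
  assert(cp.low || cp.high);  // at least one bound must exist
  return cp.low ? cp.low->where : cp.high->where;
}
```

With only a high bound present the function returns the high bound's location instead of dereferencing a null low bound, which mirrors the conditional added to the `gfc_error` call in the patch.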

Thanks,
Harald

From 3b3c9932338650c9a402cf1bfbdf7dfc03e185e7 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Tue, 16 Nov 2021 21:06:06 +0100
Subject: [PATCH] Fortran: avoid NULL pointer dereference on invalid range in
 logical SELECT CASE

gcc/fortran/ChangeLog:

	PR fortran/103286
	* resolve.c (resolve_select): Choose appropriate range limit to
	avoid NULL pointer dereference when generating error message.

gcc/testsuite/ChangeLog:

	PR fortran/103286
	* gfortran.dg/pr103286.f90: New test.
---
 gcc/fortran/resolve.c  |  3 ++-
 gcc/testsuite/gfortran.dg/pr103286.f90 | 11 +++
 2 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr103286.f90

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 705d2326a29..f074a0ab3a1 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -8846,7 +8846,8 @@ resolve_select (gfc_code *code, bool select_type)
 		  || cp->low != cp->high))
 	{
 	  gfc_error ("Logical range in CASE statement at %L is not "
-			 "allowed", &cp->low->where);
+			 "allowed",
+			 cp->low ? &cp->low->where : &cp->high->where);
 	  t = false;
 	  break;
 	}
diff --git a/gcc/testsuite/gfortran.dg/pr103286.f90 b/gcc/testsuite/gfortran.dg/pr103286.f90
new file mode 100644
index 000..1c18b7136ce
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr103286.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! { dg-options "-std=gnu" }
+! PR fortran/103286 - ICE in resolve_select
+
+program p
+  select case (.true.) ! { dg-warning "Extension: Conversion" }
+  case (1_8)
+  case (:0)! { dg-error "Logical range in CASE statement" }
+  case (2:)! { dg-error "Logical range in CASE statement" }
+  end select
+end
--
2.26.2



Re: [PATCH] restore ancient -Waddress for weak symbols [PR33925]

2021-11-16 Thread Jason Merrill via Gcc-patches

On 10/23/21 19:06, Martin Sebor wrote:

On 10/4/21 3:37 PM, Jason Merrill wrote:

On 10/4/21 14:42, Martin Sebor wrote:

While resolving the recent -Waddress enhancement request
(PR102103) I came across a 2007 problem report about GCC 4 having
stopped warning for using the address of inline functions in
equality comparisons with null.  With inline functions being
commonplace in C++ this seems like an important use case for
the warning.

The change that resulted in suppressing the warning in these
cases was introduced inadvertently in a fix for PR 22252.

To restore the warning, the attached patch enhances
the decl_with_nonnull_addr_p() function to return true also for
weak symbols for which a definition has been provided.
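
As a rough illustration of the use case described above (the function names `answer` and `has_answer` are made up for this sketch), the comparison below can never be false because the inline function is defined in the same translation unit. This is the kind of test the restored warning is meant to flag; whether a given GCC release actually diagnoses it depends on the version and options.

```cpp
// A defined inline function; such functions are typically emitted as
// weak symbols, which is why the warning had been suppressed for them.
inline int answer() { return 42; }

// The address of a function that has a definition is never null, so this
// comparison is always true.
bool has_answer()
{
  return &answer != nullptr;
}
```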


I think you probably want to merge this function with 
fold-const.c:maybe_nonzero_address, which already handles more cases.


maybe_nonzero_address() doesn't behave quite like
decl_with_nonnull_addr_p() expects and I'm reluctant to muck
around with the former too much since it's used for codegen,
while the latter just for warnings.  (There is even a case
where the functions don't behave the same, and would result
in different warnings between C and C++ without some extra
help.)

So in the attached revision I just have maybe_nonzero_address()
call decl_with_nonnull_addr_p() and then refine the failing
(or uncertain) cases separately, with some overlap between
them.

Since I worked on this someone complained that some instances
of the warning newly enhanced under PR102103 aren't suppressed
in code resulting from macro expansion.  Since it's trivial,
I include the fix for that report in this patch as well.



+   allocated stroage might have a null address.  */


typo.

OK with that fixed.

Jason



Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Jason Merrill via Gcc-patches

On 11/8/21 15:00, Matthias Kretz wrote:

I forgot to mention why I tagged it [RFC]: I needed one more bit of
information on the template args TREE_VEC to encode EXPLICIT_TEMPLATE_ARGS_P.
Its TREE_CHAIN already points to an integer constant denoting the number of
non-default arguments, so I couldn't trivially replace that. Therefore, I used
the sign of that integer. I was hoping to find a cleaner solution, though.


It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that 
would be a cleaner solution.



On Monday, 8 November 2021 17:40:44 CET Matthias Kretz wrote:

On Tuesday, 17 August 2021 20:31:54 CET Jason Merrill wrote:

2. Given a DECL_TI_ARGS tree, can I query whether an argument was deduced
or explicitly specified? I'm asking because I still consider diagnostics
of function templates unfortunate. `template  void f()` is fine, as is
`void f(T) [with T = float]`, but `void f() [with T = float]` could be
better. I.e. if the template parameter appears somewhere in the
function parameter list, dump_template_parms would only produce noise.
If, however, the template parameter was given explicitly, it would be
nice if it could show up accordingly in diagnostics.
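
To make the two situations concrete, here is a small sketch with made-up function names (`twice` and `zero`). It only shows where template arguments are deduced and where they must be written out in source; it makes no claim about the exact text GCC prints for either case.

```cpp
// T appears in the parameter list, so a call like twice(2) deduces it.
// Repeating T in a diagnostic adds little beyond the parameter types.
template <class T> T twice(T x) { return x + x; }

// T does not appear in the parameter list, so the caller has to write
// zero<float>(). Here the explicit argument is the only place the type
// shows up, so dropping it from a diagnostic loses information.
template <class T> T zero() { return T(); }
```

A call `twice(2)` relies on deduction, while `zero<float>()` can only be written with an explicit argument list.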


NON_DEFAULT_TEMPLATE_ARGS_COUNT has that information, though there are
some issues with it.  Attached is my WIP from May to improve it
somewhat, if that's interesting.


It is interesting. I used your patch to come up with the attached patch. I
must say, I didn't try to read through all the cp/pt.c code to understand
all of what you did there (which is why my ChangeLog entry says "Jason?"),
but it works for me (and all of `make check`).

Anyway, I'd like to propose the following before finishing my diagnose_as
patch. I believe it's useful to fix this part first. The diagnostic/default-
template-args-[12].C tests show a lot of examples of the intent of this
patch. And the remaining changes to the testsuite show how it changes
diagnostic output.

-- 8< 

The choice when to print a function template parameter was still
suboptimal. That's because sometimes the function template parameter
list only adds noise, while in other situations the lack of a function
template parameter list makes diagnostic messages hard to understand.

The general idea of this change is to print template parms wherever they
would appear in the source code as well. Thus, the diagnostics code
needs to know whether any template parameter was given explicitly.

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

 * g++.dg/debug/dwarf2/template-params-12n.C: Optionally, allow
 DW_AT_default_value.
 * g++.dg/diagnostic/default-template-args-1.C: New.
 * g++.dg/diagnostic/default-template-args-2.C: New.
 * g++.dg/diagnostic/param-type-mismatch-2.C: Expect template
 parms in diagnostic.
 * g++.dg/ext/pretty1.C: Expect function template specialization
 to not pretty-print template parms.
 * g++.old-deja/g++.ext/pretty3.C: Ditto.
 * g++.old-deja/g++.pt/memtemp77.C: Ditto.
 * g++.dg/goacc/template.C: Expect function template parms for
 explicit arguments.
 * g++.dg/gomp/declare-variant-7.C: Expect no function template
 parms for deduced arguments.
 * g++.dg/template/error40.C: Expect only non-default template
 arguments in diagnostic.

gcc/cp/ChangeLog:

 * cp-tree.h (GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT): Return
 absolute value of stored constant.
 (EXPLICIT_TEMPLATE_ARGS_P): New.
 (SET_EXPLICIT_TEMPLATE_ARGS_P): New.
 (TFF_AS_PRIMARY): New constant.
 * error.c (get_non_default_template_args_count): Avoid
 GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT if
 NON_DEFAULT_TEMPLATE_ARGS_COUNT is a NULL_TREE. Make independent
 of flag_pretty_templates.
 (dump_template_bindings): Add flags parameter to be passed to
 get_non_default_template_args_count. Print only non-default
 template arguments.
 (dump_function_decl): Call dump_function_name and dump_type of
 the DECL_CONTEXT with specialized template and set
 TFF_AS_PRIMARY for their flags.
 (dump_function_name): Add and document conditions for calling
 dump_template_parms.
 (dump_template_parms): Print only non-default template
 parameters.
 * pt.c (determine_specialization): Jason?
 (template_parms_level_to_args): Jason?
 (copy_template_args): Jason?
 (fn_type_unification): Set EXPLICIT_TEMPLATE_ARGS_P on the
 template arguments tree if any template parameter was explicitly
 given.
 (type_unification_real): Jason?
 (get_partial_spec_bindings): Jason?
 (tsubst_template_args): Determine number of defaulted arguments
 from new argument vector, if possible.
---
  gcc/cp/cp-tree.h  | 18 +++-
  gcc/cp/error.c  

[PATCH v2] rs6000: Test case adjustments for new builtins

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi!  I recently submitted [1] to make adjustments to test cases for the new
builtins support, mostly due to error messages changing for consistency.
Thanks for the previous review.  I've reviewed the reasons for the changes
and removed unrelated changes as requested.

A couple of comments:

 - For fold-vec-splat-floatdouble.c and fold-vec-splat-longlong.c, the existing
   test cases have some bad tests in them (checking two bits when only one bit
   is meaningful).  The new builtin support catches this but the old support did
   not.  Removing those bad cases changes some of the scan-assembler-times
   expected values.
 - For int_128bit-runnable.c, I chose not to do gimple folding on the 128-bit
   comparison operations in the new implementation, because doing so results in
   bad code that splits things into two 64-bit values.  That needs separate
   attention; but the point here is, when I did that, I started generating
   more of the vcmpequq, vcmpgtsq, and vcmpgtuq instructions.

Everything else here is hopefully straightforward, and unchanged from the
previous submission.

Bootstrapped and tested on powerpc64le-linux-gnu, and on powerpc64-linux-gnu
with -m32 and -m64.  Is this okay for trunk?

Thanks!
Bill

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578615.html


2021-11-15  Bill Schmidt  

gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust error
message.
* gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise.
* gcc.target/powerpc/byte-in-set-2.c: Likewise.
* gcc.target/powerpc/cmpb-2.c: Likewise.
* gcc.target/powerpc/cmpb32-2.c: Likewise.
* gcc.target/powerpc/crypto-builtin-2.c: Likewise.
* gcc.target/powerpc/fold-vec-splat-floatdouble.c: Remove invalid
test and adjust xxpermdi count.
* gcc.target/powerpc/fold-vec-splat-longlong.c: Remove invalid
tests and adjust instruction counts.
* gcc.target/powerpc/fold-vec-splat-misc-invalid.c: Adjust error
messages.
* gcc.target/powerpc/int_128bit-runnable.c: Adjust instruction
counts since we do better by not gimple-folding some builtins.
* gcc.target/powerpc/pr80315-1.c: Adjust error message.
* gcc.target/powerpc/pr80315-2.c: Likewise.
* gcc.target/powerpc/pr80315-3.c: Likewise.
* gcc.target/powerpc/pr80315-4.c: Likewise.
* gcc.target/powerpc/pr88100.c: Likewise.
* gcc.target/powerpc/pragma_misc9.c: Likewise.
* gcc.target/powerpc/pragma_power8.c: Undef _RS6000_VECDEFINES_H.
* gcc.target/powerpc/pragma_power9.c: Likewise.
* gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust error
messages.
* gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Likewise.
* gcc.target/powerpc/vec-gnb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-all-nez-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-any-eqz-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-cmpnez-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-xl-len-13.c: Likewise.
* gcc.target/powerpc/vsu/vec-xst-len-12.c: Likewise.
---
 .../gcc.target/powerpc/bfp/scalar-extract-exp-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-extract-sig-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-2.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-5.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-8.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-2.c |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-3.c |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-5.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb-2.c  |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb32-2.c|  2 +-
 .../gcc.target/powerpc/crypto-builtin-2.c  | 14 +++---
 .../powerpc/fold-vec-splat-floatdouble.c   |  4 ++--
 .../gcc.target/powerpc/fold-vec-splat-longlong.c   | 10 +++---
 .../powerpc/fold-vec-splat-misc-invalid.c  |  8 
 .../gcc.target/powerpc/int_128bit-runnable.c   |  6 +++---
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr88100.c | 12 ++--
 gcc/testsuite/gcc.

Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Matthias Kretz
On Tuesday, 16 November 2021 21:25:33 CET Jason Merrill wrote:
> On 11/8/21 15:00, Matthias Kretz wrote:
> > I forgot to mention why I tagged it [RFC]: I needed one more bit of
> > information on the template args TREE_VEC to encode
> > EXPLICIT_TEMPLATE_ARGS_P. Its TREE_CHAIN already points to an integer
> > constant denoting the number of non-default arguments, so I couldn't
> > trivially replace that. Therefore, I used the sign of that integer. I was
> > hoping to find a cleaner solution, though.
> It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that
> would be a cleaner solution.

I tried that first but realized that TREE_VEC doesn't allow any 
TREE_LANG_FLAGs (it uses those bits for the length IIRC). And setting the 
TREE_LANG_FLAGs on the TREE_CHAIN of the TREE_VEC can't work either (since the 
int constants are shared between many trees).

Should I maybe turn the TREE_CHAIN into a TREE_LIST using TREE_PURPOSE and 
TREE_VALUE for EXPLICIT_TEMPLATE_ARGS_P and non-default arguments, 
respectively? (And where would I document this?)

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Jason Merrill via Gcc-patches

On 11/16/21 15:42, Matthias Kretz wrote:

On Tuesday, 16 November 2021 21:25:33 CET Jason Merrill wrote:

On 11/8/21 15:00, Matthias Kretz wrote:

I forgot to mention why I tagged it [RFC]: I needed one more bit of
information on the template args TREE_VEC to encode
EXPLICIT_TEMPLATE_ARGS_P. Its TREE_CHAIN already points to an integer
constant denoting the number of non-default arguments, so I couldn't
trivially replace that. Therefore, I used the sign of that integer. I was
hoping to find a cleaner solution, though.

It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that
would be a cleaner solution.


I tried that first but realized that TREE_VEC doesn't allow any
TREE_LANG_FLAGs (it uses those bits for the length IIRC). And setting the
TREE_LANG_FLAGs on the TREE_CHAIN of the TREE_VEC can't work either (since the
int constants are shared between many trees).

Should I maybe turn the TREE_CHAIN into a TREE_LIST using TREE_PURPOSE and
TREE_VALUE for EXPLICIT_TEMPLATE_ARGS_P and non-default arguments,
respectively? (And where would I document this?)


Maybe a TREE_LIST if there are explicit template arguments to a function 
template, where TREE_PURPOSE is the number of explicit arguments and 
TREE_VALUE is the number of non-default arguments.


I'd document it at the definition of NON_DEFAULT_TEMPLATE_ARGS_COUNT. 
The SET/GET macros should become functions.
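
The two encodings discussed in this thread can be compared in a toy model. Everything below uses hypothetical names and plain integers; it is not GCC's tree API, only a sketch of the trade-off between folding a flag into the sign of a count and storing two separate values.

```cpp
#include <cassert>
#include <cstdlib>

// Toy model only: hypothetical names, not GCC's tree API.

// Variant 1: fold the "explicit" flag into the sign of the stored count.
// In this toy version a count of zero cannot carry a sign.
inline int encode_signed(int non_default_count, bool explicit_args)
{
  assert(non_default_count >= 0);
  return explicit_args ? -non_default_count : non_default_count;
}
inline int count_of(int stored) { return std::abs(stored); }
inline bool explicit_p(int stored) { return stored < 0; }

// Variant 2: keep two separate fields, in the spirit of the TREE_LIST
// suggestion (purpose = explicit count, value = non-default count).
struct ArgCounts {
  int explicit_count;     // number of explicitly given arguments
  int non_default_count;  // number of non-default arguments
};
inline bool explicit_p(const ArgCounts& c) { return c.explicit_count > 0; }
```

Accessor functions like these, rather than macros, keep callers independent of which representation is stored.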


Jason



Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Matthias Kretz
On Tuesday, 16 November 2021 21:49:31 CET Jason Merrill wrote:
> On 11/16/21 15:42, Matthias Kretz wrote:
> > On Tuesday, 16 November 2021 21:25:33 CET Jason Merrill wrote:
> >> On 11/8/21 15:00, Matthias Kretz wrote:
> >>> I forgot to mention why I tagged it [RFC]: I needed one more bit of
> >>> information on the template args TREE_VEC to encode
> >>> EXPLICIT_TEMPLATE_ARGS_P. Its TREE_CHAIN already points to an integer
> >>> constant denoting the number of non-default arguments, so I couldn't
> >>> trivially replace that. Therefore, I used the sign of that integer. I
> >>> was
> >>> hoping to find a cleaner solution, though.
> >> 
> >> It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that
> >> would be a cleaner solution.
> > 
> > I tried that first but realized that TREE_VEC doesn't allow any
> > TREE_LANG_FLAGs (it uses those bits for the length IIRC). And setting the
> > TREE_LANG_FLAGs on the TREE_CHAIN of the TREE_VEC can't work either (since
> > the int constants are shared between many trees).
> > 
> > Should I maybe turn the TREE_CHAIN into a TREE_LIST using TREE_PURPOSE and
> > TREE_VALUE for EXPLICIT_TEMPLATE_ARGS_P and non-default arguments,
> > respectively? (And where would I document this?)
> 
> Maybe a TREE_LIST if there are explicit template arguments to a function
> template, where TREE_PURPOSE is the number of explicit arguments and
> TREE_VALUE is the number of non-default arguments.
> 
> I'd document it at the definition of NON_DEFAULT_TEMPLATE_ARGS_COUNT.
> The SET/GET macros should become functions.

Sounds good. I'll come up with a new patch ASAP.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Mike Stump via Gcc-patches
On Nov 15, 2021, at 5:48 PM, Marek Polacek via Gcc-patches 
 wrote:
> 
> Nitpicking time.  It's spelled "ones' complement" rather than "one's
> complement".  I didn't go into config/.
> 
> Ok for trunk?

So, is it two's complement or twos' complement then?  Seems like it should be 
the same, but  wikipedia suggests it is two's complement, as does google.  If 
that is wrong, you should go edit it as well.  :-)


Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Marek Polacek via Gcc-patches
On Tue, Nov 16, 2021 at 01:09:15PM -0800, Mike Stump via Gcc-patches wrote:
> On Nov 15, 2021, at 5:48 PM, Marek Polacek via Gcc-patches 
>  wrote:
> > 
> > Nitpicking time.  It's spelled "ones' complement" rather than "one's
> > complement".  I didn't go into config/.
> > 
> > Ok for trunk?
> 
> So, is it two's complement or twos' complement then?  Seems like it should be 
> the same, but  wikipedia suggests it is two's complement, as does google.  If 
> that is wrong, you should go edit it as well.  :-)
 
It is "two's complement":
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584543.html
but Knuth also continues to say that there's "twos' complement notation",
which "has radix 3 and complementation with respect to (2...22)_3."
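
The three notations can be checked with a few lines of arithmetic. The sketch below uses made-up helper names and fixed widths (8 bits, 3 ternary digits) chosen only for illustration: ones' complement subtracts from a number made entirely of ones, two's complement subtracts from a power of two, and the radix-3 twos' complement subtracts from a number made entirely of twos.

```cpp
#include <cstdint>

// Ones' complement of an 8-bit value: subtract from (11111111)_2 = 255.
constexpr std::uint8_t ones_complement8(std::uint8_t x)
{ return static_cast<std::uint8_t>(0xFFu - x); }

// Two's complement of an 8-bit value: subtract from 2^8, modulo 2^8.
constexpr std::uint8_t twos_complement8(std::uint8_t x)
{ return static_cast<std::uint8_t>(0x100u - x); }

// "Twos' complement" in radix 3 with 3 digits: subtract from (222)_3 = 26.
// Valid for 0 <= x <= 26.
constexpr unsigned twos_complement_radix3(unsigned x)
{ return 26u - x; }
```

For x = 5 the 8-bit results differ by one (250 and 251), and in radix 3 the value (012)_3 = 5 maps to (210)_3 = 21, each digit d becoming 2 - d.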


It's not lost on me how inconsequential this patch is; I'm happy to just
drop it and let the copy editor in me sleep.

Marek



[committed] libstdc++: Fix tests for constexpr std::string

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.


Some tests fail when run with -D_GLIBCXX_USE_CXX11_ABI=0 or -std=gnu++20.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (operator<=>): Use constexpr
unconditionally.
* testsuite/21_strings/basic_string/modifiers/constexpr.cc:
Require cxx11-abi effective target.
* testsuite/21_strings/headers/string/synopsis.cc: Add
conditional constexpr to declarations, and adjust relational
operators for C++20.
---
 libstdc++-v3/include/bits/basic_string.h  |  6 ++--
 .../basic_string/modifiers/constexpr.cc   |  1 +
 .../21_strings/headers/string/synopsis.cc | 33 +--
 3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index b6945f1cdfb..0b7d6c0a981 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -3546,8 +3546,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  greater than, or incomparable with `__rhs`.
*/
   template
-_GLIBCXX20_CONSTEXPR
-inline auto
+constexpr auto
 operator<=>(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
const basic_string<_CharT, _Traits, _Alloc>& __rhs) noexcept
 -> decltype(__detail::__char_traits_cmp_cat<_Traits>(0))
@@ -3561,8 +3560,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  greater than, or incomparable with `__rhs`.
*/
   template
-_GLIBCXX20_CONSTEXPR
-inline auto
+constexpr auto
 operator<=>(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
const _CharT* __rhs) noexcept
 -> decltype(__detail::__char_traits_cmp_cat<_Traits>(0))
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc
index c875a3a19ad..a4627714d9a 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc
@@ -1,5 +1,6 @@
 // { dg-options "-std=gnu++20" }
 // { dg-do compile { target c++20 } }
+// { dg-require-effective-target cxx11-abi }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/21_strings/headers/string/synopsis.cc b/libstdc++-v3/testsuite/21_strings/headers/string/synopsis.cc
index f14c4ae831c..f12345ed426 100644
--- a/libstdc++-v3/testsuite/21_strings/headers/string/synopsis.cc
+++ b/libstdc++-v3/testsuite/21_strings/headers/string/synopsis.cc
@@ -26,6 +26,12 @@
 # define NOTHROW
 #endif
 
+#if __cplusplus >= 202002L
+# define CONSTEXPR constexpr
+#else
+# define CONSTEXPR
+#endif
+
 namespace std {
   //  lib.char.traits, character traits:
   template
@@ -40,33 +46,52 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 _GLIBCXX_END_NAMESPACE_CXX11
 
   template
+  CONSTEXPR
   basic_string
   operator+(const basic_string& lhs,
const basic_string& rhs);
   template
+  CONSTEXPR
   basic_string
   operator+(const charT* lhs,
const basic_string& rhs);
   template
+  CONSTEXPR
   basic_string
   operator+(charT lhs, const basic_string& rhs);
   template
+  CONSTEXPR
   basic_string
   operator+(const basic_string& lhs,
const charT* rhs);
   template
+  CONSTEXPR
   basic_string
   operator+(const basic_string& lhs, charT rhs);
 
   template
+  CONSTEXPR
   bool operator==(const basic_string& lhs,
  const basic_string& rhs) NOTHROW;
   template
-  bool operator==(const charT* lhs,
- const basic_string& rhs);
-  template
+  CONSTEXPR
   bool operator==(const basic_string& lhs,
  const charT* rhs);
+
+#if __cpp_lib_three_way_comparison
+  template
+  constexpr
+  bool operator<=>(const basic_string& lhs,
+  const basic_string& rhs) NOTHROW;
+  template
+  constexpr
+  bool operator<=>(const basic_string& lhs,
+  const charT* rhs);
+#else
+  template
+  CONSTEXPR
+  bool operator==(const charT* lhs,
+ const basic_string& rhs);
   template
   bool operator!=(const basic_string& lhs,
  const basic_string& rhs) NOTHROW;
@@ -114,9 +139,11 @@ _GLIBCXX_END_NAMESPACE_CXX11
   template
   bool operator>=(const charT* lhs,
  const basic_string& rhs);
+#endif
 
   //  lib.string.special:
   template
+  CONSTEXPR
   void swap(basic_string& lhs,
basic_string& rhs)
 #if __cplusplus >= 201103L
-- 
2.31.1


