Re: [PATCH] x86: Shrink writing 0/-1 to memory using and/or with -Oz.

2021-12-22 Thread Uros Bizjak via Gcc-patches
On Tue, Dec 21, 2021 at 4:08 PM Roger Sayle  wrote:
>
>
> This is the second part of my fix to PR target/103773 where -Oz shouldn't
> use push/pop on x86 to shrink writing small integer constants to memory.
> Instead clang uses "andl $0, mem" for writing zero, and "orl $-1, mem"
> when writing -1 to memory when using -Oz.  This patch implements this
> via peephole2 where we can confirm that it's OK to clobber the flags.
>
> On the CSiBE benchmark, this reduces total code size from 3664172 bytes
> to 3663304 bytes, saving 868 bytes.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures, and the new testcase checked
> both with and without -m32.  Ok for mainline?
>
>
> 2021-12-21  Roger Sayle  
>
> gcc/ChangeLog
> * gcc/config/i386/i386.md (define_peephole2): With -Oz use
> andl $0,mem instead of movl $0,mem and orl $-1,mem instead of
> movl $-1,mem.

Your approach uses access to uninitialized memory, which may confuse optimizers.

Please rather enhance *mov<mode>_xor and *mov<mode>_or to accept a
memory operand and convert to these patterns.

Uros.

> gcc/testsuite/ChangeLog
> * gcc.target/i386/pr103773-2.c: New test case.
>
>
> Thanks in advance (and my apologies for the breakage).
> Roger
> --
>
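
The size argument and the objection in the reply can both be seen in a small
C model.  This is only a sketch: the function names are hypothetical, and the
byte counts assume a plain (%reg) address with no displacement (opcode C7 +
ModRM + imm32 for mov, opcode 83 + ModRM + imm8 for and/or).  Note that the
and/or forms read the old memory value before writing, which is what the
remark about uninitialized memory refers to.

```c
#include <assert.h>
#include <stdint.h>

/* What "movl $0, mem" does: a plain store of zero. */
static void store_zero_mov(uint32_t *mem) { *mem = 0; }

/* What "andl $0, mem" does: read-modify-write; x & 0 is 0 for any x. */
static void store_zero_and(uint32_t *mem) { *mem &= 0; }

/* What "movl $-1, mem" does: a plain store of all ones. */
static void store_ones_mov(uint32_t *mem) { *mem = UINT32_MAX; }

/* What "orl $-1, mem" does: read-modify-write; x | ~0 is all ones. */
static void store_ones_or(uint32_t *mem) { *mem |= UINT32_MAX; }

/* Encoded lengths in bytes, assuming a (%reg) address, no displacement. */
enum {
  MOV_IMM32_MEM_LEN = 1 + 1 + 4, /* C7 /0, ModRM, imm32 */
  ALU_IMM8_MEM_LEN = 1 + 1 + 1   /* 83 /4 or 83 /1, ModRM, imm8 */
};

/* Bytes saved per store when the and/or form replaces mov. */
static int bytes_saved_per_store(void)
{
  assert(ALU_IMM8_MEM_LEN < MOV_IMM32_MEM_LEN);
  return MOV_IMM32_MEM_LEN - ALU_IMM8_MEM_LEN;
}
```

Both variants leave the same value in memory; the difference is the extra
read and the clobbered flags, which is why the patch goes through peephole2.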


Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.

2021-12-22 Thread Uros Bizjak via Gcc-patches
On Tue, Dec 21, 2021 at 1:27 PM Roger Sayle  wrote:
>
>
> My apologies for the inconvenience.  The new support for -Oz using
> push/pop for small integer constants on x86_64 is only a win/correct
> for loading registers.  Fixed by adding !MEM_P tests in the appropriate
> locations.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
>
>
> 2021-12-21  Roger Sayle  
>
> gcc/ChangeLog
> PR target/103773
> * config/i386/i386.md (*movdi_internal): Only use short
> push/pop sequence for register (non-memory) destinations.
> (*movsi_internal): Likewise.
>
> gcc/testsuite/ChangeLog
> PR target/103773
> * gcc.target/i386/pr103773.c: New test case.

Ouch, as pointed out in the PR, this approach clobbers the red zone.

Please revert the original patch.

Thanks,
Uros.

>
> Roger
> --
>


Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.

2021-12-22 Thread Uros Bizjak via Gcc-patches
On Wed, Dec 22, 2021 at 9:10 AM Uros Bizjak  wrote:
>
> On Tue, Dec 21, 2021 at 1:27 PM Roger Sayle wrote:
> >
> >
> > My apologies for the inconvenience.  The new support for -Oz using
> > push/pop for small integer constants on x86_64 is only a win/correct
> > for loading registers.  Fixed by adding !MEM_P tests in the appropriate
> > locations.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check with no new failures.  Ok for mainline?
> >
> >
> > 2021-12-21  Roger Sayle  
> >
> > gcc/ChangeLog
> > PR target/103773
> > * config/i386/i386.md (*movdi_internal): Only use short
> > push/pop sequence for register (non-memory) destinations.
> > (*movsi_internal): Likewise.
> >
> > gcc/testsuite/ChangeLog
> > PR target/103773
> > * gcc.target/i386/pr103773.c: New test case.
>
> Ouch, as pointed out in the PR, this approach clobbers the red zone.
>
> Please revert the original patch.

*Maybe* we can use frame->red_zone_size here, but the frame is
recalculated several times during the compilation. I think it is just
too dangerous to use push/pop w.r.t. red zone clobbering.

Uros.
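
The red zone problem can be illustrated with a toy stack model.  This is a
sketch only: struct toy_stack and its helpers are hypothetical and merely
mimic a downward-growing stack in which a leaf function keeps a local below
the stack pointer without adjusting it, as the x86-64 psABI red zone permits.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 32 };

/* Toy model of a downward-growing stack; sp is an index into mem. */
struct toy_stack {
  int64_t mem[STACK_WORDS];
  int sp;
};

/* A leaf function may keep a local just below sp without moving sp. */
static void red_zone_store(struct toy_stack *s, int64_t v)
{
  s->mem[s->sp - 1] = v;
}

static int64_t red_zone_load(const struct toy_stack *s)
{
  return s->mem[s->sp - 1];
}

/* push: decrement sp, then store at the new sp. */
static void toy_push(struct toy_stack *s, int64_t v)
{
  assert(s->sp > 0);
  s->mem[--s->sp] = v;
}

/* pop: load from sp, then increment sp. */
static int64_t toy_pop(struct toy_stack *s)
{
  assert(s->sp < STACK_WORDS);
  return s->mem[s->sp++];
}

/* Load a small constant via push/pop, as the -Oz sequence does. */
static int64_t load_const_via_push_pop(struct toy_stack *s, int64_t imm)
{
  toy_push(s, imm);
  return toy_pop(s);
}
```

After red_zone_store followed by load_const_via_push_pop, the stack pointer
is back where it started, but the slot below it now holds the pushed
constant rather than the local that was stored there.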


Re: vxworks libstdc++ locale

2021-12-22 Thread Rasmus Villemoes via Gcc-patches
On 21/12/2021 16.42, Rasmus Villemoes wrote:
> Hi
> 
> While trying to upgrade our vxworks 5.5 compiler to gcc12, I've hit a
> problem when loading the libstdc++ module on target. It manifests as
> 
> [00] tShell memPartFree: invalid block 8bf72c in partition 9605dc.
> [00] tShell memPartFree: invalid block 8bf38c in partition 9605dc.
> [00] tShell memPartFree: invalid block 8bf304 in partition 9605dc.
> [00] tShell memPartFree: invalid block 8bf348 in partition 9605dc.
> [00] tShell memPartFree: invalid block 8bf23c in partition 9605dc.
> [00] tShell memPartFree: invalid block 8bf6c4 in partition 9605dc.
> [00] tShell memPartFree: invalid block 8bf794 in partition 9605dc.
> [00] tShell memPartFree: invalid block 8bf7a0 in partition 9605dc.
> [00] tShell memPartFree: invalid block 8bf7bc in partition 9605dc.
> 
> being printed on the console. We didn't use to pass an explicit
> --enable-clocale option to configure, but if I add
> --enable-clocale=generic , thus reverting to the locale implementation
> used for gcc11, the problem goes away.
> 
> The vxworks locale seems to be mostly identical to generic, just
> differing in CCTYPE_CC. And comparing the .a files, it seems that that
> TU ends up defining a constructor
> _GLOBAL__sub_I__ZNSt12ctype_bynameIcEC2EPKcj , which calls
> _ZNSt8ios_base4InitC1Ev . But then I'm lost.
> 
> Any ideas?

So if I remove the

#include 

from libstdc++-v3/config/locale/vxworks/ctype_members.cc the problem
goes away, and I don't see the purpose of that #include anyway (a debug
leftover perhaps?).

Rasmus


RE: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.

2021-12-22 Thread Roger Sayle

Hi Uros,
Would you consider the following variant that disables this optimization when a
red zone is used by the current function?  You're right that cfun's
red_zone_size is recalculated dynamically, but ix86_red_zone_used should be a
better "gate" given that this logic resides very late during compilation, in
the output templates, where whether or not a red zone is used is known.

On CSiBE, disabling this optimization in non-leaf functions that use a red zone
costs 219 bytes, but remains a significant win over -Os.  (Alas the absolute
numbers aren't comparable as this testing included the 0/-1 write to memory
changes).

Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and make -k check
with no new failures.

2021-12-22  Roger Sayle  

gcc/ChangeLog
PR target/103773
* config/i386/i386.md (*movdi_internal): Only use short
push/pop sequence for register (non-memory) destinations
when the current function doesn't make use of a red zone.
(*movsi_internal): Likewise.

gcc/testsuite/ChangeLog
PR target/103773
* gcc.target/i386/pr103773.c: New test case.

Please let me know what you think.  I'll revert, if this tweak doesn't address
your concerns.
Roger
--

> -----Original Message-----
> From: Uros Bizjak 
> Sent: 22 December 2021 08:20
> To: Roger Sayle 
> Cc: GCC Patches 
> Subject: Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to
> memory.
> 
> On Wed, Dec 22, 2021 at 9:10 AM Uros Bizjak  wrote:
> >
> > On Tue, Dec 21, 2021 at 1:27 PM Roger Sayle wrote:
> > >
> > >
> > > My apologies for the inconvenience.  The new support for -Oz using
> > > push/pop for small integer constants on x86_64 is only a win/correct
> > > for loading registers.  Fixed by adding !MEM_P tests in the
> > > appropriate locations.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make
> > > bootstrap and make -k check with no new failures.  Ok for mainline?
> > >
> > >
> > > 2021-12-21  Roger Sayle  
> > >
> > > gcc/ChangeLog
> > > PR target/103773
> > > * config/i386/i386.md (*movdi_internal): Only use short
> > > push/pop sequence for register (non-memory) destinations.
> > > (*movsi_internal): Likewise.
> > >
> > > gcc/testsuite/ChangeLog
> > > PR target/103773
> > > * gcc.target/i386/pr103773.c: New test case.
> >
> > Ouch, as pointed out in the PR, this approach clobbers the red zone.
> >
> > Please revert the original patch.
> 
> *Maybe* we can use frame->red_zone_size here, but the frame is recalculated
> several times during the compilation. I think it is just too dangerous to use
> push/pop w.r.t. red zone clobbering.
> 
> Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d25453f..489cede 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2217,7 +2217,9 @@
  if (optimize_size > 1
  && TARGET_64BIT
  && CONST_INT_P (operands[1])
- && IN_RANGE (INTVAL (operands[1]), -128, 127))
+ && IN_RANGE (INTVAL (operands[1]), -128, 127)
+ && !MEM_P (operands[0])
+ && !ix86_red_zone_used)
return "push{q}\t%1\n\tpop{q}\t%0";
  return "mov{l}\t{%k1, %k0|%k0, %k1}";
}
@@ -2440,7 +2442,9 @@
return "lea{l}\t{%E1, %0|%0, %E1}";
   else if (optimize_size > 1
   && CONST_INT_P (operands[1])
-  && IN_RANGE (INTVAL (operands[1]), -128, 127))
+  && IN_RANGE (INTVAL (operands[1]), -128, 127)
+  && !MEM_P (operands[0])
+  && !ix86_red_zone_used)
{
  if (TARGET_64BIT)
return "push{q}\t%1\n\tpop{q}\t%q0";
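
The combined gate in the two hunks above can be restated as a standalone
predicate.  This is a hedged sketch, not GCC code: struct mov_ctx and its
fields are hypothetical stand-ins for optimize_size, MEM_P (operands[0]) and
ix86_red_zone_used, and the TARGET_64BIT test from the first hunk is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened view of the state the output template tests. */
struct mov_ctx {
  int optimize_size;  /* models optimize_size; the patch tests "> 1" */
  bool dest_is_mem;   /* models MEM_P (operands[0]) */
  bool red_zone_used; /* models ix86_red_zone_used */
};

/* Models IN_RANGE (INTVAL (operands[1]), -128, 127). */
static bool fits_imm8(int64_t v)
{
  return v >= -128 && v <= 127;
}

/* True when the short push/pop sequence would be chosen over mov. */
static bool use_push_pop(const struct mov_ctx *c, int64_t imm)
{
  assert(c != NULL);
  return c->optimize_size > 1
         && fits_imm8(imm)
         && !c->dest_is_mem
         && !c->red_zone_used;
}
```

With this model, a register destination and no red zone selects push/pop for
any constant from -128 to 127; a memory destination or a function that uses
the red zone falls back to mov.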


[PATCH][pushed] docs: Unify instruct set name.

2021-12-22 Thread Martin Liška

gcc/ChangeLog:

* doc/extend.texi: Use uppercase letters for SSEx.
---
 gcc/doc/extend.texi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index f52384f7629..a15c4fe9b33 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -6855,12 +6855,12 @@ and SSE4.2).
 @item sse4.1
 @itemx no-sse4.1
 @cindex @code{target("sse4.1")} function attribute, x86
-Enable/disable the generation of the sse4.1 instructions.
+Enable/disable the generation of the SSE4.1 instructions.
 
 @item sse4.2
 @itemx no-sse4.2
 @cindex @code{target("sse4.2")} function attribute, x86
-Enable/disable the generation of the sse4.2 instructions.
+Enable/disable the generation of the SSE4.2 instructions.
 
 @item sse4a
 @itemx no-sse4a
--
2.34.1



Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.

2021-12-22 Thread Uros Bizjak via Gcc-patches
On Wed, Dec 22, 2021 at 10:26 AM Roger Sayle  wrote:
>
>
> Hi Uros,
> Would you consider the following variant that disables this optimization
> when a red zone is used by the current function?  You're right that cfun's
> red_zone_size is recalculated dynamically, but ix86_red_zone_used should be
> a better "gate" given that this logic resides very late during compilation,
> in the output templates, where whether or not a red zone is used is known.
>
> On CSiBE, disabling this optimization in non-leaf functions that use a red
> zone costs 219 bytes, but remains a significant win over -Os.  (Alas the
> absolute numbers aren't comparable as this testing included the 0/-1 write
> to memory changes).
>
> Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and make -k
> check with no new failures.
>
> 2021-12-22  Roger Sayle  
>
> gcc/ChangeLog
> PR target/103773
> * config/i386/i386.md (*movdi_internal): Only use short
> push/pop sequence for register (non-memory) destinations
> when the current function doesn't make use of a red zone.
> (*movsi_internal): Likewise.
>
> gcc/testsuite/ChangeLog
> PR target/103773
> * gcc.target/i386/pr103773.c: New test case.
>
> Please let me know what you think.  I'll revert, if this tweak doesn't address
> your concerns.

Yes, using ix86_red_zone_used looks safe.

OTOH, is there a reason the transformation is not implemented via
peephole2 pass? IIRC, frame is stable after pro_and_epilogue_pass, and
peephole2 pass is instanced well after register allocation.

Uros.


[PATCH][GCC] aarch64: fix: ls64 tests fail on aarch64-linux-gnu_ilp32 [PR103729]

2021-12-22 Thread Przemyslaw Wirkus via Gcc-patches
This patch sorts out an issue with LS64 intrinsics tests failing on the
aarch64-linux-gnu_ilp32 target.

Regtested on aarch64-linux-gnu_ilp32, aarch64-elf and aarch64_be-elf
with no issues.

OK to install?

gcc/ChangeLog:

PR target/103729
* config/aarch64/aarch64-builtins.c (aarch64_expand_builtin_ls64):
Handle SImode for ILP32.


rb15171.patch
Description: rb15171.patch


[PATCH][pushed] docs: use ';' for function declarations.

2021-12-22 Thread Martin Liška

Pushed as obvious, it makes the documentation more consistent.

Martin

gcc/ChangeLog:

* doc/extend.texi: Unify all function declarations in examples
where some miss trailing ';'.
---
 gcc/doc/extend.texi | 2973 +--
 1 file changed, 1483 insertions(+), 1490 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index a15c4fe9b33..7e5791b67c5 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -14591,34 +14591,34 @@ The following built-in functions are always available.  They
 all generate the machine instruction that is part of the name.
 
 @smallexample
-long __builtin_alpha_implver (void)
-long __builtin_alpha_rpcc (void)
-long __builtin_alpha_amask (long)
-long __builtin_alpha_cmpbge (long, long)
-long __builtin_alpha_extbl (long, long)
-long __builtin_alpha_extwl (long, long)
-long __builtin_alpha_extll (long, long)
-long __builtin_alpha_extql (long, long)
-long __builtin_alpha_extwh (long, long)
-long __builtin_alpha_extlh (long, long)
-long __builtin_alpha_extqh (long, long)
-long __builtin_alpha_insbl (long, long)
-long __builtin_alpha_inswl (long, long)
-long __builtin_alpha_insll (long, long)
-long __builtin_alpha_insql (long, long)
-long __builtin_alpha_inswh (long, long)
-long __builtin_alpha_inslh (long, long)
-long __builtin_alpha_insqh (long, long)
-long __builtin_alpha_mskbl (long, long)
-long __builtin_alpha_mskwl (long, long)
-long __builtin_alpha_mskll (long, long)
-long __builtin_alpha_mskql (long, long)
-long __builtin_alpha_mskwh (long, long)
-long __builtin_alpha_msklh (long, long)
-long __builtin_alpha_mskqh (long, long)
-long __builtin_alpha_umulh (long, long)
-long __builtin_alpha_zap (long, long)
-long __builtin_alpha_zapnot (long, long)
+long __builtin_alpha_implver (void);
+long __builtin_alpha_rpcc (void);
+long __builtin_alpha_amask (long);
+long __builtin_alpha_cmpbge (long, long);
+long __builtin_alpha_extbl (long, long);
+long __builtin_alpha_extwl (long, long);
+long __builtin_alpha_extll (long, long);
+long __builtin_alpha_extql (long, long);
+long __builtin_alpha_extwh (long, long);
+long __builtin_alpha_extlh (long, long);
+long __builtin_alpha_extqh (long, long);
+long __builtin_alpha_insbl (long, long);
+long __builtin_alpha_inswl (long, long);
+long __builtin_alpha_insll (long, long);
+long __builtin_alpha_insql (long, long);
+long __builtin_alpha_inswh (long, long);
+long __builtin_alpha_inslh (long, long);
+long __builtin_alpha_insqh (long, long);
+long __builtin_alpha_mskbl (long, long);
+long __builtin_alpha_mskwl (long, long);
+long __builtin_alpha_mskll (long, long);
+long __builtin_alpha_mskql (long, long);
+long __builtin_alpha_mskwh (long, long);
+long __builtin_alpha_msklh (long, long);
+long __builtin_alpha_mskqh (long, long);
+long __builtin_alpha_umulh (long, long);
+long __builtin_alpha_zap (long, long);
+long __builtin_alpha_zapnot (long, long);
 @end smallexample
 
 The following built-in functions are always with @option{-mmax}
@@ -14627,19 +14627,19 @@ later.  They all generate the machine instruction that is part
 of the name.
 
 @smallexample
-long __builtin_alpha_pklb (long)
-long __builtin_alpha_pkwb (long)
-long __builtin_alpha_unpkbl (long)
-long __builtin_alpha_unpkbw (long)
-long __builtin_alpha_minub8 (long, long)
-long __builtin_alpha_minsb8 (long, long)
-long __builtin_alpha_minuw4 (long, long)
-long __builtin_alpha_minsw4 (long, long)
-long __builtin_alpha_maxub8 (long, long)
-long __builtin_alpha_maxsb8 (long, long)
-long __builtin_alpha_maxuw4 (long, long)
-long __builtin_alpha_maxsw4 (long, long)
-long __builtin_alpha_perr (long, long)
+long __builtin_alpha_pklb (long);
+long __builtin_alpha_pkwb (long);
+long __builtin_alpha_unpkbl (long);
+long __builtin_alpha_unpkbw (long);
+long __builtin_alpha_minub8 (long, long);
+long __builtin_alpha_minsb8 (long, long);
+long __builtin_alpha_minuw4 (long, long);
+long __builtin_alpha_minsw4 (long, long);
+long __builtin_alpha_maxub8 (long, long);
+long __builtin_alpha_maxsb8 (long, long);
+long __builtin_alpha_maxuw4 (long, long);
+long __builtin_alpha_maxsw4 (long, long);
+long __builtin_alpha_perr (long, long);
 @end smallexample
 
 The following built-in functions are always with @option{-mcix}
@@ -14648,9 +14648,9 @@ later.  They all generate the machine instruction that is part
 of the name.
 
 @smallexample
-long __builtin_alpha_cttz (long)
-long __builtin_alpha_ctlz (long)
-long __builtin_alpha_ctpop (long)
+long __builtin_alpha_cttz (long);
+long __builtin_alpha_ctlz (long);
+long __builtin_alpha_ctpop (long);
 @end smallexample
 
 The following built-in functions are available on systems that use the OSF/1
@@ -14659,8 +14659,8 @@ PAL calls, but when invoked with @option{-mtls-kernel}, they invoke
 @code{rdval} and @code{wrval}.
 
 @smallexample
-void *__builtin_thread_pointer (void)
-void __builtin_set_thread_pointer (void *)
+void *__builtin_thread_pointer (void);
+void __builtin_set_thread_p

Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.

2021-12-22 Thread Uros Bizjak via Gcc-patches
On Wed, Dec 22, 2021 at 11:26 AM Uros Bizjak  wrote:
>
> On Wed, Dec 22, 2021 at 10:26 AM Roger Sayle wrote:
> >
> >
> > Hi Uros,
> > Would you consider the following variant that disables this optimization
> > when a red zone is used by the current function?  You're right that cfun's
> > red_zone_size is recalculated dynamically, but ix86_red_zone_used should
> > be a better "gate" given that this logic resides very late during
> > compilation, in the output templates, where whether or not a red zone is
> > used is known.
> >
> > On CSiBE, disabling this optimization in non-leaf functions that use a red
> > zone costs 219 bytes, but remains a significant win over -Os.  (Alas the
> > absolute numbers aren't comparable as this testing included the 0/-1
> > write to memory changes).
> >
> > Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and make -k
> > check with no new failures.
> >
> > 2021-12-22  Roger Sayle  
> >
> > gcc/ChangeLog
> > PR target/103773
> > * config/i386/i386.md (*movdi_internal): Only use short
> > push/pop sequence for register (non-memory) destinations
> > when the current function doesn't make use of a red zone.
> > (*movsi_internal): Likewise.
> >
> > gcc/testsuite/ChangeLog
> > PR target/103773
> > * gcc.target/i386/pr103773.c: New test case.
> >
> > Please let me know what you think.  I'll revert, if this tweak doesn't
> > address your concerns.
>
> Yes, using ix86_red_zone_used looks safe.
>
> OTOH, is there a reason the transformation is not implemented via
> peephole2 pass? IIRC, frame is stable after pro_and_epilogue_pass, and
> peephole2 pass is instanced well after register allocation.

Something like the attached patch.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 58b10643fcb..e5d603f0025 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2514,6 +2514,24 @@
   ]
   (symbol_ref "true")))])
 
+(define_peephole2
+  [(set (match_operand:SWI48 0 "general_reg_operand")
+   (match_operand:SWI48 1 "const_int_operand"))]
+  "optimize_insn_for_size_p () && optimize_size > 1
+   && IN_RANGE (INTVAL (operands[1]), -128, 127)
+   && !ix86_red_zone_used"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 0) (match_dup 3))]
+{
+  if (GET_MODE (operands[0]) != word_mode)
+operands[0] = gen_rtx_REG (word_mode, REGNO (operands[0]));
+
+  operands[2] = gen_rtx_MEM (word_mode,
+gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx));
+  operands[3] = gen_rtx_MEM (word_mode,
+gen_rtx_POST_INC (Pmode, stack_pointer_rtx));
+})
+
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand"
 "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m")


Re: [patch, Fortran] Make REAL(KIND=16) detection more robust

2021-12-22 Thread FX via Gcc-patches
Thanks Thomas, pushed as 228173565eafbe34e44c1600c32e32a323eb5aab



228173565eafbe34e44c1600c32e32a323eb5aab.patch
Description: Binary data


[PATCH] Fix typo in type verification.

2021-12-22 Thread Martin Liška

Hello.

The patch is a quite obvious fix.

The patch bootstraps on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR ipa/103786

gcc/ChangeLog:

* tree.c (verify_type): Fix typo.
---
 gcc/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree.c b/gcc/tree.c
index 72cceda568f..0741e3b01af 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -13530,7 +13530,7 @@ verify_type (const_tree t)
   tree ct = TYPE_CANONICAL (t);
   if (!ct)
 ;
-  else if (TYPE_CANONICAL (t) != ct)
+  else if (TYPE_CANONICAL (ct) != ct)
 {
   error ("%<TYPE_CANONICAL%> has different %<TYPE_CANONICAL%>");
   debug_tree (ct);
--
2.34.1
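
Why the old test could never fire is easy to see in isolation.  The sketch
below uses a hypothetical struct toy_type in place of GCC's tree nodes; only
the shape of the two comparisons is taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree type node. */
struct toy_type {
  const struct toy_type *canonical; /* models TYPE_CANONICAL; NULL if none */
};

/* Check before the fix: compares TYPE_CANONICAL (t) with a copy of itself,
   so the result is always false. */
static bool canonical_mismatch_old(const struct toy_type *t)
{
  assert(t != NULL);
  const struct toy_type *ct = t->canonical;
  return ct != NULL && t->canonical != ct;
}

/* Check after the fix: the canonical type must be its own canonical type. */
static bool canonical_mismatch_new(const struct toy_type *t)
{
  assert(t != NULL);
  const struct toy_type *ct = t->canonical;
  return ct != NULL && ct->canonical != ct;
}
```

A type whose canonical type points at some third type is accepted by the old
check and rejected by the new one.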



[PATCH] docs: replace http:// with https://

2021-12-22 Thread Martin Liška

I replaced and verified http:// links for various domains.

Ready to be installed?
Thanks,
Martin

gcc/ada/ChangeLog:

* doc/share/gnu_free_documentation_license.rst: Replace http:// with https.
* gnat-style.texi: Likewise.
* gnat_rm.texi: Likewise.
* gnat_ugn.texi: Likewise.

gcc/d/ChangeLog:

* gdc.texi: Replace http:// with https.

gcc/ChangeLog:

* doc/contrib.texi: Replace http:// with https.
* doc/contribute.texi: Likewise.
* doc/extend.texi: Likewise.
* doc/gccint.texi: Likewise.
* doc/gnu.texi: Likewise.
* doc/implement-c.texi: Likewise.
* doc/implement-cxx.texi: Likewise.
* doc/include/fdl.texi: Likewise.
* doc/include/gpl_v3.texi: Likewise.
* doc/install.texi: Likewise.
* doc/invoke.texi: Likewise.
* doc/passes.texi: Likewise.
* doc/service.texi: Likewise.
* doc/sourcebuild.texi: Likewise.
* doc/standards.texi: Likewise.

gcc/fortran/ChangeLog:

* gfortran.texi: Replace http:// with https.
* intrinsic.texi: Likewise.

gcc/go/ChangeLog:

* gccgo.texi: Replace http:// with https.

gcc/jit/ChangeLog:

* docs/_build/texinfo/libgccjit.texi: Replace http:// with https.
* docs/cp/index.rst: Likewise.
* docs/cp/intro/index.rst: Likewise.
* docs/cp/intro/tutorial01.rst: Likewise.
* docs/cp/intro/tutorial02.rst: Likewise.
* docs/cp/intro/tutorial03.rst: Likewise.
* docs/cp/intro/tutorial04.rst: Likewise.
* docs/cp/topics/asm.rst: Likewise.
* docs/cp/topics/compilation.rst: Likewise.
* docs/cp/topics/contexts.rst: Likewise.
* docs/cp/topics/expressions.rst: Likewise.
* docs/cp/topics/functions.rst: Likewise.
* docs/cp/topics/index.rst: Likewise.
* docs/cp/topics/locations.rst: Likewise.
* docs/cp/topics/objects.rst: Likewise.
* docs/cp/topics/types.rst: Likewise.
* docs/index.rst: Likewise.
* docs/internals/index.rst: Likewise.
* docs/intro/index.rst: Likewise.
* docs/intro/tutorial01.rst: Likewise.
* docs/intro/tutorial02.rst: Likewise.
* docs/intro/tutorial03.rst: Likewise.
* docs/intro/tutorial04.rst: Likewise.
* docs/intro/tutorial05.rst: Likewise.
* docs/topics/asm.rst: Likewise.
* docs/topics/compatibility.rst: Likewise.
* docs/topics/compilation.rst: Likewise.
* docs/topics/contexts.rst: Likewise.
* docs/topics/expressions.rst: Likewise.
* docs/topics/function-pointers.rst: Likewise.
* docs/topics/functions.rst: Likewise.
* docs/topics/index.rst: Likewise.
* docs/topics/locations.rst: Likewise.
* docs/topics/objects.rst: Likewise.
* docs/topics/performance.rst: Likewise.
* docs/topics/types.rst: Likewise.
---
 .../share/gnu_free_documentation_license.rst  |  4 +-
 gcc/ada/gnat-style.texi   |  4 +-
 gcc/ada/gnat_rm.texi  |  4 +-
 gcc/ada/gnat_ugn.texi |  4 +-
 gcc/d/gdc.texi| 10 +-
 gcc/doc/contrib.texi  |  2 +-
 gcc/doc/contribute.texi   | 10 +-
 gcc/doc/extend.texi   |  4 +-
 gcc/doc/gccint.texi   |  2 +-
 gcc/doc/gnu.texi  |  4 +-
 gcc/doc/implement-c.texi  |  2 +-
 gcc/doc/implement-cxx.texi|  2 +-
 gcc/doc/include/fdl.texi  |  6 +-
 gcc/doc/include/gpl_v3.texi   |  6 +-
 gcc/doc/install.texi  | 32 +++
 gcc/doc/invoke.texi   | 10 +-
 gcc/doc/passes.texi   |  2 +-
 gcc/doc/service.texi  |  2 +-
 gcc/doc/sourcebuild.texi  |  2 +-
 gcc/doc/standards.texi|  6 +-
 gcc/fortran/gfortran.texi | 14 +--
 gcc/fortran/intrinsic.texi|  4 +-
 gcc/go/gccgo.texi |  4 +-
 gcc/jit/docs/_build/texinfo/libgccjit.texi| 96 +--
 gcc/jit/docs/cp/index.rst |  4 +-
 gcc/jit/docs/cp/intro/index.rst   |  2 +-
 gcc/jit/docs/cp/intro/tutorial01.rst  |  2 +-
 gcc/jit/docs/cp/intro/tutorial02.rst  |  2 +-
 gcc/jit/docs/cp/intro/tutorial03.rst  |  2 +-
 gcc/jit/docs/cp/intro/tutorial04.rst  |  2 +-
 gcc/jit/docs/cp/topics/asm.rst|  2 +-
 gcc/jit/docs/cp/topics/compilation.rst|  2 +-
 gcc/jit/docs/cp/topics/contexts.rst   |  2 +-
 gcc/jit/docs/cp/topics/expressions.rst|  2 +-
 gcc/jit/docs/cp/topics/functions.rst  |  2 +-
 gcc/jit/docs/cp/topics/index.rst  |  2 +-
 gcc/jit/docs/cp/topics/locations.rst  |  2 +-
 gcc/jit

[OG11][PATCH] OpenMP: Ensure that offloaded variables are public

2021-12-22 Thread Andrew Stubbs

This is now backported to the devel/omp/gcc-11 branch (OG11).

Andrew

On 09/12/2021 11:41, Andrew Stubbs wrote:

On 02/12/2021 16:43, Jakub Jelinek wrote:

On Thu, Dec 02, 2021 at 04:31:36PM +, Andrew Stubbs wrote:

On 02/12/2021 16:05, Andrew Stubbs wrote:

On 02/12/2021 12:58, Jakub Jelinek wrote:

I've tried modifying offload_handle_link_vars but that spot
doesn't catch
the omp_data_sizes variables emitted by
libgomp.c-c++-common/target_42.c,
which was one of the motivating examples.


Why doesn't it catch it?  Is the variable created only post-IPA?
I'd think that it should have been created before IPA, streamed and
therefore I don't understand why you don't see it after streaming LTO in.


On closer inspection it does, in fact, catch it as you'd expect, but
then the variable is no longer marked public when it gets to
pass_omp_target_link::execute, so something somewhere is resetting it.
More investigation is needed.


The "whole-program" pass is removing the public flag. That's probably
working as intended, and I assume it is run for offload code on purpose?


So you'd stick it somewhere into e.g. symbol_table::compile
after ipa_passes call, guarded with #ifdef ACCEL_COMPILER ?


I've given up on this approach, and switched to loading the symbol 
addresses from the table directly. The relocation issues that I had with 
older assemblers/linkers do not seem to be a problem any more.


This patch requires only a single symbol to be forced global, and since 
that's one that I create in mkoffload there is no issue with previous 
definitions.


I think I can approve this myself, but if you have any observations I'm 
happy to hear them.


Andrew




Re: [PATCH] docs: replace http:// with https://

2021-12-22 Thread Iain Buclaw via Gcc-patches
Excerpts from Martin Liška's message of Dezember 22, 2021 1:57 pm:
> I replaced and verified http:// links for various domains.
> 
> Ready to be installed?
> Thanks,
> Martin
> 

Hi,

> gcc/d/ChangeLog:
> 
>   * gdc.texi: Replace http:// with https.
> 
> ---
>   gcc/d/gdc.texi| 10 +-
> 

OK for the D front-end docs change.

Iain.

> diff --git a/gcc/d/gdc.texi b/gcc/d/gdc.texi
> index bfec1568857..d93d2e8001a 100644
> --- a/gcc/d/gdc.texi
> +++ b/gcc/d/gdc.texi
> @@ -326,14 +326,14 @@ values are supported:
>   @item all
>   Turns on all upcoming D language features.
>   @item dip1000
> -Implements @uref{http://wiki.dlang.org/DIP1000} (Scoped pointers).
> +Implements @uref{https://wiki.dlang.org/DIP1000} (Scoped pointers).
>   @item dip1008
> -Implements @uref{http://wiki.dlang.org/DIP1008} (Allow exceptions in
> +Implements @uref{https://wiki.dlang.org/DIP1008} (Allow exceptions in
>   @code{@@nogc} code).
>   @item dip1021
> -Implements @uref{http://wiki.dlang.org/DIP1021} (Mutable function arguments).
> +Implements @uref{https://wiki.dlang.org/DIP1021} (Mutable function arguments).
>   @item dip25
> -Implements @uref{http://wiki.dlang.org/DIP25} (Sealed references).
> +Implements @uref{https://wiki.dlang.org/DIP25} (Sealed references).
>   @item dtorfields
>   Turns on generation for destructing fields of partially constructed objects.
>   @item fieldwise
> @@ -383,7 +383,7 @@ are supported:
>   @item all
>   Turns off all revertable D language features.
>   @item dip25
> -Reverts @uref{http://wiki.dlang.org/DIP25} (Sealed references).
> +Reverts @uref{https://wiki.dlang.org/DIP25} (Sealed references).
>   @item dtorfields
>   Turns off generation for destructing fields of partially constructed objects.
>   @item markdown



Re: [PATCH 1/2][GCC] arm: Move arm_simd_info array declaration into header

2021-12-22 Thread Richard Earnshaw via Gcc-patches




On 24/11/2021 12:18, Richard Earnshaw via Gcc-patches wrote:



On 24/11/2021 12:15, Murray Steele wrote:

On 18/11/2021 15:40, Richard Earnshaw wrote:



On 16/11/2021 10:14, Murray Steele via Gcc-patches wrote:

Hi all,

This patch moves the arm_simd_type and arm_type_qualifiers enums, and
arm_simd_info struct from arm-builtins.c into arm-builtins.h header.

This is a first step towards internalising the type definitions for MVE
predicate, vector, and tuple types.  By moving arm_simd_types into a
header, we allow future patches to use these type trees externally to
arm-builtins.c, which is a crucial step towards developing an MVE
intrinsics framework similar to the current SVE implementation.

Thanks,
Murray

gcc/ChangeLog:

 * config/arm/arm-builtins.c (enum arm_type_qualifiers): Move to
 arm-builtins.h
 (enum arm_simd_type): Move to arm-builtins.h
 (struct arm_simd_type_info): Move to arm-builtins.h
 * config/arm/arm-builtins.h (enum arm_simd_type): Move from
 arm-builtins.c
 (enum arm_type_qualifiers): Move from arm-builtins.c
 (struct arm_simd_type_info): Move from arm-builtins.c





OK.

R.


Hi Richard,

I don't currently have write access, so I will need this patch 
committed on my behalf.


Thanks again,
Murray



That can be done when 2/2 patch has been resolved.  They need to go in 
together.


R.


Now pushed.

R.


Re: [PATCH v3 2/2][GCC] arm: Declare MVE types internally via pragma

2021-12-22 Thread Richard Earnshaw via Gcc-patches




On 09/12/2021 15:24, Murray Steele via Gcc-patches wrote:

Changes from original patch:

1. Make mentioned changes to changelog.
2. Add namespace-end comments.
3. Add #error for when arm-mve-builtins.def is included without
defining DEF_MVE_TYPE.
4. Make placement of '#undef DEF_MVE_TYPE' consistent.

---

This patch moves the implementation of MVE ACLE types from
arm_mve_types.h to inside GCC via a new pragma, which replaces the prior
type definitions. This allows for the types to be used internally for
intrinsic function definitions.

Bootstrapped and regression tested on arm-none-linux-gnuabihf, and
regression tested on arm-eabi -- no issues.

Thanks,
Murray

gcc/ChangeLog:

 * config.gcc: Add arm-mve-builtins.o to extra_objs.
 * config/arm/arm-c.c (arm_pragma_arm): Handle "#pragma GCC arm".
 (arm_register_target_pragmas): Register it.
 * config/arm/arm-protos.h: (arm_mve::arm_handle_mve_types_h): New
 prototype.
 * config/arm/arm_mve_types.h: Replace MVE type definitions with
 new pragma.
 * config/arm/t-arm: (arm-mve-builtins.o): New target rule.
 * config/arm/arm-mve-builtins.cc: New file.
 * config/arm/arm-mve-builtins.def: New file.
 * config/arm/arm-mve-builtins.h: New file.

gcc/testsuite/ChangeLog:

 * gcc.target/arm/mve/mve.exp: Add new subdirectories.
 * gcc.target/arm/mve/general-c/type_redef_1.c: New test.
 * gcc.target/arm/mve/general/double_pragmas_1.c: New test.
 * gcc.target/arm/mve/general/nomve_1.c: New test.



I fixed a minor issue in the changelog (config.gcc needs to mention 
arm*-*-* as the 'function') and pushed this.


Thanks,

R.


[PATCH][GCC] arm: fix __arm_vld1q_z* and __arm_vst1q_p* intrinsics.

2021-12-22 Thread Murray Steele via Gcc-patches
Hi All,

This patch fixes the implementation of the existing __arm_vld1q_z* and
__arm_vst1q_p* MVE intrinsic functions.

The MVE ACLE allows for __ARM_MVE_PRESERVE_USER_NAMESPACE to be defined,
which removes definitions for intrinsic functions without the __arm_
prefix. __arm_vld1q_z* and __arm_vst1q_p* are currently implemented via
calls to vldr* and vstr*, which results in several compile-time errors when
__ARM_MVE_PRESERVE_USER_NAMESPACE is defined. This patch replaces these
with calls to their prefixed counterparts, __arm_vldr* and __arm_vstr*,
and adds a test covering the definition of __ARM_MVE_PRESERVE_USER_NAMESPACE.
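
A minimal sketch of the failure mode, not the real header: lib_vstrbq, lib_vst1q and SKETCH_PRESERVE_USER_NAMESPACE are hypothetical stand-ins for __arm_vstrbq_p_u8, __arm_vst1q_p_u8 and __ARM_MVE_PRESERVE_USER_NAMESPACE, and the alias is assumed to be a macro. The point is only that a wrapper has to call the prefixed name, because the unprefixed alias may not exist.

```cpp
#include <cstdint>

// Hypothetical stand-in for __ARM_MVE_PRESERVE_USER_NAMESPACE.
#define SKETCH_PRESERVE_USER_NAMESPACE 1

// Prefixed name (models __arm_vstrbq_p_u8): always declared by the header.
inline void lib_vstrbq(std::uint8_t *addr, std::uint8_t value) { *addr = value; }

#ifndef SKETCH_PRESERVE_USER_NAMESPACE
// Unprefixed alias (models vstrbq_p_u8): hidden when the user asks the
// header to stay out of their namespace.
#define vstrbq(addr, value) lib_vstrbq((addr), (value))
#endif

// Wrapper (models __arm_vst1q_p_u8).  Calling vstrbq() here would fail to
// compile whenever the alias above is hidden, so it uses the prefixed name.
inline void lib_vst1q(std::uint8_t *addr, std::uint8_t value) {
  lib_vstrbq(addr, value);
}

// With the alias hidden, user code may reuse the unprefixed name freely.
inline int vstrbq(int x) { return x + 1; }
```

With the stand-in macro defined, the wrapper still works and the user's own vstrbq(int) does not collide with anything from the header.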

Regression tested on arm-eabi -- no issues.

Thanks,
Murray

gcc/ChangeLog:

* config/arm/arm_mve.h (__arm_vst1q_p_u8): Use prefixed intrinsic
function.
(__arm_vst1q_p_s8): Likewise.
(__arm_vld1q_z_u8): Likewise.
(__arm_vld1q_z_s8): Likewise.
(__arm_vst1q_p_u16): Likewise.
(__arm_vst1q_p_s16): Likewise.
(__arm_vld1q_z_u16): Likewise.
(__arm_vld1q_z_s16): Likewise.
(__arm_vst1q_p_u32): Likewise.
(__arm_vst1q_p_s32): Likewise.
(__arm_vld1q_z_u32): Likewise.
(__arm_vld1q_z_s32): Likewise.
(__arm_vld1q_z_f16): Likewise.
(__arm_vst1q_p_f16): Likewise.
(__arm_vld1q_z_f32): Likewise.
(__arm_vst1q_p_f32): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/general/preserve_user_namespace_1.c: New test.

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index e04d46218d03effdf0cb79471108cd2f24e92dec..708f5c71fddfc2cab0b0456e0b8724c803544ddc 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -16171,14 +16171,14 @@ __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vst1q_p_u8 (uint8_t * __addr, uint8x16_t __value, mve_pred16_t __p)
 {
-  return vstrbq_p_u8 (__addr, __value, __p);
+  return __arm_vstrbq_p_u8 (__addr, __value, __p);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vst1q_p_s8 (int8_t * __addr, int8x16_t __value, mve_pred16_t __p)
 {
-  return vstrbq_p_s8 (__addr, __value, __p);
+  return __arm_vstrbq_p_s8 (__addr, __value, __p);
 }
 
 __extension__ extern __inline void
@@ -16203,14 +16203,14 @@ __extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vld1q_z_u8 (uint8_t const *__base, mve_pred16_t __p)
 {
-  return vldrbq_z_u8 ( __base, __p);
+  return __arm_vldrbq_z_u8 ( __base, __p);
 }
 
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vld1q_z_s8 (int8_t const *__base, mve_pred16_t __p)
 {
-  return vldrbq_z_s8 ( __base, __p);
+  return __arm_vldrbq_z_s8 ( __base, __p);
 }
 
 __extension__ extern __inline int8x16x2_t
@@ -16253,14 +16253,14 @@ __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vst1q_p_u16 (uint16_t * __addr, uint16x8_t __value, mve_pred16_t __p)
 {
-  return vstrhq_p_u16 (__addr, __value, __p);
+  return __arm_vstrhq_p_u16 (__addr, __value, __p);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vst1q_p_s16 (int16_t * __addr, int16x8_t __value, mve_pred16_t __p)
 {
-  return vstrhq_p_s16 (__addr, __value, __p);
+  return __arm_vstrhq_p_s16 (__addr, __value, __p);
 }
 
 __extension__ extern __inline void
@@ -16285,14 +16285,14 @@ __extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vld1q_z_u16 (uint16_t const *__base, mve_pred16_t __p)
 {
-  return vldrhq_z_u16 ( __base, __p);
+  return __arm_vldrhq_z_u16 ( __base, __p);
 }
 
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vld1q_z_s16 (int16_t const *__base, mve_pred16_t __p)
 {
-  return vldrhq_z_s16 ( __base, __p);
+  return __arm_vldrhq_z_s16 ( __base, __p);
 }
 
 __extension__ extern __inline int16x8x2_t
@@ -16335,14 +16335,14 @@ __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vst1q_p_u32 (uint32_t * __addr, uint32x4_t __value, mve_pred16_t __p)
 {
-  return vstrwq_p_u32 (__addr, __value, __p);
+  return __arm_vstrwq_p_u32 (__addr, __value, __p);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vst1q_p_s32 (int32_t * __addr, int32x4_t __value, mve_pred16_t __p)
 {
-  return vstrwq_p_s32 (__addr, __value, __p);
+  return __arm_vstrwq_p_s32 (__addr, __value, __p);
 }
 
 __extension__ extern __inline void
@@ -16367,14 +16367,14 @@ __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vld1q_z_u32 (uint32_t const *__base, mve_pred16_t __

Re: [PATCH][GCC] arm: fix __arm_vld1q_z* and __arm_vst1q_p* intrinsics.

2021-12-22 Thread Richard Earnshaw via Gcc-patches
On 22/12/2021 15:55, Murray Steele via Gcc-patches wrote:
> Hi All,
> 
> This patch fixes the implementation of the existing __arm_vld1q_z* and
> __arm_vst1q_p* MVE intrinsic functions.
> 
> The MVE ACLE allows for __ARM_MVE_PRESERVE_USER_NAMESPACE to be defined,
> which removes definitions for intrinsic functions without the __arm_
> prefix. __arm_vld1q_z* and __arm_vst1q_p* are currently implemented via
> calls to vldr* and vstr*, which results in several compile-time errors when
> __ARM_MVE_PRESERVE_USER_NAMESPACE is defined. This patch replaces these
> with calls to their prefixed counterparts, __arm_vldr* and __arm_vstr*,
> and adds a test covering the definition of __ARM_MVE_PRESERVE_USER_NAMESPACE.

Is there a PR in bugzilla for this?

R.

> 
> Regression tested on arm-eabi -- no issues.
> 
> Thanks,
> Murray
> 
> gcc/ChangeLog:
> 
> * config/arm/arm_mve.h (__arm_vst1q_p_u8): Use prefixed intrinsic
> function.
> (__arm_vst1q_p_s8): Likewise.
> (__arm_vld1q_z_u8): Likewise.
> (__arm_vld1q_z_s8): Likewise.
> (__arm_vst1q_p_u16): Likewise.
> (__arm_vst1q_p_s16): Likewise.
> (__arm_vld1q_z_u16): Likewise.
> (__arm_vld1q_z_s16): Likewise.
> (__arm_vst1q_p_u32): Likewise.
> (__arm_vst1q_p_s32): Likewise.
> (__arm_vld1q_z_u32): Likewise.
> (__arm_vld1q_z_s32): Likewise.
> (__arm_vld1q_z_f16): Likewise.
> (__arm_vst1q_p_f16): Likewise.
> (__arm_vld1q_z_f32): Likewise.
> (__arm_vst1q_p_f32): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/mve/general/preserve_user_namespace_1.c: New test.
> 



Re: [PATCH] Fix typo in type verification.

2021-12-22 Thread Richard Biener via Gcc-patches
On December 22, 2021 1:03:18 PM GMT+01:00, "Martin Liška"  
wrote:
>Hello.
>
>The patch is quite obvious fix.
>
>Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
>Ready to be installed?

Ok. 

Richard. 

>Thanks,
>Martin
>
>   PR ipa/103786
>
>gcc/ChangeLog:
>
>   * tree.c (verify_type): Fix typo.
>---
>  gcc/tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/gcc/tree.c b/gcc/tree.c
>index 72cceda568f..0741e3b01af 100644
>--- a/gcc/tree.c
>+++ b/gcc/tree.c
>@@ -13530,7 +13530,7 @@ verify_type (const_tree t)
>tree ct = TYPE_CANONICAL (t);
>if (!ct)
>  ;
>-  else if (TYPE_CANONICAL (t) != ct)
>+  else if (TYPE_CANONICAL (ct) != ct)
>  {
>error ("%<TYPE_CANONICAL%> has different %<TYPE_CANONICAL%>");
>debug_tree (ct);



Re: [PATCH][GCC] arm: fix __arm_vld1q_z* and __arm_vst1q_p* intrinsics.

2021-12-22 Thread Murray Steele via Gcc-patches
Hi,

On 22/12/2021 16:04, Richard Earnshaw wrote:

> 
> Is there a PR in bugzilla for this?
> 
> R.
> 


No, not at this time. It's something I came across whilst
making changes of my own.

For completeness, the ACLE specification I am referencing
has been added below [1].

[1]: https://github.com/ARM-software/acle/releases/tag/r2021Q3

Thanks,
Murray


[PATCH] c++: hard error w/ ptr+CST and incomplete type [PR103700]

2021-12-22 Thread Patrick Palka via Gcc-patches
In pointer_int_sum when called from a SFINAE context, we need to avoid
calling size_in_bytes_loc on an incomplete pointed-to type since this
latter function isn't SFINAE-friendly and always emits an error in this
case.
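
The control flow can be modelled as follows. This is only a sketch under stated assumptions: TypeModel, kErrorMark, size_in_bytes_model and scaled_offset_model are hypothetical names that stand in for a tree type, error_mark_node, size_in_bytes_loc and pointer_int_sum; none of them are GCC's real interfaces.

```cpp
#include <cstdio>

// Hypothetical model of a type: only completeness and size matter here.
struct TypeModel {
  bool complete;
  long size;
};

constexpr long kErrorMark = -1;  // models error_mark_node

static int g_diagnostics = 0;    // counts diagnostics that were emitted

// Models size_in_bytes_loc: it always diagnoses an incomplete type.
long size_in_bytes_model(const TypeModel &type) {
  if (!type.complete) {
    ++g_diagnostics;
    std::fprintf(stderr, "error: invalid use of incomplete type\n");
    return kErrorMark;
  }
  return type.size;
}

// Models the patched pointer_int_sum: scale an integer offset by the
// pointee size, failing quietly when complain is false.
long scaled_offset_model(const TypeModel &pointee, long n, bool complain) {
  if (!complain && !pointee.complete)
    return kErrorMark;  // quiet failure, no diagnostic
  long size = size_in_bytes_model(pointee);
  if (size == kErrorMark)
    return kErrorMark;
  return n * size;
}
```

In quiet mode the incomplete case returns the error marker before the diagnosing helper is reached, which is what lets a SFINAE context discard the candidate instead of failing the compilation.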

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 11?  pointer_int_sum is also used in the C FE, but
always with the complain parameter defaulted to true so this change
should have no effect there AFAICT.

PR c++/103700

gcc/c-family/ChangeLog:

* c-common.c (pointer_int_sum): When quiet, return
error_mark_node for an incomplete type and avoid calling
size_in_bytes_loc.

gcc/testsuite/ChangeLog:

* g++.dg/template/sfinae32.C: New test.
---
 gcc/c-family/c-common.c  |  2 ++
 gcc/testsuite/g++.dg/template/sfinae32.C | 17 +
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/template/sfinae32.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index a25d59fa77b..f3e3e9ba0a5 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -3308,6 +3308,8 @@ pointer_int_sum (location_t loc, enum tree_code 
resultcode,
 size_exp = integer_one_node;
   else
 {
+  if (!complain && !COMPLETE_TYPE_P (TREE_TYPE (result_type)))
+   return error_mark_node;
   size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
   /* Wrap the pointer expression in a SAVE_EXPR to make sure it
 is evaluated first when the size expression may depend
diff --git a/gcc/testsuite/g++.dg/template/sfinae32.C 
b/gcc/testsuite/g++.dg/template/sfinae32.C
new file mode 100644
index 000..488bf145e21
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/sfinae32.C
@@ -0,0 +1,17 @@
+// PR c++/103700
+// { dg-do compile { target c++11 } }
+
+template auto f() -> decltype(*p + N) = delete;
+template auto f() -> decltype(*p - N) = delete;
+template auto f() -> decltype(N + *p) = delete;
+template void f();
+
+struct Incomplete *p;
+
+int main() {
+  f();
+  f();
+  f();
+  f();
+  f();
+}
-- 
2.34.1.363.g597af311a2



Re: [PATCH] c++: hard error w/ ptr+CST and incomplete type [PR103700]

2021-12-22 Thread Jason Merrill via Gcc-patches

On 12/22/21 12:39, Patrick Palka wrote:

In pointer_int_sum when called from a SFINAE context, we need to avoid
calling size_in_bytes_loc on an incomplete pointed-to type since this
latter function isn't SFINAE-friendly and always emits an error in this
case.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 11?  pointer_int_sum is also used in the C FE, but
always with the complain parameter defaulted to true so this change
should have no effect there AFAICT.


LGTM, but let's give the C maintainers time to comment; OK on Friday if 
no comment.



PR c++/103700

gcc/c-family/ChangeLog:

* c-common.c (pointer_int_sum): When quiet, return
error_mark_node for an incomplete type and avoid calling
size_in_bytes_loc.

gcc/testsuite/ChangeLog:

* g++.dg/template/sfinae32.C: New test.
---
  gcc/c-family/c-common.c  |  2 ++
  gcc/testsuite/g++.dg/template/sfinae32.C | 17 +
  2 files changed, 19 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/template/sfinae32.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index a25d59fa77b..f3e3e9ba0a5 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -3308,6 +3308,8 @@ pointer_int_sum (location_t loc, enum tree_code 
resultcode,
  size_exp = integer_one_node;
else
  {
+  if (!complain && !COMPLETE_TYPE_P (TREE_TYPE (result_type)))
+   return error_mark_node;
size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
/* Wrap the pointer expression in a SAVE_EXPR to make sure it
 is evaluated first when the size expression may depend
diff --git a/gcc/testsuite/g++.dg/template/sfinae32.C 
b/gcc/testsuite/g++.dg/template/sfinae32.C
new file mode 100644
index 000..488bf145e21
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/sfinae32.C
@@ -0,0 +1,17 @@
+// PR c++/103700
+// { dg-do compile { target c++11 } }
+
+template auto f() -> decltype(*p + N) = delete;
+template auto f() -> decltype(*p - N) = delete;
+template auto f() -> decltype(N + *p) = delete;
+template void f();
+
+struct Incomplete *p;
+
+int main() {
+  f();
+  f();
+  f();
+  f();
+  f();
+}




[PATCH] libsanitizer: Fix setbuffer() interceptor (accept size not mode)

2021-12-22 Thread Azat Khuzhin via Gcc-patches
Fixes: b667dd7017a ("Libsanitizer merge from trunk r368656.")
Refs: https://reviews.llvm.org/D116176
---
 .../sanitizer_common/sanitizer_common_interceptors.inc | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 
b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index abb38ccfa15..60b0545a943 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -7858,12 +7858,13 @@ INTERCEPTOR(void, setbuf, __sanitizer_FILE *stream, 
char *buf) {
   unpoison_file(stream);
 }
 
-INTERCEPTOR(void, setbuffer, __sanitizer_FILE *stream, char *buf, int mode) {
+INTERCEPTOR(void, setbuffer, __sanitizer_FILE *stream, char *buf,
+  SIZE_T size) {
   void *ctx;
-  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, mode);
+  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, size);
   REAL(setbuffer)(stream, buf, mode);
   if (buf) {
-COMMON_INTERCEPTOR_WRITE_RANGE(ctx, buf, __sanitizer_bufsiz);
+COMMON_INTERCEPTOR_WRITE_RANGE(ctx, buf, size);
   }
   if (stream)
 unpoison_file(stream);
-- 
2.33.1



Re: [PATCH] libsanitizer: Fix setbuffer() interceptor (accept size not mode)

2021-12-22 Thread Bernhard Reutner-Fischer via Gcc-patches
On 22 December 2021 19:19:12 CET, Azat Khuzhin via Gcc-patches 
 wrote:
>Fixes: b667dd7017a ("Libsanitizer merge from trunk r368656.")
>Refs: https://reviews.llvm.org/D116176
>---
> .../sanitizer_common/sanitizer_common_interceptors.inc | 7 ---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
>diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 
>b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
>index abb38ccfa15..60b0545a943 100644
>--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
>+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
>@@ -7858,12 +7858,13 @@ INTERCEPTOR(void, setbuf, __sanitizer_FILE *stream, 
>char *buf) {
>   unpoison_file(stream);
> }
> 
>-INTERCEPTOR(void, setbuffer, __sanitizer_FILE *stream, char *buf, int mode) {
>+INTERCEPTOR(void, setbuffer, __sanitizer_FILE *stream, char *buf,
>+  SIZE_T size) {
>   void *ctx;
>-  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, mode);
>+  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, size);
>   REAL(setbuffer)(stream, buf, mode);

Where does mode come from after this patch?
thanks,

>   if (buf) {
>-COMMON_INTERCEPTOR_WRITE_RANGE(ctx, buf, __sanitizer_bufsiz);
>+COMMON_INTERCEPTOR_WRITE_RANGE(ctx, buf, size);
>   }
>   if (stream)
> unpoison_file(stream);



Re: [PATCH] libsanitizer: Fix setbuffer() interceptor (accept size not mode)

2021-12-22 Thread Azat Khuzhin via Gcc-patches
On Wed, Dec 22, 2021 at 09:41:06PM +0100, Bernhard Reutner-Fischer wrote:
> On 22 December 2021 19:19:12 CET, Azat Khuzhin via Gcc-patches 
>  wrote:
> >Fixes: b667dd7017a ("Libsanitizer merge from trunk r368656.")
> >Refs: https://reviews.llvm.org/D116176
> >---
> > .../sanitizer_common/sanitizer_common_interceptors.inc | 7 ---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> >diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 
> >b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
> >index abb38ccfa15..60b0545a943 100644
> >--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
> >+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
> >@@ -7858,12 +7858,13 @@ INTERCEPTOR(void, setbuf, __sanitizer_FILE *stream, 
> >char *buf) {
> >   unpoison_file(stream);
> > }
> > 
> >-INTERCEPTOR(void, setbuffer, __sanitizer_FILE *stream, char *buf, int mode) 
> >{
> >+INTERCEPTOR(void, setbuffer, __sanitizer_FILE *stream, char *buf,
> >+  SIZE_T size) {
> >   void *ctx;
> >-  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, mode);
> >+  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, size);
> >   REAL(setbuffer)(stream, buf, mode);
> 
> Where does mode come from after this patch?

setbuffer() does not accept a mode; it simply does not change it.
Only setvbuf() can change the mode.
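
For reference, a small sketch of the two calls side by side, assuming a glibc-style libc where setbuffer() is declared in <stdio.h>. The helper names write_with_setbuffer and write_with_setvbuf are made up for illustration.

```cpp
#include <stddef.h>
#include <stdio.h>
#include <vector>

// Installs a caller-sized buffer with setbuffer(); the third argument is
// the buffer size in bytes, and no buffering mode is passed.
bool write_with_setbuffer(const char *text, size_t size) {
  FILE *fp = tmpfile();
  if (fp == nullptr)
    return false;
  std::vector<char> buf(size);  // must stay alive until fclose()
  setbuffer(fp, buf.data(), size);
  bool ok = fputs(text, fp) >= 0 && fflush(fp) == 0;
  fclose(fp);
  return ok;
}

// Same idea with setvbuf(), which takes both a mode and a size.
bool write_with_setvbuf(const char *text, size_t size) {
  FILE *fp = tmpfile();
  if (fp == nullptr)
    return false;
  std::vector<char> buf(size);  // must stay alive until fclose()
  bool ok = setvbuf(fp, buf.data(), _IOFBF, size) == 0;
  ok = ok && fputs(text, fp) >= 0 && fflush(fp) == 0;
  fclose(fp);
  return ok;
}
```

Since the third argument of setbuffer() is a byte count, an interceptor that treats it as an int mode describes the wrong range when it marks the buffer as written.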

> thanks,
> 
> >   if (buf) {
> >-COMMON_INTERCEPTOR_WRITE_RANGE(ctx, buf, __sanitizer_bufsiz);
> >+COMMON_INTERCEPTOR_WRITE_RANGE(ctx, buf, size);
> >   }
> >   if (stream)
> > unpoison_file(stream);
> 


[PATCH v2] libsanitizer: Fix setbuffer() interceptor (accept size not mode)

2021-12-22 Thread Azat Khuzhin via Gcc-patches
Fixes: b667dd7017a ("Libsanitizer merge from trunk r368656.")
Refs: https://reviews.llvm.org/D116176
---
 .../sanitizer_common/sanitizer_common_interceptors.inc   | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 
b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index abb38ccfa15..86784768fe5 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -7858,12 +7858,13 @@ INTERCEPTOR(void, setbuf, __sanitizer_FILE *stream, 
char *buf) {
   unpoison_file(stream);
 }
 
-INTERCEPTOR(void, setbuffer, __sanitizer_FILE *stream, char *buf, int mode) {
+INTERCEPTOR(void, setbuffer, __sanitizer_FILE *stream, char *buf,
+  SIZE_T size) {
   void *ctx;
-  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, mode);
-  REAL(setbuffer)(stream, buf, mode);
+  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, size);
+  REAL(setbuffer)(stream, buf, size);
   if (buf) {
-COMMON_INTERCEPTOR_WRITE_RANGE(ctx, buf, __sanitizer_bufsiz);
+COMMON_INTERCEPTOR_WRITE_RANGE(ctx, buf, size);
   }
   if (stream)
 unpoison_file(stream);
-- 
2.33.1



Re: [PATCH] libsanitizer: Fix setbuffer() interceptor (accept size not mode)

2021-12-22 Thread Azat Khuzhin via Gcc-patches
On Wed, Dec 22, 2021 at 09:41:06PM +0100, Bernhard Reutner-Fischer wrote:
> On 22 December 2021 19:19:12 CET, Azat Khuzhin via Gcc-patches 
>  wrote:
> >-  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, mode);
> >+  COMMON_INTERCEPTOR_ENTER(ctx, setbuffer, stream, buf, size);
> >   REAL(setbuffer)(stream, buf, mode);
> 
> Where does mode come from after this patch?

Sorry, missed that.
Fixed and sent v2 patch.

Thanks!


Re: [PATCH] libsanitizer: Fix setbuffer() interceptor (accept size not mode)

2021-12-22 Thread Bernhard Reutner-Fischer via Gcc-patches
On Wed, 22 Dec 2021 23:50:39 +0300
Azat Khuzhin  wrote:

> Thanks!

you're welcome.

You should state how you tested the patch. Please refer to
https://gcc.gnu.org/contribute.html#testing

thanks,


Re: [PATCH] libsanitizer: Fix setbuffer() interceptor (accept size not mode)

2021-12-22 Thread Azat Khuzhin via Gcc-patches
On Wed, Dec 22, 2021 at 10:02:02PM +0100, Bernhard Reutner-Fischer wrote:
> You should state how you tested the patch. Please refer to
> https://gcc.gnu.org/contribute.html#testing

I thought about this, but when gcc syncs changes with upstream [1], it
does not sync tests, even though they were there [2].

  [1]: https://github.com/gcc-mirror/gcc/commit/b667dd7017a8
  [2]: https://github.com/llvm/llvm-project/commit/0c81a62d9d76

That's why I thought that the test can be omitted in this case (I've added a
test for llvm in [3]).

  [3]: https://reviews.llvm.org/D116176

So what is the right way here?
- migrate all tests
- write test only for setbuffer()
- do not add any tests, since they are covered in llvm repo

-- 
Azat.


[PATCH] rs6000: Fix an assertion in update_target_cost_per_stmt [PR103702]

2021-12-22 Thread Kewen.Lin via Gcc-patches
Hi, 

This patch is to fix one wrong assertion which is too aggressive.
Vectorizer can do vec_construct costing for the vector type which
only has one unit.  For the failed case, the passed-in vector type
is "vector(1) int", though it doesn't end up with any construction
eventually.  We have to handle this kind of input in function
rs6000_cost_data::update_target_cost_per_stmt.
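
The shape of the change, as a rough sketch: CostModel and add_elementwise_load are hypothetical names, and the nunits * stmt_cost penalty is a placeholder rather than the formula the rs6000 port actually uses.

```cpp
// Hypothetical cost accumulator; the penalty formula is a placeholder.
struct CostModel {
  unsigned extra_cost = 0;

  // Previously the one-unit case was rejected with an assertion
  // (nunits > 1).  A one-unit vector needs no construction penalty,
  // so the sketch returns early instead of aborting.
  void add_elementwise_load(unsigned nunits, unsigned stmt_cost) {
    if (nunits == 1)
      return;
    extra_cost += nunits * stmt_cost;
  }
};
```

A "vector(1) int" input then leaves the accumulated cost untouched, while wider vectors are penalized as before.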

Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.

Is it ok for trunk?

BR,
Kewen
-
gcc/ChangeLog:

PR target/103702
* config/rs6000/rs6000.c
(rs6000_cost_data::update_target_cost_per_stmt): Fix one wrong
assertion with early return.

gcc/testsuite/ChangeLog:

PR target/103702
* gcc.target/powerpc/pr103702.c: New test.
---
 gcc/config/rs6000/rs6000.c  |  7 --
 gcc/testsuite/gcc.target/powerpc/pr103702.c | 24 +
 2 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103702.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 0b09713b2f5..37f07fe5358 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5461,8 +5461,11 @@ rs6000_cost_data::update_target_cost_per_stmt 
(vect_cost_for_stmt kind,
{
  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
  unsigned int nunits = vect_nunits_for_cost (vectype);
- /* We don't expect strided/elementwise loads for just 1 nunit.  */
- gcc_assert (nunits > 1);
+ /* As PR103702 shows, it's possible that vectorizer wants to do
+costings for only one unit here, it's no need to do any
+penalization for it, so simply early return here.  */
+ if (nunits == 1)
+   return;
  /* i386 port adopts nunits * stmt_cost as the penalized cost
 for this kind of penalization, we used to follow it but
 found it could result in an unreliable body cost especially
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103702.c 
b/gcc/testsuite/gcc.target/powerpc/pr103702.c
new file mode 100644
index 000..585946fd64b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103702.c
@@ -0,0 +1,24 @@
+/* We don't have one powerpc.*_ok for Power6, use altivec_ok conservatively.  
*/
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-mdejagnu-cpu=power6 -O2 -ftree-loop-vectorize 
-fno-tree-scev-cprop" } */
+
+/* Verify there is no ICE.  */
+
+unsigned short a, e;
+int *b, *d;
+int c;
+extern int fn2 ();
+void
+fn1 ()
+{
+  void *f;
+  for (;;)
+{
+  fn2 ();
+  b = f;
+  e = 0;
+  for (; e < a; ++e)
+   b[e] = d[e * c];
+}
+}
+
--
2.27.0



[PATCH] rs6000: Disable MMA if no P9 VECTOR support [PR103627]

2021-12-22 Thread Kewen.Lin via Gcc-patches
Hi,

As PR103627 shows, there is an unexpected case where !TARGET_VSX
and TARGET_MMA co-exist.  As ISA3.1 claims, SIMD is a requirement
for MMA.  By looking into the ICE, I noticed that the current
MMA implementation depends on vector pair load/store, but since
we don't have a separate option to control Power10 vector, this
patch checks for Power9 vector instead.
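
A simplified model of that check, assuming made-up bit values: kMaskMMA, kMaskP9Vector, OptionsModel and override_mma are hypothetical and only mirror the roles of OPTION_MASK_MMA, TARGET_P9_VECTOR, the rs6000_isa_flags pair and rs6000_option_override_internal.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical bit assignments; the real OPTION_MASK_* values differ.
constexpr std::uint32_t kMaskMMA = 1u << 0;
constexpr std::uint32_t kMaskP9Vector = 1u << 1;

struct OptionsModel {
  std::uint32_t flags;              // models rs6000_isa_flags
  std::uint32_t explicit_flags;     // models rs6000_isa_flags_explicit
  std::vector<std::string> errors;  // diagnostics that would be emitted
};

// MMA without Power9 vector support is dropped; an error is recorded
// only when the user asked for MMA explicitly.
void override_mma(OptionsModel &opts) {
  const bool mma = (opts.flags & kMaskMMA) != 0;
  const bool p9_vector = (opts.flags & kMaskP9Vector) != 0;
  if (mma && !p9_vector) {
    if ((opts.explicit_flags & kMaskMMA) != 0)
      opts.errors.push_back("-mmma requires -mpower9-vector");
    opts.flags &= ~kMaskMMA;
  }
}
```

An implicitly enabled MMA bit is cleared silently, and an explicit -mmma produces the diagnostic as well as clearing the bit.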

Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.

Is it ok for trunk?

BR,
Kewen
-
gcc/ChangeLog:

PR target/103627
* config/rs6000/rs6000.c (rs6000_option_override_internal): Disable
MMA if !TARGET_P9_VECTOR.

gcc/testsuite/ChangeLog:

PR target/103627
* gcc.target/powerpc/pr103627-1.c: New test.
* gcc.target/powerpc/pr103627-2.c: New test.
---
 gcc/config/rs6000/rs6000.c| 11 +++
 gcc/testsuite/gcc.target/powerpc/pr103627-1.c | 16 
 gcc/testsuite/gcc.target/powerpc/pr103627-2.c | 16 
 3 files changed, 43 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index c020947abc8..ec3b46682a7 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4505,6 +4505,17 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_MMA;
 }

+  /* MMA requires SIMD support as ISA 3.1 claims and our implementation
+ such as "*movoo" uses vector pair access which are only supported
+ from ISA 3.1.  But since we don't have one separated option to
+ control Power10 vector, check for Power9 vector instead.  */
+  if (TARGET_MMA && !TARGET_P9_VECTOR)
+{
+  if ((rs6000_isa_flags_explicit & OPTION_MASK_MMA) != 0)
+   error ("%qs requires %qs", "-mmma", "-mpower9-vector");
+  rs6000_isa_flags &= ~OPTION_MASK_MMA;
+}
+
   if (!TARGET_PCREL && TARGET_PCREL_OPT)
 rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;

diff --git a/gcc/testsuite/gcc.target/powerpc/pr103627-1.c 
b/gcc/testsuite/gcc.target/powerpc/pr103627-1.c
new file mode 100644
index 000..6c6c16188fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103627-1.c
@@ -0,0 +1,16 @@
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -mno-power9-vector" } */
+
+/* Verify compiler emits error message instead of ICE.  */
+
+extern float *dest;
+extern __vector_quad src;
+
+int
+foo ()
+{
+  __builtin_mma_disassemble_acc (dest, &src);
+  /* { dg-error "'__builtin_mma_disassemble_acc' requires the '-mmma' option" 
"" { target *-*-* } .-1 } */
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103627-2.c 
b/gcc/testsuite/gcc.target/powerpc/pr103627-2.c
new file mode 100644
index 000..6604872c0e8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103627-2.c
@@ -0,0 +1,16 @@
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -mmma -mno-power9-vector" } */
+
+/* Verify the emitted error message.  */
+
+extern float *dest;
+extern __vector_quad src;
+
+int
+foo ()
+{
+  __builtin_mma_disassemble_acc (dest, &src);
+  /* { dg-error "'-mmma' requires '-mpower9-vector'" "mma" { target *-*-* } 0 
} */
+  return 0;
+}
+
--
2.27.0



[PATCH] rs6000: Move the hunk affecting VSX/ALTIVEC ahead [PR103627]

2021-12-22 Thread Kewen.Lin via Gcc-patches
Hi,

There is one hunk that checks whether functions with a target
attribute/pragma have the same altivec abi as main_target_opt, and it can
update both the VSX and ALTIVEC flags.  Meanwhile, we have some code that
checks or warns about isa flags related to VSX and ALTIVEC; it sits where
this patch proposes to move the mentioned hunk to.

Since the flag updates in the mentioned hunk currently happen after those
adjustments based on the VSX and ALTIVEC flags, they can leave the flags
inconsistent and result in unexpected behavior; the associated test case
is one typical example.

Besides, we already have code that sets TARGET_FLOAT128_TYPE and lies
after the place the hunk is moved to, and OPTION_MASK_FLOAT128_KEYWORD
relies on TARGET_FLOAT128_TYPE, so this patch simply removes them from
the hunk.
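
The ordering problem can be illustrated with a small model. It is a sketch under assumptions: the bit values and the two helper functions are hypothetical, and the dependent check is reduced to "crypto needs Altivec" without any diagnostics.

```cpp
#include <cstdint>

// Hypothetical bit assignments; the real OPTION_MASK_* values differ.
constexpr std::uint32_t kVSX = 1u << 0;
constexpr std::uint32_t kAltivec = 1u << 1;
constexpr std::uint32_t kCrypto = 1u << 2;

// Models the moved hunk: clear VSX/Altivec unless the user set them
// explicitly.
constexpr std::uint32_t clear_implicit_vector_flags(
    std::uint32_t flags, std::uint32_t explicit_flags) {
  return flags & ~((kVSX | kAltivec) & ~explicit_flags);
}

// Models one dependent adjustment: crypto is dropped without Altivec.
constexpr std::uint32_t drop_crypto_without_altivec(std::uint32_t flags) {
  return ((flags & kCrypto) != 0 && (flags & kAltivec) == 0)
             ? (flags & ~kCrypto)
             : flags;
}
```

Running the clearing step first lets the dependent check see the final Altivec state. Running it last can leave a crypto bit set with Altivec already cleared, which is the kind of inconsistency described above.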

Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8 and P7.

Is it ok for trunk?

BR,
Kewen
-
gcc/ChangeLog:

PR target/103627
* config/rs6000/rs6000.c (rs6000_option_override_internal): Move the
hunk affecting VSX and ALTIVEC to the appropriate place.

gcc/testsuite/ChangeLog:

PR target/103627
* gcc.target/powerpc/pr103627-3.c: New test.
---
 gcc/config/rs6000/rs6000.c| 21 ---
 gcc/testsuite/gcc.target/powerpc/pr103627-3.c | 20 ++
 2 files changed, 29 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-3.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ec3b46682a7..0b09713b2f5 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3955,6 +3955,15 @@ rs6000_option_override_internal (bool global_init_p)
   else if (TARGET_ALTIVEC)
 rs6000_isa_flags |= (OPTION_MASK_PPC_GFXOPT & ~ignore_masks);

+  /* Disable VSX and Altivec silently if the user switched cpus to power7 in a
+ target attribute or pragma which automatically enables both options,
+ unless the altivec ABI was set.  This is set by default for 64-bit, but
+ not for 32-bit.  Don't move this before the above code using ignore_masks,
+ since it can reset the cleared VSX/ALTIVEC flag again.  */
+  if (main_target_opt != NULL && !main_target_opt->x_rs6000_altivec_abi)
+rs6000_isa_flags &= ~((OPTION_MASK_VSX | OPTION_MASK_ALTIVEC)
+ & ~rs6000_isa_flags_explicit);
+
   if (TARGET_CRYPTO && !TARGET_ALTIVEC)
 {
   if (rs6000_isa_flags_explicit & OPTION_MASK_CRYPTO)
@@ -4373,18 +4382,6 @@ rs6000_option_override_internal (bool global_init_p)
}
 }

-  /* Disable VSX and Altivec silently if the user switched cpus to power7 in a
- target attribute or pragma which automatically enables both options,
- unless the altivec ABI was set.  This is set by default for 64-bit, but
- not for 32-bit.  */
-  if (main_target_opt != NULL && !main_target_opt->x_rs6000_altivec_abi)
-{
-  TARGET_FLOAT128_TYPE = 0;
-  rs6000_isa_flags &= ~((OPTION_MASK_VSX | OPTION_MASK_ALTIVEC
-| OPTION_MASK_FLOAT128_KEYWORD)
-   & ~rs6000_isa_flags_explicit);
-}
-
   /* Enable Altivec ABI for AIX -maltivec.  */
   if (TARGET_XCOFF
   && (TARGET_ALTIVEC || TARGET_VSX)
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103627-3.c 
b/gcc/testsuite/gcc.target/powerpc/pr103627-3.c
new file mode 100644
index 000..9df2b73fe85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103627-3.c
@@ -0,0 +1,20 @@
+/* There are no error messages for either LE or BE 64bit.  */
+/* { dg-require-effective-target be }*/
+/* { dg-require-effective-target ilp32 } */
+/* We don't have one powerpc.*_ok for Power6, use altivec_ok conservatively.  
*/
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-mdejagnu-cpu=power6" } */
+
+/* Verify compiler emits error message instead of ICE.  */
+
+#pragma GCC target "cpu=power10"
+int
+main ()
+{
+  float *b;
+  __vector_quad c;
+  __builtin_mma_disassemble_acc (b, &c);
+  /* { dg-error "'__builtin_mma_disassemble_acc' requires the '-mmma' option" 
"" { target *-*-* } .-1 } */
+  return 0;
+}
+
--
2.27.0


Re: [PATCH] i386: Enable intrinsics that convert float and bf16 data to each other.

2021-12-22 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 22, 2021 at 11:28 AM Kong, Lingling via Gcc-patches
 wrote:
>
> Hi,
>
>
> This patch is to enable intrinsics that convert float and bf16 data to each 
> other.
> Ok for master?
>
Ok.
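
For readers unfamiliar with the format, a portable sketch of what the scalar conversion amounts to, assuming float is IEEE-754 binary32. The name bf16_bits_to_float is made up here and is not one of the new intrinsics; it takes the raw 16-bit pattern rather than a __bfloat16.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// A bf16 value is the upper 16 bits of a binary32 float, so widening is
// a 16-bit left shift followed by reinterpreting the bits as a float.
float bf16_bits_to_float(std::uint16_t bits) {
  const std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  float result;
  std::memcpy(&result, &widened, sizeof result);  // well-defined bit copy
  return result;
}
```

For example, the bit pattern 0x3F80 widens to 0x3F800000, which is 1.0f.
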
> gcc/ChangeLog:
>
> * config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Add new intrinsic.
> (_mm512_cvtpbh_ps): Likewise.
> (_mm512_maskz_cvtpbh_ps): Likewise.
> (_mm512_mask_cvtpbh_ps): Likewise.
> * config/i386/avx512bf16vlintrin.h (_mm_cvtness_sbh): Likewise.
> (_mm_cvtpbh_ps): Likewise.
> (_mm256_cvtpbh_ps): Likewise.
> (_mm_maskz_cvtpbh_ps): Likewise.
> (_mm256_maskz_cvtpbh_ps): Likewise.
> (_mm_mask_cvtpbh_ps): Likewise.
> (_mm256_mask_cvtpbh_ps): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512bf16-cvtsbh2ss-1.c: New test.
> * gcc.target/i386/avx512bf16-vcvtpbh2ps-1.c: Ditto.
> * gcc.target/i386/avx512bf16vl-cvtness2sbh-1.c: Ditto.
> * gcc.target/i386/avx512bf16vl-vcvtpbh2ps-1.c: Ditto.
> ---
>  gcc/config/i386/avx512bf16intrin.h| 36 +++
>  gcc/config/i386/avx512bf16vlintrin.h  | 63 +++
>  .../gcc.target/i386/avx512bf16-cvtsbh2ss-1.c  | 15 +  
> .../gcc.target/i386/avx512bf16-vcvtpbh2ps-1.c | 20 ++
>  .../i386/avx512bf16vl-cvtness2sbh-1.c | 14 +
>  .../i386/avx512bf16vl-vcvtpbh2ps-1.c  | 29 +
>  6 files changed, 177 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512bf16-vcvtpbh2ps-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512bf16vl-cvtness2sbh-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512bf16vl-vcvtpbh2ps-1.c
>
> diff --git a/gcc/config/i386/avx512bf16intrin.h 
> b/gcc/config/i386/avx512bf16intrin.h
> index 9afc6bd7d2b..6b62dc3e398 100644
> --- a/gcc/config/i386/avx512bf16intrin.h
> +++ b/gcc/config/i386/avx512bf16intrin.h
> @@ -41,6 +41,16 @@ typedef short __v32bh __attribute__ ((__vector_size__ 
> (64)));
> vector types, and their scalar components.  */  typedef short __m512bh 
> __attribute__ ((__vector_size__ (64), __may_alias__));
>
> +/* Convert One BF16 Data to One Single Float Data.  */
> +extern __inline float
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_cvtsbh_ss (__bfloat16 __A)
> +{
> +  union{ float a; unsigned int b;} __tmp;
> +  __tmp.b = ((unsigned int)(__A)) << 16;
> +  return __tmp.a;
> +}
> +
>  /* vcvtne2ps2bf16 */
>
>  extern __inline __m512bh
> @@ -110,6 +120,32 @@ _mm512_maskz_dpbf16_ps (__mmask16 __A, __m512 __B, 
> __m512bh __C, __m512bh __D)
>return (__m512)__builtin_ia32_dpbf16ps_v16sf_maskz(__B, __C, __D, __A);  }
>
> +extern __inline __m512
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_cvtpbh_ps (__m256bh __A) {
> +  return (__m512)_mm512_castsi512_ps ((__m512i)_mm512_slli_epi32 (
> +(__m512i)_mm512_cvtepi16_epi32 ((__m256i)__A), 16)); }
> +
> +extern __inline __m512
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_maskz_cvtpbh_ps (__mmask16 __U, __m256bh __A) {
> +  return (__m512)_mm512_castsi512_ps ((__m512i) _mm512_slli_epi32 (
> +(__m512i)_mm512_maskz_cvtepi16_epi32 (
> +(__mmask16)__U, (__m256i)__A), 16));
> +}
> +
> +extern __inline __m512
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_mask_cvtpbh_ps (__m512 __S, __mmask16 __U, __m256bh __A) {
> +  return (__m512)_mm512_castsi512_ps ((__m512i)(_mm512_mask_slli_epi32 (
> +(__m512i)__S, (__mmask16)__U,
> +(__m512i)_mm512_cvtepi16_epi32 ((__m256i)__A), 16))); }
> +
>  #ifdef __DISABLE_AVX512BF16__
>  #undef __DISABLE_AVX512BF16__
>  #pragma GCC pop_options
> diff --git a/gcc/config/i386/avx512bf16vlintrin.h 
> b/gcc/config/i386/avx512bf16vlintrin.h
> index 6dd396d4008..5e6a6503aa6 100644
> --- a/gcc/config/i386/avx512bf16vlintrin.h
> +++ b/gcc/config/i386/avx512bf16vlintrin.h
> @@ -43,6 +43,7 @@ typedef short __v8bh __attribute__ ((__vector_size__ 
> (16)));  typedef short __m256bh __attribute__ ((__vector_size__ (32), 
> __may_alias__));  typedef short __m128bh __attribute__ ((__vector_size__ 
> (16), __may_alias__));
>
> +typedef unsigned short __bfloat16;
>  /* vcvtne2ps2bf16 */
>
>  extern __inline __m256bh
> @@ -175,6 +176,68 @@ _mm_maskz_dpbf16_ps (__mmask8 __A, __m128 __B, __m128bh 
> __C, __m128bh __D)
>return (__m128)__builtin_ia32_dpbf16ps_v4sf_maskz(__B, __C, __D, __A);  }
>
> +extern __inline __bfloat16
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_cvtness_sbh (float __A) {
> +  __v4sf __V = {__A, 0, 0, 0};
> +  __v8hi __R = __builtin_ia32_cvtneps2bf16_v4sf_mask ((__v4sf)__V,
> +  (__v8hi)_mm_undefined_si128 (), (__mmask8)-1);
> +  return __R[0];
> +}
> +
> +extern __inline __m128
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)
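
For readers following the quoted intrinsics: the BF16-to-float conversions above all rely on BF16 being the upper 16 bits of an IEEE-754 binary32 value, so widening is just a 16-bit left shift. The packed variants sign-extend each 16-bit lane to 32 bits first, but the shift discards the extended bits, so the result is the same. A minimal portable sketch of that idea follows; the helper names `bf16_bits_to_float` and `bf16_array_to_float` are hypothetical and not part of the patch, and `memcpy` is used here in place of the union in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: widen one BF16 value,
   given as its raw 16-bit pattern, to float.  The BF16 bits become
   the upper half of the binary32 pattern; the lower half is zero.  */
static inline float
bf16_bits_to_float (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;
  float result;
  /* Copy the bit pattern instead of type-punning through a pointer.  */
  memcpy (&result, &wide, sizeof result);
  return result;
}

/* Hypothetical scalar counterpart of the packed conversions:
   widen N BF16 values from SRC into DST, one element at a time.  */
static inline void
bf16_array_to_float (const uint16_t *src, float *dst, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = bf16_bits_to_float (src[i]);
}
```

For example, the BF16 pattern 0x3F80 widens to 0x3F800000, which is 1.0f, and 0xC000 widens to -2.0f.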

Re: [PATCH] [i386] Add define_insn_and_split for vpcmp{b, w, d, q} vpcmp{ph, ps, pd}.

2021-12-22 Thread Hongtao Liu via Gcc-patches
On Tue, Dec 21, 2021 at 2:27 PM liuhongt  wrote:
>
> The purpose of those define_insn_and_split:
> 1. Combine vpcmpuw and zero_extend into vpcmpuw.
> 2. Canonicalize vpcmpuw pattern so CSE can replace duplicate vpcmpuw to just 
> kmov
> 3. Use DImode as dest of zero_extend so cprop_hardreg can eliminate redundant 
> kmov.
Using DImode as the dest of zero_extend is too aggressive and causes
several regressions.
The new patch adds a define_insn_and_split that just combines vpcmpuw and
zero_extend into vpcmpuw.
Here's the patch I'm checking in.
>
> It should partially fix the issue in PR.
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ready to push to trunk.
>
> gcc/ChangeLog:
>
> PR target/103750
> * config/i386/sse.md
> (*_cmp3_zero_extend):
> New define_insn_and_split.
> (*_cmp3): Ditto.
> (*_cmp3_zero_extenddi): New define_insn.
> (*_cmp3_zero_extend):
> New define_insn_and_split.
> (*_ucmp3_zero_extend):
> Ditto.
> (*_ucmp3): Ditto.
> (*_ucmp3_zero_extenddi): New define_insn.
> (*_ucmp3_zero_extend):
> New define_insn_and_split.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/bitwise_mask_op-3.c: Adjust test.
> * g++.target/i386/pr103750-1.C: New test.
> ---
>  gcc/config/i386/sse.md| 267 ++
>  gcc/testsuite/g++.target/i386/pr103750-1.C|  50 
>  .../gcc.target/i386/bitwise_mask_op-3.c   |   6 +-
>  3 files changed, 320 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/i386/pr103750-1.C
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 5196149ee32..fb885d58272 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -3702,6 +3702,75 @@ (define_insn 
> "_cmp3"
> (set_attr "prefix" "evex")
> (set_attr "mode" "")])
>
> +;; Those splitters are used to canonicalize the vpcmpuw pattern, so that CSE
> +;; can transform duplicated vpcmpuw to vpcmpuw and kmov.
> +;; Choose the biggest mode (DImode) as dest, so kmov can be optimized by
> +;; cprop_hardreg.
> +(define_insn_and_split 
> "*_cmp3_zero_extend"
> +  [(set (match_operand:SWI248x 0 "register_operand" "=k")
> +   (zero_extend:SWI248x
> + (unspec:
> +   [(match_operand:V48H_AVX512VL 1 "register_operand" "v")
> +(match_operand:V48H_AVX512VL 2 "nonimmediate_operand" "vm")
> +(match_operand:SI 3 "" "n")]
> +   UNSPEC_PCMP)))]
> +  "TARGET_AVX512BW
> +   && (GET_MODE_NUNITS (mode)
> +   < GET_MODE_PRECISION (mode))"
> +  "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> +  "&& mode != E_DImode"
> +  [(set (match_dup 0)
> +   (zero_extend:DI
> + (unspec:
> +   [(match_dup 1)
> +(match_dup 2)
> +(match_dup 3)]
> +   UNSPEC_PCMP)))]
> +  "operands[0] = lowpart_subreg (DImode, operands[0], mode);"
> +  [(set_attr "type" "ssecmp")
> +   (set_attr "length_immediate" "1")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "")])
> +
> +(define_insn_and_split "*_cmp3"
> +  [(set (match_operand: 0 "register_operand" "=k")
> +   (unspec:
> + [(match_operand:V48H_AVX512VL 1 "register_operand" "v")
> +  (match_operand:V48H_AVX512VL 2 "nonimmediate_operand" "vm")
> +  (match_operand:SI 3 "" "n")]
> + UNSPEC_PCMP))]
> +  "TARGET_AVX512BW
> +   && GET_MODE_NUNITS (mode) < 64"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 0)
> +   (zero_extend:DI
> + (unspec:
> +   [(match_dup 1)
> +(match_dup 2)
> +(match_dup 3)]
> +   UNSPEC_PCMP)))]
> +  "operands[0] = lowpart_subreg (DImode, operands[0], 
> mode);"
> +  [(set_attr "type" "ssecmp")
> +   (set_attr "length_immediate" "1")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "")])
> +
> +(define_insn "*_cmp3_zero_extenddi"
> +  [(set (match_operand:DI 0 "register_operand" "=k")
> +   (zero_extend:DI
> + (unspec:
> +   [(match_operand:V48H_AVX512VL 1 "register_operand" "v")
> +(match_operand:V48H_AVX512VL 2 "nonimmediate_operand" "vm")
> +(match_operand:SI 3 "" "n")]
> +   UNSPEC_PCMP)))]
> +  "TARGET_AVX512BW
> +   && GET_MODE_NUNITS (mode) < 64"
> +  "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> +  [(set_attr "type" "ssecmp")
> +   (set_attr "length_immediate" "1")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "")])
> +
>  (define_insn_and_split "*_cmp3"
>[(set (match_operand: 0 "register_operand")
> (not:
> @@ -3735,6 +3804,72 @@ (define_insn 
> "_cmp3"
> (set_attr "prefix" "evex")
> (set_attr "mode" "")])
>
> +(define_insn_and_split 
> "*_cmp3_zero_extend"
> +  [(set (match_operand:SWI248x 0 "register_operand" "=k")
> +   (zero_extend:SWI248x
> + (unspec:
> +   [(match_operand:VI12_AVX512VL 1 "register_operand" "v")
> +(match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")
> + 

[PATCH] fixed testcase riscv/pr103302.c

2021-12-22 Thread shihua
From: LiaoShihua 

riscv32 does not support __int128, so skip the test if -march=rv32*.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr103302.c: Skip if -march=rv32*.
---
 gcc/testsuite/gcc.target/riscv/pr103302.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.target/riscv/pr103302.c 
b/gcc/testsuite/gcc.target/riscv/pr103302.c
index 822c4087416..2cfb12498a2 100644
--- a/gcc/testsuite/gcc.target/riscv/pr103302.c
+++ b/gcc/testsuite/gcc.target/riscv/pr103302.c
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-skip-if "rv32 not support _int128" { *-*-* } { "-march=rv32*" } } */
 /* { dg-options "-Og -fharden-compares -fno-tree-dce -fno-tree-fre " } */
 
 typedef unsigned char u8;
-- 
2.31.1.windows.1



Re: [PATCH] fixed testcase riscv/pr103302.c

2021-12-22 Thread Andrew Pinski via Gcc-patches
On Wed, Dec 22, 2021 at 11:37 PM  wrote:
>
> From: LiaoShihua 
>
> riscv32 does not support __int128, so skip the test if -march=rv32*.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/pr103302.c: Skip if -march=rv32*.
> ---
>  gcc/testsuite/gcc.target/riscv/pr103302.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/testsuite/gcc.target/riscv/pr103302.c 
> b/gcc/testsuite/gcc.target/riscv/pr103302.c
> index 822c4087416..2cfb12498a2 100644
> --- a/gcc/testsuite/gcc.target/riscv/pr103302.c
> +++ b/gcc/testsuite/gcc.target/riscv/pr103302.c
> @@ -1,4 +1,5 @@
>  /* { dg-do run } */
> +/* { dg-skip-if "rv32 not support _int128" { *-*-* } { "-march=rv32*" } } */

Better fix:
/* { dg-do run { target int128 } } */
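
As background for the suggestion above: the `int128` effective-target keyword asks whether the target supports `__int128` at all, rather than matching on a particular `-march` string. In C source the same property can be checked through the predefined macro `__SIZEOF_INT128__`, which GCC defines only when the type is available. A minimal sketch, with hypothetical helper names `have_int128` and `mul_high_u64` that do not appear in pr103302.c:

```c
#include <stdint.h>

/* Hypothetical helper, not from pr103302.c: report whether this
   compiler/target combination provides __int128.  */
static inline int
have_int128 (void)
{
#ifdef __SIZEOF_INT128__
  return 1;
#else
  return 0;
#endif
}

#ifdef __SIZEOF_INT128__
/* Hypothetical example of code that needs the type: return the high
   64 bits of the full 128-bit product of A and B.  */
static inline uint64_t
mul_high_u64 (uint64_t a, uint64_t b)
{
  return (uint64_t) (((unsigned __int128) a * b) >> 64);
}
#endif
```

Keying the test on the capability instead of the option string means it is also skipped on any other configuration that lacks the type.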

Thanks,
Andrew Pinski

>  /* { dg-options "-Og -fharden-compares -fno-tree-dce -fno-tree-fre " } */
>
>  typedef unsigned char u8;
> --
> 2.31.1.windows.1
>