Re: stddef.h: Add C2x unreachable macro

2022-09-09 Thread Richard Biener via Gcc-patches
On Thu, Sep 8, 2022 at 9:32 PM Joseph Myers  wrote:
>
> C2x adds a macro unreachable to stddef.h, with the same semantics as
> __builtin_unreachable.  Define this macro accordingly.
>
> Bootstrapped with no regressions for x86_64-pc-linux-gnu.  OK to commit?

OK.

> gcc/
> * ginclude/stddef.h [__STDC_VERSION__ > 201710L] (unreachable):
> New macro.
>
> gcc/testsuite/
> * gcc.dg/c11-unreachable-1.c, gcc.dg/c2x-unreachable-1.c: New
> tests.
>
> diff --git a/gcc/ginclude/stddef.h b/gcc/ginclude/stddef.h
> index 315ff786694..3d29213e8f1 100644
> --- a/gcc/ginclude/stddef.h
> +++ b/gcc/ginclude/stddef.h
> @@ -451,6 +451,10 @@ typedef struct {
>  #endif
>  #endif /* C23.  */
>
> +#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
> +#define unreachable() (__builtin_unreachable ())
> +#endif
> +
>  #endif /* _STDDEF_H was defined this time */
>
>  #endif /* !_STDDEF_H && !_STDDEF_H_ && !_ANSI_STDDEF_H && !__STDDEF_H__
> diff --git a/gcc/testsuite/gcc.dg/c11-unreachable-1.c 
> b/gcc/testsuite/gcc.dg/c11-unreachable-1.c
> new file mode 100644
> index 000..28e48392ed1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/c11-unreachable-1.c
> @@ -0,0 +1,9 @@
> +/* Test unreachable not defined in <stddef.h> for C11.  */
> +/* { dg-do preprocess } */
> +/* { dg-options "-std=c11 -pedantic-errors" } */
> +
> +#include <stddef.h>
> +
> +#ifdef unreachable
> +#error "unreachable defined"
> +#endif
> diff --git a/gcc/testsuite/gcc.dg/c2x-unreachable-1.c 
> b/gcc/testsuite/gcc.dg/c2x-unreachable-1.c
> new file mode 100644
> index 000..468f1f87ebb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/c2x-unreachable-1.c
> @@ -0,0 +1,29 @@
> +/* Test unreachable in <stddef.h> for C2x.  */
> +/* { dg-do run } */
> +/* { dg-options "-std=c2x -pedantic-errors -O2" } */
> +
> +#include <stddef.h>
> +
> +#ifndef unreachable
> +#error "unreachable not defined"
> +#endif
> +
> +extern void *p;
> +extern __typeof__ (unreachable ()) *p;
> +
> +volatile int x = 1;
> +
> +extern void not_defined (void);
> +
> +extern void exit (int);
> +
> +int
> +main ()
> +{
> +  if (x == 2)
> +{
> +  unreachable ();
> +  not_defined ();
> +}
> +  exit (0);
> +}
>
> --
> Joseph S. Myers
> jos...@codesourcery.com
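
For readers unfamiliar with the macro, here is a minimal usage sketch. It is not part of the patch: the enum, the helper name and the fallback definition are assumptions made for illustration, and the fallback relies on the GCC/Clang builtin being available when <stddef.h> does not provide the macro (for example in C11 mode).

```c
#include <stddef.h>

/* Assumption: fall back to the compiler builtin when <stddef.h> does not
   define the C2x macro (e.g. when compiling with -std=c11).  */
#ifndef unreachable
#define unreachable() (__builtin_unreachable ())
#endif

enum direction { NORTH, EAST, SOUTH, WEST };

/* Hypothetical helper: number of clockwise quarter turns from NORTH.  */
static int
quarter_turns (enum direction d)
{
  switch (d)
    {
    case NORTH: return 0;
    case EAST:  return 1;
    case SOUTH: return 2;
    case WEST:  return 3;
    }
  /* Every enumerator is handled above, so control never gets here for a
     valid argument; actually reaching this point is undefined behavior.  */
  unreachable ();
}
```

The macro documents the "cannot happen" path and lets the optimizer drop it, in the same way the test above expects the call to not_defined () to be removed at -O2.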


Re: [PATCH 2/2] RISC-V: Implement TARGET_COMPUTE_MULTILIB

2022-09-09 Thread Andreas Schwab
How did you test that?

../../gcc/common/config/riscv/riscv-common.cc: In function 'const char* 
riscv_multi_lib_check(int, const char**)':
../../gcc/common/config/riscv/riscv-common.cc:1451:11: error: bare apostrophe 
''' in format [-Werror=format-diag]
 1451 |   "Can't find suitable multilib set for %<-march=%s%>/%<-mabi=%s%>",
  |   ^
../../gcc/common/config/riscv/riscv-common.cc:1451:7: note: if avoiding the 
apostrophe is not feasible, enclose it in a pair of '%<' and '%>' directives 
instead
 1451 |   "Can't find suitable multilib set for %<-march=%s%>/%<-mabi=%s%>",
  |   ^
../../gcc/common/config/riscv/riscv-common.cc: At global scope:
../../gcc/common/config/riscv/riscv-common.cc:1492:1: error: 'int 
riscv_check_conds(const switchstr*, int, int, const 
std::vector >&)' defined but not used 
[-Werror=unused-function]
 1492 | riscv_check_conds (
  | ^
../../gcc/common/config/riscv/riscv-common.cc:1374:1: error: 'const char* 
find_last_appear_switch(const switchstr*, int, const char*)' defined but not 
used [-Werror=unused-function]
 1374 | find_last_appear_switch (
  | ^~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:2442: riscv-common.o] Error 1

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH v2] Handle OPAQUE_TYPE specially in verify_type [PR106833]

2022-09-09 Thread Richard Biener via Gcc-patches
On Fri, Sep 9, 2022 at 8:51 AM Kewen.Lin  wrote:
>
> Hi Richi,
>
> Thanks for the review comments!
>
> on 2022/9/8 15:36, Richard Biener wrote:
> >
> >
> >> On 08.09.2022 at 07:53, Kewen.Lin wrote:
> >>
> >> Hi,
> >>
> >> As PR106833 shows, cv-qualified opaque type can cause ICE
> >> during LTO.  It exposes that we failed to handle OPAQUE_TYPE
> >> properly in type verification.  As Richi pointed out, and also
> >> assuming that target will always define TYPE_MAIN_VARIANT
> >> and TYPE_CANONICAL for opaque type, this patch is to check
> >> both are OPAQUE_TYPE_P.  Besides, it also checks the only
> >> available size and alignment information as well as type
> >> mode for TYPE_MAIN_VARIANT.
> >>
> ...
> >> +
> >> +  if (t != tv)
> >> +{
> >> +  verify_match (TREE_CODE, t, tv);
> >> +  verify_match (TYPE_MODE, t, tv);
> >> +  verify_match (TYPE_SIZE, t, tv);
> >
> > TYPE_SIZE is a tree, you should probably
> > compare this with operand_equal_p.  It’s
> > not documented to be a constant size?
> > Thus some VLA vector mode might be allowed (a poly_int size),
>
> Thanks for catching, I was referencing the code in function
> verify_type_variant, that corresponding part seems imperfect:
>
>   if (TREE_CODE (TYPE_SIZE (t)) != PLACEHOLDER_EXPR
>   && TREE_CODE (TYPE_SIZE (tv)) != PLACEHOLDER_EXPR)
> verify_variant_match (TYPE_SIZE);
>
> I agree poly_int size is allowed, the patch was updated for it.
>
> > BLKmode
> > is ruled out(?),
>
> Yes, it requires a mode of MODE_OPAQUE class.
>
> > the docs say we have
> > 'An MODE_OPAQUE' here but I don’t see
> > this being verified?
> >
>
> There is a MODE equality check, I assumed the given t already
> has one MODE_OPAQUE mode, but the patch was updated to make
> it explicit, addressing your concern.
>
> > The macro makes this a bit unwieldy
> > for the only benefit of elaborate diagnostics,
> > which I think isn’t really necessary
>
> OK, fixed!
>
> The previous version makes just one check on TYPE_CANONICAL to
> be cheap as gimple_canonical_types_compatible_p said, but
> since there are just several fields to be checked, this updated
> version adjusts it to be the same as what's done for TYPE_MAIN_VARIANT.
> Hope it's fine. :)

I think we'll call verify_type on the main variant as well so that would be
redundant (ensured by transitivity), can you check?

> Tested as before.
>
> Does this updated patch look good to you?

Yes, please remove the checks against the main variant if the above holds,
OK with or without that change depending on this outcome.

Thanks,
Richard.


>
> BR,
> Kewen
> --


Re: [PATCH] amdgcn: Add support for additional natively supported floating-point operations

2022-09-09 Thread Andrew Stubbs

On 08/09/2022 21:38, Kwok Cheung Yeung wrote:

Hello

This patch adds support for some additional floating-point operations, 
in scalar and vector modes, which are natively supported by the AMD GCN 
instruction set, but haven't been implemented in GCC yet. With the 
exception of frexp, these implement standard RTL names, and should be 
utilised automatically by GCC.


The instructions for the transcendental functions are documented to have 
limited numerical precision, so they are only used if 
unsafe_math_optimizations are enabled for now.


The sin and cos instructions for some reason are scaled by 2*PI radians 
(i.e. 1.0 == 2*PI radians/360 degrees), so their inputs need to be 
scaled by 1/(2*PI) first. I've implemented this as an expander to two 
instructions - one to do the pre-scaling, one to do the sin/cos. 
1/(2*PI) is a builtin constant for GCN, but the syntax to use it in the 
LLVM assembler was wrong - now fixed.


I have also added some extra GCN-specific builtins to access the vector 
versions of some of these operations (to implement vectorized versions 
of library math routines) and to access the frexp operations.


Okay for trunk?


LGTM. I'm assuming you've checked the maths. :)

Andrew
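
The pre-scaling described above for sin and cos can be written out as follows. The names sin_hw and cos_hw are notation introduced here for the hardware instructions, whose argument is measured in full turns (1.0 corresponds to 2*PI radians) rather than in radians; this is a restatement of the description in the mail, not text taken from the patch.

```latex
\[
  \mathrm{sin}_{\mathrm{hw}}(t) = \sin(2\pi t), \qquad
  \mathrm{cos}_{\mathrm{hw}}(t) = \cos(2\pi t)
\]
\[
  \sin(x) = \mathrm{sin}_{\mathrm{hw}}\!\left(x \cdot \frac{1}{2\pi}\right), \qquad
  \cos(x) = \mathrm{cos}_{\mathrm{hw}}\!\left(x \cdot \frac{1}{2\pi}\right)
\]
```

This matches the two-instruction expansion mentioned above: one multiply by the constant 1/(2*PI), then the hardware sin or cos on the scaled value.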



[wwwdocs] Fix typo in description of tainted_args attribute

2022-09-09 Thread Jonathan Wakely via Gcc-patches
Pushed to wwwdocs.

---
 htdocs/gcc-12/changes.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index c6dae27a..2cb5a654 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -988,7 +988,7 @@ function Multiply (S1, S2 : Sign) return Sign is
   added to the C and C++ frontends, usable on functions, and on
   function pointer callback fields in structs.  The analyzer's taint
   mode will treat all parameters and buffers pointed to by parameters
-  of such functions as being attacked-controlled, such as for
+  of such functions as being attacker-controlled, such as for
   annotating system calls in an operating system kernel as being an
   "attack surface".
 
-- 
2.37.3



Re: [PATCH] amdgcn: Add support for additional natively supported floating-point operations

2022-09-09 Thread Tobias Burnus

On 09.09.22 10:10, Andrew Stubbs wrote:
On 08.09.22 22:38, Kwok Cheung Yeung wrote:
The instructions for the transcendental functions are documented to have 
limited numerical precision, so they are only used if unsafe_math_optimizations 
are enabled for now.

-funsafe-math-optimizations implies -fno-signed-zeros, -fno-trapping-math, 
-fassociative-math,
and -freciprocal-math. All of them reduce precision and may violate IEEE or 
ISO/language standards.

However, I think it is rather surprising to suddenly get a precision of only 
the order of 100,000,000 ULP instead of the ~4 ULP one would expect. That's a 
precision loss of the
order of 10^8 or 2^29 which is huge!

For programs deliberately using double precision, it can be too much – even if 
they do not need
double precision in reality. (Weather forecast systems recently moved to single 
precision as the
quality is similar and the benefits of faster results/finer grids or longer 
forecast times prevail.)

As this behavior is highly surprising, I think it should be at least documented.

In https://gcc.gnu.org/PR105246 , I suggested a new flag (such as 
-mpermit-reduced-precision) to
make it possible to turn it on/off explicitly (it might still be enabled by 
-funsafe-math-optimizations);
alternatively, it could also be handled as initial guess for the result which 
is then refined
in some iteration steps. (It could also be combined to give the user the 
choice.)

While still being convinced that a flag makes more sense than just documenting 
it,
I have nonetheless attached a documentation attempt.

Thoughts?

Tobias
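
As a rough illustration of the magnitude discussed above, the sketch below models "only 23 fraction bits" by rounding a double through float and counting how many double-precision ULPs the result moved. This is an assumption made for illustration: it does not model the GCN instructions themselves, the function names are hypothetical, and the bit-pattern trick is only valid for finite doubles of the same sign.

```c
#include <stdint.h>
#include <string.h>

/* Distance between two finite doubles of the same sign, counted in units
   in the last place, using the ordering of their IEEE 754 bit patterns.  */
static int64_t
ulp_distance (double a, double b)
{
  int64_t ia, ib;
  memcpy (&ia, &a, sizeof ia);
  memcpy (&ib, &b, sizeof ib);
  return ia > ib ? ia - ib : ib - ia;
}

/* Hypothetical model of a result that carries only single precision:
   round X to float (23 fraction bits) and measure the loss in double ULPs.  */
static int64_t
ulps_lost_by_float_rounding (double x)
{
  volatile float narrowed = (float) x;  /* volatile: force the narrowing */
  return ulp_distance (x, (double) narrowed);
}
```

Dropping 52 - 23 = 29 fraction bits with round-to-nearest can move a value by up to 2^28 double ULPs, which is the same order of magnitude as the 10^8 ULP figure quoted above.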


GCN: Document in invoke.texi reduced precs with -funsafe-math-opt

gcc/ChangeLog:

	* doc/invoke.texi (AMD GCN Options): Document that
	-funsafe-math-optimizations implies single-precision
	results for some math intrinsics.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5c066219a7d..01011bd9f9b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -20139,8 +20139,14 @@ purpose.  The default is @option{-m1reg-none}.
 @cindex AMD GCN Options
 
 These options are defined specifically for the AMD GCN port.
 
+Note that the @option{-funsafe-math-optimizations} option implies that
+for 64-bit floating-point numbers, the following operations yield results
+with only 23 bits instead of 52 bits for the fractional part of the
+floating-point number: @code{sqrt}, @code{exp2}, @code{log2}, @code{sin}
+and @code{cos}.
+
 @table @gcctabopt
 
 @item -march=@var{gpu}
 @opindex march


[PATCH][GCC 12] arm: Fix constant immediates predicates and constraints for some MVE builtins

2022-09-09 Thread Christophe Lyon via Gcc-patches
This is a backport from trunk to gcc-12.

Several MVE builtins incorrectly use the same predicate/constraint
pair for several modes, which does not match the specification.
This patch uses the appropriate iterator instead.

2022-09-06  Christophe Lyon  

gcc/
* config/arm/mve.md (mve_vqshluq_n_s): Use
MVE_pred/MVE_constraint instead of mve_imm_7/Ra.
(mve_vqshluq_m_n_s): Likewise.
(mve_vqrshrnbq_n_): Use MVE_pred3/MVE_constraint3
instead of mve_imm_8/Rb.
(mve_vqrshrunbq_n_s): Likewise.
(mve_vqrshrntq_n_): Likewise.
(mve_vqrshruntq_n_s): Likewise.
(mve_vrshrnbq_n_): Likewise.
(mve_vrshrntq_n_): Likewise.
(mve_vqrshrnbq_m_n_): Likewise.
(mve_vqrshrntq_m_n_): Likewise.
(mve_vrshrnbq_m_n_): Likewise.
(mve_vrshrntq_m_n_): Likewise.
(mve_vqrshrunbq_m_n_s): Likewise.
(mve_vsriq_n_): Likewise.

(cherry picked from c3fb6658c7670e446f2fd00984404d971e416b3c)
---
 gcc/config/arm/mve.md | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index f16991c0a34..469e7e7f8dc 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1617,7 +1617,7 @@ (define_insn "mve_vqshluq_n_s"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
-  (match_operand:SI 2 "mve_imm_7" "Ra")]
+  (match_operand:SI 2 "" "")]
 VQSHLUQ_N_S))
   ]
   "TARGET_HAVE_MVE"
@@ -2608,7 +2608,7 @@ (define_insn "mve_vqrshrnbq_n_"
(set (match_operand: 0 "s_register_operand" "=w")
(unspec: [(match_operand: 1 
"s_register_operand" "0")
 (match_operand:MVE_5 2 "s_register_operand" 
"w")
-(match_operand:SI 3 "mve_imm_8" "Rb")]
+(match_operand:SI 3 "" 
"")]
 VQRSHRNBQ_N))
   ]
   "TARGET_HAVE_MVE"
@@ -2623,7 +2623,7 @@ (define_insn "mve_vqrshrunbq_n_s"
(set (match_operand: 0 "s_register_operand" "=w")
(unspec: [(match_operand: 1 
"s_register_operand" "0")
 (match_operand:MVE_5 2 "s_register_operand" 
"w")
-(match_operand:SI 3 "mve_imm_8" "Rb")]
+(match_operand:SI 3 "" 
"")]
 VQRSHRUNBQ_N_S))
   ]
   "TARGET_HAVE_MVE"
@@ -3563,7 +3563,7 @@ (define_insn "mve_vsriq_n_"
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
   (match_operand:MVE_2 2 "s_register_operand" "w")
-  (match_operand:SI 3 "mve_imm_selective_upto_8" "Rg")]
+  (match_operand:SI 3 "" "")]
 VSRIQ_N))
   ]
   "TARGET_HAVE_MVE"
@@ -4466,7 +4466,7 @@ (define_insn "mve_vqrshrntq_n_"
(set (match_operand: 0 "s_register_operand" "=w")
(unspec: [(match_operand: 1 
"s_register_operand" "0")
   (match_operand:MVE_5 2 "s_register_operand" "w")
-  (match_operand:SI 3 "mve_imm_8" "Rb")]
+  (match_operand:SI 3 "" "")]
 VQRSHRNTQ_N))
   ]
   "TARGET_HAVE_MVE"
@@ -4482,7 +4482,7 @@ (define_insn "mve_vqrshruntq_n_s"
(set (match_operand: 0 "s_register_operand" "=w")
(unspec: [(match_operand: 1 
"s_register_operand" "0")
   (match_operand:MVE_5 2 "s_register_operand" "w")
-  (match_operand:SI 3 "mve_imm_8" "Rb")]
+  (match_operand:SI 3 "" "")]
 VQRSHRUNTQ_N_S))
   ]
   "TARGET_HAVE_MVE"
@@ -4770,7 +4770,7 @@ (define_insn "mve_vrshrnbq_n_"
(set (match_operand: 0 "s_register_operand" "=w")
(unspec: [(match_operand: 1 
"s_register_operand" "0")
   (match_operand:MVE_5 2 "s_register_operand" "w")
-  (match_operand:SI 3 "mve_imm_8" "Rb")]
+  (match_operand:SI 3 "" "")]
 VRSHRNBQ_N))
   ]
   "TARGET_HAVE_MVE"
@@ -4786,7 +4786,7 @@ (define_insn "mve_vrshrntq_n_"
(set (match_operand: 0 "s_register_operand" "=w")
(unspec: [(match_operand: 1 
"s_register_operand" "0")
   (match_operand:MVE_5 2 "s_register_operand" "w")
-  (match_operand:SI 3 "mve_imm_8" "Rb")]
+  (match_operand:SI 3 "" "")]
 VRSHRNTQ_N))
   ]
   "TARGET_HAVE_MVE"
@@ -4980,7 +4980,7 @@ (define_insn "mve_vqshluq_m_n_s"
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
   (match_operand:MVE_2 2 "s_register_operand" "w")
-  (match_operand:SI 3 "mve_imm_7" "Ra")
+  (match_operand:SI 3 "" "")
   (match_operand: 4 "vpr_register_operand" 
"Up")]
 VQSHLUQ_

[PATCH] RISC-V: Support -fexcess-precision=16

2022-09-09 Thread Palmer Dabbelt
This fixes f19a327077e ("Support -fexcess-precision=16 which will enable
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16.") on
RISC-V targets.

gcc/ChangeLog

PR target/106815
* config/riscv/riscv.cc (riscv_excess_precision): Add support
for EXCESS_PRECISION_TYPE_FLOAT16.
---
 gcc/config/riscv/riscv.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 675d92c0961..9b6d3e95b1b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5962,6 +5962,7 @@ riscv_excess_precision (enum excess_precision_type type)
   return (TARGET_ZFH ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
 : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT);
 case EXCESS_PRECISION_TYPE_IMPLICIT:
+case EXCESS_PRECISION_TYPE_FLOAT16:
   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
 default:
   gcc_unreachable ();
-- 
2.34.1



[PATCH] Document -fexcess-precision=16 in tm.texi

2022-09-09 Thread Palmer Dabbelt
I just happened to stumble on this one while trying to sort out the
RISC-V bits.

gcc/ChangeLog

* doc/tm.texi (TARGET_C_EXCESS_PRECISION): Add 16.
---
 gcc/doc/tm.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 858bfb80cec..7590924f2ca 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1009,7 +1009,7 @@ of the excess precision explicitly added.  For
 @code{EXCESS_PRECISION_TYPE_FLOAT16}, and
 @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the
 explicit excess precision that should be added depending on the
-value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}.
+value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{|}16@r{]}}.
 Note that unpredictable explicit excess precision does not make sense,
 so a target should never return @code{FLT_EVAL_METHOD_UNPREDICTABLE}
 when @var{type} is @code{EXCESS_PRECISION_TYPE_STANDARD},
-- 
2.34.1



[PATCH] tree-optimization/106881 - fix simple_control_dep_chain part

2022-09-09 Thread Richard Biener via Gcc-patches
This adjusts simple_control_dep_chain in the same way I adjusted
compute_control_dep_chain_pdom to avoid adding fallthru edges to
the predicate chain.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/106881
* gimple-predicate-analysis.cc (simple_control_dep_chain):
Add only non-fallthru edges and avoid the same set of edges
as compute_control_dep_chain_pdom does.
---
 gcc/gimple-predicate-analysis.cc | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index 910ab97a29e..bc9ed847267 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -926,10 +926,14 @@ simple_control_dep_chain (vec& chain, basic_block 
from, basic_block to)
 {
   basic_block dest = src;
   src = get_immediate_dominator (CDI_DOMINATORS, src);
-  edge pred_e;
-  if (single_pred_p (dest)
- && (pred_e = find_edge (src, dest)))
-   chain.safe_push (pred_e);
+  if (single_pred_p (dest))
+   {
+ edge pred_e = single_pred_edge (dest);
+ gcc_assert (pred_e->src == src);
+ if (!(pred_e->flags & ((EDGE_FAKE | EDGE_ABNORMAL | EDGE_DFS_BACK)))
+ && !single_succ_p (src))
+   chain.safe_push (pred_e);
+   }
 }
 }
 
-- 
2.35.3


Re: [PATCH] RISC-V: Support -fexcess-precision=16

2022-09-09 Thread Kito Cheng via Gcc-patches
LGTM, seems like you have landed now, see you soon :)

On Fri, Sep 9, 2022 at 5:44 PM Palmer Dabbelt  wrote:
>
> This fixes f19a327077e ("Support -fexcess-precision=16 which will enable
> FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16.") on
> RISC-V targets.
>
> gcc/ChangeLog
>
> PR target/106815
> * config/riscv/riscv.cc (riscv_excess_precision): Add support
> for EXCESS_PRECISION_TYPE_FLOAT16.
> ---
>  gcc/config/riscv/riscv.cc | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 675d92c0961..9b6d3e95b1b 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -5962,6 +5962,7 @@ riscv_excess_precision (enum excess_precision_type type)
>return (TARGET_ZFH ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
>  : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT);
>  case EXCESS_PRECISION_TYPE_IMPLICIT:
> +case EXCESS_PRECISION_TYPE_FLOAT16:
>return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
>  default:
>gcc_unreachable ();
> --
> 2.34.1
>


Re: [PATCH 2/2] RISC-V: Implement TARGET_COMPUTE_MULTILIB

2022-09-09 Thread Kito Cheng via Gcc-patches
Hi Andreas:

Hm, I should change my default GCC on Ubuntu; I didn't get this
when building with GCC 7, but it can be reproduced with GCC 11.

Will submit a patch once testing is done.

On Fri, Sep 9, 2022 at 3:21 PM Andreas Schwab  wrote:
>
> How did you test that?
>
> ../../gcc/common/config/riscv/riscv-common.cc: In function 'const char* 
> riscv_multi_lib_check(int, const char**)':
> ../../gcc/common/config/riscv/riscv-common.cc:1451:11: error: bare apostrophe 
> ''' in format [-Werror=format-diag]
>  1451 |   "Can't find suitable multilib set for 
> %<-march=%s%>/%<-mabi=%s%>",
>   |   ^
> ../../gcc/common/config/riscv/riscv-common.cc:1451:7: note: if avoiding the 
> apostrophe is not feasible, enclose it in a pair of '%<' and '%>' directives 
> instead
>  1451 |   "Can't find suitable multilib set for 
> %<-march=%s%>/%<-mabi=%s%>",
>   |   
> ^
> ../../gcc/common/config/riscv/riscv-common.cc: At global scope:
> ../../gcc/common/config/riscv/riscv-common.cc:1492:1: error: 'int 
> riscv_check_conds(const switchstr*, int, int, const 
> std::vector >&)' defined but not used 
> [-Werror=unused-function]
>  1492 | riscv_check_conds (
>   | ^
> ../../gcc/common/config/riscv/riscv-common.cc:1374:1: error: 'const char* 
> find_last_appear_switch(const switchstr*, int, const char*)' defined but not 
> used [-Werror=unused-function]
>  1374 | find_last_appear_switch (
>   | ^~~
> cc1plus: all warnings being treated as errors
> make[3]: *** [Makefile:2442: riscv-common.o] Error 1
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."


[committed] RISC-V: Suppress build warnings

2022-09-09 Thread Kito Cheng
../../gcc/common/config/riscv/riscv-common.cc: In function 'const char* 
riscv_multi_lib_check(int, const char**)':
../../gcc/common/config/riscv/riscv-common.cc:1451:11: error: bare apostrophe 
''' in format [-Werror=format-diag]
 1451 |   "Can't find suitable multilib set for %<-march=%s%>/%<-mabi=%s%>",
  |   ^
../../gcc/common/config/riscv/riscv-common.cc:1451:7: note: if avoiding the 
apostrophe is not feasible, enclose it in a pair of '%<' and '%>' directives 
instead
 1451 |   "Can't find suitable multilib set for %<-march=%s%>/%<-mabi=%s%>",
  |   ^
../../gcc/common/config/riscv/riscv-common.cc: At global scope:
../../gcc/common/config/riscv/riscv-common.cc:1492:1: error: 'int 
riscv_check_conds(const switchstr*, int, int, const 
std::vector >&)' defined but not used 
[-Werror=unused-function]
 1492 | riscv_check_conds (
  | ^
../../gcc/common/config/riscv/riscv-common.cc:1374:1: error: 'const char* 
find_last_appear_switch(const switchstr*, int, const char*)' defined but not 
used [-Werror=unused-function]
 1374 | find_last_appear_switch (
  | ^~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:2442: riscv-common.o] Error 1

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (RISCV_USE_CUSTOMISED_MULTI_LIB):
Move forward to cover all necessary functions and suppress
unused function warnings.
(riscv_multi_lib_check): Move forward, and tweak message to suppress
-Werror=format-diag warning.
---
 gcc/common/config/riscv/riscv-common.cc | 36 -
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 120a0384686..77219162eeb 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1366,6 +1366,24 @@ riscv_expand_arch_from_cpu (int argc ATTRIBUTE_UNUSED,
   return xasprintf ("-march=%s", arch.c_str());
 }
 
+/* Report error if not found suitable multilib.  */
+const char *
+riscv_multi_lib_check (int argc ATTRIBUTE_UNUSED,
+  const char **argv ATTRIBUTE_UNUSED)
+{
+  if (riscv_no_matched_multi_lib)
+fatal_error (
+  input_location,
+  "Cannot find suitable multilib set for %<-march=%s%>/%<-mabi=%s%>",
+  riscv_current_arch_str.c_str (),
+  riscv_current_abi_str.c_str ());
+
+  return "";
+}
+
+/* We only override this in bare-metal toolchain.  */
+#ifdef RISCV_USE_CUSTOMISED_MULTI_LIB
+
 /* Find last switch with the prefix, options are take last one in general,
return NULL if not found, and return the option value if found, it could
return empty string if the option has no value.  */
@@ -1440,21 +1458,6 @@ riscv_multi_lib_info_t::parse (
   return true;
 }
 
-/* Report error if not found suitable multilib.  */
-const char *
-riscv_multi_lib_check (int argc ATTRIBUTE_UNUSED,
-  const char **argv ATTRIBUTE_UNUSED)
-{
-  if (riscv_no_matched_multi_lib)
-fatal_error (
-  input_location,
-  "Can't find suitable multilib set for %<-march=%s%>/%<-mabi=%s%>",
-  riscv_current_arch_str.c_str (),
-  riscv_current_abi_str.c_str ());
-
-  return "";
-}
-
 /* Checking ARG is not appeared in SWITCHES if NOT_ARG is set or
ARG is appeared if NOT_ARG is not set.  */
 
@@ -1534,9 +1537,6 @@ riscv_check_conds (
   return match_score + ok_count * 100;
 }
 
-/* We only override this in bare-metal toolchain.  */
-#ifdef RISCV_USE_CUSTOMISED_MULTI_LIB
-
 /* Implement TARGET_COMPUTE_MULTILIB.  */
 static const char *
 riscv_compute_multilib (
-- 
2.37.2



[PING x2] Re: [PATCH, libgomp] Fix chunk_size<1 for dynamic schedule

2022-09-09 Thread Chung-Lin Tang



On 2022/8/26 4:15 PM, Chung-Lin Tang wrote:
> On 2022/8/4 9:31 PM, Koning, Paul wrote:
>>
>>
>>> On Aug 4, 2022, at 9:17 AM, Chung-Lin Tang  wrote:
>>>
>>> On 2022/6/28 10:06 PM, Jakub Jelinek wrote:
 On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:
> with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:
>
> (1) chunk_size <= -1: wraps into large unsigned value, seems to work 
> though.
> (2) chunk_size == 0:  infinite loop
>
> The (2) behavior is obviously not desired. This patch fixes this by 
> changing
 Why?  It is a user error, undefined behavior, we shouldn't slow down valid
 code for users who don't bother reading the standard.
>>>
>>> This is loop init code, not per-iteration. The overhead really isn't that 
>>> much.
>>>
>>> The question should be, if GCC having infinite loop behavior is reasonable,
>>> even if it is undefined in the spec.
>>
>> I wouldn't think so.  The way I see "undefined code" is that you can't 
>> complain about "wrong code" produced by the compiler.  But for the compiler 
>> to malfunction on wrong input is an entirely different matter.  For one 
>> thing, it's hard to fix your code if the compiler fails.  How would you 
>> locate the offending source line?
>>
>>  paul
> 
> Ping?

Ping x2.
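
To make the failure mode in this thread concrete, here is a simplified model of handing out loop iterations in chunks. It is not the libgomp implementation and it does not reproduce the proposed patch (whose actual change is not shown in the quoted text); the struct, the function names and the clamp-to-1 guard are assumptions for illustration. Without the guard in the init function, a chunk size of 0 would never advance the start position and the caller would loop forever.

```c
/* Hypothetical model of a dynamically scheduled loop over [next, end).  */
struct dyn_loop
{
  long next;   /* first iteration not yet handed out */
  long end;    /* one past the last iteration */
  long chunk;  /* iterations handed out per request */
};

/* Loop init: runs once per loop, not once per iteration.  */
static void
dyn_loop_init (struct dyn_loop *l, long start, long end, long chunk)
{
  l->next = start;
  l->end = end;
  /* Assumed guard: treat any chunk size below 1 as 1.  */
  l->chunk = chunk < 1 ? 1 : chunk;
}

/* Hand out the next range [*s, *e); return 0 when nothing is left.  */
static int
dyn_loop_next (struct dyn_loop *l, long *s, long *e)
{
  if (l->next >= l->end)
    return 0;
  long left = l->end - l->next;
  long n = left < l->chunk ? left : l->chunk;
  *s = l->next;
  *e = l->next + n;
  l->next = *e;
  return 1;
}

/* Count how many chunks are needed to cover [start, end).  */
static long
count_chunks (long start, long end, long chunk)
{
  struct dyn_loop l;
  long s, e, n = 0;
  dyn_loop_init (&l, start, end, chunk);
  while (dyn_loop_next (&l, &s, &e))
    n++;
  return n;
}
```

The guard sits in the init path only, which is the point made above about the overhead being per loop rather than per iteration.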


[PATCH] Makefile.tpl: pass CXXFLAGS_FOR_BUILD where appropriate

2022-09-09 Thread Ross Burton via Gcc-patches
If CXXFLAGS contains something unsupported by the build CXX, we see
build failures (e.g. using -fmacro-prefix-map for the target). Ensure
that CXXFLAGS_FOR_BUILD is passed where appropriate so that the correct
flags are used.

ChangeLog:

* Makefile.in: Regenerate.
* Makefile.tpl: Add missing CXXFLAGS_FOR_BUILD overrides

Signed-off-by: Ross Burton 
---
 Makefile.in  | 2 ++
 Makefile.tpl | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/Makefile.in b/Makefile.in
index 1919dfee829..6f96852ed80 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -176,6 +176,7 @@ BUILD_EXPORTS = \
 # built for the build system to override those in BASE_FLAGS_TO_PASS.
 EXTRA_BUILD_FLAGS = \
CFLAGS="$(CFLAGS_FOR_BUILD)" \
+   CXXFLAGS="$(CXXFLAGS_FOR_BUILD)" \
LDFLAGS="$(LDFLAGS_FOR_BUILD)"
 
 # This is the list of directories to built for the host system.
@@ -207,6 +208,7 @@ HOST_EXPORTS = \
CPP_FOR_BUILD="$(CPP_FOR_BUILD)"; export CPP_FOR_BUILD; \
CPPFLAGS_FOR_BUILD="$(CPPFLAGS_FOR_BUILD)"; export CPPFLAGS_FOR_BUILD; \
CXX_FOR_BUILD="$(CXX_FOR_BUILD)"; export CXX_FOR_BUILD; \
+   CXXFLAGS_FOR_BUILD="$(CXXFLAGS_FOR_BUILD)"; export CXXFLAGS_FOR_BUILD; \
DLLTOOL="$(DLLTOOL)"; export DLLTOOL; \
DSYMUTIL="$(DSYMUTIL)"; export DSYMUTIL; \
LD="$(LD)"; export LD; \
diff --git a/Makefile.tpl b/Makefile.tpl
index c7344558429..5876ad5aa5d 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -179,6 +179,7 @@ BUILD_EXPORTS = \
 # built for the build system to override those in BASE_FLAGS_TO_PASS.
 EXTRA_BUILD_FLAGS = \
CFLAGS="$(CFLAGS_FOR_BUILD)" \
+   CXXFLAGS="$(CXXFLAGS_FOR_BUILD)" \
LDFLAGS="$(LDFLAGS_FOR_BUILD)"
 
 # This is the list of directories to built for the host system.
@@ -210,6 +211,7 @@ HOST_EXPORTS = \
CPP_FOR_BUILD="$(CPP_FOR_BUILD)"; export CPP_FOR_BUILD; \
CPPFLAGS_FOR_BUILD="$(CPPFLAGS_FOR_BUILD)"; export CPPFLAGS_FOR_BUILD; \
CXX_FOR_BUILD="$(CXX_FOR_BUILD)"; export CXX_FOR_BUILD; \
+   CXXFLAGS_FOR_BUILD="$(CXXFLAGS_FOR_BUILD)"; export CXXFLAGS_FOR_BUILD; \
DLLTOOL="$(DLLTOOL)"; export DLLTOOL; \
DSYMUTIL="$(DSYMUTIL)"; export DSYMUTIL; \
LD="$(LD)"; export LD; \
-- 
2.34.1



Re: [PATCH] amdgcn: Add support for additional natively supported floating-point operations

2022-09-09 Thread Richard Biener via Gcc-patches
On Fri, 9 Sep 2022, Tobias Burnus wrote:

> On 09.09.22 10:10, Andrew Stubbs wrote:
> On 08.09.22 22:38, Kwok Cheung Yeung wrote:
> The instructions for the transcendental functions are documented to have
> limited numerical precision, so they are only used if
> unsafe_math_optimizations are enabled for now.
> 
> -funsafe-math-optimizations implies -fno-signed-zeros, -fno-trapping-math,
> -fassociative-math,
> and -freciprocal-math. All of them reduce precision and may violate IEEE or
> ISO/language standards.
> 
> However, I think it is rather surprising to have all of the sudden only a
> precision of the
> order of 100,000,000 ULP instead of ~4 ULP as to be expected. That's a
> precision loss of the
> order of 10^8 or 2^29 which is huge!
> 
> For programs deliberately using double precision, it can be too much – even if
> they do not need
> double precision in reality. (Weather forecast system recently moved to single
> precision as the
> quality is similar and benefits of faster results/finer grids or longer
> forecast times prevail.)
> 
> As this behavior is highly surprising, I think it should be at least
> documented.
> 
> In https://gcc.gnu.org/PR105246 , I suggested a new flag (such as
> -mpermit-reduced-precision) to
> make it possible turn it on/off explicitly (might be still enabled by
> -funsafe-math-optimizations);
> alternatively, it could also be handled as initial guess for the result which
> is then refined
> in some iteration steps. (It could also be combined to give the user the
> choice.)
> 
> While still being convinced that a flag makes more sense than just documenting
> it,
> I have nonetheless attached a documentation attempt.
> 
> Thoughts?

I agree - for example powerpc has -mrecip= to control which instructions
to use (float/double rsqrt or inverse) and -mrecip-precision to
specify whether further iteration is done or not.

x86 has something similar but always performs a Newton-Raphson iteration,
documenting 2 ulp instead of 0.5 ulp precision.

Your suggested huge reduction in precision isn't usually acceptable
and should always be explicitly enabled.

Richard.


[PATCH] gcov: Respect tripplet when looking for gcov

2022-09-09 Thread Torbjörn SVENSSON via Gcc-patches
When testing a cross toolchain outside the build tree, the binary name
for gcov is prefixed with the triplet.

gcc/testsuite/ChangeLog:

* g++.dg/gcov/gcov.exp: Respect tripplet when looking for gcov
* gcc.misc-tests/gcov.exp: Likewise

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/g++.dg/gcov/gcov.exp| 4 ++--
 gcc/testsuite/gcc.misc-tests/gcov.exp | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/g++.dg/gcov/gcov.exp 
b/gcc/testsuite/g++.dg/gcov/gcov.exp
index 88acd95c361..04e7a016486 100644
--- a/gcc/testsuite/g++.dg/gcov/gcov.exp
+++ b/gcc/testsuite/g++.dg/gcov/gcov.exp
@@ -24,9 +24,9 @@ global GXX_UNDER_TEST
 
 # Find gcov in the same directory as $GXX_UNDER_TEST.
 if { ![is_remote host] && [string match "*/*" [lindex $GXX_UNDER_TEST 0]] } {
-set GCOV [file dirname [lindex $GXX_UNDER_TEST 0]]/gcov
+set GCOV [file dirname [lindex $GXX_UNDER_TEST 0]]/[transform gcov]
 } else {
-set GCOV gcov
+set GCOV [transform gcov]
 }
 
 # Initialize harness.
diff --git a/gcc/testsuite/gcc.misc-tests/gcov.exp 
b/gcc/testsuite/gcc.misc-tests/gcov.exp
index 82376d90ac2..a55ce234f6e 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov.exp
+++ b/gcc/testsuite/gcc.misc-tests/gcov.exp
@@ -24,9 +24,9 @@ global GCC_UNDER_TEST
 
 # For now find gcov in the same directory as $GCC_UNDER_TEST.
 if { ![is_remote host] && [string match "*/*" [lindex $GCC_UNDER_TEST 0]] } {
-set GCOV [file dirname [lindex $GCC_UNDER_TEST 0]]/gcov
+set GCOV [file dirname [lindex $GCC_UNDER_TEST 0]]/[transform gcov]
 } else {
-set GCOV gcov
+set GCOV [transform gcov]
 }
 
 # Initialize harness.
-- 
2.25.1
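
As a rough picture of what the change relies on: for a cross toolchain the
installed tool is named with the target triplet as a prefix, while a native
toolchain keeps the plain name. The C++ sketch below models only that rule.
The function name and its parameters are hypothetical, and DejaGnu's actual
transform procedure handles more cases than this.

```cpp
#include <cassert>
#include <string>

// Simplified model of a tool-name transform: native toolchains keep the
// plain name, cross toolchains get "<target triplet>-" prepended.
// Hypothetical helper, not part of DejaGnu or the GCC testsuite.
std::string transform_tool_name(const std::string &tool,
                                const std::string &host_triplet,
                                const std::string &target_triplet) {
    assert(!tool.empty());
    if (target_triplet.empty() || target_triplet == host_triplet)
        return tool;
    return target_triplet + "-" + tool;
}
```

With a host of x86_64-pc-linux-gnu and a target of arm-none-eabi, this model
yields arm-none-eabi-gcov, which is the kind of name the testsuite has to
look for when run outside the build tree.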



Re: [PATCH] gcov: Respect tripplet when looking for gcov

2022-09-09 Thread Martin Liška
On 9/9/22 12:27, Torbjörn SVENSSON wrote:
> When testing a cross toolchain outside the build tree, the binary name
> for gcov is prefixed with the tripplet.

Ok, thanks!

Martin

> 
> gcc/testsuite/ChangeLog:
> 
> * g++.dg/gcov/gcov.exp: Respect tripplet when looking for gcov
> * gcc.misc-tests/gcov.exp: Likewise
> 
> Signed-off-by: Torbjörn SVENSSON 
> ---
>  gcc/testsuite/g++.dg/gcov/gcov.exp| 4 ++--
>  gcc/testsuite/gcc.misc-tests/gcov.exp | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/testsuite/g++.dg/gcov/gcov.exp 
> b/gcc/testsuite/g++.dg/gcov/gcov.exp
> index 88acd95c361..04e7a016486 100644
> --- a/gcc/testsuite/g++.dg/gcov/gcov.exp
> +++ b/gcc/testsuite/g++.dg/gcov/gcov.exp
> @@ -24,9 +24,9 @@ global GXX_UNDER_TEST
>  
>  # Find gcov in the same directory as $GXX_UNDER_TEST.
>  if { ![is_remote host] && [string match "*/*" [lindex $GXX_UNDER_TEST 0]] } {
> -set GCOV [file dirname [lindex $GXX_UNDER_TEST 0]]/gcov
> +set GCOV [file dirname [lindex $GXX_UNDER_TEST 0]]/[transform gcov]
>  } else {
> -set GCOV gcov
> +set GCOV [transform gcov]
>  }
>  
>  # Initialize harness.
> diff --git a/gcc/testsuite/gcc.misc-tests/gcov.exp 
> b/gcc/testsuite/gcc.misc-tests/gcov.exp
> index 82376d90ac2..a55ce234f6e 100644
> --- a/gcc/testsuite/gcc.misc-tests/gcov.exp
> +++ b/gcc/testsuite/gcc.misc-tests/gcov.exp
> @@ -24,9 +24,9 @@ global GCC_UNDER_TEST
>  
>  # For now find gcov in the same directory as $GCC_UNDER_TEST.
>  if { ![is_remote host] && [string match "*/*" [lindex $GCC_UNDER_TEST 0]] } {
> -set GCOV [file dirname [lindex $GCC_UNDER_TEST 0]]/gcov
> +set GCOV [file dirname [lindex $GCC_UNDER_TEST 0]]/[transform gcov]
>  } else {
> -set GCOV gcov
> +set GCOV [transform gcov]
>  }
>  
>  # Initialize harness.



[committed] libgomp: Fix up OMP_PROC_BIND handling [PR106894]

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Aug 31, 2022 at 12:56:25PM +0200, Marcel Vollweiler wrote:
> +   case PARSE_BIND:
> + *(char *) (host_envvars[omp_var].dest[0])
> +   = *(char *) params[0];
> + *(char *) (host_envvars[omp_var].dest[1])
> +   = *(char *) params[1];
> + *(unsigned long *) (host_envvars[omp_var].dest[2])
> +   = *(unsigned long *) params[2];

While the first param is char (gomp_global_icv.bind_var), the second param
is char * (gomp_bind_var_list), so we shouldn't access it through *(char *).

Tested on x86_64-linux with
make check RUNTESTFLAGS="c.exp='*affinity* icv-6.c *display*' 
c++.exp='*affinity* icv-6.c *display*' fortran.exp='*affinity*'"
which previously had various failures, committed to trunk.

2022-09-09  Jakub Jelinek  

PR libgomp/106894
* env.c (initialize_env) <case PARSE_BIND>: Use char ** instead of
char * for dest[1] initialization from params[1].  Formatting fixes.

--- libgomp/env.c.jj	2022-09-08 20:22:07.849183684 +0200
+++ libgomp/env.c   2022-09-09 13:30:14.090107492 +0200
@@ -2184,12 +2184,10 @@ initialize_env (void)
*(int *) (host_envvars[omp_var].dest[1]) = *(int *) params[1];
break;
  case PARSE_BIND:
-   *(char *) (host_envvars[omp_var].dest[0])
- = *(char *) params[0];
-   *(char *) (host_envvars[omp_var].dest[1])
- = *(char *) params[1];
+   *(char *) (host_envvars[omp_var].dest[0]) = *(char *) params[0];
+   *(char **) (host_envvars[omp_var].dest[1]) = *(char **) params[1];
*(unsigned long *) (host_envvars[omp_var].dest[2])
- = *(unsigned long *) params[2];
+ = *(unsigned long *) params[2];
break;
  }
   }


Jakub
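
The difference between the two casts can be shown in isolation with a short
C++ sketch. It assumes a layout like the one described above (slot 0 points
at a char, slot 1 points at a char * object); the function and its
parameters are illustrative and are not libgomp's declarations.

```cpp
#include <cassert>

// Copies two settings through untyped slots, the way the fixed code does:
// dest[0]/params[0] point at a char, dest[1]/params[1] point at a char *.
// Reading slot 1 through *(char *) would copy a single byte of the pointer
// instead of the whole pointer value.
void copy_bind_settings(void *const dest[2], void *const params[2]) {
    assert(dest && params);
    *(char *) dest[0] = *(char *) params[0];
    *(char **) dest[1] = *(char **) params[1];
}
```

After the call, the destination pointer compares equal to the source
pointer, which a one-byte copy cannot guarantee.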



Re: [PATCH] gcov: Respect tripplet when looking for gcov

2022-09-09 Thread Rainer Orth
Hi Torbjörn,

> When testing a cross toolchain outside the build tree, the binary name
> for gcov is prefixed with the tripplet.

here and below: the beast is called triplet.

> gcc/testsuite/ChangeLog:
>
> * g++.dg/gcov/gcov.exp: Respect tripplet when looking for gcov
> * gcc.misc-tests/gcov.exp: Likewise

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Modula-2: merge followup (brief update on the progress of the new linking implementation)

2022-09-09 Thread Martin Liška
On 9/8/22 17:52, Gaius Mulley wrote:
> Martin Liška  writes:
> 
>> Note I've just converted the current Modula-2 manual to RST (Sphinx):
>> https://splichal.eu/scripts/sphinx/
>>
>> It contains some minor issues, but in general it should be pretty fine. Note 
>> pygments
>> contains a corresponding lexer:
>> https://pygments.org/docs/lexers/#multi-dialect-lexer-for-modula-2
> 
> very nice - well done!  All very useful - and much easier to see bugs in
> gm2.texi.  I see a few missing options (missing from gm2.texi) and also

Do you have an example, please?

> I see problems with the libraries - meta data (const/type/var) is

What type of meta data do you mean?

> rendered (@findex perhaps should be ignored or processed).  Or I could
> disable it from gcc/m2/tools-src/def2texi.py)?
> 
> It is starting to look very nice indeed

Thanks!

Martin

> 
> regards,
> Gaius



GCN: Add -mlow-precision-sqrt for double-precision sqrt [PR105246] (was: Re: [PATCH] amdgcn: Add support for additional natively supported floating-point operations)

2022-09-09 Thread Tobias Burnus

On 09.09.22 12:16, Richard Biener wrote:
> On Fri, 9 Sep 2022, Tobias Burnus wrote:
>> -funsafe-math-optimizations implies -fno-signed-zeros, -fno-trapping-math,
>> -fassociative-math, and -freciprocal-math. All of them reduce precision and
>> may violate IEEE or ISO/language standards.
>>
>> However, I think it is rather surprising to suddenly have only a precision
>> of the order of 100,000,000 ULP instead of the ~4 ULP one would expect.
>> That's a precision loss of the order of 10^8 or 2^29, which is huge!
>> ...
>
> I agree - for example powerpc has -mrecip= to control which instructions
> to use (float/double rsqrt or inverse) and -mrecip-precision to
> specify whether further iteration is done or not.
> [...]
> Your suggested huge reduction in precision isn't usually acceptable
> and should always be explicitly enabled.


First, I have to correct myself – Kwok's -munsafe-math-optimizations is
only about single-precision functions, which do not have this problem.

However, the pre-existing 'sqrt' problem is still real. It also applies
to reciprocal sqrt ("v_rsq"), but that's for whatever reason not used for GCN.

This patch now adds a commandline flag - off by default - to choose
whether this behavior is wanted. I did use the same name as aarch64,
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#index-mlow-precision-sqrt
(the latter also has -mlow-precision-recip-sqrt, which is not (yet)
sensible for GCN.)

This patch was manually tested for all combinations and I also looked at
insn-recog.cc, given that it is my first .md patch – it seems to work
fine.

OK for mainline – or are there comments or more suggestions? I also
included some word for the release notes.

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
GCN: Add -mlow-precision-sqrt for double-precision sqrt [PR105246]

GCN's sqrt supports single and double precision; however, for either
the result has 23 bits for the fractional part of the floating-point
number. (For double precision: instead of 52 bits).

This adds now -mlow-precision-sqrt, using the same naming as aarch64.
Before, the hardware builtin sqrt was always used with
unsafe-math-optimizations, now only with single precision; for
double precision, the new -mlow-precision-sqrt is explicitly
required in addition. As there is no rsqrt, this flag likewise
applies to 1/sqrt.

	PR target/105246

gcc/ChangeLog:

	* config/gcn/gcn.opt (mlow-precision-sqrt): New, off by default.
	* config/gcn/gcn-valu.md (sqrt, v_sqrt): Require it unless SFmode.
	* doc/invoke.texi (GCN): Add -mlow-precision-sqrt entry.

 gcc/config/gcn/gcn-valu.md |  6 --
 gcc/config/gcn/gcn.opt |  7 +++
 gcc/doc/invoke.texi| 11 +++
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 8c33ae0c717..c7a0b562874 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -2276,7 +2276,8 @@ (define_insn "sqrt2"
   [(set (match_operand:V_FP 0 "register_operand"  "=  v")
 	(sqrt:V_FP
 	  (match_operand:V_FP 1 "gcn_alu_operand" "vSvB")))]
-  "flag_unsafe_math_optimizations"
+  "(flag_unsafe_math_optimizations
+&& (<MODE>mode == V64SFmode || flag_mlow_precision_sqrt))"
   "v_sqrt%i0\t%0, %1"
   [(set_attr "type" "vop1")
(set_attr "length" "8")])
@@ -2285,7 +2286,8 @@ (define_insn "sqrt2"
   [(set (match_operand:FP 0 "register_operand"  "=  v")
 	(sqrt:FP
 	  (match_operand:FP 1 "gcn_alu_operand" "vSvB")))]
-  "flag_unsafe_math_optimizations"
+  "(flag_unsafe_math_optimizations
+&& (<MODE>mode == SFmode || flag_mlow_precision_sqrt))"
   "v_sqrt%i0\t%0, %1"
   [(set_attr "type" "vop1")
(set_attr "length" "8")])
diff --git a/gcc/config/gcn/gcn.opt b/gcc/config/gcn/gcn.opt
index 9606aaf0b1a..a3f341f7eb1 100644
--- a/gcc/config/gcn/gcn.opt
+++ b/gcc/config/gcn/gcn.opt
@@ -77,6 +77,13 @@ mgang-private-size=
 Target RejectNegative Joined UInteger Var(gang_private_size_opt) Init(-1)
 Amount of local data-share (LDS) memory to reserve for gang-private variables.
 
+mlow-precision-sqrt
+Target Var(flag_mlow_precision_sqrt) Optimization
+Enable the square root approximation for 64bit double precision;
+this reduces precision of square root results to 23 bits for the
+fractional part of the floating-point number.
+It also implies low-precision reciprocal sqrt.
+
 Wopenacc-dims
 Target Var(warn_openacc_dims) Warning
 Warn about invalid OpenACC dimensions.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5c066219a7d..fdd6e41cade 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -20192,6 +20192,17 @@ compiled code must match the device mode.  The default is @samp{-mno-xnack}.
 At present this option is a placeholder for support that is not yet
 implemented.
 
+@item -mlow-precision-sqrt
+@itemx -mno-low-prec

[PATCH] tree-optimization/106892 - avoid invalid pointer association in predcom

2022-09-09 Thread Richard Biener via Gcc-patches
When predictive commoning builds a reference for iteration N it
prematurely associates a constant offset into the MEM_REF offset
operand which can be invalid if the base pointer then points
outside of an object which alias-analysis does not consider valid.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk
so far.

PR tree-optimization/106892
* tree-predcom.cc (ref_at_iteration): Do not associate the
constant part of the offset into the MEM_REF offset
operand, across a non-zero offset.

* gcc.dg/torture/pr106892.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr106892.c | 30 +
 gcc/tree-predcom.cc | 18 +--
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr106892.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr106892.c 
b/gcc/testsuite/gcc.dg/torture/pr106892.c
new file mode 100644
index 000..73a66a037b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr106892.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+
+int a, b, c, d, e;
+int f[8];
+static int g() {
+  while (a)
+a >>= 4;
+  return 0;
+}
+static int h(int i) {
+  if (i >= '0')
+return i - '0';
+  //__builtin_unreachable ();
+}
+void __attribute__((noipa)) j(int i) {
+  for (b = 2; g() <= 7; b++)
+if (i) {
+  for (; e <= 7; e++)
+for (c = 1; c <= 7; c++) {
+  d = h(b + '0');
+  f[-d + 4] ^= 3;
+}
+  return;
+}
+}
+int main() {
+  j(1);
+  if (f[2] != 0)
+__builtin_abort ();
+}
diff --git a/gcc/tree-predcom.cc b/gcc/tree-predcom.cc
index 5d923fba170..a6e45e36ffd 100644
--- a/gcc/tree-predcom.cc
+++ b/gcc/tree-predcom.cc
@@ -1771,10 +1771,24 @@ ref_at_iteration (data_reference_p dr, int iter,
  ref = TREE_OPERAND (ref, 0);
}
 }
-  tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
+  /* We may not associate the constant offset across the pointer plus
+ expression because that might form a pointer to before the object
+ then.  But for some cases we can retain that to allow tree_could_trap_p
+ to return false - see gcc.dg/tree-ssa/predcom-1.c  */
+  tree addr, alias_ptr;
+  if (integer_zerop  (off))
+{
+  alias_ptr = fold_convert (reference_alias_ptr_type (ref), coff);
+  addr = DR_BASE_ADDRESS (dr);
+}
+  else
+{
+  alias_ptr = build_zero_cst (reference_alias_ptr_type (ref));
+  off = size_binop (PLUS_EXPR, off, coff);
+  addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
+}
   addr = force_gimple_operand_1 (unshare_expr (addr), stmts,
 is_gimple_mem_ref_addr, NULL_TREE);
-  tree alias_ptr = fold_convert (reference_alias_ptr_type (ref), coff);
   tree type = build_aligned_type (TREE_TYPE (ref),
  get_object_alignment (ref));
   ref = build2 (MEM_REF, type, addr, alias_ptr);
-- 
2.35.3
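
To make the problem concrete, here is a small C++ model of the offset
arithmetic, assuming 4-byte int so that f[8] from the testcase occupies 32
bytes. It does no pointer arithmetic itself; it only checks whether the
address that would be formed as the base pointer stays inside the object.
All names are invented for this sketch and do not correspond to functions in
tree-predcom.cc.

```cpp
#include <cassert>
#include <cstddef>

// True if a byte offset from the start of an object of `size` bytes still
// designates a position inside the object or one past its end.
bool offset_in_object(std::ptrdiff_t offset, std::size_t size) {
    assert(size > 0);
    return offset >= 0 && static_cast<std::size_t>(offset) <= size;
}

// Old scheme: the pointer is base + variable_off and the constant part is
// kept separately as the memory reference's offset operand.
bool pointer_valid_when_constant_split(std::ptrdiff_t variable_off,
                                       std::size_t size) {
    return offset_in_object(variable_off, size);
}

// New scheme for a non-zero variable offset: both parts are summed first,
// so the pointer is base + (variable_off + constant_off).
bool pointer_valid_when_offsets_summed(std::ptrdiff_t variable_off,
                                       std::ptrdiff_t constant_off,
                                       std::size_t size) {
    return offset_in_object(variable_off + constant_off, size);
}
```

For the access f[-d + 4] with d == 2, the variable part is -8 bytes and the
constant part is 16 bytes: split apart, the base pointer would lie before
the array; summed, it lands at byte 8, inside it.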


RE: [PATCH] amdgcn: Add support for additional natively supported floating-point operations

2022-09-09 Thread Stubbs, Andrew
> -Original Message-
> I agree - for example powerpc has -mrecip= to control which instructions
> to use (float/double rsqrt or inverse) and -mrecip-precision to
> specify whether further iteration is done or not.
> 
> x86 has similar but does always perform newton raphson iteration,
> documenting 2 ulp instead of 0.5 ulp precision.
> 
> Your suggested huge reduction in precision isn't usually acceptable
> and should be always explicitely enabled.

There isn't a problem with *this* patch (although we do have existing accuracy 
issues thanks to previous documents lacking the information).

The "inaccurate" instructions are single-precision only, and therefore 
acceptable with -ffast-math.

Kwok intends to provide vectorized library calls for the double-precision and 
-fno-fast-math cases.

In general I want to avoid adding extra arch-specific options; partly because 
approximately no one will use them, and partly because the amdgcn compiler is 
almost always hidden behind an x86_64 compiler.

Andrew


Re: GCN: Add -mlow-precision-sqrt for double-precision sqrt [PR105246] (was: Re: [PATCH] amdgcn: Add support for additional natively supported floating-point operations)

2022-09-09 Thread Andrew Stubbs

On 09/09/2022 13:20, Tobias Burnus wrote:
> However, the pre-existing 'sqrt' problem is still real. It also applies
> to reciprocal sqrt ("v_rsq"), but that's for whatever reason not used for GCN.
>
> This patch now adds a commandline flag - off by default - to choose
> whether this behavior is wanted. I did use the same name as aarch64,
> https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#index-mlow-precision-sqrt
> (the latter also has -mlow-precision-recip-sqrt, which is not (yet)
> sensible for GCN.)
>
> This patch was manually tested for all combinations and I also looked at
> insn-recog.cc, given that it is my first .md patch – it seems to work
> fine.
>
> OK for mainline – or are there comments or more suggestions? I also
> included some word for the release notes.

No, thank you.

I don't see any value in adding an option no one cares about (but we 
still have to maintain and test).


I think it will make sense to drop the double-precision insn definition 
and fall back to libm in that case.


Kwok is currently reviewing all the libm functions and can probably 
include this one.


Andrew


[PATCH] c++: remove '_sfinae' suffix from functions

2022-09-09 Thread Patrick Palka via Gcc-patches
Each of the following functions

  instantiate_non_dependent_expr
  get_target_expr
  require_complete_type
  abstract_virtuals_error
  cxx_constant_value

is (presumably for historical reasons) just a non-SFINAE-enabled wrapper
for the corresponding SFINAE-enabled version that's suffixed by '_sfinae'.
But this suffix is at best redundant since a 'complain' parameter already
conveys that a function is appropriately SFINAE-enabled, and having two
such versions of a function is cluttersome compared to just using a default
argument (and also no less error prone I think).

So this patch squashes the two versions of each of the above functions
by adding a default 'complain' argument to the SFINAE-enabled version
whose '_sfinae' suffix we then remove.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* call.cc (build_conditional_expr): Adjust calls to
'_sfinae'-suffixed functions.
(build_temp): Likewise.
(convert_like_internal): Likewise.
(convert_arg_to_ellipsis): Likewise.
(build_over_call): Likewise.
(build_cxx_call): Likewise.
(build_new_method_call): Likewise.
* constexpr.cc (cxx_eval_outermost_constant_expr): Likewise.
(cxx_constant_value_sfinae): Rename to ...
(cxx_constant_value): ... this.  Document its default arguments.
(fold_non_dependent_expr): Adjust function comment.
* cp-tree.h (instantiate_non_dependent_expr_sfinae): Rename to ...
(instantiate_non_dependent_expr): ... this.  Give its 'complain'
parameter a default argument.
(get_target_expr_sfinae, get_target_expr): Likewise.
(require_complete_type_sfinae, require_complete_type): Likewise.
(abstract_virtuals_error_sfinae, abstract_virtuals_error):
Likewise.
(cxx_constant_value_sfinae, cxx_constant_value): Likewise.
* cvt.cc (build_up_reference): Adjust calls to '_sfinae'-suffixed
functions.
(ocp_convert): Likewise.
* decl.cc (build_explicit_specifier): Likewise.
* except.cc (build_noexcept_spec): Likewise.
* init.cc (build_new_1): Likewise.
* pt.cc (expand_integer_pack): Likewise.
(instantiate_non_dependent_expr_internal): Adjust function
comment.
(instantiate_non_dependent_expr_sfinae): Rename to ...
(instantiate_non_dependent_expr): ... this.  Document its
default argument.
(tsubst_init): Adjust calls to '_sfinae'-suffixed functions.
(fold_targs_r): Likewise.
* semantics.cc (finish_compound_literal): Likewise.
(finish_decltype_type): Likewise.
(cp_build_bit_cast): Likewise.
* tree.cc (build_cplus_new): Likewise.
(get_target_expr_sfinae): Rename to ...
(get_target_expr): ... this.  Document its default
argument.
* typeck.cc (require_complete_type_sfinae): Rename to ...
(require_complete_type): ... this.  Document its default
argument.
(cp_build_array_ref): Adjust calls to '_sfinae'-suffixed
functions.
(convert_arguments): Likewise.
(cp_build_binary_op): Likewise.
(build_static_cast_1): Likewise.
(cp_build_modify_expr): Likewise.
(convert_for_initialization): Likewise.
* typeck2.cc (abstract_virtuals_error_sfinae): Rename to ...
(abstract_virtuals_error): ... this.  Document its default
argument.
(build_functional_cast_1): Adjust calls to '_sfinae'-suffixed
functions.
---
 gcc/cp/call.cc  | 22 +++---
 gcc/cp/constexpr.cc | 20 ++--
 gcc/cp/cp-tree.h| 23 +++
 gcc/cp/cvt.cc   |  4 ++--
 gcc/cp/decl.cc  |  2 +-
 gcc/cp/except.cc|  2 +-
 gcc/cp/init.cc  |  2 +-
 gcc/cp/pt.cc| 17 ++---
 gcc/cp/semantics.cc |  6 +++---
 gcc/cp/tree.cc  | 10 ++
 gcc/cp/typeck.cc| 21 -
 gcc/cp/typeck2.cc   | 33 ++---
 12 files changed, 62 insertions(+), 100 deletions(-)

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index d107a2814dc..7e9289fc2d0 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -5976,7 +5976,7 @@ build_conditional_expr (const op_location_t &loc,
 but now we sometimes wrap them in NOP_EXPRs so the test would
 fail.  */
   if (CLASS_TYPE_P (TREE_TYPE (result)))
-   result = get_target_expr_sfinae (result, complain);
+   result = get_target_expr (result, complain);
   /* If this expression is an rvalue, but might be mistaken for an
 lvalue, we must add a NON_LVALUE_EXPR.  */
   result = rvalue (result);
@@ -7672,7 +7672,7 @@ build_temp (tree expr, tree type, int flags,
   if ((lvalue_kind (expr) & clk_packed)
   && CLASS_TYPE_P (TREE_TYPE (expr))
   && !type_has_nontrivial_copy_init (TREE_TYPE (expr)))
-return get_target_expr_sfinae (expr, complain);
+
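
The squashing described in the cover text can be pictured with a generic C++
sketch: instead of a plain wrapper forwarding to a '_sfinae' variant, a
single function takes the mode as a defaulted parameter. The enum and
function here are hypothetical stand-ins chosen for illustration; they are
not GCC's tsubst_flags_t or any of the functions named above.

```cpp
#include <cassert>

// Hypothetical diagnostics mode: report failures, or fail silently as in
// a SFINAE context.
enum complain_mode { complain_none = 0, complain_error = 1 };

// One entry point replaces the wrapper pair.  Callers that want
// diagnostics omit the last argument; SFINAE-style callers pass
// complain_none explicitly.  A reported failure bumps `diagnostics`.
bool check_value(int value, int &diagnostics,
                 complain_mode complain = complain_error) {
    assert(complain == complain_none || complain == complain_error);
    if (value >= 0)
        return true;
    if (complain == complain_error)
        ++diagnostics;
    return false;
}
```

Existing callers of the old unsuffixed wrapper keep compiling unchanged,
since the default reproduces its behaviour.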

[PATCH v2] gcov: Respect triplet when looking for gcov

2022-09-09 Thread Torbjörn SVENSSON via Gcc-patches
When testing a cross toolchain outside the build tree, the binary name
for gcov is prefixed with the triplet.

gcc/testsuite/ChangeLog:

* g++.dg/gcov/gcov.exp: Respect triplet when looking for gcov.
* gcc.misc-tests/gcov.exp: Likewise.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/g++.dg/gcov/gcov.exp| 4 ++--
 gcc/testsuite/gcc.misc-tests/gcov.exp | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/g++.dg/gcov/gcov.exp 
b/gcc/testsuite/g++.dg/gcov/gcov.exp
index 88acd95c361..04e7a016486 100644
--- a/gcc/testsuite/g++.dg/gcov/gcov.exp
+++ b/gcc/testsuite/g++.dg/gcov/gcov.exp
@@ -24,9 +24,9 @@ global GXX_UNDER_TEST
 
 # Find gcov in the same directory as $GXX_UNDER_TEST.
 if { ![is_remote host] && [string match "*/*" [lindex $GXX_UNDER_TEST 0]] } {
-set GCOV [file dirname [lindex $GXX_UNDER_TEST 0]]/gcov
+set GCOV [file dirname [lindex $GXX_UNDER_TEST 0]]/[transform gcov]
 } else {
-set GCOV gcov
+set GCOV [transform gcov]
 }
 
 # Initialize harness.
diff --git a/gcc/testsuite/gcc.misc-tests/gcov.exp 
b/gcc/testsuite/gcc.misc-tests/gcov.exp
index 82376d90ac2..a55ce234f6e 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov.exp
+++ b/gcc/testsuite/gcc.misc-tests/gcov.exp
@@ -24,9 +24,9 @@ global GCC_UNDER_TEST
 
 # For now find gcov in the same directory as $GCC_UNDER_TEST.
 if { ![is_remote host] && [string match "*/*" [lindex $GCC_UNDER_TEST 0]] } {
-set GCOV [file dirname [lindex $GCC_UNDER_TEST 0]]/gcov
+set GCOV [file dirname [lindex $GCC_UNDER_TEST 0]]/[transform gcov]
 } else {
-set GCOV gcov
+set GCOV [transform gcov]
 }
 
 # Initialize harness.
-- 
2.25.1



[PATCH] testsuite: gluefile file need to be prefixed

2022-09-09 Thread Torbjörn SVENSSON via Gcc-patches
PR/95720
When the status wrapper is used, the gluefile needs to be prefixed with
-Wl, in order for the test cases to have the dump files with the
expected names.

gcc/testsuite/ChangeLog:

* gcc/testsuite/lib/g++.exp: Moved gluefile block to after
  flags have been prefixed for the target_compile call.
* gcc/testsuite/lib/gcc.exp: Likewise.
* gcc/testsuite/lib/wrapper.exp: Reset adjusted state flag.

Co-Authored-By: Yvan ROUX 
Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/lib/g++.exp | 10 +-
 gcc/testsuite/lib/gcc.exp | 21 +++--
 gcc/testsuite/lib/wrapper.exp |  7 ++-
 3 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/gcc/testsuite/lib/g++.exp b/gcc/testsuite/lib/g++.exp
index 24ef068b239..16e61fb4ad4 100644
--- a/gcc/testsuite/lib/g++.exp
+++ b/gcc/testsuite/lib/g++.exp
@@ -303,11 +303,6 @@ proc g++_target_compile { source dest type options } {
 global flags_to_postpone
 global board_info
 
-if { [target_info needs_status_wrapper] != "" && [info exists gluefile] } {
-   lappend options "libs=${gluefile}"
-   lappend options "ldflags=${wrap_flags}"
-}
-
 global TEST_EXTRA_LIBS
 if [info exists TEST_EXTRA_LIBS] {
lappend options "ldflags=$TEST_EXTRA_LIBS"
@@ -333,6 +328,11 @@ proc g++_target_compile { source dest type options } {
 
 set options [dg-additional-files-options $options $source]
 
+if { [target_info needs_status_wrapper] != "" && [info exists gluefile] } {
+   lappend options "libs=${gluefile}"
+   lappend options "ldflags=${wrap_flags}"
+}
+
 set result [target_compile $source $dest $type $options]
 
 if {[board_info $tboard exists multilib_flags]} {
diff --git a/gcc/testsuite/lib/gcc.exp b/gcc/testsuite/lib/gcc.exp
index 1b25ebec4cf..2f145d0fdf4 100644
--- a/gcc/testsuite/lib/gcc.exp
+++ b/gcc/testsuite/lib/gcc.exp
@@ -129,16 +129,6 @@ proc gcc_target_compile { source dest type options } {
 global flags_to_postpone
 global board_info
 
-if {[target_info needs_status_wrapper] != "" && \
-   [target_info needs_status_wrapper] != "0" && \
-   [info exists gluefile] } {
-   lappend options "libs=${gluefile}"
-   lappend options "ldflags=$wrap_flags"
-   if { $type == "executable" } {
-   set options [concat "{additional_flags=-dumpbase \"\"}" $options]
-   }
-}
-
 global TEST_EXTRA_LIBS
 if [info exists TEST_EXTRA_LIBS] {
lappend options "ldflags=$TEST_EXTRA_LIBS"
@@ -170,6 +160,17 @@ proc gcc_target_compile { source dest type options } {
 lappend options "timeout=[timeout_value]"
 lappend options "compiler=$GCC_UNDER_TEST"
 set options [dg-additional-files-options $options $source]
+
+if {[target_info needs_status_wrapper] != "" && \
+   [target_info needs_status_wrapper] != "0" && \
+   [info exists gluefile] } {
+   lappend options "libs=${gluefile}"
+   lappend options "ldflags=$wrap_flags"
+   if { $type == "executable" } {
+   set options [concat "{additional_flags=-dumpbase \"\"}" $options]
+   }
+}
+
 set return_val [target_compile $source $dest $type $options]
 
 if {[board_info $tboard exists multilib_flags]} {
diff --git a/gcc/testsuite/lib/wrapper.exp b/gcc/testsuite/lib/wrapper.exp
index 5a601b269da..4a7d56941fc 100644
--- a/gcc/testsuite/lib/wrapper.exp
+++ b/gcc/testsuite/lib/wrapper.exp
@@ -22,7 +22,7 @@
 # the compiler when compiling FILENAME.
 
 proc ${tool}_maybe_build_wrapper { filename args } {
-global gluefile wrap_flags
+global gluefile wrap_flags gcc_adjusted_linker_flags
 
 if { [target_info needs_status_wrapper] != "" \
 && [target_info needs_status_wrapper] != "0" \
@@ -43,6 +43,11 @@ proc ${tool}_maybe_build_wrapper { filename args } {
if { $result != "" } {
set gluefile [lindex $result 0]
set wrap_flags [lindex $result 1]
+
+   # Reset the cached state of the adjusted flags
+   if { [info exists gcc_adjusted_linker_flags] } {
+   set gcc_adjusted_linker_flags 0
+   }
}
 }
 }
-- 
2.25.1



Re: [PATCH v2] Handle OPAQUE_TYPE specially in verify_type [PR106833]

2022-09-09 Thread Kewen.Lin via Gcc-patches
on 2022/9/9 15:25, Richard Biener wrote:
> On Fri, Sep 9, 2022 at 8:51 AM Kewen.Lin  wrote:
>>
>> Hi Richi,
>>
>> Thanks for the review comments!
>>
>> on 2022/9/8 15:36, Richard Biener wrote:
>>>
>>>
>>>> On 08.09.2022 at 07:53, Kewen.Lin wrote:
>>>>
>>>> Hi,
>>>>
>>>> As PR106833 shows, cv-qualified opaque type can cause ICE
>>>> during LTO.  It exposes that we missed handling OPAQUE_TYPE
>>>> well in type verification.  As Richi pointed out, also
>>>> assuming that target will always define TYPE_MAIN_VARIANT
>>>> and TYPE_CANONICAL for opaque type, this patch is to check
>>>> both are OPAQUE_TYPE_P.  Besides, it also checks the only
>>>> available size and alignment information as well as type
>>>> mode for TYPE_MAIN_VARIANT.
>>>>
>> ...
>>>> +
>>>> +  if (t != tv)
>>>> +{
>>>> +  verify_match (TREE_CODE, t, tv);
>>>> +  verify_match (TYPE_MODE, t, tv);
>>>> +  verify_match (TYPE_SIZE, t, tv);
>>>
>>> TYPE_SIZE is a tree, you should probably
>>> compare this with operand_equal_p.  It's
>>> not documented to be a constant size?
>>> Thus some VLA vector mode might be allowed (a poly_int size),
>>
>> Thanks for catching, I was referencing the code in function
>> verify_type_variant, that corresponding part seems imperfect:
>>
>>   if (TREE_CODE (TYPE_SIZE (t)) != PLACEHOLDER_EXPR
>>   && TREE_CODE (TYPE_SIZE (tv)) != PLACEHOLDER_EXPR)
>> verify_variant_match (TYPE_SIZE);
>>
>> I agree poly_int size is allowed, the patch was updated for it.
>>
>>> BLKmode
>>> is ruled out(?),
>>
>> Yes, it requires a mode of MODE_OPAQUE class.
>>
>>> the docs say we have
>>> 'a MODE_OPAQUE' here but I don't see
>>> this being verified?
>>>
>>
>> There is a MODE equality check, I assumed the given t already
>> has one MODE_OPAQUE mode, but the patch was updated to make
>> it explicit as you concerned.
>>
>>> The macro makes this a bit unwieldy
>>> for the only benefit of an elaborate diagnostic,
>>> which I think isn't really necessary
>>
>> OK, fixed!
>>
>> The previous version makes just one check on TYPE_CANONICAL to
>> be cheap as gimple_canonical_types_compatible_p said, but
>> since there are just several fields to be checked, this updated
>> version adjusted it to be the same as what's for TYPE_MAIN_VARIANT.
>> Hope it's fine. :)
> 
> I think we'll call verify_type on the main variant as well so that would be
> redundant (ensured by transitivity), can you check?

I just had a check and found that we don't always call verify_type
on the main variant.  For example, with one case like:

__attribute__((noipa))
int foo(c){
  return 0;
}

int main ()
{
  const __vector_quad c;
  int r = foo(c);
  return r;
}

Checking during LTO WPA, verify_type only gets type "const
__vector_quad", no type "__vector_quad".

btw, it needs some hacking in rs6000_function_arg to make this
opaque type valid for function arg.

> 
>> Tested as before.
>>
>> Does this updated patch look good to you?
> 
> Yes, please remove the checks against the main variant if the above holds,
> OK with or without that change depending on this outcome.
> 

Committed in r13-2562, thanks!

BR,
Kewen


Re: [PATCH] libstdc++: Refactor implementation of operator+ for std::string

2022-09-09 Thread Will Hawkins
On Thu, Sep 8, 2022 at 2:05 PM Jonathan Wakely  wrote:
>
>
>
> On Thu, 8 Sep 2022, 18:51 François Dumont via Libstdc++, 
>  wrote:
>>
>> On 05/09/22 20:30, Will Hawkins wrote:
>> > Based on Jonathan's work, here is a patch for the implementation of 
>> > operator+
>> > on std::string that makes sure we always use the best allocation strategy.
>> >
>> > I have attempted to learn from all the feedback that I got on a previous
>> > submission -- I hope I did the right thing.
>> >
>> > Passes abi and conformance testing on x86-64 trunk.
>> >
>> > Sincerely,
>> > Will
>> >
>> > -- >8 --
>> >
>> > Create a single function that performs one-allocation string concatenation
>> > that can be used by various different versions of operator+.
>> >
>> > libstdc++-v3/ChangeLog:
>> >
>> >   * include/bits/basic_string.h:
>> >   Add common function that performs single-allocation string
>> >   concatenation. (__str_cat)
>> >   Use __str_cat to perform optimized operator+, where relevant.
>> >   * include/bits/basic_string.tcc::
>> >   Remove single-allocation implementation of operator+.
>> >
>> > Signed-off-by: Will Hawkins 
>> > ---
>> >   libstdc++-v3/include/bits/basic_string.h   | 66 --
>> >   libstdc++-v3/include/bits/basic_string.tcc | 41 --
>> >   2 files changed, 49 insertions(+), 58 deletions(-)
>> >
>> > diff --git a/libstdc++-v3/include/bits/basic_string.h 
>> > b/libstdc++-v3/include/bits/basic_string.h
>> > index 0df64ea98ca..4078651fadb 100644
>> > --- a/libstdc++-v3/include/bits/basic_string.h
>> > +++ b/libstdc++-v3/include/bits/basic_string.h
>> > @@ -3481,6 +3481,24 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
>> >   _GLIBCXX_END_NAMESPACE_CXX11
>> >   #endif
>> >
>> > +  template<typename _Str>
>> > +_GLIBCXX20_CONSTEXPR
>> > +inline _Str
>> > +__str_concat(typename _Str::value_type const* __lhs,
>> > +  typename _Str::size_type __lhs_len,
>> > +  typename _Str::value_type const* __rhs,
>> > +  typename _Str::size_type __rhs_len,
>> > +  typename _Str::allocator_type const& __a)
>> > +{
>> > +  typedef typename _Str::allocator_type allocator_type;
>> > +  typedef __gnu_cxx::__alloc_traits<allocator_type> _Alloc_traits;
>> > +  _Str __str(_Alloc_traits::_S_select_on_copy(__a));
>> > +  __str.reserve(__lhs_len + __rhs_len);
>> > +  __str.append(__lhs, __lhs_len);
>> > +  __str.append(__rhs, __rhs_len);
>> > +  return __str;
>> > +}
>> > +
>> > // operator+
>> > /**
>> >  *  @brief  Concatenate two strings.
>> > @@ -3490,13 +3508,14 @@ _GLIBCXX_END_NAMESPACE_CXX11
>> >  */
>> > template
>> >   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
>> > -basic_string<_CharT, _Traits, _Alloc>
>> > +inline basic_string<_CharT, _Traits, _Alloc>
>> >   operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
>> > const basic_string<_CharT, _Traits, _Alloc>& __rhs)
>> >   {
>> > -  basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
>> > -  __str.append(__rhs);
>> > -  return __str;
>> > +  typedef basic_string<_CharT, _Traits, _Alloc> _Str;
>> > +  return std::__str_concat<_Str>(__lhs.c_str(), __lhs.size(),
>> > +  __rhs.c_str(), __rhs.size(),
>>
>> You should use data() rather than c_str() here and all other operators.
>>
>> It is currently the same but is more accurate in your context. Maybe one
>> day it will make a difference.
>
>
>
> I don't think so, that would be a major breaking change, for no benefit. I 
> think it's safe to assume they will always stay equivalent now.

Happy to make any changes to the patch that the group thinks are necessary!

Will
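
For readers following the thread, the single-allocation idea under review can be sketched in isolation. This is only an illustration under stated assumptions, not the patch itself: `str_concat` and `concat` are hypothetical names, the allocator argument of the patch's `__str_concat` is left out, and the default allocator is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the patch's helper: reserve once, then append
// both halves, so at most one allocation happens for the result.
template <typename Str>
Str str_concat(const typename Str::value_type* lhs,
               typename Str::size_type lhs_len,
               const typename Str::value_type* rhs,
               typename Str::size_type rhs_len)
{
  assert(lhs != nullptr && rhs != nullptr);
  Str result;
  result.reserve(lhs_len + rhs_len);  // the only (potential) allocation
  result.append(lhs, lhs_len);
  result.append(rhs, rhs_len);
  return result;
}

// Illustrative wrapper in the shape of operator+(const string&, const string&).
inline std::string concat(const std::string& a, const std::string& b)
{
  // Explicit lengths are passed, so embedded NUL characters are preserved
  // and the pointers need not be treated as C strings.
  return str_concat<std::string>(a.data(), a.size(), b.data(), b.size());
}
```

Since C++11, `data()` and `c_str()` return the same pointer for `std::basic_string`, which is why the choice between them in the review is about intent rather than behaviour.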


>
>
>>
>> > +  __lhs.get_allocator());
>> >   }
>> >
>> > /**
>> > @@ -3507,9 +3526,16 @@ _GLIBCXX_END_NAMESPACE_CXX11
>> >  */
>> > template
>> >   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
>> > -basic_string<_CharT,_Traits,_Alloc>
>> > +inline basic_string<_CharT,_Traits,_Alloc>
>>
>> Why inlining ?
>>
>> I guess it is done this way to limit code bloat.
>>
>> >   operator+(const _CharT* __lhs,
>> > -   const basic_string<_CharT,_Traits,_Alloc>& __rhs);
>> > +   const basic_string<_CharT,_Traits,_Alloc>& __rhs)
>> > +{
>> > +  __glibcxx_requires_string(__lhs);
>> > +  typedef basic_string<_CharT, _Traits, _Alloc> _Str;
>> > +  return std::__str_concat<_Str>(__lhs, _Traits::length(__lhs),
>> > +  __rhs.c_str(), __rhs.size(),
>> > +  __rhs.get_allocator());
>> > +}
>> >
>> > /**
>> >  *  @brief  Concatenate character and string.
>> > @@ -3519,8 +3545,14 @@ _GLIBCXX_END_NAMESPACE_CXX11
>> >  */
>> > template
>> >   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
>> > -basic_string<_CharT,_Traits,_Alloc>
>> > -operator+(_CharT __lhs, const basic_string<_CharT,_Traits,_Alloc>& 
>>

Re: Extend fold_vec_perm to fold VEC_PERM_EXPR in VLA manner

2022-09-09 Thread Prathamesh Kulkarni via Gcc-patches
On Mon, 5 Sept 2022 at 15:51, Richard Sandiford
 wrote:
>
> Sorry for the slow reply.  I wrote a response a couple of weeks ago
> but I think it got lost in a machine outage.
>
> Prathamesh Kulkarni  writes:
> > Hi,
> > The attached prototype patch extends fold_vec_perm to fold VEC_PERM_EXPR
> > in VLA manner, and currently handles the following cases:
> > (a) fixed len arg0, arg1 and fixed len sel.
> > (b) fixed len arg0, arg1 and vla sel
> > (c) vla arg0, arg1 and vla sel with arg0, arg1 being VECTOR_CST.
> >
> > It seems to work for the VLA tests written in
> > test_vec_perm_vla_folding (), and am working thru the fallout observed in
> > regression testing.
> >
> > Does the approach taken in the patch look in the right direction ?
> > I am not sure if I have got the conversion from "sel_index"
> > to index of either arg0, or arg1 entirely correct.
> > I would be grateful for suggestions on the patch.
> >
> > Thanks,
> > Prathamesh
> >
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index 4f4ec81c8d4..5e12260211e 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -85,6 +85,9 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "vec-perm-indices.h"
> >  #include "asan.h"
> >  #include "gimple-range.h"
> > +#include "tree-pretty-print.h"
> > +#include "gimple-pretty-print.h"
> > +#include "print-tree.h"
> >
> >  /* Nonzero if we are folding constants inside an initializer or a C++
> > manifestly-constant-evaluated context; zero otherwise.
> > @@ -10496,40 +10499,6 @@ fold_mult_zconjz (location_t loc, tree type, tree 
> > expr)
> > build_zero_cst (itype));
> >  }
> >
> > -
> > -/* Helper function for fold_vec_perm.  Store elements of VECTOR_CST or
> > -   CONSTRUCTOR ARG into array ELTS, which has NELTS elements, and return
> > -   true if successful.  */
> > -
> > -static bool
> > -vec_cst_ctor_to_array (tree arg, unsigned int nelts, tree *elts)
> > -{
> > -  unsigned HOST_WIDE_INT i, nunits;
> > -
> > -  if (TREE_CODE (arg) == VECTOR_CST
> > -  && VECTOR_CST_NELTS (arg).is_constant (&nunits))
> > -{
> > -  for (i = 0; i < nunits; ++i)
> > - elts[i] = VECTOR_CST_ELT (arg, i);
> > -}
> > -  else if (TREE_CODE (arg) == CONSTRUCTOR)
> > -{
> > -  constructor_elt *elt;
> > -
> > -  FOR_EACH_VEC_SAFE_ELT (CONSTRUCTOR_ELTS (arg), i, elt)
> > - if (i >= nelts || TREE_CODE (TREE_TYPE (elt->value)) == VECTOR_TYPE)
> > -   return false;
> > - else
> > -   elts[i] = elt->value;
> > -}
> > -  else
> > -return false;
> > -  for (; i < nelts; i++)
> > -elts[i]
> > -  = fold_convert (TREE_TYPE (TREE_TYPE (arg)), integer_zero_node);
> > -  return true;
> > -}
> > -
> >  /* Attempt to fold vector permutation of ARG0 and ARG1 vectors using SEL
> > selector.  Return the folded VECTOR_CST or CONSTRUCTOR if successful,
> > NULL_TREE otherwise.  */
> > @@ -10537,45 +10506,149 @@ vec_cst_ctor_to_array (tree arg, unsigned int 
> > nelts, tree *elts)
> >  tree
> >  fold_vec_perm (tree type, tree arg0, tree arg1, const vec_perm_indices 
> > &sel)
> >  {
> > -  unsigned int i;
> > -  unsigned HOST_WIDE_INT nelts;
> > -  bool need_ctor = false;
> > +  poly_uint64 arg0_len = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
> > +  poly_uint64 arg1_len = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1));
> > +
> > +  gcc_assert (known_eq (TYPE_VECTOR_SUBPARTS (type),
> > + sel.length ()));
> > +  gcc_assert (known_eq (arg0_len, arg1_len));
> >
> > -  if (!sel.length ().is_constant (&nelts))
> > -return NULL_TREE;
> > -  gcc_assert (known_eq (TYPE_VECTOR_SUBPARTS (type), nelts)
> > -   && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0)), nelts)
> > -   && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1)), nelts));
> >if (TREE_TYPE (TREE_TYPE (arg0)) != TREE_TYPE (type)
> >|| TREE_TYPE (TREE_TYPE (arg1)) != TREE_TYPE (type))
> >  return NULL_TREE;
> >
> > -  tree *in_elts = XALLOCAVEC (tree, nelts * 2);
> > -  if (!vec_cst_ctor_to_array (arg0, nelts, in_elts)
> > -  || !vec_cst_ctor_to_array (arg1, nelts, in_elts + nelts))
> > +  unsigned input_npatterns = 0;
> > +  unsigned out_npatterns = sel.encoding ().npatterns ();
> > +  unsigned out_nelts_per_pattern = sel.encoding ().nelts_per_pattern ();
> > +
> > +  /* FIXME: How to reshape fixed length vector_cst, so that
> > + npatterns == vector.length () and nelts_per_pattern == 1 ?
> > + It seems the vector is canonicalized to minimize npatterns.  */
> > +
> > +  if (arg0_len.is_constant ())
> > +{
> > +  /* If arg0, arg1 are fixed width vectors, and sel is VLA,
> > + ensure that it is a dup sequence and has same period
> > +  as input vector.  */
> > +
> > +  if (!sel.length ().is_constant ()
> > +   && (sel.encoding ().nelts_per_pattern () > 2
> > +   || !known_eq (arg0_len, sel.encoding ().npatterns (
> > + return NULL_TREE;
> > +
> > +  input_npat

Re: [PATCH 3/3] vect: inbranch SIMD clones

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Aug 09, 2022 at 02:23:50PM +0100, Andrew Stubbs wrote:
> 
> There has been support for generating "inbranch" SIMD clones for a long time,
> but nothing actually uses them (as far as I can see).

Thanks for working on this.

Note, there is another case where the inbranch SIMD clones could be used
and I even thought it is implemented, but apparently it isn't or it doesn't
work:
#ifndef TYPE
#define TYPE int
#endif

/* A simple function that will be cloned.  */
#pragma omp declare simd inbranch
TYPE __attribute__((noinline))
foo (TYPE a)
{
  return a + 1;
}

/* Check that "inbranch" clones are called correctly.  */

void __attribute__((noinline))
masked (TYPE * __restrict a, TYPE * __restrict b, int size)
{
  #pragma omp simd
  for (int i = 0; i < size; i++)
b[i] = foo(a[i]);
}

Here, IMHO we should use the inbranch clone for vectorization (better
than not vectorizing it, worse than when we'd have a notinbranch clone)
and just use mask of all ones.
But sure, it can be done incrementally, just mentioning it for completeness.
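
To make the suggestion concrete, here is a scalar emulation of what calling an inbranch clone with an all-ones mask means. It is only a sketch: `kLanes`, `Vec`, `Mask` and `foo_inbranch` are assumptions made for illustration and do not correspond to the clone signatures GCC actually generates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // assumed vector width, illustration only
using Vec = std::array<int, kLanes>;
using Mask = std::array<bool, kLanes>;

int foo(int a) { return a + 1; }

// Emulates an "inbranch" clone: only lanes whose mask bit is set are computed.
// Inactive lanes are left at zero here; their value is not meaningful.
Vec foo_inbranch(const Vec& a, const Mask& mask)
{
  Vec r{};
  for (std::size_t i = 0; i < kLanes; ++i)
    if (mask[i])
      r[i] = foo(a[i]);
  return r;
}

// The unconditional loop from the example, "vectorized" by calling the
// masked clone with every lane active.
void masked(const int* a, int* b, std::size_t size)
{
  assert(size == 0 || (a != nullptr && b != nullptr));
  Mask all_ones;
  all_ones.fill(true);

  std::size_t i = 0;
  for (; i + kLanes <= size; i += kLanes)
    {
      Vec in;
      for (std::size_t l = 0; l < kLanes; ++l)
        in[l] = a[i + l];
      const Vec out = foo_inbranch(in, all_ones);
      for (std::size_t l = 0; l < kLanes; ++l)
        b[i + l] = out[l];
    }
  for (; i < size; ++i)  // scalar epilogue for the remaining elements
    b[i] = foo(a[i]);
}
```

With every mask bit set, the masked clone computes the same values as a notinbranch clone would; the cost is the extra mask argument and the per-lane predication.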

> This patch adds support for a sub-set of possible cases (those using
> mask_mode == VOIDmode).  The other cases fail to vectorize, just as before,
> so there should be no regressions.
> 
> The sub-set of support should cover all cases needed by amdgcn, at present.
> 
> gcc/ChangeLog:
> 
>   * omp-simd-clone.cc (simd_clone_adjust_argument_types): Set vector_type
>   for mask arguments also.
>   * tree-if-conv.cc: Include cgraph.h.
>   (if_convertible_stmt_p): Do if conversions for calls to SIMD calls.
>   (predicate_statements): Pass the predicate to SIMD functions.
>   * tree-vect-stmts.cc (vectorizable_simd_clone_call): Permit calls
>   to clones with mask arguments, in some cases.
>   Generate the mask vector arguments.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-simd-clone-16.c: New test.
>   * gcc.dg/vect/vect-simd-clone-16b.c: New test.
>   * gcc.dg/vect/vect-simd-clone-16c.c: New test.
>   * gcc.dg/vect/vect-simd-clone-16d.c: New test.
>   * gcc.dg/vect/vect-simd-clone-16e.c: New test.
>   * gcc.dg/vect/vect-simd-clone-16f.c: New test.
>   * gcc.dg/vect/vect-simd-clone-17.c: New test.
>   * gcc.dg/vect/vect-simd-clone-17b.c: New test.
>   * gcc.dg/vect/vect-simd-clone-17c.c: New test.
>   * gcc.dg/vect/vect-simd-clone-17d.c: New test.
>   * gcc.dg/vect/vect-simd-clone-17e.c: New test.
>   * gcc.dg/vect/vect-simd-clone-17f.c: New test.
>   * gcc.dg/vect/vect-simd-clone-18.c: New test.
>   * gcc.dg/vect/vect-simd-clone-18b.c: New test.
>   * gcc.dg/vect/vect-simd-clone-18c.c: New test.
>   * gcc.dg/vect/vect-simd-clone-18d.c: New test.
>   * gcc.dg/vect/vect-simd-clone-18e.c: New test.
>   * gcc.dg/vect/vect-simd-clone-18f.c: New test.

> --- a/gcc/tree-if-conv.cc
> +++ b/gcc/tree-if-conv.cc
> @@ -1074,13 +1076,19 @@ if_convertible_stmt_p (gimple *stmt, 
> vec refs)
>   tree fndecl = gimple_call_fndecl (stmt);
>   if (fndecl)
> {
> + /* We can vectorize some builtins and functions with SIMD
> +clones.  */
>   int flags = gimple_call_flags (stmt);
> + struct cgraph_node *node = cgraph_node::get (fndecl);
>   if ((flags & ECF_CONST)
>   && !(flags & ECF_LOOPING_CONST_OR_PURE)
> - /* We can only vectorize some builtins at the moment,
> -so restrict if-conversion to those.  */
>   && fndecl_built_in_p (fndecl))
> return true;
> + else if (node && node->simd_clones != NULL)
> +   {
> + need_to_predicate = true;

I think it would be worth it to check that at least one of the
node->simd_clones clones has ->inbranch set, because if all calls
are declare simd notinbranch, then predicating the loop will be just a
wasted effort.
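
The check could look roughly like the following. The structs are simplified mocks written for this sketch, not GCC's real `cgraph_node` or `cgraph_simd_clone` definitions; only the shape of the traversal (a list of clones chained through `simdclone->next_clone`) is meant to be suggestive, and `has_inbranch_clone` is a hypothetical helper name.

```cpp
#include <cassert>

// Mock stand-ins for the compiler's data structures (assumption: each clone
// node carries a descriptor with an 'inbranch' flag and a link to the next
// clone; the original function points at the first clone).
struct MockNode;

struct MockSimdClone
{
  bool inbranch;
  MockNode* next_clone;
};

struct MockNode
{
  MockSimdClone* simdclone;  // set on clone nodes
  MockNode* simd_clones;     // set on the original function
};

// Returns true if at least one SIMD clone of NODE is an inbranch clone.
bool has_inbranch_clone(const MockNode* node)
{
  if (node == nullptr)
    return false;
  for (const MockNode* n = node->simd_clones; n != nullptr;
       n = n->simdclone->next_clone)
    {
      assert(n->simdclone != nullptr);
      if (n->simdclone->inbranch)
        return true;
    }
  return false;
}
```

Under this sketch, if_convertible_stmt_p would set need_to_predicate only when such a helper returns true, so loops calling only notinbranch clones are not predicated for nothing.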

> + return true;
> +   }
> }
>   return false;
>}
> @@ -2614,6 +2622,31 @@ predicate_statements (loop_p loop)
> gimple_assign_set_rhs1 (stmt, ifc_temp_var (type, rhs, &gsi));
> update_stmt (stmt);
>   }
> +
> +   /* Add a predicate parameter to functions that have a SIMD clone.
> +  This will cause the vectorizer to match the "in branch" clone
> +  variants because they also have the extra parameter, and serves
> +  to build the mask vector in a natural way.  */
> +   gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
> +   if (call && !gimple_call_internal_p (call))
> + {
> +   tree orig_fndecl = gimple_call_fndecl (call);
> +   int orig_nargs = gimple_call_num_args (call);
> +   auto_vec<tree> args;
> +   for (int i=0; i < orig_nargs; i++)
> + args.safe_push (gimple_call_arg (call, i));
> +   args.safe_push (cond);
> +
> +   /* Replace the call with a new one that has the extra
> +  parameter.  The FUNCTION_DECL remains u

Re: [Patch][1/3] libgomp: Prepare for reverse offload fn lookup

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 25, 2022 at 04:54:51PM +0200, Tobias Burnus wrote:
> Technically, this patch is stand alone, but conceptually it is based on the
> submitted but not reviewed patch:
> "[Patch] OpenMP: Support reverse offload (middle end part)"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598662.html
> 
> With that patch, for reverse offloads ('omp target device(ancestor:1)'),
> calls like the following are added:
>  GOMP_target_ext (-2 /* initial device */, omp_fn.1
> where 'omp_fn.1' is, on nonhost devices, a stub function required only for
> looking up the host function pointer via the offload_funcs table.
> 
> The attached patch prepares for reverse-offload device->host
> function-address lookup by requesting (if needed) the on-device address.
> 
> OK for mainline?
> 
> Tobias
> 
> 
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955

> libgomp: Prepare for reverse offload fn lookup
> 
> Prepare for reverse-offloading function-pointer lookup by passing
> a rev_fn_table argument to GOMP_OFFLOAD_load_image.
> 
> The argument will be NULL, unless GOMP_REQUIRES_REVERSE_OFFLOAD is
> requested and devices that do not support it are filtered out.
> (Up to and including this commit, no non-host device claims such
> support and the caller currently always passes NULL.)
> 
> libgomp/ChangeLog:
> 
>   * libgomp-plugin.h (GOMP_OFFLOAD_load_image): Add
>   'uint64_t **rev_fn_table' argument.
>   * oacc-host.c (host_load_image): Likewise.
>   * plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Likewise;
>   currently unused.
>   * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
>   * target.c (gomp_load_image_to_device): Update call but pass
>   NULL for now.
> 
> liboffloadmic/ChangeLog:
> 
>   * plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_load_image):
>   Add (unused) uint64_t **rev_fn_table argument.

Ok, thanks.

Jakub



Re: [Patch][2/3] GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 25, 2022 at 05:38:58PM +0200, Tobias Burnus wrote:
> On 25.08.22 16:54, Tobias Burnus wrote:
> 
> The attached patch prepares for reverse-offload device->host
> function-address lookup by requesting (if needed) the on-device address.
> 
> 
> This patch adds the actual implementation for GCN. A variant would be
> to only generate .offload_func_table inside mkoffload when
> OMP_REQUIRES_REVERSE_OFFLOAD has been requested.
> 
> This is currently effectively a no-op because, with the [1/3] patch, NULL is
> always passed and GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
> 'omp requires reverse_offload' has been specified.
> 
> OK for mainline?
> 
> Tobias
> 
> 

> GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
> 
> Add support to GCN for reverse lookup of function name to prepare for
> 'omp target device(ancestor:1)'.
> 
> gcc/ChangeLog:
> 
>   * config/gcn/mkoffload.cc (process_asm): Create .offload_func_table,
>   similar to pre-existing .offload_var_table.
> 
> libgomp/ChangeLog:
> 
>   * plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Read
>   .offload_func_table to populate rev_fn_table when requested.

Ok.

Jakub



Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Mon, Aug 29, 2022 at 08:43:26PM +0200, Tobias Burnus wrote:
> Slightly revised version, fixing some issues in mkoffload.cc. Otherwise, the 
> same applies:
> 
> On 25.08.22 19:30, Tobias Burnus wrote:
> On 25.08.22 16:54, Tobias Burnus wrote:
> 
> The attached patch prepares for reverse-offload device->host
> function-address lookup by requesting (if needed) the on-device address.
> 
> 
> This patch adds the actual implementation for NVPTX.
> 
> Having  array[] = {fn1,fn2};  works with nvptx only since sm_35; hence,
> if there is a reverse_offload and sm_30 is used, there will be a compile-time
> error.

I wonder whether we shouldn't instead silently skip PTX offloading (or skip
it with a warning?) if sm_30 is used and reverse offload is needed.
An error might be too harsh; the program can still offload to GCN or the host
just fine...

Otherwise LGTM.

Jakub



Re: Modula-2: merge followup (brief update on the progress of the new linking implementation)

2022-09-09 Thread Gaius Mulley via Gcc-patches
Martin Liška  writes:

>> gm2.texi.  I see a few missing options (missing from gm2.texi) and also
>
> Do you have an example, please?

-fscaffold-main was missing from gcc/doc/gm2.texi.  I've git pushed a
 correction (and alphabetically sorted all options).

>> I see problems with the libraries - meta data (const/type/var) is
>
> What type of meta data do you mean?

@findex functionname [meta]

for an example see:

builddir/gcc/m2/gm2-libs.texi

(* all the following types are declared internally to gm2
TYPE
@findex LOC (type)
   LOC ;

@findex ADR
PROCEDURE ADR (VAR v: ): ADDRESS;
  (* Returns the address of variable v.  *)

The above findex populates the function index and is issued at the end
of m2/doc/gm2.texi by a call to:

@printindex fn

An example output:

https://www.nongnu.org/gm2/12/functions.html
More detail on findex
https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Predefined-Indices.html

hope this is useful,

regards,
Gaius


Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Fri, Aug 26, 2022 at 05:56:09PM +0300, Alexander Monakov via Gcc-patches 
wrote:
> 
> On Fri, 26 Aug 2022, Tobias Burnus wrote:
> 
> > @Tom and Alexander: Better suggestions are welcome for the busy loop in
> > libgomp/plugin/plugin-nvptx.c regarding the variable placement and checking
> > its value.
> 
> I think to do that without polling you can use PTX 'brkpt' instruction on the
> device and CUDA Debugger API on the host (but you'd have to be careful about
> interactions with the real debugger).
> 
> What did the standardization process for this feature look like, and how did
> it pass if it's not efficiently implementable for the major offloading targets?

It doesn't have to be implementable on all major offloading targets, it is
enough when it can work on some.  As one needs to request the reverse
offloading through a declarative directive, it is always possible in that
case to just pretend devices that don't support it don't exist.

But it would be really nice to support it even on PTX.

Are there any other implementations of reverse offloading to PTX already?

Jakub



Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Fri, Aug 26, 2022 at 11:07:28AM +0200, Tobias Burnus wrote:
> @Tom and Alexander: Better suggestions are welcome for the busy loop in
> libgomp/plugin/plugin-nvptx.c regarding the variable placement and checking
> its value.

I'm afraid you need Alexander or Tom here, I don't feel I can review it;
I could rubber stamp it if they are ok with it.

Jakub



Re: [Patch] libgomp: Add reverse-offload splay tree

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Fri, Aug 26, 2022 at 01:15:24PM +0200, Tobias Burnus wrote:
> For reverse-offload data handling, we need to support:
> (a) device fn addr -> host fn address
> (b) finding already mapped host -> device vars via their device address
> 
> For (a), the functions addrs, we need some extra code (cf. previous patches)
> as this information does not exist already. For (b), the variables, we have
> two options:
> (i) Do a reverse search on the existing data. That's done in
>  oacc-mem.c's lookup_dev
>and obviously is O(N) as it has to check all nodes.
>With the assumption "This is not expected to be a common operation."
>    it might still be okay.
> (ii) Using a second splay tree for the reverse lookup.
> 
> The attached patch prepares for this – but does not yet handle all
> locations and is not yet active. The 'devicep->load_image_func' call
> depends whether the previous [1/3] patch (cf. below) has been applied
> or not.
> 
> (The advantage of the reverse-offload mapping is that 'nowait' is not
> permitted and 'target {enter,exit,} data device(ancestor:1)' is neither.
> Thus, all 'omp target device(ancestor:1)' mapping done in target_rev
> can be undone in the same function - and all others are preexisting.)
> 
> OK for mainline?

I'd prefer this going in only when somebody actually uses it, and
without the #if 0 parts or
/* , rev_lookup ? &rev_target_fn_table : NULL */
Also, if the reverse splay tree can be recreated from the normal splay
tree (at least until something makes that no longer the case), I wonder
whether the reverse splay tree couldn't be created lazily: only when
something actually asks for reverse offload for the first time, walk the
whole normal splay tree, populate the reverse one from it, and maintain
it from then on.
Or at least key the decision to populate it not on reverse offload being
requested, but on some device(ancestor:1) actually appearing somewhere.
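
The lazy scheme can be sketched as follows. This is an illustration only: it uses `std::map` keyed by exact addresses rather than libgomp's splay trees keyed by address ranges, and `MappingTable`, `add`, `lookup_host` and `reverse_built` are hypothetical names, not libgomp API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

class MappingTable
{
public:
  // Record a host -> device mapping.  The reverse table is only touched
  // once it has been built by a first reverse lookup.
  void add(std::uintptr_t host, std::uintptr_t dev)
  {
    const bool inserted = host_to_dev_.emplace(host, dev).second;
    assert(inserted);  // sketch assumes each host address is mapped once
    (void) inserted;
    if (reverse_built_)
      dev_to_host_[dev] = host;
  }

  // Device -> host lookup.  The first call walks the whole forward table
  // (O(N)) to populate the reverse one; later calls are ordinary lookups.
  bool lookup_host(std::uintptr_t dev, std::uintptr_t* host)
  {
    if (!reverse_built_)
      {
        for (const auto& entry : host_to_dev_)
          dev_to_host_[entry.second] = entry.first;
        reverse_built_ = true;
      }
    const auto it = dev_to_host_.find(dev);
    if (it == dev_to_host_.end())
      return false;
    *host = it->second;
    return true;
  }

  bool reverse_built() const { return reverse_built_; }

private:
  std::map<std::uintptr_t, std::uintptr_t> host_to_dev_;
  std::map<std::uintptr_t, std::uintptr_t> dev_to_host_;
  bool reverse_built_ = false;
};
```

Programs that never perform a reverse lookup pay nothing for the second table; the one-time O(N) walk is paid by the first reverse offload, after which both tables are kept in sync. Removal of mappings is omitted from the sketch.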

> +  /* Likeverse for the reverse lookup device->host for reverse offload. */

Likewise or something else?

Jakub



Re: [PATCH v2, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-09-09 Thread Segher Boessenkool
On Thu, Sep 08, 2022 at 01:59:02PM +0800, HAO CHEN GUI wrote:
> In rs6000-overload.def, the vsx_ version built-ins are overridden to vec_
> version.

How?  Where?

Instead, afaics they are defined in rs6000-builtins.def:

  const vd __builtin_vsx_extract_exp_dp (vd);
VEEDP xvxexpdp {}

  const vf __builtin_vsx_extract_exp_sp (vf);
VEESP xvxexpsp {}

  const vd __builtin_vsx_extract_sig_dp (vd);
VESDP xvxsigdp {}

  const vf __builtin_vsx_extract_sig_sp (vf);
VESSP xvxsigsp {}


Again: the vec_ versions are fine.  I wonder if the vsx_ versions ever
worked, if the builtin infrastructure rewrite broke it.  And if so, what
we should do now?  The argument for not deleting these legacy builtins
is that someone might use them, but that seems unlikely since it has
been utterly broken for a while now, and no one has complained.


Segher


[PATCH 3/3] libstdc++: Fix return type of empty zip/adjacent_transform [PR106803]

2022-09-09 Thread Patrick Palka via Gcc-patches
Tested on x86_64-pc-linux-gnu, does this series look OK for trunk?

PR libstdc++/106803

libstdc++-v3/ChangeLog:

* include/std/ranges (views::_ZipTransform::operator()): Fix
return type in the empty case.
(views::_AdjacentTransform::operator()): Likewise.
---
 libstdc++-v3/include/std/ranges | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 37ad80ad3de..20eb4e82ac8 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -5071,7 +5071,7 @@ namespace views::__adaptor
operator() [[nodiscard]] (_Fp&& __f, _Ts&&... __ts) const
{
  if constexpr (sizeof...(_Ts) == 0)
-   return views::empty<decay_t<invoke_result_t<_Fp>>>;
+   return views::empty<decay_t<invoke_result_t<decay_t<_Fp>&>>>;
  else
return zip_transform_view(std::forward<_Fp>(__f), 
std::forward<_Ts>(__ts)...);
}
@@ -5762,7 +5762,7 @@ namespace views::__adaptor
  operator() [[nodiscard]] (_Range&& __r, _Fp&& __f) const
  {
if constexpr (_Nm == 0)
- return views::empty<tuple<>>;
+ return zip_transform(std::forward<_Fp>(__f));
else
  return adjacent_transform_view, decay_t<_Fp>, _Nm>
(std::forward<_Range>(__r), std::forward<_Fp>(__f));
-- 
2.37.3.518.g79f2338b37



[PATCH 1/3] libstdc++: Fix zip_view's operator- for integer-class difference type [PR106766]

2022-09-09 Thread Patrick Palka via Gcc-patches
make_unsigned_t can't give us the unsigned version of an integer-class
difference type, so use __make_unsigned_like_t / __to_unsigned_like
instead.

PR libstdc++/106766

libstdc++-v3/ChangeLog:

* include/std/ranges (zip_view::_Iterator::operator-): Use
__to_unsigned_like instead of make_unsigned_t.
(zip_view::_Sentinel::operator-): Likewise.
* testsuite/std/ranges/zip/1.cc (test04): New test.
---
 libstdc++-v3/include/std/ranges|  8 
 libstdc++-v3/testsuite/std/ranges/zip/1.cc | 14 ++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 2b5cb0531f0..2b8fec3c386 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -4657,8 +4657,8 @@ namespace views::__adaptor
return ranges::min({difference_type(std::get<_Is>(__x._M_current)
- 
std::get<_Is>(__y._M_current))...},
   ranges::less{},
-  [](difference_type __i) -> 
make_unsigned_t<difference_type> {
-return __i < 0 ? -__i : __i;
+  [](difference_type __i) {
+return __detail::__to_unsigned_like(__i < 0 ? -__i 
: __i);
   });
   }(make_index_sequence{});
 }
@@ -4726,8 +4726,8 @@ namespace views::__adaptor
   return [&](index_sequence<_Is...>) {
return ranges::min({_Ret(std::get<_Is>(__x._M_current) - 
std::get<_Is>(__y._M_end))...},
   ranges::less{},
-  [](_Ret __i) -> make_unsigned_t<_Ret> {
-return __i < 0 ? -__i : __i;
+  [](_Ret __i) {
+return __detail::__to_unsigned_like(__i < 0 ? -__i 
: __i);
   });
   }(make_index_sequence{});
 }
diff --git a/libstdc++-v3/testsuite/std/ranges/zip/1.cc 
b/libstdc++-v3/testsuite/std/ranges/zip/1.cc
index 0113efdb537..f868c97cb69 100644
--- a/libstdc++-v3/testsuite/std/ranges/zip/1.cc
+++ b/libstdc++-v3/testsuite/std/ranges/zip/1.cc
@@ -102,10 +102,24 @@ test03()
   return true;
 }
 
+constexpr bool
+test04()
+{
+  // PR libstdc++/106766
+  auto r = views::zip(views::iota(__int128(0), __int128(1)));
+  auto i = r.begin();
+  auto s = r.end();
+  VERIFY( s - i == 1 );
+  VERIFY( i + 1 - i == 1 );
+
+  return true;
+}
+
 int
 main()
 {
   static_assert(test01());
   static_assert(test02());
   static_assert(test03());
+  static_assert(test04());
 }
-- 
2.37.3.518.g79f2338b37



[PATCH 2/3] libstdc++: Fix typo in adjacent_view::_Iterator [PR106798]

2022-09-09 Thread Patrick Palka via Gcc-patches
PR libstdc++/106798

libstdc++-v3/ChangeLog:

* include/std/ranges (adjacent_view::_Iterator::_Iterator): Fix
typo.
* testsuite/std/ranges/adaptors/adjacent/1.cc (test04): New test.
---
 libstdc++-v3/include/std/ranges  |  2 +-
 .../testsuite/std/ranges/adaptors/adjacent/1.cc  | 12 
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 2b8fec3c386..37ad80ad3de 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -5239,7 +5239,7 @@ namespace views::__adaptor
   requires _Const && convertible_to, iterator_t<_Base>>
 {
   for (size_t __j = 0; __j < _Nm; ++__j)
-   _M_current[__j] = std::move(__i[__j]);
+   _M_current[__j] = std::move(__i._M_current[__j]);
 }
 
 constexpr auto
diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/adjacent/1.cc 
b/libstdc++-v3/testsuite/std/ranges/adaptors/adjacent/1.cc
index 9829f79364f..443c1fbf450 100644
--- a/libstdc++-v3/testsuite/std/ranges/adaptors/adjacent/1.cc
+++ b/libstdc++-v3/testsuite/std/ranges/adaptors/adjacent/1.cc
@@ -101,10 +101,22 @@ test03()
   return true;
 }
 
+constexpr bool
+test04()
+{
+  // PR libstdc++/106798
+  auto r = views::single(0) | views::lazy_split(0) | views::pairwise;
+  decltype(ranges::cend(r)) s = r.end();
+  VERIFY( r.begin() == s );
+
+  return true;
+}
+
 int
 main()
 {
   static_assert(test01());
   static_assert(test02());
   static_assert(test03());
+  static_assert(test04());
 }
-- 
2.37.3.518.g79f2338b37



Re: [PATCH] Fortran: Add IEEE_SIGNBIT and IEEE_FMA functions

2022-09-09 Thread FX via Gcc-patches
Hi Thomas,

>> Both of these functions are new with Fortran 2018, could you add
>> a standards version check?
> 
> Thanks Thomas, I will do that and post here the commit diff. The check will 
> not be perfect, though, because the warning/error cannot be emitted when 
> loading the module (because it’s in an external file), but will have to be 
> when the call is actually emitted.

Actually, that does not work. gfc_notify_std() should not be used at 
code-generation time, but in matching or setting-up symbols. It is never used 
in trans-* files, so I do not think I should introduce it now.

Any hard objection to committing as it is? In the medium term, I intend to 
revamp this part anyway, as I said in my previous email.

Thanks,
FX

Re: [PATCH] Add __builtin_iseqsig()

2022-09-09 Thread FX via Gcc-patches
ping


> Le 1 sept. 2022 à 23:02, FX  a écrit :
> 
> Attached patch adds __builtin_iseqsig() to the middle-end and C family 
> front-ends.
> Testing does not currently check whether the signaling part works, because 
> with optimisation it actually does not (preexisting compiler bug: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106805)
> 
> Bootstrapped and regtested on x86_64-linux.
> OK to commit?
> 
> (I’m not very skilled for middle-end hacking, so I’m sure there will be 
> modifications to make.)
> 
> FX
> <0001-Add-__builtin_iseqsig.patch>



Re: [PATCH] amdgcn: Add support for additional natively supported floating-point operations

2022-09-09 Thread Joseph Myers
On Thu, 8 Sep 2022, Kwok Cheung Yeung wrote:

> The sin and cos instructions for some reason are scaled by 2*PI radians (i.e.
> 1.0 == 2*PI radians/360 degrees), so their inputs need to be scaled by
> 1/(2*PI) first. I've implemented this as an expander to two instructions - one

C2x has sinpi and cospi function families for sin(pi*x) and cos(pi*x); 
adding built-in functions / insn patterns for those functions would then 
allow those instructions to be used for those functions with scaling by 
1/2.
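
The two scalings can be checked numerically with a small emulation. This is a sketch under assumptions: `hw_sin_turns` and `hw_cos_turns` merely model an instruction whose input is measured in full turns (1.0 == 2*PI radians), as described above, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Model of the hardware instructions: the input is in turns, so the
// result is sin(2*pi*t) or cos(2*pi*t).
double hw_sin_turns(double t)
{
  assert(std::isfinite(t));
  return std::sin(2.0 * kPi * t);
}

double hw_cos_turns(double t)
{
  assert(std::isfinite(t));
  return std::cos(2.0 * kPi * t);
}

// Ordinary sin(x) in radians needs the input scaled by 1/(2*pi) first.
double sin_radians(double x) { return hw_sin_turns(x * (1.0 / (2.0 * kPi))); }

// sinpi(x) = sin(pi*x) and cospi(x) = cos(pi*x) only need scaling by 1/2.
double sinpi_via_hw(double x) { return hw_sin_turns(0.5 * x); }
double cospi_via_hw(double x) { return hw_cos_turns(0.5 * x); }
```

Scaling by 1/2 is exact in binary floating point (barring underflow), whereas multiplying by 1/(2*PI) introduces a rounding step, which is part of the appeal of mapping sinpi and cospi directly onto such instructions.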

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 1/3] libstdc++: Fix zip_view's operator- for integer-class difference type [PR106766]

2022-09-09 Thread Jonathan Wakely via Gcc-patches
On Fri, 9 Sep 2022, 18:25 Patrick Palka via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> make_unsigned_t can't give us the unsigned version of an integer-class
> difference type, so use __make_unsigned_like_t / __to_unsigned_like
> instead.
>

OK, thanks



> PR libstdc++/106766
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (zip_view::_Iterator::operator-): Use
> __to_unsigned_like instead of make_unsigned_t.
> (zip_view::_Sentinel::operator-): Likewise.
> * testsuite/std/ranges/zip/1.cc (test04): New test.
> ---
>  libstdc++-v3/include/std/ranges|  8 
>  libstdc++-v3/testsuite/std/ranges/zip/1.cc | 14 ++
>  2 files changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> index 2b5cb0531f0..2b8fec3c386 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -4657,8 +4657,8 @@ namespace views::__adaptor
> return ranges::min({difference_type(std::get<_Is>(__x._M_current)
> -
> std::get<_Is>(__y._M_current))...},
>ranges::less{},
> -  [](difference_type __i) ->
> make_unsigned_t<difference_type> {
> -return __i < 0 ? -__i : __i;
> +  [](difference_type __i) {
> +return __detail::__to_unsigned_like(__i < 0 ?
> -__i : __i);
>});
>}(make_index_sequence{});
>  }
> @@ -4726,8 +4726,8 @@ namespace views::__adaptor
>return [&](index_sequence<_Is...>) {
> return ranges::min({_Ret(std::get<_Is>(__x._M_current) -
> std::get<_Is>(__y._M_end))...},
>ranges::less{},
> -  [](_Ret __i) -> make_unsigned_t<_Ret> {
> -return __i < 0 ? -__i : __i;
> +  [](_Ret __i) {
> +return __detail::__to_unsigned_like(__i < 0 ?
> -__i : __i);
>});
>}(make_index_sequence{});
>  }
> diff --git a/libstdc++-v3/testsuite/std/ranges/zip/1.cc
> b/libstdc++-v3/testsuite/std/ranges/zip/1.cc
> index 0113efdb537..f868c97cb69 100644
> --- a/libstdc++-v3/testsuite/std/ranges/zip/1.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/zip/1.cc
> @@ -102,10 +102,24 @@ test03()
>return true;
>  }
>
> +constexpr bool
> +test04()
> +{
> +  // PR libstdc++/106766
> +  auto r = views::zip(views::iota(__int128(0), __int128(1)));
> +  auto i = r.begin();
> +  auto s = r.end();
> +  VERIFY( s - i == 1 );
> +  VERIFY( i + 1 - i == 1 );
> +
> +  return true;
> +}
> +
>  int
>  main()
>  {
>static_assert(test01());
>static_assert(test02());
>static_assert(test03());
> +  static_assert(test04());
>  }
> --
> 2.37.3.518.g79f2338b37
>
>
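
The projection used in the patch (compare the differences by magnitude, but return the original signed value) can be sketched outside the library as follows. This is only an illustration for ordinary integer types, where std::make_unsigned is enough; the name closest_to_zero is made up here, and the internal __to_unsigned_like helper that the patch relies on for integer-class types is not reproduced.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>
#include <type_traits>

// Hypothetical helper (not part of libstdc++): return the element of
// DIFFS whose magnitude is smallest, comparing magnitudes as unsigned
// values so that negating the most negative value cannot overflow.
template<typename Diff>
Diff closest_to_zero(std::initializer_list<Diff> diffs)
{
  assert(diffs.size() > 0);
  typedef typename std::make_unsigned<Diff>::type U;
  // Magnitude computed in the unsigned domain.
  auto magnitude = [](Diff d) -> U { return d < 0 ? U(0) - U(d) : U(d); };
  return *std::min_element(diffs.begin(), diffs.end(),
                           [&](Diff a, Diff b)
                           { return magnitude(a) < magnitude(b); });
}
```

For example, closest_to_zero<long>({-7, 3, -2, 5}) picks -2, because the comparison sees the magnitudes 7, 3, 2 and 5.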


Re: [PATCH 2/3] libstdc++: Fix typo in adjacent_view::_Iterator [PR106798]

2022-09-09 Thread Jonathan Wakely via Gcc-patches
On Fri, 9 Sep 2022, 18:25 Patrick Palka via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> PR libstdc++/106798
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (adjacent_view::_Iterator::_Iterator): Fix
> typo.
> * testsuite/std/ranges/adaptors/adjacent/1.cc (test04): New test.
>



OK, thanks.


> ---
>  libstdc++-v3/include/std/ranges  |  2 +-
>  .../testsuite/std/ranges/adaptors/adjacent/1.cc  | 12 
>  2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> index 2b8fec3c386..37ad80ad3de 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -5239,7 +5239,7 @@ namespace views::__adaptor
>requires _Const && convertible_to,
> iterator_t<_Base>>
>  {
>for (size_t __j = 0; __j < _Nm; ++__j)
> -   _M_current[__j] = std::move(__i[__j]);
> +   _M_current[__j] = std::move(__i._M_current[__j]);
>  }
>
>  constexpr auto
> diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/adjacent/1.cc
> b/libstdc++-v3/testsuite/std/ranges/adaptors/adjacent/1.cc
> index 9829f79364f..443c1fbf450 100644
> --- a/libstdc++-v3/testsuite/std/ranges/adaptors/adjacent/1.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/adaptors/adjacent/1.cc
> @@ -101,10 +101,22 @@ test03()
>return true;
>  }
>
> +constexpr bool
> +test04()
> +{
> +  // PR libstdc++/106798
> +  auto r = views::single(0) | views::lazy_split(0) | views::pairwise;
> +  decltype(ranges::cend(r)) s = r.end();
> +  VERIFY( r.begin() == s );
> +
> +  return true;
> +}
> +
>  int
>  main()
>  {
>static_assert(test01());
>static_assert(test02());
>static_assert(test03());
> +  static_assert(test04());
>  }
> --
> 2.37.3.518.g79f2338b37
>
>


Re: [PATCH 3/3] libstdc++: Fix return type of empty zip/adjacent_transform [PR106803]

2022-09-09 Thread Jonathan Wakely via Gcc-patches
On Fri, 9 Sep 2022, 18:27 Patrick Palka via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> Tested on x86_64-pc-linux-gnu, does this series look OK for trunk?
>


All three are OK, thanks.


> PR libstdc++/106803
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (views::_ZipTransform::operator()): Fix
> return type in the empty case.
> (views::_AdjacentTransform::operator()): Likewise.
> ---
>  libstdc++-v3/include/std/ranges | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> index 37ad80ad3de..20eb4e82ac8 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -5071,7 +5071,7 @@ namespace views::__adaptor
> operator() [[nodiscard]] (_Fp&& __f, _Ts&&... __ts) const
> {
>   if constexpr (sizeof...(_Ts) == 0)
> -   return views::empty>>;
> +   return views::empty&>>>;
>   else
> return zip_transform_view(std::forward<_Fp>(__f),
> std::forward<_Ts>(__ts)...);
> }
> @@ -5762,7 +5762,7 @@ namespace views::__adaptor
>   operator() [[nodiscard]] (_Range&& __r, _Fp&& __f) const
>   {
> if constexpr (_Nm == 0)
> - return views::empty>;
> + return zip_transform(std::forward<_Fp>(__f));
> else
>   return adjacent_transform_view, decay_t<_Fp>,
> _Nm>
> (std::forward<_Range>(__r), std::forward<_Fp>(__f));
> --
> 2.37.3.518.g79f2338b37
>
>


Re: Patch ping (was Re: [PATCH] libstdc++: Clear padding bits in atomic compare_exchange)

2022-09-09 Thread Rainer Orth
Hi Jonathan,

> Here's a complete patch that combines the various incremental patches
> that have been going around. I'm testing this now.
>
> Please take a look.

unfortunately, this patch broke macOS bootstrap (seen on
x86_64-apple-darwin11.4.2):

In file included from 
/var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/shared_ptr_atomic.h:33,
 from 
/var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/memory:78,
 from 
/vol/gcc/src/hg/master/darwin/libstdc++-v3/include/precompiled/stdc++.h:82:
/var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:
 In function 'bool std::__atomic_impl::__compare_exchange(_Tp&, _Val<_Tp>&, 
_Val<_Tp>&, bool, std::memory_order, std::memory_order)':
/var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:1008:49:
 error: expected primary-expression before ',' token
 1008 |   __weak, int(__s), int(__f)))
  | ^
/var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:1017:50:
 error: expected primary-expression before ',' token
 1017 |__weak, int(__s), int(__f));
  |  ^

Darwin gcc predefines __weak= in gcc/config/darwin-c.cc (darwin_cpp_builtins).

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Patch ping (was Re: [PATCH] libstdc++: Clear padding bits in atomic compare_exchange)

2022-09-09 Thread Iain Sandoe via Gcc-patches



> On 9 Sep 2022, at 19:36, Rainer Orth  wrote:
> 

>> Here's a complete patch that combines the various incremental patches
>> that have been going around. I'm testing this now.
>> 
>> Please take a look.
> 
> unfortunately, this patch broke macOS bootstrap (seen on
> x86_64-apple-darwin11.4.2):
> 
> In file included from 
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/shared_ptr_atomic.h:33,
> from 
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/memory:78,
> from 
> /vol/gcc/src/hg/master/darwin/libstdc++-v3/include/precompiled/stdc++.h:82:
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:
>  In function 'bool std::__atomic_impl::__compare_exchange(_Tp&, _Val<_Tp>&, 
> _Val<_Tp>&, bool, std::memory_order, std::memory_order)':
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:1008:49:
>  error: expected primary-expression before ',' token
> 1008 |   __weak, int(__s), int(__f)))
>  | ^
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:1017:50:
>  error: expected primary-expression before ',' token
> 1017 |__weak, int(__s), int(__f));
>  |  ^
> 
> Darwin gcc predefines __weak= in gcc/config/darwin-c.cc (darwin_cpp_builtins).

yes, __weak and __strong are Objective-C things (in principle, applicable to
non-Darwin targets using the NeXT runtime; there is at least one such target).

Iain



Re: Patch ping (was Re: [PATCH] libstdc++: Clear padding bits in atomic compare_exchange)

2022-09-09 Thread Thomas Rodgers via Gcc-patches
s/__weak/__is_weak/g  perhaps?

On Fri, Sep 9, 2022 at 11:46 AM Iain Sandoe via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

>
>
> > On 9 Sep 2022, at 19:36, Rainer Orth 
> wrote:
> >
>
> >> Here's a complete patch that combines the various incremental patches
> >> that have been going around. I'm testing this now.
> >>
> >> Please take a look.
> >
> > unfortunately, this patch broke macOS bootstrap (seen on
> > x86_64-apple-darwin11.4.2):
> >
> > In file included from
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/shared_ptr_atomic.h:33,
> > from
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/memory:78,
> > from
> /vol/gcc/src/hg/master/darwin/libstdc++-v3/include/precompiled/stdc++.h:82:
> >
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:
> In function 'bool std::__atomic_impl::__compare_exchange(_Tp&, _Val<_Tp>&,
> _Val<_Tp>&, bool, std::memory_order, std::memory_order)':
> >
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:1008:49:
> error: expected primary-expression before ',' token
> > 1008 |   __weak, int(__s),
> int(__f)))
> >  | ^
> >
> /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:1017:50:
> error: expected primary-expression before ',' token
> > 1017 |__weak, int(__s),
> int(__f));
> >  |  ^
> >
> > Darwin gcc predefines __weak= in gcc/config/darwin-c.cc
> (darwin_cpp_builtins).
>
> yes, __weak and __strong are Objective C things (in principle, applicable
> to non-Darwin targets
> using NeXT runtime - there is at least one such target).
>
> Iain
>
>


Re: Patch ping (was Re: [PATCH] libstdc++: Clear padding bits in atomic compare_exchange)

2022-09-09 Thread Jonathan Wakely via Gcc-patches
On Fri, 9 Sept 2022 at 20:01, Thomas Rodgers wrote:
>
> s/__weak/__is_weak/g  perhaps?

Yes, that'll do. Fixed by the attached, with a test to avoid it happening again.

Tested x86_64-linux, pushed to trunk.
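
For readers outside the thread, here is a minimal sketch of the shape of the fix. It assumes a compiler that provides the GCC-style generic __atomic_compare_exchange builtin, and the wrapper name compare_exchange is made up for illustration; it is not the libstdc++ code. The point is only that the flag parameter must not be spelled __weak, because on Darwin that token is predefined as an empty macro and disappears from the argument list.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper around the GCC/Clang generic builtin.  The flag
// is named is_weak: a parameter called __weak would expand to nothing
// where that name is predefined as an empty macro, leaving ", ," in the
// call below ("expected primary-expression before ',' token").
template<typename T>
bool compare_exchange(T& val, T& expected, T desired, bool is_weak,
                      std::memory_order success,
                      std::memory_order failure) noexcept
{
  // The failure order may not be release or acq_rel.
  assert(failure != std::memory_order_release
         && failure != std::memory_order_acq_rel);
  return __atomic_compare_exchange(&val, &expected, &desired, is_weak,
                                   int(success), int(failure));
}
```

A strong exchange (is_weak == false) either stores the desired value and returns true, or copies the current value into expected and returns false.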




>
> On Fri, Sep 9, 2022 at 11:46 AM Iain Sandoe via Libstdc++ 
>  wrote:
>>
>>
>>
>> > On 9 Sep 2022, at 19:36, Rainer Orth  wrote:
>> >
>>
>> >> Here's a complete patch that combines the various incremental patches
>> >> that have been going around. I'm testing this now.
>> >>
>> >> Please take a look.
>> >
>> > unfortunately, this patch broke macOS bootstrap (seen on
>> > x86_64-apple-darwin11.4.2):
>> >
>> > In file included from 
>> > /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/shared_ptr_atomic.h:33,
>> > from 
>> > /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/memory:78,
>> > from 
>> > /vol/gcc/src/hg/master/darwin/libstdc++-v3/include/precompiled/stdc++.h:82:
>> > /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:
>> >  In function 'bool std::__atomic_impl::__compare_exchange(_Tp&, 
>> > _Val<_Tp>&, _Val<_Tp>&, bool, std::memory_order, std::memory_order)':
>> > /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:1008:49:
>> >  error: expected primary-expression before ',' token
>> > 1008 |   __weak, int(__s), 
>> > int(__f)))
>> >  | ^
>> > /var/gcc/regression/master/10.7-gcc/build/x86_64-apple-darwin11.4.2/libstdc++-v3/include/bits/atomic_base.h:1017:50:
>> >  error: expected primary-expression before ',' token
>> > 1017 |__weak, int(__s), 
>> > int(__f));
>> >  |  ^
>> >
>> > Darwin gcc predefines __weak= in gcc/config/darwin-c.cc 
>> > (darwin_cpp_builtins).
>>
>> yes, __weak and __strong are Objective C things (in principle, applicable to 
>> non-Darwin targets
>> using NeXT runtime - there is at least one such target).
>>
>> Iain
>>
commit 007680f946eaffa3c6321624129e1ec18e673091
Author: Jonathan Wakely 
Date:   Fri Sep 9 21:03:58 2022

libstdc++: Rename parameter to avoid darwin __weak qualifier

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h (__atomic_impl::__compare_exchange):
Rename __weak to __is_weak.
* testsuite/17_intro/names.cc: Add __weak and __strong.

diff --git a/libstdc++-v3/include/bits/atomic_base.h 
b/libstdc++-v3/include/bits/atomic_base.h
index 29315547aab..6ea3268fdf0 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -990,7 +990,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template
   _GLIBCXX_ALWAYS_INLINE bool
   __compare_exchange(_Tp& __val, _Val<_Tp>& __e, _Val<_Tp>& __i,
-bool __weak, memory_order __s, memory_order __f) noexcept
+bool __is_weak,
+memory_order __s, memory_order __f) noexcept
   {
__glibcxx_assert(__is_valid_cmpexch_failure_order(__f));
 
@@ -1005,7 +1006,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__atomic_impl::__clear_padding(*__exp);
if (__atomic_compare_exchange(std::__addressof(__val), __exp,
  __atomic_impl::__clear_padding(__i),
- __weak, int(__s), int(__f)))
+ __is_weak, int(__s), int(__f)))
  return true;
__builtin_memcpy(std::__addressof(__e), __exp, sizeof(_Vp));
return false;
@@ -1014,7 +1015,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  return __atomic_compare_exchange(std::__addressof(__val),
   std::__addressof(__e),
   std::__addressof(__i),
-  __weak, int(__s), int(__f));
+  __is_weak, int(__s), int(__f));
   }
   } // namespace __atomic_impl
 
diff --git a/libstdc++-v3/testsuite/17_intro/names.cc 
b/libstdc++-v3/testsuite/17_intro/names.cc
index ede2fe8caa7..86fb8f8999b 100644
--- a/libstdc++-v3/testsuite/17_intro/names.cc
+++ b/libstdc++-v3/testsuite/17_intro/names.cc
@@ -129,6 +129,10 @@
 // This clashes with newlib so don't use it.
 # define __lockablecannot be used as an identifier
 
+#ifndef __APPLE__
+#define __weak   predefined qualifier on darwin
+#define __strong predefined qualifier on darwin
+#endif
 
 // Common template parameter names
 #define OutputIterator OutputIterator is not a reserved name


Re: [PATCH] libstdc++: Refactor implementation of operator+ for std::string

2022-09-09 Thread Jonathan Wakely via Gcc-patches
On Thu, 8 Sept 2022 at 18:51, François Dumont via Libstdc++
 wrote:
>
> On 05/09/22 20:30, Will Hawkins wrote:
> > Based on Jonathan's work, here is a patch for the implementation of 
> > operator+
> > on std::string that makes sure we always use the best allocation strategy.
> >
> > I have attempted to learn from all the feedback that I got on a previous
> > submission -- I hope I did the right thing.
> >
> > Passes abi and conformance testing on x86-64 trunk.
> >
> > Sincerely,
> > Will
> >
> > -- >8 --
> >
> > Create a single function that performs one-allocation string concatenation
> > that can be used by various different versions of operator+.
> >
> > libstdc++-v3/ChangeLog:
> >
> >   * include/bits/basic_string.h:
> >   Add common function that performs single-allocation string
> >   concatenation. (__str_concat)
> >   Use __str_concat to perform optimized operator+, where relevant.
> >   * include/bits/basic_string.tcc:
> >   Remove single-allocation implementation of operator+.
> >
> > Signed-off-by: Will Hawkins 
> > ---
> >   libstdc++-v3/include/bits/basic_string.h   | 66 --
> >   libstdc++-v3/include/bits/basic_string.tcc | 41 --
> >   2 files changed, 49 insertions(+), 58 deletions(-)
> >
> > diff --git a/libstdc++-v3/include/bits/basic_string.h 
> > b/libstdc++-v3/include/bits/basic_string.h
> > index 0df64ea98ca..4078651fadb 100644
> > --- a/libstdc++-v3/include/bits/basic_string.h
> > +++ b/libstdc++-v3/include/bits/basic_string.h
> > @@ -3481,6 +3481,24 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
> >   _GLIBCXX_END_NAMESPACE_CXX11
> >   #endif
> >
> > +  template
> > +_GLIBCXX20_CONSTEXPR
> > +inline _Str
> > +__str_concat(typename _Str::value_type const* __lhs,
> > +  typename _Str::size_type __lhs_len,
> > +  typename _Str::value_type const* __rhs,
> > +  typename _Str::size_type __rhs_len,
> > +  typename _Str::allocator_type const& __a)
> > +{
> > +  typedef typename _Str::allocator_type allocator_type;
> > +  typedef __gnu_cxx::__alloc_traits _Alloc_traits;
> > +  _Str __str(_Alloc_traits::_S_select_on_copy(__a));
> > +  __str.reserve(__lhs_len + __rhs_len);
> > +  __str.append(__lhs, __lhs_len);
> > +  __str.append(__rhs, __rhs_len);
> > +  return __str;
> > +}
> > +
> > // operator+
> > /**
> >  *  @brief  Concatenate two strings.
> > @@ -3490,13 +3508,14 @@ _GLIBCXX_END_NAMESPACE_CXX11
> >  */
> > template
> >   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
> > -basic_string<_CharT, _Traits, _Alloc>
> > +inline basic_string<_CharT, _Traits, _Alloc>
> >   operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
> > const basic_string<_CharT, _Traits, _Alloc>& __rhs)
> >   {
> > -  basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
> > -  __str.append(__rhs);
> > -  return __str;
> > +  typedef basic_string<_CharT, _Traits, _Alloc> _Str;
> > +  return std::__str_concat<_Str>(__lhs.c_str(), __lhs.size(),
> > +  __rhs.c_str(), __rhs.size(),
>
> You should use data() rather than c_str() here and all other operators.
>
> It is currently the same but is more accurate in your context. Maybe one
> day it will make a difference.

As I said, it will never make a difference, so there's no technical
reason to change it. I suppose data() is a little more expressive
here, in that we only care about the characters, not the null
terminator that c_str() implies (even though data() has the null
terminator too, as it's the same pointer returned).

>
> > +  __lhs.get_allocator());
> >   }
> >
> > /**
> > @@ -3507,9 +3526,16 @@ _GLIBCXX_END_NAMESPACE_CXX11
> >  */
> > template
> >   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
> > -basic_string<_CharT,_Traits,_Alloc>
> > +inline basic_string<_CharT,_Traits,_Alloc>
>
> Why inlining ?

Because it's a one line function that just calls another function.
That's an ideal candidate for being inline.
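
To make the two review points concrete, here is a standalone sketch of the same idea written against the public std::basic_string interface. The names str_concat and concat are made up for this example and are not the names used in the patch; the sketch assumes only reserve(), append() and std::allocator_traits. One helper does the single-allocation concatenation, and the operator-like front end is a one-line inline forwarder that passes data() and size().

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical helper: concatenate two character ranges with at most
// one allocation by reserving the final length up front.
template<typename Str>
Str str_concat(const typename Str::value_type* lhs,
               typename Str::size_type lhs_len,
               const typename Str::value_type* rhs,
               typename Str::size_type rhs_len,
               const typename Str::allocator_type& alloc)
{
  assert(lhs != nullptr && rhs != nullptr);
  typedef std::allocator_traits<typename Str::allocator_type> traits;
  Str result(traits::select_on_container_copy_construction(alloc));
  result.reserve(lhs_len + rhs_len);
  result.append(lhs, lhs_len);
  result.append(rhs, rhs_len);
  return result;
}

// One-line forwarder, a natural candidate for inlining.
inline std::string concat(const std::string& lhs, const std::string& rhs)
{
  return str_concat<std::string>(lhs.data(), lhs.size(),
                                 rhs.data(), rhs.size(),
                                 lhs.get_allocator());
}
```

Passing explicit lengths means the helper never has to look for a terminator, which is why data() and c_str() are interchangeable here.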



stdatomic.h: Do not define ATOMIC_VAR_INIT for C2x

2022-09-09 Thread Joseph Myers
The <stdatomic.h> macro ATOMIC_VAR_INIT, previously declared obsolete,
is removed completely in C2x; disable it for C2x in GCC's
implementation.  (Although ATOMIC_* are reserved names for this
header, disabling the macro for C2x still seems appropriate.)

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  OK to commit?

gcc/
* ginclude/stdatomic.h [defined __STDC_VERSION__ &&
__STDC_VERSION__ > 201710L] (ATOMIC_VAR_INIT): Do not define.

gcc/testsuite/
* gcc.dg/atomic/c2x-stdatomic-var-init-1.c: New test.

diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h
index 9f2475b739d..a56ba5d9639 100644
--- a/gcc/ginclude/stdatomic.h
+++ b/gcc/ginclude/stdatomic.h
@@ -79,7 +79,9 @@ typedef _Atomic __INTMAX_TYPE__ atomic_intmax_t;
 typedef _Atomic __UINTMAX_TYPE__ atomic_uintmax_t;
 
 
+#if !(defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L)
 #define ATOMIC_VAR_INIT(VALUE) (VALUE)
+#endif
 
 /* Initialize an atomic object pointed to by PTR with VAL.  */
 #define atomic_init(PTR, VAL)   \
diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-var-init-1.c 
b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-var-init-1.c
new file mode 100644
index 000..1978a410350
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-var-init-1.c
@@ -0,0 +1,9 @@
+/* Test ATOMIC_VAR_INIT not in C2x.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+#include <stdatomic.h>
+
+#ifdef ATOMIC_VAR_INIT
+#error "ATOMIC_VAR_INIT defined"
+#endif

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] OpenMP, libgomp: Environment variable syntax extension.

2022-09-09 Thread Rainer Orth
Hi Jakub,

> On Wed, Aug 31, 2022 at 12:56:25PM +0200, Marcel Vollweiler wrote:
>> libgomp/ChangeLog:
[...]
>>  (initialize_env): Extended to parse the new syntax of environment
>>  variables.

this patch broke Darwin bootstrap:

Undefined symbols for architecture x86_64:
  "_environ", referenced from:
  _initialize_env in env.o
ld: symbol(s) not found for architecture x86_64
collect2: error: ld returned 1 exit status
make[5]: *** [libgomp.la] Error 1

This is documented in environ(7):

 Shared libraries and bundles don't have direct access to environ, which
 is only available to the loader ld(1) when a complete program is being
 linked.  The environment routines can still be used, but if direct access
 to environ is needed, the _NSGetEnviron() routine, defined in
 <crt_externs.h>, can be used to retrieve the address of environ at run-
 time.

The following patch/hack, taken from
libgfortran/intrinsics/execute_command_line.c, allows the link to
succeed.  Bootstrap still running...

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


diff --git a/libgomp/env.c b/libgomp/env.c
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -54,6 +54,13 @@
 #include 
 #include "thread-stacksize.h"
 
+#ifdef __APPLE__
+# include <crt_externs.h>
+# define environ (*_NSGetEnviron ())
+#else
+extern char **environ;
+#endif
+
 #ifndef HAVE_STRTOULL
 # define strtoull(ptr, eptr, base) strtoul (ptr, eptr, base)
 #endif
@@ -2033,7 +2040,6 @@ startswith (const char *str, const char 
 static void __attribute__((constructor))
 initialize_env (void)
 {
-  extern char **environ;
   char **env;
   int omp_var, dev_num = 0, dev_num_len = 0, i;
   bool ignore = false;


[committed] analyzer: add test coverage for flexible array members [PR98247]

2022-09-09 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-2571-g084dc9a0c6cec1.

gcc/testsuite/ChangeLog:
PR analyzer/98247
* gcc.dg/analyzer/flexible-array-member-1.c: New test.

Signed-off-by: David Malcolm 
---
 .../gcc.dg/analyzer/flexible-array-member-1.c | 100 ++
 1 file changed, 100 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/flexible-array-member-1.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/flexible-array-member-1.c 
b/gcc/testsuite/gcc.dg/analyzer/flexible-array-member-1.c
new file mode 100644
index 000..2df085a43f2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/flexible-array-member-1.c
@@ -0,0 +1,100 @@
+#include 
+#include 
+
+struct str {
+  size_t len;
+  char data[];
+};
+
+struct str *
+test_const_size (void)
+{
+  struct str *str = malloc(sizeof(str) + 10);
+  if (str) {
+str->len = 10;
+memset(str->data, 'x', 10);
+return str;
+  }
+  return NULL;
+}
+
+struct str *
+test_const_size_oob_1 (void)
+{
+  /* Forgetting to add space for the trailing array.  */
+  struct str *str = malloc(sizeof(str));
+  if (str) {
+str->len = 10;
+memset(str->data, 'x', 10); /* { dg-warning "heap-based buffer overflow" "Wanalyzer-out-of-bounds" } */
+/* { dg-warning "'memset' writing 10 bytes into a region of size 0 overflows the destination" "Wstringop-overflow" { target *-*-* } .-1 } */
+return str;
+  }
+  return NULL;
+}
+
+struct str *
+test_const_size_oob_2 (void)
+{
+  struct str *str = malloc(sizeof(str) + 10);
+  if (str) {
+str->len = 10;
+/* Using the wrong size here.  */
+memset(str->data, 'x', 11); /* { dg-warning "heap-based buffer overflow" "Wanalyzer-out-of-bounds" } */
+/* { dg-warning "'memset' writing 11 bytes into a region of size 10 overflows the destination" "Wstringop-overflow" { target *-*-* } .-1 } */
+return str;
+  }
+  return NULL;
+}
+
+struct str *
+test_symbolic_size (size_t len)
+{
+  struct str *str = malloc(sizeof(str) + len);
+  if (str) {
+str->len = len;
+memset(str->data, 'x', len);
+return str;
+  }
+  return NULL;
+}
+
+struct str *
+test_symbolic_size_oob (size_t len)
+{
+  /* Forgetting to add space for the trailing array.  */
+  struct str *str = malloc(sizeof(str));
+  if (str) {
+str->len = len;
+memset(str->data, 'x', len); /* { dg-warning "heap-based buffer overflow" "PR analyzer/98247" { xfail *-*-* } } */
+// TODO(xfail): we don't yet complain about this case, which occurs when len > 0
+return str;
+  }
+  return NULL;
+}
+
+struct str *
+test_symbolic_size_with_terminator (size_t len)
+{
+  struct str *str = malloc(sizeof(str) + len + 1);
+  if (str) {
+str->len = len;
+memset(str->data, 'x', len);
+str->data[len] = '\0';
+return str;
+  }
+  return NULL;
+}
+
+struct str *
+test_symbolic_size_with_terminator_oob (size_t len)
+{
+  /* Forgetting to add 1 for the terminator.  */
+  struct str *str = malloc(sizeof(str) + len);
+  if (str) {
+str->len = len;
+memset(str->data, 'x', len);
+str->data[len] = '\0'; /* { dg-warning "heap-based buffer overflow" } */
+return str;
+  }
+  return NULL;
+}
-- 
2.26.3



[committed] analyzer: add support for plugin-supplied known function behaviors

2022-09-09 Thread David Malcolm via Gcc-patches
This patch adds the ability for plugins to register "known functions"
with the analyzer, identified by name.  If -fanalyzer sees a call to
such a function (with no body), it will use a plugin-provided subclass
of the new known_function abstract base class to model the possible
outcomes of the function call.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-2572-g07e30160beaa20.
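
The registration scheme described above (an abstract base class per modelled function, looked up by callee name) can be sketched in isolation. Everything in this block is a made-up stand-in: call_info, known_function_model and function_registry are illustrative names, not the analyzer's call_details, known_function or known_function_manager classes, and no GCC internals are used.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the information available about one call site.
struct call_info
{
  std::string callee;
  int outcome;
};

// Abstract base class: one subclass per function whose behavior is
// supplied from outside (in the patch, by a plugin).
class known_function_model
{
public:
  virtual ~known_function_model() {}
  virtual void model_call(call_info& call) const = 0;
};

// Name-keyed registry that owns the registered models.
class function_registry
{
public:
  void add(const std::string& name,
           std::unique_ptr<known_function_model> model)
  {
    assert(model != nullptr);
    models_[name] = std::move(model);
  }

  // Returns nullptr when nothing is registered under NAME.
  const known_function_model* lookup(const std::string& name) const
  {
    auto it = models_.find(name);
    return it == models_.end() ? nullptr : it->second.get();
  }

private:
  std::map<std::string, std::unique_ptr<known_function_model>> models_;
};

// Example model: the call is treated as always producing zero.
class returns_zero_model : public known_function_model
{
public:
  void model_call(call_info& call) const override { call.outcome = 0; }
};
```

A caller would consult lookup() only for calls whose callee has no body available, and fall back to its default handling when it returns nullptr.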

gcc/ChangeLog:
* Makefile.in (ANALYZER_OBJS): Add
analyzer/known-function-manager.o.

gcc/analyzer/ChangeLog:
* analyzer.h (class known_function_manager): New forward decl.
(class known_function): New.
(plugin_analyzer_init_iface::register_known_function): New.
* engine.cc: Include "analyzer/known-function-manager.h".
(plugin_analyzer_init_impl::plugin_analyzer_init_impl): Add
known_fn_mgr param.
(plugin_analyzer_init_impl::register_state_machine): Add
LOC_SCOPE.
(plugin_analyzer_init_impl::register_known_function): New.
(plugin_analyzer_init_impl::m_known_fn_mgr): New.
(impl_run_checkers): Update plugin callback invocation to use
eng's known_function_manager.
* known-function-manager.cc: New file.
* known-function-manager.h: New file.
* region-model-manager.cc
(region_model_manager::region_model_manager): Pass logger to
m_known_fn_mgr's ctor.
* region-model.cc (region_model::update_for_zero_return): New.
(region_model::update_for_nonzero_return): New.
(maybe_simplify_upper_bound): New.
(region_model::maybe_get_copy_bounds): New.
(region_model::get_known_function): New.
(region_model::on_call_pre): Handle plugin-supplied known
functions.
* region-model.h: Include "analyzer/known-function-manager.h".
(region_model_manager::get_known_function_manager): New.
(region_model_manager::m_known_fn_mgr): New.
(call_details::get_model): New accessor.
(region_model::maybe_get_copy_bounds): New decl.
(region_model::update_for_zero_return): New decl.
(region_model::update_for_nonzero_return): New decl.
(region_model::get_known_function): New decl.
(region_model::get_known_function_manager): New.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_known_fns_plugin.c: New test plugin.
* gcc.dg/plugin/known-fns-1.c: New test.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the new plugin
and test.

Signed-off-by: David Malcolm 
---
 gcc/Makefile.in   |   1 +
 gcc/analyzer/analyzer.h   |  13 ++
 gcc/analyzer/engine.cc|  16 +-
 gcc/analyzer/known-function-manager.cc|  78 +++
 gcc/analyzer/known-function-manager.h |  45 
 gcc/analyzer/region-model-manager.cc  |   3 +-
 gcc/analyzer/region-model.cc  | 109 ++
 gcc/analyzer/region-model.h   |  21 ++
 .../gcc.dg/plugin/analyzer_known_fns_plugin.c | 201 ++
 gcc/testsuite/gcc.dg/plugin/known-fns-1.c |  61 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   2 +
 11 files changed, 548 insertions(+), 2 deletions(-)
 create mode 100644 gcc/analyzer/known-function-manager.cc
 create mode 100644 gcc/analyzer/known-function-manager.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/analyzer_known_fns_plugin.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/known-fns-1.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d3b66b7106e..a4689d52e36 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1262,6 +1262,7 @@ ANALYZER_OBJS = \
analyzer/engine.o \
analyzer/feasible-graph.o \
analyzer/function-set.o \
+   analyzer/known-function-manager.o \
analyzer/pending-diagnostic.o \
analyzer/program-point.o \
analyzer/program-state.o \
diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index e4dd6d6339d..b325aee37ce 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -113,6 +113,7 @@ class engine;
 class state_machine;
 class logger;
 class visitor;
+class known_function_manager;
 
 /* Forward decls of functions.  */
 
@@ -218,12 +219,24 @@ extern location_t get_stmt_location (const gimple *stmt, 
function *fun);
 
 extern bool compat_types_p (tree src_type, tree dst_type);
 
+/* Abstract base class for simulating the behavior of known functions,
+   supplied by plugins.  */
+
+class known_function
+{
+public:
+  virtual ~known_function () {}
+  virtual void impl_call_pre (const call_details &cd) const = 0;
+};
+
 /* Passed by pointer to PLUGIN_ANALYZER_INIT callbacks.  */
 
 class plugin_analyzer_init_iface
 {
 public:
   virtual void register_state_machine (state_machine *) = 0;
+  virtual void register_known_function (const char *name,
+   known_function *) = 0;
   virtual logger *get_log

[committed] analyzer: implement trust boundaries via a plugin for Linux kernel

2022-09-09 Thread David Malcolm via Gcc-patches
This is a less ambitious version of:
  [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584372.html

Earlier versions of this patch attempted:
(a) various ways of identifying "untrusted" memory regions
(b) providing a way to support the Linux kernel's "__user" annotation,
either via type attributes, or via custom address spaces
(c) enough attributes to identify "copy_from_user" and "copy_to_user",
(d) wiring all of the above together to detect infoleaks and taint

This patch adds a new -Wanalyzer-exposure-through-uninit-copy, emitted
by -fanalyzer if it detects copying of uninitialized data through
a pointer to an untrusted region, but requires a plugin to tell it when
a copy crosses a trust boundary.

This patch adds a proof-of-concept gcc plugin for the analyzer for use
with the Linux kernel that special-cases calls to "copy_from_user" and
calls to "copy_to_user": calls to copy_to_user are checked for
-Wanalyzer-exposure-through-uninit-copy, and data copied via
copy_from_user is marked as tainted when -fanalyzer-checker=taint is
active.

This is very much just a proof-of-concept.  A big limitation is that the
copy_{from,to}_user special-casing only happens if these functions have
no body in the TU being analyzed, which isn't the case for a normal
kernel build.  I'd much prefer to provide a more general mechanism for
handling such behavior without resorting to plugins (e.g. via attributes
or custom address spaces), but in the interest of not "letting perfect
be the enemy of the good" this patch at least allows parts of this
"trust boundaries" code to be merged for experimentation with the idea.

The -Wanalyzer-exposure-through-uninit-copy diagnostic uses notes to
express what fields and padding within a struct have not been initialized.
For example:

infoleak-CVE-2011-1078-2.c: In function 'test_1':
infoleak-CVE-2011-1078-2.c:32:9: warning: potential exposure of sensitive
  information by copying uninitialized data from stack across trust
  boundary [CWE-200] [-Wanalyzer-exposure-through-uninit-copy]
   32 | copy_to_user(optval, &cinfo, sizeof(cinfo));
  | ^~~
  'test_1': events 1-3
|
|   25 | struct sco_conninfo cinfo;
|  | ^
|  | |
|  | (1) region created on stack here
|  | (2) capacity: 6 bytes
|..
|   32 | copy_to_user(optval, &cinfo, sizeof(cinfo));
|  | ~~~
|  | |
|  | (3) uninitialized data copied from stack here
|
infoleak-CVE-2011-1078-2.c:32:9: note: 1 byte is uninitialized
   32 | copy_to_user(optval, &cinfo, sizeof(cinfo));
  | ^~~
infoleak-CVE-2011-1078-2.c:18:15: note: padding after field 'dev_class'
  is uninitialized (1 byte)
   18 | __u8  dev_class[3];
  |   ^
infoleak-CVE-2011-1078-2.c:25:29: note: suggest forcing
  zero-initialization by providing a '{0}' initializer
   25 | struct sco_conninfo cinfo;
  | ^
  |   = {0}
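
The situation in the diagnostic above can be reproduced in a small userspace sketch. The struct layout below (a 16-bit handle followed by three bytes) is an assumption chosen to match the "capacity: 6 bytes" and "1 byte is uninitialized" notes, copy_out is a made-up stand-in for copy_to_user, and memset is used here as a conservative way to clear the padding byte as well as the named fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed layout: 2 + 3 bytes of fields plus 1 byte of tail padding.
struct conninfo
{
  std::uint16_t hci_handle;
  std::uint8_t dev_class[3];
};

// Stand-in for copy_to_user: copies the raw object representation,
// padding included, to a buffer owned by someone else.
inline void copy_out(unsigned char* dst, const void* src, std::size_t n)
{
  assert(dst != nullptr && src != nullptr);
  std::memcpy(dst, src, n);
}

inline void fill_conninfo(unsigned char* out)
{
  conninfo cinfo;
  // Clear the whole object first so that no stale stack byte can
  // cross the boundary through the padding.
  std::memset(&cinfo, 0, sizeof cinfo);
  cinfo.hci_handle = 0x1234;
  cinfo.dev_class[0] = 1;
  cinfo.dev_class[1] = 2;
  cinfo.dev_class[2] = 3;
  copy_out(out, &cinfo, sizeof cinfo);
}
```

Without the memset, the last byte of the copied buffer would hold whatever happened to be on the stack, which is the leak the warning describes.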

For taint-detection, the patch includes a series of reproducers for
detecting CVE-2011-0521.  Unfortunately the analyzer doesn't yet detect
the issue until the code has been significantly simplified from its
original form: currently only in -5.c and -6.c in the series of tests
(see notes in the individual cases), such as:

taint-CVE-2011-0521-6.c:33:48: warning: use of attacker-controlled value
  '*info.num' in array lookup without bounds checking [CWE-129]
  [-Wanalyzer-tainted-array-index]
   33 | av7110->ci_slot[info->num].num = info->num;
  | ~~~^~~
  'test_1': events 1-3
|
|   19 |if (copy_from_user(&sbuf, (void __user *)arg, sizeof(sbuf)) != 0)
|  |^
|  ||
|  |(1) following 'false' branch...
|..
|   23 | struct dvb_device *dvbdev = file->private_data;
|  |~~
|  ||
|  |(2) ...to here
|..
|   33 | av7110->ci_slot[info->num].num = info->num;
|  | ~~
|  ||
|  |(3) use of attacker-controlled value '*info.num' in array lookup without bounds checking
|
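
For readers unfamiliar with CWE-129, the kind of check whose absence the
warning reports can be sketched as follows.  This is an illustration only:
the types, the slot count and the function name are hypothetical stand-ins,
not the av7110 driver code and not the actual fix for the CVE.

```c
#include <errno.h>

/* Hypothetical slot count, chosen only for the sketch.  */
#define SKETCH_NUM_SLOTS 2

struct slot_info_sketch
{
  int num;	/* Index supplied by the caller, i.e. attacker-controlled.  */
};

struct device_sketch
{
  struct slot_info_sketch ci_slot[SKETCH_NUM_SLOTS];
};

/* Validate the index against both bounds before using it as an array
   subscript; reject it with -EINVAL otherwise.  */
static int
set_slot (struct device_sketch *dev, const struct slot_info_sketch *info)
{
  if (info->num < 0 || info->num >= SKETCH_NUM_SLOTS)
    return -EINVAL;

  dev->ci_slot[info->num].num = info->num;
  return 0;
}
```

Because the index is signed, both the lower and the upper bound are checked;
testing only the upper bound would still let a negative value through.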

The patch also includes various infoleak and taint cases from my
antipatterns.ko kernel module:
  https://github.com/davidmalcolm/antipatterns.ko

Successfully bootstrapped & regrteste

Re: [PATCH] OpenMP, libgomp: Environment variable syntax extension.

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Fri, Sep 09, 2022 at 10:50:19PM +0200, Rainer Orth wrote:
> Hi Jakub,
> 
> > On Wed, Aug 31, 2022 at 12:56:25PM +0200, Marcel Vollweiler wrote:
> >> libgomp/ChangeLog:
> [...]
> >>(initialize_env): Extended to parse the new syntax of environment
> >>variables.
> 
> this patch broke Darwin bootstrap:
> 
> Undefined symbols for architecture x86_64:
>   "_environ", referenced from:
>   _initialize_env in env.o
> ld: symbol(s) not found for architecture x86_64
> collect2: error: ld returned 1 exit status
> make[5]: *** [libgomp.la] Error 1
> 
> This is documented in environ(7):
> 
>  Shared libraries and bundles don't have direct access to environ, which
>  is only available to the loader ld(1) when a complete program is being
>  linked.  The environment routines can still be used, but if direct access
>  to environ is needed, the _NSGetEnviron() routine, defined in
>  <crt_externs.h>, can be used to retrieve the address of environ at
>  run-time.
> 
> The following patch/hack, taken from
> libgfortran/intrinsics/execute_command_line.c, allows the link to
> succeed.  Bootstrap still running...

My preference would be to introduce some new header for this, which would
say define get_environ inline function, and define it as
static inline char **
get_environ (void)
{
  extern char **environ;
  return environ;
}
in libgomp/config/posix/env.h
and to that
#include <crt_externs.h>

static inline char **
get_environ (void)
{
  return *_NSGetEnviron ();
}
in libgomp/config/darwin/env.h
That way it is easier to override it for other platforms if needed.

Jakub
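
The two per-platform definitions proposed above can also be folded into one
function for illustration.  This is a minimal sketch under stated
assumptions: it selects the Darwin variant with the __APPLE__ predefined
macro instead of separate config/posix and config/darwin headers, and
count_env_entries is a hypothetical helper added only to show a caller.  It
is not the libgomp change that was eventually made.

```c
#include <stddef.h>

#ifdef __APPLE__
/* Declares _NSGetEnviron (), as described in environ(7) on Darwin.  */
#include <crt_externs.h>
#endif

/* Return the environment vector without referencing the 'environ'
   symbol directly on Darwin, where shared libraries cannot link
   against it.  */
static inline char **
get_environ (void)
{
#ifdef __APPLE__
  return *_NSGetEnviron ();
#else
  extern char **environ;
  return environ;
#endif
}

/* Hypothetical caller: count the NAME=VALUE entries.  */
static size_t
count_env_entries (void)
{
  size_t n = 0;
  char **env = get_environ ();

  if (env == NULL)
    return 0;
  for (char **p = env; *p != NULL; p++)
    n++;
  return n;
}
```

Keeping the platform difference behind one accessor means callers such as an
environment parser never name environ themselves, which is the property the
proposal is after.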



Re: [PATCH] OpenMP, libgomp: Environment variable syntax extension.

2022-09-09 Thread Iain Sandoe



> On 9 Sep 2022, at 23:08, Jakub Jelinek  wrote:
> 
> On Fri, Sep 09, 2022 at 10:50:19PM +0200, Rainer Orth wrote:
>> Hi Jakub,
>> 
>>> On Wed, Aug 31, 2022 at 12:56:25PM +0200, Marcel Vollweiler wrote:
>>>> libgomp/ChangeLog:
>> [...]
>>>> (initialize_env): Extended to parse the new syntax of environment
>>>> variables.
>> 
>> this patch broke Darwin bootstrap:
>> 
>> Undefined symbols for architecture x86_64:
>>  "_environ", referenced from:
>>  _initialize_env in env.o
>> ld: symbol(s) not found for architecture x86_64
>> collect2: error: ld returned 1 exit status
>> make[5]: *** [libgomp.la] Error 1
>> 
>> This is documented in environ(7):
>> 
>> Shared libraries and bundles don't have direct access to environ, which
>> is only available to the loader ld(1) when a complete program is being
>> linked.  The environment routines can still be used, but if direct access
>> to environ is needed, the _NSGetEnviron() routine, defined in
>> <crt_externs.h>, can be used to retrieve the address of environ at
>> run-time.
>> 
>> The following patch/hack, taken from
>> libgfortran/intrinsics/execute_command_line.c, allows the link to
>> succeed.  Bootstrap still running...
> 
> My preference would be to introduce some new header for this, which would
> say define get_environ inline function, and define it as
> static inline char **
> get_environ (void)
> {
>  extern char **environ;
>  return environ;
> }
> in libgomp/config/posix/env.h
> and to that
> #include <crt_externs.h>
> 
> static inline char **
> get_environ (void)
> {
>  return *_NSGetEnviron ();
> }
> in libgomp/config/darwin/env.h
> That way it is easier to override it for other platforms if needed.

We already have such a header …
include/environ.h

Iain



Re: [PATCH] OpenMP, libgomp: Environment variable syntax extension.

2022-09-09 Thread Jakub Jelinek via Gcc-patches
On Fri, Sep 09, 2022 at 11:13:52PM +0100, Iain Sandoe wrote:
> We already have such a header …
> include/environ.h

Ah, ok, then please just use it.  Seems libgomp Makefile.am
already includes -I$(top_srcdir)/../include

So just include that and remove the extern char **environ;
from the constructor.

Jakub



Re: [PATCH v4 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-09-09 Thread Max Filippov via Gcc-patches
On Thu, Sep 8, 2022 at 2:38 PM Takayuki 'January June' Suwa wrote:
>
> Changes from v3:
>   (xtensa_expand_prologue): Changed to exclude debug insns from DF use
>   chain analysis.
>
> ---
>
> In the example below, 'x' is once placed on the stack frame and then read
> into registers as the argument value of bar():
>
> /* example */
> struct foo {
>   int a, b;
> };
> extern struct foo bar(struct foo);
> struct foo test(void) {
>   struct foo x = { 0, 1 };
>   return bar(x);
> }
>
> Thanks to the dead store elimination, the initialization of 'x' turns into
> merely loading the immediates to registers, but corresponding stack frame
> growth is not rolled back.  As a result:
>
> ;; prereq: the CALL0 ABI
> ;; before
> test:
>         addi    sp, sp, -16     // unused stack frame allocation/freeing
>         movi.n  a2, 0
>         movi.n  a3, 1
>         addi    sp, sp, 16      // because no instructions that refer to
>         j.l     bar, a9         // the stack pointer between the two
>
> This patch eliminates such unused stack frame allocation/freeing:
>
> ;; after
> test:
>         movi.n  a2, 0
>         movi.n  a3, 1
>         j.l     bar, a9
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.cc (machine_function): New boolean member as
> a flag that controls whether to emit the insns for stack pointer
> adjustment inside of the pro/epilogue.
> (xtensa_emit_adjust_stack_ptr): New function to share the common
> codes and to emit insns if not inhibited.
> (xtensa_expand_epilogue): Change to use the function mentioned
> above when using the CALL0 ABI.
> (xtensa_expand_prologue): Ditto.
> And also change to set the inhibit flag used by
> xtensa_emit_adjust_stack_ptr() to true if the stack pointer is only
> used for its own adjustment.
> ---
>  gcc/config/xtensa/xtensa.cc | 164 ++--
>  1 file changed, 80 insertions(+), 84 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

-- 
Thanks.
-- Max


Re: [PATCH 2/2] xtensa: Make complex hard register clobber elimination more robust and accurate

2022-09-09 Thread Max Filippov via Gcc-patches
On Wed, Aug 31, 2022 at 10:50 PM Takayuki 'January June' Suwa wrote:
>
> This patch eliminates all clobbers for complex hard registers that will
> be overwritten entirely afterwards (supersedence of
> 3867d414bd7d9e5b6fb2a51b1fb3d9e9e1eae9).
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md: Rewrite the split pattern that performs
> the abovementioned process so that insns that overwrite clobbered
> register no longer need to be contiguous.
> (DSC): Remove as no longer needed.
> ---
>  gcc/config/xtensa/xtensa.md | 67 +
>  1 file changed, 45 insertions(+), 22 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

-- 
Thanks.
-- Max


Re: [PATCH v2] Handle OPAQUE_TYPE specially in verify_type [PR106833]

2022-09-09 Thread Peter Bergner via Gcc-patches
On 9/9/22 8:27 AM, Kewen.Lin wrote:
> __attribute__((noipa))
> int foo(c){
>   return 0;
> }
> 
> int main ()
> {
>   const __vector_quad c;
>   int r = foo(c);
>   return r;
> }
> 
> Checking during LTO WPA, verify_type only gets type "const
> __vector_quad", no type "__vector_quad".
> 
> btw, it needs some hacking in rs6000_function_arg to make this
> opaque type valid for function arg.

We don't allow (at this time) __vector_pair or __vector_quad to be
used as actual arguments to non-builtin functions.  We do allow
pointers to those types though.

Peter



Re: [PATCH v2] Handle OPAQUE_TYPE specially in verify_type [PR106833]

2022-09-09 Thread Segher Boessenkool
On Fri, Sep 09, 2022 at 07:56:42PM -0500, Peter Bergner wrote:
> On 9/9/22 8:27 AM, Kewen.Lin wrote:
> > btw, it needs some hacking in rs6000_function_arg to make this
> > opaque type valid for function arg.
> 
> We don't allow (at this time) __vector_pair or __vector_quad to be
> used as actual arguments to non-builtin functions.  We do allow
> pointers to those types though.

It would be nice to support that, if it isn't too hard.  It won't be
digging us into a hole, experience has taught us :-)


Segher


Re: [PATCH v2] Handle OPAQUE_TYPE specially in verify_type [PR106833]

2022-09-09 Thread Peter Bergner via Gcc-patches
On 9/9/22 8:47 PM, Segher Boessenkool wrote:
> On Fri, Sep 09, 2022 at 07:56:42PM -0500, Peter Bergner wrote:
>> On 9/9/22 8:27 AM, Kewen.Lin wrote:
>>> btw, it needs some hacking in rs6000_function_arg to make this
>>> opaque type valid for function arg.
>>
>> We don't allow (at this time) __vector_pair or __vector_quad to be
>> used as actual arguments to non-builtin functions.  We do allow
>> pointers to those types though.
> 
> It would be nice to support that, if it isn't too hard.  It won't be
> digging us into a hole, experience has taught us :-)

Sure, but we didn't need it at the time (and still don't) and if we do
add support for that, we'll have to update the ABIs to describe how
they are passed and returned and that is no small feat in itself.
It's just a fair amount of work when no one is actually asking for
that support and we have a lot of other things to work on.

Peter





Re: [PATCH Rust front-end v2 02/37] gccrs: Add nessecary hooks for a Rust front-end testsuite

2022-09-09 Thread Mike Stump via Gcc-patches
Ok.

> On Aug 24, 2022, at 4:59 AM, herron.phi...@googlemail.com wrote:
> 
> From: Philip Herron 
> 
> This copies over code from other front-end testsuites to enable testing
> for the Rust front-end specifically.
> 
> Co-authored-by: Marc Poulhiès 
> Co-authored-by: Thomas Schwinge 
> ---
> gcc/testsuite/lib/rust-dg.exp |  49 +
> gcc/testsuite/lib/rust.exp| 186 ++


Re: [PATCH] testsuite: btf: Fix btf-datasec-1.c for RISC-V

2022-09-09 Thread Mike Stump via Gcc-patches
On May 10, 2022, at 6:31 PM, Kito Cheng via Gcc-patches wrote:
> 
> LGTM, that only adds a new option for RISC-V and won't affect any
> other targets, so I assume I can approve that.

Yes.  Usual and customary for ports.

Re: [PATCH v3] Simplify memchr with small constant strings

2022-09-09 Thread Jan-Benedict Glaw
Hi!

On Wed, 2022-09-07 14:00:25 +0200, Richard Biener wrote:
> On Wed, Sep 7, 2022 at 12:58 PM Jan-Benedict Glaw wrote:
> > ../../gcc/gcc/tree-ssa-forwprop.cc:1258:42: error: array subscript 1 is outside array bounds of 'tree_node* [1]' [-Werror=array-bounds]
> >  1258 | op[i - 1] = fold_convert_loc (loc, boolean_type_node,
> >       | ~^~~~
> >  1259 |   fold_build2_loc (loc,
> >       |   ~
> >  1260 | BIT_IOR_EXPR,
> >       | ~
> >  1261 | boolean_type_node,
> >       | ~~
> >  1262 | op[i - 1],
> >       | ~~
> >  1263 | op[i]));
> >       | ~~~
> > In file included from ../../gcc/gcc/system.h:707,
> >                  from ../../gcc/gcc/tree-ssa-forwprop.cc:21:
> > ../../gcc/gcc/../include/libiberty.h:733:36: note: at offset 8 into object of size [0, 8] allocated by '__builtin_alloca'
> >   733 | # define alloca(x) __builtin_alloca(x)
> >       |^~~
> > ../../gcc/gcc/../include/libiberty.h:365:40: note: in expansion of macro 'alloca'
> >   365 | #define XALLOCAVEC(T, N)((T *) alloca (sizeof (T) * (N)))
> >       |^~
> > ../../gcc/gcc/tree-ssa-forwprop.cc:1250:22: note: in expansion of macro 'XALLOCAVEC'
> >  1250 |   tree *op = XALLOCAVEC (tree, isize);
> >       |  ^~
> > cc1plus: all warnings being treated as errors
> > make[1]: *** [Makefile:1146: tree-ssa-forwprop.o] Error 1
> > make[1]: Leaving directory '/var/lib/laminar/run/gcc-pru-elf/1/toolchain-build/gcc'
> > make: *** [Makefile:4583: all-gcc] Error 2
> can you open a bugreport please?

Just opened (after re-verification) as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106900

Thanks,
  Jan-Benedict



[r13-2545 Regression] FAIL: libgomp.fortran/display-affinity-1.f90 -O execution test on Linux/x86_64

2022-09-09 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

9f2fca56593a2b87026b399d26adcdca90705685 is the first bad commit
commit 9f2fca56593a2b87026b399d26adcdca90705685
Author: Marcel Vollweiler 
Date:   Thu Sep 8 10:01:33 2022 -0700

OpenMP, libgomp: Environment variable syntax extension

caused

FAIL: libgomp.c/affinity-2.c execution test
FAIL: libgomp.fortran/affinity1.f90   -O2  execution test
FAIL: libgomp.fortran/affinity2.f90   -O2  execution test
FAIL: libgomp.fortran/display-affinity-1.f90   -O  execution test

with GCC configured with

../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-2545/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="c.exp=libgomp.c/affinity-2.c --target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="c.exp=libgomp.c/affinity-2.c --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="c++.exp=libgomp.c-c++-common/display-affinity-1.c --target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="c++.exp=libgomp.c-c++-common/display-affinity-1.c --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="fortran.exp=libgomp.fortran/affinity1.f90 --target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="fortran.exp=libgomp.fortran/affinity1.f90 --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="fortran.exp=libgomp.fortran/affinity2.f90 --target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="fortran.exp=libgomp.fortran/affinity2.f90 --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="fortran.exp=libgomp.fortran/display-affinity-1.f90 --target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="fortran.exp=libgomp.fortran/display-affinity-1.f90 --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at haochen dot jiang at intel.com)


[r13-2548 Regression] FAIL: 29_atomics/atomic_ref/compare_exchange_padding.cc execution test on Linux/x86_64

2022-09-09 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

157236dbd621644b3cec50b6cf38811959f3e78c is the first bad commit
commit 157236dbd621644b3cec50b6cf38811959f3e78c
Author: Thomas Rodgers 
Date:   Thu Aug 25 12:11:40 2022 +0200

libstdc++: Clear padding bits in atomic compare_exchange

caused

FAIL: 29_atomics/atomic_ref/compare_exchange_padding.cc execution test

with GCC configured with

../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-2548/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="conformance.exp=29_atomics/atomic_ref/compare_exchange_padding.cc --target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="conformance.exp=29_atomics/atomic_ref/compare_exchange_padding.cc --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at haochen dot jiang at intel.com)