[PATCH (pushed)] libgccjit: silent 2 Clang warnings

2022-12-21 Thread Martin Liška
This patch silences the following 2 warnings:

jit/jit-playback.h:785:16: warning: private field 'm_source_file' is not used [-Wunused-private-field]
jit/jit-playback.h:804:16: warning: private field 'm_line' is not used [-Wunused-private-field]

gcc/jit/ChangeLog:

* jit-playback.h: Use unused attribute.
---
 gcc/jit/jit-playback.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
index 214f399f45c..43e92d67d74 100644
--- a/gcc/jit/jit-playback.h
+++ b/gcc/jit/jit-playback.h
@@ -782,7 +782,7 @@ public:
   vec m_locations;
 
 private:
-  source_file *m_source_file;
+  source_file *m_source_file ATTRIBUTE_UNUSED;
   int m_line_num;
 };
 
@@ -801,7 +801,7 @@ public:
 
 private:
   recording::location *m_recording_loc;
-  source_line *m_line;
+  source_line *m_line ATTRIBUTE_UNUSED;
   int m_column_num;
 };
 
-- 
2.39.0
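
A minimal sketch of the technique used above, not the libgccjit code itself. It assumes the definition of ATTRIBUTE_UNUSED from GCC's include/ansidecl.h, which expands to __attribute__ ((__unused__)); the class line_record and its members are hypothetical stand-ins for the playback classes touched by the patch.

```cpp
// Local stand-in for the macro provided by GCC's include/ansidecl.h.
#ifndef ATTRIBUTE_UNUSED
#define ATTRIBUTE_UNUSED __attribute__ ((__unused__))
#endif

// Hypothetical class: m_owner is stored but never read, which is the
// situation Clang reports with -Wunused-private-field.
class line_record
{
public:
  constexpr explicit line_record (int line_num)
    : m_owner (nullptr), m_line_num (line_num)
  {}

  constexpr int line_num () const { return m_line_num; }

private:
  const void *m_owner ATTRIBUTE_UNUSED;  // annotated, so no warning is expected
  int m_line_num;
};
```

The annotation keeps the field and the object layout unchanged, which is the less invasive choice when the field may become useful later; removing the field outright is the alternative when it is known to be dead.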



[PATCH] go: fix clang warnings

2022-12-21 Thread Martin Liška
The patch fixes the following Clang warnings:

gcc/go/gofrontend/escape.cc:1290:17: warning: private field 'fn_' is not used [-Wunused-private-field]
gcc/go/gofrontend/escape.cc:3478:19: warning: private field 'context_' is not used [-Wunused-private-field]
gcc/go/gofrontend/lex.h:564:15: warning: private field 'input_file_name_' is not used [-Wunused-private-field]
gcc/go/gofrontend/types.cc:5788:20: warning: private field 'call_' is not used [-Wunused-private-field]
gcc/go/gofrontend/wb.cc:206:9: warning: private field 'gogo_' is not used [-Wunused-private-field]

Ready for master?
Thanks,
Martin

gcc/go/ChangeLog:

* gofrontend/escape.cc (Gogo::analyze_escape): Remove context
arg.
(Gogo::assign_connectivity): Likewise.
(class Escape_analysis_tag): Remove context ctor argument.
(Gogo::tag_function): Likewise.
* gofrontend/expressions.cc (Call_expression::do_type):
Remove extra arg.
* gofrontend/gogo.h (class Gogo): Remove context arg.
* gofrontend/lex.h (class Lex): Mark input_file_name_
with ATTRIBUTE_UNUSED.
* gofrontend/types.cc (class Call_multiple_result_type):
Remove call_ argument.
(Type::make_call_multiple_result_type): Likewise.
* gofrontend/types.h (class Type): Remove argument.
* gofrontend/wb.cc (class Check_escape): Remove gogo_
field.
(Gogo::add_write_barriers): Remove extra arg.
---
 gcc/go/gofrontend/escape.cc  | 20 +++-
 gcc/go/gofrontend/expressions.cc |  2 +-
 gcc/go/gofrontend/gogo.h |  2 +-
 gcc/go/gofrontend/lex.h  |  2 +-
 gcc/go/gofrontend/types.cc   | 13 -
 gcc/go/gofrontend/types.h|  2 +-
 gcc/go/gofrontend/wb.cc  | 10 +++---
 7 files changed, 18 insertions(+), 33 deletions(-)

diff --git a/gcc/go/gofrontend/escape.cc b/gcc/go/gofrontend/escape.cc
index 6da29edc419..ccbc7910743 100644
--- a/gcc/go/gofrontend/escape.cc
+++ b/gcc/go/gofrontend/escape.cc
@@ -990,7 +990,7 @@ Gogo::analyze_escape()
   for (std::vector::iterator fn = stack.begin();
fn != stack.end();
++fn)
-this->tag_function(context, *fn);
+   this->tag_function(*fn);
 
   if (this->debug_escape_level() != 0)
{
@@ -1246,10 +1246,10 @@ Escape_analysis_loop::statement(Block*, size_t*, Statement* s)
 class Escape_analysis_assign : public Traverse
 {
 public:
-  Escape_analysis_assign(Escape_context* context, Named_object* fn)
+  Escape_analysis_assign(Escape_context* context)
 : Traverse(traverse_statements
   | traverse_expressions),
-  context_(context), fn_(fn)
+  context_(context)
   { }
 
   // Model statements within a function as assignments and flows between nodes.
@@ -1286,8 +1286,6 @@ public:
 private:
   // The escape context for this set of functions.
   Escape_context* context_;
-  // The current function being analyzed.
-  Named_object* fn_;
 };
 
 // Helper function to detect self assignment like the following.
@@ -2899,7 +2897,7 @@ Gogo::assign_connectivity(Escape_context* context, Named_object* fn)
   int save_depth = context->loop_depth();
   context->set_loop_depth(1);
 
-  Escape_analysis_assign ea(context, fn);
+  Escape_analysis_assign ea(context);
   Function::Results* res = fn->func_value()->result_variables();
   if (res != NULL)
 {
@@ -3465,17 +3463,13 @@ Gogo::propagate_escape(Escape_context* context, Node* dst)
 class Escape_analysis_tag
 {
  public:
-  Escape_analysis_tag(Escape_context* context)
-: context_(context)
+  Escape_analysis_tag()
   { }
 
   // Add notes to the function's type about the escape information of its
   // input parameters.
   void
   tag(Named_object* fn);
-
- private:
-  Escape_context* context_;
 };
 
 void
@@ -3580,9 +3574,9 @@ Escape_analysis_tag::tag(Named_object* fn)
 // retain analysis results across imports.
 
 void
-Gogo::tag_function(Escape_context* context, Named_object* fn)
+Gogo::tag_function(Named_object* fn)
 {
-  Escape_analysis_tag eat(context);
+  Escape_analysis_tag eat;
   eat.tag(fn);
 }
 
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 71838b14629..53901306ef7 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -12500,7 +12500,7 @@ Call_expression::do_type()
   else if (results->size() == 1)
 ret = results->begin()->type();
   else
-ret = Type::make_call_multiple_result_type(this);
+ret = Type::make_call_multiple_result_type();
 
   this->type_ = ret;
 
diff --git a/gcc/go/gofrontend/gogo.h b/gcc/go/gofrontend/gogo.h
index 433fdaebb66..c08a16b74c2 100644
--- a/gcc/go/gofrontend/gogo.h
+++ b/gcc/go/gofrontend/gogo.h
@@ -879,7 +879,7 @@ class Gogo
   // Add notes about the escape level of a function's input and output
   // parameters for exporting and importing top level functions.
   void
-  tag_function(Escape_context*, Named_object*);
+  tag_function(Named_object*

[committed] modula2: Fix lto profiledbootstrap on powerpc64le-linux and s390x-linux [PR108153]

2022-12-21 Thread Jakub Jelinek via Gcc-patches
Hi!

Lto profiledbootstrap was failing for me on {powerpc64le,s390x}-linux with
modula 2 enabled, with:
cc1gm2: internal compiler error: the location value is corrupt
0x11a3d2d m2assert_AssertLocation(unsigned int)
../../gcc/m2/gm2-gcc/m2assert.cc:40
0x11a3d2d m2statement_BuildAssignmentTree
../../gcc/m2/gm2-gcc/m2statement.cc:177
ICE.  The problem was that the caller (m2assert_AssertLocation) used the
location_t M2Options_OverrideLocation (location_t);
prototype with the libcpp/line-map.h
typedef unsigned int location_t;
typedef, but the callee defined in Modula 2 was using:
TYPE
   location_t = INTEGER ;
and
PROCEDURE OverrideLocation (location: location_t) : location_t ;
Now, on powerpc64le-linux unsigned int is returned and passed zero-extended
to 64 bits, while signed int is returned and passed sign-extended to 64 bits.
Modula 2 INTEGER is a signed 32-bit type, so when the caller then compared
M2Options_OverrideLocation (location) != location
and powerpc64le-linux performed the comparison as a 64-bit compare, there
was a mismatch for location_t of 0x807 or others with the MSB set.

Fixed by making Modula 2 location_t a CARDINAL, which is a 32-bit unsigned type.
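
The mismatch described above can be sketched in a few lines of C++. This is only an illustration of the two widening conventions, not code from GCC: the helper names are hypothetical, and the sample values used in the checks are chosen for the example rather than taken from the report. The cast to a signed 32-bit type relies on the usual two's-complement behaviour (implementation-defined before C++20).

```cpp
#include <cstdint>

// Hypothetical helpers modelling how a 32-bit value is widened into a
// 64-bit register under the two conventions.

// C side: location_t is unsigned int, so the value is zero-extended.
constexpr std::uint64_t
widen_as_unsigned (std::uint32_t v)
{
  return static_cast<std::uint64_t> (v);
}

// Modula 2 side before the fix: INTEGER is signed, so the same bit
// pattern is sign-extended.
constexpr std::uint64_t
widen_as_signed (std::uint32_t v)
{
  return static_cast<std::uint64_t> (
    static_cast<std::int64_t> (static_cast<std::int32_t> (v)));
}

// True when a full 64-bit register compare would see equal values.
constexpr bool
registers_agree (std::uint32_t v)
{
  return widen_as_unsigned (v) == widen_as_signed (v);
}
```

Values with the top bit clear widen identically under both conventions, so only locations with the MSB set expose the disagreement; once both sides treat the type as unsigned, the two widenings coincide for every value.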

Bootstrapped/regtested normally on x86_64-linux and i686-linux, with
bootstrap-lto profiledbootstrap on powerpc64le-linux and aarch64-linux so
far (and in regtesting on x86_64-linux, i686-linux and s390x-linux),
approved in the PR by Gaius, committed to trunk.

2022-12-21  Jakub Jelinek  

PR modula2/108153
* gm2-gcc/m2linemap.def (location_t): Use CARDINAL instead of INTEGER.

--- gcc/m2/gm2-gcc/m2linemap.def.jj 2022-12-19 14:59:50.169762747 +0100
+++ gcc/m2/gm2-gcc/m2linemap.def    2022-12-20 16:36:18.321555969 +0100
@@ -30,7 +30,7 @@ EXPORT QUALIFIED StartFile, EndFile, Sta
  WarningAtf, NoteAtf, internal_error, location_t ;
 
 TYPE
-   location_t = INTEGER ;
+   location_t = CARDINAL ;
 
 
 PROCEDURE StartFile (filename: ADDRESS; linebegin: CARDINAL) ;

Jakub



[committed] openmp: Don't try to destruct DECL_OMP_PRIVATIZED_MEMBER vars [PR108180]

2022-12-21 Thread Jakub Jelinek via Gcc-patches
Hi!

DECL_OMP_PRIVATIZED_MEMBER vars are artificial vars with DECL_VALUE_EXPR
of this->field used just during gimplification and omp lowering/expansion
to privatize individual fields in methods when needed.
As the following testcase shows, when not in templates, they were handled
right, but in templates we actually called cp_finish_decl on them and
that can result in their destruction, which is obviously undesirable;
we should only destruct the privatized copies of them created in omp
lowering.

Fixed thusly.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
so far.

2022-12-21  Jakub Jelinek  

PR c++/108180
* pt.cc (tsubst_expr): Don't call cp_finish_decl on
DECL_OMP_PRIVATIZED_MEMBER vars.

* testsuite/libgomp.c++/pr108180.C: New test.

--- gcc/cp/pt.cc.jj 2022-12-19 11:09:13.499170642 +0100
+++ gcc/cp/pt.cc    2022-12-20 12:13:59.406097521 +0100
@@ -18895,6 +18895,11 @@ tsubst_expr (tree t, tree args, tsubst_f
maybe_push_decl (decl);
 
if (VAR_P (decl)
+   && DECL_LANG_SPECIFIC (decl)
+   && DECL_OMP_PRIVATIZED_MEMBER (decl))
+ break;
+
+   if (VAR_P (decl)
&& DECL_DECOMPOSITION_P (decl)
&& TREE_TYPE (pattern_decl) != error_mark_node)
  ndecl = tsubst_decomp_names (decl, pattern_decl, args,
--- libgomp/testsuite/libgomp.c++/pr108180.C.jj 2022-12-20 12:54:21.077793817 +0100
+++ libgomp/testsuite/libgomp.c++/pr108180.C    2022-12-20 12:53:04.740905961 +0100
@@ -0,0 +1,55 @@
+// PR c++/108180
+// { dg-do run }
+
+struct A {
+  A () { ++a; }
+  A (A &&) = delete;
+  A (const A &) { ++a; }
+  A &operator= (const A &) = delete;
+  A &operator= (A &&) = delete;
+  ~A () { --a; }
+  static int a;
+};
+int A::a = 0;
+
+struct B {
+  A a;
+  template 
+  int
+  foo ()
+  {
+int res = 0;
+#pragma omp parallel for if(false) firstprivate(a)
+for (int i = 0; i < 64; ++i)
+  res += i;
+return res;
+  }
+  int
+  bar ()
+  {
+int res = 0;
+#pragma omp parallel for if(false) firstprivate(a)
+for (int i = 0; i < 64; ++i)
+  res += i;
+return res;
+  }
+};
+
+int
+main ()
+{
+  {
+B b;
+if (b.foo<0> () != 2016)
+  __builtin_abort ();
+  }
+  if (A::a != 0)
+__builtin_abort ();
+  {
+B b;
+if (b.bar () != 2016)
+  __builtin_abort ();
+  }
+  if (A::a != 0)
+__builtin_abort ();
+}

Jakub



[RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-21 Thread Kewen.Lin via Gcc-patches
Hi,

This is a different attempt from Mike's approach[1][2] to fix the
issue in PR107299.  With option -mabi=ieeelongdouble specified,
type long double (and __float128) and _Float128 have the same
mode TFmode, but they have different type precisions, which causes
the assertion in function fold_using_range::fold_stmt to fail.
IMHO, it's not sensible for two types to have the same mode but
different precisions.  By tracing where we make type _Float128
have a precision different from that of its mode, I found
it comes from a workaround for rs6000 KFmode.  Being curious why
we need this workaround, I disabled it and tested with the
combinations below; all bootstrapped and no regression failures
were observed.

  - BE Power8 with --with-long-double-format=ibm
  - LE Power9 with --with-long-double-format=ibm
  - LE Power10 with --with-long-double-format=ibm
  - x86_64-redhat-linux
  - aarch64-linux-gnu

For LE Power10 with --with-long-double-format=ieee, since it
suffered from the bootstrapping issue, I compared the regression
testing results with those from Mike's approach.  The
comparison showed that with this approach several more
test cases pass and none regress.  They are:

1) libstdc++:

  FAIL->PASS: std/format/arguments/args.cc (test for excess errors)
  FAIL->PASS: std/format/error.cc (test for excess errors)
  FAIL->PASS: std/format/formatter/requirements.cc (test for excess errors)
  FAIL->PASS: std/format/functions/format.cc (test for excess errors)
  FAIL->PASS: std/format/functions/format_to_n.cc (test for excess errors)
  FAIL->PASS: std/format/functions/size.cc (test for excess errors)
  FAIL->PASS: std/format/functions/vformat_to.cc (test for excess errors)
  FAIL->PASS: std/format/parse_ctx.cc (test for excess errors)
  FAIL->PASS: std/format/string.cc (test for excess errors)
  FAIL->PASS: std/format/string_neg.cc (test for excess errors)

  These all failed for the same reason: one static assertion fails in
  header file format (_Type is __ieee128):

static_assert(__format::__formattable_with<_Type, _Context>);

2) gfortran:

  NA->PASS: gfortran.dg/c-interop/typecodes-array-float128.f90
  NA->PASS: gfortran.dg/c-interop/typecodes-scalar-float128.f90
  NA->PASS: gfortran.dg/PR100914.f90

  The effective target `fortran_real_c_float128` check was failing:
  the source below failed to compile with the error message
  "Error: Kind -4 not supported for type REAL".

use iso_c_binding
real(kind=c_float128) :: x
x = cos (x)
end

3) g++:

  FAIL->PASS: g++.dg/cpp23/ext-floating1.C  -std=gnu++23

  Due to the static assertion failing:

static_assert (is_same::value);

* compatible with long double

This approach keeps type long double compatible with _Float128
(at -mabi=ieeelongdouble) as before, so for the simple case
like:

  _Float128 foo (long double t) { return t; }

there is no conversion.  See the difference in the optimized
dump:

  with the contrasting approach (Mike's):

  _Float128 foo (long double a)
  {
    _Float128 _2;

     [local count: 1073741824]:
    _2 = (_Float128) a_1(D);
    return _2;
  }

  with this:

  _Float128 foo (long double a)
  {
     [local count: 1073741824]:
    return a_1(D);
  }

IMHO, it's still good to keep _Float128 and __float128
compatible with long double, to get rid of those useless
type conversions.

Besides, this approach still treats the TFmode attribute type
as type long double, while the contrasting approach treats the
TFmode attribute type as type _Float128, whose corresponding
mode isn't TFmode; that inconsistency does not seem good.

As above, I wonder if we can consider this approach, which
makes type _Float128 have the same precision as MODE_PRECISION
of its mode and keeps the previous implementation that makes type
long double compatible with _Float128.  Since the REAL_MODE_FORMAT
of the mode still holds the information, even if some place that
isn't covered by the above testing needs the actual precision, we
can still retrieve it from there.
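
For reference, the quantity computed by the workaround that the patch removes (fmt->p + ceil_log2 (fmt->emax - fmt->emin)) can be worked through in a standalone sketch. Everything here is illustrative: format_sketch and ceil_log2_sketch are hypothetical stand-ins for GCC's real_format and ceil_log2, and the binary128 and binary64 parameters are assumed values, chosen to satisfy the fmt->emin + fmt->emax == 3 assertion in the removed code.

```cpp
// Smallest k such that 2^k >= x; a stand-in for GCC's ceil_log2.
constexpr int
ceil_log2_sketch (unsigned long x)
{
  int k = 0;
  for (unsigned long p = 1; p < x; p <<= 1)
    ++k;
  return k;
}

// Only the fields used by the formula; hypothetical stand-in for real_format.
struct format_sketch
{
  int p;     // significand precision in bits
  int emin;  // minimum exponent
  int emax;  // maximum exponent
};

// Significand bits plus the bits needed to encode the exponent range.
constexpr int
min_precision (format_sketch fmt)
{
  return fmt.p
	 + ceil_log2_sketch (static_cast<unsigned long> (fmt.emax - fmt.emin));
}

// Assumed parameters (emin + emax == 3 for both).
constexpr format_sketch binary128 = { 113, -16381, 16384 };
constexpr format_sketch binary64 = { 53, -1021, 1024 };
```

With these assumed parameters the exponent range of binary128 is 32765, which needs 15 bits, so the formula yields 113 + 15 = 128; the same formula yields 53 + 11 = 64 for binary64. This is how the format information alone recovers the full storage width even when the mode's precision is recorded as 113.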

Any thoughts?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604834.html
[2] v3 https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608513.html

BR,
Kewen
-
PR target/107299

gcc/ChangeLog:

* tree.cc (build_common_tree_nodes): Remove workaround for rs6000
KFmode.
---
 gcc/tree.cc | 9 -
 1 file changed, 9 deletions(-)

diff --git a/gcc/tree.cc b/gcc/tree.cc
index 254b2373dcf..cbae14d095e 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char)
   if (!targetm.floatn_mode (n, extended).exists (&mode))
continue;
   int precision = GET_MODE_PRECISION (mode);
-  /* Work around the rs6000 KFmode having precision 113 not
-128.  */
-  const struct real_format *fmt = REAL_MODE_FORMAT (mode);
-  gcc_assert (fmt->b == 2 && fmt->emin + fmt->emax == 3);
-  int min_precision = fmt->p + ceil_log2 (fmt->emax - fmt->emin);
-  if (!extended)
-

Re: [PATCH] aarch64: Fix plugin header install

2022-12-21 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> Hi!
>
> The r13-2943-g11a113d501ff64 made aarch64.h include
> aarch64-option-extensions.def, but that file isn't installed
> for building plugins.
>
> The following patch should fix that, ok for trunk if it
> passes bootstrap/regtest + building plugin against it?
>
> 2022-12-20  Jakub Jelinek  
>
>   * config/aarch64/t-aarch64 (OPTIONS_H_EXTRA): Add
>   aarch64-option-extensions.def.
>
> --- gcc/config/aarch64/t-aarch64.jj   2022-04-04 13:55:46.001615509 +0200
> +++ gcc/config/aarch64/t-aarch64  2022-12-20 11:31:03.245651809 +0100
> @@ -22,7 +22,8 @@ TM_H += $(srcdir)/config/aarch64/aarch64
>  OPTIONS_H_EXTRA += $(srcdir)/config/aarch64/aarch64-cores.def \
>  $(srcdir)/config/aarch64/aarch64-arches.def \
>  $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
> -$(srcdir)/config/aarch64/aarch64-tuning-flags.def
> +$(srcdir)/config/aarch64/aarch64-tuning-flags.def \
> +$(srcdir)/config/aarch64/aarch64-option-extensions.def

Should this (and aarch64-fusion-pairs.def and aarch64-tuning-flags.def)
be in TM_H instead?  The first two OPTIONS_H_EXTRA entries seem to be
for aarch64-opt.h (included via aarch64.opt).

I guess TM_H should also have aarch64-arches.def, since it's included
for aarch64_feature.

OK with that change if it works/makes sense.

Thanks,
Richard

>  
>  $(srcdir)/config/aarch64/aarch64-tune.md: s-aarch64-tune-md; @true
>  s-aarch64-tune-md: $(srcdir)/config/aarch64/gentune.sh \
>
>   Jakub


Re: [C PATCH] remove same_translation_unit_p

2022-12-21 Thread Martin Uecker via Gcc-patches
On Tuesday, 20.12.2022 at 20:04 +, Joseph Myers wrote:
> On Tue, 20 Dec 2022, Martin Uecker via Gcc-patches wrote:
> 
> > Here is a patch to remove the unused function
> > same_translation_unit_p and related code. The
> > code to check for structural equivalency of
> > structs / unions is kept (with some fixes)
> > because it will be needed for C2X.
> 
> Could you repost this in development stage 1, since this is not a bug
> fix?

Sure, I will repost for stage 1.

I plan to send some other similar patches with
no functional change (with no expectation that they
get reviewed now) because I would like to get some early
feedback from Martin Liška on some UBsan changes
which are layered on top of them.


Martin






Re: [PATCH] aarch64: Fix plugin header install

2022-12-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Dec 21, 2022 at 09:56:33AM +, Richard Sandiford wrote:
> Jakub Jelinek  writes:
> > The r13-2943-g11a113d501ff64 made aarch64.h include
> > aarch64-option-extensions.def, but that file isn't installed
> > for building plugins.
> >
> > The following patch should fix that, ok for trunk if it
> > passes bootstrap/regtest + building plugin against it?
> >
> > 2022-12-20  Jakub Jelinek  
> >
> > * config/aarch64/t-aarch64 (OPTIONS_H_EXTRA): Add
> > aarch64-option-extensions.def.
> >
> > --- gcc/config/aarch64/t-aarch64.jj 2022-04-04 13:55:46.001615509 +0200
> > +++ gcc/config/aarch64/t-aarch64  2022-12-20 11:31:03.245651809 +0100
> > @@ -22,7 +22,8 @@ TM_H += $(srcdir)/config/aarch64/aarch64
> >  OPTIONS_H_EXTRA += $(srcdir)/config/aarch64/aarch64-cores.def \
> >$(srcdir)/config/aarch64/aarch64-arches.def \
> >$(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
> > -  $(srcdir)/config/aarch64/aarch64-tuning-flags.def
> > +  $(srcdir)/config/aarch64/aarch64-tuning-flags.def \
> > +  $(srcdir)/config/aarch64/aarch64-option-extensions.def
> 
> Should this (and aarch64-fusion-pairs.def and aarch64-tuning-flags.def)
> be in TM_H instead?  The first two OPTIONS_H_EXTRA entries seem to be
> for aarch64-opt.h (included via aarch64.opt).
> 
> I guess TM_H should also have aarch64-arches.def, since it's included
> for aarch64_feature.
> 
> OK with that change if it works/makes sense.

gcc/Makefile.in has
TM_H  = $(GTM_H) insn-flags.h $(OPTIONS_H)
and
OPTIONS_H = options.h flag-types.h $(OPTIONS_H_EXTRA)
which means that adding something into TM_H when it is already in
OPTIONS_H_EXTRA is unnecessary.
It is true that aarch64-fusion-pairs.def (included by aarch64-protos.h)
and aarch64-tuning-flags.def (ditto) and aarch64-option-extensions.def
(included by aarch64.h) aren't needed for options.h, so I think the
right patch would be:

2022-12-21  Jakub Jelinek  

* config/aarch64/t-aarch64 (TM_H): Don't add aarch64-cores.def,
add aarch64-fusion-pairs.def, aarch64-tuning-flags.def and
aarch64-option-extensions.def.
(OPTIONS_H_EXTRA): Don't add aarch64-fusion-pairs.def nor
aarch64-tuning-flags.def.

--- gcc/config/aarch64/t-aarch64.jj 2022-12-21 09:03:13.146034669 +0100
+++ gcc/config/aarch64/t-aarch64    2022-12-21 11:06:20.966118401 +0100
@@ -18,11 +18,11 @@
 #  along with GCC; see the file COPYING3.  If not see
 #  .
 
-TM_H += $(srcdir)/config/aarch64/aarch64-cores.def
+TM_H += $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
+   $(srcdir)/config/aarch64/aarch64-tuning-flags.def \
+   $(srcdir)/config/aarch64/aarch64-option-extensions.def
 OPTIONS_H_EXTRA += $(srcdir)/config/aarch64/aarch64-cores.def \
-  $(srcdir)/config/aarch64/aarch64-arches.def \
-  $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
-  $(srcdir)/config/aarch64/aarch64-tuning-flags.def
+  $(srcdir)/config/aarch64/aarch64-arches.def
 
 $(srcdir)/config/aarch64/aarch64-tune.md: s-aarch64-tune-md; @true
 s-aarch64-tune-md: $(srcdir)/config/aarch64/gentune.sh \


Jakub



Re: [PATCH]AArch64 relax constraints on FP16 insn PR108172

2022-12-21 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
> Hi All,
>
> The move, load, load, store, dup, basically all the non arithmetic FP16
> instructions use baseline armv8-a HF support, and so do not require the
> Armv8.2-a extensions.  This relaxes the instructions.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   PR target/108172
>   * config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): Relax
>   TARGET_SIMD_F16INST to TARGET_SIMD.
>   * config/aarch64/aarch64.cc (aarch64_classify_vector_mode): Re-order.
>   * config/aarch64/iterators.md (VMOVE): Drop TARGET_SIMD_F16INST.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/108172
>   * gcc.target/aarch64/pr108172.c: New test.

OK, thanks.

I think we need better tests for this area though.  The VMOVE uses
include aarch64_dup_lane, which uses , which has no
definition for V2HF, so we get:

return "dup\t%0., %1.h[%2]";

The original patch defined Vtype to "2h", but using that here would
have generated an invalid instruction.

We also have unexpanded s in the pairwise operations:

"faddp\t%h0, %1.",
"fmaxp\t%h0, %1.",
"fminp\t%h0, %1.",
"fmaxnmp\t%h0, %1.",
"fminnmp\t%h0, %1.",

Would it be easy (using a combination of this patch and a follow-on patch)
to wind the V2HF support back to a state that makes sense on its own,
without the postponed pairwise support?  Or would it be simpler to
revert 2cba118e538ba0b7582af7f9fb5ba2dfbb772f8e for GCC 13 and revisit
this in GCC 14, alongside the original motivating use case?

Richard

> --- inline copy of patch -- 
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index a62b6deaf4a57e570074d7d894e6fac13779f6fb..8a9f655d547285ec7bdc173086308d7d44a8d482 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -164,7 +164,7 @@ (define_insn "*aarch64_simd_movv2hf"
>   "=w, m,  m,  w, ?r, ?w, ?r, w, w")
>   (match_operand:V2HF 1 "general_operand"
>   "m,  Dz, w,  w,  w,  r,  r, Dz, Dn"))]
> -  "TARGET_SIMD_F16INST
> +  "TARGET_SIMD
> && (register_operand (operands[0], V2HFmode)
> || aarch64_simd_reg_or_zero (operands[1], V2HFmode))"
> "@
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 73515c174fa4fe7830527e7eabd91c4648130ff4..d1c0476321b79bc6aded350d24ea5d556c796519 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -3616,6 +3616,8 @@ aarch64_classify_vector_mode (machine_mode mode)
>  case E_V4x2DFmode:
>return TARGET_FLOAT ? VEC_ADVSIMD | VEC_STRUCT : 0;
>  
> +/* 32-bit Advanced SIMD vectors.  */
> +case E_V2HFmode:
>  /* 64-bit Advanced SIMD vectors.  */
>  case E_V8QImode:
>  case E_V4HImode:
> @@ -3634,7 +3636,6 @@ aarch64_classify_vector_mode (machine_mode mode)
>  case E_V8BFmode:
>  case E_V4SFmode:
>  case E_V2DFmode:
> -case E_V2HFmode:
>return TARGET_FLOAT ? VEC_ADVSIMD : 0;
>  
>  default:
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index a521dbde1ec42c0c442a9ca3dd52c9727d116399..70742520984d30158e62a38c92abea82b2dac059 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -204,8 +204,7 @@ (define_mode_iterator VALL_F16 [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
>  ;; All Advanced SIMD modes suitable for moving, loading, and storing
>  ;; including V2HF
>  (define_mode_iterator VMOVE [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
> -  V4HF V8HF V4BF V8BF V2SF V4SF V2DF
> -  (V2HF "TARGET_SIMD_F16INST")])
> +  V2HF V4HF V8HF V4BF V8BF V2SF V4SF V2DF])
>  
>  
>  ;; The VALL_F16 modes except the 128-bit 2-element ones.
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr108172.c b/gcc/testsuite/gcc.target/aarch64/pr108172.c
> new file mode 100644
> index ..b29054fdb1d6e602755bc93089f1edec4eb53b8e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr108172.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O0" } */
> +
> +typedef _Float16 v4hf __attribute__ ((vector_size (8)));
> +typedef _Float16 v2hf __attribute__ ((vector_size (4)));
> +
> +v4hf
> +v4hf_abi_1 (v4hf a)
> +{
> +  return a;
> +}
> +
> +v4hf
> +v4hf_abi_3 (v4hf a, v4hf b, v4hf c)
> +{
> +  return c;
> +}
> +
> +v4hf
> +v4hf_abi_4 (v4hf a, v4hf b, v4hf c, v4hf d)
> +{
> +  return d;
> +}
> +
> +v2hf
> +v2hf_test (v2hf a, v2hf b, v2hf c, v2hf d)
> +{
> +  return b;
> +}
> +


Re: [PATCH] aarch64: Fix plugin header install

2022-12-21 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> On Wed, Dec 21, 2022 at 09:56:33AM +, Richard Sandiford wrote:
>> Jakub Jelinek  writes:
>> > The r13-2943-g11a113d501ff64 made aarch64.h include
>> > aarch64-option-extensions.def, but that file isn't installed
>> > for building plugins.
>> >
>> > The following patch should fix that, ok for trunk if it
>> > passes bootstrap/regtest + building plugin against it?
>> >
>> > 2022-12-20  Jakub Jelinek  
>> >
>> >* config/aarch64/t-aarch64 (OPTIONS_H_EXTRA): Add
>> >aarch64-option-extensions.def.
>> >
>> > --- gcc/config/aarch64/t-aarch64.jj  2022-04-04 13:55:46.001615509 +0200
>> > +++ gcc/config/aarch64/t-aarch64   2022-12-20 11:31:03.245651809 +0100
>> > @@ -22,7 +22,8 @@ TM_H += $(srcdir)/config/aarch64/aarch64
>> >  OPTIONS_H_EXTRA += $(srcdir)/config/aarch64/aarch64-cores.def \
>> >   $(srcdir)/config/aarch64/aarch64-arches.def \
>> >   $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
>> > - $(srcdir)/config/aarch64/aarch64-tuning-flags.def
>> > + $(srcdir)/config/aarch64/aarch64-tuning-flags.def \
>> > + $(srcdir)/config/aarch64/aarch64-option-extensions.def
>> 
>> Should this (and aarch64-fusion-pairs.def and aarch64-tuning-flags.def)
>> be in TM_H instead?  The first two OPTIONS_H_EXTRA entries seem to be
>> for aarch64-opt.h (included via aarch64.opt).
>> 
>> I guess TM_H should also have aarch64-arches.def, since it's included
>> for aarch64_feature.
>> 
>> OK with that change if it works/makes sense.
>
> gcc/Makefile.in has
> TM_H  = $(GTM_H) insn-flags.h $(OPTIONS_H)
> and
> OPTIONS_H = options.h flag-types.h $(OPTIONS_H_EXTRA)
> which means that adding something into TM_H when it is already in
> OPTIONS_H_EXTRA is unnecessary.
> It is true that aarch64-fusion-pairs.def (included by aarch64-protos.h)
> and aarch64-tuning-flags.def (ditto) and aarch64-option-extensions.def
> (included by aarch64.h) aren't needed for options.h, so I think the
> right patch would be:
>
> 2022-12-21  Jakub Jelinek  
>
>   * config/aarch64/t-aarch64 (TM_H): Don't add aarch64-cores.def,
>   add aarch64-fusion-pairs.def, aarch64-tuning-flags.def and
>   aarch64-option-extensions.def.
>   (OPTIONS_H_EXTRA): Don't add aarch64-fusion-pairs.def nor
>   aarch64-tuning-flags.def.

OK, thanks.

Richard

> --- gcc/config/aarch64/t-aarch64.jj   2022-12-21 09:03:13.146034669 +0100
> +++ gcc/config/aarch64/t-aarch64  2022-12-21 11:06:20.966118401 +0100
> @@ -18,11 +18,11 @@
>  #  along with GCC; see the file COPYING3.  If not see
>  #  .
>  
> -TM_H += $(srcdir)/config/aarch64/aarch64-cores.def
> +TM_H += $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
> + $(srcdir)/config/aarch64/aarch64-tuning-flags.def \
> + $(srcdir)/config/aarch64/aarch64-option-extensions.def
>  OPTIONS_H_EXTRA += $(srcdir)/config/aarch64/aarch64-cores.def \
> -$(srcdir)/config/aarch64/aarch64-arches.def \
> -$(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
> -$(srcdir)/config/aarch64/aarch64-tuning-flags.def
> +$(srcdir)/config/aarch64/aarch64-arches.def
>  
>  $(srcdir)/config/aarch64/aarch64-tune.md: s-aarch64-tune-md; @true
>  s-aarch64-tune-md: $(srcdir)/config/aarch64/gentune.sh \
>
>
>   Jakub


RE: [PATCH]AArch64 relax constraints on FP16 insn PR108172

2022-12-21 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: Richard Sandiford 
> Sent: Wednesday, December 21, 2022 10:31 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH]AArch64 relax constraints on FP16 insn PR108172
> 
> Tamar Christina  writes:
> > Hi All,
> >
> > The move, load, load, store, dup, basically all the non arithmetic
> > FP16 instructions use baseline armv8-a HF support, and so do not
> > require the Armv8.2-a extensions.  This relaxes the instructions.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR target/108172
> > * config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): Relax
> > TARGET_SIMD_F16INST to TARGET_SIMD.
> > * config/aarch64/aarch64.cc (aarch64_classify_vector_mode): Re-
> order.
> > * config/aarch64/iterators.md (VMOVE): Drop
> TARGET_SIMD_F16INST.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/108172
> > * gcc.target/aarch64/pr108172.c: New test.
> 
> OK, thanks.
> 
> I think we need better tests for this area though.  The VMOVE uses include
> aarch64_dup_lane, which uses , which has no definition for
> V2HF, so we get:
> 
> return "dup\t%0., %1.h[%2]";
> 
> The original patch defined Vtype to "2h", but using that here would have
> generated an invalid instruction.
> 
> We also have unexpanded s in the pairwise operations:
> 
> "faddp\t%h0, %1.",
> "fmaxp\t%h0, %1.",
> "fminp\t%h0, %1.",
> "fmaxnmp\t%h0, %1.",
> "fminnmp\t%h0, %1.",
> 

Right, the pairwise bit should have been reverted; the tests were all in that
patch.

> Would it be easy (using a combination of this patch and a follow-on patch) to
> wind the V2HF support back to a state that makes sense on its own, without
> the postponed pairwise support?  Or would it be simpler to revert
> 2cba118e538ba0b7582af7f9fb5ba2dfbb772f8e for GCC 13 and revisit this in
> GCC 14, alongside the original motivating use case?

The pairwise and the dup should be undone, as evidenced by the fact that they
don't trigger currently. The movs make sense on their own already and improve
various codegen around shorts.

I won't be revisiting pairwise operations again, so I'll remove the remnants
from this patch.

> 
> Richard
> 
> > --- inline copy of patch --
> > diff --git a/gcc/config/aarch64/aarch64-simd.md
> > b/gcc/config/aarch64/aarch64-simd.md
> > index
> >
> a62b6deaf4a57e570074d7d894e6fac13779f6fb..8a9f655d547285ec7bdc173086
> 30
> > 8d7d44a8d482 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -164,7 +164,7 @@ (define_insn "*aarch64_simd_movv2hf"
> > "=w, m,  m,  w, ?r, ?w, ?r, w, w")
> > (match_operand:V2HF 1 "general_operand"
> > "m,  Dz, w,  w,  w,  r,  r, Dz, Dn"))]
> > -  "TARGET_SIMD_F16INST
> > +  "TARGET_SIMD
> > && (register_operand (operands[0], V2HFmode)
> > || aarch64_simd_reg_or_zero (operands[1], V2HFmode))"
> > "@
> > diff --git a/gcc/config/aarch64/aarch64.cc
> > b/gcc/config/aarch64/aarch64.cc index
> >
> 73515c174fa4fe7830527e7eabd91c4648130ff4..d1c0476321b79bc6aded350d24
> ea
> > 5d556c796519 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -3616,6 +3616,8 @@ aarch64_classify_vector_mode (machine_mode
> mode)
> >  case E_V4x2DFmode:
> >return TARGET_FLOAT ? VEC_ADVSIMD | VEC_STRUCT : 0;
> >
> > +/* 32-bit Advanced SIMD vectors.  */
> > +case E_V2HFmode:
> >  /* 64-bit Advanced SIMD vectors.  */
> >  case E_V8QImode:
> >  case E_V4HImode:
> > @@ -3634,7 +3636,6 @@ aarch64_classify_vector_mode (machine_mode
> mode)
> >  case E_V8BFmode:
> >  case E_V4SFmode:
> >  case E_V2DFmode:
> > -case E_V2HFmode:
> >return TARGET_FLOAT ? VEC_ADVSIMD : 0;
> >
> >  default:
> > diff --git a/gcc/config/aarch64/iterators.md
> > b/gcc/config/aarch64/iterators.md index
> >
> a521dbde1ec42c0c442a9ca3dd52c9727d116399..70742520984d30158e62a38c9
> 2ab
> > ea82b2dac059 100644
> > --- a/gcc/config/aarch64/iterators.md
> > +++ b/gcc/config/aarch64/iterators.md
> > @@ -204,8 +204,7 @@ (define_mode_iterator VALL_F16 [V8QI V16QI V4HI
> > V8HI V2SI V4SI V2DI  ;; All Advanced SIMD modes suitable for moving,
> > loading, and storing  ;; including V2HF  (define_mode_iterator VMOVE
> > [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
> > -V4HF V8HF V4BF V8BF V2SF V4SF V2DF
> > -(V2HF "TARGET_SIMD_F16INST")])
> > +V2HF V4HF V8HF V4BF V8BF V2SF V4SF V2DF])
> >
> >
> >  ;; The VALL_F16 modes except the 128-bit 2-element ones.
> > diff --git a/gcc/testsuite/gcc.target/aarch64/pr108172.c
> > b/gcc/testsuite/gcc.target/aarch64/pr108172.c
> > new file mode 100644
> > index
> >
> ..b29054fdb1d6e602755bc9308
> 9f1
> > edec4eb

Re: [PATCH]AArch64 relax constraints on FP16 insn PR108172

2022-12-21 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Wednesday, December 21, 2022 10:31 AM
>> To: Tamar Christina 
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov 
>> Subject: Re: [PATCH]AArch64 relax constraints on FP16 insn PR108172
>> 
>> Tamar Christina  writes:
>> > Hi All,
>> >
>> > The move, load, load, store, dup, basically all the non arithmetic
>> > FP16 instructions use baseline armv8-a HF support, and so do not
>> > require the Armv8.2-a extensions.  This relaxes the instructions.
>> >
>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>> >
>> > Ok for master?
>> >
>> > Thanks,
>> > Tamar
>> >
>> > gcc/ChangeLog:
>> >
>> >PR target/108172
>> >* config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): Relax
>> >TARGET_SIMD_F16INST to TARGET_SIMD.
>> >* config/aarch64/aarch64.cc (aarch64_classify_vector_mode): Re-
>> order.
>> >* config/aarch64/iterators.md (VMOVE): Drop
>> TARGET_SIMD_F16INST.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >PR target/108172
>> >* gcc.target/aarch64/pr108172.c: New test.
>> 
>> OK, thanks.
>> 
>> I think we need better tests for this area though.  The VMOVE uses include
>> aarch64_dup_lane, which uses <Vtype>, which has no definition for
>> V2HF, so we get:
>> 
>> return "dup\t%0.<Vtype>, %1.h[%2]";
>> 
>> The original patch defined Vtype to "2h", but using that here would have
>> generated an invalid instruction.
>> 
>> We also have unexpanded s in the pairwise operations:
>> 
>> "faddp\t%h0, %1.",
>> "fmaxp\t%h0, %1.",
>> "fminp\t%h0, %1.",
>> "fmaxnmp\t%h0, %1.",
>> "fminnmp\t%h0, %1.",
>> 
>
> Right, the pairwise bit should have been reverted; the tests were all in that
> patch.
>
>> Would it be easy (using a combination of this patch and a follow-on patch) to
>> wind the V2HF support back to a state that makes sense on its own, without
>> the postponed pairwise support?  Or would it be simpler to revert
>> 2cba118e538ba0b7582af7f9fb5ba2dfbb772f8e for GCC 13 and revisit this in
>> GCC 14, alongside the original motivating use case?
>
> The pairwise and the dup should be undone, as evidenced by the fact that they
> don't trigger currently. The movs make sense on their own already and improve
> various codegen around shorts.

OK, sounds good, thanks.

> I won't be revisiting pairwise operations again, so I'll remove the remnants
> from this patch.

Ack.  Sorry for causing frustration in the review for that series.

Richard

>> Richard
>> 
>> > --- inline copy of patch --
>> > diff --git a/gcc/config/aarch64/aarch64-simd.md
>> > b/gcc/config/aarch64/aarch64-simd.md
>> > index
>> >
>> a62b6deaf4a57e570074d7d894e6fac13779f6fb..8a9f655d547285ec7bdc173086
>> 30
>> > 8d7d44a8d482 100644
>> > --- a/gcc/config/aarch64/aarch64-simd.md
>> > +++ b/gcc/config/aarch64/aarch64-simd.md
>> > @@ -164,7 +164,7 @@ (define_insn "*aarch64_simd_movv2hf"
>> >"=w, m,  m,  w, ?r, ?w, ?r, w, w")
>> >(match_operand:V2HF 1 "general_operand"
>> >"m,  Dz, w,  w,  w,  r,  r, Dz, Dn"))]
>> > -  "TARGET_SIMD_F16INST
>> > +  "TARGET_SIMD
>> > && (register_operand (operands[0], V2HFmode)
>> > || aarch64_simd_reg_or_zero (operands[1], V2HFmode))"
>> > "@
>> > diff --git a/gcc/config/aarch64/aarch64.cc
>> > b/gcc/config/aarch64/aarch64.cc index
>> >
>> 73515c174fa4fe7830527e7eabd91c4648130ff4..d1c0476321b79bc6aded350d24
>> ea
>> > 5d556c796519 100644
>> > --- a/gcc/config/aarch64/aarch64.cc
>> > +++ b/gcc/config/aarch64/aarch64.cc
>> > @@ -3616,6 +3616,8 @@ aarch64_classify_vector_mode (machine_mode
>> mode)
>> >  case E_V4x2DFmode:
>> >return TARGET_FLOAT ? VEC_ADVSIMD | VEC_STRUCT : 0;
>> >
>> > +/* 32-bit Advanced SIMD vectors.  */
>> > +case E_V2HFmode:
>> >  /* 64-bit Advanced SIMD vectors.  */
>> >  case E_V8QImode:
>> >  case E_V4HImode:
>> > @@ -3634,7 +3636,6 @@ aarch64_classify_vector_mode (machine_mode
>> mode)
>> >  case E_V8BFmode:
>> >  case E_V4SFmode:
>> >  case E_V2DFmode:
>> > -case E_V2HFmode:
>> >return TARGET_FLOAT ? VEC_ADVSIMD : 0;
>> >
>> >  default:
>> > diff --git a/gcc/config/aarch64/iterators.md
>> > b/gcc/config/aarch64/iterators.md index
>> >
>> a521dbde1ec42c0c442a9ca3dd52c9727d116399..70742520984d30158e62a38c9
>> 2ab
>> > ea82b2dac059 100644
>> > --- a/gcc/config/aarch64/iterators.md
>> > +++ b/gcc/config/aarch64/iterators.md
>> > @@ -204,8 +204,7 @@ (define_mode_iterator VALL_F16 [V8QI V16QI V4HI
>> > V8HI V2SI V4SI V2DI  ;; All Advanced SIMD modes suitable for moving,
>> > loading, and storing  ;; including V2HF  (define_mode_iterator VMOVE
>> > [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
>> > -   V4HF V8HF V4BF V8BF V2SF V4SF V2DF
>> > -   (V2HF "TARGET_SIMD_F16INST")])
>> > +   V2HF V4HF V8HF V4BF V8BF V2SF V4SF V2DF])
>> >
>> >
>> >  ;; The VALL_F16 modes except the 12

Make -fwhole-program to work with incremental LTO linking

2022-12-21 Thread Jan Hubicka via Gcc-patches
Hi,
this patch updates the documentation of -fwhole-program, which wrongly
claimed that the option is useless with LTO, while it is useful for LTO without
a plugin, and extends -fwhole-program to also work with incremental linking when
non-LTO code is produced.
This is useful when building the kernel, where the incremental link is the
de facto final binary and only some explicitly marked symbols need to remain.

Bootstrapped/regtested x86_64-linux, committed.
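
As an illustration of the use case described above, here is a minimal sketch in C, assuming an object where only explicitly marked symbols must survive the incremental link. The names helper_square and exported_entry are hypothetical and do not come from the patch; externally_visible is the GCC attribute named in the -fwhole-program documentation.

```c
/* Hypothetical example; neither function name comes from the patch.  */

/* Not marked: under -fwhole-program GCC may treat this as if it were
   static and optimize it more aggressively (or inline it away).  */
int helper_square (int x)
{
  return x * x;
}

/* Explicitly marked: stays visible to code linked in later.  */
__attribute__ ((externally_visible))
int exported_entry (int x)
{
  return helper_square (x) + 1;
}
```

Per the documentation change in this patch, the incremental case in question is the one built with -flto and -flinker-output=nolto-rel; with -fwhole-program added, only exported_entry would be expected to remain an external symbol.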

gcc/ChangeLog:

2022-12-21  Jan Hubicka  

* doc/invoke.texi: Fix documentation of -fwhole-program with LTO
and document behaviour for incremental linking.

gcc/lto/ChangeLog:

2022-12-21  Jan Hubicka  

* lto-common.cc (lto_resolution_read): With incremental linking
and whole program, turn LDPR_PREVAILING_DEF_IRONLY_EXP into
LDPR_PREVAILING_DEF_IRONLY.
* lto-lang.cc (lto_post_options): Do not clear flag_whole_program
for incremental link.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 330da6eb5d4..8c01160e2bb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13648,9 +13648,12 @@ compiled.  All public functions and variables with the 
exception of @code{main}
 and those merged by attribute @code{externally_visible} become static functions
 and in effect are optimized more aggressively by interprocedural optimizers.
 
-This option should not be used in combination with @option{-flto}.
-Instead relying on a linker plugin should provide safer and more precise
-information.
+With @option{-flto} this option has a limited use.  In most cases the
+precise list of symbols used or exported from the binary is known the
+resolution info passed to the link-time optimizer by the linker plugin.  It is
+still useful if no linker plugin is used or during incremental link step when
+final code is produced (with @option{-flto}
+@option{-flinker-output=nolto-rel}).
 
 @item -flto[=@var{n}]
 @opindex flto
diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc
index f64309731fa..125064ba47e 100644
--- a/gcc/lto/lto-common.cc
+++ b/gcc/lto/lto-common.cc
@@ -2118,6 +2118,17 @@ lto_resolution_read (splay_tree file_ids, FILE 
*resolution, lto_file *file)
  if (strcmp (lto_resolution_str[j], r_str) == 0)
{
  r = (enum ld_plugin_symbol_resolution) j;
+ /* Incremental linking together with -fwhole-program may seem
+somewhat contradictionary (as the point of incremental linking
+is to allow re-linking with more symbols later) but it is
+used to build LTO kernel.  We want to hide all symbols that
+are not explicitely marked as exported and thus turn
+LDPR_PREVAILING_DEF_IRONLY_EXP
+to LDPR_PREVAILING_DEF_IRONLY.  */
+ if (flag_whole_program
+ && flag_incremental_link == INCREMENTAL_LINK_NOLTO
+ && r == LDPR_PREVAILING_DEF_IRONLY_EXP)
+   r = LDPR_PREVAILING_DEF_IRONLY;
  break;
}
}
diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc
index d36453ba25d..7018dfae4a5 100644
--- a/gcc/lto/lto-lang.cc
+++ b/gcc/lto/lto-lang.cc
@@ -901,7 +901,6 @@ lto_post_options (const char **pfilename ATTRIBUTE_UNUSED)
   break;
 
 case LTO_LINKER_OUTPUT_NOLTOREL: /* .o: incremental link producing asm  */
-  flag_whole_program = 0;
   flag_incremental_link = INCREMENTAL_LINK_NOLTO;
   break;
 


[PATCH] middle-end/107994 - ICE after error with comparison gimplification

2022-12-21 Thread Richard Biener via Gcc-patches
The following avoids passing down error_mark_node to fold_convert.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR middle-end/107994
* gimplify.cc (gimplify_expr): Catch erroneous comparison
operand.
---
 gcc/gimplify.cc | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 250782b1140..c9c800a5850 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -17098,6 +17098,9 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p,
 Compare scalar mode aggregates as scalar mode values.  Using
 memcmp for them would be very inefficient at best, and is
 plain wrong if bitfields are involved.  */
+ if (error_operand_p (TREE_OPERAND (*expr_p, 1)))
+   ret = GS_ERROR;
+ else
{
  tree type = TREE_TYPE (TREE_OPERAND (*expr_p, 1));
 
@@ -17122,9 +17125,8 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p,
ret = gimplify_scalar_mode_aggregate_compare (expr_p);
  else
ret = gimplify_variable_sized_compare (expr_p);
-
- break;
}
+ break;
 
/* If *EXPR_P does not need to be special-cased, handle it
   according to its class.  */
-- 
2.35.3


[PATCH] modula2: PR-108119 Disable m2 plugin m2rte

2022-12-21 Thread Gaius Mulley via Gcc-patches


Hello,

PR-108119 Disable m2 plugin m2rte (provide --enable-m2plugin configure option).

The m2 plugin m2rte attempts to find reachable calls to the m2 exception
handler, but it identifies the m2 exception calls by procedure name.
As this won't work with other languages it should be disabled by default.

This patch disables the plugin from being built.  It provides a new
configure switch --enable-m2plugin to override the default and it will
check the HAVE_PLUGIN of the host and warn if there is a conflict.

Bootstrapped and tested on x86_64 GNU/Linux for all languages with no extra
failures (the m2 plugin tests are missing because the plugin is disabled and
the Tcl/expect test scripts now skip the m2 plugin tests).

ok for trunk?

regards,
Gaius


ChangeLog:

* Makefile.def (extra_configure_flags): Add @enable_m2plugin@.
* Makefile.in : Rebuilt.
* configure : Rebuilt.
* configure.ac (host_tools): Remove unused gm2tools.
(m2plugin) New AC_ARG_ENABLE.
(enable_m2plugin) New AC_SUBST added.

gcc/ChangeLog:

* Makefile.in (enable_m2plugin): Added.
(site.exp): New variable ENABLE_M2PLUGIN.
* config.in: Rebuilt.
* configure : Rebuilt.
* gcc/configure.ac (m2plugin): New AC_ARG_ENABLE.
(enable_m2plugin) New AC_SUBST added.
* doc/install.texi (--enable-m2plugin): Documented.

gcc/m2/ChangeLog:

* gm2spec.cc (ENABLE_PLUGIN): Replaced by ENABLE_M2PLUGIN.  Ensure
that OPT_fplugin option is only created if ENABLE_M2PLUGIN is
defined.
(lang_specific_driver) Change warning to mention
--enable-m2plugin.
* Make-lang.in (enable_m2plugin): Set to no if enable_plugin is
no.
(M2RTE_PLUGIN_SO) New definition.
(m2.all.cross) Use M2RTE_PLUGIN_SO.
(m2.start.encap) Use M2RTE_PLUGIN_SO.
(m2.install-plugin) Use M2RTE_PLUGIN_SO.
(m2.install-plugin) Add dummy rule when enable_m2plugin is no.
(plugin/m2rte$(exeext).so) Add dummy rule when enable_m2plugin is
no.
(stage1/m2/cc1gm2$(exeext) Use M2RTE_PLUGIN_SO.
(stage2/m2/cc1gm2$(exeext) Use M2RTE_PLUGIN_SO.

gcc/testsuite/ChangeLog:

* gm2/coroutines/pim/run/pass/coroutines-pim-run-pass.exp
(ENABLE_M2PLUGIN): Checked to see whether the test should be ignored.
* gm2/iso/check/fail/iso-check-fail.exp (ENABLE_M2PLUGIN): Ditto.
* gm2/switches/auto-init/fail/switches-auto-init-fail.exp
(ENABLE_M2PLUGIN): Ditto.
* gm2/switches/check-all/pim2/fail/switches-check-all-pim2-fail.exp
(ENABLE_M2PLUGIN): Ditto.
* gm2/switches/check-all/plugin/iso/fail/switches-check-all-plugin-iso-fail.exp
(ENABLE_M2PLUGIN): Ditto.
* gm2/switches/check-all/plugin/pim2/fail/switches-check-all-plugin-pim2-fail.exp
(ENABLE_M2PLUGIN): Ditto.

diff --git a/Makefile.def b/Makefile.def
index 5f44190154e..d33e4528b63 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -103,7 +103,7 @@ host_modules= { module= libiberty; bootstrap=true;
 // @extra_linker_plugin_flags@ and @extra_linker_plugin_configure_flags@.
 host_modules= { module= libiberty-linker-plugin; bootstrap=true;
module_srcdir=libiberty;
-   extra_configure_flags='@extra_host_libiberty_configure_flags@ 
--disable-install-libiberty @extra_linker_plugin_flags@ 
@extra_linker_plugin_configure_flags@';
+   extra_configure_flags='@extra_host_libiberty_configure_flags@ 
--disable-install-libiberty @extra_linker_plugin_flags@ 
@extra_linker_plugin_configure_flags@ @enable_m2plugin@';
extra_make_flags='@extra_linker_plugin_flags@'; };
 // We abuse missing to avoid installing anything for libiconv.
 host_modules= { module= libiconv;
diff --git a/configure.ac b/configure.ac
index c5191ce24ae..b771f61ef33 100644
--- a/configure.ac
+++ b/configure.ac
@@ -140,7 +140,7 @@ host_libs="intl libiberty opcodes bfd readline tcl tk itcl 
libgui zlib libbacktr
 # binutils, gas and ld appear in that order because it makes sense to run
 # "make check" in that particular order.
 # If --enable-gold is used, "gold" may replace "ld".
-host_tools="texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim 
gdb gdbserver gprof etc expect dejagnu m4 utils guile fastjar gnattools libcc1 
gm2tools gotools c++tools"
+host_tools="texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim 
gdb gdbserver gprof etc expect dejagnu m4 utils guile fastjar gnattools libcc1 
gotools c++tools"
 
 # these libraries are built for the target environment, and are built after
 # the host libraries and the host tools (which may be a cross compiler)
@@ -465,13 +465,22 @@ if test "${ENABLE_LIBADA}" != "yes" ; then
   noconfigdirs="$noconfigdirs gnattools"
 fi
 
+
 AC_ARG_ENABLE(libgm2,
 [AS_HELP_STRING([--enable-libgm2], [build libgm2 directory])],
 ENABLE_LIBGM2=$enableval,
 ENABLE_LIBGM2=no)
-if test "${ENABLE_LIBGM2}" != "yes" ; then
-  noconfigdirs="$noconfigdirs gm2tools"
+
+
+AC_

Re: [committed] modula2: Fix lto profiledbootstrap on powerpc64le-linux and s390x-linux [PR108153]

2022-12-21 Thread Gaius Mulley via Gcc-patches
Jakub Jelinek  writes:

> Hi!
>
> Lto profiledbootstrap was failing for me on {powerpc64le,s390x}-linux with
> modula 2 enabled, with:
> cc1gm2: internal compiler error: the location value is corrupt
> 0x11a3d2d m2assert_AssertLocation(unsigned int)
> ../../gcc/m2/gm2-gcc/m2assert.cc:40
> 0x11a3d2d m2statement_BuildAssignmentTree
> ../../gcc/m2/gm2-gcc/m2statement.cc:177
> ICE.  The problem was that the caller (m2assert_AssertLocation) used the
> location_t M2Options_OverrideLocation (location_t);
> prototype with the libcpp/line-map.h
> typedef unsigned int location_t;
> typedef, but the callee defined in Modula 2 was using:
> TYPE
>location_t = INTEGER ;
> and
> PROCEDURE OverrideLocation (location: location_t) : location_t ;
> Now, on powerpc64le-linux unsigned int is returned and passed zero-extended
> into 64 bits, while signed int is returned and passed sign-extended into
> 64 bits, and Modula 2 INTEGER is a signed 32-bit type, so when the caller
> then compared
> M2Options_OverrideLocation (location) != location
> and powerpc64le-linux performed the comparison as 64-bit compare, there
> was a mismatch for location_t of 0x807 or others with the MSB set.
>
> Fixed by making Modula 2 location_t a CARDINAL, which is 32-bit unsigned type.
>
> Bootstrapped/regtested normally on x86_64-linux and i686-linux, with
> bootstrap-lto profiledbootstrap on powerpc64le-linux and aarch64-linux so
> far (and in regtesting on x86_64-linux, i686-linux and s390x-linux),
> approved in the PR by Gaius, committed to trunk.
>
> 2022-12-21  Jakub Jelinek  
>
>   PR modula2/108153
>   * gm2-gcc/m2linemap.def (location_t): Use CARDINAL instead of INTEGER.
>
> --- gcc/m2/gm2-gcc/m2linemap.def.jj   2022-12-19 14:59:50.169762747 +0100
> +++ gcc/m2/gm2-gcc/m2linemap.def  2022-12-20 16:36:18.321555969 +0100
> @@ -30,7 +30,7 @@ EXPORT QUALIFIED StartFile, EndFile, Sta
>   WarningAtf, NoteAtf, internal_error, location_t ;
>  
>  TYPE
> -   location_t = INTEGER ;
> +   location_t = CARDINAL ;
>  
>  
>  PROCEDURE StartFile (filename: ADDRESS; linebegin: CARDINAL) ;
>
>   Jakub

Hi Jakub,

thanks for finding the bug and the fix!

regards,
Gaius
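
The mismatch Jakub describes can be reproduced in isolation. Below is a minimal sketch in C, assuming a 64-bit register that holds a 32-bit value widened either as signed (the old Modula 2 INTEGER view) or as unsigned (the C location_t view). The helper names are hypothetical, and 0x80000007 is only an illustrative value with the most significant bit set.

```c
#include <stdint.h>

/* Model the callee: a signed 32-bit result sign-extended to 64 bits.
   The uint32_t -> int32_t conversion wraps on GCC.  */
static uint64_t widen_as_signed (uint32_t loc)
{
  return (uint64_t) (int64_t) (int32_t) loc;
}

/* Model the caller: an unsigned 32-bit value zero-extended to 64 bits.  */
static uint64_t widen_as_unsigned (uint32_t loc)
{
  return (uint64_t) loc;
}

/* Return 1 if a full 64-bit comparison sees both views as equal.  */
int registers_match (uint32_t loc)
{
  return widen_as_signed (loc) == widen_as_unsigned (loc);
}
```

Values without the top bit set compare equal under both views, which is presumably why the problem only showed up for some locations.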





Re: [PATCH] modula2: PR-108119 Disable m2 plugin m2rte

2022-12-21 Thread Richard Biener via Gcc-patches
On Wed, Dec 21, 2022 at 1:35 PM Gaius Mulley via Gcc-patches
 wrote:
>
>
> Hello,
>
> PR-108119 Disable m2 plugin m2rte (provide --enable-m2plugin configure
> option).
>
> The m2 plugin m2rte attempts to find reachable calls to the m2 exception
> handler, but it identifies the m2 exception calls by procedure name.
> As this won't work with other languages it should be disabled by default.
>
> This patch disables the plugin from being built.  It provides a new
> configure switch --enable-m2plugin to override the default and it will
> check the HAVE_PLUGIN of the host and warn if there is a conflict.
>
> Bootstrapped and tested on x86_64 GNU/Linux for all languages with no extra
> failures (the m2 plugin tests are missing because the plugin is disabled and
> the Tcl/expect test scripts now skip the m2 plugin tests).
>
> ok for trunk?
>
> regards,
> Gaius
>
>
> ChangeLog:
>
> * Makefile.def (extra_configure_flags): Add @enable_m2plugin@.
> * Makefile.in : Rebuilt.
> * configure : Rebuilt.
> * configure.ac (host_tools): Remove unused gm2tools.
> (m2plugin) New AC_ARG_ENABLE.
> (enable_m2plugin) New AC_SUBST added.

Why's this at the toplevel?

>
> gcc/ChangeLog:
>
> * Makefile.in (enable_m2plugin): Added.
> (site.exp): New variable ENABLE_M2PLUGIN.
> * config.in: Rebuilt.
> * configure : Rebuilt.
> * gcc/configure.ac (m2plugin): New AC_ARG_ENABLE.
> (enable_m2plugin) New AC_SUBST added.
> * doc/install.texi (--enable-m2plugin): Documented.

Likewise - shouldn't this be in gcc/m2/?

>
> gcc/m2/ChangeLog:
>
> * gm2spec.cc (ENABLE_PLUGIN): Replaced by ENABLE_M2PLUGIN.  Ensure
> that OPT_fplugin option is only created if ENABLE_M2PLUGIN is
> defined.
> (lang_specific_driver) Change warning to mention
> --enable-m2plugin.
> * Make-lang.in (enable_m2plugin): Set to no if enable_plugin is
> no.
> (M2RTE_PLUGIN_SO) New definition.
> (m2.all.cross) Use M2RTE_PLUGIN_SO.
> (m2.start.encap) Use M2RTE_PLUGIN_SO.
> (m2.install-plugin) Use M2RTE_PLUGIN_SO.
> (m2.install-plugin) Add dummy rule when enable_m2plugin is no.
> (plugin/m2rte$(exeext).so) Add dummy rule when enable_m2plugin is
> no.
> (stage1/m2/cc1gm2$(exeext) Use M2RTE_PLUGIN_SO.
> (stage2/m2/cc1gm2$(exeext) Use M2RTE_PLUGIN_SO.
>
> gcc/testsuite/ChangeLog:
>
> * gm2/coroutines/pim/run/pass/coroutines-pim-run-pass.exp
> (ENABLE_M2PLUGIN): Checked to see whether the test should be ignored.
> * gm2/iso/check/fail/iso-check-fail.exp (ENABLE_M2PLUGIN): Ditto.
> * gm2/switches/auto-init/fail/switches-auto-init-fail.exp
> (ENABLE_M2PLUGIN): Ditto.
> * gm2/switches/check-all/pim2/fail/switches-check-all-pim2-fail.exp
> (ENABLE_M2PLUGIN): Ditto.
> * gm2/switches/check-all/plugin/iso/fail/switches-check-all-plugin-iso-fail.exp
> (ENABLE_M2PLUGIN): Ditto.
> * gm2/switches/check-all/plugin/pim2/fail/switches-check-all-plugin-pim2-fail.exp
> (ENABLE_M2PLUGIN): Ditto.
>
> diff --git a/Makefile.def b/Makefile.def
> index 5f44190154e..d33e4528b63 100644
> --- a/Makefile.def
> +++ b/Makefile.def
> @@ -103,7 +103,7 @@ host_modules= { module= libiberty; bootstrap=true;
>  // @extra_linker_plugin_flags@ and @extra_linker_plugin_configure_flags@.
>  host_modules= { module= libiberty-linker-plugin; bootstrap=true;
> module_srcdir=libiberty;
> -   extra_configure_flags='@extra_host_libiberty_configure_flags@ 
> --disable-install-libiberty @extra_linker_plugin_flags@ 
> @extra_linker_plugin_configure_flags@';
> +   extra_configure_flags='@extra_host_libiberty_configure_flags@ 
> --disable-install-libiberty @extra_linker_plugin_flags@ 
> @extra_linker_plugin_configure_flags@ @enable_m2plugin@';
> extra_make_flags='@extra_linker_plugin_flags@'; };
>  // We abuse missing to avoid installing anything for libiconv.
>  host_modules= { module= libiconv;
> diff --git a/configure.ac b/configure.ac
> index c5191ce24ae..b771f61ef33 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -140,7 +140,7 @@ host_libs="intl libiberty opcodes bfd readline tcl tk 
> itcl libgui zlib libbacktr
>  # binutils, gas and ld appear in that order because it makes sense to run
>  # "make check" in that particular order.
>  # If --enable-gold is used, "gold" may replace "ld".
> -host_tools="texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim 
> gdb gdbserver gprof etc expect dejagnu m4 utils guile fastjar gnattools 
> libcc1 gm2tools gotools c++tools"
> +host_tools="texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim 
> gdb gdbserver gprof etc expect dejagnu m4 utils guile fastjar gnattools 
> libcc1 gotools c++tools"
>
>  # these libraries are built for the target environment, and are built after
>  # the host libraries and the host tools (which may be a cross 

[pushed] libffi: Fix X86 32b Darwin build and EH frames.

2022-12-21 Thread Iain Sandoe via Gcc-patches
The addition of rust has pulled in libffi as a target library for Darwin.
This fails to bootstrap on some versions of i686-darwin and has build-time
diagnostics on all.  The EH is also invalid and those tests fail.

Note that X86 Darwin is not listed as supported upstream.
However I have made a pull request with this patch and confirmed that the
library builds on 
 i686-darwin9 ... darwin17 with recent GCC
 i686-darwin11 ... darwin17 with xcode clang.
 
 The testsuite is not too happy, but this patch does not attempt to
 address anything other than build failures (it happens to fix the EH
 tests as a by product).
 
 Tested on i686-darwin9-17, x86_64-linux-gnu (with a 32b multilib) and
 x86_64 darwin21, pushed to master, thanks,
 Iain

--- 8< ---

This addresses a number of issues in the X86 Darwin 32b port for libffi.

1. The pic symbol stubs are weak definitions; the correct section placement
   for these depends on the linker version in use.  We do not have access
   to that information, but we can use the target OS version (assumes that
   the user has installed the latest version of xcode available).
   When a coalesced section is in use (OS versions earlier than Darwin12 /
   OSX 10.8), its name must differ from  __TEXT,__text since otherwise that
   would correspond to altering the attributes of the .text section (which
   produces a diagnostic from the assembler).
   Here we use __TEXT, __textcoal_nt for this which is what GCC emits for
   these stubs.
   For later versions than Darwin 12 (OS X 10.8) we can place the stubs in
   the .text section (if we do not we get a diagnostic from clang -cc1as
   saying that the use of coalesced sections for this is deprecated).

2. The EH frame is specified manually, since there is no support for .cfi_
   directives in 'cctools' assemblers.  The implementation needs to provide
   offsets for CFA advance, code size and to the CIE as signed values
   rather than relocations. However the cctools assembler will produce a
   relocation for expressions like ' .long Lxx-Lyy' which then leads to a
   link-time error.  We correct this by forming the offset values using
   ' .set' directives and then assigning the results of them.

3. The register numbering used by m32 X86 Darwin EH frames is not the same
   as the DWARF debug numbering (the Frame and Stack pointer numbers are
   swapped).

4. The FDE address encoding used by the system tools is '0x10' (PCrel + abs)
   where the value provided was PCrel + sdata4.

5. GCC does not use compact unwind at present, and it was not implemented
   until Darwin10 / OSX 10.6.  There were some issues with function location
   in 10.6 so that the solution here suppresses emitting the compact unwind
   section until Darwin11 / OSX 10.7.

Signed-off-by: Iain Sandoe 

libffi/ChangeLog:

* src/x86/sysv.S (COMDAT): Amend section use for Darwin, accounting
for cases where coalesced is needed. (eh_frame): Rework to avoid relocs
that cause build fails on earlier Darwin.  Adjust register numbers
to account for X86 m32 Darwin differences between EH and debug.
---
 libffi/src/x86/sysv.S | 121 +-
 1 file changed, 83 insertions(+), 38 deletions(-)

diff --git a/libffi/src/x86/sysv.S b/libffi/src/x86/sysv.S
index 7110f02f5f3..c7a0fb51b48 100644
--- a/libffi/src/x86/sysv.S
+++ b/libffi/src/x86/sysv.S
@@ -888,10 +888,27 @@ ENDF(C(ffi_closure_raw_THISCALL))
 #endif /* !FFI_NO_RAW_API */
 
 #ifdef X86_DARWIN
-# define COMDAT(X) \
-.section __TEXT,__text,coalesced,pure_instructions;\
+/* The linker in use on earlier Darwin needs weak definitions to be
+   placed in a coalesced section.  That section should not be called
+   __TEXT,__text since that would be re-defining the attributes of the
+   .text section (which is an error for earlier tools). Here we use
+   '__textcoal_nt' which is what GCC emits for this.
+   Later linker versions are happy to use a normal section and, after
+   Darwin12 / OSX 10.8, the tools warn that using coalesced sections
+   for this is deprecated so we must switch to avoid build fails and/or
+   deprecation warnings.  */
+# if defined(__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__) && \
+   __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ < 1080
+#  define COMDAT(X)\
+.section __TEXT,__textcoal_nt,coalesced,pure_instructions; \
+.weak_definition X;\
+FFI_HIDDEN(X)
+# else
+#  define COMDAT(X)\
+.text; \
 .weak_definition X;\
 FFI_HIDDEN(X)
+# endif
 #elif defined __ELF__ && !(defined(__sun__) && defined(__svr4__))
 # define COMDAT(X) \

[PATCH 0/2]: Fix address cost complexity and register pressure cost calculation.

2022-12-21 Thread Dimitrije Milošević
Architectures like Mips are very limited when it comes to addressing modes.
Therefore, the expected behavior would be that, for the BASE + OFFSET
addressing mode, complexity is lower, while, for more complex addressing modes
(e.g. BASE + INDEX << SCALE), which are not supported, complexity is higher.
Currently, the complexity calculation algorithm bails out if the BASE + INDEX
addressing mode is not supported by the target architecture, resulting in
0-complexities for all candidates, which leads to non-optimal candidate
selection, especially in scenarios where there are multiple nested loops.

The register pressure cost model isn't optimal for the case when there are
enough registers. Currently, the register pressure cost is bumped up by another
n_cands, while there is no reason for the register pressure cost to differ
from n_cands + n_invs (for that case).
Adding another n_cands could be used as a tie-breaker for the two cases where
we do have enough registers and the sum of n_invs and n_cands is equal; however,
I think there are two problems with that:
  - How often does it happen that we have two cases where we do have enough
    registers, n_invs + n_cands sums are equal, and n_cands differ? I think
    that's pretty rare.
  - Bumping up the cost by another n_cands may lead to the cost for the "If we
    do have enough registers." case being higher than for other cases, which
    doesn't make sense.

Dimitrije Milosevic (2):
  ivopts: Compute complexity for unsupported addressing modes.
  ivopts: Revert register pressure cost when there are enough registers.

 gcc/tree-ssa-loop-ivopts.cc | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)
---
2.25.1




[PATCH 1/2] ivopts: Compute complexity for unsupported addressing modes.

2022-12-21 Thread Dimitrije Milošević
After f9f69dd, complexity is calculated using the
valid_mem_ref_p target hook. Architectures like Mips only
allow BASE + OFFSET addressing modes, which in turn prevents
the calculation of complexity for other addressing modes,
resulting in non-optimal candidate selection.

There still is code that adjusts the address cost for
unsupported addressing modes, however, it only adjusts the
cost part (the complexity part is left at 0).

gcc/ChangeLog:

* tree-ssa-loop-ivopts.cc (get_address_cost): Calculate
complexity for unsupported addressing modes as well.

Signed-off-by: Dimitrije Milosevic 
---
 gcc/tree-ssa-loop-ivopts.cc | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
index ebd4aecce37..60c61dc9e49 100644
--- a/gcc/tree-ssa-loop-ivopts.cc
+++ b/gcc/tree-ssa-loop-ivopts.cc
@@ -4778,10 +4778,14 @@ get_address_cost (struct ivopts_data *data, struct 
iv_use *use,
 comp_inv = aff_combination_to_tree (aff_inv);
   if (comp_inv != NULL_TREE)
 cost = force_var_cost (data, comp_inv, inv_vars);
-  if (ratio != 1 && parts.step == NULL_TREE)
+  if (ratio != 1 && parts.step == NULL_TREE) {
 var_cost += mult_by_coeff_cost (ratio, addr_mode, speed);
-  if (comp_inv != NULL_TREE && parts.index == NULL_TREE)
+var_cost.complexity += 1;
+  }
+  if (comp_inv != NULL_TREE && parts.index == NULL_TREE) {
 var_cost += add_cost (speed, addr_mode);
+var_cost.complexity += 1;
+  }
 
   if (comp_inv && inv_expr && !simple_inv)
 {
-- 
2.25.1
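
To show why a complexity of 0 for every candidate matters, here is a minimal sketch in C, assuming candidates are ordered by cost first and by complexity only as a tie-breaker. The struct and function names are hypothetical and are not the ivopts data structures.

```c
/* Hypothetical stand-in for a (cost, complexity) pair.  */
struct sketch_cost
{
  long cost;
  unsigned complexity;
};

/* Return -1 if A is cheaper than B, 1 if B is cheaper, 0 on a tie.
   Complexity is consulted only when the costs are equal.  */
int sketch_cost_compare (struct sketch_cost a, struct sketch_cost b)
{
  if (a.cost != b.cost)
    return a.cost < b.cost ? -1 : 1;
  if (a.complexity != b.complexity)
    return a.complexity < b.complexity ? -1 : 1;
  return 0;
}
```

Under this assumption, two candidates with equal cost and complexity 0 are indistinguishable, whereas bumping the complexity of the one that needs an unsupported addressing mode lets the simpler candidate win the tie.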



[PATCH 2/2] ivopts: Revert register pressure cost when there are enough registers.

2022-12-21 Thread Dimitrije Milošević
When there are enough registers, the register pressure cost is
unnecessarily bumped by adding another n_cands.

This behavior may result in register pressure costs for the case
when there are enough registers being higher than for other cases.

When there are enough registers, the register pressure cost should be
equal to n_invs + n_cands.

This used to be the case before c18101f.

gcc/ChangeLog:

* tree-ssa-loop-ivopts.cc (ivopts_estimate_reg_pressure): Adjust.

Signed-off-by: Dimitrije Milosevic 
---
 gcc/tree-ssa-loop-ivopts.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
index 60c61dc9e49..3176482d0d9 100644
--- a/gcc/tree-ssa-loop-ivopts.cc
+++ b/gcc/tree-ssa-loop-ivopts.cc
@@ -6092,7 +6092,7 @@ ivopts_estimate_reg_pressure (struct ivopts_data *data, 
unsigned n_invs,
 
   /* If we have enough registers.  */
   if (regs_needed + target_res_regs < available_regs)
-cost = n_new;
+return n_new;
   /* If close to running out of registers, try to preserve them.  */
   else if (regs_needed <= available_regs)
 cost = target_reg_cost [speed] * regs_needed;
-- 
2.25.1
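
The arithmetic behind this change can be sketched as follows. This is a simplified model in C of the "enough registers" branch only, assuming n_new equals n_invs + n_cands and that the old code added a further n_cands on return, as the description above states. The function names are hypothetical.

```c
/* Before the patch (as described): the cost for the "enough registers"
   case ends up as n_invs + n_cands plus another n_cands.  */
unsigned sketch_cost_before (unsigned n_invs, unsigned n_cands)
{
  unsigned n_new = n_invs + n_cands;
  return n_new + n_cands;
}

/* After the patch: the early return yields n_invs + n_cands directly.  */
unsigned sketch_cost_after (unsigned n_invs, unsigned n_cands)
{
  return n_invs + n_cands;
}
```

For example, with 2 invariants and 3 candidates the modeled cost drops from 8 to 5.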



Re: gcc-13/changes.html: Mention -fstrict-flex-arrays and its impact

2022-12-21 Thread Qing Zhao via Gcc-patches
Hi, Richard,

Thanks a lot for your comments.

> On Dec 21, 2022, at 2:12 AM, Richard Biener  wrote:
> 
> On Tue, 20 Dec 2022, Qing Zhao wrote:
> 
>> Hi,
>> 
>> This is the patch for mentioning -fstrict-flex-arrays and -Warray-bounds=2 
>> changes in gcc-13/changes.html.
>> 
>> Let me know if you have any comment or suggestions.
> 
> Some copy editing below
> 
>> Thanks.
>> 
>> Qing.
>> 
>> ===
>> From c022076169b4f1990b91f7daf4cc52c6c5535228 Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Tue, 20 Dec 2022 16:13:04 +
>> Subject: [PATCH] gcc-13/changes: Mention -fstrict-flex-arrays and its impact.
>> 
>> ---
>> htdocs/gcc-13/changes.html | 15 +++
>> 1 file changed, 15 insertions(+)
>> 
>> diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
>> index 689178f9..47b3d40f 100644
>> --- a/htdocs/gcc-13/changes.html
>> +++ b/htdocs/gcc-13/changes.html
>> @@ -39,6 +39,10 @@ a work-in-progress.
>> Legacy debug info compression option -gz=zlib-gnu was 
>> removed
>>   and the option is ignored right now.
>> New debug info compression option value -gz=zstd has 
>> been added.
>> +-Warray-bounds=2 will no longer issue warnings for out 
>> of bounds
>> +  accesses to trailing struct members of one-element array type 
>> anymore. Please
>> +  add -fstrict-flex-arrays=level to control how the 
>> compiler treat
>> +  trailing arrays of structures as flexible array members. 
> 
> "Instead it diagnoses accesses to trailing arrays according to 
> -fstrict-flex-arrays."

Okay.
> 
>> 
>> 
>> 
>> @@ -409,6 +413,17 @@ a work-in-progress.
>> Other significant improvements
>> 
>> 
>> +Treating trailing arrays as flexible array 
>> members
>> +
>> +
>> + GCC can now control when to treat the trailing array of a structure as 
>> a 
>> + flexible array member for the purpose of accessing the elements of such
>> + an array. By default, all trailing arrays of structures are treated as
> 
> all trailing arrays in aggregates are treated
Okay.
> 
>> + flexible array members. Use the new command-line option
>> + -fstrict-flex-array=level to control how GCC treats the 
>> trailing
>> + array of a structure as a flexible array member at different levels.
> 
> -fstrict-flex-arrays to control which trailing array
> members are treated as flexible arrays.

Okay.
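
For reference, the level semantics under discussion can be written down as a small predicate. This is only a sketch of the behaviour as the GCC manual describes it for -fstrict-flex-arrays=level; the function name and the -1 encoding for an unspecified bound are hypothetical and not part of GCC.

```cpp
#include <cassert>

// Declared bound of a trailing array member as written in the source.
// kUnspecified models `T a[];` (hypothetical encoding for this sketch).
constexpr int kUnspecified = -1;

// Returns true if a trailing array with the given declared bound is treated
// as a flexible array member at the given -fstrict-flex-arrays level.
bool treated_as_flexible(int level, int declared_bound) {
  assert(level >= 0 && level <= 3);
  switch (level) {
    case 0:  return true;                            // every trailing array
    case 1:  return declared_bound <= 1;             // [], [0], [1]
    case 2:  return declared_bound <= 0;             // [], [0]
    default: return declared_bound == kUnspecified;  // [] only
  }
}
```

Under this reading, a trailing `char data[1]` is still treated as a flexible array member at levels 0 and 1 but not at levels 2 and 3.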

> 
> I've also just now noticed that there's now a flag_strict_flex_arrays
> check in the middle-end (in array bound diagnostics) but this option
> isn't streamed or handled with LTO.  I think you want to replace that
> with the appropriate DECL_NOT_FLEXARRAY check.

We need to know the level value of the strict_flex_arrays on the struct field 
to issue proper warnings at different levels.
DECL_NOT_FLEXARRAY does not include such info.
So, what should I do? Streaming the flag_strict_flex_arrays with LTO?

>  We might also want
> to see how inlining accesses from TUs with different -fstrict-flex-arrays
> setting behaves when accessing the same structure (and whether we might
> want to issue an ODR style diagnostic there).

Yes, good point, I will check on this part.

BTW, a stupid question: what does ODR mean?

thanks.

Qing
> 
> Thanks,
> Richard.



[PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-21 Thread Patrick Palka via Gcc-patches
Here during ahead of time checking of C{}, we indirectly call get_nsdmi
for C::m from finish_compound_literal, which in turn calls
break_out_target_exprs for C::m's (non-templated) initializer, during
which we end up building a call to A::~A and checking expr_noexcept_p
for it (from build_vec_delete_1).  But this is all done with
processing_template_decl set, so the built A::~A call is templated
(whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
expr_noexcept_p doesn't expect and we crash.

In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding a
expr_noexcept_p call with !processing_template_decl, which works here
too.  But it seems to me since the initializer we obtain in get_nsdmi is
always non-templated, it should be calling break_out_target_exprs with
processing_template_decl cleared since otherwise the function might end
up mixing templated and non-templated trees.
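
The save, clear and restore idea can be illustrated with a self-contained RAII guard. This is a sketch of the general pattern only: template_depth and depth_sentinel are hypothetical stand-ins, not the front end's processing_template_decl or its sentinel class.

```cpp
#include <cassert>

// Hypothetical stand-in for a global "we are inside a template" counter.
int template_depth = 0;

// RAII guard: remembers the current value, clears it for the lifetime of
// the guard, and restores it when the scope is left.
class depth_sentinel {
  int saved_;
public:
  depth_sentinel() : saved_(template_depth) { template_depth = 0; }
  ~depth_sentinel() { template_depth = saved_; }
  depth_sentinel(const depth_sentinel &) = delete;
  depth_sentinel &operator=(const depth_sentinel &) = delete;
};

// Hypothetical worker that must see the counter cleared; it returns the
// value it observed so the effect is visible to the caller.
int process_non_templated() {
  depth_sentinel guard;
  assert(template_depth == 0);  // cleared while the guard is alive
  return template_depth;
}
```

A caller that sets template_depth to 2 and then calls process_non_templated() gets 0 back, and still sees 2 afterwards, on every exit path from the function.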

I'm not sure about this though, perhaps this is not the best fix here.
Alternatively, when processing_template_decl we could make get_nsdmi
avoid calling break_out_target_exprs at all or something.  Additionally,
perhaps break_out_target_exprs should be a no-op more generally when
processing_template_decl since we shouldn't see any TARGET_EXPRs inside
a template?

Bootstrapped and regtested on x86_64-pc-linux-gnu.

PR c++/108116

gcc/cp/ChangeLog:

* init.cc (get_nsdmi): Clear processing_template_decl before
processing the non-templated initializer.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-template24.C: New test.
---
 gcc/cp/init.cc|  8 ++-
 gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C | 22 +++
 2 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 73e6547c076..c4345ebdaea 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -561,7 +561,8 @@ perform_target_ctor (tree init)
   return init;
 }
 
-/* Return the non-static data initializer for FIELD_DECL MEMBER.  */
+/* Return the non-static data initializer for FIELD_DECL MEMBER.
+   The initializer returned is always non-templated.  */
 
 static GTY((cache)) decl_tree_cache_map *nsdmi_inst;
 
@@ -670,6 +671,11 @@ get_nsdmi (tree member, bool in_ctor, tsubst_flags_t 
complain)
   current_class_ptr = build_address (current_class_ref);
 }
 
+  /* Since INIT is always non-templated clear processing_template_decl
+ before processing it so that we don't interleave templated and
+ non-templated trees.  */
+  processing_template_decl_sentinel ptds;
+
   /* Strip redundant TARGET_EXPR so we don't need to remap it, and
  so the aggregate init code below will see a CONSTRUCTOR.  */
   bool simple_target = (init && SIMPLE_TARGET_EXPR_P (init));
diff --git a/gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C 
b/gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C
new file mode 100644
index 000..202c67d7321
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C
@@ -0,0 +1,22 @@
+// PR c++/108116
+// { dg-do compile { target c++11 } }
+
+#include <initializer_list>
+
+struct A {
+  A(int);
+  ~A();
+};
+
+struct B {
+  B(std::initializer_list<A>);
+};
+
+struct C {
+  B m{0};
+};
+
+template<class>
+void f() {
+  C c = C{};
+};
-- 
2.39.0.95.g7c2ef319c5



Re: [PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-21 Thread Patrick Palka via Gcc-patches
On Wed, 21 Dec 2022, Patrick Palka wrote:

> Here during ahead of time checking of C{}, we indirectly call get_nsdmi
> for C::m from finish_compound_literal, which in turn calls
> break_out_target_exprs for C::m's (non-templated) initializer, during
> which we end up building a call to A::~A and checking expr_noexcept_p
> for it (from build_vec_delete_1).  But this is all done with
> processing_template_decl set, so the built A::~A call is templated
> (whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
> expr_noexcept_p doesn't expect and we crash.
> 
> In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding a
> expr_noexcept_p call with !processing_template_decl, which works here
> too.  But it seems to me since the initializer we obtain in get_nsdmi is
> always non-templated, it should be calling break_out_target_exprs with
> processing_template_decl cleared since otherwise the function might end
> up mixing templated and non-templated trees.
> 
> I'm not sure about this though, perhaps this is not the best fix here.
> Alternatively, when processing_template_decl we could make get_nsdmi
> avoid calling break_out_target_exprs at all or something.  Additionally,
> perhaps break_out_target_exprs should be a no-op more generally when
> processing_template_decl since we shouldn't see any TARGET_EXPRs inside
> a template?
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu.

Note this is a 12 regression, so I suppose there's also the question of
what's safest to backport vs. what's the best fix.

> 
>   PR c++/108116
> 
> gcc/cp/ChangeLog:
> 
>   * init.cc (get_nsdmi): Clear processing_template_decl before
>   processing the non-templated initializer.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp0x/nsdmi-template24.C: New test.
> ---
>  gcc/cp/init.cc|  8 ++-
>  gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C | 22 +++
>  2 files changed, 29 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C
> 
> diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
> index 73e6547c076..c4345ebdaea 100644
> --- a/gcc/cp/init.cc
> +++ b/gcc/cp/init.cc
> @@ -561,7 +561,8 @@ perform_target_ctor (tree init)
>return init;
>  }
>  
> -/* Return the non-static data initializer for FIELD_DECL MEMBER.  */
> +/* Return the non-static data initializer for FIELD_DECL MEMBER.
> +   The initializer returned is always non-templated.  */
>  
>  static GTY((cache)) decl_tree_cache_map *nsdmi_inst;
>  
> @@ -670,6 +671,11 @@ get_nsdmi (tree member, bool in_ctor, tsubst_flags_t 
> complain)
>current_class_ptr = build_address (current_class_ref);
>  }
>  
> +  /* Since INIT is always non-templated clear processing_template_decl
> + before processing it so that we don't interleave templated and
> + non-templated trees.  */
> +  processing_template_decl_sentinel ptds;
> +
>/* Strip redundant TARGET_EXPR so we don't need to remap it, and
>   so the aggregate init code below will see a CONSTRUCTOR.  */
>bool simple_target = (init && SIMPLE_TARGET_EXPR_P (init));
> diff --git a/gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C 
> b/gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C
> new file mode 100644
> index 000..202c67d7321
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C
> @@ -0,0 +1,22 @@
> +// PR c++/108116
> +// { dg-do compile { target c++11 } }
> +
> +#include <initializer_list>
> +
> +struct A {
> +  A(int);
> +  ~A();
> +};
> +
> +struct B {
> +  B(std::initializer_list<A>);
> +};
> +
> +struct C {
> +  B m{0};
> +};
> +
> +template<class>
> +void f() {
> +  C c = C{};
> +};
> -- 
> 2.39.0.95.g7c2ef319c5
> 
> 



Re: [PATCHv2] Use toplevel configure for GMP and MPFR for gdb

2022-12-21 Thread Jeff Law via Gcc-patches




On 12/20/22 20:45, Andrew Pinski wrote:
> On Tue, Dec 20, 2022 at 10:59 AM Tom Tromey wrote:
>>
>> "Andrew" == apinski--- via Gdb-patches writes:
>>
>> Andrew> From: Andrew Pinski
>> Andrew> This patch uses the toplevel configure parts for GMP/MPFR for
>> Andrew> gdb. The only thing is that gdb now requires MPFR for building.
>> Andrew> Before it was a recommended but not required library.
>> Andrew> Also this allows building of GMP and MPFR with the toplevel
>> Andrew> directory just like how it is done for GCC.
>> Andrew> We now error out in the toplevel configure of the version
>> Andrew> of GMP and MPFR that is wrong.
>>
>> Andrew> OK after GDB 13 branches? Build gdb 3 ways:
>> Andrew> with GMP and MPFR in the toplevel (static library used at that point
>> for both)
>> Andrew> With only MPFR in the toplevel (GMP distro library used and MPFR built
>> from source)
>> Andrew> With neither GMP and MPFR in the toplevel (distro libraries used)
>>
>> I think it's fine to move forward with this now.
>> Thank you again for doing this.
>
> Just to double check this is an approval?
>
> Jeff,
>    Just to double check this is still ok for gcc repo even if we are in stage 3?

Yea, it was submitted prior to stage1 closing and approved for GCC once
the GDB folks signed off.

jeff


Re: [PATCHv2] Use toplevel configure for GMP and MPFR for gdb

2022-12-21 Thread Tom Tromey
>> I think it's fine to move forward with this now.
>> Thank you again for doing this.

Andrew> Just to double check this is an approval?

Yes, sorry for being unclear.

Tom


Re: [PATCH] modula2: PR-108119 Disable m2 plugin m2rte

2022-12-21 Thread Gaius Mulley via Gcc-patches
Richard Biener  writes:

>>
>> ChangeLog:
>>
>> * Makefile.def (extra_configure_flags): Add @enable_m2plugin@.
>> * Makefile.in : Rebuilt.
>> * configure : Rebuilt.
>> * configure.ac (host_tools): Remove unused gm2tools.
>> (m2plugin) New AC_ARG_ENABLE.
>> (enable_m2plugin) New AC_SUBST added.
>
> Why's this at the toplevel?

ah sorry - my misunderstanding on how configure and friends work.  I'll
rework the patch local to gcc/m2.  Thanks for spotting this error,

regards,
Gaius


Re: [PATCH] modula2: PR-108119 Disable m2 plugin m2rte

2022-12-21 Thread Rainer Orth
Hi Gaius,

Btw., you've got a couple of formatting errors in your ChangeLog entries:

>>> ChangeLog:
>>>
>>> * Makefile.def (extra_configure_flags): Add @enable_m2plugin@.
>>> * Makefile.in : Rebuilt.
 ^ no blank here.  Besides, the entries are
   usually in present tense ("Rebuild").

>>> * configure.ac (host_tools): Remove unused gm2tools.
>>> (m2plugin) New AC_ARG_ENABLE.
  ^ missing :

and several times more.  Just nits, but best keep the entries consistent
with the GNU Coding Standards.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[Patch] Fortran/OpenMP: Add parsing support for allocators/allocate directive (was: [Patch] Fortran/OpenMP: Add parsing support for allocators directive)

2022-12-21 Thread Tobias Burnus

Related pending (simple) patches - aka *Patch Ping*:

* [Patch] Fortran: Extend align-clause checks of OpenMP's allocate clause
  https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608401.html

* [Patch] OpenMP: Parse align clause in allocate directive in C/C++
  https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608404.html

On 14.12.22 11:47, Tobias Burnus wrote:


> This patch adds parsing/argument-checking support for
>   '!$omp allocators allocate([align(int),allocator(a) :] list)'

This follow-up patch additionally adds parsing support for both
declarative and allocate-stmt-associated '!$omp allocate' directives,
and replaces my previous patch.

OK for mainline?

 * * *

In line with OpenMP 5.1, the code requires that an executable statement
comes before an '!$omp allocate' that is associated with a Fortran
ALLOCATE stmt; a violation is diagnosed.

Note: There is a spec change/regression related to permitting structure
elements: while OpenMP 5.0/5.1 did permit them in the
allocate-stmt-associated "!$omp allocate", OpenMP 5.2 stopped doing so,
and '!$omp allocators' never permitted them. For 'allocate', that seems
to be the accidental result of changing from "permitted unless stated
otherwise" to "rejected unless stated otherwise". For 'allocators', it is
the result of the original 'allocate' clause, which should have been
extended for 'allocators' (or should not).

In any case, that's tracked now in OpenMP's spec issue #3437.

Thoughts? The code rejects var%comp and var(1)%comp etc. for now;
besides the unclear spec status, I admittedly did this also to make
checking easier (like for duplicated entries, entry same as in ALLOCATE
except for trailing array spec etc.).

 * * *

This patch replaces both my previous patch in this thread and also
Abid's patch

"[PATCH 1/5] [gfortran] Add parsing support for allocate directive
(OpenMP 5.0)."
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603258.html

In his patch set, later patches actually add allocator support for
allocatables/pointers only, but there are issues with regard to the used
allocator (see patches + patch review).

As my attached patch raises a sorry, it neither addresses that issue nor
is it affected by that issue.

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; limited liability company; Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: Munich; Register court:
Munich, HRB 106955
Fortran/OpenMP: Add parsing support for allocators/allocate directive

gcc/fortran/ChangeLog:

	* dump-parse-tree.cc (show_omp_namelist): Update allocator, fix
	align dump.
	(show_omp_node, show_code_node): Handle EXEC_OMP_ALLOCATE.
	* gfortran.h (enum gfc_statement): Add ST_OMP_ALLOCATE and ..._EXEC.
	(enum gfc_exec_op): Add EXEC_OMP_ALLOCATE.
	(struct gfc_omp_namelist): Add 'allocator' to 'u2' union.
	(struct gfc_namespace): Add omp_allocate.
	(gfc_resolve_omp_allocate): New.
	* match.cc (gfc_free_omp_namelist): Free 'u2.allocator'.
	* match.h (gfc_match_omp_allocate, gfc_match_omp_allocators): New.
	* openmp.cc (gfc_omp_directives): Uncomment allocate/allocators.
	(gfc_match_omp_variable_list): Add bool arg for
	rejecting listing common-block vars separately.
	(gfc_match_omp_clauses): Update for u2.allocators.
	(OMP_ALLOCATORS_CLAUSES, gfc_match_omp_allocate,
	gfc_match_omp_allocators, is_predefined_allocator,
	gfc_resolve_omp_allocate): New.
	(resolve_omp_clauses): Update 'allocate' clause checks.
	(omp_code_to_statement, gfc_resolve_omp_directive): Handle
	OMP ALLOCATE/ALLOCATORS.
	* parse.cc (in_exec_part): New global var.
	(check_omp_allocate_stmt, parse_openmp_allocate_block): New.
	(decode_omp_directive, case_exec_markers, case_omp_decl,
	gfc_ascii_statement, parse_omp_structured_block): Handle
	OMP allocate/allocators.
	(verify_st_order, parse_executable): Set in_exec_part.
	* resolve.cc (gfc_resolve_blocks, resolve_codes): Handle
	allocate/allocators.
	* st.cc (gfc_free_statement): Likewise.
	* trans.cc (trans_code): Likewise.
	* trans-openmp.cc (gfc_trans_omp_directive): Likewise.
	(gfc_trans_omp_clauses, gfc_split_omp_clauses): Update for
	u2.allocator, fix for u.align.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/allocate-3.f90: Update dg-error.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/allocate-2.f90: Update dg-error.
	* gfortran.dg/gomp/allocate-4.f90: New test.
	* gfortran.dg/gomp/allocate-5.f90: New test.
	* gfortran.dg/gomp/allocate-6.f90: New test.
	* gfortran.dg/gomp/allocate-7.f90: New test.
	* gfortran.dg/gomp/allocators-1.f90: New test.
	* gfortran.dg/gomp/allocators-2.f90: New test.

 gcc/fortran/dump-parse-tree.cc   |   8 +-
 gcc/fortran/gfortran.h   |   9 +-
 gcc/fortran/match.cc |   7 +-
 gcc/fortran/match.h  |   2 +
 gcc/fortran/openmp.cc| 328 +--
 gcc/fortran/parse.cc   

GCC Patch Tracking [https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599882.html]

2022-12-21 Thread Shinde, Yash
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599882.html
Submission Date: Wed, 17 Aug 2022
Upstream-Status: Submitted

The given GCC patch currently has the status ‘Submitted’ but has still not
been taken upstream. Please let us know why the patch has not been taken
upstream, or whether there is any problem with including it upstream.


Regards,
Yash.



[RFC PATCH] RISC-V: Add support for vector crypto extensions

2022-12-21 Thread Christoph Muellner
From: Christoph Müllner 

This series adds basic support for the vector crypto extensions:
* Zvkb
* Zvkg
* Zvkh[a,b]
* Zvkn
* Zvksed
* Zvksh

The implementation follows version 20221220 of the specification,
which can be found here:
  https://github.com/riscv/riscv-crypto/releases/tag/v20221220

Note that this specification is not frozen yet, meaning that
incompatible changes are possible.
Therefore, this patchset is marked as RFC and should not be considered
for upstream inclusion.

All extensions come with (passing) tests for the feature test macros.

A Binutils patch series for vector crypto support can be found here:
  https://sourceware.org/pipermail/binutils/2022-December/125272.html
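
For readers unfamiliar with the pattern, the subextension flags in this series are one bit per extension in a single integer, tested with a mask. The following is a standalone sketch of that pattern only: the bit positions mirror the MASK_ZVK* macros in the diff, while zvk_mask and zvk_flags are hypothetical names that do not exist in GCC.

```cpp
#include <cassert>

// One bit per vector crypto subextension, mirroring the MASK_ZVK* values.
enum zvk_mask : unsigned {
  ZVKB   = 1u << 0,
  ZVKG   = 1u << 1,
  ZVKHA  = 1u << 2,
  ZVKHB  = 1u << 3,
  ZVKN   = 1u << 4,
  ZVKSED = 1u << 5,
  ZVKSH  = 1u << 6,
};

// Hypothetical stand-in for the riscv_zvk_subext target variable.
struct zvk_flags {
  unsigned bits = 0;

  // Record that an extension was requested.
  void enable(zvk_mask m) {
    assert(m != 0);
    bits |= m;
  }

  // Same shape as the TARGET_ZVK* macros: mask the word, compare with zero.
  bool has(zvk_mask m) const { return (bits & m) != 0; }
};
```

Enabling ZVKB and ZVKSH on a fresh zvk_flags leaves bits equal to 0x41, and has() reports true only for those two.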

Signed-off-by: Christoph Müllner 
---
 gcc/common/config/riscv/riscv-common.cc | 16 
 gcc/config/riscv/riscv-opts.h   | 16 
 gcc/config/riscv/riscv.opt  |  3 +++
 gcc/testsuite/gcc.target/riscv/zvkb.c   | 13 +
 gcc/testsuite/gcc.target/riscv/zvkg.c   | 13 +
 gcc/testsuite/gcc.target/riscv/zvkha.c  | 13 +
 gcc/testsuite/gcc.target/riscv/zvkhb.c  | 13 +
 gcc/testsuite/gcc.target/riscv/zvkn.c   | 13 +
 gcc/testsuite/gcc.target/riscv/zvksed.c | 13 +
 gcc/testsuite/gcc.target/riscv/zvksh.c  | 13 +
 10 files changed, 126 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvkb.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvkg.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvkha.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvkhb.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvkn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvksed.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvksh.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 4b7f777c103..dfd654eea24 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -201,6 +201,14 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"zve64f", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zve64d", ISA_SPEC_CLASS_NONE, 1, 0},
 
+  {"zvkb", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zvkg", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zvkha", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zvkhb", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zvkn", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zvksed", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zvksh", ISA_SPEC_CLASS_NONE, 1, 0},
+
   {"zvl32b", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvl64b", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvl128b", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1226,6 +1234,14 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   {"zve64f",   &gcc_options::x_riscv_vector_elen_flags, 
MASK_VECTOR_ELEN_FP_32},
   {"zve64d",   &gcc_options::x_riscv_vector_elen_flags, 
MASK_VECTOR_ELEN_FP_64},
 
+  {"zvkb",     &gcc_options::x_riscv_zvk_subext, MASK_ZVKB},
+  {"zvkg",     &gcc_options::x_riscv_zvk_subext, MASK_ZVKG},
+  {"zvkha",    &gcc_options::x_riscv_zvk_subext, MASK_ZVKHA},
+  {"zvkhb",    &gcc_options::x_riscv_zvk_subext, MASK_ZVKHB},
+  {"zvkn",     &gcc_options::x_riscv_zvk_subext, MASK_ZVKN},
+  {"zvksed",   &gcc_options::x_riscv_zvk_subext, MASK_ZVKSED},
+  {"zvksh",    &gcc_options::x_riscv_zvk_subext, MASK_ZVKSH},
+
   {"zvl32b",    &gcc_options::x_riscv_zvl_flags, MASK_ZVL32B},
   {"zvl64b",    &gcc_options::x_riscv_zvl_flags, MASK_ZVL64B},
   {"zvl128b",   &gcc_options::x_riscv_zvl_flags, MASK_ZVL128B},
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 25fd85b09b1..5b367bd194c 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -132,6 +132,22 @@ enum stack_protector_guard {
 #define TARGET_VECTOR_ELEN_FP_64 \
   ((riscv_vector_elen_flags & MASK_VECTOR_ELEN_FP_64) != 0)
 
+#define MASK_ZVKB   (1 << 0)
+#define MASK_ZVKG   (1 << 1)
+#define MASK_ZVKHA  (1 << 2)
+#define MASK_ZVKHB  (1 << 3)
+#define MASK_ZVKN   (1 << 4)
+#define MASK_ZVKSED (1 << 5)
+#define MASK_ZVKSH  (1 << 6)
+
+#define TARGET_ZVKB    ((riscv_zvk_subext & MASK_ZVKB) != 0)
+#define TARGET_ZVKG    ((riscv_zvk_subext & MASK_ZVKG) != 0)
+#define TARGET_ZVKHA   ((riscv_zvk_subext & MASK_ZVKHA) != 0)
+#define TARGET_ZVKHB   ((riscv_zvk_subext & MASK_ZVKHB) != 0)
+#define TARGET_ZVKN    ((riscv_zvk_subext & MASK_ZVKN) != 0)
+#define TARGET_ZVKSED  ((riscv_zvk_subext & MASK_ZVKSED) != 0)
+#define TARGET_ZVKSH   ((riscv_zvk_subext & MASK_ZVKSH) != 0)
+
 #define MASK_ZVL32B    (1 <<  0)
 #define MASK_ZVL64B    (1 <<  1)
 #define MASK_ZVL128B   (1 <<  2)
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 7c3ca48d1cc..ea24a80d734 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -218,6 +218,9 @@ int riscv_zk_subext
 TargetVariable
 int riscv_vector_elen_flags
 
+TargetVariable
+int riscv_zvk_subext
+
 TargetVariable
 int riscv_zvl_flags
 
diff --git a/gcc/testsuite/gcc.target/riscv/zvkb.c 
b/gcc/tes

Re: [PATCH] modula2: PR-108119 Disable m2 plugin m2rte

2022-12-21 Thread Gaius Mulley via Gcc-patches
Rainer Orth  writes:

> Hi Gaius,
>
> Btw., you've got a couple of formatting errors in your ChangeLog entries:
>
>>>> ChangeLog:
>>>>
>>>> * Makefile.def (extra_configure_flags): Add @enable_m2plugin@.
>>>> * Makefile.in : Rebuilt.
>  ^ no blank here.  Besides, the entries are
>    usually in present tense ("Rebuild").
>
>>>> * configure.ac (host_tools): Remove unused gm2tools.
>>>> (m2plugin) New AC_ARG_ENABLE.
>   ^ missing :
>
> and several times more.  Just nits, but best keep the entries consistent
> with the GNU Coding Standards.
>
> Thanks.
> Rainer

Hi Rainer,

thanks for spotting these - yes I'll double check ChangeLogs in the
future and adjust to present tense,

regards,
Gaius




Re: [PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-21 Thread H.J. Lu via Gcc-patches
On Mon, Dec 19, 2022 at 8:52 PM Hongtao Liu  wrote:
>
> On Thu, Dec 15, 2022 at 3:45 PM Hongtao Liu  wrote:
> >
> > On Thu, Dec 15, 2022 at 3:39 PM Jakub Jelinek  wrote:
> > >
> > > On Thu, Dec 15, 2022 at 02:21:37PM +0800, liuhongt via Gcc-patches wrote:
> > > > --- a/gcc/config/i386/i386.opt
> > > > +++ b/gcc/config/i386/i386.opt
> > > > @@ -420,6 +420,10 @@ mpc80
> > > >  Target RejectNegative
> > > >  Set 80387 floating-point precision to 80-bit.
> > > >
> > > > +mdaz-ftz
> > > > +Target
> > >
> > > s/Target/Driver/
> > Change to Driver and Got error like:cc1: error: command-line option
> > ‘-mdaz-ftz’ is valid for the driver but not for C.
> Hi Jakub:
>   I didn't find a good solution to handle this error after changing
> *Target* to *Driver*, Could you give some hints how to solve this
> problem?
> Or is it ok for you to mark this as *Target*(there won't be any save
> and restore in cfun since there's no variable defined here.)

Since all -m* options are passed to cc1, -mdaz-ftz can't be marked
as Driver.  We need to give it a different name to mark it as Driver.


-- 
H.J.


Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-21 Thread Segher Boessenkool
Hi!

On Wed, Dec 21, 2022 at 05:02:17PM +0800, Kewen.Lin wrote:
> This is a different attempt from Mike's approach[1][2] to fix the
> issue in PR107299.

Ke Wen, Mike: so iiuc with this patch applied all three of Mike's
patches are unnecessary?

> With option -mabi=ieeelongdouble specified,
> type long double (and __float128) and _Float128 have the same
> mode TFmode, but they have different type precisions, it causes
> the assertion to fail in function fold_using_range::fold_stmt.
> IMHO, it's not sensible that two types have the same mode but
> different precisions.

Yes, absolutely.  It is a hack, and always was a hack, just one that
worked remarkably well.  But the problems it worked around have not
been fixed yet (and the workarounds are not perfect either).

> By tracing where we make type _Float128
> have different precision from the precision of its mode, I found
> it's from a work around for rs6000 KFmode.  Being curious why
> we need this work around, I disabled it and tested it with some
> combinations as below, but all were bootstrapped and no
> regression failures were observed.
> 
>   - BE Power8 with --with-long-double-format=ibm
>   - LE Power9 with --with-long-double-format=ibm
>   - LE Power10 with --with-long-double-format=ibm
>   - x86_64-redhat-linux
>   - aarch64-linux-gnu
> 
> For LE Power10 with --with-long-double-format=ieee, since it
> suffered the bootstrapping issue, I compared the regression
> testing results with the one from Mike's approach.  The
> comparison result showed this approach can have several more
> test cases pass and no cases regressed, they are:
> 
> 1) libstdc++:
> 
>   FAIL->PASS: std/format/arguments/args.cc (test for excess errors)
>   FAIL->PASS: std/format/error.cc (test for excess errors)
>   FAIL->PASS: std/format/formatter/requirements.cc (test for excess errors)
>   FAIL->PASS: std/format/functions/format.cc (test for excess errors)
>   FAIL->PASS: std/format/functions/format_to_n.cc (test for excess errors)
>   FAIL->PASS: std/format/functions/size.cc (test for excess errors)
>   FAIL->PASS: std/format/functions/vformat_to.cc (test for excess errors)
>   FAIL->PASS: std/format/parse_ctx.cc (test for excess errors)
>   FAIL->PASS: std/format/string.cc (test for excess errors)
>   FAIL->PASS: std/format/string_neg.cc (test for excess errors)
> 
>   Caused by the same reason: one static assertion fail in header
>   file format (_Type is __ieee128):
> 
> static_assert(__format::__formattable_with<_Type, _Context>);
> 
> 2) gfortran:
> 
>   NA->PASS: gfortran.dg/c-interop/typecodes-array-float128.f90
>   NA->PASS: gfortran.dg/c-interop/typecodes-scalar-float128.f90
>   NA->PASS: gfortran.dg/PR100914.f90
> 
>   Due to that the effective target `fortran_real_c_float128`
>   checking fails, fail to compile below source with error msg:
>   "Error: Kind -4 not supported for type REAL".
> 
> use iso_c_binding
> real(kind=c_float128) :: x
> x = cos (x)
> end
> 
> 3) g++:
> 
>   FAIL->PASS: g++.dg/cpp23/ext-floating1.C  -std=gnu++23
> 
>   Due to the static assertion failing:
> 
> static_assert (is_same::value);

Does it fix the new testcases in Mike's series as well?

> * compatible with long double
> 
> This approach keeps type long double compatible with _Float128
> (at -mabi=ieeelongdouble) as before, so for the simple case
> like:
> 
>   _Float128 foo (long double t) { return t; }
> 
> there is no conversion.  See the difference at optimized
> dumping:
> 
>   with the contrastive approach:
> 
>   _Float128 foo (long double a)
>   {
>     _Float128 _2;
> 
>     <bb 2> [local count: 1073741824]:
>     _2 = (_Float128) a_1(D);
>     return _2;
>   }
> 
>   with this:
> 
>   _Float128 foo (long double a)
>   {
>     <bb 2> [local count: 1073741824]:
>     return a_1(D);
>   }
> 
> IMHO, it's still good to keep _Float128 and __float128
> compatible with long double, to get rid of those useless
> type conversions.
> 
> Besides, this approach still takes TFmode attribute type
> as type long double, while the contrastive approach takes
> TFmode attribute type as type _Float128, whose corresponding
> mode isn't TFmode, the inconsistence seems not good.
> 
> As above, I wonder if we can consider this approach which
> makes type _Float128 have the same precision as MODE_PRECISION
> of its mode, it keeps the previous implementation to make type
> long double compatible with _Float128.  Since the REAL_MODE_FORMAT
> of the mode still holds the information, even if some place which
> isn't covered in the above testing need the actual precision, we
> can still retrieve the actual precision with that.

"Precision" does not have a useful meaning for all floating point
formats.  It does not have one for double-double for example.  The way
precision is defined in IEEE FP means double-double has 2048 bits of
precision (or is it 2047), not useful.  Taking the precision of the
format instead to be the minimum over all values in that format gives
double-doub

Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-21 Thread Joseph Myers
On Wed, 21 Dec 2022, Segher Boessenkool wrote:

> > --- a/gcc/tree.cc
> > +++ b/gcc/tree.cc
> > @@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char)
> >if (!targetm.floatn_mode (n, extended).exists (&mode))
> > continue;
> >int precision = GET_MODE_PRECISION (mode);
> > -  /* Work around the rs6000 KFmode having precision 113 not
> > -128.  */
> 
> It has precision 126 now fwiw.
> 
> Joseph: what do you think about this patch?  Is the workaround it
> removes still useful in any way, do we need to do that some other way if
> we remove this?

I think it's best for the TYPE_PRECISION, for any type with the binary128 
format, to be 128 (not 126).

It's necessary that _Float128, _Float64x and long double all have the same 
TYPE_PRECISION when they have the same (binary128) format, or at least 
that TYPE_PRECISION for _Float128 >= that for long double >= that for 
_Float64x, so that the rules in c_common_type apply properly.

How the TYPE_PRECISION compares to that of __ibm128, or of long double 
when that's double-double, is less important.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-21 Thread Jason Merrill via Gcc-patches

On 12/21/22 09:52, Patrick Palka wrote:

> Here during ahead of time checking of C{}, we indirectly call get_nsdmi
> for C::m from finish_compound_literal, which in turn calls
> break_out_target_exprs for C::m's (non-templated) initializer, during
> which we end up building a call to A::~A and checking expr_noexcept_p
> for it (from build_vec_delete_1).  But this is all done with
> processing_template_decl set, so the built A::~A call is templated
> (whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
> expr_noexcept_p doesn't expect and we crash.
> 
> In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding a
> expr_noexcept_p call with !processing_template_decl, which works here
> too.  But it seems to me since the initializer we obtain in get_nsdmi is
> always non-templated, it should be calling break_out_target_exprs with
> processing_template_decl cleared since otherwise the function might end
> up mixing templated and non-templated trees.
> 
> I'm not sure about this though, perhaps this is not the best fix here.
> Alternatively, when processing_template_decl we could make get_nsdmi
> avoid calling break_out_target_exprs at all or something.  Additionally,
> perhaps break_out_target_exprs should be a no-op more generally when
> processing_template_decl since we shouldn't see any TARGET_EXPRs inside
> a template?


Hmm.

Any time we would call break_out_target_exprs we're dealing with 
non-dependent expressions; if we're in a template, we're building up an 
initializer or a call that we'll soon throw away, just for the purpose 
of checking or type computation.


Furthermore, as you say, the argument is always a non-template tree, 
whether in get_nsdmi or convert_default_arg.  So having 
processing_template_decl cleared would be correct.


I don't think we can get away with not calling break_out_target_exprs at 
all in a template; if nothing else, we would lose immediate invocation 
expansion.  However, we could probably skip the bot_manip tree walk, 
which should avoid the problem.


Either way we end up returning non-template trees, as we do now, and 
callers have to deal with transient CONSTRUCTORs containing such (as we 
do in massage_init_elt).


Does convert_default_arg not run into the same problem, e.g. when calling

  void g(B = {0});

?


Bootstrapped and regtested on x86_64-pc-linux-gnu.

PR c++/108116

gcc/cp/ChangeLog:

* init.cc (get_nsdmi): Clear processing_template_decl before
processing the non-templated initializer.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-template24.C: New test.
---
  gcc/cp/init.cc|  8 ++-
  gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C | 22 +++
  2 files changed, 29 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 73e6547c076..c4345ebdaea 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -561,7 +561,8 @@ perform_target_ctor (tree init)
return init;
  }
  
-/* Return the non-static data initializer for FIELD_DECL MEMBER.  */
+/* Return the non-static data initializer for FIELD_DECL MEMBER.
+   The initializer returned is always non-templated.  */
  
  static GTY((cache)) decl_tree_cache_map *nsdmi_inst;
  
@@ -670,6 +671,11 @@ get_nsdmi (tree member, bool in_ctor, tsubst_flags_t complain)
current_class_ptr = build_address (current_class_ref);
  }
  
+  /* Since INIT is always non-templated clear processing_template_decl
+     before processing it so that we don't interleave templated and
+     non-templated trees.  */
+  processing_template_decl_sentinel ptds;
+
/* Strip redundant TARGET_EXPR so we don't need to remap it, and
   so the aggregate init code below will see a CONSTRUCTOR.  */
bool simple_target = (init && SIMPLE_TARGET_EXPR_P (init));
diff --git a/gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C 
b/gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C
new file mode 100644
index 000..202c67d7321
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C
@@ -0,0 +1,22 @@
+// PR c++/108116
+// { dg-do compile { target c++11 } }
+
+#include <initializer_list>
+
+struct A {
+  A(int);
+  ~A();
+};
+
+struct B {
+  B(std::initializer_list<A>);
+};
+
+struct C {
+  B m{0};
+};
+
+template
+void f() {
+  C c = C{};
+};
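
As an illustrative aside (not part of the posted patch): the `processing_template_decl_sentinel ptds;` line in the init.cc hunk appears to rely on the usual RAII pattern of saving a global, clearing it, and restoring it on scope exit. The toy model below is only a sketch under that assumption; `template_depth`, `depth_sentinel`, `depth_seen_by_worker` and `sentinel_demo` are made-up names, not GCC identifiers.

```cpp
#include <cassert>

// Stand-in for the front end's global "inside a template" counter.
// Hypothetical name; it only models processing_template_decl.
static int template_depth = 0;

// RAII guard: remember the current value, clear it, restore it on scope exit.
struct depth_sentinel {
  int saved;
  depth_sentinel() : saved(template_depth) { template_depth = 0; }
  ~depth_sentinel() { template_depth = saved; }
  depth_sentinel(const depth_sentinel &) = delete;
  depth_sentinel &operator=(const depth_sentinel &) = delete;
};

// Hypothetical worker that must only see a non-template context.
int depth_seen_by_worker() {
  depth_sentinel guard;    // clears template_depth for this scope
  return template_depth;   // evaluated before the guard is destroyed
}

// Simulate a call made while "inside a template" (depth 1).
bool sentinel_demo() {
  template_depth = 1;
  int seen = depth_seen_by_worker();
  bool restored = (template_depth == 1);
  template_depth = 0;
  return seen == 0 && restored;
}
```

The point of the guard is that every exit path from the guarded scope restores the saved value, so the caller's template context is left untouched.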




[PATCH 0/2] __bos and flex arrays

2022-12-21 Thread Siddhesh Poyarekar
Hi,

The first patch in the series is just a minor test cleanup that I did to
make sure all tests in a test case run (instead of aborting at first
failure) and print the ones that failed.  The second patch is the actual
fix.

The patch intends to make __bos/__bdos do the right thing with structs
containing flex arrays, either directly or within nested structs and
unions.  This should improve minimum object size estimation in some
cases and also bail out more consistently so that flex arrays don't
cause false positives in fortification.

I've tested this with a bootstrap on x86_64 and also with
--with-build-config=bootstrap-ubsan to make sure that there are no new
failures due to this change.

Siddhesh Poyarekar (2):
  testsuite: Run __bos tests to completion
  tree-object-size: More consistent behaviour with flex arrays

 .../g++.dg/ext/builtin-object-size1.C | 267 
 .../g++.dg/ext/builtin-object-size2.C | 267 
 .../gcc.dg/builtin-dynamic-object-size-0.c|  14 +-
 gcc/testsuite/gcc.dg/builtin-object-size-1.c  | 263 
 gcc/testsuite/gcc.dg/builtin-object-size-12.c |  12 +-
 gcc/testsuite/gcc.dg/builtin-object-size-13.c |  17 +-
 gcc/testsuite/gcc.dg/builtin-object-size-15.c |  11 +-
 gcc/testsuite/gcc.dg/builtin-object-size-2.c  | 287 +-
 gcc/testsuite/gcc.dg/builtin-object-size-3.c  | 263 
 gcc/testsuite/gcc.dg/builtin-object-size-4.c  | 267 
 gcc/testsuite/gcc.dg/builtin-object-size-6.c  | 267 
 gcc/testsuite/gcc.dg/builtin-object-size-7.c  |  52 ++--
 gcc/testsuite/gcc.dg/builtin-object-size-8.c  |  17 +-
 .../gcc.dg/builtin-object-size-common.h   |  12 +
 .../gcc.dg/builtin-object-size-flex-common.h  |  90 ++
 ...n-object-size-flex-nested-struct-nonzero.c |   6 +
 ...ltin-object-size-flex-nested-struct-zero.c |   6 +
 .../builtin-object-size-flex-nested-struct.c  |  22 ++
 ...in-object-size-flex-nested-union-nonzero.c |   6 +
 ...iltin-object-size-flex-nested-union-zero.c |   6 +
 .../builtin-object-size-flex-nested-union.c   |  28 ++
 .../gcc.dg/builtin-object-size-flex-nonzero.c |   6 +
 .../gcc.dg/builtin-object-size-flex-zero.c|   6 +
 .../gcc.dg/builtin-object-size-flex.c |  18 ++
 gcc/testsuite/gcc.dg/pr101836.c   |  11 +-
 gcc/testsuite/gcc.dg/strict-flex-array-3.c|  11 +-
 gcc/tree-object-size.cc   | 150 -
 27 files changed, 1275 insertions(+), 1107 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-common.h
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-common.h
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-struct-nonzero.c
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-struct-zero.c
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-struct.c
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-union-nonzero.c
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-union-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-union.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nonzero.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex.c

-- 
2.38.1



[PATCH 2/2] tree-object-size: More consistent behaviour with flex arrays

2022-12-21 Thread Siddhesh Poyarekar
The tree object size pass tries to fail when it detects a flex array in
the struct, but it ends up doing the right thing only when the flex
array is in the outermost struct.  For nested cases (such as arrays
nested in a union or an inner struct), it ends up taking whatever value
the flex array is declared with, using zero for the standard flex array,
i.e. [].

Rework subobject size computation to make it more consistent across the
board, honoring -fstrict-flex-arrays.  With this change, any array at
the end of the struct will end up causing __bos to use the allocated
value of the outer object, bailing out in the maximum case when it can't
find it.  In the minimum case, it will return the subscript value or the
allocated value of the outer object, whichever is larger.
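
For readers unfamiliar with the nested case, here is a minimal sketch (not taken from the patch or its tests) of the kind of query being discussed. The `inner` and `outer` types and the two helper functions are hypothetical. The values actually returned depend on the compiler version, the optimization level and -fstrict-flex-arrays, so the sketch deliberately does not state them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration only.
struct inner {
  int len;
  char data[1];   // trailing one-element array, often used like a flex array
};

struct outer {
  int tag;
  inner in;       // the trailing array is nested inside an inner struct
};

// Maximum estimate for the subobject (type 1).
// (size_t) -1 is the documented "unknown" result for types 0 and 1.
std::size_t max_subobject_size(outer *p) {
  return __builtin_object_size(&p->in.data, 1);
}

// Minimum estimate for the subobject (type 3).
// 0 is the documented "unknown" result for types 2 and 3.
std::size_t min_subobject_size(outer *p) {
  return __builtin_object_size(&p->in.data, 3);
}
```

Whether such a nested trailing array is treated as a flexible array is exactly what the rework above is meant to make consistent.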

gcc/ChangeLog:

PR tree-optimization/107952
* tree-object-size.cc (size_from_objects): New function.
(addr_object_size): Call it.  Fully rely on
array_ref_flexible_size_p call to determine flex array.

gcc/testsuite/ChangeLog:

PR tree-optimization/107952
* g++.dg/ext/builtin-object-size1.C (test1, test6, test7,
test8): Adjust expected result for object size type 3 and 1.
* g++.dg/ext/builtin-object-size2.C (test1, test6, test7,
test8): Likewise.
* gcc.dg/builtin-object-size-13.c (main): Likewise.
* gcc.dg/builtin-object-size-6.c (test1, test6, test7, test8):
Likewise.
* gcc.dg/builtin-object-size-8.c (main): Likewise.
* gcc.dg/builtin-object-size-flex-common.h: Common code for new
tests.
* gcc.dg/builtin-object-size-flex-nested-struct-nonzero.c: New
test.
* gcc.dg/builtin-object-size-flex-nested-struct-zero.c: New
test.
* gcc.dg/builtin-object-size-flex-nested-struct.c: New test.
* gcc.dg/builtin-object-size-flex-nested-union-nonzero.c: New
test.
* gcc.dg/builtin-object-size-flex-nested-union-zero.c: New test.
* gcc.dg/builtin-object-size-flex-nested-union.c: New test.
* gcc.dg/builtin-object-size-flex-nonzero.c: New test.
* gcc.dg/builtin-object-size-flex-zero.c: New test.
* gcc.dg/builtin-object-size-flex.c: New test.

Signed-off-by: Siddhesh Poyarekar 
---
 .../g++.dg/ext/builtin-object-size1.C |  10 +-
 .../g++.dg/ext/builtin-object-size2.C |  10 +-
 gcc/testsuite/gcc.dg/builtin-object-size-13.c |   4 +-
 gcc/testsuite/gcc.dg/builtin-object-size-6.c  |  10 +-
 gcc/testsuite/gcc.dg/builtin-object-size-8.c  |   4 +-
 .../gcc.dg/builtin-object-size-flex-common.h  |  90 +++
 ...n-object-size-flex-nested-struct-nonzero.c |   6 +
 ...ltin-object-size-flex-nested-struct-zero.c |   6 +
 .../builtin-object-size-flex-nested-struct.c  |  22 +++
 ...in-object-size-flex-nested-union-nonzero.c |   6 +
 ...iltin-object-size-flex-nested-union-zero.c |   6 +
 .../builtin-object-size-flex-nested-union.c   |  28 
 .../gcc.dg/builtin-object-size-flex-nonzero.c |   6 +
 .../gcc.dg/builtin-object-size-flex-zero.c|   6 +
 .../gcc.dg/builtin-object-size-flex.c |  18 +++
 gcc/tree-object-size.cc   | 150 ++
 16 files changed, 265 insertions(+), 117 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-common.h
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-struct-nonzero.c
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-struct-zero.c
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-struct.c
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-union-nonzero.c
 create mode 100644 
gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-union-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-union.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nonzero.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex.c

diff --git a/gcc/testsuite/g++.dg/ext/builtin-object-size1.C 
b/gcc/testsuite/g++.dg/ext/builtin-object-size1.C
index 165b415683b..5b863637123 100644
--- a/gcc/testsuite/g++.dg/ext/builtin-object-size1.C
+++ b/gcc/testsuite/g++.dg/ext/builtin-object-size1.C
@@ -103,7 +103,7 @@ test1 (A *p)
 FAIL ();
   if (__builtin_object_size (&p->b, 3) != sizeof (p->b))
 FAIL ();
-  if (__builtin_object_size (&p->c, 3) != 0)
+  if (__builtin_object_size (&p->c, 3) != 10)
 FAIL ();
   c = p->a;
   if (__builtin_object_size (c, 3) != sizeof (p->a))
@@ -118,7 +118,7 @@ test1 (A *p)
   if (__builtin_object_size (c, 3) != sizeof (p->b))
 FAIL ();
   c = (char *) &p->c;
-  if (__builtin_object_size (c, 3) != 0)
+  if (__builtin_object_size (c, 3) != 10)
 FAIL ();
 }
 
@@ -344,7 +344,7 @@ test6 (struct D *d)
 {
   if (__builtin_object_size (&d->j.a[3], 0) != (size_t) -1)
 FAIL ();
-  if (__builtin_object_size (&d->j.a[3

[PATCH 1/2] testsuite: Run __bos tests to completion

2022-12-21 Thread Siddhesh Poyarekar
Instead of failing on first error, run all __builtin_object_size and
__builtin_dynamic_object_size tests to completion and then provide a
summary of which tests failed.
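
The idea can be sketched as follows. This is only an assumption about the general shape, not the contents of builtin-object-size-common.h; `count_mismatches` is a hypothetical helper added to make the sketch self-contained.

```cpp
#include <cassert>
#include <cstdio>

// Sketch only: these definitions are assumptions, not the real header.
static unsigned nfails = 0;

// Record a failed check and keep going instead of aborting.
#define FAIL() \
  do { \
    std::printf("Failure at line: %d\n", __LINE__); \
    ++nfails; \
  } while (0)

// Compare every element; each mismatch is reported, none stops the run.
unsigned count_mismatches(const int *got, const int *want, int n) {
  nfails = 0;
  for (int i = 0; i < n; ++i)
    if (got[i] != want[i])
      FAIL();
  return nfails;
}
```

A final step (the ChangeLog calls it DONE) would then turn a non-zero count into a test failure once all checks have run.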

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-dynamic-object-size-0.c: Move FAIL and nfail
into...
* gcc.dg/builtin-object-size-common.h: ... new file.
* g++.dg/ext/builtin-object-size1.C: Include
builtin-object-size-common.h.  Replace all abort with FAIL.
(main): Call DONE.
* g++.dg/ext/builtin-object-size2.C: Likewise.
* gcc.dg/builtin-object-size-1.c: Likewise.
* gcc.dg/builtin-object-size-12.c: Likewise.
* gcc.dg/builtin-object-size-13.c: Likewise.
* gcc.dg/builtin-object-size-15.c: Likewise.
* gcc.dg/builtin-object-size-2.c: Likewise.
* gcc.dg/builtin-object-size-3.c: Likewise.
* gcc.dg/builtin-object-size-4.c: Likewise.
* gcc.dg/builtin-object-size-6.c: Likewise.
* gcc.dg/builtin-object-size-7.c: Likewise.
* gcc.dg/builtin-object-size-8.c: Likewise.
* gcc.dg/pr101836.c: Likewise.
* gcc.dg/strict-flex-array-3.c: Likewise.

Signed-off-by: Siddhesh Poyarekar 
---
 .../g++.dg/ext/builtin-object-size1.C | 257 
 .../g++.dg/ext/builtin-object-size2.C | 257 
 .../gcc.dg/builtin-dynamic-object-size-0.c|  14 +-
 gcc/testsuite/gcc.dg/builtin-object-size-1.c  | 263 
 gcc/testsuite/gcc.dg/builtin-object-size-12.c |  12 +-
 gcc/testsuite/gcc.dg/builtin-object-size-13.c |  13 +-
 gcc/testsuite/gcc.dg/builtin-object-size-15.c |  11 +-
 gcc/testsuite/gcc.dg/builtin-object-size-2.c  | 287 +-
 gcc/testsuite/gcc.dg/builtin-object-size-3.c  | 263 
 gcc/testsuite/gcc.dg/builtin-object-size-4.c  | 267 
 gcc/testsuite/gcc.dg/builtin-object-size-6.c  | 257 
 gcc/testsuite/gcc.dg/builtin-object-size-7.c  |  52 ++--
 gcc/testsuite/gcc.dg/builtin-object-size-8.c  |  13 +-
 .../gcc.dg/builtin-object-size-common.h   |  12 +
 gcc/testsuite/gcc.dg/pr101836.c   |  11 +-
 gcc/testsuite/gcc.dg/strict-flex-array-3.c|  11 +-
 16 files changed, 1010 insertions(+), 990 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-common.h

diff --git a/gcc/testsuite/g++.dg/ext/builtin-object-size1.C 
b/gcc/testsuite/g++.dg/ext/builtin-object-size1.C
index 8590a0bbebd..165b415683b 100644
--- a/gcc/testsuite/g++.dg/ext/builtin-object-size1.C
+++ b/gcc/testsuite/g++.dg/ext/builtin-object-size1.C
@@ -1,8 +1,9 @@
 // { dg-do run }
 // { dg-options "-O2" }
 
+#include "../../gcc.dg/builtin-object-size-common.h"
+
 typedef __SIZE_TYPE__ size_t;
-extern "C" void abort ();
 extern "C" void exit (int);
 extern "C" void *malloc (size_t);
 extern "C" void free (void *);
@@ -20,105 +21,105 @@ test1 (A *p)
 {
   char *c;
   if (__builtin_object_size (&p->a, 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->a[0], 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->a[3], 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->b, 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = p->a;
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = &p->a[0];
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = &p->a[3];
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = (char *) &p->b;
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = (char *) &p->c;
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->a, 1) != sizeof (p->a))
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->a[0], 1) != sizeof (p->a))
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->a[3], 1) != sizeof (p->a) - 3)
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->b, 1) != sizeof (p->b))
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->c, 1) != (size_t) -1)
-abort ();
+FAIL ();
   c = p->a;
   if (__builtin_object_size (c, 1) != sizeof (p->a))
-abort ();
+FAIL ();
   c = &p->a[0];
   if (__builtin_object_size (c, 1) != sizeof (p->a))
-abort ();
+FAIL ();
   c = &p->a[3];
   if (__builtin_object_size (c, 1) != sizeof (p->a) - 3)
-abort ();
+FAIL ();
   c = (char *) &p->b;
   if (__builtin_object_size (c, 1) != sizeof (p->b))
-abort ();
+FAIL ();
   c = (char *) &p->c;
   if (__builtin_object_size (c, 1) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->a, 2) != 0)
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->a[0], 2) != 0)
-abort ();
+FAIL ();
   if (__builtin_object_size (&p->a[3], 2) != 0)
-abort ();

Re: [PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Dec 21, 2022 at 12:20:23PM -0800, H.J. Lu wrote:
> On Mon, Dec 19, 2022 at 8:52 PM Hongtao Liu  wrote:
> >
> > On Thu, Dec 15, 2022 at 3:45 PM Hongtao Liu  wrote:
> > >
> > > On Thu, Dec 15, 2022 at 3:39 PM Jakub Jelinek  wrote:
> > > >
> > > > On Thu, Dec 15, 2022 at 02:21:37PM +0800, liuhongt via Gcc-patches 
> > > > wrote:
> > > > > --- a/gcc/config/i386/i386.opt
> > > > > +++ b/gcc/config/i386/i386.opt
> > > > > @@ -420,6 +420,10 @@ mpc80
> > > > >  Target RejectNegative
> > > > >  Set 80387 floating-point precision to 80-bit.
> > > > >
> > > > > +mdaz-ftz
> > > > > +Target
> > > >
> > > > s/Target/Driver/
> > > Change to Driver and Got error like:cc1: error: command-line option
> > > ‘-mdaz-ftz’ is valid for the driver but not for C.
> > Hi Jakub:
> >   I didn't find a good solution to handle this error after changing
> > *Target* to *Driver*, Could you give some hints how to solve this
> > problem?
> > Or is it ok for you to mark this as *Target*(there won't be any save
> > and restore in cfun since there's no variable defined here.)
> 
> Since all -m* options are passed to cc1, -mdaz-ftz can't be marked
> as Driver.  We need to give it a different name to mark it as Driver.

It is ok like that.

Jakub



Re: [PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-21 Thread H.J. Lu via Gcc-patches
On Wed, Dec 21, 2022 at 2:35 PM Jakub Jelinek  wrote:
>
> On Wed, Dec 21, 2022 at 12:20:23PM -0800, H.J. Lu wrote:
> > On Mon, Dec 19, 2022 at 8:52 PM Hongtao Liu  wrote:
> > >
> > > On Thu, Dec 15, 2022 at 3:45 PM Hongtao Liu  wrote:
> > > >
> > > > On Thu, Dec 15, 2022 at 3:39 PM Jakub Jelinek  wrote:
> > > > >
> > > > > On Thu, Dec 15, 2022 at 02:21:37PM +0800, liuhongt via Gcc-patches 
> > > > > wrote:
> > > > > > --- a/gcc/config/i386/i386.opt
> > > > > > +++ b/gcc/config/i386/i386.opt
> > > > > > @@ -420,6 +420,10 @@ mpc80
> > > > > >  Target RejectNegative
> > > > > >  Set 80387 floating-point precision to 80-bit.
> > > > > >
> > > > > > +mdaz-ftz
> > > > > > +Target
> > > > >
> > > > > s/Target/Driver/
> > > > Change to Driver and Got error like:cc1: error: command-line option
> > > > ‘-mdaz-ftz’ is valid for the driver but not for C.
> > > Hi Jakub:
> > >   I didn't find a good solution to handle this error after changing
> > > *Target* to *Driver*, Could you give some hints how to solve this
> > > problem?
> > > Or is it ok for you to mark this as *Target*(there won't be any save
> > > and restore in cfun since there's no variable defined here.)
> >
> > Since all -m* options are passed to cc1, -mdaz-ftz can't be marked
> > as Driver.  We need to give it a different name to mark it as Driver.
>
> It is ok like that.
>
> Jakub
>

The GCC driver handles -mno-XXX automatically for -mXXX.  Using a
different name would require handling the negation.  Or we can do something
like this to check for CL_DRIVER before passing it to cc1.
-- 
H.J.
diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 2568d541196..87cbea11ae1 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -3851,7 +3851,7 @@ alloc_switch (void)
 
 static void
 save_switch (const char *opt, size_t n_args, const char *const *args,
-	 bool validated, bool known)
+	 bool validated, bool known, bool driver = false)
 {
   alloc_switch ();
   switches[n_switches].part1 = opt + 1;
@@ -3868,6 +3868,7 @@ save_switch (const char *opt, size_t n_args, const char *const *args,
   switches[n_switches].validated = validated;
   switches[n_switches].known = known;
   switches[n_switches].ordering = 0;
+  switches[n_switches].driver = driver;
   n_switches++;
 }
 
@@ -4575,7 +4576,8 @@ driver_handle_option (struct gcc_options *opts,
   if (do_save)
 save_switch (decoded->canonical_option[0],
 		 decoded->canonical_option_num_elements - 1,
-		 &decoded->canonical_option[1], validated, true);
+		 &decoded->canonical_option[1], validated, true,
+		 cl_options[opt_index].flags == CL_DRIVER);
   return true;
 }
 
@@ -7465,7 +7467,8 @@ check_live_switch (int switchnum, int prefix_length)
 static void
 give_switch (int switchnum, int omit_first_word)
 {
-  if ((switches[switchnum].live_cond & SWITCH_IGNORE) != 0)
+  if ((switches[switchnum].live_cond & SWITCH_IGNORE) != 0
+  || switches[switchnum].driver)
 return;
 
   if (!omit_first_word)
diff --git a/gcc/opts.h b/gcc/opts.h
index ce4fd5c39b9..2900f0d9168 100644
--- a/gcc/opts.h
+++ b/gcc/opts.h
@@ -561,6 +561,7 @@ struct switchstr
   bool known;
   bool validated;
   bool ordering;
+  bool driver;
 };
 
 #endif
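
As a rough illustration of the effect this change aims for (a toy model, not driver code): options recorded as driver-only are skipped when the argument list for the compiler proper is assembled. `toy_switch` and `args_for_cc1` are hypothetical names, and marking `mdaz-ftz` as driver-only in the usage below is an assumption made for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an entry in the driver's switch table.
// The field names echo the patch; the type itself is hypothetical.
struct toy_switch {
  std::string name;  // option text without the leading '-'
  bool ignore;       // models the SWITCH_IGNORE condition
  bool driver;       // set when the option is driver-only
};

// Collect the options that would be forwarded to the compiler proper.
std::vector<std::string> args_for_cc1(const std::vector<toy_switch> &switches) {
  std::vector<std::string> out;
  for (const toy_switch &sw : switches) {
    if (sw.ignore || sw.driver)
      continue;  // never forwarded
    out.push_back("-" + sw.name);
  }
  return out;
}
```

For example, given `O2`, `mdaz-ftz` (flagged driver-only) and `mavx2`, only `-O2` and `-mavx2` would be forwarded.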


Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Dec 21, 2022 at 09:40:24PM +, Joseph Myers wrote:
> On Wed, 21 Dec 2022, Segher Boessenkool wrote:
> 
> > > --- a/gcc/tree.cc
> > > +++ b/gcc/tree.cc
> > > @@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char)
> > >if (!targetm.floatn_mode (n, extended).exists (&mode))
> > >   continue;
> > >int precision = GET_MODE_PRECISION (mode);
> > > -  /* Work around the rs6000 KFmode having precision 113 not
> > > -  128.  */
> > 
> > It has precision 126 now fwiw.
> > 
> > Joseph: what do you think about this patch?  Is the workaround it
> > removes still useful in any way, do we need to do that some other way if
> > we remove this?
> 
> I think it's best for the TYPE_PRECISION, for any type with the binary128 
> format, to be 128 (not 126).

Agreed.

> It's necessary that _Float128, _Float64x and long double all have the same 
> TYPE_PRECISION when they have the same (binary128) format, or at least 
> that TYPE_PRECISION for _Float128 >= that for long double >= that for 
> _Float64x, so that the rules in c_common_type apply properly.
> 
> How the TYPE_PRECISION compares to that of __ibm128, or of long double 
> when that's double-double, is less important.

I guess it can affect the common type for {long double (when binary128),
_Float128, _Float64x, __float128, __ieee128} vs. {long double (when ibm128),
__ibm128}, especially in C (for C++ only when non-standard types are
involved: __float128, __ieee128, __ibm128).
But I think unless we error (e.g. in C++ when we see unordered floating
point types), preferring binary128 is better; it certainly has a much bigger
exponent range than __ibm128 and most of the time also more precision
(__ibm128 wastes some bits on the other exponent).

Jakub



Re: [PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Dec 21, 2022 at 02:43:43PM -0800, H.J. Lu wrote:
> > > > > > >  Target RejectNegative
> > > > > > >  Set 80387 floating-point precision to 80-bit.
> > > > > > >
> > > > > > > +mdaz-ftz
> > > > > > > +Target
> > > > > >
> > > > > > s/Target/Driver/
> > > > > Change to Driver and Got error like:cc1: error: command-line option
> > > > > ‘-mdaz-ftz’ is valid for the driver but not for C.
> > > > Hi Jakub:
> > > >   I didn't find a good solution to handle this error after changing
> > > > *Target* to *Driver*, Could you give some hints how to solve this
> > > > problem?
> > > > Or is it ok for you to mark this as *Target*(there won't be any save
> > > > and restore in cfun since there's no variable defined here.)
> > >
> > > Since all -m* options are passed to cc1, -mdaz-ftz can't be marked
> > > as Driver.  We need to give it a different name to mark it as Driver.
> >
> > It is ok like that.
> >
> > Jakub
> >
> 
> The GCC driver handles -mno-XXX automatically for -mXXX.  Use
> a different name needs to handle the negation.   Or we can do something
> like this to check for CL_DRIVER before passing it to cc1.

I meant I'm ok with -m{,no-}daz-ftz option being Target rather than Driver.

Jakub



Re: [PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-21 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 22, 2022 at 6:46 AM Jakub Jelinek  wrote:
>
> On Wed, Dec 21, 2022 at 02:43:43PM -0800, H.J. Lu wrote:
> > > > > > > >  Target RejectNegative
> > > > > > > >  Set 80387 floating-point precision to 80-bit.
> > > > > > > >
> > > > > > > > +mdaz-ftz
> > > > > > > > +Target
> > > > > > >
> > > > > > > s/Target/Driver/
> > > > > > Change to Driver and Got error like:cc1: error: command-line option
> > > > > > ‘-mdaz-ftz’ is valid for the driver but not for C.
> > > > > Hi Jakub:
> > > > >   I didn't find a good solution to handle this error after changing
> > > > > *Target* to *Driver*, Could you give some hints how to solve this
> > > > > problem?
> > > > > Or is it ok for you to mark this as *Target*(there won't be any save
> > > > > and restore in cfun since there's no variable defined here.)
> > > >
> > > > Since all -m* options are passed to cc1, -mdaz-ftz can't be marked
> > > > as Driver.  We need to give it a different name to mark it as Driver.
> > >
> > > It is ok like that.
> > >
> > > Jakub
> > >
> >
> > The GCC driver handles -mno-XXX automatically for -mXXX.  Use
> > a different name needs to handle the negation.   Or we can do something
> > like this to check for CL_DRIVER before passing it to cc1.
>
> I meant I'm ok with -m{,no-}daz-ftz option being Target rather than Driver.
>
Thanks.
Uros, is the patch OK for you?
> Jakub
>


-- 
BR,
Hongtao


Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-21 Thread Kewen.Lin via Gcc-patches
Hi Segher,

on 2022/12/22 05:24, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Dec 21, 2022 at 05:02:17PM +0800, Kewen.Lin wrote:
>> This is a different attempt from Mike's approach[1][2] to fix the
>> issue in PR107299.
> 
> Ke Wen, Mike: so iiuc with this patch applied all three of Mike's
> patches are unnecessary?

I think the 1st patch is still needed, it's to fix a latent issue
as the associated test cases {mul,div}ic3-1.c show.

> Does it fix the new testcases in Mike's series as well?

Yeah, it doesn't suffer the issue exposed by float128-hw4.c, so
it doesn't need that adjustment on float128-hw4.c.  It can also
make newly added float128-hw{12,13}.c pass.

>> As above, I wonder if we can consider this approach which
>> makes type _Float128 have the same precision as MODE_PRECISION
>> of its mode, it keeps the previous implementation to make type
>> long double compatible with _Float128.  Since the REAL_MODE_FORMAT
>> of the mode still holds the information, even if some place which
>> isn't covered in the above testing need the actual precision, we
>> can still retrieve the actual precision with that.
> 
> "Precision" does not have a useful meaning for all floating point
> formats.  It does not have one for double-double for example.  The way
> precision is defined in IEEE FP means double-double has 2048 bits of
> precision (or is it 2047), not useful.  Taking the precision of the
> format instead to be the minimum over all values in that format gives
> double-double the same precision as IEEE DP, not useful in any practical
> way either :-/

OK, I think that's why we don't see any regressions with this
workaround removal. :)

> 
>> --- a/gcc/tree.cc
>> +++ b/gcc/tree.cc
>> @@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char)
>>if (!targetm.floatn_mode (n, extended).exists (&mode))
>>  continue;
>>int precision = GET_MODE_PRECISION (mode);
>> -  /* Work around the rs6000 KFmode having precision 113 not
>> - 128.  */
> 
> It has precision 126 now fwiw.

Yes, with -mabi=ibmlongdouble, it uses KFmode so 126, while with
-mabi=ieeelongdouble, it uses TFmode so 127.

BR,
Kewen


Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-21 Thread Kewen.Lin via Gcc-patches
Hi Joseph,

on 2022/12/22 05:40, Joseph Myers wrote:
> On Wed, 21 Dec 2022, Segher Boessenkool wrote:
> 
>>> --- a/gcc/tree.cc
>>> +++ b/gcc/tree.cc
>>> @@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char)
>>>if (!targetm.floatn_mode (n, extended).exists (&mode))
>>> continue;
>>>int precision = GET_MODE_PRECISION (mode);
>>> -  /* Work around the rs6000 KFmode having precision 113 not
>>> -128.  */
>>
>> It has precision 126 now fwiw.
>>
>> Joseph: what do you think about this patch?  Is the workaround it
>> removes still useful in any way, do we need to do that some other way if
>> we remove this?
> 
> I think it's best for the TYPE_PRECISION, for any type with the binary128 
> format, to be 128 (not 126).

I agree that it's more reasonable to use 128 for it (all of them) eventually,
but my thought was to get rid of this workaround first so that bootstrapping
survives.  Commit r9-1302-g6a8886e45f7eb6 made TFmode/KFmode/IFmode have
different precisions for some reasons.  Jakub mentioned commit r13-3292; I
think later we can refer to it and try to unify those modes so that they
have the same precision (probably next stage1).  I guess Mike may have more
insightful comments here.

> 
> It's necessary that _Float128, _Float64x and long double all have the same 
> TYPE_PRECISION when they have the same (binary128) format, or at least 
> that TYPE_PRECISION for _Float128 >= that for long double >= that for 
> _Float64x, so that the rules in c_common_type apply properly.
> 

a. _Float128, _Float64x and long double all have the same
   TYPE_PRECISION when they have the same (binary128) format.

b. TYPE_PRECISION for _Float128 >= that for long double
   >= that for _Float64x.

**Without this patch**, 

1) -mabi=ieeelongdouble (these three are ieee128)

  _Float128 => 128 ("type => its TYPE_PRECISION")
  _Float64x => 128
  long double => 127

// Neither a nor b holds.

2) -mabi=ibmlongdouble (long double is ibm128)

  _Float128 => 128
  _Float64x => 128
  long double => 127

// a N/A, b doesn't hold.

**With this patch**, 

1) -mabi=ieeelongdouble

  _Float128 => 127
  _Float64x => 127
  long double => 127

// both a and b hold.
  
2) -mabi=ibmlongdouble
  
  _Float128 => 126
  _Float64x => 126
  long double => 127

// a N/A, b doesn't hold.

IMHO, this patch improves the status quo slightly.
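
The a/b checks in the tables above can be restated mechanically. The following is only an illustrative sketch: the `Precisions` struct and the `holds_a`/`holds_b` helpers are made-up names rather than GCC code, and the numbers are copied from the tables as posted.

```cpp
#include <cassert>

// Hypothetical helper type: TYPE_PRECISION values for one configuration.
struct Precisions {
  int float128;     // _Float128
  int float64x;     // _Float64x
  int long_double;  // long double
};

// Condition a: all three types share one TYPE_PRECISION.
constexpr bool holds_a(Precisions p) {
  return p.float128 == p.float64x && p.float64x == p.long_double;
}

// Condition b: _Float128 >= long double >= _Float64x.
constexpr bool holds_b(Precisions p) {
  return p.float128 >= p.long_double && p.long_double >= p.float64x;
}

// Values copied from the tables above.
constexpr Precisions without_patch_ieee{128, 128, 127};
constexpr Precisions without_patch_ibm{128, 128, 127};
constexpr Precisions with_patch_ieee{127, 127, 127};
constexpr Precisions with_patch_ibm{126, 126, 127};

static_assert(!holds_a(without_patch_ieee) && !holds_b(without_patch_ieee), "");
static_assert(!holds_b(without_patch_ibm), "");
static_assert(holds_a(with_patch_ieee) && holds_b(with_patch_ieee), "");
static_assert(!holds_b(with_patch_ibm), "");
```

Condition a is not applicable to the ibmlongdouble rows, so only b is checked there.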

BR,
Kewen


Re: gcc-13/changes.html: Mention -fstrict-flex-arrays and its impact

2022-12-21 Thread Richard Biener via Gcc-patches
On Wed, 21 Dec 2022, Qing Zhao wrote:

> Hi, Richard,
> 
> Thanks a lot for your comments.
> 
> > On Dec 21, 2022, at 2:12 AM, Richard Biener  wrote:
> > 
> > On Tue, 20 Dec 2022, Qing Zhao wrote:
> > 
> >> Hi,
> >> 
> >> This is the patch for mentioning -fstrict-flex-arrays and -Warray-bounds=2 
> >> changes in gcc-13/changes.html.
> >> 
> >> Let me know if you have any comment or suggestions.
> > 
> > Some copy editing below
> > 
> >> Thanks.
> >> 
> >> Qing.
> >> 
> >> ===
> >> From c022076169b4f1990b91f7daf4cc52c6c5535228 Mon Sep 17 00:00:00 2001
> >> From: Qing Zhao 
> >> Date: Tue, 20 Dec 2022 16:13:04 +
> >> Subject: [PATCH] gcc-13/changes: Mention -fstrict-flex-arrays and its 
> >> impact.
> >> 
> >> ---
> >> htdocs/gcc-13/changes.html | 15 +++
> >> 1 file changed, 15 insertions(+)
> >> 
> >> diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
> >> index 689178f9..47b3d40f 100644
> >> --- a/htdocs/gcc-13/changes.html
> >> +++ b/htdocs/gcc-13/changes.html
> >> @@ -39,6 +39,10 @@ a work-in-progress.
> >> Legacy debug info compression option -gz=zlib-gnu was 
> >> removed
> >>   and the option is ignored right now.
> >> New debug info compression option value -gz=zstd has 
> >> been added.
> >> +-Warray-bounds=2 will no longer issue warnings for 
> >> out of bounds
> >> +  accesses to trailing struct members of one-element array type 
> >> anymore. Please
> >> +  add -fstrict-flex-arrays=level to control how the 
> >> compiler treat
> >> +  trailing arrays of structures as flexible array members. 
> > 
> > "Instead it diagnoses accesses to trailing arrays according to 
> > -fstrict-flex-arrays."
> 
> Okay.
> > 
> >> 
> >> 
> >> 
> >> @@ -409,6 +413,17 @@ a work-in-progress.
> >> Other significant improvements
> >> 
> >> 
> >> +Treating trailing arrays as flexible array 
> >> members
> >> +
> >> +
> >> + GCC can now control when to treat the trailing array of a structure 
> >> as a 
> >> + flexible array member for the purpose of accessing the elements of 
> >> such
> >> + an array. By default, all trailing arrays of structures are treated 
> >> as
> > 
> > all trailing arrays in aggregates are treated
> Okay.
> > 
> >> + flexible array members. Use the new command-line option
> >> + -fstrict-flex-array=level to control how GCC treats the 
> >> trailing
> >> + array of a structure as a flexible array member at different levels.
> > 
> > -fstrict-flex-arrays to control which trailing array
> > members are treated as flexible arrays.
> 
> Okay.
> 
> > 
> > I've also just now noticed that there's now a flag_strict_flex_arrays
> > check in the middle-end (in array bound diagnostics) but this option
> > isn't streamed or handled with LTO.  I think you want to replace that
> > with the appropriate DECL_NOT_FLEXARRAY check.
> 
> We need to know the level value of the strict_flex_arrays on the struct 
> field to issue proper warnings at different levels. DECL_NOT_FLEXARRAY 
> does not include such info. So, what should I do? Streaming the 
> flag_strict_flex_arrays with LTO?

But you do

  if (compref)
{
  /* Try to determine special array member type for this 
COMPONENT_REF.  */
  sam = component_ref_sam_type (arg);
  /* Get the level of strict_flex_array for this array field.  */
  tree afield_decl = TREE_OPERAND (arg, 1);
  strict_flex_array_level = strict_flex_array_level_of (afield_decl);

I see that function doesn't look at DECL_NOT_FLEXARRAY but just
checks attributes (those are streamed in LTO).

OK, so I suppose the diagnostic itself would become just less precise
as in "trailing array %qT should not be used as a flexible array member"
without the "for level N and above" part of the diagnostic?

> >  We might also want
> > to see how inlining accesses from TUs with different -fstrict-flex-arrays
> > setting behaves when accessing the same structure (and whether we might
> > want to issue an ODR style diagnostic there).

This mixing also means streaming -fstrict-flex-arrays won't be of much
help in general.

> Yes, good point, I will check on this part.
> 
> BTW, a stupid question: what does ODR mean?

It's the One-Definition-Rule (of C++).  Basically we'd diagnose
same struct declarations with different -fstrict-flex-arrays setting.
I see we miss comparing DECL_NOT_FLEXARRAY for tree merging, I'm
testing a patch to fix that now.
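
For readers following the changes.html wording, here is a minimal C sketch of the trailing-array shapes under discussion. It is an illustration only, not part of the patch: the struct and function names (fam, one, four, fam_new, fam_sum) are made up, and the level summary in the comment follows the GCC manual's description of -fstrict-flex-arrays=level rather than anything stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Trailing-array shapes and the levels at which GCC still treats them
   as flexible array members (per the GCC manual):
     int data[];   levels 0-3   (C99 flexible array member)
     int data[0];  levels 0-2   (GNU zero-length array extension)
     int data[1];  levels 0-1   (old "struct hack")
     int data[4];  level 0 only (the default)  */
struct fam  { size_t n; int data[];  };
struct one  { size_t n; int data[1]; };
struct four { size_t n; int data[4]; };

/* Allocate a struct fam with room for n trailing elements and set
   data[i] = i.  Indexing data[] up to n-1 is valid at every level.  */
struct fam *fam_new (size_t n)
{
  struct fam *p = malloc (sizeof *p + n * sizeof p->data[0]);
  assert (p != NULL);
  p->n = n;
  for (size_t i = 0; i < n; i++)
    p->data[i] = (int) i;
  return p;
}

/* Sum the n trailing elements.  */
long fam_sum (const struct fam *p)
{
  long sum = 0;
  for (size_t i = 0; i < p->n; i++)
    sum += p->data[i];
  return sum;
}
```

Only struct fam is exercised here. Indexing past the declared bound of struct one or struct four in the same way is the pattern whose treatment depends on the chosen level.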

Richard.

> thanks.
> 
> Qing
> > 
> > Thanks,
> > Richard.
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH] loading float member of parameter stored via int registers

2022-12-21 Thread guojiufu via Gcc-patches

Hi,

On 2022-12-21 15:30, Richard Biener wrote:
> On Wed, 21 Dec 2022, Jiufu Guo wrote:
>
>> Hi,
>>
>> This patch is fixing an issue about parameter accessing if the
>> parameter is struct type and passed through integer registers, and
>> there is floating member is accessed. Like below code:
>>
>> typedef struct DF {double a[4]; long l; } DF;
>> double foo_df (DF arg){return arg.a[3];}
>>
>> On ppc64le, with trunk gcc, "std 6,-24(1) ; lfd 1,-24(1)" is
>> generated.  While instruction "mtvsrd 1, 6" would be enough for
>> this case.
>
> So why do we end up spilling for PPC?


Good question! According to the GCC source code (function.cc/expr.cc),
this is common behavior: the parameter is stored to the stack in
"word_mode" and then loaded from the stack in the field's mode
(e.g. a float mode).
However, after some attempts, I failed to construct similar cases on
many other platforms.  So I converted the fix into a target hook and
implemented the rs6000 part first.



> struct X { int i; float f; };
>
> float foo (struct X x)
> {
>   return x.f;
> }
>
> does pass the structure in $RDI on x86_64 and we manage (with
> optimization, with -O0 we spill) to generate
>
> shrq    $32, %rdi
> movd    %edi, %xmm0
>
> and RTL expansion generates
>
> (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> (insn 2 4 3 2 (set (reg/v:DI 83 [ x ])
> (reg:DI 5 di [ x ])) "t.c":4:1 -1
>  (nil))
> (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
> (insn 6 3 7 2 (parallel [
> (set (reg:DI 85)
> (ashiftrt:DI (reg/v:DI 83 [ x ])
> (const_int 32 [0x20])))
> (clobber (reg:CC 17 flags))
> ]) "t.c":5:11 -1
>  (nil))
> (insn 7 6 8 2 (set (reg:SI 86)
> (subreg:SI (reg:DI 85) 0)) "t.c":5:11 -1
>  (nil))
>
> I would imagine that for the ppc case we only see the subreg here
> which should be even easier to optimize.
>
> So how's this not fixable by providing proper patterns / subreg
> capabilities?  Looking a bit at the RTL we have the issue might
> be that nothing seems to handle CSE of


This case is also related to 'parameter on struct', PR89310 is
just for this case. On trunk, it is fixed.
One difference: the parameter is in DImode, and passed via an
integer register for "{int i; float f;}".
But for "{double a[4]; long l;}", the parameter is in BLKmode,
and stored to stack during the argument setup.
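
To illustrate the DImode case, here is a small C model of what the shrq/movd pair quoted above does for "{int i; float f;}" on little-endian x86_64. This is a sketch for illustration, not code from the patch: the helper names model_int_reg and load_f_from_reg are made up, and it assumes a 32-bit int and a 32-bit IEEE float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct X { int i; float f; };

_Static_assert (sizeof (int) == 4 && sizeof (float) == 4,
                "sketch assumes 32-bit int and 32-bit float");

/* Model the 64-bit integer register that carries struct X on
   little-endian x86_64: i in the low 32 bits, the bit pattern of f
   in the high 32 bits.  Built explicitly so the model does not
   depend on the host's byte order.  */
uint64_t model_int_reg (struct X x)
{
  uint32_t ibits, fbits;
  memcpy (&ibits, &x.i, sizeof ibits);
  memcpy (&fbits, &x.f, sizeof fbits);
  return ((uint64_t) fbits << 32) | ibits;
}

/* Extract f without going through memory: shift the register right
   by 32 (shrq $32) and reinterpret the low 32 bits as a float (movd).
   No value conversion takes place, only a change of mode.  */
float load_f_from_reg (uint64_t reg)
{
  uint32_t fbits = (uint32_t) (reg >> 32);
  float f;
  memcpy (&f, &fbits, sizeof f);
  return f;
}
```

For the BLKmode struct "{double a[4]; long l;}" no such register-only sequence is generated on trunk, which is the gap this thread is about.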


> (note 8 0 5 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> (insn 5 8 7 2 (set (mem/c:DI (plus:DI (reg/f:DI 110 sfp)
> (const_int 56 [0x38])) [2 arg+24 S8 A64])
> (reg:DI 6 6)) "t.c":2:23 679 {*movdi_internal64}
>  (expr_list:REG_DEAD (reg:DI 6 6)
> (nil)))
> (note 7 5 10 2 NOTE_INSN_FUNCTION_BEG)
> (note 10 7 15 2 NOTE_INSN_DELETED)
> (insn 15 10 16 2 (set (reg/i:DF 33 1)
> (mem/c:DF (plus:DI (reg/f:DI 110 sfp)
> (const_int 56 [0x38])) [1 arg.a[3]+0 S8 A64])) "t.c":2:40
> 576 {*movdf_hardfloat64}
>  (nil))
>
> Possibly because the store and load happen in a different mode?  Can
> you see why CSE doesn't handle this (producing a subreg)?  On

Yes, exactly! For "{double a[4]; long l;}", the store and load use
different modes, so CSE does not optimize them.  This patch makes the
store and load use the same mode (DImode) and then leverages CSE to
handle it.


> the GIMPLE side we'd happily do that (but we don't see the argument
> setup).


Thanks for your comments!


BR,
Jeff (Jiufu)



> Thanks,
> Richard.
>
>> This patch updates the behavior when loading floating members of a
>> parameter: if that floating member is stored via integer register,
>> then loading it as integer mode first, and converting it to floating
>> mode.
>>
>> I also thought of a method: before storing the register to stack,
>> convert it to float mode first. While there are some cases that may
>> still prefer to keep an integer register store.
>>
>> Bootstrap and regtest passes on ppc64{,le}.
>> I would ask for help to review for comments and if this patch is
>> acceptable for the trunk.
>>
>>
>> BR,
>> Jeff (Jiufu)
>>
>> PR target/108073
>>
>> gcc/ChangeLog:
>>
>> * config/rs6000/rs6000.cc (TARGET_LOADING_INT_CONVERT_TO_FLOAT): New
>> macro definition.
>> (rs6000_loading_int_convert_to_float): New hook implement.
>> * doc/tm.texi: Regenerated.
>> * doc/tm.texi.in (loading_int_convert_to_float): New hook.
>> * expr.cc (expand_expr_real_1): Updated to use the new hook.
>> * target.def (loading_int_convert_to_float): New hook.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * g++.target/powerpc/pr102024.C: Update.
>> * gcc.target/powerpc/pr108073.c: New test.
>>
>> ---
>>  gcc/config/rs6000/rs6000.cc | 70 +
>>  gcc/doc/tm.texi |  6 ++
>>  gcc/doc/tm.texi.in  |  2 +
>>  gcc/expr.cc | 15 +
>>  gcc/target.def  | 11 
>>  gcc/testsuite/g++.target/powerpc/pr102024.C |  2 +-
>>  gcc/testsuite/gcc.target/powerpc/pr108073.c | 24 +++
>>  7 files changed, 129 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108073.c
>>
>> diff --git a/gcc/config/rs600

Re: [PATCH] loading float member of parameter stored via int registers

2022-12-21 Thread Richard Biener via Gcc-patches
On Thu, 22 Dec 2022, guojiufu wrote:

> Hi,
> 
> On 2022-12-21 15:30, Richard Biener wrote:
> > On Wed, 21 Dec 2022, Jiufu Guo wrote:
> > 
> >> Hi,
> >> 
> >> This patch is fixing an issue about parameter accessing if the
> >> parameter is struct type and passed through integer registers, and
> >> there is floating member is accessed. Like below code:
> >> 
> >> typedef struct DF {double a[4]; long l; } DF;
> >> double foo_df (DF arg){return arg.a[3];}
> >> 
> >> On ppc64le, with trunk gcc, "std 6,-24(1) ; lfd 1,-24(1)" is
> >> generated.  While instruction "mtvsrd 1, 6" would be enough for
> >> this case.
> > 
> > So why do we end up spilling for PPC?
> 
> Good question! According to GCC source code (in function.cc/expr.cc),
> it is common behavior: using "word_mode" to store the parameter to stack,
> And using the field's mode (e.g. float mode) to load from the stack.
> But with some tries, I fail to construct cases on many platforms.
> So, I convert the fix to a target hook and implemented the rs6000 part
> first.
> 
> > 
> > struct X { int i; float f; };
> > 
> > float foo (struct X x)
> > {
> >   return x.f;
> > }
> > 
> > does pass the structure in $RDI on x86_64 and we manage (with
> > optimization, with -O0 we spill) to generate
> > 
> > shrq    $32, %rdi
> > movd    %edi, %xmm0
> > 
> > and RTL expansion generates
> > 
> > (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> > (insn 2 4 3 2 (set (reg/v:DI 83 [ x ])
> > (reg:DI 5 di [ x ])) "t.c":4:1 -1
> >  (nil))
> > (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
> > (insn 6 3 7 2 (parallel [
> > (set (reg:DI 85)
> > (ashiftrt:DI (reg/v:DI 83 [ x ])
> > (const_int 32 [0x20])))
> > (clobber (reg:CC 17 flags))
> > ]) "t.c":5:11 -1
> >  (nil))
> > (insn 7 6 8 2 (set (reg:SI 86)
> > (subreg:SI (reg:DI 85) 0)) "t.c":5:11 -1
> >  (nil))
> > 
> > I would imagine that for the ppc case we only see the subreg here
> > which should be even easier to optimize.
> > 
> > So how's this not fixable by providing proper patterns / subreg
> > capabilities?  Looking a bit at the RTL we have the issue might
> > be that nothing seems to handle CSE of
> > 
> 
> This case is also related to 'parameter on struct', PR89310 is
> just for this case. On trunk, it is fixed.
> One difference: the parameter is in DImode, and passed via an
> integer register for "{int i; float f;}".
> But for "{double a[4]; long l;}", the parameter is in BLKmode,
> and stored to stack during the argument setup.

OK, so this would be another case for "heuristics" to use
sth different than word_mode for storing, but of course
the arguments are in integer registers and using different
modes can for example prohibit store-multiple instruction use.

As I said in the related thread an RTL expansion time "SRA"
with the incoming argument assignment in mind could make
more optimal decisions for these kind of special-cases.

> > (note 8 0 5 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> > (insn 5 8 7 2 (set (mem/c:DI (plus:DI (reg/f:DI 110 sfp)
> > (const_int 56 [0x38])) [2 arg+24 S8 A64])
> > (reg:DI 6 6)) "t.c":2:23 679 {*movdi_internal64}
> >  (expr_list:REG_DEAD (reg:DI 6 6)
> > (nil)))
> > (note 7 5 10 2 NOTE_INSN_FUNCTION_BEG)
> > (note 10 7 15 2 NOTE_INSN_DELETED)
> > (insn 15 10 16 2 (set (reg/i:DF 33 1)
> > (mem/c:DF (plus:DI (reg/f:DI 110 sfp)
> > (const_int 56 [0x38])) [1 arg.a[3]+0 S8 A64])) "t.c":2:40
> > 576 {*movdf_hardfloat64}
> >  (nil))
> > 
> > Possibly because the store and load happen in a different mode?  Can
> > you see why CSE doesn't handle this (producing a subreg)?  On
> 
> Yes, exactly! For "{double a[4]; long l;}", because the store and load
> are using a different mode, and then CSE does not optimize it.  This
> patch makes the store and load using the same mode (DImode), and then
> leverage CSE to handle it.

So can we instead fix CSE to consider replacing insn 15 above with

 (insn 15 (set (reg/i:DF 33 1)
   (subreg:DF (reg/f:DI 6 6)))

?  One peculiarity is that this introduces a hardreg use in this
special case.
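
As a rough C analogy for the two sequences being compared (an illustration only, not GCC internals): the current code stores the 64-bit integer register to a stack slot and reloads the slot as a double ("std ; lfd"), whereas the subreg form reinterprets the same 64 bits directly ("mtvsrd"). The function names via_stack_slot and via_reinterpret are hypothetical, and the sketch assumes a 64-bit IEEE double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert (sizeof (double) == sizeof (uint64_t),
                "sketch assumes a 64-bit double");

/* Word-mode store followed by a float-mode load from the same slot,
   mirroring insn 5 (DImode store) and insn 15 (DFmode load).  */
double via_stack_slot (uint64_t int_reg)
{
  unsigned char slot[sizeof (uint64_t)];
  double d;
  memcpy (slot, &int_reg, sizeof int_reg);  /* DImode store */
  memcpy (&d, slot, sizeof d);              /* DFmode load */
  return d;
}

/* Direct reinterpretation of the register bits, the analogue of
   (subreg:DF (reg:DI ...)): same bits, different mode, no memory.  */
double via_reinterpret (uint64_t int_reg)
{
  double d;
  memcpy (&d, &int_reg, sizeof d);
  return d;
}
```

Both functions return the same value for every input, which is why replacing the store/load pair with a register-to-register move would be valid if CSE could see through the mode change.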

Richard.

> > the GIMPLE side we'd happily do that (but we don't see the argument
> > setup).
> 
> Thanks for your comments!
> 
> 
> BR,
> Jeff (Jiufu)
> 
> > 
> > Thanks,
> > Richard.
> > 
> >> This patch updates the behavior when loading floating members of a
> >> parameter: if that floating member is stored via integer register,
> >> then loading it as integer mode first, and converting it to floating
> >> mode.
> >> 
> >> I also thought of a method: before storing the register to stack,
> >> convert it to float mode first. While there are some cases that may
> >> still prefer to keep an integer register store.
> >> 
> >> Bootstrap and regtest passes on ppc64{,le}.
> >> I would ask for help to review for comments and if this patch is
> >> acceptable for the trunk.
> >> 
> >> 
> >> BR,
> >> Jeff (Jiufu)
>