Re: [PATCH 0/2] RISC-V: Add intrinsics support and testcases for SiFive Xsfvfnrclipxfqf extension.

2024-12-02 Thread Kito Cheng
LGTM, committed to trunk :)

On Mon, Dec 2, 2024 at 9:33 AM  wrote:
>
> From: yulong 
>
> This patch adds support for the SiFive vendor extension Xsfvfnrclipxfqf[1]
>  to GCC, providing the FP32-to-int8 Ranged Clip
>  instructions.
>
> [1] https://www.sifive.com/document-file/fp32-to-int8-ranged-clip-instructions
>
> Co-Authored by: Jiawei Chen 
> Co-Authored by: Shihua Liao 
> Co-Authored by: Yixuan Chen 
>
> yulong (2):
>   RISC-V: Add intrinsics support for SiFive Xsfvfnrclipxfqf extensions.
>   RISC-V: Add intrinsics testcases for SiFive Xsfvfnrclipxfqf
> extensions.
>
>  gcc/config/riscv/generic-vector-ooo.md|   2 +-
>  gcc/config/riscv/genrvv-type-indexer.cc   |  10 +
>  .../riscv/riscv-vector-builtins-bases.cc  |   6 -
>  .../riscv/riscv-vector-builtins-bases.h   |   6 +
>  .../riscv/riscv-vector-builtins-shapes.cc |  28 +
>  .../riscv/riscv-vector-builtins-shapes.h  |   1 +
>  gcc/config/riscv/riscv-vector-builtins.cc |  51 +-
>  gcc/config/riscv/riscv-vector-builtins.def|  31 +-
>  gcc/config/riscv/riscv-vector-builtins.h  |   7 +
>  gcc/config/riscv/riscv.md |   3 +-
>  .../riscv/sifive-vector-builtins-bases.cc |  52 ++
>  .../riscv/sifive-vector-builtins-bases.h  |   2 +
>  .../sifive-vector-builtins-functions.def  |   4 +
>  gcc/config/riscv/sifive-vector.md |  20 +
>  gcc/config/riscv/vector-iterators.md  |  30 +-
>  .../riscv/rvv/xsfvector/sf_vfnrclip_x_f_qf.c  | 606 ++
>  .../riscv/rvv/xsfvector/sf_vfnrclip_xu_f_qf.c | 605 +
>  17 files changed, 1425 insertions(+), 39 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vfnrclip_x_f_qf.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vfnrclip_xu_f_qf.c
>
> --
> 2.34.1
>


[committed] testsuite: Adjust rs6000-ldouble-2.c for switch to -std=gnu23 by default [PR117663]

2024-12-02 Thread Jakub Jelinek
Hi!

-std=gnu23/-std=c23 changes LDBL_EPSILON for IBM long double, see r13-3029 and
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602738.html
for details.

That change even had a note:
"and when we move to a C2x
default, gcc.target/powerpc/rs6000-ldouble-2.c will need an
appropriate option added to keep using an older language version"

The following patch just implements that to fix the rs6000-ldouble-2.c regression.

Tested on powerpc64le-linux, committed to trunk as obvious.
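
For anyone wanting to see the difference locally, here is a minimal sketch
(not part of the patch; the helper name is made up for illustration) that
prints the LDBL_EPSILON provided by <float.h>.  Building it once with
-std=gnu17 and once with -std=gnu23 on an IBM long double target should show
the two values; on targets with other long double formats I would not expect
a difference.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper, not taken from rs6000-ldouble-2.c: print the
   LDBL_EPSILON that <float.h> provides for the language version in use.  */
void
print_ldbl_epsilon (void)
{
  printf ("LDBL_EPSILON = %La\n", LDBL_EPSILON);
}
```

Calling print_ldbl_epsilon () from main and comparing the output of the two
builds is enough to see which definition is in effect.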

2024-12-02  Jakub Jelinek  

PR testsuite/117663
* gcc.target/powerpc/rs6000-ldouble-2.c: Add -std=gnu17 to dg-options.

--- gcc/testsuite/gcc.target/powerpc/rs6000-ldouble-2.c.jj  2020-01-11 
16:31:55.821281713 +0100
+++ gcc/testsuite/gcc.target/powerpc/rs6000-ldouble-2.c 2024-12-02 
13:47:58.368301037 +0100
@@ -1,5 +1,5 @@
 /* { dg-do run { target { { powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } || 
{ powerpc*-*-linux* && lp64 } } } } */
-/* { dg-options "-mlong-double-128" } */
+/* { dg-options "-mlong-double-128 -std=gnu17" } */
 
 /* Check that LDBL_EPSILON is right for 'long double'.  */
 

Jakub



RE: [PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-12-02 Thread Tamar Christina
> -Original Message-
> From: Richard Biener 
> Sent: Friday, November 29, 2024 8:57 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de;
> j...@ventanamicro.com
> Subject: Re: [PATCH 1/2]middle-end: refactor type to be explicit in
> operand_equal_p [PR114932]
> 
> On Tue, Aug 20, 2024 at 3:07 PM Tamar Christina 
> wrote:
> >
> > Hi All,
> >
> > This is a refactoring with no expected behavioral change.
> > The goal with this is to make the type of the expressions being used 
> > explicit.
> >
> > I did not change all the recursive calls to operand_equal_p () to recurse
> > directly to the new function but instead this goes through the top level 
> > call
> > which re-extracts the types.
> >
> > This was done because in most of the cases where we recurse type == arg.
> > The second patch makes use of this new flexibility to implement an overload
> > of operand_equal_p which checks for equality under two's complement.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > -m32, -m64 and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/114932
> > * fold-const.cc (operand_compare::operand_equal_p): Split into one 
> > that
> > takes explicit type parameters and use that in public one.
> > * fold-const.h (class operand_compare): Add operand_equal_p private
> > overload.
> >
> > ---
> > diff --git a/gcc/fold-const.h b/gcc/fold-const.h
> > index
> b82ef137e2f2096f86c20df3c7749747e604177e..878545b1148b839e8a8e866f
> 38e31161f0d116c8 100644
> > --- a/gcc/fold-const.h
> > +++ b/gcc/fold-const.h
> > @@ -273,6 +273,12 @@ protected:
> >   true is returned.  Then RET is set to corresponding comparsion 
> > result.  */
> >bool verify_hash_value (const_tree arg0, const_tree arg1, unsigned int 
> > flags,
> >   bool *ret);
> > +
> > +private:
> > +  /* Return true if two operands are equal.  The flags fields can be used
> > + to specify OEP flags described in tree-core.h.  */
> > +  bool operand_equal_p (tree, const_tree, tree, const_tree,
> > +   unsigned int flags);
> >  };
> >
> >  #endif // GCC_FOLD_CONST_H
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index
> 8908e7381e72cbbf4a8fd96f18cbf4436aba8441..71e82b1d76d4106c7c23c54af
> 8b35905a1af9f1c 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -3156,6 +3156,17 @@ combine_comparisons (location_t loc,
> >  bool
> >  operand_compare::operand_equal_p (const_tree arg0, const_tree arg1,
> >   unsigned int flags)
> > +{
> > +  return operand_equal_p (TREE_TYPE (arg0), arg0, TREE_TYPE (arg1), arg1,
> flags);
> > +}
> > +
> > +/* The same as operand_equal_p however the type of ARG0 and ARG1 are
> assumed to be
> > +   the TYPE0 and TYPE1 respectively.  */
> > +
> > +bool
> > +operand_compare::operand_equal_p (tree type0, const_tree arg0,
> > + tree type1, const_tree arg1,
> 
> did you try using const_tree for type0/type1?
> 

I did, but types_compatible_p is non-const and it calls
useless_type_conversion_p, which is also non-const.  Having a look, I don't
think either of those functions changes the type, so I could change them all
to const_tree if you'd like and see what shakes out.

It looks like all the calls done in useless_type_conversion_p are already 
const_tree.

Do you want me to propagate the const_tree down?
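
To illustrate the constraint, a minimal sketch with simplified stand-ins
(struct node and the helper names are invented here, they are not the GCC
declarations): a caller that takes const_tree can only call helpers that
also take const_tree without discarding the qualifier.

```c
#include <stdbool.h>

/* Simplified stand-ins for GCC's tree and const_tree.  */
struct node { int code; };
typedef struct node *tree;
typedef const struct node *const_tree;

/* Stand-in for a helper that only reads its arguments but is declared
   with non-const parameters.  */
bool
compatible_nonconst (tree a, tree b)
{
  return a->code == b->code;
}

/* The same helper after propagating const down.  */
bool
compatible_const (const_tree a, const_tree b)
{
  return a->code == b->code;
}

/* A caller taking const_tree can only use the const variant cleanly.  */
bool
types_equal_p (const_tree type0, const_tree type1)
{
  /* Calling compatible_nonconst (type0, type1) here would discard the
     const qualifier and be diagnosed by the compiler.  */
  return compatible_const (type0, type1);
}
```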

Thanks,
Tamar
> > + unsigned int flags)
> >  {
> >bool r;
> >if (verify_hash_value (arg0, arg1, flags, &r))
> > @@ -3166,25 +3177,25 @@ operand_compare::operand_equal_p (const_tree
> arg0, const_tree arg1,
> >
> >/* If either is ERROR_MARK, they aren't equal.  */
> >if (TREE_CODE (arg0) == ERROR_MARK || TREE_CODE (arg1) == ERROR_MARK
> > -  || TREE_TYPE (arg0) == error_mark_node
> > -  || TREE_TYPE (arg1) == error_mark_node)
> > +  || type0 == error_mark_node
> > +  || type1 == error_mark_node)
> >  return false;
> >
> >/* Similar, if either does not have a type (like a template id),
> >   they aren't equal.  */
> > -  if (!TREE_TYPE (arg0) || !TREE_TYPE (arg1))
> > +  if (!type0 || !type1)
> >  return false;
> >
> >/* Bitwise identity makes no sense if the values have different layouts. 
> >  */
> >if ((flags & OEP_BITWISE)
> > -  && !tree_nop_conversion_p (TREE_TYPE (arg0), TREE_TYPE (arg1)))
> > +  && !tree_nop_conversion_p (type0, type1))
> >  return false;
> >
> >/* We cannot consider pointers to different address space equal.  */
> > -  if (POINTER_TYPE_P (TREE_TYPE (arg0))
> > -  && POINTER_TYPE_P (TREE_TYPE (arg1))
> > -  && (TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (arg0)))
> > - != TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (arg1)
> > +  if (POINTER_TYPE_P (type0)
> > +  && POINTER_TYPE_P (type1)
> > +

Re: [PATCH] aarch64: Extend SVE2 bit-select instructions for Neon modes.

2024-12-02 Thread Kyrylo Tkachov


> On 29 Nov 2024, at 14:16, Richard Sandiford  wrote:
> 
> Kyrylo Tkachov  writes:
>>> On 27 Nov 2024, at 09:34, Richard Sandiford  
>>> wrote:
>>> 
>>> Soumya AR  writes:
 NBSL, BSL1N, and BSL2N are bit-select instructions on SVE2 with certain 
 operands
 inverted. These can be extended to work with Neon modes.
 
 Since these instructions are unpredicated, duplicate patterns were added 
 with
 the predicate removed to generate these instructions for Neon modes.
 
 The patch was bootstrapped and regtested on aarch64-linux-gnu, no 
 regression.
 OK for mainline?
 
 Signed-off-by: Soumya AR 
 
 gcc/ChangeLog:
 
 * config/aarch64/aarch64-sve2.md
 (*aarch64_sve2_nbsl_unpred): New pattern to match unpredicated
 form.
 (*aarch64_sve2_bsl1n_unpred): Likewise.
 (*aarch64_sve2_bsl2n_unpred): Likewise.
 
 gcc/testsuite/ChangeLog:
 
 * gcc.target/aarch64/sve/bitsel.c: New test.
>>> 
>>> Thanks for the patch.  But since this is a new optimisation, and is not
>>> fixing a regression, I'm not sure whether it would be appropriate during
>>> stage 3.  Let's see what other maintainers say.
>> 
>> IMO it’s not high risk but it’s a nice-to-have optimisation rather than 
>> driven by a concrete motivating workload.
>> Given that we have a few such patches (like the ASRD patch from Soumya) it 
>> would be consistent to either take them all now or stage them all for GCC 16.
> 
> Yeah, agreed.  I'd chosen this patch somewhat arbitrarily, but it was
> really a comment about the ongoing work in general.
> 
>> I’d be okay with deferring them to GCC 16 but would appreciate if they 
>> received some feedback on the implementation beforehand so they can be 
>> polished for next stage1.
> 
> Sure, will try to get to them soon.
> 
> I'm also not strongly opposed to the patches going in.  Full disclosure:
> there are some bits of FP8 work that (despite our best efforts) slipped
> into stage 3 due to unforeseen circumstances, and still need to be posted.
> I'm hoping they can still go in, since the alternative would be to
> disable all the existing FP8 work for GCC 15.
> 
> Given that, it probably seems hypocritical to push back on these SVE-for-
> NEON patches.  The reason I did is that the work seems like an ongoing
> project with no well-defined end point, so it seemed like the GCC 15
> cut-off would have to be time-driven rather than feature-driven.

Yeah no problem. I’d like to see FP8 land properly in GCC 15 too.
Thanks,
Kyrill

> 
> Thanks for all the work on this though -- it's definitely a useful project.
> 
> Richard




[PATCH] tree-optimization/116352 - SLP scheduling and stmt order

2024-12-02 Thread Richard Biener
The PR uncovers unchecked constraints on the ability to code-generate
with SLP but also latent issues with regard to stmt order checking
since loop (early-break) and BB (for quite some time) vectorization
are no longer constrained to a single BB.  In particular get_later_stmt
simply compares UIDs of stmts, but that's only reliable when they
are in the same BB.

For the PR in question the problematic case is demoting an SLP node
to external which fails to check we can actually code generate this
in the way we do (using get_later_stmt).  The following thus adds
checking that we demote to external only when all defs are from
the same BB.
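
As a standalone illustration of the check described above, here is a
simplified sketch.  struct stmt and defs_in_same_bb are hypothetical
stand-ins, not the vectorizer's data structures; the actual function is in
the patch below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a statement: the block it
   lives in and a UID that only orders statements within that block.  */
struct stmt { int bb; unsigned uid; };

/* Return true if all N defs exist and come from one block, so that
   comparing UIDs to find the later statement is meaningful.  */
bool
defs_in_same_bb (const struct stmt *const *stmts, size_t n)
{
  const struct stmt *first = NULL;
  for (size_t i = 0; i < n; i++)
    {
      if (!stmts[i])
        return false;
      if (!first)
        first = stmts[i];
      else if (stmts[i]->bb != first->bb)
        return false;
    }
  return true;
}
```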

We no longer vectorize gcc.dg/vect/bb-slp-49.c but the testcase was
for a wrong-code issue and the vectorization done is a no-op.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/116352
PR tree-optimization/117876
* tree-vect-slp.cc (vect_slp_can_convert_to_external): New.
(vect_slp_convert_to_external): Call it.
(vect_build_slp_tree_2): Likewise.

* gcc.dg/vect/pr116352.c: New testcase.
* gcc.dg/vect/bb-slp-49.c: Remove vectorization check.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-49.c |  3 +--
 gcc/testsuite/gcc.dg/vect/pr116352.c  | 34 +++
 gcc/tree-vect-slp.cc  | 29 ++-
 3 files changed, 58 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr116352.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-49.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-49.c
index e7101fcff46..c0ad5d70a9a 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-49.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-49.c
@@ -23,6 +23,5 @@ main ()
   return 0;
 }
 
-/* See that we vectorize an SLP instance.  */
+/* See that we try to vectorize an SLP instance.  */
 /* { dg-final { scan-tree-dump "Analyzing vectorizable constructor" "slp1" } } 
*/
-/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "slp1" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr116352.c 
b/gcc/testsuite/gcc.dg/vect/pr116352.c
new file mode 100644
index 000..3fe537c34ff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr116352.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+static void addPrior(float center_x, float center_y, float width, float height,
+ bool normalized, float *dst)
+{
+  if (normalized)
+{
+  dst[0] = (center_x - width * 0.5f);
+  dst[1] = (center_y - height * 0.5f);
+  dst[2] = (center_x + width * 0.5f);
+  dst[3] = (center_y + height * 0.5f);
+}
+  else
+{
+  dst[0] = center_x - width * 0.5f;
+  dst[1] = center_y - height * 0.5f;
+  dst[2] = center_x + width * 0.5f - 1.0f;
+  dst[3] = center_y + height * 0.5f - 1.0f;
+}
+}
+void forward(float *outputPtr, int _offsetsXs, float *_offsetsX,
+float *_offsetsY, float _stepX, float _stepY,
+bool _bboxesNormalized, float _boxWidth, float _boxHeight)
+{
+  for (int j = 0; j < _offsetsXs; ++j)
+{
+  float center_x = (_offsetsX[j]) * _stepX;
+  float center_y = (_offsetsY[j]) * _stepY;
+  addPrior(center_x, center_y, _boxWidth, _boxHeight, _bboxesNormalized,
+  outputPtr);
+  outputPtr += 4;
+}
+}
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index ec986cc3f68..1799d5a619b 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -67,6 +67,7 @@ static int vectorizable_slp_permutation_1 (vec_info *, 
gimple_stmt_iterator *,
 static bool vectorizable_slp_permutation (vec_info *, gimple_stmt_iterator *,
  slp_tree, stmt_vector_for_cost *);
 static void vect_print_slp_tree (dump_flags_t, dump_location_t, slp_tree);
+static bool vect_slp_can_convert_to_external (const vec &);
 
 static object_allocator<_slp_tree> *slp_tree_pool;
 static slp_tree slp_first_node;
@@ -2887,7 +2888,8 @@ fail:
  for (j = 0; j < group_size; ++j)
if (!matches[j])
  break;
- if (!known_ge (j, TYPE_VECTOR_SUBPARTS (vectype)))
+ if (!known_ge (j, TYPE_VECTOR_SUBPARTS (vectype))
+ && vect_slp_can_convert_to_external (oprnd_info->def_stmts))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
@@ -7764,6 +7766,24 @@ vect_slp_analyze_node_operations_1 (vec_info *vinfo, 
slp_tree node,
node, node_instance, cost_vec);
 }
 
+/* Verify if we can externalize a set of internal defs.  */
+
+static bool
+vect_slp_can_convert_to_external (const vec &stmts)
+{
+  basic_block bb = NULL;
+  for (stmt_vec_info stmt : stmts)
+if (!stmt)
+  return false;
+/* Constant generation uses get_later_stmt which can only handle
+   defs from the same BB.  */
+else if (!bb)
+  bb = gimple_bb (stmt->stmt);
+else if (gimple_bb (stmt->stmt) != bb)
+  return false;
+  return true;
+

Re: [committed] Patches 8-12 of Mariam & Matevos's CRC optimization work

2024-12-02 Thread Jakub Jelinek
On Sun, Dec 01, 2024 at 08:56:39AM -0700, Jeff Law wrote:
> diff --git a/gcc/testsuite/gcc.dg/crc-side-instr-1.c 
> b/gcc/testsuite/gcc.dg/crc-side-instr-1.c
> new file mode 100644
> index 000..69738d5c866
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/crc-side-instr-1.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fdump-tree-crc-details" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-flto" } } */

...
and similarly for all other crc-side-instr*.c
These tests are clearly written for gcc.dg/torture/, but were placed in gcc.dg/,
where we don't cycle through different options.  None of the explicit -O0,
-O1, -Os or -flto will be among the options, only -fdump-tree-crc-details
will be, so the tests are compiled without optimization and all the
scan-tree-dump directives are UNRESOLVED because the crc dump doesn't exist
at -O0.

Jakub



Re: PING: [PATCH v4 1/7] Honor TARGET_PROMOTE_PROTOTYPES during RTL expand

2024-12-02 Thread Jeff Law




On 12/2/24 1:55 AM, Richard Biener wrote:

On Sun, Dec 1, 2024 at 11:15 PM Jeff Law  wrote:




On 11/27/24 3:34 PM, H.J. Lu wrote:

On Thu, Nov 21, 2024, 2:02 PM H.J. Lu <hjl.to...@gmail.com> wrote:

 Promote integer arguments smaller than int if TARGET_PROMOTE_PROTOTYPES
 returns true.

  PR middle-end/14907
  * calls.c (initialize_argument_information): Promote small
 integer
  arguments if TARGET_PROMOTE_PROTOTYPES returns true.

This doesn't look right.  Promotions are primarily driven by the target
files, in particular TARGET_PROMOTE_FUNCTION_MODE.

PROMOTE_PROTOTYPES is more of a language front-end hook and it doesn't
seem appropriate to be testing it in calls.cc.


It's a misguided hook that when applied in a subset of frontends ends
up generating
wrong code when doing multi-language LTO.  I requested moving its handling to
RTL expansion where we can apply it consistently.
It's probably a fair assessment that if a language FE is doing something 
like that, then it's going to be problematic for LTO.


So maybe the question morphs into whether or not HJ's patch takes us 
down that path and if so, then shouldn't we be looking to remove the FE 
uses?  And if we do that, how do we get a degree of confidence that we 
haven't accidentally twiddled the ABI in a meaningful way.





This particular patch looks OK to me (but as said elsewhere I'm not
very familiar with calls.cc and its peculiarities).
I didn't see anything particularly concerning other than the overarching 
question of using what had been a language FE hook in calls.cc.  I'm 
obviously leery of changing ABI stuff this late in the game and would 
generally prefer to defer something like that until the next stage1.

jeff



Re: [PATCH 1/1] aarch64: remove extra XTN in vector concatenation

2024-12-02 Thread Kyrylo Tkachov
Hi Akram,

> On 2 Dec 2024, at 15:54, Akram Ahmad  wrote:
> 
> GIMPLE code which performs a narrowing truncation on the result of a
> vector concatenation currently results in an unnecessary XTN being
> emitted following a UZP1 to concatenate the operands. In cases such as this,
> UZP1 should instead use a smaller arrangement specifier to replace the
> XTN instruction. This is seen in cases such as in this GIMPLE example:
> 
> int32x2_t foo (svint64_t a, svint64_t b)
> {
>  vector(2) int vect__2.8;
>  long int _1;
>  long int _3;
>  vector(2) long int _12;
> 
>   [local count: 1073741824]:
>  _1 = svaddv_s64 ({ -1, 0, 0, 0, 0, 0, 0, 0, ... }, a_6(D));
>  _3 = svaddv_s64 ({ -1, 0, 0, 0, 0, 0, 0, 0, ... }, b_7(D));
>  _12 = {_1, _3};
>  vect__2.8_13 = (vector(2) int) _12;
>  return vect__2.8_13;
> 
> }
> 
> Original assembly generated:
> 
> bar:
>ptrue   p3.b, all
>uaddv   d0, p3, z0.d
>uaddv   d1, p3, z1.d
>uzp1    v0.2d, v0.2d, v1.2d
>xtn v0.2s, v0.2d
>ret
> 
> This patch therefore defines the *aarch64_trunc_concat insn which
> truncates the concatenation result, rather than concatenating the
> truncated operands (such as in *aarch64_narrow_trunc), resulting
> in the following optimised assembly being emitted:
> 
> bar:
>ptrue   p3.b, all
>uaddv   d0, p3, z0.d
>uaddv   d1, p3, z1.d
>uzp1    v0.2s, v0.2s, v1.2s
>ret
> 
> This patch passes all regression tests on aarch64 with no new failures.
> A supporting test for this optimisation is also written and passes.
> 
> OK for master? I do not have commit rights so I cannot push the patch
> myself.

Thanks for the patch. As this is sent after the end of stage1 and is not 
finishing support for an architecture feature perhaps we should stage this for 
GCC 16.
But if it fixes a performance problem in a real app or, better yet, fixes a 
performance regression then we should consider it for this cycle.
That said...


> 
> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-simd.md: (*aarch64_trunc_concat) new
>  insn definition.
> * config/aarch64/iterators.md: (VDQHSD_F): new mode iterator.
>  (VTRUNCD): new mode attribute for truncated modes.
>  (Vtruncd): new mode attribute for arrangement specifier.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/aarch64/sve/truncated_concatenation_1.c: new test
>  for the above example and the int64x2 version of the above.
> ---
> gcc/config/aarch64/aarch64-simd.md| 16 ++
> gcc/config/aarch64/iterators.md   | 12 ++
> .../aarch64/sve/truncated_concatenation_1.c   | 22 +++
> 3 files changed, 50 insertions(+)
> create mode 100644 
> gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
> 
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index cfe95bd4c31..de3dd444ecd 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1872,6 +1872,22 @@
>   [(set_attr "type" "neon_permute")]
> )
> 
> +(define_insn "*aarch64_trunc_concat"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (truncate:
> +  (vec_concat:VDQHSD_F
> +(match_operand: 1 "register_operand" "w")
> +(match_operand: 2 "register_operand" "w"]
> +  "TARGET_SIMD"
> +{
> +  if (!BYTES_BIG_ENDIAN)
> +return "uzp1\\t%0., %1., %2.";
> +  else
> +return "uzp1\\t%0., %2., %1.";
> +}

… The UZP1 instruction doesn’t accept .2h operands so I don’t think this 
pattern is valid for the V2SF value of VDQHSD_F


> +  [(set_attr "type" "neon_permute")]
> +)
> +
> ;; Packing doubles.
> 
> (define_expand "vec_pack_trunc_"
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index d7cb27e1885..3b28b2fae0c 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -290,6 +290,10 @@
> ;; Advanced SIMD modes for H, S and D types.
> (define_mode_iterator VDQHSD [V4HI V8HI V2SI V4SI V2DI])
> 
> +;; Advanced SIMD modes that can be truncated whilst preserving
> +;; the number of vector elements.
> +(define_mode_iterator VDQHSD_F [V8HI V4SI V2DI V2SF V4SF V2DF])
> +
> (define_mode_iterator VDQHSD_V1DI [VDQHSD V1DI])
> 
> ;; Advanced SIMD and scalar integer modes for H and S.
> @@ -1722,6 +1726,14 @@
> (define_mode_attr Vnarrowq2 [(V8HI "v16qi") (V4SI "v8hi")
> (V2DI "v4si")])
> 
> +;; Truncated Advanced SIMD modes which preserve the number of lanes.
> +(define_mode_attr VTRUNCD [(V8HI "V8QI") (V4SI "V4HI")
> +   (V2SF "V2HF") (V4SF "V4HF")
> +   (V2DI "V2SI") (V2DF "V2SF")])
> +(define_mode_attr Vtruncd [(V8HI "8b") (V4SI "4h")
> +   (V2SF "2h") (V4SF "4h")
> +   (V2DI "2s") (V2DF "2s")])
> +
> ;; Narrowed modes of vector modes.
> (define_mode_attr VNARROW [(VNx8HI "VNx16QI")
>   (VNx4SI "VNx8HI") (VNx4SF "VNx8HF")
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
> ne

[PATCH v1] MAINTAINERS: add myself to write after approval

2024-12-02 Thread Claudio Bantaloukas

ChangeLog:

* MAINTAINERS: Add myself to write after approval.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6851affb6cb..7d65ed64bdd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -345,6 +345,7 @@ Simon Baldwin   simonb  
 Richard Ballricbal02
 Scott Bambrough -   
 Wolfgang Bangerth   -   
+Claudio Bantaloukas rdfm
 Gergö Barany-   
 Thiago Jung Bauermann   -   
 Charles Baylis  cbaylis 


[PATCH 1/1] aarch64: remove extra XTN in vector concatenation

2024-12-02 Thread Akram Ahmad
GIMPLE code which performs a narrowing truncation on the result of a
vector concatenation currently results in an unnecessary XTN being
emitted following a UZP1 to concatenate the operands. In cases such as this,
UZP1 should instead use a smaller arrangement specifier to replace the
XTN instruction. This is seen in cases such as in this GIMPLE example:

int32x2_t foo (svint64_t a, svint64_t b)
{
  vector(2) int vect__2.8;
  long int _1;
  long int _3;
  vector(2) long int _12;

   [local count: 1073741824]:
  _1 = svaddv_s64 ({ -1, 0, 0, 0, 0, 0, 0, 0, ... }, a_6(D));
  _3 = svaddv_s64 ({ -1, 0, 0, 0, 0, 0, 0, 0, ... }, b_7(D));
  _12 = {_1, _3};
  vect__2.8_13 = (vector(2) int) _12;
  return vect__2.8_13;

}

Original assembly generated:

bar:
ptrue   p3.b, all
uaddv   d0, p3, z0.d
uaddv   d1, p3, z1.d
uzp1    v0.2d, v0.2d, v1.2d
xtn v0.2s, v0.2d
ret

This patch therefore defines the *aarch64_trunc_concat insn which
truncates the concatenation result, rather than concatenating the
truncated operands (such as in *aarch64_narrow_trunc), resulting
in the following optimised assembly being emitted:

bar:
ptrue   p3.b, all
uaddv   d0, p3, z0.d
uaddv   d1, p3, z1.d
uzp1    v0.2s, v0.2s, v1.2s
ret
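
As a sanity check on why the two sequences agree, here is a scalar model in
C.  It is only a sketch: the function names are invented for illustration,
and it assumes little-endian lane numbering (the pattern swaps the operands
for big-endian).  UZP1 on .2s keeps lane 0 of each 64-bit source, which is
the low 32 bits of each value, the same result as pairing the 64-bit values
and then narrowing.

```c
#include <stdint.h>

/* Model of UZP1 Vd.2S, Vn.2S, Vm.2S: keep the even-numbered 32-bit
   lanes of the pair, i.e. lane 0 of each 64-bit source.  */
void
uzp1_2s (const uint32_t vn[2], const uint32_t vm[2], uint32_t vd[2])
{
  vd[0] = vn[0];
  vd[1] = vm[0];
}

/* Model of UZP1 .2D followed by XTN .2S: pair the two 64-bit values,
   then truncate each to 32 bits.  */
void
uzp1_2d_then_xtn (uint64_t a, uint64_t b, uint32_t vd[2])
{
  vd[0] = (uint32_t) a;
  vd[1] = (uint32_t) b;
}

/* Split a 64-bit value into 32-bit lanes, lane 0 being the low half
   (little-endian lane numbering).  */
void
as_lanes (uint64_t x, uint32_t lanes[2])
{
  lanes[0] = (uint32_t) x;
  lanes[1] = (uint32_t) (x >> 32);
}
```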

This patch passes all regression tests on aarch64 with no new failures.
A supporting test for this optimisation is also written and passes.

OK for master? I do not have commit rights so I cannot push the patch
myself.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md: (*aarch64_trunc_concat) new
  insn definition.
* config/aarch64/iterators.md: (VDQHSD_F): new mode iterator.
  (VTRUNCD): new mode attribute for truncated modes.
  (Vtruncd): new mode attribute for arrangement specifier.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/truncated_concatenation_1.c: new test
  for the above example and the int64x2 version of the above.
---
 gcc/config/aarch64/aarch64-simd.md| 16 ++
 gcc/config/aarch64/iterators.md   | 12 ++
 .../aarch64/sve/truncated_concatenation_1.c   | 22 +++
 3 files changed, 50 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c

diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index cfe95bd4c31..de3dd444ecd 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1872,6 +1872,22 @@
   [(set_attr "type" "neon_permute")]
 )
 
+(define_insn "*aarch64_trunc_concat"
+  [(set (match_operand: 0 "register_operand" "=w")
+   (truncate:
+ (vec_concat:VDQHSD_F
+(match_operand: 1 "register_operand" "w")
+   (match_operand: 2 "register_operand" "w"]
+  "TARGET_SIMD"
+{
+  if (!BYTES_BIG_ENDIAN)
+return "uzp1\\t%0., %1., %2.";
+  else
+return "uzp1\\t%0., %2., %1.";
+}
+  [(set_attr "type" "neon_permute")]
+)
+
 ;; Packing doubles.
 
 (define_expand "vec_pack_trunc_"
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index d7cb27e1885..3b28b2fae0c 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -290,6 +290,10 @@
 ;; Advanced SIMD modes for H, S and D types.
 (define_mode_iterator VDQHSD [V4HI V8HI V2SI V4SI V2DI])
 
+;; Advanced SIMD modes that can be truncated whilst preserving
+;; the number of vector elements.
+(define_mode_iterator VDQHSD_F [V8HI V4SI V2DI V2SF V4SF V2DF])
+
 (define_mode_iterator VDQHSD_V1DI [VDQHSD V1DI])
 
 ;; Advanced SIMD and scalar integer modes for H and S.
@@ -1722,6 +1726,14 @@
 (define_mode_attr Vnarrowq2 [(V8HI "v16qi") (V4SI "v8hi")
 (V2DI "v4si")])
 
+;; Truncated Advanced SIMD modes which preserve the number of lanes.
+(define_mode_attr VTRUNCD [(V8HI "V8QI") (V4SI "V4HI")
+  (V2SF "V2HF") (V4SF "V4HF")
+  (V2DI "V2SI") (V2DF "V2SF")])
+(define_mode_attr Vtruncd [(V8HI "8b") (V4SI "4h")
+  (V2SF "2h") (V4SF "4h")
+  (V2DI "2s") (V2DF "2s")])
+
 ;; Narrowed modes of vector modes.
 (define_mode_attr VNARROW [(VNx8HI "VNx16QI")
   (VNx4SI "VNx8HI") (VNx4SF "VNx8HF")
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
new file mode 100644
index 000..e0ad4209206
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wall -march=armv8.2-a+sve" } */
+
+#include 
+#include 
+
+int32x2_t foo (svint64_t a, svint64_t 

Re: [committed] Patches 8-12 of Mariam & Matevos's CRC optimization work

2024-12-02 Thread Jeff Law




On 12/2/24 7:02 AM, Jakub Jelinek wrote:

On Sun, Dec 01, 2024 at 08:56:39AM -0700, Jeff Law wrote:

diff --git a/gcc/testsuite/gcc.dg/crc-side-instr-1.c 
b/gcc/testsuite/gcc.dg/crc-side-instr-1.c
new file mode 100644
index 000..69738d5c866
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/crc-side-instr-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-tree-crc-details" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-flto" } } */


...
and similarly for all other crc-side-instr*.c
These tests are clearly written for gcc.dg/torture/, but placed in gcc.dg/,
where we don't cycle through different options and none of explicit -O0,
-O1, -Os or -flto will be among the options, only -fdump-tree-crc-details
will be and so it will be compiled without optimizations and
all the scan-tree-dump directives UNRESOLVED because crc dump doesn't exist
at -O0.
We can certainly move them into gcc.dg/torture.  Let me do a little 
testing around that to make sure we don't go notably backwards.


jeff


Re: [patch,avr,testsuite,applied] gcc.c-torture/execute/memcpy-a*.c

2024-12-02 Thread Maciej W. Rozycki
On Sun, 1 Dec 2024, Georg-Johann Lay wrote:

> > > > As a matter of interest, is the timeout/memory exhaustion observed with
> > > > host compilation or target execution?
> > > It happens during link, when the linker observes that the memory regions
> > > won't fit:
> > > 
> > > .../avr/bin/ld: memcpy-a8.elf section `.text' will not fit in region
> > > `text'
> > > .../avr/bin/ld: address 0x82c174 of memcpy-a8.elf section `.data' is not
> > > within region `data'
> > > .../avr/bin/ld: address 0x82c17c of memcpy-a8.elf section `.bss' is not
> > > within region `data'
> > > .../avr/bin/ld: region `text' overflowed by 245074 bytes
> > > collect2: error: ld returned 1 exit status
> > 
> > The memory overflow should be caught by ${tool}_check_unsupported_p.
> > Even without this patch, the testsuite should mark the tests as
> > UNSUPPORTED and not FAIL for avr.
> 
> They ARE being reported as UNSUPPORTED.  But it takes ~40m to arrive at
> these conclusions for all 5 tests.  A whole testsuite run takes
> 60m...70m, so adding 40m for a single test just to see one UNSUPPORTED
> rushing by each minute is no fun.  It's known in advance that these
> tests are pointless on AVR.

 I agree it can be annoying under these circumstances.

> > On native x86_64-pc-linux-gnu:
> >$ time make check-gcc-c RUNTESTFLAGS="execute.exp=memcpy-a*.c"
> ># of expected passes 56
> > 
> >real 8m37,778s
> >user 8m29,895s
> >sys  0m5,805s
> > 
> > Should these tests instead be gated by "run_expensive_tests"?
> 
> "in addition" instead of "instead" would be fine for me.

 I've posted a change to this effect now.

> Though I don't know anything about when a test on a current hardware is
> deemed "expensive".  For AVR, they are pointless *and* are consuming
> an offensive amount of time (otherwise I wouldn't care; there are many
> other tests that are beyond the memory constraints of AVRs).

 I think it's subjective:

'run_expensive_tests'
 Expensive testcases (usually those that consume excessive amounts
 of CPU time) should be run on this target.  This can be enabled by
 setting the 'GCC_TEST_RUN_EXPENSIVE' environment variable to a
 non-empty string.

-- I think what constitutes "excessive amounts of CPU time" will vary 
depending on the environment, but I guess the need to set the timeout 
factor is a good indication.  I wasn't aware of this effective-target 
setting at the time I proposed the tests or I would have used it along 
with the original submission.
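
For reference, gating a test on that effective target would look roughly
like the sketch below.  The directive names the run_expensive_tests
effective target quoted above; the function is only a placeholder made up
for illustration, not the contents of the memcpy-a*.c tests.

```c
/* { dg-require-effective-target run_expensive_tests } */

#include <string.h>

/* Placeholder body: copy N bytes and report whether the copy matches.  */
int
copy_matches (const char *src, char *dst, size_t n)
{
  memcpy (dst, src, n);
  return memcmp (dst, src, n) == 0;
}
```

A test gated this way would then only run when GCC_TEST_RUN_EXPENSIVE is set
to a non-empty string.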

  Maciej


Re: [committed] Patches 8-12 of Mariam & Matevos's CRC optimization work

2024-12-02 Thread Mark Wielaard
Hi Jeff,

On Sun, 2024-12-01 at 08:56 -0700, Jeff Law wrote:
> commit 148e20466c2c246df9472efed0f2ae94cb65a0f8
> Author: Matevos Mehrabyan 
> Date:   Mon Nov 11 13:00:10 2024 -0700
> 
>     [PATCH v6 09/12] Add symbolic execution support.
>     
>     Gives an opportunity to execute the code on bit level, assigning
>     symbolic values to the variables which don't have initial values.
>     Supports only CRC specific operations.
>     
>     Example:
>     
>     uint8_t crc;
>     uint8_t pol = 1;
>     crc = crc ^ pol;
>     
>     during symbolic execution crc's value will be:
>     crc(8), crc(7), ... crc(1), crc(0) ^ 1
>     
>     gcc/
>     * Makefile.in (OBJS): Add sym-exec/sym-exec-expression.o,
>     sym-exec/sym-exec-state.o, sym-exec/sym-exec-condition.o.
>     * configure (sym-exec): New subdir.
>     * sym-exec/sym-exec-condition.cc: New file.
>     * sym-exec/sym-exec-condition.h: New file.
>     * sym-exec/sym-exec-expr-is-a-helper.h: New file.
>     * sym-exec/sym-exec-expression.cc: New file.
>     * sym-exec/sym-exec-expression.h: New file.
>     * sym-exec/sym-exec-state.cc: New file.
>     * sym-exec/sym-exec-state.h: New file.
>     
>     Co-authored-by: Mariam Arutunian 
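
As an aside, the example in the quoted commit message can be modelled in a
few lines of plain C.  This is only a toy sketch: sym_bit, sym_byte and the
helper names are hypothetical and are not the classes added under sym-exec/,
and bits are numbered 0..7 here with bit 0 the least significant.

```c
#include <assert.h>
#include <stdint.h>

#define WIDTH 8

/* One symbolic bit: the unknown initial bit crc(origin), XORed with the
   constant FLIP (0 or 1).  Hypothetical model for illustration only.  */
struct sym_bit
{
  unsigned origin;
  unsigned flip;
};

struct sym_byte
{
  struct sym_bit bit[WIDTH];   /* bit[0] is the least significant bit.  */
};

/* A value without an initial value: bit i is just the symbol crc(i).  */
struct sym_byte
sym_fresh (void)
{
  struct sym_byte v;
  for (unsigned i = 0; i < WIDTH; i++)
    {
      v.bit[i].origin = i;
      v.bit[i].flip = 0;
    }
  return v;
}

/* XOR a symbolic value with the constant K, bit by bit.  */
struct sym_byte
sym_xor_const (struct sym_byte v, uint8_t k)
{
  for (unsigned i = 0; i < WIDTH; i++)
    v.bit[i].flip ^= (k >> i) & 1u;
  return v;
}

/* crc = crc ^ pol with pol == 1: only the lowest bit becomes crc(0) ^ 1.  */
void
sym_demo (void)
{
  struct sym_byte crc = sym_xor_const (sym_fresh (), 1);
  assert (crc.bit[0].origin == 0 && crc.bit[0].flip == 1);
  assert (crc.bit[7].origin == 7 && crc.bit[7].flip == 0);
}
```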

This updates configure without updating configure.ac. So when
regenerating configure the change disappears.

The autoregen buildbot doesn't like that:
https://builder.sourceware.org/buildbot/#/builders/gcc-autoregen

Cheers,

Mark


Re: [PATCH] arm: [MVE intrinsics] Avoid warnings when floating-point is not supported [PR 117814]

2024-12-02 Thread Christophe Lyon
On Mon, 2 Dec 2024 at 12:44, Richard Earnshaw (lists)
 wrote:
>
> On 02/12/2024 11:21, Christophe Lyon wrote:
> > If the target does not support floating-point, we register FP vector
> > types as 'void' (see register_vector_type).
> >
> > This leads to warnings about 'pure attribute on function returning
> > void' when we declare the various load intrinsics because their
> > call_properties say CP_READ_MEMORY (thus giving them the 'pure'
> > attribute), but their return type is void.
> >
> > To avoid such warnings, pretend the call_properties are empty when FP
> > is disabled and the function would return an FP value.  If such
> > functions are incorrectly used in user code, a proper error is
> > emitted:
> > unknown type name ‘float16x8_t'; did you mean ‘int16x8_t’?
> >
> > gcc/ChangeLog:
> >
> >   PR target/117814
> >   * config/arm/arm-mve-builtins-base.cc (vld1_impl): Fix
> >   call_properties.
> >   (vld24_impl): Likewise.
> >   * config/arm/arm-mve-builtins-functions.h (load_extending):
> >   Likewise.
>
> Won't this lead to problems if the code is something like
>
> #include "arm_mve.h"
>
> #pragma GCC target ("arch=armv8.1-m.main+mve.fp")
>
> // Some use of an affected intrinsic
>
> and then compile with "-march=armv8.1-m.main+mve -mfpu=auto 
> -mfloat-abi=softfp"?
>
Indeed actually the real cause is precisely that without FPU we do
not register FP types (and map them to "void" instead).
I had started to look at fixing that but it looked more invasive.
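
For readers skimming the thread, the guard added by the patch boils down to
the decision sketched below in plain C.  This is a simplified model, not the
GCC code: the two boolean parameters stand in for TARGET_HAVE_MVE_FLOAT and
instance.type_suffix (0).float_p, and the value given to CP_READ_MEMORY here
is a placeholder.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flag value; the real CP_READ_MEMORY is defined in GCC.  */
#define CP_READ_MEMORY 1u

/* Return the call properties of a load intrinsic.  When FP is disabled and
   the intrinsic would return an FP vector (registered as 'void'), report no
   properties so that no 'pure' attribute is attached.  */
unsigned int
load_call_properties (bool have_mve_float, bool suffix_is_float)
{
  if (!have_mve_float && suffix_is_float)
    return 0;

  return CP_READ_MEMORY;
}

void
load_call_properties_demo (void)
{
  assert (load_call_properties (false, true) == 0);
  assert (load_call_properties (false, false) == CP_READ_MEMORY);
  assert (load_call_properties (true, true) == CP_READ_MEMORY);
}
```

The concern raised above is that the first argument is effectively fixed
when the intrinsics are registered, so a later target pragma enabling FP
would not be reflected.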

Actually if you take any intrinsic test (say vaddq_f16.c), remove
"_fp" from dg-require-effective-target and dg-add-options, and add the
pragma above, you'll get the same error:
vaddq_f16.c:20:1: error: unknown type name 'float16x8_t'; did you mean
'int16x8_t'?

Let me look at fixing this.

Thanks,

Christophe

> R.
>
> > ---
> >  gcc/config/arm/arm-mve-builtins-base.cc | 22 +++--
> >  gcc/config/arm/arm-mve-builtins-functions.h | 11 ++-
> >  2 files changed, 30 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
> > b/gcc/config/arm/arm-mve-builtins-base.cc
> > index 723004b53d7..a322730eca8 100644
> > --- a/gcc/config/arm/arm-mve-builtins-base.cc
> > +++ b/gcc/config/arm/arm-mve-builtins-base.cc
> > @@ -141,8 +141,17 @@ class vld1_impl : public full_width_access
> >  {
> >  public:
> >unsigned int
> > -  call_properties (const function_instance &) const override
> > +  call_properties (const function_instance &instance) const override
> >{
> > +/* If the target does not support floating-point, we register FP vector
> > +   types as 'void'.  In this case, pretend we do not access memory to 
> > avoid
> > +   warnings about 'pure attribute on function returning void' when we
> > +   declare the intrinsics.  Such uses in user code are properly
> > +   diagnosed.  */
> > +if (!TARGET_HAVE_MVE_FLOAT
> > + && instance.type_suffix (0).float_p)
> > +  return 0;
> > +
> >  return CP_READ_MEMORY;
> >}
> >
> > @@ -1141,8 +1150,17 @@ public:
> >using full_width_access::full_width_access;
> >
> >unsigned int
> > -  call_properties (const function_instance &) const override
> > +  call_properties (const function_instance &instance) const override
> >{
> > +/* If the target does not support floating-point, we register FP vector
> > +   types as 'void'.  In this case, pretend we do not access memory to 
> > avoid
> > +   warnings about 'pure attribute on function returning void' when we
> > +   declare the intrinsics.  Such uses in user code are properly
> > +   diagnosed.  */
> > +if (!TARGET_HAVE_MVE_FLOAT
> > + && instance.type_suffix (0).float_p)
> > +  return 0;
> > +
> >  return CP_READ_MEMORY;
> >}
> >
> > diff --git a/gcc/config/arm/arm-mve-builtins-functions.h 
> > b/gcc/config/arm/arm-mve-builtins-functions.h
> > index 0ade2157e4a..1a9a347805c 100644
> > --- a/gcc/config/arm/arm-mve-builtins-functions.h
> > +++ b/gcc/config/arm/arm-mve-builtins-functions.h
> > @@ -986,8 +986,17 @@ public:
> >m_float_memory_type (NUM_TYPE_SUFFIXES)
> >{}
> >
> > -  unsigned int call_properties (const function_instance &) const override
> > +  unsigned int call_properties (const function_instance &instance) const 
> > override
> >{
> > +/* If the target does not support floating-point, we register FP vector
> > +   types as 'void'.  In this case, pretend we do not access memory to 
> > avoid
> > +   warnings about 'pure attribute on function returning void' when we
> > +   declare the intrinsics.  Such uses in user code are properly
> > +   diagnosed.  */
> > +if (!TARGET_HAVE_MVE_FLOAT
> > + && instance.type_suffix (0).float_p)
> > +  return 0;
> > +
> >  return CP_READ_MEMORY;
> >}
> >
>


[PATCH] tree-optimization/117874 - missed vectorization that's formerly hybrid

2024-12-02 Thread Richard Biener
With SLP forced we fail to consider using single-lane SLP for a case
that we still end up discovering as hybrid (in the PR in question
this is because we run into the SLP discovery limit due to excessive
association).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

This solves a bit of the 433.milc regression.

PR tree-optimization/117874
* tree-vect-loop.cc (vect_analyze_loop_2): When non-SLP
analysis fails, try single-lane SLP.

* gcc.dg/vect/pr117874.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr117874.c | 50 
 gcc/tree-vect-loop.cc|  7 ++--
 2 files changed, 53 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr117874.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr117874.c b/gcc/testsuite/gcc.dg/vect/pr117874.c
new file mode 100644
index 000..27e5f8ca369
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr117874.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+
+typedef struct {
+double real;
+double imag;
+} complex;
+
+typedef struct { complex e[3][3]; } su3_matrix;
+
+void mult_su3_an(su3_matrix *a, su3_matrix *b, su3_matrix *c)
+{
+  int j;
+  double a0r,a0i,a1r,a1i,a2r,a2i;
+  double b0r,b0i,b1r,b1i,b2r,b2i;
+  for(j=0;j<3;j++)
+{
+  a0r=a->e[0][0].real; a0i=a->e[0][0].imag;
+  b0r=b->e[0][j].real; b0i=b->e[0][j].imag;
+  a1r=a->e[1][0].real; a1i=a->e[1][0].imag;
+  b1r=b->e[1][j].real; b1i=b->e[1][j].imag;
+  a2r=a->e[2][0].real; a2i=a->e[2][0].imag;
+  b2r=b->e[2][j].real; b2i=b->e[2][j].imag;
+
+  c->e[0][j].real = a0r*b0r + a0i*b0i + a1r*b1r + a1i*b1i + a2r*b2r + a2i*b2i;
+  c->e[0][j].imag = a0r*b0i - a0i*b0r + a1r*b1i - a1i*b1r + a2r*b2i - a2i*b2r;
+
+  a0r=a->e[0][1].real; a0i=a->e[0][1].imag;
+  b0r=b->e[0][j].real; b0i=b->e[0][j].imag;
+  a1r=a->e[1][1].real; a1i=a->e[1][1].imag;
+  b1r=b->e[1][j].real; b1i=b->e[1][j].imag;
+  a2r=a->e[2][1].real; a2i=a->e[2][1].imag;
+  b2r=b->e[2][j].real; b2i=b->e[2][j].imag;
+
+  c->e[1][j].real = a0r*b0r + a0i*b0i + a1r*b1r + a1i*b1i + a2r*b2r + a2i*b2i;
+  c->e[1][j].imag = a0r*b0i - a0i*b0r + a1r*b1i - a1i*b1r + a2r*b2i - a2i*b2r;
+
+  a0r=a->e[0][2].real; a0i=a->e[0][2].imag;
+  b0r=b->e[0][j].real; b0i=b->e[0][j].imag;
+  a1r=a->e[1][2].real; a1i=a->e[1][2].imag;
+  b1r=b->e[1][j].real; b1i=b->e[1][j].imag;
+  a2r=a->e[2][2].real; a2i=a->e[2][2].imag;
+  b2r=b->e[2][j].real; b2i=b->e[2][j].imag;
+
+  c->e[2][j].real = a0r*b0r + a0i*b0i + a1r*b1r + a1i*b1i + a2r*b2r + a2i*b2i;
+  c->e[2][j].imag = a0r*b0i - a0i*b0r + a1r*b1i - a1i*b1r + a2r*b2i - a2i*b2r;
+}
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target vect_hw_misalign } } } */
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 5a24fb8bf4c..85209604486 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -3005,10 +3005,9 @@ start_over:
   ok = vect_analyze_loop_operations (loop_vinfo);
   if (!ok)
 {
-  if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"bad operation or unsupported loop bound.\n");
-  return ok;
+  ok = opt_result::failure_at (vect_location,
+  "bad operation or unsupported loop bound\n");
+  goto again;
 }
 
   /* For now, we don't expect to mix both masking and length approaches for one
-- 
2.43.0


Re: [PATCH] Add new hardreg PRE pass

2024-12-02 Thread Jeff Law




On 10/31/24 12:29 PM, Andrew Carlotti wrote:

This pass is used to optimise assignments to the FPMR register in
aarch64.  I chose to implement this as a middle-end pass because it
mostly reuses the existing RTL PRE code within gcse.cc.

Compared to RTL PRE, the key difference in this new pass is that we
insert new writes directly to the destination hardreg, instead of
writing to a new pseudo-register and copying the result later.  This
requires changes to the analysis portion of the pass, because sets
cannot be moved before existing instructions that set, use or clobber
the hardreg, and the value becomes unavailable after any uses or
clobbers of the hardreg.
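
A toy model of the transparency part of that analysis might look like the
sketch below.  It only mirrors the rule applied in the
compute_local_properties hunk of this patch (a block is not transparent if
it sets the hardreg or contains a call); hardreg uses are handled separately
in the patch and are not modelled.  The bb_summary type and the function
name are hypothetical and not part of gcse.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-block summary of what happens to the hardreg.  */
struct bb_summary
{
  bool sets_hardreg;   /* The block contains a def of the hardreg.  */
  bool has_call;       /* The block contains a call (treated as a clobber).  */
};

/* Start from "every block is transparent" and clear the entries for blocks
   that set the hardreg or contain a call.  */
void
compute_hardreg_transp (const struct bb_summary *bbs, size_t n, bool *transp)
{
  for (size_t i = 0; i < n; i++)
    transp[i] = !(bbs[i].sets_hardreg || bbs[i].has_call);
}

void
compute_hardreg_transp_demo (void)
{
  const struct bb_summary bbs[4] = {
    { false, false }, { true, false }, { false, true }, { true, true }
  };
  bool transp[4];

  compute_hardreg_transp (bbs, 4, transp);
  assert (transp[0]);
  assert (!transp[1] && !transp[2] && !transp[3]);
}
```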

This patch would currently break any debug instructions that use the
value of fpmr in a region of code where that value is changed by this
pass.  I haven't worked out the best way to fix this, but I suspect the
issue is uncommon and tricky enough that it would be best to just drop
those debug instructions.

I've bootstrapped and regression tested this on aarch64, and it should be NFC
on other targets.  Aside from this, my testing so far has involved hacking in a
single FP8 intrinsic and testing various parameters and control flow
structures, and checking both the codegen and the LCM bitmaps.  I intend to
write better and more comprehensive tests once there are some real intrinsic
implementations available to use.


Is this approach good?  Apart from fixing the debug instructions and
adding tests, is there anything else I need to change?


gcc/ChangeLog:

* config/aarch64/aarch64.h (HARDREG_PRE_REGNOS): New macro.
* gcse.cc (doing_hardreg_pre_p): New global variable.
(current_hardreg_regno): Ditto.
(compute_local_properties): Unset transp for hardreg clobbers.
(prune_hardreg_uses): New.
(want_to_gcse_p): Always return true for hardreg PRE.
(hash_scan_set): Add checks for hardreg uses/clobbers.
(oprs_unchanged_p): Disable load motion for hardreg PRE pass.
(record_last_mem_set_info): Ditto.
(compute_hash_table_work): Record hardreg uses.
(prune_expressions): Mark hardreg sets as call-clobbered.
(compute_pre_data): Add call to prune_hardreg_uses.
(pre_expr_reaches_here_p_work): Add comment.
(insert_insn_start_basic_block): New functions.
(pre_edge_insert): Don't add hardreg sets to predecessor block.
(pre_delete): Use hardreg for the reaching reg.
(pre_gcse): Don't insert copies for hardreg PRE.
(one_pre_gcse_pass): Disable load motion for hardreg PRE pass.
(execute_hardreg_pre): New.
(class pass_hardreg_pre): New.
(pass_hardreg_pre::gate): New.
(make_pass_hardreg_pre): New.
* passes.def (pass_hardreg_pre): New pass.
* tree-pass.h (make_pass_hardreg_pre): New.

So at a 30k foot level, one thing to be very leery of is extending the 
lifetime of any hard register.  It's probably not a big deal on aarch64, 
but it can cause all kinds of headaches on other targets.


Essentially you probably need to avoid PRE on a hard register that's in 
a likely spilled class.









diff --git a/gcc/gcse.cc b/gcc/gcse.cc
index 
31b92f30fa1ba6c519429d4b7bc55547b2d71c01..ce4ebe420c02d78fcde3144eed595e22212aaa0b
 100644
--- a/gcc/gcse.cc
+++ b/gcc/gcse.cc



@@ -693,10 +698,29 @@ compute_local_properties (sbitmap *transp, sbitmap *comp, 
sbitmap *antloc,
 We start by assuming all are transparent [none are killed], and
 then reset the bits for those that are.  */
  if (transp)
-   compute_transp (expr->expr, indx, transp,
-   blocks_with_calls,
-   modify_mem_list_set,
-   canon_modify_mem_list);
+   {
+ compute_transp (expr->expr, indx, transp,
+ blocks_with_calls,
+ modify_mem_list_set,
+ canon_modify_mem_list);
+
+ if (doing_hardreg_pre_p)
+   {
+ /* We also need to check whether the destination hardreg is
+set or call-clobbered in each BB.  We'll check for hardreg
+uses later.  */
+ df_ref def;
+ for (def = DF_REG_DEF_CHAIN (current_hardreg_regno);
+  def;
+  def = DF_REF_NEXT_REG (def))
+   bitmap_clear_bit (transp[DF_REF_BB (def)->index], indx);
+
+ bitmap_iterator bi;
+ unsigned bb_index;
+ EXECUTE_IF_SET_IN_BITMAP (blocks_with_calls, 0, bb_index, bi)
+   bitmap_clear_bit (transp[bb_index], indx);
+   }
+   }
It's been a long time since I looked at the code, but is there code 
already in the pass to walk down the FUSAGE notes attached to calls? 
You'll definitely need that since it can have uses/clobbers of hard regs 
t

[PATCH v1] RISC-V: Fix incorrect optimization options passing to cond and builtin

2024-12-02 Thread pan2 . li
From: Pan Li 

Like the strided load/store tests, the vector cond and builtin testcases
are designed to run with different optimization options, but the execution
log in gcc.log shows that these options are actually ignored.  This patch
fixes that in the same way as the earlier strided load/store fix.

The following test suites passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious, so I will commit it directly
if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Fix the incorrect optimization
options passing to testcases.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp 
b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
index 87c5ecb1a8b..87dea457608 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
+++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
@@ -75,9 +75,9 @@ foreach op $AUTOVEC_TEST_OPTS {
   dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/reduc/*.\[cS\]]] 
\
 "" "$op"
   dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/cond/*.\[cS\]]] \
-"" "$op"
+"$op" ""
   dg-runtest [lsort [glob -nocomplain 
$srcdir/$subdir/autovec/builtin/*.\[cS\]]] \
-"" "$op"
+"$op" ""
 }
 
 # widening operation only test on LMUL < 8
-- 
2.43.0



Re: PING: [PATCH v4 1/7] Honor TARGET_PROMOTE_PROTOTYPES during RTL expand

2024-12-02 Thread H.J. Lu
On Tue, Dec 3, 2024 at 2:55 AM Jeff Law  wrote:
>
>
>
> On 12/2/24 1:55 AM, Richard Biener wrote:
> > On Sun, Dec 1, 2024 at 11:15 PM Jeff Law  wrote:
> >>
> >>
> >>
> >> On 11/27/24 3:34 PM, H.J. Lu wrote:
> >>> On Thu, Nov 21, 2024, 2:02 PM H.J. Lu wrote:
> >>>
> >>>  Promote integer arguments smaller than int if 
> >>> TARGET_PROMOTE_PROTOTYPES
> >>>  returns true.
> >>>
> >>>   PR middle-end/14907
> >>>   * calls.c (initialize_argument_information): Promote small
> >>>  integer
> >>>   arguments if TARGET_PROMOTE_PROTOTYPES returns true.
> >> This doesn't look right.  Promotions are primarily driven by the target
> >> files, in particular TARGET_PROMOTE_FUNCTION_MODE.
> >>
> >> PROMOTE_PROTOTYPES is more of a language front-end hook and it doesn't
> >> seem appropriate to be testing it in calls.cc.
> >
> > It's a misguided hook that, when applied in a subset of frontends, ends
> > up generating wrong code when doing multi-language LTO.  I requested
> > moving its handling to RTL expansion where we can apply it consistently.
> It's probably a fair assessment that if a language FE is doing something
> like that, then it's going to be problematic for LTO.
>
> So maybe the question morphs into whether or not HJ's patch takes us
> down that path and if so, then shouldn't we be looking to remove the FE
> uses?  And if we do that, how do we get a degree of confidence that we

My second patch in the series:

https://patchwork.sourceware.org/project/gcc/list/?series=41009
https://patchwork.sourceware.org/project/gcc/patch/20241121060255.774789-3-hjl.to...@gmail.com/

does that.

> haven't accidentally twiddled the ABI in a meaningful way.

There are no regressions on x86-64.  All outgoing char/short
arguments are extended to int.  It exposes an x86 intrinsic bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117547

and fixes the Fortran bug pointed out by Richard:

https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667559.html

>
>
> >
> > This particular patch looks OK to me (but as said elsewhere I'm not
> > very familiar with calls.cc and its peculiarities).
> I didn't see anything particularly concerning other than the overarching
> question of using what had been a language FE hook in calls.cc.  I'm
> obviously leery of changing ABI stuff this late in the game and would
> generally prefer to defer something like that until the next stage1.
> jeff
>


-- 
H.J.


[PATCH 0/1] aarch64: remove extra XTN in vector concatenation

2024-12-02 Thread Akram Ahmad
Hi all,

This patch adds a new insn which optimises vector concatenations on SIMD/FP
registers when a narrowing truncation is performed on the resulting vector.
This usually results in codegen such as...

uzp1    v0.2d, v0.2d, v1.2d
xtn     v0.2s, v0.2d
ret

... whereas the following would have sufficed without the need for XTN:

uzp1    v0.2s, v0.2s, v1.2s
ret
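
To make the equivalence of the two sequences explicit, here is a small
scalar model in plain C.  It is only a sketch: registers are modelled as
arrays of two 64-bit lanes, lane numbering is assumed to be little-endian
(lane 0 in the least significant bits), and the function names are invented
for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: uzp1 v0.2d, v0.2d, v1.2d ; xtn v0.2s, v0.2d.  */
void
concat_then_narrow (const uint64_t n[2], const uint64_t m[2], uint32_t out[2])
{
  /* uzp1 on .2d takes the even-numbered (here: first) lane of each input.  */
  uint64_t wide[2] = { n[0], m[0] };

  /* xtn keeps the low 32 bits of each 64-bit lane.  */
  out[0] = (uint32_t) wide[0];
  out[1] = (uint32_t) wide[1];
}

/* Model of: uzp1 v0.2s, v0.2s, v1.2s, which views the low 64 bits of each
   input as two 32-bit lanes and takes the even-numbered lane of each.  */
void
narrow_uzp1 (const uint64_t n[2], const uint64_t m[2], uint32_t out[2])
{
  uint32_t n_lanes[2] = { (uint32_t) n[0], (uint32_t) (n[0] >> 32) };
  uint32_t m_lanes[2] = { (uint32_t) m[0], (uint32_t) (m[0] >> 32) };

  out[0] = n_lanes[0];
  out[1] = m_lanes[0];
}

void
uzp1_demo (void)
{
  const uint64_t n[2] = { 0x1111111122222222ULL, 0xaaaaaaaabbbbbbbbULL };
  const uint64_t m[2] = { 0x3333333344444444ULL, 0xccccccccddddddddULL };
  uint32_t a[2], b[2];

  concat_then_narrow (n, m, a);
  narrow_uzp1 (n, m, b);
  assert (a[0] == b[0] && a[1] == b[1]);
  assert (a[0] == 0x22222222u && a[1] == 0x44444444u);
}
```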

A more rigorous example is provided in the commit message. This is a
fairly straightforward patch, although I would appreciate some feedback
as to whether the scope of the modes covered by the insn is appropriate.
Similarly, I would also appreciate any suggestions for other test cases
that should be covered for this optimisation.

Many thanks,

Akram

---

Akram Ahmad (1):
  aarch64: remove extra XTN in vector concatenation

 gcc/config/aarch64/aarch64-simd.md| 16 ++
 gcc/config/aarch64/iterators.md   | 12 ++
 .../aarch64/sve/truncated_concatenation_1.c   | 22 +++
 3 files changed, 50 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c

-- 
2.34.1



Re: [PATCH] c++/contracts: ICE with contract assert on non empty statement [PR 117579]

2024-12-02 Thread Jason Merrill

On 12/2/24 10:59 AM, Nina Ranns wrote:


Tested on x86_64-pc-linux-gnu.
First time using git send-email, let me know if anything needs to be
done differently.


Thanks, just a few tweaks.


OK for trunk?


This question implies to me that the sender has commit access, which I 
don't think you do yet since you aren't in MAINTAINERS.  But I can't 
think of a similar standard question for contributors without commit 
access; I guess just leave it out.



Thanks,
Nina




For git-am this line should contain scissors, e.g.

-- 8< --


Contract assert is an attribute on a non empty statement. Currently we


"on an empty statement", right?


assert that the statement is empty before emitting the assertion. This
has been changed to a conditional check that the statement is empty
before the assertion is emitted.

PR c++/117579

gcc/cp/ChangeLog:

* parser.cc (cp_parser_statement): assertion replaced with a


Capitalize the beginning of the sentence, and change to active voice, 
e.g. "Replace assertion..."



conditional check that the statement containing a contract
assert is empty.

gcc/testsuite/ChangeLog:

* g++.dg/contracts/pr117579.C: New test.

Signed-off-by: Nina Ranns 
---
  gcc/cp/parser.cc  | 6 --
  gcc/testsuite/g++.dg/contracts/pr117579.C | 9 +
  2 files changed, 13 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/contracts/pr117579.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index f60ed47dfd7..e655e5cc0db 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -13082,8 +13082,10 @@ cp_parser_statement (cp_parser* parser, tree 
in_statement_expr,
if (cp_contract_assertion_p (std_attrs))
{
  /* Add the assertion as a statement in the current block.  */
- gcc_assert (!statement || statement == error_mark_node);
- emit_assertion (std_attrs);
+ if (!statement)
+   emit_assertion (std_attrs);
+ /* we already checked that the contract assertion is followed by


Capitalize the beginning of the sentence.


+  a semicolon.  */
  std_attrs = NULL_TREE;
}
  }
diff --git a/gcc/testsuite/g++.dg/contracts/pr117579.C 
b/gcc/testsuite/g++.dg/contracts/pr117579.C
new file mode 100644
index 000..f15cdf0c78d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/contracts/pr117579.C
@@ -0,0 +1,9 @@
+// check that contract assertion on a non empty statement doesn't cause an ICE


Capitalize the beginning of the sentence and end with a period.


+// { dg-do compile }


This is the default; it's OK to have this line, but unnecessary.


+// { dg-options "-std=c++2a -fcontracts " }
+
+void f();
+int main ()
+{
+  [[assert: true]] f(); // { dg-error "assertions must be followed by" }
+}




Re: Should -fsanitize=bounds support counted-by attribute for pointers inside a structure?

2024-12-02 Thread Qing Zhao


> On Nov 30, 2024, at 07:22, Martin Uecker  wrote:
> 
> Am Dienstag, dem 26.11.2024 um 20:59 + schrieb Qing Zhao:
>> Think this over these days, I have another thought that need some feedback:
>> 
>> The major issue right now is:
>> 
>> 1. For the following structure in which the “counted_by” attributes is 
>> attached to the pointer field.
>> 
>> struct foo {
>>  int n;
>>  char *p __attribute__ ((counted_by (n)));
>> } *x;
> 
> BTW: I also do not like the syntax 'n' for a lookup of a member 
> in the member namespace. I think it should be '.n'.  For FAM this 
> is less problematic because it always at the end, but here it is 
> problematic.
While extending this attribute to pointer fields of structures, I haven't
noticed any issue so far with the current syntax 'n' for pointer fields,
even when the field "n" is declared after the corresponding pointer field,
i.e.:

struct foo {
  char *p __attribute__ ((counted_by (n)));
  int n;
};

So, could you please explain a little more about the potential issue here?


>> 
>> There is one important additional requirement:
>> 
>> x->n, x->p can ONLY be changed by changing the whole structure at the same 
>> time. 
>> Otherwise, x->n might not be consistent with x->p.
> 
> By itself, this would still not fix the issue I pointed out.
> 
> struct foo x;
> x = .. ; // set the whole structure
> char *p = x->p;
> x = ... ; // set the whole structure
> 
> What is the bound for 'p' ?  

Since p was set to the pointer field of the old structure, then the bound of it 
should be the old bound. 
> With current rules it would be the old bound.

I thought that this should be the correct behavior, isn’t it?
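
To spell out the behavior I have in mind, here is a plain C sketch that
tracks the bound by hand.  It does not use the attribute at all; the
bounded_ptr type and load_p are hypothetical and only illustrate that the
bound is captured at the time the pointer field is read.

```c
#include <assert.h>

struct foo
{
  int n;
  char *p;   /* Conceptually counted_by (n).  */
};

/* Hypothetical pair of a pointer and the bound that was in effect when the
   pointer was loaded from the structure.  */
struct bounded_ptr
{
  char *ptr;
  int bound;
};

/* Read x->p together with the count that is current at this point.  */
struct bounded_ptr
load_p (const struct foo *x)
{
  struct bounded_ptr bp = { x->p, x->n };
  return bp;
}

void
stale_bound_demo (void)
{
  char small[4], large[16];
  struct foo x = { 4, small };

  struct bounded_ptr p = load_p (&x);   /* char *p = x.p;  */
  x = (struct foo) { 16, large };       /* Set the whole structure.  */

  /* 'p' still refers to the old buffer, so it keeps the old bound.  */
  assert (p.ptr == small && p.bound == 4);
  assert (load_p (&x).ptr == large && load_p (&x).bound == 16);
}
```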

> 
> 
>> 
>> 2. This new requirement is ONLY for “counted_by” attribute that is attached 
>> to the pointer field, not needed
>> for flexible array members.
>> 
>> 3. Then there will be inconsistency for the “counted_by” attribute between 
>> FAM and pointer field.
> 
> 
>> The major questions I have right now:
>> 
>> 1. Shall we keep this inconsistency between FAM and pointer field? 
>> Or, 
>> 
>> 2. Shall we keep them consistent by adding this new requirement for the 
>> previous FAM? 
> 
> Or have a new attribute?  I feel we double down on a bad design
> which made sense for FAM addressing a very specific use case
> (retrofitting checks to the Linux kernel) but is otherwise not
> very strong.

I do agree that the previous specific feature of the "counted_by" attribute
for the FAM, i.e.:
"
One important feature of the attribute is, a reference to the
flexible array member field uses the latest value assigned to the
field that represents the number of the elements before that
reference.  For example,

   p->count = val1;
   p->array[20] = 0;  // ref1 to p->array
   p->count = val2;
   p->array[30] = 0;  // ref2 to p->array

in the above, 'ref1' uses 'val1' as the number of the elements in
'p->array', and 'ref2' uses 'val2' as the number of elements in
'p->array'.
"

is a feature specific to the Linux kernel.  I am wondering whether the
Linux kernel really needs it.

If not, I would prefer to remove this feature from GCC before we
officially release the "counted_by" attribute in GCC 15.

If the Linux kernel does need this feature, then yes, a new attribute for
pointers may be better.

Kees, could you please also provide more comments and suggestion on this?
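
For reference, the "latest value" behavior for the FAM can be sketched as
below.  This is an illustration under assumptions: COUNTED_BY is a local
macro that expands to nothing when the compiler lacks the attribute, and
store_checked performs by hand the kind of check a bounds sanitizer would
derive from the attribute; neither is an existing interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Use the attribute where the compiler provides it, otherwise nothing.  */
#ifdef __has_attribute
# if __has_attribute (counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct buf
{
  size_t count;
  int array[] COUNTED_BY (count);
};

/* Allocate room for CAPACITY elements and record that as the count.  */
struct buf *
buf_alloc (size_t capacity)
{
  struct buf *p = malloc (sizeof (struct buf) + capacity * sizeof (int));
  if (p)
    p->count = capacity;
  return p;
}

/* Store VAL at IDX if IDX is within the count in effect right now.
   Return 0 on success and -1 if the index is out of bounds.  */
int
store_checked (struct buf *p, size_t idx, int val)
{
  if (idx >= p->count)
    return -1;
  p->array[idx] = val;
  return 0;
}

void
latest_count_demo (void)
{
  struct buf *p = buf_alloc (40);
  assert (p != NULL);

  p->count = 25;                            /* val1  */
  assert (store_checked (p, 20, 0) == 0);   /* ref1: checked against 25.  */
  assert (store_checked (p, 30, 0) == -1);  /* Out of bounds for val1.  */

  p->count = 40;                            /* val2  */
  assert (store_checked (p, 30, 0) == 0);   /* ref2: checked against 40.  */

  free (p);
}
```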


> 
> Martin
> 
>> 
>> Kees, the following feature that you requested for the FAM:
>> 
>> "
>> One important feature of the attribute is, a reference to the
>> flexible array member field uses the latest value assigned to the
>> field that represents the number of the elements before that
>> reference.  For example,
>> 
>>p->count = val1;
>>p->array[20] = 0;  // ref1 to p->array
>>p->count = val2;
>>p->array[30] = 0;  // ref2 to p->array
>> 
>> in the above, 'ref1' uses 'val1' as the number of the elements in
>> 'p->array', and 'ref2' uses 'val2' as the number of elements in
>> 'p->array’.
>> "
>> Has this feature been used by Linux kernel already?
>> Is this feature really needed by Linux kernel?
>> 
>> Thanks a lot for suggestions and comments.
>> 
>> Qing
>> 
>>> On Nov 20, 2024, at 14:23, Martin Uecker  wrote:
>>> 
>>> Am Mittwoch, dem 20.11.2024 um 17:37 + schrieb Qing Zhao:
>>>> Hi, Martin,
>>>> 
>>>> Thanks a lot for pointing this out.
>>>> 
>>>> This does look like a problem we need avoid for the pointer arrays.
>>>> 
>>>> Does the same problem exist in the language extension too if the n is
>>>> allowed to be changed after initialization?
>>>> 
>>>> If so, for the future language extension, is there any proposed solution
>>>> to this problem?
>>>> 
>>> 
>>> There is no specification yet and nothing formally proposed, so
>>> it is entirely unclear at this point.
>>> 
>>> My idea would be to give 'x->buf' the type '(cha

[PING^3][PATCH v3] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2024-12-02 Thread jeevitha
Ping!

Please review.

Thanks & Regards
Jeevitha

On 14/10/24 5:16 pm, jeevitha wrote:
> Hi All,
> 
> The following patch has been bootstrapped and regtested on powerpc64le-linux.
> 
> PTImode assists in generating even/odd register pairs on 128 bits. When the
> user specifies PTImode as an attribute, it breaks because there is no
> internal type to handle this mode. To fix this, we have created an
> intPTI_type_internal_node to handle PTImode. We are not documenting this
> __pti_internal type, since users are not encouraged to use this type
> externally.
> 
> 2024-10-14  Jeevitha Palanisamy  
> 
> gcc/
>   PR target/106895
>   * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add
>   RS6000_BTI_INTPTI.
>   * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Add node for
>   PTImode type.
> 
> gcc/testsuite/
>   PR target/106895
>   * gcc.target/powerpc/pr106895.c: New testcase.
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
> b/gcc/config/rs6000/rs6000-builtin.cc
> index 9bdbae1ecf9..baf17f3b28a 100644
> --- a/gcc/config/rs6000/rs6000-builtin.cc
> +++ b/gcc/config/rs6000/rs6000-builtin.cc
> @@ -756,6 +756,15 @@ rs6000_init_builtins (void)
>else
>  ieee128_float_type_node = NULL_TREE;
>  
> +  /* PTImode to get even/odd register pairs.  */
> +  intPTI_type_internal_node = make_node(INTEGER_TYPE);
> +  TYPE_PRECISION (intPTI_type_internal_node) = GET_MODE_BITSIZE (PTImode);
> +  layout_type (intPTI_type_internal_node);
> +  SET_TYPE_MODE (intPTI_type_internal_node, PTImode);
> +  t = build_qualified_type (intPTI_type_internal_node, TYPE_QUAL_CONST);
> +  lang_hooks.types.register_builtin_type (intPTI_type_internal_node,
> +   "__pti_internal");
> +
>/* Vector pair and vector quad support.  */
>vector_pair_type_node = make_node (OPAQUE_TYPE);
>SET_TYPE_MODE (vector_pair_type_node, OOmode);
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index d460eb06544..1612b3e2fcd 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -2288,6 +2288,7 @@ enum rs6000_builtin_type_index
>RS6000_BTI_ptr_vector_quad,
>RS6000_BTI_ptr_long_long,
>RS6000_BTI_ptr_long_long_unsigned,
> +  RS6000_BTI_INTPTI,
>RS6000_BTI_MAX
>  };
>  
> @@ -2332,6 +2333,7 @@ enum rs6000_builtin_type_index
>  #define uintDI_type_internal_node 
> (rs6000_builtin_types[RS6000_BTI_UINTDI])
>  #define intTI_type_internal_node  
> (rs6000_builtin_types[RS6000_BTI_INTTI])
>  #define uintTI_type_internal_node 
> (rs6000_builtin_types[RS6000_BTI_UINTTI])
> +#define intPTI_type_internal_node 
> (rs6000_builtin_types[RS6000_BTI_INTPTI])
>  #define float_type_internal_node  
> (rs6000_builtin_types[RS6000_BTI_float])
>  #define double_type_internal_node 
> (rs6000_builtin_types[RS6000_BTI_double])
>  #define long_double_type_internal_node
> (rs6000_builtin_types[RS6000_BTI_long_double])
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106895.c 
> b/gcc/testsuite/gcc.target/powerpc/pr106895.c
> new file mode 100644
> index 000..88516c5a426
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106895.c
> @@ -0,0 +1,17 @@
> +/* PR target/106895 */
> +/* { dg-do assemble } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-O2 -save-temps" } */
> +
> +/* Verify the following generates even/odd register pairs.  */
> +
> +typedef __int128 pti __attribute__((mode(PTI)));
> +
> +void
> +set128 (pti val, pti *mem)
> +{
> +asm("stq %1,%0" : "=m"(*mem) : "r"(val));
> +}
> +
> +/* { dg-final { scan-assembler {\mstq\M} } } */
> +
> 
> 
> 



[PING^3][PATCH] testsuite: Simplify target test and dg-options for AMO tests

2024-12-02 Thread jeevitha
Ping!

Please review.

Thanks & Regards
Jeevitha

On 15/10/24 12:49 pm, jeevitha wrote:
> Hi All,
> 
> Removed powerpc*-*-* from the target test as it is always true. Simplified
> options by removing -mpower9-misc and -mvsx, which are enabled by default with
> -mdejagnu-cpu=power9. The has_arch_pwr9 check is also true with
> -mdejagnu-cpu=power9, so it has been removed.
> 
> 2024-10-15 Jeevitha Palanisamy 
> 
> gcc/testsuite/
> 
>   * gcc.target/powerpc/amo1.c: Removed powerpc*-*-* from the target and
>   simplified dg-options.
>   * gcc.target/powerpc/amo2.c: Simplified dg-options and added powerpc_vsx
>   target check.
> 
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/amo1.c 
> b/gcc/testsuite/gcc.target/powerpc/amo1.c
> index c5af373b4e9..9a981cd4219 100644
> --- a/gcc/testsuite/gcc.target/powerpc/amo1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/amo1.c
> @@ -1,6 +1,5 @@
> -/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> -/* { dg-options "-mvsx -mpower9-misc -O2" } */
> -/* { dg-additional-options "-mdejagnu-cpu=power9" { target { ! has_arch_pwr9 
> } } } */
> +/* { dg-do compile { target { lp64 } } } */
> +/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
>  /* { dg-require-effective-target powerpc_vsx } */
>  
>  /* Verify P9 atomic memory operations.  */
> diff --git a/gcc/testsuite/gcc.target/powerpc/amo2.c 
> b/gcc/testsuite/gcc.target/powerpc/amo2.c
> index 592f0fb3f92..9e4ff0ce064 100644
> --- a/gcc/testsuite/gcc.target/powerpc/amo2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/amo2.c
> @@ -1,6 +1,6 @@
>  /* { dg-do run { target { powerpc*-*-linux* && { lp64 && p9vector_hw } } } } 
> */
> -/* { dg-options "-O2 -mvsx -mpower9-misc" } */
> -/* { dg-additional-options "-mdejagnu-cpu=power9" { target { ! has_arch_pwr9 
> } } } */
> +/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
> +/* { dg-require-effective-target powerpc_vsx } */
>  
>  #include 
>  #include 
> 
> 
> 



Re: [PATCH] arm: [MVE intrinsics] Avoid warnings when floating-point is not supported [PR 117814]

2024-12-02 Thread Richard Earnshaw (lists)
On 02/12/2024 11:21, Christophe Lyon wrote:
> If the target does not support floating-point, we register FP vector
> types as 'void' (see register_vector_type).
> 
> This leads to warnings about 'pure attribute on function returning
> void' when we declare the various load intrinsics because their
> call_properties say CP_READ_MEMORY (thus giving them the 'pure'
> attribute), but their return type is void.
> 
> To avoid such warnings, pretend the call_properties are empty when FP
> is disabled and the function would return an FP value.  If such
> functions are incorrectly used in user code, a proper error is
> emitted:
> unknown type name ‘float16x8_t’; did you mean ‘int16x8_t’?
> 
> gcc/ChangeLog:
> 
>   PR target/117814
>   * config/arm/arm-mve-builtins-base.cc (vld1_impl): Fix
>   call_properties.
>   (vld24_impl): Likewise.
>   * config/arm/arm-mve-builtins-functions.h (load_extending):
>   Likewise.

Won't this lead to problems if the code is something like

#include "arm_mve.h"

#pragma GCC target ("arch=armv8.1-m.main+mve.fp")

// Some use of an affected intrinsic

and then compile with "-march=armv8.1-m.main+mve -mfpu=auto -mfloat-abi=softfp"?

R.

> ---
>  gcc/config/arm/arm-mve-builtins-base.cc | 22 +++--
>  gcc/config/arm/arm-mve-builtins-functions.h | 11 ++-
>  2 files changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
> b/gcc/config/arm/arm-mve-builtins-base.cc
> index 723004b53d7..a322730eca8 100644
> --- a/gcc/config/arm/arm-mve-builtins-base.cc
> +++ b/gcc/config/arm/arm-mve-builtins-base.cc
> @@ -141,8 +141,17 @@ class vld1_impl : public full_width_access
>  {
>  public:
>unsigned int
> -  call_properties (const function_instance &) const override
> +  call_properties (const function_instance &instance) const override
>{
> +/* If the target does not support floating-point, we register FP vector
> +   types as 'void'.  In this case, pretend we do not access memory to 
> avoid
> +   warnings about 'pure attribute on function returning void' when we
> +   declare the intrinsics.  Such uses in user code are properly
> +   diagnosed.  */
> +if (!TARGET_HAVE_MVE_FLOAT
> + && instance.type_suffix (0).float_p)
> +  return 0;
> +
>  return CP_READ_MEMORY;
>}
>  
> @@ -1141,8 +1150,17 @@ public:
>using full_width_access::full_width_access;
>  
>unsigned int
> -  call_properties (const function_instance &) const override
> +  call_properties (const function_instance &instance) const override
>{
> +/* If the target does not support floating-point, we register FP vector
> +   types as 'void'.  In this case, pretend we do not access memory to 
> avoid
> +   warnings about 'pure attribute on function returning void' when we
> +   declare the intrinsics.  Such uses in user code are properly
> +   diagnosed.  */
> +if (!TARGET_HAVE_MVE_FLOAT
> + && instance.type_suffix (0).float_p)
> +  return 0;
> +
>  return CP_READ_MEMORY;
>}
>  
> diff --git a/gcc/config/arm/arm-mve-builtins-functions.h 
> b/gcc/config/arm/arm-mve-builtins-functions.h
> index 0ade2157e4a..1a9a347805c 100644
> --- a/gcc/config/arm/arm-mve-builtins-functions.h
> +++ b/gcc/config/arm/arm-mve-builtins-functions.h
> @@ -986,8 +986,17 @@ public:
>m_float_memory_type (NUM_TYPE_SUFFIXES)
>{}
>  
> -  unsigned int call_properties (const function_instance &) const override
> +  unsigned int call_properties (const function_instance &instance) const 
> override
>{
> +/* If the target does not support floating-point, we register FP vector
> +   types as 'void'.  In this case, pretend we do not access memory to 
> avoid
> +   warnings about 'pure attribute on function returning void' when we
> +   declare the intrinsics.  Such uses in user code are properly
> +   diagnosed.  */
> +if (!TARGET_HAVE_MVE_FLOAT
> + && instance.type_suffix (0).float_p)
> +  return 0;
> +
>  return CP_READ_MEMORY;
>}
>  



Re: PING: [PATCH v4 1/7] Honor TARGET_PROMOTE_PROTOTYPES during RTL expand

2024-12-02 Thread Richard Biener
On Sun, Dec 1, 2024 at 11:15 PM Jeff Law  wrote:
>
>
>
> On 11/27/24 3:34 PM, H.J. Lu wrote:
> > On Thu, Nov 21, 2024, 2:02 PM H.J. Lu wrote:
> >
> > Promote integer arguments smaller than int if TARGET_PROMOTE_PROTOTYPES
> > returns true.
> >
> >  PR middle-end/14907
> >  * calls.c (initialize_argument_information): Promote small
> > integer
> >  arguments if TARGET_PROMOTE_PROTOTYPES returns true.
> This doesn't look right.  Promotions are primarily driven by the target
> files, in particular TARGET_PROMOTE_FUNCTION_MODE.
>
> PROMOTE_PROTOTYPES is more of a language front-end hook and it doesn't
> seem appropriate to be testing it in calls.cc.

It's a misguided hook that, when applied in only a subset of frontends,
ends up generating wrong code when doing multi-language LTO.  I requested
moving its handling to RTL expansion where we can apply it consistently.

This particular patch looks OK to me (but as said elsewhere I'm not
very familiar with calls.cc and its peculiarities).

Richard.

>
> Jeff


RE: [PATCH v3] MATCH: Simplify `(trunc)copysign ((extend)x, CST)` to `copysign (x, -1.0/1.0)` [PR112472]

2024-12-02 Thread Eikansh Gupta
> On Thu, Nov 14, 2024 at 11:59 AM Eikansh Gupta
>  wrote:
> >
> > This patch simplifies `(trunc)copysign ((extend)x, CST)` to `copysign
> > (x, -1.0/1.0)` depending on the sign of CST. Previously, it was simplified
> > to `copysign (x, CST)`.
> > It can be optimized as the sign of the CST matters, not the value.
> >
> > The patch also simplifies `(trunc)abs (extend x)` to `abs (x)`.
> 
> Please do not mix two different changes.

The second change is needed because at match.pd:1198, copysign(x, +ve CST) is
converted to abs(x).
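
For illustration only (this is not the testcase from the patch), here is a
small C sketch of why the value of CST is irrelevant and only its sign
matters.  It relies on GCC's __builtin_copysign, __builtin_copysignf,
__builtin_signbit and __builtin_fabsf so that it does not depend on libm;
the helper names via_double, direct and forms_agree are made up for this
example.

```c
#include <assert.h>

/* Hypothetical helper: the unsimplified form,
   (trunc)copysign ((extend)x, CST).  */
static float
via_double (float x, double cst)
{
  return (float) __builtin_copysign ((double) x, cst);
}

/* Hypothetical helper: the simplified form, copysign (x, -1.0/1.0),
   where only the sign bit of CST is consulted.  */
static float
direct (float x, double cst)
{
  return __builtin_copysignf (x, __builtin_signbit (cst) ? -1.0f : 1.0f);
}

/* Returns 1 if both forms produce the same result for X and CST.  */
static int
forms_agree (float x, double cst)
{
  float a = via_double (x, cst);
  float b = direct (x, cst);
  assert (a == a && b == b);  /* the sketch is not meant for NaN inputs */
  return a == b;
}
```

With a positive constant the result is the same as __builtin_fabsf (x),
which is the copysign-to-abs canonicalisation mentioned above.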

> 
> > PR tree-optimization/112472
> >
> > gcc/ChangeLog:
> >
> > * match.pd ((trunc)copysign ((extend)x, -CST) --> copysign (x, -1.0)):
> > New pattern.
> > ((trunc)abs (extend x) --> abs (x)): New pattern.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/pr112472.c: New test.
> >
> > Signed-off-by: Eikansh Gupta 
> > ---
> >  gcc/match.pd | 25 +++-
> >  gcc/testsuite/gcc.dg/tree-ssa/pr112472.c | 22 +
> >  2 files changed, 42 insertions(+), 5 deletions(-)  create mode 100644
> > gcc/testsuite/gcc.dg/tree-ssa/pr112472.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd index
> > 00988241348..5b930beb418 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -8854,19 +8854,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >  type, OPTIMIZE_FOR_BOTH))
> > (tos @0
> >
> > -/* Simplify (trunc)copysign ((extend)x, (extend)y) to copysignf (x, y),
> > -   x,y is float value, similar for _Float16/double.  */
> > +/* Simplify (trunc)copysign ((extend)x, (extend)y) to copysignf (x, y) and
> > +   simplify (trunc)copysign ((extend)x, CST) to copysign (x, -1.0/1.0).
> > +   x,y is float value, similar for _Float16/double. */
> >  (for copysigns (COPYSIGN_ALL)
> >   (simplify
> > -  (convert (copysigns (convert@2 @0) (convert @1)))
> > +  (convert (copysigns (convert@2 @0) (convert2? @1)))
> 
> You want to capture convert2? with @3
> 
> > (if (optimize
> > && !HONOR_SNANS (@2)
> > && types_match (type, TREE_TYPE (@0))
> > -   && types_match (type, TREE_TYPE (@1))
> > && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (@2))
> > && direct_internal_fn_supported_p (IFN_COPYSIGN,
> >   type, OPTIMIZE_FOR_BOTH))
> > -(IFN_COPYSIGN @0 @1
> > + (if (TREE_CODE (@1) == REAL_CST)
> 
> and check TREE_CODE (@3) == REAL_CST, we might not always fold a
> conversion of a FP constant.

A basic question here. When do we not fold this type of conversion?

> > +  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
> > +   (IFN_COPYSIGN @0 { build_minus_one_cst (type); })
> > +   (IFN_COPYSIGN @0 { build_one_cst (type); }))

Should I remove the above line as copysign(x, +ve CST) is converted to abs(x)?

> > +  (if (types_match (type, TREE_TYPE (@1)))
> > +   (IFN_COPYSIGN @0 @1))
> > +
> > +/* (trunc)abs (extend x) --> abs (x)
> > +   x is a float value */
> > +(simplify
> > + (convert (abs (convert@1 @0)))
> > +  (if (optimize
> > +  && !HONOR_SNANS (@1)
> > +  && types_match (type, TREE_TYPE (@0))
> > +  && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (@1)))
> > +   (abs @0)))
> 
> This one is OK, but I don't see a testcase?  Please split it out to a
> separate patch and add one.

The second testcase is for this change. I will split it into a separate patch.

> 
> Richard.

Regards,
Eikansh



Re: [PATCH] aarch64: Add flags field to aarch64-simd-pragma-builtins.def

2024-12-02 Thread Andrew Pinski
On Mon, Dec 2, 2024 at 2:00 AM Richard Sandiford
 wrote:
>
> This patch adds a flags field to aarch64-simd-pragma-builtins.def
> and uses it to add attributes to the function declaration.
>
> Bootstrapped & regression-tested on aarch64-linux-gnu.  I'll commit
> tomorrow if there are no comments before then.

This was on my to-do list, so that is one less thing for me.

Thanks,
Andrew
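
As a rough sketch of the pattern the quoted patch extends (not the actual
aarch64 code), the following C fragment shows an X-macro list whose entries
gain a trailing flags argument that is pasted into a FLAG_* name.  The entry
names, the FLAG_* values and example_builtin_data are all invented for this
illustration; the real definitions live in aarch64-builtins.cc and the .def
file.

```c
#include <assert.h>

/* Hypothetical flag values, chosen only for this sketch.  */
enum { FLAG_NONE = 0, FLAG_FP = 1 };

/* Stand-in for the .def file: each entry carries a trailing flags
   argument.  The entry names are invented.  */
#define EXAMPLE_BUILTINS(X)  \
  X (vexample_add, 2, FP)    \
  X (vexample_lut, 3, NONE)

/* First expansion: one enumerator per entry; F is accepted but unused.  */
#define ENTRY(N, NARGS, F) EXAMPLE_##N,
enum example_builtin { EXAMPLE_BUILTINS (ENTRY) EXAMPLE_COUNT };
#undef ENTRY

struct example_builtin_data
{
  const char *name;
  int nargs;
  unsigned int flags;
};

/* Second expansion: the data table, with F pasted into a FLAG_* name.  */
#define ENTRY(N, NARGS, F) { #N, NARGS, FLAG_##F },
static const struct example_builtin_data example_builtins[] = {
  EXAMPLE_BUILTINS (ENTRY)
};
#undef ENTRY

/* Looks up the flags recorded for CODE.  */
static unsigned int
example_flags (enum example_builtin code)
{
  assert (code < EXAMPLE_COUNT);
  return example_builtins[code].flags;
}
```

Every expansion of the list has to accept the extra argument, which is why
the patch touches each ENTRY definition even where the flag is not used.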

>
> Richard
>
>
> gcc/
> * config/aarch64/aarch64-simd-pragma-builtins.def: Add a flags
> field to each entry.
> * config/aarch64/aarch64-builtins.cc: Update includes accordingly.
> (aarch64_pragma_builtins_data): Add a flags field.
> (aarch64_init_pragma_builtins): Use the flags field to add attributes
> to the function declaration.
> ---
>  gcc/config/aarch64/aarch64-builtins.cc| 12 ++--
>  .../aarch64/aarch64-simd-pragma-builtins.def  | 59 +++
>  2 files changed, 42 insertions(+), 29 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
> b/gcc/config/aarch64/aarch64-builtins.cc
> index 9f578a77888..d58065cdd58 100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -782,7 +782,7 @@ typedef struct
>AARCH64_SIMD_BUILTIN_##T##_##N##A,
>
>  #undef ENTRY
> -#define ENTRY(N, S, T0, T1, T2, U) \
> +#define ENTRY(N, S, T0, T1, T2, U, F)  \
>AARCH64_##N,
>
>  enum aarch64_builtins
> @@ -1647,9 +1647,10 @@ namespace simd_types {
>  }
>
>  #undef ENTRY
> -#define ENTRY(N, S, T0, T1, T2, U) \
> +#define ENTRY(N, S, T0, T1, T2, U, F) \
>{#N, aarch64_builtin_signatures::S, simd_types::T0, simd_types::T1, \
> -   simd_types::T2, U, aarch64_required_extensions::REQUIRED_EXTENSIONS},
> +   simd_types::T2, U, aarch64_required_extensions::REQUIRED_EXTENSIONS, \
> +   FLAG_##F},
>
>  /* Initialize pragma builtins.  */
>
> @@ -1660,6 +1661,7 @@ struct aarch64_pragma_builtins_data
>simd_type types[3];
>int unspec;
>aarch64_required_extensions required_extensions;
> +  unsigned int flags;
>  };
>
>  static aarch64_pragma_builtins_data aarch64_pragma_builtins[] = {
> @@ -1704,8 +1706,10 @@ aarch64_init_pragma_builtins ()
>auto data = aarch64_pragma_builtins[i];
>auto fntype = aarch64_fntype (data);
>auto code = AARCH64_PRAGMA_BUILTIN_START + i + 1;
> +  auto flag_mode = data.types[0].mode;
> +  auto attrs = aarch64_get_attributes (data.flags, flag_mode);
>aarch64_builtin_decls[code]
> -   = aarch64_general_simulate_builtin (data.name, fntype, code);
> +   = aarch64_general_simulate_builtin (data.name, fntype, code, attrs);
>  }
>  }
>
> diff --git a/gcc/config/aarch64/aarch64-simd-pragma-builtins.def 
> b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def
> index db40745e9e3..ae8732bdb31 100644
> --- a/gcc/config/aarch64/aarch64-simd-pragma-builtins.def
> +++ b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def
> @@ -19,46 +19,55 @@
> .  */
>
>  #undef ENTRY_BINARY
> -#define ENTRY_BINARY(N, T0, T1, T2, U) \
> -  ENTRY (N, binary, T0, T1, T2, U)
> +#define ENTRY_BINARY(N, T0, T1, T2, U, F)  \
> +  ENTRY (N, binary, T0, T1, T2, U, F)
>
>  #undef ENTRY_BINARY_LANE
> -#define ENTRY_BINARY_LANE(N, T0, T1, T2, U)\
> -  ENTRY (N, binary_lane, T0, T1, T2, U)
> +#define ENTRY_BINARY_LANE(N, T0, T1, T2, U, F) \
> +  ENTRY (N, binary_lane, T0, T1, T2, U, F)
>
>  #undef ENTRY_BINARY_VHSDF
> -#define ENTRY_BINARY_VHSDF(NAME, UNSPEC)  \
> -  ENTRY_BINARY (NAME##_f16, f16, f16, f16, UNSPEC) \
> -  ENTRY_BINARY (NAME##q_f16, f16q, f16q, f16q, UNSPEC) \
> -  ENTRY_BINARY (NAME##_f32, f32, f32, f32, UNSPEC) \
> -  ENTRY_BINARY (NAME##q_f32, f32q, f32q, f32q, UNSPEC) \
> -  ENTRY_BINARY (NAME##q_f64, f64q, f64q, f64q, UNSPEC)
> +#define ENTRY_BINARY_VHSDF(NAME, UNSPEC, FLAGS)\
> +  ENTRY_BINARY (NAME##_f16, f16, f16, f16, UNSPEC, FLAGS)  \
> +  ENTRY_BINARY (NAME##q_f16, f16q, f16q, f16q, UNSPEC, FLAGS)  \
> +  ENTRY_BINARY (NAME##_f32, f32, f32, f32, UNSPEC, FLAGS)  \
> +  ENTRY_BINARY (NAME##q_f32, f32q, f32q, f32q, UNSPEC, FLAGS)  \
> +  ENTRY_BINARY (NAME##q_f64, f64q, f64q, f64q, UNSPEC, FLAGS)
>
>  #undef ENTRY_TERNARY_VLUT8
> -#define ENTRY_TERNARY_VLUT8(T) \
> -  ENTRY_BINARY_LANE (vluti2_lane_##T##8, T##8q, T##8, u8, UNSPEC_LUTI2)  
>   \
> -  ENTRY_BINARY_LANE (vluti2_laneq_##T##8, T##8q, T##8, u8q, UNSPEC_LUTI2) \
> -  ENTRY_BINARY_LANE (vluti2q_lane_##T##8, T##8q, T##8q, u8, UNSPEC_LUTI2) \
> -  ENTRY_BINARY_LANE (vluti2q_laneq_##T##8, T##8q, T##8q, u8q, UNSPEC_LUTI2) \
> -  ENTRY_BINARY_LANE (vluti4q_lane_##T##8, T##8q, T##8q, u8, UNSPEC_LUTI4) \
> -  ENTRY_BINARY_LANE (vluti4q_laneq_##T##8, T##8q, T##8q, u8q, UNSPEC_LUTI4)
> +#define ENTRY_TERNARY_VLUT8(T) \
> +  ENTRY_BINARY_LANE (vluti2_lane_##T##8, T##8q, T##8, u8,  \
> +  

Re: [PATCH] riscv: Avoid narrowing warning

2024-12-02 Thread Kito Cheng
LGTM, thanks :)
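
For readers skimming the diff below: the old tables initialised integer
arrays with hexadecimal floating constants such as 0xbcp8 (that is,
0xbc * 2^8), which appears to be what draws the narrowing diagnostic named
in the subject.  The following C sketch, with example_uhwi standing in for
unsigned HOST_WIDE_INT (an assumption made here), checks that the
shift-based macros denote the same values.

```c
#include <assert.h>

/* Stand-in for unsigned HOST_WIDE_INT; an assumption for this sketch.  */
typedef unsigned long long example_uhwi;

/* Same shape as the macros in the patch: shift an integer constant
   instead of writing a hexadecimal floating constant.  */
#define P8(v) ((example_uhwi) (v) << 8)
#define P48(v) ((example_uhwi) (v) << 48)

/* Returns 1 if the integer form has the same value as the hex-float form.  */
static int
same_value (example_uhwi shifted, double hex_float)
{
  assert (hex_float >= 0.0);
  return (double) shifted == hex_float;
}
```

For example, same_value (P8 (0xbc), 0xbcp8) compares the new and old
spellings of the first fli_value_hf entry.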

On Mon, Dec 2, 2024 at 6:00 PM Andreas Schwab  wrote:
>
> * config/riscv/riscv.cc (fli_value_hf, fli_value_sf)
> (fli_value_df): Use integer constants.  Constify.
> (riscv_float_const_rtx_index_for_fli): Add const.
> ---
>  gcc/config/riscv/riscv.cc | 64 ---
>  1 file changed, 39 insertions(+), 25 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 7a1724d6e73..0a6c00926b3 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -1637,35 +1637,49 @@ static int riscv_symbol_insns (enum riscv_symbol_type 
> type)
> Manual draft. For details, please see:
> https://github.com/riscv/riscv-isa-manual/releases/tag/isa-449cd0c  */
>
> -static unsigned HOST_WIDE_INT fli_value_hf[32] =
> -{
> -  0xbcp8, 0x4p8, 0x1p8, 0x2p8, 0x1cp8, 0x20p8, 0x2cp8, 0x30p8,
> -  0x34p8, 0x35p8, 0x36p8, 0x37p8, 0x38p8, 0x39p8, 0x3ap8, 0x3bp8,
> -  0x3cp8, 0x3dp8, 0x3ep8, 0x3fp8, 0x40p8, 0x41p8, 0x42p8, 0x44p8,
> -  0x48p8, 0x4cp8, 0x58p8, 0x5cp8, 0x78p8,
> +static const unsigned HOST_WIDE_INT fli_value_hf[32] =
> +{
> +#define P8(v) ((unsigned HOST_WIDE_INT) (v) << 8)
> +  P8(0xbc), P8(0x4), P8(0x1), P8(0x2),
> +  P8(0x1c), P8(0x20), P8(0x2c), P8(0x30),
> +  P8(0x34), P8(0x35), P8(0x36), P8(0x37),
> +  P8(0x38), P8(0x39), P8(0x3a), P8(0x3b),
> +  P8(0x3c), P8(0x3d), P8(0x3e), P8(0x3f),
> +  P8(0x40), P8(0x41), P8(0x42), P8(0x44),
> +  P8(0x48), P8(0x4c), P8(0x58), P8(0x5c),
> +  P8(0x78),
>/* Only used for filling, ensuring that 29 and 30 of HF are the same.  */
> -  0x78p8,
> -  0x7cp8, 0x7ep8
> +  P8(0x78),
> +  P8(0x7c), P8(0x7e)
> +#undef P8
>  };
>
> -static unsigned HOST_WIDE_INT fli_value_sf[32] =
> -{
> -  0xbf8p20, 0x008p20, 0x378p20, 0x380p20, 0x3b8p20, 0x3c0p20, 0x3d8p20, 
> 0x3e0p20,
> -  0x3e8p20, 0x3eap20, 0x3ecp20, 0x3eep20, 0x3f0p20, 0x3f2p20, 0x3f4p20, 
> 0x3f6p20,
> -  0x3f8p20, 0x3fap20, 0x3fcp20, 0x3fep20, 0x400p20, 0x402p20, 0x404p20, 
> 0x408p20,
> -  0x410p20, 0x418p20, 0x430p20, 0x438p20, 0x470p20, 0x478p20, 0x7f8p20, 
> 0x7fcp20
> +static const unsigned HOST_WIDE_INT fli_value_sf[32] =
> +{
> +#define P20(v) ((unsigned HOST_WIDE_INT) (v) << 20)
> +  P20(0xbf8), P20(0x008), P20(0x378), P20(0x380),
> +  P20(0x3b8), P20(0x3c0), P20(0x3d8), P20(0x3e0),
> +  P20(0x3e8), P20(0x3ea), P20(0x3ec), P20(0x3ee),
> +  P20(0x3f0), P20(0x3f2), P20(0x3f4), P20(0x3f6),
> +  P20(0x3f8), P20(0x3fa), P20(0x3fc), P20(0x3fe),
> +  P20(0x400), P20(0x402), P20(0x404), P20(0x408),
> +  P20(0x410), P20(0x418), P20(0x430), P20(0x438),
> +  P20(0x470), P20(0x478), P20(0x7f8), P20(0x7fc)
> +#undef P20
>  };
>
> -static unsigned HOST_WIDE_INT fli_value_df[32] =
> -{
> -  0xbff0p48, 0x10p48, 0x3ef0p48, 0x3f00p48,
> -  0x3f70p48, 0x3f80p48, 0x3fb0p48, 0x3fc0p48,
> -  0x3fd0p48, 0x3fd4p48, 0x3fd8p48, 0x3fdcp48,
> -  0x3fe0p48, 0x3fe4p48, 0x3fe8p48, 0x3fecp48,
> -  0x3ff0p48, 0x3ff4p48, 0x3ff8p48, 0x3ffcp48,
> -  0x4000p48, 0x4004p48, 0x4008p48, 0x4010p48,
> -  0x4020p48, 0x4030p48, 0x4060p48, 0x4070p48,
> -  0x40e0p48, 0x40f0p48, 0x7ff0p48, 0x7ff8p48
> +static const unsigned HOST_WIDE_INT fli_value_df[32] =
> +{
> +#define P48(v) ((unsigned HOST_WIDE_INT) (v) << 48)
> +  P48(0xbff0), P48(0x10), P48(0x3ef0), P48(0x3f00),
> +  P48(0x3f70), P48(0x3f80), P48(0x3fb0), P48(0x3fc0),
> +  P48(0x3fd0), P48(0x3fd4), P48(0x3fd8), P48(0x3fdc),
> +  P48(0x3fe0), P48(0x3fe4), P48(0x3fe8), P48(0x3fec),
> +  P48(0x3ff0), P48(0x3ff4), P48(0x3ff8), P48(0x3ffc),
> +  P48(0x4000), P48(0x4004), P48(0x4008), P48(0x4010),
> +  P48(0x4020), P48(0x4030), P48(0x4060), P48(0x4070),
> +  P48(0x40e0), P48(0x40f0), P48(0x7ff0), P48(0x7ff8)
> +#undef P48
>  };
>
>  /* Display floating-point values at the assembly level, which is consistent
> @@ -1686,7 +1700,7 @@ const char *fli_value_print[32] =
>  int
>  riscv_float_const_rtx_index_for_fli (rtx x)
>  {
> -  unsigned HOST_WIDE_INT *fli_value_array;
> +  const unsigned HOST_WIDE_INT *fli_value_array;
>
>machine_mode mode = GET_MODE (x);
>
> --
> 2.47.1
>
>
> --
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."


[PATCH] arm: [MVE intrinsics] Avoid warnings when floating-point is not supported [PR 117814]

2024-12-02 Thread Christophe Lyon
If the target does not support floating-point, we register FP vector
types as 'void' (see register_vector_type).

This leads to warnings about 'pure attribute on function returning
void' when we declare the various load intrinsics because their
call_properties say CP_READ_MEMORY (thus giving them the 'pure'
attribute), but their return type is void.

To avoid such warnings, pretend the call_properties are empty when FP
is disabled and the function would return an FP value.  If such
functions are incorrectly used in user code, a proper error is
emitted:
unknown type name ‘float16x8_t’; did you mean ‘int16x8_t’?
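
A simplified C model of the check described above may help; it is only a
sketch and not the C++ code in the patch.  EXAMPLE_CP_READ_MEMORY and both
helper functions are hypothetical stand-ins: have_mve_float plays the role
of TARGET_HAVE_MVE_FLOAT and suffix_is_float the role of
instance.type_suffix (0).float_p.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the real CP_READ_MEMORY bit; the value is an assumption.  */
#define EXAMPLE_CP_READ_MEMORY 1u

/* Hypothetical model of the patched call_properties for load intrinsics.  */
static unsigned int
example_load_call_properties (bool have_mve_float, bool suffix_is_float)
{
  /* FP vector types are registered as 'void' in this configuration, so
     claiming a memory read (and hence 'pure') would trigger the warning.  */
  if (!have_mve_float && suffix_is_float)
    return 0;

  return EXAMPLE_CP_READ_MEMORY;
}

/* Hypothetical predicate: in this model a declaration is treated as
   'pure' exactly when it only reads memory.  */
static bool
example_is_pure (unsigned int props)
{
  assert ((props & ~EXAMPLE_CP_READ_MEMORY) == 0);  /* one bit modelled */
  return props == EXAMPLE_CP_READ_MEMORY;
}
```

Integer loads keep their properties in every configuration; only the
FP-returning variants on a target without FP support drop them.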

gcc/ChangeLog:

PR target/117814
* config/arm/arm-mve-builtins-base.cc (vld1_impl): Fix
call_properties.
(vld24_impl): Likewise.
* config/arm/arm-mve-builtins-functions.h (load_extending):
Likewise.
---
 gcc/config/arm/arm-mve-builtins-base.cc | 22 +++--
 gcc/config/arm/arm-mve-builtins-functions.h | 11 ++-
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
b/gcc/config/arm/arm-mve-builtins-base.cc
index 723004b53d7..a322730eca8 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -141,8 +141,17 @@ class vld1_impl : public full_width_access
 {
 public:
   unsigned int
-  call_properties (const function_instance &) const override
+  call_properties (const function_instance &instance) const override
   {
+/* If the target does not support floating-point, we register FP vector
+   types as 'void'.  In this case, pretend we do not access memory to avoid
+   warnings about 'pure attribute on function returning void' when we
+   declare the intrinsics.  Such uses in user code are properly
+   diagnosed.  */
+if (!TARGET_HAVE_MVE_FLOAT
+   && instance.type_suffix (0).float_p)
+  return 0;
+
 return CP_READ_MEMORY;
   }
 
@@ -1141,8 +1150,17 @@ public:
   using full_width_access::full_width_access;
 
   unsigned int
-  call_properties (const function_instance &) const override
+  call_properties (const function_instance &instance) const override
   {
+/* If the target does not support floating-point, we register FP vector
+   types as 'void'.  In this case, pretend we do not access memory to avoid
+   warnings about 'pure attribute on function returning void' when we
+   declare the intrinsics.  Such uses in user code are properly
+   diagnosed.  */
+if (!TARGET_HAVE_MVE_FLOAT
+   && instance.type_suffix (0).float_p)
+  return 0;
+
 return CP_READ_MEMORY;
   }
 
diff --git a/gcc/config/arm/arm-mve-builtins-functions.h 
b/gcc/config/arm/arm-mve-builtins-functions.h
index 0ade2157e4a..1a9a347805c 100644
--- a/gcc/config/arm/arm-mve-builtins-functions.h
+++ b/gcc/config/arm/arm-mve-builtins-functions.h
@@ -986,8 +986,17 @@ public:
   m_float_memory_type (NUM_TYPE_SUFFIXES)
   {}
 
-  unsigned int call_properties (const function_instance &) const override
+  unsigned int call_properties (const function_instance &instance) const 
override
   {
+/* If the target does not support floating-point, we register FP vector
+   types as 'void'.  In this case, pretend we do not access memory to avoid
+   warnings about 'pure attribute on function returning void' when we
+   declare the intrinsics.  Such uses in user code are properly
+   diagnosed.  */
+if (!TARGET_HAVE_MVE_FLOAT
+   && instance.type_suffix (0).float_p)
+  return 0;
+
 return CP_READ_MEMORY;
   }
 
-- 
2.34.1



[patch, avr] ad PR117726: Improve logic 8-bit shifts with an offset of 6

2024-12-02 Thread Georg-Johann Lay

Logic 8-bit shifts with an offset of 6 can be improved by
supporting them as 3-operand operations.
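
As an informal sanity check of the instruction sequences in the patch, the
C sketch below models the MUL-based and the BST/BLD-based variants and
compares them with plain shifts over all 8-bit inputs.  The function names
are invented, and the model only assumes that MUL leaves the low byte of
the product in r0 and the high byte in r1.

```c
#include <assert.h>
#include <stdint.h>

/* Models "ldi %0,1<<6 ; mul %0,%1 ; mov %0,r0": low byte of x * 64.  */
static uint8_t
shl6_via_mul (uint8_t x)
{
  uint16_t product = (uint16_t) (x * (1u << 6));
  return (uint8_t) (product & 0xff);
}

/* Models "ldi %0,1<<2 ; mul %0,%1 ; mov %0,r1": high byte of x * 4.  */
static uint8_t
shr6_via_mul (uint8_t x)
{
  uint16_t product = (uint16_t) (x * (1u << 2));
  return (uint8_t) (product >> 8);
}

/* Models "clr %0 ; bst %1,0 ; bld %0,6 ; bst %1,1 ; bld %0,7".  */
static uint8_t
shl6_via_bits (uint8_t x)
{
  uint8_t r = 0;
  r |= (uint8_t) (((x >> 0) & 1) << 6);
  r |= (uint8_t) (((x >> 1) & 1) << 7);
  return r;
}

/* Returns 1 if all models match plain C shifts for every 8-bit input.  */
static int
models_match_plain_shifts (void)
{
  for (unsigned int i = 0; i < 256; ++i)
    {
      uint8_t x = (uint8_t) i;
      if (shl6_via_mul (x) != (uint8_t) (x << 6)
          || shr6_via_mul (x) != (uint8_t) (x >> 6)
          || shl6_via_bits (x) != (uint8_t) (x << 6))
        return 0;
    }
  return 1;
}

/* Convenience wrapper that aborts if any model disagrees.  */
static void
check_models (void)
{
  assert (models_match_plain_shifts ());
}
```

The multiply trick works for the right shift because x * 4 moves the top
two bits of x into the high byte of the 16-bit product.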

Ok for trunk?

Johann

--

AVR: Tweak uint8_t << 6 and uint8_t >> 6 shifts.

Logic 8-bit shifts with an offset of 6 can be improved by
supporting them as 3-operand operations.

PR target/117726
gcc/
* config/avr/avr-passes.cc (avr_emit_shift): All 8-bit shifts with
an offset of 6 have 3-operand alternatives.
* config/avr/avr.cc (ashlqi3_out, lshrqi3_out) [case 6]:
Implement as 3-operand insn.
(avr_rtx_costs_1) [QImode, ASHIFT + LSHIFTRT]: Adjust
costs for offset of 6.
* config/avr/avr.md (*ashlqi3_split, *ashlqi3)
(*lshrqi3_split, *lshrqi3): Add "r,r,C06" alternative.

diff --git a/gcc/config/avr/avr-passes.cc b/gcc/config/avr/avr-passes.cc
index 7be5ec25fbc..dc98780ef27 100644
--- a/gcc/config/avr/avr-passes.cc
+++ b/gcc/config/avr/avr-passes.cc
@@ -4899,6 +4899,8 @@ avr_emit_shift (rtx_code code, rtx dest, rtx src, int off, rtx scratch)
   // Work out which alternatives can handle 3 operands independent
   // of options.
 
+  const bool b8_is_3op = off == 6;
+
   const bool b16_is_3op = select()
 : code == ASHIFT ? satisfies_constraint_C7c (xoff) // 7...12
 : code == LSHIFTRT ? satisfies_constraint_C7c (xoff)
@@ -4914,6 +4916,7 @@ avr_emit_shift (rtx_code code, rtx dest, rtx src, int off, rtx scratch)
   const bool is_3op = (off % 8 == 0
 		   || off == n_bits - 1
 		   || (code == ASHIFTRT && off == n_bits - 2)
+		   || (n_bits == 8 && b8_is_3op)
 		   || (n_bits == 16 && b16_is_3op)
 		   || (n_bits == 24 && b24_is_3op));
   rtx shift;
diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 32028df30a5..ccf9b05bb3e 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -6780,6 +6780,8 @@ ashlqi3_out (rtx_insn *insn, rtx operands[], int *plen)
 {
   if (CONST_INT_P (operands[2]))
 {
+  int reg0 = REGNO (operands[0]);
+  int reg1 = REGNO (operands[1]);
   bool ldreg_p = test_hard_reg_class (LD_REGS, operands[0]);
   int offs = INTVAL (operands[2]);
 
@@ -6787,7 +6789,7 @@ ashlqi3_out (rtx_insn *insn, rtx operands[], int *plen)
 	*plen = 0;
 
   if (offs <= 3
-	  || (offs <= 6 && ! ldreg_p))
+	  || (offs <= 5 && ! ldreg_p))
 	{
 	  for (int i = 0; i < offs; ++i)
 	avr_asm_len ("lsl %0", operands, plen, 1);
@@ -6814,10 +6816,28 @@ ashlqi3_out (rtx_insn *insn, rtx operands[], int *plen)
 			  "lsl %0"  CR_TAB
 			  "andi %0,0xe0", operands, plen, 3);
 	case 6:
-	  return avr_asm_len ("swap %0" CR_TAB
-			  "lsl %0"  CR_TAB
-			  "lsl %0"  CR_TAB
-			  "andi %0,0xc0", operands, plen, 4);
+	  if (ldreg_p && reg0 == reg1)
+	return avr_asm_len ("swap %0" CR_TAB
+"lsl %0"  CR_TAB
+"lsl %0"  CR_TAB
+"andi %0,0xc0", operands, plen, 4);
+	  if (ldreg_p && reg0 != reg1 && AVR_HAVE_MUL)
+	return avr_asm_len ("ldi %0,1<<6" CR_TAB
+"mul %0,%1"   CR_TAB
+"mov %0,r0"   CR_TAB
+"clr __zero_reg__", operands, plen, 4);
+	  return reg0 != reg1
+	? avr_asm_len ("clr %0"CR_TAB
+			   "bst %1,0"  CR_TAB
+			   "bld %0,6"  CR_TAB
+			   "bst %1,1"  CR_TAB
+			   "bld %0,7", operands, plen, 5)
+	: avr_asm_len ("lsl %0"  CR_TAB
+			   "lsl %0"  CR_TAB
+			   "lsl %0"  CR_TAB
+			   "lsl %0"  CR_TAB
+			   "lsl %0"  CR_TAB
+			   "lsl %0", operands, plen, 6);
 	case 7:
 	  return avr_asm_len ("bst %1,0" CR_TAB
 			  "clr %0"   CR_TAB
@@ -7663,6 +7683,8 @@ lshrqi3_out (rtx_insn *insn, rtx operands[], int *plen)
 {
   if (CONST_INT_P (operands[2]))
 {
+  int reg0 = REGNO (operands[0]);
+  int reg1 = REGNO (operands[1]);
   bool ldreg_p = test_hard_reg_class (LD_REGS, operands[0]);
   int offs = INTVAL (operands[2]);
 
@@ -7670,7 +7692,7 @@ lshrqi3_out (rtx_insn *insn, rtx operands[], int *plen)
 	*plen = 0;
 
   if (offs <= 3
-	  || (offs <= 6 && ! ldreg_p))
+	  || (offs <= 5 && ! ldreg_p))
 	{
 	  for (int i = 0; i < offs; ++i)
 	avr_asm_len ("lsr %0", operands, plen, 1);
@@ -7697,10 +7719,28 @@ lshrqi3_out (rtx_insn *insn, rtx operands[], int *plen)
 			  "lsr %0"  CR_TAB
 			  "andi %0,0x7", operands, plen, 3);
 	case 6:
-	  return avr_asm_len ("swap %0" CR_TAB
-			  "lsr %0"  CR_TAB
-			  "lsr %0"  CR_TAB
-			  "andi %0,0x3", operands, plen, 4);
+	  if (ldreg_p && reg0 == reg1)
+	return avr_asm_len ("swap %0" CR_TAB
+"lsr %0"  CR_TAB
+"lsr %0"  CR_TAB
+"andi %0,0x3", operands, plen, 4);
+	  if (ldreg_p && reg0 != reg1 && AVR_HAVE_MUL)
+	return avr_asm_len ("ldi %0,1<<2" CR_TAB
+"mul %0,%1"   CR_TAB
+"mov %0,r1"   CR_TAB
+"clr __zero_reg__", operands, plen, 4);
+	  return reg0 != reg1
+	? avr_asm_len ("clr %0"CR_TAB
+			   "bst %1,6"  CR_TAB
+			   "bld %0,0"  CR_TAB
+			   "bst %1,7"  CR_TAB
+			   "bld %0,1", operands, plen, 5)
+	: avr_asm_len ("lsr %0"  CR_TAB
+			   "lsr %0"  CR_TAB
+			   "lsr %0"  CR_TAB
+			   "lsr %0"  CR_TA

Re: [PATCH v3 2/8] aarch64: Make C/C++ operations possible on SVE ACLE types.

2024-12-02 Thread Tejas Belagod

On 11/30/24 3:30 AM, Christophe Lyon wrote:

Hi!

On Fri, 29 Nov 2024 at 05:00, Tejas Belagod  wrote:


This patch changes the TYPE_INDIVISIBLE flag to 0 to enable SVE ACLE types to be
treated as GNU vectors and have the same semantics with operations that are
defined on GNU vectors.

gcc/ChangeLog:

 * config/aarch64/aarch64-sve-builtins.cc (register_builtin_types): Flip
 TYPE_INDIVISIBLE flag for SVE ACLE vector types.


Sorry I haven't closely followed the discussions around this patch
series, but the Linaro postcommit CI reports
1036 regressions after patch 2/8, is that expected?
Given that precommit CI detected "only" 22 regressions with all 8
patches, I suppose most of the 1036 are fixed later in the series?


Thanks for raising this.

Patch 2/8 enables SVE vectors to behave like GNU vectors (C/C++ operator 
semantics start applying to SVE vectors) which has a lot of fallout in 
FE/ME/BE that the patches 3-8 fix (mostly related to handling VLA vectors).
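
To make "C/C++ operator semantics" concrete, here is a small sketch using a
fixed-length GNU vector type.  It is only an analogy: SVE ACLE types are
length-agnostic, and the type and function names below are invented for the
example rather than taken from the series.

```c
#include <assert.h>

/* A fixed-length GNU vector of four ints (16 bytes).  */
typedef int example_v4si __attribute__ ((vector_size (16)));

/* Element-wise arithmetic written with ordinary C operators.  */
static example_v4si
example_madd (example_v4si a, example_v4si b, example_v4si c)
{
  return a * b + c;
}

/* Reads one lane using subscripting, which GNU vectors also allow.  */
static int
example_lane (example_v4si v, int i)
{
  assert (i >= 0 && i < 4);
  return v[i];
}
```

With patch 2/8 applied, the intent is that expressions of this shape also
become valid on SVE vector types, which is where the VLA-related fallout in
the later patches comes from.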


I'm currently testing a patch to fix the remaining 22 regressions - they 
are mostly testisms for which I wanted to make sure I was doing the 
right thing (which I have indicated in my cover letter).


Thanks,
Tejas.



Thanks,

Christophe


---
  gcc/config/aarch64/aarch64-sve-builtins.cc | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 0fec1cd439e..adbadd303d4 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -4576,6 +4576,9 @@ register_builtin_types ()
   vectype = build_truth_vector_type_for_mode (BYTES_PER_SVE_VECTOR,
   VNx16BImode);
   num_pr = 1;
+ /* Leave svbool_t as indivisible for now.  We don't yet support
+C/C++ operators on predicates.  */
+ TYPE_INDIVISIBLE_P (vectype) = 1;
 }
   else
 {
@@ -4592,12 +4595,12 @@ register_builtin_types ()
   && TYPE_ALIGN (vectype) == 128
   && known_eq (size, BITS_PER_SVE_VECTOR));
   num_zr = 1;
+ TYPE_INDIVISIBLE_P (vectype) = 0;
 }
   vectype = build_distinct_type_copy (vectype);
   gcc_assert (vectype == TYPE_MAIN_VARIANT (vectype));
   SET_TYPE_STRUCTURAL_EQUALITY (vectype);
   TYPE_ARTIFICIAL (vectype) = 1;
- TYPE_INDIVISIBLE_P (vectype) = 1;
   make_type_sizeless (vectype);
 }
if (num_pr)
--
2.25.1





Re: Ping: [PATCH] c++: Allow overloaded builtins to be used in SFINAE context

2024-12-02 Thread Matthew Malcomson
Ping 4

Also adding those that I've Cc'd in the patchset for FP atomics since this 
patch is enabling that one.

From: Matthew Malcomson 
Sent: 26 November 2024 10:26 AM
To: gcc-patches@gcc.gnu.org 
Cc: Jason Merrill ; Nathan Sidwell 
Subject: Re: Ping: [PATCH] c++: Allow overloaded builtins to be used in SFINAE 
context

External email: Use caution opening links or attachments


Ping 3

On 11/18/24 13:22, Matthew Malcomson wrote:
> Ping 2
>
> On 10/21/24 11:43, Matthew Malcomson wrote:
>> Ping (re-sending ping because previous message body too large for list --
>> apologies for duplication to those on Cc).
>> Attaching update on testsuite to fix testism on Arm that Linaro CI
>> caught.
>>
>>> From: Matthew Malcomson 
>>>
>>> This commit newly introduces the ability to use overloaded builtins in
>>> C++ SFINAE context.
>



Ping^6 [PATCH 0/2] Prime path coverage to gcc/gcov

2024-12-02 Thread Jørgen Kvalsvik

Ping.

On 11/21/24 20:14, Jørgen Kvalsvik wrote:

Ping.

On 11/12/24 09:56, Jørgen Kvalsvik wrote:

Ping.

On 10/30/24 13:55, Jørgen Kvalsvik wrote:

Ping.

On 10/21/24 15:21, Jørgen Kvalsvik wrote:

Ping.

On 10/10/24 10:08, Jørgen Kvalsvik wrote:

Ping.

On 10/3/24 12:46, Jørgen Kvalsvik wrote:

This is both a ping and a minor update. A few of the patches from the
previous set have been merged, but the big feature still needs review.

Since then it has been quiet, but there are two notable changes:

1. The --prime-paths-{lines,source} flags take an optional argument to
   print covered or uncovered paths, or both. By default, uncovered
   paths are printed like before.
2. Fixed a bad vector access when independent functions share compiler
   generated statements. A reproducing case is in gcov-23.C which
   relied on printing the uncovered path of multiple destructors of
   static objects.

Jørgen Kvalsvik (2):
   gcov: branch, conds, calls in function summaries
   Add prime path coverage to gcc/gcov

  gcc/Makefile.in    |    6 +-
  gcc/builtins.cc    |    2 +-
  gcc/collect2.cc    |    5 +-
  gcc/common.opt |   16 +
  gcc/doc/gcov.texi  |  184 +++
  gcc/doc/invoke.texi    |   36 +
  gcc/gcc.cc |    4 +-
  gcc/gcov-counter.def   |    3 +
  gcc/gcov-io.h  |    3 +
  gcc/gcov.cc    |  531 ++-
  gcc/ipa-inline.cc  |    2 +-
  gcc/passes.cc  |    4 +-
  gcc/path-coverage.cc   |  782 +
  gcc/prime-paths.cc | 2031 ++
  gcc/profile.cc |    6 +-
  gcc/selftest-run-tests.cc  |    1 +
  gcc/selftest.h |    1 +
  gcc/testsuite/g++.dg/gcov/gcov-22.C    |  170 ++
  gcc/testsuite/g++.dg/gcov/gcov-23-1.h  |    9 +
  gcc/testsuite/g++.dg/gcov/gcov-23-2.h  |    9 +
  gcc/testsuite/g++.dg/gcov/gcov-23.C    |   30 +
  gcc/testsuite/gcc.misc-tests/gcov-29.c |  869 ++
  gcc/testsuite/gcc.misc-tests/gcov-30.c |  869 ++
  gcc/testsuite/gcc.misc-tests/gcov-31.c |   35 +
  gcc/testsuite/gcc.misc-tests/gcov-32.c |   24 +
  gcc/testsuite/gcc.misc-tests/gcov-33.c |   27 +
  gcc/testsuite/gcc.misc-tests/gcov-34.c |   29 +
  gcc/testsuite/lib/gcov.exp |  118 +-
  gcc/tree-profile.cc    |   11 +-
  29 files changed, 5795 insertions(+), 22 deletions(-)
  create mode 100644 gcc/path-coverage.cc
  create mode 100644 gcc/prime-paths.cc
  create mode 100644 gcc/testsuite/g++.dg/gcov/gcov-22.C
  create mode 100644 gcc/testsuite/g++.dg/gcov/gcov-23-1.h
  create mode 100644 gcc/testsuite/g++.dg/gcov/gcov-23-2.h
  create mode 100644 gcc/testsuite/g++.dg/gcov/gcov-23.C
  create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-29.c
  create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-30.c
  create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-31.c
  create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-32.c
  create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-33.c
  create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-34.c

[PATCH] testsuite: Mark gcc.c-torture/execute/memcpy-a?.c tests expensive

2024-12-02 Thread Maciej W. Rozycki
These tests can take several seconds per compilation to complete, taking 
total elapsed time measured in minutes.  Mark them as expensive so as to 
let people skip them where they want to save on testing time.

gcc/testsuite/
* gcc.c-torture/execute/memcpy-a1.c: Mark as expensive.
* gcc.c-torture/execute/memcpy-a2.c: Likewise.
* gcc.c-torture/execute/memcpy-a4.c: Likewise.
* gcc.c-torture/execute/memcpy-a8.c: Likewise.
---
 gcc/testsuite/gcc.c-torture/execute/memcpy-a1.c |1 +
 gcc/testsuite/gcc.c-torture/execute/memcpy-a2.c |1 +
 gcc/testsuite/gcc.c-torture/execute/memcpy-a4.c |1 +
 gcc/testsuite/gcc.c-torture/execute/memcpy-a8.c |1 +
 4 files changed, 4 insertions(+)

gcc-test-memcpy-expensive.diff
Index: gcc/gcc/testsuite/gcc.c-torture/execute/memcpy-a1.c
===
--- gcc.orig/gcc/testsuite/gcc.c-torture/execute/memcpy-a1.c
+++ gcc/gcc/testsuite/gcc.c-torture/execute/memcpy-a1.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target run_expensive_tests } */
 /* { dg-timeout-factor 8 } */
 /* { dg-skip-if "memory full + time hog" { "avr-*-*" } } */
 
Index: gcc/gcc/testsuite/gcc.c-torture/execute/memcpy-a2.c
===
--- gcc.orig/gcc/testsuite/gcc.c-torture/execute/memcpy-a2.c
+++ gcc/gcc/testsuite/gcc.c-torture/execute/memcpy-a2.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target run_expensive_tests } */
 /* { dg-timeout-factor 8 } */
 /* { dg-skip-if "memory full + time hog" { "avr-*-*" } } */
 
Index: gcc/gcc/testsuite/gcc.c-torture/execute/memcpy-a4.c
===
--- gcc.orig/gcc/testsuite/gcc.c-torture/execute/memcpy-a4.c
+++ gcc/gcc/testsuite/gcc.c-torture/execute/memcpy-a4.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target run_expensive_tests } */
 /* { dg-timeout-factor 8 } */
 /* { dg-skip-if "memory full + time hog" { "avr-*-*" } } */
 
Index: gcc/gcc/testsuite/gcc.c-torture/execute/memcpy-a8.c
===
--- gcc.orig/gcc/testsuite/gcc.c-torture/execute/memcpy-a8.c
+++ gcc/gcc/testsuite/gcc.c-torture/execute/memcpy-a8.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target run_expensive_tests } */
 /* { dg-timeout-factor 8 } */
 /* { dg-skip-if "memory full + time hog" { "avr-*-*" } } */
 


Re: Should -fsanitize=bounds support counted-by attribute for pointers inside a structure?

2024-12-02 Thread Martin Uecker
On Monday, 02.12.2024 at 20:15 +, Qing Zhao wrote:
> 
> > On Nov 30, 2024, at 07:22, Martin Uecker  wrote:
> > 
> > On Tuesday, 26.11.2024 at 20:59 +, Qing Zhao wrote:
> > > Having thought this over these past days, I have another thought that
> > > needs some feedback:
> > > 
> > > The major issue right now is:
> > > 
> > > 1. For the following structure in which the “counted_by” attribute is
> > > attached to the pointer field.
> > > 
> > > struct foo {
> > >  int n;
> > >  char *p __attribute__ ((counted_by (n)));
> > > } *x;
> > 
> > BTW: I also do not like the syntax 'n' for a lookup of a member 
> > in the member namespace. I think it should be '.n'.  For FAM this 
> > is less problematic because it is always at the end, but here it is 
> > problematic.
> During my implementation of extending this attribute to pointer fields of
> structures, I have not noticed any issue so far with the current syntax 'n'
> for pointer fields, even when the field "n" is declared after the
> corresponding pointer field, i.e.:
> 
> struct foo {
>   char *p __attribute__ ((counted_by (n)));
>   int n;
> };
> 
> So, could you please explain a little bit more about what the potential
> issue is here?

My issue with it is that it is not consistent with how C's scoping
of identifiers works. In

constexpr int n = 3;
struct foo {
  char (*p)[n] __attribute__ ((counted_by (n)));
  int n;
};

the n in the type of p refers to the previous n in scope while
the n in your attribute would refer to the member.

This is incoherent and confusing.  It becomes worse should
you ever want to allow more complicated expressions.

It would be clearer if you used the syntax ".n", which resembles
the designator syntax that is already used in initializers to
refer to struct members.

constexpr int n = 3;
struct foo {
  char (*p)[n] __attribute__ ((counted_by (.n)));
  int n;
};
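
The scoping point can be checked in plain C without the attribute. This is only a minimal sketch: an enum constant stands in for the constexpr n so that it also compiles before C23, and pointee_size and member_n are hypothetical helper names, not anything from the patch.

```c
#include <stddef.h>

enum { n = 3 };              /* file-scope n, stands in for constexpr int n = 3 */

struct foo {
  char (*p)[n];              /* n here is the file-scope constant: pointer to char[3] */
  int n;                     /* the member n is only declared after p */
};

/* Size of the array type p points to; sizeof does not evaluate its operand,
   so the null pointer is never dereferenced. */
static size_t pointee_size(void)
{
  struct foo f = { .p = NULL, .n = 7 };   /* designators name members as .p and .n */
  return sizeof *f.p;
}

/* Value stored through the designator .n. */
static int member_n(void)
{
  struct foo f = { .n = 7 };
  return f.n;
}
```

The array bound in the declarator is resolved by ordinary scoping, while the designator reaches the member, which is the distinction a '.n' spelling in the attribute would make visible.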


> 
> 
> > > 
> > > There is one important additional requirement:
> > > 
> > > x->n, x->p can ONLY be changed by changing the whole structure at the 
> > > same time. 
> > > Otherwise, x->n might not be consistent with x->p.
> > 
> > By itself, this would still not fix the issue I pointed out.
> > 
> > struct foo x;
> > x = .. ; // set the whole structure
> > char *p = x->p;
> > x = ... ; // set the whole structure
> > 
> > What is the bound for 'p' ?  
> 
> Since p was set to the pointer field of the old structure, then the bound of 
> it should be the old bound. 
> > With current rules it would be the old bound.
> 
> I thought that this should be the correct behavior, isn’t it?

Yes, sorry, what I meant was "with the current rules it would be
the *new* bound".  (And I guess this is why you suggest to potentially
change it below).

Martin

> 
> > 
> > 
> > > 
> > > 2. This new requirement is ONLY for “counted_by” attribute that is 
> > > attached to the pointer field, not needed
> > > for flexible array members.
> > > 
> > > 3. Then there will be inconsistency for the “counted_by” attribute 
> > > between FAM and pointer field.
> > 
> > 
> > > The major questions I have right now:
> > > 
> > > 1. Shall we keep this inconsistency between FAM and pointer field? 
> > > Or, 
> > > 
> > > 2. Shall we keep them consistent by adding this new requirement for the 
> > > previous FAM? 
> > 
> > Or have a new attribute?  I feel we double down on a bad design
> > which made sense for FAM addressing a very specific use case
> > (retrofitting checks to the Linux kernel) but is otherwise not
> > very strong.
> 
> I do agree that the previous specific feature of "counted_by" for the
> FAM, i.e.:
> "
> One important feature of the attribute is, a reference to the
> flexible array member field uses the latest value assigned to the
> field that represents the number of the elements before that
> reference.  For example,
> 
>     p->count = val1;
>     p->array[20] = 0;  // ref1 to p->array
>     p->count = val2;
>     p->array[30] = 0;  // ref2 to p->array
> 
> in the above, 'ref1' uses 'val1' as the number of the elements in
> 'p->array', and 'ref2' uses 'val2' as the number of elements in
> 'p->array'.
> "
> 
> is specific to the Linux kernel. I am wondering whether the above
> feature is really needed by the Linux kernel.
> 
> If not, I would prefer to eliminate this specific feature from GCC before
> we officially release the "counted_by" attribute in GCC 15.
> 
> If the Linux kernel does need this feature, then yes, maybe a new
> attribute for pointers is better.
> 
> Kees, could you please also provide more comments and suggestions on this?
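
For readers following the FAM case quoted above, here is a rough sketch of a struct whose flexible array member is annotated with the count field. It assumes a compiler that advertises counted_by through __has_attribute and otherwise falls back to an unannotated struct; the COUNTED_BY macro and the buf_* helpers are hypothetical names, and the explicit range check in buf_set is ordinary C, not what a sanitizer would emit.

```c
#include <stddef.h>
#include <stdlib.h>

/* Use the attribute only where the compiler advertises it; elsewhere the
   macro expands to nothing and the struct is an ordinary FAM struct. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(m)
#endif

struct buf {
  size_t count;                       /* number of elements in array */
  int array[] COUNTED_BY(count);      /* flexible array member */
};

/* Allocate room for cap elements and set the count before any access. */
static struct buf *buf_new(size_t cap)
{
  struct buf *b = malloc(sizeof *b + cap * sizeof b->array[0]);
  if (b)
    b->count = cap;
  return b;
}

/* Lower the advertised bound; it is never raised past the allocation. */
static void buf_shrink(struct buf *b, size_t new_count)
{
  if (new_count < b->count)
    b->count = new_count;
}

/* Explicitly checked store: returns 0 on success, -1 if idx is out of range. */
static int buf_set(struct buf *b, size_t idx, int v)
{
  if (idx >= b->count)
    return -1;
  b->array[idx] = v;
  return 0;
}
```

After buf_shrink the same index can go from valid to rejected, which mirrors the "latest value assigned to the count" behaviour described in the quoted documentation.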
> 



> 
> > 
> > Martin
> > 
> > > 
> > > Kees, the following feature that you requested for the FAM:
> > > 
> > > "
> > > One important feature of the attribute is, a reference to the
> > > flexible array member field uses the latest value assigned to the
> > > field that represents the number of the elements before that
> > > reference.  For example,
> > >

Re: [PATCH] libstdc++: Use hidden friends for __normal_iterator operators

2024-12-02 Thread Patrick Palka
On Mon, 2 Dec 2024, Patrick Palka wrote:

> On Thu, 28 Nov 2024, Jonathan Wakely wrote:
> 
> > As suggested by Jason, this makes all __normal_iterator operators into
> > friends so they can be found by ADL and don't need to be separately
> > exported in module std.
> 
> Might as well remove the __gnu_cxx exports in std.cc.in while we're at
> it?
> 
> > 
> > For the operator<=> comparing two iterators of the same type, I had to
> > use a deduced return type and add a requires-clause, because it's no
> > longer a template and so we no longer get substitution failures when
> > > it's considered in overload resolution.
> > 
> > I also had to reorder the __attribute__((always_inline)) and
> > [[nodiscard]] attributes, which have to be in a particular order when
> > used on friend functions.
> > 
> > libstdc++-v3/ChangeLog:
> > 
> > * include/bits/stl_iterator.h (__normal_iterator): Make all
> > non-member operators hidden friends.
> > * src/c++11/string-inst.cc: Remove explicit instantiations of
> > operators that are no longer templates.
> > ---
> > 
> > Tested x86_64-linux.
> > 
> > This iterator type isn't defined in the standard, and users shouldn't be
> > doing funny things with it, so nothing prevents us from replacing its
> > operators with hidden friends.
> > 
> >  libstdc++-v3/include/bits/stl_iterator.h | 341 ---
> >  libstdc++-v3/src/c++11/string-inst.cc|  11 -
> >  2 files changed, 184 insertions(+), 168 deletions(-)
> > 
> > > diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h
> > index e872598d7d8..656a47e5f76 100644
> > --- a/libstdc++-v3/include/bits/stl_iterator.h
> > +++ b/libstdc++-v3/include/bits/stl_iterator.h
> > @@ -1164,188 +1164,215 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >const _Iterator&
> >base() const _GLIBCXX_NOEXCEPT
> >{ return _M_current; }
> > -};
> >  
> > -  // Note: In what follows, the left- and right-hand-side iterators are
> > -  // allowed to vary in types (conceptually in cv-qualification) so that
> > -  // comparison between cv-qualified and non-cv-qualified iterators be
> > -  // valid.  However, the greedy and unfriendly operators in std::rel_ops
> > -  // will make overload resolution ambiguous (when in scope) if we don't
> > -  // provide overloads whose operands are of the same type.  Can someone
> > -  // remind me what generic programming is about? -- Gaby
> > +private:
> > +  // Note: In what follows, the left- and right-hand-side iterators are
> > > +  // allowed to vary in types (conceptually in cv-qualification) so that
> > > +  // comparison between cv-qualified and non-cv-qualified iterators be
> > > +  // valid.  However, the greedy and unfriendly operators in std::rel_ops
> > > +  // will make overload resolution ambiguous (when in scope) if we don't
> > > +  // provide overloads whose operands are of the same type.  Can someone
> > > +  // remind me what generic programming is about? -- Gaby
> >  
> >  #ifdef __cpp_lib_three_way_comparison
> > -  template
> > -[[nodiscard, __gnu__::__always_inline__]]
> > -constexpr bool
> > -operator==(const __normal_iterator<_IteratorL, _Container>& __lhs,
> > -  const __normal_iterator<_IteratorR, _Container>& __rhs)
> > -noexcept(noexcept(__lhs.base() == __rhs.base()))
> > -requires requires {
> > -  { __lhs.base() == __rhs.base() } -> std::convertible_to;
> > -}
> > -{ return __lhs.base() == __rhs.base(); }
> > +  template
> > +   [[nodiscard, __gnu__::__always_inline__]]
> > +   friend
> > +   constexpr bool
> > +   operator==(const __normal_iterator& __lhs,
> > +  const __normal_iterator<_Iter, _Container>& __rhs)
> > +   noexcept(noexcept(__lhs.base() == __rhs.base()))
> > +   requires requires {
> > + { __lhs.base() == __rhs.base() } -> std::convertible_to;
> > +   }
> > +   { return __lhs.base() == __rhs.base(); }
> >  
> > -  template
> > -[[nodiscard, __gnu__::__always_inline__]]
> > -constexpr std::__detail::__synth3way_t<_IteratorR, _IteratorL>
> > -operator<=>(const __normal_iterator<_IteratorL, _Container>& __lhs,
> > -   const __normal_iterator<_IteratorR, _Container>& __rhs)
> > -noexcept(noexcept(std::__detail::__synth3way(__lhs.base(), 
> > __rhs.base(
> > -{ return std::__detail::__synth3way(__lhs.base(), __rhs.base()); }
> > +  template
> > +   static constexpr bool __nothrow_synth3way
> > + = noexcept(std::__detail::__synth3way(std::declval<_Iterator&>(),
> > +   std::declval<_Iter&>()));
> 
> Since base() returns a const reference do we want to consider const
> references in this noexcept helper?
> 
> >  
> > -  template
> > -[[nodiscard, __gnu__::__always_inline__]]
> > -constexpr bool
> > -operator==(const __normal_iterator<_Iterator, _Container>& __lhs,
> > -  const __normal_iterator<_It

Re: m68k: don't allow o/o in movdi, movdf, movxf

2024-12-02 Thread Jeff Law




On 12/2/24 8:52 AM, Andreas Schwab wrote:

The movdi, movdf and movxf patterns allow both operands to be offsettable
memory, but output_move_double cannot handle overlapping objects.  This is
visible in the failure of gcc.c-torture/execute/pr97073.c when compiled
with LTO (where cprop optimizes out the AND operation; the failure also
occurs without LTO when the AND is removed).  Split the constraints so
that the operands cannot both be "o" in the same insn.

* config/m68k/m68k.md (movdi+1, movdf+1, movxf+2): Split
constraints so that the operands cannot both be "o".
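
As a rough illustration of the hazard described above (not the actual m68k code generation), the following sketch moves a two-word object one word at a time and shows what goes wrong when source and destination overlap. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Move a two-word object one word at a time, lowest address first. */
static void move_double_naive(uint32_t *dst, const uint32_t *src)
{
  dst[0] = src[0];
  dst[1] = src[1];   /* src[1] may already have been overwritten by dst[0] */
}

/* Overlap-safe reference: memmove handles overlapping source and destination. */
static void move_double_safe(uint32_t *dst, const uint32_t *src)
{
  memmove(dst, src, 2 * sizeof *dst);
}
```

With a buffer {1, 2, 3}, moving the pair at index 0 to index 1 should give {1, 1, 2}; the naive sequence produces {1, 1, 1} because the second word is read after it has been clobbered.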

Thanks for taking care of this stuff!

jeff



Re: Fix type compatibility for types with flexible array member [PR113688,PR114014,PR117724]

2024-12-02 Thread Qing Zhao


> On Nov 30, 2024, at 07:10, Martin Uecker  wrote:
> 
> On Tuesday, 26.11.2024 at 15:15 +, Qing Zhao wrote:
>> 
>>> On Nov 25, 2024, at 16:46, Martin Uecker  wrote:
>>> 
>>> 
>>> Hi Qing,
>>> 
>>> On Monday, 25.11.2024 at 17:40 +, Qing Zhao wrote:
>>>> Hi, Martin,
>>>> 
>>>> I didn’t go through all the details of your patch.
>>>> 
>>>> But I have one question:
>>>> 
>>>> Did you consider the effect of the option -fstrict-flex-arrays 
>>>> (https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/C-Dialect-Options.html#index-fstrict-flex-arrays)
>>>> on how gcc treats the zero size trailing array, 1-element trailing array 
>>>> as flexible array member in the patch?
>>> 
>>> I used the function which was already there which
>>> does not take this into account.  For the new version
>>> of the patch this should not matter anymore.
>> 
>> Why does it not matter anymore?
>> 
>> For the following testing case:
>> 
>> struct S{int x,y[1];}*a;
>> int main(void){
>> struct S{int x,y[];};
>> }
>> 
>> With your latest patch, the two structures are considered compatible
>> with -g;
>> However, if we add -fstrict-flex-arrays=2 or -fstrict-flex-arrays=3, the
>> trailing array y[1] is NOT treated as a FAM anymore; as a result, these
>> two structures are NOT compatible either.
>> 
>> Am I missing anything obvious? 
> 
> It is not about compatibility from a language semantic point of view
> but about TBAA-compatibility, which needs to be consistent with it but
> can (and must) be more general.
> 
> For TBAA, I think we want 
> 
> struct foo { int x; int y[]; };
> 
> to be TBAA-compatible to
> 
> struct foo { int x; int y[3]; };

Okay, I see now.  Thank you for the explanation.
(Now I also see this from the comments of the routine 
gimple_canonical_types_compatible_p -:)
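
To make the two layouts under discussion concrete, here is a small sketch that compares them. Distinct tag names (foo_fam, foo_fixed) are used only because both definitions must coexist in one translation unit; they and the helper functions are hypothetical, and nothing here exercises GCC's TBAA machinery itself.

```c
#include <stddef.h>

struct foo_fam   { int x; int y[];  };   /* flexible array member */
struct foo_fixed { int x; int y[3]; };   /* fixed-size trailing array */

/* 1 if y starts at the same offset in both layouts. */
static int same_prefix_layout(void)
{
  return offsetof(struct foo_fam, y) == offsetof(struct foo_fixed, y);
}

/* 1 if the fixed-size variant is strictly larger than the FAM variant. */
static int sizes_differ(void)
{
  return sizeof(struct foo_fixed) > sizeof(struct foo_fam);
}
```

The members line up at the same offsets while the overall sizes differ, so any compatibility notion that treats the two alike has to tolerate the size difference.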


Though, what confused me is the testing case in your patch:

diff --git a/gcc/testsuite/gcc.dg/pr114014.c b/gcc/testsuite/gcc.dg/pr114014.c
new file mode 100644
index 000..ab783f4f85d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr114014.c
@@ -0,0 +1,14 @@
+/* PR c/114014
+ * { dg-do compile }
+ * { dg-options "-std=c23 -g" } */
+
+struct r {
+  int a;
+  char b[];
+};
+struct r {
+  int a;
+  char b[0];
+}; /* { dg-error "redefinition" } */
+
+

Is the above testing case claiming that b[] and b[0] are compatible from a 
language semantic point of view?

thanks.

Qing
> even when we do not treat the latter as a FAM (i.e. still forbid
> out-of-bounds accesses).
> 
> E.g. see Richard's comment: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114713#c2
> 
> 
> Martin
> 
>> Thanks.
>> 
>> Qing
>>> 
>>> Martin
>>> 
>>> 
 
 thanks.
 
 Qing
> On Nov 23, 2024, at 14:45, Martin Uecker  wrote:
> 
> 
> This patch tries to fix the errors we have because of
> flexible array members.  I am a bit unsure about the exception
> for the mode. 
> 
> Bootstrapped and regression tested on x86_64.
> 
> 
> 
>  Fix type compatibility for types with flexible array member 
> [PR113688,PR114014,PR117724]
> 
>  verify_type checks the compatibility of TYPE_CANONICAL using
>  gimple_canonical_types_compatible_p.   But it is stricter than what the
>  C standard requires and therefore inconsistent with how TYPE_CANONICAL is set
>  in the C FE.  Here, the logic is changed to ignore array size when one of the
>  types is a flexible array member.  To not get errors because of inconsistent
>  number of members, zero-sized arrays are not ignored anymore when checking
>  fields of a struct (which is stricter than what was done before).
>  Finally, an exception is added that allows the TYPE_MODE of a type with
>  flexible array member to differ from another compatible type.
> 
>  PR c/113688
>  PR c/114014
>  PR c/117724
> 
>  gcc/ChangeLog:
>  * tree.cc (gimple_canonical_types_compatible_p): Revise
>  logic for types with FAM.
>  (verify_type): Add exception for mode for types with FAM.
> 
>  gcc/testsuite/ChangeLog:
>  * gcc.dg/pr113688.c: New test.
>  * gcc.dg/pr114014.c: New test.
>  * gcc.dg/pr117724.c: New test.
> 
> diff --git a/gcc/testsuite/gcc.dg/pr113688.c b/gcc/testsuite/gcc.dg/pr113688.c
> new file mode 100644
> index 000..8dee8c86f1b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr113688.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-g" } */
> +
> +struct S{int x,y[1];}*a;
> +int main(void){
> + struct S{int x,y[];};
> +}
> +
> diff --git a/gcc/testsuite/gcc.dg/pr114014.c b/gcc/testsuite/gcc.dg/pr114014.c
> new file mode 100644
> index 000..ab783f4f85d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr114014.c
> @@ -0,0 +1,14 @@
> +/* PR c/114014
> + * { dg-do compile }
> + * { dg-opt

Re: [PATCH] c++: Allow overloaded builtins to be used in SFINAE context

2024-12-02 Thread Jason Merrill

On 10/21/24 6:43 AM, Matthew Malcomson wrote:

Ping (re-sending ping because previous message body too large for list --
apologies for duplication to those on Cc).
Attaching update on testsuite to fix testism on Arm that Linaro CI caught.


From: Matthew Malcomson 

This commit newly introduces the ability to use overloaded builtins in
C++ SFINAE context.

...

This is done both for resolve_overloaded_builtin and
check_builtin_function_arguments, both of which can be used in SFINAE
contexts.
N.b. I attempted to trigger something using the `reject_gcc_builtin`
function in an SFINAE context.  Given the context where this
function is called from the C++ frontend it looks like it may be
possible, but I did not manage to trigger this in template context
by attempting to do something similar to the testcases added around
those calls.
- I would appreciate any feedback on whether this is something that
  can happen in a template context, and if so some help writing a
  relevant testcase for it.


I'm also having trouble thinking of a case where we could hit 
reject_gcc_builtin in SFINAE contexts, it should happen during parsing. 
We can always add that later if someone comes up with a testcase.


Incidentally, using "N.b." in comments seems redundant; all comments are 
there to give context for what follows.



Both of these functions have target hooks for target specific builtins
that I have updated to take the extra boolean flag.  I have not adjusted
the functions implementing those target hooks (except to update the
declarations) so target specific builtins will still error in SFINAE
contexts.


Since you're adding the parameter, why not adjust them to check it?


- I did not pass this new flag through
  atomic_bitint_fetch_using_cas_loop since the _BitInt type is not
  available in the C++ frontend and I didn't want if conditions that can
  not be executed in the source.


I guess that's fine.


- I only test non-compile-time-constant types with SVE types, since I do
  not know of a way to get a VLA into a SFINAE context.


Agreed.


- While writing tests I noticed a few differences with clang in this
  area.  I don't think they are problematic but am mentioning them for
  completeness and to allow others to judge if these are a problem.
  - atomic_fetch_add on a boolean is allowed by clang.


I don't see a need to allow that.


  - With SFINAE GCC is happy with multiple definitions of a differently
typed template as long as they aren't instantiated, while clang is
not.  This seems to be a general difference around clang and GCC and
not specific to builtins.
I.e. two template definitions with the same name where one
specialises on `myfunc (std::declval())` and another on
`myfunc (std::declval(), std::declval())` will not give an
error in GCC unless one attempts to instantiate the template.
However it will give an error in clang on the template redefinition.


Can you give a testcase?  I'm having trouble understanding what you mean.


- I do not block the warning about using
  __builtin_speculation_safe_value on a target that does not have any
  active mitigation defined.  I do this since when this happens we do
  not return error_mark_node, which means that if a user attempted to
  use SFINAE to check for this situation that would silently fail if the
  warning were suppressed.
  N.b. this is also a warning rather than an error.
  Similarly I do not block the warning about an invalid memory model
  argument in get_atomic_generic_size.


Hmm, I think we want a SFINAE failure in these cases rather than a 
warning, like various extensions that we reject in SFINAE but accept 
with a pedwarn in other contexts.


Giving a warning when considering a candidate that might not be chosen 
for other reasons seems undesirable.



@@ -8356,41 +8473,41 @@ resolve_overloaded_builtin (location_t loc, tree function,
  {
  case BUILT_IN_ATOMIC_EXCHANGE:
{
- if (resolve_overloaded_atomic_exchange (loc, function, params,
- &new_return))
-   return new_return;
- /* Change to the _N variant.  */
- orig_code = BUILT_IN_ATOMIC_EXCHANGE_N;
- break;
+   if (resolve_overloaded_atomic_exchange (loc, function, params,
+   &new_return, complain))
+ return new_return;
+   /* Change to the _N variant.  */
+   orig_code = BUILT_IN_ATOMIC_EXCHANGE_N;
+   break;
}


Try not to change the indentation of lines you aren't otherwise changing.

Jason



[committed] arm, mve: Adding missing Runtime Library Exception to header files

2024-12-02 Thread Andre Vieira (lists)
Add missing Runtime Library Exception to mve header files to bring them 
into line with other similar headers. Not adding it in the first place 
was an oversight.


gcc/ChangeLog:

* config/arm/arm_mve.h: Add Runtime Library Exception.
* config/arm/arm_mve_types.h: Likewise.

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 8ffdbc7e1095a2bd02376f8584c33e8f17e14697..21a2ae7353bf63449d7603875d76cbd6ab632c41 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -15,6 +15,10 @@
or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
License for more details.
 
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
.  */
diff --git a/gcc/config/arm/arm_mve_types.h b/gcc/config/arm/arm_mve_types.h
index f549f881b490b5d6a5e6d42d7139a3e3a446271e..7771435f1d75b9a02a269641ef218e72cf1868d3 100644
--- a/gcc/config/arm/arm_mve_types.h
+++ b/gcc/config/arm/arm_mve_types.h
@@ -15,6 +15,10 @@
or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
License for more details.
 
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
.  */


gccrs: Remove unused files 'gcc/rust/typecheck/rust-hir-type-check-toplevel.{cc,h}' (was: [PATCH] gccrs: Remove unused files)

2024-12-02 Thread Thomas Schwinge
Hi!

On 2024-12-01T21:17:39-0500, Owen Avery  wrote:
> These files only exist upstream, and were presumably either never
> removed upstream or accidentally upstreamed despite being removed
> downstream.
>
> gcc/rust/ChangeLog:
>
>   * typecheck/rust-hir-type-check-toplevel.cc: Removed.
>   * typecheck/rust-hir-type-check-toplevel.h: Removed.

Confirmed: something went wrong during upstreaming of
"Refactor TypeResolution to be a simple query based system" -- but no
harm done at all.  I'd come up with the same patch just a week ago.  ;-)
I've added a bit more context to your Git commit log, and then pushed to
trunk branch in commit 8173d0a4b75ae2b25e9ed8b4ed8bdc39c3438560
"gccrs: Remove unused files 
'gcc/rust/typecheck/rust-hir-type-check-toplevel.{cc,h}'",
see attached.  


Regards
 Thomas


>From 8173d0a4b75ae2b25e9ed8b4ed8bdc39c3438560 Mon Sep 17 00:00:00 2001
From: Owen Avery 
Date: Sun, 1 Dec 2024 21:17:39 -0500
Subject: [PATCH] gccrs: Remove unused files
 'gcc/rust/typecheck/rust-hir-type-check-toplevel.{cc,h}'

These files only still exist upstream; they should have been removed as
part of commit 104cc285533e742726ae18a7d3d4f384dd20c350
"gccrs: Refactor TypeResolution to be a simple query based system".

gcc/rust/ChangeLog:

	* typecheck/rust-hir-type-check-toplevel.cc: Removed.
	* typecheck/rust-hir-type-check-toplevel.h: Removed.

Signed-off-by: Owen Avery 
Co-authored-by: Thomas Schwinge 
---
 .../typecheck/rust-hir-type-check-toplevel.cc | 378 --
 .../typecheck/rust-hir-type-check-toplevel.h  |  56 ---
 2 files changed, 434 deletions(-)
 delete mode 100644 gcc/rust/typecheck/rust-hir-type-check-toplevel.cc
 delete mode 100644 gcc/rust/typecheck/rust-hir-type-check-toplevel.h

diff --git a/gcc/rust/typecheck/rust-hir-type-check-toplevel.cc b/gcc/rust/typecheck/rust-hir-type-check-toplevel.cc
deleted file mode 100644
index 8224afb4b684..
--- a/gcc/rust/typecheck/rust-hir-type-check-toplevel.cc
+++ /dev/null
@@ -1,378 +0,0 @@
-// Copyright (C) 2020-2024 Free Software Foundation, Inc.
-
-// This file is part of GCC.
-
-// GCC is free software; you can redistribute it and/or modify it under
-// the terms of the GNU General Public License as published by the Free
-// Software Foundation; either version 3, or (at your option) any later
-// version.
-
-// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
-// WARRANTY; without even the implied warranty of MERCHANTABILITY or
-// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
-// for more details.
-
-// You should have received a copy of the GNU General Public License
-// along with GCC; see the file COPYING3.  If not see
-// .
-
-#include "rust-hir-type-check-toplevel.h"
-#include "rust-hir-type-check-enumitem.h"
-#include "rust-hir-type-check-type.h"
-#include "rust-hir-type-check-expr.h"
-#include "rust-hir-type-check-pattern.h"
-#include "rust-hir-type-check-implitem.h"
-
-namespace Rust {
-namespace Resolver {
-
-TypeCheckTopLevel::TypeCheckTopLevel () : TypeCheckBase () {}
-
-void
-TypeCheckTopLevel::Resolve (HIR::Item &item)
-{
-  rust_assert (item.get_hir_kind () == HIR::Node::BaseKind::VIS_ITEM);
-  HIR::VisItem &vis_item = static_cast (item);
-
-  TypeCheckTopLevel resolver;
-  vis_item.accept_vis (resolver);
-}
-
-void
-TypeCheckTopLevel::visit (HIR::TypeAlias &alias)
-{
-  TyTy::BaseType *actual_type
-= TypeCheckType::Resolve (alias.get_type_aliased ().get ());
-
-  context->insert_type (alias.get_mappings (), actual_type);
-
-  for (auto &where_clause_item : alias.get_where_clause ().get_items ())
-{
-  ResolveWhereClauseItem::Resolve (*where_clause_item.get ());
-}
-}
-
-void
-TypeCheckTopLevel::visit (HIR::TupleStruct &struct_decl)
-{
-  std::vector substitutions;
-  if (struct_decl.has_generics ())
-resolve_generic_params (struct_decl.get_generic_params (), substitutions);
-
-  for (auto &where_clause_item : struct_decl.get_where_clause ().get_items ())
-{
-  ResolveWhereClauseItem::Resolve (*where_clause_item.get ());
-}
-
-  std::vector fields;
-  size_t idx = 0;
-  for (auto &field : struct_decl.get_fields ())
-{
-  TyTy::BaseType *field_type
-	= TypeCheckType::Resolve (field.get_field_type ().get ());
-  TyTy::StructFieldType *ty_field
-	= new TyTy::StructFieldType (field.get_mappings ().get_hirid (),
- std::to_string (idx), field_type,
- field.get_locus ());
-  fields.push_back (ty_field);
-  context->insert_type (field.get_mappings (), ty_field->get_field_type ());
-  idx++;
-}
-
-  // get the path
-  const CanonicalPath *canonical_path = nullptr;
-  bool ok = mappings->lookup_canonical_path (
-struct_decl.get_mappings ().get_nodeid (), &canonical_path);
-  rust_assert (ok);
-  RustIdent ident{*canonical_path, struct_decl.get_locus ()};
-
-  // its a single variant ADT
-  std::vector variants;
-  variants.push_back (new TyTy::VariantDef (
-struct_

m68k: don't allow o/o in movdi, movdf, movxf

2024-12-02 Thread Andreas Schwab
The movdi, movdf and movxf patterns allow both operands to be offsettable
memory, but output_move_double cannot handle overlapping objects.  This is
visible in the failure of gcc.c-torture/execute/pr97073.c when compiled
with LTO (where cprop optimizes out the AND operation; the failure also
occurs without LTO when the AND is removed).  Split the constraints so
that the operands cannot both be "o" in the same insn.

* config/m68k/m68k.md (movdi+1, movdf+1, movxf+2): Split
constraints so that the operands cannot both be "o".
---
 gcc/config/m68k/m68k.md | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index 1c9a6bf1748..d7329004e91 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -1354,8 +1354,8 @@
 })
 
 (define_insn ""
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=rm,rf,rf,&rof<>")
-   (match_operand:DF 1 "general_operand" "*rf,m,0,*rofE<>"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=rm,rf,rf,&rof<>,&rf<>")
+   (match_operand:DF 1 "general_operand" "*rf,m,0,*rfE<>,*rofE<>"))]
 ;  [(set (match_operand:DF 0 "nonimmediate_operand" "=rm,&rf,&rof<>")
 ;  (match_operand:DF 1 "general_operand" "rf,m,rofF<>"))]
   "!TARGET_COLDFIRE"
@@ -1514,8 +1514,8 @@
   [(set_attr "flags_valid" "move")])
 
 (define_insn ""
-  [(set (match_operand:XF 0 "nonimmediate_operand" "=rm,rf,&rof<>")
-   (match_operand:XF 1 "nonimmediate_operand" "rf,m,rof<>"))]
+  [(set (match_operand:XF 0 "nonimmediate_operand" "=rm,rf,&rof<>,&rf<>")
+   (match_operand:XF 1 "nonimmediate_operand" "rf,m,rf<>,rof<>"))]
   "! TARGET_68881 && ! TARGET_COLDFIRE"
 {
   if (FP_REG_P (operands[0]))
@@ -1568,8 +1568,8 @@
 ;; movdi can apply to fp regs in some cases
 (define_insn ""
   ;; Let's see if it really still needs to handle fp regs, and, if so, why.
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=rm,r,&ro<>")
-   (match_operand:DI 1 "general_operand" "rF,m,roi<>F"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=rm,r,&ro<>,&r<>")
+   (match_operand:DI 1 "general_operand" "rF,m,ri<>F,roi<>F"))]
 ;  [(set (match_operand:DI 0 "nonimmediate_operand" "=rm,&r,&ro<>,!&rm,!&f")
 ;  (match_operand:DI 1 "general_operand" "r,m,roi<>,fF"))]
 ;  [(set (match_operand:DI 0 "nonimmediate_operand" "=rm,&rf,&ro<>,!&rm,!&f")
-- 
2.47.1


-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [patch,avr] ad PR117726: Improve logic 8-bit shifts with an offset of 6

2024-12-02 Thread Denis Chertykov
On Mon, 2 Dec 2024 at 15:29, Georg-Johann Lay wrote:
>
> Logic 8-bit shifts with an offset of 6 can be improved by
> supporting them as 3-operand operations.
>
> Ok for trunk?
>

Ok. Please apply.

Denis.


[PATCH] phiopt: don't handle the case cond edge dest is itself [PR117243]

2024-12-02 Thread Andrew Pinski
After r12-5300-gf98f373dd822b3, phiopt could get the following bb structure:
  |
middle-bb -|
  ||
  |   ||   |
phi<1, 2>  |   |
cond   |   |
  ||   |
  |+---|

Which was considered 2 loops. The inner loop had an upper_bound estimate of 8,
due to the original `for (b = 0; b <= 7; b++)`. The outer loop was already an
infinite one.
So phiopt would come along and change the condition to be unconditionally true,
and cleanup cfg would remove the condition and the outer loop but not update
the estimates of the inner one, which had become an infinite loop.
I decided it was easier to avoid this inside phiopt rather than figuring out how
to fix up cleanup cfg.

This patch avoids the issue by rejecting edges back to the condition bb before
loop optimizations have been run.

Bootstrapped and tested on x86_64-linux-gnu.

Note: since the testcases depend on the loop being infinite, I used the alarm
signal trick.

PR tree-optimization/117243
PR tree-optimization/116749

gcc/ChangeLog:

* tree-ssa-phiopt.cc (execute_over_cond_phis): Reject edges back
to the conditional bb before loop optimizers have been run.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr117243-1.c: New test.
* gcc.dg/torture/pr117243-2.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/testsuite/gcc.dg/torture/pr117243-1.c | 44 ++
 gcc/testsuite/gcc.dg/torture/pr117243-2.c | 54 +++
 gcc/tree-ssa-phiopt.cc|  7 +++
 3 files changed, 105 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr117243-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr117243-2.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr117243-1.c 
b/gcc/testsuite/gcc.dg/torture/pr117243-1.c
new file mode 100644
index 000..46723132553
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr117243-1.c
@@ -0,0 +1,44 @@
+/* { dg-do run } */
+/* { dg-require-effective-target signal } */
+
+/* PR tree-optimization/117243 */
+#include 
+#include 
+#include 
+
+void do_exit (int i)
+{
+  exit (0);
+}
+
+/* foo should be an infinite but sometimes it gets optimized incorrectly into
+   an __builtin_unreachable(); which is not valid.  */
+void
+foo (unsigned int a, unsigned char b)
+{
+  lbl:
+  for (b = 0; b <= 7; b++)
+{
+  unsigned char c[1][1];
+  int i, j;
+  for (i = 0; i < 1; i++)
+for (j = 0; j < 1; j++)
+  c[i][j] = 1;
+  if (b)
+   goto lbl;
+}
+}
+
+int
+main ()
+{
+  struct sigaction s;
+  sigemptyset (&s.sa_mask);
+  s.sa_handler = do_exit;
+  s.sa_flags = 0;
+  sigaction (SIGALRM, &s, NULL);
+  alarm (1);
+
+  foo (1, 2);
+}
+
diff --git a/gcc/testsuite/gcc.dg/torture/pr117243-2.c 
b/gcc/testsuite/gcc.dg/torture/pr117243-2.c
new file mode 100644
index 000..5cb864b467d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr117243-2.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-additional-options "-fno-tree-ch" } */
+/* { dg-require-effective-target signal } */
+
+/* PR tree-optimization/117243 */
+/* PR tree-optimization/116749 */
+#include 
+#include 
+#include 
+
+void do_exit (int i)
+{
+  exit (0);
+}
+
+/* main1 should be an infinite but sometimes it gets optimized incorrectly into
+   an __builtin_unreachable(); which is not valid.  */
+int main1 (void)
+{
+int g=0;
+int l1[1];
+int *l2 = &g;
+int i;
+for (i=0; i<1; i++)
+l1[i] = (1);
+for (g=0; g; ++g)
+{
+int *l3[1] = {&l1[0]};
+}
+*l2 = *l1;
+b:
+for (i=0; i<2; ++i)
+{ 
+if (i)
+goto b;
+if (g)
+continue;
+}
+return 0;
+}
+
+int
+main (void)
+{
+  struct sigaction s;
+  sigemptyset (&s.sa_mask);
+  s.sa_handler = do_exit;
+  s.sa_flags = 0;
+  sigaction (SIGALRM, &s, NULL);
+  alarm (1);
+
+  main1 ();
+}
+
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 15651809d71..a2649590d29 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -4178,6 +4178,13 @@ execute_over_cond_phis (func_type func)
   e2 = EDGE_SUCC (bb, 1);
   bb2 = e2->dest;
 
+  /* If loop opts are not done, then don't handle a loop back to itself.
+ The loop estimates sometimes are not updated correctly when changing
+a loop into an infinite loop.  */
+  if (!(cfun->curr_properties & PROP_loop_opts_done)
+ && (bb1 == bb || bb2 == bb))
+   continue;
+
   /* We cannot do the optimization on abnormal edges.  */
   if ((e1->flags & EDGE_ABNORMAL) != 0
  || (e2->flags & EDGE_ABNORMAL) != 0)
-- 
2.43.0



Re: [PATCH] libstdc++: Use hidden friends for __normal_iterator operators

2024-12-02 Thread Jonathan Wakely
On Mon, 2 Dec 2024 at 17:42, Patrick Palka  wrote:
>
> On Mon, 2 Dec 2024, Patrick Palka wrote:
>
> > On Thu, 28 Nov 2024, Jonathan Wakely wrote:
> >
> > > As suggested by Jason, this makes all __normal_iterator operators into
> > > friends so they can be found by ADL and don't need to be separately
> > > exported in module std.
> >
> > Might as well remove the __gnu_cxx exports in std.cc.in while we're at
> > it?

Ah yes, since that was the original purpose of the change!

> > >
> > > For the operator<=> comparing two iterators of the same type, I had to
> > > use a deduced return type and add a requires-clause, because it's no
> > > longer a template and so we no longer get substitution failures when
> > > it's considered in overload resolution.
> > >
> > > I also had to reorder the __attribute__((always_inline)) and
> > > [[nodiscard]] attributes, which have to be in a particular order when
> > > used on friend functions.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > > * include/bits/stl_iterator.h (__normal_iterator): Make all
> > > non-member operators hidden friends.
> > > * src/c++11/string-inst.cc: Remove explicit instantiations of
> > > operators that are no longer templates.
> > > ---
> > >
> > > Tested x86_64-linux.
> > >
> > > This iterator type isn't defined in the standard, and users shouldn't be
> > > doing funny things with it, so nothing prevents us from replacing its
> > > operators with hidden friends.
> > >
> > >  libstdc++-v3/include/bits/stl_iterator.h | 341 ---
> > >  libstdc++-v3/src/c++11/string-inst.cc|  11 -
> > >  2 files changed, 184 insertions(+), 168 deletions(-)
> > >
> > > diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
> > > b/libstdc++-v3/include/bits/stl_iterator.h
> > > index e872598d7d8..656a47e5f76 100644
> > > --- a/libstdc++-v3/include/bits/stl_iterator.h
> > > +++ b/libstdc++-v3/include/bits/stl_iterator.h
> > > @@ -1164,188 +1164,215 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >const _Iterator&
> > >base() const _GLIBCXX_NOEXCEPT
> > >{ return _M_current; }
> > > -};
> > >
> > > -  // Note: In what follows, the left- and right-hand-side iterators are
> > > -  // allowed to vary in types (conceptually in cv-qualification) so that
> > > -  // comparison between cv-qualified and non-cv-qualified iterators be
> > > -  // valid.  However, the greedy and unfriendly operators in std::rel_ops
> > > -  // will make overload resolution ambiguous (when in scope) if we don't
> > > -  // provide overloads whose operands are of the same type.  Can someone
> > > -  // remind me what generic programming is about? -- Gaby
> > > +private:
> > > +  // Note: In what follows, the left- and right-hand-side iterators 
> > > are
> > > +  // allowed to vary in types (conceptually in cv-qualification) so 
> > > that
> > > +  // comparison between cv-qualified and non-cv-qualified iterators 
> > > be
> > > +  // valid.  However, the greedy and unfriendly operators in 
> > > std::rel_ops
> > > +  // will make overload resolution ambiguous (when in scope) if we 
> > > don't
> > > +  // provide overloads whose operands are of the same type.  Can 
> > > someone
> > > +  // remind me what generic programming is about? -- Gaby
> > >
> > >  #ifdef __cpp_lib_three_way_comparison
> > > -  template
> > > -[[nodiscard, __gnu__::__always_inline__]]
> > > -constexpr bool
> > > -operator==(const __normal_iterator<_IteratorL, _Container>& __lhs,
> > > -  const __normal_iterator<_IteratorR, _Container>& __rhs)
> > > -noexcept(noexcept(__lhs.base() == __rhs.base()))
> > > -requires requires {
> > > -  { __lhs.base() == __rhs.base() } -> std::convertible_to;
> > > -}
> > > -{ return __lhs.base() == __rhs.base(); }
> > > +  template
> > > +   [[nodiscard, __gnu__::__always_inline__]]
> > > +   friend
> > > +   constexpr bool
> > > +   operator==(const __normal_iterator& __lhs,
> > > +  const __normal_iterator<_Iter, _Container>& __rhs)
> > > +   noexcept(noexcept(__lhs.base() == __rhs.base()))
> > > +   requires requires {
> > > + { __lhs.base() == __rhs.base() } -> std::convertible_to;
> > > +   }
> > > +   { return __lhs.base() == __rhs.base(); }
> > >
> > > -  template
> > > -[[nodiscard, __gnu__::__always_inline__]]
> > > -constexpr std::__detail::__synth3way_t<_IteratorR, _IteratorL>
> > > -operator<=>(const __normal_iterator<_IteratorL, _Container>& __lhs,
> > > -   const __normal_iterator<_IteratorR, _Container>& __rhs)
> > > -noexcept(noexcept(std::__detail::__synth3way(__lhs.base(), 
> > > __rhs.base(
> > > -{ return std::__detail::__synth3way(__lhs.base(), __rhs.base()); }
> > > +  template
> > > +   static constexpr bool __nothrow_synth3way
> > > + = noexcept(std::__detail::__synth3way(std::declval<_Iterator&>(),
> > > +   std::declval<_Iter&>()));
> >
> > Sinc

[PATCH] c++/contracts: ICE with contract assert on non empty statement [PR 117579]

2024-12-02 Thread Nina Ranns


Tested on x86_64-pc-linux-gnu.
First time using git send-email, let me know if anything needs to be
done differently.
OK for trunk?
Thanks,
Nina



Contract assert is an attribute on a non empty statement. Currently we
assert that the statement is empty before emitting the assertion. This
has been changed to a conditional check that the statement is empty
before the assertion is emitted.

PR c++/117579

gcc/cp/ChangeLog:

* parser.cc (cp_parser_statement): assertion replaced with a
conditional check that the statement containing a contract
assert is empty.

gcc/testsuite/ChangeLog:

* g++.dg/contracts/pr117579.C: New test.

Signed-off-by: Nina Ranns 
---
 gcc/cp/parser.cc  | 6 --
 gcc/testsuite/g++.dg/contracts/pr117579.C | 9 +
 2 files changed, 13 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/contracts/pr117579.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index f60ed47dfd7..e655e5cc0db 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -13082,8 +13082,10 @@ cp_parser_statement (cp_parser* parser, tree 
in_statement_expr,
   if (cp_contract_assertion_p (std_attrs))
{
  /* Add the assertion as a statement in the current block.  */
- gcc_assert (!statement || statement == error_mark_node);
- emit_assertion (std_attrs);
+ if (!statement)
+   emit_assertion (std_attrs);
+ /* we already checked that the contract assertion is followed by
+  a semicolon.  */
  std_attrs = NULL_TREE;
}
 }
diff --git a/gcc/testsuite/g++.dg/contracts/pr117579.C 
b/gcc/testsuite/g++.dg/contracts/pr117579.C
new file mode 100644
index 000..f15cdf0c78d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/contracts/pr117579.C
@@ -0,0 +1,9 @@
+// check that contract assertion on a non empty statement doesn't cause an ICE
+// { dg-do compile }
+// { dg-options "-std=c++2a -fcontracts " }
+
+void f();
+int main ()
+{
+  [[assert: true]] f(); // { dg-error "assertions must be followed by" }
+}
-- 
2.45.2



Re: Fix type compatibility for types with flexible array member [PR113688,PR114014,PR117724]

2024-12-02 Thread Martin Uecker
On Monday, 02.12.2024 at 16:31 +, Qing Zhao wrote:
> 
> > On Nov 30, 2024, at 07:10, Martin Uecker  wrote:
> > 
> > On Tuesday, 26.11.2024 at 15:15 +, Qing Zhao wrote:
> > > 
> > > > On Nov 25, 2024, at 16:46, Martin Uecker  wrote:
> > > > 
> > > > 
> > > > Hi Qing,
> > > > 
> > > > On Monday, 25.11.2024 at 17:40 +, Qing Zhao wrote:
> > > > > Hi, Martin,
> > > > > 
> > > > > I didn’t go through all the details of your patch.
> > > > > 
> > > > > But I have one question:
> > > > > 
> > > > > Did you consider the effect of the option -fstrict-flex-arrays
> > > > > (https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/C-Dialect-Options.html#index-fstrict-flex-arrays)
> > > > > on how gcc treats a zero-size or 1-element trailing array
> > > > > as a flexible array member in the patch?
> > > > 
> > > > I used the function which was already there which
> > > > does not take this into account.  For the new version
> > > > of the patch this should not matter anymore.
> > > 
> > > Why does it not matter anymore?
> > > 
> > > For the following testing case:
> > > 
> > > struct S{int x,y[1];}*a;
> > > int main(void){
> > > struct S{int x,y[];};
> > > }
> > > 
> > > With your latest patch, the two structures are considered compatible
> > > with -g.
> > > However, if we add -fstrict-flex-arrays=2 or -fstrict-flex-arrays=3, the
> > > trailing array y[1] is NOT treated as a FAM anymore; as a result, these
> > > two structures are NOT compatible either.
> > > 
> > > Am I missing anything obvious?
> > 
> > It is not about compatibility from a language semantic point of view
> > but about TBAA-compatibility, which needs to be consistent with it but
> > can (and must) be more general.
> > 
> > For TBAA, I think we want 
> > 
> > struct foo { int x; int y[]; };
> > 
> > to be TBAA-compatible to
> > 
> > struct foo { int x; int y[3]; };
> 
> Okay, I see now.  Thank you for the explanation.
> (Now I also see this from the comments of the routine 
> gimple_canonical_types_compatible_p -:)
> 
> 
> Though, what confused me is the testing case in your patch:
> 
> diff --git a/gcc/testsuite/gcc.dg/pr114014.c b/gcc/testsuite/gcc.dg/pr114014.c
> new file mode 100644
> index 000..ab783f4f85d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr114014.c
> @@ -0,0 +1,14 @@
> +/* PR c/114014
> + * { dg-do compile }
> + * { dg-options "-std=c23 -g" } */
> +
> +struct r {
> +  int a;
> +  char b[];
> +};
> +struct r {
> +  int a;
> +  char b[0];
> +}; /* { dg-error "redefinition" } */
> +
> +
> 
> Is the above testing case claiming that b[] and b[0] are compatible from a 
> language semantic point of view?

It would test that we do not crash with checking.

Semantically, in C23 if you redeclare a type in the same scope then
it must not only be compatible but is also not allowed to differ.
So a redeclaration in the same scope has stricter requirements than
compatibility (this is also true for typedefs, for example).

Whether we allow

struct r {
  int a;
  char b[];
};

struct r {
  int a;
  char b[0];
};

depends on us because the [0] is an extension.  I would make it
compatible but not allow redefinition as the types are different.
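
As a rough illustration of why a flexible array member and a fixed-size
trailing array can reasonably be treated alike for TBAA, the sketch below
checks that both layouts place their members at the same offsets.  The names
foo_fam, foo_fixed and same_prefix_layout are made up for this example and do
not come from the patch; it only shows the layout, not GCC's canonical-type
logic.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only.  */
struct foo_fam   { int x; int y[]; };   /* flexible array member */
struct foo_fixed { int x; int y[3]; };  /* fixed-size trailing array */

/* Return 1 if x and y sit at the same offsets in both structs.  */
int
same_prefix_layout (void)
{
  return offsetof (struct foo_fam, x) == offsetof (struct foo_fixed, x)
	 && offsetof (struct foo_fam, y) == offsetof (struct foo_fixed, y);
}
```

An access to y[i] through either type therefore touches the same bytes, which
is the situation alias analysis has to account for.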


Martin


> 
> thanks.
> 
> Qing
> > even when we do not treat the latter as FAM (i.e. still forbid
> > out-of-bounds accesses).
> > 
> > E.g. see Richard's comment: 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114713#c2
> > 
> > 
> > Martin
> > 
> > > Thanks.
> > > 
> > > Qing
> > > > 
> > > > Martin
> > > > 
> > > > 
> > > > > 
> > > > > thanks.
> > > > > 
> > > > > Qing
> > > > > > On Nov 23, 2024, at 14:45, Martin Uecker  wrote:
> > > > > > 
> > > > > > 
> > > > > > This patch tries to fix the errors we have because of
> > > > > > flexible array members.  I am a bit unsure about the exception
> > > > > > for the mode. 
> > > > > > 
> > > > > > Bootstrapped and regression tested on x86_64.
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > >  Fix type compatibility for types with flexible array member 
> > > > > > [PR113688,PR114014,PR117724]
> > > > > > 
> > > > > >  verify_type checks the compatibility of TYPE_CANONICAL using
> > > > > >  gimple_canonical_types_compatible_p.  But it is stricter than what the
> > > > > >  C standard requires and therefore inconsistent with how TYPE_CANONICAL is set
> > > > > >  in the C FE.  Here, the logic is changed to ignore array size when one of the
> > > > > >  types is a flexible array member.  To not get errors because of inconsistent
> > > > > >  number of members, zero-sized arrays are not ignored anymore when checking
> > > > > >  fields of a struct (which is stricter than what was done before).
> > > > > >  Finally, an exception is added that allows the TYPE_MODE of a type with
> > > > > >  flexible array member to differ from another compatible type.
> > > > > > 
> > > > > >  PR c/113688
> > > > > >  PR c/114014
> > > > > >  PR c/117724
>

Re: [Patch, fortran] PR102689 revisited - Segfault with RESHAPE of CLASS as actual argument

2024-12-02 Thread Harald Anlauf

Hi Paul,

thanks for that tremendous patch!

Reviewing it properly is beyond me, but if nobody else volunteers,
I'll just provide a few minor comments derived from playing with it,
and let you decide to push or polish.

In testcase class_transformational_2.f90 I recommend to "harden"
the select type () blocks in check_result and check_result_a by e.g.

 class default
stop stop_flag + 4

so that wrong dynamic types are detected (I had fun with ifx... :).
There are also unused declared variables (e.g. ss(:)), probably
leftovers.

I got an ICE compiling class_transformational_1.f90 with the following
options:

% gfc-15 class_transformational_1.f90 -O3 -fcheck=bounds

Note that -O3 and -fcheck=bounds seem essential.  I get:

f951: internal compiler error: in make_ssa_name_fn, at tree-ssanames.cc:355
0x26ab458 internal_error(char const*, ...)
../../gcc-trunk/gcc/diagnostic-global-context.cc:517
0x99dc1c fancy_abort(char const*, int, char const*)
../../gcc-trunk/gcc/diagnostic.cc:1696
0x8947a8 make_ssa_name_fn(function*, tree_node*, gimple*, unsigned int)
../../gcc-trunk/gcc/tree-ssanames.cc:355
0x11e879b make_ssa_name(tree_node*, gimple*)
../../gcc-trunk/gcc/tree-ssanames.h:99
0x11e879b remap_ssa_name
../../gcc-trunk/gcc/tree-inline.cc:238
[...]

I did not try to narrow that down further.  If you decide to push
the submitted patch, and if you can reproduce the above, then please
open a separate PR to track this.  Given what the patch resolves,
this is not a real show-stopper, so you may go ahead.

Thanks,
Harald

On 29.11.24 at 23:21, Paul Richard Thomas wrote:

Hi Harald,

Sorry about that - it was the standard HEAD versus HEAD~ mistake.

Thanks for pointing it out.

Paul


On Fri, 29 Nov 2024 at 17:31, Harald Anlauf  wrote:


Hi Paul,

the patch seems to contain stuff that has already been pushed
(gcc/testsuite/gfortran.dg/pr117768.f90, and the chunks in
class.cc and resolve.cc).  Can you please check?

Cheers,
Harald

On 29.11.24 at 17:34, Paul Richard Thomas wrote:

Hi All,

This patch was originally pushed as r15-2739. Subsequently memory faults
were found and so the patch was reverted. At the time, I could not find where
the problem lay. This morning I had another look and found it almost
immediately :-)

The fix is the 'gfc_resize_class_size_with_len' in the chunk '@@ -1595,14
+1629,51 @@ gfc_trans_create_temp_array '. Without it, half as much memory
as needed was being provided by the allocation and so accesses were
occurring outside the allocated space. Valgrind now reports no errors.

Regression tests with flying colours - OK for mainline?

Paul










[PATCH v2] arm: [MVE intrinsics] Avoid warnings when floating-point is not supported [PR 117814]

2024-12-02 Thread Christophe Lyon
If the target does not support floating-point, we register FP vector
types as 'void' (see register_vector_type).

This leads to warnings about 'pure attribute on function returning
void' when we declare the various load intrinsics because their
call_properties say CP_READ_MEMORY (thus giving them the 'pure'
attribute), but their return type is void.
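
A minimal sketch of the warning in question, using a made-up function
load_first rather than a real MVE intrinsic: 'pure' promises that a call has
no observable effect other than its return value, so it only makes sense on
functions that return something.  GCC is expected to diagnose the commented-out
void variant under -Wattributes.

```c
/* Hypothetical stand-in for a load intrinsic: it only reads memory,
   so marking it 'pure' is fine because it returns a value.  */
__attribute__ ((pure)) int
load_first (const int *p)
{
  return p[0];
}

/* If the return type degenerates to void (as happened when FP vector
   types were registered as 'void'), the same attribute is meaningless
   and GCC warns about a pure attribute on a function returning void:

   __attribute__ ((pure)) void load_first_void (const int *p);  */
```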

To avoid such warnings, declare floating-point scalar and vector types
even if the target does not have an FPU.

In arm-mve-builtins.cc (register_builtin_types, register_vector_type,
register_builtin_tuple_types), this means simply removing the early
exits.  However, for this to work, we need to update
arm_vector_mode_supported_p, so that vector floating-point types are
always defined, and __fp16 must always be registered by
arm_init_fp16_builtins (as it is the base type for vectors of
float16_t).  Another side effect is that the declaration of float16_t
and float32_t typedefs is now unconditional.

The two new tests verify that:
- we emit an error if the code tries to use a floating-point
  intrinsic and the target does not have the floating-point extension
- we emit the expected code when activating the floating-point
  extension via a pragma

gcc/ChangeLog:

PR target/117814
* config/arm/arm-builtins.cc (arm_init_fp16_builtins): Always
register __fp16 type.
* config/arm/arm-mve-builtins.cc (register_vector_type): Remove
special handling when TARGET_HAVE_MVE_FLOAT is false.
(register_builtin_tuple_types): Likewise.
* config/arm/arm.cc (arm_vector_mode_supported_p): Accept
floating-point vector modes even if TARGET_HAVE_MVE_FLOAT is
false.
* config/arm/arm_mve_types.h (float16_t, float32_t): Define
unconditionally.

gcc/testsuite/ChangeLog:

PR target/117814
* gcc.target/arm/mve/intrinsics/pr117814-2.c: New test.
* gcc.target/arm/mve/intrinsics/pr117814.c: New test.
---
 gcc/config/arm/arm-builtins.cc|  5 ++--
 gcc/config/arm/arm-mve-builtins.cc| 24 ++--
 gcc/config/arm/arm.cc |  6 +---
 gcc/config/arm/arm_mve_types.h|  2 --
 .../arm/mve/intrinsics/pr117814-2.c   | 28 +++
 .../gcc.target/arm/mve/intrinsics/pr117814.c  | 19 +
 6 files changed, 52 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/pr117814.c

diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc
index 01bdbbf943d..71b78fee55b 100644
--- a/gcc/config/arm/arm-builtins.cc
+++ b/gcc/config/arm/arm-builtins.cc
@@ -2443,9 +2443,8 @@ arm_init_fp16_builtins (void)
   arm_fp16_type_node = make_node (REAL_TYPE);
   TYPE_PRECISION (arm_fp16_type_node) = GET_MODE_PRECISION (HFmode);
   layout_type (arm_fp16_type_node);
-  if (arm_fp16_format)
-(*lang_hooks.types.register_builtin_type) (arm_fp16_type_node,
-  "__fp16");
+  (*lang_hooks.types.register_builtin_type) (arm_fp16_type_node,
+"__fp16");
 }
 
 void
diff --git a/gcc/config/arm/arm-mve-builtins.cc 
b/gcc/config/arm/arm-mve-builtins.cc
index 30b103ec086..25c1b442a28 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -410,8 +410,6 @@ register_builtin_types ()
 #include "arm-mve-builtins.def"
   for (unsigned int i = 0; i < NUM_VECTOR_TYPES; ++i)
 {
-  if (vector_types[i].requires_float && !TARGET_HAVE_MVE_FLOAT)
-   continue;
   tree eltype = scalar_types[i];
   tree vectype;
   if (eltype == boolean_type_node)
@@ -433,18 +431,6 @@ register_builtin_types ()
 static void
 register_vector_type (vector_type_index type)
 {
-
-  /* If the target does not have the mve.fp extension, but the type requires
- it, then it needs to be assigned a non-dummy type so that functions
- with those types in their signature can be registered.  This allows for
- diagnostics about the missing extension, rather than about a missing
- function definition.  */
-  if (vector_types[type].requires_float && !TARGET_HAVE_MVE_FLOAT)
-{
-  acle_vector_types[0][type] = void_type_node;
-  return;
-}
-
   tree vectype = abi_vector_types[type];
   tree id = get_identifier (vector_types[type].acle_name);
   tree decl = build_decl (input_location, TYPE_DECL, id, vectype);
@@ -512,17 +498,11 @@ register_builtin_tuple_types (vector_type_index type)
 {
   const vector_type_info* info = &vector_types[type];
 
-  /* If the target does not have the mve.fp extension, but the type requires
- it, then it needs to be assigned a non-dummy type so that functions
- with those types in their signature can be registered.  This allows for
- diagnostics about the missing extension, rather than about a missing
- function definition.  */
-  if (s

RE: [RFC] PR81358: Enable automatic linking of libatomic

2024-12-02 Thread Joseph Myers
On Mon, 2 Dec 2024, Prathamesh Kulkarni wrote:

> Thanks for the suggestions! Unfortunately, it seems to me that AC_PROG_CC 
> also does run tests that
> need modified CFLAGS. I tried the following assertion before invoking 
> AC_PROG_CC (for stage-1 build):
> 
> if test -z "${CFLAGS}"; then
>   AC_MSG_ERROR([CFLAGS must be set.])
> fi
> 
> and it seems to pass. So I suppose the default setting of CFLAGS in 
> AC_PROG_CC won't be applicable anyway ? I checked what value CFLAGS was 
> set to and it turned out to be "-g -O2" (same as default setting in 
> AC_PROG_CC, altho I am not quite sure where CFLAGS were set before 
> invoking libatomic/configure). Given that the above assert passes, would 
> it be safe to add "-fno-link-libatomic" to CFLAGS before invoking 
> AC_PROG_CC (and after the assert) ?

Provided the assertion goes together with the addition to CFLAGS (to catch 
any cases of someone building libatomic outside of the toplevel GCC build 
system), it would probably be safe.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH] phiopt: don't handle the case cond edge dest is itself [PR117243]

2024-12-02 Thread Jakub Jelinek
On Mon, Dec 02, 2024 at 01:51:39PM -0800, Andrew Pinski wrote:
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/torture/pr117243-1.c: New test.
>   * gcc.dg/torture/pr117243-2.c: New test.

Just commenting on the testcases.
I don't like that 2 tests, times 7 different command line options each, spend
some time in a busy loop uselessly.
Can't you just make it dg-do compile and scan for optimized dump not having
__builtin_unreachable calls?

Jakub



[PATCH v2] rs6000: Inefficient vector splat of small V2DI constants [PR107757]

2024-12-02 Thread Surya Kumari Jangala
I have incorporated review comments in this patch.

Regards,
Surya


rs6000: Inefficient vector splat of small V2DI constants [PR107757]

On P8, for vector splat of double word constants, specifically -1 and 1,
gcc generates inefficient code. For -1, gcc generates two instructions
(vspltisw and vupkhsw) whereas only one instruction (vspltisw) is
sufficient. For constant 1, gcc generates a load of the constant from
.rodata instead of the instructions vspltisw and vupkhsw.

The routine vspltisw_vupkhsw_constant_p() returns true if the constant
can be synthesized with instructions vspltisw and vupkhsw. However, for
constant 1, this routine returns false.

For constant -1, this routine returns true. Vector splat of -1 can be
done with only one instruction, i.e., vspltisw. We do not need two
instructions. Hence this routine should return false for -1.

With this patch, gcc generates only one instruction (vspltisw)
for -1. And for constant 1, this patch generates two instructions
(vspltisw and vupkhsw).
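
The reasoning above can be checked with a small scalar model.  This is only a
sketch: model_vspltisw, model_vupkhsw and words_as_dwords are hypothetical
helpers that mimic the instructions on plain arrays, not rs6000 code or real
intrinsics.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of vspltisw: copy a small signed immediate into each of
   the four 32-bit words of a vector.  */
void
model_vspltisw (int32_t words[4], int imm)
{
  for (int i = 0; i < 4; i++)
    words[i] = imm;
}

/* Scalar model of vupkhsw: sign-extend two words to 64-bit doublewords.
   Which two words are used does not matter here since all are equal.  */
void
model_vupkhsw (int64_t dwords[2], const int32_t words[4])
{
  dwords[0] = words[0];
  dwords[1] = words[1];
}

/* View the 128 bits of four words as two doublewords, no conversion.  */
void
words_as_dwords (int64_t dwords[2], const int32_t words[4])
{
  memcpy (dwords, words, 2 * sizeof (int64_t));
}
```

With -1, every bit is set after the splat, so the doubleword view is already
-1 and no unpack is needed.  With 1, the doubleword view is
0x0000000100000001, so the sign-extending unpack is what produces the
doubleword value 1.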

2024-11-20  Surya Kumari Jangala  

gcc/
PR target/107757
* config/rs6000/rs6000.cc (vspltisw_vupkhsw_constant_p):
Return false for -1 and return true for 1.

gcc/testsuite/
PR target/107757
* gcc.target/powerpc/pr107757-1.c: New.
* gcc.target/powerpc/pr107757-2.c: New.
---
 gcc/config/rs6000/rs6000.cc   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr107757-1.c | 14 ++
 gcc/testsuite/gcc.target/powerpc/pr107757-2.c | 13 +
 3 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr107757-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr107757-2.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 02a2f1152db..d0c528f4d5f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -6652,7 +6652,7 @@ vspltisw_vupkhsw_constant_p (rtx op, machine_mode mode, 
int *constant_ptr)
 return false;
 
   value = INTVAL (elt);
-  if (value == 0 || value == 1
+  if (value == 0 || value == -1
   || !EASY_VECTOR_15 (value))
 return false;
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr107757-1.c 
b/gcc/testsuite/gcc.target/powerpc/pr107757-1.c
new file mode 100644
index 000..49076fba255
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr107757-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
+/* { dg-require-effective-target powerpc_vsx } */
+/* { dg-final { scan-assembler {\mvspltisw\M} } } */
+/* { dg-final { scan-assembler {\mvupkhsw\M} } } */
+/* { dg-final { scan-assembler-not {\mlvx\M} } } */
+
+#include 
+
+vector long long
+foo ()
+{
+ return vec_splats (1LL);
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr107757-2.c 
b/gcc/testsuite/gcc.target/powerpc/pr107757-2.c
new file mode 100644
index 000..4955696f11d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr107757-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
+/* { dg-require-effective-target powerpc_vsx } */
+/* { dg-final { scan-assembler {\mvspltisw\M} } } */
+/* { dg-final { scan-assembler-not {\mvupkhsw\M} } } */
+
+#include 
+
+vector long long
+foo ()
+{
+return vec_splats (~0LL);
+}
-- 
2.43.5




Re: [PATCH v2 1/2] aarch64: Refactor AdvSIMD intrinsics

2024-12-02 Thread Richard Sandiford
 writes:
> Refactor AdvSIMD intrinsics defined using the new pragma-based approach
> so that it is more extensible.
>
> Introduce a new struct, simd_type, which defines types using a mode and
> qualifiers, and use objects of this struct in the declaration of intrinsics
> in the aarch64-simd-pragma-builtins.def file.
>
> Change aarch64_pragma_builtins_data struct to support return type and
> argument types.
>
> Refactor aarch64_fntype and aarch64_expand_pragma_builtin so that it
> initialises corresponding vectors in a loop. As we add intrinsics with
> more arguments, these functions won't need to change to support those.
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-builtins.cc
>   (ENTRY): Modify to add support of return and argument types.
>   (struct simd_type): New struct to declare types using mode and
>   qualifiers.
>   (struct aarch64_pragma_builtins_data): Replace mode with
>   the array of types to support return and argument types.
>   (aarch64_get_number_of_args): New utility to get number of
>   arguments given an aarch64_builtin_signatures variant.
>   (aarch64_fntype): Modify to handle different signatures.
>   (aarch64_expand_pragma_builtin): Modify to handle different
>   signatures.
>   * config/aarch64/aarch64-simd-pragma-builtins.def
>   (ENTRY_VHSDF): Rename to ENTRY_BINARY_VHSDF.
>   (ENTRY_BINARY): New macro to declare binary intrinsics.
>   (ENTRY_BINARY_VHSDF): Remove signature argument and use
>   ENTRY_BINARY.
> ---
>  gcc/config/aarch64/aarch64-builtins.cc| 106 ++
>  .../aarch64/aarch64-simd-pragma-builtins.def  |  22 ++--
>  2 files changed, 97 insertions(+), 31 deletions(-)

Thanks for picking this up.  As we discussed off-line, I'll do the
review in the form of a patch rather than as English text, in the
interests of parallelising the remaining work.

The main things were:

- I'd suggested in one of the replies to Vladimir that we add:

tree type () const { return aarch64_simd_builtin_type (mode, qualifiers); }

  to simd_type, to reduce the cut-&-paste.  This patch does that.

- I hadn't realised that my suggestion to use build_function_type_vec
  would require a GCed vector.  Given that it does, I suppose we should
  use build_function_type_array instead.  Sorry for the run-around.

- When setting up the operands during expand, it's probably more robust
  to use the mode of the associated type, since that is tied directly
  to the operand value.  This removes the need to describe every argument
  (including non-vector ones) using simd_type, which will be useful for
  later patches.

- The:

expand_insn (icode, ops.length (), ops.address ());
return ops[0].value;

  sequence is likely to be used by most intrinsics, so I think we should
  do it after the main switch, rather than in each case statement.

Here's what I'd like to commit.  I'll wait until tomorrow in case there
are any comments.

Thanks,
Richard


>From c9bcc73f3b09f072dbc97c8e8c3f46ed0d7c1bfc Mon Sep 17 00:00:00 2001
From: Saurabh Jha 
Date: Tue, 26 Nov 2024 12:18:24 +
Subject: [PATCH] aarch64: Refactor AdvSIMD intrinsics
To: gcc-patches@gcc.gnu.org

Refactor AdvSIMD intrinsics defined using the new pragma-based approach
so that it is more extensible.

Introduce a new struct, simd_type, which defines types using a mode and
qualifiers, and use objects of this struct in the declaration of intrinsics
in the aarch64-simd-pragma-builtins.def file.

Change aarch64_pragma_builtins_data struct to support return type and
argument types.

Refactor aarch64_fntype and aarch64_expand_pragma_builtin so that they
initialise the corresponding vectors in a loop. As we add intrinsics with
more arguments, these functions won't need to change to support those.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc
(ENTRY): Modify to add support of return and argument types.
(struct simd_type): New struct to declare types using mode and
qualifiers.
(struct aarch64_pragma_builtins_data): Replace mode with
the array of types to support return and argument types.
(aarch64_fntype): Modify to handle different signatures.
(aarch64_expand_pragma_builtin): Modify to handle different
signatures.
* config/aarch64/aarch64-simd-pragma-builtins.def
(ENTRY_VHSDF): Rename to ENTRY_BINARY_VHSDF.
(ENTRY_BINARY): New macro to declare binary intrinsics.
(ENTRY_BINARY_VHSDF): Remove signature argument and use
ENTRY_BINARY.

Co-authored-by: Vladimir Miloserdov 
Co-authored-by: Richard Sandiford 
---
 gcc/config/aarch64/aarch64-builtins.cc| 108 ++
 gcc/config/aarch64/aarch64-builtins.h |   2 +-
 .../aarch64/aarch64-simd-pragma-builtins.def  |  22 ++--
 3 files changed, 99 insertions(+), 33 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index b860e22f01f

Re: [PATCH v3 2/8] aarch64: Make C/C++ operations possible on SVE ACLE types.

2024-12-02 Thread Christophe Lyon
On Mon, 2 Dec 2024 at 10:45, Tejas Belagod  wrote:
>
> On 11/30/24 3:30 AM, Christophe Lyon wrote:
> > Hi!
> >
> > On Fri, 29 Nov 2024 at 05:00, Tejas Belagod  wrote:
> >>
> >> This patch changes the TYPE_INDIVISIBLE flag to 0 to enable SVE ACLE types to be
> >> treated as GNU vectors and have the same semantics with operations that are
> >> defined on GNU vectors.
> >>
> >> gcc/ChangeLog:
> >>
> >>  * config/aarch64/aarch64-sve-builtins.cc (register_builtin_types): Flip
> >>  TYPE_INDIVISIBLE flag for SVE ACLE vector types.
> >
> > Sorry I haven't closely followed the discussions around this patch
> > series, but the Linaro postcommit CI reports
> > 1036 regressions after patch 2/8, is that expected?
> > Given that precommit CI detected "only" 22 regressions with all 8
> > patches, I suppose most of the 1036 are fixed later in the series?
>
> Thanks for raising this.
>
> Patch 2/8 enables SVE vectors to behave like GNU vectors (C/C++ operator
> semantics start applying to SVE vectors) which has a lot of fallout in
> FE/ME/BE that the patches 3-8 fix (mostly related to handling VLA vectors).
>
> I'm currently testing a patch to fix the remaining 22 regressions - they
> are mostly testisms for which I wanted to make sure I was doing the
> right thing (which I have indicated in my cover letter).
>
Indeed, but I wasn't expecting regressions within the series, also
fixed within the series (I thought the policy was to avoid such
things, and that each patch is expected not to introduce regressions,
precisely because it's annoying during bisects).
As you have probably noticed the CI has sent other notifications with
your follow-up patches, finding a few more regressions and indeed lots
of improvements.  I'll have a quick check and probably close them as
these results look to be expected from your side.

Regarding the 22 regressions you mention above, I've noticed that
Andrew Pinski has fixed some (all?) of them already (in case you
haven't noticed his patches).
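
For readers unfamiliar with the "GNU vector" semantics referred to in the
quoted text, the following is a small sketch using the GCC/Clang
vector_size extension on a fixed-length type.  It is only an analogy:
the type v4si and the functions add_scaled and demo are made up here,
and nothing below uses SVE types.

```cpp
#include <cassert>

// GNU vector extension (GCC and Clang); not standard C++.
typedef int v4si __attribute__ ((vector_size (16)));

// C/C++ operators apply element-wise to GNU vector types, and a scalar
// operand is broadcast to every element.
v4si add_scaled (v4si a, v4si b)
{
  return a + b * 2;
}

void demo ()
{
  v4si a = {1, 2, 3, 4};
  v4si b = {10, 20, 30, 40};
  v4si r = add_scaled (a, b);
  assert (r[0] == 21 && r[1] == 42 && r[2] == 63 && r[3] == 84);
}
```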

Thanks,

Christophe


> Thanks,
> Tejas.
>
> >
> > Thanks,
> >
> > Christophe
> >
> >> ---
> >>   gcc/config/aarch64/aarch64-sve-builtins.cc | 5 -
> >>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
> >> index 0fec1cd439e..adbadd303d4 100644
> >> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> >> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> >> @@ -4576,6 +4576,9 @@ register_builtin_types ()
> >>vectype = build_truth_vector_type_for_mode (BYTES_PER_SVE_VECTOR,
> >>VNx16BImode);
> >>num_pr = 1;
> >> + /* Leave svbool_t as indivisible for now.  We don't yet support
> >> +C/C++ operators on predicates.  */
> >> + TYPE_INDIVISIBLE_P (vectype) = 1;
> >>  }
> >>else
> >>  {
> >> @@ -4592,12 +4595,12 @@ register_builtin_types ()
> >>&& TYPE_ALIGN (vectype) == 128
> >>&& known_eq (size, BITS_PER_SVE_VECTOR));
> >>num_zr = 1;
> >> + TYPE_INDIVISIBLE_P (vectype) = 0;
> >>  }
> >>vectype = build_distinct_type_copy (vectype);
> >>gcc_assert (vectype == TYPE_MAIN_VARIANT (vectype));
> >>SET_TYPE_STRUCTURAL_EQUALITY (vectype);
> >>TYPE_ARTIFICIAL (vectype) = 1;
> >> - TYPE_INDIVISIBLE_P (vectype) = 1;
> >>make_type_sizeless (vectype);
> >>  }
> >> if (num_pr)
> >> --
> >> 2.25.1
> >>
>


Re: [PATCH v3 2/8] aarch64: Make C/C++ operations possible on SVE ACLE types.

2024-12-02 Thread Tejas Belagod

On 12/2/24 3:20 PM, Andrew Pinski wrote:
> On Mon, Dec 2, 2024 at 1:47 AM Tejas Belagod  wrote:
>>
>> On 11/30/24 3:30 AM, Christophe Lyon wrote:
>>> Hi!
>>>
>>> On Fri, 29 Nov 2024 at 05:00, Tejas Belagod  wrote:
>>>>
>>>> This patch changes the TYPE_INDIVISIBLE flag to 0 to enable SVE ACLE types to be
>>>> treated as GNU vectors and have the same semantics with operations that are
>>>> defined on GNU vectors.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>   * config/aarch64/aarch64-sve-builtins.cc (register_builtin_types): Flip
>>>>   TYPE_INDIVISIBLE flag for SVE ACLE vector types.
>>>
>>> Sorry I haven't closely followed the discussions around this patch
>>> series, but the Linaro postcommit CI reports
>>> 1036 regressions after patch 2/8, is that expected?
>>> Given that precommit CI detected "only" 22 regressions with all 8
>>> patches, I suppose most of the 1036 are fixed later in the series?
>>
>> Thanks for raising this.
>>
>> Patch 2/8 enables SVE vectors to behave like GNU vectors (C/C++ operator
>> semantics start applying to SVE vectors) which has a lot of fallout in
>> FE/ME/BE that the patches 3-8 fix (mostly related to handling VLA vectors).
>>
>> I'm currently testing a patch to fix the remaining 22 regressions - they
>> are mostly testisms for which I wanted to make sure I was doing the
>> right thing (which I have indicated in my cover letter).
>
> I think I fixed all of the testcases over the weekend.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670492.html
> https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670493.html
> https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670494.html
> https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670495.html
> https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670496.html
> https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670497.html
>

Ah, sorry, missed that - was still catching up with stuff. Thanks for
the fixes - much appreciated.

Thanks,
Tejas.

> Thanks,
> Andrew
>
>>
>> Thanks,
>> Tejas.
>>
>>>
>>> Thanks,
>>>
>>> Christophe
>>>
>>>> ---
>>>>   gcc/config/aarch64/aarch64-sve-builtins.cc | 5 -
>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>>> index 0fec1cd439e..adbadd303d4 100644
>>>> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
>>>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>>> @@ -4576,6 +4576,9 @@ register_builtin_types ()
>>>> vectype = build_truth_vector_type_for_mode (BYTES_PER_SVE_VECTOR,
>>>> VNx16BImode);
>>>> num_pr = 1;
>>>> + /* Leave svbool_t as indivisible for now.  We don't yet support
>>>> +C/C++ operators on predicates.  */
>>>> + TYPE_INDIVISIBLE_P (vectype) = 1;
>>>>   }
>>>> else
>>>>   {
>>>> @@ -4592,12 +4595,12 @@ register_builtin_types ()
>>>> && TYPE_ALIGN (vectype) == 128
>>>> && known_eq (size, BITS_PER_SVE_VECTOR));
>>>> num_zr = 1;
>>>> + TYPE_INDIVISIBLE_P (vectype) = 0;
>>>>   }
>>>> vectype = build_distinct_type_copy (vectype);
>>>> gcc_assert (vectype == TYPE_MAIN_VARIANT (vectype));
>>>> SET_TYPE_STRUCTURAL_EQUALITY (vectype);
>>>> TYPE_ARTIFICIAL (vectype) = 1;
>>>> - TYPE_INDIVISIBLE_P (vectype) = 1;
>>>> make_type_sizeless (vectype);
>>>>   }
>>>>  if (num_pr)
>>>> --
>>>> 2.25.1







Re: [PATCH v2 2/2] aarch64: Add support for AdvSIMD lut

2024-12-02 Thread Richard Sandiford
 writes:
> The AArch64 FEAT_LUT extension is optional from Armv9.2-a and mandatory
> from Armv9.5-a. It introduces instructions for lookup table reads with
> bit indices.
>
> This patch adds support for AdvSIMD lut intrinsics. The intrinsics for
> this extension are implemented as the following builtin functions:
> * vluti2{q}_lane{q}_{u8|s8|p8}
> * vluti2{q}_lane{q}_{u16|s16|p16|f16|bf16}
> * vluti4q_lane{q}_{u8|s8|p8}
> * vluti4q_lane{q}_{u16|s16|p16|f16|bf16}_x2
>
> We also introduced a new approach to do lane checks for AdvSIMD.

Following on from the reply to part 1, my main comments here are:

- It seems more accurate to classify these intrinsics as binary rather
  than ternary, since they operate on only two data inputs.  The intrinsics
  have three arguments, but the lane argument acts as an index into the
  second argument rather than as independent data.  The patch below
  therefore calls them "binary_lane".

- Similarly, it might be better to add the lane argument separately,
  rather than encode it as a simd_type.  The spec says that the argument
  should have type "int", so we should use integer_type_node rather than
  any other signed 32-bit integer type.  "s32" should instead normally
  correspond to "int32_t", which can be "signed long" for some ILP32 targets.

- We can reuse the SVE error-reporting functions to report nonconstant
  or out-of-range arguments.  One advantage of this is that we give the
  numerical value being passed (which might not be obvious if the value
  is a computed expression instead of a literal).  It also means that
  we say "...the value 0" rather than "...range 0 - 0".

- However, doing that requires the checking routine to have access to
  the fndecl and the argument number.  Since there's now quite a bit of
  information to carry around, it seemed better to wrap the checking in
  a class that stores the information as member variables.

- Very minor, but: it seemed better to pass a reference rather than
  pointer to the builtin data, given that the caller has proven that
  the pointer is nonnull.

- There were no tests for nonconstant lane arguments.

- The md patterns should require TARGET_LUT rather than TARGET_SIMD.

- LUTI2 only cares about the first 4 elements of the first argument.
  The q forms that take 128-bit inputs are provided as a convenience,
  but only the low 64 bits actually matter.  (Only the low 32 bits
  matter for 8-bit data.)  We can therefore truncate the 128-bit
  vectors to 64 bits before expanding the instruction.

  I originally did this as an alternative way of handling the next
  point.  Although it's no longer needed for that, it seems worth
  keeping anyway, since it should in theory allow some optimisation.
  E.g. it should allow us to get rid of unnecessary 64-bit-to-128-bit
  extensions.  It also cuts down on the number of patterns.

- The md patterns included some invalid LUTI4 combinations.
  There are multiple ways of handling that, but one easy one is to add
  an extra operand that specifies the quantisation size (2 or 4 bits).
  The C++ condition can then check that the mode is appropriate.  This
  is also the representation used for SME LUT instructions.

  (With the change mentioned in the previous point, the mode of the
  first input uniquely determines the choice between LUTI2 and LUTI4,
  but that's somewhat more complex to model.)
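
As a rough illustration of the error-reporting point above, this sketch
mimics only the wording choice made by report_out_of_range: a single
permitted value is reported as "the value N", anything else as a range.
The function name out_of_range_message is made up, plain quotes stand in
for GCC's %qE quoting, and the real function calls error_at rather than
returning a string.

```cpp
#include <cassert>
#include <string>

// Standalone imitation of the message wording only.  ARGNO counts from
// zero, as in the SVE helpers, so the message prints ARGNO + 1.
std::string
out_of_range_message (const std::string &fn, unsigned argno,
                      long long actual, long long min, long long max)
{
  assert (min <= max);
  std::string msg = "passing " + std::to_string (actual) + " to argument "
    + std::to_string (argno + 1) + " of '" + fn + "', which expects ";
  if (min == max)
    // Degenerate range: name the single permitted value.
    msg += "the value " + std::to_string (min);
  else
    msg += "a value in the range [" + std::to_string (min) + ", "
      + std::to_string (max) + "]";
  return msg;
}
```

Because the actual value is part of the message, a lane index computed
from a constant expression is reported as the number it evaluates to.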

Here's what I'd like to commit.  Bootstrapped & regression-tested on
aarch64-linux-gnu.  I'll wait until tomorrow in case there are any comments.

Thanks,
Richard


>From 69917f56e5c1ffe71e05f9ec5a3f47713bd57df8 Mon Sep 17 00:00:00 2001
From: Saurabh Jha 
Date: Sat, 30 Nov 2024 17:51:05 +
Subject: [PATCH] aarch64: Add support for AdvSIMD lut
To: gcc-patches@gcc.gnu.org

The AArch64 FEAT_LUT extension is optional from Armv9.2-A and mandatory
from Armv9.5-A. It introduces instructions for lookup table reads with
bit indices.

This patch adds support for AdvSIMD lut intrinsics. The intrinsics for
this extension are implemented as the following builtin functions:
* vluti2{q}_lane{q}_{u8|s8|p8}
* vluti2{q}_lane{q}_{u16|s16|p16|f16|bf16}
* vluti4q_lane{q}_{u8|s8|p8}
* vluti4q_lane{q}_{u16|s16|p16|f16|bf16}_x2

We also introduced a new approach to do lane checks for AdvSIMD.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc
(aarch64_builtin_signatures): Add binary_lane.
(aarch64_fntype): Handle it.
(simd_types): Add 16-bit x2 types.
(aarch64_pragma_builtins_checker): New class.
(aarch64_general_check_builtin_call): Use it.
(aarch64_expand_pragma_builtin): Add support for lut unspecs.
* config/aarch64/aarch64-option-extensions.def
(AARCH64_OPT_EXTENSION): Add lut option.
* config/aarch64/aarch64-simd-pragma-builtins.def
(ENTRY_BINARY_LANE): Modify to use new ENTRY macro.
(ENTRY_TERNARY_VLUT8): Macro to declare lut intrinsics.
(ENTRY_TERNARY_VLUT16): Macro to declare lut intrinsics.
  

[PATCH] aarch64: Add flags field to aarch64-simd-pragma-builtins.def

2024-12-02 Thread Richard Sandiford
This patch adds a flags field to aarch64-simd-pragma-builtins.def
and uses it to add attributes to the function declaration.

Bootstrapped & regression-tested on aarch64-linux-gnu.  I'll commit
tomorrow if there are no comments before then.
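
For anyone new to the .def scheme, the following is a minimal sketch of
the X-macro pattern with an extra trailing flags argument.  Everything in
it is hypothetical (DEMO_ENTRIES, demo_builtin, demo_data, FLAG_FP, the
entry names); it uses an in-file list macro where the real code includes
a .def file, and only the FLAG_##F pasting mirrors the patch.

```cpp
// Hypothetical entry list standing in for a .def file.
#define DEMO_ENTRIES \
  ENTRY (vfoo, 2, NONE) \
  ENTRY (vbar, 3, FP)

constexpr unsigned FLAG_NONE = 0;
constexpr unsigned FLAG_FP = 1;

// First expansion: one enumerator per entry.  The new flags argument
// must be accepted here even though it is unused.
#define ENTRY(N, NARGS, F) DEMO_##N,
enum demo_builtin { DEMO_ENTRIES DEMO_COUNT };
#undef ENTRY

struct demo_data
{
  const char *name;
  int nargs;
  unsigned flags;
};

// Second expansion: one table row per entry; FLAG_##F turns the short
// flag name into the corresponding constant.
#define ENTRY(N, NARGS, F) {#N, NARGS, FLAG_##F},
constexpr demo_data demo_table[] = { DEMO_ENTRIES };
#undef ENTRY
```

Adding a field this way means every definition of ENTRY gains a
parameter, which is why the diff touches each ENTRY macro.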

Richard


gcc/
* config/aarch64/aarch64-simd-pragma-builtins.def: Add a flags
field to each entry.
* config/aarch64/aarch64-builtins.cc: Update includes accordingly.
(aarch64_pragma_builtins_data): Add a flags field.
(aarch64_init_pragma_builtins): Use the flags field to add attributes
to the function declaration.
---
 gcc/config/aarch64/aarch64-builtins.cc| 12 ++--
 .../aarch64/aarch64-simd-pragma-builtins.def  | 59 +++
 2 files changed, 42 insertions(+), 29 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index 9f578a77888..d58065cdd58 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -782,7 +782,7 @@ typedef struct
   AARCH64_SIMD_BUILTIN_##T##_##N##A,
 
 #undef ENTRY
-#define ENTRY(N, S, T0, T1, T2, U) \
+#define ENTRY(N, S, T0, T1, T2, U, F)  \
   AARCH64_##N,
 
 enum aarch64_builtins
@@ -1647,9 +1647,10 @@ namespace simd_types {
 }
 
 #undef ENTRY
-#define ENTRY(N, S, T0, T1, T2, U) \
+#define ENTRY(N, S, T0, T1, T2, U, F) \
   {#N, aarch64_builtin_signatures::S, simd_types::T0, simd_types::T1, \
-   simd_types::T2, U, aarch64_required_extensions::REQUIRED_EXTENSIONS},
+   simd_types::T2, U, aarch64_required_extensions::REQUIRED_EXTENSIONS, \
+   FLAG_##F},
 
 /* Initialize pragma builtins.  */
 
@@ -1660,6 +1661,7 @@ struct aarch64_pragma_builtins_data
   simd_type types[3];
   int unspec;
   aarch64_required_extensions required_extensions;
+  unsigned int flags;
 };
 
 static aarch64_pragma_builtins_data aarch64_pragma_builtins[] = {
@@ -1704,8 +1706,10 @@ aarch64_init_pragma_builtins ()
   auto data = aarch64_pragma_builtins[i];
   auto fntype = aarch64_fntype (data);
   auto code = AARCH64_PRAGMA_BUILTIN_START + i + 1;
+  auto flag_mode = data.types[0].mode;
+  auto attrs = aarch64_get_attributes (data.flags, flag_mode);
   aarch64_builtin_decls[code]
-   = aarch64_general_simulate_builtin (data.name, fntype, code);
+   = aarch64_general_simulate_builtin (data.name, fntype, code, attrs);
 }
 }
 
diff --git a/gcc/config/aarch64/aarch64-simd-pragma-builtins.def b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def
index db40745e9e3..ae8732bdb31 100644
--- a/gcc/config/aarch64/aarch64-simd-pragma-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def
@@ -19,46 +19,55 @@
.  */
 
 #undef ENTRY_BINARY
-#define ENTRY_BINARY(N, T0, T1, T2, U) \
-  ENTRY (N, binary, T0, T1, T2, U)
+#define ENTRY_BINARY(N, T0, T1, T2, U, F)  \
+  ENTRY (N, binary, T0, T1, T2, U, F)
 
 #undef ENTRY_BINARY_LANE
-#define ENTRY_BINARY_LANE(N, T0, T1, T2, U)\
-  ENTRY (N, binary_lane, T0, T1, T2, U)
+#define ENTRY_BINARY_LANE(N, T0, T1, T2, U, F) \
+  ENTRY (N, binary_lane, T0, T1, T2, U, F)
 
 #undef ENTRY_BINARY_VHSDF
-#define ENTRY_BINARY_VHSDF(NAME, UNSPEC)  \
-  ENTRY_BINARY (NAME##_f16, f16, f16, f16, UNSPEC) \
-  ENTRY_BINARY (NAME##q_f16, f16q, f16q, f16q, UNSPEC) \
-  ENTRY_BINARY (NAME##_f32, f32, f32, f32, UNSPEC) \
-  ENTRY_BINARY (NAME##q_f32, f32q, f32q, f32q, UNSPEC) \
-  ENTRY_BINARY (NAME##q_f64, f64q, f64q, f64q, UNSPEC)
+#define ENTRY_BINARY_VHSDF(NAME, UNSPEC, FLAGS)\
+  ENTRY_BINARY (NAME##_f16, f16, f16, f16, UNSPEC, FLAGS)  \
+  ENTRY_BINARY (NAME##q_f16, f16q, f16q, f16q, UNSPEC, FLAGS)  \
+  ENTRY_BINARY (NAME##_f32, f32, f32, f32, UNSPEC, FLAGS)  \
+  ENTRY_BINARY (NAME##q_f32, f32q, f32q, f32q, UNSPEC, FLAGS)  \
+  ENTRY_BINARY (NAME##q_f64, f64q, f64q, f64q, UNSPEC, FLAGS)
 
 #undef ENTRY_TERNARY_VLUT8
-#define ENTRY_TERNARY_VLUT8(T) \
-  ENTRY_BINARY_LANE (vluti2_lane_##T##8, T##8q, T##8, u8, UNSPEC_LUTI2) \
-  ENTRY_BINARY_LANE (vluti2_laneq_##T##8, T##8q, T##8, u8q, UNSPEC_LUTI2) \
-  ENTRY_BINARY_LANE (vluti2q_lane_##T##8, T##8q, T##8q, u8, UNSPEC_LUTI2) \
-  ENTRY_BINARY_LANE (vluti2q_laneq_##T##8, T##8q, T##8q, u8q, UNSPEC_LUTI2) \
-  ENTRY_BINARY_LANE (vluti4q_lane_##T##8, T##8q, T##8q, u8, UNSPEC_LUTI4) \
-  ENTRY_BINARY_LANE (vluti4q_laneq_##T##8, T##8q, T##8q, u8q, UNSPEC_LUTI4)
+#define ENTRY_TERNARY_VLUT8(T) \
+  ENTRY_BINARY_LANE (vluti2_lane_##T##8, T##8q, T##8, u8,  \
+UNSPEC_LUTI2, NONE)\
+  ENTRY_BINARY_LANE (vluti2_laneq_##T##8, T##8q, T##8, u8q,\
+UNSPEC_LUTI2, NONE)\
+  ENTRY_BINARY_LANE (vluti2q_lane_##T##8, T##8q, T##8q, u8,\
+UNSPEC_LUTI2, NONE)\
+  ENTRY_BINARY_LANE (vlu

[PATCH] riscv: Avoid narrowing warning

2024-12-02 Thread Andreas Schwab
* config/riscv/riscv.cc (fli_value_hf, fli_value_sf)
(fli_value_df): Use integer constants.  Constify.
(riscv_float_const_rtx_index_for_fli): Add const.
---
 gcc/config/riscv/riscv.cc | 64 ---
 1 file changed, 39 insertions(+), 25 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7a1724d6e73..0a6c00926b3 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1637,35 +1637,49 @@ static int riscv_symbol_insns (enum riscv_symbol_type type)
Manual draft. For details, please see:
https://github.com/riscv/riscv-isa-manual/releases/tag/isa-449cd0c  */
 
-static unsigned HOST_WIDE_INT fli_value_hf[32] =
-{
-  0xbcp8, 0x4p8, 0x1p8, 0x2p8, 0x1cp8, 0x20p8, 0x2cp8, 0x30p8,
-  0x34p8, 0x35p8, 0x36p8, 0x37p8, 0x38p8, 0x39p8, 0x3ap8, 0x3bp8,
-  0x3cp8, 0x3dp8, 0x3ep8, 0x3fp8, 0x40p8, 0x41p8, 0x42p8, 0x44p8,
-  0x48p8, 0x4cp8, 0x58p8, 0x5cp8, 0x78p8,
+static const unsigned HOST_WIDE_INT fli_value_hf[32] =
+{
+#define P8(v) ((unsigned HOST_WIDE_INT) (v) << 8)
+  P8(0xbc), P8(0x4), P8(0x1), P8(0x2),
+  P8(0x1c), P8(0x20), P8(0x2c), P8(0x30),
+  P8(0x34), P8(0x35), P8(0x36), P8(0x37),
+  P8(0x38), P8(0x39), P8(0x3a), P8(0x3b),
+  P8(0x3c), P8(0x3d), P8(0x3e), P8(0x3f),
+  P8(0x40), P8(0x41), P8(0x42), P8(0x44),
+  P8(0x48), P8(0x4c), P8(0x58), P8(0x5c),
+  P8(0x78),
   /* Only used for filling, ensuring that 29 and 30 of HF are the same.  */
-  0x78p8,
-  0x7cp8, 0x7ep8
+  P8(0x78),
+  P8(0x7c), P8(0x7e)
+#undef P8
 };
 
-static unsigned HOST_WIDE_INT fli_value_sf[32] =
-{
-  0xbf8p20, 0x008p20, 0x378p20, 0x380p20, 0x3b8p20, 0x3c0p20, 0x3d8p20, 0x3e0p20,
-  0x3e8p20, 0x3eap20, 0x3ecp20, 0x3eep20, 0x3f0p20, 0x3f2p20, 0x3f4p20, 0x3f6p20,
-  0x3f8p20, 0x3fap20, 0x3fcp20, 0x3fep20, 0x400p20, 0x402p20, 0x404p20, 0x408p20,
-  0x410p20, 0x418p20, 0x430p20, 0x438p20, 0x470p20, 0x478p20, 0x7f8p20, 0x7fcp20
+static const unsigned HOST_WIDE_INT fli_value_sf[32] =
+{
+#define P20(v) ((unsigned HOST_WIDE_INT) (v) << 20)
+  P20(0xbf8), P20(0x008), P20(0x378), P20(0x380),
+  P20(0x3b8), P20(0x3c0), P20(0x3d8), P20(0x3e0),
+  P20(0x3e8), P20(0x3ea), P20(0x3ec), P20(0x3ee),
+  P20(0x3f0), P20(0x3f2), P20(0x3f4), P20(0x3f6),
+  P20(0x3f8), P20(0x3fa), P20(0x3fc), P20(0x3fe),
+  P20(0x400), P20(0x402), P20(0x404), P20(0x408),
+  P20(0x410), P20(0x418), P20(0x430), P20(0x438),
+  P20(0x470), P20(0x478), P20(0x7f8), P20(0x7fc)
+#undef P20
 };
 
-static unsigned HOST_WIDE_INT fli_value_df[32] =
-{
-  0xbff0p48, 0x10p48, 0x3ef0p48, 0x3f00p48,
-  0x3f70p48, 0x3f80p48, 0x3fb0p48, 0x3fc0p48,
-  0x3fd0p48, 0x3fd4p48, 0x3fd8p48, 0x3fdcp48,
-  0x3fe0p48, 0x3fe4p48, 0x3fe8p48, 0x3fecp48,
-  0x3ff0p48, 0x3ff4p48, 0x3ff8p48, 0x3ffcp48,
-  0x4000p48, 0x4004p48, 0x4008p48, 0x4010p48,
-  0x4020p48, 0x4030p48, 0x4060p48, 0x4070p48,
-  0x40e0p48, 0x40f0p48, 0x7ff0p48, 0x7ff8p48
+static const unsigned HOST_WIDE_INT fli_value_df[32] =
+{
+#define P48(v) ((unsigned HOST_WIDE_INT) (v) << 48)
+  P48(0xbff0), P48(0x10), P48(0x3ef0), P48(0x3f00),
+  P48(0x3f70), P48(0x3f80), P48(0x3fb0), P48(0x3fc0),
+  P48(0x3fd0), P48(0x3fd4), P48(0x3fd8), P48(0x3fdc),
+  P48(0x3fe0), P48(0x3fe4), P48(0x3fe8), P48(0x3fec),
+  P48(0x3ff0), P48(0x3ff4), P48(0x3ff8), P48(0x3ffc),
+  P48(0x4000), P48(0x4004), P48(0x4008), P48(0x4010),
+  P48(0x4020), P48(0x4030), P48(0x4060), P48(0x4070),
+  P48(0x40e0), P48(0x40f0), P48(0x7ff0), P48(0x7ff8)
+#undef P48
 };
 
 /* Display floating-point values at the assembly level, which is consistent
@@ -1686,7 +1700,7 @@ const char *fli_value_print[32] =
 int
 riscv_float_const_rtx_index_for_fli (rtx x)
 {
-  unsigned HOST_WIDE_INT *fli_value_array;
+  const unsigned HOST_WIDE_INT *fli_value_array;
 
   machine_mode mode = GET_MODE (x);
 
-- 
2.47.1
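
As background on why the tables change form, here is a small sketch
(assuming a C++17 compiler, where hexadecimal floating-point literals
are standard).  A literal such as 0xbcp8 is a double with value
0xbc * 2^8, so using it to brace-initialise an integer array element is
a narrowing conversion; an integer shift gives the same value without
leaving the integer domain.  The helper p8 and the array name
fli_hf_first are made up, and std::uint64_t stands in for
unsigned HOST_WIDE_INT.

```cpp
#include <cstdint>

// 0xbcp8 is a hexadecimal floating-point literal: 0xbc * 2^8, type double.
static_assert (0xbcp8 == 48128.0, "hex-float literal has a double value");

// Hypothetical helper mirroring the patch's P8 macro.
constexpr std::uint64_t p8 (std::uint64_t v) { return v << 8; }

static_assert (p8 (0xbc) == 0xbc00, "same value, but an integer");

// Brace-initialising an integer from a double is a narrowing conversion:
//   constexpr std::uint64_t bad[] = { 0xbcp8 };
// The shifted form avoids it:
constexpr std::uint64_t fli_hf_first[] = { p8 (0xbc), p8 (0x4) };
```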


-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PATCH] aarch64: Move some diagnostic functions to aarch64.cc

2024-12-02 Thread Richard Sandiford
Some of the diagnostics reported for SVE builtins would also be
useful for Advanced SIMD builtins, so this patch moves them from
aarch64-sve-builtins.cc to aarch64.cc.  I put them in a new aarch64
namespace for now -- perhaps in future they should be generic.

Bootstrapped & regression-tested on aarch64-linux-gnu.

This is a prerequisite for the comments I have about the LUTI support
(which was originally posted in stage 1).  I'll commit tomorrow if there
are no comments before then.

Richard


gcc/
* config/aarch64/aarch64-sve-builtins.cc (report_non_ice)
(report_out_of_range, report_neither_nor, report_not_one_of)
(report_not_enum): Move to...
* config/aarch64/aarch64.cc: ...here, putting them in the aarch64
namespace, and...
* config/aarch64/aarch64-protos.h: ...declare them here.
---
 gcc/config/aarch64/aarch64-protos.h| 12 
 gcc/config/aarch64/aarch64-sve-builtins.cc | 65 +-
 gcc/config/aarch64/aarch64.cc  | 64 +
 3 files changed, 78 insertions(+), 63 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index c6ce62190bc..cad6e0b0a6f 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1119,6 +1119,18 @@ bool aarch64_general_check_builtin_call (location_t, 
vec,
 unsigned int, tree, unsigned int,
 tree *);
 
+namespace aarch64 {
+  void report_non_ice (location_t, tree, unsigned int);
+  void report_out_of_range (location_t, tree, unsigned int, HOST_WIDE_INT,
+   HOST_WIDE_INT, HOST_WIDE_INT);
+  void report_neither_nor (location_t, tree, unsigned int, HOST_WIDE_INT,
+  HOST_WIDE_INT, HOST_WIDE_INT);
+  void report_not_one_of (location_t, tree, unsigned int, HOST_WIDE_INT,
+ HOST_WIDE_INT, HOST_WIDE_INT, HOST_WIDE_INT,
+ HOST_WIDE_INT);
+  void report_not_enum (location_t, tree, unsigned int, HOST_WIDE_INT, tree);
+}
+
 namespace aarch64_sve {
   void init_builtins ();
   void handle_arm_sve_h (bool);
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 79dc81fcbb7..8e94a2d2cfe 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -55,6 +55,8 @@
 #include "aarch64-sve-builtins-shapes.h"
 #include "aarch64-builtins.h"
 
+using namespace aarch64;
+
 namespace aarch64_sve {
 
 /* Static information about each single-predicate or single-vector
@@ -1150,69 +1152,6 @@ lookup_fndecl (tree fndecl)
 }
 
 
-/* Report that LOCATION has a call to FNDECL in which argument ARGNO
-   was not an integer constant expression.  ARGNO counts from zero.  */
-static void
-report_non_ice (location_t location, tree fndecl, unsigned int argno)
-{
-  error_at (location, "argument %d of %qE must be an integer constant"
-   " expression", argno + 1, fndecl);
-}
-
-/* Report that LOCATION has a call to FNDECL in which argument ARGNO has
-   the value ACTUAL, whereas the function requires a value in the range
-   [MIN, MAX].  ARGNO counts from zero.  */
-static void
-report_out_of_range (location_t location, tree fndecl, unsigned int argno,
-HOST_WIDE_INT actual, HOST_WIDE_INT min,
-HOST_WIDE_INT max)
-{
-  if (min == max)
-error_at (location, "passing %wd to argument %d of %qE, which expects"
- " the value %wd", actual, argno + 1, fndecl, min);
-  else
-error_at (location, "passing %wd to argument %d of %qE, which expects"
- " a value in the range [%wd, %wd]", actual, argno + 1, fndecl,
- min, max);
-}
-
-/* Report that LOCATION has a call to FNDECL in which argument ARGNO has
-   the value ACTUAL, whereas the function requires either VALUE0 or
-   VALUE1.  ARGNO counts from zero.  */
-static void
-report_neither_nor (location_t location, tree fndecl, unsigned int argno,
-   HOST_WIDE_INT actual, HOST_WIDE_INT value0,
-   HOST_WIDE_INT value1)
-{
-  error_at (location, "passing %wd to argument %d of %qE, which expects"
-   " either %wd or %wd", actual, argno + 1, fndecl, value0, value1);
-}
-
-/* Report that LOCATION has a call to FNDECL in which argument ARGNO has
-   the value ACTUAL, whereas the function requires one of VALUE0..3.
-   ARGNO counts from zero.  */
-static void
-report_not_one_of (location_t location, tree fndecl, unsigned int argno,
-  HOST_WIDE_INT actual, HOST_WIDE_INT value0,
-  HOST_WIDE_INT value1, HOST_WIDE_INT value2,
-  HOST_WIDE_INT value3)
-{
-  error_at (location, "passing %wd to argument %d of %qE, which expects"
-   " %wd, %wd, %wd or %wd", actual, argno + 1, fndecl, value0, value1,
-   value2, value3);
-}
-
-/* Report that LOCA

[PATCH] aarch64: Split out aarch64_v64_mode

2024-12-02 Thread Richard Sandiford
We had a function called aarch64_vq_mode, where "vq" stood for "vector
quadword".  It was used by aarch64_simd_container_mode (from which it
originated) and in preparation for various SVE ...Q instructions.

It's useful for follow-on patches if we also split out the handling
of 64-bit modes from aarch64_simd_container_mode.  Keeping to the
same naming scheme would replace "q" with "d", but that has
unfortunate connotations, and doesn't AFAIK correspond to any
actual SVE mnemonics.

This patch therefore splits the handling out into a function called
aarch64_v64_mode and renames aarch64_vq_mode to aarch64_v128_mode for
consistency.  I didn't rename the "vq" local variables, since I think
those names make sense in context.
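
To make the size-based naming concrete, here is a toy model (C++17) that
tracks lane counts only.  It is not the GCC interface: the real functions
map a scalar_mode to an opt_machine_mode, whereas lanes, v64_lanes and
v128_lanes below are hypothetical helpers that return std::optional in
roughly the way an opt_machine_mode may be empty.

```cpp
#include <optional>

// Hypothetical helper: number of lanes in a vector of VECTOR_BITS bits
// with ELT_BITS-bit elements, or nullopt if the element does not fit.
constexpr std::optional<unsigned>
lanes (unsigned vector_bits, unsigned elt_bits)
{
  if (elt_bits == 0 || elt_bits > vector_bits
      || vector_bits % elt_bits != 0)
    return std::nullopt;
  return vector_bits / elt_bits;
}

// The two entry points differ only in the container size they name.
constexpr std::optional<unsigned>
v64_lanes (unsigned elt_bits)
{
  return lanes (64, elt_bits);
}

constexpr std::optional<unsigned>
v128_lanes (unsigned elt_bits)
{
  return lanes (128, elt_bits);
}
```

With this model a 16-bit element gives 4 lanes in the 64-bit container
and 8 in the 128-bit one; calling .value () on an empty result plays the
role that .require () plays in the patch.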

Bootstrapped & regression-tested on aarch64-linux-gnu.

This is a prerequisite for the comments I have about the LUTI support
(which was originally posted in stage 1).  I'll commit tomorrow if there
are no comments before then.

Richard


gcc/
* config/aarch64/aarch64-protos.h (aarch64_v64_mode): Declare.
(aarch64_vq_mode): Rename to...
(aarch64_v128_mode): ...this.
* config/aarch64/aarch64.cc (aarch64_v64_mode): New function,
split out from...
(aarch64_simd_container_mode): ...here.
(aarch64_vq_mode): Rename to...
(aarch64_v128_mode): ...this and update callers.
* config/aarch64/aarch64-sve-builtins-base.cc: Likewise update calls.
---
 gcc/config/aarch64/aarch64-protos.h   |  3 +-
 .../aarch64/aarch64-sve-builtins-base.cc  |  8 +--
 gcc/config/aarch64/aarch64.cc | 52 +++
 3 files changed, 36 insertions(+), 27 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index cad6e0b0a6f..8644d29a0a6 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -853,7 +853,8 @@ bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
 int aarch64_branch_cost (bool, bool);
 enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
 bool aarch64_advsimd_struct_mode_p (machine_mode mode);
-opt_machine_mode aarch64_vq_mode (scalar_mode);
+opt_machine_mode aarch64_v64_mode (scalar_mode);
+opt_machine_mode aarch64_v128_mode (scalar_mode);
 opt_machine_mode aarch64_full_sve_mode (scalar_mode);
 bool aarch64_can_const_movi_rtx_p (rtx x, machine_mode mode);
 bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index b97941932ab..13e020b5345 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -996,7 +996,7 @@ private:
 tree lhs_type = TREE_TYPE (lhs);
 tree elt_type = TREE_TYPE (lhs_type);
 scalar_mode elt_mode = SCALAR_TYPE_MODE (elt_type);
-machine_mode vq_mode = aarch64_vq_mode (elt_mode).require ();
+machine_mode vq_mode = aarch64_v128_mode (elt_mode).require ();
 tree vq_type = build_vector_type_for_mode (elt_type, vq_mode);
 
 unsigned nargs = gimple_call_num_args (f.call);
@@ -1067,7 +1067,7 @@ public:
 
 /* Get the 128-bit Advanced SIMD vector for this data size.  */
 scalar_mode element_mode = GET_MODE_INNER (mode);
-machine_mode vq_mode = aarch64_vq_mode (element_mode).require ();
+machine_mode vq_mode = aarch64_v128_mode (element_mode).require ();
 gcc_assert (known_eq (elements_per_vq, GET_MODE_NUNITS (vq_mode)));
 
 /* Put the arguments into a 128-bit Advanced SIMD vector.  We want
@@ -1651,7 +1651,7 @@ public:
   machine_mode
   memory_vector_mode (const function_instance &fi) const override
   {
-return aarch64_vq_mode (GET_MODE_INNER (fi.vector_mode (0))).require ();
+return aarch64_v128_mode (GET_MODE_INNER (fi.vector_mode (0))).require ();
   }
 
   rtx
@@ -1685,7 +1685,7 @@ public:
tree eltype = TREE_TYPE (lhs_type);
 
scalar_mode elmode = GET_MODE_INNER (TYPE_MODE (lhs_type));
-   machine_mode vq_mode = aarch64_vq_mode (elmode).require ();
+   machine_mode vq_mode = aarch64_v128_mode (elmode).require ();
tree vectype = build_vector_type_for_mode (eltype, vq_mode);
 
tree elt_ptr_type
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index cc401befde4..43238aefef2 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -5737,7 +5737,7 @@ aarch64_expand_sve_const_vector (rtx target, rtx src)
 targets, the layout of the 128-bit vector in an Advanced SIMD
 register would be different from its layout in an SVE register,
 but this 128-bit vector is a memory value only.  */
-  machine_mode vq_mode = aarch64_vq_mode (elt_mode).require ();
+  machine_mode vq_mode = aarch64_v128_mode (elt_mode).require ();
   rtx vq_value = simplify_gen_subreg (vq_mode, src, mode, 0);
   if (vq_value && aarch64_expan

[PATCH] aarch64: Put iterators into the right section

2024-12-02 Thread Richard Sandiford
Saurabh picked up Vladimir's work to add the FEAT_LUT extension,
which was originally posted during stage 1.  In the interests of
parallelising the remaining work, and after checking with Saurabh
off-line, I'm doing the review in the form of a patch rather than
as English text.

This patch is split out from:

  https://gcc.gnu.org/pipermail/gcc-patches/2024-November/670061.html

and incorporates some other changes along the same lines.

Bootstrapped & regression-tested on aarch64-linux-gnu.  I'll commit
tomorrow if there are no comments before then.

Richard


From feeb88b1e80a353965c2b9b3cbc703ca354ea6f0 Mon Sep 17 00:00:00 2001
From: Saurabh Jha 
Date: Sat, 30 Nov 2024 23:20:41 +
Subject: [PATCH] aarch64: Put iterators into the right section
To: gcc-patches@gcc.gnu.org

iterators.md is grouped by iterator type and by attribute type.
This patch just moves some stuff that was in the "wrong" section.

gcc/
* config/aarch64/iterators.md: Reorder some declarations,
putting them under the associated heading comment.

Co-authored-by: Vladimir Miloserdov 
Co-authored-by: Richard Sandiford 
---
 gcc/config/aarch64/iterators.md | 103 
 1 file changed, 50 insertions(+), 53 deletions(-)

diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 4786b0210e7..720d79db8e4 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -3165,6 +3165,10 @@ (define_int_iterator CLAST [UNSPEC_CLASTA UNSPEC_CLASTB])
 
 (define_int_iterator LAST [UNSPEC_LASTA UNSPEC_LASTB])
 
+;; Iterators for fp8 operations
+
+(define_int_iterator FAMINMAX_UNS [UNSPEC_FAMAX UNSPEC_FAMIN])
+
 (define_int_iterator SVE_INT_UNARY [UNSPEC_REVB
UNSPEC_REVH UNSPEC_REVW])
 
@@ -3738,20 +3742,44 @@ (define_int_iterator ATOMIC_LDOP
  [UNSPECV_ATOMIC_LDOP_OR UNSPECV_ATOMIC_LDOP_BIC
   UNSPECV_ATOMIC_LDOP_XOR UNSPECV_ATOMIC_LDOP_PLUS])
 
-(define_int_attr atomic_ldop
- [(UNSPECV_ATOMIC_LDOP_OR "set") (UNSPECV_ATOMIC_LDOP_BIC "clr")
-  (UNSPECV_ATOMIC_LDOP_XOR "eor") (UNSPECV_ATOMIC_LDOP_PLUS "add")])
-
-(define_int_attr atomic_ldoptab
- [(UNSPECV_ATOMIC_LDOP_OR "ior") (UNSPECV_ATOMIC_LDOP_BIC "bic")
-  (UNSPECV_ATOMIC_LDOP_XOR "xor") (UNSPECV_ATOMIC_LDOP_PLUS "add")])
-
 (define_int_iterator SUBDI_BITS [8 16 32])
 
 (define_int_iterator BHSD_BITS [8 16 32 64])
 
 (define_int_iterator LUTI_BITS [2 4])
 
+(define_int_iterator GET_FPSCR
+  [UNSPECV_GET_FPSR UNSPECV_GET_FPCR])
+
+(define_int_iterator SET_FPSCR
+  [UNSPECV_SET_FPSR UNSPECV_SET_FPCR])
+
+(define_int_iterator FP8CVT_UNS
+  [UNSPEC_F1CVT
+   UNSPEC_F2CVT
+   UNSPEC_F1CVTLT
+   UNSPEC_F2CVTLT])
+
+(define_int_iterator SVE2_FP8_TERNARY_VNX8HF
+  [UNSPEC_FMLALB_FP8
+   UNSPEC_FMLALT_FP8])
+
+(define_int_iterator SVE2_FP8_TERNARY_VNX4SF
+  [UNSPEC_FMLALLBB_FP8
+   UNSPEC_FMLALLBT_FP8
+   UNSPEC_FMLALLTB_FP8
+   UNSPEC_FMLALLTT_FP8])
+
+(define_int_iterator SVE2_FP8_TERNARY_LANE_VNX8HF
+  [UNSPEC_FMLALB_FP8
+   UNSPEC_FMLALT_FP8])
+
+(define_int_iterator SVE2_FP8_TERNARY_LANE_VNX4SF
+  [UNSPEC_FMLALLBB_FP8
+   UNSPEC_FMLALLBT_FP8
+   UNSPEC_FMLALLTB_FP8
+   UNSPEC_FMLALLTT_FP8])
+
 ;; ---
 ;; Int Iterators Attributes.
 ;; ---
@@ -3968,6 +3996,8 @@ (define_code_attr binqops_op [(ss_plus "sqadd")
 (define_code_attr binqops_op_rev [(ss_plus "sqsub")
  (ss_minus "sqadd")])
 
+(define_code_attr faminmax_op [(smax "famax") (smin "famin")])
+
 ;; The SVE logical instruction that implements an unspec.
 (define_int_attr logicalf_op [(UNSPEC_ANDF "and")
  (UNSPEC_IORF "orr")
@@ -4180,6 +4210,12 @@ (define_int_attr f16mac1 [(UNSPEC_FMLAL "a") (UNSPEC_FMLSL "s")
 (define_int_attr frintnzs_op [(UNSPEC_FRINT32Z "frint32z") (UNSPEC_FRINT32X "frint32x")
  (UNSPEC_FRINT64Z "frint64z") (UNSPEC_FRINT64X "frint64x")])
 
+(define_int_attr faminmax_cond_uns_op
+  [(UNSPEC_COND_SMAX "famax") (UNSPEC_COND_SMIN "famin")])
+
+(define_int_attr faminmax_uns_op
+  [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")])
+
 ;; The condition associated with an UNSPEC_COND_.
 (define_int_attr cmp_op [(UNSPEC_COND_CMPEQ_WIDE "eq")
 (UNSPEC_COND_CMPGE_WIDE "ge")
@@ -4724,12 +4760,6 @@ (define_int_attr has_16bit_form [(UNSPEC_SME_SDOT "true")
 
 ;; Iterators and attributes for fpcr fpsr getter setters
 
-(define_int_iterator GET_FPSCR
-  [UNSPECV_GET_FPSR UNSPECV_GET_FPCR])
-
-(define_int_iterator SET_FPSCR
-  [UNSPECV_SET_FPSR UNSPECV_SET_FPCR])
-
 (define_int_attr fpscr_name
   [(UNSPECV_GET_FPSR "fpsr")
(UNSPECV_SET_FPSR "fpsr")
@@ -4738,26 +4768,13 @@ (define_int_attr fpscr_name
 
 (define_int_attr bits_etype [(8 "b") (16 "h") (32 "s") (64 "d")])
 
-;; Iterators and attributes for faminmax
-
-(define_int_iterator FAMINMAX_UNS [UNSPEC_FAMAX UNSPEC_FAMIN])
-
-(def


Re: [PATCH v3 2/8] aarch64: Make C/C++ operations possible on SVE ACLE types.

2024-12-02 Thread Andrew Pinski
On Mon, Dec 2, 2024 at 1:47 AM Tejas Belagod  wrote:
>
> On 11/30/24 3:30 AM, Christophe Lyon wrote:
> > Hi!
> >
> > On Fri, 29 Nov 2024 at 05:00, Tejas Belagod  wrote:
> >>
> >> This patch changes the TYPE_INDIVISIBLE flag to 0 to enable SVE ACLE types
> >> to be treated as GNU vectors and have the same semantics with operations
> >> that are defined on GNU vectors.
> >>
> >> gcc/ChangeLog:
> >>
> >>  * config/aarch64/aarch64-sve-builtins.cc (register_builtin_types):
> >>  Flip TYPE_INDIVISIBLE flag for SVE ACLE vector types.
> >
> > Sorry I haven't closely followed the discussions around this patch
> > series, but the Linaro postcommit CI reports
> > 1036 regressions after patch 2/8, is that expected?
> > Given that precommit CI detected "only" 22 regressions with all 8
> > patches, I suppose most of the 1036 are fixed later in the series?
>
> Thanks for raising this.
>
> Patch 2/8 enables SVE vectors to behave like GNU vectors (C/C++ operator
> semantics start applying to SVE vectors) which has a lot of fallout in
> FE/ME/BE that the patches 3-8 fix (mostly related to handling VLA vectors).
>
> I'm currently testing a patch to fix the remaining 22 regressions - they
> are mostly testisms for which I wanted to make sure I was doing the
> right thing (which I have indicated in my cover letter).

I think I fixed all of the testcases over the weekend.

https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670492.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670493.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670494.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670495.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670496.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670497.html

Thanks,
Andrew

>
> Thanks,
> Tejas.
>
> >
> > Thanks,
> >
> > Christophe
> >
> >> ---
> >>   gcc/config/aarch64/aarch64-sve-builtins.cc | 5 -
> >>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
> >> b/gcc/config/aarch64/aarch64-sve-builtins.cc
> >> index 0fec1cd439e..adbadd303d4 100644
> >> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> >> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> >> @@ -4576,6 +4576,9 @@ register_builtin_types ()
> >>vectype = build_truth_vector_type_for_mode 
> >> (BYTES_PER_SVE_VECTOR,
> >>VNx16BImode);
> >>num_pr = 1;
> >> + /* Leave svbool_t as indivisible for now.  We don't yet 
> >> support
> >> +C/C++ operators on predicates.  */
> >> + TYPE_INDIVISIBLE_P (vectype) = 1;
> >>  }
> >>else
> >>  {
> >> @@ -4592,12 +4595,12 @@ register_builtin_types ()
> >>&& TYPE_ALIGN (vectype) == 128
> >>&& known_eq (size, BITS_PER_SVE_VECTOR));
> >>num_zr = 1;
> >> + TYPE_INDIVISIBLE_P (vectype) = 0;
> >>  }
> >>vectype = build_distinct_type_copy (vectype);
> >>gcc_assert (vectype == TYPE_MAIN_VARIANT (vectype));
> >>SET_TYPE_STRUCTURAL_EQUALITY (vectype);
> >>TYPE_ARTIFICIAL (vectype) = 1;
> >> - TYPE_INDIVISIBLE_P (vectype) = 1;
> >>make_type_sizeless (vectype);
> >>  }
> >> if (num_pr)
> >> --
> >> 2.25.1
> >>
>
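
For readers unfamiliar with the "GNU vector" semantics referred to in this
thread, here is a minimal sketch using a fixed-length GNU vector type.  It
only illustrates the operator behaviour of the GCC/Clang vector extension;
the type v4si and the helpers mul_add and lane are names made up for this
example, and whether a given SVE ACLE type accepts the same operators
depends on the series being applied, which is not shown here.

```cpp
#include <cstdint>

// A fixed-length GNU vector of four 32-bit ints (16 bytes).  This is the
// GCC/Clang vector extension, not an SVE ACLE type.
typedef std::int32_t v4si __attribute__ ((vector_size (16)));

// Element-wise arithmetic uses the ordinary C/C++ operators.
inline v4si
mul_add (v4si a, v4si b, v4si c)
{
  return a * b + c;
}

// Individual lanes can be read with subscripting.
inline std::int32_t
lane (v4si v, int i)
{
  return v[i];
}
```

Calling mul_add on {1, 2, 3, 4}, {10, 20, 30, 40} and {1, 1, 1, 1} computes
each lane independently, so lane 0 holds 1 * 10 + 1.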


[PATCH] wwwdocs: Clarify DCO name/identity and (anonymous) pseudonym policy

2024-12-02 Thread Mark Wielaard
Adjust the DCO text to match the broader community usage and
clarifications around the use of real names, known identities and
(anonymous) pseudonyms.

These changes clarify what was meant by "real name": it is not
required to be a "legal name", nor is there any requirement stronger
than a known identity that could be contacted to discuss the
contribution.  This matches the policy adopted by other communities
such as the Linux kernel, elfutils, CNCF and Gentoo.

Also explain that the FSF assignment policy might be more appropriate
when wanting to contribute using an anonymous pseudonym.
---
 htdocs/dco.html | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/htdocs/dco.html b/htdocs/dco.html
index 68fa183b9fc0..5713f003cce3 100644
--- a/htdocs/dco.html
+++ b/htdocs/dco.html
@@ -54,8 +54,21 @@ then you just add a line saying:
 
 Signed-off-by: Random J Developer 

 
-using your real name (sorry, no pseudonyms or anonymous contributions.)  This
-will be done for you automatically if you use `git commit -s`.
+using a known identity (sorry, no anonymous contributions.)  The name
+you use as your identity should not be an anonymous id or false name
+that misrepresents who you are.  This will be done for you
+automatically if you use `git commit -s`.
+
+A known identity can be the committer's real, birth or legal name,
+but can also be an established (online) identity.  It is the name you
+convey to people in the community for them to use to identify you as
+you.  The key concern is that your identification is sufficient enough
+to contact you if an issue were to arise in the future about your
+contribution.  You should not deliberately use a name or email address
+that hides your identity.  When you wish to only contribute under an
+(anonymous) pseudonym, or when you require an explicit employer
+disclaimer, then following the FSF
+assignment process is more appropriate.
 
 Some people also put extra optional tags at the end.  The GCC project does
 not require tags from anyone other than the original author of the patch, but
-- 
2.47.0



[PATCH v1 1/1] aarch64: fix fp8 cpuinfo feature names

2024-12-02 Thread Claudio Bantaloukas

The previous version of the patch was based on the mistaken assumption that
features in /proc/cpuinfo had matching names to the feature names that gcc and
gas accept.
This patch enables the fp8 feature when the f8cvt feature is enabled, under the
assumption that fpmr is always enabled when f8cvt is.

Changelog:

gcc/
* config/aarch64/aarch64-option-extensions.def: (fp8): fix FEATURE_STRING.
(fp8fma, ssve-fp8fma): Likewise.
(fp8dot4, ssve-fp8dot4, fp8dot2, ssve-fp8dot2): Likewise.
---
 gcc/config/aarch64/aarch64-option-extensions.def | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 90abb1c5edd..7c5633aa803 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -243,21 +243,21 @@ AARCH64_OPT_EXTENSION("the", THE, (), (), (), "the")
 
 AARCH64_OPT_EXTENSION("gcs", GCS, (), (), (), "gcs")
 
-AARCH64_OPT_EXTENSION("fp8", FP8, (SIMD), (), (), "fp8")
+AARCH64_OPT_EXTENSION("fp8", FP8, (SIMD), (), (), "f8cvt")
 
-AARCH64_OPT_EXTENSION("fp8fma", FP8FMA, (FP8), (), (), "fp8fma")
+AARCH64_OPT_EXTENSION("fp8fma", FP8FMA, (FP8), (), (), "f8fma")
 
-AARCH64_OPT_EXTENSION("ssve-fp8fma", SSVE_FP8FMA, (SME2,FP8), (), (), "ssve-fp8fma")
+AARCH64_OPT_EXTENSION("ssve-fp8fma", SSVE_FP8FMA, (SME2,FP8), (), (), "smesf8fma")
 
 AARCH64_OPT_EXTENSION("faminmax", FAMINMAX, (SIMD), (), (), "faminmax")
 
-AARCH64_OPT_EXTENSION("fp8dot4", FP8DOT4, (FP8FMA), (), (), "fp8dot4")
+AARCH64_OPT_EXTENSION("fp8dot4", FP8DOT4, (FP8FMA), (), (), "f8dp4")
 
-AARCH64_OPT_EXTENSION("ssve-fp8dot4", SSVE_FP8DOT4, (SSVE_FP8FMA), (), (), "ssve-fp8dot4")
+AARCH64_OPT_EXTENSION("ssve-fp8dot4", SSVE_FP8DOT4, (SSVE_FP8FMA), (), (), "smesf8dp4")
 
-AARCH64_OPT_EXTENSION("fp8dot2", FP8DOT2, (FP8DOT4), (), (), "fp8dot2")
+AARCH64_OPT_EXTENSION("fp8dot2", FP8DOT2, (FP8DOT4), (), (), "f8dp2")
 
-AARCH64_OPT_EXTENSION("ssve-fp8dot2", SSVE_FP8DOT2, (SSVE_FP8DOT4), (), (), "ssve-fp8dot2")
+AARCH64_OPT_EXTENSION("ssve-fp8dot2", SSVE_FP8DOT2, (SSVE_FP8DOT4), (), (), "smesf8dp2")
 
 #undef AARCH64_OPT_FMV_EXTENSION
 #undef AARCH64_OPT_EXTENSION


[PATCH v1 0/1] aarch64: fix fp8 cpuinfo feature names

2024-12-02 Thread Claudio Bantaloukas


The previous version of the patch was based on the mistaken assumption that
features in /proc/cpuinfo had matching names to the feature names that gcc and
gas accept.
This patch enables the fp8 feature when the f8cvt feature is enabled, under the
assumption that fpmr is always enabled when f8cvt is.
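
To make the name mismatch concrete, here is a small lookup sketch.  The
string pairs are copied from the FEATURE_STRING changes in patch 1/1; the
struct, table and function names are invented for this illustration and
are not part of GCC.

```cpp
#include <cstring>

// One row per extension: the name accepted by -march=...+<name> and the
// token that appears in the Features line of /proc/cpuinfo.
struct fp8_feature_name
{
  const char *gcc_name;
  const char *cpuinfo_name;
};

// Pairs as listed in the FEATURE_STRING changes of patch 1/1.
static const fp8_feature_name fp8_feature_names[] = {
  {"fp8", "f8cvt"},
  {"fp8fma", "f8fma"},
  {"ssve-fp8fma", "smesf8fma"},
  {"fp8dot4", "f8dp4"},
  {"ssve-fp8dot4", "smesf8dp4"},
  {"fp8dot2", "f8dp2"},
  {"ssve-fp8dot2", "smesf8dp2"},
};

// Return the GCC extension name for a /proc/cpuinfo token, or nullptr if
// the token is not one of the fp8-related features above.
inline const char *
gcc_name_for_cpuinfo (const char *token)
{
  for (const fp8_feature_name &f : fp8_feature_names)
    if (std::strcmp (f.cpuinfo_name, token) == 0)
      return f.gcc_name;
  return nullptr;
}
```

With this table, the cpuinfo token "f8cvt" maps to "fp8", while the string
"fp8" itself is not a cpuinfo token and yields nullptr.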

OK for trunk?

Changelog:

gcc/
config/aarch64/aarch64-option-extensions.def: (fp8): fix FEATURE_STRING.
(fp8fma, ssve-fp8fma): Likewise.
(fp8dot4, ssve-fp8dot4, fp8dot2, ssve-fp8dot2): Likewise.


Claudio Bantaloukas (1):
  aarch64: fix fp8 cpuinfo feature names

 gcc/config/aarch64/aarch64-option-extensions.def | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

-- 
2.45.2



[committed] Add sym-exec subdirectory to configure.in rather than generated configure

2024-12-02 Thread Jeff Law
As Mark pointed out, one patch in the CRC series changed a generated file
rather than the canonical source.  This corrects the canonical source.


Committing as obvious.  The generated configure is already up-to-date.

Jeff

commit 4df8e6fc0cbc8358f88e81bb64b790af2a848a35
Author: Jeff Law 
Date:   Mon Dec 2 10:45:21 2024 -0700

[committed] Add sym-exec subdirectory to configure.in rather than generated configure

As Mark pointed out, one patch in the CRC series changed a generated file
rather than the canonical source.  This corrects the canonical source.

Committing as obvious.  The generated configure is already up-to-date.

gcc/
* configure.ac: Add sym-exec subdirectory.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index a6c650c8f3a..e9bddc6db21 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1381,7 +1381,7 @@ AC_CHECK_HEADERS(ext/hash_map)
 ZW_CREATE_DEPDIR
 AC_CONFIG_COMMANDS([gccdepdir],[
   ${CONFIG_SHELL-/bin/sh} $ac_aux_dir/mkinstalldirs build/$DEPDIR
-  for lang in $subdirs c-family common analyzer text-art rtl-ssa
+  for lang in $subdirs c-family common analyzer text-art rtl-ssa sym-exec
   do
   ${CONFIG_SHELL-/bin/sh} $ac_aux_dir/mkinstalldirs $lang/$DEPDIR
   done], [subdirs="$subdirs" ac_aux_dir=$ac_aux_dir DEPDIR=$DEPDIR])


Re: [committed] Patches 8-12 of Mariam & Matevos's CRC optimization work

2024-12-02 Thread Jeff Law




On 12/2/24 6:07 AM, Mark Wielaard wrote:

Hi Jeff,

On Sun, 2024-12-01 at 08:56 -0700, Jeff Law wrote:

commit 148e20466c2c246df9472efed0f2ae94cb65a0f8
Author: Matevos Mehrabyan 
Date:   Mon Nov 11 13:00:10 2024 -0700

     [PATCH v6 09/12] Add symbolic execution support.

     Gives an opportunity to execute the code on bit level, assigning
     symbolic values to the variables which don't have initial values.
     Supports only CRC specific operations.

     Example:

     uint8_t crc;
     uint8_t pol = 1;
     crc = crc ^ pol;

     during symbolic execution crc's value will be:
     crc(8), crc(7), ... crc(1), crc(0) ^ 1

     gcc/
     * Makefile.in (OBJS): Add sym-exec/sym-exec-expression.o,
     sym-exec/sym-exec-state.o, sym-exec/sym-exec-condition.o.
     * configure (sym-exec): New subdir.
     * sym-exec/sym-exec-condition.cc: New file.
     * sym-exec/sym-exec-condition.h: New file.
     * sym-exec/sym-exec-expr-is-a-helper.h: New file.
     * sym-exec/sym-exec-expression.cc: New file.
     * sym-exec/sym-exec-expression.h: New file.
     * sym-exec/sym-exec-state.cc: New file.
     * sym-exec/sym-exec-state.h: New file.

     Co-authored-by: Mariam Arutunian 


This updates configure without updating configure.ac. So when
regenerating configure the change disappears.

The autoregen buildbot doesn't like that:
https://builder.sourceware.org/buildbot/#/builders/gcc-autoregen

Thanks.  Should be fixed now.
jeff
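
The bit-level example in the quoted commit message can be illustrated with
a toy model.  This is only a sketch of the idea, assuming each bit is
tracked as a printable expression; it is not the sym-exec implementation
in GCC, the names sym_byte, make_symbolic and xor_constant are
hypothetical, and bits are numbered 0 (least significant) to 7 here.

```cpp
#include <array>
#include <string>

// A byte whose bits are symbolic expressions; index 0 is the least
// significant bit.
using sym_byte = std::array<std::string, 8>;

// Build a fully symbolic byte: bit i becomes "name(i)".
inline sym_byte
make_symbolic (const std::string &name)
{
  sym_byte bits;
  for (int i = 0; i < 8; ++i)
    bits[i] = name + "(" + std::to_string (i) + ")";
  return bits;
}

// XOR a symbolic byte with a known constant.  Where the constant bit is 0
// the symbolic bit is unchanged; where it is 1 the XOR is recorded.
inline sym_byte
xor_constant (sym_byte bits, unsigned char constant)
{
  for (int i = 0; i < 8; ++i)
    if ((constant >> i) & 1)
      bits[i] += " ^ 1";
  return bits;
}
```

For an uninitialised crc XORed with pol = 1, only the lowest bit picks up
the " ^ 1" term and the remaining bits stay purely symbolic.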



[committed] Add trailing newlines where needed

2024-12-02 Thread Jakub Jelinek
Hi!

Especially in the recent CRC commits, I see
\ No newline at end of file
in almost every second file.  So, I went through
the diff between r15-1 and current trunk in gcc/, looking for
additions of such problems which weren't intentional (e.g. the
Wtrailing-whitespace* tests had it there intentionally), and
just added the missing newline elsewhere.

Committed to trunk as obvious.
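
For anyone wanting to check their own files before posting, the condition
being fixed is simple to state in code.  The following is a minimal
sketch with made-up helper names; it treats an empty file as fine, which
is an assumption of this example rather than a project rule.

```cpp
#include <string>

// True if non-empty file contents do not end with a newline, i.e. the
// case for which diff prints "\ No newline at end of file".
inline bool
needs_trailing_newline (const std::string &contents)
{
  return !contents.empty () && contents.back () != '\n';
}

// Return the contents with a final newline appended when one is missing.
inline std::string
with_trailing_newline (std::string contents)
{
  if (needs_trailing_newline (contents))
    contents += '\n';
  return contents;
}
```

Applying with_trailing_newline to contents that already end in a newline
returns them unchanged, so the fix is safe to run repeatedly.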

2024-12-02  Jakub Jelinek  

gcc/
* config/mingw/mingw-stdint.h: Add newline at the end of the file.
* config/mingw/winnt-dll.cc: Likewise.
* sym-exec/sym-exec-expression.h: Likewise.
* sym-exec/sym-exec-expression.cc: Likewise.
* sym-exec/sym-exec-condition.cc: Likewise.
* sym-exec/sym-exec-expr-is-a-helper.h: Likewise.
* sym-exec/sym-exec-condition.h: Likewise.
* hwint.cc: Likewise.
* crc-verification.cc: Likewise.
* sarif-spec-urls.def: Likewise.
gcc/testsuite/
* g++.target/aarch64/pr94515-2.C: Add newline at the end of the file.
* g++.target/aarch64/return_address_sign_ab_exception.C: Likewise.
* gcc.target/arm/thumb2-switchstatement.c: Likewise.
* gcc.target/riscv/rvv/base/vssubu-2.c: Likewise.
* gcc.target/riscv/rvv/base/vssubu-1.c: Likewise.
* gcc.target/riscv/and-shift32.c: Likewise.
* gcc.target/riscv/crc-builtin-zbc32.c: Likewise.
* gcc.target/riscv/and-shift64.c: Likewise.
* gcc.target/riscv/xtheadbb-extu-4.c: Likewise.
* gcc.target/i386/avx2-bf16-vec-absneg.c: Likewise.
* gcc.target/i386/avx512f-bf16-vec-absneg.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_26.c: Likewise.
* gcc.target/aarch64/cpunative/info_26: Likewise.
* gcc.target/aarch64/cpunative/info_25: Likewise.
* g++.dg/contracts/pr116607.C: Likewise.
* gfortran.dg/pr108889.f90: Likewise.
* gcc.dg/crc-not-crc-14.c: Likewise.
* gcc.dg/crc-from-fedora-packages-13.c: Likewise.
* gcc.dg/crc-not-crc-25.c: Likewise.
* gcc.dg/crc-from-fedora-packages-29.c: Likewise.
* gcc.dg/crc-from-fedora-packages-10.c: Likewise.
* gcc.dg/crc-side-instr-10.c: Likewise.
* gcc.dg/crc-side-instr-1.c: Likewise.
* gcc.dg/crc-side-instr-3.c: Likewise.
* gcc.dg/crc-side-instr-2.c: Likewise.
* gcc.dg/crc-not-crc-17.c: Likewise.
* gcc.dg/crc-from-fedora-packages-7.c: Likewise.
* gcc.dg/crc-side-instr-12.c: Likewise.
* gcc.dg/crc-side-instr-16.c: Likewise.
* gcc.dg/crc-not-crc-16.c: Likewise.
* gcc.dg/crc-from-fedora-packages-4.c: Likewise.
* gcc.dg/crc-not-crc-20.c: Likewise.
* gcc.dg/crc-linux-3.c: Likewise.
* gcc.dg/crc-from-fedora-packages-27.c: Likewise.
* gcc.dg/pr109393.c: Likewise.
* gcc.dg/crc-side-instr-7.c: Likewise.
* gcc.dg/crc-side-instr-4.c: Likewise.
* gcc.dg/tree-ssa/ldexp.c: Likewise.
* gcc.dg/tree-ssa/pr114760-2.c: Likewise.
* gcc.dg/tree-ssa/pr114760-1.c: Likewise.
* gcc.dg/crc-side-instr-15.c: Likewise.
* gcc.dg/crc-side-instr-9.c: Likewise.
* gcc.dg/crc-not-crc-26.c: Likewise.
* gcc.dg/crc-side-instr-8.c: Likewise.
* gcc.dg/crc-not-crc-23.c: Likewise.
* gcc.dg/crc-not-crc-19.c: Likewise.
* gcc.dg/crc-from-fedora-packages-22.c: Likewise.
* gcc.dg/crc-from-fedora-packages-16.c: Likewise.
* gcc.dg/crc-side-instr-11.c: Likewise.
* gcc.dg/crc-from-fedora-packages-5.c: Likewise.
* gcc.dg/crc-not-crc-22.c: Likewise.
* gcc.dg/crc-side-instr-17.c: Likewise.
* gcc.dg/crc-linux-4.c: Likewise.
* gcc.dg/crc-side-instr-14.c: Likewise.
* gcc.dg/crc-not-crc-18.c: Likewise.
* gcc.dg/crc-from-fedora-packages-23.c: Likewise.
* gcc.dg/crc-not-crc-21.c: Likewise.
* gcc.dg/crc-linux-2.c: Likewise.
* gcc.dg/crc-from-fedora-packages-1.c: Likewise.
* gcc.dg/crc-from-fedora-packages-30.c: Likewise.
* gcc.dg/torture/crc-11.c: Likewise.
* gcc.dg/torture/crc-27.c: Likewise.
* gcc.dg/torture/crc-2.c: Likewise.
* gcc.dg/torture/crc-24.c: Likewise.
* gcc.dg/torture/crc-crc8.c: Likewise.
* gcc.dg/torture/crc-crc8-data8-xorOustideFor.c: Likewise.
* gcc.dg/torture/crc-16.c: Likewise.
* gcc.dg/torture/crc-crc64-data64.c: Likewise.
* gcc.dg/crc-from-fedora-packages-32.c: Likewise.
* gcc.dg/crc-side-instr-6.c: Likewise.
* gcc.dg/crc-side-instr-5.c: Likewise.
* gcc.dg/crc-side-instr-13.c: Likewise.
* gcc.dg/crc-not-crc-15.c: Likewise.
* gcc.dg/crc-not-crc-13.c: Likewise.
* gcc.dg/crc-from-fedora-packages-6.c: Likewise.
* gcc.dg/crc-not-crc-24.c: Likewise.

--- gcc/config/mingw/mingw-stdint.h.jj  2024-05-07 18:10:10.538872950 +0200
+++ gcc/config/mingw/mingw-stdint.h 2024-12-02 14:42:19.833496289 +0100
@@ -52,4 +52,4 @@ alon

RE: [RFC] PR81358: Enable automatic linking of libatomic

2024-12-02 Thread Prathamesh Kulkarni



> -Original Message-
> From: Joseph Myers 
> Sent: 29 November 2024 21:48
> To: Prathamesh Kulkarni 
> Cc: Xi Ruoyao ; Matthew Malcomson
> ; gcc-patches@gcc.gnu.org
> Subject: RE: [RFC] PR81358: Enable automatic linking of libatomic
> 
> External email: Use caution opening links or attachments
> 
> 
> On Fri, 29 Nov 2024, Prathamesh Kulkarni wrote:
> 
> > > My expectation is that CFLAGS should not be modified until after
> > > save_CFLAGS is set, which should not be until after configure has
> > > executed the logic that sets a -g -O2 default.  Is there some
> > > problem with that ordering (e.g. configure tests that expect to link
> > > target programs but run as part of the same Autoconf macro
> > > invocation that also generates the logic to determine default
> > > values)?  Also, the
> > It seems that in configure, AC_PROG_CC expands to setting "-g -O2" in
> > CFLAGS, and running conftests using those CFLAGS, and any
> > adjustments to CFLAGS after invoking AC_PROG_CC don't help.
> > In the attached patch, I simply moved save_CFLAGS and CFLAGS before
> > invoking AC_PROG_CC, and adding "-fno-link-libatomic" to CFLAGS,
> > which seems to work, but not sure if it's the correct approach ?
> 
> I don't think having those settings before the default from AC_PROG_CC
> is logically right, because the default from AC_PROG_CC only applies
> if CFLAGS is not already set (that is, you'd lose the default -g -O2,
> if libatomic/configure is run without CFLAGS set).
> 
> The underlying principle is that CFLAGS is a variable for the *user*
> to set as they wish, not something that should be used for any options
> required as part of the build (that's what other things such as
> XCFLAGS and AM_CFLAGS are for).  So if you need to modify CFLAGS in
> configure for use as part of configure tests, it should only be done
> temporarily in a way that doesn't interfere with the normal logic to
> determine that default setting.  If for some reason that doesn't work
> (if AC_PROG_CC also runs tests that need modified CFLAGS, giving
> nowhere you can modify the value between the default being set and it
> being used), you'd need an assertion that libatomic/configure didn't
> get run with unset CFLAGS so the default wouldn't be applicable
> anyway, with appropriate comments explaining the issues.
Hi Joseph,
Thanks for the suggestions! Unfortunately, it seems to me that AC_PROG_CC
does also run tests that need modified CFLAGS. I tried the following
assertion before invoking AC_PROG_CC (for the stage-1 build):

if test -z "${CFLAGS}"; then
  AC_MSG_ERROR([CFLAGS must be set.])
fi

and it seems to pass. So I suppose the default setting of CFLAGS in
AC_PROG_CC won't be applicable anyway?
I checked what value CFLAGS was set to and it turned out to be "-g -O2"
(the same as the default setting in AC_PROG_CC, although I am not quite sure
where CFLAGS was set before invoking libatomic/configure). Given that the
above assertion passes, would it be safe to add "-fno-link-libatomic"
to CFLAGS before invoking AC_PROG_CC (and after the assertion)?

Thanks,
Prathamesh
> 
> --
> Joseph S. Myers
> josmy...@redhat.com



Re: [PATCH] libstdc++: Use hidden friends for __normal_iterator operators

2024-12-02 Thread Patrick Palka
On Thu, 28 Nov 2024, Jonathan Wakely wrote:

> As suggested by Jason, this makes all __normal_iterator operators into
> friends so they can be found by ADL and don't need to be separately
> exported in module std.

Might as well remove the __gnu_cxx exports in std.cc.in while we're at
it?
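
As background on the technique under review, here is a minimal sketch of
the hidden-friend idiom on a toy iterator-like wrapper.  The namespace
demo and the class wrapped_ptr are invented for this illustration and are
much simpler than __normal_iterator; the point is only that operators
defined as friends inside the class are found through argument-dependent
lookup and never need to be named or exported separately.

```cpp
namespace demo
{
  template<typename T>
  class wrapped_ptr
  {
    T *m_ptr;

  public:
    explicit wrapped_ptr (T *p) : m_ptr (p) {}

    T *base () const { return m_ptr; }

    // Hidden friend: defined inside the class body, so it is found only
    // by argument-dependent lookup on a wrapped_ptr argument.
    friend bool
    operator== (const wrapped_ptr &lhs, const wrapped_ptr &rhs)
    { return lhs.base () == rhs.base (); }

    // Another hidden friend; it is a non-template function of each
    // specialization, so the integer argument may convert implicitly.
    friend wrapped_ptr
    operator+ (const wrapped_ptr &it, long n)
    { return wrapped_ptr (it.base () + n); }
  };
}
```

Code outside namespace demo can write first + 1 == second without any
using-declaration, because both operators are located via the wrapped_ptr
arguments.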

> 
> For the operator<=> comparing two iterators of the same type, I had to
> use a deduced return type and add a requires-clause, because it's no
> longer a template and so we no longer get substitution failures when
> it's considered in overload resolution.
> 
> I also had to reorder the __attribute__((always_inline)) and
> [[nodiscard]] attributes, which have to be in a particular order when
> used on friend functions.
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/bits/stl_iterator.h (__normal_iterator): Make all
>   non-member operators hidden friends.
>   * src/c++11/string-inst.cc: Remove explicit instantiations of
>   operators that are no longer templates.
> ---
> 
> Tested x86_64-linux.
> 
> This iterator type isn't defined in the standard, and users shouldn't be
> doing funny things with it, so nothing prevents us from replacing its
> operators with hidden friends.
> 
>  libstdc++-v3/include/bits/stl_iterator.h | 341 ---
>  libstdc++-v3/src/c++11/string-inst.cc|  11 -
>  2 files changed, 184 insertions(+), 168 deletions(-)
> 
> diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
> b/libstdc++-v3/include/bits/stl_iterator.h
> index e872598d7d8..656a47e5f76 100644
> --- a/libstdc++-v3/include/bits/stl_iterator.h
> +++ b/libstdc++-v3/include/bits/stl_iterator.h
> @@ -1164,188 +1164,215 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>const _Iterator&
>base() const _GLIBCXX_NOEXCEPT
>{ return _M_current; }
> -};
>  
> -  // Note: In what follows, the left- and right-hand-side iterators are
> -  // allowed to vary in types (conceptually in cv-qualification) so that
> -  // comparison between cv-qualified and non-cv-qualified iterators be
> -  // valid.  However, the greedy and unfriendly operators in std::rel_ops
> -  // will make overload resolution ambiguous (when in scope) if we don't
> -  // provide overloads whose operands are of the same type.  Can someone
> -  // remind me what generic programming is about? -- Gaby
> +private:
> +  // Note: In what follows, the left- and right-hand-side iterators are
> +  // allowed to vary in types (conceptually in cv-qualification) so that
> +  // comparison between cv-qualified and non-cv-qualified iterators be
> +  // valid.  However, the greedy and unfriendly operators in std::rel_ops
> +  // will make overload resolution ambiguous (when in scope) if we don't
> +  // provide overloads whose operands are of the same type.  Can someone
> +  // remind me what generic programming is about? -- Gaby
>  
>  #ifdef __cpp_lib_three_way_comparison
> -  template
> -[[nodiscard, __gnu__::__always_inline__]]
> -constexpr bool
> -operator==(const __normal_iterator<_IteratorL, _Container>& __lhs,
> -const __normal_iterator<_IteratorR, _Container>& __rhs)
> -noexcept(noexcept(__lhs.base() == __rhs.base()))
> -requires requires {
> -  { __lhs.base() == __rhs.base() } -> std::convertible_to;
> -}
> -{ return __lhs.base() == __rhs.base(); }
> +  template
> + [[nodiscard, __gnu__::__always_inline__]]
> + friend
> + constexpr bool
> + operator==(const __normal_iterator& __lhs,
> +const __normal_iterator<_Iter, _Container>& __rhs)
> + noexcept(noexcept(__lhs.base() == __rhs.base()))
> + requires requires {
> +   { __lhs.base() == __rhs.base() } -> std::convertible_to;
> + }
> + { return __lhs.base() == __rhs.base(); }
>  
> -  template
> -[[nodiscard, __gnu__::__always_inline__]]
> -constexpr std::__detail::__synth3way_t<_IteratorR, _IteratorL>
> -operator<=>(const __normal_iterator<_IteratorL, _Container>& __lhs,
> - const __normal_iterator<_IteratorR, _Container>& __rhs)
> -noexcept(noexcept(std::__detail::__synth3way(__lhs.base(), 
> __rhs.base(
> -{ return std::__detail::__synth3way(__lhs.base(), __rhs.base()); }
> +  template
> + static constexpr bool __nothrow_synth3way
> +   = noexcept(std::__detail::__synth3way(std::declval<_Iterator&>(),
> + std::declval<_Iter&>()));

Since base() returns a const reference do we want to consider const
references in this noexcept helper?

>  
> -  template
> -[[nodiscard, __gnu__::__always_inline__]]
> -constexpr bool
> -operator==(const __normal_iterator<_Iterator, _Container>& __lhs,
> -const __normal_iterator<_Iterator, _Container>& __rhs)
> -noexcept(noexcept(__lhs.base() == __rhs.base()))
> -requires requires {
> -  { __lhs.base() == __rhs.base() } -> std::convertible_to;
> -}
> -{ return __lhs.base() == __rhs.base(); }
> +   

Re: [PATCH] AIX Build failure with default -std=gnu23.

2024-12-02 Thread swamy sangamesh
Dear Community,

Please let me know your comments.
Or would it be more appropriate to guard the definition for AIX, like this?

--- a/libiberty/getopt.c
+++ b/libiberty/getopt.c
@@ -25,9 +25,11 @@
 ^L
 /* This tells Alpha OSF/1 not to define a getopt prototype in .
Ditto for AIX 3.2 and .  */
+#ifndef _AIX
 #ifndef _NO_PROTO
 # define _NO_PROTO
 #endif
+#endif

 #ifdef HAVE_CONFIG_H
 # include 


Thanks,
Sangamesh


On Thu, Nov 28, 2024 at 11:09 AM Sangamesh Mallayya <
swamy.sangam...@gmail.com> wrote:

>  libiberty/getopt.c file is defining _NO_PROTO which causes conflicting
>  declarations for the functions in AIX header files like stdio.h &
> stdlib.h.
>  These declarations are being considered as errors in C23 which wasn't
>  the case with C17.
>
> Here is the error we get.
>
> /gcc_build/./prev-gcc/xgcc -B/gcc_build/./prev-gcc/
> -B/home/sangam/install/GCC/powerpc-ibm-aix7.3.3.0/bin/
> -B/home/sangam/install/GCC/powerpc-ibm-aix7.3.3.0/bin/
> -B/home/sangam/install/GCC/powerpc-ibm-aix7.3.3.0/lib/
> -isystem /home/sangam/install/GCC/powerpc-ibm-aix7.3.3.0/include
> -isystem /home/sangam/install/GCC/powerpc-ibm-aix7.3.3.0/sys-include
> -fno-checking -c -DHAVE_CONFIG_H -g -O2 -fno-checking -I.
> -I/opt/freeware/src/packages/BUILD/gcc/libiberty/../include -W -Wall
> -Wwrite-strings -Wc++-compat -Wstrict-prototypes -Wshadow=local -pedantic
> -D_GNU_SOURCE /opt/freeware/src/packages/BUILD/gcc/libiberty/getopt.c
> -o getopt.o
>
>
> In file included from /opt/freeware/src/packages/BUILD/gcc/libiberty/getopt.c:45:
> /gcc_build/prev-gcc/include-fixed/stdio.h:593:12: error: conflicting types for 'fgetpos64'; have 'int(FILE *, fpos64_t *)' {aka 'int(FILE *, long long int *)'}
>   593 | extern int fgetpos64(FILE *, fpos64_t *);
>   |^
> /gcc_build/prev-gcc/include-fixed/stdio.h:298:17: note: previous declaration of 'fgetpos64' with type 'int(void)'
>   298 | extern int  fgetpos();
>   | ^~~
> /gcc_build/prev-gcc/include-fixed/stdio.h:594:14: error: conflicting types for 'fopen64'; have 'FILE *(const char *, const char *)'
>   594 | extern FILE *fopen64(const char *, const char *);
>   |  ^~~
> /gcc_build/prev-gcc/include-fixed/stdio.h:259:17: note: previous declaration of 'fopen64' with type 'FILE *(void)'
>   259 | extern FILE *   fopen();
>   | ^
> /gcc_build/prev-gcc/include-fixed/stdio.h:595:14: error: conflicting types for 'freopen64'; have 'FILE *(const char *, const char *, FILE *)'
>   595 | extern FILE *freopen64(const char *, const char *, FILE *);
>   |  ^
> /gcc_build/prev-gcc/include-fixed/stdio.h:260:17: note: previous declaration of 'freopen64' with type 'FILE *(void)'
>   260 | extern FILE *   freopen();
>   | ^~~
> /gcc_build/prev-gcc/include-fixed/stdio.h:597:12: error: conflicting types for 'fsetpos64'; have 'int(FILE *, const fpos64_t *)' {aka 'int(FILE *, const long long int *)'}
>   597 | extern int fsetpos64(FILE *, const fpos64_t *);
>   |^
> /gcc_build/prev-gcc/include-fixed/stdio.h:300:17: note: previous declaration of 'fsetpos64' with type 'int(void)'
>   300 | extern int  fsetpos();
>   | ^~~
> In file included from /opt/freeware/src/packages/BUILD/gcc/libiberty/getopt.c:216:
> /gcc_build/prev-gcc/include-fixed/stdlib.h: In function 'strtold':
> /gcc_build/prev-gcc/include-fixed/stdlib.h:233:30: error: too many arguments to function 'strtod'
>
>
> Compiled with this patch on RHEL8.10 ppc64le as well.
>
> ---
>  libiberty/getopt.c | 6 --
>  1 file changed, 6 deletions(-)
>
> diff --git a/libiberty/getopt.c b/libiberty/getopt.c
> index 2f7086cc0c8..48736d4db41 100644
> --- a/libiberty/getopt.c
> +++ b/libiberty/getopt.c
> @@ -23,12 +23,6 @@
> Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
> 02110-1301,
> USA.  */
>
> -/* This tells Alpha OSF/1 not to define a getopt prototype in <stdio.h>.
> -   Ditto for AIX 3.2 and <stdlib.h>.  */
> -#ifndef _NO_PROTO
> -# define _NO_PROTO
> -#endif
> -
>  #ifdef HAVE_CONFIG_H
>  # include <config.h>
>  #endif
> --
> 2.41.0
>
>

-- 
Thanks & Regards,
Sangamesh


Re: Should -fsanitize=bounds support counted-by attribute for pointers inside a structure?

2024-12-02 Thread Qing Zhao


> On Dec 2, 2024, at 16:13, Martin Uecker  wrote:
> 
> Am Montag, dem 02.12.2024 um 20:15 + schrieb Qing Zhao:
>> 
>>> On Nov 30, 2024, at 07:22, Martin Uecker  wrote:
>>> 
>>> Am Dienstag, dem 26.11.2024 um 20:59 + schrieb Qing Zhao:
>>>> Think this over these days, I have another thought that need some feedback:
>>>> 
>>>> The major issue right now is:
>>>> 
>>>> 1. For the following structure in which the “counted_by” attributes is 
>>>> attached to the pointer field.
>>>> 
>>>> struct foo {
>>>> int n;
>>>> char *p __attribute__ ((counted_by (n)));
>>>> } *x;
>>> 
>>> BTW: I also do not like the syntax 'n' for a lookup of a member 
>>> in the member namespace. I think it should be '.n'.  For FAM this 
>>> is less problematic because it always at the end, but here it is 
>>> problematic.
>> During my implementation of extending this attribute to pointer fields of 
>> structures, I didn’t notice any issue with the current syntax ‘n’ for pointer 
>> fields so far, even when the field “n” is declared after the corresponding 
>> pointer field, i.e.:
>> 
>> struct foo {
>>  char *p __attribute__ ((counted_by (n)));
>>  int n;
>> };
>> 
>> So, could you please explain a little bit more about the potential issue 
>> here?
> 
> My issue with it is that it is not consistent with how C's scoping
> of identifiers works. In
> 
> constexpr int n = 3;
> struct foo {
>  char (*p)[n] __attribute__ ((counted_by (n)));
>  int n;
> };
> 
> the n in the type of p refers to the previous n in scope while
> the n in your attribute would refer to the member.

Okay, I see your point here -:).
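
The lookup rule Martin describes can be checked with a small sketch that leaves
the attribute out entirely. Assumptions: an enum constant stands in for
"constexpr int n = 3;" so the code does not need C23, and pointee_size is a
hypothetical helper name used only for illustration.

```c
#include <stddef.h>

enum { n = 3 };   /* file-scope n; stands in for 'constexpr int n = 3;' */

struct foo {
  char (*p)[n];   /* ordinary lookup: this n is the file-scope constant */
  int n;          /* member n lives in the struct's member name space */
};

/* Size of the array type that p points to; sizeof does not evaluate *x.p. */
size_t pointee_size(void)
{
  struct foo x = { 0 };
  x.n = 10;               /* changing the member does not affect p's type */
  return sizeof *x.p;
}
```

So the n in the declarator resolves to the outer constant, while an
unqualified n inside counted_by would have to resolve to the member. That is
the mismatch a ".n" spelling would avoid.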

Yes, I agree that if the compiler can accept the “.n” syntax in general, then
the “counted_by” attribute can allow more complicated expressions in general.

We had a similar discussion before the design and implementation of the
“counted_by” attribute on FAMs, and at that time we agreed to defer accepting
the “.n” syntax to a possible future language standard.

So, for the “counted_by” attribute on FAMs, the implementation searches all the
fields of the containing structure for “n” and locates that specific field.

Now, when extending the “counted_by” attribute to pointer fields of structures,
the implementation is similar.

> 
> This is incoherent and confusing.  It becomes worse should
> you ever want to allow more complicated expressions.

You are right, it’s hard to allow more complicated expressions for “counted_by”
based on the current design.

If we agree to accept the “.n” syntax in GCC in general, that’s of course
better.
Then how about the current “counted_by” for FAMs? Shall we also change it to
accept the “.n” syntax?

> 
> It would be clearer if you used the syntax ".n" which resembles
> the syntax for designated initializers that is already used
> in initializers to refer to struct members.
> 
> constexpr int n = 3;
> struct foo {
>  char (*p)[n] __attribute__ ((counted_by (.n)));
>  int n;
> };
> 
Yes, I agree. 
> 
>> 
>> 
 
>>>> There is one important additional requirement:
>>>> 
>>>> x->n, x->p can ONLY be changed by changing the whole structure at the same 
>>>> time. 
>>>> Otherwise, x->n might not be consistent with x->p.
>>> 
>>> By itself, this would still not fix the issue I pointed out.
>>> 
>>> struct foo x;
>>> x = .. ; // set the whole structure
>>> char *p = x->p;
>>> x = ... ; // set the whole structure
>>> 
>>> What is the bound for 'p' ?  
>> 
>> Since p was set to the pointer field of the old structure, then the bound of 
>> it should be the old bound. 
>>> With current rules it would be the old bound.
>> 
>> I thought that this should be the correct behavior, isn’t it?
> 
> Yes, sorry, what I meant was "with the current rules it would be
> the *new* bound”.

struct foo x;
x=… ;  // set the whole structure 1
char *p = x->p;
x=… ;  // set the whole structure 2

In the above, at “set the whole structure 1”, x1, x1->n and x1->p are set at
the same time.
After “char *p = x->p;”, the pointer “p” points to “x1->p”, and its bound is
“x1->n”.

Then at “set the whole structure 2”, x2 is different from x1, and x2->n and
x2->p are set at the same time. The pointer ‘p’ still points to “x1->p”,
therefore its bound should be “x1->n”.

So, as long as the whole structure is set at the same time, it should be fine.
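
A rough model of the "old bound" reading, written without the attribute: the
bound is captured together with the pointer at the time of the load, so a later
whole-structure assignment does not change it. Here struct bounded_ptr, load_p
and in_bounds are hypothetical names for illustration and not anything the
sanitizer actually emits.

```c
#include <stddef.h>

struct foo {
  int n;
  char *p;    /* intent: p points to n elements */
};

/* Hypothetical pair of a pointer and the bound it had when it was loaded. */
struct bounded_ptr {
  char *ptr;
  int bound;
};

/* Read x->p and x->n together, as one snapshot of the structure. */
struct bounded_ptr load_p(const struct foo *x)
{
  struct bounded_ptr b = { x->p, x->n };
  return b;
}

/* Check an index against the bound captured at load time. */
int in_bounds(struct bounded_ptr b, int i)
{
  return i >= 0 && i < b.bound;
}
```

If the bound were instead re-read from x->n at each access, the copy of the
pointer taken from the first structure would be checked against the second
structure's count.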

Am I missing anything here?

Qing

>  (And I guess this is why you suggest to potentially
> change it below).
> 
> Martin
> 
>> 
>>> 
>>> 
 
>>>> 2. This new requirement is ONLY for “counted_by” attribute that is 
>>>> attached to the pointer field, not needed
>>>> for flexible array members.
>>>> 
>>>> 3. Then there will be inconsistency for the “counted_by” attribute between 
>>>> FAM and pointer field.
>>> 
>>> 
>>>> The major questions I have right now:
>>>> 
>>>> 1. Shall we keep this inconsistency between FAM and pointer field? 
>>>> Or, 
>>>> 
>>>> 2. Shall we keep them consistent by adding this new requirement for the 
>>>> pre

Re: Fix type compatibility for types with flexible array member [PR113688,PR114014,PR117724]

2024-12-02 Thread Qing Zhao


> On Dec 2, 2024, at 16:00, Martin Uecker  wrote:
> 
> Am Montag, dem 02.12.2024 um 16:31 + schrieb Qing Zhao:
>> 
>>> On Nov 30, 2024, at 07:10, Martin Uecker  wrote:
>>> 
>>> Am Dienstag, dem 26.11.2024 um 15:15 + schrieb Qing Zhao:
 
>>>>> On Nov 25, 2024, at 16:46, Martin Uecker  wrote:
>>>>> 
>>>>> 
>>>>> Hi Qing,
>>>>> 
>>>>> Am Montag, dem 25.11.2024 um 17:40 + schrieb Qing Zhao:
>>>>>> Hi, Martin,
>>>>>> 
>>>>>> I didn’t go through all the details of your patch.
>>>>>> 
>>>>>> But I have one question:
>>>>>> 
>>>>>> Did you consider the effect of the option -fstrict-flex-arrays 
>>>>>> (https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/C-Dialect-Options.html#index-fstrict-flex-arrays)
>>>>>>  on how gcc treats the zero size trailing array, 1-element trailing 
>>>>>> array as flexible array member in the patch?
>>>>> 
>>>>> I used the function which was already there which
>>>>> does not take this into account.  For the new version
>>>>> of the patch this should not matter anymore.
>>>> 
>>>> Why does it not matter anymore?
>>>> 
>>>> For the following testing case:
>>>> 
>>>> struct S{int x,y[1];}*a;
>>>> int main(void){
>>>> struct S{int x,y[];};
>>>> }
>>>> 
>>>> With your latest patch,  the two structures are considered as compatible 
>>>> with -g;
>>>> However, if we add -fstrict-flex-arrays=2 or -fstrict-flex-arrays=3,  the 
>>>> trailing array y[1] is NOT treated
>>>> as FAM anymore, as a result, these two structures are NOT compatible either. 
>>>> 
>>>> Am I missing anything obvious?
>>> 
>>> It is not about compatibility from a language semantic point of view
>>> but about TBAA-compatibility, which needs to be consistent with it but
>>> can (and must) be more general.
>>> 
>>> For TBAA, I think we want 
>>> 
>>> struct foo { int x; int y[]; };
>>> 
>>> to be TBAA-compatible to
>>> 
>>> struct foo { int x; int y[3]; };
>> 
>> Okay, I see now.  Thank you for the explanation.
>> (Now I also see this from the comments of the routine 
>> gimple_canonical_types_compatible_p -:)
>> 
>> 
>> Though, what confused me is the testing case in your patch:
>> 
>> diff --git a/gcc/testsuite/gcc.dg/pr114014.c 
>> b/gcc/testsuite/gcc.dg/pr114014.c
>> new file mode 100644
>> index 000..ab783f4f85d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr114014.c
>> @@ -0,0 +1,14 @@
>> +/* PR c/114014
>> + * { dg-do compile }
>> + * { dg-options "-std=c23 -g" } */
>> +
>> +struct r {
>> +  int a;
>> +  char b[];
>> +};
>> +struct r {
>> +  int a;
>> +  char b[0];
>> +}; /* { dg-error "redefinition" } */
>> +
>> +
>> 
>> Is the above testing case claiming that b[] and b[0] are compatible from a 
>> language semantic point of view?
> 
> It would test that we do not crash with checking.
> 
> Semantically, in c23 if you redeclare a type in the same scope then
> it must not only be compatible but is also not allowed to differ.
> So a redeclaration in the same scope has stricter requirements than
> compatibility (this is also true for typedefs, for example).

So, does “compatibility” here mean “compatibility from a language semantic 
point of view” or TBAA-compatibility? 
> 
> Whether we allow
> 
> struct r {
>  int a;
>  char b[];
> };
> 
> struct r {
>  int a;
>  char b[0];
> };
> 
> depends on us because the [0] is an extension.

[0] is an extension for representing a FAM ONLY when -fstrict-flex-arrays<3; when 
-fstrict-flex-arrays=3 is specified, [0] is NOT considered an extension for a FAM 
anymore. 
For [1], it is considered an extension for a FAM only when -fstrict-flex-arrays<2 
is specified.

So, I still think that we should consider -fstrict-flex-arrays and its impact on 
the GCC extensions [0] and [1]. 
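
For reference, the level rules above can be written down as a small predicate.
This is only a sketch of the documented -fstrict-flex-arrays behaviour;
treated_as_fam is a hypothetical name and not a function in GCC, and the
convention of passing -1 for "[]" is an assumption made for the example.

```c
#include <stdbool.h>

/* declared_len: -1 for a trailing "[]", otherwise the declared bound
   (0 for "[0]", 1 for "[1]", and so on).
   level: the value given to -fstrict-flex-arrays=<level>, 0 to 3.  */
bool treated_as_fam(int level, int declared_len)
{
  if (declared_len < 0)
    return true;                    /* "[]" is always a FAM */
  switch (level)
    {
    case 0:
      return true;                  /* any trailing array */
    case 1:
      return declared_len <= 1;     /* "[0]" and "[1]" */
    case 2:
      return declared_len == 0;     /* "[0]" only */
    default:
      return false;                 /* level 3: only "[]" */
    }
}
```

With this reading, the y[1] in the earlier test case stops being a FAM at
levels 2 and 3, and b[0] stops being one at level 3.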

Qing
>  I would make it
> compatible but not allow redefinition as the types are different.


> 
> 
> Martin
> 
> 
>> 
>> thanks.
>> 
>> Qing
>>> even when we do not treat the latter as FAM (i.e. still forbid
>>> out-of-bounds accesses).
>>> 
>>> E.g. see Richard's comment: 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114713#c2
>>> 
>>> 
>>> Martin
>>> 
>>>> Thanks.
>>>> 
>>>> Qing
>>>>> 
>>>>> Martin
>>>>> 
>>>>> 
>>>>>> 
>>>>>> thanks.
>>>>>> 
>>>>>> Qing
>>>>>>> On Nov 23, 2024, at 14:45, Martin Uecker  wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> This patch tries to fix the errors we have because of
>>>>>>> flexible array members.  I am a bit unsure about the exception
>>>>>>> for the mode. 
>>>>>>> 
>>>>>>> Bootstrapped and regression tested on x86_64.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Fix type compatibility for types with flexible array member 
>>>>>>> [PR113688,PR114014,PR117724]
>>>>>>> 
>>>>>>> verify_type checks the compatibility of TYPE_CANONICAL using
>>>>>>> gimple_canonical_types_compatible_p.   But it is stricter than what the
>>>>>>> C standard requires and therefore inconsistent with how TYPE_CANONICAL 
>>>>>>> is set
>>>>>>> in the C FE.  Here, the logic is changed to ignore array size when one 
>>>>>>> of the
>>>>>>> types is a flexible array member.  To not get errors because of 
>>>>>>> inconsistent
>>>>>>> n

Re: [PATCH] libstdc++: Simplify std::_Destroy using 'if constexpr'

2024-12-02 Thread Patrick Palka
On Thu, 28 Nov 2024, Jonathan Wakely wrote:

> This is another place where we can use 'if constexpr' to replace
> dispatching to a specialized class template, improving compile times and
> avoiding a function call.
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/bits/stl_construct.h (_Destroy(FwdIter, FwdIter)): Use
>   'if constexpr' instead of dispatching to a member function of a
>   class template.
>   (_Destroy_n(FwdIter, Size)): Likewise.
>   (_Destroy_aux, _Destroy_n_aux): Only define for C++98.
> ---
> 
> This seems worthwhile, as another small reduction in compile times,
> similar to a number of recent patches.

LGTM

> 
> Tested x86_64-linux.
> 
>  libstdc++-v3/include/bits/stl_construct.h | 33 ++-
>  1 file changed, 27 insertions(+), 6 deletions(-)
> 
> diff --git a/libstdc++-v3/include/bits/stl_construct.h 
> b/libstdc++-v3/include/bits/stl_construct.h
> index 9d6111396e1..6889a9bfa0e 100644
> --- a/libstdc++-v3/include/bits/stl_construct.h
> +++ b/libstdc++-v3/include/bits/stl_construct.h
> @@ -166,6 +166,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  #endif
>  }
>  
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wc++17-extensions" // for if-constexpr
> +
> +#if __cplusplus < 201103L
>template<bool>
>  struct _Destroy_aux
>  {
> @@ -185,6 +189,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  static void
>  __destroy(_ForwardIterator, _ForwardIterator) { }
>  };
> +#endif
>  
>/**
> * Destroy a range of objects.  If the value_type of the object has
> @@ -201,15 +206,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>// A deleted destructor is trivial, this ensures we reject such types:
>static_assert(is_destructible<_Value_type>::value,
>   "value type is destructible");
> -#endif
> +  if constexpr (!__has_trivial_destructor(_Value_type))
> + for (; __first != __last; ++__first)
> +   std::_Destroy(std::__addressof(*__first));
>  #if __cpp_constexpr_dynamic_alloc // >= C++20
> -  if (std::__is_constant_evaluated())
> - return std::_Destroy_aux<false>::__destroy(__first, __last);
> +  else if (std::__is_constant_evaluated())
> + for (; __first != __last; ++__first)
> +   std::destroy_at(std::__addressof(*__first));
>  #endif
> +#else
>std::_Destroy_aux<__has_trivial_destructor(_Value_type)>::
>   __destroy(__first, __last);
> +#endif
>  }
>  
> +#if __cplusplus < 201103L
>template<bool>
>  struct _Destroy_n_aux
>  {
> @@ -234,6 +245,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> return __first;
>   }
>  };
> +#endif
>  
>/**
> * Destroy a range of objects.  If the value_type of the object has
> @@ -250,14 +262,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>// A deleted destructor is trivial, this ensures we reject such types:
>static_assert(is_destructible<_Value_type>::value,
>   "value type is destructible");
> -#endif
> +  if constexpr (!__has_trivial_destructor(_Value_type))
> + for (; __count > 0; (void)++__first, --__count)
> +   std::_Destroy(std::__addressof(*__first));
>  #if __cpp_constexpr_dynamic_alloc // >= C++20
> -  if (std::__is_constant_evaluated())
> - return std::_Destroy_n_aux<false>::__destroy_n(__first, __count);
> +  else if (std::__is_constant_evaluated())
> + for (; __count > 0; (void)++__first, --__count)
> +   std::destroy_at(std::__addressof(*__first));
>  #endif
> +  else
> + std::advance(__first, __count);
> +  return __first;
> +#else
>return std::_Destroy_n_aux<__has_trivial_destructor(_Value_type)>::
>   __destroy_n(__first, __count);
> +#endif
>  }
> +#pragma GCC diagnostic pop
>  
>  #if __glibcxx_raw_memory_algorithms // >= C++17
>template 
> -- 
> 2.47.0
> 
> 



Re: [committed] Patches 8-12 of Mariam & Matevos's CRC optimization work

2024-12-02 Thread Jeff Law




On 12/2/24 7:02 AM, Jakub Jelinek wrote:
> On Sun, Dec 01, 2024 at 08:56:39AM -0700, Jeff Law wrote:
>> diff --git a/gcc/testsuite/gcc.dg/crc-side-instr-1.c 
>> b/gcc/testsuite/gcc.dg/crc-side-instr-1.c
>> new file mode 100644
>> index 000..69738d5c866
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/crc-side-instr-1.c
>> @@ -0,0 +1,27 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-fdump-tree-crc-details" } */
>> +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-flto" } } */
> 
> ...
> and similarly for all other crc-side-instr*.c
> These tests are clearly written for gcc.dg/torture/, but placed in gcc.dg/,
> where we don't cycle through different options and none of explicit -O0,
> -O1, -Os or -flto will be among the options, only -fdump-tree-crc-details
> will be and so it will be compiled without optimizations and
> all the scan-tree-dump directives UNRESOLVED because crc dump doesn't exist
> at -O0.

There were a few more with a different naming convention, but right now 
it looks like they'll all move into torture/ without triggering any new 
fallout.


jeff



Re: [committed] Patches 8-12 of Mariam & Matevos's CRC optimization work

2024-12-02 Thread Jakub Jelinek
On Mon, Dec 02, 2024 at 04:37:22PM -0700, Jeff Law wrote:
> > and similarly for all other crc-side-instr*.c
> > These tests are clearly written for gcc.dg/torture/, but placed in gcc.dg/,
> > where we don't cycle through different options and none of explicit -O0,
> > -O1, -Os or -flto will be among the options, only -fdump-tree-crc-details
> > will be and so it will be compiled without optimizations and
> > all the scan-tree-dump directives UNRESOLVED because crc dump doesn't exist
> > at -O0.
> There were a few more with a different naming convention, but right now it
> looks like they'll all move into torture/ without triggering any new
> fallout.

Yes,
+UNRESOLVED: gcc.dg/crc-linux-1.c scan-tree-dump crc "calculates CRC!"
+UNRESOLVED: gcc.dg/crc-linux-2.c scan-tree-dump crc "calculates CRC!"
+UNRESOLVED: gcc.dg/crc-linux-5.c scan-tree-dump crc "drm_dp_msg_data_crc4 
function maybe contains CRC calculation."
+UNRESOLVED: gcc.dg/crc-not-crc-15.c scan-tree-dump-times crc "calculates CRC!" 0
is what I see on i686-linux.

Jakub



Re: [PATCH] phiopt: don't handle the case cond edge dest is itself [PR117243]

2024-12-02 Thread Andrew Pinski
On Mon, Dec 2, 2024 at 2:05 PM Jakub Jelinek  wrote:
>
> On Mon, Dec 02, 2024 at 01:51:39PM -0800, Andrew Pinski wrote:
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.dg/torture/pr117243-1.c: New test.
> >   * gcc.dg/torture/pr117243-2.c: New test.
>
> Just commenting on the testcases.
> I don't like that 2 tests times 7 different command line options each spend
> some time in a busy loop uselessly.
> Can't you just make it dg-do compile and scan for optimized dump not having
> __builtin_unreachable calls?

I guess that would work better really. I will rewrite the testcases to do that.

Thanks,
Andrew

>
> Jakub
>


Re: [PATCH v3 4/7] Support for 64-bit location_t: libgdiagnostics parts

2024-12-02 Thread Lewis Hyatt
On Mon, Dec 02, 2024 at 07:35:12PM -0500, David Malcolm wrote:
> On Sun, 2024-12-01 at 19:44 -0500, Lewis Hyatt wrote:
> > This patch is new in v3 and is a small change to libgdiagnostics
> > similar to
> > other changes required by 64-bit location_t.
> > 
> > -- >8 --
> > 
> > Tweak libgdiagnostics.cc, which is necessarily sensitive to line-map
> > internals, to support 64-bit location_t as well.
> > 
> > gcc/ChangeLog:
> > 
> > * libgdiagnostics.cc (struct diagnostic_manager): Use
> > location_t(-1)
> > instead of UINT_MAX to support 64-bit location_t as well.
> > (diagnostic_manager::diagnostic_manager): Change hard-coded
> > "5" to
> > line_map_suggested_range_bits.
> > ---
> >  gcc/libgdiagnostics.cc | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/gcc/libgdiagnostics.cc b/gcc/libgdiagnostics.cc
> > index e5cee0958f9..53a8423f904 100644
> > --- a/gcc/libgdiagnostics.cc
> > +++ b/gcc/libgdiagnostics.cc
> > @@ -320,7 +320,7 @@ public:
> >  linemap_init (&m_line_table, BUILTINS_LOCATION);
> >  m_line_table.m_reallocator = xrealloc;
> >  m_line_table.m_round_alloc_size = round_alloc_size;
> > -    m_line_table.default_range_bits = 5;
> > +    m_line_table.default_range_bits = line_map_suggested_range_bits;
> 
> Is line_map_suggested_range_bits still a constant after the other
> patches in the kit?  If so, this patch is OK for trunk.
> 
> Thanks
> Dave

Thanks, yes it is just a constant. It is 5 on current master and would
change to 7 with these patches.
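
For anyone wondering why the UINT_MAX change matters, here is a minimal sketch
of the difference. Assumptions: unsigned int is 32 bits wide, and loc32_t and
loc64_t are stand-in typedefs for illustration, not the real location_t.

```c
#include <limits.h>
#include <stdint.h>

typedef uint32_t loc32_t;   /* stand-in for a 32-bit location_t */
typedef uint64_t loc64_t;   /* stand-in for a 64-bit location_t */

/* With a 32-bit type, UINT_MAX happens to be the largest location value. */
int uint_max_is_max_loc32(void)
{
  return (loc32_t) -1 == UINT_MAX;
}

/* With a 64-bit type it is not, while (loc64_t) -1 still is the maximum. */
int uint_max_is_max_loc64(void)
{
  return (loc64_t) -1 == UINT_MAX;
}
```

Spelling the sentinel as a cast of -1 to the location type keeps it at the
maximum value whichever width the type has.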

-Lewis


[PATCH v1] RISC-V: Fix incorrect optimization options passing to reduc and ternop

2024-12-02 Thread pan2 . li
From: Pan Li 

Like the strided load/store tests, the vector reduc and ternop testcases
are designed to pick up different sets of optimization options, but
according to the execution log in gcc.log these options are actually
ignored.

This patch fixes the option passing in the same way as was done for
strided load/store.

The test suites below passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious, so I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Fix the incorrect optimization
options passing to testcases.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp 
b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
index 87dea457608..65a57aa7913 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
+++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
@@ -71,9 +71,9 @@ foreach op $AUTOVEC_TEST_OPTS {
   dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/unop/*.\[cS\]]] \
 "" "$op"
   dg-runtest [lsort [glob -nocomplain 
$srcdir/$subdir/autovec/ternop/*.\[cS\]]] \
-"" "$op"
+"$op" ""
   dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/reduc/*.\[cS\]]] 
\
-"" "$op"
+"$op" ""
   dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/cond/*.\[cS\]]] \
 "$op" ""
   dg-runtest [lsort [glob -nocomplain 
$srcdir/$subdir/autovec/builtin/*.\[cS\]]] \
-- 
2.43.0



Re: Should -fsanitize=bounds support counted-by attribute for pointers inside a structure?

2024-12-02 Thread Martin Uecker
Am Montag, dem 02.12.2024 um 22:58 + schrieb Qing Zhao:
> 
> > On Dec 2, 2024, at 16:13, Martin Uecker  wrote:
> > 
> > Am Montag, dem 02.12.2024 um 20:15 + schrieb Qing Zhao:
> > > 
> > > > On Nov 30, 2024, at 07:22, Martin Uecker  wrote:
> > > > 
> > > > Am Dienstag, dem 26.11.2024 um 20:59 + schrieb Qing Zhao:
> > > > > Think this over these days, I have another thought that need some 
> > > > > feedback:
> > > > > 
> > > > > The major issue right now is:
> > > > > 
> > > > > 1. For the following structure in which the “counted_by” attributes 
> > > > > is attached to the pointer field.
> > > > > 
> > > > > struct foo {
> > > > > int n;
> > > > > char *p __attribute__ ((counted_by (n)));
> > > > > } *x;
> > > > 
> > > > BTW: I also do not like the syntax 'n' for a lookup of a member 
> > > > in the member namespace. I think it should be '.n'.  For FAM this 
> > > > is less problematic because it always at the end, but here it is 
> > > > problematic.
> > > During my implementation of extending this attribute to pointer fields of 
> > > structures, I didn’t notice any issue with the current syntax ‘n’ for 
> > > pointer fields so far, even when the field “n” is declared after the 
> > > corresponding pointer field, i.e.:
> > > 
> > > struct foo {
> > >  char *p __attribute__ ((counted_by (n)));
> > >  int n;
> > > };
> > > 
> > > So, could you please explain a little bit more about the potential 
> > > issue here?
> > 
> > My issue with it is that it is not consistent with how C's scoping
> > of identifiers works. In
> > 
> > constexpr int n = 3;
> > struct foo {
> >  char (*p)[n] __attribute__ ((counted_by (n)));
> >  int n;
> > };
> > 
> > the n in the type of p refers to the previous n in scope while
> > the n in your attribute would refer to the member.
> 
> Okay, I see your point here -:).
> 
> Yes, I agree that if the compiler can accept the “.n” syntax in general, then 
> the “counted_by” attribute can allow more complicated expressions in general. 
> 
> We had a similar discussion before the design and implementation of the 
> “counted_by” attribute on FAMs, and at that time we agreed to defer accepting 
> the “.n” syntax to a possible future language standard.
> 
> So, for the “counted_by” attribute on FAMs, the implementation searches all 
> the fields of the containing structure for “n” and locates that specific field. 
> 
> Now, when extending the “counted_by” attribute to pointer fields of structures, 
> the implementation is similar.

> > 
> > This is incoherent and confusing.  It becomes worse should
> > you ever want to allow more complicated expressions.
> 
> You are right, it’s hard to allow more complicated expressions for 
> “counted_by” based on the current design. 
> 
> If we agree to accept the “.n” syntax in GCC in general, that’s of course 
> better.
> Then how about the current “counted_by” for FAMs? Shall we also change it to 
> accept the “.n” syntax?

My recommendation would be to change it.  It is also not ideal for 
this case - only less problematic.

> > 
> > It would be clearer if you used the syntax ".n" which resembles
> > the syntax for designated initializers that is already used
> > in initializers to refer to struct members.
> > 
> > constexpr int n = 3;
> > struct foo {
> >  char (*p)[n] __attribute__ ((counted_by (.n)));
> >  int n;
> > };
> > 
> Yes, I agree. 
> > 
> > > 
> > > 
> > > > > 
> > > > > There is one important additional requirement:
> > > > > 
> > > > > x->n, x->p can ONLY be changed by changing the whole structure at the 
> > > > > same time. 
> > > > > Otherwise, x->n might not be consistent with x->p.
> > > > 
> > > > By itself, this would still not fix the issue I pointed out.
> > > > 
> > > > struct foo x;
> > > > x = .. ; // set the whole structure
> > > > char *p = x->p;
> > > > x = ... ; // set the whole structure
> > > > 
> > > > What is the bound for 'p' ?  
> > > 
> > > Since p was set to the pointer field of the old structure, then the bound 
> > > of it should be the old bound. 
> > > > With current rules it would be the old bound.
> > > 
> > > I thought that this should be the correct behavior, isn’t it?
> > 
> > Yes, sorry, what I meant was "with the current rules it would be
> > the *new* bound”.
> 
> struct foo x;
> x=… ;  // set the whole structure 1
> char *p = x->p;
> x=… ;  // set the whole structure 2
> 
> In the above, at “set the whole structure 1”, x1, x1->n and x1->p are set 
> at the same time.
> After “char *p = x->p;”, the pointer “p” points to “x1->p”, and its bound is 
> “x1->n”.

I agree.
> 
> Then at “set the whole structure 2”, x2 is different from x1, and x2->n and 
> x2->p are set at the same time. The pointer ‘p’ still points to “x1->p”, 
> therefore its bound should be “x1->n”. 
> 
> So, as long as the whole structure is set at the same time, it should be fine. 
> 
> Am I missing anything here?

I was talking about the pointer "p" which w

Re: [PATCH] RSIC-V: Fix ICE for unrecognizable insn `UNSPEC_VSETVL` for XTheadVector

2024-12-02 Thread Jin Ma
Hi, Jeff

I am very sorry that I took so long to reply because I was ill and hospitalized.

> > +
> > +  /* Since the parameter vl of XTheadVector does not support
> > + immediate numbers, we need to put it in the register
> > + in advance.  */
> > +  if (TARGET_XTHEADVECTOR
> > +  && CONST_INT_P (x)
> > +  && base->apply_vl_p ()
> > +  && argno == (unsigned) (call_expr_nargs (exp) - 1)
> > +  && !rtx_equal_p (x, const0_rtx))
> Last condition is better written as
> x != CONST0_RTX (GET_MODE (x))

Ok

> 
> > +{
> > +  rtx tmp = gen_reg_rtx (Pmode);
> > +  /* Use UNSPEC to avoid being optimized before vsetvl pass.  */
> > +  emit_insn (gen_th_pred_vl_mov (Pmode, tmp, x));
> Pmode seems wrong.  word_mode would likely be better.  That would mean
> some adjustment to your new insn.
> 
> Additionally, I'd like to understand better why you can't just
>   tmp = force_reg (word_mode, x);
> 
> Can you explain in more detail what you're trying to avoid?  ie, RTL
> before/after the problematical optimization?  It feels like you're
> papering over a bigger problem using the UNSPEC.

In fact, we insert the vsetvl instruction in the "vsetvl" pass. Before that
pass runs, there are many opportunities for the mov instruction we need to be
eliminated, for example in the "combine" pass or the "reload" pass
(curr_insn_transform), which eventually leaves vl in the vector pattern as an
immediate instead of a register. The "vsetvl" pass would then generate only
"vsetivli", which is obviously not what we expect. So we need a special
mov with UNSPEC to keep it from being optimized away before the "vsetvl" pass.

> 
> Can you also resubmit with the RSIC-V in the subject line fixed to
> RISC-V that way the pre-commit tester will pick it up.

Ok

> Thanks,
> 
> Jeff

BR
Jin


[PATCH v2] RISC-V: Fix ICE for unrecognizable insn `UNSPEC_VSETVL` for XTheadVector

2024-12-02 Thread Jin Ma
Since XTheadVector does not support vsetivli, vl needs to be put into
a register during the expand phase.

PR 116593

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc 
(function_expander::add_input_operand):
Put const to GPR for vl.
* config/riscv/thead-vector.md (@th_pred_vl_mov): New.

gcc/testsuite/ChangeLog:

* g++.target/riscv/xtheadvector/pr116593.C: New test.
* g++.target/riscv/xtheadvector/xtheadvector.exp: New test.

Reported-by: nihui 
---
 gcc/config/riscv/riscv-vector-builtins.cc | 18 +++-
 gcc/config/riscv/thead-vector.md  | 13 ++
 .../g++.target/riscv/xtheadvector/pr116593.C  | 45 +++
 .../riscv/xtheadvector/xtheadvector.exp   | 37 +++
 4 files changed, 112 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/xtheadvector/pr116593.C
 create mode 100644 gcc/testsuite/g++.target/riscv/xtheadvector/xtheadvector.exp

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index b9b9d33adab6..cced0461a7bb 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4089,7 +4089,23 @@ function_expander::add_input_operand (unsigned argno)
 {
   tree arg = CALL_EXPR_ARG (exp, argno);
   rtx x = expand_normal (arg);
-  add_input_operand (TYPE_MODE (TREE_TYPE (arg)), x);
+
+  /* Since the parameter vl of XTheadVector does not support
+ immediate numbers, we need to put it in the register
+ in advance.  */
+  if (TARGET_XTHEADVECTOR
+  && CONST_INT_P (x)
+  && base->apply_vl_p ()
+  && argno == (unsigned) (call_expr_nargs (exp) - 1)
+  && x != CONST0_RTX (GET_MODE (x)))
+{
+  rtx tmp = gen_reg_rtx (word_mode);
+  /* Use UNSPEC to avoid being optimized before vsetvl pass.  */
+  emit_insn (gen_th_pred_vl_mov (word_mode, tmp, x));
+  add_input_operand (TYPE_MODE (TREE_TYPE (arg)), tmp);
+}
+  else
+add_input_operand (TYPE_MODE (TREE_TYPE (arg)), x);
 }
 
 /* Since we may normalize vop/vop_tu/vop_m/vop_tumu.. into a single patter.
diff --git a/gcc/config/riscv/thead-vector.md b/gcc/config/riscv/thead-vector.md
index 5fe9ba08c4eb..0e00514c6b2d 100644
--- a/gcc/config/riscv/thead-vector.md
+++ b/gcc/config/riscv/thead-vector.md
@@ -25,6 +25,7 @@ (define_c_enum "unspec" [
   UNSPEC_TH_VSUXW
 
   UNSPEC_TH_VWLDST
+  UNSPEC_TH_VL_MOV
 ])
 
 (define_int_iterator UNSPEC_TH_VLMEM_OP [
@@ -93,6 +94,18 @@ (define_int_iterator UNSPEC_TH_VSXMEM_OP [
 (define_mode_iterator V_VLS_VT [V VLS VT])
 (define_mode_iterator V_VB_VLS_VT [V VB VLS VT])
 
+(define_insn_and_split "@th_pred_vl_mov"
+  [(set (match_operand:P 0 "register_operand""=r")
+   (unspec:P
+ [(match_operand:P 1 "const_int_operand" " i")]
+   UNSPEC_TH_VL_MOV))]
+  "TARGET_XTHEADVECTOR"
+  "li\t%0,%1"
+  "&& epilogue_completed"
+  [(set (match_dup 0) (match_dup 1))]
+  {}
+  [(set_attr "type" "arith")])
+
 (define_split
   [(set (match_operand:V_VB_VLS_VT 0 "reg_or_mem_operand")
(match_operand:V_VB_VLS_VT 1 "reg_or_mem_operand"))]
diff --git a/gcc/testsuite/g++.target/riscv/xtheadvector/pr116593.C 
b/gcc/testsuite/g++.target/riscv/xtheadvector/pr116593.C
new file mode 100644
index ..e44e7437ad70
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/xtheadvector/pr116593.C
@@ -0,0 +1,45 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zfh_xtheadvector -mabi=ilp32d -O2" { target { 
rv32 } } } */
+/* { dg-options "-march=rv64gc_zfh_xtheadvector -mabi=lp64d -O2" { target { 
rv64 } } } */
+
+#include 
+#include 
+#include 
+
+static vfloat32m8_t atan2_ps(vfloat32m8_t a, vfloat32m8_t b, size_t vl)
+{
+  std::vector<float> tmpx(vl);
+  std::vector<float> tmpy(vl);
+  __riscv_vse32_v_f32m8(tmpx.data(), a, vl);
+  __riscv_vse32_v_f32m8(tmpy.data(), b, vl);
+  for (size_t i = 0; i < vl; i++)
+  {
+tmpx[i] = atan2(tmpx[i], tmpy[i]);
+  }
+  return __riscv_vle32_v_f32m8(tmpx.data(), vl);
+}
+
+void atan2(const float *x, const float *y, float *out, int size, int ch)
+{
+  for (int i = 0; i < ch; i++)
+  {
+const float *xx = x + size * i;
+const float *yy = y + size * i;
+float *zz = out + size * i;
+
+int n = size;
+while (n > 0)
+{
+  size_t vl = __riscv_vsetvl_e32m8(n);
+  vfloat32m8_t _xx = __riscv_vle32_v_f32m8(xx, vl);
+  vfloat32m8_t _yy = __riscv_vle32_v_f32m8(yy, vl);
+  vfloat32m8_t _zz = atan2_ps(_xx, _yy, vl);
+  __riscv_vse32_v_f32m8(zz, _zz, vl);
+  n -= vl;
+  xx += vl;
+  yy += vl;
+  zz += vl;
+}
+  }
+}
diff --git a/gcc/testsuite/g++.target/riscv/xtheadvector/xtheadvector.exp 
b/gcc/testsuite/g++.target/riscv/xtheadvector/xtheadvector.exp
new file mode 100644
index ..551fd9c92670
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/xtheadvector/xtheadvector.exp
@@ -0,0 +1,37 @@
+# Copyright (C) 2023-2024 Free Software Found

[committed] libstdc++: Disable deprecated warnings for std::rel_ops in std.cc

2024-12-02 Thread Jonathan Wakely
This avoids some warnings when building the std module.

libstdc++-v3/ChangeLog:

* src/c++23/std.cc.in: Disable deprecated warnings when
exporting std::rel_ops members.
---

Pushed to trunk.

 libstdc++-v3/src/c++23/std.cc.in | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libstdc++-v3/src/c++23/std.cc.in b/libstdc++-v3/src/c++23/std.cc.in
index 16e66c3d921..7a0ff8edad6 100644
--- a/libstdc++-v3/src/c++23/std.cc.in
+++ b/libstdc++-v3/src/c++23/std.cc.in
@@ -3151,6 +3151,8 @@ export namespace std
   using std::piecewise_construct_t;
   using std::tuple_element;
   using std::tuple_size;
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
   namespace rel_ops
   {
 using std::rel_ops::operator!=;
@@ -3158,6 +3160,7 @@ export namespace std
 using std::rel_ops::operator<=;
 using std::rel_ops::operator>=;
   }
+#pragma GCC diagnostic pop
 #if __cpp_lib_unreachable
   using std::unreachable;
 #endif
-- 
2.47.0



Re: [patch, libgfortran] PR117820

2024-12-02 Thread Paul Richard Thomas
Hi Jerry,

That's fine for trunk and, after a decent interval, I would suggest that
you backport to 14-branch.

Please add the name of the contributor to the testcase unless you have been
asked not to.

Thanks

Paul


On Tue, 3 Dec 2024 at 04:13, Jerry D  wrote:

> Hi all,
>
> Attached patch adds a test for zero that is needed for write_boz to work
> correctly. Almost obvious.
>
> Regression tested on x86_64.
>
> Ok for trunk?
>
> Jerry
>
> Author: Jerry DeLisle 
> Date:   Mon Dec 2 19:45:26 2024 -0800
>
>  Fortran: Fix B64.0 formatted write output.
>
>  PR fortran/117820
>
>  libgfortran/ChangeLog:
>
>  * io/write.c (write_b): Add test for zero needed by write_boz.
>
>  gcc/testsuite/ChangeLog:
>
>  * gfortran.dg/pr117820.f90: New test.
>
>


Re: Fix type compatibility for types with flexible array member [PR113688,PR114014,PR117724]

2024-12-02 Thread Martin Uecker
On Monday, 02.12.2024 at 22:33 +0000, Qing Zhao wrote:



> > > 
> > > diff --git a/gcc/testsuite/gcc.dg/pr114014.c 
> > > b/gcc/testsuite/gcc.dg/pr114014.c
> > > new file mode 100644
> > > index 000..ab783f4f85d
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/pr114014.c
> > > @@ -0,0 +1,14 @@
> > > +/* PR c/114014
> > > + * { dg-do compile }
> > > + * { dg-options "-std=c23 -g" } */
> > > +
> > > +struct r {
> > > +  int a;
> > > +  char b[];
> > > +};
> > > +struct r {
> > > +  int a;
> > > +  char b[0];
> > > +}; /* { dg-error "redefinition" } */
> > > +
> > > +
> > > 
> > > Is the above testing case claiming that b[] and b[0] are compatible from 
> > > a language semantic point of view?
> > 
> > It would test that we do not crash with checking.
> > 
> > Semantically, in C23 if you redeclare a type in the same scope then
> > it must not only be compatible but is also not allowed to differ.
> > So a redeclaration in the same scope has stricter requirements than
> > compatibility (this is also true for typedefs, for example).
> 
> So, does “compatibility” here mean “compatibility from a language
> semantic point of view” or TBAA compatibility?

When we talk about what is allowed to be used and what warnings/errors
appear in the test, it is about language semantics.   What made the
compiler crash was related to the TBAA semantics.

> > 
> > Whether we allow
> > 
> > struct r {
> >  int a;
> >  char b[];
> > };
> > 
> > struct r {
> >  int a;
> >  char b[0];
> > };
> > 
> > depends on us because the [0] is an extension.
> 
> [0] is an extension for representing a FAM ONLY when -fstrict-flex-arrays<3; when
> -fstrict-flex-arrays=3 is specified, [0] is NOT considered an extension for
> FAM anymore.
> For [1], it is considered an extension for FAM only when
> -fstrict-flex-arrays<2 is specified.
> 
> So, I still think that we should consider -fstrict-flex-arrays and its impact
> on the GCC extensions [0] and [1].

I do not understand what you mean by "consider
-fstrict-flex-arrays".

Independently of this option, the types are always different
types, e.g. sizeof(x->b) would still not be allowed for the
first type because x->b is incomplete but allowed for the
second, and it would return 1 if b were declared as char b[1].
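
A minimal sketch of the distinction drawn above, assuming three hypothetical struct names (r_fam, r_zero and r_one do not appear in the patch). It only illustrates that a flexible array member has an incomplete type, while the [0] and [1] forms are complete types of size 0 and 1; it says nothing about how the C front end decides compatibility.

```c
#include <stddef.h>

/* Standard C flexible array member: b has an incomplete array type.  */
struct r_fam { int a; char b[]; };

/* One-element trailing array: b is a complete type of size 1.  */
struct r_one { int a; char b[1]; };

#ifdef __GNUC__
/* Zero-length array (GNU extension): b is a complete type of size 0.  */
struct r_zero { int a; char b[0]; };

size_t size_of_b_zero (void)
{
  return sizeof (((struct r_zero *) 0)->b);
}
#endif

/* sizeof is fine here because char[1] is a complete type.  */
size_t size_of_b_one (void)
{
  return sizeof (((struct r_one *) 0)->b);
}

/* sizeof (((struct r_fam *) 0)->b) would be rejected because the type is
   incomplete, but the member's offset is still well defined.  */
size_t offset_of_b_fam (void)
{
  return offsetof (struct r_fam, b);
}
```

All three layouts place b at the same offset, which is why the question is about the types and not about the layout.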


Martin

> 
> Qing
> >  I would make it
> > compatible but not allow redefinition as the types are different.
> 
> 
> > 
> > 
> > Martin
> > 
> > 
> > > 
> > > thanks.
> > > 
> > > Qing
> > > > even when we do not treat the latter as FAM (i.e. still forbid
> > > > out-of-bounds accesses).
> > > > 
> > > > E.g. see Richard's comment: 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114713#c2
> > > > 
> > > > 
> > > > Martin
> > > > 
> > > > > Thanks.
> > > > > 
> > > > > Qing
> > > > > > 
> > > > > > Martin
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > thanks.
> > > > > > > 
> > > > > > > Qing
> > > > > > > > On Nov 23, 2024, at 14:45, Martin Uecker  
> > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > This patch tries to fix the errors we have because of
> > > > > > > > flexible array members.  I am a bit unsure about the exception
> > > > > > > > for the mode.
> > > > > > > > 
> > > > > > > > Bootstrapped and regression tested on x86_64.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Fix type compatibility for types with flexible array member 
> > > > > > > > [PR113688,PR114014,PR117724]
> > > > > > > > 
> > > > > > > > verify_type checks the compatibility of TYPE_CANONICAL using
> > > > > > > > gimple_canonical_types_compatible_p.  But it is stricter than what the
> > > > > > > > C standard requires and therefore inconsistent with how TYPE_CANONICAL is set
> > > > > > > > in the C FE.  Here, the logic is changed to ignore array size when one of the
> > > > > > > > types is a flexible array member.  To not get errors because of inconsistent
> > > > > > > > number of members, zero-sized arrays are not ignored anymore when checking
> > > > > > > > fields of a struct (which is stricter than what was done before).
> > > > > > > > Finally, an exception is added that allows the TYPE_MODE of a type with
> > > > > > > > flexible array member to differ from another compatible type.
> > > > > > > > 
> > > > > > > > PR c/113688
> > > > > > > > PR c/114014
> > > > > > > > PR c/117724
> > > > > > > > 
> > > > > > > > gcc/ChangeLog:
> > > > > > > > * tree.cc (gimple_canonical_types_compatible_p): Revise
> > > > > > > > logic for types with FAM.
> > > > > > > > (verify_type): Add exception for mode for types with 
> > > > > > > > FAM.
> > > > > > > > 
> > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > > * gcc.dg/pr113688.c: New test.
> > > > > > > > * gcc.dg/pr114014.c: New test.
> > > > > > > > * gcc.dg/pr117724.c: New test.
> > > > > > > > 
> > > 

Re: [PATCH v1] RISC-V: Fix incorrect optimization options passing to cond and builtin

2024-12-02 Thread Kito Cheng
LGTM

On Mon, Dec 2, 2024 at 22:25  wrote:

> From: Pan Li 
>
> Like the strided load/store, the testcases of vector cond and builtin are
> designed to pick up different sorts of optimization options, but these
> options are actually ignored according to the execution log in gcc.log.
> This patch corrects that in almost the same way as the fix
> for strided load/store.
>
> The below test suites are passed for this patch.
> * The rv64gcv fully regression test.
>
> It is a test-only patch and obvious up to a point; I will commit it
> directly if there are no comments in the next 48 hours.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/rvv.exp: Fix the incorrect optimization
> options passing to testcases.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> index 87c5ecb1a8b..87dea457608 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> @@ -75,9 +75,9 @@ foreach op $AUTOVEC_TEST_OPTS {
>dg-runtest [lsort [glob -nocomplain
> $srcdir/$subdir/autovec/reduc/*.\[cS\]]] \
>  "" "$op"
>dg-runtest [lsort [glob -nocomplain
> $srcdir/$subdir/autovec/cond/*.\[cS\]]] \
> -"" "$op"
> +"$op" ""
>dg-runtest [lsort [glob -nocomplain
> $srcdir/$subdir/autovec/builtin/*.\[cS\]]] \
> -"" "$op"
> +"$op" ""
>  }
>
>  # widening operation only test on LMUL < 8
> --
> 2.43.0
>
>


Re: [PATCH v3 4/7] Support for 64-bit location_t: libgdiagnostics parts

2024-12-02 Thread David Malcolm
On Sun, 2024-12-01 at 19:44 -0500, Lewis Hyatt wrote:
> This patch is new in v3 and is a small change to libgdiagnostics
> similar to
> other changes required by 64-bit location_t.
> 
> -- >8 --
> 
> Tweak libgdiagnostics.cc, which is necessarily sensitive to line-map
> internals, to support 64-bit location_t as well.
> 
> gcc/ChangeLog:
> 
>   * libgdiagnostics.cc (struct diagnostic_manager): Use
> location_t(-1)
>   instead of UINT_MAX to support 64-bit location_t as well.
>   (diagnostic_manager::diagnostic_manager): Change hard-coded
> "5" to
>   line_map_suggested_range_bits.
> ---
>  gcc/libgdiagnostics.cc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/libgdiagnostics.cc b/gcc/libgdiagnostics.cc
> index e5cee0958f9..53a8423f904 100644
> --- a/gcc/libgdiagnostics.cc
> +++ b/gcc/libgdiagnostics.cc
> @@ -320,7 +320,7 @@ public:
>  linemap_init (&m_line_table, BUILTINS_LOCATION);
>  m_line_table.m_reallocator = xrealloc;
>  m_line_table.m_round_alloc_size = round_alloc_size;
> -    m_line_table.default_range_bits = 5;
> +    m_line_table.default_range_bits = line_map_suggested_range_bits;

Is line_map_suggested_range_bits still a constant after the other
patches in the kit?  If so, this patch is OK for trunk.

Thanks
Dave


>    }
>    ~diagnostic_manager ()
>    {
> @@ -500,7 +500,7 @@ private:
>    impl_client_version_info m_client_version_info;
>    std::vector> m_sinks;
>    hash_map m_str_to_file_map;
> -  hash_map,
> +  hash_map,
>      diagnostic_physical_location *> m_location_t_map;
>    std::vector>
> m_logical_locs;
>    const diagnostic *m_current_diag;
> 
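
A minimal sketch of the ChangeLog's point about location_t(-1) versus UINT_MAX, assuming two hypothetical typedefs (loc32_t and loc64_t) as stand-ins for a 32-bit and a 64-bit location_t. It is written in C and is not the libgdiagnostics code: casting -1 to an unsigned type yields that type's all-ones value at any width, whereas UINT_MAX stays tied to unsigned int.

```c
#include <limits.h>
#include <stdint.h>

typedef uint32_t loc32_t;  /* hypothetical stand-in for a 32-bit location_t */
typedef uint64_t loc64_t;  /* hypothetical stand-in for a 64-bit location_t */

/* Casting -1 gives the all-ones value of whatever width the type has.  */
loc32_t max_loc32 (void) { return (loc32_t) -1; }
loc64_t max_loc64 (void) { return (loc64_t) -1; }

/* UINT_MAX belongs to unsigned int, so widening it leaves the upper bits
   zero whenever unsigned int is narrower than 64 bits.  */
loc64_t uint_max_as_loc64 (void) { return (loc64_t) UINT_MAX; }
```

On targets where unsigned int is 32 bits wide, uint_max_as_loc64 () is therefore not the largest 64-bit value, which is the case the (-1) spelling avoids.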



New Chinese (simplified) PO file for 'gcc' (version 14.2.0)

2024-12-02 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Chinese (simplified) team of translators.  The file is available at:

https://translationproject.org/latest/gcc/zh_CN.po

(This file, 'gcc-14.2.0.zh_CN.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH] AIX Build failure with default -std=gnu23.

2024-12-02 Thread Eric Gallager
On Mon, Dec 2, 2024 at 1:01 PM swamy sangamesh
 wrote:
>
> Dear Community,
>
> Please let me know your comments.
> Or is it more appropriate to guard the change with a header guard like this?
>

I personally think it's better to just remove the define, but if
you're going to leave it in and guard it with a macro instead, I'd use
something a bit more specific than just "_AIX".

> --- a/libiberty/getopt.c
> +++ b/libiberty/getopt.c
> @@ -25,9 +25,11 @@
>  ^L
>  /* This tells Alpha OSF/1 not to define a getopt prototype in .
> Ditto for AIX 3.2 and .  */
> +#ifndef _AIX
>  #ifndef _NO_PROTO
>  # define _NO_PROTO
>  #endif
> +#endif
>
>  #ifdef HAVE_CONFIG_H
>  # include 
>
>
> Thanks,
> Sangamesh
>
>
> On Thu, Nov 28, 2024 at 11:09 AM Sangamesh Mallayya 
>  wrote:
>>
>>  libiberty/getopt.c file is defining _NO_PROTO which causes conflicting
>>  declarations for the functions in AIX header files like stdio.h & stdlib.h.
>>  These declarations are being considered as errors in C23 which wasn't
>>  the case with C17.
>>
>> Here is the error we get.
>>
>> /gcc_build/./prev-gcc/xgcc -B/gcc_build/./prev-gcc/ 
>> -B/home/sangam/install/GCC/powerpc-ibm-aix7.3.3.0/bin/ -B/home/sangam
>> /install/GCC/powerpc-ibm-aix7.3.3.0/bin/ 
>> -B/home/sangam/install/GCC/powerpc-ibm-aix7.3.3.0/lib/ -isystem 
>> /home/sangam/ins
>> tall/GCC/powerpc-ibm-aix7.3.3.0/include -isystem 
>> /home/sangam/install/GCC/powerpc-ibm-aix7.3.3.0/sys-include   -fno-check
>> ing -c -DHAVE_CONFIG_H -g -O2 -fno-checking  -I. 
>> -I/opt/freeware/src/packages/BUILD/gcc/libiberty/../include  -W -Wall -W
>> write-strings -Wc++-compat -Wstrict-prototypes -Wshadow=local -pedantic  
>> -D_GNU_SOURCE  /opt/freeware/src/packages/BUILD/
>> gcc/libiberty/getopt.c -o getopt.o
>>
>>
>> In file included from 
>> /opt/freeware/src/packages/BUILD/gcc/libiberty/getopt.c:45:
>> /gcc_build/prev-gcc/include-fixed/stdio.h:593:12: error: conflicting types 
>> for 'fgetpos64'; have 'int(FILE *, fpos64_t *)
>> ' {aka 'int(FILE *, long long int *)'}
>>   593 | extern int fgetpos64(FILE *, fpos64_t *);
>>   |^
>> /gcc_build/prev-gcc/include-fixed/stdio.h:298:17: note: previous declaration 
>> of 'fgetpos64' with type 'int(void)'
>>   298 | extern int  fgetpos();
>>   | ^~~
>> /gcc_build/prev-gcc/include-fixed/stdio.h:594:14: error: conflicting types 
>> for 'fopen64'; have 'FILE *(const char *, cons
>> t char *)'
>>   594 | extern FILE *fopen64(const char *, const char *);
>>   |  ^~~
>>
>> /gcc_build/prev-gcc/include-fixed/stdio.h:259:17: note: previous declaration 
>> of 'fopen64' with type 'FILE *(void)'
>>   259 | extern FILE *   fopen();
>>   | ^
>> /gcc_build/prev-gcc/include-fixed/stdio.h:595:14: error: conflicting types 
>> for 'freopen64'; have 'FILE *(const char *, co
>> nst char *, FILE *)'
>>   595 | extern FILE *freopen64(const char *, const char *, FILE *);
>>   |  ^
>> /gcc_build/prev-gcc/include-fixed/stdio.h:260:17: note: previous declaration 
>> of 'freopen64' with type 'FILE *(void)'
>>   260 | extern FILE *   freopen();
>>   | ^~~
>> /gcc_build/prev-gcc/include-fixed/stdio.h:597:12: error: conflicting types 
>> for 'fsetpos64'; have 'int(FILE *, const fpos6
>> 4_t *)' {aka 'int(FILE *, const long long int *)'}
>>   597 | extern int fsetpos64(FILE *, const fpos64_t *);
>>   |^
>> /gcc_build/prev-gcc/include-fixed/stdio.h:300:17: note: previous declaration 
>> of 'fsetpos64' with type 'int(void)'
>>   300 | extern int  fsetpos();
>>   | ^~~
>> In file included from 
>> /opt/freeware/src/packages/BUILD/gcc/libiberty/getopt.c:216:
>> /gcc_build/prev-gcc/include-fixed/stdlib.h: In function 'strtold':
>> /gcc_build/prev-gcc/include-fixed/stdlib.h:233:30: error: too many arguments 
>> to function 'strtod'
>>
>>
>> Compiled with this patch on RHEL8.10 ppc64le as well.
>>
>> ---
>>  libiberty/getopt.c | 6 --
>>  1 file changed, 6 deletions(-)
>>
>> diff --git a/libiberty/getopt.c b/libiberty/getopt.c
>> index 2f7086cc0c8..48736d4db41 100644
>> --- a/libiberty/getopt.c
>> +++ b/libiberty/getopt.c
>> @@ -23,12 +23,6 @@
>> Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA 
>> 02110-1301,
>> USA.  */
>>
>> -/* This tells Alpha OSF/1 not to define a getopt prototype in .
>> -   Ditto for AIX 3.2 and .  */
>> -#ifndef _NO_PROTO
>> -# define _NO_PROTO
>> -#endif
>> -
>>  #ifdef HAVE_CONFIG_H
>>  # include 
>>  #endif
>> --
>> 2.41.0
>>
>
>
> --
> Thanks & Regards,
> Sangamesh


[PATCH] MAINTAINERS: Add myself to write after approval

2024-12-02 Thread Yury Khrustalev
ChangeLog:

* MAINTAINERS: Add myself to write after approval.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 26455d1cabf..6851affb6cb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -573,6 +573,7 @@ Filip Kastl pheeck  

 Geoffrey Keatinggeoffk  
 Brendan Kehoe   -   
 Richard Kenner  kenner  
+Yury Khrustalev ykhrustalev 
 Andi Kleen  ak  
 Matthias Klose  doko
 Jeff Knaggs -   
-- 
2.39.5



Why is 3x2^96 defined as 59421121885698253195157962752 instead of 237684487542793012780631851008?

2024-12-02 Thread 蒋力夫 Lifu JIANG
In the following two files:
gcc/libquadmath/math/expq.c
glibc/blob/master/sysdeps/ieee754/ldbl-128/e_expl.c


static const __float128 C[] = {
/* Smallest integer x for which e^x overflows.  */
#define himark C[0]
 11356.523406294143949491931077970765Q,

/* Largest integer x for which e^x underflows.  */
#define lomark C[1]
-11433.4627433362978788372438434526231Q,

/* 3x2^96 */
#define THREEp96 C[2]
 237684487542793012780631851008.0Q,

/* 3x2^103 */
#define THREEp103 C[3]
 30423614405477505635920876929024.0Q,

/* 3x2^111 */
#define THREEp111 C[4]
 7788445287802241442795744493830144.0Q,


Why is 3x2^96 defined as 59421121885698253195157962752 instead of 
237684487542793012780631851008?
3x2^94=59421121885698253195157962752,
3x2^96=237684487542793012780631851008,
3x2^103=30423614405477505635920876929024,
3x2^111=7788445287802241442795744493830144.
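
The arithmetic in the question can be checked with a short sketch; three_times_pow2 is a hypothetical helper and is not taken from expq.c or e_expl.c. The values are exact in double precision because 3 * 2^n has only two significant bits.

```c
/* Returns 3 * 2^exp as a double.  Each doubling is exact, so the result is
   exact as long as exp stays well below the double exponent limit.  */
double three_times_pow2 (int exp)
{
  double v = 3.0;
  for (int i = 0; i < exp; i++)
    v *= 2.0;
  return v;
}
```

Comparing the results for 94 and 96 against the two decimal constants confirms the values listed above; it does not explain why the source names the constant THREEp96.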



[PATCH] testsuite: Fix CRC testcases

2024-12-02 Thread Bohan Lei
Hi all,

The latest CRC optimization patches include some testcases that do not
work well.  Some testcases in gcc/testsuite/gcc.dg lead to UNRESOLVED
results when testing without an explicit -O flag.  Other testcases in
gcc/testsuite/gcc.target/riscv do not work when testing with RV32
-march/-mabi options on riscv64* compilers, and vice versa.
This patch tries to deal with the aforementioned issues.

Thanks,
Bohan
---
 gcc/testsuite/gcc.dg/crc-linux-1.c | 2 +-
 gcc/testsuite/gcc.dg/crc-linux-2.c | 2 +-
 gcc/testsuite/gcc.dg/crc-linux-4.c | 2 +-
 gcc/testsuite/gcc.dg/crc-linux-5.c | 2 +-
 gcc/testsuite/gcc.dg/crc-not-crc-15.c  | 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-1.c| 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-10.c   | 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-11.c   | 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-12.c   | 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-13.c   | 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-14.c   | 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-15.c   | 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-16.c   | 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-17.c   | 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-2.c| 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-3.c| 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-4.c| 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-5.c| 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-6.c| 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-7.c| 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-8.c| 2 +-
 gcc/testsuite/gcc.dg/crc-side-instr-9.c| 2 +-
 gcc/testsuite/gcc.target/riscv/crc-builtin-zbc32.c | 4 ++--
 gcc/testsuite/gcc.target/riscv/crc-builtin-zbc64.c | 4 ++--
 24 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/crc-linux-1.c 
b/gcc/testsuite/gcc.dg/crc-linux-1.c
index 918b423a583..3261ba48b8b 100644
--- a/gcc/testsuite/gcc.dg/crc-linux-1.c
+++ b/gcc/testsuite/gcc.dg/crc-linux-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fdump-tree-crc-details -w" } */
+/* { dg-options "-O2 -fdump-tree-crc-details -w" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-O3" "-flto" } } */
 
 #include 
diff --git a/gcc/testsuite/gcc.dg/crc-linux-2.c 
b/gcc/testsuite/gcc.dg/crc-linux-2.c
index 990a28cef6f..156d9986aaf 100644
--- a/gcc/testsuite/gcc.dg/crc-linux-2.c
+++ b/gcc/testsuite/gcc.dg/crc-linux-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fdump-tree-crc-details" } */
+/* { dg-options "-O2 -fdump-tree-crc-details" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-flto" } } */
 
 #include 
diff --git a/gcc/testsuite/gcc.dg/crc-linux-4.c 
b/gcc/testsuite/gcc.dg/crc-linux-4.c
index 50cbbba49d2..81f35ae2bc5 100644
--- a/gcc/testsuite/gcc.dg/crc-linux-4.c
+++ b/gcc/testsuite/gcc.dg/crc-linux-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fdump-tree-crc-details -w" } */
+/* { dg-options "-O2 -fdump-tree-crc-details -w" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-flto" } } */
 
 /* We don't detect, it's optimized to branch-less CRC.  */
diff --git a/gcc/testsuite/gcc.dg/crc-linux-5.c 
b/gcc/testsuite/gcc.dg/crc-linux-5.c
index ff3cc25fb66..873257daa8c 100644
--- a/gcc/testsuite/gcc.dg/crc-linux-5.c
+++ b/gcc/testsuite/gcc.dg/crc-linux-5.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fdump-tree-crc-details" } */
+/* { dg-options "-O2 -fdump-tree-crc-details" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-flto" } } */
 
 #include 
diff --git a/gcc/testsuite/gcc.dg/crc-not-crc-15.c 
b/gcc/testsuite/gcc.dg/crc-not-crc-15.c
index cf7993a2c49..05b46ae68da 100644
--- a/gcc/testsuite/gcc.dg/crc-not-crc-15.c
+++ b/gcc/testsuite/gcc.dg/crc-not-crc-15.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fdump-tree-crc" } */
+/* { dg-options "-O2 -fdump-tree-crc" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-O3" "-flto" } } */
 
 /* With -O3 the cycles is split into 2,
diff --git a/gcc/testsuite/gcc.dg/crc-side-instr-1.c 
b/gcc/testsuite/gcc.dg/crc-side-instr-1.c
index 69738d5c866..d9977d9865f 100644
--- a/gcc/testsuite/gcc.dg/crc-side-instr-1.c
+++ b/gcc/testsuite/gcc.dg/crc-side-instr-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fdump-tree-crc-details" } */
+/* { dg-options "-O2 -fdump-tree-crc-details" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-flto" } } */
 
 #include 
diff --git a/gcc/testsuite/gcc.dg/crc-side-instr-10.c 
b/gcc/testsuite/gcc.dg/crc-side-instr-10.c
index 765572cf9b7..3c807aec9db 100644
--- a/gcc/testsuite/gcc.dg/crc-side-instr-10.c
+++ b/gcc/testsuite/gcc.dg/crc-side-instr-10.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fdump-tree-crc-details" } */
+/* { dg-options "-O2 -fdump-tree-crc-details" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-f

Re: [PATCH] libstdc++: Use hidden friends for __normal_iterator operators

2024-12-02 Thread Patrick Palka
On Mon, 2 Dec 2024, Jonathan Wakely wrote:

> On Mon, 2 Dec 2024 at 17:42, Patrick Palka  wrote:
> >
> > On Mon, 2 Dec 2024, Patrick Palka wrote:
> >
> > > On Thu, 28 Nov 2024, Jonathan Wakely wrote:
> > >
> > > > As suggested by Jason, this makes all __normal_iterator operators into
> > > > friends so they can be found by ADL and don't need to be separately
> > > > exported in module std.
> > >
> > > Might as well remove the __gnu_cxx exports in std.cc.in while we're at
> > > it?
> 
> Ah yes, since that was the original purpose of the change!
> 
> > > >
> > > > For the operator<=> comparing two iterators of the same type, I had to
> > > > use a deduced return type and add a requires-clause, because it's no
> > > > longer a template and so we no longer get substitution failures when
> > > > it's considered in oerload resolution.
> > > >
> > > > I also had to reorder the __attribute__((always_inline)) and
> > > > [[nodiscard]] attributes, which have to be in a particular order when
> > > > used on friend functions.
> > > >
> > > > libstdc++-v3/ChangeLog:
> > > >
> > > > * include/bits/stl_iterator.h (__normal_iterator): Make all
> > > > non-member operators hidden friends.
> > > > * src/c++11/string-inst.cc: Remove explicit instantiations of
> > > > operators that are no longer templates.
> > > > ---
> > > >
> > > > Tested x86_64-linux.
> > > >
> > > > This iterator type isn't defined in the standard, and users shouldn't be
> > > > doing funny things with it, so nothing prevents us from replacing its
> > > > operators with hidden friends.
> > > >
> > > >  libstdc++-v3/include/bits/stl_iterator.h | 341 ---
> > > >  libstdc++-v3/src/c++11/string-inst.cc|  11 -
> > > >  2 files changed, 184 insertions(+), 168 deletions(-)
> > > >
> > > > diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
> > > > b/libstdc++-v3/include/bits/stl_iterator.h
> > > > index e872598d7d8..656a47e5f76 100644
> > > > --- a/libstdc++-v3/include/bits/stl_iterator.h
> > > > +++ b/libstdc++-v3/include/bits/stl_iterator.h
> > > > @@ -1164,188 +1164,215 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > > >const _Iterator&
> > > >base() const _GLIBCXX_NOEXCEPT
> > > >{ return _M_current; }
> > > > -};
> > > >
> > > > -  // Note: In what follows, the left- and right-hand-side iterators are
> > > > -  // allowed to vary in types (conceptually in cv-qualification) so 
> > > > that
> > > > -  // comparison between cv-qualified and non-cv-qualified iterators be
> > > > -  // valid.  However, the greedy and unfriendly operators in 
> > > > std::rel_ops
> > > > -  // will make overload resolution ambiguous (when in scope) if we 
> > > > don't
> > > > -  // provide overloads whose operands are of the same type.  Can 
> > > > someone
> > > > -  // remind me what generic programming is about? -- Gaby
> > > > +private:
> > > > +  // Note: In what follows, the left- and right-hand-side 
> > > > iterators are
> > > > +  // allowed to vary in types (conceptually in cv-qualification) 
> > > > so that
> > > > +  // comparison between cv-qualified and non-cv-qualified 
> > > > iterators be
> > > > +  // valid.  However, the greedy and unfriendly operators in 
> > > > std::rel_ops
> > > > +  // will make overload resolution ambiguous (when in scope) if we 
> > > > don't
> > > > +  // provide overloads whose operands are of the same type.  Can 
> > > > someone
> > > > +  // remind me what generic programming is about? -- Gaby
> > > >
> > > >  #ifdef __cpp_lib_three_way_comparison
> > > > -  template > > > _Container>
> > > > -[[nodiscard, __gnu__::__always_inline__]]
> > > > -constexpr bool
> > > > -operator==(const __normal_iterator<_IteratorL, _Container>& __lhs,
> > > > -  const __normal_iterator<_IteratorR, _Container>& __rhs)
> > > > -noexcept(noexcept(__lhs.base() == __rhs.base()))
> > > > -requires requires {
> > > > -  { __lhs.base() == __rhs.base() } -> std::convertible_to;
> > > > -}
> > > > -{ return __lhs.base() == __rhs.base(); }
> > > > +  template
> > > > +   [[nodiscard, __gnu__::__always_inline__]]
> > > > +   friend
> > > > +   constexpr bool
> > > > +   operator==(const __normal_iterator& __lhs,
> > > > +  const __normal_iterator<_Iter, _Container>& __rhs)
> > > > +   noexcept(noexcept(__lhs.base() == __rhs.base()))
> > > > +   requires requires {
> > > > + { __lhs.base() == __rhs.base() } -> std::convertible_to;
> > > > +   }
> > > > +   { return __lhs.base() == __rhs.base(); }
> > > >
> > > > -  template > > > _Container>
> > > > -[[nodiscard, __gnu__::__always_inline__]]
> > > > -constexpr std::__detail::__synth3way_t<_IteratorR, _IteratorL>
> > > > -operator<=>(const __normal_iterator<_IteratorL, _Container>& __lhs,
> > > > -   const __normal_iterator<_IteratorR, _Container>& __rhs)
> > > > -noexcept(noexcept(std::__detail::__synth3way(__lhs.base(), 
> > > > __rhs.

Re: [committed] Patches 8-12 of Mariam & Matevos's CRC optimization work

2024-12-02 Thread Jeff Law




On 12/2/24 4:41 PM, Jakub Jelinek wrote:

On Mon, Dec 02, 2024 at 04:37:22PM -0700, Jeff Law wrote:

and similarly for all other crc-side-instr*.c
These tests are clearly written for gcc.dg/torture/, but placed in gcc.dg/,
where we don't cycle through different options and none of explicit -O0,
-O1, -Os or -flto will be among the options, only -fdump-tree-crc-details
will be and so it will be compiled without optimizations and
all the scan-tree-dump directives UNRESOLVED because crc dump doesn't exist
at -O0.

There were a few more with a different naming convention, but right now it
looks like they'll all move into torture/ without triggering any new
fallout.


Yes,
+UNRESOLVED: gcc.dg/crc-linux-1.c scan-tree-dump crc "calculates CRC!"
+UNRESOLVED: gcc.dg/crc-linux-2.c scan-tree-dump crc "calculates CRC!"
+UNRESOLVED: gcc.dg/crc-linux-5.c scan-tree-dump crc "drm_dp_msg_data_crc4 function 
maybe contains CRC calculation."
+UNRESOLVED: gcc.dg/crc-not-crc-15.c scan-tree-dump-times crc "calculates CRC!" 0
is what I see on i686-linux.

Same ones I found with a grep.
jeff
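
A sketch of what a test written for gcc.dg/torture/ looks like, assuming a hypothetical CRC-8 loop (crc8_update is not one of the committed testcases). The directives are the ones quoted in this thread: the torture harness supplies the optimization levels and dg-skip-if filters them, so no -O flag appears in dg-options. Whether the CRC pass recognises this particular loop is not claimed here.

```c
/* { dg-do compile } */
/* { dg-options "-fdump-tree-crc-details" } */
/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-flto" } } */

#include <stdint.h>

/* Hypothetical bitwise CRC-8 step: polynomial 0x07, most significant
   bit first, one input byte per call.  */
uint8_t crc8_update (uint8_t crc, uint8_t data)
{
  crc ^= data;
  for (int i = 0; i < 8; i++)
    crc = (crc & 0x80) ? (uint8_t) ((crc << 1) ^ 0x07)
		       : (uint8_t) (crc << 1);
  return crc;
}
```

Placed in plain gcc.dg/, the same file would be compiled once with only the dg-options flags, i.e. without optimization, which matches the UNRESOLVED results listed above.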



[patch, libgfortran] PR117820

2024-12-02 Thread Jerry D

Hi all,

Attached patch adds a test for zero that is needed for write_boz to work 
correctly. Almost obvious.


Regression tested on x86_64.

Ok for trunk?

Jerry

Author: Jerry DeLisle 
Date:   Mon Dec 2 19:45:26 2024 -0800

Fortran: Fix B64.0 formatted write output.

PR fortran/117820

libgfortran/ChangeLog:

* io/write.c (write_b): Add test for zero needed by write_boz.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr117820.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/pr117820.f90 b/gcc/testsuite/gfortran.dg/pr117820.f90
new file mode 100644
index 000..e873393f2ef
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr117820.f90
@@ -0,0 +1,12 @@
+! { dg-do run }
+! { dg-options "-std=gnu" }
+! see pr117820, -std=gnu for this test needed to avoid a warning with -pedantic
+! on line 9 below.
+program test
+  integer(8) :: x
+  character(80) :: output
+  output = "garbage"
+  x = -huge(x) - 1_8
+  write(output, '("<",B64.0,">")') x
+  if (output .ne. "<1000000000000000000000000000000000000000000000000000000000000000>") stop 1
+end program
diff --git a/libgfortran/io/write.c b/libgfortran/io/write.c
index 2f414c6b57d..ccb2b5cb810 100644
--- a/libgfortran/io/write.c
+++ b/libgfortran/io/write.c
@@ -1392,6 +1392,10 @@ write_b (st_parameter_dt *dtp, const fnode *f, const char *source, int len)
 {
   n = extract_uint (source, len);
   p = btoa (n, itoa_buf, sizeof (itoa_buf));
+
+  /* Test for zero. Needed by write_boz.  */
+  if (n != 0)
+   n = 1;
   write_boz (dtp, f, p, n, len);
 }
 }
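
A minimal sketch of why a zero test can matter for the value in the testcase, under the assumption (not stated in the patch) that the 64-bit value reaches write_boz through a narrower int parameter that is only used as a zero/non-zero indicator. Both helper names are hypothetical. For x = -huge(x) - 1 only bit 63 is set, so the low 32 bits are all zero.

```c
#include <stdint.h>

/* Models what a 32-bit parameter still sees of a 64-bit value: only the
   low 32 bits take part in the "is it zero?" decision (assumption).  */
int looks_nonzero_after_truncation (uint64_t n)
{
  return (uint32_t) n != 0;
}

/* Models the patched approach: collapse the value to 0 or 1 first, so
   narrowing can no longer lose the information.  */
int looks_nonzero_after_normalizing (uint64_t n)
{
  if (n != 0)
    n = 1;
  return (uint32_t) n != 0;
}
```

With n = 2^63 the first helper reports zero and the second reports non-zero, which is the difference the new test in write_b is there to make.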


Re: [PATCH] testsuite: Fix CRC testcases

2024-12-02 Thread Xi Ruoyao
On Tue, 2024-12-03 at 15:23 +0800, Bohan Lei wrote:
> 
> diff --git a/gcc/testsuite/gcc.dg/crc-linux-1.c
> b/gcc/testsuite/gcc.dg/crc-linux-1.c
> index 918b423a583..3261ba48b8b 100644
> --- a/gcc/testsuite/gcc.dg/crc-linux-1.c
> +++ b/gcc/testsuite/gcc.dg/crc-linux-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-fdump-tree-crc-details -w" } */
> +/* { dg-options "-O2 -fdump-tree-crc-details -w" } */
>  /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Os" "-O3" "-flto" } } */

No, the current consensus is to fix the issue by moving them from
gcc.dg to gcc.dg/torture.  See
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670619.html.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University