Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-04 Thread Alejandro Colomar
Hi Martin,

On Sun, Aug 04, 2024 at 07:38:49AM GMT, Martin Uecker wrote:
> Am Sonntag, dem 04.08.2024 um 01:17 +0200 schrieb Alejandro Colomar:
> > 
> > FUTURE DIRECTIONS:
> > 
> > We could make it work with array parameters to functions, and
> > somehow magically return the length designator of the array,
> > regardless of it being really a pointer.
> 
> And maybe flexible array members with "counted_by" attribute.

Hmmm; I didn't know that one.  Indeed.  I'll have a look at implementing
that in this patch set.

> > +
> > +/* Implement the lengthof keyword: Return the length of an array,
> > +   that is, the number of elements in the array.  */
> > +
> > +tree
> > +c_lengthof_type (location_t loc, tree type)
> > +{
> > +  enum tree_code type_code;
> > +
> > +  type_code = TREE_CODE (type);
> > +  if (!COMPLETE_TYPE_P (type))
> > +{
> > +  error_at (loc,
> > +   "invalid application of % to incomplete type %qT",
> > +   type);
> > +  return error_mark_node;
> > +}
> > +  if (type_code != ARRAY_TYPE)
> > +{
> > +  error_at (loc, "invalid application of % to type %qT", 
> > type);
> > +  return error_mark_node;
> > +}
> 
> I would swap these two errors, because the first is more specific and
> less helpful if you pass an incomplete struct, where it would be better
> to get the second error.

Agree.

BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
within array_type_nelts_minus_one().  What code triggers that condition?
Am I missing error handling for that?  Thanks!

Have a lovely day!
Alex

> 
> Martin
> 
> > +
> > +  return array_type_nelts_top (type);
> > +}
> 

-- 



signature.asc
Description: PGP signature


Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-04 Thread Martin Uecker
Am Sonntag, dem 04.08.2024 um 10:25 +0200 schrieb Alejandro Colomar:
> Hi Martin,
> 
> On Sun, Aug 04, 2024 at 07:38:49AM GMT, Martin Uecker wrote:
> > Am Sonntag, dem 04.08.2024 um 01:17 +0200 schrieb Alejandro Colomar:
> > > 
> > > FUTURE DIRECTIONS:
> > > 
> > >   We could make it work with array parameters to functions, and
> > >   somehow magically return the length designator of the array,
> > >   regardless of it being really a pointer.
> > 
> > And maybe flexible array members with "counted_by" attribute.
> 
> Hmmm; I didn't know that one.  Indeed.  I'll have a look at implementing
> that in this patch set.

But maybe wait for this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

> 
> > > +
> > > +/* Implement the lengthof keyword: Return the length of an array,
> > > +   that is, the number of elements in the array.  */
> > > +
> > > +tree
> > > +c_lengthof_type (location_t loc, tree type)
> > > +{
> > > +  enum tree_code type_code;
> > > +
> > > +  type_code = TREE_CODE (type);
> > > +  if (!COMPLETE_TYPE_P (type))
> > > +{
> > > +  error_at (loc,
> > > + "invalid application of % to incomplete type %qT",
> > > + type);
> > > +  return error_mark_node;
> > > +}
> > > +  if (type_code != ARRAY_TYPE)
> > > +{
> > > +  error_at (loc, "invalid application of % to type %qT", 
> > > type);
> > > +  return error_mark_node;
> > > +}
> > 
> > I would swap these two errors, because the first is more specific and
> > less helpful if you pass an incomplete struct, where it would be better
> > to get the second error.
> 
> Agree.
> 
> BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> within array_type_nelts_minus_one().  What code triggers that condition?
> Am I missing error handling for that?  Thanks!

For incomplete arrays, basically we have the following different
variants for arrays:

T[ ] incomplete: !TYPE_DOMAIN 
T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
  (ISO version T[0] has TYPE_SIZE == NULL_TREE)
T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE

The first should give an error. The next two should work giving an
integer constant expression or run-time size, respectively. 
The ISO FAM case should also give an error, while the GNU fam case
could return zero to be consistent with sizeof (not sure). The last 
case should return a non-constant.

If you reuse the sizeof code, it should be mostly correct, but
maybe the last case needs to be revisted. In the following
examples

void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);

the array '*x' should be a regular fixed size array in foo
but a VLA in 'bar'.


Martin







[pushed] wwwdocs: gcc-14: Tweak a word around GNAT Storage Models

2024-08-04 Thread Gerald Pfeifer
"Performances" would be the choice in case of theater performances or the 
like; the generic, uncountable "performance" seems more appropriate here.

Pushed.

Gerald


When not countable, use performance instead of performances.
---
 htdocs/gcc-14/changes.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index daa09217..62a279b1 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -298,7 +298,7 @@ You may also want to check out our
 
   https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gnat_rm/Pragma-Storage_005fModel.html";>Storage
   Model: this feature proposes to redesign the concepts of Storage 
Pools
-into a more efficient model allowing higher performances and easier
+into a more efficient model allowing higher performance and easier
 integration with low footprint embedded run-times.
   https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gnat_rm/String-interpolation.html";>String
   Interpolation: allows for easier string formatting.
-- 
2.45.2


[pushed] wwwdocs: gcc-15: Mark up static_assert as

2024-08-04 Thread Gerald Pfeifer
A minor markup change.

Pushed.

Gerald

---
 htdocs/gcc-15/changes.html | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index 2fd7aa90..bfd98496 100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs/gcc-15/changes.html
@@ -75,9 +75,9 @@ a work-in-progress.
 C++
 
 
-   Inline assembler statements now support
-   https://gcc.gnu.org/onlinedocs/gcc/asm-constexprs.html";>constexpr
 generated strings,
- analoguous to static_assert.
+  Inline assembler statements now support
+https://gcc.gnu.org/onlinedocs/gcc/asm-constexprs.html";>constexpr
 generated strings,
+analoguous to static_assert.
   
 
 Qualified name lookup failure into the current instantiation, e.g.
-- 
2.45.2


[PATCH v1] Match: Add type_has_mode_precision_p check for SAT_TRUNC [PR116202]

2024-08-04 Thread pan2 . li
From: Pan Li 

The .SAT_TRUNC matching can only perform the type has its mode
precision.

g_12 = (long unsigned int) _2;
_13 = MIN_EXPR ;
_3 = (_Bool) _13;

The above pattern cannot be recog as .SAT_TRUNC (g_12) because the dest
only has 1 bit precision but QImode.  Aka the type doesn't have the mode
precision.  Thus,  add the type_has_mode_precision_p for the dest to
avoid such case.

The below tests are passed for this patch.
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.

PR target/116202

gcc/ChangeLog:

* match.pd: Add type_has_mode_precision_p for the dest type
of the .SAT_TRUNC matching.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr116202-run-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/match.pd  |  6 +++--
 .../riscv/rvv/base/pr116202-run-1.c   | 24 +++
 2 files changed, 28 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index c9c8478d286..dfa0bba3908 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3283,7 +3283,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
wide_int trunc_max = wi::mask (otype_precision, false, itype_precision);
wide_int int_cst = wi::to_wide (@1, itype_precision);
   }
-  (if (otype_precision < itype_precision && wi::eq_p (trunc_max, int_cst))
+  (if (type_has_mode_precision_p (type) && otype_precision < itype_precision
+   && wi::eq_p (trunc_max, int_cst))
 
 /* Unsigned saturation truncate, case 2, sizeof (WT) > sizeof (NT).
SAT_U_TRUNC = (NT)(MIN_EXPR (X, 255)).  */
@@ -3309,7 +3310,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
wide_int trunc_max = wi::mask (otype_precision, false, itype_precision);
wide_int int_cst = wi::to_wide (@1, itype_precision);
   }
-  (if (otype_precision < itype_precision && wi::eq_p (trunc_max, int_cst))
+  (if (type_has_mode_precision_p (type) && otype_precision < itype_precision
+   && wi::eq_p (trunc_max, int_cst))
 
 /* x >  y  &&  x != XXX_MIN  -->  x > y
x >  y  &&  x == XXX_MIN  -->  false . */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
new file mode 100644
index 000..d150f20b5d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -march=rv64gcv_zvl256b -fdump-rtl-expand-details" } */
+
+int b[24];
+_Bool c[24];
+
+int main() {
+  for (int f = 0; f < 4; ++f)
+b[f] = 6;
+
+  for (int f = 0; f < 24; f += 4)
+c[f] = ({
+  int g = ({
+unsigned long g = -b[f];
+1 < g ? 1 : g;
+  });
+  g;
+});
+
+  if (c[0] != 1)
+__builtin_abort ();
+}
+
+/* { dg-final { scan-rtl-dump-not ".SAT_TRUNC " "expand" } } */
-- 
2.43.0



[PATCH v2] RISC-V: Support IMM for operand 0 of ussub pattern

2024-08-04 Thread pan2 . li
From: Pan Li 

This patch would like to allow IMM for the operand 0 of ussub pattern.
Aka .SAT_SUB(1023, y) as the below example.

Form 1:
  #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
  T __attribute__((noinline)) \
  sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
  {   \
return (T)IMM >= y ? (T)IMM - y : 0;  \
  }

DEF_SAT_U_SUB_IMM_FMT_1(uint64_t, 1023)

Before this patch:
  10   │ sat_u_sub_imm82_uint64_t_fmt_1:
  11   │ li  a5,82
  12   │ bgtua0,a5,.L3
  13   │ sub a0,a5,a0
  14   │ ret
  15   │ .L3:
  16   │ li  a0,0
  17   │ ret

After this patch:
  10   │ sat_u_sub_imm82_uint64_t_fmt_1:
  11   │ li  a5,82
  12   │ sltua4,a5,a0
  13   │ addia4,a4,-1
  14   │ sub a0,a5,a0
  15   │ and a0,a4,a0
  16   │ ret

The below test suites are passed for this patch:
1. The rv64gcv fully regression test.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_gen_unsigned_xmode_reg): Add new
func impl to gen xmode rtx reg from operand rtx.
(riscv_expand_ussub): Gen xmode reg for operand 1.
* config/riscv/riscv.md: Allow const_int for operand 1.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macro.
* gcc.target/riscv/sat_u_sub_imm-1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-1_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-1_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-4.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-4.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 51 -
 gcc/config/riscv/riscv.md |  2 +-
 gcc/testsuite/gcc.target/riscv/sat_arith.h| 10 
 .../gcc.target/riscv/sat_u_sub_imm-1.c| 20 +++
 .../gcc.target/riscv/sat_u_sub_imm-1_1.c  | 20 +++
 .../gcc.target/riscv/sat_u_sub_imm-1_2.c  | 20 +++
 .../gcc.target/riscv/sat_u_sub_imm-2.c| 21 +++
 .../gcc.target/riscv/sat_u_sub_imm-2_1.c  | 21 +++
 .../gcc.target/riscv/sat_u_sub_imm-2_2.c  | 22 
 .../gcc.target/riscv/sat_u_sub_imm-3.c| 20 +++
 .../gcc.target/riscv/sat_u_sub_imm-3_1.c  | 21 +++
 .../gcc.target/riscv/sat_u_sub_imm-3_2.c  | 22 
 .../gcc.target/riscv/sat_u_sub_imm-4.c| 19 +++
 .../gcc.target/riscv/sat_u_sub_imm-run-1.c| 56 +++
 .../gcc.target/riscv/sat_u_sub_imm-run-2.c| 56 +++
 .../gcc.target/riscv/sat_u_sub_imm-run-3.c| 55 ++
 .../gcc.target/riscv/sat_u_sub_imm-run-4.c| 48 
 17 files changed, 482 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-1_1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-1_2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-2_1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-2_2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-3_1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-3_2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-run-4.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index b19d56149e7..5e4e9722729 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -11612,6 +11612,55 @@ riscv_expand_usadd (rtx dest, rtx x, rtx y)
   emit_move_insn (dest, gen_lowpart (mode, xmode_dest));
 }
 
+/* Generate a REG rtx of Xmode from the given rtx and mode.
+   The rtx x can be REG (QI/HI/SI/DI) or const_int.
+   The machine_mode mode is the original mode from define pattern.
+
+   If rtx is REG,  the gen_lowpart of Xmode will be returned.
+
+   If rtx is const_int,  a new REG rtx will be created to hold the value of
+   const_int and then returned.
+
+   According to the gccint doc, the constants generated for modes with fewer
+   bits than in HOST_WIDE_IN

Re: [PATCH] Make may_trap_p_1 return false for constant pool references [PR116145]

2024-08-04 Thread Jeff Law




On 8/2/24 2:23 PM, Andrew Pinski wrote:

On Wed, Jul 31, 2024 at 9:41 AM Richard Sandiford
 wrote:


The testcase contains the constant:

   arr2 = svreinterpret_u8(svdup_u32(0x0a0d5c3f));

which was initially hoisted by hand, but which gimple optimisers later
propagated to each use (as expected).  The constant was then expanded
as a load-and-duplicate from the constant pool.  Normally that load
should then be hoisted back out of the loop, but may_trap_or_fault_p
stopped that from happening in this case.

The code responsible was:

   if (/* MEM_NOTRAP_P only relates to the actual position of the memory
  reference; moving it out of context such as when moving code
  when optimizing, might cause its address to become invalid.  */
   code_changed
   || !MEM_NOTRAP_P (x))
 {
   poly_int64 size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : -1;
   return rtx_addr_can_trap_p_1 (XEXP (x, 0), 0, size,
 GET_MODE (x), code_changed);
 }

where code_changed is true.  (Arguably it doesn't need to be true in
this case, if we inserted invariants on the preheader edge, but it
would still need to be true for conditionally executed loads.)

Normally this wouldn't be a problem, since rtx_addr_can_trap_p_1
would recognise that the address refers to the constant pool.
However, the SVE load-and-replicate instructions have a limited
offset range, so it isn't possible for them to have a LO_SUM address.
All we have is a plain pseudo base register.

MEM_READONLY_P is defined as:

/* 1 if RTX is a mem that is statically allocated in read-only memory.  */
   (RTL_FLAG_CHECK1 ("MEM_READONLY_P", (RTX), MEM)->unchanging)

and so I think it should be safe to move memory references if both
MEM_READONLY_P and MEM_NOTRAP_P are true.

The testcase isn't a minimal reproducer, but I think it's good
to have a realistic full routine in the testsuite.

Bootstrapped & regression-tested on aarch64-linux-gnu.  OK to install?



This is breaking the build on a few targets (x86_64 and powerpc64le so
far reported, see PR 116200).

 From what I can tell is that it is treating `(plus:DI (ashift:DI
(reg:DI 0 ax [690]) (const_int 3 [0x3]))  (label_ref:DI 1620))` as not
trapping and allowing it to be moved before the check of ax being in
the range [0..2] and we have eax being (unsigned long)(unsigned int)-9
in value. So we get a bogus address which will trap. I put my findings
in PR 116200 too.
I think it's the root cause of the x86_64 libgomp failures on the trunk 
as well.  I haven't done any debugging beyond that.


jeff


Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-04 Thread Alejandro Colomar
Hi Martin,

On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> Am Sonntag, dem 04.08.2024 um 10:25 +0200 schrieb Alejandro Colomar:
> > Hi Martin,
> > 
> > On Sun, Aug 04, 2024 at 07:38:49AM GMT, Martin Uecker wrote:
> > > Am Sonntag, dem 04.08.2024 um 01:17 +0200 schrieb Alejandro Colomar:
> > > > 
> > > > FUTURE DIRECTIONS:
> > > > 
> > > > We could make it work with array parameters to functions, and
> > > > somehow magically return the length designator of the array,
> > > > regardless of it being really a pointer.
> > > 
> > > And maybe flexible array members with "counted_by" attribute.
> > 
> > Hmmm; I didn't know that one.  Indeed.  I'll have a look at implementing
> > that in this patch set.
> 
> But maybe wait for this:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

Maybe; if I don't find a way to implement it now, their patches may give
me some tool to do it.  I'll still try to do it now, just to try.

I've drafted something by making build_counted_by_ref() an extern
function, and then I wrote

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 9f5feb83345..875d471a1f4 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -4078,6 +4078,7 @@ c_alignof_expr (location_t loc, tree expr)
 tree
 c_lengthof_type (location_t loc, tree type)
 {
+  tree counted_by;
   enum tree_code type_code;
 
   type_code = TREE_CODE (type);
@@ -4086,6 +4087,12 @@ c_lengthof_type (location_t loc, tree type)
   error_at (loc, "invalid application of % to type 
%qT", type);
   return error_mark_node;
 }
+
+  counted_by = lookup_attribute ("counted_by", DECL_ATTRIBUTES (type));
+  if (counted_by)
+// XXX: How to build the counted_by ref without the parent struct 
type?
+return build_counted_by_ref (NULL, type, counted_by);
+
   if (!COMPLETE_TYPE_P (type))
 {
   error_at (loc,

But since I don't have the struct to which the FAM belongs, I don't know
how to get that working.  Do you have any idea?  Or maybe it's better to
just wait for those patches to be merged, and reuse their
infrastructure?


> > > > +
> > > > +/* Implement the lengthof keyword: Return the length of an array,
> > > > +   that is, the number of elements in the array.  */
> > > > +
> > > > +tree
> > > > +c_lengthof_type (location_t loc, tree type)
> > > > +{
> > > > +  enum tree_code type_code;
> > > > +
> > > > +  type_code = TREE_CODE (type);
> > > > +  if (!COMPLETE_TYPE_P (type))
> > > > +{
> > > > +  error_at (loc,
> > > > +   "invalid application of % to incomplete type 
> > > > %qT",
> > > > +   type);
> > > > +  return error_mark_node;
> > > > +}
> > > > +  if (type_code != ARRAY_TYPE)
> > > > +{
> > > > +  error_at (loc, "invalid application of % to type 
> > > > %qT", type);
> > > > +  return error_mark_node;
> > > > +}
> > > 
> > > I would swap these two errors, because the first is more specific and
> > > less helpful if you pass an incomplete struct, where it would be better
> > > to get the second error.
> > 
> > Agree.
> > 
> > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > within array_type_nelts_minus_one().  What code triggers that condition?
> > Am I missing error handling for that?  Thanks!
> 
> For incomplete arrays, basically we have the following different
> variants for arrays:

Thanks!  This list is very useful.

> T[ ] incomplete: !TYPE_DOMAIN 
> T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST

I guess that old flexible-array members using [1] are understood as
normal arrays [42], right?

> T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
>   (ISO version T[0] has TYPE_SIZE == NULL_TREE)

ISO version is T[], right?  Did ISO add support for zero-sized arrays?

> T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE

This is not allowed aoutside of function-prototype scope.  And there it
decays to pointer way before reaching __lengthof__.  So we can't handle
that at the moment.  However, we'll have to keep it in mind for when you
do the change to keep the array types of function prototypes.  When that
happens, I guess we'll have to reject these arrays.

> 
> The first should give an error.

Agree (and already implemented []).

> The next two should work giving an
> integer constant expression or run-time size, respectively. 

Agree (and already implemented [42] and [n]).

> The ISO FAM case should also give an error,

Agree (and already implemented []).  (Although I didn't really
distinguish it from an incomplete type.)

Although, if attributed with counted_by, we'd like it to work.  But this
is not yet implemented.

> while the GNU fam case
> could retur

Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-04 Thread Alejandro Colomar
On Sun, Aug 04, 2024 at 06:40:14PM GMT, Alejandro Colomar wrote:
> > The last 
> > case should return a non-constant.
> 
> The last case [*] is only allowed in prototypes.  How should we get the
> non-constant value?  It's just another way to say [], isn't it?
> 
> > If you reuse the sizeof code, it should be mostly correct, but
> > maybe the last case needs to be revisted. In the following
> > examples
> > 
> > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > 
> > the array '*x' should be a regular fixed size array in foo
> > but a VLA in 'bar'.
> 
> In the function prototype, it seems to be completely ignoring
> __lengthof__, just as it ignores any expression, so I don't know if it's
> working (I could try to print some debugging values to stderr from the
> compiler, if we care about it).

Huh, no, my bad.  It must be using the lengthof value.  It needs to
match pointers to arrays of a given size.  I'll test this.

> 
>   $ cat muecker.h 
>   void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
>   void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
>   void f(char (*a)[3][*], int (*x)[sizeof(*a)]);
>   void b(char (*a)[*][3], int (*x)[sizeof(*a)]);
>   $ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.h -S
>   $
> 
> I assume the code above is not reaching my code.

-- 



signature.asc
Description: PGP signature


Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-04 Thread Martin Uecker


Hi Alex,

Am Sonntag, dem 04.08.2024 um 18:40 +0200 schrieb Alejandro Colomar:
> Hi Martin,
> 
> On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > Am Sonntag, dem 04.08.2024 um 10:25 +0200 schrieb Alejandro Colomar:
> > > Hi Martin,
> > > 
> > > On Sun, Aug 04, 2024 at 07:38:49AM GMT, Martin Uecker wrote:
> > > > Am Sonntag, dem 04.08.2024 um 01:17 +0200 schrieb Alejandro Colomar:
> > > > > 
> > > > > FUTURE DIRECTIONS:
> > > > > 
> > > > >   We could make it work with array parameters to functions, and
> > > > >   somehow magically return the length designator of the array,
> > > > >   regardless of it being really a pointer.
> > > > 
> > > > And maybe flexible array members with "counted_by" attribute.
> > > 
> > > Hmmm; I didn't know that one.  Indeed.  I'll have a look at implementing
> > > that in this patch set.
> > 
> > But maybe wait for this:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016
> 
> Maybe; if I don't find a way to implement it now, their patches may give
> me some tool to do it.  I'll still try to do it now, just to try.
> 
> I've drafted something by making build_counted_by_ref() an extern
> function, and then I wrote
> 
>   diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
>   index 9f5feb83345..875d471a1f4 100644
>   --- a/gcc/c-family/c-common.cc
>   +++ b/gcc/c-family/c-common.cc
>   @@ -4078,6 +4078,7 @@ c_alignof_expr (location_t loc, tree expr)
>tree
>c_lengthof_type (location_t loc, tree type)
>{
>   +  tree counted_by;
>  enum tree_code type_code;
>
>  type_code = TREE_CODE (type);
>   @@ -4086,6 +4087,12 @@ c_lengthof_type (location_t loc, tree type)
>  error_at (loc, "invalid application of % to type 
> %qT", type);
>  return error_mark_node;
>}
>   +
>   +  counted_by = lookup_attribute ("counted_by", DECL_ATTRIBUTES (type));
>   +  if (counted_by)
>   +// XXX: How to build the counted_by ref without the parent struct 
> type?
>   +return build_counted_by_ref (NULL, type, counted_by);
>   +
>  if (!COMPLETE_TYPE_P (type))
>{
>  error_at (loc,
> 
> But since I don't have the struct to which the FAM belongs, I don't know
> how to get that working.  Do you have any idea?  Or maybe it's better to
> just wait for those patches to be merged, and reuse their
> infrastructure?

I would wait, also not to duplicate work.

> 
> > > > > +
> > > > > +/* Implement the lengthof keyword: Return the length of an array,
> > > > > +   that is, the number of elements in the array.  */
> > > > > +
> > > > > +tree
> > > > > +c_lengthof_type (location_t loc, tree type)
> > > > > +{
> > > > > +  enum tree_code type_code;
> > > > > +
> > > > > +  type_code = TREE_CODE (type);
> > > > > +  if (!COMPLETE_TYPE_P (type))
> > > > > +{
> > > > > +  error_at (loc,
> > > > > + "invalid application of % to incomplete type 
> > > > > %qT",
> > > > > + type);
> > > > > +  return error_mark_node;
> > > > > +}
> > > > > +  if (type_code != ARRAY_TYPE)
> > > > > +{
> > > > > +  error_at (loc, "invalid application of % to type 
> > > > > %qT", type);
> > > > > +  return error_mark_node;
> > > > > +}
> > > > 
> > > > I would swap these two errors, because the first is more specific and
> > > > less helpful if you pass an incomplete struct, where it would be better
> > > > to get the second error.
> > > 
> > > Agree.
> > > 
> > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > > within array_type_nelts_minus_one().  What code triggers that condition?
> > > Am I missing error handling for that?  Thanks!
> > 
> > For incomplete arrays, basically we have the following different
> > variants for arrays:
> 
> Thanks!  This list is very useful.
> 
> > T[ ] incomplete: !TYPE_DOMAIN 
> > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> 
> I guess that old flexible-array members using [1] are understood as
> normal arrays [42], right?

I think so.

> 
> > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> 
> ISO version is T[], right?  Did ISO add support for zero-sized arrays?

No, my fault. ISO FAMs are T[].

> 
> > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> 
> This is not allowed aoutside of function-prototype scope.  And there it
> decays to pointer way before reaching __lengthof__.  So we can't handle
> that at the moment.  

But __lengthof__ can occur at function prototype scope,
although this is then not evaluated

> However, we'll have to keep it in mind for when you
> do the change to keep the array types of function prototypes.  When that
> happens, I guess we'll have to reject these arrays.

I am not yet sure this will work, but let's see.
> 
> > 
> >

Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-04 Thread Alejandro Colomar
Hi Martin,

On Sun, Aug 04, 2024 at 06:43:46PM GMT, Alejandro Colomar wrote:
> On Sun, Aug 04, 2024 at 06:40:14PM GMT, Alejandro Colomar wrote:
> > > The last 
> > > case should return a non-constant.
> > 
> > The last case [*] is only allowed in prototypes.  How should we get the
> > non-constant value?  It's just another way to say [], isn't it?
> > 
> > > If you reuse the sizeof code, it should be mostly correct, but
> > > maybe the last case needs to be revisted. In the following
> > > examples
> > > 
> > > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > > void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > > 
> > > the array '*x' should be a regular fixed size array in foo
> > > but a VLA in 'bar'.
> > 
> > In the function prototype, it seems to be completely ignoring
> > __lengthof__, just as it ignores any expression, so I don't know if it's
> > working (I could try to print some debugging values to stderr from the
> > compiler, if we care about it).
> 
> Huh, no, my bad.  It must be using the lengthof value.  It needs to
> match pointers to arrays of a given size.  I'll test this.

Is this missing diagnostics?

$ cat star.c 
void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
void foos(char (*a)[3][*], int (*x)[sizeof(*a)]);
void bars(char (*a)[*][3], int (*x)[sizeof(*a)]);

int
main(void)
{
int  i3[3];
int  i5[5];
char c35[3][5];
char c53[5][3];

foo(&c35, &i3);
foo(&c35, &i5);  // I'd expect this to fail
bar(&c53, &i3);  // I'd expect this to fail
bar(&c53, &i5);

foos(&c35, &i3);
foos(&c35, &i5);  // I'd expect this to fail
bars(&c53, &i3);  // I'd expect this to fail
bars(&c53, &i5);
}
$ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
$

Cheers,
Alex

> 
> > 
> > $ cat muecker.h 
> > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > void f(char (*a)[3][*], int (*x)[sizeof(*a)]);
> > void b(char (*a)[*][3], int (*x)[sizeof(*a)]);
> > $ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.h -S
> > $
> > 
> > I assume the code above is not reaching my code.
> 
> -- 
> 



-- 



signature.asc
Description: PGP signature


Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-04 Thread Martin Uecker


Hi Alex,

Am Sonntag, dem 04.08.2024 um 19:49 +0200 schrieb Alejandro Colomar:
> Hi Martin,
> 
> On Sun, Aug 04, 2024 at 06:43:46PM GMT, Alejandro Colomar wrote:
> > On Sun, Aug 04, 2024 at 06:40:14PM GMT, Alejandro Colomar wrote:
> > > > The last 
> > > > case should return a non-constant.
> > > 
> > > The last case [*] is only allowed in prototypes.  How should we get the
> > > non-constant value?  It's just another way to say [], isn't it?
> > > 
> > > > If you reuse the sizeof code, it should be mostly correct, but
> > > > maybe the last case needs to be revisted. In the following
> > > > examples
> > > > 
> > > > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > > > void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > > > 
> > > > the array '*x' should be a regular fixed size array in foo
> > > > but a VLA in 'bar'.
> > > 
> > > In the function prototype, it seems to be completely ignoring
> > > __lengthof__, just as it ignores any expression, so I don't know if it's
> > > working (I could try to print some debugging values to stderr from the
> > > compiler, if we care about it).
> > 
> > Huh, no, my bad.  It must be using the lengthof value.  It needs to
> > match pointers to arrays of a given size.  I'll test this.
> 
> Is this missing diagnostics?
> 
>   $ cat star.c 
>   void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
>   void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
>   void foos(char (*a)[3][*], int (*x)[sizeof(*a)]);
>   void bars(char (*a)[*][3], int (*x)[sizeof(*a)]);
> 
>   int
>   main(void)
>   {
>   int  i3[3];
>   int  i5[5];
>   char c35[3][5];
>   char c53[5][3];
> 
>   foo(&c35, &i3);
>   foo(&c35, &i5);  // I'd expect this to fail

Yes, this should fail. The int (*)[5] is not
compatible with int(*)[3].

>   bar(&c53, &i3);  // I'd expect this to fail

This is no contraint violation, because int (*)[5] is
compatible with int (*i)[*], so this needs to be accepted.

It is then UB at run-time and the patches I posted recently
would catch this.  When possible, a compile time warning 
would be nice and I am also looking into this.

It would also be good if we could allow a compiler to
reject this at compile time... also something I am
thinking about.

>   bar(&c53, &i5);
> 
>   foos(&c35, &i3);
>   foos(&c35, &i5);  // I'd expect this to fail
>   bars(&c53, &i3);  // I'd expect this to fail

These are both okay, because the sizeof is not an integer
constant expressions (both int[*][3] and int[3][*] have
variable size), so the last argument has to be compatible
with int[*] which they both are.  Both would trigger
run-time UB then because the size is then 15.

Martin

>   bars(&c53, &i5);
>   }
>   $ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
>   $
> 
> Cheers,
> Alex
> 
> > 
> > > 
> > >   $ cat muecker.h 
> > >   void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > >   void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > >   void f(char (*a)[3][*], int (*x)[sizeof(*a)]);
> > >   void b(char (*a)[*][3], int (*x)[sizeof(*a)]);
> > >   $ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.h -S
> > >   $
> > > 
> > > I assume the code above is not reaching my code.
> > 
> > -- 
> > 
> 
> 
> 



Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-04 Thread Alejandro Colomar
On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> Hi Alex,

Hi Martin,

> > Is this missing diagnostics?
> > 
> > $ cat star.c 
> > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > void foos(char (*a)[3][*], int (*x)[sizeof(*a)]);
> > void bars(char (*a)[*][3], int (*x)[sizeof(*a)]);
> > 
> > int
> > main(void)
> > {
> > int  i3[3];
> > int  i5[5];
> > char c35[3][5];
> > char c53[5][3];
> > 
> > foo(&c35, &i3);
> > foo(&c35, &i5);  // I'd expect this to fail
> 
> Yes, this should fail. The int (*)[5] is not
> compatible with int(*)[3].
> 
> > bar(&c53, &i3);  // I'd expect this to fail
> 
> This is no contraint violation, because int (*)[5] is
> compatible with int (*i)[*], so this needs to be accepted.

No constraint, but I'd expect a diagnostic from -Wextra (array-bounds?).

> It is then UB at run-time and the patches I posted recently

Can you please send a link to those patches?

> would catch this.  When possible, a compile time warning 
> would be nice and I am also looking into this.
> 
> It would also be good if we could allow a compiler to
> reject this at compile time... also something I am
> thinking about.

Thanks!

> 
> > bar(&c53, &i5);
> > 
> > foos(&c35, &i3);
> > foos(&c35, &i5);  // I'd expect this to fail
> > bars(&c53, &i3);  // I'd expect this to fail
> 
> These are both okay, because the sizeof is not an integer
> constant expressions (both int[*][3] and int[3][*] have
> variable size), so the last argument has to be compatible
> with int[*] which they both are.  Both would trigger
> run-time UB then because the size is then 15.

D'oh!  I screwed it.  I wanted to have written this:

$ cat star.c 
void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
void foo2(char (*a)[3][*], int (*x)[sizeof(**a)]);
void bar2(char (*a)[*][3], int (*x)[sizeof(**a)]);

int
main(void)
{
int  i3[3];
int  i5[5];
char c35[3][5];
char c53[5][3];

foo(&c35, &i3);
foo(&c35, &i5);  // I'd expect this to err
bar(&c53, &i3);  // I'd expect this to warn
bar(&c53, &i5);

foo2(&c35, &i3);  // I'd expect this to warn
foo2(&c35, &i5);
bar2(&c53, &i3);
//bar2(&c53, &i5);  // error: -Wincompatible-pointer-types
}
$ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
$ 


> 
> Martin

Cheers,
Alex

-- 



signature.asc
Description: PGP signature


[PATCH v5 0/1] RISC-V: Support CORE-V XCVBITMAIP extension

2024-08-04 Thread Mary Bennett
This patch series presents the comprehensive implementation of the BITMANIP
extension for CORE-V.

Tested with riscv-gnu-toolchain on binutils, ld, gas and gcc testsuites to
ensure its correctness and compatibility with the existing codebase.
However, your input, reviews, and suggestions are invaluable in making this
extension even more robust.

The CORE-V builtins are described in the specification [1] and work can be
found in the OpenHW group's Github repository [2].

[1] 
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

[2] github.com/openhwgroup/corev-gcc

Contributors:
  Mary Bennett 
  Nandni Jamnadas 
  Pietra Ferreira 
  Charlie Keaney
  Jessica Mills
  Craig Blackmore 
  Simon Cook 
  Jeremy Bennett 
  Helene Chelin 

RISC-V: Add support for XCVbitmanip extension in CV32E40P

 gcc/common/config/riscv/riscv-common.cc   |   2 +
 gcc/config/riscv/constraints.md   |  16 ++
 gcc/config/riscv/corev.def|  13 ++
 gcc/config/riscv/corev.md | 188 ++
 gcc/config/riscv/predicates.md|  11 +
 gcc/config/riscv/riscv-builtins.cc|   1 +
 gcc/config/riscv/riscv-ftypes.def |   3 +
 gcc/config/riscv/riscv.cc |  13 ++
 gcc/config/riscv/riscv.opt|   5 +-
 gcc/doc/extend.texi   |  53 +
 gcc/doc/sourcebuild.texi  |   3 +
 .../riscv/cv-bitmanip-compile-bclr.c  |  27 +++
 .../riscv/cv-bitmanip-compile-bclrr.c |  18 ++
 .../riscv/cv-bitmanip-compile-bitrev.c|  30 +++
 .../riscv/cv-bitmanip-compile-bset.c  |  27 +++
 .../riscv/cv-bitmanip-compile-bsetr.c |  18 ++
 .../riscv/cv-bitmanip-compile-clb.c   |  18 ++
 .../riscv/cv-bitmanip-compile-cnt.c   |  18 ++
 .../riscv/cv-bitmanip-compile-extract.c   |  27 +++
 .../riscv/cv-bitmanip-compile-extractr.c  |  18 ++
 .../riscv/cv-bitmanip-compile-extractu.c  |  27 +++
 .../riscv/cv-bitmanip-compile-extractur.c |  18 ++
 .../riscv/cv-bitmanip-compile-ff1.c   |  18 ++
 .../riscv/cv-bitmanip-compile-fl1.c   |  18 ++
 .../riscv/cv-bitmanip-compile-insert.c|  24 +++
 .../riscv/cv-bitmanip-compile-insertr.c   |  18 ++
 .../riscv/cv-bitmanip-compile-ror.c   |  18 ++
 .../riscv/cv-bitmanip-fail-compile-bclr.c |  25 +++
 .../riscv/cv-bitmanip-fail-compile-bitrev.c   |  23 +++
 .../riscv/cv-bitmanip-fail-compile-bset.c |  25 +++
 .../riscv/cv-bitmanip-fail-compile-extract.c  |  25 +++
 .../riscv/cv-bitmanip-fail-compile-extractu.c |  25 +++
 .../riscv/cv-bitmanip-fail-compile-insert.c   |  25 +++
 gcc/testsuite/lib/target-supports.exp |  13 ++
 34 files changed, 810 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bclr.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bclrr.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bitrev.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bset.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bsetr.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-clb.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-cnt.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-extract.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-extractr.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-extractu.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-extractur.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-ff1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-fl1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-insert.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-insertr.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-ror.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-bitmanip-fail-compile-bclr.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-bitmanip-fail-compile-bitrev.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-bitmanip-fail-compile-bset.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-bitmanip-fail-compile-extract.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-bitmanip-fail-compile-extractu.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-bitmanip-fail-compile-insert.c

-- 
2.43.0



[PATCH v5 1/1] RISC-V: Add support for XCVbitmanip extension in CV32E40P

2024-08-04 Thread Mary Bennett
Spec: 
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett 
  Nandni Jamnadas 
  Pietra Ferreira 
  Charlie Keaney
  Jessica Mills
  Craig Blackmore 
  Simon Cook 
  Jeremy Bennett 
  Helene Chelin 

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add XCVbitmanip.
* config/riscv/constraints.md: Likewise.
* config/riscv/corev.def: Likewise.
* config/riscv/corev.md: Likewise.
* config/riscv/predicates.md: Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
* config/riscv/riscv-ftypes.def: Likewise.
* config/riscv/riscv.opt: Likewise.
* doc/extend.texi: Add XCVbitmanip builtin documentation.
* doc/sourcebuild.texi: Likewise.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-bitmanip-compile-bclr.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-bclrr.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-bitrev.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-bset.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-bsetr.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-clb.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-cnt.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-extract.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-extractr.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-extractu.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-extractur.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-ff1.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-fl1.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-insert.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-insertr.c: New test.
* gcc.target/riscv/cv-bitmanip-compile-ror.c: New test.
* gcc.target/riscv/cv-bitmanip-fail-compile-bclr.c: New test.
* gcc.target/riscv/cv-bitmanip-fail-compile-bitrev.c: New test.
* gcc.target/riscv/cv-bitmanip-fail-compile-bset.c: New test.
* gcc.target/riscv/cv-bitmanip-fail-compile-extract.c: New test.
* gcc.target/riscv/cv-bitmanip-fail-compile-extractu.c: New test.
* gcc.target/riscv/cv-bitmanip-fail-compile-insert.c: New test.
* lib/target-supports.exp: Add proc for the XCVbitmanip extension.
---
 gcc/common/config/riscv/riscv-common.cc   |   2 +
 gcc/config/riscv/constraints.md   |  16 ++
 gcc/config/riscv/corev.def|  13 ++
 gcc/config/riscv/corev.md | 188 ++
 gcc/config/riscv/predicates.md|  11 +
 gcc/config/riscv/riscv-builtins.cc|   1 +
 gcc/config/riscv/riscv-ftypes.def |   3 +
 gcc/config/riscv/riscv.cc |  13 ++
 gcc/config/riscv/riscv.opt|   5 +-
 gcc/doc/extend.texi   |  53 +
 gcc/doc/sourcebuild.texi  |   3 +
 .../riscv/cv-bitmanip-compile-bclr.c  |  27 +++
 .../riscv/cv-bitmanip-compile-bclrr.c |  18 ++
 .../riscv/cv-bitmanip-compile-bitrev.c|  30 +++
 .../riscv/cv-bitmanip-compile-bset.c  |  27 +++
 .../riscv/cv-bitmanip-compile-bsetr.c |  18 ++
 .../riscv/cv-bitmanip-compile-clb.c   |  18 ++
 .../riscv/cv-bitmanip-compile-cnt.c   |  18 ++
 .../riscv/cv-bitmanip-compile-extract.c   |  27 +++
 .../riscv/cv-bitmanip-compile-extractr.c  |  18 ++
 .../riscv/cv-bitmanip-compile-extractu.c  |  27 +++
 .../riscv/cv-bitmanip-compile-extractur.c |  18 ++
 .../riscv/cv-bitmanip-compile-ff1.c   |  18 ++
 .../riscv/cv-bitmanip-compile-fl1.c   |  18 ++
 .../riscv/cv-bitmanip-compile-insert.c|  24 +++
 .../riscv/cv-bitmanip-compile-insertr.c   |  18 ++
 .../riscv/cv-bitmanip-compile-ror.c   |  18 ++
 .../riscv/cv-bitmanip-fail-compile-bclr.c |  25 +++
 .../riscv/cv-bitmanip-fail-compile-bitrev.c   |  23 +++
 .../riscv/cv-bitmanip-fail-compile-bset.c |  25 +++
 .../riscv/cv-bitmanip-fail-compile-extract.c  |  25 +++
 .../riscv/cv-bitmanip-fail-compile-extractu.c |  25 +++
 .../riscv/cv-bitmanip-fail-compile-insert.c   |  25 +++
 gcc/testsuite/lib/target-supports.exp |  13 ++
 34 files changed, 810 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bclr.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bclrr.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bitrev.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bset.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-bsetr.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-clb.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-cnt.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bitmanip-compile-extract.c
 creat

Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-04 Thread Martin Uecker
Am Sonntag, dem 04.08.2024 um 20:34 +0200 schrieb Alejandro Colomar:
> On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> > Hi Alex,
> 
> Hi Martin,
> 
> > > Is this missing diagnostics?
> > > 
> > >   $ cat star.c 
> > >   void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > >   void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > >   void foos(char (*a)[3][*], int (*x)[sizeof(*a)]);
> > >   void bars(char (*a)[*][3], int (*x)[sizeof(*a)]);
> > > 
> > >   int
> > >   main(void)
> > >   {
> > >   int  i3[3];
> > >   int  i5[5];
> > >   char c35[3][5];
> > >   char c53[5][3];
> > > 
> > >   foo(&c35, &i3);
> > >   foo(&c35, &i5);  // I'd expect this to fail
> > 
> > Yes, this should fail. The int (*)[5] is not
> > compatible with int(*)[3].
> > 
> > >   bar(&c53, &i3);  // I'd expect this to fail
> > 
> > This is no contraint violation, because int (*)[5] is
> > compatible with int (*i)[*], so this needs to be accepted.
> 
> No constraint, but I'd expect a diagnostic from -Wextra (array-bounds?).
> 
> > It is then UB at run-time and the patches I posted recently
> 
> Can you please send a link to those patches?

https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657253.html


Martin


> 
> > would catch this.  When possible, a compile time warning 
> > would be nice and I am also looking into this.
> > 
> > It would also be good if we could allow a compiler to
> > reject this at compile time... also something I am
> > thinking about.
> 
> Thanks!
> 
> > 
> > >   bar(&c53, &i5);
> > > 
> > >   foos(&c35, &i3);
> > >   foos(&c35, &i5);  // I'd expect this to fail
> > >   bars(&c53, &i3);  // I'd expect this to fail
> > 
> > These are both okay, because the sizeof is not an integer
> > constant expressions (both int[*][3] and int[3][*] have
> > variable size), so the last argument has to be compatible
> > with int[*] which they both are.  Both would trigger
> > run-time UB then because the size is then 15.
> 
> D'oh!  I screwed it.  I wanted to have written this:
> 
>   $ cat star.c 
>   void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
>   void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
>   void foo2(char (*a)[3][*], int (*x)[sizeof(**a)]);
>   void bar2(char (*a)[*][3], int (*x)[sizeof(**a)]);
> 
>   int
>   main(void)
>   {
>   int  i3[3];
>   int  i5[5];
>   char c35[3][5];
>   char c53[5][3];
> 
>   foo(&c35, &i3);
>   foo(&c35, &i5);  // I'd expect this to err
>   bar(&c53, &i3);  // I'd expect this to warn
>   bar(&c53, &i5);
> 
>   foo2(&c35, &i3);  // I'd expect this to warn
>   foo2(&c35, &i5);
>   bar2(&c53, &i3);
>   //bar2(&c53, &i5);  // error: -Wincompatible-pointer-types
>   }
>   $ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
>   $ 
> 
> 
> > 
> > Martin
> 
> Cheers,
> Alex
> 



[committed][PR rtl-optimization/116199] Fix latent bug in reload's SUBREG handling

2024-08-04 Thread Jeff Law


Building glibc on the m68k has exposed a long standing latent bug in reload.

Basically ext-dce replaced an extension with a subreg expression (good) 
resulting in this pair of insns:



(insn 7 4 8 2 (set (reg:DI 31 [ _1 ])
(subreg:DI (reg/v:SI 37 [ __major ]) 0)) "j.c":7:32 75 {*m68k.md:1568}
 (nil))
(insn 8 7 10 2 (set (reg:DI 39 [ _2 ])
(ashift:DI (reg:DI 31 [ _1 ])
(const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
 (expr_list:REG_DEAD (reg:DI 31 [ _1 ])
(nil)))



insn 7 was optimized to the simple copy by ext-dce.  That looks fine. 
Combine comes along and squashes them together resulting in:



(insn 8 7 10 2 (set (reg:DI 39 [ _2 ])
(ashift:DI (subreg:DI (reg/v:SI 37 [ __major ]) 0)
(const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
 (nil))


Which also looks good.

After IRA's allocation, in the middle of reload we have:


(insn 8 7 10 2 (set (reg:DI 8 %a0 [orig:39 _2 ] [39])
(ashift:DI (subreg:DI (reg/v:SI 0 %d0 [orig:37 __major ] [37]) 0)
(const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
 (nil))



Again, sensible.  The pattern requires op0 and op1 to match, so we try 
to figure out if d0 & a0 are the same underlying register.  So we get 
into this code in operands_match_p:





  if (code == SUBREG)
{
  i = REGNO (SUBREG_REG (x)); 
  if (i >= FIRST_PSEUDO_REGISTER)

goto slow;
  i += subreg_regno_offset (REGNO (SUBREG_REG (x)), 
GET_MODE (SUBREG_REG (x)),

SUBREG_BYTE (x),
GET_MODE (x));
}
  else
i = REGNO (x);


There's a similar fragment for the other operand.  The key is that 
subreg_regno_offset call.  That call assumes the subreg is 
representable.  But in the case of (subreg:DI (reg:SI d0)) we're going 
to get -1 (remember, m68k is a big endian target).  That -1 gets passed 
to hard_regno_regs via this code (again, just showing one of the two 
copies of this fragment):



 if (REG_WORDS_BIG_ENDIAN
  && is_a  (GET_MODE (x), &xmode)
  && GET_MODE_SIZE (xmode) > UNITS_PER_WORD
  && i < FIRST_PSEUDO_REGISTER)
i += hard_regno_nregs (i, xmode) - 1;



That triggers the reported ICE.  It appears this has been broken since 
the conversion to SUBREG_BYTE way back in 2001, though possibly it could 
have been some minor changes around this code circa 2005 as well, it 
didn't seem worth putting under the debugger to be sure.  Certainly the 
code from 2001 looks suspicious to me.


Anyway, the fix here is pretty simple.  The routine 
"simplify_subreg_regno" is meant to be used to determine if we can 
simplify the subreg expression and will explicitly return -1 if it can't 
be represented for one reason or another.  It checks a variety of 
conditions that aren't worth listing here.



Bootstrapped and regression tested on x86 (after reverting an unrelated 
patch from Richard S that's causing multiple unrelated failures), which 
of course doesn't really test the code as x86 is an LRA target.  Also 
built & tested the crosses, none of which show issues (and some of which 
are reload targets).  m68k will bootstrap & regression test tomorrow, 
but I don't think there's any point in waiting for that.


Pushing to the trunk.

jeffcommit cfeb994d64743691d4a787f8eef7ce52d8711756
Author: Jeff Law 
Date:   Sun Aug 4 13:09:02 2024 -0600

[committed][PR rtl-optimization/116199] Fix latent bug in reload's SUBREG 
handling

Building glibc on the m68k has exposed a long standing latent bug in reload.

Basically ext-dce replaced an extension with a subreg expression (good)
resulting in this pair of insns:

> (insn 7 4 8 2 (set (reg:DI 31 [ _1 ])
> (subreg:DI (reg/v:SI 37 [ __major ]) 0)) "j.c":7:32 75 
{*m68k.md:1568}
>  (nil))
> (insn 8 7 10 2 (set (reg:DI 39 [ _2 ])
> (ashift:DI (reg:DI 31 [ _1 ])
> (const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
>  (expr_list:REG_DEAD (reg:DI 31 [ _1 ])
> (nil)))

insn 7 was optimized to the simple copy by ext-dce.  That looks fine.  
Combine
comes along and squashes them together resulting in:

> (insn 8 7 10 2 (set (reg:DI 39 [ _2 ])
> (ashift:DI (subreg:DI (reg/v:SI 37 [ __major ]) 0)
> (const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
>  (nil))

Which also looks good.

After IRA's allocation, in the middle of reload we have:

> (insn 8 7 10 2 (set (reg:DI 8 %a0 [orig:39 _2 ] [39])
> (ashift:DI (subreg:DI (reg/v:SI 0 %d0 [orig:37 __major ] [37]) 0)
> (const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
>  (nil))

Again, sensible.  The pattern requires op0 and op1 to match, so we try to
figure out if d0 & a0 are the same underlying register.  So we get into this

[RFC 0/1] libstdc++: Replace Ryu with Teju Jagua.

2024-08-04 Thread Cassio Neri
Dear libstdc++ developers,

I developed a novel algorithm called Teju Jagua which finds the shortest
representation of a floating-point number. It can be used by std::to_chars
which currently uses Ryu.

IMHO, Teju Jagua is much simpler than Ryu and, according to my measurements,
for doubles, Teju Jagua is twice faster than Ryu. Similarly, it's simpler
and ~27% faster than Dragonbox which, to the best of my knowledge, was the
fastest algorithm for this task until now.

Teju Jagua can bring another benefit. All the aforementioned algorithms use
look-up tables that often raise memory footprint concerns. At least in their
reference implementations, one or more tables is provided for each
floating-point type. I don't know about the others but for Teju Jagua the
table for a type can be also used for any smaller type. Hence, it just
needs to provide the table for the largest type (e.g., float128_t) of the
platform.

I already started an implementation in libstdc++ and will send a subsequent
RFC to gather feedback and to give you a better understanding of this work.
The first RFC will replace Ryu only for floats. After this is reviewed and
if there's still interest from others, I will send subsequent patches to
replace Ryu for each one of the other floating-point types. In the end I'll
send a patch to remove Ryu. (Note: This concerns only regular Ryu, which
finds the shortest representation, Ryu-printf that produces the result for a
given precision is out of the scope of this work and will remain.)

A reference C implementation (it's a WIP which lacks documentation) is found
in [1].

I presented this algorithm in C++ Now 2024, C++ on Sea 2024 and I'll show
it again at CppCon 2024. If any libstdc++ developer plans to attend, I'd be
more than happy to discuss this work in person at the conference. The video
recording of my C++ Now 2024 talk is on YouTube [2].

[1] https://github.com/cassioneri/teju_jagua:
[2] https://www.youtube.com/watch?v=w0WrRdW7eqg


Re: [committed][PR rtl-optimization/116199] Fix latent bug in reload's SUBREG handling

2024-08-04 Thread Georg-Johann Lay



Building glibc on the m68k has exposed a long standing latent bug in reload.

Basically ext-dce replaced an extension with a subreg expression (good) 
resulting in this pair of insns:


(insn 7 4 8 2 (set (reg:DI 31 [ _1 ])
(subreg:DI (reg/v:SI 37 [ __major ]) 0)) "j.c":7:32 75 {*m68k.md:1568}
 (nil))
(insn 8 7 10 2 (set (reg:DI 39 [ _2 ])
(ashift:DI (reg:DI 31 [ _1 ])
(const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
 (expr_list:REG_DEAD (reg:DI 31 [ _1 ])
(nil)))



insn 7 was optimized to the simple copy by ext-dce.  That looks fine. Combine 
comes along and squashes them together resulting in:


(insn 8 7 10 2 (set (reg:DI 39 [ _2 ])
(ashift:DI (subreg:DI (reg/v:SI 37 [ __major ]) 0)
(const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
 (nil))


Which also looks good.

After IRA's allocation, in the middle of reload we have:


(insn 8 7 10 2 (set (reg:DI 8 %a0 [orig:39 _2 ] [39])
(ashift:DI (subreg:DI (reg/v:SI 0 %d0 [orig:37 __major ] [37]) 0)
(const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
 (nil))



Again, sensible.  The pattern requires op0 and op1 to match, so we try to figure 
out if d0 & a0 are the same underlying register.  So we get into this code in 
operands_match_p:




  if (code == SUBREG)
{
  i = REGNO (SUBREG_REG (x));   if (i >= FIRST_PSEUDO_REGISTER)
goto slow;
  i += subreg_regno_offset (REGNO (SUBREG_REG (x)), 
GET_MODE (SUBREG_REG (x)),
SUBREG_BYTE (x),
GET_MODE (x));
}
  else
i = REGNO (x);


There's a similar fragment for the other operand.  The key is that 
subreg_regno_offset call.  That call assumes the subreg is representable.  But 
in the case of (subreg:DI (reg:SI d0)) we're going to get -1 (remember, m68k is 
a big endian target).  That -1 gets passed to hard_regno_regs via this code 
(again, just showing one of the two copies of this fragment):


 if (REG_WORDS_BIG_ENDIAN
  && is_a  (GET_MODE (x), &xmode)
  && GET_MODE_SIZE (xmode) > UNITS_PER_WORD
  && i < FIRST_PSEUDO_REGISTER)
i += hard_regno_nregs (i, xmode) - 1;



That triggers the reported ICE.  It appears this has been broken since the 
conversion to SUBREG_BYTE way back in 2001, though possibly it could have been 
some minor changes around this code circa 2005 as well, it didn't seem worth 
putting under the debugger to be sure.  Certainly the code from 2001 looks 
suspicious to me.

Anyway, the fix here is pretty simple.  The routine "simplify_subreg_regno" is 
meant to be used to determine if we can simplify the subreg expression and will 
explicitly return -1 if it can't be represented for one reason or another.  It checks a 
variety of conditions that aren't worth listing here.


Bootstrapped and regression tested on x86 (after reverting an unrelated patch from 
Richard S that's causing multiple unrelated failures), which of course doesn't really 
test the code as x86 is an LRA target.  Also built & tested the crosses, none of 
which show issues (and some of which are reload targets).  m68k will bootstrap & 
regression test tomorrow, but I don't think there's any point in waiting for that.

Pushing to the trunk.

jeff

P

commit cfeb994d64743691d4a787f8eef7ce52d8711756
Author: Jeff Law 
Date:   Sun Aug 4 13:09:02 2024 -0600

[committed][PR rtl-optimization/116199] Fix latent bug in reload's SUBREG 
handling

Building glibc on the m68k has exposed a long standing latent bug in reload.

Basically ext-dce replaced an extension with a subreg expression (good)

resulting in this pair of insns:

> (insn 7 4 8 2 (set (reg:DI 31 [ _1 ])

> (subreg:DI (reg/v:SI 37 [ __major ]) 0)) "j.c":7:32 75 
{*m68k.md:1568}
>  (nil))
> (insn 8 7 10 2 (set (reg:DI 39 [ _2 ])
> (ashift:DI (reg:DI 31 [ _1 ])
> (const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
>  (expr_list:REG_DEAD (reg:DI 31 [ _1 ])
> (nil)))

insn 7 was optimized to the simple copy by ext-dce.  That looks fine.  Combine

comes along and squashes them together resulting in:

> (insn 8 7 10 2 (set (reg:DI 39 [ _2 ])

> (ashift:DI (subreg:DI (reg/v:SI 37 [ __major ]) 0)
> (const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
>  (nil))

Which also looks good.

After IRA's allocation, in the middle of reload we have:

> (insn 8 7 10 2 (set (reg:DI 8 %a0 [orig:39 _2 ] [39])

> (ashift:DI (subreg:DI (reg/v:SI 0 %d0 [orig:37 __major ] [37]) 0)
> (const_int 8 [0x8]))) "j.c":7:48 322 {*ashldi3}
>  (nil))

Again, sensible.  The pattern requires op0 and op1 to match, so we try to

figure out if d0 & a0 are the same underlying register.  So we get into this
co

Re: [committed][PR rtl-optimization/116199] Fix latent bug in reload's SUBREG handling

2024-08-04 Thread Jakub Jelinek
On Sun, Aug 04, 2024 at 10:47:57PM +0200, Georg-Johann Lay wrote:
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/torture/pr116199.c
> > @@ -0,0 +1,13 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target int32 } */
> > +
> > +__extension__ typedef unsigned long long int __uint64_t;
> 
> In almost all cases, __UINT64_TYPE__ etc. is more robust in test
> cases than using long or short etc.

unsigned long long is just fine.  The C standard requires it is at least
64-bits and GCC doesn't support it larger than that on any arch.

Jakub



Re: [PATCH 0/1] Initial support for AVX10.2

2024-08-04 Thread Andi Kleen
> BTW, I noticed that in LLVM there is FP8 support for ARM currently
> undergoing. I will have a look on it to see if everything is mature.

There's even FP8 work for ARM work under way for gcc, see
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659248.html


-andi


[PATCH] Fix handling of const or volatile void pointers in CodeView

2024-08-04 Thread Mark Harmstone
DWARF represents voids in DW_TAG_const_type and DW_TAG_volatile_type
DIEs by the absence of a DW_AT_type attribute, which we weren't handling
correctly.

gcc/
* dwarf2codeview.cc (get_type_num_const_type): Handle missing
DW_AT_type attribute.
(get_type_num_volatile_type): Likewise.
---
 gcc/dwarf2codeview.cc | 36 
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/gcc/dwarf2codeview.cc b/gcc/dwarf2codeview.cc
index 470cbae7110..f7107021bc7 100644
--- a/gcc/dwarf2codeview.cc
+++ b/gcc/dwarf2codeview.cc
@@ -2344,23 +2344,26 @@ get_type_num_const_type (dw_die_ref type, bool 
in_struct)
   bool is_volatile = false;
 
   base_type = get_AT_ref (type, DW_AT_type);
-  if (!base_type)
-return 0;
 
   /* Handle case when this is a const volatile type - we only need one
  LF_MODIFIER for this.  */
-  if (dw_get_die_tag (base_type) == DW_TAG_volatile_type)
+  if (base_type && dw_get_die_tag (base_type) == DW_TAG_volatile_type)
 {
   is_volatile = true;
 
   base_type = get_AT_ref (base_type, DW_AT_type);
-  if (!base_type)
-   return 0;
 }
 
-  base_type_num = get_type_num (base_type, in_struct, false);
-  if (base_type_num == 0)
-return 0;
+  if (!base_type)
+{
+  base_type_num = T_VOID;
+}
+  else
+{
+  base_type_num = get_type_num (base_type, in_struct, false);
+  if (base_type_num == 0)
+   return 0;
+}
 
   ct = (codeview_custom_type *) xmalloc (sizeof (codeview_custom_type));
 
@@ -2383,13 +2386,22 @@ get_type_num_const_type (dw_die_ref type, bool 
in_struct)
 static uint32_t
 get_type_num_volatile_type (dw_die_ref type, bool in_struct)
 {
+  dw_die_ref base_type;
   uint32_t base_type_num;
   codeview_custom_type *ct;
 
-  base_type_num = get_type_num (get_AT_ref (type, DW_AT_type), in_struct,
-   false);
-  if (base_type_num == 0)
-return 0;
+  base_type = get_AT_ref (type, DW_AT_type);
+
+  if (base_type)
+{
+  base_type_num = get_type_num (base_type, in_struct, false);
+  if (base_type_num == 0)
+   return 0;
+}
+  else
+{
+  base_type_num = T_VOID;
+}
 
   ct = (codeview_custom_type *) xmalloc (sizeof (codeview_custom_type));
 
-- 
2.44.2



Re: [PATCH] Fix handling of const or volatile void pointers in CodeView

2024-08-04 Thread Jeff Law




On 8/4/24 4:42 PM, Mark Harmstone wrote:

DWARF represents voids in DW_TAG_const_type and DW_TAG_volatile_type
DIEs by the absence of a DW_AT_type attribute, which we weren't handling
correctly.

gcc/
* dwarf2codeview.cc (get_type_num_const_type): Handle missing
DW_AT_type attribute.
(get_type_num_volatile_type): Likewise.

OK
jeff



Re: [PATCH] RISC-V: Minimal support for Zimop extension.

2024-08-04 Thread Jeff Law




On 8/2/24 9:32 AM, Jiawei wrote:

https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: New extension.
* config/riscv/riscv.opt: New mask.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-42.c: New test.
* gcc.target/riscv/arch-43.c: New test.
Shouldn't the binutils bits go in first?  There's basic support for 
Zimop/Zcmop from Lyut on the binutils list in late 2023 or early 2024. 
I'm pretty sure it marked as DO NOT MERGE because we were waiting for 
the extension to get ratified.


I don't know if Lyut is doing any RISC-V work right now, so if you 
wanted to ping the patch on his behalf, it'd be appreciated and I can 
handle the review on the binutils side too.


I think the GCC bits are fine, but let's get the binutils bits installed 
first.


jeff



Re: [PATCH] RISC-V: Minimal support for Zimop extension.

2024-08-04 Thread Jiawei



在 2024/8/5 8:45, Jeff Law 写道:



On 8/2/24 9:32 AM, Jiawei wrote:

https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: New extension.
* config/riscv/riscv.opt: New mask.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-42.c: New test.
* gcc.target/riscv/arch-43.c: New test.
Shouldn't the binutils bits go in first?  There's basic support for 
Zimop/Zcmop from Lyut on the binutils list in late 2023 or early 2024. 
I'm pretty sure it marked as DO NOT MERGE because we were waiting for 
the extension to get ratified.


Christoph informed me that Zimop has been ratified, so we may not need 
to worry about the spec lifecycle status:


https://jira.riscv.org/browse/RVS-1603?src=confmacro



I don't know if Lyut is doing any RISC-V work right now, so if you 
wanted to ping the patch on his behalf, it'd be appreciated and I can 
handle the review on the binutils side too.


I found that ESWIN's patch to support Zimop on the binutils mailing list 
last month:


https://sourceware.org/pipermail/binutils/2024-June/134592.html

I think the GCC bits are fine, but let's get the binutils bits 
installed first.

jeff


Okay, Thank you for your comments!

BR,

jiawei



Re: [PATCH v2] RISC-V: Add --with-cmodel configure option

2024-08-04 Thread Hau Hsu
Oh the Palmer's patch is here
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641172.html

It doesn't mater who's patch get merged for me :)









> On Aug 2, 2024, at 10:43 PM, Jeff Law  wrote:
> 
> 
> 
> On 8/1/24 11:11 PM, Hau Hsu wrote:
>> Sometimes we want to use default cmodel other than medlow. Add a GCC
>> configure option for that.
>> gcc/ChangeLog:
>> * config.gcc (riscv*-*-*): Add support for --with-cmodel configure 
>> option.
>> * config/riscv/riscv.h (TARGET_RISCV_DEFAULT_CMODEL): Define default 
>> cmodel.
>> * configure: Regenerate.
>> * configure.ac: Add --with-cmodel configure option.
>> * doc/install.texi: Document --with-cmodel configure option.
>> * doc/invoke.texi (-mcmodel): Mention --with-cmodel configure option.
> I thought Palmer had submitted an equivalent patch a while back, but I can't 
> seem to find it.
> 
> 
> It looks like pre-commit testing has flagged this as failing to build on 
> arm/aarch64.  I'm not as familiar with that system, so I don't know exactly 
> what it's complaining about other than the configuration step for GCC failed.
> 
> 
> https://patchwork.sourceware.org/project/gcc/patch/20240802051151.3658614-1-hau@sifive.com/
> 
> # reset_artifacts:
> -10
> # true:
> 0
> # build_abe gcc:
> # FAILED
> # First few build errors in logs:
> # 00:01:36 make[1]: *** [Makefile:4646: configure-gcc] Error 1
> # 00:01:36 make: *** [Makefile:1065: all] Error 2
> 
> 
> 
> 



Re: [PATCH] IRA: Ignore debug insns for uses in split_live_ranges_for_shrink_wrap. [PR116179]

2024-08-04 Thread Andrew Pinski
On Sat, Aug 3, 2024 at 3:31 PM Jeff Law  wrote:
>
>
>
> On 8/2/24 11:31 PM, Sam James wrote:
> > Andrew Pinski  writes:
> >
> >> Late_combine exposed this latent bug in split_live_ranges_for_shrink_wrap.
> >> What it did was copy-prop regno 151 from regno 119 from:
> >> ```
> >> (insn 2 264 3 2 (set (reg/f:DI 119 [ thisD.3697 ])
> >>  (reg:DI 151)) "/app/example.cpp":19:13 70 {*movdi_aarch64}
> >>   (expr_list:REG_DEAD (reg:DI 151)
> >>  (nil)))
> >> ```
> >>
> >> into these insns:
> >> ```
> >> (debug_insn 147 146 148 5 (var_location:DI thisD.3727 (reg/f:DI 119 [ 
> >> thisD.3697 ])) "/app/example.cpp":21:5 -1
> >>   (nil))
> >> 
> >> (insn 167 166 168 7 (set (reg:DI 1 x1)
> >>  (reg/f:DI 119 [ thisD.3697 ])) "/app/example.cpp":14:21 70 
> >> {*movdi_aarch64}
> >>   (nil))
> >> ```
> >>
> >> Both are valid things to do. The problem is 
> >> split_live_ranges_for_shrink_wrap looks at the
> >> uses of reg 151 and with and without debugging reg 151 have a different 
> >> usage in different BBs.
> >> The function is trying to find a splitting point for reg 151 and they are 
> >> different. In the end
> >> this causes register allocation difference.
> >> The fix is for split_live_ranges_for_shrink_wrap to ignore uses that were 
> >> in debug insns.
> >>
> >> Bootstrappped and tested on x86_64-linux-gnu with no regressions.
> >>
> >>  PR rtl-optimization/116179
> >>
> >> gcc/ChangeLog:
> >>
> >>  * ira.cc (split_live_ranges_for_shrink_wrap): For the uses loop,
> >>  only look at non-debug insns.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>  * g++.dg/torture/pr116179-1.C: New test.
> >>
> >> Signed-off-by: Andrew Pinski 
> >> ---
> >>   gcc/ira.cc|  4 +++-
> >>   gcc/testsuite/g++.dg/torture/pr116179-1.C | 26 +++
> >>   2 files changed, 29 insertions(+), 1 deletion(-)
> >>   create mode 100644 gcc/testsuite/g++.dg/torture/pr116179-1.C
> >>
> >> diff --git a/gcc/ira.cc b/gcc/ira.cc
> >> index 5642aea3caa..5a04231f7bd 100644
> >> --- a/gcc/ira.cc
> >> +++ b/gcc/ira.cc
> >> @@ -5144,7 +5144,9 @@ split_live_ranges_for_shrink_wrap (void)
> >> use = DF_REF_NEXT_REG (use))
> >>  {
> >>int ubbi = DF_REF_BB (use)->index;
> >> -  if (bitmap_bit_p (reachable, ubbi))
> >> +  /* Only non debug insn should not be taken into account.  */
> >
> > Nit: maybe /* Only non-debug insns should be ignored.  */ to remove the
> > double negative or /* Don't ignore non-debug insns.  */.
> Andrew's comment looks slightly wrong.  I think he just needs to take
> out the "not".  "insn" should be plural as well.  If he wanted to remove
> the reference to "non debug" we often use "real" to refer to that case
> (INSN, JUMP_INSN and CALL_INSNs).
>
>
>
> OK with some kind of comment fix along those lines.

I originally had the test using `!DEBUG_INSN_P` so the comment was
originally  `Debug insn should not be taken into account.`.
When I changed the test to use `NONDEBUG_INSN_P`, I also changed the
comment but had messed it up; forgetting to remove the `not`.

Anyways in the end I went with `Only non debug insns should be taken
into account.` since that reflex the check.

Thanks,
Andrew

>
> Jeff
>
>


Re: [PATCH 0/1] Initial support for AVX10.2

2024-08-04 Thread Hongyu Wang
Andi Kleen  于2024年8月5日周一 06:31写道:
>
> > BTW, I noticed that in LLVM there is FP8 support for ARM currently
> > undergoing. I will have a look on it to see if everything is mature.
>
> There's even FP8 work for ARM work under way for gcc, see
> https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659248.html
>

The FP8 for ARM was defined as an opaque type used for intrinsic only.
It behaves the same as int8 on reg move/load/store,
but no conversion or scalar operation permitted.

As we only have convert instruction for FP8 currently, I think there
is no need to add such scalar type. We didn't add ABI and scalar type
for BF16 until the arithmetic instruction was introduced.

>
> -andi


Re: [PR middle-end/114635] Set OMP safelen handling to INT_MAX when the pragma didn’t provide one.

2024-08-04 Thread Kugan Vivekanandarajah


> On 15 Jul 2024, at 5:18 pm, Jakub Jelinek  wrote:
>
> External email: Use caution opening links or attachments
>
>
> On Mon, Jul 15, 2024 at 12:39:22AM +, Kugan Vivekanandarajah wrote:
>> OMP safelen handling is assigning backend provided max as an int even when 
>> the pragma didn’t provide one. As a result, vectoriser is rejecting SVE 
>> modes while comparing poly_int with the safelen.
>>
>> That is, for the attached test case,  omp_max_vf gets [16, 16] from the 
>> backend. This then becomes 16 as omp safelen is an integer. When vectoriser 
>> compares the potential vector mode with  maybe_lt (max_vf, min_vf)) , this 
>> would fail resulting in any SVE vector mode being  selected.
>>
>> One suggestion there was to set safelen to INT_MAX when OMP pragma does not 
>> provide safely explicitly.
>
> This is wrong.  The reason why safelen is set to that sctx.max_vf is that if
> there are any "omp simd array" arrays, sctx.max_vf is their size.
> The code you're touching has a comment that explains it even:
>  /* If max_vf is non-zero, then we can use only a vectorization factor
> up to the max_vf we chose.  So stick it into the safelen clause.  */
>  if (maybe_ne (sctx.max_vf, 0U))
>
> If sctx.max_vf is 0, there were no "omp simd array" arrays emitted and so
> OMP_CLAUSE_SAFELEN isn't set.
> The vectorizer can only shrink the arrays, not grow them and that is why
> there is this limit.
>
> Now, I think even SVE has a limit, which is not a scalar constant but
> poly_int, so I think in that case you need to arrange for the size of the
> arrays to be POLY_INT_CST as well and use that as a limit.
> Now, the clause argument itself at least in the OpenMP standard needs to be an
> integer constant (if provided), because the proposals to extend it for the
> SVE-like arches (aarch64, RISC-V) have not been voted in I believe.
> So, there is a question what to do if user specifies safelen (32) or
> something similar.
> But if the user didn't specify it (so it is effectively infinitity), then
> yes, it might be ok to set it to some POLY_INT_CST representing the sizes of
> the arrays and tweak the loop safelen so that it can represent those.

Thanks for the explanation. Does that mean:
1. We change loop->safelen to poly_int
2. Modify the apply_safelen  to account for the poly_int.

I am attaching an RFC patch for your reference.
Thanks,
Kugan



Signed-off-by: Kugan Vivekanandarajah 

>
>>PR middle-end/114635
>>PR 114635
>>
>> gcc/ChangeLog:
>>
>>* omp-low.cc (lower_rec_input_clauses): Set INT_MAX
>>when safelen is not provided instead of using backend
>>provided safelen.
>>
>> gcc/testsuite/ChangeLog:
>>
>>* c-c++-common/pr114635-1.cpp: New test.
>>* c-c++-common/pr114635-2.cpp: New test.
>>
>> Signed-off-by: Kugan Vivekanandarajah 
>
>Jakub




0001-safelen_v2.patch
Description: 0001-safelen_v2.patch


Re: [PATCH ver 3] rs6000, Add new overloaded vector shift builtin int128, variants

2024-08-04 Thread Kewen.Lin
Hi Carl,

on 2024/8/2 03:35, Carl Love wrote:
> GCC developers:
> 
> Version 3, updated the testcase dg-do link to dg-do compile.  Moved the new 
> documentation again.  Retested on Power 10 LE and BE to verify the dg 
> arguments disable the test on Power10BE but enable the test for Power10LE.  
> Reran the full regression testsuite.   There were no new regressions for the 
> testsuite.
> 
> Version 2, updated rs6000-overload.def to remove adding additonal internal 
> names and to change XXSLDWI_Q to XXSLDWI_1TI per comments from Kewen.  Move 
> new documentation statement for the PIVPR built-ins per comments from Kewen.  
> Updated dg-do-run directive and added comment about the save-temps  in 
> testcase per feedback from Segher.  Retested the patch on Power 10 with no 
> regressions.
> 
> The following patch adds the int128 varients to the existing overloaded 
> built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo, vec_srdb, vec_srl, 
> vec_sro.  These varients were requested by Steve Munroe.
> 
> The patch has been tested on a Power 10 system with no regressions.

OK with the below nits tweaked and tested well on both BE and LE, thanks!

> 
> Please let me know if the patch is acceptable for mainline.
> 
>    Carl
> 
> --
> rs6000, Add new overloaded vector shift builtin int128 variants
> 
> Add the signed __int128 and unsigned __int128 argument types for the
> overloaded built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
> vec_srdb, vec_srl, vec_sro.  For each of the new argument types add a
> testcase and update the documentation for the built-in.
> 
> gcc/ChangeLog:
>     * config/rs6000/altivec.md (vsdb_): Change
>     define_insn iterator to VEC_IC.
>     * config/rs6000/rs6000-builtins.def (__builtin_altivec_vsldoi_v1ti,
>     __builtin_vsx_xxsldwi_v1ti, __builtin_altivec_vsldb_v1ti,
>     __builtin_altivec_vsrdb_v1ti): New builtin definitions.
>     * config/rs6000/rs6000-overload.def (vec_sld, vec_sldb, vec_sldw,
>     vec_sll, vec_slo, vec_srdb, vec_srl, vec_sro): New overloaded
>     definitions.
>     * doc/extend.texi (vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
>     vec_srdb, vec_srl, vec_sro): Add documentation for new overloaded
>     built-ins.
> 
> gcc/testsuite/ChangeLog:
>     * gcc.target/powerpc/vec-shift-double-runnable-int128.c: New test file.
> ---
>  gcc/config/rs6000/altivec.md  |   6 +-
>  gcc/config/rs6000/rs6000-builtins.def |  12 +
>  gcc/config/rs6000/rs6000-overload.def |  40 ++
>  gcc/doc/extend.texi   |  43 ++
>  .../vec-shift-double-runnable-int128.c    | 419 ++
>  5 files changed, 517 insertions(+), 3 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
> 
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 5af9bf920a2..2a18ee44526 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -878,9 +878,9 @@ (define_int_attr SLDB_lr [(UNSPEC_SLDB "l")
>  (define_int_iterator VSHIFT_DBL_LR [UNSPEC_SLDB UNSPEC_SRDB])
> 
>  (define_insn "vsdb_"
> - [(set (match_operand:VI2 0 "register_operand" "=v")
> -  (unspec:VI2 [(match_operand:VI2 1 "register_operand" "v")
> -       (match_operand:VI2 2 "register_operand" "v")
> + [(set (match_operand:VEC_IC 0 "register_operand" "=v")
> +  (unspec:VEC_IC [(match_operand:VEC_IC 1 "register_operand" "v")
> +       (match_operand:VEC_IC 2 "register_operand" "v")
>     (match_operand:QI 3 "const_0_to_12_operand" "n")]
>    VSHIFT_DBL_LR))]
>    "TARGET_POWER10"
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index 77eb0f7e406..a2b2b729270 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -964,6 +964,9 @@
>    const vss __builtin_altivec_vsldoi_8hi (vss, vss, const int<4>);
>  VSLDOI_8HI altivec_vsldoi_v8hi {}
> 
> +  const vsq __builtin_altivec_vsldoi_v1ti (vsq, vsq, const int<4>);
> +    VSLDOI_V1TI altivec_vsldoi_v1ti {}

Nit: s/VSLDOI_V1TI/VSLDOI_1TI/ to align with the other VSLDOI_* (no 'V' in *).

> +
>    const vss __builtin_altivec_vslh (vss, vus);
>  VSLH vashlv8hi3 {}
> 
> @@ -1831,6 +1834,9 @@
>    const vsll __builtin_vsx_xxsldwi_2di (vsll, vsll, const int<2>);
>  XXSLDWI_2DI vsx_xxsldwi_v2di {}
> 
> +  const vsq __builtin_vsx_xxsldwi_v1ti (vsq, vsq, const int<2>);
> +    XXSLDWI_1TI vsx_xxsldwi_v1ti {}
> +
>    const vf __builtin_vsx_xxsldwi_4sf (vf, vf, const int<2>);
>  XXSLDWI_4SF vsx_xxsldwi_v4sf {}
> 
> @@ -3299,6 +3305,9 @@
>    const vss __builtin_altivec_vsldb_v8hi (vss, vss, const int<3>);
>  VSLDB_V8HI vsldb_v8hi {}
> 
> +  const vsq __builtin_altivec_vsldb_v1ti (vsq, vsq, const int<3>);
> +    VSLDB_V1TI vsldb_v1ti {}
> +
>    const vsq __builtin_altivec_vslq (vsq, v

Re: [PATCH] rs6000, document built-ins vec_test_lsbb_all_ones and, vec_test_lsbb_all_zeros

2024-08-04 Thread Kewen.Lin
on 2024/8/3 05:48, Peter Bergner wrote:
> On 7/31/24 10:21 PM, Kewen.Lin wrote:
>> on 2024/8/1 01:52, Carl Love wrote:
>>> Yes, I noticed that the built-ins were defined as overloaded but only had 
>>> one definition.   Did seem odd to me.
>>>
 either is with "vector unsigned char" as argument type, but the 
 corresponding instance
 prototype in builtin table is with "vector signed char".  It's 
 inconsistent and weird,
 I think we can just update the prototype in builtin table with "vector 
 unsigned char"
 and remove the entries in overload table.  It can be a follow up patch.
>>>
>>> I didn't notice that it was signed in the instance prototype but unsigned 
>>> in the overloaded definition.  That is definitely inconsistent.
>>>
>>> That said, should we just go ahead and support both signed and unsigned 
>>> argument versions of the all ones and all zeros built-ins?
>>
>> Good question, I thought about that but found openxl only supports the 
>> unsigned version 
>> so I felt it's probably better to keep consistent with it.  But I'm fine for 
>> either, if
>> we decide to extend it to cover both signed and unsigned, we should notify 
>> openxl team
>> to extend it as well.
>>
>> openxl doc links:
>>
>> https://www.ibm.com/docs/en/openxl-c-and-cpp-aix/17.1.2?topic=functions-vec-test-lsbb-all-ones
>> https://www.ibm.com/docs/en/openxl-c-and-cpp-aix/17.1.2?topic=functions-vec-test-lsbb-all-zeros
> 
> If it makes sense to support vector signed char rather than only the vector 
> unsigned char,
> then I'm fine adding support for it.  It almost seems since we tried adding 
> an overload
> for it, that that was our intention (to support both signed and unsigned) and 
> we just
> had a bug so only unsigned was supported?

Good question but I'm not sure, it could be an oversight without adding one 
more instance
for overloading, or adopting some useless code (only for overloading) for a 
single instance.
I found it's introduced by r11-2437-gcf5d0fc2d1adcd, CC'ed Will as he 
contributed this.

BR,
Kewen

> 
> CC'ing Steve since he noticed the missing documentation when we was trying to
> use the built-ins.  Steve, do you see a need to also support vector signed 
> char
> with these built-ins?
> 
> Peter
> 
> 



[PATCH] tree-reassoc.cc: PR tree-optimization/116139 Don't assert when forming fully-pipelined FMAs on wide MULT targets

2024-08-04 Thread Kyrylo Tkachov
Hi all,

The code in get_reassociation_width that forms FMAs aggressively when
they are fully pipelined expects the FMUL reassociation width in the
target to be less than for FMAs. This doesn't hold for all target
tunings.

This code shouldn't ICE, just avoid forming these FMAs here.
This patch does that.

The test case uses -mcpu=neoverse-n3 tuning because it uses the width of 6 for 
the fp reassociation cost. The test case in the PR uses neoverse-v2 but I 
intend to change the reassociation cost for neoverse-v2 in a future patch that 
would not trigger this ICE.

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov 

PR tree-optimization/116139

gcc/ChangeLog:

* tree-ssa-reassoc.cc (get_reassociation_width): Move width_mult
<= width comparison to if condition rather than assert.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr116139.c: New test.
---



reassoc.patch
Description: reassoc.patch