date:20241122

Re: [PATCH] i386: Make __builtin_ia32_f{nstenv,ldenv,nstsw,fnclex} builtins internal [PR117165]

2024-11-22 Thread Uros Bizjak

On Fri, Nov 22, 2024 at 9:50 AM Jakub Jelinek  wrote:
>
> Hi!
>
> As the comment says, these builtins are meant to be internal for the atomic
> support and cause various ICEs when using them directly in various
> conditions.
> So the following patch makes them internal.
> We do have also internal-fn.*, but those target specific builtins would
> need to be there in generic code, so I've just added space to their name,
> which is the old way to hide builtins/attributes etc.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2024-11-22  Jakub Jelinek  
>
> PR target/117165
> * config/i386/i386-builtin.def (IX86_BUILTIN_FNSTENV,
> IX86_BUILTIN_FLDENV, IX86_BUILTIN_FNSTSW, IX86_BUILTIN_FNCLEX): Add
> space to the end of the builtin name to make it really internal.
>
> * gcc.target/i386/pr117165.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/i386-builtin.def.jj 2024-11-06 10:19:11.418260865 +0100
> +++ gcc/config/i386/i386-builtin.def 2024-11-21 11:39:05.245410674 +0100
> @@ -94,10 +94,10 @@ BDESC (0, 0, CODE_FOR_nothing, "__builti
>  BDESC (0, 0, CODE_FOR_pause, "__builtin_ia32_pause", IX86_BUILTIN_PAUSE, UNKNOWN, (int) VOID_FTYPE_VOID)
>
>  /* 80387 (for use internally for atomic compound assignment).  */
> -BDESC (0, 0, CODE_FOR_fnstenv, "__builtin_ia32_fnstenv", IX86_BUILTIN_FNSTENV, UNKNOWN, (int) VOID_FTYPE_PVOID)
> -BDESC (0, 0, CODE_FOR_fldenv, "__builtin_ia32_fldenv", IX86_BUILTIN_FLDENV, UNKNOWN, (int) VOID_FTYPE_PCVOID)
> -BDESC (0, 0, CODE_FOR_fnstsw, "__builtin_ia32_fnstsw", IX86_BUILTIN_FNSTSW, UNKNOWN, (int) USHORT_FTYPE_VOID)
> -BDESC (0, 0, CODE_FOR_fnclex, "__builtin_ia32_fnclex", IX86_BUILTIN_FNCLEX, UNKNOWN, (int) VOID_FTYPE_VOID)
> +BDESC (0, 0, CODE_FOR_fnstenv, "__builtin_ia32_fnstenv ", IX86_BUILTIN_FNSTENV, UNKNOWN, (int) VOID_FTYPE_PVOID)
> +BDESC (0, 0, CODE_FOR_fldenv, "__builtin_ia32_fldenv ", IX86_BUILTIN_FLDENV, UNKNOWN, (int) VOID_FTYPE_PCVOID)
> +BDESC (0, 0, CODE_FOR_fnstsw, "__builtin_ia32_fnstsw ", IX86_BUILTIN_FNSTSW, UNKNOWN, (int) USHORT_FTYPE_VOID)
> +BDESC (0, 0, CODE_FOR_fnclex, "__builtin_ia32_fnclex ", IX86_BUILTIN_FNCLEX, UNKNOWN, (int) VOID_FTYPE_VOID)
>
>  /* MMX */
>  BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_emms, "__builtin_ia32_emms", IX86_BUILTIN_EMMS, UNKNOWN, (int) VOID_FTYPE_VOID)
> --- gcc/testsuite/gcc.target/i386/pr117165.c.jj 2024-11-21 11:46:07.971413045 +0100
> +++ gcc/testsuite/gcc.target/i386/pr117165.c 2024-11-21 11:45:44.849741064 +0100
> @@ -0,0 +1,27 @@
> +/* PR target/117165 */
> +/* { dg-do compile } */
> +/* { dg-options "-msoft-float" } */
> +
> +void
> +foo ()
> +{
> +  __builtin_ia32_fnstsw (); /* { dg-error "implicit declaration of function" } */
> +}
> +
> +void
> +bar ()
> +{
> +  __builtin_ia32_fnclex (); /* { dg-error "implicit declaration of function" } */
> +}
> +
> +void
> +baz ()
> +{
> +  __builtin_ia32_fnstenv (0);  /* { dg-error "implicit declaration of function" } */
> +}
> +
> +void
> +qux ()
> +{
> +  __builtin_ia32_fldenv (0);   /* { dg-error "implicit declaration of function" } */
> +}
>
> Jakub
>
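A short illustration of why appending a space hides a builtin: a name that ends in a space can never be spelled as a single identifier token in user code, so ordinary name lookup cannot reach it. The sketch below is not GCC's lexer; `is_c_identifier` is a hypothetical helper that only checks the basic ASCII identifier rule.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper (not part of GCC): returns true if NAME could be
   written as one plain ASCII C identifier token.  */
bool
is_c_identifier (const char *name)
{
  /* An identifier starts with a letter or underscore.  */
  if (!(isalpha ((unsigned char) name[0]) || name[0] == '_'))
    return false;
  /* Every following character must be a letter, digit or underscore.  */
  for (const char *p = name + 1; *p; ++p)
    if (!(isalnum ((unsigned char) *p) || *p == '_'))
      return false;
  return true;
}
```

With the old table entry the name passes this check; with the trailing space it does not, which matches the "implicit declaration of function" errors the new test expects.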

[PATCH] inline asm, v3: Add new constraint for symbol definitions

2024-11-22 Thread Jakub Jelinek

On Thu, Nov 21, 2024 at 11:57:13PM +0000, Joseph Myers wrote:
> On Wed, 6 Nov 2024, Jakub Jelinek wrote:
> 
> > + error_at (loc, "%<:%> constraint operand is not address "
> > +"of a function or non-automatic variable");
> 
> I think a testcase for this error is needed.

Here is an updated patch with the additional test coverage.

2024-11-21  Jakub Jelinek  

gcc/
* genpreds.cc (mangle): Add ':' mangling.
(add_constraint): Allow : constraint.
* common.md (:): New define_constraint.
* stmt.cc (parse_output_constraint): Diagnose "=:".
(parse_input_constraint): Handle ":" and diagnose invalid
uses.
* doc/md.texi (Simple Constraints): Document ":" constraint.
gcc/c/
* c-typeck.cc (build_asm_expr): Diagnose invalid ":" constraint
uses.
gcc/cp/
* semantics.cc (finish_asm_stmt): Diagnose invalid ":" constraint
uses.
gcc/testsuite/
* c-c++-common/toplevel-asm-4.c: New test.
* c-c++-common/toplevel-asm-5.c: New test.
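
For reference, the name mangling that the genpreds.cc entry above extends can be sketched as a standalone function. This is only an illustration of the character mapping visible in the hunk ('_' to "__", '<' to "_l", '>' to "_g", ':' to "_c"); `mangle_constraint_name` and its buffer interface are assumptions, since the real code grows an obstack instead.

```c
#include <stddef.h>

/* Hypothetical standalone version of the mangling rule.  Writes the
   mangled, NUL-terminated name to OUT and returns its length, or
   (size_t) -1 if OUT_SIZE is too small.  */
size_t
mangle_constraint_name (const char *name, char *out, size_t out_size)
{
  size_t n = 0;
  if (out_size == 0)
    return (size_t) -1;
  for (; *name; ++name)
    {
      const char *rep = NULL;
      switch (*name)
        {
        case '_': rep = "__"; break;
        case '<': rep = "_l"; break;
        case '>': rep = "_g"; break;
        case ':': rep = "_c"; break;   /* the newly added case */
        default: break;
        }
      if (rep)
        {
          /* Need room for two characters plus the terminator.  */
          if (n + 2 >= out_size)
            return (size_t) -1;
          out[n++] = rep[0];
          out[n++] = rep[1];
        }
      else
        {
          /* Need room for one character plus the terminator.  */
          if (n + 1 >= out_size)
            return (size_t) -1;
          out[n++] = *name;
        }
    }
  out[n] = '\0';
  return n;
}
```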

--- gcc/genpreds.cc.jj  2024-02-10 11:25:10.404468273 +0100
+++ gcc/genpreds.cc 2024-11-05 14:57:14.193060528 +0100
@@ -753,6 +753,7 @@ mangle (const char *name)
   case '_': obstack_grow (rtl_obstack, "__", 2); break;
   case '<': obstack_grow (rtl_obstack, "_l", 2); break;
   case '>': obstack_grow (rtl_obstack, "_g", 2); break;
+  case ':': obstack_grow (rtl_obstack, "_c", 2); break;
   default: obstack_1grow (rtl_obstack, *name); break;
   }
 
@@ -797,12 +798,13 @@ add_constraint (const char *name, const
   for (p = name; *p; p++)
 if (!ISALNUM (*p))
   {
-   if (*p == '<' || *p == '>' || *p == '_')
+   if (*p == '<' || *p == '>' || *p == '_' || *p == ':')
  need_mangled_name = true;
else
  {
error_at (loc, "constraint name '%s' must be composed of letters,"
- " digits, underscores, and angle brackets", name);
+ " digits, underscores, colon and angle brackets",
+ name);
return;
  }
   }
--- gcc/common.md.jj 2024-01-03 11:51:24.519828508 +0100
+++ gcc/common.md   2024-11-05 14:51:29.098989927 +0100
@@ -100,6 +100,11 @@ (define_constraint "s"
(match_test "!CONST_SCALAR_INT_P (op)")
(match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))
 
+(define_constraint ":"
+  "Defines a symbol."
+  (and (match_test "CONSTANT_P (op)")
+   (match_test "!CONST_SCALAR_INT_P (op)")))
+
 (define_constraint "n"
   "Matches a non-symbolic integer constant."
   (and (match_test "CONST_SCALAR_INT_P (op)")
--- gcc/stmt.cc.jj  2024-10-25 10:00:29.523767070 +0200
+++ gcc/stmt.cc 2024-11-05 18:31:11.518948252 +0100
@@ -278,6 +278,10 @@ parse_output_constraint (const char **co
  error ("matching constraint not valid in output operand");
  return false;
 
+   case ':':
+ error ("%<:%> constraint used for output operand");
+ return false;
+
case '<':  case '>':
  /* ??? Before flow, auto inc/dec insns are not supposed to exist,
 excepting those that expand_call created.  So match memory
@@ -325,6 +329,7 @@ parse_input_constraint (const char **con
   size_t c_len = strlen (constraint);
   size_t j;
   bool saw_match = false;
+  bool at_checked = false;
 
   /* Assume the constraint doesn't allow the use of either
  a register or memory.  */
@@ -362,6 +367,21 @@ parse_input_constraint (const char **con
   case 'N':  case 'O':  case 'P':  case ',':
break;
 
+  case ':':
+   /* Verify that if : is used, it is just ":" or say ":,:" but not
+  mixed with other constraints or say ",:,," etc.  */
+   if (!at_checked)
+ {
+   for (size_t k = 0; k < c_len; ++k)
+ if (constraint[k] != ((k & 1) ? ',' : ':') || (c_len & 1) == 0)
+   {
+ error ("%<:%> constraint mixed with other constraints");
+ return false;
+   } 
+   at_checked = true;
+ }
+   break;
+
/* Whether or not a numeric constraint allows a register is
   decided by the matching constraint, and so there is no need
   to do anything special with them.  We must handle them in
--- gcc/doc/md.texi.jj  2024-10-16 14:41:45.553757783 +0200
+++ gcc/doc/md.texi 2024-11-05 18:46:30.795896301 +0100
@@ -1504,6 +1504,13 @@ as the predicate in the @code{match_oper
 the mode specified in the @code{match_operand} as the mode of the memory
 reference for which the address would be valid.
 
+@cindex @samp{:} in constraint
+@item @samp{:}
+This constraint, allowed only in input operands, says the inline @code{asm}
+pattern defines specific function or variable symbol.  The constraint
+shouldn't be mixed with other constraints on the same operand and
+the operand should be address of a function or non-automatic variable.
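
The well-formedness check added to parse_input_constraint in the stmt.cc hunk above can be tried in isolation. The following sketch reproduces the same condition as a standalone predicate: the string must have odd length, with ':' at every even index and ',' at every odd index, so ":" and ":,:" pass while ":s" or ",:,," fail. `colon_constraint_ok` is a hypothetical name and the error reporting is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical standalone form of the ':' check: true only for
   ":", ":,:", ":,:,:" and so on.  */
bool
colon_constraint_ok (const char *constraint)
{
  size_t c_len = strlen (constraint);
  /* An even length means a missing or dangling alternative.  */
  if ((c_len & 1) == 0)
    return false;
  for (size_t k = 0; k < c_len; ++k)
    if (constraint[k] != ((k & 1) ? ',' : ':'))
      return false;
  return true;
}
```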

[PATCH v2] testsuite: arm: Check that a far jump is used in thumb1-far-jump-2.c

2024-11-22 Thread Torbjörn SVENSSON

Changes since v1:

- Rewrote the padding instructions in the macro to instead write to volatile
  memory. This ensures that every expansion of the base macro is exactly 2
  bytes.

If the `GO()` in f3 is removed, the generated assembly would be reduced to:

f3:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
push {lr}
cmp r0, #0
bne .LCB7
bl  .L1 @far jump
.LCB7:
movs r2, #1
ldr r3, .L6
str r2, [r3]
...
str r2, [r3]
.L1:
@ sp needed
pop {pc}

Would this assembly be as stable as with the `GO()` in f3? If so, would it be
preferred to generate the simpler assembly in the test?

Ok for trunk as it is or perhaps with the simpler assembly?

--

With the changes in r15-1579-g792f97b44ff, the code used as "padding" in
the test case is optimized away. Prevent this optimization by forcing a
write to volatile memory.
Also, validate that there is a far jump in the generated assembler.

Without this patch, the generated assembler is reduced to:
f3:
cmp r0, #0
beq .L1
ldr r4, .L6
.L1:
bx  lr
.L7:
.align  2
.L6:
.word   g_0_1

With the patch, the generated assembler is:
f3:
movs r2, #1
ldr r3, .L6
push {lr}
str r2, [r3]
cmp r0, #0
bne .LCB10
bl  .L1 @far jump
.LCB10:
b   .L7
.L8:
.align  2
.L6:
.word   .LANCHOR0
.L7:
str r2, [r3]
...
str r2, [r3]
.L1:
pop {pc}

gcc/testsuite/ChangeLog:

* gcc.target/arm/thumb1-far-jump-2.c: Write to volatile memory
in macro to avoid optimization.

Signed-off-by: Torbjörn SVENSSON 
---
 .../gcc.target/arm/thumb1-far-jump-2.c| 95 ++-
 1 file changed, 51 insertions(+), 44 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c b/gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c
index 78fcafaaf7d..c79580d660a 100644
--- a/gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c
+++ b/gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c
@@ -5,53 +5,60 @@
 /* { dg-options "-Os" } */
 /* { dg-skip-if "" { ! { arm_thumb1 } } } */
 
-volatile register int r4 asm ("r4");
+volatile int r4;
+
+#define GO() \
+  r4 = 1;
+
+#define GO8() \
+  GO() \
+  GO() \
+  GO() \
+  GO() \
+  GO() \
+  GO() \
+  GO() \
+  GO()
+
+#define GO32() \
+  GO8() \
+  GO8() \
+  GO8() \
+  GO8()
+
+#define GO128() \
+  GO32() \
+  GO32() \
+  GO32() \
+  GO32()
+
+#define GO512() \
+  GO128() \
+  GO128() \
+  GO128() \
+  GO128()
+
+#define GO1018() \
+  GO512() \
+  GO128() \
+  GO128() \
+  GO128() \
+  GO32() \
+  GO32() \
+  GO32() \
+  GO8() \
+  GO8() \
+  GO8() \
+  GO() \
+  GO()
+
 void f3(int i)
 {
-#define GO(n) \
-  extern volatile int g_##n; \
-  r4=(int)&g_##n;
-
-#define GO8(n) \
-  GO(n##_0) \
-  GO(n##_1) \
-  GO(n##_2) \
-  GO(n##_3) \
-  GO(n##_4) \
-  GO(n##_5) \
-  GO(n##_6) \
-  GO(n##_7)
-
-#define GO64(n) \
-  GO8(n##_0) \
-  GO8(n##_1) \
-  GO8(n##_2) \
-  GO8(n##_3) \
-  GO8(n##_4) \
-  GO8(n##_5) \
-  GO8(n##_6) \
-  GO8(n##_7) \
-
-#define GO498(n) \
-  GO64(n##_0) \
-  GO64(n##_1) \
-  GO64(n##_2) \
-  GO64(n##_3) \
-  GO64(n##_4) \
-  GO64(n##_5) \
-  GO64(n##_6) \
-  GO8(n##_0) \
-  GO8(n##_1) \
-  GO8(n##_2) \
-  GO8(n##_3) \
-  GO8(n##_4) \
-  GO8(n##_5) \
-  GO(n##_0) \
-  GO(n##_1) \
-
+  GO();
   if (i) {
-GO498(0);
+GO1018();
   }
 }
 
-/* { dg-final { scan-assembler "push.*lr" } } */
+/* { dg-final { scan-assembler "\tpush.*lr" } } */
+/* { dg-final { scan-assembler "\tbl\t\\.L\[0-9\]+\t@far jump" } } */
-- 
2.25.1
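
As a sanity check on the macro arithmetic in the patch above, the nesting can be replayed with a counter in place of the volatile store. This is only a sketch: `go_count` and `count_go1018` are made-up names, and the byte figure relies on the statement in the change notes that every `GO()` expansion is exactly 2 bytes.

```c
/* Count GO() expansions instead of storing to volatile memory.  */
static int go_count;

#define GO() go_count++;
#define GO8() GO() GO() GO() GO() GO() GO() GO() GO()
#define GO32() GO8() GO8() GO8() GO8()
#define GO128() GO32() GO32() GO32() GO32()
#define GO512() GO128() GO128() GO128() GO128()
/* Same composition as in the test: 512 + 3*128 + 3*32 + 3*8 + 2.  */
#define GO1018() \
  GO512() \
  GO128() GO128() GO128() \
  GO32() GO32() GO32() \
  GO8() GO8() GO8() \
  GO() GO()

/* Returns how many times GO() is expanded inside GO1018().  */
int
count_go1018 (void)
{
  go_count = 0;
  GO1018()
  return go_count;
}
```

The count comes out at 1018 expansions, so under the 2-bytes-per-store assumption the padding inside the `if` is 2036 bytes.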

[PATCH] c-family: Yet another fix for _BitInt & __sync_* builtins [PR117641]

2024-11-22 Thread Jakub Jelinek

Hi!

Sorry, the last patch only partially fixed the __sync_* ICEs with
_BitInt(128) on ia32.
Even for !fetch we need to error out and return 0.  I was afraid of
APIs like __atomic_exchange/__atomic_compare_exchange, those obviously
need to be supported even on _BitInt(128) on ia32, but they actually never
go through sync_resolve_size; they are handled by adding the size argument
and using the library version much earlier.
For fetch && !orig_format (i.e. __atomic_fetch_* etc.) we need to return -1
so that we handle it with a manual __atomic_load +
__atomic_compare_exchange loop in the caller; all other cases should
be rejected.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-11-22  Jakub Jelinek  

PR c/117641
* c-common.cc (sync_resolve_size): For size 16 with _BitInt
on targets where TImode isn't supported, use goto incompatible if
!fetch.

* gcc.dg/bitint-117.c: New test.

--- gcc/c-family/c-common.cc.jj 2024-11-19 20:35:53.222455518 +0100
+++ gcc/c-family/c-common.cc 2024-11-21 11:23:07.897995741 +0100
@@ -7457,11 +7457,10 @@ sync_resolve_size (tree function, vec
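
For readers unfamiliar with the fallback mentioned in the description, a manual load plus compare-exchange loop for a fetch-and-add looks roughly like the sketch below. It is not what the front end emits: it uses `unsigned int` and the `_n` variants of the GCC `__atomic` builtins rather than _BitInt(128), and `fetch_add_via_cas` is a hypothetical name.

```c
/* Hypothetical fetch-and-add built from an atomic load and a
   compare-exchange loop.  Returns the value seen before the add.  */
unsigned int
fetch_add_via_cas (unsigned int *ptr, unsigned int val)
{
  unsigned int expected = __atomic_load_n (ptr, __ATOMIC_RELAXED);
  unsigned int desired;
  do
    /* On failure the builtin refreshes EXPECTED with the current
       contents of *PTR, so the sum is recomputed from fresh data.  */
    desired = expected + val;
  while (!__atomic_compare_exchange_n (ptr, &expected, desired,
                                       0 /* strong */,
                                       __ATOMIC_SEQ_CST,
                                       __ATOMIC_RELAXED));
  return expected;
}
```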

[PATCH] inline-asm, v2: Add - constraint modifier support for toplevel extended asm [PR41045]

2024-11-22 Thread Jakub Jelinek

On Fri, Nov 22, 2024 at 12:01:21AM +0000, Joseph Myers wrote:
> On Mon, 18 Nov 2024, Jakub Jelinek wrote:
> 
> > +@smallexample
> > +extern void foo (void), bar (void);
> > +int v;
> > +extern int w;
> > asm (".globl %cc0, %cc2; .text; %cc0: call %cc1; ret; .data; %cc2: .word %cc3"
> > + :: ":" (foo), "-s" (&bar), ":" (&w), "-i" (&v));
> > +@end smallexample
> > +
> > This asm declaration tells the compiler it defines function foo and variable
> > +w and uses function bar and variable v.  This will compile even with PIC,
> > +but it is up to the user to ensure it will assemble correctly and have the
> > +expected behavior.
> 
> That should be @code{foo}, @code{w}, @code{bar}, @code{v}.
> 
> The C front-end changes in this patch are OK.

Thanks, here is the adjusted patch.

2024-11-21  Jakub Jelinek  

PR c/41045
gcc/
* stmt.cc (parse_output_constraint, parse_input_constraint): Handle
- modifier.
* recog.h (raw_constraint_p): Declare.
* recog.cc (raw_constraint_p): New variable.
(asm_operand_ok, constrain_operands): Handle - modifier.
* common.md (i, s, n): For raw_constraint_p don't require
LEGITIMATE_PIC_OPERAND_P.
* doc/md.texi: Document - constraint modifier.
gcc/c/
* c-typeck.cc (build_asm_expr): Reject - constraint modifier inside
of a function.
gcc/cp/
* semantics.cc (finish_asm_stmt): Reject - constraint modifier inside
of a function.
gcc/testsuite/
* c-c++-common/toplevel-asm-4.c: Add missing %cc2 use in template, add
bar, x, &y operands with "-i" and "-s" constraints.
(x, y): New variables.
(bar): Declare.
* c-c++-common/toplevel-asm-7.c: New test.
* c-c++-common/toplevel-asm-8.c: New test.

--- gcc/stmt.cc.jj  2024-11-17 21:07:06.712510933 -1100
+++ gcc/stmt.cc 2024-11-17 21:45:30.201294501 -1100
@@ -269,7 +269,7 @@ parse_output_constraint (const char **co
case 'E':  case 'F':  case 'G':  case 'H':
case 's':  case 'i':  case 'n':
case 'I':  case 'J':  case 'K':  case 'L':  case 'M':
-   case 'N':  case 'O':  case 'P':  case ',':
+   case 'N':  case 'O':  case 'P':  case ',':  case '-':
  break;
 
case '0':  case '1':  case '2':  case '3':  case '4':
@@ -364,7 +364,7 @@ parse_input_constraint (const char **con
   case 'E':  case 'F':  case 'G':  case 'H':
   case 's':  case 'i':  case 'n':
   case 'I':  case 'J':  case 'K':  case 'L':  case 'M':
-  case 'N':  case 'O':  case 'P':  case ',':
+  case 'N':  case 'O':  case 'P':  case ',':  case '-':
break;
 
   case ':':
--- gcc/recog.h.jj  2024-08-15 09:23:26.981012468 -1100
+++ gcc/recog.h 2024-11-17 22:26:47.190602347 -1100
@@ -335,6 +335,9 @@ private:
matched.  */
 extern int which_alternative;
 
+/* True for inline asm operands with - constraint modifier.  */
+extern bool raw_constraint_p;
+
 /* The following vectors hold the results from insn_extract.  */
 
 struct recog_data_d
--- gcc/recog.cc.jj 2024-10-24 21:00:29.511767242 -1100
+++ gcc/recog.cc 2024-11-17 23:16:00.654874432 -1100
@@ -86,6 +86,9 @@ static operand_alternative asm_op_alt[MA
 
 int which_alternative;
 
+/* True for inline asm operands with - constraint modifier.  */
+bool raw_constraint_p;
+
 /* Nonzero after end of reload pass.
Set to 1 or 0 by toplev.cc.
Controls the significance of (SUBREG (MEM)).  */
@@ -2300,6 +2303,7 @@ asm_operand_ok (rtx op, const char *cons
   switch (c)
{
case ',':
+ raw_constraint_p = false;
  constraint++;
  continue;
 
@@ -2350,6 +2354,11 @@ asm_operand_ok (rtx op, const char *cons
result = 1;
  break;
 
+   case '-':
+ raw_constraint_p = true;
+ constraint++;
+ continue;
+
case '<':
case '>':
  /* ??? Before auto-inc-dec, auto inc/dec insns are not supposed
@@ -2407,8 +2416,12 @@ asm_operand_ok (rtx op, const char *cons
constraint++;
   while (--len && *constraint && *constraint != ',');
   if (len)
-   return 0;
+   {
+ raw_constraint_p = false;
+ return 0;
+   }
 }
+  raw_constraint_p = false;
 
   /* For operands without < or > constraints reject side-effects.  */
   if (AUTO_INC_DEC && !incdec_ok && result && MEM_P (op))
@@ -3202,6 +3215,9 @@ constrain_operands (int strict, alternat
  case ',':
c = '\0';
break;
+ case '-':
+   raw_constraint_p = true;
+   break;
 
  case '#':
/* Ignore rest of this alternative as far as
@@ -3357,6 +3373,7 @@ constrain_operands (int strict, alternat
  }
  while (p += len, c);
 
+ raw_constraint_p = false;
  constraints[opno] = p;
  /* If this operand did not win somehow,
 this alternative loses.  */
--- gcc/common.
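
The way the hunks above track the - modifier (set a flag when '-' is seen, clear it at each ',' and at the end of the string) can be sketched as a small scanner. This is a simplified illustration only: `alternative_is_raw` is a hypothetical helper, and unlike the real code it treats every constraint as a single character instead of consulting the constraint length.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether alternative number ALT
   (0-based, alternatives are comma-separated) of CONSTRAINT carries
   the '-' modifier.  Returns false if there is no such alternative.  */
bool
alternative_is_raw (const char *constraint, int alt)
{
  bool raw = false;
  int cur = 0;
  for (const char *p = constraint; *p; ++p)
    {
      if (*p == ',')
        {
          if (cur == alt)
            return raw;
          /* A new alternative starts with the flag cleared.  */
          raw = false;
          ++cur;
        }
      else if (*p == '-')
        raw = true;
    }
  return cur == alt ? raw : false;
}
```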

[PATCH] v2: Allow limited extended asm at toplevel [PR41045]

2024-11-22 Thread Jakub Jelinek

On Thu, Nov 21, 2024 at 09:32:51PM +0000, Joseph Myers wrote:
> On Sat, 2 Nov 2024, Jakub Jelinek wrote:
> 
> > +Extended @code{asm} statements outside of functions may not use any
> > +qualifiers, may not specify clobbers, may not use @code{%}, @code{+} or
> > +@code{&} modifiers in constraints and can only use constraints which don%'t
> > +allow using any register.
> 
> Just ' in Texinfo, not %'.
> 
> > @@ -3071,7 +3072,62 @@ c_parser_declaration_or_fndef (c_parser
> >  static void
> >  c_parser_asm_definition (c_parser *parser)
> >  {
> > -  tree asm_str = c_parser_simple_asm_expr (parser);
> > +  location_t asm_loc = c_parser_peek_token (parser)->location;
> 
> The syntax comment above this function needs updating.
> 
> The C front-end changes are OK with that fix.

Thanks.
Here is the adjusted patch.

2024-11-22  Jakub Jelinek  

PR c/41045
gcc/
* output.h (insn_noperands): Declare.
* final.cc (insn_noperands): No longer static.
* varasm.cc (assemble_asm): Handle ASM_EXPR.
* lto-streamer-out.cc (lto_output_toplevel_asms): Add sorry_at
for non-STRING_CST toplevel asm for now.
* doc/extend.texi (Basic @code{asm}, Extended @code{asm}): Document
that extended asm is now allowed outside of functions with certain
restrictions.
gcc/c/
* c-parser.cc (c_parser_asm_string_literal): Add forward declaration.
(c_parser_asm_definition): Parse also extended asm without
clobbers/labels.
* c-typeck.cc (build_asm_expr): Allow extended asm outside of
functions and check extra restrictions.
gcc/cp/
* cp-tree.h (finish_asm_stmt): Add TOPLEV_P argument.
* parser.cc (cp_parser_asm_definition): Parse also extended asm
without clobbers/labels outside of functions.
* semantics.cc (finish_asm_stmt): Add TOPLEV_P argument, if set,
check extra restrictions for extended asm outside of functions.
* pt.cc (tsubst_stmt): Adjust finish_asm_stmt caller.
gcc/testsuite/
* c-c++-common/toplevel-asm-1.c: New test.
* c-c++-common/toplevel-asm-2.c: New test.
* c-c++-common/toplevel-asm-3.c: New test.

--- gcc/output.h.jj 2024-10-02 13:30:11.006426823 +0200
+++ gcc/output.h 2024-11-01 14:59:28.563541848 +0100
@@ -338,6 +338,9 @@ extern rtx_insn *current_output_insn;
The precise value is the insn being output, to pass to error_for_asm.  */
 extern const rtx_insn *this_is_asm_operands;
 
+/* Number of operands of this insn, for an `asm' with operands.  */
+extern unsigned int insn_noperands;
+
 /* Carry information from ASM_DECLARE_OBJECT_NAME
to ASM_FINISH_DECLARE_OBJECT.  */
 extern int size_directive_output;
--- gcc/final.cc.jj 2024-10-24 18:53:38.780079517 +0200
+++ gcc/final.cc 2024-11-01 14:59:28.980535887 +0100
@@ -150,7 +150,7 @@ extern const int length_unit_log; /* Thi
 const rtx_insn *this_is_asm_operands;
 
 /* Number of operands of this insn, for an `asm' with operands.  */
-static unsigned int insn_noperands;
+unsigned int insn_noperands;
 
 /* Compare optimization flag.  */
 
--- gcc/varasm.cc.jj 2024-10-29 13:51:43.247898084 +0100
+++ gcc/varasm.cc   2024-11-01 16:46:23.998237127 +0100
@@ -62,6 +62,8 @@ along with GCC; see the file COPYING3.
 #include "toplev.h"
 #include "opts.h"
 #include "asan.h"
+#include "recog.h"
+#include "gimple-expr.h"
 
 /* The (assembler) name of the first globally-visible object output.  */
 extern GTY(()) const char *first_global_object_name;
@@ -1667,16 +1669,167 @@ make_decl_rtl_for_debug (tree decl)
for an `asm' keyword used between functions.  */
 
 void
-assemble_asm (tree string)
+assemble_asm (tree asm_str)
 {
   const char *p;
-  app_enable ();
 
-  if (TREE_CODE (string) == ADDR_EXPR)
-string = TREE_OPERAND (string, 0);
+  if (TREE_CODE (asm_str) != ASM_EXPR)
+{
+  app_enable ();
+  if (TREE_CODE (asm_str) == ADDR_EXPR)
+   asm_str = TREE_OPERAND (asm_str, 0);
+
+  p = TREE_STRING_POINTER (asm_str);
+  fprintf (asm_out_file, "%s%s\n", p[0] == '\t' ? "" : "\t", p);
+}
+  else
+{
+  location_t save_loc = input_location;
+  int save_reload_completed = reload_completed;
+  int save_cse_not_expected = cse_not_expected;
+  input_location = EXPR_LOCATION (asm_str);
+  int noutputs = list_length (ASM_OUTPUTS (asm_str));
+  int ninputs = list_length (ASM_INPUTS (asm_str));
+  const char **constraints = NULL;
+  int i;
+  tree tail;
+  bool allows_mem, allows_reg, is_inout;
+  rtx *ops = NULL;
+  if (noutputs + ninputs > MAX_RECOG_OPERANDS)
+   {
+ error ("more than %d operands in %<asm%>", MAX_RECOG_OPERANDS);
+ goto done;
+   }
+  constraints = XALLOCAVEC (const char *, noutputs + ninputs);
+  ops = XALLOCAVEC (rtx, noutputs + ninputs);
+  memset (&recog_data, 0, sizeof (recog_data));
+  recog_data.n_operands = ninputs + noutputs;
+  recog_data.is_asm = true;

Re: [PATCH 00/11] Add FP overloads for __atomic_fetch_add etc

2024-11-22 Thread Matthew Malcomson

Wanted to provide a link to the current patch that Prathamesh has worked
on for automatically linking in libatomic (since this patch relies on
that for the ideal approach).

https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669461.html

On 11/14/24 13:55, mmalcom...@nvidia.com wrote:

From: Matthew Malcomson 

Hello,

This is the revision of the RFC posted here:
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663355.html

This patchset introduces floating point versions of atomic fetch_add,
fetch_sub, add_fetch and sub_fetch.  Instructions for performing these
operations have been directly available in GPU hardware for a while, and
are now starting to get added to CPU ISAs with instructions like the
AArch64 LDFADD.  Clang has allowed floating point types to be used with
these builtins for a while now (https://reviews.llvm.org/D71726).

Introducing these new overloads to this builtin allows users to directly
specify the operation needed and hence allows the compiler to provide
optimised output if possible.

There is additional motivation to use such floating point type atomic
operations in libstdc++ so that other compilers can use libstdc++ to
generate optimal code for their own targets (e.g. NVC++ can use
libstdc++ atomic::fetch_add to generate optimal code for GPUs
when using the `-stdpar` argument).  A patch implementing that has been
sent to the libstdc++ mailing list here:
https://gcc.gnu.org/pipermail/libstdc++/2024-October/059865.html

Use of such builtins in libstdc++ doesn't technically need introduction
of the builtins to GCC -- just ensuring that SFINAE techniques can be used
with overloaded builtins (as is done in this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665982.html).
That said it is still worthwhile to add these builtins -- especially now
that a primary target will be adding these operations.
N.b. I include an update to that libstdc++ patch in this patch series
because I noticed the ChangeLog in the cover letter had spaces instead
of tabs.

In the original RFC I suggested that libstdc++ use preprocessor checks
along the lines of `__has_builtin(__atomic_fetch_add_fp)` to determine
whether to use the new builtins or a CAS loop written inline.  After asking
the clang community I found out that such resolved versions of the atomic
builtins are not exposed by clang to the user.  They suggested the use of
SFINAE to determine if a given type works with the `__atomic_fetch_add`
builtin.  That's why I have posted the SFINAE patch above and chosen this
approach.

--
As standard with the existing atomic builtins, we add the same functions
in libatomic, allowing a fallback for when a given target has not
implemented these operations directly in hardware.  In order to use
these functions we need to have a naming scheme that encodes the type --
we use a suffix of _fp to denote that this operation is on a floating
point type, and a further empty suffix to denote a double, 'f' to denote
a float, and similar.  The scheme for the second part of the suffix is
taken from the existing builtins that have different versions for
different floating point types -- e.g.  __builtin_acosh,
__builtin_acoshf, __builtin_acoshl, etc.

In order to add floating point functions to libatomic we updated the
makefile machinery to use names more descriptive of the new setup (where
the SIZE of the datatype can no longer be used to distinguish all
operations from each other).  Moreover we add a CAS loop implementation
in fop_n.c that handles floating point exception information and handles
casting between floating point and integral types when switching between
applying the operation and using CAS to attempt to store.
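
To make the "casting between floating point and integral types" step concrete, here is a minimal sketch of a float fetch-add done as a CAS loop over the value's 32-bit representation. It is not the fop_n.c implementation: `fetch_add_float_cas` and the two conversion helpers are hypothetical names, the storage is declared as `uint32_t` to keep the sketch free of aliasing problems, and the floating point exception handling discussed in this cover letter is left out entirely.

```c
#include <stdint.h>
#include <string.h>

_Static_assert (sizeof (float) == sizeof (uint32_t),
                "sketch assumes a 32-bit float");

/* Reinterpret a bit pattern as a float and back, via memcpy.  */
static float
bits_to_float (uint32_t bits)
{
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

static uint32_t
float_to_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Hypothetical float fetch-add: the addition happens on floats, the
   compare-exchange happens on the integer representation.
   NOTE: no FP exception state is saved, cleared or restored here.  */
float
fetch_add_float_cas (uint32_t *storage, float val)
{
  uint32_t expected = __atomic_load_n (storage, __ATOMIC_RELAXED);
  uint32_t desired;
  do
    desired = float_to_bits (bits_to_float (expected) + val);
  while (!__atomic_compare_exchange_n (storage, &expected, desired,
                                       0 /* strong */,
                                       __ATOMIC_SEQ_CST,
                                       __ATOMIC_RELAXED));
  return bits_to_float (expected);
}
```

Comparing bit patterns rather than float values in the exchange step means the loop still terminates when the stored value is a NaN.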

--
As Joseph Myers pointed out in response to my RFC, when performing
floating point operations in a CAS loop there is the floating point
exception information to take care of.  In order to take care of this
information I use the existing `atomic_assign_expand_fenv` target hook
to generate code that checks this information.

Partly due to the fact that this hook emits GENERIC code and partly due
to the language-specific semantics of floating point exceptions, this
means we now decide whether to emit a CAS loop in the frontend
(during overload resolution).  The frontend decides to only use the
underlying builtin if the backend has an optab defined that can
implement it directly.

--
Now that the expansion to a CAS loop is performed in overloaded builtin
resolution, this means that if the user were to directly use a resolved
version (e.g. `__atomic_fetch_add_fp` for a double) that would not
expand into a CAS loop inline.  Instead (assuming the optab is not
implemented for this target) it would pass through and end up using the
libatomic fallback.

This is not ideal, but I believe the complexity of adding another clause
for this expansion to a CAS loop is not worth the benefit of handling a
CAS l

[PATCH] Adjust error message for initialized variable in .bss

2024-11-22 Thread Eric Botcazou

Hi,

if you compile the following Ada package:

package Bss1 is

  I : Integer := 0 with Linker_Section => ".bss";

end Bss1;

you get the error:

bss1.ads:3:3: error: only zero initializers are allowed in section '.bss'

which is quite paradoxical.  The reason is that the default setting in Ada is 
-fno-zero-initialized-in-bss instead of -fzero-initialized-in-bss.

Tested on x86-64/Linux, OK for the mainline?


2024-11-22  Eric Botcazou  

* doc/invoke.texi (-fno-zero-initialized-in-bss): Adjust for Ada.
* varasm.cc (get_variable_section): Adjust the error message for an
initialized variable in .bss to -fno-zero-initialized-in-bss.


2024-11-22  Eric Botcazou  

* gnat.dg/specs/bss1.ads: New test.

-- 
Eric Botcazou
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0951901f50a..abf3650e0a1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13037,7 +13037,7 @@ rely on variables going to the data section---e.g., so that the
 resulting executable can find the beginning of that section and/or make
 assumptions based on that.
 
-The default is @option{-fzero-initialized-in-bss}.
+The default is @option{-fzero-initialized-in-bss} except in Ada.
 
 @opindex fthread-jumps
 @item -fthread-jumps
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index acc4b4a0419..dd67dd441c0 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -1264,9 +1264,14 @@ get_variable_section (tree decl, bool prefer_noswitch_p)
   if ((sect->common.flags & SECTION_BSS)
 	  && !bss_initializer_p (decl, true))
 	{
-	  error_at (DECL_SOURCE_LOCATION (decl),
-		"only zero initializers are allowed in section %qs",
-		sect->named.name);
+	  if (flag_zero_initialized_in_bss)
+	error_at (DECL_SOURCE_LOCATION (decl),
+		  "only zero initializers are allowed in section %qs",
+		  sect->named.name);
+	  else
+	error_at (DECL_SOURCE_LOCATION (decl),
+		  "no initializers are allowed in section %qs",
+		  sect->named.name);
 	  DECL_INITIAL (decl) = error_mark_node;
 	}
   return sect;
package Bss1 is

  I : Integer := 0 with Linker_Section => ".bss"; -- { dg-error "no initializers" }

end Bss1;

Re: [PATCH] wwwdocs: Align the DCO text for the GNU Toolchain to match community usage.

2024-11-22 Thread Sam James

Carlos O'Donell  writes:

> On 11/21/24 1:47 PM, Sam James wrote:
>> Mark Wielaard  writes:
>> 
>>> Hi Carlos,
>>>
>>> On Thu, 2024-11-21 at 12:04 -0500, Carlos O'Donell wrote:
>>>> Adjust the DCO text to match the broader community usage including
>>>> the Linux kernel use around "real names."
>>>
>>> We made a similar change to switch from "real names" to "known
>>> identities" for elfutils a year ago:
>>> https://sourceware.org/cgit/elfutils/commit/CONTRIBUTING?id=b770e1c4def3532c7b59c4d2e4cd3cee26d4548b
>>>
>>> I suggest including the actual clarification in the explanation, so
>>> there is no confusion about what is meant by "known identity":
>>>
>>> diff --git a/htdocs/dco.html b/htdocs/dco.html
>>> index 68fa183b9fc0..f4bf17d2a6ec 100644
>>> --- a/htdocs/dco.html
>>> +++ b/htdocs/dco.html
>>> @@ -54,8 +54,10 @@ then you just add a line saying:
>>>  
>>>  Signed-off-by: Random J Developer 
>>> 
>>>  
>>> -using your real name (sorry, no pseudonyms or anonymous contributions.)  This
>>> -will be done for you automatically if you use `git commit -s`.
>>> +using a known identity (sorry, no pseudonyms or anonymous contributions.)
>>> +The name you use as your identity should not be an anonymous id or
>>> +false name that misrepresents who you are.
>>> +This will be done for you automatically if you use `git commit -s`.
>>>  
>>>  Some people also put extra optional tags at the end.  The GCC project does
>>>  not require tags from anyone other than the original author of the patch, but
>>>
>>> It looks like the cncf has an almost identical clarification.
>> 
>> I will note that our DCO in Gentoo was based on the kernel's, and we
>> changed ours in April last year accordingly to align with this update
>> too.
>
> That's good to know.
>
> Did you include any additional clarifying language?

The change was made in
https://gitweb.gentoo.org/data/glep.git/commit/?id=9733e2706ff46ebbc1c2b468f55006dd2921fca2
to our GLEP 76 document
(https://www.gentoo.org/glep/glep-0076.html#certificate-of-origin).

>
> I'd like to keep the language as simple as possible, and so I do not plan to
> make any further changes to my patch.
>
> I'm concerned that terms like "false" or "misrepresent" are context dependent
> and may lead to more confusion.
>
> I like that the linux kernel text is succinct.

The key part is the first hunk of the patch I linked where we wanted to
emphasise "established online identity", rather than a throwaway
pseudonym.

RE: [PATCH]middle-end: Pass along SLP node when costing vector loads/stores

2024-11-22 Thread Tamar Christina

> -----Original Message-----
> From: Richard Biener 
> Sent: Thursday, November 21, 2024 8:03 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: Re: [PATCH]middle-end: Pass along SLP node when costing vector
> loads/stores
> 
> On Wed, 20 Nov 2024, Tamar Christina wrote:
> 
> > Hi All,
> >
> > With the support to SLP only we now pass the VMAT through the SLP node, however
> > the majority of the costing calls inside vectorizable_load and
> > vectorizable_store do not pass the SLP node along.  Due to this the backend costing
> > never sees the VMAT for these cases anymore.
> >
> > Additionally the helper around record_stmt_cost when both SLP and stmt_vinfo are
> > passed would only pass the SLP node along.  However the SLP node doesn't contain
> > all the info available in the stmt_vinfo and we'd have to go through the
> > SLP_TREE_REPRESENTATIVE anyway.  As such I changed the function to just always
> > pass both along.  Unlike the VMAT changes, I don't believe there to be a
> > correctness issue here but would minimize the amount of churn in the backend
> > costing until vectorizer costing as a whole is revisited in GCC 16.
> 
> I agree this is the best way forward at this point - I originally
> thought to never pass both and treat the calls with SLP node as being
> the "future" way, but clearly we're not even close to a "future" costing
> API right now.

After trunk stabilizes a bit I plan to look at the other functions and move
the AArch64 backend to use only SLP_tree.  One thing I was wondering about is
what to do with costing that happens early during say DR analysis where we
don't have an SLP tree.

So I guess the ABI must always have the duality of getting both the
stmt_vinfo and SLP tree but we could go back to your original design of not
setting stmt_vinfo if an SLP tree is given which would distinguish between
the two?

Thanks,
Tamar

> 
> > These changes re-enable the cost model on AArch64 and also correctly find
> > the VMATs on loads and stores fixing testcases such as sve_iters_low_2.c.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu -m32, -m64 and no issues.
> >
> > Ok for master?
> 
> OK.
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * tree-vect-data-refs.cc (vect_get_data_access_cost): Pass NULL for SLP
> > node.
> > * tree-vect-stmts.cc (record_stmt_cost): Expose.
> > (vect_get_store_cost, vect_get_load_cost): Extend with SLP node.
> > (vectorizable_store, vectorizable_load): Pass SLP node to all costing.
> > * tree-vectorizer.h (record_stmt_cost): Always pass both SLP node and
> > stmt_vinfo to costing.
> > (vect_get_load_cost, vect_get_store_cost): Extend with SLP node.
> >
> > ---
> > diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> > index 3ea5fb883b1a5289195142171eb45fa422910a95..d87ca79b8e4c16d242e67431d1b527bdb8cb74e4 100644
> > --- a/gcc/tree-vect-data-refs.cc
> > +++ b/gcc/tree-vect-data-refs.cc
> > @@ -1729,12 +1729,14 @@ vect_get_data_access_cost (vec_info *vinfo, dr_vec_info *dr_info,
> >  ncopies = vect_get_num_copies (loop_vinfo, STMT_VINFO_VECTYPE (stmt_info));
> >
> >if (DR_IS_READ (dr_info->dr))
> > -vect_get_load_cost (vinfo, stmt_info, ncopies, alignment_support_scheme,
> > -   misalignment, true, inside_cost,
> > -   outside_cost, prologue_cost_vec, body_cost_vec, false);
> > +vect_get_load_cost (vinfo, stmt_info, NULL, ncopies,
> > +   alignment_support_scheme, misalignment, true,
> > +   inside_cost, outside_cost, prologue_cost_vec,
> > +   body_cost_vec, false);
> >else
> > -vect_get_store_cost (vinfo,stmt_info, ncopies, alignment_support_scheme,
> > -misalignment, inside_cost, body_cost_vec);
> > +vect_get_store_cost (vinfo,stmt_info, NULL, ncopies,
> > +alignment_support_scheme, misalignment, inside_cost,
> > +body_cost_vec);
> >
> >if (dump_enabled_p ())
> >  dump_printf_loc (MSG_NOTE, vect_location,
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index 7a92da00f7ddcfdf146fa1c2511f609e8bc40e9e..46543c15c00f00e5127d06446f58fce79951c3b0 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -93,7 +93,7 @@ stmt_in_inner_loop_p (vec_info *vinfo, class _stmt_vec_info *stmt_info)
> > target model or by saving it in a vector for later processing.
> > Return a preliminary estimate of the statement's cost.  */
> >
> > -static unsigned
> > +unsigned
> >  record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
> >   enum vect_cost_for_stmt kind,
> >   stmt_vec_info stmt_info, slp_tree node,
> > @@ -1008,8 +1008,8 @@ cfun_returns (tree decl)
> >
> >  /* Calculate cost of DR's memory access.  */
> >  void
>

[PATCH] testsuite: Fix up various powerpc tests after -std=gnu23 by default switch [PR117663]

2024-11-22 Thread Jakub Jelinek

Hi!

These tests use the K&R function style definitions or pass arguments
to () functions.
It seemed easiest to just use -std=gnu17 for all of those.
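
For readers unfamiliar with the constructs involved, here is a minimal
sketch (not taken from any of the tests) of a K&R-style definition and a
"()" declaration that is called with arguments.  The function names are
made up for illustration.  Before C23 an empty parameter list leaves the
arguments unspecified; in C23 it means "(void)" and K&R definitions are
no longer accepted, which is why the sketch switches on __STDC_VERSION__.

```c
#include <assert.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ > 201710L
/* C23 and later: "()" means "(void)", so a real prototype is needed.  */
int add (int a, int b);

int
add (int a, int b)
{
  return a + b;
}
#else
/* Pre-C23: "()" says nothing about the parameters, so the call below
   is accepted without a prototype.  */
int add ();

/* K&R-style definition: parameter types follow the declarator.  */
int
add (a, b)
     int a;
     int b;
{
  return a + b;
}
#endif

/* Hypothetical caller, only here to exercise the declaration above.  */
int
call_add (void)
{
  return add (2, 3);
}
```

Adding -std=gnu17 to a test keeps the second branch valid without
rewriting the test source.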

Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (on the
latter tested with -m32/-m64), ok for trunk?

2024-11-22  Jakub Jelinek  

PR testsuite/117663
* gcc.target/powerpc/pr58673-1.c: Add -std=gnu17 to dg-options.
* gcc.target/powerpc/pr64505.c: Likewise.
* gcc.target/powerpc/pr116170.c: Likewise.
* gcc.target/powerpc/pr58673-2.c: Likewise.
* gcc.target/powerpc/pr64019.c: Likewise.
* gcc.target/powerpc/pr96506-1.c: Likewise.
* gcc.target/powerpc/swaps-stack-protector.c: Likewise.
* gcc.target/powerpc/pr78543.c: Likewise.
* gcc.dg/vect/pr48765.c: Add -std=gnu17 to dg-additional-options.

--- gcc/testsuite/gcc.target/powerpc/pr58673-1.c.jj 2024-06-04 13:19:04.531594020 +0200
+++ gcc/testsuite/gcc.target/powerpc/pr58673-1.c 2024-11-21 18:57:26.724287696 +0100
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O1" } */
+/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O1 -std=gnu17" } */
 /* { dg-require-effective-target powerpc_vsx } */
 
 enum typecode
--- gcc/testsuite/gcc.target/powerpc/pr64505.c.jj 2020-11-09 15:25:52.0 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr64505.c 2024-11-21 19:08:32.258800032 +0100
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-skip-if "" { powerpc*-*-aix* } } */
-/* { dg-options "-w -O2 -mpowerpc64" } */
+/* { dg-options "-w -O2 -mpowerpc64 -std=gnu17" } */
 
 /*
  * (below is minimized test case)
--- gcc/testsuite/gcc.target/powerpc/pr116170.c.jj 2024-11-21 18:56:42.283921230 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr116170.c 2024-11-21 18:55:57.301562478 +0100
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target ppc_float128_sw } */
-/* { dg-options "-mdejagnu-cpu=power8 -O2 -fstack-protector-strong -ffloat-store" } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2 -fstack-protector-strong -ffloat-store -std=gnu17" } */
 
 /* Verify there is no ICE.  */
 
--- gcc/testsuite/gcc.target/powerpc/pr58673-2.c.jj 2024-06-04 13:19:04.531594020 +0200
+++ gcc/testsuite/gcc.target/powerpc/pr58673-2.c 2024-11-21 18:59:33.549479716 +0100
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O3 -funroll-loops" } */
+/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O3 -funroll-loops -std=gnu17" } */
 /* { dg-require-effective-target powerpc_vsx } */
 
 #include 
--- gcc/testsuite/gcc.target/powerpc/pr64019.c.jj 2024-06-04 13:19:04.531594020 +0200
+++ gcc/testsuite/gcc.target/powerpc/pr64019.c 2024-11-21 19:00:08.110987010 +0100
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-options "-O2 -ffast-math -mdejagnu-cpu=power7" } */
+/* { dg-options "-O2 -ffast-math -mdejagnu-cpu=power7 -std=gnu17" } */
 /* { dg-require-effective-target powerpc_vsx } */
 
 #include 
--- gcc/testsuite/gcc.target/powerpc/pr96506-1.c.jj 2020-11-22 19:11:44.0 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr96506-1.c 2024-11-21 19:09:58.042577378 +0100
@@ -1,7 +1,7 @@
 /* PR target/96506 */
 /* { dg-do compile } */
 /* { dg-require-effective-target power10_ok } */
-/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -std=gnu17" } */
 
 extern void bar0();
 extern void bar1();
--- gcc/testsuite/gcc.target/powerpc/swaps-stack-protector.c.jj 2020-01-12 11:54:38.0 +0100
+++ gcc/testsuite/gcc.target/powerpc/swaps-stack-protector.c 2024-11-21 19:12:01.487819286 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fstack-protector -O3" } */
+/* { dg-options "-fstack-protector -O3 -std=gnu17" } */
 
 /* PR78695: This code used to ICE in rs6000.c:find_alignment_op because
the stack protector address definition isn't associated with an insn.  */
--- gcc/testsuite/gcc.target/powerpc/pr78543.c.jj 2024-06-04 13:19:04.0 +0200
+++ gcc/testsuite/gcc.target/powerpc/pr78543.c 2024-11-21 19:09:13.071218226 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
-/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O1" } */
+/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O1 -std=gnu17" } */
 /* { dg-require-effective-target powerpc_vsx } */
 
 typedef long a;
--- gcc/testsuite/gcc.dg/vect/pr48765.c.jj 2024-02-22 10:10:19.705018061 +0100
+++ gcc/testsuite/gcc.dg/vect/pr48765.c 2024-11-21 18:54:59.592385168 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-additional-options "-O3 -mdejagnu-cpu=power6 -mno-vsx" } */
+/* { dg-additional-options "-O3 -mde

[PATCH] v3: Add -f{,no-}assume-sane-operators-new-delete options [PR110137]

2024-11-22 Thread Jakub Jelinek

On Tue, Nov 19, 2024 at 01:52:06PM +0100, Jan Hubicka wrote:
> > On Tue, Nov 19, 2024 at 11:23:31AM +0100, Jakub Jelinek wrote:
> > > On Tue, Nov 19, 2024 at 10:25:16AM +0100, Richard Biener wrote:
> > > > I think it's pretty clear and easy to describe to users what "m " and
> > > > what "mC" do.  But with "pure" this is an odd intermediate state.  For
> > > > both "m " and "mP" you suggest above the new/delete might modify their
> > > > global state but as you can't rely on the new/delete pair to prevail
> > > > you cannot rely on the modification to happen.  But how do you explain
> > > > that
> > > 
> > > If we are willing to make the default not strictly conforming (i.e.
> > > basically revert PR101480 by default and make the GCC 11.1/11.2 behavior
> > > the default and allow -fno-sane-operators-new-delete to change to GCC
> > > 11.3/14.* behavior), I can live with it.
> > > But we need to make the documentation clear that the default is not
> > > strictly conforming.
> > 
> > Here is a modified version of the patch to do that.
> > 
> > Or do we want to set the default based on -std= option (-std=gnu* implies
> > -fassume-sane-operators-new-delete, -std=c++* implies
> > -fno-assume-sane-operators-new-delete)?  Though, not sure what to do for
> > LTO then.
> 
> My original plan was to add a " sane" attribute to the declarations and
> prevent them from being merged.  Then every direct call to new/delete
> would know if it came from a sane or insane translation unit.
> 
> Alternatively one can also declare
>  +C++ ObjC++ LTO Var(flag_assume_sane_operators_new_delete) Init(1)
>  +Assume C++ replaceable global operators new, new[], delete, delete[] don't read or write visible global state.
> as optimization.  Then sanity would be function specific.
> 
> inline_call contains code that drops flag_strict_aliasing for a function
> when it inlines a -fno-strict-aliasing function into a -fstrict-aliasing one.
> At the same place we can make new/delete operator insanity similarly
> contagious.  If you inline a function that has insane new/delete calls you
> make the combined function also insane.

Here is an updated patch, which makes it Optimize and merges it during
inlining (pessimistically).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, perhaps it could be refined (maybe incrementally) to only clear
the flag on inlining if the callee actually has any
gimple_call_from_new_or_delete stmts calling DECL_IS_REPLACEABLE_OPERATOR.
I.e., just as it already collects whether a function uses floating point
operations, it could also gather this in another flag.
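
To make the merging rule concrete, here is a rough model in plain C.  It
is only a sketch under stated assumptions: struct fn_opts and both
functions are hypothetical and do not correspond to GCC's real data
structures or to the code in inline_call; they just mirror the
"pessimistic" merge and the possible refinement described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-function option set, for illustration only.  */
struct fn_opts
{
  bool strict_aliasing;
  bool assume_sane_new_delete;
};

/* Pessimistic merge: after inlining, the caller may only keep an
   assumption that the callee was compiled with as well.  */
void
merge_opts_on_inline (struct fn_opts *caller, const struct fn_opts *callee)
{
  if (!callee->strict_aliasing)
    caller->strict_aliasing = false;
  if (!callee->assume_sane_new_delete)
    caller->assume_sane_new_delete = false;
}

/* Possible refinement: only drop the new/delete assumption when the
   callee actually contains calls to replaceable operators.  */
void
merge_opts_refined (struct fn_opts *caller, const struct fn_opts *callee,
		    bool callee_calls_new_delete)
{
  if (!callee->strict_aliasing)
    caller->strict_aliasing = false;
  if (!callee->assume_sane_new_delete && callee_calls_new_delete)
    caller->assume_sane_new_delete = false;
}
```

With the refinement, inlining a -fno-assume-sane-operators-new-delete
callee that never allocates would leave the caller's flag untouched.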

2024-11-22  Jakub Jelinek  

PR c++/110137
PR middle-end/101480
gcc/
* doc/invoke.texi (-fassume-sane-operators-new-delete,
-fno-assume-sane-operators-new-delete): Document.
* gimple.cc (gimple_call_fnspec): Handle
-f{,no-}assume-sane-operators-new-delete.
* ipa-inline-transform.cc (inline_call): Also clear
flag_assume_sane_operators_new_delete on caller when inlining
-fno-assume-sane-operators-new-delete callee into
-fassume-sane-operators-new-delete caller.
gcc/c-family/
* c.opt (fassume-sane-operators-new-delete): New option.
gcc/testsuite/
* g++.dg/tree-ssa/pr110137-1.C: New test.
* g++.dg/tree-ssa/pr110137-2.C: New test.
* g++.dg/tree-ssa/pr110137-3.C: New test.
* g++.dg/tree-ssa/pr110137-4.C: New test.
* g++.dg/torture/pr10148.C: Add -fno-assume-sane-operators-new-delete
as dg-additional-options.
* g++.dg/warn/Warray-bounds-16.C: Revert 2021-11-10 changes.

--- gcc/doc/invoke.texi.jj  2024-11-20 14:27:49.257228428 +0100
+++ gcc/doc/invoke.texi 2024-11-20 14:44:02.819559242 +0100
@@ -213,7 +213,9 @@ in the following sections.
 @item C++ Language Options
 @xref{C++ Dialect Options,,Options Controlling C++ Dialect}.
 @gccoptlist{-fabi-version=@var{n}  -fno-access-control
--faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new
+-faligned-new=@var{n}  -fargs-in-order=@var{n}
+-fno-assume-sane-operators-new-delete
+-fchar8_t  -fcheck-new
 -fconcepts  -fconstexpr-depth=@var{n}  -fconstexpr-cache-depth=@var{n}
 -fconstexpr-loop-limit=@var{n}  -fconstexpr-ops-limit=@var{n}
 -fno-elide-constructors
@@ -3164,6 +3166,35 @@ but few users will need to override the
 
 This flag is enabled by default for @option{-std=c++17}.
 
+@opindex fno-assume-sane-operators-new-delete
+@opindex fassume-sane-operators-new-delete
+@item -fno-assume-sane-operators-new
+The C++ standard allows replacing the global @code{new}, @code{new[]},
+@code{delete} and @code{delete[]} operators, though a lot of C++ programs
+don't replace them and just use the implementation provided version.
+Furthermore, the C++ standard allows omitting those calls if they are
+made from new or delete expressions (and by extension the same is
+assumed if @code{__builtin_operator_new} or @code{__builtin_operator_delete}
+functions are used).
+This

Re: [PATCH] libsanitizer: Move language level from gnu++14 to gnu++17

2024-11-22 Thread Jakub Jelinek

On Thu, Nov 21, 2024 at 05:21:27PM -0800, Andrew Pinski wrote:
> While compiling libsanitizer for aarch64-linux-gnu, I noticed the new warning:
> ```
> ../../../../libsanitizer/asan/asan_interceptors.cpp: In function ‘char* ___interceptor_strcpy(char*, const char*)’:
> ../../../../libsanitizer/asan/asan_interceptors.cpp:554:6: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’ [-Wc++17-extensions]
>   554 |   if constexpr (SANITIZER_APPLE) {
>   |  ^
> ```
> So compiler-rt upstream compiles this as gnu++17 (the current default for
> clang), so let's update it to be similar.
> 
> Built and tested on aarch64-linux-gnu.
> 
>   PR sanitizer/117731
> libsanitizer/ChangeLog:
> 
>   * asan/Makefile.am: Replace gnu++14 with gnu++17.
>   * asan/Makefile.in: Regenerate.
>   * hwasan/Makefile.am: Replace gnu++14 with gnu++17.
>   * hwasan/Makefile.in: Regenerate.
>   * interception/Makefile.am: Replace gnu++14 with gnu++17.
>   * interception/Makefile.in: Regenerate.
>   * libbacktrace/Makefile.am: Replace gnu++14 with gnu++17.
>   * libbacktrace/Makefile.in: Regenerate.
>   * lsan/Makefile.am: Replace gnu++14 with gnu++17.
>   * lsan/Makefile.in: Regenerate.
>   * sanitizer_common/Makefile.am: Replace gnu++14 with gnu++17.
>   * sanitizer_common/Makefile.in: Regenerate.
>   * tsan/Makefile.am: Replace gnu++14 with gnu++17.
>   * tsan/Makefile.in: Regenerate.
>   * ubsan/Makefile.am: Replace gnu++14 with gnu++17.
>   * ubsan/Makefile.in: Regenerate.

Please change the ChangeLog
s/am:/am (AM_CXXFLAGS):/g

> Signed-off-by: Andrew Pinski 

Ok with that nit changed.
I have a follow-up patch but will adjust it after you commit.

Jakub

RE: [PATCH]middle-end: Pass along SLP node when costing vector loads/stores

2024-11-22 Thread Richard Biener

On Fri, 22 Nov 2024, Tamar Christina wrote:

> > -Original Message-
> > From: Richard Biener 
> > Sent: Thursday, November 21, 2024 8:03 AM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd 
> > Subject: Re: [PATCH]middle-end: Pass along SLP node when costing vector
> > loads/stores
> > 
> > On Wed, 20 Nov 2024, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > With the support to SLP only we now pass the VMAT through the SLP node,
> > > however the majority of the costing calls inside vectorizable_load and
> > > vectorizable_store do not pass the SLP node along.  Due to this the
> > > backend costing never sees the VMAT for these cases anymore.
> > >
> > > Additionally the helper around record_stmt_cost when both SLP and
> > > stmt_vinfo are passed would only pass the SLP node along.  However the SLP
> > > node doesn't contain all the info available in the stmt_vinfo and we'd have
> > > to go through the SLP_TREE_REPRESENTATIVE anyway.  As such I changed the
> > > function to just always pass both along.  Unlike the VMAT changes, I don't
> > > believe there to be a correctness issue here but would minimize the amount
> > > of churn in the backend costing until vectorizer costing as a whole is
> > > revisited in GCC 16.
> > 
> > I agree this is the best way forward at this point - I originally
> > thought to never pass both and treat the calls with SLP node as being
> > the "future" way, but clearly we're not even close to a "future" costing
> > API right now.
> 
> After trunk stabilizes a bit I plan to look at the other functions and move
> the AArch64 backend to use only SLP_tree.  One thing I was wondering about is
> what to do with costing that happens early during say DR analysis where we
> don't have an SLP tree.
> 
> So I guess the ABI must always have the duality of getting both the
> stmt_vinfo and SLP tree but we could go back to your original design of not
> setting stmt_vinfo if an SLP tree is given which would distinguish between
> the two?

I think we need to look at the duality we have right now with the
builtin_vectorization_cost target hook and the vector_cost create_costs
hook.  While the former is kind-of "legacy" we still need it when we
want to cost something without a concrete stmt or context.  So my plan
was to look over those cases and re-design that hook (or create a new
set of hooks) to cover those uses - the DR alignment peeling case is
such, IIRC it currently uses the hook and the aligned/unaligned load/store
costs already (we also do not know what VMAT_* we'll choose, alignment
might not matter in the end if we end up choosing VMAT_ELEMENTWISE
for example).  The other item of cost thought is the chicken-and-egg
issue of deciding how to vectorize loads + stores, aka load-lanes
vs. interleaving vs. gather - we ideally want something that doesn't
require full loop costing for all variants as that'll easily explode.

Btw, my first next stage1 target is likely trying to clean up
vectorizable_{load,store} with only SLP in mind.  And thinking of
how to handle vect_make_slp_decision, aka marking SLP stmts, finding
hybrid and then rejecting if we didn't cover all relevant stmts with
SLP.  I'm going to try building a full single-lane SLP graph covering
all relevant stmts for this (aka, first step of new SLP discovery),
and at first allow the old SLP discovery to replace handled single-lane
instances somehow (somehow so we have the single-lane instances ready
to fall back to, possibly even analyze them).

Richard.

> Thanks,
> Tamar
> 
> > 
> > > These changes re-enable the cost model on AArch64 and also correctly find
> > > the VMATs on loads and stores fixing testcases such as sve_iters_low_2.c.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu -m32, -m64 and no issues.
> > >
> > > Ok for master?
> > 
> > OK.
> > 
> > Thanks,
> > Richard.
> > 
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * tree-vect-data-refs.cc (vect_get_data_access_cost): Pass NULL for SLP
> > >   node.
> > >   * tree-vect-stmts.cc (record_stmt_cost): Expose.
> > >   (vect_get_store_cost, vect_get_load_cost): Extend with SLP node.
> > >   (vectorizable_store, vectorizable_load): Pass SLP node to all costing.
> > >   * tree-vectorizer.h (record_stmt_cost): Always pass both SLP node and
> > >   stmt_vinfo to costing.
> > >   (vect_get_load_cost, vect_get_store_cost): Extend with SLP node.
> > >
> > > ---
> > > diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> > > index 3ea5fb883b1a5289195142171eb45fa422910a95..d87ca79b8e4c16d242e67431d1b527bdb8cb74e4 100644
> > > --- a/gcc/tree-vect-data-refs.cc
> > > +++ b/gcc/tree-vect-data-refs.cc
> > > @@ -1729,12 +1729,14 @@ vect_get_data_access_cost (vec_info *vinfo, dr_vec_info *dr_info,
> > >  ncopies = vect_get_num_copies (loop_vinfo, STMT_VINFO_VECTYPE (stmt_info));
> > >
> > >if (

[PATCH] Regenerate opt urls for r15-5584.

2024-11-22 Thread Lulu Cheng

gcc/ChangeLog:

* config/g.opt.urls: Regenerate.
* config/i386/nto.opt.urls: Regenerate.
* config/riscv/riscv.opt.urls: Regenerate.
* config/rx/rx.opt.urls: Regenerate.
* config/sol2.opt.urls: Regenerate.
---
 gcc/config/g.opt.urls   | 2 +-
 gcc/config/i386/nto.opt.urls| 2 +-
 gcc/config/riscv/riscv.opt.urls | 2 +-
 gcc/config/rx/rx.opt.urls   | 2 +-
 gcc/config/sol2.opt.urls| 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/g.opt.urls b/gcc/config/g.opt.urls
index 4ffd5cbd2cf..10ab02a6d63 100644
--- a/gcc/config/g.opt.urls
+++ b/gcc/config/g.opt.urls
@@ -1,5 +1,5 @@
 ; Autogenerated by regenerate-opt-urls.py from gcc/config/g.opt and generated HTML
 
 G
-UrlSuffix(gcc/System-V-Options.html#index-G-5)
+UrlSuffix(gcc/System-V-Options.html#index-G-6)
 
diff --git a/gcc/config/i386/nto.opt.urls b/gcc/config/i386/nto.opt.urls
index 37c07a5b88b..055e669d54b 100644
--- a/gcc/config/i386/nto.opt.urls
+++ b/gcc/config/i386/nto.opt.urls
@@ -1,5 +1,5 @@
 ; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/nto.opt and generated HTML
 
 G
-UrlSuffix(gcc/System-V-Options.html#index-G-5)
+UrlSuffix(gcc/System-V-Options.html#index-G-6)
 
diff --git a/gcc/config/riscv/riscv.opt.urls b/gcc/config/riscv/riscv.opt.urls
index 622cb6e7b44..294d6628e86 100644
--- a/gcc/config/riscv/riscv.opt.urls
+++ b/gcc/config/riscv/riscv.opt.urls
@@ -33,7 +33,7 @@ mcpu=
 UrlSuffix(gcc/RISC-V-Options.html#index-mcpu-8)
 
 msmall-data-limit=
-UrlSuffix(gcc/RISC-V-Options.html#index-msmall-data-limit-1)
+UrlSuffix(gcc/RISC-V-Options.html#index-msmall-data-limit)
 
 msave-restore
 UrlSuffix(gcc/RISC-V-Options.html#index-msave-restore)
diff --git a/gcc/config/rx/rx.opt.urls b/gcc/config/rx/rx.opt.urls
index 7af4bd249d8..62e2a23cba6 100644
--- a/gcc/config/rx/rx.opt.urls
+++ b/gcc/config/rx/rx.opt.urls
@@ -22,7 +22,7 @@ mlittle-endian-data
 UrlSuffix(gcc/RX-Options.html#index-mlittle-endian-data)
 
 msmall-data-limit=
-UrlSuffix(gcc/RX-Options.html#index-msmall-data-limit-2)
+UrlSuffix(gcc/RX-Options.html#index-msmall-data-limit-1)
 
 mrelax
 UrlSuffix(gcc/RX-Options.html#index-mrelax-7)
diff --git a/gcc/config/sol2.opt.urls b/gcc/config/sol2.opt.urls
index ef64d47d65e..950bb860719 100644
--- a/gcc/config/sol2.opt.urls
+++ b/gcc/config/sol2.opt.urls
@@ -1,7 +1,7 @@
 ; Autogenerated by regenerate-opt-urls.py from gcc/config/sol2.opt and generated HTML
 
 G
-UrlSuffix(gcc/System-V-Options.html#index-G-5)
+UrlSuffix(gcc/System-V-Options.html#index-G-6)
 
 mclear-hwcap
 UrlSuffix(gcc/Solaris-2-Options.html#index-mclear-hwcap)
-- 
2.34.1

[PATCH] i386/testsuite: Correct AVX10.2 FP8 test mask usage

2024-11-22 Thread Haochen Jiang

Hi all,

Under FP8, we should not use AVX512F_LEN_HALF to get the mask size since
it yields 16 instead of 8 and takes the wrong if branch. Correct the
usage for the vcvtneph2[b,h]f8[,s] runtime tests.
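
As a worked example of the arithmetic, the following sketch computes both
candidate sizes for each vector length.  The helper names are made up,
and len_half encodes an assumption about the test headers: that for the
128-bit variant AVX512F_LEN_HALF stays at 128.  That assumption matches
the "16 instead of 8" observation above but is not quoted from the
headers.

```c
#include <assert.h>

/* Assumed value of AVX512F_LEN_HALF for a given AVX512F_LEN: the
   128-bit variant keeps a 128-bit destination register.  */
int
len_half (int len)
{
  return len == 128 ? 128 : len / 2;
}

/* Number of _Float16 source elements; one mask bit per element.  */
int
src_elems (int len)
{
  return len / 16;
}

/* Number of FP8 bytes in the destination register.  */
int
dst_bytes (int len)
{
  return len_half (len) / 8;
}
```

Under that assumption the two counts agree for 256 and 512 bits and
differ only for 128 bits, where the source has 8 elements but the
destination has 16 bytes.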

Tested under sde. Ok for trunk?

Thx,
Haochen

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c: Correct 128bit
mask usage.
* gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2hf8s-2.c: Ditto.
---
 .../i386/avx10_2-512-vcvtneph2bf8-2.c | 25 +++
 .../i386/avx10_2-512-vcvtneph2bf8s-2.c| 25 +++
 .../i386/avx10_2-512-vcvtneph2hf8-2.c | 23 ++---
 .../i386/avx10_2-512-vcvtneph2hf8s-2.c| 23 ++---
 4 files changed, 58 insertions(+), 38 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c b/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c
index d5ba911334c..96ca7e80c4d 100644
--- a/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c
@@ -11,8 +11,8 @@
 #include "avx10-helper.h"
 #include "fp8-helper.h"
 
-#define SIZE_SRC (AVX512F_LEN / 16)
-#define SIZE (AVX512F_LEN_HALF / 8)
+#define SIZE (AVX512F_LEN / 16)
+#define SIZE_DST (AVX512F_LEN_HALF / 8)
 #include "avx512f-mask-type.h"
 
 void
@@ -23,14 +23,14 @@ CALC (unsigned char *r, _Float16 *s)
   hf8_bf8 = 1;
   saturate = 0;
   
-  for (i = 0; i < SIZE; i++)
+  for (i = 0; i < SIZE_DST; i++)
 {
   r[i] = 0;
-  if (i < SIZE_SRC)
+  if (i < SIZE)
{
  Float16Union usrc = {.f16 = s[i]};
  r[i] = convert_fp16_to_fp8(usrc.f16, 0, hf8_bf8, saturate);
}
 }
 }
 
@@ -41,17 +41,22 @@ TEST (void)
   UNION_TYPE (AVX512F_LEN_HALF, i_b) res1, res2, res3; 
   UNION_TYPE (AVX512F_LEN, h) src;
   MASK_TYPE mask = MASK_VALUE;
-  unsigned char res_ref[SIZE];
+  unsigned char res_ref[SIZE_DST];
 
   sign = 1;
-  for (i = 0; i < SIZE_SRC; i++)
+  for (i = 0; i < SIZE; i++)
 {
   src.a[i] = (_Float16)(sign * (2.5 * (1 << (i % 3))));
   sign = -sign;
 }
 
+#if AVX512F_LEN > 128
+  for (i = 0; i < SIZE_DST; i++)
+res2.a[i] = DEFAULT_VALUE;
+#else
   for (i = 0; i < SIZE; i++)
 res2.a[i] = DEFAULT_VALUE;
+#endif
 
   CALC(res_ref, src.a);
 
diff --git a/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c b/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c
index 49e170aa428..c458f1ebb77 100644
--- a/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c
@@ -11,8 +11,8 @@
 #include "avx10-helper.h"
 #include "fp8-helper.h"
 
-#define SIZE_SRC (AVX512F_LEN / 16)
-#define SIZE (AVX512F_LEN_HALF / 8)
+#define SIZE (AVX512F_LEN / 16)
+#define SIZE_DST (AVX512F_LEN_HALF / 8)
 #include "avx512f-mask-type.h"
 
 void
@@ -23,14 +23,14 @@ CALC (unsigned char *r, _Float16 *s)
   hf8_bf8 = 1;
   saturate = 1;
   
-  for (i = 0; i < SIZE; i++)
+  for (i = 0; i < SIZE_DST; i++)
 {
   r[i] = 0;
-  if (i < SIZE_SRC)
+  if (i < SIZE)
{
  Float16Union usrc = {.f16 = s[i]};
  r[i] = convert_fp16_to_fp8(usrc.f16, 0, hf8_bf8, saturate);
}
 }
 }
 
@@ -41,17 +41,22 @@ TEST (void)
   UNION_TYPE (AVX512F_LEN_HALF, i_b) res1, res2, res3; 
   UNION_TYPE (AVX512F_LEN, h) src;
   MASK_TYPE mask = MASK_VALUE;
-  unsigned char res_ref[SIZE];
+  unsigned char res_ref[SIZE_DST];
 
   sign = 1;
-  for (i = 0; i < SIZE_SRC; i++)
+  for (i = 0; i < SIZE; i++)
 {
   src.a[i] = (_Float16)(sign * (2.5 * (1 << (i % 3))));
   sign = -sign;
 }
 
+#if AVX512F_LEN > 128
+  for (i = 0; i < SIZE_DST; i++)
+res2.a[i] = DEFAULT_VALUE;
+#else
   for (i = 0; i < SIZE; i++)
 res2.a[i] = DEFAULT_VALUE;
+#endif
 
   CALC(res_ref, src.a);
 
diff --git a/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c b/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c
index f481b72cc71..cb9cdbb89c1 100644
--- a/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c
@@ -11,8 +11,8 @@
 #include "avx10-helper.h"
 #include "fp8-helper.h"
 
-#define SIZE_SRC (AVX512F_LEN / 16)
-#define SIZE (AVX512F_LEN_HALF / 8)
+#define SIZE (AVX512F_LEN / 16)
+#define SIZE_DST (AVX512F_LEN_HALF / 8)
 #include "avx512f-mask-type.h"
 
 void
@@ -23,14 +23,14 @@ CALC (unsigned char *r, _Float16 *s)
   hf8_bf8 = 0;
   saturate = 0;
   
-  for (i = 0; i < SIZE; i++)
+  for (i = 0; i < SIZE_DST; i++)
 {
   r[i] = 0;
-  if (i < SIZE_SRC)
+  if (i < SIZE)
{
  Float16Union usrc = {.f16 = s[i]};
  r[i] = convert_fp16_to_fp8(usrc.f16, 0, hf8_bf8, saturate);
}
 }
 }
 
@@ -44,14 +44,19 @@ TEST (void)
   unsigned char res_ref[SIZE];
 
   sign = 1;
-  for (i = 0

[PATCH] i386: Make __builtin_ia32_f{nstenv,ldenv,nstsw,fnclex} builtins internal [PR117165]

2024-11-22 Thread Jakub Jelinek

Hi!

As the comment says, these builtins are meant to be internal for the atomic
support and cause various ICEs when using them directly in various
conditions.
So the following patch makes them internal.
We do have also internal-fn.*, but those target specific builtins would
need to be there in generic code, so I've just added space to their name,
which is the old way to hide builtins/attributes etc.
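
Why a trailing space hides a builtin can be sketched as follows.  This is
an illustration only: the table and the lookup are hypothetical and are
not how GCC stores or resolves builtins.  The point is that a name
spelled in source code is always a valid identifier, so it can never
compare equal to a registered name that contains a space.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry of builtin names.  */
static const char *const builtin_names[] = {
  "__builtin_ia32_pause",
  "__builtin_ia32_fnstsw ",	/* trailing space: internal use only */
};

/* True if S has the shape of a C identifier.  */
bool
is_c_identifier (const char *s)
{
  if (!(isalpha ((unsigned char) s[0]) || s[0] == '_'))
    return false;
  for (size_t i = 1; s[i] != '\0'; i++)
    if (!(isalnum ((unsigned char) s[i]) || s[i] == '_'))
      return false;
  return true;
}

/* True if NAME, as it could be spelled by a user, is a registered builtin.  */
bool
lookup_builtin (const char *name)
{
  if (!is_c_identifier (name))
    return false;
  for (size_t i = 0; i < sizeof builtin_names / sizeof builtin_names[0]; i++)
    if (strcmp (builtin_names[i], name) == 0)
      return true;
  return false;
}
```

The compiler itself can still reach the hidden entries through their
enum codes (IX86_BUILTIN_FNSTSW and so on), which is all the atomic
support needs.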

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-11-22  Jakub Jelinek  

PR target/117165
* config/i386/i386-builtin.def (IX86_BUILTIN_FNSTENV,
IX86_BUILTIN_FLDENV, IX86_BUILTIN_FNSTSW, IX86_BUILTIN_FNCLEX): Add
space to the end of the builtin name to make it really internal.

* gcc.target/i386/pr117165.c: New test.

--- gcc/config/i386/i386-builtin.def.jj 2024-11-06 10:19:11.418260865 +0100
+++ gcc/config/i386/i386-builtin.def 2024-11-21 11:39:05.245410674 +0100
@@ -94,10 +94,10 @@ BDESC (0, 0, CODE_FOR_nothing, "__builti
 BDESC (0, 0, CODE_FOR_pause, "__builtin_ia32_pause", IX86_BUILTIN_PAUSE, UNKNOWN, (int) VOID_FTYPE_VOID)
 
 /* 80387 (for use internally for atomic compound assignment).  */
-BDESC (0, 0, CODE_FOR_fnstenv, "__builtin_ia32_fnstenv", IX86_BUILTIN_FNSTENV, UNKNOWN, (int) VOID_FTYPE_PVOID)
-BDESC (0, 0, CODE_FOR_fldenv, "__builtin_ia32_fldenv", IX86_BUILTIN_FLDENV, UNKNOWN, (int) VOID_FTYPE_PCVOID)
-BDESC (0, 0, CODE_FOR_fnstsw, "__builtin_ia32_fnstsw", IX86_BUILTIN_FNSTSW, UNKNOWN, (int) USHORT_FTYPE_VOID)
-BDESC (0, 0, CODE_FOR_fnclex, "__builtin_ia32_fnclex", IX86_BUILTIN_FNCLEX, UNKNOWN, (int) VOID_FTYPE_VOID)
+BDESC (0, 0, CODE_FOR_fnstenv, "__builtin_ia32_fnstenv ", IX86_BUILTIN_FNSTENV, UNKNOWN, (int) VOID_FTYPE_PVOID)
+BDESC (0, 0, CODE_FOR_fldenv, "__builtin_ia32_fldenv ", IX86_BUILTIN_FLDENV, UNKNOWN, (int) VOID_FTYPE_PCVOID)
+BDESC (0, 0, CODE_FOR_fnstsw, "__builtin_ia32_fnstsw ", IX86_BUILTIN_FNSTSW, UNKNOWN, (int) USHORT_FTYPE_VOID)
+BDESC (0, 0, CODE_FOR_fnclex, "__builtin_ia32_fnclex ", IX86_BUILTIN_FNCLEX, UNKNOWN, (int) VOID_FTYPE_VOID)
 
 /* MMX */
 BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_emms, "__builtin_ia32_emms", IX86_BUILTIN_EMMS, UNKNOWN, (int) VOID_FTYPE_VOID)
--- gcc/testsuite/gcc.target/i386/pr117165.c.jj 2024-11-21 11:46:07.971413045 +0100
+++ gcc/testsuite/gcc.target/i386/pr117165.c 2024-11-21 11:45:44.849741064 +0100
@@ -0,0 +1,27 @@
+/* PR target/117165 */
+/* { dg-do compile } */
+/* { dg-options "-msoft-float" } */
+
+void
+foo ()
+{
+  __builtin_ia32_fnstsw ();    /* { dg-error "implicit declaration of function" } */
+}
+
+void
+bar ()
+{
+  __builtin_ia32_fnclex ();    /* { dg-error "implicit declaration of function" } */
+}
+
+void
+baz ()
+{
+  __builtin_ia32_fnstenv (0);  /* { dg-error "implicit declaration of function" } */
+}
+
+void
+qux ()
+{
+  __builtin_ia32_fldenv (0);   /* { dg-error "implicit declaration of function" } */
+}

Jakub

[PATCH v2] sibcall: Check partial != 0 for BLKmode argument

2024-11-22 Thread H.J. Lu

On Thu, Nov 21, 2024 at 6:43 AM H.J. Lu  wrote:

> On Wed, Nov 20, 2024 at 9:55 PM Richard Sandiford <richard.sandif...@arm.com> wrote:
>
>> "H.J. Lu"  writes:
>> > On Wed, Nov 20, 2024 at 2:12 AM Richard Sandiford
>> >  wrote:
>> >>
>> >> "H.J. Lu"  writes:
>> >> > Adjust BLKmode argument size for parameter alignment for sibcall check.
>> >> >
>> >> > gcc/
>> >> >
>> >> > PR middle-end/117098
>> >> > * calls.cc (store_one_arg): Adjust BLKmode argument size for
>> >> > alignment padding for sibcall check.
>> >> >
>> >> > gcc/testsuite/
>> >> >
>> >> > PR middle-end/117098
>> >> > * gcc.dg/sibcall-12.c: New test.
>> >> >
>> >> > OK for master?
>> >> >
>> >> >
>> >> > H.J.
>> >> > From 8b0518906cb23a9b5e77b04d6132c49047daebd2 Mon Sep 17 00:00:00 2001
>> >> > From: "H.J. Lu" 
>> >> > Date: Sun, 13 Oct 2024 04:53:14 +0800
>> >> > Subject: [PATCH] sibcall: Adjust BLKmode argument size for alignment padding
>> >> >
>> >> > Adjust BLKmode argument size for parameter alignment for sibcall check.
>> >> >
>> >> > gcc/
>> >> >
>> >> >   PR middle-end/117098
>> >> >   * calls.cc (store_one_arg): Adjust BLKmode argument size for
>> >> >   alignment padding for sibcall check.
>> >> >
>> >> > gcc/testsuite/
>> >> >
>> >> >   PR middle-end/117098
>> >> >   * gcc.dg/sibcall-12.c: New test.
>> >> >
>> >> > Signed-off-by: H.J. Lu 
>> >> > ---
>> >> >  gcc/calls.cc  |  4 +++-
>> >> >  gcc/testsuite/gcc.dg/sibcall-12.c | 13 +
>> >> >  2 files changed, 16 insertions(+), 1 deletion(-)
>> >> >  create mode 100644 gcc/testsuite/gcc.dg/sibcall-12.c
>> >> >
>> >> > diff --git a/gcc/calls.cc b/gcc/calls.cc
>> >> > index c5c26f65280..163c7e509d9 100644
>> >> > --- a/gcc/calls.cc
>> >> > +++ b/gcc/calls.cc
>> >> > @@ -5236,7 +5236,9 @@ store_one_arg (struct arg_data *arg, rtx argblock, int flags,
>> >> > /* expand_call should ensure this.  */
>> >> > gcc_assert (!arg->locate.offset.var
>> >> > && arg->locate.size.var == 0);
>> >> > -   poly_int64 size_val = rtx_to_poly_int64 (size_rtx);
>> >> > +   /* Adjust for argument alignment padding.  */
>> >> > +   poly_int64 size_val = ROUND_UP (UINTVAL (size_rtx),
>> >> > +   parm_align / BITS_PER_UNIT);
>> >>
>> >> This doesn't look right to me.  For one thing, going from
>> >> rtx_to_poly_int64 to UINTVAL drops support for non-constant parameters.
>> >> But even ignoring that, I think padding size_val (the size of arg->value
>> >> IIUC) will pessimise the later:
>> >>
>> >>   else if (maybe_in_range_p (arg->locate.offset.constant,
>> >>  i, size_val))
>> >> sibcall_failure = true;
>> >>
>> >> and so cause sibcall failures elsewhere.  I'm also not sure this
>> >> accurately reproduces the padding that is added by locate_and_pad_parm
>> >> for all cases (arguments that grow up vs down, padding below vs above
>> >> the argument).
>> >>
>> >> AIUI, the point of the:
>> >>
>> >>   if (known_eq (arg->locate.offset.constant, i))
>> >> {
>> >>   /* Even though they appear to be at the same location,
>> >>  if part of the outgoing argument is in registers,
>> >>  they aren't really at the same location.  Check for
>> >>  this by making sure that the incoming size is the
>> >>  same as the outgoing size.  */
>> >>   if (maybe_ne (arg->locate.size.constant, size_val))
>> >> sibcall_failure_1 = true;
>> >> }
>> >
>> > Does this
>> >
>> > diff --git a/gcc/calls.cc b/gcc/calls.cc
>> > index 246abe34243..98429cc757f 100644
>> > --- a/gcc/calls.cc
>> > +++ b/gcc/calls.cc
>> > @@ -5327,7 +5327,13 @@ store_one_arg (struct arg_data *arg, rtx argblock, int flags,
>> >they aren't really at the same location.  Check for
>> >this by making sure that the incoming size is the
>> >same as the outgoing size.  */
>> > -   if (maybe_ne (arg->locate.size.constant, size_val))
>> > +   poly_int64 aligned_size;
>> > +   if (CONST_INT_P (size_rtx))
>> > + aligned_size = ROUND_UP (UINTVAL (size_rtx),
>> > +   parm_align / BITS_PER_UNIT);
>> > +   else
>> > + aligned_size = size_val;
>> > +   if (maybe_ne (arg->locate.size.constant, aligned_size))
>> >   sibcall_failure = true;
>> >   }
>> >  else if (maybe_in_range_p (arg->locate.offset.constant,
>> >
>> > look correct?
>>
>> Heh.  Playing the reviewer here, I was kind-of hoping you'd explain
>> why it was correct to me :)
>>
>> But conceptually, the call is copying from arg->value to arg->locate.
>> And this code is trying to detect whether the copy is a nop, whether
>> it overlaps, or whether the source and destination are distinct.
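
The three cases named above (nop copy, overlapping copy, distinct source and destination) can be sketched with plain integers.  This is a simplified model under stated assumptions, not the calls.cc code: it uses int64_t byte offsets instead of poly_int64, and copy_kind, classify_copy and round_up_size are hypothetical names.

```c
#include <stdint.h>

enum copy_kind { COPY_NOP, COPY_OVERLAP, COPY_DISTINCT };

/* Classify a copy of SRC_SIZE bytes at stack offset SRC_OFF into a
   slot of DST_SIZE bytes at offset DST_OFF.  Sizes are assumed to be
   positive.  */
static enum copy_kind
classify_copy (int64_t src_off, int64_t src_size,
               int64_t dst_off, int64_t dst_size)
{
  /* Same start and same extent: nothing needs to move.  */
  if (src_off == dst_off && src_size == dst_size)
    return COPY_NOP;
  /* Half-open ranges [off, off + size) intersect.  */
  if (src_off < dst_off + dst_size && dst_off < src_off + src_size)
    return COPY_OVERLAP;
  return COPY_DISTINCT;
}

/* Round SIZE up to a multiple of ALIGN (ALIGN > 0), in the spirit of
   the ROUND_UP use in the proposed patch.  */
static int64_t
round_up_size (int64_t size, int64_t align)
{
  return (size + align - 1) / align * align;
}
```

With these definitions, a 12-byte value stored at the start of a 16-byte padded slot is classified as an overlap, while the same value with its size rounded up to 8-byte alignment is classified as a nop.  That is the effect the proposed aligned_size comparison aims for in this model.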

[PATCH] match.pd: Fix up the new simpliofiers using with_possible_nonzero_bits2 [PR117420]

2024-11-22 Thread Jakub Jelinek

Hi!

The following testcase shows wrong-code caused by incorrect use
of with_possible_nonzero_bits2.
That matcher is defined as
/* Slightly extended version, do not make it recursive to keep it cheap.  */
(match (with_possible_nonzero_bits2 @0)
 with_possible_nonzero_bits@0)
(match (with_possible_nonzero_bits2 @0)
 (bit_and:c with_possible_nonzero_bits@0 @2))
and because with_possible_nonzero_bits includes the SSA_NAME case with
integral/pointer argument, both forms can actually match when a SSA_NAME
with integral/pointer type has a def stmt which is BIT_AND_EXPR
assignment with say SSA_NAME with integral/pointer type as one of its
operands (or INTEGER_CST, another with_possible_nonzero_bits case).
And in match.pd the latter actually wins if both match and so when using
(with_possible_nonzero_bits2 @0) the @0 will actually be one of the
BIT_AND_EXPR operands if that form is matched.

Now, with_possible_nonzero_bits2 and with_certain_nonzero_bits2 were added
for the
/* X == C (or X & Z == Y | C) is impossible if ~nonzero(X) & C != 0.  */
(for cmp (eq ne)
 (simplify
  (cmp:c (with_possible_nonzero_bits2 @0) (with_certain_nonzero_bits2 @1))
  (if (wi::bit_and_not (wi::to_wide (@1), get_nonzero_bits (@0)) != 0)
   { constant_boolean_node (cmp == NE_EXPR, type); }))) 
simplifier, but even for that one I think they do not do a good job; they
might actually pessimize stuff rather than optimize, but at least they do not
result in wrong-code, because the operands are solely tested with
wi::to_wide or get_nonzero_bits, but not actually used in the
simplification.  The reason why it can pessimize stuff is say if we have
  # RANGE [irange] int ... MASK 0xb VALUE 0x0
  x_1 = ...;
  # RANGE [irange] int ... MASK 0x8 VALUE 0x0
  _2 = x_1 & 0xc;
  _3 = _2 == 2;
then if it used just with_possible_nonzero_bits@0, @0 would have
get_nonzero_bits (@0) 0x8 and (2 & ~8) != 0, so we can fold it into
  _3 = 0;
But as it uses (with_possible_nonzero_bits2 @0), @0 is x_1 rather
than _2 and get_nonzero_bits (@0) is unnecessarily conservative,
0xb rather than 0x8 and (2 & ~0xb) == 0, so we don't optimize.
Now, with_possible_nonzero_bits2 can actually improve stuff as well in that
pattern, if say value ranges aren't fully computed yet or the BIT_AND_EXPR
assignment has been added later and the lhs doesn't have range computed yet,
get_nonzero_bits on the BIT_AND_EXPR lhs will be all bits set, while
on the BIT_AND_EXPR operand might actually succeed.
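
The arithmetic in the example above can be checked with a small C sketch.  The helper names are hypothetical; the functions only model the condition wi::bit_and_not (C, nonzero (X)) != 0 from the simplifier on 32-bit values and do not call any GCC API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Possible nonzero bits of A & B, given the possible nonzero bits of
   each operand.  */
static uint32_t
nonzero_bits_of_and (uint32_t a_bits, uint32_t b_bits)
{
  return a_bits & b_bits;
}

/* X == C can never hold when C has a bit set that X cannot have,
   i.e. when C & ~nonzero (X) != 0.  */
static bool
compare_is_impossible (uint32_t possible_nonzero_bits, uint32_t c)
{
  return (c & ~possible_nonzero_bits) != 0;
}
```

Using the bits of _2 (0xb & 0xc, i.e. 0x8) the comparison with 2 is proven impossible, whereas using the bits of x_1 (0xb) it is not, which is the pessimization described above.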

I believe it would be better to either modify get_nonzero_bits so that it
special cases the SSA_NAME with BIT_AND_EXPR def_stmt (but one level
deep only like with_possible_nonzero_bits2, no recursion), in that case
returning the bitwise and of get_nonzero_bits (non-recursive) for the lhs and
both operands, and possibly BIT_AND_EXPR itself, e.g. for GENERIC
matching, by returning the bitwise and of both operands.
Then with_possible_nonzero_bits2 could be needed for the GENERIC case,
perhaps have the second match #if GENERIC, but changed so that the @N
operand always is the whole thing rather than its operand, which is
error-prone.  Or add a get_nonzero_bits wrapper with a different name
which would do that.

with_certain_nonzero_bits2 could be changed similarly, these days
we can test known non-zero bits rather than possible non-zero bits on
SSA_NAMEs too, we record both mask and value, so possible nonzero bits
(aka. get_nonzero_bits) is mask () | value (), while known nonzero bits
is value () & ~mask (), with a new function (get_known_nonzero_bits
or get_certain_nonzero_bits etc.) which handles that.
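
The two formulas above can be written out as a minimal C sketch.  It assumes the mask/value encoding described in the text (a set bit in mask means the bit is unknown, value holds the known bits) and uses hypothetical function names rather than any existing GCC interface.

```c
#include <stdint.h>

/* Bits that may be nonzero: everything unknown plus everything known
   to be one, i.e. mask () | value ().  */
static uint32_t
possible_nonzero_bits (uint32_t mask, uint32_t value)
{
  return mask | value;
}

/* Bits that are certainly nonzero: known bits whose value is one,
   i.e. value () & ~mask ().  */
static uint32_t
known_nonzero_bits (uint32_t mask, uint32_t value)
{
  return value & ~mask;
}
```

For MASK 0xb VALUE 0x0 this gives possible bits 0xb and no known nonzero bits; for MASK 0xb VALUE 0x4 it gives possible bits 0xf and known nonzero bits 0x4.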

Anyway, the following patch doesn't do what I wrote above just yet,
for that single pattern it is just a missed optimization.
But the with_possible_nonzero_bits2 uses in the 3 new simplifiers are
just completely incorrect, because they don't just use the @0 operand
in get_nonzero_bits (pessimizing stuff if value ranges are fully computed),
but also use it in the replacement, then they act as if the BIT_AND_EXPR
wasn't there at all.
While we could use (with_possible_nonzero_bits2@3 @0) and use
get_nonzero_bits (@0) and use @3 in the replacement, that would still
often be a pessimization, so I've just used with_possible_nonzero_bits@0.
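
As I read the description, the wrong-code can be illustrated with ordinary C arithmetic for the first simplifier, (X >> C1) << (C1 + C2) -> X << C2.  The constants and function names below are illustrative assumptions: C1 is 2, C2 is 1, the BIT_AND_EXPR mask is 0xf0, and the input is assumed to have its low C1 bits clear.

```c
#include <stdint.h>

#define C1 2
#define C2 1
#define MASK 0xf0u

/* What the source computes: _2 = x & MASK; (_2 >> C1) << (C1 + C2).  */
static uint32_t
original (uint32_t x)
{
  return ((x & MASK) >> C1) << (C1 + C2);
}

/* Fold that keeps the whole matched operand (x & MASK) in the
   replacement.  */
static uint32_t
sound_fold (uint32_t x)
{
  return (x & MASK) << C2;
}

/* Fold that substitutes the BIT_AND_EXPR operand x for the matched
   operand, as if the BIT_AND_EXPR were not there.  */
static uint32_t
unsound_fold (uint32_t x)
{
  return x << C2;
}
```

For x = 0x3c the original and the sound fold both give 0x60, while the unsound fold gives 0x78, because the bits cleared by the mask leak back into the result.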

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

And what do you think about the above mentioned approach for the other
with_possible_nonzero_bits2 using simplifier?

2024-11-22  Jakub Jelinek  

PR tree-optimization/117420
* match.pd ((X >> C1) << (C1 + C2) -> X << C2,
(X >> C1) * (C2 << C1) -> X * C2, X / (1 << C) -> X /[ex] (1 << C)):
Use with_possible_nonzero_bits@0 rather than
(with_possible_nonzero_bits2 @0).

* gcc.dg/torture/pr117420.c: New test.

--- gcc/match.pd.jj 2024-11-18 12:21:10.449236948 +0100
+++ gcc/match.pd2024-11-21 16:12:46.346009032 +0100
@@ -4952,7 +4952,7 @@ (define_operator_list SYNC_FETCH_AND_AND
 #if GIMPLE
 /* (X >> C1) << (C1 + C2) -> X << C2 if the low C1 bits

[Patch] OpenMP: Add 'interop' clause to 'dispatch' for C/C++

2024-11-22 Thread Tobias Burnus


This patch depends on 'dispatch' (which is currently in mainline only
available for C/C++ but not Fortran) and it depends on the not-yet-committed
"[Patch] OpenMP: 'interop' construct - add C/C++ parser support, improve Fortran parsing"
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668782.html
due to be committed soonish.

The idea of the 'interop' clause is to use it with declare_variant + 'append_args',
which is not yet in.  There must be at least as many append_args arguments as
'interop' items (restriction). As the former is currently always 0, an error is
issued unconditionally, making the current implementation fully correct, even
if it is not yet very useful.

The intent is to commit it as follow up very soon after the
'interop' C/C++ parser patch is in.

Tobias

PS: Usage example once fully implemented:

void f(omp_interop_t);
#pragma omp declare variant(f) \
   match(construct={dispatch}) \
   append_args(interop(targetsync))
void g();
...
omp_interop_t obj_tgtsync;
...
#pragma omp dispatch interop(obj_tgtsync)
  g (); // calls  f(obj_tgtsync)

#pragma omp dispatch
  g ();  // does:
// #pragma omp interop init(targetsync: tmp)
//   f(tmp)
// #pragma omp interop destroy(tmp)

OpenMP: Add 'interop' clause to 'dispatch' for C/C++

Will fail with an error if/as no suitable 'append_args' has been specified,
given that 'append_args' is not yet implemented.

gcc/c-family/ChangeLog:

	* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_INTEROP.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_clause_interop): New.
	(c_parser_omp_clause_name, c_parser_omp_all_clauses,
	c_parser_omp_dispatch_body): Handle 'interop' clause.
	* c-typeck.cc (c_finish_omp_clauses): Likewise.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_clause_name, cp_parser_omp_all_clauses,
	cp_parser_omp_dispatch_body): Handle 'interop' clause.
	* pt.cc (tsubst_omp_clauses): Likewise.
	* semantics.cc (finish_omp_clauses): Likewise.

gcc/ChangeLog:

	* gimplify.cc (gimplify_call_expr): Add initial support for
	dispatch's 'interop' clause.
	(gimplify_scan_omp_clauses): Handle interop clause.
	* tree-pretty-print.cc (dump_omp_clause): Likewise.
	* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_INTEROP.
	* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add interop.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/dispatch-11.c: New test.
	* c-c++-common/gomp/dispatch-12.c: New test.

 c-family/c-pragma.h   |1 
 c/c-parser.cc |   17 ++
 c/c-typeck.cc |   23 ++--
 cp/parser.cc  |   10 +++
 cp/pt.cc  |1 
 cp/semantics.cc   |   29 +++---
 gimplify.cc   |   15 +
 testsuite/c-c++-common/gomp/dispatch-11.c |   84 ++
 testsuite/c-c++-common/gomp/dispatch-12.c |   53 ++
 tree-core.h   |3 +
 tree-pretty-print.cc  |6 +-
 tree.cc   |2 
 12 files changed, 228 insertions(+), 16 deletions(-)
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index c95a602a475..df5625d5f4f 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -134,6 +134,7 @@ enum pragma_omp_clause {
   PRAGMA_OMP_CLAUSE_INDIRECT,
   PRAGMA_OMP_CLAUSE_INIT,
   PRAGMA_OMP_CLAUSE_IS_DEVICE_PTR,
+  PRAGMA_OMP_CLAUSE_INTEROP,
   PRAGMA_OMP_CLAUSE_LASTPRIVATE,
   PRAGMA_OMP_CLAUSE_LINEAR,
   PRAGMA_OMP_CLAUSE_LINK,
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index ca4a6b39b27..f3ed6104747 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -15786,6 +15786,8 @@ c_parser_omp_clause_name (c_parser *parser)
 	result = PRAGMA_OMP_CLAUSE_INIT;
 	  else if (!strcmp ("is_device_ptr", p))
 	result = PRAGMA_OMP_CLAUSE_IS_DEVICE_PTR;
+	  else if (!strcmp ("interop", p))
+	result = PRAGMA_OMP_CLAUSE_INTEROP;
 	  break;
 	case 'l':
 	  if (!strcmp ("lastprivate", p))
@@ -20569,6 +20571,16 @@ c_parser_omp_clause_use (c_parser *parser, tree list)
   return c_parser_omp_var_list_parens (parser, OMP_CLAUSE_USE, list);
 }
 
+/* OpenMP 6.0:
+   interop ( variable-list ) */
+
+static tree
+c_parser_omp_clause_interop (c_parser *parser, tree list)
+{
+  check_no_duplicate_clause (list, OMP_CLAUSE_INTEROP, "interop");
+  return c_parser_omp_var_list_parens (parser, OMP_CLAUSE_INTEROP, list);
+}
+
 /* Parse all OpenACC clauses.  The set clauses allowed by the directive
is a bitmask in MASK.  Return the list of clauses found.  */
 
@@ -21076,6 +21088,10 @@ c_parser_omp_all_clauses (c_parser *parser, omp_clause_mask mask,
 	  clauses = c_parser_omp_clause_use (parser, clauses);
 	  c_name = "use";
 	  break;
+	case PRAGMA_OMP_CLAUSE_INTEROP:
+	  clauses = c_parser_omp_clause_interop (parser, clauses);
+	  c_name = "interop";
+	  break;
 	case PRAGMA_OMP_CLAUSE_MAP:
 	  clauses = c_parser_omp_cl

SME loans

2024-11-22 Thread Adrian

With apologies to the administrators and business owners.
I am offering financing for business owners
whose businesses are registered as companies,
limited partnerships, or industrial factories, nationwide.
Low interest, starting at 1-1.5%.
Free inquiries, courteous staff.
0626697879 (Finance Manager)

[PING*2][PATCH 1/1] config: Handle dash in library name for AC_LIB_LINKAGEFLAGS_BODY

2024-11-22 Thread Ijaz, Abdul B

Ping for:
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/656541.html

Thanks & Best Regards
Abdul Basit

-----Original Message-----
From: Ijaz, Abdul B 
Sent: Monday, September 23, 2024 10:22 AM
To: gcc-patches@gcc.gnu.org
Cc: Ijaz, Abdul B 
Subject: [PING]: [PATCH 1/1] config: Handle dash in library name for 
AC_LIB_LINKAGEFLAGS_BODY

https://gcc.gnu.org/pipermail/gcc-patches/2024-July/656541.html

Best Regards
Abdul Basit

-----Original Message-----
From: Ijaz, Abdul B 
Sent: Sunday, July 7, 2024 9:45 PM
To: Ijaz, Abdul B 
Subject: [PATCH 1/1] config: Handle dash in library name for 
AC_LIB_LINKAGEFLAGS_BODY

From: "Ijaz, Abdul B" 

For a library with a dash in its name, such as yaml-cpp, the
AC_LIB_LINKFLAGS_BODY function generates a with_libname_type argument
variable name containing a dash, which results in a configure error, since
dashes are not allowed in variable names.

This change handles such cases: if the input library name for
AC_LIB_HAVE_LINKFLAGS contains a dash, it is replaced with an underscore "_".

Example of an error for yaml-cpp library before the change using gcc config 
scripts in gdb:
gdb/gdb/configure: line 22868: with_libyaml-cpp_type=auto: command not found

After having underscore for this variable name:

checking whether to use yaml-cpp... yes
checking for libyaml-cpp... yes
checking how to link with libyaml-cpp... -lyaml-cpp
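
The character mapping done by the new translit([$1],[./-], [___]) definition can be mimicked in C.  This is only an illustration of the mapping, with a hypothetical helper name (sanitize_lib_name); the real change lives in the m4 macro shown in the diff below.

```c
#include <stddef.h>
#include <string.h>

/* Copy NAME into OUT (capacity OUT_SIZE bytes), replacing '.', '/'
   and '-' with '_' so the result is usable inside a shell variable
   name.  Return 0 on success, -1 if OUT is too small.  */
static int
sanitize_lib_name (const char *name, char *out, size_t out_size)
{
  size_t i;

  for (i = 0; name[i] != '\0'; i++)
    {
      /* Keep room for the terminating NUL.  */
      if (i + 1 >= out_size)
        return -1;
      out[i] = strchr ("./-", name[i]) ? '_' : name[i];
    }
  if (i >= out_size)
    return -1;
  out[i] = '\0';
  return 0;
}
```

For "yaml-cpp" the result is "yaml_cpp", so the generated variable would be with_libyaml_cpp_type rather than the invalid with_libyaml-cpp_type.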

config/ChangeLog:

* lib-link.m4: Handle dash in the library name for
AC_LIB_LINKFLAGS_BODY.

2024-07-03 Ijaz, Abdul B 
---
 config/lib-link.m4 | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/config/lib-link.m4 b/config/lib-link.m4
index 20e281fd323..a60a8069453 100644
--- a/config/lib-link.m4
+++ b/config/lib-link.m4
@@ -126,6 +126,7 @@ AC_DEFUN([AC_LIB_LINKFLAGS_BODY],
 [
   define([NAME],[translit([$1],[abcdefghijklmnopqrstuvwxyz./-],
[ABCDEFGHIJKLMNOPQRSTUVWXYZ___])])
+  define([Name],[translit([$1],[./-], [___])])
   dnl By default, look in $includedir and $libdir.
   use_additional=yes
   AC_LIB_WITH_FINAL_PREFIX([
@@ -152,8 +153,8 @@ AC_DEFUN([AC_LIB_LINKFLAGS_BODY],
 ])
   AC_LIB_ARG_WITH([lib$1-type],
 [  --with-lib$1-type=TYPE type of library to search for (auto/static/shared) ],
-  [ with_lib$1_type=$withval ], [ with_lib$1_type=auto ])
-  lib_type=`eval echo \$with_lib$1_type`
+  [ with_lib[]Name[]_type=$withval ], [ with_lib[]Name[]_type=auto ])
+  lib_type=`eval echo \$with_lib[]Name[]_type`
 
   dnl Search the library and its dependencies in $additional_libdir and
   dnl $LDFLAGS. Using breadth-first-seach.
--
2.34.1

Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Sean Fennelly, Jeffrey Schneiderman, Tiffany Doon Silva
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

Re: [PATCH] [x86] Fix uninitialized operands[2] in vec_unpacks_hi_v4sf.

2024-11-22 Thread Richard Biener

On Fri, 22 Nov 2024, liuhongt wrote:

> It could cause weird spill in RA when register pressure is high.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
> 
> BTW, It's difficult to get a decent testcase for the issue since the spill
> is not exposed in simple testcase.

I think it's a good patch independent of the spill issue given it
avoids false dependences on the scratch reg contents.  While it happens
to avoid the problematic spill I don't think it does so by design,
"sse_movhlps" still requires two registers and spilling the
first input/output will cause a STLF fail for a later reload since
the actual store is V2SFmode but the reload will be V4SFmode.

That said, it's a good enough fix for the actual regression, just
not a full solution for the then theoretical performance issue.
Adding $ to the memory alternative didn't fix the regression btw
(maybe LRA doesn't honor that).

Richard.

> gcc/ChangeLog:
> 
>   PR target/117562
>   * config/i386/sse.md (vec_unpacks_hi_v4sf): Initialize
>   operands[2] with CONST0_RTX.
> ---
>  gcc/config/i386/sse.md | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 72acd5bde5e..498a42d6e1e 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -10424,7 +10424,10 @@ (define_expand "vec_unpacks_hi_v4sf"
> (match_dup 2)
> (parallel [(const_int 0) (const_int 1)]]
>"TARGET_SSE2"
> -  "operands[2] = gen_reg_rtx (V4SFmode);")
> +{
> +  operands[2] = gen_reg_rtx (V4SFmode);
> +  emit_move_insn (operands[2], CONST0_RTX (V4SFmode));
> +})
>  
>  (define_expand "vec_unpacks_hi_v8sf"
>[(set (match_dup 2)
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] wwwdocs: Align the DCO text for the GNU Toolchain to match community usage.

2024-11-22 Thread Carlos O'Donell

On 11/22/24 7:00 AM, Sam James wrote:
> The key part is the first hunk of the patch I linked where we wanted to
> emphasise "established online identity", rather than a throwaway
> pseudonym.

Thanks, yes, that language is completely neutral from my reading and clarifies
that this should be a known identity.

-- 
Cheers,
Carlos.

Re: [PATCH] inline-asm, v2: Add - constraint modifier support for toplevel extended asm [PR41045]

2024-11-22 Thread Richard Biener

On Fri, 22 Nov 2024, Jakub Jelinek wrote:

> On Fri, Nov 22, 2024 at 12:01:21AM +, Joseph Myers wrote:
> > On Mon, 18 Nov 2024, Jakub Jelinek wrote:
> > 
> > > +@smallexample
> > > +extern void foo (void), bar (void);
> > > +int v;
> > > +extern int w;
> > > +asm (".globl %cc0, %cc2; .text; %cc0: call %cc1; ret; .data; %cc2: .word %cc3"
> > > + :: ":" (foo), "-s" (&bar), ":" (&w), "-i" (&v));
> > > +@end smallexample
> > > +
> > > +This asm declaration tells the compiler it defines function foo and variable
> > > +w and uses function bar and variable v.  This will compile even with PIC,
> > > +but it is up to the user to ensure it will assemble correctly and have the
> > > +expected behavior.
> > 
> > That should be @code{foo}, @code{w}, @code{bar}, @code{v}.
> > 
> > The C front-end changes in this patch are OK.
> 
> Thanks, here is the adjusted patch.

The middle-end changes are OK.

Richard.

> 2024-11-21  Jakub Jelinek  
> 
>   PR c/41045
> gcc/
>   * stmt.cc (parse_output_constraint, parse_input_constraint): Handle
>   - modifier.
>   * recog.h (raw_constraint_p): Declare.
>   * recog.cc (raw_constraint_p): New variable.
>   (asm_operand_ok, constrain_operands): Handle - modifier.
>   * common.md (i, s, n): For raw_constraint_p don't require
>   LEGITIMATE_PIC_OPERAND_P.
>   * doc/md.texi: Document - constraint modifier.
> gcc/c/
>   * c-typeck.cc (build_asm_expr): Reject - constraint modifier inside
>   of a function.
> gcc/cp/
>   * semantics.cc (finish_asm_stmt): Reject - constraint modifier inside
>   of a function.
> gcc/testsuite/
>   * c-c++-common/toplevel-asm-4.c: Add missing %cc2 use in template, add
>   bar, x, &y operands with "-i" and "-s" constraints.
>   (x, y): New variables.
>   (bar): Declare.
>   * c-c++-common/toplevel-asm-7.c: New test.
>   * c-c++-common/toplevel-asm-8.c: New test.
> 
> --- gcc/stmt.cc.jj2024-11-17 21:07:06.712510933 -1100
> +++ gcc/stmt.cc   2024-11-17 21:45:30.201294501 -1100
> @@ -269,7 +269,7 @@ parse_output_constraint (const char **co
>   case 'E':  case 'F':  case 'G':  case 'H':
>   case 's':  case 'i':  case 'n':
>   case 'I':  case 'J':  case 'K':  case 'L':  case 'M':
> - case 'N':  case 'O':  case 'P':  case ',':
> + case 'N':  case 'O':  case 'P':  case ',':  case '-':
> break;
>  
>   case '0':  case '1':  case '2':  case '3':  case '4':
> @@ -364,7 +364,7 @@ parse_input_constraint (const char **con
>case 'E':  case 'F':  case 'G':  case 'H':
>case 's':  case 'i':  case 'n':
>case 'I':  case 'J':  case 'K':  case 'L':  case 'M':
> -  case 'N':  case 'O':  case 'P':  case ',':
> +  case 'N':  case 'O':  case 'P':  case ',':  case '-':
>   break;
>  
>case ':':
> --- gcc/recog.h.jj2024-08-15 09:23:26.981012468 -1100
> +++ gcc/recog.h   2024-11-17 22:26:47.190602347 -1100
> @@ -335,6 +335,9 @@ private:
> matched.  */
>  extern int which_alternative;
>  
> +/* True for inline asm operands with - constraint modifier.  */
> +extern bool raw_constraint_p;
> +
>  /* The following vectors hold the results from insn_extract.  */
>  
>  struct recog_data_d
> --- gcc/recog.cc.jj   2024-10-24 21:00:29.511767242 -1100
> +++ gcc/recog.cc  2024-11-17 23:16:00.654874432 -1100
> @@ -86,6 +86,9 @@ static operand_alternative asm_op_alt[MA
>  
>  int which_alternative;
>  
> +/* True for inline asm operands with - constraint modifier.  */
> +bool raw_constraint_p;
> +
>  /* Nonzero after end of reload pass.
> Set to 1 or 0 by toplev.cc.
> Controls the significance of (SUBREG (MEM)).  */
> @@ -2300,6 +2303,7 @@ asm_operand_ok (rtx op, const char *cons
>switch (c)
>   {
>   case ',':
> +   raw_constraint_p = false;
> constraint++;
> continue;
>  
> @@ -2350,6 +2354,11 @@ asm_operand_ok (rtx op, const char *cons
>   result = 1;
> break;
>  
> + case '-':
> +   raw_constraint_p = true;
> +   constraint++;
> +   continue;
> +
>   case '<':
>   case '>':
> /* ??? Before auto-inc-dec, auto inc/dec insns are not supposed
> @@ -2407,8 +2416,12 @@ asm_operand_ok (rtx op, const char *cons
>   constraint++;
>while (--len && *constraint && *constraint != ',');
>if (len)
> - return 0;
> + {
> +   raw_constraint_p = false;
> +   return 0;
> + }
>  }
> +  raw_constraint_p = false;
>  
>/* For operands without < or > constraints reject side-effects.  */
>if (AUTO_INC_DEC && !incdec_ok && result && MEM_P (op))
> @@ -3202,6 +3215,9 @@ constrain_operands (int strict, alternat
> case ',':
>   c = '\0';
>   break;
> +   case '-':
> + raw_constraint_p = true;
> + break;
>  
> case '#':
>   /* Ignore rest of this alternative as far as
> @@ -3357,6 +3373,7 @@ con

[PATCH] [x86] Fix uninitialized operands[2] in vec_unpacks_hi_v4sf.

2024-11-22 Thread liuhongt

It could cause weird spill in RA when register pressure is high.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

BTW, It's difficult to get a decent testcase for the issue since the spill
is not exposed in simple testcase.

gcc/ChangeLog:

PR target/117562
* config/i386/sse.md (vec_unpacks_hi_v4sf): Initialize
operands[2] with CONST0_RTX.
---
 gcc/config/i386/sse.md | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 72acd5bde5e..498a42d6e1e 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -10424,7 +10424,10 @@ (define_expand "vec_unpacks_hi_v4sf"
(match_dup 2)
(parallel [(const_int 0) (const_int 1)]]
   "TARGET_SSE2"
-  "operands[2] = gen_reg_rtx (V4SFmode);")
+{
+  operands[2] = gen_reg_rtx (V4SFmode);
+  emit_move_insn (operands[2], CONST0_RTX (V4SFmode));
+})
 
 (define_expand "vec_unpacks_hi_v8sf"
   [(set (match_dup 2)
-- 
2.34.1
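
Why the scratch register matters can be sketched with a scalar model of movhlps.  The model below assumes the usual movhlps semantics (the low two lanes of the destination receive the high two lanes of the source, the high two destination lanes are preserved); movhlps_model and unpack_hi_model are hypothetical names, not code from sse.md.

```c
/* Model of movhlps: DEST[0..1] = SRC[2..3], while DEST[2..3] keep
   whatever DEST held before.  The result therefore depends on the
   previous contents of DEST.  */
static void
movhlps_model (float dest[4], const float src[4])
{
  dest[0] = src[2];
  dest[1] = src[3];
}

/* Model of the expander after the patch: the scratch vector is
   zeroed first, so the movhlps result has no dependence on stale
   register contents, and the low two lanes are then widened.  */
static void
unpack_hi_model (double out[2], const float in[4])
{
  float scratch[4] = { 0.0f, 0.0f, 0.0f, 0.0f };

  movhlps_model (scratch, in);
  out[0] = scratch[0];
  out[1] = scratch[1];
}
```

In this model the zero initialisation does not change the two converted values; it only removes the read of an undefined scratch vector.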

Re: [RFC/RFA][PATCH v6 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-11-22 Thread Mariam Arutunian

On Fri, Nov 22, 2024, 20:29 Jeff Law  wrote:

>
>
> On 11/13/24 7:16 AM, Mariam Arutunian wrote:
> >
> >
> > On Tue, Nov 12, 2024 at 2:15 AM Jeff Law wrote:
> >
> >
> >  > \ No newline at end of file
> >  > diff --git a/gcc/testsuite/gcc.target/riscv/crc-1-zbkc.c b/gcc/testsuite/gcc.target/riscv/crc-1-zbkc.c
> >  > new file mode 100644
> >  > index 000..8c627c0431a
> >  > --- /dev/null
> >  > +++ b/gcc/testsuite/gcc.target/riscv/crc-1-zbkc.c
> >  > @@ -0,0 +1,11 @@
> >  > +/* { dg-do run } */
> >  > +/* { dg-options "-fdump-tree-crc -fdump-rtl-dfinish  -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
> >  > +/* { dg-additional-options "-march=rv64gc_zbkc" { target { rv64 } } } */
> >  > +/* { dg-additional-options "-march=rv32gc_zbkc" { target { rv32 } } } */
> > So I think we probably need to add a bit of code to the testsuite.
> > Essentially we don't want to run this test on targets that don't have
> > zbkc support.
> >
> > I think we probably end up wanting something similar to what we do with
> > vector where we have a test to tell us when V is supported.  I'm
> > planning to pick that up.  Similarly I think we want to do something
> > similar for Zbc.
> >
> >
> > To address this, I added code in |target-supports.exp| and modified the
> > relevant tests.
> > I've attached the patch. Could you please check whether it is correct?
> I think that just tests if the compiler thinks the extension is enabled.
>   ie, did we pass Zbkb, Zbc or whatever on the command line.  The
> question we need to answer is whether or not we can run such code.
>
> The way we've done that for the V extension looks like this:
>
> > proc check_effective_target_riscv_v_ok { } {
> > # If the target already supports v without any added options,
> > # we may assume we can execute just fine.
> > if { [check_effective_target_riscv_v] } {
> > return 1
> > }
> >
> > # check if we can execute vector insns with the given hardware or
> > # simulator
> > set gcc_march [regsub {[[:alnum:]]*} [riscv_get_arch] &v]
> > if { [check_runtime ${gcc_march}_exec {
> >   int main() {  asm("vsetivli t0, 9, e8, m1, tu, ma"); return 0; } } "-march=${gcc_march}"] } {
> > return 1
> > }
> >
> > # Possible future extensions: If the target is a simulator, dg-add-options
> > # might change its config to make it allow vector insns, or we might use
> > # options to set special elf flags / sections to effect that.
> >
> > return 0
> > }
> So we compile a little program with a single vector instruction and
> check that it doesn't fault.  I was thinking we could do the same thing
> for Zbc and Zbkb, but I haven't had time to cobble it together yet.
>


 I have written similar code for ZBC, ZBKC, and ZBB, and in my previous
reply, I attached the patch containing that code.
Here is a part from that patch:
+proc check_effective_target_riscv_zbc_ok { } {
+# If the target already supports zbc without any added options,
+# we may assume we can execute just fine.
+if { [check_effective_target_riscv_zbc] } {
+ return 1
+}
+
+# check if we can execute zbc insns with the given hardware or
+# simulator
+set gcc_march [riscv_get_arch]
+if { [check_runtime ${gcc_march}_zbc_exec {
+ int main()
+ {
+asm ("clmul a0,a0,a1");
+asm ("clmulh a0,a0,a1");
+return 0;
+ } } "-march=${gcc_march}"] } {
+return 1
+ }
+return 0
+}

BR,
Mariam

> Jeff
>

[PATCH] RISC-V: Ensure vtype for full-register moves [PR117544].

2024-11-22 Thread Robin Dapp

Hi,

as discussed in PR117544 the VTYPE register is not preserved across
function calls.  Even though vmv1r-like instructions operate
independently of the actual vtype they still require a valid vtype.  As
we cannot guarantee that the vtype is valid we must make sure to emit a
vsetvl between a function call and vmv1r.v.

This patch makes the necessary changes by splitting the full-reg-move
insns into patterns that use the vtype register and adding vmov to the
types of instructions requiring a vset.

Regtested on rv64gcv but the CI knows best :)

Regards
 Robin

PR target/117544

gcc/ChangeLog:

* config/riscv/vector.md (*mov_whole): Split.
(*mov_fract): Ditto.
(*mov): Ditto.
(*mov_vls): Ditto.
(*mov_reg_whole_vtype): New pattern with vtype use.
(*mov_fract_vtype): Ditto.
(*mov_vtype): Ditto.
(*mov_vls_vtype): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-call-args-4.c: Expect vsetvl.
* gcc.target/riscv/rvv/base/pr117544.c: New test.
---
 gcc/config/riscv/vector.md| 91 +--
 .../riscv/rvv/base/abi-call-args-4.c  |  1 +
 .../gcc.target/riscv/rvv/base/pr117544.c  | 14 +++
 3 files changed, 99 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr117544.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 02cbd2f56f1..57e3c34c1c5 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -42,7 +42,8 @@ (define_attr "has_vtype_op" "false,true"
   (cond [(eq_attr "type" "vlde,vste,vldm,vstm,vlds,vsts,\
  vldux,vldox,vstux,vstox,vldff,\
  vialu,viwalu,vext,vicalu,vshift,vnshift,vicmp,viminmax,\
- vimul,vidiv,viwmul,vimuladd,viwmuladd,vimerge,vimov,\
+ vimul,vidiv,viwmul,vimuladd,viwmuladd,vimerge,
+ vmov,vimov,\
  vsalu,vaalu,vsmul,vsshift,vnclip,\
  vfalu,vfwalu,vfmul,vfdiv,vfwmul,vfmuladd,vfwmuladd,vfsqrt,vfrecp,\
  vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,\
@@ -1214,21 +1215,58 @@ (define_expand "mov"
 ;; which is not the pattern we want.
 ;; According the facts above, we make "*mov_whole" includes load/store/move for whole
 ;; vector modes according to '-march' and "*mov_fract" only include fractional vector modes.
-(define_insn "*mov_whole"
+(define_insn_and_split "*mov_whole"
   [(set (match_operand:V_WHOLE 0 "reg_or_mem_operand" "=vr, m,vr")
(match_operand:V_WHOLE 1 "reg_or_mem_operand" "  m,vr,vr"))]
   "TARGET_VECTOR && !TARGET_XTHEADVECTOR"
   "@
vl%m1re.v\t%0,%1
vs%m1r.v\t%1,%0
-   vmv%m1r.v\t%0,%1"
+   #"
+  "&& !memory_operand (operands[0], mode)
+   && !memory_operand (operands[1], mode)"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+ (use (reg:SI VTYPE_REGNUM))])]
+  ""
   [(set_attr "type" "vldr,vstr,vmov")
(set_attr "mode" "")])
 
-(define_insn "*mov_fract"
+;; Full-register moves like vmv1r.v require a valid vtype.
+;; The ABI does not guarantee that the vtype is valid after a function
+;; call so we need to make it dependent on the vtype and have
+;; the vsetvl pass insert a vsetvl if necessary.
+;; To facilitate optimization we keep the reg-reg move patterns "regular"
+;; until split time and only then switch to a pattern like below that
+;; uses the vtype register.
+;; As the use of these patterns is limited (in the general context)
+;; there is no need for helper functions and we can just create the RTX
+;; directly.
+(define_insn "*mov_reg_whole_vtype"
+  [(set (match_operand:V_WHOLE 0 "reg_or_mem_operand" "=vr")
+   (match_operand:V_WHOLE 1 "reg_or_mem_operand" " vr"))
+   (use (reg:SI VTYPE_REGNUM))]
+  "TARGET_VECTOR && !TARGET_XTHEADVECTOR"
+  "vmv%m1r.v\t%0,%1"
+  [(set_attr "type" "vmov")
+   (set_attr "mode" "")])
+
+(define_insn_and_split "*mov_fract"
   [(set (match_operand:V_FRACT 0 "register_operand" "=vr")
(match_operand:V_FRACT 1 "register_operand" " vr"))]
   "TARGET_VECTOR"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+ (use (reg:SI VTYPE_REGNUM))])]
+  ""
+  [(set_attr "type" "vmov")
+   (set_attr "mode" "")])
+
+(define_insn "*mov_fract_vtype"
+  [(set (match_operand:V_FRACT 0 "register_operand" "=vr")
+   (match_operand:V_FRACT 1 "register_operand" " vr"))
+   (use (reg:SI VTYPE_REGNUM))]
+  "TARGET_VECTOR"
   "vmv1r.v\t%0,%1"
   [(set_attr "type" "vmov")
(set_attr "mode" "")])
@@ -1249,10 +1287,23 @@ (define_expand "mov"
 DONE;
 })
 
-(define_insn "*mov"
+(define_insn_and_split "*mov"
   [(set (match_operand:VB 0 "register_operand" "=vr")
(match_operand:VB 1 "register_operand" " vr"))]
   "TARGET_VECTOR && !TARGET_XTHEADVECTOR"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+ (use (reg:SI VTYPE_REGNUM))])]
+  ""
+  [

Re: [PING] [contrib] validate_failures.py: fix python 3.12 escape sequence warnings

2024-11-22 Thread Sam James

Jeff Law writes:

> On 6/9/24 5:45 AM, Gabi Falk wrote:
>> Hi,
>> On Sat, Jun 08, 2024 at 03:34:02PM -0600, Jeff Law wrote:
>>> On 5/14/24 8:12 AM, Gabi Falk wrote:
>>>> Hi,
>>>>
>>>> This one still needs review:
>>>>
>>>> https://inbox.sourceware.org/gcc-patches/20240415233833.104460-1-gabif...@gmx.com/
>>> I think I just ACK'd an equivalent patch from someone else this week.
>> Looks like it hasn't been merged yet, and I couldn't find it in the
>> mailing list archive.
>> Anyway, I hope either one gets merged soon. :)
> I'm sure it will.  The variant I asked is from someone with commit
> privs, so they'll push it to the tree when convenient for them.

I still don't see that change having landed.

>
> jeff

[PATCH v2 2/4] RISC-V: Add interleave pattern.

2024-11-22 Thread Robin Dapp

From: Robin Dapp 

This patch adds efficient handling of interleaving patterns like
[0 4 1 5] to vec_perm_const.  It is implemented by a slideup and a
gather.
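
As an illustration only (not part of the patch), the slide-then-gather idea
can be modelled on plain arrays.  The helper names below are hypothetical,
and the model assumes both inputs have the same even length; it is a scalar
sketch of the data movement, not the RTL the patch emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference result of a two-input permute: an index i < n selects a[i],
// an index i >= n selects b[i - n].
std::vector<int> reference_perm(const std::vector<int>& a,
                                const std::vector<int>& b,
                                const std::vector<std::size_t>& sel)
{
  const std::size_t n = a.size();
  std::vector<int> out;
  for (std::size_t idx : sel)
    out.push_back(idx < n ? a[idx] : b[idx - n]);
  return out;
}

// Scalar model of "slide, then gather" (hypothetical helper, not GCC code).
// low == true  models [0 n 1 n+1 ...]; low == false models the high halves.
std::vector<int> interleave_by_slide_and_gather(const std::vector<int>& op0,
                                                const std::vector<int>& op1,
                                                bool low)
{
  const std::size_t n = op0.size();   // assumed even, equal to op1.size()
  const std::size_t half = n / 2;

  // Slide step: collect the wanted half of op0 in the lower half of tmp
  // and the wanted half of op1 in the upper half of tmp.
  std::vector<int> tmp(n);
  for (std::size_t i = 0; i < half; ++i)
    {
      tmp[i] = low ? op0[i] : op0[i + half];
      tmp[i + half] = low ? op1[i] : op1[i + half];
    }

  // Gather step: the selector [0, half, 1, half + 1, ...] is applied to
  // the single vector tmp.
  std::vector<int> out(n);
  for (std::size_t i = 0; i < n; i += 2)
    {
      out[i] = tmp[i / 2];
      out[i + 1] = tmp[i / 2 + half];
    }
  return out;
}
```

Once the slide has packed both wanted halves into one register, the same
single-input selector serves the low and the high variant.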

gcc/ChangeLog:

* config/riscv/riscv-v.cc (shuffle_interleave_patterns): New
function.
(expand_vec_perm_const_1): Use new function.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave-run.c: New 
test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave.c: New test.
---
 gcc/config/riscv/riscv-v.cc   |  80 
 .../vls-vlmax/shuffle-interleave-run.c| 122 ++
 .../autovec/vls-vlmax/shuffle-interleave.c|  69 ++
 3 files changed, 271 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave-run.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index deb2bdb4247..3f8fd3257c4 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3492,6 +3492,84 @@ shuffle_slide_patterns (struct expand_vec_perm_d *d)
   return true;
 }
 
+/* Recognize interleaving patterns like [0 4 1 5].  */
+
+static bool
+shuffle_interleave_patterns (struct expand_vec_perm_d *d)
+{
+  machine_mode vmode = d->vmode;
+  machine_mode sel_mode = related_int_vector_mode (vmode).require ();
+  poly_int64 vec_len = d->perm.length ();
+  int n_patterns = d->perm.encoding ().npatterns ();
+
+  if (!vec_len.is_constant ())
+return false;
+
+  if (n_patterns != 2)
+return false;
+
+  unsigned vlen = vec_len.to_constant ();
+
+  if (vlen < 4 || vlen > 64)
+return false;
+
+  if (d->one_vector_p)
+return false;
+
+  bool low = true;
+  if (d->perm.series_p (0, 2, 0, 1)
+  && d->perm.series_p (1, 2, vlen, 1))
+low = true;
+  else if (d->perm.series_p (0, 2, vlen / 2, 1)
+  && d->perm.series_p (1, 2, vlen + vlen / 2, 1))
+low = false;
+  else
+return false;
+
+  vec_perm_builder sel (vlen, 2, 1);
+  sel.safe_grow (vlen);
+  int cnt = 0;
+  for (unsigned i = 0; i < vlen; i += 2)
+{
+  sel[i] = cnt;
+  sel[i + 1] = cnt + vlen / 2;
+  cnt++;
+}
+
+  vec_perm_indices indices (sel, 2, vlen);
+
+  if (vlen != indices.length ().to_constant ())
+return false;
+
+  /* Success!  */
+  if (d->testing_p)
+return true;
+
+  int slide_cnt = vlen / 2;
+  rtx tmp = gen_reg_rtx (vmode);
+
+  if (low)
+{
+  /* No need for a vector length because we slide up until the
+end of OP1 anyway.  */
+  rtx ops[] = {tmp, d->op0, d->op1, gen_int_mode (slide_cnt, Pmode)};
+  insn_code icode = code_for_pred_slide (UNSPEC_VSLIDEUP, vmode);
+  emit_vlmax_insn (icode, SLIDEUP_OP_MERGE, ops);
+}
+  else
+{
+  rtx ops[] = {tmp, d->op1, d->op0, gen_int_mode (slide_cnt, Pmode)};
+  insn_code icode = code_for_pred_slide (UNSPEC_VSLIDEDOWN, vmode);
+  emit_nonvlmax_insn (icode, BINARY_OP_TUMA, ops,
+ gen_int_mode (slide_cnt, Pmode));
+}
+
+  rtx sel_rtx = vec_perm_indices_to_rtx (sel_mode, indices);
+  emit_vlmax_gather_insn (gen_lowpart (vmode, d->target), tmp, sel_rtx);
+
+  return true;
+}
+
 /* Recognize decompress patterns:
 
1. VEC_PERM_EXPR op0 and op1
@@ -3808,6 +3886,8 @@ expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
return true;
  if (shuffle_slide_patterns (d))
return true;
+ if (shuffle_interleave_patterns (d))
+   return true;
  if (shuffle_compress_patterns (d))
return true;
  if (shuffle_decompress_patterns (d))
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave-run.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave-run.c
new file mode 100644
index 000..57748d95362
--- /dev/null
+++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave-run.c
@@ -0,0 +1,122 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target riscv_v_ok } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-O3 -mrvv-max-lmul=m8 -std=gnu99" } */
+
+#include "shuffle-interleave.c"
+
+#define SERIES_2(x, y) (x), (x + 1)
+#define SERIES_4(x, y) SERIES_2 (x, y), SERIES_2 (x + 2, y)
+#define SERIES_8(x, y) SERIES_4 (x, y), SERIES_4 (x + 4, y)
+#define SERIES_16(x, y) SERIES_8 (x, y), SERIES_8 (x + 8, y)
+#define SERIES_32(x, y) SERIES_16 (x, y), SERIES_16 (x + 16, y)
+#define SERIES_64(x, y) SERIES_32 (x, y), SERIES_32 (x + 32, y)
+
+#define comp(a, b, n)  
\
+  for (unsigned i = 0; i < n; ++i) 
\
+if ((a)[i] != (b)[i])  
\
+  __builtin_abort ();
+
+#define CHECK1(TYPE, NUNITS)   
\
+  __attri

Re: [PATCH] wwwdocs: Align the DCO text for the GNU Toolchain to match community usage.

2024-11-22 Thread Jason Merrill


On 11/22/24 6:03 PM, Carlos O'Donell wrote:
> On 11/22/24 11:13 AM, Jason Merrill wrote:
>> On 11/21/24 6:04 PM, Carlos O'Donell wrote:
>>> Adjust the DCO text to match the broader community usage including
>>> the Linux kernel use around "real names."
>>>
>>> These changes clarify what was meant by "real name" and that it is
>>> not required to be a "legal name" or any other stronger
>>> requirement than a known identity that could be contacted to
>>> discuss the contribution.
>>
>> My take has been that this change is not necessary for us because the
>> FSF can accept copyright assignment for pseudonymous contributions,
>> so individual reviewers don't need to adjudicate whether a particular
>> pseudonym is sufficiently "known".
>
> This is not the case, which is why I'm suggesting we align the wording of
> the DCO usage to match the general community accepted meaning.
>
> The FSF copyright assignment process allows you to *post* your work publicly
> from a pseudonym and allows you to use your pseudonym in the "sources" file
> that GNU Maintainers use to check assignment and marks it like this:
> "Note: this is a pseudonym; legal name on assignment."
>
> The process does not allow you to remain pseudonymous to the FSF, and that
> information may eventually leak out of the FSF.
>
> Again, I'm suggesting we align the text of the DCO we use with the rest of the
> communities that use it.
>
> This is not a material change in the use of the DCO, just a clarification of the
> wording around "real name."


Sure, but it is a material change in our processes.  How do you propose 
that reviewers judge what constitutes a "known" identity?


Jason

Re: [PATCH] wwwdocs: Align the DCO text for the GNU Toolchain to match community usage.

2024-11-22 Thread Carlos O'Donell



On 11/22/24 1:45 PM, Jason Merrill wrote:
> On 11/22/24 6:03 PM, Carlos O'Donell wrote:
>> On 11/22/24 11:13 AM, Jason Merrill wrote:
>>> On 11/21/24 6:04 PM, Carlos O'Donell wrote:
>>>> Adjust the DCO text to match the broader community usage including
>>>> the Linux kernel use around "real names."
>>>>
>>>> These changes clarify what was meant by "real name" and that it is
>>>> not required to be a "legal name" or any other stronger
>>>> requirement than a known identity that could be contacted to
>>>> discuss the contribution.
>>>
>>> My take has been that this change is not necessary for us because the
>>> FSF can accept copyright assignment for pseudonymous contributions,
>>> so individual reviewers don't need to adjudicate whether a particular
>>> pseudonym is sufficiently "known".
>>
>> This is not the case, which is why I'm suggesting we align the wording of 
>> the DCO
>> usage to match the  general community accepted meaning.
>>
>> The FSF copyright assignment process allows you to *post* your work publicly 
>> from
>> a pseudonym and allows you to use your pseudonym in the "sources" file that
>> GNU Maintainers use to check assignment and marks it like this:
>> "Note: this is a pseudonym; legal name on assignment."
>>
>> The process does not allow you to remain pseudonymous to the FSF, and that 
>> information
>> may eventually leak out of the FSF.
>>
>> Again, I'm suggesting we align the text of the DCO we use with the rest of 
>> the
>> communities that use it.
>>
>> This is not a material change in the use of the DCO, just a clarification of 
>> the
>> wording around "real name."
> 
> Sure, but it is a material change in our processes.  How do you
> propose that reviewers judge what constitutes a "known" identity?

Why do you need to judge that?

You can point out that anonymous contributions are not permitted.

And accept that the person is being honest in their dealings with you.

-- 
Cheers,
Carlos.

Re: [patch,avr] avr.opt: Refactor Var(avr_) to Var(avropt_)

2024-11-22 Thread Denis Chertykov

On Thu, Nov 21, 2024 at 20:39, Georg-Johann Lay wrote:
>
> This is a no-op refactoring that uses a prefix of avropt_
> (formerly: avr_) for variables defined via Var() directives
> in avr.opt.  This makes it easier to spot values that come directly
> from avr.opt in the rest of the backend.
>
> Ok for trunk?

Ok. Please apply.

Denis.

[PATCH v2] fold fold_truth_andor field merging into ifcombine

2024-11-22 Thread Alexandre Oliva



This patch introduces various improvements to the logic that merges
field compares, while moving it into ifcombine.

Before the patch, we could merge:

  (a.x1 EQNE b.x1)  ANDOR  (a.y1 EQNE b.y1)

into something like:

  (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

if both of A's fields live within the same alignment boundaries, and
so do B's, at the same relative positions.  Constants may be used
instead of the object B.
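
As a rough source-level picture of that transformation (an illustration,
not code from the patch): the struct layout, the helper names and the use
of memcpy are assumptions chosen so the sketch is well-defined C++; the
compiler performs the equivalent rewrite on its own internal representation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical layout: x1 and y1 share one 32-bit aligned word with two
// unrelated fields.
struct S
{
  std::uint8_t x1;
  std::uint8_t y1;
  std::uint8_t other0;
  std::uint8_t other1;
};
static_assert(sizeof(S) == 4, "sketch assumes S fills exactly one word");

// Load the whole word holding the fields (memcpy avoids aliasing issues).
std::uint32_t load_word(const S& s)
{
  std::uint32_t w;
  std::memcpy(&w, &s, sizeof w);
  return w;
}

// MASK: all-ones in the bits of x1 and y1, built this way so the sketch
// does not depend on endianness.
std::uint32_t field_mask()
{
  const S m{0xff, 0xff, 0, 0};
  return load_word(m);
}

// Two separate field compares, as written in the source.
bool eq_fieldwise(const S& a, const S& b)
{
  return a.x1 == b.x1 && a.y1 == b.y1;
}

// One masked word compare, the shape of the merged form.
bool eq_merged(const S& a, const S& b)
{
  const std::uint32_t mask = field_mask();
  return (load_word(a) & mask) == (load_word(b) & mask);
}
```

Both functions agree for all inputs: the mask discards exactly the bits
that the field-wise version never looks at.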

The initial goal of this patch was to enable such combinations when a
field crossed alignment boundaries, e.g. for packed types.  We can't
generally access such fields with a single memory access, so when we
come across such a compare, we will attempt to combine each access
separately.

Some merging opportunities were missed because of right-shifts,
compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
narrowing conversions, especially after earlier merges.  This patch
introduces handlers for several cases involving these.
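
For illustration, the forms mentioned above all test the same bits.  The
function names and the 0xff00 mask are made up for this sketch; it only
shows the arithmetic identities involved, not the patch's handlers.

```cpp
#include <cassert>
#include <cstdint>

// Plain masked compare.
bool eq_masked(std::uint32_t a, std::uint32_t b, std::uint32_t mask)
{
  return (a & mask) == (b & mask);
}

// XOR form: the masked bits are equal iff their XOR is zero under the mask.
bool eq_xor(std::uint32_t a, std::uint32_t b, std::uint32_t mask)
{
  return ((a ^ b) & mask) == 0;
}

// Right-shift plus narrowing form: compares bits 8..15 of each word,
// the same bits that the mask 0xff00 selects.
bool eq_shift_narrow(std::uint32_t a, std::uint32_t b)
{
  return static_cast<std::uint8_t>(a >> 8) == static_cast<std::uint8_t>(b >> 8);
}
```

Recognizing these as equivalent is what lets a later compare be merged
with an earlier one that was written, or already rewritten, differently.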

The merging of multiple field accesses into wider bitfield-like
accesses is undesirable to do too early in compilation, so we move it
from folding to ifcombine.

When it is the second of a noncontiguous pair of compares that first
accesses a word, we may merge the first compare with part of the
second compare that refers to the same word, keeping the compare of
the remaining bits at the spot where the second compare used to be.

Handling compares with non-constant fields was somewhat generalized
from what fold used to do, now handling non-adjacent fields, even if a
field of one object crosses an alignment boundary but the other
doesn't.

Regstrapped on x86_64-linux-gnu.  Ok to install?


Changes from v1:

- Noncontiguous ifcombine split out and already installed.

- Do not attempt to place new loads next to the latest of the original
  loads.  Only check that loads have the same vuses, and place loads
  next to either one, even if that's suboptimal.

- Use gimple pattern matching.  It turned out not to be very useful, but
  it enabled the elimination of redundant auxiliary functions.

- Rewrote constants and masks using wide_int.

- Work harder to gather and reuse location_t.

- Rework BIT_XOR handling to avoid having to match patterns again.

- Distinguish the cases of contiguous and noncontiguous conditions.

- Comments, lots of comments.

- More tests.

- Dropped the warnings that were hitting i386 and rs6000 bootstraps.
  The new possibilities with ifcombine, noncontiguous at that, on top of
  more flexible field merging, made room for too many false positives.

Requested but unchanged from v1:

- fold_truth_andor_for_ifcombine (renamed from ...andor_maybe_separate)
still builds and returns generic expressions.  Since other
ifcombine_ifandif build generic exprs and have to go through
regimplification anyway, I failed to see the point.  Implementing that
change would require some more work in tree-ssa-ifcombine.cc to support.
I'd rather do that as a follow up if desired, please let me know.

- it also remains a single bulky function.  There's so much state that
breaking it up doesn't seem to be helpful.  Hopefully the added comments
will help despite making it even bigger.

- TBAA situation is unchanged, same as what's carried over from fold.
I'm not sure what the concerns in your mind are, but if there are actual
problems, they have long been there, and we'd better address them in
both fold and in this bit now moved to ifcombine, ideally in a separate
backportable patch.



for  gcc/ChangeLog

* fold-const.cc (make_bit_field): Export.
(unextend, all_ones_mask_p): Drop.
(decode_field_reference, fold_truth_andor_1): Move
field compare merging logic...
* gimple-fold.cc: (fold_truth_andor_for_ifcombine) ... here.
(compute_split_boundary_from_align): New.
(make_bit_field_load, build_split_load): New.
(reuse_split_load): New.
* fold-const.h: (make_bit_field_ref): Declare
(fold_truth_andor_for_ifcombine): Declare.
* match.pd (any_convert, bit_and_cst, rshift_cst): New.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
fold_truth_andor_for_ifcombine.

for  gcc/testsuite/ChangeLog

* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.
* gcc.dg/field-merge-6.c: New.
* gcc.dg/field-merge-7.c: New.
* gcc.dg/field-merge-8.c: New.
* gcc.dg/field-merge-9.c: New.
* gcc.dg/field-merge-10.c: New.
* gcc.dg/field-merge-11.c: New.
---
 gcc/fold-const.cc |  512 ---
 gcc/fold-const.h  |   10 
 gcc/gimple-fold.cc| 1107 +
 gcc/match.pd  |   11 
 gcc/testsuite/gcc.dg/field-merge-1.c  |   64 ++
 gcc/testsuite/gcc.dg/field-merge-10.c |   36 +
 gcc/testsuite

[PATCH] Sync top-level configure with binutils

2024-11-22 Thread Sam James

This syncs us with binutils/gdb's toplevel configure as of
987db70acefd0b223a8df2240d4e5ca544cc0a91.

There's not much notable here, just gprofng (which is in binutils) being
disabled for musl and a new target which got added on that side too.

The only part which may look interesting is the baseargs->bbaseargs
change which goes back to Arsen's gettext work and a fixup which
landed for that on the binutils side in
9c0aa4c53104b1c4333d55aeaf11b41053307929.
---
OK?

I've just synced the binutils/gdb side as well. The only difference left
is a divergence from some autoconf modernisation patches from Matthieu Longo
who is submitting them on the GCC side, pending approval for those.

 configure| 21 ++---
 configure.ac | 22 ++
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/configure b/configure
index 10c1589d473c..c6796040fd8a 100755
--- a/configure
+++ b/configure
@@ -1551,7 +1551,7 @@ Optional Features:
 
   --enable-gold[=ARG] build gold [ARG={default,yes,no}]
   --enable-ld[=ARG]   build ld [ARG={default,yes,no}]
-  --enable-gprofng[=ARG]  build gprofng [ARG={yes,no}]
+  --disable-gprofng   do not build gprofng
   --enable-compressed-debug-sections={all,gas,gold,ld,none}
   Enable compressed debug sections for gas, gold or ld
   by default
@@ -3151,7 +3151,9 @@ fi
 
 if test "$enable_gprofng" = "yes"; then
   case "${target}" in
-x86_64-*-linux* | i?86-*-linux* | aarch64-*-linux*)
+*-musl*)
+  ;;
+x86_64-*-linux* | i?86-*-linux* | aarch64-*-linux* | riscv64-*-linux*)
 configdirs="$configdirs gprofng"
 ;;
   esac
@@ -3670,6 +3672,15 @@ case "${target}" in
   cris-*-* | crisv32-*-*)
 libgloss_dir=cris
 ;;
+  kvx-*-elf)
+libgloss_dir=kvx-elf
+;;
+  kvx-*-mbr)
+libgloss_dir=kvx-mbr
+;;
+  kvx-*-cos)
+libgloss_dir=kvx-cos
+;;
   hppa*-*-*)
 libgloss_dir=pa
 ;;
@@ -3971,6 +3982,9 @@ case "${target}" in
   i[3456789]86-*-rdos*)
 noconfigdirs="$noconfigdirs gdb"
 ;;
+  kvx-*-*)
+noconfigdirs="$noconfigdirs gdb gdbserver sim"
+;;
   mmix-*-*)
 noconfigdirs="$noconfigdirs gdb"
 ;;
@@ -11314,7 +11328,8 @@ hbaseargs="$hbaseargs --disable-option-checking"
 tbaseargs="$tbaseargs --disable-option-checking"
 
 if test "$enable_year2038" = no; then
-  baseargs="$baseargs --disable-year2038"
+  bbaseargs="$bbaseargs --disable-year2038"
+  hbaseargs="$hbaseargs --disable-year2038"
   tbaseargs="$tbaseargs --disable-year2038"
 fi
 
diff --git a/configure.ac b/configure.ac
index fb61550dba7b..a8d13b31ee2e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -407,13 +407,14 @@ case "${ENABLE_LD}" in
 esac
 
 AC_ARG_ENABLE(gprofng,
-[AS_HELP_STRING([[--enable-gprofng[=ARG]]],
-   [build gprofng @<:@ARG={yes,no}@:>@])],
+[AS_HELP_STRING([[--disable-gprofng]], [do not build gprofng])],
 enable_gprofng=$enableval,
 enable_gprofng=yes)
 if test "$enable_gprofng" = "yes"; then
   case "${target}" in
-x86_64-*-linux* | i?86-*-linux* | aarch64-*-linux*)
+*-musl*)
+  ;;
+x86_64-*-linux* | i?86-*-linux* | aarch64-*-linux* | riscv64-*-linux*)
 configdirs="$configdirs gprofng"
 ;;
   esac
@@ -892,6 +893,15 @@ case "${target}" in
   cris-*-* | crisv32-*-*)
 libgloss_dir=cris
 ;;
+  kvx-*-elf)
+libgloss_dir=kvx-elf
+;;
+  kvx-*-mbr)
+libgloss_dir=kvx-mbr
+;;
+  kvx-*-cos)
+libgloss_dir=kvx-cos
+;;
   hppa*-*-*)
 libgloss_dir=pa
 ;;
@@ -1193,6 +1203,9 @@ case "${target}" in
   i[[3456789]]86-*-rdos*)
 noconfigdirs="$noconfigdirs gdb"
 ;;
+  kvx-*-*)
+noconfigdirs="$noconfigdirs gdb gdbserver sim"
+;;
   mmix-*-*)
 noconfigdirs="$noconfigdirs gdb"
 ;;
@@ -3543,7 +3556,8 @@ hbaseargs="$hbaseargs --disable-option-checking"
 tbaseargs="$tbaseargs --disable-option-checking"
 
 if test "$enable_year2038" = no; then
-  baseargs="$baseargs --disable-year2038"
+  bbaseargs="$bbaseargs --disable-year2038"
+  hbaseargs="$hbaseargs --disable-year2038"
   tbaseargs="$tbaseargs --disable-year2038"
 fi
 
-- 
2.47.0

Re: [patch,avr] PR117726: More tweaks to multi-byte shifts

2024-11-22 Thread Denis Chertykov

On Fri, Nov 22, 2024 at 17:12, Georg-Johann Lay wrote:
>
> This patch is similar to https://gcc.gnu.org/r15-5569 (tweak ashift:SI)
> but for ashiftrt and lshiftrt codes.  It splits constant shift offsets > 16
> into a 3-operand byte shift and a 2-operand residual bit shift.
> Moreover, some of the constraint alternatives have been promoted
> to 3-operand alternatives regardless of options.  For example,
> ashift:HI and lshiftrt:HI can support 3 operands for offsets 9...12
> without any overhead.
> Apart from that, it's a bit of code clean up for 2-byte and 4-byte
> shift insns:  Use one RTL peephole with any_shift code iterator
> instead of 3 individual peepholes.  It also removes some useless
> split insns; presumably introduced during the cc0 -> CCmode work.
>
>
> No regressions. Ok for trunk?

Please apply.

Denis.
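
The split described in the quoted summary can be pictured arithmetically.
This is only a sketch for a logical right shift of a 32-bit value; the
type and function names are hypothetical and nothing here reflects the
actual AVR insn patterns.

```cpp
#include <cassert>
#include <cstdint>

// A constant shift offset split into a whole-byte part and a residual
// bit part (hypothetical helper type).
struct ShiftSplit
{
  unsigned byte_bits;      // multiple of 8: handled by moving bytes
  unsigned residual_bits;  // 0..7: handled by a bit-wise shift
};

ShiftSplit split_shift(unsigned offset)
{
  return {(offset / 8) * 8, offset % 8};
}

// Logical right shift done in two steps; offset is assumed to be 0..31.
std::uint32_t lshiftrt_split(std::uint32_t x, unsigned offset)
{
  const ShiftSplit s = split_shift(offset);
  std::uint32_t t = x >> s.byte_bits;  // "byte shift": dest = src >> 8*k
  t >>= s.residual_bits;               // residual shift in place
  return t;
}
```

For an offset of 19, for example, the byte part is 16 and the residual is
3, so only three single-bit steps remain after the byte moves.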

[PATCH] C/C++: add fix-it hints for missing '&' and '*' (v5) [PR87850]

2024-11-22 Thread David Malcolm

Revisiting this patch from 2018 that didn't quite make it;
earlier versions were:
  v1: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00802.html
  v2: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-11/msg01408.html
  v3: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-11/msg01658.html
  v4: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-11/msg02617.html

I believe the remaining point of discussion was about enum vs int:
  https://gcc.gnu.org/legacy-ml/gcc-patches/2018-12/msg00293.html
and that none of us had strong opinions on the matter.  I've added
some test coverage for that, and rebased it (e.g. for the .c to .cc
renaming of our sources).

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
OK for trunk?


This patch adds a note with a fix-it hint to various
pointer-vs-non-pointer diagnostics, suggesting the addition of
a leading '&' or '*'.
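
As a minimal sketch of the two situations (the functions here are
hypothetical stand-ins, and the rejected calls are left commented out so
the snippet compiles):

```cpp
#include <cassert>

// Hypothetical API taking an out-pointer, in the style of
// pthread_key_create.
int create_key(unsigned int *key)
{
  *key = 7;
  return 0;
}

// Hypothetical API taking a plain value.
int read_value(int v)
{
  return v;
}

unsigned int missing_ampersand_fixed()
{
  unsigned int key = 0;
  // create_key (key);   // rejected: 'unsigned int' passed where
  //                     // 'unsigned int*' is expected
  create_key (&key);     // with the suggested '&' inserted
  return key;
}

int missing_star_fixed()
{
  int value = 5;
  int *ptr = &value;
  // return read_value (ptr);  // rejected: 'int*' passed where 'int'
  //                           // is expected
  return read_value (*ptr);    // with the suggested '*' inserted
}
```

In both cases the fix is a single-character insertion before the
argument, which is what makes a fix-it hint a natural fit.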

For example, note the ampersand fix-it hint in the following:

demo.c: In function 'int main()':
demo.c:5:22: error: invalid conversion from 'pthread_key_t' {aka 'unsigned int'}
   to 'pthread_key_t*' {aka 'unsigned int*'} [-fpermissive]
5 |   pthread_key_create(key, NULL);
  |  ^~~
  |  |
  |  pthread_key_t {aka unsigned int}
demo.c:5:22: note: possible fix: take the address with '&'
5 |   pthread_key_create(key, NULL);
  |  ^~~
  |  &
In file included from demo.c:1:
/usr/include/pthread.h:1122:47: note:   initializing argument 1 of
   'int pthread_key_create(pthread_key_t*, void (*)(void*))'
 1122 | extern int pthread_key_create (pthread_key_t *__key,
  |~~~^

gcc/c-family/ChangeLog:
PR c++/87850
* c-common.cc: Include "gcc-rich-location.h".
(maybe_emit_indirection_note): New function.
* c-common.h (maybe_emit_indirection_note): New decl.
(compatible_types_for_indirection_note_p): New decl.

gcc/c/ChangeLog:
PR c++/87850
* c-typeck.cc (compatible_types_for_indirection_note_p): New
function.
(convert_for_assignment): Call maybe_emit_indirection_note for
pointer vs non-pointer diagnostics.

gcc/cp/ChangeLog:
PR c++/87850
* call.cc (convert_like_real): Call maybe_emit_indirection_note
for "invalid conversion" diagnostic.
* typeck.cc (compatible_types_for_indirection_note_p): New
function.

gcc/testsuite/ChangeLog:
PR c++/87850
* c-c++-common/indirection-fixits.c: New test.
* g++.dg/template/error60.C: Add fix-it hint to expected output.
* g++.dg/template/error60a.C: Likewise.

Signed-off-by: David Malcolm 
---
 gcc/c-family/c-common.cc  |  35 ++
 gcc/c-family/c-common.h   |   4 +
 gcc/c/c-typeck.cc |  18 +-
 gcc/cp/call.cc|   2 +
 gcc/cp/typeck.cc  |   8 +
 .../c-c++-common/indirection-fixits.c | 317 ++
 gcc/testsuite/g++.dg/template/error60.C   |   5 +
 gcc/testsuite/g++.dg/template/error60a.C  |   5 +
 8 files changed, 392 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/indirection-fixits.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 1ffc63afbd3a..8cae3d1feb1f 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "vec-perm-indices.h"
 #include "tree-pretty-print-markup.h"
 #include "gcc-urlifier.h"
+#include "gcc-rich-location.h"
 
 cpp_reader *parse_in;  /* Declared in c-pragma.h.  */
 
@@ -9640,6 +9641,40 @@ maybe_suggest_missing_token_insertion (rich_location 
*richloc,
 }
 }
 
+/* Potentially emit a note about likely missing '&' or '*',
+   depending on EXPR and EXPECTED_TYPE.  */
+
+void
+maybe_emit_indirection_note (location_t loc,
+tree expr, tree expected_type)
+{
+  gcc_assert (expr);
+  gcc_assert (expected_type);
+
+  tree actual_type = TREE_TYPE (expr);
+
+  /* Missing '&'.  */
+  if (TREE_CODE (expected_type) == POINTER_TYPE
+  && compatible_types_for_indirection_note_p (actual_type,
+ TREE_TYPE (expected_type))
+  && lvalue_p (expr))
+{
+  gcc_rich_location richloc (loc);
+  richloc.add_fixit_insert_before ("&");
+  inform (&richloc, "possible fix: take the address with %qs", "&");
+}
+
+  /* Missing '*'.  */
+  if (TREE_CODE (actual_type) == POINTER_TYPE
+  && compatible_types_for_indirection_note_p (TREE_TYPE (actual_type),
+ expected_type))
+{
+  gcc_rich_location richloc (loc);
+  richloc.add_fixit_insert_before ("*");
+  inform (&richloc, "possible fix: dereference with %qs", "*");
+}
+}
+
 #if CHECKING_P
 
 namespace s

Re: [PATCH] c++: give suggestion on misspelled class name [PR116771]

2024-11-22 Thread Marek Polacek

On Fri, Nov 22, 2024 at 04:56:08PM -0500, David Malcolm wrote:
> Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
> 
> OK for trunk?

Not an approval, but the patch LGTM -- it follows the usual hint.suggestion()
pattern elsewhere in the front end.
 
> gcc/cp/ChangeLog:
>   PR c++/116771
>   * parser.cc (cp_parser_name_lookup_error): Provide suggestions for
>   the case of complete failure where there is no scope.
> 
> gcc/testsuite/ChangeLog:
>   PR c++/116771
>   * g++.dg/spellcheck-pr116771.C: New test.
> 
> Signed-off-by: David Malcolm 
> ---
>  gcc/cp/parser.cc   | 16 +++-
>  gcc/testsuite/g++.dg/spellcheck-pr116771.C |  9 +
>  2 files changed, 24 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/spellcheck-pr116771.C
> 
> diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> index ed10f58422ea..06ee36e6727c 100644
> --- a/gcc/cp/parser.cc
> +++ b/gcc/cp/parser.cc
> @@ -3399,7 +3399,21 @@ cp_parser_name_lookup_error (cp_parser* parser,
>   error_at (location, "%<%T::%E%> has not been declared",
> parser->object_scope, name);
>else
> - error_at (location, "%qE has not been declared", name);
> + {
> +   auto_diagnostic_group d;
> +   name_hint hint
> + = lookup_name_fuzzy (name, FUZZY_LOOKUP_TYPENAME, location);
> +   if (const char *suggestion = hint.suggestion ())
> + {
> +   gcc_rich_location richloc (location);
> +   richloc.add_fixit_replace (suggestion);
> +   error_at (&richloc,
> + "%qE has not been declared; did you mean %qs?",
> + name, suggestion);
> + }
> +   else
> + error_at (location, "%qE has not been declared", name);
> + }
>  }
>else if (parser->scope && parser->scope != global_namespace)
>  {
> diff --git a/gcc/testsuite/g++.dg/spellcheck-pr116771.C 
> b/gcc/testsuite/g++.dg/spellcheck-pr116771.C
> new file mode 100644
> index ..fd8bd6d46cbd
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/spellcheck-pr116771.C
> @@ -0,0 +1,9 @@
> +class layout_printer
> +{
> +void print_newline ();
> +};
> +
> +void
> +layout_pirnter::print_newline () // { dg-error "'layout_pirnter' has not 
> been declared; did you mean 'layout_printer'" }
> +{
> +}
> -- 
> 2.26.3
> 

Marek

[PATCH] c++: give suggestion on misspelled class name [PR116771]

2024-11-22 Thread David Malcolm

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
PR c++/116771
* parser.cc (cp_parser_name_lookup_error): Provide suggestions for
the case of complete failure where there is no scope.

gcc/testsuite/ChangeLog:
PR c++/116771
* g++.dg/spellcheck-pr116771.C: New test.

Signed-off-by: David Malcolm 
---
 gcc/cp/parser.cc   | 16 +++-
 gcc/testsuite/g++.dg/spellcheck-pr116771.C |  9 +
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/spellcheck-pr116771.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index ed10f58422ea..06ee36e6727c 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -3399,7 +3399,21 @@ cp_parser_name_lookup_error (cp_parser* parser,
error_at (location, "%<%T::%E%> has not been declared",
  parser->object_scope, name);
   else
-   error_at (location, "%qE has not been declared", name);
+   {
+ auto_diagnostic_group d;
+ name_hint hint
+   = lookup_name_fuzzy (name, FUZZY_LOOKUP_TYPENAME, location);
+ if (const char *suggestion = hint.suggestion ())
+   {
+ gcc_rich_location richloc (location);
+ richloc.add_fixit_replace (suggestion);
+ error_at (&richloc,
+   "%qE has not been declared; did you mean %qs?",
+   name, suggestion);
+   }
+ else
+   error_at (location, "%qE has not been declared", name);
+   }
 }
   else if (parser->scope && parser->scope != global_namespace)
 {
diff --git a/gcc/testsuite/g++.dg/spellcheck-pr116771.C 
b/gcc/testsuite/g++.dg/spellcheck-pr116771.C
new file mode 100644
index ..fd8bd6d46cbd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/spellcheck-pr116771.C
@@ -0,0 +1,9 @@
+class layout_printer
+{
+void print_newline ();
+};
+
+void
+layout_pirnter::print_newline () // { dg-error "'layout_pirnter' has not been 
declared; did you mean 'layout_printer'" }
+{
+}
-- 
2.26.3

Re: [PATCH] build: Remove INCLUDE_MEMORY [PR117737]

2024-11-22 Thread David Malcolm

On Fri, 2024-11-22 at 13:15 -0800, Andrew Pinski wrote:
> Since diagnostic.h is included in over half of the sources, requiring
> `#define INCLUDE_MEMORY` does not make sense. Instead, let's
> unconditionally include memory in system.h.
>
> The majority of this patch is just removing `#define INCLUDE_MEMORY`
> from the sources which currently have it.

Sorry about the unpleasantness.

FWIW I did consider simply including <memory> unconditionally for
r15-4610 ("Use unique_ptr in more places in pretty_printer/diagnostics
[PR116613]")
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665443.html
but the verbose approach seemed to me like something I could self-
approve; the simple approach didn't.

As I said there, I'd like to use std::unique_ptr in more places, such
as when creating passes, so I think the number of places we'd need
INCLUDE_MEMORY is likely to eventually be most of the TUs in the
compiler.

So I'm in favor of Andrew's patch, FWIW

Dave

[PATCH v2 0/4] Improve and add VLS slide strategies.

2024-11-22 Thread Robin Dapp

From: Robin Dapp 

Changes from v1:
 - Improve function naming and rephrase comment.

The series still causes execution failures due to the previously
mentioned bugs.  The avlprop one seems to have disappeared on my
machine but I'm not convinced.
Hopefully this time I'm using git send-email correctly and the CI
can pick it up.

Robin Dapp (4):
  RISC-V: Add slide to perm_const strategies.
  RISC-V: Add interleave pattern.
  RISC-V: Add even/odd vec_perm_const pattern.
  RISC-V: Improve slide1up pattern.

 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-v.cc   | 297 +-
 gcc/config/riscv/riscv.cc |  18 +-
 .../gcc.target/riscv/rvv/autovec/pr112599-2.c |   2 +-
 .../autovec/vls-vlmax/shuffle-evenodd-run.c   | 122 +++
 .../rvv/autovec/vls-vlmax/shuffle-evenodd.c   |  68 
 .../vls-vlmax/shuffle-interleave-run.c| 122 +++
 .../autovec/vls-vlmax/shuffle-interleave.c|  69 
 .../autovec/vls-vlmax/shuffle-slide-run1.c|  81 +
 .../rvv/autovec/vls-vlmax/shuffle-slide1.c| 137 
 10 files changed, 901 insertions(+), 16 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd-run.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave-run.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide-run1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide1.c

-- 
2.47.0

[patch,avr,applied]: Tabify avr-common.cc

2024-11-22 Thread Georg-Johann Lay


avr-common.cc used spaces for indentation instead of TABs.
Applied as obvious.

Johann

--

AVR: Tabify avr-common.cc according to coding rules.

gcc/
* common/config/avr/avr-common.cc: Tabify.

diff --git a/gcc/common/config/avr/avr-common.cc b/gcc/common/config/avr/avr-common.cc
index c8f40fc2367..2ff8cbf2cbc 100644
--- a/gcc/common/config/avr/avr-common.cc
+++ b/gcc/common/config/avr/avr-common.cc
@@ -77,8 +77,8 @@ static const struct default_options avr_option_optimization_table[] =
 
 static bool
 avr_handle_option (struct gcc_options *opts, struct gcc_options*,
-   const struct cl_decoded_option *decoded,
-   location_t loc ATTRIBUTE_UNUSED)
+		   const struct cl_decoded_option *decoded,
+		   location_t loc ATTRIBUTE_UNUSED)
 {
   int value = decoded->value;
 
@@ -86,22 +86,22 @@ avr_handle_option (struct gcc_options *opts, struct gcc_options*,
 {
 case OPT_mdouble_:
   if (value == 64)
-{
+	{
 #if !defined (HAVE_DOUBLE64)
-  error_at (loc, "option %<-mdouble=64%> is only available if "
-"configured %<--with-double={64|64,32|32,64}%>");
+	  error_at (loc, "option %<-mdouble=64%> is only available if "
+		"configured %<--with-double={64|64,32|32,64}%>");
 #endif
-  opts->x_avropt_long_double = 64;
-}
+	  opts->x_avropt_long_double = 64;
+	}
   else if (value == 32)
-{
+	{
 #if !defined (HAVE_DOUBLE32)
-  error_at (loc, "option %<-mdouble=32%> is only available if "
-"configured %<--with-double={32|32,64|64,32}%>");
+	  error_at (loc, "option %<-mdouble=32%> is only available if "
+		"configured %<--with-double={32|32,64|64,32}%>");
 #endif
-}
+	}
   else
-gcc_unreachable();
+	gcc_unreachable();
 
 #if defined (HAVE_LONG_DOUBLE_IS_DOUBLE)
   opts->x_avropt_long_double = value;
@@ -110,26 +110,26 @@ avr_handle_option (struct gcc_options *opts, struct gcc_options*,
 
 case OPT_mlong_double_:
   if (value == 64)
-{
+	{
 #if !defined (HAVE_LONG_DOUBLE64)
-  error_at (loc, "option %<-mlong-double=64%> is only available if "
-"configured %<--with-long-double={64|64,32|32,64}%>, "
-"or %<--with-long-double=double%> together with "
-"%<--with-double={64|64,32|32,64}%>");
+	  error_at (loc, "option %<-mlong-double=64%> is only available if "
+		"configured %<--with-long-double={64|64,32|32,64}%>, "
+		"or %<--with-long-double=double%> together with "
+		"%<--with-double={64|64,32|32,64}%>");
 #endif
-}
+	}
   else if (value == 32)
-{
+	{
 #if !defined (HAVE_LONG_DOUBLE32)
-  error_at (loc, "option %<-mlong-double=32%> is only available if "
-"configured %<--with-long-double={32|32,64|64,32}%>, "
-"or %<--with-long-double=double%> together with "
-"%<--with-double={32|32,64|64,32}%>");
+	  error_at (loc, "option %<-mlong-double=32%> is only available if "
+		"configured %<--with-long-double={32|32,64|64,32}%>, "
+		"or %<--with-long-double=double%> together with "
+		"%<--with-double={32|32,64|64,32}%>");
 #endif
-  opts->x_avropt_double = 32;
-}
+	  opts->x_avropt_double = 32;
+	}
   else
-gcc_unreachable();
+	gcc_unreachable();
 
 #if defined (HAVE_LONG_DOUBLE_IS_DOUBLE)
   opts->x_avropt_double = value;

[PUSHED] test-art: Fix comment in types.h

2024-11-22 Thread Andrew Pinski

The comment references INCLUDE_MEMORY but the code actually
checks INCLUDE_VECTOR. So fix up the comment to mention
INCLUDE_VECTOR.

Pushed as obvious.

gcc/ChangeLog:

* text-art/types.h: Fix comment.

Signed-off-by: Andrew Pinski 
---
 gcc/text-art/types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/text-art/types.h b/gcc/text-art/types.h
index 2b9f8b387c7..c741f77c25f 100644
--- a/gcc/text-art/types.h
+++ b/gcc/text-art/types.h
@@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* This header uses std::vector, but <vector> can't be directly
included due to issues with macros.  Hence it must be included from
-   system.h by defining INCLUDE_MEMORY in any source file using it.  */
+   system.h by defining INCLUDE_VECTOR in any source file using it.  */
 
 #ifndef INCLUDE_VECTOR
 # error "You must define INCLUDE_VECTOR before including system.h to use 
text-art/types.h"
-- 
2.43.0

Re: [PATCH] wwwdocs: Align the DCO text for the GNU Toolchain to match community usage.

2024-11-22 Thread Sam James

Jason Merrill  writes:

> On 11/21/24 6:04 PM, Carlos O'Donell wrote:
>> Adjust the DCO text to match the broader community usage including
>> the Linux kernel use around "real names."
>> These changes clarify what was meant by "real name" and that it is
>> not required to be a "legal name" or any other stronger requirement
>> than a known identity that could be contacted to discuss the
>> contribution.
>
> My take has been that this change is not necessary for us because the
> FSF can accept copyright assignment for pseudonymous contributions, so
> individual reviewers don't need to adjudicate whether a particular
> pseudonym is sufficiently "known".

This is an interesting point I hadn't considered which makes GCC (and
the other GNU toolchain projects) a bit different from those otherwise
considering this change.

FWIW, to offer context, the contributor who spawned some of this
discussion wasn't up for that option:
https://inbox.sourceware.org/libc-alpha/LG1vUwbz4ISz55P_V8z4yTtrujN57vlQQmtSTGpTVDw4FQ3D9FGkhVsUbWULbWN9cvxGMbpRbBrrN0XTIruRDGOKjLIv6Pxu4-eeHUcE4t8=@proton.me/.

But that doesn't mean it's unreasonable for us to insist that
pseudonymous contributions go through FSF assignment either. I don't yet
know what my position is on that, but thought the context could be
helpful nonetheless.

>
> Jason

thanks,
sam

Ping [PATCH] ada: PR target/117538 Traceback includes load address if executable is PIE.

2024-11-22 Thread Simon Wright

What I didn’t say before is that, if for example an exception is raised 
in Ada Language Server and the code uses s-trasym to output a report,
what you get on macOS is, for example,

[ALS.MAIN] in GNATformat Format
[ALS.MAIN] raised CONSTRAINT_ERROR : erroneous memory access
_ALS.MAIN_ 0x000105847D68 0x000105847DA0 0x0001058EA6AC
0x0001058EB5F4 0x000104B292C5 0x000104B36C5C
0x000104A8C0A0 0x000104A3C29C 0x0001049DB52C
0x000104A03844 0x000104A02DE8 0x000104A03058
0x000104A03C0C 0x000104A03240 0x000104A03854
0x000104A03884 0x000104A02DE8 0x000104A04F8C
0x0001049996E8 0x0001035BD344 0x0001035BD620
0x0001034AFBC8 0x0001034962DC 0x00010354930C
0x00010353D33C 0x00010252074C 0x00010252D25C
0x00010350A53C 0x000105831980 0x00019EE8F2E0

which is useless without the program load address. (A symbolic traceback would 
be even better, but that would be a different and more difficult project).

What should I do to get this change pushed?

> On 13 Nov 2024, at 17:15, Simon Wright  wrote:
> 
> If s-trasym.adb (System.Traceback.Symbolic, used as a renaming by
> GNAT.Traceback.Symbolic) is given a traceback from a
> position-independent executable, it does not include the executable's
> load address in the report. This is necessary in order to decode the
> traceback report.
> 
> Note, this has already been done for s-trasym__dwarf.adb, which really
> does produce a symbolic traceback; s-trasym.adb is the version used in
> systems which don't actually support symbolication.
> 
> Bootstrapped and regtested (ada only) on x86_64-apple-darwin.
> 
> * gcc/ada/libgnat/s-trasym.adb: Returns the traceback in the required
>form. Note that leading zeros are trimmed from hexadecimal strings.
>  (Symbolic_Traceback): Import Executable_Load_Address.
>  (Trim_Hex): New internal function to trim leading '0' characters
>from a hexadecimal string.
>  (Load_Address): New, from call to Executable_Load_Address.
>  (One_If_Executable_Is_PI): New, 0 if Load_Address is null, 1 if
>not.
>  (Max_Image_Length): New, found by calling System.Address_Image on
>the first address in the traceback. NB, doesn't include "0x".
>  (Load_Address_Prefix): New, String containing the required value.
>  (Max_Length_Needed): New, computed using the number of elements
>in the traceback plus the load address, if the executable is PIE.
>  (Result): New String of the required length (which will be an
>overestimate).
> 
> 2024-11-13  Simon Wright   
> 
> gcc/ada/Changelog:
> 
> PR target/117538
> * libgnat/s-trasym.adb: Returns the traceback in the required
> form. Note that leading zeros are trimmed from hexadecimal strings.
> 
> —
> diff --git a/gcc/ada/libgnat/s-trasym.adb b/gcc/ada/libgnat/s-trasym.adb
> index 894fcf37ffd..7172214453f 100644
> --- a/gcc/ada/libgnat/s-trasym.adb
> +++ b/gcc/ada/libgnat/s-trasym.adb
> @@ -53,19 +53,75 @@ package body System.Traceback.Symbolic is
> 
>   else
>  declare
> -Img : String := System.Address_Image (Traceback 
> (Traceback'First));
> -
> -Result : String (1 .. (Img'Length + 3) * Traceback'Length);
> -Last   : Natural := 0;
> +function Executable_Load_Address return System.Address;
> +pragma Import
> +  (C, Executable_Load_Address,
> +   "__gnat_get_executable_load_address");
> +
> +function Trim_Hex (S : String) return String;
> +function Trim_Hex (S : String) return String is
> +   Non_0 : Positive;
> +begin
> +   for J in S'Range loop
> +  if S (J) /= '0' or else J = S'Last then
> + Non_0 := J;
> + exit;
> +  end if;
> +   end loop;
> +   return S (Non_0 .. S'Last);
> +end Trim_Hex;
> +
> +Load_Address : constant System.Address :=
> +  Executable_Load_Address;
> +One_If_Executable_Is_PI : constant Natural :=
> +  Boolean'Pos (Load_Address /= Null_Address);
> +
> +--  How long is an Address_Image?
> +Max_Image_Length : constant Natural :=
> +  System.Address_Image (Traceback (Traceback'First))'
> +Length;
> +
> +Load_Address_Prefix : constant String :=
> +  "Load address: ";
> +
> +Max_Length_Needed : constant Positive :=
> +  (Load_Address_Prefix'Length *
> +   One_If_Executable_Is_PI) +
> +  (Max_Image_Length + 3) *
> +(Traceback'Length + One_If_Executable_Is_PI) +
> +  2;
> +
> +Result : String (1 .. Max_Length_Needed);
> +
> +Last : Natural := 0;
> 
>  begin
> +
> +if One_If_Executable_Is_PI /= 0 then
> +   declare
> +  item : constant String :=
> +

Re: [RFC/RFA][PATCH v6 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-11-22 Thread Jeff Law

On 11/13/24 7:16 AM, Mariam Arutunian wrote:

On Tue, Nov 12, 2024 at 2:15 AM Jeff Law wrote:

 > +
 > +
 > +/* Generate assembly to calculate CRC using clmul instruction.
 > +   The following code will be generated when the CRC and data
sizes are equal:
 > +     li      a4,quotient
 > +     li      a5,polynomial
 > +     xor     a0,a1,a0
 > +     clmul   a0,a0,a4
 > +     srli    a0,a0,crc_size
 > +     clmul   a0,a0,a5
 > +     slli    a0,a0,word_mode_size - crc_size
 > +     srli    a0,a0,word_mode_size - crc_size
Not something you need to change.  Aren't the final two instructions
here just a zero extension?

I suspect combine would pick this up.  So it's probably not worth
adding
more conditionals in expander.

In the GCC code, after inserting the clmul instruction, I call
riscv_emit_move (operands[0], gen_lowpart (crc_mode, a0)).

I do not explicitly insert slli and srli instructions.
Zero extension is used in the early phases of the compilation, but
during the split2 pass, it is replaced by ashift and lshiftrt
instructions.
OK.  That all makes sense and I even more strongly suspect that all the 
right things will happen if (for example) Zbb is turned on.  Thanks for 
clarifying.
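
As a side note on the slli/srli pair discussed above, here is a minimal
sketch of why it amounts to a zero extension.  It assumes a 64-bit word
(the word_mode_size of RV64); the function names are made up for
illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "slli a0,a0,64 - crc_size; srli a0,a0,64 - crc_size".
   Assumes a 64-bit word; CRC_SIZE is the CRC width in bits.  */
uint64_t
zext_by_shifts (uint64_t x, unsigned crc_size)
{
  assert (crc_size >= 1 && crc_size <= 64);
  unsigned sh = 64 - crc_size;
  return (x << sh) >> sh;
}

/* The same result expressed as a mask, i.e. a plain zero extension
   of the low CRC_SIZE bits.  */
uint64_t
zext_by_mask (uint64_t x, unsigned crc_size)
{
  assert (crc_size >= 1 && crc_size <= 64);
  uint64_t mask = crc_size == 64 ? UINT64_MAX
                                 : (UINT64_C (1) << crc_size) - 1;
  return x & mask;
}
```

Both functions agree for every input, which is the sense in which the
final two instructions are "just a zero extension".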

 > \ No newline at end of file
 > diff --git a/gcc/testsuite/gcc.target/riscv/crc-1-zbkc.c b/gcc/
testsuite/gcc.target/riscv/crc-1-zbkc.c
 > new file mode 100644
 > index 000..8c627c0431a
 > --- /dev/null
 > +++ b/gcc/testsuite/gcc.target/riscv/crc-1-zbkc.c
 > @@ -0,0 +1,11 @@
 > +/* { dg-do run } */
 > +/* { dg-options "-fdump-tree-crc -fdump-rtl-dfinish  -fdisable-
tree-phiopt2 -fdisable-tree-phiopt3" } */
 > +/* { dg-additional-options "-march=rv64gc_zbkc" { target
{ rv64 } } } */
 > +/* { dg-additional-options "-march=rv32gc_zbkc" { target
{ rv32 } } } */
So I think we probably need to add a bit of code to the testsuite.
Essentially we don't want to run this test on targets that don't have
zbkc support.

I think we probably end up wanting something similar to what we do with
vector where we have a test to tell us when V is supported.  I'm
planning to pick that up.  Similarly I think we want to do something
similar for Zbc.

To address this, I added code in |target-supports.exp| and modified the 
relevant tests.

I've attached the patch. Could you please check whether it is correct?
I think that just tests if the compiler thinks the extension is enabled,
i.e., did we pass Zbkb, Zbc or whatever on the command line.  The
question we need to answer is whether or not we can run such code.

The way we've done that for the V extension looks like this:

proc check_effective_target_riscv_v_ok { } {
    # If the target already supports v without any added options,
    # we may assume we can execute just fine.
    if { [check_effective_target_riscv_v] } {
	return 1
    }

    # check if we can execute vector insns with the given hardware or
    # simulator
    set gcc_march [regsub {[[:alnum:]]*} [riscv_get_arch] &v]
    if { [check_runtime ${gcc_march}_exec {
	int main() {  asm("vsetivli t0, 9, e8, m1, tu, ma"); return 0; } } "-march=${gcc_march}"] } {
	return 1
    }

    # Possible future extensions: If the target is a simulator, dg-add-options
    # might change its config to make it allow vector insns, or we might use
    # options to set special elf flags / sections to effect that.

    return 0
}
So we compile a little program with a single vector instruction and 
check that it doesn't fault.  I was thinking we could do the same thing 
for Zbc and Zbkb, but I haven't had time to cobble it together yet.

Jeff

Re: [PATCH] c-family: Yet another fix for _BitInt & __sync_* builtins [PR117641]

2024-11-22 Thread Marek Polacek

On Fri, Nov 22, 2024 at 09:56:17AM +0100, Jakub Jelinek wrote:
> Hi!
> 
> Sorry, the last patch only partially fixed the __sync_* ICEs with
> _BitInt(128) on ia32.
> Even for !fetch we need to error out and return 0.  I was afraid of
> APIs like __atomic_exchange/__atomic_compare_exchange, those obviously
> need to be supported even on _BitInt(128) on ia32, but they actually never
> reach sync_resolve_size; they are handled by adding the size argument and using
> the library version much earlier.
> For fetch && !orig_format (i.e. __atomic_fetch_* etc.) we need to return -1
> so that we handle it with a manual __atomic_load +
> __atomic_compare_exchange loop in the caller, all other cases should
> be rejected.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok, thanks.
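
For readers following along, the decision described in the quoted text
can be modelled roughly as below.  This is only a sketch of the control
flow after the patch; the enum and function names are hypothetical and
the real code in sync_resolve_size works on trees, not on booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three outcomes discussed in the thread.  */
enum sync_outcome
{
  SYNC_USE_SIZE,      /* Normal path: resolve to the sized builtin.  */
  SYNC_USE_CAS_LOOP,  /* The "return -1" case: the caller emits an
                         __atomic_load + __atomic_compare_exchange loop.  */
  SYNC_INCOMPATIBLE   /* The "goto incompatible" case: diagnose.  */
};

/* SIZE is the operand size in bytes, IS_BITINT says whether the type is
   a _BitInt, TIMODE_OK whether the target supports TImode.  FETCH and
   ORIG_FORMAT mirror the parameters of the same name in the patch.  */
enum sync_outcome
resolve_bitint_sync (unsigned size, bool is_bitint, bool timode_ok,
                     bool fetch, bool orig_format)
{
  assert (size > 0);
  if (size == 16 && is_bitint && !timode_ok)
    {
      /* Only __atomic_fetch_* style calls can fall back to a loop.  */
      if (fetch && !orig_format)
        return SYNC_USE_CAS_LOOP;
      return SYNC_INCOMPATIBLE;
    }
  return SYNC_USE_SIZE;
}
```

With this model, a non-fetch builtin on _BitInt(128) without TImode
support is now rejected rather than falling through, which is the case
the earlier patch missed.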

> 2024-11-22  Jakub Jelinek  
> 
>   PR c/117641
>   * c-common.cc (sync_resolve_size): For size 16 with _BitInt
>   on targets where TImode isn't supported, use goto incompatible if
>   !fetch.
> 
>   * gcc.dg/bitint-117.c: New test.
> 
> --- gcc/c-family/c-common.cc.jj   2024-11-19 20:35:53.222455518 +0100
> +++ gcc/c-family/c-common.cc  2024-11-21 11:23:07.897995741 +0100
> @@ -7457,11 +7457,10 @@ sync_resolve_size (tree function, vec  
>size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
>if (size == 16
> -  && fetch
>&& TREE_CODE (type) == BITINT_TYPE
>&& !targetm.scalar_mode_supported_p (TImode))
>  {
> -  if (!orig_format)
> +  if (fetch && !orig_format)
>   return -1;
>goto incompatible;
>  }
> --- gcc/testsuite/gcc.dg/bitint-117.c.jj  2024-11-21 11:23:59.720260255 
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-117.c 2024-11-21 11:24:59.141416996 +0100
> @@ -0,0 +1,13 @@
> +/* PR c/117641 */
> +/* { dg-do compile { target bitint575 } } */
> +/* { dg-options "-std=c23" } */
> +
> +void
> +foo (_BitInt(128) *b)
> +{
> +  __sync_add_and_fetch (b, 1);   /* { dg-error 
> "incompatible" "" { target { ! int128 } } } */
> +  __sync_val_compare_and_swap (b, 0, 1); /* { dg-error "incompatible" "" 
> { target { ! int128 } } } */
> +  __sync_bool_compare_and_swap (b, 0, 1);/* { dg-error "incompatible" "" 
> { target { ! int128 } } } */
> +  __sync_lock_test_and_set (b, 1);   /* { dg-error "incompatible" "" 
> { target { ! int128 } } } */
> +  __sync_lock_release (b);   /* { dg-error "incompatible" "" 
> { target { ! int128 } } } */
> +}
> 
>   Jakub
> 

Marek

Re: [PATCH v2] Add new warning Wmissing-designated-initializers [PR39589]

2024-11-22 Thread Marek Polacek

On Sat, Sep 07, 2024 at 10:17:14PM +0100, Peter Frost wrote:
> v2 Patch:
>   * adds proper changelog text
>   * fixes typo in c.opt
> 
> Currently the behaviour of Wmissing-field-initializers is inconsistent
> between C and C++. The C warning assumes that missing designated 
> initializers are deliberate, and does not warn. The C++ warning does warn
> for missing designated initializers.
> 
> This patch changes the behaviour of Wmissing-field-initializers to
> universally not warn about missing designated initializers, and adds a new
> warning specifically for missing designated initializers.
> 
> NOTE TO MAINTAINERS: This is my first gcc contribution, so I don't have
> git write access.

Thanks for the patch and sorry for the delay.

New options need to be documented in doc/invoke.texi; you can follow
-Wmissing-field-initializers in this case.
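
To make the quoted description concrete, here is a small sketch of the
kind of initializer the proposed option is about.  The struct and
function names are made up for illustration; the comments describe the
behaviour the patch proposes, not necessarily what any released
compiler does.

```c
/* Illustrative type only; not taken from the patch or its tests.  */
struct point
{
  int x;
  int y;
  int z;
};

/* Designated initializer that names only .x.  The members y and z are
   implicitly zero-initialized.  Per the patch description, this form
   would be reported by -Wmissing-designated-initializers and no longer
   by -Wmissing-field-initializers.  A positional initializer such as
   "{ 1 }" would stay under -Wmissing-field-initializers.  */
struct point
make_point_x (void)
{
  struct point p = { .x = 1 };
  return p;
}
```

The omitted members are well defined (zero) in both C and C++; the
warnings only concern whether the omission was likely deliberate.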
 
> Successfully tested on x86_64-pc-linux-gnu.
> 
>   PR c/39589
> 
> gcc/c-family/ChangeLog:
> 
>   * c.opt: Added new option Wmissing-designated-initializers, enabled by
>   Wextra

Usually we'd say:

* c.opt (Wmissing-designated-initializers): New option.

> gcc/c/ChangeLog:
> 
>   * c-typeck.cc (pop_init_level): Generate warning for missing designated
>   initializers rather than always ignore

Please add a '.' at the end.

> gcc/cp/ChangeLog:
> 
>   * typeck2.cc (process_init_constructor_record): Add check if missing
>   initializer is designated, warn as appropriate

Here too.

> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/diagnostic/base.C: Change flags
>   * gcc.dg/20011021-1.c: Fix test, missing designated initializers can
>   generate a warning now
>   * gcc.dg/missing-field-init-1.c: Change flags
>   * gcc.dg/pr60784.c: Change flags

Here too.

>   * g++.dg/warn/missing-designated-initializers-1.C: New test.
>   * g++.dg/warn/missing-designated-initializers-2.C: New test.
>   * gcc.dg/missing-designated-initializers-1.c: New test.
>   * gcc.dg/missing-designated-initializers-2.c: New test.
> 
> 
> ---
>  gcc/c-family/c.opt|  4 +++
>  gcc/c/c-typeck.cc | 36 +++
>  gcc/cp/typeck2.cc | 20 ---
>  gcc/testsuite/g++.dg/diagnostic/base.C|  4 +--
>  .../warn/missing-designated-initializers-1.C  | 11 ++
>  .../warn/missing-designated-initializers-2.C  | 11 ++
>  gcc/testsuite/gcc.dg/20011021-1.c |  4 +--
>  .../missing-designated-initializers-1.c   | 13 +++
>  .../missing-designated-initializers-2.c   | 13 +++
>  gcc/testsuite/gcc.dg/missing-field-init-1.c   |  2 +-
>  gcc/testsuite/gcc.dg/pr60784.c|  2 +-
>  11 files changed, 96 insertions(+), 24 deletions(-)
>  create mode 100644 
> gcc/testsuite/g++.dg/warn/missing-designated-initializers-1.C
>  create mode 100644 
> gcc/testsuite/g++.dg/warn/missing-designated-initializers-2.C
>  create mode 100644 gcc/testsuite/gcc.dg/missing-designated-initializers-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/missing-designated-initializers-2.c
> 
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 491aa02e1a3..81e52f1417e 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -977,6 +977,10 @@ Wmissing-field-initializers
>  C ObjC C++ ObjC++ Var(warn_missing_field_initializers) Warning 
> EnabledBy(Wextra)
>  Warn about missing fields in struct initializers.
>  
> +Wmissing-designated-initializers
> +C ObjC C++ ObjC++ Var(warn_missing_designated_initializers) Warning 
> EnabledBy(Wextra)
> +Warn about missing designated initializers in struct initializers.
> +

Since you're adding a new option, c.opt.urls needs to be regenerated.  All you
should need for that is to:
$ make html
$ make regenerate-opt-urls
in the builddir.  And I think you need to enable all languages when configuring
GCC for that to work.  Let me know if you're having trouble with this.

>  Wmissing-format-attribute
>  C ObjC C++ ObjC++ Warning Alias(Wsuggest-attribute=format)
>  ;
> diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
> index 094e41fa202..72b544e8f67 100644
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -9795,7 +9795,7 @@ pop_init_level (location_t loc, int implicit,
>  }
>  
>/* Warn when some struct elements are implicitly initialized to zero.  */
> -  if (warn_missing_field_initializers
> +  if ((warn_missing_field_initializers || 
> warn_missing_designated_initializers)
>&& constructor_type
>&& TREE_CODE (constructor_type) == RECORD_TYPE
>&& constructor_unfilled_fields)
> @@ -9806,21 +9806,29 @@ pop_init_level (location_t loc, int implicit,
>  || integer_zerop (DECL_SIZE (constructor_unfilled_fields
> constructor_unfilled_fields = DECL_CHAIN 
> (constructor_unfilled_fields);
>  
> - if (constructor_unfilled_fields
> - /* Do not warn if this level of the initializer uses member
> -

[PATCH v2 3/4] RISC-V: Add even/odd vec_perm_const pattern.

2024-11-22 Thread Robin Dapp

From: Robin Dapp 

This adds handling for even/odd patterns.
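
As an illustration of the approach (two compress operations followed by
one slideup), here is a scalar model in plain C.  It is a sketch only:
the helper names and MAX_VLEN are invented for the example, and the
compress model ignores tail policy because the tail is overwritten by
the slideup anyway.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_VLEN = 64 };

/* Scalar model of a compress: pack the elements of SRC whose mask bit
   is set to the front of DST and return how many were packed.  */
int
compress (int *dst, const int *src, const bool *mask, int vlen)
{
  int k = 0;
  for (int i = 0; i < vlen; i++)
    if (mask[i])
      dst[k++] = src[i];
  return k;
}

/* Select the even (PARITY == 0) or odd (PARITY == 1) elements of the
   concatenation of OP0 and OP1 into DST.  */
void
even_odd (int *dst, const int *op0, const int *op1, int vlen, int parity)
{
  assert (vlen >= 4 && vlen <= MAX_VLEN && vlen % 2 == 0);
  bool mask[MAX_VLEN];
  int tmp[MAX_VLEN];

  /* Alternating mask: 1 0 1 0 ... for even, 0 1 0 1 ... for odd.  */
  for (int i = 0; i < vlen; i++)
    mask[i] = (i % 2) == parity;

  compress (dst, op0, mask, vlen);
  compress (tmp, op1, mask, vlen);

  /* Slide the second compressed half up by VLEN / 2 into DST.  */
  for (int i = vlen / 2; i < vlen; i++)
    dst[i] = tmp[i - vlen / 2];
}
```

For op0 = {0, 1, 2, 3} and op1 = {4, 5, 6, 7} the even variant produces
{0, 2, 4, 6} and the odd variant {1, 3, 5, 7}.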

gcc/ChangeLog:

* config/riscv/riscv-v.cc (shuffle_even_odd_patterns): New
function.
(expand_vec_perm_const_1): Use new function.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd-run.c: New 
test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd.c: New test.
---
 gcc/config/riscv/riscv-v.cc   |  66 ++
 .../autovec/vls-vlmax/shuffle-evenodd-run.c   | 122 ++
 .../rvv/autovec/vls-vlmax/shuffle-evenodd.c   |  68 ++
 3 files changed, 256 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd-run.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 3f8fd3257c4..ded8d52d9f9 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3570,6 +3570,70 @@ shuffle_interleave_patterns (struct expand_vec_perm_d *d)
   return true;
 }
 
+
+/* Recognize even/odd patterns like [0 2 4 6].  We use two compress
+   and one slideup.  */
+
+static bool
+shuffle_even_odd_patterns (struct expand_vec_perm_d *d)
+{
+  machine_mode vmode = d->vmode;
+  poly_int64 vec_len = d->perm.length ();
+  int n_patterns = d->perm.encoding ().npatterns ();
+
+  if (n_patterns != 1)
+return false;
+
+  if (!vec_len.is_constant ())
+return false;
+
+  int vlen = vec_len.to_constant ();
+  if (vlen < 4 || vlen > 64)
+return false;
+
+  if (d->one_vector_p)
+return false;
+
+  bool even = true;
+  if (!d->perm.series_p (0, 1, 0, 2))
+{
+  even = false;
+  if (!d->perm.series_p (0, 1, 1, 2))
+   return false;
+}
+
+  /* Success!  */
+  if (d->testing_p)
+return true;
+
+  machine_mode mask_mode = get_mask_mode (vmode);
+  rvv_builder builder (mask_mode, vlen, 1);
+  int bit = even ? 0 : 1;
+  for (int i = 0; i < vlen; i++)
+{
+  bit ^= 1;
+  if (bit)
+   builder.quick_push (CONST1_RTX (BImode));
+  else
+   builder.quick_push (CONST0_RTX (BImode));
+}
+  rtx mask = force_reg (mask_mode, builder.build ());
+
+  insn_code icode = code_for_pred_compress (vmode);
+  rtx ops1[] = {d->target, d->op0, mask};
+  emit_vlmax_insn (icode, COMPRESS_OP, ops1);
+
+  rtx tmp2 = gen_reg_rtx (vmode);
+  rtx ops2[] = {tmp2, d->op1, mask};
+  emit_vlmax_insn (icode, COMPRESS_OP, ops2);
+
+  rtx ops[] = {d->target, d->target, tmp2, gen_int_mode (vlen / 2, Pmode)};
+  icode = code_for_pred_slide (UNSPEC_VSLIDEUP, vmode);
+  emit_vlmax_insn (icode, SLIDEUP_OP_MERGE, ops);
+
+  return true;
+}
+
 /* Recognize decompress patterns:
 
1. VEC_PERM_EXPR op0 and op1
@@ -3888,6 +3952,8 @@ expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
return true;
  if (shuffle_interleave_patterns (d))
return true;
+ if (shuffle_even_odd_patterns (d))
+   return true;
  if (shuffle_compress_patterns (d))
return true;
  if (shuffle_decompress_patterns (d))
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd-run.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd-run.c
new file mode 100644
index 000..c0760e5ed30
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd-run.c
@@ -0,0 +1,122 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target riscv_v_ok } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-O3 -mrvv-max-lmul=m8 -std=gnu99" } */
+
+#include "shuffle-evenodd.c"
+
+#define SERIES_2(x, y) (x), (x + 1)
+#define SERIES_4(x, y) SERIES_2 (x, y), SERIES_2 (x + 2, y)
+#define SERIES_8(x, y) SERIES_4 (x, y), SERIES_4 (x + 4, y)
+#define SERIES_16(x, y) SERIES_8 (x, y), SERIES_8 (x + 8, y)
+#define SERIES_32(x, y) SERIES_16 (x, y), SERIES_16 (x + 16, y)
+#define SERIES_64(x, y) SERIES_32 (x, y), SERIES_32 (x + 32, y)
+
+#define comp(a, b, n) \
+  for (unsigned i = 0; i < n; ++i) \
+    if ((a)[i] != (b)[i]) \
+      __builtin_abort ();
+
+#define CHECK1(TYPE, NUNITS) \
+  __attribute__ ((noipa)) void check1_##TYPE () \
+  { \
+    TYPE v0 = (TYPE){SERIES_##NUNITS (0, NUNITS)}; \
+    TYPE v1 = (TYPE){SERIES_##NUNITS (NUNITS, NUNITS)}; \
+    TYPE ref = (TYPE){MASKE_##NUNITS (0, NUNITS)}; \
+    TYPE res; \
+    permute1_##TYPE (v0, v1, &res); \
+comp (r

[PATCH v2 1/4] RISC-V: Add slide to perm_const strategies.

2024-11-22 Thread Robin Dapp

From: Robin Dapp 

This patch adds a shuffle_slide_patterns to expand_vec_perm_const.
It recognizes permutations like

  {0, 1, 4, 5}
or
  {2, 3, 6, 7}

which can be constructed by a slideup or slidedown of one of the vectors
into the other one.
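
The recognition step can be sketched in plain C roughly as follows.
This is a simplified scalar model of the checks, assuming a constant
vector length and plain int indices; classify_slide and enum slide_kind
are hypothetical names, and the guard against a pivot of 0 is an
addition for the sketch so it never reads before the array.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type; not part of the patch.  */
enum slide_kind { SLIDE_NONE, SLIDE_UP, SLIDE_DOWN };

/* Classify PERM, which holds VLEN indices into the concatenation of
   OP0 and OP1 (values 0 .. 2 * VLEN - 1).  On success store the slide
   amount in *CNT.  */
enum slide_kind
classify_slide (const int *perm, int vlen, int *cnt)
{
  assert (perm && cnt);
  if (vlen < 4)
    return SLIDE_NONE;

  /* Slideup keeps the start of OP0, slidedown keeps the end of OP1.  */
  bool up = perm[0] == 0;
  bool down = perm[vlen - 1] == 2 * vlen - 1;
  if (up == down)
    return SLIDE_NONE;

  /* Indices must be consecutive except for one jump at the pivot,
     the first element taken from OP1.  */
  int pivot = -1;
  for (int i = 0; i < vlen; i++)
    {
      if (pivot == -1 && perm[i] >= vlen)
        pivot = i;
      if (i > 0 && i != pivot && perm[i] != perm[i - 1] + 1)
        return SLIDE_NONE;
    }
  if (pivot <= 0)
    return SLIDE_NONE;

  /* Slideup: OP1's part must start at its first element.
     Slidedown: OP0's part must end at its last element.  */
  if (up && perm[pivot] != vlen)
    return SLIDE_NONE;
  if (down && perm[pivot - 1] != vlen - 1)
    return SLIDE_NONE;

  *cnt = up ? pivot : vlen - pivot;
  return up ? SLIDE_UP : SLIDE_DOWN;
}
```

With four-element vectors, {0, 1, 4, 5} classifies as a slideup by 2 and
{2, 3, 6, 7} as a slidedown by 2, while {0, 2, 4, 6} is rejected.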

gcc/ChangeLog:

* config/riscv/riscv-v.cc (shuffle_slide_patterns): New.
(expand_vec_perm_const_1): Call new function.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide-run1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide1.c: New test.
---
 gcc/config/riscv/riscv-v.cc   |  99 +
 .../autovec/vls-vlmax/shuffle-slide-run1.c|  81 +++
 .../rvv/autovec/vls-vlmax/shuffle-slide1.c| 137 ++
 3 files changed, 317 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide-run1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide1.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index ee7a0128c0e..deb2bdb4247 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3395,6 +3395,103 @@ shuffle_compress_patterns (struct expand_vec_perm_d *d)
   return true;
 }
 
+/* Recognize patterns like [4 5 6 7 12 13 14 15] where either the lower
+   or the higher parts of both vectors are combined into one.  */
+
+static bool
+shuffle_slide_patterns (struct expand_vec_perm_d *d)
+{
+  machine_mode vmode = d->vmode;
+  poly_int64 vec_len = d->perm.length ();
+
+  if (!vec_len.is_constant ())
+return false;
+
+  int vlen = vec_len.to_constant ();
+  if (vlen < 4)
+return false;
+
+  if (d->one_vector_p)
+return false;
+
+  /* For a slideup OP0 can stay, for a slidedown OP1 can.
+ The former requires that the first element of the permutation
+ is the first element of OP0, the latter that the last permutation
+ element is the last element of OP1.  */
+  bool slideup = false;
+  bool slidedown = false;
+
+  /* For a slideup the permutation must start at OP0's first element.  */
+  if (known_eq (d->perm[0], 0))
+slideup = true;
+
+  /* For a slidedown the permutation must end at OP1's last element.  */
+  if (known_eq (d->perm[vlen - 1], 2 * vlen - 1))
+slidedown = true;
+
+  if (slideup && slidedown)
+return false;
+
+  if (!slideup && !slidedown)
+return false;
+
+  /* Check for a monotonic sequence with one pivot.  */
+  int pivot = -1;
+  for (int i = 0; i < vlen; i++)
+{
+  if (pivot == -1 && known_ge (d->perm[i], vec_len))
+   pivot = i;
+  if (i > 0 && i != pivot
+ && maybe_ne (d->perm[i], d->perm[i - 1] + 1))
+   return false;
+}
+
+  if (pivot == -1)
+return false;
+
+  /* For a slideup OP1's part (to be slid up) must be a low part,
+ i.e. starting with its first element.  */
+  if (slideup && maybe_ne (d->perm[pivot], vlen))
+  return false;
+
+  /* For a slidedown OP0's part (to be slid down) must be a high part,
+ i.e. ending with its last element.  */
+  if (slidedown && maybe_ne (d->perm[pivot - 1], vlen - 1))
+return false;
+
+  /* Success!  */
+  if (d->testing_p)
+return true;
+
+  /* PIVOT is the start of the lower/higher part of OP1 or OP2.
+ For a slideup it indicates how many elements of OP1 to
+ skip/slide over.  For a slidedown it indicates how long
+ OP1's high part is, while VLEN - PIVOT is the amount to slide.  */
+  int slide_cnt = slideup ? pivot : vlen - pivot;
+  insn_code icode;
+  if (slideup)
+{
+  /* No need for a vector length because we slide up until the
+end of OP1 anyway.  */
+  rtx ops[] = {d->target, d->op0, d->op1, gen_int_mode (slide_cnt, Pmode)};
+  icode = code_for_pred_slide (UNSPEC_VSLIDEUP, vmode);
+  emit_vlmax_insn (icode, SLIDEUP_OP_MERGE, ops);
+}
+  else
+{
+  /* Here we need a length because we slide to the beginning of OP1
+leaving the remaining elements undisturbed.  */
+  int len = pivot;
+  rtx ops[] = {d->target, d->op1, d->op0,
+  gen_int_mode (slide_cnt, Pmode)};
+  icode = code_for_pred_slide (UNSPEC_VSLIDEDOWN, vmode);
+  emit_nonvlmax_insn (icode, BINARY_OP_TUMA, ops,
+ gen_int_mode (len, Pmode));
+}
+
+  return true;
+}
+
 /* Recognize decompress patterns:
 
1. VEC_PERM_EXPR op0 and op1
@@ -3709,6 +3806,8 @@ expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
return true;
  if (shuffle_consecutive_patterns (d))
return true;
+ if (shuffle_slide_patterns (d))
+   return true;
  if (shuffle_compress_patterns (d))
return true;
  if (shuffle_decompress_patterns (d))
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide-run1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide-run1.c
new file mode 100644
index 000..17e68caad21

[PATCH v2 4/4] RISC-V: Improve slide1up pattern.

2024-11-22 Thread Robin Dapp

From: Robin Dapp 

This patch adds a second variant to implement the extract/slide1up
pattern.  In order to do a permutation like
<3, 4, 5, 6> from vectors <0, 1, 2, 3> and <4, 5, 6, 7>
we currently extract <3> from the first vector and re-insert it into the
second vector.  Unless register-file crossing latency is essentially
zero it should be preferable to first slide the second vector up by
one, then slide down the first vector by (nunits - 1).
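
The two-slide variant can be modelled on plain arrays like this.  It is
a scalar sketch under the assumption that the slidedown runs with a
vector length of 1 and leaves the remaining elements undisturbed; the
function name is made up for the example.

```c
#include <assert.h>

/* Build <op0[n-1], op1[0], ..., op1[n-2]> in DST using two slides.  */
void
off_by_one_by_slides (int *dst, const int *op0, const int *op1, int n)
{
  assert (n >= 2);

  /* Step 1: slide OP1 up by one.  Element 0 is left unspecified here
     because step 2 overwrites it.  */
  for (int i = n - 1; i >= 1; i--)
    dst[i] = op1[i - 1];

  /* Step 2: slide OP0 down by n - 1 with a length of one element, so
     only element 0 is written and the rest stays undisturbed.  */
  dst[0] = op0[n - 1];
}
```

For the vectors <0, 1, 2, 3> and <4, 5, 6, 7> this yields <3, 4, 5, 6>
without moving any element through a scalar register.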

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_register_move_cost):
Export.
* config/riscv/riscv-v.cc (shuffle_extract_and_slide1up_patterns):
Rename...
(shuffle_off_by_one_patterns): ... to this and add slideup/slidedown
variant.
(expand_vec_perm_const_1): Call renamed function.
* config/riscv/riscv.cc (riscv_secondary_memory_needed): Remove
static.
(riscv_register_move_cost): Add VR<->GR/FR handling.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr112599-2.c: Adjust test
expectation.
---
 gcc/config/riscv/riscv-protos.h   |  1 +
 gcc/config/riscv/riscv-v.cc   | 52 ++-
 gcc/config/riscv/riscv.cc | 18 ++-
 .../gcc.target/riscv/rvv/autovec/pr112599-2.c |  2 +-
 4 files changed, 57 insertions(+), 16 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 500b357f6eb..ecb4e64cdf8 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -139,6 +139,7 @@ extern void riscv_expand_ussub (rtx, rtx, rtx);
 extern void riscv_expand_sssub (rtx, rtx, rtx);
 extern void riscv_expand_ustrunc (rtx, rtx);
 extern void riscv_expand_sstrunc (rtx, rtx);
+extern int riscv_register_move_cost (machine_mode, reg_class_t, reg_class_t);
 
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool 
*invert_ptr = 0);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index ded8d52d9f9..12251b7f90b 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3796,11 +3796,13 @@ shuffle_bswap_pattern (struct expand_vec_perm_d *d)
   return true;
 }
 
-/* Recognize the pattern that can be shuffled by vec_extract and slide1up
-   approach.  */
+/* Recognize patterns like [3 4 5 6] where we combine the last element
+   of the first vector and the first n - 1 elements of the second vector.
+   This can be implemented by slides or by extracting and re-inserting
+   (slide1up) the first vector's last element.  */
 
 static bool
-shuffle_extract_and_slide1up_patterns (struct expand_vec_perm_d *d)
+shuffle_off_by_one_patterns (struct expand_vec_perm_d *d)
 {
   poly_int64 nunits = GET_MODE_NUNITS (d->vmode);
 
@@ -3818,17 +3820,39 @@ shuffle_extract_and_slide1up_patterns (struct expand_vec_perm_d *d)
   if (d->testing_p)
     return true;
 
-  /* Extract the last element of the first vector.  */
-  scalar_mode smode = GET_MODE_INNER (d->vmode);
-  rtx tmp = gen_reg_rtx (smode);
-  emit_vec_extract (tmp, d->op0, gen_int_mode (nunits - 1, Pmode));
+  int scalar_cost = riscv_register_move_cost (d->vmode, V_REGS, GR_REGS)
+    + riscv_register_move_cost (d->vmode, GR_REGS, V_REGS) + 2;
+  int slide_cost = 2;
+
+  if (slide_cost < scalar_cost)
+    {
+      /* This variant should always be preferable because we just need two
+         slides.  The extract-variant also requires two slides but additionally
+         pays the latency for register-file crossing.  */
+      rtx tmp = gen_reg_rtx (d->vmode);
+      rtx ops[] = {tmp, d->op1, gen_int_mode (1, Pmode)};
+      insn_code icode = code_for_pred_slide (UNSPEC_VSLIDEUP, d->vmode);
+      emit_vlmax_insn (icode, BINARY_OP, ops);
+
+      rtx ops2[] = {d->target, tmp, d->op0, gen_int_mode (nunits - 1, Pmode)};
+      icode = code_for_pred_slide (UNSPEC_VSLIDEDOWN, d->vmode);
+      emit_nonvlmax_insn (icode, BINARY_OP_TUMA, ops2, gen_int_mode (1, Pmode));
+    }
+  else
+    {
+      /* Extract the last element of the first vector.  */
+      scalar_mode smode = GET_MODE_INNER (d->vmode);
+      rtx tmp = gen_reg_rtx (smode);
+      emit_vec_extract (tmp, d->op0, gen_int_mode (nunits - 1, Pmode));
+
+      /* Insert the scalar into element 0.  */
+      unsigned int unspec
+        = FLOAT_MODE_P (d->vmode) ? UNSPEC_VFSLIDE1UP : UNSPEC_VSLIDE1UP;
+      insn_code icode = code_for_pred_slide (unspec, d->vmode);
+      rtx ops[] = {d->target, d->op1, tmp};
+      emit_vlmax_insn (icode, BINARY_OP, ops);
+    }
 
-  /* Insert the scalar into element 0.  */
-  unsigned int unspec
-    = FLOAT_MODE_P (d->vmode) ? UNSPEC_VFSLIDE1UP : UNSPEC_VSLIDE1UP;
-  insn_code icode = code_for_pred_slide (unspec, d->vmode);
-  rtx ops[] = {d->target, d->op1, tmp};
-  emit_vlmax_insn (icode, BINARY_OP, ops);
   return true;
 }
 
@@ -3960,7 +3984,7 @@ expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
return true;
  if (shuffle_bswap_patte

[PATCH] ifcombine: skip fallback conjunction on noncontiguous blocks

2024-11-22 Thread Alexandre Oliva



When everything else fails, if enabled by the target or by a
parameter, and when other requirements are satisfied, ifcombine
generates an AND of both conditions.

That may be good for contiguous conditions, but it's unlikely to be an
optimization when the blocks are separate.

Add contiguity to the set of requirements for this fallback
transformation.
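
For readers unfamiliar with the fallback, the following is a minimal C++
sketch of what "an AND of both conditions" means at the source level.  The
function names are hypothetical and the sketch does not model the CFG or the
pass itself; it only shows that, for side-effect-free conditions, the nested
form and the combined form compute the same result.

```cpp
#include <cassert>

// Nested form: the inner test only runs when the outer one succeeds.
int nested (bool a, bool b)
{
  if (a)
    if (b)
      return 1;
  return 0;
}

// Fallback form: both conditions are always evaluated and ANDed together,
// leaving a single conditional branch.
int combined (bool a, bool b)
{
  if (a & b)
    return 1;
  return 0;
}
```

The combined form trades a branch for the unconditional evaluation of the
second condition, which is the trade-off the contiguity requirement above is
meant to limit.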

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* tree-ssa-ifcombine.cc (ifcombine_ifandif): Avoid fallback
conjunction of noncontiguous conditions.
---
 gcc/tree-ssa-ifcombine.cc |4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 9b9dc10cd2202..51f37f15a9efc 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -974,6 +974,10 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool 
inner_inv,
gimple_cond_rhs (outer_cond),
gimple_bb (outer_cond
{
+ /* Only combine conditions in this fallback case if the blocks are
+neighbors.  */
+ if (single_pred (inner_cond_bb) != outer_cond_bb)
+   return false;
  tree t1, t2;
  bool logical_op_non_short_circuit = LOGICAL_OP_NON_SHORT_CIRCUIT;
  if (param_logical_op_non_short_circuit != -1)


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive

[committed] c: Fix typeof_unqual handling of qualified array types [PR112841]

2024-11-22 Thread Joseph Myers

As reported in bug 112841, typeof_unqual fails to remove qualifiers
from qualified array types.  In C23 (unlike in previous standard
versions), array types are considered to have the qualifiers of the
element type, so typeof_unqual should remove such qualifiers (and an
example in the standard shows that is as intended).  Fix this by
calling strip_array_types when checking for the presence of
qualifiers.  (The reason we check for qualifiers rather than just
using TYPE_MAIN_VARIANT unconditionally is to avoid, as a quality of
implementation matter, unnecessarily losing typedef information in the
case where the type is already unqualified.)

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

PR c/112841

gcc/c/
* c-parser.cc (c_parser_typeof_specifier): Call strip_array_types
when checking for type qualifiers for typeof_unqual.

gcc/testsuite/
* gcc.dg/c23-typeof-4.c: New test.

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index f716c567e819..d60e12f6b301 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -4442,7 +4442,8 @@ c_parser_typeof_specifier (c_parser *parser)
   parens.skip_until_found_close (parser);
   if (ret.spec != error_mark_node)
 {
-  if (is_unqual && TYPE_QUALS (ret.spec) != TYPE_UNQUALIFIED)
+  if (is_unqual
+ && TYPE_QUALS (strip_array_types (ret.spec)) != TYPE_UNQUALIFIED)
ret.spec = TYPE_MAIN_VARIANT (ret.spec);
   if (is_std)
{
diff --git a/gcc/testsuite/gcc.dg/c23-typeof-4.c b/gcc/testsuite/gcc.dg/c23-typeof-4.c
new file mode 100644
index ..471d08293414
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-typeof-4.c
@@ -0,0 +1,10 @@
+/* Test C23 typeof and typeof_unqual on qualified arrays (bug 112841).  */
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -pedantic-errors" } */
+
+const int a[] = { 1, 2, 3 };
+int b[3];
+extern typeof (a) a;
+extern typeof (const int [3]) a;
+extern typeof_unqual (a) b;
+extern typeof_unqual (const int [3]) b;

-- 
Joseph S. Myers
josmy...@redhat.com

Ping: Re: [PATCH 0/3] Nested diagnostics for g++ [PR116253]

2024-11-22 Thread David Malcolm

I'd like to ping this patch kit.

I've already pushed "[PATCH 1/3] diagnostics: add support for nested
diagnostics [PR116253]" but was hoping for review from C++ maintainers
for patches 2 and 3:

[PATCH 2/3] c++: consolidate location printing in error.cc [PR116253]
  https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668469.html

[PATCH 3/3] c++: use diagnostic nesting [PR116253]
  https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668470.html

Screenshot:
  https://gcc.gnu.org/bugzilla/attachment.cgi?id=59580

There are also some followups I posted last week that IMHO further
improve the readability of template errors:
 
[PATCH 1/2] c++: print z candidate count and number them
  https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669035.html

[PATCH 2/2] diagnostics: suppress "note: " prefix in nested diagnostics 
[PR116253]
  https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669034.html

Screenshot with those followups:
  https://gcc.gnu.org/bugzilla/attachment.cgi?id=59611

Thanks!
Dave


On Tue, 2024-11-12 at 09:02 -0500, David Malcolm wrote:
> The following patch kit adds experimental support for nested
> diagnostics to g++.
> 
> As noted in P3358R0 ("SARIF for Structured Diagnostics"), C++
> diagnostics involving concepts can be made much more readable
> by capturing the hierarchical structure of the diagnostics
> and displaying this structure to the user (perhaps via
> a UI that allows the user to drill down into the aspects
> of the issue they are interested in).
> 
> Implementing this for GCC requires several things:
> 
> (a) extending the diagnostics subsystem so that it can
>     represent hierarchically structured diagnostics internally
> 
> (b) extending the output formats (text and SARIF) to
>     present/export such hierarchical structures
> 
> (c) extending the C++ frontend to capture the hierarchical
>     structure
> 
> and all of these tasks are interrelated: for example,
> we can't fix (c) if we don't have (a) and (b), and providing a
> good user experience is likely to require iterating on all three
> aspects.
> 
> Patch 1 of this patch kit implements (a) and (b), adding a plugin
> to the test case to inject a placeholder hierarchical diagnostic,
> adding SARIF output support (based on P3358R0), and text output
> support.
> 
> The most natural way to present textual output is to use indentation,
> so patch 1 does this.  However, our existing textual format uses the
> source location as a prefix, e.g.
>   PATH/foo.cc: error: message goes here
> and unfortunately this makes the indented hierarchy unreadable
> (especially if a tree of diagnostics spans multiple source files).
> 
> So for this patch kit I've retained the existing textual presentation
> by default; viewing the structure via indentation requires that
> the user opt-in to a new text output style.  The patch implements
> this via a new "experimental-nesting" key in the text output scheme,
> so the option is:
>   -fdiagnostics-set-output=text:experimental-nesting=yes
> This is deliberately rather verbose to signal to the user that
> they're opting-in to the new approach.
> 
> An example of the hierarchical text output can be seen in patch 3.
> 
> Patch 2 of the kit does some consolidation to the C++ FE in how
> locations are printed, as a preliminary cleanup.
> 
> Patch 3 of the kit adds the hierarchical structuring to the C++ FE.
> 
> I've successfully bootstrapped & regrtested the kit on
> x86_64-pc-linux-gnu.
> 
> I think I can self-approve patch 1; are patches 2 and 3 OK for the
> C++ FE?
> 
> David Malcolm (3):
>   diagnostics: add support for nested diagnostics [PR116253]
>   c++: consolidate location printing in error.cc [PR116253]
>   c++: use diagnostic nesting [PR116253]
> 
>  gcc/c-family/c-opts.cc    |   2 +-
>  gcc/cp/call.cc    |  69 
>  gcc/cp/constraint.cc  |   5 +
>  gcc/cp/error.cc   | 164 +++-
> --
>  gcc/diagnostic-core.h |  12 ++
>  gcc/diagnostic-format-sarif.cc    |  16 +-
>  gcc/diagnostic-format-text.cc | 121 -
>  gcc/diagnostic-format-text.h  |  34 +++-
>  gcc/diagnostic-global-context.cc  |  12 ++
>  gcc/diagnostic-show-locus.cc  |   5 +
>  gcc/diagnostic.cc |  29 +++-
>  gcc/diagnostic.h  |  15 +-
>  gcc/doc/invoke.texi   |  18 ++
>  gcc/opts-diagnostic.cc    |  32 +++-
>  .../concepts/nested-diagnostics-1-truncated.C |  41 +
>  .../g++.dg/concepts/nested-diagnostics-1.C    |  51 ++
>  .../g++.dg/concepts/nested-diagnostics-2.C    |  37 
>  .../plugin/diagnostic-test-nesting-sarif.c    |  16 ++
>  .../plugin/diagnostic-test-nesting-sarif.py   |  39 +
>  ...c-test-nesting-text-indented-show-levels.c |  24 +++
>  ...ostic-test-nesting-text-indente

Re: [PATCH] C/C++: add fix-it hints for missing '&' and '*' (v5) [PR87850]

2024-11-22 Thread Joseph Myers

On Fri, 22 Nov 2024, David Malcolm wrote:

> Revisiting this patch from 2018 that didn't quite make it;
> earlier versions were:
>   v1: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00802.html
>   v2: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-11/msg01408.html
>   v3: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-11/msg01658.html
>   v4: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-11/msg02617.html
> 
> I believe the remaining point of discussion was about enum vs int:
>   https://gcc.gnu.org/legacy-ml/gcc-patches/2018-12/msg00293.html
> and that none of us had strong opinions on the matter.  I've added
> some test coverage for that, and rebased it (e.g. for the .c to .cc
> renaming of our sources).
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> OK for trunk?

The C front-end changes in this version are OK.

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH] build: Remove INCLUDE_MEMORY [PR117737]

2024-11-22 Thread Jeff Law





On 11/22/24 3:09 PM, David Malcolm wrote:

On Fri, 2024-11-22 at 13:15 -0800, Andrew Pinski wrote:

Since diagnostic.h is included in over half of the sources, requiring
each of them to `#define INCLUDE_MEMORY`
does not make sense. Instead, let's unconditionally include <memory> in
system.h.

The majority of this patch is just removing `#define INCLUDE_MEMORY`
from the sources which currently
have it.
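
As a rough illustration of the pattern being removed, here is a minimal C++
sketch.  The guarded include is a simplified stand-in for what system.h did
before the patch, and `example_pass` and `make_example_pass` are hypothetical
names that do not exist in GCC.

```cpp
// Before the patch, each translation unit that needed std::unique_ptr had
// to opt in like this before including system.h.
#define INCLUDE_MEMORY

// Simplified stand-in for the guarded include in system.h.
#ifdef INCLUDE_MEMORY
# include <memory>
#endif

#include <cassert>

// Hypothetical type, for illustration only.
struct example_pass
{
  int id;
};

// Hypothetical factory showing the kind of code that needs <memory>.
std::unique_ptr<example_pass> make_example_pass (int id)
{
  assert (id >= 0);  // hypothetical invariant, for illustration only
  return std::unique_ptr<example_pass> (new example_pass{id});
}
```

With the include made unconditional, the `#define` line at the top becomes
unnecessary in every file that had it.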


Sorry about the unpleasantness.

FWIW I did consider simply including <memory> unconditionally for
r15-4610 ("Use unique_ptr in more places in pretty_printer/diagnostics
[PR116613]")
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665443.html
but the verbose approach seemed to me like something I could
self-approve; the simple approach didn't.

As I said there, I'd like to use std::unique_ptr in more places, such
as when creating passes, so I think the number of places we'd need
INCLUDE_MEMORY is likely to eventually be most of the TUs in the
compiler.

So I'm in favor of Andrew's patch, FWIW

Likewise.  ACK'd for the trunk.

jeff

Re: [PATCH] ranger, v2: Handle nonnull_if_nonzero attribute [PR117023]

2024-11-22 Thread Andrew MacLeod


FYI,

I will shortly be submitting, and presumably committing, this patch
as part of a series to improve VRP time for 117467.


So it may be in place by the time you need it

Andrew

On 11/18/24 09:31, Andrew MacLeod wrote:
Attached is a pre-approved patch which adds a range_query to the 
inferred range mechanism.


The only change you will need to make is to replace "get_range_query 
(cfun)->"  with "q->" which is passed in.


This regstraps on x86 without your patch, and I got as far as a
bootstrap with your patches.


Andrew

On 11/15/24 04:36, Jakub Jelinek wrote:

On Thu, Nov 14, 2024 at 06:25:49PM +0100, Jakub Jelinek wrote:

On Thu, Nov 14, 2024 at 10:05:05AM -0500, Andrew MacLeod wrote:

The inferred range mechanism is also initialized using cfun, so again
introducing a use of cfun shouldn't be an issue.

Something like this ought to work I think?

2024-11-14  Jakub Jelinek  
    Andrew MacLeod  

PR c/117023
* gimple-range-infer.cc (gimple_infer_range::gimple_infer_range):
Handle also nonnull_if_nonzero attributes.

* gcc.dg/tree-ssa/pr78154-2.c: New test.

Unfortunately that broke bootstrap.
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668554.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668699.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668700.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668711.html
bootstrap/regtest fine, but if
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668700.html
in there is replaced with
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668818.html
I get ICE on e.g. opts-common.cc.

Reduced testcase:

void foo (int *, __SIZE_TYPE__);
char *a, b[64], c;

void
bar (void)
{
  for (int i = 0; i < 42; ++i)
    {
      char *d = &a[i];
      int e;
      if (c)
        foo (&e, 1);
      __SIZE_TYPE__ f = __builtin_strlen (d);
      if (f)
        __builtin_memcpy (b, d, f);
    }
}

/home/jakub/src/gcc/obj72x/prev-gcc/cc1 -quiet -O2 fbc-ice.c
during GIMPLE pass: wrestrict
fbc-ice.c: In function ‘bar’:
fbc-ice.c:5:1: internal compiler error: in fill_block_cache, at gimple-range-cache.cc:1565
    5 | bar (void)
      | ^~~
0x44ebf97 internal_error(char const*, ...)
        ../../gcc/diagnostic-global-context.cc:518
0x44bc0ef fancy_abort(char const*, int, char const*)
        ../../gcc/diagnostic.cc:1696
0x41d186f ranger_cache::fill_block_cache(tree_node*, basic_block_def*, basic_block_def*)
        ../../gcc/gimple-range-cache.cc:1565
0x41d07be ranger_cache::block_range(vrange&, basic_block_def*, tree_node*, bool)
        ../../gcc/gimple-range-cache.cc:1304
0x41c95f4 gimple_ranger::range_on_entry(vrange&, basic_block_def*, tree_node*)
        ../../gcc/gimple-range.cc:175
0x41c93f3 gimple_ranger::range_of_expr(vrange&, tree_node*, gimple*)
        ../../gcc/gimple-range.cc:147
0x41e2c8c gimple_infer_range::gimple_infer_range(gimple*, bool)
        ../../gcc/gimple-range-infer.cc:205
0x41e3f52 infer_range_manager::register_all_uses(tree_node*)
        ../../gcc/gimple-range-infer.cc:476
0x41e35ec infer_range_manager::has_range_p(basic_block_def*, tree_node*)
        ../../gcc/gimple-range-infer.cc:356
0x41d1a64 ranger_cache::fill_block_cache(tree_node*, basic_block_def*, basic_block_def*)
        ../../gcc/gimple-range-cache.cc:1604
0x41d07be ranger_cache::block_range(vrange&, basic_block_def*, tree_node*, bool)
        ../../gcc/gimple-range-cache.cc:1304
0x41c95f4 gimple_ranger::range_on_entry(vrange&, basic_block_def*, tree_node*)
        ../../gcc/gimple-range.cc:175
0x41c93f3 gimple_ranger::range_of_expr(vrange&, tree_node*, gimple*)
        ../../gcc/gimple-range.cc:147
0x41d5d5b fur_stmt::get_operand(vrange&, tree_node*)
        ../../gcc/gimple-range-fold.cc:143
0x41d7cb5 fold_using_range::range_of_range_op(vrange&, gimple_range_op_handler&, fur_source&)
        ../../gcc/gimple-range-fold.cc:718
0x41d770e fold_using_range::fold_stmt(vrange&, gimple*, fur_source&, tree_node*)
        ../../gcc/gimple-range-fold.cc:649
0x41c9d03 gimple_ranger::fold_range_internal(vrange&, gimple*, tree_node*)
        ../../gcc/gimple-range.cc:278
0x41ca03c gimple_ranger::range_of_stmt(vrange&, gimple*, tree_node*)
        ../../gcc/gimple-range.cc:339
0x41c95b5 gimple_ranger::range_on_entry(vrange&, basic_block_def*, tree_node*)
        ../../gcc/gimple-range.cc:172
0x41c93f3 gimple_ranger::range_of_expr(vrange&, tree_node*, gimple*)
        ../../gcc/gimple-range.cc:147
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

Jakub

Patch ping - [PATCH] [APX EGPR] Fix indirect call prefix

2024-11-22 Thread Gregory Kanter

Hello,
I would like to ping the patch
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668105.html
please.

Also CC'ing someone who is working on APX,
sorry if this is frowned upon.

Thanks.

Re: [PATCH] Sync top-level configure with binutils

2024-11-22 Thread Jeff Law





On 11/22/24 12:16 PM, Sam James wrote:

This syncs us with binutils/gdb's toplevel configure as of
987db70acefd0b223a8df2240d4e5ca544cc0a91.

There's not much notable here, just gprofng (which is in binutils) being
disabled for musl and a new target which got added on that side too.

The only part which may look interesting is the baseargs->bbaseargs
change which goes back to Arsen's gettext work and a fixup which
landed for that on the binutils side in
9c0aa4c53104b1c4333d55aeaf11b41053307929.
---
OK?

OK
jeff

[PATCH] Remove the rest of INCLUDE_MEMORY

2024-11-22 Thread Andrew Pinski

I missed these in r15-5603-gb3f1b9e2aa079f8ec73e because they
were not in a `.cc` or `.h` file or they were outside of the gcc
subdirectory.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/m2/ChangeLog:

* mc/keyc.mod: Don't print `#define INCLUDE_MEMORY`.

gcc/ChangeLog:

* optc-gen.awk: Don't print `#define INCLUDE_MEMORY`.
* optc-save-gen.awk: Likewise.
* options-urls-cc-gen.awk: Likewise.

libcc1/ChangeLog:

* libcc1plugin.cc (INCLUDE_MEMORY): Remove.
* libcp1plugin.cc (INCLUDE_MEMORY): Remove.

libgcc/ChangeLog:

* libgcov-util.c (INCLUDE_MEMORY): Remove.

gcc/testsuite/ChangeLog:

* gcc.dg/plugin/analyzer_cpython_plugin.c (INCLUDE_MEMORY): Remove.
* gcc.dg/plugin/analyzer_gil_plugin.c (INCLUDE_MEMORY): Remove.
* gcc.dg/plugin/analyzer_kernel_plugin.c (INCLUDE_MEMORY): Remove.
* gcc.dg/plugin/analyzer_known_fns_plugin.c (INCLUDE_MEMORY): Remove.
* gcc.dg/plugin/diagnostic_plugin_xhtml_format.c (INCLUDE_MEMORY): 
Remove.
* gcc.dg/plugin/dump_plugin.c (INCLUDE_MEMORY): Remove.
* gcc.dg/plugin/ggcplug.c (INCLUDE_MEMORY): Remove.

Signed-off-by: Andrew Pinski 
---
 gcc/m2/mc/keyc.mod   | 1 -
 gcc/optc-gen.awk | 1 -
 gcc/optc-save-gen.awk| 1 -
 gcc/options-urls-cc-gen.awk  | 1 -
 gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c| 1 -
 gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c| 1 -
 gcc/testsuite/gcc.dg/plugin/analyzer_kernel_plugin.c | 1 -
 gcc/testsuite/gcc.dg/plugin/analyzer_known_fns_plugin.c  | 1 -
 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_xhtml_format.c | 1 -
 gcc/testsuite/gcc.dg/plugin/dump_plugin.c| 1 -
 gcc/testsuite/gcc.dg/plugin/ggcplug.c| 1 -
 libcc1/libcc1plugin.cc   | 1 -
 libcc1/libcp1plugin.cc   | 1 -
 libgcc/libgcov-util.c| 1 -
 14 files changed, 14 deletions(-)

diff --git a/gcc/m2/mc/keyc.mod b/gcc/m2/mc/keyc.mod
index a182e120692..f3f09309ad5 100644
--- a/gcc/m2/mc/keyc.mod
+++ b/gcc/m2/mc/keyc.mod
@@ -95,7 +95,6 @@ BEGIN
   IF NOT initializedGCC
   THEN
  initializedGCC := TRUE ;
- print (p, '#define INCLUDE_MEMORY\n');
  print (p, '#include "config.h"\n');
  print (p, '#include "system.h"\n');
  checkGccTypes (p)
diff --git a/gcc/optc-gen.awk b/gcc/optc-gen.awk
index 551cc5cf138..0eccb1951ef 100644
--- a/gcc/optc-gen.awk
+++ b/gcc/optc-gen.awk
@@ -158,7 +158,6 @@ for (i = 0; i < n_opts; i++) {
 
 print "/* This file is auto-generated by optc-gen.awk.  */"
 print ""
-print "#define INCLUDE_MEMORY"
 n_headers = split(header_name, headers, " ")
 for (i = 1; i <= n_headers; i++)
print "#include " quote headers[i] quote
diff --git a/gcc/optc-save-gen.awk b/gcc/optc-save-gen.awk
index 6fa10c5171f..b1289c281e7 100644
--- a/gcc/optc-save-gen.awk
+++ b/gcc/optc-save-gen.awk
@@ -31,7 +31,6 @@
 END {
 print "/* This file is auto-generated by optc-save-gen.awk.  */"
 print ""
-print "#define INCLUDE_MEMORY"
 n_headers = split(header_name, headers, " ")
 for (i = 1; i <= n_headers; i++)
print "#include " quote headers[i] quote
diff --git a/gcc/options-urls-cc-gen.awk b/gcc/options-urls-cc-gen.awk
index 33715a6c1d2..a2933233abe 100644
--- a/gcc/options-urls-cc-gen.awk
+++ b/gcc/options-urls-cc-gen.awk
@@ -29,7 +29,6 @@ END {
 
 print "/* This file is auto-generated by options-urls-cc-gen.awk.  */"
 print ""
-print "#define INCLUDE_MEMORY"
 n_headers = split(header_name, headers, " ")
 for (i = 1; i <= n_headers; i++)
print "#include " quote headers[i] quote
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
index 467af16c3d1..4bdd5f22ecd 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
@@ -1,7 +1,6 @@
 /* -fanalyzer plugin for CPython extension modules  */
 /* { dg-options "-g" } */
 
-#define INCLUDE_MEMORY
 #define INCLUDE_VECTOR
 #include "gcc-plugin.h"
 #include "config.h"
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
index 77767c88ad7..a4153607868 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
@@ -4,7 +4,6 @@
 */
 /* { dg-options "-g" } */
 
-#define INCLUDE_MEMORY
 #define INCLUDE_VECTOR
 #include "gcc-plugin.h"
 #include "config.h"
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_kernel_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_kernel_plugin.c
index 7f2158ec54d..b4abde87c1e 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_kernel_plugin.c
+++ b/gcc/testsuite/g

Re: [PATCH v2 0/4] Improve and add VLS slide strategies.

2024-11-22 Thread 钟居哲

This patch series LGTM.

juzhe.zh...@rivai.ai

From: Robin Dapp
Date: 2024-11-23 02:20
To: gcc-patches
CC: palmer; kito.cheng; juzhe.zhong; jeffreyalaw; pan2.li; rdapp.gcc
Subject: [PATCH v2 0/4] Improve and add VLS slide strategies.
From: Robin Dapp 

Changes from v1:
- Improve function naming and rephrase comment.

The series still causes execution failures due to the previously
mentioned bugs.  The avlprop one seems to have disappeared on my
machine but I'm not convinced.
Hopefully this time I'm using git send-email correctly and the CI
can pick it up.

Robin Dapp (4):
  RISC-V: Add slide to perm_const strategies.
  RISC-V: Add interleave pattern.
  RISC-V: Add even/odd vec_perm_const pattern.
  RISC-V: Improve slide1up pattern.

gcc/config/riscv/riscv-protos.h   |   1 +
gcc/config/riscv/riscv-v.cc   | 297 +-
gcc/config/riscv/riscv.cc |  18 +-
.../gcc.target/riscv/rvv/autovec/pr112599-2.c |   2 +-
.../autovec/vls-vlmax/shuffle-evenodd-run.c   | 122 +++
.../rvv/autovec/vls-vlmax/shuffle-evenodd.c   |  68 
.../vls-vlmax/shuffle-interleave-run.c| 122 +++
.../autovec/vls-vlmax/shuffle-interleave.c|  69 
.../autovec/vls-vlmax/shuffle-slide-run1.c|  81 +
.../rvv/autovec/vls-vlmax/shuffle-slide1.c| 137 
10 files changed, 901 insertions(+), 16 deletions(-)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd-run.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-evenodd.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave-run.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-interleave.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide-run1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide1.c

-- 
2.47.0

[PATCH] libsanitizer: Remove -pedantic from AM_CXXFLAGS [PR117732]

2024-11-22 Thread Jakub Jelinek

Hi!

We aren't the master repository for the sanitizers and clearly upstream
introduces various extensions in the code.
All we care about is whether it builds and works fine with GCC, so
-pedantic flag is of no use to us, only maybe to upstream if they
cared about it (which they clearly don't).

The following patch removes those and fixes some whitespace nits at the same
time.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-11-23  Jakub Jelinek  

PR sanitizer/117732
* asan/Makefile.am (AM_CXXFLAGS): Remove -pedantic.  Formatting fix.
(asan_files): Formatting fix.
* hwasan/Makefile.am (AM_CXXFLAGS): Remove -pedantic.  Formatting fix.
* interception/Makefile.am (AM_CXXFLAGS): Likewise.
(interception_files): Formatting fix.
* libbacktrace/Makefile.am: Update copyright years.
* lsan/Makefile.am (AM_CXXFLAGS): Remove -pedantic.  Formatting fix.
* sanitizer_common/Makefile.am (AM_CXXFLAGS): Likewise.
(libsanitizer_common_la_DEPENDENCIES): Formatting fix.
* tsan/Makefile.am (AM_CXXFLAGS): Remove -pedantic.  Formatting fix.
* ubsan/Makefile.am (AM_CXXFLAGS): Likewise.
* asan/Makefile.in: Regenerate.
* hwasan/Makefile.in: Regenerate.
* interception/Makefile.in: Regenerate.
* libbacktrace/Makefile.in: Regenerate.
* lsan/Makefile.in: Regenerate.
* sanitizer_common/Makefile.in: Regenerate.
* tsan/Makefile.in: Regenerate.
* ubsan/Makefile.in: Regenerate.

--- libsanitizer/asan/Makefile.am.jj2024-11-22 20:05:18.952495444 +0100
+++ libsanitizer/asan/Makefile.am   2024-11-22 20:05:30.322333770 +0100
@@ -7,7 +7,7 @@ DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_C
 if USING_MAC_INTERPOSE
 DEFS += -DMAC_INTERPOSE_FUNCTIONS -DMISSING_BLOCKS_SUPPORT
 endif
-AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic 
-Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fno-rtti 
-fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros 
-fno-ipa-icf
+AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -Wno-long-long 
-fPIC -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer 
-funwind-tables -fvisibility=hidden -Wno-variadic-macros -fno-ipa-icf
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 AM_CXXFLAGS += -std=gnu++17
 AM_CXXFLAGS += $(EXTRA_CXXFLAGS)
@@ -47,7 +47,7 @@ asan_files = \
asan_thread.cpp \
asan_win.cpp \
asan_win_dynamic_runtime_thunk.cpp \
-  asan_interceptors_vfork.S
+   asan_interceptors_vfork.S
 
 libasan_la_SOURCES = $(asan_files)
 libasan_la_LIBADD = $(top_builddir)/sanitizer_common/libsanitizer_common.la 
$(top_builddir)/lsan/libsanitizer_lsan.la
--- libsanitizer/hwasan/Makefile.am.jj  2024-11-22 20:05:18.952495444 +0100
+++ libsanitizer/hwasan/Makefile.am 2024-11-22 20:05:30.322333770 +0100
@@ -4,7 +4,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
-D__STDC_LIMIT_MACROS -DCAN_SANITIZE_UB=0 -DHWASAN_WITH_INTERCEPTORS=1
-AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic 
-Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fno-rtti -funwind-tables 
-fvisibility=hidden -Wno-variadic-macros -fno-ipa-icf
+AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -Wno-long-long 
-fPIC -fno-builtin -fno-exceptions -fno-rtti -funwind-tables 
-fvisibility=hidden -Wno-variadic-macros -fno-ipa-icf
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 AM_CXXFLAGS += -std=gnu++17
 AM_CXXFLAGS += $(EXTRA_CXXFLAGS)
--- libsanitizer/interception/Makefile.am.jj2024-11-22 20:05:18.952495444 
+0100
+++ libsanitizer/interception/Makefile.am   2024-11-22 20:05:30.322333770 
+0100
@@ -4,7 +4,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
-D__STDC_LIMIT_MACROS 
-AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic 
-Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fno-rtti 
-fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
+AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -Wno-long-long 
-fPIC -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer 
-funwind-tables -fvisibility=hidden -Wno-variadic-macros
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 AM_CXXFLAGS += -std=gnu++17
 AM_CXXFLAGS += $(EXTRA_CXXFLAGS)
@@ -14,9 +14,9 @@ ACLOCAL_AMFLAGS = -I m4
 noinst_LTLIBRARIES = libinterception.la
 
 interception_files = \
-interception_linux.cpp \
-interception_mac.cpp \
-interception_win.cpp \
+   interception_linux.cpp \
+   interception_mac.cpp \
+   interception_win.cpp \
interception_type_test.cpp
 
 libinterception_la_SOURCES = $(i

[libgfortran, patch] PR88052 Format contravening constraint C1002 permitted

2024-11-22 Thread Jerry D


Hi all,

I had originally created this patch in 2018 and we did not get back to 
it. This results in more restrictive runtime behavior. I will go through 
the front-end code with another patch to catch this at compile time.


Changelog and new test case. See attached patch.

OK for trunk?

Author: Jerry DeLisle 
Date:   Fri Nov 22 19:29:42 2024 -0800

Fortran: Reject missing comma in format.

Standards require rejecting formats where descriptors
are not separated by commas. This change allows the
missing comma to be accepted only with -std=legacy.

PR fortran/88052

libgfortran/ChangeLog:

* io/format.c (parse_format_list): Reject missing comma in
format strings by default or if -std=f95 or higher. This is
a runtime error.

gcc/testsuite/ChangeLog:

* gfortran.dg/comma_format_extension_4.f: Add missing comma.
* gfortran.dg/dollar_edit_descriptor_2.f: Likewise.
* gfortran.dg/fmt_error_9.f: Likewise.
* gfortran.dg/fmt_g0_5.f08: Likewise.
* gfortran.dg/fmt_t_2.f90: Likewise.
* gfortran.dg/pr88052.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/comma_format_extension_4.f b/gcc/testsuite/gfortran.dg/comma_format_extension_4.f
index 30f07e803c5..1d018380f9c 100644
--- a/gcc/testsuite/gfortran.dg/comma_format_extension_4.f
+++ b/gcc/testsuite/gfortran.dg/comma_format_extension_4.f
@@ -1,7 +1,7 @@
 ! PR fortran/13257
-! Note the missing , before i1 in the format.
+! Note the missing , after i4 in the format.
 ! { dg-do run }
-! { dg-options "" }
+! { dg-options "-std=legacy" }
   character*6 c
   write (c,1001) 1
   if (c .ne. '1 ') STOP 1
diff --git a/gcc/testsuite/gfortran.dg/dollar_edit_descriptor_2.f b/gcc/testsuite/gfortran.dg/dollar_edit_descriptor_2.f
index 437f4dfd811..de583f374dc 100644
--- a/gcc/testsuite/gfortran.dg/dollar_edit_descriptor_2.f
+++ b/gcc/testsuite/gfortran.dg/dollar_edit_descriptor_2.f
@@ -1,5 +1,5 @@
 ! { dg-do run }
-! { dg-options "-w" }
+! { dg-options "-w -std=legacy" }
 ! PR25545 internal file and dollar edit descriptor.
   program main
   character*20 line
diff --git a/gcc/testsuite/gfortran.dg/fmt_error_9.f b/gcc/testsuite/gfortran.dg/fmt_error_9.f
index 40c73599ac8..2755074054c 100644
--- a/gcc/testsuite/gfortran.dg/fmt_error_9.f
+++ b/gcc/testsuite/gfortran.dg/fmt_error_9.f
@@ -4,7 +4,7 @@
 ! Test case prepared by Jerry DeLisle 
   character(len=25) :: str
   character(len=132) :: msg, line
-  str = '(1pd24.15e6)'
+  str = '(1pd24.15,e6)'
   line = "initial string"
   x = 555.25
   
@@ -19,11 +19,11 @@
   if (istat.ne.5006 .or. msg(1:10).ne."Zero width") STOP 4
   if (x.ne.555.25) STOP 5
   
-  write (line,'(1pd24.15e11.3)') 1.0d0, 1.234
+  write (line,'(1pd24.15,e11.3)') 1.0d0, 1.234
   if (line.ne."   1.000D+00  1.234E+00") STOP 6
   
   str = '(1p2d24.15)'
   msg = "   1.000D+00   1.23367575073D+00That's it!"
-  write (line,'(1p2d24.15a)') 1.0d0, 1.234, "That's it!"
+  write (line,'(1p2d24.15,a)') 1.0d0, 1.234, "That's it!"
   if (line.ne.msg) print *, msg
   end
diff --git a/gcc/testsuite/gfortran.dg/fmt_g0_5.f08 b/gcc/testsuite/gfortran.dg/fmt_g0_5.f08
index d2a97b1ac80..cafd90b94b5 100644
--- a/gcc/testsuite/gfortran.dg/fmt_g0_5.f08
+++ b/gcc/testsuite/gfortran.dg/fmt_g0_5.f08
@@ -6,13 +6,13 @@ program test_g0_special
 
 call check_all("(g10.3)", "(f10.3)")
 call check_all("(g10.3e3)", "(f10.3)")
-call check_all("(spg10.3)", "(spf10.3)")
-call check_all("(spg10.3e3)", "(spf10.3)")
+call check_all("(sp,g10.3)", "(sp,f10.3)")
+call check_all("(sp,g10.3e3)", "(sp,f10.3)")
 !print *, "---"
 call check_all("(g0)", "(f0.0)")
 call check_all("(g0.15)", "(f0.0)")
-call check_all("(spg0)", "(spf0.0)")
-call check_all("(spg0.15)", "(spf0.0)")
+call check_all("(sp,g0)", "(sp,f0.0)")
+call check_all("(sp,g0.15)", "(sp,f0.0)")
 contains
 subroutine check_all(fmt1, fmt2)
 character(len=*), intent(in) :: fmt1, fmt2
diff --git a/gcc/testsuite/gfortran.dg/fmt_t_2.f90 b/gcc/testsuite/gfortran.dg/fmt_t_2.f90
index 01647655de6..56414c54bfb 100644
--- a/gcc/testsuite/gfortran.dg/fmt_t_2.f90
+++ b/gcc/testsuite/gfortran.dg/fmt_t_2.f90
@@ -12,7 +12,7 @@
   read (11, '(a040,t1,040a)', end = 999)  foost1 , foost2
   if (foost1.ne.foost2) STOP 1
 
-  read (11, '(a032,t2,a032t3,a032)', end = 999)  foost1 , foost2, foost3
+  read (11, '(a032,t2,a032,t3,a032)', end = 999)  foost1 , foost2, foost3
   if (foost1(1:32).ne."123456789 123456789 123456789   ") STOP 2
   if (foost2(1:32).ne."23456789 123456789 123456789") STOP 3
   if (foost3(1:32).ne."3456789 123456789 123456789 ") STOP 4
diff --git a/gcc/testsuite/gfortran.dg/pr88052.f90 b/gcc/testsuite/gfortra

Re: [PATCH] md-files: Add a note about escaped quotes in braced strings in md files

2024-11-22 Thread Jeff Law





On 10/31/24 5:13 PM, Andrew Pinski wrote:

While looking into PR 33532, it was noted that \" would still be
treated as " for braced strings in the md file. I think that is still
the correct thing to do. So let's just add a note to the documentation
on this behavior and NOT change read-md.cc (read_braced_string).
Since this behavior has been there for the last 23 years and only
one person ran into this behavior and helped with the conversion
from using quoted strings to braced strings; that is you just need
to remove the quote around the brace rather than change all of the
code.

Build the documentation to make sure it looks correct.

gcc/ChangeLog:

* doc/rtl.texi: Add a note about quotes in braced strings.

OK
jeff

[to-be-committed][RISC-V][PR target/109279] Improve RISC-V constant synthesis

2024-11-22 Thread Jeff Law

This is a small improvement to the constant synthesis code to capture a 
case appended to PR 109279.


The case in question has the property that the high 32 bits have the
value one less than the low 32 bits and the highest bit in the low 32
bits is on.  The example used in BZ is 0xcccccccccccccccd which comes up
computing N/10.


When we construct a constant with bit 31 on, it gets implicitly sign
extended.  So something like 0xcccccccd when constructed would generate
0xffffffffcccccccd.  The low bits are precisely what we want and the
high bits are a "-1".  Both properties are useful.


We left shift that value by 32 positions into a temporary and add that 
temporary to the original value.  Concretely:



  0xffffffffcccccccd
+ 0xcccccccd00000000
  ------------------
  0xcccccccccccccccd
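
The arithmetic above can be checked with a small standalone C sketch.  It
is not the riscv_build_integer code from the patch; the function names are
made up for illustration, and the sketch only models the sign extension,
the shift by 32 and the final add on plain 64-bit integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the pattern from the description applies: bit 31 of the low
   half is set and the high half is exactly one less than the low half.  */
bool
hi_is_lo_minus_one (uint64_t value)
{
  uint32_t lo = (uint32_t) value;
  uint32_t hi = (uint32_t) (value >> 32);
  return (lo & 0x80000000u) != 0 && hi == lo - 1u;
}

/* Rebuild the 64-bit constant from its low half only.  */
uint64_t
synthesize_from_low_half (uint32_t lo)
{
  /* Precondition: bit 31 is on.  */
  assert ((lo & 0x80000000u) != 0);

  /* Loading LO into a 64-bit register sign extends it, so the upper
     32 bits become all ones, i.e. -1.  */
  uint64_t sext = UINT64_C (0xffffffff00000000) | lo;

  /* Shift into a temporary, then add the temporary to the original.
     The -1 in the upper half turns LO into LO - 1 there.  */
  uint64_t temp = sext << 32;
  return sext + temp;
}
```

Calling synthesize_from_low_half (0xcccccccdu) yields the N/10 constant
discussed above, and hi_is_lo_minus_one () accepts that constant.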


Tested in my tester on rv32 and rv64, waiting on the pre-commit tester 
to do its thing.


Jeff

PR target/109279
gcc/
* config/riscv/riscv.cc (riscv_build_integer): Handle another 64-bit
synthesis where high half is one less than the low half and the 32-bit
sign bit is on.

gcc/testsuite/

* gcc.target/riscv/synthesis-16.c: New test.


diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 93702f71ec9..a25fdf89e44 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1315,6 +1315,34 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value,
  cost = alt_cost;
}
 
+  /* If bit31 is on and the upper constant is one less than the lower
+constant, then we can exploit sign extending nature of the lower
+half to trivially generate the upper half with an ADD.
+
+Not appropriate for ZBKB since that won't use "add"
+at codegen time.  */
+  if (!TARGET_ZBKB
+ && cost > 4
+ && bit31
+ && hival == loval - 1)
+   {
+ alt_cost = 2 + riscv_build_integer_1 (alt_codes,
+   sext_hwi (loval, 32), mode);
+ alt_codes[alt_cost - 3].save_temporary = true;
+ alt_codes[alt_cost - 2].code = ASHIFT;
+ alt_codes[alt_cost - 2].value = 32;
+ alt_codes[alt_cost - 2].use_uw = false;
+ alt_codes[alt_cost - 2].save_temporary = false;
+ /* This will turn into an ADD.  */
+ alt_codes[alt_cost - 1].code = CONCAT;
+ alt_codes[alt_cost - 1].value = 32;
+ alt_codes[alt_cost - 1].use_uw = false;
+ alt_codes[alt_cost - 1].save_temporary = false;
+
+ memcpy (codes, alt_codes, sizeof (alt_codes));
+ cost = alt_cost;
+   }
+
   if (cost > 4 && !bit31 && TARGET_ZBA)
{
  int value = 0;
diff --git a/gcc/testsuite/gcc.target/riscv/synthesis-16.c b/gcc/testsuite/gcc.target/riscv/synthesis-16.c
new file mode 100644
index 00000000000..352c48ec037
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/synthesis-16.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* We aggressively skip as we really just need to test the basic synthesis
+   which shouldn't vary based on the optimization level.  -O1 seems to work
+   and eliminates the usual sources of extraneous dead code that would throw
+   off the counts.  */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-O2" "-O3" "-Os" "-Oz" "-flto" } } */
+/* { dg-options "-march=rv64gc" } */
+
+/* Rather than test for a specific synthesis of all these constants or
+   having thousands of tests each testing one variant, we just test the
+   total number of instructions.
+
+   This isn't expected to change much and any change is worthy of a look.  */
+/* { dg-final { scan-assembler-times "\\t(add|addi|bseti|li|pack|ret|sh1add|sh2add|sh3add|slli|srli|xori|or)" 5 } } */
+
+unsigned long foo_0xcccccccccccccccd(void) { return 0xcccccccccccccccdUL; }

Re: [PATCH] match.pd: Fix up the new simplifiers using with_possible_nonzero_bits2 [PR117420]

2024-11-22 Thread Richard Biener

On Fri, 22 Nov 2024, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase shows wrong-code caused by incorrect use
> of with_possible_nonzero_bits2.
> That matcher is defined as
> /* Slightly extended version, do not make it recursive to keep it cheap.  */
> (match (with_possible_nonzero_bits2 @0)
>  with_possible_nonzero_bits@0)
> (match (with_possible_nonzero_bits2 @0)
>  (bit_and:c with_possible_nonzero_bits@0 @2))
> and because with_possible_nonzero_bits includes the SSA_NAME case with
> integral/pointer argument, both forms can actually match when a SSA_NAME
> with integral/pointer type has a def stmt which is BIT_AND_EXPR
> assignment with say SSA_NAME with integral/pointer type as one of its
> operands (or INTEGER_CST, another with_possible_nonzero_bits case).
> And in match.pd the latter actually wins if both match and so when using
> (with_possible_nonzero_bits2 @0) the @0 will actually be one of the
> BIT_AND_EXPR operands if that form is matched.

Hmm, the (match ...) cases should be ideally ordered in order of the
match.pd file, so we maybe should simply swap the two, making the
fallback to with_possible_nonzero_bits match last?

> Now, with_possible_nonzero_bits2 and with_certain_nonzero_bits2 were added
> for the
> /* X == C (or X & Z == Y | C) is impossible if ~nonzero(X) & C != 0.  */
> (for cmp (eq ne)
>  (simplify
>   (cmp:c (with_possible_nonzero_bits2 @0) (with_certain_nonzero_bits2 @1))
>   (if (wi::bit_and_not (wi::to_wide (@1), get_nonzero_bits (@0)) != 0)
>{ constant_boolean_node (cmp == NE_EXPR, type); }))) 
> simplifier, but even for that one I think they do not do a good job, they
> might actually pessimize stuff rather than optimize, but at least does not
> result in wrong-code, because the operands are solely tested with
> wi::to_wide or get_nonzero_bits, but not actually used in the
> simplification.  The reason why it can pessimize stuff is say if we have
>   # RANGE [irange] int ... MASK 0xb VALUE 0x0
>   x_1 = ...;
>   # RANGE [irange] int ... MASK 0x8 VALUE 0x0
>   _2 = x_1 & 0xc;
>   _3 = _2 == 2;
> then if it used just with_possible_nonzero_bits@0, @0 would have
> get_nonzero_bits (@0) 0x8 and (2 & ~8) != 0, so we can fold it into
>   _3 = 0;
> But as it uses (with_possible_nonzero_bits2 @0), @0 is x_1 rather
> than _2 and get_nonzero_bits (@0) is unnecessarily conservative,
> 0xb rather than 0x8 and (2 & ~0xb) == 0, so we don't optimize.
> Now, with_possible_nonzero_bits2 can actually improve stuff as well in that
> pattern, if say value ranges aren't fully computed yet or the BIT_AND_EXPR
> assignment has been added later and the lhs doesn't have range computed yet,
> get_nonzero_range on the BIT_AND_EXPR lhs will be all bits set, while
> on the BIT_AND_EXPR operand might actually succeed.
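
The folding condition in the quoted example can be reproduced with plain
integers.  This is only a sketch of the bit test, not match.pd or
get_nonzero_bits itself; eq_is_impossible and check_example are names
invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* X == C can never hold when C has a bit set that X cannot have,
   i.e. when (C & ~possible_nonzero (X)) != 0.  */
bool
eq_is_impossible (uint64_t possible_nonzero_bits, uint64_t c)
{
  return (c & ~possible_nonzero_bits) != 0;
}

/* The numbers from the quoted example.  */
void
check_example (void)
{
  /* Using the bits recorded for _2 (0x8): 2 & ~0x8 != 0, fold to 0.  */
  assert (eq_is_impossible (0x8, 2));
  /* Using the bits recorded for x_1 (0xb): 2 & ~0xb == 0, no fold.  */
  assert (!eq_is_impossible (0xb, 2));
}
```

With the tighter 0x8 the comparison folds; with the more conservative 0xb
it does not, which is the pessimization described in the quoted text.
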
> 
> I believe better would be to either modify get_nonzero_bits so that it
> special cases the SSA_NAME with BIT_AND_EXPR def_stmt (but one level
> deep only like with_possible_nonzero_bits2, no recursion), in that case
> return bitwise and of get_nonzero_bits (non-recursive) for the lhs and
> both operands, and possibly BIT_AND_EXPR itself e.g. for GENERIC
> matching during by returning bitwise and of both operands.
> Then with_possible_nonzero_bits2 could be needed for the GENERIC case,
> perhaps have the second match #if GENERIC, but changed so that the @N
> operand always is the whole thing rather than its operand which is
> error-prone.  Or add get_nonzero_bits wrapper with a different name
> which would do that.
> 
> with_certain_nonzero_bits2 could be changed similarly, these days
> we can test known non-zero bits rather than possible non-zero bits on
> SSA_NAMEs too, we record both mask and value, so possible nonzero bits
> (aka. get_nonzero_bits) is mask () | value (), while known nonzero bits
> is value () & ~mask (), with a new function (get_known_nonzero_bits
> or get_certain_nonzero_bits etc.) which handles that.
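
The two formulas in the quoted paragraph can be written out as follows.
The struct and function names are hypothetical stand-ins, not GCC's
irange API; the sketch assumes that a set bit in the mask means the bit
is unknown and that the value holds the known bits elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the recorded bit information: a set bit in
   MASK means "unknown", and VALUE holds the known bits elsewhere.  */
struct bit_info
{
  uint64_t mask;
  uint64_t value;
};

/* Bits that may be nonzero: unknown bits plus bits known to be one.  */
uint64_t
possible_nonzero_bits (struct bit_info b)
{
  return b.mask | b.value;
}

/* Bits that are certainly nonzero: known-one bits outside the mask.  */
uint64_t
known_nonzero_bits (struct bit_info b)
{
  return b.value & ~b.mask;
}

/* Every certainly-nonzero bit must also be a possibly-nonzero bit.  */
void
check_invariant (struct bit_info b)
{
  assert ((known_nonzero_bits (b) & ~possible_nonzero_bits (b)) == 0);
}
```

For MASK 0xb VALUE 0x0, as in the range dump quoted earlier, the possible
nonzero bits are 0xb and there are no certainly nonzero bits.
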
> 
> Anyway, the following patch doesn't do what I wrote above just yet,
> for that single pattern it is just a missed optimization.
> But the with_possible_nonzero_bits2 uses in the 3 new simplifiers are
> just completely incorrect, because they don't just use the @0 operand
> in get_nonzero_bits (pessimizing stuff if value ranges are fully computed),
> but also use it in the replacement, then they act as if the BIT_AND_EXPR
> wasn't there at all.
> While we could use (with_possible_nonzero_bits2@3 @0) and use
> get_nonzero_bits (@0) and use @3 in the replacement, that would still
> often be a pessimization, so I've just used with_possible_nonzero_bits@0.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> And what do you think about the above mentioned approach for the other
> with_possible_nonzero_bits2 using simplifier?

I find having both matches confusing most of the time, we definitely
lack better documentation (in comments) on how they are supposed to be
used.

Improving things is appreciated - if what you su

[COMMITTED] MAINTAINERS: Add myself to write after approval

2024-11-22 Thread Evgeny Karpov

ChangeLog:

* MAINTAINERS: Add myself to write after approval.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7da332323dc..e432b2a4da9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -570,6 +570,7 @@ Kean Johnston   -   

 Phillip Jordan  pmj 
 Tim Josling timjosling  
 Victor Kaplanskyvictork 
+Evgeny Karpov   -   
 Filip Kastl pheeck  
 Geoffrey Keatinggeoffk  
 Brendan Kehoe   -   
-- 
2.34.1

[patch,avr] PR117726: More tweaks to multi-byte shifts

2024-11-22 Thread Georg-Johann Lay

This patch is similar to https://gcc.gnu.org/r15-5569 (tweak ashift:SI)
but for ashiftrt and lshiftrt codes.  It splits constant shift offsets > 16
into a 3-operand byte shift and a 2-operand residual bit shift.
   Moreover, some of the constraint alternatives have been promoted
to 3-operand alternatives regardless of options.  For example,
ashift:HI and lshiftrt:HI can support 3 operands for offsets 9...12
without any overhead.
   Apart from that, it's a bit of code clean up for 2-byte and 4-byte
shift insns:  Use one RTL peephole with any_shift code iterator
instead of 3 individual peepholes.  It also removes some useless
split insns; presumably introduced during the cc0 -> CCmode work.
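
The idea of splitting a shift into a byte shift plus a residual bit shift
can be illustrated on plain 32-bit integers.  This is only a model of the
arithmetic, not avr_split_shift(); which offsets the real splitter handles
and which insns it emits is not shown here, and lshr32_split is a made-up
name.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift of a 32-bit value done in two steps: a shift by
   whole bytes followed by a residual shift of 0...7 bits.  */
uint32_t
lshr32_split (uint32_t x, unsigned offset)
{
  assert (offset < 32);

  unsigned byte_part = offset & ~7u;  /* Multiple of 8.  */
  unsigned bit_part = offset & 7u;    /* Residual.  */

  uint32_t tmp = x >> byte_part;      /* Byte shift.  */
  return tmp >> bit_part;             /* Residual bit shift.  */
}
```

For an offset of 21 this performs a shift by 16 followed by a shift by 5,
which gives the same result as a single shift by 21.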


No regressions. Ok for trunk?

Johann

--

AVR: target/117726 - Tweak ashiftrt:SI and lshiftrt:SI insns.

This patch is similar to r15-5569 (tweak ashift:SI) but for
ashiftrt and lshiftrt codes.  It splits constant shift offsets > 16
into a 3-operand byte shift and a 2-operand residual bit shift.
   Moreover, some of the constraint alternatives have been promoted
to 3-operand alternatives regardless of options.  For example,
ashift:HI and lshiftrt:HI can support 3 operands for offsets 9...12
without any overhead.
   Apart from that, it's a bit of code clean up for 2-byte and 4-byte
shift insns:  Use one RTL peephole with any_shift code iterator
instead of 3 individual peepholes.  It also removes some useless
split insns; presumably introduced during the cc0 -> CCmode work.

PR target/117726
gcc/
* config/avr/avr-passes.cc (avr_split_shift): Also handle
ASHIFTRT and LSHIFTRT codes for 4-byte shifts.
(constr_split_shift4): New code_attr.
* config/avr/predicates.md (scratch_or_d_register_operand):
rename to scratch_or_dreg_operand.
* config/avr/avr.md: Same.
(define_peephole2): Write the RTL scratch peephole for 2-byte and
4-byte shifts that generates *sh*3_const insns using code
iterator any_shift.
(*ashlhi3_const_split, *ashrhi3_const_split, *ashrhi3_const_split)
(*lshrsi3_const_split, *lshrhi3_const_split): Remove useless
split insns.
(define_split) [avropt_split_bit_shift]: Add splitters
for 4-byte ASHIFTRT and LSHIFTRT insns using avr_split_shift().
(ashrsi3, *ashrsi3, *ashrsi3_const): Add "r,0,C4a" and "r,r,C4a"
constraint alternatives depending on 2op, 3op.
(lshrsi3, *lshrsi3, *lshrsi3_const): Add "r,0,C4r" and "r,r,C4r"
constraint alternatives depending on 2op, 3op. Add "r,r,C15".
(lshrhi3, *lshrhi3, *lshrhi3_const, ashlhi3, *ashlhi3)
(*ashlhi3_const): Add "r,r,C7c" alternative.
* config/avr/constraints.md (C7c): New constraint in 7...12.
* config/avr/avr.cc (ashlhi3_out, lshrhi3_out)
[case 7, 9, 10, 11, 12]: Support as 3-operand insn.
(lshrsi3_out) [case 15]: Support as 3-operand insn.
* doc/invoke.texi (AVR Options) <-msplit-bit-shift>: Document.

Re: [PATCH] inline asm, v3: Add new constraint for symbol definitions

2024-11-22 Thread Joseph Myers

On Fri, 22 Nov 2024, Jakub Jelinek wrote:

> On Thu, Nov 21, 2024 at 11:57:13PM +0000, Joseph Myers wrote:
> > On Wed, 6 Nov 2024, Jakub Jelinek wrote:
> > 
> > > +   error_at (loc, "%<:%> constraint operand is not address "
> > > +  "of a function or non-automatic variable");
> > 
> > I think a testcase for this error is needed.
> 
> Here is an updated patch with the additional test coverage.

The C front-end changes in this patch are OK.

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH] wwwdocs: Align the DCO text for the GNU Toolchain to match community usage.

2024-11-22 Thread Carlos O'Donell

On 11/22/24 11:13 AM, Jason Merrill wrote:
> On 11/21/24 6:04 PM, Carlos O'Donell wrote:
>> Adjust the DCO text to match the broader community usage including 
>> the Linux kernel use around "real names."
>> 
>> These changes clarify what was meant by "real name" and that it is 
>> not required to be a "legal name" or any other stronger
>> requirement than a known identity that could be contacted to
>> discuss the contribution.
> 
> My take has been that this change is not necessary for us because the
> FSF can accept copyright assignment for pseudonymous contributions,
> so individual reviewers don't need to adjudicate whether a particular
> pseudonym is sufficiently "known".

This is not the case, which is why I'm suggesting we align the wording of the
DCO usage to match the general community accepted meaning.

The FSF copyright assignment process allows you to *post* your work publicly
from a pseudonym and allows you to use your pseudonym in the "sources" file that
GNU Maintainers use to check assignment and marks it like this:
"Note: this is a pseudonym; legal name on assignment."

The process does not allow you to remain pseudonymous to the FSF, and that
information may eventually leak out of the FSF.

Again, I'm suggesting we align the text of the DCO we use with the rest of the
communities that use it.

This is not a material change in the use of the DCO, just a clarification of the
wording around "real name."

-- 
Cheers,
Carlos.

[PATCH] replace atoi with strtol in varasm.cc (decode_reg_name_and_count) [PR114540]

2024-11-22 Thread Heiko Eißfeldt

A simple replacement of atoi() with strtol() in
varasm.cc:decode_reg_name_and_count().

Parsing now checks errno for ERANGE, so overflow no longer goes undetected.

Being new, it is difficult for me to come up with a good test case.
---

diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index acc4b4a0419..d3c60eaf4d6 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -993,8 +993,11 @@ decode_reg_name_and_count (const char *asmspec, int *pnregs)
  break;
   if (asmspec[0] != 0 && i < 0)
    {
- i = atoi (asmspec);
- if (i < FIRST_PSEUDO_REGISTER && i >= 0 && reg_names[i][0])
+ char *pend{};
+ errno = 0;
+ i = strtol (asmspec, &pend, 10);
+ if (errno != ERANGE
+ && i < FIRST_PSEUDO_REGISTER && i >= 0 && reg_names[i][0])
    return i;
  else
    return -2;
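
For reference, a standalone sketch of range-checked parsing with strtol().
It is not the varasm.cc code: parse_reg_number is a made-up name, LIMIT
stands in for FIRST_PSEUDO_REGISTER, and the end-pointer checks are an
addition that the patch above does not make.  Keeping the result in a long
until after the range check also avoids truncating large values to int.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse S as a decimal register number in [0, LIMIT).  Returns the
   number, or -2 if S is not a valid in-range number.  */
int
parse_reg_number (const char *s, int limit)
{
  assert (s != NULL);

  char *end;
  errno = 0;
  long n = strtol (s, &end, 10);

  if (end == s || *end != '\0')  /* No digits, or trailing junk.  */
    return -2;
  if (errno == ERANGE)           /* Out of range for long.  */
    return -2;
  if (n < 0 || n >= limit)       /* Also rejects values too big for int.  */
    return -2;

  return (int) n;
}
```

With a limit of 32, "7" parses to 7, while "32", "-1", "12x", the empty
string and a 20-digit number are all rejected with -2.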

Re: [PATCH v1] autoupdate: replace obsolete macros in libiberty

2024-11-22 Thread Sam James

Matthieu Longo  writes:

> Autoreconf-2.72 warns about obsolete macros. This patch aims at removing
> the noise from a future upgrade to autoreconf-2.72 or later. This is in
> no way a complete patch allowing the upgrade to autoreconf-2.72.
>
> - AC_GNU_SOURCE by AC_USE_SYSTEM_EXTENSIONS
>   https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.72/
>   autoconf.html#index-AC_005fGNU_005fSOURCE-1
> - AC_CONFIG_HEADER by AC_CONFIG_HEADERS
>   https://www.gnu.org/software/automake/manual/1.12.2/html_node/Obsolete-
>   Macros.html#index-AM_005fCONFIG_005fHEADER
>
> Those fixes were originally submitted in a patch series in binutils.
> https://inbox.sourceware.org/binutils/878qthm6a0@gentoo.org/
>
> libiberty/ChangeLog:
>
>   * configure: Regenerate.
>   * configure.ac: Fix autoupdate warnings.

LGTM but cannot approve.

Jeff, would you mind acking this as a global reviewer? It was in
binutils-gdb.git for several months, this is just bringing us in sync.

> ---
>  libiberty/configure| 1 -
>  libiberty/configure.ac | 4 ++--
>  2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/libiberty/configure b/libiberty/configure
> index 5c69fee56c1..38856a07e5f 100755
> --- a/libiberty/configure
> +++ b/libiberty/configure
> @@ -4413,7 +4413,6 @@ $as_echo "$ac_cv_safe_to_define___extensions__" >&6; }
>$as_echo "#define _TANDEM_SOURCE 1" >>confdefs.h
>  
>  
> -
>  # Check whether --enable-largefile was given.
>  if test "${enable_largefile+set}" = set; then :
>enableval=$enable_largefile;
> diff --git a/libiberty/configure.ac b/libiberty/configure.ac
> index 0888e638896..c27e08e1428 100644
> --- a/libiberty/configure.ac
> +++ b/libiberty/configure.ac
> @@ -172,7 +172,7 @@ AC_MSG_NOTICE([target_header_dir = $target_header_dir])
>  
>  GCC_NO_EXECUTABLES
>  AC_PROG_CC
> -AC_GNU_SOURCE
> +AC_USE_SYSTEM_EXTENSIONS
>  AC_SYS_LARGEFILE
>  AC_PROG_CPP_WERROR
>  
> @@ -205,7 +205,7 @@ dnl AM_PROG_LIBTOOL
>  
>  dnl When we start using automake:
>  dnl AM_CONFIG_HEADER(config.h:config.in)
> -AC_CONFIG_HEADER(config.h:config.in)
> +AC_CONFIG_HEADERS([config.h:config.in])
>  
>  dnl When we start using automake:
>  dnl AM_MAINTAINER_MODE

Re: [PATCH] wwwdocs: Align the DCO text for the GNU Toolchain to match community usage.

2024-11-22 Thread Jason Merrill


On 11/21/24 6:04 PM, Carlos O'Donell wrote:

Adjust the DCO text to match the broader community usage including
the Linux kernel use around "real names."

These changes clarify what was meant by "real name" and that it is
not required to be a "legal name" or any other stronger requirement
than a known identity that could be contacted to discuss the
contribution.


My take has been that this change is not necessary for us because the 
FSF can accept copyright assignment for pseudonymous contributions, so 
individual reviewers don't need to adjudicate whether a particular 
pseudonym is sufficiently "known".


Jason

Re: [PATCH] inline asm, v3: Add new constraint for symbol definitions

2024-11-22 Thread Richard Biener

On Fri, 22 Nov 2024, Jakub Jelinek wrote:

> On Thu, Nov 21, 2024 at 11:57:13PM +0000, Joseph Myers wrote:
> > On Wed, 6 Nov 2024, Jakub Jelinek wrote:
> > 
> > > +   error_at (loc, "%<:%> constraint operand is not address "
> > > +  "of a function or non-automatic variable");
> > 
> > I think a testcase for this error is needed.
> 
> Here is an updated patch with the additional test coverage.

The middle-end changes are OK.

Richard.

> 2024-11-21  Jakub Jelinek  
> 
> gcc/
>   * genpreds.cc (mangle): Add ':' mangling.
>   (add_constraint): Allow : constraint.
>   * common.md (:): New define_constraint.
>   * stmt.cc (parse_output_constraint): Diagnose "=:".
>   (parse_input_constraint): Handle ":" and diagnose invalid
>   uses.
>   * doc/md.texi (Simple Constraints): Document ":" constraint.
> gcc/c/
>   * c-typeck.cc (build_asm_expr): Diagnose invalid ":" constraint
>   uses.
> gcc/cp/
>   * semantics.cc (finish_asm_stmt): Diagnose invalid ":" constraint
>   uses.
> gcc/testsuite/
>   * c-c++-common/toplevel-asm-4.c: New test.
>   * c-c++-common/toplevel-asm-5.c: New test.
> 
> --- gcc/genpreds.cc.jj2024-02-10 11:25:10.404468273 +0100
> +++ gcc/genpreds.cc   2024-11-05 14:57:14.193060528 +0100
> @@ -753,6 +753,7 @@ mangle (const char *name)
>case '_': obstack_grow (rtl_obstack, "__", 2); break;
>case '<':  obstack_grow (rtl_obstack, "_l", 2); break;
>case '>':  obstack_grow (rtl_obstack, "_g", 2); break;
> +  case ':': obstack_grow (rtl_obstack, "_c", 2); break;
>default: obstack_1grow (rtl_obstack, *name); break;
>}
>  
> @@ -797,12 +798,13 @@ add_constraint (const char *name, const
>for (p = name; *p; p++)
>  if (!ISALNUM (*p))
>{
> - if (*p == '<' || *p == '>' || *p == '_')
> + if (*p == '<' || *p == '>' || *p == '_' || *p == ':')
> need_mangled_name = true;
>   else
> {
>   error_at (loc, "constraint name '%s' must be composed of letters,"
> -   " digits, underscores, and angle brackets", name);
> +   " digits, underscores, colon and angle brackets",
> +   name);
>   return;
> }
>}
> --- gcc/common.md.jj  2024-01-03 11:51:24.519828508 +0100
> +++ gcc/common.md 2024-11-05 14:51:29.098989927 +0100
> @@ -100,6 +100,11 @@ (define_constraint "s"
> (match_test "!CONST_SCALAR_INT_P (op)")
> (match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))
>  
> +(define_constraint ":"
> +  "Defines a symbol."
> +  (and (match_test "CONSTANT_P (op)")
> +   (match_test "!CONST_SCALAR_INT_P (op)")))
> +
>  (define_constraint "n"
>"Matches a non-symbolic integer constant."
>(and (match_test "CONST_SCALAR_INT_P (op)")
> --- gcc/stmt.cc.jj2024-10-25 10:00:29.523767070 +0200
> +++ gcc/stmt.cc   2024-11-05 18:31:11.518948252 +0100
> @@ -278,6 +278,10 @@ parse_output_constraint (const char **co
> error ("matching constraint not valid in output operand");
> return false;
>  
> + case ':':
> +   error ("%<:%> constraint used for output operand");
> +   return false;
> +
>   case '<':  case '>':
> /* ??? Before flow, auto inc/dec insns are not supposed to exist,
>excepting those that expand_call created.  So match memory
> @@ -325,6 +329,7 @@ parse_input_constraint (const char **con
>size_t c_len = strlen (constraint);
>size_t j;
>bool saw_match = false;
> +  bool at_checked = false;
>  
>/* Assume the constraint doesn't allow the use of either
>   a register or memory.  */
> @@ -362,6 +367,21 @@ parse_input_constraint (const char **con
>case 'N':  case 'O':  case 'P':  case ',':
>   break;
>  
> +  case ':':
> + /* Verify that if : is used, it is just ":" or say ":,:" but not
> +mixed with other constraints or say ",:,," etc.  */
> + if (!at_checked)
> +   {
> + for (size_t k = 0; k < c_len; ++k)
> +   if (constraint[k] != ((k & 1) ? ',' : ':') || (c_len & 1) == 0)
> + {
> +   error ("%<:%> constraint mixed with other constraints");
> +   return false;
> + } 
> + at_checked = true;
> +   }
> + break;
> +
>   /* Whether or not a numeric constraint allows a register is
>  decided by the matching constraint, and so there is no need
>  to do anything special with them.  We must handle them in
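
The validity check in the quoted parse_input_constraint hunk (only ":"
alternatives, separated by commas) can be tried in isolation.  This is a
standalone rewrite for illustration, not the stmt.cc code;
colon_constraint_ok is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if CONSTRAINT consists solely of ':' alternatives separated by
   commas, such as ":" or ":,:".  */
bool
colon_constraint_ok (const char *constraint)
{
  assert (constraint != NULL);

  size_t len = strlen (constraint);

  /* A well-formed string has odd length: n colons and n - 1 commas.  */
  if ((len & 1) == 0)
    return false;

  /* Even positions must hold ':' and odd positions ','.  */
  for (size_t k = 0; k < len; ++k)
    if (constraint[k] != ((k & 1) ? ',' : ':'))
      return false;

  return true;
}
```

The strings ":" and ":,:" pass; ":,", ",:,," and ":r" do not, matching
the cases named in the quoted comment.
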
> --- gcc/doc/md.texi.jj2024-10-16 14:41:45.553757783 +0200
> +++ gcc/doc/md.texi   2024-11-05 18:46:30.795896301 +0100
> @@ -1504,6 +1504,13 @@ as the predicate in the @code{match_oper
>  the mode specified in the @code{match_operand} as the mode of the memory
>  reference for which the address would be valid.
>  
> +@cindex @samp{:} in constraint
> +@item @samp{:}
> +This constraint, allowed only in input operands, says th

Re: [PATCH] v3: Add -f{, no-}assume-sane-operators-new-delete options [PR110137]

2024-11-22 Thread Richard Biener

On Fri, 22 Nov 2024, Jakub Jelinek wrote:

> On Tue, Nov 19, 2024 at 01:52:06PM +0100, Jan Hubicka wrote:
> > > On Tue, Nov 19, 2024 at 11:23:31AM +0100, Jakub Jelinek wrote:
> > > > On Tue, Nov 19, 2024 at 10:25:16AM +0100, Richard Biener wrote:
> > > > > I think it's pretty clear and easy to describe to users what "m " and 
> > > > > what "mC" do.  But with "pure" this is an odd intermediate state.  
> > > > > For both
> > > > > "m " and "mP" you suggest above the new/delete might modify their
> > > > > global state but as you can't rely on the new/delete pair to prevail
> > > > > you cannot rely on the modification to happen.  But how do you explain
> > > > > that
> > > > 
> > > > If we are willing to make the default not strictly conforming (i.e.
> > > > basically revert PR101480 by default and make the GCC 11.1/11.2 behavior
> > > > the default and allow -fno-sane-operators-new-delete to change to GCC
> > > > 11.3/14.* behavior), I can live with it.
> > > > But we need to make the documentation clear that the default is not 
> > > > strictly
> > > > conforming.
> > > 
> > > Here is a modified version of the patch to do that.
> > > 
> > > Or do we want to set the default based on -std= option (-std=gnu* implies
> > > -fassume-sane-operators-new-delete, -std=c++* implies
> > > -fno-assume-sane-operators-new-delete)?  Though, not sure what to do for
> > > LTO then.
> > 
> > My original plan was to add " sane" attribute to the declarations and
> > prevent them from being merged.  Then every direct call to new/delete
> > would know if it came from sane or insane translation unit.
> > 
> > Alternatively one can also declare
> >  +C++ ObjC++ LTO Var(flag_assume_sane_operators_new_delete) Init(1)
> >  +Assume C++ replaceable global operators new, new[], delete, delete[] 
> > don't read or write visible global state.
> > as optimization.  Then sanity would be function specific.
> > 
> > inline_call contains code that drops flag_strict_aliasing for function
> > when it inlines -fno-strict-aliasing function into -fstrict-aliasing.
> > At same place we can make new/delete operator insanity similarly
> > contagious.  If you inline function that has insane new/delete calls you
> > make the combined function also insane.
> 
> Here is an updated patch, which makes it Optimize and merges it during
> inlining (pessimistically).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> Note, perhaps it could be (maybe incrementally) refined to only clear
> the flag on inlining if the callee actually has any
> gimple_call_from_new_or_delete stmts calling DECL_IS_REPLACEABLE_OPERATOR.
> I.e. like it collects whether a function uses floating point operations
> also gather this in another flag.

After 64bit location_t we have spare flags in gimple, I think we could
reserve a flag to indicate something with that something dependent on
the function called - like in the gimple_call_from_new_or_delete case
whether it's considered "sane" or not.  This would solve the inlining
case (we now also have two bits left in GF_CALL, so possible even now)

Thanks,
Richard.
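
The pessimistic merging during inlining that is discussed above can be
modelled in a few lines.  This is a sketch only: struct fn_opts and
merge_on_inline are hypothetical and do not correspond to the actual
inline_call code or GCC's option structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function option set; only the one flag matters.  */
struct fn_opts
{
  bool assume_sane_operators_new_delete;
};

/* Pessimistic merge on inlining: the combined function may only keep
   the assumption if both caller and callee were compiled with it.  */
void
merge_on_inline (struct fn_opts *caller, const struct fn_opts *callee)
{
  assert (caller != NULL && callee != NULL);

  if (!callee->assume_sane_operators_new_delete)
    caller->assume_sane_operators_new_delete = false;
}
```

Inlining a callee without the assumption into a caller that has it clears
the flag on the caller; all other combinations leave the caller unchanged.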

> 2024-11-22  Jakub Jelinek  
> 
>   PR c++/110137
>   PR middle-end/101480
> gcc/
>   * doc/invoke.texi (-fassume-sane-operators-new-delete,
>   -fno-assume-sane-operators-new-delete): Document.
>   * gimple.cc (gimple_call_fnspec): Handle
>   -f{,no-}assume-sane-operators-new-delete.
>   * ipa-inline-transform.cc (inline_call): Also clear
>   flag_assume_sane_operators_new_delete on caller when inlining
>   -fno-assume-sane-operators-new-delete callee into
>   -fassume-sane-operators-new-delete caller.
> gcc/c-family/
>   * c.opt (fassume-sane-operators-new-delete): New option.
> gcc/testsuite/
>   * g++.dg/tree-ssa/pr110137-1.C: New test.
>   * g++.dg/tree-ssa/pr110137-2.C: New test.
>   * g++.dg/tree-ssa/pr110137-3.C: New test.
>   * g++.dg/tree-ssa/pr110137-4.C: New test.
>   * g++.dg/torture/pr10148.C: Add -fno-assume-sane-operators-new-delete
>   as dg-additional-options.
>   * g++.dg/warn/Warray-bounds-16.C: Revert 2021-11-10 changes.
> 
> --- gcc/doc/invoke.texi.jj   2024-11-20 14:27:49.257228428 +0100
> +++ gcc/doc/invoke.texi   2024-11-20 14:44:02.819559242 +0100
> @@ -213,7 +213,9 @@ in the following sections.
>  @item C++ Language Options
>  @xref{C++ Dialect Options,,Options Controlling C++ Dialect}.
>  @gccoptlist{-fabi-version=@var{n}  -fno-access-control
> --faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new
> +-faligned-new=@var{n}  -fargs-in-order=@var{n}
> +-fno-assume-sane-operators-new-delete
> +-fchar8_t  -fcheck-new
>  -fconcepts  -fconstexpr-depth=@var{n}  -fconstexpr-cache-depth=@var{n}
>  -fconstexpr-loop-limit=@var{n}  -fconstexpr-ops-limit=@var{n}
>  -fno-elide-constructors
> @@ -3164,6 +3166,35 @@ but few users will need to override the
>  
>  This flag is enabled by default for @option{-std=c++17}.
>  
> +@opindex fno-ass

[PATCH] [RFC] Add extra 64bit SSE vector epilogue in some cases

2024-11-22 Thread Richard Biener

Similar to the X86_TUNE_AVX512_TWO_EPILOGUES tuning which enables
an extra 128bit SSE vector epilogue when doing 512bit AVX512
vectorization in the main loop the following allows a 64bit SSE
vector epilogue to be generated when the previous vector epilogue
still had a vectorization factor of 16 or larger (which usually
means we are operating on char data).

This effectively applies to 256bit and 512bit AVX2/AVX512 main loops,
a 128bit SSE main loop would already get a 64bit SSE vector epilogue.

Together with X86_TUNE_AVX512_TWO_EPILOGUES this means three
vector epilogues for 512bit and two vector epilogues when enabling
256bit vectorization.  I have not added another tunable for this
RFC - suggestions on how to avoid inflation there welcome.
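
To get a feel for what the extra epilogue buys, here is some rough arithmetic as a standalone sketch.  It is not GCC code.  It assumes char data (one element per byte), a 512bit main loop followed by 256bit and 128bit epilogues, and that every stage consumes as many full vectors as fit; peeling, alignment and the cost model are ignored.  The function names are made up for illustration.

```cpp
// Simplified model: a stage with vectorization factor VF handles as many
// full vectors as fit and hands the remainder to the next stage.
constexpr unsigned long
after_stage (unsigned long remaining, unsigned vf)
{
  return remaining % vf;
}

// char data: 512bit main loop (VF 64), then 256bit (VF 32) and
// 128bit (VF 16) vector epilogues.  Result: iterations left for the
// scalar loop.
constexpr unsigned long
scalar_tail_two_epilogues (unsigned long n)
{
  return after_stage (after_stage (after_stage (n, 64), 32), 16);
}

// Same cascade with the additional 64bit (VF 8) epilogue of this RFC.
constexpr unsigned long
scalar_tail_three_epilogues (unsigned long n)
{
  return after_stage (scalar_tail_two_epilogues (n), 8);
}

// 127 chars: 64 + 32 + 16 are vectorized, 15 remain; the 64bit epilogue
// takes another 8, leaving 7 for the scalar loop.
static_assert (scalar_tail_two_epilogues (127) == 15, "15 scalar iterations");
static_assert (scalar_tail_three_epilogues (127) == 7, "7 scalar iterations");
```

In this model the worst-case scalar tail drops from 15 to 7 iterations, which matters most for loops with short trip counts.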

This speeds up 525.x264_r to within 5% of the -mprefer-vector-width=128
speed with -mprefer-vector-width=256 or -mprefer-vector-width=512
(the latter only when -mtune-ctrl=avx512_two_epilogues is in effect).

I have not done any further benchmarking, this merely shows the
possibility and looks for guidance on how to expose this to the
uarch tunings or to the user (at all?) if not gating on any uarch
specific tuning.

Note 64bit SSE isn't a native vector size so we rely on emulation
being "complete" (if it is not, epilogue vectorization will just fail,
so it's "safe" in this regard).  With AVX512 ISA available an alternative
is a predicated epilogue, but due to possible STLF issues user control
would be required here.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress
(I expect some fallout in scans due to some extra epilogues, let's see)

* config/i386/i386.cc (ix86_vector_costs::finish_cost): For a
128bit SSE epilogue request a 64bit SSE epilogue if the 128bit
SSE epilogue VF was 16 or higher.
---
 gcc/config/i386/i386.cc | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index c7e70c21999..f2e8de3aafc 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -25495,6 +25495,13 @@ ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
   && GET_MODE_SIZE (loop_vinfo->vector_mode) == 32)
m_suggested_epilogue_mode = V16QImode;
 }
+  /* When a 128bit SSE vectorized epilogue still has a VF of 16 or larger
+ enable a 64bit SSE epilogue.  */
+  if (loop_vinfo
+  && LOOP_VINFO_EPILOGUE_P (loop_vinfo)
+  && GET_MODE_SIZE (loop_vinfo->vector_mode) == 16
+  && LOOP_VINFO_VECT_FACTOR (loop_vinfo).to_constant () >= 16)
+m_suggested_epilogue_mode = V8QImode;
 
   vector_costs::finish_cost (scalar_costs);
 }
-- 
2.43.0

Re: [PATCH] replace atoi with strtol in varasm.cc (decode_reg_name_and_count) [PR114540]

2024-11-22 Thread Jeff Law





On 11/22/24 7:40 AM, Heiko Eißfeldt wrote:
> A simple replacement of atoi() with strtol() in
> varasm.cc:decode_reg_name_and_count().
>
> Parsing now has errno ERANGE checking, eg no undetected overflow.
>
> Being new it is difficult for me to come up with a good test case.

So I don't see any technical problem with the patch, but we don't have 
any evidence there's any kind of bug here.  I guess if a port had a 
bogus register name we could trigger a problem.


Given we're in stage3 (bugfixing) in preparation for the gcc-15 release 
in the spring, I'm going to defer this patch.


Jeff
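
For reference, the general shape of an atoi-to-strtol conversion with range checking looks roughly like the following.  This is a generic sketch, not the code from the patch: parse_decimal_int is a hypothetical helper and the real change in decode_reg_name_and_count may differ in what it accepts.  The point is that atoi has undefined behavior when the value is out of range, while strtol reports the condition through errno.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: parse TEXT as a decimal int.  Returns false for an
// empty string, trailing garbage, or a value that does not fit in int;
// *VALUE is only written on success.
inline bool
parse_decimal_int (const char *text, int *value)
{
  char *end = nullptr;
  errno = 0;                       // strtol only sets errno on failure
  long parsed = std::strtol (text, &end, 10);
  if (end == text || *end != '\0')
    return false;                  // no digits, or junk after the number
  if (errno == ERANGE || parsed < INT_MIN || parsed > INT_MAX)
    return false;                  // overflowed long, or too wide for int
  *value = static_cast<int> (parsed);
  return true;
}

inline void
parse_decimal_int_demo ()
{
  int n = 0;
  assert (parse_decimal_int ("31", &n) && n == 31);
  // Far beyond any long: detected instead of silently wrapping.
  assert (!parse_decimal_int ("99999999999999999999999", &n));
}
```

A register number such as the "31" in a name like "r31" would go through the success path; an absurdly long digit string takes the ERANGE path.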

Re: [PATCH] tree-optimization/117355: object size for PHI nodes with negative offsets

2024-11-22 Thread Jeff Law





On 11/20/24 12:54 PM, Siddhesh Poyarekar wrote:

When the object size estimate is returned for a PHI node, it is the
maximum possible value, which is fine in isolation.  When combined with
negative offsets however, it may sometimes end up in zero size because
the resultant size was larger than the wholesize, leading
size_for_offset to conclude that there's a potential underflow.  Fix
this by allowing a non-strict mode to size_for_offset, which
conservatively returns the size (or wholesize) in case of a negative
offset.
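
A toy integer model of the strict/non-strict behaviour described above may help.  It is a sketch under assumptions: the real size_for_offset works on trees in sizetype, and size_for_offset_model with its signed arithmetic is purely illustrative.  In particular, returning the unchanged size estimate in non-strict mode is my reading of "returns the size (or wholesize)".

```cpp
// SIZE: estimated bytes from the pointer to the end of the object.
// WHOLESIZE: estimated size of the whole object.
// OFFSET: signed adjustment applied to the pointer.
constexpr long long
size_for_offset_model (long long size, long long offset,
                       long long wholesize, bool strict)
{
  if (offset >= 0)
    // Moving forward shrinks the remaining size, never below zero.
    return offset >= size ? 0 : size - offset;

  // Moving backward grows the remaining size ...
  long long grown = size - offset;
  if (grown <= wholesize)
    return grown;

  // ... but not past the start of the object.  Strict mode treats that as
  // an underflow; non-strict mode assumes SIZE was an overestimate (for
  // example the maximum over a PHI) and returns it unchanged.
  return strict ? 0 : size;
}

// PHI of "4 bytes left in a 16 byte buffer" and "start of a 16 byte
// buffer" yields the maximum estimate 16/16.  Stepping back by 4 looks
// like an underflow in strict mode.
static_assert (size_for_offset_model (16, -4, 16, true) == 0, "strict");
static_assert (size_for_offset_model (16, -4, 16, false) == 16, "non-strict");
```

For a maximum-size estimate, answering 16 is an overestimate (the precise answer for the first PHI argument would be 8), but it is the conservative direction, whereas 0 is not.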

gcc/ChangeLog:

PR tree-optimization/117355
* tree-object-size.cc (size_for_offset): New argument STRICT,
return SZ if it is set to false.
(plus_stmt_object_size): Adjust call to SIZE_FOR_OFFSET.

gcc/testsuite/ChangeLog:

PR tree-optimization/117355
* g++.dg/ext/builtin-object-size2.C (test9): New test.
(main): Call it.
* gcc.dg/builtin-object-size-3.c (test10): Adjust expected size.

OK
jeff

Re: [PATCH][v2] tree-optimization/115825 - improve unroll estimates for volatile accesses

2024-11-22 Thread Jeff Law





On 11/19/24 5:27 AM, Richard Biener wrote:

The loop unrolling code assumes that one third of all volatile accesses
can be possibly optimized away which is of course not true.  This leads
to excessive unrolling in some cases.  The following tracks the number
of stmts with side-effects as those are not eliminatable later and
only assumes one third of the other stmts can be further optimized.
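
The effect on the estimate can be illustrated with a little arithmetic.  This is a simplified sketch with made-up function names; the real estimated_unrolled_size also accounts for statements eliminated by peeling and for the last iteration, which is left out here.

```cpp
// Old heuristic (simplified): assume one third of all unrolled stmts
// will be optimized away later.
constexpr long
old_unrolled_estimate (long stmts_per_iter, long nunroll)
{
  return stmts_per_iter * nunroll * 2 / 3;
}

// New heuristic (simplified): stmts with side effects, such as volatile
// accesses, are counted in full; the one-third discount is applied only
// to the remaining stmts.
constexpr long
new_unrolled_estimate (long stmts_per_iter, long side_effect_stmts_per_iter,
                       long nunroll)
{
  return side_effect_stmts_per_iter * nunroll
         + (stmts_per_iter - side_effect_stmts_per_iter) * nunroll * 2 / 3;
}

// Body of 12 stmts, 9 of them volatile accesses, unrolled 4 times:
// the old estimate is 48 * 2 / 3 = 32, the new one 36 + 12 * 2 / 3 = 44.
static_assert (old_unrolled_estimate (12, 4) == 32, "old estimate");
static_assert (new_unrolled_estimate (12, 9, 4) == 44, "new estimate");
```

With a size limit between the two numbers, the old estimate would allow the unroll and the new one would reject it, which matches the reduced unrolling described here.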

This causes some fallout in the testsuite where we rely on unrolling
even when calls are involved.  I have XFAILed g++.dg/warn/Warray-bounds-20.C
but adjusted the others with a #pragma GCC unroll to mimic previous
behavior and retain what the testcase was testing.  I've also filed
PR117671 for the case where the size estimation fails to honor the
stmts we then remove by inserting __builtin_unreachable ().
For gcc.dg/tree-ssa/cunroll-2.c the estimate that the code doesn't
grow is clearly bogus and we have explicit code to reject unrolling
for bodies containing calls so I've adjusted the testcase accordingly.

Re-posted with testsuite adjustments (original from July).

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

PR tree-optimization/115825
* tree-ssa-loop-ivcanon.cc (loop_size::not_eliminatable_after_peeling):
New.
(loop_size::last_iteration_not_eliminatable_after_peeling): Likewise.
(tree_estimate_loop_size): Count stmts with side-effects as
not optimistically eliminatable.
(estimated_unrolled_size): Compute the number of stmts that can
be optimistically eliminated by followup transforms.
(try_unroll_loop_completely): Adjust.

* gcc.dg/tree-ssa/cunroll-17.c: New testcase.
* gcc.dg/tree-ssa/cunroll-2.c: Adjust to not expect unrolling.
* gcc.dg/pr94600-1.c: Force unrolling.
* c-c++-common/ubsan/unreachable-3.c: Likewise.
* g++.dg/warn/Warray-bounds-20.C: XFAIL cases we rely on
unrolling loops created by new expressions and not inlined
CTOR invocations.

LGTM.
jeff

Re: [PATCH] c++: Use type_id_in_expr_sentinel in 6 further spots in the parser

2024-11-22 Thread Marek Polacek

On Thu, Sep 19, 2024 at 10:36:01PM +0200, Jakub Jelinek wrote:
> Hi!
> 
> The following patch uses type_id_in_expr_sentinel in a few spots which
> did it all manually.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Nice cleanup.  Patch looks good to me.
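
For readers not familiar with the pattern, the manual save/set/restore that the quoted patch removes and the sentinel that replaces it can be modelled in a few lines.  This is a standalone sketch: parser_model, flag_sentinel_model and the helper functions are hypothetical stand-ins, not the definitions in gcc/cp/parser.cc.

```cpp
#include <cassert>

// Stand-in for the one cp_parser field of interest.
struct parser_model
{
  bool in_type_id_in_expr_p = false;
};

// RAII guard: remember the flag, set it, restore it when the scope ends.
class flag_sentinel_model
{
public:
  explicit flag_sentinel_model (parser_model &parser, bool set = true)
    : parser_ (parser), saved_ (parser.in_type_id_in_expr_p)
  {
    parser_.in_type_id_in_expr_p = set;
  }
  ~flag_sentinel_model () { parser_.in_type_id_in_expr_p = saved_; }

  flag_sentinel_model (const flag_sentinel_model &) = delete;
  flag_sentinel_model &operator= (const flag_sentinel_model &) = delete;

private:
  parser_model &parser_;
  bool saved_;
};

// Stand-in for cp_parser_type_id: reports the flag value it observes.
inline bool
parse_type_id_model (const parser_model &parser)
{
  return parser.in_type_id_in_expr_p;
}

inline bool
flag_seen_while_parsing (parser_model &parser)
{
  flag_sentinel_model s (parser);       // sets the flag
  return parse_type_id_model (parser);  // restored after this returns
}

inline void
flag_sentinel_demo ()
{
  parser_model parser;
  assert (flag_seen_while_parsing (parser));  // set inside the scope
  assert (!parser.in_type_id_in_expr_p);      // restored afterwards
}
```

The extra braces added in several hunks of the patch serve the same purpose as the function scope here: they end the sentinel's lifetime right after the type-id is parsed, so the flag is restored at the same point as in the manual version.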
 
> 2024-09-19  Jakub Jelinek  
> 
>   * parser.cc (cp_parser_postfix_expression): Use
>   type_id_in_expr_sentinel instead of manually saving+setting/restoring
>   parser->in_type_id_in_expr_p around cp_parser_type_id calls.
>   (cp_parser_has_attribute_expression): Likewise.
>   (cp_parser_cast_expression): Likewise.
>   (cp_parser_sizeof_operand): Likewise.
> 
> --- gcc/cp/parser.cc.jj   2024-09-07 09:31:20.708482757 +0200
> +++ gcc/cp/parser.cc  2024-09-19 10:46:21.916155154 +0200
> @@ -7554,7 +7554,6 @@ cp_parser_postfix_expression (cp_parser
>   tree type;
>   cp_expr expression;
>   const char *saved_message;
> - bool saved_in_type_id_in_expr_p;
>  
>   /* All of these can be handled in the same way from the point
>  of view of parsing.  Begin by consuming the token
> @@ -7569,11 +7568,11 @@ cp_parser_postfix_expression (cp_parser
>   /* Look for the opening `<'.  */
>   cp_parser_require (parser, CPP_LESS, RT_LESS);
>   /* Parse the type to which we are casting.  */
> - saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
> - parser->in_type_id_in_expr_p = true;
> - type = cp_parser_type_id (parser, CP_PARSER_FLAGS_TYPENAME_OPTIONAL,
> -   NULL);
> - parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
> + {
> +   type_id_in_expr_sentinel s (parser);
> +   type = cp_parser_type_id (parser, CP_PARSER_FLAGS_TYPENAME_OPTIONAL,
> + NULL);
> + }
>   /* Look for the closing `>'.  */
>   cp_parser_require_end_of_template_parameter_list (parser);
>   /* Restore the old message.  */
> @@ -7643,7 +7642,6 @@ cp_parser_postfix_expression (cp_parser
>{
>   tree type;
>   const char *saved_message;
> - bool saved_in_type_id_in_expr_p;
>  
>   /* Consume the `typeid' token.  */
>   cp_lexer_consume_token (parser->lexer);
> @@ -7658,10 +7656,10 @@ cp_parser_postfix_expression (cp_parser
>  expression.  */
>   cp_parser_parse_tentatively (parser);
>   /* Try a type-id first.  */
> - saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
> - parser->in_type_id_in_expr_p = true;
> - type = cp_parser_type_id (parser);
> - parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
> + {
> +   type_id_in_expr_sentinel s (parser);
> +   type = cp_parser_type_id (parser);
> + }
>   /* Look for the `)' token.  Otherwise, we can't be sure that
>  we're not looking at an expression: consider `typeid (int
>  (3))', for example.  */
> @@ -7916,10 +7914,8 @@ cp_parser_postfix_expression (cp_parser
>   else
> {
>   /* Parse the type.  */
> - bool saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
> - parser->in_type_id_in_expr_p = true;
> + type_id_in_expr_sentinel s (parser);
>   type = cp_parser_type_id (parser);
> - parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
>   parens.require_close (parser);
> }
>  
> @@ -9502,11 +9498,11 @@ cp_parser_has_attribute_expression (cp_p
>   expression.  */
>cp_parser_parse_tentatively (parser);
>  
> -  bool saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
> -  parser->in_type_id_in_expr_p = true;
> -  /* Look for the type-id.  */
> -  oper = cp_parser_type_id (parser);
> -  parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
> +  {
> +type_id_in_expr_sentinel s (parser);
> +/* Look for the type-id.  */
> +oper = cp_parser_type_id (parser);
> +  }
>  
>cp_parser_parse_definitely (parser);
>  
> @@ -10268,15 +10264,13 @@ cp_parser_cast_expression (cp_parser *pa
>   cp_parser_simulate_error (parser);
>else
>   {
> -   bool saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
> -   parser->in_type_id_in_expr_p = true;
> +   type_id_in_expr_sentinel s (parser);
> /* Look for the type-id.  */
> type = cp_parser_type_id (parser);
> /* Look for the closing `)'.  */
> cp_token *close_paren = parens.require_close (parser);
> if (close_paren)
>   close_paren_loc = close_paren->location;
> -   parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
>   }
>  
>/* Restore the saved message.  */
> @@ -34299,13 +34293,11 @@ cp_parser_sizeof_operand (cp_parser* par
>   cp_parser_simulate_error (parser);
>else
>   {
> -   bool saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
> -   parser->in_type_id_in_expr_p = true;
> +   type_id_in_expr_senti

Re: [RFC/RFA][PATCH v6 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-11-22 Thread Jeff Law





On 11/22/24 10:33 AM, Mariam Arutunian wrote:



On Fri, Nov 22, 2024, 20:29 Jeff Law wrote:




On 11/13/24 7:16 AM, Mariam Arutunian wrote:
 >
 >
 > On Tue, Nov 12, 2024 at 2:15 AM Jeff Law <jeffreya...@gmail.com> wrote:
 >
 >
 >      > \ No newline at end of file
 >      > diff --git a/gcc/testsuite/gcc.target/riscv/crc-1-zbkc.c b/gcc/testsuite/gcc.target/riscv/crc-1-zbkc.c
 >      > new file mode 100644
 >      > index 000..8c627c0431a
 >      > --- /dev/null
 >      > +++ b/gcc/testsuite/gcc.target/riscv/crc-1-zbkc.c
 >      > @@ -0,0 +1,11 @@
 >      > +/* { dg-do run } */
 >      > +/* { dg-options "-fdump-tree-crc -fdump-rtl-dfinish  -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
 >      > +/* { dg-additional-options "-march=rv64gc_zbkc" { target { rv64 } } } */
 >      > +/* { dg-additional-options "-march=rv32gc_zbkc" { target { rv32 } } } */
 >     So I think we probably need to add a bit of code to the testsuite.
 >     Essentially we don't want to run this test on targets that don't have
 >     zbkc support.
 >
 >     I think we probably end up wanting something similar to what we do with
 >     vector where we have a test to tell us when V is supported.  I'm
 >     planning to pick that up.  Similarly I think we want to do something
 >     similar for Zbc.
 >
 >
 > To address this, I added code in target-supports.exp and modified the
 > relevant tests.
 > I've attached the patch. Could you please check whether it is correct?
I think that just tests if the compiler thinks the extension is enabled.
ie, did we pass Zbkb, Zbc or whatever on the command line.  The
question we need to answer is whether or not we can run such code.

The way we've done that for the V extension looks like this:

 > proc check_effective_target_riscv_v_ok { } {
 >     # If the target already supports v without any added options,
 >     # we may assume we can execute just fine.
 >     if { [check_effective_target_riscv_v] } {
 >         return 1
 >     }
 >
 >     # check if we can execute vector insns with the given hardware or
 >     # simulator
 >     set gcc_march [regsub {[[:alnum:]]*} [riscv_get_arch] &v]
 >     if { [check_runtime ${gcc_march}_exec {
 >           int main() {  asm("vsetivli t0, 9, e8, m1, tu, ma"); return 0; } } "-march=${gcc_march}"] } {
 >         return 1
 >     }
 >
 >     # Possible future extensions: If the target is a simulator, dg-add-options
 >     # might change its config to make it allow vector insns, or we might use
 >     # options to set special elf flags / sections to effect that.
 >
 >     return 0
 > }
So we compile a little program with a single vector instruction and
check that it doesn't fault.  I was thinking we could do the same thing
for Zbc and Zbkb, but I haven't had time to cobble it together yet.


  I have written similar code for ZBC, ZBKC, and ZBB, and in my previous 
reply, I attached the patch containing that code.

Here is a part from that patch:
+proc check_effective_target_riscv_zbc_ok { } {
+    # If the target already supports zbc without any added options,
+    # we may assume we can execute just fine.
+    if { [check_effective_target_riscv_zbc] } {
+ return 1
+    }
+
+    # check if we can execute zbc insns with the given hardware or
+    # simulator
+    set gcc_march [riscv_get_arch]
+    if { [check_runtime ${gcc_march}_zbc_exec {
+ int main()
+ {
+    asm ("clmul a0,a0,a1");
+    asm ("clmulh a0,a0,a1");
+    return 0;
+ } } "-march=${gcc_march}"] } {
+    return 1
+ }
+    return 0

Oh, that's exactly what I was expecting.  Maybe I just didn't read down
far enough.


Jeff
