date:20231113

Hi,

this patch enhances the equality check for REG_EQUAL notes in the vsetvl
pass.  Currently, we assume that two such notes describe the same value
when they have the same rtx representation.  This is not true when
either of the note's source operands is modified by an insn between the
two notes.

Suppose:

(insn 62 60 63 4 (set (reg:DI 17 a7 [orig:154 loop_len_54 ] [154])
        (umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
            (reg:DI 30 t5 [219]))) 442 {umindi3}
     (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
            (const_int 8 [0x8]))
        (nil)))
(insn 63 62 65 4 (set (reg:DI 15 a5 [orig:175 _103 ] [175])
        (minus:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
            (reg:DI 17 a7 [orig:154 loop_len_54 ] [154]))) 11 {subdi3}
     (nil))
(insn 65 63 66 4 (set (reg:DI 16 a6 [orig:153 loop_len_53 ] [153])
        (umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
            (reg:DI 30 t5 [219]))) 442 {umindi3}
     (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
            (const_int 8 [0x8]))
        (nil)))

Here insn 63 overwrites a5, so insn 65's REG_EQUAL note, which refers to
a5, describes a different value than insn 62's REG_EQUAL note even though
both notes have the same rtx representation.

In order to catch this situation this patch has source_equal_p check
every instruction between two notes for modification of any
participating register.
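
As a rough illustration of the check described above, here is a minimal,
self-contained sketch.  It is not the GCC code: ToyInsn, modified_between and
notes_describe_same_value are hypothetical names, and each toy insn is
assumed to define exactly one register.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an insn: the registers mentioned in its
// REG_EQUAL note and the single register it defines.
struct ToyInsn {
  std::vector<unsigned> note_regs;  // registers referenced by the note
  unsigned def_reg;                 // register written by this insn
};

// True if any insn strictly between positions first and second defines regno.
bool modified_between(const std::vector<ToyInsn> &insns, std::size_t first,
                      std::size_t second, unsigned regno) {
  for (std::size_t i = first + 1; i < second; ++i)
    if (insns[i].def_reg == regno)
      return true;
  return false;
}

// Structurally identical notes only describe the same value if none of the
// registers they mention is redefined between the two insns.
bool notes_describe_same_value(const std::vector<ToyInsn> &insns,
                               std::size_t a, std::size_t b) {
  if (a == b)
    return true;
  if (b < a)
    std::swap(a, b);  // canonicalize so that a comes first
  if (insns[a].note_regs != insns[b].note_regs)
    return false;     // notes are not even structurally equal
  for (unsigned regno : insns[b].note_regs)
    if (modified_between(insns, a, b, regno))
      return false;
  return true;
}
```

With the three insns from the example (62, 63, 65) modelled as
{{15}, 17}, {{}, 15}, {{15}, 16}, the sketch reports the notes of the first
and last insn as different because the middle insn redefines register 15.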

Regards
 Robin


gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (modify_reg_between_p): Move.
(source_equal_p): Check if source registers were modified in
between.
---
 gcc/config/riscv/riscv-vsetvl.cc | 62 
 1 file changed, 47 insertions(+), 15 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3fa25a6404d..34bf7498103 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -85,6 +85,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "predict.h"
 #include "profile-count.h"
 #include "gcse.h"
+#include "rtl-iter.h"
 
 using namespace rtl_ssa;
 using namespace riscv_vector;
@@ -548,6 +549,21 @@ get_all_sets (set_info *set, bool /* get_real_inst */ real_p,
   return hash_set ();
 }
 
+static bool
+modify_reg_between_p (insn_info *prev_insn, insn_info *curr_insn,
+                      unsigned regno)
+{
+  gcc_assert (prev_insn->compare_with (curr_insn) < 0);
+  for (insn_info *i = curr_insn->prev_nondebug_insn (); i != prev_insn;
+       i = i->prev_nondebug_insn ())
+    {
+      // no def of regno
+      if (find_access (i->defs (), regno))
+        return true;
+    }
+  return false;
+}
+
 static bool
 source_equal_p (insn_info *insn1, insn_info *insn2)
 {
@@ -561,7 +577,37 @@ source_equal_p (insn_info *insn1, insn_info *insn2)
   rtx note1 = find_reg_equal_equiv_note (rinsn1);
   rtx note2 = find_reg_equal_equiv_note (rinsn2);
   if (note1 && note2 && rtx_equal_p (note1, note2))
-    return true;
+    {
+      /* REG_EQUIVs are globally.  */
+      if (REG_NOTE_KIND (note2) == REG_EQUIV)
+        return true;
+
+      /* If both insns are the same, the notes are definitely equivalent.  */
+      if (insn2->compare_with (insn1) == 0)
+        return true;
+
+      /* Canonicalize order so insn1 is always before insn2 for the following
+         check.  */
+      if (insn2->compare_with (insn1) < 0)
+        std::swap (insn1, insn2);
+
+      /* If two REG_EQUAL notes are similar the value they calculate can still
+         be different.  The value is only identical if none of the sources have
+         been modified in between.  */
+      subrtx_iterator::array_type array;
+      FOR_EACH_SUBRTX (iter, array, note2, NONCONST)
+        {
+          if (!*iter)
+            continue;
+
+          if (!REG_P (*iter))
+            continue;
+
+          if (modify_reg_between_p (insn1, insn2, REGNO (*iter)))
+            return false;
+        }
+      return true;
+    }
   return false;
 }
 
@@ -1439,20 +1485,6 @@ private:
   && find_access (i->defs (), REGNO (info.get_avl ()));
   }
 
-  inline bool modify_reg_between_p (insn_info *prev_insn, insn_info *curr_insn,
-                                    unsigned regno)
-  {
-    gcc_assert (prev_insn->compare_with (curr_insn) < 0);
-    for (insn_info *i = curr_insn->prev_nondebug_insn (); i != prev_insn;
-         i = i->prev_nondebug_insn ())
-      {
-        // no def of regno
-        if (find_access (i->defs (), regno))
-          return true;
-      }
-    return false;
-  }
-
   inline bool reg_avl_equal_p (const vsetvl_info &prev, const vsetvl_info &next)
   {
 if (!prev.has_nonvlmax_reg_avl () || !next.has_nonvlmax_reg_avl ())
-- 
2.41.0

Re: Fwd: [PATCH, expand] Call misaligned memory reference in expand_builtin_return [PR112417]

2023-11-13 Thread HAO CHEN GUI

Sorry, forgot to cc gcc-patches.

On 2023/11/13 16:05, HAO CHEN GUI wrote:
> Andrew,
>   Could you kindly tell us what __objc_forward does?
> Does it change the memory content pointed to by args? Thanks a lot.
> 
> Thanks
> Gui Haochen
> 
> 
> libobjc/sendmsg.c.
> 
>   void *args, *res;
>
>   args = __builtin_apply_args ();
>   res = __objc_forward (rcv, op, args);
>   if (res)
>     __builtin_return (res);
>   else
>     ...
> 
> -------- Forwarded Message --------
> Subject: Re: [PATCH, expand] Call misaligned memory reference in expand_builtin_return [PR112417]
> Date: Fri, 10 Nov 2023 14:39:02 +0100
> From: Richard Biener
> To: HAO CHEN GUI
> Cc: gcc-patches, Kewen.Lin
> 
> On Fri, Nov 10, 2023 at 11:10 AM HAO CHEN GUI  wrote:
>>
>> Hi Richard,
>>
>> On 2023/11/10 17:06, Richard Biener wrote:
>>> On Fri, Nov 10, 2023 at 8:52 AM HAO CHEN GUI  wrote:
>>>>
>>>> Hi Richard,
>>>>   Thanks so much for your comments.
>>>>
>>>> On 2023/11/9 19:41, Richard Biener wrote:
>>>>> I'm not sure if the testcase is valid though?
>>>>>
>>>>> @defbuiltin{{void} __builtin_return (void *@var{result})}
>>>>> This built-in function returns the value described by @var{result} from
>>>>> the containing function.  You should specify, for @var{result}, a value
>>>>> returned by @code{__builtin_apply}.
>>>>> @enddefbuiltin
>>>>>
>>>>> I don't see __builtin_apply being used here?
>>>>
>>>> The prototype of the test case is from "__objc_block_forward" in
>>>> libobjc/sendmsg.c.
>>>>
>>>>   void *args, *res;
>>>>
>>>>   args = __builtin_apply_args ();
>>>>   res = __objc_forward (rcv, op, args);
>>>>   if (res)
>>>>     __builtin_return (res);
>>>>   else
>>>>     ...
>>>>
>>>> The __builtin_apply_args puts the return values on stack by the alignment.
>>>> But the forward function can do anything and return a void* pointer.
>>>> IMHO the alignment might be broken. So I just simplified it to use a
>>>> void* pointer as the input argument of  "__builtin_return" and skip
>>>> "__builtin_apply_args".
>>>
>>> But doesn't __objc_forward then break the contract between
>>> __builtin_apply_args and __builtin_return?
>>>
>>> That said, __builtin_return is a very special function, it's not supposed
>>> to deal with what you are fixing.  At least I think so.
>>>
>>> IMHO the bug is in __objc_block_forward.
>>
>> If so, can we document that the memory objects pointed by input argument of
>> __builtin_return have to be aligned? Then we can force the alignment in
>> __builtin_return. The customer function can do anything if gcc doesn't state
>> that.
> 
> I don't think they have to be aligned - they have to adhere to the ABI
> which __builtin_apply_args ensures.  But others might know more details
> here.
> 
>> Thanks
>> Gui Haochen
>>
>>>
>>> Richard.
>>>

>>>> Thanks
>>>> Gui Haochen

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

Does this patch fix bugs exposed by the current tests?
Or could you add a test for it?



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-11-13 16:06
To: gcc-patches; palmer; Kito Cheng; jeffreyalaw; juzhe.zh...@rivai.ai
CC: rdapp.gcc
Subject: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

> Does this patch fix bugs exposed by the current tests?
> Or could you add a test for it?

Ah, yes forgot to mention.  This fixes several tests when
testing with -march=rv64gcv_zbb.

Regards
 Robin

RE: [ARC PATCH] Provide a TARGET_FOLD_BUILTIN target hook.

2023-11-13 Thread Claudiu Zissulescu

Hi Roger,

Looks good. Please proceed with your commit.

Thank you,
Claudiu

-----Original Message-----
From: Roger Sayle  
Sent: Friday, November 3, 2023 9:43 PM
To: gcc-patches@gcc.gnu.org
Cc: 'Claudiu Zissulescu' 
Subject: [ARC PATCH] Provide a TARGET_FOLD_BUILTIN target hook.


This patch implements an arc_fold_builtin target hook to allow ARC builtins to 
be folded at the tree-level.  Currently this function converts 
__builtin_arc_swap into a LROTATE_EXPR at the tree-level, and evaluates 
__builtin_arc_norm and __builtin_arc_normw of integer constant arguments at 
compile-time.  Because ARC_BUILTIN_SWAP is now handled at the tree-level, 
UNSPEC_ARC_SWAP is no longer used, allowing it and the "swap" define_insn to be 
removed.

An example benefit of folding things at compile-time is that calling 
__builtin_arc_swap on the result of __builtin_arc_swap now eliminates both and 
generates no code, and likewise calling __builtin_arc_swap of a constant 
integer argument is evaluated at compile-time.
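
To see why a swap of a swap can fold away entirely, here is a small sketch
that models the builtin as a 32-bit rotate left by 16, which is how the
description and ChangeLog characterise it.  The names rotl32 and swap_halves
are illustrative only and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value left by n bits, for 0 < n < 32.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n) {
  return (x << n) | (x >> (32 - n));
}

// Illustrative model of the swap operation: a rotate by 16 exchanges the
// upper and lower 16-bit halves of the word.
constexpr std::uint32_t swap_halves(std::uint32_t x) { return rotl32(x, 16); }

// A constant argument can be evaluated at compile time.
static_assert(swap_halves(0x12345678u) == 0x56781234u,
              "rotate by 16 swaps the halves");

// Two rotates by 16 add up to a rotate by 32, i.e. the identity.
static_assert(swap_halves(swap_halves(0xdeadbeefu)) == 0xdeadbeefu,
              "swap of swap is the identity");
```

Once the operation is visible as a rotate at the tree level, generic folding
can combine the two rotates, which is the effect described above.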

Tested with a cross-compiler to arc-linux hosted on x86_64, with no new 
(compile-only) regressions from make -k check.
Ok for mainline if this passes Claudiu's nightly testing?


2023-11-03  Roger Sayle  

gcc/ChangeLog
* config/arc/arc.cc (TARGET_FOLD_BUILTIN): Define to
arc_fold_builtin.
(arc_fold_builtin): New function.  Convert ARC_BUILTIN_SWAP
into a rotate.  Evaluate ARC_BUILTIN_NORM and
ARC_BUILTIN_NORMW of constant arguments.
* config/arc/arc.md (UNSPEC_ARC_SWAP): Delete.
(normw): Make output template/assembler whitespace consistent.
(swap): Remove define_insn, only use of SWAP UNSPEC.
* config/arc/builtins.def: Tweak indentation.
(SWAP): Expand using rotlsi2_cnt16 instead of using swap.

gcc/testsuite/ChangeLog
* gcc.target/arc/builtin_norm-1.c: New test case.
* gcc.target/arc/builtin_norm-2.c: Likewise.
* gcc.target/arc/builtin_normw-1.c: Likewise.
* gcc.target/arc/builtin_normw-2.c: Likewise.
* gcc.target/arc/builtin_swap-1.c: Likewise.
* gcc.target/arc/builtin_swap-2.c: Likewise.
* gcc.target/arc/builtin_swap-3.c: Likewise.


Thanks in advance,
Roger
--

Re: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

I know the root cause is that

(reg:DI 15 a5 [orig:175 _103 ] [175])
(reg:DI 15 a5 [orig:174 _100 ] [174])

are considered equal by rtx_equal_p.

I think "return note1 == note2;" would simplify your code and fix this issue.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-11-13 16:12
To: juzhe.zh...@rivai.ai; gcc-patches; palmer; kito.cheng; jeffreyalaw
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

RE: [ARC PATCH] Improved DImode rotates and right shifts by one bit.

2023-11-13 Thread Claudiu Zissulescu

Looks good too. Please proceed with your commit.

Thank you for your contribution,
//Claudiu

-----Original Message-----
From: Roger Sayle  
Sent: Monday, November 6, 2023 7:30 PM
To: gcc-patches@gcc.gnu.org
Cc: 'Claudiu Zissulescu' 
Subject: [ARC PATCH] Improved DImode rotates and right shifts by one bit.

This patch improves the code generated for DImode right shifts (both arithmetic 
and logical) by a single bit, and also for DImode rotates (both left and right) 
by a single bit.  In approach, this is similar to the recently added DImode 
left shift by a single bit patch, but also builds upon i386.md's UNSPEC carry 
flag representation:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632169.html

The benefits can be seen from the four new test cases:

long long ashr(long long x) { return x >> 1; }

Before:
ashr:   asl     r2,r1,31
        lsr_s   r0,r0
        or_s    r0,r0,r2
        j_s.d   [blink]
        asr_s   r1,r1,1

After:
ashr:   asr.f   r1,r1
        j_s.d   [blink]
        rrc     r0,r0

unsigned long long lshr(unsigned long long x) { return x >> 1; }

Before:
lshr:   asl     r2,r1,31
        lsr_s   r0,r0
        or_s    r0,r0,r2
        j_s.d   [blink]
        lsr_s   r1,r1

After:
lshr:   lsr.f   r1,r1
        j_s.d   [blink]
        rrc     r0,r0

unsigned long long rotl(unsigned long long x) { return (x<<1) | (x>>63); }

Before:
rotl:   lsr     r12,r1,31
        lsr     r2,r0,31
        asl_s   r3,r0,1
        asl_s   r1,r1,1
        or      r0,r12,r3
        j_s.d   [blink]
        or_s    r1,r1,r2

After:
rotl:   add.f   r0,r0,r0
        adc.f   r1,r1,r1
        j_s.d   [blink]
        add.cs  r0,r0,1

unsigned long long rotr(unsigned long long x) { return (x>>1) | (x<<63); }

Before:
rotr:   asl     r12,r1,31
        asl     r2,r0,31
        lsr_s   r3,r0
        lsr_s   r1,r1
        or      r0,r12,r3
        j_s.d   [blink]
        or_s    r1,r1,r2

After:
rotr:   asr.f   0,r0
        rrc.f   r1,r1
        j_s.d   [blink]
        rrc     r0,r0

On CPUs without a barrel shifter the improvements are even better.
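
The carry-based sequences above can be modelled in portable C++ to check that
they compute the intended 64-bit results.  This is only a sketch: it assumes
r0 holds the low word and r1 the high word, that lsr.f and add.f/adc.f leave
the bit shifted or carried out in the carry flag, and that rrc rotates right
through carry, as the instruction descriptions in the ChangeLog suggest.  The
function names are made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Models "lsr.f r1,r1 ; rrc r0,r0": logical shift right by one of a 64-bit
// value held as two 32-bit halves.
std::uint64_t lshr1_via_carry(std::uint64_t x) {
  std::uint32_t lo = static_cast<std::uint32_t>(x);
  std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
  std::uint32_t carry = hi & 1u;   // lsr.f: bit 0 of the high word -> carry
  hi >>= 1;
  lo = (carry << 31) | (lo >> 1);  // rrc: carry enters at bit 31
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Models "add.f r0,r0,r0 ; adc.f r1,r1,r1 ; add.cs r0,r0,1": rotate left by
// one of a 64-bit value held as two 32-bit halves.
std::uint64_t rotl1_via_carry(std::uint64_t x) {
  std::uint32_t lo = static_cast<std::uint32_t>(x);
  std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
  std::uint32_t carry = lo >> 31;  // add.f: carry out of doubling the low word
  lo += lo;
  std::uint32_t carry_hi = hi >> 31;  // adc.f: carry out of the high word
  hi = hi + hi + carry;
  if (carry_hi)                    // add.cs: wrap the top bit around to bit 0
    lo += 1;
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

Comparing each model against the plain 64-bit expression, x >> 1 and
(x << 1) | (x >> 63), for a few boundary values is a quick sanity check of
the sequences.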

Tested with a cross-compiler to arc-linux hosted on x86_64, with no new 
(compile-only) regressions from make -k check.
Ok for mainline if this passes Claudiu's nightly testing?

2023-11-06  Roger Sayle  

gcc/ChangeLog
* config/arc/arc.md (UNSPEC_ARC_CC_NEZ): New UNSPEC that
represents the carry flag being set if the operand is non-zero.
(adc_f): New define_insn representing adc with updated flags.
(ashrdi3): New define_expand that only handles shifts by 1.
(ashrdi3_cnt1): New pre-reload define_insn_and_split.
(lshrdi3): New define_expand that only handles shifts by 1.
(lshrdi3_cnt1): New pre-reload define_insn_and_split.
(rrcsi2): New define_insn for rrc (SImode rotate right through carry).
(rrcsi2_carry): Likewise for rrc.f, as above but updating flags.
(rotldi3): New define_expand that only handles rotates by 1.
(rotldi3_cnt1): New pre-reload define_insn_and_split.
(rotrdi3): New define_expand that only handles rotates by 1.
(rotrdi3_cnt1): New pre-reload define_insn_and_split.
(lshrsi3_cnt1_carry): New define_insn for lsr.f.
(ashrsi3_cnt1_carry): New define_insn for asr.f.
(btst_0_carry): New define_insn for asr.f without result.

gcc/testsuite/ChangeLog
* gcc.target/arc/ashrdi3-1.c: New test case.
* gcc.target/arc/lshrdi3-1.c: Likewise.
* gcc.target/arc/rotldi3-1.c: Likewise.
* gcc.target/arc/rotrdi3-1.c: Likewise.

Thanks in advance,
Roger
--

Re: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

Sorry. It should be return note1 && note2 && note1 == note2;

juzhe.zh...@rivai.ai


Re: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

Also, as Kito previously reminded me:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635326.html 

I think you should add a dedicated test that specifies -march=rv64gcv_zbb
and uses scan-assembler to check for the correct vsetvl.

That way people like me can catch regressions of this kind even without
building a toolchain with "zbb".



juzhe.zh...@rivai.ai
 

Re: [PATCH] Avoid generate vblendps with ymm16+

On Mon, Nov 13, 2023 at 02:27:35PM +0800, Hongtao Liu wrote:
> > 1) if it isn't better to use separate alternative instead of
> >x86_evex_reg_mentioned_p, like in the patch below
> vblendps doesn't support gpr32 which is checked by x86_evex_reg_mentioned_p.
> we need to use xjm for operands[1], (I think we don't need to set
> attribute addr to gpr16 for alternative 0 since the alternative 1 is
> always available and recog will match alternative1 when gpr32 is used)

Ok, so like this then?  I've incorporated the other two tests into the patch
as well.

2023-11-13  Jakub Jelinek  
Hu, Lin1  

PR target/112435
* config/i386/sse.md (avx512vl_shuf_32x4_1,
avx512dq_shuf_64x2_1): Add
alternative with just x instead of v constraints and xjm instead of
vm and use vblendps as optimization only with that alternative.

* gcc.target/i386/avx512vl-pr112435-1.c: New test.
* gcc.target/i386/avx512vl-pr112435-2.c: New test.
* gcc.target/i386/avx512vl-pr112435-3.c: New test.

--- gcc/config/i386/sse.md.jj   2023-11-11 08:52:20.377845673 +0100
+++ gcc/config/i386/sse.md  2023-11-13 09:31:08.568935535 +0100
@@ -19235,11 +19235,11 @@ (define_expand "avx512dq_shuf_avx512dq_shuf_64x2_1"
-  [(set (match_operand:VI8F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI8F_256 0 "register_operand" "=x,v")
(vec_select:VI8F_256
  (vec_concat:
-   (match_operand:VI8F_256 1 "register_operand" "v")
-   (match_operand:VI8F_256 2 "nonimmediate_operand" "vm"))
+   (match_operand:VI8F_256 1 "register_operand" "x,v")
+   (match_operand:VI8F_256 2 "nonimmediate_operand" "xjm,vm"))
  (parallel [(match_operand 3 "const_0_to_3_operand")
 (match_operand 4 "const_0_to_3_operand")
 (match_operand 5 "const_4_to_7_operand")
@@ -19254,7 +19254,7 @@ (define_insn "avx512dq_shu
   mask = INTVAL (operands[3]) / 2;
   mask |= (INTVAL (operands[5]) - 4) / 2 << 1;
   operands[3] = GEN_INT (mask);
-  if (INTVAL (operands[3]) == 2 && !)
+  if (INTVAL (operands[3]) == 2 && ! && which_alternative == 0)
 return "vblendps\t{$240, %2, %1, %0|%0, %1, %2, 240}";
   return "vshuf64x2\t{%3, %2, %1, 
%0|%0, %1, %2, %3}";
 }
@@ -19386,11 +19386,11 @@ (define_expand "avx512vl_shuf_32x4_1"
-  [(set (match_operand:VI4F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI4F_256 0 "register_operand" "=x,v")
(vec_select:VI4F_256
  (vec_concat:
-   (match_operand:VI4F_256 1 "register_operand" "v")
-   (match_operand:VI4F_256 2 "nonimmediate_operand" "vm"))
+   (match_operand:VI4F_256 1 "register_operand" "x,v")
+   (match_operand:VI4F_256 2 "nonimmediate_operand" "xjm,vm"))
  (parallel [(match_operand 3 "const_0_to_7_operand")
 (match_operand 4 "const_0_to_7_operand")
 (match_operand 5 "const_0_to_7_operand")
@@ -19414,7 +19414,7 @@ (define_insn "avx512vl_shuf_)
+  if (INTVAL (operands[3]) == 2 && ! && which_alternative == 0)
 return "vblendps\t{$240, %2, %1, %0|%0, %1, %2, 240}";
 
   return "vshuf32x4\t{%3, %2, %1, 
%0|%0, %1, %2, %3}";
--- gcc/testsuite/gcc.target/i386/avx512vl-pr112435-1.c.jj  2023-11-13 
09:20:53.330643098 +0100
+++ gcc/testsuite/gcc.target/i386/avx512vl-pr112435-1.c 2023-11-13 
09:20:53.330643098 +0100
@@ -0,0 +1,13 @@
+/* PR target/112435 */
+/* { dg-do assemble { target { avx512vl && { ! ia32 } } } } */
+/* { dg-options "-mavx512vl -O2" } */
+
+#include 
+
+__m256i
+foo (__m256i a, __m256i b)
+{
+  register __m256i c __asm__("ymm16") = a;
+  asm ("" : "+v" (c));
+  return _mm256_shuffle_i32x4 (c, b, 2);
+}
--- gcc/testsuite/gcc.target/i386/avx512vl-pr112435-2.c.jj  2023-11-13 
09:23:04.361788598 +0100
+++ gcc/testsuite/gcc.target/i386/avx512vl-pr112435-2.c 2023-11-13 
09:34:57.186699876 +0100
@@ -0,0 +1,63 @@
+/* PR target/112435 */
+/* { dg-do assemble { target { avx512vl && { ! ia32 } } } } */
+/* { dg-options "-mavx512vl -O2" } */
+
+#include 
+
+/* vpermi128/vpermf128 */
+__m256i
+perm0 (__m256i a, __m256i b)
+{
+  register __m256i c __asm__("ymm17") = a;
+  asm ("":"+v" (c));
+  return _mm256_permute2x128_si256 (c, b, 50);
+}
+
+__m256i
+perm1 (__m256i a, __m256i b)
+{
+  register __m256i c __asm__("ymm17") = a;
+  asm ("":"+v" (c));
+  return _mm256_permute2x128_si256 (c, b, 18);
+}
+
+__m256i
+perm2 (__m256i a, __m256i b)
+{
+  register __m256i c __asm__("ymm17") = a;
+  asm ("":"+v" (c));
+  return _mm256_permute2x128_si256 (c, b, 48);
+}
+
+/* vshuf{i,f}{32x4,64x2} ymm .*/
+__m256i
+shuff0 (__m256i a, __m256i b)
+{
+  register __m256i c __asm__("ymm17") = a;
+  asm ("":"+v" (c));
+  return _mm256_shuffle_i32x4 (c, b, 2);
+}
+
+__m256
+shuff1 (__m256 a, __m256 b)
+{
+  register __m256 c __asm__("ymm17") = a;
+  asm ("":"+v" (c));
+  return _mm256_shuffle_f32x4 (c, b, 2);
+}
+
+__m256i
+shuff2 (__m256i a, __m256i b)
+{
+  register __m256i c __asm__("ym

[committed] i386: Remove j constraint letter from list of unused letters

Hi!

I've noticed the list of unused letters still lists j, even though that
constraint letter is now the first letter of the jr, jR, jm, j<, j>, jo, jV, jp,
ja, jb and jc constraints.

Committed to trunk as obvious.

2023-11-13  Jakub Jelinek  

* config/i386/constraints.md: Remove j constraint letter from list of
unused letters.

--- gcc/config/i386/constraints.md.jj   2023-11-09 09:04:18.582543884 +0100
+++ gcc/config/i386/constraints.md  2023-11-13 09:41:11.271405386 +0100
@@ -19,7 +19,7 @@
 
 ;;; Unused letters:
 ;;;   H
-;;;   j   z
+;;; z
 
 ;; Integer register constraints.
 ;; It is not necessary to define 'r' here.

Jakub

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

> FAIL: gcc.target/riscv/rvv/autovec/slp-mask-run-1.c -O3 -ftree-vectorize 
> (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c -std=c99 -O3 
> -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax (test for 
> excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c -std=c99 -O3 
> -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax (test for 
> excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c -std=c99 
> -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax (test for 
> excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c -std=c99 -O3 
> -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax (test for 
> excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-zvfh-run.c -std=c99 -O3 
> -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax (test for 
> excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vmv-imm-run.c -O3 -ftree-vectorize (test 
> for excess errors)

Hmm, I don't see those here locally:

PASS: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c -std=c99 
-O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test
PASS: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c -std=c99 -O3 
-ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test
PASS: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c -std=c99 -O3 
-ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test
PASS: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c -std=c99 -O3 
-ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test
...
PASS: gcc.target/riscv/rvv/autovec/vmv-imm-run.c -O3 -ftree-vectorize execution 
test 

Could you please post logs for them?  Maybe a spike vs qemu thing
and/or there is indeed an error in the expander still?

Regards
 Robin

Re: Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

The FAIL is as follows:

xgcc: fatal error: Cannot find suitable multilib set for 
'-march=rv32imafdcv_zicsr_zifencei_zfh_zfhmin_zve32f_zve32x_zve64d_zve64f_zve64x_zvl128b_zvl32b_zvl64b'/'-mabi=ilp32d'
compilation terminated.
compiler exited with status 1
FAIL: gcc.target/riscv/rvv/autovec/vmv-imm-run.c -O3 -ftree-vectorize (test for 
excess errors)
Excess errors:
xgcc: fatal error: Cannot find suitable multilib set for 
'-march=rv32imafdcv_zicsr_zifencei_zfh_zfhmin_zve32f_zve32x_zve64d_zve64f_zve64x_zvl128b_zvl32b_zvl64b'/'-mabi=ilp32d'
compilation terminated.

My configure options are: --with-arch=rv32gcv_zfh_zvfh --with-abi=ilp32d

I am using Spike, but I don't think the simulator causes this issue since it is a 
compile-time failure.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-11-13 16:52
To: 钟居哲; gcc-patches; palmer; kito.cheng; Jeff Law
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

Re: Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

Also, I didn't enable multi-lib.



juzhe.zh...@rivai.ai
 

Re: [PATCH] Avoid generate vblendps with ymm16+

2023-11-13 Thread Hongtao Liu

On Mon, Nov 13, 2023 at 4:45 PM Jakub Jelinek  wrote:
>
> On Mon, Nov 13, 2023 at 02:27:35PM +0800, Hongtao Liu wrote:
> > > 1) if it isn't better to use separate alternative instead of
> > >x86_evex_reg_mentioned_p, like in the patch below
> > vblendps doesn't support gpr32 which is checked by x86_evex_reg_mentioned_p.
> > we need to use xjm for operands[1], (I think we don't need to set
> > attribute addr to gpr16 for alternative 0 since the alternative 1 is
> > always available and recog will match alternative 1 when gpr32 is used)
>
> Ok, so like this then?  I've incorporated the other two tests into the patch
> as well.
LGTM.
>
> 2023-11-13  Jakub Jelinek  
> Hu, Lin1  
>
> PR target/112435
> * config/i386/sse.md (avx512vl_shuf_32x4_1,
> avx512dq_shuf_64x2_1): Add
> alternative with just x instead of v constraints and xjm instead of
> vm and use vblendps as optimization only with that alternative.
>
> * gcc.target/i386/avx512vl-pr112435-1.c: New test.
> * gcc.target/i386/avx512vl-pr112435-2.c: New test.
> * gcc.target/i386/avx512vl-pr112435-3.c: New test.
>
> --- gcc/config/i386/sse.md.jj   2023-11-11 08:52:20.377845673 +0100
> +++ gcc/config/i386/sse.md  2023-11-13 09:31:08.568935535 +0100
> @@ -19235,11 +19235,11 @@ (define_expand "avx512dq_shuf_  })
>
>  (define_insn "avx512dq_shuf_64x2_1"
> -  [(set (match_operand:VI8F_256 0 "register_operand" "=v")
> +  [(set (match_operand:VI8F_256 0 "register_operand" "=x,v")
> (vec_select:VI8F_256
>   (vec_concat:
> -   (match_operand:VI8F_256 1 "register_operand" "v")
> -   (match_operand:VI8F_256 2 "nonimmediate_operand" "vm"))
> +   (match_operand:VI8F_256 1 "register_operand" "x,v")
> +   (match_operand:VI8F_256 2 "nonimmediate_operand" "xjm,vm"))
>   (parallel [(match_operand 3 "const_0_to_3_operand")
>  (match_operand 4 "const_0_to_3_operand")
>  (match_operand 5 "const_4_to_7_operand")
> @@ -19254,7 +19254,7 @@ (define_insn "avx512dq_shu
>mask = INTVAL (operands[3]) / 2;
>mask |= (INTVAL (operands[5]) - 4) / 2 << 1;
>operands[3] = GEN_INT (mask);
> -  if (INTVAL (operands[3]) == 2 && !)
> +  if (INTVAL (operands[3]) == 2 && ! && which_alternative == 0)
>  return "vblendps\t{$240, %2, %1, %0|%0, %1, %2, 240}";
>return "vshuf64x2\t{%3, %2, %1, 
> %0|%0, %1, %2, %3}";
>  }
> @@ -19386,11 +19386,11 @@ (define_expand "avx512vl_shuf_  })
>
>  (define_insn "avx512vl_shuf_32x4_1"
> -  [(set (match_operand:VI4F_256 0 "register_operand" "=v")
> +  [(set (match_operand:VI4F_256 0 "register_operand" "=x,v")
> (vec_select:VI4F_256
>   (vec_concat:
> -   (match_operand:VI4F_256 1 "register_operand" "v")
> -   (match_operand:VI4F_256 2 "nonimmediate_operand" "vm"))
> +   (match_operand:VI4F_256 1 "register_operand" "x,v")
> +   (match_operand:VI4F_256 2 "nonimmediate_operand" "xjm,vm"))
>   (parallel [(match_operand 3 "const_0_to_7_operand")
>  (match_operand 4 "const_0_to_7_operand")
>  (match_operand 5 "const_0_to_7_operand")
> @@ -19414,7 +19414,7 @@ (define_insn "avx512vl_shuf_mask |= (INTVAL (operands[7]) - 8) / 4 << 1;
>operands[3] = GEN_INT (mask);
>
> -  if (INTVAL (operands[3]) == 2 && !)
> +  if (INTVAL (operands[3]) == 2 && ! && which_alternative == 0)
>  return "vblendps\t{$240, %2, %1, %0|%0, %1, %2, 240}";
>
>return "vshuf32x4\t{%3, %2, %1, 
> %0|%0, %1, %2, %3}";
> --- gcc/testsuite/gcc.target/i386/avx512vl-pr112435-1.c.jj  2023-11-13 
> 09:20:53.330643098 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512vl-pr112435-1.c 2023-11-13 
> 09:20:53.330643098 +0100
> @@ -0,0 +1,13 @@
> +/* PR target/112435 */
> +/* { dg-do assemble { target { avx512vl && { ! ia32 } } } } */
> +/* { dg-options "-mavx512vl -O2" } */
> +
> +#include 
> +
> +__m256i
> +foo (__m256i a, __m256i b)
> +{
> +  register __m256i c __asm__("ymm16") = a;
> +  asm ("" : "+v" (c));
> +  return _mm256_shuffle_i32x4 (c, b, 2);
> +}
> --- gcc/testsuite/gcc.target/i386/avx512vl-pr112435-2.c.jj  2023-11-13 
> 09:23:04.361788598 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512vl-pr112435-2.c 2023-11-13 
> 09:34:57.186699876 +0100
> @@ -0,0 +1,63 @@
> +/* PR target/112435 */
> +/* { dg-do assemble { target { avx512vl && { ! ia32 } } } } */
> +/* { dg-options "-mavx512vl -O2" } */
> +
> +#include 
> +
> +/* vpermi128/vpermf128 */
> +__m256i
> +perm0 (__m256i a, __m256i b)
> +{
> +  register __m256i c __asm__("ymm17") = a;
> +  asm ("":"+v" (c));
> +  return _mm256_permute2x128_si256 (c, b, 50);
> +}
> +
> +__m256i
> +perm1 (__m256i a, __m256i b)
> +{
> +  register __m256i c __asm__("ymm17") = a;
> +  asm ("":"+v" (c));
> +  return _mm256_permute2x128_si256 (c, b, 18);
> +}
> +
> +__m256i
> +perm2 (__m256i a, __m256i b)
> +{
> +  register __m256i c __asm__("ymm17") = a;
> +  asm ("":"+v" (

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

> xgcc: fatal error: Cannot find suitable multilib set for 
> '-march=rv32imafdcv_zicsr_zifencei_zfh_zfhmin_zve32f_zve32x_zve64d_zve64f_zve64x_zvl128b_zvl32b_zvl64b'/'-mabi=ilp32d'
> compilation terminated.
> compiler exited with status 1
> FAIL: gcc.target/riscv/rvv/autovec/vmv-imm-run.c -O3 -ftree-vectorize (test 
> for excess errors)
> Excess errors:
> xgcc: fatal error: Cannot find suitable multilib set for 
> '-march=rv32imafdcv_zicsr_zifencei_zfh_zfhmin_zve32f_zve32x_zve64d_zve64f_zve64x_zvl128b_zvl32b_zvl64b'/'-mabi=ilp32d'
> compilation terminated.
> 
> My compile option is : --with-arch=rv32gcv_zfh_zvfh --with-abi=ilp32d
> 
> I am using SPIKE but I don't think simulator cause such issue since it is 
> compile issue.

Ok, that's no spike issue but some config issue.
I also didn't enable multilib and those tests should work
regardless. 

The problem seems to be just with the run tests in those two
directories and they are no different than other tests so it's
probably a problem with how they are invoked by the test harness?

I'm going to configure with --with-arch=rv32gcv_zfh_zvfh --with-abi=ilp32d
to see if there is any difference.

Regards
 Robin

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

On 11/13/23 09:25, juzhe.zh...@rivai.ai wrote:
> Also, as Kito previously reminded me:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635326.html 
>  
> 
> I think you should add a dedicated test that specifies 
> -march=rv64gcv_zbb,
> and then use scan-assembler to check for the correct vsetvl.
> 
> That way people like me can avoid regressing this issue 
> even without building a toolchain with "zbb".

Yes, makes sense.  Added.

Regarding the use of == instead of rtx_equal_p.  Hmm, I'm not sure
if pointer equality works.  Do we really re-use the exact rtx note
for a different insn?
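
To illustrate the distinction, here is a minimal sketch with a toy
expression type.  It is not GCC's rtx or rtx_equal_p; toy_expr and
toy_equal_p are hypothetical names used only for illustration.  Two
separately allocated trees of the same shape compare unequal by
pointer but equal structurally, which is presumably why note1 == note2
rejects cases that rtx_equal_p accepts:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node; a stand-in for an RTL expression, not GCC's rtx.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_UMIN };

struct toy_expr {
  enum toy_code code;
  long value;                  /* register number or constant value */
  const struct toy_expr *op0;  /* operands, NULL for leaves */
  const struct toy_expr *op1;
};

/* Structural comparison: same code, same payload, equal operands.  */
static bool toy_equal_p(const struct toy_expr *a, const struct toy_expr *b)
{
  if (a == b)
    return true;               /* identical pointers are trivially equal */
  if (a == NULL || b == NULL)
    return false;
  if (a->code != b->code || a->value != b->value)
    return false;
  return toy_equal_p(a->op0, b->op0) && toy_equal_p(a->op1, b->op1);
}
```

For two independently built umin (reg 15, const 8) trees n1 and n2,
&n1 == &n2 is false while toy_equal_p (&n1, &n2) is true.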

When I use note1 == note2 (instead of equal_p) there are regressions:

FAIL: gcc.target/riscv/rvv/autovec/vls/dup-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable  check-function-bodies foo10
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable  check-function-bodies foo10
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-3.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable  check-function-bodies foo10
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-102.c   -O2   scan-assembler-times 
vsetvli 1
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-102.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times vsetvli 1
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-102.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 1

So what we could do instead is rtx_equal_p for REG_EQUIV and skip
REG_EQUAL altogether?

Regards
 Robin

Re: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

I just checked definition of REG_EQUAL and REG_EQUIV.

As you said, REG_EQUIV is more reasonable. I agree with using rtx_equal_p on 
REG_EQUIV and skipping REG_EQUAL.
Could you check whether that fixes your issues?



juzhe.zh...@rivai.ai
 

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

> I'm going to configure with --with-arch=rv32gcv_zfh_zvfh --with-abi=ilp32d
> to see if there is any difference.

No change for me, how do you invoke the testsuite? I.e. Which target board?

Regards
 Robin

[wwwdocs][committed] projects/gomp: Update for TR12, update impl. status

2023-11-13 Thread Tobias Burnus


A new OpenMP 6.0 preview, Technical Report (TR) 12 has been released in time 
for Supercomputing 2023 (SC23),
cf. https://www.openmp.org/specifications/

This commit links to the new spec (see bottom of change), it also updates the
implementation status of some items for 'allocate', 'indirect' and C++23/C23 
attributes,
and, mainly, updates it for new features added between TR11 and TR12.

Committed - looks like:
  https://gcc.gnu.org/projects/gomp/
esp, see "TR 12" and "OpenMP Releases and Status"

Tobias

PS: The libgomp.texi update is at 
https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Implementation-Status.html
-> TR12 (I need to fix the typo 'c(a)lause' there) and some update for
https://gcc.gnu.org/gcc-14/changes.html is also eventually required.
But as more features keep getting added, there is no rush.
commit ad756d3cfed3007d8c07c7f22facf24b202f1160
Author: Tobias Burnus 
Date:   Mon Nov 13 10:23:25 2023 +0100

projects/gomp: Update for TR12, update impl. status

This adds a link to the new TR12 (second OpenMP 6.0 preview); updates
the status + TR12 to-do list from libgomp.texi and adds a couple of
missing ''.
---
 htdocs/projects/gomp/index.html | 248 +---
 1 file changed, 205 insertions(+), 43 deletions(-)

diff --git a/htdocs/projects/gomp/index.html b/htdocs/projects/gomp/index.html
index 7f0b97c3..bc472747 100644
--- a/htdocs/projects/gomp/index.html
+++ b/htdocs/projects/gomp/index.html
@@ -29,7 +29,7 @@ OpenMP and OpenACC are supported with GCC's C, C++ and Fortran compilers.
   3.1 · 4.0 ·
   4.5 · 5.0 ·
   5.1 · 5.2 ·
-  TR 11
+  TR 12
   OpenMP Releases and Status
 
 
@@ -480,7 +480,7 @@ than listed, depending on resolved corner cases and optimizations.
   
 allocate directive
 GCC 14
-Only C, only stack variables
+Only C and Fortran, only stack variables
   
   
 Discontiguous array section with target update construct
@@ -555,7 +555,7 @@ than listed, depending on resolved corner cases and optimizations.
   
 align clause in allocate directive
 GCC 14
-Only C (and only stack variables)
+Only C and Fortran (and only stack variables)
   
   
 align modifier in allocate clause
@@ -708,14 +708,14 @@ than listed, depending on resolved corner cases and optimizations.
 
   
   
-iterators in target update motion clauses and map clauses
+.terators in target update motion clauses and map clauses
 No
 
   
   
-indirect calls to the device version of a procedure or function in target regions
-No
-
+Indirect calls to the device version of a procedure or function in target regions
+GCC 14
+Only C and C++
   
   
 interop directive
@@ -745,7 +745,7 @@ than listed, depending on resolved corner cases and optimizations.
   
   
 For Fortran, diagnose placing declarative before/between USE,
-  IMPORT, and IMPLICIT as invalid
+  IMPORT, and IMPLICIT as invalid
 No
 
   
@@ -756,8 +756,8 @@ than listed, depending on resolved corner cases and optimizations.
   
   
 indirect clause in declare target
-No
-
+GCC 14
+Only C and C++
   
   
 device_type(nohost)/device_type(host) for variables
@@ -964,6 +964,12 @@ than listed, depending on resolved corner cases and optimizations.
 No
 
   
+  
+Invoke virtual member functions of C++ objects created on the host
+  device on other devices
+No
+
+  
 
 
 
@@ -976,9 +982,9 @@ code, the omx sentinel is warned for with -Wsurprising
 (enabled by -Wall). Unknown clauses are always rejected with an
 error.
 
-OpenMP Technical Report 11
+OpenMP Technical Report 12
 
-Technical Report (TR) 11 is the first preview for OpenMP 6.0.
+Technical Report (TR) 12 is the second preview for OpenMP 6.0.
 
 
 
@@ -995,18 +1001,90 @@ error.
 Backward compatibility
   
   
-The decl attribute was added to the C++ attribute syntax
-GCC 14
-
+Full support for C 23 was added
+GCC 9
+/ GCC 14
+Increasing coverage (since GCC 9 -std=c2x,
+since GCC 14 -std=c23)
   
   
-_ALL suffix to the device-scope environment variables
+Full support for C++ 23 was added
+GCC 11
+/ GCC 14
+https://gcc.gnu.org/projects/cxx-status.html";>C++ Implementation Status;
+increasing coverage (since GCC 11 -std=c++2b,
+since GCC 14 -std=c++23)
+  
+  
+_ALL suffix to the device-scope environment variables
 GCC 13
 Host device number wrongly accepted
   
+  
+num_threads now accepts a list
+No
+
+  
+  
+Supporting increments with abstract names in OMP_PLACES
+No
+
+  
+  
+Extension of OMP_DEFAULT_DEVICE and new
+  OMP_AVAILABLE_DEVICES environmen

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

On 11/13/23 10:30, juzhe.zh...@rivai.ai wrote:
> I just checked definition of REG_EQUAL and REG_EQUIV.
> 
> As you said, REG_EQUIV is more reasonable. I agree with using rtx_equal_p on 
> REG_EQUIV and skipping REG_EQUAL.
> Could you check whether that fixes your issues?

Yes it would fix the issues.  I just figured we could work
a bit harder and also catch cases where two "different"
REG_EQUALS would still be the same.  But I'm not sure whether
such cases exist at all (leaning towards no...).

Going to post v2 after the tests ran.

Regards
 Robin

Re: Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

Ok. Lehua is going to take care of this issue. He has reproduced it.



juzhe.zh...@rivai.ai
 

Re: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

For @code{REG_EQUIV}, the register is equivalent to @var{op} throughout
the entire function, and could validly be replaced in all its
occurrences by @var{op}.  (``Validly'' here refers to the data flow of
the program; simple replacement may make some insns invalid.)  For
example, when a constant is loaded into a register that is never
assigned any other value, this kind of note is used.
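
For contrast, a REG_EQUAL note only promises equality at the end of
the insn that carries it.  A minimal C sketch of the a5 sequence from
the original patch description (the function and variable names are
made up for illustration) shows how the textually identical expression
umin (a5, 8) can yield two different values once a5 is overwritten in
between:

```c
#include <stdint.h>

/* Unsigned minimum, standing in for the umin:DI operation.  */
static uint64_t umin_u64(uint64_t x, uint64_t y)
{
  return x < y ? x : y;
}

/* Mirrors the shape of the three insns discussed in the thread:
     len1 = umin (a5, 8);  a5 = a5 - len1;  len2 = umin (a5, 8).
   Both "notes" read umin (a5, 8), but a5 changes in between.  */
static void two_lengths(uint64_t a5, uint64_t *len1, uint64_t *len2)
{
  *len1 = umin_u64(a5, 8);
  a5 -= *len1;               /* a5 is overwritten between the two notes */
  *len2 = umin_u64(a5, 8);
}
```

For an initial a5 of 10 this gives len1 = 8 and len2 = 2, so treating
the two expressions as the same value would be wrong.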

I think REG_EQUIV is what I want, so you can test it to see whether there are 
regressions in the current tests.

juzhe.zh...@rivai.ai


Re: Fwd: [PATCH, expand] Call misaligned memory reference in expand_builtin_return [PR112417]

On Mon, Nov 13, 2023 at 9:09 AM HAO CHEN GUI  wrote:
>
> Sorry, forgot to cc gcc-patches.
>
> On 2023/11/13 16:05, HAO CHEN GUI wrote:
> > Andrew,
> >   Could you kindly inform us what's the functionality of __objc_forward?
> > Does it change the memory content pointed by args? Thanks a lot.
> >
> > Thanks
> > Gui Haochen
> >
> >
> > libobjc/sendmsg.c.
> >
> >void *args, *res;
> >
> >args = __builtin_apply_args ();
> >res = __objc_forward (rcv, op, args);

maybe this should use __builtin_apply and the
__builtin_return use 'args', not 'res' as the return value?

> >if (res)
> >  __builtin_return (res);
> >else
> >  ...
> >
> >  Forwarded message 
> > Subject: Re: [PATCH, expand] Call misaligned memory reference in 
> > expand_builtin_return [PR112417]
> > Date: Fri, 10 Nov 2023 14:39:02 +0100
> > From: Richard Biener 
> > To: HAO CHEN GUI 
> > Cc: gcc-patches , Kewen.Lin 
> >
> > On Fri, Nov 10, 2023 at 11:10 AM HAO CHEN GUI  wrote:
> >>
> >> Hi Richard,
> >>
> >> On 2023/11/10 17:06, Richard Biener wrote:
> >>> On Fri, Nov 10, 2023 at 8:52 AM HAO CHEN GUI  
> >>> wrote:
> 
>  Hi Richard,
>    Thanks so much for your comments.
> 
>  On 2023/11/9 19:41, Richard Biener wrote:
> > I'm not sure if the testcase is valid though?
> >
> > @defbuiltin{{void} __builtin_return (void *@var{result})}
> > This built-in function returns the value described by @var{result} from
> > the containing function.  You should specify, for @var{result}, a value
> > returned by @code{__builtin_apply}.
> > @enddefbuiltin
> >
> > I don't see __builtin_apply being used here?
> 
>  The prototype of the test case is from "__objc_block_forward" in
>  libobjc/sendmsg.c.
> 
>    void *args, *res;
> 
>    args = __builtin_apply_args ();
>    res = __objc_forward (rcv, op, args);
>    if (res)
>  __builtin_return (res);
>    else
>  ...
> 
>  The __builtin_apply_args puts the return values on stack by the 
>  alignment.
>  But the forward function can do anything and return a void* pointer.
>  IMHO the alignment might be broken. So I just simplified it to use a
>  void* pointer as the input argument of  "__builtin_return" and skip
>  "__builtin_apply_args".
> >>>
> >>> But doesn't __objc_forward then break the contract between
> >>> __builtin_apply_args and __builtin_return?
> >>>
> >>> That said, __builtin_return is a very special function, it's not supposed
> >>> to deal with what you are fixing.  At least I think so.
> >>>
> >>> IMHO the bug is in __objc_block_forward.
> >>
> >> If so, can we document that the memory objects pointed by input argument of
> >> __builtin_return have to be aligned? Then we can force the alignment in
> >> __builtin_return. The customer function can do anything if gcc doesn't 
> >> state
> >> that.
> >
> > I don't think they have to be aligned - they have to adhere to the ABI
> > which __builtin_apply_args ensures.  But others might know more details
> > here.
> >
> >> Thanks
> >> Gui Haochen
> >>
> >>>
> >>> Richard.
> >>>
> 
>  Thanks
>  Gui Haochen

Re: Principles of the C99 testsuite conversion

These changes are now in, for i686-linux-gnu, powerpc64le-linux-gnu,
x86_64-linux-gnu.  For aarch64-linux-gnu, there's one change that would
benefit from maintainer review:

  [PATCH] aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c
  


(On aarch64-linux-gnu, we also need a libgcc fix, but that is a
different matter.)

Thanks,
Florian

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.


Hi Robin,

Can you show me the compile command in gcc.log for the 
slp-mask-run-1.exe like below? I'd like to see the -march option on 
your side.


Executing on host: 
/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/xgcc 
-B/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/ 

/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c 
 -march=rv64gcv_zvfh_zfh -mabi=lp64d -mcmodel=medany 
-fdiagnostics-plain-output  -O3 -ftree-vectorize -ansi -pedantic-errors 
-march=rv64gcv_zfh -mabi=lp64d -O3 -std=gnu99 -O3 
--param=riscv-autovec-preference=scalable  -lm  -o 
./slp-mask-run-1.exe(timeout = 6000)
spawn -ignore SIGHUP 
/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/xgcc 
-B/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/ 
/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c 
-march=rv64gcv_zvfh_zfh -mabi=lp64d -mcmodel=medany 
-fdiagnostics-plain-output -O3 -ftree-vectorize -ansi -pedantic-errors 
-march=rv64gcv_zfh -mabi=lp64d -O3 -std=gnu99 -O3 
--param=riscv-autovec-preference=scalable -lm -o ./slp-mask-run-1.exe


On 2023/11/13 17:31, Robin Dapp wrote:

I'm going to configure with --with-arch=rv32gcv_zfh_zvfh --with-abi=ilp32d
to see if there is any difference.


No change for me, how do you invoke the testsuite? I.e. Which target board?

Regards
  Robin


--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai

Re: [PATCH v5] C, ObjC: Add -Wunterminated-string-initialization

2023-11-13 Thread Alejandro Colomar

Hi,

Gentle ping, just again a little before v14 stage 3.

Do I need to do anything else with this patch?  The CI seemed to say
it's ok.

Cheers,
Alex

On Sun, Oct 01, 2023 at 06:24:00PM +0200, Alejandro Colomar wrote:
> Warn about the following:
> 
> char  s[3] = "foo";
> 
> Initializing a char array with a string literal of the same length as
> the size of the array is usually a mistake.  Rarely is the case where
> one wants to create a non-terminated character sequence from a string
> literal.
> 
> In some cases, for writing faster code, one may want to use arrays
> instead of pointers, since that removes the need for storing an array of
> pointers apart from the strings themselves.
> 
> char  *log_levels[]   = { "info", "warning", "err" };
> vs.
> char  log_levels[][7] = { "info", "warning", "err" };
> 
> This forces the programmer to specify a size, which might change if a
> new entry is later added.  Having no way to enforce null termination is
> very dangerous, however, so it is useful to have a warning for this, so
> that the compiler can make sure that the programmer didn't make any
> mistakes.  This warning catches the bug above, so that the programmer
> will be able to fix it and write:
> 
> char  log_levels[][8] = { "info", "warning", "err" };
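
A minimal sketch of the problem described above, assuming a
hypothetical helper terminated_within that is not part of the patch:
with rows of 7 bytes, "warning" fills the whole row and no terminating
null byte is stored, while rows of 8 bytes leave room for it.

```c
#include <stddef.h>
#include <string.h>

/* "warning" has 7 characters, so 7-byte rows leave no room for '\0'.
   This is valid C, and it is the case the new warning is meant to flag.  */
static const char levels7[][7] = { "info", "warning", "err" };

/* One extra byte per row keeps every entry null-terminated.  */
static const char levels8[][8] = { "info", "warning", "err" };

/* Returns 1 if a '\0' occurs within the first n bytes of row, else 0.  */
static int terminated_within(const char *row, size_t n)
{
  return memchr(row, '\0', n) != NULL;
}
```

Here terminated_within (levels7[1], 7) is 0, whereas
terminated_within (levels8[1], 8) is 1; passing levels7[1] to a
function that expects a string would read past the end of the row.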
> 
> This warning already existed as part of -Wc++-compat, but this patch
> allows enabling it separately.  It is also included in -Wextra, since
> it may not always be desired (when unterminated character sequences are
> wanted), but it's likely to be desired in most cases.
> 
> Since Wc++-compat now includes this warning, the test has to be modified
> to expect the text of the new warning too, in .
> 
> Link: 
> Link: 
> Link: 
> 
> Acked-by: Doug McIlroy 
> Cc: "G. Branden Robinson" 
> Cc: Ralph Corderoy 
> Cc: Dave Kemper 
> Cc: Larry McVoy 
> Cc: Andrew Pinski 
> Cc: Jonathan Wakely 
> Cc: Andrew Clayton 
> Cc: Martin Uecker 
> Cc: David Malcolm 
> Signed-off-by: Alejandro Colomar 
> ---
> 
> v5:
> 
> -  Fix existing C++-compat tests.  [reported by ]
> 
> 
>  gcc/c-family/c.opt | 4 
>  gcc/c/c-typeck.cc  | 6 +++---
>  gcc/testsuite/gcc.dg/Wcxx-compat-14.c  | 2 +-
>  gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c | 6 ++
>  4 files changed, 14 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c
> 
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 44b9c862c14..e8f6b836836 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -1407,6 +1407,10 @@ Wunsuffixed-float-constants
>  C ObjC Var(warn_unsuffixed_float_constants) Warning
>  Warn about unsuffixed float constants.
>  
> +Wunterminated-string-initialization
> +C ObjC Var(warn_unterminated_string_initialization) Warning LangEnabledBy(C 
> ObjC,Wextra || Wc++-compat)
> +Warn about character arrays initialized as unterminated character sequences 
> by a string literal.
> +
>  Wunused
>  C ObjC C++ ObjC++ LangEnabledBy(C ObjC C++ ObjC++,Wall)
>  ; documented in common.opt
> diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
> index e55e887da14..7df9de819ed 100644
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -8399,11 +8399,11 @@ digest_init (location_t init_loc, tree type, tree 
> init, tree origtype,
>   pedwarn_init (init_loc, 0,
> ("initializer-string for array of %qT "
>  "is too long"), typ1);
> -   else if (warn_cxx_compat
> +   else if (warn_unterminated_string_initialization
>  && compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
> - warning_at (init_loc, OPT_Wc___compat,
> + warning_at (init_loc, OPT_Wunterminated_string_initialization,
>   ("initializer-string for array of %qT "
> -  "is too long for C++"), typ1);
> +  "is too long"), typ1);
> if (compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
>   {
> unsigned HOST_WIDE_INT size
> diff --git a/gcc/testsuite/gcc.dg/Wcxx-compat-14.c 
> b/gcc/testsuite/gcc.dg/Wcxx-compat-14.c
> index 23783711be6..6df0ee197cc 100644
> --- a/gcc/testsuite/gcc.dg/Wcxx-compat-14.c
> +++ b/gcc/testsuite/gcc.dg/Wcxx-compat-14.c
> @@ -2,5 +2,5 @@
>  /* { dg-options "-Wc++-compat" } */
>  
>  char a1[] = "a";
> -char a2[1] = "a";/* { dg-warning "C\[+\]\[+\]" } */
> +char a2[1] = "a";/* { dg-warning "initializer-string for array of 'char' 
> is too long" } */
>  char a3[2] = "a";
> diff --git a/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c 
> b/gcc/testsuite/gcc.dg/Wunterminated

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

Hi Lehua,

> Executing on host: 
> /work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/xgcc
>  
> -B/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/
> /work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c
>   -march=rv64gcv_zvfh_zfh -mabi=lp64d -mcmodel=medany 
> -fdiagnostics-plain-output  -O3 -ftree-vectorize -ansi -pedantic-errors 
> -march=rv64gcv_zfh -mabi=lp64d -O3 -std=gnu99 -O3 
> --param=riscv-autovec-preference=scalable  -lm  -o ./slp-mask-run-1.exe   
>  (timeout = 6000)
> spawn -ignore SIGHUP 
> /work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/xgcc
>  
> -B/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/
>  
> /work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c
>  -march=rv64gcv_zvfh_zfh -mabi=lp64d -mcmodel=medany 
> -fdiagnostics-plain-output -O3 -ftree-vectorize -ansi -pedantic-errors 
> -march=rv64gcv_zfh -mabi=lp64d -O3 -std=gnu99 -O3 
> --param=riscv-autovec-preference=scalable -lm -o ./slp-mask-run-1.exe


Executing on host: /home/rdapp/projects/gcc32/build/gcc/xgcc 
-B/home/rdapp/projects/gcc32/build/gcc/  
/home/rdapp/projects/gcc32/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c
  -march=rv32gcv_zvfh   -fdiagnostics-plain-output  -O3 -ftree-vectorize -ansi 
-pedantic-errors -march=rv32gcv_zfh -mabi=ilp32d -O3 -std=gnu99 -O3 
--param=riscv-autovec-preference=scalable  -lm  -o ./slp-mask-run-1.exe
(timeout = 300) 

spawn -ignore SIGHUP /home/rdapp/projects/gcc32/build/gcc/xgcc 
-B/home/rdapp/projects/gcc32/build/gcc/ 
/home/rdapp/projects/gcc32/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c
 -march=rv32gcv_zvfh -fdiagnostics-plain-output -O3 -ftree-vectorize -ansi 
-pedantic-errors -march=rv32gcv_zfh -mabi=ilp32d -O3 -std=gnu99 -O3 
--param=riscv-autovec-preference=scalable -lm -o ./slp-mask-run-1.exe

That's for the 32-bit version now (yours looks 64 bit).

Regards
 Robin

Re: [PATCH V2] RISC-V: Optimize combine sequence by merge approach

Hi Juzhe,

LGTM apart from:

> +  int64_t a = -1789089.23423;
> +  int64_t b = -8916156.45644;

What's that? :)  Doesn't really matter of course but please change to
a proper integer.  OK with that changed.
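
For reference, a small sketch of what such an initializer does in C
(the function names are hypothetical): converting a floating constant
to an integer type discards the fractional part, so the test
effectively uses -1789089 and -8916156.

```c
#include <stdint.h>

/* The floating constant is converted to int64_t on initialization;
   the conversion truncates toward zero.  */
static int64_t init_a(void)
{
  int64_t a = -1789089.23423;
  return a;
}

static int64_t init_b(void)
{
  int64_t b = -8916156.45644;
  return b;
}
```

Writing the integer values directly states the intent and avoids
literal-conversion warnings that some compilers emit.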

Regards
 Robin

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.





On 2023/11/13 17:59, Robin Dapp wrote:

Hi Lehua,


Executing on host: 
/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/xgcc
 
-B/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/
/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c
  -march=rv64gcv_zvfh_zfh -mabi=lp64d -mcmodel=medany 
-fdiagnostics-plain-output  -O3 -ftree-vectorize -ansi -pedantic-errors 
-march=rv64gcv_zfh -mabi=lp64d -O3 -std=gnu99 -O3 
--param=riscv-autovec-preference=scalable  -lm  -o ./slp-mask-run-1.exe    
(timeout = 6000)
spawn -ignore SIGHUP 
/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/xgcc
 
-B/work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/
 
/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c
 -march=rv64gcv_zvfh_zfh -mabi=lp64d -mcmodel=medany -fdiagnostics-plain-output 
-O3 -ftree-vectorize -ansi -pedantic-errors -march=rv64gcv_zfh -mabi=lp64d -O3 
-std=gnu99 -O3 --param=riscv-autovec-preference=scalable -lm -o 
./slp-mask-run-1.exe



Executing on host: /home/rdapp/projects/gcc32/build/gcc/xgcc 
-B/home/rdapp/projects/gcc32/build/gcc/  
/home/rdapp/projects/gcc32/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c
  -march=rv32gcv_zvfh   -fdiagnostics-plain-output  -O3 -ftree-vectorize -ansi 
-pedantic-errors -march=rv32gcv_zfh -mabi=ilp32d -O3 -std=gnu99 -O3 
--param=riscv-autovec-preference=scalable  -lm  -o ./slp-mask-run-1.exe
(timeout = 300)
spawn -ignore SIGHUP /home/rdapp/projects/gcc32/build/gcc/xgcc 
-B/home/rdapp/projects/gcc32/build/gcc/ 
/home/rdapp/projects/gcc32/gcc/testsuite/gcc.target/riscv/rvv/autovec/slp-mask-run-1.c
 -march=rv32gcv_zvfh -fdiagnostics-plain-output -O3 -ftree-vectorize -ansi 
-pedantic-errors -march=rv32gcv_zfh -mabi=ilp32d -O3 -std=gnu99 -O3 
--param=riscv-autovec-preference=scalable -lm -o ./slp-mask-run-1.exe


Looks like your configure is --with-march=rv32gcv_zvfh, can you change 
to --with-march=rv32gcv_zvfh_zfh?


--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

> Looks like your configure is --with-march=rv32gcv_zvfh, can you change to 
> --with-march=rv32gcv_zvfh_zfh?

From config.log:

  $ ../configure --prefix=/home/rdapp/projects/builds/gcc 
--target=riscv32-unknown-linux-gnu --disable-nls --disable-multilib 
--disable-bootstrap 
--with-sysroot=/home/rdapp/projects/x-tools/riscv32-unknown-linux-gnu/riscv32-unknown-linux-gnu/sysroot
 --disable-libsanitizer --with-arch=rv32gcv_zfh_zvfh --with-abi=ilp32d 
--enable-languages=c,c++

How do you invoke the testsuite and which target board?

Regards
 Robin

Re: Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

If there is a difference between them, I think we should fix riscv-common.cc,
since "zvfh_zfh" should not behave differently from "zfh_zvfh".



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-11-13 18:17
To: Lehua Ding; juzhe.zh...@rivai.ai; gcc-patches; palmer; kito.cheng; 
jeffreyalaw
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

Re: Re: [PATCH V2] RISC-V: Optimize combine sequence by merge approach

Thanks for noticing it.
Will commit it after adjusting the testcase.

Thanks.

juzhe.zh...@rivai.ai

From: Robin Dapp
Date: 2023-11-13 18:05
To: Juzhe-Zhong; gcc-patches
CC: rdapp.gcc; kito.cheng; kito.cheng; jeffreyalaw
Subject: Re: [PATCH V2] RISC-V: Optimize combine sequence by merge approach
Hi Juzhe,

LGTM apart from:

> +  int64_t a = -1789089.23423;
> +  int64_t b = -8916156.45644;

What's that? :)  Doesn't really matter of course but please change to
a proper integer.  OK with that changed.

Regards
Robin

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

On 11/13/23 10:38, juzhe.zh...@rivai.ai wrote:
> For @code{REG_EQUIV}, the register is equivalent to @var{op} throughout
> the entire function, and could validly be replaced in all its
> occurrences by @var{op}.  (``Validly'' here refers to the data flow of
> the program; simple replacement may make some insns invalid.)  For
> example, when a constant is loaded into a register that is never
> assigned any other value, this kind of note is used.
> 
> I think REG_EQUIV is what I want. So I think you can test it to see if there
> is any regression on the current tests.

Let's keep the REG_EQUAL optimization for later in case we find
a case that triggers it.

Regards
 Robin

Subject: [PATCH v2] RISC-V: vsetvl: Refine REG_EQUAL equality.

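This patch enhances the equality check for REG_EQUAL notes in the vsetvl
pass by requiring pointer equality (the == operator) instead of relying
on rtx_equal_p alone.  With that, in situations like the following, where
insn 63 overwrites a5 between the two notes, the REG_EQUAL notes of
insn 62 and insn 65 are not considered equal anymore.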

(insn 62 60 63 4 (set (reg:DI 17 a7 [orig:154 loop_len_54 ] [154])
(umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
(reg:DI 30 t5 [219]))) 442 {umindi3}
 (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
(const_int 8 [0x8]))
(nil)))
(insn 63 62 65 4 (set (reg:DI 15 a5 [orig:175 _103 ] [175])
(minus:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
(reg:DI 17 a7 [orig:154 loop_len_54 ] [154]))) 11 {subdi3}
 (nil))
(insn 65 63 66 4 (set (reg:DI 16 a6 [orig:153 loop_len_53 ] [153])
(umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
(reg:DI 30 t5 [219]))) 442 {umindi3}
 (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
(const_int 8 [0x8]))
(nil)))
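
For illustration only, here is a small C model of the three insns above.
It is not GCC code: the function names are made up, the starting value
a5 = 10 is arbitrary, and t5 is assumed to hold the constant 8 that
appears in the notes.  Both notes read umin (a5, 8), yet they describe
different values because insn 63 rewrites a5 in between:

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned minimum, standing in for the RTL umin operation.  */
uint64_t
umin (uint64_t a, uint64_t b)
{
  return a < b ? a : b;
}

/* Values produced by the two umin insns.  */
struct loop_lens
{
  uint64_t a7; /* set by insn 62 */
  uint64_t a6; /* set by insn 65 */
};

/* Replay insns 62, 63 and 65 on plain integers.  */
struct loop_lens
run_insns (uint64_t a5, uint64_t t5)
{
  struct loop_lens r;
  r.a7 = umin (a5, t5); /* insn 62: a7 = umin (a5, t5) */
  a5 = a5 - r.a7;       /* insn 63: a5 = a5 - a7 */
  r.a6 = umin (a5, t5); /* insn 65: a6 = umin (a5, t5) */
  return r;
}

/* Made-up inputs: the two "equal looking" notes give 8 and 2.  */
void
check_example (void)
{
  struct loop_lens r = run_insns (10, 8);
  assert (r.a7 == 8);
  assert (r.a6 == 2);
}
```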

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (source_equal_p): Use pointer
equality for REG_EQUAL.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb_run-2.c: New test.
---
 gcc/config/riscv/riscv-vsetvl.cc  | 12 +++-
 .../partial/multiple_rgroup_zbb_run-2.c   | 19 +++
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb_run-2.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3fa25a6404d..0eea33dd0e1 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -561,7 +561,17 @@ source_equal_p (insn_info *insn1, insn_info *insn2)
   rtx note1 = find_reg_equal_equiv_note (rinsn1);
   rtx note2 = find_reg_equal_equiv_note (rinsn2);
   if (note1 && note2 && rtx_equal_p (note1, note2))
-    return true;
+    {
+      /* REG_EQUIVs are invariant at function scope.  */
+      if (REG_NOTE_KIND (note2) == REG_EQUIV)
+        return true;
+
+      /* REG_EQUAL are not so in order to consider them similar the RTX they
+         point to must be identical.  We could also allow "rtx_equal"
+         REG_EQUALs but would need to check if no insn between them modifies
+         any of their sources.  */
+      return note1 == note2;
+    }
   return false;
 }
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb_run-2.c
new file mode 100644
index 000..aeb337dc7ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb_run-2.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-march=rv64gcv_zbb --param riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-2.c"
+
+int main (void)
+{
+  TEST_ALL (run_1)
+  TEST_ALL (run_2)
+  TEST_ALL (run_3)
+  TEST_ALL (run_4)
+  TEST_ALL (run_5)
+  TEST_ALL (run_6)
+  TEST_ALL (run_7)
+  TEST_ALL (run_8)
+  TEST_ALL (run_9)
+  TEST_ALL (run_10)
+  return 0;
+}
-- 
2.41.0
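
The comment in the hunk above mentions the alternative of accepting
"rtx_equal" REG_EQUAL notes as long as no insn in between modifies their
sources.  A rough, self-contained C sketch of that idea follows.  The
types and names (toy_insn, modified_between_p, notes_equal_p) are
hypothetical and only model a straight-line insn sequence with a single
register per note; the real pass works on rtl_ssa insn_info and rtx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_REG (-1)

/* Toy insn: one destination register plus an optional REG_EQUAL-like
   note of the form umin (note_reg, note_const).  note_reg is NO_REG
   when the insn carries no note.  */
struct toy_insn
{
  int dest;
  int note_reg;
  long note_const;
};

/* Return true if an insn strictly between FIRST and SECOND writes REG.  */
bool
modified_between_p (const struct toy_insn *insns, size_t first,
                    size_t second, int reg)
{
  for (size_t i = first + 1; i < second; i++)
    if (insns[i].dest == reg)
      return true;
  return false;
}

/* Structural equality of the notes plus the "source not modified in
   between" condition.  */
bool
notes_equal_p (const struct toy_insn *insns, size_t first, size_t second)
{
  const struct toy_insn *a = &insns[first];
  const struct toy_insn *b = &insns[second];
  if (a->note_reg == NO_REG || b->note_reg == NO_REG)
    return false;
  if (a->note_reg != b->note_reg || a->note_const != b->note_const)
    return false;
  return !modified_between_p (insns, first, second, a->note_reg);
}

/* Insns 62, 63 and 65 from the dump (a7 = 17, a5 = 15, a6 = 16).  */
const struct toy_insn example[] = {
  { 17, 15, 8 },     /* insn 62: note umin (a5, 8) */
  { 15, NO_REG, 0 }, /* insn 63: overwrites a5 */
  { 16, 15, 8 },     /* insn 65: note umin (a5, 8) */
};

/* Same notes, but the middle insn writes an unrelated register (14).  */
const struct toy_insn unmodified[] = {
  { 17, 15, 8 },
  { 14, NO_REG, 0 },
  { 16, 15, 8 },
};
```

With the dump above, notes_equal_p (example, 0, 2) is false because
insn 63 writes a5, whereas notes_equal_p (unmodified, 0, 2) is true.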

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.




On 2023/11/13 18:22, juzhe.zh...@rivai.ai wrote:
> If there is a difference between them, I think we should fix
> riscv-common.cc, since "zvfh_zfh" should not behave differently from
> "zfh_zvfh".

It's possible. Let me debug it and see if there's a problem.

--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

> On 2023/11/13 18:22, juzhe.zh...@rivai.ai wrote:
>> If there is a difference between them, I think we should fix riscv-common.cc,
>> since "zvfh_zfh" should not behave differently from "zfh_zvfh".
> 
> It's possible. Let me debug it and see if there's a problem.

I don't think it is different.  Just checked and it still works for me.

Could you please tell me how you invoke the testsuite?

Regards
 Robin

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.





On 2023/11/13 18:33, Robin Dapp wrote:
>> On 2023/11/13 18:22, juzhe.zh...@rivai.ai wrote:
>>> If there is a difference between them, I think we should fix riscv-common.cc,
>>> since "zvfh_zfh" should not behave differently from "zfh_zvfh".
>>
>> It's possible. Let me debug it and see if there's a problem.
>
> I don't think it is different.  Just checked and it still works for me.
>
> Could you please tell me how you invoke the testsuite?

We use the riscv-gnu-toolchain and run
`make report-newlib SIM=spike RUNTESTFLAGS="rvv.exp" -j100`.


--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai

Re: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb_run-2.c
> @@ -0,0 +1,19 @@
> +/* { dg-do run { target { riscv_v } } } */
> +/* { dg-additional-options "-march=rv64gcv_zbb --param riscv-autovec-preference=fixed-vlmax" } */

Could you add a compile test (with an assembly check) instead of a run test?

If I don't build the toolchain with "zbb", we can't test for this issue (the
VSETVL bug), and I may cause a regression again if I change the VSETVL pass in
the future.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-11-13 18:31
To: juzhe.zh...@rivai.ai; gcc-patches; palmer; kito.cheng; jeffreyalaw
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++

2023-11-13 Thread Thomas Schwinge

Hi!

On 2023-11-09T17:00:11+0100, Tobias Burnus  wrote:
> On 09.11.23 13:24, Thomas Schwinge wrote:
>> Also, assuming that the order of appearance of 'IND_FUNC_MAP' does matter
>> as it does for 'FUNC_MAP', ... 
>> https://github.com/MentorEmbedded/nvptx-tools/pull/29 ...
>
> It should matter. Thus, we should indeed update nvptx-tools for this!
>
> For hello-world it probably does not show up that easily as there are
> only very few such tagged functions. But especially once it gets used
> for C++ virtual functions, the number of functions will be so large
> that the ordering issue is likely to occur in the real world.
>
> (I shouldn't have missed this – given that I debugged and reported the
> original issue.)

Let's blame this on the inadequate general (non-)handling of these
directives in nvptx-tools 'as' -- which, as I said, I'll address once
Kwok has fixed this specific issue (with test case, please).
(Generalize/refactor after fixing specific issue.)

> (BTW: OMP_CLAUSE_INDIRECT is only used intermittently in the C/C++ FEs
> and not in the ME as it is soon turned into an attribute string.)

OK, that does explain:

>> I would've assumed handling for 'OMP_CLAUSE_INDIRECT' to also be
>> necessary in the following places:
>>
>>- 'gcc/c-family/c-omp.cc:c_omp_split_clauses'
> "split_clauses" applies only to combined composite constructs like
> 'target'+'parallel' +'for' + 'simd' where clauses have to be added to
> the right constituent clause(s). Declarative directives cannot be combined.
>>- 'gcc/cp/pt.cc:tsubst_omp_clauses',
>>- 'gcc/gimplify.cc:gimplify_scan_omp_clauses',
>>  'gcc/gimplify.cc:gimplify_adjust_omp_clauses'
>>- 'gcc/omp-low.cc:scan_sharing_clauses' (twice)
>>- 'gcc/tree-nested.cc:convert_nonlocal_omp_clauses',
>>  'gcc/tree-nested.cc:convert_local_omp_clauses'
>>- 'gcc/tree-pretty-print.cc:dump_omp_clause'

... this.

> Most of those seem to relate to executable directives

(That remark I don't understand.)

> – and not to
> declarative ones, where we attach DECL_ATTRIBUTES to a decl and process
> them. For functions, the pretty printer prints the attributes.

> Here, we use "omp declare target indirect" as attribute.

ACK.

> We use noclone,noinline attributes for 'declare target', thus, there
> should be no issue on this side and regarding tsubst_omp_clauses, as the
> clause is either present or not (either bare or with a parse-time
> constant logical), there is not much post processing needed.

That's not obvious to the casual reader of GCC source code, though.

> Thus, I bet that there is nothing to do for those.
>
>> Please verify, and add handling as well as test cases as necessary, or,
>> as applicable, put 'case OMP_CLAUSE_INDIRECT:' next to
>> 'default: gcc_unreachable ();' etc., if indeed that clause is not
>> expected there.
>
> What's the point of having it next to default if it is gcc_unreachable?

Instead of "bet", I suggest to document intentions: so that it's clear
that 'OMP_CLAUSE_INDIRECT' is not meant to be seen here vs. an accidental
omission.

> I mean there are several others which shouldn't be needed like
> OMP_CLAUSE_DEVICE_TYPE which also does not show up at gcc/cp/pt.cc.

Quite possible.  :-) I certainly wouldn't object to "handling" those,
too.

Generally, in my opinion, we should usually see 'case's listed for all
clause codes where we 'switch' on them, for example.

>>> --- a/libgomp/config/gcn/team.c
>>> +++ b/libgomp/config/gcn/team.c
>>> @@ -45,6 +46,9 @@ gomp_gcn_enter_kernel (void)
>>>   {
>>> int threadid = __builtin_gcn_dim_pos (1);
>>>
>> Shouldn't this:
>>
>>> +  /* Initialize indirect function support.  */
>>> +  build_indirect_map ();
>>> +
>> ... be called inside here:
>>
>>> if (threadid == 0)
>>>   {
>> ..., so that it's only executed by one thread?
> (concur)
>> Also, for my understanding: why is 'build_indirect_map' done at kernel
>> invocation time (here) instead of at image load time?
>
> The splay_tree is generated on the device itself - and we currently do
> not start a kernel during GOMP_OFFLOAD_load_image. We could, the
> question is whether it makes sense. (Generating the splay_tree on the
> host for the device is a hassle and error prone as it needs to use
> device pointers at the end.)

Hmm.  It seems conceptually cleaner to me to set this up upfront, and
avoids potentially slowing down every device kernel invocation (at least
another function call, and 'gomp_mutex_lock' check).  Though, I agree
this may be "in the noise" with regards to all the other stuff going on
in 'gomp_gcn_enter_kernel' and elsewhere...

What I just realized: it's also unclear to me how the current
implementation works with regards to several images getting loaded --
don't we then overwrite 'GOMP_INDIRECT_ADDR_MAP' instead of
(conceptually) appending to it?

In the general case, additional images may also get loaded during
execution.  We thus need proper locking of the shared data structure, uh?
Or, can we have sep

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.


Hi Robin,

On 2023/11/13 18:33, Robin Dapp wrote:
>> On 2023/11/13 18:22, juzhe.zh...@rivai.ai wrote:
>>> If there is a difference between them, I think we should fix riscv-common.cc,
>>> since "zvfh_zfh" should not behave differently from "zfh_zvfh".
>>
>> It's possible. Let me debug it and see if there's a problem.
>
> I don't think it is different.  Just checked and it still works for me.
>
> Could you please tell me how you invoke the testsuite?


This looks to be a difference between the linux and elf versions of
gcc. The elf version of gcc we build has this problem; the linux
version does not. I think the linux version of gcc behaves incorrectly:


➜  riscv-gnu-toolchain-push git:(tintin-dev) ./build/dev-rv32gcv_zfh_zvfh-ilp32d-medany-newlib-spike-debug/install/bin/riscv32-unknown-elf-gcc -march=rv32gcv_zfh build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/hello.c
riscv32-unknown-elf-gcc: fatal error: Cannot find suitable multilib set for '-march=rv32imafdcv_zicsr_zifencei_zfh_zfhmin_zve32f_zve32x_zve64d_zve64f_zve64x_zvl128b_zvl32b_zvl64b'/'-mabi=ilp32d'
compilation terminated.
➜  riscv-gnu-toolchain-push git:(tintin-dev) ./build/dev-rv32gcv_zfh_zvfh-ilp32d-medany-linux-spike-debug/install/bin/riscv32-unknown-linux-gnu-gcc -march=rv32gcv_zfh build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/hello.c



--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai

[PATCH] gm2: Add missing declaration of m2pim_M2RTS_Terminate to test

Needed for C99 testsuite compatibility.

gcc/testsuite/

* gm2/link/externalscaffold/pass/scaffold.c (m2pim_M2RTS_Terminate):
Declare.

---
 gcc/testsuite/gm2/link/externalscaffold/pass/scaffold.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gm2/link/externalscaffold/pass/scaffold.c 
b/gcc/testsuite/gm2/link/externalscaffold/pass/scaffold.c
index 2bd3587f6c7..2df0368b983 100644
--- a/gcc/testsuite/gm2/link/externalscaffold/pass/scaffold.c
+++ b/gcc/testsuite/gm2/link/externalscaffold/pass/scaffold.c
@@ -6,6 +6,7 @@ extern  void m2pim_M2_M2RTS_init (int argc, char *argv[]);
 extern  void m2pim_M2_M2RTS_fini (void);
 extern  void m2pim_M2_RTExceptions_init (int argc, char *argv[]);
 extern  void m2pim_M2_RTExceptions_fini (void);
+extern  void m2pim_M2RTS_Terminate (void);
 extern  void _M2_hello_init (int argc, char *argv[]);
 extern  void _M2_hello_fini (void);
 

base-commit: 9d471cd993e47dcbe91946beb529134565c58185

Re: [PATCH v2 1/7] aarch64: Use br instead of ret for eh_return

2023-11-13 Thread Szabolcs Nagy

The 11/13/2023 01:27, Hans-Peter Nilsson wrote:
> > From: Szabolcs Nagy 
> > Date: Fri, 3 Nov 2023 15:36:08 +
> 
> I don't see others commenting on this patch, and you're not
> mentioning this aspect, so I wonder:
> 
> > * config/aarch64/aarch64.h (EH_RETURN_TAKEN_RTX): Define.
> > (EH_RETURN_STACKADJ_RTX): Change to R5.
> > (EH_RETURN_HANDLER_RTX): Change to R6.
> 
> Isn't this an ABI change?

not really: this is interface between the function body
and the epilogue, so all within the code of a single
function doing eh return, not a public abi boundary.

(e.g. R0..R3 are preserved from the function throwing
the exception to the exception handler, so that's abi.
R4..R6 are just an internal detail of the function doing
the eh return in the unwinder.)

> 
> (I've forgotten relevant bits of the exception machinery; if
> throw and catch are always in the same object and everything
> in between register-number-agnostic then the only flaw would
> be not mentioning that in the commit message.)
> 
> brgds, H-P

Re: [RFC] Intel AVX10.1 Compiler Design and Support

On Mon, Nov 13, 2023 at 7:58 AM Hongtao Liu  wrote:
>
> On Fri, Nov 10, 2023 at 6:15 PM Richard Biener
>  wrote:
> >
> > On Fri, Nov 10, 2023 at 2:42 AM Haochen Jiang  
> > wrote:
> > >
> > > Hi all,
> > >
> > > This RFC patch aims to add AVX10.1 options. After we added -m[no-]evex512
> > > support, it became a lot easier to add them compared to the August
> > > version. Details for AVX10 are shown below:
> > >
> > > Intel Advanced Vector Extensions 10 (Intel AVX10) Architecture
> > > Specification
> > > It describes the Intel Advanced Vector Extensions 10 Instruction Set
> > > Architecture.
> > > https://cdrdv2.intel.com/v1/dl/getContent/784267
> > >
> > > The Converged Vector ISA: Intel Advanced Vector Extensions 10 Technical
> > > Paper
> > > It provides introductory information regarding the converged vector ISA:
> > > Intel Advanced Vector Extensions 10.
> > > https://cdrdv2.intel.com/v1/dl/getContent/784343
> > >
> > > Our proposal is to take AVX10.1-256 and AVX10.1-512 as two "virtual" ISAs
> > > in the compiler. AVX10.1-512 will imply AVX10.1-256. They will not enable
> > > anything at first. At the end of the option handling, we will check
> > > whether the two bits are set. If AVX10.1-256 is set, we will set the
> > > AVX512 related ISA bits. AVX10.1-512 will further set EVEX512 ISA bit.
> > >
> > > It means that AVX10 options will be separated from the existing AVX512
> > > and the newly added -m[no-]evex512 options. AVX10 and AVX512 options will
> > > control (enable/disable/set vector size) the AVX512 features underneath
> > > independently. If there’s potential overlap or conflict between AVX10 and
> > > AVX512 options, some rules are provided to define the behavior, which
> > > will be described below.
> > > avx10.1 option will be provided as an alias of avx10.1-256.
> > >
> > > In the future, the AVX10 options will imply like this:
> > >
> > > AVX10.1-256 < AVX10.1-512
> > >  ^ ^
> > >  | |
> > >
> > > AVX10.2-256 < AVX10.2-512
> > >  ^ ^
> > >  | |
> > >
> > > AVX10.3-256 < AVX10.3-512
> > >  ^ ^
> > >  | |
> > >
> > > Each of them will have its own option to enable/disable the corresponding
> > > features. The alias avx10.x will also be provided.
> > >
> > > As mentioned in the August version of the RFC, since we lean towards the
> > > adoption of AVX10 instead of AVX512 from now on, we don’t recommend that
> > > users combine the AVX10 and legacy AVX512 options.
> >
> > I wonder whether adoption could be made easier by also providing a
> > -mavx10[.0] level that removes some of the more obscure sub-ISA requirements
> > to cover more existing implementations (I'd not add -mavx10.0-512 here).
> > I'd require only skylake-AVX512 features here, basically all non-KNL AVX512
> > CPUs should have a "virtual" AVX10 level that allows to use that feature 
> > set,
> We have -mno-evex512, which can cover those cases, so what you want is like a
> simple alias of "-march=skylake-avx512 -mno-evex512"?

For the AVX512 enabled sub-isas of skylake-avx512 yes I guess.

> > restricted to 256bits so future AVX10-256 implementations can handle it
> > as well as all existing (and relevant, which excludes KNL) AVX512
> > implementations.
> >
> > Otherwise AVX10 is really a hard sell (as AVX512 was originally).
> It's a rebranding of the existing AVX512 to AVX10; AVX10.0 would just
> complicate things further (considering we already have x86-64-v4, which
> is different from skylake-avx512).

Well, the cut-off for "AVX512" is quite arbitrary.  Introducing a "new" ISA
that's only available in HW available in the future and suggesting that users
embrace that already (like Intel did with AVX512 without offering client SKU
support) is a hard sell.

I realize Intel thinks client SKU support for AVX10 (restricted to 256bit) will
be "easier".  But then don't expect anybody to adopt that in the next 10 years.

Just to add - we were suggesting to use x86_64-v3 for the "next" enterprise
product but got downvoted to x86_64-v2 for compatibility reasons.

If it were possible I'd axe x86_64-v4.  Maybe we should add an x86_64-v3.5
that sits in between v3 and v4, offering AVX512 but restricted to 256bit
(and obviously not requiring more of the AVX512 features that v4 requires).

Richard.

> >
> > > However, we would like to introduce some
> > > simple rules for user when it comes to combination.
> > >
> > > 1. Enabling AVX10 and AVX512 at the same command line with different
> > > vector size will lead to a warning message. The behavior of the compiler
> > > will be enabling AVX10 with longer, i.e., 512 bit vector size.
> > >
> > > If the vector sizes are the same (e.g. -mavx10.1-256 -mavx512f
> > > -mno-evex512, -mavx10.1-512 -mavx512f), it will be valid with the
> > > corresponding vector size.
> > > 2. -mno-avx10.1 option can’t disable any featu

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

Hi Kito,

On 2023/11/13 19:13, Lehua Ding wrote:
> Hi Robin,
>
> On 2023/11/13 18:33, Robin Dapp wrote:
>>> On 2023/11/13 18:22, juzhe.zh...@rivai.ai wrote:
>>>> If there is a difference between them, I think we should fix
>>>> riscv-common.cc, since "zvfh_zfh" should not behave differently from
>>>> "zfh_zvfh".
>>>
>>> It's possible. Let me debug it and see if there's a problem.
>>
>> I don't think it is different. Just checked and it still works for me.
>>
>> Could you please tell me how you invoke the testsuite?
>
> This looks to be a difference between the linux and elf versions of
> gcc. The elf version of gcc we build has this problem; the linux
> version does not. I think the linux version of gcc behaves incorrectly:
>
> ➜ riscv-gnu-toolchain-push git:(tintin-dev) ./build/dev-rv32gcv_zfh_zvfh-ilp32d-medany-newlib-spike-debug/install/bin/riscv32-unknown-elf-gcc -march=rv32gcv_zfh build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/hello.c
> riscv32-unknown-elf-gcc: fatal error: Cannot find suitable multilib set for '-march=rv32imafdcv_zicsr_zifencei_zfh_zfhmin_zve32f_zve32x_zve64d_zve64f_zve64x_zvl128b_zvl32b_zvl64b'/'-mabi=ilp32d'
> compilation terminated.
> ➜ riscv-gnu-toolchain-push git:(tintin-dev) ./build/dev-rv32gcv_zfh_zvfh-ilp32d-medany-linux-spike-debug/install/bin/riscv32-unknown-linux-gnu-gcc -march=rv32gcv_zfh build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/hello.c

It looks like this commit [1] of yours causes the difference between elf
and linux. Could you check whether it makes sense for them to behave
differently now? The elf version is configured with
--with-arch=rv32gcv_zvfh_zfh, and the user gets an error with
-march=rv32gcv_zfh; the linux version does not.

[1] https://github.com/gcc-mirror/gcc/commit/17d683d

--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai

[PATCH v2 1/2] libatomic: atomic_16.S: Improve ENTRY, END and ALIAS macro interface

The introduction of further architectural-feature dependent ifuncs
for AArch64 makes hard-coding ifunc `_i' suffixes to functions
cumbersome to work with.  It is awkward to remember which ifunc maps
onto which arch feature and makes the code harder to maintain when new
ifuncs are added and their suffixes possibly altered.

This patch uses pre-processor `#define' statements to map each suffix to
a descriptive feature name macro, for example:

  #define LSE2 _i1

and reconstructs function names with the pre-processor's token
concatenation feature, such that for `MACRO(name)', we would now have
`MACRO(name, feature)' and in the macro definition body we replace
`name` with `name##feature`.
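
As a side note on why an extra macro level (ENTRY1, END1, ALIAS1) is
needed: the operands of `##' are not macro-expanded, so a feature macro
such as LSE2 has to be expanded by an outer macro before it is pasted.
The stand-alone C snippet below sketches this.  NAME, NAME1 and the STR
helpers are illustrative names and are not part of atomic_16.S:

```c
#include <assert.h>
#include <string.h>

#define CORE
#define LSE2 _i1

/* Pastes its arguments exactly as written: FEAT is not expanded.  */
#define NAME1(name, feat) name##feat
/* Expands FEAT (e.g. LSE2 -> _i1) before handing it to NAME1.  */
#define NAME(name, feat) NAME1 (name, feat)

/* Two-level stringification so that the argument is expanded first.  */
#define STR1(x) #x
#define STR(x) STR1 (x)

/* Single level: the macro name LSE2 itself ends up in the symbol.  */
const char *const direct_paste = STR (NAME1 (libat_load_16, LSE2));
/* Two levels: LSE2 is replaced by _i1 before pasting.  */
const char *const lse2_name = STR (NAME (libat_load_16, LSE2));
/* CORE expands to nothing, so the base name is left unchanged.  */
const char *const core_name = STR (NAME (libat_load_16, CORE));

void
check_names (void)
{
  assert (strcmp (direct_paste, "libat_load_16LSE2") == 0);
  assert (strcmp (lse2_name, "libat_load_16_i1") == 0);
  assert (strcmp (core_name, "libat_load_16") == 0);
}
```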

libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S (CORE): New macro.
(LSE2): Likewise.
(ENTRY): Modify macro to take in `arch' argument.
(END): Likewise.
(ALIAS): Likewise.
(ENTRY1): New macro.
(END1): Likewise.
(ALIAS1): Likewise.
---
 libatomic/config/linux/aarch64/atomic_16.S | 147 +++--
 1 file changed, 79 insertions(+), 68 deletions(-)

diff --git a/libatomic/config/linux/aarch64/atomic_16.S 
b/libatomic/config/linux/aarch64/atomic_16.S
index 0485c284117..3f6225830e6 100644
--- a/libatomic/config/linux/aarch64/atomic_16.S
+++ b/libatomic/config/linux/aarch64/atomic_16.S
@@ -39,22 +39,34 @@
 
         .arch   armv8-a+lse
 
-#define ENTRY(name)             \
-        .global name;           \
-        .hidden name;           \
-        .type name,%function;   \
-        .p2align 4;             \
-name:                           \
-        .cfi_startproc;         \
+#define ENTRY(name, feat)               \
+        ENTRY1(name, feat)
+
+#define ENTRY1(name, feat)              \
+        .global name##feat;             \
+        .hidden name##feat;             \
+        .type name##feat,%function;     \
+        .p2align 4;                     \
+name##feat:                             \
+        .cfi_startproc;                 \
         hint    34      // bti c
 
-#define END(name)               \
-        .cfi_endproc;           \
-        .size name, .-name;
+#define END(name, feat)                 \
+        END1(name, feat)
 
-#define ALIAS(alias,name)       \
-        .global alias;          \
-        .set alias, name;
+#define END1(name, feat)                \
+        .cfi_endproc;                   \
+        .size name##feat, .-name##feat;
+
+#define ALIAS(alias, from, to)          \
+        ALIAS1(alias,from,to)
+
+#define ALIAS1(alias, from, to)         \
+        .global alias##from;            \
+        .set alias##from, alias##to;
+
+#define CORE
+#define LSE2    _i1
 
 #define res0 x0
 #define res1 x1
@@ -89,7 +101,7 @@ name:\
 #define SEQ_CST 5
 
 
-ENTRY (libat_load_16)
+ENTRY (libat_load_16, CORE)
         mov     x5, x0
         cbnz    w1, 2f
 
@@ -104,10 +116,10 @@ ENTRY (libat_load_16)
         stxp    w4, res0, res1, [x5]
         cbnz    w4, 2b
         ret
-END (libat_load_16)
+END (libat_load_16, CORE)
 
 
-ENTRY (libat_load_16_i1)
+ENTRY (libat_load_16, LSE2)
         cbnz    w1, 1f
 
         /* RELAXED.  */
@@ -127,10 +139,10 @@ ENTRY (libat_load_16_i1)
         ldp     res0, res1, [x0]
         dmb     ishld
         ret
-END (libat_load_16_i1)
+END (libat_load_16, LSE2)
 
 
-ENTRY (libat_store_16)
+ENTRY (libat_store_16, CORE)
         cbnz    w4, 2f
 
         /* RELAXED.  */
@@ -144,10 +156,10 @@ ENTRY (libat_store_16)
         stlxp   w4, in0, in1, [x0]
         cbnz    w4, 2b
         ret
-END (libat_store_16)
+END (libat_store_16, CORE)
 
 
-ENTRY (libat_store_16_i1)
+ENTRY (libat_store_16, LSE2)
         cbnz    w4, 1f
 
         /* RELAXED.  */
@@ -159,10 +171,10 @@ ENTRY (libat_store_16_i1)
         stlxp   w4, in0, in1, [x0]
         cbnz    w4, 1b
         ret
-END (libat_store_16_i1)
+END (libat_store_16, LSE2)
 
 
-ENTRY (libat_exchange_16)
+ENTRY (libat_exchange_16, CORE)
         mov     x5, x0
         cbnz    w4, 2f
 
@@ -186,10 +198,10 @@ ENTRY (libat_exchange_16)
         stlxp   w4, in0, in1, [x5]
         cbnz    w4, 4b
         ret
-END (libat_exchange_16)
+END (libat_exchange_16, CORE)
 
 
-ENTRY (libat_compare_exchange_16)
+ENTRY (libat_compare_exchange_16, CORE)
         ldp     exp0, exp1, [x1]
         cbz     w4, 3f
         cmp     w4, RELEASE
@@ -228,10 +240,10 @@ ENTRY (libat_compare_exchange_16)
         cbnz    w4, 4b
         mov     x0, 1
         ret
-END (libat_compare_exchange_16)
+END (libat_compare_exchange_16, CORE)
 
 
-ENTRY (libat_compare_exchange_16_i1)
+ENTRY (libat_compare_exchange_16, LSE2)
         ldp     exp0, exp1, [x1]
         mov     tmp0, exp0
         mov     tmp1, exp1
@@ -264,10 +276,10 @@ ENTRY (libat_compare_exchange_16_i1)
         /* ACQ_REL/SEQ_CST.  */
 4:      caspal  exp0, exp1, in0, in1, [x0]
         b       0b
-END (libat_compare_exchange_16_i1)
+END (libat_compare_exchange_16, LSE2)
 
 
-ENTRY (libat_fetch_add_16)
+ENTRY (libat_fetch_add_16, CORE)

[PATCH v2 0/2] Libatomic: Add LSE128 atomics support for AArch64

v2 updates:

Move the previously unguarded definition of IFUNC_NCOND(N) in
`host-config.h' to within the scope of `#ifdef HWCAP_USCAT'.
Its definition now depends not only on the value of N but also on
HWCAP_USCAT being defined: building on systems where HWCAP_USCAT is
undefined and N == 16 led to a previously undetected build error.

---

Building upon Wilco Dijkstra's work on AArch64 128-bit atomics for
Libatomic, namely the patches from [1] and [2], this patch series
extends the library's capabilities to dynamically select and emit
Armv9.4-a LSE128 implementations of atomic operations via ifuncs at
run-time whenever architectural support is present.

Regression tested on an aarch64-linux-gnu target with LSE128 support.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620529.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626358.html

Victor Do Nascimento (2):
  libatomic: atomic_16.S: Improve ENTRY, END and ALIAS macro interface
  libatomic: Enable LSE128 128-bit atomics for armv9.4-a

 libatomic/Makefile.am|   3 +
 libatomic/Makefile.in|   1 +
 libatomic/acinclude.m4   |  19 ++
 libatomic/auto-config.h.in   |   3 +
 libatomic/config/linux/aarch64/atomic_16.S   | 315 ++-
 libatomic/config/linux/aarch64/host-config.h |  27 +-
 libatomic/configure  |  59 +++-
 libatomic/configure.ac   |   1 +
 8 files changed, 352 insertions(+), 76 deletions(-)

-- 
2.42.0

[PATCH v2 2/2] libatomic: Enable LSE128 128-bit atomics for armv9.4-a

The armv9.4-a architectural revision adds three new atomic operations
associated with the LSE128 feature:

  * LDCLRP - Atomic AND NOT (bitclear) of a location with 128-bit
  value held in a pair of registers, with original data loaded into
  the same 2 registers.
  * LDSETP - Atomic OR (bitset) of a location with 128-bit value held
  in a pair of registers, with original data loaded into the same 2
  registers.
  * SWPP - Atomic swap of one 128-bit value with 128-bit value held
  in a pair of registers.
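
As a reading aid, the data transformation of the three instructions can be
modelled in plain C++.  This is only a sketch and is deliberately NOT atomic;
the type U128 and the model_* function names are made up for illustration and
do not appear in the patch.

```cpp
#include <cstdint>

// Plain, non-atomic reference model of what each LSE128 instruction
// computes.  U128 and the model_* names are illustrative only.
struct U128 {
  std::uint64_t lo;
  std::uint64_t hi;
};

// LDCLRP: memory = memory AND NOT operand; returns the original memory value.
inline U128 model_ldclrp(U128 &mem, U128 operand) {
  U128 old = mem;
  mem.lo &= ~operand.lo;
  mem.hi &= ~operand.hi;
  return old;
}

// LDSETP: memory = memory OR operand; returns the original memory value.
inline U128 model_ldsetp(U128 &mem, U128 operand) {
  U128 old = mem;
  mem.lo |= operand.lo;
  mem.hi |= operand.hi;
  return old;
}

// SWPP: memory = operand; returns the original memory value.
inline U128 model_swpp(U128 &mem, U128 operand) {
  U128 old = mem;
  mem = operand;
  return old;
}
```

Note that a fetch-and with value v is the same as a bit clear with ~v, which
is presumably how the AND-style entry points listed in the ChangeLog below can
be built on LDCLRP.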

This patch adds the logic required to make use of these when the
architectural feature is present and a suitable assembler available.

In order to do this, the following changes are made:

  1. Add a configure-time check for LSE128 support in the
  assembler.
  2. Edit host-config.h so that when N == 16, nifunc = 2.
  3. Where available due to LSE128, implement the second ifunc, making
  use of the novel instructions.
  4. For atomic functions unable to make use of these new
  instructions, define a new alias which causes the _i1 function
  variant to point ahead to the corresponding _i2 implementation.

libatomic/ChangeLog:

* Makefile.am (AM_CPPFLAGS): Add conditional setting of
-DHAVE_FEAT_LSE128.
* acinclude.m4 (LIBAT_TEST_FEAT_LSE128): New.
* config/linux/aarch64/atomic_16.S (LSE128): New macro
definition.
(libat_exchange_16): New LSE128 variant.
(libat_fetch_or_16): Likewise.
(libat_or_fetch_16): Likewise.
(libat_fetch_and_16): Likewise.
(libat_and_fetch_16): Likewise.
* config/linux/aarch64/host-config.h (IFUNC_COND_2): New.
(IFUNC_NCOND): Add operand size checking.
(has_lse2): Renamed from `ifunc1`.
(has_lse128): New.
(HAS_LSE128): Likewise.
* libatomic/configure.ac: Add call to LIBAT_TEST_FEAT_LSE128.
* configure (ac_subst_vars): Regenerated via autoreconf.
* libatomic/Makefile.in: Likewise.
* libatomic/auto-config.h.in: Likewise.
---
 libatomic/Makefile.am|   3 +
 libatomic/Makefile.in|   1 +
 libatomic/acinclude.m4   |  19 +++
 libatomic/auto-config.h.in   |   3 +
 libatomic/config/linux/aarch64/atomic_16.S   | 170 ++-
 libatomic/config/linux/aarch64/host-config.h |  27 ++-
 libatomic/configure  |  59 ++-
 libatomic/configure.ac   |   1 +
 8 files changed, 274 insertions(+), 9 deletions(-)

diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am
index c0b8dea5037..24e843db67d 100644
--- a/libatomic/Makefile.am
+++ b/libatomic/Makefile.am
@@ -130,6 +130,9 @@ libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix 
_$(s)_.lo,$(SIZEOBJS)))
 ## On a target-specific basis, include alternates to be selected by IFUNC.
 if HAVE_IFUNC
 if ARCH_AARCH64_LINUX
+if ARCH_AARCH64_HAVE_LSE128
+AM_CPPFLAGS = -DHAVE_FEAT_LSE128
+endif
 IFUNC_OPTIONS   = -march=armv8-a+lse
 libatomic_la_LIBADD += $(foreach s,$(SIZES),$(addsuffix 
_$(s)_1_.lo,$(SIZEOBJS)))
 libatomic_la_SOURCES += atomic_16.S
diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index dc2330b91fd..cd48fa21334 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -452,6 +452,7 @@ M_SRC = $(firstword $(filter %/$(M_FILE), $(all_c_files)))
 libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix \
_$(s)_.lo,$(SIZEOBJS))) $(am__append_1) $(am__append_3) \
$(am__append_4) $(am__append_5)
+@ARCH_AARCH64_HAVE_LSE128_TRUE@@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@AM_CPPFLAGS
 = -DHAVE_FEAT_LSE128
 @ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv8-a+lse
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a+fp 
-DHAVE_KERNEL64
 @ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=i586
diff --git a/libatomic/acinclude.m4 b/libatomic/acinclude.m4
index f35ab5b60a5..4197db8f404 100644
--- a/libatomic/acinclude.m4
+++ b/libatomic/acinclude.m4
@@ -83,6 +83,25 @@ AC_DEFUN([LIBAT_TEST_ATOMIC_BUILTIN],[
   ])
 ])
 
+dnl
+dnl Test if the host assembler supports armv9.4-a LSE128 insns.
+dnl
+AC_DEFUN([LIBAT_TEST_FEAT_LSE128],[
+  AC_CACHE_CHECK([for armv9.4-a LSE128 insn support],
+[libat_cv_have_feat_lse128],[
+AC_LANG_CONFTEST([AC_LANG_PROGRAM([],[asm(".arch armv9-a+lse128")])])
+if AC_TRY_EVAL(ac_link); then
+  eval libat_cv_have_feat_lse128=yes
+else
+  eval libat_cv_have_feat_lse128=no
+fi
+rm -f conftest*
+  ])
+  LIBAT_DEFINE_YESNO([HAVE_FEAT_LSE128], [$libat_cv_have_feat_lse128],
+   [Have LSE128 support for 16 byte integers.])
+  AM_CONDITIONAL([ARCH_AARCH64_HAVE_LSE128], [test x$libat_cv_have_feat_lse128 
= xyes])
+])
+
 dnl
 dnl Test if we have __atomic_load and __atomic_store for mode $1, size $2
 dnl
diff --git a/libatomic/auto-config.h.in b/libatomic/auto-config.h.in
index ab3424a759e..7c78933b07d 100644
--- a/libatomic/auto-config

Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++

2023-11-13 Thread Tobias Burnus


Hi Thomas,

On 13.11.23 11:59, Thomas Schwinge wrote:

- 'gcc/cp/pt.cc:tsubst_omp_clauses',
- 'gcc/gimplify.cc:gimplify_scan_omp_clauses',
  'gcc/gimplify.cc:gimplify_adjust_omp_clauses'
- 'gcc/omp-low.cc:scan_sharing_clauses' (twice)
- 'gcc/tree-nested.cc:convert_nonlocal_omp_clauses',
  'gcc/tree-nested.cc:convert_local_omp_clauses'
- 'gcc/tree-pretty-print.cc:dump_omp_clause'

[...]

Most of those seem to relate to executable directives

(That remark I don't understand.)


OpenMP classifies directives into categories. The following exists
(grep + count of TR12's json/directive/ files.)

  2 meta
  2 subsidiary
  2 utility
  4 informational
 11 declarative
 40 executable

where 'declare target' + 'indirect' (like 'declare variant' and 'declare simd')
is declarative, i.e.

"declare target directive - A declarative directive that ensures that procedures
and/or variables can be executed or accessed on a device."

For those, we usually add an attribute to the function declaration. And
'executable' is defined as

"executable directive - A directive that appears in an executable context and
results in implementation code and/or prescribes the manner in which associated
user code must execute."

Those are either turned into a libgomp call or transform some code - possibly
including calling the library. That would be, e.g., 'omp target', 'omp parallel',
'omp atomic'.

The listed functions all deal with executable code, i.e. some tree ..._EXPR,
possibly associated with a structured block (at least the scan_* functions only
apply to block-associated directives, as they scan for usage inside that block -
affecting the directive/the directive's clauses).

* * *


We use noclone,noinline attributes for 'declare target', thus, there
should be no issue on this side and regarding tsubst_omp_clauses, as the
clause is either present or not (either bare or with a parse-time
constant logical), there is not much post processing needed.

That's not obvious to the casual reader of GCC source code, though.


True, but that's the general problem with code - without background, you
don't always understand the flow in the code and when something is called.

I think there is no good way to solve this; or rather, it can
be solved for a specific question, but the next person looks for
something else and then has the same problem, and the previous "solution"
(like a comment) doesn't help.

In some cases, I think it helps to add comments, but I have the feeling
that your question is too specific (you look at a single patch) to be
really helpful here. But I am happy to be proven wrong and see useful
code changes/comments.


Thus, I bet that there is nothing to do for those.


Please verify, and add handling as well as test cases as necessary, or,
as applicable, put 'case OMP_CLAUSE_INDIRECT:' next to
'default: gcc_unreachable ();' etc., if indeed that clause is not
expected there.

What's the point of having it next to default if it is gcc_unreachable?

Instead of "bet", I suggest to document intentions: so that it's clear
that 'OMP_CLAUSE_INDIRECT' is not meant to be seen here vs. an accidental
omission.

Done - in this email thread, which can be found by patch archeology.

I mean there are several others which shouldn't be needed like
OMP_CLAUSE_DEVICE_TYPE which also does not show up at gcc/cp/pt.cc.

Quite possible.  :-) I certainly wouldn't object to "handling" those,
too.

Generally, in my opinion, we should usually see 'case's listed for all
clause codes where we 'switch' on them, for example.


If there are plenty of 'default:', I am not sure it makes sense.

But in general, the question is (for a switch where most 'enum' values are used)
whether it would make more sense to remove the 'default: gcc_unreachable();'.
In that case, we do not handle the others - but -Wswitch-enum will warn
about it. Thus, a bootstrap build will ensure that all values are covered due to
-Werror=switch-enum.

The downside is that without -Werror/bootstrap, it will silently fall through,
but normal FE/ME code is guaranteed to be bootstrapped, so it will show up.

* * *


Also, for my understanding: why is 'build_indirect_map' done at kernel
invocation time (here) instead of at image load time?

The splay_tree is generated on the device itself - and we currently do
not start a kernel during GOMP_OFFLOAD_load_image. We could, the
question is whether it makes sense. (Generating the splay_tree on the
host for the device is a hassle and error prone as it needs to use
device pointers at the end.)

Hmm.  It seems conceptually cleaner to me to set this up upfront, and
avoids potentially slowing down every device kernel invocation (at least
another function call, and 'gomp_mutex_lock' check).  Though, I agree
this may be "in the noise" with regards to all the other stuff going on
in 'gomp_gcn_enter_kernel' and elsewhere...


I think the most common case is GOMP_INDIRECT_ADDR_MAP == NULL.

The question is whethe

[PATCH] libatomic: Add rcpc3 128-bit atomic operations for AArch64

Continuing on from previously-proposed Libatomic enablement work [1],
the introduction of the optional RCPC3 architectural extension for
Armv8.2-A upwards provides additional support for the release
consistency model, introducing both the Load-Acquire RCpc Pair
Ordered, and Store-Release Pair Ordered operations in the form of 
LDIAPP and STILP.

These operations are single-copy atomic on cores which also implement
LSE2 and, as such, support for these operations is added to Libatomic
and employed accordingly when the LSE2 and RCPC3 features are detected
in a given core at runtime.

The possibility that a core implements (beyond LSE & LSE2) both the
LSE128 and RCPC3 features has also required that support for up to 4
ifuncs (up from 3 before) be added, so that the lse128+rcpc option is
available for selection at runtime.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636287.html

libatomic/ChangeLog:

  * libatomic_i.h (GEN_SELECTOR): Define for
  IFUNC_NCOND(N) == 4.
  * configure.ac: Add call to LIBAT_TEST_FEAT_LRCPC3() test.
  * configure: Regenerate.
  * config/linux/aarch64/host-config.h (HAS_LRCPC3): New.
  (has_rcpc3): Likewise.
  * config/linux/aarch64/atomic_16.S (libat_load_16): Add
  LRCPC3 variant.
  (libat_store_16): Likewise.
  * acinclude.m4 (LIBAT_TEST_FEAT_LRCPC3): New.
  (HAVE_FEAT_LRCPC3): Likewise.
  (ARCH_AARCH64_HAVE_LRCPC3): Likewise.
  * Makefile.am (AM_CPPFLAGS): Conditionally append
  -DHAVE_FEAT_LRCPC3 flag.
---
 libatomic/Makefile.am|  6 +-
 libatomic/Makefile.in| 22 +++--
 libatomic/acinclude.m4   | 19 
 libatomic/auto-config.h.in   |  3 +
 libatomic/config/linux/aarch64/atomic_16.S   | 94 +++-
 libatomic/config/linux/aarch64/host-config.h | 26 +-
 libatomic/configure  | 59 +++-
 libatomic/configure.ac   |  1 +
 libatomic/libatomic_i.h  | 18 
 9 files changed, 230 insertions(+), 18 deletions(-)

diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am
index 24e843db67d..dee38e46af9 100644
--- a/libatomic/Makefile.am
+++ b/libatomic/Makefile.am
@@ -130,8 +130,12 @@ libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix 
_$(s)_.lo,$(SIZEOBJS)))
 ## On a target-specific basis, include alternates to be selected by IFUNC.
 if HAVE_IFUNC
 if ARCH_AARCH64_LINUX
+AM_CPPFLAGS  =
 if ARCH_AARCH64_HAVE_LSE128
-AM_CPPFLAGS = -DHAVE_FEAT_LSE128
+AM_CPPFLAGS += -DHAVE_FEAT_LSE128
+endif
+if ARCH_AARCH64_HAVE_LRCPC3
+AM_CPPFLAGS += -DHAVE_FEAT_LRCPC3
 endif
 IFUNC_OPTIONS   = -march=armv8-a+lse
 libatomic_la_LIBADD += $(foreach s,$(SIZES),$(addsuffix 
_$(s)_1_.lo,$(SIZEOBJS)))
diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index cd48fa21334..8e87d12907a 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -89,15 +89,17 @@ POST_UNINSTALL = :
 build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
-@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_1 = $(foreach 
s,$(SIZES),$(addsuffix _$(s)_1_.lo,$(SIZEOBJS)))
-@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_2 = atomic_16.S
-@ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_3 = $(foreach \
+@ARCH_AARCH64_HAVE_LSE128_TRUE@@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_1
 = -DHAVE_FEAT_LSE128
+@ARCH_AARCH64_HAVE_LRCPC3_TRUE@@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_2
 = -DHAVE_FEAT_LRCPC3
+@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_3 = $(foreach 
s,$(SIZES),$(addsuffix _$(s)_1_.lo,$(SIZEOBJS)))
+@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_4 = atomic_16.S
+@ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_5 = $(foreach \
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@ s,$(SIZES),$(addsuffix \
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@ _$(s)_1_.lo,$(SIZEOBJS))) \
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@ $(addsuffix \
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@ _8_2_.lo,$(SIZEOBJS))
-@ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@am__append_4 = $(addsuffix 
_8_1_.lo,$(SIZEOBJS))
-@ARCH_X86_64_TRUE@@HAVE_IFUNC_TRUE@am__append_5 = $(addsuffix 
_16_1_.lo,$(SIZEOBJS)) \
+@ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@am__append_6 = $(addsuffix 
_8_1_.lo,$(SIZEOBJS))
+@ARCH_X86_64_TRUE@@HAVE_IFUNC_TRUE@am__append_7 = $(addsuffix 
_16_1_.lo,$(SIZEOBJS)) \
 @ARCH_X86_64_TRUE@@HAVE_IFUNC_TRUE@   $(addsuffix 
_16_2_.lo,$(SIZEOBJS))
 
 subdir = .
@@ -424,7 +426,7 @@ libatomic_la_LDFLAGS = $(libatomic_version_info) 
$(libatomic_version_script) \
$(lt_host_flags) $(libatomic_darwin_rpath)
 
 libatomic_la_SOURCES = gload.c gstore.c gcas.c gexch.c glfree.c lock.c \
-   init.c fenv.c fence.c flag.c $(am__append_2)
+   init.c fenv.c fence.c flag.c $(am__append_4)
 SIZEOBJS = load store cas exch fadd fsub fand fior fxor fnand tas
 EXTRA_libatomic_la_SOURCES =

[PATCH] c++: Implement C++26 P2864R2 - Remove Deprecated Arithmetic Conversion on Enumerations From C++26

Hi!

The following patch implements C++26 P2864R2 by emitting pedwarn enabled by
the same options as the C++20 and later warnings (i.e. -Wenum-compare,
-Wdeprecated-enum-enum-conversion and -Wdeprecated-enum-float-conversion
which are all enabled by default).  I think we still want to give users
an option-based workaround, so I am not using error directly, but if that
is what you want instead, I can change it.

Tested on x86_64-linux with GXX_TESTSUITE_STDS=98,11,14,17,20,23,2c, ok
for trunk?
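
For readers unfamiliar with P2864R2, a small sketch of the affected code and
of conversions that remain valid in every dialect; the enum names are
invented for illustration and are not taken from the testsuite changes.

```cpp
enum Color { Red = 1 };
enum Flag { Bold = 2 };

// Deprecated since C++20 and ill-formed in C++26 (P2864R2):
//   int bad1 = Red | Bold;      // different enumeration types
//   double bad2 = Red + 1.5;    // enumeration and floating-point type

// Valid in all dialects: convert explicitly before operating.
constexpr int combined = static_cast<int>(Red) | static_cast<int>(Bold);
constexpr double shifted = static_cast<double>(Red) + 1.5;

// Unary plus promotes each unscoped enumerator to int first, so the
// binary operator only ever sees two ints.
constexpr int promoted = +Red | +Bold;
```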

2023-11-13  Jakub Jelinek  

gcc/cp/
* typeck.cc: Implement C++26 P2864R2 - Remove Deprecated Arithmetic
Conversion on Enumerations From C++26.
(do_warn_enum_conversions): Use pedwarn rather than warning_at for
C++26 and remove " is deprecated" part of the diagnostics in that
case.
(cp_build_binary_op): For C++26 call do_warn_enum_conversions
for complain & tf_warning_or_error.
* call.cc (build_conditional_expr): Use pedwarn rather than warning_at
for C++26 and remove " is deprecated" part of the diagnostics in that
case and check for complain & tf_warning_or_error.  Use emit_diagnostic
with cxx_dialect >= cxx26 ? DK_PEDWARN : DK_WARNING.
(build_new_op): Use emit_diagnostic with cxx_dialect >= cxx26
? DK_PEDWARN : DK_WARNING and complain & tf_warning_or_error check
for C++26.
gcc/testsuite/
* g++.dg/cpp2a/enum-conv1.C: Adjust expected diagnostics in C++26.
* g++.dg/diagnostic/enum3.C: Likewise.
* g++.dg/parse/attr3.C: Likewise.
* g++.dg/cpp0x/linkage2.C: Likewise.

--- gcc/cp/typeck.cc.jj 2023-11-02 07:49:16.135878656 +0100
+++ gcc/cp/typeck.cc	2023-11-13 11:03:42.869760702 +0100
@@ -4976,13 +4976,21 @@ do_warn_enum_conversions (location_t loc
case BIT_AND_EXPR:
case BIT_IOR_EXPR:
case BIT_XOR_EXPR:
- warning_at (loc, opt, "bitwise operation between different "
- "enumeration types %qT and %qT is deprecated",
- type0, type1);
+ if (cxx_dialect >= cxx26)
+   pedwarn (loc, opt, "bitwise operation between different "
+"enumeration types %qT and %qT", type0, type1);
+ else
+   warning_at (loc, opt, "bitwise operation between different "
+   "enumeration types %qT and %qT is deprecated",
+   type0, type1);
  return;
default:
- warning_at (loc, opt, "arithmetic between different enumeration "
- "types %qT and %qT is deprecated", type0, type1);
+ if (cxx_dialect >= cxx26)
+   pedwarn (loc, opt, "arithmetic between different enumeration "
+"types %qT and %qT", type0, type1);
+ else
+   warning_at (loc, opt, "arithmetic between different enumeration "
+   "types %qT and %qT is deprecated", type0, type1);
  return;
}
 }
@@ -5010,7 +5018,13 @@ do_warn_enum_conversions (location_t loc
case LE_EXPR:
case EQ_EXPR:
case NE_EXPR:
- if (enum_first_p)
+ if (enum_first_p && cxx_dialect >= cxx26)
+   pedwarn (loc, opt, "comparison of enumeration type %qT with "
+"floating-point type %qT", type0, type1);
+ else if (cxx_dialect >= cxx26)
+   pedwarn (loc, opt, "comparison of floating-point type %qT "
+ "with enumeration type %qT", type0, type1);
+ else if (enum_first_p)
warning_at (loc, opt, "comparison of enumeration type %qT with "
"floating-point type %qT is deprecated",
type0, type1);
@@ -5023,7 +5037,13 @@ do_warn_enum_conversions (location_t loc
  /* This is invalid, don't warn.  */
  return;
default:
- if (enum_first_p)
+ if (enum_first_p && cxx_dialect >= cxx26)
+   pedwarn (loc, opt, "arithmetic between enumeration type %qT "
+"and floating-point type %qT", type0, type1);
+ else if (cxx_dialect >= cxx26)
+   pedwarn (loc, opt, "arithmetic between floating-point type %qT "
+"and enumeration type %qT", type0, type1);
+ else if (enum_first_p)
warning_at (loc, opt, "arithmetic between enumeration type %qT "
"and floating-point type %qT is deprecated",
type0, type1);
@@ -6163,15 +6183,13 @@ cp_build_binary_op (const op_location_t
  return error_mark_node;
}
   if (complain & tf_warning)
-   {
- do_warn_double_promotion (result_type, type0, type1,
-   "implicit conversion from %qH to %qI "
-   "to match other operand of binary "
-   "expression",
-   location);
- do_warn_enum_conversions (location, c

Re: [V2 PATCH] Handle bitop with INTEGER_CST in analyze_and_compute_bitop_with_inv_effect.

On Mon, Nov 13, 2023 at 8:50 AM Hongtao Liu  wrote:
>
> On Fri, Nov 10, 2023 at 5:12 PM Richard Biener
>  wrote:
> >
> > On Wed, Nov 8, 2023 at 9:22 AM Hongtao Liu  wrote:
> > >
> > > On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
> > >  wrote:
> > > >
> > > > On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu  wrote:
> > > > >
> > > > > On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
> > > > >  wrote:
> > > > > >
> > > > > > On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Tue, Nov 7, 2023 at 7:08 AM liuhongt  
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > analyze_and_compute_bitop_with_inv_effect assumes the first 
> > > > > > > > > operand is
> > > > > > > > > loop invariant which is not the case when it's INTEGER_CST.
> > > > > > > > >
> > > > > > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > > > > > > > > Ok for trunk?
> > > > > > > >
> > > > > > > > So this addresses a missed optimization, right?  It seems to me 
> > > > > > > > that
> > > > > > > > even with two SSA names we are only "lucky" when rhs1 is the 
> > > > > > > > invariant
> > > > > > > > one.  So instead of swapping this way I'd do
> > > > > > > > Yes, it's a missed optimization.
> > > > > > > > And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
> > > > > > > > enough: if match_op[1] is a loop invariant, it must be false for
> > > > > > > > the conditions below (there couldn't be any header_phi from its
> > > > > > > > definition).
> > > > > >
> > > > > > Yes, all I said is that when you now care for op1 being INTEGER_CST
> > > > > > it could also be an invariant SSA name and thus only after swapping 
> > > > > > op0/op1
> > > > > > we could have a successful match, no?
> > > > > Sorry, the commit message is a little bit misleading.
> > > > > At first, I just wanted to handle the INTEGER_CST case (with TREE_CODE
> > > > > (match_op[1]) == INTEGER_CST), but then I realized that this could
> > > > > probably be extended to the normal SSA_NAME case as well, so I used
> > > > > expr_invariant_in_loop_p, which should theoretically be able to handle
> > > > > the SSA_NAME case as well.
> > > > >
> > > > > if (expr_invariant_in_loop_p (loop, match_op[1])) is true, w/o
> > > > > swapping it must return NULL_TREE for below conditions.
> > > > > if (expr_invariant_in_loop_p (loop, match_op[1])) is false, w/
> > > > > swapping it must return NULL_TREE too.
> > > > > So it can cover the both cases you mentioned, no need for a loop to
> > > > > iterate 2 match_ops for all conditions.
> > > >
> > > > Sorry if it appears we're going in circles ;)
> > > >
> > > > > 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> > > > > 3693  || !expr_invariant_in_loop_p (loop, match_op[0])
> > > > > 3694  || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT
> > > > > (match_op[1])))
> > > >
> > > > but this only checks match_op[1] (an SSA name at this point) for being
> > > > defined by the header PHI.  What if expr_invariant_in_loop_p (loop, match_op[1])
> > > > and header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0])),
> > > > which I think can happen when both ops are SSA names?
> > > The whole condition is like
> > >
> > > 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> > > 3693  || !expr_invariant_in_loop_p (loop, match_op[0])
> > > 3694  || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT
> > > (match_op[1])))
> > > 3695  || gimple_bb (header_phi) != loop->header  - This would
> > > be true if match_op[1] is SSA_NAME and expr_invariant_in_loop_p
> >
> > But it could be expr_invariant_in_loop_p (match_op[1]) and
> > header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0]))
>
> > > > > > > > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > > > > > > > +std::swap (match_op[0], match_op[1]);
> match_op[1] will be swapped to match_op[0], the case is also handled
> by my patch [1](the v2 patch)
> My point is the upper code already handles 2 SSA names, no need to
> iterate with all conditions, expr_invariant_in_loop_p alone is enough.
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635440.html

I see - thanks for the repeated explanations.  That patch is OK.

Thanks,
Richard.
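
The point settled above, that one swap makes a single set of checks cover
both operand orders, can be shown with a toy model.  Operand and
matches_pattern are hypothetical stand-ins; the real code works on trees and
calls expr_invariant_in_loop_p.

```cpp
#include <utility>

// Toy stand-in for an operand of the bit operation.  A loop-invariant
// operand can never be defined by a PHI in the loop header, so at most
// one of the two flags is set for any operand.
struct Operand {
  bool invariant;
  bool defined_by_header_phi;
};

// True if the pair has the shape "invariant OP header-PHI", in either order.
bool matches_pattern(Operand op0, Operand op1) {
  // Canonicalise: move an invariant second operand to the front.
  if (op1.invariant)
    std::swap(op0, op1);
  // One check now handles both original operand orders.
  return op0.invariant && op1.defined_by_header_phi;
}
```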

> >
> > all I say is that for two SSA names we could not match the condition
> > (aka not fail)
> > when we swap op0/op1.  Not only when op1 is INTEGER_CST.
>
> >
> > > 3696  || gimple_phi_num_args (header_phi) != 2)
> > >
> > > If expr_invariant_in_loop_p (loop, match_op[1]) is true and it's an
> > > SSA_NAME, then according to the code in expr_invariant_in_loop_p, the
> > > def_bb of the gphi is either NULL or does not belong to this loop;
> > > either case will make gimple_bb (header_phi) != loop->header true.
> > >
> > > 1857  if (TREE_CODE (expr) == SSA_NAME)
> > > 1858{
> > > 1859  def_bb = gimple_bb (SSA_NAME_DEF_STMT (expr));
> > > 1860  if

[PATCH] c++/modules: Support lambdas in static template member initialisers [PR107398]

2023-11-13 Thread Nathaniel Shead

Bootstrapped and regtested on x86_64-pc-linux-gnu. I don't have write access.

-- >8 --

The testcase noted in the PR fails because the context of the lambda is
not in namespace scope, but rather in class scope. This patch removes
the assertion that the context must be a namespace and ensures that
lambdas in class scope still get the correct merge_kind.

PR c++/107398

gcc/cp/ChangeLog:

* module.cc (trees_out::get_merge_kind): Handle lambdas in class
scope.
(maybe_key_decl): Remove assertion and fix whitespace.

gcc/testsuite/ChangeLog:

* g++.dg/modules/lambda-6_a.C: New test.
* g++.dg/modules/lambda-6_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/module.cc  | 35 +--
 gcc/testsuite/g++.dg/modules/lambda-6_a.C | 16 +++
 gcc/testsuite/g++.dg/modules/lambda-6_b.C |  9 ++
 3 files changed, 45 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-6_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-6_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index c1c8c226bc1..434caf22d1d 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -10412,13 +10412,13 @@ trees_out::get_merge_kind (tree decl, depset *dep)
 
  case RECORD_TYPE:
  case UNION_TYPE:
+ case NAMESPACE_DECL:
if (DECL_NAME (decl) == as_base_identifier)
- mk = MK_as_base;
-   else if (IDENTIFIER_ANON_P (DECL_NAME (decl)))
- mk = MK_field;
-   break;
+ {
+   mk = MK_as_base;
+   break;
+ }
 
- case NAMESPACE_DECL:
if (DECL_IMPLICIT_TYPEDEF_P (STRIP_TEMPLATE (decl))
&& LAMBDA_TYPE_P (TREE_TYPE (decl)))
  if (tree scope
@@ -10431,6 +10431,13 @@ trees_out::get_merge_kind (tree decl, depset *dep)
break;
  }
 
+   if (RECORD_OR_UNION_TYPE_P (ctx))
+ {
+   if (IDENTIFIER_ANON_P (DECL_NAME (decl)))
+ mk = MK_field;
+   break;
+ }
+
if (TREE_CODE (decl) == TEMPLATE_DECL
&& DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P (decl))
  mk = MK_local_friend;
@@ -18887,18 +18894,16 @@ maybe_key_decl (tree ctx, tree decl)
   if (TREE_CODE (ctx) != VAR_DECL)
 return;
 
-  gcc_checking_assert (DECL_NAMESPACE_SCOPE_P (ctx));
-
- if (!keyed_table)
+  if (!keyed_table)
 keyed_table = new keyed_map_t (EXPERIMENT (1, 400));
 
- auto &vec = keyed_table->get_or_insert (ctx);
- if (!vec.length ())
-   {
- retrofit_lang_decl (ctx);
- DECL_MODULE_KEYED_DECLS_P (ctx) = true;
-   }
- vec.safe_push (decl);
+  auto &vec = keyed_table->get_or_insert (ctx);
+  if (!vec.length ())
+{
+  retrofit_lang_decl (ctx);
+  DECL_MODULE_KEYED_DECLS_P (ctx) = true;
+}
+  vec.safe_push (decl);
 }
 
 /* Create the flat name string.  It is simplest to have it handy.  */
diff --git a/gcc/testsuite/g++.dg/modules/lambda-6_a.C 
b/gcc/testsuite/g++.dg/modules/lambda-6_a.C
new file mode 100644
index 000..28bfb358afb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/lambda-6_a.C
@@ -0,0 +1,16 @@
+// PR c++/107398
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi Lambda6 }
+
+export module Lambda6;
+
+template <typename T>
+struct R { static int x; };
+
+template <typename T>
+int R<T>::x = []{int i; return 1;}();
+
+export int foo();
+int foo() {
+  return R<int>::x;
+}
diff --git a/gcc/testsuite/g++.dg/modules/lambda-6_b.C 
b/gcc/testsuite/g++.dg/modules/lambda-6_b.C
new file mode 100644
index 000..ab0c4ab4805
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/lambda-6_b.C
@@ -0,0 +1,9 @@
+// PR c++/107398
+// { dg-additional-options "-fmodules-ts" }
+
+import Lambda6;
+
+int main() {
+  if (foo() != 1)
+__builtin_abort();
+}
-- 
2.42.0

C++ patch ping^3

Hi!

I'd like to ping a couple of C++ patches.

- c++, v2: Implement C++26 P2169R4 - Placeholder variables with no name 
[PR110349]
  https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630802.html

- c++, v2: Implement C++26 P2741R3 - user-generated static_assert messages 
[PR110348]
  https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630803.html

- c++: Implement C++ DR 2406 - [[fallthrough]] attribute and iteration 
statements [PR107571]
  https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628487.html
  (from this one Richi approved the middle-end changes)

- c++, v2: Implement C++26 P1854R4 - Making non-encodable string literals 
ill-formed [PR110341]
  https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635170.html

- c++: Fix error recovery ICE [PR112365]
  https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635210.html

Thanks

Jakub

[Committed] RISC-V: Adapt VLS init tests

2023-11-13 Thread Juzhe-Zhong

I realized that the init tests were wrong because of my earlier mistakes.
Fixed them and committed.
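
For illustration, the shape of the adjusted initialiser (a vector built
directly from a list of constants) looks roughly like this outside the
DEF_INIT macro.  This is a sketch using the GCC/Clang vector_size extension;
the names v4si and init_v4si are chosen for the example, and memcpy is used
for the store so that the output pointer needs no special alignment.

```cpp
#include <cstdint>
#include <cstring>

// GCC/Clang vector extension: four 32-bit integers in 16 bytes.
typedef std::int32_t v4si __attribute__ ((vector_size (16)));

// Initialise the vector from constants, as the adjusted macro body does
// with its __VA_ARGS__ list, then store it to OUT.
void init_v4si (std::int32_t *__restrict out)
{
  v4si v = {0, 1, 2, 3};
  std::memcpy (out, &v, sizeof v);
}
```

A constant 0, 1, 2, ... series like this is the kind of initialiser for which
the updated integer tests scan for vid.v.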

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Fix init test.
* gcc.target/riscv/rvv/autovec/vls/init-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/init-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/init-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/init-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/init-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/init-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/init-7.c: Ditto.

---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h| 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-1.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-3.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-4.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-5.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-6.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-7.c | 2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
index 2e91b9a9664..9cc3656e710 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
@@ -437,7 +437,7 @@ typedef double v512df __attribute__ ((vector_size (4096)));
   void init_##TYPE1##_##TYPE2##_##NUM (VARS##NUM (TYPE2, __VA_ARGS__),        \
   TYPE2 *__restrict out)                                                      \
   {                                                                           \
-TYPE1 v = {INIT##NUM (__VA_ARGS__)};                                          \
+TYPE1 v = {__VA_ARGS__};                                                      \
 *(TYPE1 *) out = v;                                                           \
   }
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-1.c
index aec2c6e5e5f..0f78ae0ebe2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-1.c
@@ -43,4 +43,4 @@ DEF_INIT (v128uqi, uint8_t, 128, 0, 1, 2, 3, 4, 5, 6, 7, 8, 
9, 10, 11, 12, 13,
  113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
  127)
 
-/* { dg-final { scan-assembler-times {vslide1down\.vx} 494 } } */
+/* { dg-final { scan-assembler-times {vid\.v} 14 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c
index f9c58aef553..f27c395441b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c
@@ -45,4 +45,4 @@ DEF_INIT (v128uhi, uint16_t, 128, 0, 1, 2, 3, 4, 5, 6, 7, 8, 
9, 10, 11, 12, 13,
  113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
  127)
 
-/* { dg-final { scan-assembler-times {vslide1down\.vx} 494 } } */
+/* { dg-final { scan-assembler-times {vid\.vx} 494 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-3.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-3.c
index eb970c7b042..df15bd7300f 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-3.c
@@ -24,4 +24,4 @@ DEF_INIT (v128hf, _Float16, 128, 0, 1, 2, 3, 4, 5, 6, 7, 8, 
9, 10, 11, 12, 13,
  113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
  127)
 
-/* { dg-final { scan-assembler-times {vfslide1down\.vf} 247 } } */
+/* { dg-final { scan-assembler-times {vle16\.v} 7 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-4.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-4.c
index fedeb445a2b..09bdbd19cc0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-4.c
@@ -45,4 +45,4 @@ DEF_INIT (v128usi, uint32_t, 128, 0, 1, 2, 3, 4, 5, 6, 7, 8, 
9, 10, 11, 12, 13,
  113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
  127)
 
-/* { dg-final { scan-assembler-times {vslide1down\.vx} 494 } } */
+/* { dg-final { scan-assembler-times {vid\.v} 14 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-5.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-5.c
index c93ac524c88..65ca8cb41e3 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-5.c
@@ -23,4 +23,4 @@ DEF_INIT (v128sf, float, 128, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
10, 11, 12, 13, 14,
  100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113,
  114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124,

Re: [RFC PATCH] Detecting lifetime-dse issues via Valgrind

On Tue, Oct 24, 2023 at 4:12 PM  wrote:
>
> From: Daniil Frolov 
>
> PR 66487 is asking to provide sanitizer-like detection for C++ object lifetime
> violations that are worked around with -fno-lifetime-dse in Firefox, LLVM,
> OpenJade.
>
> The discussion in the PR was centered around extending MSan, but MSan was not
> ported to GCC (and requires rebuilding everything with instrumentation).
>
> Instead, allow Valgrind to see lifetime boundaries by emitting client requests
> along *this = { CLOBBER }.  The client request marks the "clobbered" memory as
> undefined for Valgrind; clobbering assignments mark the beginning of ctor and
> end of dtor execution for C++ objects.  Hence, attempts to read object storage
> after the destructor, or "pre-initialize" its fields prior to the constructor
> will be caught.
>
> Valgrind client requests are offered as macros that emit inline asm.  For use
> in code generation, we need to wrap them in a built-in.  We know that
> implementing such a built-in in libgcc is undesirable; ideally the contents of
> libgcc should not depend on the availability of external headers.  Suggestions
> for cleaner solutions would be welcome.
>
> gcc/ChangeLog:
>
> * Makefile.in: Add gimple-valgrind.o.
> * builtins.def (BUILT_IN_VALGRIND_MEM_UNDEF): Add new built-in.
> * common.opt: Add new option.
> * passes.def: Add new pass.
> * tree-pass.h (make_pass_emit_valgrind): New function.

Another generic comment - placing a built-in call probably pessimizes code
generation unless we handle it specially during alias analysis (or in
builtin_fnspec).  I also don't like having another pass for this - did you
investigate to do the instrumentation at the point the CLOBBERs are
introduced?  Another possibility would be to make this more generic
and emit the instrumentation when we lower GIMPLE_BIND during
the GIMPLE lowering pass, you wouldn't then rely on the CLOBBERs
some of which only appear when -fstack-reuse=none is not used.

Richard.

> * gimple-valgrind.cc: New file.
>
> libgcc/ChangeLog:
>
> * Makefile.in: Add valgrind.o.
> * config.in: Regenerate.
> * configure: Regenerate.
> * configure.ac: Add option --enable-valgrind-annotations into libgcc
> config.
> * libgcc2.h (__valgrind_make_mem_undefined): New function.
> * valgrind.c: New file.
> ---
>  gcc/Makefile.in|   1 +
>  gcc/builtins.def   |   1 +
>  gcc/common.opt |   4 ++
>  gcc/gimple-valgrind.cc | 124 +
>  gcc/passes.def |   1 +
>  gcc/tree-pass.h|   1 +
>  libgcc/Makefile.in |   2 +-
>  libgcc/config.in   |   9 +++
>  libgcc/configure   |  79 ++
>  libgcc/configure.ac|  48 
>  libgcc/libgcc2.h   |   1 +
>  libgcc/valgrind.c  |  50 +
>  12 files changed, 320 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/gimple-valgrind.cc
>  create mode 100644 libgcc/valgrind.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 9cc16268abf..ded6bdf1673 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1487,6 +1487,7 @@ OBJS = \
> gimple-ssa-warn-access.o \
> gimple-ssa-warn-alloca.o \
> gimple-ssa-warn-restrict.o \
> +   gimple-valgrind.o \
> gimple-streamer-in.o \
> gimple-streamer-out.o \
> gimple-walk.o \
> diff --git a/gcc/builtins.def b/gcc/builtins.def
> index 5953266acba..42d34189f1e 100644
> --- a/gcc/builtins.def
> +++ b/gcc/builtins.def
> @@ -1064,6 +1064,7 @@ DEF_GCC_BUILTIN(BUILT_IN_VA_END, "va_end", 
> BT_FN_VOID_VALIST_REF, ATTR_N
>  DEF_GCC_BUILTIN(BUILT_IN_VA_START, "va_start", 
> BT_FN_VOID_VALIST_REF_VAR, ATTR_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN(BUILT_IN_VA_ARG_PACK, "va_arg_pack", BT_FN_INT, 
> ATTR_PURE_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN(BUILT_IN_VA_ARG_PACK_LEN, "va_arg_pack_len", 
> BT_FN_INT, ATTR_PURE_NOTHROW_LEAF_LIST)
> +DEF_EXT_LIB_BUILTIN(BUILT_IN_VALGRIND_MEM_UNDEF, 
> "__valgrind_make_mem_undefined", BT_FN_VOID_PTR_SIZE, ATTR_NOTHROW_LEAF_LIST)
>  DEF_EXT_LIB_BUILTIN(BUILT_IN__EXIT, "_exit", BT_FN_VOID_INT, 
> ATTR_NORETURN_NOTHROW_LEAF_LIST)
>  DEF_C99_BUILTIN(BUILT_IN__EXIT2, "_Exit", BT_FN_VOID_INT, 
> ATTR_NORETURN_NOTHROW_LEAF_LIST)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index f137a1f81ac..c9040386956 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2515,6 +2515,10 @@ starts and when the destructor finishes.
>  flifetime-dse=
>  Common Joined RejectNegative UInteger Var(flag_lifetime_dse) Optimization 
> IntegerRange(0, 2)
>
> +fvalgrind-emit-annotations
> +Common Var(flag_valgrind_annotations,1)
> +Emit Valgrind annotations with respect to object's lifetime.
> +
>  flive-patching
>  Common RejectNegative Alias(flive-patching=,inline-clone) Optimization
>
> diff --git a/gcc/gimple-valgrind.cc b/gcc/gimple-valgrind.cc
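
For readers unfamiliar with the mechanism discussed above, here is a rough sketch (not the code from the patch) of what a wrapper around the Memcheck client request might look like.  The names sketch_make_mem_undefined, end_lifetime and HAVE_MEMCHECK_H are made up for illustration; VALGRIND_MAKE_MEM_UNDEFINED is the macro provided by <valgrind/memcheck.h>.  The wrapper takes a pointer and a size, matching the BT_FN_VOID_PTR_SIZE signature in the builtins.def hunk, and compiles to a no-op when the Valgrind headers are not installed.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real Memcheck macro when the Valgrind headers are installed;
   otherwise the wrapper below does nothing.  */
#if defined(__has_include)
# if __has_include(<valgrind/memcheck.h>)
#  include <valgrind/memcheck.h>
#  define HAVE_MEMCHECK_H 1
# endif
#endif

/* Hypothetical stand-in for the proposed __valgrind_make_mem_undefined:
   tell Memcheck that SIZE bytes at PTR no longer hold a defined value.  */
void
sketch_make_mem_undefined (void *ptr, size_t size)
{
  assert (ptr != NULL || size == 0);
#ifdef HAVE_MEMCHECK_H
  (void) VALGRIND_MAKE_MEM_UNDEFINED (ptr, size);
#else
  (void) ptr;
  (void) size;
#endif
}

struct point { int x, y; };

/* Models the point where an object's lifetime ends.  When running under
   Memcheck, a later use of p->x would be reported as undefined.  */
void
end_lifetime (struct point *p)
{
  sketch_make_mem_undefined (p, sizeof *p);
}
```

Outside of Valgrind the client request has no effect on the program, so the sketch only changes what Memcheck reports, not what the code computes.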

Re: [PATCH V2] VECT: Support mask_len_strided_load/mask_len_strided_store in loop vectorize

On Mon, 13 Nov 2023, juzhe.zh...@rivai.ai wrote:

> Hi. Ping this patch, which is the last optab pattern for RVV support.
> 
> The mask_len_strided_load/mask_len_strided_store document has been approved:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635103.html 
> 
> Bootstrap on X86 and regtest no regression.
> Tested on aarch64 no regression.
> Tested on RISC-V no regression.
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Juzhe-Zhong
> Date: 2023-11-06 14:55
> To: gcc-patches
> CC: richard.sandiford; rguenther; Juzhe-Zhong
> Subject: [PATCH V2] VECT: Support 
> mask_len_strided_load/mask_len_strided_store in loop vectorize
> This patch adds strided load/store support to the loop vectorizer, depending on
> STMT_VINFO_STRIDED_P.
>  
> Bootstrap and regression on X86 passed.
>  
> Ok for trunk ?
>  
> gcc/ChangeLog:
>  
> * internal-fn.cc (strided_load_direct): New function.
> (strided_store_direct): Ditto.
> (expand_strided_store_optab_fn): Ditto.
> (expand_scatter_store_optab_fn): Add strided store.
> (expand_strided_load_optab_fn): New function.
> (expand_gather_load_optab_fn): Add strided load.
> (direct_strided_load_optab_supported_p): New function.
> (direct_strided_store_optab_supported_p): Ditto.
> (internal_load_fn_p): Add strided load.
> (internal_strided_fn_p): New function.
> (internal_fn_len_index): Add strided load/store.
> (internal_fn_mask_index): Ditto.
> (internal_fn_stored_value_index): Add strided store.
> (internal_strided_fn_supported_p): New function.
> * internal-fn.def (MASK_LEN_STRIDED_LOAD): New IFN.
> (MASK_LEN_STRIDED_STORE): Ditto.
> * internal-fn.h (internal_strided_fn_p): New function.
> (internal_strided_fn_supported_p): Ditto.
> * optabs-query.cc (supports_vec_gather_load_p): Add strided load.
> (supports_vec_scatter_store_p): Add strided store.
> * optabs-query.h (supports_vec_gather_load_p): Add strided load.
> (supports_vec_scatter_store_p): Add strided store.
> * tree-vect-data-refs.cc (vect_prune_runtime_alias_test_list): Add strided 
> load/store.
> (vect_gather_scatter_fn_p): Ditto.
> (vect_check_gather_scatter): Ditto.
> * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Ditto.
> (vect_truncate_gather_scatter_offset): Ditto.
> (vect_use_strided_gather_scatters_p): Ditto.
> (vect_get_strided_load_store_ops): Ditto.
> (vectorizable_store): Ditto.
> (vectorizable_load): Ditto.
> * tree-vectorizer.h (vect_gather_scatter_fn_p): Ditto.
>  
> ---
> gcc/internal-fn.cc | 101 -
> gcc/internal-fn.def|   4 ++
> gcc/internal-fn.h  |   2 +
> gcc/optabs-query.cc|  25 ++---
> gcc/optabs-query.h |   4 +-
> gcc/tree-vect-data-refs.cc |  45 +
> gcc/tree-vect-stmts.cc |  65 ++--
> gcc/tree-vectorizer.h  |   2 +-
> 8 files changed, 199 insertions(+), 49 deletions(-)
>  
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index c7d3564faef..a31a65755c7 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -164,6 +164,7 @@ init_internal_fns ()
> #define load_lanes_direct { -1, -1, false }
> #define mask_load_lanes_direct { -1, -1, false }
> #define gather_load_direct { 3, 1, false }
> +#define strided_load_direct { -1, -1, false }
> #define len_load_direct { -1, -1, false }
> #define mask_len_load_direct { -1, 4, false }
> #define mask_store_direct { 3, 2, false }
> @@ -172,6 +173,7 @@ init_internal_fns ()
> #define vec_cond_mask_direct { 1, 0, false }
> #define vec_cond_direct { 2, 0, false }
> #define scatter_store_direct { 3, 1, false }
> +#define strided_store_direct { 1, 1, false }
> #define len_store_direct { 3, 3, false }
> #define mask_len_store_direct { 4, 5, false }
> #define vec_set_direct { 3, 3, false }
> @@ -3561,62 +3563,87 @@ expand_LAUNDER (internal_fn, gcall *call)
>expand_assignment (lhs, gimple_call_arg (call, 0), false);
> }
> +#define expand_strided_store_optab_fn expand_scatter_store_optab_fn
> +
> /* Expand {MASK_,}SCATTER_STORE{S,U} call CALL using optab OPTAB.  */

Please amend the comment

> static void
> expand_scatter_store_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
> {
> +  insn_code icode;
>internal_fn ifn = gimple_call_internal_fn (stmt);
>int rhs_index = internal_fn_stored_value_index (ifn);
>tree base = gimple_call_arg (stmt, 0);
>tree offset = gimple_call_arg (stmt, 1);
> -  tree scale = gimple_call_arg (stmt, 2);
>tree rhs = gimple_call_arg (stmt, rhs_index);
>rtx base_rtx = expand_normal (base);
>rtx offset_rtx = expand_normal (offset);
> -  HOST_WIDE_INT scale_int = tree_to_shwi (scale);
>rtx rhs_rtx = expand_normal (rhs);
>class expand_operand ops[8];
>int i = 0;
>create_address_operand (&ops[i++], base_rtx);
> -  create_input_operand (&ops[i++], offset_rtx, TYPE_MODE (TREE_TYPE 
> (offset)));
> -  create_integer_operand (&ops[i++], TYPE_UNSIGNED (TREE_TYPE (offset)));
> -  create_integer_operand (&ops[i++], scale_int);
> +  if (internal_stri
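
As a reading aid for the strided load/store discussion above, the following is a scalar model of what a masked, length-controlled strided access computes.  It is a sketch under stated assumptions, not the expander code: the function names are invented, the stride is assumed to be in bytes, the bias operand is ignored, and lanes that are masked off or beyond the length are simply left untouched (the real patterns may leave them unspecified).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a masked, length-controlled strided load of int32_t
   lanes.  Lane I is read from BASE + I * STRIDE (STRIDE in bytes) when
   I < LEN and MASK[I] is nonzero; other lanes of DEST are not written.  */
void
model_mask_len_strided_load (int32_t *dest, const void *base,
			     ptrdiff_t stride, const unsigned char *mask,
			     size_t len)
{
  const unsigned char *p = (const unsigned char *) base;
  assert (dest != NULL && base != NULL && mask != NULL);
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      memcpy (&dest[i], p + (ptrdiff_t) i * stride, sizeof (int32_t));
}

/* The matching store: lane I of SRC is written to BASE + I * STRIDE
   when I < LEN and MASK[I] is nonzero.  */
void
model_mask_len_strided_store (void *base, ptrdiff_t stride,
			      const int32_t *src, const unsigned char *mask,
			      size_t len)
{
  unsigned char *p = (unsigned char *) base;
  assert (src != NULL && base != NULL && mask != NULL);
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      memcpy (p + (ptrdiff_t) i * stride, &src[i], sizeof (int32_t));
}
```

With a stride of 2 * sizeof (int32_t), the load picks every second element of an int32_t array, which is the access shape a loop such as "for (i = 0; i < n; i++) x[i] = a[i * 2]" would need.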

[PATCH] tree-optimization/111792 - new testcase

This was fixed as part of the PR111000 fix.

Tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/111792
PR tree-optimization/111000
* gcc.dg/torture/pr111792.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr111792.c | 39 +
 1 file changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr111792.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr111792.c 
b/gcc/testsuite/gcc.dg/torture/pr111792.c
new file mode 100644
index 000..58ae6f149d1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr111792.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-require-effective-target lp64 } */
+
+int c, d, h, i, j, l, *n = &h;
+short e, f, g, *k, m;
+long o;
+short p(short p1, int q) { return q >= 32 || p1 > 5 >> q ? 1 : p1 << q; }
+long u(unsigned p1)
+{
+  int r = 50, s, *t = &c;
+ L:
+  m && (*k = 0);
+  for (d = 1; d; d--)
+for (s = 0; s < 3; s++) {
+  *n = i ^ p1;
+  *t = p1 > (unsigned)p((unsigned)(o = 4073709551615) >= p1 && 5, r);
+  if (f)
+goto L;
+}
+  for (; e < 1;)
+return j;
+  int *b[2] = {&s, &r};
+  for (; l; l--) {
+long a[1];
+for (r = 0; r < 1; r++) {
+  h = a[0];
+  if (g)
+goto L;
+}
+  }
+  return 0;
+}
+int main()
+{
+  u(6);
+  if (c != 1)
+__builtin_abort();
+  return 0;
+}
-- 
2.35.3

Re: [PATCH v3 3/4] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets

2023-11-13 Thread Manolis Tsamis

Hi Jeff,

Indeed, that sounds like a good idea. I will make this separate and
send it after the required testing.
I'll see what can be done about a testcase.

Best,
Manolis

On Sat, Nov 11, 2023 at 1:20 AM Jeff Law  wrote:
>
>
>
> On 8/30/23 04:13, Manolis Tsamis wrote:
> > The existing implementation of need_cmov_or_rewire and
> > noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG.
> > This commit enhances them so they can handle/rewire arbitrary set
> > statements.
> >
> > To do that a new helper struct noce_multiple_sets_info is introduced which is
> > used by noce_convert_multiple_sets and its helper functions. This results in
> > cleaner function signatures, improved efficiency (a number of vecs and hash
> > set/map are replaced with a single vec of struct) and simplicity.
> >
> > gcc/ChangeLog:
> >
> >   * ifcvt.cc (need_cmov_or_rewire): Renamed 
> > init_noce_multiple_sets_info.
> >   (init_noce_multiple_sets_info): Initialize noce_multiple_sets_info.
> >   (noce_convert_multiple_sets_1): Use noce_multiple_sets_info and handle
> >   rewiring of multiple registers.
> >   (noce_convert_multiple_sets): Updated to use noce_multiple_sets_info.
> >   * ifcvt.h (struct noce_multiple_sets_info): Introduce new struct
> >   noce_multiple_sets_info to store info for noce_convert_multiple_sets.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/aarch64/ifcvt_multiple_sets_rewire.c: New test.
> So this seems like (in theory) it could move forward independently.  The
> handling of arbitrary statements code wouldn't be exercised yet, but
> that's OK IMHO as I don't think anyone is fundamentally against trying
> to handle additional kinds of statements.
>
> So my suggestion would be to bootstrap & regression test this
> independently.  AFAICT this should have no functional change if it were
> to go in on its own.  Note the testsuite entry might not be applicable
> if this were to go in on its own and would need to roll into another
> patch in the series.
>
>
> Jeff

Re: [PATCH v3 4/4] ifcvt: Remove obsolete code for subreg handling in noce_convert_multiple_sets

2023-11-13 Thread Manolis Tsamis

Yes, my finding back then was that this is leftover code from the
initial implementation, nothing to do with the rest of the changes.
I will first re-evaluate this and test it separately from the other
series. If all is good I'll let you know so we can proceed.

Manolis

On Sat, Nov 11, 2023 at 12:03 AM Jeff Law  wrote:
>
>
>
> On 8/30/23 04:14, Manolis Tsamis wrote:
> > This code used to handle register replacement issues with SUBREG before
> > simplify_replace_rtx was introduced. This should not be needed anymore as
> > new_val has the correct mode and that should be preserved by
> > simplify_replace_rtx.
> >
> > gcc/ChangeLog:
> >
> >   * ifcvt.cc (noce_convert_multiple_sets_1): Remove old code.
> So is it the case that this code is supposed to no longer be needed as a
> result of your kit or it is unnecessary independent of patches 1..3?  If
> the latter then it's OK for the trunk now.
>
> Jeff

Re: [PATCH v3 2/4] ifcvt: Allow more operations in multiple set if conversion

2023-11-13 Thread Manolis Tsamis

On Thu, Oct 19, 2023 at 10:46 PM Richard Sandiford
 wrote:
>
> Manolis Tsamis  writes:
> > Currently the operations allowed for if conversion of a basic block with
> > multiple sets are few, namely REG, SUBREG and CONST_INT (as controlled by
> > bb_ok_for_noce_convert_multiple_sets).
> >
> > This commit allows more operations (arithmetic, compare, etc) to participate
> > in if conversion. The target's profitability hook and ifcvt's costing is
> > expected to reject sequences that are unprofitable.
> >
> > This is especially useful for targets which provide a rich selection of
> > conditional instructions (like aarch64 which has cinc, csneg, csinv, ccmp, 
> > ...)
> > which are currently not used in basic blocks with more than a single set.
> >
> > gcc/ChangeLog:
> >
> >   * ifcvt.cc (try_emit_cmove_seq): Modify comments.
> >   (noce_convert_multiple_sets_1): Modify comments.
> >   (bb_ok_for_noce_convert_multiple_sets): Allow more operations.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/aarch64/ifcvt_multiple_sets_arithm.c: New test.
> >
> > Signed-off-by: Manolis Tsamis 
> > ---
> >
> > Changes in v3:
> > - Add SCALAR_INT_MODE_P check in 
> > bb_ok_for_noce_convert_multiple_sets.
> > - Allow rewiring of multiple regs.
> > - Refactor code with noce_multiple_sets_info.
> > - Remove old code for subregs.
> >
> >  gcc/ifcvt.cc  | 63 ++-
> >  .../aarch64/ifcvt_multiple_sets_arithm.c  | 79 +++
> >  2 files changed, 123 insertions(+), 19 deletions(-)
> >  create mode 100644 
> > gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c
> >
> > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> > index 3273aeca125..efe8ab1577a 100644
> > --- a/gcc/ifcvt.cc
> > +++ b/gcc/ifcvt.cc
> > @@ -3215,13 +3215,13 @@ try_emit_cmove_seq (struct noce_if_info *if_info, 
> > rtx temp,
> >  /* We have something like:
> >
> >   if (x > y)
> > -   { i = a; j = b; k = c; }
> > +   { i = EXPR_A; j = EXPR_B; k = EXPR_C; }
> >
> > Make it:
> >
> > - tmp_i = (x > y) ? a : i;
> > - tmp_j = (x > y) ? b : j;
> > - tmp_k = (x > y) ? c : k;
> > + tmp_i = (x > y) ? EXPR_A : i;
> > + tmp_j = (x > y) ? EXPR_B : j;
> > + tmp_k = (x > y) ? EXPR_C : k;
> >   i = tmp_i;
> >   j = tmp_j;
> >   k = tmp_k;
> > @@ -3637,11 +3637,10 @@ noce_convert_multiple_sets_1 (struct noce_if_info 
> > *if_info,
> >
> >
> >
> > -/* Return true iff basic block TEST_BB is comprised of only
> > -   (SET (REG) (REG)) insns suitable for conversion to a series
> > -   of conditional moves.  Also check that we have more than one set
> > -   (other routines can handle a single set better than we would), and
> > -   fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
> > +/* Return true iff basic block TEST_BB is suitable for conversion to a
> > +   series of conditional moves.  Also check that we have more than one
> > +   set (other routines can handle a single set better than we would),
> > +   and fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
> > through the insns store the sum of their potential costs in COST.  */
> >
> >  static bool
> > @@ -3667,20 +3666,46 @@ bb_ok_for_noce_convert_multiple_sets (basic_block 
> > test_bb, unsigned *cost)
> >rtx dest = SET_DEST (set);
> >rtx src = SET_SRC (set);
> >
> > -  /* We can possibly relax this, but for now only handle REG to REG
> > -  (including subreg) moves.  This avoids any issues that might come
> > -  from introducing loads/stores that might violate data-race-freedom
> > -  guarantees.  */
> > -  if (!REG_P (dest))
> > +  /* Do not handle anything involving memory loads/stores since it 
> > might
> > +  violate data-race-freedom guarantees.  */
> > +  if (!REG_P (dest) || contains_mem_rtx_p (src))
> > + return false;
> > +
> > +  if (!SCALAR_INT_MODE_P (GET_MODE (src)))
> >   return false;
> >
> > -  if (!((REG_P (src) || CONSTANT_P (src))
> > - || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
> > -   && subreg_lowpart_p (src
> > +  /* Allow a wide range of operations and let the costing function 
> > decide
> > +  if the conversion is worth it later.  */
> > +  enum rtx_code code = GET_CODE (src);
> > +  if (!(CONSTANT_P (src)
> > + || code == REG
> > + || code == SUBREG
> > + || code == ZERO_EXTEND
> > + || code == SIGN_EXTEND
> > + || code == NOT
> > + || code == NEG
> > + || code == PLUS
> > + || code == MINUS
> > + || code == AND
> > + || code == IOR
> > + || code == MULT
> > + || code == ASHIFT
> > + || code == ASHIFTRT
> > + || code == NE
> > + || code == EQ
> > + || code == GE
> > + || code == GT
> > + || code == LE
> > + || code == LT
> > + || co
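
To make the transformation described in the try_emit_cmove_seq comment above concrete, here is a source-level sketch in C.  It is only an analogy: the pass works on RTL registers, and the function and struct names here are made up.  The three right-hand sides stand in for EXPR_A, EXPR_B and EXPR_C, and the values are passed and returned by value so that no memory access is involved, in line with the patch's rejection of loads and stores.

```c
#include <assert.h>
#include <stdbool.h>

struct vals { int i, j, k; };

/* Original shape: several sets guarded by one condition.  */
struct vals
branchy (struct vals v, int x, int y, int a, int b, int c)
{
  if (x > y)
    {
      v.i = a + b;
      v.j = b - c;
      v.k = a & c;
    }
  return v;
}

/* If-converted shape: every set becomes a conditional select into a
   temporary, followed by unconditional copies.  */
struct vals
if_converted (struct vals v, int x, int y, int a, int b, int c)
{
  bool cond = x > y;
  int tmp_i = cond ? a + b : v.i;
  int tmp_j = cond ? b - c : v.j;
  int tmp_k = cond ? (a & c) : v.k;
  v.i = tmp_i;
  v.j = tmp_j;
  v.k = tmp_k;
  return v;
}

/* Check that both shapes agree for one set of inputs.  */
void
check_equivalent (struct vals v, int x, int y, int a, int b, int c)
{
  struct vals r1 = branchy (v, x, y, a, b, c);
  struct vals r2 = if_converted (v, x, y, a, b, c);
  assert (r1.i == r2.i && r1.j == r2.j && r1.k == r2.k);
}
```

Whether the second shape is actually faster depends on the target, which is why the patch leaves the decision to the profitability hook and ifcvt's costing.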

[PATCH 0/6] Turn some C warnings into errors by default

This patch series converts the following warnings into errors by
default:

  -Wint-conversion
  -Wimplicit-function-declaration
  -Wimplicit-int
  -Wreturn-mismatch
  -Wincompatible-pointer-types

As explained in the first commit, I decided not to use permerror_opt
because it does not exhibit the existing behavior for -pedantic-errors.

The impact on existing sources of the last commit is not really known to
me at this point.  I plan to start a Fedora build later this week with
an instrumented compiler, to see how much of a compatibility impact it will
have.  The first conversion pass through Fedora only covered
-Wimplicit-function-declaration, -Wimplicit-int.  I started looking at
-Wint-conversion, and it did not seem to be too bad, so I think
including it should be fine.  I'm more worried about
-Wincompatible-pointer-types.

I have not yet added a new overview test for -fpermissive.  Such a test
should trigger all the dozen or so places where I introduced
pedpermerror, and see what happens under multiple dialects, each with
and without -fpermissive, and maybe also with and without
-pedantic-errors in -std=gnu89 and default modes.  I plan to do this
once I get some initial feedback on the direction of this series
because this test would likely be obsolete fairly quickly if changes to
the diagnostics are required.  I did copy some existing tests to test
both the error and warning (-fpermissive) diagnostics, and adjusted
others to expect errors, so there is already quite a bit of coverage
without that overview test.

Right now, this series breaks the build on aarch64-linux-gnu due to an
incompatible pointer assignment in libgcc:

  [PATCH] aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder
  


Other targets had the same issue previously, but I've already fixed most
of them (I hope).  There could of course be similar issues lurking in
target-specific code, or even in system headers.

With the recent testsuite fixes, the testsuite should be fairly clean
despite these changes.  I verified that on i686-linux-gnu,
powerpc64-linux-gnu, and x86_64-linux-gnu.  There is one
aarch64-linux-gnu testsuite change I'd like the AArch64 maintainers to
review:

  [PATCH] aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c
  


Recently, I also found a problem in the gm2 testsuite:

  [PATCH] gm2: Add missing declaration of m2pim_M2RTS_Terminate to test
  


Thanks,
Florian


Florian Weimer (6):
  c-family: Introduce pedpermerror
  c: Turn int-conversion warnings into permerrors
  c: Turn -Wimplicit-function-declaration into a pedpermerror
  c: Turn -Wimplicit-int into a pedpermerror
  c: Turn -Wreturn-mismatch into a pedpermerror
  c: Turn -Wincompatible-pointer-types into a pedpermerror

 gcc/c-family/c-common.h   |   4 +
 gcc/c-family/c-warn.cc|  34 
 gcc/c/c-decl.cc   |  40 ++--
 gcc/c/c-typeck.cc | 164 +--
 gcc/diagnostic-core.h |   3 +
 gcc/diagnostic.cc |   7 +
 gcc/doc/invoke.texi   |  33 +++-
 gcc/testsuite/c-c++-common/pr77624-1.c|   4 +-
 .../c-c++-common/spellcheck-reserved.c|   4 +-
 gcc/testsuite/gcc.dg/20030906-1.c |   2 +-
 gcc/testsuite/gcc.dg/20030906-1a.c|  21 ++
 gcc/testsuite/gcc.dg/20030906-2.c |   2 +-
 gcc/testsuite/gcc.dg/20030906-2a.c|  21 ++
 .../Wimplicit-function-declaration-c99-2.c|   7 +
 .../Wimplicit-function-declaration-c99.c  |   2 +-
 gcc/testsuite/gcc.dg/Wimplicit-int-1.c|   2 +-
 gcc/testsuite/gcc.dg/Wimplicit-int-1a.c   |  11 ++
 gcc/testsuite/gcc.dg/Wimplicit-int-4.c|   2 +-
 gcc/testsuite/gcc.dg/Wimplicit-int-4a.c   |  11 ++
 .../gcc.dg/Wincompatible-pointer-types-2.c|   2 +-
 .../gcc.dg/Wincompatible-pointer-types-4.c|   2 +-
 .../gcc.dg/Wincompatible-pointer-types-5.c|  10 +
 .../gcc.dg/Wincompatible-pointer-types-6.c|  10 +
 gcc/testsuite/gcc.dg/Wint-conversion-2.c  |   2 +-
 gcc/testsuite/gcc.dg/Wint-conversion-3.c  |   2 +-
 gcc/testsuite/gcc.dg/Wint-conversion-4.c  |  14 ++
 gcc/testsuite/gcc.dg/Wreturn-mismatch-1.c |   2 +-
 gcc/testsuite/gcc.dg/Wreturn-mismatch-1a.c|  40 
 gcc/testsuite/gcc.dg/Wreturn-mismatch-2.c |   2 +-
 gcc/testsuite/gcc.dg/Wreturn-mismatch-2a.c|  41 
 gcc/testsuite/gcc.dg/anon-struct-11.c |   5 +-
 gcc/testsuite/gcc.dg/anon-struct-11a.c| 111 +++
 gcc/testsuite/gcc.dg/anon-struct-13.c |   2 +-
 gcc/testsuite/gcc.dg/anon-struct-13a.c|  76 +++
 gcc/testsuite/gcc.dg/assign-warn-1.c  |   2 +-
 gcc/tes

[PATCH 1/6] c-family: Introduce pedpermerror

It turns out that permerror_opt is not directly usable for
-fpermissive in the C front end.  The front end uses pedwarn
extensively, and pedwarns are not overridable by -Wno-* options,
yet permerrors are.  Add new pedpermerror helpers that turn into
pedwarns if -pedantic-errors is active.

Due to the dependency on flag_pedantic_errors, the new helpers
are specific to the C-family front ends.  For implementing the
rich location variant, export emit_diagnostic_valist from
gcc/diagnostic.cc in parallel to its location_t variant.

gcc/

* diagnostic-core.h (emit_diagnostic_valist): Declare function.
* diagnostic.cc (emit_diagnostic_valist): Define it.

gcc/c-family/

* c-common.h (pedpermerror): Declare functions.
* c-warn.cc (pedpermerror): Define them.
---
 gcc/c-family/c-common.h |  4 
 gcc/c-family/c-warn.cc  | 34 ++
 gcc/diagnostic-core.h   |  3 +++
 gcc/diagnostic.cc   |  7 +++
 4 files changed, 48 insertions(+)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index b57e83d7c5d..789e0bf2459 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1486,6 +1486,10 @@ extern void warn_for_address_or_pointer_of_packed_member 
(tree, tree);
 extern void warn_parm_array_mismatch (location_t, tree, tree);
 extern void maybe_warn_sizeof_array_div (location_t, tree, tree, tree, tree);
 extern void do_warn_array_compare (location_t, tree_code, tree, tree);
+extern bool pedpermerror (location_t, int, const char *,
+ ...) ATTRIBUTE_GCC_DIAG(3,4);
+extern bool pedpermerror (rich_location *, int, const char *,
+ ...) ATTRIBUTE_GCC_DIAG(3,4);
 
 /* Places where an lvalue, or modifiable lvalue, may be required.
Used to select diagnostic messages in lvalue_error and
diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index b1bd8ba9f42..475a3e4b42e 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -3932,3 +3932,37 @@ check_for_xor_used_as_pow (location_t lhs_loc, tree 
lhs_val,
  lhs_uhwi, lhs_uhwi);
 }
 }
+
+/* If !flag_pedantic_errors, equivalent to permerror_opt, otherwise to
+   pedwarn.  */
+
+bool
+pedpermerror (location_t location, int opt, const char *gmsgid, ...)
+{
+  auto_diagnostic_group d;
+  va_list ap;
+  va_start (ap, gmsgid);
+  rich_location richloc (line_table, location);
+  bool ret = emit_diagnostic_valist (flag_pedantic_errors
+? DK_PEDWARN : DK_PERMERROR,
+location, opt, gmsgid, &ap);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as "pedpermerror" above, but at RICHLOC.  */
+
+bool
+pedpermerror (rich_location *richloc, int opt, const char *gmsgid, ...)
+{
+  gcc_assert (richloc);
+
+  auto_diagnostic_group d;
+  va_list ap;
+  va_start (ap, gmsgid);
+  bool ret = emit_diagnostic_valist (flag_pedantic_errors
+? DK_PEDWARN : DK_PERMERROR,
+richloc, opt, gmsgid, &ap);
+  va_end (ap);
+  return ret;
+}
diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 04eba3d140e..eac775fb573 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -121,6 +121,9 @@ extern bool emit_diagnostic (diagnostic_t, location_t, int,
 const char *, ...) ATTRIBUTE_GCC_DIAG(4,5);
 extern bool emit_diagnostic (diagnostic_t, rich_location *, int,
 const char *, ...) ATTRIBUTE_GCC_DIAG(4,5);
+extern bool emit_diagnostic_valist (diagnostic_t, rich_location *, int,
+   const char *,
+   va_list *) ATTRIBUTE_GCC_DIAG (4,0);
 extern bool emit_diagnostic_valist (diagnostic_t, location_t, int, const char 
*,
va_list *) ATTRIBUTE_GCC_DIAG (4,0);
 extern bool seen_error (void);
diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index addd6606eaa..6e73343dbdb 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -1829,6 +1829,13 @@ emit_diagnostic_valist (diagnostic_t kind, location_t 
location, int opt,
   return diagnostic_impl (&richloc, NULL, opt, gmsgid, ap, kind);
 }
 
+bool
+emit_diagnostic_valist (diagnostic_t kind, rich_location *richloc, int opt,
+   const char *gmsgid, va_list *ap)
+{
+  return diagnostic_impl (richloc, NULL, opt, gmsgid, ap, kind);
+}
+
 /* An informative note at LOCATION.  Use this for additional details on an 
error
message.  */
 void
-- 
2.41.0
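
As a reading aid for the patch above, the following is a small standalone model of how the diagnostic kind chosen by pedpermerror might translate into a final severity.  It is a sketch, not GCC code: the sketch_* names are invented, and the resolution rules are simplifying assumptions (a pedwarn is an error only under -pedantic-errors, a permerror is downgraded to a warning only under -fpermissive; -Werror, -Wno-* and other classification options are ignored).

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the two diagnostic kinds the patch selects
   between.  These are not GCC's diagnostic_t values.  */
enum sketch_kind { SKETCH_PEDWARN, SKETCH_PERMERROR };
enum sketch_severity { SKETCH_WARNING, SKETCH_ERROR };

/* Mirrors the selection in pedpermerror: a pedwarn when
   -pedantic-errors is active, a permerror otherwise.  */
enum sketch_kind
sketch_pedpermerror_kind (bool pedantic_errors)
{
  return pedantic_errors ? SKETCH_PEDWARN : SKETCH_PERMERROR;
}

/* Assumed resolution rules for this sketch only.  */
enum sketch_severity
sketch_resolve (enum sketch_kind kind, bool pedantic_errors, bool permissive)
{
  switch (kind)
    {
    case SKETCH_PEDWARN:
      return pedantic_errors ? SKETCH_ERROR : SKETCH_WARNING;
    case SKETCH_PERMERROR:
      return permissive ? SKETCH_WARNING : SKETCH_ERROR;
    }
  assert (!"unknown diagnostic kind");
  return SKETCH_ERROR;
}

/* Severity of a pedpermerror diagnostic for a given flag combination.  */
enum sketch_severity
sketch_pedpermerror_severity (bool pedantic_errors, bool permissive)
{
  return sketch_resolve (sketch_pedpermerror_kind (pedantic_errors),
			 pedantic_errors, permissive);
}
```

Under these assumptions the diagnostic is a warning only when -fpermissive is given without -pedantic-errors, and an error in the other three combinations.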

[PATCH 2/6] c: Turn int-conversion warnings into permerrors

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-typeck.cc (build_conditional_expr): Use pedpermerror for
pointer/integer type mismatches, based on -Wint-conversion.
(pedwarn_pedpermerror_init, permerror_init): New function.
(pedwarn_init): Call pedwarn_pedpermerror_init.
(convert_for_assignment): Use pedpermerror and
pedpermerror_init for -Wint-conversion warnings.

gcc/testsuite/

* c-c++-common/pr77624-1.c (foo, bar): Expect
error instead of warning.
* gcc.dg/Wint-conversion-2.c: Compile with -fpermissive due
to expected int-conversion warning.
* gcc.dg/Wint-conversion-3.c: Likewise.
* gcc.dg/Wint-conversion-4.c: New test.  Based on
gcc.dg/Wint-conversion-3.c.  Expect int-conversion errors.
* gcc.dg/assign-warn-1.c: Compile with -fpermissive.
* gcc.dg/assign-warn-4.c: New file.  Extracted from
assign-warn-1.c.  Expect int-conversion errors.
* gcc.dg/diagnostic-types-1.c: Compile with -fpermissive.
* gcc.dg/diagnostic-types-2.c: New file.  Extracted from
gcc.dg/diagnostic-types-1.c.  Expect some errors instead of
warnings.
* gcc.dg/gomp/pr35738.c: Compile with -fpermissive due to
expected int-conversion error.
* gcc.dg/gomp/pr35738-2.c: New test.  Based on
gcc.dg/gomp/pr35738.c.  Expect int-conversion errors.
* gcc.dg/init-excess-3.c: Expect int-conversion errors.
* gcc.dg/overflow-warn-1.c: Likewise.
* gcc.dg/overflow-warn-3.c: Likewise.
* gcc.dg/param-type-mismatch.c: Compile with -fpermissive.
* gcc.dg/param-type-mismatch-2.c: New test.  Copied from
gcc.dg/param-type-mismatch.c.  Expect errors.
* gcc.dg/pr61162-2.c: Compile with -fpermissive.
* gcc.dg/pr61162-3.c: New test. Extracted from
gcc.dg/pr61162-2.c.  Expect int-conversion errors.
* gcc.dg/spec-barrier-3.c: Use -fpermissive due to expected
int-conversion error.
* gcc.dg/spec-barrier-3a.c: New test.  Based on
gcc.dg/spec-barrier-3.c.  Expect int-conversion errors.
* gcc.target/aarch64/acle/memtag_2.c: Use -fpermissive due to expected
int-conversion error.
* gcc.target/aarch64/acle/memtag_2a.c: New test.  Copied from
gcc.target/aarch64/acle/memtag_2.c.  Expect error.
* gcc.target/aarch64/sve/acle/general-c/load_3.c (f1): Expect
error.
* gcc.target/aarch64/sve/acle/general-c/store_2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_index_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_index_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_restricted_1.c
(f1): Likewise.
---
 gcc/c/c-typeck.cc |  98 +
 gcc/doc/invoke.texi   |   6 +
 gcc/testsuite/c-c++-common/pr77624-1.c|   4 +-
 gcc/testsuite/gcc.dg/Wint-conversion-2.c  |   2 +-
 gcc/testsuite/gcc.dg/Wint-conversion-3.c  |   2 +-
 gcc/testsuite/gcc.dg/Wint-conversion-4.c  |  14 ++
 gcc/testsuite/gcc.dg/assign-warn-1.c  |   2 +-
 gcc/testsuite/gcc.dg/assign-warn-4.c  |  21 ++
 gcc/testsuite/gcc.dg/diagnostic-types-1.c |   2 +-
 gcc/testsuite/gcc.dg/diagnostic-types-2.c |  24 +++
 gcc/testsuite/gcc.dg/gomp/pr35738-2.c |  18 ++
 gcc/testsuite/gcc.dg/gomp/pr35738.c   |   2 +-
 gcc/testsuite/gcc.dg/init-excess-3.c  |   4 +-
 gcc/testsuite/gcc.dg/overflow-warn-1.c|   4 +-
 gcc/testsuite/gcc.dg/overflow-warn-3.c|   4 +-
 gcc/testsuite/gcc.dg/param-type-mismatch-2.c  | 187 ++
 gcc/testsuite/gcc.dg/param-type-mismatch.c|   2 +-
 gcc/testsuite/gcc.dg/pr61162-2.c  |   2 +-
 gcc/testsuite/gcc.dg/pr61162-3.c  |  13 ++
 gcc/testsuite/gcc.dg/spec-barrier-3.c |   2 +-
 gcc/testsuite/gcc.dg/spec-barrier-3a.c|  13 ++
 .../gcc.target/aarch64/acle/memtag_2.c|   4 +-
 .../gcc.target/aarch64/acle/memtag_2a.c   |  71 +++
 .../aarch64/sve/acle/general-c/load_3.c   |   2 +-
 .../aarch64/sve/acle/general-c/store_2.c  |   2 +-
 .../acle/general-c/store_scatter_index_1.c|   2 +-
 .../store_scatter_index_restricted_1.c|   2 +-
 .../acle/general-c/store_scatter_offset_2.c   |   2 +-
 .../store_scatter_offset_restricted_1.c   |   2 +-
 29 files changed, 452 insertions(+), 61 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/Wint-conversion-4.c
 create mode 100644 gcc/testsuite/gcc.dg/assign-warn-4.c
 create mode 100644 gcc/testsuite/gcc.dg/diagnostic-types-2.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/pr35738-2.c
 create mode 100644 gcc/testsuite/gcc.dg/param-type-mismat

[PATCH 3/6] c: Turn -Wimplicit-function-declaration into a pedpermerror

In the future, it may make sense to avoid cascading errors from
the implicit declaration, especially its assumed int return type.
This change here only changes the kind of the diagnostic, not
its wording or consequences.
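
As an illustration of the kind of code this affects, here is a minimal
sketch that is not taken from the patch or its testsuite; name_length and
count_both are hypothetical names.  The calls below are fine because a
prototype is visible first.  If the prototype were removed, the first call
would implicitly declare name_length with an assumed int return type, which
is the situation this patch reports via pedpermerror rather than as a
warning.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper.  The prototype is visible before the first call,
   so no implicit declaration takes place.  */
size_t name_length (const char *name);

size_t
count_both (const char *a, const char *b)
{
  /* Without the prototype above, these calls would implicitly declare
     name_length as a function returning int.  */
  return name_length (a) + name_length (b);
}

size_t
name_length (const char *name)
{
  return strlen (name);
}
```

Going by the testsuite changes listed in this patch, code that still relies
on the implicit declaration can be built with -fpermissive to get the
previous warning instead of an error.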

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-decl.cc (implicit_decl_pedpermerror): Rename from
implicit_decl_warning.  Call pedpermerror instead of
pedwarn and warning_at.
(implicitly_declare): Adjust callers.

gcc/testsuite/

* c-c++-common/spellcheck-reserved.c (test, test_2): Expect
error instead of warning.
(f): Expect error instead of warning.
* gcc.dg/Wimplicit-function-declaration-c99.c: Compile with
-fpermissive due to expected warning.
* gcc.dg/Wimplicit-function-declaration-c99-2.c: New test.
Copied from gcc.dg/Wimplicit-function-declaration-c99.c.
Expect error.
* gcc.dg/missing-header-fixit-1.c: Compile with -fpermissive
due to expected error.
* gcc.dg/missing-header-fixit-1a.c: New test.  Copied from
gcc.dg/missing-header-fixit-1.c, but expect error.
* gcc.dg/missing-header-fixit-2.c: Compile with -fpermissive
due to expected error.
* gcc.dg/missing-header-fixit-2a.c: New test.  Copied from
gcc.dg/missing-header-fixit-2.c, but expect error.
* gcc.dg/missing-header-fixit-4.c: Compile with -fpermissive
due to expected error.
* gcc.dg/missing-header-fixit-4a.c: New test.  Copied from
gcc.dg/missing-header-fixit-4.c, but expect error.
* gcc.dg/missing-header-fixit-5.c: Compile with -fpermissive
due to expected error.
* gcc.dg/missing-header-fixit-5a.c: New test.  Copied from
gcc.dg/missing-header-fixit-5.c, but expect error.
* gcc.dg/pr61852.c: Expect implicit-function-declaration
error instead of warning.
* gcc.dg/spellcheck-identifiers-2.c: Compile with
-fpermissive due to expected warnings.
* gcc.dg/spellcheck-identifiers-2a.c: New test.  Copied
from gcc.dg/spellcheck-identifiers-2.c.  Expect errors.
* gcc.dg/spellcheck-identifiers-3.c: Compile with
-fpermissive due to expected warnings.
* gcc.dg/spellcheck-identifiers-3a.c: New test.  Copied
from gcc.dg/spellcheck-identifiers-3.c.  Expect errors.
* gcc.dg/spellcheck-identifiers-4.c: Compile with
-fpermissive due to expected warnings.
* gcc.dg/spellcheck-identifiers-4a.c: New test.  Copied
from gcc.dg/spellcheck-identifiers-4.c.  Expect error.
* gcc.dg/spellcheck-identifiers.c: Compile with
-fpermissive due to expected warnings.
* gcc.dg/spellcheck-identifiers-1a.c: New test.  Copied
from gcc.dg/spellcheck-identifiers.c.  Expect errors.
* gcc.target/aarch64/sve/acle/general-c/ld1sh_gather_1.c (f1):
Expect error.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_index_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_index_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_3.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_4.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_5.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_3.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_4.c
(f1): Likewise.
---
 gcc/c/c-decl.cc   |  20 +--
 gcc/doc/invoke.texi   |   8 +-
 .../c-c++-common/spellcheck-reserved.c|   4 +-
 .../Wimplicit-function-declaration-c99-2.c|   7 +
 .../Wimplicit-function-declaration-c99.c  |   2 +-
 gcc/testsuite/gcc.dg/missing-header-fixit-1.c |   2 +-
 .../gcc.dg/missing-header-fixit-1a.c  |  37 +
 gcc/testsuite/gcc.dg/missing-header-fixit-2.c |   2 +-
 .../gcc.dg/missing-header-fixit-2a.c  |  31 
 gcc/testsuite/gcc.dg/missing-header-fixit-4.c |   2 +-
 .../gcc.dg/missing-header-fixit-4a.c  |  27 
 gcc/testsuite/gcc.dg/missing-header-fixit-5.c |   2 +-
 .../gcc.dg/missing-header-fixit-5a.c  |  42 ++
 gcc/testsuite/gcc.dg/pr61852.c|   4 +-
 .../gcc.dg/spellcheck-identifiers-1a.c| 136 ++
 .../gcc.dg/spel

[PATCH 4/6] c: Turn -Wimplicit-int into a pedpermerror

There is a missed opportunity here to issue spelling diagnostics
in prototype declarations (e.g., for “extern int foo (int32t);”).

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-decl.cc (warn_defaults_to): Call emit_diagnostic_valist
instead of reimplementing it. Issue a pedpermerror for C99
and later.
(store_parm_decls_oldstyle): Call pedpermerror for
OPT_Wimplicit_int.

gcc/testsuite/

* gcc.dg/Wimplicit-int-1.c: Compile with -fpermissive due to
expected warning.
* gcc.dg/Wimplicit-int-4.c: Likewise.
* gcc.dg/Wimplicit-int-1a.c: New test.  Copied from
gcc.dg/Wimplicit-int-1.c, but expect errors.
* gcc.dg/Wimplicit-int-4a.c: New test.  Copied from
gcc.dg/Wimplicit-int-4.c, but expect errors.
* gcc.dg/gnu23-attr-syntax-2.c: Compile with -fpermissive
due to expected implicit-int error.
* gcc.dg/gnu23-attr-syntax-3.c: New test.  Copied from
gcc.dg/gnu23-attr-syntax-2.c, but expect an error.
* gcc.dg/pr105635.c: Build with -fpermissive due to implicit
int.
* gcc.dg/pr105635-2.c: New test.  Copied from
gcc.dg/pr105635.c.  Expect implicit int error.
* gcc.dg/noncompile/pr79758.c: Build with -fpermissive due to
implicit int.
* gcc.dg/noncompile/pr79758-2.c: New test.  Copied from
gcc.dg/noncompile/pr79758.c.  Expect implicit int error.
---
 gcc/c/c-decl.cc | 20 +++-
 gcc/doc/invoke.texi |  7 +--
 gcc/testsuite/gcc.dg/Wimplicit-int-1.c  |  2 +-
 gcc/testsuite/gcc.dg/Wimplicit-int-1a.c | 11 +++
 gcc/testsuite/gcc.dg/Wimplicit-int-4.c  |  2 +-
 gcc/testsuite/gcc.dg/Wimplicit-int-4a.c | 11 +++
 gcc/testsuite/gcc.dg/gnu23-attr-syntax-2.c  |  2 +-
 gcc/testsuite/gcc.dg/gnu23-attr-syntax-3.c  | 17 +
 gcc/testsuite/gcc.dg/noncompile/pr79758-2.c |  6 ++
 gcc/testsuite/gcc.dg/noncompile/pr79758.c   |  1 +
 gcc/testsuite/gcc.dg/pr105635-2.c   | 11 +++
 gcc/testsuite/gcc.dg/pr105635.c |  2 +-
 12 files changed, 77 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/Wimplicit-int-1a.c
 create mode 100644 gcc/testsuite/gcc.dg/Wimplicit-int-4a.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-attr-syntax-3.c
 create mode 100644 gcc/testsuite/gcc.dg/noncompile/pr79758-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr105635-2.c
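
The diagnostic kind that the warn_defaults_to hunk of this patch selects
can be read as a small decision function.  The sketch below restates that
selection outside of GCC.  The enum and the function name are hypothetical
stand-ins, not GCC's diagnostic_t or any real GCC API; the three cases
correspond to DK_PERMERROR, DK_PEDWARN and DK_WARNING in the diff.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the three diagnostic kinds used in the
   patch; these are not GCC's own enumerators.  */
enum sketch_kind
{
  SKETCH_WARNING,
  SKETCH_PEDWARN,
  SKETCH_PERMERROR
};

/* Mirror the if/else chain from the warn_defaults_to hunk:
   C99 or later, no -pedantic-errors, and a nonzero option index
   select a permerror; C99 or later otherwise stays a pedwarn;
   everything else remains a plain warning.  */
enum sketch_kind
sketch_select_kind (bool isoc99, bool pedantic_errors, int opt)
{
  if (isoc99 && !pedantic_errors && opt)
    return SKETCH_PERMERROR;
  else if (isoc99)
    return SKETCH_PEDWARN;
  else
    return SKETCH_WARNING;
}
```

For example, sketch_select_kind (true, false, 1) yields SKETCH_PERMERROR,
while the same call with isoc99 set to false yields SKETCH_WARNING.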

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index e8387bfa984..f787d213cfe 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -6577,16 +6577,18 @@ warn_defaults_to (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (line_table, location);
+  diagnostic_t kind;
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
-   flag_isoc99 ? DK_PEDWARN : DK_WARNING);
-  diagnostic.option_index = opt;
-  diagnostic_report_diagnostic (global_dc, &diagnostic);
+  if (flag_isoc99 && !flag_pedantic_errors && opt)
+kind = DK_PERMERROR;
+  else if (flag_isoc99)
+kind = DK_PEDWARN;
+  else
+kind = DK_WARNING;
+  emit_diagnostic_valist (kind, location, opt, gmsgid, &ap);
   va_end (ap);
 }
-
 /* Returns the smallest location != UNKNOWN_LOCATION in LOCATIONS,
considering only those c_declspec_words found in LIST, which
must be terminated by cdw_number_of_elements.  */
@@ -10635,9 +10637,9 @@ store_parm_decls_oldstyle (tree fndecl, const struct c_arg_info *arg_info)
  warn_if_shadowing (decl);
 
  if (flag_isoc99)
-   pedwarn (DECL_SOURCE_LOCATION (decl),
-OPT_Wimplicit_int, "type of %qD defaults to %<int%>",
-decl);
+   pedpermerror (DECL_SOURCE_LOCATION (decl),
+ OPT_Wimplicit_int, "type of %qD defaults to %<int%>",
+ decl);
  else
warning_at (DECL_SOURCE_LOCATION (decl),
OPT_Wmissing_parameter_type,
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1a15af29f01..3a1b9b00f24 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6180,6 +6180,7 @@ that have their own flag:
 
 @gccoptlist{
 -Wimplicit-function-declaration @r{(C)}
+-Wimplicit-int @r{(C)}
 -Wint-conversion @r{(C)}
 -Wnarrowing @r{(C++)}
 }
@@ -6851,8 +6852,10 @@ This warning is enabled by @option{-Wall} in C++.
 @opindex Wno-implicit-int
 @item -Wno-implicit-int @r{(C and Objective-C only)}
 This option controls warnings when a declaration does not specify a type.
-This warning is enabled by default in C99 and later dialects of C,
-and also by @option{-Wall}.
+This warning is enabled by default, as an error, in C99 and later
+dialects of C, and also by @option{-Wall}.  The error can be downgraded
+to a warning using @option{-fpermissive} (a

[PATCH 5/6] c: Turn -Wreturn-mismatch into a pedpermerror

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-typeck.cc (c_finish_return): Issue a permerror
for mismatching pointers to builtins.  For mismatching
other pointers, issue a pedpermerror.

gcc/testsuite/

* gcc.dg/20030906-1.c: Compile with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/20030906-1a.c: New test.  Copied from
gcc.dg/20030906-1.c.  Expect the error.
* gcc.dg/20030906-2.c: Compile with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/20030906-2a.c: New test.  Copied from
gcc.dg/20030906-2.c.  Expect the error.
* gcc.dg/Wreturn-mismatch-1.c: Compile with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/Wreturn-mismatch-1a.c: New test.  Copied from
gcc.dg/Wreturn-mismatch-1.c.  Expect the error.
* gcc.dg/Wreturn-mismatch-2.c: Compile with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/Wreturn-mismatch-2a.c: New test.  Copied from
gcc.dg/Wreturn-mismatch-2.c.  Expect the error.
* gcc.dg/diagnostic-range-bad-return.c: Compile with
-fpermissive due to expected -Wreturn-mismatch error.
* gcc.dg/diagnostic-range-bad-return-2.c: New test.
Copied from gcc.dg/diagnostic-range-bad-return.c.  Expect the
error.
* gcc.dg/pr105635-2.c: Expect -Wreturn-mismatch error.
* gcc.dg/pr23075.c: Build with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/pr23075-2.c: New test.  Copied from gcc.dg/pr23075.c.
Expect the error.
* gcc.dg/pr29521.c: Compile with -fpermissive due to expected
-Wreturn-mismatch error.
* gcc.dg/pr29521-a.c: New test. Copied from gcc.dg/pr29521.c.
Expect error.
* gcc.dg/pr67730.c: Compile with -fpermissive due to expected
-Wreturn-mismatch error.
* gcc.dg/pr67730-a.c: New test.  Copied from
gcc.dg/pr67730.c.  Expect error.
* gcc.target/powerpc/conditional-return.c: Compile with
-fpermissive due to expected -Wreturn-mismatch error.
---
 gcc/c/c-typeck.cc |  6 ++-
 gcc/doc/invoke.texi   |  6 ++-
 gcc/testsuite/gcc.dg/20030906-1.c |  2 +-
 gcc/testsuite/gcc.dg/20030906-1a.c| 21 
 gcc/testsuite/gcc.dg/20030906-2.c |  2 +-
 gcc/testsuite/gcc.dg/20030906-2a.c| 21 
 gcc/testsuite/gcc.dg/Wreturn-mismatch-1.c |  2 +-
 gcc/testsuite/gcc.dg/Wreturn-mismatch-1a.c| 40 ++
 gcc/testsuite/gcc.dg/Wreturn-mismatch-2.c |  2 +-
 gcc/testsuite/gcc.dg/Wreturn-mismatch-2a.c| 41 +++
 .../gcc.dg/diagnostic-range-bad-return-2.c| 52 +++
 .../gcc.dg/diagnostic-range-bad-return.c  |  2 +-
 gcc/testsuite/gcc.dg/pr105635-2.c |  2 +-
 gcc/testsuite/gcc.dg/pr23075-2.c  | 14 +
 gcc/testsuite/gcc.dg/pr23075.c|  2 +-
 gcc/testsuite/gcc.dg/pr29521-a.c  | 15 ++
 gcc/testsuite/gcc.dg/pr29521.c|  2 +-
 gcc/testsuite/gcc.dg/pr67730-a.c  | 11 
 gcc/testsuite/gcc.dg/pr67730.c|  2 +-
 .../gcc.target/powerpc/conditional-return.c   |  2 +-
 20 files changed, 234 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/20030906-1a.c
 create mode 100644 gcc/testsuite/gcc.dg/20030906-2a.c
 create mode 100644 gcc/testsuite/gcc.dg/Wreturn-mismatch-1a.c
 create mode 100644 gcc/testsuite/gcc.dg/Wreturn-mismatch-2a.c
 create mode 100644 gcc/testsuite/gcc.dg/diagnostic-range-bad-return-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr23075-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr29521-a.c
 create mode 100644 gcc/testsuite/gcc.dg/pr67730-a.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 9d7bdbb4523..be376758b82 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -11192,7 +11192,9 @@ c_finish_return (location_t loc, tree retval, tree origtype)
  && valtype != NULL_TREE && TREE_CODE (valtype) != VOID_TYPE)
{
  no_warning = true;
- if (emit_diagnostic (flag_isoc99 ? DK_PEDWARN : DK_WARNING,
+ if (emit_diagnostic (flag_isoc99 && !flag_pedantic_errors
+  ? DK_PERMERROR
+  : flag_isoc99 ? DK_PEDWARN : DK_WARNING,
   loc, OPT_Wreturn_mismatch,
   "%<return%> with no value,"
   " in function returning non-void"))
@@ -11205,7 +11207,7 @@ c_finish_return (location_t loc, tree retval, tree origtype)
   current_function_returns_null = 1;
   bool warned_here;
   if (TREE_CODE (TREE_TYPE (retval)) != VOID_TYPE)
-   warned_here = pedwarn
+   warned_here = pedpermerror
  (xloc, OPT_Wreturn_mismatch,
   "% with a value, in

[PATCH 6/6] c: Turn -Wincompatible-pointer-types into a pedpermerror

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-typeck.cc (build_conditional_expr): Use pedpermerror-
equivalent for pointer type mismatches in conditional
expression.
(convert_for_assignment): Use pedpermerror and
pedpermerror_init for OPT_Wincompatible_pointer_types
warnings.

gcc/testsuite/

* gcc.dg/Wincompatible-pointer-types-2.c: Compile with
-fpermissive due to expected errors.
* gcc.dg/Wincompatible-pointer-types-4.c: Likewise.
* gcc.dg/Wincompatible-pointer-types-5.c: New test.  Copied
from gcc.dg/Wincompatible-pointer-types-2.c.  Expect errors.
* gcc.dg/Wincompatible-pointer-types-6.c: New test.  Copied
from gcc.dg/Wincompatible-pointer-types-4.c.  Expect errors.
* gcc.dg/anon-struct-11.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/anon-struct-11a.c: New test.  Copied from
gcc.dg/anon-struct-11.c.  Expect errors.
* gcc.dg/anon-struct-13.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/anon-struct-13a.c: New test.  Copied from
gcc.dg/anon-struct-13.c.  Expect errors.
* gcc.dg/builtin-arith-overflow-4.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/builtin-arith-overflow-4a.c: New test.  Copied from
gcc.dg/builtin-arith-overflow-4.c.  Expect errors.
* gcc.dg/c23-qual-4.c: Expect -Wincompatible-pointer-types errors.
* gcc.dg/dfp/composite-type.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/dfp/composite-type-2.c: New test.  Copied from
gcc.dg/dfp/composite-type.c.  Expect errors.
* gcc.dg/diag-aka-1.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/diag-aka-1a.c: New test.  Copied from gcc.dg/diag-aka-1.c.
Expect errors.
* gcc.dg/enum-compat-1.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/enum-compat-2.c: New test.  Copied from
gcc.dg/enum-compat-1.c.  Expect errors.
* gcc.dg/func-ptr-conv-1.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/func-ptr-conv-2.c: New test.  Copied from
gcc.dg/func-ptr-conv-1.c.  Expect errors.
* gcc.dg/init-bad-7.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/init-bad-7a.c: New test.  Copied from gcc.dg/init-bad-7.c.
Expect errors.
* gcc.dg/noncompile/incomplete-3.c (foo): Expect
-Wincompatible-pointer-types error.
* gcc.dg/param-type-mismatch-2.c (test8): Likewise.
* gcc.dg/pointer-array-atomic.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/pointer-array-atomic-2.c: New test.  Copied from
gcc.dg/pointer-array-atomic.c.  Expect errors.
* gcc.dg/pointer-array-quals-1.c (test): Expect
-Wincompatible-pointer-types errors.
* gcc.dg/transparent-union-1.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/transparent-union-1a.c: New test.  Copied from
gcc.dg/transparent-union-1.c.  Expect errors.
* gcc.target/aarch64/acle/memtag_2a.c
(test_memtag_warning_return_qualifier): Expect additional
errors.
* gcc.target/aarch64/sve/acle/general-c/load_2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_3.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_4.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_5.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_3.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_4.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/sizeless-1.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/sizeless-2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_1.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_index_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_index_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_2.c
(f1): Likewise.
* 
gcc.target/aarch64/sve/ac

Re: [PATCH 2/3] Add generated .opt.urls files

2023-11-13 Thread Marc Poulhiès



David Malcolm  writes:

> gcc/ada/ChangeLog:
>   * gcc-interface/lang.opt.urls: New file, autogenerated by
>   regenerate-opt-urls.py.


> diff --git a/gcc/ada/gcc-interface/lang.opt.urls b/gcc/ada/gcc-interface/lang.opt.urls
> new file mode 100644
> index ..e24210bcb12a
> --- /dev/null
> +++ b/gcc/ada/gcc-interface/lang.opt.urls
> @@ -0,0 +1,28 @@
> +; Autogenerated by regenerate-opt-urls.py from gcc/ada/gcc-interface/lang.opt and generated HTML
> +
> +I
> +UrlSuffix(gcc/Directory-Options.html#index-I)
> +
> +; skipping 'Wall' due to multiple URLs:
> +;   duplicate: 'gcc/Standard-Libraries.html#index-Wall-1'
> +;   duplicate: 'gcc/Warning-Options.html#index-Wall'
> +
> +nostdinc
> +UrlSuffix(gcc/Directory-Options.html#index-nostdinc)
> +
> +nostdlib
> +UrlSuffix(gcc/Link-Options.html#index-nostdlib)
> +
> +; skipping 'fshort-enums' due to multiple URLs:
> +;   duplicate: 'gcc/Code-Gen-Options.html#index-fshort-enums'
> +;   duplicate: 'gcc/Non-bugs.html#index-fshort-enums-3'
> +;   duplicate: 'gcc/Structures-unions-enumerations-and-bit-fields-implementation.html#index-fshort-enums-1'
> +
> +; skipping 'fsigned-char' due to multiple URLs:
> +;   duplicate: 'gcc/C-Dialect-Options.html#index-fsigned-char'
> +;   duplicate: 'gcc/Characters-implementation.html#index-fsigned-char-1'
> +
> +; skipping 'funsigned-char' due to multiple URLs:
> +;   duplicate: 'gcc/C-Dialect-Options.html#index-funsigned-char'
> +;   duplicate: 'gcc/Characters-implementation.html#index-funsigned-char-1'

Hello David,

This looks very nice, thanks!

I wonder why the Ada frontend only gets I, nostdinc and nostdlib
URLified to the common gcc doc.

Is it possible that your doc scraper doesn't match the option in the
Ada doc? We are documenting nostdlib, nostdinc and I, so I would also
expect a "multiple URLs" for these. We are generating the texinfo files
from sphinx, so maybe we could adjust the script to also match what the
sphinx generator produces?

Thanks,
Marc

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

On 11/13/23 11:36, juzhe.zh...@rivai.ai wrote:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb_run-2.c
> @@ -0,0 +1,19 @@
> +/* { dg-do run { target { riscv_v } } } */
> +/* { dg-additional-options "-march=rv64gcv_zbb --param riscv-autovec-preference=fixed-vlmax" } */
> 
> Could you add a compile test (with an assembly check) instead of a run test?

I found it a bit difficult to create a proper test; hopefully the attached
one is not too brittle.

My impression is that such tests would be easier to write if the pass
reported vsetvl statistics: how many vsetvls we merged or fused, and for
what reasons.  Maybe adding that would be a good learning exercise for
somebody who wants to get familiar with the pass?

Regards
 Robin

Subject: [PATCH v3] RISC-V: vsetvl: Refine REG_EQUAL equality.

This patch enhances the equality check for REG_EQUAL notes in the vsetvl
pass by using the == operator instead of rtx_equal_p.  With that, in
situations like the following, a5 and a7 are not considered equal
anymore.

(insn 62 60 63 4 (set (reg:DI 17 a7 [orig:154 loop_len_54 ] [154])
(umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
(reg:DI 30 t5 [219]))) 442 {umindi3}
 (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
(const_int 8 [0x8]))
(nil)))
(insn 63 62 65 4 (set (reg:DI 15 a5 [orig:175 _103 ] [175])
(minus:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
(reg:DI 17 a7 [orig:154 loop_len_54 ] [154]))) 11 {subdi3}
 (nil))
(insn 65 63 66 4 (set (reg:DI 16 a6 [orig:153 loop_len_53 ] [153])
(umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
(reg:DI 30 t5 [219]))) 442 {umindi3}
 (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
(const_int 8 [0x8]))
(nil)))
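
To make the problem concrete, here is a minimal sketch that models insns
62, 63 and 65 on plain integers.  It is only an illustration: the struct
and function names are hypothetical, and it assumes t5 holds 8, as the
REG_EQUAL notes suggest.  Both notes read umin (a5, 8), yet they can
describe different values because insn 63 overwrites a5 in between.

```c
#include <stdint.h>

/* Unsigned minimum, standing in for the umin:DI operation.  */
static uint64_t
umin_u64 (uint64_t a, uint64_t b)
{
  return a < b ? a : b;
}

/* Hypothetical container for the two results of interest.  */
struct loop_lens
{
  uint64_t a7; /* Result of insn 62.  */
  uint64_t a6; /* Result of insn 65.  */
};

/* Model the three insns from the dump on ordinary integers.  */
struct loop_lens
model_insns (uint64_t a5, uint64_t t5)
{
  struct loop_lens r;
  r.a7 = umin_u64 (a5, t5); /* insn 62: note says umin (a5, 8).  */
  a5 = a5 - r.a7;           /* insn 63: a5 is overwritten.  */
  r.a6 = umin_u64 (a5, t5); /* insn 65: note again says umin (a5, 8).  */
  return r;
}
```

With a5 = 12 and t5 = 8 the model gives a7 = 8 but a6 = 4, so two notes
that compare equal under rtx_equal_p do not have to denote the same value.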

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (source_equal_p): Use pointer
equality for REG_EQUAL.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c: New test.
---
 gcc/config/riscv/riscv-vsetvl.cc  | 12 +-
 .../rvv/autovec/partial/multiple_rgroup_zbb.c | 23 +++
 2 files changed, 34 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3fa25a6404d..63f966f2f3a 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -561,7 +561,17 @@ source_equal_p (insn_info *insn1, insn_info *insn2)
   rtx note1 = find_reg_equal_equiv_note (rinsn1);
   rtx note2 = find_reg_equal_equiv_note (rinsn2);
   if (note1 && note2 && rtx_equal_p (note1, note2))
-return true;
+{
+  /* REG_EQUIVs are invariant at function scope.  */
+  if (REG_NOTE_KIND (note2) == REG_EQUIV)
+   return true;
+
+  /* REG_EQUAL are not so in order to consider them similar the RTX they
+point to must be identical.  We could also allow "rtx_equal"
+REG_EQUALs but would need to check if no insn between them modifies
+any of their sources.  */
+  return note1 == note2;
+}
   return false;
 }
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c
new file mode 100644
index 000..15178a2c848
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zbb -mabi=lp64d -O2 --param riscv-autovec-preference=fixed-vlmax -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include 
+
+void __attribute__ ((noipa))
+test (uint16_t *__restrict f, uint32_t *__restrict d, uint64_t *__restrict e,
+  uint16_t x, uint16_t x2, uint16_t x3, uint16_t x4, uint32_t y,
+  uint32_t y2, uint64_t z, int n)
+{
+  for (int i = 0; i < n; ++i)
+{
+  f[i * 4 + 0] = x;
+  f[i * 4 + 1] = x2;
+  f[i * 4 + 2] = x3;
+  f[i * 4 + 3] = x4;
+  d[i * 2 + 0] = y;
+  d[i * 2 + 1] = y2;
+  e[i] = z;
+}
+}
+
+/* { dg-final { scan-assembler-times "vsetvli\tzero,\s*\[a-z0-9\]+,\s*e16,\s*m1,\s*ta,\s*ma" 4 } } */
-- 
2.41.0

Re: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread 钟居哲

I just checked your test.  It won't be brittle in the future, since there
should be 4 vsetvls with e16m1 for the SLP AVL/VL toggling, and scheduling
is disabled.  The middle-end MIN_EXPR SLP always produces 4 AVL/VL toggles
as long as we don't schedule the instructions.

So it won't be a problem.

So, LGTM.



juzhe.zh...@rivai.ai
 

[PATCH v2 0/3] RISC-V: Support CORE-V XCVELW and XCVBI extensions

v1 -> v2:
  * Bring the MEM into the operand for cv.elw. The new predicate is
move_operand.
  * Add comment to riscv.md detailing why corev.md must appear before
the generic riscv instructions.

This patch series implements the XCVelw and XCVbi extensions for CORE-V.

Tested with riscv-gnu-toolchain on the binutils, ld, gas and gcc testsuites
to ensure correctness and compatibility with the existing codebase.
Your input, reviews, and suggestions are welcome and will help make these
extensions more robust.

The CORE-V builtins are described in the specification [1] and work can be
found in the OpenHW group's Github repository [2].

[1] github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

[2] github.com/openhwgroup/corev-gcc

Contributors:
  Mary Bennett 
  Nandni Jamnadas 
  Pietra Ferreira 
  Charlie Keaney
  Jessica Mills
  Craig Blackmore 
  Simon Cook 
  Jeremy Bennett 
  Helene Chelin 

RISC-V: Update XCValu constraints to match other vendors
RISC-V: Add support for XCVelw extension in CV32E40P
RISC-V: Add support for XCVbi extension in CV32E40P

 gcc/common/config/riscv/riscv-common.cc   |  4 ++
 gcc/config/riscv/constraints.md   | 21 +---
 gcc/config/riscv/corev.def|  3 ++
 gcc/config/riscv/corev.md | 33 -
 gcc/config/riscv/predicates.md|  4 ++
 gcc/config/riscv/riscv-builtins.cc|  2 +
 gcc/config/riscv/riscv-ftypes.def |  1 +
 gcc/config/riscv/riscv.md | 11 -
 gcc/config/riscv/riscv.opt|  4 ++
 gcc/doc/extend.texi   |  8 
 gcc/doc/sourcebuild.texi  |  6 +++
 .../gcc.target/riscv/cv-bi-beqimm-compile-1.c | 17 +++
 .../gcc.target/riscv/cv-bi-beqimm-compile-2.c | 48 +++
 .../gcc.target/riscv/cv-bi-bneimm-compile-1.c | 17 +++
 .../gcc.target/riscv/cv-bi-bneimm-compile-2.c | 48 +++
 .../gcc.target/riscv/cv-elw-elw-compile-1.c   | 11 +
 gcc/testsuite/lib/target-supports.exp | 26 ++
 17 files changed, 254 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-beqimm-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-beqimm-compile-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-bneimm-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-bneimm-compile-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-elw-elw-compile-1.c

-- 
2.34.1

[PATCH v2 1/3] RISC-V: Add support for XCVelw extension in CV32E40P

Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett 
  Nandni Jamnadas 
  Pietra Ferreira 
  Charlie Keaney
  Jessica Mills
  Craig Blackmore 
  Simon Cook 
  Jeremy Bennett 
  Helene Chelin 

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add XCVelw.
* config/riscv/corev.def: Likewise.
* config/riscv/corev.md: Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
* config/riscv/riscv-ftypes.def: Likewise.
* config/riscv/riscv.opt: Likewise.
* doc/extend.texi: Add XCVelw builtin documentation.
* doc/sourcebuild.texi: Likewise.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-elw-elw-compile-1.c: Create test for cv.elw.
* lib/target-supports.exp: Add proc for the XCVelw extension.
---
 gcc/common/config/riscv/riscv-common.cc   |  2 ++
 gcc/config/riscv/corev.def|  3 +++
 gcc/config/riscv/corev.md | 15 +++
 gcc/config/riscv/riscv-builtins.cc|  2 ++
 gcc/config/riscv/riscv-ftypes.def |  1 +
 gcc/config/riscv/riscv.opt|  2 ++
 gcc/doc/extend.texi   |  8 
 gcc/doc/sourcebuild.texi  |  3 +++
 .../gcc.target/riscv/cv-elw-elw-compile-1.c   | 11 +++
 gcc/testsuite/lib/target-supports.exp | 13 +
 10 files changed, 60 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-elw-elw-compile-1.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 526dbb7603b..6a1978bd0e4 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -312,6 +312,7 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
 
   {"xcvmac", ISA_SPEC_CLASS_NONE, 1, 0},
   {"xcvalu", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"xcvelw", ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"xtheadba", ISA_SPEC_CLASS_NONE, 1, 0},
   {"xtheadbb", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1667,6 +1668,7 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
 
   {"xcvmac",&gcc_options::x_riscv_xcv_subext, MASK_XCVMAC},
   {"xcvalu",&gcc_options::x_riscv_xcv_subext, MASK_XCVALU},
+  {"xcvelw",&gcc_options::x_riscv_xcv_subext, MASK_XCVELW},
 
   {"xtheadba",  &gcc_options::x_riscv_xthead_subext, MASK_XTHEADBA},
   {"xtheadbb",  &gcc_options::x_riscv_xthead_subext, MASK_XTHEADBB},
diff --git a/gcc/config/riscv/corev.def b/gcc/config/riscv/corev.def
index 17580df3c41..3b9ec029d06 100644
--- a/gcc/config/riscv/corev.def
+++ b/gcc/config/riscv/corev.def
@@ -41,3 +41,6 @@ RISCV_BUILTIN (cv_alu_subN, "cv_alu_subN", 
RISCV_BUILTIN_DIRECT, RISCV_SI_FT
 RISCV_BUILTIN (cv_alu_subuN,"cv_alu_subuN", RISCV_BUILTIN_DIRECT, 
RISCV_USI_FTYPE_USI_USI_UQI,  cvalu),
 RISCV_BUILTIN (cv_alu_subRN,"cv_alu_subRN", RISCV_BUILTIN_DIRECT, 
RISCV_SI_FTYPE_SI_SI_UQI, cvalu),
 RISCV_BUILTIN (cv_alu_subuRN,   "cv_alu_subuRN",RISCV_BUILTIN_DIRECT, 
RISCV_USI_FTYPE_USI_USI_UQI,  cvalu),
+
+// XCVELW
+RISCV_BUILTIN (cv_elw_elw_si, "cv_elw_elw", RISCV_BUILTIN_DIRECT, 
RISCV_USI_FTYPE_VOID_PTR, cvelw),
diff --git a/gcc/config/riscv/corev.md b/gcc/config/riscv/corev.md
index 1350bd4b81e..c7a2ba07bcc 100644
--- a/gcc/config/riscv/corev.md
+++ b/gcc/config/riscv/corev.md
@@ -24,6 +24,9 @@
   UNSPEC_CV_ALU_CLIPR
   UNSPEC_CV_ALU_CLIPU
   UNSPEC_CV_ALU_CLIPUR
+
+  ;;CORE-V EVENT LOAD
+  UNSPECV_CV_ELW
 ])
 
 ;; XCVMAC extension.
@@ -691,3 +694,15 @@
   cv.suburnr\t%0,%2,%3"
   [(set_attr "type" "arith")
   (set_attr "mode" "SI")])
+
+;; XCVELW builtins
+(define_insn "riscv_cv_elw_elw_si"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec_volatile [(match_operand:SI 1 "move_operand" "p")]
+ UNSPECV_CV_ELW))]
+
+  "TARGET_XCVELW && !TARGET_64BIT"
+  "cv.elw\t%0,%a1"
+
+  [(set_attr "type" "load")
+  (set_attr "mode" "SI")])
diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index fc3976f3ba1..5ee11ebe3bc 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -128,6 +128,7 @@ AVAIL (hint_pause, (!0))
 // CORE-V AVAIL
 AVAIL (cvmac, TARGET_XCVMAC && !TARGET_64BIT)
 AVAIL (cvalu, TARGET_XCVALU && !TARGET_64BIT)
+AVAIL (cvelw, TARGET_XCVELW && !TARGET_64BIT)
 
 /* Construct a riscv_builtin_description from the given arguments.
 
@@ -168,6 +169,7 @@ AVAIL (cvalu, TARGET_XCVALU && !TARGET_64BIT)
 #define RISCV_ATYPE_HI intHI_type_node
 #define RISCV_ATYPE_SI intSI_type_node
 #define RISCV_ATYPE_VOID_PTR ptr_type_node
+#define RISCV_ATYPE_INT_PTR integer_ptr_type_node
 
 /* RISCV_FTYPE_ATYPESN takes N RISCV_FTYPES-like type codes and lists
their associated RISCV_ATYPEs.  */
diff --git a/gcc/config/riscv/riscv-ftypes.def 
b/gcc/config/riscv/riscv-ftypes.def
index 0d1e4dd061e..3e7d5c69503 100644
--- a/

[PATCH v2 2/3] RISC-V: Update XCValu constraints to match other vendors

gcc/ChangeLog:
* config/riscv/constraints.md: CVP2 -> CV_alu_pow2.
* config/riscv/corev.md: Likewise.
---
 gcc/config/riscv/constraints.md | 15 ---
 gcc/config/riscv/corev.md   |  4 ++--
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 68be4515c04..2711efe68c5 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -151,13 +151,6 @@
 (define_register_constraint "zmvr" "(TARGET_ZFA || TARGET_XTHEADFMV) ? GR_REGS 
: NO_REGS"
   "An integer register for  ZFA or XTheadFmv.")
 
-;; CORE-V Constraints
-(define_constraint "CVP2"
-  "Checking for CORE-V ALU clip if ival plus 1 is a power of 2"
-  (and (match_code "const_int")
-   (and (match_test "IN_RANGE (ival, 0, 1073741823)")
-(match_test "exact_log2 (ival + 1) != -1"))))
-
 ;; Vector constraints.
 
 (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
@@ -246,3 +239,11 @@
A MEM with a valid address for th.[l|s]*ur* instructions."
   (and (match_code "mem")
(match_test "th_memidx_legitimate_index_p (op, true)")))
+
+;; CORE-V Constraints
+(define_constraint "CV_alu_pow2"
+  "@internal
+   Checking for CORE-V ALU clip if ival plus 1 is a power of 2"
+  (and (match_code "const_int")
+   (and (match_test "IN_RANGE (ival, 0, 1073741823)")
+(match_test "exact_log2 (ival + 1) != -1"))))
diff --git a/gcc/config/riscv/corev.md b/gcc/config/riscv/corev.md
index c7a2ba07bcc..92bf0b5d6a6 100644
--- a/gcc/config/riscv/corev.md
+++ b/gcc/config/riscv/corev.md
@@ -516,7 +516,7 @@
 (define_insn "riscv_cv_alu_clip"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
(unspec:SI [(match_operand:SI 1 "register_operand" "r,r")
-   (match_operand:SI 2 "immediate_register_operand" "CVP2,r")]
+   (match_operand:SI 2 "immediate_register_operand" 
"CV_alu_pow2,r")]
 UNSPEC_CV_ALU_CLIP))]
 
   "TARGET_XCVALU && !TARGET_64BIT"
@@ -529,7 +529,7 @@
 (define_insn "riscv_cv_alu_clipu"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
(unspec:SI [(match_operand:SI 1 "register_operand" "r,r")
-   (match_operand:SI 2 "immediate_register_operand" "CVP2,r")]
+   (match_operand:SI 2 "immediate_register_operand" 
"CV_alu_pow2,r")]
 UNSPEC_CV_ALU_CLIPU))]
 
   "TARGET_XCVALU && !TARGET_64BIT"
-- 
2.34.1
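
For readers unfamiliar with the condition the CV_alu_pow2 constraint encodes, here is a minimal sketch in plain C.  It is not GCC code: cv_alu_pow2_p is a hypothetical name, and the bit trick stands in for exact_log2, on the assumption that exact_log2 returns -1 for any value that is not a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the constraint's two tests on ival:
   IN_RANGE (ival, 0, 1073741823) and exact_log2 (ival + 1) != -1.  */
bool
cv_alu_pow2_p (int64_t ival)
{
  if (ival < 0 || ival > 1073741823)
    return false;

  uint64_t v = (uint64_t) ival + 1;
  /* v is at least 1 here; a power of two has exactly one bit set.  */
  return (v & (v - 1)) == 0;
}

/* A few values that should and should not match.  */
void
cv_alu_pow2_examples (void)
{
  assert (cv_alu_pow2_p (0));           /* 0 + 1 == 2^0 */
  assert (cv_alu_pow2_p (7));           /* 7 + 1 == 2^3 */
  assert (cv_alu_pow2_p (1073741823));  /* 2^30 - 1, the upper bound */
  assert (!cv_alu_pow2_p (8));          /* 9 is not a power of two */
  assert (!cv_alu_pow2_p (-1));         /* below the range */
}
```

The upper bound 1073741823 is 2^30 - 1, so the largest accepted immediate is itself of the form 2^n - 1.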

[PATCH v2 3/3] RISC-V: Add support for XCVbi extension in CV32E40P

Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett 
  Nandni Jamnadas 
  Pietra Ferreira 
  Charlie Keaney
  Jessica Mills
  Craig Blackmore 
  Simon Cook 
  Jeremy Bennett 
  Helene Chelin 

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Create XCVbi extension
  support.
* config/riscv/riscv.opt: Likewise.
* config/riscv/corev.md: Implement cv_branch pattern
  for cv.beqimm and cv.bneimm.
* config/riscv/riscv.md: Change pattern priority so corev.md
  patterns run before riscv.md patterns.
* config/riscv/constraints.md: Implement constraint
  CV_bi_sign5 - signed 5-bit immediate.
* config/riscv/predicates.md: Implement predicate
  const_int5s_operand - signed 5-bit immediate.
* doc/sourcebuild.texi: Add XCVbi documentation.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-bi-beqimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-beqimm-compile-2.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-2.c: New test.
* lib/target-supports.exp: Add proc for XCVbi.
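
As an illustration of the "signed 5-bit immediate" mentioned above, here is a rough sketch in plain C of the range test the constraint performs, IN_RANGE (ival, -16, 15), plus a two's-complement round trip through a 5-bit field.  The function names are hypothetical, and the encode/decode helpers only illustrate the arithmetic; they make no claim about where cv.beqimm and cv.bneimm place the field in the instruction word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the constraint test IN_RANGE (ival, -16, 15).  */
bool
cv_bi_sign5_p (int64_t ival)
{
  return ival >= -16 && ival <= 15;
}

/* Keep the low five bits of a value already known to be in range.  */
uint32_t
encode_imm5 (int32_t ival)
{
  return (uint32_t) ival & 0x1f;
}

/* Sign-extend a 5-bit field back to a signed value.  */
int32_t
decode_imm5 (uint32_t field)
{
  int32_t v = (int32_t) (field & 0x1f);
  return (v ^ 0x10) - 0x10;
}

void
cv_bi_sign5_examples (void)
{
  assert (cv_bi_sign5_p (-16) && cv_bi_sign5_p (15));
  assert (!cv_bi_sign5_p (-17) && !cv_bi_sign5_p (16));
  /* Every in-range value survives the round trip.  */
  for (int32_t i = -16; i <= 15; i++)
    assert (decode_imm5 (encode_imm5 (i)) == i);
}
```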
---
 gcc/common/config/riscv/riscv-common.cc   |  2 +
 gcc/config/riscv/constraints.md   |  6 +++
 gcc/config/riscv/corev.md | 14 ++
 gcc/config/riscv/predicates.md|  4 ++
 gcc/config/riscv/riscv.md | 11 -
 gcc/config/riscv/riscv.opt|  2 +
 gcc/doc/sourcebuild.texi  |  3 ++
 .../gcc.target/riscv/cv-bi-beqimm-compile-1.c | 17 +++
 .../gcc.target/riscv/cv-bi-beqimm-compile-2.c | 48 +++
 .../gcc.target/riscv/cv-bi-bneimm-compile-1.c | 17 +++
 .../gcc.target/riscv/cv-bi-bneimm-compile-2.c | 48 +++
 gcc/testsuite/lib/target-supports.exp | 13 +
 12 files changed, 184 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-beqimm-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-beqimm-compile-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-bneimm-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-bneimm-compile-2.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 6a1978bd0e4..04631e007f0 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -313,6 +313,7 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"xcvmac", ISA_SPEC_CLASS_NONE, 1, 0},
   {"xcvalu", ISA_SPEC_CLASS_NONE, 1, 0},
   {"xcvelw", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"xcvbi", ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"xtheadba", ISA_SPEC_CLASS_NONE, 1, 0},
   {"xtheadbb", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1669,6 +1670,7 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   {"xcvmac",&gcc_options::x_riscv_xcv_subext, MASK_XCVMAC},
   {"xcvalu",&gcc_options::x_riscv_xcv_subext, MASK_XCVALU},
   {"xcvelw",&gcc_options::x_riscv_xcv_subext, MASK_XCVELW},
+  {"xcvbi", &gcc_options::x_riscv_xcv_subext, MASK_XCVBI},
 
   {"xtheadba",  &gcc_options::x_riscv_xthead_subext, MASK_XTHEADBA},
   {"xtheadbb",  &gcc_options::x_riscv_xthead_subext, MASK_XTHEADBB},
diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 2711efe68c5..718b4bd77df 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -247,3 +247,9 @@
   (and (match_code "const_int")
(and (match_test "IN_RANGE (ival, 0, 1073741823)")
 (match_test "exact_log2 (ival + 1) != -1"
+
+(define_constraint "CV_bi_sign5"
+  "@internal
+   A 5-bit signed immediate for CORE-V Immediate Branch."
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (ival, -16, 15)")))
diff --git a/gcc/config/riscv/corev.md b/gcc/config/riscv/corev.md
index 92bf0b5d6a6..f6a1f916d7e 100644
--- a/gcc/config/riscv/corev.md
+++ b/gcc/config/riscv/corev.md
@@ -706,3 +706,17 @@
 
   [(set_attr "type" "load")
   (set_attr "mode" "SI")])
+
+;; XCVBI Builtins
+(define_insn "cv_branch"
+  [(set (pc)
+   (if_then_else
+(match_operator 1 "equality_operator"
+[(match_operand:X 2 "register_operand" "r")
+ (match_operand:X 3 "const_int5s_operand" 
"CV_bi_sign5")])
+(label_ref (match_operand 0 "" ""))
+(pc)))]
+  "TARGET_XCVBI"
+  "cv.b%C1imm\t%2,%3,%0"
+  [(set_attr "type" "branch")
+   (set_attr "mode" "none")])
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index a37d035fa61..69a6319c2c8 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -400,6 +400,10 @@
   (ior (match_operand 0 "register_operand")
(match_code "const_int")))
 
+(define_predicate "const_int5s_operand"
+  (and (match_code

[PATCH] tree-optimization/112495 - alias versioning and address spaces

Dependence analysis does not correctly handle differing address spaces
when generating runtime alias checks, so refuse to generate such checks.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  I'm double-checking
hundreds of ACATS FAILs (segfaults), will push if those are latent.

Richard.

PR tree-optimization/112495
* tree-data-ref.cc (runtime_alias_check_p): Reject checks
between different address spaces.

* gcc.target/i386/pr112495.c: New testcase.
---
 gcc/testsuite/gcc.target/i386/pr112495.c | 12 
 gcc/tree-data-ref.cc |  7 +++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr112495.c

diff --git a/gcc/testsuite/gcc.target/i386/pr112495.c 
b/gcc/testsuite/gcc.target/i386/pr112495.c
new file mode 100644
index 000..21afbaa6945
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr112495.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+typedef struct { int v; } T1;
+typedef struct { T1 v[32]; } T2;
+
+T1 s;
+T1 f1() { return s; }
+
+void f2(__seg_gs T2 *p, int n) {
+  for (int i = 0; i < n; ++i) p->v[i] = f1();
+}
diff --git a/gcc/tree-data-ref.cc b/gcc/tree-data-ref.cc
index 689aaeed722..3549485d251 100644
--- a/gcc/tree-data-ref.cc
+++ b/gcc/tree-data-ref.cc
@@ -1640,6 +1640,13 @@ runtime_alias_check_p (ddr_p ddr, class loop *loop, bool 
speed_p)
   "runtime alias check not supported for"
   " outer loop.\n");
 
+  /* FORNOW: We don't support handling different address spaces.  */
+  if (TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (DR_BASE_ADDRESS (DDR_A (ddr)))))
+      != TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (DR_BASE_ADDRESS (DDR_B (ddr))))))
+return opt_result::failure_at (DR_STMT (DDR_A (ddr)),
+  "runtime alias check between different "
+  "address spaces not supported.\n");
+
   return opt_result::success ();
 }
 
-- 
2.35.3
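
To illustrate why a runtime alias check between different address spaces is refused rather than attempted, here is a toy model in plain C.  It does not use GCC's data structures: mem_range, check_ranges and the numeric address-space ids are all hypothetical, and the sketch assumes base + len does not wrap around.

```c
#include <assert.h>
#include <stdint.h>

/* Toy description of a memory access: which address space it lives
   in, where it starts, and how many bytes it covers.  */
typedef struct
{
  unsigned addr_space;   /* 0 = generic; other values are made up */
  uint64_t base;
  uint64_t len;
} mem_range;

enum alias_check { CHECK_UNSUPPORTED, NO_OVERLAP, MAY_OVERLAP };

/* Comparing raw addresses only makes sense inside one address space,
   so give up early when the spaces differ.  */
enum alias_check
check_ranges (const mem_range *a, const mem_range *b)
{
  if (a->addr_space != b->addr_space)
    return CHECK_UNSUPPORTED;
  if (a->base + a->len <= b->base || b->base + b->len <= a->base)
    return NO_OVERLAP;
  return MAY_OVERLAP;
}

void
check_ranges_examples (void)
{
  mem_range generic = { 0, 0x1000, 128 };
  mem_range seg_gs = { 1, 0x1000, 128 };    /* same numbers, other space */
  mem_range disjoint = { 0, 0x2000, 128 };

  assert (check_ranges (&generic, &seg_gs) == CHECK_UNSUPPORTED);
  assert (check_ranges (&generic, &disjoint) == NO_OVERLAP);
  assert (check_ranges (&generic, &generic) == MAY_OVERLAP);
}
```

In the testcase above, p points into __seg_gs while s lives in the generic address space, which is the kind of pair the toy model classifies as CHECK_UNSUPPORTED.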

[PATCH] middle-end/112487 - inline and parameter mismatch

When passing an aggregate to an implicitly declared function that's
later declared as receiving a register type, we can run into a
sanity assert that cannot hold for such gross mismatches.  Instead
of asserting, avoid emitting a debug temp that's invalid.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR middle-end/112487
* tree-inline.cc (setup_one_parameter): When the parameter
is unused only insert a debug bind when there's not a gross
mismatch in value and declared parameter type.  Do not assert
there effectively isn't.

* gcc.dg/torture/pr112487.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr112487.c | 18 ++
 gcc/tree-inline.cc  |  6 +-
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr112487.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr112487.c 
b/gcc/testsuite/gcc.dg/torture/pr112487.c
new file mode 100644
index 000..bc2838ee3eb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr112487.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-w -std=gnu89" } */
+
+struct A { char i; };
+struct B {
+  struct C *p;
+  struct A *q;
+};
+struct C { struct B a[1]; };
+struct T { struct U *ptr; };
+
+volatile struct T v;
+void f1(volatile struct T v) { f2(v); }
+void f2(volatile struct T *const v) { }
+void bar() {
+  struct U *ptr;
+  f1(v);
+}
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 00d05102e7a..0b14118b94b 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -3562,7 +3562,11 @@ setup_one_parameter (copy_body_data *id, tree p, tree 
value, tree fn,
  it.  */
   if (optimize && gimple_in_ssa_p (cfun) && !def && is_gimple_reg (p))
 {
-  gcc_assert (!value || !TREE_SIDE_EFFECTS (value));
+  /* When there's a gross type mismatch between the passed value
+and the declared argument type drop it on the floor and do
+not bother to insert a debug bind.  */
+  if (value && !is_gimple_reg_type (TREE_TYPE (value)))
+   return NULL;
   return insert_init_debug_bind (id, bb, var, rhs, NULL);
 }
 
-- 
2.35.3

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Jeff Law





On 11/13/23 01:15, juzhe.zh...@rivai.ai wrote:

> I know the root cause is:
>
> (reg:DI 15 a5 [orig:175 _103 ] [175])
>
> (reg:DI 15 a5 [orig:174 _100 ] [174])
>
> is considered as true on rtx_equal_p.
>
> I think return note1 == note2; will simplify your codes and fix this issue.

NOTEs are not shared (look at how they get chained and it's obvious
they're not shared).  So you can't use pointer equality, you must use
rtx_equal_p.


More generally, nodes are not shared unless explicitly documented as 
such in the internals manual, "Sharing" section.


Jeff
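
The distinction between pointer equality and structural equality can be sketched with a toy expression tree in plain C.  This is not GCC's rtx representation: the expr type and expr_equal_p are hypothetical stand-ins that only show why two separately allocated, identical-looking notes compare unequal as pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum code { REG, CONST_INT, UMIN };

/* Toy expression node: a code, a payload (register number or
   constant), and up to two operands.  */
typedef struct expr
{
  enum code code;
  long value;
  const struct expr *op0, *op1;
} expr;

/* Structural comparison: same shape and same payloads.  */
bool
expr_equal_p (const expr *a, const expr *b)
{
  if (a == b)
    return true;
  if (a == NULL || b == NULL)
    return false;
  return a->code == b->code
	 && a->value == b->value
	 && expr_equal_p (a->op0, b->op0)
	 && expr_equal_p (a->op1, b->op1);
}

void
expr_equal_examples (void)
{
  /* Two independent copies of "umin (a5, 8)".  */
  expr a5_1 = { REG, 15, NULL, NULL }, c8_1 = { CONST_INT, 8, NULL, NULL };
  expr a5_2 = { REG, 15, NULL, NULL }, c8_2 = { CONST_INT, 8, NULL, NULL };
  expr note1 = { UMIN, 0, &a5_1, &c8_1 };
  expr note2 = { UMIN, 0, &a5_2, &c8_2 };

  assert (&note1 != &note2);               /* unshared nodes */
  assert (expr_equal_p (&note1, &note2));  /* yet structurally equal */
}
```

Structural equality alone still cannot tell whether a5 was overwritten between the two notes, which is the case the original patch handles by checking the instructions in between.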

Re: [PATCH 2/3] Add generated .opt.urls files

2023-11-13 Thread David Malcolm

On Mon, 2023-11-13 at 14:11 +0100, Marc Poulhiès wrote:
> 
> David Malcolm  writes:
> 
> > gcc/ada/ChangeLog:
> > * gcc-interface/lang.opt.urls: New file, autogenerated by
> > regenerate-opt-urls.py.
> 
> 
> > diff --git a/gcc/ada/gcc-interface/lang.opt.urls b/gcc/ada/gcc-
> > interface/lang.opt.urls
> > new file mode 100644
> > index ..e24210bcb12a
> > --- /dev/null
> > +++ b/gcc/ada/gcc-interface/lang.opt.urls
> > @@ -0,0 +1,28 @@
> > +; Autogenerated by regenerate-opt-urls.py from gcc/ada/gcc-
> > interface/lang.opt and generated HTML
> > +
> > +I
> > +UrlSuffix(gcc/Directory-Options.html#index-I)
> > +
> > +; skipping 'Wall' due to multiple URLs:
> > +;   duplicate: 'gcc/Standard-Libraries.html#index-Wall-1'
> > +;   duplicate: 'gcc/Warning-Options.html#index-Wall'
> > +
> > +nostdinc
> > +UrlSuffix(gcc/Directory-Options.html#index-nostdinc)
> > +
> > +nostdlib
> > +UrlSuffix(gcc/Link-Options.html#index-nostdlib)
> > +
> > +; skipping 'fshort-enums' due to multiple URLs:
> > +;   duplicate: 'gcc/Code-Gen-Options.html#index-fshort-enums'
> > +;   duplicate: 'gcc/Non-bugs.html#index-fshort-enums-3'
> > +;   duplicate: 'gcc/Structures-unions-enumerations-and-bit-fields-
> > implementation.html#index-fshort-enums-1'
> > +
> > +; skipping 'fsigned-char' due to multiple URLs:
> > +;   duplicate: 'gcc/C-Dialect-Options.html#index-fsigned-char'
> > +;   duplicate: 'gcc/Characters-implementation.html#index-fsigned-
> > char-1'
> > +
> > +; skipping 'funsigned-char' due to multiple URLs:
> > +;   duplicate: 'gcc/C-Dialect-Options.html#index-funsigned-char'
> > +;   duplicate: 'gcc/Characters-implementation.html#index-
> > funsigned-char-1'
> 
> Hello David,
> 
> This looks very nice, thanks!
> 
> I wonder why the Ada frontend only gets I, nostdinc and nostdlib
> URLified to the common gcc doc.
> 
> Is it possible that your doc scraper doesn't match the option in the
> Ada doc? We are documenting nostdlib, nostdinc and I, so I would also
> expect a "multiple URLs" for these.

The new regenerate-opt-urls.py script only parsed
  buildir/gcc/HTML/gcc-14.0.0/gcc/Option-Index.html
looking for anchors for options via a regex.

Looking at my build, I don't see any generated Ada HTML docs, so maybe
I messed this up?  Does the generated HTML from the generated Ada
texinfo go somewhere else?  (and, in particular, does it have its own
index?)

Perhaps this script could also deal directly with Sphinx-generated
HTML?

>  We are generating the texinfo files
> from sphinx, so maybe we could adjust the script to also match what
> the
> sphinx generator produces?

It *might* be as simple as pointing it at the option index for the
generated HTML for Ada.

Though as Iain's email points out, there may be some issues with per-
language URLs for options that my approach doesn't quite handle yet.

Dave

Re: [PATCH 2/3] Add generated .opt.urls files

2023-11-13 Thread David Malcolm

On Sun, 2023-11-12 at 11:56 +0100, Iain Buclaw wrote:
> Excerpts from David Malcolm's message of November 10, 2023 10:42 pm:
> > gcc/d/ChangeLog:
> > * lang.opt.urls: New file, autogenerated by
> > regenerate-opt-urls.py.
> > ---
> >  gcc/d/lang.opt.urls  |   95 +
> >  create mode 100644 gcc/d/lang.opt.urls
> > 
> 
> [abridged view of patch]
> 
> > diff --git a/gcc/d/lang.opt.urls b/gcc/d/lang.opt.urls
> > new file mode 100644
> > index ..57c14ecc459a
> > --- /dev/null
> > +++ b/gcc/d/lang.opt.urls
> > @@ -0,0 +1,95 @@
> > +; Autogenerated by regenerate-opt-urls.py from gcc/d/lang.opt and
> > generated HTML
> > +
> > +H
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-H)
> > +
> > +I
> > +UrlSuffix(gcc/Directory-Options.html#index-I)
> > +
> > +M
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-M)
> > +
> > +MD
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-MD)
> > +
> > +MF
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-MF)
> > +
> > +MG
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-MG)
> > +
> > +MM
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-MM)
> > +
> > +MMD
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-MMD)
> > +
> > +MP
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-MP)
> > +
> > +MT
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-MT)
> > +
> > +MQ
> > +UrlSuffix(gcc/Preprocessor-Options.html#index-MQ)
> > +
> > +Waddress
> > +UrlSuffix(gcc/Warning-Options.html#index-Waddress)
> > +
> > +; skipping 'Wall' due to multiple URLs:
> > +;   duplicate: 'gcc/Standard-Libraries.html#index-Wall-1'
> > +;   duplicate: 'gcc/Warning-Options.html#index-Wall'
> > +
> > +Walloca
> > +UrlSuffix(gcc/Warning-Options.html#index-Walloca)
> > +
> > +Walloca-larger-than=
> > +UrlSuffix(gcc/Warning-Options.html#index-Walloca-larger-than_003d)
> > +
> > +Wbuiltin-declaration-mismatch
> > +UrlSuffix(gcc/Warning-Options.html#index-Wbuiltin-declaration-
> > mismatch)
> > +
> > +Wdeprecated
> > +UrlSuffix(gcc/Warning-Options.html#index-Wdeprecated)
> > +
> > +Werror
> > +UrlSuffix(gcc/Warning-Options.html#index-Werror)
> > +
> > +Wextra
> > +UrlSuffix(gcc/Warning-Options.html#index-Wextra)
> > +
> > +Wunknown-pragmas
> > +UrlSuffix(gcc/Warning-Options.html#index-Wno-unknown-pragmas)
> > +
> > +Wvarargs
> > +UrlSuffix(gcc/Warning-Options.html#index-Wno-varargs)
> > +
> > +; skipping 'fbuiltin' due to multiple URLs:
> > +;   duplicate: 'gcc/C-Dialect-Options.html#index-fbuiltin'
> > +;   duplicate: 'gcc/Other-Builtins.html#index-fno-builtin-3'
> > +;   duplicate: 'gcc/Warning-Options.html#index-fno-builtin-1'
> > +
> > +fexceptions
> > +UrlSuffix(gcc/Code-Gen-Options.html#index-fexceptions)
> > +
> > +frtti
> > +UrlSuffix(gcc/C_002b_002b-Dialect-Options.html#index-fno-rtti)
> > +
> > +imultilib
> > +UrlSuffix(gcc/Directory-Options.html#index-imultilib)
> > +
> > +iprefix
> > +UrlSuffix(gcc/Directory-Options.html#index-iprefix)
> > +
> > +isysroot
> > +UrlSuffix(gcc/Directory-Options.html#index-isysroot)
> > +
> > +isystem
> > +UrlSuffix(gcc/Directory-Options.html#index-isystem)
> > +
> > +nostdinc
> > +UrlSuffix(gcc/Directory-Options.html#index-nostdinc)
> > +
> > +v
> > +UrlSuffix(gcc/Overall-Options.html#index-v)
> > +
> > -- 
> > 2.26.3
> > 
> > 
> 
> So I see this focuses on only adding URLs for common options, or
> options
> that relate to C/C++ family, but may be handled by other front-ends
> too?

The regenerate-opt-urls.py script only parsed
  buildir/gcc/HTML/gcc-14.0.0/gcc/Option-Index.html
looking for anchors for options via a regex.

Looking at the docs I now see that various frontends have their own
Option-Index.html, e.g.:

  gdc/Option-Index.html

so probably regenerate-opt-urls.py ought to be parsing those also, and
marking the generated .opt.urls as being lang-specific.

That way we could (somehow) generate a options-urls.cc that has logic
(perhaps with langmasks) for giving out different URLs for different
frontends.
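
A very rough sketch of what such a per-language lookup might look like, in plain C.  Everything here is hypothetical: the LANG_* bits, the opt_url table layout and lookup_url_suffix are invented for illustration and are not what regenerate-opt-urls.py or GCC currently produce.  The two URL suffixes are the ones discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical language mask bits.  */
enum { LANG_CXX = 1 << 0, LANG_D = 1 << 1 };

/* One row per (option, language set) pair.  */
struct opt_url
{
  const char *opt;
  unsigned lang_mask;
  const char *url_suffix;
};

static const struct opt_url opt_urls[] = {
  { "frtti", LANG_D, "gdc/Runtime-Options.html#index-frtti" },
  { "frtti", LANG_CXX, "gcc/C_002b_002b-Dialect-Options.html#index-fno-rtti" },
};

/* Return the URL suffix for OPT as seen by frontend LANG, or NULL.  */
const char *
lookup_url_suffix (const char *opt, unsigned lang)
{
  for (size_t i = 0; i < sizeof opt_urls / sizeof opt_urls[0]; i++)
    if ((opt_urls[i].lang_mask & lang) != 0
	&& strcmp (opt_urls[i].opt, opt) == 0)
      return opt_urls[i].url_suffix;
  return NULL;
}
```

With a table like this, the same option name resolves to the gdc documentation when queried with LANG_D and to the C++ dialect options page when queried with LANG_CXX.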


> 
> To pick out one, you have:
> 
>     frtti
>     UrlSuffix(gcc/C_002b_002b-Dialect-Options.html#index-fno-rtti)
> 
> It looks like it could alternatively be
> 
>     frtti
>     UrlSuffix(gdc/Runtime-Options.html#index-frtti)
> 
> Or are other front-ends having URLs to their language-specific
> documentation pages not supported for the same reason as why they
> can't
> add self-documentation to their own options if another front-end
> (typically C/C++) also makes claim to the option?

Implementation-wise that would need fixing in the way I'm handling the
UrlSuffix directives; perhaps with a LangUrlSuffix directive.

> 
>     frtti
>     D 
>     ; Documented in C
> 
> 
> I'm OK with the D parts regardless of this observation.

Thanks
Dave

[committed 00/22] arm: testsuite: clean up some architecture-specific tests

A lot of the arm-specific compiler tests require a specific CPU or
architecture to be specified.  This causes problems if the test suite
run is set up to test a specific architecture or CPU that differs from
the test's requirements.  An example I use commonly is

set target_list { "arm-qemu{,-mthumb}" }

but it is possible to also test other architectures or CPUs this way, for
example,

set target_list { "arm-qemu{,-mthumb,
  -march=armv6t2+fp/-mfloat-abi=hard,
  -march=armv8-a+simd/-mthumb/-mfloat-abi=hard,
  -mcpu=cortex-m33/-mfloat-abi=softfp,
  -mcpu=cortex-m55/-mfloat-abi=hard,
  -mcpu=cortex-m23}" }

[line breaks inserted for readability]

tests 7 permutations of
 - base configuration
 - base configuration with -mthumb
 - armv6t2 with FP and a hard-float ABI
 - armv8-a with Neon and thumb and the hard-float ABI
 - cortex-m33 with the softfp ABI
 - cortex-m55 with the hard-float ABI
 - cortex-m23

Over time we have developed a series of checks that can be used to
ensure that we test what we want to test and don't test if the options
conflict, but these have been applied somewhat haphazardly and, as the
framework has been improved, tests haven't been updated to make full
use of the checks.

This patch series deploys the framework dg- directives more widely
across the arm-specific tests to make testing more consistent.  On
that long list of permutations above this results in the following
changes:

16 tests move from FAIL to PASS.
21 new FAILs.
562 new tests that PASS.
74 tests that passed have been removed.

The new FAILs are real issues on targets that only support
single-precision FP and should be investigated at some point, but
probably aren't urgent given the use cases for cores with this issue.

The tests that have been removed come from the fact that we now more
accurately test that option combinations won't cause problems; they
are related to the fact that if the testrun config specifies -mcpu,
but the test sets -march, then we can get an architecture conflict.
I have some ideas about how to address this, but that's for a later
test series.

committed to master branch.

R.

Richard Earnshaw (22):
  arm: testsuite: correctly detect armv6t2 hardware for acle execution
tests
  arm: testsuite: correctly detect hard_float
  arm: testsuite: avoid hard-float ABI incompatibility with -march
  arm: testsuite: avoid problems with -mfpu=auto in pacbti-m-predef-11.c
  arm: testsuite: avoid problems with -mfpu=auto in attr-crypto.c
  arm: testsuite: avoid problems with -mfpu=auto in attr_thumb-static2.c
  arm: testsuite: tidy up pre-run check for g2.c
  arm: testsuite: improve compatibility of arm/lto/pr96939_1.c
  arm: testsuite: tidy up pr65647-2.c pre-checks.
  arm: testsuite: improve compatibility of arm/pr78353-*.c
  arm: testsuite: improve compatibility of pr88648-asm-syntax-unified.c
  arm: testsuite: improve compatibility of pragma_arch_attribute*.c
  arm: testsuite: improve compatibility of pragma_arch_switch_2.c
  arm: testsuite: modernize framework usage for arm/scd42-2.c
  arm: testsuite: improve compatibility of ftest-armv7m-thumb.c
  arm: testsuite: improve compatibility of gcc.target/arm/macro_defs*.c
  arm: testsuite: improve compatibility of
gcc.target/arm/optional_thumb-*.c
  arm: testsuite: improve compatibility of gcc.target/arm/pr19599.c
  arm: testsuite: improve compatibility of gcc.target/arm/pr59575.c
  testsuite: arm: tighten up mode-specific ISA tests
  arm: testsuite: fix some more architecture tests
  arm: testsuite: improve compatibility of gcc.dg/debug/pr57351.c

 gcc/testsuite/gcc.dg/debug/pr57351.c  |  7 +-
 .../arm/acle/data-intrinsics-armv6.c  |  2 +-
 .../arm/acle/data-intrinsics-rbit.c   |  2 +-
 .../gcc.target/arm/acle/pacbti-m-predef-11.c  |  2 +-
 gcc/testsuite/gcc.target/arm/attr-crypto.c|  2 +-
 .../gcc.target/arm/attr_thumb-static2.c   |  2 +-
 .../gcc.target/arm/ftest-armv7m-thumb.c   |  3 +-
 gcc/testsuite/gcc.target/arm/g2.c | 10 +-
 gcc/testsuite/gcc.target/arm/lto/pr96939_1.c  |  2 +-
 gcc/testsuite/gcc.target/arm/macro_defs0.c|  7 +-
 gcc/testsuite/gcc.target/arm/macro_defs1.c|  6 +-
 gcc/testsuite/gcc.target/arm/macro_defs2.c|  6 +-
 .../gcc.target/arm/optional_thumb-1.c |  2 +-
 .../gcc.target/arm/optional_thumb-3.c |  4 +-
 gcc/testsuite/gcc.target/arm/pr19599.c|  2 +-
 gcc/testsuite/gcc.target/arm/pr59575.c|  4 +-
 gcc/testsuite/gcc.target/arm/pr60650-2.c  |  4 +-
 gcc/testsuite/gcc.target/arm/pr60657.c|  4 +-
 gcc/testsuite/gcc.target/arm/pr60663.c|  4 +-
 gcc/testsuite/gcc.target/arm/pr65647-2.c  |  3 +-
 gcc/testsuite/gcc.target/arm/pr78353-1.c  |  3 +-
 gcc/testsuite/gcc.target/arm/pr78353-2.c  |  3 +-
 gcc/testsuite/gcc.target/arm/pr81863.c|  4 +-
 .../arm/pr88648-asm-syntax-unified.c  |  2 +-
 gcc/testsuite/gcc.target/arm/pr97969.c|  4 +-
 gcc/te

[committed 01/22] arm: testsuite: correctly detect armv6t2 hardware for acle execution tests


Some of the ACLE tests for Arm are executable, but we were only testing
that the compiler could generate code for them, not that the hardware
was capable of executing them.  Fix this by adding an execution test for
suitable hardware.

gcc/testsuite:

* lib/target-supports.exp (check_effective_target_arm_arch_v6t2_hw_ok):
New function.
* gcc.target/arm/acle/data-intrinsics-armv6.c: Use it.
* gcc.target/arm/acle/data-intrinsics-rbit.c: Likewise.
---
 .../arm/acle/data-intrinsics-armv6.c   |  2 +-
 .../gcc.target/arm/acle/data-intrinsics-rbit.c |  2 +-
 gcc/testsuite/lib/target-supports.exp  | 18 ++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-armv6.c b/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-armv6.c
index 988ecac3787..6dc8c55e2f9 100644
--- a/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-armv6.c
+++ b/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-armv6.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-require-effective-target arm_arch_v6t2_ok } */
+/* { dg-require-effective-target arm_arch_v6t2_hw_ok } */
 /* { dg-add-options arm_arch_v6t2 } */
 
 #include "arm_acle.h"
diff --git a/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-rbit.c b/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-rbit.c
index d1fe274b5ce..b01c4219a7e 100644
--- a/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-rbit.c
+++ b/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-rbit.c
@@ -1,6 +1,6 @@
 /* Test the ACLE data intrinsics existence for specific instruction.  */
 /* { dg-do run } */
-/* { dg-require-effective-target arm_arch_v6t2_ok } */
+/* { dg-require-effective-target arm_arch_v6t2_hw_ok } */
 /* { dg-additional-options "--save-temps -O1" } */
 /* { dg-add-options arm_arch_v6t2 } */
 /* { dg-final { check-function-bodies "**" "" "" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 1a7bea96c1e..d414cddf4dc 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5590,6 +5590,24 @@ proc check_effective_target_arm_thumb1_cbz_ok {} {
 }
 }
 
+# Return 1 if this is an Arm target which supports the Armv6t2 extensions.
+# This can be either in Arm state or in Thumb state.
+
+proc check_effective_target_arm_arch_v6t2_hw_ok {} {
+if [check_effective_target_arm_thumb1_ok] {
+	return [check_no_compiler_messages arm_movt object {
+	int
+	main (void)
+	{
+	  asm ("bfc r0, #1, #2");
+	  return 0;
+	}
+	} [add_options_for_arm_arch_v6t2 ""]]
+} else {
+	return 0
+}
+}
+
 # Return 1 if this is an ARM target where ARMv8-M Security Extensions is
 # available.

[committed 02/22] arm: testsuite: correctly detect hard_float


Add an arm-specific test to check_effective_target_hard_float for
Arm to handle cases where we only have single-precision FP in hardware.
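
For context, ACLE describes __ARM_FP as a bitmask of the floating-point formats supported in hardware, which is why a plain non-zero test also covers single-precision-only cores.  The sketch below, in plain C, decodes such a value; the helper names are hypothetical and the mask is passed in as an argument so the code builds on any host.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions as described by ACLE for __ARM_FP.  */
enum
{
  ARM_FP_HALF = 1 << 1,     /* 16-bit */
  ARM_FP_SINGLE = 1 << 2,   /* 32-bit */
  ARM_FP_DOUBLE = 1 << 3    /* 64-bit */
};

/* Same idea as the '#if __ARM_FP == 0' test: any bit set means there
   is some hardware floating-point support.  */
bool
has_hard_float (unsigned arm_fp)
{
  return arm_fp != 0;
}

bool
has_double_hw (unsigned arm_fp)
{
  return (arm_fp & ARM_FP_DOUBLE) != 0;
}

void
arm_fp_examples (void)
{
  unsigned single_only = ARM_FP_SINGLE;
  unsigned all_formats = ARM_FP_HALF | ARM_FP_SINGLE | ARM_FP_DOUBLE;

  assert (has_hard_float (single_only) && !has_double_hw (single_only));
  assert (has_hard_float (all_formats) && has_double_hw (all_formats));
  assert (!has_hard_float (0));   /* soft-float only */
}
```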

gcc/testsuite:

* lib/target-supports.exp (check_effective_target_hard_float): Add
arm-specific test.
---
 gcc/testsuite/lib/target-supports.exp | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index d414cddf4dc..ee173b9fb6b 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1420,6 +1420,17 @@ proc check_effective_target_mpaired_single { args } {
 # Return true if the target has access to FPU instructions.
 
 proc check_effective_target_hard_float { } {
+# This should work on cores that only have single-precision,
+# and should also correctly handle legacy cores that had thumb1 and
+# lacked FP support for that, but had it in Arm state.
+if { [istarget arm*-*-*] } {
+	return [check_no_compiler_messages hard_float assembly {
+		#if __ARM_FP == 0
+		#error __arm_soft_float
+		#endif
+	}]
+}
+
 if { [istarget loongarch*-*-*] } {
 	return [check_no_compiler_messages hard_float assembly {
 		#if (defined __loongarch_soft_float)

[committed 07/22] arm: testsuite: tidy up pre-run check for g2.c


gcc.target/arm/g2.c is an xscale-only test, but the test is quite old
and we have improved the infrastructure for setting up such tests now.
So make use of that to reduce the number of cases where this test fails
to run.

gcc/testsuite:

* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Add entry to check for xscale.
* gcc.target/arm/g2.c: Use it.
---
 gcc/testsuite/gcc.target/arm/g2.c | 10 --
 gcc/testsuite/lib/target-supports.exp |  1 +
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/g2.c b/gcc/testsuite/gcc.target/arm/g2.c
index ca5e3ccff66..04334c97713 100644
--- a/gcc/testsuite/gcc.target/arm/g2.c
+++ b/gcc/testsuite/gcc.target/arm/g2.c
@@ -1,11 +1,9 @@
 /* Verify that hardware multiply is preferred on XScale. */
 /* { dg-do compile } */
-/* { dg-options "-mcpu=xscale -O2 -marm" } */
-/* { dg-skip-if "Test is specific to the Xscale" { arm*-*-* } { "-march=*" } { "-march=xscale" } } */
-/* { dg-skip-if "Test is specific to the Xscale" { arm*-*-* } { "-mcpu=*" } { "-mcpu=xscale" } } */
-/* { dg-skip-if "Test is specific to ARM mode" { arm*-*-* } { "-mthumb" } { "" } } */
-/* { dg-require-effective-target arm_arch_v5te_arm_ok } */
-/* { dg-require-effective-target arm32 } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_arch_xscale_arm_ok } */
+/* { dg-add-options arm_arch_xscale_arm } */
+
 
 /* Brett Gaines' test case. */
 unsigned BCPL(unsigned) __attribute__ ((naked));
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 7d83bd8740f..9d2958626ad 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5411,6 +5411,7 @@ foreach { armfunc armflag armdefs } {
 	v5te "-march=armv5te+fp -mfloat-abi=softfp" __ARM_ARCH_5TE__
 	v5te_arm "-march=armv5te+fp -marm" __ARM_ARCH_5TE__
 	v5te_thumb "-march=armv5te+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_5TE__
+	xscale_arm "-mcpu=xscale -mfloat-abi=soft -marm" __XSCALE__
 	v6 "-march=armv6+fp -mfloat-abi=softfp" __ARM_ARCH_6__
 	v6_arm "-march=armv6+fp -marm" __ARM_ARCH_6__
 	v6_thumb "-march=armv6+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6__

[committed 10/22] arm: testsuite: improve compatibility of arm/pr78353-*.c


Again, use the infrastructure available to improve the compatibility
of these tests.

gcc/testsuite:

* gcc.target/arm/pr78353-1.c: Use dg-add-options to manage target
flags.
* gcc.target/arm/pr78353-2.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/pr78353-1.c | 3 ++-
 gcc/testsuite/gcc.target/arm/pr78353-2.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr78353-1.c b/gcc/testsuite/gcc.target/arm/pr78353-1.c
index a107e300269..56480774ce4 100644
--- a/gcc/testsuite/gcc.target/arm/pr78353-1.c
+++ b/gcc/testsuite/gcc.target/arm/pr78353-1.c
@@ -1,6 +1,7 @@
 /* { dg-do link }  */
 /* { dg-require-effective-target arm_arch_v7a_multilib } */
-/* { dg-options "-march=armv7-a -mthumb -O2 -flto -Wa,-mimplicit-it=always" }  */
+/* { dg-options "-mthumb -O2 -flto -Wa,-mimplicit-it=always" }  */
+/* { dg-add-options arm_arch_v7a } */
 
 int main(int x)
 {
diff --git a/gcc/testsuite/gcc.target/arm/pr78353-2.c b/gcc/testsuite/gcc.target/arm/pr78353-2.c
index 2589e6135aa..c070d7275bc 100644
--- a/gcc/testsuite/gcc.target/arm/pr78353-2.c
+++ b/gcc/testsuite/gcc.target/arm/pr78353-2.c
@@ -1,6 +1,7 @@
 /* { dg-do link }  */
 /* { dg-require-effective-target arm_arch_v7a_multilib } */
-/* { dg-options "-march=armv7-a -mthumb -O2 -flto -Wa,-mimplicit-it=always,-mthumb" }  */
+/* { dg-options "-mthumb -O2 -flto -Wa,-mimplicit-it=always,-mthumb" }  */
+/* { dg-add-options arm_arch_v7a } */
 
 int main(int x)
 {

[committed 04/22] arm: testsuite: avoid problems with -mfpu=auto in pacbti-m-predef-11.c


This test overrides the architecture, but fails to describe which
floating-point features are needed.  This causes problems if the ABI
requires FP for parameter passing and -mfpu=auto is selected, so ensure
that the required FP feature is specified.

gcc/testsuite:

* gcc.target/arm/acle/pacbti-m-predef-11.c: Add +fp to the -march
specification.
---
 gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
index 9f2711097ac..6a5ae92c567 100644
--- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
+++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" "-mcpu=*" "-mfloat-abi=*" } } */
-/* { dg-options "-march=armv8.1-m.main+pacbti" } */
+/* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
 
 #if (__ARM_FEATURE_BTI != 1)
 #error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be defined to 1."

[committed 03/22] arm: testsuite: avoid hard-float ABI incompatibility with -march


A number of tests in the gcc testsuite, especially for arm-specific
targets, add various flags to control the architecture.  These run
into problems when the compiler is configured with -mfpu=auto if the
new architecture lacks an architectural feature that implies we have
floating-point instructions.

The testsuite makes this worse as it falls foul of this requirement in
the base architecture strings provided by target-supports.exp.

To fix this we add "+fp", or something equivalent to this, to all the
base architecture specifications.  The feature will be ignored if the
float ABI is set to soft.

gcc/testsuite:

* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Add base FPU specifications to all architectures that can support
one.
---
 gcc/testsuite/lib/target-supports.exp | 50 +--
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ee173b9fb6b..7d83bd8740f 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5408,36 +5408,36 @@ foreach { armfunc armflag armdefs } {
 	v5t "-march=armv5t -mfloat-abi=softfp" __ARM_ARCH_5T__
 	v5t_arm "-march=armv5t -marm" __ARM_ARCH_5T__
 	v5t_thumb "-march=armv5t -mthumb -mfloat-abi=softfp" __ARM_ARCH_5T__
-	v5te "-march=armv5te -mfloat-abi=softfp" __ARM_ARCH_5TE__
-	v5te_arm "-march=armv5te -marm" __ARM_ARCH_5TE__
-	v5te_thumb "-march=armv5te -mthumb -mfloat-abi=softfp" __ARM_ARCH_5TE__
-	v6 "-march=armv6 -mfloat-abi=softfp" __ARM_ARCH_6__
-	v6_arm "-march=armv6 -marm" __ARM_ARCH_6__
-	v6_thumb "-march=armv6 -mthumb -mfloat-abi=softfp" __ARM_ARCH_6__
-	v6k "-march=armv6k -mfloat-abi=softfp" __ARM_ARCH_6K__
-	v6k_arm "-march=armv6k -marm" __ARM_ARCH_6K__
-	v6k_thumb "-march=armv6k -mthumb -mfloat-abi=softfp" __ARM_ARCH_6K__
-	v6t2 "-march=armv6t2" __ARM_ARCH_6T2__
-	v6z "-march=armv6z -mfloat-abi=softfp" __ARM_ARCH_6Z__
-	v6z_arm "-march=armv6z -marm" __ARM_ARCH_6Z__
-	v6z_thumb "-march=armv6z -mthumb -mfloat-abi=softfp" __ARM_ARCH_6Z__
+	v5te "-march=armv5te+fp -mfloat-abi=softfp" __ARM_ARCH_5TE__
+	v5te_arm "-march=armv5te+fp -marm" __ARM_ARCH_5TE__
+	v5te_thumb "-march=armv5te+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_5TE__
+	v6 "-march=armv6+fp -mfloat-abi=softfp" __ARM_ARCH_6__
+	v6_arm "-march=armv6+fp -marm" __ARM_ARCH_6__
+	v6_thumb "-march=armv6+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6__
+	v6k "-march=armv6k+fp -mfloat-abi=softfp" __ARM_ARCH_6K__
+	v6k_arm "-march=armv6k+fp -marm" __ARM_ARCH_6K__
+	v6k_thumb "-march=armv6k+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6K__
+	v6t2 "-march=armv6t2+fp" __ARM_ARCH_6T2__
+	v6z "-march=armv6z+fp -mfloat-abi=softfp" __ARM_ARCH_6Z__
+	v6z_arm "-march=armv6z+fp -marm" __ARM_ARCH_6Z__
+	v6z_thumb "-march=armv6z+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6Z__
 	v6m "-march=armv6-m -mthumb -mfloat-abi=soft" __ARM_ARCH_6M__
-	v7a "-march=armv7-a" __ARM_ARCH_7A__
-	v7r "-march=armv7-r" __ARM_ARCH_7R__
+	v7a "-march=armv7-a+fp" __ARM_ARCH_7A__
+	v7r "-march=armv7-r+fp" __ARM_ARCH_7R__
 	v7m "-march=armv7-m -mthumb" __ARM_ARCH_7M__
-	v7em "-march=armv7e-m -mthumb" __ARM_ARCH_7EM__
-	v7ve "-march=armv7ve -marm"
+	v7em "-march=armv7e-m+fp -mthumb" __ARM_ARCH_7EM__
+	v7ve "-march=armv7ve+fp -marm"
 		"__ARM_ARCH_7A__ && __ARM_FEATURE_IDIV"
-	v8a "-march=armv8-a" __ARM_ARCH_8A__
-	v8a_hard "-march=armv8-a -mfpu=neon-fp-armv8 -mfloat-abi=hard" __ARM_ARCH_8A__
-	v8_1a "-march=armv8.1-a" __ARM_ARCH_8A__
-	v8_2a "-march=armv8.2-a" __ARM_ARCH_8A__
-	v8r "-march=armv8-r" __ARM_ARCH_8R__
+	v8a "-march=armv8-a+simd" __ARM_ARCH_8A__
+	v8a_hard "-march=armv8-a+simd -mfpu=auto -mfloat-abi=hard" __ARM_ARCH_8A__
+	v8_1a "-march=armv8.1-a+simd" __ARM_ARCH_8A__
+	v8_2a "-march=armv8.2-a+simd" __ARM_ARCH_8A__
+	v8r "-march=armv8-r+fp.sp" __ARM_ARCH_8R__
 	v8m_base "-march=armv8-m.base -mthumb -mfloat-abi=soft"
 		__ARM_ARCH_8M_BASE__
-	v8m_main "-march=armv8-m.main -mthumb" __ARM_ARCH_8M_MAIN__
-	v8_1m_main "-march=armv8.1-m.main -mthumb" __ARM_ARCH_8M_MAIN__
-	v9a "-march=armv9-a" __ARM_ARCH_9A__ } {
+	v8m_main "-march=armv8-m.main+fp -mthumb" __ARM_ARCH_8M_MAIN__
+	v8_1m_main "-march=armv8.1-m.main+fp -mthumb" __ARM_ARCH_8M_MAIN__
+	v9a "-march=armv9-a+simd" __ARM_ARCH_9A__ } {
 eval [string map [list FUNC $armfunc FLAG $armflag DEFS $armdefs ] {
 	proc check_effective_target_arm_arch_FUNC_ok { } {
 	return [check_no_compiler_messages arm_arch_FUNC_ok assembly {

[committed 06/22] arm: testsuite: avoid problems with -mfpu=auto in attr_thumb-static2.c


This test overrides the architecture, but fails to describe which
floating-point features are needed.  This causes problems if the ABI
requires FP for parameter passing and -mfpu=auto is selected, so ensure
that the required FP feature is specified.

gcc/testsuite:

* gcc.target/arm/attr_thumb-static2.c: Add +fp to the -march
specification.
---
 gcc/testsuite/gcc.target/arm/attr_thumb-static2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/attr_thumb-static2.c b/gcc/testsuite/gcc.target/arm/attr_thumb-static2.c
index 77454343b23..a38f9a95607 100644
--- a/gcc/testsuite/gcc.target/arm/attr_thumb-static2.c
+++ b/gcc/testsuite/gcc.target/arm/attr_thumb-static2.c
@@ -2,7 +2,7 @@
 
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v7a_ok } */
-/* { dg-options "-O0 -march=armv7-a" } */
+/* { dg-options "-O0 -march=armv7-a+fp" } */
 
 struct _NSPoint
 {

[committed 11/22] arm: testsuite: improve compatibility of pr88648-asm-syntax-unified.c


Fix another test that was trying to set the architecture directly
rather than using the infrastructure as intended.
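
For illustration only, a hypothetical test header using the
infrastructure in the intended order.  The GCC testsuite documentation
requires dg-add-options to come after any dg-options directive, which
is why the two directives swap places in this patch.  The function
body is a made-up placeholder.

```c
/* Hypothetical test, not taken from the testsuite.  */
/* { dg-do compile } */
/* { dg-require-effective-target arm_arch_v7a_ok } */
/* { dg-options "-marm -masm-syntax-unified" } */
/* { dg-add-options arm_arch_v7a } */

/* Placeholder body; the directives above are the point of the sketch.  */
int add_one (int x)
{
  return x + 1;
}
```

With this layout the test never names -march itself, so the flags come
from the arm_arch_v7a entry in target-supports.exp.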

gcc/testsuite:

* gcc.target/arm/pr88648-asm-syntax-unified.c: It isn't necessary
to try to override the architecture flags specified by arm_arch_v7a.
---
 gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c b/gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c
index 251b4d5bc9d..53d0bb053fc 100644
--- a/gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c
+++ b/gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c
@@ -1,8 +1,8 @@
 /* Test for unified syntax assembly generation.  */
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-options "-marm -masm-syntax-unified" } */
 /* { dg-add-options arm_arch_v7a } */
-/* { dg-options "-marm -march=armv7-a -masm-syntax-unified" } */
 
 void test ()
 {

[committed 17/22] arm: testsuite: improve compatibility of gcc.target/arm/optional_thumb-*.c


These tests deliberately pass invalid option combinations to check
that the compiler is generating the correct diagnostic.  Nevertheless,
we can improve their compatibility with other testsuite options.  For
optional_thumb-1.c we use a soft-float ABI, while for
optional_thumb-3.c we use arm_arch_v7em as the target architecture,
then set the architecture manually.

gcc/testsuite:

* gcc.target/arm/optional_thumb-1.c: Force a soft-float ABI.
* gcc.target/arm/optional_thumb-3.c: Check for armv7e-m compatibility,
then set the architecture explicitly.
---
 gcc/testsuite/gcc.target/arm/optional_thumb-1.c | 2 +-
 gcc/testsuite/gcc.target/arm/optional_thumb-3.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
index 99cb0c3f33b..90d9ada6ade 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { ! default_mode } } } */
 /* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
-/* { dg-options "-march=armv6-m" } */
+/* { dg-options "-march=armv6-m -mfloat-abi=soft" } */
 
 /* Check that -mthumb is not needed when compiling for a Thumb-only target.  */
 
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
index d9150e09e47..a6c661ac031 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_cortex_m } */
+/* { dg-require-effective-target arm_arch_v7em_ok } */
 /* { dg-skip-if "-mthumb given" { *-*-* } { "-mthumb" } } */
-/* { dg-options "-marm" } */
+/* { dg-options "-march=armv7e-m+fp -marm" } */
 /* { dg-error "target CPU does not support ARM mode" "missing error with -marm on Thumb-only targets" { target *-*-* } 0 } */
 
 /* Check that -marm gives an error when compiling for a Thumb-only target.  */

[committed 05/22] arm: testsuite: avoid problems with -mfpu=auto in attr-crypto.c


This test overrides the architecture, but fails to describe which
floating-point features are needed.  This causes problems if the ABI
requires FP for parameter passing and -mfpu=auto is selected, so ensure
that the required FP and SIMD features are specified.

gcc/testsuite:

* gcc.target/arm/attr-crypto.c: Add +simd to the -march
specification.
---
 gcc/testsuite/gcc.target/arm/attr-crypto.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/attr-crypto.c b/gcc/testsuite/gcc.target/arm/attr-crypto.c
index 05e458f36b6..3959d0b67e7 100644
--- a/gcc/testsuite/gcc.target/arm/attr-crypto.c
+++ b/gcc/testsuite/gcc.target/arm/attr-crypto.c
@@ -3,7 +3,7 @@
pragma.  */
 /* { dg-skip-if "-mpure-code supports M-profile only" { *-*-* } { "-mpure-code" } } */
 /* { dg-require-effective-target arm_fp_ok } */
-/* { dg-options "-O2 -march=armv8-a" } */
+/* { dg-options "-O2 -march=armv8-a+simd" } */
 /* { dg-add-options arm_fp } */
 
 /* Reset fpu to a value compatible with the next pragmas.  */

[committed 08/22] arm: testsuite: improve compatibility of arm/lto/pr96939_1.c


This test overrides the architecture, but fails to specify the
floating point architecture.  This causes problems if -mfpu=auto is
used.

gcc/testsuite:

* gcc.target/arm/lto/pr96939_1.c: Add +simd to the architecture
specification.
---
 gcc/testsuite/gcc.target/arm/lto/pr96939_1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/lto/pr96939_1.c b/gcc/testsuite/gcc.target/arm/lto/pr96939_1.c
index 53c6093e803..4afdbdaf5ad 100644
--- a/gcc/testsuite/gcc.target/arm/lto/pr96939_1.c
+++ b/gcc/testsuite/gcc.target/arm/lto/pr96939_1.c
@@ -1,5 +1,5 @@
 /* PR target/96939 */
-/* { dg-options "-march=armv8-a+crc" } */
+/* { dg-options "-march=armv8-a+simd+crc" } */
 
 #include

[committed 12/22] arm: testsuite: improve compatibility of pragma_arch_attribute*.c


These tests use pragmas and attributes to change the architecture.
Sometimes they simply add a feature using "+crc", but other times they
try to completely reset the architecture using "arch=armv8-a+crc".
The latter fails on a hard-float ABI with -mfpu=auto because it also
clears the FP capability.  Fix by adding +simd when the full
architecture is specified.
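
A minimal sketch of the pattern, assuming a 32-bit Arm A-profile
compiler for the first branch and any other host for the second.  The
function names are hypothetical, and the Arm branch further assumes the
hardware implements the CRC32 instructions.  The fallback exists only
so the sketch builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined (__arm__)
#include <arm_acle.h>
/* Full architecture reset: +simd keeps the FP/SIMD capability so a
   hard-float ABI with -mfpu=auto stays usable; +crc adds CRC32.  */
__attribute__ ((target ("arch=armv8-a+simd+crc")))
uint32_t crc32_update (uint32_t crc, uint8_t byte)
{
  return __crc32b (crc, byte);
}
#else
/* Portable fallback: bitwise CRC-32, reflected polynomial 0xEDB88320.  */
uint32_t crc32_update (uint32_t crc, uint8_t byte)
{
  crc ^= byte;
  for (int bit = 0; bit < 8; bit++)
    crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  return crc;
}
#endif

/* Standard CRC-32 framing: initial value and final inversion.  */
uint32_t crc32_buffer (const unsigned char *data, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++)
    crc = crc32_update (crc, data[i]);
  return ~crc;
}

/* The usual CRC-32 check value for the ASCII string "123456789".  */
void crc32_example (void)
{
  assert (crc32_buffer ((const unsigned char *) "123456789", 9)
	  == 0xCBF43926u);
}
```

Writing "arch=armv8-a+crc" in the attribute instead would drop the FP
capability along with everything else, which is the failure described
above for hard-float configurations.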

gcc/testsuite:

* gcc.target/arm/pragma_arch_attribute.c: Add +simd to pragmas that
set an explicit architecture.
* gcc.target/arm/pragma_arch_attribute_2.c: Likewise.
* gcc.target/arm/pragma_arch_attribute_3.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c   | 6 +++---
 gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c | 2 +-
 gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c
index a06dbf04037..a5e1edad3a4 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c
@@ -10,7 +10,7 @@
 #endif
 
 #pragma GCC push_options
-#pragma GCC target ("arch=armv8-a+crc")
+#pragma GCC target ("arch=armv8-a+simd+crc")
 #ifndef __ARM_FEATURE_CRC32
 # error "__ARM_FEATURE_CRC32 is not defined in push 1."
 #endif
@@ -41,7 +41,7 @@ void test_crc_unknown_ok_attr_1 ()
 # error "__ARM_FEATURE_CRC32 is defined after attribute set 1."
 #endif
 
-__attribute__((target("arch=armv8-a+crc")))
+__attribute__((target("arch=armv8-a+simd+crc")))
 void test_crc_unknown_ok_attr_2 ()
 {
 	__crc32b (0, 0);
@@ -51,4 +51,4 @@ void test_crc_unknown_ok_attr_2 ()
 # error "__ARM_FEATURE_CRC32 is defined after attribute set 2."
 #endif
 
-#pragma GCC reset_options
\ No newline at end of file
+#pragma GCC reset_options
diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c
index 2e8e385774b..189af170096 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c
@@ -8,7 +8,7 @@
 
 extern uint32_t bar();
 
-__attribute__((target("arch=armv8-a+crc"))) uint32_t crc32cw(uint32_t crc, uint32_t val)
+__attribute__((target("arch=armv8-a+simd+crc"))) uint32_t crc32cw(uint32_t crc, uint32_t val)
 {
 uint32_t res;
 asm("crc32cw %0, %1, %2" : "=r"(res) : "r"(crc), "r"(val));
diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c
index 3714812cf26..eb7f990477b 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c
@@ -9,7 +9,7 @@
 extern uint32_t bar();
 
 #pragma GCC push_options
-#pragma GCC target("arch=armv8-a+crc")
+#pragma GCC target("arch=armv8-a+simd+crc")
 uint32_t crc32cw(uint32_t crc, uint32_t val)
 {
 uint32_t res;

[committed 14/22] arm: testsuite: modernize framework usage for arm/scd42-2.c


Make this test more useful by using dg-require-effective-target/
dg-add-options.

gcc/testsuite:

* gcc.target/arm/scd42-2.c: Use modern dg- flags.
---
 gcc/testsuite/gcc.target/arm/scd42-2.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/scd42-2.c b/gcc/testsuite/gcc.target/arm/scd42-2.c
index 3c9768d22d9..cd416885a80 100644
--- a/gcc/testsuite/gcc.target/arm/scd42-2.c
+++ b/gcc/testsuite/gcc.target/arm/scd42-2.c
@@ -1,11 +1,8 @@
 /* Verify that mov is preferred on XScale for loading a 2 byte constant. */
 /* { dg-do compile } */
-/* { dg-skip-if "Test is specific to the Xscale" { arm*-*-* } { "-march=*" } { "-march=xscale" } } */
-/* { dg-skip-if "Test is specific to the Xscale" { arm*-*-* } { "-mcpu=*" } { "-mcpu=xscale" } } */
-/* { dg-skip-if "Test is specific to ARM mode" { arm*-*-* } { "-mthumb" } { "" } } */
-/* { dg-require-effective-target arm32 } */
-/* { dg-require-effective-target arm_arch_v5te_arm_ok } */
-/* { dg-options "-mcpu=xscale -O -marm" } */
+/* { dg-require-effective-target arm_arch_xscale_arm_ok } */
+/* { dg-options "-O" } */
+/* { dg-add-options arm_arch_xscale_arm } */
 
 unsigned load2(void) __attribute__ ((naked));
 unsigned load2(void)

[committed 13/22] arm: testsuite: improve compatibility of pragma_arch_switch_2.c


This test was explicitly setting the architecture on the command-line and
in the body of the test.  In both cases this causes problems with the auto
FPU setting.  Fix by using the testsuite infrastructure correctly and by
adding +fp to the pragma.

gcc/testsuite:

* gcc.target/arm/pragma_arch_switch_2.c: Use testsuite infrastructure
to set the architecture flags.  Add +fp to the pragma that changes the
architecture.
---
 gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c b/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
index 5080d2c7a91..567943bd8ed 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
@@ -3,9 +3,10 @@
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_arm_ok } */
 /* { dg-require-effective-target arm_arch_v5te_arm_ok } */
-/* { dg-additional-options "-Wall -O2 -march=armv5te -std=gnu99 -marm" } */
+/* { dg-additional-options "-Wall -O2 -std=gnu99" } */
+/* { dg-add-options arm_arch_v5te_arm } */
 
-#pragma GCC target ("arch=armv6")
+#pragma GCC target ("arch=armv6+fp")
 int test_assembly (int hi, int lo)
 {
int res;

[committed 09/22] arm: testsuite: tidy up pr65647-2.c pre-checks.


Another case where we can make better use of the infrastructure to
improve the compatibility of this test.

gcc/testsuite:

* gcc.target/arm/pr65647-2.c: Use dg-add-options to manage target
flags.
---
 gcc/testsuite/gcc.target/arm/pr65647-2.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr65647-2.c b/gcc/testsuite/gcc.target/arm/pr65647-2.c
index e3978e512ea..79637bfd9d7 100644
--- a/gcc/testsuite/gcc.target/arm/pr65647-2.c
+++ b/gcc/testsuite/gcc.target/arm/pr65647-2.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v6_arm_ok } */
-/* { dg-options "-O3 -marm -march=armv6 -std=c99" } */
+/* { dg-options "-O3 -std=c99" } */
+/* { dg-add-options arm_arch_v6_arm } */
 
 typedef struct {
   int i;

[committed 19/22] arm: testsuite: improve compatibility of gcc.target/arm/pr59575.c


Use dg-require-effective-target/dg-add-options to improve
compatibility of this test with various compiler configurations.

gcc/testsuite:

* gcc.target/arm/pr59575.c: Use dg-require-effective-target and
dg-add-options.
---
 gcc/testsuite/gcc.target/arm/pr59575.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr59575.c b/gcc/testsuite/gcc.target/arm/pr59575.c
index cc49be3d61f..27d7d40526e 100644
--- a/gcc/testsuite/gcc.target/arm/pr59575.c
+++ b/gcc/testsuite/gcc.target/arm/pr59575.c
@@ -1,7 +1,9 @@
 /* PR target/59575 */
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
 /* { dg-skip-if "-mpure-code supports M-profile only" { *-*-* } { "-mpure-code" } } */
-/* { dg-options "-Os -g -march=armv7-a" } */
+/* { dg-options "-Os -g" } */
+/* { dg-add-options arm_arch_v7a } */
 
 void foo (int *);
 int *bar (int, long long, int);

[committed 20/22] testsuite: arm: tighten up mode-specific ISA tests


Some of the standard Arm architecture tests require the test to use a
specific instruction set (arm or thumb).  But although the framework
was checking that the flag was accepted, it wasn't checking that the
flag wasn't somehow being overridden (e.g. by run-specific options).  We
can improve these tests easily by checking whether or not __thumb__ is
defined.
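
A small sketch of the idea, outside the testsuite: __thumb__ reflects
the instruction set actually selected for the compilation, whatever
flags were requested.  The enum and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical classification of the current compilation.  */
enum isa_mode { ISA_NOT_ARM = 0, ISA_ARM = 1, ISA_THUMB = 2 };

/* Report the instruction set this translation unit was compiled for.
   An "_arm" variant wants ISA_ARM; a "_thumb" variant wants ISA_THUMB.  */
enum isa_mode isa_mode (void)
{
#if defined (__thumb__)
  return ISA_THUMB;
#elif defined (__arm__)
  return ISA_ARM;
#else
  return ISA_NOT_ARM;
#endif
}

/* The result is always one of the three enumerators.  */
void isa_mode_example (void)
{
  enum isa_mode mode = isa_mode ();
  assert (mode == ISA_NOT_ARM || mode == ISA_ARM || mode == ISA_THUMB);
}
```

If a run-specific option such as -mthumb ends up overriding -marm, the
function returns ISA_THUMB even though -marm was accepted, which is the
situation the tightened checks are meant to catch.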

gcc/testsuite:

* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
For instruction-set specific tests, check that __thumb__ is, or
isn't, defined as appropriate.
---
 gcc/testsuite/lib/target-supports.exp | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 316e34a34be..3d504d26164 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5403,25 +5403,25 @@ proc check_effective_target_arm_fp16_hw { } {
 foreach { armfunc armflag armdefs } {
 	v4 "-march=armv4 -marm" __ARM_ARCH_4__
 	v4t "-march=armv4t -mfloat-abi=softfp" __ARM_ARCH_4T__
-	v4t_arm "-march=armv4t -marm" __ARM_ARCH_4T__
-	v4t_thumb "-march=armv4t -mthumb -mfloat-abi=softfp" __ARM_ARCH_4T__
+	v4t_arm "-march=armv4t -marm" "__ARM_ARCH_4T__ && !__thumb__"
+	v4t_thumb "-march=armv4t -mthumb -mfloat-abi=softfp" "__ARM_ARCH_4T__ && __thumb__"
 	v5t "-march=armv5t -mfloat-abi=softfp" __ARM_ARCH_5T__
-	v5t_arm "-march=armv5t -marm" __ARM_ARCH_5T__
-	v5t_thumb "-march=armv5t -mthumb -mfloat-abi=softfp" __ARM_ARCH_5T__
+	v5t_arm "-march=armv5t -marm" "__ARM_ARCH_5T__ && !__thumb__"
+	v5t_thumb "-march=armv5t -mthumb -mfloat-abi=softfp" "__ARM_ARCH_5T__ && __thumb__"
 	v5te "-march=armv5te+fp -mfloat-abi=softfp" __ARM_ARCH_5TE__
-	v5te_arm "-march=armv5te+fp -marm" __ARM_ARCH_5TE__
-	v5te_thumb "-march=armv5te+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_5TE__
-	xscale_arm "-mcpu=xscale -mfloat-abi=soft -marm" __XSCALE__
+	v5te_arm "-march=armv5te+fp -marm" "__ARM_ARCH_5TE__ && !__thumb__"
+	v5te_thumb "-march=armv5te+fp -mthumb -mfloat-abi=softfp" "__ARM_ARCH_5TE__ && __thumb__"
+	xscale_arm "-mcpu=xscale -mfloat-abi=soft -marm" "__XSCALE__ && !__thumb__"
 	v6 "-march=armv6+fp -mfloat-abi=softfp" __ARM_ARCH_6__
-	v6_arm "-march=armv6+fp -marm" __ARM_ARCH_6__
-	v6_thumb "-march=armv6+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6__
+	v6_arm "-march=armv6+fp -marm" "__ARM_ARCH_6__ && !__thumb__"
+	v6_thumb "-march=armv6+fp -mthumb -mfloat-abi=softfp" "__ARM_ARCH_6__ && __thumb__"
 	v6k "-march=armv6k+fp -mfloat-abi=softfp" __ARM_ARCH_6K__
-	v6k_arm "-march=armv6k+fp -marm" __ARM_ARCH_6K__
-	v6k_thumb "-march=armv6k+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6K__
+	v6k_arm "-march=armv6k+fp -marm" "__ARM_ARCH_6K__ && !__thumb__"
+	v6k_thumb "-march=armv6k+fp -mthumb -mfloat-abi=softfp" "__ARM_ARCH_6K__ && __thumb__"
 	v6t2 "-march=armv6t2+fp" __ARM_ARCH_6T2__
 	v6z "-march=armv6z+fp -mfloat-abi=softfp" __ARM_ARCH_6Z__
-	v6z_arm "-march=armv6z+fp -marm" __ARM_ARCH_6Z__
-	v6z_thumb "-march=armv6z+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6Z__
+	v6z_arm "-march=armv6z+fp -marm" "__ARM_ARCH_6Z__ && !__thumb__"
+	v6z_thumb "-march=armv6z+fp -mthumb -mfloat-abi=softfp" "__ARM_ARCH_6Z__ && __thumb__"
 	v6m "-march=armv6-m -mthumb -mfloat-abi=soft" __ARM_ARCH_6M__
 	v7a "-march=armv7-a+fp" __ARM_ARCH_7A__
 	v7r "-march=armv7-r+fp" __ARM_ARCH_7R__

[committed 16/22] arm: testsuite: improve compatibility of gcc.target/arm/macro_defs*.c


Convert these tests to use dg-add-options for increased compatibility.
Since they also result in an empty translation unit, override the
default testsuite options.

gcc/testsuite:

* gcc.target/arm/macro_defs0.c: Use dg-require-effective-target and
dg-add-options.
* gcc.target/arm/macro_defs1.c: Likewise.
* gcc.target/arm/macro_defs2.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/macro_defs0.c | 7 +++
 gcc/testsuite/gcc.target/arm/macro_defs1.c | 6 ++
 gcc/testsuite/gcc.target/arm/macro_defs2.c | 6 ++
 3 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/macro_defs0.c b/gcc/testsuite/gcc.target/arm/macro_defs0.c
index 684d49ffafa..17fd157452e 100644
--- a/gcc/testsuite/gcc.target/arm/macro_defs0.c
+++ b/gcc/testsuite/gcc.target/arm/macro_defs0.c
@@ -1,8 +1,7 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv7-m" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
-/* { dg-options "-march=armv7-m -mcpu=cortex-m3 -mfloat-abi=soft -mthumb" } */
+/* { dg-require-effective-target arm_arch_v7m_ok } */
+/* { dg-options "" } */
+/* { dg-add-options arm_arch_v7m } */
 
 #ifdef __ARM_FP
 #error __ARM_FP should not be defined
diff --git a/gcc/testsuite/gcc.target/arm/macro_defs1.c b/gcc/testsuite/gcc.target/arm/macro_defs1.c
index 655ba9334f3..bd22154321e 100644
--- a/gcc/testsuite/gcc.target/arm/macro_defs1.c
+++ b/gcc/testsuite/gcc.target/arm/macro_defs1.c
@@ -1,10 +1,8 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv6-m" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
 /* { dg-require-effective-target arm_arch_v6m_ok } */
-/* { dg-options "-march=armv6-m -mthumb" } */
+/* { dg-options "" } */
+/* { dg-add-options arm_arch_v6m } */
 
 #ifdef __ARM_NEON_FP
 #error __ARM_NEON_FP should not be defined
 #endif
-
diff --git a/gcc/testsuite/gcc.target/arm/macro_defs2.c b/gcc/testsuite/gcc.target/arm/macro_defs2.c
index 9a960423562..a26fc237611 100644
--- a/gcc/testsuite/gcc.target/arm/macro_defs2.c
+++ b/gcc/testsuite/gcc.target/arm/macro_defs2.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv7ve -mcpu=cortex-a15 -mfpu=neon-vfpv4" } */
-/* { dg-add-options arm_neon } */
 /* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "" } */
+/* { dg-add-options arm_neon } */
 
 #ifndef __ARM_NEON_FP
 #error  __ARM_NEON_FP is not defined but should be
@@ -10,5 +10,3 @@
 #ifndef __ARM_FP
 #error  __ARM_FP is not defined but should be
 #endif
-
-

[committed 15/22] arm: testsuite: improve compatibility of ftest-armv7m-thumb.c


This test is specific to armv7m cores which do not support hardware
floating-point.  We can improve its compatibility by having the default
options for this core specify -mfloat-abi=soft.

gcc/testsuite:

* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Use soft-float ABI for armv7m.
* gcc.target/arm/ftest-armv7m-thumb.c: Use dg-require-effective-target
to check flag compatibility.
---
 gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c | 3 +--
 gcc/testsuite/lib/target-supports.exp | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c b/gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c
index 363b48b7516..ba1985f5b0d 100644
--- a/gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c
+++ b/gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=arm7-m" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
+/* { dg-require-effective-target arm_arch_v7m_ok } */
 /* { dg-options "-mthumb" } */
 /* { dg-add-options arm_arch_v7m } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 9d2958626ad..316e34a34be 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5425,7 +5425,7 @@ foreach { armfunc armflag armdefs } {
 	v6m "-march=armv6-m -mthumb -mfloat-abi=soft" __ARM_ARCH_6M__
 	v7a "-march=armv7-a+fp" __ARM_ARCH_7A__
 	v7r "-march=armv7-r+fp" __ARM_ARCH_7R__
-	v7m "-march=armv7-m -mthumb" __ARM_ARCH_7M__
+	v7m "-march=armv7-m -mthumb -mfloat-abi=soft" __ARM_ARCH_7M__
 	v7em "-march=armv7e-m+fp -mthumb" __ARM_ARCH_7EM__
 	v7ve "-march=armv7ve+fp -marm"
 		"__ARM_ARCH_7A__ && __ARM_FEATURE_IDIV"

[committed 18/22] arm: testsuite: improve compatibility of gcc.target/arm/pr19599.c