date:20231122

Re: [PATCH] testsuite: Tweak xfail bogus g++.dg/warn/Wstringop-overflow-4.C:144, PR106120

2023-11-22 Thread Richard Biener

On Wed, Nov 22, 2023 at 3:04 AM Hans-Peter Nilsson  wrote:
>
> I added that xfail in February for { ilp32 && c++98_only } and it
> looks like it's moved on to lp64 now. :-/  Noted by Rainer
> Orth, see the PR.
>
> Tested cris-elf and x86_64-pc-linux-gnu w/wo. -m32.
> Ok to commit?

OK

> -- >8 --
> The conditions under which this bogus warning is
> emitted have changed so that it no longer happens for
> 32-bit targets.  Adjust accordingly.
>
> PR testsuite/106120
> * g++.dg/warn/Wstringop-overflow-4.C:144 XFAIL bogus warning for
> lp64 targets with c++98.
> ---
>  gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C 
> b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
> index 275ecac01b5f..2024f8d93ca3 100644
> --- a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
> +++ b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
> @@ -141,7 +141,7 @@ void test_strcpy_new_int16_t (size_t n, const size_t vals[])
>
>    int r_imin_imax = SR (INT_MIN, INT_MAX);
>    T (S (1), new int16_t[r_imin_imax]);
> -  T (S (2), new int16_t[r_imin_imax + 1]); // { dg-bogus "into a region of size" "pr106120" { xfail { c++98_only } } }
> +  T (S (2), new int16_t[r_imin_imax + 1]); // { dg-bogus "into a region of size" "pr106120" { xfail { lp64 && c++98_only } } }
>    T (S (9), new int16_t[r_imin_imax * 2 + 1]);
>
>    int r_0_imax = SR (0, INT_MAX);
> --
> 2.30.2
>

Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store

2023-11-22 Thread Richard Biener

On Wed, Nov 22, 2023 at 3:30 AM Li, Pan2  wrote:
>
> Hi Richard S,
>
> Thanks a lot for reviewing and comments. May I know is there any concern or 
> further comments for landing this patch to GCC-14?

It looks like Jeff approved the patch?

Richard.

> Pan
>
> -----Original Message-----
> From: Li, Pan2
> Sent: Wednesday, November 15, 2023 8:25 AM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandif...@arm.com; 
> Jeff Law 
> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
> store
>
> Sorry for disturbing, looks I have a typo for Richard S's email address, cc 
> the right email address for awareness.
>
> Pan
>
> -----Original Message-----
> From: Li, Pan2
> Sent: Wednesday, November 15, 2023 8:18 AM
> To: Jeff Law ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
> store
>
> > I wouldn't try to handle that case unless we had actual evidence it was
> > useful to do so.  Just wanted to point out that unlike pseudos we can
> > have multiple modes referencing the same memory location.
>
> Got the point here, thanks Jeff for emphasizing this, 😉.
>
> Pan
>
> -----Original Message-----
> From: Jeff Law 
> Sent: Tuesday, November 14, 2023 4:12 AM
> To: Li, Pan2 ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
> store
>
>
>
> On 11/12/23 20:22, pan2...@intel.com wrote:
> > From: Pan Li 
> >
> > Update in v4:
> > * Merge upstream and removed some independent changes.
> >
> > Update in v3:
> > * Take known_le instead of known_lt for vector size.
> > * Return NULL_RTX when gap is not equal 0 and not constant.
> >
> > Update in v2:
> > * Move vector type support to get_stored_val.
> >
> > Original log:
> >
> > This patch would like to allow the vector mode in the
> > get_stored_val in the DSE. It is valid for the read
> > rtx if and only if the read bitsize is less than the
> > stored bitsize.
> >
> > Given below example code with
> > --param=riscv-autovec-preference=fixed-vlmax.
> >
> > vuint8m1_t test () {
> >   uint8_t arr[32] = {
> >     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
> >     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
> >   };
> >
> >   return __riscv_vle8_v_u8m1(arr, 32);
> > }
> >
> > Before this patch:
> > test:
> >    lui     a5,%hi(.LANCHOR0)
> >    addi    sp,sp,-32
> >    addi    a5,a5,%lo(.LANCHOR0)
> >    li      a3,32
> >    vl2re64.v   v2,0(a5)
> >    vsetvli zero,a3,e8,m1,ta,ma
> >    vs2r.v  v2,0(sp) <== Unnecessary store to stack
> >    vle8.v  v1,0(sp) <== Ditto
> >    vs1r.v  v1,0(a0)
> >    addi    sp,sp,32
> >    jr      ra
> >
> > After this patch:
> > test:
> >    lui     a5,%hi(.LANCHOR0)
> >    addi    a5,a5,%lo(.LANCHOR0)
> >    li      a4,32
> >    addi    sp,sp,-32
> >    vsetvli zero,a4,e8,m1,ta,ma
> >    vle8.v  v1,0(a5)
> >    vs1r.v  v1,0(a0)
> >    addi    sp,sp,32
> >    jr      ra
> >
> > Below tests are passed within this patch:
> > * The risc-v regression test.
> > * The x86 bootstrap and regression test.
> > * The aarch64 regression test.
> >
> >   PR target/111720
> >
> > gcc/ChangeLog:
> >
> >   * dse.cc (get_stored_val): Allow vector mode if read size is
> >   less than or equal to stored size.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
> OK for the trunk.
>
>
> >
>
> > +  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
> > +&& known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE 
> > (store_mode))
> > +&& targetm.modes_tieable_p (read_mode, store_mode))
> > +read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
> > else
> >   read_reg = extract_low_bits (read_mode, store_mode,
> >copy_rtx (store_info->rhs));
> It may not matter, especially for RV, but we could possibly have a
> mixture of scalar and vector modes in the RTL.  Say a vector store
> followed by a scalar read or vice-versa.
>

Re: [PATCH #2/4] c++: mark short-enums as packed

2023-11-22 Thread Alexandre Oliva

On Nov 20, 2023, Jason Merrill  wrote:

> I think the warning is wrong here.

Interesting...  Yeah, your analysis makes perfect sense.

Still, we're left with a divergence WRT the TYPE_PACKED status of enum
types between C and C++.

It sort of kind of makes sense to mark short enums as packed, because,
well, they are.

Even enum types with explicit attribute packed, which IIUC use the same
underlying type selection as -fshort-enums, IIRC are not marked with
TYPE_PACKED in C++, at least not at the place where I proposed to set
it.  Do you consider that behavior correct?

Even if the warning happens to be buggy in this regard, it is at best
(or worst) accessory to this patch, in that it makes that difference
between languages apparent, and I worry that there might be other middle
end tests involving TYPE_PACKED that would get things different in C vs
C++.  (admittedly, I haven't searched for occurrences of TYPE_PACKED in
the tree, but I could, to alleviate my concerns, in case there's a
decision to keep them different)

> In the analyzer testcase, we have a cast from an
> enum pointer that we don't know what it points to, and even if it did 
> point to the obj_type member of struct connection, that wouldn't be a
> problem because it's at offset 0.

Maybe I misunderstand the point of the warning, but ISTM that the
circumstance it's warning about is real: the member is not as aligned as
the enclosing struct, so the cast is risky.  Now, I suppose the idiom of
finding the enclosing struct given a member is common enough that we
don't want to warn about it in general.  I'm not sure what makes packed
structs special in this regard, though.  I don't really see much
difference, more laxly-aligned fields seem equally warn-worthy, whether
the enclosing struct is packed or not, but what do I know?

> Also, -fshort-enums has nothing to do with structure packing

*nod*, it's about packing of the enum type itself.  It is some sort of a
degenerated aggregate type ;-) But yeah, I guess it doesn't fit the
circumstance the warning was meant to catch, and the fact that in C it
does is a consequence of marking C short enums as TYPE_PACKED.

Which might be a bug in C.

But wouldn't it be a bug in C++ if an enum with attribute packed weren't
marked as TYPE_PACKED?  Or is TYPE_PACKED really meant to say something
about the enclosing struct rather than about the enclosed type itself?
(am I getting too philosophical here? :-)

Thanks,

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive

Re: [PATCH] call maybe_return_this in build_clone

2023-11-22 Thread Alexandre Oliva

On Nov 20, 2023, Jason Merrill  wrote:

> So it only passed on that platform before because of the bug?

I'm not sure it passed (we've had patches for that testcase before), but
I didn't look closely enough into their history to tell.  I suspected
the warning suppression machinery changed, or details on cloning did, or
something.  It's been fragile historically.  But yeah, recently, the
test for the warning was only passing because of the bug.  But we were
also getting excess warnings, so it wasn't fully passing.


Thanks for the reviews!

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive

Re: [PATCH V2 1/3]rs6000: update num_insns_constant for 2 insns

2023-11-22 Thread Kewen.Lin

Hi,

on 2023/11/15 11:02, Jiufu Guo wrote:
> Hi,
> 
> Trunk gcc supports more constants to be built via two instructions: e.g.
> "li/lis; xori/xoris/rldicl/rldicr/rldic".
> And then num_insns_constant should also be updated.
> 
> Function "rs6000_emit_set_long_const" is used to build complicated
> constants; and "num_insns_constant_gpr" is used to compute how
> many instructions are needed to build the constant. So, these 
> two functions should be aligned.
> 
> The idea is: reuse "rs6000_emit_set_long_const" to compute/record
> the instruction number (when computing the insn_num, do not emit
> instructions).
> 
> Compared with the previous version:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634195.html
> This version adds an argument to "rs6000_emit_set_long_const" to
> indicate computing the instruction number instead of emitting instructions.
> 
> Bootstrap & regtest pass ppc64{,le}.
> Is this ok for trunk?
> 
> BR,
> Jeff (Jiufu Guo)
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add new 
>   parameter to record number of instructions to build the constant.
>   (num_insns_constant_gpr): Call rs6000_emit_set_long_const to compute
>   num_insn.
>   (ADJUST_INSN_NUM_AND_RET): New macro.
>   (rs6000_emit_set_const): Call rs6000_emit_set_long_const with NULL
>   argument.
> 
> ---
>  gcc/config/rs6000/rs6000.cc | 245 +++-
>  1 file changed, 133 insertions(+), 112 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index cc24dd5301e..ba40dd6eee4 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -1115,7 +1115,7 @@ static tree rs6000_handle_longcall_attribute (tree *, 
> tree, tree, int, bool *);
>  static tree rs6000_handle_altivec_attribute (tree *, tree, tree, int, bool 
> *);
>  static tree rs6000_handle_struct_attribute (tree *, tree, tree, int, bool *);
>  static tree rs6000_builtin_vectorized_libmass (combined_fn, tree, tree);
> -static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT);
> +static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT, int *);

Make the new argument default to nullptr... 

>  static int rs6000_memory_move_cost (machine_mode, reg_class_t, bool);
>  static bool rs6000_debug_rtx_costs (rtx, machine_mode, int, int, int *, 
> bool);
>  static int rs6000_debug_address_cost (rtx, machine_mode, addr_space_t,
> @@ -6054,21 +6054,9 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
> 
>else if (TARGET_POWERPC64)
>  {
> -  HOST_WIDE_INT low = sext_hwi (value, 32);
> -  HOST_WIDE_INT high = value >> 31;
> -
> -  if (high == 0 || high == -1)
> - return 2;
> -
> -  high >>= 1;
> -
> -  if (low == 0 || low == high)
> - return num_insns_constant_gpr (high) + 1;
> -  else if (high == 0)
> - return num_insns_constant_gpr (low) + 1;
> -  else
> - return (num_insns_constant_gpr (high)
> - + num_insns_constant_gpr (low) + 1);
> +  int num_insns = 0;
> +  rs6000_emit_set_long_const (NULL, value, &num_insns);
> +  return num_insns;
>  }
> 
>else
> @@ -10284,7 +10272,7 @@ rs6000_emit_set_const (rtx dest, rtx source)
> emit_move_insn (lo, GEN_INT (c));
>   }
>else
> - rs6000_emit_set_long_const (dest, c);
> + rs6000_emit_set_long_const (dest, c, NULL);

... then we don't need to change this line.
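
For illustration only, a minimal sketch of the default-argument suggestion. It is a toy model, not the rs6000 code: `emit_set_long_const`, `num_insns_constant`, the `emitted` vector and the one-or-two instruction split are all hypothetical stand-ins. It only shows that with a defaulted counter pointer, existing emitting callers keep their two-argument form while the counting caller passes a counter.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the emitted instruction stream.
static std::vector<std::string> emitted;

// Toy "emit or count" helper.  When NUM_INSNS is null (the default),
// instructions are emitted; otherwise only *NUM_INSNS is increased.
// The one-or-two instruction split is illustrative, not the real logic.
static void
emit_set_long_const (std::int64_t c, int *num_insns = nullptr)
{
  const bool fits_16bit = c >= -0x8000 && c <= 0x7fff;
  const int needed = fits_16bit ? 1 : 2;

  if (num_insns)
    {
      *num_insns += needed;
      return;
    }

  emitted.push_back ("li");
  if (!fits_16bit)
    emitted.push_back ("ori");
}

// Counting caller: passes the counter explicitly and emits nothing.
static int
num_insns_constant (std::int64_t c)
{
  int n = 0;
  emit_set_long_const (c, &n);
  return n;
}
```

An emitting caller would simply write `emit_set_long_const (c);`, which is why the call site in the hunk below would not need to change.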

>break;
> 
>  default:
> @@ -10494,14 +10482,13 @@ can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int 
> *shift, HOST_WIDE_INT *mask)
> 
>  /* Subroutine of rs6000_emit_set_const, handling PowerPC64 DImode.
> Output insns to set DEST equal to the constant C as a series of
> -   lis, ori and shl instructions.  */
> +   lis, ori and shl instructions.  If NUM_INSNS is not NULL, then
> +   only increase *NUM_INSNS as the number of insns, and do not output
> +   real insns.  */
> 
>  static void
> -rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
> +rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c, int *num_insns)
>  {
> -  rtx temp;
> -  int shift;
> -  HOST_WIDE_INT mask;
>HOST_WIDE_INT ud1, ud2, ud3, ud4;
> 
>ud1 = c & 0xffff;
> @@ -10509,41 +10496,71 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT 
> c)
>ud3 = (c >> 32) & 0xffff;
>ud4 = (c >> 48) & 0xffff;
> 
> +  /* This macro RETURNs this function after increasing *NUM_INSNS!!!  */
> +#define ADJUST_INSN_NUM_AND_RET(N)   \
> +  if (num_insns)                     \
> +    {                                \
> +      *num_insns += (N);             \
> +      return;                        \
> +    }
This macro and its uses below can still have the chance to get the inconsistent
counts, as in some arms the c

Re: [committed] sanitizer: Fix build on SPARC/Solaris with Solaris as [PR112562]

2023-11-22 Thread Rainer Orth

Hi Jakub,

> Solaris as apparently doesn't accept %function and requires @function
> instead.

actually, it's the other way round: Solaris/x86 as cannot handle
%function, but requires @function.

Solaris/SPARC as is similar (requires #function instead), but is
unaffected: sanitizer_asm.h disables ASM_INTERCEPTOR_TRAMPOLINE on sparc
and sets ASM_INTERCEPTOR_TRAMPOLINE_SUPPORT to 0, so interceptor.h
doesn't use the problematic version of DECLARE_WRAPPER.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH V2 2/3] Using pli to split 34bits constant

2023-11-22 Thread Kewen.Lin

Hi,

on 2023/11/15 11:02, Jiufu Guo wrote:
> Hi,
> 
> For constants with 16bit values, 'li or lis' can be used to generate
> the value.  For 34bit constant, 'pli' is ok to generate the value.
> For example: 0x6666666666666666ULL, "pli 3,1717986918; rldimi 3,3,32,0"
> can be used.

Since emit_move_insn with a 34bit constant already adopts pli, it's not
obvious to readers why we want this change.  I think you should state the
reason here explicitly, e.g. that rs6000_emit_set_long_const can recursively
call itself without invoking emit_move_insn, which can result in a sub-optimal
constant build ...
For the testing, I'd prefer a dedicated test case, e.g. extracting function
msk66 from pr93012.c and checking that its generated assembly has pli but not
lis and ori on Power10 and up.
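
As a sanity check of the quoted "pli 3,1717986918; rldimi 3,3,32,0" sequence, here is a small sketch that models the two instructions arithmetically. The helper names are hypothetical; `fits_signed_34bit` only mirrors the range that the SIGNED_INTEGER_34BIT_P name implies (-2^33 .. 2^33-1), and `rldimi_32_0` models just this one operand combination, not the general instruction.

```cpp
#include <cstdint>

// True if C fits in a signed 34-bit immediate (assumed meaning of the
// SIGNED_INTEGER_34BIT_P test in the quoted patch).
constexpr bool
fits_signed_34bit (std::int64_t c)
{
  return c >= -(INT64_C (1) << 33) && c < (INT64_C (1) << 33);
}

// Model of "rldimi RA,RS,32,0": rotate RS left by 32 and insert the
// upper 32 bits of the result into RA, keeping RA's lower 32 bits.
constexpr std::uint64_t
rldimi_32_0 (std::uint64_t ra, std::uint64_t rs)
{
  const std::uint64_t rotated = (rs << 32) | (rs >> 32);
  const std::uint64_t mask = UINT64_C (0xffffffff00000000);
  return (rotated & mask) | (ra & ~mask);
}

// "pli 3,1717986918" followed by "rldimi 3,3,32,0".
constexpr std::uint64_t
build_example ()
{
  return rldimi_32_0 (UINT64_C (1717986918), UINT64_C (1717986918));
}
```

Since 1717986918 is 0x66666666, the modelled result has that pattern in both halves, so two instructions build the full 64-bit constant.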

The others look good to me.  Thanks!

BR,
Kewen

> 
> Compare with previous:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634196.html
> This version updates a testcase to cover this functionality.
> 
> Bootstrap & regtest pass on ppc64{,le}.
> Is this ok for trunk?
> 
> BR,
> Jeff (Jiufu Guo)
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add code to use
>   pli for 34bit constant.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pr93012.c: Update to check pli.
> 
> ---
>  gcc/config/rs6000/rs6000.cc| 9 +
>  gcc/testsuite/gcc.target/powerpc/pr93012.c | 1 +
>  2 files changed, 10 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index ba40dd6eee4..b277c52687b 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10504,6 +10504,15 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT 
> c, int *num_insns)
>      return;                        \
>    }
> 
> +  if (TARGET_PREFIXED && SIGNED_INTEGER_34BIT_P (c))
> +{
> +  /* li/lis/pli */
> +  ADJUST_INSN_NUM_AND_RET (1);
> +
> +  emit_move_insn (dest, GEN_INT (c));
> +  return;
> +}
> +
>if ((ud4 == 0xffff && ud3 == 0xffff && ud2 == 0xffff && (ud1 & 0x8000))
>|| (ud4 == 0 && ud3 == 0 && ud2 == 0 && !(ud1 & 0x8000)))
>  {
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr93012.c 
> b/gcc/testsuite/gcc.target/powerpc/pr93012.c
> index 4f764d0576f..a07ff764bbf 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr93012.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr93012.c
> @@ -10,4 +10,5 @@ unsigned long long mskh1() { return 0x92349234ULL; }
>  unsigned long long mskl1() { return 0x2bcd2bcdULL; }
>  unsigned long long mskse() { return 0x12341234ULL; }
> 
> +/* { dg-final { scan-assembler-times {\mpli\M} 4 { target has_arch_pwr10 }} 
> } */
>  /* { dg-final { scan-assembler-times {\mrldimi\M} 7 } } */

Re: PING^1 [PATCH v3] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-11-22 Thread Kewen.Lin

on 2023/11/17 20:55, Alexander Monakov wrote:
> 
> On Fri, 17 Nov 2023, Kewen.Lin wrote:
>>> I don't think you can run cleanup_cfg after sched_init. I would suggest
>>> to put it early in schedule_insns.
>>
>> Thanks for the suggestion, I placed it at the beginning of haifa_sched_init
>> instead, since schedule_insns invokes haifa_sched_init, although the
>> calls rgn_setup_common_sched_info and rgn_setup_sched_infos are executed
>> ahead but they are all "setup" functions, shouldn't affect or be affected
>> by this placement.
> 
> I was worried because sched_init invokes df_analyze, and I'm not sure if
> cfg_cleanup can invalidate it.

Thanks for further explaining!  By scanning cleanup_cfg, it seems that it
considers df, like compact_blocks checks df, try_optimize_cfg invokes
df_analyze etc., but I agree that moving cleanup_cfg before sched_init
makes more sense.

> 
>>> I suspect this may be caused by invoking cleanup_cfg too late.
>>
>> By looking into some failures, I found that although cleanup_cfg is executed
>> there would be still some empty blocks left, by analyzing a few failures 
>> there
>> are at least such cases:
>>   1. empty function body
>>   2. block holding a label for return.
>>   3. block without any successor.
>>   4. block which becomes empty after scheduling some other block.
>>   5. block which looks mergeable with its always successor but left.
>>   ...
>>
>> For 1,2, there is one single successor EXIT block, I think they don't affect
>> state transition, for 3, it's the same.  For 4, it depends on if we can have
>> the assumption this kind of empty block doesn't have the chance to have debug
>> insn (like associated debug insn should be moved along), I'm not sure.  For 
>> 5,
>> a reduced test case is:
> 
> Oh, I should have thought of cases like these, really sorry about the slip
> of attention, and thanks for showing a testcase for item 5. As Richard is
> saying in his response, cfg_cleanup cannot be a fix here. The thing to check
> would be changing no_real_insns_p to always return false, and see if the
> situation looks recoverable (if it breaks bootstrap, regtest statistics of
> a non-bootstrapped compiler are still informative).

As you suggested, I forced no_real_insns_p to return false all the time, some
issues got exposed, almost all of them are asserting NOTE_P insn shouldn't be
encountered in those places, so the adjustments for most of them are just to
consider NOTE_P or this kind of special block and so on.  One draft patch is
attached, it can be bootstrapped and regress-tested on ppc64{,le} and x86.
btw, it's without the previous cfg_cleanup adjustment (hope it can get more
empty blocks and expose more issues).  The draft isn't qualified for code
review but I hope it can provide some information on what kinds of changes
are needed for the proposal.  If this is the direction which we all agree on,
I'll further refine it and post a formal patch.  One thing I want to note is
that this patch disables the assertion below:

diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc
index e5964f54ead..abd334864fb 100644
--- a/gcc/sched-rgn.cc
+++ b/gcc/sched-rgn.cc
@@ -3219,7 +3219,7 @@ schedule_region (int rgn)
 }

   /* Sanity check: verify that all region insns were scheduled.  */
-  gcc_assert (sched_rgn_n_insns == rgn_n_insns);
+  // gcc_assert (sched_rgn_n_insns == rgn_n_insns);

   sched_finish_ready_list ();

Some cases can cause this assertion to fail, due to a mismatch between the
to-be-scheduled and scheduled insn counts.  This happens when a block that
initially has only one INSN_P sees that insn moved away while some other block
is being scheduled, so we end up with an empty block and its only NOTE_P insn
gets counted as scheduled; but since this block isn't empty initially and
NOTE_P gets skipped in a normal block, the to-be-scheduled count doesn't
include it.  It can be fixed by special-casing this kind of block for counting,
e.g. initially recording which blocks are empty and, if a block wasn't recorded
before, fixing up the count for it accordingly.  I'm not sure if someone may
argue that all this complication makes this proposal lose to the previous
approach of special-casing debug insns; looking forward to more comments.

BR,
Kewen

From d350f411b23f6064a33a72a6ca7afc49b0ccea65 Mon Sep 17 00:00:00 2001
From: Kewen Lin 
Date: Wed, 22 Nov 2023 00:08:59 -0600
Subject: [PATCH] sched: Don't skip empty block in scheduling

att
---
 gcc/haifa-sched.cc | 166 +++--
 gcc/rtl.h  |   4 +-
 gcc/sched-rgn.cc   |   2 +-
 3 files changed, 103 insertions(+), 69 deletions(-)

diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
index 8e8add709b3..62377d99162 100644
--- a/gcc/haifa-sched.cc
+++ b/gcc/haifa-sched.cc
@@ -1207,6 +1207,11 @@ recompute_todo_spec (rtx_insn *next, bool for_backtrack)
   int n_replace = 0;
   bool first_p = true;

+  /* We don't skip no_real_insns_p any more, so it's possible to
+ meet NOTE insn now

Re: [PATCH, v3] Fortran: restrictions on integer arguments to SYSTEM_CLOCK [PR112609]

2023-11-22 Thread Mikael Morin


On 21/11/2023 at 23:09, Harald Anlauf wrote:

Uhh, it happened again.  Attached a wrong patch.
Only looked at the -v3 ...  My bad.

Sorry!

Harald


On 11/21/23 22:54, Harald Anlauf wrote:

Hi Mikael, Steve,

On 11/21/23 12:33, Mikael Morin wrote:

Harald, you mentioned the lack of GFC_STD_F2023_DEL feature group in
your first message, but I don't quite understand why you didn't add one.
  It seems to me the most natural way to do this.


thanks for insisting on this variant.

In my first attack at this problem, I overlooked one place in
libgfortran.h, which I now was able to find and adjust.
Now everything falls into place.


I suggest we emit a warning by default, error with -std=f2023 (I agree
with Steve that we should push towards strict f2023 conformance), and no
diagnostic with -std=gnu or -std=f2018 or lower.


As the majority agrees on this, I accept it.  The attached patch
now does this and fixes the testcases accordingly.


It seems that the solution is to fix the code in the testsuite.


Agreed, these seem to explicitly test mismatching kinds, so add an
option to prevent error.


Done.

I also fixed a few issues in the documentation in gfortran.texi .

As I currently cannot build a full compiler (see PR112643),
patch V3 is not properly regtested yet, but appears to give
results as discussed.

Comments?


Mikael


Thanks,
Harald





(...)


diff --git a/gcc/fortran/error.cc b/gcc/fortran/error.cc
index 2ac51e95e4d..be715b50469 100644
--- a/gcc/fortran/error.cc
+++ b/gcc/fortran/error.cc
@@ -980,7 +980,11 @@ char const*
 notify_std_msg(int std)
 {
 
-  if (std & GFC_STD_F2018_DEL)

+  if (std & GFC_STD_F2023_DEL)
+return _("Fortran 2023 deleted feature:");


As there are officially no deleted features in F2023, maybe use a 
slightly different wording?  Say "Not allowed in Fortran 2023" or 
"forbidden in Fortran 2023" or similar?



+  else if (std & GFC_STD_F2023)
+return _("Fortran 2023:");
+  else if (std & GFC_STD_F2018_DEL)
 return _("Fortran 2018 deleted feature:");
   else if (std & GFC_STD_F2018_OBS)
 return _("Fortran 2018 obsolescent feature:");



diff --git a/gcc/fortran/libgfortran.h b/gcc/fortran/libgfortran.h
index bdddb317ab0..af7a170c2b1 100644
--- a/gcc/fortran/libgfortran.h
+++ b/gcc/fortran/libgfortran.h
@@ -19,9 +19,10 @@ along with GCC; see the file COPYING3.  If not see
 
 
 /* Flags to specify which standard/extension contains a feature.

-   Note that no features were obsoleted nor deleted in F2003 nor in F2023.
+   Note that no features were obsoleted nor deleted in F2003.


I think we can add a comment that F2023 has no deleted feature, but some 
more stringent restrictions in f2023 forbid some previously valid code.



Please remember to keep those definitions in sync with
gfortran.texi.  */
+#define GFC_STD_F2023_DEL  (1<<13)   /* Deleted in F2023.  */
 #define GFC_STD_F2023  (1<<12)   /* New in F2023.  */
 #define GFC_STD_F2018_DEL  (1<<11)   /* Deleted in F2018.  */
 #define GFC_STD_F2018_OBS  (1<<10)   /* Obsolescent in F2018.  */
@@ -41,12 +42,13 @@ along with GCC; see the file COPYING3.  If not see
  * are allowed with a certain -std option.  */
 #define GFC_STD_OPT_F95    (GFC_STD_F77 | GFC_STD_F95 | GFC_STD_F95_OBS  \
                            | GFC_STD_F2008_OBS | GFC_STD_F2018_OBS \
-                           | GFC_STD_F2018_DEL)
+                           | GFC_STD_F2018_DEL | GFC_STD_F2023_DEL)
 #define GFC_STD_OPT_F03    (GFC_STD_OPT_F95 | GFC_STD_F2003)
 #define GFC_STD_OPT_F08    (GFC_STD_OPT_F03 | GFC_STD_F2008)
 #define GFC_STD_OPT_F18    ((GFC_STD_OPT_F08 | GFC_STD_F2018) \
                            & (~GFC_STD_F2018_DEL))
F03, F08 and F18 should have GFC_STD_F2023_DEL (and also F03 and F08 
should have GFC_STD_F2018_DEL).
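
For readers less familiar with these masks, a minimal sketch of the mechanism, assuming a feature is accepted when its flag bit is present in the mask selected by -std. Only the three bit values are taken from the quoted hunk; `opt_old`, `opt_f18` and `allowed` are hypothetical, cut-down stand-ins, and the real GFC_STD_OPT_* macros contain more bits.

```cpp
// Bit values copied from the quoted libgfortran.h hunk.
constexpr unsigned GFC_STD_F2023_DEL = 1u << 13;
constexpr unsigned GFC_STD_F2018_DEL = 1u << 11;
constexpr unsigned GFC_STD_F2018_OBS = 1u << 10;

// Hypothetical mask for an older standard: it still accepts features
// that later standards deleted.
constexpr unsigned opt_old
  = GFC_STD_F2018_OBS | GFC_STD_F2018_DEL | GFC_STD_F2023_DEL;

// Hypothetical F2018-like mask: built from the older one, with the
// F2018-deleted features masked out again.
constexpr unsigned opt_f18 = opt_old & ~GFC_STD_F2018_DEL;

// Assumed acceptance test: the feature's bit must be in the mask.
constexpr bool
allowed (unsigned feature, unsigned mask)
{
  return (feature & mask) != 0;
}
```

In this model, code relying on something forbidden only by F2023 is still accepted under the F2018-like mask, while F2018-deleted features are not.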


OK with this fixed (and the previous comments as you wish), if Steve has 
no more comments.


Thanks for the patch.

Re: [PATCH] ARM/testsuite: Use non-capturing parentheses with pr53447-5.c

2023-11-22 Thread Richard Earnshaw





On 22/11/2023 01:40, Maciej W. Rozycki wrote:

Use non-capturing parentheses for the subexpressions used with
`scan-assembler-times', to avoid a quirk with double-counting.

     gcc/testsuite/
     * gcc.target/arm/pr53447-5.c: Use non-capturing parentheses with
     `scan-assembler-times'.
---
Hi,

  The `scan-assembler-times' quirk is being fixed with
>, but

we don't need capturing parentheses here, typically used for back
references, so let's just avoid the quirk altogether and make our matching
here work either way.  Cf. commit 88c888f11379 ("pr53447-5.c: Fix test
expectations for neon-fpu.").
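
To illustrate the double-counting, here is a rough analogy in C++ rather than the actual Tcl/DejaGnu code. It assumes the count is taken from a flattened list holding every whole match plus every capturing-group sub-match, which is what the doubled expectations in the old test suggest; `count_flattened` is a hypothetical helper.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Count list elements the way a "flatten every match and every
// capturing group" scan would: one element for the whole match plus
// one per capturing group.
static std::size_t
count_flattened (const std::string &text, const std::string &pattern)
{
  const std::regex re (pattern);
  std::size_t n = 0;
  for (std::sregex_iterator it (text.begin (), text.end (), re), end;
       it != end; ++it)
    n += it->size ();   // whole match + capturing groups
  return n;
}
```

With a capturing group each occurrence contributes two elements, while the non-capturing `(?:...)` form contributes one, so the expected count equals the real number of instructions.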

  Verified by proof-reading, with a reference to the commit quoted above.
OK to apply?

   Maciej
---
  gcc/testsuite/gcc.target/arm/pr53447-5.c |    8 +++-
  1 file changed, 3 insertions(+), 5 deletions(-)

gcc-arm-test-pr53447-5-non-capturing.diff
Index: gcc/gcc/testsuite/gcc.target/arm/pr53447-5.c
===
--- gcc.orig/gcc/testsuite/gcc.target/arm/pr53447-5.c
+++ gcc/gcc/testsuite/gcc.target/arm/pr53447-5.c
@@ -15,8 +15,6 @@ void foo(long long* p)
    p[9] -= p[10];
  }

-/* We accept neon instructions vldr.64 and vstr.64 as well.
-   Note: DejaGnu counts patterns with alternatives twice,
-   so actually there are only 10 loads and 9 stores.  */
-/* { dg-final { scan-assembler-times "(ldrd|vldr\\.64)" 20 } } */
-/* { dg-final { scan-assembler-times "(strd|vstr\\.64)" 18 } } */
+/* We accept neon instructions vldr.64 and vstr.64 as well.  */
+/* { dg-final { scan-assembler-times "(?:ldrd|vldr\\.64)" 10 } } */
+/* { dg-final { scan-assembler-times "(?:strd|vstr\\.64)" 9 } } */



OK.

Thanks.

R.

[PATCH 0/5] Add support for operand-specific alignment requirements

2023-11-22 Thread juzhe.zh...@rivai.ai

Hi, Richard.

Thanks for supporting register filter in IRA/LRA.
I found it is useful for RVV since we have a set of widening operations that allow 
the source register to overlap the highpart of the dest register group.

For example, with vsext.vf2 v0 (dest consumes reg v0 and reg v1), v1 (source 
consumes v1 only),
I want to support the highpart overlap above. (Currently, we don't allow any overlap 
between source and dest in such instructions.)

So, I wonder whether we can pass "machine_mode" into the register filter. OK, I 
think it's too late since stage 1 has closed. I wonder whether we can add it in GCC-15?

Thanks. 



juzhe.zh...@rivai.ai

[PATCH] arm: [MVE intrinsics] Fix typo

2023-11-22 Thread Christophe Lyon

In commit 0c2037d9d93a8f768cb11698ff794278246bb31f (Add support for
contiguous loads and stores), I added a spurious line which broke
bootstrap because of an unused variable error.

This patch removes it.

Committed as obvious.

2023-11-22  Christophe Lyon  

gcc/ChangeLog:

* config/arm/arm-mve-builtins.cc
(function_resolver::infer_pointer_type): Remove spurious line.
---
 gcc/config/arm/arm-mve-builtins.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/config/arm/arm-mve-builtins.cc 
b/gcc/config/arm/arm-mve-builtins.cc
index a265cb05553..dec163dce4f 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -1168,7 +1168,6 @@ function_resolver::infer_pointer_type (unsigned int argno)
build_qualified_type (target, 0));
   return NUM_TYPE_SUFFIXES;
 }
-  unsigned int bits = type_suffixes[type].element_bits;
 
   return type;
 }
-- 
2.34.1

[PATCH] c++, v4: Implement C++26 P2741R3 - user-generated static_assert messages [PR110348]

2023-11-22 Thread Jakub Jelinek

On Tue, Nov 21, 2023 at 10:51:36PM -0500, Jason Merrill wrote:
> Actually, let's go back to the previous message, but change the tf_nones
> above to 'complain' so that we see those errors and then this explanation.
> Likewise with the conversion checks later in the function.

So like this?
Besides what you asked for, I've separated the diagnostics for when the size
member isn't found in lookup vs. when data isn't found, because it looked
weird to get the same error twice, e.g. in the udlit-error1.C case.
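
For context, a minimal sketch of the kind of message object P2741R3 allows, i.e. one providing the size and data members mentioned above. `Message` and its contents are hypothetical and not taken from the patch's tests; the static_assert itself is guarded by the feature-test macro value the patch predefines for C++26, so older compilers skip it.

```cpp
#include <cstddef>

// Hypothetical message type: anything with constexpr size() and data()
// members of suitable types can serve as a static_assert message.
struct Message
{
  const char *ptr;
  std::size_t len;

  template <std::size_t N>
  constexpr Message (const char (&s)[N]) : ptr (s), len (N - 1) {}

  constexpr std::size_t size () const { return len; }
  constexpr const char *data () const { return ptr; }
};

#if __cpp_static_assert >= 202306L
// Only compiled where user-generated messages are supported.
static_assert (sizeof (long long) >= 8,
               Message ("long long is narrower than 64 bits"));
#endif
```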

So far tested with
GXX_TESTSUITE_STDS=98,11,14,17,20,23,26 make check-g++ 
RUNTESTFLAGS="dg.exp='static_assert1.C feat-cxx26.C udlit-error1.C'"
Ok if it passes full bootstrap/regtest?

2023-11-22  Jakub Jelinek  

PR c++/110348
gcc/
* doc/invoke.texi (-Wno-c++26-extensions): Document.
gcc/c-family/
* c.opt (Wc++26-extensions): New option.
* c-cppbuiltin.cc (c_cpp_builtins): For C++26 predefine
__cpp_static_assert to 202306L rather than 201411L.
gcc/cp/
* parser.cc: Implement C++26 P2741R3 - user-generated static_assert
messages.
(cp_parser_static_assert): Parse message argument as
conditional-expression if it is not a pure string literal or
several of them concatenated followed by closing paren.
* semantics.cc (finish_static_assert): Handle message which is not
STRING_CST.  For condition with bare parameter packs return early.
* pt.cc (tsubst_expr) : Also tsubst_expr
message and make sure that if it wasn't originally STRING_CST, it
isn't after tsubst_expr either.
gcc/testsuite/
* g++.dg/cpp26/static_assert1.C: New test.
* g++.dg/cpp26/feat-cxx26.C (__cpp_static_assert): Expect
202306L rather than 201411L.
* g++.dg/cpp0x/udlit-error1.C: Expect different diagnostics for
static_assert with user-defined literal.
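
For readers who have not looked at P2741R3 yet, here is a minimal sketch (not
taken from the patch or its testcases) of the kind of code the new form is
meant to accept.  The fixed_message type and the make_message helper are
hypothetical names made up for illustration; the only requirement assumed
here is that the message expression offers size () and data () members
usable in constant expressions.  The #if keys off the __cpp_static_assert
value the patch predefines for C++26, so older compilers fall back to the
string-literal form.

```cpp
#include <cstddef>

// Hypothetical message type (illustration only, not from the patch).
// The C++26 form accepts an expression whose size () and data () members
// can be evaluated in constant expressions.
struct fixed_message
{
  const char *text;
  std::size_t len;
  constexpr std::size_t size () const { return len; }
  constexpr const char *data () const { return text; }
};

// Hypothetical helper: wrap a string literal, dropping the trailing NUL.
template <std::size_t N>
constexpr fixed_message
make_message (const char (&s)[N])
{
  return fixed_message { s, N - 1 };
}

constexpr fixed_message msg = make_message ("int is narrower than 32 bits");

#if __cpp_static_assert >= 202306L
// C++26 form: the message is a constant expression, not a string literal.
static_assert (sizeof (int) >= 4, msg);
#else
// Earlier dialects: only a string literal is accepted as the message.
static_assert (sizeof (int) >= 4, "int is narrower than 32 bits");
#endif
```

With a compiler that does not yet predefine the 202306L value, only the #else
branch is compiled, so the sketch should build in any dialect from C++11 on.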

--- gcc/doc/invoke.texi.jj  2023-11-22 10:14:56.021376360 +0100
+++ gcc/doc/invoke.texi 2023-11-22 10:17:41.328065157 +0100
@@ -9107,6 +9107,13 @@ Do not warn about C++23 constructs in co
 an older C++ standard.  Even without this option, some C++23 constructs
 will only be diagnosed if @option{-Wpedantic} is used.

+@opindex Wc++26-extensions
+@opindex Wno-c++26-extensions
+@item -Wno-c++26-extensions @r{(C++ and Objective-C++ only)}
+Do not warn about C++26 constructs in code being compiled using
+an older C++ standard.  Even without this option, some C++26 constructs
+will only be diagnosed if @option{-Wpedantic} is used.
+
 @opindex Wcast-qual
 @opindex Wno-cast-qual
 @item -Wcast-qual
--- gcc/c-family/c.opt.jj   2023-11-22 10:14:55.963377171 +0100
+++ gcc/c-family/c.opt  2023-11-22 10:17:41.328065157 +0100
@@ -498,6 +498,10 @@ Wc++23-extensions
 C++ ObjC++ Var(warn_cxx23_extensions) Warning Init(1)
 Warn about C++23 constructs in code compiled with an older standard.

+Wc++26-extensions
+C++ ObjC++ Var(warn_cxx26_extensions) Warning Init(1)
+Warn about C++26 constructs in code compiled with an older standard.
+
 Wcast-function-type
 C ObjC C++ ObjC++ Var(warn_cast_function_type) Warning EnabledBy(Wextra)
 Warn about casts between incompatible function types.
--- gcc/c-family/c-cppbuiltin.cc.jj 2023-11-22 10:14:55.962377185 +0100
+++ gcc/c-family/c-cppbuiltin.cc 2023-11-22 10:17:41.329065143 +0100
@@ -1023,7 +1023,8 @@ c_cpp_builtins (cpp_reader *pfile)
{
  /* Set feature test macros for C++17.  */
  cpp_define (pfile, "__cpp_unicode_characters=201411L");
- cpp_define (pfile, "__cpp_static_assert=201411L");
+ if (cxx_dialect <= cxx23)
+   cpp_define (pfile, "__cpp_static_assert=201411L");
  cpp_define (pfile, "__cpp_namespace_attributes=201411L");
  cpp_define (pfile, "__cpp_enumerator_attributes=201411L");
  cpp_define (pfile, "__cpp_nested_namespace_definitions=201411L");
@@ -1086,6 +1087,7 @@ c_cpp_builtins (cpp_reader *pfile)
{
  /* Set feature test macros for C++26.  */
  cpp_define (pfile, "__cpp_constexpr=202306L");
+ cpp_define (pfile, "__cpp_static_assert=202306L");
}
   if (flag_concepts)
 {
--- gcc/cp/parser.cc.jj 2023-11-22 10:14:55.969377087 +0100
+++ gcc/cp/parser.cc 2023-11-22 10:17:41.335065058 +0100
@@ -16616,6 +16616,7 @@ cp_parser_linkage_specification (cp_pars
static_assert-declaration:
  static_assert ( constant-expression , string-literal ) ;
  static_assert ( constant-expression ) ; (C++17)
+ static_assert ( constant-expression, conditional-expression ) ; (C++26)

If MEMBER_P, this static_assert is a class member.  */

@@ -16646,10 +16647,10 @@ cp_parser_static_assert (cp_parser *pars

   /* Parse the constant-expression.  Allow a non-constant expression
  here in order to give better diagnostics in finish_static_assert.  */
-  condition =
-cp_parser_constant_expression (parser,
-   /*allow_non_constant_p=*/true,
-

Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Christoph Müllner

Hi Juzhe,

Sorry for the late reply, but I was not on CC, so I missed this email.

On Fri, Nov 17, 2023 at 2:41 PM juzhe.zh...@rivai.ai
 wrote:
>
> Ok. I just read the theadvector extension.
>
> https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
>
> Theadvector is not custom extension. Just a uarch to disable some of the 
> RVV1.0 extension
> Theadvector can be considered as subextension of 'V' extension with disabling 
> some of the
> instructions and adding some new thead vector target load/store (This is 
> another story).
>
> So, for disabling the instruction that theadvector doesn't support.
> You don't need to touch such many codes.
>
> Here is a much simpler approach to do (I think it's definitely working):
> 1. Don't change any codes in vector.md and keep GCC generates ASM with "th." 
> prefix.
> 2. Add !TARGET_THEADVECTOR into vector-iterator.md to disable the mode you 
> don't want.
> For example , theadvector doesn't support fractional vector.
>
> Then it's pretty simple:
>
> RVVMF2SI "TARGET_VECTOR && !TARGET_THEADVECTOR".
>
> 3. Remove all the tests you add in this patch.
> 4. You can add theadvector specific load/store for example, th.vlb 
> instructions they are allowed.
> 5. Modify binutils, and make th.vmulh.vv as the pseudo instruction of vmulh.vv
> 6. So with compile option "-S", you will still see ASM as  "vmulh.vv". but 
> with objdump, you will see th.vmulh.vv.

Yes, all these points sound reasonable, to minimize the patchset size.
I believe in point 1 you meant "without th. prefix".

I've added Jin Ma (who is the main author of the Binutils patchset) so
he is also aware of the proposal to use pseudo instructions to avoid
duplication in Binutils.

Thank you very much!
Christoph


>
> After this change, you can send V2, then I can continue to review on GCC-15.
>
> Thanks.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: juzhe.zh...@rivai.ai
> Date: 2023-11-17 19:39
> To: gcc-patches
> CC: kito.cheng; kito.cheng; cooper.joshua; Robin Dapp; jeffreyalaw
> Subject: RISC-V: Support XTheadVector extensions
> 90% theadvector extension reusing current RVV 1.0 instructions patterns:
> Just change ASM, For example:
>
> @@ -2923,7 +2923,7 @@ (define_insn "*pred_mulh_scalar"
>   (match_operand:VFULLI_D 3 "register_operand"  "vr,vr, vr, vr")] VMULH)
>(match_operand:VFULLI_D 2 "vector_merge_operand" "vu, 0, vu,  0")))]
>"TARGET_VECTOR"
> -  "vmulh.vx\t%0,%3,%z4%p1"
> +  "%^vmulh.vx\t%0,%3,%z4%p1"
>[(set_attr "type" "vimul")
> (set_attr "mode" "")])
>
> +  if (letter == '^')
> +{
> +  if (TARGET_XTHEADVECTOR)
> + fputs ("th.", file);
> +  return;
> +}
>
>
> For almost all patterns, you just simply append "th." in the ASM prefix.
> like change "vmulh.vv" -> "th.vmulh.vv"
>
> Almost all theadvector instructions are not new features,  all same as RVV1.0.
> Why do you invent the such ISA doesn't include any features that RVV1.0 
> doesn't satisfy ?
>
> I am not explicitly object this patch. But I should know the reason.
>
> Btw, stage 1 will close soon.  So I will review this patch on GCC-15 as long 
> as all other RISC-V maintainers agree.
>
>
> 
> juzhe.zh...@rivai.ai
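
As a rough illustration of the "%^" idea in the quoted hunk, the sketch below
expands that marker in an output template to "th." when the vendor extension
is enabled and to nothing otherwise.  It is a standalone toy, not GCC's
operand printer: expand_vendor_prefix is a hypothetical name, and other
escapes such as "%0" or "%z4" are deliberately left untouched.

```cpp
#include <cstddef>
#include <string>

// Toy stand-in for the '^' handling quoted above (hypothetical helper,
// not a GCC function).  "%^" becomes "th." when XTHEADVECTOR is true and
// is dropped otherwise; every other character is copied unchanged.
std::string
expand_vendor_prefix (const std::string &asm_template, bool xtheadvector)
{
  std::string out;
  for (std::size_t i = 0; i < asm_template.size (); ++i)
    {
      if (asm_template[i] == '%'
          && i + 1 < asm_template.size ()
          && asm_template[i + 1] == '^')
        {
          if (xtheadvector)
            out += "th.";
          ++i;  // Skip the '^' as well.
        }
      else
        out += asm_template[i];
    }
  return out;
}
```

For the template "%^vmulh.vx\t%0,%3,%z4%p1" this yields a "th.vmulh.vx"
mnemonic with the extension enabled and plain "vmulh.vx" without it.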

Re: [PATCH 0/5] Add support for operand-specific alignment requirements

2023-11-22 Thread Richard Sandiford

"juzhe.zh...@rivai.ai"  writes:
> Hi, Richard.
>
> Thanks for supporting register filter in IRA/LRA.
> I found it is useful for RVV since we have a set of widen operations that 
> allow source register overlap highpart of dest register group
>
> For example, if vsext.vf2 v0(dest consume reg v0 and reg v1), v1 (source 
> consume v1 only)
> I want to support the highpart overlap above. (Currently, we don't any 
> overlap between source and dest in such instructions).
>
> So, I wonder whether we can pass "machine_mode" into register filter. Ok, I 
> think it's too late since stage 1 closes. I wonder we can add it in GCC-15?

I think adding a mode would add too much overhead.  The mode would be
the mode of the operand, but with subregs, the mode of the operand can
be different from the mode of the RA allocno.  So it would no longer
be enough for the RA to calculate a bitmask of filters.  It would need
to remember which modes are used with those filters.

We'd also need to turn the current HARD_REG_SETs into [MAX_MACHINE_MODE]
arrays of HARD_REG_SETs.  (And there are now more than 256 machine modes
for riscv.)

The pattern that uses the constraints should already "know" the mode.
So if possible, I think it would be better to use different constraints
for different modes, using define_mode_attrs.

Thanks,
Richard

[PATCH]AArch64: fix aarch64_usubw pattern

2023-11-22 Thread Tamar Christina

Hi All,

It looks like during my pre-commit testrun I forgot to apply this patch
to the patch stack.  It had a typo in the element size.

It also looks like since the hi/lo operations take different element
counts for the assembler syntax that I can't have a unified pattern.

This splits it into two each :(

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Sorry for the breakage,
Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md
(aarch64_uaddw__zip,
 aarch64_usubw__zip): Split into...
(aarch64_uaddw_lo_zip, aarch64_uaddw_hi_zip,
aarch64_usubw_lo_zip, aarch64_usubw_hi_zip): ... This.
 

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/uxtl-combine-4.c: Fix typo.
* gcc.target/aarch64/uxtl-combine-5.c: Likewise.
* gcc.target/aarch64/uxtl-combine-6.c: Likewise.

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 75ee659871080ed28b9887990b7431682c283502..80e338bb8952140dd8be178cc8aed0c47b81c775 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4810,7 +4810,7 @@ (define_insn "aarch64_subw2_internal"
   [(set_attr "type" "neon_sub_widen")]
 )
 
-(define_insn "aarch64_usubw__zip"
+(define_insn "aarch64_usubw_lo_zip"
   [(set (match_operand: 0 "register_operand" "=w")
(minus:
  (match_operand: 1 "register_operand" "w")
@@ -4818,23 +4818,51 @@ (define_insn "aarch64_usubw__zip"
(unspec: [
(match_operand:VQW 2 "register_operand" "w")
(match_operand:VQW 3 "aarch64_simd_imm_zero")
-  ] PERM_EXTEND) 0)))]
+  ] UNSPEC_ZIP1) 0)))]
   "TARGET_SIMD"
-  "usubw\\t%0., %1., %2."
+  "usubw\\t%0., %1., %2."
   [(set_attr "type" "neon_sub_widen")]
 )
 
-(define_insn "aarch64_uaddw__zip"
+(define_insn "aarch64_uaddw_lo_zip"
   [(set (match_operand: 0 "register_operand" "=w")
(plus:
  (subreg:
(unspec: [
(match_operand:VQW 2 "register_operand" "w")
(match_operand:VQW 3 "aarch64_simd_imm_zero")
-  ] PERM_EXTEND) 0)
+  ] UNSPEC_ZIP1) 0)
  (match_operand: 1 "register_operand" "w")))]
   "TARGET_SIMD"
-  "uaddw\\t%0., %1., %2."
+  "uaddw\\t%0., %1., %2."
+  [(set_attr "type" "neon_add_widen")]
+)
+
+(define_insn "aarch64_usubw_hi_zip"
+  [(set (match_operand: 0 "register_operand" "=w")
+   (minus:
+ (match_operand: 1 "register_operand" "w")
+ (subreg:
+   (unspec: [
+   (match_operand:VQW 2 "register_operand" "w")
+   (match_operand:VQW 3 "aarch64_simd_imm_zero")
+  ] UNSPEC_ZIP2) 0)))]
+  "TARGET_SIMD"
+  "usubw2\\t%0., %1., %2."
+  [(set_attr "type" "neon_sub_widen")]
+)
+
+(define_insn "aarch64_uaddw_hi_zip"
+  [(set (match_operand: 0 "register_operand" "=w")
+   (plus:
+ (subreg:
+   (unspec: [
+   (match_operand:VQW 2 "register_operand" "w")
+   (match_operand:VQW 3 "aarch64_simd_imm_zero")
+  ] UNSPEC_ZIP2) 0)
+ (match_operand: 1 "register_operand" "w")))]
+  "TARGET_SIMD"
+  "uaddw2\\t%0., %1., %2."
   [(set_attr "type" "neon_add_widen")]
 )
 
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 2354315d7d249ccee46625d13b32678f1da1f087..a920de99ffca378ce518f378a35cbe2766877ee8 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -2645,9 +2645,6 @@ (define_int_iterator PERMUTEQ [UNSPEC_ZIP1Q UNSPEC_ZIP2Q
 (define_int_iterator OPTAB_PERMUTE [UNSPEC_ZIP1 UNSPEC_ZIP2
UNSPEC_UZP1 UNSPEC_UZP2])
 
-;; Permutes for zero extends
-(define_int_iterator PERM_EXTEND [UNSPEC_ZIP1 UNSPEC_ZIP2])
-
 (define_int_iterator REVERSE [UNSPEC_REV64 UNSPEC_REV32 UNSPEC_REV16])
 
 (define_int_iterator FRINT [UNSPEC_FRINTZ UNSPEC_FRINTP UNSPEC_FRINTM
@@ -3470,10 +3467,7 @@ (define_int_attr rev_op [(UNSPEC_REV64 "64") (UNSPEC_REV32 "32")
 (UNSPEC_REV16 "16")])
 
 (define_int_attr perm_hilo [(UNSPEC_UNPACKSHI "hi") (UNSPEC_UNPACKUHI "hi")
-   (UNSPEC_UNPACKSLO "lo") (UNSPEC_UNPACKULO "lo")
-   (UNSPEC_ZIP2 "hi") (UNSPEC_ZIP1 "lo")])
-
-(define_int_attr perm_index [(UNSPEC_ZIP2 "2") (UNSPEC_ZIP1 "")])
+   (UNSPEC_UNPACKSLO "lo") (UNSPEC_UNPACKULO "lo")])
 
 ;; Return true if the associated optab refers to the high-numbered lanes,
 ;; false if it refers to the low-numbered lanes.  The convention is for
diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c
index e1a9c4f5661a36ec7b2c5dc6f0fd85c42fcaac39..67944f70ecceff7ed833de86b76606547f3db76c 100755
--- a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c
+++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c
@@ -16,5 +16,5

RE: [PATCH]AArch64: fix aarch64_usubw pattern

2023-11-22 Thread Kyrylo Tkachov

Hi Tamar,

> -Original Message-
> From: Tamar Christina 
> Sent: Wednesday, November 22, 2023 10:20 AM
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> Marcus Shawcroft ; Kyrylo Tkachov
> ; Richard Sandiford 
> Subject: [PATCH]AArch64: fix aarch64_usubw pattern
> 
> Hi All,
> 
> It looks like during my pre-commit testrun I forgot to apply this patch
> to the patch stack.  It had a typo in the element size.
> 
> It also looks like since the hi/lo operations take different element
> counts for the assembler syntax that I can't have a unified pattern.
> 
> This splits it into two each :(
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Sorry for the breakage,
> Ok for master?

If I was pedantic I'd argue for the testsuite changes going in independently as 
obvious, but the patch is okay as is anyway.
So ok for trunk.
Thanks,
Kyrill

> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-simd.md
>   (aarch64_uaddw__zip,
>aarch64_usubw__zip): Split into...
>   (aarch64_uaddw_lo_zip, aarch64_uaddw_hi_zip,
>   "aarch64_usubw_lo_zip, "aarch64_usubw_hi_zip): ...
> This.
> 
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/aarch64/uxtl-combine-4.c: Fix typo.
>   * gcc.target/aarch64/uxtl-combine-5.c: Likewise.
>   * gcc.target/aarch64/uxtl-combine-6.c: Likewise.
> 
> --- inline copy of patch --
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-
> simd.md
> index
> 75ee659871080ed28b9887990b7431682c283502..80e338bb8952140dd8be178c
> c8aed0c47b81c775 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -4810,7 +4810,7 @@ (define_insn
> "aarch64_subw2_internal"
>[(set_attr "type" "neon_sub_widen")]
>  )
> 
> -(define_insn "aarch64_usubw__zip"
> +(define_insn "aarch64_usubw_lo_zip"
>[(set (match_operand: 0 "register_operand" "=w")
>   (minus:
> (match_operand: 1 "register_operand" "w")
> @@ -4818,23 +4818,51 @@ (define_insn
> "aarch64_usubw__zip"
>   (unspec: [
>   (match_operand:VQW 2 "register_operand" "w")
>   (match_operand:VQW 3 "aarch64_simd_imm_zero")
> -] PERM_EXTEND) 0)))]
> +] UNSPEC_ZIP1) 0)))]
>"TARGET_SIMD"
> -  "usubw\\t%0., %1.,
> %2."
> +  "usubw\\t%0., %1., %2."
>[(set_attr "type" "neon_sub_widen")]
>  )
> 
> -(define_insn "aarch64_uaddw__zip"
> +(define_insn "aarch64_uaddw_lo_zip"
>[(set (match_operand: 0 "register_operand" "=w")
>   (plus:
> (subreg:
>   (unspec: [
>   (match_operand:VQW 2 "register_operand" "w")
>   (match_operand:VQW 3 "aarch64_simd_imm_zero")
> -] PERM_EXTEND) 0)
> +] UNSPEC_ZIP1) 0)
> (match_operand: 1 "register_operand" "w")))]
>"TARGET_SIMD"
> -  "uaddw\\t%0., %1.,
> %2."
> +  "uaddw\\t%0., %1., %2."
> +  [(set_attr "type" "neon_add_widen")]
> +)
> +
> +(define_insn "aarch64_usubw_hi_zip"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (minus:
> +   (match_operand: 1 "register_operand" "w")
> +   (subreg:
> + (unspec: [
> + (match_operand:VQW 2 "register_operand" "w")
> + (match_operand:VQW 3 "aarch64_simd_imm_zero")
> +] UNSPEC_ZIP2) 0)))]
> +  "TARGET_SIMD"
> +  "usubw2\\t%0., %1., %2."
> +  [(set_attr "type" "neon_sub_widen")]
> +)
> +
> +(define_insn "aarch64_uaddw_hi_zip"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (plus:
> +   (subreg:
> + (unspec: [
> + (match_operand:VQW 2 "register_operand" "w")
> + (match_operand:VQW 3 "aarch64_simd_imm_zero")
> +] UNSPEC_ZIP2) 0)
> +   (match_operand: 1 "register_operand" "w")))]
> +  "TARGET_SIMD"
> +  "uaddw2\\t%0., %1., %2."
>[(set_attr "type" "neon_add_widen")]
>  )
> 
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index
> 2354315d7d249ccee46625d13b32678f1da1f087..a920de99ffca378ce518f378a35
> cbe2766877ee8 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -2645,9 +2645,6 @@ (define_int_iterator PERMUTEQ [UNSPEC_ZIP1Q
> UNSPEC_ZIP2Q
>  (define_int_iterator OPTAB_PERMUTE [UNSPEC_ZIP1 UNSPEC_ZIP2
>   UNSPEC_UZP1 UNSPEC_UZP2])
> 
> -;; Permutes for zero extends
> -(define_int_iterator PERM_EXTEND [UNSPEC_ZIP1 UNSPEC_ZIP2])
> -
>  (define_int_iterator REVERSE [UNSPEC_REV64 UNSPEC_REV32
> UNSPEC_REV16])
> 
>  (define_int_iterator FRINT [UNSPEC_FRINTZ UNSPEC_FRINTP UNSPEC_FRINTM
> @@ -3470,10 +3467,7 @@ (define_int_attr rev_op [(UNSPEC_REV64 "64")
> (UNSPEC_REV32 "32")
>(UNSPEC_REV16 "16")])
> 
>  (define_int_attr perm_hilo [(UNSPEC_UNPACKSHI "hi") (UNSPEC_UNPACKUHI
> "hi")
> - (UNSPEC_UNPACKSLO "lo") (UNSPEC_UNPACKULO
> "lo")
> - (UNSPEC_ZIP2 "hi") (UNSPEC_ZIP1 "lo")])
> -
> -(define_

Re: PING^1 [PATCH v3] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-11-22 Thread Richard Biener

On Wed, Nov 22, 2023 at 10:31 AM Kewen.Lin  wrote:
>
> on 2023/11/17 20:55, Alexander Monakov wrote:
> >
> > On Fri, 17 Nov 2023, Kewen.Lin wrote:
> >>> I don't think you can run cleanup_cfg after sched_init. I would suggest
> >>> to put it early in schedule_insns.
> >>
> >> Thanks for the suggestion, I placed it at the beginning of haifa_sched_init
> >> instead, since schedule_insns invokes haifa_sched_init, although the
> >> calls rgn_setup_common_sched_info and rgn_setup_sched_infos are executed
> >> ahead but they are all "setup" functions, shouldn't affect or be affected
> >> by this placement.
> >
> > I was worried because sched_init invokes df_analyze, and I'm not sure if
> > cfg_cleanup can invalidate it.
>
> Thanks for further explaining!  By scanning cleanup_cfg, it seems that it
> considers df, like compact_blocks checks df, try_optimize_cfg invokes
> df_analyze etc., but I agree that moving cleanup_cfg before sched_init
> makes more sense.
>
> >
> >>> I suspect this may be caused by invoking cleanup_cfg too late.
> >>
> >> By looking into some failures, I found that although cleanup_cfg is 
> >> executed
> >> there would be still some empty blocks left, by analyzing a few failures 
> >> there
> >> are at least such cases:
> >>   1. empty function body
> >>   2. block holding a label for return.
> >>   3. block without any successor.
> >>   4. block which becomes empty after scheduling some other block.
> >>   5. block which looks mergeable with its always successor but left.
> >>   ...
> >>
> >> For 1,2, there is one single successor EXIT block, I think they don't 
> >> affect
> >> state transition, for 3, it's the same.  For 4, it depends on if we can 
> >> have
> >> the assumption this kind of empty block doesn't have the chance to have 
> >> debug
> >> insn (like associated debug insn should be moved along), I'm not sure.  
> >> For 5,
> >> a reduced test case is:
> >
> > Oh, I should have thought of cases like these, really sorry about the slip
> > of attention, and thanks for showing a testcase for item 5. As Richard as
> > saying in his response, cfg_cleanup cannot be a fix here. The thing to check
> > would be changing no_real_insns_p to always return false, and see if the
> > situation looks recoverable (if it breaks bootstrap, regtest statistics of
> > a non-bootstrapped compiler are still informative).
>
> As you suggested, I forced no_real_insns_p to return false all the time, some
> issues got exposed, almost all of them are asserting NOTE_P insn shouldn't be
> encountered in those places, so the adjustments for most of them are just to
> consider NOTE_P or this kind of special block and so on.  One draft patch is
> attached, it can be bootstrapped and regress-tested on ppc64{,le} and x86.
> btw, it's without the previous cfg_cleanup adjustment (hope it can get more
> empty blocks and expose more issues).  The draft isn't qualified for code
> review but I hope it can provide some information on what kinds of changes
> are needed for the proposal.  If this is the direction which we all agree on,
> I'll further refine it and post a formal patch.  One thing I want to note is
> that this patch disable one assertion below:
>
> diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc
> index e5964f54ead..abd334864fb 100644
> --- a/gcc/sched-rgn.cc
> +++ b/gcc/sched-rgn.cc
> @@ -3219,7 +3219,7 @@ schedule_region (int rgn)
>  }
>
>/* Sanity check: verify that all region insns were scheduled.  */
> -  gcc_assert (sched_rgn_n_insns == rgn_n_insns);
> +  // gcc_assert (sched_rgn_n_insns == rgn_n_insns);
>
>sched_finish_ready_list ();
>
> Some cases can cause this assertion to fail, it's due to the mismatch on
> to-be-scheduled and scheduled insn counts.  The reason why it happens is that
> one block previously has only one INSN_P but while scheduling some other 
> blocks
> it gets moved as well then we ends up with an empty block so that the only
> NOTE_P insn was counted then, but since this block isn't empty initially and
> NOTE_P gets skipped in a normal block, the count to-be-scheduled can't count
> it in.  It can be fixed with special-casing this kind of block for counting
> like initially recording which block is empty and if a block isn't recorded
> before then fix up the count for it accordingly.  I'm not sure if someone may
> have an argument that all the complication make this proposal beaten by
> previous special-casing debug insn approach, looking forward to more comments.

Just a comment that the NOTE_P thing is odd - do we only ever have those for
otherwise empty BBs?  How are they skipped otherwise (and why does that not
work for otherwise empty BBs)?

Richard.

> BR,
> Kewen
>

[PATCH] tree: Fix up try_catch_may_fallthru [PR112619]

2023-11-22 Thread Jakub Jelinek

Hi!

The following testcase ICEs with -std=c++98 since r14-5086 because
block_may_fallthru is called on a TRY_CATCH_EXPR whose second operand
is a MODIFY_EXPR rather than STATEMENT_LIST, which try_catch_may_fallthru
apparently expects.
I've been wondering whether that isn't some kind of FE bug and whether
there isn't some unwritten rule that second operand of TRY_CATCH_EXPR
must be a STATEMENT_LIST, but then I tried
--- gcc/gimplify.cc 2023-07-19 14:23:42.409875238 +0200
+++ gcc/gimplify.cc 2023-11-22 11:07:50.511000206 +0100
@@ -16730,6 +16730,10 @@ gimplify_expr (tree *expr_p, gimple_seq
   Note that this only affects the destructor calls in FINALLY/CATCH
   block, and will automatically reset to its original value by the
   end of gimplify_expr.  */
+if (TREE_CODE (*expr_p) == TRY_CATCH_EXPR
+   && TREE_OPERAND (*expr_p, 1)
+   && TREE_CODE (TREE_OPERAND (*expr_p, 1)) != STATEMENT_LIST)
+ gcc_unreachable ();
input_location = UNKNOWN_LOCATION;
eval = cleanup = NULL;
gimplify_and_add (TREE_OPERAND (*expr_p, 0), &eval);
hack in gcc 13 and triggered on hundreds of tests there within just 5
seconds of running make check-g++ -j32 (and in cases I looked at had nothing
to do with the r14-5086 backports), so I believe this is just bad
assumption on the try_catch_may_fallthru side, gimplify.cc certainly doesn't
care, it just calls gimplify_and_add (TREE_OPERAND (*expr_p, 1), &cleanup);
on it.  So, IMHO non-STATEMENT_LIST in the second operand is equivalent to
a STATEMENT_LIST containing a single statement.

Unfortunately, I don't see an easy way to create an artificial tree iterator
from just a single tree statement, so the patch duplicates what the loops
later do (after all, it is very simple, just didn't want to duplicate
also the large comments explaining it, so the 3 "See below." comments).
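
To make the "single statement is the same as a one-element STATEMENT_LIST"
reading concrete, here is a toy model in plain C++.  It does not use GCC's
tree API: Kind, Node and both functions are hypothetical names, the
per-kind results simply follow the switch in the hunk below, and the loop
over several catch clauses as well as the empty-list case are assumptions
of the sketch.

```cpp
#include <vector>

// Toy stand-ins for the tree codes involved (not GCC's real types).
enum class Kind { Catch, EhFilter, StatementList, Other };

struct Node
{
  Kind kind;
  bool body_may_fallthru;     // Meaningful for Catch and EhFilter only.
  std::vector<Node> stmts;    // Meaningful for StatementList only.
};

// Classify a non-empty run of handler statements by its first element.
static bool
handlers_may_fallthru (const Node *first, const Node *last)
{
  switch (first->kind)
    {
    case Kind::Catch:
      // Assumption: any catch body that falls through is enough.
      for (const Node *p = first; p != last; ++p)
        if (p->body_may_fallthru)
          return true;
      return false;

    case Kind::EhFilter:
      return first->body_may_fallthru;

    default:
      // Cleanup-style handler: never falls through on its own.
      return false;
    }
}

// A bare statement is treated exactly like a list holding only it.
bool
handler_operand_may_fallthru (const Node &op)
{
  if (op.kind != Kind::StatementList)
    return handlers_may_fallthru (&op, &op + 1);
  if (op.stmts.empty ())
    return false;  // Assumption of this sketch only.
  return handlers_may_fallthru (op.stmts.data (),
                                op.stmts.data () + op.stmts.size ());
}
```

The actual patch duplicates the switch instead, because there is no handy
way to build an iterator over a single tree statement.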

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Shall it go to branches as well given that r14-5086 has been backported
to those branches as well?

2023-11-22  Jakub Jelinek  

PR c++/112619
* tree.cc (try_catch_may_fallthru): If second operand of
TRY_CATCH_EXPR is not a STATEMENT_LIST, handle it as if it was a
STATEMENT_LIST containing a single statement.

* g++.dg/eh/pr112619.C: New test.

--- gcc/tree.cc.jj  2023-11-14 09:24:28.436530018 +0100
+++ gcc/tree.cc 2023-11-21 19:19:19.384347469 +0100
@@ -12573,6 +12573,24 @@ try_catch_may_fallthru (const_tree stmt)
   if (block_may_fallthru (TREE_OPERAND (stmt, 0)))
 return true;
 
+  switch (TREE_CODE (TREE_OPERAND (stmt, 1)))
+{
+case CATCH_EXPR:
+  /* See below.  */
+  return block_may_fallthru (CATCH_BODY (TREE_OPERAND (stmt, 1)));
+
+case EH_FILTER_EXPR:
+  /* See below.  */
+  return block_may_fallthru (EH_FILTER_FAILURE (TREE_OPERAND (stmt, 1)));
+
+case STATEMENT_LIST:
+  break;
+
+default:
+  /* See below.  */
+  return false;
+}
+
   i = tsi_start (TREE_OPERAND (stmt, 1));
   switch (TREE_CODE (tsi_stmt (i)))
 {
--- gcc/testsuite/g++.dg/eh/pr112619.C.jj   2023-11-21 19:22:47.437439283 +0100
+++ gcc/testsuite/g++.dg/eh/pr112619.C  2023-11-21 19:22:24.887754376 +0100
@@ -0,0 +1,15 @@
+// PR c++/112619
+// { dg-do compile }
+
+struct S { S (); ~S (); };
+
+S
+foo (int a, int b)
+{
+  if (a || b)
+{
+  S s;
+  return s;
+}
+  return S ();
+}

Jakub

[committed] testsuite: Add testcase for already fixed PR112518

2023-11-22 Thread Jakub Jelinek

Hi!

This PR has been fixed by the PR112526 fix.

Tested on x86_64-linux and i686-linux, committed to trunk as obvious.

2023-11-22  Jakub Jelinek  

PR target/65368
* gcc.target/i386/bmi2-pr112518.c: New test.

--- gcc/testsuite/gcc.target/i386/bmi2-pr112518.c.jj 2023-11-21 19:55:21.948132480 +0100
+++ gcc/testsuite/gcc.target/i386/bmi2-pr112518.c   2023-11-21 19:57:13.694570877 +0100
@@ -0,0 +1,25 @@
+/* PR target/65368 */
+/* { dg-do run { target { bmi2 && int128 } } } */
+/* { dg-options "-Os -mbmi2" } */
+
+#include "bmi2-check.h"
+
+unsigned u;
+int g;
+
+unsigned long long
+foo (int i)
+{
+  unsigned long long x = u;
+  g = __builtin_mul_overflow_p (u, ((unsigned __int128) 4292468825) << 64 | 150, 0);
+  x |= g % i;
+  return x;
+}
+
+static __attribute__((noipa)) void
+bmi2_test ()
+{
+  unsigned long long x = foo (3);
+  if (x)
+__builtin_abort ();
+}

Jakub

Re: Fix 'gcc.dg/tree-ssa/return-value-range-1.c' for 'char' defaulting to 'unsigned' (was: Propagate value ranges of return values)

2023-11-22 Thread Christophe Lyon

Hi!

On Tue, 21 Nov 2023 at 22:24, Thomas Schwinge  wrote:
>
> Hi!
>
> On 2023-11-19T16:05:42+0100, Jan Hubicka  wrote:
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/return-value-range-1.c
>
> Pushed to master branch commit a0240662b22312ffb3e3fefb85f258ab0e7010f4
> "Fix 'gcc.dg/tree-ssa/return-value-range-1.c' for 'char' defaulting to
> 'unsigned'", see attached.  On powerpc64le-linux-gnu ('char' defaulting
> to 'unsigned') I still saw:
>
> /tmp/ccd1xwD7.o: In function `test':
> return-value-range-1.c:(.text+0x50): undefined reference to `link_error'
>
We do see the same error in our CI (Thomas, you should have received
a notification because your patch turned an ERROR into a FAIL)

Thomas, you said in another email that adding -O2 avoids the linker
error with missing link_error(), but I don't see how that would be
possible?
(and hence I expect the error you quoted above to happen)
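
To spell out why the signedness of plain char matters here, a small sketch,
assuming an 8-bit char: where char is unsigned, a value above 200 survives
the round trip through char, so the comparison in test () cannot be folded
away and the link_error call stays.  through_char mirrors a () from the
quoted test; char_can_exceed is a hypothetical helper added for
illustration.

```cpp
#include <climits>

// Mirrors a () from the quoted test: the int argument is narrowed to
// plain char and widened back to int on return.
int
through_char (int d)
{
  return static_cast<char> (d);
}

// Hypothetical helper: true when plain char can hold a value above LIMIT.
// With an 8-bit char this holds for LIMIT == 200 only if char is unsigned.
bool
char_can_exceed (int limit)
{
  return CHAR_MAX > limit;
}
```

On a target where char is signed, char_can_exceed (200) is false, which is
the property the range propagation in the test relies on.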

So should we use dg-compile instead of dg-link? Not sure what the
original intention was?

Thanks,

Christophe

>
> Grüße
>  Thomas
>
>
> > @@ -0,0 +1,22 @@
> > +/* { dg-do ling } */
> > +/* { dg-options "-O1 -dump-tree-evrp-details" } */
> > +__attribute__ ((__noinline__))
> > +int a(char c)
> > +{
> > + return c;
> > +}
> > +void link_error ();
> > +
> > +void
> > +test(int d)
> > +{
> > + if (a(d) > 200)
> > + link_error ();
> > +}
> > +int
> > +main(int argc, char **argv)
> > +{
> > + test(argc);
> > + return 0;
> > +}
> > +/* { dg-final { scan-tree-dump-times "Recording return range" 2 "evrp"} } 
> > */
>
>
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955

Re: Darwin: Replace environment runpath with embedded [PR88590]

2023-11-22 Thread Iain Sandoe

Hi FX,

> On 17 Nov 2023, at 14:20, FX Coudert  wrote:
> 
>>> I have done a full rebuild, and having looked more at the structure of 
>>> libtool.m4 I am now convinced that having that line outside of the scope of 
>>> _LT_DARWIN_LINKER_FEATURES is simply wrong (probably a copy-pasto or 
>>> leftover from earlier code).

The latter; my original patch to do this had all the work inside libtool.m4,
but that proved to be unwieldy in practice, so the eventual patch split the
ENABLE_DARWIN_AT_RPATH AM_CONDITIONAL out and applies it (after libtool is
initialised) in the configure.ac cases that need it.

It seems I failed to remove all the old code in that change :(.

>>> Having rebuilt everything, it only manifests itself in 
>>> fixincludes/ChangeLog. Iain is traveling right now, but when he is back I 
>>> would like to submit this patch if he agrees with the above. It was 
>>> regtested on x86_64-apple-darwin21.

I have also regtested on i686, x86_64, aarch64 Darwin, x86_64 and aarch64 Linux.
> 
> With the correct patch attached.

I believe this can be applied as a partial reversion of a previously approved 
patch,
thanks
Iain

> 
> <0001-Build-fix-error-in-fixinclude-configure.patch>

[PATCH] RISC-V: Fix incorrect use of vcompress in permutation auto-vectorization

2023-11-22 Thread Juzhe-Zhong

This patch fixes the following FAILs on zvl512b of an RV32 system:

FAIL: gcc.target/riscv/rvv/autovec/struct/struct_vect_run-12.c execution test
FAIL: gcc.target/riscv/rvv/autovec/struct/struct_vect_run-9.c execution test

The root cause is that the vcompress optimization is applied to the
permutation indices {0,3,7,0}, which is incorrect.  Fix this vcompress
optimization bug.
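
A simplified sketch of the two shape checks after this patch, using plain
ints in place of the poly_int indices.  compress_shape_ok is a hypothetical
name, and the compress point is passed in by the caller because its
computation is not part of the hunk below; the upper-bound test on it is an
addition of the sketch.

```cpp
#include <vector>

// Simplified stand-in for the checks in shuffle_compress_patterns.
// PERM holds constant permutation indices; COMPRESS_POINT is supplied by
// the caller (assumption: computed elsewhere, as in the real function).
bool
compress_shape_ok (const std::vector<int> &perm, int compress_point)
{
  const int vlen = static_cast<int> (perm.size ());
  if (compress_point < 0 || compress_point >= vlen)
    return false;

  // Indices before the compress point must be strictly increasing.
  for (int i = 1; i < compress_point; i++)
    if (perm[i] <= perm[i - 1])
      return false;

  // From the compress point on, each index must be its predecessor + 1.
  for (int i = compress_point + 1; i < vlen; i++)
    if (perm[i] != perm[i - 1] + 1)
      return false;

  return true;
}
```

With a compress point of 2, the indices {0,3,7,0} are rejected because the
last element does not continue the series started at 7.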

PR target/112598

gcc/ChangeLog:

* config/riscv/riscv-v.cc (shuffle_compress_patterns): Fix vcompress bug.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr112598-3.c: New test.

---
 gcc/config/riscv/riscv-v.cc   | 15 ++---
 .../gcc.target/riscv/rvv/autovec/pr112598-3.c | 21 +++
 2 files changed, 29 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 7d6d0821d87..7d3e8038dab 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3005,14 +3005,15 @@ shuffle_compress_patterns (struct expand_vec_perm_d *d)
   if (compress_point < 0)
 return false;
 
-  /* It must be series increasing from compress point.  */
-  if (!d->perm.series_p (compress_point, 1, d->perm[compress_point], 1))
-return false;
-
   /* We can only apply compress approach when all index values from 0 to
  compress point are increasing.  */
   for (int i = 1; i < compress_point; i++)
-if (known_le (d->perm[i], d->perm[i - 1]))
+if (maybe_le (d->perm[i], d->perm[i - 1]))
+  return false;
+
+  /* It must be series increasing from compress point.  */
+  for (int i = 1 + compress_point; i < vlen; i++)
+if (maybe_ne (d->perm[i], d->perm[i - 1] + 1))
   return false;
 
   /* Success!  */
@@ -3080,10 +3081,10 @@ shuffle_compress_patterns (struct expand_vec_perm_d *d)
   if (need_slideup_p)
 {
   int slideup_cnt = vlen - (d->perm[vlen - 1].to_constant () % vlen) - 1;
-  rtx ops[] = {d->target, d->op1, gen_int_mode (slideup_cnt, Pmode)};
+  merge = gen_reg_rtx (vmode);
+  rtx ops[] = {merge, d->op1, gen_int_mode (slideup_cnt, Pmode)};
   insn_code icode = code_for_pred_slide (UNSPEC_VSLIDEUP, vmode);
   emit_vlmax_insn (icode, BINARY_OP, ops);
-  merge = d->target;
 }
 
   insn_code icode = code_for_pred_compress (vmode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c
new file mode 100644
index 000..231a068c680
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv_zvfh_zfh_zvl512b -mabi=ilp32d -O3 -ftree-vectorize -std=c99 -fno-vect-cost-model" } */
+
+#include 
+#define TYPE uint64_t
+#define ITYPE int64_t
+
+void __attribute__ ((noinline, noclone))
+foo (TYPE *__restrict a, TYPE *__restrict b, TYPE *__restrict c,
+TYPE *__restrict d, ITYPE n)
+{
+  for (ITYPE i = 0; i < n; ++i)
+{
+  d[i * 3] = a[i];
+  d[i * 3 + 1] = b[i];
+  d[i * 3 + 2] = c[i];
+}
+}
+
+/* We don't want vcompress.vv.  */
+/* { dg-final { scan-assembler-not {vcompress\.vv} } } */
-- 
2.36.3

Re: Darwin: Replace environment runpath with embedded [PR88590]

2023-11-22 Thread FX Coudert

Hi,

> I believe this can be applied as a partial reversion of a previously approved 
> patch,

Yes, that makes sense.
Pushed as 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ce966ae66067d8d365431ef7a323f4207fcb729a

FX

Re: Fix 'gcc.dg/tree-ssa/return-value-range-1.c' for 'char' defaulting to 'unsigned' (was: Propagate value ranges of return values)

2023-11-22 Thread Thomas Schwinge

Hi!

On 2023-11-22T11:51:02+0100, Christophe Lyon  wrote:
> On Tue, 21 Nov 2023 at 22:24, Thomas Schwinge  wrote:
>> On 2023-11-19T16:05:42+0100, Jan Hubicka  wrote:
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/return-value-range-1.c
>>
>> Pushed to master branch commit a0240662b22312ffb3e3fefb85f258ab0e7010f4
>> "Fix 'gcc.dg/tree-ssa/return-value-range-1.c' for 'char' defaulting to
>> 'unsigned'", see attached.  On powerpc64le-linux-gnu ('char' defaulting
>> to 'unsigned') I still saw:
>>
>> /tmp/ccd1xwD7.o: In function `test':
>> return-value-range-1.c:(.text+0x50): undefined reference to `link_error'
>>
> We do see the same error in our CI (Thomas, normally you have received
> a notification because your patch turned ERROR in FAIL)

Yes, I have; and I even tried to log in there, to point to my commit
mentioned above, which is meant to address this issue -- please let me
know if you're still seeing the FAIL after that commit.

> Thomas, you said in another email that adding -O2 avoids the linker
> error with missing link_error(), but I don't see how that would be
> possible?

That's the gist of Honza's "Propagate value ranges of return values"
optimization, per my understanding: from 'int a(signed char c)' doing
'return c;' figure out that 'a(d) > 200' is always false (due to
'-128 <= c <= 127').
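
The range argument can be checked by brute force.  The following is only a
minimal sketch, not the testcase itself; 'widen_signed', 'widen_unsigned' and
the two '*_result_above' helpers are illustrative names.  It also shows why a
plain 'char' that defaults to 'unsigned' breaks the assumption:

```c
#include <limits.h>
#include <stdbool.h>

/* Mirrors 'int a(signed char c) { return c; }'.  */
static int
widen_signed (signed char c)
{
  return c;
}

/* Same function on a target where plain 'char' behaves as 'unsigned char'.  */
static int
widen_unsigned (unsigned char c)
{
  return c;
}

/* Return true if some signed char input makes the result exceed LIMIT.  */
static bool
any_signed_result_above (int limit)
{
  for (int v = SCHAR_MIN; v <= SCHAR_MAX; v++)
    if (widen_signed ((signed char) v) > limit)
      return true;
  return false;
}

/* Return true if some unsigned char input makes the result exceed LIMIT.  */
static bool
any_unsigned_result_above (int limit)
{
  for (int v = 0; v <= UCHAR_MAX; v++)
    if (widen_unsigned ((unsigned char) v) > limit)
      return true;
  return false;
}
```

With a signed argument no input can exceed 200, so the guarded call is dead;
with an unsigned argument inputs above 200 exist and the call must stay.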

> (and hence I expect the error you quoted above to happen)
>
> So should we use dg-compile instead of dg-link? Not sure what the
> original intention was?

No, the idea really is to prove that the 'link_error ()' call is
unreachable.


Regards
 Thomas


>> > @@ -0,0 +1,22 @@
>> > +/* { dg-do ling } */
>> > +/* { dg-options "-O1 -dump-tree-evrp-details" } */
>> > +__attribute__ ((__noinline__))
>> > +int a(char c)
>> > +{
>> > + return c;
>> > +}
>> > +void link_error ();
>> > +
>> > +void
>> > +test(int d)
>> > +{
>> > + if (a(d) > 200)
>> > + link_error ();
>> > +}
>> > +int
>> > +main(int argc, char **argv)
>> > +{
>> > + test(argc);
>> > + return 0;
>> > +}
>> > +/* { dg-final { scan-tree-dump-times "Recording return range" 2 "evrp"} } 
>> > */

Re: [PATCH 10/11] aarch64: Add new load/store pair fusion pass.

2023-11-22 Thread Richard Sandiford

Alex Coplan  writes:
> This is a v3 of the aarch64 load/store pair fusion pass.
> v2 was posted here:
>  - https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633601.html
>
> The main changes since v2 are as follows:
>
> We now handle writeback opportunities as well.  E.g. for this testcase:
>
> void foo (long *p, long *q, long x, long y)
> {
>   do {
> *(p++) = x;
> *(p++) = y;
>   } while (p < q);
> }
>
> with the patch, we generate:
>
> foo:
> .LFB0:
> .align  3
> .L2:
> stp x2, x3, [x0], 16
> cmp x0, x1
> bcc .L2
> ret
>
> instead of:
>
> foo:
> .LFB0:
> .align  3
> .L2:
> str x2, [x0], 16
> str x3, [x0, -8]
> cmp x0, x1
> bcc .L2
> ret
>
> i.e. the pass is now capable of finding load/store pair opportunities even in
> the case that one or more of the initial candidate accesses uses writeback 
> addressing.
> We do this by adding a notion of canonicalizing RTL bases.  When we see a
> writeback access, we record that the new base def is equivalent to the 
> original
> def plus some offset.  When tracking accesses, we then canonicalize to track
> each access relative to the earliest equivalent base in the basic block.
>
> This allows us to spot that accesses are adjacent even though they don't share
> the same RTL-SSA base def.
>
> Furthermore, we also add some extra logic to opportunistically fold in 
> trailing
> destructive updates of the base register used for a load/store pair.  E.g. for
>
> void post_add (long *p, long *q, long x, long y)
> {
>   do {
> p[0] = x;
> p[1] = y;
> p += 2;
>   } while (p < q);
> }
>
> the auto-inc-dec pass doesn't currently form any writeback accesses, and we
> generate:
>
> post_add:
> .LFB0:
> .align  3
> .L2:
> add x0, x0, 16
> stp x2, x3, [x0, -16]
> cmp x0, x1
> bcc .L2
> ret
>
> but with the updated pass, we now get:
>
> post_add:
> .LFB0:
> .align  3
> .L2:
> stp x2, x3, [x0], 16
> cmp x0, x1
> bcc .L2
> ret
>
> Other notable changes to the pass since the last version include:
>  - We switch to using the aarch64_gen_{load,store}_pair interface
>for forming the (non-writeback) pairs, allowing use of the new
>load/store pair representation added by the earlier patch.
>  - The various updates to the load/store pair patterns mean that
>we no longer need to do mode canonicalization / mode unification
>in the pass, as the patterns allow arbitrary combinations of suitable modes
>of the same size.  So we remove the logic to do this (including the
>param to control the strategy).
>  - Fix up classification of zero operands to make sure that these are always
>treated as GPR operands for pair discovery purposes.  This avoids us
>pairing zero operands with FPRs in the pre-RA pass, which used to lead to
>undesirable codegen involving cross-file moves.
>  - We also remove the try_adjust_address logic from the previous iteration of
>the pass.  Since we validate all ldp/stp offsets in the pass, this only
>meant that we lost opportunities in the case that a given mem fails to
>adjust in its original mode.
>
> Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> gcc/ChangeLog:
>
>   * config.gcc: Add aarch64-ldp-fusion.o to extra_objs for aarch64; add
>   aarch64-ldp-fusion.cc to target_gtfiles.
>   * config/aarch64/aarch64-passes.def: Add copies of pass_ldp_fusion
>   before and after RA.
>   * config/aarch64/aarch64-protos.h (make_pass_ldp_fusion): Declare.
>   * config/aarch64/aarch64.opt (-mearly-ldp-fusion): New.
>   (-mlate-ldp-fusion): New.
>   (--param=aarch64-ldp-alias-check-limit): New.
>   (--param=aarch64-ldp-writeback): New.
>   * config/aarch64/t-aarch64: Add rule for aarch64-ldp-fusion.o.
>   * config/aarch64/aarch64-ldp-fusion.cc: New file.

Looks really good.  I'll probably need to do another pass over it,
but some initial comments below.

Main general comment is: it would be good to have more commentary.
Not "repeat the code in words" commentary, just comments that sketch
the intent or purpose of the following code, what the assumptions and
invariants are, etc.

> ---
>  gcc/config.gcc   |4 +-
>  gcc/config/aarch64/aarch64-ldp-fusion.cc | 2727 ++
>  gcc/config/aarch64/aarch64-passes.def|2 +
>  gcc/config/aarch64/aarch64-protos.h  |1 +
>  gcc/config/aarch64/aarch64.opt   |   23 +
>  gcc/config/aarch64/t-aarch64 |7 +
>  6 files changed, 2762 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/config/aarch64/aarch64-ldp-fusion.cc
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index c1460ca354e..8b7f6b20309 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -349,8 +349,8 @@ aarch64*-*-

[PATCH v3 1/8] sched-deps.cc (find_modifiable_mems): Avoid exponential behavior

2023-11-22 Thread Maxim Kuvyrkov

This patch avoids sched-deps.cc:find_inc() creating an exponential number
of dependencies, which become memory and compilation time hogs.
Consider this example (simplified from PR96388) ...
===
sp=sp-4 // sp_insnA
mem_insnA1[sp+A1]
...
mem_insnAN[sp+AN]
sp=sp-4 // sp_insnB
mem_insnB1[sp+B1]
...
mem_insnBM[sp+BM]
===

[For simplicity, let's assume find_inc(backwards==true)].
In this example find_modifiable_mems() will arrange for mem_insnA*
to be able to pass sp_insnA, and, while doing this, will create
dependencies between all mem_insnA*s and sp_insnB -- because sp_insnB
is a consumer of sp_insnA.  After this sp_insnB will have N new
backward dependencies.
Then find_modifiable_mems() gets to mem_insnB*s and starts to create
N new dependencies for _every_ mem_insnB*.  This gets us N*M new
dependencies.

In PR96388's testcase N and M are 10k-15k, which causes RAM usage of
30GB and compilation time of 30 minutes, with sched2 accounting for
95% of both metrics.  After this patch the RAM usage is down to 1GB
and compilation time is down to 3-4 minutes, with sched2 no longer
standing out on -ftime-report or memory usage.
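
As a rough illustration only, the growth in the simplified example above can
be modelled as below.  'model_new_deps' and 'too_many_deps_p' are hypothetical
names that do not exist in GCC; 'limit' stands in for
param_max_pending_list_length, and the predicate mirrors the shape of the
cut-off this patch adds to find_inc().

```c
#include <stdbool.h>

/* Count the dependencies created in the simplified model: N mem_insnA*
   insns before sp_insnB, then M mem_insnB* insns after it.  */
static long long
model_new_deps (long long n, long long m)
{
  /* Step 1: every mem_insnA* becomes a backward dependency of sp_insnB.  */
  long long sp_insn_b_back_deps = n;
  long long created = sp_insn_b_back_deps;

  /* Step 2: every mem_insnB* inherits all of sp_insnB's backward deps.  */
  for (long long i = 0; i < m; i++)
    created += sp_insn_b_back_deps;

  return created;
}

/* Shape of the guard: give up on an inc candidate once the product of the
   two list sizes reaches LIMIT squared.  */
static bool
too_many_deps_p (long long n_mem_deps, long long n_inc_deps, long long limit)
{
  return n_mem_deps * n_inc_deps >= limit * limit;
}
```

For N = M = 10000 the model creates N + N*M dependencies, which is the
growth the guard is meant to cut short.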

gcc/ChangeLog:

PR rtl-optimization/96388
PR rtl-optimization/111554
* sched-deps.cc (find_inc): Avoid exponential behavior.
---
 gcc/sched-deps.cc | 48 +++
 1 file changed, 44 insertions(+), 4 deletions(-)

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index c23218890f3..005fc0f567e 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -4779,24 +4779,59 @@ parse_add_or_inc (struct mem_inc_info *mii, rtx_insn 
*insn, bool before_mem)
 /* Once a suitable mem reference has been found and the corresponding data
in MII has been filled in, this function is called to find a suitable
add or inc insn involving the register we found in the memory
-   reference.  */
+   reference.
+   If successful, this function will create additional dependencies between
+   - mii->inc_insn's producers and mii->mem_insn as a consumer (if backwards)
+   - mii->inc_insn's consumers and mii->mem_insn as a producer (if !backwards).
+*/
 
 static bool
 find_inc (struct mem_inc_info *mii, bool backwards)
 {
   sd_iterator_def sd_it;
   dep_t dep;
+  sd_list_types_def mem_deps = backwards ? SD_LIST_HARD_BACK : SD_LIST_FORW;
+  int n_mem_deps = sd_lists_size (mii->mem_insn, mem_deps);
 
-  sd_it = sd_iterator_start (mii->mem_insn,
-backwards ? SD_LIST_HARD_BACK : SD_LIST_FORW);
+  sd_it = sd_iterator_start (mii->mem_insn, mem_deps);
   while (sd_iterator_cond (&sd_it, &dep))
 {
   dep_node_t node = DEP_LINK_NODE (*sd_it.linkp);
   rtx_insn *pro = DEP_PRO (dep);
   rtx_insn *con = DEP_CON (dep);
-  rtx_insn *inc_cand = backwards ? pro : con;
+  rtx_insn *inc_cand;
+  int n_inc_deps;
+
   if (DEP_NONREG (dep) || DEP_MULTIPLE (dep))
goto next;
+
+  if (backwards)
+   {
+ inc_cand = pro;
+ n_inc_deps = sd_lists_size (inc_cand, SD_LIST_BACK);
+   }
+  else
+   {
+ inc_cand = con;
+ n_inc_deps = sd_lists_size (inc_cand, SD_LIST_FORW);
+   }
+
+  /* In the FOR_EACH_DEP loop below we will create additional n_inc_deps
+for mem_insn.  This by itself is not a problem, since each mem_insn
+will have only a few inc_insns associated with it.  However, if
+we consider that a single inc_insn may have a lot of mem_insns, AND,
+on top of that, a few other inc_insns associated with it --
+those _other inc_insns_ will get (n_mem_deps * number of MEM insns)
+dependencies created for them.  This may cause an exponential
+growth of memory usage and scheduling time.
+See PR96388 for details.
+We [heuristically] use n_inc_deps as a proxy for the number of MEM
+insns, and drop opportunities for breaking modifiable_mem dependencies
+when dependency lists grow beyond reasonable size.  */
+  if (n_mem_deps * n_inc_deps
+ >= param_max_pending_list_length * param_max_pending_list_length)
+   goto next;
+
   if (parse_add_or_inc (mii, inc_cand, backwards))
{
  struct dep_replacement *desc;
@@ -4838,6 +4873,11 @@ find_inc (struct mem_inc_info *mii, bool backwards)
  desc->insn = mii->mem_insn;
  move_dep_link (DEP_NODE_BACK (node), INSN_HARD_BACK_DEPS (con),
 INSN_SPEC_BACK_DEPS (con));
+
+ /* Make sure that n_inc_deps above is consistent with dependencies
+we create.  */
+ gcc_assert (mii->inc_insn == inc_cand);
+
  if (backwards)
{
  FOR_EACH_DEP (mii->inc_insn, SD_LIST_BACK, sd_it, dep)
-- 
2.34.1

[PATCH v3 0/8] Avoid exponential behavior in scheduler and better logging

2023-11-22 Thread Maxim Kuvyrkov

Compared to v1 and v2 this is a complete patch series.

The debugging/dumping improvements gently touch IRA, RTL lists, and sel-sched 
bits to avoid re-inventing or copy-pasting the wheel.

Bootstrapping and regtesting these patches on aarch64-linux-gnu.  OK to merge?

Thanks!

===

This patch series fixes exponential behavior in scheduler's 
find_modifiable_mems(). This fixes PRs [1] and [2], which are compilation time 
and memory hogs.

The first patch in the series is the actual fix (bootstrapped and regtested on 
aarch64-linux-gnu), and the follow-up patches are improvements to the 
scheduler's logging infrastructure, which enabled me to debug this problem.  
As-is, the scheduler has good logging of the actual scheduling process, but 
the calculation of instruction dependencies has almost no logging.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96388
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111554

Maxim Kuvyrkov (8):
  sched-deps.cc (find_modifiable_mems): Avoid exponential behavior
  Unify implementations of print_hard_reg_set()
  Simplify handling of INSN_ and EXPR_LISTs in sched-rgn.cc
  Improve and fix sched-deps.cc: dump_dep() and dump_lists().
  Add a bit more logging scheduler's dependency analysis
  sched_deps.cc: Simplify initialization of dependency contexts
  Improve logging of register data in scheduler dependency analysis
  Improve logging of scheduler dependency analysis context

 gcc/hard-reg-set.h|   3 +
 gcc/ira-color.cc  |  17 +-
 gcc/ira-conflicts.cc  |  39 +
 gcc/lists.cc  |  30 +++-
 gcc/rtl.h |   4 +-
 gcc/sched-deps.cc | 399 +++---
 gcc/sched-int.h   |   9 +-
 gcc/sched-rgn.cc  |  56 +++---
 gcc/sel-sched-dump.cc |  21 +--
 gcc/sel-sched-dump.h  |   2 +-
 10 files changed, 467 insertions(+), 113 deletions(-)

-- 
2.34.1

[PATCH v3 5/8] Add a bit more logging scheduler's dependency analysis

2023-11-22 Thread Maxim Kuvyrkov

gcc/ChangeLog:

* sched-deps.cc (sd_add_dep, find_inc): Add logging about
dependency creation.
---
 gcc/sched-deps.cc | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index 4d357079a7a..2a87158ba4b 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -1342,6 +1342,13 @@ sd_add_dep (dep_t dep, bool resolved_p)
  in the bitmap caches of dependency information.  */
   if (true_dependency_cache != NULL)
 set_dependency_caches (dep);
+
+  if (sched_verbose >= 9)
+{
+  fprintf (sched_dump, "created dependency ");
+  dump_dep (sched_dump, dep, 1);
+  fprintf (sched_dump, "\n");
+}
 }
 
 /* Add or update backward dependence between INSN and ELEM
@@ -4879,18 +4886,33 @@ find_inc (struct mem_inc_info *mii, bool backwards)
 we create.  */
  gcc_assert (mii->inc_insn == inc_cand);
 
+ int n_deps_created = 0;
  if (backwards)
{
  FOR_EACH_DEP (mii->inc_insn, SD_LIST_BACK, sd_it, dep)
-   add_dependence_1 (mii->mem_insn, DEP_PRO (dep),
- REG_DEP_TRUE);
+   {
+ add_dependence_1 (mii->mem_insn, DEP_PRO (dep),
+   REG_DEP_TRUE);
+ ++n_deps_created;
+   }
}
  else
{
  FOR_EACH_DEP (mii->inc_insn, SD_LIST_FORW, sd_it, dep)
-   add_dependence_1 (DEP_CON (dep), mii->mem_insn,
- REG_DEP_ANTI);
+   {
+ add_dependence_1 (DEP_CON (dep), mii->mem_insn,
+   REG_DEP_ANTI);
+ ++n_deps_created;
+   }
}
+ if (sched_verbose >= 6)
+   fprintf (sched_dump,
+"created %d deps for mem_insn %d due to "
+"inc_insn %d %s deps\n",
+n_deps_created, INSN_UID (mii->mem_insn),
+INSN_UID (mii->inc_insn),
+backwards ? "backward" : "forward");
+
  return true;
}
 next:
-- 
2.34.1

[PATCH v3 6/8] sched_deps.cc: Simplify initialization of dependency contexts

2023-11-22 Thread Maxim Kuvyrkov

gcc/ChangeLog:

* sched-deps.cc (init_deps, init_deps_reg_last): Simplify.
(free_deps): Remove useless code.
---
 gcc/sched-deps.cc | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index 2a87158ba4b..e0d3c97d935 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -3927,10 +3927,9 @@ init_deps (class deps_desc *deps, bool lazy_reg_last)
   int max_reg = (reload_completed ? FIRST_PSEUDO_REGISTER : max_reg_num ());
 
   deps->max_reg = max_reg;
-  if (lazy_reg_last)
-deps->reg_last = NULL;
-  else
-deps->reg_last = XCNEWVEC (struct deps_reg, max_reg);
+  deps->reg_last = NULL;
+  if (!lazy_reg_last)
+init_deps_reg_last (deps);
   INIT_REG_SET (&deps->reg_last_in_use);
 
   deps->pending_read_insns = 0;
@@ -3961,9 +3960,7 @@ init_deps (class deps_desc *deps, bool lazy_reg_last)
 void
 init_deps_reg_last (class deps_desc *deps)
 {
-  gcc_assert (deps && deps->max_reg > 0);
-  gcc_assert (deps->reg_last == NULL);
-
+  gcc_assert (deps && deps->max_reg > 0 && deps->reg_last == NULL);
   deps->reg_last = XCNEWVEC (struct deps_reg, deps->max_reg);
 }
 
@@ -4013,8 +4010,6 @@ free_deps (class deps_desc *deps)
  it at all.  */
   free (deps->reg_last);
   deps->reg_last = NULL;
-
-  deps = NULL;
 }
 
 /* Remove INSN from dependence contexts DEPS.  */
-- 
2.34.1

[PATCH v3 2/8] Unify implementations of print_hard_reg_set()

2023-11-22 Thread Maxim Kuvyrkov

We currently have 3 implementations of print_hard_reg_set()
(all with the same name!) in ira-color.cc, ira-conflicts.cc, and
sel-sched-dump.cc.  This patch generalizes the implementation in
ira-color.cc and uses it in all other places.  The declaration
is added to hard-reg-set.h.

The motivation for this patch is the [upcoming] need for
print_hard_reg_set() in sched-deps.cc.
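
For readers unfamiliar with these dumpers, here is a simplified, standalone
sketch of the unified interface (an optional title followed by register
ranges).  It is not the GCC code: 'format_reg_set' and 'N_REGS' are made-up
names, a 32-bit mask stands in for HARD_REG_SET, and the output goes to a
buffer rather than a FILE.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define N_REGS 32

/* Write TITLE (if non-NULL) followed by the registers in SET to BUF,
   collapsing runs of consecutive registers into "first-last".  */
static void
format_reg_set (char *buf, size_t size, uint32_t set, const char *title)
{
  size_t len = 0;
  int start = -1, end = -1;

  if (size == 0)
    return;
  buf[0] = '\0';

  if (title)
    len += (size_t) snprintf (buf + len, size - len, "%s", title);

  /* Stop as soon as the buffer is full; the output is then truncated.  */
  for (int i = 0; i < N_REGS && len < size; i++)
    {
      int included = (set >> i) & 1;

      if (included)
        {
          if (start == -1)
            start = i;
          end = i;
        }
      /* Flush the current run when it ends or at the last register.  */
      if (start >= 0 && (!included || i == N_REGS - 1))
        {
          if (start == end)
            len += (size_t) snprintf (buf + len, size - len, " %d", start);
          else
            len += (size_t) snprintf (buf + len, size - len, " %d-%d",
                                      start, end);
          start = -1;
        }
    }
}
```

Passing NULL as the title corresponds to the callers that previously used the
title-less ira-color.cc variant.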

gcc/ChangeLog:

* hard-reg-set.h (print_hard_reg_set): Declare.
* ira-color.cc (print_hard_reg_set): Generalize a bit.
(debug_hard_reg_set, print_hard_regs_subforest,)
(setup_allocno_available_regs_num): Update.
* ira-conflicts.cc (print_hard_reg_set): Remove.
(print_allocno_conflicts): Use global print_hard_reg_set().
* sel-sched-dump.cc (print_hard_reg_set): Remove.
(dump_hard_reg_set): Use global print_hard_reg_set().
* sel-sched-dump.h (dump_hard_reg_set): Mark as DEBUG_FUNCTION.
---
 gcc/hard-reg-set.h|  3 +++
 gcc/ira-color.cc  | 17 ++---
 gcc/ira-conflicts.cc  | 39 ---
 gcc/sel-sched-dump.cc | 21 -
 gcc/sel-sched-dump.h  |  2 +-
 5 files changed, 22 insertions(+), 60 deletions(-)

diff --git a/gcc/hard-reg-set.h b/gcc/hard-reg-set.h
index b0bb9bce074..81bca6df0e5 100644
--- a/gcc/hard-reg-set.h
+++ b/gcc/hard-reg-set.h
@@ -524,4 +524,7 @@ call_used_or_fixed_reg_p (unsigned int regno)
 }
 #endif
 
+/* ira-color.cc */
+extern void print_hard_reg_set (FILE *, HARD_REG_SET, const char *, bool);
+
 #endif /* ! GCC_HARD_REG_SET_H */
diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
index f2e8ea34152..43564b44933 100644
--- a/gcc/ira-color.cc
+++ b/gcc/ira-color.cc
@@ -482,11 +482,14 @@ first_common_ancestor_node (allocno_hard_regs_node_t 
first,
 }
 
 /* Print hard reg set SET to F.  */
-static void
-print_hard_reg_set (FILE *f, HARD_REG_SET set, bool new_line_p)
+void
+print_hard_reg_set (FILE *f, HARD_REG_SET set,
+   const char *title, bool new_line_p)
 {
   int i, start, end;
 
+  if (title)
+fprintf (f, "%s", title);
   for (start = end = -1, i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 {
   bool reg_included = TEST_HARD_REG_BIT (set, i);
@@ -516,7 +519,7 @@ print_hard_reg_set (FILE *f, HARD_REG_SET set, bool 
new_line_p)
 DEBUG_FUNCTION void
 debug_hard_reg_set (HARD_REG_SET set)
 {
-  print_hard_reg_set (stderr, set, true);
+  print_hard_reg_set (stderr, set, NULL, true);
 }
 
 /* Print allocno hard register subforest given by ROOTS and its LEVEL
@@ -534,7 +537,7 @@ print_hard_regs_subforest (FILE *f, 
allocno_hard_regs_node_t roots,
   for (i = 0; i < level * 2; i++)
fprintf (f, " ");
   fprintf (f, "%d:(", node->preorder_num);
-  print_hard_reg_set (f, node->hard_regs->set, false);
+  print_hard_reg_set (f, node->hard_regs->set, NULL, false);
   fprintf (f, ")@%" PRId64"\n", node->hard_regs->cost);
   print_hard_regs_subforest (f, node->first, level + 1);
 }
@@ -2982,12 +2985,12 @@ setup_allocno_available_regs_num (ira_allocno_t a)
  "  Allocno a%dr%d of %s(%d) has %d avail. regs ",
  ALLOCNO_NUM (a), ALLOCNO_REGNO (a),
  reg_class_names[aclass], ira_class_hard_regs_num[aclass], n);
-  print_hard_reg_set (ira_dump_file, data->profitable_hard_regs, false);
+  print_hard_reg_set (ira_dump_file, data->profitable_hard_regs, NULL, false);
   fprintf (ira_dump_file, ", %snode: ",
   data->profitable_hard_regs == data->hard_regs_node->hard_regs->set
   ? "" : "^");
   print_hard_reg_set (ira_dump_file,
- data->hard_regs_node->hard_regs->set, false);
+ data->hard_regs_node->hard_regs->set, NULL, false);
   for (i = 0; i < nwords; i++)
 {
   ira_object_t obj = ALLOCNO_OBJECT (a, i);
@@ -3000,7 +3003,7 @@ setup_allocno_available_regs_num (ira_allocno_t a)
}
   fprintf (ira_dump_file, " (confl regs = ");
   print_hard_reg_set (ira_dump_file, OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
- false);
+ NULL, false);
   fprintf (ira_dump_file, ")");
 }
   fprintf (ira_dump_file, "\n");
diff --git a/gcc/ira-conflicts.cc b/gcc/ira-conflicts.cc
index a4d93c8d734..69806b1a15b 100644
--- a/gcc/ira-conflicts.cc
+++ b/gcc/ira-conflicts.cc
@@ -670,37 +670,6 @@ build_conflicts (void)
 
 
 
-/* Print hard reg set SET with TITLE to FILE.  */
-static void
-print_hard_reg_set (FILE *file, const char *title, HARD_REG_SET set)
-{
-  int i, start, end;
-
-  fputs (title, file);
-  for (start = end = -1, i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-{
-  bool reg_included = TEST_HARD_REG_BIT (set, i);
-
-  if (reg_included)
-   {
- if (start == -1)
-   start = i;
- end = i;
-   }
-  if (start >= 0 && (!reg_included || i == FIRST_PSEUDO_REGISTER - 1))
-   {
- if (start == end)
-   fprintf (file, " %d", start);
- else if (start == end + 1)
-

[PATCH v3 8/8] Improve logging of scheduler dependency analysis context

2023-11-22 Thread Maxim Kuvyrkov

Scheduler dependency analysis uses two main data structures:
1. reg_pending_* data contains effects of INSN on the register file,
   which is then incorporated into ...
2. deps_desc object, which contains cumulative information about all
   instructions processed since the deps_desc object's initialization.

This patch adds debug dumping of (2).

Dependency analysis contexts (aka deps_desc objects) are huge, but
each instruction affects only a small amount of data in these objects.
Therefore, it is most useful to dump differential information
compared to the dependency state after the previous instruction.
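
The idea of differential dumping, reduced to a toy model: keep a copy of the
previous state and print only the fields that changed.  This is a sketch
under stated assumptions; 'struct ctx_model' and 'dump_ctx_diff' are
hypothetical and far smaller than the real deps_desc.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a dependency context: just three counters.  */
struct ctx_model
{
  int pending_reads;
  int pending_writes;
  int pending_jumps;
};

/* Write one line to BUF for every field of CUR that differs from PREV.
   Return the number of lines written.  */
static int
dump_ctx_diff (char *buf, size_t size,
               const struct ctx_model *cur, const struct ctx_model *prev)
{
  const char *names[] = { "pending reads", "pending writes", "pending jumps" };
  const int cur_vals[] = { cur->pending_reads, cur->pending_writes,
                           cur->pending_jumps };
  const int prev_vals[] = { prev->pending_reads, prev->pending_writes,
                            prev->pending_jumps };
  size_t len = 0;
  int n_lines = 0;

  if (size == 0)
    return 0;
  buf[0] = '\0';

  for (int i = 0; i < 3 && len < size; i++)
    if (cur_vals[i] != prev_vals[i])
      {
        len += (size_t) snprintf (buf + len, size - len, ";; %s: %d\n",
                                  names[i], cur_vals[i]);
        n_lines++;
      }

  return n_lines;
}
```

After each instruction the caller would copy the current state into the
"previous" slot, which is the role reset_deps() plays in the patch.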

gcc/ChangeLog:

* sched-deps.cc (reset_deps, dump_rtx_insn_list)
(rtx_insn_list_same_p): New helper functions.
(dump_deps_desc_diff): New function to dump dependency information.
(sched_analysis_prev_deps): New static variable.
(sched_analyze_insn): Dump dependency information.
(init_deps_global, finish_deps_global): Handle sched_analysis_prev_deps.
* sched-int.h (struct deps_reg): Update comments.
* sched-rgn.cc (concat_insn_list, concat_expr_list): Update comments.
---
 gcc/sched-deps.cc | 197 ++
 gcc/sched-int.h   |   9 ++-
 gcc/sched-rgn.cc  |   5 ++
 3 files changed, 210 insertions(+), 1 deletion(-)

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index f9290c82fd2..edca9927e23 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -1677,6 +1677,15 @@ delete_all_dependences (rtx_insn *insn)
 sd_delete_dep (sd_it);
 }
 
+/* Re-initialize existing dependency context DEPS to be a copy of FROM.  */
+static void
+reset_deps (class deps_desc *deps, class deps_desc *from)
+{
+  free_deps (deps);
+  init_deps (deps, false);
+  deps_join (deps, from);
+}
+
 /* All insns in a scheduling group except the first should only have
dependencies on the previous insn in the group.  So we find the
first instruction in the scheduling group by walking the dependence
@@ -2960,6 +2969,177 @@ dump_reg_pending_data (FILE *file, rtx_insn *insn)
 }
 }
 
+/* Dump rtx_insn_list LIST.
+   Consider moving to lists.cc if there are users outside of sched-deps.cc.  */
+static void
+dump_rtx_insn_list (FILE *file, rtx_insn_list *list)
+{
+  for (; list; list = list->next ())
+fprintf (file, " %d", INSN_UID (list->insn ()));
+}
+
+/* Return TRUE if lists A and B have same elements in the same order.  */
+static bool
+rtx_insn_list_same_p (rtx_insn_list *a, rtx_insn_list *b)
+{
+  for (; a && b; a = a->next (), b = b->next ())
+if (a->insn () != b->insn ())
+  return false;
+
+  if (a || b)
+return false;
+
+  return true;
+}
+
+/* Dump parts of DEPS that are different from PREV.
+   Dumping all information from dependency context produces huge
+   hard-to-analyze logs; differential dumping is way more manageable.  */
+static void
+dump_deps_desc_diff (FILE *file, class deps_desc *deps, class deps_desc *prev)
+{
+  /* Each "paragraph" is a single line of output.  */
+
+  /* Note on param_max_pending_list_length:
+ During normal dependency analysis various lists should not exceed this
+ limit.  Searching for "!!!" in scheduler logs can point to potential bugs
+ or poorly-handled corner-cases.  */
+
+  if (!rtx_insn_list_same_p (deps->pending_read_insns,
+prev->pending_read_insns))
+{
+  fprintf (file, ";; deps pending mem reads length(%d):",
+  deps->pending_read_list_length);
+  if ((deps->pending_read_list_length + deps->pending_write_list_length)
+ >= param_max_pending_list_length)
+   fprintf (file, "%d insns!!!", deps->pending_read_list_length);
+  else
+   dump_rtx_insn_list (file, deps->pending_read_insns);
+  fprintf (file, "\n");
+}
+
+  if (!rtx_insn_list_same_p (deps->pending_write_insns,
+prev->pending_write_insns))
+{
+  fprintf (file, ";; deps pending mem writes length(%d):",
+  deps->pending_write_list_length);
+  if ((deps->pending_read_list_length + deps->pending_write_list_length)
+ >= param_max_pending_list_length)
+   fprintf (file, "%d insns!!!", deps->pending_write_list_length);
+  else
+   dump_rtx_insn_list (file, deps->pending_write_insns);
+  fprintf (file, "\n");
+}
+
+  if (!rtx_insn_list_same_p (deps->pending_jump_insns,
+prev->pending_jump_insns))
+{
+  fprintf (file, ";; deps pending jump length(%d):",
+  deps->pending_flush_length);
+  if (deps->pending_flush_length >= param_max_pending_list_length)
+   fprintf (file, "%d insns!!!", deps->pending_flush_length);
+  else
+   dump_rtx_insn_list (file, deps->pending_jump_insns);
+  fprintf (file, "\n");
+}
+
+  fprintf (file, ";; last");
+  if (!rtx_insn_list_same_p (deps->last_pending_memory_flush,
+prev->last_pending_memory_flush))
+{
+  fprintf (

[PATCH v3 3/8] Simplify handling of INSN_ and EXPR_LISTs in sched-rgn.cc

2023-11-22 Thread Maxim Kuvyrkov

This patch simplifies logic behind deps_join(), which will be
important for the upcoming improvements of sched-deps.cc logging.

The only functional change is that when deps_join() is called with
empty state for the 2nd argument, it will not reverse INSN_ and
EXPR_LISTs in the 1st argument.  Before this patch the lists were
reversed due to use of concat_*_LIST().  Now, with copy_*_LIST()
used for this case, the lists will remain in the original order.
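
The ordering difference can be seen with a plain singly linked list.  This is
only an illustration with made-up names ('struct node', 'cons',
'concat_list', 'copy_list'), not the rtx list API: prepending element by
element reverses the copied list, while appending through a tail pointer
preserves its order.

```c
#include <stdlib.h>

struct node
{
  int value;
  struct node *next;
};

/* Allocate a node holding VALUE in front of NEXT.  */
static struct node *
cons (int value, struct node *next)
{
  struct node *n = malloc (sizeof *n);
  if (!n)
    abort ();
  n->value = value;
  n->next = next;
  return n;
}

/* Prepend copies of COPY's elements to OLD one at a time, in the style of
   concat_*_LIST.  The copied elements end up in reverse order.  */
static struct node *
concat_list (const struct node *copy, struct node *old)
{
  struct node *result = old;
  for (; copy; copy = copy->next)
    result = cons (copy->value, result);
  return result;
}

/* Copy LINK by appending at the tail, in the style of copy_*_LIST.
   The original order is kept.  */
static struct node *
copy_list (const struct node *link)
{
  struct node *head = NULL;
  struct node **tail = &head;
  for (; link; link = link->next)
    {
      *tail = cons (link->value, NULL);
      tail = &(*tail)->next;
    }
  return head;
}
```

Copying 1, 2, 3 with copy_list gives 1, 2, 3, whereas concat_list onto an
empty list gives 3, 2, 1.  Freeing the nodes is omitted for brevity.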

gcc/ChangeLog:

* lists.cc (copy_EXPR_LIST, concat_EXPR_LIST): New functions.
* rtl.h (copy_EXPR_LIST, concat_EXPR_LIST): Declare.
* sched-rgn.cc (concat_insn_list, concat_expr_list): New helpers.
(concat_insn_mem_list): Simplify.
(deps_join): Update.
---
 gcc/lists.cc | 30 +++-
 gcc/rtl.h|  4 +++-
 gcc/sched-rgn.cc | 51 +++-
 3 files changed, 61 insertions(+), 24 deletions(-)

diff --git a/gcc/lists.cc b/gcc/lists.cc
index 2cdf37ad533..83e7bf32176 100644
--- a/gcc/lists.cc
+++ b/gcc/lists.cc
@@ -160,6 +160,24 @@ free_INSN_LIST_list (rtx_insn_list **listp)
   free_list ((rtx *)listp, &unused_insn_list);
 }
 
+/* Make a copy of the EXPR_LIST list LINK and return it.  */
+rtx_expr_list *
+copy_EXPR_LIST (rtx_expr_list *link)
+{
+  rtx_expr_list *new_queue;
+  rtx_expr_list **pqueue = &new_queue;
+
+  for (; link; link = link->next ())
+{
+  rtx x = link->element ();
+  rtx_expr_list *newlink = alloc_EXPR_LIST (REG_NOTE_KIND (link), x, NULL);
+  *pqueue = newlink;
+  pqueue = (rtx_expr_list **)&XEXP (newlink, 1);
+}
+  *pqueue = NULL;
+  return new_queue;
+}
+
 /* Make a copy of the INSN_LIST list LINK and return it.  */
 rtx_insn_list *
 copy_INSN_LIST (rtx_insn_list *link)
@@ -178,12 +196,22 @@ copy_INSN_LIST (rtx_insn_list *link)
   return new_queue;
 }
 
+/* Duplicate the EXPR_LIST elements of COPY and prepend them to OLD.  */
+rtx_expr_list *
+concat_EXPR_LIST (rtx_expr_list *copy, rtx_expr_list *old)
+{
+  rtx_expr_list *new_rtx = old;
+  for (; copy; copy = copy->next ())
+new_rtx = alloc_EXPR_LIST (REG_NOTE_KIND (copy), copy->element (), 
new_rtx);
+  return new_rtx;
+}
+
 /* Duplicate the INSN_LIST elements of COPY and prepend them to OLD.  */
 rtx_insn_list *
 concat_INSN_LIST (rtx_insn_list *copy, rtx_insn_list *old)
 {
   rtx_insn_list *new_rtx = old;
-  for (; copy ; copy = copy->next ())
+  for (; copy; copy = copy->next ())
 {
   new_rtx = alloc_INSN_LIST (copy->insn (), new_rtx);
   PUT_REG_NOTE_KIND (new_rtx, REG_NOTE_KIND (copy));
diff --git a/gcc/rtl.h b/gcc/rtl.h
index e4b6cc0dbb5..7e952d7cbeb 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3764,10 +3764,12 @@ extern void free_EXPR_LIST_list (rtx_expr_list **);
 extern void free_INSN_LIST_list (rtx_insn_list **);
 extern void free_EXPR_LIST_node (rtx);
 extern void free_INSN_LIST_node (rtx);
+extern rtx_expr_list *alloc_EXPR_LIST (int, rtx, rtx);
 extern rtx_insn_list *alloc_INSN_LIST (rtx, rtx);
+extern rtx_expr_list *copy_EXPR_LIST (rtx_expr_list *);
 extern rtx_insn_list *copy_INSN_LIST (rtx_insn_list *);
+extern rtx_expr_list *concat_EXPR_LIST (rtx_expr_list *, rtx_expr_list *);
 extern rtx_insn_list *concat_INSN_LIST (rtx_insn_list *, rtx_insn_list *);
-extern rtx_expr_list *alloc_EXPR_LIST (int, rtx, rtx);
 extern void remove_free_INSN_LIST_elem (rtx_insn *, rtx_insn_list **);
 extern rtx remove_list_elem (rtx, rtx *);
 extern rtx_insn *remove_free_INSN_LIST_node (rtx_insn_list **);
diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc
index e5964f54ead..da3ec0458ff 100644
--- a/gcc/sched-rgn.cc
+++ b/gcc/sched-rgn.cc
@@ -2585,25 +2585,32 @@ add_branch_dependences (rtx_insn *head, rtx_insn *tail)
 
 static class deps_desc *bb_deps;
 
+/* Return a new insn_list with all the elements from the two input lists.  */
+static rtx_insn_list *
+concat_insn_list (rtx_insn_list *copy, rtx_insn_list *old)
+{
+  if (!old)
+return copy_INSN_LIST (copy);
+  return concat_INSN_LIST (copy, old);
+}
+
+/* Return a new expr_list with all the elements from the two input lists.  */
+static rtx_expr_list *
+concat_expr_list (rtx_expr_list *copy, rtx_expr_list *old)
+{
+  if (!old)
+return copy_EXPR_LIST (copy);
+  return concat_EXPR_LIST (copy, old);
+}
+
 static void
 concat_insn_mem_list (rtx_insn_list *copy_insns,
  rtx_expr_list *copy_mems,
  rtx_insn_list **old_insns_p,
  rtx_expr_list **old_mems_p)
 {
-  rtx_insn_list *new_insns = *old_insns_p;
-  rtx_expr_list *new_mems = *old_mems_p;
-
-  while (copy_insns)
-{
-  new_insns = alloc_INSN_LIST (copy_insns->insn (), new_insns);
-  new_mems = alloc_EXPR_LIST (VOIDmode, copy_mems->element (), new_mems);
-  copy_insns = copy_insns->next ();
-  copy_mems = copy_mems->next ();
-}
-
-  *old_insns_p = new_insns;
-  *old_mems_p = new_mems;
+  *old_insns_p = concat_insn_list (copy_insns, *old_insns_p);
+  *old_mems_p = concat

[PATCH v3 4/8] Improve and fix sched-deps.cc: dump_dep() and dump_lists().

2023-11-22 Thread Maxim Kuvyrkov

Better propagate flags from dump_lists() into dump_dep() and
add a missing "]" in dump_lists().
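
Why the renumbering matters, as a small sketch: when one flags word carries
both per-list and per-dep bits, the two groups must not share values.  The
DEMO_* names below are hypothetical; the numeric values are the ones visible
in this patch (2 and 4 for the old list flags, 32 and 64 for the new ones,
16 for the last dep flag).

```c
#include <stdbool.h>

enum demo_flags
{
  DEMO_DEP_PRO = 2,
  DEMO_DEP_CON = 4,
  DEMO_DEP_STATUS = 16,
  /* Old list flags reused the values of the dep flags.  */
  DEMO_OLD_LISTS_SIZE = 2,
  DEMO_OLD_LISTS_DEPS = 4,
  /* New list flags continue after the last dep flag.  */
  DEMO_NEW_LISTS_SIZE = 32,
  DEMO_NEW_LISTS_DEPS = 64
};

/* Return true if LIST_FLAG cannot be told apart from DEP_FLAG once both
   are stored in the same bit mask.  */
static bool
ambiguous_p (int list_flag, int dep_flag)
{
  return (list_flag & dep_flag) != 0;
}
```

With the old values a request for list sizes was indistinguishable from a
request to dump producers; with the new values the masks are disjoint and can
be passed down to dump_dep() unchanged.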

gcc/ChangeLog:

* sched-deps.cc (DUMP_DEP_PRO): Improve comment.
(dump_dep_flags): Remove.
(DUMP_LISTS_SIZE, DUMP_LISTS_DEPS, DUMP_LISTS_ALL): Continue
numbering from DUMP_DEP_* flags.
(dump_lists): Update and fix.
---
 gcc/sched-deps.cc | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index 005fc0f567e..4d357079a7a 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -132,7 +132,8 @@ static void dump_ds (FILE *, ds_t);
 /* Define flags for dump_dep ().  */
 
 /* Dump producer of the dependence.  */
-#define DUMP_DEP_PRO (2)
+#define DUMP_DEP_PRO (2) /* Reserve "1" for handling of DUMP_DEP_ALL and
+   DUMP_LISTS_ALL.  */
 
 /* Dump consumer of the dependence.  */
 #define DUMP_DEP_CON (4)
@@ -206,9 +207,6 @@ dump_dep (FILE *dump, dep_t dep, int flags)
   fprintf (dump, ">");
 }
 
-/* Default flags for dump_dep ().  */
-static int dump_dep_flags = (DUMP_DEP_PRO | DUMP_DEP_CON);
-
 /* Dump all fields of DEP to STDERR.  */
 void
 sd_debug_dep (dep_t dep)
@@ -1454,19 +1452,20 @@ sd_delete_dep (sd_iterator_def sd_it)
 }
 
 /* Dump size of the lists.  */
-#define DUMP_LISTS_SIZE (2)
+#define DUMP_LISTS_SIZE (32) /* (DUMP_DEP_STATUS << 1)  */
 
 /* Dump dependencies of the lists.  */
-#define DUMP_LISTS_DEPS (4)
+#define DUMP_LISTS_DEPS (64)
 
 /* Dump all information about the lists.  */
 #define DUMP_LISTS_ALL (DUMP_LISTS_SIZE | DUMP_LISTS_DEPS)
 
 /* Dump deps_lists of INSN specified by TYPES to DUMP.
-   FLAGS is a bit mask specifying what information about the lists needs
-   to be printed.
+   FLAGS is a bit mask specifying what information about the lists and
+   the individual deps needs to be printed, this is a combination of
+   DUMP_DEP_* and DUMP_LISTS_* flags.
If FLAGS has the very first bit set, then dump all information about
-   the lists and propagate this bit into the callee dump functions.  */
+   the lists and deps and propagate this bit into the callee dump functions.  */
 static void
 dump_lists (FILE *dump, rtx insn, sd_list_types_def types, int flags)
 {
@@ -1488,10 +1487,12 @@ dump_lists (FILE *dump, rtx insn, sd_list_types_def 
types, int flags)
 {
   FOR_EACH_DEP (insn, types, sd_it, dep)
{
- dump_dep (dump, dep, dump_dep_flags | all);
+ dump_dep (dump, dep, flags | all);
  fprintf (dump, " ");
}
 }
+
+  fprintf (dump, "]");
 }
 
 /* Dump all information about deps_lists of INSN specified by TYPES
-- 
2.34.1
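
A small sketch of why the renumbering matters, under stated assumptions: the values of DUMP_DEP_PRO, DUMP_DEP_CON, DUMP_LISTS_SIZE and DUMP_LISTS_DEPS are taken from the patch, DUMP_DEP_STATUS follows from the "(DUMP_DEP_STATUS << 1)" comment, while DUMP_ALL_BIT, DUMP_DEP_TYPE and the helper functions are names and values assumed only for illustration. Once both flag families live in one bit mask, a single FLAGS argument can be passed from dump_lists () down to dump_dep () without one family aliasing the other.

```cpp
// Values marked "patch" come from the diff above; the rest are assumptions.
constexpr int DUMP_ALL_BIT    = 1;   // assumed name for the reserved first bit
constexpr int DUMP_DEP_PRO    = 2;   // patch
constexpr int DUMP_DEP_CON    = 4;   // patch
constexpr int DUMP_DEP_TYPE   = 8;   // assumed value
constexpr int DUMP_DEP_STATUS = 16;  // implied by "(DUMP_DEP_STATUS << 1)" == 32
constexpr int DUMP_LISTS_SIZE = DUMP_DEP_STATUS << 1;  // 32, patch
constexpr int DUMP_LISTS_DEPS = 64;  // patch

constexpr int DEP_MASK
  = DUMP_DEP_PRO | DUMP_DEP_CON | DUMP_DEP_TYPE | DUMP_DEP_STATUS;
constexpr int LISTS_MASK = DUMP_LISTS_SIZE | DUMP_LISTS_DEPS;

// Hypothetical helpers: the first bit means "dump everything".
constexpr bool
wants_producer (int flags)
{
  return (flags & (DUMP_ALL_BIT | DUMP_DEP_PRO)) != 0;
}

constexpr bool
wants_list_size (int flags)
{
  return (flags & (DUMP_ALL_BIT | DUMP_LISTS_SIZE)) != 0;
}
```

With the old numbering, DUMP_LISTS_SIZE and DUMP_DEP_PRO were both 2, so passing the lists flags straight through would also have requested the producer.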

[PATCH v3 7/8] Improve logging of register data in scheduler dependency analysis

2023-11-22 Thread Maxim Kuvyrkov

Scheduler dependency analysis uses two main data structures:
1. reg_pending_* data contains effects of INSN on the register file,
   which is then incorporated into ...
2. deps_desc object, which contains cumulative information about all
   instructions processed from deps_desc object's initialization.

This patch adds debug dumping of (1).
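
As a rough illustration of the dumping style described above (print only the non-empty sets to keep the log short), here is a self-contained sketch. PendingRegs, append_set and describe_pending are hypothetical names, and std::set<int> merely stands in for a regset; the real code uses dump_regset () and fprintf ().

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the per-insn reg_pending_* data.
struct PendingRegs
{
  std::set<int> sets, clobbers, uses;
};

// Append " label(r1 r2 ...)" to OUT, but only when REGS is non-empty.
static void
append_set (std::string &out, const char *label, const std::set<int> &regs)
{
  if (regs.empty ())
    return;
  out += " ";
  out += label;
  out += "(";
  bool first = true;
  for (int r : regs)
    {
      if (!first)
        out += " ";
      out += std::to_string (r);
      first = false;
    }
  out += ")";
}

// Return one log line, or an empty string when there is nothing to report.
std::string
describe_pending (const PendingRegs &p)
{
  if (p.sets.empty () && p.clobbers.empty () && p.uses.empty ())
    return "";
  std::string out = ";; insn reg";
  append_set (out, "sets", p.sets);
  append_set (out, "clobbers", p.clobbers);
  append_set (out, "uses", p.uses);
  out += "\n";
  return out;
}
```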

gcc/ChangeLog:

* sched-deps.cc (print-rtl.h): Include for str_pattern_slim().
(dump_reg_pending_data): New function.
(sched_analyze_insn): Use it.
---
 gcc/sched-deps.cc | 90 ++-
 1 file changed, 89 insertions(+), 1 deletion(-)

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index e0d3c97d935..f9290c82fd2 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "sched-int.h"
 #include "cselib.h"
 #include "function-abi.h"
+#include "print-rtl.h"
 
 #ifdef INSN_SCHEDULING
 
@@ -432,10 +433,24 @@ dep_spec_p (dep_t dep)
   return false;
 }
 
+/* These regsets describe how a single instruction affects registers.
+   Their "life-time" is restricted to a single call of sched_analyze_insn().
+   They are populated by sched_analyze_1() and sched_analyze_2(), and
+   then sched_analyze_insn() transfers data from these into deps->reg_last[i].
+   Near the end sched_analyze_insn() clears these regsets for the next
+   insn.  */
 static regset reg_pending_sets;
 static regset reg_pending_clobbers;
 static regset reg_pending_uses;
 static regset reg_pending_control_uses;
+
+/* Similar to reg_pending_* regsets, this variable specifies whether
+   the current insn analyzed by sched_analyze_insn() is a scheduling
+   barrier that should "split" dependencies inside a block.  Internally
+   sched-deps.cc does this by pretending that the barrier insn uses and
+   sets all registers.
+   Near the end sched_analyze_insn() transfers barrier info from this variable
+   into deps->last_reg_pending_barrier.  */
 static enum reg_pending_barrier_mode reg_pending_barrier;
 
 /* Hard registers implicitly clobbered or used (or may be implicitly
@@ -2880,7 +2895,77 @@ get_implicit_reg_pending_clobbers (HARD_REG_SET *temp, 
rtx_insn *insn)
   *temp &= ~ira_no_alloc_regs;
 }
 
-/* Analyze an INSN with pattern X to find all dependencies.  */
+/* Dump state of reg_pending_* data for debug purposes.
+   Dump only non-empty data to reduce log clobber.  */
+static void
+dump_reg_pending_data (FILE *file, rtx_insn *insn)
+{
+  fprintf (file, "\n");
+  fprintf (file, ";; sched_analysis after insn %d: %s\n",
+  INSN_UID (insn), str_pattern_slim (PATTERN (insn)));
+
+  if (!REG_SET_EMPTY_P (reg_pending_sets)
+  || !REG_SET_EMPTY_P (reg_pending_clobbers)
+  || !REG_SET_EMPTY_P (reg_pending_uses)
+  || !REG_SET_EMPTY_P (reg_pending_control_uses))
+{
+  fprintf (file, ";; insn reg");
+  if (!REG_SET_EMPTY_P (reg_pending_sets))
+   {
+ fprintf (file, " sets(");
+ dump_regset (reg_pending_sets, file);
+ fprintf (file, ")");
+   }
+  if (!REG_SET_EMPTY_P (reg_pending_clobbers))
+   {
+ fprintf (file, " clobbers(");
+ dump_regset (reg_pending_clobbers, file);
+ fprintf (file, ")");
+   }
+  if (!REG_SET_EMPTY_P (reg_pending_uses))
+   {
+ fprintf (file, " uses(");
+ dump_regset (reg_pending_uses, file);
+ fprintf (file, ")");
+   }
+  if (!REG_SET_EMPTY_P (reg_pending_control_uses))
+   {
+ fprintf (file, " control(");
+ dump_regset (reg_pending_control_uses, file);
+ fprintf (file, ")");
+   }
+  fprintf (file, "\n");
+}
+
+  if (reg_pending_barrier)
+fprintf (file, ";; insn reg barrier: %d\n", reg_pending_barrier);
+
+  if (!hard_reg_set_empty_p (implicit_reg_pending_clobbers)
+  || !hard_reg_set_empty_p (implicit_reg_pending_uses))
+{
+  fprintf (file, ";; insn reg");
+  if (!hard_reg_set_empty_p (implicit_reg_pending_clobbers))
+   {
+ print_hard_reg_set (file, implicit_reg_pending_clobbers,
+ " implicit clobbers(", false);
+ fprintf (file, ")");
+   }
+  if (!hard_reg_set_empty_p (implicit_reg_pending_uses))
+   {
+ print_hard_reg_set (file, implicit_reg_pending_uses,
+ " implicit uses(", false);
+ fprintf (file, ")");
+   }
+  fprintf (file, "\n");
+}
+}
+
+/* Analyze an INSN with pattern X to find all dependencies.
+   This analysis uses two main data structures:
+   1. reg_pending_* data contains effects of INSN on the register file,
+  which is then incorporated into ...
+   2. deps_desc object, which contains cumulative information about all
+  instructions processed from deps_desc object's initialization.  */
 static void
 sched_analyze_insn (class deps_desc *deps, rtx x, rtx_insn *insn)
 {
@@ -3328,6 +3413,9 @@ sched_analyze_insn (class d

Re: [PATCH 11/11] aarch64: Use individual loads/stores for mem{cpy, set} expansion

2023-11-22 Thread Richard Sandiford

Alex Coplan  writes:
> This patch adjusts the mem{cpy,set} expansion in the aarch64 backend to use
> individual loads/stores instead of ldp/stp at expand time.  The idea is to 
> rely
> on the ldp fusion pass to fuse the accesses together later in the RTL 
> pipeline.
>
> The earlier parts of the RTL pipeline should be able to do a better job with 
> the
> individual (non-paired) accesses, especially given that an earlier patch in 
> this
> series moves the pair representation to use unspecs.
>
> Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64.cc
>   (aarch64_copy_one_block_and_progress_pointers): Emit individual
>   accesses instead of load/store pairs.
>   (aarch64_set_one_block_and_progress_pointer): Likewise.

OK, thanks.

Richard

> ---
>  gcc/config/aarch64/aarch64.cc | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 1f6094bf1bc..315ba7119c0 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -25457,9 +25457,12 @@ aarch64_copy_one_block_and_progress_pointers (rtx 
> *src, rtx *dst,
>/* "Cast" the pointers to the correct mode.  */
>*src = adjust_address (*src, mode, 0);
>*dst = adjust_address (*dst, mode, 0);
> -  /* Emit the memcpy.  */
> -  emit_insn (aarch64_gen_load_pair (reg1, reg2, *src));
> -  emit_insn (aarch64_gen_store_pair (*dst, reg1, reg2));
> +  /* Emit the memcpy.  The load/store pair pass should form
> +  a load/store pair from these moves.  */
> +  emit_move_insn (reg1, *src);
> +  emit_move_insn (reg2, aarch64_progress_pointer (*src));
> +  emit_move_insn (*dst, reg1);
> +  emit_move_insn (aarch64_progress_pointer (*dst), reg2);
>/* Move the pointers forward.  */
>*src = aarch64_move_pointer (*src, 32);
>*dst = aarch64_move_pointer (*dst, 32);
> @@ -25638,7 +25641,8 @@ aarch64_set_one_block_and_progress_pointer (rtx src, 
> rtx *dst,
>/* "Cast" the *dst to the correct mode.  */
>*dst = adjust_address (*dst, mode, 0);
>/* Emit the memset.  */
> -  emit_insn (aarch64_gen_store_pair (*dst, src, src));
> +  emit_move_insn (*dst, src);
> +  emit_move_insn (aarch64_progress_pointer (*dst), src);
>  
>/* Move the pointers forward.  */
>*dst = aarch64_move_pointer (*dst, 32);
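
The shape of the new expansion can be sketched at the source level, with the caveat that this is only an analogy: the real code emits RTL moves, and Chunk16 and copy_block32 are hypothetical names that assume a 16-byte access mode. The block is copied as two independent loads followed by two independent stores, leaving it to a later stage (the ldp fusion pass, in the patch) to pair them up.

```cpp
#include <cstring>

// Hypothetical 16-byte "register" for this sketch.
struct Chunk16
{
  unsigned char bytes[16];
};

// Copy one 32-byte block as two separate 16-byte loads and two separate
// 16-byte stores, then advance both pointers by 32 bytes.
void
copy_block32 (const unsigned char *&src, unsigned char *&dst)
{
  Chunk16 lo, hi;
  std::memcpy (&lo, src, 16);       // first load
  std::memcpy (&hi, src + 16, 16);  // second load, at the progressed address
  std::memcpy (dst, &lo, 16);       // first store
  std::memcpy (dst + 16, &hi, 16);  // second store
  src += 32;
  dst += 32;
}
```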

Re: [PATCH] tree: Fix up try_catch_may_fallthru [PR112619]

2023-11-22 Thread Richard Biener

On Wed, 22 Nov 2023, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs with -std=c++98 since r14-5086 because
> block_may_fallthru is called on a TRY_CATCH_EXPR whose second operand
> is a MODIFY_EXPR rather than STATEMENT_LIST, which try_catch_may_fallthru
> apparently expects.
> I've been wondering whether that isn't some kind of FE bug and whether
> there isn't some unwritten rule that second operand of TRY_CATCH_EXPR
> must be a STATEMENT_LIST, but then I tried
> --- gcc/gimplify.cc   2023-07-19 14:23:42.409875238 +0200
> +++ gcc/gimplify.cc   2023-11-22 11:07:50.511000206 +0100
> @@ -16730,6 +16730,10 @@ gimplify_expr (tree *expr_p, gimple_seq
>  Note that this only affects the destructor calls in FINALLY/CATCH
>  block, and will automatically reset to its original value by the
>  end of gimplify_expr.  */
> +if (TREE_CODE (*expr_p) == TRY_CATCH_EXPR
> + && TREE_OPERAND (*expr_p, 1)
> + && TREE_CODE (TREE_OPERAND (*expr_p, 1)) != STATEMENT_LIST)
> +   gcc_unreachable ();
>   input_location = UNKNOWN_LOCATION;
>   eval = cleanup = NULL;
>   gimplify_and_add (TREE_OPERAND (*expr_p, 0), &eval);
> hack in gcc 13 and triggered on hundreds of tests there within just 5
> seconds of running make check-g++ -j32 (and in cases I looked at had nothing
> to do with the r14-5086 backports), so I believe this is just bad
> assumption on the try_catch_may_fallthru side, gimplify.cc certainly doesn't
> care, it just calls gimplify_and_add (TREE_OPERAND (*expr_p, 1), &cleanup);
> on it.  So, IMHO non-STATEMENT_LIST in the second operand is equivalent to
> a STATEMENT_LIST containing a single statement.

Did you check if there's ever a CATCH_EXPR or EH_FILTER_EXPR not wrapped
inside a STATEMENT_LIST?  That is, does

 if (TREE_CODE (TREE_OPERAND (stmt, 1)) != STATEMENT_LIST)
   {
 gcc_checking_assert (code != CATCH_EXPR && code != EH_FILTER_EXPR);
 return false;
   }

work?

> Unfortunately, I don't see an easy way to create an artificial tree iterator
> from just a single tree statement, so the patch duplicates what the loops
> later do (after all, it is very simple, just didn't want to duplicate
> also the large comments explaining it, so the 3 See below. comments).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> Shall it go to branches as well given that r14-5086 has been backported
> to those branches as well?
> 
> 2023-11-22  Jakub Jelinek  
> 
>   PR c++/112619
>   * tree.cc (try_catch_may_fallthru): If second operand of
>   TRY_CATCH_EXPR is not a STATEMENT_LIST, handle it as if it was a
>   STATEMENT_LIST containing a single statement.
> 
>   * g++.dg/eh/pr112619.C: New test.
> 
> --- gcc/tree.cc.jj2023-11-14 09:24:28.436530018 +0100
> +++ gcc/tree.cc   2023-11-21 19:19:19.384347469 +0100
> @@ -12573,6 +12573,24 @@ try_catch_may_fallthru (const_tree stmt)
>if (block_may_fallthru (TREE_OPERAND (stmt, 0)))
>  return true;
>  
> +  switch (TREE_CODE (TREE_OPERAND (stmt, 1)))
> +{
> +case CATCH_EXPR:
> +  /* See below.  */
> +  return block_may_fallthru (CATCH_BODY (TREE_OPERAND (stmt, 1)));
> +
> +case EH_FILTER_EXPR:
> +  /* See below.  */
> +  return block_may_fallthru (EH_FILTER_FAILURE (TREE_OPERAND (stmt, 1)));
> +
> +case STATEMENT_LIST:
> +  break;
> +
> +default:
> +  /* See below.  */
> +  return false;
> +}
> +
>i = tsi_start (TREE_OPERAND (stmt, 1));
>switch (TREE_CODE (tsi_stmt (i)))
>  {
> --- gcc/testsuite/g++.dg/eh/pr112619.C.jj 2023-11-21 19:22:47.437439283 
> +0100
> +++ gcc/testsuite/g++.dg/eh/pr112619.C2023-11-21 19:22:24.887754376 
> +0100
> @@ -0,0 +1,15 @@
> +// PR c++/112619
> +// { dg-do compile }
> +
> +struct S { S (); ~S (); };
> +
> +S
> +foo (int a, int b)
> +{
> +  if (a || b)
> +{
> +  S s;
> +  return s;
> +}
> +  return S ();
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
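
To make the "single statement is equivalent to a one-element STATEMENT_LIST" argument concrete, here is a toy model, not GCC code. Kind, Node and handler_may_fallthru are hypothetical; the per-kind results for a bare handler follow the patch above, while scanning every catch clause in a list and returning false for an empty list are assumptions made for this sketch.

```cpp
#include <vector>

enum class Kind { StatementList, Catch, EhFilter, Other };

// Toy tree node.  BODY_MAY_FALLTHRU models the catch body (for Catch) or
// the failure code (for EhFilter); STMTS is used only by StatementList.
struct Node
{
  Kind kind;
  bool body_may_fallthru;
  std::vector<Node> stmts;
};

// Can the handler operand of a try/catch fall through?
bool
handler_may_fallthru (const Node &handler)
{
  // View a bare statement as a list of length one.
  const Node *first = &handler;
  const Node *end = first + 1;
  if (handler.kind == Kind::StatementList)
    {
      if (handler.stmts.empty ())
        return false;  // assumption: nothing to fall out of
      first = handler.stmts.data ();
      end = first + handler.stmts.size ();
    }

  switch (first->kind)
    {
    case Kind::Catch:
      // Falls through if any catch body does.
      for (const Node *n = first; n != end; ++n)
        if (n->body_may_fallthru)
          return true;
      return false;
    case Kind::EhFilter:
      return first->body_may_fallthru;
    default:
      // Plain cleanup statements: the patch returns false here.
      return false;
    }
}
```

In this model a bare statement and the same statement wrapped in a one-element list always give the same answer, which is the property the patch relies on.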

[PATCH] c++/modules: Prevent overwriting arguments for duplicates [PR112588]

2023-11-22 Thread Nathaniel Shead

Bootstrapped and regtested on x86_64-pc-linux-gnu. I don't have write
access.

-- >8 --

When merging duplicate instantiations of function templates, currently
read_function_def overwrites the arguments with those of the existing
duplicate. This is problematic, however, since this means that the
PARM_DECLs in the body of the function definition no longer match with
the PARM_DECLs in the argument list, which causes issues when it comes
to generating RTL.

There doesn't seem to be any reason to do this replacement, so this
patch removes that logic.

PR c++/112588

gcc/cp/ChangeLog:

* module.cc (trees_in::read_function_def): Don't overwrite
arguments.

gcc/testsuite/ChangeLog:

* g++.dg/modules/merge-16.h: New test.
* g++.dg/modules/merge-16_a.C: New test.
* g++.dg/modules/merge-16_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/module.cc  |  2 --
 gcc/testsuite/g++.dg/modules/merge-16.h   | 10 ++
 gcc/testsuite/g++.dg/modules/merge-16_a.C |  7 +++
 gcc/testsuite/g++.dg/modules/merge-16_b.C |  5 +
 4 files changed, 22 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/merge-16.h
 create mode 100644 gcc/testsuite/g++.dg/modules/merge-16_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/merge-16_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 4f5b6e2747a..2520ab659cc 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -11665,8 +11665,6 @@ trees_in::read_function_def (tree decl, tree 
maybe_template)
   DECL_RESULT (decl) = result;
   DECL_INITIAL (decl) = initial;
   DECL_SAVED_TREE (decl) = saved;
-  if (maybe_dup)
-   DECL_ARGUMENTS (decl) = DECL_ARGUMENTS (maybe_dup);
 
   if (context)
SET_DECL_FRIEND_CONTEXT (decl, context);
diff --git a/gcc/testsuite/g++.dg/modules/merge-16.h 
b/gcc/testsuite/g++.dg/modules/merge-16.h
new file mode 100644
index 000..fdb38551103
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/merge-16.h
@@ -0,0 +1,10 @@
+// PR c++/112588
+
+void f(int*);
+
+template 
+struct S {
+  void g(int n) { f(&n); }
+};
+
+template struct S;
diff --git a/gcc/testsuite/g++.dg/modules/merge-16_a.C 
b/gcc/testsuite/g++.dg/modules/merge-16_a.C
new file mode 100644
index 000..c243224c875
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/merge-16_a.C
@@ -0,0 +1,7 @@
+// PR c++/112588
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi merge16 }
+
+module;
+#include "merge-16.h"
+export module merge16;
diff --git a/gcc/testsuite/g++.dg/modules/merge-16_b.C 
b/gcc/testsuite/g++.dg/modules/merge-16_b.C
new file mode 100644
index 000..8c7b1f0511f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/merge-16_b.C
@@ -0,0 +1,5 @@
+// PR c++/112588
+// { dg-additional-options "-fmodules-ts" }
+
+#include "merge-16.h"
+import merge16;
-- 
2.42.0
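
The mismatch described in the commit message can be modelled in a few lines. This is a hypothetical sketch, not the module reader: Parm stands in for a PARM_DECL, and body_refs_resolve imitates a later pass that looks up body references by the identity of the declared arguments.

```cpp
#include <map>
#include <vector>

// Stand-in for a parameter declaration; only its address matters here.
struct Parm
{
  int id;
};

struct Function
{
  std::vector<const Parm *> arguments;  // the declared argument list
  std::vector<const Parm *> body_refs;  // parameters the body refers to
};

// Assign a slot to each declared argument, then check that every reference
// in the body resolves to one of those slots by pointer identity.
bool
body_refs_resolve (const Function &fn)
{
  std::map<const Parm *, int> slot;
  int next = 0;
  for (const Parm *p : fn.arguments)
    slot[p] = next++;
  for (const Parm *p : fn.body_refs)
    if (slot.find (p) == slot.end ())
      return false;
  return true;
}
```

If the argument list is replaced by the duplicate's parameters while the body still points at the freshly read ones, the lookup fails even though both parameters have the same name and type.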

RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store

2023-11-22 Thread Li, Pan2

> It looks like Jeff approved the patch?

Yes, I would just like to double-check that the approach of this patch is as
expected, following the suggestion of Richard S.

Pan

-Original Message-
From: Richard Biener  
Sent: Wednesday, November 22, 2023 4:02 PM
To: Li, Pan2 
Cc: richard.sandif...@arm.com; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; Jeff Law 
; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
store

On Wed, Nov 22, 2023 at 3:30 AM Li, Pan2  wrote:
>
> Hi Richard S,
>
> Thanks a lot for reviewing and comments. May I know is there any concern or 
> further comments for landing this patch to GCC-14?

It looks like Jeff approved the patch?

Richard.

> Pan
>
> -Original Message-
> From: Li, Pan2
> Sent: Wednesday, November 15, 2023 8:25 AM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandif...@arm.com; 
> Jeff Law 
> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
> store
>
> Sorry for disturbing, looks I have a typo for Richard S's email address, cc 
> the right email address for awareness.
>
> Pan
>
> -Original Message-
> From: Li, Pan2
> Sent: Wednesday, November 15, 2023 8:18 AM
> To: Jeff Law ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
> store
>
> > I wouldn't try to handle that case unless we had actual evidence it was
> > useful to do so.  Just wanted to point out that unlike pseudos we can
> > have multiple modes referencing the same memory location.
>
> Got the point here, thanks Jeff for emphasizing this, 😉.
>
> Pan
>
> -Original Message-
> From: Jeff Law 
> Sent: Tuesday, November 14, 2023 4:12 AM
> To: Li, Pan2 ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
> store
>
>
>
> On 11/12/23 20:22, pan2...@intel.com wrote:
> > From: Pan Li 
> >
> > Update in v4:
> > * Merge upstream and removed some independent changes.
> >
> > Update in v3:
> > * Take known_le instead of known_lt for vector size.
> > * Return NULL_RTX when gap is not equal 0 and not constant.
> >
> > Update in v2:
> > * Move vector type support to get_stored_val.
> >
> > Original log:
> >
> > This patch would like to allow the vector mode in the
> > get_stored_val in the DSE. It is valid for the read
> > rtx if and only if the read bitsize is less than the
> > stored bitsize.
> >
> > Given below example code with
> > --param=riscv-autovec-preference=fixed-vlmax.
> >
> > vuint8m1_t test () {
> >uint8_t arr[32] = {
> >  1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
> >  1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
> >};
> >
> >return __riscv_vle8_v_u8m1(arr, 32);
> > }
> >
> > Before this patch:
> > test:
> >    lui     a5,%hi(.LANCHOR0)
> >    addi    sp,sp,-32
> >    addi    a5,a5,%lo(.LANCHOR0)
> >    li      a3,32
> >    vl2re64.v   v2,0(a5)
> >    vsetvli zero,a3,e8,m1,ta,ma
> >    vs2r.v  v2,0(sp) <== Unnecessary store to stack
> >    vle8.v  v1,0(sp) <== Ditto
> >    vs1r.v  v1,0(a0)
> >    addi    sp,sp,32
> >    jr      ra
> >
> > After this patch:
> > test:
> >    lui     a5,%hi(.LANCHOR0)
> >    addi    a5,a5,%lo(.LANCHOR0)
> >    li      a4,32
> >    addi    sp,sp,-32
> >    vsetvli zero,a4,e8,m1,ta,ma
> >    vle8.v  v1,0(a5)
> >    vs1r.v  v1,0(a0)
> >    addi    sp,sp,32
> >    jr      ra
> >
> > Below tests are passed within this patch:
> > * The risc-v regression test.
> > * The x86 bootstrap and regression test.
> > * The aarch64 regression test.
> >
> >   PR target/111720
> >
> > gcc/ChangeLog:
> >
> >   * dse.cc (get_stored_val): Allow vector mode if read size is
> >   less than or equal to stored size.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
> >   * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
> OK for the trunk.
>
>
> >
>
> > +  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
> > +&& known_le (GET_MODE_BITSIZE (read_mode), GET_MODE

Re: [PATCH] tree: Fix up try_catch_may_fallthru [PR112619]

2023-11-22 Thread Jakub Jelinek

On Wed, Nov 22, 2023 at 11:32:10AM +, Richard Biener wrote:
> > hack in gcc 13 and triggered on hundreds of tests there within just 5
> > seconds of running make check-g++ -j32 (and in cases I looked at had nothing
> > to do with the r14-5086 backports), so I believe this is just bad
> > assumption on the try_catch_may_fallthru side, gimplify.cc certainly doesn't
> > care, it just calls gimplify_and_add (TREE_OPERAND (*expr_p, 1), &cleanup);
> > on it.  So, IMHO non-STATEMENT_LIST in the second operand is equivalent to
> > a STATEMENT_LIST containing a single statement.
> 
> Did you check if there's ever a CATCH_EXPR or EH_FILTER_EXPR not wrapped
> inside a STATEMENT_LIST?  That is, does
> 
>  if (TREE_CODE (TREE_OPERAND (stmt, 1)) != STATEMENT_LIST)
>{
>  gcc_checking_assert (code != CATCH_EXPR && code != EH_FILTER_EXPR);
>  return false;
>}
> 
> work?

Looking at a trivial example
void bar ();
void
foo (void)
{
  try { bar (); } catch (int) {}
}
it seems it is even more complicated, because what e.g. the gimplification
sees is not TRY_CATCH_EXPR with CATCH_EXPR second operand, but
TRY_BLOCK with HANDLER second operand (note, certainly not wrapped in a
STATEMENT_LIST, one would need another catch (long) {} for it after it),
C++ FE specific trees.
And cp_gimplify_expr then on the fly turns the TRY_BLOCK into TRY_CATCH_EXPR
(in genericize_try_block) and HANDLER into CATCH_EXPR
(genericize_catch_block).
When gimplifying EH_SPEC_BLOCK, genericize_eh_spec_block ->
build_gimple_eh_filter_tree even creates a TRY_CATCH_EXPR with
EH_FILTER_EXPR as its second operand (without an intervening
STATEMENT_LIST).

So, I believe the patch is correct but for C++ it might be hard to see it
actually trigger because most often one will see the C++ FE specific trees
of TRY_BLOCK (with HANDLER) and EH_SPEC_BLOCK instead.
So, I wonder why cxx_block_may_fallthru doesn't handle TRY_BLOCK and
EH_SPEC_BLOCK as well.  Given the genericization, I think
TRY_BLOCK should be handled similarly to TRY_CATCH_EXPR in tree.cc,
if second operand is HANDLER or STATEMENT_LIST starting with HANDLER,
check if any of the handler bodies can fall thru, dunno if TRY_BLOCK without
HANDLERs is possible, and for EH_SPEC_BLOCK see if the failure can fall
through.

Jakub

Re: [PATCH v5] Introduce attribute sym_alias

2023-11-22 Thread Richard Biener

On Mon, Nov 20, 2023 at 1:54 PM Alexandre Oliva  wrote:
>
> On Sep 20, 2023, Alexandre Oliva  wrote:
>
> > This patch introduces an attribute to add extra asm names (aliases)
> > for a decl when its definition is output.
>
> Ping?
> https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630971.html
>
> Re-regstrapped on x86_64-linux-gnu.  Ok to install?

OK if Honza or C/C++ maintainers do not request additional changes
this week.

Thanks,
Richard.

> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive

Re: [PATCH] tree: Fix up try_catch_may_fallthru [PR112619]

2023-11-22 Thread Jakub Jelinek

On Wed, Nov 22, 2023 at 01:06:28PM +0100, Jakub Jelinek wrote:
> Looking at a trivial example
> void bar ();
> void
> foo (void)
> {
>   try { bar (); } catch (int) {}
> }
> it seems it is even more complicated, because what e.g. the gimplification
> sees is not TRY_CATCH_EXPR with CATCH_EXPR second operand, but
> TRY_BLOCK with HANDLER second operand (note, certainly not wrapped in a
> STATEMENT_LIST, one would need another catch (long) {} for it after it),
> C++ FE specific trees.
> And cp_gimplify_expr then on the fly turns the TRY_BLOCK into TRY_CATCH_EXPR
> (in genericize_try_block) and HANDLER into CATCH_EXPR
> (genericize_catch_block).
> When gimplifying EH_SPEC_BLOCK in genericize_eh_spec_block it even
> creates TRY_CATCH_EXPR with genericize_eh_spec_block -> 
> build_gimple_eh_filter_tree
> if even creates TRY_CATCH_EXPR with EH_FILTER_EXPR as its second operand
> (without intervening STATEMENT_LIST).

Ah, and the difference between the above where TRY_BLOCK is turned into
TRY_CATCH_EXPR and HANDLER into CATCH_EXPR vs. the ICE on the testcase from
the PR is that in that case it isn't TRY_BLOCK, but CLEANUP_STMT which is
not changed during gimplification but already during cp genericization.
So, pedantically perhaps just assuming TRY_CATCH_EXPR where second argument
is not STATEMENT_LIST to be the CATCH_EXPR/EH_FILTER_EXPR case could work
for C++, but there are other FEs and it would be fragile (and weird, given
that STATEMENT_LIST with single stmt in it vs. that stmt ought to be
generally interchangeable).

Plus of course question whether we want to handle TRY_BLOCK/EH_SPEC_BLOCK in
cxx_block_may_fallthru in addition to that remains (it apparently already
handles CLEANUP_STMT, but strangely just the try/finally special case of it
- I'd assume the CLEANUP_EH_ONLY case would be
(block_may_fallthru (CLEANUP_BODY (stmt))
 || block_may_fallthru (CLEANUP_EXPR (stmt)))
because if the body can fallthru, everything can, and if there is an
exception and cleanup can fallthru, then it could fallthru as well).

Jakub

Re: [PATCH] RISC-V: Fix incorrect use of vcompress in permutation auto-vectorization

2023-11-22 Thread juzhe.zh...@rivai.ai

Committed, as it is an obvious bug fix.



juzhe.zh...@rivai.ai
 
From: Juzhe-Zhong
Date: 2023-11-22 18:53
To: gcc-patches
CC: kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc; Juzhe-Zhong
Subject: [PATCH] RISC-V: Fix incorrect use of vcompress in permutation 
auto-vectorization
This patch fixes the following FAILs on zvl512b of an RV32 system:
 
FAIL: gcc.target/riscv/rvv/autovec/struct/struct_vect_run-12.c execution test
FAIL: gcc.target/riscv/rvv/autovec/struct/struct_vect_run-9.c execution test
 
The root cause is that the vcompress optimization is applied to the
permutation indices {0,3,7,0}, which is incorrect.  Fix this bug in the
vcompress optimization.
 
PR target/112598
 
gcc/ChangeLog:
 
* config/riscv/riscv-v.cc (shuffle_compress_patterns): Fix vcompress bug.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/pr112598-3.c: New test.
 
---
gcc/config/riscv/riscv-v.cc   | 15 ++---
.../gcc.target/riscv/rvv/autovec/pr112598-3.c | 21 +++
2 files changed, 29 insertions(+), 7 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 7d6d0821d87..7d3e8038dab 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3005,14 +3005,15 @@ shuffle_compress_patterns (struct expand_vec_perm_d *d)
   if (compress_point < 0)
 return false;
-  /* It must be series increasing from compress point.  */
-  if (!d->perm.series_p (compress_point, 1, d->perm[compress_point], 1))
-return false;
-
   /* We can only apply compress approach when all index values from 0 to
  compress point are increasing.  */
   for (int i = 1; i < compress_point; i++)
-if (known_le (d->perm[i], d->perm[i - 1]))
+if (maybe_le (d->perm[i], d->perm[i - 1]))
+  return false;
+
+  /* It must be series increasing from compress point.  */
+  for (int i = 1 + compress_point; i < vlen; i++)
+if (maybe_ne (d->perm[i], d->perm[i - 1] + 1))
   return false;
   /* Success!  */
@@ -3080,10 +3081,10 @@ shuffle_compress_patterns (struct expand_vec_perm_d *d)
   if (need_slideup_p)
 {
   int slideup_cnt = vlen - (d->perm[vlen - 1].to_constant () % vlen) - 1;
-  rtx ops[] = {d->target, d->op1, gen_int_mode (slideup_cnt, Pmode)};
+  merge = gen_reg_rtx (vmode);
+  rtx ops[] = {merge, d->op1, gen_int_mode (slideup_cnt, Pmode)};
   insn_code icode = code_for_pred_slide (UNSPEC_VSLIDEUP, vmode);
   emit_vlmax_insn (icode, BINARY_OP, ops);
-  merge = d->target;
 }
   insn_code icode = code_for_pred_compress (vmode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c
new file mode 100644
index 000..231a068c680
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv_zvfh_zfh_zvl512b -mabi=ilp32d -O3 
-ftree-vectorize -std=c99 -fno-vect-cost-model" } */
+
+#include 
+#define TYPE uint64_t
+#define ITYPE int64_t
+
+void __attribute__ ((noinline, noclone))
+foo (TYPE *__restrict a, TYPE *__restrict b, TYPE *__restrict c,
+TYPE *__restrict d, ITYPE n)
+{
+  for (ITYPE i = 0; i < n; ++i)
+{
+  d[i * 3] = a[i];
+  d[i * 3 + 1] = b[i];
+  d[i * 3 + 2] = c[i];
+}
+}
+
+/* We don't want vcompress.vv.  */
+/* { dg-final { scan-assembler-not {vcompress\.vv} } } */
-- 
2.36.3
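
The switch from known_le to maybe_le in the hunk above is easier to see with a simplified model of a runtime-sized value. The sketch below is not GCC's poly_int API: Poly, known_le_model and maybe_le_model are hypothetical and model a value c0 + c1 * x, where x >= 0 is unknown at compile time. Rejecting a pattern only when the "not increasing" relation is known lets through cases where it merely might hold; rejecting whenever it may hold is the conservative choice.

```cpp
// Simplified model of a value of the form c0 + c1 * x, with x >= 0 unknown.
struct Poly
{
  long c0, c1;
};

// True only if a <= b holds for every x >= 0.
constexpr bool
known_le_model (Poly a, Poly b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// True if a <= b holds for at least one x >= 0.
constexpr bool
maybe_le_model (Poly a, Poly b)
{
  return a.c0 <= b.c0 || a.c1 < b.c1;
}
```

Take a previous index of 4 + 4x and a current index of 8: the current index is not known to be less than or equal to the previous one (it is larger at x = 0), yet it may be (they are equal at x = 1), so only the "maybe" test rejects this pair.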

Re: [PATCH] mingw: Exclude utf8 manifest [PR111170, PR108865]

2023-11-22 Thread Costas Argyris

Attached a new patch.

A couple things to note:

1) I changed your

host_extra_objs=utf8-mingw32.o

to

host_extra_objs_mingw=utf8-mingw32.o

to match the other two, since I believe that's what you meant.

2) This approach has the complication that the variables
in configure.ac need to be set before it sources config.host.

On Wed, 22 Nov 2023 at 01:17, Jonathan Yong <10wa...@gmail.com> wrote:

> On 11/21/23 18:07, Costas Argyris wrote:
> > This patch makes the inclusion of the utf8 manifest on the
> > mingw hosts optional by introducing the configure option
> > --disable-win32-utf8-manifest (has no effect on non-mingw
> > hosts).
> >
> > Bootstrapped OK on i686-w64-mingw32 and x86_64-w64-mingw32
> > with and without --disable-win32-utf8-manifest.
> >
> > Costas
> >
>
> I would prefer a AC_ARG_ENABLE to document the option in configure.ac,
> so it would show with configure --help. It should set new variables to
> i386/x-mingw32-utf8, utf8rc-mingw32.o and utf8-mingw32.o respectively
> unless disabled, like so:
>
> host_xmake_mingw=i386/x-mingw32-utf8
> host_extra_gcc_objs_mingw=utf8rc-mingw32.o
> host_extra_objs=utf8-mingw32.o
>
> And then entries in config.host would be:
>
> >   i[34567]86-*-mingw32* | x86_64-*-mingw*)
> > host_xm_file=i386/xm-mingw32.h
> > host_xmake_file="${host_xmake_file} ${host_xmake_mingw} i386/x-mingw32"
> > host_extra_gcc_objs="${host_extra_gcc_objs} ${host_extra_gcc_objs_mingw} driver-mingw32.o"
> > host_extra_objs="${host_extra_objs} ${host_extra_objs_mingw}"
>
>


Exclude-win32-utf8-manifest.patch
Description: Binary data

Re: [PATCH v5] Introduce attribute sym_alias (was: Last call for bikeshedding on attribute sym/exalias/reverse_alias)

2023-11-22 Thread Jan Hubicka

Hi,
it seems that interface to symbol table is fairly minimal here reduced
to...
>   (create_sym_alias_decl, create_sym_alias_decls): New.
>   * cgraphunit.cc (cgraph_node::analyze): Create alias_target
>   node if needed.
called from here...
>   (analyze_functions): Fixup visibility of implicit alias only
>   after its node is analyzed.

> +  if (VAR_P (replaced))
> + varpool_node::create_extra_name_alias (sym_node->decl, replacement);
> +  else
> + cgraph_node::create_same_body_alias (sym_node->decl, replacement);

I wonder why you use same body aliases, which are kind of special to C++
frontend (and come with fixup code working around its quirks you had to
disable above).

Why do you not produce the usual alias attribute once you know the symbol
table, so it goes down the cgraph_node::create_alias or
varpool_node::create_alias path?

Honza

Re: Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread 钟居哲

I am totally OK with approving theadvector for GCC-14 before stage 3 closes,
as long as it doesn't touch the current RVV code too much and binutils
supports theadvector.

I have provided a draft approach:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
which turns out not to need any changes to vector.md.
I strongly suggest following this draft. I can actively review theadvector
during stage 3,
and hopefully help you land theadvector in GCC-14.

Thanks.



juzhe.zh...@rivai.ai
 
From: Christoph Müllner
Date: 2023-11-22 18:07
To: juzhe.zh...@rivai.ai
CC: gcc-patches; kito.cheng; Kito.cheng; cooper.joshua; Robin Dapp; 
jeffreyalaw; Philipp Tomsich; Cooper Qu; Jin Ma; Nelson Chu
Subject: Re: RISC-V: Support XTheadVector extensions
Hi Juzhe,
 
Sorry for the late reply, but I was not on CC, so I missed this email.
 
On Fri, Nov 17, 2023 at 2:41 PM juzhe.zh...@rivai.ai
 wrote:
>
> Ok. I just read the theadvector extension.
>
> https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
>
> Theadvector is not custom extension. Just a uarch to disable some of the 
> RVV1.0 extension
> Theadvector can be considered as subextension of 'V' extension with disabling 
> some of the
> instructions and adding some new thead vector target load/store (This is 
> another story).
>
> So, for disabling the instruction that theadvector doesn't support.
> You don't need to touch such many codes.
>
> Here is a much simpler approach to do (I think it's definitely working):
> 1. Don't change any codes in vector.md and keep GCC generates ASM with "th." 
> prefix.
> 2. Add !TARGET_THEADVECTOR into vector-iterator.md to disable the mode you 
> don't want.
> For example , theadvector doesn't support fractional vector.
>
> Then it's pretty simple:
>
> RVVMF2SI "TARGET_VECTOR && !TARGET_THEADVECTOR".
>
> 3. Remove all the tests you add in this patch.
> 4. You can add theadvector specific load/store for example, th.vlb 
> instructions they are allowed.
> 5. Modify binutils, and make th.vmulh.vv as the pseudo instruction of vmulh.vv
> 6. So with compile option "-S", you will still see ASM as  "vmulh.vv". but 
> with objdump, you will see th.vmulh.vv.
 
Yes, all these points sound reasonable, to minimize the patchset size.
I believe in point 1 you meant "without th. prefix".
 
I've added Jin Ma (who is the main author of the Binutils patchset) so
he is also aware
of the proposal to use pseudo instructions to avoid duplication in Binutils.
 
Thank you very much!
Christoph
 
 
>
> After this change, you can send V2, then I can continue to review on GCC-15.
>
> Thanks.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: juzhe.zh...@rivai.ai
> Date: 2023-11-17 19:39
> To: gcc-patches
> CC: kito.cheng; kito.cheng; cooper.joshua; Robin Dapp; jeffreyalaw
> Subject: RISC-V: Support XTheadVector extensions
> 90% theadvector extension reusing current RVV 1.0 instructions patterns:
> Just change ASM, For example:
>
> @@ -2923,7 +2923,7 @@ (define_insn "*pred_mulh_scalar"
>   (match_operand:VFULLI_D 3 "register_operand"  "vr,vr, vr, vr")] VMULH)
>(match_operand:VFULLI_D 2 "vector_merge_operand" "vu, 0, vu,  0")))]
>"TARGET_VECTOR"
> -  "vmulh.vx\t%0,%3,%z4%p1"
> +  "%^vmulh.vx\t%0,%3,%z4%p1"
>[(set_attr "type" "vimul")
> (set_attr "mode" "")])
>
> +  if (letter == '^')
> +{
> +  if (TARGET_XTHEADVECTOR)
> + fputs ("th.", file);
> +  return;
> +}
>
>
> For almost all patterns, you just simply append "th." in the ASM prefix.
> like change "vmulh.vv" -> "th.vmulh.vv"
>
> Almost all theadvector instructions are not new features,  all same as RVV1.0.
> Why do you invent the such ISA doesn't include any features that RVV1.0 
> doesn't satisfy ?
>
> I am not explicitly object this patch. But I should know the reason.
>
> Btw, stage 1 will close soon.  So I will review this patch on GCC-15 as long 
> as all other RISC-V maintainers agree.
>
>
> 
> juzhe.zh...@rivai.ai
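
For readers following the thread, the `%^` idea from the quoted patch can be modelled outside GCC. The sketch below is not GCC code: `expand_template` and its `xtheadvector` flag are hypothetical names chosen for illustration, and only the `%^` marker is interpreted; every other character, including operand markers such as `%0`, is copied through untouched.

```c
#include <stddef.h>
#include <string.h>

/* Standalone model, not GCC code: copy TMPL to OUT, replacing each "%^"
   marker with "th." when XTHEADVECTOR is nonzero and with nothing
   otherwise.  All other characters are copied unchanged.  Returns 0 on
   success, or -1 if OUT (of size OUT_SIZE) is too small.  */
static int
expand_template (const char *tmpl, int xtheadvector,
                 char *out, size_t out_size)
{
  size_t n = 0;

  for (const char *p = tmpl; *p != '\0'; p++)
    {
      if (p[0] == '%' && p[1] == '^')
        {
          if (xtheadvector)
            {
              /* Need room for "th." plus the terminating NUL.  */
              if (n + 3 >= out_size)
                return -1;
              memcpy (out + n, "th.", 3);
              n += 3;
            }
          p++;  /* Step onto the '^'; the loop increment moves past it.  */
          continue;
        }
      if (n + 1 >= out_size)
        return -1;
      out[n++] = *p;
    }

  if (n >= out_size)
    return -1;
  out[n] = '\0';
  return 0;
}
```

Under these assumptions, the template `"%^vmulh.vx\t%0,%3"` expands with the prefix when the flag is set and without it otherwise, which is the per-pattern behaviour the quoted hunk describes.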

Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Simon Wright

On 21 Nov 2023, at 23:13, Iain Sandoe  wrote:

>> #if defined (__APPLE__)
>> -#include <unistd.h>
> 
> If removing unistd.h is intentional (i.e. you determined that it’s no longer
> needed for Darwin), then we should make that a separate patch.

I thought that I’d had to include unistd.h for the first patch in this thread; 
clearly not!

What I hope will be the final version:

——— 8< .———

In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
assumption for __APPLE__ is that file names are case-insensitive
unless __arm__ or __arm64__ are defined, in which case file names are
declared case-sensitive.

The associated comment is
  "By default, we suppose filesystems aren't case sensitive on
  Windows and Darwin (but they are on arm-darwin)."

This means that on aarch64-apple-darwin, file names are treated as
case-sensitive, which is not the default case.

The true default position is that macOS file systems are
case-insensitive, iOS file systems are case-sensitive.

Apple provide a header file <TargetConditionals.h> which permits a
compile-time check for the compiler target (e.g. OSX vs IOS); if
TARGET_OS_IOS is defined as 1, this is a build for iOS.

  * gcc/ada/adaint.c
  (__gnat_get_file_names_case_sensitive): Split out the __APPLE__
  check and remove the checks for __arm__, __arm64__.
  For Apple, file names are by default case-insensitive unless
  TARGET_OS_IOS is set.

Signed-off-by: Simon Wright 
---
 gcc/ada/adaint.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index bb4ed2607e5..2e9c59ae958 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -85,6 +85,7 @@

 #if defined (__APPLE__)
 #include <unistd.h>
+#include <TargetConditionals.h>
 #endif

 #if defined (__hpux__)
@@ -613,11 +614,18 @@ __gnat_get_file_names_case_sensitive (void)
   else
{
  /* By default, we suppose filesystems aren't case sensitive on
-Windows and Darwin (but they are on arm-darwin).  */
-#if defined (WINNT) || defined (__DJGPP__) \
-  || (defined (__APPLE__) && !(defined (__arm__) || defined (__arm64__)))
+Windows or DOS.  */
+#if defined (WINNT) || defined (__DJGPP__)
  file_names_case_sensitive_cache = 0;
+#elif defined (__APPLE__)
+ /* By default, macOS volumes are case-insensitive, iOS
+volumes are case-sensitive.  */
+#if TARGET_OS_IOS
+ file_names_case_sensitive_cache = 1;
 #else
+ file_names_case_sensitive_cache = 0;
+#endif   
+#else /* Neither Windows nor Apple.  */
  file_names_case_sensitive_cache = 1;
 #endif
}
-- 
2.37.1 (Apple Git-137.1)
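
The compile-time selection in the patch above can also be read as a small standalone function. This is only a sketch under stated assumptions, not the adaint.c code: the function name `default_file_names_case_sensitive` is hypothetical, and `_WIN32` stands in for the `WINNT` macro used in the patch. `TARGET_OS_IOS` is taken from Apple's `<TargetConditionals.h>`, which is only included when `__APPLE__` is defined.

```c
#if defined (__APPLE__)
#include <TargetConditionals.h>
#endif

/* Standalone sketch of the compile-time default; not the adaint.c code.
   Returns 1 when file names are assumed case-sensitive by default and
   0 otherwise.  */
static int
default_file_names_case_sensitive (void)
{
#if defined (_WIN32) || defined (__DJGPP__)
  return 0;   /* Windows or DOS.  */
#elif defined (__APPLE__)
# if TARGET_OS_IOS
  return 1;   /* iOS volumes default to case-sensitive.  */
# else
  return 0;   /* macOS volumes default to case-insensitive.  */
# endif
#else
  return 1;   /* Neither Windows nor Apple.  */
#endif
}
```

Because the decision is made by the preprocessor, it reflects the target the code was built for, not the volume a given file actually lives on.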

Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Arnaud Charlet

> >> #if defined (__APPLE__)
> >> -#include <unistd.h>
> > 
> > If removing unistd.h is intentional (i.e. you determined that it’s no longer
> > needed for Darwin), then we should make that a separate patch.
> 
> I thought that I’d had to include unistd.h for the first patch in this 
> thread; clearly not!
> 
> What I hope will be the final version:

OK here.

> ——— 8< .———
> 
> In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
> assumption for __APPLE__ is that file names are case-insensitive
> unless __arm__ or __arm64__ are defined, in which case file names are
> declared case-sensitive.
> 
> The associated comment is
>   "By default, we suppose filesystems aren't case sensitive on
>   Windows and Darwin (but they are on arm-darwin)."
> 
> This means that on aarch64-apple-darwin, file names are treated as
> case-sensitive, which is not the default case.
> 
> The true default position is that macOS file systems are
> case-insensitive, iOS file systems are case-sensitive.
> 
> Apple provide a header file <TargetConditionals.h> which permits a
> compile-time check for the compiler target (e.g. OSX vs IOS); if
> TARGET_OS_IOS is defined as 1, this is a build for iOS.
> 
>   * gcc/ada/adaint.c
>   (__gnat_get_file_names_case_sensitive): Split out the __APPLE__
>   check and remove the checks for __arm__, __arm64__.
>   For Apple, file names are by default case-insensitive unless
>   TARGET_OS_IOS is set.
> 
> Signed-off-by: Simon Wright

[pushed] [PR112610] [IRA]: Fix using undefined dump file in IRA code during insn scheduling

2023-11-22 Thread Vladimir Makarov


The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112610

The patch was successfully tested and bootstrapped on x86-64.

commit 95f61de95bbcc2e4fb7020e27698140abea23788
Author: Vladimir N. Makarov 
Date:   Wed Nov 22 09:01:02 2023 -0500

[IRA]: Fix using undefined dump file in IRA code during insn scheduling

Part of IRA code is used for register pressure sensitive insn
scheduling and live range shrinkage.  Numerous changes of IRA resulted
in that this IRA code uses dump file passed by the scheduler and
internal ira dump file (in called functions) which can be undefined or
freed by the scheduler during compiling previous functions.  The patch
fixes this problem.  To reproduce the error valgrind should be used
and GCC should be compiled with valgrind annotations.  Therefore the
patch does not contain the test case.

gcc/ChangeLog:

PR rtl-optimization/112610
* ira-costs.cc: (find_costs_and_classes): Remove arg.
Use ira_dump_file for printing.
(print_allocno_costs, print_pseudo_costs): Ditto.
(ira_costs): Adjust call of find_costs_and_classes.
(ira_set_pseudo_classes): Set up and restore ira_dump_file.

diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc
index e0528e76a64..c3efd295e54 100644
--- a/gcc/ira-costs.cc
+++ b/gcc/ira-costs.cc
@@ -1662,16 +1662,16 @@ scan_one_insn (rtx_insn *insn)
 
 
 
-/* Print allocnos costs to file F.  */
+/* Print allocnos costs to the dump file.  */
 static void
-print_allocno_costs (FILE *f)
+print_allocno_costs (void)
 {
   int k;
   ira_allocno_t a;
   ira_allocno_iterator ai;
 
   ira_assert (allocno_p);
-  fprintf (f, "\n");
+  fprintf (ira_dump_file, "\n");
   FOR_EACH_ALLOCNO (a, ai)
 {
   int i, rclass;
@@ -1681,32 +1681,34 @@ print_allocno_costs (FILE *f)
   enum reg_class *cost_classes = cost_classes_ptr->classes;
 
   i = ALLOCNO_NUM (a);
-  fprintf (f, "  a%d(r%d,", i, regno);
+  fprintf (ira_dump_file, "  a%d(r%d,", i, regno);
   if ((bb = ALLOCNO_LOOP_TREE_NODE (a)->bb) != NULL)
-	fprintf (f, "b%d", bb->index);
+	fprintf (ira_dump_file, "b%d", bb->index);
   else
-	fprintf (f, "l%d", ALLOCNO_LOOP_TREE_NODE (a)->loop_num);
-  fprintf (f, ") costs:");
+	fprintf (ira_dump_file, "l%d", ALLOCNO_LOOP_TREE_NODE (a)->loop_num);
+  fprintf (ira_dump_file, ") costs:");
   for (k = 0; k < cost_classes_ptr->num; k++)
 	{
 	  rclass = cost_classes[k];
-	  fprintf (f, " %s:%d", reg_class_names[rclass],
+	  fprintf (ira_dump_file, " %s:%d", reg_class_names[rclass],
 		   COSTS (costs, i)->cost[k]);
 	  if (flag_ira_region == IRA_REGION_ALL
 	  || flag_ira_region == IRA_REGION_MIXED)
-	fprintf (f, ",%d", COSTS (total_allocno_costs, i)->cost[k]);
+	fprintf (ira_dump_file, ",%d",
+		 COSTS (total_allocno_costs, i)->cost[k]);
 	}
-  fprintf (f, " MEM:%i", COSTS (costs, i)->mem_cost);
+  fprintf (ira_dump_file, " MEM:%i", COSTS (costs, i)->mem_cost);
   if (flag_ira_region == IRA_REGION_ALL
 	  || flag_ira_region == IRA_REGION_MIXED)
-	fprintf (f, ",%d", COSTS (total_allocno_costs, i)->mem_cost);
-  fprintf (f, "\n");
+	fprintf (ira_dump_file, ",%d",
+		 COSTS (total_allocno_costs, i)->mem_cost);
+  fprintf (ira_dump_file, "\n");
 }
 }
 
-/* Print pseudo costs to file F.  */
+/* Print pseudo costs to the dump file.  */
 static void
-print_pseudo_costs (FILE *f)
+print_pseudo_costs (void)
 {
   int regno, k;
   int rclass;
@@ -1714,21 +1716,21 @@ print_pseudo_costs (FILE *f)
   enum reg_class *cost_classes;
 
   ira_assert (! allocno_p);
-  fprintf (f, "\n");
+  fprintf (ira_dump_file, "\n");
   for (regno = max_reg_num () - 1; regno >= FIRST_PSEUDO_REGISTER; regno--)
 {
   if (REG_N_REFS (regno) <= 0)
 	continue;
   cost_classes_ptr = regno_cost_classes[regno];
   cost_classes = cost_classes_ptr->classes;
-  fprintf (f, "  r%d costs:", regno);
+  fprintf (ira_dump_file, "  r%d costs:", regno);
   for (k = 0; k < cost_classes_ptr->num; k++)
 	{
 	  rclass = cost_classes[k];
-	  fprintf (f, " %s:%d", reg_class_names[rclass],
+	  fprintf (ira_dump_file, " %s:%d", reg_class_names[rclass],
 		   COSTS (costs, regno)->cost[k]);
 	}
-  fprintf (f, " MEM:%i\n", COSTS (costs, regno)->mem_cost);
+  fprintf (ira_dump_file, " MEM:%i\n", COSTS (costs, regno)->mem_cost);
 }
 }
 
@@ -1939,7 +1941,7 @@ calculate_equiv_gains (void)
and their best costs.  Set up preferred, alternative and allocno
classes for pseudos.  */
 static void
-find_costs_and_classes (FILE *dump_file)
+find_costs_and_classes (void)
 {
   int i, k, start, max_cost_classes_num;
   int pass;
@@ -1991,8 +1993,8 @@ find_costs_and_classes (FILE *dump_file)
  classes to guide the selection.  */
   for (pass = start; pass <= flag_expensive_optimizations; pass++)
 {
-  if ((!allocno_p || internal_flag_ira_verbose > 0) && dump_file)
-	fprintf (dump_file

Re: [PATCH v4] Introduce strub: machine-independent stack scrubbing

2023-11-22 Thread Richard Biener

On Mon, Nov 20, 2023 at 1:40 PM Alexandre Oliva  wrote:
>
> On Oct 26, 2023, Alexandre Oliva  wrote:
>
> >> This is a refreshed and improved version of the version posted back in
> >> June.  https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621936.html
>
> > Ping? https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633675.html
> > I'm combining the gcc/ipa-strub.cc bits from
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633526.html
>
> Ping?
> Retested on x86_64-linux-gnu, with and without -fstrub=all.

@@ -898,7 +899,24 @@ decl_attributes (tree *node, tree attributes, int flags,
   TYPE_NAME (tt) = *node;
 }

-  *anode = cur_and_last_decl[0];
+  if (*anode != cur_and_last_decl[0])
+{
+  /* Even if !spec->function_type_required, allow the attribute
+ handler to request the attribute to be applied to the function
+ type, rather than to the function pointer type, by setting
+ cur_and_last_decl[0] to the function type.  */
+  if (!fn_ptr_tmp
+  && POINTER_TYPE_P (*anode)
+  && TREE_TYPE (*anode) == cur_and_last_decl[0]
+  && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (*anode)))
+ {
+  fn_ptr_tmp = TREE_TYPE (*anode);
+  fn_ptr_quals = TYPE_QUALS (*anode);
+  anode = &fn_ptr_tmp;
+ }
+  *anode = cur_and_last_decl[0];
+}
+

what is this a workaround for?  Isn't there a suitable parsing position
for placing the attribute?

+#ifndef STACK_GROWS_DOWNWARD
+# define STACK_TOPS GT
+#else
+# define STACK_TOPS LT
+#endif

according to docs this is defined to 0 or 1 so the above looks wrong
(it's always defined).
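
The difference between testing whether a macro is defined and testing its value can be shown in isolation. The sketch below uses a hypothetical macro `FEATURE_FLAG` that, like the macro described above, is always defined to 0 or 1; none of the names come from GCC.

```c
/* Standalone illustration with a hypothetical macro; not GCC code.
   FEATURE_FLAG is always defined, to either 0 or 1.  */
#define FEATURE_FLAG 0

/* Tests only whether the macro is defined, so the value 0 is ignored.  */
#ifndef FEATURE_FLAG
# define BY_IFNDEF "undefined branch"
#else
# define BY_IFNDEF "defined branch"
#endif

/* Tests the value, which is what a 0-or-1 macro calls for.  */
#if FEATURE_FLAG
# define BY_IF "nonzero branch"
#else
# define BY_IF "zero branch"
#endif

static const char *
by_ifndef (void)
{
  return BY_IFNDEF;
}

static const char *
by_if (void)
{
  return BY_IF;
}
```

With the macro defined as 0, the `#ifndef` form still takes the "defined" branch, so a hunk written that way would always pick the same comparison code regardless of the macro's value.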

+  if (optimize < 2 || optimize_size || flag_no_inline)
+return NULL_RTX;

I'm wondering about these checks in the expansions of the builtins,
I think this is about inline expanding or emitting a libcall, right?
I wonder if you should use optimize_function_for_speed (cfun) instead?
Usually -fno-inline shouldn't affect such calls, but -fno-builtin-FOO would.
I have no strong opinion here though.

The new builtins seem undocumented - usually those are documented
within extend.texi - I guess placing __builtin___strub_enter calls in
the code manually will break in interesting ways - if that's not supposed
to happen the trick is to embed a space in the name of the built-in.
__builtin_stack_address looks like something users will pick up though
(and thus should be documented)?

-symtab_node::reset (void)
+symtab_node::reset (bool preserve_comdat_group)

not sure what for, I'll leave Honza to comment.

+/* Create a distinct copy of the type of NODE's function, and change
+   the fntype of all calls to it with the same main type to the new
+   type.  */
+
+static void
+distinctify_node_type (cgraph_node *node)
+{
+  tree old_type = TREE_TYPE (node->decl);
+  tree new_type = build_distinct_type_copy (old_type);
+  tree new_ptr_type = NULL_TREE;
+
+  /* Remap any calls to node->decl that use old_type, or a variant
+ thereof, to new_type as well.  We don't look for aliases, their
+ declarations will have their types changed independently, and
+ we'll adjust their fntypes then.  */
+  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
+{
+  if (!e->call_stmt)
+ continue;
+  tree fnaddr = gimple_call_fn (e->call_stmt);
+  gcc_checking_assert (TREE_CODE (fnaddr) == ADDR_EXPR
+   && TREE_OPERAND (fnaddr, 0) == node->decl);
+  if (strub_call_fntype_override_p (e->call_stmt))
+ continue;
+  if (!new_ptr_type)
+ new_ptr_type = build_pointer_type (new_type);
+  TREE_TYPE (fnaddr) = new_ptr_type;
+  gimple_call_set_fntype (e->call_stmt, new_type);
+}
+
+  TREE_TYPE (node->decl) = new_type;

it does feel like there's IPA mechanisms to deal with what you are trying to do
here (or in the caller(s)).


+unsigned int
+pass_ipa_strub_mode::execute (function *)
+{
+  last_cgraph_order = 0;
+  ipa_strub_set_mode_for_new_functions ();
+
+  /* Verify before any inlining or other transformations.  */
+  verify_strub ();

if  (flag_checking) verify_strub ();

please.  I guess we talked about this last year - what's the reason to have both
an IPA pass and a simple IPA pass?  IIRC the simple IPA pass is a simple
one because it wants to see inlined bodies and "fixes" those up?  Some toplevel
comments explaining both passes in the ipa-strub.cc pass would be nice to
have.  I guess I also asked before - did you try it with -flto?

+/* Decide which of the wrapped function's parms we want to turn into
+   references to the argument passed to the wrapper.  In general, we want to
+   copy small arguments, and avoid copying large ones.  Variable-sized array
+   lengths given by other arguments, as in 20020210-1.c, would lead to
+   problems if passed by value, after resetting the original function and
+   dropping the length computation; passing them by reference works.
+   DECL_BY_REFERENCE is *not* a substitute for this: it involves copying
+   anyway, but performed at the caller.  */
+indirect_parms_t indirect_nparms (3

Re: [PATCH v2] A new copy propagation and PHI elimination pass

2023-11-22 Thread Filip Kastl

Hi Richard,

> Can you name the new file gimple-ssa-sccopy.cc please?

Yes, no problem.

Btw, I thought that it is standard that gimple ssa passes have the tree-ssa-
prefix. Do I understand it correctly that this is not true and many
tree-ssa-*.cc passes should actually be named gimple-ssa-*.cc but remain
tree-ssa-*.cc for historical reasons?

>> +   3 A set of PHI statements that only refer to each other or to one other
>> + value.
>> +
>> +   _8 = PHI <_9, _10>;
>> +   _9 = PHI <_8, _10>;
>> +   _10 = PHI <_8, _9, _1>;
> 
> this case necessarily involves a cyclic CFG, so maybe say
> 
> "This is a lightweight SSA copy propagation pass that is able to handle
> cycles optimistically, eliminating PHIs within those."
> 
> ?  Or is this a mis-characterization?

I'm not sure what you mean here. Yes, this case always involves a cyclic CFG.
Is it weird that a lightweight pass is able to handle cyclic CFG and therefore
you suggest to comment this fact and say that the pass handles cycles
optimistically?

I'm not sure if optimistic is a good word to characterize the pass. I'd expect
an "optimistic" pass to make assumptions which may not be true and therefore
not always remove all the redundancies it can. This pass however should
achieve all that it sets out to do.

> It might be nice to optimize SCCs of size 1 somehow, not sure how
> many times these appear - possibly prevent them from even entering
> the SCC discovery?

Maybe that could be done. I would have to think about it and make sure it
doesn't break anything. I'd prefer to get this version into upstream and then
possibly post this upgrade later.

Btw, SCCs of size 1 appear all the time. Those are the cases 1 and 2
described in the comment at the beginning of the file.

> I'll note that while you are working with stmts everywhere that
> you are really bound to using SSA defs and those would already
> nicely have numbers (the SSA_NAME_VERSION).  In principle the
> SCC lattice could be pre-allocated once, indexed by
> SSA_NAME_VERSION and you could keep a "generation" number
> indicating what SCC discovery round it belongs to (aka the
> set_using).

I see. I could allocate a vertex struct for each statement only once when the
pass is invoked instead of allocating the structs each time tarjan_compute_sccs
is called. Will do that.

I'm not sure if I want to use SSA_NAME_VERSION for indexing a vec/array with
all those vertex structs. Many SSA names will be defined neither by PHI nor by
a copy assignment statement. If I created a vertex struct for every SSA name I
would allocate a lot of extra memory.
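
The "generation" idea from the quoted suggestion can be sketched on its own. This is a standalone model, not GCC code: `struct gen_table` and its functions are hypothetical names, the table is indexed by a plain version number, and counter wrap-around is ignored. It allocates one slot per version, which is exactly the memory trade-off raised above.

```c
#include <stdlib.h>

/* Standalone sketch, not GCC code.  One slot per version number, allocated
   once; a slot counts as marked only if its stamp equals the table's
   current generation, so starting a new round needs no clearing pass.  */
struct slot
{
  unsigned generation;  /* Round in which this slot was last marked.  */
  unsigned index;       /* Per-round payload, e.g. a DFS index.  */
};

struct gen_table
{
  struct slot *slots;
  unsigned num_slots;
  unsigned current;     /* Current round; 0 means no round started yet.  */
};

/* Allocate NUM_SLOTS zeroed slots.  Returns 0 on success, -1 on failure.  */
static int
gen_table_init (struct gen_table *t, unsigned num_slots)
{
  t->slots = calloc (num_slots, sizeof *t->slots);
  t->num_slots = num_slots;
  t->current = 0;
  return (t->slots != NULL || num_slots == 0) ? 0 : -1;
}

/* Begin a new round; every previously marked slot becomes unmarked.  */
static void
gen_table_new_round (struct gen_table *t)
{
  t->current++;
}

/* Mark VERSION as belonging to the current round and store INDEX.  */
static void
gen_table_mark (struct gen_table *t, unsigned version, unsigned index)
{
  t->slots[version].generation = t->current;
  t->slots[version].index = index;
}

/* Nonzero if VERSION was marked during the current round.  */
static int
gen_table_marked_p (const struct gen_table *t, unsigned version)
{
  return t->current != 0 && t->slots[version].generation == t->current;
}

static void
gen_table_free (struct gen_table *t)
{
  free (t->slots);
  t->slots = NULL;
}
```

A caller would invoke `gen_table_new_round` once per SCC discovery run and then use `gen_table_mark` and `gen_table_marked_p` in place of a per-statement flag that has to be cleared afterwards.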

> There's a old SCC finding algorithm working on the SSA graph
> in the old SCC based value-numbering, for example on the
> gcc 7 branch in tree-ssa-sccvn.c:DFS

> For reading it would be nice to put the SCC finding in its
> own class.

Okay, I'll do that.

> > +   }
> > +}
> > +
> > +  if (!stack.is_empty ())
> > +gcc_unreachable ();
> > +
> > +  /* Clear copy stmts' 'using' flags.  */
> > +  for (vertex v : vs)
> > +{
> > +  gimple *s = v.stmt;
> > +  tarjan_clear_using (s);
> > +}
> > +
> > +  return sccs;
> > +}
> > +
> > +/* Could this statement potentially be a copy statement?
> > +
> > +   This pass only considers statements for which this function returns 'true'.
> > +   Those are basically PHI functions and assignment statements similar to
> > +
> > +   _2 = _1;
> > +   or
> > +   _2 = 5;  */
> > +
> > +static bool
> > +stmt_may_generate_copy (gimple *stmt)
> > +{
> > +  if (gimple_code (stmt) == GIMPLE_PHI)
> > +{
> > +  gphi *phi = as_a <gphi *> (stmt);
> > +
> > +  /* No OCCURS_IN_ABNORMAL_PHI SSA names in lhs nor rhs.  */
> > +  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi)))
> > +   return false;
> > +
> > +  unsigned i;
> > +  for (i = 0; i < gimple_phi_num_args (phi); i++)
> > +   {
> > + tree op = gimple_phi_arg_def (phi, i);
> > + if (TREE_CODE (op) == SSA_NAME
> > + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
> > +   return false;
> > +   }
> 
> When there's more than one non-SSA PHI argument and they are not
> the same then the stmt also cannot be a copy, right?
> 
> > +  return true;
> > +}

Do I understand you correctly that you propose to put another check here?
Something like

unsigned nonssa_args_num = 0;
unsigned i;
for (i = 0; i < gimple_phi_num_args (phi); i++)
  {
tree op = gimple_phi_arg_def (phi, i);
if (TREE_CODE (op) == SSA_NAME)
  {
nonssa_args_num++;
if (nonssa_args_num >= 2)
  return false;

if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
  return false;
  }
  }
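
One possible reading of the reviewer's remark is that the check concerns arguments that are not SSA names: if two such arguments differ, the PHI cannot be a copy. The sketch below models only that reading outside GCC. `struct phi_arg` and `phi_constants_agree` are hypothetical, and constants are reduced to plain `long` values; in GCC the comparison would be made on tree operands, which is not shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Standalone model, not GCC code.  A PHI argument is either an SSA name
   (VALUE is then ignored) or a constant whose value is VALUE.  */
struct phi_arg
{
  bool is_ssa_name;
  long value;
};

/* Return false if two non-SSA arguments have different values, since the
   PHI then cannot be a copy of a single value; return true otherwise.  */
static bool
phi_constants_agree (const struct phi_arg *args, size_t num_args)
{
  const struct phi_arg *first_const = NULL;

  for (size_t i = 0; i < num_args; i++)
    {
      if (args[i].is_ssa_name)
        continue;
      if (first_const == NULL)
        first_const = &args[i];
      else if (args[i].value != first_const->value)
        return false;
    }
  return true;
}
```

In this model a PHI with arguments `_8, 5, 5` passes the check, while `_8, 5, 6` is rejected.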

> > +
> > +  if (gimple_code (stmt) != GIMPLE_ASSIGN)
> > +return false;
> > +
> > +  /* If the statement has volatile operands, it won't generate a
> > + useful copy.  */
> > +  if (gimple_has_volatile_ops (stmt))
> > +return false;
> > +
> > +  /* Statements with loads and/or stores will never generate a useful 
> > copy.

Re: [PATCH v2] gcov: Fix integer types in gen_counter_update()

2023-11-22 Thread Christophe Lyon

Hi,

On Tue, 21 Nov 2023 at 12:22, Sebastian Huber
 wrote:
>
> On 21.11.23 11:46, Jakub Jelinek wrote:
> > On Tue, Nov 21, 2023 at 11:42:06AM +0100, Sebastian Huber wrote:
> >>
> >> On 21.11.23 11:34, Jakub Jelinek wrote:
>  --- a/gcc/tree-profile.cc
>  +++ b/gcc/tree-profile.cc
>  @@ -281,10 +281,13 @@ gen_assign_counter_update (gimple_stmt_iterator 
>  *gsi, gcall *call, tree func,
>   if (result)
> {
>   tree result_type = TREE_TYPE (TREE_TYPE (func));
>  -  tree tmp = make_temp_ssa_name (result_type, NULL, name);
>  -  gimple_set_lhs (call, tmp);
>  +  tree tmp1 = make_temp_ssa_name (result_type, NULL, name);
>  +  gimple_set_lhs (call, tmp1);
>   gsi_insert_after (gsi, call, GSI_NEW_STMT);
>  -  gassign *assign = gimple_build_assign (result, tmp);
>  +  tree tmp2 = make_ssa_name (TREE_TYPE (result));
>  +  gassign *assign = gimple_build_assign (tmp2, NOP_EXPR, tmp1);
>  +  gsi_insert_after (gsi, assign, GSI_NEW_STMT);
>  +  assign = gimple_build_assign (result, gimple_assign_lhs (assign));
> >>> When you use a temporary tmp2 for the lhs of the conversion, you can just
> >>> use it here,
> >>> assign = gimple_build_assign (result, tmp2);
> >>>
> >>> Ok for trunk with that change.
> >> Just a question, could I also use
> >>
> >> tree tmp2 = make_temp_ssa_name (TREE_TYPE (result), NULL, name);
> >>
> >> ?
> >>
> >> This make_temp_ssa_name() is used throughout the file and the new
> >> make_ssa_name() would be the first use in this file.
> > Yes.  The only difference is that it won't be _234 = (type) something;
> > but PROF_time_profile_234 = (type) something; in the dumps, but sure,
> > consistency is useful.
>
> Thanks for your help. I checked in an updated version.
>

Our CI bisected a regression to this commit:
Running gcc:gcc.dg/tree-prof/tree-prof.exp ...
FAIL: gcc.dg/tree-prof/time-profiler-3.c scan-ipa-dump-times profile
"Read tp_first_run: 0" 1
FAIL: gcc.dg/tree-prof/time-profiler-3.c scan-ipa-dump-times profile
"Read tp_first_run: 2" 1

(on aarch64)

Can you check?

Thanks,

Christophe

> --
> embedded brains GmbH
> Herr Sebastian HUBER
> Dornierstr. 4
> 82178 Puchheim
> Germany
> email: sebastian.hu...@embedded-brains.de
> phone: +49-89-18 94 741 - 16
> fax:   +49-89-18 94 741 - 08
>
> Registergericht: Amtsgericht München
> Registernummer: HRB 157899
> Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
> Unsere Datenschutzerklärung finden Sie hier:
> https://embedded-brains.de/datenschutzerklaerung/

Re: [PATCH v2] gcov: Fix integer types in gen_counter_update()

2023-11-22 Thread Sebastian Huber


On 22.11.23 15:22, Christophe Lyon wrote:

On Tue, 21 Nov 2023 at 12:22, Sebastian Huber
  wrote:

On 21.11.23 11:46, Jakub Jelinek wrote:

On Tue, Nov 21, 2023 at 11:42:06AM +0100, Sebastian Huber wrote:

On 21.11.23 11:34, Jakub Jelinek wrote:

--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -281,10 +281,13 @@ gen_assign_counter_update (gimple_stmt_iterator *gsi, 
gcall *call, tree func,
  if (result)
{
  tree result_type = TREE_TYPE (TREE_TYPE (func));
-  tree tmp = make_temp_ssa_name (result_type, NULL, name);
-  gimple_set_lhs (call, tmp);
+  tree tmp1 = make_temp_ssa_name (result_type, NULL, name);
+  gimple_set_lhs (call, tmp1);
  gsi_insert_after (gsi, call, GSI_NEW_STMT);
-  gassign *assign = gimple_build_assign (result, tmp);
+  tree tmp2 = make_ssa_name (TREE_TYPE (result));
+  gassign *assign = gimple_build_assign (tmp2, NOP_EXPR, tmp1);
+  gsi_insert_after (gsi, assign, GSI_NEW_STMT);
+  assign = gimple_build_assign (result, gimple_assign_lhs (assign));

When you use a temporary tmp2 for the lhs of the conversion, you can just
use it here,
 assign = gimple_build_assign (result, tmp2);

Ok for trunk with that change.

Just a question, could I also use

tree tmp2 = make_temp_ssa_name (TREE_TYPE (result), NULL, name);

?

This make_temp_ssa_name() is used throughout the file and the new
make_ssa_name() would be the first use in this file.

Yes.  The only difference is that it won't be _234 = (type) something;
but PROF_time_profile_234 = (type) something; in the dumps, but sure,
consistency is useful.

Thanks for your help. I checked in an updated version.


Our CI bisected a regression to this commit:
Running gcc:gcc.dg/tree-prof/tree-prof.exp ...
FAIL: gcc.dg/tree-prof/time-profiler-3.c scan-ipa-dump-times profile
"Read tp_first_run: 0" 1
FAIL: gcc.dg/tree-prof/time-profiler-3.c scan-ipa-dump-times profile
"Read tp_first_run: 2" 1

(on aarch64)

Can you check?


Yes, I will have a look at it.

--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/

Re: Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Christoph Müllner

On Wed, Nov 22, 2023 at 2:52 PM 钟居哲  wrote:
>
> I am totally ok to approve theadvector on GCC-14 before stage 3 close
> as long as it doesn't touch the current RVV codes too much and binutils 
> supports theadvector.
>
> I have provided the draft approach:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
> which turns out doesn't need to change any codes of vector.md.
> I strongly suggest follow this draft. I can be actively review theadvector 
> during stage 3.
> And hopefully can help you land theadvector on GCC-14.

I now see two approaches:
1) Let GCC emit RVV instructions for XTheadVector for instructions
that are in both
2) Use the ASM_OUTPUT_OPCODE hook to output "th." for these instructions

No doubt, the ASM_OUTPUT_OPCODE hook approach is better than
our format-string approach, but would 1) not be the even better solution?
It would also mean that not a single test case is required for these
overlapping instructions (only a few tests that ensure that we don't emit
RVV instructions that are not available in XTheadVector).
Besides that, letting GCC emit RVV instructions for XTheadVector is a
very clever idea, because it fully utilizes the fact that both
extensions overlap to a huge degree.

The ASM_OUTPUT_OPCODE approach could lead to an issue if we enable
XTheadVector with any other vector extension, say Zvfoo. In this case
the Zvfoo instructions will all be prefixed with "th." as well. I know
that it is not likely to run into this problem (such a machine does not
exist in real hardware), but it is possible to trigger this issue easily,
and approach 1) would not have this potential issue.

Thanks,
Christoph
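
The concern about a blanket opcode hook can be illustrated with a small model. This is not GCC code and says nothing about how a real ASM_OUTPUT_OPCODE implementation would decide: `prefix_vector_opcode` is a hypothetical function, and "opcode starts with 'v'" is an assumed stand-in for "is a vector instruction". `vfoo.vv` is an invented mnemonic for the hypothetical Zvfoo extension.

```c
#include <stddef.h>
#include <string.h>

/* Standalone model, not GCC code: write OPCODE to OUT, prefixed with "th."
   whenever it starts with 'v'.  Returns 0 on success, or -1 if OUT (of
   size OUT_SIZE) is too small.  */
static int
prefix_vector_opcode (const char *opcode, char *out, size_t out_size)
{
  const char *prefix = opcode[0] == 'v' ? "th." : "";
  size_t need = strlen (prefix) + strlen (opcode) + 1;

  if (need > out_size)
    return -1;
  strcpy (out, prefix);
  strcat (out, opcode);
  return 0;
}
```

Under this assumption `vmulh.vv` becomes `th.vmulh.vv` as intended, but the invented `vfoo.vv` becomes `th.vfoo.vv` too, because the model cannot tell which extension an opcode belongs to.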


>
> Thanks.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: Christoph Müllner
> Date: 2023-11-22 18:07
> To: juzhe.zh...@rivai.ai
> CC: gcc-patches; kito.cheng; Kito.cheng; cooper.joshua; Robin Dapp; 
> jeffreyalaw; Philipp Tomsich; Cooper Qu; Jin Ma; Nelson Chu
> Subject: Re: RISC-V: Support XTheadVector extensions
> Hi Juzhe,
>
> Sorry for the late reply, but I was not on CC, so I missed this email.
>
> On Fri, Nov 17, 2023 at 2:41 PM juzhe.zh...@rivai.ai
>  wrote:
> >
> > Ok. I just read the theadvector extension.
> >
> > https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
> >
> > Theadvector is not custom extension. Just a uarch to disable some of the 
> > RVV1.0 extension
> > Theadvector can be considered as subextension of 'V' extension with 
> > disabling some of the
> > instructions and adding some new thead vector target load/store (This is 
> > another story).
> >
> > So, for disabling the instruction that theadvector doesn't support.
> > You don't need to touch such many codes.
> >
> > Here is a much simpler approach to do (I think it's definitely working):
> > 1. Don't change any codes in vector.md and keep GCC generates ASM with 
> > "th." prefix.
> > 2. Add !TARGET_THEADVECTOR into vector-iterator.md to disable the mode you 
> > don't want.
> > For example , theadvector doesn't support fractional vector.
> >
> > Then it's pretty simple:
> >
> > RVVMF2SI "TARGET_VECTOR && !TARGET_THEADVECTOR".
> >
> > 3. Remove all the tests you add in this patch.
> > 4. You can add theadvector specific load/store for example, th.vlb 
> > instructions they are allowed.
> > 5. Modify binutils, and make th.vmulh.vv as the pseudo instruction of 
> > vmulh.vv
> > 6. So with compile option "-S", you will still see ASM as  "vmulh.vv". but 
> > with objdump, you will see th.vmulh.vv.
>
> Yes, all these points sound reasonable, to minimize the patchset size.
> I believe in point 1 you meant "without th. prefix".
>
> I've added Jin Ma (who is the main author of the Binutils patchset) so
> he is also aware
> of the proposal to use pseudo instructions to avoid duplication in Binutils.
>
> Thank you very much!
> Christoph
>
>
> >
> > After this change, you can send V2, then I can continue to review on GCC-15.
> >
> > Thanks.
> >
> > 
> > juzhe.zh...@rivai.ai
> >
> >
> > From: juzhe.zh...@rivai.ai
> > Date: 2023-11-17 19:39
> > To: gcc-patches
> > CC: kito.cheng; kito.cheng; cooper.joshua; Robin Dapp; jeffreyalaw
> > Subject: RISC-V: Support XTheadVector extensions
> > 90% theadvector extension reusing current RVV 1.0 instructions patterns:
> > Just change ASM, For example:
> >
> > @@ -2923,7 +2923,7 @@ (define_insn "*pred_mulh_scalar"
> >   (match_operand:VFULLI_D 3 "register_operand"  "vr,vr, vr, vr")] VMULH)
> >(match_operand:VFULLI_D 2 "vector_merge_operand" "vu, 0, vu,  0")))]
> >"TARGET_VECTOR"
> > -  "vmulh.vx\t%0,%3,%z4%p1"
> > +  "%^vmulh.vx\t%0,%3,%z4%p1"
> >[(set_attr "type" "vimul")
> > (set_attr "mode" "")])
> >
> > +  if (letter == '^')
> > +{
> > +  if (TARGET_XTHEADVECTOR)
> > + fputs ("th.", file);
> > +  return;
> > +}
> >
> >
> > For almost all patterns, you just simply append "th." in the ASM prefix.
> > like change "vmulh.vv" -> "th.vmulh.vv"
> >
> >

Re: [PATCH v2 1/6] libgomp: basic pinned memory on Linux

2023-11-22 Thread Tobias Burnus


Hi Andrew,

Side remark:


-#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
-  calloc (1, (((void)(MEMSPACE), (SIZE))))


This fits a bit better with the previous patch, but I wonder whether that should
use (MEMSPACE, NMEMB, SIZE) instead, to match the actual calloc arguments.

I think the main/only difference between passing just SIZE and passing NMEMB
and SIZE is that "If the multiplication of nmemb and size would result in
integer overflow, then calloc() returns an error." (Linux manpage)

However, this wording seems to be neither in POSIX nor in the OpenMP
spec. There was some alignment discussion at https://gcc.gnu.org/PR112364
regarding whether C (since C23) has a different alignment for
calloc(1, n) vs. calloc(n,1), but Joseph believes it doesn't.

Thus, this is more bikeshedding than making a real difference.
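
For illustration only, the overflow check that the two-argument form gives for
free could be spelled out like this when only a combined size is passed on.
The helper name is made up for this sketch and is not part of libgomp.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: behaves like calloc (nmemb, size), but makes the
   overflow check explicit before calling calloc with a single size.  */
static void *
calloc_checked_sketch (size_t nmemb, size_t size)
{
  /* nmemb * size would wrap around: report failure instead.  */
  if (size != 0 && nmemb > SIZE_MAX / size)
    return NULL;
  return calloc (1, nmemb * size);
}
```

A caller that already multiplied the two values before calling calloc (1, n)
has lost the chance to detect that wrap-around.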

* * *

[somehow my email program caused some odd formatting issues when I
hit some odd key combo. I am not sure whether I fully fixed it or not;
sorry if some parts look odd.]


On 23.08.23 16:14, Andrew Stubbs wrote:

Implement the OpenMP pinned memory trait on Linux hosts using the
mlock syscall.  Pinned allocations are performed using mmap, not
malloc, to ensure that they can be unpinned safely when freed.

This implementation will work OK for page-scale allocations, and
finer-grained allocations will be implemented in a future patch.


Can you also update libgomp.texi, i.e. 
https://gcc.gnu.org/onlinedocs/libgomp/Memory-allocation.html
to document that and how pinning works on Linux?

I think I proposed in the low-latency patch to add a @ref to
https://gcc.gnu.org/onlinedocs/libgomp/Offload-Target-Specifics.html and
add there the nvptx and gcn specific memory-allocation handling.

* * *

I think the following is not ideal in the pinning-is-not-supported case:


@@ -434,10 +435,6 @@ omp_init_allocator (omp_memspace_handle_t memspace, int ntraits,
     }
 #endif

-  /* No support for this so far.  */
-  if (data.pinned)
-    return omp_null_allocator;
-
   ret = gomp_malloc (sizeof (struct omp_allocator_data));
   *ret = data;
 #ifndef HAVE_SYNC_BUILTINS

which continues as:
  gomp_mutex_init (&ret->lock);
#endif
  return (omp_allocator_handle_t) ret;
}


Therefore:

This code will always return a handle, even if pinning is not supported.
I had expected that the following happens:

"Otherwise if an allocator based on the requirements cannot be created
then the special omp_null_allocator handle is returned."

Using this allocator on a system where libgomp does not support pinning
will always fail with the fallback, which could be either of:

default_mem_fb (= omp_atv_default), null_fb, abort_fb, allocator_fb (+ fb_data).

Thus, the current code kind of works if the fallback is (explicitly or
implicitly) omp_atv_default or (explicitly) default_mem_fb – but otherwise,
allocations will always fail, most prominently with "abort_fb".

* * *

   The following definitions (ab)use comma operators to avoid unused
   variable errors.  */
 #ifndef MEMSPACE_ALLOC
-#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
-  malloc (((void)(MEMSPACE), (SIZE)))
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE, PIN) \
+  (PIN ? NULL : malloc (((void)(MEMSPACE), (SIZE))))


I wonder whether the comment should note something like: All of the
following will return NULL or are a no-op when pinning is enabled
(unless overridden).

And the following looks odd:


+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
+  (PIN ? NULL :  free (((void)(MEMSPACE), (void)(SIZE), (ADDR))))
#endif


Contrary to the other functions that return a value (pointer), 'free' "returns"
'void'.

And, indeed, the compiler might complain:

test.c:5:14: warning: ISO C forbids conditional expr with only one void side [-Wpedantic]
    5 |   pin ? NULL : free(p);
      |              ^

While (void)NULL works, I think the simplest is to just use an 'if (!pin) free(...)'.
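
As a sketch of the two options (the macro names are made up for this example
and differ from the patch on purpose): either make both arms of the
conditional 'void', or turn the macro into a statement that frees only when
the memory is not pinned.

```c
#include <stdlib.h>

/* Variant 1 (hypothetical): both arms have type void, so the
   conditional expression is valid ISO C.  */
#define MEMSPACE_FREE_EXPR_SKETCH(MEMSPACE, ADDR, SIZE, PIN) \
  ((PIN) ? (void) 0 : free (((void) (MEMSPACE), (void) (SIZE), (ADDR))))

/* Variant 2 (hypothetical): statement form; frees only when PIN is
   false, which matches what the conditional above does.  */
#define MEMSPACE_FREE_STMT_SKETCH(MEMSPACE, ADDR, SIZE, PIN) \
  do { (void) (MEMSPACE); (void) (SIZE); if (!(PIN)) free (ADDR); } while (0)
```

The statement form can no longer be used inside an expression, so it only
fits if all users call the macro as a statement.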

* * *

+linux_memspace_alloc (omp_memspace_handle_t memspace, size_t size, int pin)
+{
+  (void)memspace;
+
+  if (pin)
+    {
+      void *addr = mmap (NULL, size, PROT_READ | PROT_WRITE,
+                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);



Maybe add a comment noting that mmap returns zero-initialized memory, as
required for the calloc call.

(The Linux man page states for MAP_ANONYMOUS: "The mapping ...; its contents are
initialized to zero.", while POSIX has: "The system shall always zero-fill any
partial page at the end of an object.", which should be all in case of
addr = NULL.)


+      if (mlock (addr, size))
+        {
+          gomp_debug (0, "libgomp: failed to pin memory (ulimit too low?)\n");


I wonder whether the size should be included in the output. When debugging,
it might help to know whether the attempt was to pin "just" 1 kiB or 20 GB.
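
A rough sketch of what I have in mind, outside of libgomp: the function name
and the message text are assumptions for this example, and it prints to stderr
instead of using gomp_debug. It reports the requested size together with the
RLIMIT_MEMLOCK soft limit, which is the limit that governs mlock on Linux.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS with strict -std= options */
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper: map SIZE bytes of anonymous memory and try to
   pin it.  Returns NULL if either step fails.  */
static void *
pinned_alloc_sketch (size_t size)
{
  /* Anonymous mappings are zero-initialized on Linux, which is what a
     calloc-style caller relies on.  */
  void *addr = mmap (NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (addr == MAP_FAILED)
    return NULL;

  if (mlock (addr, size))
    {
      struct rlimit lim;
      if (getrlimit (RLIMIT_MEMLOCK, &lim) == 0)
        fprintf (stderr, "failed to pin %zu bytes "
                 "(RLIMIT_MEMLOCK soft limit: %llu bytes)\n",
                 size, (unsigned long long) lim.rlim_cur);
      else
        fprintf (stderr, "failed to pin %zu bytes\n", size);
      munmap (addr, size);
      return NULL;
    }
  return addr;
}
```

Memory obtained this way has to be released with munmap (after munlock, if
wanted), not with free.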

For the comment, I wonder whether it should mention RLIMIT_MEMLOCK or 'lockable
memory' instead of or in addition to ulimit, to be clearer.
(csh uses 'limit' instead of ulimit, but POSIX has ulimit both as a function and
as a shell (sh, bash) utility, i.e. using 'ulimit' is fine. Albeit 'ulimit()' has
been deprecated in favour of
{s,g}etr

[committed] amdgcn: Fix vector TImode reload loop

2023-11-22 Thread Andrew Stubbs

This patch fixes a reload bug that's hard to reproduce reliably (so far 
I've only observed it on the OG13 branch, with testcase 
gcc.c-torture/compile/pr70355.c), but causes an infinite loop in reload 
when it fails.


For some reason it wants to save a value from AVGPRs to memory. This
can't happen directly on CDNA1, so secondary reload moves the value to
VGPRs, but instead of proceeding to memory, LRA just goes and moves the
value right back into AVGPRs.  Disparaging this move (when a reload is
needed) fixes the issue, but I don't know if this is the intended or
optimal solution in these cases.


Andrew

amdgcn: Fix vector TImode reload loop

I've only observed the problem on the devel/omp/gcc-13 branch, but this
could theoretically affect mainline also.  The mov insns for the other modes
already have '$', so this completes the set.

gcc/ChangeLog:

* config/gcn/gcn-valu.md (*mov_4reg): Disparage AVGPR use when a
reload is required.

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 23f2bbe454b..a928decd408 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -566,10 +566,10 @@ (define_insn "*mov_4reg"
(match_operand:V_4REG 1 "general_operand"))]
   ""
   {@ [cons: =0, 1; attrs: type, length, gcn_version]
-  [v,vDB;vmult,16,*]   v_mov_b32\t%L0, %L1\;  v_mov_b32\t%H0, %H1\;  v_mov_b32\t%J0, %J1\;  v_mov_b32\t%K0, %K1
-  [v,a  ;vmult,32,*]  v_accvgpr_read_b32\t%L0, %L1\; v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; v_accvgpr_read_b32\t%K0, %K1
-  [a,v  ;vmult,32,*] v_accvgpr_write_b32\t%L0, %L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, %J1\;v_accvgpr_write_b32\t%K0, %K1
-  [a,a  ;vmult,32,cdna2]   v_accvgpr_mov_b32\t%L0, %L1\;  v_accvgpr_mov_b32\t%H0, %H1\;  v_accvgpr_mov_b32\t%J0, %J1\;  v_accvgpr_mov_b32\t%K0, %K1
+  [v ,vDB;vmult,16,*]   v_mov_b32\t%L0, %L1\;  v_mov_b32\t%H0, %H1\;  v_mov_b32\t%J0, %J1\;  v_mov_b32\t%K0, %K1
+  [v ,a  ;vmult,32,*]  v_accvgpr_read_b32\t%L0, %L1\; v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; v_accvgpr_read_b32\t%K0, %K1
+  [$a,v  ;vmult,32,*] v_accvgpr_write_b32\t%L0, %L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, %J1\;v_accvgpr_write_b32\t%K0, %K1
+  [a ,a  ;vmult,32,cdna2]   v_accvgpr_mov_b32\t%L0, %L1\;  v_accvgpr_mov_b32\t%H0, %H1\;  v_accvgpr_mov_b32\t%J0, %J1\;  v_accvgpr_mov_b32\t%K0, %K1
   })
 
 (define_insn "mov_exec"

[PATCH] libgcc: mark __hardcfr_check_fail as always_inline

2023-11-22 Thread Jose E. Marchesi

The function __hardcfr_check_fail in hardcfr.c is internal and static
inline.  It takes many arguments, which need more than five registers
to be passed on bpf-unknown-none targets.  BPF can pass arguments only
in five registers, and therefore libgcc fails to build for that
target.  This patch marks the function with the always_inline
attribute, fixing the bpf build.

Tested in bpf-unknown-none target and x86_64-linux-gnu host.
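
As a small illustration of the idea (the function names below are made up
and unrelated to hardcfr.c): a static helper with more parameters than
there are argument registers is fine as long as every call to it gets
inlined, which always_inline requests even when not optimizing.

```c
#include <stddef.h>

/* Hypothetical helper with seven parameters, i.e. more than the five
   argument registers mentioned above.  */
static inline __attribute__ ((__always_inline__)) size_t
sum7_sketch (size_t a, size_t b, size_t c, size_t d,
             size_t e, size_t f, size_t g)
{
  return a + b + c + d + e + f + g;
}

/* Once sum7_sketch is inlined here, no seven-argument call is left
   for the backend to lower.  */
size_t
sum_1_to_7_sketch (void)
{
  return sum7_sketch (1, 2, 3, 4, 5, 6, 7);
}
```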

libgcc/ChangeLog:

* hardcfr.c (__hardcfr_check_fail): Mark as always_inline.
---
 libgcc/hardcfr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libgcc/hardcfr.c b/libgcc/hardcfr.c
index 25ff06742cb..48a87a5a87a 100644
--- a/libgcc/hardcfr.c
+++ b/libgcc/hardcfr.c
@@ -206,7 +206,8 @@ __hardcfr_debug_cfg (size_t const blocks,
enabled, it also forces __hardcfr_debug_cfg (above) to be compiled into an
out-of-line function, that could be called from a debugger.
*/
-static inline void
+
+static inline  __attribute__((__always_inline__)) void
 __hardcfr_check_fail (size_t const blocks ATTRIBUTE_UNUSED,
  vword const *const visited ATTRIBUTE_UNUSED,
  vword const *const cfg ATTRIBUTE_UNUSED,
-- 
2.30.2

[PATCH] tree-optimization/112344 - wrong final value replacement

2023-11-22 Thread Richard Biener

When performing final value replacement, chrec_apply, which is used to
compute the overall effect of niters on a CHREC, does not consider that
the overall increment of { -2147483648, +, 2 } does not fit in
a signed integer when the loop iterates until the IV reaches the
value 20.  The following fixes this mistake by carrying out the multiply
and add in an unsigned type instead, avoiding undefined overflow
and thus later miscompilation by path range analysis.
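
To make the arithmetic concrete: going from -2147483648 to 20 in steps
of 2 takes 1073741834 iterations, so the overall increment is
2 * 1073741834 = 2147483668, which is larger than INT_MAX (2147483647).
The following stand-alone sketch (plain C, not GCC internals; the
function name is made up) evaluates a + b*x in an unsigned type, where
the intermediate result wraps instead of overflowing.

```c
#include <stdint.h>

/* Evaluate the affine chrec {a, +, b} after x iterations as a + b*x,
   doing the multiply and add in uint32_t so that they wrap modulo 2^32
   instead of invoking undefined signed overflow.  */
static int32_t
chrec_final_value_sketch (int32_t a, int32_t b, uint32_t x)
{
  uint32_t res = (uint32_t) a + (uint32_t) b * x;
  /* Converting back is implementation-defined for values above
     INT32_MAX; GCC reduces modulo 2^32.  */
  return (int32_t) res;
}
```

For a = -2147483648, b = 2 and x = 1073741834 the unsigned sum is
2147483648 + 2147483668 = 4294967316, which wraps to 20, the expected
final value of the IV.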

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/112344
* tree-chrec.cc (chrec_apply): Perform the overall increment
calculation and increment in an unsigned type.

* gcc.dg/torture/pr112344.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr112344.c | 20 
 gcc/tree-chrec.cc   | 32 -
 2 files changed, 41 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr112344.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr112344.c 
b/gcc/testsuite/gcc.dg/torture/pr112344.c
new file mode 100644
index 000..c52d2c8304b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr112344.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-require-effective-target int32plus } */
+
+int
+main ()
+{
+  long long b = 2036854775807LL;
+  signed char c = 3;
+  short d = 0;
+  int e = -2147483647 - 1, f;
+  for (f = 0; f < 7; f++)
+while (e < 20)
+  {
+   e += 2;
+   d = c -= b;
+  }
+  if (d != 13)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-chrec.cc b/gcc/tree-chrec.cc
index 2f67581591a..f4ba130ba20 100644
--- a/gcc/tree-chrec.cc
+++ b/gcc/tree-chrec.cc
@@ -613,32 +613,42 @@ chrec_apply (unsigned var,
   if (evolution_function_is_affine_p (chrec))
{
  tree chrecr = CHREC_RIGHT (chrec);
+ tree chrecl = CHREC_LEFT (chrec);
  if (CHREC_VARIABLE (chrec) != var)
-   res = build_polynomial_chrec
- (CHREC_VARIABLE (chrec),
-  chrec_apply (var, CHREC_LEFT (chrec), x),
-  chrec_apply (var, chrecr, x));
+   res = build_polynomial_chrec (CHREC_VARIABLE (chrec),
+ chrec_apply (var, chrecl, x),
+ chrec_apply (var, chrecr, x));
 
- /* "{a, +, b} (x)"  ->  "a + b*x".  */
- else if (operand_equal_p (CHREC_LEFT (chrec), chrecr)
+ /* "{a, +, a}" (x-1) -> "a*x".  */
+ else if (operand_equal_p (chrecl, chrecr)
   && TREE_CODE (x) == PLUS_EXPR
   && integer_all_onesp (TREE_OPERAND (x, 1))
   && !POINTER_TYPE_P (type)
   && TYPE_PRECISION (TREE_TYPE (x))
  >= TYPE_PRECISION (type))
{
- /* We know the number of iterations can't be negative.
-So {a, +, a} (x-1) -> "a*x".  */
+ /* We know the number of iterations can't be negative.  */
  res = build_int_cst (TREE_TYPE (x), 1);
  res = chrec_fold_plus (TREE_TYPE (x), x, res);
  res = chrec_convert_rhs (type, res, NULL);
  res = chrec_fold_multiply (type, chrecr, res);
}
+ /* "{a, +, b} (x)"  ->  "a + b*x".  */
  else
{
- res = chrec_convert_rhs (TREE_TYPE (chrecr), x, NULL);
- res = chrec_fold_multiply (TREE_TYPE (chrecr), chrecr, res);
- res = chrec_fold_plus (type, CHREC_LEFT (chrec), res);
+ /* The overall increment might not fit in a signed type so
+use an unsigned computation to get at the final value
+and avoid undefined signed overflow.  */
+ tree utype = TREE_TYPE (chrecr);
+ if (INTEGRAL_TYPE_P (utype) && !TYPE_OVERFLOW_WRAPS (utype))
+   utype = unsigned_type_for (TREE_TYPE (chrecr));
+ res = chrec_convert_rhs (utype, x, NULL);
+ res = chrec_fold_multiply (utype,
+chrec_convert (utype, chrecr, NULL),
+res);
+ res = chrec_fold_plus (utype,
+chrec_convert (utype, chrecl, NULL), res);
+ res = chrec_convert (type, res, NULL);
}
}
   else if (TREE_CODE (x) == INTEGER_CST
-- 
2.35.3

Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Paul Koning




> On Nov 22, 2023, at 8:54 AM, Simon Wright  wrote:
> 
> On 21 Nov 2023, at 23:13, Iain Sandoe  wrote:
> 
>>> #if defined (__APPLE__)
>>> -#include 
>> 
>> If removing unistd.h is intentional (i.e. you determined that it’s no longer
>> needed for Darwin), then we should make that a separate patch.
> 
> I thought that I’d had to include unistd.h for the first patch in this 
> thread; clearly not!
> 
> What I hope will be the final version:
> 
> ——— 8< .———
> 
> In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
> assumption for __APPLE__ is that file names are case-insensitive
> unless __arm__ or __arm64__ are defined, in which case file names are
> declared case-sensitive.
> 
> The associated comment is
>  "By default, we suppose filesystems aren't case sensitive on
>  Windows and Darwin (but they are on arm-darwin)."
> 
> This means that on aarch64-apple-darwin, file names are treated as
> case-sensitive, which is not the default case.
> 
> The true default position is that macOS file systems are
> case-insensitive, iOS file systems are case-sensitive.

Sort of.  The most common choices for Mac OS file system type are indeed case 
insensitive, but it also allows case sensitive file systems. 
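
For what it's worth, a per-path runtime query could look roughly like the
sketch below. The function name is made up, and I am assuming that the
Darwin headers provide _PC_CASE_SENSITIVE for pathconf; on systems without
it, the sketch simply reports "unknown".

```c
#include <unistd.h>

/* Hypothetical helper: returns 1 if the file system holding PATH is
   case-sensitive, 0 if it is not, and -1 if this cannot be determined.  */
static int
path_is_case_sensitive_sketch (const char *path)
{
#ifdef _PC_CASE_SENSITIVE
  long r = pathconf (path, _PC_CASE_SENSITIVE);
  return r < 0 ? -1 : (r != 0);
#else
  (void) path;
  return -1;
#endif
}
```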

paul

Re: [PATCH 01/11] rtl-ssa: Support for inserting new insns

2023-11-22 Thread Alex Coplan

On 21/11/2023 11:51, Richard Sandiford wrote:
> Alex Coplan  writes:
> > N.B. this is just a rebased (but otherwise unchanged) version of the
> > same patch already posted here:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633348.html
> >
> > this is the only unreviewed dependency from the previous series, so it
> > seemed easier just to re-post it (not least to appease the pre-commit
> > CI).
> >
> > -- >8 --
> >
> > The upcoming aarch64 load pair pass needs to form store pairs, and can
> > re-order stores over loads when alias analysis determines this is safe.
> > In the case that both mem defs have uses in the RTL-SSA IR, and both
> > stores require re-ordering over their uses, we represent that as
> > (tentative) deletion of the original store insns and creation of a new
> > insn, to prevent requiring repeated re-parenting of uses during the
> > pass.  We then update all mem uses that require re-parenting in one go
> > at the end of the pass.
> >
> > To support this, RTL-SSA needs to handle inserting new insns (rather
> > than just changing existing ones), so this patch adds support for that.
> >
> > New insns (and new accesses) are temporaries, allocated above a temporary
> > obstack_watermark, such that the user can easily back out of a change 
> > without
> > awkward bookkeeping.
> >
> > Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
> >
> > gcc/ChangeLog:
> >
> > * rtl-ssa/accesses.cc (function_info::create_set): New.
> > * rtl-ssa/accesses.h (access_info::is_temporary): New.
> > * rtl-ssa/changes.cc (move_insn): Handle new (temporary) insns.
> > (function_info::finalize_new_accesses): Handle new/temporary
> > user-created accesses.
> > (function_info::apply_changes_to_insn): Ensure m_is_temp flag
> > on new insns gets cleared.
> > (function_info::change_insns): Handle new/temporary insns.
> > (function_info::create_insn): New.
> > * rtl-ssa/changes.h (class insn_change): Make function_info a
> > friend class.
> > * rtl-ssa/functions.h (function_info): Declare new entry points:
> > create_set, create_insn.  Declare new change_alloc helper.
> > * rtl-ssa/insns.cc (insn_info::print_full): Identify temporary 
> > insns in
> > dump.
> > * rtl-ssa/insns.h (insn_info): Add new m_is_temp flag and 
> > accompanying
> > is_temporary accessor.
> > * rtl-ssa/internals.inl (insn_info::insn_info): Initialize 
> > m_is_temp to
> > false.
> > * rtl-ssa/member-fns.inl (function_info::change_alloc): New.
> > * rtl-ssa/movement.h (restrict_movement_for_defs_ignoring): Add
> > handling for temporary defs.
> 
> Looks good, but there were a couple of things I didn't understand:

Thanks for the review.

> 
> > ---
> >  gcc/rtl-ssa/accesses.cc| 10 ++
> >  gcc/rtl-ssa/accesses.h |  4 +++
> >  gcc/rtl-ssa/changes.cc | 74 +++---
> >  gcc/rtl-ssa/changes.h  |  2 ++
> >  gcc/rtl-ssa/functions.h| 14 
> >  gcc/rtl-ssa/insns.cc   |  5 +++
> >  gcc/rtl-ssa/insns.h|  7 +++-
> >  gcc/rtl-ssa/internals.inl  |  1 +
> >  gcc/rtl-ssa/member-fns.inl | 12 +++
> >  gcc/rtl-ssa/movement.h |  8 -
> >  10 files changed, 123 insertions(+), 14 deletions(-)
> >
> > diff --git a/gcc/rtl-ssa/accesses.cc b/gcc/rtl-ssa/accesses.cc
> > index 510545a8bad..76d70fd8bd3 100644
> > --- a/gcc/rtl-ssa/accesses.cc
> > +++ b/gcc/rtl-ssa/accesses.cc
> > @@ -1456,6 +1456,16 @@ function_info::make_uses_available 
> > (obstack_watermark &watermark,
> >return use_array (new_uses, num_uses);
> >  }
> >  
> > +set_info *
> > +function_info::create_set (obstack_watermark &watermark,
> > +  insn_info *insn,
> > +  resource_info resource)
> > +{
> > +  auto set = change_alloc (watermark, insn, resource);
> > +  set->m_is_temp = true;
> > +  return set;
> > +}
> > +
> >  // Return true if ACCESS1 can represent ACCESS2 and if ACCESS2 can
> >  // represent ACCESS1.
> >  static bool
> > diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
> > index fce31d46717..7e7a90ece97 100644
> > --- a/gcc/rtl-ssa/accesses.h
> > +++ b/gcc/rtl-ssa/accesses.h
> > @@ -204,6 +204,10 @@ public:
> >// in the main instruction pattern.
> >bool only_occurs_in_notes () const { return m_only_occurs_in_notes; }
> >  
> > +  // Return true if this is a temporary access, e.g. one created for
> > +  // an insn that is about to be inserted.
> > +  bool is_temporary () const { return m_is_temp; }
> > +
> >  protected:
> >access_info (resource_info, access_kind);
> >  
> > diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc
> > index aab532b9f26..da2a61d701a 100644
> > --- a/gcc/rtl-ssa/changes.cc
> > +++ b/gcc/rtl-ssa/changes.cc
> > @@ -394,14 +394,20 @@ move_insn (insn_change &change, insn_info *after)
> >// At the moment we

Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Iain Sandoe




> On 22 Nov 2023, at 13:55, Arnaud Charlet  wrote:
> 
>>>> #if defined (__APPLE__)
>>>> -#include <unistd.h>
>>> 
>>> If removing unistd.h is intentional (i.e. you determined that it’s no longer
>>> needed for Darwin), then we should make that a separate patch.
>> 
>> I thought that I’d had to include unistd.h for the first patch in this 
>> thread; clearly not!
>> 
>> What I hope will be the final version:
> 
> OK here.

also OK here, thanks
Iain

> 
>> ——— 8< .———
>> 
>> In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
>> assumption for __APPLE__ is that file names are case-insensitive
>> unless __arm__ or __arm64__ are defined, in which case file names are
>> declared case-sensitive.
>> 
>> The associated comment is
>>  "By default, we suppose filesystems aren't case sensitive on
>>  Windows and Darwin (but they are on arm-darwin)."
>> 
>> This means that on aarch64-apple-darwin, file names are treated as
>> case-sensitive, which is not the default case.
>> 
>> The true default position is that macOS file systems are
>> case-insensitive, iOS file systems are case-sensitive.
>> 
>> Apple provide a header file  which permits a
>> compile-time check for the compiler target (e.g. OSX vs IOS); if
>> TARGET_OS_IOS is defined as 1, this is a build for iOS.
>> 
>>  * gcc/ada/adaint.c
>>  (__gnat_get_file_names_case_sensitive): Split out the __APPLE__
>>  check and remove the checks for __arm__, __arm64__.
>>  For Apple, file names are by default case-insensitive unless
>>  TARGET_OS_IOS is set.
>> 
>> Signed-off-by: Simon Wright

Re: [PATCH RFC] c++: mangle function template constraints

2023-11-22 Thread Jonathan Wakely

On Mon, 20 Nov 2023 at 02:56, Jason Merrill wrote:
>
> Tested x86_64-pc-linux-gnu.  Are the library bits OK?  Any comments before I
> push this?

The library parts are OK.

The variable template is_trivially_copyable_v just uses
__is_trivially_copyable so should be just as efficient, and the change
to  is fine.

The variable template is_trivially_destructible_v instantiates the
is_trivially_destructible type trait, which instantiates
__is_destructible_safe and __is_destructible_impl, which is probably
why we used the built-in directly in . But that's an
acceptable overhead to avoid using the built-in in a mangled context,
and it would be good to optimize the variable template anyway, as a
separate change.

Re: [PATCH v3] aarch64: SVE/NEON Bridging intrinsics

2023-11-22 Thread Richard Sandiford

Richard Ball  writes:
> ACLE has added intrinsics to bridge between SVE and Neon.
>
> The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and
> SVE vectors.
>
> This patch adds support to GCC for the following 3 intrinsics:
> svset_neonq, svget_neonq and svdup_neonq
>
> gcc/ChangeLog:
>
>   * config.gcc: Adds new header to config.
>   * config/aarch64/aarch64-builtins.cc (enum aarch64_type_qualifiers):
>   Moved to header file.
>   (ENTRY): Likewise.
>   (enum aarch64_simd_type): Likewise.
>   (struct aarch64_simd_type_info): Make extern.
>   (GTY): Likewise.
>   * config/aarch64/aarch64-c.cc (aarch64_pragma_aarch64):
>   Defines pragma for arm_neon_sve_bridge.h.
>   * config/aarch64/aarch64-protos.h: New function.
>   * config/aarch64/aarch64-sve-builtins-base.h: New intrinsics.
>   * config/aarch64/aarch64-sve-builtins-base.cc
>   (class svget_neonq_impl): New intrinsic implementation.
>   (class svset_neonq_impl): Likewise.
>   (class svdup_neonq_impl): Likewise.
>   (NEON_SVE_BRIDGE_FUNCTION): New intrinsics.
>   * config/aarch64/aarch64-sve-builtins-functions.h
>   (NEON_SVE_BRIDGE_FUNCTION): Defines macro for NEON_SVE_BRIDGE
>   functions.
>   * config/aarch64/aarch64-sve-builtins-shapes.h: New shapes.
>   * config/aarch64/aarch64-sve-builtins-shapes.cc
>   (parse_element_type): Add NEON element types.
>   (parse_type): Likewise.
>   (struct get_neonq_def): Defines function shape for get_neonq.
>   (struct set_neonq_def): Defines function shape for set_neonq.
>   (struct dup_neonq_def): Defines function shape for dup_neonq.
>   * config/aarch64/aarch64-sve-builtins.cc (DEF_SVE_TYPE_SUFFIX):
>   (DEF_SVE_NEON_TYPE_SUFFIX): Defines 
> macro for NEON_SVE_BRIDGE type suffixes.
>   (DEF_NEON_SVE_FUNCTION): Defines 
> macro for NEON_SVE_BRIDGE functions.
>   (function_resolver::infer_neon128_vector_type): Infers type suffix
>   for overloaded functions.
>   (init_neon_sve_builtins): Initialise neon_sve_bridge_builtins for LTO.
>   (handle_arm_neon_sve_bridge_h): Handles #pragma arm_neon_sve_bridge.h.
>   * config/aarch64/aarch64-sve-builtins.def
>   (DEF_SVE_NEON_TYPE_SUFFIX): Macro for handling neon_sve type suffixes.
>   (bf16): Replace entry with neon-sve entry.
>   (f16): Likewise.
>   (f32): Likewise.
>   (f64): Likewise.
>   (s8): Likewise.
>   (s16): Likewise.
>   (s32): Likewise.
>   (s64): Likewise.
>   (u8): Likewise.
>   (u16): Likewise.
>   (u32): Likewise.
>   (u64): Likewise.
>   * config/aarch64/aarch64-sve-builtins.h
>   (GCC_AARCH64_SVE_BUILTINS_H): Include aarch64-builtins.h.
>   (ENTRY): Add aarch64_simd_type definiton.
>   (enum aarch64_simd_type): Add neon information to type_suffix_info.
>   (struct type_suffix_info): New function.
>   * config/aarch64/aarch64-sve.md
>   (@aarch64_sve_get_neonq_): New intrinsic insn for big endian.
>   (@aarch64_sve_set_neonq_): Likewise.
>   (@aarch64_sve_dup_neonq_): Likewise.
>   * config/aarch64/aarch64.cc 
>   (aarch64_init_builtins): Add call to init_neon_sve_builtins.
> (aarch64_output_sve_set_neonq): asm output for Big Endian set_neonq.
>   * config/aarch64/iterators.md: Add UNSPEC_SET_NEONQ.
>   * config/aarch64/aarch64-builtins.h: New file.
>   * config/aarch64/aarch64-neon-sve-bridge-builtins.def: New file.
>   * config/aarch64/arm_neon_sve_bridge.h: New file.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Add include 
>   arm_neon_sve_bridge header file
>   * gcc.dg/torture/neon-sve-bridge.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_bf16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_f16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_f32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_f64.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_s16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_s32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_s64.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_s8.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_u16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_u32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_u64.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_u8.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_bf16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_f16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_f32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_f64.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_s16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq

[pushed] testsuite: Update path to intl include.

2023-11-22 Thread Iain Sandoe

Tested on i686, x86_64 and aarch64 Darwin, aarch64 and x86_64 Linux,
pushed to master as obvious, thanks
Iain

--- 8< ---

When we are building libintl in-tree, we need to pass the path
to the generated libintl.h include to the plugin tests.  This
path has changed with the use of gettext directly.

gcc/testsuite/ChangeLog:

* lib/plugin-support.exp: Update the expected path to an
in-tree build of libintl.

Signed-off-by: Iain Sandoe 
---
 gcc/testsuite/lib/plugin-support.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/plugin-support.exp 
b/gcc/testsuite/lib/plugin-support.exp
index 378881b0f5d..8accf13fab6 100644
--- a/gcc/testsuite/lib/plugin-support.exp
+++ b/gcc/testsuite/lib/plugin-support.exp
@@ -85,7 +85,7 @@ proc plugin-test-execute { plugin_src plugin_tests } {
 set gcc_objdir "$objdir/../../.."
 set includes "-I. -I${srcdir} -I${gcc_srcdir}/gcc -I${gcc_objdir}/gcc \
   -I${gcc_srcdir}/include -I${gcc_srcdir}/libcpp/include \
-  $GMPINC -I${gcc_objdir}/intl"
+  $GMPINC -I${gcc_objdir}/gettext/intl"
 
 if { [ ishost *-*-darwin* ] } {
# -mdynamic-no-pic is incompatible with -fPIC.
-- 
2.39.2 (Apple Git-143)

Re: [PATCH RFC] c++: mangle function template constraints

2023-11-22 Thread Jonathan Wakely

On Wed, 22 Nov 2023 at 14:50, Jonathan Wakely  wrote:
>
> On Mon, 20 Nov 2023 at 02:56, Jason Merrill wrote:
> >
> > Tested x86_64-pc-linux-gnu.  Are the library bits OK?  Any comments before I
> > push this?
>
> The library parts are OK.
>
> The variable template is_trivially_copyable_v just uses
> __is_trivially_copyable so should be just as efficient, and the change
> to  is fine.
>
> The variable template is_trivially_destructible_v instantiates the
> is_trivially_destructible type trait, which instantiates
> __is_destructible_safe and __is_destructible_impl, which is probably
> why we used the built-in directly in . But that's an
> acceptable overhead to avoid using the built-in in a mangled context,
> and it would be good to optimize the variable template anyway, as a
> separate change.

For C++20 we could do:

#if __cpp_concepts
template <typename _Tp>
  inline constexpr bool is_trivially_destructible_v = false;
template <typename _Tp> requires requires (_Tp& __t) { __t.~_Tp(); }
  inline constexpr bool is_trivially_destructible_v<_Tp>
    = __has_trivial_destructor(_Tp);
#else
template <typename _Tp>
  inline constexpr bool is_trivially_destructible_v =
    is_trivially_destructible<_Tp>::value;
#endif

But that won't help C++17.

Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Iain Sandoe




> On 22 Nov 2023, at 14:48, Iain Sandoe  wrote:
> 
> 
> 
>> On 22 Nov 2023, at 13:55, Arnaud Charlet  wrote:
>> 
> #if defined (__APPLE__)
> -#include 
 
 If removing unistd.h is intentional (i.e. you determined that it’s no 
 longer
 needed for Darwin), then we should make that a separate patch.
>>> 
>>> I thought that I’d had to include unistd.h for the first patch in this 
>>> thread; clearly not!
>>> 
>>> What I hope will be the final version:
>> 
>> OK here.
> 
> also OK here, thanks

I think this fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111909 ?
If you agree, then please add that to the commit.
Iain

> Iain
> 
>> 
>>> ——— 8< .———
>>> 
>>> In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
>>> assumption for __APPLE__ is that file names are case-insensitive
>>> unless __arm__ or __arm64__ are defined, in which case file names are
>>> declared case-sensitive.
>>> 
>>> The associated comment is
>>> "By default, we suppose filesystems aren't case sensitive on
>>> Windows and Darwin (but they are on arm-darwin)."
>>> 
>>> This means that on aarch64-apple-darwin, file names are treated as
>>> case-sensitive, which is not the default case.
>>> 
>>> The true default position is that macOS file systems are
>>> case-insensitive, iOS file systems are case-sensitive.
>>> 
>>> Apple provide a header file  which permits a
>>> compile-time check for the compiler target (e.g. OSX vs IOS); if
>>> TARGET_OS_IOS is defined as 1, this is a build for iOS.
>>> 
>>> * gcc/ada/adaint.c
>>> (__gnat_get_file_names_case_sensitive): Split out the __APPLE__
>>> check and remove the checks for __arm__, __arm64__.
>>> For Apple, file names are by default case-insensitive unless
>>> TARGET_OS_IOS is set.
>>> 
>>> Signed-off-by: Simon Wright
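
The compile-time decision described in the patch log can be sketched as follows. This is an illustration only and not the adaint.c code: both function names are hypothetical, and the Windows and "everything else" branches are assumptions added to make the example complete. The only part taken from the log is that Apple targets default to case-insensitive unless TARGET_OS_IOS is defined as 1.

```cpp
#if defined(__APPLE__)
#include <TargetConditionals.h>  // Apple SDK header that defines TARGET_OS_IOS
#endif

// Pure decision logic (hypothetical helper): Windows is treated as
// case-insensitive, Apple targets are case-sensitive only for iOS, and
// everything else is assumed to be case-sensitive.
constexpr bool case_sensitive_by_default(bool is_windows, bool is_apple,
                                         bool is_ios) {
  if (is_windows)
    return false;
  if (is_apple)
    return is_ios;
  return true;
}

// Feed the target macros into the decision above.  Note that the check is
// on the OS flavour (TARGET_OS_IOS), not on __arm__ or __arm64__.
constexpr bool host_case_sensitive_by_default() {
#if defined(_WIN32)
  return case_sensitive_by_default(true, false, false);
#elif defined(__APPLE__)
# if defined(TARGET_OS_IOS) && TARGET_OS_IOS
  return case_sensitive_by_default(false, true, true);
# else
  return case_sensitive_by_default(false, true, false);
# endif
#else
  return case_sensitive_by_default(false, false, false);
#endif
}
```

Keeping the decision separate from the macro tests makes it possible to check every combination on any host, including the aarch64-apple-darwin case that motivated the patch.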

Re: [PATCH v3 2/8] Unify implementations of print_hard_reg_set()

2023-11-22 Thread Vladimir Makarov




On 11/22/23 06:14, Maxim Kuvyrkov wrote:

We currently have 3 implementations of print_hard_reg_set()
(all with the same name!) in ira-color.cc, ira-conflicts.cc, and
sel-sched-dump.cc.  This patch generalizes implementation in
ira-color.cc, and uses it in all other places.  The declaration
is added to hard-reg-set.h.

The motivation for this patch is the [upcoming] need for
print_hard_reg_set() in sched-deps.cc.

gcc/ChangeLog:

* hard-reg-set.h (print_hard_reg_set): Declare.
* ira-color.cc (print_hard_reg_set): Generalize a bit.
(debug_hard_reg_set, print_hard_regs_subforest,)
(setup_allocno_available_regs_num): Update.
* ira-conflicts.cc (print_hard_reg_set): Remove.
(print_allocno_conflicts): Use global print_hard_reg_set().
* sel-sched-dump.cc (print_hard_reg_set): Remove.
(dump_hard_reg_set): Use global print_hard_reg_set().
* sel-sched-dump.h (dump_hard_reg_set): Mark as DEBUG_FUNCTION.


OK for me.  Thank you for consolidation of the print code, Maxim.
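
For readers who have not seen this kind of dump helper, here is a rough sketch of what a shared register-set printer might do: walk the set once and collapse consecutive registers into ranges. It is not the ira-color.cc code. The name `format_reg_set`, the 16-register limit and the exact output format are assumptions made for this example, and a `std::bitset` stands in for HARD_REG_SET.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

constexpr std::size_t kNumRegs = 16;  // hypothetical register count

// Format the registers contained in SET as space-separated numbers,
// collapsing runs of consecutive registers into "first-last".
std::string format_reg_set(const std::bitset<kNumRegs>& set) {
  std::string out;
  std::size_t i = 0;
  while (i < kNumRegs) {
    if (!set.test(i)) {
      ++i;
      continue;
    }
    const std::size_t start = i;
    // Extend the run while the next register is also in the set.
    while (i + 1 < kNumRegs && set.test(i + 1))
      ++i;
    out += ' ';
    out += std::to_string(start);
    if (i > start) {
      out += '-';
      out += std::to_string(i);
    }
    ++i;
  }
  return out;
}
```

Having one such function declared in a common header is what lets the other files drop their private copies.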

Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Simon Wright

> On 22 Nov 2023, at 15:03, Iain Sandoe  wrote:
> 
> 
> 
>> On 22 Nov 2023, at 14:48, Iain Sandoe  wrote:
>> 
>> 
>> 
>>> On 22 Nov 2023, at 13:55, Arnaud Charlet  wrote:
>>> 
>> #if defined (__APPLE__)
>> -#include 
> 
> If removing unistd.h is intentional (i.e. you determined that it’s no 
> longer
> needed for Darwin), then we should make that a separate patch.

 I thought that I’d had to include unistd.h for the first patch in this 
 thread; clearly not!

 What I hope will be the final version:
>>> 
>>> OK here.
>> 
>> also OK here, thanks
> 
> I think this fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111909 ?
> if you agree then please add that to the commit.
> Iain

git format-patch does so much, I forgot this, sorry:

gcc/ada/ChangeLog:

2023-11-22  Simon Wright  <si...@pushface.org>

PR ada/111909


[PATCH] AArch64/testsuite: Use non-capturing parentheses with ccmp_1.c

2023-11-22 Thread Maciej W. Rozycki

Use non-capturing parentheses for the subexpressions used with 
`scan-assembler-times', to avoid a quirk with double-counting.

gcc/testsuite/
* gcc.target/aarch64/ccmp_1.c: Use non-capturing parentheses 
with `scan-assembler-times'.
---
Hi,

 Here's another one.  I realised my original regexp used to grep the tree 
for `scan-assembler-times' with subexpressions was too strict and with an 
updated pattern I found this second test case that does regress once the 
`scan-assembler-times' double-counting quirk has been fixed.

 As with the ARM change we don't need capturing parentheses here, usually 
used for back references, so let's just avoid the double-counting quirk 
altogether and make our matching here work whether the quirk has been 
fixed or not.

 Verified for the `aarch64-linux-gnu' target with the quirk fix submitted 
as  
and the aarch64.exp subset of the C language test suite.  OK to apply?

  Maciej
---
 gcc/testsuite/gcc.target/aarch64/ccmp_1.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

gcc-aarch64-test-ccmp_1-non-capturing.diff
Index: gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
===
--- gcc.orig/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
+++ gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
@@ -86,8 +86,8 @@ f13 (int a, int b)
 /* { dg-final { scan-assembler "cmp\t(.)+35" } } */
 
 /* { dg-final { scan-assembler-times "\tcmp\tw\[0-9\]+, 0" 4 } } */
-/* { dg-final { scan-assembler-times "fcmpe\t(.)+0\\.0" 2 } } */
-/* { dg-final { scan-assembler-times "fcmp\t(.)+0\\.0" 2 } } */
+/* { dg-final { scan-assembler-times "fcmpe\t(?:.)+0\\.0" 1 } } */
+/* { dg-final { scan-assembler-times "fcmp\t(?:.)+0\\.0" 1 } } */
 
 /* { dg-final { scan-assembler "adds\t" } } */
 /* { dg-final { scan-assembler-times "\tccmp\t" 11 } } */
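
To see why the expected counts drop from 2 to 1, here is a small model of the double-counting written in C++ rather than Tcl. It assumes the quirk comes from counting the elements of a list that holds, for every match, the full match plus one entry per capturing group (which is what Tcl's `regexp -all -inline` returns). The function names and the sample assembly text are made up for this sketch, and std::regex is only a stand-in for the Tcl engine.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Number of non-overlapping matches of PATTERN in TEXT.
std::size_t count_matches(const std::string& text, const std::string& pattern) {
  const std::regex re(pattern);
  return static_cast<std::size_t>(
      std::distance(std::sregex_iterator(text.begin(), text.end(), re),
                    std::sregex_iterator()));
}

// Model of the quirk: every match contributes one element for the whole
// match and one more for each capturing group in the pattern.
std::size_t count_with_submatches(const std::string& text,
                                  const std::string& pattern) {
  const std::regex re(pattern);
  return count_matches(text, pattern) * (1 + re.mark_count());
}
```

With a text containing one `fcmp` and one `fcmpe` instruction, the pattern `fcmp\t(.)+0\.0` matches once but is counted twice under this model, while `fcmp\t(?:.)+0\.0` is counted once either way. That is the property the patch relies on.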

Re: [PATCH 2/2] testsuite/unroll-8: Disable vectorization for varibale-factor targets

2023-11-22 Thread Jeff Law





On 11/21/23 16:27, Palmer Dabbelt wrote:

The vectorizer picks up these loops and disables unrolling on targets
with variable vector factors.  That results in better code here, but it
trips up the unrolling tests.  So just disable vectorization for these.

gcc/testsuite/ChangeLog:

PR target/112531
* gcc.dg/unroll-8.c: Disable vectorization on arm64 and riscv.

So probably the right check is to test for vector and
!vect_variable_length rather than doing something target-specific for
aarch64/riscv.


Jeff

Re: [committed] d: Merge upstream dmd ff57fec515, druntime ff57fec515, phobos 17bafda79.

2023-11-22 Thread Iain Buclaw

Excerpts from Rainer Orth's message of November 21, 2023 4:59 pm:
> Hi Iain,
> 
>> This patch merges the D front-end and runtime library with upstream dmd
>> ff57fec515, and the standard library with phobos 17bafda79.
>>
>> Synchronizing with the upstream release candidate of v2.106.0.
>>
>> D front-end changes:
>>
>> - Import dmd v2.106.0-rc.1.
>> - New'ing multi-dimensional arrays are now converted to a single
>>   template call `_d_newarraymTX'.
>>
>> D runtime changes:
>>
>> - Import druntime v2.106.0-rc.1.
>>
>> Phobos changes:
>>
>> - Import phobos v2.106.0-rc.1.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu/-m32, committed
>> to mainline.
> 
> either this patch or the previous one broke D bootstrap with GCC 9.  On
> both i386-pc-solaris2.11 with gdc 9.4.0 and sparc-sun-solaris2.11 with
> gdc 9.3.0, stage 1 d21 fails to link with
> 
> Undefined   first referenced
>  symbol in file
> _D3dmd4root11stringtable34__T11StringValueTC3dmd5mtype4TypeZ11StringValue7lstringMFNaNbNiNjZPa
>  d/func.o
> _D3dmd4root11stringtable34__T11StringValueTC3dmd5mtype4TypeZ11StringValue8toDcharsMxFNaNbNiNjZPxa
>  d/func.o
> _D3dmd4root11stringtable34__T11StringValueTC3dmd5mtype4TypeZ11StringValue8toStringMxFNaNbNiNjZAxa
>  d/func.o
> ld: fatal: symbol referencing errors
> collect2: error: ld returned 1 exit status
> make[3]: *** [/vol/gcc/src/hg/master/local/gcc/d/Make-lang.in:236: d21] Error 1
> 
> I'm now running bootstraps with gdc 11.1.0 instead, which seems to work:
> in both cases, stage 1 d21 did link.
> 
> If this is intentional, install.texi should be updated accordingly.
> 

Thanks Rainer,

I don't think this should happen if we can help it just yet.  I'll have
a look to see which specific upstream change might have caused it.

Iain.

Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store

2023-11-22 Thread Richard Sandiford

"Li, Pan2"  writes:
>> It looks like Jeff approved the patch?
>
> Yes, just would like to double check the way of this patch is expected as 
> following the suggestion of Richard S.

Yeah, it looks good to me, thanks.

Richard

> Pan
>
> -Original Message-
> From: Richard Biener  
> Sent: Wednesday, November 22, 2023 4:02 PM
> To: Li, Pan2 
> Cc: richard.sandif...@arm.com; juzhe.zh...@rivai.ai; Wang, Yanzhang 
> ; kito.ch...@gmail.com; Jeff Law 
> ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
> store
>
> On Wed, Nov 22, 2023 at 3:30 AM Li, Pan2  wrote:
>>
>> Hi Richard S,
>>
>> Thanks a lot for reviewing and comments. May I know is there any concern or 
>> further comments for landing this patch to GCC-14?
>
> It looks like Jeff approved the patch?
>
> Richard.
>
>> Pan
>>
>> -Original Message-
>> From: Li, Pan2
>> Sent: Wednesday, November 15, 2023 8:25 AM
>> To: gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
>> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandif...@arm.com; 
>> Jeff Law 
>> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read 
>> < store
>>
>> Sorry for disturbing, looks I have a typo for Richard S's email address, cc 
>> the right email address for awareness.
>>
>> Pan
>>
>> -Original Message-
>> From: Li, Pan2
>> Sent: Wednesday, November 15, 2023 8:18 AM
>> To: Jeff Law ; gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
>> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
>> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read 
>> < store
>>
>> > I wouldn't try to handle that case unless we had actual evidence it was
>> > useful to do so.  Just wanted to point out that unlike pseudos we can
>> > have multiple modes referencing the same memory location.
>>
>> Got the point here, thanks Jeff for emphasizing this, 😉.
>>
>> Pan
>>
>> -Original Message-
>> From: Jeff Law 
>> Sent: Tuesday, November 14, 2023 4:12 AM
>> To: Li, Pan2 ; gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
>> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
>> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read 
>> < store
>>
>>
>>
>> On 11/12/23 20:22, pan2...@intel.com wrote:
>> > From: Pan Li 
>> >
>> > Update in v4:
>> > * Merge upstream and removed some independent changes.
>> >
>> > Update in v3:
>> > * Take known_le instead of known_lt for vector size.
>> > * Return NULL_RTX when gap is not equal 0 and not constant.
>> >
>> > Update in v2:
>> > * Move vector type support to get_stored_val.
>> >
>> > Original log:
>> >
>> > This patch would like to allow the vector mode in the
>> > get_stored_val in the DSE. It is valid for the read
>> > rtx if and only if the read bitsize is less than the
>> > stored bitsize.
>> >
>> > Given below example code with
>> > --param=riscv-autovec-preference=fixed-vlmax.
>> >
>> > vuint8m1_t test () {
>> >uint8_t arr[32] = {
>> >  1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>> >  1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>> >};
>> >
>> >return __riscv_vle8_v_u8m1(arr, 32);
>> > }
>> >
>> > Before this patch:
>> > test:
>> >    lui     a5,%hi(.LANCHOR0)
>> >    addi    sp,sp,-32
>> >    addi    a5,a5,%lo(.LANCHOR0)
>> >    li      a3,32
>> >    vl2re64.v   v2,0(a5)
>> >    vsetvli zero,a3,e8,m1,ta,ma
>> >    vs2r.v  v2,0(sp) <== Unnecessary store to stack
>> >    vle8.v  v1,0(sp) <== Ditto
>> >    vs1r.v  v1,0(a0)
>> >    addi    sp,sp,32
>> >    jr      ra
>> >
>> > After this patch:
>> > test:
>> >    lui     a5,%hi(.LANCHOR0)
>> >    addi    a5,a5,%lo(.LANCHOR0)
>> >    li      a4,32
>> >    addi    sp,sp,-32
>> >    vsetvli zero,a4,e8,m1,ta,ma
>> >    vle8.v  v1,0(a5)
>> >    vs1r.v  v1,0(a0)
>> >    addi    sp,sp,32
>> >    jr      ra
>> >
>> > Below tests are passed within this patch:
>> > * The risc-v regression test.
>> > * The x86 bootstrap and regression test.
>> > * The aarch64 regression test.
>> >
>> >   PR target/111720
>> >
>> > gcc/ChangeLog:
>> >
>> >   * dse.cc (get_stored_val): Allow vector mode if read size is
>> >   less than or equal to stored size.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >   * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-8.c:
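
A note on the v3 change from known_lt to known_le in the quoted log: the difference only shows up when the read is exactly as wide as the store. The sketch below uses a simplified stand-in for a two-coefficient poly_int, where a size is c0 + c1 * x for some runtime x >= 0. The `PolySize` struct and these two free functions are written for this example and are not GCC's poly-int.h, although they are meant to follow the same "holds for every x" meaning.

```cpp
// Simplified model of a size that may depend on a runtime vector length:
// value = c0 + c1 * x, where x >= 0 is not known at compile time.
struct PolySize {
  long c0;
  long c1;
};

// True only if a <= b holds for every possible x >= 0.
constexpr bool known_le(PolySize a, PolySize b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// True only if a < b holds for every possible x >= 0.
constexpr bool known_lt(PolySize a, PolySize b) {
  return a.c0 < b.c0 && a.c1 <= b.c1;
}
```

For a 128-bit read from a 128-bit store, `known_lt` is false while `known_le` is true, so the strict form would reject a read that covers the whole stored value. When the comparison cannot be proven for all x, both return false and the conservative path is taken.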

[PATCHv2] Clean up by_pieces_ninsns

2023-11-22 Thread HAO CHEN GUI

Hi,
  This patch cleans up by_pieces_ninsns and does the following things.
1. Do the length and alignment adjustment for by-pieces compare when
overlap operation is enabled.
2. Replace unnecessary mov_optab checks with gcc assertions.

  Compared to the last version, the main change is to replace unnecessary
mov_optab checks with gcc assertions and fix the indentation.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with
no regressions. Is this OK for trunk?

Thanks
Gui Haochen

ChangeLog
Clean up by_pieces_ninsns

The by pieces compare can be implemented by overlapped operations. So
it should be taken into consideration when doing the adjustment for
overlap operations.  The mode returned from
widest_fixed_size_mode_for_size is already checked with mov_optab in
by_pieces_mode_supported_p called by widest_fixed_size_mode_for_size.
So there is no need to check mov_optab again in by_pieces_ninsns.  The
patch fixes these issues.

gcc/
* expr.cc (by_pieces_ninsns): Include by pieces compare when
doing the adjustment for overlap operations.  Replace mov_optab
checks with gcc assertions.

patch.diff
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 556bcf7ef59..ffd18fe43cc 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -1090,18 +1090,16 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned 
int align,
   unsigned HOST_WIDE_INT n_insns = 0;
   fixed_size_mode mode;

-  if (targetm.overlap_op_by_pieces_p () && op != COMPARE_BY_PIECES)
+  if (targetm.overlap_op_by_pieces_p ())
 {
   /* NB: Round up L and ALIGN to the widest integer mode for
 MAX_SIZE.  */
   mode = widest_fixed_size_mode_for_size (max_size, op);
-  if (optab_handler (mov_optab, mode) != CODE_FOR_nothing)
-   {
- unsigned HOST_WIDE_INT up = ROUND_UP (l, GET_MODE_SIZE (mode));
- if (up > l)
-   l = up;
- align = GET_MODE_ALIGNMENT (mode);
-   }
+  gcc_assert (optab_handler (mov_optab, mode) != CODE_FOR_nothing);
+  unsigned HOST_WIDE_INT up = ROUND_UP (l, GET_MODE_SIZE (mode));
+  if (up > l)
+   l = up;
+  align = GET_MODE_ALIGNMENT (mode);
 }

   align = alignment_for_piecewise_move (MOVE_MAX_PIECES, align);
@@ -1109,12 +1107,11 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned 
int align,
   while (max_size > 1 && l > 0)
 {
   mode = widest_fixed_size_mode_for_size (max_size, op);
-  enum insn_code icode;
+  gcc_assert (optab_handler (mov_optab, mode) != CODE_FOR_nothing);

   unsigned int modesize = GET_MODE_SIZE (mode);

-  icode = optab_handler (mov_optab, mode);
-  if (icode != CODE_FOR_nothing && align >= GET_MODE_ALIGNMENT (mode))
+  if (align >= GET_MODE_ALIGNMENT (mode))
{
  unsigned HOST_WIDE_INT n_pieces = l / modesize;
  l %= modesize;
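
To illustrate what the overlap adjustment in by_pieces_ninsns buys, here is a much simplified model of the instruction count. It is not the expr.cc code: it assumes power-of-two piece sizes, ignores alignment and the mode lookup entirely, and the names `round_up` and `count_pieces` are made up for this sketch.

```cpp
// Round X up to the next multiple of M (M > 0).
constexpr unsigned long round_up(unsigned long x, unsigned long m) {
  return (x + m - 1) / m * m;
}

// Count the pieces needed to cover LEN bytes when the widest piece is
// WIDEST bytes (a power of two).  With OVERLAP, the length is first
// rounded up to a multiple of the widest piece, modelling a final
// operation that overlaps bytes already handled.
constexpr unsigned long count_pieces(unsigned long len, unsigned long widest,
                                     bool overlap) {
  if (overlap)
    len = round_up(len, widest);
  unsigned long n = 0;
  for (unsigned long size = widest; size >= 1 && len > 0; size /= 2) {
    n += len / size;  // as many pieces of this size as fit
    len %= size;      // the remainder goes to narrower pieces
  }
  return n;
}
```

Under these assumptions a 15-byte operation with 8-byte pieces takes four pieces (8 + 4 + 2 + 1) without overlap and two with it. Applying the same rounding to compares is the first item in the patch description.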

Re: [PATCH] tree: Fix up try_catch_may_fallthru [PR112619]

2023-11-22 Thread Jakub Jelinek

On Wed, Nov 22, 2023 at 01:21:12PM +0100, Jakub Jelinek wrote:
> So, pedantically perhaps just assuming TRY_CATCH_EXPR where second argument
> is not STATEMENT_LIST to be the CATCH_EXPR/EH_FILTER_EXPR case could work
> for C++, but there are other FEs and it would be fragile (and weird, given
> that STATEMENT_LIST with single stmt in it vs. that stmt ought to be
> generally interchangeable).

Looking at other FEs, e.g. go/go-gcc.cc clearly has:
stat_tree = build2_loc(location.gcc_location(), TRY_CATCH_EXPR,
   void_type_node, stat_tree,
   build2_loc(location.gcc_location(), CATCH_EXPR,
  void_type_node, NULL, except_tree));
so CATCH_EXPR is immediately the second operand of TRY_CATCH_EXPR.
d/toir.cc has:
/* Back-end expects all catches in a TRY_CATCH_EXPR to be enclosed in a
   statement list, however pop_stmt_list may optimize away the list
   if there is only a single catch to push.  */
if (TREE_CODE (catches) != STATEMENT_LIST)
  {
tree stmt_list = alloc_stmt_list ();
append_to_statement_list_force (catches, &stmt_list);
catches = stmt_list;
  }

add_stmt (build2 (TRY_CATCH_EXPR, void_type_node, trybody, catches));
so I assume it ran into the try_catch_may_fallthru issue (because the
gimplifier clearly doesn't require that).
rust/rust-gcc.cc copies go-gcc.cc and also creates CATCH_EXPR directly in
TRY_CATCH_EXPR's operand.

Note, the only time one runs into the ICE is when the first operand (i.e.
try body) doesn't fall thru, otherwise the function returns true early.

Jakub
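
As a toy model of the two handler shapes discussed in this message (a catch placed directly as the second operand versus a statement list of catches), the sketch below shows the kind of normalization the d/toir.cc excerpt performs, and a reader that accepts either shape. None of this is GCC's tree API; `Kind`, `Node` and both functions are invented for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum class Kind { StatementList, CatchExpr, EhFilterExpr, Other };

// Minimal stand-in for a tree node.
struct Node {
  Kind kind;
  std::vector<Node> items;  // children, used when kind == StatementList
};

// Wrap a lone handler in a one-element statement list, mirroring what the
// D front end does before building its TRY_CATCH_EXPR.
Node as_statement_list(Node handler) {
  if (handler.kind == Kind::StatementList)
    return handler;
  Node list{Kind::StatementList, {}};
  list.items.push_back(std::move(handler));
  return list;
}

// Look at the first handler whether or not it is wrapped in a list, so a
// consumer does not depend on which shape the front end produced.
Kind first_handler_kind(const Node& handler) {
  if (handler.kind == Kind::StatementList && !handler.items.empty())
    return handler.items.front().kind;
  return handler.kind;
}
```

A consumer written like `first_handler_kind` treats a single statement and a one-element list interchangeably, which is the property argued for above.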

RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store

2023-11-22 Thread Li, Pan2

Committed, thanks all.

Pan


Re: [PATCH v5 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-22 Thread waffl3x

> > > > /* Nonzero for FUNCTION_DECL means that this decl is a non-static
> > > > - member function. */
> > > > + member function, use DECL_IOBJ_MEMBER_FUNC_P instead. */
> > > > #define DECL_NONSTATIC_MEMBER_FUNCTION_P(NODE) \
> > > > (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
> > > > 
> > > > +/* Nonzero for FUNCTION_DECL means that this decl is an implicit object
> > > > + member function. */
> > > > +#define DECL_IOBJ_MEMBER_FUNC_P(NODE) \
> > > > + (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
> > > 
> > > I was thinking to rename DECL_NONSTATIC_MEMBER_FUNCTION_P rather than
> > > add a new, equivalent one. And then go through all the current uses of
> > > the old macro to decide whether they mean IOBJ or OBJECT.
> > 
> > I figure it would be easier to make that transition if there's a clear
> > line between old versus new. To be clear, my intention is for the old
> > macro to be removed once all the uses of it are changed over to the new
> > macro. I can still remove it for the patch if you like but having both
> > and removing the old one later seems better to me.
> 
> 
> Hmm, I think changing all the uses is a necessary part of this change.
> I suppose it could happen before the main patch, if you'd prefer, but it
> seems more straightforward to include it.
> 

I had meant to reply to this as well but forgot, I agree that it's
likely necessary but I've only been changing them as I come across
things that don't work right rather than trying to evaluate them
through the code. Making changes to them without having a test case
that demonstrates that the case is definitely being handled incorrectly
is risky, especially for me since I don't have a full understanding of
the code base. I would rather only change ones that are evidently
wrong, and defer the rest to someone else who knows the code base
better.

With that said, I have been neglecting replacing uses of the old macro,
but I now realize that's just creating more work for whoever is
evaluating the rest of them. Going forward I will make sure I replace
the old macro when I am fairly certain it should be.

Alex

Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Jeff Law





On 11/22/23 07:24, Christoph Müllner wrote:

On Wed, Nov 22, 2023 at 2:52 PM 钟居哲  wrote:


I am totally OK with approving theadvector for GCC-14 before stage 3 closes,
as long as it doesn't touch the current RVV code too much and binutils
supports theadvector.

I have provided the draft approach:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
which, as it turns out, doesn't need to change any code in vector.md.
I strongly suggest following this draft. I can actively review theadvector
during stage 3, and hopefully can help you land theadvector in GCC-14.


I see now two approaches:
1) Let GCC emit RVV instructions for XTheadVector for instructions
that are in both
2) Use the ASM_OUTPUT_OPCODE hook to output "th." for these instructions

No doubt, the ASM_OUTPUT_OPCODE hook approach is better than our
format-string approach, but would 1) not be the even better
solution? It would also mean, that not a single test case is required
for these overlapping instructions (only a few tests that ensure that
we don't emit RVV instructions that are not available in
XTheadVector). Besides that, letting GCC emit RVV instructions for
XTheadVector is a very clever idea, because it fully utilizes the
fact that both extensions overlap to a huge degree.

The ASM_OUTPUT_OPCODE approach could lead to an issue if we enable
XTheadVector with any other vector extension, say Zvfoo. In this case
the Zvfoo instructions will all be prefixed as well with "th.". I know
that it is not likely to run into this problem (such a machine does not
exist in real hardware), but it is possible to trigger this issue easily
and approach 1) would not have this potential issue.

I'm not a big fan of the ASM_OUTPUT_OPCODE approach.  While it is
simple, I worry a bit about it from a long-term maintenance standpoint.
As you note, we could well end up at some point with an extension that
has a mnemonic starting with "v" that would blow up.  But I certainly
see the appeal of such a simple test to support thead vector.
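
To make the concern concrete, here is a toy version of the "prefix anything starting with v" idea. It is not the proposed hook implementation: `output_opcode` is a hypothetical function, and `vfoo.vv` is an invented mnemonic standing in for an instruction from some other vector extension.

```cpp
#include <cassert>
#include <string>

// Toy model of an opcode-output hook: when XTheadVector is enabled, any
// instruction whose mnemonic starts with 'v' gets a "th." prefix.
std::string output_opcode(const std::string& insn, bool xtheadvector) {
  if (xtheadvector && !insn.empty() && insn[0] == 'v')
    return "th." + insn;
  return insn;
}
```

With this test, `vadd.vv` becomes `th.vadd.vv` as intended and scalar instructions pass through unchanged, but the invented `vfoo.vv` is prefixed as well. That is the maintenance hazard: the check keys on the first letter and not on which extension the instruction belongs to.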


Given there are at least 3 approaches that can fix that problem (%^, 
assembler dialect or ASM_OUTPUT_OPCODE), maybe we could set that 
discussion aside in the immediate term and see if there are other issues 
that are potentially more substantial.





--



More generally, I think I need to soften my prior statement about 
deferring this to gcc-15.  This code was submitted in time for the 
gcc-14 deadline, so it should be evaluated just like we do anything else 
that makes the deadline.  There are various criteria we use to evaluate 
if something should get integrated and we should just work through this 
series like we always do and not treat it specially in any way.



jeff

[PATCH 2/2] c-family: rename warn_for_address_or_pointer_of_packed_member

2023-11-22 Thread Jason Merrill

Following the last patch, let's rename the functions to reflect the change
in behavior.

gcc/c-family/ChangeLog:

* c-warn.cc (check_address_or_pointer_of_packed_member):
Rename to check_address_of_packed_member.
(check_and_warn_address_or_pointer_of_packed_member):
Rename to check_and_warn_address_of_packed_member.
(warn_for_address_or_pointer_of_packed_member):
Rename to warn_for_address_of_packed_member.
* c-common.h: Adjust.

gcc/c/ChangeLog:

* c-typeck.cc (convert_for_assignment): Adjust call to
warn_for_address_of_packed_member.

gcc/cp/ChangeLog:

* call.cc (convert_for_arg_passing)
* typeck.cc (convert_for_assignment): Adjust call to
warn_for_address_of_packed_member.
---
 gcc/c-family/c-common.h |  2 +-
 gcc/c-family/c-warn.cc  | 32 ++--
 gcc/c/c-typeck.cc   |  4 ++--
 gcc/cp/call.cc  |  2 +-
 gcc/cp/typeck.cc|  2 +-
 5 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index b57e83d7c5d..9380452a93b 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1482,7 +1482,7 @@ extern void warnings_for_convert_and_check (location_t, 
tree, tree, tree);
 extern void c_do_switch_warnings (splay_tree, location_t, tree, tree, bool);
 extern void warn_for_omitted_condop (location_t, tree);
 extern bool warn_for_restrict (unsigned, tree *, unsigned);
-extern void warn_for_address_or_pointer_of_packed_member (tree, tree);
+extern void warn_for_address_of_packed_member (tree, tree);
 extern void warn_parm_array_mismatch (location_t, tree, tree);
 extern void maybe_warn_sizeof_array_div (location_t, tree, tree, tree, tree);
 extern void do_warn_array_compare (location_t, tree_code, tree, tree);
diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index 2a399ba6d14..abe66dd3030 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -2991,13 +2991,13 @@ check_alignment_of_packed_member (tree type, tree 
field, bool rvalue)
   return NULL_TREE;
 }
 
-/* Return struct or union type if the right hand value, RHS
+/* Return struct or union type if the right hand value, RHS,
is an address which takes the unaligned address of packed member
of struct or union when assigning to TYPE.
Otherwise, return NULL_TREE.  */
 
 static tree
-check_address_or_pointer_of_packed_member (tree type, tree rhs)
+check_address_of_packed_member (tree type, tree rhs)
 {
   bool rvalue = true;
   bool indirect = false;
@@ -3042,14 +3042,12 @@ check_address_or_pointer_of_packed_member (tree type, 
tree rhs)
   return context;
 }
 
-/* Check and warn if the right hand value, RHS:
-   1. Is a pointer value which isn't aligned to a pointer type TYPE.
-   2. Is an address which takes the unaligned address of packed member
-  of struct or union when assigning to TYPE.
- */
+/* Check and warn if the right hand value, RHS,
+   is an address which takes the unaligned address of packed member
+   of struct or union when assigning to TYPE.  */
 
 static void
-check_and_warn_address_or_pointer_of_packed_member (tree type, tree rhs)
+check_and_warn_address_of_packed_member (tree type, tree rhs)
 {
   bool nop_p = false;
   tree orig_rhs;
@@ -3067,11 +3065,11 @@ check_and_warn_address_or_pointer_of_packed_member 
(tree type, tree rhs)
   if (TREE_CODE (rhs) == COND_EXPR)
 {
   /* Check the THEN path.  */
-  check_and_warn_address_or_pointer_of_packed_member
+  check_and_warn_address_of_packed_member
(type, TREE_OPERAND (rhs, 1));
 
   /* Check the ELSE path.  */
-  check_and_warn_address_or_pointer_of_packed_member
+  check_and_warn_address_of_packed_member
(type, TREE_OPERAND (rhs, 2));
 }
   else
@@ -3095,7 +3093,7 @@ check_and_warn_address_or_pointer_of_packed_member (tree 
type, tree rhs)
}
 
   tree context
-   = check_address_or_pointer_of_packed_member (type, rhs);
+   = check_address_of_packed_member (type, rhs);
   if (context)
{
  location_t loc = EXPR_LOC_OR_LOC (rhs, input_location);
@@ -3107,14 +3105,12 @@ check_and_warn_address_or_pointer_of_packed_member 
(tree type, tree rhs)
 }
 }
 
-/* Warn if the right hand value, RHS:
-   1. Is a pointer value which isn't aligned to a pointer type TYPE.
-   2. Is an address which takes the unaligned address of packed member
-  of struct or union when assigning to TYPE.
-*/
+/* Warn if the right hand value, RHS,
+   is an address which takes the unaligned address of packed member
+   of struct or union when assigning to TYPE.  */
 
 void
-warn_for_address_or_pointer_of_packed_member (tree type, tree rhs)
+warn_for_address_of_packed_member (tree type, tree rhs)
 {
   if (!warn_address_of_packed_member)
 return;
@@ -3123,7 +3119,7 @@ warn_for_address_or_pointer_of_packed_member (tree type, 
tree rhs)
   if (!POINTER_TYPE_P (type))
 return;
 
-  check

[pushed] wwwdocs: releasing: No longer refer to buildstat.html

2023-11-22 Thread Gerald Pfeifer

That's the counterpart to the branching.html change I just made, also 
reported by Thomas.

Pushed.

Gerald
---
 htdocs/releasing.html | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/htdocs/releasing.html b/htdocs/releasing.html
index 1cd56f72..c7365e64 100644
--- a/htdocs/releasing.html
+++ b/htdocs/releasing.html
@@ -85,9 +85,6 @@ web pages.
 Update the version numbers of the current and future releases on
 the main web page, and add a proper news item there as well.
 
-For a new major release, ensure that the build status page is present
-and add a link from the main buildstat.html page.
-
 Generate online documentation for the new release with
 update_web_docs_git.  The appropriate command to run (as gccadmin)
 to generate the documentation would be scripts/update_web_docs_git
-- 
2.42.1

Re: [PATCH] Clean up by_pieces_ninsns

2023-11-22 Thread Richard Sandiford

"Kewen.Lin"  writes:
> Hi,
>
> on 2023/11/15 10:26, HAO CHEN GUI wrote:
>> Hi,
>>   This patch cleans up by_pieces_ninsns and does following things.
>> 1. Do the length and alignment adjustment for by pieces compare when
>> overlap operation is enabled.
>> 2. Remove unnecessary mov_optab checks.
>> 
>>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with
>> no regressions. Is this OK for trunk?
>> 
>> Thanks
>> Gui Haochen
>> 
>> ChangeLog
>> Clean up by_pieces_ninsns
>> 
>> The by pieces compare can be implemented by overlapped operations. So
>> it should be taken into consideration when doing the adjustment for
>> overlap operations.  The mode returned from
>> widest_fixed_size_mode_for_size is already checked with mov_optab in
>> by_pieces_mode_supported_p called by widest_fixed_size_mode_for_size.
>> So there is no need to check mov_optab again in by_pieces_ninsns.
>> The patch fixes these issues.
>> 
>> gcc/
>>  * expr.cc (by_pieces_ninsns): Include by pieces compare when
>>  do the adjustment for overlap operations.  Remove unnecessary
>>  mov_optab check.
>> 
>> patch.diff
>> diff --git a/gcc/expr.cc b/gcc/expr.cc
>> index 3e2a678710d..7cb2c935177 100644
>> --- a/gcc/expr.cc
>> +++ b/gcc/expr.cc
>> @@ -1090,18 +1090,15 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned int align,
>>unsigned HOST_WIDE_INT n_insns = 0;
>>fixed_size_mode mode;
>> 
>> -  if (targetm.overlap_op_by_pieces_p () && op != COMPARE_BY_PIECES)
>> +  if (targetm.overlap_op_by_pieces_p ())
>>  {
>>/* NB: Round up L and ALIGN to the widest integer mode for
>>   MAX_SIZE.  */
>>mode = widest_fixed_size_mode_for_size (max_size, op);
>> -  if (optab_handler (mov_optab, mode) != CODE_FOR_nothing)
>
> These changes are on generic code, so not a review.  :)
>
> If it's guaranteed previously, maybe we can replace it with an assertion
> like: gcc_assert (optab_handler (mov_optab, mode) != CODE_FOR_nothing);

Yeah, sounds OK to me FWIW.  I suppose the counter-argument is that
nothing here directly relies on the move optab.  It's just checking on
behalf of later code, which is now done by widest_fixed_size_mode_for_size
instead.

So the patch as posted is OK for trunk too, except that:

>
>> -{
>> -  unsigned HOST_WIDE_INT up = ROUND_UP (l, GET_MODE_SIZE (mode));
>> -  if (up > l)
>> -l = up;
>> -  align = GET_MODE_ALIGNMENT (mode);
>> -}
>> +  unsigned HOST_WIDE_INT up = ROUND_UP (l, GET_MODE_SIZE (mode));
>> +if (up > l)
>> +  l = up;
>> +  align = GET_MODE_ALIGNMENT (mode);

the indentation looks off here (the "if" is indented differently from the
first and last statements).

Thanks,
Richard

>>  }
>> 
>>align = alignment_for_piecewise_move (MOVE_MAX_PIECES, align);
>> @@ -1109,12 +1106,10 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned int align,
>>while (max_size > 1 && l > 0)
>>  {
>>mode = widest_fixed_size_mode_for_size (max_size, op);
>> -  enum insn_code icode;
>> 
>>unsigned int modesize = GET_MODE_SIZE (mode);
>> 
>> -  icode = optab_handler (mov_optab, mode);
>
> ... likewise.
>
> BR,
> Kewen
>
>> -  if (icode != CODE_FOR_nothing && align >= GET_MODE_ALIGNMENT (mode))
>> +  if (align >= GET_MODE_ALIGNMENT (mode))
>>  {
>>unsigned HOST_WIDE_INT n_pieces = l / modesize;
>>l %= modesize;
>>
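
As a rough illustration of the counting that the quoted patch adjusts, here is a minimal sketch in C.  It is a simplified model and not the code in expr.cc: round_up and count_pieces are hypothetical names, piece sizes are assumed to be powers of two, and the alignment and per-mode checks of the real function are left out.  When overlapping operations are allowed, the length is first rounded up to the widest piece, so a tail that would otherwise need several narrow pieces is covered by one wide piece.

```c
#include <assert.h>
#include <stddef.h>

/* Round LEN up to the next multiple of PIECE.  */
static size_t
round_up (size_t len, size_t piece)
{
  assert (piece != 0);
  return (len + piece - 1) / piece * piece;
}

/* Count how many pieces are needed to cover LEN bytes, trying piece
   sizes MAX_PIECE, MAX_PIECE / 2, ..., 1.  If ALLOW_OVERLAP is nonzero,
   LEN is rounded up to MAX_PIECE first, which models the last piece
   overlapping the one before it.  */
static size_t
count_pieces (size_t len, size_t max_piece, int allow_overlap)
{
  size_t n = 0;

  assert (max_piece != 0);
  if (allow_overlap)
    len = round_up (len, max_piece);

  for (size_t piece = max_piece; piece >= 1 && len > 0; piece /= 2)
    {
      n += len / piece;
      len %= piece;
    }
  return n;
}
```

In this model, 13 bytes with an 8-byte widest piece take three pieces (8 + 4 + 1) without overlap and two 8-byte pieces with overlap.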

[PATCH V2 3/3] OpenMP: Use enumerators for names of trait-sets and traits

2023-11-22 Thread Sandra Loosemore

This patch introduces enumerators to represent trait-set names and
trait names, which makes it easier to use tables to control other
behavior and for switch statements to dispatch on the tags.  The tags
are stored in the same place in the TREE_LIST structure (OMP_TSS_ID or
OMP_TS_ID) and are encoded there as integer constants.
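
A minimal sketch in C of the general idea, for readers who have not seen the patch body: an enumerator per trait-set name plus a table indexed by it, so lookups and dispatch can be table-driven.  The names below (trait_set_code, trait_set_names, lookup_trait_set) are hypothetical and are not the declarations the patch adds to omp-general.h; only the four trait-set names come from OpenMP itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical codes for the four OpenMP trait sets.  */
enum trait_set_code
{
  TSS_CONSTRUCT,
  TSS_DEVICE,
  TSS_IMPLEMENTATION,
  TSS_USER,
  TSS_INVALID
};

/* Table indexed by trait_set_code; replaces ad hoc string arrays.  */
static const char *const trait_set_names[] =
{
  "construct", "device", "implementation", "user"
};

/* Map NAME to its code, or TSS_INVALID if it is not a trait-set name.  */
static enum trait_set_code
lookup_trait_set (const char *name)
{
  assert (name != NULL);
  for (int i = 0; i < TSS_INVALID; i++)
    if (strcmp (name, trait_set_names[i]) == 0)
      return (enum trait_set_code) i;
  return TSS_INVALID;
}
```

Once a name has been mapped to a code, later passes can switch on the code instead of comparing strings again.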

gcc/ChangeLog
* omp-general.h (enum omp_tss_code): New.
(enum omp_ts_code): New.
(enum omp_tp_type): New.
(omp_tss_map): New.
(struct omp_ts_info): New.
(omp_ts_map): New.
(OMP_TSS_CODE, OMP_TSS_NAME): New.
(OMP_TS_CODE, OMP_TS_NAME): New.
(make_trait_set_selector, make_trait_selector): Adjust declarations.
(omp_construct_traits_to_codes): Likewise.
(omp_context_selector_set_compare): Likewise.
(omp_get_context_selector): Likewise.
(omp_get_context_selector_list): New.
(omp_lookup_tss_code): New.
(omp_lookup_ts_code): New.
* omp-general.cc (omp_construct_traits_to_codes): Pass length in
as argument instead of returning it.  Make it table-driven.
(omp_tss_map): New.
(kind_properties, vendor_properties, extension_properties): New.
(atomic_default_mem_order_properties): New.
(omp_ts_map): New.
(omp_check_context_selector): Simplify lookup and dispatch logic.
(omp_mark_declare_variant): Adjust for new representation.
(make_trait_set_selector, make_trait_selector): Adjust for new
representations.
(omp_context_selector_matches): Simplify dispatch logic.  Avoid
fixed-sized buffers and adjust call to omp_construct_traits_to_codes.
(omp_context_selector_props_compare): Adjust for new representations
and simplify dispatch logic.
(omp_context_selector_set_compare): Likewise.
(omp_context_selector_compare): Likewise.
(omp_get_context_selector): Adjust for new representations, and split
out...
(omp_get_context_selector_list): New function.
(omp_lookup_tss_code): New.
(omp_lookup_ts_code): New.
(omp_context_compute_score): Adjust for new representations.  Avoid
fixed-sized buffers and magic numbers.  Adjust call to
omp_construct_traits_to_codes.
* gimplify.cc (omp_construct_selector_matches): Avoid use of
fixed-size buffer.  Adjust call to omp_construct_traits_to_codes.

gcc/c/ChangeLog
* c-parser.cc (omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(c_parser_omp_context_selector): Adjust for new representations
and simplify dispatch logic.
(c_parser_omp_context_selector_specification): Likewise.
(c_finish_omp_declare_variant): Adjust for new representations.

gcc/cp/ChangeLog
* decl.cc (omp_declare_variant_finalize_one): Adjust for new
representations.
* parser.cc (omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(cp_parser_omp_context_selector): Adjust for new representations
and simplify dispatch logic.
(cp_parser_omp_context_selector_specification): Likewise.
* pt.cc (tsubst_attribute): Adjust for new representations.

gcc/fortran/ChangeLog
* trans-openmp.cc (gfc_trans_omp_declare_variant): Adjust for
new representations.
---
 gcc/c/c-parser.cc   | 192 -
 gcc/cp/decl.cc  |   8 +-
 gcc/cp/parser.cc| 189 -
 gcc/cp/pt.cc|  15 +-
 gcc/fortran/trans-openmp.cc |  41 ++-
 gcc/gimplify.cc |  17 +-
 gcc/omp-general.cc  | 530 +++-
 gcc/omp-general.h   |  89 +-
 8 files changed, 590 insertions(+), 491 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index a2ff381e0c1..70c0e1828ca 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -24016,16 +24016,6 @@ c_parser_omp_declare_simd (c_parser *parser, enum pragma_context context)
 }
 }
 
-static const char *const omp_construct_selectors[] = {
-  "simd", "target", "teams", "parallel", "for", NULL };
-static const char *const omp_device_selectors[] = {
-  "kind", "isa", "arch", NULL };
-static const char *const omp_implementation_selectors[] = {
-  "vendor", "extension", "atomic_default_mem_order", "unified_address",
-  "unified_shared_memory", "dynamic_allocators", "reverse_offload", NULL };
-static const char *const omp_user_selectors[] = {
-  "condition", NULL };
-
 /* OpenMP 5.0:
 
trait-selector:
@@ -24038,7 +24028,8 @@ static const char *const omp_user_selectors[] = {
trait-selector-set SET.  */
 
 static tree
-c_parser_omp_context_selector (c_parser *parser, tree set, tree parms)
+c_parser_omp_context_selector (c_parser *parser, enum om

[pushed] wwwdocs: branching: No longer refer to buildstat.html

2023-11-22 Thread Gerald Pfeifer

Thomas spotted this (among others) not being necessary any longer and 
kindly reported it.

Pushed.
---
 htdocs/branching.html | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/htdocs/branching.html b/htdocs/branching.html
index 0d48dce1..23ff92e8 100644
--- a/htdocs/branching.html
+++ b/htdocs/branching.html
@@ -52,9 +52,6 @@ populate it with initial copies of changes.html and
 based on the previous release branch to the directory corresponding to
 the newly created release branch.

-Add buildstat.html and update the toplevel
-buildstat.html accordingly.
-
 Update the toplevel index.html page to show the new active
 release branch, the current release series, and active development
 (mainline).  Update the version and development stage for mainline.
-- 
2.42.1

[PATCH 1/2] c-family: -Waddress-of-packed-member and casts

2023-11-22 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, OK for trunk?

-- 8< --

-Waddress-of-packed-member, in addition to the documented warning about
taking the address of a packed member, also warns about casting from
a pointer to a TYPE_PACKED type to a pointer to a type with greater
alignment.

This wrongly warns if the source is a pointer to enum when -fshort-enums
is on, since that is also represented by TYPE_PACKED.

And there's already -Wcast-align to catch casting from pointer to less
aligned type (packed or otherwise) to pointer to more aligned type; even
apart from the enum problem, this seems like a somewhat arbitrary subset of
that warning.  Though that isn't currently on by default.

So, this patch removes the undocumented type-based warning from
-Waddress-of-packed-member.  Some of the tests where the warning is
desirable I changed to use -Wcast-align=strict instead.  The ones that
require -Wno-incompatible-pointer-types, I just removed.
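
To make the two cases concrete, here is a small sketch in C.  It assumes the GNU packed attribute; struct rec and read_value are hypothetical names and are not taken from the testsuite files listed below.  The two problematic forms are shown only in a comment, and the function demonstrates one way to read the member without forming a misaligned pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packed record: 'value' sits at offset 1, so a plain
   int32_t pointer to it would be misaligned on most targets.  */
struct __attribute__ ((packed)) rec
{
  char tag;
  int32_t value;
};

/* The documented case, taking the address of a packed member:
     int32_t *p = &r->value;
   The cast case discussed in this message, converting a pointer to a
   packed type into a pointer to a more aligned type:
     int32_t *q = (int32_t *) r;  */

/* Read the member bytewise instead of through an int32_t pointer.  */
static int32_t
read_value (const struct rec *r)
{
  int32_t v;

  assert (r != NULL);
  memcpy (&v, (const char *) r + offsetof (struct rec, value), sizeof v);
  return v;
}
```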

gcc/c-family/ChangeLog:

* c-warn.cc (check_address_or_pointer_of_packed_member):
Remove warning based on TYPE_PACKED.

gcc/testsuite/ChangeLog:

* c-c++-common/Waddress-of-packed-member-1.c: Don't expect
a warning on the cast cases.
* c-c++-common/pr51628-35.c: Use -Wcast-align=strict.
* g++.dg/warn/Waddress-of-packed-member3.C: Likewise.
* gcc.dg/pr88928.c: Likewise.
* gcc.dg/pr51628-20.c: Removed.
* gcc.dg/pr51628-21.c: Removed.
* gcc.dg/pr51628-25.c: Removed.
---
 gcc/c-family/c-warn.cc| 58 +--
 .../Waddress-of-packed-member-1.c | 12 ++--
 gcc/testsuite/c-c++-common/pr51628-35.c   |  6 +-
 .../g++.dg/warn/Waddress-of-packed-member3.C  |  8 +--
 gcc/testsuite/gcc.dg/pr51628-20.c | 11 
 gcc/testsuite/gcc.dg/pr51628-21.c | 11 
 gcc/testsuite/gcc.dg/pr51628-25.c |  9 ---
 gcc/testsuite/gcc.dg/pr88928.c|  6 +-
 8 files changed, 19 insertions(+), 102 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.dg/pr51628-20.c
 delete mode 100644 gcc/testsuite/gcc.dg/pr51628-21.c
 delete mode 100644 gcc/testsuite/gcc.dg/pr51628-25.c

diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index d2938b91043..2a399ba6d14 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -2991,10 +2991,9 @@ check_alignment_of_packed_member (tree type, tree field, bool rvalue)
   return NULL_TREE;
 }
 
-/* Return struct or union type if the right hand value, RHS:
-   1. Is a pointer value which isn't aligned to a pointer type TYPE.
-   2. Is an address which takes the unaligned address of packed member
-  of struct or union when assigning to TYPE.
+/* Return struct or union type if the right hand value, RHS
+   is an address which takes the unaligned address of packed member
+   of struct or union when assigning to TYPE.
Otherwise, return NULL_TREE.  */
 
 static tree
@@ -3021,57 +3020,6 @@ check_address_or_pointer_of_packed_member (tree type, tree rhs)
 
   type = TREE_TYPE (type);
 
-  if (TREE_CODE (rhs) == PARM_DECL
-  || VAR_P (rhs)
-  || TREE_CODE (rhs) == CALL_EXPR)
-{
-  tree rhstype = TREE_TYPE (rhs);
-  if (TREE_CODE (rhs) == CALL_EXPR)
-   {
- rhs = CALL_EXPR_FN (rhs); /* Pointer expression.  */
- if (rhs == NULL_TREE)
-   return NULL_TREE;
- rhs = TREE_TYPE (rhs);/* Pointer type.  */
- /* We could be called while processing a template and RHS could be
-a functor.  In that case it's a class, not a pointer.  */
- if (!rhs || !POINTER_TYPE_P (rhs))
-   return NULL_TREE;
- rhs = TREE_TYPE (rhs);/* Function type.  */
- rhstype = TREE_TYPE (rhs);
- if (!rhstype || !POINTER_TYPE_P (rhstype))
-   return NULL_TREE;
- rvalue = true;
-   }
-  if (rvalue && POINTER_TYPE_P (rhstype))
-   rhstype = TREE_TYPE (rhstype);
-  while (TREE_CODE (rhstype) == ARRAY_TYPE)
-   rhstype = TREE_TYPE (rhstype);
-  if (TYPE_PACKED (rhstype))
-   {
- unsigned int type_align = min_align_of_type (type);
- unsigned int rhs_align = min_align_of_type (rhstype);
- if (rhs_align < type_align)
-   {
- auto_diagnostic_group d;
- location_t location = EXPR_LOC_OR_LOC (rhs, input_location);
- if (warning_at (location, OPT_Waddress_of_packed_member,
- "converting a packed %qT pointer (alignment %d) "
- "to a %qT pointer (alignment %d) may result in "
- "an unaligned pointer value",
- rhstype, rhs_align, type, type_align))
-   {
- tree decl = TYPE_STUB_DECL (rhstype);
- if (decl)
-   inform (DECL_SOURCE_LOCATION (decl), "defined here");
- decl = TYPE_STUB_DECL (type);
- if (decl

Re: [PATCH v5 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-22 Thread waffl3x







On Tuesday, November 21st, 2023 at 8:22 PM, Jason Merrill wrote:


>
>
> On 11/21/23 08:04, waffl3x wrote:
>
> > Bootstrapped and tested on x86_64-linux with no regressions.
> >
> > Hopefully this patch is legible enough for reviewing purposes, I've not
> > been feeling the greatest so it was a task to get this finished.
> > Tomorrow I will look at putting the diagnostics in
> > start_preparsed_function and also fixing up anything else.
> >
> > To reiterate in case it wasn't abundantly clear by the barren changelog
> > and commit message, this version is not intended as the final revision.
> >
> > Handling re-declarations was kind of nightmarish, so the comments in
> > there are lengthy, but I am fairly certain I implemented them correctly.
> >
> > I am going to get some sleep now, hopefully I will feel better tomorrow
> > and be ready to polish off the patch. Thanks for the patience.
>
>
> Great!
>
> > I stared at start_preparsed_function for a long while and couldn't
> > figure out where to start off at. So for now the error handling is
> > split up between instantiate_body and cp_parser_lambda_declarator_opt.
> > The latter is super not correct but I've been stuck on this for a long
> > time now though so I wanted to actually get something that works and
> > then try to make it better.
>
>
> I see what you mean, instantiate_body isn't prepared for
> start_preparsed_function to fail. It's ok to handle this in two places.
> Though actually, instantiate_body is too late for it to fail; I think
> for the template case it should fail in tsubst_lambda_expr, before we
> even start to consider the body.
>
> Incidentally, I notice this code in tsubst_function_decl seems to need
> adjusting for xobj:
>
> tree parms = DECL_ARGUMENTS (t);
> if (closure && !DECL_STATIC_FUNCTION_P (t))
> parms = DECL_CHAIN (parms);
> parms = tsubst (parms, args, complain, t);
> for (tree parm = parms; parm; parm = DECL_CHAIN (parm))
> DECL_CONTEXT (parm) = r;
> if (closure && !DECL_STATIC_FUNCTION_P (t))
> ...
>
> and this in tsubst_lambda_expr that assumes iobj:
>
> /* Fix the type of 'this'. */
> fntype = build_memfn_type (fntype, type,
> type_memfn_quals (fntype),
> type_memfn_rqual (fntype));

Originally I was going to say this doesn't look like a problem in
tsubst_lambda_expr, but after looking at tsubst_function_decl I'm
thinking it might be the source of some trouble. If it really was
causing problems I would think it would be working much worse than it
currently is, but it does feel like it might be the actual source of
the bug I was chasing yesterday. Assigning to a capture with a deduced
const xobj parameter is not being rejected right now; this might be
causing it. I'll look more thoroughly today.

> This also seems like the place to check for unrelated type.

It does feel that way, I agree.

> > /* Nonzero for FUNCTION_DECL means that this decl is a non-static
> > - member function. */
> > + member function, use DECL_IOBJ_MEMBER_FUNC_P instead. */
> > #define DECL_NONSTATIC_MEMBER_FUNCTION_P(NODE) \
> > (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
> >
> > +/* Nonzero for FUNCTION_DECL means that this decl is an implicit object
> > + member function. */
> > +#define DECL_IOBJ_MEMBER_FUNC_P(NODE) \
> > + (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
>
>
> I was thinking to rename DECL_NONSTATIC_MEMBER_FUNCTION_P rather than
> add a new, equivalent one. And then go through all the current uses of
> the old macro to decide whether they mean IOBJ or OBJECT.

I figure it would be easier to make that transition if there's a clear
line between old versus new. To be clear, my intention is for the old
macro to be removed once all the uses of it are changed over to the new
macro. I can still remove it for the patch if you like but having both
and removing the old one later seems better to me.

> > - (static or non-static). */
> > + (static or object). */
>
>
> Let's leave this comment as it was.

Okay.

> > + auto handle_arg = [fn, flags, complain](tree type,
> > + tree arg,
> > + int const param_index,
> > + conversion *conv,
> > + bool const conversion_warning)
>
>
> Let's move the conversion_warning logic into the handle_arg lambda
> rather than have it as a parameter. Yes, we don't need it for the xobj
> parm, but I think it's cleaner to have less in the loop.

I would argue that it's cleaner to have the lambda be concise, but I'll
make this change.

> Also, let's move handle_arg after the iobj 'this' handling so it's
> closer to the uses. For which the 'else xobj' needs to drop the 'else',
> or change to 'if (first_arg)'.

Agreed, I didn't like how far away it was.

> > + /* We currently handle for the case where first_arg is NULL_TREE
> > + in the future this should be changed and the assert reactivated. */
> > + #if 0
> > + gcc_assert (first_arg);
> > + #endif
>
>
> Let's leave this out.

Alright.

> > + val = handle_arg(TREE_VALUE (parm),
>
>
> Missing space before (.
>
> > - if (null_node_p (arg)
> > - && DECL_TEMP

Re: [PATCH, v3] Fortran: restrictions on integer arguments to SYSTEM_CLOCK [PR112609]

2023-11-22 Thread Harald Anlauf


Hi Steve,

On 11/22/23 19:03, Steve Kargl wrote:

On Wed, Nov 22, 2023 at 10:36:00AM +0100, Mikael Morin wrote:


OK with this fixed (and the previous comments as you wish), if Steve has no
more comments.



No further comments.  Thanks for your patience, Harald.

As a side note, I found John Reid's "What's new" document
where it is noted that there are no new obsolescent or
deleted features.

https://wg5-fortran.org/N2201-N2250/N2212.pdf



this is good to know.

There is an older version (still referring to F202x) on the wiki:

https://gcc.gnu.org/wiki/GFortranStandards

It would be great if someone with editing permission could update
the link and point to the above.

Thanks,
Harald

[committed] hppa: Fix integer REG+D address reloads

2023-11-22 Thread John David Anglin

Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11.  Fixes testcase
in PR.  Committed to trunk.

Dave
---

hppa: Fix integer REG+D address reloads

I made a mistake in the previous change to integer_store_memory_operand.
There is no support in pa_emit_move_sequence to handle secondary reloads of
integer REG+D instructions.  Further, the Q constraint is used for some
non-simple instructions (movb and addib).  Thus, we need to return true
when reload is in progress.

2023-11-22  John David Anglin  

gcc/ChangeLog:

PR target/112617
* config/pa/predicates.md (integer_store_memory_operand): Return
true for REG+D addresses when reload_in_progress is true.

diff --git a/gcc/config/pa/predicates.md b/gcc/config/pa/predicates.md
index 1b50020e1de..4c07c0a3828 100644
--- a/gcc/config/pa/predicates.md
+++ b/gcc/config/pa/predicates.md
@@ -308,6 +308,13 @@
 
   if (reg_plus_base_memory_operand (op, mode))
 {
+  /* There is no support for handling secondary reloads of integer
+REG+D instructions in pa_emit_move_sequence.  Further, the Q
+constraint is used in more than simple move instructions.  So,
+we must return true and let reload handle the reload.  */
+  if (reload_in_progress)
+   return true;
+
   /* Extract CONST_INT operand.  */
   if (GET_CODE (op) == SUBREG)
op = SUBREG_REG (op);



Re: [PATCH 2/2] bugzilla: remove `gcc-bugs@` mailing list address

2023-11-22 Thread Joseph Myers

On Mon, 20 Nov 2023, Ben Boeckel wrote:

> Bugzilla is preferred today.
> 
> ChangeLog:
> 
>   * config-ml.in: Replace gcc-bugs@ with Bugzilla link.
>   * symlink-tree: Replace gcc-bugs@ with Bugzilla link.

I don't think we should use a URL that redirects (i.e. 
https://gcc.gnu.org/bugzilla should preferably have a trailing '/'), and 
arguably we should use https://gcc.gnu.org/bugs/ as the URL; that's the 
preferred one to point people to for bugs in the compilers themselves, 
since it gives more instructions on bug reporting (though those 
instructions may be less relevant for bugs in these files).

codingconventions.html claims that symlink-tree is "copied from mainline 
automake".  That is, I think, out-of-date information: automake's 
contrib/multilib/README says "The master (and probably more up-to-date) 
copies of the 'config-ml.in' and 'symlink-tree' files are maintained in 
the GCC development tree".  But it does indicate that 
codingconventions.html itself should be updated to stop suggesting 
symlink-tree is maintained elsewhere.

> libcpp/ChangeLog:
> 
>   * configure: Replace gcc-bugs@ with Bugzilla link.
>   * configure.ac: Replace gcc-bugs@ with Bugzilla link.
> 
> libdecnumber/ChangeLog:
> 
>   * configure: Replace gcc-bugs@ with Bugzilla link.
>   * configure.ac: Replace gcc-bugs@ with Bugzilla link.

I hope the configure changes are the same as you get with regeneration 
with the right autoconf version, and so should be described as 
regeneration in the ChangeLog entries.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH v3 00/11] : More warnings as errors by default

2023-11-22 Thread Jeff Law





On 11/20/23 02:55, Florian Weimer wrote:

This revision addresses Marek's comment about handling
-Wdeclaration-missing-parameter-type properly in conjunction with
-fpermissive.  A new test (permerror-fpermissive-nowarning.c)
demonstrates the expected behavior.  I added a test for -std=gnu89
-fno-permissive, too.

I'm including the precursor cleanup patches in this posting.  Hopefully
this will make the aarch64 tester happy.

Thanks,
Florian

Florian Weimer (11):
   aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder
   aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c
   gm2: Add missing declaration of m2pim_M2RTS_Terminate to test
   Add tests for validating future C permerrors
   c: Turn int-conversion warnings into permerrors
   c: Turn -Wimplicit-function-declaration into a permerror
   c: Turn -Wimplicit-int into a permerror
   c: Do not ignore some forms of -Wimplicit-int in system headers
   c: Turn -Wreturn-mismatch into a permerror
   c: Turn -Wincompatible-pointer-types into a permerror
   c: Add new -Wdeclaration-missing-parameter-type permerror

The series is fine by me.  But give Marek additional time to chime in,
particularly given the holidays this week in the US.  Say through this 
time next week?


jeff

Re: [PATCH v3 00/11] : More warnings as errors by default

2023-11-22 Thread Florian Weimer

* Jeff Law:

> On 11/20/23 02:55, Florian Weimer wrote:
>> This revision addresses Marek's comment about handling
>> -Wdeclaration-missing-parameter-type properly in conjunction with
>> -fpermissive.  A new test (permerror-fpermissive-nowarning.c)
>> demonstrates the expected behavior.  I added a test for -std=gnu89
>> -fno-permissive, too.
>> I'm including the precursor cleanup patches in this posting.
>> Hopefully
>> this will make the aarch64 tester happy.
>> Thanks,
>> Florian
>> Florian Weimer (11):
>>aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder
>>aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c
>>gm2: Add missing declaration of m2pim_M2RTS_Terminate to test
>>Add tests for validating future C permerrors
>>c: Turn int-conversion warnings into permerrors
>>c: Turn -Wimplicit-function-declaration into a permerror
>>c: Turn -Wimplicit-int into a permerror
>>c: Do not ignore some forms of -Wimplicit-int in system headers
>>c: Turn -Wreturn-mismatch into a permerror
>>c: Turn -Wincompatible-pointer-types into a permerror
>>c: Add new -Wdeclaration-missing-parameter-type permerror

> The series is fine by me.

Thanks.

> But give Marek additional time to chime in, particularly given the
> holidays this week in the US.  Say through this time next week?

Yes, Marek and I spoke about it today.  I'll wait a bit longer for
feedback.

I'm also gathering some numbers regarding autoconf impact and potential
silent miscompilation.

Thanks,
Florian

Re: [PATCH 1/2] testsuite/unroll-8: Avoid triggering undefined behavior

2023-11-22 Thread Andrew Pinski

On Tue, Nov 21, 2023 at 3:29 PM Palmer Dabbelt  wrote:
>
> I was poking around with this test failure and noticed it was exercising
> undefined behavior.  The return type doesn't matter for what's being
> tested, so just mark it as void.

Just a quick note: in C it is NOT undefined behavior to return
without a value from a function that has a non-void return type.  It
is only undefined if the caller uses the value.
In C++, though, flowing off the end of such a function is undefined
behavior.  Yes, the two languages differ here.  As Jeff said, it does
not change what the testcase was/is testing, but we should be clear in
the changelog that this is NOT undefined behavior in C.
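
A minimal sketch in C of the distinction, using hypothetical names (bump, call_and_discard) that are not from the testcase.  The function has a non-void return type and reaches its closing brace without a return statement; as long as the caller discards the result, C leaves the behavior defined.  The same source compiled as C++ would have undefined behavior, and a compiler may warn about the missing return.

```c
#include <assert.h>

static int calls;

/* Non-void return type, but control reaches the closing brace
   without a return statement.  */
static int
bump (void)
{
  calls++;
}

/* The result of bump is discarded, so in C this call is fine.
   Writing 'int x = bump ();' and then reading x would not be.  */
static void
call_and_discard (void)
{
  int before = calls;

  bump ();
  assert (calls == before + 1);
}
```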

Thanks,
Andrew Pinski

>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/unroll-8.c: Remove UB.
> ---
> I didn't test this, but it seems trivial enough that I'm just going to
> throw it at the bots and hope I'm right.
> ---
>  gcc/testsuite/gcc.dg/unroll-8.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/unroll-8.c b/gcc/testsuite/gcc.dg/unroll-8.c
> index 4388f47d4c7..06d32e56893 100644
> --- a/gcc/testsuite/gcc.dg/unroll-8.c
> +++ b/gcc/testsuite/gcc.dg/unroll-8.c
> @@ -3,7 +3,7 @@
>  /* { dg-additional-options "-fno-tree-vectorize" { target amdgcn-*-* } } */
>
>  struct a {int a[7];};
> -int t(struct a *a, int n)
> +void t(struct a *a, int n)
>  {
>int i;
>for (i=0;i --
> 2.42.1
>

Re: [PATCH 2/2] bugzilla: remove `gcc-bugs@` mailing list address

2023-11-22 Thread Ben Boeckel

On Wed, Nov 22, 2023 at 23:15:56 +, Joseph Myers wrote:
> On Mon, 20 Nov 2023, Ben Boeckel wrote:
> 
> > Bugzilla is preferred today.
> > 
> > ChangeLog:
> > 
> > * config-ml.in: Replace gcc-bugs@ with Bugzilla link.
> > * symlink-tree: Replace gcc-bugs@ with Bugzilla link.
> 
> I don't think we should use a URL that redirects (i.e. 
> https://gcc.gnu.org/bugzilla should preferably have a trailing '/'), and 
> arguably we should use https://gcc.gnu.org/bugs/ as the URL; that's the 
> preferred one to point people to for bugs in the compilers themselves, 
> since it gives more instructions on bug reporting (though those 
> instructions may be less relevant for bugs in these files).

I'll update the URL.

> codingconventions.html claims that symlink-tree is "copied from mainline 
> automake".  That is, I think, out-of-date information: automake's 
> contrib/multilib/README says "The master (and probably more up-to-date) 
> copies of the 'config-ml.in' and 'symlink-tree' files are maintained in 
> the GCC development tree".  But it does indicate that 
> codingconventions.html itself should be updated to stop suggesting 
> symlink-tree is maintained elsewhere.

I'll also change that.

> > libcpp/ChangeLog:
> > 
> > * configure: Replace gcc-bugs@ with Bugzilla link.
> > * configure.ac: Replace gcc-bugs@ with Bugzilla link.
> > 
> > libdecnumber/ChangeLog:
> > 
> > * configure: Replace gcc-bugs@ with Bugzilla link.
> > * configure.ac: Replace gcc-bugs@ with Bugzilla link.
> 
> I hope the configure changes are the same as you get with regeneration 
> with the right autoconf version, and so should be described as 
> regeneration in the ChangeLog entries.

Is there a version of autoconf I should use? I have 2.71 laying around
but see that these were generated with 2.69. If you want me to regen
with 2.71, I'll do that as separate prep commits so that this diff is
sensible. Or I can try and dig up a 2.69 in some container to do it.

Thanks,

--Ben

Re: [PATCH v3 03/11] gm2: Add missing declaration of m2pim_M2RTS_Terminate to test

2023-11-22 Thread Joseph Myers

On Mon, 20 Nov 2023, Florian Weimer wrote:

> gcc/testsuite/
> 
>   * gm2/link/externalscaffold/pass/scaffold.c (m2pim_M2RTS_Terminate):
>   Declare.

OK in the absence of Modula-2 maintainer objections within 48 hours.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH, testsuite, fortran] fix invalid testcases (missing MOLD argument to NULL)

2023-11-22 Thread Harald Anlauf

Dear all,

testcases assumed_rank_8.f90 and assumed_rank_10.f90 are invalid:
NULL() is passed without MOLD to an assumed-rank dummy argument.

This is detected by NAG, but not yet by gfortran (see pr104819).
gfortran even ignores the MOLD argument; the dump-tree is identical
if MOLD is there or not.

Now these testcases are { dg-do run }.  Therefore I would like to
fix these testcases, independent of the work on fixing pr104819.

Comments?

Thanks,
Harald

From cbb0c61f9d6f06667666a33da6e6ce3213a92248 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 22 Nov 2023 21:45:46 +0100
Subject: [PATCH] testsuite: fortran: fix invalid testcases (missing MOLD
 argument to NULL)

The Fortran standard requires that NULL() passed to an assumed-rank
dummy argument has a MOLD argument.

gcc/testsuite/ChangeLog:

	PR fortran/104819
	* gfortran.dg/assumed_rank_10.f90: Add MOLD argument to NULL().
	* gfortran.dg/assumed_rank_8.f90: Likewise.
---
 gcc/testsuite/gfortran.dg/assumed_rank_10.f90 | 6 +++---
 gcc/testsuite/gfortran.dg/assumed_rank_8.f90  | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/assumed_rank_10.f90 b/gcc/testsuite/gfortran.dg/assumed_rank_10.f90
index 6a3cc94483e..f22d43ab955 100644
--- a/gcc/testsuite/gfortran.dg/assumed_rank_10.f90
+++ b/gcc/testsuite/gfortran.dg/assumed_rank_10.f90
@@ -50,9 +50,9 @@ program test

  is_present = .false.

- call fpa(null(), null()) ! No copy back
- call fpi(null(), null()) ! No copy back
- call fno(null(), null()) ! No copy back
+ call fpa(null(iip), null(jjp)) ! No copy back
+ call fpi(null(iip), null(jjp)) ! No copy back
+ call fno(null(iip), null(jjp)) ! No copy back

  call fno() ! No copy back

diff --git a/gcc/testsuite/gfortran.dg/assumed_rank_8.f90 b/gcc/testsuite/gfortran.dg/assumed_rank_8.f90
index 5873296a7a5..34ff42c0be2 100644
--- a/gcc/testsuite/gfortran.dg/assumed_rank_8.f90
+++ b/gcc/testsuite/gfortran.dg/assumed_rank_8.f90
@@ -22,13 +22,13 @@ program main
   call f (ii)
   call f (489)
   call f ()
-  call f (null())
+  call f (null(kk))
   call f (kk)
   if (j /= 2) STOP 1

   j = 0
   nullify (ll)
-  call g (null())
+  call g (null(ll))
   call g (ll)
   call g (ii)
   if (j /= 1) STOP 2
--
2.35.3

[committed] hppa: Define MAX_FIXED_MODE_SIZE

2023-11-22 Thread John David Anglin

Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11.  Committed to
trunk.

Fixes FAIL: c-c++-common/pr111309-1.c ICE.

Dave
---

hppa: Define MAX_FIXED_MODE_SIZE

Replace default define.  We support TImode when TARGET_64BIT is true.

2023-11-22  John David Anglin  

gcc/ChangeLog:

PR target/112592
* config/pa/pa.h (MAX_FIXED_MODE_SIZE): Define.

diff --git a/gcc/config/pa/pa.h b/gcc/config/pa/pa.h
index aba2cec7357..d73428682e7 100644
--- a/gcc/config/pa/pa.h
+++ b/gcc/config/pa/pa.h
@@ -1310,3 +1310,7 @@ do { \
 
 /* Output default function prologue for hpux.  */
 #define TARGET_ASM_FUNCTION_PROLOGUE pa_output_function_prologue
+
+/* An integer expression for the size in bits of the largest integer machine
+   mode that should actually be used.  We allow pairs of registers.  */
+#define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TARGET_64BIT ? TImode : DImode)



Re: [PATCH v3 01/11] aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder

2023-11-22 Thread Joseph Myers

On Mon, 20 Nov 2023, Florian Weimer wrote:

>   * config/aarch64/linux-unwind.h
>   (aarch64_fallback_frame_state): Add cast to the expected type
>   in sc assignment.

OK in the absence of AArch64 maintainer objections within 48 hours.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] mingw: Exclude utf8 manifest [PR111170, PR108865]

2023-11-22 Thread Jonathan Yong


On 11/22/23 12:34, Costas Argyris wrote:
> Attached a new patch.
>
> A couple things to note:
>
> 1) I changed your
>
> host_extra_objs=utf8-mingw32.o
>
> to
>
> host_extra_objs_mingw=utf8-mingw32.o
>
> to match the other two, since I believe that's what you meant.
>
> 2) This approach has the complication that the variables
> in configure.ac need to be set before it sources config.host.

I specifically asked for it to be done that way so users are aware of it
with --help. Thanks, pushed to master.

Re: [PATCH, v3] Fortran: restrictions on integer arguments to SYSTEM_CLOCK [PR112609]

2023-11-22 Thread Steve Kargl

On Wed, Nov 22, 2023 at 10:36:00AM +0100, Mikael Morin wrote:
> 
> OK with this fixed (and the previous comments as you wish), if Steve has no
> more comments.
> 

No further comments.  Thanks for your patience, Harald.

As a side note, I found John Reid's "What's new" document
where it is noted that there are no new obsolescent or
deleted features.

https://wg5-fortran.org/N2201-N2250/N2212.pdf

-- 
Steve

Re: [RFA] New pass for sign/zero extension elimination

2023-11-22 Thread Jeff Law





On 11/20/23 11:26, Richard Sandiford wrote:


+
+/* If we know the destination of CODE only uses some low bits
+   (say just the QI bits of an SI operation), then return true
+   if we can propagate the need for just the subset of bits
+   from the destination to the sources.  */
+
+static bool
+safe_for_live_propagation (rtx_code code)
+{
+  /* First handle rtx classes which as a whole are known to
+ be either safe or unsafe.  */
+  switch (GET_RTX_CLASS (code))
+{
+  case RTX_OBJ:
+   return true;
+
+  case RTX_COMPARE:
+  case RTX_COMM_COMPARE:
+  case RTX_TERNARY:


> I suppose operands 1 and 2 of an IF_THEN_ELSE would be safe.

Yes.  The only downside is we'd need to special case IF_THEN_ELSE
because it doesn't apply to operand 0.  Right now we're pretty
conservative with anything other than binary codes.  Comment added about
the possibility of handling I-T-E as well.
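
The asymmetry between the operands can be illustrated outside of RTL.  The following is only a sketch in plain C, not code from the patch: select_u32 is a hypothetical stand-in for IF_THEN_ELSE, and the two helpers check whether the low 8 bits of the result survive when the upper bits of the arms, or of the condition, are discarded first.

```c
#include <stdint.h>

/* Hypothetical model of (if_then_else COND A B): COND is tested in
   full, A and B are the values being selected.  */
uint32_t
select_u32 (uint32_t cond, uint32_t a, uint32_t b)
{
  return cond != 0 ? a : b;
}

/* Nonzero if the low 8 bits of the result are unchanged when the upper
   24 bits of A and B (operands 1 and 2) are discarded first.  */
int
low_byte_survives_arm_truncation (uint32_t cond, uint32_t a, uint32_t b)
{
  uint32_t full = select_u32 (cond, a, b);
  uint32_t narrowed = select_u32 (cond, a & 0xff, b & 0xff);
  return (full & 0xff) == (narrowed & 0xff);
}

/* Same check, but discarding the upper 24 bits of COND (operand 0)
   instead.  This can change which arm is selected.  */
int
low_byte_survives_cond_truncation (uint32_t cond, uint32_t a, uint32_t b)
{
  uint32_t full = select_u32 (cond, a, b);
  uint32_t narrowed = select_u32 (cond & 0xff, a, b);
  return (full & 0xff) == (narrowed & 0xff);
}
```

With cond = 0x100 the second helper reports a mismatch, because truncating the condition to 8 bits turns a nonzero value into zero.  That is the reason operand 0 would need the special case mentioned above.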






> This made me wonder: is this safe for !TRULY_NOOP_TRUNCATION?  But I
> suppose it is.  What !TRULY_NOOP_TRUNCATION models is that the target
> mode has a canonical form that must be maintained, and wouldn't be by
> a plain subreg.  So TRULY_NOOP_TRUNCATION is more of an issue for
> consumers of the liveness information, rather than for computing the
> liveness information itself.

Really interesting question.  I think ext-dce is safe.  As you note this
is more a consumer side question and on the consumer side we don't muck
with TRUNCATE at all.






+case SS_TRUNCATE:
+case US_TRUNCATE:
+case PLUS:
+case MULT:
+case SS_MULT:
+case US_MULT:
+case SMUL_HIGHPART:
+case UMUL_HIGHPART:
+case AND:
+case IOR:
+case XOR:
+case SS_PLUS:
+case US_PLUS:


> I don't think it's safe to propagate through saturating ops.
> They don't have the property that (x op y)%z == (x%z op y%z)%z

Yea, you're probably right.  Removed.
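
A small counterexample makes the point concrete.  This is a sketch under stated assumptions, not code from the patch: wrap_add and sat_add are hypothetical stand-ins for PLUS and US_PLUS on 32-bit values, and z is taken to be 256, i.e. only the low 8 bits of the result are live.

```c
#include <stdint.h>

/* Wrapping 32-bit addition (stand-in for PLUS).  */
uint32_t
wrap_add (uint32_t x, uint32_t y)
{
  return x + y;
}

/* Unsigned saturating 32-bit addition (stand-in for US_PLUS).  */
uint32_t
sat_add (uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;
  return sum < x ? UINT32_MAX : sum;   /* Clamp on overflow.  */
}

typedef uint32_t (*binop) (uint32_t, uint32_t);

/* Nonzero if (x op y) % 256 == ((x % 256) op (y % 256)) % 256.  */
int
low_byte_property_holds (binop op, uint32_t x, uint32_t y)
{
  return (op (x, y) & 0xff) == (op (x & 0xff, y & 0xff) & 0xff);
}
```

For x = 0xffffffff and y = 2 the property holds for wrap_add (both sides have low byte 0x01), but fails for sat_add: the full operation saturates to 0xffffffff (low byte 0xff), whereas the operation on the truncated inputs yields 0x101 (low byte 0x01).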




+
+ /* We don't support vector destinations or destinations
+wider than DImode.   It is safe to continue this loop.
+At worst, it will leave things live which could have
+been made dead.  */
+ if (VECTOR_MODE_P (GET_MODE (x)) || GET_MODE (x) > E_DImode)
+   continue;


> The E_DImode comparison hard-codes an assumption about the order of
> the mode enum.  How about using something like:

Guilty as charged :-)  Not surprised you called that out.

>   scalar_int_mode outer_mode;
>   if (!is_a <scalar_int_mode> (GET_MODE (x), &outer_mode)
>       || GET_MODE_BITSIZE (outer_mode) > 64)
>     continue;

Wouldn't we also want to verify that the size is constant, or is it the
case that all the variable cases are vector (and would we want to
actually depend on that)?
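
The difference between the two checks can be sketched with a toy mode table.  Everything below is hypothetical (the toy_mode enum, its ordering and the table are invented for illustration and are not GCC's machine_mode); it only shows why a comparison on enumerator order and a comparison on class plus bit size can disagree.  It does not address the constant-size question raised above.

```c
/* Hypothetical mode enum.  A non-integer mode (TOY_CC) happens to sort
   before the integer modes, which is enough to break order-based tests.  */
enum toy_mode { TOY_CC, TOY_QI, TOY_HI, TOY_SI, TOY_DI, TOY_TI, TOY_NUM_MODES };

enum toy_class { TOY_CLASS_INT, TOY_CLASS_OTHER };

struct toy_mode_info
{
  enum toy_class cls;   /* Kind of mode.  */
  unsigned bits;        /* Size in bits.  */
};

static const struct toy_mode_info toy_modes[TOY_NUM_MODES] = {
  [TOY_CC] = { TOY_CLASS_OTHER, 32 },
  [TOY_QI] = { TOY_CLASS_INT, 8 },
  [TOY_HI] = { TOY_CLASS_INT, 16 },
  [TOY_SI] = { TOY_CLASS_INT, 32 },
  [TOY_DI] = { TOY_CLASS_INT, 64 },
  [TOY_TI] = { TOY_CLASS_INT, 128 },
};

/* Fragile: correct only while the enumerators happen to be ordered
   "everything we support first, then everything else".  */
int
skip_by_enum_order (enum toy_mode m)
{
  return m > TOY_DI;
}

/* Robust: asks the questions directly (is it a scalar integer mode,
   and is it at most 64 bits wide?).  */
int
skip_by_class_and_size (enum toy_mode m)
{
  return toy_modes[m].cls != TOY_CLASS_INT || toy_modes[m].bits > 64;
}
```

In this toy table both predicates agree on TOY_SI and TOY_TI, but only the second one skips TOY_CC.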




> The other continues use iter.skip_subrtxes (); when continuing.
> I don't think it matters for correctness whether we do that or not,
> since SETs and CLOBBERs shouldn't be nested.  But skipping should
> be faster.

My thought on not skipping the sub-rtxs in this case was to make sure we
processed things like memory addresses which could have embedded side
effects.  It probably doesn't matter in practice though.




> Maybe it would be worth splitting the SET/CLOBBER code out into
> a subfunction, to make the loop iteration easier to handle?

Yea, it could use another round of such work.  In the original, set and
use handling were one big function which drove me nuts.





+ /* Transfer all the LIVENOW bits for X into LIVE_TMP.  */
+ HOST_WIDE_INT rn = REGNO (SUBREG_REG (x));
+ for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + 4; i++)
+   if (bitmap_bit_p (livenow, i))
+ bitmap_set_bit (live_tmp, i);
+
+ /* The mode of the SUBREG tells us how many bits we can
+clear.  */
+ machine_mode mode = GET_MODE (x);
+ HOST_WIDE_INT size = GET_MODE_SIZE (mode).to_constant ();
+ bitmap_clear_range (livenow, 4 * rn, size);


> Is clearing SIZE bytes correct?  Feels like it should be clearing
> something like log2 (size) + 1 instead.

Yea, I think you're right.  Fixed.
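
The arithmetic behind the log2 (size) + 1 suggestion can be sketched as follows.  This assumes, based on the 4 * rn indexing in the quoted loop, that each register owns four liveness bits covering bits 0-7, 8-15, 16-31 and 32-63 respectively; that layout is an assumption here, and liveness_bits_for_size is a hypothetical helper rather than anything in the patch.

```c
/* Number of per-register liveness bits covered by a value that is
   SIZE_BYTES wide, assuming four bits per register that track
   bits 0-7, 8-15, 16-31 and 32-63.  SIZE_BYTES is expected to be
   1, 2, 4 or 8.  */
unsigned
liveness_bits_for_size (unsigned size_bytes)
{
  unsigned log2_size = 0;

  /* Integer log2 for a power of two.  */
  while (size_bytes > 1)
    {
      size_bytes >>= 1;
      log2_size++;
    }

  return log2_size + 1;
}
```

Under that assumption a 1, 2, 4 or 8 byte value covers 1, 2, 3 or 4 liveness bits.  Passing the byte size itself as the count would, for an 8 byte value, clear eight bits starting at 4 * rn, which would also reach into the bits belonging to the next register.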




+ bit = SUBREG_BYTE (x).to_constant () * BITS_PER_UNIT;
+ if (WORDS_BIG_ENDIAN)
+   bit = (GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (x))).to_constant ()
+  - GET_MODE_BITSIZE (GET_MODE (x)).to_constant () - bit);
+
+ /* Catch big endian correctness issues rather than triggering
+undefined behavior.  */
+ gcc_assert (bit < sizeof (HOST_WIDE_INT) * 8);


> This could probably use subreg_lsb, to avoid the inline endianness adjustment.

That's the routine I was looking for!  The original totally mucked up
the endianness adjustment and I kept thinking we must have an existing
routine to do this for us but didn't find it immediately, so I just
banged out a trivial endianness adjustment.
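
For reference, the inline adjustment quoted above amounts to the following.  This is a standalone sketch of that arithmetic only, assuming 8-bit units; lsb_of_subreg and its parameters are hypothetical names, and it is not a reimplementation of GCC's subreg_lsb, which handles more cases.

```c
/* Bit position, within the full register, of the least significant bit
   of a subreg.

   BYTE_OFFSET  memory-order byte offset of the subreg (SUBREG_BYTE)
   REG_BITS     width in bits of the underlying register's mode
   SUBREG_BITS  width in bits of the subreg's own mode
   BIG_ENDIAN   nonzero if offsets count from the most significant end

   Assumes 8 bits per byte.  */
unsigned
lsb_of_subreg (unsigned byte_offset, unsigned reg_bits,
               unsigned subreg_bits, int big_endian)
{
  unsigned bit = byte_offset * 8;

  /* On a big-endian layout, offset 0 names the most significant part,
     so flip the position within the register.  */
  if (big_endian)
    bit = reg_bits - subreg_bits - bit;

  return bit;
}
```

For a 64-bit register, the lowest byte is at offset 0 in the little-endian case and at offset 7 in the big-endian case; both map to bit 0.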





+
+ mask = GET_MODE_MASK (GET_MODE (SUBREG_REG (x))) << bit;
+

Re: [PATCH 1/2] testsuite/unroll-8: Avoid triggering undefined behavior

2023-11-22 Thread Jeff Law





On 11/21/23 16:27, Palmer Dabbelt wrote:

> I was poking around with this test failure and noticed it was exercising
> undefined behavior.  The return type doesn't matter for what's being
> tested, so just mark it as void.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/unroll-8.c: Remove UB.

I just reviewed the history of unroll-8; I don't think this compromises
the test's original intent.  OK for the trunk.


jeff
